that evaluates a linear map $\varphi : X \to \mathbb{K}$ at $x \in X$ is a duality.
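1. Adjoint or dual maps Let $X$, $Y$ be vector spaces, $X^*$ and $Y^*$ their duals and $\langle\ ,\ \rangle_X$ and $\langle\ ,\ \rangle_Y$ the evaluation maps on $X^* \times X$ and $Y^* \times Y$. For every linear map $\ell : X \to Y$, one also has a map $\ell^* : Y^* \to X^*$ defined by
$$\langle \ell^*(y^*), x \rangle_X := \langle y^*, \ell(x) \rangle_Y \qquad \forall x \in X,\ \forall y^* \in Y^*.$$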
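It turns out that $\ell^*$ is linear. Now if $(e_1, e_2, \dots, e_n)$ and $(f_1, \dots, f_m)$ are bases in $X$ and $Y$ respectively, and $(e^1, e^2, \dots, e^n)$ and $(f^1, f^2, \dots, f^m)$ are the dual bases in $X^*$ and $Y^*$, then the matrices $\mathbf{L} = [L^h_i] \in M_{m,n}(\mathbb{K})$ and $\mathbf{M} = [M^h_i] \in M_{n,m}(\mathbb{K})$, associated respectively to $\ell$ and $\ell^*$, are defined by
$$\ell(e_i) = \sum_{h=1}^{m} L^h_i f_h, \qquad \ell^*(f^i) = \sum_{h=1}^{n} M^h_i e^h.$$
By duality,
$$M^h_i = \langle \ell^*(f^i), e_h \rangle = \langle f^i, \ell(e_h) \rangle = L^i_h,$$
i.e., $\mathbf{M} = \mathbf{L}^T$. Therefore we conclude that if $\mathbf{L}$ is the matrix associated to $\ell$ in a given basis, then $\mathbf{L}^T$ is the matrix associated to $\ell^*$ in the dual basis.

We can now discuss how coordinate changes in $X$ reflect on the dual space. Let $X$ be a vector space of dimension $n$, $X^*$ its dual space, $(e_1, \dots, e_n)$, $(\epsilon_1, \epsilon_2, \dots, \epsilon_n)$ two bases on $X$ and $(e^1, e^2, \dots, e^n)$ and $(\epsilon^1, \epsilon^2, \dots, \epsilon^n)$ the corresponding dual bases on $X^*$. Let $\ell : X \to X$ be the linear map defined by $\ell(e_i) := \epsilon_i$ $\forall i = 1, \dots, n$. Then by duality
$$\langle \ell^*(\epsilon^i), e_j \rangle = \langle \epsilon^i, \ell(e_j) \rangle = \langle \epsilon^i, \epsilon_j \rangle = \delta_{ij} = \langle e^i, e_j \rangle \qquad \forall i, j,$$
i.e., $\ell^*(\epsilon^i) = e^i$ $\forall i = 1, \dots, n$. If $\mathbf{L}$ and $\mathbf{L}^T$ are the matrices associated to $\ell$ and $\ell^*$, then $\mathbf{L}$ changes basis from $(e_1, e_2, \dots, e_n)$ to $(\epsilon_1, \epsilon_2, \dots, \epsilon_n)$ in $X$, and $\mathbf{L}^T$ changes basis in the dual space from $(\epsilon^1, \epsilon^2, \dots, \epsilon^n)$ to $(e^1, e^2, \dots, e^n)$.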
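Now if $\varphi \in X^*$, we have $\varphi = \sum_{i=1}^n a_i \epsilon^i = \sum_{i=1}^n b_i e^i$. Since $e^i = \ell^*(\epsilon^i) = \sum_{j=1}^n L^i_j \epsilon^j$, we get
$$\sum_{i=1}^n a_i \epsilon^i = \sum_{i=1}^n b_i e^i = \sum_{i=1}^n \sum_{j=1}^n b_i L^i_j \epsilon^j = \sum_{i=1}^n \Big( \sum_{j=1}^n b_j L^j_i \Big) \epsilon^i.$$
Thus, if $\mathbf{a} := (a_1, a_2, \dots, a_n)$, $\mathbf{b} := (b_1, b_2, \dots, b_n)$, we have
$$\mathbf{a}^T = \mathbf{L}^T \mathbf{b}^T \qquad \text{or} \qquad \mathbf{a} = \mathbf{b}\mathbf{L}.$$
In other words, the coordinates in $X^*$ change according to the change of basis. We say that the change of coordinates in $X^*$ is covariant.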
2.2 Eigenvectors and Similar Matrices

Let $A : X \to X$ be a linear operator on a vector space. How can we describe the properties of $A$ that are invariant under isomorphisms? Since isomorphisms amount to changing basis, we can put it in another way. Suppose $X$ is finite dimensional, $\dim X = n$; then we may consider the matrix $\mathbf{A}$ associated to $A$ using a basis (we use the same basis both in the source and in the target $X$). But how can we catch the properties of $A$ that are independent of the basis?

One possibility is to try to choose an "optimal" basis in which, say, the matrix $\mathbf{A}$ takes the simplest form. As we have seen, if we choose two coordinate systems $\mathcal{E}$ and $\mathcal{F}$ on $X$, and $\mathbf{S}$ is the matrix that changes coordinates from $\mathcal{E}$ to $\mathcal{F}$, then the matrices $\mathbf{A}$ and $\mathbf{B}$ that represent $A$ respectively in the bases $\mathcal{E}$ and $\mathcal{F}$ are related by
$$\mathbf{B} = \mathbf{S}\mathbf{A}\mathbf{S}^{-1}.$$
Therefore we are asking for a nonsingular matrix $\mathbf{S}$ such that $\mathbf{S}^{-1}\mathbf{A}\mathbf{S}$ has the simplest possible form: this is the problem of reducing a matrix to a canonical form.

Let us try to make the meaning of "simplest" for a matrix more precise. Suppose that in $X$ there are two supplementary subspaces that are invariant under $A$,
$$X = W_1 \oplus W_2, \qquad A(W_1) \subset W_1, \quad A(W_2) \subset W_2.$$
Then every $x \in X$ splits uniquely as $x = x_1 + x_2$ with $x_1 \in W_1$, $x_2 \in W_2$, and $A(x) = A(x_1) + A(x_2)$ with $A(x_1) \in W_1$ and $A(x_2) \in W_2$. In other words, $A$ splits into two operators $A_1 : W_1 \to W_1$, $A_2 : W_2 \to W_2$ that are the restrictions of $A$ to $W_1$ and $W_2$. Now suppose that $\dim X = n$ and let $(e_1, e_2, \dots, e_k)$ and $(f_1, f_2, \dots, f_{n-k})$ be two bases respectively of $W_1$ and $W_2$. Then the matrix associated to $A$ in the basis $(e_1, e_2, \dots, e_k, f_1, f_2, \dots, f_{n-k})$ of $X$ has the form
2. Vector Spaces and Linear Maps
$$\mathbf{A} = \begin{pmatrix} \mathbf{A}_1 & 0 \\ 0 & \mathbf{A}_2 \end{pmatrix}$$
where some of the entries are zero. If we pursue this approach, the optimum would be the decomposition of $X$ into $n$ supplementary invariant subspaces $W_1, W_2, \dots, W_n$ under $A$ of dimension 1,
$$X = W_1 \oplus W_2 \oplus \dots \oplus W_n, \qquad A(W_i) \subset W_i.$$
In this case, $A$ acts on each $W_i$ as a dilation: $A(x) = \lambda_i x$ $\forall x \in W_i$ for some $\lambda_i \in \mathbb{K}$. Moreover, if $(e_1, e_2, \dots, e_n)$ is a basis of $X$ such that $e_i \in W_i$ for each $i$, then the matrix associated to $A$ in this basis is the diagonal matrix
$$\mathbf{A} = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n).$$
2.2.1 Eigenvectors

a. Eigenvectors and eigenvalues

As usual, $\mathbb{K}$ denotes the field $\mathbb{R}$ or $\mathbb{C}$.

2.33 Definition. Let $A : X \to X$ be a linear operator on a vector space $X$ over $\mathbb{K}$. We say that $x \in X$ is an eigenvector of $A$ if $Ax = \lambda x$ for some $\lambda \in \mathbb{K}$. If $x$ is a nonzero eigenvector, the number $\lambda$ for which $A(x) = \lambda x$ is called an eigenvalue of $A$, or more precisely, the eigenvalue of $A$ relative to $x$. The set of eigenvalues of $A$ is called the spectrum of $A$. If $\mathbf{A} \in M_{n,n}(\mathbb{K})$, we refer to the eigenvalues and eigenvectors of the associated linear operator $A : \mathbb{K}^n \to \mathbb{K}^n$, $A(x) := \mathbf{A}x$, as the eigenvalues and the eigenvectors of $\mathbf{A}$.

From the definition, $\lambda$ is an eigenvalue of $A$ if and only if $\ker(\lambda\,\mathrm{Id} - A) \neq \{0\}$, equivalently, if and only if $\lambda\,\mathrm{Id} - A$ is not invertible. If $\lambda$ is an eigenvalue, the subspace of all eigenvectors with eigenvalue $\lambda$,
$$V_\lambda := \{ x \in X \mid A(x) = \lambda x \} = \ker(\lambda\,\mathrm{Id} - A),$$
is called the eigenspace of $A$ relative to $\lambda$.

2.34 Example. Let $X$ be the linear space of smooth functions in $C^\infty([0, \pi])$ that vanish at $0$ and $\pi$, and let $D^2 : X \to X$ be the linear operator $D^2(f) := f''$ that maps every function $f$ into its second derivative. Nonzero eigenvectors of the operator $D^2$, that is, the nonidentically zero functions $y \in X$ such that $D^2 y(x) = \lambda y(x)$ for some $\lambda \in \mathbb{R}$, are called eigenfunctions.

2.35 Example. Let $X$ be the set $P_n$ of polynomials of degree less than $n$. Then each $P_k \subset P_n$, $k = 0, \dots, n$, is an invariant subspace for the operator of differentiation. This operator has zero as its unique eigenvalue.
2.36 ¶. Show that the rotation in $\mathbb{R}^2$ by an angle $\theta$ has no nonzero eigenvectors if $\theta \neq 0, \pi$, since in this case there are no invariant lines.
2.37 Definition. Let $A : X \to X$ be a linear operator on $X$. A subspace $W \subset X$ is invariant (under $A$) if $A(W) \subset W$.

In the following proposition we collect some simple properties of eigenvectors.

2.38 Proposition. Let $A : X \to X$ be a linear operator on $X$.
(i) $x \neq 0$ is an eigenvector if and only if $\operatorname{Span}\{x\}$ is an invariant subspace under $A$.
(ii) Let $\lambda$ be an eigenvalue of $A$ and let $V_\lambda$ be the corresponding eigenspace. Then every subspace $W \subset V_\lambda$ is an invariant subspace under $A$, i.e., $A(W) \subset W$.
(iii) $\dim\ker(\lambda\,\mathrm{Id} - A) > 0$ if and only if $\lambda$ is an eigenvalue for $A$.
(iv) Let $W \subset X$ be an invariant subspace under $A$ and let $\lambda$ be an eigenvalue for $A_{|W}$. Then $\lambda$ is an eigenvalue for $A : X \to X$.
(v) $\lambda$ is an eigenvalue for $A$ if and only if $0$ is an eigenvalue for $\lambda\,\mathrm{Id} - A$.
(vi) Let $\varphi : X \to Y$ be an isomorphism and let $A : X \to X$ be an operator. Then $x \in X$ is an eigenvector for $A$ if and only if $\varphi(x)$ is an eigenvector for $\varphi \circ A \circ \varphi^{-1}$, and $x$ and $\varphi(x)$ have the same eigenvalue.
(vii) Nonzero eigenvectors with different eigenvalues are linearly independent.

Proof. (i), ..., (vi) are trivial. To prove (vii) we proceed by induction on the number $k$ of eigenvectors. For $k = 1$ the claim is trivial. Now assume by induction that the claim holds for $k - 1$ nonzero eigenvectors, and let $e_1, e_2, \dots, e_k$ be such that $e_j \neq 0$ $\forall j = 1, \dots, k$, $A(e_j) = \lambda_j e_j$ $\forall j = 1, \dots, k$ with $\lambda_j \neq \lambda_i$ $\forall i \neq j$. Let
$$a_1 e_1 + a_2 e_2 + \dots + a_k e_k = 0 \tag{2.8}$$
be a vanishing linear combination of $e_1, e_2, \dots, e_k$. From (2.8), multiplying by $\lambda_1$ and applying $A$, respectively, we get
$$a_1\lambda_1 e_1 + a_2\lambda_1 e_2 + \dots + a_k\lambda_1 e_k = 0,$$
$$a_1\lambda_1 e_1 + a_2\lambda_2 e_2 + \dots + a_k\lambda_k e_k = 0,$$
consequently
$$\sum_{j=2}^{k} (\lambda_j - \lambda_1) a_j e_j = 0.$$
By the inductive assumption, $a_j(\lambda_j - \lambda_1) = 0$ $\forall j = 2, \dots, k$, hence $a_j = 0$ for all $j \geq 2$. We then conclude from (2.8) that we also have $a_1 = 0$, i.e., that $e_1, e_2, \dots, e_k$ are linearly independent. □
Let $A : X \to X$ be a linear operator on $X$ of dimension $n$, and let $\mathbf{A}$ be the associated matrix in a coordinate system $\mathcal{E} : X \to \mathbb{K}^n$. Then (vi) implies that $x \in X$ is an eigenvector of $A$ if and only if $\mathbf{x} := \mathcal{E}(x)$ is an eigenvector for $\mathbf{x} \to \mathbf{A}\mathbf{x}$, and $x$ and $\mathbf{x}$ have the same eigenvalue. From (vii) of Proposition 2.38 we infer the following.
2.39 Corollary. Let $A : X \to X$ be a linear operator on a vector space $X$ of dimension $n$. If $A$ has $n$ different eigenvalues, then $X$ has a basis formed by eigenvectors of $A$.

b. Similar matrices

Let $A : X \to X$ be a linear operator on a vector space $X$ of dimension $n$. As we have seen, if we fix a basis, we can represent $A$ by an $n \times n$ matrix. If $\mathbf{A}$ and $\mathbf{A}' \in M_{n,n}(\mathbb{K})$ are two such matrices that represent $A$ in two different bases $(e_1, e_2, \dots, e_n)$ and $(\epsilon_1, \epsilon_2, \dots, \epsilon_n)$, then by Proposition 2.28 $\mathbf{A}' = \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$ where $\mathbf{S}$ is the matrix that changes basis from $(e_1, e_2, \dots, e_n)$ to $(\epsilon_1, \epsilon_2, \dots, \epsilon_n)$.
2.40 Definition. Two matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{K})$ are said to be similar if there exists $\mathbf{S} \in GL(n, \mathbb{K})$ such that $\mathbf{B} = \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$.

It turns out that the similarity relation is an equivalence relation on matrices, thus $n \times n$ matrices are partitioned into classes of similar matrices. Since the matrices associated to a linear operator $A : X \to X$, $\dim X = n$, are similar, we can associate to $A$ a unique class of similar matrices. It follows that if a property is preserved by similarity, then it can be taken as a property of the linear operator to which the class is referred. For instance, let $A : X \to X$ be a linear operator, and let $\mathbf{A}$, $\mathbf{B}$ be such that $\mathbf{B} = \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$. By Binet's formula, we have
$$\det \mathbf{B} = \det \mathbf{S}^{-1} \det \mathbf{A} \det \mathbf{S} = \frac{1}{\det \mathbf{S}} \det \mathbf{A} \det \mathbf{S} = \det \mathbf{A}.$$
Thus we may define the determinant of the linear map $A : X \to X$ by $\det A := \det \mathbf{A}$ where $\mathbf{A}$ is any matrix associated to $A$.

c. The characteristic polynomial

Let $X$ be a vector space of dimension $n$, and let $A : X \to X$ be a linear operator. The function
$$\lambda \to p_A(\lambda) := \det(\lambda\,\mathrm{Id} - A), \qquad \lambda \in \mathbb{K},$$
is called the characteristic polynomial of $A$. It can be computed by representing $A$ by a matrix $\mathbf{A}$ in a coordinate system and computing $p_A(\lambda)$ as the characteristic polynomial of any of the matrices $\mathbf{A}$ representing $A$,
$$p_A(\lambda) = p_{\mathbf{A}}(\lambda) = \det(\lambda\,\mathrm{Id} - \mathbf{A}).$$
In particular, it follows that $p_A(\cdot) : \mathbb{K} \to \mathbb{K}$ is a polynomial in $\lambda$ of degree $n$, and that the roots of $p_A(\lambda)$ are the eigenvalues of $A$ or of $\mathbf{A}$. Moreover, we can state
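2.41 Proposition. We have the following.
(i) Two similar matrices $\mathbf{A}$, $\mathbf{B}$ have the same eigenvalues and the same characteristic polynomials.
(ii) If $\mathbf{A}$ has the form
$$\mathbf{A} = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix}$$
where for $i = 1, \dots, k$, each block $\mathbf{A}_i$ is a square matrix of dimension $k_i$ with principal diagonal on the principal diagonal of $\mathbf{A}$, then
$$p_{\mathbf{A}}(s) = p_{\mathbf{A}_1}(s) \cdot p_{\mathbf{A}_2}(s) \cdots p_{\mathbf{A}_k}(s).$$
(iii) We have
$$\det(s\,\mathrm{Id} - \mathbf{A}) = s^n - \operatorname{tr}\mathbf{A}\, s^{n-1} + \dots + (-1)^n \det \mathbf{A} = s^n + \sum_{k=1}^{n} (-1)^k a_k s^{n-k}$$
where $\operatorname{tr}\mathbf{A} := \sum_{i=1}^n A^i_i$ is the trace of the matrix $\mathbf{A}$, and $a_k$ is the sum of the determinants of the $k \times k$ submatrices of $\mathbf{A}$ with principal diagonal on the principal diagonal of $\mathbf{A}$.

Proof. (i) If $\mathbf{B} = \mathbf{S}\mathbf{A}\mathbf{S}^{-1}$, $\mathbf{S} \in GL(n, \mathbb{K})$, then $s\,\mathrm{Id} - \mathbf{B} = \mathbf{S}(s\,\mathrm{Id} - \mathbf{A})\mathbf{S}^{-1}$, hence
$$\det(s\,\mathrm{Id} - \mathbf{B}) = \det \mathbf{S}\, \det(s\,\mathrm{Id} - \mathbf{A})\, (\det \mathbf{S})^{-1} = \det(s\,\mathrm{Id} - \mathbf{A}).$$
(ii) The matrix $s\,\mathrm{Id} - \mathbf{A}$ is a block matrix of the same form,
$$s\,\mathrm{Id} - \mathbf{A} = \begin{pmatrix} s\,\mathrm{Id} - \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & s\,\mathrm{Id} - \mathbf{A}_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & s\,\mathrm{Id} - \mathbf{A}_k \end{pmatrix},$$
hence $\det(s\,\mathrm{Id} - \mathbf{A}) = \prod_{i=1}^{k} \det(s\,\mathrm{Id} - \mathbf{A}_i)$.
(iii) We leave it to the reader. □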
Notice that there exist matrices with the same eigenvalues that are not similar, see Exercise 2.73.
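As an illustration of Proposition 2.41 (i), the following minimal sketch computes the coefficients of $\det(s\,\mathrm{Id} - \mathbf{A})$ in exact rational arithmetic and compares two similar matrices. The function names `mat_mul` and `char_poly` are hypothetical, and the Faddeev-LeVerrier recursion used here is a standard technique brought in for the illustration; it is not the method of the text.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def char_poly(a):
    """Coefficients [1, c_1, ..., c_n] of det(s Id - A) = s^n + c_1 s^(n-1) + ... + c_n.

    Uses the Faddeev-LeVerrier recursion (an assumption of this sketch):
    M_1 = Id, M_k = A M_(k-1) + c_(k-1) Id, c_k = -tr(A M_k) / k.
    """
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]  # plays the role of M_0 = 0
    for k in range(1, n + 1):
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += coeffs[-1]
        am = mat_mul(a, m)
        coeffs.append(-sum(am[i][i] for i in range(n)) / k)
    return coeffs

# Illustrative data: B = S^{-1} A S is similar to A.
A = [[2, 1], [0, 3]]
S = [[1, 1], [0, 1]]
S_inv = [[1, -1], [0, 1]]
B = mat_mul(S_inv, mat_mul(A, S))
same_polynomial = char_poly(A) == char_poly(B)
```

In agreement with Proposition 2.41 (iii), the second coefficient returned by `char_poly` is $-\operatorname{tr}\mathbf{A}$ and the last one is $(-1)^n \det \mathbf{A}$.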
d. Algebraic and geometric multiplicity

2.42 Definition. Let $A : X \to X$ be a linear operator, and let $\lambda \in \mathbb{K}$ be an eigenvalue of $A$. We say that $\lambda$ has geometric multiplicity $k \in \mathbb{N}$ if $\dim\ker(\lambda\,\mathrm{Id} - A) = k$. Let $p_A(s)$ be the characteristic polynomial of $A$. We say that $\lambda$ has algebraic multiplicity $k$ if
$$p_A(s) = (s - \lambda)^k q(s), \qquad \text{where } q(\lambda) \neq 0.$$
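2.43 Proposition. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$ and let $\lambda$ be an eigenvalue of $A$ of algebraic multiplicity $m$. Then $\dim\ker(\lambda\,\mathrm{Id} - A) \leq m$.

Proof. Let us choose a basis $(e_1, e_2, \dots, e_n)$ in $X$ such that $(e_1, e_2, \dots, e_k)$ is a basis for $V_\lambda := \ker(\lambda\,\mathrm{Id} - A)$. The matrix $\mathbf{A}$ associated to $A$ in this basis has the form
$$\mathbf{A} = \begin{pmatrix} \lambda\,\mathrm{Id} & \mathbf{C} \\ 0 & \mathbf{D} \end{pmatrix}$$
where the first block, $\lambda\,\mathrm{Id}$, is a $k \times k$ matrix with $k = \dim V_\lambda$. Thus, computing the determinant by blocks as in Proposition 2.41 (ii) (the determinant of a block triangular matrix is the product of the determinants of its diagonal blocks), we get
$$p_{\mathbf{A}}(s) = \det(s\,\mathrm{Id} - \mathbf{A}) = (s - \lambda)^k p_{\mathbf{D}}(s),$$
and the algebraic multiplicity of $\lambda$ is at least $k$. □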
e. Diagonizable matrices

2.44 Definition. We say that $\mathbf{A} \in M_{n,n}(\mathbb{K})$ is diagonizable if $\mathbf{A}$ is similar to a diagonal matrix.

2.45 Theorem. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$, and let $(e_1, e_2, \dots, e_n)$ be a basis of $X$. The following claims are equivalent.
(i) $e_1, e_2, \dots, e_n$ are eigenvectors of $A$ and $\lambda_1, \lambda_2, \dots, \lambda_n$ are the relative eigenvalues.
(ii) We have $A(x) = \sum_{i=1}^n \lambda_i x^i e_i$ for all $x \in X$, if $x = \sum_{i=1}^n x^i e_i$.
(iii) The matrix that represents $A$ in the basis $(e_1, e_2, \dots, e_n)$ is $\operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$.
(iv) If $\mathbf{A}$ is the matrix associated to $A$ in the basis $(f_1, f_2, \dots, f_n)$, then $\mathbf{S}^{-1}\mathbf{A}\mathbf{S} = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$ where $\mathbf{S}$ is the matrix that changes basis from $(f_1, f_2, \dots, f_n)$ to $(e_1, e_2, \dots, e_n)$, i.e., the $i$th column of $\mathbf{S}$ is the coordinate vector of the eigenvector $e_i$ in the basis $(f_1, f_2, \dots, f_n)$.
Proof. (i) $\Leftrightarrow$ (ii) by linearity and (iii) $\Leftrightarrow$ (i) since (iii) is equivalent to $A(e_i) = \lambda_i e_i$. Finally (iii) $\Leftrightarrow$ (iv) by Corollary 2.29. □
2.46 Corollary. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$. Then the following claims are equivalent.
(i) $X$ splits as the direct sum of $n$ one-dimensional invariant subspaces (under $A$), $X = W_1 \oplus \dots \oplus W_n$.
(ii) $X$ has a basis made of eigenvectors of $A$.
(iii) Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be all the distinct eigenvalues of $A$, and let $V_{\lambda_1}, \dots, V_{\lambda_k}$ be the corresponding eigenspaces. Then $\sum_{i=1}^k \dim V_{\lambda_i} = n$.
(iv) If $\mathbf{A}$ is the matrix associated to $A$ in a basis, then $\mathbf{A}$ is diagonizable.

Proof. (i) implies (ii) since any nonzero vector in any of the $W_i$'s is an eigenvector. Denoting by $(e_1, e_2, \dots, e_n)$ a basis of eigenvectors, the spaces $W_i := \operatorname{Span}\{e_i\}$ are supplementary spaces of dimension one, hence (ii) implies (i). (iii) is a rewriting of (i) since for each eigenvalue $\lambda$, $V_\lambda$ is the direct sum of the $W_i$'s that have $\lambda$ as the corresponding eigenvalue. Finally (ii) and (iv) are equivalent by Theorem 2.45. □
2.47 Linear equations. The existence of a basis of $X$ of eigenvectors of an operator $A : X \to X$ makes solving the linear equation $A(x) = y$ trivial. Let $(e_1, e_2, \dots, e_n)$ be a basis of $X$ of eigenvectors of $A$ and let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the corresponding eigenvalues. Writing $x, y \in X$ in this basis,
$$x = \sum_{i=1}^n x^i e_i, \qquad y = \sum_{i=1}^n y^i e_i,$$
we rewrite the equation $A(x) = y$ as
$$\sum_{i=1}^n (\lambda_i x^i - y^i) e_i = 0,$$
i.e., as the diagonal system
$$\lambda_1 x^1 = y^1, \quad \dots, \quad \lambda_n x^n = y^n.$$
Therefore
(i) suppose that $0$ is not an eigenvalue, then $A(x) = y$ has a unique solution
$$x = \sum_{i=1}^n \frac{y^i}{\lambda_i} e_i.$$
(ii) let $0$ be an eigenvalue, and let $V_0 = \operatorname{Span}\{e_1, e_2, \dots, e_k\}$. Then $A(x) = y$ is solvable if and only if $y^1 = \dots = y^k = 0$, and a solution of $A(x) = y$ is $x_0 := \sum_{i=k+1}^n \frac{y^i}{\lambda_i} e_i$. By linearity, the space of all solutions is the set
$$\{ x \in X \mid x - x_0 \in \ker A = V_0 \}.$$
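The case distinction of 2.47 can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name `solve_in_eigenbasis` is hypothetical, vectors are given by their coordinates in the eigenbasis $(e_1, \dots, e_n)$, and in case (ii) the free coordinates along $V_0$ are set to zero, which yields the particular solution $x_0$ of the text.

```python
from fractions import Fraction

def solve_in_eigenbasis(eigenvalues, y_coords):
    """Solve the diagonal system lambda_i * x^i = y^i coordinate by coordinate.

    Returns the coordinates of one solution in the eigenbasis, or None when
    the system is not solvable (some lambda_i = 0 with y^i != 0).
    """
    x = []
    for lam, y in zip(eigenvalues, y_coords):
        if lam != 0:
            x.append(Fraction(y) / Fraction(lam))
        elif y == 0:
            x.append(Fraction(0))  # free coordinate along ker A, set to zero
        else:
            return None
    return x

unique_solution = solve_in_eigenbasis([2, 3], [4, 9])
particular_solution = solve_in_eigenbasis([0, 5], [0, 10])
no_solution = solve_in_eigenbasis([0, 5], [1, 10])
```

When `None` is not returned, every other solution differs from the returned one by an element of $\ker A$, i.e., by arbitrary values in the coordinates whose eigenvalue is zero.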
2.48 ¶. Let $A : X \to X$ be a linear operator on a finite-dimensional space. Show that $A$ is invertible if and only if $0$ is not an eigenvalue for $A$. In this case show that $1/\lambda$ is an eigenvalue for $A^{-1}$ if and only if $\lambda$ is an eigenvalue for $A$.
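f. Triangularizable matrices

First, we notice that the eigenvalues of a triangular matrix are the entries of the principal diagonal. We can then state the following.

2.49 Theorem. Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$. If the characteristic polynomial decomposes as a product of factors of first degree, i.e., if there are (not necessarily distinct) numbers $\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{K}$ such that
$$\det(\lambda\,\mathrm{Id} - \mathbf{A}) = (\lambda - \lambda_1) \cdots (\lambda - \lambda_n),$$
then $\mathbf{A}$ is similar to an upper triangular matrix.

Proof. Let us prove the following equivalent claim. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$. If $p_A(\lambda)$ factorizes as a product of factors of first degree, then there exists a basis $(u_1, u_2, \dots, u_n)$ of $X$ such that
$$\operatorname{Span}\{u_1\},\ \operatorname{Span}\{u_1, u_2\},\ \operatorname{Span}\{u_1, u_2, u_3\},\ \dots,\ \operatorname{Span}\{u_1, u_2, \dots, u_n\}$$
are invariant subspaces under $A$. In this case we have for the linear operator $A(x) = \mathbf{A}x$ associated to $\mathbf{A}$
$$\begin{cases} \mathbf{A}u_1 = A(u_1) = a^1_1 u_1, \\ \mathbf{A}u_2 = A(u_2) = a^1_2 u_1 + a^2_2 u_2, \\ \quad\vdots \\ \mathbf{A}u_n = A(u_n) = a^1_n u_1 + a^2_n u_2 + \dots + a^n_n u_n, \end{cases}$$
i.e., the matrix $\mathbf{A}'$ associated to $A$ using the basis $(u_1, u_2, \dots, u_n)$ is upper triangular,
$$\mathbf{A}' = \begin{pmatrix} a^1_1 & a^1_2 & \dots & a^1_n \\ 0 & a^2_2 & \dots & a^2_n \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & a^n_n \end{pmatrix}.$$
We proceed by induction on the dimension of $X$. If $\dim X = 1$, the claim is trivial. Now suppose that the claim holds for any linear operator on a vector space of dimension $n - 1$, and let us prove the claim for $A$. From
$$p_A(\lambda) = \det(\lambda\,\mathrm{Id} - A) = (\lambda - \lambda_1) \cdots (\lambda - \lambda_n),$$
$\lambda_1$ is an eigenvalue of $A$, hence there is a corresponding nonzero eigenvector $u_1$, and $\operatorname{Span}\{u_1\}$ is an invariant subspace under $A$. Now we complete $\{u_1\}$ to a basis by adding vectors $v_2, \dots, v_n$, and let $B$ be the operator on $\operatorname{Span}\{v_2, \dots, v_n\}$ obtained by restricting $A$ to $\operatorname{Span}\{v_2, \dots, v_n\}$ and then projecting onto $\operatorname{Span}\{v_2, \dots, v_n\}$ along $u_1$. Let $\mathbf{B}$ be the matrix associated to $B$ in the basis $(v_2, \dots, v_n)$, and let $\mathbf{A}$ be the matrix associated to $A$ in the basis $(u_1, v_2, \dots, v_n)$. Then
$$\mathbf{A} = \begin{pmatrix} a_1 & * \\ 0 & \mathbf{B} \end{pmatrix} \qquad \text{where } a_1 = \lambda_1.$$
Thus
$$p_A(\lambda) = p_{\mathbf{A}}(\lambda) = (\lambda - \lambda_1)\, p_{\mathbf{B}}(\lambda) = (\lambda - \lambda_1)\, p_B(\lambda).$$
It follows that the characteristic polynomial of $B$ is $p_B(\lambda) = (\lambda - \lambda_2) \cdots (\lambda - \lambda_n)$. By the inductive hypothesis, there exists a basis $(u_2, \dots, u_n)$ of $\operatorname{Span}\{v_2, \dots, v_n\}$ such that
$$\operatorname{Span}\{u_2\},\ \operatorname{Span}\{u_2, u_3\},\ \dots,\ \operatorname{Span}\{u_2, \dots, u_n\}$$
are invariant subspaces under $B$. Since $A(u_j) - B(u_j) \in \operatorname{Span}\{u_1\}$ for $j \geq 2$, we conclude that
$$\operatorname{Span}\{u_1\},\ \operatorname{Span}\{u_1, u_2\},\ \operatorname{Span}\{u_1, u_2, u_3\},\ \dots,\ \operatorname{Span}\{u_1, u_2, \dots, u_n\}$$
are invariant subspaces under $A$. □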
2.2.2 Complex matrices

When $\mathbb{K} = \mathbb{C}$, a significant simplification arises. Because of the fundamental theorem of algebra, the characteristic polynomial $p_A(\lambda)$ of every linear operator $A : X \to X$ over a complex vector space $X$ of dimension $n$ factorizes as a product of $n$ factors of first degree. In particular, $A$ has $n$ eigenvalues, if we count them with their multiplicities. From Theorem 2.49 we conclude the following at once.

2.50 Corollary. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ be a complex matrix. Then $\mathbf{A}$ is similar to an upper triangular matrix, that is, there exists a nonsingular matrix $\mathbf{S} \in M_{n,n}(\mathbb{C})$ such that $\mathbf{S}^{-1}\mathbf{A}\mathbf{S}$ is upper triangular.

Moreover,

2.51 Corollary. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ be a matrix. Then $\mathbf{A}$ is diagonizable (as a complex matrix) if and only if the geometric and algebraic multiplicities of each eigenvalue agree.
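Proof. Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of $\mathbf{A}$; for each $i = 1, \dots, k$, let $m_i$ and $V_{\lambda_i}$ respectively be the algebraic multiplicity and the eigenspace of $\lambda_i$. If $\dim V_{\lambda_i} = m_i$ $\forall i$, then by the fundamental theorem of algebra
$$\sum_{i=1}^k \dim V_{\lambda_i} = \sum_{i=1}^k m_i = n.$$
Hence $\mathbf{A}$ is diagonizable, by Corollary 2.46. Conversely, if $\mathbf{A}$ is diagonizable, then $\sum_{i=1}^k \dim V_{\lambda_i} = n$. By Proposition 2.43, $\dim V_{\lambda_i} \leq m_i$, hence
$$n = \sum_{i=1}^k m_i \geq \sum_{i=1}^k \dim V_{\lambda_i} = n,$$
so that equality holds in each term, i.e., $\dim V_{\lambda_i} = m_i$ for every $i$. □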
2.52 Remark (Real and complex eigenvalues). If $\mathbf{A} \in M_{n,n}(\mathbb{R})$, its eigenvalues are by definition the real solutions of the polynomial equation $\det(\lambda\,\mathrm{Id} - \mathbf{A}) = 0$. But $\mathbf{A}$ is also a matrix with complex entries, $\mathbf{A} \in M_{n,n}(\mathbb{C})$, and as such its eigenvalues are the complex solutions of $\det(\lambda\,\mathrm{Id} - \mathbf{A}) = 0$. It is customary to call eigenvalues of $\mathbf{A}$ the complex solutions of $\det(\lambda\,\mathrm{Id} - \mathbf{A}) = 0$ even if $\mathbf{A}$ has real entries, while the real solutions of the same equation, which are the eigenvalues of the real matrix $\mathbf{A}$ following Definition 2.33, are called real eigenvalues.

The further developments we want to discuss depend on some relationships among polynomials and matrices that we now want to illustrate.

a. The Cayley-Hamilton theorem

Given a polynomial $f(t) = \sum_{k=0}^n a_k t^k$, to every $n \times n$ matrix $\mathbf{A}$ we can associate a new matrix $f(\mathbf{A})$ defined by
$$f(\mathbf{A}) := a_0\,\mathrm{Id} + \sum_{k=1}^n a_k \mathbf{A}^k =: \sum_{k=0}^n a_k \mathbf{A}^k$$
if we set $\mathbf{A}^0 := \mathrm{Id}$. It is easily seen that, if a polynomial $f(t)$ factors as $f(t) = p(t)q(t)$, then the matrices $p(\mathbf{A})$ and $q(\mathbf{A})$ commute, and we have $f(\mathbf{A}) = p(\mathbf{A})q(\mathbf{A}) = q(\mathbf{A})p(\mathbf{A})$.

2.53 Proposition. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$, and let $p(t)$ be a polynomial. Then
(i) if $\lambda$ is an eigenvalue of $\mathbf{A}$, then $p(\lambda)$ is an eigenvalue of $p(\mathbf{A})$,
(ii) if $\mu$ is an eigenvalue of $p(\mathbf{A})$, then $\mu = p(\lambda)$ for some eigenvalue $\lambda$
of $\mathbf{A}$.

Proof. (i) follows observing that $\lambda^k$, $k \in \mathbb{N}$, is an eigenvalue of $\mathbf{A}^k$ if $\lambda$ is an eigenvalue of $\mathbf{A}$. (ii) Since $\mu$ is an eigenvalue of $p(\mathbf{A})$, the matrix $p(\mathbf{A}) - \mu\,\mathrm{Id}$ is singular. Let $p(t) = \sum_{i=0}^k a_i t^i$ be of degree $k$, $a_k \neq 0$. By the fundamental theorem of algebra we have
$$p(t) - \mu = a_k \prod_{i=1}^k (t - r_i),$$
hence $p(\mathbf{A}) - \mu\,\mathrm{Id} = a_k \prod_{i=1}^k (\mathbf{A} - r_i\,\mathrm{Id})$ and, since $p(\mathbf{A}) - \mu\,\mathrm{Id}$ is singular, at least one of its factors, say $\mathbf{A} - r_1\,\mathrm{Id}$, is singular. Consequently, $r_1$ is an eigenvalue of $\mathbf{A}$ and, trivially, $p(r_1) - \mu = 0$. □
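Now consider two polynomials $\mathbf{P}(t) := \sum_j \mathbf{P}_j t^j$ and $\mathbf{Q}(t) := \sum_k \mathbf{Q}_k t^k$ with $n \times n$ matrices as coefficients. Trivially, the product polynomial $\mathbf{R}(t) := \mathbf{P}(t)\mathbf{Q}(t)$ is given by
$$\mathbf{R}(t) = \sum_{j,k} \mathbf{P}_j \mathbf{Q}_k t^{j+k}.$$

2.54 Lemma. Using the previous notation, if $\mathbf{A} \in M_{n,n}(\mathbb{C})$ commutes with the coefficients of $\mathbf{Q}(t)$, then $\mathbf{R}(\mathbf{A}) = \mathbf{P}(\mathbf{A})\mathbf{Q}(\mathbf{A})$.

Proof. In fact,
$$\mathbf{R}(\mathbf{A}) = \sum_{j,k} \mathbf{P}_j \mathbf{Q}_k \mathbf{A}^{j+k} = \sum_{j,k} (\mathbf{P}_j \mathbf{A}^j)(\mathbf{Q}_k \mathbf{A}^k) = \Big(\sum_j \mathbf{P}_j \mathbf{A}^j\Big)\Big(\sum_k \mathbf{Q}_k \mathbf{A}^k\Big) = \mathbf{P}(\mathbf{A})\mathbf{Q}(\mathbf{A}).$$ □

2.55 Theorem (Cayley-Hamilton). Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ and let $p_{\mathbf{A}}(s)$ be its characteristic polynomial, $p_{\mathbf{A}}(s) := \det(s\,\mathrm{Id} - \mathbf{A})$. Then $p_{\mathbf{A}}(\mathbf{A}) = 0$.

Proof. Set $\mathbf{Q}(s) := s\,\mathrm{Id} - \mathbf{A}$, $s \in \mathbb{C}$, and denote by $\operatorname{cof}\mathbf{Q}(s)$ the matrix of cofactors of $\mathbf{Q}(s)$. By Laplace's formulas, see (1.22),
$$\operatorname{cof}\mathbf{Q}(s)\, \mathbf{Q}(s) = \mathbf{Q}(s) \operatorname{cof}\mathbf{Q}(s) = \det \mathbf{Q}(s)\,\mathrm{Id} = p_{\mathbf{A}}(s)\,\mathrm{Id}.$$
Since $\mathbf{A}$ trivially commutes with the coefficients $\mathrm{Id}$ and $\mathbf{A}$ of $\mathbf{Q}(s)$, Lemma 2.54 yields
$$p_{\mathbf{A}}(\mathbf{A}) = p_{\mathbf{A}}(\mathbf{A})\,\mathrm{Id} = \operatorname{cof}\mathbf{Q}(\mathbf{A})\, \mathbf{Q}(\mathbf{A}) = \operatorname{cof}\mathbf{Q}(\mathbf{A}) \cdot 0 = 0.$$ □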
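For $2 \times 2$ matrices the Cayley-Hamilton theorem can be checked by hand, since by Proposition 2.41 (iii) the characteristic polynomial is $s^2 - \operatorname{tr}\mathbf{A}\, s + \det \mathbf{A}$. The sketch below evaluates this polynomial at the matrix itself; the function name `cayley_hamilton_residual_2x2` is hypothetical, and the code is a numerical illustration on a sample matrix, not a proof.

```python
def cayley_hamilton_residual_2x2(a):
    """Return p_A(A) = A^2 - (tr A) A + (det A) Id for a 2x2 matrix A."""
    (p, q), (r, s) = a
    trace = p + s
    det = p * s - q * r
    # A^2 written out entry by entry
    a_squared = [[p * p + q * r, p * q + q * s],
                 [r * p + s * r, r * q + s * s]]
    identity = [[1, 0], [0, 1]]
    return [[a_squared[i][j] - trace * a[i][j] + det * identity[i][j]
             for j in range(2)]
            for i in range(2)]

# Sample matrix chosen only for illustration.
residual = cayley_hamilton_residual_2x2([[1, 2], [3, 4]])
```

For any choice of integer entries the returned matrix should be the zero matrix, in agreement with Theorem 2.55.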
b. Factorization and invariant subspaces

Given two polynomials $P_1$, $P_2$ with $\deg P_1 \geq \deg P_2$, we may divide $P_1$ by $P_2$, i.e., uniquely decompose $P_1$ as $P_1 = QP_2 + R$ where $\deg R < \deg P_2$. This allows us to define the greatest common divisor (g.c.d.) of two polynomials, which is defined up to a scalar factor, and to compute it by Euclid's algorithm. Moreover, since complex polynomials factor into irreducible factors of degree 1, the g.c.d. of two complex polynomials is a constant polynomial if and only if the two polynomials have no common root. We also have

2.56 Lemma. Let $p(t)$ and $q(t)$ be two polynomials with no common zeros. Then there exist polynomials $a(t)$ and $b(t)$ such that $a(t)p(t) + b(t)q(t) = 1$ $\forall t \in \mathbb{C}$.
We refer the readers to [GM2], but for their convenience we add the proof of Lemma 2.56.

Proof. Let
$$\mathcal{V} := \{ r(t) := a(t)p(t) + b(t)q(t) \mid a(t), b(t) \text{ are polynomials} \}$$
and let $d = \alpha p + \beta q$ be the nonzero polynomial of minimum degree in $\mathcal{V}$. We claim that $d$ divides both $p$ and $q$. Otherwise, dividing for instance $p$ by $d$ we would get a nonzero remainder $r := p - md$ and, since $p$ and $d$ are in $\mathcal{V}$, $r = p - md \in \mathcal{V}$ also, hence a contradiction, since $r$ has degree strictly less than $d$. Then we claim that the degree of $d$ is zero. Otherwise, $d$ would have a root that should be common to $p$ and $q$ since $d$ divides both $p$ and $q$. In conclusion, $d$ is a nonzero constant polynomial, and dividing $\alpha$ and $\beta$ by this constant gives the claim. □
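The polynomials $a(t)$ and $b(t)$ of Lemma 2.56 can also be computed explicitly. The following minimal sketch uses the extended form of Euclid's algorithm mentioned above, rather than the minimal-degree argument of the proof. Polynomials are represented as lists of rational coefficients, lowest degree first; this representation and all function names (`trim`, `poly_add`, `poly_mul`, `poly_divmod`, `bezout`) are assumptions made for the illustration.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([x + y for x, y in zip(p, q)])

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)

def poly_divmod(p, d):
    """Division with remainder: p = m*d + r with deg r < deg d (d nonzero)."""
    r, d = trim(p), trim(d)
    m = [Fraction(0)] * max(len(r) - len(d) + 1, 1)
    while len(r) >= len(d):
        shift = len(r) - len(d)
        c = r[-1] / d[-1]
        m[shift] += c
        # subtract c * t^shift * d, which cancels the leading term of r
        r = poly_add(r, poly_mul([Fraction(0)] * shift + [-c], d))
    return trim(m), r

def bezout(p, q):
    """Return (a, b, d) with a*p + b*q = d, where d is a g.c.d. of p and q."""
    r0 = trim([Fraction(x) for x in p])
    r1 = trim([Fraction(x) for x in q])
    a0, a1 = [Fraction(1)], []
    b0, b1 = [], [Fraction(1)]
    while r1:
        m, r = poly_divmod(r0, r1)
        neg_m = [-x for x in m]
        r0, r1 = r1, r
        a0, a1 = a1, poly_add(a0, poly_mul(neg_m, a1))
        b0, b1 = b1, poly_add(b0, poly_mul(neg_m, b1))
    return a0, b0, r0

# p(t) = t - 1 and q(t) = t - 2 have no common zero.
p, q = [-1, 1], [-2, 1]
a, b, d = bezout(p, q)
```

When $p$ and $q$ have no common zero, the returned `d` is a nonzero constant, and dividing `a` and `b` by it gives the identity $a(t)p(t) + b(t)q(t) = 1$ of the lemma.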
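2.57 Proposition. For every polynomial $p$, the kernel of $p(\mathbf{A})$ is an invariant subspace for $\mathbf{A} \in M_{n,n}(\mathbb{C})$.

Proof. Let $w \in \ker p(\mathbf{A})$. Since $t\,p(t) = p(t)\,t$, we infer $\mathbf{A}p(\mathbf{A}) = p(\mathbf{A})\mathbf{A}$. Therefore
$$p(\mathbf{A})(\mathbf{A}w) = (p(\mathbf{A})\mathbf{A})w = (\mathbf{A}p(\mathbf{A}))w = \mathbf{A}\,p(\mathbf{A})w = \mathbf{A}0 = 0.$$
Hence $\mathbf{A}w \in \ker p(\mathbf{A})$. □

2.58 Proposition. Let $p$ be the product of two coprime polynomials, $p(t) = p_1(t)p_2(t)$, and let $\mathbf{A} \in M_{n,n}(\mathbb{C})$. Then
$$\ker p(\mathbf{A}) = \ker p_1(\mathbf{A}) \oplus \ker p_2(\mathbf{A}).$$

Proof. By Lemma 2.56, there exist two polynomials $a_1$, $a_2$ such that $a_1(t)p_1(t) + a_2(t)p_2(t) = 1$. Hence
$$a_1(\mathbf{A})p_1(\mathbf{A}) + a_2(\mathbf{A})p_2(\mathbf{A}) = \mathrm{Id}. \tag{2.9}$$
Set
$$W_1 := \ker p_1(\mathbf{A}), \qquad W_2 := \ker p_2(\mathbf{A}), \qquad W := \ker p(\mathbf{A}).$$
Clearly $W_1, W_2 \subset W$. Now for every $x \in W$, we have $a_1(\mathbf{A})p_1(\mathbf{A})x \in W_2$ since
$$p_2(\mathbf{A})a_1(\mathbf{A})p_1(\mathbf{A})x = p_2(\mathbf{A})(\mathrm{Id} - a_2(\mathbf{A})p_2(\mathbf{A}))x = (\mathrm{Id} - a_2(\mathbf{A})p_2(\mathbf{A}))p_2(\mathbf{A})x = a_1(\mathbf{A})p_1(\mathbf{A})p_2(\mathbf{A})x = a_1(\mathbf{A})p(\mathbf{A})x = 0,$$
and, similarly, $a_2(\mathbf{A})p_2(\mathbf{A})x \in W_1$. Thus, by (2.9), $W = W_1 + W_2$. Finally $W = W_1 \oplus W_2$. In fact, if $y \in W_1 \cap W_2$, then by (2.9), we have
$$y = a_1(\mathbf{A})p_1(\mathbf{A})y + a_2(\mathbf{A})p_2(\mathbf{A})y = 0 + 0 = 0.$$ □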
c. Generalized eigenvectors and the spectral theorem

2.59 Definition. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$, and let $\lambda$ be an eigenvalue of $\mathbf{A}$ of multiplicity $k$. We call generalized eigenvectors of $\mathbf{A}$ relative to the eigenvalue $\lambda$ the elements of
$$W := \ker(\lambda\,\mathrm{Id} - \mathbf{A})^k.$$
Of course,
(i) eigenvectors relative to $\lambda$ are generalized eigenvectors relative to $\lambda$,
(ii) the spaces of generalized eigenvectors are invariant subspaces for $\mathbf{A}$.
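2.60 Theorem. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$. Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the eigenvalues of $\mathbf{A}$ with multiplicities $m_1, m_2, \dots, m_k$ and let $W_1, W_2, \dots, W_k$ be the subspaces of the relative generalized eigenvectors, $W_i := \ker(\lambda_i\,\mathrm{Id} - \mathbf{A})^{m_i}$. Then
(i) the spaces $W_1, W_2, \dots, W_k$ are supplementary, consequently there is a basis of $\mathbb{C}^n$ of generalized eigenvectors of $\mathbf{A}$,
(ii) $\dim W_i = m_i$.
Consequently, if we choose a basis $(e_1, e_2, \dots, e_n)$ where the first $m_1$ elements span $W_1$, the following $m_2$ elements span $W_2$, ..., and the last $m_k$ elements span $W_k$, then the matrix $\mathbf{A}' \in M_{n,n}(\mathbb{C})$ that represents $\mathbf{A}$ in the new basis is similar to $\mathbf{A}$ and has the form
$$\mathbf{A}' = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix}$$
where for every $i = 1, \dots, k$, the block $\mathbf{A}_i$ is an $m_i \times m_i$ matrix with $\lambda_i$ as the only eigenvalue, with multiplicity $m_i$, and, of course, $(\lambda_i\,\mathrm{Id} - \mathbf{A}_i)^{m_i} = 0$.

Proof. (i) Clearly the polynomials $p_1(s) := (\lambda_1 - s)^{m_1}$, $p_2(s) := (\lambda_2 - s)^{m_2}$, ..., $p_k(s) := (\lambda_k - s)^{m_k}$ factorize $p_{\mathbf{A}}$ (up to a sign) and are coprime. Set $\mathbf{N}_i := p_i(\mathbf{A})$ and notice that $W_i = \ker \mathbf{N}_i$. Repeatedly applying Proposition 2.58, we then get
$$\ker p_{\mathbf{A}}(\mathbf{A}) = \ker(\mathbf{N}_1\mathbf{N}_2 \cdots \mathbf{N}_k) = \ker(\mathbf{N}_1) \oplus \ker(\mathbf{N}_2\mathbf{N}_3 \cdots \mathbf{N}_k) = \dots = W_1 \oplus W_2 \oplus \dots \oplus W_k.$$
(i) then follows from the Cayley-Hamilton theorem, $\ker p_{\mathbf{A}}(\mathbf{A}) = \mathbb{C}^n$.
(ii) It remains to show that $\dim W_i = m_i$ $\forall i$. Let $(e_1, e_2, \dots, e_n)$ be a basis such that the first $h_1$ elements span $W_1$, the following $h_2$ elements span $W_2$, ..., and the last $h_k$ elements span $W_k$. $\mathbf{A}$ is therefore similar to a block matrix
$$\mathbf{A}' = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix}$$
where the block $\mathbf{A}_i$ is a square matrix of dimension $h_i := \dim W_i$. On the other hand, the $h_i \times h_i$ matrix $(\lambda_i\,\mathrm{Id} - \mathbf{A}_i)^{m_i}$ vanishes, hence all the eigenvalues of $\lambda_i\,\mathrm{Id} - \mathbf{A}_i$ are zero. Therefore $\mathbf{A}_i$ has a unique eigenvalue $\lambda_i$ with multiplicity $h_i$, and $p_{\mathbf{A}_i}(s) = (s - \lambda_i)^{h_i}$. We then have
$$p_{\mathbf{A}}(s) = p_{\mathbf{A}'}(s) = \prod_{i=1}^k p_{\mathbf{A}_i}(s) = \prod_{i=1}^k (s - \lambda_i)^{h_i},$$
and the uniqueness of the factorization yields $h_i = m_i$. The rest of the claim is trivial. □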
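Another proof of $\dim W_i = m_i$ goes as follows. First we show the following.

2.61 Lemma. If $0$ is an eigenvalue of $\mathbf{B} \in M_{n,n}(\mathbb{C})$ with multiplicity $m$, then $0$ is an eigenvalue for $\mathbf{B}^m$ with multiplicity $m$.

Proof. The function $1 - \lambda^m$, $\lambda \in \mathbb{C}$, can be factorized as
$$1 - \lambda^m = (-1)^{m+1} \prod_{i=0}^{m-1} (\omega^i - \lambda)$$
where $\omega := e^{i2\pi/m}$ is a root of unity (the two polynomials have the same degree and take the same values at the $m$ roots of unity and at $0$). For $z, t \in \mathbb{C}$, $z \neq 0$,
$$z^m - t^m = z^m\Big(1 - \Big(\frac{t}{z}\Big)^m\Big) = (-1)^{m+1} z^m \prod_{i=0}^{m-1} \Big(\omega^i - \frac{t}{z}\Big) = (-1)^{m+1} \prod_{i=0}^{m-1} (\omega^i z - t),$$
and the identity between the first and the last term extends to $z = 0$; hence
$$z^m\,\mathrm{Id} - \mathbf{B}^m = (-1)^{m+1} \prod_{i=0}^{m-1} (\omega^i z\,\mathrm{Id} - \mathbf{B}).$$
Write $p_{\mathbf{B}}(s) = s^m q(s)$ with $q(0) \neq 0$. If we set $q_0(z) := \prod_{i=0}^{m-1} q(\omega^i z)$, we have $q_0(0) \neq 0$ and, since $\prod_{i=0}^{m-1} \omega^{im} = 1$,
$$p_{\mathbf{B}^m}(z^m) := \det(z^m\,\mathrm{Id} - \mathbf{B}^m) = \pm\prod_{i=0}^{m-1} p_{\mathbf{B}}(\omega^i z) = \pm\prod_{i=0}^{m-1} (\omega^i z)^m q(\omega^i z) = \pm z^{m^2} q_0(z). \tag{2.10}$$
On the other hand $p_{\mathbf{B}^m}(s) = s^r q_1(s)$ for some $q_1$ with $q_1(0) \neq 0$ and some $r \geq 1$, so that $p_{\mathbf{B}^m}(z^m) = z^{mr} q_1(z^m)$. Comparing with (2.10) we get $mr = m^2$. Thus $p_{\mathbf{B}^m}(s) = s^m q_1(s)$, i.e., $0$ is an eigenvalue of multiplicity $m$ for $\mathbf{B}^m$. □

Another proof that $\dim W_i = m_i$ in Theorem 2.60. Since
$$\sum_{i=1}^k m_i = \sum_{i=1}^k \dim W_i = \dim X,$$
it suffices to show that $\dim W_i \leq m_i$ $\forall i$. Since $0$ is an eigenvalue of $\mathbf{B} := \lambda_i\,\mathrm{Id} - \mathbf{A}$ of multiplicity $m := m_i$, $0$ is an eigenvalue of multiplicity $m$ for $\mathbf{B}^m$ by Lemma 2.61. Since $W_i$ is the eigenspace corresponding to the eigenvalue $0$ of $\mathbf{B}^m$, it follows from Proposition 2.43 that $\dim W_i \leq m$. □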
d. Jordan's canonical form

2.62 Definition. A matrix $\mathbf{B} \in M_{n,n}(\mathbb{C})$ is said to be nilpotent if there exists $k > 0$ such that $\mathbf{B}^k = 0$.

Let $\mathbf{B} \in M_{q,q}(\mathbb{C})$ be a nilpotent matrix and let $k$ be such that $\mathbf{B}^k = 0$, but $\mathbf{B}^{k-1} \neq 0$. Fix a basis $(e_1, e_2, \dots, e_s)$ of $\ker \mathbf{B}$, and, for each $i = 1, \dots, s$, set $e_i^1 := e_i$ and define $e_i^2, e_i^3, \dots, e_i^{k_i}$ to solve the systems $\mathbf{B}e_i^j := e_i^{j-1}$ for $j = 2, 3, \dots$ as long as possible. Let $\{e_i^j\}$, $j = 1, \dots, k_i$, $i = 1, \dots, s$, be the family of vectors obtained this way.

2.63 Theorem (Canonical form of a nilpotent matrix). Let $\mathbf{B}$ be a $q \times q$ nilpotent matrix. Using the previous notation, $\{e_i^j\}$ is a basis of $\mathbb{C}^q$. Consequently, if we write $\mathbf{B}$ with respect to this basis, we get a $q \times q$ matrix $\mathbf{B}'$ similar to $\mathbf{B}$ of the form
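$$\mathbf{B}' = \begin{pmatrix} \mathbf{B}_1 & 0 & \dots & 0 \\ 0 & \mathbf{B}_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{B}_s \end{pmatrix} \tag{2.11}$$
where each block $\mathbf{B}_i$ has dimension $k_i$ and, if $k_i > 1$, it has the form
$$\mathbf{B}_i = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 & 0 \\ 0 & 0 & 1 & \dots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \dots & 0 & 1 \\ 0 & 0 & 0 & \dots & 0 & 0 \end{pmatrix}. \tag{2.12}$$
The reduced matrix $\mathbf{B}'$ is called the canonical Jordan form of the nilpotent matrix $\mathbf{B}$.

Proof. The kernels $H_j := \ker \mathbf{B}^j$ of $\mathbf{B}^j$, $j = 1, \dots, k$, form a strictly increasing sequence of subspaces
$$\{0\} = H_0 \subset H_1 \subset H_2 \subset \dots \subset H_{k-1} \subset H_k := \mathbb{C}^q.$$
The claim then follows by iteratively applying the following lemma. □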
2.64 Lemma. For $j = 1, 2, \dots, k - 1$, let $(e_1, e_2, \dots, e_p)$ be a basis of $H_j$ and let $x_1, x_2, \dots, x_r$ be all possible solutions of $\mathbf{B}x_j = e_j$, $j = 1, \dots, p$. Then $(e_1, e_2, \dots, e_p, x_1, x_2, \dots, x_r)$ is a basis for $H_{j+1}$.

Proof. In fact, it is easily seen that
o the vectors $e_1, e_2, \dots, e_p, x_1, x_2, \dots, x_r$ are linearly independent,
o $\{e_1, e_2, \dots, e_p, x_1, x_2, \dots, x_r\} \subset H_{j+1}$,
o the image of $H_{j+1}$ by $\mathbf{B}$ is contained in $H_j$.
Thus $r$, which is the number of elements $e_i$ in the image of $\mathbf{B}$, is the dimension of the image of $H_{j+1}$ by $\mathbf{B}$. The rank formula then yields
$$\dim H_{j+1} = \dim H_j + \dim\big(\operatorname{Im}\mathbf{B} \cap H_{j+1}\big) = p + r.$$ □
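Now consider a generic matrix $\mathbf{A} \in M_{n,n}(\mathbb{C})$. We first rewrite $\mathbf{A}$ using a basis of generalized eigenvectors to get a new matrix $\mathbf{A}'$ similar to $\mathbf{A}$ of the form
$$\mathbf{A}' = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix} \tag{2.13}$$
where each block $\mathbf{A}_i$ has the dimension of the algebraic multiplicity $m_i$ of the eigenvalue $\lambda_i$ and a unique eigenvalue $\lambda_i$. Moreover, the matrix $\mathbf{C}_i := \lambda_i\,\mathrm{Id} - \mathbf{A}_i$ is nilpotent, and precisely $\mathbf{C}_i^{k_i} = 0$ and $\mathbf{C}_i^{k_i - 1} \neq 0$ for a suitable integer $k_i \leq m_i$. Applying Theorem 2.63 to each $\mathbf{C}_i$, we then show that $\mathbf{A}_i$ is similar to $\lambda_i\,\mathrm{Id} + \mathbf{B}'$ where $\mathbf{B}'$ is as in (2.11). Therefore, we conclude the following.

2.65 Theorem (Jordan's canonical form). Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be all the distinct eigenvalues of $\mathbf{A} \in M_{n,n}(\mathbb{C})$. For every $i = 1, \dots, k$
(i) let $(u_{i,1}, \dots, u_{i,p_i})$ be a basis of the eigenspace $V_{\lambda_i}$ (as we know, $p_i \leq m_i$),
(ii) consider the generalized eigenvectors relative to $\lambda_i$ defined as follows: for any $j = 1, 2, \dots, p_i$,
a) set $e^1_{i,j} := u_{i,j}$,
b) set $e^\alpha_{i,j}$ to be a solution of
$$(\mathbf{A} - \lambda_i\,\mathrm{Id})\, e^\alpha_{i,j} = e^{\alpha-1}_{i,j}, \qquad \alpha = 2, 3, \dots \tag{2.14}$$
as long as the system (2.14) is solvable,
c) denote by $\alpha(i,j)$ the number of solved systems plus 1.
Then for every $i = 1, \dots, k$ the list $(e^\alpha_{i,j})$ with $j = 1, \dots, p_i$ and $\alpha = 1, \dots, \alpha(i,j)$ is a basis for the generalized eigenspace $W_i$ relative to $\lambda_i$. Hence the full list
$$(e^\alpha_{i,j}), \qquad i = 1, \dots, k,\ j = 1, \dots, p_i,\ \alpha = 1, \dots, \alpha(i,j) \tag{2.15}$$
is a basis of $\mathbb{C}^n$. By the definition of the $\{e^\alpha_{i,j}\}$, if we set
$$\mathbf{S} := \Big[ e^1_{1,1}\ \Big|\ e^2_{1,1}\ \Big|\ \dots\ \Big|\ e^1_{1,2}\ \Big|\ e^2_{1,2}\ \Big|\ \dots\ \Big|\ e^1_{2,1}\ \Big|\ e^2_{2,1}\ \Big|\ \dots \Big],$$
the matrix $\mathbf{J} := \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$, that represents $x \to \mathbf{A}x$ in the basis (2.15), has the form
$$\mathbf{J} = \begin{pmatrix} \mathbf{J}_{1,1} & 0 & \dots & 0 \\ 0 & \mathbf{J}_{1,2} & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{J}_{k,p_k} \end{pmatrix}$$
where for $i = 1, \dots, k$, $j = 1, \dots, p_i$, the block $\mathbf{J}_{i,j}$ has dimension $\alpha(i,j)$ and
$$\mathbf{J}_{i,j} = \begin{cases} (\lambda_i) & \text{if } \dim \mathbf{J}_{i,j} = 1, \\[2mm] \begin{pmatrix} \lambda_i & 1 & 0 & \dots & 0 \\ 0 & \lambda_i & 1 & \dots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_i & 1 \\ 0 & 0 & \dots & 0 & \lambda_i \end{pmatrix} & \text{otherwise.} \end{cases}$$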
A basis with the properties of the basis in (2.15) is called a Jordan basis^ and the matrix J that represents A in a Jordan basis is called a canonical Jordan form of A. 2.66 E x a m p l e . Find a canonical Jordan form of ^ 2 0 0 0 0 1 2 0 0 A = 0 1 2 0 0 0 1 3
Vl
0
0
A is lower triangular hence the eigenvalues of multiplictiy 2. We then have /o 0 1 0 A-2Id = 0 1 0 0 Vl 0
^ 0 0 0
1
3/
A are 2 with multiplicity 3 and 3 with
0 0 0 1 0
0 0 0 1 1
o\ 0 0 0 1/
A — 2Id has rank 4 since the columns of A of indices 1, 2, 3 and 5 are linearly independent. Therefore the eigenspace V2 has dimension 5 — 4 == 1 by the rank formula. We now compute a nonzero eigenvalue,
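$$(A - 2\,\mathrm{Id}) \begin{pmatrix} x \\ y \\ z \\ t \\ u \end{pmatrix} = \begin{pmatrix} 0 \\ x \\ y \\ z + t \\ x + t + u \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$
For instance, one eigenvector is $s_1 := (0, 0, 1, -1, 1)^T$. We now compute the Jordan basis relative to this eigenvalue. We have $e^1_{1,1} = s_1$ and it is possible to solve
$$\begin{pmatrix} 0 \\ x \\ y \\ z + t \\ x + t + u \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ -1 \\ 1 \end{pmatrix};$$
for instance, $s_2 := e^2_{1,1} = (0, 1, 0, -1, 2)^T$ is a solution. Hence we compute a solution of
$$\begin{pmatrix} 0 \\ x \\ y \\ z + t \\ x + t + u \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \\ 2 \end{pmatrix},$$
hence $s_3 := e^3_{1,1} = (1, 0, 0, -1, 2)^T$. Looking now at the other eigenvalue,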
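$$A - 3\,\mathrm{Id} = \begin{pmatrix} -1 & 0 & 0 & 0 & 0 \\ 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{pmatrix}$$
is of rank 4 since its columns of indices 1, 2, 3 and 4 are linearly independent. Thus by the rank formula, the eigenspace relative to the eigenvalue 3 has dimension 1. We now compute an eigenvector with eigenvalue 3. We need to solve
$$(A - 3\,\mathrm{Id}) \begin{pmatrix} x \\ y \\ z \\ t \\ u \end{pmatrix} = \begin{pmatrix} -x \\ x - y \\ y - z \\ z \\ x + t \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
and a nonzero solution is, for instance, $s_4 := (0, 0, 0, 0, 1)^T$. Finally, we compute Jordan's basis relative to this eigenvalue. A solution of
$$\begin{pmatrix} -x \\ x - y \\ y - z \\ z \\ x + t \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$
is given by $s_5 := e^2_{2,1} = (0, 0, 0, 1, 0)^T$. Thus, we conclude that the matrix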
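$$S = \big[\, s_1 \,\big|\, s_2 \,\big|\, s_3 \,\big|\, s_4 \,\big|\, s_5 \,\big] = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ -1 & -1 & -1 & 0 & 1 \\ 1 & 2 & 2 & 1 & 0 \end{pmatrix}$$
is nonsingular, since the columns are linearly independent, and by construction
$$S^{-1}AS = \begin{pmatrix} 2 & 1 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 3 \end{pmatrix}.$$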
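The computation of Example 2.66 can be checked mechanically. The following is a minimal sketch in plain Python (the helper name `matmul` and the variable names are illustrative assumptions, not part of the text): instead of inverting $S$, it verifies the equivalent identity $AS = SJ$ with exact integer arithmetic.

```python
# Matrix A of Example 2.66, listed row by row.
A = [[2, 0, 0, 0, 0],
     [1, 2, 0, 0, 0],
     [0, 1, 2, 0, 0],
     [0, 0, 1, 3, 0],
     [1, 0, 0, 1, 3]]

# Jordan basis vectors s1, ..., s5 found in the example.
columns = [(0, 0, 1, -1, 1),   # s1: eigenvector for eigenvalue 2
           (0, 1, 0, -1, 2),   # s2: (A - 2 Id) s2 = s1
           (1, 0, 0, -1, 2),   # s3: (A - 2 Id) s3 = s2
           (0, 0, 0, 0, 1),    # s4: eigenvector for eigenvalue 3
           (0, 0, 0, 1, 0)]    # s5: (A - 3 Id) s5 = s4
S = [list(row) for row in zip(*columns)]  # the s_i become the columns of S

# Claimed Jordan form.
J = [[2, 1, 0, 0, 0],
     [0, 2, 1, 0, 0],
     [0, 0, 2, 0, 0],
     [0, 0, 0, 3, 1],
     [0, 0, 0, 0, 3]]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# S^{-1} A S = J is equivalent to A S = S J when S is nonsingular.
jordan_identity_holds = matmul(A, S) == matmul(S, J)
```

Checking $AS = SJ$ avoids computing $S^{-1}$; together with the linear independence of the columns of $S$ it is equivalent to $S^{-1}AS = J$.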
e. Elementary divisors
As we have seen, the characteristic polynomial $\det(s\,\mathrm{Id} - A)$, $s \in \mathbb{K}$, is invariant by similarity transformations. However, in general the equality of two characteristic polynomials does not imply that the two matrices are similar.
2.67 Example. The unique eigenvalue of the matrix $A_\mu = \begin{pmatrix} \lambda_0 & 0 \\ \mu & \lambda_0 \end{pmatrix}$ is $\lambda_0$ and has multiplicity 2. The corresponding eigenspace is given by the solutions of the system
$$\begin{cases} 0 \cdot x^1 + 0 \cdot x^2 = 0, \\ \mu\, x^1 + 0 \cdot x^2 = 0. \end{cases}$$
If $\mu \neq 0$, then $V_{\lambda_0,\mu}$ has dimension 1. Notice that $A_0$ is diagonal, while $A_\mu$ is not diagonal. Moreover, $A_0$ and $A_\mu$ with $\mu \neq 0$ are not similar.
It would be interesting to find a complete set of invariants that characterizes the class of similarity of a matrix, without going explicitly into Jordan's reduction algorithm. Here we mention a few results in this direction. Let $A \in M_{n,n}(\mathbb{C})$. The determinants of the minors of order $k$ of the matrix $s\,\mathrm{Id} - A$ form a subset $\mathcal{D}_k$ of polynomials in the $s$ variable. Denote by $D_k(s)$ the g.c.d. of these polynomials whose coefficient of the maximal degree term is normalized to 1. Moreover set $D_0(s) := 1$. Using Laplace's formula, one sees that $D_{k-1}(s)$ divides $D_k(s)$ for all $k = 1, \dots, n$. The polynomials
$$\frac{D_k(s)}{D_{k-1}(s)}, \qquad k = 1, \dots, n,$$
are called the elementary divisors of $A$. They form a complete set of invariants that describe the complex similarity class of $A$. In fact, the following holds.
2.68 Theorem. The following claims are equivalent:
(i) $A$ and $B$ are similar as complex matrices,
(ii) $A$ and $B$ have the same Jordan's canonical form (up to permutations of rows and columns),
(iii) $A$ and $B$ have the same elementary divisors.
2.3 Exercises
2.69 ¶. Write a few $3 \times 3$ real matrices and interpret them as linear maps from $\mathbb{R}^3$ into $\mathbb{R}^3$. For each of these linear maps, choose a new basis of $\mathbb{R}^3$ and write the associated matrix with respect to the new basis both in the source and the target $\mathbb{R}^3$.
2.70 ¶. Let $V_1, V_2, \dots, V_n$ be finite-dimensional vector spaces, and let $f_0, f_1, \dots, f_n$ be linear maps such that
$$\{0\} \xrightarrow{f_0} V_1 \xrightarrow{f_1} V_2 \xrightarrow{f_2} \cdots \xrightarrow{f_{n-2}} V_{n-1} \xrightarrow{f_{n-1}} V_n \xrightarrow{f_n} \{0\}.$$
Show that, if $\mathrm{Im}(f_i) = \ker(f_{i+1})$ $\forall i = 0, \dots, n-1$, then $\sum_{i=1}^n (-1)^i \dim V_i = 0$.
2.71 ¶. Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$. Show that 1 and $\xi$ are linearly independent if and only if $\xi$ is irrational, $\xi \notin \mathbb{Q}$. Give reasons to support that $\mathbb{R}$ as a vector space over $\mathbb{Q}$ is not finite dimensional.
2.72 ¶ Lagrange multipliers. Let $X$, $Y$ and $Z$ be three vector spaces over $\mathbb{K}$ and let $f : X \to Y$, $g : X \to Z$ be two linear maps. Show that $\ker g \subset \ker f$ if and only if there exists a linear map $\ell : Z \to Y$ such that $f = \ell \circ g$.
2.73 ¶.
Show that the matrices
;:)• 0°)' have the same eigenvalues but are not similar.
2.74 ¶. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the eigenvalues of $A \in M_{n,n}(\mathbb{C})$, possibly repeated with their multiplicities. Show that $\mathrm{tr}\,A = \lambda_1 + \dots + \lambda_n$ and $\det A = \lambda_1 \cdot \lambda_2 \cdots \lambda_n$.
2.75 ¶. Show that $p(s) = s^n + a_{n-1}s^{n-1} + \dots + a_0$ is the characteristic polynomial of the $n \times n$ matrix
$$\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{n-1} \end{pmatrix}.$$
2.76 ¶. Let $A \in M_{k,k}(\mathbb{K})$, $B \in M_{n,n}(\mathbb{K})$, $C \in M_{k,n}(\mathbb{K})$. Compute the characteristic polynomial of the matrix
$$\begin{pmatrix} A & C \\ 0 & B \end{pmatrix}.$$
2.77 ¶. Let $\ell : \mathbb{C}^n \to \mathbb{C}^n$ be defined by $\ell(e_i) := e_{i+1}$ if $i = 1, \dots, n-1$ and $\ell(e_n) = e_1$, where $e_1, e_2, \dots, e_n$ is the canonical basis of $\mathbb{C}^n$. Show that the associated matrix $L$ is diagonalizable and that the eigenvalues are all distinct. [Hint: Compute the characteristic polynomial.]
2.78 ¶. Let $A \in M_{n,n}(\mathbb{R})$ and suppose $A^2 = \mathrm{Id}$. Show that $A$ is similar to
$$\begin{pmatrix} \mathrm{Id}_k & 0 \\ 0 & -\mathrm{Id}_{n-k} \end{pmatrix}$$
for some $k$, $0 \le k \le n$. [Hint: Consider the subspaces $V_+ := \{x \mid Ax = x\}$ and $V_- := \{x \mid Ax = -x\}$ and show that $V_+ \oplus V_- = \mathbb{R}^n$.]
2.79 ¶. Let $A, B \in M_{n,n}(\mathbb{R})$ be two matrices such that $A^2 = B^2 = \mathrm{Id}$ and $\mathrm{tr}\,A = \mathrm{tr}\,B$. Show that $A$ and $B$ are similar. [Hint: Use Exercise 2.78.]
2.80 ¶. Show that the diagonalizable matrices span $M_{n,n}(\mathbb{R})$. [Hint: Consider the matrices $M_{ij} = \mathrm{diag}(1, 2, \dots, n) + E_{ij}$ where $E_{ij}$ has value 1 at entry $(i,j)$ and value zero otherwise.]
2.81 ¶. Let $A, B \in M_{n,n}(\mathbb{R})$ and let $B$ be symmetric. Show that the polynomial $t \to \det(A + tB)$ has degree at most $\mathrm{Rank}\,B$.
2.82 ¶. Show that any linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ has an invariant subspace of dimension 1 or 2.
2.83 ¶ Fitting decomposition. Let $f : X \to X$ be a linear operator of a finite-dimensional vector space and set $f^k := f \circ \dots \circ f$ $k$-times. Show that there exists $k \ge 1$ such that
(i) $\ker(f^k) = \ker(f^{k+1})$,
(ii) $\mathrm{Im}(f^k) = \mathrm{Im}(f^{k+1})$,
(iii) $f|_{\mathrm{Im}(f^k)} : \mathrm{Im}(f^k) \to \mathrm{Im}(f^k)$ is an isomorphism,
(iv) $f(\ker f^k) \subset \ker(f^k)$,
(v) $f|_{\ker(f^k)} : \ker(f^k) \to \ker(f^k)$ is nilpotent,
(vi) $X = \ker(f^k) \oplus \mathrm{Im}(f^k)$.
2.84 ¶. $A$ is nilpotent if and only if all its eigenvalues are zero.
2.85 ¶. Consider the linear operators in the linear space of polynomials
$$A(P)(t) := P'(t), \qquad B(P)(t) := tP(t).$$
Compute the operator $AB - BA$.
2.86 ¶. Let $A, B$ be linear operators on $\mathbb{R}^n$. Show that
(i) $\mathrm{tr}(AB) = \mathrm{tr}(BA)$,
(ii) $AB - BA \neq \mathrm{Id}$.
2.87 ¶. Show that a linear operator $C : \mathbb{R}^n \to \mathbb{R}^n$ can be written as $C = AB - BA$ where $A, B : \mathbb{R}^n \to \mathbb{R}^n$ are linear operators if and only if $\mathrm{tr}\,C = 0$.
2.88 ¶. Show that the Jordan canonical form of the matrix
$$\begin{pmatrix} a & a^1_2 & a^1_3 & \cdots & a^1_n \\ 0 & a & a^2_3 & \cdots & a^2_n \\ 0 & 0 & a & \cdots & a^3_n \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a \end{pmatrix}$$
with $a^1_2\, a^2_3 \cdots a^{n-1}_n \neq 0$ is
$$\begin{pmatrix} a & 1 & 0 & \cdots & 0 \\ 0 & a & 1 & \cdots & 0 \\ 0 & 0 & a & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a \end{pmatrix}.$$
3. Euclidean and Hermitian Spaces

3.1 The Geometry of Euclidean and Hermitian Spaces
Until now we have introduced several different languages (linear independence, matrices and products, linear maps) that are connected in several ways to linear systems, and stated some results. The structure we used is essentially linearity. A new structure, the inner product, provides a richer framework that we shall illustrate in this chapter.
a. Euclidean spaces
3.1 Definition. Let $X$ be a real vector space. An inner product on $X$ is a map $(\ |\ ) : X \times X \to \mathbb{R}$ which is
o (BILINEAR) $(x, y) \to (x|y)$ is linear in each factor, i.e.,
$$(\lambda x + \mu y|z) = \lambda(x|z) + \mu(y|z), \qquad (x|\lambda y + \mu z) = \lambda(x|y) + \mu(x|z),$$
for all $x, y, z \in X$, for all $\lambda, \mu \in \mathbb{R}$.
o (SYMMETRIC) $(x|y) = (y|x)$ for all $x, y \in X$.
o (POSITIVE DEFINITE) $(x|x) \ge 0$ $\forall x$ and $(x|x) = 0$ if and only if $x = 0$.
The nonnegative real number $|x| := \sqrt{(x|x)}$ is called the norm of $x \in X$. A finite-dimensional vector space $X$ with an inner product is called a Euclidean vector space, and the inner product of $X$ is called the scalar product of $X$.
3.2 Example. The map $(\ |\ ) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ defined by
$$(x|y) := x \bullet y = \sum_{i=1}^n x^i y^i, \qquad x := (x^1, x^2, \dots, x^n),\ y := (y^1, y^2, \dots, y^n)$$
is an inner product on $\mathbb{R}^n$, called the standard scalar product of $\mathbb{R}^n$, and $\mathbb{R}^n$ with this scalar product is a Euclidean space. In some sense, as we shall see later, see Proposition 3.25, it is the unique Euclidean space of dimension $n$. Other examples of inner products on $\mathbb{R}^n$ can be obtained by weighing the coordinates by positive real numbers. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be positive real numbers. Then
$$(x|y) := \sum_{i=1}^n \lambda_i x^i y^i, \qquad x = (x^1, x^2, \dots, x^n),\ y = (y^1, y^2, \dots, y^n)$$
is an inner product on $\mathbb{R}^n$. Other examples of inner products in infinite-dimensional vector spaces can be found in Chapter 10.
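As an illustration of Example 3.2, here is a small sketch in plain Python of the weighted inner product $(x|y) = \sum_i \lambda_i x^i y^i$. The function name `weighted_inner` and the sample vectors are assumptions made for this sketch only.

```python
def weighted_inner(x, y, weights):
    """Return sum_i weights[i] * x[i] * y[i].

    The weights must be strictly positive, otherwise the form
    is not positive definite and hence not an inner product.
    """
    if any(w <= 0 for w in weights):
        raise ValueError("all weights must be positive")
    return sum(w * a * b for w, a, b in zip(weights, x, y))

x = (1, 2, 3)
y = (4, -5, 6)
lam = (1, 2, 3)  # the positive numbers lambda_1, ..., lambda_n

standard = weighted_inner(x, y, (1, 1, 1))   # standard scalar product x . y
weighted = weighted_inner(x, y, lam)         # weighted product
norm_squared = weighted_inner(x, x, lam)     # |x|^2 in the weighted product
```

With all weights equal to 1 the function reduces to the standard scalar product; a zero weight would make $(x|x) = 0$ for some $x \neq 0$, which is why the sketch rejects it.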
Let $X$ be a vector space with an inner product. From the bilinearity of the inner product we deduce that
$$|x + y|^2 = (x + y|x + y) = (x|x + y) + (y|x + y) = (x|x) + 2(x|y) + (y|y) = |x|^2 + 2(x|y) + |y|^2 \qquad (3.1)$$
from which we infer the following.
3.3 Theorem. The following hold.
(i) (PARALLELOGRAM IDENTITY) We have
$$|x + y|^2 + |x - y|^2 = 2\,(|x|^2 + |y|^2) \qquad \forall x, y \in X.$$
(ii) (POLARITY FORMULA) We have
$$(x|y) = \frac{1}{4}\,(|x + y|^2 - |x - y|^2) \qquad \forall x, y \in X,$$
hence we can get the scalar product of $x$ and $y$ by computing two norms.
(iii) (CAUCHY-SCHWARZ INEQUALITY) The following inequality holds
$$|(x|y)| \le |x|\,|y| \qquad \forall x, y \in X;$$
moreover, $(x|y) = |x|\,|y|$ if and only if either $y = 0$ or $x = \lambda y$ for some $\lambda \in \mathbb{R}$, $\lambda \ge 0$.
Proof. (i), (ii) follow trivially from (3.1). Let us prove (iii). If $y = 0$, the claim is trivial. If $y \neq 0$, the function $t \to |x + ty|^2$, $t \in \mathbb{R}$, is a second order nonnegative polynomial since
$$0 \le |x + ty|^2 = (x + ty|x + ty) = (x + ty|x) + (x + ty|ty) = |x|^2 + 2(x|y)\,t + |y|^2 t^2;$$
hence its discriminant is nonpositive, thus $((x|y))^2 - |x|^2|y|^2 \le 0$. If $(x|y) = |x|\,|y|$, then the discriminant of $t \to |x + ty|^2$ vanishes. If $y \neq 0$, then for some $t \in \mathbb{R}$ we have $|x + ty|^2 = 0$, i.e., $x = -ty$. Finally, $-t$ is nonnegative since $-t(y|y) = (x|y) = |x|\,|y| \ge 0$. □
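The three statements of Theorem 3.3 can be spot-checked on concrete vectors. The sketch below uses the standard scalar product of $\mathbb{R}^3$ with integer entries, so every comparison is exact; the helper names and sample vectors are illustrative assumptions, and a check on one pair of vectors is of course not a proof.

```python
def dot(x, y):
    """Standard scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

def norm2(x):
    """Squared norm |x|^2 = (x|x)."""
    return dot(x, x)

x = (3, -1, 2)
y = (1, 4, -2)

# (i) parallelogram identity
parallelogram_ok = norm2(add(x, y)) + norm2(sub(x, y)) == 2 * (norm2(x) + norm2(y))
# (ii) polarity formula, multiplied by 4 to stay in the integers
polarity_ok = 4 * dot(x, y) == norm2(add(x, y)) - norm2(sub(x, y))
# (iii) Cauchy-Schwarz inequality, in squared form
cauchy_schwarz_ok = dot(x, y) ** 2 <= norm2(x) * norm2(y)
```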
3.4 Definition. Let $X$ be a vector space with an inner product. Two vectors $x, y \in X$ are said to be orthogonal, and we write $x \perp y$, if $(x|y) = 0$.
From (3.1) we immediately infer the following.
3.5 Proposition (Pythagorean theorem). Let $X$ be a vector space with an inner product. Then two vectors $x, y \in X$ are orthogonal if and only if
$$|x + y|^2 = |x|^2 + |y|^2.$$
3.6 Carnot's formula. Let $x, y \in \mathbb{R}^2$ be two nonzero vectors of $\mathbb{R}^2$, that we think of as the plane of Euclidean geometry with an orthogonal Cartesian reference. Setting $x := (a, b)$, $y := (c, d)$, and denoting by $\theta$ the angle between $Ox$ and $Oy$, it is easy to see that $|x|$, $|y|$ are the lengths of the two segments $Ox$ and $Oy$, and that $x \bullet y := ac + bd = |x|\,|y| \cos\theta$. Thus (3.1) reads as Carnot's formula
$$|x + y|^2 = |x|^2 + |y|^2 + 2|x|\,|y| \cos\theta.$$
In general, given two nonzero vectors $x, y \in \mathbb{R}^n$, we have by the Cauchy-Schwarz inequality $|x \bullet y| \le |x|\,|y|$, hence there exists a $\theta \in \mathbb{R}$ such that
$$\frac{x \bullet y}{|x|\,|y|} =: \cos\theta.$$
$\theta$ is called the angle between $x$ and $y$ and denoted by $\widehat{xy}$. In this way (3.1) rewrites as Carnot's formula
$$|x + y|^2 = |x|^2 + |y|^2 + 2|x|\,|y| \cos\theta.$$
Notice that the angle $\theta$ is defined up to the sign, since $\cos\theta$ is an even function.
3.7 Proposition. Let $X$ be a Euclidean vector space and let $(\ |\ )$ be its inner product. The norm of $x \in X$,
$$|x| := \sqrt{(x|x)},$$
is a function $|\ | : X \to \mathbb{R}$ with the following properties
(i) $|x| \in \mathbb{R}_+$ $\forall x \in X$.
(ii) (NONDEGENERACY) $|x| = 0$ if and only if $x = 0$.
(iii) (1-HOMOGENEITY) $|\lambda x| = |\lambda|\,|x|$ $\forall \lambda \in \mathbb{R}$, $\forall x \in X$.
(iv) (TRIANGULAR INEQUALITY) $|x + y| \le |x| + |y|$ $\forall x, y \in X$.
Proof. (i), (ii), (iii) are trivial. (iv) follows from the Cauchy-Schwarz inequality since
$$|x + y|^2 = |x|^2 + |y|^2 + 2(y|x) \le |x|^2 + |y|^2 + 2|(y|x)| \le |x|^2 + |y|^2 + 2|x|\,|y| = (|x| + |y|)^2.$$
□
Finally, we call the distance between $x$ and $y \in X$ the number $d(x, y) := |x - y|$. It is trivial to check, using Proposition 3.7, that the distance function $d : X \times X \to \mathbb{R}$ defined by $d(x, y) := |x - y|$ has the following properties
(i) (NONDEGENERACY) $d(x, y) \ge 0$ $\forall x, y \in X$ and $d(x, y) = 0$ if and only if $x = y$.
(ii) (SYMMETRY) $d(x, y) = d(y, x)$ $\forall x, y \in X$.
(iii) (TRIANGULAR INEQUALITY) $d(x, y) \le d(x, z) + d(z, y)$ $\forall x, y, z \in X$.
We refer to $d$ as the distance in $X$ induced by the inner product.
3.8 Inner products in coordinates. Let $X$ be a Euclidean space, denote by $(\ |\ )$ its inner product, and let $(e_1, e_2, \dots, e_n)$ be a basis of $X$. If $x = \sum_{i=1}^n x^i e_i$, $y = \sum_{i=1}^n y^i e_i \in X$, then by linearity
$$(x|y) = \sum_{i,j=1}^n x^i y^j (e_i|e_j).$$
The matrix
$$G = [g_{ij}], \qquad g_{ij} = (e_i|e_j)$$
is called the Gram matrix of the scalar product in the basis $(e_1, e_2, \dots, e_n)$. Introducing the coordinate column vectors $\mathbf{x} = (x^1, x^2, \dots, x^n)^T$ and $\mathbf{y} = (y^1, y^2, \dots, y^n)^T \in \mathbb{R}^n$ and denoting by $\bullet$ the standard scalar product in $\mathbb{R}^n$, we have
$$(x|y) = \mathbf{x} \bullet G\mathbf{y} = \mathbf{x}^T G \mathbf{y}$$
rows by columns. We notice that
(i) $G$ is symmetric, $G^T = G$, since the scalar product is symmetric,
(ii) $G$ is positive definite, i.e., $\mathbf{x}^T G \mathbf{x} \ge 0$ $\forall \mathbf{x} \in \mathbb{R}^n$ and $\mathbf{x}^T G \mathbf{x} = 0$ if and only if $\mathbf{x} = 0$; in particular, $G$ is invertible.
b. Hermitian spaces
A similar structure exists on complex vector spaces.
3.9 Definition. Let $X$ be a vector space over $\mathbb{C}$. A Hermitian product on $X$ is a map $(\ |\ ) : X \times X \to \mathbb{C}$ which is
(i) (SESQUILINEAR), i.e.,
$$(\alpha v + \beta w|z) = \alpha(v|z) + \beta(w|z), \qquad (v|\alpha w + \beta z) = \overline{\alpha}(v|w) + \overline{\beta}(v|z)$$
$\forall v, w, z \in X$, $\forall \alpha, \beta \in \mathbb{C}$.
(ii) (HERMITIAN) $(z|w) = \overline{(w|z)}$ $\forall w, z \in X$, in particular $(z|z) \in \mathbb{R}$.
(iii) (POSITIVE DEFINITE) $(z|z) \ge 0$ and $(z|z) = 0$ if and only if $z = 0$.
The nonnegative real number $|z| := \sqrt{(z|z)}$ is called the norm of $z \in X$.
3.10 Definition. A finite-dimensional complex space with a Hermitian product is called a Hermitian space.
3.11 Example. Of course the product $(z, w) \to (z|w) := \overline{w} z$ is a Hermitian product on $\mathbb{C}$. More generally, the map $(\ |\ ) : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ defined by
$$(z|w) := z \bullet w := \sum_{j=1}^n z^j\, \overline{w^j} \qquad \forall z = (z^1, z^2, \dots, z^n),\ w = (w^1, w^2, \dots, w^n)$$
is a Hermitian product on $\mathbb{C}^n$, called the standard Hermitian product of $\mathbb{C}^n$. As we shall see later, see Proposition 3.25, $\mathbb{C}^n$ equipped with the standard Hermitian product is in a sense the only Hermitian space of dimension $n$.
Let $X$ be a complex vector space with a Hermitian product $(\ |\ )$. From the properties of the Hermitian product we deduce
$$|z + w|^2 = (z + w|z + w) = (z|z + w) + (w|z + w) = (z|z) + (z|w) + (w|z) + (w|w) = |z|^2 + |w|^2 + 2\Re(z|w) \qquad (3.2)$$
from which we infer at once the following.
3.12 Theorem.
(i) We have (3.2) for all $z, w \in X$.
(ii) (PARALLELOGRAM IDENTITY) We have
$$|z + w|^2 + |z - w|^2 = 2\,(|z|^2 + |w|^2) \qquad \forall z, w \in X.$$
(iii) (POLARITY FORMULA) We have
$$4(z|w) = \big(|z + w|^2 - |z - w|^2\big) + i\,\big(|z + iw|^2 - |z - iw|^2\big)$$
for all $z, w \in X$. We therefore can compute the Hermitian product of $z$ and $w$ by computing four norms.
(iv) (CAUCHY-SCHWARZ INEQUALITY) The following inequality holds
$$|(z|w)| \le |z|\,|w| \qquad \forall z, w \in X;$$
moreover $(z|w) = |z|\,|w|$ if and only if either $w = 0$, or $z = \lambda w$ for some $\lambda \in \mathbb{R}$, $\lambda \ge 0$.
Proof. (i), (ii), (iii) follow trivially from (3.2). Let us prove (iv). Let $z, w \in X$ and $\lambda = te^{i\theta}$, $t, \theta \in \mathbb{R}$. From (3.2)
$$0 \le |z + \lambda w|^2 = t^2|w|^2 + |z|^2 + 2t\,\Re\big(e^{-i\theta}(z|w)\big) \qquad \forall t \in \mathbb{R},$$
hence its discriminant is nonpositive, thus
$$\big|\Re\big(e^{-i\theta}(z|w)\big)\big| \le |z|\,|w|.$$
Since $\theta$ is arbitrary, we conclude $|(z|w)| \le |z|\,|w|$. The second part of the claim then follows as in the real case. If $(z|w) = |z|\,|w|$, then the discriminant of the real polynomial $t \to |z + tw|^2$, $t \in \mathbb{R}$, vanishes. If $w \neq 0$, for some $t \in \mathbb{R}$ we have $|z + tw|^2 = 0$, i.e., $z = -tw$. Finally, $-t$ is nonnegative since $-t(w|w) = (z|w) = |z|\,|w| \ge 0$. □
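The complex polarity formula of Theorem 3.12 can be spot-checked for the standard Hermitian product of $\mathbb{C}^2$, which is linear in the first argument and antilinear in the second. The following sketch relies on Python's built-in complex numbers; the helper names `herm`, `norm2`, `comb` and the sample vectors are assumptions of this sketch. The entries are small Gaussian integers, so the floating-point arithmetic involved is exact.

```python
def herm(z, w):
    """Standard Hermitian product: sum_j z_j * conj(w_j)."""
    return sum(a * b.conjugate() for a, b in zip(z, w))

def norm2(z):
    """Squared norm |z|^2 = (z|z), a real number."""
    return herm(z, z).real

def comb(z, w, c):
    """Return the vector z + c*w."""
    return tuple(a + c * b for a, b in zip(z, w))

z = (1 + 2j, 3 - 1j)
w = (2 - 1j, 0 + 1j)

lhs = 4 * herm(z, w)
rhs = (norm2(comb(z, w, 1)) - norm2(comb(z, w, -1))) \
    + 1j * (norm2(comb(z, w, 1j)) - norm2(comb(z, w, -1j)))

# Cauchy-Schwarz in squared form: |(z|w)|^2 <= |z|^2 |w|^2
cauchy_schwarz_ok = abs(herm(z, w)) ** 2 <= norm2(z) * norm2(w)
```

The real part of $(z|w)$ is recovered from $|z \pm w|^2$ and the imaginary part from $|z \pm iw|^2$, which is why four norms are needed instead of the two of the real case.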
3.13 ¶. Let $X$ be a complex vector space with a Hermitian product and let $z, w \in X$. Show that $|(z|w)| = |z|\,|w|$ if and only if either $w = 0$ or there exists $\lambda \in \mathbb{C}$ such that $z = \lambda w$.
3.14 Definition. Let $X$ be a complex vector space with a Hermitian product $(\ |\ )$. Two vectors $z, w \in X$ are said to be orthogonal, and we write $z \perp w$, if $(z|w) = 0$.
From (3.2) we immediately infer the following.
3.15 Proposition (Pythagorean theorem). Let $X$ be a complex vector space with a Hermitian product $(\ |\ )$. If $z, w \in X$ are orthogonal, then
$$|z + w|^2 = |z|^2 + |w|^2.$$
We see here a difference between the real and the complex cases. Contrary to the real case, two complex vectors such that $|z + w|^2 = |z|^2 + |w|^2$ holds need not be orthogonal. For instance, choose $X := \mathbb{C}$, $(z|w) := \overline{w} z$, and let $z = 1$ and $w = i$.
3.16 Proposition. Let $X$ be a complex vector space with a Hermitian product on it. The norm of $z \in X$,
$$|z| := \sqrt{(z|z)},$$
is a real-valued function $|\ | : X \to \mathbb{R}$ with the following properties
(i) $|z| \in \mathbb{R}_+$ $\forall z \in X$.
(ii) (NONDEGENERACY) $|z| = 0$ if and only if $z = 0$.
(iii) (1-HOMOGENEITY) $|\lambda z| = |\lambda|\,|z|$ $\forall \lambda \in \mathbb{C}$, $\forall z \in X$.
(iv) (TRIANGULAR INEQUALITY) $|z + w| \le |z| + |w|$ $\forall z, w \in X$.
Proof. (i), (ii), (iii) are trivial. (iv) follows from the Cauchy-Schwarz inequality since
$$|z + w|^2 = |z|^2 + |w|^2 + 2\Re(z|w) \le |z|^2 + |w|^2 + 2|(z|w)| \le |z|^2 + |w|^2 + 2|z|\,|w| = (|z| + |w|)^2.$$
□
Finally, we call distance between two points $z, w$ of $X$ the real number $d(z, w) := |z - w|$. It is trivial to check, using Proposition 3.16, that the distance function $d : X \times X \to \mathbb{R}$ defined by $d(z, w) := |z - w|$ has the following properties
(i) (NONDEGENERACY) $d(z, w) \ge 0$ $\forall z, w \in X$ and $d(z, w) = 0$ if and only if $z = w$.
(ii) (SYMMETRY) $d(z, w) = d(w, z)$ $\forall z, w \in X$.
(iii) (TRIANGULAR INEQUALITY) $d(z, w) \le d(z, x) + d(x, w)$ $\forall w, x, z \in X$.
We refer to $d$ as the distance on $X$ induced by the Hermitian product.
3.17 Hermitian products in coordinates. If $X$ is a Hermitian space, the Gram matrix associated to the Hermitian product is defined by setting
$$G = [g_{ij}], \qquad g_{ij} := (e_i|e_j).$$
Using linearity
$$(z|w) = \sum_{i,j=1}^n (e_i|e_j)\, z^i\, \overline{w^j} = \mathbf{z}^T G\, \overline{\mathbf{w}}$$
if $\mathbf{z} = (z^1, z^2, \dots, z^n)^T$, $\mathbf{w} = (w^1, w^2, \dots, w^n)^T \in \mathbb{C}^n$ are the coordinate vector columns of $z$ and $w$ in the basis $(e_1, e_2, \dots, e_n)$. Notice that
(i) $G$ is a Hermitian matrix, $\overline{G}^T = G$,
(ii) $G$ is positive definite, i.e., $\mathbf{z}^T G\, \overline{\mathbf{z}} \ge 0$ $\forall \mathbf{z} \in \mathbb{C}^n$ and $\mathbf{z}^T G\, \overline{\mathbf{z}} = 0$ if and only if $\mathbf{z} = 0$; in particular, $G$ is invertible.
c. Orthonormal basis and the Gram-Schmidt algorithm
3.18 Definition. Let $X$ be a Euclidean space with scalar product $(\ |\ )$ or a Hermitian vector space with Hermitian product $(\ |\ )$. A system of vectors $\{e_\alpha\}_{\alpha \in A} \subset X$ is called orthonormal if
$$(e_\alpha|e_\beta) = \delta_{\alpha\beta} \qquad \forall \alpha, \beta \in A.$$
Orthonormal vectors are linearly independent. In particular, $n$ orthonormal vectors in a Euclidean or Hermitian vector space of dimension $n$ form a basis, called an orthonormal basis.
3.19 Example. The canonical basis $(e_1, e_2, \dots, e_n)$ of $\mathbb{R}^n$ is an orthonormal basis for the standard inner product in $\mathbb{R}^n$. Similarly, the canonical basis $(e_1, e_2, \dots, e_n)$ of $\mathbb{C}^n$ is an orthonormal basis for the standard Hermitian product in $\mathbb{C}^n$.
3.20 ¶. Let $(\ |\ )$ be an inner (Hermitian) product on a Euclidean (Hermitian) space $X$ of dimension $n$ and let $G$ be the associated Gram matrix in a basis $(e_1, e_2, \dots, e_n)$. Show that $G = \mathrm{Id}_n$ if and only if $(e_1, e_2, \dots, e_n)$ is orthonormal.
Starting from a denumerable system of linearly independent vectors, we can construct a new denumerable system of orthonormal vectors that span the same subspaces by means of the Gram-Schmidt algorithm.
3.21 Theorem (Gram-Schmidt). Let $X$ be a real (complex) vector space with inner (Hermitian) product $(\ |\ )$. Let $v_1, v_2, \dots, v_k, \dots$ be a denumerable set of linearly independent vectors in $X$. Then there exists a set of orthonormal vectors $w_1, w_2, \dots, w_k, \dots$ such that for each $k = 1, 2, \dots$
$$\mathrm{Span}\{w_1, w_2, \dots, w_k\} = \mathrm{Span}\{v_1, v_2, \dots, v_k\}.$$
Proof. We proceed by induction. In fact, the algorithm
$$w_1' = v_1, \qquad w_1 := \frac{w_1'}{|w_1'|},$$
$$w_p' = v_p - \sum_{j=1}^{p-1} (v_p|w_j)\, w_j, \qquad w_p := \frac{w_p'}{|w_p'|}, \qquad p \ge 2,$$
never stops since $w_p' \neq 0$ $\forall p = 1, 2, 3, \dots$ and produces the claimed orthonormal basis. □
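The algorithm in the proof translates directly into code. Below is a minimal sketch for the standard scalar product of $\mathbb{R}^n$ using floating-point numbers; the function name `gram_schmidt`, the tolerance `1e-12`, and the sample vectors are assumptions of this sketch, and the classical (not the numerically stabilized) variant is used because it mirrors the formula $w_p' = v_p - \sum_{j<p} (v_p|w_j) w_j$.

```python
import math

def dot(x, y):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors, in order.

    For each v_p, subtract its components along the already built
    w_1, ..., w_{p-1}, then normalize the remainder w_p'.
    """
    ortho = []
    for v in vectors:
        w = list(v)
        for u in ortho:
            coeff = dot(v, u)  # (v_p | w_j), computed from the original v_p
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        length = math.sqrt(dot(w, w))
        if length < 1e-12:     # w_p' = 0 would mean linear dependence
            raise ValueError("vectors are not linearly independent")
        ortho.append([wi / length for wi in w])
    return ortho

basis = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
```

For ill-conditioned inputs the modified Gram-Schmidt variant, which projects the running remainder rather than the original $v_p$, is usually preferred in floating-point arithmetic; in exact arithmetic the two coincide.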
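3.22 Proposition (Pythagorean theorem). Let $X$ be a real (complex) vector space with inner (Hermitian) product $(\ |\ )$. Let $(e_1, e_2, \dots, e_n)$ be an orthonormal basis of $X$. Then
$$x = \sum_{i=1}^n (x|e_i)\, e_i \qquad \forall x \in X,$$
that is, the $i$th coordinate of $x$ in the basis $(e_1, e_2, \dots, e_n)$ is the cosine director $(x|e_i)$ of $x$ with respect to $e_i$. Therefore we compute
$$(x|y) = \sum_{i=1}^n (x|e_i)\,(y|e_i) \qquad \text{if } X \text{ is Euclidean},$$
$$(x|y) = \sum_{i=1}^n (x|e_i)\,\overline{(y|e_i)} \qquad \text{if } X \text{ is Hermitian},$$
so that in both cases Pythagoras's theorem holds:
$$|x|^2 = (x|x) = \sum_{i=1}^n |(x|e_i)|^2.$$
Proof. In fact, by linearity, for $j = 1, \dots, n$ and $x = \sum_{i=1}^n x^i e_i$ we have
$$(x|e_j) = \Big(\sum_{i=1}^n x^i e_i \,\Big|\, e_j\Big) = \sum_{i=1}^n x^i (e_i|e_j) = \sum_{i=1}^n x^i \delta_{ij} = x^j.$$
Similarly, using linearity and assuming $X$ is Hermitian, we have
$$(x|y) = \Big(\sum_{i=1}^n x^i e_i \,\Big|\, \sum_{j=1}^n y^j e_j\Big) = \sum_{i,j=1}^n x^i\, \overline{y^j}\, (e_i|e_j) = \sum_{i=1}^n x^i\, \overline{y^i},$$
hence, by the first part,
$$(x|y) = \sum_{i=1}^n (x|e_i)\,\overline{(y|e_i)}.$$
□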
d. Isometries
3.23 Definition. Let $X, Y$ be two real (complex) vector spaces with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$. We say that a linear map $A : X \to Y$ is an isometry if and only if
$$|A(x)|_Y = |x|_X \qquad \forall x \in X,$$
or, equivalently, compare the polarity formula, if
$$(A(x)|A(y))_Y = (x|y)_X \qquad \forall x, y \in X.$$
Isometries are trivially injective, but not necessarily surjective. If there exists a surjective isometry between two Euclidean (Hermitian) spaces $X$ and $Y$, then $X$ and $Y$ are said to be isometric.
3.24 ¶. Let $X, Y$ be two real (complex) vector spaces with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$ and let $A : X \to Y$ be a linear map. Show that the following claims are equivalent:
(i) $A$ is an isometry,
(ii) $B \subset X$ is an orthonormal basis if and only if $A(B)$ is an orthonormal basis for $A(X)$.
Let $X$ be a real vector space with inner product $(\ |\ )$ or a complex vector space with Hermitian product $(\ |\ )$. Let $(e_1, e_2, \dots, e_n)$ be a basis in $X$ and let $\ell : X \to \mathbb{K}^n$ ($\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$) be the corresponding system of coordinates. Proposition 3.22 implies that the following claims are equivalent.
(i) $(e_1, e_2, \dots, e_n)$ is an orthonormal basis,
(ii) $\ell(x) = ((x|e_1), \dots, (x|e_n))$,
(iii) $\ell$ is an isometry between $X$ and the Euclidean space $\mathbb{R}^n$ with the standard scalar product (or $\mathbb{C}^n$ with the standard Hermitian product).
In this way, the Gram-Schmidt algorithm yields the following.
3.25 Proposition. Let $X$ be a real vector space with inner product $(\ |\ )$ (or a complex vector space with Hermitian product $(\ |\ )$) of dimension $n$. Then $X$ is isometric to $\mathbb{R}^n$ with the standard scalar product (respectively, to $\mathbb{C}^n$ with the standard Hermitian product), the isometry being the coordinate system associated to an orthonormal basis.
In other words, using an orthonormal basis on $X$ is the same as identifying $X$ with $\mathbb{R}^n$ (or with $\mathbb{C}^n$) with the canonical inner (Hermitian) product.
3.26 Isometries in coordinates. Let us compute the matrix associated to an isometry $R : X \to Y$ between two Euclidean (Hermitian) spaces of dimension $n$ and $m$ respectively, in orthonormal bases (so that $X$ and $Y$ are respectively isometric to $\mathbb{R}^n$ ($\mathbb{C}^n$) and $\mathbb{R}^m$ ($\mathbb{C}^m$) by means of the associated coordinate systems). It is therefore sufficient to discuss real isometries $R : \mathbb{R}^n \to \mathbb{R}^m$ and complex isometries $R : \mathbb{C}^n \to \mathbb{C}^m$.
Let $R : \mathbb{R}^n \to \mathbb{R}^m$ be linear and let $\mathbf{R} \in M_{m,n}(\mathbb{R})$ be the associated matrix, $R(x) = \mathbf{R}x$, $x \in \mathbb{R}^n$. Denoting by $(e_1, e_2, \dots, e_n)$ the canonical basis of $\mathbb{R}^n$,
$$\mathbf{R} = \big[\, r_1 \,\big|\, r_2 \,\big|\, \dots \,\big|\, r_n \,\big], \qquad r_i = \mathbf{R}e_i\ \ \forall i.$$
Since $(e_1, e_2, \dots, e_n)$ is orthonormal, $R$ is an isometry if and only if $(r_1, r_2, \dots, r_n)$ are orthonormal. In particular, $m \ge n$ and
$$r_i^T r_j = r_i \bullet r_j = \delta_{ij},$$
i.e., the matrix $\mathbf{R}$ is an orthogonal matrix, $\mathbf{R}^T\mathbf{R} = \mathrm{Id}_n$. When $m = n$, the isometries $R : \mathbb{R}^n \to \mathbb{R}^n$ are necessarily surjective, being injective, and form a group under composition. As above, we deduce that the group of isometries of $\mathbb{R}^n$ is isomorphic to the orthogonal group $O(n)$ defined by
$$O(n) := \{\mathbf{R} \in M_{n,n}(\mathbb{R}) \mid \mathbf{R}^T\mathbf{R} = \mathrm{Id}_n\}.$$
Observe that a square orthogonal matrix $\mathbf{R}$ is invertible with $\mathbf{R}^{-1} = \mathbf{R}^T$. It follows that $\mathbf{R}\mathbf{R}^T = \mathrm{Id}$ and $|\det \mathbf{R}| = 1$.
Similarly, consider $\mathbb{C}^n$ as a Hermitian space with the standard Hermitian product. Let $R : \mathbb{C}^n \to \mathbb{C}^m$ be linear and let $\mathbf{R} \in M_{m,n}(\mathbb{C})$ be such that $R(z) = \mathbf{R}z$. Denoting by $(e_1, e_2, \dots, e_n)$ the canonical basis of $\mathbb{C}^n$,
$$\mathbf{R} = \big[\, r_1 \,\big|\, r_2 \,\big|\, \dots \,\big|\, r_n \,\big], \qquad r_i = \mathbf{R}e_i\ \ \forall i = 1, \dots, n.$$
Since $(e_1, e_2, \dots, e_n)$ is orthonormal, $R$ is an isometry if and only if $r_1, r_2, \dots, r_n$ are orthonormal. In particular, $m \ge n$ and
$$\overline{r_j}^T r_i = r_i \bullet r_j = \delta_{ij},$$
i.e., the matrix $\mathbf{R}$ is a unitary matrix, $\overline{\mathbf{R}}^T\mathbf{R} = \mathrm{Id}_n$.
When $m = n$, the isometries $R : \mathbb{C}^n \to \mathbb{C}^n$ are necessarily surjective, being injective; moreover they form a group under composition. From the above, we deduce that the group of isometries of $\mathbb{C}^n$ is isomorphic to the unitary group $U(n)$ defined by
$$U(n) := \{\mathbf{R} \in M_{n,n}(\mathbb{C}) \mid \overline{\mathbf{R}}^T\mathbf{R} = \mathrm{Id}_n\}.$$
Observe that a square unitary matrix $\mathbf{R}$ is invertible with $\mathbf{R}^{-1} = \overline{\mathbf{R}}^T$. It follows that $\mathbf{R}\overline{\mathbf{R}}^T = \mathrm{Id}$ and $|\det \mathbf{R}| = 1$.
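The criterion "isometry if and only if the columns are orthonormal" can be illustrated in the real case with exact rational arithmetic. In the sketch below the $3 \times 2$ matrix `R` and the $2 \times 2$ matrix `Q` are sample data chosen for this illustration, and `matmul`, `transpose` are small helper functions, not notation from the text.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# A 3 x 2 matrix (m = 3 >= n = 2) whose columns are orthonormal in R^3.
R = [[F(3, 5), F(0)],
     [F(4, 5), F(0)],
     [F(0),    F(1)]]
gram = matmul(transpose(R), R)          # R^T R, expected to be Id_2

# The associated map preserves norms: |Rx|^2 = |x|^2.
x = [F(2), F(-7)]
Rx = [sum(a * b for a, b in zip(row, x)) for row in R]
norm_preserved = sum(c * c for c in Rx) == sum(c * c for c in x)

# A square orthogonal matrix: here also Q Q^T = Id and |det Q| = 1.
Q = [[F(3, 5), F(-4, 5)],
     [F(4, 5), F(3, 5)]]
left = matmul(transpose(Q), Q)
right = matmul(Q, transpose(Q))
det_Q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
```

Note that for the non-square `R` only $\mathbf{R}^T\mathbf{R} = \mathrm{Id}_2$ holds; $\mathbf{R}\mathbf{R}^T$ is a $3 \times 3$ matrix of rank 2 and cannot be the identity.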
e. The projection theorem
Let $X$ be a real (complex) vector space with inner (Hermitian) product $(\ |\ )$ that is not necessarily finite dimensional, let $V \subset X$ be a finite-dimensional linear subspace of $X$ of dimension $k$ and let $(e_1, e_2, \dots, e_k)$ be an orthonormal basis of $V$. We say that $x \in X$ is orthogonal to $V$ if $(x|v) = 0$ $\forall v \in V$. As $(e_1, e_2, \dots, e_k)$ is a basis of $V$, $x \perp V$ if and only if $(x|e_i) = 0$ $\forall i = 1, \dots, k$. For all $x \in X$, the vector
$$P_V(x) := \sum_{i=1}^k (x|e_i)\, e_i \in V$$
is called the orthogonal projection of $x$ in $V$, and the map $P_V : X \to V$, $x \to P_V(x)$, the projection map onto $V$. By Proposition 3.22, $P_V(x) = x$ if $x \in V$, hence $\mathrm{Im}\,P_V = V$ and $P_V^2 = P_V$. By Proposition 3.22 we also have $|P_V(x)|^2 = \sum_{i=1}^k |(x|e_i)|^2$. The next theorem explains the name for $P_V(x)$ and shows that in fact $P_V(x)$ is well defined as it does not depend on the chosen basis $(e_1, e_2, \dots, e_k)$.
3.27 Theorem (of orthogonal projection). With the previous notation, there exists a unique $z \in V$ such that $x - z$ is orthogonal to $V$, i.e., $(x - z|v) = 0$ $\forall v \in V$. Moreover, the following claims are equivalent.
(i) $x - z$ is orthogonal to $V$, i.e., $(x - z|v) = 0$ $\forall v \in V$,
(ii) $z \in V$ is the orthogonal projection of $x$ onto $V$, $z = P_V(x)$,
(iii) $z$ is the point in $V$ of minimum distance from $x$, i.e.,
$$|x - z| < |x - v| \qquad \forall v \in V,\ v \neq z.$$
In particular, $P_V(x)$ is well defined as it does not depend on the chosen orthonormal basis and there is a unique minimizer of the function $v \to |x - v|$, $v \in V$, the vector $z = P_V(x)$.
Proof. We first prove uniqueness. If $z_1, z_2 \in V$ are such that $(x - z_i|v) = 0$ for $i = 1, 2$, then $(z_1 - z_2|v) = 0$ $\forall v \in V$, in particular $|z_1 - z_2|^2 = 0$.
(i) $\Rightarrow$ (ii). From (i) we have $(x|e_i) = (z|e_i)$ $\forall i = 1, \dots, k$. By Proposition 3.22
$$z = \sum_{i=1}^k (z|e_i)\, e_i = \sum_{i=1}^k (x|e_i)\, e_i = P_V(x).$$
This also shows existence of a point $z$ such that $x - z$ is orthogonal to $V$ and that the definition of $P_V(x)$ is independent of the chosen orthonormal basis $(e_1, e_2, \dots, e_k)$.
(ii) $\Rightarrow$ (i). If $z = P_V(x)$, we have for every $j = 1, \dots, k$
$$(x - z|e_j) = (x|e_j) - \sum_{i=1}^k (x|e_i)(e_i|e_j) = (x|e_j) - (x|e_j) = 0,$$
hence $(x - z|v) = 0$ $\forall v \in V$.
(i) $\Rightarrow$ (iii). Let $v \in V$. Since $z - v \in V$, we have $(x - z|z - v) = 0$ and therefore
$$|x - v|^2 = |x - z + z - v|^2 = |x - z|^2 + |z - v|^2,$$
hence (iii).
(iii) $\Rightarrow$ (i). Let $v \in V$. The function $t \to |x - z + tv|^2$, $t \in \mathbb{R}$, has a minimum point at $t = 0$. Since
$$|x - z + tv|^2 = |x - z|^2 + 2t\,\Re(x - z|v) + t^2|v|^2,$$
necessarily $\Re(x - z|v) = 0$. If $X$ is a real vector space, this means $(x - z|v) = 0$, hence (i). If $X$ is a complex vector space, from $\Re(x - z|v) = 0$ $\forall v \in V$, we also have $\Re\big(e^{-i\theta}(x - z|v)\big) = 0$ $\forall \theta \in \mathbb{R}$, $\forall v \in V$, hence $(x - z|v) = 0$ $\forall v \in V$ and thus (i). □
We can discuss linear independence in terms of an orthogonal projection. In fact, for any finite-dimensional space $V \subset X$, $x \in V$ if and only if $x - P_V(x) = 0$; equivalently, the equation $x - P_V(x) = 0$ is an implicit equation that characterizes $V$ as the kernel of $\mathrm{Id} - P_V$.
3.28 ¶. Let $W = \mathrm{Span}\{v_1, v_2, \dots, v_k\}$ be a subspace of $\mathbb{K}^n$. Describe a procedure that uses the orthogonal projection theorem to find a basis of $W$.
3.29 ¶. Given $A \in M_{m,n}(\mathbb{K})$, describe a procedure that uses the orthogonal projection theorem in order to select a maximal system of independent rows and columns of $A$.
3.30 ¶. Let $A \in M_{m,n}(\mathbb{K})$. Describe a procedure to find a basis of $\ker A$.
3.31 ¶. Given $k$ linearly independent vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$, choose among the vectors $(e_1, e_2, \dots, e_n)$ of $\mathbb{R}^n$ $(n - k)$ vectors that together with $v_1, v_2, \dots, v_k$ form a basis of $\mathbb{R}^n$.
3.32 Projections in coordinates. Let $X$ be a Euclidean (Hermitian) space of dimension $n$ and let $V \subset X$ be a subspace of dimension $k$. Let us compute the matrix associated to the orthogonal projection operator $P_V : X \to X$ in an orthonormal basis. Of course, it suffices to think of $P_V$ as of the orthogonal projection on a subspace of $\mathbb{R}^n$ ($\mathbb{C}^n$ in the complex case).
Let $(e_1, e_2, \dots, e_n)$ be the canonical basis of $\mathbb{R}^n$ and $V \subset \mathbb{R}^n$. Let $v_1, v_2, \dots, v_k$ be an orthonormal basis of $V$ and denote by $\mathbf{V} = [v^i_j]$ the $n \times k$ matrix
$$\mathbf{V} := \big[\, v_1 \,\big|\, v_2 \,\big|\, \dots \,\big|\, v_k \,\big]$$
so that $v_j = \sum_{i=1}^n v^i_j e_i$. Let $\mathbf{P}$ be the $n \times n$ matrix associated to the orthogonal projection onto $V$, $P_V(x) = \mathbf{P}x$, or,
$$\mathbf{P} = \big[\, p_1 \,\big|\, p_2 \,\big|\, \dots \,\big|\, p_n \,\big], \qquad p_i = \mathbf{P}e_i,\ i = 1, \dots, n.$$
Then
$$p_i = P_V(e_i) = \sum_{j=1}^k (e_i \bullet v_j)\, v_j = \sum_{j=1}^k v^i_j\, v_j = \sum_{j=1}^k \sum_{h=1}^n v^i_j v^h_j\, e_h = \sum_{h=1}^n (\mathbf{V}\mathbf{V}^T)^h_i\, e_h, \qquad (3.3)$$
i.e.,
$$\mathbf{P} = \mathbf{V}\mathbf{V}^T.$$
The complex case is similar. With the same notation, instead of (3.3) we have
$$p_i = P_V(e_i) = \sum_{j=1}^k (e_i \bullet v_j)\, v_j = \sum_{j=1}^k \overline{v^i_j}\, v_j = \sum_{j=1}^k \sum_{h=1}^n \overline{v^i_j}\, v^h_j\, e_h = \sum_{h=1}^n (\mathbf{V}\overline{\mathbf{V}}^T)^h_i\, e_h, \qquad (3.4)$$
i.e., $\mathbf{P} = \mathbf{V}\overline{\mathbf{V}}^T$.
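The formula $\mathbf{P} = \mathbf{V}\mathbf{V}^T$ of 3.32 and the orthogonality property of Theorem 3.27 can be checked on a small real example. In the sketch below the plane $V \subset \mathbb{R}^3$ spanned by the orthonormal vectors $(3/5, 4/5, 0)$ and $(0, 0, 1)$ and the point $x = (1, 2, 3)$ are sample data chosen for this illustration; exact fractions keep all comparisons exact.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

# Columns of V: an orthonormal basis v_1, v_2 of a plane in R^3.
V = [[F(3, 5), F(0)],
     [F(4, 5), F(0)],
     [F(0),    F(1)]]

P = matmul(V, transpose(V))             # projection matrix P = V V^T

x = [F(1), F(2), F(3)]
Px = [sum(p * c for p, c in zip(row, x)) for row in P]
residual = [a - b for a, b in zip(x, Px)]

# Scalar products of x - Px with each basis vector v_j (the columns of V).
residual_dots = [sum(r * row[j] for r, row in zip(residual, V)) for j in range(2)]
```

Here `P` is idempotent and symmetric, and the residual $x - \mathbf{P}x$ is orthogonal to both basis vectors, as claim (i) of Theorem 3.27 requires.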
f. Orthogonal subspaces
Let $X$ be a real vector space with inner product $(\ |\ )$ or a complex vector space with Hermitian product $(\ |\ )$. Suppose $X$ is finite dimensional and let $W \subset X$ be a linear subspace of $X$. The subset
$$W^\perp := \{x \in X \mid (x|y) = 0\ \ \forall y \in W\}$$
is called the orthogonal of $W$ in $X$.
3.33 Proposition. We have
(i) $W^\perp$ is a linear subspace of $X$,
(ii) $W \cap W^\perp = \{0\}$,
(iii) $(W^\perp)^\perp = W$,
(iv) $W$ and $W^\perp$ are supplementary, hence $\dim W + \dim W^\perp = n$,
(v) if $P_W$ and $P_{W^\perp}$ are respectively the orthogonal projections onto $W$ and $W^\perp$ seen as linear maps from $X$ into itself, then $P_{W^\perp} = \mathrm{Id}_X - P_W$.
Proof. We prove (iv) and leave the rest to the reader. Let $(v_1, v_2, \dots, v_k)$ be a basis of $W$. Then we can complete $(v_1, v_2, \dots, v_k)$ with $n - k$ vectors of the canonical basis to get a new basis of $X$. Then the Gram-Schmidt procedure yields an orthonormal basis $(w_1, w_2, \dots, w_n)$ of $X$ such that $W = \mathrm{Span}\{w_1, w_2, \dots, w_k\}$. On the other hand $w_{k+1}, \dots, w_n \in W^\perp$, hence $\dim W^\perp = n - k$. □
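g. Riesz's theorem
3.34 Theorem (Riesz). Let $X$ be a Euclidean or Hermitian space of dimension $n$. For any $L \in X^*$ there is a unique $x_L \in X$ such that
$$L(x) = (x|x_L) \qquad \forall x \in X. \qquad (3.5)$$
Proof. Assume, for instance, that $X$ is Hermitian. Suppose $L \neq 0$, otherwise we choose $x_L = 0$, and observe that $\dim \mathrm{Im}\,L = 1$, and $V := \ker L$ has dimension $n - 1$ if $\dim X = n$. Fix $x_0 \in V^\perp$ with $|x_0| = 1$, then every $x \in X$ decomposes as
$$x = x' + \lambda x_0, \qquad x' \in \ker L,\ \lambda = (x|x_0).$$
Consequently,
$$L(x) = L(x') + \lambda L(x_0) = (x|x_0)\,L(x_0) = \big(x \,\big|\, \overline{L(x_0)}\,x_0\big)$$
and the claim follows choosing $x_L := \overline{L(x_0)}\,x_0$. □
The map $\beta : X^* \to X$, $L \to x_L$ defined by the Riesz theorem is called the Riesz map. Notice that $\beta$ is linear if $X$ is Euclidean and antilinear if $X$ is Hermitian.
3.35 The Riesz map in coordinates. Let $X$ be a Euclidean (Hermitian) space with inner (Hermitian) product $(\ |\ )$, fix a basis and denote by $\mathbf{x} = (x^1, x^2, \dots, x^n)$ the coordinates of $x$, and by $G$ the Gram matrix of the inner (Hermitian) product. Let $L \in X^*$ and let $\mathbf{L}$ be the associated matrix, $L(x) = \mathbf{L}\mathbf{x}$. From (3.5)
$$\mathbf{L}\mathbf{x} = L(x) = (x|x_L) = \mathbf{x}^T G\, \mathbf{x}_L \qquad \text{if } X \text{ is Euclidean},$$
$$\mathbf{L}\mathbf{x} = L(x) = (x|x_L) = \mathbf{x}^T G\, \overline{\mathbf{x}_L} \qquad \text{if } X \text{ is Hermitian},$$
i.e.,
$$G\mathbf{x}_L = \mathbf{L}^T \quad \text{or} \quad \mathbf{x}_L = G^{-1}\mathbf{L}^T \qquad \text{if } X \text{ is Euclidean},$$
$$\overline{G}\,\mathbf{x}_L = \overline{\mathbf{L}}^T \quad \text{or} \quad \mathbf{x}_L = \overline{G}^{-1}\,\overline{\mathbf{L}}^T \qquad \text{if } X \text{ is Hermitian}.$$
In particular, if the chosen basis $(e_1, e_2, \dots, e_n)$ is orthonormal, then $G = \mathrm{Id}$ and
$$\mathbf{x}_L = \mathbf{L}^T \qquad \text{if } X \text{ is Euclidean},$$
$$\mathbf{x}_L = \overline{\mathbf{L}}^T \qquad \text{if } X \text{ is Hermitian}.$$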
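In the Euclidean case the coordinate formula $\mathbf{x}_L = G^{-1}\mathbf{L}^T$ amounts to solving the linear system $G\mathbf{x}_L = \mathbf{L}^T$. The sketch below does this for a $2 \times 2$ example with Cramer's rule and exact fractions; the Gram matrix `G`, the row `L`, and the helper names `solve_2x2` and `inner` are sample data and assumptions made for this illustration.

```python
from fractions import Fraction as F

def solve_2x2(G, b):
    """Solve G u = b for a 2 x 2 matrix G by Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [(b[0] * G[1][1] - G[0][1] * b[1]) / det,
            (G[0][0] * b[1] - G[1][0] * b[0]) / det]

# A symmetric positive definite Gram matrix and a linear form L(x) = 4 x^1 + 7 x^2.
G = [[F(2), F(1)],
     [F(1), F(3)]]
L = [F(4), F(7)]

x_L = solve_2x2(G, L)   # coordinates of the Riesz representative of L

def inner(x, y):
    """Inner product in coordinates: x^T G y."""
    return sum(x[i] * G[i][j] * y[j] for i in range(2) for j in range(2))

# Check L(x) = (x | x_L) on a sample vector.
x = [F(5), F(-3)]
functional_value = sum(l * c for l, c in zip(L, x))
riesz_value = inner(x, x_L)
```

When the basis is orthonormal, `G` is the identity and the solve step disappears: the representative is simply the transposed row `L`.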
Figure 3.1. Dynamometer.
3.36 Example (Work and forces). Suppose a mass $m$ is fixed to a dynamometer. If $\theta$ is the inclination of the dynamometer, the dynamometer shows the number
$$L = mg\cos\theta, \qquad (3.6)$$
where $g$ is a suitable constant. Notice that we need no coordinates in $\mathbb{R}^3$ to read the dynamometer. We may model the reading of the dynamometer as a map of the direction $v$ of the dynamometer, that is, as a map $L : S^2 \to \mathbb{R}$ from the unit sphere $S^2 = \{x \in \mathbb{R}^3 \mid |x| = 1\}$ of the ambient space $V$ into $\mathbb{R}$. Moreover, extending $L$ homogeneously to the entire space $V$ by setting $L(v) := |v|\,L(v/|v|)$, $v \in \mathbb{R}^3 \setminus \{0\}$, we see that such an extension is linear because of the simple dependence of $L$ on the inclination. Thus we can model the elementary work done on the mass $m$, that is, the measures made using the dynamometer, by a linear map $L : V \to \mathbb{R}$. Thinking of the ambient space $V$ as Euclidean, by Riesz's theorem we can represent $L$ as a scalar product, introducing a vector $F := x_L \in V$ such that
$$(v|F) = L(v) \qquad \forall v \in V.$$
We interpret such a vector as the force whose action on the mass produces the elementary work $L(v)$. Now fix a basis $(e_1, e_2, e_3)$ of $V$. If $\mathbf{F} = (F^1, F^2, F^3)^T$ is the column vector of the force coordinates and $\mathbf{L} = (L_1, L_2, L_3)$ is the $1 \times 3$ matrix of the coordinates of $L$ in the dual basis, that is, the three readings $L_i = L(e_i)$, $i = 1, 2, 3$, of the dynamometer in the directions $e_1, e_2, e_3$, then, as we have seen,
$$\mathbf{F} = G^{-1}\mathbf{L}^T.$$
In particular, if $(e_1, e_2, e_3)$ is an orthonormal basis,
$$\mathbf{F} = \mathbf{L}^T.$$
h. The adjoint operator

Let $X, Y$ be two vector spaces both on $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$ and let $A : X \to Y$ be a linear map. For any $y \in Y$ the map $x \mapsto (A(x)|y)_Y$ defines a linear map on $X$, hence by Riesz's theorem there is a unique $A^*(y) \in X$ such that
$$(A(x)|y)_Y = (x|A^*(y))_X \quad \forall x \in X,\ \forall y \in Y. \qquad (3.7)$$
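It is easily seen that the map $y \mapsto A^*(y)$ from $Y$ into $X$ defined by (3.7) is linear: it is called the adjoint of $A$. Moreover,
(i) let $A, B : X \to Y$ be two linear maps between two Euclidean or Hermitian spaces. Then $(A + B)^* = A^* + B^*$,
(ii) $(\lambda A)^* = \lambda A^*$ if $\lambda \in \mathbb{R}$ and $A : X \to Y$ is a linear map between two Euclidean spaces,
(iii) $(\lambda A)^* = \overline{\lambda} A^*$ if $\lambda \in \mathbb{C}$ and $A : X \to Y$ is a linear map between two Hermitian spaces,
(iv) $(B \circ A)^* = A^* \circ B^*$ if $A : X \to Y$ and $B : Y \to Z$ are linear maps between Euclidean (Hermitian) spaces,
(v) $(A^*)^* = A$ if $A : X \to Y$ is a linear map.

3.37 ¶. Let $X, Y$ be vector spaces. We have already defined an adjoint $\tilde{A} : Y^* \to X^*$ with no use of inner or Hermitian products,
$$\langle \tilde{A}(y^*), x \rangle = \langle y^*, A(x) \rangle \quad \forall x \in X,\ \forall y^* \in Y^*.$$
If $X$ and $Y$ are Euclidean (Hermitian) spaces, denote by $\beta_X : X^* \to X$, $\beta_Y : Y^* \to Y$ the Riesz isomorphisms and by $A^*$ the adjoint of $A$ defined by (3.7). Show that $A^* = \beta_X \circ \tilde{A} \circ \beta_Y^{-1}$.

3.38 The adjoint operator in coordinates. Let $X, Y$ be two Euclidean (Hermitian) spaces with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$. Fix two bases in $X$ and $Y$, and denote the Gram matrices of the inner (Hermitian) products on $X$ and $Y$, respectively, by $\mathbf{G}$ and $\mathbf{H}$. Denote by $\mathbf{x}$ the coordinates of a vector $x$. Let $A : X \to Y$ be a linear map, let $A^*$ be the adjoint map and let $\mathbf{A}$, $\mathbf{A}^*$ be, respectively, the associated matrices. Then we have
$$(A(x)|y)_Y = \mathbf{x}^T\mathbf{A}^T\mathbf{H}\mathbf{y}, \qquad (x|A^*(y))_X = \mathbf{x}^T\mathbf{G}\mathbf{A}^*\mathbf{y}$$
if $X$ and $Y$ are Euclidean and
$$(A(x)|y)_Y = \mathbf{x}^T\mathbf{A}^T\mathbf{H}\overline{\mathbf{y}}, \qquad (x|A^*(y))_X = \mathbf{x}^T\mathbf{G}\,\overline{\mathbf{A}^*\mathbf{y}}$$
if $X$ and $Y$ are Hermitian. Therefore
$$\mathbf{G}\mathbf{A}^* = \mathbf{A}^T\mathbf{H} \quad \text{if $X$ and $Y$ are Euclidean},$$
$$\overline{\mathbf{G}}\,\mathbf{A}^* = \overline{\mathbf{A}}^T\,\overline{\mathbf{H}} \quad \text{if $X$ and $Y$ are Hermitian},$$
or, recalling that $\mathbf{G}^T = \mathbf{G}$, $(\mathbf{G}^{-1})^T = \mathbf{G}^{-1}$, $\mathbf{H}^T = \mathbf{H}$ if $X$ and $Y$ are Euclidean and that $\overline{\mathbf{G}}^T = \mathbf{G}$, $(\overline{\mathbf{G}}^{-1})^T = \mathbf{G}^{-1}$, and $\overline{\mathbf{H}}^T = \mathbf{H}$ if $X$ and $Y$ are Hermitian, we find
$$\mathbf{A}^* = \mathbf{G}^{-1}\mathbf{A}^T\mathbf{H} \quad \text{if $X$ and $Y$ are Euclidean},$$
$$\mathbf{A}^* = \overline{\mathbf{G}}^{-1}\,\overline{\mathbf{A}}^T\,\overline{\mathbf{H}} \quad \text{if $X$ and $Y$ are Hermitian}.$$
In particular,
$$\mathbf{A}^* = \mathbf{A}^T \ \text{in the Euclidean case}, \qquad \mathbf{A}^* = \overline{\mathbf{A}}^T \ \text{in the Hermitian case} \qquad (3.8)$$
if and only if the chosen bases in $X$ and $Y$ are orthonormal.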
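As a sanity check of the Euclidean formula $\mathbf{A}^* = \mathbf{G}^{-1}\mathbf{A}^T\mathbf{H}$, the sketch below builds the adjoint matrix for one hypothetical choice of $\mathbf{G}$, $\mathbf{H}$ and $\mathbf{A}$ and tests the defining identity (3.7) on a few vectors. All matrices and helper names are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(P, Q):
    """Matrix product P Q for matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix, in exact arithmetic."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def form(M, x, y):
    """The number x^T M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def apply(M, v):
    """The column vector M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

G = [[2, 1], [1, 3]]                    # Gram matrix on X = R^2 (hypothetical)
H = [[1, 0, 0], [0, 2, 1], [0, 1, 2]]   # Gram matrix on Y = R^3 (hypothetical)
A = [[1, 2], [0, 1], [3, -1]]           # matrix of A : X -> Y

# Matrix of the adjoint A* : Y -> X in the Euclidean case.
A_star = matmul(inverse_2x2(G), matmul(transpose(A), H))

# Defining identity (3.7): (A(x)|y)_Y = (x|A*(y))_X.
for x in ([1, 0], [0, 1], [2, -3]):
    for y in ([1, 0, 0], [0, 1, 0], [1, -1, 2]):
        assert form(H, apply(A, x), y) == form(G, x, apply(A_star, y))
```

Replacing `G` and `H` by identity matrices makes `A_star` equal to `transpose(A)`, in agreement with (3.8).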
3.39 Theorem. Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces and let $A^* : Y \to X$ be its adjoint. Then $\mathrm{Rank}\,A^* = \mathrm{Rank}\,A$. Moreover,
$$(\mathrm{Im}\,A)^\perp = \ker A^*, \qquad \mathrm{Im}\,A = (\ker A^*)^\perp, \qquad (\mathrm{Im}\,A^*)^\perp = \ker A, \qquad \mathrm{Im}\,A^* = (\ker A)^\perp.$$

Proof. Fix two orthonormal bases on $X$ and $Y$, and let $\mathbf{A}$ be the matrix associated to $A$ using these bases. Then, see (3.8), the matrix associated to $A^*$ is $\overline{\mathbf{A}}^T$, hence
$$\mathrm{Rank}\,A^* = \mathrm{Rank}\,\overline{\mathbf{A}}^T = \mathrm{Rank}\,\mathbf{A} = \mathrm{Rank}\,A,$$
and
$$\dim(\ker A^*)^\perp = \dim Y - \dim\ker A^* = \mathrm{Rank}\,A^* = \mathrm{Rank}\,A = \dim\mathrm{Im}\,A.$$
On the other hand, $\mathrm{Im}\,A \subset (\ker A^*)^\perp$ since, if $y = A(x)$ and $A^*(v) = 0$, then $(y|v) = (A(x)|v) = (x|A^*(v)) = 0$. We then conclude that $(\ker A^*)^\perp = \mathrm{Im}\,A$. The other claims easily follow. In fact, they are all equivalent to $\mathrm{Im}\,A = (\ker A^*)^\perp$. □
As an immediate consequence of Theorem 3.39 we have the following.

3.40 Theorem (The alternative theorem). Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces and let $A^* : Y \to X$ be its adjoint. Then $A_{|(\ker A)^\perp} : (\ker A)^\perp \to \mathrm{Im}\,A$ and $A^*_{|\mathrm{Im}\,A} : \mathrm{Im}\,A \to (\ker A)^\perp$ are injective and onto, hence isomorphisms. Moreover,
(i) $A(x) = y$ has at least a solution if and only if $y$ is orthogonal to $\ker A^*$,
(ii) $y$ is orthogonal to $\mathrm{Im}\,A$ if and only if $A^*(y) = 0$,
(iii) $A$ is injective if and only if $A^*$ is surjective,
(iv) $A$ is surjective if and only if $A^*$ is injective.

3.41 ¶. A more direct proof of the equality $\ker A = (\mathrm{Im}\,A^*)^\perp$ is the following. For simplicity, consider the real case. Clearly, it suffices to work in coordinates and, by choosing an orthonormal basis, it is enough to show that $\mathrm{Im}\,\mathbf{A} = (\ker\mathbf{A}^T)^\perp$ for every matrix $\mathbf{A} \in M_{m,n}(\mathbb{R})$. Let $\mathbf{A} = (a^i_j) \in M_{m,n}(\mathbb{R})$ and let $a^1, a^2, \dots, a^m$ be the rows of $\mathbf{A}$, equivalently the columns of $\mathbf{A}^T$,
$$\mathbf{A} = \begin{pmatrix} a^1 \\ a^2 \\ \vdots \\ a^m \end{pmatrix}.$$
Then,
$$\mathbf{A}\mathbf{x} = \begin{pmatrix} a^1_1 x^1 + a^1_2 x^2 + \dots + a^1_n x^n \\ a^2_1 x^1 + a^2_2 x^2 + \dots + a^2_n x^n \\ \vdots \\ a^m_1 x^1 + a^m_2 x^2 + \dots + a^m_n x^n \end{pmatrix} = \begin{pmatrix} a^1 \bullet \mathbf{x} \\ a^2 \bullet \mathbf{x} \\ \vdots \\ a^m \bullet \mathbf{x} \end{pmatrix}.$$
Consequently, $\mathbf{x} \in \ker\mathbf{A}$ if and only if $a^i \bullet \mathbf{x} = 0$ $\forall i = 1, \dots, m$, i.e.,
$$\ker\mathbf{A} = \mathrm{Span}\{a^1, a^2, \dots, a^m\}^\perp = (\mathrm{Im}\,\mathbf{A}^T)^\perp. \qquad (3.9)$$
3.2 Metrics on Real Vector Spaces

In this section, we discuss bilinear forms on real vector spaces. One can develop similar considerations in the complex setting, but we refrain from doing it.

a. Bilinear forms and linear operators

3.42 Definition. Let $X$ be a real linear space. A bilinear form on $X$ is a map $b : X \times X \to \mathbb{R}$ that is linear in each factor, i.e.,
$$b(\alpha x + \beta y, z) = \alpha\, b(x,z) + \beta\, b(y,z), \qquad b(x, \alpha y + \beta z) = \alpha\, b(x,y) + \beta\, b(x,z)$$
for all $x, y, z \in X$ and all $\alpha, \beta \in \mathbb{R}$. We denote the space of all bilinear forms on $X$ by $\mathcal{B}(X)$.

Observe that, if $b \in \mathcal{B}(X)$, then $b(0,x) = b(y,0) = 0$ $\forall x, y \in X$. The class of bilinear forms becomes a vector space if we set
$$(b_1 + b_2)(x,y) := b_1(x,y) + b_2(x,y), \qquad (\lambda b)(x,y) := b(\lambda x, y) = b(x, \lambda y).$$
Suppose that $X$ is a linear space with an inner product denoted by $(\ |\ )$. If $b \in \mathcal{B}(X)$, then for every $y \in X$ the map $x \mapsto b(x,y)$ is linear, consequently, by Riesz's theorem there is a unique $B := B(y) \in X$ such that
$$b(x,y) = (x|B(y)) \quad \forall x \in X. \qquad (3.10)$$
It is easily seen that the map $y \mapsto B(y)$ from $X$ into $X$ defined by (3.10) is linear. Thus (3.10) defines a one-to-one correspondence between $\mathcal{B}(X)$ and the space of linear operators $\mathcal{L}(X,X)$, and it is easy to see this correspondence is a linear isomorphism between $\mathcal{B}(X)$ and $\mathcal{L}(X,X)$.
Figure 3.2. Frontispiece and a page of the celebrated dissertation of G. F. Bernhard Riemann (1826-1866).
3.43 Bilinear forms in coordinates. Let $X$ be a finite-dimensional vector space and let $(e_1, e_2, \dots, e_n)$ be a basis of $X$. Let us denote by $\mathbf{B}$ the $n \times n$ matrix, sometimes called the Gram matrix of $b$,
$$\mathbf{B} = [b_{ij}], \qquad b_{ij} = b(e_i, e_j).$$
Recall that the first index from the left is the row index. Then by linearity, if for every $x, y \in X$, $\mathbf{x} = (x^1, x^2, \dots, x^n)^T$ and $\mathbf{y} = (y^1, y^2, \dots, y^n)^T \in \mathbb{R}^n$ are, respectively, the column vectors of the coordinates of $x$ and $y$, we have
$$b(x,y) = \sum_{i,j=1}^n b_{ij}\, x^i y^j = \mathbf{x}^T (\mathbf{B}\mathbf{y}) = \mathbf{x}^T\mathbf{B}\mathbf{y}.$$
In particular, a coordinate system induces a one-to-one correspondence between bilinear forms in $X$ and bilinear forms in $\mathbb{R}^n$. Notice that the entries of the matrix $\mathbf{B}$ have two lower indices that sum with the indices of the coordinates of the vectors $x, y$ that have upper indices. This also reminds us that $\mathbf{B}$ is not the matrix associated to a linear operator related to $b$. In fact, if instead $N$ is the linear operator associated to $b$,
$$b(x,y) = (x|N(y)) \quad \forall x, y \in X,$$
then
$$\mathbf{y}^T\mathbf{B}^T\mathbf{x} = b(x,y) = (x|N(y)) = \mathbf{y}^T\mathbf{N}^T\mathbf{G}\mathbf{x}$$
where we have denoted by $\mathbf{G}$ the Gram matrix associated to the inner product on $X$, $\mathbf{G} = [g_{ij}]$, $g_{ij} = (e_i|e_j)$, and by $\mathbf{N}$ the $n \times n$ matrix associated to $N : X \to X$ in the basis $(e_1, e_2, \dots, e_n)$. Thus $\mathbf{N}^T\mathbf{G} = \mathbf{B}^T$ or, recalling that $\mathbf{G}$ is symmetric and invertible, $\mathbf{N} = \mathbf{G}^{-1}\mathbf{B}$.
b. Symmetric bilinear forms or metrics

3.44 Definition. Let $X$ be a real vector space. A bilinear form $b \in \mathcal{B}(X)$ is said to be
(i) symmetric or a metric, if $b(x,y) = b(y,x)$ $\forall x, y \in X$,
(ii) antisymmetric if $b(x,y) = -b(y,x)$ $\forall x, y \in X$.
The space of symmetric bilinear forms is denoted by $\mathrm{Sym}(X)$.

3.45 ¶. Let $b \in \mathcal{B}(X)$. Show that $b_S(x,y) := \frac{1}{2}(b(x,y) + b(y,x))$, $x, y \in X$, is a symmetric bilinear form and $b_A(x,y) := \frac{1}{2}(b(x,y) - b(y,x))$, $x, y \in X$, is an antisymmetric bilinear form. In particular, one has the natural decomposition
$$b(x,y) = b_S(x,y) + b_A(x,y)$$
of $b$ into its symmetric and antisymmetric parts. Show that $b$ is symmetric if and only if $b = b_S$, and that $b$ is antisymmetric if and only if $b = b_A$.

3.46 ¶. Let $b \in \mathcal{B}(X)$ be a bilinear form, and let $\mathbf{B}$ be the associated Gram matrix. Show that $b$ is symmetric if and only if $\mathbf{B}^T = \mathbf{B}$.

3.47 ¶. Let $b \in \mathcal{B}(X)$ and let $N$ be the associated linear operator, see (3.10). Show that $N$ is self-adjoint, $N^* = N$, if and only if $b \in \mathrm{Sym}(X)$. Show that $N^* = -N$ if and only if $b$ is antisymmetric.

c. Sylvester's theorem

3.48 Definition. Let $X$ be a real vector space. We say that a metric on $X$, i.e., a bilinear symmetric form $g : X \times X \to \mathbb{R}$, is
(i) nondegenerate if $\forall x \in X$, $x \neq 0$, there is $y \in X$ such that $g(x,y) \neq 0$ and $\forall y \in X$, $y \neq 0$, there is $x \in X$ such that $g(x,y) \neq 0$,
(ii) positive definite if $g(x,x) > 0$ $\forall x \in X$, $x \neq 0$,
(iii) negative definite if $g(x,x) < 0$ $\forall x \in X$, $x \neq 0$.
3.49 %. Show that the scalar product {x\y) on X is a symmetric and nondegenerate bilinear form. We shall see later, Theorems 3.52 and 3.53, that any symmetric, nondegenerate and positive bilinear form on a finite-dimensional space is actually an inner product.
3.50 Definition. Let $X$ be a vector space of dimension $n$ and let $g \in \mathrm{Sym}(X)$ be a metric on $X$.
(i) We say that a basis $(e_1, e_2, \dots, e_n)$ is $g$-orthogonal if $g(e_i, e_j) = 0$ $\forall i, j = 1, \dots, n$, $i \neq j$.
(ii) The radical of $g$ is defined as the linear space
$$\mathrm{rad}(g) := \{x \in X \mid g(x,y) = 0\ \forall y \in X\}.$$
(iii) The rank of the metric $g$ is $r(g) := n - \dim\mathrm{rad}(g)$.
Figure 3.3. Jorgen Gram (1850-1916) and James Joseph Sylvester (1814-1897).
(iv) The signature of the metric $g$ is the triplet of numbers $(i_+(g), i_-(g), i_0(g))$ where
$i_+(g)$ := maximum of the dimensions of the subspaces $V \subset X$ on which $g$ is positive definite, $g(v,v) > 0$ $\forall v \in V$, $v \neq 0$,
$i_-(g)$ := maximum of the dimensions of the subspaces $V \subset X$ on which $g$ is negative definite, $g(v,v) < 0$ $\forall v \in V$, $v \neq 0$,
$i_0(g) := \dim\mathrm{rad}(g)$,
and
i+{g) + i-{g) + io{g) = n.
Proof. Suppose that $g(e_i, e_i) > 0$ for $i = 1, \dots, n_+$. For each $v = \sum_{i=1}^{n_+} v^i e_i$, $v \neq 0$, we have
$$g(v,v) = \sum_{i=1}^{n_+} |v^i|^2 g(e_i, e_i) > 0,$$
hence $n_+ = \dim\mathrm{Span}\{e_1, e_2, \dots, e_{n_+}\} \le i_+(g)$. On the other hand, if $W \subset X$ is a subspace of dimension $i_+(g)$ such that $g(v,v) > 0$ $\forall v \in W$, $v \neq 0$, we have $W \cap \mathrm{Span}\{e_{n_+ + 1}, \dots, e_n\} = \{0\}$ since $g(v,v) \le 0$ for all $v \in \mathrm{Span}\{e_{n_+ + 1}, \dots, e_n\}$. Therefore we also have $i_+(g) \le n - (n - n_+) = n_+$. Similarly, one proves that $n_- = i_-(g)$. Finally, since $\mathbf{G} := [g(e_i, e_j)]$ is the matrix associated to $g$ in the basis $(e_1, e_2, \dots, e_n)$, we have $i_0(g) = \dim\mathrm{rad}(g) = \dim\ker\mathbf{G}$, and, since $\mathbf{G}$ is diagonal, $\dim\ker\mathbf{G} = n_0$. □
d. Existence of $g$-orthogonal bases

The Gram-Schmidt algorithm yields the existence of an orthonormal basis in a Euclidean space $X$. We now see that a slight modification of the Gram-Schmidt algorithm allows us to construct in a finite-dimensional space a $g$-orthogonal basis for a given metric $g$.

3.53 Theorem (Gram-Schmidt). Let $g$ be a metric on a finite-dimensional real vector space $X$. Then $g$ has a $g$-orthogonal basis.

Proof. Let $r$ be the rank of $g$, $r := n - \dim\mathrm{rad}(g)$, and let $(w_1, w_2, \dots, w_{n-r})$ be a basis of $\mathrm{rad}(g)$. If $V$ denotes a supplementary subspace of $\mathrm{rad}(g)$, then $V$ is $g$-orthogonal to $\mathrm{rad}(g)$ and $\dim V = r$. Moreover, for every $v \in V$, $v \neq 0$, there is $z \in X$ such that $g(v,z) \neq 0$. Decomposing $z$ as $z = w + t$, $w \in V$, $t \in \mathrm{rad}(g)$, we then have $g(v,w) = g(v,w) + g(v,t) = g(v,z) \neq 0$, i.e., $g$ is nondegenerate on $V$. Since trivially $(w_1, w_2, \dots, w_{n-r})$ is $g$-orthogonal and $V$ is $g$-orthogonal to $(w_1, w_2, \dots, w_{n-r})$, in order to conclude it suffices to complete the basis $(w_1, w_2, \dots, w_{n-r})$ with a $g$-orthogonal basis of $V$; in other words, it suffices to prove the claim under the further assumption that $g$ be nondegenerate.

We proceed by induction on the dimension of $X$. Let $(f_1, f_2, \dots, f_n)$ be a basis of $X$. We claim that there exists $e_1 \in X$ with $g(e_1, e_1) \neq 0$. In fact, if for some $f_i$ we have $g(f_i, f_i) \neq 0$, we simply choose $e_1 := f_i$; otherwise, if $g(f_i, f_i) = 0$ for all $i$, for some $k \neq 1$ we must have $g(f_1, f_k) \neq 0$, since by assumption $\mathrm{rad}(g) = \{0\}$. In this case, we choose $e_1 := f_1 + f_k$ as
$$g(f_1 + f_k, f_1 + f_k) = g(f_1, f_1) + 2g(f_1, f_k) + g(f_k, f_k) = 0 + 2g(f_1, f_k) + 0 \neq 0.$$
Now it is easily seen that the subspace
$$V_1 := \{v \in X \mid g(e_1, v) = 0\}$$
supplements $\mathrm{Span}\{e_1\}$, and we find a basis $(v_2, \dots, v_n)$ of $V_1$ such that $g(v_j, e_1) = 0$ for all $j = 2, \dots, n$ by setting
$$v_j := f_j - \frac{g(f_j, e_1)}{g(e_1, e_1)}\, e_1.$$
Since $g$ is nondegenerate on $V_1$, by the induction assumption we find a $g$-orthogonal basis $(e_2, \dots, e_n)$ of $V_1$, and the vectors $(e_1, e_2, \dots, e_n)$ form a $g$-orthogonal basis of $X$. □
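The construction in the proof can be sketched in code. The following is a variant, not a transcription: instead of splitting off the radical first, it stops as soon as $g$ vanishes identically on the remaining vectors, which are then automatically $g$-orthogonal. The function names and the sample matrix are assumptions made for illustration; arithmetic is exact.

```python
from fractions import Fraction

def g(G, x, y):
    """The metric g(x, y) = x^T G y for a symmetric matrix G."""
    return sum(x[i] * G[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def g_orthogonal_basis(G):
    """Return a g-orthogonal basis of R^n (as coordinate lists) for symmetric G."""
    n = len(G)
    rest = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    basis = []
    while rest:
        # Look for a vector with g(v, v) != 0 ...
        idx = next((i for i, v in enumerate(rest) if g(G, v, v) != 0), None)
        if idx is None:
            # ... otherwise for a pair with g(u, v) != 0, so that
            # g(u + v, u + v) = 2 g(u, v) != 0, as in the proof.
            pair = next(((i, j) for i in range(len(rest))
                         for j in range(i + 1, len(rest))
                         if g(G, rest[i], rest[j]) != 0), None)
            if pair is None:
                basis.extend(rest)   # g vanishes on the span of the rest
                break
            i, j = pair
            rest[i] = [a + b for a, b in zip(rest[i], rest[j])]
            idx = i
        e = rest.pop(idx)
        basis.append(e)
        norm = g(G, e, e)
        # Gram-Schmidt step: v_j := f_j - g(f_j, e) / g(e, e) * e.
        rest = [[vi - g(G, v, e) / norm * ei for vi, ei in zip(v, e)] for v in rest]
    return basis

G = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]   # hypothetical degenerate metric on R^3
basis = g_orthogonal_basis(G)
diagonal = [g(G, e, e) for e in basis]   # one positive, one negative, one zero value
```

For this sample metric the diagonal values have signs $+,-,0$, so the signature is $(1,1,1)$.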
A variant of the Gram-Schmidt procedure is the following one due to Carl Jacobi (1804-1851). Let $g : X \times X \to \mathbb{R}$ be a metric on $X$. Let $(f_1, f_2, \dots, f_n)$ be a basis of $X$, let $\mathbf{G}$ be the matrix associated to $g$ in this basis, $\mathbf{G} = [g_{ij}]$, $g_{ij} = g(f_i, f_j)$. Set $\Delta_0 = 1$ and for $k = 1, \dots, n$
$$\Delta_k := \det\mathbf{G}_k$$
where $\mathbf{G}_k$ is the $k \times k$ submatrix of the first $k$ rows and $k$ columns.

3.54 Proposition (Jacobi). If $\Delta_k \neq 0$ for all $k = 1, \dots, n$, there exists a $g$-orthogonal basis $(e_1, e_2, \dots, e_n)$ of $X$; moreover
$$g(e_k, e_k) = \frac{\Delta_{k-1}}{\Delta_k}.$$

Proof. We look for a basis $(e_1, e_2, \dots, e_n)$ so that
$$e_1 = a_1^1 f_1, \qquad e_2 = a_2^1 f_1 + a_2^2 f_2, \qquad \dots, \qquad e_n = a_n^1 f_1 + a_n^2 f_2 + \dots + a_n^n f_n,$$
or, equivalently,
$$e_k := \sum_{i=1}^k a_k^i f_i, \qquad k = 1, \dots, n, \qquad (3.11)$$
as in the Gram-Schmidt procedure, such that $g(e_i, e_j) = 0$ for $i \neq j$. At first sight the system $g(e_i, e_j) = 0$, $i \neq j$, is a system in the unknowns $a_k^i$. However, if we impose that for all $k$'s
$$g(e_k, f_i) = 0 \quad \forall i = 1, \dots, k-1, \qquad (3.12)$$
by linearity $g(e_k, e_i) = 0$ for $i < k$, and by symmetry $g(e_k, e_i) = 0$ for $i > k$. It suffices then to fulfill (3.12), i.e., solve the system of $k-1$ equations in $k$ unknowns $a_k^1, a_k^2, \dots, a_k^k$
$$\sum_{j=1}^k g(f_j, f_i)\, a_k^j = 0, \qquad \forall i = 1, \dots, k-1. \qquad (3.13)$$
If we add the normalization condition
$$\sum_{j=1}^k g(f_j, f_k)\, a_k^j = 1, \qquad (3.14)$$
we get a system of $k$ equations in $k$ unknowns of the type $\mathbf{G}_k\mathbf{x} = \mathbf{b}$, where $\mathbf{G}_k = [g_{ij}]$, $g_{ij} := g(f_i, f_j)$, $\mathbf{x} = (a_k^1, \dots, a_k^k)^T$ and $\mathbf{b} = (0, 0, \dots, 1)^T$. Since $\det\mathbf{G}_k = \Delta_k$ and $\Delta_k \neq 0$ by assumption, the system is solvable. Due to the arbitrariness of $k$, we are able to find a $g$-orthogonal basis of type (3.11). It remains to compute $g(e_k, e_k)$. From (3.13) and (3.14) we get
$$g(e_k, e_k) = \sum_{i,j=1}^k a_k^i a_k^j\, g(f_i, f_j) = \sum_{i=1}^k a_k^i \Big( \sum_{j=1}^k g(f_i, f_j)\, a_k^j \Big) = \sum_{i=1}^k a_k^i\, \delta_{ik} = a_k^k$$
and we compute $a_k^k$ by Cramer's formula,
$$a_k^k = \frac{\Delta_{k-1}}{\Delta_k},$$
hence $g(e_k, e_k) = \Delta_{k-1}/\Delta_k$. □
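Jacobi's construction can be tried on a small example. The sketch below solves each system $\mathbf{G}_k\mathbf{x} = \mathbf{b}$ by Cramer's rule and compares $g(e_k, e_k)$ with $\Delta_{k-1}/\Delta_k$; the sample matrix and the helper names `det`, `jacobi_basis` and `g` are assumptions made for illustration.

```python
from fractions import Fraction

def det(M):
    """Determinant by exact Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in M]
    n, d = len(M), Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return d

def g(G, x, y):
    return sum(x[i] * G[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def jacobi_basis(G):
    """Coordinates (in the basis f) of e_1, ..., e_n from (3.11)-(3.14).
    Assumes G is symmetric with all leading minors nonzero."""
    n = len(G)
    basis = []
    for k in range(1, n + 1):
        Gk = [row[:k] for row in G[:k]]
        b = [0] * (k - 1) + [1]
        dk = det(Gk)
        # Cramer's rule: replace column j of G_k by b.
        coeffs = [det([row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(Gk)]) / dk
                  for j in range(k)]
        basis.append(coeffs + [Fraction(0)] * (n - k))
    return basis

G = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]   # hypothetical Gram matrix, minors 2, 3, 4
basis = jacobi_basis(G)
minors = [Fraction(1)] + [det([row[:k] for row in G[:k]]) for k in range(1, 4)]
for k in range(1, 4):
    assert g(G, basis[k - 1], basis[k - 1]) == minors[k - 1] / minors[k]
```

Here the diagonal values are $1/2$, $2/3$, $3/4$, all positive, consistent with a positive definite metric.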
3.55 Remark. Notice that Jacobi's method is a rewriting of the Gram-Schmidt procedure in the case where $g(f_i, f_i) \neq 0$ for all $i$'s. In terms of Gram's matrix $\mathbf{G} := [g(f_i, f_j)]$, we have also proved that
$$\mathbf{T}^T\mathbf{G}\mathbf{T} = \mathrm{diag}\Big\{\frac{\Delta_{k-1}}{\Delta_k}\Big\}$$
for a suitable triangular matrix $\mathbf{T}$.

3.56 Corollary (Sylvester). Suppose that $\Delta_1, \dots, \Delta_n \neq 0$. Then the metric $g$ is nondegenerate. Moreover, $i_-(g)$ equals the number of changes of sign in the sequence $(1, \Delta_1, \Delta_2, \dots, \Delta_n)$. In particular, if $\Delta_k > 0$ for all $k$'s, then $g$ is positive definite.

Let $(e_1, e_2, \dots, e_n)$ be a $g$-orthogonal basis of $X$. By reordering the basis in such a way that
$$g(e_j, e_j) \begin{cases} > 0 & \text{if } j = 1, \dots, i_+(g), \\ < 0 & \text{if } j = i_+(g)+1, \dots, i_+(g)+i_-(g), \\ = 0 & \text{otherwise}; \end{cases}$$
and setting
$$f_j := \begin{cases} \dfrac{e_j}{\sqrt{|g(e_j, e_j)|}} & \text{if } j = 1, \dots, i_+(g)+i_-(g), \\ e_j & \text{otherwise}, \end{cases}$$
we get
$$g(f_j, f_j) = \begin{cases} 1 & \text{if } j = 1, \dots, i_+(g), \\ -1 & \text{if } j = i_+(g)+1, \dots, i_+(g)+i_-(g), \\ 0 & \text{otherwise}. \end{cases}$$
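The sign-counting rule of Corollary 3.56 is easy to try out. A minimal sketch follows, assuming all leading minors are nonzero (otherwise the corollary does not apply and the hypothetical function below raises an error); the function names are illustrative.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def signature_from_minors(G):
    """Signature (i_+, i_-, i_0) of the metric x^T G y, read off the sequence
    (1, Delta_1, ..., Delta_n) of leading principal minors."""
    n = len(G)
    minors = [1] + [det([row[:k] for row in G[:k]]) for k in range(1, n + 1)]
    if any(m == 0 for m in minors):
        raise ValueError("Corollary 3.56 needs all leading minors to be nonzero")
    # i_- is the number of sign changes between consecutive minors.
    negatives = sum(1 for a, b in zip(minors, minors[1:]) if a * b < 0)
    return (n - negatives, negatives, 0)

# Minors (1, 1, -3): one change of sign.
sig_mixed = signature_from_minors([[1, 2], [2, 1]])
# Minors (1, 2, 3, 4): no change of sign, positive definite.
sig_positive = signature_from_minors([[2, 1, 0], [1, 2, 1], [0, 1, 2]])
```

Since nondegeneracy gives $i_0(g) = 0$, the positive index is recovered as $n - i_-(g)$.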
e. Congruent matrices

It is worth seeing now how the matrix associated to a bilinear form changes when we change bases. Let $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_n)$ be two bases of $X$ and let $\mathbf{R}$ be the matrix associated to the map $R : X \to X$, $R(e_i) := f_i$, in the basis $(e_1, e_2, \dots, e_n)$, that is
$$\mathbf{R} := [\mathbf{r}_1 \,|\, \mathbf{r}_2 \,|\, \dots \,|\, \mathbf{r}_n]$$
where $\mathbf{r}_i$ is the column vector of the coordinates of $f_i$ in the basis $(e_1, e_2, \dots, e_n)$. As we know, if $\mathbf{x}$ and $\mathbf{x}'$ are the column vectors of the coordinates of $x$, respectively, in the basis $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_n)$, then $\mathbf{x} = \mathbf{R}\mathbf{x}'$. Denote by $\mathbf{B}$ and $\mathbf{B}'$ the matrices associated to $b$, respectively, in the coordinates $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_n)$. Then we have
$$b(x,y) = \mathbf{x}'^T\mathbf{B}'\mathbf{y}', \qquad b(x,y) = \mathbf{x}^T\mathbf{B}\mathbf{y} = \mathbf{x}'^T\mathbf{R}^T\mathbf{B}\mathbf{R}\mathbf{y}',$$
hence
$$\mathbf{B}' = \mathbf{R}^T\mathbf{B}\mathbf{R}. \qquad (3.15)$$
The previous argument can of course be reversed. If (3.15) holds, then $\mathbf{B}$ and $\mathbf{B}'$ are the Gram matrices of the same metric $b$ on $\mathbb{R}^n$ in different coordinates,
$$b(x,y) = \mathbf{x}'^T\mathbf{B}'\mathbf{y}' = (\mathbf{R}\mathbf{x}')^T\mathbf{B}(\mathbf{R}\mathbf{y}').$$

3.57 Definition. Two matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ are said to be congruent if there exists a nonsingular matrix $\mathbf{R} \in M_{n,n}(\mathbb{R})$ such that $\mathbf{B} = \mathbf{R}^T\mathbf{A}\mathbf{R}$.
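The change-of-basis rule (3.15) can be verified on a small example. In the sketch below the matrices `B` and `R` and the coordinate vectors are hypothetical; the check is that the value of the bilinear form does not depend on the coordinates used.

```python
def matmul(P, Q):
    """Matrix product P Q for matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def apply(M, v):
    """The column vector M v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def form(M, x, y):
    """The number x^T M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

B = [[1, 2], [0, 3]]   # matrix of b in the basis (e_1, e_2); need not be symmetric
R = [[1, 1], [0, 1]]   # columns: coordinates of f_1, f_2 in the basis (e_1, e_2)

B_new = matmul(transpose(R), matmul(B, R))   # B' = R^T B R, as in (3.15)

x_new, y_new = [2, -1], [3, 5]                       # coordinates in the basis (f_1, f_2)
x_old, y_old = apply(R, x_new), apply(R, y_new)      # x = R x', y = R y'

# The same number b(x, y), computed in either coordinate system.
assert form(B, x_old, y_old) == form(B_new, x_new, y_new)
```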
It turns out that the congruence relation is an equivalence relation on matrices, thus the $n \times n$ matrices are partitioned into classes of congruent matrices. Since the matrices associated to a bilinear form in different bases are congruent, to any bilinear form corresponds a unique class of congruent matrices. The above then reads as saying that two matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ are congruent if and only if they represent the same bilinear form in different coordinates. Thus, the existence of a $g$-orthogonal basis is equivalent to the following.

3.58 Theorem. A symmetric matrix $\mathbf{A} \in M_{n,n}(\mathbb{R})$ is congruent to a diagonal matrix.

Moreover, Sylvester's theorem reads equivalently as the following.

3.59 Theorem. Two diagonal matrices $\mathbf{I}, \mathbf{J} \in M_{n,n}(\mathbb{R})$ are congruent if and only if they have the same number of positive, negative and zero entries in the diagonal. If, moreover, a symmetric matrix $\mathbf{A} \in M_{n,n}(\mathbb{R})$ is congruent to
$$\begin{pmatrix} \mathrm{Id}_a & 0 & 0 \\ 0 & -\mathrm{Id}_b & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
then $(a, b, n-a-b)$ is the signature of the metric $\mathbf{y}^T\mathbf{A}\mathbf{x}$.

Thus the existence of a $g$-orthogonal basis in conjunction with Sylvester's theorem reads as the following.

3.60 Theorem. Two symmetric matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ are congruent if and only if the metrics $\mathbf{y}^T\mathbf{A}\mathbf{x}$ and $\mathbf{y}^T\mathbf{B}\mathbf{x}$ on $\mathbb{R}^n$ have the same signature $(a, b, r)$. In this case, $\mathbf{A}$ and $\mathbf{B}$ are congruent to
$$\begin{pmatrix} \mathrm{Id}_a & 0 & 0 \\ 0 & -\mathrm{Id}_b & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
f. Classification of real metrics

Since reordering the basis elements is a linear change of coordinates, we can now reformulate Sylvester's theorem in conjunction with the existence of a $g$-orthonormal basis as follows. Let $X, Y$ be two real vector spaces, and let $g, h$ be two metrics, respectively, on $X$ and $Y$. We say that $(X, g)$ and $(Y, h)$ are isometric if and only if there is an isomorphism $L : X \to Y$ such that $h(L(x), L(y)) = g(x,y)$ $\forall x, y \in X$. Observing that two metrics are isometric if and only if, in coordinates, their Gram matrices are congruent, from Theorem 3.60 we infer the following.

3.61 Theorem. $(X, g)$ and $(Y, h)$ are isometric if and only if $g$ and $h$ have the same signature,
$$(i_+(g), i_-(g), i_0(g)) = (i_+(h), i_-(h), i_0(h)).$$
Moreover, if $X$ has dimension $n$ and the metric $g$ on $X$ has signature $(a, b, r)$, $a + b + r = n$, then $(X, g)$ is isometric to $(\mathbb{R}^n, h)$ where $h(\mathbf{x}, \mathbf{y}) := \mathbf{x}^T\mathbf{H}\mathbf{y}$ and
$$\mathbf{H} = \begin{pmatrix} \mathrm{Id}_a & 0 & 0 \\ 0 & -\mathrm{Id}_b & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

According to the above, the metrics over a real finite-dimensional vector space $X$ are classified, modulo isometries, by their signature. Some of them have names:
(i) The Euclidean metric: $i_+(g) = n$, $i_-(g) = i_0(g) = 0$; in this case $g$ is a scalar product.
(ii) The pseudoeuclidean metrics: $i_-(g) = 0$.
(iii) The Lorentz metric or Minkowski metric: $i_+(g) = n-1$, $i_-(g) = 1$, $i_0(g) = 0$.
(iv) The Artin metric: $i_+(g) = i_-(g) = p$, $i_0(g) = 0$.

3.62 ¶. Show that a bilinear form $g$ on a finite-dimensional space $X$ is an inner product on $X$ if and only if $g$ is symmetric and positive definite.
g. Quadratic forms

Let $X$ be a finite-dimensional vector space over $\mathbb{R}$ and let $b \in \mathcal{B}(X)$ be a bilinear form on $X$. The quadratic form $\phi : X \to \mathbb{R}$ associated to $b$ is defined by $\phi(x) = b(x,x)$, $x \in X$. Observe that $\phi$ is fixed only by the symmetric part of $b$,
$$b_S(x,y) := \frac{1}{2}\big(b(x,y) + b(y,x)\big),$$
since $b(x,x) = b_S(x,x)$ $\forall x \in X$. Moreover one can recover $b_S$ from $\phi$ since $b_S$ is symmetric,
$$b_S(x,y) = \frac{1}{2}\big(\phi(x+y) - \phi(x) - \phi(y)\big).$$
Another important relation between a bilinear form $b \in \mathcal{B}(X)$ and its quadratic form $\phi$ is the following. Let $x$ and $v \in X$. Since $\phi(x + tv) = \phi(x) + t\, b(x,v) + t\, b(v,x) + t^2\, b(v,v)$, we have
$$\frac{d}{dt}\phi(x + tv)\Big|_{t=0} = 2 b_S(x,v). \qquad (3.16)$$
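Both identities can be checked on a concrete form. The sketch below uses a hypothetical non-symmetric matrix `B` on $\mathbb{R}^2$ and exact rational arithmetic; it verifies the polarization formula and the expansion of $\phi(x + tv)$ whose linear coefficient gives (3.16).

```python
from fractions import Fraction

B = [[1, 2], [0, 3]]   # hypothetical matrix of b, deliberately not symmetric

def b(x, y):
    """The bilinear form b(x, y) = x^T B y."""
    return sum(x[i] * B[i][j] * y[j] for i in range(2) for j in range(2))

def phi(x):
    """The associated quadratic form."""
    return b(x, x)

def b_sym(x, y):
    """The symmetric part b_S of b."""
    return Fraction(b(x, y) + b(y, x), 2)

def add(x, y, t=1):
    """Return x + t*y."""
    return [xi + t * yi for xi, yi in zip(x, y)]

x, v = [1, 2], [3, -1]

# Polarization: b_S(x, v) = (phi(x + v) - phi(x) - phi(v)) / 2.
polar = Fraction(phi(add(x, v)) - phi(x) - phi(v), 2)
assert polar == b_sym(x, v)

# Expansion: phi(x + t v) = phi(x) + 2 t b_S(x, v) + t^2 phi(v),
# so the derivative at t = 0 is 2 b_S(x, v).
t = Fraction(1, 7)
expansion = phi(x) + 2 * t * b_sym(x, v) + t * t * phi(v)
assert phi(add(x, v, t)) == expansion
```

Note that $b(x,v) = -5$ and $b(v,x) = 9$ differ here, while only their average $b_S(x,v) = 2$ enters $\phi$.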
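We refer to (3.16) saying that the symmetric part $b_S$ of $b$ is the first variation of the associated quadratic form.

3.63 Homogeneous polynomials of degree two. Let $\mathbf{B} = [b_{ij}] \in M_{n,n}(\mathbb{R})$ and let
$$b(\mathbf{x}, \mathbf{y}) := \mathbf{x}^T\mathbf{B}\mathbf{y} = \sum_{i,j=1}^n b_{ij}\, x^i y^j$$
be the bilinear form defined by $\mathbf{B}$ on $\mathbb{R}^n$, $\mathbf{x} = (x^1, x^2, \dots, x^n)$, $\mathbf{y} = (y^1, y^2, \dots, y^n)$. Clearly,
$$\phi(\mathbf{x}) = b(\mathbf{x}, \mathbf{x}) = \mathbf{x}^T\mathbf{B}\mathbf{x} = \sum_{i,j=1}^n b_{ij}\, x^i x^j$$
is a homogeneous polynomial of degree two. Conversely, any homogeneous polynomial of degree two
$$P(\mathbf{x}) = \sum_{\substack{i,j=1,\dots,n \\ i \le j}} b_{ij}\, x^i x^j = \mathbf{x}^T\mathbf{B}\mathbf{x}$$
defines a unique symmetric bilinear form in $\mathbb{R}^n$ by
$$b(\mathbf{x}, \mathbf{y}) := \frac{1}{2}\big(P(\mathbf{x} + \mathbf{y}) - P(\mathbf{x}) - P(\mathbf{y})\big)$$
with associated quadratic form $P$.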
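3.64 Example. Let $(x, y)$ be the standard coordinates in $\mathbb{R}^2$. The quadratic polynomial
$$ax^2 + bxy + cy^2 = (x, y)\begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$$
is the quadratic form of the metric
$$g((x,y),(z,w)) := (z, w)\begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}.$$

3.65 Derivatives of a quadratic form. From (3.16) we can compute the partial derivatives of the quadratic form $\phi(\mathbf{x}) := \mathbf{x}^T\mathbf{G}\mathbf{x}$. In fact, choosing $v = e_h$, we have
$$\frac{\partial\phi}{\partial x^h}(\mathbf{x}) := \frac{d}{dt}\phi(\mathbf{x} + t e_h)\Big|_{t=0} = 2 b_S(\mathbf{x}, e_h) = \mathbf{x}^T(\mathbf{G} + \mathbf{G}^T) e_h,$$
hence, arranging the partial derivatives in a $1 \times n$ matrix, called the Jacobian matrix of $\phi$,
$$\mathbf{D}\phi(\mathbf{x}) := \Big[ \frac{\partial\phi}{\partial x^1}(\mathbf{x}) \,\Big|\, \frac{\partial\phi}{\partial x^2}(\mathbf{x}) \,\Big|\, \dots \,\Big|\, \frac{\partial\phi}{\partial x^n}(\mathbf{x}) \Big],$$
we have
$$\mathbf{D}\phi(\mathbf{x}) = \mathbf{x}^T(\mathbf{G} + \mathbf{G}^T),$$
or, taking the transpose,
$$\nabla\phi(\mathbf{x}) := (\mathbf{D}\phi(\mathbf{x}))^T = (\mathbf{G} + \mathbf{G}^T)\mathbf{x}.$$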
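The gradient formula can be compared with difference quotients. For a quadratic form the central difference quotient is exact for any step, so the comparison below holds as an equality of rationals rather than an approximation. The matrix `G` and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

G = [[1, 2, 0], [0, 3, -1], [4, 0, 2]]   # hypothetical matrix, not symmetric

def phi(x):
    """The quadratic form phi(x) = x^T G x."""
    return sum(x[i] * G[i][j] * x[j] for i in range(3) for j in range(3))

def gradient(x):
    """The vector (G + G^T) x."""
    return [sum((G[h][j] + G[j][h]) * x[j] for j in range(3)) for h in range(3)]

def central_difference(x, h, t=Fraction(1, 10)):
    """(phi(x + t e_h) - phi(x - t e_h)) / (2 t), exact for quadratic phi."""
    plus = [xi + (t if i == h else 0) for i, xi in enumerate(x)]
    minus = [xi - (t if i == h else 0) for i, xi in enumerate(x)]
    return (phi(plus) - phi(minus)) / (2 * t)

x = [Fraction(1), Fraction(-2), Fraction(3)]
numeric = [central_difference(x, h) for h in range(3)]
assert numeric == gradient(x)
```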
h. Reducing to a sum of squares

Let $g$ be a metric on a real vector space $X$ of dimension $n$ and let $\phi$ be the associated quadratic form. Then, choosing a basis $(e_1, e_2, \dots, e_n)$, we have
$$\phi(x) = g(x,x) = \sum_{i=1}^n (x^i)^2\, g(e_i, e_i)$$
if and only if $(e_1, e_2, \dots, e_n)$ is $g$-orthogonal, and the number of positive, negative and zero coefficients is the signature of $g$. Thus, Sylvester's theorem in conjunction with the fact that we can always find a $g$-orthogonal basis can be rephrased as follows.

3.66 Theorem (Sylvester's law of inertia). Let $\phi(x) = g(x,x)$ be the quadratic form associated to a metric $g$ on an $n$-dimensional real vector space $X$.
(i) There exists a basis $(f_1, f_2, \dots, f_n)$ of $X$ such that
$$\phi(x) = \sum_{i=1}^{i_+(g)} (x^i)^2 - \sum_{i=i_+(g)+1}^{i_+(g)+i_-(g)} (x^i)^2, \qquad x = \sum_{i=1}^n x^i f_i,$$
where $(i_+(g), i_-(g), i_0(g))$ is the signature of $g$.
(ii) If for some basis $(e_1, e_2, \dots, e_n)$
$$\phi(x) = \sum_{i=1}^n \phi(e_i)\,(x^i)^2, \qquad x := \sum_{i=1}^n x^i e_i, \qquad (3.17)$$
then the numbers $n_+$, $n_-$ and $n_0$, respectively, of positive, negative and zero $\phi(e_i)$'s are the signature $(i_+(g), i_-(g), i_0(g))$ of $g$.

3.67 Example. In order to reduce a quadratic form $\phi$ to the canonical form (3.17), we may use Gram-Schmidt's algorithm. Let us repeat it focusing this time on the change of coordinates instead of on the change of basis. Suppose we want to reduce to a sum of squares, by changing coordinates, the quadratic form
$$\phi(x) = \sum_{i,j=1}^n a_{ij}\, x^i x^j,$$
with symmetric coefficients $a_{ij} = a_{ji}$, where at least one of the $a_{ij}$'s is not zero. We first look for a coefficient $a_{kk}$ that is not zero. If we find it, we go further, otherwise if all $a_{kk}$ vanish, at least one of the mixed terms, say $a_{12}$, is nonzero; the change of variables
$$x^1 = y^1 + y^2, \qquad x^2 = y^1 - y^2, \qquad x^j = y^j \ \text{for } j = 3, \dots, n,$$
transforms $a_{12} x^1 x^2$ into $a_{12}((y^1)^2 - (y^2)^2)$, and since $a_{11} = a_{22} = 0$, in the new coordinates $(y^1, y^2, \dots, y^n)$ the coefficient of $(y^1)^2$ is not zero. Thus, possibly after a linear change of variables, we write $\phi$ as
$$\phi(x) = \frac{1}{a_{11}}(a_{11} y^1)^2 + 2\sum_{j=2}^n a_{1j}\, y^1 y^j + B(y^2, \dots, y^n).$$
We now complete the square and set
$$Y^1 = a_{11} y^1 + \sum_{j=2}^n a_{1j}\, y^j, \qquad Y^j = y^j \ \text{for } j = 2, \dots, n,$$
so that
$$\phi(x) = \frac{1}{a_{11}}\Big(a_{11} y^1 + \sum_{j=2}^n a_{1j}\, y^j\Big)^2 + C = \frac{1}{a_{11}}(Y^1)^2 + C$$
where $C$ contains only products of $Y^2, \dots, Y^n$. The process can then be iterated.

3.68 Example. Show that Jacobi's method in Proposition 3.54 transforms $\phi$ in
$$\phi(x) = \frac{1}{\Delta_1}(x^1)^2 + \frac{\Delta_1}{\Delta_2}(x^2)^2 + \dots + \frac{\Delta_{n-1}}{\Delta_n}(x^n)^2$$
if $x = \sum_{i=1}^n x^i e_i$, for a suitable $g$-orthogonal basis $(e_1, e_2, \dots, e_n)$.

3.69 Example (Classification of conics). The conics in the plane are the zeros of a second degree polynomial in two variables
$$P(x,y) := ax^2 + 2bxy + cy^2 + dx + ey + f = 0, \qquad (x,y) \in \mathbb{R}^2, \qquad (3.18)$$
where $a, b, c, d, e, f \in \mathbb{R}$. Choose a new system of coordinates $(X, Y)$, $X = \alpha x + \beta y$, $Y = \gamma x + \delta y$, in which the quadratic part of $P$ transforms into a sum of squares, $ax^2 + 2bxy + cy^2 = pX^2 + qY^2$; consequently, $P$ transforms into $pX^2 + qY^2 + 2rX + 2sY + f = 0$. Now we can classify the conics in terms of the signs of $p$, $q$ and $f$. If $p, q$ are zero, the conic reduces to the straight line $2rX + 2sY + f = 0$. If $p \neq 0$ and $q = 0$, then, completing the square (and still denoting by $f$ the resulting constant term), the conic becomes
$$p(X - X_0)^2 + 2sY + f = 0, \qquad X_0 = -\frac{r}{p},$$
i.e., a parabola with vertex in $(X_0, 0)$ and axis parallel to the axis of $Y$. Similarly, if $p = 0$ and $q \neq 0$, the conic is a parabola with vertex in $(0, Y_0)$, $Y_0 := -s/q$, and axis parallel to the $X$ axis. Finally, if $pq \neq 0$, completing the square, the conic is
$$p(X - X_0)^2 + q(Y - Y_0)^2 + f = 0, \qquad X_0 = -\frac{r}{p}, \quad Y_0 = -\frac{s}{q},$$
i.e., it is
o a hyperbola if $f \neq 0$ and $pq < 0$,
o two straight lines if $f = 0$ and $pq < 0$,
o an ellipse if $\mathrm{sgn}(f) = -\mathrm{sgn}(p)$ and $pq > 0$,
o a point if $f = 0$ and $pq > 0$,
o the empty set if $\mathrm{sgn}(f) = \mathrm{sgn}(p)$ and $pq > 0$.
Since we have operated with linear changes of coordinates that map straight lines into straight lines, ellipses into ellipses, and hyperbolas into hyperbolas, we conclude the following.

3.70 Proposition. The conics in the plane are classified in terms of the signature of their quadratic part and of the sign of the zero term.

3.71 ¶. The equation of a quadric, i.e., of the zeros of a second order polynomial in $n$ variables, see Figure 3.4 for $n = 3$, has the form
$$\phi(\mathbf{x}) := \mathbf{x}^T\mathbf{A}\mathbf{x} + 2\,\mathbf{b} \bullet \mathbf{x} + c = 0$$
where $\mathbf{A} \in M_{n,n}(\mathbb{R})$ is symmetric, $\mathbf{b} \in \mathbb{R}^n$ and $c \in \mathbb{R}$. Prove the following claims.
(i) $0$ is a center of symmetry, i.e., $\phi(\mathbf{x}) = \phi(-\mathbf{x})$, if and only if $\mathbf{b} = 0$.
(ii) $\mathbf{x}_0$ is a center of symmetry, i.e., $\phi(\mathbf{x}_0 - \mathbf{x}) = \phi(\mathbf{x}_0 + \mathbf{x})$, if and only if $\mathbf{A}\mathbf{x}_0 = -\mathbf{b}$.
(iii) If $\det\mathbf{A} \neq 0$, then there is a center of symmetry $\mathbf{x}_0$; letting $\mathbf{x} = \mathbf{z} + \mathbf{x}_0$, we have $\phi(\mathbf{x}) = \mathbf{z}^T\mathbf{A}\mathbf{z} + c_1 = 0$ for a suitable $c_1 \in \mathbb{R}$.
(iv) By Sylvester's law of inertia, $\mathbf{z}^T\mathbf{A}\mathbf{z} + c_1$ transforms with a suitable linear change of coordinates into
$$\sum_{i=1}^p (X^i)^2 - \sum_{i=p+1}^n (X^i)^2 + c_1.$$
(v) Suppose $\det\mathbf{A} = 0$. Since $\mathbf{A} = \mathbf{A}^T$, we have $\ker\mathbf{A} = (\mathrm{Im}\,\mathbf{A})^\perp$. Choosing a basis in which the first $k$ elements generate $\mathrm{Im}\,\mathbf{A}$ and the last $n-k$ generate $\ker\mathbf{A}$, then $\mathbf{A}$ writes as
$$\begin{pmatrix} \mathbf{A}' & 0 \\ 0 & 0 \end{pmatrix}$$
Figure 3.4. Quadrics: (a) ellipsoid: $a^2x^2 + b^2y^2 + c^2z^2 = 1$; (b) point: $a^2x^2 + b^2y^2 + c^2z^2 = 0$; (c) imaginary ellipsoid: $a^2x^2 + b^2y^2 + c^2z^2 = -1$; (d) hyperboloid of one sheet: $a^2x^2 + b^2y^2 - c^2z^2 = 1$; (e) cone: $a^2x^2 + b^2y^2 - c^2z^2 = 0$; (f) hyperboloid of two sheets: $-a^2x^2 - b^2y^2 + c^2z^2 = 1$; (g) paraboloid: $a^2x^2 + b^2y^2 - 2cz = 0$, $c > 0$; (h) saddle: $a^2x^2 - b^2y^2 - 2cz = 0$, $c > 0$; (i) elliptic cylinder: $a^2x^2 + b^2y^2 = 1$; (j) straight line: $a^2x^2 + b^2y^2 = 0$; (k) imaginary straight line: $a^2x^2 + b^2y^2 = -1$; (l) hyperbolic cylinder: $a^2x^2 - b^2y^2 = 1$; (m) nonparallel planes: $a^2x^2 - b^2y^2 = 0$; (n) parabolic cylinder: $a^2x^2 = 2cz$, $c > 0$; (o) parallel planes: $a^2x^2 = 1$; (p) plane: $a^2x^2 = 0$; (q) imaginary plane: $a^2x^2 = -1$.
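in this new basis and the quadric can be written as
$$\phi(\mathbf{x}) = (\mathbf{x}')^T\mathbf{A}'\mathbf{x}' + 2(\mathbf{b}'|\mathbf{x}') + 2(\mathbf{b}''|\mathbf{x}'') + c_2 = 0$$
where $\mathbf{x}', \mathbf{b}' \in \mathrm{Im}\,\mathbf{A}$, $\mathbf{x}'', \mathbf{b}'' \in \ker\mathbf{A}$, $\mathbf{x} = \mathbf{x}' + \mathbf{x}''$, $\mathbf{b} = \mathbf{b}' + \mathbf{b}''$ and $\det\mathbf{A}' \neq 0$. Applying the argument in (iii) to $(\mathbf{x}')^T\mathbf{A}'\mathbf{x}' + 2\,\mathbf{b}' \bullet \mathbf{x}' + c_2$, we may further transform the quadric into
$$\phi(\mathbf{x}) = (\mathbf{x}')^T\mathbf{A}'\mathbf{x}' + c_3 + 2\,\mathbf{b}'' \bullet \mathbf{x}'' = 0,$$
and, writing $y' := -2\,\mathbf{b}'' \bullet \mathbf{x}'' - c_3$, that is, by means of an affine transformation that does not change the variable $\mathbf{x}'$, we end up with $\phi(\mathbf{x}) = (\mathbf{x}')^T\mathbf{A}'\mathbf{x}' - y' = 0$.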
3.3 Exercises

3.72 ¶. Starting from specific lines or planes expressed in parametric or implicit way in $\mathbb{R}^3$, write
o the straight line through the origin perpendicular to a plane,
o the plane through the origin perpendicular to a straight line,
o the distance of a point from a straight line and from a plane,
o the distance between two straight lines,
o the perpendicular straight line to two given nonintersecting lines,
o the symmetric of a point with respect to a line and to a plane,
o the symmetric of a line with respect to a plane.

3.73 ¶. Let $X, Y$ be two Euclidean spaces with inner products, respectively, $(\ |\ )_X$ and $(\ |\ )_Y$. Show that $X \times Y$ is a Euclidean space with inner product
$$\big((x_1, x_2)\,\big|\,(y_1, y_2)\big) := (x_1|y_1)_X + (x_2|y_2)_Y, \qquad (x_1, x_2),\ (y_1, y_2) \in X \times Y.$$
Notice that $X \times \{0\}$ and $\{0\} \times Y$ are orthogonal subspaces of $X \times Y$.

3.74 ¶. Let $x, y \in \mathbb{R}^n$. Show that $x \perp y$ if and only if $|x - \alpha y| \ge |x|$ $\forall \alpha \in \mathbb{R}$.

3.75 ¶. The graph of the map $A(x) := \mathbf{A}x$, $\mathbf{A} \in M_{k,n}(\mathbb{R})$, is defined as
$$G_A := \{(x, y) \mid x \in \mathbb{R}^n,\ y \in \mathbb{R}^k,\ y = A(x)\} \subset \mathbb{R}^n \times \mathbb{R}^k.$$
Show that $G_A$ is a linear subspace of $\mathbb{R}^{n+k}$ of dimension $n$ and that it is generated by the column vectors of the $(k+n) \times n$ matrix
$$\begin{pmatrix} \mathrm{Id}_n \\ \mathbf{A} \end{pmatrix}.$$
Also show that the row vectors of the $k \times (n+k)$ matrix
$$\begin{pmatrix} \mathbf{A} & -\mathrm{Id}_k \end{pmatrix}$$
generate the orthogonal to $G_A$.
3.76 %. Write in the standard basis of R^ the matrices of the orthogonal projection on specific subspaces of dimension 2 and 3. 3.77 %, Let X be Euclidean or Hermitian and let V, W be subspaces of X. Show that
$$V^\perp \cap W^\perp = (V + W)^\perp.$$

3.78 ¶. Let $f : M_{n,n}(\mathbb{K}) \to \mathbb{K}$ be a linear map such that $f(\mathbf{A}\mathbf{B}) = f(\mathbf{B}\mathbf{A})$ $\forall \mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{K})$. Show that there is $\lambda \in \mathbb{K}$ such that $f(\mathbf{X}) = \lambda\,\mathrm{tr}\,\mathbf{X}$ for all $\mathbf{X} \in M_{n,n}(\mathbb{K})$ where $\mathrm{tr}\,\mathbf{X} := \sum_{i=1}^n x_i^i$ if $\mathbf{X} = [x_j^i]$.

3.79 ¶. Show that the bilinear form $b : M_{n,n}(\mathbb{R}) \times M_{n,n}(\mathbb{R}) \to \mathbb{R}$ given by
$$b(\mathbf{A}, \mathbf{B}) := \mathrm{tr}(\mathbf{A}^T\mathbf{B}) := \sum_{i=1}^n (\mathbf{A}^T\mathbf{B})_i^i$$
defines an inner product on the real vector space $M_{n,n}(\mathbb{R})$. Find the orthogonal of the symmetric matrices.

3.80 ¶. Given $n+1$ points $z_1, z_2, \dots, z_{n+1}$ in $\mathbb{C}$, show that there exists a unique polynomial of degree at most $n$ with prescribed values at $z_1, z_2, \dots, z_{n+1}$. [Hint: If $\mathcal{P}_n$ is the set of complex polynomials of degree at most $n$, consider the map $\phi : \mathcal{P}_n \to \mathbb{C}^{n+1}$ given by $\phi(P) := (P(z_1), P(z_2), \dots, P(z_{n+1}))$.]

3.81 ¶ Discrete integration. Let $t_1, t_2, \dots, t_n$ be $n$ points in $[a, b] \subset \mathbb{R}$. Show that there are constants $a_1, a_2, \dots, a_n$ such that
$$\int_a^b P(t)\, dt = \sum_{j=1}^n a_j P(t_j)$$
for every polynomial of degree at most $n - 1$.

3.82 ¶. Let
$$Q := [0,1]^n = \{x \in \mathbb{R}^n \mid 0 \le x_i \le 1,\ i = 1, \dots, n\}$$
be the cube of side one in $\mathbb{R}^n$. Show that its diagonal has length $\sqrt{n}$. Denote by $x_1, \dots, x_{2^n}$ the vertices of $Q$ and by $x := (1/2, 1/2, \dots, 1/2)$ the center of $Q$. Show that the balls around $x$ that do not intersect the balls $B(x_i, 1/2)$, $i = 1, \dots, 2^n$, necessarily have radius at most $R_n := (\sqrt{n} - 1)/2$. Conclude that for $n > 4$, $B(x, R_n)$ is not contained in $Q$.
Give a few explicit metrics in M^ and find the corresponding orthogonal bases.
3.84 f. Reduce a few explicit quadratic forms in R^ and R'* to their canonical form.
4. Self-Adjoint Operators
In this chapter, we deal with self-adjoint operators on a Euclidean or Hermitian space, and, more precisely, with the spectral theory for self-adjoint and normal operators. In the last section, we shall see methods and results of linear algebra at work on some specific examples and problems.
4.1 Elements of Spectral Theory
4.1.1 Self-adjoint operators
a. Self-adjoint operators
4.1 Definition. Let $X$ be a Euclidean or Hermitian space. A linear operator $A : X \to X$ is called self-adjoint if $A^* = A$.
As we have seen, if $\mathbf{A}$ is the matrix associated to $A$ in an orthonormal basis, then $\mathbf{A}^T$, respectively $\overline{\mathbf{A}}^T$, is the matrix associated to $A^*$ in the same basis, according to whether $X$ is Euclidean or Hermitian. In particular, $A$ is self-adjoint if and only if $\mathbf{A} = \mathbf{A}^T$ in the Euclidean case and $\mathbf{A} = \overline{\mathbf{A}}^T$ in the Hermitian case. Moreover, as a consequence of the alternative theorem we have
$$X = \ker A \oplus \mathrm{Im}\, A, \qquad \ker A \perp \mathrm{Im}\, A$$
if $A : X \to X$ is self-adjoint. Finally, notice that the self-adjoint operators form a real linear subspace of $\mathcal{L}(X, X)$; the composition of two self-adjoint operators is again self-adjoint only when they commute, so this subspace is not closed under composition.
Typical examples of self-adjoint operators are the orthogonal projection operators. In fact, we have the following.
4.2 Proposition. Let $X$ be a Euclidean or Hermitian space and let $P : X \to X$ be a linear operator. $P$ is the orthogonal projection onto its image if and only if $P^* = P$ and $P \circ P = P$.
Proof. This follows, for instance, from 3.32. Here we present a more direct proof. Suppose $P$ is the orthogonal projection onto its image. Then for every $y \in X$
$$(y - P(y)\,|\,z) = 0 \qquad \forall z \in \mathrm{Im}\, P.$$
Thus $y = P(y)$ if $y \in \mathrm{Im}\, P$, that is, $P(x) = P \circ P(x) = P^2(x)$ $\forall x \in X$. Moreover, for $x, y \in X$
$$0 = (x - P(x)\,|\,P(y)) = (x\,|\,P(y)) - (P(x)\,|\,P(y)),$$
$$0 = (P(x)\,|\,y - P(y)) = (P(x)\,|\,y) - (P(x)\,|\,P(y)),$$
hence $(P(x)\,|\,y) = (x\,|\,P(y))$, i.e., $P^* = P$.
Conversely, if $P^* = P$ and $P^2 = P$, we have
$$(x - P(x)\,|\,P(z)) = (P^*(x - P(x))\,|\,z) = (P(x) - P^2(x)\,|\,z) = (P(x) - P(x)\,|\,z) = 0$$
for all $x, z \in X$, i.e., $x - P(x) \perp \mathrm{Im}\, P$. □
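As a small numerical illustration of Proposition 4.2, the following sketch builds the matrix of the orthogonal projection of $\mathbb{R}^3$ onto a line and checks the two conditions $P \circ P = P$ and $P^* = P$. The helper names (`projector_onto_line`, `matmul`, `transpose`, `close`) and the sample vector are assumptions made for this example only.

```python
def projector_onto_line(v):
    """Matrix of the orthogonal projection of R^n onto Span{v}: P = v v^T / |v|^2."""
    norm2 = sum(c * c for c in v)
    return [[vi * vj / norm2 for vj in v] for vi in v]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

P = projector_onto_line([1.0, 2.0, 2.0])   # sample direction, chosen arbitrarily
print(close(matmul(P, P), P))    # idempotent: P o P = P
print(close(transpose(P), P))    # self-adjoint in the Euclidean case: P^T = P
```

Both checks are necessary for an orthogonal projection: an oblique projection still satisfies $P \circ P = P$ but fails $P^* = P$.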
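b. The spectral theorem
The following theorem, as we shall see, yields a characterization of the self-adjoint operators.
4.3 Theorem (Spectral theorem). Let $A : X \to X$ be a self-adjoint operator on the Euclidean or Hermitian space $X$. Then $X$ has an orthonormal basis made of eigenvectors of $A$.
In order to prove Theorem 4.3 let us first state the following.
4.4 Proposition. Under the hypothesis of Theorem 4.3 we have
(i) $A$ has $n$ real eigenvalues, if counted with multiplicity,
(ii) if $V \subset X$ is an invariant subspace under $A$, then $V^\perp$ is also invariant under $A$,
(iii) eigenvectors corresponding to distinct eigenvalues are orthogonal.
Proof. (i) Assume $X$ is Hermitian and let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ be the matrix associated to $A$ in an orthonormal basis. Then $\mathbf{A} = \overline{\mathbf{A}}^T$, and $\mathbf{A}$ has $n$ complex eigenvalues, if counted with multiplicity. Let $z \in \mathbb{C}^n$ be an eigenvector with eigenvalue $\lambda \in \mathbb{C}$. Then $\mathbf{A}z = \lambda z$ and $\overline{\mathbf{A}}\,\overline{z} = \overline{\mathbf{A}z} = \overline{\lambda z} = \overline{\lambda}\,\overline{z}$. Consequently, if $\mathbf{A} = (a^i_j)$ and $z = (z^1, z^2, \dots, z^n)$, we have
$$\lambda |z|^2 = \sum_{i=1}^n (\lambda z^i)\, \overline{z^i} = \sum_{i,j=1}^n a^i_j\, z^j\, \overline{z^i}, \qquad \overline{\lambda} |z|^2 = \sum_{j=1}^n z^j\, \overline{\lambda z^j} = \sum_{i,j=1}^n \overline{a^j_i}\, z^j\, \overline{z^i}.$$
Since $a^i_j = \overline{a^j_i}$ for all $i, j = 1, \dots, n$, we conclude that $(\lambda - \overline{\lambda})|z|^2 = 0$, i.e., $\lambda \in \mathbb{R}$. In the Euclidean case the same argument applies, since then $\mathbf{A}^T = \mathbf{A} = \overline{\mathbf{A}}^T$ also.
(ii) Let $w \in V^\perp$. For every $v \in V$ we have $(A(w)\,|\,v) = (w\,|\,A(v)) = 0$ since $A(v) \in V$ and $w \in V^\perp$. Thus $A(w) \perp V$.
(iii) Let $x, y$ be eigenvectors of $A$ with eigenvalues $\lambda, \mu$, respectively. Then $\lambda$ and $\mu$ are real and
$$(\lambda - \mu)(x\,|\,y) = (\lambda x\,|\,y) - (x\,|\,\mu y) = (A(x)\,|\,y) - (x\,|\,A(y)) = 0.$$
Thus $(x\,|\,y) = 0$ if $\lambda \ne \mu$. □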
Proof of Theorem 4.3. We proceed by induction on the dimension $n$ of $X$. On account of Proposition 4.4 (i), the claim trivially holds if $\dim X = 1$. Suppose the theorem has been proved for all self-adjoint operators on $H$ when $\dim H = n - 1$ and let us prove it for $A$. Because of Proposition 4.4 (i), all eigenvalues of $A$ are real, hence there exists at least one eigenvector $u_1$ of $A$ with norm one. Let $H := \mathrm{Span}\{u_1\}^\perp$ and let $B := A|_H$ be the restriction of $A$ to $H$. Because of Proposition 4.4 (ii), $B(H) \subset H$, hence $B : H \to H$ is a linear operator on $H$ (whose dimension is $n - 1$); moreover, $B$ is self-adjoint, since it is the restriction to an invariant subspace of a self-adjoint operator. Therefore, by the inductive assumption, there is an orthonormal basis $(u_2, \dots, u_n)$ of $H$ made of eigenvectors of $B$, hence of $A$. Since $u_2, \dots, u_n$ are orthogonal to $u_1$, $(u_1, u_2, \dots, u_n)$ is an orthonormal basis of $X$ made of eigenvectors of $A$. □
The next proposition expresses the existence of an orthonormal basis of eigenvectors in several different ways, see Theorem 2.45. We leave its simple proof to the reader.
4.5 Proposition. Let $A : X \to X$ be a linear operator on a Euclidean or Hermitian space $X$ of dimension $n$. Let $(u_1, u_2, \dots, u_n)$ be a basis of $X$ and let $\lambda_1, \lambda_2, \dots, \lambda_n$ be real numbers. The following claims are equivalent:
(i) $(u_1, u_2, \dots, u_n)$ is an orthonormal basis of $X$ and each $u_j$ is an eigenvector of $A$ with eigenvalue $\lambda_j$, i.e.,
$$A(u_j) = \lambda_j u_j, \qquad (u_i\,|\,u_j) = \delta_{ij} \qquad \forall i, j = 1, \dots, n,$$
(ii) $(u_1, u_2, \dots, u_n)$ is an orthonormal basis and
$$A(x) = \sum_{j=1}^n \lambda_j (x\,|\,u_j)\, u_j \qquad \forall x \in X,$$
(iii) $(u_1, u_2, \dots, u_n)$ is an orthonormal basis and for all $x, y \in X$
$$(A(x)\,|\,y) = \begin{cases} \sum_{j=1}^n \lambda_j (x\,|\,u_j)(y\,|\,u_j) & \text{if } X \text{ is Euclidean}, \\ \sum_{j=1}^n \lambda_j (x\,|\,u_j)\overline{(y\,|\,u_j)} & \text{if } X \text{ is Hermitian}. \end{cases}$$
Moreover, we have the following, compare with Theorem 2.45.
4.6 Proposition. Let $A : X \to X$ be a self-adjoint operator in a Euclidean or Hermitian space $X$ of dimension $n$ and let $\mathbf{A} \in M_{n,n}(\mathbb{K})$ be the matrix associated to $A$ in a given orthonormal basis. Then $\mathbf{A}$ is similar to a diagonal matrix. More precisely, let $(u_1, u_2, \dots, u_n)$ be an orthonormal basis of $X$ of eigenvectors of $A$, let $\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{R}$ be the corresponding eigenvalues and let $\mathbf{S} \in M_{n,n}(\mathbb{K})$ be the matrix that has the $n$-tuple of components of $u_i$ in the given orthonormal basis as the $i$th column,
$$\mathbf{S} := \big[\, u_1 \mid u_2 \mid \dots \mid u_n \,\big].$$
Then
$$\mathbf{S}^T\mathbf{S} = \mathrm{Id} \quad\text{and}\quad \mathbf{S}^T\mathbf{A}\mathbf{S} = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$$
if $X$ is Euclidean, and
$$\overline{\mathbf{S}}^T\mathbf{S} = \mathrm{Id} \quad\text{and}\quad \overline{\mathbf{S}}^T\mathbf{A}\mathbf{S} = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$$
if $X$ is Hermitian.
Proof. Since the columns of $\mathbf{S}$ are orthonormal, it follows that $\mathbf{S}^T\mathbf{S} = \mathrm{Id}$ if $X$ is Euclidean or $\overline{\mathbf{S}}^T\mathbf{S} = \mathrm{Id}$ if $X$ is Hermitian. The rest of the proof is contained in Theorem 2.45. □
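The statement of Proposition 4.6 can be checked by hand in dimension 2. The sketch below is only an illustration under stated assumptions: it handles real symmetric $2 \times 2$ matrices, uses the closed-form eigenvalues of such a matrix, and the names `eig_sym2`, `matmul` and `transpose` are hypothetical helpers, not part of the text.

```python
import math

def eig_sym2(a, b, d):
    """Eigenvalues (nondecreasing) and unit eigenvectors of [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lams = [mean - radius, mean + radius]
    if b == 0.0:
        # Matrix already diagonal: the eigenvectors are the standard basis vectors.
        vecs = [(1.0, 0.0), (0.0, 1.0)] if a <= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vecs = []
        for lam in lams:
            vx, vy = lam - d, b              # solves (A - lam Id) v = 0
            norm = math.hypot(vx, vy)
            vecs.append((vx / norm, vy / norm))
    return lams, vecs

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[2.0, 1.0], [1.0, 2.0]]                  # sample symmetric matrix
lams, (u1, u2) = eig_sym2(A[0][0], A[0][1], A[1][1])
S = [[u1[0], u2[0]], [u1[1], u2[1]]]          # eigenvectors as columns
D = matmul(transpose(S), matmul(A, S))        # S^T A S, expected to be diagonal
I = matmul(transpose(S), S)                   # S^T S, expected to be the identity
print(lams)
print(D)
```

Up to rounding, `D` is the diagonal matrix of the eigenvalues and `I` is the identity, which is exactly the pair of identities in the Euclidean case of the proposition.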
c. Spectral resolution
Let $A : X \to X$ be a self-adjoint operator on a Euclidean or Hermitian space $X$ of dimension $n$, let $(u_1, u_2, \dots, u_n)$ be an orthonormal basis of eigenvectors of $A$, let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of $A$ and $V_1, V_2, \dots, V_k$ the corresponding eigenspaces. Let $P_i : X \to V_i$ be the projector on $V_i$, so that
$$P_i(x) = \sum_{u_j \in V_i} (x\,|\,u_j)\, u_j$$
and, by Proposition 4.5 (ii),
$$A(x) = \sum_{i=1}^k \lambda_i P_i(x).$$
As we have seen, by Proposition 4.4 (iii), we have $V_i \perp V_j$ if $i \ne j$ and, by the spectral theorem, $\sum_{i=1}^k \dim V_i = n$. In other words, we can say that $\{V_i\}_i$ is a decomposition of $X$ in orthogonal subspaces or state the following.
4.7 Theorem. Let $A : X \to X$ be self-adjoint on a Euclidean or Hermitian space $X$ of dimension $n$. Then there exists a unique family of projectors $P_1, P_2, \dots, P_k$ and distinct real numbers $\lambda_1, \lambda_2, \dots, \lambda_k$ such that
$$P_i \circ P_j = \delta_{ij} P_j, \qquad \sum_{i=1}^k P_i = \mathrm{Id} \qquad\text{and}\qquad A = \sum_{i=1}^k \lambda_i P_i.$$
Finally, we can easily complete the spectral theorem as follows.
4.8 Proposition. Let $X$ be a Euclidean or Hermitian space. A linear operator $A : X \to X$ is self-adjoint if and only if the eigenvalues of $A$ are real and there exists an orthonormal basis of $X$ made of eigenvectors of $A$.
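d. Quadratic forms
To a self-adjoint operator $A : X \to X$ we may associate the form $a : X \times X \to \mathbb{K}$,
$$a(x, y) := (A(x)\,|\,y), \qquad x, y \in X,$$
which is bilinear and symmetric if $X$ is Euclidean and sesquilinear with $a(x, y) = \overline{a(y, x)}$ if $X$ is Hermitian.
4.9 Theorem. Let $A : X \to X$ be a self-adjoint operator, $(e_1, e_2, \dots, e_n)$ an orthonormal basis of $X$ of eigenvectors of $A$ and $\lambda_1, \lambda_2, \dots, \lambda_n$ the corresponding eigenvalues. Then
$$(A(x)\,|\,x) = \sum_{i=1}^n \lambda_i\, |(x\,|\,e_i)|^2 \qquad \forall x \in X. \tag{4.1}$$
In particular, if $\lambda_{\min}$ and $\lambda_{\max}$ are, respectively, the smallest and largest eigenvalues of $A$, then
$$\lambda_{\min} |x|^2 \le (A(x)\,|\,x) \le \lambda_{\max} |x|^2 \qquad \forall x \in X.$$
Moreover, we have $(A(x)\,|\,x) = \lambda_{\min} |x|^2$ (resp. $(A(x)\,|\,x) = \lambda_{\max} |x|^2$) if and only if $x$ is an eigenvector with eigenvalue $\lambda_{\min}$ (resp. $\lambda_{\max}$).
Proof. Proposition 4.5 yields (4.1), hence
$$\lambda_{\min} \sum_{i=1}^n |(x\,|\,e_i)|^2 \le (A(x)\,|\,x) \le \lambda_{\max} \sum_{i=1}^n |(x\,|\,e_i)|^2$$
and, since $|x|^2 = \sum_{i=1}^n |(x\,|\,e_i)|^2$ $\forall x \in X$, the first part of the claim is proved. Let us prove that $(A(x)\,|\,x) = \lambda_{\min} |x|^2$ if and only if $x$ is an eigenvector with eigenvalue $\lambda_{\min}$. If $x$ is an eigenvector of $A$ with eigenvalue $\lambda_{\min}$, then $A(x) = \lambda_{\min} x$, hence $(A(x)\,|\,x) = (\lambda_{\min} x\,|\,x) = \lambda_{\min} |x|^2$. Conversely, suppose $(e_1, e_2, \dots, e_n)$ is an orthonormal basis of $X$ made of eigenvectors of $A$ and the eigenspace $V_{\lambda_{\min}}$ is spanned by $(e_1, e_2, \dots, e_k)$. From $(A(x)\,|\,x) = \lambda_{\min} |x|^2$ we infer that
$$0 = (A(x)\,|\,x) - \lambda_{\min} |x|^2 = \sum_{i=1}^n (\lambda_i - \lambda_{\min})\, |(x\,|\,e_i)|^2$$
and, as $\lambda_i - \lambda_{\min} \ge 0$ with strict inequality for $i > k$, we get that $(x\,|\,e_i) = 0$ $\forall i > k$, thus $x \in V_{\lambda_{\min}}$. We proceed similarly for $\lambda_{\max}$. □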
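The bounds of Theorem 4.9 are easy to observe numerically. The following sketch, written under the assumption of a fixed $2 \times 2$ symmetric sample matrix with eigenvalues 1 and 3, samples the quadratic form $(A(x)\,|\,x)$ on the unit circle; `quad_form` and the sampling step are choices made for this illustration only.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]   # sample symmetric matrix, eigenvalues 1 and 3

def quad_form(A, x):
    """Value of (Ax | x) for a real matrix A and a real vector x."""
    Ax = [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]
    return sum(Ax[i] * x[i] for i in range(len(x)))

# Unit vectors x = (cos t, sin t) at one-degree steps.
values = [quad_form(A, (math.cos(t), math.sin(t)))
          for t in (2.0 * math.pi * k / 360 for k in range(360))]

print(min(values), max(values))   # close to the smallest and largest eigenvalue
```

Here $(A(x)\,|\,x) = 2 + \sin 2t$, so the sampled values stay between the two eigenvalues and reach them in the directions $(1, -1)/\sqrt{2}$ and $(1, 1)/\sqrt{2}$, which are the corresponding eigenvectors.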
All eigenvalues can, in fact, be characterized as in Theorem 4.9. Let us order the eigenvalues, counted with their multiplicity, as
$$\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$$
and let $(e_1, e_2, \dots, e_n)$ be an orthonormal basis of corresponding eigenvectors, $A(e_i) = \lambda_i e_i$ $\forall i = 1, \dots, n$; finally, set
$$V_k := \mathrm{Span}\{e_1, e_2, \dots, e_k\}, \qquad W_k := \mathrm{Span}\{e_k, e_{k+1}, \dots, e_n\}.$$
Since $V_k$, $W_k$ are invariant subspaces under $A$ and $V_k^\perp = W_{k+1}$, by applying Theorem 4.9 to the restriction of $(A(x)\,|\,x)$ to $V_k$ and $W_k$, we find
$$\begin{aligned} \lambda_1 &= \min_{|x| = 1} (A(x)\,|\,x), \\ \lambda_k &= \max\{(A(x)\,|\,x) \mid |x| = 1,\ x \in V_k\} = \min\{(A(x)\,|\,x) \mid |x| = 1,\ x \in W_k\} \quad\text{if } k = 2, \dots, n - 1, \\ \lambda_n &= \max_{|x| = 1} (A(x)\,|\,x). \end{aligned} \tag{4.2}$$
Moreover, if $S$ is a subspace of dimension $n - k + 1$, we have $S \cap V_k \ne \{0\}$; then there is $x_0 \in S \cap V_k$ with $|x_0| = 1$, thus
$$\min\{(A(x)\,|\,x) \mid |x| = 1,\ x \in S\} \le (A(x_0)\,|\,x_0) \le \max\{(A(x)\,|\,x) \mid |x| = 1,\ x \in V_k\} = \lambda_k.$$
Since $\dim W_k = n - k + 1$ and $\min\{(A(x)\,|\,x) \mid |x| = 1,\ x \in W_k\} = \lambda_k$, we conclude with the min-max characterization of eigenvalues that makes no reference to eigenvectors.
4.10 Proposition (Courant). Let $A$ be a self-adjoint operator on a Euclidean or Hermitian space $X$ of dimension $n$ and let $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ be the eigenvalues of $A$ in nondecreasing order and counted with multiplicity. Then
$$\lambda_k = \max_{\dim S = n - k + 1} \min\{(A(x)\,|\,x) \mid |x| = 1,\ x \in S\} = \min_{\dim S = k} \max\{(A(x)\,|\,x) \mid |x| = 1,\ x \in S\}.$$
4.11 A variational algorithm for the eigenvectors. From (4.2) we know that
$$\lambda_k := \min\{(A(x)\,|\,x) \mid |x| = 1,\ x \in V_{k-1}^\perp\}, \qquad k = 1, \dots, n, \tag{4.3}$$
where $V_0 = \{0\}$. This yields an iterative procedure to compute the eigenvalues of $A$. For $j = 1$ define
$$\lambda_1 = \min_{|x| = 1} (A(x)\,|\,x),$$
and for $j = 1, \dots, n - 1$ set
$$V_j := \text{eigenspace of } \lambda_j, \qquad W_j := (V_1 \oplus V_2 \oplus \dots \oplus V_j)^\perp, \qquad \lambda_{j+1} := \min\{(A(x)\,|\,x) \mid |x| = 1,\ x \in W_j\}.$$
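To see the iterative procedure of 4.11 at work, here is a worked example in $\mathbb{R}^2$; the matrix is chosen only for illustration and does not come from the text.

```latex
% Sample matrix (an assumption of this example):
%   A = [[2, 1], [1, 2]],  unit vectors x = (cos t, sin t).
\begin{align*}
(Ax\,|\,x) &= 2\cos^2 t + 2\sin t\cos t + 2\sin^2 t = 2 + \sin 2t, \\
\lambda_1 &= \min_{|x|=1}(Ax\,|\,x) = 1, \quad\text{attained at } t = \tfrac{3\pi}{4},
  \text{ i.e. } x = \pm\tfrac{1}{\sqrt 2}(1,-1), \\
V_1 &= \operatorname{Span}\{(1,-1)\}, \qquad
W_1 = V_1^{\perp} = \operatorname{Span}\{(1,1)\}, \\
\lambda_2 &= \min\{(Ax\,|\,x) \mid |x| = 1,\ x \in W_1\}
  = \Big(A\tfrac{(1,1)}{\sqrt 2}\,\Big|\,\tfrac{(1,1)}{\sqrt 2}\Big) = 3 .
\end{align*}
```

The two minimizers $(1,-1)/\sqrt{2}$ and $(1,1)/\sqrt{2}$ form an orthonormal basis of eigenvectors, as the spectral theorem predicts.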
Notice that such an algorithm yields an alternative proof of the spectral theorem. We shall see in Chapter 10 that this procedure extends to certain classes of self-adjoint operators in infinite-dimensional spaces. Finally, notice that Sylvester's theorem, Gram-Schmidt's procedure or the other variants for reducing a quadratic form to a canonical form, see Chapter 3, allow us to find the numbers of positive, negative and null eigenvalues (with multiplicity) without computing them explicitly.
e. Positive operators
A self-adjoint operator $A : X \to X$ is called positive (resp. nonnegative) if the quadratic form $\phi(x) := (Ax\,|\,x)$ is positive for $x \ne 0$ (resp. nonnegative). From the results about metrics, see Corollary 3.56, or directly from Theorem 4.9, we have the following.
4.12 Proposition. Let $A : X \to X$ be self-adjoint. $A$ is positive (nonnegative) if and only if all eigenvalues of $A$ are positive (nonnegative), or iff there is $\lambda > 0$ ($\lambda \ge 0$) such that $(Ax\,|\,x) \ge \lambda |x|^2$ $\forall x \in X$.
4.13 Corollary. $A : X \to X$ is positive self-adjoint if and only if $a(x, y) = (A(x)\,|\,y)$ is an inner (Hermitian) product on $X$.
4.14 Proposition (Simultaneous diagonalization). Let $A, M : X \to X$ be linear self-adjoint operators on $X$. Suppose $M$ is positive. Then there exists a basis $(e_1, e_2, \dots, e_n)$ of $X$ and real numbers $\lambda_1, \lambda_2, \dots, \lambda_n$ such that
$$(M(e_i)\,|\,e_j) = \delta_{ij}, \qquad A(e_j) = \lambda_j M e_j \qquad \forall i, j = 1, \dots, n. \tag{4.4}$$
Proof. The metric $g(x, y) := (M(x)\,|\,y)$ is a scalar (Hermitian) product on $X$ and the linear operator $M^{-1}A : X \to X$ is self-adjoint with respect to $g$ since
$$g(M^{-1}A(x), y) = (MM^{-1}A(x)\,|\,y) = (A(x)\,|\,y) = (x\,|\,A(y)) = (x\,|\,MM^{-1}A(y)) = (Mx\,|\,M^{-1}A(y)) = g(x, M^{-1}A(y)).$$
Therefore, $M^{-1}A$ has real eigenvalues and, by the spectral theorem, there is a $g$-orthonormal basis of $X$ made of eigenvectors of $M^{-1}A$,
$$g(e_i, e_j) = (M(e_i)\,|\,e_j) = \delta_{ij}, \qquad M^{-1}A(e_j) = \lambda_j e_j \qquad \forall i, j = 1, \dots, n.$$
□
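4.15 Remark. We cannot drop the positivity assumption in Proposition 4.14. For instance, there are symmetric matrices $\mathbf{A}$ and $\mathbf{M}$, with $\mathbf{M}$ invertible but not positive, for which we have $\det(\lambda\, \mathrm{Id} - \mathbf{M}^{-1}\mathbf{A}) = \lambda^2 + 1$, hence $\mathbf{M}^{-1}\mathbf{A}$ has no real eigenvalue.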
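One pair of matrices consistent with the determinant stated in Remark 4.15 is the following. This pair is an assumption made here for illustration; it is not claimed to be the example originally printed in the remark.

```latex
% Hypothetical instance: M is symmetric and invertible but not positive.
\[
\mathbf{M} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
\mathbf{A} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\mathbf{M}^{-1}\mathbf{A} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\]
\[
\det(\lambda\,\mathrm{Id} - \mathbf{M}^{-1}\mathbf{A})
= \det\begin{pmatrix} \lambda & -1 \\ 1 & \lambda \end{pmatrix}
= \lambda^2 + 1 .
\]
```

Both matrices are symmetric, yet $\mathbf{M}^{-1}\mathbf{A}$ is a rotation by a right angle and has no real eigenvector, so no basis as in (4.4) can exist.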
4.16 ¶. Show the following.
Proposition. Let $X$ be a Euclidean space and let $g, b : X \times X \to \mathbb{R}$ be two metrics on $X$. Suppose $g$ is positive. Then there exists a basis of $X$ that is both $g$-orthogonal and $b$-orthogonal.
4.17 ¶. Let $A, M$ be linear self-adjoint operators and let $M$ be positive. Then $M^{-1}A$ is self-adjoint with respect to the inner product $g(x, y) := (M(x)\,|\,y)$. Show that the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of $M^{-1}A$ are iteratively given by
$$\lambda_1 = \min_{g(x,x) = 1} g(M^{-1}A(x), x) = \min_{x \ne 0} \frac{(A(x)\,|\,x)}{(M(x)\,|\,x)}$$
and, for $j = 1, \dots, n - 1$,
$$\begin{cases} V_j := \text{eigenspace of } M^{-1}A \text{ relative to } \lambda_j, \\ W_j := (V_1 \oplus V_2 \oplus \dots \oplus V_j)^\perp, \\ \lambda_{j+1} := \min\{(A(x)\,|\,x) \mid (M(x)\,|\,x) = 1,\ x \in W_j\}, \end{cases}$$
where $V^\perp$ denotes the orthogonal to $V$ with respect to the inner product $g$.
4.18 ¶. Show the following.
Proposition. Let $T$ be a linear operator on $\mathbb{K}^n$. If $T + T^*$ is positive (nonnegative), then all eigenvalues of $T$ have positive (nonnegative) real part.
f. The operators $A^*A$ and $AA^*$
Let $A : X \to Y$ be a linear operator between $X$ and $Y$ that we assume are either both Euclidean or both Hermitian. From now on we shall write $Ax$ instead of $A(x)$ for the sake of simplicity. As usual, $A^* : Y \to X$ denotes the adjoint of $A$.
4.19 Proposition. The operator $A^*A : X \to X$ is
(i) self-adjoint,
(ii) nonnegative,
(iii) $Ax$, $A^*Ax$ and $(A^*Ax\,|\,x)$ are all nonzero or all zero; in particular, $A^*A$ is positive if and only if $A$ is injective,
(iv) if $u_1, u_2, \dots, u_n$ are eigenvectors of $A^*A$ with eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$, respectively, then
$$(Au_i\,|\,Au_j) = \lambda_i (u_i\,|\,u_j);$$
in particular, if $u_1, u_2, \dots, u_n$ are orthogonal to each other, then $Au_1, \dots, Au_n$ are orthogonal to each other as well.
Proof. (i) In fact, $(A^*A)^* = A^*A^{**} = A^*A$.
(ii) and (iii) If $Ax = 0$, then trivially $A^*Ax = 0$, and if $A^*Ax = 0$, then $(A^*Ax\,|\,x) = 0$. On the other hand, $(A^*Ax\,|\,x) = (Ax\,|\,Ax) = |Ax|^2$, hence $(A^*Ax\,|\,x) \ge 0$ and $Ax = 0$ if $(A^*Ax\,|\,x) = 0$.
(iv) In fact, $(Au_i\,|\,Au_j) = (A^*Au_i\,|\,u_j) = \lambda_i (u_i\,|\,u_j)$, which equals $\lambda_i |u_i|^2 \delta_{ij}$ when the $u_i$ are orthogonal to each other. □
4.20 Proposition. The operator $AA^* : Y \to Y$ is
(i) self-adjoint,
(ii) nonnegative,
(iii) $A^*x$, $AA^*x$ and $(AA^*x\,|\,x)$ are either all nonzero or all zero; in particular, $AA^*$ is positive if and only if $\ker A^* = \{0\}$, equivalently if and only if $A$ is surjective,
(iv) if $u_1, u_2, \dots, u_n$ are eigenvectors of $AA^*$ with eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$, respectively, then $(A^*u_i\,|\,A^*u_j) = \lambda_i (u_i\,|\,u_j)$; in particular, if $u_1, u_2, \dots, u_n$ are orthogonal to each other, then $A^*u_1, \dots, A^*u_n$ are orthogonal to each other as well.
Moreover, $AA^*$ and $A^*A$ have the same nonzero eigenvalues and
$$\mathrm{Rank}\, AA^* = \mathrm{Rank}\, A^*A = \mathrm{Rank}\, A = \mathrm{Rank}\, A^*.$$
In particular, $\mathrm{Rank}\, AA^* = \mathrm{Rank}\, A^*A \le \min(\dim X, \dim Y)$.
Proof. The claims (i), (ii), (iii) and (iv) are proved as in Proposition 4.19. To prove that $A^*A$ and $AA^*$ have the same nonzero eigenvalues, notice that if $x \in X$, $x \ne 0$, is an eigenvector for $A^*A$ with eigenvalue $\lambda \ne 0$, $A^*Ax = \lambda x$, then $Ax \ne 0$ by Proposition 4.19 (iii) and $AA^*(Ax) = A(A^*Ax) = A(\lambda x) = \lambda Ax$, i.e., $Ax$ is a nonzero eigenvector for $AA^*$ with the same eigenvalue $\lambda$. Similarly, one proves that if $y \ne 0$ is an eigenvector for $AA^*$ with eigenvalue $\lambda \ne 0$, then by (iii) $A^*y \ne 0$ and $A^*y$ is an eigenvector for $A^*A$ with eigenvalue $\lambda$. Finally, from the alternative theorem, we have
$$\mathrm{Rank}\, AA^* = \mathrm{Rank}\, A^* = \mathrm{Rank}\, A = \mathrm{Rank}\, A^*A.$$
□
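g. Powers of a self-adjoint operator
Let $A : X \to X$ be self-adjoint. By the spectral theorem, there is an orthonormal basis $(e_1, e_2, \dots, e_n)$ of $X$ and real numbers $\lambda_1, \lambda_2, \dots, \lambda_n$ such that
$$Ax = \sum_{j=1}^n \lambda_j (x\,|\,e_j)\, e_j \qquad \forall x \in X.$$
By induction, one easily computes, using the eigenvectors $e_1, e_2, \dots, e_n$ and the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of $A$, the $k$-th power of $A$, $A^k := A \circ \dots \circ A$ ($k$ times), $\forall k \ge 2$, as
$$A^k x = \sum_{i=1}^n (\lambda_i)^k (x\,|\,e_i)\, e_i \qquad \forall x \in X. \tag{4.5}$$
4.21 Proposition. Let $A : X \to X$ be self-adjoint and $k \ge 1$. Then
(i) $A^k$ is self-adjoint,
(ii) if $\lambda$ is an eigenvalue for $A$, then $\lambda^k$ is an eigenvalue for $A^k$, and every eigenvalue of $A^k$ is of the form $\lambda^k$ with $\lambda$ an eigenvalue of $A$,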
(iii) if $x \in X$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $x$ is an eigenvector for $A^k$ with eigenvalue $\lambda^k$; the converse holds if $k$ is odd or if $A$ is nonnegative, and in these cases the eigenspaces of $A$ relative to $\lambda$ and of $A^k$ relative to $\lambda^k$ coincide,
(iv) if $A$ is invertible, equivalently, if all eigenvalues of $A$ are nonzero, then
$$A^{-1}x = \sum_{i=1}^n \frac{1}{\lambda_i} (x\,|\,e_i)\, e_i \qquad \forall x \in X.$$
4.22 ¶. Let $A : X \to X$ be self-adjoint. Show that
if $p(t) = \sum_{k=1}^m a_k t^k$ is a polynomial of degree $m$, then (4.5) yields
$$p(A)(x) = \sum_{k=1}^m a_k A^k(x) = \sum_{k=1}^m \sum_{j=1}^n a_k \lambda_j^k (x\,|\,e_j)\, e_j = \sum_{j=1}^n p(\lambda_j)(x\,|\,e_j)\, e_j. \tag{4.6}$$
4.23 Proposition. Let $A : X \to X$ be a nonnegative self-adjoint operator and let $k \in \mathbb{N}$, $k \ge 1$. There exists a unique nonnegative self-adjoint operator $B : X \to X$ such that $B^{2k} = A$. Moreover, $B$ is positive if $A$ is positive.
The operator $B$ such that $B^{2k} = A$ is called the $2k$-th root of $A$ and is denoted by $\sqrt[2k]{A}$.
Proof. If $A(x) = \sum_{j=1}^n \lambda_j (x\,|\,e_j)\, e_j$, (4.5) yields $B^{2k} = A$ for
$$B(x) := \sum_{j=1}^n \sqrt[2k]{\lambda_j}\, (x\,|\,e_j)\, e_j.$$
Uniqueness remains to be shown. Suppose $B$ and $C$ are self-adjoint, nonnegative and such that $A = B^{2k} = C^{2k}$. Then $B$ and $C$ have the same eigenvalues and the same eigenspaces by Proposition 4.21, hence $B = C$. □
In particular, if $A : X \to X$ is nonnegative and self-adjoint, the operator square root of $A$ is defined by
$$\sqrt{A}(x) := \sum_{j=1}^n \sqrt{\lambda_j}\, (x\,|\,e_j)\, e_j, \qquad x \in X,$$
if $A$ has the spectral decomposition $Ax = \sum_{j=1}^n \lambda_j (x\,|\,e_j)\, e_j$.
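The definition of the square root through the spectral decomposition can be tried out on a small matrix. The sketch below is restricted, by assumption, to real symmetric nonnegative $2 \times 2$ matrices; `eig_sym2`, `sqrt_operator` and `matmul` are names invented for this example.

```python
import math

def eig_sym2(a, b, d):
    """Eigenvalues (nondecreasing) and unit eigenvectors of [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lams = [mean - radius, mean + radius]
    if b == 0.0:
        vecs = [(1.0, 0.0), (0.0, 1.0)] if a <= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vecs = []
        for lam in lams:
            vx, vy = lam - d, b              # solves (A - lam Id) v = 0
            norm = math.hypot(vx, vy)
            vecs.append((vx / norm, vy / norm))
    return lams, vecs

def sqrt_operator(a, b, d):
    """Matrix of x -> sum_j sqrt(lam_j) (x|u_j) u_j for A = [[a, b], [b, d]] >= 0."""
    lams, vecs = eig_sym2(a, b, d)
    B = [[0.0, 0.0], [0.0, 0.0]]
    for lam, u in zip(lams, vecs):
        root = math.sqrt(max(lam, 0.0))      # clamp tiny negative round-off
        for i in range(2):
            for j in range(2):
                B[i][j] += root * u[i] * u[j]
    return B

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2.0, 1.0], [1.0, 2.0]]                 # sample positive matrix
B = sqrt_operator(A[0][0], A[0][1], A[1][1])
B2 = matmul(B, B)                            # should reproduce A up to rounding
print(B)
print(B2)
```

The matrix `B` is symmetric with eigenvalues $1$ and $\sqrt{3}$, and squaring it recovers `A`, in agreement with Proposition 4.23 for $k = 1$.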
4.24 ¶. Prove Proposition 4.14 by noticing that, if $A$ and $M$ are self-adjoint and $M$ is positive, then $M^{-1/2}AM^{-1/2} : X \to X$ is well defined and self-adjoint.
4.25 ¶. Let $A, B$ be self-adjoint and let $A$ be positive. Show that $B$ is positive if $S := AB + BA$ is positive. [Hint: Consider $A^{-1/2}BA^{-1/2}$ and apply Exercise 4.18.]
4.1.2 Normal operators
a. Simultaneous spectral decompositions
4.26 Theorem. Let $X$ be a Euclidean or Hermitian space. If $A$ and $B$ are two self-adjoint operators on $X$ that commute,
$$A = A^*, \qquad B = B^*, \qquad AB = BA,$$
then there exists an orthonormal basis $(e_1, e_2, \dots, e_n)$ of $X$ of eigenvectors of $A$ and $B$, hence
$$z = \sum_{i=1}^n (z\,|\,e_i)\, e_i, \qquad Az = \sum_{i=1}^n \lambda_i (z\,|\,e_i)\, e_i, \qquad Bz = \sum_{i=1}^n \mu_i (z\,|\,e_i)\, e_i,$$
$\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{R}$ and $\mu_1, \mu_2, \dots, \mu_n \in \mathbb{R}$ being the eigenvalues of $A$ and $B$, respectively.
This is proved by induction as in Theorem 4.3 on account of the following.
4.27 Proposition. Under the hypotheses of Theorem 4.26, we have
(i) $A$ and $B$ have a common nonzero eigenvector,
(ii) if $V$ is invariant under $A$ and $B$, then $V^\perp$ is invariant under $A$ and $B$ as well.
Proof. (i) Let $\lambda$ be an eigenvalue of $A$ and let $V_\lambda$ be the corresponding eigenspace. For all $y \in V_\lambda$ we have $ABy = BAy = \lambda By$, i.e., $By \in V_\lambda$. Thus $V_\lambda$ is invariant under $B$; consequently, there is an eigenvector $w \in V_\lambda$ of $B|_{V_\lambda}$, i.e., an eigenvector common to $A$ and $B$.
(ii) For every $w \in V^\perp$ and $z \in V$, we have $Az, Bz \in V$ and $(Aw\,|\,z) = (w\,|\,Az) = 0$, $(Bw\,|\,z) = (w\,|\,Bz) = 0$, i.e., $Aw, Bw \in V^\perp$. □
4.28 ¶. Show that two symmetric matrices $\mathbf{A}, \mathbf{B}$ are simultaneously diagonalizable if and only if they commute, $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A}$.
b. Normal operators on Hermitian spaces
A linear operator $N$ on a Euclidean or Hermitian space $X$ is called normal if $NN^* = N^*N$. Of course, if we fix an orthonormal basis in $X$, we may represent $N$ with an $n \times n$ matrix $\mathbf{N} \in M_{n,n}(\mathbb{K})$, and $N$ is normal if and only if $\mathbf{N}\overline{\mathbf{N}}^T = \overline{\mathbf{N}}^T\mathbf{N}$ if $X$ is Hermitian or $\mathbf{N}\mathbf{N}^T = \mathbf{N}^T\mathbf{N}$ if $X$ is Euclidean. The class of normal operators, though not trivial from the algebraic point of view (it is not closed for the operations of sum and composition), is interesting as it contains several families of important operators as subclasses. For instance, self-adjoint operators, $N = N^*$, anti-self-adjoint operators, $N^* = -N$, and isometric operators, $N^*N = \mathrm{Id}$, are normal operators. Moreover, normal operators in a Hermitian space are exactly the ones that are diagonalizable in an orthonormal basis. In fact, we have the following.
4.29 Theorem (Spectral theorem). Let $X$ be a Hermitian space of dimension $n$ and let $N : X \to X$ be a linear operator. Then $N$ is normal if and only if there exists an orthonormal basis of $X$ made of eigenvectors of $N$.
Proof. Let $(e_1, e_2, \dots, e_n)$ be an orthonormal basis of $X$ made of eigenvectors of $N$, with eigenvalues $\lambda_1, \dots, \lambda_n$. Then for every $z \in X$
$$Nz = \sum_{j=1}^n \lambda_j (z\,|\,e_j)\, e_j, \qquad N^*z = \sum_{j=1}^n \overline{\lambda_j} (z\,|\,e_j)\, e_j,$$
hence $NN^*z = \sum_{j=1}^n |\lambda_j|^2 (z\,|\,e_j)\, e_j = N^*Nz$. Conversely, let
$$A := \frac{N + N^*}{2}, \qquad B := \frac{N - N^*}{2i}.$$
It is easily seen that $A$ and $B$ are self-adjoint and, since $N$ is normal, commute. Theorem 4.26 then yields a basis of orthonormal eigenvectors of $A$ and $B$ and therefore of eigenvectors of $N = A + iB$ and $N^* = A - iB$. □
4.30 ¶. Show that $N : \mathbb{C}^n \to \mathbb{C}^n$ is normal if and only if $N$ and $N^*$ have the same eigenspaces.
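c. Normal operators on Euclidean spaces
Let us translate the information contained in the spectral theorem for normal operators on Hermitian spaces into information about normal operators on Euclidean spaces. In order to do that, let us first make a few remarks.
As usual, in $\mathbb{C}^n$ we write $z = x + iy$, $x, y \in \mathbb{R}^n$, for $z = (x_1 + iy_1, \dots, x_n + iy_n)$. If $W$ is a subspace of $\mathbb{R}^n$, the subspace of $\mathbb{C}^n$
$$W \oplus iW := \{z \in \mathbb{C}^n \mid z = x + iy,\ x, y \in W\}$$
is called the complexified of $W$. Trivially, $\dim_{\mathbb{C}}(W \oplus iW) = \dim_{\mathbb{R}} W$. Also, if $V$ is a subspace of $\mathbb{C}^n$, set
$$\overline{V} := \{z \in \mathbb{C}^n \mid \overline{z} \in V\}.$$
4.31 Lemma. A subspace $V \subset \mathbb{C}^n$ is the complexified of a real subspace $W$ if and only if $V = \overline{V}$.
Proof. If $V = W \oplus iW$, trivially $V = \overline{V}$. Conversely, if $z \in V$ is such that $\overline{z} \in V$, the vectors
$$x := \frac{z + \overline{z}}{2}, \qquad y := \frac{z - \overline{z}}{2i} = \frac{(z/i) + \overline{(z/i)}}{2}$$
have real coordinates. Set
$$W := \Big\{x \in \mathbb{R}^n \,\Big|\, x = \frac{z + \overline{z}}{2},\ z \in V\Big\};$$
then it is easily seen that $V = W \oplus iW$ if $V = \overline{V}$. □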
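For $N : \mathbb{R}^n \to \mathbb{R}^n$ we define its complexified as the (complex) linear operator $N_{\mathbb{C}} : \mathbb{C}^n \to \mathbb{C}^n$ defined by $N_{\mathbb{C}}(z) := Nx + iNy$ if $z = x + iy$. Then we easily see that
(i) $\lambda$ is an eigenvalue of $N$ if and only if $\lambda$ is an eigenvalue of $N_{\mathbb{C}}$,
(ii) $N$ is, respectively, a self-adjoint, anti-self-adjoint, isometric or normal operator if and only if $N_{\mathbb{C}}$ is, respectively, a self-adjoint, anti-self-adjoint, isometric or normal operator on $\mathbb{C}^n$,
(iii) the eigenvalues of $N$ are either real or pairwise complex conjugate; in the latter case the conjugate eigenvalues $\lambda$ and $\overline{\lambda}$ have the same multiplicity.
4.32 Proposition. Let $N : \mathbb{R}^n \to \mathbb{R}^n$ be a normal operator. Every real eigenvalue $\lambda$ of $N$ of multiplicity $k$ has an eigenspace $V_\lambda$ of dimension $k$. In particular, $V_\lambda$ has an orthonormal basis made of eigenvectors.
Proof. Let $\lambda$ be a real eigenvalue for $N_{\mathbb{C}}$, $N_{\mathbb{C}}z = \lambda z$. We have
$$N_{\mathbb{C}}\overline{z} = Nx - iNy = \overline{N_{\mathbb{C}}z} = \overline{\lambda z} = \lambda \overline{z},$$
i.e., $z \in \mathbb{C}^n$ is an eigenvector of $N_{\mathbb{C}}$ with eigenvalue $\lambda$ if and only if $\overline{z}$ is also an eigenvector with eigenvalue $\lambda$. The eigenspace $E_\lambda$ of $N_{\mathbb{C}}$ relative to $\lambda$ is then closed under conjugation and by Lemma 4.31 $E_\lambda = W_\lambda \oplus iW_\lambda$, where
$$W_\lambda := \Big\{x \in \mathbb{R}^n \,\Big|\, x = \frac{z + \overline{z}}{2},\ z \in E_\lambda\Big\},$$
and $\dim_{\mathbb{R}} W_\lambda = \dim_{\mathbb{C}} E_\lambda$. Since $N_{\mathbb{C}}$ is diagonalizable in $\mathbb{C}$ and $W_\lambda \subset V_\lambda$, we have
$$k = \dim_{\mathbb{C}} E_\lambda = \dim_{\mathbb{R}} W_\lambda \le \dim_{\mathbb{R}} V_\lambda.$$
As $\dim V_\lambda \le k$, see Proposition 2.43, the claim follows. □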
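4.33 Proposition. Let $\lambda$ be a nonreal eigenvalue of the normal operator $N : \mathbb{R}^n \to \mathbb{R}^n$ with multiplicity $k$. Then there exist $k$ planes of dimension 2 that are invariant under $N$. More precisely, if $e_1, e_2, \dots, e_k \in \mathbb{C}^n$ are $k$ orthonormal eigenvectors that span the eigenspace $E_\lambda$ of $N_{\mathbb{C}}$ relative to $\lambda$ and we set
$$u_{2j-1} := \frac{e_j + \overline{e_j}}{\sqrt{2}}, \qquad u_{2j} := \frac{e_j - \overline{e_j}}{\sqrt{2}\, i},$$
then $u_1, u_2, \dots, u_{2k}$ are orthonormal in $\mathbb{R}^n$, and for $j = 1, \dots, k$ the plane $\mathrm{Span}\{u_{2j-1}, u_{2j}\}$ is invariant under $N$; more precisely we have
$$\begin{cases} N(u_{2j-1}) = \alpha u_{2j-1} - \beta u_{2j}, \\ N(u_{2j}) = \beta u_{2j-1} + \alpha u_{2j}, \end{cases}$$
where $\lambda =: \alpha + i\beta$.
Proof. Let $E_\lambda$, $E_{\overline{\lambda}}$ be the eigenspaces of $N_{\mathbb{C}}$ relative to $\lambda$ and $\overline{\lambda}$. Since $N_{\mathbb{C}}$ is normal, hence diagonalizable on $\mathbb{C}$ in an orthonormal basis, $E_\lambda \perp E_{\overline{\lambda}}$ and $\dim_{\mathbb{C}} E_\lambda = \dim_{\mathbb{C}} E_{\overline{\lambda}} = k$. On the other hand, for $z \in E_\lambda$
$$N_{\mathbb{C}}\overline{z} = Nx - iNy = \overline{N_{\mathbb{C}}z} = \overline{\lambda}\, \overline{z}.$$
Therefore, $z \in E_\lambda$ if and only if $\overline{z} \in E_{\overline{\lambda}}$. The complex subspace $F_\lambda := E_\lambda \oplus E_{\overline{\lambda}}$ of $\mathbb{C}^n$ has dimension $2k$ and is closed under conjugation; Lemma 4.31 then yields $F_\lambda = W_\lambda \oplus iW_\lambda$, where
$$W_\lambda := \Big\{x \in \mathbb{R}^n \,\Big|\, x = \frac{z + \overline{z}}{2},\ z \in E_\lambda\Big\} \qquad\text{and}\qquad \dim_{\mathbb{R}} W_\lambda = \dim_{\mathbb{C}} F_\lambda = 2k. \tag{4.7}$$
If $(e_1, e_2, \dots, e_k)$ is an orthonormal basis of $E_\lambda$, then $(\overline{e_1}, \overline{e_2}, \dots, \overline{e_k})$ is an orthonormal basis of $E_{\overline{\lambda}}$; since
$$\sqrt{2}\, e_j =: u_{2j-1} + i u_{2j}, \qquad \sqrt{2}\, \overline{e_j} =: u_{2j-1} - i u_{2j},$$
we see that $\{u_j\}$ is an orthonormal basis of $W_\lambda$. Finally, if $\lambda := \alpha + i\beta$, we compute
$$\begin{cases} N(u_{2j-1}) = N_{\mathbb{C}}\Big(\dfrac{e_j + \overline{e_j}}{\sqrt{2}}\Big) = \dfrac{\lambda e_j + \overline{\lambda}\, \overline{e_j}}{\sqrt{2}} = \dots = \alpha u_{2j-1} - \beta u_{2j}, \\[2mm] N(u_{2j}) = N_{\mathbb{C}}\Big(\dfrac{e_j - \overline{e_j}}{\sqrt{2}\, i}\Big) = \dfrac{\lambda e_j - \overline{\lambda}\, \overline{e_j}}{\sqrt{2}\, i} = \dots = \beta u_{2j-1} + \alpha u_{2j}, \end{cases}$$
i.e., $\mathrm{Span}\{u_{2j-1}, u_{2j}\}$ is invariant under $N$. □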
Observing that the eigenspaces of the real eigenvalues and the eigenspaces of the complex conjugate eigenvalues are pairwise orthogonal, from Propositions 4.32 and 4.33 we infer the following.
4.34 Theorem. Let $N$ be a normal operator on $\mathbb{R}^n$. Then $\mathbb{R}^n$ is the direct sum of 1-dimensional and 2-dimensional subspaces that are pairwise orthogonal and invariant under $N$. In other words, there is an orthonormal basis such that the matrix $\mathbf{N}'$ associated to $N$ in this basis has the block structure
$$\mathbf{N}' = \begin{pmatrix} \mathbf{A}_1 & 0 & \cdots & 0 \\ 0 & \mathbf{A}_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{A}_\ell \end{pmatrix}.$$
To each real eigenvalue $\lambda$ of multiplicity $k$ correspond $k$ blocks $\lambda$ of dimension $1 \times 1$. To each couple of complex conjugate eigenvalues $\lambda, \overline{\lambda}$ of multiplicity $k$ correspond $k$ blocks of dimension $2 \times 2$ of the form
$$\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$$
where $\alpha + i\beta := \lambda$.
4.35 Corollary. Let $N : \mathbb{R}^n \to \mathbb{R}^n$ be a normal operator. Then
(i) $N$ is self-adjoint if and only if all its eigenvalues are real,
(ii) $N$ is anti-self-adjoint if and only if all its eigenvalues are purely imaginary (or zero),
(iii) $N$ is an isometry if and only if all its eigenvalues have modulus one.
4.36 ¶. Show Corollary 4.35.
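A plane rotation gives the simplest instance of a $2 \times 2$ block of Theorem 4.34. The computation below is a worked example added for illustration; the angle $\theta$ is an arbitrary parameter.

```latex
% Rotation of R^2 by the angle theta: a normal operator that is an isometry.
\[
\mathbf{N} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\qquad
\mathbf{N}\mathbf{N}^T = \mathbf{N}^T\mathbf{N} = \mathrm{Id},
\qquad
\det(\lambda\,\mathrm{Id} - \mathbf{N}) = \lambda^2 - 2\lambda\cos\theta + 1 .
\]
% Eigenvalues: cos(theta) +/- i sin(theta), both of modulus one.
\[
\lambda = \cos\theta - i\sin\theta =: \alpha + i\beta
\quad\Longrightarrow\quad
\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}
= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
= \mathbf{N}.
\]
```

The eigenvalues have modulus one, in agreement with Corollary 4.35 (iii), and they are real only when $\sin\theta = 0$, which is exactly when the rotation is self-adjoint.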
4.1.3 Some representation formulas
a. The operator $A^*A$
Let $A : X \to Y$ be a linear operator between two Euclidean spaces or two Hermitian spaces and let $A^* : Y \to X$ be its adjoint. As we have seen, $A^*A : X \to X$ is self-adjoint, nonnegative and can be written as
$$A^*Ax = \sum_{i=1}^n \lambda_i (x\,|\,e_i)\, e_i$$
where $(e_1, e_2, \dots, e_n)$ is an orthonormal basis of $X$ made of eigenvectors of $A^*A$ and for each $i = 1, \dots, n$, $\lambda_i$ is the eigenvalue relative to $e_i$; accordingly, we also have
$$(A^*A)^{1/2}x := \sum_{i=1}^n \mu_i (x\,|\,e_i)\, e_i,$$
where $\mu_i := \sqrt{\lambda_i}$. The operator $(A^*A)^{1/2}$ and its eigenvalues $\mu_1, \dots, \mu_n$, called the singular values of $A$, play an important role in the description of $A$.
4.37 ¶. Let $\mathbf{A} \in M_{m,n}(\mathbb{R})$. Show that $\|\mathbf{A}\| := \sup_{|x| = 1} |\mathbf{A}x|$ is the greatest singular value of $\mathbf{A}$. [Hint: $|\mathbf{A}x|^2 = (\mathbf{A}^*\mathbf{A}x) \cdot x$.]
4.38 Theorem (Polar decomposition). Let $A : X \to Y$ be an operator between two Euclidean or two Hermitian spaces.
(i) If $\dim X \le \dim Y$, then there exists an isometry $U : X \to Y$, i.e., $U^*U = \mathrm{Id}$, such that
$$A = U(A^*A)^{1/2}.$$
Moreover, if $A = US$ with $U^*U = \mathrm{Id}$, $S^* = S$ and $S$ nonnegative, then $S = (A^*A)^{1/2}$ and $U$ is uniquely defined on $(\ker S)^\perp = (\ker A)^\perp$.
(ii) If $\dim X \ge \dim Y$, then there exists an isometry $U : Y \to X$, i.e., $U^*U = \mathrm{Id}$, such that
$$A = (AA^*)^{1/2}U^*.$$
Moreover, if $A = SU^*$ with $U^*U = \mathrm{Id}$, $S^* = S$ and $S$ nonnegative, then $S = (AA^*)^{1/2}$ and $U$ is uniquely defined on $(\ker S)^\perp = \mathrm{Im}\, A$.
Proof. Let us show (i). Set $n := \dim X$ and $N := \dim Y$. First let us prove uniqueness. If $A = US$ where $U^*U = \mathrm{Id}$, $S^* = S$ and $S$ is nonnegative, then $A^*A = S^*U^*US = S^*S = S^2$, i.e., $S = (A^*A)^{1/2}$. Now from $A = U(A^*A)^{1/2}$ we infer for $i = 1, \dots, n$
$$A(e_i) = U(A^*A)^{1/2}(e_i) = U(\mu_i e_i) = \mu_i U(e_i),$$
if $(e_1, e_2, \dots, e_n)$ is an orthonormal basis of $X$ of eigenvectors of $(A^*A)^{1/2}$ with relative eigenvalues $\mu_1, \mu_2, \dots, \mu_n$. Hence, $U(e_i) = \frac{1}{\mu_i} A(e_i)$ if $\mu_i \ne 0$, i.e., $U$ is uniquely defined by $A$ on the direct sum of the eigenspaces relative to nonzero eigenvalues of $(A^*A)^{1/2}$, that is, on the orthogonal of $\ker (A^*A)^{1/2} = \ker A$.
Now we shall exhibit $U$. The vectors $A(e_1), \dots, A(e_n)$ are orthogonal and $|A(e_i)| = \mu_i$ as
$$(A(e_i)\,|\,A(e_j)) = (A^*A(e_i)\,|\,e_j) = \mu_i^2 (e_i\,|\,e_j) = \mu_i^2 \delta_{ij}.$$
Let us reorder the eigenvectors and the corresponding eigenvalues in such a way that for some $k$, $1 \le k \le n$, the vectors $A(e_1), \dots, A(e_k)$ are not zero and $A(e_{k+1}) = \dots = A(e_n) = 0$. For $i = 1, \dots, k$ we set $v_i := \frac{A(e_i)}{|A(e_i)|}$ and we complete $v_1, v_2, \dots, v_k$ to form a new orthonormal basis $(v_1, v_2, \dots, v_N)$ of $Y$. Now consider $U : X \to Y$ defined by
$$U(e_i) := v_i, \qquad i = 1, \dots, n.$$
By construction $(U(e_i)\,|\,U(e_j)) = \delta_{ij}$, i.e., $U^*U = \mathrm{Id}$, and, since $\mu_i = |A(e_i)| = 0$ for $i > k$, we conclude for every $i = 1, \dots, n$
$$U(A^*A)^{1/2}(e_i) = \mu_i U(e_i) = \mu_i v_i = \begin{cases} |A(e_i)| \dfrac{A(e_i)}{|A(e_i)|} = A(e_i) & \text{if } 1 \le i \le k, \\[2mm] 0\, v_i = 0 = A(e_i) & \text{if } k < i \le n. \end{cases}$$
(ii) follows by applying (i) to $A^*$. □
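A small numerical instance of the polar decomposition $A = U(A^*A)^{1/2}$ may help. The sketch below is restricted, by assumption, to invertible real $2 \times 2$ matrices, so that $U = A(A^TA)^{-1/2}$. Instead of the eigenvector construction used in the proof, it swaps in the closed form $\sqrt{M} = (M + \sqrt{\det M}\,\mathrm{Id})/\sqrt{\mathrm{tr}\,M + 2\sqrt{\det M}}$, valid for $2 \times 2$ symmetric positive definite $M$; all function names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def sqrt_spd_2x2(M):
    """Square root of a 2x2 symmetric positive definite matrix (closed form)."""
    s = math.sqrt(M[0][0] * M[1][1] - M[0][1] * M[1][0])   # sqrt(det M)
    t = math.sqrt(M[0][0] + M[1][1] + 2.0 * s)             # trace of the root
    return [[(M[0][0] + s) / t, M[0][1] / t],
            [M[1][0] / t, (M[1][1] + s) / t]]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def polar_2x2(A):
    """Return (U, S) with A = U S, S = (A^T A)^{1/2}; assumes A is invertible."""
    S = sqrt_spd_2x2(matmul(transpose(A), A))
    U = matmul(A, inverse_2x2(S))
    return U, S

A = [[1.0, 2.0], [0.0, 1.0]]          # sample invertible matrix
U, S = polar_2x2(A)
UtU = matmul(transpose(U), U)         # should be the identity
US = matmul(U, S)                     # should reproduce A
print(U)
print(S)
```

For this sample matrix one finds $S = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}$, and `U` is a rotation; when $A$ is not invertible, $U$ has to be completed on $\ker A$ as in the proof.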
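b. Singular value decomposition
Combining polar decomposition and the spectral theorem we deduce the so-called singular value decomposition of a matrix $\mathbf{A}$. We discuss only the real case, since the complex one requires only a few straightforward changes.
Let $\mathbf{A} \in M_{N,n}(\mathbb{R})$ with $n \le N$. The polar decomposition yields
$$\mathbf{A} = \mathbf{U}(\mathbf{A}^T\mathbf{A})^{1/2} \qquad\text{with}\qquad \mathbf{U}^T\mathbf{U} = \mathrm{Id}.$$
On the other hand, since $\mathbf{A}^T\mathbf{A}$ is symmetric, the spectral theorem yields $\mathbf{S} \in M_{n,n}(\mathbb{R})$ such that
$$(\mathbf{A}^T\mathbf{A})^{1/2} = \mathbf{S}^T \mathrm{diag}(\mu_1, \mu_2, \dots, \mu_n)\, \mathbf{S}, \qquad \mathbf{S}^T\mathbf{S} = \mathrm{Id},$$
where $\mu_1, \mu_2, \dots, \mu_n$ are the singular values of $\mathbf{A}$. Recall that the $i$th column of $\mathbf{S}^T$, i.e., the $i$th row of $\mathbf{S}$, is the eigenvector of $(\mathbf{A}^T\mathbf{A})^{1/2}$ relative to the eigenvalue $\mu_i$. In conclusion, if we set $\mathbf{T} := \mathbf{U}\mathbf{S}^T \in M_{N,n}(\mathbb{R})$, then $\mathbf{T}^T\mathbf{T} = \mathrm{Id}$, $\mathbf{S}^T\mathbf{S} = \mathrm{Id}$ and
$$\mathbf{A} = \mathbf{T}\, \mathrm{diag}(\mu_1, \mu_2, \dots, \mu_n)\, \mathbf{S}.$$
This is the singular value decomposition of $\mathbf{A}$, which is implemented in most computer libraries on linear algebra. Starting from the singular value decomposition of $\mathbf{A}$, we can easily compute, of course, $(\mathbf{A}^T\mathbf{A})^{1/2}$ and the polar decomposition of $\mathbf{A}$.
4.39 ¶. We notice that the singular value decomposition can be written in a more symmetric form if we extend $\mathbf{T}$ to a square orthogonal matrix $\mathbf{V} \in M_{N,N}(\mathbb{R})$, $\mathbf{V}^T\mathbf{V} = \mathrm{Id}$, and extend $\mathrm{diag}(\mu_1, \mu_2, \dots, \mu_n)$ to an $N \times n$ matrix $\mathbf{\Delta}$ by adding $N - n$ null rows at the bottom. Then, again,
$$\mathbf{A} = \mathbf{V}\mathbf{\Delta}\mathbf{S}$$
where $\mathbf{V} \in M_{N,N}(\mathbb{R})$, $\mathbf{V}^T\mathbf{V} = \mathrm{Id}$, $\mathbf{S} \in M_{n,n}(\mathbb{R})$, $\mathbf{S}^T\mathbf{S} = \mathrm{Id}$ and
$$\mathbf{\Delta} = \begin{pmatrix} \mu_1 & 0 & \cdots & 0 \\ 0 & \mu_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \mu_n \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}.$$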
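c. The Moore-Penrose inverse
Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces of dimension $n$ and $m$, respectively. Denote by
$$P : X \to (\ker A)^\perp \qquad\text{and}\qquad Q : Y \to \mathrm{Im}\, A$$
the orthogonal projection operators onto $(\ker A)^\perp$ and $\mathrm{Im}\, A$. Of course $Ax = Qy$ has at least one solution $x \in X$ for any $y \in Y$. Equivalently, there exists $x \in X$ such that $y - Ax \perp \mathrm{Im}\, A$. Since the set of solutions of $Ax = Qy$ is a translate of $\ker A$, we conclude that there exists a unique $x := A^\dagger y \in X$ such that
$$\begin{cases} y - Ax \perp \mathrm{Im}\, A, \\ x \in (\ker A)^\perp, \end{cases} \qquad\text{equivalently,}\qquad \begin{cases} Ax = Qy, \\ x = Px. \end{cases} \tag{4.8}$$
The linear map $A^\dagger : Y \to X$, $y \mapsto A^\dagger y$, defined this way is called the Moore-Penrose inverse of $A : X \to Y$. From the definition
$$AA^\dagger = Q, \qquad A^\dagger A = P, \qquad \ker A^\dagger = (\mathrm{Im}\, A)^\perp = \ker Q, \qquad \mathrm{Im}\, A^\dagger = (\ker A)^\perp.$$
4.40 Proposition. $A^\dagger$ is the unique linear map $B : Y \to X$ such that
$$AB = Q, \qquad BA = P \qquad\text{and}\qquad \ker B = \ker Q; \tag{4.9}$$
moreover we have
$$A^*AA^\dagger = A^\dagger AA^* = A^*. \tag{4.10}$$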
Proof. We prove that $B = A^\dagger$ by showing that for all $y \in Y$ the vector $x := By$ satisfies (4.8). The first equality in (4.9) yields $Ax = ABy = Qy$ and the last two imply $x = By = BQy = BAx = Px$. Finally, from $AA^\dagger = Q$ and $A^\dagger A = P$, we infer that
$$A^*AA^\dagger = A^*Q = A^*, \qquad A^\dagger AA^* = PA^* = A^*,$$
using also that $A^*Q = A^*$ and $PA^* = A^*$, since $A$ and $A^*$ are such that $\mathrm{Im}\, A = (\ker A^*)^\perp$ and $\mathrm{Im}\, A^* = (\ker A)^\perp$. □
The equations (4.10) allow us to compute $A^\dagger$ easily when $A$ is injective or surjective.
4.41 Corollary. Let $A : X \to Y$ be a linear map between Euclidean or Hermitian spaces of dimension $n$ and $m$, respectively.
(i) If $\ker A = \{0\}$, then $n \le m$, $A^*A$ is invertible and
$$A^\dagger = (A^*A)^{-1}A^*;$$
moreover, if $A = U(A^*A)^{1/2}$ is the polar decomposition of $A$, then $A^\dagger = (A^*A)^{-1/2}U^*$.
(ii) If $\ker A^* = \{0\}$, then $n \ge m$, $AA^*$ is invertible, and
$$A^\dagger = A^*(AA^*)^{-1};$$
moreover, if $A = (AA^*)^{1/2}U^*$ is the polar decomposition of $A$, then $A^\dagger = U(AA^*)^{-1/2}$.
For more on the Moore-Penrose inverse, see Chapter 10.
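The formula of Corollary 4.41 (i) can be checked on a small injective matrix. The sketch below assumes a real matrix with two linearly independent columns, so that $A^TA$ is an invertible $2 \times 2$ matrix; `pinv_injective` and the other helpers are names chosen for this illustration.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def pinv_injective(A):
    """(A^T A)^{-1} A^T for a real matrix A with two independent columns."""
    At = transpose(A)
    return matmul(inverse_2x2(matmul(At, A)), At)

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]     # sample injective map R^2 -> R^3
A_dag = pinv_injective(A)
P = matmul(A_dag, A)    # projection onto (ker A)^perp, here the identity of R^2
Q = matmul(A, A_dag)    # orthogonal projection of R^3 onto Im A
print(A_dag)
print(Q)
```

Since $\ker A = \{0\}$, `P` is the identity, while `Q` is symmetric and idempotent, as an orthogonal projection must be by Proposition 4.2.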
4.2 Some Applications
In this final section, we illustrate methods of linear algebra in a few specific examples.

4.2.1 The method of least squares
a. The method of least squares
Suppose we have $m$ experimental data $y_1, y_2, \dots, y_m$ obtained by performing an experiment for which we have a mathematical model that imposes that the data should be functions, $\phi(x)$, of a variable $x \in X$. Our problem is that of finding $x \in X$ in such a way that the theoretical data $\phi(x)$ are as close as possible to the data of the experiment. We can formalize our problem as follows. We list the experimental data as a vector $y = (y_1, y_2, \dots, y_m) \in \mathbb{R}^m$ and represent the mathematical model as a map $\phi : X \to \mathbb{R}^m$. Then, we introduce a cost function $C = C(\phi(x), y)$ that evaluates the error between the expected result when the parameter is $x$, and the experimental data. Our problem then becomes that of finding a minimizer of the cost function $C$. If we choose
(i) the model of the data to be linear, i.e., $X$ is a vector space of dimension $n$ and $\phi = A : X \to \mathbb{R}^m$ is a linear operator,
(ii) as cost function, the square distance between the expected and the experimental data,
$$C(x) = |Ax - y|^2 = (Ax - y|Ax - y), \tag{4.11}$$
we talk of the (linear) least squares problem.

4.42 Theorem. Let $X$ and $Y$ be Euclidean spaces, $A : X \to Y$ a linear map, $y \in Y$ and $C : X \to \mathbb{R}$ the function
$$C(x) := |Ax - y|_Y^2, \qquad x \in X.$$
The following claims are equivalent
(i) $x$ is a minimizer of $C$,
(ii) $y - Ax \perp \operatorname{Im} A$,
(iii) $x$ solves the canonical equation
$$A^*(Ax - y) = 0. \tag{4.12}$$
Consequently $C$ has at least one minimizer in $X$ and the space of minimizers of $C$ is a translate of $\ker A$.

Proof. Clearly minimizing $C$ is equivalent to finding $z = Ax \in \operatorname{Im} A$ of least distance from $y$. By the orthogonal projection theorem, $x$ is a minimizer if and only if $Ax$ is the orthogonal projection of $y$ onto $\operatorname{Im} A$. We therefore deduce that a minimizer $x \in X$ for $C$ exists, that for two minimizers $x_1, x_2$ of $C$ we have $Ax_1 = Ax_2$, i.e., $x_1 - x_2 \in \ker A$, and that (i) and (ii) are equivalent. Finally, since $\operatorname{Im} A^\perp = \ker A^*$, (ii) and (iii) are clearly equivalent. □

4.43 Remark. The equation (4.12) expresses the fact that the function $x \mapsto |Ax - b|^2$ is stationary at a minimizer. In fact, compare 3.65, since $\nabla_x(z|x) = z$ and $\nabla_x(Lx|x) = 2Lx$ if $L$ is self-adjoint, we have
$$|Ax - b|^2 = |b|^2 - 2(b|Ax) + |Ax|^2,$$
$$\nabla_x(b|Ax) = \nabla_x(A^*b|x) = A^*b, \qquad \nabla_x|Ax|^2 = \nabla_x(A^*Ax|x) = 2A^*Ax,$$
hence
$$\nabla_x|b - Ax|^2 = 2A^*(Ax - b).$$

As a consequence of Theorem 4.42, on account of (4.8), we can state the following.

4.44 Corollary. The unique minimizer of $C(x) = |Ax - y|_Y^2$ in $\operatorname{Im} A^* = \ker A^\perp$ is $x = A^\dagger y$.
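The equivalence of (i) and (iii) in Theorem 4.42 can be illustrated by fitting a straight line $a + bt$ to a few data points: one solves the canonical equation $A^*Ax = A^*y$ and checks that the residual $Ax - y$ is orthogonal to $\operatorname{Im} A$. The data and the helper names below are hypothetical; the sketch assumes $\ker A = \{0\}$, so that $A^*A$ is an invertible $2 \times 2$ matrix.

```python
from fractions import Fraction

def solve_2x2(M, rhs):
    """Solve M z = rhs for an invertible 2x2 matrix M by Cramer's rule."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [(d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]

def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def cost(A, x, y):
    """C(x) = |Ax - y|^2."""
    return sum((r - yi) ** 2 for r, yi in zip(apply(A, x), y))

ts = [0, 1, 2, 3]                                  # hypothetical abscissas
y = [Fraction(v) for v in (1, 3, 4, 8)]            # hypothetical measurements
A = [[Fraction(1), Fraction(t)] for t in ts]       # model: y ~ a + b t
At = [list(col) for col in zip(*A)]

AtA = [[sum(r[i] * r[j] for r in A) for j in range(2)] for i in range(2)]
x = solve_2x2(AtA, apply(At, y))                   # canonical equation (4.12)

residual = [r - yi for r, yi in zip(apply(A, x), y)]
normal_eq = apply(At, residual)                    # A*(Ax - y)
```

Any other parameter, for instance `[x[0] + 1, x[1]]`, gives a strictly larger cost, in agreement with (i).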
b. The function of linear regression
Given $m$ vectors $x_1, x_2, \dots, x_m$ in a Euclidean space $X$ and $m$ corresponding numbers $y_1, y_2, \dots, y_m$, we want to find a linear map $L : X \to \mathbb{R}$ that minimizes the quantity
$$F(L) := \sum_{i=1}^m |y_i - L(x_i)|^2.$$
This is in fact a dual formulation of the linear least squares problem. By Riesz's theorem, to every linear map $L : X \to \mathbb{R}$ corresponds a unique vector $w_L \in X$ such that $L(y) := (y|w_L)$, and conversely. Therefore, we need to find $w \in X$ such that
$$C(w) := \sum_{i=1}^m |y_i - (x_i|w)|^2 \to \min.$$
If $y := (y_1, y_2, \dots, y_m) \in \mathbb{R}^m$ and $A : X \to \mathbb{R}^m$ is the linear map
$$Aw := \big((x_1|w), (x_2|w), \dots, (x_m|w)\big),$$
we are again seeking a minimizer of $C : X \to \mathbb{R}$,
$$C(w) := |y - Aw|^2, \qquad w \in X.$$
Theorem 4.42 tells us that the set of minimizers is nonempty, it is a translate of $\ker A$, and the unique minimizer of $C$ in $\ker A^\perp = \operatorname{Im} A^*$ is $w := A^\dagger y$. Notice that
$$A^*a = \sum_{i=1}^m a^i x_i, \qquad a = (a^1, a^2, \dots, a^m) \in \mathbb{R}^m,$$
hence $w \in \operatorname{Im} A^* = \ker A^\perp$ if and only if $w$ is a linear combination of $x_1, x_2, \dots, x_m$. We therefore conclude that $A^\dagger y$ is the unique minimizer of the cost function $C$ that is a linear combination of $x_1, x_2, \dots, x_m$. The corresponding linear map $L(x) := (x|A^\dagger y)$ is called the function of linear regression.
4.2.2 Trigonometric polynomials
Let us reconsider in the more abstract setting of vector spaces some of the results about trigonometric polynomials, see e.g., Section 5.4.1 of [GM2]. Let $\mathcal{P}_{n,2\pi}$ be the class of trigonometric polynomials of degree $n$ with complex coefficients
$$\mathcal{P}_{n,2\pi} := \Big\{ P(x) = \sum_{k=-n}^{n} c_k e^{ikx} \;\Big|\; c_k \in \mathbb{C},\ k = -n, \dots, n \Big\}.$$
Recall that the vector $(c_{-n}, \dots, c_n) \in \mathbb{C}^{2n+1}$ is called the spectrum of $P(x) = \sum_{k=-n}^{n} c_k e^{ikx}$. Clearly, $\mathcal{P}_{n,2\pi}$ is a vector space over $\mathbb{C}$ of dimension at most $2n+1$. The function $(P|Q) : \mathcal{P}_{n,2\pi} \times \mathcal{P}_{n,2\pi} \to \mathbb{C}$ defined by
$$(P|Q) := \frac{1}{2\pi}\int_{-\pi}^{\pi} P(t)\overline{Q(t)}\,dt$$
is a Hermitian product on $\mathcal{P}_{n,2\pi}$ that makes $\mathcal{P}_{n,2\pi}$ a Hermitian space. Since
$$(e^{ikx}|e^{ihx}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(k-h)x}\,dx = \delta_{hk},$$
see Lemma 5.45 of [GM2], we have the following.

4.45 Proposition. The trigonometric polynomials $\{e^{ikx}\}_{k=-n,\dots,n}$ form an orthonormal set of $2n+1$ vectors in $\mathcal{P}_{n,2\pi}$ and we have the following.
(i) $\mathcal{P}_{n,2\pi}$ is a Hermitian space of dimension $2n+1$.
(ii) The map $\widehat{\ } : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}$, that maps a trigonometric polynomial to its spectrum, is well defined since it is the coordinate system in $\mathcal{P}_{n,2\pi}$ relative to the orthonormal basis $\{e^{ikx}\}$. In particular, $\widehat{\ } : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}$ is a (complex) isometry.
(iii) (FOURIER COEFFICIENTS) For $k = -n, \dots, n$ we have
$$c_k = (P|e^{ikx}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} P(t)e^{-ikt}\,dt.$$
(iv) (ENERGY IDENTITY)
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |P(t)|^2\,dt = \|P\|^2 := (P|P) = \sum_{k=-n}^{n} |(P|e^{ikx})|^2 = \sum_{k=-n}^{n} |c_k|^2.$$
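a. Spectrum and products
Let $P(x) = \sum_{k=-n}^{n} c_k e^{ikx}$ and $Q(x) = \sum_{k=-n}^{n} d_k e^{ikx}$ be two trigonometric polynomials of order $n$. Their product is the trigonometric polynomial of order $2n$
$$P(x)Q(x) = \Big(\sum_{h=-n}^{n} c_h e^{ihx}\Big)\Big(\sum_{k=-n}^{n} d_k e^{ikx}\Big) = \sum_{h,k=-n}^{n} c_h d_k e^{i(h+k)x} = \sum_{p=-2n}^{2n}\Big(\sum_{h+k=p} c_h d_k\Big)e^{ipx}.$$
If we denote by $\{c_k\} * \{d_k\}$ the product in the sense of Cauchy of the spectra of $P$ and $Q$, we can state the following.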
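4.46 Proposition. The spectrum of $P(x)Q(x)$ is the product in the sense of Cauchy of the spectra of $P$ and $Q$,
$$(\widehat{PQ})_k = \widehat{P}_k * \widehat{Q}_k.$$

4.47 Definition. The convolution product of $P$ and $Q$ is defined by
$$P * Q(x) := \frac{1}{2\pi}\int_{-\pi}^{\pi} P(x+t)\overline{Q(t)}\,dt.$$
Notice that the operation $(P, Q) \mapsto P * Q$ is linear in the first factor and antilinear in the second one. We have

4.48 Proposition. $P * Q$ is a trigonometric polynomial of degree $n$. Moreover the spectrum of $P * Q$ is the term-by-term product of the spectra of $P$ and $\overline{Q}$,
$$(\widehat{P * Q})_k := \widehat{P}_k\,\overline{\widehat{Q}_k}.$$

Proof. In fact for $h, k = -n, \dots, n$, we have
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ih(x+t)}e^{-ikt}\,dt = e^{ihx}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(h-k)t}\,dt = \delta_{hk}\,e^{ihx},$$
hence, if $P(x) = \sum_{k=-n}^{n} c_k e^{ikx}$ and $Q(x) = \sum_{k=-n}^{n} d_k e^{ikx}$, we have
$$P * Q(x) = \sum_{h=-n}^{n}\sum_{k=-n}^{n} c_h \overline{d_k}\,\delta_{hk}\,e^{ihx} = \sum_{k=-n}^{n} c_k \overline{d_k}\,e^{ikx}.$$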
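Proposition 4.46 lends itself to a quick numerical check: the Cauchy product of two spectra is a discrete convolution of the coefficient lists, and the polynomial it defines should agree with the pointwise product $P(x)Q(x)$. The sketch below assumes that a spectrum $(c_{-n}, \dots, c_n)$ is stored as a Python list with $c_k$ at position $k + n$; the coefficients and function names are made up for illustration.

```python
import cmath

def evaluate(spectrum, n, x):
    """Value at x of sum_{k=-n}^{n} c_k e^{ikx}; spectrum[k + n] stores c_k."""
    return sum(c * cmath.exp(1j * (idx - n) * x) for idx, c in enumerate(spectrum))

def cauchy_product(c, d):
    """Entry p collects all products c_h d_k with h + k = p."""
    out = [0j] * (len(c) + len(d) - 1)
    for h, ch in enumerate(c):
        for k, dk in enumerate(d):
            out[h + k] += ch * dk
    return out

n = 2
c = [1, 2j, -1, 0.5, 3]          # hypothetical spectrum c_{-2}, ..., c_2 of P
d = [0, 1, 1j, -2, 1 - 1j]       # hypothetical spectrum d_{-2}, ..., d_2 of Q
pq = cauchy_product(c, d)        # spectrum of PQ, indexed from -2n to 2n

max_err = max(
    abs(evaluate(c, n, x) * evaluate(d, n, x) - evaluate(pq, 2 * n, x))
    for x in (0.0, 0.7, -1.3, 2.9)
)
```

Up to rounding, `max_err` vanishes, and `pq` has $4n + 1$ entries as expected for a trigonometric polynomial of order $2n$.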
b. Sampling of trigonometric polynomials
A trigonometric polynomial of degree $n$ can be reconstructed from its values at a suitable choice of $2n+1$ points, see Section 5.4.1 of [GM2]. Set $x_j := \frac{2\pi}{2n+1}j$, $j = -n, \dots, n$; then the sampling map
$$C : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}, \qquad C(P) := \big(P(x_{-n}), \dots, P(x_n)\big)$$
is invertible; in fact, see Theorem 5.49 of [GM2],
$$P(x) = \frac{1}{2n+1}\sum_{j=-n}^{n} P(x_j)D_n(x - x_j),$$
where $D_n(t)$ is the Dirichlet kernel of order $n$,
$$D_n(t) := \sum_{k=-n}^{n} e^{ikt} = 1 + 2\sum_{k=1}^{n}\cos kt.$$
Figure 4.1. The scenario of trigonometric polynomials, spectra and samples. [Diagram relating the trigonometric polynomials of degree $n$, their spectra in $\mathbb{C}^{2n+1}$ and their samplings in $\mathbb{C}^{2n+1}$, the latter two being linked by the IDFT.]
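4.49 Proposition. The map $\frac{1}{\sqrt{2n+1}}\,C : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}$ and its inverse $\sqrt{2n+1}\,C^{-1} : \mathbb{C}^{2n+1} \to \mathcal{P}_{n,2\pi}$, given by
$$\sqrt{2n+1}\,C^{-1}(z)(x) := \frac{1}{\sqrt{2n+1}}\sum_{j=-n}^{n} z_j D_n(x - x_j),$$
are isometries between $\mathcal{P}_{n,2\pi}$ and $\mathbb{C}^{2n+1}$.

Proof. In fact, $\frac{1}{\sqrt{2n+1}}\,C$ maps $e^{ikx}$, $k = -n, \dots, n$, to an orthonormal basis of $\mathbb{C}^{2n+1}$:
$$\frac{1}{2n+1}\sum_{j=-n}^{n} e^{ihx_j}\,\overline{e^{ikx_j}} = \delta_{hk}. \qquad \square$$

From the samples, we can directly compute the spectrum of $P$.

4.50 Proposition. Let $P(x) \in \mathcal{P}_{n,2\pi}$ and $x_j := \frac{2\pi}{2n+1}j$, $j = -n, \dots, n$. Then
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} P(t)e^{-ikt}\,dt = \frac{1}{2n+1}\sum_{j=-n}^{n} P(x_j)e^{-ikx_j}. \tag{4.13}$$

Proof. Since (4.13) is linear in $P$, it suffices to prove it when $P(x) = e^{ihx}$, $h = -n, \dots, n$. In this case, we have $\frac{1}{2\pi}\int_{-\pi}^{\pi} P(t)e^{-ikt}\,dt = \delta_{hk}$ and
$$\sum_{j=-n}^{n} P(x_j)e^{-ikx_j} = \sum_{j=-n}^{n} e^{i(h-k)x_j} = D_n(x_{h-k}) = (2n+1)\,\delta_{hk},$$
since $D_n(x_j) = 0$ whenever $j$ is not a multiple of $2n+1$ (in particular for $j \ne 0$, $j \in [-n, n]$) and $D_n(0) = 2n+1$. □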
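Formula (4.13) says that the integral defining each Fourier coefficient can be replaced by a finite sum over the $2n+1$ samples. A minimal numerical sketch, with a made-up spectrum and hypothetical variable names, builds $P$ from its coefficients, samples it at the points $x_j$ and recovers the coefficients from the samples alone.

```python
import cmath
import math

n = 3
N = 2 * n + 1
# Hypothetical spectrum c_{-n}, ..., c_n, stored by index k.
c = {k: complex(k, 1 - k * k) for k in range(-n, n + 1)}

def P(x):
    return sum(ck * cmath.exp(1j * k * x) for k, ck in c.items())

nodes = [2 * math.pi * j / N for j in range(-n, n + 1)]   # the points x_j
samples = [P(x) for x in nodes]

# Right-hand side of (4.13), evaluated for every k.
recovered = {
    k: sum(s * cmath.exp(-1j * k * x) for s, x in zip(samples, nodes)) / N
    for k in range(-n, n + 1)
}
max_err = max(abs(recovered[k] - c[k]) for k in c)
```

The recovered coefficients agree with the original spectrum up to floating-point rounding.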
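c. The discrete Fourier transform
The relation between the values $\{P(t_j)\}$ of $P \in \mathcal{P}_{n,2\pi}$ at the $2n+1$ points $t_j$ and the spectrum $\widehat{P}$ of $P$ in the previous paragraph is a special case of the so-called discrete Fourier transform. For each positive integer $N$, consider the $2\pi$-periodic function $E_N(t) : \mathbb{R} \to \mathbb{C}$ given by
$$E_N(t) := \sum_{k=0}^{N-1} e^{ikt} = \begin{cases} N & \text{if } t \text{ is a multiple of } 2\pi,\\[2pt] \dfrac{1 - e^{iNt}}{1 - e^{it}} & \text{otherwise.}\end{cases} \tag{4.14}$$
Let $\omega = e^{i\frac{2\pi}{N}}$ and let $1, \omega, \omega^2, \dots, \omega^{N-1}$ be the $N$th roots of one. For $h \in \mathbb{Z}$ we have
$$\frac{1}{N}\sum_{k=0}^{N-1}\omega^{hk} = \begin{cases} 1 & \text{if } h \text{ is a multiple of } N,\\ 0 & \text{otherwise,}\end{cases} \tag{4.15}$$
in particular,
$$\frac{1}{N}\sum_{k=0}^{N-1}\omega^{hk}\omega^{-kj} = \delta_{hj}, \qquad 0 \le h, j \le N-1. \tag{4.16}$$
The discrete Fourier transform of order $N$, $DFT_N : \mathbb{C}^N \to \mathbb{C}^N$, is defined by $DFT_N(y) := \mathbf{U}y$, rows by columns, where
$$\mathbf{U} = [U^i_j], \qquad U^i_j := \frac{1}{N}\,\omega^{-ij} \qquad \forall i, j = 0, \dots, N-1.$$
The inverse discrete Fourier transform of order $N$, $IDFT_N : \mathbb{C}^N \to \mathbb{C}^N$, is defined by $IDFT_N(z) := \mathbf{V}z$ where
$$\mathbf{V} = [V^i_j], \qquad V^i_j := \omega^{ij} \qquad \forall i, j = 0, \dots, N-1.$$

4.51 Proposition. $IDFT_N$ is the inverse of $DFT_N$. Moreover, the operators $\sqrt{N}\,DFT_N$ and $\frac{1}{\sqrt{N}}\,IDFT_N$ are isometries of $\mathbb{C}^N$.

Proof. In fact, by (4.16),
$$(\mathbf{V}\mathbf{U})^i_j = \sum_{k=0}^{N-1} V^i_k U^k_j = \frac{1}{N}\sum_{k=0}^{N-1}\omega^{ik}\omega^{-kj} = \delta_{ij},$$
i.e., $\mathbf{V} = \mathbf{U}^{-1}$ and, by the definition of $\mathbf{U}$ and $\mathbf{V}$, $\overline{\mathbf{U}}^T = \frac{1}{N}\mathbf{V}$, hence $(\sqrt{N}\,\mathbf{U})^* = \frac{1}{\sqrt{N}}\mathbf{V} = (\sqrt{N}\,\mathbf{U})^{-1}$. □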
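Both claims of Proposition 4.51 can be checked directly from the definitions of $\mathbf{U}$ and $\mathbf{V}$. The sketch below is a plain $O(N^2)$ implementation using the normalization of the text, with the factor $1/N$ on the forward transform (many numerical libraries place this factor on the inverse transform instead, so results may differ by a factor $N$). The input vector and the function names are arbitrary choices.

```python
import cmath
import math

def dft(y):
    """(U y)_i with U^i_j = omega^(-ij) / N and omega = exp(2 pi i / N)."""
    N = len(y)
    return [sum(y[j] * cmath.exp(-2j * math.pi * i * j / N) for j in range(N)) / N
            for i in range(N)]

def idft(z):
    """(V z)_i with V^i_j = omega^(ij)."""
    N = len(z)
    return [sum(z[j] * cmath.exp(2j * math.pi * i * j / N) for j in range(N))
            for i in range(N)]

def norm(v):
    return math.sqrt(sum(abs(t) ** 2 for t in v))

y = [1, 2 - 1j, 0, -3j, 4, 0.5]     # hypothetical data in C^6
z = dft(y)

round_trip_err = max(abs(a - b) for a, b in zip(idft(z), y))
isometry_err = abs(math.sqrt(len(y)) * norm(z) - norm(y))
```

Here `round_trip_err` tests $IDFT_N \circ DFT_N = \mathrm{Id}$ and `isometry_err` tests that $\sqrt{N}\,DFT_N$ preserves the norm.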
Notice that, according to their definitions, we need $N^2$ multiplications to compute $DFT_N$ (or $IDFT_N$). There is an algorithm, that we shall not describe here, called the Fast Fourier Transform that, exploiting the redundancy of some multiplications, computes $DFT_N$ (or $IDFT_N$) with a number of multiplications of order $O(N \log N)$.

Let $P(t) = \sum_{k=-n}^{n} c_k e^{ikt} \in \mathcal{P}_{n,2\pi}$ and let $N \ge 2n+1$. A computation similar to the one in Proposition 4.50 shows that
$$\widehat{P}_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} P(t)e^{-ikt}\,dt = \frac{1}{N}\sum_{j=0}^{N-1} P(x_j)e^{-ikx_j} = (DFT_N\,y)_k, \tag{4.17}$$
where $y := (P(x_0), \dots, P(x_{N-1}))$ and $x_j := \frac{2\pi}{N}j$, $-N < j < N$, the index $k$ of $DFT_N\,y$ being understood modulo $N$. Thus the spectrum of $P$ is the $DFT_N$ of its values at the points $x_j$ if $n < N/2$. On the other hand, if $z := (z_0, \dots, z_{N-1})$ is the vector defined by
$$z_k := \begin{cases} \widehat{P}_k & \text{if } 0 \le k \le n,\\ 0 & \text{if } n < k < N - n,\\ \widehat{P}_{k-N} & \text{if } N - n \le k \le N - 1,\end{cases}$$
and we recall that $IDFT_N$ is the inverse of $DFT_N$, we have
$$P(x_j) = (IDFT_N\,z)_j,$$
i.e., the values of $P$ at the points $x_j$ are the $IDFT_N$ of the spectrum of $P$.
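4.52 Frequency spectrum. In applications, the $DFT_N$ and $IDFT_N$ may appear slightly differently. If $f$ is a $T_0$-periodic function, one lets $T := T_0/N$ be the period of sampling, so that $t_j := \frac{T_0}{N}j = jT$, $j = 0, 1, \dots, N-1$, are the sampling points and $DFT_N$ produces the sequence
$$c_k := \frac{1}{N}\sum_{j=0}^{N-1} f(jT)\,e^{-i\frac{2\pi}{N}jk}.$$
In other words, the values of $\{c_k\}$ are regarded as the values of the component of frequency $\nu_k := \frac{k}{T_0} = \frac{k}{NT}$, i.e., as the samples of the so-called frequency spectrum $\widehat{f} : \mathbb{R} \to \mathbb{C}$ of $f$, defined by
$$\widehat{f}(\nu) := \begin{cases} c_k & \text{if } \nu = \frac{k}{NT},\\ 0 & \text{otherwise.}\end{cases}$$
The discrete Fourier transform and its inverse then rewrite as
$$\widehat{f}\Big(\frac{k}{NT}\Big) = \frac{1}{N}\sum_{j=0}^{N-1} f(jT)\,e^{-i\frac{2\pi}{N}jk}, \qquad f(jT) = \sum_{k=0}^{N-1}\widehat{f}\Big(\frac{k}{NT}\Big)e^{i\frac{2\pi}{N}jk}.$$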
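4.2.3 Systems of difference equations
Linear difference equations of first and second order are discussed e.g., in [GM2]. Here we shall discuss systems of linear difference equations.

a. Systems of linear difference equations
First let us consider systems of first order. Let $\mathbf{A} \in M_{k,k}(\mathbb{C})$. The homogeneous linear recurrence for the sequence $\{X_n\}$ in $\mathbb{C}^k$
$$\begin{cases} X_{n+1} = \mathbf{A}X_n, & n \ge 0,\\ X_0 \text{ given}\end{cases}$$
has the unique solution $X_n := \mathbf{A}^n X_0$ $\forall n$, as one can easily check.

4.53 Proposition. Given $\{F_n\}$ in $\mathbb{C}^k$, the recurrence
$$\begin{cases} X_{n+1} = \mathbf{A}X_n + F_{n+1}, & n \ge 0,\\ X_0 \text{ given}\end{cases}$$
has the solution
$$X_n := \mathbf{A}^n X_0 + \sum_{j=0}^{n}\mathbf{A}^{n-j}F_j \qquad \forall n \ge 0,$$
where we assume $F_0 := 0$.

Proof. In fact, for $n \ge 0$ we have
$$X_{n+1} = \mathbf{A}^{n+1}X_0 + \sum_{j=0}^{n+1}\mathbf{A}^{n+1-j}F_j = \mathbf{A}\mathbf{A}^n X_0 + \mathbf{A}\sum_{j=0}^{n}\mathbf{A}^{n-j}F_j + F_{n+1} = \mathbf{A}\Big(\mathbf{A}^n X_0 + \sum_{j=0}^{n}\mathbf{A}^{n-j}F_j\Big) + F_{n+1} = \mathbf{A}X_n + F_{n+1}. \qquad \square$$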
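The closed formula of Proposition 4.53 can be compared with direct iteration of the recurrence on a small integer example, where all arithmetic is exact. The matrix, the initial datum and the forcing terms below are arbitrary choices made for illustration, as are the helper names.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def matpow_vec(A, p, v):
    """A^p v, computed by p successive matrix-vector products."""
    for _ in range(p):
        v = matvec(A, v)
    return v

def add(u, v):
    return [a + b for a, b in zip(u, v)]

A = [[0, 1], [1, 1]]                                 # hypothetical 2x2 matrix
X0 = [2, -1]
F = [[0, 0]] + [[j, j * j] for j in range(1, 9)]     # F_0 := 0, then F_1, ..., F_8

# Direct iteration of X_{n+1} = A X_n + F_{n+1}.
iterated = [X0]
for n in range(8):
    iterated.append(add(matvec(A, iterated[-1]), F[n + 1]))

def closed(n):
    """X_n = A^n X_0 + sum_{j=0}^{n} A^{n-j} F_j."""
    total = matpow_vec(A, n, X0)
    for j in range(n + 1):
        total = add(total, matpow_vec(A, n - j, F[j]))
    return total
```

For every `n` from 0 to 8 the value `closed(n)` coincides with `iterated[n]`.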
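4.54 Higher order linear difference equations. Every equation
$$x_{n+k} + a_{k-1}x_{n+k-1} + \dots + a_0 x_n = f_{n+1}, \qquad n \ge 0, \tag{4.18}$$
can be transformed into a $k \times k$ system of difference equations of first order. In fact, if
$$X_n := (x_n, x_{n+1}, \dots, x_{n+k-1})^T \in \mathbb{C}^k, \qquad F_n := (0, 0, \dots, 0, f_n)^T \in \mathbb{C}^k$$
and $\mathbf{A}$ is the $k \times k$ matrix
$$\mathbf{A} := \begin{pmatrix} 0 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & \cdots & 0\\ \vdots & & & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & 1\\ -a_0 & -a_1 & -a_2 & \cdots & -a_{k-1}\end{pmatrix}, \tag{4.19}$$
it is easily seen that
$$X_{n+1} = \mathbf{A}X_n + F_{n+1} \tag{4.20}$$
and conversely, if $\{X_n\}$ solves (4.20), then $\{x_n\}$, $x_n := X_n^1$ $\forall n$, solves (4.18). In this way the theory of higher order linear difference equations is subsumed to that of first order systems. In this respect, one computes for the matrix $\mathbf{A}$ in (4.19) that
$$\det(\lambda\,\mathrm{Id} - \mathbf{A}) = \lambda^k + \sum_{j=0}^{k-1} a_j\lambda^j.$$
This polynomial in $\lambda$ is called the characteristic polynomial of the difference equation (4.18).

b. Power of a matrix
Let us compute the power of $\mathbf{A}$ in an efficient way. To do this we remark the following.
(i) If $\mathbf{B}$ is similar to $\mathbf{A}$, $\mathbf{A} = \mathbf{S}^{-1}\mathbf{B}\mathbf{S}$ for some $\mathbf{S}$ with $\det\mathbf{S} \ne 0$, then $\mathbf{A}^2 = \mathbf{S}^{-1}\mathbf{B}\mathbf{S}\mathbf{S}^{-1}\mathbf{B}\mathbf{S} = \mathbf{S}^{-1}\mathbf{B}^2\mathbf{S}$ and, by induction,
$$\mathbf{A}^n = \mathbf{S}^{-1}\mathbf{B}^n\mathbf{S} \qquad \forall n.$$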
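(ii) If $\mathbf{B}$ is a block matrix with square blocks in the principal diagonal,
$$\mathbf{B} = \begin{pmatrix} \mathbf{B}_1 & 0 & \cdots & 0\\ 0 & \mathbf{B}_2 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \mathbf{B}_k\end{pmatrix}, \qquad\text{then}\qquad \mathbf{B}^n = \begin{pmatrix} \mathbf{B}_1^n & 0 & \cdots & 0\\ 0 & \mathbf{B}_2^n & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \mathbf{B}_k^n\end{pmatrix}.$$

Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of $\mathbf{A}$ with multiplicities $m_1, m_2, \dots, m_k$. For every $i$, let $p_i$ be the dimension of the eigenspace relative to $\lambda_i$ (the geometric multiplicity). Then, see Theorem 2.65, there exists a nonsingular matrix $\mathbf{S} \in M_{k,k}(\mathbb{C})$ such that $\mathbf{J} := \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$ has the Jordan form
$$\mathbf{J} = \begin{pmatrix} \mathbf{J}_{1,1} & 0 & \cdots & 0\\ 0 & \mathbf{J}_{1,2} & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \mathbf{J}_{k,p_k}\end{pmatrix},$$
where $i = 1, \dots, k$, $j = 1, \dots, p_i$ and
$$\mathbf{J}_{i,j} = \begin{cases} (\lambda_i) & \text{if } \mathbf{J}_{i,j} \text{ has dimension 1},\\[6pt] \begin{pmatrix} \lambda_i & 1 & 0 & \cdots & 0\\ 0 & \lambda_i & 1 & \cdots & 0\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda_i & 1\\ 0 & 0 & \cdots & 0 & \lambda_i\end{pmatrix} & \text{otherwise.}\end{cases}$$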
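Consequently $\mathbf{A}^n = \mathbf{S}\mathbf{J}^n\mathbf{S}^{-1}$, and
$$\mathbf{J}^n = \begin{pmatrix} \mathbf{J}_{1,1}^n & 0 & \cdots & 0\\ 0 & \mathbf{J}_{1,2}^n & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & \mathbf{J}_{k,p_k}^n\end{pmatrix}.$$
It remains to compute the power of each Jordan block. If $\mathbf{J}' = \mathbf{J}_{i,j} = (\lambda)$ has dimension one, then $\mathbf{J}'^n = \lambda^n$. If instead $\mathbf{J}' = \mathbf{J}_{i,j}$ is a block of dimension $q$ at least two,
$$\mathbf{J}' = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0\\ 0 & \lambda & 1 & \cdots & 0\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda & 1\\ 0 & 0 & \cdots & 0 & \lambda\end{pmatrix},$$
then
$$\mathbf{J}' = \lambda\,\mathrm{Id} + \mathbf{B}, \qquad B^i_j := \delta_{i+1,j}.$$
Since
$$(\mathbf{B}^r)^i_j = \begin{cases} \delta_{i+r,j} & \text{if } r < q,\\ 0 & \text{if } r \ge q,\end{cases}$$
we have $\mathbf{B}^q = 0$. Thus Newton's binomial formula yields
$$\mathbf{J}'^n = (\lambda\,\mathrm{Id} + \mathbf{B})^n = \sum_{j=0}^{n}\binom{n}{j}\lambda^{n-j}\mathbf{B}^j = \sum_{j=0}^{q-1}\binom{n}{j}\lambda^{n-j}\mathbf{B}^j,$$
with the convention $\binom{n}{j} = 0$ for $j > n$, i.e.,
$$\mathbf{J}'^n = \begin{pmatrix} \lambda^n & \binom{n}{1}\lambda^{n-1} & \binom{n}{2}\lambda^{n-2} & \cdots & \binom{n}{q-1}\lambda^{n-q+1}\\ 0 & \lambda^n & \binom{n}{1}\lambda^{n-1} & \cdots & \binom{n}{q-2}\lambda^{n-q+2}\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & 0 & \cdots & \lambda^n & \binom{n}{1}\lambda^{n-1}\\ 0 & 0 & \cdots & 0 & \lambda^n\end{pmatrix}.$$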
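The binomial formula for the power of a Jordan block is easy to test against repeated multiplication. In the sketch below the entry in row $i$ and column $j$ of the $n$th power is taken to be $\binom{n}{j-i}\lambda^{n-(j-i)}$ when $0 \le j - i \le n$ and zero otherwise; the values of $\lambda$, $q$, $n$ and the function names are arbitrary illustrations.

```python
from math import comb

def jordan_block(lam, q):
    """q x q block with lam on the diagonal and 1 on the superdiagonal."""
    return [[lam if i == j else 1 if j == i + 1 else 0 for j in range(q)]
            for i in range(q)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def power_by_formula(lam, q, n):
    """Entry (i, j) is C(n, j-i) lam^(n-(j-i)) for 0 <= j-i <= n, else 0."""
    return [[comb(n, j - i) * lam ** (n - (j - i)) if 0 <= j - i <= n else 0
             for j in range(q)] for i in range(q)]

lam, q, n = 3, 4, 6
J = jordan_block(lam, q)

Jn = J
for _ in range(n - 1):          # J^n by n - 1 multiplications
    Jn = matmul(Jn, J)
```

With integer entries the comparison `Jn == power_by_formula(lam, q, n)` is exact.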
Notice that each element of $\mathbf{A}^n = \mathbf{S}\mathbf{J}^n\mathbf{S}^{-1}$ has the form
$$\sum_{j=1}^{k}\lambda_j^n P_j(n),$$
where $\lambda_1, \lambda_2, \dots, \lambda_k$ are the eigenvalues of $\mathbf{A}$ and $P_j(t)$ is a polynomial of degree at most $p_j - 1$, where $p_j$ is the algebraic multiplicity of $\lambda_j$. It follows that for $\rho > \max_i(|\lambda_i|)$ there is a constant $C_\rho$ such that every solution of $X_{n+1} = \mathbf{A}X_n$ satisfies
$$|X_n| \le C_\rho\,\rho^n\,|X_0| \qquad \forall n.$$
In particular we have the following.

4.55 Theorem. If all eigenvalues of $\mathbf{A}$ have modulus less than one, then every solution of $X_{n+1} = \mathbf{A}X_n$ converges to zero as $n \to +\infty$.

Proof. Fix $\sigma > 0$ such that $\max_{i=1,\dots,k}|\lambda_i| < \sigma < 1$. As we have seen, there exists a constant $C_\sigma$ such that if $X_n$ is a solution of $X_{n+1} = \mathbf{A}X_n$ $\forall n$, then
$$|X_n| \le C_\sigma\,\sigma^n\,|X_0| \qquad \forall n.$$
Since $0 < \sigma < 1$, $\sigma^n \to 0$, and the claim is proved. □
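Theorem 4.55 does not say that the norms $|X_n|$ decrease monotonically: a large off-diagonal entry can make them grow at first, and only the geometric bound $C_\sigma\sigma^n$ holds for all $n$. The following sketch iterates a hypothetical upper triangular matrix with eigenvalues $0.5$ and $0.8$; the constants $C = 40$ and $\sigma = 0.9$ used in the comparison are chosen by hand for this particular example.

```python
def step(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def sup_norm(v):
    return max(abs(x) for x in v)

# Upper triangular, so the eigenvalues are the diagonal entries 0.5 and 0.8.
A = [[0.5, 10.0],
     [0.0, 0.8]]

X = [1.0, 1.0]
norms = [sup_norm(X)]
for _ in range(200):
    X = step(A, X)
    norms.append(sup_norm(X))

# Hand-picked constants for this example: C = 40, sigma = 0.9.
bounded = all(norms[n] <= 40 * 0.9 ** n for n in range(len(norms)))
```

The first step increases the norm from 1 to 10.5, yet after 200 steps the solution is numerically indistinguishable from zero.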
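4.56 Example (Fibonacci numbers). Consider the sequence of Fibonacci numbers
$$\begin{cases} f_{n+2} = f_{n+1} + f_n, & n \ge 0,\\ f_0 = 0,\ f_1 = 1,\end{cases} \tag{4.21}$$
that is given by
$$f_n = \frac{1}{\sqrt{5}}\bigg(\Big(\frac{1+\sqrt{5}}{2}\Big)^n - \Big(\frac{1-\sqrt{5}}{2}\Big)^n\bigg), \tag{4.22}$$
see e.g., [GM2]. Let us find it again as an application of the above. Set
$$F_n := \begin{pmatrix} f_n\\ f_{n+1}\end{pmatrix};$$
then
$$F_{n+1} = \begin{pmatrix} f_{n+1}\\ f_{n+2}\end{pmatrix} = \begin{pmatrix} f_{n+1}\\ f_n + f_{n+1}\end{pmatrix} = \mathbf{A}F_n \qquad\text{and}\qquad F_n = \mathbf{A}^n\begin{pmatrix} 0\\ 1\end{pmatrix}, \qquad\text{where}\qquad \mathbf{A} = \begin{pmatrix} 0 & 1\\ 1 & 1\end{pmatrix}.$$
The characteristic polynomial of $\mathbf{A}$ is $\det(\lambda\,\mathrm{Id} - \mathbf{A}) = \lambda(\lambda - 1) - 1$, hence $\mathbf{A}$ has two distinct eigenvalues
$$\lambda := \frac{1+\sqrt{5}}{2}, \qquad \mu := \frac{1-\sqrt{5}}{2}.$$
An eigenvector relative to $\lambda$ is $(1, \lambda)$ and an eigenvector relative to $\mu$ is $(1, \mu)$. The matrix $\mathbf{A}$ is diagonalizable as $\mathbf{A} = \mathbf{S}\Lambda\mathbf{S}^{-1}$ where
$$\mathbf{S} = \begin{pmatrix} 1 & 1\\ \lambda & \mu\end{pmatrix}, \qquad \mathbf{S}^{-1} = \frac{1}{\lambda - \mu}\begin{pmatrix} -\mu & 1\\ \lambda & -1\end{pmatrix}, \qquad \Lambda = \begin{pmatrix} \lambda & 0\\ 0 & \mu\end{pmatrix}.$$
It follows that
$$F_n = \mathbf{S}\Lambda^n\mathbf{S}^{-1}\begin{pmatrix} 0\\ 1\end{pmatrix} = \frac{1}{\lambda - \mu}\begin{pmatrix} 1 & 1\\ \lambda & \mu\end{pmatrix}\begin{pmatrix} \lambda^n\\ -\mu^n\end{pmatrix}.$$
Consequently,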
$$f_n = \frac{1}{\lambda - \mu}(\lambda^n - \mu^n) = \frac{1}{\sqrt{5}}\bigg(\Big(\frac{1+\sqrt{5}}{2}\Big)^n - \Big(\frac{1-\sqrt{5}}{2}\Big)^n\bigg).$$

4.2.4 An ODE system: small oscillations
Let $x_1, x_2, \dots, x_N$ be $N$ point masses in $\mathbb{R}^3$, each with, respectively, a nonzero mass $m_1, m_2, \dots, m_N$. Assume that each point exerts a force on the other points according to Hooke's law, i.e., the force exerted by the mass at $x_j$ on $x_i$ is proportional to the distance of $x_j$ from $x_i$ and directed along the line through $x_i$ and $x_j$,
$$f_{ij} := -k_{ij}(x_j - x_i).$$
By Newton's reaction law, the force exerted by $x_i$ on $x_j$ is equal and opposite in direction, $f_{ji} = -f_{ij}$; consequently the elastic constants $k_{ij}$, $i \ne j$, satisfy the symmetry condition $k_{ij} = k_{ji}$. In conclusion, the total force exerted by the system on the mass at $x_i$ is
$$f_i := \sum_{\substack{j=1,N\\ j \ne i}} f_{ij} = -\sum_{\substack{j=1,N\\ j \ne i}} k_{ij}x_j + \Big(\sum_{\substack{j=1,N\\ j \ne i}} k_{ij}\Big)x_i = -\sum_{j=1}^{N} k_{ij}x_j,$$
where we set $k_{ii} := -\sum_{j=1,N,\ j \ne i} k_{ij}$. Newton's equation then takes the form
$$m_i x_i'' - f_i = 0, \qquad i = 1, \dots, N, \tag{4.23}$$
with the particularity that the $j$th component of the force depends only on the $j$th components of the positions of the masses. The system then splits into 3 systems of $N$ equations of second order, one for each coordinate. If we use the matrix notation, things simplify. Denote by
$$\mathbf{M} := \operatorname{diag}\{m_1, m_2, \dots, m_N\}$$
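the positive diagonal matrix of masses, by $\mathbf{K} := (k_{ij}) \in M_{N,N}(\mathbb{R})$ the symmetric matrix of elastic constants, and by $X^j(t) \in \mathbb{R}^N$ the $j$th coordinates of the points $x_1, \dots, x_N$,
$$X^j = (x_1^j, \dots, x_N^j), \qquad x_i =: (x_i^1, x_i^2, x_i^3),$$
i.e., the columns of the matrix
$$\mathbf{X}(t) := [x_1(t)\,|\,x_2(t)\,|\,\dots\,|\,x_N(t)]^T \in M_{N,3}(\mathbb{R}).$$
Then (4.23) transforms into the system of equations
$$\mathbf{M}\mathbf{X}''(t) + \mathbf{K}\mathbf{X}(t) = 0, \tag{4.24}$$
where the product is the product rows by columns and $\mathbf{X}''(t)$ denotes the matrix of second derivatives of the entries of $\mathbf{X}(t)$. Finally, the system (4.23) can be written as
$$\mathbf{X}''(t) + \mathbf{M}^{-1}\mathbf{K}\mathbf{X}(t) = 0, \tag{4.25}$$
in the unknown $\mathbf{X} : \mathbb{R} \to M_{N,3}(\mathbb{R})$. Since $\mathbf{M}^{-1}\mathbf{K}$ is symmetric, there is an orthonormal basis $u_1, u_2, \dots, u_N$ of $\mathbb{R}^N$ made of eigenvectors of $\mathbf{M}^{-1}\mathbf{K}$ and real numbers $\lambda_1, \lambda_2, \dots, \lambda_N$ such that
$$(u_i|u_j) = \delta_{ij} \qquad\text{and}\qquad \mathbf{M}^{-1}\mathbf{K}u_j = \lambda_j u_j;$$
notice that $u_1, u_2, \dots, u_N$ are pairwise orthonormal vectors since $\mathbf{M}$ is diagonal. Denoting by $P_j$ the projection operator onto $\operatorname{Span}\{u_j\}$ we also have
$$\mathrm{Id} = \sum_{j=1}^{N} P_j, \qquad \mathbf{M}^{-1}\mathbf{K} = \sum_{j=1}^{N}\lambda_j P_j.$$
Thus, projecting (4.25) onto $\operatorname{Span}\{u_j\}$ we find
$$0 = P_j(0) = P_j(\mathbf{X}'' + \mathbf{M}^{-1}\mathbf{K}\mathbf{X}) = (P_j\mathbf{X})'' + \lambda_j(P_j\mathbf{X}), \qquad \forall j = 1, \dots, N,$$
i.e., the system (4.25) splits into $N$ second order equations, each in the unknowns of the matrix $P_j\mathbf{X}(t)$. Since $\mathbf{K}$ is positive, the eigenvalues are positive; consequently each element of the matrix $P_j\mathbf{X}(t)$ is a solution of the harmonic oscillator
$$y'' + \lambda_j y = 0,$$
hence
$$P_j\mathbf{X}(t) = \cos(\sqrt{\lambda_j}\,t)\,P_j\mathbf{X}(0) + \frac{\sin(\sqrt{\lambda_j}\,t)}{\sqrt{\lambda_j}}\,P_j\mathbf{X}'(0).$$
In conclusion, since $\mathrm{Id} = \sum_{j=1}^{N} P_j$, we have
$$\mathbf{X}(t) = \sum_{j=1}^{N} P_j\mathbf{X}(t) = \sum_{j=1}^{N}\Big(\cos(\sqrt{\lambda_j}\,t)\,P_j\mathbf{X}(0) + \frac{\sin(\sqrt{\lambda_j}\,t)}{\sqrt{\lambda_j}}\,P_j\mathbf{X}'(0)\Big). \tag{4.26}$$
The numbers $\sqrt{\lambda_1}/(2\pi), \dots, \sqrt{\lambda_N}/(2\pi)$ are called the proper frequencies of the system. We may also use a functional notation
$$\frac{\sin(t\sqrt{\mathbf{A}})}{\sqrt{\mathbf{A}}} := \sum_{n=0}^{\infty}(-1)^n\frac{t^{2n+1}}{(2n+1)!}\mathbf{A}^n, \qquad \cos(t\sqrt{\mathbf{A}}) := \sum_{n=0}^{\infty}(-1)^n\frac{t^{2n}}{(2n)!}\mathbf{A}^n,$$
and we can write (4.26) as
$$\mathbf{X}(t) = \cos(t\sqrt{\mathbf{A}})\,\mathbf{X}(0) + \frac{\sin(t\sqrt{\mathbf{A}})}{\sqrt{\mathbf{A}}}\,\mathbf{X}'(0),$$
where $\mathbf{A} := \mathbf{M}^{-1}\mathbf{K}$.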
4.3 Exercises

4.57 ¶. Let $\mathbf{A}$ be an $n \times n$ matrix and let $\lambda$ be its eigenvalue of greatest modulus. Show that $|\lambda| \le \sup_i(|a^i_1| + |a^i_2| + \dots + |a^i_n|)$.

4.58 ¶ Gram matrix. Let $\{f_1, f_2, \dots, f_m\}$ be $m$ vectors in $\mathbb{R}^n$. The matrix $\mathbf{G} = [g_{ij}] \in M_{m,m}(\mathbb{R})$ defined by $g_{ij} = (f_i|f_j)$ is called Gram's matrix. Show that $\mathbf{G}$ is nonnegative and it is positive if and only if $f_1, f_2, \dots, f_m$ are linearly independent.

4.59 ¶. Let $A, B : \mathbb{C}^n \to \mathbb{C}^n$ be self-adjoint and let $A$ be positive. Show that the eigenvalues of $A^{-1}B$ are real. Show also that $A^{-1}B$ is positive if $B$ is positive.

4.60 ¶. Let $\mathbf{A} = [a^i_j] \in M_{n,n}(\mathbb{K})$ be self-adjoint and positive. Show that $\det\mathbf{A} \le (\operatorname{tr}\mathbf{A}/n)^n$ and deduce $\det\mathbf{A} \le \prod_{i=1}^{n} a^i_i$. [Hint: Use the inequality between geometric and arithmetic means, see [GM1].]

4.61 ¶. Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$ and let $a_1, a_2, \dots, a_n \in \mathbb{K}^n$ be the columns of $\mathbf{A}$. Prove Hadamard's formula $|\det\mathbf{A}| \le \prod_{i=1}^{n}|a_i|$. [Hint: Consider $\mathbf{H} = \mathbf{A}^*\mathbf{A}$.]

4.62 ¶. Let $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ be symmetric and suppose that $\mathbf{A}$ is positive. Then the number of positive, negative and zero eigenvalues, counted with their multiplicity, of $\mathbf{A}\mathbf{B}$ and of $\mathbf{B}$ coincide.

4.63 ¶. Show that $\|N^*N\| = \|N\|^2$ if $N$ is normal.
4.64 ¶ Discrete Fourier transform. Let $T : \mathbb{C}^N \to \mathbb{C}^N$ be the cycling forward shifting operator $T((z_0, z_1, \dots, z_{N-1})) := (z_1, z_2, \dots, z_{N-1}, z_0)$. Show that
(i) $T$ is self-adjoint,
(ii) the $N$ eigenvalues of $T$ are the $N$th roots of 1,
(iii) the vectors
$$u_k := \frac{1}{\sqrt{N}}\big(1, \omega^k, \omega^{2k}, \dots, \omega^{k(N-1)}\big), \qquad \omega := e^{i\frac{2\pi}{N}},\ k = 0, \dots, N-1,$$
form an orthonormal basis of $\mathbb{C}^N$ of eigenvectors of $T$; finally the cosine directions $(z|u_k)$ of $z \in \mathbb{C}^N$ with respect to the basis $(u_0, \dots, u_{N-1})$ are given by the Discrete Fourier transform of $z$.

4.65 ¶. Let $A, B : X \to X$ be two self-adjoint operators on a Euclidean or Hermitian space. Suppose that all eigenvalues of $A - B$ are strictly positive. Order the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of $A$ and $\mu_1, \mu_2, \dots, \mu_n$ of $B$ in a nondecreasing order. Show that $\lambda_i > \mu_i$ $\forall i = 1, \dots, n$. [Hint: Use the variational characterization of the eigenvalues.]

4.66 ¶. Let $A : X \to X$ be self-adjoint on a Euclidean or Hermitian space. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ and $\mu_1, \mu_2, \dots, \mu_n$ be, respectively, the eigenvalues and the singular values of $A$, that we think of as ordered as $|\lambda_1| \le |\lambda_2| \le \dots \le |\lambda_n|$ and $\mu_1 \le \mu_2 \le \dots \le \mu_n$. Show that $|\lambda_i| = \mu_i$ $\forall i = 1, \dots, n$. [Hint: $A^*A = A^2$.]

4.67 ¶. Let $A : X \to X$ be a linear operator on a Euclidean or Hermitian space. Let $m, M$ be respectively the smallest and the greatest singular value of $A$. Show that $m \le |\lambda| \le M$ for any eigenvalue $\lambda$ of $A$.

4.68 ¶. Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces. Show that
(i) $(A^*A)^{1/2}$ maps $\ker A$ to $\{0\}$,
(ii) $(A^*A)^{1/2}$ is an isomorphism from $\ker A^\perp$ onto itself,
(iii) $(AA^*)^{1/2}$ is an isomorphism from $\operatorname{Im} A$ onto itself.

4.69 ¶. Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces. Let $(u_1, u_2, \dots, u_n)$ and $\mu_1, \mu_2, \dots, \mu_n$, $\mu_i \ge 0$, be such that $(u_1, u_2, \dots, u_n)$ is an orthonormal basis of $X$ and $(A^*A)^{1/2}x = \sum_i \mu_i(x|u_i)u_i$. Show that
(i) $(AA^*)^{1/2}y = \sum_{\mu_i \ne 0}\frac{1}{\mu_i}(y|Au_i)Au_i$ $\forall y \in Y$,
(ii) if $B$ denotes the restriction of $(A^*A)^{1/2}$ to $\ker A^\perp$, see Exercise 4.68, then
$$B^{-1}x = \sum_{\mu_i \ne 0}\frac{1}{\mu_i}(x|u_i)u_i \qquad \forall x \in \ker A^\perp,$$
(iii) if $C$ denotes the restriction of $(AA^*)^{1/2}$ to $\operatorname{Im} A$, see Exercise 4.68, then
$$C^{-1}y = \sum_{\mu_i \ne 0}\frac{1}{\mu_i^3}(y|Au_i)Au_i \qquad \forall y \in \operatorname{Im} A.$$

4.70 ¶. Let $\mathbf{A} \in M_{N,n}(\mathbb{K})$, $N \ge n$, with $\operatorname{Rank}\mathbf{A} = n$. Select $n$ vectors $u_1, u_2, \dots, u_n \in \mathbb{K}^n$ such that $\mathbf{A}u_1, \dots, \mathbf{A}u_n \in \mathbb{K}^N$ are orthonormal. [Hint: Find $\mathbf{U} \in M_{n,n}(\mathbb{K})$ such that $\mathbf{A}\mathbf{U}$ is an isometry.]
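4.71 ¶. Let $\mathbf{A} \in M_{N,n}(\mathbb{R})$ and $\mathbf{A} = \mathbf{U}\Delta\mathbf{V}$, where $\mathbf{U} \in O(N)$, $\mathbf{V} \in O(n)$. According to 4.39, show that $\mathbf{A}^\dagger = \mathbf{V}^T\Delta'\mathbf{U}^T$ where
$$\Delta' = \begin{pmatrix} \frac{1}{\mu_1} & 0 & \cdots & 0 & \cdots & 0\\ 0 & \frac{1}{\mu_2} & \cdots & 0 & \cdots & 0\\ \vdots & & \ddots & & & \vdots\\ 0 & 0 & \cdots & \frac{1}{\mu_k} & \cdots & 0\\ \vdots & & & & \ddots & \vdots\\ 0 & 0 & \cdots & 0 & \cdots & 0\end{pmatrix},$$
$\mu_1, \mu_2, \dots, \mu_k$ being the nonzero singular values of $\mathbf{A}$.

4.72 ¶. For $u : \mathbb{R} \to \mathbb{R}^2$, discuss the system of equations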
(Pu —-+ dt
2
. V-l
-l'
u = 0.
4.73 ¶. Let $\mathbf{A} \in M_{n,n}(\mathbb{R})$ be a symmetric matrix. Discuss the following systems of ODEs
$$x'(t) + \mathbf{A}x(t) = 0,$$
$$-i\,x'(t) + \mathbf{A}x(t) = 0,$$
$$x''(t) + \mathbf{A}x(t) = 0, \quad\text{where } \mathbf{A} \text{ is positive definite,}$$
and show that the solutions are given respectively by
$$e^{-t\mathbf{A}}x(0), \qquad e^{-it\mathbf{A}}x(0), \qquad \cos(t\sqrt{\mathbf{A}})\,x(0) + \frac{\sin(t\sqrt{\mathbf{A}})}{\sqrt{\mathbf{A}}}\,x'(0).$$

4.74 ¶. Let $\mathbf{A}$ be symmetric. Show that for the solutions of $x''(t) + \mathbf{A}x(t) = 0$ the energy is conserved. Assuming $\mathbf{A}$ positive, show that $|x(t)| \le E/\lambda_1$ where $E$ is the energy of $x(t)$ and $\lambda_1$ the smallest eigenvalue of $\mathbf{A}$.

4.75 ¶. Let $\mathbf{A}$ be a Hermitian matrix. Show that $|x(t)| = \text{const}$ if $x(t)$ solves the Schrödinger equation $i\,x' + \mathbf{A}x = 0$.
Part II
Metrics and Topology
Felix Hausdorff (1869-1942), Maurice Fréchet (1878-1973) and René-Louis Baire (1874-1932).
5. Metric Spaces and Continuous Functions
The rethinking process of infinitesimal calculus, that was started with the definition of the limit of a sequence by Bernhard Bolzano (1781-1848) and Augustin-Louis Cauchy (1789-1857) at the beginning of the XIX century and was carried on with the introduction of the system of real numbers by Richard Dedekind (1831-1916) and Georg Cantor (1845-1918) and of the system of complex numbers with the parallel development of the theory of functions by Camille Jordan (1838-1922), Karl Weierstrass (1815-1897), J. Henri Poincaré (1854-1912), G. F. Bernhard Riemann (1826-1866), Jacques Hadamard (1865-1963), Émile Borel (1871-1956), René-Louis Baire (1874-1932), Henri Lebesgue (1875-1941) during the whole of the XIX and beginning of the XX century, led to the introduction of new concepts such as open and closed sets, the point of accumulation and the compact set. These notions found their natural setting and their correct generalization in the notion of a metric space, introduced by Maurice Fréchet (1878-1973) in 1906 and eventually developed by Felix Hausdorff (1869-1942) together with the more general notion of topological space.

The intuitive notion of a "continuous function" probably dates back to the classical age. It corresponds to the notion of deformation without "tearing". A function from $X$ to $Y$ is more or less considered to be continuous if, when $x$ varies slightly, the target point $y = f(x)$ also varies slightly. The critical analysis of this intuitive idea also led, with Bernhard Bolzano (1781-1848) and Augustin-Louis Cauchy (1789-1857), to the correct definition of continuity and the limit of a function and to the study of the properties of continuous functions. We owe the theorem of intermediate values to Bolzano and Cauchy; around 1860 Karl Weierstrass proved that continuous functions attain maximum and minimum values on a closed and bounded interval, and in 1870 Eduard Heine (1821-1881) studied uniform continuity.

The notion of a continuous function also appears in the work of J. Henri Poincaré (1854-1912) in an apparently totally different context, in the so-called analysis situs, which is today's topology and algebraic topology. For Henri Poincaré, analysis situs is the science that enables us to know the qualitative properties of geometrical figures. Poincaré referred to the properties that are preserved when geometrical figures undergo any kind of deformation except those that introduce tearing and glueing of points. An intuitive idea for some of these aspects may be provided by the following examples.
Figure 5.1. Frontispieces of Les espaces abstraits by Maurice Frechet (1878-1973) and of the Mengenlehre by Felix Hausdorff (1869-1942).
o Let us draw a disc on a rubber sheet. No matter how one pulls at the rubber sheet, without tearing it, the disc stays whole. Similarly, if one draws a ring, any way one pulls the rubber sheet without tearing or glueing any points, the central hole is preserved. Let us think of a loop of string that surrounds an infinite pole. In order to separate the string from the pole one has to break one of the two. Even more, if the string is wrapped several times around the pole, the linking number between string and pole is constant, regardless of the shape of the coils.
o We have already seen Euler's formula for polyhedra in [GM1]. It is a remarkable formula whose context is not classical geometry. It was Poincaré who extended it to all surfaces of the type of the sphere, i.e., surfaces that can be obtained as continuous deformations of a sphere without tearing or glueing.
o $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$ are clearly different objects as linear vector spaces. As we have seen, they have the same cardinality and are thus indistinguishable as sets. Therefore it is impossible to give meaning to the concept of dimension if one stays inside the theory of sets. One can show, instead, that their algebraic dimension is preserved by deformations without tearing or glueing.

At the core of this analysis of geometrical figures we have the notion of a continuous deformation, which corresponds to the notion of a continuous one-to-one map whose inverse is also continuous, called a homeomorphism. We have already discussed some relevant properties of continuous functions $f : \mathbb{R} \to \mathbb{R}$ and $f : \mathbb{R}^2 \to \mathbb{R}$ in [GM1] and [GM2]. Here we shall discuss continuity in a sufficiently general context, though not in the most general.
Poincaré himself was convinced of the enormous importance of extending the methods and ideas of his analysis situs to more than three dimensions.

... L'analysis situs à plus de trois dimensions présente des difficultés énormes; il faut pour tenter de les surmonter être bien persuadé de l'extrême importance de cette science. Si cette importance n'est pas bien comprise de tout le monde, c'est que tout le monde n'y a pas suffisamment réfléchi.¹

In the first twenty years of this century with the contribution, among others, of David Hilbert (1862-1943), Maurice Fréchet (1878-1973), Felix Hausdorff (1869-1942), Pavel Alexandroff (1896-1982) and Pavel Urysohn (1898-1924), the fundamental role of the notion of an open set in the study of continuity was made clear, and general topology was developed as the study of the properties of geometrical figures that are invariant with respect to homeomorphisms, thus linking back to Euler who, in 1739, had solved the famous problem of Königsberg's bridges with a topological method. There are innumerable successive applications, so much so that continuity and the structures related to it have become one of the most pervasive languages of mathematics. In this chapter and in the next, we shall discuss topological notions and continuity in the context of metric spaces.

5.1 Metric Spaces
5.1.1 Basic definitions
a. Metrics
5.1 Definition. Let $X$ be a set. A distance or metric on $X$ is a map $d : X \times X \to \mathbb{R}_+$ for which the following conditions hold:
(i) (IDENTITY) $d(x, y) > 0$ if $x \ne y \in X$, and $d(x, x) = 0$ $\forall x \in X$.
(ii) (SYMMETRY) $d(x, y) = d(y, x)$ $\forall x, y \in X$.
(iii) (TRIANGLE INEQUALITY) $d(x, y) \le d(x, z) + d(z, y)$ $\forall x, y, z \in X$.
A metric space is a set $X$ with a distance $d$. Formally we say that $(X, d)$ is a metric space if $X$ is a set and $d$ is a distance on $X$. The properties (i), (ii) and (iii) are often called metric axioms.

¹ The analysis situs in more than three dimensions presents enormous difficulties; in order to overcome them one has to be strongly convinced of the extreme importance of this science. If its importance is not well understood by everyone, it is because they have not sufficiently thought about it.
Figure 5.2. Time as distance.
5.2 Example. The Euclidean distance $d(x, y) := |x - y|$, $x, y \in \mathbb{R}$, is a distance on $\mathbb{R}$. On $\mathbb{R}^2$ and $\mathbb{R}^3$ distances are defined by the Euclidean distance, given for $n = 2, 3$ by
$$d(x, y) := \Big(\sum_{i=1}^{n}(x_i - y_i)^2\Big)^{1/2},$$
where $x := (x_1, x_2)$, $y := (y_1, y_2)$ if $n = 2$, or $x := (x_1, x_2, x_3)$, $y := (y_1, y_2, y_3)$ if $n = 3$. In other words, $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$ are metric spaces with the Euclidean distance.

5.3 Example. Imagine $\mathbb{R}^3$ as a union of strips $E_n := \{(x_1, x_2, x_3) \mid n \le x_3 < n + 1\}$, made of materials of different indices of refraction $\nu_n$. The time $t(A, B)$ needed for a light ray to go from $A$ to $B$ in $\mathbb{R}^3$ defines a distance on $\mathbb{R}^3$, see Figure 5.2.
5.4 E x a m p l e . In the infinite cylinder C = {{x,y,z)\x'^ -\-y"^ = 1} c R^, we may define a distance between two points P and Q as the minimal length of the line on C, or geodesic, connecting P and Q. Observe that we can always cut the cylinder along a directrix in such a way that the curve is not touched. If we unfold the cut cylinder to a plane, the distance between P and Q is the Euclidean distance of the two image points.
5.5 ¶. Of course $100\,|x - y|$ is also a distance on $\mathbb{R}$, only the scale factor has changed. More generally, if $f : \mathbb{R} \to \mathbb{R}$ is an injective map, then $d(x, y) := |f(x) - f(y)|$ is again a distance on $\mathbb{R}$.
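The role of injectivity can be explored numerically. The sketch below tests the three metric axioms of Definition 5.1 on a finite set of sample points for distances of the form $|f(x) - f(y)|$; such a finite test can only refute the axioms, never prove them, and the sample points, the tolerance and the function names are arbitrary choices.

```python
import math
from itertools import product

def make_distance(f):
    """d(x, y) := |f(x) - f(y)|."""
    return lambda x, y: abs(f(x) - f(y))

def satisfies_metric_axioms(d, points, tol=1e-12):
    """Check identity, symmetry and triangle inequality on a finite sample."""
    for x, y, z in product(points, repeat=3):
        if x != y and not d(x, y) > 0:           # identity, first half
            return False
        if d(x, x) != 0:                         # identity, second half
            return False
        if abs(d(x, y) - d(y, x)) > tol:         # symmetry
            return False
        if d(x, y) > d(x, z) + d(z, y) + tol:    # triangle inequality
            return False
    return True

points = [-5.0, -1.0, 0.0, 1.0, 2.0, 40.0]

d_scaled = make_distance(lambda x: 100 * x)   # the distance 100|x - y|
d_atan = make_distance(math.atan)             # atan is injective on R
d_square = make_distance(lambda x: x * x)     # x -> x^2 is not injective on R
```

The first two candidates pass the test on the sample, while `d_square` fails the identity axiom because it assigns distance zero to the distinct points $-1$ and $1$.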
5.6 Definition. Let $(X, d)$ be a metric space. The open ball or spherical open neighborhood centered at $x_0 \in X$ of radius $\rho > 0$ is the set
$$B(x_0, \rho) := \{x \in X \mid d(x, x_0) < \rho\}.$$
Figure 5.3. Metrics on a cylinder and on the boundary of a cube.
Notice the strict inequality in the definition of $B(x, r)$. In $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$ with the Euclidean metric, $B(x_0, r)$ is, respectively, the open interval $]x_0 - r, x_0 + r[$, the open disc of center $x_0 \in \mathbb{R}^2$ and radius $r > 0$, and the ball bounded by the sphere of $\mathbb{R}^3$ of center $x_0 \in \mathbb{R}^3$ and radius $r > 0$. We say that a subset $E \subset X$ of a metric space is bounded if it is contained in some open ball. The diameter of $E \subset X$ is given by
$$\operatorname{diam} E := \sup\{d(x, y) \mid x, y \in E\},$$
and, trivially, $E$ is bounded iff $\operatorname{diam} E < +\infty$. Despite the suggestive language, the open balls of a metric space need not be either round or convex; however they have some of the usual properties of discs in $\mathbb{R}^2$. For instance
(i) $B(x_0, r) \subset B(x_0, s)$ $\forall x_0 \in X$ and $0 < r < s$,
(ii) $\cup_{r>0} B(x_0, r) = X$ $\forall x_0 \in X$,
(iii) $\cap_{r>0} B(x_0, r) = \{x_0\}$ $\forall x_0 \in X$,
(iv) $\forall x_0 \in X$ and $\forall z \in B(x_0, r)$ the open ball centered at $z$ and radius $\rho := r - d(x_0, z) > 0$ is contained in $B(x_0, r)$,
(v) for every couple of balls $B(x, r)$ and $B(y, s)$ with a nonvoid intersection and $\forall z \in B(x, r) \cap B(y, s)$, there exists $t > 0$ such that $B(z, t) \subset B(x, r) \cap B(y, s)$, in fact $t := \min(r - d(x, z), s - d(y, z))$,
(vi) for every $x, y \in X$ with $x \ne y$ the balls $B(x, r_1)$ and $B(y, r_2)$ are disjoint if $r_1 + r_2 \le d(x, y)$.

5.7 ¶. Prove the previous claims. Notice how essential the strict inequality in the definition of $B(x_0, \rho)$ is.
b. Convergence
A distance $d$ on a set $X$ allows us to define the notion of convergent sequence in $X$ in a natural way.

5.8 Definition. Let $(X, d)$ be a metric space. We say that the sequence $\{x_n\} \subset X$ converges to $x \in X$, and we write $x_n \to x$, if $d(x_n, x) \to 0$ in $\mathbb{R}$, that is, if for any $r > 0$ there exists $\overline{n}$ such that $d(x_n, x) < r$ for all $n \ge \overline{n}$.

The metric axioms at once yield that the basic facts we know for limits of sequences of real numbers also hold for limits of sequences in an arbitrary metric space. We have
(i) the limit of a sequence $\{x_n\}$ is unique, if it exists,
(ii) if $\{x_n\}$ converges, then $\{x_n\}$ is bounded,
(iii) computing the limit of $\{x_n\}$ consists in having a candidate $x \in X$ and then showing that the sequence of nonnegative real numbers $\{d(x_n, x)\}$ converges to zero,
(iv) if $x_n \to x$, then any subsequence of $\{x_n\}$ has the same limit $x$.
5. Metric Spaces and Continuous Functions
Thus, the choice of a distance on a given set X suffices to pass to the limit in X (in the sense specified by the metric d). However, given a set X, there is no distance on X that is reasonably absolute (even in R), but we may consider different distances in X. The corresponding convergences have different meanings and can be suited to treat specific problems. They all use the same general language, but the exact meaning of what convergence means is hidden in the definition of the distance. This flexibility makes the language of metric spaces useful in a large context.
5.1.2 Examples of metric spaces
Relevant examples of distances are provided by linear vector spaces on the fields $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$ in which we have defined a norm.
5.9 Definition. Let $X$ be a linear space over $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$. A norm on $X$ is a function $\|\cdot\| : X\to\mathbb{R}_+$ satisfying the following properties
(i) (FINITENESS) $\|x\|\in\mathbb{R}$ $\forall x\in X$.
(ii) (IDENTITY) $\|x\|\ge 0$ and $\|x\|=0$ if and only if $x=0$.
(iii) (1-HOMOGENEITY) $\|\lambda x\|=|\lambda|\,\|x\|$ $\forall x\in X$, $\forall\lambda\in\mathbb{K}$.
(iv) (TRIANGLE INEQUALITY) $\|x+y\|\le\|x\|+\|y\|$ $\forall x,y\in X$.
If $\|\cdot\|$ is a norm on $X$, we say that $(X,\|\cdot\|)$ is a linear normed space or simply that $X$ is a normed space with norm $\|\cdot\|$. Let $X$ be a linear space with norm $\|\cdot\|$. It is easy to show that the function $d : X\times X\to\mathbb{R}_+$ given by
\[ d(x,y) := \|x-y\|, \qquad x,y\in X, \]
satisfies the metric axioms, hence defines a distance on $X$, called the natural distance in the normed space $(X,\|\cdot\|)$. Obviously, such a distance is translation invariant, i.e., $d(x+z,y+z)=d(x,y)$ $\forall x,y,z\in X$.
Trivial examples of metric spaces are provided by the nonempty subsets of a metric space. If $A$ is a subset of a metric space $(X,d)$, then the restriction of $d$ to $A\times A$ is, trivially, a distance on $A$. We say that $A$ is a metric space with the induced distance from $X$.
5.10 ¶. For instance, the cylinder $C:=\{(x,y,z)\in\mathbb{R}^3 \mid x^2+y^2=1\}$ is a metric space with the Euclidean distance that, for $x,y\in C$, yields $d(x,y):=$ length of the chord joining $x$ and $y$. The geodesic distance $d_g$ of Example 5.4, that is, the length of the shortest path in $C$ between $x$ and $y$, defines another distance. $C$ with the geodesic distance $d_g$ has to be considered as a metric space different from $C$ with the Euclidean distance. A simple calculation shows that $\|x-y\|$ …
We shall now illustrate a few examples of metric spaces.
5.1 Metric Spaces
Figure 5.4. The ball centered at (0,0) of radius 1 in R^ respectively, for the metrics d\, di.3, d2 and d^. The unit ball centered at (0,0) of radius one for the metric doo is the square ] — 1, l[x] — 1,1[.
a. Metrics on finite-dimensional vector spaces
5.11 ¶. As we have already seen, $\mathbb{R}^n$ with the Euclidean distance $|x-y|$ is a metric space. More generally, any Euclidean or Hermitian vector space $X$ is a normed space with norm given by
\[ \|x\| := \sqrt{(x|x)}, \]
cf. Chapter 3. $X$ is therefore a metric space with the induced distance $d(x,y):=\|x-y\|$, called the Euclidean distance of $X$.
5.12 ¶ $\infty$-distance. Set for $x=(x_1,x_2,\dots,x_n)\in\mathbb{R}^n$
\[ \|x\|_\infty := \max(|x_1|,|x_2|,\dots,|x_n|). \]
Show that $x\to\|x\|_\infty$ is a norm on $\mathbb{R}^n$. Hence, $\mathbb{R}^n$, equipped with the distance $d_\infty(x,y):=\|x-y\|_\infty$, is a metric space different from the standard $\mathbb{R}^n$ with the Euclidean distance of Exercise 5.11. In $\mathbb{R}^2$, the ball centered at $(0,0)$ of radius one for the metric $d_\infty$ is the square $]-1,1[\,\times\,]-1,1[$, see Figure 5.4.
5.13 ¶ $p$-distance. Given a real number $p\ge 1$, we set for $x\in\mathbb{R}^n$
\[ \|x\|_p := \Big(\sum_{i=1}^n |x_i|^p\Big)^{1/p}. \]
Show that $\|x\|_p$ is a norm on $\mathbb{R}^n$, hence $d_p(x,y):=\|x-y\|_p$ is a distance on $\mathbb{R}^n$. Observe that $\|\cdot\|_2$ and $d_2$ are the Euclidean norm and distance in $\mathbb{R}^n$. In $\mathbb{R}^2$, the ball centered at $(0,0)$ of radius one for the metric $d_p$ for some values of $p$ is shown in Figure 5.4. [Hint: The triangle inequality for the $p$-norm is called Minkowski's discrete inequality
\[ \|a+b\|_p \le \|a\|_p+\|b\|_p, \qquad \forall a:=(a_1,a_2,\dots,a_n),\ b:=(b_1,b_2,\dots,b_n), \]
which follows for instance from Minkowski's inequality for integral norms, see [GM1]. Alternatively, we can proceed as follows. Suppose $a$ and $b$ are nonzero, otherwise the inequality is trivial; apply the convexity inequality $f(\lambda x+(1-\lambda)y)\le\lambda f(x)+(1-\lambda)f(y)$ to $f(t)=t^p$ with $x:=|a_i|/\|a\|_p$, $y:=|b_i|/\|b\|_p$, $\lambda:=\|a\|_p/(\|a\|_p+\|b\|_p)$, and sum on $i$ from 1 to $n$, to get
\[ \Big(\frac{\|a+b\|_p}{\|a\|_p+\|b\|_p}\Big)^p \le \sum_{i=1}^n\Big(\frac{|a_i|+|b_i|}{\|a\|_p+\|b\|_p}\Big)^p \le 1.] \]
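The norms of Exercises 5.12 and 5.13 are easy to experiment with numerically. The following is a minimal sketch, not part of the original text: the function names `p_norm`, `p_dist` and `check_minkowski` are illustrative choices, and the random test only gives evidence for Minkowski's inequality, it is not a proof.

```python
import math
import random

def p_norm(x, p):
    """p-norm of a finite sequence x; p = math.inf gives the max-norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def p_dist(x, y, p):
    """Distance d_p(x, y) = ||x - y||_p."""
    return p_norm([a - b for a, b in zip(x, y)], p)

def check_minkowski(dim=5, trials=1000, seed=0):
    """Test ||a + b||_p <= ||a||_p + ||b||_p on random vectors (up to rounding)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(-1, 1) for _ in range(dim)]
        b = [rng.uniform(-1, 1) for _ in range(dim)]
        s = [u + v for u, v in zip(a, b)]
        for p in (1, 1.3, 2, 7, math.inf):
            if p_norm(s, p) > p_norm(a, p) + p_norm(b, p) + 1e-12:
                return False
    return True
```

For large $p$ the value `p_norm(x, p)` approaches `p_norm(x, math.inf)`, which is one way to see why the square of Figure 5.4 is the limiting shape of the $d_p$ balls.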
5.14 ¶ Product spaces. Let $(X_1,d^{(1)})$, $(X_2,d^{(2)})$, ..., $(X_n,d^{(n)})$ be $n$ metric spaces and let $Y=X_1\times X_2\times\dots\times X_n$ be the Cartesian product of $X_1,\dots,X_n$. Show that each of the functions defined on $Y\times Y$ by
\[ d_p(x,y) := \Big(\sum_{i=1}^n d^{(i)}(x^i,y^i)^p\Big)^{1/p} \ \text{ if } p\ge 1, \qquad d_\infty(x,y) := \max\{\, d^{(i)}(x^i,y^i) \mid i=1,\dots,n \,\} \]
for $x=(x^1,x^2,\dots,x^n)$, $y=(y^1,y^2,\dots,y^n)\in Y$, is a distance on $Y$. Notice that if $X_1=\dots=X_n=\mathbb{R}$ with the Euclidean distance, so that $Y$ is $\mathbb{R}^n$, then the distances $d_p(x,y)$ are just the distances in Exercises 5.13 and 5.12. Also show that if $\{x_k\}\subset Y$, $x_k:=(x_k^1,x_k^2,\dots,x_k^n)$ $\forall k$, and $x=(x^1,x^2,\dots,x^n)$, then the following claims are equivalent.
(i) There exists $p\ge 1$ such that $d_p(x_k,x)\to 0$,
(ii) $d_p(x_k,x)\to 0$ $\forall p\ge 1$,
(iii) $d_\infty(x_k,x)\to 0$,
(iv) $\forall i=1,\dots,n$, $d^{(i)}(x_k^i,x^i)\to 0$.
5.15 ¶ Discrete distance. Let $X$ be any set. The discrete distance on $X$ is given by
\[ d(x,y) = \begin{cases} 1 & \text{if } x\neq y,\\ 0 & \text{if } x=y. \end{cases} \]
Show that the balls for the discrete distance are
\[ B(x,r) = \{y\in X \mid d(x,y)<r\} = \begin{cases} \{x\} & \text{if } r\le 1,\\ X & \text{if } r>1, \end{cases} \]
and that convergent sequences with respect to the discrete distance reduce to sequences that are eventually constant.
5.16 ¶ Codes distance. Let $X$ be a set that we think of as a set of symbols, and let $X^n=X\times X\times\dots\times X$ be the space of ordered words on $n$ symbols. Given two words $x=(x_1,x_2,\dots,x_n)$ and $y=(y_1,y_2,\dots,y_n)\in X^n$, let
\[ d(x,y) := \#\{\, i \mid x_i\neq y_i \,\} \]
be the number of positions at which $x$ and $y$ differ. Show that $d(x,y)$ defines a distance in $X^n$. Characterize the balls of $X^n$ relative to that distance. [Hint: Write $d(x,y)=\sum_{i=1}^n d(x_i,y_i)$ where, on the right-hand side, $d$ is the discrete distance in $X$.]
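The hint of Exercise 5.16 translates directly into code: the codes distance is a sum of discrete distances, one per position. The sketch below is illustrative only; the names `discrete`, `codes_distance` and `ball` are not from the text, and `ball` enumerates all words by brute force, so it is meant for small alphabets and short words.

```python
from itertools import product

def discrete(a, b):
    """Discrete distance on the alphabet: 0 if equal, 1 otherwise."""
    return 0 if a == b else 1

def codes_distance(x, y):
    """Number of positions at which the words x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(discrete(a, b) for a, b in zip(x, y))

def ball(center, r, alphabet):
    """All words over `alphabet` at codes distance < r from `center`."""
    n = len(center)
    return [w for w in product(alphabet, repeat=n)
            if codes_distance(w, center) < r]
```

For binary words of length 3, the open ball of radius 2 around a word contains the word itself and the three words that differ from it in exactly one position.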
Figure 5.5. The first page of the Thèse at the Faculty of Sciences of Paris by Maurice Fréchet (1878-1973), published in the Rendiconti del Circolo Matematico di Palermo.
b. Metrics on spaces of sequences
We now introduce some distances and norms on infinite-dimensional vector spaces.
5.17 Example ($\ell_\infty$ space). Consider the space of all real (or complex) sequences $x:=(x_1,x_2,\dots)$. For $x=\{x_n\}$, $y:=\{y_n\}$, set
\[ \|x\|_\infty := \sup_n |x_n|, \qquad d_\infty(x,y) := \|x-y\|_\infty. \]
It is easy to show that $x\to\|x\|_\infty$ satisfies the axioms of a norm apart from the fact that it may take the value $+\infty$. Thus $x\to\|x\|_\infty$ is a norm in the space
\[ \ell_\infty := \{\, x=\{x_n\} \mid \|x\|_\infty<+\infty \,\}, \]
that is, on the linear vector space of bounded sequences. Consequently,
\[ d_\infty(x,y) := \|x-y\|_\infty, \qquad x,y\in\ell_\infty, \]
is a distance on $\ell_\infty$, called the uniform distance. Convergence of $\{x_k\}\subset\ell_\infty$ to $x\in\ell_\infty$ in the uniform norm, called the uniform convergence, amounts to
\[ \|x_k-x\|_\infty = \sup_i |x_k^i-x^i| \to 0 \quad\text{as } k\to\infty, \tag{5.1} \]
where $x_k=(x_k^1,x_k^2,\dots)$ and $x=(x^1,x^2,\dots)$. Notice that the uniform convergence in (5.1) is stronger than the pointwise convergence
\[ \forall i \qquad x_k^i\to x^i \quad\text{as } k\to\infty. \]
For instance, let $\varphi(t):=te^{-t}$, $t\in\mathbb{R}_+$, and consider the sequence of sequences $\{x_k\}$ where $x_k:=\{x_k^i\}_i$, $x_k^i:=\varphi(i/k)$. Then $\forall i$ we have $x_k^i=\frac{i}{k}e^{-i/k}\to 0$ as $k\to\infty$, while
\[ \|x_k-0\|_\infty = \sup\Big\{\frac{i}{k}e^{-i/k} \;\Big|\; i=0,1,\dots\Big\} = \frac1e \not\to 0. \]
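The gap between pointwise and uniform convergence in Example 5.17 can be observed numerically. The sketch below is an illustration, with function names (`phi`, `component`, `sup_norm`) chosen here and not taken from the text. It uses the fact that $\varphi(t)=te^{-t}$ attains its maximum $1/e$ at $t=1$, that is, at the index $i=k$.

```python
import math

def phi(t):
    """phi(t) = t * exp(-t); its maximum over t >= 0 is 1/e, attained at t = 1."""
    return t * math.exp(-t)

def component(k, i):
    """i-th component of the k-th sequence x_k, namely phi(i / k)."""
    return phi(i / k)

def sup_norm(k, n_terms=None):
    """sup_i phi(i/k), found by scanning i = 0, ..., n_terms - 1.
    The maximum sits at i = k, so the default range 0..50k contains it."""
    if n_terms is None:
        n_terms = 50 * k + 1
    return max(component(k, i) for i in range(n_terms))
```

For a fixed index, say `component(k, 3)`, the values shrink to zero as `k` grows, while `sup_norm(k)` stays at $1/e$ (about 0.3679) for every `k`.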
Of course $\mathbb{R}^n$ with the metric $d_\infty$ in Exercise 5.12 is a subset of $\ell_\infty$ endowed with the induced metric $d_\infty$. This follows from the identification
\[ (x^1,\dots,x^n) \leftrightarrow (x^1,\dots,x^n,0,\dots,0,\dots). \]
5.18 Example ($\ell_p$ spaces, $p\ge 1$). Consider the space of all real (or complex) sequences $x:=(x_1,x_2,\dots)$. For $1\le p<\infty$, $x=\{x_n\}$ and $y:=\{y_n\}$ set
\[ \|x\|_{\ell_p} := \Big(\sum_{n=1}^\infty |x_n|^p\Big)^{1/p}. \]
Trivially, $\|x\|_{\ell_p}=0$ if and only if every element of the sequence $x$ is zero; moreover Minkowski's inequality
\[ \|x+y\|_{\ell_p} \le \|x\|_{\ell_p}+\|y\|_{\ell_p} \]
holds, as it follows from Exercise 5.13 (passing to the limit as $n\to\infty$ in Minkowski's inequality in $\mathbb{R}^n$). Thus $\|\cdot\|_{\ell_p}$ satisfies the norm axioms apart from the fact that it may take the value $+\infty$. Hence, $\|\cdot\|_{\ell_p}$ is a norm in the linear space of sequences
\[ \ell_p := \{\, x=\{x_n\} \mid \|x\|_{\ell_p}<+\infty \,\}. \]
Consequently, $d_{\ell_p}(x,y):=\|x-y\|_{\ell_p}$ is a distance on $\ell_p$. Convergence of $\{x_k\}\subset\ell_p$ to $x\in\ell_p$ amounts to
\[ \|x_k-x\|_{\ell_p}^p = \sum_{i=1}^\infty |x_k^i-x^i|^p \to 0 \quad\text{as } k\to\infty, \]
where $x_k=(x_k^1,x_k^2,\dots)$ and $x=(x^1,x^2,\dots)$. Notice that $\mathbb{R}^n$ with the metric $d_p$ in Exercise 5.13 is a subset of $\ell_p$ endowed with the induced metric $d_{\ell_p}$. This follows for instance from the identification
\[ (x^1,\dots,x^n) \leftrightarrow (x^1,\dots,x^n,0,\dots,0,\dots). \]
Finally, observe that $\|x\|_{\ell_q}\le\|x\|_{\ell_p}$ $\forall x$ if $1\le p\le q$, hence
\[ \ell_1\subset\ell_p\subset\ell_q. \tag{5.2} \]
Since there exist sequences $x=\{x_n\}$ such that $\|x\|_{\ell_q}<+\infty$ while $\|x\|_{\ell_p}=+\infty$ if $p<q$, as for instance $x_n:=(1/n)^{1/p}$, the inclusions (5.2) are strict if $1<p<q$. The case $p=2$ is particularly relevant since the $\ell_2$ norm is induced by the scalar product
\[ (x|y)_{\ell_2} := \sum_{i=1}^\infty x^iy^i, \qquad \|x\|_{\ell_2} = \sqrt{(x|x)_{\ell_2}}. \]
$\ell_2$ is called the Hilbert coordinate space, and the set $\{\, x=(x^1,x^2,\dots)\in\ell_2 \mid |x^i|\le 1/i \ \forall i \,\}$ the Hilbert cube.
Figure 5.6. Tubular neighborhood of the graph of $f$.
c. Metrics on spaces of functions
The language of metric spaces is particularly relevant in dealing with different types of convergences of functions. As examples of metric spaces of functions, we then introduce a few normed spaces that are relevant in the sequel.
5.19 Example (Continuous functions). Denote by $C^0([0,1])$ the space of all continuous functions $f:[0,1]\to\mathbb{R}$. For $f:[0,1]\to\mathbb{R}$ set
\[ \|f\|_{\infty,[0,1]} := \sup_{x\in[0,1]} |f(x)|. \]
We have
(i) $\|f\|_{\infty,[0,1]}<+\infty$ by Weierstrass's theorem,
(ii) $\|f\|_{\infty,[0,1]}=0$ iff $f(x)=0$ $\forall x$,
(iii) $\|\lambda f\|_{\infty,[0,1]}=|\lambda|\,\|f\|_{\infty,[0,1]}$,
(iv) $\|f+g\|_{\infty,[0,1]}\le\|f\|_{\infty,[0,1]}+\|g\|_{\infty,[0,1]}$.
To prove (iv) for instance, observe that for all $x\in[0,1]$ we have $|f(x)+g(x)|\le|f(x)|+|g(x)|\le\|f\|_{\infty,[0,1]}+\|g\|_{\infty,[0,1]}$, hence the right-hand side is an upper bound for the values of $|f+g|$. The map $f\to\|f\|_{\infty,[0,1]}$ is then a norm on $C^0([0,1])$, called the uniform or infinity norm. Consequently $C^0([0,1])$ is a normed space and a metric space with the uniform distance
\[ d_\infty(f,g) := \|f-g\|_{\infty,[0,1]} = \max_{t\in[0,1]} |f(t)-g(t)|. \]
In this space, the ball $B(f,\epsilon)$ of center $f$ and radius $\epsilon>0$ is the set of all continuous functions $g\in C^0([0,1])$ such that
\[ |g(x)-f(x)|<\epsilon \qquad \forall x\in[0,1], \]
or the family of all continuous functions with graphs in the tubular neighborhood of radius $\epsilon$ of the graph of $f$
\[ U(f,\epsilon) := \{\, (x,y) \mid x\in[0,1],\ y\in\mathbb{R},\ |y-f(x)|<\epsilon \,\}, \tag{5.3} \]
see Figure 5.6. The uniform convergence in $C^0([0,1])$, that is, the convergence in the uniform norm, of $\{f_k\}\subset C^0([0,1])$ to $f\in C^0([0,1])$ amounts to computing
\[ M_k := \|f_k-f\|_{\infty,[0,1]} = \sup_{x\in[0,1]} |f_k(x)-f(x)| \]
for every $k=1,2,\dots$ and to showing that $M_k\to 0$ as $k\to+\infty$.
Figure 5.7. The function $f_k$ in (5.4).
5.20 Example (Functions of class $C^1([0,1])$). Denote by $C^1([0,1])$ the space of all functions $f:[0,1]\to\mathbb{R}$ of class $C^1$, see [GM1]. For $f\in C^1([0,1])$, set
\[ \|f\|_{C^1([0,1])} := \sup_{x\in[0,1]}|f(x)| + \sup_{x\in[0,1]}|f'(x)| = \|f\|_{\infty,[0,1]}+\|f'\|_{\infty,[0,1]}. \]
It is easy to check that $f\to\|f\|_{C^1([0,1])}$ is a norm in the vector space $C^1([0,1])$. Consequently,
\[ d_{C^1([0,1])}(f,g) := \|f-g\|_{C^1([0,1])} \]
defines a distance in $C^1([0,1])$. In this case, a function $g\in C^1$ has a distance less than $\epsilon$ from $f$ if $\|f-g\|_{\infty,[0,1]}+\|f'-g'\|_{\infty,[0,1]}<\epsilon$; equivalently, if the graph of $g$ is in the tubular neighborhood $U(f,\epsilon_1)$ of the graph of $f$, and the graph of $g'$ is in the tubular neighborhood $U(f',\epsilon_2)$ of the graph of $f'$, for some $\epsilon_1,\epsilon_2>0$ with $\epsilon_1+\epsilon_2=\epsilon$, see (5.3). Moreover, convergence in the $C^1([0,1])$-norm of $\{f_k\}\subset C^1([0,1])$ to $f\in C^1([0,1])$, $\|f_k-f\|_{C^1([0,1])}\to 0$, amounts to
\[ f_k\to f \ \text{ uniformly in } [0,1], \qquad f_k'\to f' \ \text{ uniformly in } [0,1]. \]
Figures 5.8 and 5.9 show graphs of Lipschitz functions and functions of class $C^1([0,1])$ that are closer and closer to zero in the uniform norm, but with uniform norm of the derivatives larger than one.
5.21 Example (Integral metrics). Another norm and corresponding distance in $C^0([0,1])$ is given by the distance in the mean
\[ \|f\|_{L^1([0,1])} := \int_0^1 |f(t)|\,dt, \qquad d_{L^1([0,1])}(f,g) := \|f-g\|_{L^1([0,1])} := \int_0^1 |f-g|\,dx. \]
5.22 ¶. Show that the $L^1$-norm in $C^0([0,1])$ satisfies the norm axioms.
Convergence with respect to the $L^1$-distance differs from the uniform one. For instance, for $k=1,2,\dots$ set
\[ f_k(x) := \begin{cases} k(1-k^2x) & \text{if } 0\le x\le 1/k^2,\\ 0 & \text{if } 1/k^2<x\le 1. \end{cases} \tag{5.4} \]
We have $\|f_k\|_{\infty,[0,1]}=f_k(0)=k\to+\infty$ while $\|f_k\|_{L^1([0,1])}=1/(2k)\to 0$, cf. Figure 5.7. More generally, the $L^p([0,1])$-norm, $1\le p<\infty$, on $C^0([0,1])$, is defined by
Figure 5.8. The Lebesgue example.
\[ \|f\|_{L^p(0,1)} := \Big(\int_0^1 |f(x)|^p\,dx\Big)^{1/p}. \]
It turns out that $f\to\|f\|_{L^p(0,1)}$ satisfies the axioms of a norm, hence
\[ d_{L^p([0,1])}(f,g) := \|f-g\|_{L^p([0,1])} := \Big(\int_0^1 |f-g|^p\,dx\Big)^{1/p} \]
is a distance in $C^0([0,1])$; it is called the $L^p([0,1])$-distance.
5.23 ¶. Show that the $L^p([0,1])$-norm in $C^0([0,1])$ satisfies the norm axioms. [Hint: The triangle inequality is in fact Minkowski's inequality, see [GM1].]
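The contrast between the uniform norm and the $L^1$-norm can be checked numerically. The sketch below rests on an assumption: it takes $f_k$ in (5.4) to be the triangular spike of height $k$ supported on $[0,1/k^2]$, which is consistent with the values $f_k(0)=k$ and $\|f_k\|_{L^1}=1/(2k)$ quoted in the text. The names `spike`, `sup_norm` and `l1_norm` are illustrative. The integral uses the midpoint rule on a grid that contains the kink $1/k^2$, so the rule is exact on each linear piece up to rounding.

```python
def spike(k, x):
    """Assumed form of f_k: height k at x = 0, linear down to 0 at x = 1/k**2."""
    return max(0.0, k * (1.0 - k * k * x))

def sup_norm(k, n=1000):
    """Maximum of spike(k, .) over the grid j/n, j = 0..n (attained at x = 0)."""
    return max(spike(k, j / n) for j in range(n + 1))

def l1_norm(k):
    """Midpoint rule on [0, 1] with 4*k*k cells; the kink 1/k**2 is a grid node."""
    n = 4 * k * k
    h = 1.0 / n
    return h * sum(spike(k, (j + 0.5) * h) for j in range(n))
```

With these definitions `sup_norm(k)` grows like `k` while `l1_norm(k)` decays like `1 / (2 * k)`, so the sequence converges to zero in the mean but not uniformly.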
5.1.3 Continuity and limits in metric spaces
a. Lipschitz-continuous maps between metric spaces
5.24 Definition. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces and let $0<\alpha\le 1$. We say that a function $f:X\to Y$ is $\alpha$-Hölder-continuous if there exists $L>0$ such that
\[ d_Y(f(x),f(y)) \le L\,d_X(x,y)^\alpha \qquad \forall x,y\in X. \tag{5.5} \]
1-Hölder-continuous functions are also called Lipschitz continuous. The smallest constant $L$ for which (5.5) holds is called the $\alpha$-Hölder constant of $f$, often denoted by $[f]_\alpha$. When $\alpha=1$, the 1-Hölder constant is also called the Lipschitz constant of $f$ and denoted by $[f]_1$, $\operatorname{Lip}f$ or $\operatorname{Lip}(f)$.
5.25 Example (The distance function). Let $(X,d)$ be a metric space. For any $x_0\in X$, the function $f(x):=d(x,x_0):X\to\mathbb{R}$ is a Lipschitz-continuous function with $\operatorname{Lip}(f)=1$. In fact, from the triangle inequality, we get
\[ |f(y)-f(x)| = |d(y,x_0)-d(x,x_0)| \le d(x,y) \qquad \forall x,y\in X, \]
hence $f$ is Lipschitz continuous with $\operatorname{Lip}(f)\le 1$. Choosing $x=x_0$, we have $|f(y)-f(x_0)|=|d(y,x_0)-d(x_0,x_0)|=d(y,x_0)$, thus $\operatorname{Lip}(f)\ge 1$.
Figure 5.9. On the left, the sequence $f_k(x):=k^{-1}\cos(kx)$ that converges uniformly to zero with slopes equibounded by one. On the right, $g_k(x):=k^{-1}\cos(k^2x)$, that converges uniformly to zero, but with slopes that diverge to infinity. Given any function $f\in C^1([0,1])$, a similar phenomenon occurs for the sequences $f_k(x):=f(kx)/k$, $g_k(x):=f(k^2x)/k$.
5.26 ^ D i s t a n c e from a s e t . Let (X, d) be a metric space. The distance function from cc € A" to a nonempty subset A C X is defined by d{x, A) := inf|d(a:, y) U € A \ . It is easy to show that f{x) := d(x, A) : X —^R is a, Lipschitz-continuous function with
Li (f)^
Jo iid(x,A) = OWx, I1
otherwise.
If d{x, A) is identically zero, then the claim is trivial. On the other hand, for any x,y £ X and z E A we have d{x, z) < d{x, y) -\- d{y, z) hence, taking the infimum in z^
d{x,A)-d{y,A)
and interchanging x and y,
\d{x,A)-d{y,A)\
= d{x, A) >
d(x, Xn), n -\-1 from which we infer that the Lipschitz constant of a: —• d{x^ A) must not be smaller than one.
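The Lipschitz bound of Exercise 5.26 can be tested on a finite set $A\subset\mathbb{R}^2$ with the Euclidean distance. This is a sketch for illustration only: the helper names are chosen here, and random sampling merely fails to find a counterexample, it does not replace the proof.

```python
import math
import random

def dist_to_set(p, A):
    """d(p, A) for a finite nonempty set A of points in the plane."""
    return min(math.dist(p, a) for a in A)

def lipschitz_ratio(x, y, A):
    """|d(x, A) - d(y, A)| / d(x, y) for two distinct points x, y."""
    return abs(dist_to_set(x, A) - dist_to_set(y, A)) / math.dist(x, y)

def max_lipschitz_ratio(A, trials=2000, seed=0):
    """Largest ratio found over random pairs of points in [-5, 5]^2."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        y = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        if x != y:
            best = max(best, lipschitz_ratio(x, y, A))
    return best
```

The ratio never exceeds one, and it equals one when `y` is a point of `A` nearest to `x`, which mirrors the two halves of the argument in the text.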
b. Continuous maps in metric spaces
The notion of continuity that we introduced in [GM1], [GM2] for functions of one real variable can be extended to the context of the abstract metric structure. In fact, by paraphrasing the definition of continuity of functions $f:\mathbb{R}\to\mathbb{R}$ we get
5.27 Definition. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces. We say that $f:X\to Y$ is continuous at $x_0$ if $\forall\epsilon>0$ there exists $\delta>0$ such that $d_Y(f(x),f(x_0))<\epsilon$ whenever $d_X(x,x_0)<\delta$, i.e.,
\[ \forall\epsilon>0\ \exists\delta>0 \ \text{ such that } \ f(B_X(x_0,\delta))\subset B_Y(f(x_0),\epsilon). \tag{5.6} \]
We say that $f:X\to Y$ is continuous in $E\subset X$ if $f$ is continuous at every point $x_0\in E$. When $E=X$ and $f:X\to Y$ is continuous at any point of $X$, we simply say that $f:X\to Y$ is continuous.
5.28 ¶. Show that $\alpha$-Hölder-continuous functions, $0<\alpha\le 1$, in particular Lipschitz-continuous functions, between two metric spaces are continuous.
Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces and $E\subset X$. Since $E$ is a metric space with the induced distance of $X$, Definition 5.27 also applies to the function $f:E\to Y$. Thus $f:E\to Y$ is continuous at $x_0\in E$ if
\[ \forall\epsilon>0\ \exists\delta>0 \ \text{ such that } \ f(B_X(x_0,\delta)\cap E)\subset B_Y(f(x_0),\epsilon) \tag{5.7} \]
and we say that $f:E\to Y$ is continuous if $f:E\to Y$ is continuous at any point $x_0\in E$.
5.29 Remark. As in the case of functions of one real variable, the domain of the function $f$ is relevant in order to decide if $f$ is continuous or not. For instance, $f:X\to Y$ is continuous in $E\subset X$ if
\[ \forall x_0\in E\ \forall\epsilon>0\ \exists\delta>0 \ \text{ such that } \ f(B_X(x_0,\delta))\subset B_Y(f(x_0),\epsilon), \tag{5.8} \]
while the restriction $f_{|E}:E\to Y$ of $f$ to $E$ is continuous in $E$ if
\[ \forall x_0\in E\ \forall\epsilon>0\ \exists\delta>0 \ \text{ such that } \ f(B_X(x_0,\delta)\cap E)\subset B_Y(f(x_0),\epsilon). \tag{5.9} \]
We deduce that the restriction property holds: if $f:X\to Y$ is continuous in $E$, then its restriction $f_{|E}:E\to Y$ to $E$ is continuous. The converse is in general false, as simple examples show.
5.30 Proposition. Let $X,Y,Z$ be three metric spaces, and $x_0\in X$. If $f:X\to Y$ is continuous at $x_0$ and $g:Y\to Z$ is continuous at $f(x_0)$, then $g\circ f:X\to Z$ is continuous at $x_0$. In particular, the composition of two continuous functions is continuous.
Proof. Let $\epsilon>0$. Since $g$ is continuous at $f(x_0)$, there exists $\sigma>0$ such that $g(B_Y(f(x_0),\sigma))\subset B_Z(g(f(x_0)),\epsilon)$. Since $f$ is continuous at $x_0$, there exists $\delta>0$ such that $f(B_X(x_0,\delta))\subset B_Y(f(x_0),\sigma)$; consequently
\[ g\circ f(B_X(x_0,\delta)) \subset g(B_Y(f(x_0),\sigma)) \subset B_Z(g\circ f(x_0),\epsilon). \qquad\Box \]
Continuity can be expressed in terms of convergent sequences. As in the proof of Theorem 2.46 of [GM2], one shows
5.31 Theorem. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces. $f:X\to Y$ is continuous at $x_0\in X$ if and only if $f(x_n)\to f(x_0)$ in $(Y,d_Y)$ whenever $\{x_n\}\subset X$, $x_n\to x_0$ in $(X,d_X)$.
c. Limits in metric spaces
Related to the notion of continuity is the notion of the limit. Again, we want to rephrase $f(x)\to y_0$ as $x\to x_0$. For that we need $f$ to be defined near $x_0$, but not necessarily at $x_0$. For this purpose we introduce the
5.32 Definition. Let $X$ be a metric space and $A\subset X$. We say that $x_0\in X$ is an accumulation point of $A$ if each ball centered at $x_0$ contains at least one point of $A$ distinct from $x_0$,
\[ \forall r>0 \qquad B(x_0,r)\cap A\setminus\{x_0\}\neq\emptyset. \]
Accumulation points are also called cluster points.
5.33 ¶. Consider $\mathbb{R}$ with the Euclidean metric. Show that
(i) the set of points of accumulation of $A:=\,]a,b[$, $B=[a,b]$, $C=[a,b[$ is the closed interval $[a,b]$,
(ii) the set of points of accumulation of $A:=\,]0,1[\,\cup\{2\}$, $B=[0,1]\cup\{2\}$, $C=[0,1[\,\cup\{2\}$ is the closed interval $[0,1]$,
(iii) the set of points of accumulation of the rational numbers and of the irrational numbers is the whole $\mathbb{R}$.
We shall return to this notion, but for the moment the definition suffices.
5.34 Definition. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces, let $E\subset X$ and let $x_0\in X$ be a point of accumulation of $E$. Given $f:E\setminus\{x_0\}\to Y$, we say that $y_0\in Y$ is the limit of $f(x)$ as $x\to x_0$, $x\in E$, and we write
\[ f(x)\to y_0 \ \text{ as } x\to x_0, \qquad\text{or}\qquad \lim_{\substack{x\to x_0\\ x\in E}} f(x)=y_0 \]
if for any $\epsilon>0$ there exists $\delta>0$ such that $d_Y(f(x),y_0)<\epsilon$ whenever $x\in E$ and $0<d_X(x,x_0)<\delta$. Equivalently,
\[ \forall\epsilon>0\ \exists\delta>0 \ \text{ such that } \ f(B_X(x_0,\delta)\cap E\setminus\{x_0\})\subset B_Y(y_0,\epsilon). \]
Notice that, while in order to deal with the continuity of $f$ at $x_0$ we only need $f$ to be defined at $x_0$, when we deal with the notion of limit we only need that $x_0$ be a point of accumulation of $E$. These two requirements are unrelated, since not all points of $E$ are points of accumulation and not all points of accumulation of $E$ are in $E$, see, e.g., Exercise 5.33. Moreover, the condition $0<d_X(x,x_0)$ in the definition of limit expresses the fact that we can disregard the value of $f$ at $x_0$ (in case $f$ is defined at $x_0$). Also notice that the limit is unique if it exists, and that limits are preserved by restriction. To be precise, we have
5.35 Proposition. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces. Suppose $F\subset E\subset X$ and let $x_0\in X$ be a point of accumulation for $F$. If $f(x)\to y$ as $x\to x_0$, $x\in E$, then $f(x)\to y$ as $x\to x_0$, $x\in F$.
5.36 ¶. As for functions of one variable, the notions of limit and continuity are strongly related. Show the following.
Proposition. Let $X$ and $Y$ be two metric spaces, $E\subset X$ and $x_0\in X$.
(i) If $x_0$ belongs to $E$ and is not a point of accumulation of $E$, then every function $f:E\to Y$ is continuous at $x_0$.
(ii) Suppose that $x_0$ belongs to $E$ and is a point of accumulation for $E$. Then
a) $f:E\to Y$ is continuous at $x_0$ if and only if $f(x)\to f(x_0)$ as $x\to x_0$, $x\in E$,
b) $f(x)\to y$ as $x\to x_0$, $x\in E$, if and only if the function $g:E\cup\{x_0\}\to Y$ defined by
\[ g(x) := \begin{cases} f(x) & \text{if } x\in E\setminus\{x_0\},\\ y & \text{if } x=x_0 \end{cases} \]
is continuous at $x_0$.
We conclude with a change of variable theorem for limits, see e.g., Proposition 2.27 of [GM1] and Example 2.49 of [GM2].
5.37 Proposition. Let $X,Y,Z$ be metric spaces, $E\subset X$ and let $x_0$ be a point of accumulation for $E$. Let $f:E\to Y$, $g:f(E)\to Z$ be two functions and suppose that $y_0$ is an accumulation point of $f(E)$. If
(i) $g(y)\to L$ as $y\to y_0$, $y\in f(E)$,
(ii) $f(x)\to y_0$ as $x\to x_0$, $x\in E$,
(iii) either $f(x_0)=y_0$, or $f(x)\neq y_0$ for all $x\in E$ and $x\neq x_0$,
then $g(f(x))\to L$ as $x\to x_0$, $x\in E$.
d. The junction property
A property we have just hinted at in the case of real functions is the junction property, see Section 2.1.2 of [GM1], which is more significant for functions of several variables. Let $X$ be a metric space. We say that a family $\{U_\alpha\}$ of subsets of $X$ is locally finite at a point $x_0\in X$ if there exists $r>0$ such that $B(x_0,r)$ meets at most a finite number of the $U_\alpha$'s.
5.38 Proposition. Let $(X,d_X)$, $(Y,d_Y)$ be metric spaces, $f:X\to Y$ a function, $x_0\in X$, and let $\{U_\alpha\}$ be a covering of $X$ that is locally finite at $x_0$.
(i) Suppose that $x_0$ is a point of accumulation of $U_\alpha$ and that $f(x)\to y$ as $x\to x_0$, $x\in U_\alpha$, for all $\alpha$. Then $f(x)\to y$ as $x\to x_0$, $x\in X$.
(ii) If $x_0\in\cap_\alpha U_\alpha$ and $f:U_\alpha\subset X\to Y$ is continuous at $x_0$ for all $\alpha$, then $f:X\to Y$ is continuous at $x_0$.
5.39 ¶. Prove Proposition 5.38.
5.40 Example. An assumption on the covering is necessary in order that the conclusions of Proposition 5.38 hold. Set $A:=\{(x,y) \mid x^2\ \dots\}$ and
\[ f(x,y) := \begin{cases} 1 & \text{if } (x,y)\in A,\\ 0 & \text{otherwise.} \end{cases} \]
The function $f$ is discontinuous at $x_0:=(0,0)$, since its oscillation is one in every ball centered at $x_0$. Denote by $U_m$ the straight line through the origin
\[ U_m := \{(x,y) \mid y=mx\}, \quad m\in\mathbb{R}, \qquad U_\infty := \{(x,y) \mid x=0\}. \]
The $U_\alpha$'s, $\alpha\in\mathbb{R}\cup\{\infty\}$, form a covering of $\mathbb{R}^2$ that is not locally finite at $x_0$ and, for any $\alpha\in\mathbb{R}\cup\{\infty\}$, the restriction of $f$ to each $U_\alpha$ is zero near the origin. In particular, each restriction $f_{|U_\alpha}:U_\alpha\to\mathbb{R}$ is continuous at the origin.
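The phenomenon of Example 5.40 can be reproduced with a concrete set. The sketch below assumes, hypothetically, that $A$ is the region between two parabolas, $A=\{(x,y)\mid x^2<y<2x^2\}$; the set used in the text may differ, and any set with the same qualitative shape would do. The names `in_A`, `f` and `zero_near_origin_on_line` are illustrative.

```python
def in_A(x, y):
    """Hypothetical region between two parabolas: x^2 < y < 2 x^2."""
    return x * x < y < 2 * x * x

def f(x, y):
    """Indicator function of the assumed set A."""
    return 1 if in_A(x, y) else 0

def zero_near_origin_on_line(m, samples=1000):
    """Check f(t, m t) == 0 for sampled t with 0 < |t| < |m| / 2
    (for m == 0 the radius 1 is used, since f vanishes on the whole x-axis)."""
    radius = abs(m) / 2 if m != 0 else 1.0
    for j in range(1, samples + 1):
        t = radius * j / (samples + 1)
        if f(t, m * t) != 0 or f(-t, -m * t) != 0:
            return False
    return True
```

On every line $y=mx$, and on the vertical axis, the function vanishes in a neighborhood of the origin whose size depends on the line, yet along the parabola $y=1.5\,x^2$ it equals one at points arbitrarily close to the origin.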
5.1.4 Functions from $\mathbb{R}^n$ into $\mathbb{R}^m$
It is important to be acquainted with the limit notion we have just introduced in an abstract context. For this purpose, in this section, we shall focus on mappings between Euclidean spaces and illustrate with a few examples some of the abstract notions previously introduced.
a. The vector space $C^0(A,\mathbb{R}^m)$
Denote by $e^i:\mathbb{R}^n\to\mathbb{R}$ the linear map that maps $x=(x^1,x^2,\dots,x^n)\in\mathbb{R}^n$ into its $i$th component, $e^i(x):=x^i$. Any map $f:X\to\mathbb{R}^n$ from a set $X$ into $\mathbb{R}^n$ writes as an $n$-tuple of real-valued functions $f(x)=(f^1(x),\dots,f^n(x))$, where for any $i=1,\dots,n$ the function $f^i:X\to\mathbb{R}$ is given by $f^i(x):=e^i(f(x))$. From
\[ |y^1|,|y^2|,\dots,|y^n| \le |y| \le \sum_{i=1}^n |y^i| \qquad \forall y\in\mathbb{R}^n \]
we readily infer the following.
5.41 Proposition. The following claims hold.
(i) The maps $e^i:\mathbb{R}^n\to\mathbb{R}$, $i=1,\dots,n$, are Lipschitz continuous.
(ii) Let $(X,d)$ be a metric space. Then
a) $f:X\to\mathbb{R}^n$ is continuous at $x_0\in X$ if and only if all its components $f^1,f^2,\dots,f^n$ are continuous at $x_0$,
b) if $f,g:X\to\mathbb{R}^n$ are continuous at $x_0$, then $f+g:X\to\mathbb{R}^n$ is continuous at $x_0$,
c) if $f:X\to\mathbb{R}^n$ and $\lambda:X\to\mathbb{R}$ are continuous at $x_0$, then the map $\lambda f:X\to\mathbb{R}^n$ defined by $\lambda f(x):=\lambda(x)f(x)$ is continuous at $x_0$.
5.42 Example. The function $f:\mathbb{R}^3\to\mathbb{R}$, $f(x,y,z):=\sin(x^2y+z^2)$ is continuous in $\mathbb{R}^3$. In fact, if $\mathbf{x}_0:=(x_0,y_0,z_0)$, then the coordinate functions $\mathbf{x}=(x,y,z)\to x$, $\mathbf{x}\to y$, $\mathbf{x}\to z$ are continuous at $\mathbf{x}_0$ by Proposition 5.41. By Proposition 5.41 (ii) c), $\mathbf{x}\to x^2y$ and $\mathbf{x}\to z^2$ are continuous at $\mathbf{x}_0$, and by Proposition 5.41 (ii) b), $\mathbf{x}\to x^2y+z^2$ is continuous at $\mathbf{x}_0$. Finally $\sin(x^2y+z^2)$ is continuous since $\sin$ is continuous.
5.43 Definition. Let $X$ and $Y$ be two metric spaces. We denote by $C^0(X,Y)$ the class of all continuous functions $f:X\to Y$.
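As a consequence of Proposition 5.41, $C^0(X,\mathbb{R}^n)$ is a vector space. Moreover, if $\lambda\in C^0(X,\mathbb{R})$ and $f\in C^0(X,\mathbb{R}^n)$, then $\lambda f:X\to\mathbb{R}^n$ given by $\lambda f(x):=\lambda(x)f(x)$, $x\in X$, belongs to $C^0(X,\mathbb{R}^n)$. In particular,
5.44 Corollary. Polynomials in $n$ variables belong to $C^0(\mathbb{R}^n,\mathbb{R})$. Therefore, maps $f:\mathbb{R}^n\to\mathbb{R}^m$ whose components are polynomials of $n$ variables are continuous. In particular, linear maps $L\in\mathcal{L}(\mathbb{R}^n,\mathbb{R}^m)$ are continuous.
It is worth noticing that in fact
5.45 Proposition. Let $L:\mathbb{R}^n\to\mathbb{R}^m$ be linear. Then $L$ is Lipschitz continuous in $\mathbb{R}^n$.
Proof. As $L$ is linear, we have
\[ \operatorname{Lip}(L) := \sup_{\substack{x,y\in\mathbb{R}^n\\ x\neq y}} \frac{\|L(x)-L(y)\|_{\mathbb{R}^m}}{\|x-y\|_{\mathbb{R}^n}} = \sup_{\substack{x,y\in\mathbb{R}^n\\ x\neq y}} \frac{\|L(x-y)\|_{\mathbb{R}^m}}{\|x-y\|_{\mathbb{R}^n}} = \sup_{0\neq z\in\mathbb{R}^n} \frac{\|L(z)\|_{\mathbb{R}^m}}{\|z\|_{\mathbb{R}^n}} =: \|L\|. \]
Let us prove that $\|L\|<+\infty$. Since $L$ is continuous at zero by Corollary 5.44, there exists $\delta>0$ such that $\|L(w)\|<1$ whenever $\|w\|<\delta$. For any nonzero $z\in\mathbb{R}^n$, set $w:=\frac{\delta}{2\|z\|}z$. Since $\|w\|<\delta$, we have $\|L(w)\|<1$. Therefore, writing $z=\frac{2\|z\|}{\delta}w$ and using the linearity of $L$,
\[ \|L(z)\| = \Big\|\frac{2\|z\|}{\delta}L(w)\Big\| = \frac{2\|z\|}{\delta}\|L(w)\| < \frac{2}{\delta}\|z\|, \]
hence $\|L\|\le 2/\delta<+\infty$. $\Box$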
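Proposition 5.45 can be illustrated for a map given by a matrix. The sketch below is not the argument of the text: it samples the quotient $\|L(z)\|/\|z\|$ at random points and compares it with the Frobenius norm of the matrix, which bounds $\|L\|$ from above by the Cauchy-Schwarz inequality. All function names are illustrative.

```python
import math
import random

def apply(M, z):
    """Image of the vector z under the linear map with matrix M (list of rows)."""
    return [sum(a * b for a, b in zip(row, z)) for row in M]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in v))

def frobenius(M):
    """Frobenius norm of M, an upper bound for sup ||Mz|| / ||z||."""
    return math.sqrt(sum(a * a for row in M for a in row))

def sampled_lipschitz(M, trials=2000, seed=0):
    """Largest value of ||Mz|| / ||z|| found over random nonzero z."""
    rng = random.Random(seed)
    n = len(M[0])
    best = 0.0
    for _ in range(trials):
        z = [rng.uniform(-1, 1) for _ in range(n)]
        if norm(z) > 0:
            best = max(best, norm(apply(M, z)) / norm(z))
    return best
```

For the diagonal matrix with entries 3 and 4 every sampled quotient lies between 3 and 4, while the Frobenius bound is 5.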
For a more detailed description of linear maps in normed spaces, see Chapters 9 and 10.
b. Some nonlinear continuous transformations from $\mathbb{R}^n$ into $\mathbb{R}^m$
We now present a few examples of nonlinear continuous transformations between Euclidean spaces.
5.46 Example. For $k=1,2,\dots$ consider the map $u_k:\,]-1,1[\,\to\mathbb{R}^2$ given by
\[ u_k(t) := \begin{cases} (\cos kt,\ \sin kt) & \text{if } t\in\,]0,2\pi/k[,\\ (1,0) & \text{otherwise.} \end{cases} \]
This is a Lipschitz function whose graph is given in Figure 5.10. Notice that the graph of $u_k$, $\{(t,u_k(t))\}$, is a curve that "converges" as $k\to\infty$ to a horizontal line plus a vertical circle at 0. Compare with the function $\operatorname{sgn}x$ from $\mathbb{R}$ to $\mathbb{R}$.
Figure 5.10. The function $u_k$ in Example 5.46.
5.47 Example (Stereographic projection). Let
\[ S^n := \{\, x\in\mathbb{R}^{n+1} \mid |x|=1 \,\} \]
be the unit sphere in $\mathbb{R}^{n+1}$. If $x=(x_1,\dots,x_n,x_{n+1})\in\mathbb{R}^{n+1}$, let us denote the coordinates of $x$ by $(y,z)$ where $y=(x_1,x_2,\dots,x_n)\in\mathbb{R}^n$ and $z=x_{n+1}\in\mathbb{R}$. With this notation, $S^n=\{(y,z)\in\mathbb{R}^n\times\mathbb{R} \mid |y|^2+z^2=1\}$. Furthermore, denote by $P_S=(0,-1)\in S^n$ the South pole of $S^n$. The stereographic projection (from the South pole) is the map that projects from the South pole the sphere onto the $\{z=0\}$ plane,
\[ \sigma : S^n\setminus\{P_S\}\subset\mathbb{R}^{n+1}\to\mathbb{R}^n, \qquad (y,z)\to\frac{y}{1+z}. \]
It is easily seen that $\sigma$ is injective, surjective and continuous with a continuous inverse given by
\[ \sigma^{-1}(x) := \Big(\frac{2x}{1+|x|^2},\ \frac{1-|x|^2}{1+|x|^2}\Big), \]
that maps $x\in\mathbb{R}^n$ into the point of $S^n$ lying on the line joining the South pole of $S^n$ with $x$, see Figure 5.11.
5.48 Example (Polar coordinates). The transformation
\[ \sigma : E := \{\, (\rho,\theta) \mid \rho>0,\ 0\le\theta<2\pi \,\}\to\mathbb{R}^2, \qquad (\rho,\theta)\to(\rho\cos\theta,\ \rho\sin\theta) \]
defines a map that is injective and continuous with range $\mathbb{R}^2\setminus\{0\}$. The extension of the map to the third coordinate
\[ \tilde\sigma : E\times\mathbb{R}\to\mathbb{R}^2\times\mathbb{R}\simeq\mathbb{R}^3, \qquad (\rho,\theta,z)\to(\rho\cos\theta,\ \rho\sin\theta,\ z) \]
defines the so-called cylindrical coordinates in $\mathbb{R}^3$.
5.49 Example (Spherical coordinates). The representation of points $(x,y,z)\in\mathbb{R}^3$ as
\[ \begin{cases} x=\rho\sin\varphi\cos\theta,\\ y=\rho\sin\varphi\sin\theta,\\ z=\rho\cos\varphi, \end{cases} \]
see Figure 5.12, defines the spherical coordinates in $\mathbb{R}^3$. This in turn defines a continuous transformation $(\rho,\theta,\varphi)\to(x,y,z)$ from
\[ E := \{\, (\rho,\theta,\varphi) \mid \rho>0,\ 0\le\theta<2\pi,\ 0\le\varphi\le\pi \,\} = \,]0,+\infty[\,\times[0,2\pi[\,\times[0,\pi] \]
into $\mathbb{R}^3\setminus\{0\}$.
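The stereographic projection of Example 5.47 and its inverse are short enough to test directly. The sketch below is illustrative: the names `sigma` and `sigma_inv` are chosen here, points are plain Python lists, and the inverse is the formula obtained by intersecting the sphere with the line through the South pole $(0,-1)$ and the point $(x,0)$.

```python
import math

def sigma(point):
    """Stereographic projection from the South pole: (y, z) -> y / (1 + z)."""
    *y, z = point
    if z == -1:
        raise ValueError("the South pole has no image")
    return [t / (1.0 + z) for t in y]

def sigma_inv(x):
    """Inverse map: x -> (2x / (1 + |x|^2), (1 - |x|^2) / (1 + |x|^2))."""
    s = sum(t * t for t in x)
    return [2.0 * t / (1.0 + s) for t in x] + [(1.0 - s) / (1.0 + s)]

def on_sphere(point, tol=1e-12):
    """True if the point has Euclidean norm 1 up to the tolerance."""
    return abs(math.sqrt(sum(t * t for t in point)) - 1.0) < tol
```

The North pole is sent to the origin, the equator $\{z=0\}$ is left fixed, and points of the plane far from the origin are sent back to points of the sphere near the South pole.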
Figure 5.11. The stereographic projection from the South pole.
Complex-valued functions of one complex variable provide examples of transformations of the plane.
5.50 Example ($w=z^n$). The map $z\to z^n$ defines a continuous transformation of $\mathbb{C}$ to $\mathbb{C}$. The inverse image of each nonzero $w\in\mathbb{C}$ consists of $n$ distinct points, given by the $n$th roots of $w$; those $n$ points collapse to zero when $w=0$. If we write the transformation $w=z^n$ as $|w|=|z|^n$, $\operatorname{Arg}w=n\operatorname{Arg}z$, we see, identifying $\mathbb{C}$ with $\mathbb{R}^2$, that the circle of radius $r$ and center 0 is mapped into the circle of radius $r^n$ and center 0. Moreover, if a point goes clockwise once along the circle, then the normalized image point $w/|w|$ goes along the unit circle clockwise $n$ times.
5.51 ¶. The map $z\to w=z^n$ restricted to $\varphi_0<\operatorname{Arg}z<\varphi_1$ with $0<\varphi_1-\varphi_0<2\pi/n$ is injective and continuous.
5.52 ¶. Show that the map $z\to w=z^2$ maps the families of lines parallel to the axes (except the axes themselves) into two families of parabolas with the real axis as common axis and the common focus at the origin, see Figure 5.13.
Figure 5.12. Spherical coordinates.
Figure 5.13. The transformation $w=z^2$ maps families of lines parallel to the axes, except for the axes, into two families of parabolas with the real axis as common axis and the common focus at the origin.
5.53 Example (The Joukowski function). This is the map
\[ \lambda(z) := \frac12\Big(z+\frac1z\Big), \qquad z\neq 0, \]
which appears in several problems of aerodynamics. It is a continuous function defined in $\mathbb{C}\setminus\{0\}$. Since $\lambda(z)=\lambda(1/z)$, every point $w\neq\pm1,0$ has at most, and, in fact, exactly two distinct inverse images $z_1,z_2$ satisfying $z_1z_2=1$.
5.54 ¶. Show that $\lambda(z)=\frac12(z+1/z)$ is one-to-one from $\{|z|<1,\ z\neq 0\}$ or $\{|z|>1\}$ into the complement of the segment $\{w \mid \Im w=0,\ -1\le\Re w\le 1\}$. $\lambda$ maps the family of circles $\{z \mid |z|=r\}$, $0<r<1$, into a family of co-focal ellipses and maps the diameters $z=te^{i\alpha}$, $-1<t<1$, $0<\alpha<\pi$, into a family of co-focal hyperbolas, see Figure 5.14.
5.55 Example (The Möbius transformations). These maps, defined by
\[ L(z) := \frac{az+b}{cz+d}, \qquad ad-bc\neq 0, \tag{5.10} \]
are continuous and injective from $\mathbb{C}\setminus\{-d/c\}$ into $\mathbb{C}\setminus\{a/c\}$ and have several relevant properties that we list below, asking the reader to show that they hold.
5.56 ¶. Show the following.
(i) $L(z)\to a/c$ as $|z|\to\infty$ and $|L(z)|\to\infty$ as $z\to-d/c$. Because of this, we write $L(\infty)=a/c$, $L(-d/c)=\infty$ and say that $L$ is continuous from $\mathbb{C}\cup\{\infty\}$ into itself.
(ii) Every rational function, i.e., the quotient of two complex polynomials, defines a continuous transformation of $\mathbb{C}\cup\{\infty\}$ into itself, as in (i).
(iii) The Möbius transformations $L(z)$ in (5.10) are the only rational functions from $\mathbb{C}\cup\{\infty\}$ into itself that are injective.
(iv) The Möbius transformations $(a_iz+b_i)/(c_iz+d_i)$, $i=1,2$, are identical if and only if $(a_1,b_1,c_1,d_1)$ is a nonzero multiple of $(a_2,b_2,c_2,d_2)$.
(v) The Möbius transformations form a group $G$ with respect to the composition of maps; the subset $H\subset G$, $H:=\{z,\ 1-z,\ 1/z,\ 1/(1-z),\ (z-1)/z,\ z/(z-1)\}$ is a subgroup of $G$.
(vi) A Möbius transformation maps straight lines and circles into straight lines and circles (show this first for the map $1/z$, taking into account that the equations for straight lines and circles have the form $A(x^2+y^2)+2Bx+2Cy+D=0$ if $z=x+iy$).
(vii) The map in (5.10) maps circles and straight lines through $-d/c$ into a straight line and any other straight line or circle into a circle.
Figure 5.14. The Joukowski function maps circles $|z| = r$, $0 < r < 1$, and diameters $z = \pm t e^{i\alpha}$, $0 < t < 1$, $0 < \alpha < 2\pi$, respectively into a family of co-focal ellipses and of co-focal hyperbolas.
(viii) The only Möbius transformation with more than two fixed points is the identity $z$. Two Möbius transformations are equal if they agree at three distinct points. There is a unique Möbius transformation that maps three distinct points $z_1, z_2, z_3 \in \mathbb{C} \cup \{\infty\}$ into three distinct points $w_1, w_2, w_3 \in \mathbb{C} \cup \{\infty\}$.
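The group structure in (v) can be made concrete by recording a Möbius transformation through its coefficients $(a, b, c, d)$. A standard fact, not stated in the text and used here as an assumption, is that composing two such maps corresponds to multiplying their $2 \times 2$ coefficient matrices. The following minimal Python sketch checks this numerically at one point; the names `mobius` and `compose_coeffs` are hypothetical helpers introduced only for this illustration.

```python
def mobius(a, b, c, d):
    """Return the map z -> (a z + b) / (c z + d); assumes a d - b c != 0."""
    def L(z):
        return (a * z + b) / (c * z + d)
    return L


def compose_coeffs(m1, m2):
    """Coefficients of L1 o L2, obtained as the product of the 2x2 matrices
    [[a1, b1], [c1, d1]] and [[a2, b2], [c2, d2]]."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2,
            a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2,
            c1 * b2 + d1 * d2)


# Example: L1(z) = (z + 2)/(3z + 4) composed with L2(z) = 1/z.
m1 = (1, 2, 3, 4)
m2 = (0, 1, 1, 0)
L1, L2 = mobius(*m1), mobius(*m2)
L12 = mobius(*compose_coeffs(m1, m2))

z = 0.5 + 1j  # an arbitrary test point away from the poles
print(abs(L1(L2(z)) - L12(z)))  # difference between the two ways of composing
```

Since only the ratio of the coefficients matters, see (iv), the matrix is determined up to a nonzero scalar factor.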
5.57 Example (Exponential and logarithm). The complex function $z \to \exp z$, see [GM2], is continuous from $\mathbb{C} \to \mathbb{C}$, periodic of period $2\pi i$ with image $\mathbb{C} \setminus \{0\}$. In particular $e^z$ does not vanish, and every nonzero $w$ has infinitely many preimages.

5.58 ¶. Taking into account what we have proved in [GM2], show the following.
(i) $w = e^z$ is injective with a continuous inverse in every strip parallel to the real axis of width $h < 2\pi$, and has as image the interior of an angle of $h$ radians with vertex at the origin;
(ii) $w = e^z$ maps every straight line which is not parallel to the axes into a logarithmic spiral, see Chapter 7.
c. The calculus of limits for functions of several variables
Though we may have appeared pedantic, we have always insisted on specifying the domain $E \subset X$ in which the independent variables vary. This is in fact particularly relevant when dealing with limits and continuity of functions of several variables, as in this case there are several reasonable ways of approaching a point $x_0$. Different choices may and, in general, do lead to different answers concerning the existence and/or the equality of the limits
$$\lim_{x \to x_0,\ x \in E} f(x) \qquad\text{and}\qquad \lim_{x \to x_0,\ x \in F} f(x).$$
Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, $f : X \to Y$ and $x_0 \in X$ a point of accumulation of $X$.
Figure 5.15. The function in Example 5.59.
(i) If we find two sets $E_1, E_2$ such that $x_0$ is an accumulation point of both $E_1$ and $E_2$, and the restrictions $f : E_1 \subset X \to Y$ and $f : E_2 \subset X \to Y$ of $f$ have different limits, then $f$ has no limit when $x \to x_0$, $x \in E_1 \cup E_2$.
(ii) If we want to show that $f(x)$ has a limit as $x \to x_0$, we may
a) guess a possible limit $y_0 \in Y$, for instance computing the limit $y_0$ of a suitable restriction of $f$,
b) show that the real-valued function $x \to d_Y(f(x), y_0)$ converges to zero as $x \to x_0$, for instance proving that $d_Y(f(x), y_0) \le h(x)$ for all $x \in X$, $x \neq x_0$, where $h : X \to \mathbb{R}$ is such that $h(x) \to 0$ as $x \to x_0$.

5.59 Example. Let $f : \mathbb{R}^2 \setminus \{(0,0)\} \to \mathbb{R}$ be defined by $f(x,y) := xy/(x^2 + y^2)$ for $(x,y) \neq (0,0)$. Let us show that $f$ has no limit as $(x,y) \to (0,0)$. By contradiction, suppose that $f(x,y) \to L \in \mathbb{R}$ as $(x,y) \to (0,0)$. Then for any sequence $\{(x_n, y_n)\} \subset \mathbb{R}^2 \setminus \{(0,0)\}$ converging to $(0,0)$ we find $f(x_n, y_n) \to L$. Choosing $(x_n, y_n) := (1/n, k/n)$, we have
$$f\Big(\frac{1}{n}, \frac{k}{n}\Big) = \frac{k/n^2}{1/n^2 + k^2/n^2} = \frac{k}{1 + k^2},$$
hence, as $n \to \infty$, $L = k/(1 + k^2)$. Since $k$ is arbitrary, we have a contradiction. This is even more evident if we observe that $f$ is positively homogeneous of degree $0$, i.e., $f(\lambda x, \lambda y) = f(x, y)$ for all $\lambda > 0$, i.e., $f$ is constant along half-lines from the origin, see Figure 5.15. It is then clear that $f$ has a limit at $(0,0)$ if and only if $f$ is constant, which is not the case. Notice that from the inequality $2xy \le x^2 + y^2$ we can easily infer that $|f(x,y)| \le 1/2$ $\forall (x,y) \in \mathbb{R}^2 \setminus \{(0,0)\}$, i.e., that $f$ is a bounded function.

5.60 Example. Let $f(x,y) := \sin(x^2 y)/(x^2 + y^2)$ for $(x,y) \neq (0,0)$. In this case $(1/n, 0) \to (0,0)$ and $f(1/n, 0) = 0 \to 0$. Thus $0$ is the only possible limit as $(x,y) \to (0,0)$; and, in fact it is, since
$$|f(x,y) - 0| = \frac{|\sin(x^2 y)|}{x^2 + y^2} \le |x|\,\frac{|x|\,|y|}{x^2 + y^2} \le \frac{1}{2}|x| \to 0$$
as $(x,y) \to (0,0)$. Here we used $|\sin t| \le |t|$ $\forall t$, $2|x|\,|y| \le x^2 + y^2$ $\forall (x,y)$ and that $(x,y) \to |x|$ is a continuous map in $\mathbb{R}^2$, see Proposition 5.41.
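The argument of Example 5.59 can be checked with exact arithmetic. The sketch below is only an illustration: it evaluates $f(x,y) = xy/(x^2+y^2)$ at the points $(1/n, k/n)$ and shows that the value depends on $k$ but not on $n$. The helper name `value_along_line` is hypothetical and introduced here for the example.

```python
from fractions import Fraction


def f(x, y):
    """f(x, y) = x y / (x^2 + y^2), defined away from the origin."""
    return x * y / (x * x + y * y)


def value_along_line(k, n):
    """Value of f at the point (1/n, k/n), computed with exact rationals."""
    return f(Fraction(1, n), Fraction(k, n))


for k in (0, 1, 2, -3):
    # f is constant along each half-line from the origin, so collecting the
    # values for several n yields a set with a single element, k / (1 + k^2).
    values = {value_along_line(k, n) for n in (1, 10, 1000)}
    print(k, values)
```

Different slopes $k$ give different constants, which is exactly why no limit at the origin can exist.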
We can also consider the restriction of $f$ to continuous paths from $x_0$, i.e., choose a map $\varphi : [0,1] \to \mathbb{R}^n$ that is continuous at least at $0$ with $\varphi(0) = x_0$ and $\varphi(t) \neq x_0$ for $t \neq 0$ and compute, if possible,
$$\lim_{t \to 0^+} f(\varphi(t)).$$
Such limits may or may not exist and their values depend on the chosen path, for a fixed $f$. Of course, if
$$\lim_{x \to x_0,\ x \in E} f(x) = L,$$
then, on account of the restriction property and of the change of variable theorem,
$$\lim_{x \to x_0,\ x \in F} f(x) = L \qquad\text{and}\qquad \lim_{t \to 0^+} f(\varphi(t)) = L$$
respectively for any $F \subset E$ of which $x_0$ remains a point of accumulation and for any continuous path in $E$, $\varphi([0,1]) \subset E$.

5.61 Example. Let us reconsider the function
$$f : \mathbb{R}^2 \setminus \{(0,0)\} \to \mathbb{R}, \qquad f(x,y) := \frac{xy}{x^2 + y^2},$$
which is continuous in $\mathbb{R}^2 \setminus \{(0,0)\}$. Suppose that we move from zero along the straight line $\{(x,y) \mid y = mx,\ x \in \mathbb{R}\}$ that we parametrize by $x \to (x, mx)$. Then
$$f(\varphi(x)) = f(x, mx) = \frac{m}{1 + m^2} \to \frac{m}{1 + m^2} \quad\text{as } x \to 0,$$
in particular, the previous limit depends on $m$, hence $f(x,y)$ has no limit as $(x,y) \to (0,0)$. Set $E := \{(x,y) \mid x \in \mathbb{R},\ 0 < y < \lambda x^2\}$. Then
$$\lim_{(x,y) \to (0,0),\ (x,y) \in E} f(x,y) = 0.$$
In fact, in this case
$$0 \le \frac{|xy|}{x^2 + y^2} \le \frac{\lambda |x|^3}{x^2} = \lambda |x| \quad\text{in } E.$$
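5.62 Example. The function
$$f(x,y) = \begin{cases} \dfrac{x^2 y}{x^4 + y^2} & \text{if } (x,y) \in \mathbb{R}^2 \setminus \{(0,0)\}, \\[2mm] 0 & \text{if } (x,y) = (0,0) \end{cases}$$
is continuous in $\mathbb{R}^2 \setminus \{(0,0)\}$ but is not continuous at $(0,0)$. Restricting $f$ to a straight line through $(0,0)$ parametrized as $\varphi(t) = (ta, tb)$, $t \in \mathbb{R}$, gives
$$f(\varphi(t)) = f(at, bt) = \frac{a^2 b\, t^3}{a^4 t^4 + b^2 t^2} = \frac{a^2 b\, t}{a^4 t^2 + b^2} \to 0 \quad\text{as } t \to 0.$$
However, restricting $f$ along the graph of the function $y = a x^2$ parametrized as $\varphi(x) := (x, a x^2)$, gives
$$f(x, a x^2) = \frac{a x^4}{x^4 + a^2 x^4} \to \frac{a}{1 + a^2} \quad\text{as } x \to 0,$$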
Figure 5.16. Kazimierz Kuratowski (1896-1980) and the frontispiece of the first volume of his Topologie.
thus $f$ has no limit as $(x,y) \to (0,0)$. Let us now consider the restriction of $f$ to the set
$$E := \{(x,y) \mid x > 0,\ |y| < x^3\}.$$
We have
$$\lim_{(x,y) \to (0,0),\ (x,y) \in E} f(x,y) = 0.$$
In fact,
$$\Big| \frac{x^2 y}{x^4 + y^2} - 0 \Big| = \frac{x^2 |y|}{x^4 + y^2} \le |x| \to 0,$$
since $(x,y) \in E$.
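The contrast in Example 5.62 between straight lines and parabolas through the origin can be seen numerically. The following sketch, written under the assumption that the function is $f(x,y) = x^2 y/(x^4 + y^2)$ with $f(0,0) = 0$ as in the example, evaluates $f$ exactly along both kinds of paths; `along_line` and `along_parabola` are hypothetical helper names.

```python
from fractions import Fraction


def f(x, y):
    """f(x, y) = x^2 y / (x^4 + y^2) away from the origin, and f(0, 0) = 0."""
    if x == 0 and y == 0:
        return Fraction(0)
    return x * x * y / (x ** 4 + y * y)


def along_line(a, b, t):
    """Value of f at the point (a t, b t) of a straight line through (0, 0)."""
    return f(a * t, b * t)


def along_parabola(a, x):
    """Value of f at the point (x, a x^2) of the parabola y = a x^2."""
    return f(x, a * x * x)


for t in (Fraction(1, 10), Fraction(1, 1000)):
    # Along the line the values shrink with t; along the parabola they stay
    # equal to a / (1 + a^2), here with a = 1.
    print(t, along_line(1, 1, t), along_parabola(1, t))
```

Every straight line gives the limit $0$, yet the parabola $y = x^2$ keeps the value $1/2$, so checking straight lines alone is not enough to establish a limit.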
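We conclude by observing that for functions $f : \mathbb{R}^n \to \mathbb{R}$ the expression
$$\lim_{x \to x_0} f(x) = +\infty$$
means that $\forall M \in \mathbb{R}$ there is $\delta > 0$ such that $f(x) > M$ $\forall x \in B(x_0, \delta) \setminus \{x_0\}$.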
5.2 The Topology of Metric Spaces
In this section we introduce some families of subsets of a metric space $X$ that are defined by the metric structure, namely the families of open and closed sets. Recall that if $X$ is a set, $\mathcal{P}(X)$ denotes the set of all subsets of $X$: $A \in \mathcal{P}(X)$ if and only if $A \subset X$.
5.2.1 Basic facts
a. Open sets
5.63 Definition. A subset $A$ of a metric space $(X, d)$ is called an open set if for all $x \in A$ there exists a ball centered at $x$ contained in $A$, i.e.,
$$\forall x \in A\ \ \exists r_x > 0 \text{ such that } B(x, r_x) \subset A. \tag{5.11}$$
5.64 Proposition. A subset $A$ of a metric space $X$ is open if and only if either $A$ is empty or is a union of open balls.
Proof. Let $A$ be open. Then either $A$ is empty or $A$ is trivially a union of open balls, $A = \cup_{x \in A} B(x, r_x)$. Conversely, (5.11) trivially holds if $A = \emptyset$. If instead $x \in A \neq \emptyset$, since we assume that $A$ is a union of balls, there are $y \in X$ and $\rho > 0$ such that $x \in B(y, \rho) \subset A$. Thus $y \in A$ and, setting $r := \rho - d(x, y)$, we have $r > 0$ and by the triangle inequality $B(x, r) \subset B(y, \rho) \subset A$. □
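The last step of the proof uses the triangle inequality without writing it out. A spelled-out version, with $z$ denoting an arbitrary point of $B(x, r)$ (the name $z$ is introduced here only for illustration), reads:

```latex
% Notation as in the proof: x \in B(y,\rho) and r := \rho - d(x,y) > 0.
% z is an arbitrary point of B(x,r); the name is introduced for this sketch.
\begin{align*}
d(z, y) &\le d(z, x) + d(x, y) \\
        &<  r + d(x, y) \\
        &=  \rho - d(x, y) + d(x, y) = \rho ,
\end{align*}
% hence z \in B(y,\rho), that is, B(x,r) \subset B(y,\rho) \subset A.
```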
In particular,
5.65 Corollary. The open balls of a metric space $X$ are open sets.
5.66 ¶. Let $(X, d)$ be a metric space and $r > 0$. Show that $\{y \in X \mid d(y, x) > r\}$ is an open set in $X$.
5.67 ¶. Let $(X, d)$ be a metric space. Show that $\{x_n\} \subset X$ converges to $x \in X$ if and only if, for any open set $A$ such that $x \in A$, there exists $\bar n$ such that $x_n \in A$ for all $n \ge \bar n$.
The following is also easily seen.
5.68 Proposition. Let $(X, d)$ be a metric space. Then
(i) $\emptyset$ and $X$ are open sets,
(ii) if $\{A_\alpha\}$ is a family of open sets, then $\cup_\alpha A_\alpha$ is an open set, too,
(iii) if $A_1, A_2, \dots, A_n$ are finitely many open sets, then $\cap_{k=1}^n A_k$ is open.
5.69 ¶. By considering the open sets $\{\,]-\frac{1}{n}, \frac{1}{n}[ \ \mid n \in \mathbb{N}\}$, show that the intersection of infinitely many open sets need not be an open set.
b. Closed sets
Recall that the complement of $A \subset X$ is the set $A^c := X \setminus A$.
5.70 Definition. Let $X$ be a metric space. $F \subset X$ is called a closed set if $F^c = X \setminus F$ is an open set.
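The de Morgan formulas
$$\Big(\bigcup_\alpha A_\alpha\Big)^c = \bigcap_\alpha A_\alpha^c, \qquad \Big(\bigcap_\alpha A_\alpha\Big)^c = \bigcup_\alpha A_\alpha^c$$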
together with Proposition 5.68 yield at once the following.
5.71 Proposition. Let $X$ be a metric space. Then
(i) $\emptyset$ and $X$ are closed sets,
(ii) the intersection of any family of closed sets is a closed set,
(iii) the union of finitely many closed sets is a closed set.
5.72 ¶. Show that $[a, b]$, $[a, +\infty[$ and $[-a, +\infty[$ are closed sets in $\mathbb{R}$, while $[0, 1[$ is neither closed nor open.
5.73 ¶. Show that the set $\{\frac{1}{n} \mid n = 1, 2, \dots\}$ is neither closed nor open.
5.74 ¶. Show that any finite subset of a metric space is a closed set.
5.75 ¶. Show that the closed ball $\{x \in X \mid d(x, x_0) \le r\}$, and its boundary $\{x \in X \mid d(x, x_0) = r\}$ are closed sets.
One may characterize closed sets in terms of convergent sequences.
5.76 Proposition. Let $(X, d)$ be a metric space. A set $F \subset X$ is a closed set if and only if every convergent sequence with values in $F$ converges to a point of $F$.
Proof. Suppose that $F$ is closed and that $\{x_n\} \subset F$ converges to $x \in X$. Let us prove that $x \in F$. Assuming, on the contrary, that $x \notin F$, since $X \setminus F$ is open there exists $r > 0$ such that $B(x, r) \cap F = \emptyset$. As $\{x_n\} \subset F$, we have $d(x_n, x) \ge r$ $\forall n$, a contradiction since $d(x_n, x) \to 0$.
Conversely, suppose that, whenever $\{x_k\} \subset F$ and $x_k \to x$, we have $x \in F$, but $F$ is not closed. Thus $X \setminus F$ is not open, hence there exists a point $x \in X \setminus F$ such that $\forall r > 0$ $B(x, r) \cap F \neq \emptyset$. Choosing $r = 1, \frac{1}{2}, \frac{1}{3}, \dots$, we inductively construct a sequence $\{x_n\} \subset F$ such that $d(x_n, x) < \frac{1}{n}$, hence converging to $x$. Thus $x \in F$ by assumption, but $x \in X \setminus F$ by construction, a contradiction. □
c. Continuity
5.77 Theorem. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and $f : X \to Y$. Then the following claims are equivalent
(i) $f$ is continuous,
(ii) $f^{-1}(B)$ is an open set in $X$ for any open ball $B$ of $Y$,
(iii) $f^{-1}(A)$ is an open set in $X$ for any open set $A$ in $Y$,
(iv) $f^{-1}(F)$ is a closed set in $X$ for any closed set $F$ in $Y$.
Proof. (i) ⇒ (ii). Let $B$ be an open ball in $Y$ and let $x$ be a point in $f^{-1}(B)$. Since $f(x) \in B$, there exists a ball $B_Y(f(x), \epsilon) \subset B$. Since $f$ is continuous at $x$, there exists $\delta > 0$ such that $f(B_X(x, \delta)) \subset B_Y(f(x), \epsilon) \subset B$, that is, $B_X(x, \delta) \subset f^{-1}(B)$. As $x$ is arbitrary, $f^{-1}(B)$ is an open set in $X$.
(ii) ⇒ (i). Suppose $f^{-1}(B)$ is open for any open ball $B$ of $Y$. Then, given $x_0$ and $\epsilon > 0$, $f^{-1}(B_Y(f(x_0), \epsilon))$ is open, hence there is $\delta > 0$ such that $B_X(x_0, \delta)$ is contained in $f^{-1}(B_Y(f(x_0), \epsilon))$, i.e., $f(B_X(x_0, \delta)) \subset B_Y(f(x_0), \epsilon)$, hence $f$ is continuous at $x_0$.
(ii) and (iii) are equivalent since $f^{-1}(\cup_i A_i) = \cup_i f^{-1}(A_i)$ for any family $\{A_i\}$ of subsets of $Y$.
(iii) and (iv) are equivalent on account of the de Morgan formulas. □
5.78 ¶. Let $f, g : X \to Y$ be two continuous functions between metric spaces. Show that the set $\{x \in X \mid f(x) = g(x)\}$ is closed.
5.79 ¶. It is convenient to set
Definition. Let $(X, d)$ be a metric space. $U \subset X$ is said to be a neighborhood of $x_0 \in X$ if there exists an open set $A$ of $X$ such that $x_0 \in A \subset U$.
In particular
o $B(x_0, r)$ is a neighborhood of any $x \in B(x_0, r)$,
o $A$ is open if and only if $A$ is a neighborhood of any point of $A$.
Let $(X, d)$, $(Y, d)$ be two metric spaces, let $x_0 \in X$ and let $f : X \to Y$. Show that $f$ is continuous at $x_0$ if and only if the inverse image of an open neighborhood of $f(x_0)$ is an open neighborhood of $x_0$.
Finally, we state a junction rule for continuous functions, see Proposition 5.38.
5.80 Proposition. Let $(X, d)$ be a metric space, and let $\{U_\alpha\}$ be a covering of $X$. Suppose that either all $U_\alpha$'s are open sets or all $U_\alpha$'s are closed and for any $x \in X$ there is an open ball that intersects only finitely many $U_\alpha$. Then (i) A
d. Continuous real-valued maps
Let $(X, d)$ be a metric space and $f : X \to \mathbb{R}$. From Theorem 5.77 we find that $f : X \to \mathbb{R}$ is continuous if and only if $f^{-1}(]a, b[)$ is an open set for every bounded interval $]a, b[ \subset \mathbb{R}$. Moreover,
5.82 Corollary. Let $f : X \to \mathbb{R}$ be a continuous function defined on a metric space $X$ and let $t \in \mathbb{R}$. Then
(i) $\{x \in X \mid f(x) > t\}$, $\{x \in X \mid f(x) < t\}$ are open sets,
(ii) $\{x \in X \mid f(x) \ge t\}$, $\{x \in X \mid f(x) \le t\}$ and $\{x \in X \mid f(x) = t\}$ are closed sets.
5.83 Proposition. Let $(X, d)$ be a metric space. Then $F \subset X$ is a closed set of $X$ if and only if $F = \{x \mid d(x, F) = 0\}$.
Proof. By Corollary 5.82, $\{x \mid d(x, F) = 0\}$ is closed, $x \to d(x, F)$ being Lipschitz continuous, see Example 5.25. Therefore $F = \{x \mid d(x, F) = 0\}$ implies that $F$ is closed. Conversely assume that $F$ is closed and that there exists $x \notin F$ such that $d(x, F) = 0$. Since $F$ is closed by assumption, there exists $r > 0$ such that $B(x, r) \cap F = \emptyset$. But then $d(x, F) \ge r > 0$, a contradiction. □
5.84 ¶. Prove the following
Proposition. Let $(X, d)$ be a metric space. Then
(i) $F \subset X$ is a closed set if and only if there exists a continuous function $f : X \to \mathbb{R}$ such that $F = \{x \in X \mid f(x) \le 0\}$,
(ii) $A \subset X$ is an open set if and only if there exists a continuous function $f : X \to \mathbb{R}$ such that $A = \{x \in X \mid f(x) < 0\}$.
Actually $f$ can be chosen to be a Lipschitz-continuous function.
[Hint: If $F$ is closed, choose $f(x) := d(x, F)$, while if $A$ is an open set, choose $f(x) = -d(x, X \setminus A)$.]
e. The topology of a metric space
5.85 Definition. The topology of a metric space $X$ is the family $\tau_X \subset \mathcal{P}(X)$ of its open sets.
It may happen that different distances $d_1$ and $d_2$ on the same set $X$ that define different families of balls produce the same family of open sets, for the same reason that a ball is a union of infinitely many squares and a square is a union of infinitely many balls. We say that the two distances are topologically equivalent if $(X, d_1)$ and $(X, d_2)$ have the same topology, i.e., the same family of open sets. The following proposition yields necessary and sufficient conditions in order that two distances be topologically equivalent.
5.86 Proposition. Let $d_1$, $d_2$ be two distances in $X$ and let $B_1(x, r)$ and $B_2(x, r)$ be the corresponding balls of center $x$ and radius $r$. The following claims are equivalent
(i) $d_1$ and $d_2$ are topologically equivalent,
(ii) every ball $B_1(x, r)$ is open for $d_2$ and every ball $B_2(x, r)$ is open for $d_1$,
(iii) $\forall x \in X$ and $r > 0$ there are $r_x, \rho_x > 0$ such that $B_2(x, r_x) \subset B_1(x, r)$ and $B_1(x, \rho_x) \subset B_2(x, r)$,
(iv) the identity map $i : X \to X$ is a homeomorphism between the metric spaces $(X, d_1)$ and $(X, d_2)$.
Figure 5.17. $x$ is an interior point to $A$, $y$ is a boundary point to $A$ and $z$ is an exterior point to $A$. $x$ and $y$ are adherent points to $A$ and $z$ is not.
5.87 ¶. Show that the distances in $\mathbb{R}^n$ $d_\infty$ and $d_p$ $\forall p \ge 1$, see Exercise 5.13, are all topologically equivalent to the Euclidean distance $d_2$. If we substitute $\mathbb{R}^n$ with the infinite-dimensional vector space of sequences $\ell_1$, the three distances give rise to different open sets.
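A numerical illustration of why such distances share the same open sets in $\mathbb{R}^n$: the standard chain $d_\infty \le d_2 \le d_1 \le n\, d_\infty$ (a known fact, assumed here rather than proved) gives ball inclusions of the kind required in Proposition 5.86 (iii). The sketch below only tests the chain on sample points; the function names `dist_1`, `dist_2`, `dist_inf` and `chain_holds` are hypothetical.

```python
import math
import random


def dist_1(x, y):
    """Sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def dist_2(x, y):
    """Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dist_inf(x, y):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def chain_holds(x, y, tol=1e-12):
    """Check dist_inf <= dist_2 <= dist_1 <= n * dist_inf up to rounding."""
    n = len(x)
    return (dist_inf(x, y) <= dist_2(x, y) + tol
            and dist_2(x, y) <= dist_1(x, y) + tol
            and dist_1(x, y) <= n * dist_inf(x, y) + tol)


rng = random.Random(0)  # fixed seed so that the run is reproducible
points = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(200)]
print(all(chain_holds(p, q) for p in points for q in points))
```

A finite test of this kind is of course no proof; it only shows the inequalities that a solution of the exercise would establish for all points.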
We say that a property of $X$ is a topological property of $X$ if it can be expressed only in terms of set operations and open sets. For instance, being an open or closed set, the closure of or the boundary of a set, or a convergent sequence are topological properties of $X$, see Section 5.2.2 for more. As we have seen, $f$ is continuous if and only if the inverse image of open sets is open. A trivial consequence, for instance, is that the composition of continuous functions is continuous, see Proposition 5.30. Also we see that the continuity of $f : X \to Y$ is strongly related to the topologies
$$\tau_X := \{A \subset X \mid A \text{ open in } X\}, \qquad \tau_Y := \{A \subset Y \mid A \text{ open in } Y\},$$
respectively on $X$ and $Y$, and in fact it depends on the metrics only through $\tau_X$ and $\tau_Y$. In other words being a continuous function $f : X \to Y$ is a topological property of $X$ and $Y$.

f. Interior, exterior, adherent and boundary points
5.88 Definition. Let $X$ be a metric space and $A \subset X$. We say that $x_0 \in X$ is interior to $A$ if there is an open ball $B(x_0, r)$ such that $B(x_0, r) \subset A$; we say that $x_0$ is exterior to $A$ if $x_0$ is interior to $X \setminus A$; we say that $x_0$ is adherent to $A$ if it is not interior to $X \setminus A$; finally, we say that $x_0$ is a boundary point of $A$ if $x_0$ is neither interior to $A$ nor interior to $X \setminus A$.
The set of interior points to $A$ is denoted by $\mathring{A}$ or by $\operatorname{int} A$, the set of adherent points of $A$, called also the closure of $A$, is denoted by $\overline{A}$ or by $\operatorname{cl}(A)$, and finally the set of boundary points to $A$ is called the boundary of $A$ and is denoted by $\partial A$.
5.89 ¶. Let $(X, d)$ be a metric space and $B(x_0, r)$ be an open ball of $X$. Show that
(i) every point of $B(x_0, r)$ is interior to $B(x_0, r)$, i.e., $\operatorname{int} B(x_0, r) = B(x_0, r)$,
(ii) every point $x$ such that $d(x, x_0) = r$ is a boundary point to $B(x_0, r)$, i.e., $\partial B(x_0, r) = \{x \mid d(x, x_0) = r\}$,
(iii) every point $x$ with $d(x, x_0) > r$ is exterior to $B(x_0, r)$,
(iv) every point $x$ such that $d(x, x_0) \le r$ is adherent to $B(x_0, r)$, i.e., $\operatorname{cl}(B(x_0, r)) = \{x \mid d(x, x_0) \le r\}$.
5.90 ¶. Let $X$ be a metric space and $A \subset X$. Show that
(i) $\operatorname{int} A \subset A$,
(ii) $\operatorname{int} A$ is an open set and actually the largest open set contained in $A$, $\operatorname{int} A = \cup \{U \mid U \text{ open},\ U \subset A\}$,
(iii) $A$ is open if and only if $A = \operatorname{int} A$.
5.91 ¶. Let $X$ be a metric space and $A \subset X$. Show that
(i) $A \subset \overline{A}$,
(ii) $\overline{A}$ is closed and actually the smallest closed set that contains $A$, i.e., $\operatorname{cl}(A) = \cap \{F \mid F \text{ closed},\ F \supset A\}$,
(iii) $A$ is closed if and only if $A = \overline{A}$,
(iv) $\overline{A} = \{x \in X \mid d(x, A) = 0\}$.
5.92 ¶. Let $X$ be a metric space and $A \subset X$. Show that
(i) $\partial A = \partial (X \setminus A)$,
(ii) $\partial A \cap \operatorname{int} A = \emptyset$, $\overline{A} = \partial A \cup \operatorname{int} A$, $\partial A = \overline{A} \setminus \operatorname{int} A$,
(iii) $\partial A = \overline{A} \cap \overline{A^c}$, in particular $\partial A$ is a closed set,
(iv) $\partial \partial A = \emptyset$, $\overline{\overline{A}} = \overline{A}$, $\operatorname{int} \operatorname{int} A = \operatorname{int} A$,
(v) $A$ is closed if and only if $\partial A \subset A$,
(vi) $A$ is open if and only if $\partial A \cap A = \emptyset$.
5.93 ¶. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces and $f : X \to Y$. Show that the following claims are equivalent
(i) $f : X \to Y$ is continuous,
(ii) $f(\overline{A}) \subset \overline{f(A)}$ for all $A \subset X$,
(iii) $\overline{f^{-1}(B)} \subset f^{-1}(\overline{B})$ for all $B \subset Y$.
g. Points of accumulation
Let $A \subset X$ be a subset of a metric space. The set of points of accumulation, or cluster points, of $A$, denoted by $\mathcal{D}A$, is sometimes called the derived set of $A$. Trivially $\mathcal{D}A \subset \overline{A}$, and the set of adherent points to $A$ that are not points of accumulation of $A$, $\mathcal{I}(A) := \overline{A} \setminus \mathcal{D}A$, are the points $x \in A$ such that $B(x, r) \cap A = \{x\}$ for some $r > 0$. These points are contained in $A$, $\mathcal{I}(A) = A \setminus \mathcal{D}A \subset A$, and are called isolated points of $A$.
5.94 ¶. Show that $\mathcal{D}A \subset \overline{A}$ and that $A$ is closed if and only if $\mathcal{D}A \subset A$.
5.95 Proposition. Let $(X, d)$ be a metric space, $F \subset X$ and $x \in X$. We have
(i) $x$ is adherent to $F$ if and only if there exists a sequence $\{x_n\} \subset F$ that converges to $x$,
(ii) $x$ is an accumulation point for $F$ if and only if there exists a sequence $\{x_n\} \subset F$ taking distinct values in $F$ that converges to $x$; in particular,
a) $x$ is an accumulation point for $F$ if and only if there exists a sequence $\{x_n\} \subset F \setminus \{x\}$ that converges to $x$,
b) in every open set containing an accumulation point for $F$ there are infinitely many distinct points of $F$.
Proof. (i) If there is a sequence $\{x_n\} \subset F$ that converges to $x \in X$, in every neighborhood of $x$ there is at least a point of $F$, hence $x$ is adherent to $F$. Conversely, if $x$ is adherent to $F$, there is an $x_n \in B(x, \frac{1}{n}) \cap F$ for each $n$, hence $\{x_n\} \subset F$ and $x_n \to x$.
(ii) If moreover $x$ is a point of accumulation of $F$, we can choose $x_n \in F \setminus \{x\}$ and moreover $x_n \in B(x, r_n)$, $r_n := \min(d(x, x_{n-1}), \frac{1}{n})$. The sequence $\{x_n\}$ has the desired properties. □
h. Subsets and relative topology
Let $(X, d)$ be a metric space and $Y \subset X$. Then $(Y, d)$ is a metric space, too. The family of open sets in $Y$ induced by the distance $d$ is called the relative topology of $Y$. We want to compare the topology of $X$ and the relative topology of $Y$. The open ball in $Y$ with center $x \in Y$ and radius $r > 0$ is
$$B_Y(x, r) := \{y \in Y \mid d(y, x) < r\} = B_X(x, r) \cap Y.$$
5.96 Proposition. Let $(X, d)$ be a metric space and let $Y \subset X$. Then
(i) $B$ is open in $Y$ if and only if there exists an open set $A \subset X$ in $X$ such that $B = A \cap Y$,
(ii) $B$ is closed in $Y$ if and only if there exists a closed set $A$ in $X$ such that $B = A \cap Y$.
Proof. Since (ii) follows at once from (i), we prove (i). Suppose that $A$ is open in $X$ and let $x$ be a point in $A \cap Y$. Since $A$ is open in $X$, there exists a ball $B_X(x, r) \subset A$, hence $B_Y(x, r) = B_X(x, r) \cap Y \subset A \cap Y$. Thus $A \cap Y$ is open in $Y$. Conversely, suppose that $B$ is open in $Y$. Then for any $x \in B$ there is a ball $B_Y(x, r_x) = B_X(x, r_x) \cap Y \subset B$. The set $A := \cup \{B_X(x, r_x) \mid x \in B\}$ is an open set in $X$ and $A \cap Y = B$. □
Also the notions of interior, exterior, adherent and boundary points in $(Y, d)$ are related to the same notions in $(X, d)$, and whenever we want to emphasize the dependence on $Y$ of the interior, closure, derived and boundary sets we write $\operatorname{int}_Y(A)$, $\overline{A}_Y$, $\mathcal{D}_Y A$, $\partial_Y A$ instead of $\operatorname{int} A$, $\overline{A}$, $\mathcal{D}A$, $\partial A$.
5.97 Proposition. For any $A \subset Y$ we have
(i) $\operatorname{int}_Y(A) = \operatorname{int}_X(A) \cap Y$,
(ii) $\overline{A}_Y = \overline{A}_X \cap Y$,
(iii) $\mathcal{D}_Y A = \mathcal{D}_X A \cap Y$,
(iv) $\partial_Y A = \partial_X A \setminus \partial_X Y$.

Figure 5.18. $\partial A$ and $\partial_Y A$.
5.98 ¶. Let $Y := [0, 1[ \subset \mathbb{R}$. The open balls of $Y$ are the subsets of the type $\{y \in [0, 1[ \ \mid |y - x| < r\}$. If $x$ is not zero and $r$ is sufficiently small, $\{y \mid |y - x| < r\} \cap [0, 1[$ is again an open interval with center $x$, $]x - r, x + r[$. But, if $x = 0$, then for $r < 1$, $B_Y(0, r) := [0, r[$. Notice that $x = 0$ is an interior point of $Y$ (for the relative topology of $Y$), but it is a boundary point for the topology of $X$. This is in agreement with the intuition: in the first case we are considering $Y$ as a space in itself and nothing exists outside it, every point is an interior point and $\partial_Y Y = \emptyset$; in the second case $Y$ is a subset of $\mathbb{R}$ and $0$ is at the frontier between $Y$ and $\mathbb{R} \setminus Y$.
5.99 ¶. Prove the claims of this paragraph that we have not proved.
5.2.2 A digression on general topology
a. Topological spaces
As a further step of abstraction, following Felix Hausdorff (1869-1942) and Kazimierz Kuratowski (1896-1980), we can separate the topological structure of open sets from the metric structure, giving a set-definition of open sets in terms of their properties.
5.100 Definition. Let $X$ be a set. A topology in $X$ is a distinguished family of subsets $\tau \subset \mathcal{P}(X)$, called open sets, such that
o $\emptyset, X \in \tau$,
o if $\{A_\alpha\} \subset \tau$, then $\cup_\alpha A_\alpha \in \tau$,
o if $A_1, A_2, \dots, A_n \in \tau$, then $\cap_{k=1}^n A_k \in \tau$.
A set $X$ endowed with a topology is called a topological space. Sometimes we write it as $(X, \tau)$.
5.101 Definition. A function $f : X \to Y$ between topological spaces $(X, \tau_X)$ and $(Y, \tau_Y)$ is said to be continuous if $f^{-1}(B) \in \tau_X$ whenever $B \in \tau_Y$. $f : X \to Y$ is said to be a homeomorphism if $f$ is both injective and surjective and both $f$ and $f^{-1}$ are continuous, or, in other words, $A \in \tau_X$ if and only if $f(A) \in \tau_Y$.
Two topological spaces are said to be homeomorphic if and only if there exists a homeomorphism between them.
Proposition 5.68 then reads as follows.
5.102 Proposition. Let $(X, d)$ be a metric space. Then the family formed by the empty set and by the sets that are the union of open balls of $X$ is a topology on $X$, called the topology on $X$ induced by the metric $d$.
The topological structure is more flexible than the metric structure, and allows us to greatly enlarge the notion of the space on which we can operate with continuous deformations. This is in fact necessary if one wants to deal with qualitative properties of geometric figures, in the old terminology, with analysis situs. We shall not dwell on these topics nor on the systematic analysis of different topologies that one can introduce on a set, i.e., on the study of general topology. However, it is proper to distinguish between metric properties and topological properties.
According to Felix Klein (1849-1925) a geometry is the study of the properties of figures or spaces that are invariant under the action of a certain set of transformations. For instance, Euclidean plane geometry is the study of the plane figures and of their properties that are invariant under similarity transformations.
Given a metric space $(X, d)$, a property of an object defined in terms of the set operations in $X$ and of the metric of $X$ is a metric property of $X$, for instance whether $\{x_n\} \subset X$ is convergent or not is a metric property of $X$. More generally, in the class of metric spaces, the natural transformations are those $h : (X, d_X) \to (Y, d_Y)$, called isometries, that are one-to-one and do not change the distances, $d_Y(h(x), h(y)) = d_X(x, y)$. Also two metric spaces $(X, d)$ and $(Y, d)$ are said to be isometric if there exists an isometry between them.
A metric invariant is a predicate defined on a class of metric spaces that is true (respectively, false) for all spaces isometric with $(X, d)$ whenever it is true (false) for $(X, d)$. With this language, the metric properties that make sense for a class of metric spaces, being evidently preserved by isometries, are metric invariants. And the Geometry of Metric Spaces, that is the study of metric spaces and of their metric properties, is in fact the study of metric invariants.
5.103 ¶. Let $(X, d_1)$ and $(Y, d_2)$ be two metric spaces and denote respectively by $B_1(x, r)$ and $B_2(x, r)$ the ball centered at $x$ and radius $r$ for the metrics $d_1$ and $d_2$. Show that a one-to-one map $h : X \to Y$ is an isometry if and only if the action of $h$ preserves the balls, i.e.,
$$h(B_1(x, r)) = B_2(h(x), r) \qquad \forall x \in X,\ \forall r > 0.$$
Similarly, given a topological space $(X, \tau_X)$, a property of an object defined in terms of the set operations and open sets of $X$ is called a topological property of $X$, for instance being an open or closed subset, being the closure or boundary of a subset, or being a convergent sequence in $X$ are topological properties of $X$. In the class of topological spaces, the natural group of transformations is the group of homeomorphisms, that are precisely all the one-to-one maps whose actions preserve the open sets. Two topological spaces are said to be homeomorphic if there is a homeomorphism from one to the other. A topological invariant is a predicate defined on a class of topological spaces that is true (false) in any topological space that is homeomorphic to $X$ whenever it is true (false) on $X$. With this language, topological properties that make sense for a class of topological spaces, being evidently preserved by the homeomorphisms, are topological invariants. And the topology, that is the study of objects and of their properties that are preserved by the action of homeomorphisms, is in fact the study of topological invariants.

b. Topologizing a set
On a set $X$ we may introduce several topologies, that is subsets of $\mathcal{P}(X)$. Since such subsets are ordered by inclusion, topologies are partially ordered by inclusion. On one hand, we may consider the indiscrete topology $\tau = \{\emptyset, X\}$ in which no other sets than $\emptyset$ and $X$ are open, thus there are no "small" neighborhoods. On the other hand, we can consider the discrete topology in which any subset is an open set, $\tau = \mathcal{P}(X)$, thus any point is an open set.
There is a kind of general procedure to introduce a topology in such a way that the sets of a given family $\mathcal{E} \subset \mathcal{P}(X)$ are all open sets. Of course we can take the discrete topology but what is significant is the smallest family of subsets $\tau$ that contains $\mathcal{E}$ and is closed with respect to finite intersections and arbitrary unions. This is called the coarsest topology or the weakest topology for which $\mathcal{E} \subset \tau$. It is unique and can be obtained by adding to $\mathcal{E}$, if necessary, the empty set, $X$, the finite intersections of elements of $\mathcal{E}$ and the arbitrary unions of these finite intersections.
This previous construction is necessary, but in general it is quite complicated and $\mathcal{E}$ loses control on $\tau$, since $\tau$ builds up from finite intersections of elements of $\mathcal{E}$. However, if the family $\mathcal{E}$ has the following property, as for instance it happens for the balls of a metric space, this can be avoided.
A basis $\mathcal{B}$ of $X$ is a family of subsets of $X$ with the following property: for every couple $U_\alpha$ and $U_\beta \in \mathcal{B}$ there is $U_\gamma \in \mathcal{B}$ such that $U_\gamma \subset U_\alpha \cap U_\beta$. We have the following.
5.104 Proposition. Let $\mathcal{B} = \{U_\alpha\}$ be a basis for $X$. Then the family $\tau$ consisting of $\emptyset$, $X$ and all the unions of members of $\mathcal{B}$ is the weakest topology in $X$ containing $\mathcal{B}$.

c. Separation properties
It is worth noticing that several separation properties that are trivial in a metric space do not hold, in general, in a topological space. The following claims,
(i) sets consisting of a single point are closed,
(ii) for any two distinct points $x$ and $y \in X$ there exist disjoint open sets $A$ and $B$ such that $x \in A$ and $y \in B$,
(iii) for any $x \in X$ and closed set $F \subset X$ with $x \notin F$ there exist disjoint open sets $A$ and $B$ such that $x \in A$ and $F \subset B$,
(iv) for any pair of disjoint closed sets $E$ and $F$ there exist disjoint open sets $A$ and $B$ such that $E \subset A$ and $F \subset B$,
are all true in a metric space, but do not hold in the indiscrete topology. A topological space is called a Hausdorff topological space if (ii) holds, regular if (iii) holds and normal if (iv) holds. It is easy to show that (i) and (iv) imply (iii), (i) and (iii) imply (ii), and (ii) implies (i).
We conclude by stating a theorem that gives conditions under which a topological space is metrizable, i.e., under which we can introduce a metric on it so that the topology is the one induced by the metric.
5.105 Theorem (Urysohn). A topological space $X$ with a countable basis is metrizable if and only if it is regular.
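On a finite set the axioms of Definition 5.100 and the Hausdorff property can be checked by brute force. The following sketch is an illustration under that finiteness assumption only (for a finite family, closure under pairwise unions and intersections is enough); the names `powerset`, `is_topology` and `is_hausdorff` are hypothetical helpers.

```python
from itertools import combinations


def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_topology(X, tau):
    """Check the axioms of a topology for a finite family tau of subsets."""
    X = frozenset(X)
    tau = {frozenset(s) for s in tau}
    if frozenset() not in tau or X not in tau:
        return False
    # tau is finite, so closure under pairwise unions and intersections
    # implies closure under all unions and all finite intersections.
    return all((a | b) in tau and (a & b) in tau
               for a, b in combinations(tau, 2))


def is_hausdorff(X, tau):
    """Check that any two distinct points have disjoint open neighborhoods."""
    tau = [frozenset(s) for s in tau]
    for x, y in combinations(list(X), 2):
        if not any(x in a and y in b and not (a & b)
                   for a in tau for b in tau):
            return False
    return True


X = {0, 1, 2}
indiscrete = [set(), X]
discrete = powerset(X)
print(is_topology(X, indiscrete), is_hausdorff(X, indiscrete))
print(is_topology(X, discrete), is_hausdorff(X, discrete))
```

Both families are topologies, but only the discrete one separates points, in agreement with the remark that the Hausdorff property fails in the indiscrete topology.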
5.3 Completeness
a. Complete metric spaces
5.106 Definition. A sequence $\{x_n\}$ with values in a metric space $(X, d)$ is a Cauchy sequence if
$$\forall \epsilon > 0\ \ \exists \nu \text{ such that } d(x_n, x_m) < \epsilon \quad \forall n, m \ge \nu.$$
It is easily seen that
5.107 Proposition. In a metric space
(i) every convergent sequence is a Cauchy sequence,
(ii) any subsequence of a Cauchy sequence is again a Cauchy sequence,
(iii) if $\{x_{k_n}\}$ is a subsequence of a Cauchy sequence $\{x_n\}$ such that $x_{k_n} \to x_0$, then $x_n \to x_0$.
5.108 Definition. A metric space $(X, d)$ is called complete if every Cauchy sequence converges in $X$.
By definition, being a Cauchy sequence and being a complete metric space are metric invariants. With Definition 5.108, Theorems 2.35 and 4.23 of [GM2] read as: $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{C}$ are complete metric spaces. Moreover, since
$$|x_1|, |x_2|, \dots, |x_n| \le \|x\| \le \sum_{i=1}^n |x_i| \qquad \forall x = (x_1, x_2, \dots, x_n),$$
$\{x_k\} \subset \mathbb{R}^n$ or $\mathbb{C}^n$ is a convergent sequence (respectively, Cauchy sequence) if and only if the sequences of coordinates $\{x_k^i\}$, $i = 1, \dots, n$, are convergent sequences (Cauchy sequences). Thus
5.109 Theorem. For all $n \ge 1$, $\mathbb{R}^n$ and $\mathbb{C}^n$ endowed with the Euclidean metric are complete metric spaces.

b. Completion of a metric space
Several useful metric spaces are complete. Notice that closed sets of a complete metric space are complete metric spaces with the induced distance. However, there are noncomplete metric spaces. The simplest significant examples are of course the open intervals of $\mathbb{R}$ and the set of rational numbers with the Euclidean distance.
Let $X$ be a metric space. A complete metric space $X^*$ is called a completion of $X$ if
(i) $X$ is isometric to a subspace $\widetilde{X}$ of $X^*$,
(ii) $\widetilde{X}$ is dense in $X^*$, i.e., $\operatorname{cl} \widetilde{X} = X^*$.
We have the following.
5.110 Theorem (Hausdorff). Every metric space $X$ has a completion and any two completions of $X$ are isometric.
Though every noncomplete metric space can be regarded as a subspace of its completion, it is worth remarking that from an effective point of view the real problem is to realize a suitable, handy model of this completion. For instance, the Hausdorff model, when applied to rationals, constructs the completion as equivalence classes of Cauchy sequences of rationals, instead of real numbers. In the same way, the Hausdorff procedure applied to a metric space $X$ of functions produces a space of equivalence classes of Cauchy sequences. It would be desirable to obtain another class of functions as completion, instead. But this can be far from trivial. For instance a space of functions that is the completion of $C^0([0,1])$ with the $L^1([0,1])$ distance can be obtained by the Lebesgue integration theory.
5.111 ¶. Show that a closed set $F$ of a complete metric space is complete with the induced metric.
5.112 ¶. Let $(X, d)$ be a complete metric space and $A \subset X$. Show that the closure of $A$ is a completion of $A$.
Proof of Theorem 5.110. We outline the main steps, leaving to the reader the task of completing the arguments.

(i) We consider the family of all Cauchy sequences of $X$ and we say that two Cauchy sequences $\{y_n\}$ and $\{z_n\}$ are equivalent if $d(y_n, z_n) \to 0$ (i.e., if, a posteriori, $\{y_n\}$ and $\{z_n\}$ "have the same limit"). Denote by $\tilde X$ the set of equivalence classes obtained this way. Given two equivalence classes $Y$ and $Z$ in $\tilde X$, let $\{y_n\}$ and $\{z_n\}$ be two representatives respectively of $Y$ and $Z$. Then one sees that
(a) $\{d(z_n, y_n)\}$ is a Cauchy sequence of real numbers, hence converges to a real number. Moreover, such a limit does not depend on the representatives $\{y_n\}$ and $\{z_n\}$ of $Y$ and $Z$, so that $d(z_n, y_n) \to \tilde d(Z, Y)$,
(b) $\tilde d(Y, Z)$ is a distance in $\tilde X$.

(ii) Let $\hat X$ be the subspace of $\tilde X$ of the equivalence classes of the constant sequences with values in $X$. It turns out that $X$ is isometric to $\hat X$. Let $Y \in \tilde X$ and let $\{y_n\}$ be a representative of $Y$. Denote by $Y_\nu$ the class of all Cauchy sequences that are equivalent to the constant sequence $\{z_n\}$ where $z_n := y_\nu$ $\forall n$. Then it is easily seen that $Y_\nu \to Y$ in $\tilde X$ and that $\hat X$ is dense in $\tilde X$.

(iii) Let $\{Y_\nu\}$ be a Cauchy sequence in $\tilde X$. For all $\nu$ we choose $Z_\nu \in \hat X$ such that $\tilde d(Y_\nu, Z_\nu) < 1/\nu$ and we let $z_\nu \in X$ be a representative of $Z_\nu$. Then we see that $\{z_\nu\}$ is a Cauchy sequence in $X$ and, if $Z$ is the equivalence class of $\{z_\nu\}$, then $Y_\nu \to Z$. This proves that $\tilde X$ is complete.

(iv) It remains to prove that any two completions are isometric. Suppose that $\tilde X_1$ and $\tilde X_2$ are two completions of $X$. With the above notation, we find $\hat X_1 \subset \tilde X_1$ and $\hat X_2 \subset \tilde X_2$ that are isometric to $X$ and in a one-to-one correspondence with $X$. Therefore $\hat X_1$ and $\hat X_2$ are isometric and in a one-to-one correspondence. Because $\hat X_1$ and $\hat X_2$ are dense respectively in $\tilde X_1$ and $\tilde X_2$, it is not difficult to extend the isometry $i : \hat X_1 \to \hat X_2$ to an isometry between $\tilde X_1$ and $\tilde X_2$. □

Figure 5.19. Felix Hausdorff (1869-1942) and René-Louis Baire (1874-1932).
c. Equivalent metrics

Completeness is a metric invariant and not a topological invariant. This means that isometric spaces are either both complete or both noncomplete, and that there exist metric spaces $X$ and $Y$ that are homeomorphic, with $X$ complete and $Y$ noncomplete. In fact, homeomorphisms preserve convergent sequences but not Cauchy sequences.

5.113 Example. Consider $X := \mathbb{R}$ endowed with the Euclidean metric and $Y := \mathbb{R}$ endowed with the distance

$$d(x, y) := \left| \frac{x}{1 + |x|} - \frac{y}{1 + |y|} \right|.$$

$X$ and $Y$ are homeomorphic: the map $h(x) := \frac{x}{1 + |x|}$, $x \in \mathbb{R}$, is a homeomorphism of $\mathbb{R}$ onto $]-1, 1[$ and $d(x, y) = |h(x) - h(y)|$. In particular both distances give rise to the same converging sequences. However the sequence $\{n\}$ is not a Cauchy sequence for the Euclidean distance, but it is a Cauchy sequence for the metric $d$ since for $n, m \in \mathbb{N}$, $m \ge n$,

$$d(m, n) = \left| \frac{n}{1 + n} - \frac{m}{1 + m} \right| \le 1 - \frac{n}{1 + n} \to 0 \qquad \text{as } n \to \infty.$$

Since $\{n\}$ does not converge in $(\mathbb{R}, d)$, $Y = (\mathbb{R}, d)$ is not complete.
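The estimate in Example 5.113 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `d`, `euclid` and `tail_spread` are illustrative, and a finite window of indices stands in for the full tail of the sequence.

```python
def d(x, y):
    """Distance of Example 5.113: d(x, y) = |x/(1+|x|) - y/(1+|y|)|."""
    return abs(x / (1 + abs(x)) - y / (1 + abs(y)))


def euclid(x, y):
    """Euclidean distance on the real line."""
    return abs(x - y)


def tail_spread(dist, n, horizon=10_000):
    """Largest dist(m, n) for n <= m < n + horizon (finite proxy for the tail)."""
    return max(dist(m, n) for m in range(n, n + horizon))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # Under d the spread shrinks like 1/(1+n); under the Euclidean
        # distance it stays equal to horizon - 1.
        print(n, tail_spread(d, n), tail_spread(euclid, n))
```

Under these assumptions the spread with respect to `d` stays below $1/(1+n)$, in agreement with the inequality above, while the Euclidean spread does not decrease at all.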
Homeomorphic but nonisometric spaces can sometimes have the same Cauchy sequences. A sufficient condition ensuring that the Cauchy sequences with respect to two different metrics $d_1$, $d_2$ on the same set $X$ are the same is that the two metrics be equivalent, i.e., that there exist constants $\lambda_1, \lambda_2 > 0$ such that

$$\lambda_1 d_1(x, y) \le d_2(x, y) \le \lambda_2 d_1(x, y) \qquad \forall x, y \in X.$$
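The reason equivalence suffices can be written out in one line. The following worked step is a sketch added for convenience; it uses only the two-sided inequality stated above.

```latex
% If \{x_n\} is Cauchy for d_1, then for every \varepsilon > 0 there is \bar n with
% d_1(x_n, x_m) < \varepsilon / \lambda_2 for all n, m \ge \bar n, hence
\[
  d_2(x_n, x_m) \le \lambda_2\, d_1(x_n, x_m) < \varepsilon
  \qquad \forall n, m \ge \bar n .
\]
% Conversely, if \{x_n\} is Cauchy for d_2, the left inequality gives
\[
  d_1(x_n, x_m) \le \frac{1}{\lambda_1}\, d_2(x_n, x_m),
\]
% so \{x_n\} is Cauchy for d_1 as well.
```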
5.114 ¶. Show that two equivalent metrics on a set are also topologically equivalent, compare Proposition 5.86.
d. The nested sequence theorem

An extension of Cantor's principle, or the nested intervals theorem in $\mathbb{R}$, see [GM2], holds in a complete metric space.

5.115 Proposition. Let $\{E_k\}$ be a monotone-decreasing sequence of nonempty sets, i.e., $\emptyset \ne E_{k+1} \subset E_k$, $k = 0, 1, \dots$, in a complete metric space $X$. If $\mathrm{diam}(E_k) \to 0$, then there exists one and only one point $x \in X$ with the following property: any ball centered at $x$ contains one, and therefore infinitely many, of the $E_k$'s. Moreover, if all the $E_k$ are closed, then $\cap_k E_k = \{x\}$.

As a special case we have the following.

5.116 Corollary. In a complete metric space a sequence of nested closed balls with diameters that converge to zero has a unique common point.

Notice that the conclusion of Corollary 5.116 does not hold if the diameters do not converge to zero: for $E_k := [k, +\infty[ \subset \mathbb{R}$ we have $\emptyset \ne E_{k+1} \subset E_k$ and $\cap_k E_k = \emptyset$.

5.117 ¶. Prove Proposition 5.115.
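As an illustration of the nested sequence theorem, the sketch below builds nested closed intervals with rational endpoints by bisection, each containing $\sqrt 2$. The function name `nested_intervals` and the choice of $\sqrt 2$ are assumptions made for this example only.

```python
from fractions import Fraction


def nested_intervals(steps):
    """Return intervals [a_k, b_k] with rational endpoints, a_k^2 <= 2 <= b_k^2."""
    a, b = Fraction(1), Fraction(2)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        # Keep the half that still contains sqrt(2).
        if mid * mid <= 2:
            a = mid
        else:
            b = mid
        intervals.append((a, b))
    return intervals


ivs = nested_intervals(30)

if __name__ == "__main__":
    a, b = ivs[-1]
    print(float(a), float(b), float(b - a))
```

The diameters are $2^{-k}$, so in $\mathbb{R}$ the intersection of the intervals is the single point $\sqrt 2$. Regarded as closed subsets of $\mathbb{Q}$ the same sets have empty intersection, which shows that completeness cannot be dropped.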
e. Baire's theorem

5.118 Theorem (Baire). Let $X$ be a complete metric space that can be written as a denumerable union of closed sets $\{E_i\}$, $X = \cup_{i=1}^\infty E_i$. Then at least one of the $E_i$'s contains a ball of $X$.

Proof. Suppose that none of the $E_i$'s contains a ball of $X$ and let $x_1 \notin E_1$. Since $E_1$ is closed, there is $r_1$ such that $\mathrm{cl}(B(x_1, r_1)) \cap E_1 = \emptyset$. Inside $\mathrm{cl}(B(x_1, r_1/2))$ there is now $x_2 \notin E_2$ (otherwise $\mathrm{cl}(B(x_1, r_1/2)) \subset E_2$, which is a contradiction) and $r_2$ such that $\mathrm{cl}(B(x_2, r_2)) \cap E_2 = \emptyset$; also we may choose $r_2 < r_1/2$. Iterating this procedure we find a monotonic-decreasing family of closed balls $\{\mathrm{cl}(B(x_k, r_k))\}$, $\mathrm{cl}(B(x_1, r_1)) \supset \mathrm{cl}(B(x_2, r_2)) \supset \cdots$, such that $\mathrm{cl}(B(x_n, r_n)) \cap E_n = \emptyset$. Thus the point common to all these balls, which exists by Corollary 5.116, would not belong to any of the $E_n$, a contradiction. □
An equivalent formulation is the following.

5.119 Proposition. The denumerable intersection of open dense sets of a complete metric space $X$ is dense in $X$.

5.120 Definition. A subset $A$ of a metric space $X$ is said to be nowhere dense if its closure has no interior point, $\mathrm{int}\,\mathrm{cl}(A) = \emptyset$, equivalently, if $X \setminus \mathrm{cl}(A)$ is dense in $X$. A set is called meager or of the first category if it can be written as a countable union of nowhere dense sets. If a set is not of the first category, then we say that it is of the second category.

5.121 Proposition. In a complete metric space a meager set has no interior point, or, equivalently, its complement is dense.

Proof. Let $\{A_n\}$ be a countable family such that $\mathrm{int}\,\mathrm{cl}\,A_n = \emptyset$. Suppose there is an open set $U$ with $U \subset \cup_n A_n$. From $U \subset \cup_n A_n \subset \cup_n \mathrm{cl}\,A_n$ we deduce $\cap_n (\mathrm{cl}\,A_n)^c \subset U^c$. Since each $(\mathrm{cl}\,A_n)^c$ is open and dense, Baire's theorem, see Proposition 5.119, then implies that $U^c$ is dense. Since $U^c$ is closed, we conclude that $U^c = X$, i.e., $U = \emptyset$. □
5.122 Corollary. A complete metric space is a set of second category.

This form of Baire's theorem is often used to prove existence results or to show that a certain property is generic, i.e., holds for "almost all points" in the sense of the category, i.e., that the set

$$X \setminus \{x \in X \mid p(x) \text{ holds}\}$$

is a meager set. In this way one can show¹, see also Chapters 9 and 10, the following.

5.123 Proposition. The class of continuous functions on the interval $[0,1]$ which have infinite right-derivative at every point is of second category in $C^0([0,1])$ with the uniform distance; in particular, there exist continuous functions that are nowhere differentiable.

Finally we notice that, though for a meager set $A$ we have $\mathrm{int}\,A = \emptyset$, we may have $\mathrm{int}\,\mathrm{cl}\,A \ne \emptyset$: consider $A := \mathbb{Q} \subset \mathbb{R}$.

¹ See, e.g., J. Dugundji, Topology, Allyn and Bacon Inc., Boston.
5.4 Exercises

5.124 ¶. Show that $|x - y|^2$ is not a distance in $\mathbb{R}$.

5.125 ¶. Let $(X, d)$ be a metric space and $M > 0$. Show that the functions

$$d_1(x, y) := \min(M, d(x, y)), \qquad d_2(x, y) := \frac{d(x, y)}{1 + d(x, y)}$$

are also distances in $X$ that give rise to the same topology.

5.126 ¶. Plot the balls of the following metric in $\mathbb{C}$:

$$d(z, w) = \begin{cases} |z - w| & \text{if } \arg z = \arg w \text{ or } z = w, \\ |z| + |w| & \text{otherwise.} \end{cases}$$

5.127 ¶. Let $(X, d_X)$ be a metric space. Show that, if $f : [0, +\infty[ \to [0, +\infty[$ is concave, $f(0) = 0$ and $f(t) > 0$ for $t > 0$, then

$$d(x, y) := f(d_X(x, y))$$

is a distance on $X$; in particular $d_X^\alpha(x, y)$ is a distance for any $\alpha$, $0 < \alpha \le 1$. Notice that instead $\|\cdot\|^\alpha$, $0 < \alpha < 1$, is not a norm, if $\|\cdot\|$ is a norm.

5.128 ¶. Let $f : (X, d_X) \to (Y, d_Y)$ be $\alpha$-Hölder continuous. Show that $f : (X, d_X^\alpha) \to (Y, d_Y)$ is Lipschitz continuous.
5.129 %. Let S be the space of all sequences of real numbers. Show that the function d: S X S -^R given by
if $x = \{x_n\}$, $y = \{y_n\}$, is a distance on $S$.

5.130 ¶ Constancy of sign. Let $X$ and $Y$ be metric spaces and $F$ be a closed set of $Y$, let $x_0$ be a point of accumulation for $X$ and $f : X \to Y$. If $f(x) \to y$ as $x \to x_0$ and $y \notin F$, then there exists $\delta > 0$ such that $f(B(x_0, \delta) \setminus \{x_0\}) \cap F = \emptyset$.

5.131 ¶. Let $(X, d)$ and $(Y, \delta)$ be two metric spaces and let $X \times Y$ be the product metric space with the metric

$$\rho((x_1, y_1), (x_2, y_2)) := \sqrt{d(x_1, x_2)^2 + \delta(y_1, y_2)^2}.$$

Show that the projection maps $\pi(x, y) = x$, $\hat\pi(x, y) = y$,
(i) are continuous,
(ii) map open sets into open sets,
(iii) but, in general, do not map closed sets into closed sets.

5.132 ¶ Continuity of operations on functions. Let $* : Y \times Z \to W$ be a map which we think of as an operation. Given $f : X \to Y$ and $g : X \to Z$, we may then define the map $f * g : X \to W$ by $f * g(x) = f(x) * g(x)$, $x \in X$. Suppose that $X, Y, Z, W$ are metric spaces, and consider $Y \times Z$ as the product metric space with the distance as in Exercise 5.131. Show that if $f$, $g$ are continuous at $x_0$, and $*$ is continuous at $(f(x_0), g(x_0))$, then $f * g$ is continuous at $x_0$.
5.133 ¶. Show that
(i) the parametric equation of straight lines in $\mathbb{R}^n$, $t \to \mathbf{a} t + \mathbf{b}$, $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, is a continuous function,
(ii) the parametric equation of the helix in $\mathbb{R}^3$, $t \to (\cos t, \sin t, t)$, $t \in \mathbb{R}$, is a continuous function.

5.134 ¶. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, $E \subset X$, $x_0$ be a point of accumulation of $E$ and $f : E \subset X \to Y$. Show that $f(x) \to y_0$ as $x \to x_0$, $x \in E$, if and only if $\forall \epsilon > 0$ $\exists \delta > 0$ such that $f(E \cap B(x_0, \delta) \setminus \{x_0\}) \subset B(y_0, \epsilon)$.

5.135 ¶. Show that the scalar product in $\mathbb{R}^n$, $(\mathbf{x}|\mathbf{y}) := \sum_{i=1}^n x^i y^i$, $\mathbf{x} = (x^1, x^2, \dots, x^n)$, $\mathbf{y} = (y^1, y^2, \dots, y^n) \in \mathbb{R}^n$, is a continuous function of the $2n$ variables $(\mathbf{x}, \mathbf{y}) \in \mathbb{R}^{2n}$.

5.136 ¶. Find the maximal domain of definition of the following functions and decide whether they are continuous there:
1 + 2/2'
a;-logy'
\/xey — ye^.
5.137 ^ . Decide whether the following functions are continuous
J ^
if(x,2/)^(0,0),
f ^
if(x,2/)^(0,0),
0
if(x,2/) = (0,0).
\o
if(x,2/) = (0,0),
I | ^
if(x,2/)^(0,0),
J;;:^^ [ ^
if 2 / iiy-x^>y/W\' - x 2 > ^ and (a., 2 / ) / ( 0 , 0 ) ,
\o
if(x,2/) = (0,0),
\o
if(x,2/) = (0,0).
5.138 ^ . Compute, if they exist, the limits as (x,y) —> (0,0) of log2 {l-{-xy)
X sin(x2 + 3y^)
s i n x ( l — cosa?)
x^ sin^ y sin(x2 + 2/2^
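5.139 ¶. Consider $\mathbb{R}$ with the Euclidean topology. Show that
(i) if $A = ]a, b[$, we have $\mathrm{int}\,A = A$, $\mathrm{cl}\,A = [a, b]$ and $\partial A = \{a, b\}$,
(ii) if $A = [a, b[$, we have $\mathrm{int}\,A = ]a, b[$, $\mathrm{cl}\,A = [a, b]$ and $\partial A = \{a, b\}$,
(iii) if $A = [a, +\infty[$, we have $\mathrm{int}\,A = ]a, +\infty[$, $\mathrm{cl}\,A = A$ and $\partial A = \{a\}$,
(iv) if $A = \mathbb{Q} \subset \mathbb{R}$, we have $\mathrm{int}\,A = \emptyset$, $\mathrm{cl}\,A = \mathbb{R}$ and $\partial A = \mathbb{R}$,
(v) if $A = \{(x, y) \in \mathbb{R}^2 \mid x = y\}$, we have $\mathrm{int}\,A = \emptyset$, $\mathrm{cl}\,A = A$ and $\partial A = A$,
(vi) if $A = \mathbb{N} \subset \mathbb{R}$, we have $\mathrm{int}\,A = \emptyset$, $\mathrm{cl}\,A = A$ and $\partial A = A$.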
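5.140 ¶. Let $(X, d)$ be a metric space and $\{A_i\}$ be a family of subsets of $X$. Show that

$$\cup_i\, \mathrm{cl}\,A_i \subset \mathrm{cl}(\cup_i A_i), \qquad \mathrm{cl}(\cap_i A_i) \subset \cap_i\, \mathrm{cl}\,A_i.$$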
5.141 ^ . Prove the following T h e o r e m . Any open set A ofR is either empty or a finite or denumerable disjoint open intervals with endpoints that do not belong to A.
union of
192
5. Metric Spaces and Continuous Functions
[Hint: Show that (i) Vx G A there is an interval ]C,77[ with x G]^,ry[ and ^, r] ^ A, (ii) if two such intervals ]$I,T7I[ and ]?2,^2[ have a common point in A and endpoints not in A, then they are equal, (iii) since each of those intervals contains a rational, then they are at most countable many.] Show that the previous theorem does not hold in R^. Show that we instead have the following. T h e o r e m . Every open set A CR'^ is the union of a finite or countable union of cubes with disjoint interiors. 5.142 %. Prove the following theorem, see Exercise 5.141, T h e o r e m . Every closed set F C R can be obtained by taking out from R a finite or countable family of disjoint open intervals. 5.143 1. Let X be a metric space. (i) Show that a^o G X is an interior point of A C X ii and only if there is an open set U such that XQ E U C A. (ii) Using only open sets, express that XQ is an exterior point to A, an adherent point to A and a boundary point to A. 5.144 %. Let X be a metric space. Show that A is open if and only if any sequence >n. {xn} that converges to XQ £ A is definitively in A, i.e., 3n such that Xn ^ A'^n 5.145 %, Let (X,dx) and ( F , d y ) be two metric spaces and let / : A" —>^ y be a continuous map. Show that (i) if 2/0 € y is an interior point of B C Y and if f{xo) = yo, then XQ is an interior point o f / - 1 ( B ) . (ii) if ico G X is adherent to A C X, then f(xo) is adherent to f(A), (iii) if xo G X is a boundary point of ^ C X , then f{xo) is a boundary point for f(A), (iv) if a:o G X is a point of accumulation of A C X and / is injective, then f(xo) is a point of accumulation of f(A). 5.146 %. Let X be a metric space and A C X. Show that a:o G X is an accumulation point for A if and only if for every open set U with XQ £ U we have U H A\ {XQ} y^ 0. Show also that being an accumulation point for a set is a topological notion. [Hint: Use (iv) of Exercise 5.145.] 5.147 %. Let X be a metric space and A C X. 
Show that x is a point of accumulation of A if and only ii x E A\ {x}. 5.148 %. Let X be a metric space. A set A C X without points of accumulation in X , V(A) = 0, is called discrete. A set without isolated points, X{A) = 0, is called perfect. Of course every point of a discrete set is isolated, since A C A = T{A) C A. Show that the converse is false: a set of isolated points, A = X{A), needs not be necessarily discrete. We may only deduce that VA = dA. 5.149 %. Let X be a metric space. Recall that a set £" C X is dense in X ii E = X. Show that the following claims are equivalent (i) D is dense in X , (ii) every nonempty open set intersects D, (iii) D^ = X \D has no interior points, (iv) every open ball B{x,r) intersects D.
5.150 % Q is dense in R, i.e., Q = M, and dQ = R. Show that R \ Q is dense in R. Show that the set E of points of R^ with rational coordinates and its complement are dense in R^. 5.151 %, Let r be an additive subgroup of R. Show that either F is dense in R or F is the subgroup of integer multiples of a fixed real number. 5.152 %, Let X be a metric space. Show Xn —^ x if and only if for every open set A In particular, the notion of convergence with X Ei A there is n such that Xn E A^iny^n. is a topological notion. 5.153 %, The notion of a convergent sequence makes sense in a topological space. One says that {xn} C X converge to a; G X if for every open set A with x E A there is n such that Xn £ AWn>n. However, in this generality limits are not unique. If in X we consider the indiscrete topology r = {0, X } , every sequence with values in X converges to any point in X. Show that limits of converging sequences are unique in a Hausdorff topological space. Finally, let us notice that in an arbitrary topological space, closed sets cannot be characterized in terms of limits of sequences, see Proposition 5.76. 5.154 ^ . Let (X, r ) be a topological space. A set F C X is called sequentially closed with respect to r if every convergent sequence with values in F has limit in F. Show that the family of sequentially closed sets satisfies the axioms of closed sets. Consequently there is a topology (a priori different from r ) for which the closed sets are the family of sequentially closed sets.
5.155 f. Let X be a metric space. Show that d i a m A = diamA, but in general diam int A < diam A. 5.156 If. Let 0 7^ F C R be bounded from above. Show that sup JE; € F ; if snpE ^ E, then sup F is a point of ax;cumulation of E; finally, show that there exist max E and min E, if E is nonempty, bounded and closed. 5.157 %, Let X be a metric space. Show that dA = 0 iff A is both open and closed. Show that in R'^ we have OA = 0 iff A = 0 or A = R". 5.158 %, Let X be a metric space. Show that dint A C dA, and that it may happen that aint A ^ dA. 5.159 If. Sometimes one says that A is a regular open set if A = int A, and that C is a regular closed set if C = int C. Show examples of regular and nonregular open and closed sets in R^ and R^. Show the following: (i) The interior of a closed set is a regular open set, the closure of an open set is a regular closed set. (ii) The complement of a regular open (closed) set is a regular closed (open) set. (iii) If A and B are regular open sets, then An B is a. regular open set; if C and D are regular closed sets then C U D is a regular closed set. 5.160 %. Let X be a metric space. A subset D C X is dense in X if and only if for every x G X we can find a sequence {xn} with values in D such that Xn -^ x. 5.161 %, Let (X, d) and (Y^d) be two metric spaces. Show that (i) if / : X —)• y is continuous, then / : £" C X —>^ Y is continuous in E with the induced metric,
Figure 5.20. A Cauchy sequence in C^([0,1]) with the L^-metric, with a noncontinuous "limit".
(ii) $f : X \to Y$ is continuous if and only if $f : X \to f(X)$ is continuous.
5.162 ^ . Let X and Y be two metric spaces and let / : X —>^ y . Show that / is continuous if and only if df-'^{A) C f~^{dA) \/ AcX. 5.163 If O p e n a n d closed m a p s . Let (X,dx) and (Y^dy) be two metric spaces. A map f : X ^)'Y is called open (respectively, closed) if the image of an open (respectively, closed) set of X is an open (respectively, closed) set in Y. Show that (i) the coordinate maps iri : W^ -^ R, K = (xi, ) -^ Xi, i = 1,... ,n, are open maps but not closed maps, (ii) similarly the coordinate maps of a product TTX : X X Y —^ X, ny : X X Y —^ Y given by 7Tx(x,y) = x, 7ry(x,y) = y are open but in general not closed maps, (iii) / : X —^ y is an open map if and only if / ( i n t A) C int f{A) \/A C X, (iv) / : X -^ y is a closed map if and only if f{A) C f(A) VA C X. 5.164 %, Let / : X -^ y be injective. Show that / is an open map if and only if it is a closed map. 5.165 f. A metric space {X,dx) is called topologically complete if there exists a distance d in X topologically equivalent to dx for which (X, d) is complete. Show that being topologically complete is a topological invariant.
5.166 ^ . Let (X, d) be a metric space. Show that the following two claims are equivalent (i) (X, d) is a complete metric space, (ii) If {Fa} is a family of closed sets of X such that a) any finite subfamily of { F a } has nonempty intersection, b) inf{diamFa} = 0, then (laFa is nonempty and consists of exactly one point. 5.167 %. Show that the irrational numbers in [0,1] cannot be written as countable union of closed sets in [0,1]. [Hint: Suppose they are, so that [0,1] = Ur€Q{r'} U UiEi and use Baire's theorem.] 5.168 %, Show that a complete metric space made of countably many points has at least an isolated point. In particular, a complete metric space without isolated points is not countable. Notice that, if Xn -^ Xoo in M, then A := {xn | n = 1, 2 . . . } U {xoo} with the induced distance is a countable complete metric space.
5.169 ¶. Show that $C^0([0,1])$ with the $L^1$-metric is not complete. [Hint: Consider the sequence in Figure 5.20.]

5.170 ¶. Show that $X = \{n \mid n = 0, 1, 2, \dots\}$ and $Y = \{1/n \mid n = 1, 2, \dots\}$ are homeomorphic as subspaces of $\mathbb{R}$, but $X$ is complete, while $Y$ is not complete.
6. Compactness and Connectedness
In this chapter we shall discuss, still in the metric context, two important topological invariants: compactness and connectedness.
6.1 Compactness

Let $E$ be a subset of $\mathbb{R}^2$. We ask ourselves whether there exists a point $x_0 \in E$ of maximal distance from the origin. Of course $E$ needs to be bounded, $\sup_{x \in E} d(0, x) < +\infty$, if we want a positive answer, and it is easily seen that if $E$ is not closed, our question may have a negative answer, for instance if $E = B(0, 1)$. Assuming $E$ bounded and closed, how can we prove existence? We can find a maximizing sequence, i.e., a sequence $\{x_k\} \subset E$ such that $d(0, x_k) \to \sup_{x \in E} d(0, x)$, and our question has a positive answer if $\{x_k\}$ is converging or, at least, if $\{x_k\}$ has a subsequence $\{x_{k_n}\}$ that converges to some point $x_0 \in E$. In fact, in this case, $d(0, x_{k_n}) \to d(0, x_0)$, since $x \to d(0, x)$ is continuous, and $d(0, x_{k_n}) \to \sup_{x \in E} d(0, x)$, too, thus concluding that $d(0, x_0) = \sup_{x \in E} d(0, x)$.

6.1.1 Compact spaces

a. Sequential compactness

6.1 Definition. Let $(X, d)$ be a metric space. A subset $K \subset X$ is said to be sequentially compact if every sequence $\{x_k\} \subset K$ has a subsequence $\{x_{n_k}\}$ that converges to a point of $K$.

Necessary conditions for compactness are stated in the following
Figure 6.1. Bernhard Bolzano (1781-1848) and the frontispiece of the work where the Bolzano-Weierstrass theorem appears.
6.2 Proposition. We have
(i) any sequentially compact metric space $(X, d)$ is complete;
(ii) any sequentially compact subset of a metric space $(X, d)$ is bounded, closed, and complete with the induced metric.

Proof. (i) Let $\{x_k\} \subset X$ be a Cauchy sequence. Sequential compactness allows us to extract a convergent subsequence; since $\{x_k\}$ is Cauchy, the entire sequence converges, see Proposition 5.107.
(ii) Let $K$ be sequentially compact. Every point $x \in \mathrm{cl}\,K$ is the limit of a sequence with values in $K$; by assumption $x \in K$, thus $K = \mathrm{cl}\,K$ and $K$ is closed. Suppose that $K$ is not bounded. Then there is a sequence $\{x_n\} \subset K$ such that $d(x_i, x_j) > 1$ $\forall i \ne j$. Such a sequence has no convergent subsequences, a contradiction. Finally, $K$ is complete by (i). □
b. Compact sets in $\mathbb{R}^n$

In general, bounded and closed sets of a metric space are not sequentially compact. However we have

6.3 Theorem. In $\mathbb{R}^n$, $n \ge 1$, a set is sequentially compact if and only if it is closed and bounded.

This follows from

6.4 Theorem (Bolzano-Weierstrass). Any infinite and bounded subset $E$ of $\mathbb{R}^n$, $n \ge 1$, has at least a point of accumulation.

Proof. Since $E$ is bounded, there is a cube $C_0$ of side $L$, so that

$$E \subset C_0 := [a_1^{(0)}, b_1^{(0)}] \times \cdots \times [a_n^{(0)}, b_n^{(0)}], \qquad b_k^{(0)} - a_k^{(0)} = L.$$

Since $E$ is infinite, if we divide $C_0$ in $2^n$ equal subcubes, one of them,

$$C_1 := [a_1^{(1)}, b_1^{(1)}] \times \cdots \times [a_n^{(1)}, b_n^{(1)}], \qquad b_k^{(1)} - a_k^{(1)} = L/2,$$

contains infinitely many elements of $E$. By induction, we divide $C_i$ in $2^n$ equal subcubes with no common interiors, and choose one of them, $C_{i+1}$, that contains infinitely many elements of $E$. If

$$C_i := [a_1^{(i)}, b_1^{(i)}] \times \cdots \times [a_n^{(i)}, b_n^{(i)}], \qquad b_k^{(i)} - a_k^{(i)} = L/2^i,$$

the vertices of $C_i$ converge,

$$a_k^{(i)} \to a_k^\infty, \quad b_k^{(i)} \to b_k^\infty \quad \text{and} \quad a_k^\infty = b_k^\infty,$$

since for each $k = 1, \dots, n$ the sequences $\{a_k^{(i)}\}$ and $\{b_k^{(i)}\}$ are real-valued Cauchy sequences with $b_k^{(i)} - a_k^{(i)} \to 0$. The point $a := (a_1^\infty, \dots, a_n^\infty)$ is then an accumulation point for $E$, since for any $r > 0$, $C_i \subset B(a, r)$ for $i$ sufficiently large. □
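The bisection argument in the proof can be imitated on a computer, with two stated simplifications: the infinite set $E$ is replaced by a large random sample, and "contains infinitely many elements" is replaced by "contains the most sample points". The function `bisect_cube` and its parameters are hypothetical names introduced for this sketch.

```python
import random


def bisect_cube(points, lower, side, steps):
    """Halve the cube [lower, lower + side]^n `steps` times.

    At each step keep the subcube holding the most points; return the
    lower corner, the side length and the points of the final subcube.
    """
    n = len(lower)
    for _ in range(steps):
        side /= 2
        buckets = {}
        for p in points:
            # Subcube index of p: 0 or 1 in each coordinate (clamped at the faces).
            key = tuple(
                max(0, min(1, int((p[k] - lower[k]) // side))) for k in range(n)
            )
            buckets.setdefault(key, []).append(p)
        key, points = max(buckets.items(), key=lambda kv: len(kv[1]))
        lower = [lower[k] + key[k] * side for k in range(n)]
    return lower, side, points


random.seed(0)
sample = [(random.random(), random.random()) for _ in range(4096)]
lower, side, kept = bisect_cube(sample, [0.0, 0.0], 1.0, 3)

if __name__ == "__main__":
    print(lower, side, len(kept))
```

By the pigeonhole principle each step keeps at least a quarter of the points in the plane, which is the finite counterpart of "one of the subcubes contains infinitely many elements of $E$".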
Another useful consequence of the Bolzano-Weierstrass theorem is

6.5 Theorem. Any bounded sequence $\{x_k\}$ of $\mathbb{R}^n$ has a convergent subsequence.

Proof. If $\{x_k\}$ takes finitely many values, then at least one of them, say $a$, is taken infinitely often. If $\{p_k\}_{k \in \mathbb{N}}$ are the indices such that $x_{p_k} = a$, then $\{x_{p_k}\}$ converges, since it is constant. Assume now that $\{x_k\}$ takes infinitely many values. The Bolzano-Weierstrass theorem yields a point of accumulation $x_\infty$ for these values. Now we choose $p_1$ as the first index for which $|x_{p_1} - x_\infty| < 1$, $p_2$ as the first index greater than $p_1$ such that $|x_{p_2} - x_\infty| < 1/2$, and so on: then $\{x_{p_k}\}$ is a subsequence of $\{x_k\}$ and $x_{p_k} \to x_\infty$. □
c. Coverings and $\epsilon$-nets

There are other ways to express compactness. Let $A$ be a subset of a metric space $X$. A covering of $A$ is a family $\mathcal{A} = \{A_\alpha\}$ of subsets of $X$ such that $A \subset \cup_\alpha A_\alpha$. We have already said that $\mathcal{A} = \{A_\alpha\}$ is an open covering of $A$ if each $A_\alpha$ is an open set, and that $\{A_\alpha\}$ is a finite covering of $A$ if the number of the $A_\alpha$'s is finite. A subcovering of a covering $\mathcal{A}$ of $A$ is a subfamily of $\mathcal{A}$ that is still a covering of $A$.

6.6 Definition. We say that a subset $A$ of a metric space $X$ is totally bounded if for any $\epsilon > 0$ there is a finite number of balls $B(x_i, \epsilon)$, $i = 1, 2, \dots, N$, of radius $\epsilon$, each centered at $x_i \in X$, such that $A \subset \cup_{i=1}^N B(x_i, \epsilon)$.

For a given $\epsilon > 0$, the corresponding balls are said to form an $\epsilon$-covering of $A$, and their centers, characterized by the fact that each point of $A$ has distance less than $\epsilon$ from some of the $x_i$'s, form a set $\{x_i\}$ called an $\epsilon$-net for $A$. With this terminology $A$ is totally bounded iff for every $\epsilon > 0$ there exists an $\epsilon$-net for $A$. Notice also that $A \subset X$ is totally bounded if and only if for every $\epsilon > 0$ there exists a finite covering $\{A_i\}$ of $A$ with sets having $\mathrm{diam}\,A_i < \epsilon$.
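For a finite set of points an $\epsilon$-net can be produced by a greedy selection. The sketch below is only an illustration under that finiteness assumption; the name `greedy_eps_net` and the test grid are not from the text.

```python
import math


def greedy_eps_net(points, eps):
    """Select centers among `points` so that each point is within eps of a center.

    A point becomes a new center only if it is at distance >= eps from all
    centers chosen so far, so the centers are pairwise at least eps apart.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


# An 11 x 11 grid in the unit square, used as a finite stand-in for a bounded set.
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
eps = 0.25
net = greedy_eps_net(grid, eps)

if __name__ == "__main__":
    print(len(grid), len(net))
```

Every point of `grid` is either a center or was rejected because some center lies at distance less than `eps`, so the selected centers form an $\epsilon$-net in the sense of Definition 6.6.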
6.7 Definition. We say that a subset $K$ of a metric space is compact if every open covering of $K$ contains a finite subcovering.

We have the following.

6.8 Theorem. Let $X$ be a metric space. The following claims are equivalent.
(i) $X$ is sequentially compact.
(ii) $X$ is complete and totally bounded.
(iii) $X$ is compact.

The implication (ii) $\Rightarrow$ (i) is known as the Hausdorff criterion and the implication (i) $\Rightarrow$ (iii) as the finite covering lemma.

Proof. (i) $\Rightarrow$ (ii) By Proposition 6.2, $X$ is complete. Suppose $X$ is not totally bounded. Then for some $r > 0$ no finite family of balls of radius $r$ can cover $X$. Start with $x_1 \in X$; since $B(x_1, r)$ does not cover $X$, there is $x_2 \in X$ such that $d(x_2, x_1) \ge r$. Since $\{B(x_1, r), B(x_2, r)\}$ does not cover $X$ either, there is $x_3 \in X$ such that $d(x_3, x_1) \ge r$ and $d(x_3, x_2) \ge r$. By induction, we construct a sequence $\{x_i\}$ such that $d(x_i, x_j) \ge r$ $\forall i > j$, hence $d(x_i, x_j) \ge r$ $\forall i \ne j$. Such a sequence has no convergent subsequence, but this contradicts the assumption.

(ii) $\Rightarrow$ (iii) By contradiction, suppose that $X$ has an open covering $\mathcal{A} = \{A_\alpha\}$ with no finite subcovering. Since $X$ is totally bounded, there exists a finite covering $\{C_i\}$ of $X$,

$$\bigcup_{i=1}^n C_i = X, \qquad \text{such that} \quad \mathrm{diam}\,C_i < 1, \quad i = 1, \dots, n.$$

By the assumption, there exists at least one $k_1$ such that $\mathcal{A}$ has no finite subcovering for $C_{k_1}$. Of course $X_1 := C_{k_1}$ is a metric space which is totally bounded; therefore we can cover $C_{k_1}$ with finitely many sets with diameter less than $1/2$, and $\mathcal{A}$ has no finite subcovering for one of them, which we call $X_2$. By induction, we construct a sequence $\{X_i\}$ of subsets of $X$ with

$$X_1 \supset X_2 \supset \cdots, \qquad \mathrm{diam}\,X_i < 1/2^{i-1},$$

such that none of them can be covered by finitely many open sets of $\mathcal{A}$. Now we choose for each $k$ a point $x_k \in X_k$. Since $\{x_k\}$ is trivially a Cauchy sequence and $X$ is complete, $\{x_k\}$ converges to some $x_0 \in X$. Let $A_0 \in \mathcal{A}$ be such that $x_0 \in A_0$ and let $r$ be such that $B(x_0, r) \subset A_0$ ($A_0$ is an open set). For $k$ sufficiently large we then have $d(x, x_0) \le d(x, x_k) + d(x_k, x_0) < r$ for all $x \in X_k$, i.e., $X_k \subset B(x_0, r) \subset A_0$. In conclusion, $X_k$ is covered by one open set in $\mathcal{A}$, a contradiction since by construction no finite subfamily of $\mathcal{A}$ could cover $X_k$.

(iii) $\Rightarrow$ (i) If, by contradiction, $\{x_k\}$ has no convergent subsequence, then $\{x_k\}$ is an infinite set without points of accumulation in $X$. For every $x \in X$ there is a ball $B(x, r_x)$ centered at $x$ that contains at most one point of $\{x_k\}$. The family of these balls $\mathcal{F} := \{B(x, r_x)\}_{x \in X}$ is an open covering of $X$ with no finite subcovering of $\{x_k\}$, hence of $X$, contradicting the assumption. □

6.9 Remark. Clearly the notions of compactness and sequential compactness are topological notions. They have a meaning in the more general setting of topological spaces, while the notion of totally bounded sets is just a metric notion. We shall not deal with compactness in topological spaces. We only mention that compactness and sequential compactness are not equivalent in the context of topological spaces.

6.10 ¶. Let $X$ be a metric space. Show that any closed subset of a compact set is compact.
6.11 ¶. Let $X$ be a metric space. Show that finite unions and arbitrary intersections of compact sets are compact.

6.12 ¶. Show that a finite set is compact.
6.1.2 Continuous functions and compactness

a. The Weierstrass theorem

As in [GM2], continuity of $f : K \to \mathbb{R}$ and compactness of $K$ yield existence of a minimizer.

6.13 Definition. Let $f : X \to \mathbb{R}$. Points $x_-, x_+ \in X$ such that

$$f(x_-) = \inf_{x \in X} f(x), \qquad f(x_+) = \sup_{x \in X} f(x)$$

are called respectively a minimum point or a minimizer and a maximum point or a maximizer for $f : X \to \mathbb{R}$. A sequence $\{x_k\} \subset X$ such that $f(x_k) \to \inf_{x \in X} f(x)$ (resp. $f(x_k) \to \sup_{x \in X} f(x)$) is called a minimizing sequence (resp. a maximizing sequence).

Notice that any function $f : X \to \mathbb{R}$ defined on a set $X$ has a minimizing and a maximizing sequence. In fact, because of the properties of the infimum, there exists a sequence $\{y_k\} \subset f(X)$ such that $y_k \to \inf_{x \in X} f(x)$ (that may be $-\infty$), and for each $k$ there exists a point $x_k \in X$ such that $f(x_k) = y_k$, hence $f(x_k) \to \inf_{x \in X} f(x)$.

6.14 Theorem (Weierstrass). Let $f : X \to \mathbb{R}$ be a continuous real-valued function defined in a compact metric space. Then $f$ achieves its maximum and minimum values, i.e., there exist $x_-, x_+ \in X$ such that

$$f(x_-) = \inf_{x \in X} f(x), \qquad f(x_+) = \sup_{x \in X} f(x).$$

Proof. Let us prove the existence of a minimizer. Let $\{x_k\} \subset X$ be a minimizing sequence. Since $X$ is compact, it has a subsequence $\{x_{n_k}\}$ that converges to some $x_- \in X$. By continuity of $f$, $f(x_{n_k}) \to f(x_-)$, while by restriction

$$y_{n_k} := f(x_{n_k}) \to \inf_{x \in X} f(x).$$

The uniqueness of the limit yields $\inf_{x \in X} f(x) = f(x_-)$. □

In fact, we proved that, if $f : X \to \mathbb{R}$ is continuous and $X$ is compact, then any minimizing (resp. maximizing) sequence has a subsequence that converges to a minimum (resp. maximum) point.
b. Continuity and compactness

Compactness and sequential compactness are topological invariants. In fact, we have the following.

6.15 Theorem. Let $f : X \to Y$ be a continuous function between two metric spaces. If $X$ is compact, then $f(X)$ is compact.

Proof. Let $\{V_\alpha\}$ be an open covering of $f(X)$. Since $f$ is continuous, $\{f^{-1}(V_\alpha)\}$ is an open covering of $X$. Consequently, there are indices $\alpha_1, \dots, \alpha_N$ such that

$$X \subset f^{-1}(V_{\alpha_1}) \cup \cdots \cup f^{-1}(V_{\alpha_N}),$$

hence

$$f(X) \subset V_{\alpha_1} \cup \cdots \cup V_{\alpha_N},$$

i.e., $f(X)$ is compact. □

Another proof of Theorem 6.15. Let us prove that $f(X)$ is sequentially compact whenever $X$ is sequentially compact. If $\{y_n\} \subset f(X)$ and $\{x_n\} \subset X$ is such that $f(x_n) = y_n$ $\forall n$, since $X$ is sequentially compact, a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ converges to a point $x_0 \in X$. By continuity, the subsequence $\{f(x_{k_n})\}$ of $\{y_n\}$ converges to $f(x_0) \in f(X)$. Then Theorem 6.8 applies. □

6.16 ¶. Infer Theorem 6.14 from Theorem 6.15.

6.17 ¶. Suppose that $E$ is a noncompact metric space. Show that there exist
(i) $f : E \to \mathbb{R}$ continuous and unbounded,
(ii) $f : E \to \mathbb{R}$ continuous and bounded without maximizers and/or minimizers.
c. Continuity of the inverse function

Compactness also plays an important role in dealing with the continuity of the inverse function of invertible maps.

6.18 Theorem. Let $f : X \to Y$ be a continuous function between two metric spaces. If $X$ is compact, then $f$ is a closed function. In particular, if $f$ is injective, then the inverse function $f^{-1} : f(X) \to X$ is continuous.

Proof. Let $F \subset X$ be a closed set. Since $X$ is compact, $F$ is compact. From Theorem 6.15 we then infer that $f(F)$ is compact, hence closed. Suppose $f$ injective and let $g : f(X) \to X$ be the inverse of $f$. We then have $g^{-1}(F) = f(F)$ $\forall F \subset X$, hence $g^{-1}(F)$ is a closed set if $F$ is a closed set in $X$. □

6.19 Corollary. Let $f : X \to Y$ be a one-to-one, continuous map between two metric spaces. If $X$ is compact, then $f$ is a homeomorphism.

6.20 Example. The following example shows that the assumption of compactness in Theorem 6.18 cannot be avoided. Let $X = [0, 2\pi[$, let $Y$ be the unit circle of $\mathbb{C}$ centered at the origin and $f(t) := e^{it}$, $t \in X$. Clearly $f(t) = \cos t + i \sin t$ is continuous and injective, but its inverse function $f^{-1}$ is not continuous at the point $(1, 0) = f(0)$.
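The discontinuity in Example 6.20 is easy to observe numerically. In the sketch below, `f_inverse` is an illustrative implementation of the inverse map built from `cmath.phase`; it is an assumption of this example, not notation from the text.

```python
import cmath
import math


def f(t):
    """The map of Example 6.20, f(t) = e^{it}, for t in [0, 2*pi[."""
    return cmath.exp(1j * t)


def f_inverse(z):
    """Inverse of f on the unit circle, with values in [0, 2*pi[."""
    # cmath.phase returns an angle in [-pi, pi]; shift it into [0, 2*pi[.
    return cmath.phase(z) % (2 * math.pi)


eps = 1e-6
above = f(eps)                 # point of the circle just above 1
below = f(2 * math.pi - eps)   # point of the circle just below 1

if __name__ == "__main__":
    # The two points are close in Y, their preimages are far apart in X.
    print(abs(above - below), abs(f_inverse(above) - f_inverse(below)))
```

The two points of the circle are at distance about $2 \cdot 10^{-6}$, while their preimages differ by almost $2\pi$, so $f^{-1}$ cannot be continuous at $f(0)$.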
6.1.3 Semicontinuity and the Fréchet-Weierstrass theorem
Going through the proof of Weierstrass's theorem we see that a weaker assumption suffices to prove existence of a minimizer. In fact, if instead of the continuity of $f$ we assume¹

$$f(x) \le \liminf_{k \to \infty} f(x_k) \quad \text{whenever } \{x_k\} \text{ is such that } x_k \to x, \tag{6.1}$$

then for any convergent subsequence $\{x_{n_k}\}$ of a minimizing sequence, $x_{n_k} \to x_0$, $f(x_{n_k}) \to \inf_{x \in X} f(x)$, we have

$$\inf_{x \in X} f(x) \le f(x_0) \le \liminf_{k \to \infty} f(x_{n_k}) = \lim_{k \to \infty} f(x_{n_k}) = \inf_{x \in X} f(x),$$

i.e., again $f(x_0) = \inf_{x \in X} f(x)$. We therefore introduce the following definitions.

6.21 Definition. We say that a function $f : X \to \mathbb{R}$ defined on a metric space $X$ is sequentially lower semicontinuous at $x \in X$, s.l.s.c. for short, if

$$f(x) \le \liminf_{k \to \infty} f(x_k) \quad \text{whenever } \{x_k\} \subset X \text{ is such that } x_k \to x.$$
6.22 Definition. We say that a subset $E$ of a metric space $X$ is relatively compact if its closure $\overline{E}$ is compact.

6.23 Definition. Let $X$ be a metric space. We say that $f : X \to \mathbb{R}$ is coercive if for all $t \in \mathbb{R}$ the level sets of $f$,

$$\{x \in X \mid f(x) \le t\},$$

are relatively compact.

Then we can state the following.

6.24 Theorem (Fréchet-Weierstrass). Let $X$ be a metric space and let $f : X \to \mathbb{R}$ be bounded from below, coercive and sequentially lower semicontinuous. Then $f$ takes its minimum value.
¹ See Exercises 6.26 and 6.28 for the definition of lim inf and related information.
Figure 6.2. Lebesgue's example of a sequence of curves of length $\sqrt{2}$ that converges in the uniform distance to a curve of length 1.
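Lebesgue's example in Figure 6.2 can be checked with a short computation. The sketch below assumes one concrete family of zigzag curves (teeth of slope $\pm 1$ over $[0,1]$); the figure may show a different but equivalent family, and the function names are illustrative only.

```python
import math

def sawtooth_vertices(k):
    """Vertices of a zigzag f_k on [0, 1]: k teeth of slope +1/-1 and height 1/(2k)."""
    h = 1.0 / (2 * k)
    return [(j * h, h if j % 2 else 0.0) for j in range(2 * k + 1)]

def polyline_length(points):
    """Sum of the Euclidean lengths of consecutive segments."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def sup_norm(points):
    """Uniform distance of the zigzag from the zero function (attained at a vertex)."""
    return max(abs(y) for _, y in points)

for k in (1, 4, 64, 1024):
    pts = sawtooth_vertices(k)
    print(k, polyline_length(pts), sup_norm(pts))
```

For every `k` the length stays equal to $\sqrt{2}$ up to rounding, while the uniform distance from the limit segment, of length 1, tends to zero.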
6.25 Example. There are many interesting examples of functions that are semicontinuous but not continuous: a typical example is the length of a curve. Though we postpone details, Lebesgue's example in Figure 6.2 shows that the function length, defined on the space of piecewise linear curves with the uniform distance, is not continuous. In fact $\operatorname{length}(f_k) = \sqrt{2}$, $f_k(x) \to f_\infty(x) := 0$ uniformly in $[0,1]$, and $\operatorname{length}(f_\infty) = 1 < \sqrt{2}$. We shall prove later that in fact the length functional is sequentially lower semicontinuous.

6.26 ¶. We say that $f : X \to \mathbb{R}$ is lower semicontinuous, for short l.s.c., if for all $t \in \mathbb{R}$ the sets $\{x \in X \mid f(x) \le t\}$ are closed. Sequential lower semicontinuity and lower semicontinuity are topological concepts; they turn out to be different, in general. Show that if $X$ is a metric space, then $f$ is lower semicontinuous if and only if $f$ is sequentially lower semicontinuous.

6.27 ¶. Let $X$ be a metric space. We recall, see e.g., [GM2], that $\ell \in \overline{\mathbb{R}}$ is the liminf of $f : X \to \mathbb{R}$ as $y \to x$, $\ell = \liminf_{y \to x} f(y)$, if $x$ is a point of accumulation of $X$ and
(i) $\forall m < \ell$ $\exists \delta$ such that $f(y) > m$ if $y \in B(x, \delta) \setminus \{x\}$,
(ii) $\forall m > \ell$ $\forall \delta > 0$ $\exists y_\delta \in B(x, \delta) \setminus \{x\}$ such that $f(y_\delta) < m$.
Show that the lim inf always exists and is given by

$$\liminf_{y \to x} f(y) = \sup_{r > 0} \inf_{B(x,r) \setminus \{x\}} f(y) = \lim_{r \to 0^+} \inf_{B(x,r) \setminus \{x\}} f(y).$$

Similarly we can define the lim sup of $f : X \to \mathbb{R}$, so that

$$\limsup_{y \to x} f(y) = - \liminf_{y \to x} (-f(y)).$$

Explicitly define it and show that

$$\limsup_{y \to x} f(y) = \lim_{r \to 0^+} \sup_{B(x,r) \setminus \{x\}} f(y).$$

Finally, show that $f : X \to \mathbb{R}$ is sequentially lower semicontinuous if and only if $\forall x \in X$ $f(x) \le \liminf_{y \to x} f(y)$.
6.2 Extending Continuous Functions

6.2.1 Uniformly continuous functions

6.29 Definition. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. We say that $f : X \to Y$ is uniformly continuous in $X$ if for any $\epsilon > 0$ there exists $\delta > 0$ such that $d_Y(f(x), f(y)) < \epsilon$ for all $x, y \in X$ with $d_X(x, y) < \delta$.

6.30 Remark. Uniform continuity is a global property, in contrast with continuity (at all points), which is a local property. A comparison is worthwhile.
(i) $f : X \to Y$ is continuous if $\forall x_0 \in X$, $\forall \epsilon > 0$ $\exists \delta > 0$ (in principle $\delta$ depends on $\epsilon$ and $x_0$) such that $d_Y(f(x), f(x_0)) < \epsilon$ whenever $d_X(x, x_0) < \delta$.
(ii) $f : X \to Y$ is uniformly continuous in $X$ if $\forall \epsilon > 0$ $\exists \delta > 0$ (in this case $\delta$ depends on $\epsilon$ but not on $x_0$) such that $d_Y(f(x), f(x_0)) < \epsilon$ whenever $d_X(x, x_0) < \delta$.
Of course, if $f$ is uniformly continuous in $X$, $f$ is continuous in $X$ and uniformly continuous on any subset of $X$. Moreover if $\{U_\alpha\}$ is a finite partition of $X$ and each $f|_{U_\alpha} : U_\alpha \to Y$ is uniformly continuous in $U_\alpha$, then $f : X \to Y$ is uniformly continuous in $X$.

6.31 ¶. Show that Lipschitz-continuous and more generally Hölder-continuous functions, see Definition 5.24, are uniformly continuous functions.

6.32 ¶. Show that $f : X \to Y$ is not uniformly continuous in $X$ if and only if there exist two sequences $\{x_n\}, \{y_n\} \subset X$ and $\epsilon_0 > 0$ such that $d_X(x_n, y_n) \to 0$ and $d_Y(f(x_n), f(y_n)) > \epsilon_0$ $\forall n$.

6.33 ¶. Show that
(i) $x^2$ and $\sin x^2$, $x \in \mathbb{R}$, are not uniformly continuous in $\mathbb{R}$,
(ii) $1/x$ is not uniformly continuous in $]0,1]$,
(iii) $\sin x^2$, $x \in \mathbb{R}$, is not uniformly continuous in $\mathbb{R}$.
Using directly Lagrange's theorem, show that
(iv) $x^2$, $x \in [0,1]$, is uniformly continuous in $[0,1]$,
(v) $e^{-x}$, $x \in \mathbb{R}$, is uniformly continuous in $[0, +\infty[$.

6.34 ¶. Let $X$, $Y$ be two metric spaces and let $f : X \to Y$ be uniformly continuous. Show that the image of a Cauchy sequence is a Cauchy sequence in $Y$.
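The criterion of Exercise 6.32 can be illustrated on the function $x \mapsto x^2$ of Exercise 6.33. The sketch below uses the sequences $x_n = n$ and $y_n = n + 1/n$, which are one possible choice made here for illustration: their distance tends to zero while the distance of the images stays above 2.

```python
def gap_sequences(n_max):
    """Pairs (|x_n - y_n|, |x_n^2 - y_n^2|) for x_n = n and y_n = n + 1/n."""
    rows = []
    for n in range(1, n_max + 1):
        x = float(n)
        y = n + 1.0 / n
        rows.append((abs(x - y), abs(x * x - y * y)))
    return rows

rows = gap_sequences(1000)
for n in (1, 10, 100, 1000):
    input_gap, image_gap = rows[n - 1]
    print(n, input_gap, image_gap)
```

Since $y_n^2 - x_n^2 = 2 + 1/n^2$, no single $\delta$ can work for $\epsilon = 2$ on the whole of $\mathbb{R}$.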
6.35 Theorem (Heine-Cantor-Borel). Let $f : X \to Y$ be a continuous map between metric spaces. If $X$ is compact, then $f$ is uniformly continuous.
Proof. By contradiction, suppose that $f$ is not uniformly continuous. Then there are $\epsilon_0 > 0$ and two sequences $\{x_n\}, \{y_n\} \subset X$ such that

$$d_X(x_n, y_n) < \frac{1}{n} \quad \text{and} \quad d_Y(f(x_n), f(y_n)) > \epsilon_0 \quad \forall n. \tag{6.2}$$

Since $X$ is compact, $\{x_n\}$ has a convergent subsequence, $x_{k_n} \to x$, $x \in X$. The first inequality in (6.2) yields that $\{y_{k_n}\}$ converges to $x$, too. On account of the continuity of $f$,

$$d_Y(f(x_{k_n}), f(x)) \to 0, \qquad d_Y(f(y_{k_n}), f(x)) \to 0,$$

hence $d_Y(f(x_{k_n}), f(y_{k_n})) \to 0$: this contradicts the second inequality in (6.2). □
6.2.2 Extending uniformly continuous functions to the closure of their domains
Let $X$, $Y$ be metric spaces, $E \subset X$ and $f : E \to Y$ be a continuous function. Under which conditions is there a continuous extension of $f$ over $\overline{E}$, i.e., a continuous $g : \overline{E} \to Y$ such that $g = f$ in $E$? Notice that we do not want to change the target $Y$. Of course, such an extension may not exist, for instance if $E = ]0,1]$ and $f(x) = 1/x$, $x \in ]0,1]$. On the other hand, if it exists, it is unique. In fact, if $g_1$ and $g_2 : \overline{E} \to Y$ are two continuous extensions, then $A := \{x \in \overline{E} \mid g_1(x) = g_2(x)\}$ is closed and contains $E$, hence $A = \overline{E}$.

6.36 Theorem. Let $X$ and $Y$ be two metric spaces. Suppose that $Y$ is complete and that $f : E \subset X \to Y$ is a uniformly continuous map. Then $f$ extends uniquely to a continuous function on $\overline{E}$; moreover the extension is uniformly continuous in $\overline{E}$.

Proof. First we observe that (i) since $f$ is uniformly continuous in $E$, if $\{x_n\}$ is a Cauchy sequence in $E$, then $\{f(x_n)\}$ is a Cauchy sequence in $Y$, hence it converges in $Y$, (ii) since $f$ is uniformly continuous, if $\{x_n\}, \{y_n\} \subset E$ are such that $x_n \to x$ and $y_n \to x$ for some $x \in X$, then the Cauchy sequences $\{f(x_n)\}$ and $\{f(y_n)\}$ have the same limit. Define $F : \overline{E} \to Y$ as follows. For any $x \in \overline{E}$, let $\{x_n\} \subset E$ be such that $x_n \to x$. Define $F(x) := \lim_{n \to \infty} f(x_n)$. We then leave to the reader the task of proving that (i) $F$ is well defined, i.e., its definition makes sense, since for any $x$ the value $F(x)$ is independent of the chosen sequence $\{x_n\}$ converging to $x$, (ii) $F(x) = f(x)$ $\forall x \in E$, (iii) $F$ is uniformly continuous in $\overline{E}$, (iv) $F$ extends $f$, i.e., $F(x) = f(x)$ $\forall x \in E$. □

6.37 ¶. As a special case of Theorem 6.36, we notice that a function $f : E \subset X \to Y$, which is uniformly continuous on a dense subset $E \subset X$, extends to a uniformly continuous function defined on the whole $X$.
6.2.3 Extending continuous functions
Let $X$, $Y$ be metric spaces, $E \subset X$ and $f : E \to Y$ be a continuous function. Under which conditions can $f$ be extended to a continuous function $F : X \to Y$? This is a basic question for continuous maps.

a. Lipschitz-continuous functions
We first consider real-valued Lipschitz-continuous maps, $f : E \subset X \to \mathbb{R}$.

6.38 Theorem (McShane). Let $(X, d)$ be a metric space, $E \subset X$ and let $f : E \to \mathbb{R}$ be a Lipschitz map. Then there exists a Lipschitz-continuous map $F : X \to \mathbb{R}$ with the same Lipschitz constant as $f$, which extends $f$.

Proof. Let $L := \operatorname{Lip}(f)$. For $x \in X$ let us define

$$F(x) := \inf_{y \in E} \big( f(y) + L\, d(x, y) \big)$$

and show that it has the required properties. For $x \in E$ we clearly have $F(x) \le f(x)$ while, $f$ being Lipschitz,

$$f(x) \le f(y) + L\, d(x, y) \quad \forall y \in E,$$

i.e., $f(x) \le F(x)$, thus concluding that $F(x) = f(x)$ $\forall x \in E$. Moreover, we have

$$F(x) \le \inf_{z \in E} \big( f(z) + L\, d(y, z) \big) + L\, d(x, y) = F(y) + L\, d(x, y)$$

and similarly $F(y) \le F(x) + L\, d(x, y)$, hence $|F(x) - F(y)| \le L\, d(x, y)$ for all $x, y \in X$. □
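The infimum formula $F(x) = \inf_{y \in E} (f(y) + L\, d(x,y))$ from the proof of Theorem 6.38 is easy to try out when $E$ is finite, so that the infimum is a minimum. The following Python sketch makes that assumption; the function names and the sample data are hypothetical and chosen only for illustration.

```python
def lipschitz_constant(points, values, dist):
    """Smallest L with |f(p) - f(q)| <= L * dist(p, q) on the finite set."""
    n = len(points)
    return max(abs(values[i] - values[j]) / dist(points[i], points[j])
               for i in range(n) for j in range(i + 1, n))

def mcshane_extension(points, values, dist):
    """Return (F, L), where F(x) = min over E of f(y) + L * dist(x, y)."""
    L = lipschitz_constant(points, values, dist)

    def F(x):
        return min(v + L * dist(x, p) for p, v in zip(points, values))

    return F, L

def abs_dist(x, y):
    """Usual distance on the real line."""
    return abs(x - y)

# Sample data: E = {0, 1, 3} in R with prescribed values of f.
E = [0.0, 1.0, 3.0]
f_values = [0.0, 2.0, 1.0]
F, L = mcshane_extension(E, f_values, abs_dist)

print("L =", L)
print("F on E:", [F(p) for p in E])
print("F(2.0) =", F(2.0))
```

On the sample data the extension agrees with $f$ on $E$ and keeps the Lipschitz constant $L$, in line with the two steps of the proof.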
The previous theorem allows us to extend vector-valued Lipschitz-continuous maps $f : E \subset X \to \mathbb{R}^m$, but the Lipschitz extension will have, in principle, a Lipschitz constant not greater than $\sqrt{m}\, \operatorname{Lip}(f)$. Actually, a more elaborate argument allows us to prove the following.

6.39 Theorem (Kirszbraun). Let $f : E \subset \mathbb{R}^n \to \mathbb{R}^m$ be a Lipschitz-continuous map. Then $f$ has an extension $F : \mathbb{R}^n \to \mathbb{R}^m$ such that $\operatorname{Lip} F = \operatorname{Lip} f$.

In fact there exist several extensions of Kirszbraun's theorem that we will not discuss. We only mention that it may fail if either $\mathbb{R}^n$ or $\mathbb{R}^m$ is remetrized by some norm not induced by an inner product.

6.40 ¶ (Federer). Let $X$ be $\mathbb{R}^2$ with the infinity norm $\|x\|_\infty = \sup(|x^1|, |x^2|)$ and consider the map $f : A \subset X \to \mathbb{R}^2$, where $A := \{(-1, 1), (1, -1), (1, 1)\}$ and

$$f(-1, 1) := (-1, 0), \qquad f(1, -1) := (1, 0), \qquad f(1, 1) := (0, \sqrt{3}).$$

Show that $\operatorname{Lip}(f) = 1$, but $f$ has no 1-Lipschitz extension to $A \cup \{(0, 0)\}$.
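The two claims of Exercise 6.40 can be explored numerically before proving them. The sketch below assumes that the target $\mathbb{R}^2$ carries the Euclidean norm; the grid search is a sanity check under that assumption, not a proof, and all names are illustrative.

```python
import math

A = [(-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
images = [(-1.0, 0.0), (1.0, 0.0), (0.0, math.sqrt(3.0))]

def sup_dist(p, q):
    """Distance induced by the infinity norm on the domain."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def lipschitz_constant():
    """Largest ratio |f(a) - f(b)| / ||a - b||_inf over pairs of points of A."""
    return max(math.dist(images[i], images[j]) / sup_dist(A[i], A[j])
               for i in range(3) for j in range(i + 1, 3))

def worst_distance(p):
    """Largest Euclidean distance from a candidate value p = F(0, 0) to the three images."""
    return max(math.dist(p, q) for q in images)

# Every point of A is at sup-distance 1 from the origin, so a 1-Lipschitz
# extension would need a value p with worst_distance(p) <= 1.
grid = [(-2 + 0.02 * i, -1 + 0.02 * j) for i in range(201) for j in range(201)]
best = min(worst_distance(p) for p in grid)

print("Lip(f) =", lipschitz_constant())
print("smallest worst-case distance found on the grid:", best)
```

The search never gets below the circumradius $2/\sqrt{3} > 1$ of the equilateral triangle formed by the three images, which suggests why no admissible value at the origin exists.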
6.2.4 Tietze's theorem
An extension of Theorem 6.38 holds for continuous functions in a closed domain.

6.41 Theorem (Tietze). Let $X$ be a metric space, $E \subset X$ be a closed subset of $X$, and $f$ a continuous function from $E$ into $[-1, 1]$ (respectively, $\mathbb{R}$). Then $f$ has a continuous extension from $X$ into $[-1, 1]$ (respectively, $\mathbb{R}$).

Actually we have the following.

6.42 Theorem (Dugundji). Let $X$ be a metric space, $E$ a closed subset of $X$ and let $f$ be a continuous function from $E$ into $\mathbb{R}^m$. Then $f$ has a continuous extension from $X$ into $\mathbb{R}^m$; moreover the range of the extension is contained in the convex hull of $f(E)$.

We recall that the convex hull of a subset $E \subset \mathbb{R}^m$ is the intersection of all convex sets that contain $E$.

Proof of Tietze's theorem. First assume that $f$ is bounded. Then it is not restrictive to assume that $\inf_E f = 1$ and $\sup_E f = 2$. We shall prove that the function

$$F(x) := \begin{cases} f(x) & \text{if } x \in E, \\[4pt] \dfrac{\inf_{y \in E} \big( f(y)\, d(x, y) \big)}{d(x, E)} & \text{if } x \notin E \end{cases}$$
is a continuous extension of $f$ and $1 \le F(x) \le 2$ $\forall x \in X$. Since the last claim is trivial, we need to prove that $F$ is continuous in $X$. Decompose $X = \operatorname{int} E \cup (X \setminus E) \cup \partial E$. If $x_0 \in \operatorname{int} E$, then $F$ is continuous at $x_0$ by assumption. Let $x_0 \in X \setminus E$. In this case $x \to d(x, E)$ is continuous and strictly positive in an open neighborhood of $x_0$, therefore it suffices to prove that $h(x) := \inf_{y \in E} (f(y)\, d(x, y))$ is continuous at $x_0$. We notice that for $y \in E$ and $x, x_0 \in X$ we have

$$f(y)\, d(x, y) \le f(y)\, d(x_0, y) + f(y)\, d(x, x_0) \le f(y)\, d(x_0, y) + 2\, d(x, x_0),$$

hence $h(x) \le h(x_0) + 2\, d(x, x_0)$ and, exchanging $x$ with $x_0$,

$$|h(x) - h(x_0)| \le 2\, d(x, x_0).$$

This proves continuity of $h$ at $x_0$. Let $x_0 \in \partial E$. For $\epsilon > 0$ let $r > 0$ be such that $|f(y) - f(x_0)| < \epsilon$ provided $d(y, x_0) < r$ and $y \in E$. For $x \in B(x_0, r/4)$ we have

$$\inf_{E \cap B(x_0, r)} \big( f(y)\, d(x, y) \big) \le f(x_0)\, d(x, x_0) \le 2\, \frac{r}{4} = \frac{r}{2}$$

and

$$\inf_{E \setminus B(x_0, r)} \big( f(y)\, d(x, y) \big) \ge d(x_0, y) - d(x, x_0) \ge \frac{3}{4}\, r.$$

Therefore we find for $x$ with $d(x_0, x) < r/4$,

$$h(x) = \inf_{y \in E} \big( f(y)\, d(x, y) \big) = \inf_{E \cap B(x_0, r)} f(y)\, d(x, y)$$

and $d(x, E) = d(x, E \cap B(x_0, r))$.
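On the other hand, for $y \in E \cap B(x_0, r)$ we have $|f(x_0) - f(y)| < \epsilon$, hence

$$(f(x_0) - \epsilon)\, d(x, E) \le h(x) \le (f(x_0) + \epsilon)\, d(x, E) \quad \text{if } x \in B(x_0, r/4),$$

i.e., $|F(x) - F(x_0)| \le \epsilon$ for $x \in B(x_0, r/4) \setminus E$. Since the same bound holds for $x \in E \cap B(x_0, r/4)$ by the choice of $r$, $F$ is continuous at $x_0$.

Finally, if $f$ is not bounded, we apply the above to $g := \varphi \circ f$, $\varphi$ being the homeomorphism $\varphi : \mathbb{R} \to ]0, 2[$, $\varphi(x) = 1 + \frac{x}{1 + |x|}$. If $G$ extends continuously $g$, then $F := \varphi^{-1} \circ G$ continuously extends $f$. □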
6.43 Remark. The extension $F : X \to \mathbb{R}$ of $f : E \subset X \to \mathbb{R}$ provided by Tietze's theorem is Lipschitz continuous outside $E$.
Sketch of the proof of Theorem 6.42, assuming $X = \mathbb{R}^n$ and $E \subset X$ compact. Choose a countable dense set $\{e_k\}_k$ in $E$ and for $x \notin E$ and $k = 1, 2, \dots$, set

$$\varphi_k(x) := \max\Big\{ 2 - \frac{|x - e_k|}{d(x, E)},\, 0 \Big\}.$$

The function

$$F(x) := \begin{cases} f(x), & x \in E, \\[4pt] \dfrac{\sum_{k \ge 1} 2^{-k} \varphi_k(x) f(e_k)}{\sum_{k \ge 1} 2^{-k} \varphi_k(x)}, & x \notin E \end{cases}$$

defines a continuous extension of $f$; moreover $F(\mathbb{R}^n)$ is contained in the convex hull of $f(E)$. □

6.44 ¶. Let $E$ and $F$ be two disjoint nonempty closed sets of a metric space $(X, d)$. Check that the function $f : X \to [0, 1]$ given by

$$f(x) = \frac{d(x, E)}{d(x, E) + d(x, F)}$$

is continuous in $X$, has values in $[0, 1]$, $f(x) = 0$ $\forall x \in E$ and $f(x) = 1$ $\forall x \in F$.
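The separating function of Exercise 6.44 can be evaluated directly once the two distance functions are available. The sketch below assumes $X = \mathbb{R}$ and finite (hence closed) sets $E$ and $F$, so that each distance is a minimum over finitely many points; the names and sample sets are hypothetical.

```python
def dist_to_set(x, points):
    """Distance from x to a finite subset of the real line."""
    return min(abs(x - p) for p in points)

def separating_function(E, F):
    """Return f(x) = d(x, E) / (d(x, E) + d(x, F)) for disjoint finite sets E and F."""
    def f(x):
        d_E = dist_to_set(x, E)
        d_F = dist_to_set(x, F)
        # The denominator is positive because E and F are disjoint and closed.
        return d_E / (d_E + d_F)
    return f

E = [0.0, 1.0]
F = [3.0, 4.5]
f = separating_function(E, F)

samples = [f(-2 + 0.01 * i) for i in range(1000)]
print(f(0.0), f(1.0), f(2.0), f(3.0), f(4.5))
print(min(samples), max(samples))
```

The values are 0 on $E$, 1 on $F$, and stay in $[0, 1]$ in between, as the exercise asks to check.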
6.45 ¶. Let $E$ and $F$ be two disjoint nonempty closed sets of a metric space $(X, d)$. Using the function $f$ in Exercise 6.44 show that there exist two open sets $A, B \subset X$ with $A \cap B = \emptyset$, $A \supset E$ and $B \supset F$.

Indeed Exercise 6.45 has an inverse.

6.46 Lemma (Urysohn lemma). Let $X$ be a topological space such that each pair of disjoint closed sets can be separated by two disjoint open sets. Then, given a pair of disjoint closed sets $E$ and $F$, there exists a continuous function $f : X \to [0, 1]$ such that $f(x) = 1$ $\forall x \in E$ and $f(x) = 0$ $\forall x \in F$.

This lemma answers the problem of finding nontrivial continuous functions in a topological space and is a basic step in the construction of the so-called partition or decomposition of unity, a means that allows us to pass from local to global constructions and vice versa. Since we shall not need these results in a general context, we refrain from further comments and refer the reader to one of the treatises in general topology.
Figure 6.3. The first page of the Preface of Topologie by Kazimierz Kuratowski (1896–1980) and the frontispiece of a classic in general topology (Topology by James Dugundji).
6.3 Connectedness
Intuitively, a space is connected if it does not consist of two separate pieces.

6.3.1 Connected spaces
6.47 Definition. A metric space $X$ is said to be connected if it is not the union of two nonempty disjoint open sets. A subset $E \subset X$ is connected if it is connected as a subspace of $X$.

This can be formulated in other ways. For example we say that two sets $A$ and $B$ of a metric space $X$ are separated if both $\overline{A} \cap B$ and $A \cap \overline{B}$ are empty, i.e., no point of $A$ is adherent to $B$ and no point of $B$ is adherent to $A$.

6.48 Proposition. Let $X$ be a metric space. The following properties are equivalent.
(i) $X$ is connected.
(ii) There are no nonempty closed sets $F, G$ in $X$ such that $F \cap G = \emptyset$ and $X = F \cup G$.
(iii) The only subsets of $X$ both open and closed are $\emptyset$ and $X$.
(iv) $X$ is not the union of two nonempty and separated subsets.
Proof. Trivially (i) $\Leftrightarrow$ (ii) $\Leftrightarrow$ (iii). Let us prove that (i) $\Rightarrow$ (iv). By contradiction, suppose $X = A \cup B$ where $A$ and $B$ are nonempty and separated. From $A \cap \overline{B} = \emptyset$ and $A \cup B = X$ we infer $A \subset \overline{B}^c$ and $\overline{B}^c \subset A$, hence $A = \overline{B}^c$, i.e., $A$ is an open set. Similarly we infer that $B$ is open, concluding that $X$ is not connected, a contradiction. Finally, let us prove that (iv) $\Rightarrow$ (ii). By contradiction, assume that $X$ is not connected. Then $X = A \cup B$ with $A, B$ closed, disjoint and nonempty, thus $\overline{A} \cap B = A \cap \overline{B} = A \cap B = \emptyset$. Thus $X$ is the union of two nonempty separated sets, a contradiction. □
a. Connected subsets
6.49 Theorem. $E \subset \mathbb{R}$ is connected if and only if $E$ is an interval.

Proof. If $E \subset \mathbb{R}$ is not an interval, there exist $x, y \in E$ and $z \notin E$ with $x < z < y$. Thus the sets $E_1 := E \cap ]-\infty, z[$ and $E_2 := E \cap ]z, +\infty[$ are nonempty and separated. Since $E = E_1 \cup E_2$, $E$ is not connected. Conversely, if $E$ is not connected, then $E = A \cup B$ with $A$ and $B$ nonempty and separated. Let $x \in A$ and $y \in B$ and, without loss of generality, suppose $x < y$. Define

$$z := \sup(A \cap [x, y]).$$

We have $z \in \overline{A}$, hence $z \notin B$; in particular $x \le z < y$. If $z \notin A$, then $x < z < y$ and $z \notin E$, i.e., $E$ is not an interval. Otherwise, if $z \in A$, then $z \notin \overline{B}$ and there exists $z_1$ such that $z < z_1 < y$ and $z_1 \notin B$; since $z_1 > z$, by the definition of $z$ also $z_1 \notin A$, hence $z_1 \notin E$ and again $E$ is not an interval. □
b. Connected components
Because of Exercise 6.53, the following definition makes sense.

6.55 Definition. Let $X$ be a metric space. The connected component of $X$ containing $x_0 \in X$ is the largest connected subset $C_{x_0}$ of $X$ such that $x_0 \in C_{x_0}$.

6.56 Proposition. Let $X$ be a metric space. We have the following.
(i) The distinct connected components of the points of $X$ form a partition of $X$.
(ii) Each connected component $C \subset X$ is a closed set.
(iii) If $y \in C_x$, then $C_x = C_y$.
(iv) If $Y \subset X$ is a nonempty connected subset of $X$ that is both open and closed, then $Y$ is a connected component of $X$.

Observe that the connected components are not necessarily open. For instance, consider $X = \mathbb{Q}$, for which $C_x := \{x\}$ $\forall x \in \mathbb{Q}$. Of particular interest are the locally connected metric spaces, i.e., spaces $X$ for which for every $x \in X$ there exists $r_x > 0$ such that $B(x, r_x)$ is connected.

6.57 Proposition. Let $X$ be a metric space. The following claims are equivalent.
(i) Each connected component is open.
(ii) $X$ is locally connected.

Proof. (i) $\Rightarrow$ (ii). Each point in $X$ has a connected open neighborhood by (i), hence (ii) holds. (ii) $\Rightarrow$ (i). Let $C$ be a connected component of $X$, let $x \in C$ and, by assumption, let $B(x, r_x)$ be a connected ball centered at $x$. As $B(x, r_x)$ is connected, trivially $B(x, r_x) \subset C$, i.e., $C$ is open. □
6.58 Corollary. Every convex set of $\mathbb{R}^n$ is connected.

Proof. In fact every convex set $K \subset \mathbb{R}^n$ is the union of all segments joining a fixed point $x_0 \in K$ to points $x \in K$. Then Exercise 6.53 applies. □
The class of all connected sets of a metric space $X$ is a topological invariant. This follows at once from the following.

6.59 Theorem. Let $f : X \to Y$ be a continuous map between metric spaces. If $X$ is connected, then $f(X) \subset Y$ is connected.

Proof. Assume by contradiction that $f(X)$ is not connected. Then there exist open sets $C, D \subset Y$, both intersecting $f(X)$, such that $C \cap D \cap f(X) = \emptyset$ and $(C \cup D) \cap f(X) = f(X)$. Since $f$ is continuous, $A := f^{-1}(C)$, $B := f^{-1}(D)$ are nonempty open sets in $X$ such that $A \cap B = \emptyset$ and $A \cup B = X$. A contradiction, since $X$ is connected. □
Since the intervals are the only connected subsets of $\mathbb{R}$, we again find the intermediate value theorem of [GM1] and, more generally,

6.60 Corollary. Let $f : X \to \mathbb{R}$ be a continuous function defined on a connected metric space. Then $f$ takes all values between any two that it assumes.

c. Segment-connected sets in $\mathbb{R}^n$
In $\mathbb{R}^n$ we can introduce a more restrictive notion of connectedness that in some respect is more natural. If $x, y \in \mathbb{R}^n$, a polyline joining $x$ to $y$ is the union of finitely many segments of the type

$$[x, x_1], [x_1, x_2], \dots, [x_{N-1}, y]$$

where $x_i \in \mathbb{R}^n$ and $[x_i, x_{i+1}]$ denotes the segment joining $x_i$ with $x_{i+1}$. It is easy to check that a polyline joining $x$ to $y$ can be seen as the image or trajectory of a piecewise linear function $\gamma : [0, 1] \to \mathbb{R}^n$. Notice that piecewise linear functions are Lipschitz continuous.

6.61 Definition. We say that $A \subset \mathbb{R}^n$ is segment-connected if each pair of its points can be joined by a polyline that lies entirely in $A$.

If $A[x]$ denotes the set of all points that can be joined to $x$ by a polyline in $A$, we see that $A$ is segment-connected if and only if $A = A[x]$. Moreover we have the following.

6.62 Proposition. Any segment-connected $A \subset \mathbb{R}^n$ is connected.

Of course, not every connected set is segment-connected, indeed a circle in $\mathbb{R}^2$ is connected but not segment-connected. However, we have the following.

6.63 Theorem. Let $A$ be a nonempty open set of $\mathbb{R}^n$. Then $A$ is connected if and only if $A$ is segment-connected.
Proof. Let $x_0 \in A$, let $B := A[x_0]$ be the set of all points that can be connected with $x_0$ by a polyline in $A$ and let $C := A \setminus A[x_0]$. We now prove that both $B$ and $C$ are open. Since $A$ is connected and $B$ is nonempty, we then infer $A = A[x_0]$, hence $A$ is segment-connected. Let $x \in B$. Since $A$ is open, there exists $B(x, r) \subset A$. Since $x$ is connected with $x_0$ by a polyline, adding a further segment we can connect each point of $B(x, r)$ with $x_0$ by a polyline. Therefore $B(x, r) \subset B$ if $x \in B$, i.e., $B$ is open. Similarly, if $x \in C$, let $B(x, r) \subset A$. No point in $B(x, r)$ can be connected with $x_0$ by a polyline since otherwise, adding a further segment, we could connect $x$ with $x_0$. So $B(x, r) \subset C$ if $x \in C$, i.e., $C$ is open. □
d. Path-connectedness
Another notion of connection that makes sense in a topological space is joining by paths. Let $X$ be a metric space. A path or a curve in $X$ joining $x$ with $y$ is a continuous function $f : [0, 1] \to X$ with $f(0) = x$ and $f(1) = y$. The image of the path is called the trajectory of the path.

6.64 Definition. A metric space $X$ is said to be path-connected if any two points in $X$ can be joined by a path.

Evidently $\mathbb{R}^n$ is path-connected. We have, as in Theorem 6.63, the following.

6.65 Proposition. Any path-connected metric space $X$ is connected.

The converse is however false in general.

6.66 ¶. Consider the set $A \subset \mathbb{R}^2$, $A = G \cup I$ where $G$ is the graph of $f(x) := \sin(1/x)$, $0 < x < 1$, and $I = \{0\} \times [-1, 1]$. Show that $A$ is connected but not path-connected.
Similarly to connected sets, if the sets $A_\alpha \subset X$ are path-connected with $\bigcap_\alpha A_\alpha \ne \emptyset$, then $A := \bigcup_\alpha A_\alpha$ is path-connected. Because of this, one can define the path-connected component of $X$ containing a given $x_0 \in X$ as the maximal subset of $X$ containing $x_0$ that is path-connected. However, examples show that the path-connected components are not closed, in general. But we have the following.

6.67 Proposition. Let $X$ be a metric space. The following claims are equivalent.
(i) Each path-connected component is open (hence closed).
(ii) Each point of $X$ has a path-connected open neighborhood.

Proof. (ii) follows trivially from (i). Conversely, let $C$ be a path-connected component of $X$, let $x \in C$ and by assumption let $B(x, r_x)$ be a path-connected ball centered at $x$. Then trivially $B(x, r_x) \subset C$, i.e., $C$ is open. Moreover $C$ is also closed since $X \setminus C$ is the union of the other path-connected components, which are open sets, as we have proved. □
6.68 Corollary. An open set $A$ of $\mathbb{R}^n$ is connected if and only if it is path-connected.

Proof. Suppose that $A$ is connected and let $U \subset A$ be a nonempty open set. Each point $x \in U$ then has a ball $B(x, r) \subset U$ that is path-connected. By Proposition 6.67 any path-connected component $C$ of $A$ is open and closed in $A$. Since $A$ is connected, $C = A$. □
6.3.2 Some applications
Topological invariants can be used to prove that two given spaces are not homeomorphic.

6.69 Proposition. $\mathbb{R}$ and $\mathbb{R}^n$, $n > 1$, are not homeomorphic.

Proof. Assume, by contradiction, that $h : \mathbb{R}^n \to \mathbb{R}$ is a homeomorphism, and let $x_0$ be a point of $\mathbb{R}^n$. Then clearly $\mathbb{R}^n \setminus \{x_0\}$ is connected, but $h(\mathbb{R}^n \setminus \{x_0\}) = \mathbb{R} \setminus \{h(x_0)\}$ is not connected, a contradiction. □
Much more delicate is proving that

6.70 Theorem. $\mathbb{R}^n$ and $\mathbb{R}^m$, $n \ne m$, are not homeomorphic.

The idea of the proof remains the same. It will be sufficient to have a topological invariant that distinguishes between different $\mathbb{R}^n$. Similarly, one shows that $[0, 1]$ and $[0, 1]^n$, $n > 1$, are not homeomorphic even if a one-to-one correspondence exists.

6.71 ¶. Show that for any one-to-one mapping $h : [0, 1]^n \to [0, 1]$, $n > 1$, neither $h$ nor $h^{-1}$ is continuous.
6.72 ¶. Show that the unit circle $S^1$ of $\mathbb{R}^2$ is not homeomorphic to $\mathbb{R}$. [Hint: Suppose $h : S^1 \to \mathbb{R}$ is a homeomorphism and let $x_0 \in S^1$. Then $S^1 \setminus \{x_0\}$ is connected, while $\mathbb{R} \setminus \{h(x_0)\}$ is not connected.]
6.73 Theorem. In $\mathbb{R}$ each closed interval is homeomorphic to $[-1, 1]$; each open interval is homeomorphic to $]-1, 1[$ and each half-open interval is homeomorphic to $]-1, 1]$. Moreover, no two of these three intervals are homeomorphic.

Proof. The first part is trivial. To prove the second part, it suffices to argue as in Proposition 6.69, observing that the number of points that can be removed without destroying connectedness is respectively 2, 0 or 1 for the three standard intervals. □

6.74 ¶. Show that the unit sphere $S^n := \{x \in \mathbb{R}^{n+1} \mid |x| = 1\}$ in $\mathbb{R}^{n+1}$ is connected and that $S^1$ and $S^n$, $n > 1$, are not homeomorphic.

6.75 ¶. Let $A \subset \mathbb{R}^n$ and let $C \subset \mathbb{R}^n$ be a connected set containing points of both $A$ and $\mathbb{R}^n \setminus A$. Show that $C$ contains points of $\partial A$.

6.76 ¶. Show that the numbers of connected components and of path-connected components are topological invariants.

Theorem. Let $f : X \to Y$ be a continuous function. The image of each connected (path-connected) component of $X$ must lie in a connected (path-connected) component of $Y$. Moreover, if $f$ is a homeomorphism, then $f$ induces a one-to-one correspondence between connected (path-connected) components of $X$ and $Y$.
6.77 ¶. In set theory, the following theorem of Cantor-Bernstein holds, see Theorem 3.58 of [GM2].

Theorem. If there exist injective maps $X \to Y$ and $Y \to X$, then there exists a one-to-one map between $X$ and $Y$.

This theorem becomes false if we require also continuity.

Theorem (Kuratowski). There may exist continuous and one-to-one maps $f : X \to Y$ and $g : Y \to X$ between metric spaces and yet $X$ and $Y$ are not homeomorphic.

[Hint: Let $X, Y \subset \mathbb{R}$ be given by

$$X = ]0,1[ \cup \{2\} \cup ]3,4[ \cup \{5\} \cup \dots \cup ]3n, 3n+1[ \cup \{3n+2\} \cup \dots$$
$$Y = ]0,1] \cup ]3,4[ \cup \{5\} \cup \dots \cup ]3n, 3n+1[ \cup \{3n+2\} \cup \dots$$

By Exercise 6.76, $X$ and $Y$ are not homeomorphic, since the component $]0,1]$ of $Y$ is not homeomorphic to any component of $X$, but the maps $f : X \to Y$ and $g : Y \to X$ given by

$$f(x) := \begin{cases} x & \text{if } x \ne 2, \\ 1 & \text{if } x = 2, \end{cases} \qquad \text{and} \qquad g(x) := \begin{cases} \dfrac{x}{2} & \text{if } x \in ]0,1], \\[4pt] \dfrac{x-2}{2} & \text{if } x \in ]3,4[, \\[4pt] x - 3 & \text{otherwise} \end{cases}$$

are continuous and one-to-one.]
6.4 Exercises

6.78 ¶. Show that a continuous map between compact spaces need not be an open map, i.e., need not map open sets into open sets.

6.79 ¶. Show that an open set in $\mathbb{R}^n$ has at most countably many connected components. Show that this is no longer true for closed sets.

6.80 ¶. The distance between two subsets $A$ and $B$ of a metric space is defined by

$$d(A, B) := \inf_{a \in A,\ b \in B} d(a, b).$$

Of course, the distance between two disjoint open sets or closed sets may be zero. Show that, if $A$ is closed, $B$ is compact and $A \cap B = \emptyset$, then $d(A, B) > 0$. [Hint: Suppose $\exists\, a_n, b_n$ such that $d(a_n, b_n) \to 0$ ...]

6.81 ¶. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces, and let $(X \times Y, d_{X \times Y})$ be their Cartesian product with one of the equivalent distances in Exercise 5.14. Let $\pi : X \times Y \to X$ be the projection map onto the first component, $\pi(x, y) := x$. $\pi$ is an open map, see Exercise 5.131. Assuming $Y$ compact, show that $\pi$ is a closed map, i.e., maps closed sets into closed sets.

6.82 ¶. Let $f : X \to Y$ be a map between two metric spaces and suppose $Y$ is compact. Show that $f$ is continuous if and only if its graph

$$G_f := \{(x, y) \in X \times Y \mid x \in X,\ y = f(x)\}$$

is closed in $X \times Y$ endowed with one of the distances in Exercise 5.14. Show that, in general, the claim is false if $Y$ is not compact.

6.83 ¶. Let $K$ be a compact set in $\mathbb{R}^2$, and for every $x \in \mathbb{R}$ set $K_x := \{y \in \mathbb{R} \mid (x, y) \in K\}$ and $f(x) := \operatorname{diam} K_x$, $x \in \mathbb{R}$. Show that $f$ is upper semicontinuous.

6.84 ¶. A map $f : X \to Y$ is said to be proper if the inverse image of any compact set $K \subset Y$ is a compact set in $X$. Show that $f$ is a closed map if it is continuous and proper.

6.85 ¶. Show Theorem 6.35 using the finite covering property of $X$. [Hint: $\forall \epsilon > 0$ to every $x \in X$ we can associate a $\delta(x) > 0$ such that $d_Y(f(x), f(y)) < \epsilon/2$ whenever $y \in X$ and $d_X(x, y) < \delta(x)$. From the open covering $\{B(x, \delta(x))\}$ of $X$ we can extract a finite subcovering $\{B(x_i, \delta(x_i))\}_{i=1,\dots,N}$ such that $X \subset B(x_1, \delta(x_1)) \cup \dots \cup B(x_N, \delta(x_N))$. Set $\delta := \min\{\delta(x_1), \dots, \delta(x_N)\}$.]

6.86 ¶. Let $f : E \to \mathbb{R}^m$ be uniformly continuous on a bounded set $E \subset \mathbb{R}^n$. Show that $f(E)$ is bounded. [Hint: The closure of $E$ is a compact set ...]

6.87 ¶. Show that (i) if $f : X \to \mathbb{R}^n$ and $g : X \to \mathbb{R}^n$ are uniformly continuous, then $f + g$ and $\lambda f$, $\lambda \in \mathbb{R}$, are uniformly continuous, (ii) if $f : X \to Y$ is uniformly continuous in $A \subset X$ and $B \subset X$, then $f$ is uniformly continuous in $A \cup B$.

6.88 ¶. Let $f, g : X \to \mathbb{R}$ be uniformly continuous. Give conditions such that $fg$ is uniformly continuous.
6.89 ¶. Show that the composition of uniformly continuous functions is uniformly continuous.

6.90 ¶. Concerning maps $f : [0, +\infty[ \to \mathbb{R}$, show the following.
(i) If $f$ is continuous and $f(x) \to \lambda \in \mathbb{R}$ as $x \to +\infty$, then $f$ is uniformly continuous in $[0, +\infty[$.
(ii) If $f$ is continuous and has an asymptote, then $f$ is uniformly continuous in $[0, +\infty[$.
(iii) If $f : [0, +\infty[ \to \mathbb{R}$ is uniformly continuous in $[0, +\infty[$, then there exist constants $A$ and $B$ such that $|f(x)| \le A|x| + B$ $\forall x \ge 0$.
(iv) If $f$ is bounded, then there exists a concave function $\omega(t)$, $t \ge 0$, such that $|f(x) - f(y)| \le \omega(|x - y|)$ $\forall x, y \ge 0$.

6.91 ¶. Let $K \subset X$ be a compact subset of a metric space $X$ and $x \in X \setminus K$. Show that there exists $y \in K$ such that $d(x, y) = d(x, K)$.

6.92 ¶. Let $X$ be a compact metric space and $f : X \to X$ be an isometry. Show that $f(X) = X$. [Hint: $f^2, f^3, \dots$ are isometries.]

6.93 ¶¶. Show that the set of points of $\mathbb{R}^2$ whose coordinates are not both rational is connected.

6.94 ¶. Let $B$ be an at most countable subset of $\mathbb{R}^n$, $n > 1$. Show that $C := \mathbb{R}^n \setminus B$ is segment-connected. [Hint: Assume that $0 \in C$ and show that each $x \in C$ can be connected with the origin by a path contained in $C$, thus $C$ is path-connected. Now if the segment $[0, x]$ is contained in $C$ we have reached the end of our proof, otherwise consider any segment $R$ transversal to $[0, x]$ and show that there is $z \in R$ such that the polyline $[0, z] \cup [z, x]$ does not intersect $B$.]

6.95 ¶. Let $f : \mathbb{R}^n \to \mathbb{R}$, $n > 1$, be continuous. Show that there are at most two points $y \in \mathbb{R}$ for which $f^{-1}(y)$ is at most countable. [Hint: Take into account Exercise 6.94.]
7. Curves
The intuitive idea of a curve is that of the motion of a point in space. This idea summarizes the heritage of the ancient Greeks, who used to think of curves as geometric figures characterized by specific geometric properties, such as the conics, and the heritage of the XVIII century, when, with the development of mechanics, curves were thought of as the trace of a moving point.
7.1 Curves in $\mathbb{R}^n$

7.1.1 Curves and trajectories
From a mathematical point of view, it is convenient to think of a curve as a continuous map $\gamma$ from an interval $I$ of $\mathbb{R}$ into $\mathbb{R}^n$, $\gamma \in C^0(I, \mathbb{R}^n)$. The image $\gamma(I)$ of a curve $\gamma \in C^0(I, \mathbb{R}^n)$ is called the trace or the trajectory of $\gamma$. We say that $\gamma : I \to \mathbb{R}^n$ is a parametrization of $\Gamma$ if $\gamma(I) = \Gamma$; intuitively, a curve is a (continuous) way to travel on $\Gamma$. If $x, y \in \mathbb{R}^n$, a curve $\gamma \in C^0([a, b], \mathbb{R}^n)$ such that $\gamma(a) = x$, $\gamma(b) = y$, is often called a path joining $x$ and $y$. A curve is what in kinematics is called the motion law of a material point, and its image or trajectory is the "line" in $\mathbb{R}^n$ spanned when the point moves. If the basis in $\mathbb{R}^n$ is understood (as we shall do from now on, fixing the standard basis of $\mathbb{R}^n$), a curve $\gamma \in C^0(I, \mathbb{R}^n)$ writes as an $n$-tuple of continuous real-valued functions of one variable, $\gamma(t) = (\gamma^1(t), \gamma^2(t), \dots, \gamma^n(t))$, $\gamma^i : I \to \mathbb{R}$, $\gamma^i(t)$ being the $i$-th component of $\gamma(t)$ $\forall t \in I$.

Let $k = 1, 2, \dots$, or $\infty$. We say that a curve $\gamma$ is of class $C^k$, and write $\gamma \in C^k(I, \mathbb{R}^n)$, if all the components of $\gamma$ are real-valued functions of class $C^k(I, \mathbb{R})$. We also say that $\gamma : [a, b] \to \mathbb{R}^n$ is a closed curve of class $C^k$ if $\gamma$ is closed, $\gamma \in C^k([a, b], \mathbb{R}^n)$ and moreover, the derivatives of order up to $k$ of each component of $\gamma$ at $a$ and $b$ coincide,

$$D^j \gamma^i(a) = D^j \gamma^i(b) \qquad \forall i = 1, \dots, n,\ \forall j = 1, \dots, k.$$

If $\gamma : I \to \mathbb{R}^n$ is of class $C^1$, the vector
$$\gamma'(t_0) := \big( (\gamma^1)'(t_0), \dots, (\gamma^n)'(t_0) \big)$$
te [0,1],
is an affine map, called the parametric equation of the line through $x$ in the direction of $y - x$. Thus its trajectory is the line $L \subset \mathbb{R}^n$ through $s(0) = x$ and $s(1) = y$ with constant vector velocity $s'(t) = y - x$. In kinematics, $s(t)$ is the position of a point traveling on the straight line $s(\mathbb{R})$ with constant speed $|y - x|$, assuming $s(0) = x$ and $s(1) = y$. Therefore the restriction $s|_{[0,1]}$ of $s$,
$$s(t) = (1-t)x + ty, \qquad 0 \le t \le 1,$$
describes the uniform motion of a point starting from $x$ at time $t = 0$ and arriving at $y$ at time $t = 1$ with constant speed $|y - x|$, and is called the parametric equation of the segment joining $x$ to $y$.

7.2 Example (Uniform circular motion). The curve $\gamma : \mathbb{R} \to \mathbb{R}^2$ given by $\gamma(t) = (\cos t, \sin t)$ has as its trajectory the unit circle of $\mathbb{R}^2$, $\{(x,y) \mid x^2 + y^2 = 1\}$, traveled with speed one. In fact, $\gamma'(t) = (-\sin t, \cos t)$, thus $|\gamma'(t)| = 1$ $\forall t$. $\gamma$ describes the uniform circular motion of a point on the unit circle that starts at time $t = 0$ at $(1,0)$ and moves counterclockwise with angular velocity one, cf. [GM1]. Notice that $\gamma' \perp \gamma$ and $\gamma'' \perp \gamma'$ since
$$(\gamma'(t)\,|\,\gamma(t)) = \frac12 \frac{d}{dt}|\gamma|^2(t) = 0, \qquad (\gamma''(t)\,|\,\gamma'(t)) = \frac12 \frac{d}{dt}|\gamma'|^2(t) = 0.$$
Finally, observe that the restriction of $\gamma$ to $[0, 2\pi[$ runs on the unit circle once, since $\gamma|_{[0,2\pi[}$ is injective. The uniform circular motion is better described looking at $\mathbb{R}^2$ as the Gauss plane of complex numbers, see [GM2]. Doing so, we substitute $\gamma(t)$ with $t \to e^{it}$, $t \in \mathbb{R}$, since we have $e^{it} = \cos t + i \sin t$.

7.3 Example (Graphs). Let $f \in C^0(I, \mathbb{R}^n)$ be a curve. The graph of $f$,
$$G_f := \{(x,y) \in I \times \mathbb{R}^n \mid x \in I,\ y = f(x)\} \subset \mathbb{R}^{n+1},$$
has the standard parametrization, still denoted by $G_f$, $G_f : I \to \mathbb{R}^{n+1}$, $G_f(t) := (t, f(t))$, called the graph-curve of $f$. Observe that $G_f$ is an injective map, in particular $G_f$ is never a closed curve, $G_f$ is of class $C^k$ if $f$ is of class $C^k$, $k = 1, \dots, \infty$, and $G_f'(t) = (1, f'(t))$ if $f$ is of class $C^1$. A point that moves with the graph-curve law along the graph moves with horizontal component of the velocity normalized to $+1$. Notice that $|G_f'(t)| \ge 1$ $\forall t$.
Figure 7.1. A cylindrical helix.
7.4 Example (Cylindrical helix). If $\gamma(t) = (a\cos t, a\sin t, bt)$, $t \in \mathbb{R}$, then $\gamma'(t) = (-a\sin t, a\cos t, b)$, $t \in \mathbb{R}$. We see that the point $\gamma(t)$ moves with constant (scalar) speed $|\gamma'(t)| = \sqrt{a^2 + b^2}$ along a helix, see Figure 7.1.

7.5 Example (Different parametrizations). Different curves may have the same trace, as we have seen for uniform circular motion. As another example, the curves $\gamma_1(t) := (t, 0)$, $\gamma_2(t) := (t^3, 0)$ and $\gamma_3(t) := (t(t^2 - 1), 0)$, $t \in \mathbb{R}$, are different parametrizations of the abscissa-axis of $\mathbb{R}^2$; of course, the three parametrizations give rise to different motions along the $x$-axis. Similarly, the curves $\sigma_1(t) = (t^3, t^2)$ and $\sigma_2(t) = (t, (t^2)^{1/3})$, $t \in \mathbb{R}$, are different parametrizations of the curve in (a) Figure 7.2. Notice that $\sigma_1$ is a $C^\infty$-parametrization, while $\sigma_2$ is continuous but not of class $C^1$.

7.6 Example (Polar curves). Many curves are more conveniently described by a polar parametrization: instead of giving the evolution of the Cartesian coordinates of $\gamma(t) := (x(t), y(t))$, we give two real functions $\theta(t)$ and $\rho(t)$ that describe, respectively, the angle of $\gamma(t)$ measured from the positive part of the abscissa axis and the distance of $\gamma(t)$ from the origin, so that in Cartesian coordinates
$$\gamma(t) = (\rho(t)\cos\theta(t),\ \rho(t)\sin\theta(t)).$$
If the independent variable $t$ coincides with the angle $\theta$, $\theta(t) = t$, we obtain a polar curve
$$\gamma(\theta) = (\rho(\theta)\cos\theta,\ \rho(\theta)\sin\theta).$$
In the literature there are many classical curves that have been studied for their relevance in many questions. Listing them would be incredibly long, but we shall illustrate some of them in Section 7.1.3.
Figure 7.2. (a) $\gamma(t) = (t^3, t^2)$, (b) $\gamma(t) = (t^3 - 4t, t^2 - 4)$.
a. The calculus

Essentially the entire calculus, with the exception of the mean value theorem, can be carried on to curves.

7.7 Definition. Let $\gamma \in C^0([a,b]; \mathbb{R}^n)$, $\gamma = (\gamma^1, \gamma^2, \dots, \gamma^n)$. The integral of $\gamma$ on $[a,b]$ is the vector in $\mathbb{R}^n$
$$\int_a^b \gamma(s)\,ds := \Big(\int_a^b \gamma^1(s)\,ds,\ \int_a^b \gamma^2(s)\,ds,\ \dots,\ \int_a^b \gamma^n(s)\,ds\Big).$$

7.8 Proposition. If $\gamma \in C^0([a,b]; \mathbb{R}^n)$, then
$$\Big|\int_a^b \gamma(s)\,ds\Big| \le \int_a^b |\gamma(s)|\,ds.$$

Proof. Suppose that $\int_a^b \gamma(s)\,ds \ne 0$, otherwise the claim is trivial. For all $v = (v^1, v^2, \dots, v^n) \in \mathbb{R}^n$ we have
$$\Big(v \,\Big|\, \int_a^b \gamma(s)\,ds\Big) = \sum_{i=1}^n v^i \int_a^b \gamma^i(s)\,ds = \int_a^b \sum_{i=1}^n v^i \gamma^i(s)\,ds = \int_a^b (v\,|\,\gamma(s))\,ds;$$
using Cauchy's inequality we deduce
$$\Big|\Big(v \,\Big|\, \int_a^b \gamma(s)\,ds\Big)\Big| = \Big|\int_a^b (v\,|\,\gamma(s))\,ds\Big| \le \int_a^b |(v\,|\,\gamma(s))|\,ds \le |v| \int_a^b |\gamma(s)|\,ds$$
for all $v \in \mathbb{R}^n$. Therefore it suffices to choose $v := \int_a^b \gamma(s)\,ds$ to find the desired result. □

If $\gamma \in C^1([a,b], \mathbb{R}^n)$ and $n > 1$, the mean value theorem does not hold. Indeed, if $\gamma(t) = (\cos t, \sin t)$, $t \in [0, 2\pi]$, and $s \in [0, 2\pi]$ is such that $0 = \gamma(2\pi) - \gamma(0) = 2\pi\gamma'(s)$, we reach a contradiction, since $|\gamma'(s)| = |(-\sin s, \cos s)| = 1$. However, the fundamental theorem of calculus, when applied to the components, yields the following.

7.9 Theorem. Let $\gamma \in C^1([a,b]; \mathbb{R}^n)$. Then
$$\gamma(b) - \gamma(a) = \int_a^b \gamma'(s)\,ds.$$

Finally, we notice that Taylor's formula extends to curves simply writing it for each component,
$$\gamma(t) = \gamma(t_0) + \gamma'(t_0)(t - t_0) + \frac12\gamma''(t_0)(t - t_0)^2 + \dots + \frac{1}{k!}\gamma^{(k)}(t_0)(t - t_0)^k + \frac{1}{k!}\int_{t_0}^t (t - s)^k \gamma^{(k+1)}(s)\,ds. \tag{7.1}$$
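As a quick numerical illustration of Definition 7.7 and Proposition 7.8, the following sketch integrates the uniform circular motion componentwise over half a turn and compares the two sides of the inequality. The helper names (`integrate`, `gamma`, `norm`) and the choice of the midpoint rule are assumptions made only for this illustration.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite midpoint rule for a real-valued function on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def gamma(t):
    """Uniform circular motion, used here only as a test curve."""
    return (math.cos(t), math.sin(t))

def norm(v):
    """Euclidean norm of a tuple."""
    return math.sqrt(sum(c * c for c in v))

a, b = 0.0, math.pi  # half a turn
# Componentwise integral of gamma, as in Definition 7.7.
integral = tuple(integrate(lambda s, i=i: gamma(s)[i], a, b) for i in range(2))
lhs = norm(integral)                              # |integral of gamma|
rhs = integrate(lambda s: norm(gamma(s)), a, b)   # integral of |gamma|
```

Here the integral of $\gamma$ over $[0, \pi]$ is the vector $(0, 2)$, so the left-hand side is $2$ while the right-hand side is $\pi$.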
Figure 7.3. Some trajectories: from the left, (a) simple curve, (b) simple closed curve, (c), (d), (f) curves that are not simple.
b. Self-intersections

Traces of curves may have self-intersections, i.e., in general, curves are not injective. In (b) Figure 7.2 the trace of the curve $\gamma(t) = (t^3 - 4t, t^2 - 4)$, $t \in \mathbb{R}$, self-intersects at the origin. One defines the multiplicity of a curve $\gamma \in C^0(I, \mathbb{R}^n)$ at $x \in \mathbb{R}^n$ as the number of $t$'s such that $\gamma(t) = x$,
$$N(\gamma, I, x) := \#\{t \in I \mid \gamma(t) = x\}.$$
Of course, the trace of $\gamma$ is the set of points with multiplicity at least 1. We shall distinguish two cases.

(i) $\gamma : I \to \mathbb{R}^n$ is not closed, i.e., $\gamma(a) \ne \gamma(b)$. In this case we say that $\gamma$ is simple if $\gamma$ is injective, i.e., all points of its trajectory have multiplicity 1. Notice that, if $I = [a,b]$, then $\gamma$ is simple if and only if $\gamma$ is a homeomorphism of $[a,b]$ onto $\gamma([a,b])$, $[a,b]$ being compact, see Corollary 6.19. In contrast, if $I$ is not compact, $I$ and $\gamma(I)$ in general are not homeomorphic. For instance, let $I = [0, 2\pi[$ and $\gamma(t) := (\cos t, \sin t)$, $t \in I$, be the uniform circular motion. Then $\gamma(I)$ is the unit circle, which is not homeomorphic to $I$, see Exercise 6.72.

(ii) $\gamma$ is a closed curve, i.e., $I = [a,b]$ and $\gamma(a) = \gamma(b)$. In this case we say that $\gamma$ is a simple closed curve if the restriction of $\gamma$ to $[a,b[$ is injective, or, equivalently, if all points of the trajectory of $\gamma$ but $\gamma(a) = \gamma(b)$ have multiplicity 1.

A (closed) curve $\gamma$ has self-intersections if it is not a (closed) simple curve.

7.10 ¶. Show that any closed curve $\gamma : [a,b] \to \mathbb{R}^n$ can be seen as a continuous map from the unit circle $S^1 \subset \mathbb{R}^2$. Furthermore show that its trace is homeomorphic to $S^1$ if $\gamma$ is simple.

7.11 ¶. Study the curves $(x(t), y(t))$,
$$x(t) = 2t/(1 + t^2), \qquad y(t) = (t^2 - 1)/(1 + t^2),$$
c. Equivalent parametrizations

Many properties of curves are independent of the choice of the parameter, that is, are invariant under homeomorphic changes of the parameter. This is the case for the multiplicity function and, as we shall see later, for the length. For this reason, it is convenient to introduce the following definition.
7.12 Definition. Let $I$, $J$ be intervals and let $\gamma \in C^0(I, \mathbb{R}^n)$ and $\delta \in C^0(J, \mathbb{R}^n)$. We say that $\delta$ is equivalent to $\gamma$ if there is a continuous one-to-one map $h$ from $J$ onto $I$ such that
$$\delta(s) = \gamma(h(s)) \qquad \forall s \in J.$$

In other words, $\delta$ is equivalent to $\gamma$ if $\delta$ reduces to $\gamma$ modulo a continuous change of variable in the time axis. Since the inverse $h^{-1} : I \to J$ of a continuous one-to-one map is also continuous, see [GM1], we have that $\gamma$ is equivalent to $\delta$ iff $\delta$ is equivalent to $\gamma$. Actually one sees that the relation of equivalence among curves is an equivalence relation. Trivially, two equivalent curves have the same trace and the same multiplicity function; the converse is in general false.

7.13 Example. $\gamma(t) = (\cos t, \sin t)$, $t \in [0, 2\pi]$, and $\delta(t) = (\cos t, \sin t)$, $t \in [0, 4\pi]$, have the same trace but are not equivalent since their multiplicity functions are different.

However, we have the following.

7.14 Theorem. Two simple curves with the same trace are equivalent.

Proof. Assume for simplicity that the two curves $\gamma \in C^0(I, \mathbb{R}^n)$ and $\delta \in C^0(J, \mathbb{R}^n)$, $I$ and $J$ being intervals, are not closed. Set $h := \gamma^{-1} \circ \delta$, which clearly is a one-to-one and continuous map from $J$ to $I$. $h$ is then a homeomorphism, see [GM1], and clearly
$$\delta(t) = \gamma \circ \gamma^{-1} \circ \delta(t) = \gamma(h(t)) \qquad \forall t \in J. \qquad □$$

The notion of equivalence between curves can be made more precise.

7.15 Definition. Let $\gamma \in C^0(I, \mathbb{R}^n)$ and $\delta \in C^0(J, \mathbb{R}^n)$ be two equivalent curves, and let $h : J \to I$ be a homeomorphism such that $\delta(t) = \gamma(h(t))$ $\forall t \in J$. We say that $\gamma$ and $\delta$ have the same orientation if $h$ is monotone-increasing and have opposite orientation if $h$ is monotone-decreasing.

Since every homeomorphism between intervals is either strictly increasing or strictly decreasing, see [GM1], two equivalent curves either have the same orientation or have opposite orientations. In this way, the set of curves can be partitioned into equivalence classes and each class decomposes into two disjoint subclasses: equivalent curves with the same orientation and equivalent curves with opposite orientation.
7.1.2 Regular curves and tangent vectors

a. Regular curves

We say that a curve $\gamma$ of class $C^1$ is regular if $\gamma'(t) \ne 0$ $\forall t$. It is also convenient to reconsider the notion of equivalence in the category of curves of class $C^1$.
7.16 Definition. Let $I$, $J$ be intervals. Two curves $\gamma \in C^1(I, \mathbb{R}^n)$, $\delta \in C^1(J, \mathbb{R}^n)$ of class $C^1$ are $C^1$-equivalent if there exists a one-to-one map $h : J \to I$ of class $C^1$ with $h'(s) \ne 0$ $\forall s \in J$ such that
$$\delta(s) = \gamma(h(s)) \qquad \forall s \in J.$$

Clearly $C^1$-equivalent curves have the same trace. We can prove that being $C^1$-equivalent is an equivalence relation between regular curves; actually we shall prove the following result after Proposition 7.37.

7.17 Theorem. Let $\gamma$ and $\delta$ be two curves of class $C^1$, and suppose they are regular. Then $\gamma$ and $\delta$ are $C^1$-equivalent if and only if they are $C^0$-equivalent.

Since every function $h$ of class $C^1$ with $h' \ne 0$ $\forall t$ is either strictly increasing or strictly decreasing, because $h'$ cannot change sign, any two $C^1$-equivalent curves either have the same orientation or have opposite orientation. In this way the set of $C^1$-curves can be partitioned into equivalence classes and each class decomposes into two disjoint subclasses: $C^1$-equivalent curves with the same orientation and $C^1$-equivalent curves with opposite orientation.

b. Tangent vectors

Let $\gamma : I \to \mathbb{R}^n$ be a simple, regular curve of class $C^1$ and let $\Gamma := \gamma(I)$ be its trace. If $x \in \Gamma$, there exists a unique $t \in I$ such that $\gamma(t) = x$.

7.18 Definition. The space of tangent vectors to the trace $\Gamma$ at $x \in \Gamma$ is defined as the space of all multiples of $\gamma'(t)$,
$$\mathrm{Tan}_x\Gamma := \mathrm{Span}\,\{\gamma'(t)\}, \qquad \gamma(t) = x.$$
The unit tangent vector to $\gamma$ at $x := \gamma(t)$ is defined by
$$\frac{\gamma'(t)}{|\gamma'(t)|}.$$

Notice that the previous definition makes sense since one proves that $\mathrm{Span}\,\{\gamma'(t)\}$, where $\gamma(t) = x$, depends only on the trace of $\gamma$ and on $x$. In fact, if $\gamma : I \to \mathbb{R}^n$ and $\delta : J \to \mathbb{R}^n$ are two curves with the same trace $\Gamma$, then Theorems 7.14 and 7.17 yield that $\gamma$ and $\delta$ are $C^1$-equivalent, i.e., there exists $h : J \to I$ one-to-one and of class $C^1$ with $h'(s) \ne 0$ $\forall s \in J$ such that $\delta(s) = \gamma(h(s))$ $\forall s \in J$. Differentiating, we get
$$\delta'(s) = h'(s)\,\gamma'(h(s)),$$
that is, $\delta'(s)$ and $\gamma'(t)$ are multiples of each other when $\delta(s) = \gamma(t) =: x$. Moreover,
$$\frac{\delta'(s)}{|\delta'(s)|} = \operatorname{sgn}(h'(s))\,\frac{\gamma'(h(s))}{|\gamma'(h(s))|},$$
Figure 7.4. A polygonal line inscribed on a curve.
that is, two $C^1$-equivalent curves with the same orientation have the same unit tangent vector, and two $C^1$-equivalent curves with opposite orientation have opposite unit tangent vectors.

Remaining in a classic context, it is convenient to also introduce the families of piecewise-$C^1$ curves and piecewise regular curves.

7.19 Definition. A curve $\gamma : [a,b] \to \mathbb{R}^n$ is said to be piecewise-$C^1$ (respectively, piecewise regular) if $\gamma \in C^0([a,b], \mathbb{R}^n)$ and there exist finitely many points $a = t_0 < t_1 < \dots < t_N = b$ such that the restrictions $\gamma|_{[t_{i-1}, t_i]}$ are of class $C^1$ (respectively, regular) for all $i = 1, \dots, N$.

We emphasize that in Definition 7.19 $\gamma$ is required to be continuous everywhere in $[a,b]$, while derivatives are required to exist everywhere except at finitely many points where only left- and right-derivatives exist. Notice also that piecewise-$C^1$ curves are Lipschitz continuous.

7.20 ¶. Let $\gamma : [a,b] \to \mathbb{R}^n$ be a piecewise regular curve. Show that every point in $\gamma([a,b])$ has finite multiplicity. Show a piecewise regular curve that has infinitely many points of multiplicity 2.

7.21 ¶. Show that
$$\gamma(b) - \gamma(a) = \int_a^b \gamma'(s)\,ds$$
if $\gamma : [a,b] \to \mathbb{R}^n$ is piecewise $C^1$.

c. Length of a curve

Recall that a partition $\sigma$ of $[a,b]$ is a choice of finitely many points $t_0, \dots, t_N$ with $a = t_0 < t_1 < \dots < t_N = b$. Denote by $\mathcal{S}$ the family of partitions of $[a,b]$. For each partition $\sigma = \{t_0, t_1, \dots, t_N\} \in \mathcal{S}$ one computes the length $P(\sigma)$ of the polygonal line joining the points $\gamma(t_0), \gamma(t_1), \dots, \gamma(t_N)$ in the listed order, Figure 7.4,
$$P(\sigma) := \sum_{i=1}^N |\gamma(t_i) - \gamma(t_{i-1})|.$$
Figure 7.5. The graph of $f(x) = x\sin(1/x)$, $x \in [0,1]$, is not rectifiable.
7.22 Definition. Let $\gamma \in C^0([a,b]; \mathbb{R}^n)$. The length of $\gamma$ is defined as
$$L(\gamma) := \sup\{P(\sigma) \mid \sigma \in \mathcal{S}\}$$
and we say that $\gamma$ is rectifiable or $\gamma$ has finite total variation if $L(\gamma) < +\infty$.

In other words, the length of a curve is the supremum of the lengths of all inscribed polygonals. The following is easily seen.

7.23 Proposition. If $\gamma$ and $\delta$ are equivalent, then $L(\gamma) = L(\delta)$. In particular $\gamma$ and $\delta$ are either both rectifiable or not, and the length of a simple curve depends only on its trace.

7.24 ¶. Prove Proposition 7.23.

7.25 ¶. Let $\gamma : [a,b] \to \mathbb{R}^n$ be a curve and let $a < c < b$. Show that $L(\gamma) = L(\gamma|_{[a,c]}) + L(\gamma|_{[c,b]})$.

7.26 ¶. Show that if $\gamma(t) = (\cos t, \sin t)$, $t \in [0, 2\pi]$, we have $L(\gamma) = 2\pi$, while if $\gamma(t) = (\cos t, \sin t)$, $t \in [0, 4\pi]$, we have $L(\gamma) = 4\pi$.

7.27 Example. Curves $\gamma \in C^0([a,b]; \mathbb{R}^n)$ need not be rectifiable, i.e., of finite length. Indeed the curve graph of $f$, $\gamma(x) = (x, f(x))$, where
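Definition 7.22 can be explored numerically. The following sketch computes $P(\sigma)$ for uniform partitions of $[0, 2\pi]$ and the unit circle; the function names and the choice of uniform partitions are assumptions made for illustration. The values increase under refinement and stay below $2\pi$, in agreement with the length being a supremum.

```python
import math

def polygonal_length(gamma, partition):
    """P(sigma): length of the polygonal line through gamma(t_0), ..., gamma(t_N)."""
    points = [gamma(t) for t in partition]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n intervals of equal width."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def circle(t):
    return (math.cos(t), math.sin(t))

# Lengths of inscribed regular polygons with 4, 8, 16 and 1024 sides.
lengths = [polygonal_length(circle, uniform_partition(0.0, 2 * math.pi, n))
           for n in (4, 8, 16, 1024)]
```

Each partition in the list refines the previous one, so by the triangle inequality the polygonal length cannot decrease.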
Figure 7.6. A closed curve that is not rectifiable.
Figure 7.7. An approximation of the von Koch curve.
$$f(x) := \begin{cases} x\sin(1/x) & \text{if } x \in\, ]0,1], \\ 0 & \text{if } x = 0 \end{cases}$$
has infinite length, see Figure 7.5. Indeed, if
$$x_n := \frac{1}{n\pi + \pi/2}, \qquad n \in \mathbb{N},$$
the length of $\gamma|_{[x_n, x_{n-1}]}$ is larger than $x_n|\sin(1/x_n)| = x_n$, hence for any $n$
$$L(\gamma) \ge L(\gamma|_{[x_n, 1]}) \ge \sum_{k=1}^{n} x_k = \sum_{k=1}^{n} \frac{1}{k\pi + \pi/2},$$
i.e., $L(\gamma) = \infty$. Notice that $\gamma$ belongs to $C^0([0,1], \mathbb{R}^2) \cap C^1(]0,1], \mathbb{R}^2)$, but $\gamma'$ is not bounded in a neighborhood of 0.

7.28 Example (The von Koch curve). Clearly a bounded region of the plane may be enclosed by a curve of arbitrarily large length; think of the coasts of Great Britain or of Figure 7.6. A curve of infinite length enclosing a finite area is the von Koch curve, which is constructed as follows. Start from an equilateral triangle, and replace the middle third of each line segment with the two sides of an equilateral triangle whose third side is the middle third that we want to remove. Then iterate the procedure indefinitely. One can show that the iterated curves converge uniformly to a curve, called the von Koch curve, which (i) is a continuous simple curve, (ii) has infinite length and encloses a finite area, (iii) is not differentiable at any point.

7.29 ¶. Show that each iteration in the construction of von Koch's curve increases its length by a factor $4/3$, and, given any two points on the curve, the length of the arc between the two points is infinite. Finally, show that the surface enclosed by von Koch's curve is $8/5$ of the surface of the initial triangle.
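The two numerical claims of Exercise 7.29 can be checked with exact rational arithmetic. The sketch below is only a bookkeeping of side counts and areas, under the assumption that the initial triangle has side 1 and area normalised to 1; the function name `koch_snowflake` is hypothetical.

```python
from fractions import Fraction

def koch_snowflake(steps):
    """Return (perimeter, area) after `steps` iterations, for an initial
    equilateral triangle of side 1 whose area is normalised to 1."""
    sides, side_length = 3, Fraction(1)
    area = Fraction(1)
    for _ in range(steps):
        # One new triangle per side, similar to the initial one with ratio side_length / 3,
        # hence with area (side_length / 3) ** 2 in the chosen normalisation.
        area += sides * (side_length / 3) ** 2
        sides *= 4          # every side is replaced by four shorter sides
        side_length /= 3
    return sides * side_length, area
```

The perimeter after $k$ steps is $3(4/3)^k$, which is unbounded, while the areas increase to $8/5$; after $k$ steps the missing area is exactly $\frac35 (4/9)^k$.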
7.30 Example (The Peano curve). Continuous nonsimple curves may be quite pathological. Giuseppe Peano (1858-1932) showed in 1890 an example of a continuous curve $\gamma : [0,1] \to [0,1] \times [0,1]$ whose trace is the entire square: any such curve is called a Peano curve. Following David Hilbert (1862-1943), one such curve may be constructed as follows. Consider the sequence of continuous curves $\gamma_i : [0,1] \to \mathbb{R}^2$ as in Figure 7.8. The curve at step $i$ is obtained by modifying the curve at step $(i-1)$ in an interval of width $2^{-i}$ and in a corresponding square of side $2^{-i}$ on the target. The sequence of these curves therefore converges uniformly to a continuous curve, whose trace is the
Figure 7.8. Construction of a Peano curve according to Hilbert.
entire square. Of course, $\gamma$ is not injective, otherwise we would conclude that $[0,1]$ is homeomorphic to the square $[0,1]^2$, compare Proposition 6.69. Another way of constructing a Peano curve, closer to the original proof of Peano who used ternary representations of reals, is the following. Represent each $x \in [0,1]$ in its dyadic expression, $x = \sum_{i=1}^\infty b_i/2^i$, $b_i \in \{0,1\}$, choosing not to have representations ending with period 1. If $x = \sum_{i=1}^\infty b_i/2^i \in [0,1]$, set
$$\gamma(x) := \Big(\sum_{i=0}^\infty \frac{b_{2i+1}}{2^{i+1}},\ \sum_{i=0}^\infty \frac{b_{2i+2}}{2^{i+1}}\Big).$$
Using the fact that the alignment "changes" by a small quantity if $x$ varies in a sufficiently small interval, we easily infer that $\gamma$ is continuous. On the other hand, $\gamma$ is trivially surjective.
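Hilbert's construction can be made concrete on a grid. The sketch below uses a well-known index-to-cell algorithm for the discrete Hilbert curve of a $2^k \times 2^k$ grid; it is an assumption that this discretization is an adequate stand-in for the curves $\gamma_i$ of Figure 7.8, and the function name `hilbert_cell` is hypothetical. It illustrates the two features behind the construction: the path visits every cell, and consecutive cells are adjacent.

```python
def hilbert_cell(order, d):
    """Cell (x, y) visited at step d by the discrete Hilbert curve
    on a grid of side 2**order, for 0 <= d < 4**order."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # reflect the sub-square already built
                x, y = s - 1 - x, s - 1 - y
            # transpose the sub-square
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

order = 3
path = [hilbert_cell(order, d) for d in range(4 ** order)]
```

For `order = 1` the path is `(0, 0), (0, 1), (1, 1), (1, 0)`, the familiar U-shaped first step.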
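No pathological behavior occurs for curves of class $C^1$. In particular, there is a formula for computing their length.

7.31 Theorem. Let $\gamma \in C^1([a,b]; \mathbb{R}^n)$. Then $\gamma$ is rectifiable and
$$L(\gamma) = \int_a^b |\gamma'(s)|\,ds.$$

Proof. Let $\sigma \in \mathcal{S}$ be a partition of $[a,b]$ and $P(\sigma)$ the length of the polygonal line corresponding to $\sigma$. The fundamental theorem of calculus yields
$$\gamma(t_i) - \gamma(t_{i-1}) = \int_{t_{i-1}}^{t_i} \gamma'(s)\,ds,$$
hence
$$|\gamma(t_i) - \gamma(t_{i-1})| \le \int_{t_{i-1}}^{t_i} |\gamma'(s)|\,ds.$$
Summing over $i$, we conclude
$$P(\sigma) = \sum_{i=1}^N |\gamma(t_i) - \gamma(t_{i-1})| \le \int_a^b |\gamma'(s)|\,ds < \infty,$$
i.e., $L(\gamma) = \sup_\sigma P(\sigma) \le \int_a^b |\gamma'(s)|\,ds$, $\sigma$ being arbitrary. This shows that $\gamma$ is rectifiable. It remains to show that
$$\int_a^b |\gamma'(s)|\,ds \le L(\gamma) \tag{7.2}$$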
Figure 7.9. The first page of the paper of Giuseppe Peano (1858-1932) that appeared in Mathematische Annalen.
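or, equivalently, that for every $\epsilon > 0$ there is a partition $\sigma_\epsilon$ such that
$$\int_a^b |\gamma'(s)|\,ds \le P(\sigma_\epsilon) + \epsilon\,(b - a).$$
We observe that for every $s \in [t_{i-1}, t_i]$ we have
$$\gamma(t_i) - \gamma(t_{i-1}) = \int_{t_{i-1}}^{t_i} \gamma'(t)\,dt = \int_{t_{i-1}}^{t_i} (\gamma'(t) - \gamma'(s))\,dt + \gamma'(s)(t_i - t_{i-1}),$$
consequently
$$|\gamma'(s)| \le \frac{|\gamma(t_i) - \gamma(t_{i-1})|}{t_i - t_{i-1}} + \epsilon \tag{7.3}$$
provided we choose the partition $\sigma_\epsilon := (t_0, t_1, \dots, t_N)$ in such a way that
$$|\gamma'(t) - \gamma'(s)| \le \epsilon \qquad \text{if } s, t \in [t_{i-1}, t_i]$$
(such a choice is possible since $\gamma' : [a,b] \to \mathbb{R}^n$ is uniformly continuous in $[a,b]$ by the Heine-Cantor theorem, Theorem 6.35). The conclusion then follows integrating with respect to $s$ on $[t_{i-1}, t_i]$ and summing over $i$. □

Of course Theorem 7.31 also holds for piecewise-$C^1$ curves: if $\gamma \in C^0([a,b]; \mathbb{R}^n)$, $a = t_0 < t_1 < \dots < t_N = b$ and $\gamma \in C^1([t_{i-1}, t_i]; \mathbb{R}^n)$ $\forall i = 1, \dots, N$, then
$$L(\gamma) = \sum_{i=1}^N \int_{t_{i-1}}^{t_i} |\gamma'(s)|\,ds = \int_a^b |\gamma'(s)|\,ds.$$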
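Theorem 7.31 can be checked numerically on the cylindrical helix of Example 7.4, whose speed is constant. In the sketch below the parameter values `A`, `B`, the function names and the quadrature rule are assumptions chosen for illustration; the integral of the speed and the length of a fine inscribed polygonal are compared with the closed form $2\pi\sqrt{A^2 + B^2}$ for one turn.

```python
import math

A, B = 2.0, 0.5  # hypothetical helix parameters: gamma(t) = (A cos t, A sin t, B t)

def gamma(t):
    return (A * math.cos(t), A * math.sin(t), B * t)

def speed(t):
    # |gamma'(t)| for gamma'(t) = (-A sin t, A cos t, B)
    return math.sqrt((A * math.sin(t)) ** 2 + (A * math.cos(t)) ** 2 + B ** 2)

def length_by_integral(a, b, n=10000):
    """Midpoint-rule approximation of the integral of |gamma'| over [a, b]."""
    h = (b - a) / n
    return h * sum(speed(a + (i + 0.5) * h) for i in range(n))

def length_by_polygon(a, b, n=10000):
    """Length P(sigma) of the inscribed polygonal for a uniform partition."""
    pts = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

exact = math.sqrt(A * A + B * B) * 2 * math.pi  # one turn, t in [0, 2*pi]
```

The polygonal value approaches the integral from below, as expected from $P(\sigma) \le \int_a^b |\gamma'(s)|\,ds$.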
7.32 Lipschitz curves. Lipschitz curves, i.e., curves $\gamma : [a,b] \to \mathbb{R}^n$ for which there is $L > 0$ such that
$$|\gamma(t) - \gamma(s)| \le L|t - s| \qquad \forall t, s \in [a,b],$$
are rectifiable. In fact, for every partition $\sigma$ with $a = t_0 < t_1 < \dots < t_N = b$ we have
$$P(\sigma) = \sum_{i=1}^N |\gamma(t_{i-1}) - \gamma(t_i)| \le L \sum_{i=1}^N |t_i - t_{i-1}| = L(b - a).$$
Quite a bit more complicated is the problem of finding an explicit formula for the length of a Lipschitz curve or, more generally, of a rectifiable curve. This was solved with the contributions of Henri Lebesgue (1875-1941), Giuseppe Vitali (1875-1932), Tibor Radó (1895-1965), Hans Rademacher (1892-1969) and Leonida Tonelli (1885-1946) using several results of a more refined theory of integration, known as Lebesgue integration theory.

7.33 ¶ The length formula holds for primitives. Let $\gamma : [a,b] \to \mathbb{R}^n$ be a curve. Suppose there exists a Riemann integrable function $\psi : [a,b] \to \mathbb{R}^n$ such that
$$\gamma(t) = \gamma(a) + \int_a^t \psi(s)\,ds \qquad \forall t \in [a,b].$$
Show that $\gamma$ is rectifiable and $L(\gamma) = \int_a^b |\psi(t)|\,dt$.

7.34 ¶. Show that two regular curves that are $C^1$-equivalent have the same length. [Hint: Use the formula of integration by substitution.]

7.35 Example (Length of graphs). Let $f \in C^1([a,b], \mathbb{R})$. The graph of $f$, $G_f : [a,b] \to \mathbb{R}^2$, $G_f(t) = (t, f(t))$, is regular and $G_f'(t) = (1, f'(t))$. Thus the length of $G_f$ is
$$L(G_f) = \int_a^b \sqrt{1 + |f'(t)|^2}\,dt.$$

7.36 Example (Length in polar coordinates). (i) Let $\rho : [a,b] \to \mathbb{R}_+$, $\theta : [a,b] \to \mathbb{R}$ be functions of class $C^1$ and let $\gamma(t) = (x(t), y(t))$ be the corresponding plane curve in polar coordinates, $\gamma(t) = (\rho(t)\cos\theta(t), \rho(t)\sin\theta(t))$. Since $|\gamma'|^2 = x'^2 + y'^2 = \rho'^2 + \rho^2\theta'^2$ we infer
$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2\theta'^2}\,dt,$$
in particular, for a polar curve $\gamma(\theta) = (\rho(\theta)\cos\theta, \rho(\theta)\sin\theta)$, $\theta \in [a,b]$, we have
$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2}\,d\theta.$$

(ii) Let $\rho : [a,b] \to \mathbb{R}_+$, $\theta : [a,b] \to \mathbb{R}$ and $f : \mathbb{R} \to \mathbb{R}$ be of class $C^1$ and let $\gamma(t) := (x(t), y(t), z(t))$, $t \in [a,b]$, be the curve in space defined by cylindrical coordinates $(\rho(t), \theta(t), f(\theta(t)))$, i.e., $\gamma(t) := (\rho(t)\cos\theta(t), \rho(t)\sin\theta(t), f(\theta(t)))$. Since $|\gamma'|^2 = \rho'^2 + \rho^2\theta'^2 + f'^2(\theta)\,\theta'^2$, we have
$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2\theta'^2 + f'^2\theta'^2}\,dt.$$

(iii) For a curve in spherical coordinates $(\rho(t), \theta(t), \varphi(t))$, that is, for the curve $\gamma(t) = (x(t), y(t), z(t))$, $t \in [a,b]$, where
$$x(t) = \rho(t)\sin\varphi(t)\cos\theta(t), \qquad y(t) = \rho(t)\sin\varphi(t)\sin\theta(t), \qquad z(t) = \rho(t)\cos\varphi(t),$$
the length is
$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2\varphi'^2 + \rho^2\sin^2\!\varphi\;\theta'^2}\,dt.$$
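The polar length formula of Example 7.36 (i) is easy to evaluate numerically. The sketch below applies it to the polar curve $\rho(\theta) = 1 + \cos\theta$, $\theta \in [0, 2\pi]$, chosen here as an illustration; for this curve the integrand reduces to $2|\cos(\theta/2)|$, so the length is $8$. The function name `polar_length` and the midpoint rule are assumptions.

```python
import math

def polar_length(rho, drho, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of
    sqrt(rho'(theta)^2 + rho(theta)^2) d(theta)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        th = a + (i + 0.5) * h
        total += math.sqrt(drho(th) ** 2 + rho(th) ** 2)
    return h * total

# rho(theta) = 1 + cos(theta), with derivative rho'(theta) = -sin(theta)
length = polar_length(lambda th: 1 + math.cos(th),
                      lambda th: -math.sin(th),
                      0.0, 2 * math.pi)
```

For a circle of radius $R$, $\rho \equiv R$ and $\rho' \equiv 0$, and the same function returns $2\pi R$ up to rounding.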
Figure 7.10. Arc length or curvilinear coordinate.
d. Arc length and $C^1$-equivalence

Let $\gamma \in C^1([a,b]; \mathbb{R}^n)$ be a curve of class $C^1$ and regular, $\gamma'(t) \ne 0$ $\forall t$. The function $s_\gamma : [a,b] \to \mathbb{R}$ that for each $t \in [a,b]$ gives the length of $\gamma|_{[a,t]}$,
$$s_\gamma(t) := L(\gamma|_{[a,t]}) = \int_a^t |\gamma'(s)|\,ds,$$
is called the arc length or curvilinear abscissa of $\gamma$. We have

(i) $s_\gamma(t)$ is continuous, nondecreasing and maps $[a,b]$ onto $[0, L]$, $L$ being the length of $\gamma$. Moreover $s_\gamma$ is differentiable at every point and
$$s_\gamma'(t) = |\gamma'(t)| \qquad \forall t \in [a,b],$$

(ii) since $\gamma$ is regular, $\gamma'(t) \ne 0$ $\forall t \in [a,b]$, $s_\gamma(t)$ is in fact strictly increasing; consequently, its inverse $t_\gamma : [0, L] \to [a,b]$ is strictly increasing, too, and by the differentiation theorem of the inverse, see [GM1], $t_\gamma$ is of class $C^1$ and
$$t_\gamma'(s) = \frac{1}{|\gamma'(t_\gamma(s))|} \qquad \forall s \in [0, L].$$

With the previous notation, the reparametrization by arc length of $\gamma$ is defined as the curve $\delta_\gamma : [0, L] \to \mathbb{R}^n$ given by
$$\delta_\gamma(s) := \gamma(t_\gamma(s)), \qquad s \in [0, L].$$
Differentiating, we get
$$|\delta_\gamma'(s)| = \Big|\frac{d\delta_\gamma}{ds}(s)\Big| = |\gamma'(t_\gamma(s))|\,|t_\gamma'(s)| = 1 \qquad \forall s.$$

As a consequence, the arc length reparametrization of a regular curve $\gamma$ of length $L$ is a curve $\delta : [0, L] \to \mathbb{R}^n$ that is $C^1$-equivalent to $\gamma$, has the same orientation as $\gamma$ and for which $|\delta'(s)| = 1$ $\forall s \in [0, L]$. It is actually the unique reparametrization with these properties.

7.37 Proposition. Let $\varphi : [a,b] \to \mathbb{R}^n$ and $\psi : [c,d] \to \mathbb{R}^n$ be two $C^1$-equivalent curves with the same orientation, $\psi(s) = \varphi(h(s))$ $\forall s \in [c,d]$, for some $h : [c,d] \to [a,b]$ of class $C^1$ with $h' > 0$, and length $L$. Then
$$s_\psi(t) = s_\varphi(h(t)) \quad \forall t \in [c,d] \qquad \text{and} \qquad t_\psi(s) = h^{-1}(t_\varphi(s)) \quad \forall s \in [0, L],$$
hence $\delta_\psi(s) = \delta_\varphi(s)$ $\forall s \in [0, L]$.
Figure 7.11. Maria Agnesi (1718-1799) and a page from the Editio princeps of the works of Archimedes of Syracuse (287BC-212BC).
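Proof. If $\psi(s) = \varphi(h(s))$ $\forall s \in [c,d]$, $h \in C^1$, $h' > 0$, then for any $t \in [c,d]$
$$s_\psi(t) = \int_c^t |\psi'(\tau)|\,d\tau = \int_c^t |\varphi'(h(\tau))|\,h'(\tau)\,d\tau = \int_a^{h(t)} |\varphi'(\tau)|\,d\tau = s_\varphi(h(t)),$$
hence $t_\psi = h^{-1} \circ t_\varphi$ and $\delta_\psi := \psi \circ t_\psi = \psi \circ h^{-1} \circ t_\varphi = \varphi \circ h \circ h^{-1} \circ t_\varphi = \varphi \circ t_\varphi = \delta_\varphi$. □

Proof of Theorem 7.17. Assume that $\delta \in C^1([c,d], \mathbb{R}^n)$, $\gamma \in C^1([a,b], \mathbb{R}^n)$, $\gamma$ regular, $h : [c,d] \to [a,b]$ is continuous and increasing and $\delta(s) = \gamma(h(s))$ $\forall s \in [c,d]$. Then the functions
$$\beta(s) := L(\delta|_{[c,s]}) = \int_c^s |\delta'(\tau)|\,d\tau, \qquad s \in [c,d],$$
$$\alpha(t) := L(\gamma|_{[a,t]}) = \int_a^t |\gamma'(\tau)|\,d\tau, \qquad t \in [a,b],$$
are of class $C^1$ and $\beta(s) = L(\delta|_{[c,s]}) = L(\gamma|_{[a,h(s)]}) = \alpha(h(s))$ $\forall s \in [c,d]$, see Proposition 7.37. Since $\gamma$ is regular, $\alpha(t)$ is invertible with inverse of class $C^1$, hence $h(s) = \alpha^{-1}(\beta(s))$ and $h$ is of class $C^1$. □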
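The arc length reparametrization rarely has a closed form, but it is straightforward to approximate. The sketch below tabulates $s_\gamma$ by a midpoint rule and inverts it by linear interpolation; the test curve (a parabola arc), the table size and all function names are assumptions made for illustration only.

```python
import bisect
import math

def arc_length_table(speed, a, b, n=2000):
    """Sample t_i and s(t_i), the integral of |gamma'| over [a, t_i] (midpoint rule per cell)."""
    h = (b - a) / n
    ts, ss = [a], [0.0]
    for i in range(n):
        ss.append(ss[-1] + h * speed(a + (i + 0.5) * h))
        ts.append(a + (i + 1) * h)
    return ts, ss

def t_of_s(ts, ss, s):
    """Invert s(t) by linear interpolation; s(t) is increasing for a regular curve."""
    j = min(max(bisect.bisect_right(ss, s), 1), len(ss) - 1)
    w = (s - ss[j - 1]) / (ss[j] - ss[j - 1])
    return ts[j - 1] + w * (ts[j] - ts[j - 1])

def gamma(t):
    """Test curve: the parabola arc (t, t^2), regular on [0, 1]."""
    return (t, t * t)

def speed(t):
    """|gamma'(t)| = sqrt(1 + 4 t^2)."""
    return math.sqrt(1.0 + 4.0 * t * t)

ts, ss = arc_length_table(speed, 0.0, 1.0)
L = ss[-1]  # approximate total length

def delta(s):
    """Approximate arc length reparametrization delta(s) = gamma(t(s)), s in [0, L]."""
    return gamma(t_of_s(ts, ss, s))
```

A difference quotient such as `math.dist(delta(0.5), delta(0.501)) / 0.001` is then close to 1, reflecting $|\delta'(s)| = 1$.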
7.1.3 Some celebrated curves

Throughout the centuries, mathematicians, artists, scholars of natural sciences and laymen have had an interest in plane curves, their variety of forms, and their occurrence in many natural phenomena. As a consequence
Figure 7.12. (a) Archimedes's spiral, (b) Fermat's spiral, (c) Hyperbolic spiral.
there is a large literature which attempts to classify plane curves according to their properties, focusing on their constructive aspects or simply providing catalogs. In this section we shall present some of these famous curves.

a. Spirals

Spirals are probably among the best known curves, the first and simplest being the spiral of Archimedes. This is the curve described by a point that moves with constant velocity $v$ along a half-line that rotates with constant angular velocity $\omega$ around its origin. If the origin of the half-line is the origin of a Cartesian plane, we have $\rho = vt$, $\theta = \omega t$, thus the polar form of Archimedes's spiral is
$$\rho = a\theta, \qquad a := \frac{v}{\omega}.$$
Other spirals are obtained assuming that the motion along the half-line is accelerated, for instance $\rho = a\theta^n$. All these spirals begin at the origin at $\theta = 0$ and move away from the origin as $\theta$ increases.
Figure 7.13. (a) Lituus, (b) Logarithmic spiral, (c) Cayley's sextic.
Figure 7.14. (a) Cardioid, (b) Lemniscate, (c) L'Hospital cubic.
ARCHIMEDEAN SPIRALS. These are the curves defined by
$$\rho^m = a^m\theta.$$
Among them, see Figures 7.12 and 7.13, we mention
o Archimedes's spiral $\rho = a\theta$,
o Fermat's spiral $\rho^2 = a^2\theta$,
o the hyperbolic or inverse spiral $\rho = a/\theta$,
o the lituus $\rho^2 = a^2/\theta$,
o the logarithmic or equiangular spiral $\theta = \log_A\rho$, i.e., $\rho = A^\theta$, $A > 1$. It is the spira mirabilis of Jacob Bernoulli (1654-1705). It (actually, its tangent at every point) forms a constant angle with any ray from the origin, and every ray intersects the logarithmic spiral in a sequence of points with distances in a geometric progression. It is probably the spiral that one finds most frequently in nature, expressing growth proportional to the size of the organism, as in shells, pine cones, sunflowers or in galaxies.

SINUSOIDAL SPIRALS. A large variety of curves is described by the sinusoidal spirals $\rho^n = a^n\cos(n\theta)$, $n$ rational. For instance,
o Cayley's sextic $\rho = 4a\cos^3(\theta/3)$, see Figure 7.13, that we can also write in an implicit form as the set of points $(x, y)$ such that $4(x^2 + y^2 - ax)^3 = 27a^2(x^2 + y^2)^2$,
Figure 7.15. (a) Parabolic spiral $a = 1$, $b = 0.7$, (b) Euler's spiral.

Figure 7.16. (a) The conchoid, (b) The conchoid of Nicomedes $a = 4$, $b = 2$, (c) Limaçon of Pascal $a = b = 1$.
o Cardioid $\rho = 2a(1 + \cos\theta)$, see Figure 7.14, that we can write implicitly as the set of points $(x, y)$ such that $(x^2 + y^2 - 2ax)^2 = 4a^2(x^2 + y^2)$,
o Lemniscate of Bernoulli $\rho^2 = a^2\cos(2\theta)$, see Figure 7.14, equivalently the set of points $(x, y)$ such that $(x^2 + y^2)^2 = a^2(x^2 - y^2)$,
o Cubic of de l'Hospital: $\rho\cos^3(\theta/3) = a$, see Figure 7.14.

Other well-known spirals are, see Figure 7.15,

PARABOLIC SPIRALS. $(\rho - a)^2 = b^2\theta$,
EULER'S SPIRAL. 7(t) = {x(t),y(t))
where x{t) = ± J^ ^
dt, y{t) =
±J^^dt,0
CONCHOID OF NICOMEDES. In terms of the angle $\theta$ it is parametrized by
$$x = b + a\cos\theta, \qquad y = (b + a\cos\theta)\tan\theta,$$
obtained by the change of variable $t = b\tan\theta$, see Figure 7.16. We can write it also in polar coordinates as
$$\rho(\theta) = a + \frac{b}{\cos\theta},$$
or as the set of points $(x, y)$ such that
$$(x - b)^2(x^2 + y^2) = a^2x^2.$$

LIMAÇON OF PASCAL. (Étienne Pascal (1588-1651), the father of Blaise Pascal.) It is the conchoid of a circle of radius $a$ with respect to a point $O$ on the circle. If $O$ is the origin and $\rho = 2a\cos\theta$ is the polar equation of the circle of center $(a, 0)$ through $(0, 0)$, the polar equation of the limaçon is $\rho = 2a\cos\theta + b$, see Figure 7.16. Choosing $b = 2a$ the limaçon becomes a cardioid.

CONCHOID OF DÜRER. Let $Q = (q, 0)$ and $R = (0, r)$ be points such that $q + r = b$. The locus of points $P$ and $P'$, on the straight line through $Q$ and $R$, with distance $a$ from $Q$ is Dürer's conchoid (Albrecht Dürer (1471-1528)), see Figure 7.18. Its Cartesian equation may be found by eliminating $q$ and $r$ from the equations
$$b = q + r, \qquad y = -\frac{r}{q}x + r, \qquad (x - q)^2 + y^2 = a^2.$$

c. Cissoids

Given two curves $C_1$ and $C_2$ and a fixed point $O$, we let $Q_1$ and $Q_2$ be the intersections of a line through $O$ with $C_1$ and $C_2$, respectively. The locus of points $P$ on such lines such that $OP = OQ_2 - OQ_1 = Q_1Q_2$ is the cissoid of $C_1$ and $C_2$ with respect to $O$, see (a) Figure 7.17. The cissoid of a circle and a tangent line with respect to the fixed point of the circle that is opposite to the point of tangency is the cissoid of Diocles, introduced by Diocles (240BC-180BC) in his attempts at doubling the cube, see (b) Figure 7.17. If $O$ is the origin, and the circle has equation $(x - a/2)^2 + y^2 = a^2/4$, the intersection points are $C = a(1, \tan\theta)$, $B = a\cos\theta(\cos\theta, \sin\theta)$, hence Diocles's cissoid has the Cartesian equation $y^2(a - x) = x^3$, or, equivalently, polar equation $\rho = a\sin\theta\tan\theta$.
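The equivalence between the polar and the Cartesian equation of Diocles's cissoid can be sanity-checked numerically. The following sketch samples points from the polar equation and evaluates the residual of the Cartesian one; the function names and the sample values are assumptions made for illustration.

```python
import math

def cissoid_point(a, theta):
    """Point of Diocles's cissoid with polar equation rho = a sin(theta) tan(theta)."""
    rho = a * math.sin(theta) * math.tan(theta)
    return rho * math.cos(theta), rho * math.sin(theta)

def cartesian_residual(a, x, y):
    """Residual of y^2 (a - x) = x^3; it vanishes on the cissoid."""
    return y * y * (a - x) - x ** 3

a = 2.0
residuals = [cartesian_residual(a, *cissoid_point(a, th))
             for th in (-1.2, -0.7, -0.1, 0.0, 0.3, 0.9, 1.2)]
```

Indeed $x = a\sin^2\theta$ and $y = a\sin^2\theta\tan\theta$, so both sides of the Cartesian equation equal $a^3\sin^6\theta$.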
Figure 7.17. (a) The cissoid, (b) Cissoid of Diocles, (c) Folium of Descartes.

Figure 7.18. (a) Dürer's conchoid, (b) Oval of Cassini, (c) The devil curve.
d. Algebraic curves

These are loci of zeros of polynomials. The degree of the polynomial may be taken as a measure of complexity: curves that are zeros of second order polynomials are well classified, see Example 3.69. We list here a few more algebraic curves, see Figure 7.19.

WITCH OF AGNESI. It has an equation $y(x^2 + a^2) = a^3$ and it is the trace of the curve $\gamma(t) = (x(t), y(t))$ where $x(t) = at$, $y(t) = a/(1 + t^2)$, $t \in \mathbb{R}$.

STROPHOID OF BARROW. It has an equation $x\,y^2 = (x - a)^2(2a - x)$ and it is the trace of the curve $\gamma(t) = (x(t), y(t))$ where $x(t) = 2a\cos^2 t$, $y(t) = a\tan t\,(1 - 2\cos^2 t)$, $t \in \mathbb{R}$.

EIGHT CURVE or LEMNISCATE OF GERONO. It has an equation $x^4 = a^2(x^2 - y^2)$ and it is the trace of the curve $\gamma(t) = (x(t), y(t))$ where $x(t) = a\cos t$, $y(t) = a\sin t\cos t$, $t \in \mathbb{R}$.

CURVES OF LISSAJOUS. They are the traces of curves $\gamma(t) = (x(t), y(t))$ where $x(t) = a\sin(nt + d)$, $y(t) = b\sin t$, $t \in \mathbb{R}$, in which each coordinate moves as a simple harmonic motion. One shows that such curves are algebraic closed curves iff $n$ is rational.

FOLIUM OF DESCARTES. It has an equation $x^3 + y^3 = 3axy$ and arises as the trace of the curve $\gamma(t) = (x(t), y(t))$ where $x(t) = 3at/(1 + t^3)$, $y(t) = 3at^2/(1 + t^3)$, $t \in \mathbb{R}$, see Figure 7.17.

DEVIL'S CURVE. It has an equation $y^4 - x^4 + ay^2 + bx^2 = 0$, see Figure 7.18.

DOUBLE FOLIUM. It has an equation $(x^2 + y^2)^2 = 4axy^2$, see Figure 7.20.

TRIFOLIUM. It has an equation $(x^2 + y^2)(y^2 + x(x + a)) = 4axy^2$, see Figure 7.20.

OVALS OF CASSINI. They have equation $(x^2 + y^2 + a^2)^2 = b^4 + 4a^2x^2$, see Figure 7.18.

ASTROID. It has an equation $x^{2/3} + y^{2/3} = a^{2/3}$, see Figure 7.20.

e. The cycloid

Nonalgebraic curves are called transcendental. Among them one of the most famous is the cycloid. This is the trajectory described by a fixed point (the tyre valve) of a circle (a tyre) rolling on a line, see Figure 7.21.
Figure 7.19. Some algebraic curves: from the top-left (a) the witch of Agnesi, (b) the strophoid of Barrow, (c) the lemniscate of Gerono, (d) the Lissajous curve for n = 5, d = n/2.
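If the center of the circle is $C = (0, R)$, the radius is $R$, $P = (0, 0)$ and we parametrize the movement with the angle $\theta$ that $CP$ makes with the vertical through $C$, then $P = P(\theta)$, $C = C(\theta)$, the cycloid has period $2\pi$, and we have
$$P(\theta) - C(\theta) = \begin{pmatrix} -R\sin\theta \\ -R\cos\theta \end{pmatrix}.$$
Since the circle rolls, $C(\theta)$ simply translates parallel to the axis by $R\theta$, $C(\theta) = (R\theta, R)$. We then conclude that the cycloid is the trace of the curve $\gamma : \mathbb{R} \to \mathbb{R}^2$ defined by
$$\gamma(\theta) = \begin{pmatrix} R(\theta - \sin\theta) \\ R(1 - \cos\theta) \end{pmatrix}.$$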
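The parametrization above makes the cycloid easy to explore numerically. As an illustration, the sketch below approximates the length of one arch, $\theta \in [0, 2\pi]$, with the formula $L(\gamma) = \int |\gamma'|$; the classical value $8R$ is not stated in the text and is used here only as a check. Function names and the quadrature rule are assumptions.

```python
import math

def cycloid(R, theta):
    """Point of the cycloid generated by a circle of radius R."""
    return (R * (theta - math.sin(theta)), R * (1.0 - math.cos(theta)))

def cycloid_arch_length(R, n=20000):
    """Midpoint-rule value of the integral of |gamma'(theta)| over [0, 2*pi]."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * h
        dx, dy = R * (1.0 - math.cos(th)), R * math.sin(th)  # gamma'(theta)
        total += math.hypot(dx, dy)
    return h * total
```

Here $|\gamma'(\theta)| = 2R\sin(\theta/2)$ on $[0, 2\pi]$, which integrates to $8R$; after one period the point is back on the line at abscissa $2\pi R$.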
Figure 7.20. From the left: (a) the double folium , (b) the trifolium, (c) the astroid.
Figure 7.21. The cycloid.
The cycloid solves at least two important and celebrated problems. As we know, the pendulum is not isochronous, but it is approximately isochronous for small oscillations, see Section 6.3.1 of [GM1]. Christiaan Huygens (1629-1695) found that the isochronal curve is the cycloid. Johann Bernoulli (1667-1748) showed that the cycloid is the curve of quickest descent, that is, the curve connecting two points on a vertical plane on which a movable point descends under the influence of gravitation in the quickest possible way. Other curves of the same nature as the cycloids are the epicycloids and the hypocycloids, see Figure 7.22. These are obtained from a circle that rolls around the outside or the inside of another circle (or another curve).

f. The catenary

Another celebrated transcendental curve is the catenary. It describes the form assumed by a perfectly flexible inextensible chain of uniform density hanging from two supports, already discussed by Galileo Galilei (1564-1642). Answering a challenge of Jacob Bernoulli (1654-1705), it was proved
Figure 7.22. (a) The epicycloid $x(t) = 9\cos t - \cos(9t)$, $y(t) = 9\sin t - \sin(9t)$, (b) the hypocycloid $x(t) = 8\cos t + 2\cos(4t)$, $y(t) = 8\sin t - 2\sin(4t)$, (c) a catenary.
Figure 7.23. The pendulum clock from the Horologium Oscillatorium of Christiaan Huygens (1629-1695).
by Gottfried von Leibniz (1646-1716) and Christiaan Huygens (1629-1695) that the equation of the catenary hung at the same height at both sides is
$$y = \frac{a}{2}\left(e^{x/a} + e^{-x/a}\right) = a\cosh\frac{x}{a}.$$
7.2 Curves in Metric Spaces
Of course we may also consider curves in a general metric space $X$, as continuity is the only requirement. Let us start by introducing the notion of total variation, a notion essentially due to Camille Jordan (1838-1922).
a. Functions of bounded variation and rectifiable curves Let $X$ be a metric space and $f : [a,b] \subset \mathbb{R} \to X$ be any map. Denote by $\mathcal{S}$ the family of finite partitions $\sigma = \{t_0, \dots, t_N\}$ with $a = t_0 < t_1 < \dots < t_N = b$ of the interval $[a,b]$ and, in correspondence to each partition $\sigma$, set $|\sigma| := \max_{i=1,\dots,N} |t_i - t_{i-1}|$ and
$$V_\sigma(f) := \sum_{i=0}^{N-1} d(f(t_i), f(t_{i+1})),$$
that we have denoted by $P(\sigma)$ in the case of curves into $\mathbb{R}^n$.
Figure 7.24. Blaise Pascal (1623-1662) and the frontispiece of his Lettres de Dettonville about the Roulettes.
7.38 Definition. The total variation of a map $f : [a,b] \to X$ is the number (possibly $+\infty$)
$$V(f,[a,b]) := \sup_{\sigma \in \mathcal{S}} V_\sigma(f).$$
We say that $f$ has bounded total variation if $V(f,[a,b]) < \infty$. When the curve $f : [a,b] \to X$ is continuous, $V(f,[a,b])$ is called the length of $f$, and curves with bounded total variation, that is with finite length, are called rectifiable.
Either directly or repeating the arguments used in studying the length of curves into $\mathbb{R}^n$, it is easy to show the following.
7.39 Proposition. We have
(i) if $[a,b] \subset [c,d]$, then $V(f,[a,b]) \le V(f,[c,d])$,
(ii) $V(f,[a,b]) \ge d(f(a),f(b))$ and, if $f$ is real-valued and increasing, then $V(f,[a,b]) = f(b) - f(a)$,
(iii) every Lipschitz-continuous function $f : [a,b] \to X$ has bounded total variation and $V(f,[a,b]) \le \mathrm{Lip}(f)\,(b-a)$,
(iv) the total variation is a subadditive set-function, meaning $V(f,[a,b]) \le V(f,[a,c]) + V(f,[c,b])$ if $a < c < b$; moreover, if $f$ is continuous at $c$, then $V(f,[a,b]) = V(f,[a,c]) + V(f,[c,b])$,
(v) $V(f,[a,b]) = \lim_{|\sigma| \to 0^+} V_\sigma(f)$.
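The sums $V_\sigma(f)$ are easy to experiment with. The sketch below is illustrative only (the helper names `variation_on_partition` and `uniform_partition` are assumptions, not notation from the text). It evaluates $V_\sigma(f)$ for $f = \sin$ on $[0, 2\pi]$ on a coarse partition and on a much finer one, so that the lower bound of (ii) and the Lipschitz bound of (iii) can be observed.

```python
import math

def variation_on_partition(f, partition, dist=lambda p, q: abs(p - q)):
    """V_sigma(f): sum of d(f(t_i), f(t_{i+1})) over consecutive points of the partition."""
    values = [f(t) for t in partition]
    return sum(dist(u, v) for u, v in zip(values, values[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n intervals of equal length."""
    return [a + (b - a) * i / n for i in range(n + 1)]

a, b = 0.0, 2.0 * math.pi
coarse = variation_on_partition(math.sin, uniform_partition(a, b, 3))
fine = variation_on_partition(math.sin, uniform_partition(a, b, 3000))
print(coarse, fine)  # the fine value is close to the total variation of sin on [0, 2*pi], which is 4
```

Passing a different `dist` (for instance `math.dist` on tuples) gives the same sums for maps with values in a Euclidean space.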
Figure 7.25. Frontispieces of two editions of 1532 and of 1606 of Institutionum geometricorum by Albrecht Dürer (1471-1528).
7.40 ¶. Let $f : [a,b] \to X \times X$ where $X$ is a metric space. Show that $f$ has bounded variation if and only if the two components of $f = (f_1, f_2)$, $f_{1,2} : [a,b] \to X$, have bounded variation.
We say that two curves $\varphi : [a,b] \to X$ and $\psi : [c,d] \to X$ into $X$ are equivalent if there exists a homeomorphism $\sigma : [c,d] \to [a,b]$ such that $\psi(s) = \varphi(\sigma(s))$ $\forall s \in [c,d]$. From the definitions we have the following.
7.41 Proposition. Two equivalent curves have the same total variation.
From (iv) and (v) of Proposition 7.39 we also have the following.
7.42 Proposition. Let $\varphi : [a,b] \to X$ be a rectifiable (continuous) curve. Then the real-valued function $t \to V(\varphi,[a,t])$, $t \in [a,b]$, is continuous and increasing.
7.43 ¶. Prove the claims in Propositions 7.39, 7.41 and 7.42.
b. Lipschitz and intrinsic reparametrizations We saw that every regular Euclidean curve may be reparametrized with velocity one. For curves in an arbitrary metric space we have
7.44 Theorem. Let $\gamma : [a,b] \to X$ be a simple rectifiable curve on a metric space $X$ of length $L$. Then there exists a homeomorphism $\sigma : [0,L] \to [a,b]$ such that $\gamma \circ \sigma : [0,L] \to X$ is Lipschitz continuous with Lipschitz constant one.
Figure 7.26. The sets Ek of the middle third Cantor set.
We call that parametrization of the trace of $\gamma$ the intrinsic parametrization of $\gamma([a,b])$.
Proof. Let $x \in [a,b]$ and $V(x) := V(\gamma,[a,x])$. We have $L = V(\gamma,[a,b])$ and, on account of Proposition 7.42, $V(x)$ is continuous and increasing. Since $\gamma$ is simple, $V(x)$ is strictly increasing, hence a homeomorphism between $[a,b]$ and $[0,L]$. Set $\sigma := V^{-1}$. We then infer for $0 \le x < y \le L$
$$d(\gamma(\sigma(x)), \gamma(\sigma(y))) \le V(\gamma,[\sigma(x),\sigma(y)]) = V(\sigma(y)) - V(\sigma(x)) = y - x,$$
i.e., $\gamma \circ \sigma$ is Lipschitz continuous with Lipschitz constant one. □
7.2.1 Real functions with bounded variation
It is worth adding a few more comments about the class of real-valued functions $f : [a,b] \to \mathbb{R}$ with bounded variation, denoted by $BV([a,b])$.
7.45 Theorem. We have
(i) $BV([a,b])$ is a linear space and $\|f\| := |f(a)| + V(f,[a,b])$ is a norm on it,
(ii) $BV([a,b])$ contains the convex cone of increasing functions,
(iii) every $f \in BV([a,b])$ is the difference of two increasing functions.
Proof. We leave to the reader the task of proving (i) and (ii) and we prove (iii). For $f \in BV([a,b])$ and $t \in [a,b]$ set
$$\varphi(t) := V(f,[a,t]), \qquad \psi(t) := \varphi(t) + f(t), \qquad t \in [a,b].$$
For $x, y \in [a,b]$, $x < y$, we have
$$\psi(y) - \psi(x) = [\varphi(y) - \varphi(x)] + [f(y) - f(x)];$$
now the additivity of the total variation on adjacent intervals yields
$$\varphi(y) - \varphi(x) = V(f,[x,y]) \ge |f(y) - f(x)|,$$
in particular $\psi(y) - \psi(x) \ge 0$. Therefore $\varphi$ and $\psi$ are both increasing with bounded total variation, and $f(t) = \psi(t) - \varphi(t)$ $\forall t$. □
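The decomposition used in the proof has an obvious discrete analogue, sketched below under stated assumptions: the function is only known through samples on a grid, and the variation along the grid replaces $V(f,[a,t])$. The name `jordan_decomposition` and the test function $\cos(10t)$ are illustrative choices, not taken from the text.

```python
import math

def jordan_decomposition(values):
    """Given samples f(t_0), ..., f(t_N), return lists (phi, psi) where
    phi[i] is the variation of the samples up to index i and psi = phi + f,
    so that f = psi - phi with phi and psi nondecreasing along the grid."""
    phi = [0.0]
    for u, v in zip(values, values[1:]):
        phi.append(phi[-1] + abs(v - u))
    psi = [p + u for p, u in zip(phi, values)]
    return phi, psi

# hypothetical example: f(t) = cos(10 t) sampled on [0, 1]
samples = [math.cos(10.0 * i / 500) for i in range(501)]
phi, psi = jordan_decomposition(samples)
print(phi[-1])  # variation of f along the grid, a lower bound for V(f, [0, 1])
```

Each increment of `psi` equals $|f(t_{i+1}) - f(t_i)| + (f(t_{i+1}) - f(t_i)) \ge 0$, which mirrors the inequality $\psi(y) - \psi(x) \ge 0$ of the proof.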
A surprising consequence is the following. 7.46 Corollary. Every function in BV{[a,b]) has left- and right-limits at every point of [a,b].
Figure 7.27. An approximate Cantor-Vitali function.
If we reread the proof of (iii) of Theorem 7.45, on account of Proposition 7.42 we infer
7.47 Proposition. Every continuous function $f : [a,b] \to \mathbb{R}$ with finite total variation is the difference of two continuous increasing functions.
a. The Cantor-Vitali function The Cantor ternary set is defined, see [GM2], as $C = \cap_k E_k$ where $E_0 := [0,1]$, $E_1$ is obtained from $E_0$ by removing the open middle third of $E_0$, and $E_{k+1}$ by removing from each interval of $E_k$ its open middle third. Define for $k = 0, 1, \dots$ and $j = 1, \dots, 2^k$, the base points
$$b_{0,1} = 0, \qquad b_{k+1,j} := \begin{cases} \frac13 b_{k,j} & \text{if } j = 1, \dots, 2^k, \\ \frac23 + \frac13 b_{k,j-2^k} & \text{if } j = 2^k + 1, \dots, 2^{k+1}, \end{cases}$$
then the intervals that have been removed from $E_{k-1}$ to get $E_k$ at step $k$ are
$$I_{k-1,j} := b_{k-1,j} + 3^{-k+1}\,\left]\tfrac13, \tfrac23\right[, \qquad j = 1, \dots, 2^{k-1},$$
and the intervals whose union is $E_k$ are
$$J_{k,j} := b_{k,j} + 3^{-k}[0,1], \qquad j = 1, \dots, 2^k.$$
Therefore
$$C = \bigcap_{k=0}^{\infty} \bigcup_{j=1}^{2^k} J_{k,j}.$$
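Strongly related to Cantor's set is the Cantor-Vitali function introduced by Giuseppe Vitali (1875-1932). To define it, we first consider the approximate Cantor-Vitali functions $V_k : [0,1] \to \mathbb{R}$ defined inductively by
$$V_0(x) := x, \qquad V_{k+1}(x) := \begin{cases} \frac12 V_k(3x) & \text{if } x \in [0,1/3], \\ \frac12 & \text{if } x \in [1/3,2/3], \\ \frac12 + \frac12 V_k(3(x - 2/3)) & \text{if } x \in [2/3,1], \end{cases}$$
see Figure 7.27. One easily checks that for $k = 0, 1, \dots$
(i) We have $V_k(0) = 0$, $V_k(1) = 1$,
$$V_k(b_{k,j}) = \frac{j-1}{2^k}, \qquad V_k(b_{k,j} + 3^{-k}) = \frac{j}{2^k},$$
and
$$V_k(x) = \frac{2j-1}{2^{m+1}} \qquad \text{if } x \in I_{m,j},\ m = 0, \dots, k-1,\ j = 1, \dots, 2^m.$$
(ii) We have
$$V_k(x) = \left(\frac32\right)^k \int_0^x \chi_{E_k}(t)\,dt,$$
where $\chi_{E_k}$ is the characteristic function of the set $E_k$ that we used to define the Cantor ternary set.
(iii) We have
$$|V_k(x) - V_k(y)| \le |x-y|^\alpha \qquad \forall x, y \in [0,1],$$
where $\alpha = \log 2/\log 3$, in particular the $V_k$'s are equi-Hölder. In fact, by symmetry it suffices to prove the claim for $x, y \in [0,3^{-k}]$ where $V_k$ is linear with slope $(3/2)^k$. For $0 \le x < y \le 3^{-k}$
$$|V_k(x) - V_k(y)| = \left(\frac32\right)^k |x-y| = \left(\frac32\right)^k |x-y|^{1-\alpha}|x-y|^\alpha \le \left(\frac32\right)^k 3^{-k(1-\alpha)} |x-y|^\alpha = |x-y|^\alpha$$
as $2 \cdot 3^{-\alpha} = 1$.
(iv) We have
$$|V_{k+1}(x) - V_k(x)| \le 2^{-k+1} \qquad \forall x \in [0,1].$$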
In particular (iv) implies that the sequence $\{V_k\}$ converges uniformly to a function $V(x)$, which is by (iii) Hölder-continuous with exponent $\alpha = \log 2/\log 3$. The function $V$ is called the Cantor-Vitali function and satisfies the following properties:
o $V$ is nondecreasing, hence it has bounded total variation,
o in each interval of $[0,1] \setminus E_k$, $V(x) = V_k(x)$ is constant, in particular $V$ is differentiable outside the Cantor set with $V'(x) = 0$ $\forall x \in [0,1] \setminus C$,
o $V([0,1]) = [0,1]$, and $V$ maps $[0,1] \setminus C$ into the denumerable set $D := \{ y \in \mathbb{R} \mid y = j/2^k,\ j = 0, 1, \dots, 2^k,\ k \in \mathbb{N} \}$, hence $V(C)$ contains $[0,1] \setminus D$.
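The inductive definition of the approximate Cantor-Vitali functions translates directly into a recursion. The following sketch is an illustration only: the name `cantor_vitali_approx`, the use of exact rational arithmetic and the grid of multiples of $3^{-5}$ are assumptions made for the example. It evaluates $V_k$ and measures the distance between consecutive approximations on the grid, which can be compared with the estimate in (iv).

```python
from fractions import Fraction

def cantor_vitali_approx(k, x):
    """V_k(x), computed from V_0(x) = x and the three-branch recursion for V_{k+1}."""
    x = Fraction(x)
    if k == 0:
        return x
    if x <= Fraction(1, 3):
        return cantor_vitali_approx(k - 1, 3 * x) / 2
    if x <= Fraction(2, 3):
        return Fraction(1, 2)
    return Fraction(1, 2) + cantor_vitali_approx(k - 1, 3 * x - 2) / 2

# grid of multiples of 3^-5, fine enough to contain the endpoints of E_1, ..., E_5
grid = [Fraction(i, 243) for i in range(244)]
sup_gaps = [
    max(abs(cantor_vitali_approx(k + 1, x) - cantor_vitali_approx(k, x)) for x in grid)
    for k in range(4)
]
print(sup_gaps)  # each entry can be compared with the bound 2^(-k+1) of (iv)
```

Exact fractions avoid rounding at the break points $1/3$ and $2/3$; with floating-point inputs the branch chosen at those points could depend on rounding.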
7.48 Homeomorphisms do not preserve fractal dimensions. The function $\varphi(x) := x + V(x)$, $\varphi : [0,1] \to [0,2]$, is continuous and strictly increasing, hence a homeomorphism between $[0,1]$ and $[0,2]$. In Theorem 8.109 we shall see that the algebraic dimension of $\mathbb{R}^n$ is a topological invariant, that is, $\mathbb{R}^n$ and $\mathbb{R}^m$ are homeomorphic if and only if $n = m$. This is not true in general for the fractal dimension, see Chapter 8 of [GM2]. In fact, $\varphi$ maps the complement of Cantor's set in $[0,1]$ into a countable union of intervals of total measure 1, $\mathcal{H}^1(\varphi([0,1] \setminus C)) = 1$, hence $\mathcal{H}^1(\varphi(C)) = 1$ and $\dim_{\mathcal{H}}(\varphi(C)) = 1$, while $\dim_{\mathcal{H}}(C) = \log 2/\log 3$.
7.49 ¶. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a Lipschitz-continuous map with Lipschitz inverse. Show that $f$ preserves the fractal dimension, $\dim_{\mathcal{H}}(f(A)) = \dim_{\mathcal{H}} A$. [Hint: Recall that $\mathcal{H}^k(f(A)) \le \mathrm{Lip}(f)^k\, \mathcal{H}^k(A)$, see Section 8.2.4 of [GM2].]
7.3 Exercises
7.50 ¶. We invite the reader to study some of the curves described in this chapter, to check that the figures are quite reasonable, and to compute the lengths of some of those curves and, when possible, the enclosed areas.
7.51 ¶. Compute the total variation of the following functions $f : [0,2] \to \mathbb{R}$:
$$\begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}$$
7.52 ¶. Let $g(x) = \sqrt{x}$, $x \in [0,1]$, and let $f : [0,1] \to \mathbb{R}$ be given by
I0
otherwise.
Show that $f$, $g$ and $g \circ f$ have bounded total variation.
7.53 ¶. Let $f, g \in BV([0,1])$. Show that $\min(f,g)$, $\max(f,g)$, $|f| \in BV([0,1])$.
7.54 ^ . Show that the Cantor middle third set C is compact and perfect, i.e., int (C) =
8. Some Topics from the Topology of $\mathbb{R}^n$
As we have already stated, topology is the study of the properties shared by a geometric figure and all its bi-continuous transformations, i.e., the study of invariants by homeomorphisms. Its origin dates back to the problem of the Königsberg bridges and Euler's theorem about polyhedra, to Riemann's work on the geometric representation of functions, to Betti's work on the notion of multiconnectivity and, most of all, to the work of J. Henri Poincaré (1854-1912). Starting from his research on differential equations in mechanics, Poincaré introduced relevant topological notions and, in particular, the idea of associating to a geometric figure (using a rule that is common to all figures) an algebraic object, such as a group, that is a topological invariant for the figure and that one could compute. The fundamental group and homology groups are two important examples of algebraic objects introduced by Poincaré: this is the beginning of combinatorial or algebraic topology. With the development of what we call today general topology due to, among others, René-Louis Baire (1874-1932), Maurice Fréchet (1878-1973), Frigyes Riesz (1880-1956), Felix Hausdorff (1869-1942), Kazimierz Kuratowski (1896-1980), and the interaction between general and algebraic topology due to L. E. Brouwer (1881-1966), James Alexander (1888-1971), Solomon Lefschetz (1884-1972), Pavel Alexandroff (1896-1982), Pavel Urysohn (1898-1924), Heinz Hopf (1894-1971), L. Agranovich Lyusternik (1899-1981), Lev G. Schnirelmann (1905-1938), Harald Marston Morse (1892-1977), Eduard Čech (1893-1960), the study of topology in a wide sense is consolidated and in fact receives new incentives thanks to the work of Jean Leray (1906-1998), Élie Cartan (1869-1951), Georges de Rham (1903-1990).
Clearly, even a short introduction to these topics would divert us from our course; therefore we shall confine ourselves to illustrating some fundamental notions and basic results related to the topology of $\mathbb{R}^n$, to the notion of dimension and, most of all, to the existence of fixed points.
Figure 8.1. A homotopy.
8.1 Homotopy In this section we shall briefly discuss the different flavors of the notion of homotopy. They correspond to the intuitive idea of continuous deformation of one object into another.
8.1.1 Homotopy of maps and sets
a. Homotopy of maps In the following, the ambient spaces $X$, $Y$, $Z$ will be metric spaces.
8.1 Definition. Two continuous maps $f, g : X \to Y$ are called homotopic if there exists a continuous map $H : [0,1] \times X \to Y$ such that $H(0,x) = f(x)$, $H(1,x) = g(x)$ $\forall x \in X$. In this case we say that $H$ establishes or is a homotopy of $f$ to $g$.
It is easy to show that the homotopy relation $f \sim g$ on maps from $X$ to $Y$ is an equivalence relation, i.e., it is
(i) (REFLEXIVE) $f \sim f$,
(ii) (SYMMETRIC) $f \sim g$ iff $g \sim f$,
(iii) (TRANSITIVE) if $f \sim g$ and $g \sim h$, then $f \sim h$.
Therefore $C^0(X,Y)$ can be partitioned into classes of homotopic functions. It is worth noticing that, since
$$C^0([0,1], C^0(X,Y)) = C^0([0,1] \times X, Y), \qquad (8.1)$$
we have the following.
8.2 Proposition. $f$ and $g \in C^0(X,Y)$ are homotopic if and only if they belong to the same path-connected component of $C^0(X,Y)$ endowed with uniform distance. The subsets of $C^0(X,Y)$ of mutually homotopic maps are the path-connected components of the metric space $C^0(X,Y)$ with uniform distance.
Figure 8.2. Frontispieces of the introduction to combinatorial topology by Kurt Reidemeister (1893-1971) and Pavel Alexandroff (1896-1982) in its Italian translation.
8.3 ¶. Let $X$, $Y$ be metric spaces. Show the equality (8.1), which we understand as an isometry of metric spaces.
8.4 ¶. Let $Y$ be a convex subset of a normed linear space. Then every continuous map $f : X \to Y$ from an arbitrary metric space $X$ is homotopic to a constant. In particular, constant maps are homotopic to each other. [Hint: Fix $y_0 \in Y$ and consider the homotopy $H : [0,1] \times X \to Y$ given by $H(t,x) := t y_0 + (1-t) f(x)$.]
8.5 ¶. Let $X$ be a convex set of a normed linear space. Then every continuous map $f : X \to Y$ into an arbitrary metric space is homotopic to a constant function. [Hint: Fix $x_0 \in X$ and consider the homotopy $H : [0,1] \times X \to Y$ given by $H(t,x) := f(t x_0 + (1-t)x)$.]
8.6 ¶. Two constant maps are homotopic iff their values can be connected by a path.
8.7 ¶. Let $X$ be a linear normed space. Show that the homotopy classes of maps $f : X \to Y$ correspond to the path-connected components of $Y$.
According to Exercises 8.4, 8.5 and 8.6, all maps into $\mathbb{R}^n$ or defined on $\mathbb{R}^n$ are homotopic to constant maps. However, this is not always the case for maps from or into $S^n := \{x \mid \|x\| = 1\}$, the unit sphere of $\mathbb{R}^{n+1}$.
8.8 Proposition. We have
(i) Let $f, g : X \to S^n$ be two continuous maps such that $f(x)$ and $g(x)$ are never antipodal, i.e., $g(x) \ne -f(x)$ $\forall x \in X$; then $f$ and $g$ are homotopic. In particular, if $f : X \to S^n$ is not onto, then $f$ is homotopic to a constant.
Figure 8.3. The figure suggests a homotopy of closed curves, that is a continuous family of closed paths, from a knotted loop to $S^1$. But it can be proved that there is no family of homeomorphisms of the ambient space $\mathbb{R}^3$ that, starting from the identity, deforms the initial knotted loop into $S^1$.
(ii) Let $B^{n+1} := \{x \in \mathbb{R}^{n+1} \mid |x| \le 1\}$. A continuous map $f : S^n \to Y$ is homotopic to a constant if and only if $f$ has a continuous extension $F : B^{n+1} \to Y$.
Proof. (i) Since $f(x)$ and $g(x)$ are never antipodal, the segment $t g(x) + (1-t) f(x)$, $t \in [0,1]$, never goes through the origin; a homotopy of $f$ to $g$ is then
$$H(t,x) := \frac{t g(x) + (1-t) f(x)}{|t g(x) + (1-t) f(x)|}.$$
Notice that $y \to y/|y|$ is the radial projection from $\mathbb{R}^{n+1} \setminus \{0\}$ onto the sphere $S^n$, hence $H(t,x)$ is the radial projection onto the sphere of the segment $t g(x) + (1-t) f(x)$, $t \in [0,1]$. The second part of the claim follows by choosing $y_0 \in S^n \setminus f(X)$ and $g(x) := -y_0$.
(ii) If $F : B^{n+1} \to Y$ is a continuous function such that $F(x) = f(x)$ $\forall x \in S^n$, then the map $H(t,x) := F(tx)$, $(t,x) \in [0,1] \times S^n$, is continuous, hence a homotopy of $H(0,x) = F(0)$ to $H(1,x) = f(x)$. Conversely, if $H : [0,1] \times S^n \to Y$ is a homotopy of a constant map $g(x) = p \in Y$ to $f$, $H(0,x) = p$, $H(1,x) = f(x)$ $\forall x \in S^n$, then the map $F : B^{n+1} \to Y$ defined by
$$F(x) := \begin{cases} H\left(|x|, \frac{x}{|x|}\right) & \text{if } x \ne 0, \\ p & \text{if } x = 0 \end{cases}$$
is a continuous extension of $f$ to $B^{n+1}$ with values into $Y$. □
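The homotopy used in part (i) of the proof, the radial projection of the segment joining $f(x)$ to $g(x)$, can be written down directly. The sketch below is only an illustration under stated assumptions: points of the sphere are represented as tuples of floats, and the names `normalize` and `sphere_homotopy`, as well as the two sample maps, are hypothetical.

```python
import math

def normalize(v):
    """Radial projection y -> y/|y| of a nonzero vector onto the unit sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("the segment passes through the origin (antipodal values)")
    return tuple(c / norm for c in v)

def sphere_homotopy(f, g):
    """Return H with H(t, x) = radial projection of t g(x) + (1 - t) f(x)."""
    def H(t, x):
        fx, gx = f(x), g(x)
        return normalize(tuple(t * b + (1.0 - t) * a for a, b in zip(fx, gx)))
    return H

# two hypothetical maps from R into S^1 whose values are never antipodal
f = lambda x: (math.cos(x), math.sin(x))
g = lambda x: (math.cos(x + 1.0), math.sin(x + 1.0))
H = sphere_homotopy(f, g)
print(H(0.0, 0.3), H(0.5, 0.3), H(1.0, 0.3))
```

If $f(x)$ and $g(x)$ were antipodal for some $x$, the segment would pass through the origin at $t = 1/2$ and `normalize` would raise an error, which is exactly the case excluded by the hypothesis.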
b. Homotopy classes Denote by $[X,Y]$ the set of homotopy classes of continuous maps $f : X \to Y$ and by $[f] \in [X,Y]$ the equivalence class of $f$. The following two propositions collect some elementary facts.
8.9 Proposition. We have
(i) (COMPOSITION) Let $f, f' : X \to Y$, $g, g' : Y \to Z$ be continuous maps. If $f \sim f'$ and $g \sim g'$, then $g \circ f \sim g' \circ f'$.
(ii) (RESTRICTION) If $f, g : X \to Y$ are homotopic and $A \subset X$, then $f_{|A}$ is homotopic to $g_{|A}$ as maps from $A$ to $Y$.
(iii) (CARTESIAN PRODUCT) $f, g : X \to Y_1 \times Y_2$ are homotopic if and only if $\pi_i \circ f$ and $\pi_i \circ g$ are homotopic (with values in $Y_i$) for $i = 1, 2$, where $\pi_i$, $i = 1, 2$, denote the projections on the factors.
A trivial consequence of Proposition 8.9 is that the set $[X,Y]$ is a topological invariant of both $X$ and $Y$. In a sense $[X,Y]$ gives the number of "different" ways that $X$ can be mapped into $Y$, hence measures the "topological complexity" of $Y$ relative to that of $X$. Let $\varphi : X \to Y$ be a continuous map and let $Z$ be a metric space. Then $\varphi$ defines a pull-back map
$$\varphi^\# : [Y,Z] \to [X,Z]$$
defined by $\varphi^\#[f] := [f \circ \varphi]$, as Proposition 8.9 yields that the homotopy class of $f \circ \varphi$ depends only on the homotopy class of $f$. Similarly $\varphi$ induces a push-forward map
$$\varphi_\# : [Z,X] \to [Z,Y]$$
defined by $\varphi_\#[g] := [\varphi \circ g]$.
8.10 Proposition. We have the following.
(i) Let $\varphi, \psi : X \to Y$ be continuous and homotopic, $\varphi \sim \psi$. Then $\varphi^\# = \psi^\#$ and $\varphi_\# = \psi_\#$.
(ii) Let $\varphi : X \to Y$ and $\eta : Y \to Z$ be continuous. Then
$$(\eta \circ \varphi)^\# = \varphi^\# \circ \eta^\# \qquad \text{and} \qquad (\eta \circ \varphi)_\# = \eta_\# \circ \varphi_\#.$$
c. Homotopy equivalence of sets
8.11 Definition. Two metric spaces $X$ and $Y$ are said to be homotopy equivalent, or to have the same homotopy type, if there exist two continuous maps $f : X \to Y$ and $g : Y \to X$ such that $g \circ f \sim \mathrm{Id}_X$ and $f \circ g \sim \mathrm{Id}_Y$.
If $f : X \to Y$ and $g : Y \to X$ define a homotopy equivalence between $X$ and $Y$, then for every space $Z$ we infer from Proposition 8.10
$$g^\# \circ f^\# = \mathrm{Id}_{[Y,Z]}, \qquad f^\# \circ g^\# = \mathrm{Id}_{[X,Z]}.$$
Similarly
$$g_\# \circ f_\# = \mathrm{Id}_{[Z,X]}, \qquad f_\# \circ g_\# = \mathrm{Id}_{[Z,Y]},$$
hence $[Z,X]$ and $[Z,Y]$ are in a one-to-one correspondence.
8.12 Definition. A space $X$ is called contractible if it is homotopy equivalent to a space with only one point, equivalently, if the identity map $i : X \to X$ of $X$ is homotopic to a constant map.
By definition, if $X$ is contractible to $x_0 \in X$, then $X$ is homotopy equivalent to $\{x_0\}$, hence $[Z,X]$ reduces to a point for any space $Z$, and so does $[X,Z]$ whenever $Z$ is path-connected.
Figure 8.4. $\mathbb{R}^n$ is contractible.
8.13 Example. $\mathbb{R}^n$ is contractible. In fact, $H(t,x) := (1-t)x$, $(t,x) \in [0,1] \times \mathbb{R}^n$, contracts $\mathbb{R}^n$ to the origin.
In general, describing the set $[X,Y]$ is a very difficult task even for the simplest case of the homotopy of spheres, $[S^k, S^n]$, $k, n \ge 1$. However, the following may be useful.
8.14 Definition. Let $X$ be a metric space. We say that $A \subset X$ is a retract of $X$ if there exists a continuous map $\rho : X \to A$, called a retraction, such that $\rho(x) = x$ $\forall x \in A$. Equivalently, $A$ is a retract of $X$ if the identity map $\mathrm{Id}_A : A \to A$ extends to a continuous map $r : X \to A$. We say that $A \subset X$ is a deformation retract of $X$ if $A$ is a retract of $X$ and the identity map $\mathrm{Id}_X : X \to X$ is homotopic to a retraction of $X$ to $A$.
Let $A \subset X$ be a deformation retract of $X$ and denote by $i_A : A \to X$ the inclusion map. Since $\mathrm{Id}_X : X \to X$ is homotopic to the retraction map $r : X \to A$, we have
$$r \circ i_A = \mathrm{Id}_A, \qquad i_A \circ r = r \sim \mathrm{Id}_X,$$
hence $A$ and $X$ are homotopy equivalent. By the above, for every space $Z$ we have $[A,Z] = [X,Z]$ and $[Z,A] = [Z,X]$ as sets, thus reducing the computation of $[Z,X]$ and of $[X,Z]$ respectively to that of $[Z,A]$ and $[A,Z]$. The following observation is useful.
8.15 Proposition. Let $A \subset X$ be a subset of a metric space $X$. Then $A$ is a deformation retract of $X$ if and only if $A$ is a retract of $X$ and $\mathrm{Id}_X : X \to X$ is homotopic to a continuous map $g : X \to A$.
Figure 8.5. S^ is a deformation retract of the torus T C R^.
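Figure 8.6. $S^n$ is a deformation retract of $B^{n+1} \setminus \{0\}$.
Proof. It is enough to prove sufficiency. Let $r : X \to A$ be a retraction and let $h : [0,1] \times X \to X$ be a homotopy of $\mathrm{Id}_X$ to $g$, $h(0,x) = x$, $h(1,x) = g(x)$ $\forall x \in X$. Then the map
$$k(t,x) := \begin{cases} h(2t,x) & \text{if } 0 \le t \le \frac12, \\ r(h(2-2t,x)) & \text{if } \frac12 < t \le 1 \end{cases}$$
is continuous, since $h(1,x) = g(x) \in A$ and therefore $h(1,x) = r(h(1,x))$ $\forall x$, and shows that $\mathrm{Id}_X$ is homotopic to $r : X \to A$. □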
8.16 ¶. Show that every point of a space $X$ is a retract of $X$.
8.17 ¶. Show that $\{0,1\} \subset \mathbb{R}$ is not a retract of $\mathbb{R}$.
8.18 ¶. Show that a retract $A \subset X$ of a space $X$ is a closed set.
8.19 ¶. The possibility of retracting $X$ onto $A$ is related to the possibility of extending continuous maps on $A$ to continuous maps on $X$. Show
Proposition. $A \subset X$ is a retract of $X$ if and only if for any topological space $Z$ any continuous map $f : A \to Z$ extends to a continuous map $F : X \to Z$.
8.20 ¶. Show that $S^n$ is a deformation retract of $B^{n+1} \setminus \{0\}$, see Figure 8.6.
8.21 ¶. With reference to Figure 8.8, show that $M \setminus \partial M$ is not a retract of $M$, but $M$ and $M \setminus \partial M$ are homotopy equivalent since they have a deformation retract in common.
Figure 8.7. The first two figures are homotopy equivalent since they are both deformation retracts of the third figure.
Figure 8.8. $M \setminus \partial M$ is not a retract of $M$, but $M$ and $M \setminus \partial M$ are homotopy equivalent.
d. Relative homotopy Intuitively, see Figure 8.1, the maps $H_t : X \to Y$, $t \in [0,1]$, defined by $H_t(x) := H(t,x)$, are a continuous family of continuous maps that deform $f$ to $g$. In particular, it is important to note that, in considering homotopy of maps, the target space is relevant and must be kept fixed in the discussion. As we shall see in the sequel, maps with values in $Y$ that are nonhomotopic may become homotopic when seen as maps with values in $Z \supset Y$. Also, it is worth considering homotopies of a suitable restricted type. For instance, when working with paths with fixed endpoints, it is better to consider homotopies such that all curves $x \to H_t(x) := H(t,x)$, $x \in [0,1]$, have the same fixed endpoints for all $t \in [0,1]$. Similarly, when working with closed curves, it is worthwhile considering homotopies $H(t,x)$ such that every curve $x \to H_t(x) := H(t,x)$ is closed for all $t \in [0,1]$.
8.22 Definition. Let $\mathcal{C} \subset C^0(X,Y)$. We say that $f, g \in \mathcal{C}$ are homotopic relative to $\mathcal{C}$ if there exists a continuous map $H : [0,1] \times X \to Y$ such that $H(0,x) = f(x)$, $H(1,x) = g(x)$ and the maps $x \to H_t(x) := H(t,x)$ belong to $\mathcal{C}$ for all $t \in [0,1]$.
It is easy to check that the relative homotopy is an equivalence relation. The set of relative homotopy classes with respect to $\mathcal{C} \subset C^0(X,Y)$ is denoted by $[X,Y]_{\mathcal{C}}$. Some choices of the subset $\mathcal{C} \subset C^0(X,Y)$ are relevant.
(i) Let $Z \subset Y$ and $\mathcal{C} := \{f \in C^0(X,Y) \mid f(X) \subset Z\}$. In this case a homotopy relative to $\mathcal{C}$ is a homotopy of maps with values in $Z$.
(ii) Let $X = [0,1]$, $a, b \in Y$ and $\mathcal{C} := \{f \in C^0(X,Y) \mid f(0) = a,\ f(1) = b\}$. Then a homotopy relative to $\mathcal{C}$ is called a homotopy with fixed endpoints.
(iii) Let $X = [0,1]$, and let $\mathcal{C} := \{f \in C^0([0,1],Y) \mid f(0) = f(1)\}$ be the class of closed curves, or loops, in $Y$. In this case two curves homotopic relative to $\mathcal{C}$ are said to be loop-homotopic.
Recall that a closed curve $\gamma : [0,1] \to X$ can be reparametrized as a continuous map $\delta : S^1 \to X$ from the unit circle $S^1 \subset \mathbb{C}$. Now let $\gamma_1, \gamma_2 : [0,1] \to X$ be two loops and let $\delta_1, \delta_2 : S^1 \to X$ be two corresponding reparametrizations on $S^1$. Then, recalling that homotopies are simply paths in the space of continuous maps, it is trivial to show that $\gamma_1$ and $\gamma_2$ are loop-homotopic if and only if $\delta_1$ and $\delta_2$ are homotopic as maps from $S^1$ into $X$. Therefore
$$[[0,1],X]_{\mathcal{C}} = [S^1,X].$$
Finally, notice that the intuitive idea of continuous deformation has several subtle aspects, see Figure 8.3.
8.1.2 Homotopy of loops
a. The fundamental group with base point Let $X$ be a metric space and let $x_0 \in X$. It is convenient to consider loops $\gamma : [0,1] \to X$ with $\gamma(0) = \gamma(1) = x_0$. We call them loops with base point $x_0$. Also, one can introduce a restricted form of homotopy between loops with base point $x_0$ by considering loop-homotopies $H(t,x)$ such that $x \to H(t,x)$ has base point $x_0$ for every $t$. We denote the corresponding homotopy equivalence relation and homotopy classes respectively by $\sim_{x_0}$ and $[\ ]_{x_0}$. Finally, $\pi_1(X,x_0)$ denotes the set of classes of loops with base point $x_0$ with respect to loop-homotopy with base point $x_0$.
8.23 ¶. Show that $\pi_1(X,x_0)$ reduces to a point if $X$ is contractible and $x_0 \in X$. [Hint: Show that $\pi_1(X,x_0) \subset [S^1,X]$.]
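b. The group structure on $\pi_1(X,x_0)$ Given two loops $\varphi, \psi : [0,1] \to X$ with base point $x_0$, we may consider the junction of $\varphi$ and $\psi$, denoted by $\varphi * \psi$, as the loop with base point $x_0$ defined by
$$\varphi * \psi(t) := \begin{cases} \varphi(2t) & \text{if } t \in [0,1/2], \\ \psi(2t-1) & \text{if } t \in [1/2,1]. \end{cases}$$
Since the homotopies with fixed endpoints can be joined, too, we have $\varphi_1 * \psi_1 \sim_{x_0} \varphi * \psi$ if $\varphi_1 \sim_{x_0} \varphi$ and $\psi_1 \sim_{x_0} \psi$. Hence the junction of loops induces an operation on the homotopy classes, $[\varphi]_{x_0} * [\psi]_{x_0} := [\varphi * \psi]_{x_0}$, for which we have the following.
8.24 Proposition. We have
(i) (ASSOCIATIVITY) Let $f, g, h : [0,1] \to X$ be three loops with base point $x_0$. Then $([f]_{x_0} * [g]_{x_0}) * [h]_{x_0} = [f]_{x_0} * ([g]_{x_0} * [h]_{x_0})$.
(ii) (RIGHT AND LEFT IDENTITIES) Let $f : [0,1] \to X$ be a loop with base point $x_0$ and let $e_{x_0} : [0,1] \to X$ be the constant map, $e_{x_0}(t) := x_0$. Then $[e_{x_0}]_{x_0} * [f]_{x_0} = [f]_{x_0} * [e_{x_0}]_{x_0} = [f]_{x_0}$.
(iii) (INVERSE) Let $e_{x_0} : [0,1] \to X$ be the constant map $e_{x_0}(t) := x_0$ and, for a loop $f : [0,1] \to X$ with base point $x_0$, let $\bar f : [0,1] \to X$ be the map $\bar f(t) := f(1-t)$. Then $[f]_{x_0} * [\bar f]_{x_0} = [\bar f]_{x_0} * [f]_{x_0} = [e_{x_0}]_{x_0}$.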
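At the level of parametrized curves the junction and the reversal of loops are elementary operations. The following sketch is illustrative only: loops are represented as Python functions on $[0,1]$ with values in the plane, and the names `junction`, `reverse` and `circle_loop` are assumptions made for the example.

```python
import math

def junction(phi, psi):
    """phi * psi: run phi on [0, 1/2] and psi on [1/2, 1], both at double speed."""
    return lambda t: phi(2.0 * t) if t <= 0.5 else psi(2.0 * t - 1.0)

def reverse(f):
    """The loop t -> f(1 - t), whose class is the inverse of the class of f."""
    return lambda t: f(1.0 - t)

def circle_loop(n):
    """Hypothetical loop with base point (1, 0) that winds n times around the unit circle."""
    return lambda t: (math.cos(2.0 * math.pi * n * t), math.sin(2.0 * math.pi * n * t))

loop = junction(circle_loop(1), circle_loop(2))
back_and_forth = junction(circle_loop(1), reverse(circle_loop(1)))
print(loop(0.0), loop(0.25), loop(0.5), loop(1.0))
```

Note that the junction is associative only up to homotopy: `junction(junction(f, g), h)` and `junction(f, junction(g, h))` trace the same path with different parametrizations, which is why associativity is stated for homotopy classes.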
In this way the junction of loops defines a natural group structure on $\pi_1(X,x_0)$, where $[f]_{x_0}^{-1} = [\bar f]_{x_0}$.
8.25 Definition. Let $X$ be a space and $x_0 \in X$. The set $\pi_1(X,x_0)$ of homotopy classes of loops with base point $x_0$ has a natural group structure induced by the junction operation of loops. We then call $\pi_1(X,x_0)$ the fundamental group of $X$, or the first homotopy group of $X$, with base point $x_0$.
c. Changing base point By definition $\pi_1(X,x_0)$ depends on the base point $x_0$. However, if $x_0, x_1 \in X$, suppose that there exists a path $\alpha : [0,1] \to X$ from $x_0$ to $x_1$ and let $\bar\alpha : [0,1] \to X$, $\bar\alpha(t) := \alpha(1-t)$, be the reverse path from $x_1$ to $x_0$. For every loop $\gamma$ with base point $x_0$, the curve $\bar\alpha * \gamma * \alpha$ is a loop with base point $x_1$. Since evidently $\bar\alpha * \gamma_1 * \alpha \sim \bar\alpha * \gamma_2 * \alpha$ if $\gamma_1 \sim \gamma_2$, $\alpha$ defines a map $\alpha_* : \pi_1(X,x_0) \to \pi_1(X,x_1)$ by
$$\alpha_*([\gamma]_{x_0}) := [\bar\alpha * \gamma * \alpha]_{x_1}, \qquad (8.2)$$
where we have denoted by $[\ ]_{x_0}$ and $[\ ]_{x_1}$ respectively the homotopy classes of curves with base point $x_0$ and $x_1$. It is trivial to see that $\alpha_*$ is a group isomorphism, thus concluding the following.
8.26 Proposition. $\pi_1(X,x_0)$ and $\pi_1(X,x_1)$ are isomorphic as groups for all $x_0, x_1 \in X$ if $X$ is path-connected.
Thus, for a path-connected space $X$, all groups $\pi_1(X,x_0)$, $x_0 \in X$, are the same group up to an isomorphism. We call it the fundamental group or the first homotopy group of $X$, and we denote it by $\pi_1(X)$.
However, the map $\alpha_*$ defined by (8.2) depends explicitly on $\alpha$. For convenience, let $h_\alpha := \alpha_*$. Examples show that in general $h_\alpha \ne h_\beta$ if $\alpha$ and $\beta$ have the same endpoints, but we have
$$h_\beta \circ h_\alpha^{-1}([\gamma]_{x_1}) = h_\beta([\alpha\gamma\bar\alpha]_{x_0}) = [\bar\beta\alpha\gamma\bar\alpha\beta]_{x_1} = [\bar\beta\alpha]_{x_1} * [\gamma]_{x_1} * [\bar\alpha\beta]_{x_1}.$$
This implies that
(i) $h_\alpha = h_\beta$ if $\alpha$ and $\beta$ are homotopic with the same endpoints,
(ii) $h_\alpha$ is always the same map, independently of $\alpha$, if $\pi_1(X,x_1)$ is a commutative group.
Figure 8.9. Camille Jordan (1838-1922) and the frontispiece of the Japanese translation of the Lecture Notes on Elementary Topology and Geometry by I. M. Singer and J. A. Thorpe.
Thus, attaching a path to $x_0$ to any curve $\gamma : S^1 \to X$, we can construct a loop with base point $x_0$ and, at the homotopy level, this construction is actually a map $h : [S^1,X] \to \pi_1(X,x_0)$. It is clear that $h$ is one-to-one, since its inverse is just the inclusion of $\pi_1(X,x_0)$ into $[S^1,X]$.
8.27 Proposition. Let $X$ be path-connected. If $\pi_1(X)$ is commutative, then the map $h : [S^1,X] \to \pi_1(X)$ described above is bijective.
8.28 Definition. We say that a space $X$ is simply connected if $X$ is path-connected and $\pi_1(X,x_0)$ reduces to a point for some $x_0 \in X$ (equivalently, for any $x_0 \in X$ by Proposition 8.26).
8.29 ¶. Show that $X$ is simply connected if $X$ is path-connected and contractible.
d. Invariance properties of the fundamental group Let us now look at the action of continuous maps on the fundamental group. Let $X$, $Y$ be metric spaces and let $x_0 \in X$. To any continuous map $f : X \to Y$ one associates a map
$$f_\# : \pi_1(X,x_0) \to \pi_1(Y,f(x_0))$$
defined by $f_\#([\gamma]_{x_0}) := [f \circ \gamma]_{f(x_0)}$. It is easy to see that the above definition makes sense, and that actually $f_\#$ is a group homomorphism.
8.30 Proposition. We have the following.
(i) Let $f : X \to Y$ and $g : Y \to Z$ be two continuous maps. Then $(g \circ f)_\# = g_\# \circ f_\#$.
(ii) If $\mathrm{Id} : X \to X$ is the identity map and $x_0 \in X$, then $\mathrm{Id}_\#$ is the identity map on $\pi_1(X,x_0)$.
(iii) Suppose $Y$ is path-connected, and let $F : [0,1] \times X \to Y$ be a homotopy of two maps $f$ and $g$ from $X$ into $Y$. Then the curve $\alpha(t) := F(t,x_0)$, $t \in [0,1]$, joins $f(x_0)$ to $g(x_0)$ and $g_\# = \alpha_* \circ f_\#$.
Proof. (i) and (ii) are trivial. To prove (iii), it is enough to show that $f \circ \gamma$ and $\alpha * (g \circ \gamma) * \bar\alpha$ are homotopic for every loop $\gamma$ with base point $x_0$. A suitable homotopy is given by the map $H : [0,1] \times [0,1] \to Y$ defined by
$$H(t,x) := \begin{cases} \alpha(2x) & \text{if } x \le \frac{1-t}{2}, \\ F\left(1-t, \gamma\left(\frac{4x+2t-2}{3t+1}\right)\right) & \text{if } \frac{1-t}{2} \le x \le \frac{t+3}{4}, \\ \bar\alpha(4x-3) & \text{if } x \ge \frac{t+3}{4}. \end{cases}$$
Of course, Proposition 8.30 (i) and (ii) imply that a homeomorphism $h : X \to Y$ induces an isomorphism between $\pi_1(X,x_0)$ and $\pi_1(Y,h(x_0))$. Therefore, on account of Proposition 8.26, the fundamental group is a topological invariant of path-connected spaces. Actually, from (iii) of Proposition 8.30 we infer the following.
8.31 Theorem. Let $X$, $Y$ be two path-connected homotopy equivalent spaces. Then $\pi_1(X)$ and $\pi_1(Y)$ are isomorphic.
Proof. Let $f : X \to Y$, $g : Y \to X$ be continuous such that $g \circ f \sim \mathrm{Id}_X$ and $f \circ g \sim \mathrm{Id}_Y$, and let $x_0 \in X$. Then we have two induced maps
$$f_\# : \pi_1(X,x_0) \to \pi_1(Y,f(x_0)), \qquad g_\# : \pi_1(Y,f(x_0)) \to \pi_1(X,g(f(x_0))).$$
Let $H : [0,1] \times X \to X$ be the homotopy of $\mathrm{Id}_X$ to $g \circ f$ and let $K : [0,1] \times Y \to Y$ be the homotopy of $\mathrm{Id}_Y$ to $f \circ g$. If $\alpha_1(t) := H(t,x_0)$, $\alpha_2(t) := K(t,f(x_0))$, then by Proposition 8.30 we infer
$$g_\# \circ f_\# = (g \circ f)_\# = (\alpha_1)_* \circ (\mathrm{Id}_X)_\# = (\alpha_1)_*,$$
$$f_\# \circ g_\# = (f \circ g)_\# = (\alpha_2)_* \circ (\mathrm{Id}_Y)_\# = (\alpha_2)_*.$$
Since $(\alpha_1)_*$ and $(\alpha_2)_*$ are isomorphisms, $f_\#$ is injective and surjective. □
8.1.3 Covering spaces a. Covering spaces A useful tool to compute, at least in some cases, the fundamental group, is the notion of covering space. 8.32 Definition. A covering of Y is a continuous map p : X —^Y from a topological space X, called the total space, onto Y such that for all x EY there exists an open set U C Y containing x such that p~^{U) = UaVa, where Va are pairwise disjoint open sets and p^y^ is a homeomorphism between Va and U. Each Va is called a slice of p~^{U).
Figure 8.10.
8.33 Example. Let $Y$ be any space. Consider the disjoint union of $k$ copies of $Y$, that we can write as a Cartesian product $X := Y \times \{1,2,\dots,k\}$. Then the projection map $p : X \to Y$, $p((y,i)) = y$, is a covering of $Y$.

8.34 Example. Let $S^1$ be the unit circle of $\mathbb{C}$. Then the circular motion $p : \mathbb{R} \to S^1$, $p(\theta) = e^{i2\pi\theta}$, is a covering of $S^1$.

8.35 Example. Let $X \subset \mathbb{R}^3$ be the trace of the regular helix $\gamma(t) = (\cos t, \sin t, t)$. Then $p|_X : X \to S^1$, where $p : \mathbb{R}^3 \to \mathbb{R}^2$, $p(x,y,z) := (x,y)$, is the orthogonal projection on $\mathbb{R}^2$, is another covering of $S^1$.

8.36 ¶. Let $p : X \to Y$ be a covering of $Y$. Suppose that $Y$ is connected and that for some point $y_0 \in Y$ the set $p^{-1}(y_0)$ is finite and contains $k$ points. Show that $p^{-1}(y)$ contains $k$ points for all $y \in Y$. In this case, we say that $p : X \to Y$ is a $k$-fold covering of $Y$.

8.37 ¶. Show that $p : \mathbb{R}_+ \to S^1$, $p(t) := e^{it}$, is not a covering of $S^1$.
8.38 ¶. Show that, if $p : \tilde X \to X$ and $q : \tilde Y \to Y$ are coverings of $X$ and $Y$ respectively, then $p\times q : \tilde X \times \tilde Y \to X \times Y$, $(p\times q)(x,y) := (p(x), q(y))$, is a covering of $X \times Y$. In particular, if $p : \mathbb{R} \to S^1$ is defined by $p(t) := e^{i2\pi t}$, then the map $p\times p : \mathbb{R}\times\mathbb{R} \to S^1\times S^1$ is a covering of the torus $S^1\times S^1$. Figure 8.10 shows the covering map for the standard torus of $\mathbb{R}^3$, which is homeomorphic to the torus $S^1\times S^1 \subset \mathbb{R}^4$.
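8.39 ¶. Think of $S^1$ as a subset of $\mathbb{C}$. Show that the map $p : S^1 \to S^1$, $p(z) = z^2$, is a two-fold covering of $S^1$. More generally, show that the map $S^1 \to S^1$ defined by $p(z) := z^n$ is an $|n|$-fold covering of $S^1$ if $n \in \mathbb{Z}\setminus\{0\}$.

8.40 ¶. Show that the map $p : \mathbb{R}_+\times\mathbb{R} \to \mathbb{R}_+\times S^1$ defined by $p(s,\theta) = (s, e^{i\theta})$ is a covering of $\mathbb{R}_+\times S^1$.

8.41 ¶. Show that the map $p : \mathbb{R}_+\times\mathbb{R} \to \mathbb{R}^2\setminus\{0\}$ defined by $p(\rho,\theta) := \rho e^{i\theta}$ is a covering of $\mathbb{R}^2\setminus\{0\}$.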
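As a numerical illustration of the fibers in Exercise 8.39 (not a solution of the exercise), the following Python sketch lists the points of $S^1$ that $z \mapsto z^n$ sends to a given $y \in S^1$. The function name `fiber` and the sample values are assumptions made for this example only.

```python
import cmath
import math

def fiber(y, n):
    """Return the |n| points z on the unit circle with z**n == y (n a nonzero integer)."""
    theta = cmath.phase(y)          # one argument of y
    m = abs(n)
    sign = 1 if n > 0 else -1
    # z = exp(i*sign*(theta + 2*pi*k)/m) satisfies z**n = exp(i*(theta + 2*pi*k)) = y
    return [cmath.exp(1j * sign * (theta + 2 * math.pi * k) / m) for k in range(m)]

y = cmath.exp(0.7j)
for n in (2, 5, -3):
    points = fiber(y, n)
    print(n, len(points))           # the fiber over y has |n| points
```

Each fiber has exactly $|n|$ points, equally spaced on the circle, which is what one expects of an $|n|$-fold covering.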
b. Lifting of curves

In connection with coverings the notion of (continuous) lift is crucial.

8.42 Definition. Let $p : X \to Y$ be a covering of $Y$ and let $f : Z \to Y$ be a continuous map. A continuous map $\tilde f : Z \to X$ such that $p\circ\tilde f = f$ is called a lift of $f$ on $X$.
8.43 Example. Let $p : \mathbb{R} \to S^1$ be the covering of $S^1$ given by $p(t) = e^{it}$. A lift of $f : [0,1] \to S^1$ is a continuous map $h : [0,1] \to \mathbb{R}$ such that $f(t) = e^{ih(t)}$. Looking at $t$ as a time variable, $h(t)$ is the angular evolution of $f(t)$ as $f(t)$ moves on $S^1$.

8.44 Example. Not every function can be lifted. For instance, consider the covering $p : \mathbb{R} \to S^1$, $p(t) = e^{it}$. Then the identity map on $S^1$ cannot be lifted to a continuous map $h : S^1 \to \mathbb{R}$. In fact, parametrizing maps from $S^1$ as closed curves parametrized on $[0,2\pi]$, $h$ would be periodic. On the other hand, if $h$ were a lift of $z = e^{it}$, we would have $e^{it} = e^{ih(t)}$, which implies that $h(t) = t + \text{const}$, a contradiction.
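The angular evolution of Example 8.43 can be approximated numerically. The sketch below is only an illustration: it assumes the curve is given by finitely many samples on the unit circle, with consecutive samples less than half a turn apart, and the helper name `lift` is hypothetical.

```python
import cmath
import math

def lift(points, start=0.0):
    """Angles h[k] with points[k] = points[0] * exp(i*(h[k] - start)), built step by step.

    Assumes consecutive samples are less than half a turn apart, so that the
    principal argument of cur/prev is the true (small) angular increment.
    """
    h = [start]
    for prev, cur in zip(points, points[1:]):
        h.append(h[-1] + cmath.phase(cur / prev))
    return h

n = 200
# Samples of the identity loop t -> exp(i t), t in [0, 2*pi], as in Example 8.44.
loop = [cmath.exp(2j * math.pi * k / n) for k in range(n + 1)]
h = lift(loop)
print((h[-1] - h[0]) / (2 * math.pi))   # approximately 1
```

The loop is closed, but its lift is not: the lift gains $2\pi$ over one turn, which is the obstruction described in Example 8.44.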
However, curves can be lifted to curves that are not necessarily closed. Let $X$ be a metric space. We say that $X$ is locally path-connected if every point $x \in X$ has an open path-connected neighborhood $U$.

8.45 Proposition. Let $p : X \to Y$ be a covering of $Y$ and let $x_0 \in X$. Suppose that $X$ and $Y$ are path-connected and locally path-connected. Then
(i) each curve $\beta : [0,1] \to Y$ with $\beta(0) = p(x_0)$ has a unique continuous lift $\alpha : [0,1] \to X$ such that $p\circ\alpha = \beta$ and $\alpha(0) = x_0$,
(ii) for every continuous map $k : [0,1]\times[0,1] \to Y$ with $k(0,0) = p(x_0)$, there is a unique continuous lift $h : [0,1]\times[0,1] \to X$ such that $h(0,0) = x_0$ and $p(h(t,s)) = k(t,s)$ for all $(t,s) \in [0,1]\times[0,1]$.

Proof. Step 1. Uniqueness in (i). Suppose that for the two curves $\alpha_1, \alpha_2$ we have $p(\alpha_1(t)) = p(\alpha_2(t))$ $\forall t \in [0,1]$ and $\alpha_1(0) = \alpha_2(0)$. The set $E := \{t \mid \alpha_1(t) = \alpha_2(t)\}$ is closed in $[0,1]$; since $p$ is a local homeomorphism, it is easily seen that $E$ is also open in $[0,1]$. Therefore $E = [0,1]$, since $E$ is nonempty and $[0,1]$ is connected.

Step 2. Existence in (i). We consider the subset

$$E := \{\, t \in [0,1] \mid \exists \text{ a continuous curve } \alpha_t : [0,t] \to X \text{ such that } \alpha_t(0) = x_0 \text{ and } p(\alpha_t(\theta)) = \beta(\theta)\ \forall \theta \in [0,t] \,\}$$

and shall prove that $E$ is open and closed in $[0,1]$; consequently, $E = [0,1]$ as it is not empty. Let $\tau \in E$ and let $U$ be an open neighborhood of $\alpha_\tau(\tau)$ for which $p|_U$ is a homeomorphism. For $\sigma$ sufficiently small, $\sigma < \sigma_0$, the curve $s \to \gamma(s) := (p|_U)^{-1}(\beta(s))$, $s \in [\tau, \tau+\sigma]$, is continuous, $\gamma(\tau) = \alpha_\tau(\tau)$ and $p(\gamma(s)) = \beta(s)$ $\forall s \in [\tau, \tau+\sigma]$. Therefore for the curve

$$\alpha_{\tau+\sigma}(s) := \begin{cases} \alpha_\tau(s) & \text{if } 0 \le s \le \tau, \\ \gamma(s) & \text{if } \tau \le s \le \tau+\sigma, \end{cases}$$

we have $p(\alpha_{\tau+\sigma}(t)) = \beta(t)$ for all $t \in [0, \tau+\sigma]$, i.e., $\tau + \sigma \in E$ for some $\sigma > 0$ if $\tau \in E$, or, in other words, $E$ is open in $[0,1]$.

We now prove that $E$ is closed by showing that $T := \sup E \in E$. Let $\{t_n\} \subset E$ be a nondecreasing sequence that converges to $T$ and, for every $n$, let $\alpha_n : [0,t_n] \to X$ be such that $p(\alpha_n(t)) = \beta(t)$ $\forall t \in [0,t_n]$. Because of the uniqueness, $\alpha_r(t) = \alpha_s(t)$ for all $t \in [0,r]$ if $r < s$; consequently a continuous curve $\alpha : [0,T[ \to X$ is defined so that $p(\alpha(t)) = \beta(t)$ $\forall t \in [0,T[$. It remains to show that we can extend $\alpha$ continuously at $T$. Let $V$ be an open neighborhood of $\beta(T)$ such that $p^{-1}(V) = \bigcup_j U_j$, where the $U_j$ are pairwise disjoint open sets that are homeomorphic to $V$. Then $\beta(t) \in V$ for $\bar t < t$
belong to a unique $U_j$, say $U_1$, for $t$
8.46 Proposition. Let $X$ and $Y$ be path-connected and locally path-connected metric spaces and let $p : X \to Y$ be a covering of $Y$. Let $\alpha, \beta : [0,1] \to Y$ be two curves with $\alpha(0) = \beta(0)$ and $\alpha(1) = \beta(1)$ that are homotopic with fixed endpoints and let $a, b : [0,1] \to X$ be their continuous lifts that start at the same point $a(0) = b(0)$. Then $a(1) = b(1)$, and $a$ and $b$ are homotopic with fixed endpoints.

Proof. From (i) of Proposition 8.45 we know that $\alpha, \beta$ can be lifted uniquely to two curves $a, b : [0,1] \to X$ with $a(0) = b(0) = a_0$, $p(a_0) = \alpha(0)$. Let $k : [0,1]\times[0,1] \to Y$ be a homotopy between $\alpha$ and $\beta$, i.e., $k(0,t) = \alpha(t)$, $k(1,t) = \beta(t)$, $k(s,0) = \alpha(0) = \beta(0)$, $k(s,1) = \alpha(1) = \beta(1)$. By (ii) of Proposition 8.45 we can lift $k$ to $h$, so that $p(h(s,t)) = k(s,t)$ and $h(0,0) = a(0) = b(0)$. By the uniqueness of lifts, $h(0,t) = a(t)$ and $h(1,t) = b(t)$, while $s \to h(s,0)$ and $s \to h(s,1)$ are constant, being lifts of constant curves. Then $h$ is a homotopy between $a$ and $b$ and in particular $a(1) = h(0,1) = h(1,1) = b(1)$. □
8.47 Theorem. Let $X$ and $Y$ be path-connected and locally path-connected metric spaces and let $p : X \to Y$ be a covering of $Y$. If $Y$ is simply connected, then $p : X \to Y$ is a homeomorphism.

Proof. Suppose there are $x_1, x_2 \in X$ with $p(x_1) = p(x_2)$. Since $X$ is path-connected, there is a curve $a : [0,1] \to X$ with $a(0) = x_1$ and $a(1) = x_2$. Let $b : [0,1] \to X$ be the constant curve $b(t) = x_1$. The image curves $\alpha(t) := p(a(t))$ and $\beta(t) := p(b(t))$ are closed curves, hence homotopic, $Y$ being simply connected. Proposition 8.46 then yields $x_2 = a(1) = b(1) = x_1$. □
8.48 Theorem. Let $X$ and $Y$ be path-connected and locally path-connected, and let $p : X \to Y$ be a covering of $Y$. Suppose that $Z$ is path-connected and simply connected. Then any continuous map $f : Z \to Y$ has a lift $\tilde f : Z \to X$. More precisely, given $z_0 \in Z$ and $x_0 \in X$ such that $p(x_0) = f(z_0)$, there exists a unique continuous map $\tilde f : Z \to X$ such that $\tilde f(z_0) = x_0$ and $p\circ\tilde f = f$.

Proof. Let $z \in Z$ and let $\gamma : [0,1] \to Z$ be a curve joining $z_0$ to $z$. Then the curve $\alpha(t) := f(\gamma(t))$, $t \in [0,1]$, in $Y$ has a lift to a curve $a : [0,1] \to X$ with $a(0) = x_0$, see (i) of Proposition 8.45, and Proposition 8.46 shows that $a(1)$ depends on $\alpha(1) = f(z)$ and does not depend on the particular curve $\gamma$. Thus we define $\tilde f(z) := a(1)$, and by definition $\tilde f(z_0) = x_0$ and $p\circ\tilde f = f$. We leave it to the reader to check that $\tilde f$ is continuous. □
c. Universal coverings and homotopy

8.49 Definition. Let $Y$ be a path-connected and locally path-connected metric space. A covering $p : X \to Y$ is said to be a universal covering of $Y$ if $X$ is path-connected, locally path-connected and simply connected.

From Theorems 8.47 and 8.48 we immediately infer

8.50 Theorem (Universal property). Let $X, Y, Z$ be path-connected and locally path-connected metric spaces. Let $p : X \to Y$, $q : Z \to Y$ be two coverings of $Y$ and suppose $Z$ simply connected. Then $q$ has a lift $\tilde q : Z \to X$ which is also a covering of $X$. Moreover $\tilde q$ is a homeomorphism if $X$ is simply connected, too.

The relevance of the universal covering space in computing the homotopy appears from the following.

8.51 Theorem. Let $X$ and $Y$ be path-connected and locally path-connected metric spaces and let $p : X \to Y$ be the universal covering of $Y$. Then, for every $y_0 \in Y$, $\pi_1(Y,y_0)$ and $p^{-1}(y_0) \subset X$ are in one-to-one correspondence.

Proof. Fix $q \in p^{-1}(y_0)$. For any curve $\alpha$ in $Y$ with base point $y_0$, denote by $a : [0,1] \to X$ its lift with $a(0) = q$. Clearly $a$ is a curve in $X$ which ends at $a(1) \in p^{-1}(y_0)$. Moreover, if $\beta$ is loop-homotopic to $\alpha$ in $Y$ and $b$ is its lift with $b(0) = q$, then necessarily $a(1) = b(1)$, so the map $\alpha \to a(1)$ is actually a map

$$\varphi_q : \pi_1(Y,y_0) \to p^{-1}(y_0).$$

Of course $\varphi_q$ is surjective, since any curve in $X$ with endpoints in $p^{-1}(y_0)$ projects onto a closed loop in $Y$ with base point $y_0$. Moreover, if $\varphi_q([\gamma]) = \varphi_q([\delta])$, then the lifts $c$ and $d$ that start at the same point end at the same point; consequently $c$ and $d$ are homotopic, as $X$ is simply connected. Projecting the homotopy between $c$ and $d$ onto $Y$ yields $[\gamma] = [\delta]$. □
d. A global invertibility result

Existence of a universal covering $p : X \to Y$ of a space $Y$ can be proved in the setting of topological spaces. Observe that if $X$ and $Y$ are path-connected and locally path-connected, and if $p : X \to Y$ is a universal covering of $Y$, then $Y$ is locally simply connected, i.e., such that for every $y \in Y$ there exists an open set $V \subset Y$ containing $y$ such that every loop in $V$ with base point at $y$ is homotopic (in $Y$) to the constant loop $y$. It can be proved in the context of topological spaces that any path-connected, locally path-connected and locally simply connected $Y$ has a universal covering $p : X \to Y$. We do not deal with such a general problem and confine ourselves to discussing whether a given continuous map $f : X \to Y$ is a covering of $Y$.
Let $X, Y$ be metric spaces. A continuous map $f : X \to Y$ is a local homeomorphism if every $x \in X$ has an open neighborhood $U$ such that $f|_U$ is a homeomorphism onto its image. We say that $f$ is a proper map if $f^{-1}(K)$ is compact in $X$ for every compact $K \subset Y$. Clearly a homeomorphism from $X$ onto its image $f(X) \subset Y$ is a local homeomorphism and a proper map. Also, if $p : X \to Y$ is a covering of $Y$ then $p$ is a local homeomorphism. We have

8.52 Theorem. Let $X$ be path-connected and locally path-connected and let $f : X \to Y$ be a local homeomorphism and a proper map. Then $X$ and $f(X)$ are open, path-connected and locally path-connected and $f : X \to f(X)$ is a covering of $f(X)$.

Before proving Theorem 8.52, let us introduce the Banach indicatrix of $f : X \to Y$ as the map

$$N_f : Y \to \mathbb{N}\cup\{\infty\}, \qquad N_f(y) := \#\{x \in X \mid f(x) = y\}.$$
Evidently $f(X) = \{y \mid N_f(y) \ge 1\}$ and $f$ is injective iff $N_f(y) \le 1$ $\forall y$.

8.53 Lemma. Let $f : X \to Y$ be a local homeomorphism and a proper map. Then $N_f$ is bounded and locally constant on $f(X)$.

Proof. Since $f$ is a local homeomorphism, the set $f^{-1}(\bar y) = \{x \in X \mid f(x) = \bar y\}$ is discrete and in fact $f^{-1}(\bar y)$ is finite, since $f$ is proper. Let $N_f(\bar y) = k$ and $f^{-1}(\bar y) = \{x_1, \dots, x_k\}$. Since $f$ is a local homeomorphism, we can find open disjoint neighborhoods $U_1$ of $x_1$, ..., $U_k$ of $x_k$ and an open neighborhood $V$ of $\bar y$ such that $f|_{U_j} : U_j \to V$ are homeomorphisms. In particular, for every $y \in V$ there is a unique $x_j \in U_j$ such that $f(x_j) = y$. It follows that $N_f(y) \ge k$ $\forall y \in V$.

We now show that for every $\bar y$ there exists a neighborhood $W$ of $\bar y$, $W \subset V$, such that $N_f(y) \le k$ holds for all $y \in W$. Suppose, in fact, that for a $\bar y$ there is no neighborhood $W$ such that $N_f(y) \le k$ for $y \in W$. Then there is a sequence $\{y_i\} \subset V$, $y_i \to \bar y$, with $N_f(y_i) > k$, and points $\xi_i \notin U_1\cup\dots\cup U_k$ with $f(\xi_i) = y_i$. The set $f^{-1}(\{y_i\}\cup\{\bar y\})$ is compact since $f$ is proper, thus, possibly passing to a subsequence, $\{\xi_i\}$ converges to a point $\xi$ and necessarily $\xi \notin U_1\cup\dots\cup U_k$; passing to the limit we also find $f(\xi) = \bar y$: a contradiction since $\xi$ is different from $x_1, \dots, x_k$. □

Proof of Theorem 8.52. From Lemma 8.53 we know that, for every $y \in f(X)$, $f^{-1}(y)$ contains finitely many points $\{x_1, x_2, \dots, x_N\}$ where $N$ is locally constant. If $U_i \ni x_i$, $i = 1, \dots, N$, and $V_i \ni y$ are open and homeomorphic sets, we then set

$$V := \bigcap_{i=1}^N V_i \cap \{y \in Y \mid N_f(y) = N\}, \qquad W_i := (f|_{U_i})^{-1}(V).$$

Clearly $V$ is open and $f^{-1}(V)$ is a finite union of disjoint open sets that are homeomorphic to $V$. □
As a consequence of Theorem 8.47 we then infer the following useful global invertibility theorem.

8.54 Theorem. Let $X$ be path-connected and locally path-connected, and let $f : X \to Y$ be a local homeomorphism that is proper. If $f(X)$ is simply connected, then $f$ is injective, hence a homeomorphism between $X$ and $f(X)$.
Proof. $f : X \to f(X)$ is a covering by Theorem 8.52. Theorem 8.47 then yields that $f$ is one-to-one, hence a homeomorphism of $X$ onto $f(X)$. □
8.1.4 A few examples

a. The fundamental group of $S^1$

The map $p : \mathbb{R} \to S^1$, $p(t) = e^{i2\pi t}$, is a universal covering of $S^1$. Therefore for any $x_0 \in S^1$, $p^{-1}(x_0) = \mathbb{Z}$ as sets. Therefore, see Theorem 8.51, one can construct an injective and surjective map

$$\varphi_{x_0} : \pi_1(S^1, x_0) \to \mathbb{Z}$$

that maps $[\alpha]$ to the end value $a(1) \in \mathbb{Z}$ of the lift $a$ of $\alpha$ with $a(0) = 0$. We have

8.55 Lemma. $\varphi_{x_0} : \pi_1(S^1, x_0) \to \mathbb{Z}$ is a group isomorphism.

Proof. Let $\alpha, \beta$ be two loops in $S^1$ with base point $x_0$ and $a, b$ the liftings with $a(0) = b(0) = 0$. If $n := \varphi_{x_0}([\alpha])$ and $m := \varphi_{x_0}([\beta])$, we define $c : [0,1] \to \mathbb{R}$ by

$$c(s) := \begin{cases} a(2s) & s \in [0,1/2], \\ n + b(2s-1) & s \in [1/2,1]. \end{cases}$$

It is not difficult to check that $c$ is the lift of $\alpha * \beta$ with $c(0) = 0$, so that

$$\varphi_{x_0}([\alpha] * [\beta]) = \varphi_{x_0}([\alpha * \beta]) = c(1) = n + m = \varphi_{x_0}([\alpha]) + \varphi_{x_0}([\beta]).$$ □

Since $\varphi_{x_0}$ is a group isomorphism and $\mathbb{Z}$ is commutative, $\pi_1(S^1, x_0)$ is commutative, and there is a bijective map $h : [S^1, S^1] \to \pi_1(S^1, x_0)$, see Proposition 8.27. The composition map

$$\deg : [S^1, S^1] \to \mathbb{Z}, \qquad \deg(\gamma) := \varphi_{x_0}(h([\gamma]))$$

is called the degree on $S^1$, and by construction we have the following.

8.56 Theorem. Two maps $f, g : S^1 \to S^1$ have the same degree if and only if they are homotopic.

Later we shall see that we can recover the degree mapping more directly.

8.57 ¶. Show that the fundamental group of $\mathbb{R}^2\setminus\{0\}$ is $\mathbb{Z}$.
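The additivity used in the proof of Lemma 8.55 can be checked numerically. The following Python sketch approximates the end value $a(1)$ of the lift of a loop through $p(t) = e^{i2\pi t}$ by summing small angle increments; the names `lift_end` and `concat`, the sample loops and the number of samples are assumptions of this illustration, and the loops are assumed to be based at $x_0 = 1$.

```python
import cmath
import math

def lift_end(loop, n=2000):
    """Approximate a(1), where a is the lift of `loop` through p(t) = exp(2*pi*i*t), a(0) = 0.

    Assumes loop(0) = loop(1) = 1 and that n samples resolve the loop
    (consecutive samples less than half a turn apart).
    """
    total = 0.0
    prev = loop(0.0)
    for k in range(1, n + 1):
        cur = loop(k / n)
        total += cmath.phase(cur / prev)   # small signed angle between samples
        prev = cur
    return total / (2 * math.pi)

def concat(alpha, beta):
    """The loop alpha * beta: first alpha, then beta, each at double speed."""
    return lambda s: alpha(2 * s) if s <= 0.5 else beta(2 * s - 1)

alpha = lambda s: cmath.exp(2j * math.pi * 2 * s)    # winds twice counterclockwise
beta = lambda s: cmath.exp(-2j * math.pi * 5 * s)    # winds five times clockwise

n_alpha = lift_end(alpha)
m_beta = lift_end(beta)
both = lift_end(concat(alpha, beta))
print(round(n_alpha), round(m_beta), round(both))
```

Up to rounding errors, the end value for $\alpha * \beta$ is the sum of the end values for $\alpha$ and $\beta$.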
Figure 8.11. A figure eight.
b. The fundamental group of the figure eight

The figure eight is the union of two circles $A$ and $B$ with a point $x_0$ in common. If $a$ is a loop based at $x_0$ that goes clockwise once around $A$, and $a^{-1}$ is the loop that goes counterclockwise once around $A$, and similarly for $b$, $b^{-1}$, then the cycle $aba^{-1}b^{-1}$ is a loop that cannot be unknotted in $A\cup B$ while $aa^{-1}bb^{-1}$ can. More precisely, one shows that the fundamental group of the figure eight is the noncommutative free group on the generators $a$ and $b$. Indeed, this can be proved using the following special form of the so-called Seifert-Van Kampen theorem.

8.58 Theorem. Suppose $X = U\cup V$, where $U, V$ are open path-connected sets and $U\cap V$ is path-connected and simply connected. Then for any $x_0 \in U\cap V$, $\pi_1(X, x_0)$ is the free product of $\pi_1(U, x_0)$ and $\pi_1(V, x_0)$.

8.59 ¶. Show that the fundamental group of $\mathbb{R}^2\setminus\{x_0, x_1\}$ is isomorphic to the fundamental group of the figure eight.

8.60 ¶. Show that $\pi_1(X\times Y, (x_0,y_0))$ is isomorphic to $\pi_1(X,x_0)\times\pi_1(Y,y_0)$; in particular the fundamental group of the torus $S^1\times S^1$ is $\mathbb{Z}\times\mathbb{Z}$.

8.61 ¶. Let $X = A_1\cup A_2\cup\dots\cup A_n$ where each $A_i$ is homeomorphic to $S^1$, and $A_i\cap A_j = \{x_0\}$ if $i \ne j$. Show that $\pi_1(X, x_0)$ is the free group on $n$ generators $a_1, \dots, a_n$ where $a_i$ is represented by a path that goes around $A_i$ once.

8.62 ¶. Let $X$ be the space obtained by removing $n$ points of $\mathbb{R}^2$. Show that $\pi_1(X, x_0)$ is a free group on $n$ generators $a_1, \dots, a_n$, where $a_i$ is represented by a closed path which goes around the $i$th hole once.
c. The fundamental group of $S^n$, $n \ge 2$

The following result is also a consequence of Theorem 8.58.

8.63 Theorem. Let $X = U\cup V$ where $U$ and $V$ are simply connected open sets of $X$ and $U\cap V$ is path-connected. Then $X$ is simply connected, i.e., $\pi_1(X, x_0) = 0$.

As a consequence we have the following.

8.64 Proposition. The sphere $S^n \subset \mathbb{R}^{n+1}$ is simply connected, i.e., $\pi_1(S^n, x_0) = 0$, if $n \ge 2$.
Proof. Let $p_S$ and $p_N$ be, respectively, the south pole and the north pole of the sphere. The stereographic projection from the south (north) pole establishes a homeomorphism between $S^n\setminus\{p_S\}$ (respectively, $S^n\setminus\{p_N\}$) and $\mathbb{R}^n$. Thus $\pi_1(S^n\setminus\{p_S\}, x_0) = \pi_1(S^n\setminus\{p_N\}, x_0) = 0$ $\forall x_0 \ne p_S, p_N$. By Theorem 8.63 it suffices to show that $S^n\setminus\{p_S, p_N\}$ is path-connected. For that we notice that the stereographic projection is a homeomorphism between $S^n\setminus\{p_S, p_N\}$ and $\mathbb{R}^n\setminus\{0\}$, which in turn is path-connected if $n \ge 2$. □

Since the fundamental groups of $\mathbb{R}^{n+1}\setminus\{0\}$ and $S^n$ are isomorphic, see Theorem 8.31, equivalently we can state

8.65 Proposition. $\mathbb{R}^{n+1}\setminus\{0\}$ is simply connected if $n \ge 2$.

8.66 ¶. Show that $\mathbb{R}^n$, $n > 2$, and $\mathbb{R}^2$ are not homeomorphic.
8.1.5 Brouwer's degree

a. The degree of maps $S^1 \to S^1$

A more analytic presentation of the mapping degree for maps $S^1 \to S^1$ is the following. Think of $S^1$ as the unit circle in the complex plane, so that the rotations of $S^1$ are written as complex multiplications, and represent loops in $S^1$ as maps $f : S^1 \to S^1$ or by $2\pi$-periodic functions $\theta \to f(e^{i\theta})$.

8.67 Lemma. Let $f : S^1 \to S^1$ be continuous. There exists a unique continuous function $h : \mathbb{R} \to \mathbb{R}$ such that

$$\begin{cases} f(e^{i\theta}) = f(1)\, e^{ih(\theta)}, \\ h(0) = 0. \end{cases} \qquad (8.3)$$

Proof. Consider the covering $p : \mathbb{R} \to S^1$ of $S^1$ given by $p(t) := e^{it}$. The loop $g(z) := f(z)/f(1)$ has base point $1 \in S^1$. Then by the lifting argument, Proposition 8.45, there exists a lift $h : \mathbb{R} \to \mathbb{R}$ such that (8.3) holds. The uniqueness follows directly from (8.3). In fact, if $h_1, h_2$ verify (8.3), then $h_1(\theta) - h_2(\theta) = 2\pi k(\theta)$ where $k(\theta) \in \mathbb{Z}$. As $h_1$ and $h_2$ are continuous, $k(\theta)$ is constant, hence $k(\theta) = k(0) = 0$. □

Let $f : S^1 \to S^1$ be continuous and let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3). Of course, for every $\theta$ we have

$$h(\theta + 2\pi) - h(\theta) = 2\pi k(\theta)$$

for some integer $k(\theta) \in \mathbb{Z}$. Since $h$ is continuous, $k$ is continuous, hence constant. Observe that $2\pi k = h(2\pi) - h(0) = h(2\pi)$ and that $k$ is independent of the initial point $f(1)$. In particular, $f : S^1 \to S^1$ and $f/f(1) : S^1 \to S^1$ have the same degree.
8.68 Definition. Let $f : S^1 \to S^1$ and let $h$ be as in (8.3). There is a unique integer $d \in \mathbb{Z}$ such that

$$h(\theta + 2\pi) - h(\theta) = 2\pi d \qquad \forall \theta \in \mathbb{R}.$$

The number $d$ is called the winding number, or degree, of the map $f : S^1 \to S^1$, and it is denoted by $\deg(f)$.

8.69 Theorem. Two continuous maps $f_0, f_1 : S^1 \to S^1$ have the same degree if and only if they are homotopic.

Proof. Let $f : S^1 \to S^1$. We have already observed that $f(z)$ and $f(z)/f(1)$ have the same degree. On the other hand, $f(z)$ and $f(z)/f(1)$ are also trivially homotopic. To prove the theorem it is therefore enough to consider maps $f_0, f_1$ with the same base point, say $f_0(1) = f_1(1) = 1$.

(i) Assume $f_0, f_1$ are homotopic with base point $1 \in S^1$. By the lifting argument, the liftings $h_0, h_1$ of $f_0, f_1$ characterized by (8.3) have $h_0(2\pi) = h_1(2\pi)$, hence

$$2\pi\deg(f_0) = h_0(2\pi) - h_0(0) = h_0(2\pi) = h_1(2\pi) = h_1(2\pi) - h_1(0) = 2\pi\deg(f_1).$$

(ii) Conversely, let $f : S^1 \to S^1$ be of degree $d$ with $f(1) = 1$ and let $h$ be given by (8.3). Then the map $k : [0,1]\times S^1 \to S^1$ defined by

$$k(t, e^{i\theta}) := \exp\big(i\,(t\,h(\theta) + (1-t)\,d\,\theta)\big)$$

establishes a homotopy of $f$ to the map $\varphi : S^1 \to S^1$, $\varphi(z) = z^d$. Therefore, if $f_0$ and $f_1$ have the same degree $d$ and base point $1 \in S^1$, then they are both homotopic to the same map $\varphi(z) = z^d$. □
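The homotopy in the proof of Theorem 8.69 can be checked on a concrete map. In the Python sketch below the lift $h(\theta) = 2\theta + \sin\theta$ (so $d = 2$ and $f(1) = 1$) is a choice made only for this illustration, and the names `degree` and `k` are hypothetical; the degree is approximated by summing small angle increments over one turn.

```python
import cmath
import math

def degree(g, n=4000):
    """Winding number of theta -> g(theta) in S^1 as theta runs over [0, 2*pi].

    Assumes n samples resolve g (consecutive values less than half a turn apart).
    """
    total = 0.0
    prev = g(0.0)
    for j in range(1, n + 1):
        cur = g(2 * math.pi * j / n)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

d = 2
h = lambda th: d * th + math.sin(th)        # a lift with h(0) = 0, h(th + 2*pi) = h(th) + 2*pi*d

def k(t):
    """The map theta -> exp(i*(t*h(theta) + (1 - t)*d*theta)) at homotopy time t."""
    return lambda th: cmath.exp(1j * (t * h(th) + (1 - t) * d * th))

degrees = [degree(k(t)) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
print(degrees)
```

Every intermediate map is $2\pi$-periodic in $\theta$ because $h(\theta + 2\pi) = h(\theta) + 2\pi d$, and the computed degree stays equal to $d$ along the homotopy from $z^d$ (at $t = 0$) to $f$ (at $t = 1$).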
Finally we observe that $\deg(z^d) = d$ $\forall d \in \mathbb{Z}$ and that, if $f$ and $g$ have the same base point, $\deg(g * f) = \deg(g) + \deg(f)$.

b. An integral formula for the degree

Let $f : S^1 \to S^1$ and let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3). Clearly, thinking of $\theta$ as a time variable, $h(\theta)$ is the angle evolution of the point $f(e^{i\theta})$ on the circle. The degree of $f$ corresponds to the total angle evolution, that is, to the number of revolutions that $f(z)$ does as $z$ goes around $S^1$ once counterclockwise, counting the revolutions positively if $f(z)$ goes counterclockwise and negatively if $f(z)$ goes clockwise.

Suppose $f : S^1 \to S^1$ is a loop of class $C^1$, that is, $\theta \to f(e^{i\theta})$ is of class $C^1$, and let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3). Differentiating (8.3) we get

$$i e^{i\theta} f'(e^{i\theta}) = i f(1) e^{ih(\theta)} h'(\theta) = i f(e^{i\theta}) h'(\theta)$$

and, taking the modulus, $|h'(\theta)| = |f'(e^{i\theta})|$. Therefore, $h'(\theta)$ is the speed $|f'(e^{i\theta})|$ of $f(z)$ times $\pm 1$, depending on the direction of motion of $f(z)$ when $z$ moves as $e^{i\theta}$ on the unit circle. In coordinates, writing $f := f_1 + i f_2$, we have $f' = f_1' + i f_2'$, hence

$$h'(\theta) = i e^{i\theta}\big(-f_2(e^{i\theta}) f_1'(e^{i\theta}) + f_1(e^{i\theta}) f_2'(e^{i\theta})\big).$$

We conclude using the fundamental theorem of calculus
Figure 8.12. Counting the degree.
8.70 Proposition (Integral formula for the degree). Let $f : S^1 \to S^1$ be of class $C^1$. Then the lift $h$ of $f$ in (8.3) is given by

$$h(t) = \int_0^t h'(\theta)\,d\theta = \int_0^t i e^{i\theta}\big(-f_2(e^{i\theta}) f_1'(e^{i\theta}) + f_1(e^{i\theta}) f_2'(e^{i\theta})\big)\,d\theta. \qquad (8.4)$$

In particular

$$\deg(f) = \frac{1}{2\pi}\int_0^{2\pi} h'(\theta)\,d\theta = \frac{1}{2\pi}\int_0^{2\pi} i e^{i\theta}\big(-f_2(e^{i\theta}) f_1'(e^{i\theta}) + f_1(e^{i\theta}) f_2'(e^{i\theta})\big)\,d\theta. \qquad (8.5)$$

One can define the lifting and degree of smooth maps by (8.5), showing the homotopy invariance in the context of regular maps, and then extending the theory to continuous functions by an approximation procedure.

c. Degree and inverse image

The degree of $f : S^1 \to S^1$ is strongly related to the number of roots of the equation $f(x) = y$ counted with a suitable sign.

8.71 Proposition. Let $f : S^1 \to S^1$ be a continuous map with degree $d \in \mathbb{Z}$. For every $y \in S^1$, there exist at least $|d|$ points $x_1, x_2, \dots, x_{|d|}$ in $S^1$ such that $f(x_i) = y$, $i = 1, \dots, |d|$. Furthermore, if $f : S^1 \to S^1$ goes around $S^1$ never turning back, i.e., if $f(e^{i\theta}) = e^{ih(\theta)}$ where $h : [0,2\pi] \to \mathbb{R}$ is strictly monotone, then the equation $f(x) = y$, $y \in S^1$, has exactly $|d|$ solutions.

Proof. Let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3), so that $h(2\pi) = 2\pi d$, and let $s \in [0,2\pi[$ be such that $e^{is} = y$. For convenience suppose $d > 0$. The intermediate value theorem yields $d$ distinct points $\theta_1, \theta_2, \dots, \theta_d$ in $[0,2\pi[$ such that $h(\theta_1) = s$, $h(\theta_2) = s + 2\pi$, ..., $h(\theta_d) = s + 2(d-1)\pi$, hence at least $d$ distinct points $x_1, x_2, \dots, x_d$ such that $f(x_j) = f(e^{i\theta_j}) = e^{ih(\theta_j)} = e^{is} = y$, see Figure 8.12. There are of course exactly $d$ points $x_1, x_2, \dots, x_d$ if $h$ is strictly monotone. □
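As a numerical check of (8.5), the following Python sketch approximates $\frac{1}{2\pi}\int_0^{2\pi} h'(\theta)\,d\theta$. It is written under an assumption that differs in notation from the text: the map is given through $\Phi(\theta) := f(e^{i\theta}) = \Phi_1 + i\Phi_2$, and $h'$ is computed as $\Phi_1\Phi_2' - \Phi_2\Phi_1' = \mathrm{Im}(\Phi'\,\overline{\Phi})$, with derivatives taken in $\theta$ and replaced by central differences. The name `degree_integral` and the step sizes are choices of this example.

```python
import cmath
import math

def degree_integral(Phi, n=2000, eps=1e-5):
    """Approximate (1/(2*pi)) * integral over [0, 2*pi] of h'(theta).

    Phi(theta) = f(exp(i*theta)) must have modulus 1; then
    h'(theta) = Im(Phi'(theta) * conj(Phi(theta))).
    """
    total = 0.0
    for j in range(n):
        th = 2 * math.pi * j / n
        dPhi = (Phi(th + eps) - Phi(th - eps)) / (2 * eps)   # central difference in theta
        total += (dPhi * Phi(th).conjugate()).imag
    # Riemann sum with step 2*pi/n, divided by 2*pi
    return total / n

print(degree_integral(lambda th: cmath.exp(3j * th)))                          # close to 3
print(degree_integral(lambda th: cmath.exp(1j * (-2 * th + 0.5 * math.sin(th)))))  # close to -2
```

The second map turns back and forth only slightly, and the integral still returns its degree, since the oscillating part of $h'$ integrates to zero over a period.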
With the previous notation, suppose $f : S^1 \to S^1$ is of class $C^1$, let $h$ be as in (8.3) and let $y \in S^1$ and $s \in [0,2\pi[$ be such that $y = e^{is}$. Assume that $y$ is chosen so that the equation $f(x) = y$ has a finite number of solutions and set

$$N_+(f,y) := \#\{\theta \in S^1 \mid h(\theta) = s \ (\mathrm{mod}\ 2\pi),\ h'(\theta) > 0\},$$
$$N_-(f,y) := \#\{\theta \in S^1 \mid h(\theta) = s \ (\mathrm{mod}\ 2\pi),\ h'(\theta) < 0\}.$$

Then one sees that

$$\deg(f) = N_+(f,y) - N_-(f,y), \qquad (8.6)$$

see Figure 8.12.

8.72 The fundamental theorem of algebra. Using the degree theory we can easily prove that every complex polynomial

$$P(z) := z^m + a_1 z^{m-1} + \dots + a_{m-1} z + a_0, \qquad m \ge 1,$$

has at least one complex root. Set $S_\rho := \{z \mid |z| = \rho\}$. For $\rho$ sufficiently large, $P$ maps $S_\rho$ into $\mathbb{R}^2\setminus\{0\}$. Also $\deg(P/|P|) = m$. In fact, by considering the homotopy

$$P_t(z) := z^m + t\,(a_1 z^{m-1} + \dots + a_0), \qquad t \in [0,1],$$

of $P(z)$ to $z^m$, we have

$$|P_t(z)| \ge |z|^m\Big(1 - t\Big(\frac{|a_1|}{|z|} + \dots + \frac{|a_0|}{|z|^m}\Big)\Big) \qquad \forall z \ne 0.$$

Thus $|P_t(z)| > 0$ $\forall t \in [0,1]$ provided $|z|$ is large enough; consequently $P_t(z)/|P_t(z)|$, $t \in [0,1]$, $z \in S_\rho$, establishes a homotopy of $P/|P|$ to $z^m/|z|^m$ from $S_\rho$ into $S^1$, and we conclude that $\deg(P/|P|) = \deg(z^m) = m$. If $P$ had no root, then $P(tz)/|P(tz)|$, $t \in [0,1]$, $z \in S_\rho$, would be a homotopy of $P/|P|$ to a constant map, so that $\deg(P/|P|) = 0 \ne m$, a contradiction.

8.73 ¶. Show that $f : S^1 \to S^1$ has at least $d - 1$ fixed points if $\deg f = d$. [Hint: See Figure 8.12.]
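The degree computation in 8.72 can be observed numerically. The Python sketch below counts how many times $P(z)$ winds around the origin as $z$ runs once along the circle $|z| = \rho$; the polynomial $z^3 + 2z - 5$, the radii and the helper name `winding` are choices made for this illustration only.

```python
import cmath
import math

def winding(P, rho, n=4000):
    """Winding number around 0 of theta -> P(rho * exp(i*theta)), theta in [0, 2*pi].

    Assumes P has no zero on the circle |z| = rho and that n samples resolve the curve.
    """
    total = 0.0
    prev = P(rho)
    for j in range(1, n + 1):
        cur = P(rho * cmath.exp(2j * math.pi * j / n))
        total += cmath.phase(cur / prev)   # angle increment; equals that of P/|P|
        prev = cur
    return round(total / (2 * math.pi))

P = lambda z: z ** 3 + 2 * z - 5

print(winding(P, 10.0))   # on a large circle the degree is m = 3
print(winding(P, 0.1))    # on a small circle P stays near -5, degree 0
```

The degree is $3$ on the large circle and $0$ on the small one; since the degree is a homotopy invariant, $P$ must vanish somewhere in between, in agreement with the argument of 8.72.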
d. The homological definition of degree for maps $S^1 \to S^1$

Let $f : S^1 \to \Sigma^1$ be a continuous map, where for convenience we have denoted the target space $S^1$ by $\Sigma^1$. We fix in $S^1$ and $\Sigma^1$ two orientations, for instance the counterclockwise orientation, and we divide $S^1$ in small arcs whose images by $f$ do not contain antipodal points (this is possible since $f : S^1 \to \Sigma^1$ is uniformly continuous), and let $z_1, \dots, z_n, z_{n+1} = z_1 \in S^1$ be the points of such a subdivision, indexed according to the chosen orientation in $S^1$. For each $i = 1, \dots, n$ we denote by $\alpha_i$ the minimal arc connecting $f(z_i)$ with $f(z_{i+1})$. We give it the positive sign if $f(z_i)$ precedes $f(z_{i+1})$ with respect to the chosen orientation of $\Sigma^1$, negative otherwise. Finally, for $\zeta \in \Sigma^1$, $\zeta \ne f(z_i)$ $\forall i$, we denote by $p(\zeta)$ and $n(\zeta)$ the number of arcs $\alpha_i$, respectively positive and negative, that contain $\zeta$. Then

$$p(\zeta) - n(\zeta) = \deg(f) \in \mathbb{Z},$$

as we can see looking at the lift of $f$.
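The counting procedure just described translates almost literally into code. The Python sketch below is an illustration under stated assumptions: the subdivision is uniform with `n` points, it is fine enough that the images of consecutive points are less than half a turn apart, and $\zeta$ is not one of the values $f(z_i)$. The name `homological_degree` and the sample maps are hypothetical.

```python
import cmath
import math

def homological_degree(f, zeta, n=720):
    """p(zeta) - n(zeta): signed count of minimal arcs from f(z_i) to f(z_{i+1}) containing zeta."""
    pts = [f(cmath.exp(2j * math.pi * i / n)) for i in range(n + 1)]
    positive = negative = 0
    for a, b in zip(pts, pts[1:]):
        step = cmath.phase(b / a)        # signed length of the minimal arc from a to b
        offset = cmath.phase(zeta / a)   # signed angle from a to zeta
        if 0 < offset < step:
            positive += 1                # zeta lies on a counterclockwise arc
        elif step < offset < 0:
            negative += 1                # zeta lies on a clockwise arc
    return positive - negative

zeta = cmath.exp(1j)
print(homological_degree(lambda z: z ** 3, zeta))
print(homological_degree(lambda z: z.conjugate() ** 2, zeta))
# A map of degree 1 that moves back and forth: theta -> theta + 1.5*sin(2*theta)
print(homological_degree(lambda z: z * cmath.exp(1.5j * (z * z).imag), zeta))
```

For the third map some arcs through $\zeta$ are positive and some negative, and only their difference is the degree.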
Figure 8.13. Frontispiece of lecture notes by Louis Nirenberg and a page from a paper by L. E. Brouwer (1881-1966).
8.2 Some Results on the Topology of $\mathbb{R}^n$

Though the presentation of these topics would require more space and advanced techniques and would, in any case, lead us away from the main path, we think that it is worthwhile to present here some results that are relevant in the sequel. However, we shall confine ourselves to illustrating the ideas and refer to the literature for complete proofs and more details. In the next two paragraphs we collect a few relevant results on the topology of maps into $S^n$ that we freely use in the rest of this section.
8.2.1 Brouwer's theorem

a. Brouwer's degree

A topological degree, called Brouwer's degree, can be defined for continuous maps $f : S^n \to S^n$, $n \ge 2$, either by extending the homological type arguments in the case $n = 1$ or, more generally, in terms of homology groups or, analytically, in terms of a sum with sign of the numbers of inverse images of a point, either pointwise or in the mean. Intuitively, one counts how many times the target $S^n$ is covered algebraically by the source $S^n$ via the map $f$. Both approaches require the development of more advanced and relevant techniques; we refer the reader e.g., to
o J. Dugundji, Topology, Allyn and Bacon, 1966,
o L. Nirenberg, Topics in Nonlinear Functional Analysis, AMS-CIMS, New York, 2001,
o one of the several books on degree theory.

In this way we end up with a map

$$\deg : C^0(S^n, S^n) \to \mathbb{Z}$$

such that
(i) $\deg(\mathrm{Id}) = 1$,
(ii) $\deg(f) = 0$ if $f$ is constant,
(iii) $\deg(f) = (-1)^{n+1}$ if $f(x) = -x$,
and we have the following.

8.74 Theorem (Brouwer). Let $f_0, f_1 : S^n \to S^n$ be continuous and homotopic. Then $\deg(f_0) = \deg(f_1)$.

Indeed the degree completely characterizes the homotopy classes of continuous maps from $S^n$ into $S^n$. In fact, we have the following.

8.75 Theorem (Hopf). Two continuous maps of $S^n$ into itself are homotopic if and only if they have the same degree. Moreover, for each $d \in \mathbb{Z}$ there is a map $f : S^n \to S^n$ with $\deg(f) = d$.

A map $f : S^n \to S^n \subset \mathbb{R}^{n+1}$ is called antipodal if $f(-x) = -f(x)$ $\forall x \in S^n$. For instance, $\mathrm{Id} : S^n \to S^n$ and $-\mathrm{Id} : S^n \to S^n$ are antipodal.

8.76 Theorem (Borsuk antipodal theorem). Let $f : S^n \to S^n$ be a continuous antipodal map. Then $\deg(f)$ is odd; in particular $f$ is not homotopic to a constant map.

b. Extension of maps into $S^n$

The following two extension theorems for maps into $S^n$ are also crucial. We refer the reader e.g., to J. Dugundji, Topology, Allyn and Bacon, 1966.

8.77 Theorem. We have the following.
(i) Let $A \subset S^n$ be a closed set. Every continuous map $f : A \to S^n$ extends to a continuous map $F : S^n \to S^n$.
(ii) Let $A \subset S^{n+1}$ be closed and $f : A \to S^n$ be continuous. Pick a point $p_i \in U_i$ in every bounded connected component $U_i$ of $A^c := S^{n+1}\setminus A$. Then there is a continuous extension $F : S^{n+1}\setminus\bigcup_i\{p_i\} \to S^n$.

8.78 Theorem (Borsuk). Let $A \subset \mathbb{R}^k$, $k \ge 1$, be closed and let $f : A \to S^n$ be continuous. Then $f$ can be extended to a continuous map $F : \mathbb{R}^k \to S^n$ if and only if $f$ is homotopic to a constant map.

Observing that $A^c$ has a unique unbounded connected component if $A$ is a compact subset of $\mathbb{R}^n$, and using the stereographic projection, it is not difficult to infer the following from Theorem 8.77 (i), (ii).
8.79 Theorem. We have the following.
(i) Let $A \subset \mathbb{R}^n$ be compact. Then any continuous function from $A$ into $S^n$ can be extended to a continuous $F : \mathbb{R}^n \to S^n$.
(ii) Let $A \subset \mathbb{R}^{n+1}$ be compact and $f : A \to S^n$ be continuous. Pick a point $p_i \in U_i$ in every connected component $U_i$ of $A^c$. Then $f$ can be extended to a continuous map $F : \mathbb{R}^{n+1}\setminus\bigcup_i\{p_i\} \to S^n$.

As a consequence of the Hopf theorem and Proposition 8.8 we immediately infer the following.

8.80 Proposition. A function $f : S^n \to S^n$ has a continuous extension $F : \mathrm{cl}(B^{n+1}) \to S^n$ if and only if $\deg(f) = 0$.

8.81 Corollary. Let $f : S^n \to \mathbb{R}^{n+1}\setminus\{0\}$ be a continuous map. Then there exists a continuous extension $F : \mathrm{cl}(B^{n+1}) \to \mathbb{R}^{n+1}\setminus\{0\}$ of $f$ if and only if $\deg(f/|f|) = 0$.

c. Brouwer's fixed point theorem

Since the identity from $S^n$ into $S^n$ has degree one, and the constant maps have degree zero, from the homotopic invariance of the degree we conclude the following.

8.82 Theorem (Brouwer). The identity map $\mathrm{Id} : S^n \to S^n$ is not homotopic to a constant map.

In other words, we cannot peel an orange without piercing the peel. Brouwer's theorem, whose content is quite intuitive, at least in dimension $n = 2$, has several interesting and surprising consequences. In fact, we have the following.

8.83 Theorem. The following claims are equivalent.
(i) (BROUWER'S THEOREM) The identity map $\mathrm{Id} : S^n \to S^n$ is not homotopic to a constant map.
(ii) There is no continuous map $F : B \to S^n$, $B := \mathrm{cl}(B^{n+1})$, such that $F(x) = x$ $\forall x \in S^n$, that is, $S^n$ is not a retract of $B$.
(iii) (BROUWER'S FIXED POINT THEOREM, I) Every continuous map $f : B \to B$, $B := \mathrm{cl}(B^{n+1})$, has a fixed point, i.e., there is at least one $x \in B$ such that $f(x) = x$.

Proof. (i) $\Rightarrow$ (ii) If $F : B \to S^n$ is a continuous function with $F(x) = x$ $\forall x \in S^n$, then $H(t,x) := F(tx)$, $(t,x) \in [0,1]\times S^n$, is a homotopy of the identity to the constant map $F(0)$. A contradiction.

(ii) $\Rightarrow$ (iii) Suppose that there is a continuous $F : B \to B$ such that $F(x) \ne x$ for all $x \in B$. Then, and we leave this to the reader, the map $G : B \to S^n$ that maps $x \in B$ into the unique point of $S^n$ on the half-line from $F(x)$ to $x$ would be a continuous map from $B$ into $S^n$ with $G(x) = x$ $\forall x \in S^n$, contradicting (ii).
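(iii) $\Rightarrow$ (i) Suppose that there is a homotopy $H : [0,1]\times S^n \to S^n$ between the identity and a constant map, $H(1,x) = x$, $H(0,x) = p \in S^n$. Then the function $F : B \to S^n$ defined by

$$F(x) := \begin{cases} H\Big(|x|, \dfrac{x}{|x|}\Big) & \text{if } x \ne 0, \\ p & \text{if } x = 0, \end{cases}$$

would be a continuous extension of the identity on $S^n$ to $B$, hence $-F : B \to B$ would have no fixed point. □

8.84 ¶. Let $U \subset \mathbb{R}^{n+1}$ be a bounded open set. Prove that there exists no continuous retraction $r : \overline{U} \to \partial U$ with $r(x) = x$ on $\partial U$. [Hint: Let $0 \in U$, $B(0,k) \supset \overline{U}$ and consider the continuous map $f : \overline{B(0,k)} \to \overline{B(0,k)}$ defined by

$$f(x) := \begin{cases} -k\,\dfrac{r(x)}{|r(x)|} & \text{if } x \in \overline{U}, \\ -k\,\dfrac{x}{|x|} & \text{if } x \in \overline{B(0,k)}\setminus U. \end{cases}]$$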
d. Fixed points and solvability of equations in $\mathbb{R}^{n+1}$

Going through the proof of Theorem 8.83, we can deduce a number of results concerning the solvability of equations of the type $F(x) = 0$. Let $f : S^n \to \mathbb{R}^{n+1}\setminus\{0\}$ be a continuous map. Since $f$ never vanishes, the map $f/|f|$ continuously maps $S^n$ into $S^n$. We call degree of $f$ with respect to the origin the number

$$\deg(f, 0) := \deg\Big(\frac{f}{|f|}\Big).$$

8.85 Proposition. Let $f : S^n \to \mathbb{R}^{n+1}\setminus\{0\}$ be a continuous map with $\deg(f,0) \ne 0$. Then every continuous extension $F : B \to \mathbb{R}^{n+1}$ of $f$, $B := \mathrm{cl}(B^{n+1})$, has a zero in $B^{n+1}$.

Proof. Suppose this is not true. Then there exists a continuous extension $F : B \to \mathbb{R}^{n+1}\setminus\{0\}$ of $f$. Hence $F(x)/|F(x)|$ is a continuous map from $B$ into $S^n$ that extends $f(x)/|f(x)|$. According to Proposition 8.80, $f/|f|$ has degree zero, a contradiction. □
Let us illustrate a few situations in which Proposition 8.85 applies.

8.86 Proposition. Let $F : \mathrm{cl}(B^{n+1}) \to \mathbb{R}^{n+1}$ be a continuous map such that $F(x)$ never points opposite to $x$ for all $x \in S^n$. Then $F(x) = 0$ has a solution.

Proof. Let $f := F|_{S^n} : S^n \to \mathbb{R}^{n+1}$. Since, by assumption, $F(x) + \lambda x \ne 0$ $\forall \lambda \ge 0$, $\forall x \in S^n$, $f$ has no zeros and therefore

$$h(t,x) := t f(x) + (1-t)x, \qquad t \in [0,1],\ x \in S^n,$$

never vanishes. Hence $h(t,x)/|h(t,x)|$ is a homotopy of $f/|f| : S^n \to S^n$ to the identity map $\mathrm{Id} : S^n \to S^n$. It follows that $\deg(f,0) = \deg(f/|f|) = 1$. We conclude, on account of Proposition 8.85, that $F$, being an extension of $f$, has at least one zero in $B^{n+1}$. □
8.87 Theorem (Brouwer's fixed point, II). Let $F : B \to \mathbb{R}^n$, $B := \mathrm{cl}(B^n)$, be a continuous map with $F(\partial B) \subset B$. Then $F$ has a fixed point.
Proof. Set $\phi(x) := x - F(x)$, $x \in B$, and suppose $\phi(x) \neq 0$ for all $x \in \partial B$, otherwise we are through. In this case $\phi(x)$ never points opposite to $x$ for each $x \in \partial B$. Indeed, if $x - F(x) + \lambda x = 0$ for some $\lambda \ge 0$ and $x \in \partial B$, then $F(x) = (1+\lambda)x$. Now $\lambda > 0$ is impossible since $|F(x)| \le 1$, and, if $\lambda = 0$, then $F(x) = x$ at a point of $\partial B$, which we have ruled out. Thus, by Proposition 8.86, $F(x) - x = 0$ has a solution inside $B$. □
It is worth noticing that Brouwer's theorem still holds if we replace $\mathrm{cl}(B^n)$ with any set which is homeomorphic to the closed ball of $\mathbb{R}^n$. Moreover, it also holds in the following form.
8.88 Theorem (Brouwer fixed point theorem, III). Every continuous map $f : K \to K$ from a convex compact set $K$ into itself has a fixed point.
Proof. According to Dugundji's theorem, Theorem 6.42, $f$ has a continuous extension $F : \mathbb{R}^n \to \mathbb{R}^n$ whose image is contained in $K$, $K$ being convex and closed. If $B$ is a closed ball containing $K$, then $F(B) \subset B$ and by Brouwer's fixed point theorem, Theorem 8.87, $F$ has a fixed point $x \in B$, i.e., $F(x) = x$, and, since $F(x) \in K$, we conclude that $x \in K$. □
e. Fixed points and vector fields
Every $(n+1)$-dimensional vector field in a domain $A \subset \mathbb{R}^{n+1}$ may be regarded as a map $\varphi : A \subset \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$, once we fix the coordinates. If $\varphi$ is continuous and nonzero, the degree of $\varphi$ with respect to the origin is called the characteristic of the vector field. The properties of Brouwer's degree and Proposition 8.8 then read in terms of vector fields as follows.
8.89 Proposition. We have the following.
(i) (BROUWER) Let $\varphi$ be a nonvanishing vector field in $\mathrm{cl}(B^{n+1})$. Then $\varphi|_{S^n}$ has characteristic zero.
(ii) The outward normal to $B^{n+1}$ at $x \in S^n = \partial B^{n+1}$ is $x$. Therefore the outward normal field to $S^n$, $x \to x/|x|$, $x \in \mathbb{R}^{n+1} \setminus \{0\}$, has characteristic one.
(iii) The inward normal at $x \in S^n$ is $-x$. Therefore the inward normal field to $S^n$, $x \to -x/|x|$, $x \in \mathbb{R}^{n+1} \setminus \{0\}$, has characteristic $(-1)^{n+1}$.
(iv) Let $\varphi$ and $\psi$ be two continuous nonvanishing vector fields on $S^n$ that are never opposite on $S^n$. Then $\varphi$ and $\psi$ have the same characteristic.
Let us draw some consequences.
8.90 Proposition. Each nonvanishing vector field $\varphi : \mathrm{cl}(B^{n+1}) \to \mathbb{R}^{n+1}$ must contain at least an inward normal and an outward normal vector.
Proof. In fact, $\varphi|_{S^n}$ has characteristic zero by (i) of Proposition 8.89. Since $\varphi|_{S^n}$ and the field of outward (inward) normals have different characteristics, we infer from Proposition 8.89 (iv) that $\varphi|_{S^n}$ must contain an inward (outward) normal. □
8.91 Theorem (Poincaré-Brouwer). Every continuous nonvanishing vector field on an even-dimensional sphere $S^{2n}$ must contain at least one normal vector. In particular, there can be no continuous nonvanishing tangential vector field on $S^{2n}$.
Proof. By (ii), (iii) of Proposition 8.89, the inward and outward normal vector fields on $S^{2n}$ have different characteristics, $-1$ and $1$. Since the characteristic of any unit vector field must differ from that of at least one of these two fields, the result follows from (iv) of Proposition 8.89. □
8.92 Proposition. Let $f : S^{2n} \to S^{2n}$ be a continuous map. Then either $f$ has a fixed point $x = f(x)$, or there is an $x \in S^{2n}$ such that $f(x) = -x$.
Proof. Suppose $f : S^{2n} \to S^{2n}$ has no fixed point. Then the vector field $g : S^{2n} \to S^{2n}$ given by
$$g(x) := \frac{f(x) - x}{|f(x) - x|}, \qquad x \in S^{2n},$$
is continuous and of modulus one. Thus it contains a normal vector, i.e., $f(x) - x = \lambda x$ for some $x \in S^{2n}$ and $\lambda \in \mathbb{R}$. Since $|f(x)| = |x| = 1$ we infer $1 = |f(x)| = |\lambda + 1|\,|x| = |\lambda + 1|$, i.e., either $\lambda = 0$ or $\lambda = -2$. We cannot have $\lambda = 0$ since otherwise $f(x) - x = 0$ and $x$ would be a fixed point. Thus necessarily $\lambda = -2$, i.e., $f(x) = -x$. □

8.93 ¶. Let $\phi : \mathbb{R}^n \to \mathbb{R}^n$ be a continuous map that is coercive, that is
$$\frac{(\phi(x) \,|\, x)}{|x|} \to \infty \qquad \text{uniformly as } |x| \to \infty.$$
Show that $\phi$ is onto $\mathbb{R}^n$. [Hint: Show that for every $y \in \mathbb{R}^n$, $\phi(x) - y$ never points opposite to $x$ on a sphere $|x| = R$ with $R$ large enough.]
8.94 ¶. Let $\phi : \mathbb{R}^n \to \mathbb{R}^n$ be a continuous map such that $\limsup_{|x| \to \infty} |\phi(x)|/|x| < 1$. Show that $\phi$ has a fixed point.

8.95 ¶. Let us state another equivalent form of Brouwer's fixed point theorem.
Theorem (Miranda). Let $f : Q := \{x \in \mathbb{R}^n \mid |x_i| \le 1,\ i = 1, \dots, n\} \to \mathbb{R}^n$ be a continuous map such that for $i = 1, \dots, n$ we have
$$f_i(x_1, \dots, x_{i-1}, -1, x_{i+1}, \dots, x_n) \ge 0, \qquad f_i(x_1, \dots, x_{i-1}, 1, x_{i+1}, \dots, x_n) \le 0.$$
Then there is at least one $x \in Q$ such that $f(x) = 0$.
Show the equivalence between the above theorem and Brouwer's fixed point theorem. [Hint: To prove the theorem, first assume that strict inequalities hold. In this case show that for a suitable choice of $\epsilon_1, \dots, \epsilon_n \in \mathbb{R}$ the transformation
$$x_i' = x_i + \epsilon_i f_i(x), \qquad i = 1, \dots, n,$$
maps $Q$ into itself, and use Brouwer's theorem. In the general case, apply the above to $f(x) - \delta x$ and let $\delta$ tend to 0. Conversely, if $F$ maps $Q$ into itself, consider the maps $f_i(x) = F_i(x) - x_i$, $i = 1, \dots, n$.]

8.96 ¶. Show that there is a nonvanishing tangent vector field on an odd-dimensional sphere $S^{2n-1}$. [Hint: Think of $S^{2n-1} \subset \mathbb{R}^{2n}$. Then the field $x = (x_1, x_2, \dots, x_{2n}) \to (-x_{n+1}, -x_{n+2}, \dots, -x_{2n}, x_1, x_2, \dots, x_n)$ defines a map from $S^{2n-1}$ into itself that has no fixed point.]
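The field in the hint to Exercise 8.96 can be checked numerically. The following is only a sanity check under stated assumptions, not a proof: the helper names `tangent_field`, `random_point_on_sphere` and `check` are introduced here for illustration, and the sample points are random. It verifies that the field is orthogonal to $x$ (hence tangent) and has modulus one at sampled points of $S^{2n-1}$.

```python
import math
import random


def tangent_field(x):
    """Map (x_1..x_2n) to (-x_{n+1}, ..., -x_{2n}, x_1, ..., x_n)."""
    n = len(x) // 2
    return [-c for c in x[n:]] + list(x[:n])


def random_point_on_sphere(dim, rng):
    """Return a random unit vector of R^dim (normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def check(n, samples=200, seed=0):
    """Return the worst |<x, v(x)>| and the worst | |v(x)| - 1 | over the samples."""
    rng = random.Random(seed)
    worst_dot = 0.0
    worst_norm_err = 0.0
    for _ in range(samples):
        x = random_point_on_sphere(2 * n, rng)
        v = tangent_field(x)
        dot = sum(a * b for a, b in zip(x, v))
        norm_v = math.sqrt(sum(c * c for c in v))
        worst_dot = max(worst_dot, abs(dot))
        worst_norm_err = max(worst_norm_err, abs(norm_v - 1.0))
    return worst_dot, worst_norm_err
```

Both returned quantities should be of the order of rounding errors, in agreement with the identity $\sum_{i=1}^n (-x_{n+i}x_i + x_i x_{n+i}) = 0$.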
Figure 8.14. Karol Borsuk (1905-1982) and a page from one of his papers.
8.2.2 Borsuk's theorem
Borsuk's theorem, Theorem 8.76, also has interesting equivalent formulations and consequences.
8.97 Theorem. The following statements hold and are equivalent.
(i) (BORSUK-ULAM) There is no continuous antipodal map $f : S^n \to S^{n-1}$.
(ii) Each continuous $f : S^n \to \mathbb{R}^n$ sends at least one pair of antipodal points to the same point.
(iii) (LYUSTERNIK-SCHNIRELMANN) In each family of $n+1$ closed subsets covering $S^n$ at least one set must contain a pair of antipodal points.
Proof. Borsuk's theorem => (i) If $f : S^n \to S^{n-1}$ is a continuous antipodal map, and if we regard $S^{n-1}$ as the equator of $S^n$, $S^{n-1} \subset S^n$, $f$ would give us a nonsurjective map $f : S^n \to S^n$, hence homotopic to a constant. On the other hand $f$ has odd degree by Borsuk's theorem, a contradiction.
(i) => (ii) Suppose that there is a continuous $g : S^n \to \mathbb{R}^n$ such that $g(x) \neq g(-x)$ for all $x \in S^n$. Then the map $f : S^n \to S^{n-1}$ defined by
$$f(x) := \frac{g(-x) - g(x)}{|g(-x) - g(x)|}$$
would yield a continuous antipodal map.
(ii) => (iii) Let $F_1, \dots, F_{n+1}$ be $n+1$ closed sets covering $S^n$ and let $\alpha : S^n \to S^n$ be the map $\alpha(x) = -x$. Suppose that $\alpha(F_i) \cap F_i = \emptyset$ for all $i = 1, \dots, n$. Then we can find continuous functions $g_i : S^n \to [0,1]$ such that $g_i^{-1}(0) = F_i$ and $g_i^{-1}(1) = \alpha(F_i)$. Next we define $g : S^n \to \mathbb{R}^n$ as $g(x) = (g_1(x), \dots, g_n(x))$. By the assumption there is $x_0 \in S^n$ such that $g_i(x_0) = g_i(-x_0)$ $\forall i$, thus $x_0 \notin \bigcup_{i=1}^n F_i$ and $x_0 \notin \bigcup_{i=1}^n \alpha(F_i)$, consequently $x_0 \in F_{n+1} \cap \alpha(F_{n+1})$, a contradiction.
(iii) => (i) Let $f : S^n \to S^{n-1}$ be a continuous map. We decompose $S^{n-1}$ into $(n+1)$ closed sets $A_1, \dots, A_{n+1}$ each of which has diameter less than two; this is possible by projecting the boundary of an $n$-simplex enclosing the origin onto $S^{n-1}$. Defining $F_i := f^{-1}(A_i)$, $i = 1, \dots, n+1$, according to the assumption there is an $x_0$ and a $k$ such that $x_0 \in F_k \cap \alpha(F_k)$. But then $f(x_0)$ and $f(-x_0)$ both belong to $A_k$, whose diameter is less than two, and so $f$ cannot be antipodal. □
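For $n = 1$, statement (ii) of Theorem 8.97 can be made concrete. The sketch below is an illustration under stated assumptions: it uses the intermediate value theorem and bisection, not the degree argument of the text. A continuous $g : S^1 \to \mathbb{R}$ is represented by a $2\pi$-periodic function of the angle; since $d(\theta) := g(\theta) - g(\theta + \pi)$ satisfies $d(\theta + \pi) = -d(\theta)$, it changes sign on $[0, \pi]$. The function `temperature` is a hypothetical example, and `antipodal_coincidence` is a name introduced here.

```python
import math


def antipodal_coincidence(g, tol=1e-12):
    """Return an angle theta with g(theta) close to g(theta + pi).

    g is assumed continuous and 2*pi-periodic, so that
    d(theta) = g(theta) - g(theta + pi) satisfies d(pi) = -d(0).
    """
    def d(theta):
        return g(theta) - g(theta + math.pi)

    lo, hi = 0.0, math.pi
    d_lo = d(lo)
    if d_lo == 0.0:
        return lo
    # Invariant: d(lo) has the sign of d_lo and d(hi) has the opposite sign.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        d_mid = d(mid)
        if d_mid == 0.0:
            return mid
        if (d_mid > 0.0) == (d_lo > 0.0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def temperature(theta):
    """A hypothetical continuous 2*pi-periodic 'temperature' on the circle."""
    return 3.0 * math.cos(theta) + math.sin(2.0 * theta) + 0.5 * math.cos(3.0 * theta - 1.0)
```

Calling `antipodal_coincidence(temperature)` returns an angle at which the two antipodal values agree up to the tolerance of the bisection.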
8.98 Theorem. $\mathbb{R}^n$ is not homeomorphic to $\mathbb{R}^m$ if $n \neq m$.
Proof. Suppose $n > m$ and let $h : \mathbb{R}^n \to \mathbb{R}^m$ be a continuous map. Since $n - 1 \ge m$, from (ii) of Theorem 8.97 we conclude that $h|_{S^{n-1}} : S^{n-1} \to \mathbb{R}^m \subset \mathbb{R}^{n-1}$ must send two antipodal points into the same point, so that $h$ cannot be injective. □
8.99 Remark. As a curiosity, (ii) of Theorem 8.97 yields that at every instant there are two antipodal points on the earth with the same temperature and atmospheric pressure.

8.100 ¶. Show that every continuous map $f : S^n \to S^n$ such that $f(x) \neq f(-x)$ $\forall x$ is surjective.
8.2.3 Separation theorems
8.101 Definition. We say that a set $A \subset \mathbb{R}^{n+1}$ separates $\mathbb{R}^{n+1}$ if its complement $A^c := \mathbb{R}^{n+1} \setminus A$ is not connected.
8.102 Theorem. Let $A \subset \mathbb{R}^{n+1}$ be compact. Then
(i) each connected component of $\mathbb{R}^{n+1} \setminus A$ is a path-connected open set,
(ii) $A^c$ has exactly one unbounded connected component,
(iii) the boundary of each connected component of $A^c$ is contained in $A$,
(iv) if $A$ separates $\mathbb{R}^{n+1}$, but no proper subset does so, then the boundary of each connected component of $A^c$ is exactly $A$.
Proof. (i) follows e.g., from Corollary 6.68, since connected components of $A^c$ are open sets.
(ii) Let $B$ be a closed ball such that $B \supset A$. Then $B^c$ is open, connected and $B^c \subset A^c$. Thus $B^c$ is contained in a unique connected component of $A^c$.
(iii) Let $U$ be any connected component of $A^c$ and $x \in \partial U$. We claim that $x$ does not belong to any connected component of $A^c$, consequently $x \notin A^c$. In fact, $x \notin U$, and, if $x$ was in some component $V$, there would exist $B(x, \epsilon) \subset V$. $B(x, \epsilon)$ would then also intersect $U$, thus $U \cap V \neq \emptyset$: a contradiction.
(iv) Let $U$ be any connected component of $A^c$. Since $A$ separates $\mathbb{R}^{n+1}$, there is another connected component $V$ of $A^c$ and, because $V \subset \mathbb{R}^{n+1} \setminus \overline{U}$, necessarily $\mathbb{R}^{n+1} \setminus \overline{U} \neq \emptyset$. Consequently $\mathbb{R}^{n+1} \setminus \partial U$ splits as $\mathbb{R}^{n+1} \setminus \partial U = U \cup (\mathbb{R}^{n+1} \setminus \overline{U})$, which are disjoint, open and nonempty, so $\partial U$ separates $\mathbb{R}^{n+1}$. Since by (iii) $\partial U \subset A$ and is closed, it follows from the hypotheses on $A$ that $\partial U = A$. □
8.103 Theorem (Borsuk's separation theorem). Let $A \subset \mathbb{R}^{n+1}$ be compact. Then $A$ separates $\mathbb{R}^{n+1}$ if and only if there exists a continuous map $f : A \to S^n$ that is not homotopic to a constant.
Proof. Define the map $\beta_p|_A$ as
$$\beta_p(x) := \frac{x - p}{|x - p|}.$$
Assume that $A$ separates $\mathbb{R}^{n+1}$. Then $\mathbb{R}^{n+1} \setminus A$ has at least one bounded component $U$. Choosing any $p \in U$ we shall show that $\beta_p|_A$ cannot be extended to a continuous function on the closed set $A \cup U$, consequently on $\mathbb{R}^{n+1}$; hence $\beta_p|_A$ is not homotopic to a constant map by Proposition 8.8. In fact, if $F : A \cup U \to S^n$ were a continuous extension of $\beta_p|_A$, we choose $R > 0$ such that $B(p,R) \supset A \cup U$ and define $g : \overline{B(p,R)} \to \partial B(p,R)$ by
$$g(x) := \begin{cases} p + R\,\dfrac{x - p}{|x - p|} & \text{if } x \in \overline{B(p,R)} \setminus U, \\[2mm] p + R\,F(x) & \text{if } x \in \overline{U}. \end{cases}$$
Then $g$ would be continuous in $\overline{B(p,R)}$ and $g = \mathrm{Id}$ on $\partial B(p,R)$: this contradicts Brouwer's theorem.
Conversely, suppose that $A$ does not separate $\mathbb{R}^{n+1}$ and let $f : A \to S^n$ be continuous. Then $A^c$ has exactly one connected component, which is necessarily unbounded. By Theorem 8.79, $f$ extends to $F : \mathbb{R}^{n+1} \to S^n$. Therefore $F$ and consequently $f = F|_A$ are homotopic to a constant map. □
In particular, Borsuk's separation theorem tells us that the separation property is invariant under homeomorphisms.
8.104 Corollary. Let $A$ be a compact set in $\mathbb{R}^n$ and let $h : A \to \mathbb{R}^n$ be a homeomorphism onto its image. Then $A$ separates $\mathbb{R}^n$ if and only if $h(A)$ separates $\mathbb{R}^n$.
As a consequence we have the following.
8.105 Theorem (Jordan's separation theorem). A homeomorphic image $h(S^n)$ of $S^n$ in $\mathbb{R}^{n+1}$ separates $\mathbb{R}^{n+1}$ and no proper closed subset of $h(S^n)$ does so. In particular $h(S^n)$ is the complete boundary of each connected component of $\mathbb{R}^{n+1} \setminus h(S^n)$.
It is instead much more difficult to prove the following general Jordan's theorem.
8.106 Theorem (Jordan). Let $h : S^n \to \mathbb{R}^{n+1}$ be a homeomorphism between $S^n$ and its image. Then $\mathbb{R}^{n+1} \setminus h(S^n)$ has exactly two connected components, each having $h(S^n)$ as its boundary.
Jordan's theorem in the case $n = 1$ is also known as the Jordan curve theorem. We also have
8.107 Theorem (Jordan-Borsuk). Let $K$ be a compact subset of $\mathbb{R}^{n+1}$ such that $\mathbb{R}^{n+1} \setminus K$ has $k$ connected components, and let $h$ be a homeomorphism of $K$ onto its image in $\mathbb{R}^{n+1}$. Then $\mathbb{R}^{n+1} \setminus h(K)$ has $k$ connected components.
Particularly relevant is the following theorem that follows from Borsuk's separation theorem, Theorem 8.103.
8.108 Theorem (Brouwer's invariance of domain theorem). Let $U$ be an open set of $\mathbb{R}^{n+1}$ and let $h : U \subset \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$ be a homeomorphism between $U$ and its image. Then $h(U)$ is an open set in $\mathbb{R}^{n+1}$.
Proof. Let $y \in h(U)$. We shall show that there is an open set $W \subset \mathbb{R}^{n+1}$ such that $y \in W \subset h(U)$. Set $x = h^{-1}(y)$, and $B := B(x, \epsilon)$ so that $\overline{B} \subset U$. Then
(i) $\mathbb{R}^{n+1} \setminus h(\overline{B})$ is connected by Corollary 8.104, since $\overline{B}$ is homeomorphic to $\overline{B(0,1)}$ and $\overline{B(0,1)}$ does not separate $\mathbb{R}^{n+1}$,
(ii) $h(\overline{B} \setminus \partial B) = h(\overline{B}) \setminus h(\partial B)$ is connected since it is homeomorphic to $B(x, \epsilon)$.
By writing
$$\mathbb{R}^{n+1} \setminus h(\partial B) = \big(\mathbb{R}^{n+1} \setminus h(\overline{B})\big) \cup h(\overline{B} \setminus \partial B)$$
we see that $\mathbb{R}^{n+1} \setminus h(\partial B)$ is the union of two nonempty, disjoint connected sets, that are necessarily the connected components of $\mathbb{R}^{n+1} \setminus h(\partial B)$; since $h(\partial B)$ is compact, they are also open in $\mathbb{R}^{n+1}$. Thus we can take $W := h(\overline{B} \setminus \partial B)$. □
A trivial consequence of the domain invariance theorem is that if $A$ is any subset of $\mathbb{R}^{n+1}$ and $h : A \to \mathbb{R}^{n+1}$ is a homeomorphism between $A$ and its image $h(A)$, then $h$ maps the interior of $A$ onto the interior of $h(A)$ and the boundary of $A$ onto the boundary of $h(A)$. Using Theorem 8.108 we can also prove
8.109 Theorem. $\mathbb{R}^n$ and $\mathbb{R}^m$ are not homeomorphic if $n \neq m$.
Proof. Suppose $m > n$. If $\mathbb{R}^n$ were homeomorphic to $\mathbb{R}^m$, then the image of $\mathbb{R}^n$ into $\mathbb{R}^m$ under such homeomorphism would be open in $\mathbb{R}^m$. However, the image is not open under the map $(x_1, \dots, x_n) \to (x_1, \dots, x_n, 0, \dots, 0)$. □
8.3 Exercises
8.110 ¶ Euler's formula. Prove Euler's formula for convex polyhedra in $\mathbb{R}^3$: $V - E + F = 2$, where $V := \#$ vertices, $E := \#$ edges, $F := \#$ faces, see Theorem 6.60 of [GM1]. [Hint: By taking out a face, deform the polyhedral surface into a plane polyhedral surface for which $V - E + F$ decreases by one. Thus it suffices to show that for the plane polyhedral surface we have $V - E + F = 1$. Triangularize the faces, noticing that this does not change $V - E + F$; eliminate the triangles from the exterior, which again does not change $V - E + F$, reducing in this way to a single triangle for which $V - E + F = 3 - 3 + 1 = 1$.]
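Euler's formula is easy to test on examples. The following sketch is an illustration only: the function name `euler_characteristic` and the vertex labelings of the tetrahedron and the cube are choices made here. It counts vertices, edges and faces from a list of faces, each given as a cycle of vertex labels.

```python
def euler_characteristic(faces):
    """Return V - E + F for a polyhedral surface given by its faces.

    Each face is a tuple of vertex labels listed in cyclic order;
    an edge joins consecutive labels of a face (and the last to the first).
    """
    vertices = set()
    edges = set()
    for face in faces:
        k = len(face)
        for i in range(k):
            a, b = face[i], face[(i + 1) % k]
            vertices.add(a)
            edges.add(frozenset((a, b)))
    return len(vertices) - len(edges) + len(faces)


# Tetrahedron on vertices 0..3.
TETRAHEDRON = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

# Cube: bottom square 0..3, top square 4..7, vertex i + 4 above vertex i.
CUBE = [
    (0, 1, 2, 3), (4, 5, 6, 7),
    (0, 1, 5, 4), (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7),
]
```

Removing one face, as in the first step of the hint, lowers the count by one: for the cube without its top face one gets $8 - 12 + 5 = 1$.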
8.111 ¶. Prove
Proposition. Let $A$ be an open set of $\mathbb{C}$. $A$ is simply connected if and only if $A$ is path-connected and $\mathbb{C} \setminus A$ has no compact connected components.
[Hint: Use Jordan's theorem to show that $A^c$ has a bounded connected component if $A$ is not simply connected. To prove the converse, use that $\mathbb{R}^2 \setminus \{x_0\}$ is not simply connected.]

8.112 ¶. Prove
Theorem (Perron-Frobenius). Let $A = [a_{ij}]$ be an $n \times n$ matrix with $a_{ij} \ge 0$ $\forall i,j$. Then $A$ has an eigenvector $x$ with nonnegative coordinates corresponding to a nonnegative eigenvalue.
[Hint: If $Ax = 0$ for some $x \in D := \{x \in \mathbb{R}^n \mid x^i \ge 0\ \forall i,\ \sum_{i=1}^n x^i = 1\}$ we have finished the proof. Otherwise $f(x) := Ax/\big(\sum_{i}(Ax)^i\big)$ has a fixed point in $D$.]

8.113 ¶. Prove
Theorem (Rouché). Let $B = B(0,R)$ be a ball in $\mathbb{R}^n$ with center at the origin. Let $f, g \in C^0(\overline{B})$ with $|g(x)| < |f(x)|$ on $\partial B$. Then $\deg(f,0) = \deg(f+g,0)$.
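The map $f(x) = Ax/\sum_i (Ax)^i$ in the hint to Exercise 8.112 can be explored numerically. The sketch below is an illustration under stated assumptions and not the argument of the hint: Brouwer's theorem gives only the existence of a fixed point, whereas here we simply iterate $f$, which is the classical power method and converges for the strictly positive matrix used in the example. The names `simplex_map` and `perron_vector` are introduced here.

```python
def simplex_map(A, x):
    """Return (f(x), s) with f(x) = A x / s and s the sum of the entries of A x."""
    y = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    s = sum(y)
    if s == 0.0:
        # For x in the simplex D this means A x = 0: x is an eigenvector for 0.
        raise ValueError("A x = 0: x is an eigenvector for the eigenvalue 0")
    return [yi / s for yi in y], s


def perron_vector(A, iterations=200):
    """Iterate f starting from the barycenter of the simplex D.

    At a fixed point x = f(x) one has A x = s x, so s is the eigenvalue.
    Convergence of the iteration is an assumption on A (true for matrices
    with strictly positive entries); it is not guaranteed in general.
    """
    n = len(A)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        x, lam = simplex_map(A, x)
    return x, lam
```

For $A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$ the iteration approaches a vector of $D$ with positive coordinates, and the returned number approaches the largest eigenvalue $(5 + \sqrt{5})/2$.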
Part III
Continuity in Infinite-Dimensional Spaces
Vito Volterra (1860-1940), David Hilbert (1862-1943) and Stefan Banach (1892-1945).
9. Spaces of Continuous Functions, Banach Spaces and Abstract Equations
The combination of the structure of a vector space with the structure of a metric space naturally produces the structure of a normed space and a Banach space, i.e., of a complete linear normed space. The abstract definition of a linear normed space first appears around 1920 in the works of Stefan Banach (1892-1945), Hans Hahn (1879-1934) and Norbert Wiener (1894-1964). In fact, it is in these years that the Polish school around Banach discovered the principles and laid the foundation of what we now call linear functional analysis. Here we shall restrict ourselves to introducing some definitions and illustrating some basic facts in Sections 9.1 and 9.4. Important examples of Banach spaces are provided by spaces of continuous functions that play a relevant role in several problems. In Section 9.3 we shall discuss the completeness of these spaces, some compactness criteria for subsets of them, in particular the Ascoli-Arzelà theorem, and finally the density of subspaces of smoother functions in the class of continuous functions, as in the Stone-Weierstrass theorem. Finally, Section 9.5 is dedicated to establishing some principles that ensure the existence of solutions of functional equations in a general context. We shall discuss the fixed point theorems of Banach and of Caccioppoli-Schauder, the Leray-Schauder principle and the method of super- and subsolutions. Later, in Chapter 11 we shall discuss some applications of these principles.
9.1 Linear Normed Spaces
9.1.1 Definitions and basic facts
9.1 Definition. Let $X$ be a linear space over $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. A norm on $X$ is a function $\|\ \| : X \to \mathbb{R}_+$ satisfying the following properties
(i) $\|x\| \in \mathbb{R}$ $\forall x \in X$,
(ii) $\|x\| \ge 0$ and $\|x\| = 0$ if and only if $x = 0$,
(iii) $\|\lambda x\| = |\lambda|\,\|x\|$ $\forall x \in X$, $\forall \lambda \in \mathbb{K}$,
(iv) $\|x + y\| \le \|x\| + \|y\|$ $\forall x, y \in X$.
Figure 9.1. Stefan Banach (1892-1945) and the frontispiece of the Theorie des operations lineaires.
If $\|\ \|$ is a norm on $X$, we say that $(X, \|\ \|)$ is a linear normed space or simply that $X$ is a normed space with norm $\|\ \|$. Let $X$ be a linear space. A norm $\|\ \|$ on $X$ induces a natural distance on $X$, defined by
$$d(x,y) := \|x - y\| \qquad \forall x, y \in X,$$
which is invariant by translations, i.e., $d(x+z, y+z) = d(x,y)$ $\forall x, y, z \in X$. Therefore, topological notions such as open sets, closed sets, compact sets, convergence of sequences, etc., and metric notions, such as completeness and Cauchy sequences, see Chapter 5, are well defined in a linear normed space. For instance, if $X$ is a normed space with norm $\|\ \|$, we say that $\{x_n\} \subset X$ converges to $x \in X$ if $\|x_n - x\| \to 0$ as $n \to \infty$. Notice also that the norm $\|\ \| : X \to \mathbb{R}$ is a continuous function and actually a Lipschitz-continuous function,
$$\big|\, \|x\| - \|y\| \,\big| \le \|x - y\|,$$
see Example 5.25.
9.2 Definition. A real (complex) normed space $(X, \|\ \|)$ that is complete with respect to the distance $d(x,y) := \|x - y\|$ is called a real (complex) Banach space.
9.3 Remark. By Hausdorff's theorem, see Chapter 5, every normed linear space $X$ can be completed into a metric space, that is, $X$ is homeomorphic to a dense subset of a complete metric space. Indeed, the completed metric space and the homeomorphism inherit the linear structure, as one easily
sees. Thus every normed space $X$ is isomorphic to a dense subset of a Banach space.
9.4 Example. With the notation above:
(i) $\mathbb{R}$ with the Euclidean norm $|x|$ is a Banach space. In fact, $|x|$ is a norm on $\mathbb{R}$, and Cauchy sequences converge in norm, compare Theorem 2.35 of [GM2].
(ii) $\mathbb{R}^n$, $n \ge 1$, is a normed space with the Euclidean norm
$$|x| := \Big(\sum_{i=1}^n |x^i|^2\Big)^{1/2}, \qquad x = (x^1, x^2, \dots, x^n),$$
see Example 3.2. It is also a Banach space, see Section 5.3.
(iii) Similarly, $\mathbb{C}^n$ is a Banach space with the norm $\|z\| := \big(\sum_{i=1}^n |z^i|^2\big)^{1/2}$, $z = (z^1, z^2, \dots, z^n)$.
9.5 ¶ Convex sets. In a linear space, we may consider convex subsets and convex functions.
Definition. $E \subset X$ is convex if $\lambda x + (1 - \lambda) y \in E$ for all $x, y \in E$ and for all $\lambda \in [0,1]$. $f : X \to \mathbb{R}$ is called convex if $f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$ for all $x, y \in X$ and all $\lambda \in [0,1]$.
Show that the balls $B(x_0, r) := \{x \in X \mid \|x - x_0\| < r\}$ of a normed space $X$ are convex.
a. Norms induced by inner and Hermitian products
Let $X$ be a real (complex) linear space with an inner (Hermitian) product $(x|y)$. Then $\|x\| := \sqrt{(x|x)}$ is a norm on $X$, see Propositions 3.7 and 3.16. But in general, norms on linear vector spaces are not induced by inner or Hermitian products.
9.6 Proposition. Let $\|\ \|$ be a norm on a real (respectively, complex) normed linear space $X$. A necessary and sufficient condition for the existence of an inner (Hermitian) product $(\ |\ )$ such that $\|x\|^2 = (x|x)$ $\forall x \in X$ is that the parallelogram law holds,
$$\|x + y\|^2 + \|x - y\|^2 = 2\big(\|x\|^2 + \|y\|^2\big) \qquad \forall x, y \in X.$$
9.7 ¶. Show Proposition 9.6. [Hint: First show that if $\|x\|^2 = (x|x)$, then the parallelogram law holds. Conversely, in the real case set
$$(x|y) := \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big)$$
and show that it is an inner product, while in the complex case, set
$$(x|y) := \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big) + \frac{i}{4}\big(\|x + iy\|^2 - \|x - iy\|^2\big),$$
and show that $(x|y)$ is a Hermitian product.]

9.8 ¶. For $p \ge 1$, $\|x\|_p := \big(\sum_{i=1}^n |x^i|^p\big)^{1/p}$, $x = (x^1, x^2, \dots, x^n)$, is a norm in $\mathbb{R}^n$, cf. Exercise 5.13. Show that it is induced by an inner product if and only if $p = 2$.
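The parallelogram law gives a quick numerical test for Exercise 9.8. The sketch below is an illustration only: the names `p_norm` and `parallelogram_defect` are introduced here, and a nonzero defect on a single pair of vectors already shows that $\|\ \|_p$ cannot come from an inner product, while a zero defect on one pair proves nothing by itself.

```python
def p_norm(x, p):
    """The norm ||x||_p = (sum_i |x^i|^p)^(1/p) on R^n, for p >= 1."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)


def parallelogram_defect(x, y, p):
    """Return ||x+y||_p^2 + ||x-y||_p^2 - 2 (||x||_p^2 + ||y||_p^2)."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2.0 * (p_norm(x, p) ** 2 + p_norm(y, p) ** 2))
```

With $x = (1,0)$ and $y = (0,1)$ the defect equals $2 \cdot 2^{2/p} - 4$, which vanishes only for $p = 2$: it is $4$ for $p = 1$ and negative for $p > 2$.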
b. Equivalent norms
9.9 Definition. Two norms $\|\ \|_1$ and $\|\ \|_2$ on a linear vector space $X$ are said to be equivalent if there exist two constants $0 < m \le M$ such that
$$m\|x\|_1 \le \|x\|_2 \le M\|x\|_1 \qquad \forall x \in X. \tag{9.1}$$
If $\|\ \|_1$ and $\|\ \|_2$ are equivalent, then trivially the normed spaces $(X, \|\ \|_1)$ and $(X, \|\ \|_2)$ have the same convergent sequences (to the same limits) and the same Cauchy sequences. Therefore $(X, \|\ \|_1)$ is a Banach space if and only if $(X, \|\ \|_2)$ is a Banach space. Since the induced distances are translation invariant, we have the following.
9.10 Proposition. Let $\|\ \|_1$ and $\|\ \|_2$ be two norms on a linear vector space $X$. The following statements are equivalent
(i) $\|\ \|_1$ and $\|\ \|_2$ are equivalent norms,
(ii) the relative induced distances are topologically equivalent,
(iii) for any $\{x_n\} \subset X$, $\|x_n\|_1 \to 0$ if and only if $\|x_n\|_2 \to 0$.
Proof. Obviously (i) => (ii) => (iii). Let us prove that (iii) => (i). (iii) implies that the identity map $i : (X, \|\ \|_1) \to (X, \|\ \|_2)$ is continuous at 0. Therefore there exists $\delta > 0$ such that $\|z\|_2 \le 1$ if $\|z\|_1 \le \delta$. For $x \in X$, $x \neq 0$, if $z := (\delta/\|x\|_1)\,x$ we have $\|z\|_1 = \delta$, hence $\|z\|_2 \le 1$, i.e., $\|x\|_2 \le \frac{1}{\delta}\|x\|_1$. Exchanging the role of $\|\ \|_1$ and $\|\ \|_2$ and repeating the argument, we also get the inequality
$$\|x\|_1 \le \frac{1}{\delta_1}\|x\|_2 \qquad \forall x \in X$$
for some $\delta_1 > 0$, hence (i) is proved. □

9.11 ¶. Let $X$ and $Y$ be two Banach spaces. Show that their Cartesian product, called the direct sum, is a Banach space with the norm $\|(x,y)\|_{1, X \times Y} := \|x\|_X + \|y\|_Y$. Show that
$$\|(x,y)\|_{p, X \times Y} := \big(\|x\|_X^p + \|y\|_Y^p\big)^{1/p},\ p \ge 1, \qquad \|(x,y)\|_{\infty, X \times Y} := \max\big(\|x\|_X, \|y\|_Y\big),$$
are equivalent norms.
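For Exercise 9.11 the constants in (9.1) can be guessed and tested numerically. The sketch below is an illustration under stated assumptions: only the two nonnegative numbers $a = \|x\|_X$ and $b = \|y\|_Y$ enter the product norms, so we sample them at random and check the chain $\max(a,b) \le (a^p + b^p)^{1/p} \le a + b \le 2\max(a,b)$. The names `product_norms` and `check_equivalence` are introduced here, and a small relative tolerance absorbs rounding errors.

```python
import random


def product_norms(a, b, p):
    """Return the 1-, p- and infinity-product norms from a = ||x||_X, b = ||y||_Y."""
    return a + b, (a ** p + b ** p) ** (1.0 / p), max(a, b)


def check_equivalence(p, samples=1000, seed=0):
    """Check max <= p-norm <= 1-norm <= 2 max on random nonnegative pairs."""
    rng = random.Random(seed)
    slack = 1.0 + 1e-12  # tolerance for floating point rounding
    for _ in range(samples):
        a, b = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
        n_one, n_p, n_inf = product_norms(a, b, p)
        if not (n_inf <= n_p * slack and n_p <= n_one * slack and n_one <= 2.0 * n_inf):
            return False
    return True
```

A successful check is of course not a proof; it only suggests that $m = 1$ and $M = 2$ are admissible constants between the $\infty$-norm and the $1$-norm.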
c. Series in normed spaces
In a linear vector space $X$, finite sums of elements of $X$ are elements of $X$. Therefore, given a sequence $\{x_n\}$ in $X$, we can consider the series $\sum_{n=0}^\infty x_n$, i.e., the sequence of partial sums $\big\{\sum_{k=0}^n x_k\big\}$. If, moreover, $X$ is a normed space, we can inquire about the convergence of series in $X$.
9.12 Definition. Let $X$ be a normed vector space with norm $\|\ \|$. A series $\sum_{n=0}^\infty x_n$, $x_n \in X$, is said to be convergent in $X$ if the sequence of its partial sums, $s_n := \sum_{k=0}^n x_k$, converges in $X$, i.e., there exists $s \in X$ such that $\|s_n - s\| \to 0$. In this case we write
$$s = \sum_{k=0}^\infty x_k$$
instead of $\|s_n - s\| \to 0$ and $s$ is said to be the sum of the series.
9.13 Remark. Writing $s = \sum_{k=0}^\infty x_k$ might make one forget that the sum of the series $s$ is a limit. In dubious cases, for instance if more than one convergence is involved, it is worth specifying in which normed space $(X, \|\ \|_X)$, equivalently with respect to which norm $\|\ \|_X$, the limit has been computed by writing
$$s = \sum_{k=0}^\infty x_k \quad \text{in the norm of } X, \qquad \text{or} \qquad s = \sum_{k=0}^\infty x_k \quad \text{in } X,$$
or, even better, writing
$$\Big\|\sum_{k=0}^n x_k - s\Big\|_X \to 0.$$
9.14 Definition. Let $X$ be a normed space with norm $\|\ \|$. We say that the series $\sum_{n=0}^\infty x_n$, $\{x_n\} \subset X$, is absolutely convergent if the series of the norms $\sum_{n=0}^\infty \|x_n\|$ converges in $\mathbb{R}$.
We have seen, compare Proposition 2.39 of [GM2], that every absolutely convergent series in $\mathbb{R}$ is convergent. In general, we have the following.
9.15 Proposition. Let $X$ be a normed space with norm $\|\ \|$. Then all the absolutely convergent series of elements of $X$ converge in $X$ if and only if $X$ is a Banach space. Moreover, if $\sum_{n=0}^\infty x_n$ is convergent, then
$$\Big\|\sum_{k=p}^\infty x_k\Big\| \le \sum_{k=p}^\infty \|x_k\|.$$
Proof. Let $X$ be a Banach space, and let $\sum_{k=0}^\infty x_k$ be absolutely convergent. The sequence of partial sums of $\sum_{k=0}^\infty \|x_k\|$ is a Cauchy sequence in $\mathbb{R}$, hence $\sum_{k=p}^q \|x_k\| \to 0$ as $p, q \to \infty$. From the triangle inequality we infer that
$$\Big\|\sum_{k=p}^q x_k\Big\| \le \sum_{k=p}^q \|x_k\|,$$
hence $\big\|\sum_{k=p}^q x_k\big\| \to 0$ as $p, q \to \infty$, i.e., the sequence of partial sums of $\sum_{k=0}^\infty x_k$ is a Cauchy sequence in $X$. Consequently, it converges in norm in $X$, since $X$ is a Banach space.
Conversely, let $\{x_k\} \subset X$ be a Cauchy sequence. By induction select $n_1$ such that $\|x_n - x_{n_1}\| < 1$ if $n \ge n_1$, then $n_2 > n_1$ such that $\|x_n - x_{n_2}\| < 1/2$ if $n \ge n_2$ and so on. Then $\{x_{n_k}\}$ is a subsequence of $\{x_k\}$ such that
$$\|x_{n_{k+1}} - x_{n_k}\| < 2^{-k+1} \qquad \forall k,$$
and consequently the series $\sum_{k=1}^\infty (x_{n_{k+1}} - x_{n_k})$ is absolutely convergent, hence convergent to a point $y \in X$ by assumption, i.e.,
$$\Big\|\sum_{k=1}^p (x_{n_{k+1}} - x_{n_k}) - y\Big\| \to 0 \qquad \text{as } p \to +\infty.$$
Since the sum telescopes, this simply amounts to $\|x_{n_{p+1}} - x\| \to 0$, $x := y + x_{n_1}$, so $\{x_{n_k}\}$ converges to $x$, and, as $\{x_n\}$ is a Cauchy sequence, we conclude that in fact the entire sequence $\{x_n\}$ converges to $x$. Finally, the estimate follows from the triangle inequality
$$\Big\|\sum_{k=p}^q x_k\Big\| \le \sum_{k=p}^q \|x_k\|$$
as $q \to \infty$, since we are able to pass to the limit as $\sum_{k=0}^\infty x_k$ converges. □
9.16 ¶ Commutativity. Let $X$ be a Banach space and let $\{x_n\} \subset X$ be such that $\sum_n x_n$ is absolutely convergent. Then $\sum_n x_{\sigma(n)}$, for $\{x_{\sigma(n)}\}$ a rearrangement of $\{x_n\}$, is also absolutely convergent, and $\sum_n x_n = \sum_n x_{\sigma(n)}$.

9.17 ¶ Associativity. Let $X$ be a Banach space and let $\{x_n\} \subset X$. Let $\{I_n\}$ be a sequence of nonempty subsets of $\mathbb{N}$ with $I_n \cap I_m = \emptyset$ $\forall n \neq m$ and $\bigcup_n I_n = \mathbb{N}$. If $\sum_n x_n$ is absolutely convergent, then
$$\sum_{n} \Big(\sum_{k \in I_n} x_k\Big)$$
is absolutely convergent and $\sum_{k=0}^\infty x_k = \sum_{n} \big(\sum_{k \in I_n} x_k\big)$.
d. Finite-dimensional normed linear spaces
In a finite-dimensional vector space, there is only one topology induced by a norm: the Euclidean topology. In fact, if $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, we have the following.
9.18 Theorem. In $\mathbb{K}^n$ any two norms are equivalent.
Proof. It suffices to prove that any norm $p$ on $\mathbb{K}^n$ is equivalent to the Euclidean norm $|\ |$. Let $(e_1, e_2, \dots, e_n)$ be the standard basis of $\mathbb{K}^n$. If $x = (x^1, x^2, \dots, x^n)$ and $y = (y^1, y^2, \dots, y^n)$, we have
$$|p(x) - p(y)| \le p(x - y) = p\Big(\sum_{i=1}^n (x^i - y^i) e_i\Big) \le \sum_{i=1}^n |x^i - y^i|\, p(e_i) \le \Big(\sum_{i=1}^n p(e_i)^2\Big)^{1/2} |x - y|,$$
hence $p : \mathbb{K}^n \to \mathbb{R}_+$ is continuous. Since the unit sphere $B := \{x \in \mathbb{K}^n \mid |x| = 1\}$ of $\mathbb{K}^n$ is compact, we infer that $p$ attains a maximum value $M$ and a minimum value $m$ on $B$. Since the minimum value is attained, we infer that $m > 0$, otherwise $p$ would be zero at some point of $B$. Therefore $0 < m \le p(x) \le M$ on $B$, and, on account of the 1-homogeneity of the norm,
$$m|x| \le p(x) \le M|x| \qquad \forall x \in \mathbb{K}^n,$$
i.e., $p$ is equivalent to the Euclidean norm. □
9.19 Corollary. Every finite-dimensional normed space $X$ is a Banach space. In particular, any finite-dimensional subspace of a normed space is closed, and $K \subset X$ is compact if and only if $K$ is closed and bounded.
Proof. Let $p$ be a norm on $X$ and let $\mathcal{E} : \mathbb{K}^n \to X$ be a coordinate map on $X$. Since $\mathcal{E}$ is linear and nonsingular, $p \circ \mathcal{E}$ is a norm on $\mathbb{K}^n$ and $\mathcal{E}$ is trivially an isometry between the two normed spaces $(\mathbb{K}^n, p \circ \mathcal{E})$ and $(X, p)$. Since $p \circ \mathcal{E}$ is equivalent to the Euclidean norm, $(\mathbb{K}^n, p \circ \mathcal{E})$ is a Banach space and therefore $(X, p)$ is a Banach space, too. The second claim is obvious. □

9.20 ¶. Let $X$ be a normed space of dimension $n$. Then any system of coordinates $\mathcal{E} : \mathbb{K}^n \to X$ is a linear continuous map between $\mathbb{K}^n$ with the Euclidean metric and the normed space $X$.
A key ingredient in the proof of Theorem 9.18 is the fact that the closed unit ball in $\mathbb{K}^n$ is compact. This property is characteristic of finite-dimensional spaces.
9.21 Theorem (Riesz). The closed unit ball of a normed linear space $X$ is compact if and only if $X$ is finite dimensional.
For the proof we need the following lemma, due to Frigyes Riesz (1880-1956), which in this context plays the role of the orthogonal projection theorem in spaces with inner or Hermitian products, see Theorem 3.27 and Chapter 10.
9.22 Lemma. Let $Y$ be a closed proper linear subspace of a normed space $(X, \|\ \|)$. Then there exists $\bar{x} \in X$ such that $\|\bar{x}\| = 1$ and $\|\bar{x} - x\| \ge 1/2$ $\forall x \in Y$.
Proof. Take $x_0 \in X \setminus Y$ and define $d := \inf\{\|y - x_0\| \mid y \in Y\}$. We have $d > 0$, otherwise we could find $\{y_n\} \subset Y$ with $y_n \to x_0$ and $x_0 \in Y$ since $Y$ is closed. Take $y_0 \in Y$ with $\|y_0 - x_0\| \le 2d$ and set $\bar{x} := \dfrac{x_0 - y_0}{\|x_0 - y_0\|}$. Clearly $\|\bar{x}\| = 1$ and $y_0 + y\,\|x_0 - y_0\| \in Y$ if $y \in Y$, hence
$$\|y - \bar{x}\| = \Big\| y - \frac{x_0 - y_0}{\|x_0 - y_0\|} \Big\| = \frac{\big\| y\,\|x_0 - y_0\| - x_0 + y_0 \big\|}{\|x_0 - y_0\|} \ge \frac{d}{2d} = \frac{1}{2}.$$
□
Proof of Theorem 9.21. Let $B := \{x \in X \mid \|x\| \le 1\}$. If $X$ has dimension $n$, and $\mathcal{E} : \mathbb{K}^n \to X$ is a system of coordinates, then $\mathcal{E}$ is an isomorphism, hence a homeomorphism. Since $B$ is bounded and closed, $\mathcal{E}^{-1}(B)$ is also bounded and closed, hence compact in $\mathbb{K}^n$, see Corollary 9.19. Therefore $B = \mathcal{E}(\mathcal{E}^{-1}(B))$ is compact in $X$. We now prove that $B$ is not compact if $X$ has infinite dimension. Take $x_1$ with $\|x_1\| = 1$. By Lemma 9.22, we find $x_2$ with $\|x_2\| = 1$ at distance at least $1/2$ from the subspace $\mathrm{Span}\{x_1\}$, in particular $\|x_1 - x_2\| \ge \frac{1}{2}$. Again by Lemma 9.22, we find $x_3$ with $\|x_3\| = 1$ at distance at least $1/2$ from $\mathrm{Span}\{x_1, x_2\}$, in particular $\|x_3 - x_1\| \ge \frac{1}{2}$ and $\|x_3 - x_2\| \ge \frac{1}{2}$. Iterating this procedure (the spanned subspaces are finite dimensional, hence closed by Corollary 9.19, and proper since $X$ has infinite dimension) we construct a sequence $\{x_n\}$ of points in the unit sphere such that $\|x_i - x_j\| \ge \frac{1}{2}$ $\forall i, j$, $i \neq j$. Therefore $\{x_n\}$ has no convergent subsequence, hence neither the unit sphere nor the closed unit ball $B$ is compact. □
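A concrete instance of such a sequence is given by the standard unit vectors of a sequence space: they have norm one and mutual distance $2^{1/p}$ in the $p$-norm, so no subsequence can be Cauchy. The sketch below is an illustration under stated assumptions: finitely supported sequences are modeled as dictionaries from indices to values, and the names `basis_vector` and `lp_distance` are introduced here.

```python
def basis_vector(i):
    """The sequence that is 1 at index i and 0 elsewhere (finitely supported)."""
    return {i: 1.0}


def lp_distance(u, v, p):
    """The p-norm of u - v for finitely supported sequences u, v, with p >= 1."""
    keys = set(u) | set(v)
    return sum(abs(u.get(k, 0.0) - v.get(k, 0.0)) ** p for k in keys) ** (1.0 / p)
```

For $p = 2$ every pair of distinct unit vectors is at distance $\sqrt{2} \ge 1/2$, in agreement with the construction in the proof.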
9.23 Remark. We emphasize that in any infinite-dimensional normed space we have constructed a sequence of unit vectors no subsequence of which is a Cauchy sequence.
9.1.2 A few examples In Sections 9.2 and 9.4 we shall discuss respectively, the relevant Banach spaces of linear continuous operators and of bounded continuous functions. Here we begin with a few examples.
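a. The space $\ell_p$, $1 \le p < \infty$
Let $(Y, \|\ \|_Y)$ be a normed space and $p \in \mathbb{R}$, $p \ge 1$. For a sequence $\xi = \{\xi_i\} \subset Y$ we define
$$\|\xi\|_{\ell_p(Y)} := \Big(\sum_{i=1}^\infty \|\xi_i\|_Y^p\Big)^{1/p}.$$
Then the space of sequences
$$\ell_p(Y) := \big\{\xi = \{\xi_i\} \subset Y \ \big|\ \|\xi\|_{\ell_p(Y)} < +\infty\big\}$$
is a linear space with norm $\|\xi\|_{\ell_p(Y)}$. Moreover, we have the following.
9.24 Proposition. $\ell_p(Y)$ is a Banach space if $Y$ is a Banach space.
Proof. Let $\{\xi_k\}$, $\xi_k := \{\xi_i^{(k)}\}$, be a Cauchy sequence in $\ell_p(Y)$. Since for any $i$
$$\|\xi_i^{(k)} - \xi_i^{(h)}\|_Y \le \|\xi_k - \xi_h\|_{\ell_p(Y)}, \tag{9.2}$$
the sequence $\{\xi_i^{(k)}\}_k$ is a Cauchy sequence in $Y$, hence it has a limit $\xi_i \in Y$,
$$\|\xi_i^{(k)} - \xi_i\|_Y \to 0 \qquad \text{as } k \to \infty.$$
We then set $\xi := \{\xi_i\}$ and prove that $\{\xi_k\}$ converges to $\xi$ in $\ell_p(Y)$. Fix $\epsilon > 0$, then for all $n, m \ge n_0(\epsilon)$ we have
$$\|\xi_n - \xi_m\|_{\ell_p(Y)} < \epsilon,$$
hence, for all $r \in \mathbb{N}$,
$$\sum_{i=1}^r \|\xi_i^{(n)} - \xi_i^{(m)}\|_Y^p < \epsilon^p,$$
and, since $x \to \|z - x\|_Y$ is continuous in $Y$, as $m \to \infty$,
$$\sum_{i=1}^r \|\xi_i^{(n)} - \xi_i\|_Y^p \le \epsilon^p$$
for $n \ge n_0(\epsilon)$ and all $r$. Letting $r \to \infty$, we find $\|\xi_n - \xi\|_{\ell_p(Y)} \le \epsilon$ for $n \ge n_0$, i.e., $\xi_n \to \xi$ in $\ell_p(Y)$. Finally, the triangle inequality shows that $\xi \in \ell_p(Y)$. □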
b. A normed space that is not Banach
The map
$$f \mapsto \|f\|_p := \Big(\int_a^b |f(t)|^p\,dt\Big)^{1/p}, \qquad p \ge 1,$$
defines a norm on the space of continuous functions $C^0([a,b],\mathbb{R})$. Indeed, if $Y$ is a linear normed space with norm $\|\cdot\|_Y$,
$$\|f\|_p = \|f\|_{L^p(]a,b[,Y)} := \Big(\int_a^b \|f(t)\|_Y^p\,dt\Big)^{1/p}, \qquad p \ge 1,$$
defines a norm on the space of continuous functions with values in $Y$. In fact, $t \mapsto \|f(t)\|_Y$ is a continuous real-valued map, hence Riemann integrable, thus $\|f\|_p$ is well defined. Clearly $\|f\|_p = 0$ if and only if $f(t) = 0$ $\forall t$, and $\|f\|_p$ is positively homogeneous of degree 1. It remains to prove the triangle inequality for $f \mapsto \|f\|_p$, called the Minkowski inequality,
$$\|f + g\|_p \le \|f\|_p + \|g\|_p \qquad \forall f, g \in C^0([a,b],Y).$$
The claim is trivial if one of the two functions is zero. Otherwise, we use the convexity of $\phi : Y \to \mathbb{R}$ where $\phi(y) := \|y\|_Y^p$, i.e.,
$$\|\lambda x + (1-\lambda) y\|_Y^p \le \lambda \|x\|_Y^p + (1-\lambda)\|y\|_Y^p \qquad \forall x, y \in Y,\ \forall \lambda \in [0,1], \qquad (9.3)$$
with $x = f(t)/\|f\|_p$, $y = g(t)/\|g\|_p$ and $\lambda = \|f\|_p/(\|f\|_p + \|g\|_p)$, and we integrate in $t$ over $[a,b]$ to get
$$\frac{1}{(\|f\|_p + \|g\|_p)^p} \int_a^b \|f(t) + g(t)\|_Y^p\,dt \le 1,$$
that is Minkowski's inequality. It turns out that $C^0([a,b],\mathbb{R})$ normed with $\|\cdot\|_p$ is not complete, see Example 9.25. Its completion is denoted by $L^p(]a,b[)$. Its characterization is one of the outcomes of the Lebesgue theory of integration.
9.25 Example. Define, see Figure 9.2,
$$f_n(t) = \begin{cases} 0 & \text{if } -1 \le t \le 0, \\ nt & \text{if } 0 < t \le 1/n, \\ 1 & \text{if } 1/n < t \le 1, \end{cases} \qquad \text{and} \qquad f(t) = \begin{cases} 0 & \text{if } -1 \le t \le 0, \\ 1 & \text{if } 0 < t \le 1. \end{cases}$$
The sequence $\{f_n\}$ converges to $f$ in the norm $\|\cdot\|_p$, since
$$\|f_n - f\|_p^p = \int_0^{1/n} (1 - nt)^p\,dt = \frac{1}{(p+1)\,n} \to 0.$$
If $g \in C^0([-1,1])$ is the limit of $\{f_n\}$, then $\|f - g\|_p = 0$; consequently $g = f = 0$ on $[-1,0]$ and $g = f = 1$ on $]0,1]$, a contradiction, since $g$ is continuous.
Figure 9.2. Pointwise approximation of the Heaviside function.
c. Spaces of bounded functions
Let $A$ be any set and $Y$ be a normed space with norm $\|\cdot\|_Y$. The uniform norm of a function $f : A \to Y$ is defined by the number (possibly infinity)
$$\|f\|_{B(A,Y)} := \sup_{x \in A} \|f(x)\|_Y.$$
$\|\cdot\|_{B(A,Y)}$ defines a norm on the space of bounded functions $f : A \to Y$,
$$B(A,Y) := \big\{ f : A \to Y \;\big|\; \|f\|_{B(A,Y)} < +\infty \big\},$$
which then becomes a normed space. The norm $\|f\|_{B(A,Y)}$ on $B(A,Y)$ is also denoted by $\|f\|_{\infty,A}$ or even by $\|f\|_\infty$ when no confusion can arise. The topology induced on $B(A,Y)$ by the uniform norm is called the topology of uniform convergence, see Example 5.19. In particular, we say that a sequence $\{f_n\} \subset B(A,Y)$ converges uniformly in $A$ to $f \in B(A,Y)$, and we write $f_n(x) \to f(x)$ uniformly in $A$, if $\|f_n - f\|_{B(A,Y)} \to 0$.
9.26 Proposition. If $Y$ is a Banach space, then $B(A,Y)$ is a Banach space.
Proof. Let $\{f_n\} \subset B(A,Y)$ be a Cauchy sequence with respect to $\|\cdot\|_\infty$. For any $\epsilon > 0$ there is an $n_0$ such that $\|f_n - f_m\|_\infty < \epsilon$ for all $n, m \ge n_0$. Therefore, for all $x \in A$ and $n, m \ge n_0$,
$$\|f_n(x) - f_m(x)\|_Y < \epsilon. \qquad (9.4)$$
Consequently, for all $x \in A$, $\{f_n(x)\}$ is a Cauchy sequence in $Y$, hence it converges to an element $f(x) \in Y$. Letting $m \to \infty$ in (9.4), we find
$$\|f_n(x) - f(x)\|_Y \le \epsilon \qquad \forall n \ge n_0 \text{ and } \forall x \in A,$$
i.e., $\|f - f_n\|_\infty \le \epsilon$ for $n \ge n_0$, hence $\|f\|_\infty \le \|f_n\|_\infty + \epsilon$, i.e., $f \in B(A,Y)$, and $f_n \to f$ uniformly in $B(A,Y)$ since $\epsilon$ is arbitrary. □
9.27 ¶. Let $Y$ be finite dimensional and let $(e_1, e_2, \dots, e_n)$ be a basis of $Y$. We can write $f$ as $f(x) = f_1(x)e_1 + \dots + f_n(x)e_n$. Thus $f \in B(A,Y)$ if and only if all the components of $f$ are bounded real functions.
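d. The space $\ell_\infty(Y)$
A special case occurs when $A = \mathbb{N}$. In this case $B(A,Y)$ is the space of bounded sequences of $Y$, which we better denote by
$$\ell_\infty(Y) := B(\mathbb{N}, Y).$$
Therefore, by Proposition 9.26, $\ell_\infty(Y)$ is a Banach space with the uniform norm
$$\|\xi\|_{\ell_\infty(Y)} := \|\xi\|_{B(\mathbb{N},Y)} = \sup_i \|\xi_i\|_Y$$
if $Y$ is complete.
9.28 ¶. Show that for $1 \le p < q < \infty$ we have
(i) $\ell_p(\mathbb{R}) \subset \ell_q(\mathbb{R}) \subset \ell_\infty(\mathbb{R})$,
(ii) $\ell_p(\mathbb{R})$ is a proper subspace of $\ell_q(\mathbb{R})$,
(iii) the identity map $\mathrm{Id} : \ell_p(\mathbb{R}) \to \ell_q(\mathbb{R})$ is continuous,
(iv) $\ell_1(\mathbb{R})$ is a dense subset of $\ell_q(\mathbb{R})$ with respect to the convergence in $\ell_q(\mathbb{R})$.
9.29 ¶. Show that, if $p, q > 1$ and $1/p + 1/q = 1$, then
$$\sum_{n=1}^{\infty} |\xi_n \eta_n| \le \|\xi\|_{\ell_p(\mathbb{R})}\, \|\eta\|_{\ell_q(\mathbb{R})} \qquad (9.5)$$
for all $\{\xi_n\} \in \ell_p(\mathbb{R})$ and $\{\eta_n\} \in \ell_q(\mathbb{R})$. Moreover, show that
$$\|\xi\|_{\ell_p(\mathbb{R})} = \sup\Big\{ \sum_{n=1}^{\infty} \xi_n \eta_n \;\Big|\; \|\eta\|_{\ell_q(\mathbb{R})} \le 1 \Big\}. \qquad (9.6)$$
[Hint: For proving (9.5) use the Young inequality $ab \le a^p/p + b^q/q$. Using (9.5), show that $\ge$ holds in (9.6). By a suitable choice $\eta = \eta(\xi)$ and again using Young's inequality, finally show equality in (9.6).]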
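The duality in Exercise 9.29 can be probed numerically on finite sequences. The following is a minimal sketch, not a proof: the helper names `lp_norm` and `extremal_eta` are hypothetical, the sequences are finite and nonzero, and the candidate maximizer $\eta_n = \operatorname{sign}(\xi_n)|\xi_n|^{p-1}/\|\xi\|_p^{p-1}$ is one standard choice, assumed here rather than taken from the text.

```python
def lp_norm(xi, p):
    """Finite-sequence analogue of the l_p norm: (sum |xi_i|^p)^(1/p)."""
    return sum(abs(t) ** p for t in xi) ** (1.0 / p)


def extremal_eta(xi, p):
    """Candidate maximizer eta_n = sign(xi_n) |xi_n|^(p-1) / ||xi||_p^(p-1).

    Assumes xi is not the zero sequence (otherwise the norm vanishes).
    """
    norm = lp_norm(xi, p)
    sign = lambda t: (t > 0) - (t < 0)
    return [sign(t) * abs(t) ** (p - 1) / norm ** (p - 1) for t in xi]


p, q = 3.0, 1.5                      # conjugate exponents: 1/p + 1/q = 1
xi = [3.0, -4.0, 1.0]
eta = extremal_eta(xi, p)

pairing = sum(a * b for a, b in zip(xi, eta))   # sum of xi_n * eta_n
eta_norm = lp_norm(eta, q)                      # should be 1 up to rounding

# Hoelder's inequality (9.5) for an arbitrary second sequence
other = [0.5, 2.0, -1.0]
lhs = sum(abs(a * b) for a, b in zip(xi, other))
rhs = lp_norm(xi, p) * lp_norm(other, q)
```

With this choice the pairing reproduces $\|\xi\|_p$ while $\|\eta\|_q = 1$, which is the equality case one is asked to establish in (9.6).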
9.2 Spaces of Bounded and Continuous Functions
In this section we discuss some basic properties of the space of continuous and bounded functions from a metric space into a Banach space.
9.2.1 Uniform convergence
a. Uniform convergence
Let $X$ be a metric space and let $Y$ be a normed space with norm $\|\cdot\|_Y$. Then, as we have seen in Proposition 9.26, the space $B(X,Y)$ of bounded
functions from $X$ into $Y$ is a normed space with uniform norm, and $B(X,Y)$ is a Banach space provided $Y$ is a Banach space. We denote by $C_b(X,Y)$ the subspace of $B(X,Y)$ of bounded and continuous functions from $X$ into $Y$,
$$C_b(X,Y) := C^0(X,Y) \cap B(X,Y).$$
Observe that, by the Weierstrass theorem, $C_b(X,Y) = C^0(X,Y)$ if $X$ is compact, and that, trivially, $C_b(X,Y)$ is a normed space with uniform norm.
9.30 Proposition. $C_b(X,Y)$ is a closed subspace of $B(X,Y)$.
Proof. Let $\{f_n\} \subset C_b(X,Y)$ be such that $f_n \to f$ uniformly. For any $\epsilon > 0$, we choose $n_0 = n_0(\epsilon)$ such that $\|f - f_{n_0}\|_{\infty,X} < \epsilon$. It follows that
$$\|f\|_{\infty,X} \le \|f - f_{n_0}\|_{\infty,X} + \|f_{n_0}\|_{\infty,X} < +\infty,$$
i.e., $f \in B(X,Y)$. Moreover, since $f_{n_0}$ is continuous, for a fixed $x_0 \in X$ there exists $\delta > 0$ such that $\|f_{n_0}(x) - f_{n_0}(x_0)\|_Y < \epsilon$ whenever $x \in X$ and $d_X(x,x_0) < \delta$. Thus, for $d_X(x,x_0) < \delta$, we deduce that
$$\|f(x) - f(x_0)\|_Y \le \|f(x) - f_{n_0}(x)\|_Y + \|f_{n_0}(x) - f_{n_0}(x_0)\|_Y + \|f_{n_0}(x_0) - f(x_0)\|_Y < 3\epsilon,$$
i.e., $f$ is continuous at $x_0$. In conclusion, $f \in C^0(X,Y) \cap B(X,Y)$. □
Immediate consequences are the following corollaries.
9.31 Corollary. The uniform limit of a sequence of continuous functions is continuous.
9.32 Corollary. Let $X$ be a metric space and let $Y$ be a Banach space. Then $C_b(X,Y)$ with uniform norm is a Banach space.
9.33 ¶. Show that the space $C^1([a,b],\mathbb{R})$ of real functions $f : [a,b] \to \mathbb{R}$ which are of class $C^1$ is a Banach space with the norm
$$\|f\|_{C^1} := \sup_{x \in [a,b]} |f(x)| + \sup_{x \in [a,b]} |f'(x)|.$$
[Hint: If $\{f_k\}$ is a Cauchy sequence in $C^1([a,b])$, show that $f_k \to f$, $f_k' \to g$ uniformly. Then passing to the limit in
$$f_k(x) - f_k(a) = \int_a^x f_k'(t)\,dt,$$
show that $f$ is differentiable and $f'(x) = g(x)$ $\forall x$.]
9.34 ¶. Let $X$ be a metric space and let $Y$ be a complete metric space. Show that the space of bounded and continuous functions from $X$ into $Y$, endowed with the metric
$$d_\infty(f,g) := \sup_{x \in X} d_Y(f(x), g(x)),$$
is a complete metric space.
Figure 9.3. Consider a wave shaped function, e.g., $f(x) = 1/(1 + x^2)$, and its translates $f_n(x) := 1/(1 + (x + n)^2)$. Then $\|f_n\|_\infty = 1$, while $f_n(x) \to 0$ for all $x \in \mathbb{R}$.
b. Pointwise and uniform convergence
Let $A$ be a set and let $Y$ be a normed space normed by $|\cdot|_Y$. We say that $\{f_n\}$, $f_n : A \to Y$, converges pointwise to $f : A \to Y$ in $A$ if
$$|f_n(x) - f(x)|_Y \to 0 \qquad \forall x \in A,$$
while we say that $\{f_n\}$ converges uniformly to $f$ in $A$ if
$$\|f_n - f\|_{\infty,A} \to 0.$$
Since for all $x \in A$ we have $|f_n(x) - f(x)|_Y \le \|f_n - f\|_{\infty,A}$, uniform convergence trivially implies pointwise convergence, while the converse is generally false. For instance, a sequence of continuous functions may converge pointwise to a discontinuous function, and in this case the convergence cannot be uniform, as shown by the sequence $f_n(x) := x^n$, $x \in [0,1]$, that converges to the function $f$ which vanishes for all $x \in [0,1[$, while $f(1) = 1$. Of course, a sequence of continuous functions may also converge pointwise and not uniformly to a continuous function, compare Figure 9.3. More explicitly, $f_n \to f$ pointwise in $A$ if
$$\forall x \in A,\ \forall \epsilon > 0\ \exists\, \bar n = \bar n(x,\epsilon) \text{ such that } |f_n(x) - f(x)|_Y < \epsilon \text{ for all } n \ge \bar n,$$
while $f_n \to f$ uniformly in $A$ if
$$\forall \epsilon > 0\ \exists\, \bar n = \bar n(\epsilon) \text{ such that } |f_n(x) - f(x)|_Y < \epsilon \text{ for all } n \ge \bar n \text{ and all } x \in A.$$
Therefore, we have pointwise convergence or uniform convergence according to whether the index $\bar n$ depends on or is independent of the point $x$.
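The dependence of the index on the point can be made concrete for $f_n(x) = x^n$ on $[0,1[$, whose pointwise limit there is $0$. The sketch below is only illustrative; the function names and the choice of witness points $x_n = 2^{-1/n}$ are assumptions made for this example. At a fixed point the values decay, but at the moving points $x_n$ one always has $f_n(x_n) = 1/2$, so $\sup_{[0,1[} |f_n - 0| \ge 1/2$ for every $n$.

```python
def f_n(n, x):
    """The n-th function of the sequence f_n(x) = x^n."""
    return x ** n


def witness(n):
    """A point x_n in [0, 1[ chosen so that f_n(x_n) = 1/2."""
    return 0.5 ** (1.0 / n)


# Pointwise convergence: at the fixed point x = 0.9 the values tend to 0.
pointwise_value = f_n(200, 0.9)

# No uniform convergence on [0, 1[: the sup of |f_n - 0| is at least 1/2,
# because f_n takes the value 1/2 at x_n for every n.
sup_lower_bounds = [f_n(n, witness(n)) for n in (1, 10, 100, 1000)]
```

The index needed to make $x^n < \epsilon$ grows without bound as $x$ approaches 1, which is exactly the failure of uniformity described above.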
c. A convergence diagram
For series of functions $f_n : A \to Y$, we shall write
$$f(x) = \sum_{n=1}^{\infty} f_n(x) \qquad \forall x \in A$$
Figure 9.4. The relationships among the different notions of convergence for series of functions. [Diagram boxes: absolute convergence in $B(A,Y)$; absolute convergence in $Y$; uniform convergence, i.e., convergence in $B(A,Y)$; convergence in $Y$.]
if the partial sums converge pointwise in $A$, and
$$f(x) = \sum_{n=1}^{\infty} f_n(x) \qquad \text{uniformly in } A$$
if the partial sums converge uniformly. Simply writing $\sum_{n=1}^{\infty} f_n(x) = f(x)$ is, in fact, ambiguous. Summarizing, we introduced four different types of convergence for series of functions from a set $A$ into a normed space $Y$. More precisely, if $\{f_n\} \subset B(A,Y)$ and $f \in B(A,Y)$, we say that
(i) $\sum_{n=0}^{\infty} f_n(x)$ converges pointwise to $f$ if for all $x \in A$ we have $\sum_{n=0}^{\infty} f_n(x) = f(x)$ in $Y$, i.e., for all $x \in A$, $\|\sum_{n=0}^{p} f_n(x) - f(x)\|_Y \to 0$ as $p \to \infty$,
(ii) $\sum_{n=0}^{\infty} f_n(x)$ converges absolutely in $Y$ for all $x \in A$, i.e., for any fixed $x \in A$, the series of nonnegative real numbers $\sum_{n=0}^{\infty} \|f_n(x)\|_Y$ converges,
(iii) $\sum_{n=0}^{\infty} f_n(x)$ converges uniformly in $A$ to $f$ if $\sum_{n=0}^{\infty} f_n = f$ in $B(A,Y)$, i.e., $\|\sum_{n=0}^{p} f_n - f\|_{B(A,Y)} \to 0$ as $p \to \infty$,
(iv) $\sum_{n=0}^{\infty} f_n(x)$ converges absolutely in $B(A,Y)$ if the series of nonnegative real numbers $\sum_{n=0}^{\infty} \|f_n\|_{B(A,Y)}$ converges.
Clearly (iv) implies (ii), and (iii) implies (i). Moreover, (iv) implies (iii) and (ii) implies (i) if $Y$ is a Banach space; the other implications are false, see Example 9.35 below.
9.35 Example. Consider functions $f : \mathbb{R}_+ \to \mathbb{R}$. Choosing $f_n(x) := (-1)^n/n$, we see that $\sum_{n=1}^{\infty} f_n(x)$ converges pointwise and uniformly, but not absolutely in $\mathbb{R}$ or in $B(\mathbb{R}_+,\mathbb{R})$. Let $f(x) := \sin x / x$, $x > 0$, and, for any $n \in \mathbb{N}$,
$$f_n(x) := \begin{cases} \dfrac{\sin x}{x} & \text{if } n < x \le n+1, \\ 0 & \text{otherwise.} \end{cases}$$
Since $c_1/n \le \|f_n\|_\infty \le c_2/n$ for $n \ge 1$, $\sum_n f_n$ does not converge absolutely in $B(\mathbb{R}_+,\mathbb{R})$. But $f(x) = \sum_{n=0}^{\infty} f_n(x)$ converges pointwise $\forall x \in \mathbb{R}_+$ and also absolutely in $\mathbb{R}$ for all $x \in \mathbb{R}_+$. Finally,
$$f(x) - \sum_{n=0}^{p} f_n(x) = \begin{cases} \dfrac{\sin x}{x} & \text{if } x > p+1, \\ 0 & \text{otherwise,} \end{cases}$$
hence
$$\Big\| f - \sum_{n=0}^{p} f_n \Big\|_\infty \le \frac{1}{p+1} \to 0 \qquad \text{as } p \to \infty,$$
therefore $\sum_n f_n$ converges uniformly, that is, in $B(\mathbb{R}_+,\mathbb{R})$. Here the convergence is uniform in $B(\mathbb{R}_+,\mathbb{R})$ but not absolute in $B(\mathbb{R}_+,\mathbb{R})$, because the functions $f_n$ take their maxima at different points and the maximum of the sum is much smaller than the sum of the maxima.
9.36 Theorem (Dini). Let $X$ be a compact metric space and let $\{f_n\}$ be a monotonic sequence of continuous functions $f_n : X \to \mathbb{R}$ that converges pointwise to a continuous function $f$. Then $f_n$ converges uniformly to $f$.
9.37 ¶. Show Dini's theorem. [Hint: Assuming that $f_n$ converges by decreasing to $0$, for all $\epsilon > 0$ and for all $x \in X$ there exists a neighborhood $V_x$ of $x$ such that $|f_n(x')| < \epsilon$ $\forall x' \in V_x$ for all $n$ larger than some $n(x)$. Then use the compactness of $X$. Alternatively, use the uniform continuity theorem, Theorem 6.35.]
9.38 ¶. Show a sequence $\{f_n\}$ that converges pointwise to zero and does not converge uniformly in any interval of $\mathbb{R}$. [Hint: Choose an ordering of the rationals $\{r_n\}$ and consider the sequence $f_n(x) := \dots$
[...]
$$\rho(E,F) := \sup\Big\{ \sup_{x \in E} d(x,F),\ \sup_{x \in F} d(x,E) \Big\}.$$
Show that $\rho$ is a distance on $\mathcal{C}$. Now suppose that $X$ is a compact metric space and $Y$ is a normed space. Show that $\{f_n\}$ converges uniformly to $f$ if and only if the graphs in $X \times Y$ of the $f_n$'s converge to the graph of $f$ with respect to the Hausdorff distance.
d. Uniform convergence on compact subsets
9.41 Example. We have seen in Theorem 7.14 of [GM2] that a power series with radius of convergence $\rho > 0$ converges totally, hence uniformly, on every disk of radius $r < \rho$. This does not mean that it converges uniformly in the open disk $\{|z| < \rho\}$. For instance, the geometric series $\sum_{n=0}^{\infty} x^n$ has radius of convergence 1 and, if $|x| < 1$,
$$\sum_{n=0}^{\infty} x^n = \frac{1}{1-x};$$
since the remainder after $p$ terms is $x^{p+1}/(1-x)$, consequently for all $p$
$$\sup_{x \in ]-1,1[} \Big| \frac{1}{1-x} - \sum_{n=0}^{p} x^n \Big| = +\infty.$$
Figure 9.5. Giulio Ascoli (1843-1896) and the first page of Sulle serie di funzioni by Cesare Arzelà (1847-1912).
9.42 Definition. Let $A \subset \mathbb{R}^n$ and let $Y$ be a normed space. We say that a sequence of functions $\{f_n\}$, $f_n : A \to Y$, converges uniformly on compact subsets of $A$ to $f : A \to Y$ if for every compact subset $K \subset A$ we have $\|f_n - f\|_{\infty,K} \to 0$ as $n \to \infty$.
9.43 ¶. Let $f_n, f : \overline{\Omega} \to \mathbb{R}$ be continuous functions defined on the closure of an open set $\Omega$ of $\mathbb{R}^n$. Show that $\{f_n\}$ converges uniformly to $f$ on $\overline{\Omega}$ if and only if $\{f_n\}$ converges uniformly to $f$ in $\Omega$.
9.44 ¶. Let $\{f_n\}$, $f_n : A \subset \mathbb{R}^n \to Y$, be a sequence of continuous functions that converges uniformly on compact subsets of $A$ to $f : A \to Y$. Show that $f$ is continuous.
9.2.2 A compactness theorem
At the end of the nineteenth century, especially in the works of Vito Volterra (1860-1940), Giulio Ascoli (1843-1896) and Cesare Arzelà (1847-1912), there appears the idea of considering functions $\mathcal{F}$ whose values depend on the values of a function, the so-called funzioni di linee, functions of lines; one of the main motivations came from the calculus of variations. This eventually led to the notion of abstract spaces of Maurice Fréchet (1878-1973). In this context, a particularly relevant result is the compactness criterion now known as the Ascoli-Arzelà theorem.
a. Equicontinuous functions
Let $X$ be a metric space and $Y$ a normed space.
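9.45 Definition. We say that a subset $\mathcal{F}$ of $B(X,Y)$ is equibounded, or uniformly bounded, by some constant $M > 0$ if we have
$$\|f(x)\|_Y \le M \qquad \forall x \in X,\ \forall f \in \mathcal{F}.$$
We say that the family of functions $\mathcal{F}$ is equicontinuous if for all $\epsilon > 0$ there is $\delta > 0$ such that
$$\|f(x) - f(y)\|_Y < \epsilon \qquad \forall x, y \in X \text{ with } d_X(x,y) < \delta, \text{ and } \forall f \in \mathcal{F}.$$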
9.46 Definition (Hölder-continuous functions). Let $X, Y$ be metric spaces. We say that a function $f : X \to Y$ is Hölder-continuous with exponent $\alpha$, $0 < \alpha \le 1$, if there is a constant $M$ such that
$$d_Y(f(x), f(y)) \le M\, d_X(x,y)^\alpha \qquad \forall x, y \in X,$$
and we denote by $C^{0,\alpha}(X,Y)$ the space of these functions.
Clearly the space $C^{0,1}(X,Y)$ is the space $\mathrm{Lip}(X,Y)$ of Lipschitz-continuous functions from $X$ into $Y$. On $C^{0,\alpha}(X,Y) \cap B(X,Y)$, $0 < \alpha \le 1$, we introduce the norm
$$\|f\|_{C^{0,\alpha}} = \sup_{x \in X} \|f(x)\|_Y + \sup_{x,y \in X,\ x \ne y} \frac{\|f(x) - f(y)\|_Y}{d_X(x,y)^\alpha}, \qquad (9.7)$$
and it is easy to show the following.
9.47 Proposition. $C^{0,\alpha}(X,Y) \cap B(X,Y)$ endowed with the norm (9.7) is a Banach space if $Y$ is a Banach space.
Bounded subsets, with respect to the norm (9.7), of $C^{0,\alpha}(X,Y) \cap B(X,Y)$ provide examples of equicontinuous families. See the exercises at the end of this chapter for more on Hölder-continuous functions.
b. The Ascoli-Arzelà theorem
9.48 Theorem (Ascoli-Arzelà). Every sequence of functions $\{f_n\}$ in $C^0([a,b])$ which is equibounded and equicontinuous has a subsequence that is uniformly convergent.
More generally, we have the following.
9.49 Theorem. Let $X$ be a compact metric space. A subset $\mathcal{F}$ of $C^0(X,\mathbb{R})$ is relatively compact if and only if $\mathcal{F}$ is equibounded and equicontinuous.
Proof. We recall that a subset of a complete metric space is relatively compact or precompact if and only if it is totally bounded, see Theorem 6.8. If $\mathcal{F}$ is relatively compact, then $\mathcal{F}$ is totally bounded and, in particular, equibounded in $C^0(X,\mathbb{R})$. For any $\epsilon > 0$, let $f_1, \dots, f_{n_\epsilon} \in \mathcal{F}$ be an $\epsilon$-net for $\mathcal{F}$, i.e., $\forall f \in \mathcal{F}$, $\|f - f_i\| < \epsilon$ for some $f_i$. Since the $f_i$'s are uniformly continuous, there is a $\delta_\epsilon$ such that
$$d_X(x,y) < \delta_\epsilon \text{ implies } |f_i(x) - f_i(y)| < \epsilon, \qquad i = 1, 2, \dots, n_\epsilon.$$
Figure 9.6. Two pages respectively from Le curve limiti di una varietà data di curve by Giulio Ascoli (1843-1896) and from Sulle funzioni di linee by Cesare Arzelà (1847-1912).
Given $f \in \mathcal{F}$ we choose $i_0$, $1 \le i_0 \le n_\epsilon$, in such a way that $\|f - f_{i_0}\| < \epsilon$. Then
$$|f(x) - f(y)| \le |f(x) - f_{i_0}(x)| + |f_{i_0}(x) - f_{i_0}(y)| + |f_{i_0}(y) - f(y)| \le 2\|f - f_{i_0}\| + |f_{i_0}(x) - f_{i_0}(y)| < 3\epsilon$$
for $d_X(x,y) < \delta_\epsilon$, hence $\mathcal{F}$ is an equicontinuous family. Conversely, suppose $\mathcal{F}$ is equibounded and equicontinuous. Again by Theorem 6.8, it suffices to show that $\mathcal{F}$ is totally bounded. Let $\epsilon > 0$. From the compactness of $X$ and the equicontinuity of $\mathcal{F}$, we infer that there exists a finite family of open balls $B(x_i, r_i)$, $i = 1, \dots, n$, that cover $X$ and such that
$$|f(y) - f(x_i)| < \epsilon \qquad \forall y \in B(x_i, r_i),\ \forall f \in \mathcal{F}.$$
Since the set $K := \{f(x_i) \mid 1 \le i \le n,\ f \in \mathcal{F}\}$ is bounded, we find $y_1, y_2, \dots, y_m \in \mathbb{R}$ such that $K \subset \bigcup_{j=1}^{m} B(y_j, \epsilon)$. The set $\mathcal{F}$ is covered by the finite union of the sets
$$\mathcal{F}_\pi := \big\{ f \in \mathcal{F} \;\big|\; f(x_i) \in B(y_{\pi(i)}, \epsilon),\ i = 1, \dots, n \big\},$$
with $\pi$ varying among the maps $\pi : \{1, \dots, n\} \to \{1, \dots, m\}$. Therefore, it suffices to show that $\operatorname{diam} \mathcal{F}_\pi \le 4\epsilon$. Since for $f_1, f_2 \in \mathcal{F}_\pi$ and $x \in B(x_i, r_i)$ we have
$$|f_1(x) - f_2(x)| \le |f_1(x) - f_1(x_i)| + |f_1(x_i) - y_{\pi(i)}| + |y_{\pi(i)} - f_2(x_i)| + |f_2(x_i) - f_2(x)| < 4\epsilon,$$
the proof is concluded. □
9.50 ¶. Notice that the sequence $\{f_n\}$ of wave shaped functions in Figure 9.3,
$$f_n(x) := \frac{1}{1 + (x+n)^2}, \qquad x \in \mathbb{R},\ n \in \mathbb{N},$$
is equicontinuous and equibounded, but not relatively compact.
9.51 ¶. Theorem 9.49 can be formulated in slightly more general forms that are proved to hold with the same proof of Theorem 9.49.
Theorem. Let $X$ be a compact metric space, and let $Y$ be a Banach space. A subset $\mathcal{F} \subset C^0(X,Y)$ is relatively compact if and only if $\mathcal{F}$ is equicontinuous and, for every $x \in X$, the set $\mathcal{F}_x$ of all values $f(x)$ of $f \in \mathcal{F}$ is relatively compact in $Y$.
Theorem. Let $X$ and $Y$ be metric spaces. Suppose $X$ is compact. A sequence $\{f_n\} \subset C^0(X,Y)$ converges uniformly if and only if $\{f_n\}$ is equicontinuous and there exists a compact set $K \subset Y$ such that $\{f_n(x)\}$ is contained in a $\delta$-neighborhood of $K$ for $n$ sufficiently large.
9.52 ¶. Show the following.
Proposition. Let $X, Y$ be two metric spaces and let $\Omega \subset X$ be compact. Then the subsets of $C^{0,\alpha}(\Omega,Y)$ that are bounded in the $\|\cdot\|_{C^{0,\alpha}}$ norm are relatively compact in $C^0(\Omega,Y)$.
9.3 Approximation Theorems
In this section we deal with the following questions: Can we approximate a continuous function uniformly, and with given precision, by a polynomial? Under which conditions are classes of smooth functions dense with respect to the uniform convergence in the class of continuous functions?
9.3.1 Weierstrass and Bernstein theorems
a. Weierstrass's approximation theorem
In 1885 Karl Weierstrass (1815-1897) proved the following.
9.53 Theorem (Weierstrass, I). Every continuous function in a closed bounded interval $[a,b]$ is the uniform limit of a sequence of polynomials.
In particular, for every $n$ there exists a polynomial $Q_n(x)$ (of degree $d = d(n)$ sufficiently large) such that $|f(x) - Q_n(x)| < 2^{-n}$ $\forall x \in [a,b]$. If we set
$$P_1(x) := Q_1(x), \qquad P_n(x) := Q_n(x) - Q_{n-1}(x), \quad n > 1,$$
we therefore conclude that every continuous function $f(x)$ can be written in a closed and bounded interval as the (infinite) sum of polynomials,
$$f(x) = \sum_{n=1}^{\infty} P_n(x) \qquad \text{uniformly in } [a,b].$$
We recall that, in general, a continuous function is not the sum of a power series, since the sum of a power series is at least a function of class $C^\infty$, compare [GM2]. Many proofs of Weierstrass's theorem are nowadays available; in this section we shall illustrate some of them. This will allow us to discuss a number of facts that are individually relevant.
A first proof of Theorem 9.53. We first observe, following Henri Lebesgue (1875-1941), that in order to approximate uniformly in $[a,b]$ any continuous function, it suffices to approximate the function $|x|$, $x \in [-1,1]$. In fact, any continuous function in $[a,b]$ can be approximated, uniformly in $[a,b]$, by continuous and piecewise linear functions. Thus it suffices to approximate continuous and piecewise linear functions. Let $f(x)$ be one of such functions. Then there exist points $x_0 = a < x_1 < x_2 < \dots < x_r < x_{r+1} = b$ such that $f'(x)$ takes a constant value $d_k$ in each interval $]x_k, x_{k+1}[$. Then, in $[a,b]$ we have
$$f(x) = f(a) + \sum_{k=0}^{r} (d_k - d_{k-1})\,\varphi_{x_k}(x), \qquad d_{-1} = 0,$$
where $\varphi_c(x) := \max(x - c, 0) = \frac{1}{2}\big((x - c) + |x - c|\big)$. If we are able to approximate $|x - x_k|$, $x \in [a,b]$, uniformly by polynomials $\{Q_{k,n}\}$, then the polynomials
$$P_n(x) := f(a) + \sum_{k=0}^{r} (d_k - d_{k-1})\,\frac{1}{2}\big((x - x_k) + Q_{k,n}(x)\big)$$
approximate $f(x)$ uniformly in $[a,b]$. By a linear change of variable, it then suffices to approximate $|x|$ uniformly in $[-1,1]$. This can be done in several ways. For instance, noticing that if $x \in [-1,1]$, then $1 - |x|$ solves the equation in $z$
$$z = \frac{1}{2}\big[z^2 + (1 - x^2)\big],$$
one considers the discrete process
$$\begin{cases} z_{n+1}(x) = \frac{1}{2}\big[z_n^2(x) + (1 - x^2)\big], & n \ge 0, \\ z_0(x) = 1. \end{cases}$$
It is then easily seen that the polynomials $z_n(x)$ satisfy
(i) $z_n(x) \ge 0$ in $[-1,1]$,
(ii) $z_n(x) \ge z_{n+1}(x)$,
(iii) $z_n(x)$ converges pointwise to $1 - |x|$ if $x \in [-1,1]$.
Since $1 - |x|$ is continuous, Dini's theorem, Theorem 9.36, yields that the polynomials $z_n(x)$ converge uniformly to $1 - |x|$ in $[-1,1]$.
Alternatively one shows, using the binomial series, that
$$\sqrt{1 - x} = \sum_{n=0}^{\infty} c_n x^n, \qquad c_n = \binom{1/2}{n}(-1)^n,$$
in $]-1,1[$. Then one proves that the series converges absolutely in $C^0([-1,1])$, hence uniformly in $[-1,1]$. In fact, we observe that $c_n := \binom{1/2}{n}(-1)^n$ is negative for $n \ge 1$, hence, for every $N$,
Obordie aoalytiaolM DaxsMBMikfiit flogenamttir wSMtttehttr FoDtttiaasii mat reelkB VarinderUditn.
305
sm u ratoias taa RWCTK»«
LECONS
PROPRIETES EXTRMALES
Von K. WnxMnuat. Ertt. If ittk«n«>|.
MEILLEURS Al^PROXIHATION I i t / ( » ) «ii>» ftr jgam Mdtan W«ih d v V
FONCTIONS ANALYTIQUES D'U>fE
VARIABLE
S^BLLK
I>«indtewCH«leb>oi«ni««tm
j+W< » Wwtk kibm «
PARIS &Aimiitft-vaiAR» s r o , SDITKIIRS 5», <}«*1 4« Gna4Mla(Mtin M
Figure 9.7. The first page of Weierstrass's paper on approximation by polynomials and the Legons sur les proprietes extremales by Sergei Bernstein (1880-1968).
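$$\sum_{n=0}^{N} |c_n| = 2 - \sum_{n=0}^{N} c_n = 2 - \lim_{x \to 1^-} \sum_{n=0}^{N} c_n x^n \le 2 - \lim_{x \to 1^-} \sum_{n=0}^{\infty} c_n x^n = 2 - \lim_{x \to 1^-} \sqrt{1 - x} = 2.$$
Replacing $1 - x$ with $x^2$, it follows that
$$|x| = \sum_{n=0}^{\infty} c_n (1 - x^2)^n \qquad \text{uniformly in } [-1,1].$$
9.54 ¶. Add details to the previous proof.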
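The discrete process $z_{n+1} = \frac12[z_n^2 + (1 - x^2)]$, $z_0 = 1$, from the first proof of Theorem 9.53 is easy to run numerically. The sketch below evaluates $z_n(x)$ pointwise (rather than building the polynomial coefficients) and samples the error against $1 - |x|$ on a grid; the function names and the grid size are assumptions made for this illustration, and a sampled maximum is only a stand-in for the true supremum.

```python
def z_iter(x, n):
    """Value at x of z_n, where z_0 = 1 and z_{k+1} = (z_k^2 + 1 - x^2) / 2."""
    z = 1.0
    for _ in range(n):
        z = 0.5 * (z * z + (1.0 - x * x))
    return z


def sup_error(n, grid=201):
    """Largest sampled value of |z_n(x) - (1 - |x|)| on a uniform grid of [-1, 1]."""
    xs = [-1.0 + 2.0 * i / (grid - 1) for i in range(grid)]
    return max(abs(z_iter(x, n) - (1.0 - abs(x))) for x in xs)


# The sampled errors shrink as n grows, slowly near x = 0 where |x| has its kink.
errors = {n: sup_error(n) for n in (2, 20, 200)}

# Monotonicity (ii) at a sample point: z_n(0.3) is nonincreasing in n.
values = [z_iter(0.3, n) for n in range(6)]
```

Writing $e_n = z_n - (1 - |x|)$, the recursion gives $e_{n+1} = e_n\,(1 - |x| + e_n/2)$ with $e_0 = |x|$, which explains why the decay is slow for small $|x|$: the convergence is uniform, as Dini's theorem guarantees, but only at a rate of order $1/n$.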
b. Bernstein's polynomials
Another proof of Theorem 9.53, grounded in probabilistic ideas, see Exercise 9.57, and giving explicit formulas for the approximating polynomials, is due to Sergei Bernstein (1880-1968). It is enough to consider functions defined in $[0,1]$ instead of in a generic interval $[a,b]$.
9.55 Definition. Let $f \in C^0([0,1])$. The Bernstein polynomials of $f$ are
$$B_n(x) := \sum_{k=0}^{n} f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1 - x)^{n-k}, \qquad n \ge 1.$$
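9.56 Theorem (Bernstein). Bernstein's polynomials of $f$ converge uniformly in $[0,1]$ to $f$.
Proof. We split the proof into three steps.
Step 1. The following identities hold:
$$\sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} = 1, \qquad (9.8)$$
$$\sum_{k=0}^{n} (k - nx)^2 \binom{n}{k} x^k (1-x)^{n-k} = n x (1 - x). \qquad (9.9)$$
The first is trivial: it follows from the binomial formula $(a+b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k}$ by choosing $a = x$ and $b = 1 - x$. The second needs some computation. Fix $n \ge 1$. Starting from the identities
$$\sum_{k=0}^{n} \binom{n}{k} y^k = (1 + y)^n,$$
$$\sum_{k=0}^{n} k \binom{n}{k} y^k = y\,D\Big(\sum_{k=0}^{n} \binom{n}{k} y^k\Big) = n y (y + 1)^{n-1},$$
$$\sum_{k=0}^{n} k^2 \binom{n}{k} y^k = n y (n y + 1)(y + 1)^{n-2},$$
we replace $y$ by $x/(1 - x)$ and multiply each of the equalities by $(1 - x)^n$. It follows that
$$\sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} = 1,$$
$$\sum_{k=0}^{n} k \binom{n}{k} x^k (1-x)^{n-k} = n x,$$
$$\sum_{k=0}^{n} k^2 \binom{n}{k} x^k (1-x)^{n-k} = n x (n x + 1 - x).$$
Multiplying each of the previous identities respectively by $n^2 x^2$, $-2nx$ and $1$, and summing, we infer (9.9).
Step 2. As $x(1 - x) \le \frac14$, (9.9) yields
$$\sum_{k=0}^{n} \Big(\frac{k}{n} - x\Big)^2 \binom{n}{k} x^k (1-x)^{n-k} \le \frac{1}{4n}. \qquad (9.10)$$
Fix $\delta > 0$ and $x \in [0,1]$, and denote by $\Delta_n(x)$ the set of $k$ in $\{0, 1, \dots, n\}$ such that $|k/n - x| \ge \delta$; (9.10) then yields
$$\sum_{k \in \Delta_n(x)} \binom{n}{k} x^k (1-x)^{n-k} \le \frac{1}{4 n \delta^2}, \qquad (9.11)$$
that is, for $n$ large, the terms that mostly contribute to the sum in (9.8) are the ones with index $k$ such that $|k/n - x| < \delta$.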
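Step 3. Set $M := \sup_{x \in [0,1]} |f(x)|$ and, given $\epsilon > 0$, let $\delta$ be such that $|f(x) - f(y)| < \epsilon$ for $|x - y| < \delta$. Then we have
$$B_n(x) - f(x) = \sum_{k=0}^{n} \Big[f\Big(\frac{k}{n}\Big) - f(x)\Big] \binom{n}{k} x^k (1-x)^{n-k} = \sum_{k \in \Gamma_n(x)} + \sum_{k \in \Delta_n(x)},$$
where
$$\Gamma_n := \{0, \dots, n\} \setminus \Delta_n = \Big\{ k \in \{0, \dots, n\} \;\Big|\; \Big|\frac{k}{n} - x\Big| < \delta \Big\}.$$
For $k \in \Gamma_n(x)$, i.e., $|k/n - x| < \delta$, we have $|f(k/n) - f(x)| < \epsilon$, hence
$$\Big| \sum_{k \in \Gamma_n(x)} \Big[f\Big(\frac{k}{n}\Big) - f(x)\Big] \binom{n}{k} x^k (1-x)^{n-k} \Big| \le \epsilon \sum_{k=0}^{n} \binom{n}{k} x^k (1-x)^{n-k} = \epsilon;$$
on the other hand, if $|k/n - x| \ge \delta$, (9.11) yields
$$\Big| \sum_{k \in \Delta_n(x)} \Big[f\Big(\frac{k}{n}\Big) - f(x)\Big] \binom{n}{k} x^k (1-x)^{n-k} \Big| \le 2M \sum_{k \in \Delta_n(x)} \binom{n}{k} x^k (1-x)^{n-k} \le \frac{M}{2 n \delta^2}.$$
Therefore, we conclude, for $n$ large enough so that $M/(2n\delta^2) \le \epsilon$,
$$|B_n(x) - f(x)| < 2\epsilon \qquad \text{uniformly in } [0,1]. \qquad \square$$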
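Definition 9.55 and the identities of Step 1 translate directly into a few lines of code. The following is a minimal sketch with hypothetical helper names (`bernstein`, `sup_error`, `second_moment`); the test function $|t - 1/2|$ and the grid are choices made only for this illustration, and the sampled maximum is a proxy for the uniform norm.

```python
from math import comb


def bernstein(f, n, x):
    """Value at x of the n-th Bernstein polynomial of f (Definition 9.55)."""
    return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))


def sup_error(f, n, grid=101):
    """Largest sampled value of |B_n(x) - f(x)| on a uniform grid of [0, 1]."""
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in xs)


def second_moment(n, x):
    """Left-hand side of (9.9): sum of (k - n x)^2 C(n, k) x^k (1 - x)^(n - k)."""
    return sum((k - n * x) ** 2 * comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))


# A continuous function with a kink at 1/2, where the approximation is slowest.
kink = lambda t: abs(t - 0.5)
errors = [sup_error(kink, n) for n in (10, 100)]

# Identity (9.8) is the statement that B_n reproduces constants.
constant_check = bernstein(lambda t: 1.0, 20, 0.3)
```

The sampled error decreases with $n$, in line with Theorem 9.56, while `second_moment(n, x)` agrees with $n x (1 - x)$ up to rounding, as (9.9) predicts.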
9.57 ¶. The previous proof has the following probabilistic formulation. Let $0 \le p \le 1$ and let $X_n(p)$ be a random variable with binomial distribution
$$P(\{X_n(p) = r/n\}) = \binom{n}{r} p^r (1 - p)^{n-r}.$$
If $f : [0,1] \to \mathbb{R}$ is a function, the expectation of $f(X_n(t))$ is given by
$$E\big[f(X_n(t))\big] = \sum_{r=0}^{n} f\Big(\frac{r}{n}\Big) \binom{n}{r} t^r (1 - t)^{n-r},$$
and one shows in the theory of probability that $E[f(X_n(t))]$ converge uniformly to $f$.
c. Weierstrass's approximation theorem for periodic functions
We denote by $C^0_T$ the class of continuous periodic functions in $\mathbb{R}$ with period $T > 0$.
9.58 Theorem (Weierstrass, II). Every function $f \in C^0_T$ is the uniform limit of trigonometric polynomials with period $T$.
In Section 9.3 we shall give a direct proof of this theorem and in Section 11.5 we shall give another proof of it: the Fejér theorem. It is worth noticing that, in general, a continuous function is neither the uniform nor the pointwise sum of a trigonometric series. Here we shall prove that the claims in Theorems 9.58 and 9.53 are equivalent. By a linear change of variable we may assume that $T = 2\pi$. First let us prove the following.
9.59 Lemma. Let $f \in C^0([-\pi,\pi])$ be even. Then for any $\epsilon > 0$ there is an even trigonometric polynomial
$$T(x) := \sum_{k=0}^{n} a_k \cos kx$$
such that $|f(x) - T(x)| \le \epsilon$ $\forall x \in [-\pi,\pi]$.
Proof. We apply Theorem 9.53 to the continuous function $g(y) := f(\arccos y)$, $y \in [-1,1]$, to obtain
$$\Big| f(\arccos y) - \sum_{k=0}^{n} c_k y^k \Big| < \epsilon \qquad \text{in } [-1,1],$$
hence
$$\Big| f(x) - \sum_{k=0}^{n} c_k \cos^k x \Big| < \epsilon \qquad \text{in } [0,\pi].$$
Since both $f$ and $\cos$ are even, the same estimate holds in $[-\pi,\pi]$. To conclude, it suffices to notice that $\sum_{k=0}^{n} c_k \cos^k x$ is an even trigonometric polynomial. □
We consider the two even functions in [—TT, TT] (fix)
-
f(-x))smx.
(fix)
- fi-x))
Then Lemma 9.59 yields for any e > 0 fix) + f(-x)
= Ti(x) + a i ( x ) ,
sinx = T2{x) + a2{x)
Ti{x) and T2{x) being two even trigonometric polynomials, and for the remainders a i and a2 one has |ai(a:)|, |a2(x)| < e in [-7r,7r]. Multiplying the first equation by sin^ x and the second by sin x and summing we find fix) sin^ X = Tsix) -\- asix),
(9.12)
for Tsix) a trigonometric polynomial and ||a3||oo^[_7r,7r] < 2e. The same argument applies to fix — 7r/2), yielding f(x
j sin^ X = T4ix) -h a4ix)
where T4 is a trigonometric polynomial and ||a4||oo,[-7r,7r] < 26. By changing the variable X in a; + ^ , we then infer fix) cos^ X = Tsix) + asix)
(9.13)
where Tsix) := T4(x-{-|) and ||Q;5||oo,[-7r,7r] < 2e. Summing (9.12) and (9.13) we finally conclude the proof. •
9.60 Remark. Actually, the two Weierstrass theorems are equivalent. We have already proved Theorem 9.58 using Theorem 9.53. We now outline how to deduce the first Weierstrass theorem, Theorem 9.53, from Theorem 9.58, leaving the details to the reader. Given $f \in C^0([-\pi,\pi])$, the function
$$g(x) := f(x) + \frac{f(-\pi) - f(\pi)}{2\pi}\, x$$
satisfies $g(\pi) = g(-\pi)$, hence $g$ can be extended to a continuous periodic map of period $2\pi$. According to Theorem 9.58, for any $\epsilon > 0$ we find a trigonometric polynomial
$$T_\epsilon(x) := a_0 + \sum_{k=1}^{n(\epsilon)} (a_k \cos kx + b_k \sin kx)$$
with $|g(x) - T_\epsilon(x)| < \epsilon$ for all $x \in [-\pi,\pi]$. Next, we approximate $\sin kx$ and $\cos kx$ by polynomials (e.g., by Taylor polynomials), concluding that there is a polynomial $Q_\epsilon(x)$ with $|T_\epsilon(x) - Q_\epsilon(x)| < \epsilon$ $\forall x \in [-\pi,\pi]$, hence $|g(x) - Q_\epsilon(x)| < 2\epsilon$ in $[-\pi,\pi]$.
9.3.2 Convolutions and Dirac approximations
We now introduce a procedure that allows us to find smooth approximations of functions.
a. Convolution product
Here we confine ourselves to considering only continuous functions defined on the entire line. The choice of the entire line as a domain is not a restriction, since every continuous function on an interval $[a,b]$ can be extended to a continuous function in $\mathbb{R}$ and, actually for any $\delta > 0$, to a continuous function that vanishes outside $[a - \delta, b + \delta]$.
9.61 Example (Integral means). Let $f : \mathbb{R} \to \mathbb{R}$ be continuous. For any $\delta > 0$ consider the mean function of $f$
$$f_\delta(x) := \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} f(\xi)\,d\xi, \qquad x \in \mathbb{R}. \qquad (9.14)$$
Simple consequences of the fundamental theorem of calculus are
(i) $f_\delta(x)$ is Lipschitz continuous,
(ii) $f_\delta(x) \to f(x)$ pointwise,
while from the estimate
$$|f_\delta(x) - f(x)| \le \sup_{|y - x| < \delta} |f(y) - f(x)|$$
and Theorem 6.35
(iii) $f_\delta(x) \to f(x)$ uniformly on every bounded interval of $\mathbb{R}$.
The above allows us, of course, to uniformly approximate continuous functions with Lipschitz-continuous functions on every bounded interval.
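The integral means of Example 9.61 can be approximated with a simple quadrature. The sketch below uses the midpoint rule, which is a choice made here for illustration and not part of the text; the function name `integral_mean` and the number of subintervals `m` are likewise hypothetical.

```python
import math


def integral_mean(f, x, delta, m=1000):
    """Midpoint-rule approximation of (1/(2 delta)) * integral of f over [x - delta, x + delta]."""
    h = 2.0 * delta / m
    total = sum(f(x - delta + (j + 0.5) * h) for j in range(m))
    return total * h / (2.0 * delta)


# For f(t) = |t| the mean at the kink x = 0 equals delta / 2, which tends to f(0) = 0.
mean_at_kink = integral_mean(abs, 0.0, 0.1)

# For f = cos, |f(y) - f(x)| <= |y - x|, so the estimate in Example 9.61
# bounds |f_delta(x) - f(x)| by delta.
gaps = [(d, abs(integral_mean(math.cos, 1.0, d) - math.cos(1.0)))
        for d in (0.5, 0.1, 0.01)]
```

In the second check the observed gap is in fact much smaller than $\delta$ (of order $\delta^2$ for a smooth function), so the supremum estimate is a safe but not sharp bound.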
9.62 Definition. Let $f, g : \mathbb{R} \to \mathbb{R}$ be two Riemann integrable functions. Suppose that $g(x - t) f(t)$ is summable in $\mathbb{R}$ for any $x \in \mathbb{R}$. Then the function
$$g * f(x) := \int_{\mathbb{R}} g(x - y) f(y)\,dy, \qquad x \in \mathbb{R},$$
called the convolution product of $f$ and $g$, is well defined.
Clearly the map $(f, g) \mapsto g * f$
(i) is a bilinear operator,
(ii) is symmetric, $g * f = f * g$, since by the change of variable $z = x - y$
$$g * f(x) = \int_{\mathbb{R}} g(x - y) f(y)\,dy = \int_{\mathbb{R}} g(z) f(x - z)\,dz = f * g(x),$$
(iii) if $f$ and $g$ are summable and $f$ vanishes outside the interval $[a,b]$, then $g * f$ is well defined in $\mathbb{R}$ and
$$|g * f(x)| \le \int_a^b |g(x - y)|\,|f(y)|\,dy \le \|g\|_{\infty,[x-b,x-a]} \int_a^b |f(y)|\,dy. \qquad (9.15)$$
9.63 Example. The function $f_\delta$ in (9.14) is the convolution product of $f$ and
$$g(x) := \begin{cases} \dfrac{1}{2\delta} & \text{if } |x| \le \delta, \\ 0 & \text{if } |x| > \delta. \end{cases}$$
9.64 Example. If $g(t) = \sum_{k=0}^{n} c_k t^k$ is a polynomial of degree $n$, then for any $f$ that vanishes outside an interval,
$$g * f(x) = \sum_{k=0}^{n} c_k \sum_{j=0}^{k} \binom{k}{j} (-1)^{k-j} \Big( \int y^{k-j} f(y)\,dy \Big) x^j$$
is again a polynomial of degree at most $n$.
9.65 Example. If $f = 0$ outside $[-\pi,\pi]$, then
$$e^{ikt} * f(x) = \int_{-\pi}^{\pi} f(y)\, e^{ik(x - y)}\,dy = 2\pi c_k e^{ikx},$$
i.e., $e^{ikt} * f$ is the $k$th harmonic component of the periodic extension of $f$, compare Section 11.5.
9.66 Theorem. Let $g \in C^k(\mathbb{R})$, and let $f$ be Riemann summable. Suppose that either $f$ or $g$ vanishes outside a bounded interval $[a,b]$. Then $g * f \in C^k(\mathbb{R})$ and $D_k(g * f)(x) = (D_k g) * f(x)$ $\forall x \in \mathbb{R}$.
Proof. We prove the claim when $f = 0$ outside $[a, b]$; the other case, $g = 0$ outside $[a, b]$, is similar. By (9.15) we then have

$$|g * f(x)| \le \|g\|_{\infty,[x-b,\,x-a]}\, \|f\|_1, \qquad \|f\|_1 := \int_a^b |f(y)|\,dy,$$

hence

$$\|g * f\|_{\infty,[c,d]} \le \|g\|_{\infty,[c-b,\,d-a]}\, \|f\|_1.$$

(i) We now prove that $g * f \in C^0(\mathbb{R})$ if $g \in C^0(\mathbb{R})$. In fact,

$$g * f(x + h) - g * f(x) = \int_{\mathbb{R}} \big(g(x - y + h) - g(x - y)\big) f(y)\,dy = G * f(x),$$

where $G(x) := g(x + h) - g(x)$. Therefore, using (9.15), we get

$$|g * f(x + h) - g * f(x)| \le \|g(t + h) - g(t)\|_{\infty,[x-b,\,x-a]}\, \|f\|_1 \to 0 \tag{9.16}$$

as $h \to 0$, since $\|g(t + h) - g(t)\|_{\infty,[x-b,\,x-a]} \to 0$ because of the uniform continuity of $g$ on compact sets.

(ii) Similarly, we prove that $g * f \in C^1(\mathbb{R})$ if $g \in C^1(\mathbb{R})$. We have

$$\frac{g * f(x + h) - g * f(x)}{h} - \int_{\mathbb{R}} g'(x - y) f(y)\,dy = H * f(x),$$

where $H(x) := \dfrac{g(x + h) - g(x)}{h} - g'(x)$, hence by (9.15)

$$\Big| \frac{g * f(x + h) - g * f(x)}{h} - g' * f(x) \Big| \le \Big\| \frac{g(t + h) - g(t)}{h} - g'(t) \Big\|_{\infty,[x-b,\,x-a]} \|f\|_1.$$

Since

$$\Big| \frac{g(x + h) - g(x)}{h} - g'(x) \Big| = \Big| \frac{1}{h} \int_x^{x+h} \big(g'(y) - g'(x)\big)\,dy \Big| \le \frac{1}{|h|} \Big| \int_x^{x+h} |g'(y) - g'(x)|\,dy \Big| \le \sup_{|y-x|\le|h|} |g'(y) - g'(x)| \to 0 \quad \text{as } h \to 0$$

because of the uniform continuity of $g'$ on compact sets, we then conclude that $g * f$ is differentiable at $x$ and that $(g * f)'(x) = g' * f(x)$ $\forall x \in \mathbb{R}$. Finally $(g * f)' = g' * f$ is continuous by (i).

(iii) The general case is then proved by induction. □
9.67 Remark. Let $f$ and $g$ be summable and let one of them vanish outside a bounded interval. If $f$ instead of $g$ is of class $C^k(\mathbb{R})$, then, recalling that $g * f = f * g$, we infer from Theorem 9.66 that $g * f \in C^k(\mathbb{R})$ and $D_k(g * f)(x) = g * (D_k f)(x)$. Therefore if both $f$ and $g$ are of class $C^k(\mathbb{R})$, then $D_k(g * f)(x) = (D_k g) * f(x) = g * (D_k f)(x)$ and, in general, $g * f$ is as smooth as the smoother of $f$ and $g$.
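b. Mollifiers

9.68 Definition. A function $k(x) \in C^\infty(\mathbb{R})$ such that

$$k(x) = k(-x), \qquad k(x) \ge 0, \qquad k(x) = 0 \text{ for } |x| \ge 1, \qquad \int_{\mathbb{R}} k(x)\,dx = 1$$

is called a smoothing kernel.

9.69 ¶. The function

$$\varphi(x) := \begin{cases} \exp\Big(-\dfrac{1}{1 - x^2}\Big) & \text{if } |x| < 1, \\[4pt] 0 & \text{if } |x| \ge 1 \end{cases}$$

is $C^\infty(\mathbb{R})$, nonnegative, even and with finite integral. Hence the map $k(x) := \frac{1}{A}\varphi(x)$, where $A := \int_{\mathbb{R}} \varphi(x)\,dx$, is a smoothing kernel.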
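Given a smoothing kernel $k(x)$, we can generate the family

$$k_\epsilon(x) := \epsilon^{-1} k\Big(\frac{x}{\epsilon}\Big), \qquad \epsilon > 0.$$

Trivially, $k_\epsilon(-x) = k_\epsilon(x)$ and

$$k_\epsilon \in C^\infty(\mathbb{R}), \qquad k_\epsilon(x) \ge 0, \qquad k_\epsilon(x) = 0 \text{ for } |x| \ge \epsilon, \qquad \int_{\mathbb{R}} k_\epsilon(x)\,dx = 1.$$

Also $\|k_\epsilon\|_\infty = \|k\|_\infty / \epsilon$.

9.70 Definition. Given a smoothing kernel $k$, the mollifiers or smoothing operators $S_\epsilon$ are defined by

$$S_\epsilon f(x) := k_\epsilon * f(x) = \int_{\mathbb{R}} k_\epsilon(x - y) f(y)\,dy.$$

We have

$$S_\epsilon f(x) = k_\epsilon * f(x) = \int_{x-\epsilon}^{x+\epsilon} k_\epsilon(x - y) f(y)\,dy = \frac{1}{\epsilon}\int_{\mathbb{R}} k\Big(\frac{x - y}{\epsilon}\Big) f(y)\,dy = \int_{-1}^{1} k(z) f(x - \epsilon z)\,dz.$$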
Since the functions $k_\epsilon$ are of class $C^\infty$, the functions $S_\epsilon f(x)$, $x \in \mathbb{R}$, are of class $C^\infty$ by Theorem 9.66. Moreover, as shown by the next proposition, they converge to $f$ in norms that are as strong as the differentiability of $f$; for instance, they converge uniformly or in norm $C^1$ on bounded intervals if $f$ is continuous or if $f$ is of class $C^1$, respectively.

9.71 Proposition. Let $f \in C^0(\mathbb{R})$. Then
(i) $S_\epsilon f \in C^\infty(\mathbb{R})$ $\forall \epsilon > 0$;
(ii) if $f = 0$ in $[a, b]$, then $S_\epsilon f(x) = 0$ in $[a + \epsilon, b - \epsilon]$;
(iii) $(S_\epsilon f)'(x) = \dfrac{1}{\epsilon^2} \displaystyle\int_{\mathbb{R}} k'\Big(\frac{x - y}{\epsilon}\Big) f(y)\,dy$;
(iv) $S_\epsilon f \to f$ as $\epsilon \to 0$ uniformly in any bounded interval $[a, b]$.
Moreover, if $f \in C^1(\mathbb{R})$, then $(S_\epsilon f)'(x) = (S_\epsilon f')(x)$ $\forall x \in \mathbb{R}$ and $(S_\epsilon f)'(x) \to f'(x)$ uniformly on any bounded interval $[a, b]$.

Proof. (i), (iii) follow from Theorem 9.66, and (ii) follows from the definition.
(iv) If $f \in C^0(\mathbb{R})$ and $x \in \mathbb{R}$ we have

$$|f(x) - S_\epsilon f(x)| = \Big| \int_{\mathbb{R}} k_\epsilon(x - y)\,[f(y) - f(x)]\,dy \Big| = |k_\epsilon * (f - f(x))(x)| \le \sup_{|y-x|\le\epsilon} |f(y) - f(x)| \int_{\mathbb{R}} k_\epsilon(y)\,dy = \sup_{|y-x|\le\epsilon} |f(y) - f(x)|.$$

Since $f$ is uniformly continuous on bounded intervals in $\mathbb{R}$, $\sup_{|y-x|\le\epsilon} |f(y) - f(x)| \to 0$; consequently $|S_\epsilon f(x) - f(x)| \to 0$ as $\epsilon \to 0$ uniformly on compact sets of $\mathbb{R}$. If $f \in C^1(\mathbb{R})$, we have already proved in Theorem 9.66 and Remark 9.67 that $S_\epsilon f$ is of class $C^1$ and that $(S_\epsilon f)'(x) = S_\epsilon f'(x)$. Applying the first part of (iv) to $f'$ we then reach the claim. □
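A numerical illustration of the estimate used in the proof of (iv) may be helpful. The sketch below makes several assumptions of its own: the kernel is the normalized bump function $\exp(-1/(1-x^2))$, the integral $\int_{-1}^{1} k(z) f(x - \epsilon z)\,dz$ is replaced by a midpoint rule, and $f(x) = |x|$ is used as a test function; the names `phi`, `mollify` and `sup_error` are illustrative only.

```python
import math


def phi(x):
    """Bump function exp(-1/(1 - x^2)) on ]-1, 1[, zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


N = 400                                             # number of midpoint nodes on [-1, 1]
NODES = [-1.0 + (2 * i + 1) / N for i in range(N)]  # midpoints of the N subintervals
WEIGHT = 2.0 / N
A = sum(phi(z) for z in NODES) * WEIGHT             # quadrature value of the integral of phi


def mollify(f, x, eps):
    """Quadrature version of S_eps f(x) = integral of k(z) f(x - eps z) dz, k = phi / A."""
    return sum(phi(z) * f(x - eps * z) for z in NODES) * WEIGHT / A


def sup_error(f, eps, a=-1.0, b=1.0, points=101):
    """Largest |S_eps f(x) - f(x)| over a uniform grid of [a, b]."""
    xs = [a + (b - a) * i / (points - 1) for i in range(points)]
    return max(abs(mollify(f, x, eps) - f(x)) for x in xs)


epsilons = (0.4, 0.2, 0.1)
errors = [sup_error(abs, e) for e in epsilons]
```

Since $|x|$ has Lipschitz constant 1, the bound $\sup_{|y-x|\le\epsilon}|f(y)-f(x)|$ is at most $\epsilon$, and the computed errors stay below $\epsilon$ and decrease with it.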
c. Approximation of the Dirac mass

The family $\{k_\epsilon\}$ is often referred to as an approximation of the Dirac delta. In applications, the Dirac $\delta$ is often "defined" as a function vanishing at every point but zero and with the property that

$$\int_{-\infty}^{+\infty} \delta(x)\,dx = 1;$$

sometimes it is "defined", with respect to convolution, as if it would operate like

$$\int_{-\infty}^{+\infty} \delta(x - y) f(y)\,dy = f(x).$$

Of course, no such function exists in the classical sense; but it can be thought of as a linear operator from $C^0(\mathbb{R})$ into $\mathbb{R}$, namely $f \mapsto f(0)$.

We shall avoid dealing directly with $\delta$, as the correct context for doing this is the theory of distributions, and we set

9.72 Definition. A sequence of nonnegative functions $D_n : \mathbb{R} \to \mathbb{R}$ with the properties that for any interval $[a, b]$ and for any $\rho > 0$ we have

$$\int_{B(0,\rho)} D_n(x)\,dx \to 1, \qquad \int_{[a,b] \setminus B(0,\rho)} D_n(x)\,dx \to 0, \qquad \text{as } n \to \infty, \tag{9.17}$$

is called an approximation of the $\delta$.
Figure 9.8. Approximations of the Dirac S.
9.73 ¶. Let $\{D_n\}$ be an approximation of $\delta$ and let $f$ be a continuous function in $[a, b]$. Show that

$$\lim_{n\to\infty} \int_a^b D_n(x - y) f(y)\,dy = f(x) \qquad \forall x \in\, ]a, b[.$$

It is easy to prove the following.

9.74 Theorem. Let $\{D_n\}$ be an approximation of $\delta$. Suppose that each $D_n$ is continuous in $\mathbb{R}$ and let $f$ be a continuous function in $[a, b]$. Then the functions

$$f_n(x) := \int_a^b D_n(x - y) f(y)\,dy, \qquad x \in [a, b],$$

converge uniformly to $f$ in every interval $[c, d]$ strictly contained in $[a, b]$.

Theorem 9.74 uses, in an essential way, the fact that the approximations of $\delta$ are nonnegative. For instance, the result in Theorem 9.74 does not hold for the sequence of the Dirichlet kernels, since the Fourier series of $f$ does not converge to $f$ if $f$ is merely continuous, compare Section 11.5.

9.75 ¶. Prove Theorem 9.74.

9.76 ¶. Consider in $[-1, 1]$ the sequence of functions

$$D_n(x) := c_n (1 - x^2)^n \qquad \text{where} \qquad c_n := \frac{1}{\int_{-1}^{1} (1 - x^2)^n\,dx}.$$

Show that for every $\rho \in\, ]0, 1[$

$$\lim_{n\to\infty} \frac{\int_\rho^1 (1 - t^2)^n\,dt}{\int_0^1 (1 - t^2)^n\,dt} = 0.$$

Infer that $\{D_n\}$ is an approximation of $\delta$, hence the functions

$$f_n(x) := c_n \int_0^1 \big(1 - (t - x)^2\big)^n f(t)\,dt$$

converge uniformly to $f$ on compact sets of $]-1, 1[$. Finally, observing that the functions $f_n(x)$ are actually polynomials of degree not greater than $2n$, called Stieltjes polynomials, deduce from the above Weierstrass's theorem, Theorem 9.53.
Figure 9.9. Frontispieces of L'approximation des fonctions by Charles de la Vallée-Poussin (1866-1962) and of I. P. Natanson, Constructive Function Theory, New York, 1964.
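Consider the functions

$$D_n(t) := c_n \cos^{2n}\Big(\frac{t}{2}\Big), \qquad t \in [-\pi, \pi],$$

where, see 2.66 of [GM2],

$$c_n := \frac{1}{\int_{-\pi}^{\pi} \cos^{2n}\big(\frac{t}{2}\big)\,dt} = \frac{1}{2\pi}\,\frac{(2n)!!}{(2n-1)!!}.$$

As proved below, we have the following.

9.77 Lemma. The sequence $\{D_n\}$ is an approximation of $\delta$.

Hence, as a consequence of Theorem 9.74, we can state the following.

9.78 Theorem (de la Vallée Poussin). Let $f \in C^0([-\pi, \pi])$. The functions

$$T_n(x) := \frac{(2n)!!}{2\pi\,(2n-1)!!} \int_{-\pi}^{\pi} f(t) \cos^{2n}\Big(\frac{t - x}{2}\Big)\,dt \tag{9.18}$$

converge uniformly to $f$ in every interval $[a, b]$ with $-\pi < a < b < \pi$.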
Proof of Lemma 9.77. Let $0 < \rho < \pi/2$. Since $\cos t$ is decreasing in $[0, \pi/2]$, we have

$$\int_\rho^{\pi/2} \cos^{2n}(t)\,dt \le \Big(\frac{\pi}{2} - \rho\Big) \cos^{2n}(\rho) \le \frac{\pi}{2} \cos^{2n}(\rho);$$

on the other hand, since $\cos t$ is concave in $[0, \pi/2]$, we have $\cos t \ge 1 - 2t/\pi$, hence

$$\int_0^{\pi/2} \cos^{2n}(t)\,dt \ge \int_0^{\pi/2} \Big(1 - \frac{2t}{\pi}\Big)^{2n} dt = \frac{\pi}{2(2n + 1)};$$

we therefore conclude

$$\frac{\int_\rho^{\pi/2} \cos^{2n}(t)\,dt}{\int_0^{\pi/2} \cos^{2n}(t)\,dt} \le \frac{2(2n+1)}{\pi}\,\frac{\pi}{2}\,\cos^{2n}(\rho) \to 0 \qquad \text{as } n \to \infty$$

and

$$\lim_{n\to\infty} \frac{\int_0^{\rho} \cos^{2n}(t)\,dt}{\int_0^{\pi/2} \cos^{2n}(t)\,dt} = 1. \tag{9.19}$$

□
The functions $T_n(x)$ in (9.18) are often called de la Vallée Poussin integrals.

9.79 Remark. Let $g \in C^0(\mathbb{R})$ be a periodic function with period $2\pi$. Applying Theorem 9.78 to $f(x) := g(3x)$, $x \in [-\pi, \pi]$, we deduce the uniform convergence of $\{T_n(x)\}$ to $g(3x)$ in $[-\pi/3, \pi/3]$, i.e., the uniform convergence of $\{T_n(x/3)\}$ to $g(x)$ in $[-\pi, \pi]$. Since the $T_n$'s are trigonometric polynomials of degree at most $2n$, we may deduce at once the second Weierstrass theorem from Theorem 9.78.
9.3.3 The Stone-Weierstrass theorem

Weierstrass's theorems can be generalized to and seen as consequences of the following theorem proved in 1937 by Marshall Stone (1903-1989) and known as the Stone-Weierstrass theorem.

Let $X$ be a compact metric space and let $C(X) = C^0(X, \mathbb{R})$ be the Banach space of continuous functions with uniform norm. An algebra of functions $\mathcal{A}$ is a real (complex) linear space of functions $f : X \to \mathbb{R}$ (respectively, $f : X \to \mathbb{C}$) such that $fg \in \mathcal{A}$ if $f$ and $g \in \mathcal{A}$. We say that $\mathcal{A}$ distinguishes between the points of $X$ if for any two distinct points $x$ and $y$ in $X$ there is a function $f$ in $\mathcal{A}$ such that $f(x) \ne f(y)$. We say that $\mathcal{A}$ contains the constants if the constant functions belong to $\mathcal{A}$.

9.80 Theorem (Stone-Weierstrass). Let $X$ be a compact metric space and let $\mathcal{A}$ be an algebra of continuous real-valued functions, $\mathcal{A} \subset C^0(X, \mathbb{R})$. If $\mathcal{A}$ contains constants and if it also distinguishes between the points of $X$, then $\mathcal{A}$ is dense in the Banach space $C^0(X, \mathbb{R})$.

Let $\mathcal{A}$ be an algebra of bounded and continuous functions. As we have seen, the function $|y|$ can be approximated uniformly in $[-1, 1]$ by polynomials. Consequently, if $f \in \mathcal{A}$, by considering instead of $f$ the function $h := f / \|f\|_\infty$, we can approximate $x \to |f(x)|$ uniformly by the functions $P_n(f(x))$ where $\{P_n\}$ is a sequence of polynomials. Since the $P_n(f(x))$'s belong to $\mathcal{A}$, as $\mathcal{A}$ is an algebra of functions and $f \in \mathcal{A}$, we conclude that $|f|$ belongs to the uniform closure of $\mathcal{A}$, and also

$$\max(f, g) := \tfrac{1}{2}\big(f + g + |f - g|\big), \qquad \min(f, g) := \tfrac{1}{2}\big(f + g - |f - g|\big)$$

are in the uniform closure of $\mathcal{A}$, if both $f$ and $g$ are in the uniform closure of $\mathcal{A}$.

A linear space of functions $\mathcal{R}$ with the property that $\max(f, g)$ and $\min(f, g)$ are in $\mathcal{R}$ if $f$ and $g \in \mathcal{R}$ is called a linear lattice: the above can then be restated as the closure of $\mathcal{A}$ is a linear lattice. To prove that $\mathcal{A}$ is dense in $C^0(X, \mathbb{R})$, it therefore suffices to prove the following.

9.81 Theorem. Let $X$ be a compact metric space. A linear lattice $\mathcal{R} \subset C^0(X, \mathbb{R})$ is dense, provided it contains the constants and distinguishes between the points in $X$.

Proof. First we show that, for any $f \in C^0(X, \mathbb{R})$ and any couple of distinct points $x, y \in X$, we can find a function $\psi_{x,y} \in \mathcal{R}$ such that

$$\psi_{x,y}(x) = f(x), \qquad \psi_{x,y}(y) = f(y).$$

In fact by hypothesis, we can choose $w \in \mathcal{R}$ such that $w(x) \ne w(y)$; then the function

$$\psi_{x,y}(t) := \frac{(f(x) - f(y))\,w(t) - (f(x) w(y) - f(y) w(x))}{w(x) - w(y)}$$

has the required property. Given $f \in C^0(X, \mathbb{R})$, $\epsilon > 0$ and $y \in X$, for every $x \in X$ we find a ball $B(x, r_x)$ such that $\psi_{x,y}(t) > f(t) - \epsilon$ $\forall t \in B(x, r_x)$. Since $X$ is compact, we can cover it by a finite number of these balls $\{B(x_i, r_{x_i})\}$ and we set $\varphi_y := \max_i \psi_{x_i,y}$. Then $\varphi_y(y) = f(y)$, $\varphi_y(t) > f(t) - \epsilon$ $\forall t \in X$, and $\varphi_y \in \mathcal{R}$ since $\mathcal{R}$ is a lattice. We now let $y$ vary, and for any $y$ we find $B(y, r_y)$ such that $\varphi_y(t) < f(t) + \epsilon$ $\forall t \in B(y, r_y)$. Again covering $X$ by a finite number of these balls $\{B(y_i, r_{y_i})\}$, and setting $\varphi := \min_i \varphi_{y_i}$, we conclude $\varphi \in \mathcal{R}$ and $|\varphi(t) - f(t)| < \epsilon$ $\forall t \in X$, i.e., the claim. □
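The two ingredients used before Theorem 9.81, the uniform polynomial approximation of $|y|$ and the lattice formula for the maximum, can be tried out numerically. The sketch below uses one classical construction, which is an assumption here and need not be the one used earlier in the text: the polynomials $p_0 = 0$, $p_{k+1}(t) = p_k(t) + \frac{1}{2}(t - p_k(t)^2)$ increase to $\sqrt{t}$ uniformly on $[0, 1]$, with error at most $2/n$ after $n$ steps, so that $p_n(y^2)$ approximates $|y|$ on $[-1, 1]$. The function names are illustrative.

```python
def abs_approx(y, n):
    """Value at y of the polynomial p_n(y^2), where p_0 = 0 and
    p_{k+1}(t) = p_k(t) + (t - p_k(t)^2) / 2."""
    t = y * y
    p = 0.0
    for _ in range(n):
        p = p + (t - p * p) / 2.0
    return p


def sup_error(n, points=201):
    """Largest | |y| - p_n(y^2) | over a uniform grid of [-1, 1]."""
    ys = [-1.0 + 2.0 * i / (points - 1) for i in range(points)]
    return max(abs(abs(y) - abs_approx(y, n)) for y in ys)


def lattice_max(f, g):
    """max(f, g) written through the absolute value: (f + g + |f - g|) / 2."""
    return lambda x: 0.5 * (f(x) + g(x) + abs(f(x) - g(x)))


err_10 = sup_error(10)
err_100 = sup_error(100)
m = lattice_max(lambda x: x, lambda x: 1.0 - x)
```

Replacing `abs` inside `lattice_max` by `abs_approx` with the argument rescaled to $[-1, 1]$ gives an element of the algebra that is uniformly close to $\max(f, g)$, which is the mechanism behind the statement that the closure of an algebra is a linear lattice.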
Of course real polynomials in $[a, b]$ form an algebra of continuous functions that contains constants and distinguishes between the points of $[a, b]$. Thus the Stone-Weierstrass theorem implies the first Weierstrass theorem and even more, we have the following.

9.82 Corollary. Every real-valued continuous function on a compact set $K \subset \mathbb{R}^n$ is the uniform limit in $K$ of a sequence of polynomials in $n$ variables.

Theorem 9.80 does not extend to algebras of complex-valued functions. In fact, in the theory of functions of complex variables one shows that the uniform limits of polynomials are actually analytic functions and the map $z \to \bar{z}$, which is continuous, is not analytic. However, we have the following.

9.83 Theorem. Let $\mathcal{A} \subset C^0(X, \mathbb{C})$ be an algebra of continuous complex-valued functions defined on a compact metric space $X$. Suppose that $\mathcal{A}$ distinguishes between the points in $X$, contains all constant functions and contains the conjugate $\bar{f}$ of $f$ if $f \in \mathcal{A}$. Then $\mathcal{A}$ is dense in $C^0(X, \mathbb{C})$.
Figure 9.10. René-Louis Baire (1874-1932) and the frontispiece of his Leçons sur les fonctions discontinues.
Proof. Denote by $\mathcal{A}_0$ the subalgebra of $\mathcal{A}$ of real-valued functions. Of course $\Re f = \frac{1}{2}(f + \bar{f})$ and $\Im f = \frac{1}{2i}(f - \bar{f})$ belong to $\mathcal{A}_0$ if $f \in \mathcal{A}$. Since $f(x) \ne f(y)$ implies that $\Re f(x) \ne \Re f(y)$ or $\Im f(x) \ne \Im f(y)$, $\mathcal{A}_0$ also distinguishes between the points of $X$ and, trivially, contains the real constants. It follows that $\mathcal{A}_0$ is dense in $C^0(X, \mathbb{R})$ and consequently, $\mathcal{A}$ is dense in $C^0(X, \mathbb{C})$. □

The real-valued trigonometric polynomials

$$a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \tag{9.20}$$

form an algebra that distinguishes between the points of $[0, 2\pi[$ and contains the constants. Thus, trigonometric polynomials are dense among continuous real-valued periodic functions of period $2\pi$. More generally, from Theorem 9.83 we infer the following.

9.84 Theorem. All continuous complex-valued functions defined on the unit sphere $\{z \in \mathbb{C} \mid |z| = 1\}$ are uniform limits of complex-valued trigonometric polynomials

$$\sum_{k=-n}^{n} c_k e^{ik\theta}.$$
9.3.4 The Yosida regularization

a. Baire's approximation theorem

The next theorem relates semicontinuous functions to continuous functions.

9.85 Theorem (Baire). Let $X$ be a metric space and let $f : X \to \mathbb{R}$ be a function that is bounded from above and upper semicontinuous. Then there is a decreasing sequence of continuous, actually Lipschitz continuous, functions $\{f_n\}$ such that $f_n(x) \to f(x)$ for all $x \in X$.

Proof. Consider the so-called Yosida regularization of $f$

$$f_n(x) := \sup_{y \in X} \{f(y) - n\,d(y, x)\}.$$

Obviously $f(x) \le f_n(x) \le \sup f$, $f_n(x) \ge f_{n+1}(x)$. We shall now show that each $f_n$ is Lipschitz continuous with Lipschitz constant not greater than $n$. Let $x, y \in X$ and assume that $f_n(x) \ge f_n(y)$. For all $\eta > 0$ there is $x' \in X$ such that

$$f_n(x) < f(x') - n\,d(x, x') + \eta,$$

hence

$$0 \le f_n(x) - f_n(y) \le f(x') - n\,d(x, x') + \eta - \big(f(x') - n\,d(y, x')\big) = n\big(d(y, x') - d(x, x')\big) + \eta \le n\,d(x, y) + \eta,$$

thus

$$|f_n(x) - f_n(y)| \le n\,d(x, y)$$

since $\eta$ is arbitrary.

Let us show that $f_n(x_0) \downarrow f(x_0)$. Denote by $M$ the $\sup_{x \in X} f(x)$. Since $f(x_0) \ge \limsup_{x \to x_0} f(x)$, for any $\lambda > f(x_0)$ there is a spherical neighborhood $B(x_0, \delta)$ of $x_0$ such that $f(x) < \lambda$ $\forall x \in B(x_0, \delta)$, hence

$$f(x) - n\,d(x_0, x) \le \begin{cases} \lambda & \text{if } d(x, x_0) < \delta, \\ M - n\delta & \text{if } d(x, x_0) \ge \delta. \end{cases}$$

Then $f(x) - n\,d(x_0, x) \le \lambda$ $\forall x \in X$, provided $n$ is sufficiently large, $n \ge \dfrac{M - f(x_0)}{\delta}$, hence

$$f(x_0) \le f_n(x_0) = \sup_{x \in X} \big(f(x) - n\,d(x_0, x)\big) \le \lambda.$$

Since $\lambda > f(x_0)$ is arbitrary, we conclude $f(x_0) = \lim_{n\to\infty} f_n(x_0)$. □
Suppose that $X = \mathbb{R}^n$. An immediate consequence of Dini's theorem, Theorem 9.36, and of Baire's theorem, is the following.

9.86 Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function that is bounded from above and continuous. Then there exists a sequence of Lipschitz-continuous functions that converges uniformly on compact sets to $f$.
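The regularization used in the proof of Baire's theorem is easy to experiment with on a grid. The following sketch is illustrative only and makes its own assumptions: $X$ is replaced by a finite grid of $[-1, 1]$, and $f$ is the indicator function of the closed set $\{0\}$, which is upper semicontinuous and for which $f_n(x) = \max(0, 1 - n|x|)$; the name `yosida_sup` is hypothetical.

```python
def yosida_sup(f, xs, n):
    """Grid version of f_n(x) = sup_y { f(y) - n |y - x| }, with y ranging over xs."""
    return [max(f(y) - n * abs(y - x) for y in xs) for x in xs]


def f(x):
    """Indicator function of the closed set {0}: upper semicontinuous, bounded above."""
    return 1.0 if x == 0.0 else 0.0


xs = [i / 100.0 for i in range(-100, 101)]   # grid of [-1, 1] containing 0
f2 = yosida_sup(f, xs, 2)
f8 = yosida_sup(f, xs, 8)
```

On this example one sees the three properties shown in the proof: $f \le f_8 \le f_2$, each $f_n$ has Lipschitz constant at most $n$, and the tent functions $\max(0, 1 - n|x|)$ decrease pointwise to $f$.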
b. Approximation in metric spaces

Yosida regularization also turns out to be useful to approximate uniformly continuous functions from a metric (or normed) space into $\mathbb{R}$ by Lipschitz-continuous functions. Let $X$ be a normed space with norm $\|\ \|$.

9.87 Proposition. The class of uniformly continuous functions $f : X \to \mathbb{R}$ is closed with respect to the uniform convergence.

Recall that the modulus of continuity of $f : X \to \mathbb{R}$ is defined for all $t \in [0, +\infty[$ by

$$\omega_f(t) := \sup\big\{ |f(x) - f(y)| \;\big|\; x, y \in X,\ \|x - y\| \le t \big\}. \tag{9.21}$$

Clearly $f$ is uniformly continuous on $X$ if and only if $\omega_f(t) \to 0$ as $t \to 0$.

9.88 ¶. Prove Proposition 9.87.

Lipschitz-continuous functions from $X$ to $\mathbb{R}$ are of course uniformly continuous, therefore uniform limits of Lipschitz-continuous functions are uniformly continuous too, on account of Proposition 9.87. We shall now prove the converse, compare Example 9.61.

9.89 Theorem. Every uniformly continuous function $f : X \to \mathbb{R}$ is the uniform limit of a sequence of Lipschitz-continuous functions.

In order to prove Theorem 9.89, we introduce the function $\delta_f(s)$ that measures the uniform distance of $f$ from the class $\mathrm{Lip}_s(X)$ of Lipschitz-continuous functions $g : X \to \mathbb{R}$ with Lipschitz constants not greater than $s$

$$\delta_f(s) := \inf\big\{ \|f - g\|_\infty \;\big|\; g \in \mathrm{Lip}_s(X) \big\}.$$

9.90 ¶. Show that $s \to \delta_f(s)$ is nonincreasing and that $f$ is the uniform limit of a sequence of Lipschitz-continuous functions if and only if $\delta_f(s) \to 0$ as $s \to \infty$.

Then we introduce the Yosida regularization of $f : X \to \mathbb{R}$ by

$$f_s(x) := \inf_{y \in X} \big[ f(y) + s\|x - y\| \big].$$

9.91 ¶. Show that
(i) $f_s$ is Lipschitz-continuous with Lipschitz constant $s$,
(ii) $f_s(x) \le f_t(x)$ $\forall x$ if $s < t$, and, actually
(iii) $f_s$ is the largest $s$-Lipschitz-continuous function among functions less than or equal to $f$.
9.92 Proposition. Let $f : X \to \mathbb{R}$ be a uniformly continuous function. Then

$$\delta_f(s) = \frac{1}{2} \sup_{t \ge 0} \{\omega_f(t) - st\}. \tag{9.22}$$

Moreover, the minimum distance of $f$ from $\mathrm{Lip}_s(X)$ is obtained at $g_s(x) := f_s(x) + \delta_f(s)$, i.e., $\delta_f(s) = \|f - g_s\|_\infty$.

Proof. Let $g \in \mathrm{Lip}_s(X)$. Then

$$|f(x) - f(y)| \le 2\|f - g\|_\infty + s\|x - y\|.$$

For $x, y$ such that $\|x - y\| \le t$, by taking the infimum with respect to $g$, we infer

$$|f(x) - f(y)| \le 2\delta_f(s) + st$$

and, taking the supremum in $x$ and $y$ with $\|x - y\| \le t$, we get $\omega_f(t) \le 2\delta_f(s) + st$, hence

$$\frac{1}{2} \sup_{t \ge 0} \{\omega_f(t) - st\} \le \delta_f(s). \tag{9.23}$$

Let us prove that the inequality (9.23) is actually an equality and the second part of the claim. For $x, y \in X$ we have

$$f(x) - f(y) \le \omega_f(\|x - y\|) = \big(\omega_f(\|x - y\|) - s\|x - y\|\big) + s\|x - y\| \le \sup_{t \ge 0} \{\omega_f(t) - st\} + s\|x - y\|.$$

By taking the supremum in $y$ we get

$$0 \le f(x) - f_s(x) \le \sup_{t \ge 0} \{\omega_f(t) - st\} \le 2\delta_f(s) \qquad \forall x \in X,$$

hence, by (9.23) we infer

$$\|f - f_s\|_\infty \le \sup_{t \ge 0} \{\omega_f(t) - st\} \le 2\delta_f(s). \tag{9.24}$$

Therefore, for $g_s(x) := f_s(x) + \delta_f(s)$ we have

$$\|f - g_s\|_\infty \le \delta_f(s),$$

and, since $g_s \in \mathrm{Lip}_s(X)$, we conclude $\|f - g_s\|_\infty = \delta_f(s)$. Moreover, by (9.24)

$$\delta_f(s) = \|f - g_s\|_\infty = \|f - f_s\|_\infty - \delta_f(s) \le \sup_{t \ge 0} \{\omega_f(t) - st\} - \delta_f(s) \le \delta_f(s),$$

i.e.,

$$\frac{1}{2} \sup_{t \ge 0} \{\omega_f(t) - st\} = \delta_f(s).$$

□
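Formula (9.22) can be tested on a concrete function. The sketch below is a rough numerical check under assumptions that are not in the text: $X = \mathbb{R}$ is replaced by a grid of $[-2, 2]$ and $f(x) = \sqrt{|x|}$, for which $\omega_f(t) = \sqrt{t}$, so that $\sup_{t \ge 0}\{\omega_f(t) - st\} = 1/(4s)$ and (9.22) predicts $\delta_f(s) = 1/(8s)$. The name `yosida_inf` is illustrative.

```python
import math


def yosida_inf(f, xs, s):
    """Grid version of f_s(x) = inf_y [ f(y) + s |x - y| ], with y ranging over xs."""
    return [min(f(y) + s * abs(x - y) for y in xs) for x in xs]


def f(x):
    """Test function with modulus of continuity omega_f(t) = sqrt(t)."""
    return math.sqrt(abs(x))


xs = [i / 100.0 for i in range(-200, 201)]     # grid of [-2, 2]
s = 1.0
fs = yosida_inf(f, xs, s)
gap = max(f(x) - v for x, v in zip(xs, fs))    # approximates ||f - f_s||_inf
delta_estimate = gap / 2.0                     # approximates delta_f(s) by (9.22) and (9.24)
```

With $s = 1$ the computed gap is $1/4$, attained at $x = 1/4$, so that shifting $f_s$ upwards by $1/8$ gives the best uniform approximation of $f$ by a 1-Lipschitz function, as stated in Proposition 9.92.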
9.93 ¶. Show that if $f^s := -(-f)_s$, then $f^s \in \mathrm{Lip}_s(X)$ and

$$\delta_f(s) = \|f - g^s\|_\infty, \qquad g^s(x) := f^s(x) - \delta_f(s).$$
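Proof of Theorem 9.89. It is enough to prove that $\inf_{s>0} \delta_f(s) = 0$. First, we notice that $\omega_f(t)$ is nondecreasing and subadditive. In fact, if $x, y$ are such that $\|x - y\| \le a + b$ and we write

$$z := \frac{b}{a + b}\,x + \frac{a}{a + b}\,y,$$

then $\|z - x\| \le a$ and $\|z - y\| \le b$; consequently, $|f(x) - f(z)| \le \omega_f(a)$ and $|f(y) - f(z)| \le \omega_f(b)$, that yield at once

$$\omega_f(a + b) \le \omega_f(a) + \omega_f(b).$$

Next, we observe that for any $\epsilon > 0$ and $t = m\epsilon + \sigma$, $m \in \mathbb{N}$, $0 \le \sigma < \epsilon$, we have $m \le t/\epsilon$ and

$$\omega_f(t) = \omega_f(m\epsilon + \sigma) \le m\,\omega_f(\epsilon) + \omega_f(\sigma) \le \omega_f(\epsilon)\,\frac{t}{\epsilon} + \omega_f(\epsilon).$$

Therefore, by (9.22),

$$\delta_f\Big(\frac{\omega_f(\epsilon)}{\epsilon}\Big) = \frac{1}{2} \sup_{t \ge 0} \Big\{ \omega_f(t) - \frac{\omega_f(\epsilon)}{\epsilon}\,t \Big\} \le \frac{1}{2}\,\omega_f(\epsilon).$$

From the last inequality, since $\omega_f(\epsilon) \to 0$ as $\epsilon \to 0$, we easily infer $\inf_{s>0} \delta_f(s) = 0$. □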
9.4 Linear Operators

9.4.1 Basic facts

In finite-dimensional spaces, linear maps are continuous, but this is no longer true in infinite-dimensional normed spaces, see Example 9.96.

9.94 ¶. Show that

Proposition. Let $X$ and $Y$ be normed spaces. Suppose that $X$ is finite dimensional. Then every linear map $L : X \to Y$ is continuous.

The following proposition characterizes linear maps between two normed spaces that are continuous.

9.95 Proposition. Let $X$ and $Y$ be normed spaces and let $L : X \to Y$ be a linear map. Then the following conditions are equivalent
(i) $L$ is continuous in $X$,
(ii) $L$ is continuous at $0$,
(iii) $L$ is bounded on the unit ball, i.e., there exists $K \ge 0$ such that $\|L(x)\|_Y \le K$ $\forall x$ with $\|x\|_X \le 1$,
(iv) there exists $K \ge 0$ such that $\|L(x)\|_Y \le K\|x\|_X$ $\forall x \in X$,
(v) $L$ is Lipschitz continuous.

Proof. If $L$ is continuous, then trivially, (ii) holds. If (ii) holds, then there exists $\delta > 0$ such that $\|L(x)\|_Y \le 1$ provided $\|x\|_X \le \delta$. This yields $\|L(x)\|_Y \le 1/\delta$ if $\|x\|_X \le 1$, since $L$ is linear, i.e., (iii). Assuming (iii) and the linearity of $L$, we infer that for all $x \ne 0$

$$\|L(x)\|_Y = \|x\|_X \Big\| L\Big(\frac{x}{\|x\|_X}\Big) \Big\|_Y \le K\|x\|_X,$$

i.e., (iv). (iv) in turn implies (v) since

$$\|L(x) - L(y)\|_Y = \|L(x - y)\|_Y \le K\|x - y\|_X \qquad \forall x, y \in X,$$

and trivially, (v) implies (i). □
9.96 Example. Let $X$ be a normed space and let $\{e_n\} \subset X$ be a countable system of independent vectors with $\|e_n\| = 1$, and let $Y \subset X$ be the subspace of finite linear combinations of $\{e_n\}$. Consider the operator $L : Y \to \mathbb{R}$ defined on $\{e_n\}$ by $L(e_n) := n$ $\forall n$ and linearly extended to $Y$. Evidently $L$ is linear and not bounded.
Linear maps between Banach spaces are often called linear operators.

a. Continuous linear forms and hyperplanes

Consider a linear map $L : X \to \mathbb{K}$ defined on a linear normed space $X$, often called also a linear form. If $L$ is not identically zero, we can find $\bar{x}$ such that $\bar{x} \notin \ker L$ and we can decompose every $x \in X$ as

$$x = \Big(x - \frac{L(x)}{L(\bar{x})}\,\bar{x}\Big) + \frac{L(x)}{L(\bar{x})}\,\bar{x},$$

in other words $X = \mathrm{Span}\,\{\bar{x}\} \oplus \ker L$. However it may happen that $\ker L$ is dense in $X$.

9.97 Proposition. Let $L : X \to \mathbb{R}$ be a linear map defined on a normed space $X$. Then $\ker L$ is closed if and only if $L$ is continuous.

Proof. Trivially, $\ker L := L^{-1}(0)$ is closed if $L$ is continuous. Conversely, if $\ker L = X$, then $L$ is constant, hence continuous. Otherwise we can choose $\bar{x}$ such that $L(\bar{x}) = 1$. Since $\ker L$ is closed, also $H := \bar{x} + \ker L$ is closed; since $0 \notin H$, we can then find a ball $B(0, r)$ such that $B(0, r) \cap H = \emptyset$. We now prove that $L$ is continuous showing that

$$|L(x)| \le 1 \qquad \forall x \in B(0, r).$$

In fact, if $|L(x)| > 1$ for some $x \in B(0, r)$, then

$$\Big\| \frac{x}{L(x)} \Big\| = \frac{1}{|L(x)|}\,\|x\| < r \qquad \text{while} \qquad L\Big(\frac{x}{L(x)}\Big) = 1.$$

Since $H = \{x \mid L(x) = 1\}$, we conclude that $x/L(x) \in H \cap B(0, r)$, a contradiction. □
9.98 Corollary. If L : X -^ R is a linear map on a normed space X, then ker L is either closed or dense in X. In fact, the closure of ker L is a linear subspace that may agree either with kerL or X.
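b. The space of linear continuous maps

For any linear map $L : X \to Y$ between two normed spaces with norms $\|\ \|_X$ and $\|\ \|_Y$, we define

$$\|L\|_{\mathcal{L}(X,Y)} := \sup\big\{ \|L(x)\|_Y \;\big|\; x \in X,\ \|x\|_X \le 1 \big\}, \tag{9.25}$$

that is, $\|L\|_{\mathcal{L}(X,Y)} = \|L\|_{\infty,B}$ where $B$ is the closed unit ball of $X$, or, equivalently,

$$\|L\|_{\mathcal{L}(X,Y)} = \inf\big\{ K \in \mathbb{R} \;\big|\; \|L(x)\|_Y \le K\|x\|_X\ \forall x \in X \big\},$$

so that

$$\|L(x)\|_Y \le \|L\|_{\mathcal{L}(X,Y)}\,\|x\|_X \qquad \forall x \in X.$$

One can shorten this to $\|Lx\| \le \|L\|\,\|x\|$.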
= Ln{x)
+ Ln{y),
Ln{Xx)
=
XLnix),
we see that L is linear. Letting m ^- cx) in ||Ln(ip) — Lm{x)\\Y < e valid for \\x\\x < 1? n > no(e), we also find \\Ln{x) — L(X)\\Y < e for ||a:||x ^ 1 and n > no(e). This implies ||L|| < \\Ln\\ + e and \\Ln — L\\ < e for n > no, which in turn yields Ln -^ -^ in
C{X,Y).
D
c. Norms on matrices

For any $n$, let $\mathbb{K} = \mathbb{R}$ ($\mathbb{K} = \mathbb{C}$) and consider $\mathbb{R}^n$ (respectively $\mathbb{C}^n$) as a Euclidean (Hermitian) space endowed with the standard Euclidean (Hermitian) product and let $|\ |$ be the associated norm. Let $L : \mathbb{K}^n \to \mathbb{K}^m$ be linear, let $\mathbf{L} \in M_{m,n}(\mathbb{K})$ be the associated matrix, $L(x) =: \mathbf{L}x$, and let $\mu_1, \mu_2, \dots, \mu_n$ be the singular values of $L$, that is the eigenvalues of the matrix $(\mathbf{L}^*\mathbf{L})^{1/2}$ ordered in increasing order. Then

$$\|L\|^2 = \sup_{|x|=1} |L(x)|^2 = \sup_{|x|=1} (L^*L(x) \mid x) = \mu_n^2.$$

Now define the $\ell_2$-norm of $L \in \mathcal{L}(\mathbb{K}^n, \mathbb{K}^m)$ by

$$\|L\|_2 := \Big( \sum_{i,j} |L^i_j|^2 \Big)^{1/2}.$$

Of course, $\|\ \|_2$ and $\|\ \|$ are equivalent norms in $\mathcal{L}(\mathbb{K}^n, \mathbb{K}^m)$, since $\mathcal{L}(\mathbb{K}^n, \mathbb{K}^m)$ is finite dimensional. More precisely we compute

$$\|L\|_2^2 = \mathrm{tr}(\mathbf{L}^*\mathbf{L}) = \mu_1^2 + \dots + \mu_n^2,$$

and therefore,

9.100 Proposition. Let $\mathbf{L} \in M_{m,n}(\mathbb{K})$. Then $\|L\|$ is the maximum of the singular values of $L$ and $\|L\| \le \|L\|_2 \le \sqrt{n}\,\|L\|$. Moreover, $\|L\| = \|L\|_2$ if and only if $\mathrm{Rank}\,L = 1$.

Proof. Let $\mu_1, \mu_2, \dots, \mu_n$ be the singular values of $L$ ordered in nondecreasing order. By the above, $\|L\| = \mu_n \le \|L\|_2 \le \sqrt{n}\,\|L\|$ while equality $\|L\| = \|L\|_2$ is equivalent to $\mu_1 = \dots = \mu_{n-1} = 0$, and this happens if and only if $\mathrm{Rank}\,L = 1$. □

9.101 ¶. Let $T : \ell_2 \to \mathbb{K}$ be a bounded operator and for $i = 1, 2, \dots$, let $e^i := \{\delta_{in}\}_n$. Then

$$\dots\ \sum_{j=1}^{\infty}\ \dots$$
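The inequalities of Proposition 9.100 can be checked by hand on small matrices. The following sketch is restricted, as an assumption made for simplicity, to real $2 \times 2$ matrices, where the singular values are available in closed form as the square roots of the eigenvalues of $\mathbf{L}^T\mathbf{L}$; the function names are illustrative.

```python
import math


def singular_values_2x2(a, b, c, d):
    """Singular values (in nondecreasing order) of [[a, b], [c, d]], computed as the
    square roots of the eigenvalues of the symmetric matrix L^T L."""
    p = a * a + c * c          # entry (1,1) of L^T L
    q = a * b + c * d          # entry (1,2) of L^T L
    r = b * b + d * d          # entry (2,2) of L^T L
    mean = (p + r) / 2.0
    disc = math.sqrt(((p - r) / 2.0) ** 2 + q * q)
    return math.sqrt(max(mean - disc, 0.0)), math.sqrt(mean + disc)


def operator_norm(a, b, c, d):
    """||L||: the largest singular value."""
    return singular_values_2x2(a, b, c, d)[1]


def l2_norm(a, b, c, d):
    """||L||_2: the square root of the sum of the squared entries."""
    return math.sqrt(a * a + b * b + c * c + d * d)


full_rank = (1.0, 2.0, 3.0, 4.0)
rank_one = (1.0, 2.0, 2.0, 4.0)
```

For the matrix with rows $(1, 2)$ and $(3, 4)$ the two norms differ, while for the rank one matrix with rows $(1, 2)$ and $(2, 4)$ both are equal to 5.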
d. Pointwise and uniform convergence for operators

In $\mathcal{L}(X, Y)$ we may define two notions of convergence.

9.102 Definition. Let $\{L_n\} \subset \mathcal{L}(X, Y)$.
(i) We say that $\{L_n\}$ converges pointwise to $L$ if $\forall x \in X$ we have

$$\|L_n(x) - L(x)\|_Y \to 0,$$

(ii) we say that $\{L_n\}$ converges to $L$ in norm or uniformly, if

$$\|L_n - L\|_{\mathcal{L}(X,Y)} = \sup_{\|x\|_X \le 1} \|L_n(x) - L(x)\|_Y \to 0 \qquad \text{as } n \to \infty.$$

Trivially, $L_n \to L$ pointwise if $L_n \to L$ uniformly. The converse is in general false, although it holds true if $X$ is finite dimensional.

9.103 Example. Recall that a sequence $\{x_n\}$ is in $\ell_2(\mathbb{R})$ if and only if $\sum_{k=1}^{\infty} x_k^2 < +\infty$. For any $n \in \mathbb{N}$ let $e^{(n)} := \{\delta_{kn}\}_k$. Of course, $\|e^{(n)}\|_2 = 1$ $\forall n$. Consider the sequence of linear forms $\{L_n\}$ on $\ell_2(\mathbb{R})$ defined by $L_n(\{x_k\}) = x_n$. For any $x \in \ell_2(\mathbb{R})$ we have $\sum_{k=1}^{\infty} x_k^2 < +\infty$, hence $L_n(x) = x_n \to 0$ as $n \to \infty$, i.e., $L_n \to 0$ pointwise. On the other hand,

$$\|L_n - 0\|_{\mathcal{L}(\ell_2,\mathbb{R})} = \|L_n\|_{\mathcal{L}(\ell_2,\mathbb{R})} = \sup_{\|x\|_2 \le 1} |L_n(x)| \ge L_n(e^{(n)}) = 1$$

and $\{L_n\}$ does not converge uniformly to $0$.
e. The algebra End(X)

Let $X$, $Y$ and $Z$ be linear normed spaces and let $f : X \to Y$ and $g : Y \to Z$ be linear continuous operators. The composition $g \circ f : X \to Z$ is again a linear continuous operator from $X$ to $Z$, and for every $x \in X$ we have

$$\|(g \circ f)(x)\|_Z \le \|g\|\,\|f(x)\|_Y \le \|g\|\,\|f\|\,\|x\|_X,$$

i.e.,

$$\|g \circ f\| \le \|g\|\,\|f\|. \tag{9.26}$$

9.104 Example. In general $\|g \circ f\| < \|g\|\,\|f\|$. For instance, if $X = \mathbb{R}^2$ and $f$ and $g$ are the orthogonal projections on the axes, we have $\|f\| = \|g\| = 1$ and $f \circ g = g \circ f = 0$, hence $\|g \circ f\| = \|f \circ g\| = 0$.

9.105 Example. Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by $T(x) = \mathbf{T}x$ where

$$\mathbf{T} := \begin{pmatrix} 1 & 0 \\ 0 & \epsilon \end{pmatrix}, \qquad 0 < \epsilon \ll 1.$$

Then $T^{-1}(x) = \mathbf{T}^{-1}x$ where

$$\mathbf{T}^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & 1/\epsilon \end{pmatrix}.$$

We then compute $\|T\| = 1$, $\|T^{-1}\| = 1/\epsilon$ and $\|T\|\,\|T^{-1}\| \gg 1 = \|\mathrm{Id}\| = \|T^{-1} \circ T\|$.
Let $X$ be a Banach space with norm $\|\ \|_X$, denote the Banach space $\mathcal{L}(X, X)$ by $\mathrm{End}(X)$ and the norm on $\mathrm{End}(X)$ by $\|\ \|$. The product of composition defines in $\mathrm{End}(X)$ a structure of algebra in which the product satisfies the inequality (9.26): this is expressed by saying that $\mathrm{End}(X)$ is a Banach algebra. Clearly, if $L \in \mathrm{End}(X)$ and $L^n = L \circ L \circ \dots \circ L$, then by (9.26) we have

$$\|L^n\| \le \|L\|^n. \tag{9.27}$$

Again, in general, we may have a strict inequality.

9.106 Proposition. Let $X$ be a Banach space and $L \in \mathrm{End}(X)$. If $\|L\| < 1$, then $\mathrm{Id} - L$ is invertible,

$$(\mathrm{Id} - L)^{-1} = \sum_{n=0}^{\infty} L^n \qquad \text{in } \mathrm{End}(X)$$

and $\|(\mathrm{Id} - L)^{-1}\| \le \dfrac{1}{1 - \|L\|}$. In particular, for any $y \in X$ the equation $x - Lx = y$ has a unique solution, $x = \sum_{n=0}^{\infty} L^n y$, and $\|x\| \le \dfrac{1}{1 - \|L\|}\,\|y\|$.

Proof. The series $\sum_{n=0}^{\infty} L^n$ is absolutely convergent, since

$$\sum_{n=0}^{\infty} \|L^n\| \le \sum_{n=0}^{\infty} \|L\|^n = \frac{1}{1 - \|L\|},$$

hence convergent. In particular, $S := \sum_{n=0}^{\infty} L^n \in \mathrm{End}(X)$ and $\|S\| \le \dfrac{1}{1 - \|L\|}$. Finally

$$(\mathrm{Id} - L) \sum_{k=0}^{n} L^k = \mathrm{Id} - L^{n+1} \to \mathrm{Id} \qquad \text{in } \mathrm{End}(X)$$

since $\|L^{n+1}\| \le \|L\|^{n+1} \to 0$. □
f. The exponential of an operator

Again by (9.27) we get, similarly to Proposition 9.106, the following.

9.107 Proposition. Let $X$ be a Banach space and $L \in \mathrm{End}(X)$.
(i) Let $f(z) := \sum_{n=0}^{\infty} a_n z^n$ be a power series with radius of convergence $\rho > 0$. If $\|L\| < \rho$, then the series $\sum_{n=0}^{\infty} a_n L^n$ converges in $\mathrm{End}(X)$ and defines a linear continuous operator

$$f(L) := \sum_{n=0}^{\infty} a_n L^n \in \mathrm{End}(X).$$

(ii) The series $\sum_{k=0}^{\infty} \frac{1}{k!} L^k$ converges in $\mathrm{End}(X)$ and defines the linear continuous operator

$$e^L = \exp(L) := \sum_{k=0}^{\infty} \frac{1}{k!} L^k \in \mathrm{End}(X).$$

9.108 ¶. Show the following.

Proposition. Let $X$ be a Banach space and let $A, B \in \mathrm{End}(X)$. Then we have
(i) $\big(\mathrm{Id} + \frac{A}{n}\big)^n \to e^A$ in $\mathrm{End}(X)$,
(ii) $\|e^A\| \le e^{\|A\|}$,
(iii) if $A$ and $B$ commute, i.e., $AB = BA$, then $e^{A+B} = e^A e^B$,
(iv) if $A$ has an inverse, then $(e^A)^{-1} = e^{-A}$,
(v) if $X$ is finite-dimensional, $X = \mathbb{R}^n$, we have

$$e^{PAP^{-1}} = P e^A P^{-1} \quad \text{if } P \text{ has an inverse}, \qquad \det e^A = e^{\mathrm{tr}\,A} \quad \text{if } A \text{ is symmetric}.$$
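Some of the identities of Exercise 9.108 can be observed numerically in finite dimension. The sketch below is only a plausibility check under assumptions of its own: $X = \mathbb{R}^2$, a symmetric matrix $A$ with zero trace, $B = A/2$ as a matrix commuting with $A$, and the exponential series truncated after 40 terms; the helper names are illustrative.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_exp(a, terms=40):
    """Partial sum of the series sum_k A^k / k! with the given number of terms."""
    n = len(a)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, a)]   # A^k / k!
        total = mat_add(total, term)
    return total


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


A = [[1.0, 2.0], [2.0, -1.0]]      # symmetric, trace 0
B = [[0.5, 1.0], [1.0, -0.5]]      # B = A / 2 commutes with A
exp_A = mat_exp(A)
lhs = mat_exp(mat_add(A, B))       # e^(A+B)
rhs = mat_mul(exp_A, mat_exp(B))   # e^A e^B
```

Here $\det e^A$ comes out equal to $e^{\mathrm{tr}\,A} = 1$ up to rounding, and the two sides of $e^{A+B} = e^A e^B$ agree entrywise.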
9.4.2 Fundamental theorems In this subsection, we briefly illustrate four of the most important theorems about the structure of linear continuous operators on normed spaces. The first three, the principle of uniform boundedness, the open mapping theorem and the closed graph theorem are a consequence of Baire's category theorem, see Chapter 5, and are due to Stefan Banach (1892-1945); the fourth one, known as Hahn-Banach theorem, was proved independently by Hans Hahn (1879-1934) in 1926 and by Banach in 1929.
a. The principle of uniform boundedness

The following important theorem is known as the Banach-Steinhaus theorem and also as the principle of uniform boundedness.

9.109 Theorem (Banach-Steinhaus). Let $X$ be a Banach space and $Y$ be a normed linear space. Let $\{T_\alpha\}$ be a family of bounded linear operators from $X$ to $Y$ indexed on an arbitrary set $A$ (possibly nondenumerable). If

$$\sup_{\alpha \in A} \|T_\alpha x\|_Y \le C(x) < +\infty \qquad \forall x \in X,$$

then there exists a constant $C$ such that

$$\sup_{\alpha \in A} \|T_\alpha\|_{\mathcal{L}(X,Y)} \le C < +\infty.$$

Proof. Set

$$X_n := \big\{ x \in X \;\big|\; \|T_\alpha x\|_Y \le n\ \forall \alpha \in A \big\}.$$

$X_n$ is closed and by hypothesis $\cup_n X_n = X$. By Baire's category theorem, it follows that there exist $n_0 \in \mathbb{N}$, $x_0 \in X$ and $r_0 > 0$ such that $B(x_0, r_0) \subset X_{n_0}$, that is

$$\|T_\alpha(x_0 + r_0 z)\|_Y \le n_0 \qquad \forall \alpha \in A,\ \forall z \in B(0, 1),$$

hence

$$\|T_\alpha(z)\|_Y = \frac{1}{r_0}\,\|T_\alpha(r_0 z + x_0 - x_0)\|_Y \le \frac{1}{r_0}\big(n_0 + \|T_\alpha(x_0)\|_Y\big) \le \frac{n_0 + C(x_0)}{r_0} =: C.$$

□
The following corollaries are trivial consequences.

9.110 Corollary. Let $\{T_k\}$ be a sequence of bounded linear operators from a Banach space $X$ into a normed space $Y$. Suppose that for each $x \in X$ $\lim_{k\to\infty} T_k x =: Tx$ exists in $Y$. Then the limit operator $T$ is also a bounded linear operator from $X$ to $Y$ and we have

$$\|T\|_{\mathcal{L}(X,Y)} \le \liminf_{k\to\infty} \|T_k\|_{\mathcal{L}(X,Y)}.$$

9.111 Corollary. Let $1 \le p < +\infty$. Any linear operator from $\ell_p(\mathbb{R})$ into a normed linear space $Y$ is continuous.

Proof. In fact, the linear operators $\{L_n\}$ defined by

$$L_n(\xi) := L((\xi_1, \xi_2, \dots, \xi_n, 0, 0, \dots)) \qquad \text{if } \xi = (\xi_1, \xi_2, \dots)$$

are clearly continuous and $L_n(\xi) \to L(\xi)$ for every $\xi \in \ell_p$. Therefore $L$ is continuous by the Banach-Steinhaus theorem. □

The following theorem, again due to Banach, is also a consequence of Baire's category theorem.
9.112 Theorem. Given a sequence of bounded linear operators $\{T_k\}$ from a Banach space $X$ into a normed linear space $Y$, the set

$$\big\{ x \in X \;\big|\; \liminf_{k\to\infty} \|T_k x\|_Y < +\infty \big\}$$

either coincides with $X$ or is a set of the first category of $X$.

This in turn implies the following.

9.113 Corollary (Principle of condensation of singularities). For $p = 1, 2, \dots$, let $\{T_{p,q}\}$, $q = 1, 2, \dots$, be a sequence of bounded linear operators from a Banach space $X$ into a normed space $Y_p$. Assume that for each $p$ there exists $x_p \in X$ such that $\limsup_{q\to\infty} \|T_{p,q} x_p\|_{Y_p} = \infty$. Then the set

$$\big\{ x \in X \;\big|\; \limsup_{q\to\infty} \|T_{p,q} x\|_{Y_p} = +\infty \text{ for all } p = 1, 2, 3, \dots \big\}$$

is of second category.

The above principle gives a general method of finding functions with many singularities. For instance one can find in this way a continuous function $f(t)$ of period $2\pi$ such that the partial sum of its Fourier expansion

$$S_n f(t) := \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cos kt + b_k \sin kt)$$

satisfies the condition

$$\limsup_{n\to+\infty} |S_n f(t)| = \infty$$

in a set $P \subset [0, 2\pi]$ which has the power of the continuum.¹
¹ For proofs we refer the interested reader to e.g., K. Yosida, Functional Analysis, Springer-Verlag, Berlin, 1964.

b. The open mapping theorem

9.114 Theorem (Banach's open mapping theorem). Let $X$ and $Y$ be Banach spaces and let $T$ be a surjective bounded linear operator from $X$ into $Y$. Then $T$ is open, i.e., it maps open sets of $X$ onto open sets of $Y$.

Proof. We divide the proof into two steps.

Step 1. First we prove that there is a $\delta > 0$ such that

$$\overline{T B_X(0, 1)} \supset B_Y(0, 2\delta).$$

Set $X_n := n\,\overline{T B_X(0, 1)}$. All $X_n$ are closed and, since $T$ is surjective, $\cup_{n=1}^{\infty} X_n = Y$. By Baire's category theorem, see Theorem 5.118, it follows that for some $n$, $X_n$ has a nonvoid interior. By homogeneity, $\overline{T(B_X(0, 1))}$ has a nonvoid interior, too, i.e., there exist $y_0 \in Y$ and $\delta > 0$ such that $B_Y(y_0, 4\delta) \subset \overline{T(B_X(0, 1))}$. By symmetry $-y_0 \in \overline{T B_X(0, 1)}$, and, as $\overline{T B_X(0, 1)}$ is convex, $B_Y(0, 2\delta) \subset \overline{T B_X(0, 1)}$.

Step 2. We shall now prove that

$$T B_X(0, 1) \supset B_Y(0, \delta),$$

that is the claim. Observe that by Step 1 and homogeneity

$$\overline{T B_X(0, r)} \supset B_Y(0, 2\delta r) \qquad \forall r > 0. \tag{9.28}$$

We want to prove that the equation $Tx = y$ has a solution $x \in B_X(0, 1)$ for any $y \in B_Y(0, \delta)$. Let $y \in Y$ be such that $\|y\|_Y < \delta$. By (9.28) there exists $x_1 \in X$ such that $\|x_1\|_X < 1/2$ and $\|Tx_1 - y\|_Y < \delta/2$. Similarly, considering the equation $Tx = y - Tx_1$, one can find $x_2 \in X$ such that $\|x_2\|_X < 1/4$ and $\|y - Tx_1 - Tx_2\|_Y < \delta/4$. By induction, we then construct points $x_n \in X$ such that $\|x_n\|_X < 2^{-n}$ and $\|y - \sum_{k=1}^{n} Tx_k\|_Y < \delta/2^n$. Therefore the series $\sum_{k=1}^{\infty} x_k$ is absolutely convergent in $X$ with sum of the norms less than 1, hence it converges to some $x \in X$ with $\|x\|_X < 1$, and $\|y - Tx\|_Y = 0$. □

9.115 ¶. Show the converse of the open mapping theorem: if $T : X \to Y$ is an open, bounded linear operator between Banach spaces, then $T$ is surjective.
A trivial consequence of Theorem 9.114 is the following.

9.116 Corollary (Banach's continuous inverse theorem). Let $X, Y$ be Banach spaces and let $T : X \to Y$ be a surjective and one-to-one bounded linear operator. Then $T^{-1}$ is a bounded operator.

9.117 Remark. Let $X$ and $Y$ be Banach spaces and let $T : X \to Y$ be a linear continuous operator. Often one says that the equation $Tx = y$ is well posed if for any $y \in Y$ it has a unique solution $x \in X$ which depends continuously on $y$. Corollary 9.116 says that the equation $Tx = y$ is well posed if $X$ and $Y$ are Banach spaces and $Tx = y$ is uniquely solvable $\forall y \in Y$.

c. The closed graph theorem

Let $X, Y$ be two Banach spaces. Then $X \times Y$ endowed with the norm

$$\|(x,y)\|_{X\times Y} := \|x\|_X + \|y\|_Y$$

is also a Banach space.

9.118 Theorem (Banach's closed graph theorem). Let $X$ and $Y$ be Banach spaces and let $T : X \to Y$ be a linear operator. Then $T : X \to Y$ is bounded if and only if its graph

$$G_T := \{(x,y) \in X \times Y \mid y = Tx\}$$

is closed in $X \times Y$.

Proof. If $T$ is continuous, then trivially $G_T$ is closed. Conversely, if $G_T$ is closed, it is a closed linear subspace of $X \times Y$, hence $G_T$ is a Banach space with the induced norm of $X \times Y$. The linear map $\pi : G_T \to X$, $\pi((x, Tx)) := x$, is a bounded linear operator that is one-to-one and onto; hence, by the Banach continuous inverse theorem, the inverse map of $\pi$, $\pi^{-1} : X \to G_T$, $x \mapsto (x, Tx)$, is a bounded linear operator, i.e., $\|x\|_X + \|Tx\|_Y \le C\|x\|_X$ for some constant $C$. $T$ is therefore bounded. □
Figure 9.11. Hans Hahn (1879-1934) and Hugo Steinhaus (1887-1972).
d. The Hahn-Banach theorem

The Hahn-Banach theorem is one of the most important results in linear functional analysis. Basically, it allows one to extend to the whole space, in a controlled way, a bounded linear functional defined on a subspace. In particular, it enables us to show that the dual space, i.e., the space of linear bounded forms on $X$, is rich.

9.119 Theorem (Hahn-Banach, analytical form). Let $X$ be a real normed space and let $p : X \to \mathbb{R}$ be a sublinear functional, that is, satisfying

$$p(x + y) \le p(x) + p(y), \qquad p(\lambda x) = \lambda p(x) \qquad \forall \lambda > 0,\ \forall x, y \in X.$$

Let $Y$ be a linear subspace of $X$ and let $f : Y \to \mathbb{R}$ be a linear functional such that $f(x) \le p(x)$ $\forall x \in Y$. Then $f$ can be extended to a linear functional $F : X \to \mathbb{R}$ satisfying

$$F(x) = f(x) \quad \forall x \in Y, \qquad F(x) \le p(x) \quad \forall x \in X.$$

Proof. Denote by $\mathcal{K}$ the set of all pairs $(Y_\alpha, g_\alpha)$ where $Y_\alpha$ is a linear subspace of $X$ such that $Y_\alpha \supset Y$ and $g_\alpha$ is a linear functional on $Y_\alpha$ satisfying

$$g_\alpha(x) = f(x) \quad \forall x \in Y, \qquad g_\alpha(x) \le p(x) \quad \forall x \in Y_\alpha.$$

We define an order in $\mathcal{K}$ by $(Y_\alpha, g_\alpha) \le (Y_\beta, g_\beta)$ if $Y_\alpha \subset Y_\beta$ and $g_\alpha = g_\beta$ on $Y_\alpha$. Then $\mathcal{K}$ becomes a partially ordered set. Every totally ordered subset $\{(Y_\alpha, g_\alpha)\}$ clearly has an upper bound $(Y', g')$ given by $Y' = \bigcup_\beta Y_\beta$, $g' = g_\beta$ on $Y_\beta$. Hence, by Zorn's lemma, see e.g., Section 3.3 of [GM2], there is a maximal element $(Y_0, g_0)$. If we show that $Y_0 = X$, then the proof is complete with $F = g_0$. We shall assume that $Y_0 \ne X$ and derive a contradiction. Let $y_1 \notin Y_0$ and consider

$$Y_1 := \mathrm{Span}\,(Y_0 \cup \{y_1\}) = \{x = y + \lambda y_1 \mid y \in Y_0,\ \lambda \in \mathbb{R}\};$$

notice that $y \in Y_0$ and $\lambda \in \mathbb{R}$ are uniquely determined by $x$, otherwise we get $y_1 \in Y_0$. Define $g_1 : Y_1 \to \mathbb{R}$ by $g_1(y + \lambda y_1) := g_0(y) + \lambda c$. If we can choose $c$ in such a way that

$$g_1(y + \lambda y_1) = g_0(y) + \lambda c \le p(y + \lambda y_1)$$

for all $\lambda \in \mathbb{R}$, $y \in Y_0$, then $(Y_1, g_1) \in \mathcal{K}$ and $(Y_0, g_0) \le (Y_1, g_1)$, $Y_1 \ne Y_0$. This contradicts the maximality of $(Y_0, g_0)$.

To choose $c$, we notice that for $x, y \in Y_0$

$$g_0(y) - g_0(x) = g_0(y - x) \le p(y - x) \le p(y + y_1) + p(-y_1 - x).$$

Hence

$$-p(-y_1 - x) - g_0(x) \le p(y + y_1) - g_0(y) \qquad \forall x, y \in Y_0,$$

so that

$$A := \sup_{x \in Y_0} \bigl[-p(-y_1 - x) - g_0(x)\bigr] \le \inf_{y \in Y_0} \bigl[p(y + y_1) - g_0(y)\bigr] =: B.$$

Thus we can choose $c$ such that $A \le c \le B$. Then

$$c \le p(y + y_1) - g_0(y) \quad \forall y \in Y_0, \qquad -p(-y_1 - y) - g_0(y) \le c \quad \forall y \in Y_0.$$

Multiplying the first inequality by $\lambda$, $\lambda > 0$, and the second by $\lambda$, $\lambda < 0$, and replacing $y$ with $y/\lambda$ we conclude that for all $\lambda \ne 0$, and trivially for $\lambda = 0$,

$$\lambda c \le p(y + \lambda y_1) - g_0(y). \qquad □$$
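9.120 Theorem (Hahn-Banach). Let $X$ be a normed linear space over $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ and let $Y$ be a linear subspace of $X$. Then for every $f \in \mathcal{L}(Y, \mathbb{K})$ there exists $F \in \mathcal{L}(X, \mathbb{K})$ such that

$$F(x) = f(x) \quad \forall x \in Y, \qquad \|F\|_{\mathcal{L}(X,\mathbb{K})} = \|f\|_{\mathcal{L}(Y,\mathbb{K})}.$$

Proof. If $X$ is a real normed space, then the assertion follows from Theorem 9.119 with $p(x) = \|f\|_{\mathcal{L}(Y,\mathbb{R})}\|x\|_X$. To prove that $\|F\|_{\mathcal{L}(X,\mathbb{R})} \le \|f\|_{\mathcal{L}(Y,\mathbb{R})}$, notice that $F(x) = \epsilon|F(x)|$ with $\epsilon := \pm 1$, then

$$|F(x)| = \epsilon F(x) = F(\epsilon x) \le p(\epsilon x) = \|f\|_{\mathcal{L}(Y,\mathbb{R})}\|\epsilon x\|_X = \|f\|_{\mathcal{L}(Y,\mathbb{R})}\|x\|_X.$$

This shows $\|F\|_{\mathcal{L}(X,\mathbb{R})} \le \|f\|_{\mathcal{L}(Y,\mathbb{R})}$. The opposite inequality is obvious.

Suppose now that $X$ and $Y$ are complex normed spaces. Consider the real-valued map $h(x) := \Re f(x)$, $x \in Y$. $h$ is an $\mathbb{R}$-linear bounded form on $Y$ considered as a real normed space since

$$|h(x)| \le |f(x)| \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\|x\|_X \qquad \forall x \in Y,$$

thus the first part of the proof yields an $\mathbb{R}$-linear bounded map $H : X \to \mathbb{R}$ such that $H(x) = h(x)$ $\forall x \in Y$ and $|H(x)| \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\|x\|_X$ $\forall x \in X$. Now define $F(x) := H(x) - iH(ix)$ $\forall x \in X$, hence $H(x) = \Re F(x)$. It is easily seen that $F : X \to \mathbb{C}$ is a $\mathbb{C}$-linear map and extends $f$. It remains to show that

$$|F(x)| \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\|x\|_X \qquad \forall x \in X.$$

For $x \in X$, we can write $F(x) = re^{i\theta}$ with $r \ge 0$. Hence

$$|F(x)| = r = \Re\bigl(e^{-i\theta}F(x)\bigr) = \Re F(e^{-i\theta}x) = H(e^{-i\theta}x) \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\|e^{-i\theta}x\|_X = \|f\|_{\mathcal{L}(Y,\mathbb{C})}\|x\|_X. \qquad □$$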
Simple consequences are the following corollaries.

9.121 Corollary. Let $X$ be a normed space and let $x \in X$. Then there exists $F \in \mathcal{L}(X, \mathbb{R})$ such that

$$F(x) = \|x\|_X, \qquad \|F\|_{\mathcal{L}(X,\mathbb{R})} = 1.$$

9.122 Corollary. Let $X$ be a normed space. Then for all $x \in X$

$$\|x\|_X = \sup\bigl\{F(x) \bigm| F \in \mathcal{L}(X,\mathbb{R}),\ \|F\|_{\mathcal{L}(X,\mathbb{R})} \le 1\bigr\}.$$

9.123 Corollary. Let $Y$ be a linear subspace of a normed linear space $X$. If $Y$ is not dense in $X$, then there exists $F \in \mathcal{L}(X, \mathbb{R})$, $F \ne 0$, such that $F(y) = 0$ $\forall y \in Y$.

9.124 ¶. Prove Corollaries 9.121, 9.122 and 9.123.

We can give a geometric formulation to the Hahn-Banach theorem that is very useful. For the sake of simplicity from now on we shall assume that $X$ is a real normed space, even though the following results hold also for complex normed spaces. A closed affine hyperplane in $X$ is a set of the form

$$H := \{x \in X \mid F(x) = \alpha\}$$

where $F \in \mathcal{L}(X, \mathbb{R})$ and $\alpha \in \mathbb{R}$. It defines the two half-spaces

$$H_- := \{x \in X \mid F(x) \le \alpha\}, \qquad H_+ := \{x \in X \mid F(x) \ge \alpha\}.$$

We say that $H$ separates the sets $A$ and $B$ if $A \subset H_-$ and $B \subset H_+$.

9.125 Lemma (Gauge function). Let $C \subset X$ be an open convex subset of the real normed space $X$ and let $0 \in C$. Define

$$p(x) := \inf\Bigl\{\alpha > 0 \Bigm| \frac{x}{\alpha} \in C\Bigr\}.$$

Then (i) $p$ is sublinear, (ii) $\exists M$ such that $0 \le p(x) \le M\|x\|_X$, (iii) $C = \{x \in X \mid p(x) < 1\}$.
Proof. If $B(0,r) \subset C$, we clearly have $p(x) \le \frac{1}{r}\|x\|_X$ $\forall x \in X$, that is (ii).

Let us prove (iii). Suppose $x \in C$. Since $C$ is open, $(1+\epsilon)x \in C$ if $\epsilon > 0$ is small. Hence $p(x) \le \frac{1}{1+\epsilon} < 1$. Conversely, if $p(x) < 1$, there is $\alpha$, $0 < \alpha < 1$, such that $\alpha^{-1}x \in C$, hence $x = \alpha(\alpha^{-1}x) + (1-\alpha)0 \in C$.

Finally, let us prove (i). Trivially $p(\lambda x) = \lambda p(x)$ for $\lambda > 0$. For all $x, y \in X$ and $\epsilon > 0$ we know that

$$\frac{x}{p(x)+\epsilon},\ \frac{y}{p(y)+\epsilon} \in C,$$

consequently,

$$\frac{tx}{p(x)+\epsilon} + \frac{(1-t)y}{p(y)+\epsilon} \in C \qquad \forall t \in [0,1].$$

In particular, for

$$t := \frac{p(x)+\epsilon}{p(x)+p(y)+2\epsilon}$$

we obtain

$$\frac{x+y}{p(x)+p(y)+2\epsilon} \in C.$$

This yields $p(x+y) \le p(x) + p(y) + 2\epsilon$ and the claim follows, since $\epsilon$ is arbitrary. □
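The gauge function of Lemma 9.125 can be explored numerically. The following Python sketch is only an illustration: it assumes a membership test `in_C` for an open convex set containing the origin and approximates $p(x)$ by bisection, using that $\{\alpha > 0 \mid x/\alpha \in C\}$ is a half-line. The names `gauge`, `in_C`, `in_ellipse`, the bisection bounds and the choice of an ellipse are assumptions made here, not part of the text.

```python
def gauge(in_C, x, hi=1e6, iters=100):
    """Approximate p(x) = inf{alpha > 0 : x/alpha in C} by bisection.

    in_C is a membership test for an open convex set C containing 0;
    hi must be large enough that x/hi lies in C.
    """
    if not in_C(tuple(t / hi for t in x)):
        raise ValueError("hi is too small for this x")
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_C(tuple(t / mid for t in x)):
            hi = mid   # x/mid is in C, so p(x) <= mid
        else:
            lo = mid   # x/mid is not in C, so p(x) >= mid
    return hi


# Hypothetical example: the open ellipse x^2/4 + y^2 < 1,
# whose gauge is sqrt(x^2/4 + y^2).
in_ellipse = lambda v: v[0] ** 2 / 4.0 + v[1] ** 2 < 1.0

p_a = gauge(in_ellipse, (2.0, 0.0))
p_b = gauge(in_ellipse, (1.0, 1.0))
p_sum = gauge(in_ellipse, (3.0, 1.0))
```

In this example `p_sum` stays below `p_a + p_b`, in agreement with the subadditivity in (i), and a point lies in the ellipse exactly when its gauge is less than 1, as in (iii).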
9.126 Proposition. Let $C \subset X$ be an open convex subset of the real normed space $X$ and let $\bar{x} \in X$, $\bar{x} \notin C$. Then there exists $f \in \mathcal{L}(X, \mathbb{R})$ such that $f(x) < f(\bar{x})$ $\forall x \in C$. In particular, $C$ and $\bar{x}$ are separated by the closed affine hyperplane $\{x \mid f(x) = f(\bar{x})\}$.

Proof. By translation we can assume $0 \in C$ and introduce the gauge function $p(x)$ by Lemma 9.125. If $Y := \mathrm{Span}\{\bar{x}\}$ and $g : \mathrm{Span}\{\bar{x}\} \to \mathbb{R}$ is the linear map $g(t\bar{x}) := t$, it is clear that $g(x) \le p(x)$ $\forall x \in \mathrm{Span}\{\bar{x}\}$, since $p(\bar{x}) \ge 1$ by (iii) of Lemma 9.125. By Theorem 9.119, there exists a linear extension $f$ of $g$ such that $f(x) \le p(x)$ $\forall x \in X$. In particular, we have $f(\bar{x}) = 1$ and $f$ is bounded because of (ii) of Lemma 9.125. On the other hand, $f(x) < 1$ $\forall x \in C$ by (iii) of Lemma 9.125. □

9.127 Theorem (Hahn-Banach theorem, geometrical form). Let $A$ and $B$ be two nonempty disjoint convex sets of a real normed space $X$. Suppose $A$ is open. Then $A$ and $B$ can be separated by a closed affine hyperplane.

Proof. Set $C := A - B = \{x - y \mid x \in A,\ y \in B\}$. Trivially $C$ is convex, and open as $C = \bigcup_{y \in B}(A - y)$; moreover, $0 \notin C$ since $A \cap B = \emptyset$. By Proposition 9.126 there exists $f \in \mathcal{L}(X, \mathbb{R})$ such that $f(z) < 0$ $\forall z \in C$, i.e., $f(x) < f(y)$ $\forall x \in A$, $\forall y \in B$. If we choose $\alpha$ such that $\sup_{x \in A} f(x) \le \alpha \le \inf_{y \in B} f(y)$, then the hyperplane $\{x \mid f(x) = \alpha\}$ separates $A$ and $B$. □
9.5 Some General Principles for Solving Abstract Equations

In this final section we establish some fundamental principles concerning the solvability of abstract equations

$$Au = f$$

where $A : X \to Y$ is a continuous function, also called a continuous nonlinear operator, between Banach spaces. These principles are fully appreciated for instance when dealing with the theory of ordinary or partial differential equations; however in Chapter 11 we shall illustrate some of their applications.
9.5.1 The Banach fixed point theorem

Many problems take the form of finding a fixed point for a suitable transformation. For instance, if $A$ maps $X$ into $X$ where $X$ is a vector space, the equation $Au = 0$ is equivalent to $Au + u = u$, i.e., to finding a fixed point for the operator $A + \mathrm{Id}$. The contraction mapping theorem, proved by Stefan Banach (1892-1945) in 1922, an elementary version of which we saw in Theorem 8.48 in [GM2], is surely one of the simplest results that ensures the existence of a fixed point and also gives a procedure to determine it. The method has its origins in the method of successive approximations of Emile Picard (1856-1941) and may be regarded as an abstract formulation of it. Let $\{x_n\}$ be defined by

$$x_n = F(x_{n-1}).$$

If $\{x_n\}$ converges to $x$ and $F$ is continuous, then $x$ is a fixed point of $F$, $F(x) = x$.

a. The fixed point theorem

Let $X$ be a metric space. A map $T : X \to X$ is said to be $k$-contractive if $d(T(x), T(y)) \le k\,d(x,y)$ $\forall x, y \in X$, or, in other words, if $T : X \to X$ is Lipschitz continuous with Lipschitz constant less than or equal to $k$. If $0 \le k < 1$, $T$ is often simply called a contraction or a contractive mapping. A point $x \in X$ for which $Tx = x$ is called a fixed point for $T$. The contraction principle states that contractions have a unique fixed point.

9.128 Theorem (The fixed point theorem of Banach). Let $X$ be a complete metric space and let $T : X \to X$ be $k$-contractive with $0 \le k < 1$. Then $T$ has a unique fixed point $x$. Moreover, given $x_0 \in X$, the sequence $\{x_n\}$ defined recursively by $x_{n+1} = T(x_n)$ converges with an exponential rate to the fixed point, and the following estimates hold

$$d(x_{n+1}, x) \le k\,d(x_n, x),$$
$$d(x_n, x) \le \frac{k^n}{1-k}\,d(x_1, x_0),$$
$$d(x_{n+1}, x) \le \frac{k}{1-k}\,d(x_{n+1}, x_n).$$

Figure 9.12. The frontispiece of Leçons sur quelques équations fonctionnelles by Emile Picard (1856-1941) and the first page of a celebrated paper by Jean Leray (1906-1998) and Juliusz Schauder (1899-1943) that appeared in Journal de Mathématiques in 1933.
Proof. The proof is as in Theorem 8.48 of [GM2]. First we prove uniqueness. If $x, y$ are two fixed points, from $d(x,y) = d(Tx, Ty) \le k\,d(x,y)$, $0 \le k < 1$, we infer $d(x,y) = 0$, i.e., $x = y$. Then we prove existence. Choose any $x_0 \in X$ and let $x_{n+1} := T(x_n)$, $n \ge 0$. We have

$$d(x_{n+1}, x_n) \le k\,d(x_n, x_{n-1}) \le \dots \le k^n d(x_1, x_0) = k^n d(T(x_0), x_0),$$

hence for $p > n$

$$d(x_p, x_n) \le \sum_{j=n}^{p-1} d(x_{j+1}, x_j) \le \sum_{j=n}^{p-1} k^j d(x_1, x_0) \le \frac{k^n}{1-k}\,d(x_1, x_0).$$

Therefore $d(x_p, x_n) \to 0$ as $n, p \to \infty$, i.e., $\{x_n\}$ is a Cauchy sequence, hence it has a limit $x \in X$, and $x$ is a fixed point, as is easily seen passing to the limit in $x_{n+1} = T(x_n)$. Finally, we leave to the reader the proof of the convergence estimates. □

Notice that the second estimate in Theorem 9.128 allows us to evaluate the number of iterations that are sufficient to reach a desired accuracy; the third estimate allows us to evaluate the accuracy of $x_{n+1}$ as an approximate value of $x$ in terms of $d(x_{n+1}, x_n)$.

9.129 ¶. Show that $T : X \to X$ has a unique fixed point if its $m$th iterate $T^m = T \circ T \circ \dots \circ T$ is a $k$-contractive mapping with $0 \le k < 1$. [Hint: $x$ and $Tx$ are both fixed points of $T^m$.]
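The role of the estimates in Theorem 9.128 can be illustrated on the real line. The sketch below is a minimal example, not a general implementation. It assumes the a priori bound $d(x_n, x) \le \frac{k^n}{1-k} d(x_1, x_0)$ to predict a sufficient number of iterations, and the a posteriori bound $d(x_{n+1}, x) \le \frac{k}{1-k} d(x_{n+1}, x_n)$, which follows from the triangle inequality and $d(x_{n+1}, x) \le k\,d(x_n, x)$, as a stopping test. The map $T(x) = \frac12\cos x$ and all function names are hypothetical choices.

```python
import math


def fixed_point_iteration(T, x0, k, tol, max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) until k/(1-k) * |x_{n+1} - x_n| < tol.

    Returns the last iterate and the number of applications of T.
    """
    x = x0
    for n in range(max_iter):
        x_next = T(x)
        # a posteriori bound on the distance from x_next to the fixed point
        if k / (1.0 - k) * abs(x_next - x) < tol:
            return x_next, n + 1
        x = x_next
    raise RuntimeError("no convergence within max_iter")


def a_priori_steps(k, d10, tol):
    """Smallest n with k^n / (1 - k) * d10 <= tol, where d10 = d(x_1, x_0)."""
    if d10 == 0.0:
        return 0
    return max(0, math.ceil(math.log(tol * (1.0 - k) / d10) / math.log(k)))


# T(x) = cos(x)/2 is (1/2)-contractive on R because |T'(x)| <= 1/2.
T = lambda x: 0.5 * math.cos(x)
x_star, used = fixed_point_iteration(T, 0.0, 0.5, 1e-10)
predicted = a_priori_steps(0.5, abs(T(0.0) - 0.0), 1e-10)
```

The a priori count `predicted` is a worst-case figure; the a posteriori test typically stops earlier, since the actual contraction rate near the fixed point may be smaller than $k$.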
9.130 ¶. Let $X := C^0([a,b])$ and let

$$Tf(t) := \int_a^t f(s)\,ds.$$

Show that

$$T^m f(t) = \frac{1}{(m-1)!}\int_a^t (t-s)^{m-1} f(s)\,ds$$

is a contractive map if $m$ is sufficiently large.
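A quick numerical check of Exercise 9.130 is possible under one assumption: from the stated formula for $T^m$ one gets $|T^m f(t)| \le \frac{(t-a)^m}{m!}\|f\|_\infty$, so $(b-a)^m/m!$ bounds the Lipschitz constant of the linear map $T^m$ in the uniform norm. The Python sketch below (function names are illustrative) finds the first $m$ for which this bound drops below 1.

```python
import math


def volterra_iterate_bound(a, b, m):
    """Upper bound (b - a)^m / m! for the sup-norm Lipschitz constant of T^m."""
    return (b - a) ** m / math.factorial(m)


def first_contractive_power(a, b):
    """Smallest m >= 1 for which the bound above is strictly below 1."""
    m = 1
    while volterra_iterate_bound(a, b, m) >= 1.0:
        m += 1
    return m
```

For instance, on $[0,3]$ the bound equals $729/720 > 1$ for $m = 6$ and $2187/5040 < 1$ for $m = 7$. Together with Exercise 9.129 this indicates why $T$ may have a unique fixed point even when $T$ itself is not a contraction.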
9.131 Proposition. Let $X$ be a Banach space and $T : X \to X$ a $k$-contractive map with $0 \le k < 1$. Then $\mathrm{Id} - T$ is a bijection from $X$ into itself, i.e., for every $y \in X$ the equation $x - Tx = y$ has a unique solution; moreover

$$\mathrm{Lip}\,(\mathrm{Id} - T)^{-1} \le \frac{1}{1-k}. \tag{9.29}$$

Proof. For any $y \in X$ the equation $x - Tx = y$ is equivalent to $x = y + Tx =: F(x)$. Since $F$ is $k$-contractive and $k < 1$, the fixed point theorem shows that $x - Tx = y$ has a unique solution for any given $y \in X$, i.e., $\mathrm{Id} - T$ is bijective. Finally, if $x_1 - Tx_1 = y_1$ and $x_2 - Tx_2 = y_2$, then

$$\|x_1 - x_2\| \le \|y_1 - y_2\| + \|Tx_1 - Tx_2\| \le \|y_1 - y_2\| + k\|x_1 - x_2\|,$$

i.e., $\|x_1 - x_2\| \le \frac{1}{1-k}\|y_1 - y_2\|$. □

9.132 ¶. Let $X$ be a Banach space and $T : X \to X$ a Lipschitz-continuous map. Show that the equation $Tx + \mu x = y$ is solvable for any $y$, provided $|\mu|$ is sufficiently large.

9.133 ¶. Let $X$ be a Banach space and $B : X \times X \to \mathbb{R}$ a bilinear continuous form such that $|B(x,y)|$ …
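Proposition 9.131 can be tried out on the real line. The sketch below is an illustration under stated assumptions: $T(x) = \frac12\sin x$ is a hypothetical $\frac12$-contractive map, and `solve_id_minus_T` is a name introduced here. It solves $x - Tx = y$ by iterating $F(x) = y + Tx$ and lets one compare two solutions with the Lipschitz bound (9.29).

```python
import math


def solve_id_minus_T(T, y, x0=0.0, tol=1e-12, max_iter=1000):
    """Solve x - T(x) = y by iterating F(x) = y + T(x); T is assumed k-contractive, k < 1."""
    x = x0
    for _ in range(max_iter):
        x_next = y + T(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter")


K = 0.5
T = lambda x: K * math.sin(x)   # |T'(x)| <= K, so T is K-contractive

x1 = solve_id_minus_T(T, 1.0)
x2 = solve_id_minus_T(T, 1.3)
# (9.29) predicts |x1 - x2| <= |1.0 - 1.3| / (1 - K)
lipschitz_bound = abs(1.0 - 1.3) / (1.0 - K)
```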
b. The continuity method

The solvability of a linear equation $L_1 x = y$ can be reduced to the solvability of a simpler equation $L_0 x = y$ by means of the following.

9.134 Theorem (The continuity method). Let $X$ be a Banach space, $Y$ a normed space and $L_0, L_1$ two linear continuous functions from $X$ to $Y$. For $t \in [0,1]$ consider the family of linear continuous functions $L_t : X \to Y$ given by

$$L_t := (1-t)L_0 + tL_1$$

and suppose that there exists a constant $C$ such that the following a priori estimates hold

$$\|x\|_X \le C\|L_t x\|_Y \qquad \forall x \in X,\ \forall t \in [0,1]. \tag{9.30}$$

Then, of course, the functions $L_t$, $t \in [0,1]$, are injective; moreover, $L_1$ is surjective if and only if $L_0$ is surjective.

Figure 9.13. George Birkhoff (1884-1944) and a page from a paper by Birkhoff and Kellogg in Transactions, 1922.

Proof. Injectivity follows from the linearity and (9.30). Suppose now that $L_s$ is surjective for some $s \in [0,1]$. Then $L_s : X \to Y$ is invertible and by (9.30) $\|L_s^{-1}\| \le C$. We shall now prove that the equation $L_t x = y$ can be solved for any $y \in Y$ provided $t$ is close to $s$. For this we notice that $L_t x = y$ is equivalent to

$$L_s x = y + (L_s - L_t)x = y + (t-s)L_0 x - (t-s)L_1 x$$

which, in turn, is equivalent to

$$x = L_s^{-1}y + (t-s)L_s^{-1}(L_0 - L_1)x =: Tx$$

since $L_s : X \to Y$ has an inverse. Then we observe that

$$\|Tx - Tz\|_X \le C|t-s|\bigl(\|L_0\| + \|L_1\|\bigr)\|x - z\|_X,$$

consequently $T$ is a contractive map if

$$|t-s| < \delta := \frac{1}{C(\|L_0\| + \|L_1\|)}$$

and we conclude from the Banach fixed point theorem that $L_t$ is surjective for all $t$ with $|t-s| < \delta$. Since $\delta$ is independent of $s$, starting from a surjective map $L_0$ we successively find that $L_t$ is surjective for $t \in [0,\delta[$, $[0,2\delta[$, …. We therefore prove that $L_1$ is surjective in a finite number of steps. The same argument, started from $L_1$, gives the converse. □

9.135 Remark. Notice that the proof of Theorem 9.134 says that, assuming (9.30), the subset of $[0,1]$

$$S := \{s \in [0,1] \mid L_s : X \to Y \text{ is surjective}\}$$

is open and closed in $[0,1]$. Therefore $S = [0,1]$ provided $S \ne \emptyset$.
9.5.2 The Caccioppoli-Schauder fixed point theorem

Compared to the fixed point theorem of Banach, the fixed point theorem of Caccioppoli and Schauder is more sophisticated: it extends the finite-dimensional fixed point theorem of Brouwer to infinite-dimensional spaces.

9.136 Theorem (The fixed point theorem of Brouwer). Let $K$ be a nonempty, compact and convex set of $\mathbb{R}^n$ and let $f$ be a continuous map mapping $K$ into itself. Then $f$ has at least a fixed point in $K$.

The generalization to infinite dimensions and to the abstract setting is due to Juliusz Schauder (1899-1943) at the beginning of the Twenties of the past century; however, in specific situations it also appears in some of the works of George Birkhoff (1884-1944) and Oliver Kellogg (1878-1957) of 1922 and of Renato Caccioppoli (1904-1959) (independently from Juliusz Schauder) of the 1930's, in connection with the study of partial differential equations.

Brouwer's theorem relies strongly on the continuity of the map $f$ and, in particular, on the property that those maps have of transforming bounded sets of a finite-dimensional linear space into relatively compact sets. As we have seen in Theorem 9.21, such a property is not valid anymore in infinite dimensions, thus we need to restrict ourselves to continuous maps that transform bounded sets into relatively compact sets. In fact, the following example shows that a fixed point theorem such as Brouwer's cannot hold for continuous functions from the unit ball of an infinite-dimensional space into itself.

9.137 Example. Consider the map $f : \ell_2 \to \ell_2$ given by

Clearly $f$ maps the unit ball of $\ell_2$ into itself, is continuous and has no fixed point.
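A classical map with the three properties stated in Example 9.137, usually attributed to Kakutani, is $f(x) := (\sqrt{1 - \|x\|^2}, x_1, x_2, \dots)$ on the closed unit ball of $\ell_2$; that this is the map intended in the example is an assumption. The Python sketch below applies it to finitely supported sequences, stored as lists, so that the norm of the image can be inspected; the function names are illustrative.

```python
import math


def l2_norm(x):
    """Euclidean norm of a finitely supported sequence given as a list."""
    return math.sqrt(sum(t * t for t in x))


def kakutani_shift(x):
    """Apply f(x) = (sqrt(1 - ||x||^2), x_1, x_2, ...) to a list x with ||x|| <= 1."""
    norm_sq = sum(t * t for t in x)
    # max(...) guards against tiny negative values caused by rounding
    return [math.sqrt(max(0.0, 1.0 - norm_sq))] + list(x)


x = [0.3, -0.4, 0.5]
y = kakutani_shift(x)
```

Every image has norm exactly 1, so a fixed point $x$ would satisfy $\|x\| = 1$; then its first component $\sqrt{1 - \|x\|^2}$ vanishes, and $x_{n+1} = x_n$ for all $n$ forces $x = 0$, a contradiction.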
a. Compact maps

9.138 Definition. Let $X$ and $Y$ be normed spaces. The (non)linear operator $A : X \to Y$ is called compact if (i) $A$ is continuous, (ii) $A$ maps bounded sets of $X$ into relatively compact subsets of $Y$; equivalently, for any bounded sequence $\{x_k\} \subset X$ we can extract a subsequence $\{x_{n_k}\}$ such that $\{Ax_{n_k}\}$ is convergent.

9.139 Example. Consider the integral operator $A : C^0([a,b]) \to C^0([a,b])$ that maps $u \in C^0([a,b])$ into $Au \in C^0([a,b])$ defined by

$$Au(x) := \int_a^b F(x, y, u(y))\,dy$$

where $F(x,y,u)$ is a continuous real-valued function in $\mathbb{R}^3$. For $r > 0$ set $Q_r := \{(x,y,u) \in \mathbb{R}^3 \mid x, y \in [a,b],\ |u| \le r\}$ and $M_r := \{u \in C^0([a,b]) \mid \|u\|_\infty \le r\}$.

Proposition. $A : M_r \to C^0([a,b])$ is a compact operator.

Proof. (i) First we prove that $A : M_r \to C^0([a,b])$ is continuous. Fix $\epsilon > 0$ and observe that, $F$ being uniformly continuous in $Q_r$, there exists $\delta > 0$ such that

$$|F(x,y,u) - F(x,y,v)| < \epsilon$$

if $(x,y,u), (x,y,v) \in Q_r$ and $|u - v| < \delta$. Consequently, we have

$$|F(x,y,u(y)) - F(x,y,v(y))| < \epsilon$$

for $u, v \in M_r$ with $\|u - v\|_{\infty,[a,b]} < \delta$, hence

$$\|Au - Av\|_{\infty,[a,b]} = \sup_{x \in [a,b]} \Bigl|\int_a^b \bigl[F(x,y,u(y)) - F(x,y,v(y))\bigr]\,dy\Bigr| \le \epsilon(b-a). \tag{9.31}$$

(ii) It remains to show that $A$ maps bounded sets into relatively compact sets. To do that, it suffices to show that $A(M_r)$ is relatively compact in $C^0([a,b])$. We now check that $A(M_r) \subset C^0([a,b])$ is a set of equibounded and equicontinuous functions. Then the Ascoli-Arzelà theorem, Theorem 9.48, yields the required property. In fact, the equiboundedness of functions in $A(M_r)$ follows from

$$\|Au\|_\infty \le (b-a) \sup_{(x,y,z) \in Q_r} |F(x,y,z)|,$$

while the equicontinuity of functions in $A(M_r)$ follows, as in (9.31), from the uniform continuity of $F$ in $Q_r$: $|Au(x) - Au(x')| \le \epsilon(b-a)$ for all $u \in M_r$ whenever $|x - x'|$ is sufficiently small. □
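The equiboundedness estimate in Example 9.139 can be observed on a discretized version of the operator. The sketch below is illustrative only: the kernel $F(x,y,u) = \sin(xy + u)$, the family of functions in $M_r$, the grid sizes and the use of the composite trapezoidal rule are all assumptions introduced here.

```python
import math


def apply_integral_operator(F, u, a, b, x, n=200):
    """Approximate Au(x) = int_a^b F(x, y, u(y)) dy by the composite trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (F(x, a, u(a)) + F(x, b, u(b)))
    for i in range(1, n):
        y = a + i * h
        total += F(x, y, u(y))
    return h * total


# Hypothetical kernel: continuous on R^3 and bounded by 1 in absolute value.
F = lambda x, y, u: math.sin(x * y + u)
a, b, r = 0.0, 1.0, 2.0

# A few functions with sup norm at most r, i.e. elements of M_r.
family = [lambda y, c=c: r * math.cos(c * y) for c in range(1, 6)]

grid = [a + j * (b - a) / 50 for j in range(51)]
sup_norms = [
    max(abs(apply_integral_operator(F, u, a, b, x)) for x in grid)
    for u in family
]
```

Each entry of `sup_norms` stays below $(b-a)\sup_{Q_r}|F| = 1$, which is the discrete counterpart of the bound used in step (ii) of the proof.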
Compact operators arise as limits of maps with finite rank as shown by the following theorem.

9.140 Theorem. Let $X$ and $Y$ be Banach spaces and $M \subset X$ a nonempty bounded set. We have
(i) If $\{A_n\}$, $A_n : M \to Y$, is a sequence of compact operators that converges to $A : M \to Y$ in $\mathcal{B}(M, Y)$, i.e., $\|A_n - A\|_{\mathcal{B}(M,Y)} \to 0$ as $n \to \infty$, then $A$ is compact.
(ii) Suppose $A : M \to Y$ is compact. Then there exists a sequence $\{A_n\}$ of continuous operators $A_n : M \to Y$ such that $\|A_n - A\|_{\infty,M} \to 0$ as $n \to \infty$ and each $A_n$ has range in a finite-dimensional subspace of $Y$ as well as in the convex envelope of $A(M)$.

Proof. (i) $A$ is continuous as a uniform limit of continuous maps. Fix $\epsilon > 0$ and choose $n$ so that $\|A_n - A\|_{\infty,M} < \epsilon$. Since $A_n(M)$ is relatively compact, we can cover $A_n(M)$ with a finite number of balls, $A_n(M) \subset \bigcup_{i=1}^{I} B(x_i, \epsilon)$. Therefore $A(M) \subset \bigcup_{i=1}^{I} B(x_i, 2\epsilon)$, i.e., $A(M)$ is totally bounded, hence $A(M)$ has compact closure, compare Theorem 6.8.

(ii) Since $A(M)$ is relatively compact, for each $n$ there is a $\frac{1}{n}$-net, i.e., elements $y_j \in A(M)$, $j = 1, \dots, J_n$, such that $A(M) \subset \bigcup_{j=1}^{J_n} B(y_j, 1/n)$, or, equivalently,

$$\min_j \|Ax - y_j\| < \frac{1}{n} \qquad \forall x \in M.$$

Figure 9.14. Renato Caccioppoli (1904-1959) and Carlo Miranda (1912-1982).

Define the so-called Schauder operators

$$A_n x := \frac{\sum_{j=1}^{J_n} a_j(x)\,y_j}{\sum_{j=1}^{J_n} a_j(x)}$$

where, for $x \in M$ and $j = 1, \dots, J_n$,

$$a_j(x) := \max\Bigl\{\frac{2}{n} - \|Ax - y_j\|,\ 0\Bigr\}.$$

It is easily seen that the functions $a_j : M \to \mathbb{R}$ are continuous and do not vanish simultaneously; moreover

$$\|A_n x - Ax\| = \frac{\bigl\|\sum_{j=1}^{J_n} a_j(x)\,(y_j - Ax)\bigr\|}{\sum_{j=1}^{J_n} a_j(x)} \le \frac{2}{n};$$

the claim then easily follows. □
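The construction of the Schauder operators in the proof of Theorem 9.140 can be sketched in finite dimensions. The Python example below is only an illustration under stated assumptions: the weights are taken as $a_j(x) = \max\{2\epsilon - \|Ax - y_j\|, 0\}$ for an $\epsilon$-net $\{y_j\}$ of $A(M)$ (the threshold $2\epsilon$ is a choice of this sketch), $M = [0,1]$, and $A$ is the hypothetical map $s \mapsto (\cos s, \sin s)$ into $\mathbb{R}^2$.

```python
import math


def schauder_operator(A, net, eps):
    """Return x -> sum_j a_j(x) y_j / sum_j a_j(x), where
    a_j(x) = max(2*eps - |A(x) - y_j|, 0) and net = [y_1, ..., y_J] is an eps-net of A(M)."""
    def A_eps(x):
        Ax = A(x)
        weights = [max(2 * eps - math.dist(Ax, y), 0.0) for y in net]
        total = sum(weights)  # positive, since some y_j is within eps of Ax
        return tuple(
            sum(w * y[i] for w, y in zip(weights, net)) / total
            for i in range(len(Ax))
        )
    return A_eps


eps = 0.1
A = lambda s: (math.cos(s), math.sin(s))     # hypothetical map on M = [0, 1]
net = [A(j * eps) for j in range(11)]        # points of A(M), an eps-net of the arc
A_eps = schauder_operator(A, net, eps)

samples = [i / 200 for i in range(201)]
worst_error = max(math.dist(A_eps(s), A(s)) for s in samples)
```

Each value `A_eps(s)` is a convex combination of net points lying within $2\epsilon$ of $A(s)$, so `worst_error` cannot exceed $2\epsilon$, and the range of `A_eps` sits in the convex envelope of finitely many points of $A(M)$.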
b. The Caccioppoli-Schauder theorem

9.141 Theorem (Caccioppoli-Schauder). Let $M \subset X$ be a closed, bounded, convex, nonempty subset of a Banach space $X$. Every compact operator $A : M \to M$ from $M$ into itself has at least a fixed point.

Proof. Let $u_0 \in M$. Replacing $u$ with $u - u_0$ we may assume that $0 \in M$. From (ii) of Theorem 9.140 there are finite-dimensional subspaces $X_n \subset X$ and continuous operators $A_n : M \to X_n$ such that $\epsilon_n := \sup_{u \in M}\|Au - A_n u\| \to 0$ and $A_n(M) \subset \mathrm{co}(A(M)) \subset M$. The subset $M_n := X_n \cap M$ is bounded, closed, convex and nonempty (since $0 \in M_n$) and $A_n(M_n) \subset M_n$. Brouwer's theorem then yields a fixed point for $A_n : M_n \to M_n$, i.e.,

$$u_n \in M_n, \qquad A_n u_n = u_n,$$

hence

$$\|Au_n - u_n\| = \|Au_n - A_n u_n\| \le \epsilon_n \to 0.$$

Since the sequence $\{u_n\}$ is bounded and $A$ is compact, passing to a subsequence still denoted by $\{u_n\}$, we deduce that $\{Au_n\}$ converges to an element $v \in X$. On the other hand $v \in M$, since $M$ is closed, and

$$\|u_n - v\| \le \|v - Au_n\| + \|Au_n - u_n\| \to 0 \quad \text{as } n \to \infty;$$

thus $u_n \to v$ and, since $Au_n \to v$, we infer $Av = v$ taking into account the continuity of $A$. □
c. The Leray-Schauder principle

A consequence of the Caccioppoli-Schauder theorem is the following principle, which is very useful in applications, proved by Jean Leray (1906-1998) and Juliusz Schauder (1899-1943) in 1934 in the more general context of degree theory and often referred to as the fixed point theorem of Helmut Schaefer (1925- ).

9.142 Theorem (Schaefer). Let $X$ be a Banach space and $A : X \to X$ a compact operator. Suppose that the following a priori estimate holds: there is a positive number $r > 0$ such that, if $u \in X$ solves

$$u = tAu \qquad \text{for some } 0 \le t \le 1,$$

then $\|u\| < r$. Then the equation

$$v = Av, \qquad v \in X,$$

has at least a solution.

Proof. Let $M := \{u \in X \mid \|u\| \le r\}$ and consider the composition $B$ of $A$ with the retraction on the ball, i.e.,

$$Bu := \begin{cases} Au & \text{if } \|Au\| \le r, \\[2pt] \dfrac{rAu}{\|Au\|} & \text{if } \|Au\| > r. \end{cases}$$

$B$ maps $M$ to $M$, is continuous and maps bounded sets into relatively compact sets, since $A$ is compact. Therefore the Caccioppoli-Schauder theorem yields a fixed point $\bar{u} \in M$ for $B$, $B\bar{u} = \bar{u}$. Now, if $\|A\bar{u}\| \le r$, $\bar{u}$ is also a fixed point for $A$; otherwise $\|A\bar{u}\| > r$ and

$$\bar{u} = B\bar{u} = \frac{r}{\|A\bar{u}\|}A\bar{u} = tA\bar{u}, \qquad t := \frac{r}{\|A\bar{u}\|} < 1,$$

hence $\|\bar{u}\| < r$ by the a priori estimate. But then $\|B\bar{u}\| = \|\bar{u}\| < r$, whereas $\|B\bar{u}\| = r$ when $\|A\bar{u}\| > r$, a contradiction. Thus $\|A\bar{u}\| \le r$, i.e., $\bar{u} = B\bar{u} = A\bar{u}$ and $\bar{u}$ is again a fixed point for $A$. □
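The operator $B$ used in the proof of Theorem 9.142 is $A$ followed by the radial retraction onto the closed ball of radius $r$. A minimal finite-dimensional sketch, with vectors stored as tuples and function names chosen here for illustration, is:

```python
import math


def retract_to_ball(v, r):
    """Radial retraction of the vector v (a tuple) onto the closed ball of radius r."""
    norm = math.sqrt(sum(t * t for t in v))
    if norm <= r:
        return tuple(v)
    return tuple(r * t / norm for t in v)


def truncated_operator(A, r):
    """B = (retraction onto the ball of radius r) composed with A."""
    return lambda u: retract_to_ball(A(u), r)


# Hypothetical operators on R^2: one with large values, one with small values.
B_far = truncated_operator(lambda u: (3.0, 4.0), 1.0)
B_near = truncated_operator(lambda u: (0.1 * u[0], 0.1 * u[1]), 1.0)
```

Here `B_far` always returns a point of norm exactly $r = 1$, while `B_near` coincides with the underlying operator on the unit ball; these are the two cases distinguished in the proof.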
Theorems 9.134 and 9.142 may be regarded as special cases of a sort of general principle: a priori estimates on the possible solution yield existence of a solution.
9.5.3 The method of super- and sub-solutions

In this section we state an abstract formulation of the following principle that reminds us of the intermediate value theorem: to find a solution, it often suffices to find a subsolution and a supersolution.
Figure 9.15. Juliusz Schauder (1899-1943) and Jean Leray (1906-1998).
a. Ordered Banach spaces

9.143 Definition. An order cone in a Banach space $X$ is a subset $X_+$ such that (i) $X_+$ is closed, convex, nonempty and $X_+ \ne \{0\}$, (ii) if $u \in X_+$ and $\lambda \ge 0$, then $\lambda u \in X_+$, (iii) if $u \in X_+$ and $-u \in X_+$, then $u = 0$.

An order cone $X_+ \subset X$ defines a partial order in $X$ by

$$u \le v \quad \text{if and only if} \quad v - u \in X_+,$$

and we say that $X$ is an ordered Banach space (by $X_+$). In this case intervals in $X$ are well defined: $[u, w] := \{v \in X \mid u \le v \le w\}$.

9.144 Definition. An order cone $X_+$ is called normal if there is a number $c > 0$ such that $\|u\| \le c\|v\|$ whenever $0 \le u \le v$.

… $:= \{u \in C^0([a,b]) \mid u(x) \ge 0\ \forall x \in [a,b]\}$ is a normal order cone.

9.147 ¶. Let $u, v, w, u_n, v_n$ be elements of a Banach space $X$ ordered by an order cone $X_+$. Show that (i) $u \le v$ and $v \le w$ imply $u \le w$, (ii) $u \le v$ and $v \le u$ imply $u = v$, (iii) if $u$ … and $\|w - v\| \le c\|w - u\|$.
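b. Fixed points via sub- and super-solutions

9.148 Theorem. Let $X$ be a Banach space ordered by a normal order cone, let $u_0, v_0 \in X$ and let $A : [u_0, v_0] \subset X \to X$ be a (possibly nonlinear) compact operator. Suppose that $A$ is monotone increasing, i.e., $Au \le Av$ whenever $u \le v$, and that $u_0$ is a subsolution and $v_0$ a supersolution of the equation $u = Au$, i.e., $u_0 \le Au_0$ and $Av_0 \le v_0$. Then the two iterative processes

$$u_{n+1} = Au_n \qquad \text{and} \qquad v_{n+1} = Av_n \qquad \forall n \ge 0$$

started respectively from $u_0$ and $v_0$ converge respectively to solutions $u_-$ and $u_+$ of the equation $u = Au$. Moreover

$$u_0 \le u_1 \le \dots \le u_n \le \dots \le u_- \le u_+ \le \dots \le v_n \le \dots \le v_1 \le v_0.$$

Proof. By induction

$$u_0 \le \dots \le u_n \le v_n \le \dots \le v_0,$$

since $A$ is monotone. From (v) of Exercise 9.147

$$\|u_n - u_0\| \le c\|v_0 - u_0\| \qquad \forall n,$$

i.e., $\{u_n\}$ is bounded. As $A$ is compact, there exists $u_- \in X$ such that for a subsequence $\{u_{k_n}\}$ of $\{u_n\}$ we have $Au_{k_n} \to u_-$ as $n \to \infty$. Since $\{u_n\}$ is nondecreasing and the cone is closed and normal, the whole sequence $\{u_n\}$ converges to $u_-$. Finally $u_- = Au_-$, since $A$ is continuous. One operates similarly with $\{v_n\}$. □

9.149 Remark. Notice that the conclusion of Theorem 9.148 still holds if we require that $A$ be monotone on the sequences $\{u_n\}$ and $\{v_n\}$ defined by $u_{n+1} = Au_n$ and $v_{n+1} = Av_n$ started respectively at $u_0$ and $v_0$, instead of being monotone in $[u_0, v_0]$.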
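The monotone iteration scheme of Theorem 9.148 can be watched at work in the simplest ordered Banach space, $\mathbb{R}$ with the cone $[0,+\infty[$. The sketch below is an illustration with hypothetical data: $A(u) = \sqrt{u+2}$ is increasing on $[0,4]$, $u_0 = 0$ satisfies $u_0 \le Au_0$ and $v_0 = 4$ satisfies $Av_0 \le v_0$; the function name is introduced here.

```python
import math


def monotone_iterations(A, u0, v0, steps):
    """Run u_{n+1} = A(u_n) from a subsolution u0 and v_{n+1} = A(v_n) from a supersolution v0."""
    us, vs = [u0], [v0]
    for _ in range(steps):
        us.append(A(us[-1]))
        vs.append(A(vs[-1]))
    return us, vs


A = lambda u: math.sqrt(u + 2.0)   # increasing; its only fixed point in [0, 4] is u = 2
us, vs = monotone_iterations(A, 0.0, 4.0, 60)
```

The list `us` increases, `vs` decreases, and `us[n] <= vs[n]` at every step; in this example both sequences approach the same fixed point $u = 2$, although in general $u_-$ and $u_+$ may differ.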
9.6 Exercises

9.150 ¶. Show that in a normed space $(X, \|\cdot\|)$ the norm $\|\cdot\| : X \to \mathbb{R}_+$ is a Lipschitz-continuous function with best Lipschitz constant one, i.e.,

$$\bigl|\,\|x\| - \|y\|\,\bigr| \le \|x - y\| \qquad \forall x, y \in X.$$
Figure 9.16. Frontispieces of two volumes on functional analysis.
Proposition. Let $X$ be a linear space and let $f : X \to \mathbb{R}_+$ be a function such that (i) $f(x) \ge 0$, $f(x) = 0$ iff $x = 0$, (ii) $f$ is positively homogeneous of degree one: $f(\lambda x) = |\lambda| f(x)$ $\forall x \in X$, $\forall \lambda \in \mathbb{R}$, (iii) the set $\{x \mid f(x) \le 1\}$ is convex. Then $f(x)$ is a norm on $X$.

9.154 ¶. Prove the following variant of Lemma 9.22.

Lemma (Riesz). Let $Y$ be a closed proper linear subspace of a normed space $X$. Then, for every $\epsilon > 0$, there exists a point $z \in X$ with $\|z\| = 1$ such that $\|z - y\| > 1 - \epsilon$ for all $y \in Y$.

9.155 ¶. Show that $BV([a,b])$ is a Banach space with the norm

$$\|f\|_{BV} := \sup_{x \in [a,b]} |f(x)| + V_a^b(f).$$

[Hint: Compare Chapter 7 for the involved definitions.]

9.156 ¶. Show that in $C^0([a,b])$ the norms $\|\cdot\|_\infty$ and $\|\cdot\|_{L^p}$ are not equivalent.

9.157 ¶. Show that in $C^1([0,1])$

$$|x(0)| + \int_0^1 |x'(t)|\,dt$$

defines a norm, and that the convergence in this norm implies uniform convergence.

9.158 ¶. Denote by $c_0$ the linear space of infinitesimal real sequences $\{x_n\}$ and by $c_{00}$ the linear subspace of $c_0$ of sequences with only finitely many nonzero elements. Show that $c_0$ is closed in $\ell_\infty$ while $c_{00}$ is not closed.
9.159 ¶. Recall, see e.g. Section 2.2 of [GM1], that the oscillation of a function $f : \mathbb{R} \to \mathbb{R}$ over an interval around $x$ of radius $\delta$ is defined as

$$\omega_{f,\delta}(x) := \sup_{|y - x| < \delta} |f(y) - f(x)|$$

and that $f : \mathbb{R} \to \mathbb{R}$ is continuous at $x$ if and only if $\omega_{f,\delta}(x) \to 0$ as $\delta \to 0$. Show that $\omega_{f,\delta}(x) \to 0$ uniformly on every bounded interval of $\mathbb{R}$. [Hint: Use Theorem 6.35.]

9.160 ¶. Let $f \in C^1(\mathbb{R})$. Show that … uniformly in every bounded interval of $\mathbb{R}$. [Hint: Use Theorem 6.35.]

9.161 ¶. Let $f : ]x_0 - 1, x_0 + 1[ \subset \mathbb{R} \to \mathbb{R}$ be differentiable at $x_0$. Show that the blow-up sequence $\{f_n\}$,

$$f_n(z) := n\Bigl(f\Bigl(x_0 + \frac{z}{n}\Bigr) - f(x_0)\Bigr) \to f'(x_0)\,z,$$
compare Section 3.1 of [GMl], converges uniformly on every bounded interval of R. [Hint: Use Theorem 6.35.] 9.162 %, Compute, if it exists, /•4
lim /
fn{x)dx,
/n(x):=-(e^/^-l).
9 . 1 6 3 f. Discuss the uniform convergence of the sequences of real functions in [0,1] f^{x)
:=(-l)^n(a:H-l)x'",
- ( 2 + sin(nj:))ei-^°^(^^)a:. n
9 . 1 6 4 f. Discuss the uniform convergence of the following real series E^^-cos ( n=l
^
)
.
f:(e^-e-^)arctan^, n=l
n
E, n3a:2 + 1'
^ ^ /arctan(na;) 7 r \ ^ ^
n=0
n=l
n=l
n=2
l
^j^
"^ n /
'
9.165 ¶. Show that $\{u \in C^1([0,1]) \mid \int_0^1 u(x)\,dx = 0\}$ is a linear subspace of $C^0([0,1])$ that is not closed.

9.166 ¶. Show that $\{u \in C^0([0,1]) \mid u(0) = 1\}$ is closed, convex and dense in $C^0([0,1])$.

9.167 ¶. Show that $\{u \in C^0(\mathbb{R}) \mid \lim_{x \to \pm\infty} u(x) = 0\}$ is a closed subspace of $C_b^0(\mathbb{R})$.

9.168 ¶. Show that the subspace $C_c^0(\mathbb{R})$ of $C_b^0(\mathbb{R})$ of functions with compact support is not closed.

9.169 ¶. Let $X$ be a compact metric space and $\mathcal{F} \subset C^0(X)$. Show that $\mathcal{F}$ is equicontinuous if
(i) the functions in $\mathcal{F}$ are equi-Lipschitz, i.e., $\exists M$ such that $|f(x) - f(y)| \le M\,d(x,y)$ $\forall x, y \in X$, $\forall f \in \mathcal{F}$,
(ii) the functions in $\mathcal{F}$ are equi-Hölder, i.e., $\exists M$ and $\alpha$, $0 < \alpha \le 1$, such that $|f(x) - f(y)| \le M\,d(x,y)^\alpha$ $\forall x, y \in X$, $\forall f \in \mathcal{F}$.

9.170 ¶. Let $\mathcal{F} \subset C^0([a,b])$. Show that any of the following conditions implies equicontinuity of the family $\mathcal{F}$.
(i) the functions in $\mathcal{F}$ are of class $C^1$ and there exists $M > 0$ such that $|f'(x)| \le M$ $\forall x \in [a,b]$, $\forall f \in \mathcal{F}$,
(ii) the functions in $\mathcal{F}$ are of class $C^1$ and there exist $M > 0$ and $p > 1$ such that $\int_a^b |f'|^p\,dx \le M$ $\forall f \in \mathcal{F}$.

9.171 ¶. Let $\mathcal{F} \subset C^0([a,b])$ be a family of equicontinuous functions. Show that any of the following conditions implies equiboundedness of the functions in $\mathcal{F}$.
(i) $\exists C$, $\exists x_0 \in [a,b]$ such that $|f(x_0)| \le C$ $\forall f \in \mathcal{F}$, …

… where

$$[u]_{0,\alpha,\Omega} := \sup_{x,y \in \Omega,\ x \ne y} \frac{|u(x) - u(y)|}{|x - y|^\alpha}.$$

One also defines $C^{0,\alpha}_{\mathrm{loc}}(\Omega)$ as the space of functions that belong to $C^{0,\alpha}(\overline{A})$ for all relatively compact open subsets $A$, $A \subset\subset \Omega$. Show that $C^{0,\alpha}(\overline{\Omega})$ is a Banach space with the norm $\|\cdot\|_{0,\alpha,\Omega}$.
9.176 ¶. Show that the space $C^k([a,b])$ is a Banach space with norm
$$\|u\|_{C^k} := \sum_{h=0}^{k} \|D^h u\|_{\infty,[a,b]}.$$
Define $C^{k,\alpha}([a,b])$ as the linear space of functions in $C^k([a,b])$ with Hölder-continuous $k$-th derivative with exponent $\alpha$ such that
$$\|u\|_{k,\alpha,[a,b]} := \|u\|_{C^k([a,b])} + [D^k u]_{0,\alpha,[a,b]} < +\infty.$$
Show that $C^{k,\alpha}([a,b])$ is a Banach space with norm $\|\cdot\|_{k,\alpha,[a,b]}$.
9.177 ¶. Show that the immersion of $C^{0,\beta}([a,b])$ into $C^{0,\alpha}([a,b])$ is compact if $0 < \alpha < \beta \le 1$. More generally, show that the immersion of $C^{h,\alpha}([a,b])$ into $C^{k,\beta}([a,b])$ is compact if $k + \beta < h + \alpha$.
9.178 ¶. Let $\Omega \subset \mathbb{R}^2$ be defined by
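$$\Omega := \{(x,y) \in \mathbb{R}^2 \mid y < |x|^{1/2},\ x^2 + y^2 < 1\}.$$
By considering the function
$$u(x,y) := \begin{cases} (\operatorname{sgn} x)\, y^\beta & \text{if } y > 0, \\ 0 & \text{if } y \le 0, \end{cases}$$
where $1 < \beta < 2$, show that $u \in C^1(\Omega)$, but $u \notin C^{0,\alpha}(\Omega)$ if $\beta/2 < \alpha \le 1$.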
9.179 ¶. Prove the following.
Proposition. Let $\Omega$ be a bounded open set in $\mathbb{R}^n$ satisfying one of the following conditions: (i) $\Omega$ is convex, (ii) $\Omega$ is star-shaped, (iii) $\partial\Omega$ is locally the graph of a Lipschitz-continuous function. Then $C^{h,\alpha}(\overline{\Omega}) \subset C^{k,\beta}(\overline{\Omega})$ and the immersion is compact if $k + \beta < h + \alpha$.
[Hint: Show that in all cases there exist a constant $M$ and an integer $n$ such that $\forall x,y \in \Omega$ there are at most $n$ points $z_1, z_2, \dots, z_n$ with $z_1 = x$ and $z_n = y$ such that $\sum_{i=1}^{n-1} |z_i - z_{i+1}| \le M|x - y|$. Use Lagrange mean value theorem.]
9.180 ¶. Show that the space of Lipschitz-continuous functions in $[a,b]$ is dense in $C^0([a,b])$. [Hint: Use the mean value theorem.]
9.181 ¶. Show that the space of Lipschitz-continuous functions in $[a,b]$ with Lipschitz constant less than $k$ agrees with the closure in $C^0([a,b])$ of the functions of class $C^1$ with $\sup_x |f'(x)| \le k$.
9.182 ¶. Let $\lambda > 0$. Show that
$$\Big\{u \in C^0([0,+\infty[) \ \Big|\ \sup_{[0,+\infty[} |u|\, e^{-\lambda x} < +\infty\Big\}$$
is complete with respect to the metric $d(f,g) := \sup_x \{|f(x) - g(x)|\, e^{-\lambda x}\}$.
9.183 % Let / : [0,1] -> [0,1] be a diffeomorphism with f{x) > 0 Vx G [0,1]. Show that there exists a sequence of polynomials Pn{x), which are diffeomorphisms from [0,1] into [0,1], that converges uniformly to / in [0,1]. [Hint: Use Weierstrass's approximation theorem.] 9.184 1. Define for A[aj] G Mn,n{^),
K = M or K = C,
||A||:=sup{l^|x^0}. Show that (i) | A x | < | | A | | | x | V x e X , (ii) | | A | | = s u p { ( A x | 3 , ) | | x | = M = l } ,
(iii) l|A|P<Er,,=iH)'
||A||=
sup ||A(2)||=max(^|Ai.|),
(ii) if|N| = Ni:=Er=ik*l,then ||A||=
sup ||A(2)|| = m a ^ ( f ^ | A 5 | ) .
9.186 % Let A , B e M2,2(R) be given by
Then $AB \ne BA$. Compute $\exp(A)$, $\exp(B)$, $\exp(A)\exp(B)$, $\exp(B)\exp(A)$ and $\exp(A + B)$.
9.187 ¶. Define
$$\mathcal{N}(n) = \{N \in \mathrm{End}(\mathbb{C}^n) \mid N \text{ is normal}\},$$
$$\mathcal{U}(n) = \{N \in \mathrm{End}(\mathbb{C}^n) \mid N \text{ is unitary}\},$$
$$\mathcal{H}(n) = \{N \in \mathrm{End}(\mathbb{C}^n) \mid N \text{ is self-adjoint}\},$$
$$\mathcal{H}_+(n) = \{N \in \mathrm{End}(\mathbb{C}^n) \mid N \text{ is self-adjoint and positive}\}.$$
Show that
(i) if $N \in \mathcal{N}(n)$ has spectral resolution $N = \sum_{j=1}^{k} \lambda_j P_j$, then $\exp(N) \in \mathcal{N}(n)$ and has the spectral resolution $\exp(N) = \sum_{j=1}^{k} e^{\lambda_j} P_j$,
(ii) $\exp$ is one-to-one from $\mathcal{H}(n)$ onto $\mathcal{H}_+(n)$,
(iii) the operator $H \to \exp(iH)$ is one-to-one from $\mathcal{H}(n)$ onto $\mathcal{U}(n)$.
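9.188 ¶. Let $L \in \mathrm{End}(\mathbb{C}^n)$. Then $\mathrm{Id} - L$ is invertible if and only if 1 is not an eigenvalue of $L$. If $L$ is normal, then $L = \sum_{j=1}^{n} \lambda_j P_j$ and we have
$$(\mathrm{Id} - L)^{-1} = \sum_{j=1}^{n} \frac{1}{1 - \lambda_j} P_j.$$
If $\|L\| < 1$, then all eigenvalues have modulus smaller than one and
$$(\mathrm{Id} - L)^{-1} = \sum_{m=0}^{\infty} \sum_{j=1}^{n} \lambda_j^m P_j = \sum_{m=0}^{\infty} L^m.$$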
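As a numerical illustration of Exercise 9.188, the following minimal sketch sums the powers of a small matrix and checks that the result inverts $\mathrm{Id} - L$. It assumes that the identity intended in the exercise is $(\mathrm{Id} - L)^{-1} = \sum_m L^m$ for $\|L\| < 1$; the matrix `L`, the helper `matmul` and the number of terms are illustrative choices, not taken from the text.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

identity = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical example matrix; its Frobenius norm is about 0.61,
# which bounds the operator norm, so ||L|| < 1.
L = [[0.5, 0.25], [0.0, 0.25]]

# Partial sum Id + L + L^2 + ... + L^199 of the series.
partial_sum = [[0.0, 0.0], [0.0, 0.0]]
power = identity
for _ in range(200):
    partial_sum = [[partial_sum[i][j] + power[i][j] for j in range(2)]
                   for i in range(2)]
    power = matmul(power, L)

# If the series inverts Id - L, this product is (numerically) the identity.
id_minus_L = [[identity[i][j] - L[i][j] for j in range(2)] for i in range(2)]
check = matmul(partial_sum, id_minus_L)
```

The matrix chosen here is not normal, so the sketch only tests the series identity and not the spectral formula.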
9.189 % Let T,T-^ G E n d ( X ) . Show that S G End (X) and | | 5 - T|| < l / | | T | | - i , then S~^ exists, is a bounded operator and |lS-i-T-i||<
l-||5-T||||T-i|
9.190 ¶. Let $X$ and $Y$ be Banach spaces. We denote by $\mathrm{Isom}(X,Y)$ the subset of all continuous isomorphisms from $X$ into $Y$, that is, the subset of $\mathcal{L}(X,Y)$ of linear continuous operators $L : X \to Y$ with continuous inverse. Prove the following.
Theorem. We have
(i) $\mathrm{Isom}(X,Y)$ is an open set of $\mathcal{L}(X,Y)$,
(ii) the map $f \mapsto f^{-1}$ from $\mathrm{Isom}(X,Y)$ into $\mathrm{Isom}(Y,X)$ is continuous.
[Hint: In the case of finite-dimensional spaces, it suffices to observe that the determinant is a continuous function.]
9.191 ¶. Show that, if $f$ is linear and preserves the distances, then $f \in \mathrm{Isom}(X,Y)$.
9.192 ¶. Show that the linear map $D : C^1([0,1]) \subset C^0([0,1]) \to C^0([0,1])$ that maps $f$ to $f'$ is not continuous with respect to the uniform convergence. Show that also the map from $C^0([0,1])$ into $\mathbb{R}$ with domain $C^1([0,1])$,
$$f \in C^1([0,1]) \subset C^0([0,1]) \mapsto f'(1/2) \in \mathbb{R},$$
is not continuous. In particular, notice that linear subspaces of a normed space are not necessarily closed.
9.193 ¶. Fix $a = \{a_n\} \in \ell_\infty$ and consider the linear operator $L : \ell_1 \to \ell_1$, $(Lx)_n = a_n x_n$. Show that
(i) $\|L\| = \|a\|_{\ell_\infty}$,
(ii) $L$ is injective iff $a_n \ne 0$ $\forall n$,
(iii) $L$ is surjective and $L^{-1}$ is continuous if and only if $\inf_n |a_n| > 0$.
9.194 ¶. Show that the equation $2u = \cos u + 1$ has a unique solution in $C^0([0,1])$.
10. Hilbert Spaces, Dirichlet's Principle and Linear Compact Operators
In a normed space, we can measure the length of a vector but not the angle formed by two vectors. This is instead possible in a Hilbert space, i.e., a Banach space whose norm is induced by an inner (or Hermitian) product. The inner (Hermitian) product allows us to measure the length of a vector, the distance between two vectors and the angle formed by them.
The abstract theory of Hilbert spaces originated from the theory of integral equations of Vito Volterra (1860-1940) and Ivar Fredholm (1866-1927), successively developed by David Hilbert (1862-1943) and J. Henri Poincaré (1854-1912) and reformulated, mainly by Erhard Schmidt (1876-1959), as a theory of linear equations with infinitely many unknowns. The axiomatic presentation, based on the notion of inner product, appeared around the 1930's and is due to John von Neumann (1903-1957) in connection with the developments of quantum mechanics.
In this chapter, we shall illustrate the geometry of Hilbert spaces. In Section 10.2 we discuss the orthogonality principles, in particular the projection theorem and the abstract Dirichlet principle. Then, in Section 10.4 we shall discuss the spectrum of compact operators, partially generalizing to infinite dimensions the theory of finite-dimensional eigenvalues, see Chapter 4.
10.1 Hilbert Spaces
A Hilbert space is a real (complex) Banach space whose norm is induced by an inner (Hermitian) product.
10.1.1 Basic facts
a. Definitions and examples
10.1 Definition. A real (complex) linear space, endowed with an inner or scalar (respectively Hermitian) product $(\ |\ )$ is called a pre-Hilbert space.
Figure 10.1. David Hilbert (1862-1943) and the frontispiece of his Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen (1912).
We have discussed algebraic properties of the inner and Hermitian products in Chapter 3. We recall, in particular, that in a pre-Hilbert space $H$ the function
$$\|x\| := \sqrt{(x|x)}, \quad x \in H, \qquad (10.1)$$
defines a norm on $H$ for which the Cauchy-Schwarz inequality,
$$|(x|y)| \le \|x\|\,\|y\| \quad \forall x,y \in H,$$
holds. Moreover, Carnot's theorem
$$\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\Re(x|y) \quad \forall x,y \in H$$
and the parallelogram law
$$\|x + y\|^2 + \|x - y\|^2 = 2\big(\|x\|^2 + \|y\|^2\big) \quad \forall x,y \in H$$
hold. In Chapter 3 we also discussed the geometry of real and complex pre-Hilbert spaces of finite dimension. Here we add some considerations that are relevant for spaces of infinite dimension.
A pre-Hilbert space $H$ is naturally a normed space and has a natural topology induced by the inner product. In particular, if $\{x_n\} \subset H$ and $x \in H$, then $x_n \to x$ means that $\|x_n - x\| = (x_n - x|x_n - x)^{1/2} \to 0$ as $n \to \infty$. As for any normed vector space, the norm is continuous. We also have the following.
10.2 Proposition. The inner (or Hermitian) product in a pre-Hilbert space $H$ is continuous on $H \times H$, i.e., if $x_n \to x$ and $y_n \to y$ in $H$, then $(x_n|y_n) \to (x|y)$. In particular, if $(x|y) = 0$ for all $y$ in a dense subset $Y$ of $H$, we have $x = 0$.
Proof. In fact
$$|(x_n|y_n) - (x|y)| = |(x_n - x|y_n) + (x|y_n - y)| \le \|x_n - x\|\,\|y_n\| + \|x\|\,\|y_n - y\|;$$
the claim then follows since the sequence $\|y_n\|$ is bounded, being convergent. If $Y$ is a dense subset of $H$, we find for any $x \in H$ a sequence $\{y_n\} \subset Y$ such that $y_n \to x$. Taking the limit in $(x|y_n) = 0$, we get $(x|x) = 0$. □
10.3 ¶ Differentiability of the inner product. Let $u : ]a,b[ \to H$ be a map from an interval of $\mathbb{R}$ into a pre-Hilbert space $H$. We can extend the notion of derivative in this context. We say that $u$ is differentiable at $t_0 \in ]a,b[$ if the limit
$$u'(t_0) := \lim_{t \to t_0} \frac{u(t) - u(t_0)}{t - t_0} \in H$$
exists. Check the following.
Proposition. If $u, v : ]a,b[ \to H$ are differentiable in $]a,b[$, so is $t \to (u(t)|v(t))$ and
$$\frac{d}{dt}(u(t)|v(t)) = (u'(t)|v(t)) + (u(t)|v'(t)) \quad \forall t \in ]a,b[.$$
10.4 Definition. A pre-Hilbert space $H$ that is complete with respect to the induced norm, $\|x\| := \sqrt{(x|x)}$, is called a Hilbert space.
10.5 ¶. Every pre-Hilbert space $H$, being a metric space, can be completed. Show that its completion $\widetilde{H}$ is a Hilbert space with an inner product that agrees with the original one when restricted to $H$.
Exercise 10.5 and Theorem 9.21 yield at once the following.
10.6 Proposition. Every finite-dimensional pre-Hilbert space is complete, hence a Hilbert space. In particular, any finite-dimensional subspace of a pre-Hilbert space is complete, hence closed. The closed unit ball of a Hilbert space $H$ is compact if and only if $H$ is finite dimensional.
10.7 Example. The space of square integrable real sequences
$$\ell_2 = \ell_2(\mathbb{R}) := \Big\{x = \{x_n\} \ \Big|\ x_n \in \mathbb{R},\ \sum_{i=1}^\infty |x_i|^2 < \infty\Big\}$$
is a Hilbert space with inner product $(x|y) := \sum_{i=1}^\infty x_i y_i$, compare Section 9.1.2. Similarly, the space of square integrable complex sequences
$$\ell_2(\mathbb{C}) := \Big\{x = \{x_n\} \ \Big|\ x_n \in \mathbb{C},\ \sum_{i=1}^\infty |x_i|^2 < \infty\Big\}$$
is a Hilbert space with the Hermitian product $(x|y) := \sum_{i=1}^\infty x_i \overline{y_i}$.
10.8 Example. In $C^0([a,b])$
$$(f|g) := \int_a^b f(x) g(x)\, dx$$
defines an inner product with induced norm $\|f\|_2 := \big(\int_a^b |f(t)|^2\, dt\big)^{1/2}$. As we have seen in Section 9.1.2, $C^0([a,b])$ is not complete with respect to this norm. Similarly
$$(f|g) := \int_0^1 f(x) \overline{g(x)}\, dx$$
defines in $C^0([0,1],\mathbb{C})$ a pre-Hilbert structure for which $C^0([0,1],\mathbb{C})$ is not complete.
b. Orthogonality
Two vectors $x$ and $y$ of a pre-Hilbert space are said to be orthogonal, and we write $x \perp y$, if $(x|y) = 0$. The Pythagorean theorem holds for pairwise orthogonal vectors $x_1, x_2, \dots, x_n$:
$$\Big\|\sum_{i=1}^n x_i\Big\|^2 = \sum_{i=1}^n \|x_i\|^2.$$
Actually, if $H$ is a real pre-Hilbert space, $x \perp y$ if and only if $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.
A denumerable set of vectors $\{e_n\}$ is called orthonormal if $(e_h|e_k) = \delta_{hk}$ $\forall h,k$. Of course, orthonormal vectors are linearly independent.
10.9 Example. Here are a few examples.
(i) In $\ell_2$, the sequence $e_1 = (1,0,\dots)$, $e_2 = (0,1,\dots)$, ..., is orthonormal. Notice that it is not a linear basis in the algebraic sense.
(ii) In $C^0([a,b],\mathbb{R})$ with the $L^2$-inner product
$$(f|g)_{L^2} := \int_a^b f(x) g(x)\, dx$$
the trigonometric system
$$\frac{1}{\sqrt{b-a}}, \quad \sqrt{\frac{2}{b-a}}\cos\Big(n\frac{2\pi x}{b-a}\Big), \quad \sqrt{\frac{2}{b-a}}\sin\Big(n\frac{2\pi x}{b-a}\Big), \quad n = 1,2,\dots$$
is orthonormal, compare Lemma 5.45 of [GM2].
(iii) In $C^0([a,b],\mathbb{C})$ with the Hermitian $L^2$-product
$$(f|g)_{L^2} := \int_a^b f(x)\overline{g(x)}\, dx,$$
the trigonometric system
$$\frac{1}{\sqrt{b-a}}\exp\Big(i\frac{2k\pi x}{b-a}\Big), \quad k \in \mathbb{Z},$$
forms again an orthonormal system.
10.1.2 Separable Hilbert spaces and basis
a. Complete systems and basis
Let $H$ be a pre-Hilbert space. We recall that a set $E$ of vectors in $H$ is said to be linearly independent if any finite choice of vectors in $E$ is linearly independent. A set $E \subset H$ of linearly independent vectors such that any vector in $H$ is a finite linear combination of vectors in $E$ is called an algebraic basis of $H$. We say that a system of vectors $\{e_\alpha\}_{\alpha \in A}$ in a pre-Hilbert space $H$ is complete if the smallest closed linear subspace that contains them is $H$, or equivalently, if all finite linear combinations of the $\{e_\alpha\}$ are dense in $H$. Operatively, $\{e_\alpha\}_{\alpha \in A} \subset H$ is complete if for every $x \in H$, there exists a sequence $\{x_n\}$ of finite linear combinations of the $e_\alpha$'s,
$$x_n = \sum_{i=1}^{k} c^{(n)}_i e_{\alpha_i}, \quad \alpha_1, \dots, \alpha_k \in A,$$
that converges to $x$.
10.10 Definition. A complete denumerable system $\{e_n\}$ of linearly independent vectors of a pre-Hilbert space $H$ is called a basis of $H$.
b. Separable Hilbert spaces
A metric space $X$ is said to be separable if there exists a denumerable and dense family in $X$. Suppose now that $H$ is a separable pre-Hilbert space, and $\{x_n\}$ is a denumerable dense subset of $H$; then necessarily $\{x_n\}$ is a complete system in $H$. Therefore, if we inductively eliminate from the family $\{x_n\}$ all elements that are linearly dependent on the preceding ones, we construct an at most denumerable basis of vectors $\{y_n\}$ of $H$. Even more, applying the iterative process of Gram-Schmidt, see Chapter 3, to the basis $\{y_n\}$, we produce an at most denumerable orthonormal basis of $H$, thus concluding that every separable pre-Hilbert space has an at most denumerable orthonormal basis.
The converse holds, too. If $\{e_n\}$ is an at most denumerable complete system in $H$ and, for all $n$, $V_n$ is the family of the linear combinations of $e_1, e_2, \dots, e_n$ with rational coefficients (or, in the complex case, with coefficients with rational real and imaginary parts), then $\bigcup_n V_n$ is dense in $H$. We therefore can state the following.
10.11 Theorem. A pre-Hilbert space $H$ is separable if and only if it has an at most denumerable orthonormal basis.
Notice that a basis, in the sense just defined, need not be a basis in the algebraic sense. In fact, though every element in $H$ is the limit of finite linear combinations of elements of $\{e_\alpha\}$, it need not be a finite linear combination of elements of $\{e_\alpha\}$. Actually, it is a theorem that any algebraic basis of an infinite-dimensional Banach space has a nondenumerable cardinality.
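The two-step construction just described (discard the vectors that depend on the preceding ones, then orthonormalize) can be sketched in finite dimension as follows. This is only an illustration in $\mathbb{R}^3$ with the Euclidean product standing in for $(\ |\ )$; the function names, the tolerance `tol` and the sample family are assumptions made for the example.

```python
import math

def inner(x, y):
    """Euclidean inner product on R^n, a stand-in for ( | )."""
    return sum(a * b for a, b in zip(x, y))

def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt that skips vectors depending on the preceding ones."""
    basis = []
    for v in vectors:
        # Remove from v its components along the orthonormal vectors found so far.
        w = list(v)
        for e in basis:
            c = inner(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:  # v is independent of the preceding vectors: keep it
            basis.append([wi / norm for wi in w])
    return basis

# The second vector is a multiple of the first; the fourth is the first minus the third.
family = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]
basis = orthonormalize(family)
```

In floating-point arithmetic the test for linear dependence can only be approximate, which is why a tolerance replaces the exact condition used in the text.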
Figure 10.2. Frontispieces of Geometria nello Spazio Hilbertiano by Giuseppe Vitali (1875-1932) and a volume on normed spaces.
10.12 Example. The following is an example of a nonseparable pre-Hilbert space: the space of all real functions $f$ that are nonzero in at most a denumerable set of points $\{t_i\}$ (varying with $f$) and moreover satisfy $\sum_i f(t_i)^2 < \infty$, with inner product $(x|y) = \sum_t x(t) y(t)$, the sum being restricted to points where $x(t) y(t) \ne 0$.
10.13 Remark. Using Zorn's lemma, one can show that every Hilbert space has an orthonormal basis (nondenumerable if the space is nonseparable); also there exist nonseparable pre-Hilbert spaces with no orthonormal basis.
Let $H$ be a separable Hilbert space, let $\{e_n\}$ be an orthonormal basis on $H$ and let $P_n : H \to H$ be the orthogonal projection on the finite-dimensional subspace $H_n := \mathrm{Span}\{e_1, e_2, \dots, e_n\}$. If $L : H \to Y$ is a linear operator from $H$ into a linear normed space $Y$, set $L_n(x) := L \circ P_n(x)$ $\forall x \in H$. Since the $L_n$'s are obviously continuous, $H_n$ being finite dimensional, and $\|L_n(x) - L(x)\|_Y \to 0$ $\forall x \in H$, we infer from the Banach-Steinhaus theorem the following.
10.14 Proposition. Any linear map $L : H \to Y$ from a separable Hilbert space into a normed space $Y$ is bounded.
Therefore linear unbounded operators on a separable pre-Hilbert space $L : D \to Y$ are necessarily defined only on a dense subset $D \subsetneq H$ of a separable Hilbert space. There exist instead noncontinuous linear operators from a nonseparable Hilbert space into $\mathbb{R}$.
10.15 Example. Let $X$ be the Banach space $c_0$ of infinitesimal real sequences, cf. Exercise 9.158, and let $f : X \to \mathbb{R}$ be defined by $f((a_1, a_2, \dots)) := a_1$. Then $\ker f = \{(a_n) \in c_0 \mid a_1 = 0\}$ is closed. To get an example of a dense hyperplane, let $e^n$ be the element of $c_0$ such that $e^n_k = \delta_{k,n}$ and let $x^0$ be the element of $c_0$ given by $x^0_n = 1/n$, so that $\{x^0, e^1, e^2, \dots\}$ is a linearly independent set in $c_0$. Denote by $B$ a Hamel basis (i.e., an algebraic basis) in $c_0$ which contains $\{x^0, e^1, e^2, \dots\}$, and set
$$B = \{x^0, e^1, e^2, \dots\} \cup \{b^i \mid i \in I\}$$
where $b^i \ne x^0, e^n$ for any $i$ and $n$. Define
$$f : c_0 \to \mathbb{R}, \quad f\Big(a_0 x^0 + \sum_{n \ge 1} a_n e^n + \sum_{i \in I} \alpha_i b^i\Big) = a_0.$$
Since $e^n \in \ker f$ $\forall n \ge 1$, $\ker f$ is dense in $c_0$ but clearly $\ker f \ne c_0$.
10.16 %. Formulate similar examples in the Hilbert space of Example 10.12.
c. Fourier series and $\ell_2$
We shall now show that there exist essentially only two separable Hilbert spaces: $\ell_2(\mathbb{R})$ and $\ell_2(\mathbb{C})$. As we have seen, if $H$ is a finite-dimensional pre-Hilbert space, and $(e_1, e_2, \dots, e_n)$ is an orthonormal basis of $H$, we have
$$x = \sum_{j=1}^n (x|e_j)\, e_j, \qquad \|x\|^2 = \sum_{j=1}^n |(x|e_j)|^2.$$
We now extend these formulas to separable Hilbert spaces.
Let $H$ be a separable pre-Hilbert space and let $\{e_n\}$ be an orthonormal set of $H$. For $x \in H$, the Fourier coefficients of $x$ with respect to $\{e_n\}$ are defined as the sequence $\{(x|e_j)\}_j$, and the Fourier series of $x$ as the series
$$\sum_{j=1}^\infty (x|e_j)\, e_j,$$
whose partial $n$-sum is the orthogonal projection $P_n(x)$ of $x$ into the finite-dimensional space $V_n := \mathrm{Span}\{e_1, e_2, \dots, e_n\}$,
$$P_n(x) = \sum_{j=1}^n (x|e_j)\, e_j.$$
Three questions naturally arise: what is the image of $\mathcal{F}(x) := \{(x|e_j)\}_j$, $x \in H$? Does the Fourier series of $x$ converge? Does it converge to $x$? The rest of this section will answer these questions.
10.17 Proposition (Bessel's inequality). Let $\{e_n\}$ be an orthonormal set in the pre-Hilbert space $H$. Then
$$\sum_{k=1}^\infty |(x|e_k)|^2 \le \|x\|^2 \quad \forall x \in H. \qquad (10.2)$$
Proof. Since for all $n$ the orthogonal projection of $x$ on the finite-dimensional subspace $V_n := \mathrm{Span}\{e_1, e_2, \dots, e_n\}$ is $P_n(x) = \sum_{k=1}^n (x|e_k) e_k$, the Pythagorean theorem yields
$$\sum_{k=1}^n |(x|e_k)|^2 = \|P_n(x)\|^2 = \|x\|^2 - \|x - P_n(x)\|^2 \le \|x\|^2.$$
When $n \to \infty$, we get the Bessel inequality (10.2). □
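The identity used in the proof can be checked numerically on a small example. The sketch below takes an orthonormal set of two vectors in $\mathbb{R}^3$ that is not complete, so that Bessel's inequality is strict; the vectors and variable names are illustrative choices.

```python
import math

def inner(x, y):
    """Euclidean inner product on R^3."""
    return sum(a * b for a, b in zip(x, y))

# Orthonormal set {e_1, e_2} in R^3; it spans only a plane, so it is not complete.
s = 1.0 / math.sqrt(2.0)
e = [[s, s, 0.0], [s, -s, 0.0]]
x = [3.0, -1.0, 2.0]

coeffs = [inner(x, ek) for ek in e]       # Fourier coefficients (x|e_k)
bessel_sum = sum(c * c for c in coeffs)   # sum of |(x|e_k)|^2
norm_sq = inner(x, x)                     # ||x||^2

# Orthogonal projection P_n(x) and the residual x - P_n(x).
proj = [sum(c * ek[i] for c, ek in zip(coeffs, e)) for i in range(3)]
residual = [xi - pi for xi, pi in zip(x, proj)]
residual_sq = inner(residual, residual)
```

Here `norm_sq - bessel_sum` agrees with `residual_sq` up to rounding, which is exactly the Pythagorean identity behind (10.2).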
10.18 Proposition. Let $\{e_n\}$ be an at most denumerable orthonormal set in a pre-Hilbert space $H$. The following claims are equivalent.
(i) $\{e_n\}$ is complete.
(ii) $\forall x \in H$ we have $x = \sum_{k=1}^\infty (x|e_k) e_k$ in $H$, equivalently $\|x - P_n(x)\| \to 0$ as $n \to \infty$.
(iii) (Parseval's formula) $\|x\|^2 = \sum_{k=1}^\infty |(x|e_k)|^2$ $\forall x \in H$ holds.
(iv) $\forall x,y \in H$ we have
$$(x|y) = \sum_{j=1}^\infty (x|e_j)\overline{(y|e_j)}.$$
In this case $x = 0$ if $(x|e_k) = 0$ $\forall k$.
Proof. (i) $\Leftrightarrow$ (ii). Suppose the set $\{e_n\}$ is complete. For every $x \in H$ and $n \in \mathbb{N}$, we find finite combinations of $e_1, e_2, \dots, e_n$ that converge to $x$,
$$s_n := \sum_{k=1}^n a^{(n)}_k e_k, \qquad \|x - s_n\| \to 0.$$
If $P_n(x) = \sum_{k=1}^n (x|e_k) e_k$ is the orthogonal projection of $x$ in $V_n = \mathrm{Span}\{e_1, \dots, e_n\}$, we have, as $s_n \in V_n$,
$$\|x - P_n(x)\| \le \|x - s_n\| \to 0,$$
therefore $x = \sum_{k=1}^\infty (x|e_k) e_k$ in $H$. The converse (ii) $\Rightarrow$ (i) is trivial.
(ii) $\Leftrightarrow$ (iii) follows from
$$\sum_{k=1}^n |(x|e_k)|^2 = \|P_n(x)\|^2 = \|x\|^2 - \|x - P_n(x)\|^2$$
when $n \to \infty$.
(ii) implies (iv) since the inner product is continuous:
$$(x|y) = \Big(\sum_{i=1}^\infty (x|e_i) e_i \ \Big|\ \sum_{j=1}^\infty (y|e_j) e_j\Big) = \sum_{i,j=1}^\infty (x|e_i)\overline{(y|e_j)}\,(e_i|e_j) = \sum_{j=1}^\infty (x|e_j)\overline{(y|e_j)},$$
and (iv) trivially implies (iii). Finally (iii) implies that $x = 0$ if $(x|e_k) = 0$ $\forall k$. □
10.19 Proposition. Let $H$ be a Hilbert space and let $\{e_n\}$ be an orthonormal set of $H$. Given any sequence $\{c_k\}$ such that $\sum_{j=1}^\infty |c_j|^2 < \infty$, the series $\sum_{j=1}^\infty c_j e_j$ converges in $H$. If moreover $\{e_n\}$ is complete, then
$$x = \sum_{j=1}^\infty (x|e_j)\, e_j \quad \forall x \in H.$$
Proof. Define $x_n := \sum_{j=1}^n c_j e_j$. As
$$\|x_{n+p} - x_n\|^2 = \sum_{j=n+1}^{n+p} |c_j|^2,$$
$\{x_n\}$ is a Cauchy sequence in $H$, hence it converges to $y := \sum_{j=1}^\infty c_j e_j \in H$. On account of the continuity of the scalar product
$$(y|e_j) = \Big(\sum_{i=1}^\infty c_i e_i \ \Big|\ e_j\Big) = \sum_{i=1}^\infty c_i (e_i|e_j) = c_j$$
for all $j$. If $x \in H$ and $c_j := (x|e_j)$ $\forall j$, then $(x - y|e_j) = 0$ $\forall j$, and, since $\{e_n\}$ is complete, Proposition 10.18 yields $x = y$. □
Let $H$ be a pre-Hilbert space. Let us explicitly interpret the previous results as information on the linear map defined by
$$\mathcal{F}(x) := \{(x|e_j)\}_j, \quad x \in H,$$
that maps $x \in H$ into the sequence of its Fourier coefficients.
o Bessel's inequality says that $\mathcal{F}(x) \in \ell_2$ $\forall x \in H$ and that $\mathcal{F} : H \to \ell_2$ is continuous, actually
$$\|\mathcal{F}(x)\|^2_{\ell_2} = \sum_{j=1}^\infty |(x|e_j)|^2 \le \|x\|^2,$$
o if $\{e_n\}$ is a complete orthonormal set in $H$, then Parseval's formula says that $\mathcal{F} : H \to \ell_2$ is an isometry between $H$ and its image $\mathcal{F}(H) \subset \ell_2$, in particular $\mathcal{F} : H \to \ell_2$ is injective,
o if $H$ is complete and $\{e_n\}$ is a complete orthonormal set, then, according to Proposition 10.19,
  - the series $\sum_{j=1}^\infty c_j e_j$ converges in $H$ for every choice of the sequence $\{c_j\} \in \ell_2$, that is, $\mathcal{F}$ is surjective onto $\ell_2$,
  - the inverse map of $\mathcal{F}$, $\mathcal{F}^{-1} : \ell_2 \to H$, is given by
$$\mathcal{F}^{-1}(\{c_j\}) := \sum_{j=1}^\infty c_j e_j.$$
Therefore, we can state the following.
10.20 Theorem. Every separable Hilbert space $H$ over $\mathbb{R}$ (respectively over $\mathbb{C}$) is isometric to $\ell_2(\mathbb{R})$ (respectively to $\ell_2(\mathbb{C})$). More precisely, given an orthonormal basis $\{e_n\} \subset H$, the coordinate map $\mathcal{E} : \ell_2(\mathbb{K}) \to H$ ($\mathbb{K} = \mathbb{R}$ if $H$ is real, resp. $\mathbb{K} = \mathbb{C}$ if $H$ is complex), given by
$$\mathcal{E}(\{c_k\}) := \sum_{k=1}^\infty c_k e_k,$$
is a surjective isometry of Hilbert spaces and its inverse maps any $x \in H$ into the sequence of the corresponding Fourier coefficients $\{(x|e_j)\}_j$.
Finally, we conclude with the following.
10.21 Theorem (Riesz-Fischer). Let $H$ be a Hilbert space and let $\{e_n\}$ be an at most denumerable orthonormal set of $H$. Then the following statements are equivalent.
(i) $\{e_n\}$ is a basis of $H$.
(ii) $\forall x \in H$ we have $x = \sum_{j=1}^\infty (x|e_j) e_j$.
(iii) $\|x - P_n(x)\| \to 0$, where $P_n$ is the orthogonal projection onto $V_n := \mathrm{Span}\{e_1, e_2, \dots, e_n\}$.
(iv) (Parseval's formula or energy equality) $\|x\|^2 = \sum_{j=1}^\infty |(x|e_j)|^2$ holds $\forall x \in H$.
(v) $(x|y) = \sum_{j=1}^\infty (x|e_j)\overline{(y|e_j)}$.
(vi) If $(x|e_j) = 0$ $\forall j$ then $x = 0$.
Proof. The equivalences of (i), (ii), (iii), (iv), (v) and the implication (i) $\Rightarrow$ (vi) were proved in Proposition 10.18. It remains to show that (vi) implies (i). Suppose that $\{e_n\}$ is not complete. Then there is $y \in H$ with $\|y\|^2 > \sum_{j=1}^\infty |(y|e_j)|^2$, while, on the other hand, Bessel's inequality and Proposition 10.19 show that there is $z \in H$ such that $z := \sum_{j=1}^\infty (y|e_j) e_j$, and by Parseval's formula, $\|z\|^2 = \sum_{j=1}^\infty |(y|e_j)|^2$. Consequently $\|z\|^2 < \|y\|^2$. But, on account of the continuity of the scalar product
$$(z|e_k) = \Big(\sum_{j=1}^\infty (y|e_j) e_j \ \Big|\ e_k\Big) = \sum_{j=1}^\infty (y|e_j)(e_j|e_k) = (y|e_k),$$
i.e., $(y - z|e_k) = 0$ $\forall k$. Then by (vi) $y = z$, a contradiction. □
d. Some orthonormal polynomials in $L^2$
Let $I$ be an interval on $\mathbb{R}$ and let $p : I \to \mathbb{R}$ be a continuous function that is positive in the interior of $I$ and such that for all $n \ge 0$
$$\int_I |t|^n p(t)\, dt < +\infty.$$
The function $p$ is often called a weight in $I$. The subspace $V_p$ of $C^0(I,\mathbb{C})$ of functions $x(t)$ such that
$$\int_I |x(t)|^2 p(t)\, dt < +\infty$$
is a linear space and
$$(x|y) := \int_I x(t)\overline{y(t)}\, p(t)\, dt$$
defines a Hermitian product on it. This way $V_p$ is a pre-Hilbert space. Also, one easily sees that the monomials $\{t^n\}$, $n \ge 0$, are linearly independent; Gram-Schmidt's orthonormalization process then produces orthonormal polynomials $\{P_n(t)\}$ of degree $n$ with respect to the weight $p$. Classical examples are
o Jacobi polynomials $J_n$. They correspond to the choice $I := [-1,1]$, $p(t) := (1-t)^\alpha (1+t)^\beta$, $\alpha, \beta > -1$.
o Legendre polynomials $P_n$. They correspond to the choice $\alpha = \beta = 0$ in Jacobi polynomials $J_n$, i.e., $I := [-1,1]$, $p(t) := 1$.
o Chebyshev polynomials $T_n$. They correspond to the choice $\alpha = \beta = -1/2$ in Jacobi polynomials $J_n$, i.e., $I := [-1,1]$, $p(t) := \frac{1}{\sqrt{1-t^2}}$.
o Laguerre polynomials $L_n$. They correspond to the choice $I := [0,+\infty[$, $p(t) := e^{-t}$.
o Hermite polynomials $H_n$. They correspond to the choice $I := ]-\infty,+\infty[$, $p(t) := e^{-t^2}$.
One can show that the polynomials $\{J_n\}$, $\{P_n\}$, $\{T_n\}$, $\{L_n\}$, $\{H_n\}$ form, respectively, a basis in $V_p$.
Denoting by $\{R_n\}$ the system of orthonormal polynomials with respect to $p(t)$ obtained by applying the Gram-Schmidt procedure to $\{t^n\}$, $n \ge 0$, the $R_n$'s have interesting properties. First, we explicitly notice the following properties:
o (A1) for all $n$, $R_n$ is orthogonal to any polynomial of degree less than $n$,
o (A2) for all $n$ the polynomial $R_n(t) - tR_{n-1}(t)$ has degree less than $n$, hence $(tR_{n-1}|R_n) = (R_n|R_n)$,
o (A3) for all $x, y, z \in V_p$ we have $(xy|z) = (xy\overline{z}|1) = (x|\overline{y}z)$.
10.22 Proposition (Zeros of $R_n$). Every $R_n$ has $n$ real distinct roots in the interior of $I$.
Proof. Since $\int_I R_n(t) p(t)\, dt = 0$, it follows that $R_n$ changes sign at least once in $I$. Let $t_1 < \dots < t_r$ be the points in $\mathrm{int}(I)$ in which $R_n$ changes sign. Let us show that $r = n$. Suppose $r < n$ and let
$$Q(t) := (t - t_1)(t - t_2) \cdots (t - t_r),$$
then $R_n Q$ has constant sign, hence $(R_n|Q) \ne 0$, which contradicts property (A1). □
10.23 Proposition (Recurrence relations). There exist two sequences $\{\lambda_n\}$, $\{\mu_n\}$ of real numbers such that for $n \ge 2$
$$R_n(t) = (t + \lambda_n) R_{n-1}(t) - \mu_n R_{n-2}(t).$$
Proof. Since $\deg(R_n - tR_{n-1}) \le n - 1$, then
$$R_n(t) - tR_{n-1}(t) = \sum_{i=0}^{n-1} c_i R_i(t),$$
and for $i < n - 1$, we have $-(tR_{n-1}|R_i) = c_i (R_i|R_i)$. By (A3), we have $(tR_{n-1}|R_i) = (R_{n-1}|tR_i)$, hence, if $i + 1 < n - 1$, then $(tR_{n-1}|R_i) = 0$, from which $c_i = 0$ for $i \ne n-2, n-1$. For $i = n - 2$, property (A2) shows that
$$-(R_{n-1}|R_{n-1}) = c_{n-2}(R_{n-2}|R_{n-2}),$$
hence $c_{n-2} < 0$. □
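To make the Gram-Schmidt construction of orthogonal polynomials concrete, here is a minimal sketch for the Legendre weight $p(t) = 1$ on $[-1,1]$, using exact rational arithmetic. It is an illustration under stated assumptions: polynomials are stored as coefficient lists (lowest degree first), the final normalization is skipped so the output polynomials are monic rather than orthonormal, and the names `inner`, `orthogonal_polynomials` and `R` are introduced only for this example.

```python
from fractions import Fraction

def inner(p, q):
    """Integral of p(t) q(t) over [-1, 1] for real coefficient lists p, q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers of t integrate to zero on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total

def orthogonal_polynomials(n_max):
    """Gram-Schmidt on 1, t, t^2, ..., without normalization (monic output)."""
    polys = []
    for n in range(n_max + 1):
        p = [Fraction(0)] * n + [Fraction(1)]  # the monomial t^n
        for q in polys:
            # Subtract the component of p along the earlier polynomial q.
            c = inner(p, q) / inner(q, q)
            for k, qk in enumerate(q):
                p[k] -= c * qk
        polys.append(p)
    return polys

R = orthogonal_polynomials(3)
# R[2] has coefficients [-1/3, 0, 1], i.e. t^2 - 1/3;
# R[3] has coefficients [0, -3/5, 0, 1], i.e. t^3 - (3/5) t.
```

Both polynomials are scalar multiples of the Legendre polynomials of degree 2 and 3, and each is orthogonal to all polynomials of lower degree, in agreement with property (A1).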
«"(*)^=i£(*'-^)"(i) Integrating by parts show that {Qn} is an orthogonal system in [—1,1] with respect to p(t) = 1, and that {Qn\Qn) = 2/(2n -f 1). (ii) Show that Q n ( l ) = 1 and that Qn is given in terms of Legendre polynomias {Pn} by Qn{t) = Pn{t)/Pn{l). Finally, compute P n ( l ) . (iii) Show that the polynomials {Qn} satisfy the recurrence relation nQn = (2n - l ) Q n - i - (n - l ) Q n - 2 and solve the linear ODE d / 2\^Qn | ( ( l - * ^ ) ^ ) + n ( n + l)Q„=0. 10.25 %. In Vp with p{t) := e"* and / = [0, +oo[, define
(i) Show that deg Qn = n and that {Qn} is a system of orthogonal polynomials in Vp. Then compute {Qn\Qn)(ii) Show that Qn(0) = 1, and, in terms of Laguerre polynomials,
^ ^ ' Compute then Ln(0). respect to {Qn}(iv) Show that E 7 n , a Q n ( t ) = e^* in Vp.
Ln(0)
10.2 The Abstract Dirichlet's Principle and Orthogonality
Figure 10.3. The frontispieces of two classical monographs: Methoden der mathematischen Physik by R. Courant and D. Hilbert, and Linear Operators, Part I: General Theory by N. Dunford and J. T. Schwartz.
(v) Changing variable and using the Stone-Weierstrass theorem, show that {e ^*} is a basis in Vp. 10.26 %. Define the polynomials Qn{t) by
$$\frac{d^n}{dt^n} e^{-t^2} = (-1)^n Q_n(t)\, e^{-t^2}.$$
(i) Show that $\{Q_n\}$ is an orthogonal system in $V_p$ with $I = ]-\infty,+\infty[$ and $p(t) = e^{-t^2}$. Show that each $Q_n(t)$ is proportional to the Hermite polynomial $H_n$.
(ii) Show that $Q_0 = 1$, $Q_1 = 2t$ and that for $n \ge 2$
$$Q_n(t) = 2tQ_{n-1}(t) - 2(n-1)Q_{n-2}(t).$$
(iii) Show that $Q_n$ satisfies $Q_n''(t) - 2tQ_n'(t) + 2nQ_n(t) = 0$ and that $Q_n'(t) = 2nQ_{n-1}(t)$.
10.2 The Abstract Dirichlet's Principle and Orthogonality The aim of this section is to illustrate some aspects of the linear geometry of Hilbert spaces mainly in connection with the abstract formulation of the Dirichlet principle. In its concrete formulation, this principle has played a fundamental role in the geometric theory of functions by Riemann, in the theory of partial differential equations, for instance, when dealing with
gravitational or electromagnetic fields and in the calculus of variations. On the other hand, in its abstract formulation it turns out to be a simple orthogonal projection theorem.
a. The abstract Dirichlet's principle
Let $H$ be a real (complex) Hilbert space with scalar (respectively Hermitian) product $(\ |\ )$ and norm $\|u\| := \sqrt{(u|u)}$. Let $\mathbb{K} = \mathbb{R}$ if $H$ is real or $\mathbb{K} := \mathbb{C}$ if $H$ is complex. Recall that a linear continuous functional $L$ on $H$ is a linear map $L : H \to \mathbb{K}$ such that
$$|L(u)| \le K\|u\| \quad \forall u \in H; \qquad (10.3)$$
the smallest constant $K$ for which (10.3) holds is called the norm of $L$, denoted $\|L\|$, so that
$$|L(u)| \le \|L\|\,\|u\| \quad \forall u \in H,$$
and, see Section 9.4,
$$\|L\| = \sup_{\|u\| = 1} |L(u)| = \sup_{u \ne 0} \frac{|L(u)|}{\|u\|}.$$
We denote by $H^* = \mathcal{L}(H,\mathbb{K})$ the space of linear continuous functionals on $H$, called the dual of $H$.
10.27 Theorem (Abstract Dirichlet's principle). Let $H$ be a real or complex Hilbert space and let $L \in H^*$. The functional $\mathcal{F} : H \to \mathbb{R}$ defined by
$$\mathcal{F}(u) := \frac{1}{2}\|u\|^2 - \Re(L(u)) \qquad (10.4)$$
achieves a unique minimum point $\bar{u}$ in $H$, and every minimizing sequence, i.e., every sequence $\{u_k\} \subset H$ such that $\mathcal{F}(u_k) \to \inf_{v \in H} \mathcal{F}(v)$, converges to $\bar{u}$ in $H$. Moreover $\bar{u}$ is characterized as the unique solution of the linear equation
$$(\varphi|\bar{u}) = L(\varphi) \quad \forall \varphi \in H. \qquad (10.5)$$
In particular $\|\bar{u}\| = \|L\|$.
Proof. Let us prove that $\mathcal{F}$ has a minimum point. First we notice that $\mathcal{F}$ is bounded from below, since, recalling the inequality $2ab \le a^2 + b^2$, we have for all $v \in H$
$$\mathcal{F}(v) \ge \frac{1}{2}\|v\|^2 - \|L\|\,\|v\| \ge \frac{1}{2}\|v\|^2 - \frac{1}{2}\|v\|^2 - \frac{1}{2}\|L\|^2 = -\frac{1}{2}\|L\|^2,$$
hence $\lambda := \inf_{v \in H} \mathcal{F}(v) \in \mathbb{R}$. Then we observe that, by the parallelogram law,
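$$\frac{1}{4}\|u - v\|^2 = \frac{1}{2}\|u\|^2 + \frac{1}{2}\|v\|^2 - \Big\|\frac{u+v}{2}\Big\|^2 - \Re(L(u)) - \Re(L(v)) + 2\Re\Big(L\Big(\frac{u+v}{2}\Big)\Big) = \mathcal{F}(u) + \mathcal{F}(v) - 2\mathcal{F}\Big(\frac{u+v}{2}\Big). \qquad (10.6)$$
Figure 10.4. The Dirichlet's principle.
Thus, if $\{u_k\}$ is a minimizing sequence, by (10.6)
$$\frac{1}{4}\|u_k - u_h\|^2 = \mathcal{F}(u_k) + \mathcal{F}(u_h) - 2\mathcal{F}\Big(\frac{u_k + u_h}{2}\Big) \le \mathcal{F}(u_k) + \mathcal{F}(u_h) - 2\lambda \to 0$$
as $h, k \to \infty$, where $\lambda := \inf_{v \in H} \mathcal{F}(v)$. Therefore $\{u_k\}$ is a Cauchy sequence in $H$ and converges to some $\bar{u} \in H$; by continuity $\mathcal{F}(u_k) \to \mathcal{F}(\bar{u})$, hence $\mathcal{F}(\bar{u}) = \lambda$. This proves existence of the minimizer $\bar{u}$. If $\{v_k\}$ is another minimizing sequence for $\mathcal{F}$, (10.6) yields $\|u_k - v_k\| \to 0$, and this proves that the minimizer is unique and that every minimizing sequence converges to $\bar{u}$ in $H$.
Let us show that $\bar{u}$ solves (10.5). Fix $\varphi \in H$, and consider the real function $\epsilon \to \mathcal{F}(\bar{u} + \epsilon\varphi)$, that is the second order polynomial in $\epsilon$
$$\mathcal{F}(\bar{u} + \epsilon\varphi) = \frac{1}{2}\|\varphi\|^2 \epsilon^2 + \Re\big[(\varphi|\bar{u}) - L(\varphi)\big]\epsilon + \mathcal{F}(\bar{u})$$
with minimum point at $\epsilon = 0$. We deduce $\Re((\varphi|\bar{u}) - L(\varphi)) = 0$ $\forall \varphi \in H$, hence, as $z = 0$ if $\Re(\lambda z) = 0$ $\forall \lambda \in \mathbb{C}$,
$$(\varphi|\bar{u}) - L(\varphi) = 0 \quad \forall \varphi \in H. \qquad (10.7)$$
Conversely, if $v$ solves (10.5), then for every $\varphi \in H$
$$\mathcal{F}(v + \varphi) = \frac{1}{2}\|v\|^2 + \Re(v|\varphi) + \frac{1}{2}\|\varphi\|^2 - \Re(L(v)) - \Re(L(\varphi)) = \mathcal{F}(v) + \Re\big((\varphi|v) - L(\varphi)\big) + \frac{1}{2}\|\varphi\|^2 = \mathcal{F}(v) + \frac{1}{2}\|\varphi\|^2,$$
hence $\mathcal{F}(v + \varphi) \ge \mathcal{F}(v)$ $\forall \varphi \in H$, i.e., $v$ is a minimum point for $\mathcal{F}$ in $H$. This proves that (10.5) has a unique solution, the minimum point $\bar{u}$ of $\mathcal{F} : H \to \mathbb{R}$. Finally, we infer from (10.5)
$$\|L\| = \sup_{\varphi \ne 0} \frac{|L(\varphi)|}{\|\varphi\|} = \sup_{\varphi \ne 0} \frac{|(\varphi|\bar{u})|}{\|\varphi\|} = \|\bar{u}\|.$$
□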
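In finite dimension the statement can be checked directly. The sketch below works in $\mathbb{R}^2$ with a functional of the form $L(u) = (u|a)$, so that the expected minimizer of $\mathcal{F}$ is $a$ itself; the vector `a`, the sampling grid and the function names are illustrative assumptions. It tests the identity $\mathcal{F}(v) = \mathcal{F}(a) + \frac{1}{2}\|v - a\|^2$, which is the last computation of the proof with $v$ in place of $v + \varphi$.

```python
def inner(x, y):
    """Euclidean inner product on R^2."""
    return sum(p * q for p, q in zip(x, y))

a = [2.0, -1.0]  # hypothetical representer of the functional L

def L(u):
    """Linear functional L(u) = (u|a)."""
    return inner(u, a)

def F(u):
    """Dirichlet-type functional F(u) = 1/2 ||u||^2 - L(u)."""
    return 0.5 * inner(u, u) - L(u)

# Sample points on a grid of step 1/2 that contains a.
samples = [[0.5 * i, 0.5 * j] for i in range(-8, 9) for j in range(-8, 9)]

# For every sample v, F(v) - F(a) should equal 1/2 ||v - a||^2.
gaps = []
for v in samples:
    diff = [vi - ai for vi, ai in zip(v, a)]
    gaps.append(F(v) - F(a) - 0.5 * inner(diff, diff))

min_value = min(F(v) for v in samples)
```

On this grid the minimum value is attained at $a$ and equals $-\frac{1}{2}\|a\|^2 = -\frac{1}{2}\|L\|^2$, the lower bound found at the beginning of the proof.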
Figure 10.5. Frigyes Riesz (1880-1956) and the frontispiece of a classical monograph.
b. Riesz's theorem
In particular, we have proved the following.
10.28 Theorem (Riesz). For every linear continuous functional $L \in H^*$ there exists a unique $u_L \in H$ such that
$$L(\varphi) = (\varphi|u_L) \quad \forall \varphi \in H. \qquad (10.8)$$
Moreover $\|u_L\| = \|L\|$.
Actually, we have also proved that Riesz's theorem and the abstract Dirichlet's principle are equivalent.
10.29 Continuous dependence and Riesz's operator. If $u$ solves the minimum problem (10.4), or equation (10.8), we have $\|u\| = \|L\|$. This implies that the solution of (10.4) or (10.8) depends continuously on $L$. In fact, if $L_n, L \in H^*$ and $\|L_n - L\| \to 0$, and if $u_n, u \in H$ solve $(\varphi|u_n) = L_n(\varphi)$ and $(\varphi|u) = L(\varphi)$ $\forall \varphi \in H$, then
$$(\varphi|u_n - u) = (L_n - L)(\varphi) \quad \forall \varphi \in H,$$
hence $\|u_n - u\| = \|L_n - L\| \to 0$.
10.30 Riesz's operator. The map $\Gamma : H^* \to H$ that associates to each $L \in H^*$ the solution $u_L$ of (10.8) is called Riesz's map. It is easily seen that $\Gamma : H^* \to H$ is linear and by Riesz's theorem we have $\|\Gamma(L)\| = \|u_L\| = \|L\|$, i.e., not only is $\Gamma$ continuous, but
10.31 Theorem. Riesz's map T : H* -^ H is an isometry between H* and H. c. The orthogonal projection theorem Let us now extend the orthogonal projection theorem onto finite-dimensional subspaces, see Chapter 3, to closed subspaces of a Hilbert space. Let H he a, Hilbert space and V a subspace of iJ. If / G if, then the map L : V -^ K, V -^ L{v) = {f\v) is a linear continuous operator on V with ||L|| < 11/11 since \{f\v)\ < \\f\\ \\v\\ Vv G F by the Cauchy-Schwarz inequality. Since a closed linear subspace F of a Hilbert space H is again a Hilbert space with the induced inner product, a simple consequence of Theorem 10.27 is the following. 4.
10.32 Theorem (Projection theorem). Let $V$ be a closed linear subspace of a Hilbert space $H$. Then for every $f \in H$ there is a unique point $u \in V$ of minimum distance from $f$, that is
$$\|f - u\| = \operatorname{dist}(f, V) := \inf\{\|f - \varphi\| \mid \varphi \in V\}.$$
Moreover, $u$ is characterized as the unique point such that $f - u$ is orthogonal to $V$, i.e., $(f - u|\varphi) = 0$ $\forall \varphi \in V$.
Proof. We have for all $v \in V$
$$\|v - f\|^2 = \|v\|^2 - 2\Re(v|f) + \|f\|^2.$$
Theorem 10.27, when applied to $\mathcal{F}(v) := \|v\|^2 - 2\Re(f|v)$, $v \in V$, yields existence of a unique minimizer $u \in V$ of $\|v - f\|^2$, hence of $v \mapsto \|v - f\|$. The characterization of $u$ given by Riesz's theorem states, in our case, that $u$ is also the unique solution of
$$2(\varphi|u) = 2(\varphi|f) \quad \forall \varphi \in V.$$
Let $V$ be a subspace of a Hilbert space $H$. We denote by $V^\perp$ the class of vectors of $H$ orthogonal to $V$,
$$V^\perp := \{x \in H \mid (x|v) = 0 \ \forall v \in V\}.$$
Clearly $V^\perp$ is a closed subspace of $H$.
10.33 Corollary. If $V$ is a linear closed subspace of a Hilbert space $H$, then $H = V \oplus V^\perp$, i.e., every $u \in H$ uniquely decomposes as $u = v + w$, where $v \in V$ and $w \in V^\perp$.
10.34 ¶. Show that, if $V$ is a linear subspace of a Hilbert space $H$, then $V^\perp$ is closed and that $(V^\perp)^\perp$ is the closure of $V$.
10.35 ¶. Show that the orthogonal projection theorem is in fact equivalent to Riesz's theorem and, consequently, to the abstract Dirichlet's principle. [Hint: We give the scheme of the proof leaving it to the reader to add details. Uniqueness and $\|u_L\| = \|L\|$ follow from (10.8). Let us prove the existence of a solution of (10.8). Suppose $L$ is not identically zero, then $\ker L = L^{-1}(\{0\})$ is a linear closed proper subspace of $H$ and there exists $u_0 \in (\ker L)^\perp$ such that $u_0 \ne 0$ and $L(u_0) = 1$. Since $u - L(u)u_0 \in \ker L$ $\forall u \in H$, we have
$$u = w + L(u)u_0 \quad \text{with } w \in \ker L \text{ and } u_0 \in (\ker L)^\perp.$$
Multiplying by $u_0$, we then find $L(u) = \left(u \,\Big|\, \frac{u_0}{\|u_0\|^2}\right)$.]
d. Projection operators
Let $V$ be a linear closed subspace of a Hilbert space $H$. The projection theorem defines a linear continuous operator $P_V : H \to H$ that maps $f \in H$ into its orthogonal projection $P_V f \in V$; of course $\|P_V\| \le 1$ and $\operatorname{Im}(P_V) = V$. Also
$$P_V^2 = P_V \circ P_V = P_V$$
and the formula $H = V \oplus V^\perp$ can be written as
$$\mathrm{Id} = P_V + P_{V^\perp}, \qquad P_V P_{V^\perp} = P_{V^\perp} P_V = 0.$$
For the reader's convenience, we only prove that $P_V(f + g) = P_V(f) + P_V(g)$. Trivially, $f + g - P_V(f + g) \perp V$ and $f + g - P_V f - P_V g \perp V$; since there is a unique $h \in V$ such that $f + g - h \perp V$, we conclude $P_V(f + g) = P_V(f) + P_V(g)$.
10.36 ¶. Let $P : H \to H$ be a linear operator such that $P^2 = P$. Then $P$ is continuous if and only if $\ker P$ and $\operatorname{Im} P$ are closed.
10.37 ¶. If $V$ is a closed subspace of a Hilbert space $H$ and $\{e_n\}$ is a denumerable orthonormal basis of $V$, then the orthogonal projection of $x \in H$ is given by
$$Px := \sum_{j=1}^{\infty} (Px|e_j)\,e_j = \sum_{j=1}^{\infty} (x|e_j)\,e_j.$$
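The formula of 10.37 can be tried out in finite dimension. The following is a minimal sketch, assuming $H = \mathbb{R}^3$ with the Euclidean inner product; the helper names `dot` and `project`, the basis and the vector `x` are illustrative choices, not taken from the text.

```python
import math


def dot(x, y):
    """Euclidean inner product (x|y) on R^n."""
    return sum(a * b for a, b in zip(x, y))


def project(x, basis):
    """Orthogonal projection Px = sum_j (x|e_j) e_j onto Span(basis).

    `basis` is assumed to be orthonormal; this is not checked here.
    """
    p = [0.0] * len(x)
    for e in basis:
        c = dot(x, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p


# Assumed data: V is the plane spanned by two orthonormal vectors of R^3.
s = 1.0 / math.sqrt(2.0)
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]
x = [3.0, 1.0, 2.0]

px = project(x, basis)                      # orthogonal projection of x onto V
residual = [a - b for a, b in zip(x, px)]   # x - Px, which should lie in V-perp
ppx = project(px, basis)                    # P(Px), which should equal Px
```

Here `residual` is orthogonal to both basis vectors and `ppx` coincides with `px`, in agreement with the characterization in Theorem 10.32 and with $P_V^2 = P_V$.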
10.3 Bilinear Forms
From now on we shall only consider real vector spaces, though one could develop similar results for sesquilinear forms on complex vector spaces.
10.3.1 Linear operators and bilinear forms
a. Linear operators
Let $H$ be a real Hilbert space. As we know, the space $\mathcal{L}(H, H)$ of linear continuous operators, also called bounded operators from $H$ into $H$, is a Banach space with the norm
$$\|T\| := \sup_{x \ne 0} \frac{\|Tx\|}{\|x\|}.$$
If $T \in \mathcal{L}(H, H)$, we denote by $N(T)$ and $R(T)$ respectively the kernel and the image or range of $T$. Since $T$ is continuous, $N(T) = T^{-1}(\{0\})$ is closed in $H$, while in general $R(T)$ is not closed. The restriction of $T$ to $N(T)^\perp$, $T : N(T)^\perp \to R(T)$, is of course a linear bijection, therefore, from Banach's open mapping theorem, cf. Section 9.4, we infer the following.
10.38 Proposition. Let $T \in \mathcal{L}(H, H)$. Then $T$ has a closed range in $H$ if and only if there exists $C > 0$ such that $\|x\| \le C\|Tx\|$ $\forall x \in N(T)^\perp$, that is, if and only if $T^{-1} : R(T) \to N(T)^\perp$ is a bounded operator, or, equivalently, if and only if $T : N(T)^\perp \to R(T)$ is an isomorphism.
b. Adjoint operator
Let $X$, $Y$ be two real Hilbert spaces endowed with their inner products $(\ |\ )_X$ and $(\ |\ )_Y$, and let $T \in \mathcal{L}(X, Y)$. For any $y \in Y$ the map $x \mapsto (Tx|y)_Y$ is a linear continuous form on $X$, hence Riesz's theorem yields a unique element $T^*y \in X$ such that
$$(x|T^*y)_X = (Tx|y)_Y \quad \forall x \in X, \ \forall y \in Y. \tag{10.9}$$
It is easily seen that the map $T^* : Y \to X$ just defined is a linear operator called the adjoint of $T$. Moreover, from (10.9) $T^*$ is a bounded operator with $\|T^*\| = \|T\|$. Obviously, if $S, T \in \mathcal{L}(H, H)$,
$$(TS)^* = S^*T^*, \qquad (T^*)^* = T.$$
10.39 ¶. Suppose that $P : H \to H$ is a linear continuous operator such that $P^2 = P$ and $P^* = P$. Show that $V := P(H)$ is a closed subspace of $H$ and that $P$ is the orthogonal projection onto $V$.
An operator $T : H \to H$ on a Hilbert space $H$ is called self-adjoint if $T^* = T$, i.e., $(x|Ty) = (Tx|y)$ $\forall x, y \in H$. It follows from (10.9) that $R(T)^\perp = N(T^*)$. Consequently $\overline{R(T)} = N(T^*)^\perp$ and using the open mapping theorem, we conclude the following.
10.40 Corollary. Let $T \in \mathcal{L}(H, H)$ be a bounded operator with closed range. Then we have
(i) The equation $Tx = y$ is solvable if and only if $y \perp N(T^*)$,
(ii) $T$ is an isomorphism between $N(T)^\perp$ and $R(T) = N(T^*)^\perp$. In particular the Moore-Penrose inverse $T^\dagger : H \to H$, defined by composing the orthogonal projection onto $R(T)$ with the inverse of $T|_{N(T)^\perp}$, is a bounded operator,
(iii) We have $H = R(T) \oplus N(T^*)$.
Proof. (i) For $x, y \in H$ we have $(y|Tx) = (T^*y|x)$. Hence, $R(T)^\perp = N(T^*)$. Therefore, by considering the orthogonals, $R(T) = (R(T)^\perp)^\perp = N(T^*)^\perp$.
(ii) follows from the open mapping theorem and (iii) follows from (i) by considering the projections onto $N(T^*)^\perp$ and $N(T^*)$. □
10.41 ¶. Let $A \in \mathcal{L}(X, Y)$ be a bounded operator between Hilbert spaces. Show that
(i) $N(A) = N(A^*A)$,
(ii) $\overline{R(A^*)} = \overline{R(A^*A)}$,
(iii) $R(A^*) = R(A^*A)$ if and only if $R(A) = N(A^*)^\perp$, i.e., if and only if $R(A)$ is closed,
(iv) if $R(A^*A)$ is closed, then $R(A^*)$ and $R(A)$ are closed.
10.42 ¶. Let $H$ be a Hilbert space and let $T$ be a self-adjoint operator. Show that $T$ is continuous. [Hint: Show that $T$ has a closed graph.]
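As a finite-dimensional illustration of Corollary 10.40 (i), one may look at a singular symmetric matrix. The worked example below is a sketch under assumed data: $H = \mathbb{R}^2$ with the Euclidean inner product, and the matrix $T$ and the right-hand sides $y$ are chosen here only for illustration.

```latex
% Assumed data: H = \mathbb{R}^2 with the Euclidean inner product.
T = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = T^*, \qquad
R(T) = \operatorname{Span}\{(1,1)\}, \qquad
N(T^*) = N(T) = \operatorname{Span}\{(1,-1)\}.

% Solvability test: Tx = y is solvable iff y \perp N(T^*), i.e. iff y_1 - y_2 = 0.
y = (2,2): \quad (y \,|\, (1,-1)) = 0, \qquad x = (1,1) \text{ solves } Tx = y;

y = (1,0): \quad (y \,|\, (1,-1)) = 1 \neq 0, \qquad Tx = y \text{ has no solution.}
```

In the solvable case the solution $x = (1,1)$ lies in $N(T)^\perp$, so it is also the solution of least norm.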
c. Bilinear forms
Let $H$ be a real vector space. A map $B : H \times H \to \mathbb{R}$ which is linear on each factor is called a bilinear form. A bilinear form $B : H \times H \to \mathbb{R}$ is called continuous or bounded if, for some constant $\Lambda$, we have
$$|B(u, v)| \le \Lambda \|u\|\,\|v\| \quad \forall u, v \in H,$$
and it is called coercive if there is $\lambda > 0$ such that
$$B(u, u) \ge \lambda \|u\|^2 \quad \forall u \in H.$$
Finally, $B(u, v)$ is said to be symmetric if
$$B(u, v) = B(v, u) \quad \forall u, v \in H.$$
Any linear operator $T : H \to H$ defines a bilinear form by $B(v, u) := (v|Tu)$, and $B$ is bounded if $T$ is bounded since $|B(v, u)| \le \|T\|\,\|v\|\,\|u\|$. Conversely, given a continuous bilinear form $B : H \times H \to \mathbb{R}$ on a real Hilbert space $H$,
$$|B(u, v)| \le \Lambda \|u\|\,\|v\| \quad \forall u, v \in H, \tag{10.10}$$
Riesz's theorem yields for every $u \in H$ a unique element $Tu \in H$ such that
$$B(v, u) = (v|Tu) \quad \forall v \in H. \tag{10.11}$$
It is easy to see that $T$ is linear and, from (10.10), that $\|T\| \le \Lambda$ since
$$\|Tu\|^2 = B(Tu, u) \le \Lambda \|Tu\|\,\|u\|.$$
Consequently, by (10.11), there is a complete equivalence between bilinear continuous forms on a real Hilbert space $H$ and bounded linear operators from $H$ into $H$. Also, by (10.11), coercive bilinear continuous forms correspond to bounded operators called coercive, i.e., such that for some $\lambda > 0$
$$(u|Tu) \ge \lambda \|u\|^2 \quad \forall u \in H.$$
Moreover, self-adjoint operators correspond to bilinear symmetric forms, in fact
$$B(v, u) - B(u, v) = (v|Tu) - (u|Tv) = (v|(T - T^*)u) \quad \forall u, v \in H.$$
10.3.2 Coercive symmetric bilinear forms
a. Inner products
Clearly, every symmetric continuous coercive bilinear form on $H$ defines in $H$ a new scalar product, which in turn induces a norm that is equivalent to the original, since
$$\lambda \|u\|^2 \le B(u, u) \le \Lambda \|u\|^2 \quad \forall u \in H.$$
Replacing $(u|v)$ with $B(u, v)$, Dirichlet's principle and Riesz's theorem read as follows.
10.43 Theorem. Let $H$ be a real Hilbert space with inner product $(\ |\ )$ and norm $\|u\| := \sqrt{(u|u)}$ and let $B : H \times H \to \mathbb{R}$ be a symmetric, continuous and coercive bilinear form on $H$, i.e., $B(u, v) = B(v, u)$ and for some $\Lambda \ge \lambda > 0$
$$|B(u, v)| \le \Lambda \|u\|\,\|v\|, \qquad B(u, u) \ge \lambda \|u\|^2, \qquad \forall u, v \in H;$$
finally, let $L$ be a continuous linear form on $H$. Then the following equivalent claims hold:
(i) (Abstract Dirichlet's principle). The functional
$$\mathcal{F}(u) := \frac{1}{2}B(u, u) - L(u)$$
has a unique minimizer $\bar{u} \in H$, every minimizing sequence converges to $\bar{u}$ in $H$, and $\bar{u}$ solves
$$B(\varphi, \bar{u}) = L(\varphi) \quad \forall \varphi \in H.$$
(ii) (Riesz's theorem). The equation
$$B(\varphi, u) = L(\varphi) \quad \forall \varphi \in H \tag{10.12}$$
has a unique solution $u_L \in H$.
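Moreover $\bar{u} = u_L$ and $\|u_L\| \le \frac{1}{\lambda}\|L\|$. The continuity estimate for $u_L$ follows from
$$\sqrt{\lambda}\,\|u_L\| \le \sqrt{B(u_L, u_L)} = \sup_{v \ne 0} \frac{|L(v)|}{\sqrt{B(v, v)}} \le \frac{1}{\sqrt{\lambda}}\|L\|.$$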
In terms of operators Theorem 10.43 may be rephrased as follows.
10.44 Theorem. Let $T$ be a continuous, coercive, self-adjoint operator on $H$, i.e.,
$$\|Tu\| \le \|T\|\,\|u\|, \qquad (u|Tu) \ge \lambda \|u\|^2, \quad \lambda > 0, \ \forall u \in H, \qquad T = T^*.$$
Then $T$ is invertible with continuous inverse, and $\|T^{-1}\| \le 1/\lambda$.
Proof. From coercivity we infer that
$$\lambda \|u\|^2 \le |(u|Tu)| \le \|u\|\,\|Tu\|, \tag{10.13}$$
hence $N(T) = \{0\}$ and $T^{-1} : R(T) \to H$ is continuous. $T$ is therefore an isomorphism between $H$ and $R(T)$, and therefore $R(T)$ is closed, $R(T) = \overline{R(T)} = N(T^*)^\perp = N(T)^\perp = H$ and (10.13) rewrites as $\|T^{-1}u\| \le \frac{1}{\lambda}\|u\|$ $\forall u \in H$. □
A variational proof of Theorem 10.44. For any $y \in H$, consider the bounded linear form $L : H \to \mathbb{R}$, $L(\varphi) := (\varphi|y)$, and the bilinear form $B : H \times H \to \mathbb{R}$, $B(\varphi, u) := (\varphi|Tu)$. $B$ is bounded and symmetric, $T$ being bounded and self-adjoint. Moreover, the coercivity of $T$ implies that $B(u, u) \ge \lambda \|u\|^2$ $\forall u \in H$. By the abstract Dirichlet principle or Riesz's theorem, Theorem 10.43, we find $x \in H$ such that
$$(\varphi|Tx) = B(\varphi, x) = (\varphi|y) \quad \forall \varphi \in H,$$
that is, $Tx = y$. Finally, from the coercivity assumption we infer $\lambda \|x\|^2 \le |(x|Tx)| \le \|x\|\,\|Tx\|$, that is, $\|T^{-1}\| \le \frac{1}{\lambda}$. □
b. Green's operator
Given a bilinear form in a real Hilbert space as above, the Green operator associated to $B$ is the operator $\Gamma_B : H^* \to H$ that maps $L \in H^*$ into the unique solution $u_{L,B} \in H$ of $B(\varphi, u_{L,B}) = L(\varphi)$ $\forall \varphi \in H$. It is easily seen that $\Gamma_B$ is linear and the estimate $\|u_{L,B}\| \le \frac{1}{\lambda}\|L\|$ says that $\Gamma_B$ is continuous. Of course, if $\Gamma$ is the Riesz operator and $T : H \to H$ is an isomorphism such that $B(v, u) = (v|Tu)$, then $\Gamma_B = T^{-1} \circ \Gamma$.
10.45 ¶. Under the assumptions of Theorem 10.43, let $K \subset H$ be a closed convex set of a real Hilbert space. Show that
(i) the functional $\mathcal{F}(u)$ has a unique minimizer $\bar{u} \in K$,
(ii) $\bar{u}$ is the unique solution $u \in K$ of the variational inequality
$$u \in K, \qquad B(u, u - v) \le L(u - v) \quad \forall v \in K.$$
c. Ritz's method
The Dirichlet principle answers the question of the existence and uniqueness of the minimizer of $\mathcal{F}(u) := \frac{1}{2}B(u, u) - L(u)$ and characterizes such a minimizer as the unique solution of $B(v, u_L) = L(v)$ $\forall v \in H$. But how can one compute $u_L$? If $H$ is a separable Hilbert space, there is an easy answer. In fact, since $B(u, v)$ is an inner product, we can find a complete system $\{e_n\}$ in $H$ which is orthonormal with respect to $B$,
$$B(e_i, e_j) = \delta_{ij},$$
such that every $u \in H$ uniquely writes as $u = \sum_{j=1}^{\infty} B(u, e_j)\,e_j$, compare Theorem 10.20. If $B(\varphi, u) = L(\varphi)$ $\forall \varphi \in H$, then $B(e_j, u) = L(e_j)$, thus we have the following.
10.46 Theorem (Ritz's method). Let $H$ be a separable real Hilbert space, $B$ a symmetric coercive bilinear form, $L \in H^*$ and $\{e_n\}$ a complete orthonormal system with respect to $B$. Then $L(v) = B(v, u)$ $\forall v \in H$ has the unique solution
$$u_L = \sum_{j=1}^{\infty} L(e_j)\,e_j.$$
This, of course, allows us to settle a procedure that, starting from a denumerable dense set of vectors $\{x_n\}$, computes a system of orthonormal vectors with respect to $B(\ ,\ )$ by the Gram-Schmidt method, and yields the approximations $\sum_{j=1}^{N} L(e_j)\,e_j$ of $u_L$.
10.47 ¶. With the notation of Theorem 10.46, show that for every integer $N \ge 1$, $u_N := \sum_{j=1}^{N} L(e_j)\,e_j$ is the solution in $\operatorname{Span}\{e_1, e_2, \dots, e_N\}$ of the system of $N$ linear equations
$$B(v, u_N) = L(v) \quad \forall v \in \operatorname{Span}\{e_1, e_2, \dots, e_N\},$$
and the unique minimizer of
$$\frac{1}{2}B(v, v) - \Re(L(v)), \quad v \in \operatorname{Span}\{e_1, e_2, \dots, e_N\}.$$
10.48 ¶. Show that the following error estimate for Ritz's method holds:
$$\frac{\lambda}{2}\|u - u_N\|^2 \le \mathcal{F}(u_N) - \inf_H \mathcal{F},$$
where $\mathcal{F}(u) := \frac{1}{2}B(u, u) - L(u)$. [Hint: Compute $\mathcal{F}(u + v) - \mathcal{F}(u)$.]
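Ritz's procedure can be sketched in finite dimension. The following is a minimal illustration, assuming $H = \mathbb{R}^3$, $B(u, v) = (u|Mv)$ for a symmetric positive definite matrix `M`, and $L(v) = (b|v)$; the matrix, the vector `b` and the function names are assumptions made for this example only.

```python
import math


def dot(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def matvec(M, v):
    """Matrix-vector product M v."""
    return [dot(row, v) for row in M]


def ritz(M, b, start_vectors):
    """Return the Ritz approximations u_1, ..., u_N.

    B(u, v) = (u | M v) with M symmetric positive definite, L(v) = (b | v).
    The start vectors are B-orthonormalized by (modified) Gram-Schmidt and
    u_N = sum_{j <= N} L(e_j) e_j is accumulated along the way.
    """
    def B(u, v):
        return dot(u, matvec(M, v))

    es, approximations = [], []
    u = [0.0] * len(b)
    for x in start_vectors:
        w = list(x)
        for e in es:  # remove the B-components along the previous e_j
            c = B(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(B(w, w))
        e = [wi / norm for wi in w]
        es.append(e)
        c = dot(b, e)  # L(e_N)
        u = [ui + c * ei for ui, ei in zip(u, e)]
        approximations.append(u)
    return approximations


# Assumed data for the illustration.
M = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 0.0, 1.0]
standard_basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

us = ritz(M, b, standard_basis)
# Values of F(u) = B(u, u)/2 - L(u) along the approximations.
energies = [0.5 * dot(u, matvec(M, u)) - dot(b, u) for u in us]
```

Since the three start vectors span $\mathbb{R}^3$, the last approximation `us[-1]` solves $Mu = b$ up to rounding, and the values in `energies` do not increase with $N$, as the minimization is carried out over nested subspaces.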
d. Linear regression
Let $H$, $Y$ be Hilbert spaces and let $A \in \mathcal{L}(H, Y)$. Given $y \in Y$, we may find the minimum points $u \in H$ of the functional
$$\mathcal{F}(u) := \|Au - y\|_Y^2, \quad u \in H. \tag{10.14}$$
From the orthogonal projection theorem, we immediately infer the following.
10.49 Proposition. Let $A \in \mathcal{L}(H, Y)$ be a bounded operator with closed range. Then the functional (10.14) has a minimum point $\bar{u} \in H$ and, if $u \in H$ is another minimizer of (10.14), then $u - \bar{u} \in N(A)$. Moreover, all minimum points are characterized as the points $u \in H$ such that $Au - y \perp R(A)$, i.e., as the solutions of
$$A^*(Au - y) = 0. \tag{10.15}$$
If $N(A) = \{0\}$, as $N(A) = N(A^*A)$, (10.15) has a unique solution, $u = (A^*A)^{-1}A^*y$. If $N(A) \ne \{0\}$, the minimizer is not unique, so it is worth computing the minimizer of least norm, equivalently the only minimizer that belongs to $N(A)^\perp$, or the solution of
$$\begin{cases} A^*(Au - y) = 0, \\ u \in N(A)^\perp. \end{cases} \tag{10.16}$$
Recall that, $R(A)$ being closed, the map $A|_{N(A)^\perp} : N(A)^\perp \to R(A)$ is an isomorphism by the open mapping theorem. Consequently
$$A^\dagger y := \left(A|_{N(A)^\perp}\right)^{-1} Qy, \quad y \in Y,$$
where $Q$ is the orthogonal projection onto $R(A)$, defines a bounded linear operator $A^\dagger : Y \to H$ called the Moore-Penrose inverse of $A$. It is trivial to check that the solution $u$ of (10.16) is $u = A^\dagger y$. In the simplest case, $N(A^*) = \{0\}$, we have $R(A) = Y$ and (10.15) is equivalent to solving $Au = y$. Since we want to find a solution in $N(A)^\perp$, it is worth solving $AA^*z = y$ so that $u = A^\dagger y = A^*(AA^*)^{-1}y$. In general, however, both $AA^*$ and $A^*A$ are singular and, in order to compute $A^\dagger y$, we resort to an approximation argument. Consider the penalized functional
$$\mathcal{F}_\lambda(u) := \|Au - y\|_Y^2 + \lambda \|u\|_H^2, \quad u \in H, \tag{10.17}$$
where $\lambda > 0$, that we may also write as
$$\mathcal{F}_\lambda(u) = \|y\|_Y^2 - 2(Au|y)_Y + \|Au\|_Y^2 + \lambda \|u\|_H^2.$$
Observing that $L(u) := (Au|y)_Y = (A^*y|u)_H$ belongs to $\mathcal{L}(H, \mathbb{R})$ and that
$$B(v, u) := (Av|Au)_Y + \lambda (v|u)_H = \lambda (v|u)_H + (v|A^*Au)_H$$
is a symmetric, bounded, coercive, bilinear form on $H$, it follows from the abstract Dirichlet principle, Theorem 10.43, that $\mathcal{F}_\lambda$ has a unique minimizer $u_\lambda \in H$ given by the unique solution of
$$(\varphi|A^*Au_\lambda)_H + \lambda (\varphi|u_\lambda)_H = (\varphi|A^*y)_H \quad \forall \varphi \in H,$$
i.e.,
$$(\lambda\,\mathrm{Id} + A^*A)\,u_\lambda = A^*y. \tag{10.18}$$
We also get, multiplying both sides of (10.18) by $u_\lambda$,
$$\lambda \|u_\lambda\|_H^2 + \|Au_\lambda\|_Y^2 = (y|Au_\lambda)_Y,$$
from which we infer the estimate, independent of $\lambda$,
$$\|Au_\lambda\|_Y \le \|y\|_Y. \tag{10.19}$$
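10.50 Proposition. Let $A \in \mathcal{L}(H, Y)$ be a bounded operator with closed range and for $\lambda > 0$, let
$$u_\lambda := (\lambda\,\mathrm{Id} + A^*A)^{-1}A^*y \in H$$
be the unique minimizer of (10.17). Then $\{u_\lambda\}$ converges to $A^\dagger y$ in $H$ and
$$\left\|(\lambda\,\mathrm{Id} + A^*A)^{-1}A^* - A^\dagger\right\| \to 0 \quad \text{as } \lambda \to 0^+.$$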
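Proof. Since $R(A)$ is closed by hypothesis, there exists $C > 0$ such that
$$\|v\|_H \le C\|Av\|_Y \quad \forall v \in N(A)^\perp. \tag{10.20}$$
Since $\lambda u_\lambda = A^*(y - Au_\lambda) \in R(A^*) \subset N(A)^\perp$, we get in particular from (10.19) and (10.20)
$$\|u_\lambda\|_H \le C\|Au_\lambda\|_Y \le C\|y\|_Y,$$
i.e., $\{u_\lambda\}$ is uniformly bounded in $H$. On the other hand, from (10.18) we have for $\lambda, \mu > 0$
$$-(\lambda u_\lambda - \mu u_\mu) = A^*A(u_\lambda - u_\mu),$$
from which we infer
$$\|A(u_\lambda - u_\mu)\|_Y^2 = -(u_\lambda - u_\mu \,|\, \lambda u_\lambda - \mu u_\mu)_H \le \|u_\lambda - u_\mu\|_H \,\|\lambda u_\lambda - \mu u_\mu\|_H$$
$$\le \|u_\lambda - u_\mu\|_H \left(|\lambda|\,\|u_\lambda - u_\mu\|_H + |\lambda - \mu|\,\|u_\mu\|_H\right) \le \lambda \|u_\lambda - u_\mu\|_H^2 + |\lambda - \mu|\,\|u_\mu\|_H\,\|u_\lambda - u_\mu\|_H.$$
Taking into account (10.20) and the boundedness of the $u_\mu$'s we then infer
$$\|u_\lambda - u_\mu\|_H \le 2C^3 \|y\|_Y\,|\lambda - \mu| \tag{10.21}$$
provided $2C^2\lambda < 1$. For any $\{\lambda_k\}$, $\lambda_k \to 0^+$, we then infer from (10.21) that $\{u_{\lambda_k}\}$ is a Cauchy sequence in $N(A)^\perp$, hence converges to some $u \in N(A)^\perp$. Passing to the limit in (10.18), we also get $A^*(Au - y) = 0$, since $\{u_\lambda\}$ is bounded, i.e., $u = A^\dagger y$, as required. □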
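The convergence $u_\lambda \to A^\dagger y$ of Proposition 10.50 can be observed numerically on a small singular matrix. The sketch below assumes $H = Y = \mathbb{R}^2$; the matrix `A`, the vector `y`, the hand-computed value of $A^\dagger y$ and the helper names are assumptions of this example.

```python
def solve2(M, r):
    """Solve the 2x2 linear system M x = r by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(r[0] * M[1][1] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]


def tikhonov(A, y, lam):
    """Return u_lam = (lam*Id + A^T A)^{-1} A^T y for a real 2x2 matrix A."""
    At = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
    AtA = [[sum(At[i][k] * A[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    Aty = [sum(At[i][k] * y[k] for k in range(2)) for i in range(2)]
    M = [[AtA[i][j] + (lam if i == j else 0.0) for j in range(2)]
         for i in range(2)]
    return solve2(M, Aty)


# Assumed data: A is singular, with R(A) = N(A)-perp = Span{(1, 1)}.
A = [[1.0, 1.0], [1.0, 1.0]]
y = [1.0, 3.0]
# Computed by hand: the projection of y onto R(A) is (2, 2), and the only
# solution of A u = (2, 2) in N(A)-perp is u = (1, 1).
pinv_y = [1.0, 1.0]

errors = []
for lam in (1.0, 1e-2, 1e-4):
    u = tikhonov(A, y, lam)
    errors.append(max(abs(u[0] - pinv_y[0]), abs(u[1] - pinv_y[1])))
```

For this particular matrix one finds $u_\lambda = \frac{4}{4 + \lambda}(1, 1)$, so the entries of `errors` decay linearly in $\lambda$, which is consistent with a Lipschitz estimate of the type (10.21).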
10.3.3 Coercive nonsymmetric bilinear forms
Riesz's theorem extends to nonsymmetric bilinear forms.
a. The Lax-Milgram theorem
As for finite systems of linear equations in a finite number of unknowns, in order to solve $Tx = y$, it is often worth first solving $TT^*x = y$ or $T^*Tx = T^*y$, since $TT^*$ and $T^*T$ are self-adjoint. We proceed in this way to prove the following.
10.51 Theorem (Lax-Milgram). Let $B(u, v)$ be a continuous and coercive bilinear form on a Hilbert space $H$, i.e., there exist $\Lambda \ge \lambda > 0$ such that
$$|B(u, v)| \le \Lambda \|u\|\,\|v\|, \qquad B(u, u) \ge \lambda \|u\|^2 \quad \forall u, v \in H.$$
Then for all $L \in H^*$ there exists a unique $u_L \in H$ such that
$$B(v, u_L) = L(v) \quad \forall v \in H; \tag{10.22}$$
moreover $\|u_L\| \le \frac{1}{\lambda}\|L\|$, i.e., Green's operator associated to $B$, $\Gamma_B : H^* \to H$, $\Gamma_B(L) := u_L$, is continuous.
Proof. Let $T : H \to H$ be the continuous linear operator associated to $B$ by
$$B(v, u) = (v|Tu) \quad \forall u, v \in H.$$
The bilinear form
$$\widetilde{B}(u, v) := (TT^*u \,|\, v) = (T^*u \,|\, T^*v)$$
is trivially continuous and symmetric; it is also coercive, in fact,
$$\lambda^2 \|u\|^4 \le |B(u, u)|^2 = |(u \,|\, T^*u)|^2 \le \|u\|^2 \|T^*u\|^2 = \|u\|^2\,\widetilde{B}(u, u).$$
Riesz's theorem, Theorem 10.43, then yields a unique $\widetilde{u}_L \in H$ such that
$$\widetilde{B}(v, \widetilde{u}_L) = L(v) \quad \forall v \in H.$$
Thus $u_L := T^*\widetilde{u}_L$ is a solution for (10.22). Uniqueness follows from the coercivity of $B$. □
Equivalently we can state the following.
10.52 Theorem. Let $T : H \to H$ be a continuous and coercive linear operator,
$$\|Tu\| \le \|T\|\,\|u\|, \qquad (u|Tu) \ge \lambda \|u\|^2 \quad \forall u \in H,$$
where $\Lambda := \|T\| \ge \lambda > 0$. Then $T$ is injective and surjective; moreover its inverse $T^{-1}$ is a linear continuous and coercive operator with $\|T^{-1}\| \le \lambda^{-1}$.
10.53 ¶. Show the equivalence of Theorems 10.51 and 10.52.
10.54 ¶. Read Theorem 10.52 when $H = \mathbb{R}^n$; in particular, interpret coercivity in terms of eigenvalues of the symmetric part of the matrix associated to $T$.
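In the spirit of 10.54, the bound $\|T^{-1}\| \le \lambda^{-1}$ can be checked on a nonsymmetric $2 \times 2$ matrix. This is a sketch under assumed data: the matrix `T`, the test vectors and the helper names are chosen here for illustration, and the coercivity constant is taken to be the smallest eigenvalue of the symmetric part $(T + T^{\mathsf T})/2$.

```python
import math


def coercivity_constant(T):
    """Smallest eigenvalue of the symmetric part (T + T^T)/2 of a real 2x2 matrix."""
    a, d = T[0][0], T[1][1]
    b = 0.5 * (T[0][1] + T[1][0])
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return mean - radius


def solve2(T, y):
    """Solve the 2x2 linear system T u = y by Cramer's rule."""
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    return [(y[0] * T[1][1] - T[0][1] * y[1]) / det,
            (T[0][0] * y[1] - T[1][0] * y[0]) / det]


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))


# Assumed data: T is not symmetric; (u|Tu) = 2 u1^2 + 2 u1 u2 + 2 u2^2.
T = [[2.0, 3.0], [-1.0, 2.0]]
lam = coercivity_constant(T)  # symmetric part [[2, 1], [1, 2]] has eigenvalues 1 and 3

samples = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [3.0, 2.0]]
# Ratios ||T^{-1} y|| / ||y||, each of which should not exceed 1 / lam.
ratios = [norm(solve2(T, y)) / norm(y) for y in samples]
```

Every entry of `ratios` stays below $1/\lambda = 1$, as Theorem 10.52 predicts, although $T$ is not self-adjoint.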
b. Faedo-Galerkin method
If $H$ is a separable Hilbert space, the solution $u_L$ of the linear equation (10.22) can be approximated by a procedure similar to the one of Ritz. Let $H$ be a separable Hilbert space and let $\{e_n\}$ be a complete orthonormal system in $H$. For every integer $N$, we define $V_N := \operatorname{Span}\{e_1, \dots, e_N\}$ and let $P_N : H \to H$ be the orthogonal projection on $V_N$ and $u_N$ be the solution of the equation
$$B(\varphi, u_N) = L(\varphi) \quad \forall \varphi \in V_N, \tag{10.23}$$
i.e., in coordinates, $u_N := \sum_{j=1}^{N} x^j e_j$ where
$$\sum_{j=1}^{N} B(e_i, e_j)\,x^j = L(e_i) \quad \forall i = 1, \dots, N.$$
Notice that the system has a unique solution since the matrix $\mathbf{B}$, $\mathbf{B}_{ij} = B(e_i, e_j)$, has $N$ linearly independent columns as $B$ is coercive.
10.55 Theorem (Faedo-Galerkin). The sequence $\{u_N\}$ converges to $u_L$ in $H$.
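Proof. We have
$$\lambda \|u_N - u_L\|^2 \le B(u_N - u_L, u_N - u_L) = B(u_N, u_N) + B(u_L, u_L) - B(u_N, u_L) - B(u_L, u_N) = B(u_L, u_L - u_N),$$
since $B(u_N, u_L) = L(u_N) = B(u_N, u_N)$. It suffices to show that for every $\varphi \in H$
$$B(\varphi, u_N - u_L) \to 0 \quad \text{as } N \to \infty. \tag{10.24}$$
We first observe that the sequence $\{u_N\}$ is bounded in $H$ by $\|L\|/\lambda$ since $\lambda \|u_N\|^2 \le B(u_N, u_N) = L(u_N) \le \|L\|\,\|u_N\|$. On the other hand, we infer from (10.22) and (10.23) that
$$B(P_N\varphi, u_N - u_L) = 0 \quad \forall \varphi \in H, \tag{10.25}$$
hence
$$B(\varphi, u_N - u_L) = B(\varphi - P_N\varphi, u_N - u_L) + B(P_N\varphi, u_N - u_L) = B(\varphi - P_N\varphi, u_N - u_L),$$
and
$$|B(\varphi, u_N - u_L)| \le \Lambda \|u_N - u_L\|\,\|\varphi - P_N\varphi\| \le 2\frac{\Lambda}{\lambda}\|L\|\,\|\varphi - P_N\varphi\|.$$
Then (10.24) follows since $\|\varphi - P_N\varphi\| \to 0$ as $N \to \infty$. □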
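The Faedo-Galerkin scheme can be sketched in coordinates. The example below assumes $H = \mathbb{R}^3$ with the standard basis as orthonormal system, $B(v, u) = (v|Mu)$ for a nonsymmetric coercive matrix `M`, and $L(v) = (b|v)$, so that the Galerkin system on $V_N$ is given by the leading $N \times N$ block of `M`. The data and the function names are assumptions made for this illustration.

```python
def solve(M, r):
    """Solve M x = r by Gaussian elimination with partial pivoting."""
    n = len(r)
    a = [list(M[i]) + [r[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(col + 1, n):
            f = a[i][col] / a[col][col]
            for j in range(col, n + 1):
                a[i][j] -= f * a[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - s) / a[i][i]
    return x


def galerkin(M, b, N):
    """Galerkin solution u_N in V_N = Span{e_1, ..., e_N} (standard basis).

    With B(e_i, e_j) = M[i][j] and L(e_i) = b[i], the system (10.23) is the
    leading N x N block of M applied to the coordinates of u_N.
    """
    block = [row[:N] for row in M[:N]]
    coords = solve(block, b[:N])
    return coords + [0.0] * (len(b) - N)


# Assumed data: the symmetric part of M is 2*Id, hence B is coercive with lambda = 2.
M = [[2.0, 1.0, 0.0], [-1.0, 2.0, 1.0], [0.0, -1.0, 2.0]]
b = [1.0, 2.0, 3.0]

approximations = [galerkin(M, b, N) for N in (1, 2, 3)]
```

In this finite-dimensional setting the last element of `approximations` is already the exact solution of $Mu = b$; in a genuinely infinite-dimensional space one would only obtain convergence as $N \to \infty$, as stated in Theorem 10.55.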
10.4 Linear Compact Operators
In Chapter 4 we presented a rather complete study of linear operators in finite-dimensional spaces. The study of linear operators in infinite-dimensional spaces is more complicated. As we have seen, several important linear operators are not continuous, and moreover, linear continuous operators may have a nonclosed range: we may have $\{x_k\} \subset H$, $\{y_k\} \subset Y$ such that $Tx_k = y_k$, $y_k \to y \in Y$, but the equation $Tx = y$ has no solution. Here we shall confine ourselves to discussing compact perturbations of the identity for which we prove that the range or image is closed. We notice however, that for some applications this is not sufficient, and a spectral theory for both bounded and unbounded self-adjoint operators has been developed. But we shall not deal with these topics.
10.4.1 Fredholm-Riesz-Schauder theory
a. Linear compact operators
Let $H$ be a real or complex Hilbert space. Recall, cf. Chapter 9,
10.56 Definition. A linear operator $K : H \to H$ is said to be compact if and only if $K$ is continuous and maps bounded sets into sets with compact closure. The set of compact operators in $H$ is denoted by $\mathcal{K}(H, H)$.
Therefore $K : H \to H$ is compact if and only if $K$ is continuous and every bounded sequence $\{u_n\} \subset H$ has a subsequence $\{u_{k_n}\}$ such that $K(u_{k_n})$ converges in $H$. Also $\mathcal{K}(H, H) \subset \mathcal{L}(H, H)$. Moreover, every linear continuous operator with finite-dimensional range is a compact operator, in particular every linear operator on $H$ is compact if $H$ has finite dimension. On the other hand, since the identity map on $H$ is not compact if $\dim H = +\infty$, we conclude that $\mathcal{K}(H, H)$ is a proper subset of $\mathcal{L}(H, H)$ if $\dim H = +\infty$. Exercise 10.89 shows that compact operators need not have finite-dimensional range. However, cf. Theorem 9.140,
10.57 Theorem. $\mathcal{K}(H, H)$ is the closure of the space of the linear continuous operators of finite-dimensional range.
Proof. Suppose that the sequence of linear continuous operators with finite-dimensional range $\{A_n\}$ converges to $A \in \mathcal{L}(H, H)$, $\|A_n - A\| \to 0$. Then by (i) of Theorem 9.140 $A$ is compact. Conversely, suppose that $A$ is compact, and let $B$ be the unit ball of $H$. Then $A(B)$ has compact closure, hence for all $n$ there is a $1/n$-net covering $A(B)$, i.e., there are points $y_1, y_2, \dots, y_N \in A(B)$, $N = N(n)$, such that $A(B) \subset \bigcup_{j=1}^{N} B(y_j, 1/n)$. Define $V_n := \operatorname{Span}\{y_1, y_2, \dots, y_N\}$, let $P_n : H \to V_n$ be the orthogonal projection onto $V_n$ and $A_n := P_n \circ A$. Clearly each $A_n$ has finite-dimensional range, thus it suffices to prove that $\|A_n - A\| \to 0$. For all $x \in B$ we find $i \in \{1, 2, \dots, N\}$ such that $\|Ax - y_i\| < 1/n$, hence, since $P_n y_i = y_i$ and $\|P_n z\| \le \|z\|$,
Figure 10.6. Marcel Riesz (1886-1969) and the frontispiece of a volume by Frigyes Riesz (1880-1956).
$$\|P_nAx - Ax\| \le \|P_nAx - P_ny_i\| + \|P_ny_i - Ax\| \le 2\|Ax - y_i\| < 2/n$$
for all $x \in B$. □
10.58 Proposition. Let $K \in \mathcal{K}(H, H)$. Then the adjoint $K^*$ of $K$ is compact and $AK$ and $KA$ are compact provided $A \in \mathcal{L}(H, H)$.
Proof. The second part of the claim is trivial. We shall prove the first part. Let $\{u_n\} \subset H$ be a bounded sequence, $\|u_n\| \le M$. Then $\{K^*u_n\}$ is also bounded, hence $\{KK^*u_n\}$ has a subsequence, still denoted by $\{KK^*u_n\}$, that converges. This implies that $\{K^*u_n\}$ is a Cauchy sequence since
$$\|K^*u_i - K^*u_j\|^2 = (K^*(u_i - u_j) \,|\, K^*(u_i - u_j)) = (u_i - u_j \,|\, KK^*(u_i - u_j)) \le 2M\,\|KK^*(u_i - u_j)\|.$$
□
b. The alternative theorem
Let $A \in \mathcal{L}(H, H)$ be a bounded operator with bounded inverse. A linear operator $T \in \mathcal{L}(H, H)$ of the form $T = A + K$, where $K \in \mathcal{K}(H, H)$, is called a compact perturbation of $A$. Typical examples are the compact perturbations of the identity, $T = \mathrm{Id} + K$, $K \in \mathcal{K}(H, H)$, to which we can always reduce, since $T = A + K = A(\mathrm{Id} + A^{-1}K)$. The following theorem, that we already know in finite dimension, holds for compact perturbations of the identity. It is due to Frigyes Riesz (1880-1956) and extends previous results of Ivar Fredholm (1866-1927).
10.59 Theorem (Alternative). Let $H$ be a Hilbert space and let $T = A + K : H \to H$ be a compact perturbation of an operator $A \in \mathcal{L}(H, H)$ with bounded inverse. Then
(i) $R(T)$ is closed,
(ii) $N(T)$ and $N(T^*)$ are finite-dimensional linear subspaces; moreover, $\dim N(T) - \dim N(T^*) = 0$.
The following lemma will be needed in the proof of the theorem.
10.60 Lemma. Let $T = \mathrm{Id} + K$ be a compact perturbation of the identity. If $\{x_n\} \subset H$ is a bounded sequence such that $Tx_n \to y$, then there exist a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ and $x \in H$ such that
$$x_{k_n} \to x \quad \text{and} \quad Tx = y.$$
Proof. Since $\{x_n\}$ is bounded and $K$ is compact, we find a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ and $z \in H$ such that $Kx_{k_n} \to z$. It follows that $x_{k_n} = Tx_{k_n} - Kx_{k_n} \to y - z =: x$, and, since $K$ is continuous, $Tx_{k_n} \to Tx = x + Kx = x + z = y$. □
Proof of Theorem 10.59. Since $T = A + K = A(\mathrm{Id} + A^{-1}K)$, $A$ has a bounded inverse and $A^{-1}K$ is compact, we shall assume without loss of generality that $A = \mathrm{Id}$.
Step 1. First we show that there is a constant $C > 0$ such that
$$\|x\| \le C\|Tx\| \quad \forall x \in N(T)^\perp. \tag{10.26}$$
Suppose this is not true. Then there exists a sequence $\{x_n\} \subset N(T)^\perp$ such that $\|x_n\| = 1$ and $\|T(x_n)\| \to 0$. $T(x_n) \to 0$ and Lemma 10.60 yield a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ and $x \in H$ such that $x_{k_n} \to x$ and $Tx = 0$. The first condition yields $x \in N(T)^\perp$ and $\|x\| = 1$, while the second yields $x \in N(T)$. A contradiction. It follows from (10.26) that $T$ is an isomorphism between the Hilbert space $N(T)^\perp$ and $R(T)$, hence $R(T)$ is complete, thus closed. This proves (i).
Step 2. By Lemma 10.60 every bounded sequence in $N(T)$ has a convergent subsequence. Riesz's theorem, Theorem 9.21, then yields that $\dim N(T) < +\infty$. Similarly, one shows that $\dim N(T^*) < \infty$. The rest of the claim is trivial if $K$ is self-adjoint. Otherwise, we may proceed as follows, also compare 10.62 below. We use the fact that every compact operator is the limit of operators with finite-dimensional range, Theorem 10.57. First we assume $T = \mathrm{Id} + K$, $K$ of finite-dimensional range. In this case $K : N(K)^\perp \to R(K)$ is an isomorphism, in particular $\dim R(K)$ [...]
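$$(\mathrm{Id} - Q)\sum_{j=0}^{\infty} Q^j = \mathrm{Id}.$$
In particular, $\mathrm{Id} - Q$ is invertible with bounded inverse $\sum_{j=0}^{\infty} Q^j$. Therefore we can write
$$T = \mathrm{Id} + K = \mathrm{Id} - Q + K_1 = (\mathrm{Id} - Q)\left(\mathrm{Id} + (\mathrm{Id} - Q)^{-1}K_1\right) =: A(\mathrm{Id} + B)$$
where $B$ has finite-dimensional range; the claim (ii) then follows from Step 2. □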
c. Some facts related to the alternative theorem
We collect here a few different proofs of some of the claims of the alternative theorem, since they are of interest by themselves.
10.61 $R(\mathrm{Id} + K)$ is closed. As we know, this is equivalent to $R(T) = N(T^*)^\perp$, i.e., to showing that for every $f \in N(T^*)^\perp$ the equation $Tu := u + K(u) = f$ is solvable. To show this, we can use Riesz's theorem. Given $f \in N(T^*)^\perp$, we try to solve $TT^*v = f$, i.e.,
$$b(\varphi, v) = (\varphi|f) \quad \forall \varphi \in H, \tag{10.27}$$
where
$$b(\varphi, v) := (TT^*v|\varphi) = (T^*v|T^*\varphi).$$
If $v \in H$ solves (10.27), then $u := T^*v$ solves $Tu = f$, that is, $(Tu|\varphi) = (f|\varphi)$ $\forall \varphi \in H$.
We notice that $N(TT^*) = N(T^*)$, therefore the bilinear bounded form $b(\varphi, v)$ is symmetric if $H$ is real (sesquilinear if $H$ is complex) and well defined on the Hilbert space $N(TT^*)^\perp$. We claim that $b(\varphi, v)$ is coercive on $N(TT^*)^\perp$,
$$b(\varphi, \varphi) \ge c\,\|\varphi\|^2 \quad \forall \varphi \in N(TT^*)^\perp. \tag{10.28}$$
Otherwise, there exists a sequence $\{e_n\} \subset N(T^*)^\perp$ with $\|e_n\| = 1$ and
$$b(e_n, e_n) = \|e_n + K^*e_n\|^2 \to 0.$$
By Lemma 10.60, applied to $T^* = \mathrm{Id} + K^*$, there exist $e \in H$ and a subsequence $\{e_{k_n}\}$ of $\{e_n\}$ such that
$$e_{k_n} \to e, \qquad T^*e = e + K^*e = 0;$$
in particular $\|e\| = 1$, $e \in N(T^*)^\perp$ and $e \in N(T^*)$, a contradiction. We then conclude that $b(\varphi, v)$ is an inner product on $N(T^*)^\perp$ (a Hermitian product if $H$ is complex), equivalent to the original one. Applying Riesz's theorem, we then find $v \in N(T^*)^\perp$ such that
$$b(\varphi, v) = (\varphi|f) \quad \forall \varphi \in N(T^*)^\perp, \qquad \|v\| \le \frac{1}{c}\|f\|. \tag{10.29}$$
It remains to show that $v$ solves (10.27). If $P$ is the orthogonal projection of $H$ onto $N(T^*)^\perp$, then (10.29) is equivalent to
$$b(P\varphi, v) = (P\varphi|f) \quad \forall \varphi \in H.$$
On the other hand,
$$(\varphi - P\varphi|f) = 0, \qquad b(\varphi - P\varphi, v) = (\varphi - P\varphi|TT^*v) = 0,$$
since $f$ and $v$ are in $N(T^*)^\perp$, hence $b(\varphi, v) = b(P\varphi, v) = (P\varphi|f) = (\varphi|f)$.
10.62 Another proof of $\dim N(T) = \dim N(T^*)$.
Step 1. Let us prove the equality if $T$ or $T^*$ is injective. Let $H_1 := R(T)$ and, by induction, $H_{j+1} := T(H_j)$. $\{H_j\}$ is a nonincreasing sequence of closed subspaces of $H$. We claim that there exists $\bar{n}$ such that $H_n = H_{\bar{n}}$ $\forall n \ge \bar{n}$. If not, we can find $\{e_n\} \subset R(T)$ with $\|e_n\| = 1$ and $e_n \in H_n \cap H_{n+1}^\perp$. Since for $n > m$, $T(e_n), T(e_m), e_n \in H_{m+1}$, $e_m \in H_{m+1}^\perp$, and
$$Ke_n - Ke_m = (e_n + Ke_n) - (e_m + Ke_m) - e_n + e_m =: z + e_m,$$
we may infer
$$\|Ke_n - Ke_m\|^2 = \|z\|^2 + \|e_m\|^2 \ge 1:$$
a contradiction, since $\{K(e_n)\}$ has a convergent subsequence.
If $N(T) = \{0\}$ and $H_1 = R(T) \ne H$, then necessarily $H_{j+1} \ne H_j$ $\forall j$ since $T$ is injective, and this is not possible, as we have seen. Hence $H = R(T)$ and $N(T^*) = R(T)^\perp = \{0\}$. If $N(T^*) = \{0\}$, then repeating the above consideration for $\mathrm{Id} + K^*$ we get $N(T) = \{0\}$.
Step 2. Let us prove that $\dim N(T) \ge \dim R(T)^\perp$. Assume that $\dim N(T) < \dim R(T)^\perp$. Then there exists a linear continuous operator $L$ that maps the finite-dimensional space $N(T)$ into the finite-dimensional space $R(T)^\perp$ with $L$ injective but not surjective. Let us extend $L$ as a linear operator from $H$ to $R(T)^\perp$ by setting $Lx = 0$ $\forall x \in N(T)^\perp$. Then $L$ has a finite-dimensional range, thus it is compact. Now we claim that $N(\mathrm{Id} + L + K) = \{0\}$. In fact $u + Ku + Lu = 0$ implies $Tu = u + Ku = -Lu$ and, since $Tu \in R(T)$ and $Lu \in R(T)^\perp$, we infer $Tu = Lu = 0$, i.e., $u \in N(T)$ and $u \in N(T)^\perp$, since $L$ is injective when restricted to $N(T)$; in conclusion $u = 0$. Step 1 then says that $\mathrm{Id} + K + L$ is surjective. This is a contradiction, since the equation $u + Ku + Lu = v$ has no solution when $v \in R(T)^\perp$, $v \notin R(L)$.
Step 3. Replacing $K$ by $K^*$ in the above proves that
$$\dim R(T)^\perp = \dim N(T^*) \ge \dim R(T^*)^\perp = \dim N(T),$$
which completes the proof.
10.63 Yet another proof of $\dim N(T) = \dim N(T^*)$. Let $H$ be a separable Hilbert space, $T = \mathrm{Id} + K$ be a compact perturbation of the identity, and let $\{e_j\}$ be a complete orthonormal system for $H$, ordered in such a way that $N(T) + N(T^*)$ is generated by the first elements $e_1, e_2, \dots, e_k$.
Proposition. Let $V_n = \operatorname{Span}\{e_1, e_2, \dots, e_n\}$ and let $P_n$ be the orthogonal projection over $V_n$. Then there exist a constant $\gamma > 0$ and an integer $n_0$ such that $\forall n \ge n_0$
$$\|P_nT(\varphi)\| \ge \gamma\,\|\varphi\| \quad \forall \varphi \in V_n \cap N(T)^\perp.$$
Proof. Suppose the conclusion is not true; then for a sequence $n_i \to \infty$ and vectors $\varphi_i \in V_{n_i} \cap N(T)^\perp$ we have
$$\|P_{n_i}T\varphi_i\| \to 0, \qquad \|\varphi_i\| = 1. \tag{10.30}$$
Since $K$ is compact, for a subsequence, still denoted by $\{\varphi_i\}$, and some $\varphi \in H$ we then have $K\varphi_i \to -\varphi$. Since $P_nx \to x$ as $n \to \infty$, we infer
$$\|P_{n_i}K\varphi_i + \varphi\| \le \|P_{n_i}(K\varphi_i + \varphi)\| + \|P_{n_i}\varphi - \varphi\| \to 0,$$
hence $\varphi_i \to \varphi$ in $H$, since $\varphi_i = P_{n_i}T\varphi_i - P_{n_i}K(\varphi_i)$, and finally $\varphi + K\varphi = 0$. In particular $\|\varphi\| = 1$ and $\varphi \in N(T) \cap N(T)^\perp$, a contradiction. □
From the previous proposition, if $\{\varphi_1, \varphi_2, \dots, \varphi_s\}$ is a family of linearly independent vectors in $V_n \cap N(T)^\perp$, then $P_nT(\varphi_1), \dots, P_nT(\varphi_s)$ are also linearly independent, at least for $n$ large enough; on the other hand, since $R(T) = N(T^*)^\perp$, the vectors $P_nT(\varphi_1), \dots, P_nT(\varphi_s)$ belong to $P_nR(T) = V_n \cap N(T^*)^\perp$. Hence we have
$$\dim V_n \cap N(T^*)^\perp \ge \dim V_n \cap N(T)^\perp$$
for $n$ large enough. Similarly one proves $\dim V_n \cap N(T)^\perp \ge \dim V_n \cap N(T^*)^\perp$, hence
$$\dim V_n \cap N(T)^\perp = \dim V_n \cap N(T^*)^\perp$$
for $n$ large enough. The claim then follows by considering the orthogonal complements.
d. The alternative theorem in Banach spaces
The alternative theorem generalizes to the so-called Fredholm operators between Banach spaces $X$ and $Y$, of which compact perturbations of the identity are special cases. Let $X$ be a Banach space on $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ and $X^* := \mathcal{L}(X, \mathbb{K})$ its dual space, which is a Banach space with the dual norm
$$\|\varphi\| := \sup_{\|x\| = 1} |\varphi(x)| \quad \forall \varphi \in X^*.$$
If $\varphi \in X^*$ and $x \in X$, we often write $\langle \varphi, x \rangle$ for $\varphi(x)$. Clearly, the bilinear map $\langle\ ,\ \rangle : X^* \times X \to \mathbb{K}$, defined by $\langle \varphi, x \rangle = \varphi(x)$, is continuous,
$$|\langle \varphi, x \rangle| \le \|\varphi\|\,\|x\| \quad \forall \varphi \in X^*, \ \forall x \in X.$$
In general, $X^*$ is not isomorphic to $X$, contrary to the case of Hilbert spaces. If $X$ and $Y$ are Banach spaces and if $T : X \to Y$ is a linear bounded operator, the dual or adjoint operator $T^* : Y^* \to X^*$ is defined by
$$\langle T^*(\varphi), x \rangle := \langle \varphi, Tx \rangle. \tag{10.31}$$
$T^*$ is continuous and $\|T^*\| = \|T\|$.
10.64 ¶. Let $T \in \mathcal{L}(H, H)$, where $H$ is a Hilbert space. We then have two notions of adjoint operators: the operator $T^* : H \to H$ in (10.9) and the operator $\widetilde{T} : H^* \to H^*$ defined in (10.31). Show that, if $G : H^* \to H$ is Riesz's operator, then $\widetilde{T} = G^{-1} \circ T^* \circ G$.
For a subset $V \subset X$ of a Banach space $X$, we define
$$V^\perp := \{\varphi \in X^* \mid \langle \varphi, x \rangle = 0 \ \forall x \in V\},$$
the annihilator of $V$. Notice that $V^\perp$ is closed in $X^*$. We have
10.65 Lemma. Let $T : X \to X$ be a bounded linear operator. Then
$$N(T^*) = R(T)^\perp, \qquad R(T^*) = N(T)^\perp.$$
The class of linear compact operators on a Banach space X, denoted by $\mathcal{K}(X,X)$, is a closed subset of $\mathcal{L}(X,X)$. But in general these operators are not limits of linear operators with finite-dimensional range, contrary to the case X = H, where H is a Hilbert space, as shown by a famous example due to Lindemann and Strauss. Recall that we can always approximate $K \in \mathcal{K}(X,X)$ by nonlinear operators with range contained in a finite-dimensional subspace, see Theorem 9.140. We can now state, omitting the proof, the following result.

10.66 Theorem (Alternative). Let X be a Banach space and let $T = A + K : X \to X$ be a compact perturbation of an isomorphism $A : X \to X$. Then
(i) R(T) is closed,
(ii) N(T) and $N(T^*)$ have finite dimension, and $\dim N(T) = \dim N(T^*)$.
Consequently, we have the following.

10.67 Corollary (Alternative). Let $A, K \in \mathcal{L}(X,X)$ where A is a linear isomorphism of X and K is compact. Then the equation $Ax + Kx = y$ is solvable if and only if $y \in N(T^*)^\perp$, where $T := A + K$.
10. Hilbert Spaces, Dirichlet's Principle and Linear Compact Operators
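e. The spectrum of compact operators

10.68 Definition. Let H be a Hilbert space on $\mathbb{K}$, $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, and let $L \in \mathcal{L}(H,H)$ be a bounded linear operator on H. The resolvent $\rho(L)$ of the operator L is defined as the set
$$\rho(L) := \{\lambda \in \mathbb{K} \mid (\lambda\,\mathrm{Id} - L)^{-1} \text{ is a bounded operator}\}$$
and its complement $\sigma(L) := \mathbb{K} \setminus \rho(L)$ is called the spectrum of L. By the open mapping theorem
$$\sigma(L) = \{\lambda \in \mathbb{K} \mid \lambda\,\mathrm{Id} - L \text{ is not injective or not surjective}\}. \qquad (10.32)$$

10.69 Definition. Let $L \in \mathcal{L}(H,H)$. Then the pointwise spectrum of L is defined as
$$\sigma_p(L) := \{\lambda \in \mathbb{K} \mid \lambda\,\mathrm{Id} - L \text{ is not injective}\}.$$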
The points in $\sigma_p(L)$ are called eigenvalues of L, and the nonzero elements of $N(\lambda\,\mathrm{Id} - L)$ are called the eigenvectors of L corresponding to $\lambda$. Of course, $\sigma_p(L) \subset \sigma(L)$ and, if $\dim H < +\infty$, $\sigma_p(L) = \sigma(L)$ as, in this case, a linear operator is injective if and only if it is surjective. If $\dim H = +\infty$, there exist, as we know, linear bounded operators which are injective but not surjective, hence, in general, $\sigma_p(L) \ne \sigma(L)$.

10.70 Remark. In the sequel we shall deal with compact operators L. For these operators the equality $\sigma_p(L) \setminus \{0\} = \sigma(L) \setminus \{0\}$ also follows from the alternative theorem of the previous section. As in the finite-dimensional case, see Proposition 4.5, eigenvectors corresponding to distinct eigenvalues are linearly independent. Moreover
$$\sigma(L) \subset \{\lambda \in \mathbb{K} \mid |\lambda| \le \|L\|\}, \qquad (10.33)$$
because, if $|\lambda| > \|L\|$, then $\|\tfrac{1}{\lambda} L\| < 1$, therefore, see Proposition 9.106, $\mathrm{Id} - \tfrac{1}{\lambda} L$, equivalently $\lambda\,\mathrm{Id} - L$, is invertible and
$$(\lambda\,\mathrm{Id} - L)^{-1} = \sum_{j=0}^{\infty} \lambda^{-j-1} L^j,$$
hence $\lambda \in \rho(L)$. The following theorem gives a complete description of the spectrum of a linear compact operator.
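The series argument behind (10.33) can be checked numerically in finite dimension. The following is only a minimal sketch in plain Python: the 2x2 matrix `L`, the value `lam` and all helper names are illustrative assumptions, chosen so that |lam| exceeds the norm of `L`.

```python
# Illustrative 2x2 example; L and lam are assumptions with |lam| > ||L||.

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_resolvent(L, lam, terms=60):
    """Partial sum of sum_{j>=0} lam^(-j-1) L^j."""
    n = len(L)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)  # holds L^j at step j
    for j in range(terms):
        total = mat_add(total, mat_scale(lam ** (-j - 1), power))
        power = mat_mul(power, L)
    return total

L = [[0.5, 0.2], [0.2, -0.3]]
lam = 2.0
R = neumann_resolvent(L, lam)

# (lam*Id - L) applied to the partial sum should be close to the identity.
shifted = mat_add(mat_scale(lam, identity(2)), mat_scale(-1.0, L))
check = mat_mul(shifted, R)
max_error = max(abs(check[i][j] - identity(2)[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

The partial sums converge geometrically with ratio ||L||/|lam|, so a few dozen terms already reproduce the inverse up to rounding in this example.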
10.4 Linear Compact Operators
10.71 Theorem. Let H be a Hilbert space with $\dim H = +\infty$ and let $K \in \mathcal{K}(H,H)$ be a compact operator. Then
(i) $0 \in \sigma(K)$,
(ii) K has either a finite number of eigenvalues or an infinite sequence of eigenvalues that converges to 0,
(iii) the eigenspaces corresponding to nonzero eigenvalues have finite dimension,
(iv) if $\lambda \ne 0$ and $\lambda$ is not an eigenvalue for K, then $\lambda\,\mathrm{Id} - K$ is an isomorphism of H and $(\lambda\,\mathrm{Id} - K)^{-1}$ is continuous,
(v) $\sigma(K) \setminus \{0\} = \sigma_p(K) \setminus \{0\}$.

Proof. (i) In fact $R(K) \ne H$, since K is compact.
(ii) From (10.33) the set $\Lambda$ of eigenvalues is bounded, thus either $\Lambda$ is finite or $\Lambda$ has an accumulation point. Let us prove that in the latter case $\Lambda$ has only 0 as an accumulation point; we then conclude that $\Lambda$ is denumerable, actually a sequence converging to zero. Suppose $\{\lambda_n\}$ is a sequence of distinct nonzero eigenvalues with corresponding eigenvectors $\{u_n\}$ such that $\lambda_n \to \lambda \ne 0$. Set $\mu_n := 1/\lambda_n$ and $V_n := \mathrm{Span}\{u_1, u_2, \dots, u_n\}$, and notice that, if $w := \sum_{j=1}^n c_j u_j \in V_n$, then
$$\lambda_n w - Kw = \sum_{j=1}^{n-1} c_j(\lambda_n - \lambda_j) u_j \in V_{n-1}.$$
We now construct a new sequence $\{v_n\}$ with $\|v_n\| = 1$ by choosing $v_1 \in V_1$ and, for $n \ge 2$, $v_n \in V_n \cap V_{n-1}^\perp$. According to the previous remark, $v_n - \mu_n K v_n \in V_{n-1}$. For $n > m$ we then find $v_n - \mu_n K v_n,\ \mu_m K v_m \in V_{n-1}$, $v_n \in V_{n-1}^\perp$ and
$$K(\mu_n v_n - \mu_m v_m) = v_n - (v_n - \mu_n K v_n + \mu_m K v_m) =: v_n - z,$$
with $v_n \in V_{n-1}^\perp$ and $z \in V_{n-1}$. Thus we conclude
$$\|K(\mu_n v_n) - K(\mu_m v_m)\|^2 = \|v_n\|^2 + \|z\|^2 \ge 1,$$
a contradiction, since $\{\mu_n v_n\}$ is bounded and K is compact. In conclusion $\lambda = 0$.
(iii), (iv) are part of the claims of the alternative theorem, and (v) follows from (iv). □
10.72 Remark. Actually, Theorem 10.71 holds under the more general assumption that H is a Banach space. In this case it is known as the Riesz-Schauder theorem.
10.4.2 Compact self-adjoint operators

Let us discuss more specifically the spectral properties of linear self-adjoint operators.

a. Self-adjoint operators

10.73 Proposition. Let H be a real Hilbert space and $L : H \to H$ be a bounded self-adjoint linear operator. Set
$$m := \inf_{\|u\|=1} (Lu|u), \qquad M := \sup_{\|u\|=1} (Lu|u).$$
Then
(i) eigenvectors corresponding to distinct eigenvalues are orthogonal,
(ii) $m, M \in \sigma(L)$,
(iii) $\|L\| = \sup_{\|u\|=1} |(u|Lu)| = \max(|m|, |M|)$.
Also, if L is a bounded self-adjoint operator in a complex Hilbert space, then $(u|Lu) \in \mathbb{R}$ $\forall u \in H$, consequently all eigenvalues are real; moreover (i), (ii) and (iii) hold.

Proof. (i) In fact $Lu = \lambda u$ and $Lv = \mu v$, $\lambda, \mu \in \mathbb{R}$, $\lambda \ne \mu$, yield
$$(\lambda - \mu)(u|v) = (Lu|v) - (u|Lv) = 0.$$
We now prove that there is a constant C, independent of u, such that for all $u \in H$
$$\|Mu - Lu\| \le C\,|(Mu - Lu|u)|^{1/2}, \qquad \|mu - Lu\| \le C\,|(mu - Lu|u)|^{1/2}. \qquad (10.34)$$
The bilinear form $b(u,v) := (Mu - Lu|v)$ is symmetric and nonnegative, $b(u,u) \ge 0$; the Cauchy-Schwarz inequality then yields
$$|(Mu - Lu|v)| \le |(Mu - Lu|u)|^{1/2}\,|(Mv - Lv|v)|^{1/2} \le C\,|(Mu - Lu|u)|^{1/2}\,\|v\|.$$
By choosing $v = Mu - Lu$, the first of (10.34) follows. A similar argument yields the second of (10.34).
(ii) Let us prove that $M \in \sigma(L)$; similarly one proves that $m \in \sigma(L)$. Let $\{u_k\}$ be a sequence such that $\|u_k\| = 1$ and $(Lu_k|u_k) \to M$. Because of (10.34), $Mu_k - Lu_k \to 0$ in H. If M is in the resolvent, then $M\,\mathrm{Id} - L$ is one-to-one and onto with continuous inverse because of the open mapping theorem. Thus
$$u_k = (M\,\mathrm{Id} - L)^{-1}(Mu_k - Lu_k) \to 0,$$
that contradicts $\|u_k\| = 1$.
(iii) Set $\alpha := \sup_{\|u\|=1} |(Lu|u)|$; of course $\max(|M|, |m|) = \alpha$ and $\alpha \le \|L\|$. Let us show that $\|L\| \le \alpha$. Since L is self-adjoint,
$$4(Lu|v) = (L(u+v)|u+v) - (L(u-v)|u-v),$$
hence, according to the parallelogram law,
$$4|(Lu|v)| \le \alpha\,(\|u+v\|^2 + \|u-v\|^2) = 2\alpha\,(\|u\|^2 + \|v\|^2).$$
Replacing u and v with $\epsilon u$, $v/\epsilon$ respectively, $\epsilon > 0$, we find
$$4|(Lu|v)| \le 2\alpha \min_{\epsilon > 0}\Big(\epsilon^2\|u\|^2 + \frac{\|v\|^2}{\epsilon^2}\Big) = 4\alpha\,\|u\|\,\|v\|.$$
Hence, if $v := Lu$, we have $\|Lu\|^2 \le \alpha\,\|u\|\,\|Lu\|$, i.e., $\|L\| \le \alpha$.
In the complex case we have
$$(Lu|u) = (u|L^*u) = (u|Lu) = \overline{(Lu|u)},$$
hence $(Lu|u) \in \mathbb{R}$. We leave to the reader the completion of the proof. □
We notice that the proof of (ii) of Proposition 10.73 uses the continuity of $(M\,\mathrm{Id} - L)^{-1}$ when $M \in \rho(L)$. If L is compact, this is a consequence of the alternative theorem and the open mapping theorem is not actually needed.
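Claim (iii) of Proposition 10.73 can be illustrated in the plane. The sketch below is a hedged numerical check, not part of the proof: the symmetric matrix `L` and the sampling of the unit circle with `N` points are assumptions made for the example.

```python
import math

def apply(L, u):
    """Image of the vector u under the 2x2 matrix L."""
    return (L[0][0] * u[0] + L[0][1] * u[1],
            L[1][0] * u[0] + L[1][1] * u[1])

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

# Hypothetical self-adjoint operator on R^2.
L = [[2.0, 1.0], [1.0, -3.0]]

# Sample the unit sphere ||u|| = 1, here the unit circle.
N = 20000
units = [(math.cos(2 * math.pi * k / N), math.sin(2 * math.pi * k / N))
         for k in range(N)]

quad = [inner(apply(L, u), u) for u in units]   # values of (Lu|u)
m, M = min(quad), max(quad)                     # sampled inf and sup
op_norm = max(math.sqrt(inner(apply(L, u), apply(L, u))) for u in units)

print(m, M, op_norm, max(abs(m), abs(M)))
```

Up to the sampling error, `op_norm` agrees with `max(abs(m), abs(M))`, and `m`, `M` agree with the smallest and largest eigenvalue of `L`.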
10.74 Corollary. Let $L : H \to H$ be a linear compact self-adjoint operator. Then there exists an eigenvalue $\lambda$ of L such that $\|L\| = |\lambda|$.

Proof. If L = 0, then $\lambda = 0$ is an eigenvalue. If $L \ne 0$, then $\|L\| = \max(|m|, |M|) \ne 0$ and $M, m \in \sigma(L)$. Assuming $\|L\| = |M|$, then $M \ne 0$ and, according to Theorem 10.71, $M \in \sigma_p(L)$, i.e., M is an eigenvalue of L; the case $\|L\| = |m|$ is similar. Alternatively, we can proceed more directly as follows. Let $\{u_n\}$ be a sequence with $\|u_n\| = 1$ such that $(Lu_n|u_n) \to M$; then $(Mu_n - Lu_n|u_n) \to 0$, and by (10.34) $Mu_n - Lu_n \to 0$ in H. Since L is compact, there is a subsequence $\{u_{k_n}\}$ of $\{u_n\}$ such that $\{Lu_{k_n}\}$ converges; since $M \ne 0$ and $Mu_{k_n} - Lu_{k_n} \to 0$, also $u_{k_n} \to u$ for some $u \in H$, hence
$$Mu - Lu = 0, \qquad \|u\| = 1,$$
i.e., M is an eigenvalue for L. □
b. Spectral theorem

10.75 Theorem (Spectral theorem). Let H be a real or complex Hilbert space and K a linear self-adjoint compact operator. Denote by W the family of finite linear combinations of eigenvectors of K corresponding to nonzero eigenvalues. Then W is dense in $N(K)^\perp$. In particular, $N(K)^\perp$ has an at most denumerable orthonormal basis of eigenvectors of K. If $P_j$ is the orthogonal projection on the eigenspace corresponding to the nonzero eigenvalue $\lambda_j$, then
$$K = \sum_{j=1}^{\infty} \lambda_j P_j \qquad \text{in } \mathcal{L}(H,H).$$
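Proof. We order the nonzero eigenvalues as
$$\lambda_i \ne \lambda_j \ \text{ for } i \ne j, \qquad |\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \dots$$
and set $N_j := N(\lambda_j\,\mathrm{Id} - K)$ for the finite-dimensional eigenspace corresponding to $\lambda_j$. According to Proposition 10.73,
$$N_j \perp N_k \ \text{ for } j \ne k \qquad \text{and} \qquad N(K) \perp N_j \ \ \forall j,$$
hence $N(K) \subset W^\perp$. To prove that W is dense in $N(K)^\perp$, it suffices to show that $\overline{W} = N(K)^\perp$ or $W^\perp = N(K)$. Define
$$W_n := \begin{cases} \{0\} & \text{if } K \text{ has no nonzero eigenvalues,} \\ \bigoplus_{j=1}^n N_j & \text{if } K \text{ has at least } n \text{ nonzero eigenvalues,} \\ W_p & \text{if } K \text{ has only } p < n \text{ nonzero eigenvalues,} \end{cases}$$
and $V_n := W_n^\perp$. Trivially $V := W^\perp = \cap_n V_n$. Notice that, since K is self-adjoint, $K(W_n^\perp) \subset W_n^\perp$ if $K(W_n) \subset W_n$, and the linear operator $K_{|V_n} \in \mathcal{L}(V_n, V_n)$ is again compact and self-adjoint. Moreover, the spectrum of $K_{|V_n}$ is made by the eigenvalues of K different from $\{\lambda_1, \lambda_2, \dots, \lambda_n\}$. Therefore by Corollary 10.74
$$\|K_{|V_n}\| = \begin{cases} |\lambda_{n+1}| & \text{if } K \text{ has at least } n+1 \text{ nonzero eigenvalues,} \\ 0 & \text{otherwise.} \end{cases} \qquad (10.35)$$
If K has a finite number of eigenvalues, then $V = V_n$ for n large and (10.35) yields $K(V_n) = \{0\}$, i.e., $V = V_n \subset N(K)$.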
If K has a denumerable set $\{\lambda_n\}$ of eigenvalues, then $|\lambda_n| \to 0$ by Theorem 10.71, hence
$$\|K_{|V}\| \le \|K_{|V_n}\| = |\lambda_{n+1}| \to 0$$
and $K(V) = \{0\}$, i.e., $V \subset N(K)$. Choosing an orthonormal set of eigenvectors in each eigenspace $N_j$, we can produce an orthonormal system $\{e_n\}$ of eigenvectors of K corresponding to nonzero eigenvalues, i.e., such that
$$(e_i|e_j) = \delta_{ij}, \qquad Ke_j = \lambda_j e_j,$$
that is complete in the closure $\overline{W} = N(K)^\perp$ of W.
Let us prove the last part of the claim. Let $P_j$ and $Q_n$ be the orthogonal projections, respectively, on $N_j$ and $W_n$. Since the eigenspaces are orthogonal, we have $K(Q_n(x)) = \sum_{j=1}^n \lambda_j P_j(x)$ $\forall x \in H$, hence
$$Kx - \sum_{j=1}^n \lambda_j P_j(x) = K(x - Q_n(x)),$$
and therefore
$$\Big\|Kx - \sum_{j=1}^n \lambda_j P_j(x)\Big\| \le \|K_{|V_n}\|\,\|x - Q_n(x)\| \le |\lambda_{n+1}|\,\|x\|,$$
that is,
$$\Big\|K - \sum_{j=1}^n \lambda_j P_j\Big\| \le |\lambda_{n+1}|;$$
the conclusion then follows since $|\lambda_n| \to 0$ as $n \to \infty$. □
c. Compact normal operators

A linear bounded operator $T \in \mathcal{L}(H,H)$ in a complex Hilbert space H is called normal if $T^*T = TT^*$. It is easy to show that if T is normal, then
(i) $N(T) = N(T^*T) = N(TT^*) = N(T^*)$,
(ii) $N(T - \lambda\,\mathrm{Id}) = N(T^* - \overline{\lambda}\,\mathrm{Id})$, that is, T and $T^*$ have the same eigenspaces and conjugate eigenvalues.
If T is normal, the operators
$$A := \frac{T + T^*}{2}, \qquad B := \frac{T - T^*}{2i}$$
are self-adjoint and commute, AB = BA. Two linear compact self-adjoint operators that commute have the same spectral resolution, see Theorem 4.29 for the finite-dimensional case.

10.76 Theorem. Let H be a complex Hilbert space and A, B two linear compact self-adjoint operators in H such that AB = BA. Then there exists a denumerable orthonormal system $\{e_n\}$ which is complete in $(N(A) \cap N(B))^\perp$ and made by common eigenvectors of A and B. If $\lambda_j$ and $\mu_j$ are, respectively, the eigenvalue of A and the eigenvalue of B relative to $e_j$, and $P_j : H \to H$ is the orthogonal projection onto $\mathrm{Span}\{e_j\}$, $P_j x := (x|e_j)e_j$, then
$$A = \sum_{j=1}^{\infty} \lambda_j P_j, \qquad B = \sum_{j=1}^{\infty} \mu_j P_j \qquad \text{in } \mathcal{L}(H,H).$$
Figure 10.7. Two pages from two papers by John von Neumann (1903-1957) in Mathematische Annalen.
Proof. Let W be as in Theorem 10.75, with A in place of K. As in the finite-dimensional case, see Proposition 4.27, for every nonzero eigenvalue $\lambda$ of A we find a basis of the corresponding eigenspace $N(\lambda\,\mathrm{Id} - A)$ made by eigenvectors of B. By induction we then find a denumerable orthonormal system which is complete in W and made of common eigenvectors $\{e_n\}$ of A and B. By Theorem 10.75, $\overline{W} = N(A)^\perp$ and $\{e_n\}$ is a basis of $N(A)^\perp$ of common eigenvectors of A and B. Now AB = BA implies that $B(N(A)) \subset N(A)$. Therefore, applying the spectral theorem to $B_{|N(A)}$, we find further eigenvectors $\{u_n\}$ of B corresponding to nonzero eigenvalues that form a basis of $N(A) \cap N(B)^\perp$. The family $\{e_n\} \cup \{u_n\}$ is now a denumerable orthonormal set of eigenvectors common to A and B that is complete in $(N(A) \cap N(B))^\perp$. The second part of the claim easily follows by applying Theorem 10.75 to A and B. □
10.77 Corollary. Let H be a complex Hilbert space and let $T : H \to H$ be a compact normal operator. Then there exists a denumerable orthonormal basis $\{e_n\}$ of $N(T)^\perp$ made of common eigenvectors of T and $T^*$. If $P_j$ denotes the orthogonal projection on $\mathrm{Span}\{e_j\}$ and $\lambda_j$ is the corresponding eigenvalue of T, then
$$T = \sum_{j=1}^{\infty} \lambda_j P_j, \qquad T^* = \sum_{j=1}^{\infty} \overline{\lambda_j} P_j \qquad \text{in } \mathcal{L}(H,H).$$

Proof. Set $A := (T + T^*)/2$ and $B := (T - T^*)/(2i)$. We can apply Theorem 10.76 and find a basis $\{e_n\}$ in $(N(A) \cap N(B))^\perp$, i.e., a basis in $\ker(T)^\perp = \ker(T^*)^\perp$ made by common eigenvectors of $T = A + iB$ and $T^* = A - iB$. □
d. The Courant-Hilbert-Schmidt theory

In several instances one is led to discuss the existence and uniqueness of solutions in a Hilbert space H of equations of the type
$$a(\varphi, u) - \lambda k(\varphi, u) = F(\varphi) \qquad \forall \varphi \in H \qquad (10.36)$$
where $F \in H^*$, and $a(\varphi,u)$, $k(\varphi,u)$ are bounded bilinear forms in H. As we have seen, by Riesz's theorem, there exist bounded operators $A, K \in \mathcal{L}(H,H)$ and $f \in H$ such that
$$a(\varphi, u) := (\varphi|Au), \qquad k(\varphi, u) := (\varphi|Ku), \qquad F(\varphi) = (\varphi|f)$$
for all $u, \varphi \in H$. Then (10.36) reads equivalently as the linear equation in H
$$(A - \lambda K)u = f. \qquad (10.37)$$
With the previous notation suppose that
o A is continuous, self-adjoint and coercive on H, i.e., there exists $\nu > 0$ such that
$$a(u,u) \ge \nu\,\|u\|^2 \qquad \forall u \in H, \qquad (10.38)$$
o K is compact, self-adjoint and positive, i.e.,
$$k(u,u) = (u|Ku) > 0 \qquad \forall u \ne 0,\ u \in H. \qquad (10.39)$$
With these assumptions, the corresponding bilinear forms are continuous and symmetric; moreover $a(v,u)$ defines an inner product in H equivalent to the original one $(v|u)$ since
$$\nu\,\|u\|^2 \le a(u,u) \le \|A\|\,\|u\|^2.$$
Finally, see Theorem 10.44, A has a continuous inverse. The operator $A - \lambda K$ is therefore a compact perturbation of an isomorphism, and, since A and K are self-adjoint, the alternative theorem yields the following.

10.78 Theorem. The equation $Au - \lambda Ku = f$ has a solution if and only if f is orthogonal to the solutions of $Au - \lambda Ku = 0$.

Now we want to study the equation
$$Au - \lambda Ku = 0,$$
equivalently,
$$a(\varphi, u) - \lambda k(\varphi, u) = 0 \qquad \forall \varphi \in H,$$
which can be rewritten as
$$\frac{1}{\lambda}\,u - A^{-1}Ku = 0.$$
With the assumptions we have made
o $A^{-1}K$ is a linear compact operator,
Figure 10.8. Lord William Strutt Rayleigh (1842-1919) and the frontispiece of his Theory of Sound.
o $A^{-1}K$ is positive, since $a(u, A^{-1}Ku) = (u|AA^{-1}Ku) = (u|Ku) > 0$ for $u \ne 0$,
o $A^{-1}K$ is self-adjoint with respect to the inner product $a(v,u)$, since
$$a(v, A^{-1}Ku) = (v|Ku) = (u|Kv) = a(u, A^{-1}Kv).$$
10.79 Definition. We shall say that $\lambda \ne 0$ is an eigenvalue of (A, K) and that u is an eigenvector of (A, K) corresponding to $\lambda$ if $1/\lambda$ is an eigenvalue of $A^{-1}K$ and u is a corresponding eigenvector, i.e., a nonzero solution of $Au - \lambda Ku = 0$.

The theory previously developed, when applied to the self-adjoint compact operator $A^{-1}K$ in the Hilbert space H with the inner product $a(v,u)$, yields the following.

10.80 Theorem. Let H be an infinite-dimensional Hilbert space and let A and $K \in \mathcal{L}(H,H)$ be self-adjoint, with A coercive and K compact and positive as in (10.38) and (10.39). The equation $Au - \lambda Ku = 0$ has zero as its unique solution except for a sequence $\{\lambda_n\}$ of positive real numbers such that $\lambda_n \to +\infty$. For any such $\lambda_n$, the vector space of solutions of $Au - \lambda_n Ku = 0$ is finite dimensional. Moreover, if W is the family of finite linear combinations of eigenvectors of (A, K), then W is dense in H. In particular, there exists a complete system $\{e_j\}$ in H of eigenvectors of (A, K) such that
$$a(e_i, e_j) = \lambda_j \delta_{ij}, \qquad k(e_i, e_j) = \delta_{ij} \qquad \forall i, j.$$
Proof. The eigenvalues of $A^{-1}K$ are positive since $A^{-1}K$ is positive. Since $A^{-1}K$ is compact, $A^{-1}K$ has a denumerable sequence of eigenvalues $\{\mu_n\}$ with $\mu_n \to 0^+$, and the corresponding eigenspaces are finite dimensional by Theorem 10.71. Consequently $Au - \lambda Ku = 0$ has nonzero solutions for the sequence $\lambda_n = 1/\mu_n \to +\infty$. The spectral theorem yields the density of W in H and the existence of a basis $\{u_j\}$ of eigenvectors of $A^{-1}K$, orthonormal with respect to the inner product $a(v,u)$,
$$a(u_i, u_j) = \delta_{ij}, \qquad \frac{1}{\lambda_j}\,u_j - A^{-1}Ku_j = 0.$$
Therefore, $a(u_i,u_j) = \delta_{ij}$, $\lambda_j k(u_i,u_j) = a(u_i,u_j) = \delta_{ij}$ and, if we set $e_j := \sqrt{\lambda_j}\,u_j$, we conclude
$$a(e_i, e_j) = \lambda_j \delta_{ij}, \qquad k(e_i, e_j) = \delta_{ij}. \qquad \Box$$
e. Variational characterization of eigenvalues

10.81 Theorem. Let H, A and K be as in Theorem 10.80. Let $\{e_n\}$ be a basis in H of eigenvectors of (A, K) ordered in such a way that the corresponding eigenvalues $\lambda_n$ form a nondecreasing sequence $\{\lambda_n\}$, $\lambda_n \le \lambda_{n+1}$. Then each $\lambda_n$ is the minimum of the Rayleigh's quotients
$$\lambda_1 = \min_{u \ne 0} \frac{a(u,u)}{k(u,u)}, \qquad \lambda_n = \min_{u \ne 0}\Big\{\frac{a(u,u)}{k(u,u)} \ \Big|\ k(u, e_i) = 0 \ \ \forall i = 1, \dots, n-1\Big\} \ \text{ if } n > 1.$$

Proof. For $u \in H$ write $u = \sum_{j=1}^{\infty} c_j e_j$ so that $k(u, e_j) = c_j$ and $k(u,u) = \sum_{j=1}^{\infty} |c_j|^2$. If $u \in V_n := \{u \mid k(u, e_i) = 0\ \forall i = 1, \dots, n-1\}$, then $c_i = 0$ for $i = 1, \dots, n-1$, hence
$$k(u,u) = \sum_{j=n}^{\infty} |c_j|^2$$
while
$$a(u,u) = \sum_{j=n}^{\infty} \lambda_j |c_j|^2 \ge \lambda_n \sum_{j=n}^{\infty} |c_j|^2 = \lambda_n k(u,u).$$
Therefore $\frac{a(u,u)}{k(u,u)} \ge \lambda_n$ for all nonzero $u \in V_n$. On the other hand, $e_n \in V_n$ and $a(e_n, e_n) = \lambda_n k(e_n, e_n)$. □
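The constrained minimization in Theorem 10.81 can be tried out in a finite-dimensional model. The sketch below is only an illustration under stated assumptions: A and K are taken diagonal (the hypothetical lists `a_diag` and `k_diag`), so that the eigenpairs of (A, K) are known in closed form, and the minimum is only probed by random sampling.

```python
import random

# Hypothetical finite-dimensional data: A = diag(a_diag) is coercive,
# K = diag(k_diag) is positive.
a_diag = [2.0, 3.0, 10.0]
k_diag = [1.0, 0.5, 2.0]
dim = len(a_diag)

def a_form(u, v):
    return sum(a * x * y for a, x, y in zip(a_diag, u, v))

def k_form(u, v):
    return sum(k * x * y for k, x, y in zip(k_diag, u, v))

def rayleigh(u):
    return a_form(u, u) / k_form(u, u)

# Eigenpairs of (A, K): lambda = a_i / k_i with the i-th coordinate vector,
# scaled so that k(e, e) = 1, listed in nondecreasing order of lambda.
pairs = sorted((a / k, i) for i, (a, k) in enumerate(zip(a_diag, k_diag)))
eigenvalues = [lam for lam, _ in pairs]
eigenvectors = []
for _, i in pairs:
    e = [0.0] * dim
    e[i] = k_diag[i] ** -0.5
    eigenvectors.append(e)

def constrained_sample(n, rng):
    """Random u with k(u, e_i) = 0 for the first n-1 eigenvectors."""
    u = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for e in eigenvectors[:n - 1]:
        c = k_form(u, e)
        u = [x - c * y for x, y in zip(u, e)]
    return u

rng = random.Random(0)
sampled_minima = []
for n in range(1, dim + 1):
    quotients = [rayleigh(constrained_sample(n, rng)) for _ in range(2000)]
    sampled_minima.append(min(quotients))
    print(n, eigenvalues[n - 1], sampled_minima[-1],
          rayleigh(eigenvectors[n - 1]))
```

For each n the sampled quotients stay above the n-th eigenvalue, and the quotient evaluated at the n-th eigenvector equals it, which is the content of the theorem in this toy setting.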
Moreover, with the previous notation and assumptions, we have the following.

10.82 Theorem (Min-max characterization). Denote by S a generic subspace of dimension n - 1. Then we have
$$\lambda_n = \max_{S} \min\Big\{\frac{a(u,u)}{k(u,u)} \ \Big|\ u \ne 0,\ k(u,z) = 0 \ \ \forall z \in S\Big\}.$$

Proof. The inequality $\le$ follows from Theorem 10.81. Let S be a linear subspace of H of dimension n - 1 and $V_n := \mathrm{Span}\{e_1, e_2, \dots, e_n\}$. Choose a nonzero vector $u_0 := \sum_{i=1}^n \alpha_i e_i$ so that $k(u_0, z) = 0$ $\forall z \in S$; this is possible since $\dim S = n - 1$. Then
$$k(u_0, u_0) = \sum_{i=1}^n \alpha_i^2, \qquad a(u_0, u_0) = \sum_{i=1}^n \lambda_i \alpha_i^2 \le \lambda_n \sum_{i=1}^n \alpha_i^2 = \lambda_n k(u_0, u_0),$$
hence
$$\min\Big\{\frac{a(u,u)}{k(u,u)} \ \Big|\ u \ne 0,\ k(u,z) = 0 \ \ \forall z \in S\Big\} \le \frac{a(u_0,u_0)}{k(u_0,u_0)} \le \lambda_n. \qquad \Box$$
10.5 Exercises

10.83 ¶. Let $P_V : H \to H$ be the orthogonal projection onto the closed subspace V of a Hilbert space H. Show that
o $P_V + P_W$ is a projection onto a closed subspace if and only if $P_V P_W = 0$, and in this case $P_V + P_W = P_{V \oplus W}$,
o $P_V P_W$ is a projection onto a closed subspace if and only if $P_V P_W = P_W P_V$, and in this case $P_V P_W = P_{V \cap W}$.

10.84 ¶. Let $S, T \in \mathcal{L}(H,H)$. Show that $(S + T)^* = S^* + T^*$, $(ST)^* = T^*S^*$, $(\lambda T)^* = \overline{\lambda}\,T^*$, $\mathrm{Id}^* = \mathrm{Id}$, $T^{**} = T$.

10.85 ¶. Let $T \in \mathcal{L}(H,H)$. Show that $\|T\|^2 = \|T^*\|^2 = \|TT^*\| = \|T^*T\|$.
10.86 ¶. Show that Hilbert's cube $\{x \in \ell^2 \mid |x_n| \le 1/n\}$ is compact, while $\{x \in \ell^2 \mid |x_n| \le 1\}$ is not compact. Show also that Hilbert's cube has no interior points, i.e., its complement is dense.

10.87 ¶. Show the following.
Proposition. Let $L \in \mathcal{L}(H,H)$ be a bounded self-adjoint operator on a real or complex Hilbert space and
$$m := \inf_{\|u\|=1} (Lu|u), \qquad M := \sup_{\|u\|=1} (Lu|u).$$
Then
(i) $\sigma_p(L) \subset [m, M]$,
(ii) we have $(Lu|u) = M\|u\|^2$ (resp. $(Lu|u) = m\|u\|^2$) if and only if u is an eigenvector of L corresponding to M (respectively m).
[Hint: (i) Proceed by contradiction using Riesz's theorem; in the complex case, first show that $\sigma_p(L) \subset \mathbb{R}$. (ii) Use (10.34).]

10.88 ¶. Show the following.
Proposition. Let H be a Hilbert space, $\{\lambda_j\}$ a sequence of nonzero real numbers converging to 0, $\{e_j\}$ an orthonormal set in H and $P_j : H \to H$ the orthogonal projection onto $\mathrm{Span}\{e_j\}$. Show that the series $\sum_{j=1}^{\infty} \lambda_j P_j$ converges in $\mathcal{L}(H,H)$. Moreover, if
$$K := \sum_{j=1}^{\infty} \lambda_j P_j \qquad \text{in } \mathcal{L}(H,H), \qquad (10.40)$$
then
(i) $\forall x \in H$, $Kx = \sum_{j=1}^{\infty} \lambda_j (x|e_j) e_j$,
(ii) for all j, $\lambda_j$ is a nonzero eigenvalue of K and $Ke_j = \lambda_j e_j$,
(iii) the sequence $\{\lambda_j\}$ is the set of all nonzero eigenvalues of K,
(iv) K is self-adjoint and compact,
(v) $\{e_j\}$ is a basis of $(\ker K)^\perp$, in particular $(\ker K)^\perp$ is separable,
(vi) if $\lambda \ne 0$ and $\lambda \ne \lambda_j$ $\forall j$, then $(\lambda\,\mathrm{Id} - K)^{-1}$ is an isomorphism of H into itself.
If H is a complex Hilbert space and $\lambda_j \in \mathbb{C}$, then (i), (ii), (iii), (v), (vi) still hold and K is compact and normal. [Hint: (vi) follows from (v) and the alternative theorem. Moreover, an explicit bound for $(\lambda\,\mathrm{Id} - K)^{-1}$ follows from (i), assuming H separable. Choose an orthonormal basis $\{z_\alpha\}$ in $\ker K$. Then show that the equation $(\lambda\,\mathrm{Id} - K)x = y$ has a unique solution
$$x = (\lambda\,\mathrm{Id} - K)^{-1} y = \sum_{k=1}^{\infty} \frac{(y|e_k)}{\lambda - \lambda_k}\,e_k + \frac{1}{\lambda} \sum_{\alpha} (y|z_\alpha)\,z_\alpha.$$
Then
$$\|x\| \le \frac{1}{d}\,\|y\|$$
if $d := \inf_{k \in \mathbb{N}} |\lambda - \lambda_k|$.]

10.89 ¶. Let H be separable and let $\{e_n\}$ be a basis of H. Consider the linear operator $T(e_j) = \frac{1}{j}\,e_j$, $j \ge 1$, i.e.,
$$T(u) := \sum_{j=1}^{\infty} \frac{1}{j}\,(u|e_j)\,e_j.$$
Show that T is compact with a nonclosed range. [Hint: Show that T is the limit in $\mathcal{L}(H,H)$ of a sequence of linear operators with finite-dimensional range. Then show that $v \in R(T)$ if and only if $\sum_{j=1}^{\infty} j^2 |(v|e_j)|^2 < +\infty$. Choose $v_0 := \sum_{j=1}^{\infty} j^{-3/2} e_j$, $v_n := \sum_{j=1}^{\infty} j^{-(3/2 + 1/n)} e_j$ and show that $v_0 \notin R(T)$, $v_n \in R(T)$ and $\|v_n - v_0\| \to 0$.]

10.90 ¶. With the notation of Theorem 10.80 show the so-called completeness relations
$$k(u,v) = \sum_{i=1}^{\infty} k(u, e_i)\,k(v, e_i), \qquad a(u,v) = \sum_{i=1}^{\infty} \lambda_i\,k(u, e_i)\,k(v, e_i).$$
11. Some Applications
In this chapter we shall illustrate some of the applications of the abstract principles we stated in the previous chapter to specific concrete problems. Our aim is to show the usefulness of the abstract language in formulating and answering specific questions, identifying their specific characteristics and recognizing common features of problems that a priori are very different. Of course, the abstract approach mostly follows and is motivated by concrete questions, but later we see the approach as the most direct way to understand many questions, and even the most natural. Clearly, the problems we are going to discuss deserve more careful and detailed study because of their relevance, but this is out of our present scope and, in any case, often not possible because of the limited topics we have so far developed. For instance, in this chapter, we shall only use uniform convergence, since the use of integral norms, besides being more complex, requires the notion of Lebesgue's integration, without which any presentation would sound artificial.
11.1 Two Minimum Problems

11.1.1 Minimal geodesics in metric spaces

Let X be a connected metric space such that every two of its points can be connected by a continuous path. One of the simplest problems of calculus of variations is to find a continuous curve of minimal length connecting two given points. A first question is deciding when such a minimal connection exists. Here, we shall see how the Fréchet-Weierstrass theorem, Theorem 6.24, and the Ascoli-Arzelà theorem, Theorem 9.48, lead to an answer.

a. Semicontinuity of the length

Let X be a metric space. Recall that $f : X \to \mathbb{R}$ is called lower semicontinuous if the level sets $\Gamma_{f,\lambda} := \{x \mid f(x) \le \lambda\}$ of f are closed for all $\lambda \in \mathbb{R}$, equivalently if $f^{-1}(]\lambda, +\infty[)$ is open for all $\lambda \in \mathbb{R}$. Observing that,
Figure 11.1. Frontispieces of Commentarii Petropoli vol. 3 (1732) and of the paper by Leonhard Euler (1707-1783) De linea brevissima.
if $f = \sup_i f_i$, then $f^{-1}(]\lambda, +\infty[) = \cup_i f_i^{-1}(]\lambda, +\infty[)$, we conclude the following.

11.1 Proposition. Let $f_i : X \to \mathbb{R}$, $i \in I$, be a family of lower semicontinuous functions on a metric space X. Then $f := \sup_i f_i$ is a lower semicontinuous function.

Let (X, d) be a metric space. As we have seen, cf. Example 6.25, the length functional $L : C^0([a,b], X) \to \mathbb{R}$, which for each continuous curve $\varphi : [a,b] \to X$ gives its length, is not a continuous functional with respect to uniform convergence in $C^0([a,b], X)$. But we have the following.

11.2 Theorem (Semicontinuity). The length functional $L(\varphi)$ is lower semicontinuous in $C^0([a,b], X)$.

Proof. Recall that we have
$$L(\varphi) = \sup_{S \in \mathcal{S}} V_S(\varphi)$$
where $V_S(f) := \sum_{i=0}^{N-1} d(f(t_i), f(t_{i+1}))$, $S = \{t_0 = a < t_1 < \dots < t_N = b\}$. Since the functional $f \to V_S(f)$ is continuous for every fixed subdivision S of [a,b], the result follows from Proposition 11.1. □

b. Compactness

The intrinsic reparametrization theorem, Theorem 7.44, can be reformulated as: For every family of parametric curves $\{C_i\}$ in X of length strictly less than k, there exists a family of curves $\{C_i'\}$ parametrized in [0,1],
thus belonging in $C^0([0,1], X)$, such that $C_i$ and $C_i'$ are equivalent for all $i \in I$, in particular they have the same length, and the curves $C_i'$ are equi-Lipschitz with Lipschitz constant less than k. Assuming X compact, the Ascoli-Arzelà theorem yields the following.

11.3 Theorem (Compactness). Let X be a compact metric space and let $\{C_i\}$ be a family of parametrized curves of length strictly less than k. Then the family $\{C_i\}$ is relatively compact with respect to uniform convergence. More precisely, one can reparametrize the curves $C_i$ on [0,1] in such a way that they belong to $\mathrm{Lip}_k([0,1], X)$, and therefore $\{C_i\}$ is a relatively compact subset of $C^0([0,1], X)$.

c. Existence of minimal geodesics

An immediate consequence of Theorems 11.2 and 11.3, on account of the Fréchet-Weierstrass theorem, is the following.

11.4 Theorem (Existence). Let X be an arc-connected compact metric space and P, Q two points of X. There exists a simple rectifiable curve of minimal length joining P to Q, provided there exists at least a rectifiable curve connecting P and Q.

Proof. Since there exists at least a rectifiable curve connecting P and Q,
$$\lambda := \inf\{L(\gamma) \mid \gamma \text{ connecting } P \text{ and } Q\} < +\infty.$$
Let $k > \lambda$ and let
$$K := \{\varphi : [0,1] \to X \mid \varphi \in \mathrm{Lip}_k([0,1], X),\ \varphi(0) = P,\ \varphi(1) = Q\}.$$
By the Ascoli-Arzelà theorem, K is compact in $C^0([0,1], X)$, hence there is $\varphi_0 \in K$ such that
$$L(\varphi_0) = \inf\{L(\gamma) \mid \gamma \text{ connecting } P \text{ and } Q\} \qquad (11.1)$$
by the Fréchet-Weierstrass theorem. The map $\varphi_0$ need not be injective a priori. However, the intrinsic reparametrization $\psi : [0, L(\varphi_0)] \to X$, see Theorem 7.44, which is equivalent to $\varphi_0$, having Lipschitz constant one, satisfies
$$L(\psi, [x_1, x_2]) = |x_1 - x_2|$$
and is injective. In fact, if $\psi(x_1) = \psi(x_2)$ with $x_1 < x_2$, deleting the loop corresponding to the interval $]x_1, x_2[$, we would still get a curve connecting P and Q, but of length strictly less than $L(\varphi_0)$, contradicting (11.1). □

11.5 ¶. Show that the compactness assumption on X in Theorem 11.4 is necessary. In particular, discuss the cases when X equals the closed unit cube minus an interior open segment and minus a closed interior segment.
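The length functional of this subsection, being a supremum of polygonal sums, is lower semicontinuous but not continuous under uniform convergence. A minimal sketch in plain Python follows; the zigzag curves and the helper names are assumptions chosen for illustration, a classical example rather than one taken from the text.

```python
import math

def zigzag(n):
    """Curve t -> (t, g_n(t)) on [0, 1]; g_n is a triangular wave of
    height 1/(2n) and slope +1 or -1."""
    def curve(t):
        s = t * n
        return (t, abs(s - round(s)) / n)
    return curve

def segment(t):
    """The uniform limit of the zigzag curves: the straight segment."""
    return (t, 0.0)

def uniform_subdivision(N):
    return [i / N for i in range(N + 1)]

def polygonal_length(curve, points):
    """Sum of the distances between consecutive vertices curve(t_i)."""
    pts = [curve(t) for t in points]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

for n in (1, 10, 100):
    S = uniform_subdivision(200 * n)   # contains every corner of zigzag(n)
    sup_dist = max(math.dist(zigzag(n)(t), segment(t)) for t in S)
    print(n, sup_dist, polygonal_length(zigzag(n), S))

print(polygonal_length(segment, uniform_subdivision(100)))
```

The zigzag curves converge uniformly to the segment, yet each of them has length $\sqrt{2}$ while the limit has length 1: the length can drop in the limit but cannot jump up, in agreement with Theorem 11.2.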
11.1.2 A minimum problem in a Hilbert space In this section we shall show how the theorem ensuring the existence of minimizers for quadratic coercive functionals generalizes to convex coercive functionals in a Hilbert space.
a. Weak convergence in Hilbert spaces

Let X be a Banach space. We say that a sequence $\{x_n\} \subset X$ converges weakly to $x \in X$, and we write
$$x_n \rightharpoonup x,$$
if $F(x_n) \to F(x)$ $\forall F \in X^*$, i.e., for every linear continuous functional $F : X \to \mathbb{R}$ on X. On account of Riesz's representation theorem, we have the following.

11.6 Proposition. A sequence $\{u_n\}$ in a Hilbert space H converges weakly to $u \in H$ iff $(u_n|v) \to (u|v)$ $\forall v \in H$.

If H is finite dimensional, weak and strong convergence agree, since weak convergence amounts to the convergence of the components in an orthonormal basis. On the contrary, if H has infinite dimension, the two notions of convergence differ. In fact, while from the inequality
$$|(u_n - u|v)| \le \|v\|\,\|u_n - u\|$$
we get that strong convergence, $\|u_n - u\| \to 0$, implies weak convergence $u_n \rightharpoonup u$, the opposite is not true. Consider, for instance, a denumerable orthonormal set $\{e_n\} \subset H$. Then Bessel's inequality yields $(e_n|v) \to 0$ $\forall v \in H$, i.e., $e_n \rightharpoonup 0$, while $\{e_n\}$ does not converge since
$$\|e_n - e_m\|^2 = \|e_n\|^2 - 2(e_n|e_m) + \|e_m\|^2 = 2 \qquad \forall n \ne m.$$
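The orthonormal-sequence example can be made concrete in a truncated model of $\ell^2$. This is only a sketch under assumptions: the truncation dimension `DIM`, the test vector `v` and the helper names are all illustrative, and a finite truncation can of course only suggest the infinite-dimensional behaviour.

```python
DIM = 1000  # truncation of l^2 to finitely many coordinates (an assumption)

def basis_vector(n):
    """n-th vector e_n of the standard orthonormal basis, n = 1, 2, ..."""
    return [1.0 if j == n - 1 else 0.0 for j in range(DIM)]

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def dist_sq(u, w):
    diff = [x - y for x, y in zip(u, w)]
    return inner(diff, diff)

# A fixed square-summable test vector v = (1, 1/2, 1/3, ...).
v = [1.0 / (j + 1) for j in range(DIM)]

# The pairings (e_n | v) = 1/n tend to zero ...
pairings = [inner(basis_vector(n), v) for n in (1, 10, 100, 1000)]
# ... while consecutive basis vectors stay at squared distance 2.
gaps = [dist_sq(basis_vector(n), basis_vector(n + 1)) for n in (1, 10, 100)]

print(pairings)
print(gaps)
```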
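Weak convergence is one of the major tools in modern analysis. Here we only state one of its most useful consequences.

11.7 Theorem. Every bounded sequence in a separable Hilbert space has a subsequence that is weakly convergent.

Proof. Let $\{x_n\}$, $\|x_n\| \le M$, be a bounded sequence in H, and $\{e_i\}$ be a basis of H. $\{x_n\}$ has a subsequence $\{x_n'\}$ such that $(x_n'|e_1) \to \alpha_1$. Similarly $\{x_n'\}$ has a subsequence $\{x_n''\}$ such that $(x_n''|e_2) \to \alpha_2$, and so on. By a diagonal process we find a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ such that $(x_{k_n}|e_i) \to \alpha_i$ for all i. Set
$$T(y) := \sum_{i=1}^{\infty} (y|e_i)\,\alpha_i, \qquad y \in H.$$
T is linear and bounded, $\|T\| \le M$, as
$$\Big|\sum_{i=1}^{N} (y|e_i)\,\alpha_i\Big| = \lim_{n \to \infty} \Big|\Big(\sum_{i=1}^{N} (y|e_i)\,e_i \,\Big|\, x_{k_n}\Big)\Big| \le M\,\|y\| \qquad \forall N,$$
hence the representation theorem of Riesz yields the existence of $x_T \in H$ such that $T(y) = (y|x_T)$ $\forall y \in H$ and $\|x_T\| = \|T\| \le M$. In particular $x_T = \sum_{i=1}^{\infty} \alpha_i e_i \in H$. We now prove that $\{x_{k_n}\}$ converges weakly to $x_T$. For that, set $z_n := x_{k_n} - x_T$ and let y be any vector in H. For any fixed $\epsilon > 0$ choose N sufficiently large so that for $y_N := \sum_{i=1}^{N} (y|e_i)e_i$, we have $\|y - y_N\| < \epsilon$. Then
$$|(z_n|y - y_N)| \le \|z_n\|\,\|y - y_N\| \le 2M\epsilon,$$
while $(z_n|y_N) \to 0$ as $n \to \infty$, since $(z_n|e_i) \to 0$ for every i. Therefore $\limsup_{n} |(z_n|y)| \le 2M\epsilon$ and, $\epsilon$ being arbitrary, $(z_n|y) \to 0$. □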
11.8 Remark. The last part of the proof actually shows that in a separable Hilbert space, for a bounded sequence $\{x_n\}$, weak convergence $(x_n - x|y) \to 0$ $\forall y$ amounts to the convergence of the components $(x_n - x|e_i) \to 0$ $\forall i$ in an orthonormal basis.
11.9 ¶. Show that the compactness theorem, Theorem 10.52, holds in a generic Hilbert space which is not necessarily separable. [Hint: Apply Theorem 10.52 to the closure $H_0$ of the family of finite combinations of $\{x_n\}$, which is a separable Hilbert space. Then find $x \in H_0$ and a subsequence $\{x_{k_n}\}$ such that $(x_{k_n} - x|y) \to 0$ $\forall y \in H_0$. Then, use the orthogonal projection theorem onto $H_0$ to show that actually $(x_{k_n} - x|y) \to 0$ $\forall y \in H$.]
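11.10 Theorem (Banach-Saks). Every bounded sequence $\{v_n\} \subset H$ weakly convergent to $v \in H$ has a subsequence $\{v_{k_n}\}$ such that
$$\frac{1}{n} \sum_{i=1}^{n} v_{k_i} \to v \qquad \text{in the norm of } H.$$

Proof. Set $u_n := v_n - v$. Then for a positive M we have $\|u_n\| \le M$ for all n and, since $u_n \rightharpoonup 0$, we can extract from $\{u_n\}$ a subsequence $\{u_{k_n}\}$ in such a way that
$$u_{k_1} := u_1, \quad (u_{k_2}|u_{k_1}) \le 1, \quad (u_{k_3}|u_{k_1}),\ (u_{k_3}|u_{k_2}) \le \frac{1}{2}, \quad \dots, \quad (u_{k_{p+1}}|u_{k_i}) \le \frac{1}{p} \ \ \forall i = 1, \dots, p.$$
Therefore
$$\Big\|\frac{1}{n} \sum_{i=1}^{n} u_{k_i}\Big\|^2 = \frac{2}{n^2} \sum_{j=2}^{n} \sum_{i<j} (u_{k_j}|u_{k_i}) + \frac{1}{n^2} \sum_{j=1}^{n} \|u_{k_j}\|^2 \le \frac{2(n-1)}{n^2} + \frac{M^2}{n} \le \frac{2 + M^2}{n} \to 0. \qquad \Box$$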
11. Some Applications
LEONID A TONBI.LI
Multiple Integrals in the Calculus ofVariatiom
FONDAMENTI 01
CAMOK) DKLIE VAEIAZIONI Chtdes 8 . Mbncy, Jt. VOLVUR P K I X O
BOtOGSA ??|G01iA Z A H r O H B L U
Sfdoger-Veriag tkriin l y j d b c t g H«v York 1 9 ^
Figure 11.2. Frontispieces of two classical monographs that, in particular, deal with semicontinuity on integral functionals.
b. Existence of minimizers of convex coercive functionals

Let $\mathcal{F} : H \to \mathbb{R}$ be a convex functional on a real Hilbert space $H$. This means that the function $\varphi(\lambda) := \mathcal{F}(\lambda u + (1 - \lambda)v)$ is convex in $[0,1]$ for all $u, v \in H$. A typical example of a convex functional is the quadratic functional
$$\mathcal{F}(u) = \frac12 \|u\|^2 - L(u),$$
where $L$ is a bounded linear form on $H$, that we have encountered in dealing with the abstract Dirichlet principle. Then we have, compare Proposition 5.61 of [GM1], the following.

11.11 Proposition (Jensen's inequality). A functional $\mathcal{F} : H \to \mathbb{R}$ is convex if and only if for every finite convex combination
$$\sum_{i=1}^{m} \alpha_i u_i, \qquad \sum_{i=1}^{m} \alpha_i = 1, \qquad \alpha_i \ge 0,$$
of points $u_i \in H$ we have
$$\mathcal{F}\Big(\sum_{i=1}^{m} \alpha_i u_i\Big) \le \sum_{i=1}^{m} \alpha_i \mathcal{F}(u_i).$$
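Proof. Clearly Jensen's inequality with two points amounts to convexity. So it suffices to prove it assuming $\mathcal{F}$ convex. We give a direct proof by induction on $m$. Assume the claim holds for $m - 1$ points. Set $\alpha := \alpha_1 + \dots + \alpha_{m-1}$, so that $\alpha_m = 1 - \alpha$. If $\alpha = 0$ or $\alpha = 1$ the claim is proved, otherwise $0 < \alpha < 1$ and, if
$$u := \sum_{i=1}^{m-1} \frac{\alpha_i}{\alpha}\, u_i,$$
then
$$0 \le \frac{\alpha_i}{\alpha} \le 1, \qquad \sum_{i=1}^{m-1} \frac{\alpha_i}{\alpha} = 1, \qquad \sum_{i=1}^{m} \alpha_i u_i = \alpha u + (1 - \alpha) u_m,$$
hence
$$\mathcal{F}\Big(\sum_{i=1}^{m} \alpha_i u_i\Big) = \mathcal{F}(\alpha u + (1 - \alpha)u_m) \le \alpha \mathcal{F}(u) + (1 - \alpha)\mathcal{F}(u_m) \le \sum_{i=1}^{m} \alpha_i \mathcal{F}(u_i)$$
by the inductive assumption. □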
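11.12 Theorem. Let $\mathcal{F} : H \to \mathbb{R}$ be a continuous, convex, bounded from below and coercive functional, meaning
$$\inf_{u \in H} \mathcal{F}(u) > -\infty, \qquad \mathcal{F}(u) \to +\infty \ \text{ as } \|u\| \to +\infty.$$
Then $\mathcal{F}$ has a minimizer in $H$.

Proof. Let $\{u_n\}$ be a minimizing sequence, $\mathcal{F}(u_n) \to \inf_{u \in H} \mathcal{F}(u)$.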
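Since for large $n$
$$-\infty < \inf_H \mathcal{F} \le \mathcal{F}(u_n) \le \inf_H \mathcal{F} + 1,$$
the sequence $\{u_n\}$ is bounded by coercivity. Using the weak compactness of bounded sequences (Theorem 11.7 and Exercise 11.9) and the Banach-Saks theorem we find $u \in H$, and we can extract a subsequence $\{u_{k_n}\}$ of $\{u_n\}$ such that $u_{k_n} \rightharpoonup u$ and
$$v_n := \frac1n \sum_{i=1}^{n} u_{k_i} \to u \qquad\text{in the norm of } H.$$
Jensen's inequality yields
$$\mathcal{F}(v_n) = \mathcal{F}\Big(\frac1n \sum_{i=1}^{n} u_{k_i}\Big) \le \frac1n \sum_{i=1}^{n} \mathcal{F}(u_{k_i}),$$
thus $\mathcal{F}(v_n) \to \inf_H \mathcal{F}$ since $\mathcal{F}(u_{k_i}) \to \inf_H \mathcal{F}$ as $i \to \infty$, i.e., $\{v_n\}$ is a minimizing sequence, too. Finally
$$\inf_H \mathcal{F} \le \mathcal{F}(u) = \lim_{n \to \infty} \mathcal{F}(v_n) = \inf_H \mathcal{F}$$
on account of the continuity of $\mathcal{F}$. □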
11.13 ¶. Show that Theorem 11.12 still holds if $\mathcal{F}$ is convex with values in $\mathbb{R}$, bounded from below and lower semicontinuous.
11.2 A Theorem by Gelfand and Kolmogorov

In this section we shall prove that a topological space $X$ is identified by the space of continuous functions on it. If we think of $X$ as a geometric world and of a map from $X$ into $\mathbb{R}$ as an observable of $X$, we can say: if we know enough observables, say the continuous observables, then we know our world. Let us begin by proving the following.

11.14 Proposition. Every metric space $(X, d)$ can be isometrically embedded in $C^0(X)$.

Proof. Fix $p \in X$ and consider the map $\varphi : X \to C^0(X, \mathbb{R})$ that maps $a \in X$ into $f_a : X \to \mathbb{R}$ defined by
$$f_a(x) := d(x, a) - d(x, p).$$
Trivially, $f_a \in C^0(X, \mathbb{R})$ and $|f_a(x) - f_b(x)| = |d(x, a) - d(x, b)| \le d(a, b)$, i.e., $\|f_a - f_b\|_\infty \le d(a, b)$; on the other hand for $x = b$ we have $|f_a(b) - f_b(b)| = d(a, b)$, hence $\varphi$ is an isometry. □

11.15 ¶. Show that every separable metric space $(X, d)$ can be isometrically embedded in $l_\infty$. [Hint: Let $\{x_n\}$ be a dense sequence in $X$ and let $\varphi : X \to l_\infty$ be given by $\varphi(x)_n := d(x, x_n) - d(x_1, x_n)$. Show that $\varphi$ is an isometry.]
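The embedding of Proposition 11.14 can be checked numerically on a finite metric space. The following is a minimal sketch, assuming a handful of sample points in the plane with the Euclidean distance; the point set and the names `dist`, `embed` and `sup_distance` are illustrative choices, not part of the text.

```python
import math
from itertools import combinations


def dist(u, v):
    """Euclidean distance between two points of the plane."""
    return math.hypot(u[0] - v[0], u[1] - v[1])


def embed(a, p, points):
    """Values of f_a(x) = d(x, a) - d(x, p) at every x in the finite space."""
    return [dist(x, a) - dist(x, p) for x in points]


def sup_distance(f, g):
    """Uniform distance between two functions given by their value lists."""
    return max(abs(s - t) for s, t in zip(f, g))


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (-1.5, 0.5)]
p = points[0]  # base point used in the definition of f_a

# Largest deviation between ||f_a - f_b||_inf and d(a, b) over all pairs.
max_error = max(
    abs(sup_distance(embed(a, p, points), embed(b, p, points)) - dist(a, b))
    for a, b in combinations(points, 2)
)
```

Up to rounding, `max_error` vanishes: the supremum is attained at $x = b$, exactly as in the proof.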
Let $X$ be a topological space, see Chapter 5. The set $C^0(X, \mathbb{R})$ is a linear space and actually a commutative algebra with identity, since the product of two continuous functions is continuous. Let $R$ and $R'$ be two commutative algebras. A map $\varphi : R \to R'$ is said to be a homomorphism from $R$ into $R'$ if $\varphi(a + b) = \varphi(a) + \varphi(b)$ and $\varphi(ab) = \varphi(a)\varphi(b)$. If, moreover, $\varphi$ is bijective we say that $R$ and $R'$ are isomorphic. Clearly $C^0(X, \mathbb{R})$ is completely determined by $X$, in the sense that every topological isomorphism $\varphi : X \to Y$ determines an isomorphism of the commutative algebras $C^0(Y, \mathbb{R})$ and $C^0(X, \mathbb{R})$, the isomorphism being the composition product $f \to f \circ \varphi$. If $X$ is compact, the converse also holds.

11.16 Theorem (Gelfand-Kolmogorov). Let $X$ be a compact topological space. Then $C^0(X, \mathbb{R})$ determines $X$.

We confine ourselves to outlining the proof of Theorem 11.16. An ideal $\mathcal{I}$ of the algebra $R$ is a subset of $R$ such that $a, b \in \mathcal{I} \Rightarrow a - b \in \mathcal{I}$ and $a \in \mathcal{I},\ r \in R \Rightarrow a \cdot r \in \mathcal{I}$. $R$ is clearly the unique ideal that contains the identity of $R$. An ideal is called proper if $\mathcal{I} \ne R$ and maximal if it is not strictly contained in any proper ideal. Finally, we notice that $R/\mathcal{I}$ is a field if and only if $\mathcal{I}$ is maximal.

11.17 Lemma. Let $X$ be a compact topological space. $\mathcal{I}$ is a proper maximal ideal of $C^0(X)$ if and only if there is $x_0 \in X$ such that $\mathcal{I} = \{f \in C^0(X) \mid f(x_0) = 0\}$.
Proof. For any $f \in \mathcal{I}$, the set $f^{-1}(0)$ is closed and $f^{-1}(0) \ne \emptyset$. Otherwise $1/f$ belongs to $C^0(X)$, hence $1 \in \mathcal{I}$, and $\mathcal{I}$ is not proper. Let $f_1, \dots, f_n \in \mathcal{I}$. The function $f := \sum_{i=1}^{n} f_i^2$ is in $\mathcal{I}$ and $f^{-1}(0) = \bigcap_i f_i^{-1}(0) \ne \emptyset$. Since $X$ is compact, $\bigcap \{f^{-1}(0) \mid f \in \mathcal{I}\} \ne \emptyset$. In particular, there is $x_0 \in X$ such that $f(x_0) = 0$ $\forall f \in \mathcal{I}$. On the other hand, $\{f \mid f(x_0) = 0\}$ is an ideal, hence $\mathcal{I} = \{f \mid f(x_0) = 0\}$. □
The spectrum of a commutative algebra with unity is then defined by
$$\operatorname{spec} R := \{\mathcal{I} \mid \mathcal{I} \text{ maximal ideal of } R\}.$$
Trivially, if $R$ is isomorphic to $C^0(X, \mathbb{R})$, $R \simeq C^0(X, \mathbb{R})$, then also the maximal ideals of $R$ and $C^0(X, \mathbb{R})$ are in one-to-one correspondence, hence by Lemma 11.17
$$\operatorname{spec} R \simeq \operatorname{spec} C^0(X) \simeq X.$$
To conclude the proof of Theorem 11.16, we need to introduce a topology on the space $\operatorname{spec} C^0(X, \mathbb{R})$ in such a way that $\operatorname{spec} C^0(X, \mathbb{R}) \simeq X$ becomes a topological isomorphism. For that, we notice that, if $\mathcal{I}$ is a maximal ideal of $C^0(X, \mathbb{R})$, then $C^0(X, \mathbb{R})/\mathcal{I} \simeq \mathbb{R}$, hence the so-called evaluation maps $f(\mathcal{I})$, that map $(f, \mathcal{I})$ into $[f] \in C^0(X, \mathbb{R})/\mathcal{I} \simeq \mathbb{R}$, have a sign. Now, if we fix the topology on $\operatorname{spec} R \simeq \operatorname{spec} C^0(X, \mathbb{R})$ by choosing as a basis of neighborhoods the finite intersections of
$$U(f) := \{\mathcal{I} \mid \mathcal{I} \text{ proper maximal ideal with } f(\mathcal{I}) > 0\},$$
it is not difficult to show that the isomorphism $X \to \operatorname{spec} C^0(X, \mathbb{R})$ is continuous. Since $X$ is compact and the points in $\operatorname{spec} C^0(X, \mathbb{R})$ are separated by open neighborhoods, it follows that the isomorphism is actually a topological isomorphism.

Theorem 11.16 has a stronger formulation that we shall not deal with, but that we want to state. A Banach space with a product which makes it an algebra such that $\|xy\| \le \|x\|\,\|y\|$ is called a Banach algebra. An involution on a Banach algebra $R$ is an operation $x \to x^*$ such that $(x + y)^* = x^* + y^*$, $(\lambda x)^* = \bar\lambda x^*$, $(xy)^* = y^* x^*$ and $(x^*)^* = x$. A Banach algebra with an involution satisfying $\|x^* x\| = \|x\|^2$ is called a C*-algebra. Examples of C*-algebras are: (i) the space of complex-valued continuous functions on a compact topological space with involution $f \to \bar f$, (ii) the space of linear bounded operators on a Hilbert space with the involution given by $A \to A^*$, $A^*$ being the adjoint of $A$. Again, the space of proper maximal ideals of a commutative C*-algebra, endowed with a suitable topology, is called the spectrum of the algebra.

Theorem (Gelfand-Naimark). A commutative C*-algebra with unity is isometrically isomorphic to the algebra of complex-valued continuous functions on its spectrum.
11.3 Ordinary Differential Equations

The Banach fixed point theorem in suitable spaces of continuous functions plays a key role in the study of existence, uniqueness and continuous dependence on the data of solutions of ordinary differential equations.
11.3.1 The Cauchy problem

Let $D$ be an open set in $\mathbb{R} \times \mathbb{R}^n$, $n \ge 1$, and let $F(t, y) : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ be a continuous function. A solution of the system of ordinary equations
$$\frac{d}{dt} x(t) = F(t, x(t)) \qquad (11.2)$$
is the data of an interval $]\alpha, \beta[ \subset \mathbb{R}$ and a function $x \in C^1(]\alpha, \beta[; \mathbb{R}^n)$ such that (11.2) holds for all $t \in ]\alpha, \beta[$. In particular, $(t, x(t))$ should belong to $D$ for all $t \in ]\alpha, \beta[$. Geometrically, if we interpret $F(t, x)$ as a vector field in $D$, then $x(t)$ is a solution of (11.2) if and only if its graph curve $t \to (t, x(t))$ is of class $C^1$, has trajectory in $D$, and velocity equal to $(1, F(t, x(t)))$ for all $t$. For this reason, solutions of (11.2) are called integral curves of the system.

a. Velocities of class $C^k(D)$

In the sequel, at times we need a fact that comes from the differential calculus for functions of several variables that we are not discussing in this volume. Let $\Omega \subset \mathbb{R}^n$ be an open set. We say that a function $f : \Omega \to \mathbb{R}$ is of class $C^k(\Omega)$, $k \ge 1$, if $f$ possesses continuous partial derivatives up to order $k$. One can prove that, if $f \in C^k(\Omega)$ and $\gamma : [a, b] \to \Omega$ is a $C^k$ curve in $\Omega$, then $f \circ \gamma : [a, b] \to \mathbb{R}$ is of class $C^k([a, b])$. For $k = 1$ we have the chain rule
$$\frac{d}{dt} f(\gamma(t)) = Df(\gamma(t))\,\gamma'(t) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(\gamma(t))\,\gamma_i'(t),$$
where $Df(x) := \big(\frac{\partial f}{\partial x_1}(x), \frac{\partial f}{\partial x_2}(x), \dots, \frac{\partial f}{\partial x_n}(x)\big)$ is the matrix of partial derivatives of $f$ and the product $Df(\gamma(t))\gamma'(t)$ is the standard matrix product. A trivial consequence is that integral curves, when they exist, possess one derivative more than the velocity field $F(t, x)$. This is true by definition if $F$ is merely continuous. If, moreover, $F(t, x) \in C^k$ and $x(t) \in C^1$, we successively find from the equation $x'(t) = F(t, x(t))$ that $x'(t) \in C^1$, $x'(t) \in C^2$, ..., $x'(t) \in C^k$. In particular, if $F(t, x)$ has continuous partial derivatives of any order, then the integral curves are $C^\infty$. It is worth noticing that if $F \in C^1(D)$, then by the chain rule
$$x''(t) = \frac{\partial F}{\partial t}(t, x(t)) + DF_x(t, x(t))\,x'(t),$$
where $DF_x$ is the matrix of partial derivatives with respect to the $x$ variables and the product $DF_x(t, x(t))x'(t)$ is understood as the matrix product. For the sequel, it is convenient to set the following definition.
11.18 Definition. We say that a function $F(t, x) : [\alpha, \beta] \times B(x_0, b) \to \mathbb{R}^n$ is Lipschitz in $x$ uniformly with respect to $t$ if there exists $L > 0$ such that
$$|F(t, x) - F(t, y)| \le L\,|x - y| \qquad \forall (t, x), (t, y) \in [\alpha, \beta] \times B(x_0, b). \qquad (11.3)$$
Let $D$ be an open set in $\mathbb{R} \times \mathbb{R}^n$. We say that a function $F(t, x) : D \to \mathbb{R}^n$ is locally Lipschitz in $x$ uniformly with respect to $t$ if for any $\widetilde{D} := [\alpha, \beta] \times B(x_0, b)$ strictly contained in $D$ there exists $L := L(\alpha, \beta, x_0, b)$ such that
$$|F(t, x) - F(t, y)| \le L\,|x - y| \qquad \forall (t, x), (t, y) \in \widetilde{D}.$$

11.19 ¶. Show that the function $f(t, x) = \operatorname{sgn}(t)|x|$, $(t, x) \in [-1, 1] \times [-1, 1]$, is Lipschitz in $x$ uniformly with respect to $t$.

11.20 ¶. Let $D = [a, b] \times [c, d]$ be a closed rectangle in $\mathbb{R} \times \mathbb{R}$. Show that, if for all $t \in [a, b]$, the function $x \to f(t, x)$ has derivative $f_x(t, x)$ on $[c, d]$ and $(t, x) \to f_x(t, x)$ is continuous in $D$, then $f$ is Lipschitz in $x$ uniformly with respect to $t$. [Hint: Use the mean value theorem.]

11.21 ¶. Show the following. Let $D$ be an open set of $\mathbb{R} \times \mathbb{R}^n$ and let $F(t, x) \in C^1(D)$. Then $F$ is locally Lipschitz in $x$ uniformly with respect to $t$. [Hint: For any $(t_0, x_0) \in D$, choose $a, b \in \mathbb{R}$ such that $\widetilde{D} := \{(t, x) \mid |t - t_0| \le a,\ |x - x_0| \le b\}$ is strictly contained in $D$. Then, for $(t, x_1), (t, x_2) \in \widetilde{D}$, consider the curve $\gamma(s) := (t, (1 - s)x_1 + s x_2)$, $s \in [0, 1]$, whose image is in $\widetilde{D}$, and apply the mean value theorem to $F(\gamma(s))$, $s \in [0, 1]$.]
b. Local existence and uniqueness

Assume $(t_0, x_0) \in D$. We seek a local solution $x(t) := (x_1(t), \dots, x_n(t)) \in C^1([t_0 - r, t_0 + r], \mathbb{R}^n)$ for some $r > 0$ of the Cauchy problem relative to the system (11.2), i.e.,
$$\begin{cases} \dfrac{d}{dt} x(t) = F(t, x(t)), \\ x(t_0) = x_0. \end{cases} \qquad (11.4)$$
We have the following.

11.22 Proposition. Let $D$ be an open set in $\mathbb{R} \times \mathbb{R}^n$, $n \ge 1$, and let $F(t, x) : D \to \mathbb{R}^n$ be a continuous function. Then $x(t) \in C^1([t_0 - r, t_0 + r], \mathbb{R}^n)$ solves (11.4) if and only if $x(t)$ belongs to $C^0([t_0 - r, t_0 + r], \mathbb{R}^n)$ and satisfies the integral equation
$$x(t) = x_0 + \int_{t_0}^{t} F(\tau, x(\tau))\,d\tau \qquad \forall t \in [t_0 - r, t_0 + r]. \qquad (11.5)$$
Proof. Set $I := [t_0 - r, t_0 + r]$. If $x \in C^1(I, \mathbb{R}^n)$ solves (11.4), then by integration $x$ satisfies (11.5). Conversely, if $x \in C^0(I, \mathbb{R}^n)$ satisfies (11.5), then, by the fundamental theorem of calculus, $x(t)$ is differentiable and $x'(t) = F(t, x(t))$ in $I$; in particular it has a continuous derivative. Moreover, (11.5) for $t = t_0$ yields $x(t_0) = x_0$. □
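Let us start with a local existence and uniqueness result.

11.23 Theorem (Picard-Lindelöf). Let $F(t, x) : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ be a continuous function with domain $D := \{(t, x) \in \mathbb{R} \times \mathbb{R}^n \mid |t - t_0| \le a,\ |x - x_0| \le b\}$. Suppose
(i) $F(t, x)$ is bounded in $D$, $|F(t, x)| \le M$,
(ii) $F(t, x)$ is Lipschitz in $x$ uniformly with respect to $t$,
$$|F(t, x) - F(t, y)| \le k\,|x - y| \qquad \forall (t, x), (t, y) \in D.$$
Then the Cauchy problem (11.4) has a unique solution in $[t_0 - r, t_0 + r]$ where
$$r < \min\Big\{a, \frac{b}{M}, \frac{1}{k}\Big\}.$$

Proof. Let $r$ be as in the claim and $I_r := [t_0 - r, t_0 + r]$. According to Proposition 11.22, we have to prove that the equation
$$x(t) = x_0 + \int_{t_0}^{t} F(\tau, x(\tau))\,d\tau$$
has a unique solution $x(t) \in C^0(I_r, \mathbb{R}^n)$. Let $y_1, y_2 \in C^0(I_r, \mathbb{R}^n)$ be two solutions of (11.5). Then for all $t \in I_r$
$$|y_1(t) - y_2(t)| \le \Big|\int_{t_0}^{t} |F(s, y_1(s)) - F(s, y_2(s))|\,ds\Big| \le k\,|t - t_0|\,\|y_1 - y_2\|_{\infty, I_r},$$
hence $\|y_1 - y_2\|_{\infty, I_r} \le kr\,\|y_1 - y_2\|_{\infty, I_r}$. Since $kr < 1$, then $y_1 = y_2$ in $I_r$.

To show existence, we show that the map $x \to Tx$ given by
$$T[x](t) := x_0 + \int_{t_0}^{t} F(\tau, x(\tau))\,d\tau$$
is a contraction on
$$X := \{x \in C^0(I_r, \mathbb{R}^n) \mid x(t_0) = x_0,\ |x(t) - x_0| \le b \ \forall t \in I_r\},$$
that is closed in $C^0(I_r, \mathbb{R}^n)$, hence a complete metric space. Clearly $t \to T[x](t)$ is a continuous function in $I_r$, $T[x](t_0) = x_0$ and
$$|T[x](t) - x_0| \le \Big|\int_{t_0}^{t} |F(\tau, x(\tau))|\,d\tau\Big| \le M\,|t - t_0| \le Mr \le b,$$
therefore $T$ maps $X$ into itself. Moreover, $T$ is a contraction; in fact
$$|T[x](t) - T[y](t)| \le \Big|\int_{t_0}^{t} |F(\tau, x(\tau)) - F(\tau, y(\tau))|\,d\tau\Big| \le k\,|t - t_0|\,\|x - y\|_{\infty} \le kr\,\|x - y\|_{\infty, I_r}.$$
The fixed point theorem of Banach, Theorem 9.128, yields a (actually, a unique) fixed point $T[x] = x$ in $X$. In other words, the equation (11.5) has a unique solution. □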
Taking into account the proof of the fixed point theorem we see that the solution $x(t)$ of (11.4) is the uniform limit of Picard's successive approximations
$$x_0(t) := x_0, \qquad\text{and, for } n \ge 1, \qquad x_n(t) := x_0 + \int_{t_0}^{t} F(\tau, x_{n-1}(\tau))\,d\tau.$$
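Picard's successive approximations can be computed exactly in simple cases. The sketch below treats the scalar problem $x' = x$, $x(0) = 1$ (an example chosen here for illustration), where each iterate is a polynomial and the integration can be carried out on the coefficients. The names `picard_step`, `picard_iterate` and `evaluate` are hypothetical.

```python
import math
from fractions import Fraction


def picard_step(coeffs):
    """One Picard step for x' = x, x(0) = 1.

    A polynomial is a list of coefficients, lowest degree first.
    Returns the coefficients of 1 + integral_0^t x_{n-1}(s) ds.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] += 1  # add the initial value x_0 = 1
    return integrated


def picard_iterate(n):
    """Return the n-th Picard approximation x_n as a coefficient list."""
    x = [Fraction(1)]  # x_0(t) := x_0 = 1
    for _ in range(n):
        x = picard_step(x)
    return x


def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return sum(c * t**k for k, c in enumerate(coeffs))


x3 = picard_iterate(3)  # coefficients 1, 1, 1/2, 1/6
approx = float(evaluate(picard_iterate(15), Fraction(1, 2)))
error = abs(approx - math.exp(0.5))
```

In this example the iterates are the Taylor partial sums of $e^t$, so the uniform convergence on bounded intervals is visible directly.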
The Picard-Lindelöf theorem allows us to discuss the uniqueness for the initial value problem (11.4).

11.24 Theorem (Uniqueness). Let $D \subset \mathbb{R} \times \mathbb{R}^n$ be a bounded domain, let $F(t, x) : D \to \mathbb{R}^n$ be a continuous function that is also locally Lipschitz in $x$ uniformly in $t$, and let $(t_0, x_0) \in D$. Then any two solutions $x_1 : I \to \mathbb{R}^n$, $x_2 : J \to \mathbb{R}^n$ defined, respectively, on open intervals $I$ and $J$ containing $t_0$ of the initial value problem
$$\begin{cases} x'(t) = F(t, x(t)), \\ x(t_0) = x_0, \end{cases}$$
are equal on $I \cap J$.

Proof. It is enough to assume $I \subset J$. Define
$$E := \{t \in I \mid x_1(t) = x_2(t)\}.$$
Obviously $t_0 \in E$ and $E$ is closed relative to $I$, as $x_1, x_2$ are continuous. We now prove that $E$ is open in $I$, concluding $E = I$ since $I$ is an interval, compare Chapter 5. Let $t^* \in E$, define $x^* := x_1(t^*) = x_2(t^*)$. Let $a, b \in \mathbb{R}_+$ be such that $\widetilde{D} := \{(t, x) \mid |t - t^*| \le a,\ |x - x^*| \le b\}$ is strictly contained in $D$. $F$ being bounded and locally Lipschitz in $x$ uniformly with respect to $t$ in $\widetilde{D}$, the Picard-Lindelöf theorem applies on $\widetilde{D}$. Since $x_1(t)$ and $x_2(t)$ both solve the initial value problem starting at $(t^*, x^*)$, we conclude that $x_1(t) = x_2(t)$ on a small interval around $t^*$. Thus $E$ is open. □
c. Continuation of solutions

We have seen that the initial value problem has a solution that exists on a possibly small interval. Does a solution in a larger interval exist? As we have seen, given two solutions $x_1 : I \to \mathbb{R}^n$, $x_2 : J \to \mathbb{R}^n$ of the same initial value problem, one can glue them together to form a new function $x : I \cup J \to \mathbb{R}^n$, that is again a solution of the same initial value problem but defined on a possibly larger interval. We say that $x$ is a continuation of both $x_1$ and $x_2$. Therefore, Theorem 11.24 allows us to define the maximal solution, or simply the solution, as the solution defined on the largest possible interval.

11.25 Lemma. Suppose that $F : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ is continuous in $D$, and let $x(t)$ be a solution of the initial value problem
$$\begin{cases} x'(t) = F(t, x(t)), \\ x(t_0) = x_0 \end{cases}$$
in the bounded interval $\gamma < t < \delta$; in particular $(t, x(t)) \in D$ $\forall t \in ]\gamma, \delta[$. If $F$ is bounded near $(\delta, x(\delta))$, then $x(t)$ can be continuously extended on $\delta$. Moreover, if $(\delta, x(\delta)) \in D$, then the extension is $C^1$ up to $\delta$. A similar result holds also at $(\gamma, x(\gamma))$.

Proof. Suppose that $|F(t, x)| \le M$ $\forall (t, x)$ and let $x(t)$, $t \in ]\gamma, \delta[$, be a solution. For $t_1, t_2 \in ]\gamma, \delta[$ we have
$$|x(t_2) - x(t_1)| \le \Big|\int_{t_1}^{t_2} |F(t, x(t))|\,dt\Big| \le M\,|t_1 - t_2|,$$
i.e., $x$ is Lipschitz on $]\gamma, \delta[$, therefore it can be continuously extended to $[\gamma, \delta]$. The second part of the claim follows from (11.5) to get for $t < \delta$
$$\frac{x(t) - x(\delta)}{t - \delta} = \frac{1}{t - \delta}\int_{\delta}^{t} F(s, x(s))\,ds$$
and letting $t \to \delta^-$. □
Now if, for instance, $(\delta, x(\delta))$ is not on the boundary of $D$ and we can solve the initial value problem with initial datum $x(\delta)$ at $t_0 = \delta$, we can continue the solution in the $C^1$ sense, because of Proposition 11.22, beyond the time $\delta$, thus concluding the following.

11.26 Theorem (Continuation of solutions). Let $F(t, x)$ be continuous in an open set $D \subset \mathbb{R} \times \mathbb{R}^n$ and locally Lipschitz in $x$ uniformly with respect to $t$. Then the unique (maximal) solution of $x'(t) = F(t, x(t))$ with $x(t_0) = x_0$ extends forwards and backwards till the closure of its graph eventually meets the boundary of $D$. More precisely, any (maximal) solution $x(t)$ is defined on an interval $]\alpha, \beta[$ with the following property: for any given compact set $K \subset D$, there is $\delta = \delta(K) > 0$ such that
$$(t, x(t)) \notin K \qquad\text{for } t \notin [\alpha + \delta, \beta - \delta].$$

Recalling Exercise 11.21, we get the following.

11.27 Corollary. Let $D$ be an open domain in $\mathbb{R} \times \mathbb{R}^n$, and let $F \in C^1(D)$. Then every (maximal) solution of $x'(t) = F(t, x(t))$ can be extended forwards and backwards till the closure of its graph eventually reaches $\partial D$.

11.28 Corollary. Let $D := ]a, b[ \times \mathbb{R}^n$ ($a$ and $b$ may be, respectively, $-\infty$ and $+\infty$) and let $F(t, x) : D \to \mathbb{R}^n$ be continuous and locally Lipschitz in $x$ uniformly with respect to $t$. Then every locally bounded (maximal) solution of $x' = F(t, x)$ is defined on the entire interval $]a, b[$.

Proof. Let $|x(t)| \le M$. Should the maximal solution be defined on $]\alpha, \beta[$ with, say, $\beta < b$, then the graph of $x$ would be contained in the compact set $[\alpha, \beta] \times \overline{B(0, M)}$ strictly contained in $]a, b[ \times \mathbb{R}^n$. This contradicts Theorem 11.26. □
Of course, if $F$ is bounded in $D := ]a, b[ \times \mathbb{R}^n$, all solutions of $x' = F(t, x)$ are automatically locally bounded since their velocities are bounded, so the previous corollary applies. For a weaker condition and stronger result, see Exercise 11.33.

11.29 Example. Consider the initial value problem $x' = x^2$, $x(0) = 1$, in $D_a := \{(t, x) \mid t \in \mathbb{R},\ |x| < a\}$. Since $|F| \le a^2$ in $D_a$, the continuation theorem applies. In fact, the maximal solution $1/(1 - t)$, $t \in ]-\infty, 1 - \frac{1}{a}[$, has a graph that extends backwards till $-\infty$ and forwards until it touches $\partial D_a$.
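d. Systems of higher order equations

We notice that a differential equation of order $n$ in normal form in the scalar unknown $x(t)$
$$\frac{d^n}{dt^n} x(t) = F\Big(t, x(t), \frac{d}{dt} x(t), \dots, \frac{d^{n-1}}{dt^{n-1}} x(t)\Big) \qquad (11.6)$$
can be written, by defining
$$x_1(t) := x(t), \qquad x_2(t) := \frac{d}{dt} x_1(t), \qquad \dots, \qquad x_n(t) := \frac{d}{dt} x_{n-1}(t),$$
as the first order system
$$\begin{cases} x_1'(t) = x_2(t), \\ \quad\vdots \\ x_{n-1}'(t) = x_n(t), \\ x_n'(t) = F(t, x_1(t), x_2(t), \dots, x_n(t)), \end{cases}$$
or, compactly, as $y'(t) = \mathbf{F}(t, y(t))$ for the vector-valued unknown $y(t) := (x_1(t), x_2(t), \dots, x_n(t))$ and $\mathbf{F} : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ given by
$$\mathbf{F}(t, x_1, \dots, x_n) := (x_2, x_3, \dots, x_n, F(t, x_1, x_2, \dots, x_n)).$$
Consequently, the Cauchy problem for (11.6) is
$$\begin{cases} x^{(n)}(t) = F(t, x(t), x'(t), x''(t), \dots, x^{(n-1)}(t)), \\ x(t_0) = x_0, \\ x'(t_0) = x_1, \\ x''(t_0) = x_2, \\ \quad\vdots \\ x^{(n-1)}(t_0) = x_{n-1}. \end{cases} \qquad (11.7)$$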
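The reduction of a higher order equation to a first order system is also how such equations are usually treated numerically. The following is a minimal sketch under stated assumptions: the classical fourth-order Runge-Kutta scheme is a technique chosen here for illustration, not one discussed in the text, and the names `to_first_order` and `rk4` are hypothetical. The test problem is $x'' = -x$, $x(0) = 0$, $x'(0) = 1$, whose solution is $\sin t$.

```python
import math


def to_first_order(f):
    """Turn x^(n) = f(t, x, x', ..., x^(n-1)) into y' = G(t, y).

    Here y = [x_1, ..., x_n] with x_1 = x and x_{i+1} = x_i'.
    """
    def G(t, y):
        # (x_2, x_3, ..., x_n, f(t, x_1, ..., x_n))
        return y[1:] + [f(t, *y)]
    return G


def rk4(G, t0, y0, t1, steps):
    """Integrate y' = G(t, y) from t0 to t1 with classical RK4 steps."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = G(t, y)
        k2 = G(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = G(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = G(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


# Second order equation x'' = -x written as a system for (x, x').
G = to_first_order(lambda t, x, v: -x)
x_end, v_end = rk4(G, 0.0, [0.0, 1.0], math.pi / 2, 200)
```

At $t = \pi/2$ the computed pair `(x_end, v_end)` should be close to $(\sin(\pi/2), \cos(\pi/2)) = (1, 0)$.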
Along the same line, the initial value problem for a system of higher order equations can be reformulated as a Cauchy problem for a system of first order equations, to which we can apply the theory just developed.

e. Linear systems

For linear systems
$$x'(t) = A(t)x(t) + g(t), \qquad (11.8)$$
where $A(t)$ is an $n \times n$ matrix and $g(t) \in \mathbb{R}^n$, we have the following.

11.30 Theorem. Suppose that $A(t)$ and $g(t)$ are continuous in $[a, b]$ and that $t_0 \in [a, b]$ and $x_0 \in \mathbb{R}^n$. Then the solution of (11.8) with initial value $x(t_0) = x_0$ exists on the entire interval $[a, b]$.

Proof. Assume for simplicity that $t_0 \in ]a, b[$. The field $F(t, x) := A(t)x + g(t)$ is continuous in $D := ]a, b[ \times \mathbb{R}^n$ and locally Lipschitz in $x$ uniformly with respect to $t$,
$$|F(t, x) - F(t, y)| \le \sup_{t \in [\alpha, \beta]} \|A(t)\|\,|x - y| \qquad \forall a < \alpha < \beta < b,\ \forall t \in [\alpha, \beta],\ \forall x, y \in \mathbb{R}^n.$$
Therefore, a solution of (11.8) exists in a small interval of time around $t_0$, according to the Picard-Lindelöf theorem. To show that the solution can be continued on the whole interval $]a, b[$, it suffices to show, according to Corollary 11.28, that $x(t)$ is bounded. In fact, we have
$$x(t) - x(t_0) = \int_{t_0}^{t} A(s)x(s)\,ds + \int_{t_0}^{t} g(s)\,ds.$$
For $t > t_0$ we then conclude that
$$|x(t)| \le |x_0| + \max|g|\,(b - a) + \sup_{t \in [a, b]} \|A(t)\| \int_{t_0}^{t} |x(s)|\,ds,$$
and the boundedness follows from Gronwall's inequality below. □
11.31 Proposition (Gronwall's inequality). Suppose that $k$ is a nonnegative constant and that $f$ and $g$ are two nonnegative continuous functions in $[\alpha, \beta]$ such that
$$f(t) \le k + \int_{\alpha}^{t} f(s)g(s)\,ds, \qquad t \in [\alpha, \beta].$$
Then
$$f(t) \le k \exp\Big(\int_{\alpha}^{t} g(s)\,ds\Big).$$

Proof. Set $U(t) := k + \int_{\alpha}^{t} f(s)g(s)\,ds$. Then we have
$$f(t) \le U(t), \qquad U'(t) = f(t)g(t) \le g(t)U(t), \qquad U(\alpha) = k,$$
in particular
$$\frac{d}{dt}\Big(U(t)\exp\Big(-\int_{\alpha}^{t} g(s)\,ds\Big)\Big) = \big(U'(t) - g(t)U(t)\big)\exp\Big(-\int_{\alpha}^{t} g(s)\,ds\Big) \le 0,$$
hence
$$U(t)\exp\Big(-\int_{\alpha}^{t} g(s)\,ds\Big) - U(\alpha) \le 0,$$
that is, $f(t) \le U(t) \le k \exp\big(\int_{\alpha}^{t} g(s)\,ds\big)$. □

Figure 11.3. Thomas Gronwall (1877-1932) and a page from one of his papers.
11.32 ¶. Let $w : [a, b] \to \mathbb{R}^n$ be of class $C^1([a, b])$. Assume that
$$|w'(t)| \le a(t)\,|w(t)| + b(t) \qquad \forall t \in [a, b],$$
where $a(t)$, $b(t)$ are nonnegative functions of class $C^0([a, b])$. Show that
$$|w(t)| \le \Big(|w(t_0)| + \Big|\int_{t_0}^{t} b(s)\,ds\Big|\Big)\exp\Big(\Big|\int_{t_0}^{t} a(s)\,ds\Big|\Big)$$
for every $t, t_0 \in [a, b]$. [Hint: Apply Gronwall's lemma to $f(t) := |w(t)|$.]

11.33 ¶. Let $F(t, x) : I \times \mathbb{R}^n \to \mathbb{R}^n$ be continuous and locally Lipschitz in $x$ uniformly with respect to $t$. Suppose that there exist nonnegative continuous functions $a(t)$ and $b(t)$ such that $|F(t, x)| \le a(t)|x| + b(t)$. Show that all the solutions of $x' = F(t, x)$ can be extended to the entire interval $I$.
f. A direct approach to Cauchy problem for linear systems

For the reader's convenience we shall give here a more direct approach to the uniqueness and existence of the solution of the initial value problem
$$\begin{cases} X'(t) = A(t)X(t) + F(t) \quad \forall t \in [a, b], \\ X(t_0) = X_0, \end{cases} \qquad t_0 \in [a, b], \qquad (11.9)$$
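where $X_0 \in \mathbb{R}^n$ and the functions $t \to A(t)$ and $t \to F(t)$ are given continuous functions defined in $[a, b]$ with values, respectively, in $M_{n,n}(\mathbb{C})$ and $\mathbb{C}^n$. Recall that $\|A(t)\| := \sup_{|x| = 1} |A(t)x|$ denotes the norm of the matrix $A(t)$ and set
$$M := \sup_{t \in [a, b]} \|A(t)\|.$$
As we have seen, see Proposition 11.22, $X(t)$, $t \in [a, b]$, solves (11.9) if and only if $t \to X(t)$ is of class $C^0([a, b])$ and solves the integral equation
$$X(t) = X_0 + \int_{t_0}^{t} \big(A(s)X(s) + F(s)\big)\,ds, \qquad (11.10)$$
that is, iff $X(t)$ is a fixed point for the map
$$T : X(t) \to T(X)(t) := X_0 + \int_{t_0}^{t} \big(A(s)X(s) + F(s)\big)\,ds. \qquad (11.11)$$
Let $\gamma > 0$. The function on $C^0([a, b], \mathbb{R}^n)$ defined by
$$\|X\|_\gamma := \sup_{t \in [a, b]} \big(|X(t)|\,e^{-\gamma|t - t_0|}\big)$$
is trivially a norm on $C^0([a, b])$. Moreover, it is equivalent to the uniform norm on $C^0([a, b], \mathbb{R}^n)$ since
$$e^{-\gamma|b - a|}\,\|X\|_{\infty, [a, b]} \le \|X\|_\gamma \le \|X\|_{\infty, [a, b]}.$$
Denote by $C_\gamma$ the space $C^0([a, b], \mathbb{R}^n)$ endowed with the norm $\|\cdot\|_\gamma$; it is a Banach space. For $X, Y \in C_\gamma$ and $t \in [a, b]$ we have
$$|TX(t) - TY(t)| = \Big|\int_{t_0}^{t} A(s)\big(X(s) - Y(s)\big)\,ds\Big| \le M\,\|X - Y\|_\gamma\,\Big|\int_{t_0}^{t} e^{\gamma|s - t_0|}\,ds\Big| \le \frac{M}{\gamma}\,\|X - Y\|_\gamma\,e^{\gamma|t - t_0|}.$$
Multiplying the last inequality by $e^{-\gamma|t - t_0|}$ and taking the sup norm gives
$$\|TX - TY\|_\gamma \le \frac{M}{\gamma}\,\|X - Y\|_\gamma.$$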
11.35 Theorem. The initial value problem (11.9) has a unique solution $X(t)$ of class $C^1([a, b])$, and
$$|X(t)| \le \Big(|X_0| + \Big|\int_{t_0}^{t} |F(s)|\,ds\Big|\Big)\exp\Big(\Big|\int_{t_0}^{t} \|A(s)\|\,ds\Big|\Big).$$
Moreover, $X(t)$ is the uniform limit in $C^0([a, b], \mathbb{R}^n)$ of the sequence $\{X_n(t)\}$ of functions defined inductively by
$$X_0(t) := X_0, \qquad X_{n+1}(t) := X_0 + \int_{t_0}^{t} \big(A(s)X_n(s) + F(s)\big)\,ds. \qquad (11.12)$$

Proof. Choose $\gamma > M$. Then $T : C_\gamma \to C_\gamma$ is a contraction map. Therefore, by the Banach fixed point theorem, $T$ has a unique fixed point. Going into its proof, we get the approximations. Finally, the estimate on $|X(t)|$ follows from (11.10) and the Gronwall Lemma. □
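11.36 Remark. In the special case $a = -\infty$, $b = +\infty$, $t_0 = 0$, $F(t) = 0$ $\forall t$ and $A(t) = A$ constant, (11.12) reduces to
$$X_n(t) = \Big(\sum_{k=0}^{n} \frac{A^k}{k!}\,t^k\Big) X_0,$$
hence the solution of the initial value problem for the homogeneous linear system with constant coefficients
$$\begin{cases} X'(t) = AX(t), \\ X(0) = X_0 \end{cases}$$
is
$$X(t) = \Big(\sum_{n=0}^{\infty} \frac{A^n}{n!}\,t^n\Big) X_0 =: \exp(tA)\,X_0 \qquad \forall t \in \mathbb{R},$$
uniformly on bounded sets of $\mathbb{R}$, and
$$|X(t)| \le |X_0|\exp\big(|t - t_0|\,\|A\|\big) \qquad \forall t \in \mathbb{R}.$$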
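The partial sums in Remark 11.36 are easy to evaluate numerically. The sketch below, with hypothetical helper names `mat_vec` and `exp_series_apply`, computes $X_n(t)$ for the $2 \times 2$ matrix $A$ with rows $(0, 1)$ and $(-1, 0)$ and $X_0 = (1, 0)$, an example chosen here because the exact solution of $X' = AX$ is then $(\cos t, -\sin t)$.

```python
import math


def mat_vec(A, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def exp_series_apply(A, t, x0, terms):
    """Return (sum_{k=0}^{terms} (tA)^k / k!) x0, i.e. the iterate X_terms(t)."""
    term = list(x0)   # current summand (tA)^k x0 / k!, starting with k = 0
    total = list(x0)
    for k in range(1, terms + 1):
        # (tA)^k x0 / k! = (t / k) * A * ((tA)^(k-1) x0 / (k-1)!)
        term = [t / k * c for c in mat_vec(A, term)]
        total = [s + c for s, c in zip(total, term)]
    return total


A = [[0.0, 1.0], [-1.0, 0.0]]
X = exp_series_apply(A, 1.0, [1.0, 0.0], 30)
```

The summands are built recursively, so no matrix powers or factorials are formed explicitly.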
g. Continuous dependence on data

We now show that the local solution $x(t; t_0, x_0)$ of the Cauchy problem
$$\begin{cases} x' = F(t, x), \\ x(t_0) = x_0 \end{cases}$$
depends continuously on the initial point $(t_0, x_0)$, and in fact is continuous in $(t, t_0, x_0)$.
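11.37 Theorem. Let $F(t, x)$ and $F_x(t, x)$ be bounded and continuous in a region $D$. Also suppose that in $D$ we have
$$|F(t, x)| \le M, \qquad |F_x(t, x)| \le k.$$
Then, for any $\epsilon > 0$ there exists $\delta > 0$ such that
$$|x(t; t_0, x_0) - x(\bar t; \bar t_0, \bar x_0)| < \epsilon$$
provided $|t - \bar t| < \delta$, $|t_0 - \bar t_0| < \delta$ and $|x_0 - \bar x_0| < \delta$, and $t$, $\bar t$ are in a common interval of existence $[\alpha, \beta]$.

Proof. Set $\phi(t) := x(t; t_0, x_0)$, $\psi(t) := x(t; \bar t_0, \bar x_0)$. From
$$\phi(t) = x_0 + \int_{t_0}^{t} F(s, \phi(s))\,ds, \qquad \psi(t) = \bar x_0 + \int_{\bar t_0}^{t} F(s, \psi(s))\,ds,$$
$$\int_{t_0}^{t} F(s, \phi(s))\,ds = \int_{\bar t_0}^{t} F(s, \phi(s))\,ds + \int_{t_0}^{\bar t_0} F(s, \phi(s))\,ds,$$
we infer
$$\phi(t) - \psi(t) = x_0 - \bar x_0 + \int_{\bar t_0}^{t} \big[F(s, \phi(s)) - F(s, \psi(s))\big]\,ds + \int_{t_0}^{\bar t_0} F(s, \phi(s))\,ds,$$
hence
$$|\phi(t) - \psi(t)| \le |x_0 - \bar x_0| + k\,\Big|\int_{\bar t_0}^{t} |\phi(s) - \psi(s)|\,ds\Big| + M\,|t_0 - \bar t_0| \le \delta + k\,\Big|\int_{\bar t_0}^{t} |\phi(s) - \psi(s)|\,ds\Big| + M\delta.$$
Gronwall's inequality then yields
$$|\phi(t) - \psi(t)| \le \delta(1 + M)\exp\big(k\,|t - \bar t_0|\big) \le \delta(1 + M)\exp\big(k(\beta - \alpha)\big).$$
Since
$$|\psi(t) - \psi(\bar t)| \le \Big|\int_{\bar t}^{t} |F(s, \psi(s))|\,ds\Big| \le M\,|t - \bar t| \le M\delta,$$
we conclude
$$|\phi(t) - \psi(\bar t)| \le |\phi(t) - \psi(t)| + |\psi(t) - \psi(\bar t)| \le \delta(1 + M)\exp\big(k(\beta - \alpha)\big) + \delta M$$
if $|t - \bar t| < \delta$. □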
11.38 ¶. Let $F(t, x)$ and $G(t, x)$ be as in Theorem 11.37, and let $\phi(t)$ and $\psi(t)$ be, respectively, solutions of the Cauchy problems
$$\begin{cases} x' = F(t, x), \\ x(t_0) = x_0 \end{cases} \qquad\text{and}\qquad \begin{cases} x' = G(t, x), \\ x(t_0) = \bar x_0. \end{cases}$$
Show that
$$|\phi(t) - \psi(t)| \le \big(|x_0 - \bar x_0| + \epsilon(\beta - \alpha)\big)\exp\big(k\,|t - t_0|\big)$$
if $|F(t, x) - G(t, x)| \le \epsilon$.
h. The Peano theorem

We shall now prove existence for the Cauchy problem (11.4) assuming only continuity of the velocity field $F(t, x)$. As we know, in this case we cannot have uniqueness, see Example 6.16 of [GM1].

11.39 Theorem (Peano). Let $F(t, x)$ be a bounded continuous function in a domain $D$, and let $(t_0, x_0)$ be a point in $D$. Then there exists at least one solution of
$$\begin{cases} x' = F(t, x), \\ x(t_0) = x_0. \end{cases}$$

Proof. Let $|F(t, x)| \le M$ and let $\widetilde{D} := \{(t, x) \in \mathbb{R} \times \mathbb{R}^n \mid |t - t_0| \le a,\ |x - x_0| \le b\}$ be strictly contained in $D$. If $r < \min\{a, b/M\}$ and $I_r := [t_0 - r, t_0 + r]$, we have seen that
$$T[x](t) := x_0 + \int_{t_0}^{t} F(\tau, x(\tau))\,d\tau$$
maps the closed and convex set
$$X := \{x \in C^0(I_r, \mathbb{R}^n) \mid x(t_0) = x_0,\ |x(t) - x_0| \le b \ \forall t \in I_r\}$$
in itself, see Theorem 11.23. The operator $T$ is continuous; in fact, since $F$ is uniformly continuous in $\widetilde{D}$, $\forall \epsilon > 0$ $\exists \eta$ such that
$$|F(t, x) - F(t, x')| < \epsilon \qquad \forall t \in I_r \quad\text{if } |x - x'| < \eta,$$
hence
$$|F(t, x_n(t)) - F(t, x_\infty(t))| < \epsilon \qquad \forall t \in I_r$$
for large enough $n$ if $x_n(t) \to x_\infty(t)$ uniformly. Then we have
$$\|T[x_n] - T[x_\infty]\|_\infty \le \sup_{t \in I_r}\Big|\int_{t_0}^{t} |F(s, x_n(s)) - F(s, x_\infty(s))|\,ds\Big| \le \epsilon\,r.$$
Moreover
$$|T[x](t') - T[x](t)| = \Big|\int_{t'}^{t} F(\tau, x(\tau))\,d\tau\Big| \le M\,|t - t'|,$$
and we conclude by the Ascoli-Arzelà theorem that $T : X \to X$ is compact. The Caccioppoli-Schauder theorem yields the existence of at least one fixed point $x(t)$, $x(t) = T[x](t)$; this concludes the proof. □
Notice that the solutions can be continued, cf. Lemma 11.25, possibly in a nonunique way. Therefore any solution can be continued as a solution forwards and backwards in time till the closure of the graph of the extension eventually meets the boundary of the domain $D$.

11.40 ¶ Comparison principle. Let $f : [a, b] \times \mathbb{R} \to \mathbb{R}$ be a function that is Lipschitz on each rectangle $[a, b] \times [-A, A]$ and let $\alpha(t), \beta(t)$ be two functions such that
$$\alpha(t) < \beta(t), \qquad \alpha'(t) < f(t, \alpha(t)), \qquad \beta'(t) > f(t, \beta(t)) \qquad \forall t \in [a, b].$$
Show that every solution of
$$\begin{cases} x'(t) = f(t, x(t)), \\ x(a) = x_0, \end{cases} \qquad \alpha(a) < x_0 < \beta(a),$$
satisfies $\alpha(t) < x(t) < \beta(t)$ $\forall t \in [a, b]$. In particular, there is a solution that is defined on the entire interval.

11.41 ¶ Peano's phenomenon. Consider the Cauchy problem
$$x'(t) = f(t, x(t)), \qquad x(t_0) = x_0 \qquad\text{in } [a, b], \qquad (11.13)$$
where $f(t, x)$ is a continuous function. Show that
(i) there exist a minimal and a maximal solution, i.e., solutions $\underline{x}(t)$ and $\overline{x}(t)$ of (11.13) such that for any other solution $x(t)$ of (11.13) we have $\underline{x}(t) \le x(t) \le \overline{x}(t)$,
(ii) if the minimal and the maximal solutions of (11.13) exist in $[t_0, t_0 + \delta]$, then through every point $(t, x)$ with $t \in [t_0, t_0 + \delta]$ and $x \in [\underline{x}(t), \overline{x}(t)]$ there passes a solution of (11.13).
[Hint: To show existence of a maximal solution, show that, if $x_n(t)$ solves $x' = f(t, x) + \frac{1}{n}$, then, possibly passing to a subsequence, $\{x_n\}$ converges to a maximal solution.]

11.42 ¶. Study the following Cauchy problem passing to polar coordinates $(\rho, \theta)$ X2(t)y/\x2(t)\
\x[{t)=xi(t)-
yjx\{t)+xl{ty
-'2{t)=X,{t)-^f^/^\,
y^x2(t)+x2(t)
[ x i ( 0 ) = l,
X2(0)=0.
11.3.2 Boundary value problems

For second order equations it is useful to consider, besides the initial value problem, so-called boundary value problems in which the values of $u$ or $u'$, or a combination of these values, are prescribed at the boundary of the interval. For instance, suppose we want to find the linear motion of a particle under the external force $F(t, x(t), x'(t))$ starting at time $t = 0$ in $x_0$ and ending at time $t = 1$ in $x_1$, i.e., we want to solve the Dirichlet problem
$$\begin{cases} x''(t) = F(t, x(t), x'(t)) \quad\text{in } ]0, 1[, \\ x(0) = x_0, \\ x(1) = x_1. \end{cases}$$

11.43 ¶. Check that the problem
$$x'' + x = 0 \ \text{ in } [0, t_1], \qquad x(0) = 0, \qquad x(t_1) = x_1$$
(i) has a unique solution if $t_1 \ne n\pi$, $n \in \mathbb{Z}$, and $x_1 \in \mathbb{R}$,
(ii) has infinitely many solutions if $t_1 = n\pi$, $n \in \mathbb{Z}$, and $x_1 = 0$,
(iii) has no solutions if $t_1 = n\pi$, $n \in \mathbb{Z}$, and $x_1 \ne 0$.
Discuss also the same problem for the equation $x'' + \lambda x = 0$.
11.44 Theorem. Let $F(t, x, y)$ be a continuous function in the domain $D := \{(t, x, y) \mid t \in [0, 1],\ |x| \le a,\ |y| \le a\}$. Moreover, suppose that $F(t, x, y)$ is Lipschitz in $(x, y)$ uniformly with respect to $t$, i.e., there exists $\mu > 0$ such that
$$|F(t, x_1, y_1) - F(t, x_2, y_2)| \le \mu\big(|x_1 - x_2| + |y_1 - y_2|\big)$$
for every $(t, x_1, y_1), (t, x_2, y_2) \in D$. Then for $|\lambda|$ sufficiently small the problem
$$\begin{cases} x'' = \lambda F(t, x, x'), \\ x(0) = x(1) = 0 \end{cases} \qquad (11.14)$$
has a unique solution $x(t) \in C^2([0, 1])$. Moreover $|x(t)| \le a$ and $|x'(t)| \le a$ $\forall t \in [0, 1]$.

Proof. If $x(t)$ solves $x'' = \lambda F(t, x(t), x'(t))$, then
$$x'(t) = A + \lambda \int_0^t F(\tau, x(\tau), x'(\tau))\,d\tau = A + \lambda \frac{d}{dt} \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau))\,d\tau,$$
and
$$x(t) = At + B + \lambda \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau))\,d\tau;$$
the boundary conditions yield
$$B = 0, \qquad A + \lambda \int_0^1 (1 - \tau) F(\tau, x(\tau), x'(\tau))\,d\tau = 0.$$
Thus, $x(t)$ is of class $C^2([0, 1])$ and solves (11.14) if and only if $x(t)$ is of class $C^1([0, 1])$ and solves
$$x(t) = \lambda \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau))\,d\tau - \lambda t \int_0^1 (1 - \tau) F(\tau, x(\tau), x'(\tau))\,d\tau. \qquad (11.15)$$
Now consider the class
$$X := \Big\{x \in C^1([0, 1]) \ \Big|\ x(0) = 0,\ \sup_{[0, 1]} |x'(t)| \le a\Big\}$$
endowed with the metric
$$d(x_1, x_2) := \sup_{t \in [0, 1]} |x_1'(t) - x_2'(t)|,$$
that is equivalent to the $C^1$ metric $\|x_1 - x_2\|_{\infty, [0, 1]} + \|x_1' - x_2'\|_{\infty, [0, 1]}$. It is easily seen that $(X, d)$ is a complete metric space and that the map $x(t) \to T[x](t)$ given by
$$T[x](t) := \lambda \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau))\,d\tau - \lambda t \int_0^1 (1 - \tau) F(\tau, x(\tau), x'(\tau))\,d\tau$$
maps $X$ into itself and is a contraction provided $|\lambda|$ is sufficiently small. The Banach fixed point theorem then yields a unique solution $x \in X$. On the other hand, (11.15) implies that any solution belongs to $X$ if $|\lambda|$ is sufficiently small, hence the solution is unique. □
a. The shooting method

A natural approach to show existence of scalar solutions to the boundary value problem
$$\begin{cases} x'' = F(t, x, x') \quad\text{in } ]0, \bar t[, \\ x(0) = 0, \\ x(\bar t) = \bar x \end{cases} \qquad (11.16)$$
consists in showing first existence of solutions $y(t, \lambda)$ of the initial value problem
$$\begin{cases} y'' = F(t, y, y') \quad\text{in } [0, \bar t], \\ y(0) = 0, \\ y'(0) = \lambda, \end{cases} \qquad (11.17)$$
defined in the interval $[0, \bar t]$, and then showing that the scalar equation
$$y(\bar t, \lambda) = \bar x$$
has at least a solution $\lambda$; in this case the function $y(t,\lambda)$ clearly solves (11.16). Since $y(\bar t,\lambda)$ is continuous in $\lambda$ by Theorem 11.37, to solve the last equation it suffices to show that there are values $\lambda_1$ and $\lambda_2$ such that $y(\bar t,\lambda_1) < \bar x < y(\bar t,\lambda_2)$. This approach is usually referred to as the shooting method, introduced in 1905 by Carlo Severini (1872-1951).

11.45 Theorem. Let $F(t,x,y)$ be a continuous function in a domain $D$. The problem (11.16) has at least a solution, provided that $\bar t$ and $\bar x/\bar t$ are sufficiently small.

Proof. Suppose $|F(t,x,y)| \le M'$; choose $M > M'$ and a sequence of Lipschitz functions $F_k(t,x,y)$ that converge uniformly to $F(t,x,y)$ with
$$|F_k(t,x,y)| \le M \qquad \forall k,\ \forall t,x,y.$$
Problem (11.17) for $F_k$ transforms into the Cauchy problem for the first order system
$$\begin{cases} z'(t) = G_k(t,z(t)),\\ z(0) = (0,\lambda) \end{cases} \tag{11.18}$$
where $z(t) = (x(t),y(t))$ and $G_k(t,z) = (y, F_k(t,x,y))$. Now if $b > 0$ is chosen so that $D := \{(t,z) \mid |t| < a \text{ and } |z - (0,\lambda)| < b\}$ is in the domain of $G_k(t,z)$, and we proceed as in the proof of Peano's theorem, we find a solution $z_{k,\lambda}$ of (11.18) defined in $[0,\bar t]$ with $\bar t < \min$
L,
-^
1.
(11.19)
Since $G_k$ is a Lipschitz function, $z_{k,\lambda}$ is in fact the unique solution of (11.18) and depends continuously on $\Lambda := (0,\lambda)$. If $x_{k,\lambda}(t)$ is the first component of $z_{k,\lambda}$, we have, see Theorem 11.44,
$$x_{k,\lambda}(t) = \lambda t + \int_0^t (t-\tau)\,F_k(\tau, x_{k,\lambda}(\tau), x'_{k,\lambda}(\tau))\,d\tau,$$
11.3 Ordinary Differential Equations
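hence $\lambda\bar t - \bar t^2 M \le x_{k,\lambda}(\bar t) \le \lambda\bar t + \bar t^2 M$ and in particular,
$$x_{k,\lambda}(\bar t) < \bar x \quad\text{if } \lambda\bar t + \bar t^2 M < \bar x, \qquad x_{k,\lambda}(\bar t) > \bar x \quad\text{if } \lambda\bar t - \bar t^2 M > \bar x. \tag{11.20}$$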
It follows from (11.19) that the assumptions in (11.20) hold for two values of $\lambda$ if $\bar t$ and $\bar x/\bar t$ are small enough, concluding that there is a solution $x_k \in C^2([0,\bar t])$ to the boundary value problem
$$\begin{cases} x_k''(t) = F_k(t, x_k, x_k'),\\ x_k(0) = 0,\\ x_k(\bar t) = \bar x. \end{cases} \tag{11.21}$$
As in Theorem 11.44, we see that the family $\{x_k(t)\}$ is equibounded with equicontinuous derivatives, thus, by the Ascoli-Arzelà theorem, a subsequence converges to $x$ in the space $C^1([0,\bar t])$, and passing to the limit in the integral form of (11.21), we see actually that $x \in C^2([0,\bar t])$ and solves (11.16) in $[0,\bar t]$. □
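The shooting strategy lends itself to a small numerical experiment. The sketch below is only illustrative and is not part of the proof: it integrates the initial value problem with a classical Runge-Kutta step and then bisects on the initial slope until the end value hits the target. The function names, the step count and the test equation $x'' = -x$ are assumptions chosen for the example.

```python
import math

def integrate_ivp(F, slope, t_end, steps=400):
    """Integrate y'' = F(t, y, y'), y(0) = 0, y'(0) = slope with classical RK4; return y(t_end)."""
    h = t_end / steps
    t, y, v = 0.0, 0.0, slope
    for _ in range(steps):
        k1y, k1v = v, F(t, y, v)
        k2y, k2v = v + 0.5 * h * k1v, F(t + 0.5 * h, y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = v + 0.5 * h * k2v, F(t + 0.5 * h, y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = v + h * k3v, F(t + h, y + h * k3y, v + h * k3v)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return y

def shoot(F, t_end, target, lam_lo, lam_hi, tol=1e-12):
    """Bisection on the initial slope lambda so that y(t_end, lambda) = target."""
    g_lo = integrate_ivp(F, lam_lo, t_end) - target
    g_hi = integrate_ivp(F, lam_hi, t_end) - target
    if g_lo * g_hi > 0:
        raise ValueError("the two slopes do not bracket the target")
    while lam_hi - lam_lo > tol:
        mid = 0.5 * (lam_lo + lam_hi)
        g_mid = integrate_ivp(F, mid, t_end) - target
        if g_lo * g_mid <= 0:
            lam_hi = mid              # the sign change lies in [lam_lo, mid]
        else:
            lam_lo, g_lo = mid, g_mid  # the sign change lies in [mid, lam_hi]
    return 0.5 * (lam_lo + lam_hi)

# Illustrative problem (an assumption): x'' = -x, x(0) = 0, x(1) = 0.5.
# Here y(t, lambda) = lambda * sin(t), so the exact slope is 0.5 / sin(1).
lam = shoot(lambda t, y, v: -y, 1.0, 0.5, 0.0, 2.0)
```

The two starting slopes play the role of $\lambda_1$ and $\lambda_2$ in the text: bisection only needs the end values to lie on opposite sides of the target and the end value to depend continuously on the slope.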
b. A maximum principle
Let $u \in C^2(]0,1[) \cap C^0([0,1])$; here $[0,1]$ can be replaced by any bounded interval. If $u$ has a local maximum point $x_0$ in the interior of $[0,1]$, then
$$u'(x_0) = 0 \quad\text{and}\quad u''(x_0) \le 0. \tag{11.22}$$
If, moreover, $u$ satisfies the differential inequality
$$u'' + b(x)u' > 0, \tag{11.23}$$
then clearly (11.22) does not hold at points of $]0,1[$, thus the maximum of $u$ is at $0$ or $1$, that is, at the boundary of $[0,1]$. If we allow the nonstrict inequality
$$u'' + b(x)u' \ge 0,$$
the constant functions, which have a maximum at every point, are allowed; but this is the only exception. In fact, we have the following.

11.46 Theorem (Maximum principle). Let $u$ be a function of class $C^2(]x_1,x_2[) \cap C^0([x_1,x_2])$ that satisfies the differential inequality
$$u'' + b(x)u' \ge 0 \qquad\text{in } ]x_1,x_2[$$
where $b(x)$ is a function that is bounded below. Then $u$ is constant, if it has an interior maximum point.

Proof. By contradiction, suppose $x_0 \in ]x_1,x_2[$ is an interior maximum point and $u$ is not constant, so that there is $\bar x$ such that $u(\bar x) < u(x_0)$. Assume for instance $\bar x \in ]x_0,x_2[$ and consider the function
$$z(x) := e^{\alpha(x-x_0)} - 1, \qquad x \in [x_1,x_2],$$
where $\alpha$ is a positive constant to be chosen. Trivially $z(x) < 0$ in $]x_1,x_0[$, $z(x_0) = 0$, $z(x) > 0$ in $]x_0,x_2[$ and
$$z'' + b(x)z' = (\alpha^2 + b(x)\alpha)\,e^{\alpha(x-x_0)} > 0 \qquad\text{in } [x_1,x_2]$$
if $\alpha > \max(0, -\inf_{x\in[x_1,x_2]} b(x))$. Also consider the function
$$w(x) := u(x) + \epsilon z(x)$$
where $\epsilon > 0$ has to be chosen. We have $w(x_0) = u(x_0)$, $w(x) < u(x) \le u(x_0) = w(x_0)$ for $x < x_0$, and $w(\bar x) = u(\bar x) + \epsilon z(\bar x) < u(x_0)$ if $\epsilon < \frac{u(x_0) - u(\bar x)}{z(\bar x)}$. With the previous choices of $\alpha$ and $\epsilon$, the function $w$ has an interior maximum point in $]x_1,x_2[$, but $w'' + b(x)w' > 0$: a contradiction. □

11.47 ¶. In the previous proof, $z(x) := e^{\alpha(x-x_0)} - 1$ is one of the possible choices. Show for instance that $z(x) := (x-x_1)^\alpha - (x_0-x_1)^\alpha$ does it as well.
11.48 Theorem. Let $u \in C^2(]x_1,x_2[) \cap C^1([x_1,x_2])$ be a nonconstant solution of the differential inequality
$$u''(x) + b(x)u'(x) \ge 0 \qquad\text{in } ]x_1,x_2[$$
where $b(x)$ is bounded from below. Then $u'(x_1) < 0$ if $u$ has a maximum value at $x_1$, and $u'(x_2) > 0$ if $u$ has maximum value at $x_2$.

Proof. As in Theorem 11.46 we find $w'(x_1) = u'(x_1) + \epsilon\alpha \le 0$ if $w$ has maximum value at $x_1$. □
Similarly we get the following.

11.49 Theorem (Maximum principle). Let $b(x)$ and $c(x)$ be two functions with $b(x)$ bounded from below and $c(x) \le 0$ in $[x_1,x_2]$. Suppose that $u \in C^2(]x_1,x_2[) \cap C^1([x_1,x_2])$ satisfies the differential inequality
$$u'' + b(x)u'(x) + c(x)u \ge 0 \qquad\text{in } ]x_1,x_2[.$$
Then
(i) either $u$ is constant or $u$ has no nonnegative maximum at an interior point,
(ii) if $u$ is not constant and has nonnegative maximum at $x_1$ (respectively, at $x_2$), then $u'(x_1) < 0$ (respectively, $u'(x_2) > 0$).

An immediate consequence is the following comparison and uniqueness theorem for the Dirichlet boundary value problem for linear second order equations.

11.50 Theorem (Comparison principle). Let $u_1$ and $u_2$ be two functions in $C^2(]x_1,x_2[) \cap C^0([x_1,x_2])$ that solve the differential equation
$$u''(x) + b(x)u'(x) + c(x)u(x) = f(x)$$
where $b$, $c$ and $f$ are bounded functions and $c(x) \le 0$.
(i) If $u_1 \ge u_2$ at $x_1$ and $x_2$, then $u_1 \ge u_2$ in $[x_1,x_2]$,
(ii) if $u_1 = u_2$ at $x_1$ and $x_2$, then $u_1 = u_2$ in $[x_1,x_2]$.

11.51 ¶. Add details to the proofs of Theorems 11.49 and 11.50. By considering the equations $u'' + u = 0$ and $u'' - u = 0$ show that Theorem 11.49 is optimal.
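c. The method of super- and sub-solutions
Consider the boundary value problem
$$\begin{cases} -u'' + \lambda u = f(x) & \text{in } ]0,1[,\\ u(0) = u(1) = 0. \end{cases} \tag{11.24}$$
The comparison principle, Theorem 11.50, says that it has at most one solution if $\lambda \ge 0$, and, since we know the general integral, (11.24) has a unique solution. Let $G$ be the Green operator that maps $f \in C^0([0,1])$ to the unique $C^2([0,1])$ solution of (11.24). $G$ is trivially continuous; since $C^2([0,1])$ embeds into $C^0([0,1])$ compactly, $G$ is compact from $C^0([0,1])$ into $C^0([0,1])$; finally, by the maximum principle, $G$ is monotone: if $f \le g$, then $Gf \le Gg$. Consider now the boundary value problem
$$\begin{cases} -u'' = f(x,u),\\ u(0) = u(1) = 0 \end{cases}$$
where we assume $f : [0,1] \times \mathbb{R} \to \mathbb{R}$ to be continuous, differentiable in $u$ for every fixed $x$, with $f_u(x,u)$ continuous and bounded, $|f_u(x,u)| \le k$ $\forall (x,u) \in [0,1] \times \mathbb{R}$. By choosing $\lambda$ sufficiently large, we see that $f(x,u) + \lambda u$ is increasing in $u$ and we may apply to the problem
$$\begin{cases} -u'' + \lambda u = f(x,u) + \lambda u,\\ u(0) = u(1) = 0 \end{cases} \tag{11.25}$$
the argument in Theorem 11.46, inferring that, if $\underline{u}$ and $\overline{u}$ are, respectively, a subsolution and a supersolution for $-u'' = f(x,u)$, i.e.,
$$\begin{cases} -\underline{u}'' \le f(x,\underline{u}),\\ \underline{u}(0),\ \underline{u}(1) \le 0,\\ \underline{u} \in C^2([0,1]), \end{cases} \qquad \begin{cases} -\overline{u}'' \ge f(x,\overline{u}),\\ \overline{u}(0),\ \overline{u}(1) \ge 0,\\ \overline{u} \in C^2([0,1]), \end{cases}$$
then setting $Tu := G(f(x,u(x)) + \lambda u(x))$ and
$$\begin{cases} u_0 := \underline{u},\\ u_{n+1} := Tu_n, \end{cases} \qquad \begin{cases} v_0 := \overline{u},\\ v_{n+1} := Tv_n \end{cases} \qquad\text{for } n \ge 0,$$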
the sequences $\{u_n\}$ and $\{v_n\}$ converge uniformly to a solution of $u = Tu$, i.e., to a function of class $C^2$ that solves (11.25). Hence we conclude
Figure 11.4. Two pages from a paper by Sergei Bernstein (1880-1968).
11.52 Theorem. Let $f(x,u)$ be a smooth function with $|f_u(x,u)| \le k$ $\forall (x,u)$. Assume that there exist a subsolution and a supersolution for
$$\begin{cases} -u'' = f(x,u) & \text{in } [0,1],\\ u(0) = u(1) = 0. \end{cases}$$
Then there also exists a solution.

We also have the following.

11.53 Theorem. Let $f(t,x) : [0,+\infty[ \times \mathbb{R} \to \mathbb{R}$ be a function of class $C^1$ that is periodic of period $p$ in $t$. If the equation $x''(t) = f(t,x(t))$ has a subsolution $\underline{x}(t)$ and a supersolution $\overline{x}(t)$ that are periodic of period $p$ with $\underline{x}(t) \le \overline{x}(t)$ for all $t$, then it has also a solution in between, of period $p$.

11.54 ¶. Prove Theorem 11.53. [Hint: Follow the following scheme.
(i) Choose $M$ so that $f(t,x) - Mx$ is decreasing.
(ii) Inductively define a sequence of $p$-periodic functions by $x_0(t) := \underline{x}(t)$ and $x_{n+1}(t)$, $n \ge 0$, as solution of
$$x''_{n+1}(t) - Mx_{n+1}(t) = f(t,x_n(t)) - Mx_n(t).$$
(iii) Show that $x_n(t) \le x_{n+1}(t) \le \overline{x}(t)$.
(iv) Show that the sequences $\{x''_n\}$ and $\{x'_n\}$ are equibounded, in particular $\{x_n\}$ and $\{x'_n\}$ have subsequences that converge, and actually that $\{x_n\}$, $\{x'_n\}$ and $\{x''_n\}$ converge uniformly to $x_\infty$, $x'_\infty$ and $x''_\infty$.
(v) Finally, show that $x_\infty$ is the solution we are looking for.]
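The monotone iteration behind Theorem 11.52 can be tried numerically. The following is a minimal finite-difference sketch, not the argument of the text: the Green operator of $-w'' + \lambda w$ with zero boundary values is replaced by a tridiagonal solve, and the data $f(x,u) = \cos u$, the subsolution $0$, the supersolution $x(1-x)/2$ and $\lambda = 1$ are assumptions made only for illustration.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] and upper[i] are the off-diagonal entries of row i."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def green_step(u, f, lam, n):
    """One step u -> T u: solve -w'' + lam*w = f(x, u) + lam*u, w(0) = w(1) = 0, on n interior nodes."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    rhs = [f(x, ui) + lam * ui for x, ui in zip(xs, u)]
    off = -1.0 / h ** 2
    main = 2.0 / h ** 2 + lam
    return solve_tridiagonal([off] * n, [main] * n, [off] * n, rhs)

def monotone_iteration(f, lam, sub, sup, n=99, iterations=40):
    """Iterate T starting from a subsolution (u) and from a supersolution (v)."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    u = [sub(x) for x in xs]
    v = [sup(x) for x in xs]
    for _ in range(iterations):
        u, v = green_step(u, f, lam, n), green_step(v, f, lam, n)
    return xs, u, v

# Illustrative data (assumptions): -u'' = cos(u), u(0) = u(1) = 0,
# subsolution 0, supersolution x(1 - x)/2, lam = 1 so that cos(u) + lam*u is increasing.
xs, u, v = monotone_iteration(lambda x, w: math.cos(w), 1.0,
                              lambda x: 0.0, lambda x: 0.5 * x * (1.0 - x))
gap = max(abs(a - b) for a, b in zip(u, v))
```

In this example the lower and upper sequences squeeze the same discrete solution, so `gap` measures how far apart the two iterates still are after the chosen number of steps.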
d. A theorem by Bernstein
We conclude our excursus in the field of ODEs by the following result.

11.55 Theorem (Bernstein). Let $F(x,u,p) : [a,b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a continuous function such that
(i) there exists $M > 0$ such that $uF(x,u,0) > 0$ if $|u| > M$,
(ii) there exist continuous nonnegative functions $a(x,u)$ and $b(x,u)$ such that
$$|F(x,u,p)| \le a(x,u)|p|^2 + b(x,u) \qquad \forall (x,u,p) \in [a,b] \times \mathbb{R} \times \mathbb{R}.$$
Then the problem
$$\begin{cases} u'' = F(x,u,u') & \text{in } ]a,b[,\\ u(a) = u(b) = 0 \end{cases}$$
has a solution.
The original theorem* by Bernstein, instead of (i), requires the stronger assumption that $F$ be of class $C^1$ and for some positive constant $k$ one has $F_u(x,u,p) \ge k > 0$ for all $(x,u,p)$. Its proof uses the shooting method. We shall instead use Schaefer's theorem, Theorem 9.142.

Proof. As we have seen, the operator that maps every $v \in C^1([a,b])$ into the solution of the problem
$$\begin{cases} u'' = F(x,v(x),v'(x)),\\ u(a) = 0,\quad u(b) = 0 \end{cases}$$
is compact. Therefore, according to Schaefer's theorem, it suffices to show that, under the assumptions of Theorem 11.55, there exists $r > 0$ such that, whenever the function $v \in C^2([a,b])$ solves
$$\begin{cases} v'' = \lambda F(x,v,v'),\\ v(a) = v(b) = 0 \end{cases}$$
for some $\lambda \in [0,1]$, then $\|v\|_{C^2([a,b])} < r$.

ESTIMATE OF $\|v\|_\infty$. Let $x_0$ be a maximum point for $v^2(x)$. We may assume $x_0 \in ]a,b[$, otherwise $v = 0$; therefore we have $v'(x_0) = 0$ and
$$0 \ge \frac{d^2 v^2}{dx^2}\Big|_{x=x_0} = 2v'^2(x_0) + 2v(x_0)v''(x_0) = 2\lambda v(x_0)F(x_0,v(x_0),0);$$
the assumption (i) then implies $|v(x_0)| \le M$, hence $\|v\|_\infty \le M$.

ESTIMATE OF $\|v'\|_\infty$. Let $\mu$ be a positive constant and let $A$ and $B$ be bounds for $a(x,u)$ and $b(x,u)$ when $x \in [a,b]$ and $|u| \le M$. Multiplying the equation for $v$ by $e^{-\mu v}$ we find
$$(v'e^{-\mu v})' = \big(\lambda F(x,v,v') - \mu v'^2\big)e^{-\mu v} \le \big((\lambda A - \mu)v'^2 + \lambda B\big)e^{-\mu v},$$
hence
$$(v'e^{-\mu v})' \le \lambda B e^{\mu M}$$
if $\mu \ge A$. Similarly, multiplying the equation for $v$ by $e^{\mu v}$, we find
$$(v'e^{\mu v})' \ge -\lambda B e^{\mu M}$$

* S. N. Bernstein, Sur les équations du calcul des variations, Ann. Sci. Éc. Norm. Sup. Paris 29 (1912) 481-485.
Figure 11.5. Ivar Fredholm (1866-1927) and a page from one of his papers.
if $\mu \ge A$. Since $v'$ vanishes at some point in $]a,b[$, integrating we deduce for all $x \in [a,b]$
$$-\lambda B e^{2\mu M}(b-a) \le v'(x) \le \lambda B e^{2\mu M}(b-a),$$
where we have used that $\|v\|_\infty \le M$ by the first step; therefore
$$\|v'\|_\infty \le c(A,B,M).$$

ESTIMATE OF $\|v''\|_\infty$. This is now trivial, since from the equation we have
$$|v''(x)| \le |F(x,v(x),v'(x))| \le c(M),$$
$F$ being continuous in $[a,b] \times [-M,M] \times [-c(A,B,M), c(A,B,M)]$. □
11.4 Linear Integral Equations

11.4.1 Some motivations
In several instances we have encountered integral equations, as convolution operators or, when solving linear equations, as integral equations of the type
$$y(t) = y_0 + \int_{t_0}^t f(s,y(s))\,ds;$$
for instance, the linear system $x'(t) = A(t)x(t)$ can be written as
$$x(t) = x(t_0) + \int_{t_0}^t A(s)x(s)\,ds. \tag{11.26}$$
(11.26) is an example of Volterra's equation, i.e., of equations of the form
$$f(t) = a\,x(t) + \int_0^t k(t,\tau)x(\tau)\,d\tau. \tag{11.27}$$
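a. Integral form of second order equations
The equation $x''(t) - A(t)x(t) = 0$, $t \in [\alpha,\beta]$, can be written as a Volterra equation. In fact, integrating, we get
$$x'(t) = c_1 + \int_{t_0}^t A(s)x(s)\,ds$$
and, integrating again,
$$\begin{aligned} x(t) &= c_0 + c_1(t-t_0) + \int_{t_0}^t \Big(\int_{t_0}^\tau A(s)x(s)\,ds\Big)d\tau \\ &= c_0 + c_1(t-t_0) + \int_{t_0}^t (t-s)A(s)x(s)\,ds \\ &=: F(t) + \int_{t_0}^t G(t,s)x(s)\,ds, \end{aligned} \tag{11.28}$$
with $F(t) := c_0 + c_1(t-t_0)$ and $G : [\alpha,\beta] \times [\alpha,\beta] \to \mathbb{R}$ given by
$$G(t,s) := \begin{cases} (t-s)A(s) & \text{if } s \le t,\\ 0 & \text{if } s > t. \end{cases}$$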
b. Materials with memory
Hooke's law states that the actual stress $\sigma$ is proportional to the actual strain $\epsilon$. At the end of the 1800s, Boltzmann and Volterra observed that the past history of the deformations of the body cannot always be neglected. In these cases the actual stress $\sigma$ depends not only on the actual strain, but on the whole of the deformations the body was subjected to in the past, hence at every instant $t$
$$\sigma(t) = a\,\epsilon(t) + F[\epsilon(\tau)]_0^t,$$
where $F$ is a functional depending on all values of $\epsilon(\tau)$, $0 \le \tau \le t$. In the linear context, Volterra proposed the following analytical model for $F$,
$$F[\epsilon(\tau)]_0^t = \int_0^t k(t,\tau)\epsilon(\tau)\,d\tau.$$
This leads to the study of equations of the type
$$\sigma(t) = a\,\epsilon(t) + \int_0^t k(t,\tau)\epsilon(\tau)\,d\tau,$$
that are called Volterra's integral equations of first and second kind according to whether $a = 0$ or $a \ne 0$.

c. Boundary value problems
Consider the boundary value problem
$$\begin{cases} x'' - A(t)x = 0,\\ x(0) = a,\quad x(L) = b. \end{cases} \tag{11.29}$$
From (11.28) we infer
$$x(t) = c_1 + c_2t + \int_0^t (t-s)A(s)x(s)\,ds$$
and, taking into account the boundary conditions,
$$c_1 = a, \qquad c_2 = \frac{b-a}{L} - \frac{1}{L}\int_0^L (L-s)A(s)x(s)\,ds,$$
we conclude that
$$\begin{aligned} x(t) &= a + \frac{b-a}{L}t - \frac{t}{L}\int_0^L (L-s)A(s)x(s)\,ds + \int_0^t (t-s)A(s)x(s)\,ds \\ &= a + \frac{b-a}{L}t - \int_0^t \frac{s(L-t)}{L}A(s)x(s)\,ds - \int_t^L \frac{t(L-s)}{L}A(s)x(s)\,ds. \end{aligned}$$
In other words, $x(t)$ solves (11.29) if and only if $x(t)$ solves the integral equation, called Fredholm equation,
$$x(t) = F(t) + \int_0^L G(t,s)x(s)\,ds$$
where $F(t) := a + \frac{b-a}{L}t$ and $G : [0,L] \times [0,L] \to \mathbb{R}$ is given by
$$G(t,s) := \begin{cases} -\dfrac{s(L-t)}{L}A(s) & \text{if } s \le t,\\[2mm] -\dfrac{t(L-s)}{L}A(s) & \text{if } t < s. \end{cases}$$
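The equivalence between the boundary value problem (11.29) and its Fredholm form can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the kernel as $-s(L-t)A(s)/L$ for $s \le t$ and $-t(L-s)A(s)/L$ for $t < s$ (the form obtained by carrying out the substitution of $c_1$, $c_2$), discretizes the integral with the trapezoidal rule, and iterates $x \leftarrow F + \int G x$. The choice $A \equiv 1$, $L = 1$, $a = 1$, $b = 2$ and all function names are hypothetical.

```python
import math

def green_kernel(t, s, L, A):
    """Kernel of the Fredholm form of x'' - A(t) x = 0, x(0) = a, x(L) = b."""
    if s <= t:
        return -s * (L - t) / L * A(s)
    return -t * (L - s) / L * A(s)

def solve_bvp_by_iteration(A, a, b, L, n=200, iterations=60):
    """Iterate x <- F + integral of G x (trapezoidal rule); converges when the kernel is small."""
    h = L / n
    ts = [i * h for i in range(n + 1)]
    w = [h] * (n + 1)
    w[0] = w[-1] = 0.5 * h
    F = [a + (b - a) * t / L for t in ts]
    K = [[green_kernel(t, s, L, A) * wj for s, wj in zip(ts, w)] for t in ts]
    x = F[:]
    for _ in range(iterations):
        x = [Fi + sum(kij * xj for kij, xj in zip(row, x)) for Fi, row in zip(F, K)]
    return ts, x

# Illustrative case (assumption): A = 1 and L = 1, whose exact solution is
# a*cosh(t) + (b - a*cosh(1)) * sinh(t) / sinh(1).
a, b = 1.0, 2.0
ts, x = solve_bvp_by_iteration(lambda s: 1.0, a, b, 1.0)
exact = [a * math.cosh(t) + (b - a * math.cosh(1.0)) * math.sinh(t) / math.sinh(1.0) for t in ts]
err = max(abs(p - q) for p, q in zip(x, exact))
```

For this choice the integral of $|G(t,\cdot)|$ is at most $1/8$, so the plain iteration is a contraction; for larger $A$ or $L$ a direct linear solve would be the safer option.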
Figure 11.6. An elastic thread.
d. Equilibrium of an elastic thread
Consider an elastic thread of length $\ell$ which readily changes its shape, but which requires a force $c\,d\ell$ to increase its length by $d\ell$ according to Hooke's law. At rest, the position of the thread is horizontal (the segment $AB$) under the action of the tensile force $T_0$ which is very large compared to any other force under consideration. If we apply a vertical force $p$ at $C$ for which $x = \xi$, the thread will assume the form in Figure 11.6. Assume that $\delta = CC_0$ is very small compared to $AC_0$ and $C_0B$ (as a consequence of the smallness of $p$ compared with $T_0$); then, disregarding terms of the order $\delta^2$ (compared with $\xi^2$), the tension of the thread remains equal to $T_0$. Then the condition of equilibrium of forces is
$$T_0\frac{\delta}{\xi} + T_0\frac{\delta}{\ell-\xi} = p, \qquad\text{i.e.,}\qquad \delta = \frac{p(\ell-\xi)\xi}{T_0\ell}.$$
Denoting by $y(x)$ the vertical deflection at a point of abscissa $x$, we have
$$y(x) = G(x,\xi)\,p$$
where
$$G(x,\xi) := \begin{cases} \dfrac{x(\ell-\xi)}{T_0\ell} & 0 \le x \le \xi,\\[2mm] \dfrac{(\ell-x)\xi}{T_0\ell} & \xi \le x \le \ell. \end{cases}$$
Now suppose that a continuously distributed force with length density $p(\xi)$ acts on the thread. By the principle of superposition the thread will assume the shape
$$y(x) = \int_0^\ell G(x,\xi)p(\xi)\,d\xi. \tag{11.30}$$
If we seek the distribution density $p(\xi)$ so that the thread is in the shape $y(x)$, we are led to study Fredholm's integral equation in (11.30).

e. Dynamics of an elastic thread
Suppose now that a force, which varies with the time $t$ and has density at $\xi$ given by
$$p(\xi)\sin\omega t, \qquad \omega > 0,$$
acts on the thread. Suppose that during the motion the abscissa of every point of the thread remains unchanged and that the thread oscillates according to $y = y(x)\sin\omega t$. Then we find that at time $t$ the piece of thread between $\xi$ and $\xi + \Delta\xi$ is acted upon by the force $p(\xi)\sin(\omega t)\Delta\xi$ plus the force of inertia
$$-\rho(\xi)\Delta\xi\,\frac{d^2y}{dt^2} = \rho(\xi)y(\xi)\omega^2\sin\omega t\,\Delta\xi,$$
where $\rho(\xi)$ is the density of mass of the thread at $\xi$, and the equation (11.30) takes the form
$$y(x)\sin\omega t = \int_0^\ell G(x,\xi)\big[p(\xi)\sin\omega t + \omega^2\rho(\xi)y(\xi)\sin\omega t\big]\,d\xi. \tag{11.31}$$
If we set
$$\int_0^\ell G(x,\xi)p(\xi)\,d\xi =: f(x), \qquad G(x,\xi)\rho(\xi) =: k(x,\xi), \qquad \omega^2 =: \lambda,$$
(11.31) takes the form of Fredholm equation
$$y(x) = \lambda\int_0^\ell k(x,\xi)y(\xi)\,d\xi + f(x). \tag{11.32}$$
11.56 ¶. Show that, if in (11.32) we assume $\rho(\xi)$ constant and $f$ smooth, then $y(x)$ solves
$$\begin{cases} y''(x) + \omega^2c\,y(x) = f''(x),\\ y(0) = 0,\\ y(\ell) = 0, \end{cases} \tag{11.33}$$
where $c = \rho/T_0$. Show also that, conversely, if $y$ solves (11.33), then it also solves (11.32).

11.57 ¶. In the case $\rho = $ const, show that the unique solution of (11.33) is
$$y(x) = -\frac{\sin\mu x}{\mu\sin\mu\ell}\int_0^\ell f''(\xi)\sin\mu(\ell-\xi)\,d\xi + \frac{1}{\mu}\int_0^x f''(\xi)\sin\mu(x-\xi)\,d\xi$$
if $\sin\mu\ell \ne 0$, $\mu := \omega\sqrt{c}$. Instead, if $\sin\mu\ell = 0$, i.e., $\mu = \mu_k$ where
$$\mu_k := \frac{k\pi}{\ell}, \qquad \omega_k := \frac{k\pi}{\ell\sqrt{c}}, \qquad \lambda_k := \frac{k^2\pi^2}{c\,\ell^2}, \qquad k \in \mathbb{Z},$$
then (11.33) is solvable if and only if
$$\int_0^\ell f''(\xi)\sin\mu_k(\ell-\xi)\,d\xi = 0,$$
equivalently, iff
$$\int_0^\ell f''(\xi)\sin\mu_k\xi\,d\xi = 0.$$
In particular, if $f(x) = 0$ and $\mu = \mu_k$, all solutions are given by
$$y(x) = C\sin\mu_kx, \qquad C \in \mathbb{R},$$
and the natural oscillations of the thread are given by
$$y = C\sin\mu_kx\,\sin\omega_kt.$$
Compare the above with the alternative theorem of Fredholm in Chapter 10.

Figure 11.7. Vito Volterra (1860-1940) and the frontispiece of the first volume of his collected works.
11.4.2 Volterra integral equations
A linear integral equation in the unknown $x(t)$, $t \in [a,b]$, of the type
$$x(t) = f(t) + \int_a^b k(t,\tau)x(\tau)\,d\tau$$
where $f(t)$ and $k(t,\tau)$ are given functions, is called a Fredholm equation of second kind, while a Fredholm equation of the first kind has the form
$$\int_a^b k(t,\tau)x(\tau)\,d\tau = f(t).$$
The function $k(t,\tau)$ is called the kernel of the integral equation. If the kernel satisfies $k(t,\tau) = 0$ for $\tau > t$, the Fredholm equations of first and second kind are called Volterra equations. However it is convenient to treat Volterra equations separately.

11.58 Theorem. Let $k(t,\tau)$ be a continuous kernel in $[a,b] \times [a,b]$ and let $f \in C^0([a,b])$. Then the Volterra integral equation
$$x(t) = f(t) + \lambda\int_a^t k(t,\tau)x(\tau)\,d\tau$$
has a unique solution in $C^0([a,b])$ for all values of $\lambda$.

Proof. The transformation
$$T[x](t) := f(t) + \lambda\int_a^t k(t,\tau)x(\tau)\,d\tau$$
maps $C^0([a,b])$ into itself. Moreover, setting $M := \max|k(t,\tau)|$, for all $t \in [a,b]$ we have
$$|T[x_1](t) - T[x_2](t)| \le |\lambda|\,M\,(t-a)\,\|x_1 - x_2\|_{\infty,[a,b]},$$
hence
$$|T^2[x_1](t) - T^2[x_2](t)| \le |\lambda|^2M^2\frac{(t-a)^2}{2}\,\|x_1 - x_2\|_{\infty,[a,b]}$$
and by induction, if $T^n := T \circ \cdots \circ T$ $n$ times,
$$|T^n[x_1](t) - T^n[x_2](t)| \le |\lambda|^nM^n\frac{(t-a)^n}{n!}\,\|x_1 - x_2\|_{\infty,[a,b]}.$$
If $n$ is sufficiently large, so that $|\lambda|^nM^n(b-a)^n/n! < 1$, we conclude that $T^n$ is a contraction, hence it has a unique fixed point $x \in C^0([a,b])$. If $n = 1$ the proof is done; otherwise $Tx$ is also a fixed point for $T^n$, so necessarily we again have $Tx = x$ by uniqueness. □
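The proof of Theorem 11.58 is constructive: iterating $T$ from any starting function converges. A minimal sketch of these successive approximations follows, with the integral replaced by the trapezoidal rule. The grid size, the iteration count and the test case $k \equiv 1$, $f \equiv 1$, $\lambda = 1$ on $[0,1]$ (whose solution is $e^t$) are assumptions made for the example.

```python
import math

def volterra_picard(f, k, lam, a, b, n=200, iterations=40):
    """Successive approximations x <- T[x] for x(t) = f(t) + lam * int_a^t k(t, tau) x(tau) dtau."""
    h = (b - a) / n
    ts = [a + i * h for i in range(n + 1)]
    fs = [f(t) for t in ts]
    x = fs[:]                       # start from x_0 = f
    for _ in range(iterations):
        new = [fs[0]]               # at t = a the integral vanishes
        for i in range(1, n + 1):
            vals = [k(ts[i], ts[j]) * x[j] for j in range(i + 1)]
            integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoidal rule on [a, t_i]
            new.append(fs[i] + lam * integral)
        x = new
    return ts, x

# Illustrative case (assumption): k = 1, f = 1, lam = 1 on [0, 1]; the exact solution is exp(t).
ts, x = volterra_picard(lambda t: 1.0, lambda t, tau: 1.0, 1.0, 0.0, 1.0)
err = max(abs(xi - math.exp(t)) for t, xi in zip(ts, x))
```

No smallness of $\lambda$ is needed here, in line with the factorial decay $(b-a)^n/n!$ in the estimate for $T^n$; the remaining error in `err` comes from the quadrature, not from the iteration.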
11.4.3 Fredholm integral equations in $C^0$

11.59 Theorem. Let $k(t,\tau)$ be a continuous kernel in $[a,b] \times [a,b]$ and let $f \in C^0([a,b])$. The Fredholm integral equation
$$x(t) = f(t) + \lambda\int_a^b k(t,\tau)x(\tau)\,d\tau$$
has a unique solution $x(t)$ in $C^0([a,b])$, provided $|\lambda|$ is sufficiently small.

Proof. Trivially, the transformation
$$T[x](t) := f(t) + \lambda\int_a^b k(t,\tau)x(\tau)\,d\tau$$
maps $C^0([a,b])$ into itself and is contractive for $\lambda$ close to zero; in fact, if $M := \max|k(t,\tau)|$,
$$|T[x_1](t) - T[x_2](t)| \le |\lambda|\int_a^b |k(t,\tau)|\,|x_1(\tau) - x_2(\tau)|\,d\tau \le |\lambda|M(b-a)\,\|x_1 - x_2\|_{\infty,[a,b]} \le \frac{1}{2}\,\|x_1 - x_2\|_{\infty,[a,b]}$$
if $|\lambda|M(b-a) \le 1/2$. □
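The contraction argument of Theorem 11.59 translates directly into a fixed-point loop. The sketch below is illustrative only: the kernel $k(t,\tau) = t\tau$, the datum $f(t) = t$ and $\lambda = 0.4$ on $[0,1]$ are assumptions chosen so that $|\lambda|M(b-a) = 0.4 < 1/2$ and so that the solution $x(t) = 3t/(3-\lambda)$ can be found by hand.

```python
def fredholm_iteration(f, k, lam, a, b, n=200, tol=1e-13, max_iter=200):
    """Fixed-point iteration x <- T[x] for x(t) = f(t) + lam * int_a^b k(t, tau) x(tau) dtau."""
    h = (b - a) / n
    ts = [a + i * h for i in range(n + 1)]
    w = [h] * (n + 1)
    w[0] = w[-1] = 0.5 * h                      # trapezoidal weights
    fs = [f(t) for t in ts]
    K = [[k(t, tau) * wj for tau, wj in zip(ts, w)] for t in ts]
    x = fs[:]
    for it in range(1, max_iter + 1):
        new = [fi + lam * sum(kij * xj for kij, xj in zip(row, x)) for fi, row in zip(fs, K)]
        step = max(abs(p - q) for p, q in zip(new, x))
        x = new
        if step < tol:                          # successive iterates agree: stop
            break
    return ts, x, it

# Illustrative case (assumption): k(t, tau) = t*tau, f(t) = t, lam = 0.4 on [0, 1].
ts, x, iterations = fredholm_iteration(lambda t: t, lambda t, tau: t * tau, 0.4, 0.0, 1.0)
err = max(abs(xi - 3.0 * t / (3.0 - 0.4)) for t, xi in zip(ts, x))
```

The returned iteration count gives a feel for the contraction rate; when $|\lambda|$ is no longer small the loop may fail to converge, which is where the spectral discussion of the text takes over.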
In order to understand what happens for large $\lambda$, observe that the transformation
$$T[x](t) := \int_a^b k(t,\tau)x(\tau)\,d\tau$$
is linear, continuous and compact, see Example 9.139. The Riesz-Schauder theorem in Remark 10.72 then yields the following.

11.60 Theorem. Let $k(t,\tau) \in C^0([a,b] \times [a,b])$ and $f \in C^0([a,b])$. The equation
$$\lambda x(t) = f(t) + \int_a^b k(t,\tau)x(\tau)\,d\tau \tag{11.34}$$
has a set of eigenvalues $\Lambda$ with the only accumulation point $\lambda = 0$. Each eigenvalue $\lambda \ne 0$ has finite multiplicity and for any $\lambda$, $\lambda \ne 0$ and $\lambda \notin \Lambda$, (11.34) has a unique solution.

Further information concerning the eigenvalue case requires the use of a different space norm, the integral norm $\|\ \|_2$, and therefore a description of the completion $L^2((a,b))$ of $C^0((a,b))$ that we have not yet treated.
11.5 Fourier's Series
In 1747 Jean d'Alembert (1717-1783) showed that the general solution of the wave equation
$$\frac{\partial^2u}{\partial t^2} = a^2\frac{\partial^2u}{\partial x^2}, \tag{11.35}$$
Figure 11.8. Frontispieces of two celebrated works by Joseph Fourier (1768-1830) and G. F. Bernhard Riemann (1826-1866).
that transforms into
$$\frac{\partial^2u}{\partial r\,\partial s} = 0$$
by the change of variables $r = x + at$, $s = x - at$, is given by
$$u(t,x) = \varphi(x+at) + \psi(x-at),$$
where $\varphi$ and $\psi$ are, in principle, generic functions. Slightly later, in 1753, Daniel Bernoulli (1700-1782) proposed a different approach. Starting with the observation of Brook Taylor (1685-1731) that the functions
$$\sin\Big(\frac{n\pi x}{\ell}\Big)\cos\Big(\frac{n\pi a(t-\beta)}{\ell}\Big), \qquad n = 1,2,\dots \tag{11.36}$$
are solutions of the equation (11.35) and satisfy the boundary conditions $u(t,0) = u(t,\ell) = 0$, Bernoulli came to the conclusion that all solutions of (11.35) could be represented as superpositions of the tones in (11.36). An outcome of this was that every function could be represented as a sum of analytic functions, and, indeed,
$$\frac{4}{\pi}\sum_{n=0}^\infty \frac{\sin(2n+1)x}{2n+1} = \begin{cases} 1 & \text{if } 0 < x < \pi,\\ 0 & \text{if } x = \pi,\\ -1 & \text{if } \pi < x < 2\pi. \end{cases}$$
Bernoulli's result caused numerous disputes that lasted well into the nineteenth century, that even included the notion of function, and, eventually, was clarified with the contributions of Joseph Fourier (1768-1830), Lejeune Dirichlet (1805-1859), G. F. Bernhard Riemann (1826-1866) and many other mathematicians. The methods developed in this context, in particular the idea that a physical system near its equilibrium position can be described as superposition of vibrations and the idea that space analysis can be transformed into a frequency analysis, turned out to be of fundamental relevance both in physics and mathematics.
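11.5.1 Definitions and preliminaries
We denote by $L^1_{2\pi}$ the space of complex-valued $2\pi$-periodic functions in $\mathbb{R}$ that are summable on a period, for instance in $[-\pi,\pi]$. For $k \in \mathbb{Z}$, the $k$th Fourier coefficient of $f \in L^1_{2\pi}$ is the complex number
$$c_k = c_k(f) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt,$$
often denoted by $\hat f(k)$ or $\hat f_k$.

11.61 Definition. The Fourier $n$th partial sum of $f \in L^1_{2\pi}$ is the trigonometric polynomial of order $n$ given by
$$S_nf(x) := \sum_{k=-n}^n c_ke^{ikx}.$$
The Fourier series of $f$ is the sequence of its Fourier partial sums and their limit
$$Sf(x) = \sum_{k=-\infty}^{\infty} c_ke^{ikx} := \lim_{n\to\infty} S_nf(x) = \lim_{n\to\infty}\sum_{k=-n}^n c_ke^{ikx}.$$
If $f \in L^1_{2\pi}$ is real-valued, then $c_{-k} = \overline{c_k}$ $\forall k \in \mathbb{Z}$, since $\overline{f(t)} = f(t)$ and
$$\overline{\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt} = \int_{-\pi}^{\pi} \overline{f(t)}\,e^{ikt}\,dt = \int_{-\pi}^{\pi} f(t)e^{ikt}\,dt.$$
The partial sums of the Fourier series of a real-valued function have the form
$$S_nf(x) = c_0 + \sum_{k=1}^n \big(c_ke^{ikx} + \overline{c_k}e^{-ikx}\big) = c_0 + \sum_{k=1}^n \Re\big(2c_ke^{ikx}\big),$$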
Figure 11.9. The Dirichlet kernel with $n = 5$. Observe that the zeros of $D_n(t)$ are equidistributed: $x_k := \frac{2k\pi}{2n+1}$, $x_k \ne 2j\pi$, $k, j \in \mathbb{Z}$.
thus, decomposing $c_k$ in its real and imaginary parts, $c_k =: (a_k - ib_k)/2$, that is, setting
$$a_k := \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(kt)\,dt, \qquad b_k := \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin(kt)\,dt,$$
we find the trigonometric series
$$\begin{aligned} S_nf(x) &= \frac{a_0}{2} + \sum_{k=1}^n \Re\big((a_k - ib_k)(\cos(kx) + i\sin(kx))\big) \\ &= \frac{a_0}{2} + \sum_{k=1}^n (a_k\cos kx + b_k\sin kx). \end{aligned} \tag{11.37}$$
However, the complex notation is handier even for real-valued functions.

11.62 ¶. Show that the operator $\widehat{\ }$ mapping every function in $L^1_{2\pi}$ into the sequence of its Fourier coefficients, $f \to \{\hat f(k)\}$, has the following properties:
(i) it is linear: $(\lambda f + \mu g)\hat{\ }(k) = \lambda\hat f(k) + \mu\hat g(k)$ $\forall \lambda, \mu \in \mathbb{C}$, $\forall f, g \in L^1_{2\pi}$,
(ii) $(fg)\hat{\ }(k) = (\hat f * \hat g)(k)$, see Proposition 4.46,
(iii) $(f * g)\hat{\ }(k) = \hat f(k)\hat g(k)$, see Proposition 4.48,
(iv) if $g(t) = f(-t)$, then $\hat g(k) = \hat f(-k)$,
(v) if $g(t) = f(t - \varphi)$, then $\hat g(k) = e^{-ik\varphi}\hat f(k)$,
(vi) if $f$ is real and even, then its Fourier coefficients are real and its Fourier series is a cosine series,
(vii) if $f$ is real and odd, its Fourier coefficients are imaginary and its Fourier series is a sine series,
(viii) if $f$ has continuous derivative, or more generally $f$ is continuous and $f'$ is piecewise continuous, then $c_k(f') = ik\,c_k(f)$ $\forall k \in \mathbb{Z}$.
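The real coefficients $a_k$, $b_k$ and the partial sum in (11.37) are easy to approximate numerically. The sketch below is a minimal illustration, assuming a midpoint rule with a fixed number of samples and the sawtooth $f(t) = t$ on $[-\pi,\pi)$ as a test function; for this $f$ integration by parts gives $a_k = 0$ and $b_k = 2(-1)^{k+1}/k$.

```python
import math

def real_coefficients(f, k, samples=20000):
    """Midpoint-rule approximation of a_k and b_k over [-pi, pi]."""
    h = 2.0 * math.pi / samples
    a_k = b_k = 0.0
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        a_k += f(t) * math.cos(k * t)
        b_k += f(t) * math.sin(k * t)
    return a_k * h / math.pi, b_k * h / math.pi

def partial_sum(f, n, x):
    """S_n f(x) computed from the real form (11.37)."""
    a0, _ = real_coefficients(f, 0)
    total = 0.5 * a0
    for k in range(1, n + 1):
        a_k, b_k = real_coefficients(f, k)
        total += a_k * math.cos(k * x) + b_k * math.sin(k * x)
    return total

# Illustrative function (assumption): the sawtooth f(t) = t on [-pi, pi).
a1, b1 = real_coefficients(lambda t: t, 1)
a2, b2 = real_coefficients(lambda t: t, 2)
s5_at_zero = partial_sum(lambda t: t, 5, 0.0)
```

Since the sawtooth is odd, only sine terms survive, in agreement with item (vii) of the exercise.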
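a. Dirichlet's kernel
The Dirichlet kernel of order $n$ is defined by
$$D_n(x) := 1 + 2\sum_{k=1}^n \cos(kx) = \sum_{k=-n}^n e^{ikx}, \qquad x \in \mathbb{R}.$$
As we have seen in Section 5.4 of [GM2], $D_n(t)$ is a trigonometric polynomial of order $n$ and $2\pi$-periodic, $D_n(t)$ is even,
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(t)\,dt = 1$$
and
$$D_n(t) = \begin{cases} 2n+1 & \text{if } t = 2k\pi,\ k \in \mathbb{Z},\\[1mm] \dfrac{\sin((n+1/2)t)}{\sin(t/2)} & \text{otherwise.} \end{cases}$$
The Fourier coefficients of $D_n(t)$ are trivially
$$c_k(D_n) = \begin{cases} 1 & \text{if } |k| \le n,\\ 0 & \text{if } |k| > n. \end{cases}$$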
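Therefore it is not surprising that we have the following.

11.63 Lemma. For every $f \in L^1_{2\pi}$ we have
$$S_nf(x) = \frac{1}{2\pi}\int_0^\pi \big(f(x+t) + f(x-t)\big)D_n(t)\,dt \qquad \forall x \in \mathbb{R}.$$

Proof. In fact
$$\begin{aligned} S_nf(x) &= \sum_{k=-n}^n c_ke^{ikx} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\sum_{k=-n}^n e^{ik(x-t)}\,dt \\ &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)D_n(x-t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)D_n(t-x)\,dt \\ &= \frac{1}{2\pi}\int_{-\pi-x}^{\pi-x} f(x+t)D_n(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x+t)D_n(t)\,dt \\ &= \frac{1}{2\pi}\int_0^\pi \big(f(x+t) + f(x-t)\big)D_n(t)\,dt, \end{aligned}$$
where we used, in the fourth equality, that $D_n(t)$ is even and in the second to last equality that for a $2\pi$-periodic function we have
$$\int_a^{a+2\pi} u(t)\,dt = \int_{-\pi}^{\pi} u(t)\,dt \qquad \forall a \in \mathbb{R}.$$
□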
Finally we explicitly notice that, though $\int_{-\pi}^{\pi} D_n(t)\,dt = 2\pi$, we have
$$\int_{-\pi}^{\pi} |D_n(t)|\,dt = O(\log n).$$
This prevents us from estimating the modulus of integrals involving $D_n(t)$ by estimating the integral of the modulus.
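Both the closed form of $D_n$ and the slow growth of its $L^1$ norm can be observed numerically. The following sketch is only an experiment under stated assumptions: the integral of $|D_n|$ is approximated by a midpoint rule with a fixed number of samples, and the orders $n = 4, 16, 64, 256$ are arbitrary choices.

```python
import math

def dirichlet_sum(n, t):
    """D_n(t) from the definition 1 + 2 * sum_{k=1}^n cos(k t)."""
    return 1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, n + 1))

def dirichlet_closed(n, t):
    """Closed form sin((n + 1/2) t) / sin(t/2), with value 2n + 1 at multiples of 2*pi."""
    s = math.sin(0.5 * t)
    if abs(s) < 1e-12:
        return 2.0 * n + 1.0
    return math.sin((n + 0.5) * t) / s

def l1_norm(n, samples=20000):
    """Midpoint-rule approximation of the integral of |D_n| over [-pi, pi]."""
    h = 2.0 * math.pi / samples
    return h * sum(abs(dirichlet_closed(n, -math.pi + (j + 0.5) * h)) for j in range(samples))

norms = {n: l1_norm(n) for n in (4, 16, 64, 256)}
ratios = {n: norms[n] / math.log(n) for n in norms}   # stays bounded if the norm is O(log n)
```

The norms keep growing with $n$ while the integral of $D_n$ itself stays equal to $2\pi$: the cancellations between positive and negative lobes are essential.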
Figure 11.10. The frontispieces of two volumes on trigonometric series by Henri Lebesgue (1875-1941) and Leonida Tonelli (1885-1946).
11.5.2 Pointwise convergence
If $P$ is a trigonometric polynomial, $P \in \mathcal{P}_{n,2\pi}$, then $P$ agrees with its Fourier series, $P(x) = \sum_{k=-n}^n c_ke^{ikx}$ $\forall x \in \mathbb{R}$, see Section 5.4 of [GM2]. But this does not hold for every $f \in L^1_{2\pi}$. Given $f \in L^1_{2\pi}$, we then ask ourselves under which assumptions on $f$ the Fourier series of $f$ converges and converges to $f$.

a. The Riemann-Lebesgue theorem
The theorem below states that a rapidly oscillating function with a summable profile has an integral that converges to zero when the frequency of its oscillations tends to infinity, as a result of the compensation of positive and negative contributions due to oscillations, even though the $L^1$ norms are far from zero.

11.64 Theorem (Riemann-Lebesgue). Let $f : ]a,b[ \to \mathbb{R}$ be a Riemann summable function in $]a,b[$. For every interval $]c,d[ \subset ]a,b[$ we have
$$\int_c^d f(t)e^{i\lambda t}\,dt \to 0 \qquad\text{as } |\lambda| \to \infty$$
uniformly with respect to $c$ and $d$.

Proof. (i) Assume first that $f$ is a step function, and let $\sigma := \{x_0 = a, x_1, \dots, x_n = b\}$ be a subdivision of $]a,b[$ so that $f(x) = \alpha_k$ on $[x_{k-1},x_k]$. Then
$$\Big|\int_c^d f(t)e^{i\lambda t}\,dt\Big| \le \frac{2}{|\lambda|}\sum_{k=1}^n |\alpha_k|.$$
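This proves the theorem in this case.
(ii) Let $f$ be summable in $]a,b[$ and $\epsilon > 0$. By truncating $f$ suitably, we find a bounded Riemann integrable function $h_\epsilon$ such that $\int_a^b |f(t) - h_\epsilon(t)|\,dt < \epsilon$, and in turn a step function $g_\epsilon : (a,b) \to \mathbb{R}$ with $\int_a^b |h_\epsilon(t) - g_\epsilon(t)|\,dt < \epsilon$. Consequently $\int_a^b |f(t) - g_\epsilon(t)|\,dt < 2\epsilon$ and from
$$\int_c^d f(t)e^{i\lambda t}\,dt = \int_c^d g_\epsilon(t)e^{i\lambda t}\,dt + \int_c^d (f(t) - g_\epsilon(t))e^{i\lambda t}\,dt$$
we infer
$$\Big|\int_c^d f(t)e^{i\lambda t}\,dt\Big| \le \Big|\int_c^d g_\epsilon(t)e^{i\lambda t}\,dt\Big| + \int_a^b |f(t) - g_\epsilon(t)|\,dt \le \Big|\int_c^d g_\epsilon(t)e^{i\lambda t}\,dt\Big| + 2\epsilon.$$
The conclusion then follows by applying part (i) to $g_\epsilon$. □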
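The decay asserted by the Riemann-Lebesgue theorem is easy to observe. The sketch below is an illustrative experiment, assuming the profile $f(t) = t$ on $[0,1]$, a midpoint rule with a fixed number of samples, and three arbitrary frequencies; for this profile an integration by parts shows that the modulus is at most $1/\lambda + 2/\lambda^2$.

```python
import cmath

def oscillatory_integral(f, lam, c, d, samples=20000):
    """Midpoint-rule approximation of the integral of f(t) * exp(i*lam*t) over [c, d]."""
    h = (d - c) / samples
    total = 0j
    for j in range(samples):
        t = c + (j + 0.5) * h
        total += f(t) * cmath.exp(1j * lam * t)
    return total * h

# Illustrative profile (assumption): f(t) = t on [0, 1].
moduli = [abs(oscillatory_integral(lambda t: t, lam, 0.0, 1.0)) for lam in (10.0, 100.0, 1000.0)]
```

The sample count should stay well above the number of oscillations in the interval, otherwise the quadrature itself aliases the integrand.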
11.65 Corollary. Let $f$ be Riemann summable in $]a,b[$. Then
$$\int_c^d f(s)\sin\Big(\Big(n + \frac{1}{2}\Big)s\Big)\,ds \to 0 \qquad\text{as } n \to \infty$$
uniformly with respect to the interval $]c,d[ \subset ]a,b[$.

11.66 ¶. Show the following.
Proposition. Let $f \in L^1_{2\pi}$. Then we have
$$\int_{\delta < |t| < \pi} f(t)D_n(t)\,dt \to 0 \qquad\text{as } n \to \infty$$
for every $\delta > 0$.

11.67 ¶. Show Theorem 11.64 integrating by parts if $f$ is of class $C^1([a,b])$.

11.68 ¶. Let $f \in L^1_{2\pi}$ and let $\{c_k(f)\}$ be the sequence of its Fourier coefficients. Show that $|c_k(f)| \to 0$ as $k \to \pm\infty$.
b. Regular functions and Dini test

11.69 Definition. We say in this context that $f \in L^1_{2\pi}$ is regular at $x \in \mathbb{R}$ if there exist real numbers $L^\pm(x)$ and $M^\pm(x)$ such that
$$\begin{aligned} &\lim_{t\to0^+} f(x+t) = L^+(x), &&\lim_{t\to0^-} f(x+t) = L^-(x),\\ &\lim_{t\to0^+} \frac{f(x+t) - L^+(x)}{t} = M^+(x), &&\lim_{t\to0^-} \frac{f(x+t) - L^-(x)}{t} = M^-(x). \end{aligned} \tag{11.38}$$
Of course, if $f$ is differentiable at $x$, then $f$ is regular at $x$ with $L^\pm(x) = f(x)$ and $M^\pm(x) = f'(x)$. Discontinuous functions with left and right limits at $x$ and bounded slope near $x$ are evidently regular at $x$. In particular square waves, sawtooth ramps and $C^1$ functions are regular at every $x \in \mathbb{R}$. It is easy to see that if $f$ is regular at $x$ then the function
$$\varphi_x(t) := \frac{f(x+t) + f(x-t) - L^+(x) - L^-(x)}{t} \tag{11.39}$$
is bounded, hence Riemann integrable, in $]0,\pi]$.

11.70 Definition. We say that a $2\pi$-periodic piecewise-continuous map $f : \mathbb{R} \to \mathbb{C}$ is Dini-regular at $x \in \mathbb{R}$ if there exist real numbers $L^\pm(x)$ such that
$$\int_0^\pi \Big|\frac{f(x+t) + f(x-t) - L^+(x) - L^-(x)}{t}\Big|\,dt < +\infty. \tag{11.40}$$
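11.71 Theorem (Dini's test). Let $f \in L^1_{2\pi}$ be Dini-regular at $x \in \mathbb{R}$ and let $L^+(x)$, $L^-(x)$ be as in (11.40). Then $S_nf(x) \to (L^+(x) + L^-(x))/2$.

Proof. We may assume that $x \in [-\pi,\pi]$. Since $\frac{1}{2\pi}\int_{-\pi}^0 D_n(t)\,dt = \frac{1}{2\pi}\int_0^\pi D_n(t)\,dt = 1/2$, we have
$$S_nf(x) - \frac{L^+(x) + L^-(x)}{2} = \frac{1}{2\pi}\int_0^\pi \big(f(x+t) + f(x-t) - L^+ - L^-\big)D_n(t)\,dt = \frac{1}{2\pi}\int_0^\pi \varphi_x(t)\,t\,D_n(t)\,dt \tag{11.41}$$
where $\varphi_x(t)$ is as in (11.39). Set $h(t) := \varphi_x(t)\frac{t}{\sin(t/2)}$, so that $|h(t)| \le \pi|\varphi_x(t)|$ and $h$ is summable in $]0,\pi[$ by (11.40). Then, by Corollary 11.65,
$$S_nf(x) - \frac{L^+(x) + L^-(x)}{2} = \frac{1}{2\pi}\int_0^\pi h(t)\sin((n+1/2)t)\,dt \to 0.$$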
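Dini's test predicts convergence to the mean of the one-sided limits at a jump. A minimal numerical check on the square wave follows; it assumes the standard sine expansion with coefficients $4/(\pi k)$ for odd $k$, and the order $n = 2001$ is an arbitrary choice.

```python
import math

def square_wave_partial_sum(n, x):
    """S_n f(x) for the 2*pi-periodic square wave f = 1 on ]0, pi[, f = -1 on ]pi, 2*pi[."""
    return 4.0 / math.pi * sum(math.sin(k * x) / k for k in range(1, n + 1, 2))

at_jump = square_wave_partial_sum(2001, 0.0)                      # one-sided limits are +1 and -1
at_smooth_point = square_wave_partial_sum(2001, 0.5 * math.pi)    # f is continuous here with value 1
```

At the jump every term of the sum vanishes, so the partial sums sit exactly at the average of the two one-sided limits, while at a point of continuity they approach the value of the function.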
In particular, if $f$ is continuous, $2\pi$-periodic and satisfies the Dini condition at every $x$, then $S_nf(x) \to f(x)$ $\forall x \in \mathbb{R}$ pointwise.

11.72 Example. Let $0 < \alpha \le 1$ and $A \subset \mathbb{R}$. Recall that $f : A \to \mathbb{R}$ is said to be $\alpha$-Hölder-continuous if there exists $K > 0$ such that
$$|f(x) - f(y)| \le K|x-y|^\alpha \qquad \forall x,y \in A.$$
We claim that a $2\pi$-periodic function that is $\alpha$-Hölder-continuous on $[a,b]$ satisfies the Dini test at every $x \in ]a,b[$. In fact, if $\delta = \delta_x := \min(|x-a|,|x-b|)$, then
$$\int_0^\pi \frac{|f(x+t) - f(x) + f(x-t) - f(x)|}{t}\,dt \le 2K\int_0^\delta t^{\alpha-1}\,dt + \frac{1}{\delta}\int_\delta^\pi |f(x+t) + f(x-t) - 2f(x)|\,dt < +\infty.$$
11.73 ¶. Show that the $2\pi$-periodic extension of $\sqrt{|t|}$, $t \in [-\pi,\pi]$, is $1/2$-Hölder-continuous.

11.74 Example. Show that, if $f$ is continuous and satisfies the Dini test at $x$, then $L^+(x) = L^-(x) = f(x)$.

11.75 ¶. Show that the $2\pi$-periodic extension of $f(t) := 1/\log(1/|t|)$, $t \in [-\pi,\pi]$, does not satisfy the Dini test at $0$.
11.5.3 $L^2$-convergence and the energy equality

a. Fourier's partial sums and orthogonality

Denote by $\|f\|_2$ the quadratic mean over a period of $f$,
$$\|f\|_2^2 := \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt,$$
and by $L^2_{2\pi}$ the space of integrable functions with $\|f\|_2 < \infty$. The Hermitian bilinear form and the corresponding "norm"
$$(f|g) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\overline{g(t)}\,dt, \qquad \|f\|_2 := \Big(\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt\Big)^{1/2}$$
are not a Hermitian product and a norm in $L^2_{2\pi}$, since $\|f\|_2 = 0$ does not imply $f(t) = 0$ $\forall t$, but they do define a Hermitian product and a norm in $L^2_{2\pi} \cap C^0(\mathbb{R})$, since $\|f\|_2 = 0$ implies $f = 0$ if $f$ is continuous. Alternatively, we may identify functions $f$ and $g$ in $L^2_{2\pi}$ if $\|f-g\|_2 = 0$, and again $(f|g)$ and $\|f\|_2$ define a Hermitian product and a norm on the equivalence classes of $L^2_{2\pi}$. As is usual, we still denote by $L^2_{2\pi}$ the space of equivalence classes. It is easily seen that $L^2_{2\pi}$ is a pre-Hilbert space with $(f|g)$. Notice that two nonidentical continuous functions belong to different equivalence classes. Since $e^{ikx}$, $k \in \mathbb{Z}$, belong to $L^2_{2\pi}$ and
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ikx}\,e^{-ihx}\,dx = \delta_{hk},$$
we have the following.

11.76 Proposition. The trigonometric system $\{e^{ikx} \mid k \in \mathbb{Z}\}$ is an orthonormal system in $L^2_{2\pi}$.
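The orthonormality relation can be checked numerically. The following is only a sketch: the helper names `inner`, `e` and `gram_matrix` are illustrative choices, not notation from the text, and the integral defining $(f|g)$ is replaced by a uniform Riemann sum over one period.

```python
import cmath
import math


def inner(f, g, samples=2048):
    """Approximate (f|g) = (1/2pi) * int_{-pi}^{pi} f(t) * conj(g(t)) dt
    by a uniform Riemann sum over one period."""
    step = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + j * step
        total += f(t) * g(t).conjugate()
    return total * step / (2 * math.pi)


def e(k):
    """Return the function t -> exp(i*k*t) of the trigonometric system."""
    return lambda t: cmath.exp(1j * k * t)


def gram_matrix(order):
    """Inner products (e_h | e_k) for -order <= h, k <= order."""
    indices = range(-order, order + 1)
    return {(h, k): inner(e(h), e(k)) for h in indices for k in indices}


if __name__ == "__main__":
    # Print |(e_h | e_k)| for small indices; the diagonal entries should be
    # close to 1 and the off-diagonal entries close to 0.
    for (h, k), value in sorted(gram_matrix(1).items()):
        print(h, k, round(abs(value), 12))
```

For smooth periodic integrands such as these, the uniform Riemann sum is accurate essentially to rounding error, which is why a modest number of samples suffices here.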
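Since
$$c_k(f) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,e^{-ikt}\,dt = (f|e^{ikx}),$$
we have
$$S_nf(x) = \sum_{k=-n}^{n} (f|e^{ikx})\,e^{ikx}, \qquad x \in \mathbb{R},$$
i.e., the Fourier series of $f$ is the abstract Fourier series with respect to the trigonometric orthonormal system. Therefore the results of Section 10.1.2 apply; in particular the Bessel inequality holds,
$$\sum_{k=-\infty}^{\infty} |c_k(f)|^2 \le \|f\|_2^2,$$
as well as Proposition 10.18, in particular
$$\|f - S_nf\|_2 \le \|f - P\|_2 \qquad \forall P \in \mathcal{P}_{n,2\pi}.$$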
Recall also that for a trigonometric polynomial $P \in \mathcal{P}_{n,2\pi}$ the Pythagorean theorem holds:
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |P(t)|^2\,dt = \sum_{k=-n}^{n} |c_k(P)|^2.$$

b. A first uniform convergence result

11.77 Theorem. Let $f \in C^1(\mathbb{R})$. Then $S_nf \to f$ uniformly in $\mathbb{R}$.

Proof. Since $S_nf(x) \to f(x)$ $\forall x$, it suffices to show the uniform convergence of $S_nf$. We notice that $f' \in L^2_{2\pi}$ and that, by integration by parts,
$$c_k(f') = ik\,c_k(f) \qquad \forall k \in \mathbb{Z},$$
hence, if $k \ne 0$,
$$|c_k(f)| \le \frac{|c_k(f')|}{|k|} \le |c_k(f')|^2 + \frac{1}{k^2},$$
where we have used the inequality $|ab| \le a^2 + b^2$. Since $\sum_{k=-\infty}^{\infty} |c_k(f')|^2$ converges by Bessel's inequality, we conclude that $\sum_{k=-\infty}^{\infty} |c_k(f)|$ converges; consequently
$$\sum_{k=-\infty}^{\infty} c_k(f)\,e^{ikx}$$
converges absolutely in $C^0(\mathbb{R})$ since $\|e^{ikx}\|_{\infty,\mathbb{R}} = 1$ $\forall k$. □
11.78 f. Let / e C^(]R) and let {ck} be its Fourier coefficients. Show that /c^|cfc| -^ 0 as |A;| -^ 00.
For stronger results about uniform convergence of Fourier series see Section 11.5.4.
Figure 11.11. Antoni Zygmund (1900-1992) and the frontispiece of the first edition of volume I of his Trigonometric Series.
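c. Energy equality

We have, compare Chapter 9, the following.

11.79 Lemma. $C^1(\mathbb{R}) \cap L^2_{2\pi}$ is dense in $L^2_{2\pi}$.

Proof. Let $f \in L^2_{2\pi}$ and $\epsilon > 0$. There is a Riemann integrable function $h_\epsilon$ with $\|f - h_\epsilon\|_2 < \epsilon$ and a step function $k_\epsilon$ in $[-\pi,\pi]$ such that $\|k_\epsilon\|_\infty \le M_\epsilon$ and $\|h_\epsilon - k_\epsilon\|_1 < (\pi\epsilon^2)/M_\epsilon$, where $M_\epsilon := \|h_\epsilon\|_\infty$; consequently
$$\|h_\epsilon - k_\epsilon\|_2^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |h_\epsilon - k_\epsilon|^2\,dt \le \frac{2M_\epsilon}{2\pi}\int_{-\pi}^{\pi} |h_\epsilon - k_\epsilon|\,dt < \epsilon^2.$$
First, approximating $k_\epsilon$ by a Lipschitz function, then smoothing the edges, we find $l_\epsilon \in C^1([-\pi,\pi])$ with $\|k_\epsilon - l_\epsilon\|_2 < \epsilon$. Finally we modify $l_\epsilon$ near $\pi$ and $-\pi$ to obtain a new function $g_\epsilon$ with $g_\epsilon(-\pi) = g_\epsilon(\pi) = g_\epsilon'(-\pi) = g_\epsilon'(\pi) = 0$. Extending $g_\epsilon$ to a periodic function in $\mathbb{R}$, we finally get $g_\epsilon \in C^1(\mathbb{R}) \cap L^2_{2\pi}$ and $\|f - g_\epsilon\|_2 < 4\epsilon$. □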
Now we can state the following.

11.80 Theorem. For every $f \in L^2_{2\pi}$ we have $\|S_nf - f\|_2 \to 0$. Therefore, the trigonometric system $\{e^{ikx}\}$, $k \in \mathbb{Z}$, is orthonormal and complete in $L^2_{2\pi}$; moreover, for any $f \in L^2_{2\pi}$ the energy equality or Parseval's identity holds:
$$\sum_{k=-\infty}^{\infty} |c_k(f)|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt.$$

Proof. Given $f \in L^2_{2\pi}$ and $\epsilon > 0$, let $g \in C^1(\mathbb{R}) \cap L^2_{2\pi}$ be such that $\|f - g\|_2 < \epsilon$. Since $S_ng$ is a trigonometric polynomial of order at most $n$, and $S_nf$ is the point of minimal $L^2$ distance in $\mathcal{P}_{n,2\pi}$ from $f$, we have
$$\|f - S_nf\|_2 \le \|f - S_ng\|_2 \le \|f - g\|_2 + \|g - S_ng\|_2 \le \epsilon + \|g - S_ng\|_\infty,$$
and the claim follows since $\|g - S_ng\|_\infty \to 0$ as $n \to \infty$. The rest of the claim is now stated in Proposition 10.18. □

11.81 Exercise. Show that, if the Fourier series of $f \in L^2_{2\pi}$ converges uniformly, then it converges to $f$. In particular, if the Fourier coefficients $c_k$ of $f$ satisfy
$$\sum_{k=-\infty}^{+\infty} |c_k| < +\infty,$$
then $f(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx}$ in the sense of uniform convergence in $\mathbb{R}$.
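Bessel's inequality and the energy equality can be illustrated on a concrete function. The sketch below is an assumption-laden example, not part of the text: it takes $f(t) = t$ on $[-\pi,\pi]$, whose coefficients are $c_k = i(-1)^k/k$ for $k \neq 0$ and whose energy is $\pi^2/3$, and approximates all integrals by the midpoint rule. The function names are hypothetical.

```python
import cmath
import math


def fourier_coefficient(f, k, samples=4096):
    """Midpoint-rule approximation of
    c_k(f) = (1/2pi) * int_{-pi}^{pi} f(t) * exp(-i k t) dt."""
    step = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * step
        total += f(t) * cmath.exp(-1j * k * t)
    return total * step / (2 * math.pi)


def energy(f, samples=4096):
    """Midpoint-rule approximation of ||f||_2^2 = (1/2pi) * int |f(t)|^2 dt."""
    step = 2 * math.pi / samples
    total = 0.0
    for j in range(samples):
        t = -math.pi + (j + 0.5) * step
        total += abs(f(t)) ** 2
    return total * step / (2 * math.pi)


def bessel_sum(f, n, samples=4096):
    """Partial energy sum: sum of |c_k(f)|^2 over |k| <= n."""
    return sum(abs(fourier_coefficient(f, k, samples)) ** 2
               for k in range(-n, n + 1))


if __name__ == "__main__":
    f = lambda t: t
    print("energy        :", energy(f))
    for n in (5, 20, 80):
        print("partial sum", n, ":", bessel_sum(f, n))
```

The partial sums increase with $n$ and stay below the energy (Bessel), and the gap closes as $n$ grows (Parseval); for this particular $f$ the gap is $2\sum_{k>n} 1/k^2$, roughly $2/n$.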
11.5.4 Uniform convergence

a. A variant of the Riemann-Lebesgue theorem

Let us state a variant of the Riemann-Lebesgue theorem that is also related to the Dirichlet estimate for the series of products.

11.82 Proposition (Second theorem of mean value). Let $f$ and $g$ be Riemann integrable functions in $]a,b[$. Suppose moreover that $f$ is nonnegative and nondecreasing, and denote by $M$ and $m$ respectively the maximum and the minimum values of $x \to \int_x^b g(t)\,dt$, $x \in [a,b]$. Then we have
$$m\,f(b) \le \int_a^b f(t)g(t)\,dt \le M\,f(b).$$
In particular, there exists $c \in ]a,b[$ such that
$$\int_a^b f(t)g(t)\,dt = f(b)\int_c^b g(t)\,dt.$$

Proof. Choose a constant $d$ such that $g(t) + d > 0$ in $]a,b[$. If $f$ is differentiable, the claim follows easily integrating by parts $\int_a^b f(t)(g(t)+d)\,dt$. The general case can be treated by approximation (but we have not developed the correct means yet) or using the formula of summation by parts, see Section 6.5 of [GM2]. For the reader's convenience we give the explicit computation. Let $\sigma = \{x_0 = a, x_1, \dots, x_n = b\}$ be a partition of $[a,b]$. Denote by $\Delta_k$ the interval $[x_{k-1}, x_k]$, set $G(x) := \int_x^b g(t)\,dt$ and $\sigma_n := \sum_{k=1}^{n} f(x_k)(x_k - x_{k-1})$. Since $f$ is nondecreasing and $g + d > 0$, we have
$$\int_a^b f(t)(g(t)+d)\,dt = \sum_{k=1}^{n}\int_{\Delta_k} f(t)(g(t)+d)\,dt \le \sum_{k=1}^{n} f(x_k)\big(G(x_{k-1}) - G(x_k)\big) + d\,\sigma_n$$
$$= f(x_1)G(x_0) + \sum_{k=1}^{n-1} G(x_k)\big(f(x_{k+1}) - f(x_k)\big) + d\,\sigma_n$$
$$\le M\Big(f(x_1) + \sum_{k=1}^{n-1}\big(f(x_{k+1}) - f(x_k)\big)\Big) + d\,\sigma_n = M f(b) + d\,\sigma_n.$$
Figure 11.12. Ulisse Dini (1845-1918) and the frontispiece of his Serie di Fourier.
Since $\sigma_n \to \int_a^b f(t)\,dt$ as $n \to \infty$ (the partition being refined), we infer
$$\int_a^b f(t)g(t)\,dt \le M f(b).$$
Similarly, we get $\int_a^b f(t)g(t)\,dt \ge m f(b)$. The second part of the claim follows from the intermediate value theorem since $x \to \int_x^b g(t)\,dt$ is continuous. □
From the Riemann-Lebesgue lemma, see Exercise 11.66, for any $f \in L^1_{2\pi}$ and $\delta > 0$ we have for every fixed $x$
$$\int_\delta^\pi f(x+t)D_n(t)\,dt \to 0 \qquad \text{as } n \to \infty.$$
For future use we prove the following.

11.83 Proposition. Let $f \in L^1_{2\pi}$ and $\delta > 0$. Then
$$\int_\delta^\pi f(x+t)D_n(t)\,dt \to 0 \qquad \text{uniformly in } x \text{ as } n \to \infty.$$

Proof. Since $1/\sin(t/2)$ is decreasing in $]0,\pi]$, the second theorem of mean value yields $\xi = \xi(x) \in [\delta,\pi]$ such that
$$\int_\delta^\pi f(x+t)D_n(t)\,dt = \frac{1}{\sin(\delta/2)}\int_\delta^\xi f(x+t)\sin((n+1/2)t)\,dt.$$
On the other hand,
$$\int_\delta^\xi f(x+t)\sin((n+1/2)t)\,dt = \int_{\delta+x}^{\xi+x} f(t)\sin((n+1/2)(t-x))\,dt$$
$$= \cos((n+1/2)x)\int_{\delta+x}^{\xi+x} f(t)\sin((n+1/2)t)\,dt - \sin((n+1/2)x)\int_{\delta+x}^{\xi+x} f(t)\cos((n+1/2)t)\,dt,$$
and the last two integrals converge uniformly to zero in $[-\pi,\pi]$, see Exercise 11.62. Thus $\int_\delta^\pi f(x+t)D_n(t)\,dt \to 0$ uniformly in $[-\pi,\pi]$, hence in $\mathbb{R}$. □
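b. Uniform convergence for Dini-continuous functions

Let $f \in C^{0,\alpha}(\mathbb{R}) \cap L^1_{2\pi}$ be a $2\pi$-periodic and $\alpha$-Hölder-continuous function. It is easy to see that $f$ is continuous and Dini-regular at every $x \in \mathbb{R}$. In fact, if $\delta = \delta_x := \min(|x-a|, |x-b|)$, then
$$\int_0^{\pi} \frac{|f(x+t)-f(x)+f(x-t)-f(x)|}{t}\,dt \le 2K\int_0^{\delta} t^{-1+\alpha}\,dt + \int_{\delta}^{\pi} \frac{|f(x+t)-f(x)+f(x-t)-f(x)|}{t}\,dt < +\infty.$$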
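Therefore $S_nf(x) \to f(x)$ $\forall x \in \mathbb{R}$ by the Dini test theorem, Theorem 11.71. We have the following.

11.84 Theorem. If $f$ is $2\pi$-periodic and of class $C^{0,\alpha}(\mathbb{R})$, $0 < \alpha \le 1$, then $S_nf(x) \to f(x)$ uniformly in $\mathbb{R}$.

Proof. Let $\delta > 0$ to be chosen later. Since $D_n$ is even, we have
$$S_nf(x) - f(x) = \frac{1}{2\pi}\int_0^\delta \big(f(x+t) + f(x-t) - 2f(x)\big)D_n(t)\,dt + \frac{1}{2\pi}\int_\delta^\pi \big(f(x+t) + f(x-t) - 2f(x)\big)D_n(t)\,dt =: I_1(\delta,n,x) + I_2(\delta,n,x).$$
Let $\epsilon > 0$. Since $f$ is $\alpha$-Hölder-continuous there exists $K > 0$ such that
$$|f(x+t) - f(x)| \le K|t|^\alpha \qquad \forall x \in \mathbb{R},\ \forall t \in [0,2\pi],$$
hence, using $\sin(t/2) \ge t/\pi$ in $[0,\pi]$,
$$|I_1(\delta,n,x)| \le \frac{1}{2\pi}\int_0^\delta \frac{2K t^\alpha}{\sin(t/2)}\,|\sin((n+1/2)t)|\,dt \le 2K\int_0^\delta t^{-1+\alpha}\,dt = \frac{2K}{\alpha}\,\delta^\alpha.$$
We can therefore choose $\delta$ in such a way that $|I_1(\delta,n,x)| < \epsilon$ uniformly with respect to $x$ and $n$. On the other hand $|I_2(\delta,n,x)| < \epsilon$ uniformly with respect to $x$ as $n \to +\infty$ by Proposition 11.83, concluding that
$$|S_nf(x) - f(x)| < 2\epsilon \qquad \text{uniformly in } x$$
for $n$ sufficiently large. □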
With the same proof we also infer the following.

11.85 Theorem (Dini's test). Let $f \in C^0(\mathbb{R}) \cap L^1_{2\pi}$ be a $2\pi$-periodic and continuous function with modulus of continuity $\omega(\delta)$, $|f(x) - f(y)| \le \omega(\delta)$ if $|x - y| \le \delta$, such that $\omega(\delta)/\delta$ is summable in a neighborhood of $\delta = 0$. Then $S_nf \to f$ uniformly in $\mathbb{R}$.
c. Riemann's localization principles

The convergence of Fourier's partial sums is a local property in the following sense.

11.86 Proposition. If $g, h \in L^1_{2\pi}$ and $g = h$ in a neighborhood of a point $x$, then $S_ng(x) - S_nh(x) \to 0$ as $n \to \infty$.

Proof. Assume $f := g - h$ vanishes in $[x-\delta, x+\delta]$, $\delta > 0$. Then, for every $t \in [0,\delta]$ we have $f(x+t) = f(x-t) = 0$, hence
$$S_nf(x) - f(x) = \frac{1}{2\pi}\int_\delta^\pi \big(f(x+t) + f(x-t)\big)D_n(t)\,dt.$$
Since $(f(x+t) + f(x-t))/\sin(t/2)$ is summable in $(\delta,\pi)$, the result follows from the Riemann-Lebesgue theorem. □
11.87 Proposition. If $f \in L^1_{2\pi}$ and $f = 0$ in $]a,b[$, then $S_nf(x) \to 0$ uniformly on every interval $[c,d]$ with $a < c < d < b$.

Proof. Let us show that $S_nf(x) \to 0$ uniformly in $[a+\delta, b-\delta]$, $0 < \delta < (b-a)/2$. For $x \in [a+\delta, b-\delta]$ and $0 < t < \delta$ we have $f(x+t) = f(x-t) = 0$, hence
$$S_nf(x) = \frac{1}{2\pi}\int_\delta^\pi \big(f(x+t) + f(x-t)\big)D_n(t)\,dt.$$
The claim follows from Proposition 11.83. □
The localization principle says that, when studying the pointwise convergence in an open interval $]a,b[$ or the uniform convergence in a closed interval inside $]a,b[$ of the Fourier series of a function $f$, we can modify $f$ outside of $]a,b[$. With this observation we easily get the following.

11.88 Corollary. Let $f \in L^1_{2\pi}$ be a function that is of class $C^1([a,b])$. Then $\{S_nf(x)\}$ converges uniformly to $f(x)$ in any interval strictly contained in $]a,b[$.
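11.5.5 A few complementary facts

a. The primitive of the Dirichlet kernel

Denote by $G_n(x)$ the primitive of the Dirichlet kernel,
$$G_n(x) := \int_0^x D_n(t)\,dt.$$
It is easy to realize that $G_n(x)$ is odd and nonnegative in $[0,\pi]$ and takes its maximum value in $[0,\pi]$ at the first zero $x_n := \frac{2\pi}{2n+1}$ of $D_n$. Thus, using $\sin u \ge 2u/\pi$ in $[0,\pi/2]$,
$$G_n(x) \le G_n\Big(\frac{2\pi}{2n+1}\Big) = \int_0^{2\pi/(2n+1)} \frac{\sin((n+1/2)s)}{\sin(s/2)}\,ds = \frac{2}{2n+1}\int_0^\pi \frac{\sin s}{\sin(s/(2n+1))}\,ds \le \pi\int_0^\pi \frac{\sin s}{s}\,ds < 2\pi$$
independently of $n$; in particular,
$$\Big|\int_c^d D_n(t)\,dt\Big| \le 2\pi \qquad (11.42)$$
for all $c, d \in [0,\pi]$.

Figure 11.13. The graph of $G_5(x)$ in $[-\pi,\pi]$.

Also, by Exercise 11.66, or directly by an integration by parts, it is easily seen that, given any $\delta > 0$, there is a constant $c(\delta)$ such that
$$|G_n(\pi) - G_n(x)| = \Big|\int_x^\pi D_n(t)\,dt\Big| \le \frac{c(\delta)}{n} \qquad (11.43)$$
for all $x \in [\delta,\pi]$. For future use we now show that
$$\lim_{n\to\infty} G_n(x_n) = 2\int_0^\pi \frac{\sin s}{s}\,ds. \qquad (11.44)$$
In order to do that, we first notice that
$$\frac{2}{\pi}\,t \le \sin t \quad\text{and}\quad 0 \le t - \sin t \le \frac{t^3}{6}, \qquad\text{i.e.,}\qquad 0 \le \frac{1}{\sin t} - \frac{1}{t} \le \frac{\pi}{12}\,t, \qquad t \in ]0,\pi/2],$$
hence, applying this with $t/2$ in place of $t$,
$$\Big|\int_0^{x_n} D_n(t)\,dt - \int_0^{x_n} \frac{\sin((n+1/2)t)}{t/2}\,dt\Big| \le \frac{\pi}{12}\Big(\frac{2\pi}{2n+1}\Big)^2 \to 0 \qquad (11.45)$$
as $n \to \infty$. Equality (11.44) then follows as
$$\int_0^{x_n} \frac{\sin((n+1/2)t)}{t/2}\,dt = \frac{2}{2n+1}\int_0^\pi \frac{\sin s}{s/(2n+1)}\,ds = 2\int_0^\pi \frac{\sin s}{s}\,ds. \qquad (11.46)$$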
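The bound on $G_n$ and the limit of $G_n(x_n)$ can be explored numerically. The sketch below assumes the standard expansion $D_n(t) = 1 + 2\sum_{k=1}^{n}\cos kt$, so that $G_n(x) = x + 2\sum_{k=1}^{n}\sin(kx)/k$ exactly; the sine integral is evaluated from its power series. The function names are illustrative only.

```python
import math


def dirichlet_primitive(n, x):
    """G_n(x) = int_0^x D_n(t) dt, using D_n(t) = 1 + 2*sum_{k=1}^n cos(k t)."""
    return x + 2 * sum(math.sin(k * x) / k for k in range(1, n + 1))


def first_zero(n):
    """First positive zero x_n = 2*pi/(2n+1) of the Dirichlet kernel D_n."""
    return 2 * math.pi / (2 * n + 1)


def sine_integral(x, terms=40):
    """Si(x) = int_0^x sin(s)/s ds, summed from its power series
    sum_j (-1)^j x^(2j+1) / ((2j+1) * (2j+1)!)."""
    total = 0.0
    term = x  # holds (-1)^j * x^(2j+1) / (2j+1)!
    for j in range(terms):
        total += term / (2 * j + 1)
        term *= -x * x / ((2 * j + 2) * (2 * j + 3))
    return total


if __name__ == "__main__":
    limit = 2 * sine_integral(math.pi)
    print("2*Si(pi) =", limit, " 2*pi =", 2 * math.pi)
    for n in (1, 5, 50, 500):
        print(n, dirichlet_primitive(n, first_zero(n)))
```

The printed values stay below $2\pi$ for every $n$ and approach $2\int_0^\pi \frac{\sin s}{s}\,ds$ as $n$ grows, which is what the estimate on $G_n$ and the limit of $G_n(x_n)$ assert.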
Figure 11.14. The sawtooth $h(x)$ and its Fourier partial sum of order 5 in $[-\pi,\pi]$.
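b. Gibbs's phenomenon

Consider the $2\pi$-periodic function $h$ defined by periodically extending the function
$$h(t) := \begin{cases} -\pi - t & \text{if } -\pi < t < 0,\\ 0 & \text{if } t = 0,\\ \pi - t & \text{if } 0 < t \le \pi. \end{cases} \qquad (11.47)$$
Its Fourier coefficients are easily computed to be
$$c_0 = 0, \qquad c_k := \frac{1}{ik}, \quad k \ne 0,$$
hence
$$S_nh(x) = \sum_{k=-n,\ k\ne 0}^{n} \frac{e^{ikx}}{ik} = \int_0^x D_n(t)\,dt - x \qquad (11.48)$$
or
$$S_nh(x) = 2\sum_{k=1}^{n} \frac{\sin kx}{k}.$$
In particular $S_nh(0) = 0$ $\forall n$, and, by Dini's test, Theorem 11.71,
$$2\sum_{k=1}^{\infty} \frac{\sin kx}{k} = h(x) \qquad \text{pointwise in } \mathbb{R}.$$
The energy equality yields
$$\sum_{k=-\infty,\ k\ne 0}^{\infty} \frac{1}{k^2} = \frac{1}{2\pi}\int_{-\pi}^{\pi} |h(t)|^2\,dt = \frac{\pi^2}{3}$$
or
$$\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}.$$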
As we have already seen, we have the following, of which we give a direct proof.

11.89 Theorem. For any positive $\delta > 0$ the Fourier series of $h$ converges uniformly to $h$ in $[\delta,\pi]$.

Proof. We know that $S_nh(x)$ converges pointwise to $h$, therefore it suffices to show that
$$\sum_{k=-\infty,\ k\ne 0}^{\infty} \frac{e^{ikx}}{ik} \qquad (11.50)$$
converges uniformly in $[\delta,\pi]$. We apply Dirichlet's theorem for series of products, see Section 6.5 of [GM2], respectively, to the series with positive and negative indices with $a_k = 1/(ik)$ and $b_k := e^{ikx}$, to find that
$$|S_nh(x) - h(x)| = \Big|\sum_{|k| \ge n+1} \frac{e^{ikx}}{ik}\Big| \le \frac{4}{n+1}\,\frac{1}{|1 - e^{ix}|},$$
hence
$$\|S_nh - h\|_{\infty,[\delta,\pi]} \le \frac{4}{n+1}\,\frac{1}{|1 - e^{i\delta}|} \to 0 \qquad \text{as } n \to \infty.$$
Alternatively, from (11.48) we infer
$$S_nh(x) - h(x) = \int_0^x D_n(t)\,dt - \pi = -\int_x^\pi D_n(s)\,ds$$
and, by (11.43),
$$|S_nh(x) - h(x)| \le \frac{c(\delta)}{n} \qquad \text{uniformly in } [\delta,\pi]. \qquad □$$
However the Fourier series of $h$ does not converge uniformly in $[0,\pi]$.
11.90 Proposition. We have
$$\|S_nh\|_{\infty,[0,\pi]} \to 2\int_0^\pi \frac{\sin s}{s}\,ds.$$

Proof. Let $y_n$ be the point where $S_nh(x)$ attains its maximum value in $[0,\pi]$,
$$M_n := \sup_{[0,\pi]} S_nh(x) = S_nh(y_n),$$
and let $x_n := \frac{2\pi}{2n+1}$. Since $x_n$ is the maximum point of $G_n(x)$ and $S_nh(x) = G_n(x) - x$, we have
$$G_n(y_n) - x_n \le G_n(x_n) - x_n = S_nh(x_n) \le S_nh(y_n) = G_n(y_n) - y_n \le G_n(x_n) - y_n.$$
This implies $0 \le y_n \le x_n$ and $-x_n \le S_nh(y_n) - G_n(x_n) \le -y_n$, hence
$$|M_n - G_n(x_n)| = |S_nh(y_n) - G_n(x_n)| \le x_n = \frac{2\pi}{2n+1}. \qquad (11.51)$$
The conclusion follows from (11.44). □

We can rewrite the statement in Proposition 11.90 as
$$\|S_nh\|_{\infty,[0,\pi]} \to \Big(\frac{2}{\pi}\int_0^\pi \frac{\sin s}{s}\,ds\Big)\,\|h\|_{\infty,[0,\pi]}.$$
Since
$$1 < \frac{2}{\pi}\int_0^\pi \frac{\sin s}{s}\,ds = 1.178979\ldots,$$
we see that, while $S_nh(x) \to h(x)$ for all $x \in \mathbb{R}$, near $0$, $S_nh(x)$ always has a maximum which stays away from the maximum of $h$, that is $\|h\|_{\infty,[0,\pi]} = \pi$, by a positive quantity: this is the Gibbs phenomenon, which is in fact typical of Fourier series at jump points; but we shall not enter into this subject.
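The overshoot can be observed directly. The sketch below is a rough numerical experiment under stated assumptions: it evaluates $S_nh(x) = 2\sum_{k=1}^{n}\sin(kx)/k$ on a uniform grid of $[0,\pi]$ (so the maximum is only approximated), divides by $\sup h = \pi$, and compares with the constant $\frac{2}{\pi}\int_0^\pi \frac{\sin s}{s}\,ds$, numerically about $1.179$, computed by the midpoint rule. All function names are illustrative.

```python
import math


def sawtooth_partial_sum(n, x):
    """S_n h(x) = 2 * sum_{k=1}^n sin(k x)/k for the sawtooth h(t) = pi - t on ]0, pi]."""
    return 2 * sum(math.sin(k * x) / k for k in range(1, n + 1))


def overshoot_ratio(n, grid=4000):
    """Maximum of S_n h over a uniform grid of [0, pi], divided by sup h = pi."""
    peak = max(sawtooth_partial_sum(n, math.pi * j / grid)
               for j in range(grid + 1))
    return peak / math.pi


def gibbs_constant(steps=20000):
    """(2/pi) * int_0^pi sin(s)/s ds, approximated by the midpoint rule."""
    width = math.pi / steps
    total = 0.0
    for j in range(steps):
        s = (j + 0.5) * width
        total += math.sin(s) / s
    return 2 * total * width / math.pi


if __name__ == "__main__":
    print("limit ratio:", gibbs_constant())
    for n in (10, 100, 200):
        print(n, overshoot_ratio(n))
```

The ratios increase with $n$ towards the limit constant instead of decreasing to $1$: the peak of $S_nh$ moves towards $0$ but does not flatten.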
11.5.6 The Dirichlet-Jordan theorem

The pointwise convergence of the Fourier series of a continuous or summable function is a subtle question and goes far beyond Dini's test, Theorem 11.71. An important result, proved by Lejeune Dirichlet (1805-1859), shows in fact that a $2\pi$-periodic function which has only a finite number of jumps and maxima and minima has a Fourier series that converges pointwise to $(L^+ + L^-)/2$, where $L^\pm := \lim_{y \to x^\pm} f(y)$; in particular $S_nf(x) \to f(x)$ at the points of continuity. The same proof applies to functions with bounded variation, see Theorem 11.91. In 1876 Paul du Bois-Reymond (1831-1889) exhibited a continuous function whose Fourier series diverges at one point, and, therefore, showed that continuity alone does not suffice for the pointwise convergence of the Fourier series. We shall present a different example due to Lipót Fejér (1880-1959). Starting from this example one can construct continuous functions whose Fourier series do not converge on a denumerable dense set, for instance, the rationals. In the 1920's Andrey Kolmogorov (1903-1987) exhibited a continuous function with Fourier series divergent on a set with the power of the continuum, and Hugo Steinhaus (1887-1972) exhibited a continuous function whose Fourier series converges pointwise everywhere, but does not converge uniformly in any interval. Eventually, the question was clarified in 1966 by Lennart Carleson. Here we collect some complements.

a. The Dirichlet-Jordan test

11.91 Theorem (Dirichlet-Jordan). Let $f$ be a $2\pi$-periodic function with bounded total variation in $[a,b]$.
(i) For every $x \in ]a,b[$ we have $S_nf(x) \to (L^+ + L^-)/2$ where $L^\pm := \lim_{y \to x^\pm} f(y)$.
(ii) If $f$ is also continuous in $]a,b[$, then $S_nf(x) \to f(x)$ uniformly in any closed interval strictly contained in $]a,b[$.
Figure 11.15. The amplitude of the harmonics of $Q_{n,\mu}(x)$.
Proof. Let $[a,b]$ be an interval with $b - a < 2\pi$. Since every function with bounded variation in $[a,b]$ is the sum of an increasing function and of a decreasing function, we may also assume that $f$ is nondecreasing in $[a,b]$.

(i) Let $x \in ]a,b[$, $g_x(t) := f(x+t) - L^+ + f(x-t) - L^-$ where $L^\pm := \lim_{y \to x^\pm} f(y)$. We have
$$S_nf(x) - \frac{L^+ + L^-}{2} = \frac{1}{2\pi}\int_0^\pi \big(f(x+t) - L^+ + f(x-t) - L^-\big)D_n(t)\,dt$$
$$= \frac{1}{2\pi}\int_0^\delta g_x(t)D_n(t)\,dt + \frac{1}{2\pi}\int_\delta^\pi g_x(s)D_n(s)\,ds =: I_1 + I_2, \qquad (11.52)$$
where $\delta > 0$ is to be chosen later. Since $f(x+t) - L^+$ and $-(f(x-t) - L^-)$ are nondecreasing near $t = 0$ and nonnegative, the second theorem of mean value and (11.42) yield
$$|I_1| \le 2\pi\big(|f(x+\delta) - L^+| + |f(x-\delta) - L^-|\big), \qquad (11.53)$$
while (11.43) yields
$$|I_2| \le \frac{c(\delta)}{n}. \qquad (11.54)$$
Therefore, given $\epsilon > 0$, we can choose $\delta > 0$ in such a way that
$$|f(x+\delta) - L^+| + |f(x-\delta) - L^-| < \epsilon$$
to obtain from (11.53) and (11.54) that
$$\Big|S_nf(x) - \frac{L^+ + L^-}{2}\Big| \le 2\pi\epsilon + \frac{c(\delta)}{n}.$$
That proves the pointwise convergence at $x$.

(ii) In this case for every $x \in ]a,b[$ we have $L^+ = L^- = f(x)$ and it suffices to estimate uniformly in $[a+\sigma, b-\sigma]$, $0 < \sigma < (b-a)/2$, $I_1$ and $I_2$ in (11.52). Since $f$ is uniformly continuous in $[a,b]$, given $\epsilon > 0$, we can choose $\delta$, $0 < \delta < \sigma$, in such a way that
$$|f(x+\delta) - f(x)| + |f(x-\delta) - f(x)| < \epsilon$$
uniformly with respect to $x$ in $[a+\sigma, b-\sigma]$, hence from (11.53) $|I_1| \le 2\pi\epsilon$ uniformly in $[a+\sigma, b-\sigma]$. The uniform estimate of $|I_2|$ is instead the claim of Proposition 11.83. Finally, if $b - a \ge 2\pi$, it suffices to write $[a,b]$ as a finite union of intervals of length less than $2\pi$ and apply the above to them. □
11.92 Remark. Notice that the Dirichlet-Jordan theorem is in fact a claim on monotone functions. Monotone functions are continuous except on a denumerable set of jump points, that is not necessarily discrete.
b. Fejér example

Let $\mu \in \mathbb{N}$ be a natural number to be chosen later. For every $n \in \mathbb{N}$ consider the trigonometric polynomial
$$Q_{n,\mu}(x) := \sum_{k=1}^{n} \frac{\cos((n+\mu-k)x) - \cos((n+\mu+k)x)}{k} = 2\sin((n+\mu)x)\sum_{k=1}^{n} \frac{\sin kx}{k},$$
see Figure 11.15. It is a cosine polynomial with harmonics of order $\mu, \mu+1, \dots, n+\mu-1, n+\mu+1, \dots, 2n+\mu$. Now choose
o a sequence $\{a_k\}$ of positive numbers in such a way that $\sum_{k=1}^{\infty} a_k < +\infty$,
o a sequence $\{n_k\}$ of nonnegative integers such that $a_k \log n_k$ does not converge to zero,
o a sequence $\{\mu_k\}$ of nonnegative integers such that $\mu_{k+1} > \mu_k + 2n_k$,
and set
$$Q_k(x) := Q_{n_k,\mu_k}(x).$$
Since the sums $\sum_{k=1}^{n} \frac{\sin kx}{k}$ are equibounded, see (11.42) and (11.48), the polynomials $Q_{n,\mu}(x)$ are equibounded independently of $n, \mu \in \mathbb{N}$ and $x \in \mathbb{R}$. Consequently $\sum_{k=1}^{\infty} a_kQ_k(x)$ converges absolutely in $C^0(\mathbb{R})$ to a continuous function $f(x)$, $x \in \mathbb{R}$,
$$f(x) := \sum_{k=1}^{\infty} a_kQ_k(x),$$
which is $2\pi$-periodic and even, since $f$ is a sum of cosines. The Fourier series of $f$ is then a cosine series
$$Sf(x) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k\cos(kx).$$
We now show that $S_nf(0)$ has no limit as $n \to \infty$. Since $f$ is a uniform limit, we can integrate term by term to get the Fourier coefficients
$$c_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(jt)\,dt = \sum_{k=1}^{\infty} \frac{a_k}{\pi}\int_{-\pi}^{\pi} Q_k(t)\cos(jt)\,dt.$$
Because of the choice of the $\mu_k$, the harmonics of $Q_k$ and $Q_h$, $h \ne k$, are distinct; in particular
$$\sum_{j=\mu_k}^{\mu_k+n_k-1} c_j = a_k\sum_{j=1}^{n_k} \frac{1}{j} > a_k\log n_k.$$
Consequently, we deduce for the Fourier partial sums of $f$ at $0$
$$S_{\mu_k+n_k-1}f(0) - S_{\mu_k-1}f(0) = \sum_{j=\mu_k}^{\mu_k+n_k-1} c_j > a_k\log n_k.$$
Therefore $S_nf(0)$ does not converge, because of our choice of $\{n_k\}$. A possible choice of the previous constants is
$$a_k := \frac{1}{k^2}, \qquad n_k = 2^{k^2}, \qquad \mu_k = 2^{k^2},$$
which yields $a_k\log(n_k) = \log 2$.
Figure 11.16. Paul du Bois-Reymond (1831-1889) and Lipót Fejér (1880-1959).
11.5.7 Fejér's sums

Let $f$ be a continuous and $2\pi$-periodic function. The Fourier partial sums of $f$ need not provide a good approximation of $f$, neither uniformly nor pointwise; on the other hand $f$ can be approximated uniformly by trigonometric polynomials, see Theorem 9.58. A specific interesting approximation was pointed out by Lipót Fejér (1880-1959). Let $f \in L^1_{2\pi}$ and $S_nf(x) = \sum_{k=-n}^{n} c_ke^{ikx}$. Fejér's sums of $f$ are defined by
$$F_nf(x) := \frac{1}{n+1}\sum_{k=0}^{n} S_kf(x).$$
Trivially $F_nf(x)$ are trigonometric polynomials of order $n$ that can be written as
$$F_nf(x) = \frac{1}{n+1}\sum_{k=0}^{n}\sum_{j=-k}^{k} c_je^{ijx} = \sum_{j=-n}^{n}\Big(1 - \frac{|j|}{n+1}\Big)c_je^{ijx}.$$
We have

11.93 Theorem (Fejér). Let $f \in L^1_{2\pi} \cap C^0(\mathbb{R})$. The Fejér sums $F_nf(x)$ converge to $f$ uniformly in $\mathbb{R}$.

Before proving Fejér's theorem, let us state a few properties of the Fejér kernel defined by
$$F_n(x) := \frac{1}{n+1}\sum_{k=0}^{n} D_k(x),$$
where $D_k$ denotes the Dirichlet kernel of order $k$.

11.94 Proposition. We have
$$F_n(x) = \begin{cases} n+1 & \text{if } x = 2k\pi,\ k \in \mathbb{Z},\\[4pt] \dfrac{1}{n+1}\Big(\dfrac{\sin((n+1)x/2)}{\sin(x/2)}\Big)^2 & \text{otherwise.} \end{cases}$$
453
Proof. Trivially
i^«(0) = ^
E l?fc(0) = - ^ X^(2fc + 1) = ^ ^ i ± ^ = n + 1.
Observing that in = / s i n ( x / 2 ) + . . . + sin((n + l/2)x)N /^^ ^ ^^ V sin(x/2) / / the expression in parentheses is the imaginary part of gix/2 _|_ gi3x/2 _|
|_ gi(2n+l)a;/2
sin(a;/2) ^ ^i{n+i)x/2 sin(x/2)
g i x / 2 / g i ( n + l ) x _ ]^\
~
sin(a:/2)(e»^ - 1)
2ism{{n + l ) x / 2 ) ^ ^u(n+i)x/2) 2isin(x/2)
sin((n + l ) x / 2 ) ^ sin2(x/2)
we see that Fn{x) =
1 /sin((n + l ) x / 2 ) \ 2 ^ ( n + 1 V sin(x/2)
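The closed form of the Fejér kernel can be compared numerically with its definition as the mean of Dirichlet kernels. This is a minimal sketch with illustrative function names; it assumes the expansion $D_k(x) = 1 + 2\sum_{j=1}^{k}\cos jx$ of the Dirichlet kernel.

```python
import math


def dirichlet_kernel(k, x):
    """D_k(x) = 1 + 2*sum_{j=1}^k cos(j x), i.e. the sum of exp(i j x) over |j| <= k."""
    return 1 + 2 * sum(math.cos(j * x) for j in range(1, k + 1))


def fejer_kernel_mean(n, x):
    """F_n(x) from the definition: the arithmetic mean of D_0(x), ..., D_n(x)."""
    return sum(dirichlet_kernel(k, x) for k in range(n + 1)) / (n + 1)


def fejer_kernel_closed(n, x):
    """F_n(x) from the closed form; returns n + 1 at multiples of 2*pi."""
    half = math.sin(x / 2)
    if abs(half) < 1e-12:  # x is (numerically) a multiple of 2*pi
        return float(n + 1)
    return (math.sin((n + 1) * x / 2) / half) ** 2 / (n + 1)


if __name__ == "__main__":
    for x in (0.0, 0.3, 1.0, 2.5):
        print(x, fejer_kernel_mean(5, x), fejer_kernel_closed(5, x))
```

The closed form also makes the nonnegativity of $F_n$ evident, in contrast with the Dirichlet kernel, which changes sign.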
11.95 Proposition. Fejér's kernel has the following properties:
(i) $F_n(x) \ge 0$,
(ii) $F_n(x)$ is even,
(iii) $\frac{1}{2\pi}\int_{-\pi}^{\pi} F_n(t)\,dt = 1$,
(iv) $F_n(x)$ attains its maximum value at $2k\pi$, $k \in \mathbb{Z}$,
(v) for all $\delta > 0$, $F_n(x) \to 0$ uniformly in $[\delta,\pi]$ as $n \to \infty$,
(vi) there exists a constant $A > 0$ such that $F_n(x) \le \frac{A}{(n+1)x^2}$ for all $n \in \mathbb{N}$ and $x \ne 0$ in $[-\pi,\pi]$,
(vii) $\{F_n\}$ is an approximation of the Dirac mass $\delta$.

Proof. (i), (ii), (iii), (iv), (v) are trivial; (vi) follows from the estimate $\sin t \ge 2t/\pi$ in $]0,\pi/2]$. Finally (vii) follows from (iii) and (v). □

Proof of Fejér's theorem, Theorem 11.93. First we observe that
$$F_nf(x) - f(x) = \frac{1}{2\pi}\int_0^\pi \big(f(x+t) + f(x-t) - 2f(x)\big)F_n(t)\,dt.$$
Thus, if we set $g(t) := f(x+t) + f(x-t) - 2f(x)$,
$$F_nf(x) - f(x) = \frac{1}{2\pi}\int_0^\delta g(t)F_n(t)\,dt + \frac{1}{2\pi}\int_\delta^\pi g(t)F_n(t)\,dt =: I_1 + I_2.$$
Now, given $\epsilon > 0$, we can choose $\delta$ so that $|f(x+t) + f(x-t) - 2f(x)| < 2\epsilon$ for all $t \in [0,\delta]$ uniformly in $x$, since $f$ is uniformly continuous. Hence
$$|I_1| \le 2\epsilon\int_0^\delta F_n(t)\,dt \le 2\epsilon\int_0^\pi F_n(t)\,dt = 2\pi\epsilon.$$
On the other hand $|I_2| \le 4\|f\|_\infty A/((n+1)\delta^2)$, hence
$$|F_nf(x) - f(x)| \le 2\pi\epsilon + 4\|f\|_\infty\,\frac{A}{(n+1)\delta^2}.$$
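Fejér's sums can be computed directly from the Fourier coefficients through the weights $1 - |j|/(n+1)$. The sketch below uses an example chosen here, not taken from the text: $f(t) = |t|$ on $[-\pi,\pi]$, whose coefficients are $c_0 = \pi/2$ and $c_k = ((-1)^k - 1)/(\pi k^2)$ for $k \ne 0$. The sup norm is only estimated on a uniform grid, and the function names are illustrative.

```python
import math


def abs_coefficient(k):
    """Fourier coefficient c_k of f(t) = |t| on [-pi, pi] (real and even in k)."""
    if k == 0:
        return math.pi / 2
    if k % 2 == 0:
        return 0.0
    return -2.0 / (math.pi * k * k)


def partial_sum(n, x):
    """S_n f(x); since c_k = c_{-k} is real, exp(i k x) may be replaced by cos(k x)."""
    return sum(abs_coefficient(k) * math.cos(k * x) for k in range(-n, n + 1))


def fejer_sum(n, x):
    """F_n f(x) = sum over |k| <= n of (1 - |k|/(n+1)) * c_k * exp(i k x)."""
    return sum((1 - abs(k) / (n + 1)) * abs_coefficient(k) * math.cos(k * x)
               for k in range(-n, n + 1))


def sup_error(approximation, n, grid=400):
    """Largest deviation from |x| over a uniform grid of [-pi, pi]."""
    worst = 0.0
    for j in range(grid + 1):
        x = -math.pi + 2 * math.pi * j / grid
        worst = max(worst, abs(approximation(n, x) - abs(x)))
    return worst


if __name__ == "__main__":
    for n in (10, 100):
        print(n, sup_error(partial_sum, n), sup_error(fejer_sum, n))
```

For this smooth-enough example both families converge uniformly; the point of the experiment is only that the Fejér error decreases with $n$, as the theorem guarantees for every continuous $2\pi$-periodic $f$.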
A. Mathematicians and Other Scientists
Maria Agnesi (1718-1799) Pavel Alexandroff (1896-1982) James Alexander (1888-1971) Archimedes of Syracuse (287BC-212BC) Cesare Arzela (1847-1912) Giulio Ascoli (1843-1896) Rene-Louis Baire (1874-1932) Stefan Banach (1892-1945) Isaac Barrow (1630-1677) Giusto Bellavitis (1803-1880) Daniel Bernoulli (1700-1782) Jacob Bernoulli (1654-1705) Johann Bernoulli (1667-1748) Sergei Bernstein (1880-1968) Wilhelm Bessel (1784-1846) Jacques Binet (1786-1856) George Birkhoff (1884-1944) Bernhard Bolzano (1781-1848) Emile Borel (1871-1956) Karol Borsuk (1905-1982) L. E. Brouwer (1881-1966) Renato Caccioppoli (1904-1959) Georg Cantor (1845-1918) Alfredo Capelli (1855-1910) Lennart Carleson (1928- ) Lazare Carnot (1753-1823) Elie Cartan (1869-1951) Giovanni Cassini (1625-1712) Augustin-Louis Cauchy (1789-1857) Arthur Cayley (1821-1895) Eduard Cech (1893-1960) Pafnuty Chebyshev (1821-1894) Richard Courant (1888-1972) Gabriel Cramer (1704-1752) Jean d'Alembert (1717-1783) Georges de Rham (1903-1990) Richard Dedekind (1831-1916) Rene Descartes (1596-1650) Ulisse Dini (1845-1918) Diocles (240BC-180BC) Paul Dirac (1902-1984) Lejeune Dirichlet (1805-1859) Paul du Bois-Reymond (1831-1889) James Dugundji (1919-1985)
Albrecht Dürer (1471-1528) Euclid of Alexandria (325BC-265BC) Leonhard Euler (1707-1783) Alessandro Faedo (1914-2001) Herbert Federer (1920- ) Lipot Fejer (1880-1959) Pierre de Fermat (1601-1665) Sir Ronald Fisher (1890-1962) Joseph Fourier (1768-1830) Maurice Frechet (1878-1973) Ivar Fredholm (1866-1927) Georg Frobenius (1849-1917) Boris Galerkin (1871-1945) Galileo Galilei (1564-1642) Carl Friedrich Gauss (1777-1855) Israel Moiseevitch Gelfand (1913- ) Camille-Christophe Gerono (1799-1891) J. Willard Gibbs (1839-1903) Jorgen Gram (1850-1916) Hermann Grassmann (1808-1877) George Green (1793-1841) Thomas Gronwall (1877-1932) Jacques Hadamard (1865-1963) Hans Hahn (1879-1934) Georg Hamel (1877-1954) William R. Hamilton (1805-1865) Felix Hausdorff (1869-1942) Oliver Heaviside (1850-1925) Eduard Heine (1821-1881) Charles Hermite (1822-1901) David Hilbert (1862-1943) Otto Holder (1859-1937) Robert Hooke (1635-1703) Heinz Hopf (1894-1971) Guillaume de l'Hôpital (1661-1704) Christiaan Huygens (1629-1695) Carl Jacobi (1804-1851) Johan Jensen (1859-1925) Camille Jordan (1838-1922) Oliver Kellogg (1878-1957) Felix Klein (1849-1925) Helge von Koch (1870-1924) Andrey Kolmogorov (1903-1987) Leopold Kronecker (1823-1891)
Kazimierz Kuratowski (1896-1980) Joseph-Louis Lagrange (1736-1813) Edmond Laguerre (1834-1886) Pierre-Simon Laplace (1749-1827) Caspar Lax (1487-1560) Henri Lebesgue (1875-1941) Solomon Lefschetz (1884-1972) Adrien-Marie Legendre (1752-1833) Gottfried von Leibniz (1646-1716) Jean Leray (1906-1998) Sophus Lie (1842-1899) Ernst Lindelof (1870-1946) Rudolf Lipschitz (1832-1903) Jules Lissajous (1822-1880) L. Agranovich Lyusternik (1899-1981) James Clerk Maxwell (1831-1879) Edward McShane (1904-1989) Arthur Milgram (1912-1961) Hermann Minkowski (1864-1909) Carlo Miranda (1912-1982) August Mobius (1790-1868) Harald Marston Morse (1892-1977) Mark Naimark (1909-1978) Nicomedes (280BC-210BC) M.-A. Parseval des Chênes (1755-1836) Blaise Pascal (1623-1662) Etienne Pascal (1588-1640) Giuseppe Peano (1858-1932) Oskar Perron (1880-1975) Emile Picard (1856-1941) J. Henri Poincare (1854-1912) Diadochus Proclus (411-485) Pythagoras of Samos (580BC-520BC) Hans Rademacher (1892-1969) Tibor Rado (1895-1965) Lord William Strutt Rayleigh (1842-1919)
Kurt Reidemeister (1893-1971) G. F. Bernhard Riemann (1826-1866) Frigyes Riesz (1880-1956) Marcel Riesz (1886-1969) Eugene Rouche (1832-1910) Adhemar de Saint Venant (1797-1886) Stanislaw Saks (1897-1942) Helmut Schaefer (1925- ) Juliusz Schauder (1899-1943) Erhard Schmidt (1876-1959) Lev G. Schnirelmann (1905-1938) Hermann Schwarz (1843-1921) Karl Seifert (1907-1996) Takakazu Seki (1642-1708) Carlo Severini (1872-1951) Hugo Steinhaus (1887-1972) Thomas Jan Stieltjes (1856-1894) Marshall Stone (1903-1989) James Joseph Sylvester (1814-1897) Brook Taylor (1685-1731) Heinrich Tietze (1880-1964) Leonida Tonelli (1885-1946) Stanislaw Ulam (1909-1984) Pavel Urysohn (1898-1924) Charles de la Vallee-Poussin (1866-1962) Egbert van Kampen (1908-1942) Alexandre Vandermonde (1735-1796) Giuseppe Vitali (1875-1932) Vito Volterra (1860-1940) John von Neumann (1903-1957) Karl Weierstrass (1815-1897) Norbert Wiener (1894-1964) Kosaku Yosida (1909-1990) William Young (1863-1942) Nikolay Zhukovsky (1847-1921) Max Zorn (1906-1993) Antoni Zygmund (1900-1992)
There exist many web sites dedicated to the history of mathematics; we mention, e.g., http://www-history.mcs.st-and.ac.uk/~history.
B. Bibliographical Notes
We collect here a few suggestions for the readers interested in delving deeper into some of the topics treated in this volume.

Concerning linear algebra the reader may consult
o P. D. Lax, Linear Algebra, Wiley & Sons, New York, 1997,
o S. Lang, Linear Algebra, Addison-Wesley, Reading, 1966,
o A. Quarteroni, R. Sacco, F. Saleri, Numerical Mathematics, Springer-Verlag, New York, 2000,
o G. Strang, Introduction to Applied Mathematics, Wellesley-Cambridge Press, 1961.

Of course, curves and surfaces are discussed in many textbooks. We mention
o M. do Carmo, Differential Geometry of Curves and Surfaces, Prentice Hall Inc., New Jersey, 1976,
o A. Gray, Modern Differential Geometry of Curves and Surfaces, CRC Press, Boca Raton, 1993.

Concerning general topology and topology the reader may consult, among the many volumes that are available,
o J. Dugundji, Topology, Allyn and Bacon, Inc., Boston, 1966,
o K. Jänich, Topology, Springer-Verlag, Berlin, 1994,
o I. M. Singer, J. A. Thorpe, Lecture Notes on Elementary Topology and Geometry, Springer-Verlag, New York, 1967,
o J. W. Vick, Homology Theory. An Introduction to Algebraic Topology, Springer-Verlag, New York, 1994.

With special reference to degree theory and existence of fixed points we mention
o A. Granas, J. Dugundji, Fixed Point Theory, Springer-Verlag, New York, 2003,
o L. Nirenberg, Topics in Nonlinear Functional Analysis, Courant Institute of Mathematical Sciences, New York University, 1974.

The literature on Banach and Hilbert spaces, linear operators, spectral theory and linear and nonlinear functional analysis is incredibly wide. Here we mention only a few titles
o N. J. Akhiezer, I. M. Glazman, Theory of Linear Operators in Hilbert Spaces, Dover, New York, 1983,
o H. Brezis, Analyse Fonctionnelle, Masson, Paris, 1983,
o A. Friedman, Foundations of Modern Analysis, Dover, New York, 1970,
and also
o N. Dunford, J. Schwartz, Linear Operators, John Wiley, New York, 1988,
o K. Yosida, Functional Analysis, Springer-Verlag, Berlin, 1974,
as well as the celebrated
o R. Courant, D. Hilbert, Methods of Mathematical Physics, Interscience Publishers, 1953,
o F. Riesz, B. Sz. Nagy, Leçons d'Analyse Fonctionnelle, Gauthier-Villars, Paris, 1965.
C. Index
accumulation point, 164 algebra - End (X), 326 - ideal, 402 - - maximal, 402 - - proper, 402 - of functions, 316 - spectrum, 403 - with identity, 402 algorithm - Gram-Schmidt, 85, 99 ball - open, 152 Banach - algebra, 326, 403 - closed graph theorem, 330 - continuous inverse theorem, 330 - fixed point theorem, 335 - indicatrix, 265 - open mapping theorem, 329 - space, 286 - ordered, 343 basis, 43 - dual, 54 - orthonormal, 85 bilinear form - bounded, 370 - coercive, 370 bilinear forms, 95 - signature, 97 bracket - Lie, 38 Carnot's formula, 81 cluster point, 164 coefficients - Fourier, 433 compact set, 200 - relatively, 203 - sequentially, 197 conies, 106
connected - component, 211 - set, 210 continuity - for metric spaces, 163 continuity method, 337 contractible spaces^ 253 convergence - in a metric sp£ice, 153 - pointwise, 157, 297 - uniform, 157, 294, 296 on compact subsets, 310 - weak, 398 convex hull, 208 convolution, 309, 310 - integral means, 309 coordinates - cylindrical, 168 - polar, 168 - spherical, 168 covectors, 54 covering, 165, 199, 260 - locally finite, 165 - net, 199 criterion - Hausdorff, 200 cube - Hilbert, 158 curve, 219 - arc length reparametrization, 232 - closed, 219 - cylindrical helix, 221 - cylindrical representation length, 231 - equivalent, 224 - intrinsic parametrization, 243 - length, 227 in cylindrical coordinates, 231 in polar coordinates, 231 in spherical coordinates, 231 minimal, 397 of graphs, 231
460
-
Index
semicontinuity, 395 Lipschitz-continuous, 230 orientation, 224 parametrization, 219 Peano, 228 piecewise regular, 226 piecewise-C^, 226 polar representation, 221 length, 231 rectifiable, 227 regular, 224 self-intersection, 223 simple, 223 spherical representation length, 231 tangent vectors, 225 total variation, 241 trace, 219 trajectory, 219 von Koch, 228
decomposition - polar, 125 - singulai' value, 126 definitively, 192 degree, 268 - integral formula, 269 - mapping - - degree, 268 - on 5 1 , 266 - with respect to a point, 275 delta - Dirac, 313 - - approximation, 313 - Kronecker, 12 dense set, 192 determinant, 33, 34 - area, 31 - Laplace's formula, 36 - of a product, 35 - of the transpose, 35 - Vandermonde, 39 diameter, 153 Dini - regular, 438 - test, 438, 444 Dirichlet - problem, 416 discrete Fourier transform, 134, 144 - inverse, 134 distance, 81, 84, 151, 154, 161, 286 - between sets, 216 - codes, 156 - discrete, 156 - Euclidean, 155 - from a set, 162 - Hausdorff, 299 - in ip, 158
- in the mean, 160 - integral, 160 - uniform, 157, 159 duality, 55 eigenspace, 58 eigenvalue, 58, 384, 391 - min-max characterization, 392 - multiplicity algebraic, 62 geometric, 62 - real and complex, 66 - variational characterization, 115 eigenvector, 58 energy equality, 360, 441 example - Fejer, 451 exponential operator, 327 Fejer - example, 451 - sums, 452 fixed point, 335 force, 92 forms - bilinear, 95 - linear, 54 - quadratic, 115 formula - Binet, 35 - Carnot, 81 - degree, 269 - Euler, 281 - Grassmann, 18, 47 - Hadamard, 143 - inverse matrix, 30 - Laplace, 36 - Parseval, 358, 441 - polarity, 80, 83 - rank, 49 of matrices, 16 Fourier - coefficients, 357, 433 - series, 357, 433 uniform convergence, 444 Fredholm's alternative, 50 function, see map - BauEich's indicatrix, 265 - bounded total variation, 244 - closed, 194, 216 - coercive, 203 - continuous, 163, 182 image of a compact, 202 image of a connected set, 212 inverse of, 202 - convex, 287 - exponential, 171 - Holder-continuous, 161
Index
- homeomorphism, 182 - Joukowski, 169 - limit, 164 - Lipschitz-continuous, 161 extension, 207 - logarithm, 171 - lower semicontinuous, 204 - Möbius, 170 - open, 194, 216 - proper, 216 - sequentially semicontinuous, 203 - total variation, 241 - uniformly continuous, 205 functions - equibounded, 301 - equicontinuous, 301 - Hölder-continuous, 301 - homotopic, 250 fundamental group, 258
gauge function, 333 geodesic, 152 - distance, 154 Gibbs phenomenon, 448 Green operator, 421 group - fundamental, 258 - linear, 50 - orthogonal, 88 - unitary, 88
Hölder function, 161 Hausdorff criterion, 200 Hermitian product - continuity, 352 Hilbert space, 158, 353 - basis, 355 - complete system, 355 - dual, 364 - Fourier series, 357 - pre, 351 - separable, 355 - weak convergence, 398 Hilbert's cube, 393 homeomorphism, 182 homotopy, 250 - equivalence, 253 - first group, 258 - relative, 256 - with fixed endpoints, 256 ideal, 402 - maximal, 402 - proper, 402 identity - Jacobi, 38 - parallelogram, 80, 287 inequality
- Bessel, 358, 440 - Cauchy-Schwarz, 80, 83, 352 - Gronwall, 410 - Jensen, 400 - Minkowski, 155, 158, 293 - triangular, 81, 84 - variational, 372 inner product - continuity, 352 integral - de la Vallée Poussin, 316 integral equations - Fredholm, 426, 428, 429 - Volterra, 425, 426, 429 invariant - metric, 183 - topological, 184 isolated point, 180 isometries, 87
kernel - de la Vallée-Poussin, 315 - Dirichlet, 435
law - parallelogram, 287 least squares, 129 - canonical equation, 129 lemma - Gronwall, 410 - Riemann-Lebesgue, 436 - Urysohn, 209 liminf, 204 limit point, 164 limsup, 204 linear - combination, 42 - equation, 50 - operator, 44 characteristic polynomial, 60 - eigenspace, 58 - eigenvalue, 58 - eigenvector, 58 - subspace, 4 - systems, 22 Cramer's rule, 36 linear difference - equations - - of higher order, 137 linear difference equations - systems, 136 linear regression, 374 Lipschitz - constant, 161 - function, 161 map - affine, 37
- compact, 339 - linear, 44 - - affine, 37 - - associated matrix, 48 - - automorphism, 50 - - endomorphism, 50 - - graph, 109 - - image, 45 - - kernel, 45 - - rank, 45 - proper, 265 - Riesz, 91, 367 mapping - degree, 268 matrix - algebra, 11 - associated to a linear map, 48 - block, 39, 137 - characteristic of a, 35 - cofactors, 36 - complementing minor, 34 - congruent to, 102 - determinant, 33, 34 - diagonal, 12 - diagonalizable, 60 - eigenspace, 58 - eigenvalue, 58 - eigenvector, 58 - Gauss reduced, 26 pivots, 26 - Gram, 82, 85, 96, 101, 143 - identity, 12 - inverse, 12, 36 - Jordan's - - basis, 72 - - canonical form, 70 - Jordan's formula, 137 - LR decomposition, 30 - nilpotent, 69 - nonsingular, 15 - orthogonal, 88 - polar form, 125 - power, 137 - product, 11 - rank, 16 - similar to, 60 - singular value decomposition, 126 - singular values, 125 - spectrum, 58 - stair-shaped, 26 pivots, 26 - symmetric, 38 - trace, 38, 61 - transpose, 13 - triangular - - lower, 12 - - upper, 12 - unitary, 88
maximum point, 201 method - continuity, 337 - Faedo-Galerkin, 377 - Gauss elimination, 25 - Gram-Schmidt, 106 - Jacobi, 100 - least squares, 128 - Picard, 335, 407 - Ritz, 373 error estimate, 373 - shooting, 418 - super- and sub-solutions, 344 - variational for the eigenvalues, 116, 118 metric, 97 - Artin, 103 - Euclidean, 103 - invariant, 183 - Lorentz, 103 - nondegenerate, 97 - positive, 97 - pseudoeuclidean, 103 metric axioms, 151 metric space, 151
- C \ 160 - compact, 200 - complete, 185 - completion, 186 - connected, 210 - connected component, 211 - continuity in, 163 - convergence in, 153 - immersion in ℓ_∞, 402 - immersion in C^, 402 - locally connected, 212 - path-connected, 213 - sequentially compact, 197 metrics, 151 - equivalent, 188 - in a product space, 156 - topologically equivalent, 178 minimal geodesics, 397 minimizing sequence, 201 minimum point, 201 Minkowski - discrete inequality, 155 - inequality, 158 Minkowski inequality, 293 minor - complementing, 34 modulus of continuity, 320 mollifiers, 312 Moore-Penrose inverse, 369, 374 neighborhood, 177 norm, 79, 154, 285 - C^'", 301 - C i , 296
- equivalent norms, 288 - L^∞, 294 - ℓ_p, 292 - L^p, 293 - uniform, 294 - uniform or infinity, 159 normed space, 154, 285
- L(X, Y), 324 - series, 288 absolute convergence, 289 normed spaces - convex sets, 287 numbers - Fibonacci, 140 ODE - Cauchy problem, 405 - comparison theorem, 420 - continuation of solutions, 408 - Gronwall's lemma, 410 - integral curves, 404 - maximum principle, 419, 420 - Picard approximations, 407 - shooting method, 418 operator - adjoint, 93, 369 - closed range, 369 - commuting, 388 - compact, 378 - compact perturbation, 379 - eigenvalue, 384 - eigenvector, 384 - Green, 372, 421 - linear - - antisymmetric, 121 - - isometry, 121 - - normal, 121 - - positive, 117 - - self-adjoint, 121 - - symmetric, 121 - normal, 121, 388 - positive, 117 - powers, 119 - projection, 111, 368 - resolvent, 384 - Riesz, 366 - self-adjoint, 111, 369 - singular values, 125 - spectrum, 384 pointwise, 384 - square root, 120 operators - bounded, 324 - compact, 339 - exponential, 327 - pointwise convergence, 325 - Schauder, 341 - uniform convergence, 325
order cone, 343 parallelogram law, 80, 83, 352 path, 219 Peano curve, 228 Peano's phenomenon, 416 perfect set, 192 phenomenon - Peano, 416 point - adherent, 179 - boundary, 179 - cluster, 164 - exterior, 179 - interior, 179 - isolated, 180 - limit, 164 - of accumulation, 164 polynomials - Bernstein, 306 - Hermite, 361 - Jacobi, 361 - Laguerre, 361 - Legendre, 361 - Stieltjes, 314 - Tchebychev, 361 principle - abstract Dirichlet's, 364, 371 - Cantor, 188 - maximum, 419, 420 - of condensation of singularities, 329 - of uniform boundedness, 328 - Riemann's localization, 445 problem - Dirichlet, 416 product - Hermitian, 82 - inner, 79 - scalar, 79 projection - stereographic, 168 quadratic forms, 104 quadrics, 107 rank, 16 - of the transpose, 17 Rayleigh's quotient, 392 resolvent, 384 retraction, 254 scalars, 4, 41 segment-connected set, 213 semicontinuous function - sequentially, 203 sequence - Cauchy, 185
- convergent, 153 series, 288 - Fourier, 357, 433 set - boundary of, 179 - bounded, 153 totally, 199 - closed, 175 - closure of, 179 - compact, 200 - - sequentially, 197 - complement of, 175 - connected, 210 in ℝ, 211 - convex, 287 - convex hull of, 208 - dense, 192 - derived of, 180 - discrete, 192 - interior, 179 - meager, 189 - neighborhood, 177 - nowhere dense, 189 - of the first category, 189 - of the second category, 189 - open, 175 - perfect, 192 - regular closed, 193 - regular open, 193 - relatively compact, 203 - segment-connected, 213 - separated, 210 small oscillations, 141 - normal modes, 143 - proper frequencies, 143 smoothing kernel, 312 space - C-?, 346 - C^'", 301 - C^k, 296
- L^p, 293 - ℓ_∞, 295 - ℓ_p, 292 - CO, 356 - CO, 159 - contractible, 253 - ℓ_∞, 157 - L^p, 161 - ℓ_p, 158 - Hilbert, 353 - Hilbert's, 158 - locally path-connected, 262 - L^2(]a, b[), 354 - ℓ_2, 353 - pre-Hilbert, 351 - simply connected, 259 - topologically complete, 194 spectral theorem, 387
spectrum, 58, 384 - characterization, 385 - pointwise, 384 subsolution, 344 subspace - orthogonal, 90 supersolution, 344 test - Dini, 438, 444 theorem - alternative, 94, 380, 383 - Baire, 188 - Baire of approximation, 319 - Banach's fixed point, 335 - Banach-Saks, 399 - Banach-Steinhaus, 328 - Bernstein, 306, 423 - Binet, 35 - Bolzano-Weierstrass, 198 - Borsuk, 273 - Borsuk's separation, 280 - Borsuk-Ulam, 278 - Brouwer, 273 - Brouwer's fixed point, 274, 276, 339 - Brouwer's invariance domain, 281 - Caccioppoli-Schauder, 341 - Cantor-Bernstein, 215 - Carnot, 81, 352 - Cayley-Hamilton, 67 - closed graph, 330 - comparison, 420 - continuation of solutions, 408 - continuous inverse, 330 - Courant, 116 - Cramer, 36 - de la Vallée Poussin, 315 - Dini, 299, 438 - Dirichlet-Jordan, 449 - Dugundji, 208 - existence of minimal geodesics, 397 of minimizers of convex coercive functionals, 401 - Fejér, 452 - finite covering, 200 - Fréchet-Weierstrass, 203 - Fredholm, 94 - Fredholm's alternative, 50 - fundamental of algebra, 271 - Gelfand-Kolmogorov, 402 - Gelfand-Naimark, 403 - generalized eigenvectors, 69 - Gram-Schmidt, 85 - Hahn-Banach, 331, 332, 334 - Hausdorff, 186 - Heine-Cantor-Borel, 205 - Hopf, 273
- intermediate value, 212 - Jacobi, 100 - Jordan, 280 - Jordan's canonical form, 72 - Jordan's separation, 280 - Jordan-Borsuk, 281 - Kirszbraun, 207 - Kronecker, 35 - Kuratowski, 215 - Lax-Milgram, 376 - Lyusternik-Schnirelmann, 278 - McShane, 207 - Miranda, 277 - nested sequence, 188 - open mapping, 329 - Peano, 415 - Perron-Frobenius, 282 - Picard-Lindelöf, 406 - Poincaré-Brouwer, 277 - polar decomposition, 125 - projection, 89, 367 - Pythagoras, 81, 84, 86, 354 - Riemann-Lebesgue, 436 - Riesz, 91, 291, 366, 371 - Riesz-Fischer, 360 - Riesz-Schauder, 385 - Rouché, 282 - Rouché-Capelli, 23 - Schaefer's fixed point, 342 - second mean value, 442 - Seifert-Van Kampen, 267 - simultaneous diagonalization, 117 - spectral, 112, 122, 385 - spectral resolution, 114 - stability for systems of linear difference equations, 140 - Stone-Weierstrass, 316 - Sylvester, 98, 101 - Tietze, 208 - Urysohn, 185 - Weierstrass, 201 - Weierstrass's approximation, 303 - Weierstrass's approximation for periodic functions, 307 theory - Courant-Hilbert-Schmidt, 389 completeness relations, 394 topological - invariant, 184 - property, 184 - space, 182 topological space - contractible, 253 - deformation retract, 254 - Hausdorff, 184 - retract, 254 - simply connected, 259
topology, 178, 182 - basis, 184 - discrete, 184 - indiscrete, 184 - of uniform convergence, 294 totally bounded set, 199 trigonometric polynomials, 130 - energy identity, 131 - Fourier coefficients, 131 - sampling, 132 tubular neighborhood, 159 variational - inequality, 372 vector space, 41 - K^n, 3 - automorphism, 50 - basis, 5, 43 canonical basis of K^n, 9 - orthonormal, 85 - coordinate system, 46 - dimension, 8, 45 - direct sum, 18, 47 - dual, 54 - Euclidean, 79 norm, 81 - Hermitian, 82 norm, 84 - linear combination, 4, 42 - linear subspace, 4 implicit representation, 18 parametric representation, 18 - ordered basis, 9 - subspace, 42 - supplementary, 47 - supplementary linear subspaces, 18 vectors, 41 - linearly dependent, 5 - linearly independent, 5, 42 - norm, 79 - orthogonal, 80, 84, 354 - orthonormal, 85 - span of, 42 von Koch curve, 228 work, 92 Yosida regularization, 319, 320
Printed in the United States of America