Mathematical Analysis: Linear and Metric Structures and Continuity


Author: Mariano Giaquinta | Giuseppe Modica



Birkhäuser Boston • Basel • Berlin

Mariano Giaquinta, Scuola Normale Superiore, Dipartimento di Matematica, I-56100 Pisa, Italy

Giuseppe Modica, Università degli Studi di Firenze, Dipartimento di Matematica Applicata, I-50139 Firenze, Italy

Cover design by Alex Gerasev.

Mathematics Subject Classification (2000): 00A35, 15-01, 32K99, 46L99, 32C18, 46E15, 46E20

Library of Congress Control Number: 2006927565

ISBN-10: 0-8176-4374-5

e-ISBN-10: 0-8176-4514-4

ISBN-13: 978-0-8176-4374-4

e-ISBN-13: 978-0-8176-4514-4

Printed on acid-free paper. ©2007 Birkhäuser Boston. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Birkhäuser Boston, c/o Springer Science+Business Media LLC, 233 Spring Street, New York, NY 10013, USA) and the author, except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. 9 8 7 6 5 4 3 2 1 www.birkhauser.com

(MP)

Preface

One of the fundamental ideas of mathematical analysis is the notion of a function; we use it to describe and study relationships among variable quantities in a system and transformations of a system. We have already discussed real functions of one real variable and a few examples of functions of several variables,¹ but there are many more examples of functions that the real world, physics, natural and social sciences, and mathematics have to offer:

(a) not only do we associate numbers and points to points, but we associate numbers or vectors to vectors,

(b) in the calculus of variations and in mechanics one associates an energy or action to each curve $y(t)$ connecting two points $(a, y(a))$ and $(b, y(b))$:

$$\mathcal{F}(y) := \int_a^b F(t, y(t), y'(t))\,dt$$

in terms of the so-called Lagrangian $F(t, y, p)$,

(c) in the theory of integral equations one maps a function into a new function

$$x(s) \to \int_a^b K(s, \tau)\,x(\tau)\,d\tau$$

by means of a kernel $K(s, \tau)$,

(d) in the theory of differential equations one considers transformations of a function $x(t)$ into the new function

$$t \to \int_a^t f(s, x(s))\,ds,$$

where $f(s, y)$ is given.

¹ in M. Giaquinta, G. Modica, Mathematical Analysis, Functions of One Variable, Birkhäuser, Boston, 2003, which we shall refer to as [GM1] and in M. Giaquinta, G. Modica, Mathematical Analysis. Approximation and Discrete Processes, Birkhäuser, Boston, 2004, which we shall refer to as [GM2].

Figure 0.1. Vito Volterra (1860-1940) and the frontispiece of his Leçons sur les fonctions de lignes.

Of course all the previous examples are covered by the abstract setting of functions or mappings from a set X (of numbers, points, functions, ...) with values in a set Y (of numbers, points, functions, ...). But in this general context we cannot grasp the richness and the specificity of the different situations, that is, the essential ingredients from the point of view of the question we want to study. In order to continue to treat these specificities in an abstract context in mathematics, but also use them in other fields, we proceed by identifying specific structures and studying the properties that only depend on these structures. In other words, we need to identify the relevant relationships among the elements of X and how these relationships reflect on the functions defined on X. Of course we may define many intermediate structures. In this volume we restrict ourselves to illustrating some particularly important structures: that of a linear or vector space (the setting in which we may consider linear combinations), that of a metric space (in which we axiomatize the notions of limit and continuity by means of a distance), that of a normed vector space (that combines linear and metric structures), that of a Banach space (where we may operate linearly and pass to the limit), and finally, that of a Hilbert space (that allows us to operate not only with the length of vectors, but also with the angles that they form).

The study of spaces of functions and, in particular, of spaces of continuous functions, originating in Italy in the years 1870-1880 in the works of, among others, Vito Volterra (1860-1940), Giulio Ascoli (1843-1896), Cesare Arzelà (1847-1912) and Ulisse Dini (1845-1918), is especially relevant in the previous context. A descriptive diagram is the following:


Accordingly, this book is divided into three parts. In the first part we study the linear structure. In the first three chapters we discuss basic ideas and results, including Jordan's canonical form of matrices, and in the fourth chapter we present the spectral theorem for self-adjoint and normal operators in finite dimensions. In the second part, we discuss the fundamental notions of general topology in the metric context in Chapters 5 and 6, continuous curves in Chapter 7, and finally, in Chapter 8 we illustrate the notions of homotopy and degree, and Brouwer's and Borsuk's theorems with a few applications to the topology of $\mathbb{R}^n$. In the third part, after some basic preliminaries, we discuss in Chapter 9 the Banach space of continuous functions, presenting some of the classical fixed point theorems that play a relevant role in the solvability of functional equations and, in particular, of differential equations. In Chapter 10 we deal with the theory of Hilbert spaces and the spectral theory of compact operators. Finally, in Chapter 11 we survey some of the important applications of the ideas and techniques that we previously developed to the study of geodesics, nonlinear ordinary differential and integral equations and trigonometric series.

In conclusion, this volume² aims at studying continuity and its implications both in finite- and infinite-dimensional spaces. It may be regarded as a companion to [GM1] and [GM2], and as a reference book for multi-dimensional calculus, since it presents the abstract context in which concrete problems posed by multi-dimensional calculus find their natural setting. Though this volume discusses more advanced material than [GM1,2], we have tried to keep the same spirit, always providing examples and exercises to clarify the main presentation, omitting several technicalities and developments that we thought to be too advanced and supplying the text with several illustrations.

² This book is a translation and revised edition of M. Giaquinta, G. Modica, Analisi Matematica, III. Strutture lineari e metriche, continuità, Pitagora Ed., Bologna, 2000.

We are greatly indebted to Cecilia Conti for her help in polishing our first draft and we warmly thank her. We would also like to thank Fabrizio Broglia and Roberto Conti for their comments when preparing the Italian edition; Laura Poggiolini, Marco Spadini and Umberto Tiberio for their comments and their invaluable help in catching errors and misprints; and Stefan Hildebrandt for his comments and suggestions, especially those concerning the choice of illustrations. Our special thanks also go to all members of the editorial technical staff of Birkhäuser for the excellent quality of their work and especially to Avanti Paranjpye and the executive editor Ann Kostant.

Note: We have tried to avoid misprints and errors. But, as most authors, we are imperfect authors. We will be very grateful to anybody who wants to inform us about errors or just misprints or wants to express criticism or other comments. Our e-mail addresses are [email protected]

[email protected]

We shall try to keep up an errata corrige at the following webpages: http://www.sns.it/~giaquinta http://www.dma.unifi.it/~modica

Mariano Giaquinta Giuseppe Modica Pisa and Firenze October 2006

Contents

Preface

Part I. Linear Algebra

1. Vectors, Matrices and Linear Systems
  1.1 The Linear Spaces $\mathbb{R}^n$ and $\mathbb{C}^n$
    a. Linear combinations
    b. Basis
    c. Dimension
    d. Ordered basis
  1.2 Matrices and Linear Operators
    a. The algebra of matrices
    b. A few special matrices
    c. Matrices and linear operators
    d. Image and kernel
    e. Grassmann's formula
    f. Parametric and implicit equations of a subspace
  1.3 Matrices and Linear Systems
    a. Linear systems and the language of linear algebra
    b. The Gauss elimination method
    c. The Gauss elimination procedure for nonhomogeneous linear systems
  1.4 Determinants
  1.5 Exercises

2. Vector Spaces and Linear Maps
  2.1 Vector Spaces and Linear Maps
    a. Definition
    b. Subspaces, linear combinations and bases
    c. Linear maps
    d. Coordinates in a finite-dimensional vector space
    e. Matrices associated to a linear map
    f. The space $\mathcal{L}(X, Y)$
    g. Linear abstract equations
    h. Changing coordinates
    i. The associated matrix under changes of basis
    j. The dual space $\mathcal{L}(X, \mathbb{K})$
    k. The bidual space
    l. Adjoint or dual maps
  2.2 Eigenvectors and Similar Matrices
    2.2.1 Eigenvectors
      a. Eigenvectors and eigenvalues
      b. Similar matrices
      c. The characteristic polynomial
      d. Algebraic and geometric multiplicity
      e. Diagonalizable matrices
      f. Triangularizable matrices
    2.2.2 Complex matrices
      a. The Cayley-Hamilton theorem
      b. Factorization and invariant subspaces
      c. Generalized eigenvectors and the spectral theorem
      d. Jordan's canonical form
      e. Elementary divisors
  2.3 Exercises

3. Euclidean and Hermitian Spaces
  3.1 The Geometry of Euclidean and Hermitian Spaces
    a. Euclidean spaces
    b. Hermitian spaces
    c. Orthonormal basis and the Gram-Schmidt algorithm
    d. Isometries
    e. The projection theorem
    f. Orthogonal subspaces
    g. Riesz's theorem
    h. The adjoint operator
  3.2 Metrics on Real Vector Spaces
    a. Bilinear forms and linear operators
    b. Symmetric bilinear forms or metrics
    c. Sylvester's theorem
    d. Existence of p-orthogonal bases
    e. Congruent matrices
    f. Classification of real metrics
    g. Quadratic forms
    h. Reducing to a sum of squares
  3.3 Exercises

4. Self-Adjoint Operators
  4.1 Elements of Spectral Theory
    4.1.1 Self-adjoint operators
      a. Self-adjoint operators
      b. The spectral theorem
      c. Spectral resolution
      d. Quadratic forms
      e. Positive operators
      f. The operators $A^*A$ and $AA^*$
      g. Powers of a self-adjoint operator
    4.1.2 Normal operators
      a. Simultaneous spectral decompositions
      b. Normal operators on Hermitian spaces
      c. Normal operators on Euclidean spaces
    4.1.3 Some representation formulas
      a. The operator $A^*A$
      b. Singular value decomposition
      c. The Moore-Penrose inverse
  4.2 Some Applications
    4.2.1 The method of least squares
      a. The method of least squares
      b. The function of linear regression
    4.2.2 Trigonometric polynomials
      a. Spectrum and products
      b. Sampling of trigonometric polynomials
      c. The discrete Fourier transform
    4.2.3 Systems of difference equations
      a. Systems of linear difference equations
      b. Power of a matrix
    4.2.4 An ODE system: small oscillations
  4.3 Exercises

Part II. Metrics and Topology

5. Metric Spaces and Continuous Functions
  5.1 Metric Spaces
    5.1.1 Basic definitions
      a. Metrics
      b. Convergence
    5.1.2 Examples of metric spaces
      a. Metrics on finite-dimensional vector spaces
      b. Metrics on spaces of sequences
      c. Metrics on spaces of functions
    5.1.3 Continuity and limits in metric spaces
      a. Lipschitz-continuous maps between metric spaces
      b. Continuous maps in metric spaces
      c. Limits in metric spaces
      d. The junction property
    5.1.4 Functions from $\mathbb{R}^n$ into $\mathbb{R}^m$
      a. The vector space $C^0(A, \mathbb{R}^m)$
      b. Some nonlinear continuous transformations from $\mathbb{R}^n$ into $\mathbb{R}^N$
      c. The calculus of limits for functions of several variables
  5.2 The Topology of Metric Spaces
    5.2.1 Basic facts
      a. Open sets
      b. Closed sets
      c. Continuity
      d. Continuous real-valued maps
      e. The topology of a metric space
      f. Interior, exterior, adherent and boundary points
      g. Points of accumulation
      h. Subsets and relative topology
    5.2.2 A digression on general topology
      a. Topological spaces
      b. Topologizing a set
      c. Separation properties
  5.3 Completeness
    a. Complete metric spaces
    b. Completion of a metric space
    c. Equivalent metrics
    d. The nested sequence theorem
    e. Baire's theorem
  5.4 Exercises

6. Compactness and Connectedness
  6.1 Compactness
    6.1.1 Compact spaces
      a. Sequential compactness
      b. Compact sets in $\mathbb{R}^n$
      c. Coverings and $\epsilon$-nets
    6.1.2 Continuous functions and compactness
      a. The Weierstrass theorem
      b. Continuity and compactness
      c. Continuity of the inverse function
    6.1.3 Semicontinuity and the Fréchet-Weierstrass theorem
  6.2 Extending Continuous Functions
    6.2.1 Uniformly continuous functions
    6.2.2 Extending uniformly continuous functions to the closure of their domains
    6.2.3 Extending continuous functions
      a. Lipschitz-continuous functions
    6.2.4 Tietze's theorem
  6.3 Connectedness
    6.3.1 Connected spaces
      a. Connected subsets
      b. Connected components
      c. Segment-connected sets in $\mathbb{R}^n$
      d. Path-connectedness
    6.3.2 Some applications
  6.4 Exercises

7. Curves
  7.1 Curves in $\mathbb{R}^n$
    7.1.1 Curves and trajectories
      a. The calculus
      b. Self-intersections
      c. Equivalent parametrizations
    7.1.2 Regular curves and tangent vectors
      a. Regular curves
      b. Tangent vectors
      c. Length of a curve
      d. Arc length and $C^1$-equivalence
    7.1.3 Some celebrated curves
      a. Spirals
      b. Conchoids
      c. Cissoids
      d. Algebraic curves
      e. The cycloid
      f. The catenary
  7.2 Curves in Metric Spaces
    a. Functions of bounded variation and rectifiable curves
    b. Lipschitz and intrinsic reparametrizations
    7.2.1 Real functions with bounded variation
      a. The Cantor-Vitali function
  7.3 Exercises

8. Some Topics from the Topology of $\mathbb{R}^n$
  8.1 Homotopy
    8.1.1 Homotopy of maps and sets
      a. Homotopy of maps
      b. Homotopy classes
      c. Homotopy equivalence of sets
      d. Relative homotopy
    8.1.2 Homotopy of loops
      a. The fundamental group with base point
      b. The group structure on $\pi_1(X, x_0)$
      c. Changing base point
      d. Invariance properties of the fundamental group
    8.1.3 Covering spaces
      a. Covering spaces
      b. Lifting of curves
      c. Universal coverings and homotopy
      d. A global invertibility result
    8.1.4 A few examples
      a. The fundamental group of $S^1$
      b. The fundamental group of the figure eight
      c. The fundamental group of $S^n$, $n \ge 2$
    8.1.5 Brouwer's degree
      a. The degree of maps $S^1 \to S^1$
      b. An integral formula for the degree
      c. Degree and inverse image
      d. The homological definition of degree for maps $S^1 \to S^1$
  8.2 Some Results on the Topology of $\mathbb{R}^n$
    8.2.1 Brouwer's theorem
      a. Brouwer's degree
      b. Extension of maps into $S^n$
      c. Brouwer's fixed point theorem
      d. Fixed points and solvability of equations in $\mathbb{R}^{n+1}$
      e. Fixed points and vector fields
    8.2.2 Borsuk's theorem
    8.2.3 Separation theorems
  8.3 Exercises

Part III. Continuity in Infinite-Dimensional Spaces

9. Spaces of Continuous Functions, Banach Spaces and Abstract Equations
  9.1 Linear Normed Spaces
    9.1.1 Definitions and basic facts
      a. Norms induced by inner and Hermitian products
      b. Equivalent norms
      c. Series in normed spaces
      d. Finite-dimensional normed linear spaces
    9.1.2 A few examples
      a. The space $\ell_p$, $1 \le p < \infty$
      b. A normed space that is not Banach
      c. Spaces of bounded functions
      d. The space $\ell_\infty(Y)$
  9.2 Spaces of Bounded and Continuous Functions
    9.2.1 Uniform convergence
      a. Uniform convergence
      b. Pointwise and uniform convergence
      c. A convergence diagram
      d. Uniform convergence on compact subsets
    9.2.2 A compactness theorem
      a. Equicontinuous functions
      b. The Ascoli-Arzelà theorem
  9.3 Approximation Theorems
    9.3.1 Weierstrass and Bernstein theorems
      a. Weierstrass's approximation theorem
      b. Bernstein's polynomials
      c. Weierstrass's approximation theorem for periodic functions
    9.3.2 Convolutions and Dirac approximations
      a. Convolution product
      b. Mollifiers
      c. Approximation of the Dirac mass
    9.3.3 The Stone-Weierstrass theorem
    9.3.4 The Yosida regularization
      a. Baire's approximation theorem
      b. Approximation in metric spaces
  9.4 Linear Operators
    9.4.1 Basic facts
      a. Continuous linear forms and hyperplanes
      b. The space of linear continuous maps
      c. Norms on matrices
      d. Pointwise and uniform convergence for operators
      e. The algebra End(X)
      f. The exponential of an operator
    9.4.2 Fundamental theorems
      a. The principle of uniform boundedness
      b. The open mapping theorem
      c. The closed graph theorem
      d. The Hahn-Banach theorem
  9.5 Some General Principles for Solving Abstract Equations
    9.5.1 The Banach fixed point theorem
      a. The fixed point theorem
      b. The continuity method
    9.5.2 The Caccioppoli-Schauder fixed point theorem
      a. Compact maps
      b. The Caccioppoli-Schauder theorem
      c. The Leray-Schauder principle
    9.5.3 The method of super- and sub-solutions
      a. Ordered Banach spaces
      b. Fixed points via sub- and super-solutions
  9.6 Exercises

10. Hilbert Spaces, Dirichlet's Principle and Linear Compact Operators
  10.1 Hilbert Spaces
    10.1.1 Basic facts
      a. Definitions and examples
      b. Orthogonality
    10.1.2 Separable Hilbert spaces and basis
      a. Complete systems and basis
      b. Separable Hilbert spaces
      c. Fourier series and $\ell_2$
      d. Some orthonormal polynomials in $L^2$
  10.2 The Abstract Dirichlet's Principle and Orthogonality
    a. The abstract Dirichlet's principle
    b. Riesz's theorem
    c. The orthogonal projection theorem
    d. Projection operators
  10.3 Bilinear Forms
    10.3.1 Linear operators and bilinear forms
      a. Linear operators
      b. Adjoint operator
      c. Bilinear forms
    10.3.2 Coercive symmetric bilinear forms
      a. Inner products
      b. Green's operator
      c. Ritz's method
      d. Linear regression
    10.3.3 Coercive nonsymmetric bilinear forms
      a. The Lax-Milgram theorem
      b. Faedo-Galerkin method
  10.4 Linear Compact Operators
    10.4.1 Fredholm-Riesz-Schauder theory
      a. Linear compact operators
      b. The alternative theorem
      c. Some facts related to the alternative theorem
      d. The alternative theorem in Banach spaces
      e. The spectrum of compact operators
    10.4.2 Compact self-adjoint operators
      a. Self-adjoint operators
      b. Spectral theorem
      c. Compact normal operators
      d. The Courant-Hilbert-Schmidt theory
      e. Variational characterization of eigenvalues
  10.5 Exercises

11. Some Applications
  11.1 Two Minimum Problems
    11.1.1 Minimal geodesics in metric spaces
      a. Semicontinuity of the length
      b. Compactness
      c. Existence of minimal geodesics
    11.1.2 A minimum problem in a Hilbert space
      a. Weak convergence in Hilbert spaces
      b. Existence of minimizers of convex coercive functionals
  11.2 A Theorem by Gelfand and Kolmogorov
  11.3 Ordinary Differential Equations
    11.3.1 The Cauchy problem
      a. Velocities of class $C^1(D)$
      b. Local existence and uniqueness
      c. Continuation of solutions
      d. Systems of higher order equations
      e. Linear systems
      f. A direct approach to Cauchy problem for linear systems
      g. Continuous dependence on data
      h. The Peano theorem
    11.3.2 Boundary value problems
      a. The shooting method
      b. A maximum principle
      c. The method of super- and sub-solutions
      d. A theorem by Bernstein
  11.4 Linear Integral Equations
    11.4.1 Some motivations
      a. Integral form of second order equations
      b. Materials with memory
      c. Boundary value problems
      d. Equilibrium of an elastic thread
      e. Dynamics of an elastic thread
    11.4.2 Volterra integral equations
    11.4.3 Fredholm integral equations in $C^0$
  11.5 Fourier's Series
    11.5.1 Definitions and preliminaries
      a. Dirichlet's kernel
    11.5.2 Pointwise convergence
      a. The Riemann-Lebesgue theorem
      b. Regular functions and Dini test
    11.5.3 $L^2$-convergence and the energy equality
      a. Fourier's partial sums and orthogonality
      b. A first uniform convergence result
      c. Energy equality
    11.5.4 Uniform convergence
      a. A variant of the Riemann-Lebesgue theorem
      b. Uniform convergence for Dini-continuous functions
      c. Riemann's localization principles
    11.5.5 A few complementary facts
      a. The primitive of the Dirichlet kernel
      b. Gibbs's phenomenon
    11.5.6 The Dirichlet-Jordan theorem
      a. The Dirichlet-Jordan test
      b. Fejér example
    11.5.7 Fejér's sums

A. Mathematicians and Other Scientists

B. Bibliographical Notes

C. Index

Part I. Linear Algebra

William R. Hamilton (1805-1865), James Joseph Sylvester (1814-1897) and Arthur Cayley (1821-1895).

1. Vectors, Matrices and Linear Systems

The early developments of linear algebra, and related to it those of vectorial analysis, are strongly tied, on the one hand, to the geometrical representation of complex numbers and the need for more abstraction and formalization in geometry and, on the other hand, to the newly developed theory of electromagnetism. The names of William R. Hamilton (1805-1865), August Möbius (1790-1868), Giusto Bellavitis (1803-1880), Adhémar de Saint-Venant (1797-1886) and Hermann Grassmann (1808-1877) are connected with the beginning of linear algebra, while J. Willard Gibbs (1839-1903) and Oliver Heaviside (1850-1925) established the basis of modern vector analysis, motivated by the then recent Treatise on Electricity and Magnetism by James Clerk Maxwell (1831-1879). The subsequent formalization is more recent and relates to the developments of functional analysis and quantum mechanics.

Today, linear algebra appears as a language and a collection of results that are particularly useful in mathematics and in applications. In fact, most modeling, which is done via linear programming, ordinary or partial differential equations or control theory, can be treated numerically by computers only after it has been transformed into a linear system; in the end most of the modeling on computers deals with linear systems. Our aim here is not to present an extensive account; for instance, we shall ignore the computational aspects (error estimations, conditioning, etc.), despite their relevance, but rather we shall focus on illustrating the language and collecting a number of useful results in a wider sense.

There is a strict link between linear algebra and linear systems. For this reason in this chapter we shall begin by discussing linear systems in the context of vectors in $\mathbb{R}^n$ or $\mathbb{C}^n$.

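1.1 The Linear Spaces $\mathbb{R}^n$ and $\mathbb{C}^n$

a. Linear combinations

Let $\mathbb{K}$ be the field of real numbers or complex numbers. We denote by $\mathbb{K}^n$ the space of ordered n-tuples of elements of $\mathbb{K}$,

$$\mathbb{K}^n := \{\mathbf{x} \mid \mathbf{x} := (x^1, x^2, \dots, x^n),\ x^i \in \mathbb{K},\ i = 1, \dots, n\}.$$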

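The elements of $\mathbb{K}^n$ are often called points or vectors of $\mathbb{K}^n$; in the latter case we think of a point in $\mathbb{K}^n$ as the end-point of a vector applied at the origin. In this context the real or complex numbers are called scalars as they allow us to regard a vector at different scales. We can sum points of $\mathbb{K}^n$, or multiply them by a scalar $\lambda$, by summing their coordinates or multiplying the coordinates by $\lambda$:

$$\mathbf{x} + \mathbf{y} := (x^1 + y^1, x^2 + y^2, \dots, x^n + y^n), \qquad \lambda\mathbf{x} := (\lambda x^1, \lambda x^2, \dots, \lambda x^n),$$

if $\mathbf{x} = (x^1, x^2, \dots, x^n)$, $\mathbf{y} = (y^1, y^2, \dots, y^n)$, $\lambda \in \mathbb{K}$.

Of course $\forall \mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{K}^n$ and $\forall \lambda, \mu \in \mathbb{K}$, we have

o $(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$, $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$,
o $\lambda(\mathbf{x} + \mathbf{y}) = \lambda\mathbf{x} + \lambda\mathbf{y}$, $(\lambda + \mu)\mathbf{x} = \lambda\mathbf{x} + \mu\mathbf{x}$, $(\lambda\mu)\mathbf{x} = \lambda(\mu\mathbf{x})$,
o if $\mathbf{0} := (0, \dots, 0)$, then $\mathbf{x} + \mathbf{0} = \mathbf{0} + \mathbf{x} = \mathbf{x}$,
o $1 \cdot \mathbf{x} = \mathbf{x}$ and, if $-\mathbf{x} := (-1)\mathbf{x}$, then $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$.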

We write $\mathbf{x} - \mathbf{y}$ for $\mathbf{x} + (-\mathbf{y})$ and, from now on, the vector $\mathbf{0}$ will be simply denoted by 0.

1.1 Example. If we identify $\mathbb{R}^2$ with the plane of geometry via a Cartesian system, see [GM1], the sum of vectors in $\mathbb{R}^2$ corresponds to the sum of vectors according to the parallelogram law, and the multiplication of $\mathbf{x}$ by a scalar $\lambda$ to a dilatation by a factor $|\lambda|$ in the same sense of $\mathbf{x}$ if $\lambda > 0$ or in the opposite sense if $\lambda < 0$.
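The componentwise operations and the identities listed above can be tried out on a small example. The following is only an illustrative sketch: the function names `add` and `scale` are not from the text, vectors are modeled as Python tuples, and exact rational scalars (`fractions.Fraction`) stand in for the field $\mathbb{K}$.

```python
from fractions import Fraction

def add(x, y):
    """Componentwise sum of two n-tuples of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(x, y))

def scale(lam, x):
    """Multiply every component of x by the scalar lam."""
    return tuple(lam * a for a in x)

# Sample vectors in Q^3 and sample scalars (arbitrary choices).
x = (Fraction(1), Fraction(2), Fraction(-1))
y = (Fraction(0), Fraction(3), Fraction(5))
lam, mu = Fraction(2), Fraction(-3)
zero = (Fraction(0),) * 3

# Spot-check some of the identities on these particular values.
assert add(x, y) == add(y, x)
assert scale(lam, add(x, y)) == add(scale(lam, x), scale(lam, y))
assert add(scale(lam, x), scale(mu, x)) == scale(lam + mu, x)
assert add(x, scale(-1, x)) == zero
```

Such checks only confirm the identities for the chosen values; the general statements follow from the corresponding properties of the field, applied coordinate by coordinate.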

1.2 About the notation. A list of vectors in $\mathbb{K}^n$ will be denoted by a lower index, $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$, and a list of scalars with an upper index, $\lambda^1, \lambda^2, \dots, \lambda^k$. The components of a vector $\mathbf{x}$ will be denoted by upper indices. In connection with the product row by columns, see below, it is useful to display the components as a column

$$\mathbf{x} = \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix}.$$

However, since this is not very convenient typographically, if not strictly necessary, we shall write instead $\mathbf{x} = (x^1, x^2, \dots, x^n)$.

Given k scalars $\lambda^1, \lambda^2, \dots, \lambda^k$ and k vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ of $\mathbb{K}^n$, we may form the linear combination vector of $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ with coefficients $\lambda^1, \lambda^2, \dots, \lambda^k$ given by

$$\sum_{j=1}^k \lambda^j \mathbf{v}_j \in \mathbb{K}^n.$$

1.3 Definition. (i) We say that $W \subset \mathbb{K}^n$ is a linear subspace, or simply a subspace of $\mathbb{K}^n$, if all finite linear combinations of vectors in $W$ belong to $W$.

Figure 1.1. Giusto Bellavitis (1803-1880) and a page from his Nuovo metodo di geometria analitica.

(ii) Given a subset $S \subset \mathbb{K}^n$ we call the span of $S$ the subset of $\mathbb{K}^n$, denoted by $\operatorname{Span} S$, of all finite linear combinations of vectors in $S$. If $W = \operatorname{Span} S$, we say that the elements of $S$ span $W$, or that $S$ is a set of generators for $W$.
(iii) We say that k vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in \mathbb{K}^n$ are linearly dependent if there exist scalars $\lambda^1, \lambda^2, \dots, \lambda^k$, not all zero, such that $\sum_{j=1}^k \lambda^j \mathbf{v}_j = 0$.
(iv) If k vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ are not linearly dependent, that is,

$$\lambda^1 \mathbf{v}_1 + \dots + \lambda^k \mathbf{v}_k = 0 \quad \text{implies} \quad \lambda^1 = \lambda^2 = \dots = \lambda^k = 0,$$

then $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ are called linearly independent.
(v) A subset $S \subset \mathbb{K}^n$ is a set of linearly independent vectors if any finite choice of vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in S$ is made of linearly independent vectors.
(vi) Let $W$ be a linear subspace of $\mathbb{K}^n$. A subset $S \subset W$ of linearly independent vectors that form a set of generators for $W$ is called a basis of $W$.

Observe the following.
(i) $W$ is a linear subspace of $\mathbb{K}^n$ if and only if for all $\mathbf{x}, \mathbf{y} \in W$ and all $\lambda, \mu \in \mathbb{K}$, we have $\lambda\mathbf{x} + \mu\mathbf{y} \in W$.
(ii) If $W$ is a linear subspace of $\mathbb{K}^n$, then $0 \in W$ and moreover $\lambda\mathbf{v} \in W$ for all $\lambda \in \mathbb{K}$ if $\mathbf{v} \in W$.
(iii) $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in \mathbb{K}^n$ are linearly dependent if and only if one of the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ is a linear combination of the others.

Figure 1.2. A page from Mémoire sur les sommes et les différences géométriques by Adhémar de Saint-Venant (1797-1886) and the frontispiece of the Barycentrische Calcul by August Möbius (1790-1868).

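(iv) If k vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ are linearly independent, then necessarily $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ are distinct and not zero. Moreover, any choice of h of them, $1 \le h \le k$, yields a linearly independent set.
(v) Let $S \subset \mathbb{K}^n$. Then $W := \operatorname{Span} S$ is a linear subspace of $\mathbb{K}^n$. More explicitly, $W = \operatorname{Span} S$ if and only if for every $\mathbf{w} \in W$ there exist $k \in \mathbb{N}$, scalars $\lambda^1, \lambda^2, \dots, \lambda^k \in \mathbb{K}$ and vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in S$ such that $\mathbf{w} = \sum_{j=1}^k \lambda^j \mathbf{v}_j$.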
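For a finite list of concrete vectors, linear dependence in the sense of Definition 1.3 can be tested mechanically. The sketch below is an illustration under stated assumptions, not the book's method at this point: it relies on Gaussian elimination (which the text only introduces in Section 1.3), works over the rationals for exact arithmetic, and the names `rank` and `linearly_independent` are chosen here for convenience. It uses the fact that k vectors are linearly independent exactly when elimination produces k pivots.

```python
from fractions import Fraction

def rank(vectors):
    """Number of pivots found by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True if no nontrivial linear combination of the vectors vanishes."""
    return rank(vectors) == len(vectors)

assert linearly_independent([(1, 0, 0), (0, 1, 0)])
assert not linearly_independent([(1, 2, 3), (2, 4, 6)])    # second = 2 * first
assert not linearly_independent([(1, 2), (3, 4), (5, 6)])  # three vectors in Q^2
assert not linearly_independent([(1, 0), (0, 0)])          # contains the zero vector
```

The last two assertions illustrate observation (iv) and the bound on the number of independent vectors discussed in the section on dimension.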

b. Basis

We shall now discuss the crucial notion of basis.

1.4 Definition. Let $C \subset \mathbb{K}^n$. A subset $B \subset C$ is a maximal set of linearly independent vectors of $C$ if $B$ is a set of linearly independent vectors and, for every $\mathbf{w} \in C \setminus B$, the set $B \cup \{\mathbf{w}\}$ is not a set of linearly independent vectors.

1.5 Proposition. Let $W$ be a linear subspace of $\mathbb{K}^n$. $B$ is a basis of $W$ if and only if $B$ is a maximal set of linearly independent vectors of $W$.

Proof. Let $B$ be a basis of $W$. $B$ is a set of linearly independent vectors. Moreover, since for every $\mathbf{w} \in W$ there are k vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in B$ and k scalars $\mu^1, \mu^2, \dots, \mu^k$ such that $\mathbf{w} = \sum_{j=1}^k \mu^j \mathbf{v}_j$, the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k, \mathbf{w}$ are not linearly independent, i.e., $B \cup \{\mathbf{w}\}$ is not a set of linearly independent vectors.

Conversely, suppose that $B$ is a maximal set of linearly independent vectors. To prove that $B$ is a basis, it suffices to prove that $\operatorname{Span} B = W$ and, actually, that $W \subset \operatorname{Span} B$. Let $\mathbf{w} \in W$; if $\mathbf{w} \in B$ there is nothing to prove, otherwise by assumption $B \cup \{\mathbf{w}\}$ is not a set of linearly independent vectors. Then there exist $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in B$ and scalars $\alpha, \lambda^1, \lambda^2, \dots, \lambda^k$, not all zero, such that

$$\alpha\mathbf{w} + \sum_{j=1}^k \lambda^j \mathbf{v}_j = 0.$$

On the other hand $\alpha \ne 0$, since otherwise $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ would be linearly dependent, hence

$$\mathbf{w} = -\frac{1}{\alpha} \sum_{j=1}^k \lambda^j \mathbf{v}_j$$

and $\mathbf{w} \in \operatorname{Span} B$. Therefore $W \subset \operatorname{Span} B$. □

Using Zorn's lemma, see e.g., [GM2], one can show that every subspace of a linear space, including of course the linear space itself, has a basis. In the present situation, Proposition 1.5 allows us to select a basis of a subspace $W$ by selecting a maximal set of linearly independent vectors in $W$. For instance, if $W = \operatorname{Span}\{v_1, v_2, \dots, v_k\}$, we select the first nonzero element, say $w_1 := v_1$, then successively choose $w_2, w_3, \dots$ in the list $(v_1, v_2, \dots, v_k)$ so that $w_2$ is not a multiple of $w_1$ and, by induction, $w_j$ is not a linear combination of $w_1, w_2, \dots, w_{j-1}$. This is not a very efficient method, but it works. For more efficient methods, see Exercises 1.46 and 3.28.

1.6 ¶. Define the notion of minimal set of generators and show that it is equivalent to the notion of basis.
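
The selection procedure described before Exercise 1.6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vectors are tuples of rational numbers, and the helper names `rank` and `select_basis` are ours, not the book's. A vector is kept exactly when adding it raises the rank of the vectors already kept.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by forward elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # look for a row at or below position r with a nonzero entry in column c
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def select_basis(vectors):
    """Keep v_j if and only if it is not a linear combination of the vectors already kept."""
    basis = []
    for v in vectors:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

# Hypothetical generators of a subspace of Q^3.
vs = [(0, 0, 0), (1, 2, 3), (2, 4, 6), (0, 1, 1), (1, 3, 4)]
print(select_basis(vs))  # [(1, 2, 3), (0, 1, 1)]
```

As the text warns, this is not efficient: the rank is recomputed from scratch for every candidate vector.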

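c. Dimension
We shall now show that all bases of a subspace $W$ of $\mathbb{K}^n$ have the same number of elements, called the dimension of $W$, denoted by $\dim W$, and that $\dim W \le n$.

1.7 Theorem. We have the following.
(i) Let $(w_1, w_2, \dots, w_k)$ be a basis of $W$ and let $v_1, v_2, \dots, v_p$ be $p \le k$ linearly independent vectors of $W$. Then one can complete $v_1, v_2, \dots, v_p$ with $k - p$ vectors chosen among $w_1, w_2, \dots, w_k$ to form a new basis of $W$.
(ii) $k$ vectors of $\mathbb{K}^n$ are always linearly dependent if $k > n$.
(iii) All bases of a subspace $W \subset \mathbb{K}^n$ have the same number of elements $k$, and $k \le n$.

Proof. (i) We proceed by induction on $p$. Let $p = 1$. Since $v_1 \in W$ and $v_1 \ne 0$, we have
$$v_1 = \sum_{j=1}^k x^j w_j \tag{1.1}$$
with at least one of the $x^j$ not zero. Without loss of generality, we can assume that $x^1 \ne 0$, hence
$$w_1 = \frac{1}{x^1}\, v_1 - \sum_{j=2}^k \frac{x^j}{x^1}\, w_j$$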


and the vectors $v_1, w_2, w_3, \dots, w_k$ span $W$. The vectors $v_1, w_2, w_3, \dots, w_k$ are also independent. In fact, if $\lambda^1, \lambda^2, \dots, \lambda^k$ are such that
$$\lambda^1 v_1 + \sum_{j=2}^k \lambda^j w_j = 0,$$
then (1.1) yields
$$\lambda^1 x^1 w_1 + \sum_{j=2}^k (\lambda^j + \lambda^1 x^j)\, w_j = 0,$$
and this implies $\lambda^1 = 0$ since $x^1 \ne 0$; consequently $\sum_{j=2}^k \lambda^j w_j = 0$, hence $\lambda^2 = \dots = \lambda^k = 0$.

Assume now the inductive hypothesis, that is, that we can choose $k - p$ vectors of the basis $(w_1, w_2, \dots, w_k)$, say $w_{p+1}, \dots, w_k$, such that $\{v_1, \dots, v_p, w_{p+1}, \dots, w_k\}$ is a basis of $W$. Let us prove the claim for $p + 1$ vectors. Since $v_{p+1}$ is independent of $v_1, v_2, \dots, v_p$, we infer by the induction hypothesis that
$$v_{p+1} = \sum_{i=1}^p x^i v_i + \sum_{j=p+1}^k y^j w_j \tag{1.2}$$
where at least one of the $y^j$ is not zero. Assuming, without loss of generality, that $y^{p+1} \ne 0$, we have
$$w_{p+1} = \frac{1}{y^{p+1}} \Big( v_{p+1} - \sum_{i=1}^p x^i v_i - \sum_{j=p+2}^k y^j w_j \Big)$$
and the vectors $v_1, \dots, v_{p+1}, w_{p+2}, \dots, w_k$ span $W$. Let us finally show that these vectors are also independent. If
$$\sum_{i=1}^p \lambda^i v_i + \lambda^{p+1} v_{p+1} + \sum_{j=p+2}^k \lambda^j w_j = 0,$$
(1.2) yields
$$\sum_{i=1}^p (\lambda^i + \lambda^{p+1} x^i)\, v_i + \lambda^{p+1} y^{p+1} w_{p+1} + \sum_{j=p+2}^k (\lambda^j + \lambda^{p+1} y^j)\, w_j = 0.$$
Since $\{v_1, \dots, v_p, w_{p+1}, \dots, w_k\}$ is a basis of $W$ because of the induction assumption, and $y^{p+1} \ne 0$ by construction, we conclude that $\lambda^{p+1} = 0$, and consequently $\lambda^i = 0$ for all indices $i$.

(ii) Assume that $v_1, v_2, \dots, v_k \in \mathbb{K}^n$ are independent and $k > n$. By (i), applied to the basis $(e_1, e_2, \dots, e_n)$ of $\mathbb{K}^n$ and to the $n$ independent vectors $v_1, \dots, v_n$, the set $\{v_1, \dots, v_n\}$ is a basis of $\mathbb{K}^n$, hence a maximal system of linearly independent vectors of $\mathbb{K}^n$, see Proposition 1.5; this is a contradiction, since $v_1, \dots, v_n, v_{n+1}$ are linearly independent.

(iii) follows as (ii). Let us prove that two bases of $W$ have the same number of elements. Suppose that $\{v_1, v_2, \dots, v_p\}$ and $\{e_1, e_2, \dots, e_k\}$ are two bases of $W$ with $p < k$. By (i) we may complete $v_1, v_2, \dots, v_p$ with $k - p$ vectors chosen among $e_1, e_2, \dots, e_k$ to form a new basis $\{v_1, v_2, \dots, v_p, e_{p+1}, \dots, e_k\}$ of $W$; but this is a contradiction since $\{v_1, v_2, \dots, v_p\}$ is already a basis of $W$, hence a maximal system of linearly independent vectors of $W$, see Proposition 1.5. Similarly, and we leave it to the reader, one can prove that $k \le n$. □

1.8 Definition. The number of elements of a basis (hence of every basis) of a linear subspace $W$ of $\mathbb{K}^n$ is called the dimension of $W$ and is denoted by $\dim W$.


1.9 Corollary. The linear space $\mathbb{K}^n$ has dimension $n$ and, if $W$ is a linear subspace of $\mathbb{K}^n$, then $\dim W \le n$. Moreover, if $k := \dim W$,
(i) there are $k$ linearly independent vectors $v_1, v_2, \dots, v_k \in W$,
(ii) a set of $k$ linearly independent vectors $v_1, v_2, \dots, v_k \in W$ is always a basis of $W$,
(iii) any $p$ vectors $v_1, v_2, \dots, v_p \in W$ with $p > k$ are always linearly dependent,
(iv) if $v_1, v_2, \dots, v_p$ are $p$ linearly independent vectors of $W$, then $p \le k$,
(v) for every subspace $V \subset \mathbb{K}^n$ such that $V \subset W$ we have $\dim V \le k$,
(vi) let $V, W$ be two subspaces of $\mathbb{K}^n$; then $V = W$ if and only if $V \subset W$ and $\dim V = \dim W$.

1.10 ¶. Prove Corollary 1.9.

d. Ordered basis
Until now, a basis $B$ of a linear subspace $W$ of $\mathbb{K}^n$ is just a finite set of linearly independent generators of $W$; every $x \in W$ is a unique linear combination of the basis elements. Here, uniqueness means uniqueness of the value of each coefficient in front of each basis element. To be precise, one would write
$$x = \sum_{v \in B} \lambda(v)\, v.$$
It is customary to index the elements of $B$ with natural numbers, i.e., to consider $B$ as a list instead of as a set. We call any list made with the elements of a basis $B$ an ordered basis. The order just introduced is then used to link the coefficients to the corresponding vectors by correspondingly indexing them. This leads to the simpler notation
$$x = \sum_{i=1}^k \lambda^i v_i$$
that we have already tacitly used. Moreover,

1.11 Proposition. Let $W$ be a linear subspace of $\mathbb{K}^n$ of dimension $k$ and let $(v_1, v_2, \dots, v_k)$ be an ordered basis of $W$. Then for every $x \in W$ there is a unique vector $\lambda \in \mathbb{K}^k$, $\lambda := (\lambda^1, \lambda^2, \dots, \lambda^k)$, such that $x = \sum_{i=1}^k \lambda^i v_i$.

1.12 Example. The list $(e_1, e_2, \dots, e_n)$ of vectors of $\mathbb{K}^n$ given by $e_1 := (1, 0, \dots, 0)$, $e_2 := (0, 1, \dots, 0)$, ..., $e_n := (0, 0, \dots, 1)$ is an ordered basis of $\mathbb{K}^n$. In fact $e_1, e_2, \dots, e_n$ are trivially linearly independent and span $\mathbb{K}^n$ since
$$x = \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix} = x^1 \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + x^2 \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} + \dots + x^n \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} = \sum_{i=1}^n x^i e_i$$
for all $x \in \mathbb{K}^n$. $(e_1, e_2, \dots, e_n)$ is called the canonical or standard basis of $\mathbb{K}^n$. We shall always think of the canonical basis as an ordered basis.


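1.2 Matrices and Linear Operators

Following Arthur Cayley (1821-1895) we now introduce the calculus of matrices. An $m \times n$ matrix $\mathbf{A}$ with entries in $X$ is an ordered table of elements of $X$ arranged in $m$ rows and $n$ columns. It is customary to index the rows from top to bottom from 1 to $m$ and the columns from left to right from 1 to $n$. If $a^i_j$ denotes the element or entry in the $i$th row and the $j$th column, we write
$$\mathbf{A} = \begin{pmatrix} a^1_1 & a^1_2 & \dots & a^1_n \\ a^2_1 & a^2_2 & \dots & a^2_n \\ \vdots & \vdots & \ddots & \vdots \\ a^m_1 & a^m_2 & \dots & a^m_n \end{pmatrix} \qquad \text{or} \qquad \mathbf{A} = [a^i_j], \quad i = 1, \dots, m, \ j = 1, \dots, n.$$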

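Given a matrix $\mathbf{A}$ we write $\mathbf{A}^i_j$ for the entry $(i, j)$ of $\mathbf{A}$, and denote the set of matrices with $m$ rows and $n$ columns by $M_{m,n}(X)$. Usually $X$ will be the field of real or complex numbers $\mathbb{K}$, but a priori one allows other entries.

1.13 Remark (Notation). The common agreement about the indices of the elements of a matrix is that the row is determined by the upper index and the column by the lower index. Later, we shall also consider matrices with two lower indices or two upper indices, $\mathbf{A} = [a_{ij}]$ or $\mathbf{A} = [a^{ij}]$. In both cases the agreement is that the first index identifies the row. These agreements turn out to be particularly useful to keep computations under control. The three different types of notation correspond to different mathematical objects represented by those matrices. But for the moment we shall not worry about this.

If $\mathbf{A} = [a^i_j] \in M_{p,n}$ and $\mathbf{B} = [b^i_j] \in M_{p,m}$ are two matrices with the same number of rows $p$, we denote by $[\mathbf{A} \mid \mathbf{B}]$, or by $(\mathbf{A} \ \ \mathbf{B})$, the matrix with $p$ rows and $(n + m)$ columns defined by
$$[\mathbf{A} \mid \mathbf{B}] := \begin{pmatrix} a^1_1 & \dots & a^1_n & b^1_1 & \dots & b^1_m \\ \vdots & & \vdots & \vdots & & \vdots \\ a^p_1 & \dots & a^p_n & b^p_1 & \dots & b^p_m \end{pmatrix},$$
or shortly by $[\mathbf{A} \mid \mathbf{B}] = [c^i_j]$ with
$$c^i_j := \begin{cases} a^i_j & \text{if } 1 \le j \le n, \\ b^i_{j-n} & \text{if } n + 1 \le j \le n + m, \end{cases} \qquad i = 1, \dots, p.$$
Similarly, given $\mathbf{A} = [a^i_j] \in M_{p,n}$ and $\mathbf{B} = [b^i_j] \in M_{q,n}$, we denote by
$$\begin{pmatrix} \mathbf{A} \\ \mathbf{B} \end{pmatrix}$$
the $(p + q) \times n$ matrix $\mathbf{C} = [c^i_j]$ defined by
$$c^i_j := \begin{cases} a^i_j & \text{if } 1 \le i \le p, \\ b^{i-p}_j & \text{if } p + 1 \le i \le p + q. \end{cases}$$

a. The algebra of matrices
Two matrices $\mathbf{A} := [a^i_j]$, $\mathbf{B} = [b^i_j]$ in $M_{m,n}(\mathbb{K})$ can be summed by setting
$$\mathbf{A} + \mathbf{B} := [c^i_j] \qquad \text{where } c^i_j := a^i_j + b^i_j, \quad i = 1, \dots, m, \ j = 1, \dots, n.$$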

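Moreover, one can multiply a matrix $\mathbf{A} \in M_{m,n}(\mathbb{K})$ by a scalar $\lambda \in \mathbb{K}$ by setting
$$\lambda \mathbf{A} := [\lambda a^i_j],$$
that is, each entry of $\lambda \mathbf{A}$ is the corresponding entry of $\mathbf{A}$ multiplied by $\lambda$. Notice that the sum of matrices is meaningful if and only if both matrices have the same number of rows and columns. Putting the rows one after the other, we can identify $M_{m,n}(\mathbb{K})$ with $\mathbb{K}^{mn}$ as a set, and the operations on the matrices defined above correspond to the sum and the multiplication by a scalar in $\mathbb{K}^{mn}$. Thus $M_{m,n}(\mathbb{K})$, endowed with the two operations $(\mathbf{A}, \mathbf{B}) \to \mathbf{A} + \mathbf{B}$ and $(\lambda, \mathbf{A}) \to \lambda \mathbf{A}$, is essentially $\mathbb{K}^{mn}$. A basis for $M_{m,n}(\mathbb{K})$ is the set of $m \times n$ matrices $\{\mathbf{I}^i_j\}$ where $\mathbf{I}^i_j$ has entry 1 at the $(i, j)$ position and zero otherwise.

1.14 Definition (Product of matrices). If the number of columns of $\mathbf{A}$ is the same as the number of rows of $\mathbf{B}$, $\mathbf{A} = [a^i_j] \in M_{p,n}(\mathbb{K})$, $\mathbf{B} = [b^i_j] \in M_{n,q}(\mathbb{K})$, we define the product matrix $\mathbf{A}\mathbf{B} \in M_{p,q}$ by setting
$$\mathbf{A}\mathbf{B} = [c^i_j] \qquad \text{where } c^i_j = \sum_{k=1}^n a^i_k b^k_j.$$
Notice that if $(a^i_1, a^i_2, \dots, a^i_n)$ is the $i$th row of $\mathbf{A}$ and $(b^1_j, b^2_j, \dots, b^n_j)$ is the $j$th column of $\mathbf{B}$, then
$$(\mathbf{A}\mathbf{B})^i_j := c^i_j = a^i_1 b^1_j + a^i_2 b^2_j + \dots + a^i_n b^n_j.$$
For this reason the product of matrices is called the product rows by columns. It is easily seen that the product of matrices is associative and distributive, i.e.,
$$(\mathbf{A}\mathbf{B})\mathbf{C} = \mathbf{A}(\mathbf{B}\mathbf{C}) =: \mathbf{A}\mathbf{B}\mathbf{C}, \qquad \mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{A}\mathbf{B} + \mathbf{A}\mathbf{C},$$
but, in general, it is not commutative, $\mathbf{A}\mathbf{B} \ne \mathbf{B}\mathbf{A}$, as simple examples show. Indeed, we may not be able to form $\mathbf{B}\mathbf{A}$ even if $\mathbf{A}\mathbf{B}$ is meaningful. Moreover, $\mathbf{A}\mathbf{B}$ may equal 0, 0 being the matrix with zero in all entries, although $\mathbf{A} \ne 0$ and $\mathbf{B} \ne 0$.

b. A few special matrices
For future purposes, it is convenient to single out some special matrices. A square $n \times n$ matrix $\mathbf{A}$ with nonzero entries only on the principal diagonal is called a diagonal matrix and denoted by
$$\mathbf{A} = \operatorname{diag}(\lambda_1, \dots, \lambda_n) := \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix};$$
in short, $\mathbf{A}$ is diagonal if and only if $\mathbf{A} = [\lambda_j \delta^i_j]$ where $\delta^i_j$ is the Kronecker symbol,
$$\delta^i_j := \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}$$
The $n \times n$ matrix $\mathrm{Id}_n := \operatorname{diag}(1, \dots, 1) = [\delta^i_j]$ is called the identity matrix since for every $\mathbf{A} \in M_{p,n}(\mathbb{K})$ and $\mathbf{B} \in M_{n,q}(\mathbb{K})$ we have $\mathbf{A}\,\mathrm{Id}_n = \mathbf{A}$, $\mathrm{Id}_n \mathbf{B} = \mathbf{B}$. We say that $\mathbf{A} = [a^i_j]$ is upper triangular if $a^i_j = 0$ for all $(i, j)$ with $i > j$ and lower triangular if $a^i_j = 0$ for all $(i, j)$ with $i < j$.

1.15 Definition. We say that an $n \times n$ square matrix $\mathbf{A} \in M_{n,n}(\mathbb{K})$ is invertible if there exists $\mathbf{B} \in M_{n,n}(\mathbb{K})$ such that $\mathbf{A}\mathbf{B} = \mathrm{Id}_n$ and $\mathbf{B}\mathbf{A} = \mathrm{Id}_n$. Since the inverse is trivially unique, we call $\mathbf{B}$ the inverse of $\mathbf{A}$ and we denote it by $\mathbf{A}^{-1}$.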
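
The rows-by-columns product and the remarks on commutativity can be checked on small cases. The following Python sketch is only an illustration: matrices are represented as lists of rows, and the names `matmul` and `identity` as well as the two sample matrices are our own choices.

```python
def matmul(A, B):
    """Rows-by-columns product of A (p x n) and B (n x q), given as lists of rows."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    q = len(B[0])
    # entry (i, j) is the sum over k of A[i][k] * B[k][j]
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(q)] for row in A]

def identity(n):
    """The n x n identity matrix, built from the Kronecker symbol."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[0, 1], [0, 0]]
B = [[1, 0], [0, 0]]

print(matmul(A, B))  # [[0, 0], [0, 0]]: AB = 0 although A and B are not zero
print(matmul(B, A))  # [[0, 1], [0, 0]]: BA differs from AB
```

With these two matrices one sees at once both phenomena mentioned in the text: the product need not commute, and a product of nonzero matrices may vanish.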


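1.16 ¶. Show that an upper (lower) triangular matrix $\mathbf{A} = [a^i_j] \in M_{n,n}(\mathbb{K})$ is invertible if and only if $a^i_i \ne 0$ $\forall i = 1, \dots, n$. Show that, if $\mathbf{A}$ is invertible, $\mathbf{A}^{-1}$ is upper (lower) triangular.

1.17 Definition. Let $\mathbf{A} = [a^i_j] \in M_{m,n}(\mathbb{K})$. The transpose $\mathbf{A}^T$ of $\mathbf{A}$ is the matrix $\mathbf{A}^T := [b^i_j] \in M_{n,m}(\mathbb{K})$ where $b^i_j = a^j_i$ $\forall i = 1, \dots, n$, $\forall j = 1, \dots, m$.

We obtain $\mathbf{A}^T$ from $\mathbf{A}$ by exchanging rows with columns, that is, writing the successive columns of $\mathbf{A}$ from left to right as successive rows from top to bottom. It is easily seen that
(i) $(\mathbf{A}^T)^T = \mathbf{A}$,
(ii) $(\lambda \mathbf{A} + \mu \mathbf{B})^T = \lambda \mathbf{A}^T + \mu \mathbf{B}^T$ $\forall \lambda, \mu \in \mathbb{K}$,
(iii) $(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T \mathbf{A}^T$ $\forall \mathbf{A}, \mathbf{B}$,
(iv) $\mathbf{A}$ is invertible if and only if $\mathbf{A}^T$ is invertible and $(\mathbf{A}^{-1})^T = (\mathbf{A}^T)^{-1}$.
In particular, in the context of matrices with one upper and one lower index, the transposition operation exchanges upper and lower indices; thus in the case of row- and column-vectors we have
$$(a_1, a_2, \dots, a_n)^T = \begin{pmatrix} a^1 \\ a^2 \\ \vdots \\ a^n \end{pmatrix}.$$

c. Matrices and linear operators
A map $A : \mathbb{K}^n \to \mathbb{K}^m$ is said to be linear if
$$A(\lambda x + \mu y) = \lambda A(x) + \mu A(y) \qquad \forall x, y \in \mathbb{K}^n, \ \forall \lambda, \mu \in \mathbb{K}.$$
In particular $A(0) = 0$. By induction it is easily seen that $A$ is linear if and only if
$$A\Big( \sum_{j=1}^k \lambda^j v_j \Big) = \sum_{j=1}^k \lambda^j A(v_j)$$
for any $k = 1, 2, 3, \dots$, for any $v_1, v_2, \dots, v_k \in \mathbb{K}^n$ and scalars $\lambda^1, \dots, \lambda^k$. Linear maps from $\mathbb{K}^n$ into $\mathbb{K}^m$ and $m \times n$ matrices are in one-to-one correspondence. In fact, we have the following.

1.18 Proposition. Let $A : \mathbb{K}^n \to \mathbb{K}^m$ be a linear map. Then the matrix
$$\mathbf{A} := [A(e_1) \mid A(e_2) \mid \dots \mid A(e_n)], \tag{1.3}$$
where $(e_1, e_2, \dots, e_n)$ is the canonical basis of $\mathbb{K}^n$, is the unique matrix such that
$$A(x) = \mathbf{A}x, \qquad x = (x^1, x^2, \dots, x^n). \tag{1.4}$$
Conversely, if $\mathbf{A} \in M_{m,n}(\mathbb{K})$, then the map in (1.4) is a linear map from $\mathbb{K}^n$ into $\mathbb{K}^m$.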


Proof. Assuming $\mathbf{A}$ is defined by (1.3), we have for all $x = (x^1, x^2, \dots, x^n) \in \mathbb{K}^n$
$$A(x) = A\Big( \sum_{i=1}^n x^i e_i \Big) = \sum_{i=1}^n x^i A(e_i) = \mathbf{A}x.$$
Actually, $\mathbf{A}$ is characterized by (1.4), since if $A(x) = \mathbf{B}x$ $\forall x$, then $A(e_i) = \mathbf{B}e_i$ $\forall i = 1, \dots, n$, hence $\mathbf{A}$ and $\mathbf{B}$ have the same columns. Conversely, given $\mathbf{A} \in M_{m,n}(\mathbb{K})$, it is trivial to check that the map $x \to \mathbf{A}x$ is a linear map from $\mathbb{K}^n$ into $\mathbb{K}^m$. □

1.19 Remark. The map $A \to \mathbf{A}$ that relates linear operators and matrices is tied to the (ordered) canonical bases of $\mathbb{K}^n$ and $\mathbb{K}^m$. If $A$ and $\mathbf{A}$ are related by (1.4), we refer to $A$ and $\mathbf{A}$ respectively as the linear map associated to the matrix $\mathbf{A}$ and the matrix associated to the linear map $A$. If we denote by $a_1, a_2, \dots, a_n$ the columns of $\mathbf{A}$ indexed from the left so that $\mathbf{A} = [a_1 \mid a_2 \mid \dots \mid a_n]$, then
$$A(x) = \mathbf{A}x = \sum_{i=1}^n x^i a_i, \qquad x = (x^1, x^2, \dots, x^n), \tag{1.5}$$
that is, for every $x = (x^1, x^2, \dots, x^n)$, $A(x)$ is the linear combination of the vectors $a_1, a_2, \dots, a_n$ of $\mathbb{K}^m$ with the scalars $x^1, x^2, \dots, x^n$ as coefficients. Observe that $A(e_1) = a_1, \dots, A(e_n) = a_n$, where $(e_1, e_2, \dots, e_n)$ is the canonical basis of $\mathbb{K}^n$.

1.20 Proposition. Under the correspondence (1.4) between matrices and linear maps, the sum of two matrices corresponds to the sum of the associated operators, and the product of two matrices corresponds to the composition product of the associated operators.

Proof. (i) Let $\mathbf{A}, \mathbf{B} \in M_{m,n}(\mathbb{K})$ and let $A(x) := \mathbf{A}x$ and $B(x) := \mathbf{B}x$. Then we have $(A + B)(x) := A(x) + B(x) = \mathbf{A}x + \mathbf{B}x = (\mathbf{A} + \mathbf{B})x$.
(ii) Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$, $\mathbf{B} \in M_{p,m}(\mathbb{K})$, $A(x) := \mathbf{A}x$ and $B(y) := \mathbf{B}y$ $\forall x \in \mathbb{K}^n$, $\forall y \in \mathbb{K}^m$. Then $(B \circ A)(x) = B(A(x)) = \mathbf{B}(\mathbf{A}x) = (\mathbf{B}\mathbf{A})x$. □
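
Propositions 1.18 and 1.20 can be illustrated numerically. In the sketch below (an illustration only: the maps `f` and `g` are hypothetical examples, and `matrix_of` and `matmul` are our own helper names) the matrix of a linear map is assembled from the images of the canonical basis as in (1.3), and the matrix of a composition is compared with the product of the matrices.

```python
def matrix_of(f, n):
    """Matrix [f(e_1) | ... | f(e_n)] of a linear map f defined on K^n, as a list of rows."""
    cols = [f([1 if i == j else 0 for i in range(n)]) for j in range(n)]
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]

def matmul(A, B):
    """Rows-by-columns product of two matrices given as lists of rows."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for row in A]

def f(x):
    """A hypothetical linear map from K^3 into K^2."""
    return [x[0] + 2 * x[1], 3 * x[2] - x[0]]

def g(y):
    """A hypothetical linear map from K^2 into K^2."""
    return [y[1], y[0] + y[1]]

F = matrix_of(f, 3)                       # [[1, 2, 0], [-1, 0, 3]]
G = matrix_of(g, 2)                       # [[0, 1], [1, 1]]
GF = matrix_of(lambda x: g(f(x)), 3)      # matrix of the composition g o f

print(GF == matmul(G, F))  # True
```

Only linearity of `f` and `g` is used here; for a nonlinear map the matrix built by `matrix_of` would not reproduce the map.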

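1.21 ¶. Give a few examples of $2 \times 3$ and $3 \times 2$ matrices, their sums and their products (whenever this is possible). Show examples for which $\mathbf{A}\mathbf{B} \ne \mathbf{B}\mathbf{A}$ and $\mathbf{A}\mathbf{B} = 0$ although $\mathbf{A} \ne 0$ and $\mathbf{B} \ne 0$. Finally, show that $\mathbf{A}x = 0$ $\forall x \in \mathbb{K}^n$ implies $\mathbf{A} = 0$. [Hint: Compare Exercises 1.76, 1.79 and 1.81.]

1.22 ¶. Show that $\varphi : \mathbb{R}^2 \to \mathbb{R}$ is linear if and only if there exist $a, b \in \mathbb{R}$ such that $\varphi((x, y)) = ax + by$ $\forall x, y \in \mathbb{R}$.

1.23 ¶. Show that $\varphi : \mathbb{R}^2 \to \mathbb{R}$ is linear if and only if
(i) $\varphi((\lambda x, \lambda y)) = \lambda \varphi((x, y))$ $\forall (x, y) \in \mathbb{R}^2$ and $\forall \lambda \in \mathbb{R}_+$,
(ii) there exist $A$ and $\tau \in \mathbb{R}$ such that $\varphi((\cos\theta, \sin\theta)) = A \cos(\theta + \tau)$ $\forall \theta \in \mathbb{R}$.

1.24 ¶. The reader is invited to find the form of the associated matrices corresponding to the linear operators sketched in Figure 1.3.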


Figure 1.3. Some linear transformations of the plane. In the figure the possible images of the square [0,1] x [0,1] are in shadow.

d. Image and kernel
Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$ and let $A(x) := \mathbf{A}x$, $x \in \mathbb{K}^n$, be the associated linear operator. The kernel and the image of $A$ (or $\mathbf{A}$) are respectively defined by
$$\ker A = \ker \mathbf{A} := \{ x \in \mathbb{K}^n \mid A(x) = 0 \},$$
$$\operatorname{Im} A = \operatorname{Im} \mathbf{A} := \{ y \in \mathbb{K}^m \mid \exists\, x \in \mathbb{K}^n \text{ such that } A(x) = y \}.$$
Trivially, $\ker A$ is a linear subspace of the source space $\mathbb{K}^n$, and it is easy to see that the following three claims are equivalent:
(i) $A$ is injective,
(ii) $\ker A = \{0\}$,
(iii) $a_1, a_2, \dots, a_n$ are linearly independent in $\mathbb{K}^m$.
If one of the previous claims holds, we say that $\mathbf{A}$ is nonsingular, although in the current literature nonsingular usually refers to square matrices. Also observe that $\mathbf{A}$ may be nonsingular only if $m \ge n$.

Also $\operatorname{Im} A = \operatorname{Im} \mathbf{A}$ is a linear subspace of the target space $\mathbb{K}^m$, and by definition $\operatorname{Im} A = \operatorname{Span}\{a_1, a_2, \dots, a_n\}$. The dimension of $\operatorname{Im} A = \operatorname{Im} \mathbf{A}$ is called the rank of $A$ (or of $\mathbf{A}$) and is denoted by $\operatorname{Rank} A$ (or $\operatorname{Rank} \mathbf{A}$). By definition $\operatorname{Rank} \mathbf{A}$ is the maximal number of linearly independent columns of $\mathbf{A}$; in particular $\operatorname{Rank} \mathbf{A} \le \min(n, m)$. Moreover, it is easy to see that the following claims are equivalent:
(i) $A$ is surjective,
(ii) $\operatorname{Im} A = \mathbb{K}^m$,
(iii) $\operatorname{Rank} \mathbf{A} = m$.
Therefore $A$ may be surjective only if $m \le n$. The following theorem is crucial.

1.25 Theorem (Rank formula). For every matrix $\mathbf{A} \in M_{m,n}(\mathbb{K})$ we have
$$\dim \operatorname{Im} \mathbf{A} = n - \dim \ker \mathbf{A}.$$
Proof. Let $(v_1, v_2, \dots, v_k)$ be a basis of $\ker \mathbf{A}$. According to Theorem 1.7 we can choose $(n - k)$ vectors $e_{k+1}, \dots, e_n$ of the standard basis of $\mathbb{K}^n$ in such a way that $v_1, v_2, \dots, v_k, e_{k+1}, \dots, e_n$ form a basis of $\mathbb{K}^n$. Then one easily checks that $(A(e_{k+1}), \dots, A(e_n))$ is a basis of $\operatorname{Im} \mathbf{A}$, thus concluding that $\dim \operatorname{Im} \mathbf{A} = n - k$. □

A first trivial consequence of the rank formula is the following.

1.26 Corollary. Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$.
(i) If $m < n$, then $\dim \ker \mathbf{A} > 0$.
(ii) If $m \ge n$, then $\mathbf{A}$ is nonsingular, i.e., $\ker \mathbf{A} = \{0\}$, if and only if $\operatorname{Rank} \mathbf{A}$ is maximal, $\operatorname{Rank} \mathbf{A} = n$.
(iii) If $m = n$, i.e., $\mathbf{A}$ is a square matrix, then the following two equivalent claims hold:
a) Let $A(x) := \mathbf{A}x$ be the associated linear map. Then $A$ is surjective if and only if $A$ is injective.
b) $\mathbf{A}x = b$ is solvable for any choice of $b \in \mathbb{K}^m$ if and only if $\mathbf{A}x = 0$ has zero as a unique solution.

Proof. (i) From the rank formula we have $\dim \ker \mathbf{A} = n - \dim \operatorname{Im} \mathbf{A} \ge n - m > 0$.
(ii) Again from the rank formula, $\ker \mathbf{A} = \{0\}$ if and only if $\dim \operatorname{Im} \mathbf{A} = n - \dim \ker \mathbf{A} = n = \min(n, m)$.
(iii) (a) Observe that $A$ is injective if and only if $\ker \mathbf{A} = \{0\}$, equivalently if and only if $\dim \ker \mathbf{A} = 0$, and that $A$ is surjective if and only if $\operatorname{Im} \mathbf{A} = \mathbb{K}^m$, i.e., $\dim \operatorname{Im} \mathbf{A} = m = n$. The conclusion follows from the rank formula.
(iii) (b) The equivalence between (iii) (a) and (iii) (b) is trivial. □
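
The rank formula lends itself to a direct numerical check. The following Python sketch, under the assumption of rational entries, reduces a matrix to reduced row echelon form, reads off the rank as the number of pivots and builds a basis of the kernel from the free columns; the helper names `rref` and `kernel_basis` and the sample matrix are ours.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """One kernel vector per free column: free variable 1, pivot variables solved for."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]
        basis.append(v)
    return basis

A = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]   # a 3 x 4 example, n = 4
rank_A = len(rref(A)[1])
ker = kernel_basis(A)
print(rank_A, len(ker))  # 2 2, so that dim Im A + dim ker A = 4 = n
```

Exact rational arithmetic is used on purpose: with floating point numbers the test `!= 0` for a pivot would need a tolerance.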

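Notice that (i) and (ii) imply that $A : \mathbb{K}^n \to \mathbb{K}^m$ may be injective and surjective only if $n = m$.

1.27 ¶. Show the following.
Proposition. Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$ and $A(x) := \mathbf{A}x$. The following claims are equivalent:
(i) $A$ is injective and surjective,
(ii) $\mathbf{A}$ is nonsingular, i.e., $\ker \mathbf{A} = \{0\}$,
(iii) $A$ is surjective,
(iv) there exists $\mathbf{B} \in M_{n,n}(\mathbb{K})$ such that $\mathbf{B}\mathbf{A} = \mathrm{Id}_n$,
(v) there exists $\mathbf{B} \in M_{n,n}(\mathbb{K})$ such that $\mathbf{A}\mathbf{B} = \mathrm{Id}_n$,
(vi) $\mathbf{A}$ is invertible, i.e., there exists a matrix $\mathbf{B} \in M_{n,n}(\mathbb{K})$ such that $\mathbf{B}\mathbf{A} = \mathbf{A}\mathbf{B} = \mathrm{Id}_n$.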


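An important and less trivial consequence of the rank formula is the following.

1.28 Theorem (Rank of the transpose). Let $\mathbf{A} \in M_{m,n}$. Then we have
(i) the maximum number of linearly independent columns and the maximum number of linearly independent rows are equal, i.e., $\operatorname{Rank} \mathbf{A} = \operatorname{Rank} \mathbf{A}^T$,
(ii) let $p := \operatorname{Rank} \mathbf{A}$. Then there exists a nonsingular $p \times p$ square submatrix of $\mathbf{A}$.

Proof. (i) Let $\mathbf{A} = [a^i_j]$, let $a_1, a_2, \dots, a_n$ be the columns of $\mathbf{A}$ and let $p := \operatorname{Rank} \mathbf{A}$. We assume without loss of generality that the first $p$ columns of $\mathbf{A}$ are linearly independent and we define $\mathbf{B}$ as the $m \times p$ submatrix formed by these columns, $\mathbf{B} := [a_1 \mid a_2 \mid \dots \mid a_p]$. Since the remaining columns of $\mathbf{A}$ depend linearly on the columns of $\mathbf{B}$, we have
$$a^k_j = \sum_{i=1}^p a^k_i\, r^i_{j-p} \qquad \forall k = 1, \dots, m, \ \forall j = p + 1, \dots, n$$
for some $\mathbf{R} = [r^i_j] \in M_{p,n-p}(\mathbb{K})$. In terms of matrices,
$$[a_{p+1} \mid a_{p+2} \mid \dots \mid a_n] = [a_1 \mid \dots \mid a_p]\, \mathbf{R} = \mathbf{B}\mathbf{R},$$
hence
$$\mathbf{A} = [\mathbf{B} \mid \mathbf{B}\mathbf{R}] = \mathbf{B}\, [\mathrm{Id}_p \mid \mathbf{R}].$$
Taking the transposes, we have $\mathbf{A}^T \in M_{n,m}(\mathbb{K})$, $\mathbf{B}^T \in M_{p,m}(\mathbb{K})$ and
$$\mathbf{A}^T = \begin{pmatrix} \mathrm{Id}_p \\ \mathbf{R}^T \end{pmatrix} \mathbf{B}^T. \tag{1.6}$$
Since $[\mathrm{Id}_p \mid \mathbf{R}]^T$ is trivially injective, we infer that $\ker \mathbf{A}^T = \ker \mathbf{B}^T$, hence by the rank formula
$$\operatorname{Rank} \mathbf{A}^T = m - \dim \ker \mathbf{A}^T = m - \dim \ker \mathbf{B}^T = \operatorname{Rank} \mathbf{B}^T,$$
and we conclude that $\operatorname{Rank} \mathbf{A}^T = \operatorname{Rank} \mathbf{B}^T \le \min(m, p) = p = \operatorname{Rank} \mathbf{A}$. Finally, by applying the above to the matrix $\mathbf{A}^T$, we get the opposite inequality $\operatorname{Rank} \mathbf{A} = \operatorname{Rank} (\mathbf{A}^T)^T \le \operatorname{Rank} \mathbf{A}^T$, hence the conclusion.
(ii) With the previous notation, we have $\operatorname{Rank} \mathbf{B}^T = \operatorname{Rank} \mathbf{B} = p$. Thus $\mathbf{B}$ has a set of $p$ independent rows. The submatrix $\mathbf{S}$ of $\mathbf{B}$ made by these rows is a square $p \times p$ matrix with $\operatorname{Rank} \mathbf{S} = \operatorname{Rank} \mathbf{S}^T = p$, hence nonsingular. □

1.29 ¶. Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$, let $A(x) := \mathbf{A}x$ and let $(v_1, v_2, \dots, v_n)$ be a basis of $\mathbb{K}^n$. Show the following:
(i) $A$ is injective if and only if the vectors $A(v_1), A(v_2), \dots, A(v_n)$ of $\mathbb{K}^m$ are linearly independent,
(ii) $A$ is surjective if and only if $\{A(v_1), A(v_2), \dots, A(v_n)\}$ spans $\mathbb{K}^m$,
(iii) $A$ is bijective if and only if $\{A(v_1), A(v_2), \dots, A(v_n)\}$ is a basis of $\mathbb{K}^m$.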
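
Theorem 1.28 (i) can be checked on an example by computing the rank of a matrix and of its transpose independently. The sketch below assumes rational entries; `rank` and `transpose` are our own helper names and the matrix is a hypothetical example.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by forward elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def transpose(A):
    """Exchange rows with columns."""
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3, 4], [2, 4, 6, 8], [1, 0, 1, 0]]
print(rank(A), rank(transpose(A)))  # 2 2
```

Of course one example proves nothing; the point is only that the two numbers are obtained from two different eliminations, on the rows and on the columns.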


e. Grassmann's formula
Let $U$ and $V$ be two linear subspaces of $\mathbb{K}^n$. Clearly, both $U \cap V$ and
$$U + V := \{ x \in \mathbb{K}^n \mid x = u + v \text{ for some } u \in U \text{ and } v \in V \}$$
are linear subspaces of $\mathbb{K}^n$. When $U \cap V = \{0\}$, we say that $U + V$ is the direct sum of $U$ and $V$ and we write $U \oplus V$ for $U + V$. If moreover $U \oplus V = \mathbb{K}^n$, we say that $U$ and $V$ are supplementary subspaces. The following formula is very useful.

1.30 Proposition (Grassmann's formula). Let $U$ and $V$ be linear subspaces of $\mathbb{K}^n$. Then
$$\dim(U + V) + \dim(U \cap V) = \dim U + \dim V.$$
Proof. Let $(u_1, u_2, \dots, u_h)$ and $(v_1, v_2, \dots, v_k)$ be two bases of $U$ and $V$ respectively. The vectors $u_1, u_2, \dots, u_h, v_1, v_2, \dots, v_k$ span $U + V$, and a subset of them forms a basis of $U + V$. In particular, $\dim(U + V) = \operatorname{Rank} \mathbf{L}$ where $\mathbf{L}$ is the $n \times (h + k)$ matrix defined by
$$\mathbf{L} := [u_1 \mid \dots \mid u_h \mid -v_1 \mid \dots \mid -v_k].$$
Moreover, a vector $x = \sum_{i=1}^h x^i u_i \in U$ is in $U \cap V$ if and only if there exist unique $y^1, y^2, \dots, y^k$ such that
$$x = x^1 u_1 + \dots + x^h u_h = y^1 v_1 + \dots + y^k v_k,$$
thus, if and only if the vector $w := (x^1, x^2, \dots, x^h, y^1, y^2, \dots, y^k) \in \mathbb{K}^{h+k}$ belongs to $\ker \mathbf{L}$. Consequently, the linear map $\phi : \mathbb{K}^{h+k} \to \mathbb{K}^n$,
$$\phi(x^1, \dots, x^h, y^1, \dots, y^k) := \sum_{i=1}^h x^i u_i,$$
is injective and surjective from $\ker \mathbf{L}$ onto $U \cap V$. It follows that $\dim(U \cap V) = \dim \ker \mathbf{L}$ and, by the rank formula,
$$\dim(U \cap V) + \dim(U + V) = \dim \ker \mathbf{L} + \operatorname{Rank} \mathbf{L} = h + k = \dim U + \dim V.$$ □
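
The proof of Grassmann's formula is constructive, and the construction can be sketched as follows. The code assumes rational entries and two hypothetical subspaces of $\mathbb{Q}^3$ given by bases; the helper names are ours. The matrix $\mathbf{L}$ has the $u_i$ and the $-v_j$ as columns, its rank gives $\dim(U + V)$, and each kernel vector $(x, y)$ yields the vector $\sum_i x^i u_i$ of $U \cap V$.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """One kernel vector per free column of the reduced row echelon form."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]
        basis.append(v)
    return basis

U = [(1, 0, 0), (0, 1, 0)]          # basis of U (hypothetical example)
V = [(0, 1, 0), (0, 0, 1)]          # basis of V (hypothetical example)
h, n = len(U), len(U[0])

# L = [u_1 | ... | u_h | -v_1 | ... | -v_k], stored as a list of rows
columns = [list(u) for u in U] + [[-c for c in v] for v in V]
L = [list(row) for row in zip(*columns)]

dim_sum = len(rref(L)[1])           # dim(U + V) = Rank L
ker = kernel_basis(L)               # dim(U n V) = dim ker L
intersection = [[sum(w[i] * U[i][c] for i in range(h)) for c in range(n)] for w in ker]

print(dim_sum, len(ker))  # 3 1
```

Here $U + V$ is the whole space and $U \cap V$ is the line spanned by $(0, 1, 0)$, in agreement with $3 + 1 = 2 + 2$.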

1.31 ¶. Notice that the proof of Grassmann's formula is in fact a procedure to compute two bases of $U + V$ and $U \cap V$ starting from two bases of $U$ and $V$. The reader is invited to choose two subspaces $U$ and $V$ of $\mathbb{K}^n$ and to compute a basis of $U + V$ and of $U \cap V$.

f. Parametric and implicit equations of a subspace
1.32 Parametric equation of a straight line in $\mathbb{K}^n$. Let $a \ne 0$ and $q$ be two vectors in $\mathbb{K}^n$. The parametric equation of a straight line through $q$ and direction $a$ is the map $r : \mathbb{K} \to \mathbb{K}^n$ given by $r(\lambda) := \lambda a + q$, $\lambda \in \mathbb{K}$. The image of $r$,
$$\{ x \in \mathbb{K}^n \mid \exists\, \lambda \text{ such that } x = \lambda a + q \},$$
is the straight line through $q$ and direction $a$.


Figure 1.4. Straight line through q and direction a.

We have $r(0) = q$ and $r(1) = a + q$. In other words, $r(t)$ passes through $q$ and $a + q$. Moreover, $x$ is on the straight line passing through $q$ and $a + q$ if and only if there exists $t \in \mathbb{K}$ such that $x = ta + q$, or, more explicitly,
$$\begin{cases} x^1 = t a^1 + q^1, \\ x^2 = t a^2 + q^2, \\ \quad \vdots \\ x^n = t a^n + q^n. \end{cases} \tag{1.7}$$
In kinematics, $\mathbb{K} = \mathbb{R}$ and the map $t \to r(t) := ta + q$ gives the position at time $t$ of a point moving with constant vector velocity $a$ starting at $q$ at time $t = 0$ on the straight line through $q$ and $a + q$.

1.33 Implicit equation of a straight line in $\mathbb{K}^n$. We want to find a representation of the straight line (1.7) which makes no use of the free parameter $t$. Since $a \ne 0$, one of its components is nonzero. Assuming for instance $a^1 \ne 0$, we can solve the first equation in (1.7) to get $t = (x^1 - q^1)/a^1$ and, substituting the result into the last $(n - 1)$ equations, we find a system of $(n - 1)$ constraints on the variable $x = (x^1, x^2, \dots, x^n) \in \mathbb{K}^n$,
$$\begin{cases} x^2 = \dfrac{x^1 - q^1}{a^1}\, a^2 + q^2, \\ x^3 = \dfrac{x^1 - q^1}{a^1}\, a^3 + q^3, \\ \quad \vdots \\ x^n = \dfrac{x^1 - q^1}{a^1}\, a^n + q^n. \end{cases}$$
The previous linear system can be written as $\mathbf{A}(x - q) = 0$ where $\mathbf{A} \in M_{n-1,n}(\mathbb{K})$ is the matrix defined by
$$\mathbf{A} = \begin{pmatrix} a^2/a^1 & -1 & 0 & \dots & 0 \\ a^3/a^1 & 0 & -1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a^n/a^1 & 0 & 0 & \dots & -1 \end{pmatrix}.$$

1.34 ¶. Show that there are several parametric equations of a given straight line. A parametric equation of the straight line through $a$ and $b \in \mathbb{R}^n$ is given by $t \to r(t) := a + t(b - a)$, $t \in \mathbb{R}$.

1.35 Parametric and implicit equations of a 2-plane in $\mathbb{K}^3$. Given two linearly independent vectors $v_1, v_2$ in $\mathbb{K}^3$ and a point $q \in \mathbb{K}^3$, we call the parametric equation of the plane directed by $v_1, v_2$ and passing through $q$ the map $\varphi : \mathbb{K}^2 \to \mathbb{K}^3$ defined by $\varphi((\alpha, \beta)) := \alpha v_1 + \beta v_2 + q$, or in matrix notation
$$\varphi((\alpha, \beta)) = [v_1 \mid v_2] \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + q.$$
Of course $\varphi$ is linear if and only if $q = 0$. The 2-plane determined by this parametrization is defined by
$$\Pi := \operatorname{Im} \varphi = \{ x \in \mathbb{K}^3 \mid x - q \in \operatorname{Im} \mathbf{A} \}, \qquad \mathbf{A} := [v_1 \mid v_2].$$
Suppose $v_1 = (a, b, c)$ and $v_2 = (d, e, f)$ so that
$$\mathbf{A} = \begin{pmatrix} a & d \\ b & e \\ c & f \end{pmatrix}.$$
Because of Theorem 1.28, there is a nonsingular $2 \times 2$ submatrix $\mathbf{B}$ of $\mathbf{A}$ and, without loss of generality, we can suppose that $\mathbf{B} = \begin{pmatrix} a & d \\ b & e \end{pmatrix}$. We can then solve the system
$$\begin{cases} x^1 - q^1 = a\alpha + d\beta, \\ x^2 - q^2 = b\alpha + e\beta \end{cases}$$
in the unknown $(\alpha, \beta)$, thus finding $\alpha$ and $\beta$ as linear functions of $x^1 - q^1$ and $x^2 - q^2$. Then, substituting into the third equation, we can eliminate $(\alpha, \beta)$ from the last equation, obtaining an implicit equation, or constraint, on the independent variables, of the form
$$r\,(x^1 - q^1) + s\,(x^2 - q^2) + t\,(x^3 - q^3) = 0,$$
that describes the 2-plane without any further reference to the free parameters $(\alpha, \beta)$.

More generally, let $W$ be a linear subspace of dimension $k$ in $\mathbb{K}^n$, also called a $k$-plane (through the origin) of $\mathbb{K}^n$. If $v_1, v_2, \dots, v_k$ is a basis of $W$, we can write $W = \operatorname{Im} \mathbf{L}$ where
$$\mathbf{L} := [v_1 \mid v_2 \mid \dots \mid v_k].$$
We call $x \to L(x) := \mathbf{L}x$ the parametric equation of $W$ generated by $(v_1, v_2, \dots, v_k)$. Of course a different basis of $W$ yields a different parametrization. We can also write any subspace $W$ of dimension $k$ as $W = \ker \mathbf{A}$ where $\mathbf{A} \in M_{n-k,n}(\mathbb{K})$. We call it an implicit representation of $W$. Notice that since $\ker \mathbf{A} = W$, we have $\operatorname{Rank} \mathbf{A}^T = \operatorname{Rank} \mathbf{A} = n - k$ by Theorem 1.28 and the rank formula. Hence the rows of $\mathbf{A}$ are $n - k$ linearly independent vectors of $\mathbb{K}^n$.

1.36 Remark. A $k$-dimensional subspace of $\mathbb{K}^n$ is represented by means of $k$ free parameters, i.e., as the image of $\mathbb{K}^k$ through a nondegenerate parametric equation, or by a set of $(n - k)$ independent constraints given by linearly independent scalar equations in the ambient variables.


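1.37 Parametric and implicit representations. One can go back and forth from the parametric to the implicit representation in several ways. For instance, start with $W = \operatorname{Im} \mathbf{L}$ where $\mathbf{L} \in M_{n,k}(\mathbb{K})$ has maximal rank, $\operatorname{Rank} \mathbf{L} = k$. By Theorem 1.28 there is a $k \times k$ nonsingular submatrix $\mathbf{M}$ of $\mathbf{L}$. Assume that $\mathbf{M}$ is made by the first $k$ rows of $\mathbf{L}$ so that
$$\mathbf{L} = \begin{pmatrix} \mathbf{M} \\ \mathbf{N} \end{pmatrix}$$
where $\mathbf{N} \in M_{n-k,k}(\mathbb{K})$. Writing $x$ as $x = (x', x'')$ with $x' \in \mathbb{K}^k$ and $x'' \in \mathbb{K}^{n-k}$, the parametric equation $x = \mathbf{L}t$, $t \in \mathbb{K}^k$, writes as
$$\begin{cases} x' = \mathbf{M}t, \\ x'' = \mathbf{N}t. \end{cases} \tag{1.8}$$
As $\mathbf{M}$ is invertible,
$$\begin{cases} t = \mathbf{M}^{-1} x', \\ \mathbf{N}\mathbf{M}^{-1} x' = x''. \end{cases}$$
We then conclude that $x \in \operatorname{Im} \mathbf{L}$ if and only if $\mathbf{N}\mathbf{M}^{-1} x' = x''$. The latter is an implicit equation for $W$, that we may write as $\mathbf{A}x = 0$ if we define $\mathbf{A} \in M_{n-k,n}(\mathbb{K})$ by
$$\mathbf{A} = [\, \mathbf{N}\mathbf{M}^{-1} \mid -\mathrm{Id}_{n-k} \,].$$
Conversely, let $W = \ker \mathbf{A}$ where $\mathbf{A} \in M_{n-k,n}(\mathbb{K})$ has $\operatorname{Rank} \mathbf{A} = n - k$. Select $n - k$ independent columns, say the first $n - k$ on the left, call $\mathbf{B} \in M_{n-k,n-k}(\mathbb{K})$ the square matrix made by these columns and $\mathbf{C} \in M_{n-k,k}(\mathbb{K})$ the matrix made by the remaining columns, and split $x$ as $x = (x', x'')$ where $x' \in \mathbb{K}^{n-k}$ and $x'' \in \mathbb{K}^k$. Thus $\mathbf{A}x = 0$ rewrites as
$$[\mathbf{B} \mid \mathbf{C}] \begin{pmatrix} x' \\ x'' \end{pmatrix} = 0, \qquad \text{or} \qquad \mathbf{B}x' + \mathbf{C}x'' = 0.$$
As $\mathbf{B}$ is invertible, the last equation rewrites as $x' = -\mathbf{B}^{-1}\mathbf{C}x''$. Therefore $x \in \ker \mathbf{A}$ if and only if
$$x = \begin{pmatrix} -\mathbf{B}^{-1}\mathbf{C} \\ \mathrm{Id}_k \end{pmatrix} x'' =: \mathbf{L}x'', \qquad x'' \in \mathbb{K}^k,$$
i.e., $W = \operatorname{Im} \mathbf{L}$.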
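
Passing from a parametric to an implicit representation can also be done by a route different from the block elimination of 1.37, and the following sketch uses that alternative: the rows of $\mathbf{A}$ are taken to be a basis of $\ker \mathbf{L}^T$. Then $\mathbf{A}\mathbf{L} = 0$ and $\operatorname{Rank} \mathbf{A} = n - k$, so $\ker \mathbf{A}$ contains $\operatorname{Im} \mathbf{L}$ and has the same dimension $k$. Entries are assumed rational and all names are ours.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def kernel_basis(rows):
    """One kernel vector per free column of the reduced row echelon form."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -R[r][f]
        basis.append(v)
    return basis

def implicit_from_parametric(columns):
    """Rows of a matrix A with ker A = Span(columns), computed as a basis of ker L^T."""
    # the rows of L^T are exactly the columns of L
    return kernel_basis([list(v) for v in columns])

v1, v2 = (1, 0, 1), (0, 1, 1)           # hypothetical basis of a 2-plane in Q^3
A = implicit_from_parametric([v1, v2])
print(A == [[-1, -1, 1]])  # True: the plane is described by -x^1 - x^2 + x^3 = 0
```

The block formula $\mathbf{A} = [\mathbf{N}\mathbf{M}^{-1} \mid -\mathrm{Id}_{n-k}]$ of the text gives here the row $(1, 1, -1)$, which is the same constraint up to sign.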


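1.3 Matrices and Linear Systems

a. Linear systems and the language of linear algebra
Matrices and linear operators are strongly tied to linear systems. A linear system of $m$ equations and $n$ unknowns has the form
$$\begin{cases} a^1_1 x^1 + a^1_2 x^2 + \dots + a^1_n x^n = b^1, \\ a^2_1 x^1 + a^2_2 x^2 + \dots + a^2_n x^n = b^2, \\ \quad \vdots \\ a^m_1 x^1 + a^m_2 x^2 + \dots + a^m_n x^n = b^m. \end{cases} \tag{1.9}$$
The $m$-tuple $(b^1, \dots, b^m)$ is the given right-hand side, the $n$-tuple $(x^1, \dots, x^n)$ is the unknown and the numbers $\{a^i_j\}$, $i = 1, \dots, m$, $j = 1, \dots, n$, are given and called the coefficients of the system. If we think of the coefficients as the entries of a matrix $\mathbf{A}$,
$$\mathbf{A} = [a^i_j] = \begin{pmatrix} a^1_1 & a^1_2 & \dots & a^1_n \\ a^2_1 & a^2_2 & \dots & a^2_n \\ \vdots & \vdots & \ddots & \vdots \\ a^m_1 & a^m_2 & \dots & a^m_n \end{pmatrix} \tag{1.10}$$
and we set $b := (b^1, b^2, \dots, b^m) \in \mathbb{K}^m$, $x := (x^1, x^2, \dots, x^n) \in \mathbb{K}^n$, then the system can be written in a compact way as
$$\mathbf{A}x = b. \tag{1.11}$$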

Introducing the linear map $A(x) := \mathbf{A}x$, (1.9) can be seen as a functional equation
$$A(x) = b \tag{1.12}$$
or, denoting by $a_1, a_2, \dots, a_n$ the $n$ columns of $\mathbf{A}$ indexed from left to right, as
$$x^1 a_1 + x^2 a_2 + \dots + x^n a_n = b. \tag{1.13}$$
Thus, the discussions of linear systems, linear independence, matrices and linear maps are essentially the same, in different languages. The next proposition collects these equivalences.

1.38 Proposition. With the previous notation we have:
(i) $\mathbf{A}x$ is a linear combination of the columns of $\mathbf{A}$.
(ii) The following three claims are equivalent:
a) the system (1.11), or (1.9), is solvable, i.e., there exists $x \in \mathbb{K}^n$ such that $\mathbf{A}x = b$;
b) $b$ is a linear combination of $a_1, a_2, \dots, a_n$;
c) $b \in \operatorname{Im} \mathbf{A}$.
(iii) The following five claims are equivalent:
a) $\mathbf{A}x = b$ has at most one solution,
b) $\mathbf{A}x = 0$ implies $x = 0$,
c) $A(x) = 0$ has a unique solution,
d) $\ker \mathbf{A} = \{0\}$,
e) $a_1, a_2, \dots, a_n$ are linearly independent.
(iv) $\ker \mathbf{A}$ is the set of all solutions of the system $\mathbf{A}x = 0$.
(v) $\operatorname{Im} \mathbf{A}$ is the set of all $b$'s such that the system $\mathbf{A}x = b$ has at least one solution.
(vi) Let $x_0 \in \mathbb{K}^n$ be a solution of $\mathbf{A}x_0 = b$. Then the set of all solutions of $\mathbf{A}x = b$ is the set
$$\{x_0\} + \ker \mathbf{A} := \{ x \in \mathbb{K}^n \mid x - x_0 \in \ker \mathbf{A} \}.$$

With the previous notation, we see that $b$ is linearly dependent on $a_1, a_2, \dots, a_n$ if and only if
$$\operatorname{Rank} [a_1 \mid \dots \mid a_n] = \operatorname{Rank} [a_1 \mid \dots \mid a_n \mid b].$$
Thus from Proposition 1.38 (ii) we infer the following.

1.39 Proposition (Rouché-Capelli). With the previous notation, the system (1.9) or (1.11) is solvable if and only if
$$\operatorname{Rank} [a_1 \mid \dots \mid a_n] = \operatorname{Rank} [a_1 \mid \dots \mid a_n \mid b].$$
The $m \times (n + 1)$ matrix
$$[\mathbf{A} \mid b] := \begin{pmatrix} a^1_1 & \dots & a^1_n & b^1 \\ \vdots & & \vdots & \vdots \\ a^m_1 & \dots & a^m_n & b^m \end{pmatrix}$$
is often called the complete matrix of the system (1.9).

1.40 ¶. Prove all claims in this section.

1.41 Solving linear systems. Let us return to the problem of solving
$$\mathbf{A}x = b, \qquad \text{where } \mathbf{A} \in M_{m,n}(\mathbb{K}), \ b \in \mathbb{K}^m.$$
If $n = m$ and $\mathbf{A}$ is nonsingular, then the unique solution is $x_0 := \mathbf{A}^{-1}b$. In the general case, according to Proposition 1.39, the system is solvable if and only if $\operatorname{Rank} \mathbf{A} = \operatorname{Rank} [\mathbf{A} \mid b]$, and if $x_0 \in \mathbb{K}^n$ is a solution, the set of all solutions is given by $\{x_0\} + \ker \mathbf{A}$. Let $r := \operatorname{Rank} \mathbf{A}$. Since $\operatorname{Rank} \mathbf{A}^T = r$, we may assume without loss of generality that the first $r$ rows of $\mathbf{A}$ are linearly independent and the other rows depend linearly on the first $r$ rows. Therefore, if we solve the system of $r$ equations

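$$\begin{pmatrix} a^1_1 & a^1_2 & \dots & a^1_n \\ \vdots & \vdots & & \vdots \\ a^r_1 & a^r_2 & \dots & a^r_n \end{pmatrix} \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix} = \begin{pmatrix} b^1 \\ \vdots \\ b^r \end{pmatrix}, \tag{1.14}$$
the remaining equations are automatically fulfilled. So it is enough to solve $\mathbf{A}x = b$ in the case where $\mathbf{A} \in M_{r,n}(\mathbb{K})$ and $\operatorname{Rank} \mathbf{A}^T = \operatorname{Rank} \mathbf{A} = r$. We have two cases. If $r = n$, then $\mathbf{A} \in M_{r,r}$ is nonsingular, consequently $\mathbf{A}x = b$ has a unique solution $x = \mathbf{A}^{-1}b$. If $r < n$, then $\mathbf{A}$ has $r$ linearly independent columns, say the first $r$. Denote by $\mathbf{R}$ the $r \times r$ nonsingular matrix made by these columns and by $\mathbf{S}$ the matrix made by the remaining columns, and decompose $x = (x', x'')$ with $x' \in \mathbb{K}^r$ and $x'' \in \mathbb{K}^{n-r}$. Then $\mathbf{A}x = b$ writes as
$$[\mathbf{R} \mid \mathbf{S}] \begin{pmatrix} x' \\ x'' \end{pmatrix} = b,$$
i.e., $\mathbf{R}x' + \mathbf{S}x'' = b$, or $x' = \mathbf{R}^{-1}(b - \mathbf{S}x'')$. Therefore,
$$x = \begin{pmatrix} x' \\ x'' \end{pmatrix} = \begin{pmatrix} -\mathbf{R}^{-1}\mathbf{S} \\ \mathrm{Id}_{n-r} \end{pmatrix} x'' + \begin{pmatrix} \mathbf{R}^{-1}b \\ 0 \end{pmatrix} =: \mathbf{L}x'' + x_0,$$
concluding that the set of all solutions of the system $\mathbf{A}x = b$ is
$$\{ x \mid x - x_0 \in \ker \mathbf{A} \} = \{ x \mid x - x_0 \in \operatorname{Im} \mathbf{L} \}.$$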
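
The discussion in 1.41 can be turned into a small solver. The sketch below is one possible implementation, not the book's: it assumes rational entries, reduces the complete matrix $[\mathbf{A} \mid b]$ to reduced row echelon form, uses the Rouché-Capelli criterion in the form "no pivot in the last column", and returns the particular solution $x_0$ obtained by setting the free unknowns to zero. The names `rref` and `solve` are ours.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def solve(A, b):
    """A particular solution x0 of Ax = b, or None if Rank A < Rank [A | b]."""
    n = len(A[0])
    complete = [list(row) + [bi] for row, bi in zip(A, b)]
    R, pivots = rref(complete)
    if n in pivots:
        return None  # a pivot in the column of b: the system is not solvable
    x0 = [Fraction(0)] * n
    for r, p in enumerate(pivots):
        x0[p] = R[r][n]  # pivot unknowns; free unknowns stay equal to zero
    return x0

A = [[1, 2, 3], [2, 4, 6]]
print(solve(A, [1, 2]) == [1, 0, 0])  # True
print(solve(A, [1, 3]))               # None
```

All other solutions are obtained by adding to $x_0$ the elements of $\ker \mathbf{A}$, as in Proposition 1.38 (vi).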

b. The Gauss elimination method

As we have seen, linear algebra yields a proper language to discuss linear systems, and conversely, most of the constructions in linear algebra reduce to solving systems. Moreover, the proofs we have presented are constructive and become useful from a numerical point of view if one is able to efficiently solve the following two questions:
(i) find the solution of a nonsingular square system $\mathbf{A}\mathbf{x} = \mathbf{b}$,
(ii) given a set of vectors $T \subset \mathbb{K}^n$, find a subset $S \subset T$ of linearly independent vectors such that $\operatorname{Span} S = \operatorname{Span} T$.
In this section we illustrate the classical Gauss elimination method which efficiently solves both questions.

1.42 Example. Let us begin with an example of how to solve a linear system. Consider the linear system

$$
\begin{cases}
6x + 18y + 6z = b_1, \\
3x + 8y + 6z = b_2, \\
2x + y + z = b_3,
\end{cases}
\qquad\text{i.e.,}\qquad \mathbf{A}\mathbf{x} = \mathbf{b},
$$

where $\mathbf{x} := (x, y, z)$, $\mathbf{b} := (b_1, b_2, b_3)$ and

$$
\mathbf{A} = \begin{pmatrix} 6 & 18 & 6 \\ 3 & 8 & 6 \\ 2 & 1 & 1 \end{pmatrix}.
$$

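We subtract from the second and third equations the first one multiplied by $1/2$ and $1/3$ respectively to get the new equivalent system:

$$
\begin{cases}
6x + 18y + 6z = b_1, \\
3x + 8y + 6z - \tfrac{1}{2}(6x + 18y + 6z) = -\tfrac{1}{2} b_1 + b_2, \\
2x + y + z - \tfrac{1}{3}(6x + 18y + 6z) = -\tfrac{1}{3} b_1 + b_3,
\end{cases}
$$

i.e.,

$$
\begin{cases}
6x + 18y + 6z = b_1, \\
-y + 3z = -\tfrac{1}{2} b_1 + b_2, \\
-5y - z = -\tfrac{1}{3} b_1 + b_3.
\end{cases}
\tag{1.15}
$$

This essentially requires us to solve the system of the last two equations

$$
\begin{cases}
-y + 3z = -\tfrac{1}{2} b_1 + b_2, \\
-5y - z = -\tfrac{1}{3} b_1 + b_3.
\end{cases}
$$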

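We now apply the same argument to this last system, i.e., we subtract from the last equation the first one multiplied by 5 to get

$$
\begin{cases}
6x + 18y + 6z = b_1, \\
-y + 3z = -\tfrac{1}{2} b_1 + b_2, \\
-5y - z - 5(-y + 3z) = -\tfrac{1}{3} b_1 + b_3 - 5\bigl(-\tfrac{1}{2} b_1 + b_2\bigr),
\end{cases}
$$

i.e.,

$$
\begin{cases}
6x + 18y + 6z = b_1, \\
-y + 3z = -\tfrac{1}{2} b_1 + b_2, \\
-16z = \tfrac{13}{6} b_1 - 5 b_2 + b_3.
\end{cases}
$$

This system has exactly the same solution as the original one and, moreover, it is easily solvable starting from the last equation. Finally, we notice that the previous method produced two matrices

$$
\mathbf{U} := \begin{pmatrix} 6 & 18 & 6 \\ 0 & -1 & 3 \\ 0 & 0 & -16 \end{pmatrix},
\qquad
\mathbf{L} := \begin{pmatrix} 1 & 0 & 0 \\ -\tfrac{1}{2} & 1 & 0 \\ \tfrac{13}{6} & -5 & 1 \end{pmatrix};
$$

$\mathbf{U}$ is upper triangular and $\mathbf{L}$ is lower triangular with 1 in the principal diagonal, so the original system $\mathbf{A}\mathbf{x} = \mathbf{b}$ rewrites as $\mathbf{U}\mathbf{x} = \mathbf{L}\mathbf{b}$. Since $\mathbf{L} = [l^i_j]$ is invertible ($l^i_i = 1$ $\forall i$) and $\mathbf{x}$ is arbitrary, we can rewrite the last formula as a decomposition formula for $\mathbf{A}$, $\mathbf{A} = \mathbf{L}^{-1}\mathbf{U}$.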

The algorithm we have just described in Example 1.42, that transforms the proposed 3 x 3 square system into a triangular system, extends to systems with an arbitrary number of unknowns and equations, and it is called the Gauss elimination method. Moreover, it is particularly efficient, but does have some drawbacks from a numerical point of view.
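The computation of Example 1.42 can be replayed in exact rational arithmetic. The following is a minimal sketch, not the authors' code: the names `eliminate_without_swaps` and `matmul` are illustrative, and the routine assumes that no row exchange is ever needed, i.e., that every pivot it meets is nonzero.

```python
from fractions import Fraction


def eliminate_without_swaps(A):
    """Return (U, L) such that A x = b is equivalent to U x = L b.

    U is upper triangular; L is lower triangular with ones on the diagonal
    and records the row operations. No row exchanges are performed, so every
    pivot met along the way is assumed to be nonzero.
    """
    n = len(A)
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        pivot = U[k][k]
        if pivot == 0:
            raise ValueError("zero pivot: a row exchange would be needed")
        for i in range(k + 1, n):
            factor = U[i][k] / pivot
            # Subtract `factor` times row k from row i, on both sides.
            U[i] = [u_i - factor * u_k for u_i, u_k in zip(U[i], U[k])]
            L[i] = [l_i - factor * l_k for l_i, l_k in zip(L[i], L[k])]
    return U, L


def matmul(P, Q):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


A = [[6, 18, 6], [3, 8, 6], [2, 1, 1]]
U, L = eliminate_without_swaps(A)
for row in U:
    print([str(v) for v in row])
for row in L:
    print([str(v) for v in row])
```

Since $\mathbf{U}\mathbf{x} = \mathbf{L}\mathbf{b} = \mathbf{L}\mathbf{A}\mathbf{x}$ for every $\mathbf{x}$, the product `matmul(L, A)` should coincide with `U`, which is a convenient sanity check.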


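Let

$$
\mathbf{A}\mathbf{x} = 0 \tag{1.16}
$$

be a linear homogeneous system with $m$ equations, $n$ unknowns and a coefficient matrix given by

$$
\mathbf{A} = \begin{pmatrix} a^1_1 & a^1_2 & \dots & a^1_n \\ a^2_1 & a^2_2 & \dots & a^2_n \\ \vdots & \vdots & & \vdots \\ a^m_1 & a^m_2 & \dots & a^m_n \end{pmatrix}.
$$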

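Starting from the left, we denote by $j_1$ the index of the first column of $\mathbf{A}$ that has at least one nonzero element. Then we reorder the rows into a new matrix $\mathbf{B}$ of the same dimensions, in such a way that the element $b^1_{j_1}$ is nonzero and all columns with index less than $j_1$ are zero,

$$
\mathbf{B} = [b^i_j] = \begin{pmatrix}
0 & \dots & 0 & b^1_{j_1} & * & \dots & * \\
0 & \dots & 0 & * & * & \dots & * \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \dots & 0 & * & * & \dots & *
\end{pmatrix}
$$

where $*$ denotes the unspecified entries. We then set $p_1 := b^1_{j_1}$, and for $i = 2, \dots, m$ we subtract from the $i$th row of $\mathbf{B}$ the first row of $\mathbf{B}$ multiplied by $b^i_{j_1}/p_1$. The resulting matrix therefore has the following form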

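$$
\mathbf{A}_1 := \begin{pmatrix}
0 & \dots & 0 & p_1 & * & \dots & * \\
0 & \dots & 0 & 0 & * & \dots & * \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \dots & 0 & 0 & * & \dots & *
\end{pmatrix}
$$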

where $p_1 \neq 0$, below $p_1$ all entries are zero and $*$ denotes the unspecified entries. We then transform $\mathbf{A}_1$ into $\mathbf{A}_2$, $\mathbf{A}_2$ into $\mathbf{A}_3$, and so on, operating as previously, but on the submatrices made by the rows of index at least $2, 3, \dots$, respectively. The algorithm of course stops when there are no more rows and/or columns. The resulting matrix produced this way is not uniquely determined as there is freedom in exchanging the rows. However, a resulting matrix, that we call a Gauss reduced matrix, is clearly upper triangular if $\mathbf{A}$ is a square matrix, $m = n$, and in general has the following stair-shaped form

Figure 1.5. Two pages of the Japanese mathematician Takakazu Seki (1642-1708) who apparently dealt with determinants before Gauss.

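$$
\mathbf{G}_{\mathbf{A}} := \begin{pmatrix}
0 \cdots 0 & p_1 & * \cdots * & * & * \cdots * & * & \cdots & * & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & p_2 & * \cdots * & * & \cdots & * & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & p_3 & \cdots & * & * \cdots * \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & \cdots & p_r & * \cdots * \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & \cdots & 0 & 0 \cdots 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
0 \cdots 0 & 0 & 0 \cdots 0 & 0 & 0 \cdots 0 & 0 & \cdots & 0 & 0 \cdots 0
\end{pmatrix}
\tag{1.17}
$$

where $*$ denotes the unspecified entries; the nonzero numbers $p_1, p_2, \dots, p_r$ are called the pivots of the stair-shaped matrix $\mathbf{G}_{\mathbf{A}}$. Finally, since
o multiplying one of the equations of the system $\mathbf{A}\mathbf{x} = 0$ by a nonzero scalar,
o exchanging the order of the equations,
o summing a multiple of one equation and another equation,
produces a linear system with the same solution as the initial system, and observing that the Gauss elimination procedure operates with transformations of this type, we conclude that $\mathbf{G}_{\mathbf{A}}\mathbf{x} = 0$ has the same solution as the initial system.

We now discuss the solvability of the system $\mathbf{L}\mathbf{x} = \mathbf{b}$, if $\mathbf{L}$ is stair-shaped.

1.43 Proposition. Let $\mathbf{L}$ be a stair-shaped $m \times n$ matrix. Suppose that $\mathbf{L}$ has $r$ pivots, $r \le \min(n, m)$. Then a basis of $\operatorname{Im}\mathbf{L}$ is given by the $r$ columns containing the pivots, and the system $\mathbf{L}\mathbf{x} = \mathbf{b}$, $\mathbf{b} = (b^1, b^2, \dots, b^m)^T$, has a solution if and only if $b^{r+1} = \dots = b^m = 0$.

Proof. Since there are $r$ pivots and at most one pivot per row, the last $m - r$ rows of $\mathbf{L}$ are identically zero, hence $\operatorname{Im}\mathbf{L} \subset \{\mathbf{b} \in \mathbb{K}^m \mid \mathbf{b} = (b^1, b^2, \dots, b^r, 0, \dots, 0)\}$. Consequently,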


Figure 1.6. Takakazu Seki (1642-1708).

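$\dim \operatorname{Im}\mathbf{L} \le r$. On the other hand the $r$ columns that contain the pivots are in $\operatorname{Im}\mathbf{L}$ and are linearly independent, hence $\operatorname{Rank}\mathbf{L} = r$, and $\operatorname{Im}\mathbf{L} = \{(b^1, b^2, \dots, b^r, 0, \dots, 0) \mid b^i \in \mathbb{K}\ \forall i,\ i = 1, \dots, r\}$. □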
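The reduction to stair-shaped form described above can be sketched in a few lines. This is only an illustration under stated assumptions: the name `gauss_reduce` is hypothetical, column indices are 0-based, and among the admissible row exchanges the sketch always picks the first row with a nonzero entry, which is just one of the possible choices.

```python
from fractions import Fraction


def gauss_reduce(A):
    """Return (G, pivot_columns): a Gauss reduced form of A and the
    0-based indices of the columns that contain the pivots."""
    G = [[Fraction(v) for v in row] for row in A]
    m = len(G)
    n = len(G[0]) if m else 0
    pivot_columns = []
    row = 0
    for col in range(n):
        if row == m:
            break
        # Rows at or below `row` with a nonzero entry in this column.
        candidates = [i for i in range(row, m) if G[i][col] != 0]
        if not candidates:
            continue
        p = candidates[0]
        G[row], G[p] = G[p], G[row]  # row exchange
        pivot = G[row][col]
        for i in range(row + 1, m):
            factor = G[i][col] / pivot
            G[i] = [g_i - factor * g_r for g_i, g_r in zip(G[i], G[row])]
        pivot_columns.append(col)
        row += 1
    return G, pivot_columns


A = [[0, 2, 4, 2],
     [0, 1, 2, 3],
     [0, 3, 6, 5]]
G, pivots = gauss_reduce(A)
print([[str(v) for v in r] for r in G])
print("pivot columns:", pivots, "number of pivots:", len(pivots))
```

In this made-up example the reduced matrix has two pivots and one identically zero row, so by Proposition 1.43 the reduced system is solvable exactly when the last component of the right-hand side vanishes.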

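The Gauss elimination procedure preserves several properties of the original matrix $\mathbf{A}$.

1.44 Theorem. Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$ and let $\mathbf{G}_{\mathbf{A}}$ be one of the matrices resulting from the Gauss elimination procedure. Then
(i) $\ker\mathbf{A} = \ker\mathbf{G}_{\mathbf{A}}$,
(ii) $\operatorname{Rank}\mathbf{A} = \operatorname{Rank}\mathbf{G}_{\mathbf{A}}$ = number of pivots of $\mathbf{G}_{\mathbf{A}}$,
(iii) let $j_1, j_2, \dots, j_r$ be the indices of the columns of the pivots of $\mathbf{G}_{\mathbf{A}}$; then the columns of $\mathbf{A}$ with the same indices are linearly independent.

Proof. (i) is a rewriting of the equivalence of $\mathbf{A}\mathbf{x} = 0$ with $\mathbf{G}_{\mathbf{A}}\mathbf{x} = 0$. (ii) Because of (i), the rank formula yields $\operatorname{Rank}\mathbf{A} = \operatorname{Rank}\mathbf{G}_{\mathbf{A}}$, and $\operatorname{Rank}\mathbf{G}_{\mathbf{A}}$ equals the number of pivots by Proposition 1.43. (iii) Let $\mathbf{A} = [\mathbf{a}_1 \mid \mathbf{a}_2 \mid \dots \mid \mathbf{a}_n]$ and let $\mathbf{B} := [\mathbf{a}_{j_1} \mid \mathbf{a}_{j_2} \mid \dots \mid \mathbf{a}_{j_r}]$. Following the Gauss elimination procedure we used on $\mathbf{A}$, we easily see that the columns of $\mathbf{B}$ transform into the columns of the pivots, which are linearly independent. By (i) $\ker\mathbf{B} = \{0\}$. □

1.45 ¶. Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$. Show a procedure to find a basis for Rank $\mathbf{A}$ and Rank $\mathbf{A}^T$.

1.46 ¶. Let $W = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k\}$ be a subspace of $\mathbb{K}^n$. Show a procedure to find a basis of $W$ among the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$.

1.47 ¶. Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$. Show a procedure to find a basis of $\ker\mathbf{A}$.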


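1.48 ¶. Let $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in \mathbb{K}^n$ be $k$ linearly independent vectors. Show a procedure to complete them with $n - k$ vectors of the canonical basis of $\mathbb{K}^n$ in order to form a new basis of $\mathbb{K}^n$. [Hint: Apply the Gauss elimination procedure on the matrix

$$
\mathbf{A} := \left(\begin{array}{cccc|cccc}
 & & & & 1 & 0 & \dots & 0 \\
\mathbf{v}_1 & \mathbf{v}_2 & \dots & \mathbf{v}_k & 0 & 1 & \dots & 0 \\
 & & & & \vdots & \vdots & \ddots & \vdots \\
 & & & & 0 & 0 & \dots & 1
\end{array}\right).]
$$

1.49 ¶. Show that $\mathbf{A} \in M_{n,n}(\mathbb{K})$ is invertible if and only if a Gauss reduced matrix of $\mathbf{A}$ has $n$ pivots.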

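c. The Gauss elimination procedure for nonhomogeneous linear systems

Now consider the problem of solving $\mathbf{A}\mathbf{x} = \mathbf{b}$, where $\mathbf{A} \in M_{m,n}(\mathbb{K})$, $\mathbf{x} \in \mathbb{K}^n$ and $\mathbf{b} \in \mathbb{K}^m$. We can equivalently write it as

$$
\begin{pmatrix}
a^1_1 & a^1_2 & \dots & a^1_n & 1 & 0 & \dots & 0 \\
a^2_1 & a^2_2 & \dots & a^2_n & 0 & 1 & \dots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\
a^m_1 & a^m_2 & \dots & a^m_n & 0 & 0 & \dots & 1
\end{pmatrix}
\begin{pmatrix} x^1 \\ \vdots \\ x^n \\ -b^1 \\ \vdots \\ -b^m \end{pmatrix} = 0.
$$

If one computes a Gauss reduced form of the $m \times (n + m)$ matrix

$$
\mathbf{B} := [\mathbf{A} \mid \mathrm{Id}_m] = \begin{pmatrix}
a^1_1 & a^1_2 & \dots & a^1_n & 1 & 0 & \dots & 0 \\
a^2_1 & a^2_2 & \dots & a^2_n & 0 & 1 & \dots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \ddots & \vdots \\
a^m_1 & a^m_2 & \dots & a^m_n & 0 & 0 & \dots & 1
\end{pmatrix}
$$

we find, on account of Theorem 1.44, that

$$
\mathbf{G}_{\mathbf{B}} := [\mathbf{G}_{\mathbf{A}} \mid \mathbf{S}]
$$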

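where $\mathbf{G}_{\mathbf{A}} \in M_{m,n}(\mathbb{K})$ is a Gauss reduced matrix of $\mathbf{A}$ and $\mathbf{S} \in M_{m,m}(\mathbb{K})$. Moreover, if the elimination procedure has been carried out without any permutation of the rows, then $\mathbf{S}$ is a lower triangular matrix with 1 as entries in the principal diagonal, hence it is invertible. Since for every $\mathbf{b}$ the system $\mathbf{A}\mathbf{x} = \mathbf{b}$ is equivalent to $\mathbf{G}_{\mathbf{A}}\mathbf{x} = \mathbf{S}\mathbf{b}$, we then have

$$
\mathbf{G}_{\mathbf{A}}\mathbf{x} = \mathbf{S}\mathbf{b} = \mathbf{S}\mathbf{A}\mathbf{x} \qquad \forall \mathbf{x} \in \mathbb{K}^n,
$$

thus concluding that $\mathbf{A} = \mathbf{S}^{-1}\mathbf{G}_{\mathbf{A}}$. In particular,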


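1.50 Proposition (LR decomposition). Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$ be a square matrix. If the elimination procedure proceeds without any permutation of the rows, we can decompose $\mathbf{A}$ as $\mathbf{A} = \mathbf{L}\mathbf{R}$, where $\mathbf{R} = \mathbf{G}_{\mathbf{A}}$ is the resulting Gauss reduced matrix and $\mathbf{L}$ is a suitable lower triangular matrix with 1 as entries in the principal diagonal.

In general, however, the permutation of the rows must be taken into account. For this purpose, let us fix some notation. Recall that a permutation of $\{1, \dots, m\}$ is a one-to-one map $\sigma : \{1, \dots, m\} \to \{1, \dots, m\}$. The set of all permutations of $m$ elements is denoted by $\mathcal{P}_m$. To every permutation $\sigma$ of $m$ elements we associate a permutation matrix $\mathbf{R}_\sigma \in M_{m,m}(\mathbb{K})$, built from the vectors of the canonical basis $(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_m)$ of $\mathbb{K}^m$, in such a way that, for $\mathbf{A} \in M_{m,n}(\mathbb{K})$, if $\sigma$ permutes the indices of the rows of $\mathbf{A}$, then the resulting matrix is $\mathbf{R}_\sigma\mathbf{A}$.

Now denote by $\mathcal{G}(\mathbf{A})$ the Gauss reduced matrix, if it exists, obtained by the Gauss elimination procedure starting from the top row and proceeding without any permutation of the rows. Let $\mathbf{G}_{\mathbf{A}}$ be a Gauss reduced form of $\mathbf{A}$. Then $\mathbf{G}_{\mathbf{A}} = \mathcal{G}(\mathbf{R}_\sigma\mathbf{A})$ for some permutation $\sigma$ of $m$ elements. Now fix a Gauss reduced form $\mathbf{G}_{\mathbf{A}}$ of $\mathbf{A}$, and let $\sigma$ be such that $\mathbf{G}_{\mathbf{A}} = \mathcal{G}(\mathbf{R}_\sigma\mathbf{A})$. Write $\mathbf{A}\mathbf{x} = \mathbf{y}$ as $(\mathbf{R}_\sigma\mathbf{A})\mathbf{x} = \mathbf{R}_\sigma\mathbf{y} = \mathrm{Id}_m(\mathbf{R}_\sigma\mathbf{y})$ and let

$$
\mathbf{B} := [\mathbf{R}_\sigma\mathbf{A} \mid \mathrm{Id}_m].
$$

Then $\mathbf{B}$ and $\mathbf{R}_\sigma\mathbf{A}$ may be reduced without any permutation of the rows, hence by the above

$$
\mathcal{G}(\mathbf{B}) = [\mathcal{G}(\mathbf{R}_\sigma\mathbf{A}) \mid \mathbf{S}] = [\mathbf{G}_{\mathbf{A}} \mid \mathbf{S}]
$$

where $\mathbf{S}$ is lower triangular with all entries in the principal diagonal equal to 1. Therefore $\mathbf{G}_{\mathbf{A}}\mathbf{x} = \mathbf{S}\mathbf{R}_\sigma\mathbf{y} = \mathbf{S}\mathbf{R}_\sigma\mathbf{A}\mathbf{x}$ $\forall\mathbf{x}$, that is,

$$
\mathbf{G}_{\mathbf{A}} = \mathbf{S}\mathbf{R}_\sigma\mathbf{A}. \tag{1.18}
$$

When $\mathbf{A} \in M_{n,n}(\mathbb{K})$ is a square matrix, (1.18) shows that $\mathbf{A}$ is invertible if and only if a Gauss reduced form $\mathbf{G}_{\mathbf{A}}$ of $\mathbf{A}$ is invertible, and in this case

$$
\mathbf{A}^{-1} = \mathbf{G}_{\mathbf{A}}^{-1}\mathbf{S}\mathbf{R}_\sigma.
$$

In practice, let $(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n)$ be the canonical basis of $\mathbb{K}^n$ and let $\mathbf{A}^{-1} =: [\mathbf{v}_1 \mid \mathbf{v}_2 \mid \dots \mid \mathbf{v}_n]$. Let $i = 1, \dots, n$. To compute $\mathbf{v}_i$, we observe that $\mathbf{v}_i = \mathbf{A}^{-1}\mathbf{e}_i$, i.e., $\mathbf{A}\mathbf{v}_i = \mathbf{e}_i$. Thus, using the Gauss elimination procedure, from (1.18) $\mathbf{v}_i$ is a solution of $\mathbf{G}_{\mathbf{A}}\mathbf{v}_i = \mathbf{S}\mathbf{R}_\sigma\mathbf{e}_i$. Now, since $\mathbf{G}_{\mathbf{A}}$ is upper triangular, this last system is easily solved by inductively computing the components of $\mathbf{v}_i$ starting from the last, upward.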
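The recipe just described, reducing $[\mathbf{A} \mid \mathrm{Id}_n]$ and then solving the triangular systems from the last component upward, might be sketched as follows. The function names `inverse` and `matmul` are illustrative assumptions; the row operations are applied directly to the augmented matrix, so its right block plays the role of $\mathbf{S}\mathbf{R}_\sigma$ without either factor being formed separately.

```python
from fractions import Fraction


def inverse(A):
    """Invert a square matrix by Gauss elimination on [A | Id]."""
    n = len(A)
    B = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for k in range(n):
        # Find a row at or below k with a nonzero entry in column k.
        p = next((i for i in range(k, n) if B[i][k] != 0), None)
        if p is None:
            raise ValueError("matrix is singular")
        B[k], B[p] = B[p], B[k]
        for i in range(k + 1, n):
            factor = B[i][k] / B[k][k]
            B[i] = [b_i - factor * b_k for b_i, b_k in zip(B[i], B[k])]
    # The left block is now upper triangular: back substitution, one
    # right-hand side (one column of the right block) at a time.
    inv = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):
        for i in reversed(range(n)):
            s = B[i][n + j] - sum(B[i][c] * inv[c][j] for c in range(i + 1, n))
            inv[i][j] = s / B[i][i]
    return inv


def matmul(P, Q):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


A = [[0, 1], [2, 3]]  # a row exchange is needed at the first step
A_inv = inverse(A)
print([[str(v) for v in row] for row in A_inv])
```

Exact fractions are used here only to keep the check `matmul(A, A_inv)` free of rounding; a floating point version would have to choose the pivot rows more carefully.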


Figure 1.7. The area transformation.

1.4 Determinants

The notion of determinant originated with the observation of Gabriel Cramer (1704-1752) and Carl Friedrich Gauss (1777-1855) that the process of elimination of unknowns when solving a linear system of equations amounts to operating with the indices of the associated matrix. Developments of this observation due to Pierre-Simon Laplace (1749-1827), Alexandre Vandermonde (1735-1796), Joseph-Louis Lagrange (1736-1813) and Carl Friedrich Gauss (1777-1855), who introduced the word determinant, were then accomplished with a refined study of the properties of the determinant by Augustin-Louis Cauchy (1789-1857) and Jacques Binet (1786-1856). Here we illustrate the main properties of the determinant.

1.51 Determinant and area in $\mathbb{R}^2$. Let

$$
\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{1.19}
$$

be a $2 \times 2$ matrix. It is easily seen that $\mathbf{A}$ is not singular, i.e., the linear homogeneous system

$$
\begin{cases} ax + by = 0, \\ cx + dy = 0 \end{cases}
$$

has zero, $(x, y) = (0, 0)$, as a unique solution if and only if $ad - bc \neq 0$. The number $ad - bc$ is the determinant of the matrix $\mathbf{A}$,

$$
\det\mathbf{A} = \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} := ad - bc.
$$

One immediately notices the combinatorial characteristic of this definition: if $\mathbf{A} = [a^i_j]$, then $\det\mathbf{A} = a^1_1 a^2_2 - a^1_2 a^2_1$. Let $\mathbf{a} := (a, c)$ and $\mathbf{b} := (b, d)$ be the two columns of the matrix $\mathbf{A}$ in (1.19). The elementary area of the parallelogram $T$ spanned by $\mathbf{a}$ and $\mathbf{b}$, with vertices $(0, 0)$, $(a, c)$, $(b, d)$ and $(a + b, c + d)$, is given by $\operatorname{Area}(T) = |\mathbf{a}|\,|\mathbf{b}|\,|\sin\theta|$ where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, irrespective of the orientation, see Figure 1.7. On the other hand, by Carnot's formula $|\mathbf{a}|\,|\mathbf{b}|\cos\theta = ab + cd$, hence


Figure 1.8. Frontispieces of two books on determinants respectively by Ernesto Pascal and Charles L. Dodgson, better known as Lewis Carroll.

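$$
\begin{aligned}
\operatorname{Area}(T)^2 &= (a^2 + c^2)(b^2 + d^2)(1 - \cos^2\theta) \\
&= a^2b^2 + a^2d^2 + b^2c^2 + c^2d^2 - (ab + cd)^2 \\
&= a^2d^2 + b^2c^2 - 2abcd = (ad - bc)^2 = (\det\mathbf{A})^2,
\end{aligned}
$$

i.e.,

$$
\operatorname{Area}(T) = |\det\mathbf{A}|.
$$

We may think of $\det\mathbf{A}$ as of the area of $T$ with sign. In fact, the sign of $\det\mathbf{A}$ may be used to define the sign of the angle formed by the vectors $\mathbf{a}$ and $\mathbf{b}$. The angle from $\mathbf{a}$ to $\mathbf{b}$ is positively (negatively) oriented if $\det[\mathbf{a} \mid \mathbf{b}] > 0$ ($\det[\mathbf{a} \mid \mathbf{b}] < 0$). Angles with sign in geometry are also modelled by complex multiplication, identifying $\mathbb{R}^2$ with $\mathbb{C}$. Using the previous notation, setting $z := a + ic$, $w := b + id$, i.e., identifying the columns $\mathbf{a}$ and $\mathbf{b}$ with complex numbers, we have

$$
\bar{z}w = (a - ic)(b + id) = (ab + cd) + i(ad - bc) = (\mathbf{a} \cdot \mathbf{b})_{\mathbb{R}^2} + i\det\mathbf{A}.
$$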
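A quick numerical check of the two identities above, with the columns $\mathbf{a} = (a, c)$ and $\mathbf{b} = (b, d)$ identified with the complex numbers $a + ic$ and $b + id$. The vectors chosen and the helper names `dot` and `det2` are illustrative assumptions, not part of the text.

```python
def dot(u, v):
    """Scalar product in the plane."""
    return u[0] * v[0] + u[1] * v[1]


def det2(u, v):
    """Determinant of the 2 x 2 matrix whose columns are u and v."""
    return u[0] * v[1] - u[1] * v[0]


a = (3, 1)
b = (1, 4)

# Area(T)^2 = |a|^2 |b|^2 - (a . b)^2 should equal (det[a | b])^2.
area_squared = dot(a, a) * dot(b, b) - dot(a, b) ** 2

# Identify a with z = 3 + 1i and b with w = 1 + 4i.
z = complex(*a)
w = complex(*b)
product = z.conjugate() * w

print(area_squared, det2(a, b) ** 2)  # both are 121
print(product)                        # (7+11j): real part a . b, imaginary part det
```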

Let $\mathbf{v}_1, \mathbf{v}_2 \in \mathbb{R}^2$. As we have seen, the determinant of the matrix $[\mathbf{v}_1 \mid \mathbf{v}_2]$ is not zero iff $\mathbf{v}_1$ and $\mathbf{v}_2$ are linearly independent. Actually, for any $n \ge 1$ there is a real function defined on $n \times n$ matrices that tells us whether the $n$ columns of the matrix are linearly independent: the determinant. One of the simplest ways to define it is as follows.

We recall that a permutation of $\{1, \dots, n\}$ is a one-to-one map $\sigma : \{1, \dots, n\} \to \{1, \dots, n\}$. The set of permutations of $n$ objects, denoted by $\mathcal{P}_n$, is a group with respect to the operation of composition. A permutation that exchanges two adjacent indices and leaves the other indices unchanged is called a transposition. Transpositions are elementary permutations in the sense that each permutation $\sigma$ can be obtained by composing subsequent transpositions. Of course, there are several ways to decompose a given permutation into elementary transpositions, but the parity or oddity of the number of transpositions needed to realize a given permutation $\sigma$ depends only on the permutation $\sigma$. We define the signature, or sign, of the permutation $\sigma$ as the number

$$
(-1)^\sigma := \begin{cases}
+1 & \text{if } \sigma \text{ decomposes in an even number of transpositions,} \\
-1 & \text{if } \sigma \text{ decomposes in an odd number of transpositions.}
\end{cases}
$$

1.52 Definition. Let $\mathbf{A} = [a^i_j] \in M_{n,n}(\mathbb{K})$, $n \ge 1$. The determinant of $\mathbf{A}$ is then defined by

$$
\det\mathbf{A} := \sum_{\sigma \in \mathcal{P}_n} (-1)^\sigma\, a^{\sigma(1)}_1 a^{\sigma(2)}_2 \cdots a^{\sigma(n)}_n. \tag{1.20}
$$

Notice that $\det\mathbf{A}$ is a sum of products and each product contains just one element from each row and each column, and the sum, apart from the sign, is extended to all possible choices.

1.53 Example. Of course for matrix $\mathbf{A}$ in (1.19) we again get $\det\mathbf{A} = ad - bc$. Going back to the area, one shows that given 3 vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3 \in \mathbb{R}^3$ and denoting by $T$ the polyhedron generated by these vectors, we still have

$$
\operatorname{Vol}_3(T) = |\det[\mathbf{v}_1 \mid \mathbf{v}_2 \mid \mathbf{v}_3]|.
$$

For $n$ vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n \in \mathbb{R}^n$, let $\mathbf{L} := [\mathbf{v}_1 \mid \mathbf{v}_2 \mid \dots \mid \mathbf{v}_n] \in M_{n,n}(\mathbb{R})$ and let $L(\mathbf{x}) := \mathbf{L}\mathbf{x}$. If $Q$ is the unit cube of $\mathbb{R}^n$,

$$
Q := \{\mathbf{x} = (x^1, x^2, \dots, x^n) \mid 0 \le x^i \le 1\ \forall i\},
$$

we define the $n$-dimensional volume of $T := L(Q)$ by $\operatorname{Vol}_n(T) := |\det\mathbf{L}|$.

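It is useful to think of the determinant as a function of the columns of the matrix. In fact, we have the following.

1.54 Theorem. The determinant on $n \times n$ matrices is the unique function $\det : M_{n,n}(\mathbb{K}) \to \mathbb{K}$ such that, when seen as a function of columns, it is
(i) (LINEAR ON EACH FACTOR); for all $\mathbf{a}_i, \mathbf{a}_i', \mathbf{a}_i'' \in \mathbb{K}^n$, $i = 1, \dots, n$, and $\lambda \in \mathbb{K}$

$$
\begin{aligned}
\det[\dots \mid \mathbf{a}_i' + \mathbf{a}_i'' \mid \dots] &= \det[\dots \mid \mathbf{a}_i' \mid \dots] + \det[\dots \mid \mathbf{a}_i'' \mid \dots], \\
\det[\dots \mid \lambda\mathbf{a}_i \mid \dots] &= \lambda\det[\dots \mid \mathbf{a}_i \mid \dots],
\end{aligned}
$$

(ii) (ALTERNATING); by exchanging two adjacent columns the determinant changes sign,

$$
\det[\dots \mid \mathbf{a}_i \mid \mathbf{a}_{i+1} \mid \dots] = -\det[\dots \mid \mathbf{a}_{i+1} \mid \mathbf{a}_i \mid \dots],
$$

(iii) (NORMALIZED); $\det\mathrm{Id}_n = 1$.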

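Notice that because of (i) the alternating property can be equivalently formulated by saying that $\det\mathbf{A} = 0$ if $\mathbf{A}$ has two equal columns.

Proof. Clearly the right-hand side of (1.20) fulfills the conditions (i), (ii), (iii). To prove uniqueness, suppose that $D : M_{n,n}(\mathbb{K}) \to \mathbb{K}$ fulfills (i), (ii), (iii) of Theorem 1.54. Write $\mathbf{A} = [a^i_j] \in M_{n,n}(\mathbb{K})$ as $\mathbf{A} = [\mathbf{a}_1 \mid \mathbf{a}_2 \mid \dots \mid \mathbf{a}_n]$ where

$$
\mathbf{a}_i = \sum_{j=1}^n a^j_i\,\mathbf{e}_j,
$$

$(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n)$ being the canonical basis of $\mathbb{K}^n$. Then by (i)

$$
D(\mathbf{A}) = \sum_{\sigma(1), \dots, \sigma(n)} a^{\sigma(1)}_1 a^{\sigma(2)}_2 \cdots a^{\sigma(n)}_n\, D([\mathbf{e}_{\sigma(1)} \mid \dots \mid \mathbf{e}_{\sigma(n)}])
$$

where $\sigma(1), \sigma(2), \dots, \sigma(n)$ vary in $\{1, \dots, n\}$. Since by (ii) $D(\mathbf{A}) = 0$ if $\mathbf{A}$ has two equal columns, only the terms with $\sigma(i) \neq \sigma(j)$ if $i \neq j$ survive, i.e., those for which $\sigma$ is a permutation of $(1, 2, \dots, n)$. Since $D([\mathbf{e}_{\sigma(1)} \mid \dots \mid \mathbf{e}_{\sigma(n)}]) = (-1)^\sigma D([\mathbf{e}_1 \mid \dots \mid \mathbf{e}_n])$ and $D([\mathbf{e}_1 \mid \dots \mid \mathbf{e}_n]) = 1$, we conclude that $D(\mathbf{A})$ agrees with the right-hand side of (1.20), hence $D(\mathbf{A}) = \det\mathbf{A}$. □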
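Formula (1.20) can be evaluated literally by running over all permutations. The sketch below is meant only to illustrate the definition: the names `sign` and `det_by_permutations` are assumptions, indices are 0-based, and the signature is computed by counting inversions, whose parity coincides with the parity of the number of adjacent transpositions. The cost grows like $n!$, so this is not a practical algorithm.

```python
from itertools import permutations


def sign(perm):
    """Signature of a permutation given as a tuple of 0-based indices."""
    inversions = sum(1
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det_by_permutations(A):
    """Determinant as the signed sum over all permutations, as in (1.20)."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= A[perm[j]][j]  # entry in row sigma(j), column j
        total += term
    return total


print(det_by_permutations([[1, 2], [3, 4]]))                    # ad - bc = -2
print(det_by_permutations([[6, 18, 6], [3, 8, 6], [2, 1, 1]]))  # 96
```

The second value agrees with the product $6 \cdot (-1) \cdot (-16)$ of the pivots found in Example 1.42, where no row exchange was needed.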

The determinant can also be computed by means of an inductive formula.

1.55 Definition. Let $\mathbf{A} = [a^i_j] \in M_{n,n}(\mathbb{K})$, $n \ge 1$. An $r$-minor of $\mathbf{A}$ is an $r \times r$ submatrix of $\mathbf{A}$, that is, a matrix obtained by choosing the common entries of a choice of $r$ rows and $r$ columns of $\mathbf{A}$ and relabeling the indices from 1 to $r$. For $i, j = 1, \dots, n$ we define the complementing $(i, j)$-minor of the matrix $\mathbf{A}$, denoted by $\mathbf{M}^i_j(\mathbf{A})$, as the $(n - 1) \times (n - 1)$-minor obtained by removing the $i$th row and the $j$th column from $\mathbf{A}$.

1.56 Theorem (Laplace). Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$, $n \ge 1$. Then

$$
\det\mathbf{A} = \begin{cases}
a^1_1 & \text{if } n = 1, \\
\sum_{j=1}^n (-1)^{1+j} a^1_j \det\mathbf{M}^1_j(\mathbf{A}) & \text{if } n > 1.
\end{cases}
\tag{1.21}
$$

Proof. Denote by $D(\mathbf{A})$ the right-hand side of (1.21). Let us prove that $D(\mathbf{A})$ fulfills the conditions (i), (ii) and (iii) of Theorem 1.54, thus $D(\mathbf{A}) = \det\mathbf{A}$. The conditions (i) and (iii) of Theorem 1.54 are trivially fulfilled by $D(\mathbf{A})$. Let us also show that (ii) holds in its equivalent form, i.e., if $\mathbf{a}_j = \mathbf{a}_{j+1}$ for some $j$, then $D(\mathbf{A}) = 0$. We proceed by induction on $n$. By the inductive assumption, $\det\mathbf{M}^1_h(\mathbf{A}) = 0$ for $h \neq j, j + 1$, hence $D(\mathbf{A}) = (-1)^{j+1} a^1_j \det\mathbf{M}^1_j(\mathbf{A}) + (-1)^{j+2} a^1_{j+1} \det\mathbf{M}^1_{j+1}(\mathbf{A})$. Since $\mathbf{a}_j = \mathbf{a}_{j+1}$, and consequently $\mathbf{M}^1_j(\mathbf{A}) = \mathbf{M}^1_{j+1}(\mathbf{A})$, we conclude that $D(\mathbf{A}) = 0$. □
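The inductive formula (1.21) translates directly into a recursive function. This is a minimal sketch with hypothetical names (`complementing_minor`, `det_laplace`); matrices are lists of rows with 0-based indices, so the sign $(-1)^{1+j}$ of the text becomes `(-1) ** j`.

```python
def complementing_minor(A, i, j):
    """Remove row i and column j (0-based) from the square matrix A."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det_laplace(A):
    """Determinant by expansion along the first row, as in (1.21)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det_laplace(complementing_minor(A, 0, j))
               for j in range(n))


print(det_laplace([[1, 2], [3, 4]]))                    # -2
print(det_laplace([[6, 18, 6], [3, 8, 6], [2, 1, 1]]))  # 96
```

Like the permutation formula, the recursion performs on the order of $n!$ multiplications, so for larger matrices a reduction to triangular form is the more economical route.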

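From (1.20) we immediately infer the following.

1.57 Theorem (Determinant of the transpose). We have

$$
\det\mathbf{A}^T = \det\mathbf{A} \qquad \text{for all } \mathbf{A} \in M_{n,n}(\mathbb{K}).
$$

One then shows the following important theorem.

1.58 Theorem (Binet's formula). Let $\mathbf{A}$ and $\mathbf{B}$ be two $n \times n$ matrices. Then

$$
\det(\mathbf{B}\mathbf{A}) = \det\mathbf{B}\,\det\mathbf{A}.
$$

Proof. Let $\mathbf{A} =: [a^i_j] = [\mathbf{a}_1 \mid \dots \mid \mathbf{a}_n]$, $\mathbf{B} = [b^i_j] = [\mathbf{b}_1 \mid \dots \mid \mathbf{b}_n]$ and let $(\mathbf{e}_1, \dots, \mathbf{e}_n)$ be the canonical basis of $\mathbb{K}^n$. Since the $j$th column of $\mathbf{B}\mathbf{A}$ is

$$
\mathbf{B}\mathbf{a}_j = \mathbf{B}\Bigl(\sum_{r=1}^n a^r_j\,\mathbf{e}_r\Bigr) = \sum_{r=1}^n a^r_j\,\mathbf{b}_r,
$$

we have

$$
\begin{aligned}
\det(\mathbf{B}\mathbf{A}) &= \det\Bigl(\Bigl[\sum_{r=1}^n a^r_1\mathbf{b}_r \Bigm| \dots \Bigm| \sum_{r=1}^n a^r_n\mathbf{b}_r\Bigr]\Bigr) \\
&= \sum_{\sigma \in \mathcal{P}_n} a^{\sigma(1)}_1 a^{\sigma(2)}_2 \cdots a^{\sigma(n)}_n \det[\mathbf{b}_{\sigma(1)} \mid \dots \mid \mathbf{b}_{\sigma(n)}] \\
&= \sum_{\sigma \in \mathcal{P}_n} (-1)^\sigma a^{\sigma(1)}_1 a^{\sigma(2)}_2 \cdots a^{\sigma(n)}_n \det\mathbf{B} = \det\mathbf{A}\,\det\mathbf{B}.
\end{aligned}
$$
□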

As stated in the beginning, the determinant gives us a criterion to decide whether a matrix is nonsingular or, equivalently, whether $n$ vectors are linearly independent.

1.59 Theorem. An $n \times n$ matrix $\mathbf{A}$ is nonsingular if and only if $\det\mathbf{A} \neq 0$.

Proof. If $\mathbf{A}$ is nonsingular, there is a $\mathbf{B} \in M_{n,n}(\mathbb{K})$ such that $\mathbf{A}\mathbf{B} = \mathrm{Id}_n$, see Exercise 1.27; by Binet's formula $\det\mathbf{A}\,\det\mathbf{B} = 1$. In particular $\det\mathbf{A} \neq 0$. Conversely, if the columns of $\mathbf{A}$ are linearly dependent, then it is not difficult to see that $\det\mathbf{A} = 0$ by using Theorem 1.54. □

Let $\mathbf{A} = [a^i_j]$ be an $m \times n$ matrix. We say that the characteristic of $\mathbf{A}$ is $r$ if all $p$-minors with $p > r$ have zero determinant and there exists an $r$-minor with nonzero determinant.

1.60 Theorem (Kronecker). The rank and the characteristic of a matrix are the same.

Proof. Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$ and let $r := \operatorname{Rank}\mathbf{A}$. For any minor $\mathbf{B}$, trivially $\operatorname{Rank}\mathbf{B} \le \operatorname{Rank}\mathbf{A} = r$, hence every $p$-minor is singular, i.e., has zero determinant, if $p > r$. On the other hand, Theorem 1.28 implies that there exists a nonsingular $r$-minor $\mathbf{B}$ of $\mathbf{A}$, hence with $\det\mathbf{B} \neq 0$. □

The defining inductive formula (1.21) requires us to compute the determinant of the complementing minors of the elements of the first row; on account of the alternance, we can use any row, and on account of Theorem 1.57, we can use any column. More precisely,


1.61 Theorem (Laplace's formulas). Let $\mathbf{A}$ be an $n \times n$ matrix. We have for all $h, k = 1, \dots, n$

$$
\delta_{hk}\det\mathbf{A} = \sum_{j=1}^n (-1)^{h+j} a^k_j \det\mathbf{M}^h_j(\mathbf{A}),
\qquad
\delta_{hk}\det\mathbf{A} = \sum_{i=1}^n (-1)^{i+h} a^i_k \det\mathbf{M}^i_h(\mathbf{A}),
$$

where $\delta_{hk}$ is Kronecker's symbol.

1.62 ¶. To compute the determinant of a square $n \times n$ matrix $\mathbf{A}$ we can use a Gauss reduced matrix $\mathbf{G}_{\mathbf{A}}$ of $\mathbf{A}$. Show that $\det\mathbf{A} = (-1)^\sigma \prod_{i=1}^n (\mathbf{G}_{\mathbf{A}})^i_i$ where $\sigma$ is the permutation of rows needed to compute $\mathbf{G}_{\mathbf{A}}$, and the product is the product of the pivots.

It is useful to rewrite Laplace's formulas using matrix multiplication. Denote by $\operatorname{cof}(\mathbf{A}) = [c^i_j]$ the square $n \times n$ matrix, called the matrix of cofactors of $\mathbf{A}$, defined by

$$
c^i_j := (-1)^{i+j}\det\mathbf{M}^j_i(\mathbf{A}).
$$

Notice the exchange between the row and column indices: the $(i, j)$ entry of $\operatorname{cof}(\mathbf{A})$ is $(-1)^{i+j}$ times the determinant of the complementing $(j, i)$-minor. Using the cofactor matrix, Laplace's formulas in Theorem 1.61 rewrite in matrix form as

1.63 Theorem (Laplace's formulas). Let $\mathbf{A}$ be an $n \times n$ matrix. Then we have

$$
\operatorname{cof}(\mathbf{A})\,\mathbf{A} = \mathbf{A}\,\operatorname{cof}(\mathbf{A}) = \det\mathbf{A}\;\mathrm{Id}_n. \tag{1.22}
$$

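We immediately infer the following.

1.64 Proposition. Let $\mathbf{A} = [\mathbf{a}_1 \mid \mathbf{a}_2 \mid \dots \mid \mathbf{a}_n] \in M_{n,n}(\mathbb{K})$ be nonsingular.
(i) We have

$$
\mathbf{A}^{-1} = \frac{1}{\det\mathbf{A}}\operatorname{cof}(\mathbf{A}).
$$

(ii) (CRAMER'S RULE) The system $\mathbf{A}\mathbf{x} = \mathbf{b}$, $\mathbf{b} \in \mathbb{K}^n$, has a unique solution given by

$$
\mathbf{x} = (x^1, x^2, \dots, x^n), \qquad x^i = \frac{\det\mathbf{B}_i}{\det\mathbf{A}},
$$

where

$$
\mathbf{B}_i := [\mathbf{a}_1 \mid \dots \mid \mathbf{a}_{i-1} \mid \mathbf{b} \mid \mathbf{a}_{i+1} \mid \dots \mid \mathbf{a}_n].
$$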

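Proof. (i) follows immediately from (1.22). (ii) follows from (i), but it is better shown using linearity and the alternating property of the determinant. In fact, solving $\mathbf{A}\mathbf{x} = \mathbf{b}$ is equivalent to finding $\mathbf{x} = (x^1, x^2, \dots, x^n)$ such that $\mathbf{b} = \sum_{i=1}^n x^i\mathbf{a}_i$. Now, linearity and the alternating property of the determinant yield

$$
\det\mathbf{B}_i = \det\Bigl[\dots \Bigm| \mathbf{a}_{i-1} \Bigm| \sum_{j=1}^n x^j\mathbf{a}_j \Bigm| \mathbf{a}_{i+1} \Bigm| \dots\Bigr]
= \sum_{j=1}^n x^j \det[\dots \mid \mathbf{a}_{i-1} \mid \mathbf{a}_j \mid \mathbf{a}_{i+1} \mid \dots].
$$

Since the only nonzero addend on the right-hand side is the one with $j = i$, we infer

$$
\det\mathbf{B}_i = x^i \det[\mathbf{a}_1 \mid \dots \mid \mathbf{a}_{i-1} \mid \mathbf{a}_i \mid \mathbf{a}_{i+1} \mid \dots \mid \mathbf{a}_n] = x^i \det\mathbf{A}.
$$
□

1.65 ¶. Show that $\det\operatorname{cof}(\mathbf{A}) = (\det\mathbf{A})^{n-1}$.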
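Cramer's rule from Proposition 1.64 can be tried out on a small system. The sketch below is illustrative only: the names `det` and `cramer` are assumptions, the determinant is computed by the first-row Laplace expansion (1.21), and the entries are taken to be integers or fractions so that the result is exact.

```python
from fractions import Fraction


def det(A):
    """Determinant by Laplace expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))


def cramer(A, b):
    """Solve A x = b for a nonsingular square A via x^i = det B_i / det A."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    x = []
    for i in range(len(A)):
        # B_i is A with its i-th column replaced by b.
        B_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        x.append(Fraction(det(B_i), d))
    return x


A = [[6, 18, 6], [3, 8, 6], [2, 1, 1]]
b = [30, 17, 4]  # chosen so that the solution is (1, 1, 1)
print([str(v) for v in cramer(A, b)])
```

Each component requires a full determinant, so the rule is mainly of theoretical interest; for actual computations the Gauss elimination procedure is far cheaper.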

1.5 Exercises

1.66 ¶. Find the values of $x, y \in \mathbb{R}$ for which the three vectors $(1, 1, 1)$, $(1, x, x^2)$, $(1, y, y^2)$ form a basis of $\mathbb{R}^3$.

1.67 ¶. Let $\alpha_1, \alpha_2 \in \mathbb{C}$ be distinct and nonzero. Show that $e^{\alpha_1 t}$, $e^{\alpha_2 t}$, $t \in \mathbb{R}$, are linearly independent on $\mathbb{C}$. [Hint: See [GM2] Corollary 5.54.]

1.68 ¶. Write the parametric equation of a straight line (i) through $\mathbf{b} = (1, 1, 1)$ and with direction $\mathbf{a} = (1, 0, 0)$, (ii) through $\mathbf{a} = (1, 1, 1)$ and $\mathbf{b} = (1, 0, 0)$.

1.69 ¶. Describe in a parametric or implicit way in $\mathbb{R}^3$,
o a straight line through two points,
o the intersection of a plane and a straight line,
o a straight line that is parallel to a given plane,
o a straight line on a plane,
o a plane through three points,
o a plane through a point containing a given straight line,
o a plane perpendicular to a straight line.

1.70 ¶ Affine transformations. An affine transformation $\varphi : \mathbb{K}^n \to \mathbb{K}^n$ is a map of the type $\varphi(\mathbf{x}) := L(\mathbf{x}) + \mathbf{q}_0$ where $L : \mathbb{K}^n \to \mathbb{K}^n$ is linear and $\mathbf{q}_0 \in \mathbb{K}^n$. Show that $\varphi$ is an affine transformation if and only if $\varphi$ maps straight lines onto straight lines.

1.71 ¶. Let $P_1$ and $P_2$ be two $(n - 1)$-planes in $\mathbb{R}^n$. Show that either $P_1 = P_2$ or $P_1 \cap P_2 = \emptyset$ or $P_1 \cap P_2$ has dimension $n - 2$.

1.72 ¶. In $\mathbb{R}^4$ find (i) two 2-planes through the origin that meet only at the origin, (ii) two 2-planes through the origin that meet along a straight line.

1.73 ¶. In $\mathbb{R}^2$ write the $2 \times 2$ matrix associated with the counterclockwise rotations of angles $\pi/2$, $\pi$, $3\pi/2$, and, in general, $\theta \in \mathbb{R}$.


1.74 ¶. Write the matrix associated with the axial symmetry in $\mathbb{R}^3$ and to plane symmetries.

1.75 ¶. Write down explicit linear systems of 3, 4, 5 equations with 4 or 5 unknowns, and use the Gauss elimination procedure to solve them.

1.76 ¶. Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$. Show that if $\mathbf{A}\mathbf{B} = 0$ $\forall\mathbf{B} \in M_{n,n}(\mathbb{K})$,

1.77 1 . Let A = ( ^

~^

^ I and B = I ^

~^

\0

-1

3J

\b

2

then A = 0.

^ | . Compute A -f B , \ / 2 A + B .

3J

1.78 If. Let

(

3

3

2

3^

2

2

0

2

-1

0

1

V

(

0

(2 3 B = 1

V2

-A 2 -1

5/

Compute A B , B B ^ , B ^ B . 1.79 K. Let 0 0 0

0\ 1 0 0 0/

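Show that $\mathbf{A}\mathbf{B} = 0$.

1.80 ¶. Let $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{K})$. Show that if $\mathbf{A}\mathbf{B} = 0$ and $\mathbf{A}$ is invertible, then $\mathbf{B} = 0$.

1.81 ¶. Let

Show that
o $\mathbf{A}^2 = \mathbf{B}^2 = \mathbf{C}^2 = -\mathrm{Id}$,
o $\mathbf{A}\mathbf{B} = -\mathbf{B}\mathbf{A} = \mathbf{C}$,
o $\mathbf{B}\mathbf{C} = -\mathbf{C}\mathbf{B} = \mathbf{A}$,
o $\mathbf{C}\mathbf{A} = -\mathbf{A}\mathbf{C} = \mathbf{B}$.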

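1.82 ¶. Let $\mathbf{A}, \mathbf{B} \in M_{n,n}$. We say that $\mathbf{A}$ is symmetric if $\mathbf{A} = \mathbf{A}^T$. Show that, if $\mathbf{A}$ and $\mathbf{B}$ are symmetric, then $\mathbf{A}\mathbf{B}$ is symmetric if and only if $\mathbf{A}$ and $\mathbf{B}$ commute, i.e., $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A}$.

1.83 ¶. Let $\mathbf{M} \in M_{n,n}(\mathbb{K})$ be an upper triangular matrix with all entries in the principal diagonal equal to 1. Suppose that for some $k$ we have $\mathbf{M}^k = \mathbf{M}\mathbf{M}\cdots\mathbf{M} = \mathrm{Id}_n$. Show that $\mathbf{M} = \mathrm{Id}_n$.

1.84 ¶. Let $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{K})$. In general $\mathbf{A}\mathbf{B} \neq \mathbf{B}\mathbf{A}$. The $n \times n$ matrix $[\mathbf{A}, \mathbf{B}] := \mathbf{A}\mathbf{B} - \mathbf{B}\mathbf{A}$ is called the commutator or the Lie bracket of $\mathbf{A}$ and $\mathbf{B}$. Show that
(i) $[\mathbf{A}, \mathbf{B}] = -[\mathbf{B}, \mathbf{A}]$,
(ii) (JACOBI'S IDENTITY) $[[\mathbf{A}, \mathbf{B}], \mathbf{C}] + [[\mathbf{B}, \mathbf{C}], \mathbf{A}] + [[\mathbf{C}, \mathbf{A}], \mathbf{B}] = 0$,
(iii) the trace of $[\mathbf{A}, \mathbf{B}]$ is zero. The trace of an $n \times n$ matrix $\mathbf{A} = [a^i_j]$ is defined as $\operatorname{tr}\mathbf{A} := \sum_{i=1}^n a^i_i$.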

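1.85 ¶. Let $\mathbf{A} \in M_{n,n}$ be diagonal. Show that $\mathbf{B}$ is diagonal if and only if $[\mathbf{A}, \mathbf{B}] = 0$.

1.86 ¶ Block matrices. Write an $n \times n$ matrix as

$$
\mathbf{A} = \begin{pmatrix} \mathbf{A}^1_1 & \mathbf{A}^1_2 \\ \mathbf{A}^2_1 & \mathbf{A}^2_2 \end{pmatrix}
$$

where $\mathbf{A}^1_1$ is the submatrix of the first $k$ rows and $h$ columns, $\mathbf{A}^1_2$ is the submatrix of the first $k$ rows and $n - h$ columns, etc. Show that

$$
\begin{pmatrix} \mathbf{A}^1_1 & \mathbf{A}^1_2 \\ \mathbf{A}^2_1 & \mathbf{A}^2_2 \end{pmatrix}
\begin{pmatrix} \mathbf{B}^1_1 & \mathbf{B}^1_2 \\ \mathbf{B}^2_1 & \mathbf{B}^2_2 \end{pmatrix}
=
\begin{pmatrix}
\mathbf{A}^1_1\mathbf{B}^1_1 + \mathbf{A}^1_2\mathbf{B}^2_1 & \mathbf{A}^1_1\mathbf{B}^1_2 + \mathbf{A}^1_2\mathbf{B}^2_2 \\
\mathbf{A}^2_1\mathbf{B}^1_1 + \mathbf{A}^2_2\mathbf{B}^2_1 & \mathbf{A}^2_1\mathbf{B}^1_2 + \mathbf{A}^2_2\mathbf{B}^2_2
\end{pmatrix}.
$$

1.87 ¶. Let $\mathbf{A} \in M_{k,k}(\mathbb{K})$, $\mathbf{B} \in M_{n,n}(\mathbb{K})$ and

$$
\mathbf{C} = \begin{pmatrix} \mathbf{A} & 0 \\ 0 & \mathbf{B} \end{pmatrix}.
$$

Compute $\det\mathbf{C}$.

1.88 ¶. Let $\mathbf{A} \in M_{k,k}(\mathbb{K})$, $\mathbf{B} \in M_{n,n}(\mathbb{K})$, $\mathbf{C} \in M_{k,n}(\mathbb{K})$ and

$$
\mathbf{M} = \begin{pmatrix} \mathbf{A} & \mathbf{C} \\ 0 & \mathbf{B} \end{pmatrix}.
$$

Compute $\det\mathbf{M}$.

1.89 ¶ Vandermonde determinant. Let $\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{K}$ and

$$
\mathbf{A} := \begin{pmatrix}
1 & 1 & 1 & \dots & 1 \\
\lambda_1 & \lambda_2 & \lambda_3 & \dots & \lambda_n \\
\lambda_1^2 & \lambda_2^2 & \lambda_3^2 & \dots & \lambda_n^2 \\
\vdots & \vdots & \vdots & & \vdots \\
\lambda_1^{n-1} & \lambda_2^{n-1} & \lambda_3^{n-1} & \dots & \lambda_n^{n-1}
\end{pmatrix}.
$$

Prove that $\det\mathbf{A} = \prod_{i<j}(\lambda_j - \lambda_i)$. [Hint: Proceed by induction on $n$. Notice that $\det\mathbf{A}$ is a polynomial in $\lambda_n$ and use the principle of identity for polynomials.]

1.90 ¶. Compute the rank of the following matrices /2 2 3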

1

3 1 \ 1 - 3 1 3 1 - 1

Vs 4 -2

0 /

/3 3 3 l \ 1 3 3 3 3 3 3 3 \ 1 1 1 1/

(2

3 1 3\ 3 1 - 1 2 - 1 2 2 1 \1 5 3 4)

1.91 %, Solve the following linear systems 2x-\-4y-\-3z-2t 3x-y-\-2z

+ t = l,

x + 2y- z-\-2t = 2, [x-5y

+ 4z-St

= 1,

2x + 2y-Sz x-\-2yx-3y

= 3,

+ St = 3,

z-{-St = 2, + 2z + 2t= - 4 ,

(4x + y-2z-\-8t

= 1.

2. Vector Spaces and Linear Maps

The linear structure of $\mathbb{K}^n$ is shared by several mathematical objects. We have already noticed that the set of $m \times n$ matrices satisfies the laws of sum and multiplication by scalars. The aim of this chapter is to introduce abstract language and illustrate some facts related to linear structure. In particular, we shall see that in every finite-dimensional vector space we can introduce the coordinates related to a basis and explain how the coordinates description of intrinsic objects changes when we change the coordinates, i.e., the basis.

2.1 Vector Spaces and Linear Maps

a. Definition

Let $\mathbb{K}$ be a commutative field; here it will be either $\mathbb{R}$ or $\mathbb{C}$.

2.1 Definition. A vector space over the field $\mathbb{K}$ is a set $X$ endowed with
(i) an operation $+ : X \times X \to X$, called the sum, that makes $X$ a commutative group, i.e.,
  a) $(x + y) + z = x + (y + z)$, $x + y = y + x$, $\forall x, y, z \in X$,
  b) there exists an element $0 \in X$, called the zero element, such that $x + 0 = 0 + x = x$ $\forall x \in X$,
  c) for every $x \in X$ there exists $-x \in X$ such that $x + (-x) = 0$,
(ii) an operation of multiplication by a scalar $\cdot : \mathbb{K} \times X \to X$ that associates to every $\lambda \in \mathbb{K}$ and $x \in X$ an element of $X$ denoted by $\lambda x$ such that
  a) $\lambda(x + y) = \lambda x + \lambda y$, $(\lambda + \mu)x = \lambda x + \mu x$,
  b) $\lambda(\mu x) = (\lambda\mu)x$, $1 \cdot x = x$.

In particular, $(-1)x = -x$ $\forall x \in X$; we therefore write $x - y$ instead of $x + (-y)$. The elements of a vector space over $\mathbb{K}$ are called vectors, and the elements of $\mathbb{K}$ are called scalars. The product of a vector by scalars allows us to regard a vector at all scales.


2.2 Example. As we have seen, $\mathbb{K}^n$ for $n \ge 1$, and all the linear subspaces of $\mathbb{K}^n$ are vector spaces over $\mathbb{K}$. Also, the space of $m \times n$ matrices with entries in $\mathbb{K}$, $M_{m,n}(\mathbb{K})$, is a vector space over $\mathbb{K}$, with the two operations of sum of matrices and multiplication of a matrix by a scalar, see Section 1.2.

2.3 Example. Let $X$ be any set. Then the class $\mathcal{F}(X, \mathbb{K})$ of all functions $\varphi : X \to \mathbb{K}$ is a vector space with the two operations of sum and multiplication by scalars defined by

$$
(\varphi + \psi)(x) := \varphi(x) + \psi(x), \qquad (\lambda\varphi)(x) := \lambda\varphi(x) \qquad \forall x \in X.
$$

Several subclasses of functions are vector spaces, actually linear subspaces of $\mathcal{F}(X, \mathbb{K})$. For instance,
o the set $C^0([0,1], \mathbb{R})$ of all continuous functions $\varphi : [0,1] \to \mathbb{R}$, the set of $k$-differentiable functions from $[0,1]$ into $\mathbb{R}$, the set $C^k([0,1], \mathbb{R})$ of all functions with continuous derivatives up to the order $k$, the set $C^\infty([0,1], \mathbb{R})$ of infinitely differentiable functions,
o the set of polynomials of degree less than $k$, the set of all polynomials,
o the set of all complex trigonometric polynomials,
o the set of Riemann summable functions in $]0,1[$,
o the set of all sequences with values in $\mathbb{K}$.

We now begin the study of properties that depend only on the linear structure of a vector space, independently of specific examples.

b. Subspaces, linear combinations and bases

2.4 Definition. A subset $W$ of a vector space $X$ is called a linear subspace, or shortly a subspace of $X$, if
(i) $0 \in W$,
(ii) $\forall x, y \in W$ we have $x + y \in W$,
(iii) $\forall x \in W$ and $\forall\lambda \in \mathbb{K}$ we have $\lambda x \in W$.

Obviously the element 0 is the zero element of $X$ and the operations of sum and multiplication by scalars are as those in $X$. In a vector space we may consider the finite linear combinations of elements of $X$ with coefficients in $\mathbb{K}$, i.e.,

$$
\sum_{i=1}^n \lambda^i v_i \in X
$$

where $\lambda^1, \lambda^2, \dots, \lambda^n \in \mathbb{K}$ and $v_1, v_2, \dots, v_n \in X$. Notice that we have indexed both the vectors and the relative coefficients, and we use the standard notation on the indices: a list of vectors has lower indices and a list of coefficients has upper indices. It is readily seen that a subset $W \subset X$ is a subspace of $X$ if and only if all finite linear combinations of elements of $W$ with coefficients in $\mathbb{K}$ belong to $W$. Moreover, given a set $S \subset X$, the family of all finite linear combinations of elements of $S$ is a subspace of $X$ called the span of $S$ and denoted by $\operatorname{Span} S$.

We say that a finite number of vectors $v_1, v_2, \dots, v_n$ are linearly dependent if there are scalars $\lambda^1, \lambda^2, \dots, \lambda^n$, not all zero, such that

$$
\sum_{i=1}^n \lambda^i v_i = 0,
$$

Figure 2.1. Arthur Cayley (1821-1895) and the Lectures on Quaternions by William R. Hamilton (1805-1865).

or, in other words, if one vector is a linear combination of the others. If $n$ vectors are not linearly dependent, we say that they are linearly independent. More generally, we say that a set $S$ of vectors is a set of linearly independent vectors whenever any finite list of elements of $S$ is made by linearly independent vectors. Of course linearly independent vectors are distinct and nonzero.

2.5 Definition. Let $X$ be a vector space. A set $S$ of linearly independent vectors such that $\operatorname{Span} S = X$ is called a basis of $X$. A set $A \subset X$ is a maximal independent set of $X$ if $A$ is a set of linearly independent vectors and, whenever we add to it a vector $w \in X \setminus A$, $A \cup \{w\}$ is not a set of linearly independent vectors.

Thus a basis of $X$ is a subset $S \subset X$ such that
(i) every $x \in X$ is a finite linear combination of some elements of $S$. Equivalently, for every $x \in X$ there is a map $\lambda : S \to \mathbb{K}$ such that $x = \sum_{v \in S} \lambda(v)v$ and $\lambda(v) = 0$ except for a finite number of elements (depending on $x$) of $S$,
(ii) each finite subset of $S$ is a set of linearly independent vectors.

It is easy to prove that for every $x \in X$ the representation $x = \sum_{v \in S} \lambda(v)v$ is unique if $S$ is a basis of $X$. Using the same proof as in Proposition 1.5 we then infer

2.6 Proposition. Let $X$ be a vector space over $\mathbb{K}$. Then $S \subset X$ is a basis of $X$ if and only if $S$ is a maximal independent set.

Using Zorn's lemma, see [GM2], one can also show the following.

2.7 Theorem. Every vector space $X$ has a basis. Moreover, two bases have the same cardinality.

2.8 Definition. A vector space $X$ is finite dimensional if $X$ has a finite basis.

In the most interesting infinite-dimensional vector spaces, one can show the basis has nondenumerable cardinality. Later, we shall see that the introduction of the notion of limit, i.e., of a new structure on $X$, improves the way of describing vectors. Instead of trying to see every $x \in X$ as a finite linear combination of elements of a nondenumerable basis, it is better to approximate it by a suitable sequence of finite linear combinations of a suitable countable set. For finite-dimensional vector spaces, Theorem 2.7 can be proved more directly, as we shall see later.

2.9 ¶. Show that the space of all polynomials and $C^0([0,1], \mathbb{R})$ are infinite-dimensional vector spaces.

c. Linear maps

2.10 Definition. Let $X$ and $Y$ be two vector spaces over $\mathbb{K}$. A map $\varphi : X \to Y$ is called $\mathbb{K}$-linear, or linear for short, if
$$\varphi(x + y) = \varphi(x) + \varphi(y) \qquad \text{and} \qquad \varphi(\lambda x) = \lambda \varphi(x)$$
for any $x, y \in X$ and $\lambda \in \mathbb{K}$. A linear map that is injective and surjective is called a (linear) isomorphism.

Of course, if $\varphi : X \to Y$ is linear, we have $\varphi(0) = 0$ and, by induction,

2.11 Proposition. Let $\varphi : X \to Y$ be linear. Then
$$\varphi\Big(\sum_{i=1}^k \lambda^i e_i\Big) = \sum_{i=1}^k \lambda^i \varphi(e_i)$$
for any $\lambda^1, \lambda^2, \dots, \lambda^k \in \mathbb{K}$ and $e_1, e_2, \dots, e_k \in X$. In particular, a linear map is fixed by the values it takes on a basis.

The space of linear maps $\varphi : X \to Y$ between two vector spaces $X$ and $Y$, denoted by $\mathcal{L}(X, Y)$, is a vector space over $\mathbb{K}$ with the operations of sum and multiplication by scalars defined in terms of the operations on $Y$ by
$$(\varphi + \psi)(x) := \varphi(x) + \psi(x), \qquad (\lambda\varphi)(x) := \lambda\varphi(x)$$
for all $\varphi, \psi \in \mathcal{L}(X, Y)$ and $\lambda \in \mathbb{K}$. Notice also that the composition of linear maps is again a linear map, and that, if $\varphi : X \to Y$ is an isomorphism, then the inverse map $\varphi^{-1} : Y \to X$ is also an isomorphism. It is easy to check the following.

2.12 Proposition. Let $\varphi : X \to Y$ be a linear map.

(i) If $S \subset X$ spans $W \subset X$, then $\varphi(S)$ spans $\varphi(W)$.
(ii) If $e_1, e_2, \dots, e_n$ are linearly dependent in $X$, then $\varphi(e_1), \dots, \varphi(e_n)$ are linearly dependent in $Y$.
(iii) $\varphi$ is injective if and only if any list $(e_1, e_2, \dots, e_n)$ of linearly independent vectors in $X$ is mapped into a list $(\varphi(e_1), \varphi(e_2), \dots, \varphi(e_n))$ of linearly independent vectors in $Y$.
(iv) The following claims are equivalent:
a) $\varphi$ is an isomorphism,
b) $S \subset X$ is a basis of $X$ if and only if $\varphi(S)$ is a basis of $Y$.

2.13 ¶. Show that the following maps are linear:

(i) the derivation map $D : C^1([0,1]) \to C^0([0,1])$ that maps a $C^1$-function into its derivative, $f \mapsto f'$,
(ii) the map that associates to every function of class $C^0([0,1])$ its integral over $[0,1]$,
$$f \mapsto \int_0^1 f(t)\, dt,$$
(iii) the primitive map $C^0([0,1]) \to C^1([0,1])$ that associates to every continuous function the primitive function
$$f(x) \mapsto F(x) := \int_0^x f(t)\, dt.$$

2.14 Definition. Let $\varphi : X \to Y$ be a linear map. The kernel of $\varphi$ and the image of $\varphi$ are respectively
$$\ker\varphi := \{x \in X \mid \varphi(x) = 0\}, \qquad \operatorname{Im}\varphi := \{y \in Y \mid y = \varphi(x) \text{ for some } x \in X\}.$$
d. Coordinates in a finite-dimensional vector space

Let $X$ be a finite-dimensional vector space over $\mathbb{K}$ and let $(e_1, e_2, \dots, e_n)$ be an ordered basis of $X$. Then every vector $x \in X$ can be written uniquely as $x = \sum_{i=1}^n x^i e_i$ where $x^1, x^2, \dots, x^n \in \mathbb{K}$. Thus $(e_1, e_2, \dots, e_n)$ defines a map $\mathcal{E} : X \to \mathbb{K}^n$ characterized by
$$\mathcal{E}(x) = \mathbf{x} = (x^1, x^2, \dots, x^n) \qquad \text{if and only if} \qquad x = \sum_{i=1}^n x^i e_i.$$

Figure 2.2. Coordinate system in a finite-dimensional vector space.

It is trivial to verify that $\mathcal{E}$ is linear, injective and surjective, hence an isomorphism, together with its inverse
$$\mathcal{E}^{-1}(\mathbf{x}) = \sum_{i=1}^n x^i e_i.$$
We call $\mathcal{E}$ the coordinate system related to the ordered basis $(e_1, e_2, \dots, e_n)$ and refer to $\mathcal{E}(x)$ as the coordinate vector of $x$ with respect to the basis $(e_1, e_2, \dots, e_n)$. Notice that $\mathcal{E}$ maps $e_i$ to the $i$th vector $\mathbf{e}_i$ of the canonical basis of $\mathbb{K}^n$. Also notice that ordered bases and isomorphisms $\mathcal{E} : X \to \mathbb{K}^n$ are in one-to-one correspondence. In particular, any isomorphism $\mathcal{E} : X \to \mathbb{K}^n$ is a coordinate system related to a suitable basis. In fact, the vectors $e_1, e_2, \dots, e_n$ of $X$ defined for $i = 1, \dots, n$ by $e_i := \mathcal{E}^{-1}(\mathbf{e}_i)$ form a basis of $X$ by Proposition 2.12 (iii) and it is easy to check that
$$\mathcal{E}(x) = \mathbf{x} = (x^1, x^2, \dots, x^n) \qquad \text{if and only if} \qquad x = \sum_{i=1}^n x^i e_i.$$
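When $X = \mathbb{K}^n$ and the basis vectors are known in canonical coordinates, computing $\mathcal{E}(x)$ amounts to solving a linear system whose matrix has the basis vectors as columns. The following is a small sketch under that assumption, with rational entries; the function names `solve` and `coordinates` are illustrative and not taken from the text.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b exactly for a square nonsingular matrix a (list of rows)."""
    n = len(a)
    # augmented matrix [a | b]
    m = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(a, b)]
    for c in range(n):
        # raises StopIteration if the matrix is singular
        pivot = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[pivot] = m[pivot], m[c]
        m[c] = [x / m[c][c] for x in m[c]]
        for i in range(n):
            if i != c:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[c])]
    return [row[n] for row in m]


def coordinates(basis, x):
    """Coordinate vector of x with respect to the ordered basis (a list of vectors)."""
    n = len(basis)
    # the j-th column of the system matrix is the j-th basis vector
    columns = [[basis[j][i] for j in range(n)] for i in range(n)]
    return solve(columns, x)


basis = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
coords = coordinates(basis, [2, 3, 5])
print(coords)  # the coefficients x^1, x^2, x^3 of the vector (2, 3, 5)
```

Here $(2,3,5) = 0\cdot(1,1,0) + 3\cdot(0,1,1) + 2\cdot(1,0,1)$, so the coordinate vector is $(0, 3, 2)$.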

The use of a basis, or, equivalently, of a coordinate system, allows us to transfer definitions and results in $\mathbb{K}^n$ to similar definitions and claims in $X$. We have the following.

2.15 Proposition. Let $X$ be a finite-dimensional vector space.

(i) Let $(e_1, e_2, \dots, e_n)$ be an ordered basis of $X$ and let $v_1, v_2, \dots, v_p \in X$, $p < n$, be $p$ linearly independent vectors. Then we can choose $n - p$ elements among $e_1, e_2, \dots, e_n$, say $e_1, e_2, \dots, e_{n-p}$, such that $(v_1, v_2, \dots, v_p, e_1, e_2, \dots, e_{n-p})$ is a basis of $X$.
(ii) Assume that $v_1, v_2, \dots, v_k$ span $X$. Possibly eliminating some of the $v_1, v_2, \dots, v_k$, we get a basis of $X$.
(iii) Any two bases of $X$ have the same number of elements.

The number of elements of a basis of a finite-dimensional space $X$ is called the dimension of $X$ and denoted by $\dim X$. The following corollaries follow from Proposition 2.15.

2.16 Corollary. Let $X$ be a vector space of dimension $n$, let $\mathcal{E} : X \to \mathbb{K}^n$ be a coordinate system on $X$ and let $W$ be a subspace of $X$. Then $\mathcal{E}(W)$ is a subspace of $\mathbb{K}^n$ and $\dim W = \dim \mathcal{E}(W)$.

2.17 Corollary. Let $X$ be a vector space of dimension $n$. Then

(i) $n$ linearly independent vectors of $X$ form a basis of $X$,
(ii) if $k > n$, then $k$ vectors of $X$ are always linearly dependent,
(iii) for every subspace $W$ of $X$ we have $\dim W \le n$,
(iv) if $V \subset W$ …

Let $U$ and $V$ be two subspaces of a vector space $X$. Then both $U \cap V$ and
$$U + V := \{x \in X \mid x = u + v,\ u \in U,\ v \in V\}$$
are linear subspaces of $X$. When $U \cap V = \{0\}$, we say that $U + V$ is the direct sum of $U$ and $V$ and we write $U \oplus V$ instead of $U + V$. Moreover, if $X = U \oplus V$, we say that $U$ and $V$ are supplementary. Thus $X = U \oplus V$ means that every $x \in X$ decomposes uniquely as $x = u + v$, $u \in U$, $v \in V$.

2.18 Corollary (Grassmann's formula). Let $U$ and $V$ be two finite-dimensional subspaces of a vector space $X$. Then
$$\dim U + \dim V = \dim(U \cap V) + \dim(U + V).$$

2.19 ¶. Show that every $n$-dimensional vector space is the direct sum of $n$ subspaces of dimension 1.

2.20 ¶. Let $e_1, e_2, \dots, e_n$ be distinct vectors of $X$ and let $1 \le p < n$. Then, trivially, $\operatorname{Span}\{e_1, \dots, e_p\} + \operatorname{Span}\{e_{p+1}, \dots, e_n\} = \operatorname{Span}\{e_1, \dots, e_n\}$. Show that, if the $e_i$'s are linearly independent, then $\operatorname{Span}\{e_1, \dots, e_p\} \oplus \operatorname{Span}\{e_{p+1}, \dots, e_n\} = \operatorname{Span}\{e_1, \dots, e_n\}$.

2.21 ¶. Let $V_1, V_2$ be two subspaces of a vector space $V$ of finite dimension and assume that $V = V_1 \oplus V_2$. Then every vector $v \in V$ decomposes uniquely as $v = v_1 + v_2$ with $v_i \in V_i$. Show that the maps $\pi_i : V \to V_i$, $i = 1, 2$, $\pi_i(v) := v_i$, are linear.

2.22 ¶. Let $\varphi : X \to Y$ be an isomorphism from $X$ onto $Y$. Show that $\dim X = \dim Y$.

e. Matrices associated to a linear map

Let $X$, $Y$ be two vector spaces of dimension $n$ and $m$ respectively. We shall now show that every choice of an ordered basis, equivalently of a coordinate system, in $X$ and $Y$ yields an identification between linear maps and matrices. Let $(e_1, e_2, \dots, e_n)$ be an ordered basis in $X$, $(f_1, f_2, \dots, f_m)$ be an ordered basis in $Y$ and let $\mathcal{E} : X \to \mathbb{K}^n$, $\mathcal{F} : Y \to \mathbb{K}^m$ be the corresponding coordinate systems.

Figure 2.3. Hermann Grassmann (1808-1877) and his Ausdehnungslehre.

To every linear map $\ell : X \to Y$ one associates the linear map $L : \mathbb{K}^n \to \mathbb{K}^m$ defined by
$$L := \mathcal{F} \circ \ell \circ \mathcal{E}^{-1}, \tag{2.1}$$
see Figure 2.4, that maps the coordinates of a vector $x \in X$, relative to the basis $(e_1, e_2, \dots, e_n)$, into the coordinates of $\ell(x) \in Y$, relative to the basis $(f_1, f_2, \dots, f_m)$, and then, see Proposition 1.18, an $m \times n$ matrix $\mathbf{L}$ such that $L(\mathbf{x}) = \mathbf{L}\mathbf{x}$. We call $L$ and $\mathbf{L}$, respectively, the map and the matrix associated to $\ell$ using the coordinate systems $\mathcal{E}$ and $\mathcal{F}$ or, equivalently, using $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_m)$ as bases in $X$ and in $Y$, respectively. Since $\mathcal{E}^{-1}$ maps the $i$th vector $\mathbf{e}_i$ of the canonical basis of $\mathbb{K}^n$ to $e_i$, $L(\mathbf{e}_i)$ is the coordinate vector of $\ell(e_i)$ in the basis $(f_1, f_2, \dots, f_m)$, hence $\mathbf{L} = [L^i_j]$ where
$$\ell(e_j) = \sum_{i=1}^m L^i_j f_i. \tag{2.2}$$
Equivalently, see Proposition 1.18,
$$\mathbf{L} = \big[\, L(\mathbf{e}_1) \mid L(\mathbf{e}_2) \mid \dots \mid L(\mathbf{e}_n) \,\big].$$
Since $\mathcal{E} : X \to \mathbb{K}^n$ and $\mathcal{F} : Y \to \mathbb{K}^m$ are isomorphisms, we trivially have
$$\mathcal{E}(\ker \ell) = \ker L, \qquad \mathcal{F}(\operatorname{Im} \ell) = \operatorname{Im} L.$$
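As an illustration of (2.2), consider differentiation on the space of polynomials of degree less than $n$, with the monomial basis $(1, t, \dots, t^{n-1})$ used both in the source and in the target. This choice of space and basis is ours, made only for the sake of example. The $j$th column of the associated matrix contains the coordinates of the derivative of the $j$th basis vector.

```python
def derivative_matrix(n):
    """Matrix of d/dt on polynomials of degree < n in the basis (1, t, ..., t^(n-1))."""
    L = [[0] * n for _ in range(n)]
    for j in range(1, n):
        # d/dt t^j = j t^(j-1): column j has the single entry j in row j-1
        L[j - 1][j] = j
    return L


def apply(matrix, x):
    """Matrix-vector product: coordinates of the image from coordinates of the vector."""
    return [sum(matrix[i][j] * x[j] for j in range(len(x))) for i in range(len(matrix))]


# p(t) = 1 + 2t + 3t^2 has coordinate vector (1, 2, 3)
p = [1, 2, 3]
dp = apply(derivative_matrix(3), p)
print(dp)  # coordinates of p'(t) = 2 + 6t
```

The kernel of this matrix is spanned by the coordinate vector of the constant polynomial 1, in agreement with $\mathcal{E}(\ker \ell) = \ker L$.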

Figure 2.4. The matrix associated to a linear map.

Hence, recalling Theorem 1.25, we have the following.

2.23 Theorem (Rank formula). Let $\ell : X \to Y$ be a linear map between linear spaces. If $X$ is finite dimensional, then
$$\operatorname{Rank} \ell = \dim \operatorname{Im} \ell = \dim X - \dim \ker \ell.$$

Proof. Let $(e_1, e_2, \dots, e_n)$ be a basis of $X$. Then $\operatorname{Im} \ell = \operatorname{Span}\{\ell(e_1), \dots, \ell(e_n)\}$, hence $\dim \operatorname{Im} \ell < +\infty$. Now choose a basis $(f_1, f_2, \dots, f_m)$ of $\operatorname{Im} \ell$ and consider the associated linear map $L : \mathbb{K}^n \to \mathbb{K}^m$ using the two bases $(e_1, e_2, \dots, e_n)$ on $X$ and $(f_1, f_2, \dots, f_m)$ on $\operatorname{Im} \ell$. Then Theorem 1.25 yields
$$\dim \operatorname{Im} \ell = \dim \operatorname{Im} L = n - \dim \ker L = n - \dim \ker \ell. \qquad \square$$
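The rank formula can be checked on a concrete matrix by computing the rank and a basis of the kernel independently. The sketch below assumes rational entries and uses reduction to row echelon form; the helper names `rref` and `kernel_basis` are ours.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form and the list of pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # free column
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m, pivots


def kernel_basis(rows):
    """One kernel vector for each free column of the matrix."""
    m, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for r, c in enumerate(pivots):
            v[c] = -m[r][free]
        basis.append(v)
    return basis


L = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
_, pivots = rref(L)
kernel = kernel_basis(L)
print(len(pivots), len(kernel))  # rank and dimension of the kernel; they add up to 3
```

For this matrix the rank is 2 and the kernel is spanned by $(-1, -1, 1)$, so that $2 = 3 - 1$.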

f. The space $\mathcal{L}(X, Y)$

Let $X$ and $Y$ be vector spaces of dimension $n$ and $m$, and let $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_m)$ be two bases in $X$ and $Y$. Then (2.2) defines a map $\mathcal{M} : \mathcal{L}(X, Y) \to M_{m,n}(\mathbb{K})$ which is trivially injective and surjective. Since $\mathcal{M}$ is also linear, we deduce that $\mathcal{L}(X, Y)$ and $M_{m,n}(\mathbb{K})$ are isomorphic. In particular, the vector space $\mathcal{L}(X, Y)$ has dimension $mn$. A basis of $\mathcal{L}(X, Y)$ is given by the $mn$ maps $\{\varphi^i_j\}$, $j = 1, \dots, n$, $i = 1, \dots, m$, defined in terms of the bases as
$$\varphi^i_j(e_k) := \begin{cases} f_i & \text{if } k = j, \\ 0 & \text{if } k \ne j, \end{cases} \qquad k = 1, \dots, n.$$
The matrix associated to $\varphi^i_j$ is the $m \times n$ matrix with all entries 0 except for the entry $(i, j)$ where we have 1.

Of course the matrix $\mathcal{M}(\ell)$ associated to $\ell$ depends on the coordinate systems we use on $X$ and $Y$. When we want to emphasize such a dependence, we write
$$\mathcal{M}^{\mathcal{F}}_{\mathcal{E}}(\ell)$$
to denote the matrix associated to $\ell : X \to Y$ using the coordinate systems $\mathcal{E}$ on the source space and $\mathcal{F}$ in the target space. The composition of linear maps corresponds to the composition of the associated linear maps at the level of coordinates, hence to the row-by-column product of the corresponding matrices. More precisely, we have the following.

2.24 Proposition. Let $\varphi : X \to Y$ and $\psi : Y \to Z$ be two linear maps, and let $\mathcal{E} : X \to \mathbb{K}^n$, $\mathcal{F} : Y \to \mathbb{K}^m$, and $\mathcal{G} : Z \to \mathbb{K}^p$ be three systems of coordinates on $X$, $Y$ and $Z$. Then
$$\mathcal{M}^{\mathcal{G}}_{\mathcal{E}}(\psi \circ \varphi) = \mathcal{M}^{\mathcal{G}}_{\mathcal{F}}(\psi)\, \mathcal{M}^{\mathcal{F}}_{\mathcal{E}}(\varphi).$$

Proof. It suffices to observe that $\mathcal{G} \circ (\psi \circ \varphi) \circ \mathcal{E}^{-1} = (\mathcal{G} \circ \psi \circ \mathcal{F}^{-1}) \circ (\mathcal{F} \circ \varphi \circ \mathcal{E}^{-1})$. $\square$

A special case arises if $X = Y = Z$. In this case, the space $\mathcal{L}(X, X)$ of the linear maps from $X$ into itself, also known as the space of endomorphisms of $X$ and sometimes denoted by $\operatorname{End}(X)$, is closed under the operations of sum, multiplication by scalars and composition. We say that $\mathcal{L}(X, X)$ is an algebra with respect to these operations and, for any coordinate system $\mathcal{E}$ on $X$, $\mathcal{M}^{\mathcal{E}}_{\mathcal{E}} : \mathcal{L}(X, X) \to M_{n,n}(\mathbb{K})$ is an isomorphism of algebras.

The set of isomorphisms from $X$ into itself, called the automorphisms of $X$ and denoted by $\operatorname{Aut}(X)$, is a group with respect to the composition. If $\dim X = n$ and $\mathcal{E} : X \to \mathbb{K}^n$ is a coordinate system, then $\mathcal{M}^{\mathcal{E}}_{\mathcal{E}}(\operatorname{Aut}(X))$ coincides with the group $GL(n, \mathbb{K})$ of all nonsingular $n \times n$ matrices,
$$GL(n, \mathbb{K}) := \{\mathbf{L} \in M_{n,n}(\mathbb{K}) \mid \det \mathbf{L} \ne 0\}.$$

g. Linear abstract equations

Let $X$, $Y$ be two vector spaces over $\mathbb{K}$. A linear (abstract) equation in the unknown $x$ is an equation of the form
$$\varphi(x) = y, \tag{2.3}$$
where $\varphi : X \to Y$ is a linear map and $y \in Y$. The equation $\varphi(x) = 0$ is called the associated homogeneous equation to (2.3). Of course, we have

(i) the set of all solutions of the associated linear homogeneous equation $\varphi(x) = 0$ is $\ker \varphi$,
(ii) (2.3) is solvable if and only if $y \in \operatorname{Im} \varphi$,
(iii) (2.3) has at most one solution if $\ker \varphi = \{0\}$,
(iv) if $\varphi(x_0) = y$, then the set of all solutions of (2.3) is
$$\{x \in X \mid x - x_0 \in \ker \varphi\}.$$

Taking into account the rank formula, we infer the following.

2.25 Corollary. Let $X$, $Y$ be finite dimensional of dimension $n$ and $m$ respectively, and let $\varphi : X \to Y$ be a linear map. Then

(i) if $m < n$, then $\dim \ker \varphi > 0$,
(ii) if $m \ge n$, then $\varphi$ is injective if and only if $\operatorname{Rank} \varphi = n$,
(iii) if $n = m$, then $\varphi$ is injective if and only if $\varphi$ is surjective.

The claim (iii) of Corollary 2.25 is one of the forms of Fredholm's alternative theorem: either $\varphi(x) = y$ is solvable for every $y \in Y$ or $\varphi(x) = 0$ has a nonzero solution.

2.26 Example. A second order linear equation
$$a y'' + b y' + c y = f, \qquad a, b, c \in \mathbb{R}, \quad f \in C^0(\mathbb{R}), \tag{2.4}$$
can be seen as an abstract linear equation $\varphi(y) = f$ by introducing the linear map
$$\varphi : C^2(\mathbb{R}) \to C^0(\mathbb{R}), \qquad y \mapsto a y'' + b y' + c y. \tag{2.5}$$
Since (2.4) has a solution for every $f \in C^0(\mathbb{R})$, see [GM1], $\varphi$ is onto, and the linearity yields the following:

(i) the set of all solutions of the associated homogeneous equation $a y'' + b y' + c y = 0$ is a linear space, actually $\ker \varphi$,
(ii) if $y_f$ is any solution of (2.4), then all solutions of (2.4) are obtained by adding to $y_f$ a solution of the homogeneous equation $a y'' + b y' + c y = 0$, i.e., of $\varphi(y) = 0$. In abstract terms, the set of all solutions of $\varphi(y) = f$ is given by
$$\{y \in C^2(\mathbb{R}) \mid y - y_f \in \ker \varphi\}.$$

Moreover, consider the map $\gamma : \mathbb{R}^2 \to C^2(\mathbb{R})$ that maps each $(\alpha, \beta) \in \mathbb{R}^2$ to the unique solution of the initial value problem
$$a y'' + b y' + c y = 0, \qquad y(0) = \alpha, \quad y'(0) = \beta.$$
It is easy to show that $\gamma : \mathbb{R}^2 \to C^2(\mathbb{R})$ is linear, and by definition, $\operatorname{Im} \gamma = \ker \varphi$, where $\varphi$ is the map in (2.5). Since $\gamma$ is trivially injective, the rank formula yields $\dim \operatorname{Im} \gamma = 2 - 0 = 2$, concluding that the space of solutions of a homogeneous second order ODE is a vector space of dimension 2.

h. Changing coordinates

The coordinates of a vector depend on the chosen coordinate system. Let us discuss how they change. Let $X$ be a vector space of dimension $n$, and let $\mathcal{E} : X \to \mathbb{K}^n$ and $\mathcal{F} : X \to \mathbb{K}^n$ be two coordinate systems on $X$, that we label respectively as the old system and the new system. Denote by $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_n)$ the bases associated respectively to the old coordinate system $\mathcal{E}$ and to the new coordinate system $\mathcal{F}$. The linear map $L := \mathcal{F} \circ \mathcal{E}^{-1} : \mathbb{K}^n \to \mathbb{K}^n$ maps the old $\mathcal{E}$-coordinate vector of $x \in X$ to the new $\mathcal{F}$-coordinate vector of $x$, see Figure 2.5. The matrix $\mathbf{L}$ associated to $L$ is
$$\mathbf{L} := \mathcal{M}^{\mathcal{F}}_{\mathcal{E}}(\mathrm{Id}) = \big[\, \mathcal{F}(e_1) \mid \mathcal{F}(e_2) \mid \dots \mid \mathcal{F}(e_n) \,\big].$$

Figure 2.5. Changing the basis.

We say that $L$ or $\mathbf{L}$ changes coordinates from $\mathcal{E}$ to $\mathcal{F}$. Remember that the $j$th column of $\mathbf{L}$ is the new $\mathcal{F}$-coordinate vector of the $j$th vector of the old basis, i.e., see (2.2), $\mathbf{L} = [L^i_j]$ where
$$e_j = \sum_{i=1}^n L^i_j f_i. \tag{2.6}$$
Let $m : X \to X$ be the linear map defined by $m(e_i) := f_i$ $\forall i = 1, \dots, n$. Then the matrix $\mathbf{M} = [M^i_j]$ associated to $m$ using the basis $(e_1, e_2, \dots, e_n)$ is
$$\mathbf{M} := \big[\, \mathcal{E}(f_1) \mid \mathcal{E}(f_2) \mid \dots \mid \mathcal{E}(f_n) \,\big] \qquad \text{or,} \qquad f_j = \sum_{i=1}^n M^i_j e_i.$$
Therefore, comparing with (2.6), $\mathbf{M} = \mathbf{L}^{-1}$. In conclusion

(i) $\mathbf{L}$ maps the old coordinates to the new coordinates,
(ii) $\mathbf{L}^{-1}$ maps the old basis to the new basis.

Thus $\mathbf{L}$ acts differently on the basis and on the coordinates. We say that the coordinates change in a contravariant way. This is nothing mysterious; for instance, 7000 g = 7 kg: if the new unit of measure is, say, 1000 times the old unit of measure, then we expect that the number of new units associated to a measure will be 1/1000 of the number of old units.

2.27 Example. Suppose we want to change from the canonical basis $(e_1, e_2)$ to a new one given by the vectors $(1, 2)$ and $(3, 4)$ in $\mathbb{R}^2$. The matrix that changes the basis from $e_1 = (1, 0)$, $e_2 = (0, 1)$ into $f_1 = e_1 + 2 e_2$, $f_2 = 3 e_1 + 4 e_2$ is
$$\mathbf{M} = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}.$$
The old coordinates of a vector $P$ can easily be obtained from the new ones, as $P = x e_1 + y e_2 = \alpha f_1 + \beta f_2$, thus, in the old coordinates,
$$\begin{pmatrix} x \\ y \end{pmatrix} = \mathbf{M} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \alpha + 3\beta \\ 2\alpha + 4\beta \end{pmatrix}$$
and, conversely, we have
$$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \mathbf{M}^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} -2 & 3/2 \\ 1 & -1/2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$
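The computation of Example 2.27 can be reproduced numerically. The following is a small sketch with exact rational arithmetic; the helper names `inverse_2x2` and `apply` are ours. The matrix $\mathbf{M}$ takes new coordinates to old ones, and its inverse takes old coordinates to new ones.

```python
from fractions import Fraction

# Columns of M are the old coordinates of the new basis vectors f1 = (1, 2), f2 = (3, 4).
M = [[Fraction(1), Fraction(3)],
     [Fraction(2), Fraction(4)]]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


M_inv = inverse_2x2(M)

new = [Fraction(-1), Fraction(2)]   # (alpha, beta): P = -f1 + 2 f2
old = apply(M, new)                 # (x, y): old coordinates of P
back = apply(M_inv, old)            # new coordinates recovered from the old ones
print(old, back)
```

Indeed $-f_1 + 2 f_2 = (-1 + 6, -2 + 8) = (5, 6)$, and applying $\mathbf{M}^{-1}$ to $(5, 6)$ returns $(-1, 2)$.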

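i. The associated matrix under changes of basis

Let $X$ and $Y$ be two vector spaces of dimension $n$ and $m$. As before, the matrix associated to a linear map depends on the chosen coordinates on $X$ and $Y$. Let $\mathcal{E} : X \to \mathbb{K}^n$, $\mathcal{E}' : X \to \mathbb{K}^n$ be two coordinate systems on $X$, and let $\mathcal{F} : Y \to \mathbb{K}^m$, $\mathcal{F}' : Y \to \mathbb{K}^m$ be two coordinate systems on $Y$. Of course, for every linear map $\ell : X \to Y$ we have $\ell = \mathrm{Id}_Y \circ \ell \circ \mathrm{Id}_X$, consequently
$$\mathcal{M}^{\mathcal{F}'}_{\mathcal{E}'}(\ell) = \mathcal{M}^{\mathcal{F}'}_{\mathcal{F}}(\mathrm{Id})\, \mathcal{M}^{\mathcal{F}}_{\mathcal{E}}(\ell)\, \mathcal{M}^{\mathcal{E}}_{\mathcal{E}'}(\mathrm{Id}),$$
or, in other words, we can state the following.

2.28 Proposition. Given the previous notation, let $\mathbf{R} \in M_{n,n}(\mathbb{K})$ be the matrix that changes coordinates from $\mathcal{E}$ to $\mathcal{E}'$, let $\mathbf{S} \in M_{m,m}(\mathbb{K})$ be the matrix that changes coordinates from $\mathcal{F}$ to $\mathcal{F}'$, and let $\mathbf{A}$, $\mathbf{A}'$ be the matrices that represent $\ell$ respectively in the systems of coordinates $\mathcal{E}$ and $\mathcal{F}$ and in the systems $\mathcal{E}'$ and $\mathcal{F}'$. Then
$$\mathbf{A}' = \mathbf{S} \mathbf{A} \mathbf{R}^{-1}.$$

2.29 Corollary. Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$ and let $A : \mathbb{K}^n \to \mathbb{K}^n$ be the associated linear operator $A(\mathbf{x}) := \mathbf{A}\mathbf{x}$. Let $(\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_n)$ be a basis of $\mathbb{K}^n$. Then the matrix associated to $A$ using the basis $(\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_n)$, both in the source and the target $\mathbb{K}^n$, is the matrix $\mathbf{A}' := \mathbf{S}^{-1} \mathbf{A} \mathbf{S}$ where
$$\mathbf{S} := \big[\, \mathbf{f}_1 \mid \mathbf{f}_2 \mid \dots \mid \mathbf{f}_n \,\big].$$

Proof. Let $(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n)$ be the canonical basis of $\mathbb{K}^n$; then $\mathbf{S}\mathbf{e}_i = \mathbf{f}_i$ are the coordinates of $\mathbf{f}_i$ in the basis $(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n)$, $\mathbf{A}\mathbf{S}\mathbf{e}_i = \mathbf{A}\mathbf{f}_i$ is the coordinate vector of $A(\mathbf{f}_i)$ in canonical coordinates and, finally, $\mathbf{S}^{-1}\mathbf{A}\mathbf{S}\mathbf{e}_i$ is the coordinate vector of $A(\mathbf{f}_i)$ in the $(\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_n)$ basis. $\square$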
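Corollary 2.29 can be tried out on a small case. In the sketch below the operator and the basis are our own choices: $\mathbf{A}$ swaps the two canonical coordinates of $\mathbb{Q}^2$ and the new basis is $\mathbf{f}_1 = (1, 2)$, $\mathbf{f}_2 = (3, 4)$. The helper names are illustrative only.

```python
from fractions import Fraction


def matmul(a, b):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[Fraction(0), Fraction(1)],
     [Fraction(1), Fraction(0)]]      # the operator in canonical coordinates
S = [[Fraction(1), Fraction(3)],
     [Fraction(2), Fraction(4)]]      # columns are f1 and f2

A_new = matmul(inverse_2x2(S), matmul(A, S))  # matrix of the operator in the basis (f1, f2)
print(A_new)
```

The first column of the result is $(-5/2, 3/2)$, and indeed $A(\mathbf{f}_1) = (2, 1) = -\tfrac52 \mathbf{f}_1 + \tfrac32 \mathbf{f}_2$. Trace and determinant are the same for $\mathbf{A}$ and $\mathbf{A}'$.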

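j. The dual space $\mathcal{L}(X, \mathbb{K})$

Linear maps from $X$ into $\mathbb{K}$ play a special role. Let $X$ be a vector space over $\mathbb{K}$. Linear maps from $X$ into $\mathbb{K}$ are also called linear forms or covectors, and the space of linear forms, $\mathcal{L}(X, \mathbb{K})$, also denoted by $X^*$, is called the dual space of $X$.

Suppose $X$ is finite dimensional, $\dim X = n$. Then, as we have seen, $X^*$ has dimension $n$ and, if $(e_1, e_2, \dots, e_n)$ is a basis of $X$, then every linear form $\ell : X \to \mathbb{K}$ is represented by a $1 \times n$ matrix $\mathbf{L}$ that maps the coordinates $\mathbf{x}$ of $x \in X$ to $\ell(x)$, i.e.,
$$\mathbf{L}\mathbf{x} = \ell(x) \qquad \text{if} \qquad x = \sum_{i=1}^n x^i e_i, \quad \mathbf{x} = (x^1, x^2, \dots, x^n),$$
or, $\mathbf{L} = [\, l_1 \mid l_2 \mid \dots \mid l_n \,]$ and
$$\ell(x) = \sum_{i=1}^n l_i x^i \qquad \forall\, \mathbf{x} = (x^1, x^2, \dots, x^n).$$
Consider now the linear maps $e^1, e^2, \dots, e^n : X \to \mathbb{K}$ defined by
$$e^i(e_j) = \delta_{ij} \qquad \forall i, j = 1, \dots, n. \tag{2.7}$$

2.30 Proposition. We have

(i) for $i = 1, \dots, n$, the map $e^i : X \to \mathbb{K}$ maps $x \in X$ to the $i$th coordinate of $x$, so that
$$x = \sum_{i=1}^n e^i(x)\, e_i \qquad \forall x \in X,$$
(ii) $(e^1, e^2, \dots, e^n)$ is a basis of $X^*$,
(iii) if $x = \sum_{i=1}^n x^i e_i \in X$ and $\ell = \sum_{i=1}^n l_i e^i \in X^*$, then $\ell(x) = \sum_{i=1}^n l_i x^i$.

Proof. (i) If $x = \sum_{i=1}^n x^i e_i$, then $e^j(x) = \sum_{i=1}^n x^i e^j(e_i) = x^j$.
(ii) Let $\ell := \sum_{j=1}^n l_j e^j$. Then $\ell(e_i) = l_i$ $\forall i$. Thus, if $\ell(x) = 0$ $\forall x$, we trivially have $l_i = 0$ $\forall i$, i.e., $e^1, e^2, \dots, e^n$ are linearly independent. Since $\dim X^* = n$, they form a basis of $X^*$.
(iii) In fact,
$$\ell(x) = \Big(\sum_{j=1}^n l_j e^j\Big)\Big(\sum_{i=1}^n x^i e_i\Big) = \sum_{i=1}^n \sum_{j=1}^n l_j x^i e^j(e_i) = \sum_{i=1}^n \sum_{j=1}^n l_j x^i \delta_{ij} = \sum_{i=1}^n l_i x^i. \qquad \square$$

The system of linear maps $(e^1, e^2, \dots, e^n)$ characterized by (2.7) is called the dual basis of $(e_1, e_2, \dots, e_n)$ in $X^*$.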
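In $\mathbb{K}^n$, if covectors are written as row vectors, the dual basis of a basis $(\mathbf{f}_1, \dots, \mathbf{f}_n)$ is given by the rows of the inverse of the matrix whose columns are $\mathbf{f}_1, \dots, \mathbf{f}_n$, since (2.7) is exactly the statement that this product is the identity. The following sketch checks this for $n = 2$ with a basis of our choosing; the helper names are illustrative.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix by the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def pair(covector, vector):
    """Evaluation of a covector (row) on a vector (column)."""
    return sum(l * x for l, x in zip(covector, vector))


basis = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]   # f1, f2
columns = [[basis[j][i] for j in range(2)] for i in range(2)]      # matrix with f1, f2 as columns
dual = inverse_2x2(columns)                                        # rows are the dual covectors

# table of the values of the i-th dual covector on the j-th basis vector
table = [[pair(dual[i], basis[j]) for j in range(2)] for i in range(2)]
print(dual, table)
```

The table is the identity matrix, as required by (2.7), and evaluating the first dual covector on any vector returns its first coordinate in the basis $(\mathbf{f}_1, \mathbf{f}_2)$.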

Figure 2.6. Giuseppe Peano (1858-1932) and the frontispiece of his Calcolo Geometrico.

2.31 Remark. Coordinates of vectors or covectors of a vector space X of dimension n are both n-tuples. However, to distinguish them, it is useful to index the coordinates of covectors with lower indices. We can reinforce this notation even more by writing the coordinates of vectors as column vectors and coordinates of covectors as row vectors.

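k. The bidual space

Of course, we may also consider the space of linear forms on $X^*$, denoted by $X^{**}$ and called the bidual of $X$. Every $v \in X$ identifies a linear form in $\mathcal{L}(X^*, \mathbb{K})$, by defining $v^{**} : X^* \to \mathbb{K}$ by $v^{**}(\ell) := \ell(v)$. The map $\gamma : X \to X^{**}$, $x \mapsto \gamma(x) := x^{**}$ we have just defined is linear and injective. Since $\dim X^{**} = \dim X^* = \dim X$, $\gamma$ is surjective, hence an isomorphism, and we call it natural since it does not depend on other structures on $X$, such as the choice of a basis.

Since $X^{**}$ and $X$ are naturally isomorphic, there is a "symmetry" between the two spaces $X$ and $X^*$. To emphasize this symmetry, it is usual to write $\langle \varphi, x \rangle$ instead of $\varphi(x)$ if $\varphi \in X^*$ and $x \in X$, introducing the evaluation map
$$\langle\ ,\ \rangle : X^* \times X \to \mathbb{K}, \qquad \langle \varphi, x \rangle := \varphi(x).$$

2.32 ¶. Let $X$ be a vector space and let $X^*$ be its dual. A duality between $X$ and $X^*$ is a map $\langle\ ,\ \rangle : X^* \times X \to \mathbb{K}$ that is

(i) linear in each factor,
$$\langle \varphi, \alpha x + \beta y \rangle = \alpha \langle \varphi, x \rangle + \beta \langle \varphi, y \rangle, \qquad \langle \alpha\varphi + \beta\psi, x \rangle = \alpha \langle \varphi, x \rangle + \beta \langle \psi, x \rangle,$$
for all $\alpha, \beta \in \mathbb{K}$, $x, y \in X$ and $\varphi, \psi \in X^*$,
(ii) nondegenerate, i.e., if $\langle \varphi, x \rangle = 0$ $\forall x$, then $\varphi = 0$, and if $\langle \varphi, x \rangle = 0$ $\forall \varphi$, then $x = 0$.

Show that the evaluation map $(\varphi, x) \mapsto \langle \varphi, x \rangle$ that evaluates a linear map $\varphi : X \to \mathbb{K}$ at $x \in X$ is a duality.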

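l. Adjoint or dual maps

Let $X$, $Y$ be vector spaces, $X^*$ and $Y^*$ their duals and $\langle\ ,\ \rangle_X$ and $\langle\ ,\ \rangle_Y$ the evaluation maps on $X^* \times X$ and $Y^* \times Y$. For every linear map $\ell : X \to Y$, one also has a map $\ell^* : Y^* \to X^*$ defined by
$$\langle \ell^*(y^*), x \rangle_X := \langle y^*, \ell(x) \rangle_Y \qquad \forall x \in X,\ \forall y^* \in Y^*.$$
It turns out that $\ell^*$ is linear. Now if $(e_1, e_2, \dots, e_n)$ and $(f_1, \dots, f_m)$ are bases in $X$ and $Y$ respectively, and $(e^1, e^2, \dots, e^n)$ and $(f^1, f^2, \dots, f^m)$ are the dual bases in $X^*$ and $Y^*$, then the matrices $\mathbf{L} = [L^h_i] \in M_{m,n}(\mathbb{K})$ and $\mathbf{M} = [M^i_h] \in M_{n,m}(\mathbb{K})$, associated respectively to $\ell$ and $\ell^*$, are defined by
$$\ell(e_i) = \sum_{h=1}^m L^h_i f_h, \qquad \ell^*(f^h) = \sum_{i=1}^n M^i_h e^i.$$
By duality,
$$M^i_h = \langle \ell^*(f^h), e_i \rangle = \langle f^h, \ell(e_i) \rangle = L^h_i,$$
i.e., $\mathbf{M} = \mathbf{L}^T$. Therefore we conclude that if $\mathbf{L}$ is the matrix associated to $\ell$ in a given basis, then $\mathbf{L}^T$ is the matrix associated to $\ell^*$ in the dual basis.

We can now discuss how coordinate changes in $X$ reflect on the dual space. Let $X$ be a vector space of dimension $n$, $X^*$ its dual space, $(\epsilon_1, \dots, \epsilon_n)$, $(e_1, e_2, \dots, e_n)$ two bases on $X$ and $(\epsilon^1, \epsilon^2, \dots, \epsilon^n)$ and $(e^1, e^2, \dots, e^n)$ the corresponding dual bases on $X^*$. Let $\ell : X \to X$ be the linear map defined by $\ell(\epsilon_i) := e_i$ $\forall i = 1, \dots, n$. Then by duality
$$\langle \ell^*(e^i), \epsilon_j \rangle = \langle e^i, \ell(\epsilon_j) \rangle = \langle e^i, e_j \rangle = \delta_{ij} = \langle \epsilon^i, \epsilon_j \rangle \qquad \forall i, j,$$
i.e., $\ell^*(e^i) = \epsilon^i$ $\forall i = 1, \dots, n$. If $\mathbf{L} = [L^i_j]$ and $\mathbf{L}^T$ are the matrices associated to $\ell$ and $\ell^*$, so that $e_j = \sum_{i=1}^n L^i_j \epsilon_i$, then $\mathbf{L}$ changes basis from $(\epsilon_1, \epsilon_2, \dots, \epsilon_n)$ to $(e_1, e_2, \dots, e_n)$ in $X$, and $\mathbf{L}^T$ changes basis in the dual space from $(e^1, e^2, \dots, e^n)$ to $(\epsilon^1, \epsilon^2, \dots, \epsilon^n)$.

Now if $\varphi \in X^*$, we have $\varphi = \sum_{i=1}^n a_i e^i = \sum_{i=1}^n b_i \epsilon^i$, hence
$$a_j = \langle \varphi, e_j \rangle = \sum_{i=1}^n b_i \Big\langle \epsilon^i, \sum_{h=1}^n L^h_j \epsilon_h \Big\rangle = \sum_{i=1}^n \sum_{h=1}^n b_i L^h_j \delta_{ih} = \sum_{i=1}^n b_i L^i_j.$$
Thus, if $\mathbf{a} := (a_1, a_2, \dots, a_n)$, $\mathbf{b} := (b_1, b_2, \dots, b_n)$, we have
$$\mathbf{a}^T = \mathbf{L}^T \mathbf{b}^T \qquad \text{or} \qquad \mathbf{a} = \mathbf{b}\mathbf{L}.$$
In other words, the coordinates in $X^*$ change according to the change of basis. We say that the change of coordinates in $X^*$ is covariant.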

2.2 Eigenvectors and Similar Matrices

Let $A : X \to X$ be a linear operator on a vector space. How can we describe the properties of $A$ that are invariant under isomorphisms? Since isomorphisms amount to changing basis, we can put it in another way. Suppose $X$ is finite dimensional, $\dim X = n$; then we may consider the matrix $\mathbf{A}$ associated to $A$ using a basis (we use the same basis both in the source and the target $X$). But how can we catch the properties of $\mathbf{A}$ that are independent of the basis? One possibility is to try to choose an "optimal" basis in which, say, the matrix $\mathbf{A}$ takes the simplest form. As we have seen, if we choose two coordinate systems $\mathcal{E}$ and $\mathcal{F}$ on $X$, and $\mathbf{S}$ is the matrix that changes coordinates from $\mathcal{E}$ to $\mathcal{F}$, then the matrices $\mathbf{A}$ and $\mathbf{B}$ that represent $A$ respectively in the basis $\mathcal{E}$ and $\mathcal{F}$ are related by
$$\mathbf{B} = \mathbf{S}\mathbf{A}\mathbf{S}^{-1}.$$
Therefore we are asking for a nonsingular matrix $\mathbf{S}$ such that $\mathbf{S}^{-1}\mathbf{A}\mathbf{S}$ has the simplest possible form: this is the problem of reducing a matrix to a canonical form.

Let us try to make the meaning of "simplest" for a matrix more precise. Suppose that in $X$ there are two supplementary invariant subspaces under $A$,
$$X = W_1 \oplus W_2, \qquad A(W_1) \subset W_1, \quad A(W_2) \subset W_2.$$
Then every $x \in X$ splits uniquely as $x = x_1 + x_2$ with $x_1 \in W_1$, $x_2 \in W_2$, and $A(x) = A(x_1) + A(x_2)$ with $A(x_1) \in W_1$ and $A(x_2) \in W_2$. In other words, $A$ splits into two operators $A_1 : W_1 \to W_1$, $A_2 : W_2 \to W_2$ that are the restrictions of $A$ to $W_1$ and $W_2$. Now suppose that $\dim X = n$ and let $(e_1, e_2, \dots, e_k)$ and $(f_1, f_2, \dots, f_{n-k})$ be two bases respectively of $W_1$ and $W_2$. Then the matrix associated to $A$ in the basis $(e_1, e_2, \dots, e_k, f_1, f_2, \dots, f_{n-k})$ of $X$ has the form
$$\mathbf{A} = \begin{pmatrix} \mathbf{A}_1 & 0 \\ 0 & \mathbf{A}_2 \end{pmatrix}$$
where some of the entries are zero. If we pursue this approach, the optimum would be the decomposition of $X$ into $n$ supplementary invariant subspaces $W_1, W_2, \dots, W_n$ under $A$ of dimension 1,
$$X = W_1 \oplus W_2 \oplus \dots \oplus W_n, \qquad A(W_i) \subset W_i.$$
In this case, $A$ acts on each $W_i$ as a dilation: $A(x) = \lambda_i x$ $\forall x \in W_i$ for some $\lambda_i \in \mathbb{K}$. Moreover, if $(e_1, e_2, \dots, e_n)$ is a basis of $X$ such that $e_i \in W_i$ for each $i$, then the matrix associated to $A$ in this basis is the diagonal matrix
$$\mathbf{A} = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_n).$$

2.2.1 Eigenvectors

a. Eigenvectors and eigenvalues

As usual, $\mathbb{K}$ denotes the field $\mathbb{R}$ or $\mathbb{C}$.

2.33 Definition. Let $A : X \to X$ be a linear operator on a vector space $X$ over $\mathbb{K}$. We say that $x \in X$ is an eigenvector of $A$ if $A x = \lambda x$ for some $\lambda \in \mathbb{K}$. If $x$ is a nonzero eigenvector, the number $\lambda$ for which $A(x) = \lambda x$ is called an eigenvalue of $A$, or more precisely, the eigenvalue of $A$ relative to $x$. The set of eigenvalues of $A$ is called the spectrum of $A$. If $\mathbf{A} \in M_{n,n}(\mathbb{K})$, we refer to eigenvalues and eigenvectors of the associated linear operator $A : \mathbb{K}^n \to \mathbb{K}^n$, $A(\mathbf{x}) := \mathbf{A}\mathbf{x}$, as the eigenvalues and the eigenvectors of $\mathbf{A}$.

From the definition, $\lambda$ is an eigenvalue of $A$ if and only if $\ker(\lambda\,\mathrm{Id} - A) \ne \{0\}$, equivalently, if and only if $\lambda\,\mathrm{Id} - A$ is not invertible. If $\lambda$ is an eigenvalue, the subspace of all eigenvectors with eigenvalue $\lambda$,
$$V_\lambda := \{x \in X \mid A(x) = \lambda x\} = \ker(\lambda\,\mathrm{Id} - A),$$
is called the eigenspace of $A$ relative to $\lambda$.

2.34 Example. Let $X$ be the linear space of functions in $C^\infty([0, \pi])$ that vanish at $0$ and $\pi$ and let $D^2 : X \to X$ be the linear operator $D^2(f) := f''$ that maps every function $f$ into its second derivative. Nonzero eigenvectors of the operator $D^2$, that is, the nonidentically zero functions $y \in X$ such that $D^2 y(x) = \lambda y(x)$ for some $\lambda \in \mathbb{R}$, are called eigenfunctions.

2.35 Example. Let $X$ be the set $P_n$ of polynomials of degree less than $n$. Then each $P_k \subset P_n$, $k = 0, \dots, n$, is an invariant subspace for the operator of differentiation. This operator has zero as its unique eigenvalue.

2.36 ¶. Show that the rotation in $\mathbb{R}^2$ by an angle $\theta$ has no nonzero eigenvectors if $\theta \ne 0, \pi$, since in this case there are no invariant lines.

2.37 Definition. Let $A : X \to X$ be a linear operator on $X$. A subspace $W \subset X$ is invariant (under $A$) if $A(W) \subset W$.

In the following proposition we collect some simple properties of eigenvectors.

2.38 Proposition. Let $A : X \to X$ be a linear operator on $X$.

(i) $x \ne 0$ is an eigenvector if and only if $\operatorname{Span}\{x\}$ is an invariant subspace under $A$.
(ii) Let $\lambda$ be an eigenvalue of $A$ and let $V_\lambda$ be the corresponding eigenspace. Then every subspace $W \subset V_\lambda$ is an invariant subspace under $A$, i.e., $A(W) \subset W$.
(iii) $\dim \ker(\lambda\,\mathrm{Id} - A) > 0$ if and only if $\lambda$ is an eigenvalue for $A$.
(iv) Let $W \subset X$ be an invariant subspace under $A$ and let $\lambda$ be an eigenvalue for $A_{|W}$. Then $\lambda$ is an eigenvalue for $A : X \to X$.
(v) $\lambda$ is an eigenvalue for $A$ if and only if $0$ is an eigenvalue for $\lambda\,\mathrm{Id} - A$.
(vi) Let $\varphi : X \to Y$ be an isomorphism and let $A : X \to X$ be an operator. Then $x \in X$ is an eigenvector for $A$ if and only if $\varphi(x)$ is an eigenvector for $\varphi \circ A \circ \varphi^{-1}$, and $x$ and $\varphi(x)$ have the same eigenvalue.
(vii) Nonzero eigenvectors with different eigenvalues are linearly independent.

Proof. (i), ..., (vi) are trivial. To prove (vii) we proceed by induction on the number $k$ of eigenvectors. For $k = 1$ the claim is trivial. Now assume by induction that the claim holds for $k - 1$ nonzero eigenvectors, and let $e_1, e_2, \dots, e_k$ be such that $e_j \ne 0$ $\forall j = 1, \dots, k$, $A(e_j) = \lambda_j e_j$ $\forall j = 1, \dots, k$ with $\lambda_j \ne \lambda_i$ $\forall i \ne j$. Let
$$a_1 e_1 + a_2 e_2 + \dots + a_k e_k = 0 \tag{2.8}$$
be a vanishing linear combination of $e_1, e_2, \dots, e_k$. From (2.8), multiplying by $\lambda_1$ and applying $A$, we get
$$a_1 \lambda_1 e_1 + a_2 \lambda_1 e_2 + \dots + a_k \lambda_1 e_k = 0,$$
$$a_1 \lambda_1 e_1 + a_2 \lambda_2 e_2 + \dots + a_k \lambda_k e_k = 0,$$
consequently
$$\sum_{j=2}^k (\lambda_j - \lambda_1)\, a_j e_j = 0.$$
By the inductive assumption, $a_j(\lambda_j - \lambda_1) = 0$ $\forall j = 2, \dots, k$, hence $a_j = 0$ for all $j \ge 2$. We then conclude from (2.8) that we also have $a_1 = 0$, i.e., that $e_1, e_2, \dots, e_k$ are linearly independent. $\square$

Let $A : X \to X$ be a linear operator on $X$ of dimension $n$, and let $\mathbf{A}$ be the associated matrix in a coordinate system $\mathcal{E} : X \to \mathbb{K}^n$. Then (vi) implies that $x \in X$ is an eigenvector of $A$ if and only if $\mathbf{x} := \mathcal{E}(x)$ is an eigenvector for $\mathbf{x} \mapsto \mathbf{A}\mathbf{x}$, and $x$ and $\mathbf{x}$ have the same eigenvalue. From (vii) of Proposition 2.38 we infer the following.

2.39 Corollary. Let $A : X \to X$ be a linear operator on a vector space $X$ of dimension $n$. If $A$ has $n$ different eigenvalues, then $X$ has a basis formed by eigenvectors of $A$.

b. Similar matrices

Let $A : X \to X$ be a linear operator on a vector space $X$ of dimension $n$. As we have seen, if we fix a basis, we can represent $A$ by an $n \times n$ matrix. If $\mathbf{A}$ and $\mathbf{A}' \in M_{n,n}(\mathbb{K})$ are two such matrices that represent $A$ in two different bases $(e_1, e_2, \dots, e_n)$ and $(\epsilon_1, \epsilon_2, \dots, \epsilon_n)$, then by Proposition 2.28
$$\mathbf{A}' = \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$$
where $\mathbf{S}$ is the matrix that changes basis from $(e_1, e_2, \dots, e_n)$ to $(\epsilon_1, \epsilon_2, \dots, \epsilon_n)$.

2.40 Definition. Two matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{K})$ are said to be similar if there exists $\mathbf{S} \in GL(n, \mathbb{K})$ such that $\mathbf{B} = \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$.

It turns out that the similarity relation is an equivalence relation on matrices, thus $n \times n$ matrices are partitioned into classes of similar matrices. Since matrices associated to a linear operator $A : X \to X$, $\dim X = n$, are similar, we can associate to $A$ a unique class of similar matrices. It follows that if a property is preserved by similarity equivalence, then it can be taken as a property of the linear operator to which the class is referred. For instance, let $A : X \to X$ be a linear operator, and let $\mathbf{A}$, $\mathbf{B}$ be such that $\mathbf{B} = \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$. By Binet's formula, we have
$$\det \mathbf{B} = \det \mathbf{S}^{-1} \det \mathbf{A} \det \mathbf{S} = \frac{1}{\det \mathbf{S}} \det \mathbf{A} \det \mathbf{S} = \det \mathbf{A}.$$
Thus we may define the determinant of the linear map $A : X \to X$ by
$$\det A := \det \mathbf{A}$$
where $\mathbf{A}$ is any matrix associated to $A$.

c. The characteristic polynomial

Let $X$ be a vector space of dimension $n$, and let $A : X \to X$ be a linear operator. The function
$$\lambda \mapsto p_A(\lambda) := \det(\lambda\,\mathrm{Id} - A), \qquad \lambda \in \mathbb{K},$$
is called the characteristic polynomial of $A$. It can be computed by representing $A$ by a matrix $\mathbf{A}$ in a coordinate system and computing $p_A(\lambda)$ as the characteristic polynomial of any of the matrices $\mathbf{A}$ representing $A$,
$$p_A(\lambda) = p_{\mathbf{A}}(\lambda) = \det(\lambda\,\mathrm{Id} - \mathbf{A}).$$
In particular, it follows that $p_A(\cdot) : \mathbb{K} \to \mathbb{K}$ is a polynomial in $\lambda$ of degree $n$, and that the roots of $p_A(\lambda)$ are the eigenvalues of $A$ or of $\mathbf{A}$. Moreover, we can state

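2.41 Proposition. We have the following.

(i) Two similar matrices $\mathbf{A}$, $\mathbf{B}$ have the same eigenvalues and the same characteristic polynomials.
(ii) If $\mathbf{A}$ has the form
$$\mathbf{A} = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix}$$
where for $i = 1, \dots, k$, each block $\mathbf{A}_i$ is a square matrix of dimension $k_i$ with principal diagonal on the principal diagonal of $\mathbf{A}$, then
$$p_{\mathbf{A}}(s) = p_{\mathbf{A}_1}(s) \cdot p_{\mathbf{A}_2}(s) \cdots p_{\mathbf{A}_k}(s).$$
(iii) We have
$$\det(s\,\mathrm{Id} - \mathbf{A}) = s^n - \operatorname{tr} \mathbf{A}\, s^{n-1} + \dots + (-1)^n \det \mathbf{A} = s^n + \sum_{k=1}^n (-1)^k a_k s^{n-k}$$
where $\operatorname{tr} \mathbf{A} := \sum_{i=1}^n A^i_i$ is the trace of the matrix $\mathbf{A}$, and $a_k$ is the sum of the determinants of the $k \times k$ submatrices of $\mathbf{A}$ with principal diagonal on the principal diagonal of $\mathbf{A}$.

Proof. (i) If $\mathbf{B} = \mathbf{S}\mathbf{A}\mathbf{S}^{-1}$, $\mathbf{S} \in GL(n, \mathbb{K})$, then $s\,\mathrm{Id} - \mathbf{B} = \mathbf{S}(s\,\mathrm{Id} - \mathbf{A})\mathbf{S}^{-1}$, hence
$$\det(s\,\mathrm{Id} - \mathbf{B}) = \det \mathbf{S}\, \det(s\,\mathrm{Id} - \mathbf{A})\, (\det \mathbf{S})^{-1} = \det(s\,\mathrm{Id} - \mathbf{A}).$$
(ii) The matrix $s\,\mathrm{Id} - \mathbf{A}$ is a block matrix of the same form,
$$s\,\mathrm{Id} - \mathbf{A} = \begin{pmatrix} s\,\mathrm{Id} - \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & s\,\mathrm{Id} - \mathbf{A}_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & s\,\mathrm{Id} - \mathbf{A}_k \end{pmatrix},$$
hence $\det(s\,\mathrm{Id} - \mathbf{A}) = \prod_{i=1}^k \det(s\,\mathrm{Id} - \mathbf{A}_i)$.
(iii) We leave it to the reader. $\square$

Notice that there exist matrices with the same eigenvalues that are not similar, see Exercise 2.73.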
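The formula in Proposition 2.41 (iii) gives a direct, if inefficient, way to compute the characteristic polynomial from principal minors. The following sketch does so for small matrices with integer or rational entries; the function names `det` and `char_poly_coeffs` are ours, and the Laplace expansion used for the determinant is only suitable for small sizes.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def char_poly_coeffs(a):
    """Coefficients [c_0, ..., c_n] of det(s Id - A) = sum_k c_k s^(n-k).

    c_0 = 1 and c_k = (-1)^k a_k, where a_k is the sum of the k x k principal minors.
    """
    n = len(a)
    coeffs = [1]
    for k in range(1, n + 1):
        a_k = sum(det([[a[i][j] for j in idx] for i in idx])
                  for idx in combinations(range(n), k))
        coeffs.append((-1) ** k * a_k)
    return coeffs


A = [[2, 1, 0],
     [1, 2, 0],
     [0, 0, 3]]
coeffs = char_poly_coeffs(A)
print(coeffs)  # coefficients of s^3, s^2, s, 1
```

For this matrix $a_1 = \operatorname{tr}\mathbf{A} = 7$, $a_2 = 3 + 6 + 6 = 15$ and $a_3 = \det \mathbf{A} = 9$, so that $p_{\mathbf{A}}(s) = s^3 - 7s^2 + 15s - 9 = (s - 3)^2(s - 1)$.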

d. Algebraic and geometric multiplicity

2.42 Definition. Let $A : X \to X$ be a linear operator, and let $\lambda \in \mathbb{K}$ be an eigenvalue of $A$. We say that $\lambda$ has geometric multiplicity $k \in \mathbb{N}$ if $\dim \ker(\lambda\,\mathrm{Id} - A) = k$. Let $p_A(s)$ be the characteristic polynomial of $A$. We say that $\lambda$ has algebraic multiplicity $k$ if
$$p_A(s) = (s - \lambda)^k q(s), \qquad \text{where } q(\lambda) \ne 0.$$

2.43 Proposition. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$ and let $\lambda$ be an eigenvalue of $A$ of algebraic multiplicity $m$. Then $\dim \ker(\lambda\,\mathrm{Id} - A) \le m$.

Proof. Let us choose a basis $(e_1, e_2, \dots, e_n)$ in $X$ such that $(e_1, e_2, \dots, e_k)$ is a basis for $V_\lambda := \ker(\lambda\,\mathrm{Id} - A)$. The matrix $\mathbf{A}$ associated to $A$ in this basis has the form
$$\mathbf{A} = \begin{pmatrix} \lambda\,\mathrm{Id} & \mathbf{C} \\ 0 & \mathbf{D} \end{pmatrix}$$
where the first block, $\lambda\,\mathrm{Id}$, is a $k \times k$ matrix of dimension $k = \dim V_\lambda$. Thus Proposition 2.41 (ii) yields $p_{\mathbf{A}}(s) = \det(s\,\mathrm{Id} - \mathbf{A}) = (s - \lambda)^k p_{\mathbf{D}}(s)$, and the multiplicity of $\lambda$ is at least $k$. $\square$
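The inequality of Proposition 2.43 can be strict. The sketch below compares the two multiplicities for $2 \times 2$ matrices with rational entries; the example matrices and the function names are ours. The geometric multiplicity is computed as $n - \operatorname{Rank}(\lambda\,\mathrm{Id} - \mathbf{A})$ and the algebraic one by dividing the characteristic polynomial repeatedly by $s - \lambda$.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def geometric_multiplicity(a, lam):
    """dim ker(lam Id - A) = n - rank(lam Id - A)."""
    n = len(a)
    shifted = [[(lam if i == j else 0) - a[i][j] for j in range(n)] for i in range(n)]
    return n - rank(shifted)


def algebraic_multiplicity(coeffs, lam):
    """Number of times (s - lam) divides the polynomial (highest degree first)."""
    k = 0
    while len(coeffs) > 1:
        quotient = [coeffs[0]]
        for c in coeffs[1:]:
            quotient.append(c + lam * quotient[-1])  # synthetic division step
        if quotient[-1] != 0:                        # last entry is the remainder
            break
        coeffs = quotient[:-1]
        k += 1
    return k


def char_poly_2x2(a):
    """Coefficients of s^2 - (tr A) s + det A."""
    return [1, -(a[0][0] + a[1][1]), a[0][0] * a[1][1] - a[0][1] * a[1][0]]


J = [[1, 1], [0, 1]]   # not diagonizable
I = [[1, 0], [0, 1]]   # the identity
for name, a in (("J", J), ("I", I)):
    print(name, geometric_multiplicity(a, 1), algebraic_multiplicity(char_poly_2x2(a), 1))
```

Both matrices have $p(s) = (s - 1)^2$, hence algebraic multiplicity 2 for $\lambda = 1$, but the eigenspace of the first one is only one-dimensional.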

e. Diagonizable matrices

2.44 Definition. We say that $\mathbf{A} \in M_{n,n}(\mathbb{K})$ is diagonizable if $\mathbf{A}$ is similar to a diagonal matrix.

2.45 Theorem. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$, and let $(e_1, e_2, \dots, e_n)$ be a basis of $X$. The following claims are equivalent.

(i) $e_1, e_2, \dots, e_n$ are eigenvectors of $A$ and $\lambda_1, \lambda_2, \dots, \lambda_n$ are the relative eigenvalues.

(ii) We have $A(x) = \sum_{i=1}^n \lambda_i x^i e_i$ for all $x \in X$ if $x = \sum_{i=1}^n x^i e_i$.

(iii) The matrix that represents $A$ in the basis $(e_1, e_2, \dots, e_n)$ is $\mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$.

(iv) If $\mathbf{A}$ is the matrix associated to $A$ in the basis $(f_1, f_2, \dots, f_n)$, then $\mathbf{S}^{-1}\mathbf{A}\mathbf{S} = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$ where $\mathbf{S}$ is the matrix that changes basis from $(f_1, f_2, \dots, f_n)$ to $(e_1, e_2, \dots, e_n)$, i.e., the $i$th column of $\mathbf{S}$ is the coordinate vector of the eigenvector $e_i$ in the basis $(f_1, f_2, \dots, f_n)$.

Proof. (i) $\Leftrightarrow$ (ii) by linearity and (iii) $\Leftrightarrow$ (i) since (iii) is equivalent to $A(e_i) = \lambda_i e_i$. Finally (iii) $\Leftrightarrow$ (iv) by Corollary 2.29. □

2.46 Corollary. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$. Then the following claims are equivalent.

(i) $X$ splits as the direct sum of $n$ one-dimensional invariant subspaces (under $A$), $X = W_1 \oplus \dots \oplus W_n$.

(ii) $X$ has a basis made of eigenvectors of $A$.

(iii) Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be all distinct eigenvalues of $A$, and let $V_{\lambda_1}, \dots, V_{\lambda_k}$ be the corresponding eigenspaces. Then $\sum_{i=1}^k \dim V_{\lambda_i} = n$.

(iv) If $\mathbf{A}$ is the matrix associated to $A$ in a basis, then $\mathbf{A}$ is diagonizable.

Proof. (i) implies (ii) since any nonzero vector in any of the $W_i$'s is an eigenvector. Denoting by $(e_1, e_2, \dots, e_n)$ a basis of eigenvectors, the spaces $W_i := \mathrm{Span}\{e_i\}$ are supplementary spaces of dimension one, hence (ii) implies (i). (iii) is a rewriting of (i) since for each eigenvalue $\lambda$, $V_\lambda$ is the direct sum of the $W_i$'s that have $\lambda$ as the corresponding eigenvalue. Finally (ii) and (iv) are equivalent by Theorem 2.45. □

2.47 Linear equations. The existence of a basis of $X$ of eigenvectors of an operator $A : X \to X$ makes solving the linear equation $A(x) = y$ trivial. Let $(e_1, e_2, \dots, e_n)$ be a basis of $X$ of eigenvectors of $A$ and let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the corresponding eigenvalues. Writing $x, y \in X$ in this basis,

$$x = \sum_{i=1}^n x^i e_i, \qquad y = \sum_{i=1}^n y^i e_i,$$

we rewrite the equation $A(x) = y$ as

$$\sum_{i=1}^n (\lambda_i x^i - y^i)\, e_i = 0,$$

i.e., as the diagonal system

$$\lambda_1 x^1 = y^1, \quad \dots, \quad \lambda_n x^n = y^n.$$

Therefore

(i) suppose that 0 is not an eigenvalue, then $A(x) = y$ has a unique solution

$$x = \sum_{i=1}^n \frac{y^i}{\lambda_i}\, e_i.$$

(ii) let 0 be an eigenvalue, and let $V_0 = \mathrm{Span}\{e_1, e_2, \dots, e_k\}$. Then $A(x) = y$ is solvable if and only if $y^1 = \dots = y^k = 0$, and a solution of $A(x) = y$ is $x_0 := \sum_{i=k+1}^n \frac{y^i}{\lambda_i}\, e_i$. By linearity, the space of all solutions is the set

$$\{\, x \in X \mid x - x_0 \in \ker A = V_0 \,\}.$$
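The two cases above translate directly into a coordinate-wise procedure. The sketch below assumes that $x$ and $y$ are already written in a basis of eigenvectors; the function name `solve_in_eigenbasis` and its return convention are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def solve_in_eigenbasis(eigenvalues, y_coords):
    """Solve the diagonal system lambda_i * x^i = y^i.

    Returns None when the equation is not solvable; otherwise returns a pair
    (x0, free) where x0 is a particular solution and free lists the indices
    of the coordinates along ker A, which may be chosen arbitrarily."""
    x0, free = [], []
    for i, (lam, yi) in enumerate(zip(eigenvalues, y_coords)):
        if lam != 0:
            x0.append(Fraction(yi) / Fraction(lam))
        elif yi != 0:
            return None  # a component of y along e_i cannot be reached
        else:
            x0.append(Fraction(0))
            free.append(i)
    return x0, free

# Case (i): 0 is not an eigenvalue, the solution is unique.
print(solve_in_eigenbasis([2, 3, 5], [4, 9, 10]))
# Case (ii): 0 is an eigenvalue; solvable only if y has no component in V_0.
print(solve_in_eigenbasis([0, 2], [0, 6]))
print(solve_in_eigenbasis([0, 2], [1, 6]))
```

When `free` is nonempty, the full solution set is obtained by adding to `x0` any combination of the eigenvectors with those indices, in agreement with $x - x_0 \in \ker A$.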

2.48 ¶. Let $A : X \to X$ be a linear operator on a finite-dimensional space. Show that $A$ is invertible if and only if 0 is not an eigenvalue for $A$. In this case show that $1/\lambda$ is an eigenvalue for $A^{-1}$ if and only if $\lambda$ is an eigenvalue for $A$.

f. Triangularizable matrices

First, we notice that the eigenvalues of a triangular matrix are the entries of the principal diagonal. We can then state the following.

2.49 Theorem. Let $\mathbf{A} \in M_{n,n}(\mathbb{K})$. If the characteristic polynomial decomposes as a product of factors of first degree, i.e., if there are (not necessarily distinct) numbers $\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{K}$ such that

$$\det(\lambda\,\mathrm{Id} - \mathbf{A}) = (\lambda - \lambda_1) \cdots (\lambda - \lambda_n),$$

then $\mathbf{A}$ is similar to an upper triangular matrix.

Proof. Let us prove the following equivalent claim. Let $A : X \to X$ be a linear operator on a vector space of dimension $n$. If $p_A(\lambda)$ factorizes as a product of factors of first degree, then there exists a basis $(u_1, u_2, \dots, u_n)$ of $X$ such that

$$\mathrm{Span}\{u_1\},\ \mathrm{Span}\{u_1, u_2\},\ \mathrm{Span}\{u_1, u_2, u_3\},\ \dots,\ \mathrm{Span}\{u_1, u_2, \dots, u_n\}$$

are invariant subspaces under $A$. In this case we have for the linear operator $A(x) = \mathbf{A}x$ associated to $\mathbf{A}$

$$\begin{cases} \mathbf{A}u_1 = A(u_1) = a^1_1 u_1, \\ \mathbf{A}u_2 = A(u_2) = a^1_2 u_1 + a^2_2 u_2, \\ \quad\vdots \\ \mathbf{A}u_n = A(u_n) = a^1_n u_1 + a^2_n u_2 + \dots + a^n_n u_n, \end{cases}$$

i.e., the matrix $\mathbf{A}'$ associated to $A$ using the basis $(u_1, u_2, \dots, u_n)$ is upper triangular,

$$\mathbf{A}' = \begin{pmatrix} a^1_1 & a^1_2 & \dots & a^1_n \\ 0 & a^2_2 & \dots & a^2_n \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a^n_n \end{pmatrix}.$$

We proceed by induction on the dimension of $X$. If $\dim X = 1$, the claim is trivial. Now suppose that the claim holds for any linear operator on a vector space of dimension $n - 1$, and let us prove the claim for $A$. From

$$p_A(\lambda) = \det(\lambda\,\mathrm{Id} - \mathbf{A}) = (\lambda - \lambda_1) \cdots (\lambda - \lambda_n),$$

$\lambda_1$ is an eigenvalue of $A$, hence there is a corresponding nonzero eigenvector $u_1$, and $\mathrm{Span}\{u_1\}$ is an invariant subspace under $A$. Now we complete $\{u_1\}$ as a basis by adding vectors $v_2, \dots, v_n$, and let $B$ be the restriction of the operator $A$ to $\mathrm{Span}\{v_2, \dots, v_n\}$. Let $\mathbf{B}$ be the matrix associated to $B$ in the basis $(v_2, \dots, v_n)$, and let $\mathbf{A}'$ be the matrix associated to $A$ in the basis $(u_1, v_2, \dots, v_n)$. Then

$$\mathbf{A}' = \begin{pmatrix} \alpha_1 & * \\ 0 & \mathbf{B} \end{pmatrix} \qquad \text{where } \alpha_1 = \lambda_1.$$

Thus

$$p_A(\lambda) = p_{\mathbf{A}'}(\lambda) = (\lambda - \lambda_1)\, p_{\mathbf{B}}(\lambda) = (\lambda - \lambda_1)\, p_B(\lambda).$$

It follows that the characteristic polynomial of $B$ is $p_B(\lambda) = (\lambda - \lambda_2) \cdots (\lambda - \lambda_n)$. By the inductive hypothesis, there exists a basis $(u_2, \dots, u_n)$ of $\mathrm{Span}\{v_2, \dots, v_n\}$ such that

$$\mathrm{Span}\{u_2\},\ \mathrm{Span}\{u_2, u_3\},\ \dots,\ \mathrm{Span}\{u_2, \dots, u_n\}$$

are invariant subspaces under $B$, hence

$$\mathrm{Span}\{u_1\},\ \mathrm{Span}\{u_1, u_2\},\ \mathrm{Span}\{u_1, u_2, u_3\},\ \dots,\ \mathrm{Span}\{u_1, u_2, \dots, u_n\}$$

are invariant subspaces under $A$. □

2.2.2 Complex matrices

When $\mathbb{K} = \mathbb{C}$, a significant simplification arises. Because of the fundamental theorem of algebra, the characteristic polynomial $p_A(\lambda)$ of every linear operator $A : X \to X$ over a complex vector space $X$ of dimension $n$ factorizes as a product of $n$ factors of first degree. In particular, $A$ has $n$ eigenvalues, if we count them with their multiplicities. From Theorem 2.49 we conclude the following at once.

2.50 Corollary. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ be a complex matrix. Then $\mathbf{A}$ is similar to an upper triangular matrix, that is, there exists a nonsingular matrix $\mathbf{S} \in M_{n,n}(\mathbb{C})$ such that $\mathbf{S}^{-1}\mathbf{A}\mathbf{S}$ is upper triangular.

Moreover,

2.51 Corollary. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ be a matrix. Then $\mathbf{A}$ is diagonizable (as a complex matrix) if and only if the geometric and algebraic multiplicities of each eigenvalue agree.

Proof. Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of $\mathbf{A}$; for each $i = 1, \dots, k$, let $m_i$ and $V_{\lambda_i}$ respectively be the algebraic multiplicity and the eigenspace of $\lambda_i$. If $\dim V_{\lambda_i} = m_i$ $\forall i$, then by the fundamental theorem of algebra

$$\sum_{i=1}^k \dim V_{\lambda_i} = \sum_{i=1}^k m_i = n.$$

Hence $\mathbf{A}$ is diagonizable, by Corollary 2.46. Conversely, if $\mathbf{A}$ is diagonizable, then $\sum_{i=1}^k \dim V_{\lambda_i} = n$; moreover, by Proposition 2.43, $\dim V_{\lambda_i} \le m_i$, hence

$$n = \sum_{i=1}^k m_i \ge \sum_{i=1}^k \dim V_{\lambda_i} = n,$$

which forces $\dim V_{\lambda_i} = m_i$ for every $i$. □

2.52 Remark (Real and complex eigenvalues). If $\mathbf{A} \in M_{n,n}(\mathbb{R})$, its eigenvalues are by definition the real solutions of the polynomial equation $\det(\lambda\,\mathrm{Id} - \mathbf{A}) = 0$. But $\mathbf{A}$ is also a matrix with complex entries, $\mathbf{A} \in M_{n,n}(\mathbb{C})$, and as such its eigenvalues are the complex solutions of $\det(\lambda\,\mathrm{Id} - \mathbf{A}) = 0$. It is customary to call eigenvalues of $\mathbf{A}$ the complex solutions of $\det(\lambda\,\mathrm{Id} - \mathbf{A}) = 0$ even if $\mathbf{A}$ has real entries, while the real solutions of the same equation, which are the eigenvalues of the real matrix $\mathbf{A}$ following Definition 2.33, are called real eigenvalues.

The further developments we want to discuss depend on some relationships among polynomials and matrices that we now want to illustrate.

a. The Cayley-Hamilton theorem

Given a polynomial $f(t) = \sum_{k=0}^n a_k t^k$, to every $n \times n$ matrix $\mathbf{A}$ we can associate a new matrix $f(\mathbf{A})$ defined by

$$f(\mathbf{A}) := a_0\,\mathrm{Id} + \sum_{k=1}^n a_k \mathbf{A}^k =: \sum_{k=0}^n a_k \mathbf{A}^k$$

if we set $\mathbf{A}^0 := \mathrm{Id}$. It is easily seen that, if a polynomial $f(t)$ factors as $f(t) = p(t)q(t)$, then the matrices $p(\mathbf{A})$ and $q(\mathbf{A})$ commute, and we have $f(\mathbf{A}) = p(\mathbf{A})q(\mathbf{A}) = q(\mathbf{A})p(\mathbf{A})$.

2.53 Proposition. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$, and let $p(t)$ be a polynomial. Then

(i) if $\lambda$ is an eigenvalue of $\mathbf{A}$, then $p(\lambda)$ is an eigenvalue of $p(\mathbf{A})$,

(ii) if $\mu$ is an eigenvalue of $p(\mathbf{A})$, then $\mu = p(\lambda)$ for some eigenvalue $\lambda$ of $\mathbf{A}$.

Proof. (i) follows observing that $\lambda^k$, $k \in \mathbb{N}$, is an eigenvalue of $\mathbf{A}^k$ if $\lambda$ is an eigenvalue of $\mathbf{A}$. (ii) Since $\mu$ is an eigenvalue of $p(\mathbf{A})$, the matrix $p(\mathbf{A}) - \mu\,\mathrm{Id}$ is singular. Let $p(t) = \sum_{i=0}^k a_i t^i$ be of degree $k$, $a_k \neq 0$. By the fundamental theorem of algebra we have

$$p(t) - \mu = a_k \prod_{i=1}^k (t - r_i),$$

hence $p(\mathbf{A}) - \mu\,\mathrm{Id} = a_k \prod_{i=1}^k (\mathbf{A} - r_i\,\mathrm{Id})$ and, since $p(\mathbf{A}) - \mu\,\mathrm{Id}$ is singular, at least one of its factors, say $\mathbf{A} - r_1\,\mathrm{Id}$, is singular. Consequently, $r_1$ is an eigenvalue of $\mathbf{A}$ and trivially, $p(r_1) - \mu = 0$. □

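Now consider two polynomials $\mathbf{P}(t) := \sum_j \mathbf{P}_j t^j$ and $\mathbf{Q}(t) := \sum_k \mathbf{Q}_k t^k$ with $n \times n$ matrices as coefficients. Trivially, the product polynomial $\mathbf{R}(t) := \mathbf{P}(t)\mathbf{Q}(t)$ is given by

$$\mathbf{R}(t) = \sum_{j,k} \mathbf{P}_j \mathbf{Q}_k\, t^{j+k}.$$

2.54 Lemma. Using the previous notation, if $\mathbf{A} \in M_{n,n}(\mathbb{C})$ commutes with the coefficients of $\mathbf{Q}(t)$, then $\mathbf{R}(\mathbf{A}) = \mathbf{P}(\mathbf{A})\mathbf{Q}(\mathbf{A})$.

Proof. In fact,

$$\mathbf{R}(\mathbf{A}) = \sum_{j,k} \mathbf{P}_j \mathbf{Q}_k \mathbf{A}^{j+k} = \sum_{j,k} (\mathbf{P}_j \mathbf{A}^j)(\mathbf{Q}_k \mathbf{A}^k) = \Big(\sum_j \mathbf{P}_j \mathbf{A}^j\Big)\Big(\sum_k \mathbf{Q}_k \mathbf{A}^k\Big) = \mathbf{P}(\mathbf{A})\mathbf{Q}(\mathbf{A}).$$ □

2.55 Theorem (Cayley-Hamilton). Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ and let $p_{\mathbf{A}}(s)$ be its characteristic polynomial, $p_{\mathbf{A}}(s) := \det(s\,\mathrm{Id} - \mathbf{A})$. Then $p_{\mathbf{A}}(\mathbf{A}) = 0$.

Proof. Set $\mathbf{Q}(s) := s\,\mathrm{Id} - \mathbf{A}$, $s \in \mathbb{C}$, and denote by $\mathrm{cof}\,\mathbf{Q}(s)$ the matrix of cofactors of $\mathbf{Q}(s)$. By Laplace's formulas, see (1.22),

$$\mathrm{cof}\,\mathbf{Q}(s)\, \mathbf{Q}(s) = \mathbf{Q}(s)\, \mathrm{cof}\,\mathbf{Q}(s) = \det\mathbf{Q}(s)\, \mathrm{Id} = p_{\mathbf{A}}(s)\, \mathrm{Id}.$$

Since $\mathbf{A}$ trivially commutes with the coefficients $\mathrm{Id}$ and $\mathbf{A}$ of $\mathbf{Q}(s)$, Lemma 2.54 yields

$$p_{\mathbf{A}}(\mathbf{A}) = p_{\mathbf{A}}(\mathbf{A})\, \mathrm{Id} = \mathrm{cof}\,\mathbf{Q}(\mathbf{A})\, \mathbf{Q}(\mathbf{A}) = \mathrm{cof}\,\mathbf{Q}(\mathbf{A}) \cdot 0 = 0.$$ □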
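The Cayley-Hamilton theorem is easy to test on a concrete matrix. The sketch below is not the argument of the proof: it computes the coefficients of $p_{\mathbf{A}}$ with the Faddeev-LeVerrier recursion (a standard algorithm chosen here for brevity, not mentioned in the text) and then evaluates $p_{\mathbf{A}}(\mathbf{A})$ by Horner's rule. The helper names and the sample matrix are assumptions made for the illustration.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(s Id - A) = sum_k c_k s^(n-k),
    via Faddeev-LeVerrier: M_1 = Id, c_k = -tr(A M_k)/k, M_{k+1} = A M_k + c_k Id."""
    n = len(A)
    coeffs = [Fraction(1)]
    M = identity(n)
    for k in range(1, n + 1):
        AM = matmul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / Fraction(k)
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

def evaluate_at_matrix(coeffs, A):
    """Horner evaluation of sum_k coeffs[k] * A^(n-k)."""
    n = len(A)
    P = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        P = matmul(A, P)
        P = [[P[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return P

A = [[2, 1, 0], [0, 3, 1], [1, 0, 1]]  # hypothetical sample matrix
p = char_poly(A)
print(p)                         # coefficients of the characteristic polynomial
print(evaluate_at_matrix(p, A))  # the zero matrix, as Theorem 2.55 predicts
```

Exact rational arithmetic is used so that the result is identically zero rather than zero up to rounding.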

b. Factorization and invariant subspaces

Given two polynomials $P_1, P_2$ with $\deg P_1 \ge \deg P_2$, we may divide $P_1$ by $P_2$, i.e., uniquely decompose $P_1$ as $P_1 = QP_2 + R$ where $\deg R < \deg P_2$. This allows us to define the greatest common divisor (g.c.d.) of two polynomials, which is defined up to a scalar factor, and to compute it by Euclid's algorithm. Moreover, since complex polynomials factor with irreducible factors of degree 1, the g.c.d. of two complex polynomials is a constant polynomial if and only if the two polynomials have no common root. We also have

2.56 Lemma. Let $p(t)$ and $q(t)$ be two polynomials with no common zeros. Then there exist polynomials $a(t)$ and $b(t)$ such that

$$a(t)p(t) + b(t)q(t) = 1 \qquad \forall t \in \mathbb{C}.$$

We refer the readers to [GM2], but for their convenience we add the proof of Lemma 2.56.

Proof. Let

$$\mathcal{P} := \{\, r(t) := a(t)p(t) + b(t)q(t) \mid a(t),\ b(t) \text{ are polynomials} \,\}$$

and let $d = \alpha p + \beta q$ be the nonzero polynomial of minimum degree in $\mathcal{P}$. We claim that $d$ divides both $p$ and $q$. Otherwise, dividing $p$ by $d$ we would get a nonzero polynomial $r := p - md$ and, since $p$ and $d$ are in $\mathcal{P}$, $r = p - md \in \mathcal{P}$ also, hence a contradiction, since $r$ has degree strictly less than $d$. Then we claim that the degree of $d$ is zero. Otherwise, $d$ would have a root that should be common to $p$ and $q$ since $d$ divides both $p$ and $q$. In conclusion, $d$ is a nonzero constant polynomial. □
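The polynomials $a$ and $b$ of Lemma 2.56 can also be produced constructively by the extended form of Euclid's algorithm mentioned above. The following sketch works with rational coefficients only, so it treats "no common zeros" as "constant g.c.d.". Polynomials are lists of coefficients with the lowest degree first, and all function names are illustrative assumptions.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, d):
    """Quotient and remainder of p by the nonzero polynomial d."""
    d = trim(d)
    rem = trim([Fraction(c) for c in p])
    quot = [Fraction(0)] * max(len(rem) - len(d) + 1, 0)
    while len(rem) >= len(d):
        shift = len(rem) - len(d)
        factor = rem[-1] / Fraction(d[-1])
        quot[shift] = factor
        for i, c in enumerate(d):
            rem[shift + i] -= factor * c
        rem = trim(rem)  # the leading term cancels exactly
    return trim(quot), rem

def bezout(p, q):
    """Return (a, b) with a*p + b*q = 1, or raise if the g.c.d. is not constant."""
    r0 = trim([Fraction(c) for c in p])
    r1 = trim([Fraction(c) for c in q])
    a0, a1 = [Fraction(1)], []
    b0, b1 = [], [Fraction(1)]
    while r1:
        quo, rem = divmod_poly(r0, r1)
        neg_quo = [-c for c in quo]
        r0, r1 = r1, rem
        a0, a1 = a1, add(a0, mul(neg_quo, a1))
        b0, b1 = b1, add(b0, mul(neg_quo, b1))
    if len(r0) != 1:
        raise ValueError("p and q have a common zero")
    inv = 1 / r0[0]
    return [c * inv for c in a0], [c * inv for c in b0]

p = [1, 0, 1]   # t^2 + 1
q = [-1, 1]     # t - 1
a, b = bezout(p, q)
print(a, b)
print(add(mul(a, p), mul(b, q)))  # the constant polynomial 1
```

Throughout the loop the invariants $r_0 = a_0 p + b_0 q$ and $r_1 = a_1 p + b_1 q$ hold, which is why normalizing the last nonzero remainder gives the identity of the lemma.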

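2.57 Proposition. For every polynomial $p$, the kernel of $p(\mathbf{A})$ is an invariant subspace for $\mathbf{A} \in M_{n,n}(\mathbb{C})$.

Proof. Let $w \in \ker p(\mathbf{A})$. Since $t\,p(t) = p(t)\,t$, we infer $\mathbf{A}p(\mathbf{A}) = p(\mathbf{A})\mathbf{A}$. Therefore

$$p(\mathbf{A})(\mathbf{A}w) = (p(\mathbf{A})\mathbf{A})w = (\mathbf{A}p(\mathbf{A}))w = \mathbf{A}\,p(\mathbf{A})w = \mathbf{A}0 = 0.$$

Hence $\mathbf{A}w \in \ker p(\mathbf{A})$. □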

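2.58 Proposition. Let $p$ be the product of two coprime polynomials, $p(t) = p_1(t)p_2(t)$, and let $\mathbf{A} \in M_{n,n}(\mathbb{C})$. Then

$$\ker p(\mathbf{A}) = \ker p_1(\mathbf{A}) \oplus \ker p_2(\mathbf{A}).$$

Proof. By Lemma 2.56, there exist two polynomials $a_1, a_2$ such that $a_1(t)p_1(t) + a_2(t)p_2(t) = 1$. Hence

$$a_1(\mathbf{A})p_1(\mathbf{A}) + a_2(\mathbf{A})p_2(\mathbf{A}) = \mathrm{Id}. \qquad (2.9)$$

Set

$$W_1 := \ker p_1(\mathbf{A}), \qquad W_2 := \ker p_2(\mathbf{A}), \qquad W := \ker p(\mathbf{A}).$$

Now for every $x \in W$, we have $a_1(\mathbf{A})p_1(\mathbf{A})x \in W_2$ since

$$p_2(\mathbf{A})a_1(\mathbf{A})p_1(\mathbf{A})x = p_2(\mathbf{A})(\mathrm{Id} - a_2(\mathbf{A})p_2(\mathbf{A}))x = (\mathrm{Id} - a_2(\mathbf{A})p_2(\mathbf{A}))p_2(\mathbf{A})x = a_1(\mathbf{A})p_1(\mathbf{A})p_2(\mathbf{A})x = a_1(\mathbf{A})p(\mathbf{A})x = 0,$$

and, similarly, $a_2(\mathbf{A})p_2(\mathbf{A})x \in W_1$. Thus, by (2.9), $W = W_1 + W_2$. Finally $W = W_1 \oplus W_2$. In fact, if $y \in W_1 \cap W_2$, then by (2.9), we have

$$y = a_1(\mathbf{A})p_1(\mathbf{A})y + a_2(\mathbf{A})p_2(\mathbf{A})y = 0 + 0 = 0.$$ □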

c. Generalized eigenvectors and the spectral theorem

2.59 Definition. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$, and let $\lambda$ be an eigenvalue of $\mathbf{A}$ of multiplicity $k$. We call generalized eigenvectors of $\mathbf{A}$ relative to the eigenvalue $\lambda$ the elements of

$$W := \ker(\lambda\,\mathrm{Id} - \mathbf{A})^k.$$

Of course,

(i) eigenvectors relative to $\lambda$ are generalized eigenvectors relative to $\lambda$,

(ii) the spaces of generalized eigenvectors are invariant subspaces for $\mathbf{A}$.

2.60 Theorem. Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$. Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the eigenvalues of $\mathbf{A}$ with multiplicities $m_1, m_2, \dots, m_k$ and let $W_1, W_2, \dots, W_k$ be the subspaces of the relative generalized eigenvectors, $W_i := \ker(\lambda_i\,\mathrm{Id} - \mathbf{A})^{m_i}$. Then

(i) the spaces $W_1, W_2, \dots, W_k$ are supplementary, consequently there is a basis of $\mathbb{C}^n$ of generalized eigenvectors of $\mathbf{A}$,

(ii) $\dim W_i = m_i$.

Consequently, if we choose a basis $(e_1, e_2, \dots, e_n)$ where the first $m_1$ elements span $W_1$, the following $m_2$ elements span $W_2$, and so on until the last $m_k$ elements, which span $W_k$, then the matrix $\mathbf{A}' \in M_{n,n}(\mathbb{C})$ that represents $\mathbf{A}$ in the new basis is similar to $\mathbf{A}$ and has the form

$$\mathbf{A}' = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix}$$

where for every $i = 1, \dots, k$, the block $\mathbf{A}_i$ is an $m_i \times m_i$ matrix with $\lambda_i$ as the only eigenvalue with multiplicity $m_i$ and, of course, $(\lambda_i\,\mathrm{Id} - \mathbf{A}_i)^{m_i} = 0$.

Proof. (i) Clearly the polynomials $p_1(s) := (\lambda_1 - s)^{m_1}$, $p_2(s) := (\lambda_2 - s)^{m_2}$, ..., $p_k(s) := (\lambda_k - s)^{m_k}$ factorize $p_{\mathbf{A}}$ and are coprime. Set $\mathbf{N}_i := p_i(\mathbf{A})$ and notice that $W_i = \ker\mathbf{N}_i$. Repeatedly applying Proposition 2.58, we then get

$$\ker p_{\mathbf{A}}(\mathbf{A}) = \ker(\mathbf{N}_1\mathbf{N}_2 \cdots \mathbf{N}_k) = \ker(\mathbf{N}_1) \oplus \ker(\mathbf{N}_2\mathbf{N}_3 \cdots \mathbf{N}_k) = \dots = W_1 \oplus W_2 \oplus \dots \oplus W_k.$$

(i) then follows from the Cayley-Hamilton theorem, $\ker p_{\mathbf{A}}(\mathbf{A}) = \mathbb{C}^n$.

(ii) It remains to show that $\dim W_i = m_i$ $\forall i$. Let $(e_1, e_2, \dots, e_n)$ be a basis such that the first $h_1$ elements span $W_1$, the following $h_2$ elements span $W_2$, and the last $h_k$ elements span $W_k$. $\mathbf{A}$ is therefore similar to a block matrix

$$\mathbf{A}' = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix}$$

where the block $\mathbf{A}_i$ is a square matrix of dimension $h_i := \dim W_i$. On the other hand, the $h_i \times h_i$ matrix $(\lambda_i\,\mathrm{Id} - \mathbf{A}_i)^{m_i}$ vanishes, hence all the eigenvalues of $\lambda_i\,\mathrm{Id} - \mathbf{A}_i$ are zero. Therefore $\mathbf{A}_i$ has a unique eigenvalue $\lambda_i$ with multiplicity $h_i$, and $p_{\mathbf{A}_i}(s) := (s - \lambda_i)^{h_i}$. We then have

$$p_{\mathbf{A}}(s) = p_{\mathbf{A}'}(s) = \prod_{i=1}^k p_{\mathbf{A}_i}(s) = \prod_{i=1}^k (s - \lambda_i)^{h_i}$$

and the uniqueness of the factorization yields $h_i = m_i$. The rest of the claim is trivial. □

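Another proof of $\dim W_i = m_i$ goes as follows. First we show the following.

2.61 Lemma. If 0 is an eigenvalue of $\mathbf{B} \in M_{n,n}(\mathbb{C})$ with multiplicity $m$, then 0 is an eigenvalue for $\mathbf{B}^m$ with multiplicity $m$.

Proof. Write $p_{\mathbf{B}}(s) = s^m q(s)$ with $q(0) \neq 0$. The function $1 - \lambda^m$, $\lambda \in \mathbb{C}$, can be factorized as

$$1 - \lambda^m = \epsilon \prod_{i=0}^{m-1} (\omega^i - \lambda), \qquad \epsilon := (-1)^{m+1},$$

where $\omega := e^{i2\pi/m}$ is a root of unity (the two polynomials have the same degree and the same $m$ roots, and their leading coefficients are $-1$ and $(-1)^m$ respectively). For $z, t \in \mathbb{C}$, $z \neq 0$,

$$z^m - t^m = z^m \Big(1 - \Big(\frac{t}{z}\Big)^m\Big) = \epsilon\, z^m \prod_{i=0}^{m-1} \Big(\omega^i - \frac{t}{z}\Big) = \epsilon \prod_{i=0}^{m-1} (\omega^i z - t),$$

hence

$$z^m\,\mathrm{Id} - \mathbf{B}^m = \epsilon \prod_{i=0}^{m-1} (\omega^i z\,\mathrm{Id} - \mathbf{B}).$$

If we set $q_0(z) := \prod_{i=0}^{m-1} q(\omega^i z)$, we have $q_0(0) \neq 0$, and, since $\prod_{i=0}^{m-1} \omega^{im} = 1$,

$$p_{\mathbf{B}^m}(z^m) := \det(z^m\,\mathrm{Id} - \mathbf{B}^m) = \epsilon^n \prod_{i=0}^{m-1} p_{\mathbf{B}}(\omega^i z) = \epsilon^n \prod_{i=0}^{m-1} (\omega^i z)^m q(\omega^i z) = \epsilon^n z^{m^2} q_0(z). \qquad (2.10)$$

On the other hand $p_{\mathbf{B}^m}(s) = s^r q_1(s)$ for some $q_1$ with $q_1(0) \neq 0$ and some $r \ge 1$. Thus, comparing $z^{mr} q_1(z^m)$ with (2.10), we get $r = m$ and $p_{\mathbf{B}^m}(s) = s^m q_1(s)$, i.e., 0 is an eigenvalue of multiplicity $m$ for $\mathbf{B}^m$. □

Another proof that $\dim W_i = m_i$ in Theorem 2.60. Since

$$\sum_{i=1}^k m_i = \sum_{i=1}^k \dim W_i = \dim X,$$

it suffices to show that $\dim W_i \le m_i$ $\forall i$. Since 0 is an eigenvalue of $\mathbf{B} := \lambda_i\,\mathrm{Id} - \mathbf{A}$ of multiplicity $m := m_i$, 0 is an eigenvalue of multiplicity $m$ for $\mathbf{B}^m$ by Lemma 2.61. Since $W_i$ is the eigenspace corresponding to the eigenvalue 0 of $\mathbf{B}^m$, it follows from Proposition 2.43 that $\dim W_i \le m$. □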

d. Jordan's canonical form

2.62 Definition. A matrix $\mathbf{B} \in M_{n,n}(\mathbb{C})$ is said to be nilpotent if there exists $k \ge 0$ such that $\mathbf{B}^k = 0$.

Let $\mathbf{B} \in M_{q,q}(\mathbb{C})$ be a nilpotent matrix and let $k$ be such that $\mathbf{B}^k = 0$, but $\mathbf{B}^{k-1} \neq 0$. Fix a basis $(e_1, e_2, \dots, e_s)$ of $\ker\mathbf{B}$, and, for each $i = 1, \dots, s$, set $e_i^1 := e_i$ and define $e_i^2, e_i^3, \dots, e_i^{k_i}$ to solve the systems $\mathbf{B}e_i^j := e_i^{j-1}$ for $j = 2, 3, \dots$ as long as possible. Let $\{e_i^j\}$, $j = 1, \dots, k_i$, $i = 1, \dots, s$, be the family of vectors obtained this way.

2.63 Theorem (Canonical form of a nilpotent matrix). Let $\mathbf{B}$ be a $q \times q$ nilpotent matrix. Using the previous notation, $\{e_i^j\}$ is a basis of $\mathbb{C}^q$. Consequently, if we write $\mathbf{B}$ with respect to this basis, we get a $q \times q$ matrix $\mathbf{B}'$ similar to $\mathbf{B}$ of the form

$$\mathbf{B}' = \begin{pmatrix} \mathbf{B}_1 & 0 & \dots & 0 \\ 0 & \mathbf{B}_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{B}_s \end{pmatrix} \qquad (2.11)$$

where each block $\mathbf{B}_i$ has dimension $k_i$ and, if $k_i > 1$, it has the form

$$\mathbf{B}_i = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \\ 0 & 0 & 0 & \dots & 0 \end{pmatrix}. \qquad (2.12)$$

The reduced matrix $\mathbf{B}'$ is called the canonical Jordan form of the nilpotent matrix $\mathbf{B}$.

Proof. The kernels $H_j := \ker\mathbf{B}^j$ of $\mathbf{B}^j$, $j = 1, \dots, k$, form a strictly increasing sequence of subspaces

$$\{0\} = H_0 \subset H_1 \subset H_2 \subset \dots \subset H_{k-1} \subset H_k := \mathbb{C}^q.$$

The claim then follows by iteratively applying the following lemma. □

2.64 Lemma. For $j = 1, 2, \dots, k - 1$, let $(e_1, e_2, \dots, e_p)$ be a basis of $H_j$ and let $x_1, x_2, \dots, x_r$ be all possible solutions of $\mathbf{B}x_j = e_j$, $j = 1, \dots, p$. Then $(e_1, e_2, \dots, e_p, x_1, x_2, \dots, x_r)$ is a basis for $H_{j+1}$.

Proof. In fact, it is easily seen that

o the vectors $e_1, e_2, \dots, e_p, x_1, x_2, \dots, x_r$ are linearly independent,

o $\{e_1, e_2, \dots, e_p, x_1, x_2, \dots, x_r\} \subset H_{j+1}$,

o the image of $H_{j+1}$ by $\mathbf{B}$ is contained in $H_j$.

Thus $r$, which is the number of elements $e_i$ in the image of $\mathbf{B}$, is the dimension of the image of $H_{j+1}$ by $\mathbf{B}$. The rank formula then yields

$$\dim H_{j+1} = \dim H_j + \dim(\mathrm{Im}\,\mathbf{B} \cap H_{j+1}) = p + r.$$ □
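The increasing chain of kernels $H_j = \ker\mathbf{B}^j$ used in the proof of Theorem 2.63 is easy to compute, and its dimensions determine the sizes of the blocks in (2.11). The sketch below relies on a standard fact that is not stated in the text and is taken here as an assumption: the number of blocks of dimension at least $j$ equals $\dim H_j - \dim H_{j-1}$. All function names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(M):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def kernel_dims(B):
    """[dim H_0, dim H_1, ..., dim H_k] with H_j = ker B^j and H_k the whole space."""
    q = len(B)
    dims = [0]
    P = [[int(i == j) for j in range(q)] for i in range(q)]  # B^0 = Id
    while dims[-1] < q:
        P = matmul(P, B)
        dims.append(q - rank(P))  # rank formula: dim ker = q - rank
        if len(dims) > q + 1:
            raise ValueError("the matrix is not nilpotent")
    return dims

def jordan_block_sizes(B):
    """Block dimensions k_i of the canonical form of a nilpotent matrix B."""
    dims = kernel_dims(B)
    at_least = [dims[j] - dims[j - 1] for j in range(1, len(dims))] + [0]
    sizes = []
    for j in range(1, len(dims)):
        sizes += [j] * (at_least[j - 1] - at_least[j])  # blocks of size exactly j
    return sorted(sizes, reverse=True)

B = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]  # hypothetical example
print(kernel_dims(B))         # dimensions of H_0, H_1, H_2, H_3
print(jordan_block_sizes(B))  # one block of dimension 3 and one of dimension 1
```

The strict increase of the dimensions until the whole space is reached mirrors the chain $\{0\} = H_0 \subset H_1 \subset \dots \subset H_k$ in the proof.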

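Now consider a generic matrix $\mathbf{A} \in M_{n,n}(\mathbb{C})$. We first rewrite $\mathbf{A}$ using a basis of generalized eigenvectors to get a new matrix $\mathbf{A}'$ similar to $\mathbf{A}$ of the form

$$\mathbf{A}' = \begin{pmatrix} \mathbf{A}_1 & 0 & \dots & 0 \\ 0 & \mathbf{A}_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{A}_k \end{pmatrix} \qquad (2.13)$$

where each block $\mathbf{A}_i$ has the dimension of the algebraic multiplicity $m_i$ of the eigenvalue $\lambda_i$ and a unique eigenvalue $\lambda_i$. Moreover, the matrix $\mathbf{C}_i := \lambda_i\,\mathrm{Id} - \mathbf{A}_i$ is nilpotent, and precisely $\mathbf{C}_i^{k_i} = 0$ and $\mathbf{C}_i^{k_i - 1} \neq 0$. Applying Theorem 2.63 to each $\mathbf{C}_i$, we then show that $\mathbf{A}_i$ is similar to $\lambda_i\,\mathrm{Id} + \mathbf{B}'$ where $\mathbf{B}'$ is as (2.11). Therefore, we conclude the following.

2.65 Theorem (Jordan's canonical form). Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be all distinct eigenvalues of $\mathbf{A} \in M_{n,n}(\mathbb{C})$. For every $i = 1, \dots, k$

(i) let $(u_{i,1}, \dots, u_{i,p_i})$ be a basis of the eigenspace $V_{\lambda_i}$ (as we know, $p_i \le m_i$),

(ii) consider the generalized eigenvectors relative to $\lambda_i$ defined as follows: for any $j = 1, 2, \dots, p_i$,

a) set $e^1_{i,j} := u_{i,j}$,

b) set $e^\alpha_{i,j}$ to be a solution of

$$(\mathbf{A} - \lambda_i\,\mathrm{Id})\, e^\alpha_{i,j} = e^{\alpha - 1}_{i,j}, \qquad \alpha = 2, \dots \qquad (2.14)$$

as long as the system (2.14) is solvable,

c) denote by $\alpha(i,j)$ the number of solved systems plus 1.

Then for every $i = 1, \dots, k$ the list $(e^\alpha_{i,j})$ with $j = 1, \dots, p_i$ and $\alpha = 1, \dots, \alpha(i,j)$ is a basis for the generalized eigenspace $W_i$ relative to $\lambda_i$. Hence the full list

$$(e^\alpha_{i,j}), \qquad i = 1, \dots, k,\quad j = 1, \dots, p_i,\quad \alpha = 1, \dots, \alpha(i,j) \qquad (2.15)$$

is a basis of $\mathbb{C}^n$. By the definition of the $\{e^\alpha_{i,j}\}$, if we set

$$\mathbf{S} := \Big[\, e^1_{1,1},\ e^2_{1,1},\ \dots,\ e^1_{1,2},\ e^2_{1,2},\ \dots,\ e^1_{2,1},\ e^2_{2,1},\ \dots \Big],$$

the matrix $\mathbf{J} := \mathbf{S}^{-1}\mathbf{A}\mathbf{S}$, that represents $x \to \mathbf{A}x$ in the basis (2.15), has the form

$$\mathbf{J} = \begin{pmatrix} \mathbf{J}_{1,1} & 0 & \dots & 0 \\ 0 & \mathbf{J}_{1,2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \mathbf{J}_{k,p_k} \end{pmatrix}$$

where $i = 1, \dots, k$, $j = 1, \dots, p_i$, $\mathbf{J}_{i,j}$ has dimension $\alpha(i,j)$ and

$$\mathbf{J}_{i,j} = \begin{cases} \lambda_i & \text{if } \dim\mathbf{J}_{i,j} = 1, \\[4pt] \begin{pmatrix} \lambda_i & 1 & 0 & \dots & 0 \\ 0 & \lambda_i & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_i & 1 \\ 0 & 0 & \dots & 0 & \lambda_i \end{pmatrix} & \text{otherwise.} \end{cases}$$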

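A basis with the properties of the basis in (2.15) is called a Jordan basis, and the matrix $\mathbf{J}$ that represents $\mathbf{A}$ in a Jordan basis is called a canonical Jordan form of $\mathbf{A}$.

2.66 Example. Find a canonical Jordan form of

$$\mathbf{A} = \begin{pmatrix} 2 & 0 & 0 & 0 & 0 \\ 1 & 2 & 0 & 0 & 0 \\ 0 & 1 & 2 & 0 & 0 \\ 0 & 0 & 1 & 3 & 0 \\ 1 & 0 & 0 & 1 & 3 \end{pmatrix}.$$

$\mathbf{A}$ is lower triangular, hence the eigenvalues of $\mathbf{A}$ are 2 with multiplicity 3 and 3 with multiplicity 2. We then have

$$\mathbf{A} - 2\,\mathrm{Id} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \end{pmatrix}.$$

$\mathbf{A} - 2\,\mathrm{Id}$ has rank 4 since its columns of indices 1, 2, 3 and 5 are linearly independent. Therefore the eigenspace $V_2$ has dimension $5 - 4 = 1$ by the rank formula. We now compute a nonzero eigenvector,

$$(\mathbf{A} - 2\,\mathrm{Id}) \begin{pmatrix} x \\ y \\ z \\ t \\ u \end{pmatrix} = \begin{pmatrix} 0 \\ x \\ y \\ z + t \\ x + t + u \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

For instance, one eigenvector is $s_1 := (0, 0, 1, -1, 1)^T$. We now compute the Jordan basis relative to this eigenvalue. We have $e^1_{1,1} = s_1$ and it is possible to solve

$$\begin{pmatrix} 0 \\ x \\ y \\ z + t \\ x + t + u \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ -1 \\ 1 \end{pmatrix};$$

for instance, $s_2 := e^2_{1,1} = (0, 1, 0, -1, 2)^T$ is a solution. Hence we compute a solution of

$$\begin{pmatrix} 0 \\ x \\ y \\ z + t \\ x + t + u \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \\ 2 \end{pmatrix},$$

hence $s_3 := e^3_{1,1} = (1, 0, 0, -1, 2)^T$. Looking now at the other eigenvalue,

$$\mathbf{A} - 3\,\mathrm{Id} = \begin{pmatrix} -1 & 0 & 0 & 0 & 0 \\ 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \end{pmatrix}$$

is of rank 4 since the columns of indices 1, 2, 3 and 4 are linearly independent. Thus by the rank formula, the eigenspace relative to the eigenvalue 3 has dimension 1. We now compute an eigenvector with eigenvalue 3. We need to solve

$$(\mathbf{A} - 3\,\mathrm{Id}) \begin{pmatrix} x \\ y \\ z \\ t \\ u \end{pmatrix} = \begin{pmatrix} -x \\ x - y \\ y - z \\ z \\ x + t \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$

and a nonzero solution is, for instance, $s_4 := (0, 0, 0, 0, 1)^T$. Finally, we compute Jordan's basis relative to this eigenvalue. A solution of

$$\begin{pmatrix} -x \\ x - y \\ y - z \\ z \\ x + t \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$

is given by $s_5 = e^2_{2,1} = (0, 0, 0, 1, 0)^T$. Thus, we conclude that the matrix

$$\mathbf{S} = \Big[\, s_1\ s_2\ s_3\ s_4\ s_5 \Big] = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ -1 & -1 & -1 & 0 & 1 \\ 1 & 2 & 2 & 1 & 0 \end{pmatrix}$$

is nonsingular, since the columns are linearly independent, and by construction

$$\mathbf{S}^{-1}\mathbf{A}\mathbf{S} = \begin{pmatrix} 2 & 1 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 3 \end{pmatrix}.$$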
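The outcome of Example 2.66 can be double-checked without inverting $\mathbf{S}$: the identity $\mathbf{S}^{-1}\mathbf{A}\mathbf{S} = \mathbf{J}$ is equivalent to $\mathbf{A}\mathbf{S} = \mathbf{S}\mathbf{J}$ together with $\det\mathbf{S} \neq 0$. The short sketch below uses the data of the example; the helper names `matmul` and `det` are illustrative.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

A = [[2, 0, 0, 0, 0],
     [1, 2, 0, 0, 0],
     [0, 1, 2, 0, 0],
     [0, 0, 1, 3, 0],
     [1, 0, 0, 1, 3]]

# The Jordan basis s_1, ..., s_5 found in the example, listed as vectors.
jordan_basis = [[0, 0, 1, -1, 1],
                [0, 1, 0, -1, 2],
                [1, 0, 0, -1, 2],
                [0, 0, 0, 0, 1],
                [0, 0, 0, 1, 0]]
S = [list(row) for row in zip(*jordan_basis)]  # the s_i are the columns of S

J = [[2, 1, 0, 0, 0],
     [0, 2, 1, 0, 0],
     [0, 0, 2, 0, 0],
     [0, 0, 0, 3, 1],
     [0, 0, 0, 0, 3]]

print(matmul(A, S) == matmul(S, J))  # A S = S J
print(det(S))                        # nonzero, so S is nonsingular
```

Column by column, $\mathbf{A}\mathbf{S} = \mathbf{S}\mathbf{J}$ restates the chain relations $\mathbf{A}s_1 = 2s_1$, $\mathbf{A}s_2 = 2s_2 + s_1$, $\mathbf{A}s_3 = 2s_3 + s_2$, $\mathbf{A}s_4 = 3s_4$ and $\mathbf{A}s_5 = 3s_5 + s_4$.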

e. Elementary divisors

As we have seen, the characteristic polynomial

$$\det(s\,\mathrm{Id} - \mathbf{A}), \qquad s \in \mathbb{K},$$

is invariant by similarity transformations. However, in general the equality of two characteristic polynomials does not imply that the two matrices are similar.

2.67 E x a m p l e . The unique eigenvalue of the matrix A ^ = I

V /i

I is AQ and has

XoJ

multiplicity 2. The corresponding eigenspace is given by the solutions of the system

{

O'x'^ +0'X^

=0,

fjLX^ + 0 • X ^ = 0 .

If /x 7^ 0, then VXO,M ^ ^ dimension 1. Notice that AQ is diagonal, while A^ is not diagonal. Moreover, AQ and A^ with fi ^ 0 are not similar.

It would be interesting to find a complete set of invariants that characterizes the class of similarity of a matrix, without going explicitly into Jordan's reduction algorithm. Here we mention a few results in this direction.

Let $\mathbf{A} \in M_{n,n}(\mathbb{C})$. The determinants of the minors of order $k$ of the matrix $s\,\mathrm{Id} - \mathbf{A}$ form a subset $\mathcal{D}_k$ of polynomials in the $s$ variable. Denote by $D_k(s)$ the g.c.d. of these polynomials whose coefficient of the maximal degree term is normalized to 1. Moreover set $D_0(s) := 1$. Using Laplace's formula, one sees that $D_{k-1}(s)$ divides $D_k(s)$ for all $k = 1, \dots, n$. The polynomials

$$\frac{D_k(s)}{D_{k-1}(s)}, \qquad k = 1, \dots, n,$$

are called the elementary divisors of $\mathbf{A}$. They form a complete set of invariants that describe the complex similarity class of $\mathbf{A}$. In fact, the following holds.

2.68 Theorem. The following claims are equivalent.

(i) $\mathbf{A}$ and $\mathbf{B}$ are similar as complex matrices,

(ii) $\mathbf{A}$ and $\mathbf{B}$ have the same Jordan's canonical form (up to permutations of rows and columns),

(iii) $\mathbf{A}$ and $\mathbf{B}$ have the same elementary divisors.

2.3 Exercises

2.69 ¶. Write a few $3 \times 3$ real matrices and interpret them as linear maps from $\mathbb{R}^3$ into $\mathbb{R}^3$. For each of these linear maps, choose a new basis of $\mathbb{R}^3$ and write the associated matrix with respect to the new basis both in the source and the target $\mathbb{R}^3$.

2.70 ¶. Let $V_1, V_2, \dots, V_n$ be finite-dimensional vector spaces, and let $f_0, f_1, \dots, f_n$ be linear maps such that

$$\{0\} \xrightarrow{f_0} V_1 \xrightarrow{f_1} V_2 \xrightarrow{f_2} \cdots \xrightarrow{f_{n-2}} V_{n-1} \xrightarrow{f_{n-1}} V_n \xrightarrow{f_n} \{0\}.$$

Show that, if $\mathrm{Im}(f_i) = \ker(f_{i+1})$ $\forall i = 0, \dots, n - 1$, then $\sum_{i=1}^n (-1)^i \dim V_i = 0$.

2.71 ¶. Consider $\mathbb{R}$ as a vector space over $\mathbb{Q}$. Show that 1 and $\xi$ are linearly independent if and only if $\xi$ is irrational, $\xi \notin \mathbb{Q}$. Give reasons to support that $\mathbb{R}$ as a vector space over $\mathbb{Q}$ is not finite dimensional.

2.72 ¶ Lagrange multipliers. Let $X$, $Y$ and $Z$ be three vector spaces over $\mathbb{K}$ and let $f : X \to Y$, $g : X \to Z$ be two linear maps. Show that $\ker g \subset \ker f$ if and only if there exists a linear map $\ell : Z \to Y$ such that $f := \ell \circ g$.

2.73 ¶. Show that the matrices [two matrices, illegible in the source] have the same eigenvalues but are not similar.

2.74 ¶. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be the eigenvalues of $\mathbf{A} \in M_{n,n}(\mathbb{C})$, possibly repeated with their multiplicities. Show that $\operatorname{tr}\mathbf{A} = \lambda_1 + \dots + \lambda_n$ and $\det\mathbf{A} = \lambda_1 \cdot \lambda_2 \cdots \lambda_n$.

2.75 ¶. Show that $p(s) = s^n + a_{n-1}s^{n-1} + \dots + a_0$ is the characteristic polynomial of the $n \times n$ matrix

$$\begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -a_0 & -a_1 & -a_2 & \dots & -a_{n-1} \end{pmatrix}.$$

2.76 ¶. Let $\mathbf{A} \in M_{k,k}(\mathbb{K})$, $\mathbf{B} \in M_{n,n}(\mathbb{K})$, $\mathbf{C} \in M_{k,n}(\mathbb{K})$. Compute the characteristic polynomial of the matrix

$$\begin{pmatrix} \mathbf{A} & \mathbf{C} \\ 0 & \mathbf{B} \end{pmatrix}.$$

2.77 ¶. Let $\ell : \mathbb{C}^n \to \mathbb{C}^n$ be defined by $\ell(e_i) := e_{i+1}$ if $i = 1, \dots, n - 1$ and $\ell(e_n) = e_1$, where $e_1, e_2, \dots, e_n$ is the canonical basis of $\mathbb{C}^n$. Show that the associated matrix $\mathbf{L}$ is diagonizable and that the eigenvalues are all distinct. [Hint: Compute the characteristic polynomial.]

2.78 ¶. Let $\mathbf{A} \in M_{n,n}(\mathbb{R})$ and suppose $\mathbf{A}^2 = \mathrm{Id}$. Show that $\mathbf{A}$ is similar to

$$\begin{pmatrix} \mathrm{Id}_k & 0 \\ 0 & -\mathrm{Id}_{n-k} \end{pmatrix}$$

for some $k$, $1 \le k \le n$. [Hint: Consider the subspaces $V_+ := \{x \mid \mathbf{A}x = x\}$ and $V_- := \{x \mid \mathbf{A}x = -x\}$ and show that $V_+ \oplus V_- = \mathbb{R}^n$.]

2.79 ¶. Let $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ be two matrices such that $\mathbf{A}^2 = \mathbf{B}^2 = \mathrm{Id}$ and $\operatorname{tr}\mathbf{A} = \operatorname{tr}\mathbf{B}$. Show that $\mathbf{A}$ and $\mathbf{B}$ are similar. [Hint: Use Exercise 2.78.]

2.80 ¶. Show that the diagonizable matrices span $M_{n,n}(\mathbb{R})$. [Hint: Consider the matrices $\mathbf{M}_{ij} = \mathrm{diag}(1, 2, \dots, n) + \mathbf{E}_{ij}$ where $\mathbf{E}_{ij}$ has value 1 at entry $(i, j)$ and value zero otherwise.]

2.81 ¶. Let $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ and let $\mathbf{B}$ be symmetric. Show that the polynomial $t \to \det(\mathbf{A} + t\mathbf{B})$ has degree not greater than $\operatorname{Rank}\mathbf{B}$.

2.82 ¶. Show that any linear operator $A : \mathbb{R}^n \to \mathbb{R}^n$ has an invariant subspace of dimension 1 or 2.

2.83 ¶ Fitting decomposition. Let $f : X \to X$ be a linear operator of a finite-dimensional vector space and set $f^k := f \circ \dots \circ f$ $k$-times. Show that there exists $k \ge 1$ such that

(i) $\ker(f^k) = \ker(f^{k+1})$,

(ii) $\mathrm{Im}(f^k) = \mathrm{Im}(f^{k+1})$,

(iii) $f_{|\mathrm{Im}(f^k)} : \mathrm{Im}(f^k) \to \mathrm{Im}(f^k)$ is an isomorphism,

(iv) $f(\ker f^k) \subset \ker(f^k)$,

(v) $f_{|\ker(f^k)} : \ker(f^k) \to \ker(f^k)$ is nilpotent,

(vi) $X = \ker(f^k) \oplus \mathrm{Im}(f^k)$.

2.84 ¶. $\mathbf{A}$ is nilpotent if and only if all its eigenvalues are zero.

2.85 ¶. Consider the linear operators in the linear space of polynomials

$$A(P)(t) := P'(t), \qquad B(P)(t) := tP(t).$$

Compute the operator $AB - BA$.

2.86 ¶. Let $A, B$ be linear operators on $\mathbb{R}^n$. Show that

(i) $\operatorname{tr}(AB) = \operatorname{tr}(BA)$,

(ii) $AB - BA \neq \mathrm{Id}$.

2.87 ¶. Show that a linear operator $C : \mathbb{R}^n \to \mathbb{R}^n$ can be written as $C = AB - BA$ where $A, B : \mathbb{R}^n \to \mathbb{R}^n$ are linear operators if and only if $\operatorname{tr}C = 0$.

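2.88 ¶. Show that the Jordan canonical form of the matrix

$$\begin{pmatrix} a & a^1_2 & a^1_3 & \dots & a^1_n \\ 0 & a & a^2_3 & \dots & a^2_n \\ 0 & 0 & a & \dots & a^3_n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & a \end{pmatrix}$$

with $a^1_2\, a^2_3 \cdots a^{n-1}_n \neq 0$ is

$$\begin{pmatrix} a & 1 & 0 & \dots & 0 \\ 0 & a & 1 & \dots & 0 \\ 0 & 0 & a & \dots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & a \end{pmatrix}.$$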

3. Euclidean and Hermitian Spaces

3.1 The Geometry of Euclidean and Hermitian Spaces

Until now we have introduced several different languages (linear independence, matrices and products, linear maps) that are connected in several ways to linear systems, and we have stated some results. The structure we used is essentially linearity. A new structure, the inner product, provides a richer framework that we shall illustrate in this chapter.

a. Euclidean spaces

3.1 Definition. Let $X$ be a real vector space. An inner product on $X$ is a map $(\ |\ ) : X \times X \to \mathbb{R}$ which is

o (BILINEAR) $(x, y) \to (x|y)$ is linear in each factor, i.e.,

$$(\lambda x + \mu y|z) = \lambda(x|z) + \mu(y|z), \qquad (x|\lambda y + \mu z) = \lambda(x|y) + \mu(x|z),$$

for all $x, y, z \in X$, for all $\lambda, \mu \in \mathbb{R}$.

o (SYMMETRIC) $(x|y) = (y|x)$ for all $x, y \in X$.

o (POSITIVE DEFINITE) $(x|x) \ge 0$ $\forall x$ and $(x|x) = 0$ if and only if $x = 0$.

The nonnegative real number $|x| := \sqrt{(x|x)}$ is called the norm of $x \in X$. A finite-dimensional vector space $X$ with an inner product is called a Euclidean vector space, and the inner product of $X$ is called the scalar product of $X$.

3.2 Example. The map $(\ |\ ) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ defined by

$$(\mathbf{x}|\mathbf{y}) := \mathbf{x} \bullet \mathbf{y} = \sum_{i=1}^n x^i y^i, \qquad \mathbf{x} := (x^1, x^2, \dots, x^n),\ \mathbf{y} := (y^1, y^2, \dots, y^n)$$

is an inner product on $\mathbb{R}^n$, called the standard scalar product of $\mathbb{R}^n$, and $\mathbb{R}^n$ with this scalar product is a Euclidean space. In some sense, as we shall see later, see Proposition 3.25, it is the unique Euclidean space of dimension $n$.

Other examples of inner products on $\mathbb{R}^n$ can be obtained by weighing the coordinates by positive real numbers. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ be positive real numbers. Then

$$(\mathbf{x}|\mathbf{y}) := \sum_{i=1}^n \lambda_i x^i y^i, \qquad \mathbf{x} = (x^1, x^2, \dots, x^n),\ \mathbf{y} = (y^1, y^2, \dots, y^n)$$

is an inner product on $\mathbb{R}^n$. Other examples of inner products in infinite-dimensional vector spaces can be found in Chapter 10.

Let $X$ be a vector space with an inner product. From the bilinearity of the inner product we deduce that

$$|x + y|^2 = (x + y|x + y) = (x|x + y) + (y|x + y) = (x|x) + 2(x|y) + (y|y) = |x|^2 + 2(x|y) + |y|^2 \qquad (3.1)$$

from which we infer the following.

3.3 Theorem. The following hold.

(i) (PARALLELOGRAM IDENTITY) We have

$$|x + y|^2 + |x - y|^2 = 2\left(|x|^2 + |y|^2\right) \qquad \forall x, y \in X.$$

(ii) (POLARITY FORMULA) We have

$$(x|y) = \frac{1}{4}\left(|x + y|^2 - |x - y|^2\right) \qquad \forall x, y \in X,$$

hence we can get the scalar product of $x$ and $y$ by computing two norms.

(iii) (CAUCHY-SCHWARZ INEQUALITY) The following inequality holds

$$|(x|y)| \le |x|\,|y| \qquad \forall x, y \in X;$$

moreover, $(x|y) = |x|\,|y|$ if and only if either $y = 0$ or $x = \lambda y$ for some $\lambda \in \mathbb{R}$, $\lambda \ge 0$.

Proof. (i), (ii) follow trivially from (3.1). Let us prove (iii). If $y = 0$, the claim is trivial. If $y \neq 0$, the function $t \to |x + ty|^2$, $t \in \mathbb{R}$, is a second order nonnegative polynomial since

$$0 \le |x + ty|^2 = (x + ty|x + ty) = (x + ty|x) + (x + ty|ty) = |x|^2 + 2(x|y)\,t + |y|^2 t^2;$$

hence its discriminant is nonpositive, thus $((x|y))^2 - |x|^2|y|^2 \le 0$. If $(x|y) = |x|\,|y|$, then the discriminant of $t \to |x + ty|^2$ vanishes. If $y \neq 0$, then for some $t \in \mathbb{R}$ we have $|x + ty|^2 = 0$, i.e., $x = -ty$. Finally, $-t$ is nonnegative since $-t(y|y) = (x|y) = |x|\,|y| \ge 0$. □
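The three claims of Theorem 3.3 hold for every inner product, not only for the standard one. As a small numerical illustration, the sketch below checks them for a weighted inner product on $\mathbb{R}^3$ of the kind introduced in Example 3.2; the weights, the two vectors and the helper names are arbitrary choices made for this illustration, and squared norms are used to stay within exact integer arithmetic.

```python
def inner(x, y, weights):
    """Weighted inner product (x|y) = sum_i lambda_i x^i y^i, all weights positive."""
    return sum(w * a * b for w, a, b in zip(weights, x, y))

def norm_sq(x, weights):
    """Squared norm |x|^2 = (x|x)."""
    return inner(x, x, weights)

weights = (1, 2, 3)   # hypothetical positive weights
x = (1, 2, 3)
y = (-2, 0, 5)
x_plus_y = tuple(a + b for a, b in zip(x, y))
x_minus_y = tuple(a - b for a, b in zip(x, y))

# (i) parallelogram identity
lhs = norm_sq(x_plus_y, weights) + norm_sq(x_minus_y, weights)
rhs = 2 * (norm_sq(x, weights) + norm_sq(y, weights))
print(lhs == rhs)

# (ii) polarity formula, written without the division by 4
print(4 * inner(x, y, weights)
      == norm_sq(x_plus_y, weights) - norm_sq(x_minus_y, weights))

# (iii) Cauchy-Schwarz inequality in squared form
print(inner(x, y, weights) ** 2 <= norm_sq(x, weights) * norm_sq(y, weights))
```

The squared form of (iii) is exactly the statement that the discriminant of $t \to |x + ty|^2$ is nonpositive.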

3.4 Definition. Let $X$ be a vector space with an inner product. Two vectors $x, y \in X$ are said to be orthogonal, and we write $x \perp y$, if $(x|y) = 0$.

From (3.1) we immediately infer the following.

3.5 Proposition (Pythagorean theorem). Let $X$ be a vector space with an inner product. Then two vectors $x, y \in X$ are orthogonal if and only if

$$|x + y|^2 = |x|^2 + |y|^2.$$

3.6 Carnot's formula. Let $\mathbf{x}, \mathbf{y} \in \mathbb{R}^2$ be two nonzero vectors of $\mathbb{R}^2$, that we think of as the plane of Euclidean geometry with an orthogonal Cartesian reference. Setting $\mathbf{x} := (a, b)$, $\mathbf{y} := (c, d)$, and denoting by $\theta$ the angle between $O\mathbf{x}$ and $O\mathbf{y}$, it is easy to see that $|\mathbf{x}|$, $|\mathbf{y}|$ are the lengths of the two segments $O\mathbf{x}$ and $O\mathbf{y}$, and that $\mathbf{x} \bullet \mathbf{y} := ac + bd = |\mathbf{x}|\,|\mathbf{y}| \cos\theta$. Thus (3.1) reads as Carnot's formula

$$|\mathbf{x} + \mathbf{y}|^2 = |\mathbf{x}|^2 + |\mathbf{y}|^2 + 2|\mathbf{x}|\,|\mathbf{y}| \cos\theta.$$

In general, given two nonzero vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, we have by Cauchy-Schwarz inequality $|\mathbf{x} \bullet \mathbf{y}| \le |\mathbf{x}|\,|\mathbf{y}|$, hence there exists a $\theta \in \mathbb{R}$ such that

$$\frac{\mathbf{x} \bullet \mathbf{y}}{|\mathbf{x}|\,|\mathbf{y}|} =: \cos\theta.$$

$\theta$ is called the angle between $\mathbf{x}$ and $\mathbf{y}$ and denoted by $\widehat{\mathbf{x}\mathbf{y}}$. In this way (3.1) rewrites as Carnot's formula

$$|\mathbf{x} + \mathbf{y}|^2 = |\mathbf{x}|^2 + |\mathbf{y}|^2 + 2|\mathbf{x}|\,|\mathbf{y}| \cos\theta.$$

Notice that the angle $\theta$ is defined up to the sign, since $\cos\theta$ is an even function.

3.7 Proposition. Let $X$ be a Euclidean vector space and let $(\ |\ )$ be its inner product. The norm of $x \in X$,

$$|x| := \sqrt{(x|x)},$$

is a function $|\ | : X \to \mathbb{R}$ with the following properties.

(i) $|x| \in \mathbb{R}_+$ $\forall x \in X$.

(ii) (NONDEGENERACY) $|x| = 0$ if and only if $x = 0$.

(iii) (1-HOMOGENEITY) $|\lambda x| = |\lambda|\,|x|$ $\forall \lambda \in \mathbb{R}$, $\forall x \in X$.

(iv) (TRIANGULAR INEQUALITY) $|x + y| \le |x| + |y|$ $\forall x, y \in X$.

Proof. (i), (ii), (iii) are trivial. (iv) follows from the Cauchy-Schwarz inequality since

$$|x + y|^2 = |x|^2 + |y|^2 + 2(y|x) \le |x|^2 + |y|^2 + 2|(y|x)| \le |x|^2 + |y|^2 + 2|x|\,|y| = (|x| + |y|)^2.$$ □

Finally, we call the distance between $x$ and $y \in X$ the number $d(x, y) := |x - y|$. It is trivial to check, using Proposition 3.7, that the distance function $d : X \times X \to \mathbb{R}$ defined by $d(x, y) := |x - y|$ has the following properties.

82

3. Euclidean and Hermitian Spaces

(i)

(NONDEGENERACY)

d(x, y) >0^x,y

e X and d{x, y) = 0 ii and only

ii X = y. (ii) (SYMMETRY) d{x,y) = d{y,x) Vx,?/ € X. (iii) (TRIANGULAR INEQUALITY) d{x,y) < d(x,z) + d{z,y) Vx,y,z e X.
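As a numerical illustration of the norm, of the angle introduced in 3.6 and of the induced distance, here is a minimal sketch for the standard inner product of $\mathbb{R}^n$. The function names `inner`, `norm`, `angle` and `dist` are illustrative choices, not notation from the text, and the angle is normalized to lie in $[0, \pi]$.

```python
import math

def inner(x, y):
    """Standard inner product (x|y) on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Norm |x| = sqrt((x|x))."""
    return math.sqrt(inner(x, x))

def angle(x, y):
    """Angle theta in [0, pi] with cos(theta) = (x|y) / (|x| |y|); x, y nonzero."""
    c = inner(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1] to guard against rounding before calling acos.
    return math.acos(max(-1.0, min(1.0, c)))

def dist(x, y):
    """Distance d(x, y) = |x - y| induced by the inner product."""
    return norm([a - b for a, b in zip(x, y)])

x, y = [1.0, 0.0], [1.0, 1.0]
theta = angle(x, y)
# Both sides of Carnot's formula |x + y|^2 = |x|^2 + |y|^2 + 2|x||y|cos(theta).
lhs = norm([a + b for a, b in zip(x, y)]) ** 2
rhs = norm(x) ** 2 + norm(y) ** 2 + 2 * norm(x) * norm(y) * math.cos(theta)
```

Comparing `lhs` and `rhs` up to rounding gives a quick sanity check of Carnot's formula for this pair of vectors.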

We refer to $d$ as the distance in $X$ induced by the inner product.

3.8 Inner products in coordinates. Let $X$ be a Euclidean space, denote by $(\ |\ )$ its inner product, and let $(e_1, e_2, \dots, e_n)$ be a basis of $X$. If $x = \sum_{i=1}^n x^i e_i$, $y = \sum_{i=1}^n y^i e_i \in X$, then by linearity

$$(x|y) = \sum_{i,j=1}^n x^i y^j (e_i|e_j).$$

The matrix

$$\mathbf{G} = [g_{ij}], \qquad g_{ij} = (e_i|e_j)$$

is called the Gram matrix of the scalar product in the basis $(e_1, e_2, \dots, e_n)$. Introducing the coordinate column vectors $\mathbf{x} = (x^1, x^2, \dots, x^n)^T$ and $\mathbf{y} = (y^1, y^2, \dots, y^n)^T \in \mathbb{R}^n$ and denoting by $\bullet$ the standard scalar product in $\mathbb{R}^n$, we have

$$(x|y) = \mathbf{x} \bullet \mathbf{G}\mathbf{y} = \mathbf{x}^T \mathbf{G} \mathbf{y}$$

rows by columns. We notice that

(i) $\mathbf{G}$ is symmetric, $\mathbf{G}^T = \mathbf{G}$, since the scalar product is symmetric,
(ii) $\mathbf{G}$ is positive definite, i.e., $\mathbf{x}^T \mathbf{G} \mathbf{x} \ge 0$ $\forall \mathbf{x} \in \mathbb{R}^n$ and $\mathbf{x}^T \mathbf{G} \mathbf{x} = 0$ if and only if $\mathbf{x} = 0$; in particular, $\mathbf{G}$ is invertible.

b. Hermitian spaces

A similar structure exists on complex vector spaces.

3.9 Definition. Let $X$ be a vector space over $\mathbb{C}$. A Hermitian product on $X$ is a map $(\ |\ ) : X \times X \to \mathbb{C}$ which is

(i) (SESQUILINEAR), i.e., $(\alpha v + \beta w|z) = \alpha (v|z) + \beta (w|z)$ and $(v|\alpha w + \beta z) = \overline{\alpha}(v|w) + \overline{\beta}(v|z)$ $\forall v, w, z \in X$, $\forall \alpha, \beta \in \mathbb{C}$,
(ii) (HERMITIAN) $(z|w) = \overline{(w|z)}$ $\forall w, z \in X$, in particular $(z|z) \in \mathbb{R}$,
(iii) (POSITIVE DEFINITE) $(z|z) \ge 0$ and $(z|z) = 0$ if and only if $z = 0$.

The nonnegative real number $|z| := \sqrt{(z|z)}$ is called the norm of $z \in X$.

3.10 Definition. A finite-dimensional complex space with a Hermitian product is called a Hermitian space.


3.11 Example. Of course the product $(z, w) \mapsto (z|w) := \overline{w} z$ is a Hermitian product on $\mathbb{C}$. More generally, the map $(\ |\ ) : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}$ defined by

$$(z|w) := z \bullet w := \sum_{j=1}^n z^j \overline{w^j} \qquad \forall z = (z^1, z^2, \dots, z^n),\ w = (w^1, w^2, \dots, w^n)$$

is a Hermitian product on $\mathbb{C}^n$, called the standard Hermitian product of $\mathbb{C}^n$. As we shall see later, see Proposition 3.25, $\mathbb{C}^n$ equipped with the standard Hermitian product is in a sense the only Hermitian space of dimension $n$.

Let $X$ be a complex vector space with a Hermitian product $(\ |\ )$. From the properties of the Hermitian product we deduce

$$|z + w|^2 = (z + w|z + w) = (z|z + w) + (w|z + w) = (z|z) + (z|w) + (w|z) + (w|w) = |z|^2 + |w|^2 + 2\Re(z|w) \qquad (3.2)$$

from which we infer at once the following.

3.12 Theorem.

(i) We have [...]
(ii) (PARALLELOGRAM IDENTITY) We have
$$|z + w|^2 + |z - w|^2 = 2\left(|z|^2 + |w|^2\right) \qquad \forall z, w \in X.$$
(iii) (POLARITY FORMULA) We have
$$4(z|w) = \left(|z + w|^2 - |z - w|^2\right) + i\left(|z + iw|^2 - |z - iw|^2\right)$$
for all $z, w \in X$. We therefore can compute the Hermitian product of $z$ and $w$ by computing four norms.
(iv) (CAUCHY-SCHWARZ INEQUALITY) The following inequality holds
$$|(z|w)| \le |z|\,|w| \qquad \forall z, w \in X;$$
moreover $(z|w) = |z|\,|w|$ if and only if either $w = 0$, or $z = \lambda w$ for some $\lambda \in \mathbb{R}$, $\lambda \ge 0$.

Proof. (i), (ii), (iii) follow trivially from (3.2). Let us prove (iv). Let $z, w \in X$ and $\lambda = t e^{i\theta}$, $t, \theta \in \mathbb{R}$. From (3.2)

$$0 \le |z + \lambda w|^2 = t^2 |w|^2 + |z|^2 + 2t\,\Re\left(e^{-i\theta}(z|w)\right) \qquad \forall t \in \mathbb{R},$$

hence its discriminant is nonpositive, thus

$$\left|\Re\left(e^{-i\theta}(z|w)\right)\right| \le |z|\,|w|.$$

Since $\theta$ is arbitrary, we conclude $|(z|w)| \le |z|\,|w|$. The second part of the claim then follows as in the real case. If $(z|w) = |z|\,|w|$, then the discriminant of the real polynomial $t \mapsto |z + tw|^2$, $t \in \mathbb{R}$, vanishes. If $w \ne 0$, for some $t \in \mathbb{R}$ we have $|z + tw|^2 = 0$, i.e., $z = -tw$. Finally, $-t$ is nonnegative since $-t(w|w) = (z|w) = |z|\,|w| \ge 0$. $\square$
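The polarity formula and the Cauchy-Schwarz inequality can be checked numerically. The following is a minimal sketch for the standard Hermitian product of $\mathbb{C}^n$ (linear in the first argument, antilinear in the second, as in Definition 3.9); the names `herm`, `norm2` and `polar` are illustrative and not taken from the text.

```python
def herm(z, w):
    """Standard Hermitian product on C^n: sum_j z^j * conj(w^j)."""
    return sum(a * b.conjugate() for a, b in zip(z, w))

def norm2(z):
    """Squared norm |z|^2 = (z|z), a nonnegative real number."""
    return herm(z, z).real

def combine(z, w, c):
    """Return the vector z + c*w."""
    return [a + c * b for a, b in zip(z, w)]

def polar(z, w):
    """Recover (z|w) from four norms via the polarity formula."""
    real_part = norm2(combine(z, w, 1)) - norm2(combine(z, w, -1))
    imag_part = norm2(combine(z, w, 1j)) - norm2(combine(z, w, -1j))
    return (real_part + 1j * imag_part) / 4

z = [1 + 2j, 3 - 1j]
w = [2 - 1j, 0 + 1j]
direct = herm(z, w)       # computed from the definition
from_norms = polar(z, w)  # computed from four norms
```

The two values `direct` and `from_norms` should agree up to rounding, which is exactly the content of item (iii).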


3.13 ¶. Let $X$ be a complex vector space with a Hermitian product and let $z, w \in X$. Show that $|(z|w)| = |z|\,|w|$ if and only if either $w = 0$ or there exists $\lambda \in \mathbb{C}$ such that $z = \lambda w$.

3.14 Definition. Let $X$ be a complex vector space with a Hermitian product $(\ |\ )$. Two vectors $z, w \in X$ are said to be orthogonal, and we write $z \perp w$, if $(z|w) = 0$.

From (3.2) we immediately infer the following.

3.15 Proposition (Pythagorean theorem). Let $X$ be a complex vector space with a Hermitian product $(\ |\ )$. If $z, w \in X$ are orthogonal, then

$$|z + w|^2 = |z|^2 + |w|^2.$$

We see here a difference between the real and the complex cases. Contrary to the real case, two complex vectors such that $|z + w|^2 = |z|^2 + |w|^2$ holds need not be orthogonal. For instance, choose $X := \mathbb{C}$, $(z|w) := \overline{w} z$, and let $z = 1$ and $w = i$.

3.16 Proposition. Let $X$ be a complex vector space with a Hermitian product on it. The norm of $z \in X$,

$$|z| := \sqrt{(z|z)},$$

is a real-valued function $|\ | : X \to \mathbb{R}$ with the following properties:

(i) $|z| \in \mathbb{R}_+$ $\forall z \in X$.
(ii) (NONDEGENERACY) $|z| = 0$ if and only if $z = 0$.
(iii) (1-HOMOGENEITY) $|\lambda z| = |\lambda|\,|z|$ $\forall \lambda \in \mathbb{C}$, $\forall z \in X$.
(iv) (TRIANGULAR INEQUALITY) $|z + w| \le |z| + |w|$ $\forall z, w \in X$.

Proof. (i), (ii), (iii) are trivial. (iv) follows from the Cauchy-Schwarz inequality since

$$|z + w|^2 = |z|^2 + |w|^2 + 2\Re(z|w) \le |z|^2 + |w|^2 + 2|(z|w)| \le |z|^2 + |w|^2 + 2|z|\,|w| = (|z| + |w|)^2. \qquad \square$$

Finally, we call the distance between two points $z, w$ of $X$ the real number $d(z, w) := |z - w|$. It is trivial to check, using Proposition 3.16, that the distance function $d : X \times X \to \mathbb{R}$ defined by $d(z, w) := |z - w|$ has the following properties:

(i) (NONDEGENERACY) $d(z, w) \ge 0$ $\forall z, w \in X$ and $d(z, w) = 0$ if and only if $z = w$.
(ii) (SYMMETRY) $d(z, w) = d(w, z)$ $\forall z, w \in X$.
(iii) (TRIANGULAR INEQUALITY) $d(z, w) \le d(z, x) + d(x, w)$ $\forall w, x, z \in X$.

We refer to $d$ as the distance on $X$ induced by the Hermitian product.


3.17 Hermitian products in coordinates. If $X$ is a Hermitian space, the Gram matrix associated to the Hermitian product in a basis $(e_1, e_2, \dots, e_n)$ is defined by setting

$$\mathbf{G} = [g_{ij}], \qquad g_{ij} := (e_i|e_j).$$

Using linearity

$$(z|w) = \sum_{i,j=1}^n (e_i|e_j)\, z^i\, \overline{w^j} = \mathbf{z}^T \mathbf{G}\, \overline{\mathbf{w}}$$

if $\mathbf{z} = (z^1, z^2, \dots, z^n)^T$, $\mathbf{w} = (w^1, w^2, \dots, w^n)^T \in \mathbb{C}^n$ are the coordinate vector columns of $z$ and $w$ in the basis $(e_1, e_2, \dots, e_n)$. Notice that

(i) $\mathbf{G}$ is a Hermitian matrix, $\overline{\mathbf{G}}^T = \mathbf{G}$,
(ii) $\mathbf{G}$ is positive definite, i.e., $\mathbf{z}^T \mathbf{G}\, \overline{\mathbf{z}} \ge 0$ $\forall \mathbf{z} \in \mathbb{C}^n$ and $\mathbf{z}^T \mathbf{G}\, \overline{\mathbf{z}} = 0$ if and only if $\mathbf{z} = 0$; in particular, $\mathbf{G}$ is invertible.

c. Orthonormal basis and the Gram-Schmidt algorithm

3.18 Definition. Let $X$ be a Euclidean space with scalar product $(\ |\ )$ or a Hermitian vector space with Hermitian product $(\ |\ )$. A system of vectors $\{e_\alpha\}_{\alpha \in \mathcal{A}} \subset X$ is called orthonormal if

$$(e_\alpha|e_\beta) = \delta_{\alpha\beta} \qquad \forall \alpha, \beta \in \mathcal{A}.$$

Orthonormal vectors are linearly independent. In particular, $n$ orthonormal vectors in a Euclidean or Hermitian vector space of dimension $n$ form a basis, called an orthonormal basis.

3.19 Example. The canonical basis $(e_1, e_2, \dots, e_n)$ of $\mathbb{R}^n$ is an orthonormal basis for the standard inner product in $\mathbb{R}^n$. Similarly, the canonical basis $(e_1, e_2, \dots, e_n)$ of $\mathbb{C}^n$ is an orthonormal basis for the standard Hermitian product in $\mathbb{C}^n$.

3.20 ¶. Let $(\ |\ )$ be an inner (Hermitian) product on a Euclidean (Hermitian) space $X$ of dimension $n$ and let $\mathbf{G}$ be the associated Gram matrix in a basis $(e_1, e_2, \dots, e_n)$. Show that $\mathbf{G} = \mathrm{Id}_n$ if and only if $(e_1, e_2, \dots, e_n)$ is orthonormal.

Starting from a denumerable system of linearly independent vectors, we can construct a new denumerable system of orthonormal vectors that span the same subspaces by means of the Gram-Schmidt algorithm.

3.21 Theorem (Gram-Schmidt). Let $X$ be a real (complex) vector space with inner (Hermitian) product $(\ |\ )$. Let $v_1, v_2, \dots, v_k, \dots$ be a denumerable set of linearly independent vectors in $X$. Then there exists a set of orthonormal vectors $w_1, w_2, \dots, w_k, \dots$ such that for each $k = 1, 2, \dots$

$$\mathrm{Span}\{w_1, w_2, \dots, w_k\} = \mathrm{Span}\{v_1, v_2, \dots, v_k\}.$$


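Proof. We proceed by induction. In fact, the algorithm

$$w_1' := v_1, \qquad w_1 := \frac{w_1'}{|w_1'|}, \qquad w_p' := v_p - \sum_{j=1}^{p-1} (v_p|w_j)\, w_j, \qquad w_p := \frac{w_p'}{|w_p'|}$$

never stops since $w_p' \ne 0$ $\forall p = 1, 2, 3, \dots$ and produces the claimed orthonormal basis. $\square$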
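The recursion in the proof translates directly into code. The sketch below is one possible rendering for finitely many vectors of $\mathbb{R}^n$ or $\mathbb{C}^n$ with the standard product; the name `gram_schmidt` and the tolerance `tol` used to detect linear dependence are assumptions made for the illustration.

```python
import math

def inner(x, y):
    """Standard product: sum_j x^j * conj(y^j) (reduces to the real one on R^n)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize linearly independent vectors v_1, ..., v_k.

    Follows w'_p = v_p - sum_{j<p} (v_p|w_j) w_j, w_p = w'_p / |w'_p|.
    """
    ortho = []
    for v in vectors:
        w = list(v)
        for u in ortho:
            c = inner(v, u)  # coefficient (v_p | w_j)
            w = [a - c * b for a, b in zip(w, u)]
        n = math.sqrt(inner(w, w).real)
        if n < tol:
            # w'_p = 0 can only happen if the input is linearly dependent.
            raise ValueError("vectors are linearly dependent")
        ortho.append([a / n for a in w])
    return ortho

vs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ws = gram_schmidt(vs)
```

For badly conditioned inputs, subtracting the components from the running vector `w` rather than from `v` (the so-called modified Gram-Schmidt variant) is usually more stable in floating point; the version above keeps the formula of the proof unchanged.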

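3.22 Proposition (Pythagorean theorem). Let $X$ be a real (complex) vector space with inner (Hermitian) product $(\ |\ )$. Let $(e_1, e_2, \dots, e_k)$ be an orthonormal basis of $X$. Then

$$x = \sum_{j=1}^k (x|e_j)\, e_j \qquad \forall x \in X,$$

that is, the $i$th coordinate of $x$ in the basis $(e_1, e_2, \dots, e_k)$ is the cosine director $(x|e_i)$ of $x$ with respect to $e_i$. Therefore we compute

$$(x|y) = \sum_{i=1}^k (x|e_i)\,(y|e_i) \qquad \text{if } X \text{ is Euclidean},$$

$$(x|y) = \sum_{i=1}^k (x|e_i)\,\overline{(y|e_i)} \qquad \text{if } X \text{ is Hermitian},$$

so that in both cases Pythagoras's theorem holds:

$$|x|^2 = (x|x) = \sum_{i=1}^k |(x|e_i)|^2.$$

Proof. In fact, by linearity, for $j = 1, \dots, k$ and $x = \sum_{i=1}^k x^i e_i$ we have

$$(x|e_j) = \Big(\sum_{i=1}^k x^i e_i \,\Big|\, e_j\Big) = \sum_{i=1}^k x^i (e_i|e_j) = \sum_{i=1}^k x^i \delta_{ij} = x^j.$$

Similarly, using linearity and assuming $X$ is Hermitian, we have

$$(x|y) = \Big(\sum_{i=1}^k x^i e_i \,\Big|\, \sum_{j=1}^k y^j e_j\Big) = \sum_{i,j=1}^k x^i\, \overline{y^j}\, (e_i|e_j) = \sum_{i=1}^k x^i\, \overline{y^i},$$

hence, by the first part,

$$(x|y) = \sum_{i=1}^k (x|e_i)\,\overline{(y|e_i)}. \qquad \square$$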


d. Isometries

3.23 Definition. Let $X, Y$ be two real (complex) vector spaces with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$. We say that a linear map $A : X \to Y$ is an isometry if and only if

$$|A(x)|_Y = |x|_X \qquad \forall x \in X,$$

or, equivalently, compare the polarity formula, if

$$(A(x)|A(y))_Y = (x|y)_X \qquad \forall x, y \in X.$$

Isometries are trivially injective, but not necessarily surjective. If there exists a surjective isometry between two Euclidean (Hermitian) spaces $X$ and $Y$, then $X$ and $Y$ are said to be isometric.

3.24 ¶. Let $X, Y$ be two real (complex) vector spaces with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$ and let $A : X \to Y$ be a linear map. Show that the following claims are equivalent:
(i) $A$ is an isometry,
(ii) $B \subset X$ is an orthonormal basis if and only if $A(B)$ is an orthonormal basis for $A(X)$.

Let $X$ be a real vector space with inner product $(\ |\ )$ or a complex vector space with Hermitian product $(\ |\ )$. Let $(e_1, e_2, \dots, e_n)$ be a basis in $X$ and let $\mathcal{E} : X \to \mathbb{K}^n$ ($\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$) be the corresponding system of coordinates. Proposition 3.22 implies that the following claims are equivalent.

(i) $(e_1, e_2, \dots, e_n)$ is an orthonormal basis,
(ii) $\mathcal{E}(x) = ((x|e_1), \dots, (x|e_n))$,
(iii) $\mathcal{E}$ is an isometry between $X$ and the Euclidean space $\mathbb{R}^n$ with the standard scalar product (or $\mathbb{C}^n$ with the standard Hermitian product).

In this way, the Gram-Schmidt algorithm yields the following.

3.25 Proposition. Let $X$ be a real vector space with inner product $(\ |\ )$ (or a complex vector space with Hermitian product $(\ |\ )$) of dimension $n$. Then $X$ is isometric to $\mathbb{R}^n$ with the standard scalar product (respectively, to $\mathbb{C}^n$ with the standard Hermitian product), the isometry being the coordinate system associated to an orthonormal basis.

In other words, using an orthonormal basis on $X$ is the same as identifying $X$ with $\mathbb{R}^n$ (or with $\mathbb{C}^n$) with the canonical inner (Hermitian) product.

3.26 Isometries in coordinates. Let us compute the matrix associated to an isometry $R : X \to Y$ between two Euclidean (Hermitian) spaces of dimension $n$ and $m$ respectively, in orthonormal bases (so that $X$ and $Y$ are respectively isometric to $\mathbb{R}^n$ ($\mathbb{C}^n$) and $\mathbb{R}^m$ ($\mathbb{C}^m$) by means of the associated coordinate systems). It is therefore sufficient to discuss real isometries $R : \mathbb{R}^n \to \mathbb{R}^m$ and complex isometries $R : \mathbb{C}^n \to \mathbb{C}^m$.

Let $R : \mathbb{R}^n \to \mathbb{R}^m$ be linear and let $\mathbf{R} \in M_{m,n}(\mathbb{R})$ be the associated matrix, $R(x) = \mathbf{R}x$, $x \in \mathbb{R}^n$. Denoting by $(e_1, e_2, \dots, e_n)$ the canonical basis of $\mathbb{R}^n$,

$$\mathbf{R} = [\mathbf{r}_1 \,|\, \mathbf{r}_2 \,|\, \dots \,|\, \mathbf{r}_n], \qquad \mathbf{r}_i = \mathbf{R}e_i \quad \forall i.$$

Since $(e_1, e_2, \dots, e_n)$ is orthonormal, $R$ is an isometry if and only if $(\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_n)$ are orthonormal. In particular, $m \ge n$ and

$$\mathbf{r}_i^T \mathbf{r}_j = \mathbf{r}_i \bullet \mathbf{r}_j = \delta_{ij},$$

i.e., the matrix $\mathbf{R}$ is an orthogonal matrix, $\mathbf{R}^T \mathbf{R} = \mathrm{Id}_n$. When $m = n$, the isometries $R : \mathbb{R}^n \to \mathbb{R}^n$ are necessarily surjective, being injective, and form a group under composition. As above, we deduce that the group of isometries of $\mathbb{R}^n$ is isomorphic to the orthogonal group $O(n)$ defined by

$$O(n) := \{\mathbf{R} \in M_{n,n}(\mathbb{R}) \mid \mathbf{R}^T \mathbf{R} = \mathrm{Id}_n\}.$$

Observe that a square orthogonal matrix $\mathbf{R}$ is invertible with $\mathbf{R}^{-1} = \mathbf{R}^T$. It follows that $\mathbf{R}\mathbf{R}^T = \mathrm{Id}$ and $|\det \mathbf{R}| = 1$.

Similarly, consider $\mathbb{C}^n$ as a Hermitian space with the standard Hermitian product. Let $R : \mathbb{C}^n \to \mathbb{C}^m$ be linear and let $\mathbf{R} \in M_{m,n}(\mathbb{C})$ be such that $R(z) = \mathbf{R}z$. Denoting by $(e_1, e_2, \dots, e_n)$ the canonical basis of $\mathbb{C}^n$,

$$\mathbf{R} = [\mathbf{r}_1 \,|\, \mathbf{r}_2 \,|\, \dots \,|\, \mathbf{r}_n], \qquad \mathbf{r}_i = \mathbf{R}e_i \quad \forall i = 1, \dots, n.$$

Since $(e_1, e_2, \dots, e_n)$ is orthonormal, $R$ is an isometry if and only if $\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_n$ are orthonormal. In particular, $m \ge n$ and

$$\overline{\mathbf{r}_i}^T \mathbf{r}_j = \delta_{ij},$$

i.e., the matrix $\mathbf{R}$ is a unitary matrix,

$$\overline{\mathbf{R}}^T \mathbf{R} = \mathrm{Id}_n.$$

When $m = n$, the isometries $R : \mathbb{C}^n \to \mathbb{C}^n$ are necessarily surjective, being injective; moreover they form a group under composition. From the above, we deduce that the group of isometries of $\mathbb{C}^n$ is isomorphic to the unitary group $U(n)$ defined by

$$U(n) := \{\mathbf{R} \in M_{n,n}(\mathbb{C}) \mid \overline{\mathbf{R}}^T \mathbf{R} = \mathrm{Id}_n\}.$$

Observe that a square unitary matrix $\mathbf{R}$ is invertible with $\mathbf{R}^{-1} = \overline{\mathbf{R}}^T$. It follows that $\mathbf{R}\overline{\mathbf{R}}^T = \mathrm{Id}$ and $|\det \mathbf{R}| = 1$.
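The matrix criterion $\mathbf{R}^T\mathbf{R} = \mathrm{Id}_n$ is easy to test numerically in the real case. The sketch below checks it for a rotation of $\mathbb{R}^2$ ($m = n$) and for an isometric embedding of $\mathbb{R}^2$ into $\mathbb{R}^3$ ($m > n$); the helper names and the two sample matrices are illustrative assumptions.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def is_identity(M, tol=1e-12):
    """True if M is square and equal to the identity matrix up to tol."""
    return len(M) == len(M[0]) and all(
        abs(M[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(len(M)) for j in range(len(M)))

t = 0.3
R2 = [[math.cos(t), -math.sin(t)],
      [math.sin(t),  math.cos(t)]]        # rotation of R^2: m = n = 2
s = 1.0 / math.sqrt(2.0)
R32 = [[1.0, 0.0],
       [0.0, s],
       [0.0, s]]                          # embedding R^2 -> R^3: m = 3 > n = 2

square_ok = is_identity(matmul(transpose(R2), R2))     # R^T R = Id_2
embed_ok = is_identity(matmul(transpose(R32), R32))    # R^T R = Id_2
# For m > n the product R R^T is not the identity: R is not surjective.
onto = is_identity(matmul(R32, transpose(R32)))
det_R2 = R2[0][0] * R2[1][1] - R2[0][1] * R2[1][0]
```

In the square case `det_R2` has absolute value 1, in agreement with the remark on $O(n)$.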

e. The projection theorem

Let $X$ be a real (complex) vector space with inner (Hermitian) product $(\ |\ )$ that is not necessarily finite dimensional, let $V \subset X$ be a finite-dimensional linear subspace of $X$ of dimension $k$ and let $(e_1, e_2, \dots, e_k)$ be an orthonormal basis of $V$. We say that $x \in X$ is orthogonal to $V$ if $(x|v) = 0$ $\forall v \in V$. As $(e_1, e_2, \dots, e_k)$ is a basis of $V$, $x \perp V$ if and only if $(x|e_i) = 0$ $\forall i = 1, \dots, k$. For all $x \in X$, the vector

$$P_V(x) := \sum_{i=1}^k (x|e_i)\, e_i \in V$$

is called the orthogonal projection of $x$ onto $V$, and the map $P_V : X \to V$, $x \mapsto P_V(x)$, the projection map onto $V$. By Proposition 3.22, $P_V(x) = x$ if $x \in V$, hence $\mathrm{Im}\, P_V = V$ and $P_V^2 = P_V$. By Proposition 3.22 we also have $|P_V(x)|^2 = \sum_{i=1}^k |(x|e_i)|^2$. The next theorem explains the name for $P_V(x)$ and shows that in fact $P_V(x)$ is well defined as it does not depend on the chosen basis $(e_1, e_2, \dots, e_k)$.

3.27 Theorem (of orthogonal projection). With the previous notation, there exists a unique $z \in V$ such that $x - z$ is orthogonal to $V$, i.e., $(x - z|v) = 0$ $\forall v \in V$. Moreover, the following claims are equivalent.

(i) $x - z$ is orthogonal to $V$, i.e., $(x - z|v) = 0$ $\forall v \in V$,
(ii) $z \in V$ is the orthogonal projection of $x$ onto $V$, $z = P_V(x)$,
(iii) $z$ is the point in $V$ of minimum distance from $x$, i.e.,
$$|x - z| < |x - v| \qquad \forall v \in V,\ v \ne z.$$

In particular, $P_V(x)$ is well defined as it does not depend on the chosen orthonormal basis, and there is a unique minimizer of the function $v \mapsto |x - v|$, $v \in V$, namely the vector $z = P_V(x)$.

Proof. We first prove uniqueness. If $z_1, z_2 \in V$ are such that $(x - z_i|v) = 0$ $\forall v \in V$ for $i = 1, 2$, then $(z_1 - z_2|v) = 0$ $\forall v \in V$, in particular $|z_1 - z_2|^2 = 0$.

(i) $\Rightarrow$ (ii). From (i) we have $(x|e_i) = (z|e_i)$ $\forall i = 1, \dots, k$. By Proposition 3.22

$$z = \sum_{i=1}^k (z|e_i)\, e_i = \sum_{i=1}^k (x|e_i)\, e_i = P_V(x).$$

This also shows existence of a point $z$ such that $x - z$ is orthogonal to $V$ and that the definition of $P_V(x)$ is independent of the chosen orthonormal basis $(e_1, e_2, \dots, e_k)$.

(ii) $\Rightarrow$ (i). If $z = P_V(x)$, we have for every $j = 1, \dots, k$

$$(x - z|e_j) = (x|e_j) - \sum_{i=1}^k (x|e_i)(e_i|e_j) = (x|e_j) - (x|e_j) = 0,$$

hence $(x - z|v) = 0$ $\forall v \in V$.

(i) $\Rightarrow$ (iii). Let $v \in V$. Since $z - v \in V$, we have $(x - z|z - v) = 0$ and therefore

$$|x - v|^2 = |x - z + z - v|^2 = |x - z|^2 + |z - v|^2,$$

hence (iii).

(iii) $\Rightarrow$ (i). Let $v \in V$. The function $t \mapsto |x - z + tv|^2$, $t \in \mathbb{R}$, has a minimum point at $t = 0$. Since

$$|x - z + tv|^2 = |x - z|^2 + 2t\,\Re(x - z|v) + t^2 |v|^2,$$

necessarily $\Re(x - z|v) = 0$. If $X$ is a real vector space, this means $(x - z|v) = 0$, hence (i). If $X$ is a complex vector space, from $\Re(x - z|v) = 0$ $\forall v \in V$, we also have $\Re\left(e^{-i\theta}(x - z|v)\right) = 0$ $\forall \theta \in \mathbb{R}$, $\forall v \in V$, hence $(x - z|v) = 0$ $\forall v \in V$ and thus (i). $\square$
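The three equivalent characterizations in Theorem 3.27 can be observed on a small example. The following sketch, for a two-dimensional subspace $V$ of $\mathbb{R}^3$ with the standard inner product, computes $P_V(x)$ from an orthonormal basis and checks orthogonality of the residual and minimality of the distance; the names `project` and `dist` and the sample data are assumptions for the illustration.

```python
import math

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def project(x, basis):
    """P_V(x) = sum_i (x|e_i) e_i for an orthonormal basis (e_1, ..., e_k) of V."""
    p = [0.0] * len(x)
    for e in basis:
        c = inner(x, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

def dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

s = 1.0 / math.sqrt(2.0)
basis = [[1.0, 0.0, 0.0], [0.0, s, s]]   # orthonormal basis of V
x = [1.0, 2.0, 4.0]
z = project(x, basis)                     # expected to be close to (1, 3, 3)
residual = [a - b for a, b in zip(x, z)]  # x - z, orthogonal to V
other = [1.0, 2.0, 2.0]                   # another point of V
```

Here `dist(x, z)` is strictly smaller than `dist(x, other)`, as claim (iii) predicts for any point of $V$ different from $z$.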

We can discuss linear independence in terms of an orthogonal projection. In fact, for any finite-dimensional subspace $V \subset X$, $x \in V$ if and only if $x - P_V(x) = 0$; equivalently, the equation $x - P_V(x) = 0$ is an implicit equation that characterizes $V$ as the kernel of $\mathrm{Id} - P_V$.


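3.28 ¶. Let $W = \mathrm{Span}\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k\}$ be a subspace of $\mathbb{K}^n$. Describe a procedure that uses the orthogonal projection theorem to find a basis of $W$.

3.29 ¶. Given $\mathbf{A} \in M_{m,n}(\mathbb{K})$, describe a procedure that uses the orthogonal projection theorem in order to select a maximal system of independent rows and columns of $\mathbf{A}$.

3.30 ¶. Let $\mathbf{A} \in M_{m,n}(\mathbb{K})$. Describe a procedure to find a basis of $\ker \mathbf{A}$.

3.31 ¶. Given $k$ linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k \in \mathbb{R}^n$, choose among the vectors $(e_1, e_2, \dots, e_n)$ of $\mathbb{R}^n$ $(n - k)$ vectors that together with $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ form a basis of $\mathbb{R}^n$.

3.32 Projections in coordinates. Let $X$ be a Euclidean (Hermitian) space of dimension $n$ and let $V \subset X$ be a subspace of dimension $k$. Let us compute the matrix associated to the orthogonal projection operator $P_V : X \to X$ in an orthonormal basis. Of course, it suffices to think of $P_V$ as the orthogonal projection on a subspace of $\mathbb{R}^n$ ($\mathbb{C}^n$ in the complex case).

Let $(e_1, e_2, \dots, e_n)$ be the canonical basis of $\mathbb{R}^n$ and $V \subset \mathbb{R}^n$. Let $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k$ be an orthonormal basis of $V$ and denote by $\mathbf{V} = [v^i_j]$ the $n \times k$ matrix of rank $k$

$$\mathbf{V} := [\mathbf{v}_1 \,|\, \mathbf{v}_2 \,|\, \dots \,|\, \mathbf{v}_k]$$

so that $\mathbf{v}_j = \sum_{i=1}^n v^i_j e_i$. Let $\mathbf{P}$ be the $n \times n$ matrix associated to the orthogonal projection onto $V$, $P_V(\mathbf{x}) = \mathbf{P}\mathbf{x}$, or,

$$\mathbf{P} = [\mathbf{p}_1 \,|\, \mathbf{p}_2 \,|\, \dots \,|\, \mathbf{p}_n], \qquad \mathbf{p}_i = \mathbf{P}e_i, \quad i = 1, \dots, n.$$

Then

$$\mathbf{p}_i = P_V(e_i) = \sum_{j=1}^k (e_i \bullet \mathbf{v}_j)\, \mathbf{v}_j = \sum_{j=1}^k v^i_j\, \mathbf{v}_j = \sum_{j=1}^k \sum_{h=1}^n v^i_j v^h_j\, e_h = \sum_{h=1}^n (\mathbf{V}\mathbf{V}^T)^h_i\, e_h, \qquad (3.3)$$

i.e.,

$$\mathbf{P} = \mathbf{V}\mathbf{V}^T.$$

The complex case is similar. With the same notation, instead of (3.3) we have

$$\mathbf{p}_i = P_V(e_i) = \sum_{j=1}^k (e_i \bullet \mathbf{v}_j)\, \mathbf{v}_j = \sum_{j=1}^k \overline{v^i_j}\, \mathbf{v}_j = \sum_{j=1}^k \sum_{h=1}^n \overline{v^i_j}\, v^h_j\, e_h = \sum_{h=1}^n (\mathbf{V}\overline{\mathbf{V}}^T)^h_i\, e_h, \qquad (3.4)$$

i.e., $\mathbf{P} = \mathbf{V}\overline{\mathbf{V}}^T$.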
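As a small check of the formula $\mathbf{P} = \mathbf{V}\mathbf{V}^T$ in the real case, the sketch below builds the projection matrix onto a plane of $\mathbb{R}^3$ from an orthonormal basis stored as the columns of `V`. The helper names and the sample subspace are assumptions made for the illustration.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

s = 1.0 / math.sqrt(2.0)
# Columns of V are the orthonormal vectors v_1 = (1, 0, 0), v_2 = (0, s, s).
V = [[1.0, 0.0],
     [0.0, s],
     [0.0, s]]

P = matmul(V, transpose(V))   # n x n projection matrix P = V V^T
P2 = matmul(P, P)             # should coincide with P, since P^2 = P
x = [[1.0], [2.0], [4.0]]     # a column vector of R^3
Px = matmul(P, x)             # its orthogonal projection onto V
```

The matrix `P` obtained this way is symmetric and idempotent, the two properties that characterize matrices of orthogonal projections in an orthonormal basis.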

f. Orthogonal subspaces

Let $X$ be a real vector space with inner product $(\ |\ )$ or a complex vector space with Hermitian product $(\ |\ )$. Suppose $X$ is finite dimensional and let $W \subset X$ be a linear subspace of $X$. The subset

$$W^\perp := \{x \in X \mid (x|y) = 0 \ \forall y \in W\}$$

is called the orthogonal of $W$ in $X$.

3.33 Proposition. We have

(i) $W^\perp$ is a linear subspace of $X$,
(ii) $W \cap W^\perp = \{0\}$,
(iii) $(W^\perp)^\perp = W$,
(iv) $W$ and $W^\perp$ are supplementary, hence $\dim W + \dim W^\perp = n$,
(v) if $P_W$ and $P_{W^\perp}$ are respectively the orthogonal projections onto $W$ and $W^\perp$ seen as linear maps from $X$ into itself, then $P_{W^\perp} = \mathrm{Id}_X - P_W$.

Proof. We prove (iv) and leave the rest to the reader. Let $(v_1, v_2, \dots, v_k)$ be a basis of $W$. Then we can complete $(v_1, v_2, \dots, v_k)$ with $n - k$ vectors of the canonical basis to get a new basis of $X$. Then the Gram-Schmidt procedure yields an orthonormal basis $(w_1, w_2, \dots, w_n)$ of $X$ such that $W = \mathrm{Span}\{w_1, w_2, \dots, w_k\}$. On the other hand $w_{k+1}, \dots, w_n \in W^\perp$, hence $\dim W^\perp = n - k$. $\square$

g. Riesz's theorem

3.34 Theorem (Riesz). Let $X$ be a Euclidean or Hermitian space of dimension $n$. For any $L \in X^*$ there is a unique $x_L \in X$ such that

$$L(x) = (x|x_L) \qquad \forall x \in X. \qquad (3.5)$$

Proof. Assume, for instance, that $X$ is Hermitian. Suppose $L \ne 0$, otherwise we choose $x_L = 0$, and observe that $\dim \mathrm{Im}\, L = 1$, and $V := \ker L$ has dimension $n - 1$ if $\dim X = n$. Fix $x_0 \in V^\perp$ with $|x_0| = 1$; then every $x \in X$ decomposes as

$$x = x' + \lambda x_0, \qquad x' \in \ker L,\ \lambda = (x|x_0).$$

Consequently,

$$L(x) = L(x') + \lambda L(x_0) = (x|x_0)\, L(x_0) = \left(x \,\middle|\, \overline{L(x_0)}\, x_0\right)$$

and the claim follows choosing $x_L := \overline{L(x_0)}\, x_0$. $\square$

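The map $\beta : X^* \to X$, $L \mapsto x_L$, defined by the Riesz theorem is called the Riesz map. Notice that $\beta$ is linear if $X$ is Euclidean and antilinear if $X$ is Hermitian.

3.35 The Riesz map in coordinates. Let $X$ be a Euclidean (Hermitian) space with inner (Hermitian) product $(\ |\ )$, fix a basis and denote by $\mathbf{x} = (x^1, x^2, \dots, x^n)$ the coordinates of $x$, and by $\mathbf{G}$ the Gram matrix of the inner (Hermitian) product. Let $L \in X^*$ and let $\mathbf{L}$ be the associated matrix, $L(x) = \mathbf{L}\mathbf{x}$. From (3.5)

$$\mathbf{L}\mathbf{x} = L(x) = (x|x_L) = \mathbf{x}^T \mathbf{G}\, \mathbf{x}_L \qquad \text{if } X \text{ is Euclidean},$$

$$\mathbf{L}\mathbf{x} = L(x) = (x|x_L) = \mathbf{x}^T \mathbf{G}\, \overline{\mathbf{x}_L} \qquad \text{if } X \text{ is Hermitian},$$

i.e.,

$$\mathbf{G}\mathbf{x}_L = \mathbf{L}^T \quad \text{or} \quad \mathbf{x}_L = \mathbf{G}^{-1}\mathbf{L}^T \qquad \text{if } X \text{ is Euclidean},$$

$$\mathbf{G}\,\overline{\mathbf{x}_L} = \mathbf{L}^T \quad \text{or} \quad \mathbf{x}_L = \overline{\mathbf{G}}^{-1}\,\overline{\mathbf{L}}^T \qquad \text{if } X \text{ is Hermitian}.$$

In particular, if the chosen basis $(e_1, e_2, \dots, e_n)$ is orthonormal, then $\mathbf{G} = \mathrm{Id}$ and

$$\mathbf{x}_L = \mathbf{L}^T \qquad \text{if } X \text{ is Euclidean},$$

$$\mathbf{x}_L = \overline{\mathbf{L}}^T \qquad \text{if } X \text{ is Hermitian}.$$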
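To make the Euclidean formula $\mathbf{x}_L = \mathbf{G}^{-1}\mathbf{L}^T$ concrete, here is a minimal sketch in dimension 2 with exact rational arithmetic. The Gram matrix below is that of the basis $e_1 = (1, 0)$, $e_2 = (1, 1)$ of $\mathbb{R}^2$ with the standard inner product; the functional `L` and the helper names `solve2` and `gram_form` are hypothetical choices for the example.

```python
from fractions import Fraction as F

def solve2(G, b):
    """Solve the 2x2 system G u = b by Cramer's rule (G assumed invertible)."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(b[0] * G[1][1] - G[0][1] * b[1]) / det,
            (G[0][0] * b[1] - b[0] * G[1][0]) / det]

def gram_form(x, G, y):
    """Inner product in coordinates: x^T G y."""
    return sum(x[i] * G[i][j] * y[j] for i in range(2) for j in range(2))

# Gram matrix g_ij = (e_i|e_j) for e_1 = (1, 0), e_2 = (1, 1).
G = [[F(1), F(1)],
     [F(1), F(2)]]
# Row matrix of the functional: L_i = L(e_i).
L = [F(3), F(5)]

x_L = solve2(G, L)   # coordinates of the Riesz representative, G x_L = L^T

def apply_L(x):
    """L(x) = L x for a coordinate vector x."""
    return L[0] * x[0] + L[1] * x[1]
```

For every coordinate vector `x`, `apply_L(x)` and `gram_form(x, G, x_L)` return the same number, which is the coordinate form of (3.5).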


Figure 3.1. Dynamometer.

3.36 Example (Work and forces). Suppose a mass $m$ is fixed to a dynamometer. If $\theta$ is the inclination of the dynamometer, the dynamometer shows the number

$$L = mg \cos\theta, \qquad (3.6)$$

where $g$ is a suitable constant. Notice that we need no coordinates in $\mathbb{R}^3$ to read the dynamometer. We may model the reading of the dynamometer as a map of the direction $v$ of the dynamometer, that is, as a map $L : S^2 \to \mathbb{R}$ from the unit sphere $S^2 = \{x \in \mathbb{R}^3 \mid |x| = 1\}$ of the ambient space $V$ into $\mathbb{R}$. Moreover, extending $L$ homogeneously to the entire space $V$ by setting $L(v) := |v|\, L(v/|v|)$, $v \in \mathbb{R}^3 \setminus \{0\}$, we see that such an extension is linear because of the simple dependence of $L$ on the inclination. Thus we can model the elementary work done on the mass $m$, the measures made using the dynamometer, by a linear map $L : V \to \mathbb{R}$. Thinking of the ambient space $V$ as Euclidean, by Riesz's theorem we can represent $L$ as a scalar product, introducing a vector $F := x_L \in V$ such that

$$(v|F) = L(v) \qquad \forall v \in V.$$

We interpret such a vector as the force whose action on the mass produces the elementary work $L(v)$. Now fix a basis $(e_1, e_2, e_3)$ of $V$. If $\mathbf{F} = (F^1, F^2, F^3)^T$ is the column vector of the force coordinates and $\mathbf{L} = (L_1, L_2, L_3)$ is the $1 \times 3$ matrix of the coordinates of $L$ in the dual basis, that is, the three readings $L_i = L(e_i)$, $i = 1, 2, 3$, of the dynamometer in the directions $e_1, e_2, e_3$, then, as we have seen,

$$\mathbf{F} = \mathbf{G}^{-1}\mathbf{L}^T.$$

In particular, if $(e_1, e_2, e_3)$ is an orthonormal basis,

$$\mathbf{F} = \mathbf{L}^T.$$

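h. The adjoint operator

Let $X, Y$ be two vector spaces both on $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$ and let $A : X \to Y$ be a linear map. For any $y \in Y$ the map

$$x \mapsto (A(x)|y)_Y$$

defines a linear map on $X$, hence by Riesz's theorem there is a unique $A^*(y) \in X$ such that

$$(A(x)|y)_Y = (x|A^*(y))_X \qquad \forall x \in X,\ \forall y \in Y. \qquad (3.7)$$

It is easily seen that the map $y \mapsto A^*(y)$ from $Y$ into $X$ defined by (3.7) is linear: it is called the adjoint of $A$. Moreover,

(i) let $A, B : X \to Y$ be two linear maps between two Euclidean or Hermitian spaces. Then $(A + B)^* = A^* + B^*$,
(ii) $(\lambda A)^* = \lambda A^*$ if $\lambda \in \mathbb{R}$ and $A : X \to Y$ is a linear map between two Euclidean spaces,
(iii) $(\lambda A)^* = \overline{\lambda} A^*$ if $\lambda \in \mathbb{C}$ and $A : X \to Y$ is a linear map between two Hermitian spaces,
(iv) $(B \circ A)^* = A^* \circ B^*$ if $A : X \to Y$ and $B : Y \to Z$ are linear maps between Euclidean (Hermitian) spaces,
(v) $(A^*)^* = A$ if $A : X \to Y$ is a linear map.

3.37 ¶. Let $X, Y$ be vector spaces. We have already defined an adjoint $\widetilde{A} : Y^* \to X^*$ with no use of inner or Hermitian products,

$$\langle \widetilde{A}(y^*), x \rangle = \langle y^*, A(x) \rangle \qquad \forall x \in X,\ \forall y^* \in Y^*.$$

If $X$ and $Y$ are Euclidean (Hermitian) spaces, denote by $\beta_X : X^* \to X$, $\beta_Y : Y^* \to Y$ the Riesz isomorphisms and by $A^*$ the adjoint of $A$ defined by (3.7). Show that $A^* = \beta_X \circ \widetilde{A} \circ \beta_Y^{-1}$.

3.38 The adjoint operator in coordinates. Let $X, Y$ be two Euclidean (Hermitian) spaces with inner (Hermitian) products $(\ |\ )_X$ and $(\ |\ )_Y$. Fix two bases in $X$ and $Y$, and denote the Gram matrices of the inner (Hermitian) products on $X$ and $Y$ respectively by $\mathbf{G}$ and $\mathbf{H}$. Denote by $\mathbf{x}$ the coordinates of a vector $x$. Let $A : X \to Y$ be a linear map, let $A^*$ be the adjoint map and let $\mathbf{A}$, $\mathbf{A}^*$ be respectively the associated matrices. Then we have

$$(A(x)|y)_Y = \mathbf{x}^T \mathbf{A}^T \mathbf{H} \mathbf{y}, \qquad (x|A^*(y))_X = \mathbf{x}^T \mathbf{G} \mathbf{A}^* \mathbf{y},$$

if $X$ and $Y$ are Euclidean and

$$(A(x)|y)_Y = \mathbf{x}^T \mathbf{A}^T \mathbf{H}\, \overline{\mathbf{y}}, \qquad (x|A^*(y))_X = \mathbf{x}^T \mathbf{G}\, \overline{\mathbf{A}^*}\, \overline{\mathbf{y}},$$

if $X$ and $Y$ are Hermitian. Therefore

$$\mathbf{G}\mathbf{A}^* = \mathbf{A}^T \mathbf{H} \qquad \text{if } X \text{ and } Y \text{ are Euclidean},$$

$$\mathbf{G}\,\overline{\mathbf{A}^*} = \mathbf{A}^T \mathbf{H} \qquad \text{if } X \text{ and } Y \text{ are Hermitian},$$

or, recalling that $\mathbf{G}^T = \mathbf{G}$, $(\mathbf{G}^{-1})^T = \mathbf{G}^{-1}$, $\mathbf{H}^T = \mathbf{H}$ if $X$ and $Y$ are Euclidean and that $\overline{\mathbf{G}}^T = \mathbf{G}$, $(\overline{\mathbf{G}}^{-1})^T = \mathbf{G}^{-1}$, and $\overline{\mathbf{H}}^T = \mathbf{H}$ if $X$ and $Y$ are Hermitian, we find

$$\mathbf{A}^* = \mathbf{G}^{-1} \mathbf{A}^T \mathbf{H} \qquad \text{if } X \text{ and } Y \text{ are Euclidean},$$

$$\mathbf{A}^* = \overline{\mathbf{G}}^{-1}\, \overline{\mathbf{A}}^T\, \overline{\mathbf{H}} \qquad \text{if } X \text{ and } Y \text{ are Hermitian}.$$

In particular, if the chosen bases in $X$ and $Y$ are orthonormal, then

$$\mathbf{A}^* = \mathbf{A}^T \text{ in the Euclidean case}, \qquad \mathbf{A}^* = \overline{\mathbf{A}}^T \text{ in the Hermitian case}. \qquad (3.8)$$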
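The Euclidean formula $\mathbf{A}^* = \mathbf{G}^{-1}\mathbf{A}^T\mathbf{H}$ can be verified on a small example with exact arithmetic. In the sketch below the Gram matrices `G`, `H` and the matrix `A` are arbitrary sample data, and the helper names are illustrative; the check is that the defining identity (3.7) holds in coordinates.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(M):
    """Inverse of an invertible 2x2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def form(x, M, y):
    """Bilinear expression x^T M y."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

G = [[F(2), F(1)], [F(1), F(1)]]   # Gram matrix on X (symmetric, positive definite)
H = [[F(1), F(0)], [F(0), F(3)]]   # Gram matrix on Y (symmetric, positive definite)
A = [[F(1), F(2)], [F(0), F(1)]]   # matrix of A : X -> Y

A_star = matmul(matmul(inv2(G), transpose(A)), H)   # G^{-1} A^T H

def lhs(x, y):
    """(A(x)|y)_Y in coordinates."""
    return form(apply(A, x), H, y)

def rhs(x, y):
    """(x|A*(y))_X in coordinates."""
    return form(x, G, apply(A_star, y))
```

Note that `A_star` differs from the plain transpose of `A` here, because the chosen Gram matrices are not the identity.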


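3.39 Theorem. Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces and let $A^* : Y \to X$ be its adjoint. Then $\mathrm{Rank}\, A^* = \mathrm{Rank}\, A$. Moreover,

$$(\mathrm{Im}\, A)^\perp = \ker A^*, \qquad \mathrm{Im}\, A = (\ker A^*)^\perp, \qquad (\mathrm{Im}\, A^*)^\perp = \ker A, \qquad \mathrm{Im}\, A^* = (\ker A)^\perp.$$

Proof. Fix two orthonormal bases on $X$ and $Y$, and let $\mathbf{A}$ be the matrix associated to $A$ using these bases. Then, see (3.8), the matrix associated to $A^*$ is $\overline{\mathbf{A}}^T$, hence

$$\mathrm{Rank}\, A^* = \mathrm{Rank}\, \overline{\mathbf{A}}^T = \mathrm{Rank}\, \mathbf{A} = \mathrm{Rank}\, A,$$

and

$$\dim (\ker A^*)^\perp = \dim Y - \dim \ker A^* = \mathrm{Rank}\, A^* = \mathrm{Rank}\, A = \dim \mathrm{Im}\, A.$$

On the other hand, $\mathrm{Im}\, A \subset (\ker A^*)^\perp$ since, if $y = A(x)$ and $A^*(v) = 0$, then $(y|v) = (A(x)|v) = (x|A^*(v)) = 0$. We then conclude that $(\ker A^*)^\perp = \mathrm{Im}\, A$. The other claims easily follow. In fact, they are all equivalent to $\mathrm{Im}\, A = (\ker A^*)^\perp$. $\square$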

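As an immediate consequence of Theorem 3.39 we have the following.

3.40 Theorem (The alternative theorem). Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces and let $A^* : Y \to X$ be its adjoint. Then $A|_{(\ker A)^\perp} : (\ker A)^\perp \to \mathrm{Im}\, A$ and $A^*|_{\mathrm{Im}\, A} : \mathrm{Im}\, A \to (\ker A)^\perp$ are injective and onto, hence isomorphisms. Moreover,

(i) $A(x) = y$ has at least a solution if and only if $y$ is orthogonal to $\ker A^*$,
(ii) $y$ is orthogonal to $\mathrm{Im}\, A$ if and only if $A^*(y) = 0$,
(iii) $A$ is injective if and only if $A^*$ is surjective,
(iv) $A$ is surjective if and only if $A^*$ is injective.

3.41 ¶. A more direct proof of the equality $\ker A = (\mathrm{Im}\, A^*)^\perp$ is the following. For simplicity, consider the real case. Clearly, it suffices to work in coordinates and, by choosing an orthonormal basis, it is enough to show that $\ker \mathbf{A} = (\mathrm{Im}\, \mathbf{A}^T)^\perp$ for every matrix $\mathbf{A} \in M_{m,n}(\mathbb{R})$. Let $\mathbf{A} = (a^i_j) \in M_{m,n}(\mathbb{R})$ and let $\mathbf{a}^1, \mathbf{a}^2, \dots, \mathbf{a}^m$ be the rows of $\mathbf{A}$, equivalently the columns of $\mathbf{A}^T$,

$$\mathbf{A} = \begin{pmatrix} \mathbf{a}^1 \\ \mathbf{a}^2 \\ \vdots \\ \mathbf{a}^m \end{pmatrix}.$$

Then,

$$\mathbf{A}\mathbf{x} = \begin{pmatrix} a^1_1 x^1 + a^1_2 x^2 + \dots + a^1_n x^n \\ a^2_1 x^1 + a^2_2 x^2 + \dots + a^2_n x^n \\ \vdots \\ a^m_1 x^1 + a^m_2 x^2 + \dots + a^m_n x^n \end{pmatrix} = \begin{pmatrix} \mathbf{a}^1 \bullet \mathbf{x} \\ \mathbf{a}^2 \bullet \mathbf{x} \\ \vdots \\ \mathbf{a}^m \bullet \mathbf{x} \end{pmatrix}.$$

Consequently, $\mathbf{x} \in \ker \mathbf{A}$ if and only if $\mathbf{a}^i \bullet \mathbf{x} = 0$ $\forall i = 1, \dots, m$, i.e.,

$$\ker \mathbf{A} = \mathrm{Span}\{\mathbf{a}^1, \mathbf{a}^2, \dots, \mathbf{a}^m\}^\perp = (\mathrm{Im}\, \mathbf{A}^T)^\perp. \qquad (3.9)$$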

3.2 Metrics on Real Vector Spaces

In this section, we discuss bilinear forms on real vector spaces. One can develop similar considerations in the complex setting, but we refrain from doing it.

a. Bilinear forms and linear operators

3.42 Definition. Let $X$ be a real linear space. A bilinear form on $X$ is a map $b : X \times X \to \mathbb{R}$ that is linear in each factor, i.e.,

$$b(\alpha x + \beta y, z) = \alpha\, b(x, z) + \beta\, b(y, z), \qquad b(x, \alpha y + \beta z) = \alpha\, b(x, y) + \beta\, b(x, z)$$

for all $x, y, z \in X$ and all $\alpha, \beta \in \mathbb{R}$. We denote the space of all bilinear forms on $X$ by $\mathcal{B}(X)$.

Observe that, if $b \in \mathcal{B}(X)$, then $b(0, x) = b(y, 0) = 0$ $\forall x, y \in X$. The class of bilinear forms becomes a vector space if we set

$$(b_1 + b_2)(x, y) := b_1(x, y) + b_2(x, y), \qquad (\lambda b)(x, y) := b(\lambda x, y) = b(x, \lambda y).$$

Suppose that $X$ is a linear space with an inner product denoted by $(\ |\ )$. If $b \in \mathcal{B}(X)$, then for every $y \in X$ the map $x \mapsto b(x, y)$ is linear, consequently, by Riesz's theorem there is a unique $B := B(y) \in X$ such that

$$b(x, y) = (x|B(y)) \qquad \forall x \in X. \qquad (3.10)$$

It is easily seen that the map $y \mapsto B(y)$ from $X$ into $X$ defined by (3.10) is linear. Thus (3.10) defines a one-to-one correspondence between $\mathcal{B}(X)$ and the space of linear operators $\mathcal{L}(X, X)$, and it is easy to see this correspondence is a linear isomorphism between $\mathcal{B}(X)$ and $\mathcal{L}(X, X)$.


Figure 3.2. Frontispiece and a page of the celebrated dissertation of G. F. Bernhard Riemann (1826-1866).

3.43 Bilinear forms in coordinates. Let $X$ be a finite-dimensional vector space and let $(e_1, e_2, \dots, e_n)$ be a basis of $X$. Let us denote by $\mathbf{B}$ the $n \times n$ matrix, sometimes called the Gram matrix of $b$,

$$\mathbf{B} = [b_{ij}], \qquad b_{ij} = b(e_i, e_j).$$

Recall that the first index from the left is the row index. Then by linearity, if for every $x, y \in X$, $\mathbf{x} = (x^1, x^2, \dots, x^n)^T$ and $\mathbf{y} = (y^1, y^2, \dots, y^n)^T \in \mathbb{R}^n$ are respectively the column vectors of the coordinates of $x$ and $y$, we have

$$b(x, y) = \sum_{i,j=1}^n b_{ij}\, x^i y^j = \mathbf{x} \bullet (\mathbf{B}\mathbf{y}) = \mathbf{x}^T \mathbf{B} \mathbf{y}.$$

In particular, a coordinate system induces a one-to-one correspondence between bilinear forms in $X$ and bilinear forms in $\mathbb{R}^n$. Notice that the entries of the matrix $\mathbf{B}$ have two lower indices that sum with the indices of the coordinates of the vectors $x, y$ that have upper indices. This also reminds us that $\mathbf{B}$ is not the matrix associated to a linear operator related to $b$. In fact, if instead $N$ is the linear operator associated to $b$,

$$b(x, y) = (x|N(y)) \qquad \forall x, y \in X,$$

then

$$\mathbf{y}^T \mathbf{B}^T \mathbf{x} = b(x, y) = (x|N(y)) = \mathbf{y}^T \mathbf{N}^T \mathbf{G} \mathbf{x},$$

where we have denoted by $\mathbf{G}$ the Gram matrix associated to the inner product on $X$, $\mathbf{G} = [g_{ij}]$, $g_{ij} = (e_i|e_j)$, and by $\mathbf{N}$ the $n \times n$ matrix associated to $N : X \to X$ in the basis $(e_1, e_2, \dots, e_n)$. Thus

$$\mathbf{N}^T \mathbf{G} = \mathbf{B}^T$$

or, recalling that $\mathbf{G}$ is symmetric and invertible,

$$\mathbf{N} = \mathbf{G}^{-1}\mathbf{B}.$$


b. Symmetric bilinear forms or metrics

3.44 Definition. Let $X$ be a real vector space. A bilinear form $b \in \mathcal{B}(X)$ is said to be
(i) symmetric or a metric, if $b(x, y) = b(y, x)$ $\forall x, y \in X$,
(ii) antisymmetric if $b(x, y) = -b(y, x)$ $\forall x, y \in X$.
The space of symmetric bilinear forms is denoted by $\mathrm{Sym}(X)$.

3.45 ¶. Let $b \in \mathcal{B}(X)$. Show that $b_S(x, y) := \frac{1}{2}\left(b(x, y) + b(y, x)\right)$, $x, y \in X$, is a symmetric bilinear form and $b_A(x, y) := \frac{1}{2}\left(b(x, y) - b(y, x)\right)$, $x, y \in X$, is an antisymmetric bilinear form. In particular, one has the natural decomposition

$$b(x, y) = b_S(x, y) + b_A(x, y)$$

of $b$ into its symmetric and antisymmetric parts. Show that $b$ is symmetric if and only if $b = b_S$, and that $b$ is antisymmetric if and only if $b = b_A$.

3.46 ¶. Let $b \in \mathcal{B}(X)$ be a bilinear form, and let $\mathbf{B}$ be the associated Gram matrix. Show that $b$ is symmetric if and only if $\mathbf{B}^T = \mathbf{B}$.

3.47 ¶. Let $b \in \mathcal{B}(X)$ and let $N$ be the associated linear operator, see (3.10). Show that $N$ is self-adjoint, $N^* = N$, if and only if $b \in \mathrm{Sym}(X)$. Show that $N^* = -N$ if and only if $b$ is antisymmetric.

c. Sylvester's theorem

3.48 Definition. Let $X$ be a real vector space. We say that a metric on $X$, i.e., a bilinear symmetric form $g : X \times X \to \mathbb{R}$, is
(i) nondegenerate if $\forall x \in X$, $x \ne 0$, there is $y \in X$ such that $g(x, y) \ne 0$ and $\forall y \in X$, $y \ne 0$, there is $x \in X$ such that $g(x, y) \ne 0$,
(ii) positively definite if $g(x, x) > 0$ $\forall x \in X$, $x \ne 0$,
(iii) negatively definite if $g(x, x) < 0$ $\forall x \in X$, $x \ne 0$.

3.49 ¶. Show that the scalar product $(x|y)$ on $X$ is a symmetric and nondegenerate bilinear form. We shall see later, Theorems 3.52 and 3.53, that any symmetric, nondegenerate and positive bilinear form on a finite-dimensional space is actually an inner product.

3.50 Definition. Let $X$ be a vector space of dimension $n$ and let $g \in \mathrm{Sym}(X)$ be a metric on $X$.
(i) We say that a basis $(e_1, e_2, \dots, e_n)$ is $g$-orthogonal if $g(e_i, e_j) = 0$ $\forall i, j = 1, \dots, n$, $i \ne j$.
(ii) The radical of $g$ is defined as the linear space
$$\mathrm{rad}(g) := \{x \in X \mid g(x, y) = 0 \ \forall y \in X\}.$$
(iii) The rank of the metric $g$ is $r(g) := n - \dim \mathrm{rad}\, g$.


Figure 3.3. Jorgen Gram (1850-1916) and James Joseph Sylvester (1814-1897).

(iv) The signature of the metric g is the triplet of numbers {i^{g)J.{g),io{g)) where i^{g) := maximum of the dimensions of the subspaces V C X on which g is positive definite, g{v^v) > O^v EV, V ^^ 0, i-{g) := maximum of the dimensions of the subspaces V C X on which g is negative definite, g{v,v) 0, g{ei,ei) < 0, g{ei^ei) = 0. Then n+ = 'i+id), u- = i-{g) and UQ = ioig)- In particular, n^., n_, no do not depend on the chosen g-orthogonal basis, i^{g) + i-{g) = r{g)

and

i+{g) + i-{g) + io{g) = n.

Proof. Suppose that $g(e_i, e_i) > 0$ for $i = 1, \dots, n_+$. For each $v = \sum_{i=1}^{n_+} v^i e_i$, $v \neq 0$, we have
$$g(v,v) = \sum_{i=1}^{n_+} |v^i|^2 g(e_i, e_i) > 0,$$
hence $n_+ = \dim \mathrm{Span}\{e_1, e_2, \dots, e_{n_+}\} \le i_+(g)$. On the other hand, if $W \subset X$ is a subspace of dimension $i_+(g)$ such that $g(v,v) > 0$ $\forall v \in W$, $v \neq 0$, we have $W \cap \mathrm{Span}\{e_{n_+ + 1}, \dots, e_n\} = \{0\}$ since $g(v,v) \le 0$ for all $v \in \mathrm{Span}\{e_{n_+ + 1}, \dots, e_n\}$. Therefore we also have $i_+(g) \le n - (n - n_+) = n_+$. Similarly, one proves that $n_- = i_-(g)$. Finally, since $\mathbf{G} := [g(e_i, e_j)]$ is the matrix associated to $g$ in the basis $(e_1, e_2, \dots, e_n)$, we have $i_0(g) = \dim \mathrm{rad}(g) = \dim \ker \mathbf{G}$, and, since $\mathbf{G}$ is diagonal, $\dim \ker \mathbf{G} = n_0$. □

d. Existence of $g$-orthogonal bases

The Gram-Schmidt algorithm yields the existence of an orthonormal basis in a Euclidean space $X$. We now see that a slight modification of the Gram-Schmidt algorithm allows us to construct in a finite-dimensional space a $g$-orthogonal basis for a given metric $g$.

3.53 Theorem (Gram-Schmidt). Let $g$ be a metric on a finite-dimensional real vector space $X$. Then $g$ has a $g$-orthogonal basis.

Proof. Let $r$ be the rank of $g$, $r := n - \dim \mathrm{rad}(g)$, and let $(w_1, w_2, \dots, w_{n-r})$ be a basis of $\mathrm{rad}(g)$. If $V$ denotes a supplementary subspace of $\mathrm{rad}(g)$, then $V$ is $g$-orthogonal to $\mathrm{rad}(g)$ and $\dim V = r$. Moreover, for every $v \in V$, $v \neq 0$, there is $z \in X$ such that $g(v,z) \neq 0$. Decomposing $z$ as $z = w + t$, $w \in V$, $t \in \mathrm{rad}(g)$, we then have $g(v,w) = g(v,w) + g(v,t) = g(v,z) \neq 0$, i.e., $g$ is nondegenerate on $V$. Since trivially $(w_1, w_2, \dots, w_{n-r})$ is $g$-orthogonal and $V$ is $g$-orthogonal to $(w_1, w_2, \dots, w_{n-r})$, in order to conclude it suffices to complete the basis $(w_1, w_2, \dots, w_{n-r})$ with a $g$-orthogonal basis of $V$; in other words, it suffices to prove the claim under the further assumption that $g$ be nondegenerate.

We proceed by induction on the dimension of $X$. Let $(f_1, f_2, \dots, f_n)$ be a basis of $X$. We claim that there exists $e_1 \in X$ with $g(e_1, e_1) \neq 0$. In fact, if for some $f_i$ we have $g(f_i, f_i) \neq 0$, we simply choose $e_1 := f_i$; otherwise, if $g(f_i, f_i) = 0$ for all $i$, for some $k \neq 1$ we must have $g(f_1, f_k) \neq 0$, since by assumption $\mathrm{rad}(g) = \{0\}$. In this case, we choose $e_1 := f_1 + f_k$ as
$$g(f_1 + f_k, f_1 + f_k) = g(f_1, f_1) + 2 g(f_1, f_k) + g(f_k, f_k) = 0 + 2 g(f_1, f_k) + 0 \neq 0.$$
Now it is easily seen that the subspace
$$V_1 := \{v \in X \mid g(e_1, v) = 0\}$$
supplements $\mathrm{Span}\{e_1\}$, and we find a basis $(v_2, \dots, v_n)$ of $V_1$ such that $g(v_j, e_1) = 0$ for all $j = 2, \dots, n$ by setting
$$v_j := f_j - \frac{g(f_j, e_1)}{g(e_1, e_1)}\, e_1.$$
Since $g$ is nondegenerate on $V_1$, by the induction assumption we find a $g$-orthogonal basis $(e_2, \dots, e_n)$ of $V_1$, and the vectors $(e_1, e_2, \dots, e_n)$ form a $g$-orthogonal basis of $X$. □
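The construction in the proof can be tried out numerically. The following is only a minimal sketch, assuming the metric is given by its Gram matrix in the standard basis of $\mathbb{R}^n$ with rational entries; the function names `form`, `g_orthogonal_basis` and `signature` are illustrative choices and not notation from the text. It picks a vector with $g(e,e) \neq 0$ (or a sum $f_i + f_k$ when all diagonal values vanish), subtracts the $g$-projection from the remaining vectors, and repeats; whatever is left when no such vector exists spans the radical.

```python
from fractions import Fraction


def form(G, u, v):
    """Evaluate g(u, v) = u^T G v for a Gram matrix G."""
    n = len(G)
    return sum(u[i] * G[i][j] * v[j] for i in range(n) for j in range(n))


def g_orthogonal_basis(G):
    """Return a g-orthogonal basis of R^n for the symmetric Gram matrix G."""
    n = len(G)
    G = [[Fraction(x) for x in row] for row in G]
    # Start from the standard basis f_1, ..., f_n.
    pending = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    basis = []
    while pending:
        pivot = None
        # Case 1: some remaining vector f has g(f, f) != 0.
        for idx, f in enumerate(pending):
            if form(G, f, f) != 0:
                pivot = pending.pop(idx)
                break
        if pivot is None:
            # Case 2: all g(f_i, f_i) vanish; look for g(f_i, f_k) != 0
            # and use f_i + f_k, as in the proof.
            pair = next(((i, k) for i in range(len(pending))
                         for k in range(i + 1, len(pending))
                         if form(G, pending[i], pending[k]) != 0), None)
            if pair is None:
                # g vanishes identically on the span of what is left,
                # so these vectors are already mutually g-orthogonal.
                basis.extend(pending)
                break
            i, k = pair
            pivot = [a + b for a, b in zip(pending[i], pending[k])]
            pending.pop(i)  # f_k stays in the list, so the span is unchanged
        c = form(G, pivot, pivot)
        # Replace each remaining f by f - g(f, e)/g(e, e) * e.
        pending = [[fj - form(G, f, pivot) / c * pj for fj, pj in zip(f, pivot)]
                   for f in pending]
        basis.append(pivot)
    return basis


def signature(G):
    """Signature (i_+, i_-, i_0) read off the diagonal values g(e_i, e_i)."""
    Gf = [[Fraction(x) for x in row] for row in G]
    diag = [form(Gf, e, e) for e in g_orthogonal_basis(G)]
    return (sum(d > 0 for d in diag),
            sum(d < 0 for d in diag),
            sum(d == 0 for d in diag))
```

For instance, `signature([[0, 1], [1, 0]])` falls into the second case of the proof (both diagonal entries vanish) and gives `(1, 1, 0)`. Exact rational arithmetic is used so that the tests for zero are meaningful; with floating point numbers a tolerance would be needed.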


A variant of the Gram-Schmidt procedure is the following one due to Carl Jacobi (1804-1851). Let $g : X \times X \to \mathbb{R}$ be a metric on $X$. Let $(f_1, f_2, \dots, f_n)$ be a basis of $X$, let $\mathbf{G}$ be the matrix associated to $g$ in this basis, $\mathbf{G} = [g_{ij}]$, $g_{ij} = g(f_i, f_j)$. Set $\Delta_0 = 1$ and for $k = 1, \dots, n$
$$\Delta_k := \det \mathbf{G}_k$$
where $\mathbf{G}_k$ is the $k \times k$ submatrix of the first $k$ rows and $k$ columns.

3.54 Proposition (Jacobi). If $\Delta_k \neq 0$ for all $k = 1, \dots, n$, there exists a $g$-orthogonal basis $(e_1, e_2, \dots, e_n)$ of $X$; moreover
$$g(e_k, e_k) = \frac{\Delta_{k-1}}{\Delta_k}.$$

Proof. We look for a basis $(e_1, e_2, \dots, e_n)$ so that
$$e_1 = a_1^1 f_1, \qquad e_2 = a_2^1 f_1 + a_2^2 f_2, \qquad \dots, \qquad e_n = a_n^1 f_1 + \dots + a_n^n f_n,$$
or, equivalently,
$$e_k := \sum_{i=1}^{k} a_k^i f_i, \qquad k = 1, \dots, n, \tag{3.11}$$
as in the Gram-Schmidt procedure, such that $g(e_i, e_j) = 0$ for $i \neq j$. At first sight the system $g(e_i, e_j) = 0$, $i \neq j$, is a system in the unknowns $a_k^i$. However, if we impose that for all $k$'s
$$g(e_k, f_i) = 0 \qquad \forall i = 1, \dots, k-1, \tag{3.12}$$
by linearity $g(e_k, e_i) = 0$ for $i < k$, and by symmetry $g(e_k, e_i) = 0$ for $i > k$. It suffices then to fulfill (3.12), i.e., solve the system of $k - 1$ equations in $k$ unknowns $a_k^1, a_k^2, \dots, a_k^k$
$$\sum_{j=1}^{k} g(f_j, f_i)\, a_k^j = 0, \qquad \forall i = 1, \dots, k-1. \tag{3.13}$$
If we add the normalization condition
$$\sum_{j=1}^{k} g(f_j, f_k)\, a_k^j = 1, \tag{3.14}$$
we get a system of $k$ equations in $k$ unknowns of the type $\mathbf{G}_k \mathbf{x} = \mathbf{b}$, where $\mathbf{G}_k = [g_{ij}]$, $g_{ij} := g(f_i, f_j)$, $\mathbf{x} = (a_k^1, \dots, a_k^k)^T$ and $\mathbf{b} = (0, 0, \dots, 1)^T$. Since $\det \mathbf{G}_k = \Delta_k$ and $\Delta_k \neq 0$ by assumption, the system is solvable. Since $k$ is arbitrary, we are able to find a $g$-orthogonal basis of type (3.11). It remains to compute $g(e_k, e_k)$. From (3.13) and (3.14) we get
$$g(e_k, e_k) = \sum_{i,j=1}^{k} a_k^i a_k^j\, g(f_i, f_j) = \sum_{j=1}^{k} a_k^j \Big( \sum_{i=1}^{k} g(f_i, f_j)\, a_k^i \Big) = \sum_{j=1}^{k} a_k^j\, \delta_{jk} = a_k^k,$$
and we compute $a_k^k$ by Cramer's formula,
$$a_k^k = \frac{\Delta_{k-1}}{\Delta_k},$$
hence $g(e_k, e_k) = \Delta_{k-1}/\Delta_k$. □
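Jacobi's method is short enough to run by hand on small matrices, and the sketch below does the same thing mechanically. It assumes a Gram matrix with rational entries whose leading minors are all nonzero; the helper names `solve`, `det`, `form` and `jacobi_basis` are hypothetical and only serve this illustration. For each $k$ it solves $\mathbf{G}_k \mathbf{x} = (0, \dots, 0, 1)^T$, which is the system (3.13)-(3.14), and returns the coordinates of $e_k$ in the basis $(f_1, \dots, f_n)$.

```python
from fractions import Fraction


def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination (M nonsingular)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)]
           for row, rhs in zip(M, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def det(M):
    """Determinant by Laplace expansion along the first row (small matrices).

    The empty matrix has determinant 1, which matches Delta_0 = 1.
    """
    if not M:
        return Fraction(1)
    return sum((-1) ** j * Fraction(M[0][j])
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))


def form(G, u, v):
    """Evaluate g(u, v) = u^T G v."""
    n = len(G)
    return sum(u[i] * G[i][j] * v[j] for i in range(n) for j in range(n))


def jacobi_basis(G):
    """Coordinates of e_1, ..., e_n in the basis f, from (3.13)-(3.14)."""
    n = len(G)
    basis = []
    for k in range(1, n + 1):
        Gk = [row[:k] for row in G[:k]]      # leading k x k block
        rhs = [0] * (k - 1) + [1]            # b = (0, ..., 0, 1)^T
        a = solve(Gk, rhs)                   # a = (a_k^1, ..., a_k^k)
        basis.append(a + [Fraction(0)] * (n - k))
    return basis
```

With `G = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]` the leading minors are $\Delta_1 = 2$, $\Delta_2 = 3$, $\Delta_3 = 4$, and `form(G, e, e)` for the three returned vectors evaluates to $1/2$, $2/3$, $3/4$, in agreement with $g(e_k,e_k) = \Delta_{k-1}/\Delta_k$.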


3.55 Remark. Notice that Jacobi's method is a rewriting of the Gram-Schmidt procedure in the case where $g(f_i, f_i) \neq 0$ for all $i$'s. In terms of Gram's matrix $\mathbf{G} := [g(f_i, f_j)]$, we have also proved that
$$\mathbf{T}^T \mathbf{G} \mathbf{T} = \mathrm{diag}\Big\{\frac{\Delta_{k-1}}{\Delta_k}\Big\}$$
for a suitable triangular matrix $\mathbf{T}$.

3.56 Corollary (Sylvester). Suppose that $\Delta_1, \dots, \Delta_n \neq 0$. Then the metric $g$ is nondegenerate. Moreover, $i_-(g)$ equals the number of changes of sign in the sequence $(1, \Delta_1, \Delta_2, \dots, \Delta_n)$. In particular, if $\Delta_k > 0$ for all $k$'s, then $g$ is positive definite.

Let $(e_1, e_2, \dots, e_n)$ be a $g$-orthogonal basis of $X$. By reordering the basis in such a way that
$$g(e_j, e_j) \begin{cases} > 0 & \text{if } j = 1, \dots, i_+(g), \\ < 0 & \text{if } j = i_+(g) + 1, \dots, i_+(g) + i_-(g), \\ = 0 & \text{otherwise;} \end{cases}$$
and setting
$$f_j := \begin{cases} \dfrac{e_j}{\sqrt{|g(e_j, e_j)|}} & \text{if } j = 1, \dots, i_+(g) + i_-(g), \\ e_j & \text{otherwise,} \end{cases}$$
we get
$$g(f_j, f_j) = \begin{cases} 1 & \text{if } j = 1, \dots, i_+(g), \\ -1 & \text{if } j = i_+(g) + 1, \dots, i_+(g) + i_-(g), \\ 0 & \text{otherwise.} \end{cases}$$
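As a small worked instance of Corollary 3.56 and of the normalization above, consider the metric on $\mathbb{R}^2$ with the Gram matrix below; the matrix is chosen here only for illustration and does not come from the text.

```latex
\[
\mathbf{G} = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, \qquad
\Delta_0 = 1, \quad \Delta_1 = 1, \quad \Delta_2 = \det \mathbf{G} = -3 .
\]
% The sequence (1, \Delta_1, \Delta_2) = (1, 1, -3) has exactly one change of sign,
% so i_-(g) = 1; since \Delta_2 \neq 0 the metric is nondegenerate, hence i_+(g) = 1.
\[
e_1 = f_1, \qquad e_2 = \tfrac{2}{3} f_1 - \tfrac{1}{3} f_2, \qquad
g(e_1, e_1) = \frac{\Delta_0}{\Delta_1} = 1, \qquad
g(e_2, e_2) = \frac{\Delta_1}{\Delta_2} = -\frac{1}{3}, \qquad g(e_1, e_2) = 0 .
\]
% Normalizing: e_1 and \sqrt{3}\, e_2 give the values 1 and -1, so the signature is (1, 1, 0).
```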

e. Congruent matrices

It is worth seeing now how the matrix associated to a bilinear form changes when we change bases. Let $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_n)$ be two bases of $X$ and let $\mathbf{R}$ be the matrix associated to the map $R : X \to X$, $R(e_i) := f_i$, in the basis $(e_1, e_2, \dots, e_n)$, that is
$$\mathbf{R} := [\mathbf{r}_1 \mid \mathbf{r}_2 \mid \dots \mid \mathbf{r}_n]$$
where $\mathbf{r}_i$ is the column vector of the coordinates of $f_i$ in the basis $(e_1, e_2, \dots, e_n)$. As we know, if $\mathbf{x}$ and $\mathbf{x}'$ are the column vectors of the coordinates of $x$ respectively, in the basis $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_n)$, then $\mathbf{x} = \mathbf{R}\mathbf{x}'$. Denote by $\mathbf{B}$ and $\mathbf{B}'$ the matrices associated to $b$ respectively, in the coordinates $(e_1, e_2, \dots, e_n)$ and $(f_1, f_2, \dots, f_n)$. Then we have
$$b(x,y) = \mathbf{x}'^T \mathbf{B}' \mathbf{y}', \qquad b(x,y) = \mathbf{x}^T \mathbf{B} \mathbf{y} = \mathbf{x}'^T \mathbf{R}^T \mathbf{B} \mathbf{R}\, \mathbf{y}',$$
hence
$$\mathbf{B}' = \mathbf{R}^T \mathbf{B} \mathbf{R}. \tag{3.15}$$
The previous argument can be of course reversed. If (3.15) holds, then $\mathbf{B}$ and $\mathbf{B}'$ are the Gram matrices of the same metric $b$ on $\mathbb{R}^n$ in different coordinates,
$$b(x,y) = \mathbf{x}'^T \mathbf{B}' \mathbf{y}' = (\mathbf{R}\mathbf{x}')^T \mathbf{B} (\mathbf{R}\mathbf{y}').$$

3.57 Definition. Two matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ are said to be congruent if there exists a nonsingular matrix $\mathbf{R} \in M_{n,n}(\mathbb{R})$ such that $\mathbf{B} = \mathbf{R}^T \mathbf{A} \mathbf{R}$.
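The change-of-basis rule (3.15) is easy to check on a concrete case. The sketch below uses plain nested lists for matrices; the helpers `matmul`, `transpose`, `change_basis` and `bilinear` are illustrative names, and the particular matrices are arbitrary choices made for the example.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def change_basis(B, R):
    """Gram matrix B' = R^T B R in the basis whose coordinates are the columns of R."""
    return matmul(transpose(R), matmul(B, R))


def bilinear(B, x, y):
    """Evaluate x^T B y."""
    return sum(x[i] * B[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def apply(R, x):
    """Matrix-vector product R x."""
    return [sum(r * xi for r, xi in zip(row, x)) for row in R]


# A bilinear form (not necessarily symmetric) and a nonsingular change of basis.
B = [[1, 2], [0, 3]]
R = [[1, 1], [0, 1]]
B_prime = change_basis(B, R)          # [[1, 3], [1, 6]]

# The same form evaluated in old and new coordinates, with x = R x'.
x_new, y_new = [1, 2], [3, -1]
x_old, y_old = apply(R, x_new), apply(R, y_new)
same_value = bilinear(B, x_old, y_old) == bilinear(B_prime, x_new, y_new)
```

Here `same_value` is `True`: both evaluations give $-6$, as (3.15) predicts.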

It turns out that the congruence relation is an equivalence relation on matrices, thus the $n \times n$ matrices are partitioned into classes of congruent matrices. Since the matrices associated to a bilinear form in different bases are congruent, to any bilinear form corresponds a unique class of congruent matrices. The above then reads as saying that two matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ are congruent if and only if they represent the same bilinear form in different coordinates. Thus, the existence of a $g$-orthogonal basis is equivalent to the following.

3.58 Theorem. A symmetric matrix $\mathbf{A} \in M_{n,n}(\mathbb{R})$ is congruent to a diagonal matrix.

Moreover, Sylvester's theorem reads equivalently as the following.

3.59 Theorem. Two diagonal matrices $\mathbf{I}, \mathbf{J} \in M_{n,n}(\mathbb{R})$ are congruent if and only if they have the same number of positive, negative and zero entries in the diagonal. If, moreover, a symmetric matrix $\mathbf{A} \in M_{n,n}(\mathbb{R})$ is congruent to
$$\begin{pmatrix} \mathrm{Id}_a & 0 & 0 \\ 0 & -\mathrm{Id}_b & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
then $(a, b, n - a - b)$ is the signature of the metric $\mathbf{y}^T \mathbf{A} \mathbf{x}$.

Thus the existence of a $g$-orthogonal basis in conjunction with Sylvester's theorem reads as the following.

3.60 Theorem. Two symmetric matrices $\mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{R})$ are congruent if and only if the metrics $\mathbf{y}^T \mathbf{A} \mathbf{x}$ and $\mathbf{y}^T \mathbf{B} \mathbf{x}$ on $\mathbb{R}^n$ have the same signature $(a, b, r)$. In this case, $\mathbf{A}$ and $\mathbf{B}$ are congruent to
$$\begin{pmatrix} \mathrm{Id}_a & 0 & 0 \\ 0 & -\mathrm{Id}_b & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

f. Classification of real metrics

Since reordering the basis elements is a linear change of coordinates, we can now reformulate Sylvester's theorem in conjunction with the existence of a $g$-orthonormal basis as follows. Let $X$, $Y$ be two real vector spaces, and let $g$, $h$ be two metrics respectively, on $X$ and $Y$. We say that $(X, g)$ and $(Y, h)$ are isometric if and only if there is an isomorphism $L : X \to Y$ such that $h(L(x), L(y)) = g(x,y)$ $\forall x, y \in X$. Observing that two metrics are isometric if and only if, in coordinates, their Gram matrices are congruent, from Theorem 3.60 we infer the following.

3.61 Theorem. $(X, g)$ and $(Y, h)$ are isometric if and only if $g$ and $h$ have the same signature,
$$(i_+(g), i_-(g), i_0(g)) = (i_+(h), i_-(h), i_0(h)).$$
Moreover, if $X$ has dimension $n$ and the metric $g$ on $X$ has signature $(a, b, r)$, $a + b + r = n$, then $(X, g)$ is isometric to $(\mathbb{R}^n, h)$ where $h(\mathbf{x}, \mathbf{y}) := \mathbf{x}^T \mathbf{H} \mathbf{y}$ and
$$\mathbf{H} := \begin{pmatrix} \mathrm{Id}_a & 0 & 0 \\ 0 & -\mathrm{Id}_b & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

According to the above, the metrics over a real finite-dimensional vector space $X$ are classified, modulo isometries, by their signature. Some of them have names:
(i) The Euclidean metric: $i_+(g) = n$, $i_-(g) = i_0(g) = 0$; in this case $g$ is a scalar product.
(ii) The pseudoeuclidean metrics: $i_-(g) = 0$.
(iii) The Lorentz metric or Minkowski metric: $i_+(g) = n - 1$, $i_-(g) = 1$, $i_0(g) = 0$.
(iv) The Artin metric: $i_+(g) = i_-(g) = p$, $i_0(g) = 0$.

3.62 ¶. Show that a bilinear form $g$ on a finite-dimensional space $X$ is an inner product on $X$ if and only if $g$ is symmetric and positive definite.
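To see the classification at work on a concrete case, the following worked example (with a metric chosen here purely for illustration) exhibits an explicit isometry between two metrics on $\mathbb{R}^2$ with the same signature.

```latex
% The metric h(x, y) = x^1 y^2 + x^2 y^1 on R^2 has Gram matrix H below.
\[
\mathbf{H} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\mathbf{R} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
\mathbf{R}^T \mathbf{H} \mathbf{R}
= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
  \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
% Hence h has signature (1, 1, 0): it is the Lorentz metric for n = 2 (and the Artin
% metric for p = 1), and x -> R x is an isometry from (R^2, diag(1, -1)) onto (R^2, h).
```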


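g. Quadratic forms

Let $X$ be a finite-dimensional vector space over $\mathbb{R}$ and let $b \in \mathcal{B}(X)$ be a bilinear form on $X$. The quadratic form $\phi : X \to \mathbb{R}$ associated to $b$ is defined by
$$\phi(x) = b(x,x), \qquad x \in X.$$
Observe that $\phi$ is fixed only by the symmetric part of $b$,
$$b_S(x,y) := \frac{1}{2}\big(b(x,y) + b(y,x)\big),$$
since $b(x,x) = b_S(x,x)$ $\forall x \in X$. Moreover one can recover $b_S$ from $\phi$ since $b_S$ is symmetric,
$$b_S(x,y) = \frac{1}{2}\big(\phi(x+y) - \phi(x) - \phi(y)\big).$$

Another important relation between a bilinear form $b \in \mathcal{B}(X)$ and its quadratic form $\phi$ is the following. Let $x$ and $v \in X$. Since
$$\phi(x + tv) = \phi(x) + t\, b(x,v) + t\, b(v,x) + t^2 b(v,v),$$
we have
$$\frac{d}{dt}\, \phi(x + tv)\Big|_{t=0} = 2\, b_S(x,v). \tag{3.16}$$
We refer to (3.16) saying that the symmetric part $b_S$ of $b$ is the first variation of the associated quadratic form.

3.63 Homogeneous polynomials of degree two. Let $\mathbf{B} = [b_{ij}] \in M_{n,n}(\mathbb{R})$ and let
$$b(\mathbf{x}, \mathbf{y}) := \mathbf{x}^T \mathbf{B} \mathbf{y} = \sum_{i,j=1}^{n} b_{ij}\, x^i y^j$$
be the bilinear form defined by $\mathbf{B}$ on $\mathbb{R}^n$, $\mathbf{x} = (x^1, x^2, \dots, x^n)$, $\mathbf{y} = (y^1, y^2, \dots, y^n)$. Clearly,
$$\phi(\mathbf{x}) = b(\mathbf{x}, \mathbf{x}) = \mathbf{x}^T \mathbf{B} \mathbf{x} = \sum_{i,j=1}^{n} b_{ij}\, x^i x^j$$
is a homogeneous polynomial of degree two. Conversely, any homogeneous polynomial of degree two
$$P(\mathbf{x}) = \sum_{\substack{i,j = 1, \dots, n \\ i \le j}} b_{ij}\, x^i x^j = \mathbf{x}^T \mathbf{B} \mathbf{x}$$
defines a unique symmetric bilinear form in $\mathbb{R}^n$ by
$$b(\mathbf{x}, \mathbf{y}) := \frac{1}{2}\big(P(\mathbf{x} + \mathbf{y}) - P(\mathbf{x}) - P(\mathbf{y})\big)$$
with associated quadratic form $P$.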


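3.64 Example. Let $(x, y)$ be the standard coordinates in $\mathbb{R}^2$. The quadratic polynomial
$$a x^2 + b x y + c y^2 = (x, y) \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
is the quadratic form of the metric
$$g\big((x,y), (z,w)\big) := (z, w) \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$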

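3.65 Derivatives of a quadratic form. From (3.16) we can compute the partial derivatives of the quadratic form $\phi(\mathbf{x}) := \mathbf{x}^T \mathbf{G} \mathbf{x}$. In fact, choosing $v = e_h$, we have
$$\frac{\partial \phi}{\partial x^h}(\mathbf{x}) := \frac{d}{dt}\, \phi(\mathbf{x} + t e_h)\Big|_{t=0} = 2\, b_S(\mathbf{x}, e_h) = \mathbf{x}^T (\mathbf{G} + \mathbf{G}^T) e_h,$$
hence, arranging the partial derivatives in a $1 \times n$ matrix, called the Jacobian matrix of $\phi$,
$$\mathbf{D}\phi(\mathbf{x}) := \Big[ \frac{\partial \phi}{\partial x^1}(\mathbf{x}) \,\Big|\, \frac{\partial \phi}{\partial x^2}(\mathbf{x}) \,\Big|\, \dots \,\Big|\, \frac{\partial \phi}{\partial x^n}(\mathbf{x}) \Big],$$
we have
$$\mathbf{D}\phi(\mathbf{x}) = \mathbf{x}^T (\mathbf{G} + \mathbf{G}^T),$$
or, taking the transpose,
$$\nabla \phi(\mathbf{x}) := (\mathbf{D}\phi(\mathbf{x}))^T = (\mathbf{G} + \mathbf{G}^T)\, \mathbf{x}.$$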
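A two-variable instance makes the gradient formula concrete; the matrix below is an arbitrary non-symmetric choice made for this illustration.

```latex
\[
\mathbf{G} = \begin{pmatrix} 1 & 4 \\ 0 & 3 \end{pmatrix}, \qquad
\phi(\mathbf{x}) = \mathbf{x}^T \mathbf{G} \mathbf{x} = (x^1)^2 + 4 x^1 x^2 + 3 (x^2)^2 ,
\]
\[
\mathbf{G} + \mathbf{G}^T = \begin{pmatrix} 2 & 4 \\ 4 & 6 \end{pmatrix}, \qquad
\nabla \phi(\mathbf{x}) = (\mathbf{G} + \mathbf{G}^T)\, \mathbf{x}
= \begin{pmatrix} 2 x^1 + 4 x^2 \\ 4 x^1 + 6 x^2 \end{pmatrix},
\]
% which agrees with differentiating the polynomial term by term:
% \partial\phi/\partial x^1 = 2 x^1 + 4 x^2 and \partial\phi/\partial x^2 = 4 x^1 + 6 x^2.
```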
h. Reducing to a sum of squares

Let $g$ be a metric on a real vector space $X$ of dimension $n$ and let $\phi$ be the associated quadratic form. Then, choosing a basis $(e_1, e_2, \dots, e_n)$ and writing $x = \sum_{i=1}^n x^i e_i$, we have
$$\phi(x) = g(x,x) = \sum_{i=1}^{n} (x^i)^2\, g(e_i, e_i)$$
if and only if $(e_1, e_2, \dots, e_n)$ is $g$-orthogonal, and the number of positive, negative and zero coefficients is the signature of $g$. Thus, Sylvester's theorem in conjunction with the fact that we can always find a $g$-orthogonal basis can be rephrased as follows.

3.66 Theorem (Sylvester's law of inertia). Let $\phi(x) = g(x,x)$ be the quadratic form associated to a metric $g$ on an $n$-dimensional real vector space $X$.
(i) There exists a basis $(f_1, f_2, \dots, f_n)$ of $X$ such that
$$\phi(x) = \sum_{i=1}^{i_+(g)} (x^i)^2 - \sum_{i=1}^{i_-(g)} (x^{i_+(g) + i})^2, \qquad x = \sum_{i=1}^{n} x^i f_i,$$
where $(i_+(g), i_-(g), i_0(g))$ is the signature of $g$.
(ii) If for some basis $(e_1, e_2, \dots, e_n)$
$$\phi(x) = \sum_{i=1}^{n} \phi(e_i)\, |x^i|^2, \qquad x := \sum_{i=1}^{n} x^i e_i, \tag{3.17}$$
then the numbers $n_+$, $n_-$ and $n_0$ respectively, of positive, negative and zero $\phi(e_i)$'s are the signature $(i_+(g), i_-(g), i_0(g))$ of $g$.

3.67 Example. In order to reduce a quadratic form $\phi$ to the canonical form (3.17), we may use Gram-Schmidt's algorithm. Let us repeat it focusing this time on the change of coordinates instead of on the change of basis. Suppose we want to reduce to a sum of squares, by changing coordinates, the quadratic form
$$\phi(x) = \sum_{i,j=1}^{n} a_{ij}\, x^i x^j$$

where $a_{ij} = a_{ji}$ and at least one of the $a_{ij}$'s is not zero. We first look for a coefficient $a_{kk}$ that is not zero. If we find it, we go further, otherwise if all $a_{kk}$ vanish, at least one of the mixed terms, say $a_{12}$, is nonzero; the change of variables
$$x^1 = y^1 + y^2, \qquad x^2 = y^1 - y^2, \qquad x^j = y^j \quad \text{for } j = 3, \dots, n,$$
transforms $a_{12} x^1 x^2$ into $a_{12}\big((y^1)^2 - (y^2)^2\big)$, and since $a_{11} = a_{22} = 0$, in the new coordinates $(y^1, y^2, \dots, y^n)$ the coefficient of $(y^1)^2$ is not zero. Thus, possibly after a linear change of variables, we write $\phi$ as
$$\phi(x) = \frac{1}{a_{11}} (a_{11} y^1)^2 + 2 \sum_{j=2}^{n} a_{1j}\, y^1 y^j + B(y^2, \dots, y^n).$$
We now complete the square and set
$$\begin{cases} Y^1 = a_{11} y^1 + \sum_{j=2}^{n} a_{1j}\, y^j, \\ Y^i = y^i \quad \text{for } i = 2, \dots, n, \end{cases}$$
so that
$$\phi(x) = \frac{1}{a_{11}} \Big( a_{11} y^1 + \sum_{j=2}^{n} a_{1j}\, y^j \Big)^2 + C = \frac{1}{a_{11}} (Y^1)^2 + C$$
where $C$ contains only products of $Y^2, \dots, Y^n$. The process can then be iterated.

3.68 Example. Show that Jacobi's method in Proposition 3.54 transforms $\phi$ in
$$\phi(x) = \frac{\Delta_0}{\Delta_1} (x^1)^2 + \frac{\Delta_1}{\Delta_2} (x^2)^2 + \dots + \frac{\Delta_{n-1}}{\Delta_n} (x^n)^2$$
if $x = \sum_{i=1}^{n} x^i e_i$, for a suitable $g$-orthogonal basis $(e_1, e_2, \dots, e_n)$.

3.69 Example (Classification of conics). The conics in the plane are the zeros of a second degree polynomial in two variables
$$P(x,y) := a x^2 + 2 b x y + c y^2 + d x + e y + f = 0, \qquad (x,y) \in \mathbb{R}^2, \tag{3.18}$$


where $a, b, c, d, e, f \in \mathbb{R}$. Choose a new system of coordinates $(X, Y)$, $X = \alpha x + \beta y$, $Y = \gamma x + \delta y$, in which the quadratic part of $P$ transforms into a sum of squares, $a x^2 + 2 b x y + c y^2 = p X^2 + q Y^2$; consequently, $P$ transforms into
$$p X^2 + q Y^2 + 2 r X + 2 s Y + f = 0.$$
Now we can classify the conics in terms of the signs of $p$, $q$ and $f$. If $p$, $q$ are zero, the conic reduces to the straight line $2 r X + 2 s Y + f = 0$. If $p \neq 0$ and $q = 0$, then, completing the square, the conic becomes
$$p (X - X_0)^2 + 2 s Y + f = 0, \qquad X_0 := -r/p,$$
where here and below $f$ denotes the constant term obtained after completing the square; for $s \neq 0$ this is a parabola with vertex on the line $X = X_0$ and axis parallel to the axis of $Y$. Similarly, if $p = 0$ and $q \neq 0$, the conic is, for $r \neq 0$, a parabola with vertex on the line $Y = Y_0$, $Y_0 := -s/q$, and axis parallel to the $X$ axis. Finally, if $pq \neq 0$, completing the square, the conic is
$$p (X - X_0)^2 + q (Y - Y_0)^2 + f = 0, \qquad X_0 := -r/p, \quad Y_0 := -s/q,$$
i.e., it is
o a hyperbola if $f \neq 0$ and $pq < 0$,
o two straight lines if $f = 0$ and $pq < 0$,
o an ellipse if $\mathrm{sgn}(f) = -\mathrm{sgn}(p)$ and $pq > 0$,
o a point if $f = 0$ and $pq > 0$,
o the empty set if $\mathrm{sgn}(f) = \mathrm{sgn}(p)$ and $pq > 0$.
Since we have operated with linear changes of coordinates that map straight lines into straight lines, ellipses into ellipses, and hyperbolas into hyperbolas, we conclude the following.

3.70 Proposition. The conics in the plane are classified in terms of the signature of their quadratic part and of the sign of the zero term.

3.71 ¶. The equation of a quadric, i.e., of the zeros of a second order polynomial in $n$ variables, see Figure 3.4 for $n = 3$, has the form [...] $\phi(-x)$, if and only if $\mathbf{b} = 0$. (ii) $x_0$ is a center of symmetry, i.e., $\phi(x_0 - x) =$ [...]
(v) Suppose $\det \mathbf{A} = 0$. Since $\mathbf{A} = \mathbf{A}^T$, we have $\ker \mathbf{A} = (\mathrm{Im}\, \mathbf{A})^\perp$. Choosing a basis in which the first $k$ elements generate $\mathrm{Im}\, \mathbf{A}$ and the last $n - k$ generate $\ker \mathbf{A}$, then $\mathbf{A}$ writes as
$$\mathbf{A} = \begin{pmatrix} \mathbf{A}' & 0 \\ 0 & 0 \end{pmatrix}$$
in this new basis and the quadric can be written as
$$\phi(x) = (\mathbf{x}')^T \mathbf{A}' \mathbf{x}' + 2 (\mathbf{b}' | \mathbf{x}') + 2 (\mathbf{b}'' | \mathbf{x}'') + c_2 = 0$$
where $\mathbf{x}', \mathbf{b}' \in \mathrm{Im}\, \mathbf{A}$, $\mathbf{x}'', \mathbf{b}'' \in \ker \mathbf{A}$, $\mathbf{x} = \mathbf{x}' + \mathbf{x}''$, $\mathbf{b} = \mathbf{b}' + \mathbf{b}''$ and $\det \mathbf{A}' \neq 0$. Applying the argument in (iii) to $(\mathbf{x}')^T \mathbf{A}' \mathbf{x}' + 2\, \mathbf{b}' \cdot \mathbf{x}' + c_2$, we may further transform the quadric into
$$\phi(x) = (\mathbf{x}')^T \mathbf{A}' \mathbf{x}' + c_3 + 2\, \mathbf{b}'' \cdot \mathbf{x}'' = 0,$$
and, writing $y' := -2\, \mathbf{b}'' \cdot \mathbf{x}'' - c_3$, that is, by means of an affine transformation that does not change the variable $\mathbf{x}'$, we end up with
$$\phi(x) = (\mathbf{x}')^T \mathbf{A}' \mathbf{x}' - y' = 0.$$

Figure 3.4. Quadrics: (a) ellipsoid: $a^2x^2 + b^2y^2 + c^2z^2 = 1$; (b) point: $a^2x^2 + b^2y^2 + c^2z^2 = 0$; (c) imaginary ellipsoid: $a^2x^2 + b^2y^2 + c^2z^2 = -1$; (d) hyperboloid of one sheet: $a^2x^2 + b^2y^2 - c^2z^2 = 1$; (e) cone: $a^2x^2 + b^2y^2 - c^2z^2 = 0$; (f) hyperboloid of two sheets: $-a^2x^2 - b^2y^2 + c^2z^2 = 1$; (g) paraboloid: $a^2x^2 + b^2y^2 - 2cz = 0$, $c > 0$; (h) saddle: $a^2x^2 - b^2y^2 - 2cz = 0$, $c > 0$; (i) elliptic cylinder: $a^2x^2 + b^2y^2 = 1$; (j) straight line: $a^2x^2 + b^2y^2 = 0$; (k) imaginary straight line: $a^2x^2 + b^2y^2 = -1$; (l) hyperbolic cylinder: $a^2x^2 - b^2y^2 = 1$; (m) nonparallel planes: $a^2x^2 - b^2y^2 = 0$; (n) parabolic cylinder: $a^2x^2 = 2cz$, $c > 0$; (o) parallel planes: $a^2x^2 = 1$; (p) plane: $a^2x^2 = 0$; (q) imaginary plane: $a^2x^2 = -1$.

3.3 Exercises

3.72 ¶. Starting from specific lines or planes expressed in parametric or implicit way in $\mathbb{R}^3$, write
o the straight line through the origin perpendicular to a plane,
o the plane through the origin perpendicular to a straight line,
o the distance of a point from a straight line and from a plane,
o the distance between two straight lines,
o the perpendicular straight line to two given nonintersecting lines,
o the symmetric of a point with respect to a line and to a plane,
o the symmetric of a line with respect to a plane.

3.73 ¶. Let $X$, $Y$ be two Euclidean spaces with inner products respectively, $(\ |\ )_X$ and $(\ |\ )_Y$. Show that $X \times Y$ is a Euclidean space with inner product
$$\big((x_1, x_2) \,|\, (y_1, y_2)\big) := (x_1 | y_1)_X + (x_2 | y_2)_Y, \qquad (x_1, x_2), (y_1, y_2) \in X \times Y.$$
Notice that $X \times \{0\}$ and $\{0\} \times Y$ are orthogonal subspaces of $X \times Y$.

3.74 ¶. Let $x, y \in \mathbb{R}^n$. Show that $x \perp y$ if and only if $|x - ay| \ge |x|$ $\forall a \in \mathbb{R}$.

3.75 ¶. The graph of the map $A(x) := \mathbf{A}x$, $\mathbf{A} \in M_{k,n}(\mathbb{R})$, is defined as
$$G_A := \{(x,y) \mid x \in \mathbb{R}^n,\ y \in \mathbb{R}^k,\ y = A(x)\} \subset \mathbb{R}^n \times \mathbb{R}^k.$$
Show that $G_A$ is a linear subspace of $\mathbb{R}^{n+k}$ of dimension $n$ and that it is generated by the column vectors of the $(k + n) \times n$ matrix
$$\begin{pmatrix} \mathrm{Id}_n \\ \mathbf{A} \end{pmatrix}.$$
Also show that the row vectors of the $k \times (n + k)$ matrix
$$\begin{pmatrix} \mathbf{A} & -\mathrm{Id}_k \end{pmatrix}$$
generate the orthogonal to $G_A$.

3.76 ¶. Write in the standard basis of $\mathbb{R}^4$ the matrices of the orthogonal projection on specific subspaces of dimension 2 and 3.

3.77 ¶. Let $X$ be Euclidean or Hermitian and let $V$, $W$ be subspaces of $X$. Show that
$$V^\perp \cap W^\perp = (V + W)^\perp.$$

3.78 ¶. Let $f : M_{n,n}(\mathbb{K}) \to \mathbb{K}$ be a linear map such that $f(\mathbf{A}\mathbf{B}) = f(\mathbf{B}\mathbf{A})$ $\forall \mathbf{A}, \mathbf{B} \in M_{n,n}(\mathbb{K})$. Show that there is $\lambda \in \mathbb{K}$ such that $f(\mathbf{X}) = \lambda\, \mathrm{tr}\, \mathbf{X}$ for all $\mathbf{X} \in M_{n,n}(\mathbb{K})$, where $\mathrm{tr}\, \mathbf{X} := \sum_{i=1}^{n} x_i^i$ if $\mathbf{X} = [x_j^i]$.

3.79 ¶. Show that the bilinear form $b : M_{n,n}(\mathbb{R}) \times M_{n,n}(\mathbb{R}) \to \mathbb{R}$ given by
$$b(\mathbf{A}, \mathbf{B}) := \mathrm{tr}(\mathbf{A}^T \mathbf{B}) := \sum_{i=1}^{n} (\mathbf{A}^T \mathbf{B})_i^i$$
defines an inner product on the real vector space $M_{n,n}(\mathbb{R})$. Find the orthogonal of the symmetric matrices.

3.80 ¶. Given $n + 1$ points $z_1, z_2, \dots, z_{n+1}$ in $\mathbb{C}$, show that there exists a unique polynomial of degree at most $n$ with prescribed values at $z_1, z_2, \dots, z_{n+1}$. [Hint: If $\mathcal{P}_n$ is the set of complex polynomials of degree at most $n$, consider the map $\phi : \mathcal{P}_n \to \mathbb{C}^{n+1}$ given by $\phi(P) := (P(z_1), P(z_2), \dots, P(z_{n+1}))$.]

3.81 ¶ Discrete integration. Let $t_1, t_2, \dots, t_n$ be $n$ points in $[a,b] \subset \mathbb{R}$. Show that there are constants $a_1, a_2, \dots, a_n$ such that
$$\int_a^b P(t)\, dt = \sum_{j=1}^{n} a_j P(t_j)$$
for every polynomial of degree at most $n - 1$.

3.82 ¶. Let
$$Q := [0,1]^n = \{x \in \mathbb{R}^n \mid 0 \le x_i \le 1,\ i = 1, \dots, n\}$$
be the cube of side one in $\mathbb{R}^n$. Show that its diagonal has length $\sqrt{n}$. Denote by $x_1, \dots, x_{2^n}$ the vertices of $Q$ and by $x := (1/2, 1/2, \dots, 1/2)$ the center of $Q$. Show that the balls around $x$ that do not intersect the balls $B(x_i, 1/2)$, $i = 1, \dots, 2^n$, necessarily have radius at most $R_n := (\sqrt{n} - 1)/2$. Conclude that for $n > 4$, $B(x, R_n)$ is not contained in $Q$.

3.83 ¶. Give a few explicit metrics in $\mathbb{R}^3$ and find the corresponding orthogonal bases.

3.84 ¶. Reduce a few explicit quadratic forms in $\mathbb{R}^3$ and $\mathbb{R}^4$ to their canonical form.

4. Self-Adjoint Operators

In this chapter, we deal with self-adjoint operators on a Euclidean or Hermitian space, and, more precisely, with the spectral theory for self-adjoint and normal operators. In the last section, we shall see methods and results of linear algebra at work on some specific examples and problems.

4.1 Elements of Spectral Theory

4.1.1 Self-adjoint operators

a. Self-adjoint operators

4.1 Definition. Let $X$ be a Euclidean or Hermitian space. A linear operator $A : X \to X$ is called self-adjoint if $A^* = A$.

As we can see, if $\mathbf{A}$ is the matrix associated to $A$ in an orthonormal basis, then $\mathbf{A}^T$ and $\overline{\mathbf{A}}^T$ are the matrices associated to $A^*$ in the same basis according to whether $X$ is Euclidean or Hermitian. In particular, $A$ is self-adjoint if and only if $\mathbf{A} = \mathbf{A}^T$ in the Euclidean case and $\mathbf{A} = \overline{\mathbf{A}}^T$ in the Hermitian case. Moreover, as a consequence of the alternative theorem we have
$$X = \ker A \oplus \mathrm{Im}\, A, \qquad \ker A \perp \mathrm{Im}\, A$$
if $A : X \to X$ is self-adjoint. Finally, notice that the self-adjoint operators form a real linear subspace of $\mathcal{L}(X, X)$; the product of two self-adjoint operators is self-adjoint only when they commute.

Typical examples of self-adjoint operators are the orthogonal projection operators. In fact, we have the following.

4.2 Proposition. Let $X$ be a Euclidean or Hermitian space and let $P : X \to X$ be a linear operator. $P$ is the orthogonal projection onto its image if and only if $P^* = P$ and $P \circ P = P$.


Proof. This follows, for instance, from 3.32. Here we present a more direct proof. Suppose $P$ is the orthogonal projection onto its image. Then for every $y \in X$, $(y - P(y) \,|\, z) = 0$ $\forall z \in \mathrm{Im}\, P$. Thus $y = P(y)$ if $y \in \mathrm{Im}\, P$, that is $P(x) = P \circ P(x) = P^2(x)$ $\forall x \in X$. Moreover, for $x, y \in X$
$$0 = (x - P(x) \,|\, P(y)) = (x \,|\, P(y)) - (P(x) \,|\, P(y)),$$
$$0 = (P(x) \,|\, y - P(y)) = (P(x) \,|\, y) - (P(x) \,|\, P(y)),$$
hence
$$(P(x) \,|\, y) = (x \,|\, P(y)), \qquad \text{i.e.,} \qquad P^* = P.$$
Conversely, if $P^* = P$ and $P^2 = P$ we have
$$(x - P(x) \,|\, P(z)) = (P^*(x - P(x)) \,|\, z) = (P(x) - P^2(x) \,|\, z) = (P(x) - P(x) \,|\, z) = 0$$
for all $z \in X$. □
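Proposition 4.2 can be illustrated in coordinates. The sketch below assumes the standard inner product on $\mathbb{R}^n$ and integer vectors; `projector_onto_line`, `matmul` and `transpose` are names introduced only for this example. It builds the orthogonal projector onto a line and compares it with an oblique projector, which is idempotent but not self-adjoint.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def projector_onto_line(v):
    """Matrix of the orthogonal projection onto Span{v}: v v^T / (v|v)."""
    norm2 = sum(c * c for c in v)
    return [[Fraction(vi * vj, norm2) for vj in v] for vi in v]


P = projector_onto_line([1, 2, 2])
is_idempotent = matmul(P, P) == P      # P o P = P
is_self_adjoint = transpose(P) == P    # P* = P in an orthonormal basis

# An oblique projection onto the first axis along the direction (1, -1):
# idempotent, but not symmetric, hence not an orthogonal projection.
Q = [[1, 1], [0, 0]]
q_idempotent = matmul(Q, Q) == Q
q_self_adjoint = transpose(Q) == Q
```

Here `is_idempotent` and `is_self_adjoint` are both `True`, while `q_idempotent` is `True` and `q_self_adjoint` is `False`, matching the two conditions in the proposition.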

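b. The spectral theorem

The following theorem, as we shall see, yields a characterization of the self-adjoint operators.

4.3 Theorem (Spectral theorem). Let $A : X \to X$ be a self-adjoint operator on the Euclidean or Hermitian space $X$. Then $X$ has an orthonormal basis made of eigenvectors of $A$.

In order to prove Theorem 4.3 let us first state the following.

4.4 Proposition. Under the hypothesis of Theorem 4.3 we have
(i) $A$ has $n$ real eigenvalues, if counted with multiplicity,
(ii) if $V \subset X$ is an invariant subspace under $A$, then $V^\perp$ is also invariant under $A$,
(iii) eigenvectors corresponding to distinct eigenvalues are orthogonal.

Proof. (i) Assume $X$ is Hermitian and let $\mathbf{A} \in M_{n,n}(\mathbb{C})$ be the matrix associated to $A$ in an orthonormal basis. Then $\mathbf{A} = \overline{\mathbf{A}}^T$, and $\mathbf{A}$ has $n$ complex eigenvalues, if counted with multiplicity. Let $\mathbf{z} \in \mathbb{C}^n$ be an eigenvector with eigenvalue $\lambda \in \mathbb{C}$. Then
$$\mathbf{A}\mathbf{z} = \lambda \mathbf{z}, \qquad \overline{\mathbf{A}}\, \overline{\mathbf{z}} = \overline{\lambda}\, \overline{\mathbf{z}}.$$
Consequently, if $\mathbf{A} = (a^i_j)$, $\mathbf{z} = (z^1, z^2, \dots, z^n)$, we have
$$\lambda |\mathbf{z}|^2 = \sum_{i=1}^{n} \lambda z^i\, \overline{z^i} = \sum_{i,j=1}^{n} a^i_j\, z^j\, \overline{z^i}, \qquad \overline{\lambda} |\mathbf{z}|^2 = \sum_{i=1}^{n} z^i\, \overline{\lambda z^i} = \sum_{i,j=1}^{n} \overline{a^i_j}\, z^i\, \overline{z^j}.$$
Since $a^i_j = \overline{a^j_i}$ for all $i, j = 1, \dots, n$, the two right-hand sides coincide and we conclude that
$$(\lambda - \overline{\lambda}) |\mathbf{z}|^2 = 0, \qquad \text{i.e.,} \qquad \lambda \in \mathbb{R}.$$
In the Euclidean case $\mathbf{A}$ is real and symmetric, so $\overline{\mathbf{A}}^T = \mathbf{A}^T = \mathbf{A}$ also, and the same argument applies.
(ii) Let $w \in V^\perp$. For every $v \in V$ we have $(A(w) | v) = (w | A(v)) = 0$ since $A(v) \in V$ and $w \in V^\perp$. Thus $A(w) \perp V$.
(iii) Let $x$, $y$ be eigenvectors of $A$ with eigenvalues $\lambda$, $\mu$, respectively. Then $\lambda$ and $\mu$ are real and
$$(\lambda - \mu)(x | y) = (\lambda x | y) - (x | \mu y) = (A(x) | y) - (x | A(y)) = 0.$$
Thus $(x | y) = 0$ if $\lambda \neq \mu$. □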


Proof of Theorem 4.3. We proceed by induction on the dimension $n$ of $X$. On account of Proposition 4.4 (i), the claim trivially holds if $\dim X = 1$. Suppose the theorem has been proved for all self-adjoint operators on $H$ when $\dim H = n - 1$ and let us prove it for $A$. Because of (i) of Proposition 4.4, all eigenvalues of $A$ are real, hence there exists at least an eigenvector $u_1$ of $A$ with norm one. Let $H := \mathrm{Span}\{u_1\}^\perp$ and let $B := A_{|H}$ be the restriction of $A$ to $H$. Because of (ii) of Proposition 4.4, $B(H) \subset H$, hence $B : H \to H$ is a linear operator on $H$ (whose dimension is $n - 1$); moreover, $B$ is self-adjoint, since it is the restriction to an invariant subspace of a self-adjoint operator. Therefore, by the inductive assumption, there is an orthonormal basis $(u_2, \dots, u_n)$ of $H$ made of eigenvectors of $B$, hence of $A$. Since $u_2, \dots, u_n$ are orthogonal to $u_1$, $(u_1, u_2, \dots, u_n)$ is an orthonormal basis of $X$ made of eigenvectors of $A$. □

The next proposition expresses the existence of an orthonormal basis of eigenvectors in several different ways, see Theorem 2.45. We leave its simple proof to the reader.

4.5 Proposition. Let $A : X \to X$ be a linear operator on a Euclidean or Hermitian space $X$ of dimension $n$. Let $(u_1, u_2, \dots, u_n)$ be a basis of $X$ and let $\lambda_1, \lambda_2, \dots, \lambda_n$ be real numbers. The following claims are equivalent:
(i) $(u_1, u_2, \dots, u_n)$ is an orthonormal basis of $X$ and each $u_j$ is an eigenvector of $A$ with eigenvalue $\lambda_j$, i.e.,
$$A(u_j) = \lambda_j u_j, \qquad (u_i | u_j) = \delta_{ij} \qquad \forall i, j = 1, \dots, n,$$
(ii) $(u_1, u_2, \dots, u_n)$ is an orthonormal basis and
$$A(x) = \sum_{j=1}^{n} \lambda_j\, (x | u_j)\, u_j \qquad \forall x \in X,$$
(iii) $(u_1, u_2, \dots, u_n)$ is an orthonormal basis and for all $x, y \in X$
$$(A(x) | y) = \begin{cases} \sum_{j=1}^{n} \lambda_j\, (x | u_j)\, (y | u_j) & \text{if } X \text{ is Euclidean,} \\ \sum_{j=1}^{n} \lambda_j\, (x | u_j)\, \overline{(y | u_j)} & \text{if } X \text{ is Hermitian.} \end{cases}$$

Moreover, we have the following, compare with Theorem 2.45.

4.6 Proposition. Let $A : X \to X$ be a self-adjoint operator in a Euclidean or Hermitian space $X$ of dimension $n$ and let $\mathbf{A} \in M_{n,n}(\mathbb{K})$ be the matrix associated to $A$ in a given orthonormal basis. Then $\mathbf{A}$ is similar to a diagonal matrix. More precisely, let $(u_1, u_2, \dots, u_n)$ be an orthonormal basis of $X$ of eigenvectors of $A$, let $\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{R}$ be the corresponding eigenvalues and let $\mathbf{S} \in M_{n,n}(\mathbb{K})$ be the matrix that has the $n$-tuple of components of $u_i$ in the given orthonormal basis as the $i$th column,
$$\mathbf{S} := [\mathbf{u}_1 \mid \mathbf{u}_2 \mid \dots \mid \mathbf{u}_n].$$
Then
$$\mathbf{S}^T \mathbf{S} = \mathrm{Id} \qquad \text{and} \qquad \mathbf{S}^T \mathbf{A} \mathbf{S} = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$$
if $X$ is Euclidean, and
$$\overline{\mathbf{S}}^T \mathbf{S} = \mathrm{Id} \qquad \text{and} \qquad \overline{\mathbf{S}}^T \mathbf{A} \mathbf{S} = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n)$$
if $X$ is Hermitian.

Proof. Since the columns of $\mathbf{S}$ are orthonormal, it follows that $\mathbf{S}^T \mathbf{S} = \mathrm{Id}$ if $X$ is Euclidean or $\overline{\mathbf{S}}^T \mathbf{S} = \mathrm{Id}$ if $X$ is Hermitian. The rest of the proof is contained in Theorem 2.45. □

c. Spectral resolution

Let $A : X \to X$ be a self-adjoint operator on a Euclidean or Hermitian space $X$ of dimension $n$, let $(u_1, u_2, \dots, u_n)$ be an orthonormal basis of eigenvectors of $A$, let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of $A$ and $V_1, V_2, \dots, V_k$ the corresponding eigenspaces. Let $P_i : X \to V_i$ be the projector on $V_i$ so that
$$P_i(x) = \sum_{u_j \in V_i} (x | u_j)\, u_j$$
and by (ii) of Proposition 4.5
$$A(x) = \sum_{i=1}^{k} \lambda_i P_i(x).$$
As we have seen, by (iii) of Proposition 4.4, we have $V_i \perp V_j$ if $i \neq j$ and, by the spectral theorem, $\sum_{i=1}^{k} \dim V_i = n$. In other words, we can say that $\{V_i\}_i$ is a decomposition of $X$ in orthogonal subspaces or state the following.

4.7 Theorem. Let $A : X \to X$ be self-adjoint on a Euclidean or Hermitian space $X$ of dimension $n$. Then there exists a unique family of projectors $P_1, P_2, \dots, P_k$ and distinct real numbers $\lambda_1, \lambda_2, \dots, \lambda_k$ such that
$$P_i \circ P_j = \delta_{ij} P_j, \qquad \sum_{i=1}^{k} P_i = \mathrm{Id} \qquad \text{and} \qquad A = \sum_{i=1}^{k} \lambda_i P_i.$$

Finally, we can easily complete the spectral theorem as follows.

4.8 Proposition. Let $X$ be a Euclidean or Hermitian space. A linear operator $A : X \to X$ is self-adjoint if and only if the eigenvalues of $A$ are real and there exists an orthonormal basis of $X$ made of eigenvectors of $A$.
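For a concrete picture of the spectral resolution in Theorem 4.7, here is a worked $2 \times 2$ example; the matrix is an illustrative choice, not one taken from the text.

```latex
\[
\mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
\lambda_1 = 1,\ u_1 = \tfrac{1}{\sqrt{2}} (1, -1)^T, \qquad
\lambda_2 = 3,\ u_2 = \tfrac{1}{\sqrt{2}} (1, 1)^T ,
\]
\[
\mathbf{P}_1 = u_1 u_1^T = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
\mathbf{P}_2 = u_2 u_2^T = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
\]
\[
\mathbf{P}_1 \mathbf{P}_2 = 0, \qquad \mathbf{P}_1 + \mathbf{P}_2 = \mathrm{Id}, \qquad
1 \cdot \mathbf{P}_1 + 3 \cdot \mathbf{P}_2
= \frac{1}{2} \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix} = \mathbf{A} .
\]
```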


d. Quadratic forms

To a self-adjoint operator $A : X \to X$ we may associate the bilinear form $a : X \times X \to \mathbb{K}$,
$$a(x,y) := (A(x) | y), \qquad x, y \in X,$$
which is symmetric if $X$ is Euclidean and sesquilinear, $a(x,y) = \overline{a(y,x)}$, if $X$ is Hermitian.

4.9 Theorem. Let $A : X \to X$ be a self-adjoint operator, $(e_1, e_2, \dots, e_n)$ an orthonormal basis of $X$ of eigenvectors of $A$ and $\lambda_1, \lambda_2, \dots, \lambda_n$ be the corresponding eigenvalues. Then
$$(A(x) | x) = \sum_{i=1}^{n} \lambda_i\, |(x | e_i)|^2 \qquad \forall x \in X. \tag{4.1}$$
In particular, if $\lambda_{\min}$ and $\lambda_{\max}$ are respectively, the smallest and largest eigenvalues of $A$, then
$$\lambda_{\min} |x|^2 \le (A(x) | x) \le \lambda_{\max} |x|^2 \qquad \forall x \in X.$$
Moreover, we have $(A(x) | x) = \lambda_{\min} |x|^2$ (resp. $(A(x) | x) = \lambda_{\max} |x|^2$) if and only if $x$ is an eigenvector with eigenvalue $\lambda_{\min}$ (resp. $\lambda_{\max}$).

Proof. Proposition 4.5 yields (4.1), hence
$$\lambda_{\min} \sum_{i=1}^{n} |(x | e_i)|^2 \le (A(x) | x) \le \lambda_{\max} \sum_{i=1}^{n} |(x | e_i)|^2$$
and, since $|x|^2 = \sum_{i=1}^{n} |(x | e_i)|^2$ $\forall x \in X$, the first part of the claim is proved. Let us prove that $(A(x) | x) = \lambda_{\min} |x|^2$ if and only if $x$ is an eigenvector with eigenvalue $\lambda_{\min}$. If $x$ is an eigenvector of $A$ with eigenvalue $\lambda_{\min}$, then $A(x) = \lambda_{\min} x$, hence $(A(x) | x) = (\lambda_{\min} x | x) = \lambda_{\min} |x|^2$. Conversely, suppose $(e_1, e_2, \dots, e_n)$ is an orthonormal basis of $X$ made of eigenvectors of $A$ and the eigenspace $V_{\lambda_{\min}}$ is spanned by $(e_1, e_2, \dots, e_k)$. From $(A(x) | x) = \lambda_{\min} |x|^2$, we infer that
$$0 = (A(x) | x) - \lambda_{\min} |x|^2 = \sum_{i=1}^{n} (\lambda_i - \lambda_{\min})\, |(x | e_i)|^2$$
and, as $\lambda_i - \lambda_{\min} \ge 0$ for all $i$, with strict inequality for $i > k$, we get that $(x | e_i) = 0$ $\forall i > k$, thus $x \in V_{\lambda_{\min}}$. We proceed similarly for $\lambda_{\max}$. □
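The bounds in Theorem 4.9 say that the quotient $(A(x)|x)/|x|^2$, often called the Rayleigh quotient, always lies between $\lambda_{\min}$ and $\lambda_{\max}$. The sketch below checks this on a sample matrix with exact rational arithmetic; the matrix, the grid of test vectors, and the helper names `apply`, `inner` and `rayleigh` are all illustrative assumptions.

```python
from fractions import Fraction


def apply(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def rayleigh(A, x):
    """Quotient (A(x)|x) / |x|^2 for a nonzero integer vector x."""
    return Fraction(inner(apply(A, x), x), inner(x, x))


# Symmetric matrix with eigenvalues 1 and 3
# (eigenvectors (1, -1) and (1, 1) respectively).
A = [[2, 1], [1, 2]]
lam_min, lam_max = 1, 3

samples = [[p, q] for p in range(-3, 4) for q in range(-3, 4) if (p, q) != (0, 0)]
values = [rayleigh(A, x) for x in samples]
within_bounds = all(lam_min <= v <= lam_max for v in values)

# Equality holds exactly on the eigenvectors.
at_min = rayleigh(A, [1, -1])
at_max = rayleigh(A, [1, 1])
```

On this grid `within_bounds` is `True`, and the extreme values `at_min` and `at_max` are attained at the eigenvectors $(1,-1)$ and $(1,1)$, as the second part of the theorem states.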

All eigenvalues can, in fact, be characterized as in Theorem 4.9. Let us order the eigenvalues, counted with their multiplicity, as
$$\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$$
and let $(e_1, e_2, \dots, e_n)$ be an orthonormal basis of corresponding eigenvectors, $A(e_i) = \lambda_i e_i$ $\forall i = 1, \dots, n$; finally, set
$$V_k := \mathrm{Span}\{e_1, e_2, \dots, e_k\}, \qquad W_k := \mathrm{Span}\{e_k, e_{k+1}, \dots, e_n\}.$$
Since $V_k$, $W_k$ are invariant subspaces under $A$ and $V_k^\perp = W_{k+1}$, by applying Theorem 4.9 to the restriction of $(A(x) | x)$ on $V_k$ and $W_k$, we find
$$\begin{aligned} \lambda_1 &= \min_{|x| = 1} (A(x) | x), \\ \lambda_k &= \max\{ (A(x) | x) \mid |x| = 1,\ x \in V_k \} = \min\{ (A(x) | x) \mid |x| = 1,\ x \in W_k \} \quad \text{if } k = 2, \dots, n - 1, \\ \lambda_n &= \max_{|x| = 1} (A(x) | x). \end{aligned} \tag{4.2}$$
Moreover, if $S$ is a subspace of dimension $n - k + 1$, we have $S \cap V_k \neq \{0\}$, then there is $x_0 \in S \cap V_k$ with $|x_0| = 1$; thus
$$\min\{ (A(x) | x) \mid |x| = 1,\ x \in S \} \le (A(x_0) | x_0) \le \max\{ (A(x) | x) \mid |x| = 1,\ x \in V_k \} = \lambda_k.$$
Since $\dim W_k = n - k + 1$ and $\min\{ (A(x) | x) \mid |x| = 1,\ x \in W_k \} = \lambda_k$, we conclude with the min-max characterization of eigenvalues that makes no reference to eigenvectors.

4.10 Proposition (Courant). Let $A$ be a self-adjoint operator on a Euclidean or Hermitian space $X$ of dimension $n$ and let $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ be the eigenvalues of $A$ in nondecreasing order and counted with multiplicity. Then
$$\lambda_k = \max_{\dim S = n - k + 1} \min\{ (A(x) | x) \mid |x| = 1,\ x \in S \} = \min_{\dim S = k} \max\{ (A(x) | x) \mid |x| = 1,\ x \in S \}.$$

4.11 A variational algorithm for the eigenvectors. Prom (4.2) we know that Afc :=min{(A(x)|x)| |x| = 1, x G 14-^1},

A: = l . . . , n ,

(4.3)

where F_i = {0}. This yields an iterative procedure to compute the eigenvalues of A. For j = 1 define Ai =

m.m{A{x)\x), kl=i

and for j = 1 , . . . , n — 1 set Vj := eigenspace of Aj, Aj+i := min< {A{x)\x) | |x| = 1, x e Wj >.

4.1 Elements of Spectral Theory

117

Notice that such an algorithm yields an alternative proof of the spectral theorem. We shall see in Chapter 10 that this procedure extends to certain classes of self-adjoint operators in infinite-dimensional spaces.

Finally, notice that Sylvester's theorem, Gram-Schmidt's procedure or the other variants for reducing a quadratic form to a canonical form, see Chapter 3, allow us to find the numbers of positive, negative and null eigenvalues (with multiplicity) without computing them explicitly.

e. Positive operators

A self-adjoint operator $A : X \to X$ is called positive (resp. nonnegative) if the quadratic form $\phi(x) := (Ax|x)$ is positive for $x \neq 0$ (resp. nonnegative). From the results about metrics, see Corollary 3.56, or directly from Theorem 4.9, we have the following.

4.12 Proposition. Let $A : X \to X$ be self-adjoint. $A$ is positive (nonnegative) if and only if all eigenvalues of $A$ are positive (nonnegative) or iff there is $\lambda > 0$ ($\lambda \ge 0$) such that $(Ax|x) \ge \lambda |x|^2$.

4.13 Corollary. $A : X \to X$ is positive self-adjoint if and only if $a(x, y) := (A(x)|y)$ is an inner (Hermitian) product on $X$.

4.14 Proposition (Simultaneous diagonalization). Let $A, M : X \to X$ be linear self-adjoint operators on $X$. Suppose $M$ is positive. Then there exists a basis $(e_1, e_2, \dots, e_n)$ of $X$ and real numbers $\lambda_1, \lambda_2, \dots, \lambda_n$ such that

$$(M(e_i)|e_j) = \delta_{ij}, \qquad A(e_j) = \lambda_j M e_j \qquad \forall i, j = 1, \dots, n. \tag{4.4}$$

Proof. The metric $g(x, y) := (M(x)|y)$ is a scalar (Hermitian) product on $X$ and the linear operator $M^{-1}A : X \to X$ is self-adjoint with respect to $g$ since

$$g(M^{-1}A(x), y) = (M M^{-1}A(x)|y) = (A(x)|y) = (x|A(y)) = (x|M M^{-1}A(y)) = (Mx|M^{-1}A(y)) = g(x, M^{-1}A(y)).$$

Therefore, $M^{-1}A$ has real eigenvalues and, by the spectral theorem, there is a $g$-orthonormal basis of $X$ made of eigenvectors of $M^{-1}A$,

$$g(e_i, e_j) = (M(e_i)|e_j) = \delta_{ij}, \qquad M^{-1}A(e_j) = \lambda_j e_j \qquad \forall i, j = 1, \dots, n.$$

□
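A small worked instance of (4.4) may help. The matrices below are chosen here for illustration and do not come from the text; the computation only sketches how the $g$-orthonormal eigenvectors of $M^{-1}A$ are obtained.

```latex
M = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}, \qquad
A = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}, \qquad
M^{-1}A = \begin{pmatrix} 0 & 2 \\ 1/2 & 0 \end{pmatrix}, \qquad
\det(\lambda\,\mathrm{Id} - M^{-1}A) = \lambda^2 - 1 .

% Eigenvalues \lambda_1 = 1, \lambda_2 = -1, with eigenvectors normalized
% with respect to g(x,y) = (Mx|y):
e_1 = \tfrac{1}{2\sqrt{2}}\,(2,\ 1)^T, \qquad
e_2 = \tfrac{1}{2\sqrt{2}}\,(2,\ -1)^T,

(Me_1|e_1) = \tfrac{1}{8}(4 + 4) = 1, \qquad
(Me_1|e_2) = \tfrac{1}{8}(4 - 4) = 0, \qquad
A e_1 = M e_1, \qquad A e_2 = -M e_2 .
```

Note that $e_1$ and $e_2$ are orthogonal for $g$ but not for the standard inner product.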

4.15 Remark. We cannot drop the positivity assumption in Proposition 4.14. For instance, if

we have $\det(\lambda\,\mathrm{Id} - M^{-1}A) = \lambda^2 + 1$, hence $M^{-1}A$ has no real eigenvalue.


4.16 ¶. Show the following.

Proposition. Let $X$ be a Euclidean space and let $g, b : X \times X \to \mathbb{R}$ be two metrics on $X$. Suppose $g$ is positive. Then there exists a basis of $X$ that is both $g$-orthogonal and $b$-orthogonal.

4.17 ¶. Let $A, M$ be linear self-adjoint operators and let $M$ be positive. Then $M^{-1}A$ is self-adjoint with respect to the inner product $g(x, y) := (M(x)|y)$. Show that the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of $M^{-1}A$ are iteratively given by

$$\lambda_1 = \min_{g(x,x) = 1} g(M^{-1}A(x), x) = \min_{x \neq 0} \frac{(A(x)|x)}{(M(x)|x)}$$

and for $j = 1, \dots, n-1$

$$\begin{cases} V_j := \text{eigenspace of } M^{-1}A \text{ relative to } \lambda_j, \\ W_j := (V_1 \oplus V_2 \oplus \dots \oplus V_j)^\perp, \\ \lambda_{j+1} := \min\bigl\{ (A(x)|x) \;\big|\; (M(x)|x) = 1,\ x \in W_j \bigr\}, \end{cases}$$

where $V^\perp$ denotes the orthogonal to $V$ with respect to the inner product $g$.

4.18 ¶. Show the following.

Proposition. Let $T$ be a linear operator on $\mathbb{K}^n$. If $T + T^*$ is positive (nonnegative), then all eigenvalues of $T$ have positive (nonnegative) real part.

f. The operators $A^*A$ and $AA^*$

Let $A : X \to Y$ be a linear operator between $X$ and $Y$ that we assume are either both Euclidean or both Hermitian. From now on we shall write $Ax$ instead of $A(x)$ for the sake of simplicity. As usual, $A^* : Y \to X$ denotes the adjoint of $A$.

4.19 Proposition. The operator $A^*A : X \to X$ is
(i) self-adjoint,
(ii) nonnegative,
(iii) $Ax$, $A^*Ax$ and $(A^*Ax|x)$ are all nonzero or all zero, in particular $A^*A$ is positive if and only if $A$ is injective,
(iv) if $u_1, u_2, \dots, u_n$ are eigenvectors of $A^*A$ with eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ respectively, then

$$(Au_i|Au_j) = \lambda_i (u_i|u_j),$$

in particular, if $u_1, u_2, \dots, u_n$ are orthogonal to each other, then $Au_1, \dots, Au_n$ are orthogonal to each other as well.

Proof. (i) In fact, $(A^*A)^* = A^*A^{**} = A^*A$.

(ii) and (iii) If $Ax = 0$, then trivially $A^*Ax = 0$, and if $A^*Ax = 0$, then $(A^*Ax|x) = 0$. On the other hand, $(A^*Ax|x) = (Ax|Ax) = |Ax|^2$, hence $Ax = 0$ if $(A^*Ax|x) = 0$.

(iv) In fact,

$$(Au_i|Au_j) = (A^*Au_i|u_j) = \lambda_i (u_i|u_j) = \lambda_i |u_i|^2 \delta_{ij},$$

the last equality holding when the $u_i$ are orthogonal to each other. □

4.20 Proposition. The operator $AA^* : Y \to Y$ is
(i) self-adjoint,
(ii) nonnegative,
(iii) $A^*x$, $AA^*x$ and $(AA^*x|x)$ are either all nonzero or all zero, in particular $AA^*$ is positive if and only if $\ker A^* = \{0\}$, equivalently if and only if $A$ is surjective,
(iv) if $u_1, u_2, \dots, u_n$ are eigenvectors of $AA^*$ with eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ respectively, then $(A^*u_i|A^*u_j) = \lambda_i (u_i|u_j)$, in particular, if $u_1, u_2, \dots, u_n$ are orthogonal to each other, then $A^*u_1, \dots, A^*u_n$ are orthogonal to each other as well.

Moreover, $AA^*$ and $A^*A$ have the same nonzero eigenvalues and

$$\operatorname{Rank} AA^* = \operatorname{Rank} A^*A = \operatorname{Rank} A = \operatorname{Rank} A^*.$$

In particular, $\operatorname{Rank} AA^* = \operatorname{Rank} A^*A \le \min(\dim X, \dim Y)$.

Proof. The claims (i), (ii), (iii) and (iv) are proved as in Proposition 4.19. To prove that $A^*A$ and $AA^*$ have the same nonzero eigenvalues, notice that if $x \in X$, $x \neq 0$, is an eigenvector for $A^*A$ with eigenvalue $\lambda \neq 0$, $A^*Ax = \lambda x$, then $Ax \neq 0$ by (iii) of Proposition 4.19 and

$$AA^*(Ax) = A(A^*Ax) = A(\lambda x) = \lambda Ax,$$

i.e., $Ax$ is a nonzero eigenvector for $AA^*$ with the same eigenvalue $\lambda$. Similarly, one proves that if $y \neq 0$ is an eigenvector for $AA^*$ with eigenvalue $\lambda \neq 0$, then by (iii) $A^*y \neq 0$ and $A^*y$ is an eigenvector for $A^*A$ with eigenvalue $\lambda$. Finally, from the alternative theorem, we have $\operatorname{Rank} AA^* = \operatorname{Rank} A^* = \operatorname{Rank} A = \operatorname{Rank} A^*A$. □

g. Powers of a self-adjoint operator

Let $A : X \to X$ be self-adjoint. By the spectral theorem, there is an orthonormal basis $(e_1, e_2, \dots, e_n)$ of $X$ and real numbers $\lambda_1, \lambda_2, \dots, \lambda_n$ such that

$$Ax = \sum_{j=1}^n \lambda_j (x|e_j)\, e_j \qquad \forall x \in X.$$

By induction, one easily computes, using the eigenvectors $e_1, e_2, \dots, e_n$ and the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of $A$, the $k$-th power of $A$, $A^k := A \circ \dots \circ A$ ($k$ times), $\forall k \ge 2$, as

$$A^k x = \sum_{i=1}^n (\lambda_i)^k (x|e_i)\, e_i \qquad \forall x \in X. \tag{4.5}$$

4.21 Proposition. Let $A : X \to X$ be self-adjoint and $k \ge 1$. Then
(i) $A^k$ is self-adjoint,
(ii) $\lambda$ is an eigenvalue for $A$ if and only if $\lambda^k$ is an eigenvalue for $A^k$,
(iii) $x \in X$ is an eigenvector of $A$ with eigenvalue $\lambda$ if and only if $x$ is an eigenvector for $A^k$ with eigenvalue $\lambda^k$. In particular, the eigenspaces of $A$ relative to $\lambda$ and of $A^k$ relative to $\lambda^k$ coincide.
(iv) If $A$ is invertible, equivalently, if all eigenvalues of $A$ are nonzero, then

$$A^{-1}x = \sum_{i=1}^n \frac{1}{\lambda_i} (x|e_i)\, e_i \qquad \forall x \in X.$$

4.22 ¶. Let $A : X \to X$ be self-adjoint. Show that

If $p(t) = \sum_{k=1}^m a_k t^k$ is a polynomial of degree $m$, then (4.5) yields

$$p(A)(x) = \sum_{k=1}^m a_k A^k(x) = \sum_{k=1}^m \sum_{j=1}^n a_k \lambda_j^k (x|e_j)\, e_j = \sum_{j=1}^n p(\lambda_j)(x|e_j)\, e_j. \tag{4.6}$$

4.23 Proposition. Let $A : X \to X$ be a nonnegative self-adjoint operator and let $k \in \mathbb{N}$, $k \ge 1$. There exists a unique nonnegative self-adjoint operator $B : X \to X$ such that $B^{2k} = A$. Moreover, $B$ is positive if $A$ is positive.

The operator $B$ such that $B^{2k} = A$ is called the $2k$-th root of $A$ and is denoted by $\sqrt[2k]{A}$.

Proof. If $A(x) = \sum_{j=1}^n \lambda_j (x|e_j)\, e_j$, (4.5) yields $B^{2k} = A$ for

$$B(x) := \sum_{j=1}^n \sqrt[2k]{\lambda_j}\, (x|e_j)\, e_j.$$

Uniqueness remains to be shown. Suppose $B$ and $C$ are self-adjoint, nonnegative and such that $A = B^{2k} = C^{2k}$. Then $B$ and $C$ have the same eigenvalues and the same eigenspaces by Proposition 4.21, hence $B = C$. □

In particular, if $A : X \to X$ is nonnegative and self-adjoint, the operator square root of $A$ is defined by

$$\sqrt{A}(x) := \sum_{j=1}^n \sqrt{\lambda_j}\, (x|e_j)\, e_j, \qquad x \in X,$$

if $A$ has the spectral decomposition $Ax = \sum_{j=1}^n \lambda_j (x|e_j)\, e_j$.
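The spectral formulas for powers and roots can be tried on a small case. The sketch below assumes the hypothetical matrix $\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, whose eigenvalues $1, 3$ and orthonormal eigenvectors are entered by hand. The helper `spectral_apply` is an illustrative name, not something from the text; it evaluates $\sum_j f(\lambda_j)(x|e_j)e_j$.

```python
import math

# Hypothetical example: A = [[2, 1], [1, 2]] has eigenvalues 1 and 3 with
# orthonormal eigenvectors e1 = (1, -1)/sqrt(2) and e2 = (1, 1)/sqrt(2).
EIGENPAIRS = [
    (1.0, (1 / math.sqrt(2), -1 / math.sqrt(2))),
    (3.0, (1 / math.sqrt(2), 1 / math.sqrt(2))),
]


def spectral_apply(f, x):
    """Return sum_j f(lambda_j) (x|e_j) e_j for a vector x in R^2."""
    out = [0.0, 0.0]
    for lam, e in EIGENPAIRS:
        coeff = f(lam) * (x[0] * e[0] + x[1] * e[1])
        out[0] += coeff * e[0]
        out[1] += coeff * e[1]
    return out


def sqrt_A(x):
    """Operator square root of A applied to x."""
    return spectral_apply(math.sqrt, x)


x = [1.0, 0.0]
# Applying the square root twice should reproduce A x = (2, 1).
twice = sqrt_A(sqrt_A(x))
# The same helper gives powers as in (4.5), e.g. A^3 x with f(t) = t**3.
cube = spectral_apply(lambda t: t ** 3, x)
```

Only the function applied to the eigenvalues changes between the root and the power, which is the content of (4.5) and (4.6).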

4.24 ¶. Prove Proposition 4.14 by noticing that, if $A$ and $M$ are self-adjoint and $M$ is positive, then $M^{-1/2} A M^{-1/2} : X \to X$ is well defined and self-adjoint.

4.25 ¶. Let $A, B$ be self-adjoint and let $A$ be positive. Show that $B$ is positive if $S := AB + BA$ is positive. [Hint: Consider $A^{-1/2} B A^{-1/2}$ and apply Exercise 4.18.]


4.1.2 Normal operators

a. Simultaneous spectral decompositions

4.26 Theorem. Let $X$ be a Euclidean or Hermitian space. If $A$ and $B$ are two self-adjoint operators on $X$ that commute,

$$A = A^*, \qquad B = B^*, \qquad AB = BA,$$

then there exists an orthonormal basis $(e_1, e_2, \dots, e_n)$ of $X$ made of eigenvectors of both $A$ and $B$, hence

$$z = \sum_{i=1}^n (z|e_i)\, e_i, \qquad Az = \sum_{i=1}^n \lambda_i (z|e_i)\, e_i, \qquad Bz = \sum_{i=1}^n \mu_i (z|e_i)\, e_i,$$

$\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{R}$ and $\mu_1, \mu_2, \dots, \mu_n \in \mathbb{R}$ being the eigenvalues respectively of $A$ and $B$.

This is proved by induction as in Theorem 4.3 on account of the following.

4.27 Proposition. Under the hypotheses of Theorem 4.26, we have
(i) $A$ and $B$ have a common nonzero eigenvector,
(ii) if $V$ is invariant under $A$ and $B$, then $V^\perp$ is invariant under $A$ and $B$ as well.

Proof. (i) Let $\lambda$ be an eigenvalue of $A$ and let $V_\lambda$ be the corresponding eigenspace. For all $y \in V_\lambda$ we have $ABy = BAy = \lambda By$, i.e., $By \in V_\lambda$. Thus $V_\lambda$ is invariant under $B$; consequently, there is an eigenvector $w \in V_\lambda$ of $B_{|V_\lambda}$, i.e., an eigenvector common to $A$ and $B$.

(ii) For every $w \in V^\perp$ and $z \in V$, we have $Az, Bz \in V$ and $(Aw|z) = (w|Az) = 0$, $(Bw|z) = (w|Bz) = 0$, i.e., $Aw, Bw \in V^\perp$. □

4.28 ¶. Show that two symmetric matrices $\mathbf{A}, \mathbf{B}$ are simultaneously diagonalizable if and only if they commute, $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A}$.

b. Normal operators on Hermitian spaces

A linear operator $N$ on a Euclidean or Hermitian space $X$ is called normal if $NN^* = N^*N$. Of course, if we fix an orthonormal basis in $X$, we may represent $N$ with an $n \times n$ matrix $\mathbf{N} \in M_{n,n}(\mathbb{C})$ and $N$ is normal if and only if $\mathbf{N}\overline{\mathbf{N}}^T = \overline{\mathbf{N}}^T\mathbf{N}$ if $X$ is Hermitian or $\mathbf{N}\mathbf{N}^T = \mathbf{N}^T\mathbf{N}$ if $X$ is Euclidean.

The class of normal operators, though not trivial from the algebraic point of view (it is not closed for the operations of sum and composition), is interesting as it contains several families of important operators as subclasses. For instance, self-adjoint operators, $N = N^*$, anti-self-adjoint operators, $N^* = -N$, and isometric operators, $N^*N = \mathrm{Id}$, are normal operators. Moreover, normal operators in a Hermitian space are exactly the ones that are diagonalizable in an orthonormal basis. In fact, we have the following.


4.29 Theorem (Spectral theorem). Let $X$ be a Hermitian space of dimension $n$ and let $N : X \to X$ be a linear operator. Then $N$ is normal if and only if there exists an orthonormal basis of $X$ made by eigenvectors of $N$.

Proof. Let $(e_1, e_2, \dots, e_n)$ be an orthonormal basis of $X$ made by eigenvectors of $N$. Then for every $z \in X$

$$Nz = \sum_{j=1}^n \lambda_j (z|e_j)\, e_j, \qquad N^*z = \sum_{j=1}^n \overline{\lambda_j} (z|e_j)\, e_j,$$

hence $NN^*z = \sum_{j=1}^n |\lambda_j|^2 (z|e_j)\, e_j = N^*Nz$. Conversely, let

$$A := \frac{N + N^*}{2}, \qquad B := \frac{N - N^*}{2i}.$$

It is easily seen that $A$ and $B$ are self-adjoint and commute. Theorem 4.26 then yields a basis of orthonormal eigenvectors of $A$ and $B$ and therefore of eigenvectors of $N = A + iB$ and $N^* = A - iB$. □

4.30 ¶. Show that $N : \mathbb{C}^n \to \mathbb{C}^n$ is normal if and only if $N$ and $N^*$ have the same eigenspaces.

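c. Normal operators on Euclidean spaces

Let us translate the information contained in the spectral theorem for normal operators on Hermitian spaces into information about normal operators on Euclidean spaces. In order to do that, let us first make a few remarks.

As usual, in $\mathbb{C}^n$ we write $z = x + iy$, $x, y \in \mathbb{R}^n$, for $z = (x_1 + iy_1, \dots, x_n + iy_n)$. If $W$ is a subspace of $\mathbb{R}^n$, the subspace of $\mathbb{C}^n$

$$W \oplus iW := \{ z \in \mathbb{C}^n \mid z = x + iy,\ x, y \in W \}$$

is called the complexified of $W$. Trivially, $\dim_{\mathbb{C}}(W \oplus iW) = \dim_{\mathbb{R}} W$. Also, if $V$ is a subspace of $\mathbb{C}^n$, set

$$\overline{V} := \{ z \in \mathbb{C}^n \mid \overline{z} \in V \}.$$

4.31 Lemma. A subspace $V \subset \mathbb{C}^n$ is the complexified of a real subspace $W$ if and only if $V = \overline{V}$.

Proof. If $V = W \oplus iW$, trivially $V = \overline{V}$. Conversely, if $z \in V$ is such that $\overline{z} \in V$, the vectors

$$\frac{z + \overline{z}}{2}, \qquad \frac{z - \overline{z}}{2i} = \frac{(z/i) + \overline{(z/i)}}{2}$$

have real coordinates. Set

$$W := \Bigl\{ x \in \mathbb{R}^n \;\Big|\; x = \frac{z + \overline{z}}{2},\ z \in V \Bigr\};$$

then it is easily seen that $V = W \oplus iW$ if $V = \overline{V}$. □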


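For $N : \mathbb{R}^n \to \mathbb{R}^n$ we define its complexified as the (complex) linear operator $N_{\mathbb{C}} : \mathbb{C}^n \to \mathbb{C}^n$ defined by $N_{\mathbb{C}}(z) := Nx + iNy$ if $z = x + iy$. Then we easily see that
(i) $\lambda$ is an eigenvalue of $N$ if and only if $\lambda$ is an eigenvalue of $N_{\mathbb{C}}$,
(ii) $N$ is respectively a self-adjoint, anti-self-adjoint, isometric or normal operator if and only if $N_{\mathbb{C}}$ is respectively a self-adjoint, anti-self-adjoint, isometric or normal operator on $\mathbb{C}^n$,
(iii) the eigenvalues of $N$ are either real, or pairwise complex conjugate; in the latter case the conjugate eigenvalues $\lambda$ and $\overline{\lambda}$ have the same multiplicity.

4.32 Proposition. Let $N : \mathbb{R}^n \to \mathbb{R}^n$ be a normal operator. Every real eigenvalue $\lambda$ of $N$ of multiplicity $k$ has an eigenspace $V_\lambda$ of dimension $k$. In particular, $V_\lambda$ has an orthonormal basis made of eigenvectors.

Proof. Let $\lambda$ be a real eigenvalue for $N_{\mathbb{C}}$, $N_{\mathbb{C}} z = \lambda z$. We have

$$N_{\mathbb{C}} \overline{z} = Nx - iNy = \overline{N_{\mathbb{C}} z} = \overline{\lambda z} = \lambda \overline{z},$$

i.e., $z \in \mathbb{C}^n$ is an eigenvector of $N_{\mathbb{C}}$ with eigenvalue $\lambda$ if and only if $\overline{z}$ is also an eigenvector with eigenvalue $\lambda$. The eigenspace $E_\lambda$ of $N_{\mathbb{C}}$ relative to $\lambda$ is then closed under conjugation and by Lemma 4.31 $E_\lambda = W_\lambda \oplus iW_\lambda$, where

$$W_\lambda := \Bigl\{ x \in \mathbb{R}^n \;\Big|\; x = \frac{z + \overline{z}}{2},\ z \in E_\lambda \Bigr\},$$

and $\dim_{\mathbb{R}} W_\lambda = \dim_{\mathbb{C}} E_\lambda$. Since $N_{\mathbb{C}}$ is diagonalizable in $\mathbb{C}$ and $W_\lambda \subset V_\lambda$, we have

$$k = \dim_{\mathbb{C}} E_\lambda = \dim W_\lambda \le \dim_{\mathbb{R}} V_\lambda.$$

As $\dim V_\lambda \le k$, see Proposition 2.43, the claim follows. □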

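4.33 Proposition. Let $\lambda$ be a nonreal eigenvalue of the normal operator $N : \mathbb{R}^n \to \mathbb{R}^n$ with multiplicity $k$. Then there exist $k$ planes of dimension 2 that are invariant under $N$. More precisely, if $e_1, e_2, \dots, e_k \in \mathbb{C}^n$ are $k$ orthonormal eigenvectors that span the eigenspace $E_\lambda$ of $N_{\mathbb{C}}$ relative to $\lambda$ and we set

$$u_{2j-1} := \frac{e_j + \overline{e_j}}{\sqrt{2}}, \qquad u_{2j} := \frac{e_j - \overline{e_j}}{\sqrt{2}\, i},$$

then $u_1, u_2, \dots, u_{2k}$ are orthonormal in $\mathbb{R}^n$, and for $j = 1, \dots, k$ the plane $\operatorname{Span}\{u_{2j-1}, u_{2j}\}$ is invariant under $N$; more precisely we have

$$\begin{cases} N(u_{2j-1}) = \alpha u_{2j-1} - \beta u_{2j}, \\ N(u_{2j}) = \beta u_{2j-1} + \alpha u_{2j}, \end{cases}$$

where $\lambda =: \alpha + i\beta$.

Proof. Let $E_\lambda$, $E_{\overline{\lambda}}$ be the eigenspaces of $N_{\mathbb{C}}$ relative to $\lambda$ and $\overline{\lambda}$. Since $N_{\mathbb{C}}$ is diagonalizable on $\mathbb{C}$, then $E_\lambda \perp E_{\overline{\lambda}}$ and $\dim_{\mathbb{C}} E_\lambda = \dim_{\mathbb{C}} E_{\overline{\lambda}} = k$. On the other hand, for $z \in E_\lambda$

$$N_{\mathbb{C}} \overline{z} = Nx - iNy = \overline{N_{\mathbb{C}} z} = \overline{\lambda}\, \overline{z}.$$

Therefore, $z \in E_\lambda$ if and only if $\overline{z} \in E_{\overline{\lambda}}$. The complex subspace $F_\lambda := E_\lambda \oplus E_{\overline{\lambda}}$ of $\mathbb{C}^n$ has dimension $2k$ and is closed under conjugation; Lemma 4.31 then yields $F_\lambda = W_\lambda \oplus iW_\lambda$ where

$$W_\lambda := \Bigl\{ x \in \mathbb{R}^n \;\Big|\; x = \frac{z + \overline{z}}{2},\ z \in E_\lambda \Bigr\} \qquad \text{and} \qquad \dim_{\mathbb{R}} W_\lambda = \dim_{\mathbb{C}} F_\lambda = 2k. \tag{4.7}$$

If $(e_1, e_2, \dots, e_k)$ is an orthonormal basis of $E_\lambda$, $(\overline{e_1}, \overline{e_2}, \dots, \overline{e_k})$ is an orthonormal basis of $E_{\overline{\lambda}}$; since

$$\sqrt{2}\, e_j =: u_{2j-1} + i u_{2j}, \qquad \sqrt{2}\, \overline{e_j} =: u_{2j-1} - i u_{2j},$$

we see that $\{u_j\}$ is an orthonormal basis of $W_\lambda$. Finally, if $\lambda := \alpha + i\beta$, we compute

$$\begin{cases} N(u_{2j-1}) = N_{\mathbb{C}}\Bigl(\dfrac{e_j + \overline{e_j}}{\sqrt{2}}\Bigr) = \dfrac{\lambda e_j + \overline{\lambda}\, \overline{e_j}}{\sqrt{2}} = \dots = \alpha u_{2j-1} - \beta u_{2j}, \\[2mm] N(u_{2j}) = N_{\mathbb{C}}\Bigl(\dfrac{e_j - \overline{e_j}}{\sqrt{2}\, i}\Bigr) = \dfrac{\lambda e_j - \overline{\lambda}\, \overline{e_j}}{\sqrt{2}\, i} = \dots = \beta u_{2j-1} + \alpha u_{2j}, \end{cases}$$

i.e., $\operatorname{Span}\{u_{2j-1}, u_{2j}\}$ is invariant under $N$. □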

Observing that the eigenspaces of the real eigenvalues and the eigenspaces of the complex conjugate eigenvalues are pairwise orthogonal, from Propositions 4.32 and 4.33 we infer the following.

4.34 Theorem. Let $N$ be a normal operator on $\mathbb{R}^n$. Then $\mathbb{R}^n$ is the direct sum of 1-dimensional and 2-dimensional subspaces that are pairwise orthogonal and invariant under $N$. In other words, there is an orthonormal basis such that the matrix $\mathbf{N}'$ associated to $N$ in this basis has the block structure

$$\mathbf{N}' = \begin{pmatrix} \lambda & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \mathbf{A}_1 & \\ & & & & \ddots \end{pmatrix},$$

with zeros outside the diagonal blocks. To each real eigenvalue $\lambda$ of multiplicity $k$ correspond $k$ blocks $\lambda$ of dimension $1 \times 1$. To each couple of complex conjugate eigenvalues $\lambda, \overline{\lambda}$ of multiplicity $k$ correspond $k$ $2 \times 2$ blocks of the form

$$\begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$$

where $\alpha + i\beta := \lambda$.

4.35 Corollary. Let $N : \mathbb{R}^n \to \mathbb{R}^n$ be a normal operator. Then
(i) $N$ is self-adjoint if and only if all its eigenvalues are real,
(ii) $N$ is anti-self-adjoint if and only if all its eigenvalues are purely imaginary (or zero),
(iii) $N$ is an isometry if and only if all its eigenvalues have modulus one.

4.36 ¶. Show Corollary 4.35.
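The $2 \times 2$ block of Theorem 4.34 can be checked by hand. The short computation below is only an illustration, with $\alpha, \beta$ arbitrary real numbers; it shows how the three cases of Corollary 4.35 appear at the level of a single block.

```latex
\mathbf{N} = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}, \qquad
\mathbf{N}\mathbf{N}^T = \mathbf{N}^T\mathbf{N} = (\alpha^2 + \beta^2)\,\mathrm{Id}, \qquad
\det(\lambda\,\mathrm{Id} - \mathbf{N}) = (\lambda - \alpha)^2 + \beta^2 .

% Hence N is normal with eigenvalues \alpha \pm i\beta, and
\beta = 0 \iff \mathbf{N} = \mathbf{N}^T, \qquad
\alpha = 0 \iff \mathbf{N}^T = -\mathbf{N}, \qquad
\alpha^2 + \beta^2 = 1 \iff \mathbf{N}^T\mathbf{N} = \mathrm{Id} .
```

In the last case, writing $\alpha = \cos\theta$, $\beta = \sin\theta$, the block is a rotation of the plane.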


4.1.3 Some representation formulas

a. The operator $A^*A$

Let $A : X \to Y$ be a linear operator between two Euclidean spaces or two Hermitian spaces and let $A^* : Y \to X$ be its adjoint. As we have seen, $A^*A : X \to X$ is self-adjoint, nonnegative and can be written as

$$A^*Ax = \sum_{i=1}^n \lambda_i (x|e_i)\, e_i$$

where $(e_1, e_2, \dots, e_n)$ is a basis of $X$ made of eigenvectors of $A^*A$ and for each $i = 1, \dots, n$, $\lambda_i$ is the eigenvalue relative to $e_i$; accordingly, we also have

$$(A^*A)^{1/2}x := \sum_{i=1}^n \mu_i (x|e_i)\, e_i,$$

where $\mu_i := \sqrt{\lambda_i}$. The operator $(A^*A)^{1/2}$ and its eigenvalues $\mu_1, \dots, \mu_n$, called the singular values of $A$, play an important role in the description of $A$.

4.37 ¶. Let $\mathbf{A} \in M_{m,n}(\mathbb{R})$. Show that $\|\mathbf{A}\| := \sup_{|x| = 1} |\mathbf{A}x|$ is the greatest singular value of $\mathbf{A}$. [Hint: $|\mathbf{A}x|^2 = (\mathbf{A}^*\mathbf{A}x) \cdot x$.]

4.38 Theorem (Polar decomposition). Let $A : X \to Y$ be an operator between two Euclidean or two Hermitian spaces.

(i) If $\dim X \le \dim Y$, then there exists an isometry $U : X \to Y$, i.e., $U^*U = \mathrm{Id}$, such that

$$A = U (A^*A)^{1/2}.$$

Moreover, if $A = US$ with $U^*U = \mathrm{Id}$ and $S^* = S$, then $S = (A^*A)^{1/2}$ and $U$ is uniquely defined on $\ker S^\perp = \ker A^\perp$.

(ii) If $\dim X \ge \dim Y$, then there exists an isometry $U : Y \to X$, i.e., $U^*U = \mathrm{Id}$, such that

$$A = (AA^*)^{1/2} U^*.$$

Moreover, if $A = SU^*$ with $U^*U = \mathrm{Id}$ and $S^* = S$, then $S = (AA^*)^{1/2}$ and $U$ is uniquely defined on $\ker S^\perp = \operatorname{Im} A$.

Proof. Let us show (i). Set $n := \dim X$ and $N := \dim Y$. First let us prove uniqueness. If $A = US$ where $U^*U = \mathrm{Id}$ and $S^* = S$, then $A^*A = S^*U^*US = S^*S = S^2$, i.e., $S = (A^*A)^{1/2}$. Now from $A = U(A^*A)^{1/2}$, we infer for $i = 1, \dots, n$

$$A(e_i) = U(A^*A)^{1/2}(e_i) = U(\mu_i e_i) = \mu_i U(e_i),$$

if $(e_1, e_2, \dots, e_n)$ is an orthonormal basis of $X$ of eigenvectors of $(A^*A)^{1/2}$ with relative eigenvalues $\mu_1, \mu_2, \dots, \mu_n$. Hence, $U(e_i) = \frac{1}{\mu_i} A(e_i)$ if $\mu_i \neq 0$, i.e., $U$ is uniquely defined by $A$ on the direct sum of the eigenspaces relative to nonzero eigenvalues of $(A^*A)^{1/2}$, that is, on the orthogonal of $\ker (A^*A)^{1/2} = \ker A$.

Now we shall exhibit $U$. The vectors $A(e_1), \dots, A(e_n)$ are orthogonal and $|A(e_i)| = \mu_i$ as

$$(A(e_i)|A(e_j)) = (A^*A(e_i)|e_j) = \mu_i^2 (e_i|e_j) = \mu_i^2 \delta_{ij}.$$

Let us reorder the eigenvectors and the corresponding eigenvalues in such a way that for some $k$, $1 \le k \le n$, the vectors $A(e_1), \dots, A(e_k)$ are not zero and $A(e_{k+1}) = \dots = A(e_n) = 0$. For $i = 1, \dots, k$ we set $v_i := \frac{A(e_i)}{|A(e_i)|}$ and we complete $v_1, v_2, \dots, v_k$ to form a new orthonormal basis $(v_1, v_2, \dots, v_N)$ of $Y$. Now consider $U : X \to Y$ defined by

$$U(e_i) := v_i, \qquad i = 1, \dots, n.$$

By construction $(U(e_i)|U(e_j)) = \delta_{ij}$, i.e., $U^*U = \mathrm{Id}$, and, since $\mu_i = |A(e_i)| = 0$ for $i > k$, we conclude for every $i = 1, \dots, n$

$$U(A^*A)^{1/2}(e_i) = \mu_i U(e_i) = \mu_i v_i = \begin{cases} |A(e_i)| \dfrac{A(e_i)}{|A(e_i)|} = A(e_i) & \text{if } 1 \le i \le k, \\[2mm] 0\, v_i = 0 = A(e_i) & \text{if } k < i \le n. \end{cases}$$

(ii) follows by applying (i) to $A^*$. □
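As a sanity check of the construction in the proof, here is a worked $2 \times 2$ instance. The matrix is chosen here for convenience and is not taken from the text.

```latex
\mathbf{A} = \begin{pmatrix} 0 & -2 \\ 1 & 0 \end{pmatrix}, \qquad
\mathbf{A}^T\mathbf{A} = \begin{pmatrix} 1 & 0 \\ 0 & 4 \end{pmatrix}, \qquad
(\mathbf{A}^T\mathbf{A})^{1/2} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
\mu_1 = 1, \ \mu_2 = 2 .

% With e_1, e_2 the standard basis, v_i = A e_i / |A e_i| gives
v_1 = (0, 1)^T, \qquad v_2 = (-1, 0)^T, \qquad
\mathbf{U} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
\mathbf{U}^T\mathbf{U} = \mathrm{Id}, \qquad
\mathbf{U}\,(\mathbf{A}^T\mathbf{A})^{1/2} = \mathbf{A} .
```

Here $\mathbf{A}$ is injective, so $\mathbf{U}$ is uniquely determined: it is the rotation by a right angle, and the singular values $1, 2$ are the stretching factors applied before rotating.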

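b. Singular value decomposition

Combining polar decomposition and the spectral theorem we deduce the so-called singular value decomposition of a matrix $\mathbf{A}$. We discuss only the real case, since the complex one requires only a few straightforward changes.

Let $\mathbf{A} \in M_{N,n}(\mathbb{R})$ with $n \le N$. The polar decomposition yields

$$\mathbf{A} = \mathbf{U} (\mathbf{A}^T\mathbf{A})^{1/2} \qquad \text{with } \mathbf{U}^T\mathbf{U} = \mathrm{Id}.$$

On the other hand, since $\mathbf{A}^T\mathbf{A}$ is symmetric, the spectral theorem yields $\mathbf{S} \in M_{n,n}(\mathbb{R})$ such that

$$(\mathbf{A}^T\mathbf{A})^{1/2} = \mathbf{S}^T \operatorname{diag}(\mu_1, \mu_2, \dots, \mu_n)\, \mathbf{S}, \qquad \mathbf{S}^T\mathbf{S} = \mathrm{Id},$$

where $\mu_1, \mu_2, \dots, \mu_n$ are the singular values of $\mathbf{A}$, i.e., the square roots of the eigenvalues of $\mathbf{A}^T\mathbf{A}$. Recall that the $i$th column of $\mathbf{S}^T$ is the eigenvector of $(\mathbf{A}^T\mathbf{A})^{1/2}$ relative to the eigenvalue $\mu_i$. In conclusion, if we set $\mathbf{T} := \mathbf{U}\mathbf{S}^T \in M_{N,n}(\mathbb{R})$, then $\mathbf{T}^T\mathbf{T} = \mathrm{Id}$, $\mathbf{S}^T\mathbf{S} = \mathrm{Id}$ and

$$\mathbf{A} = \mathbf{T} \operatorname{diag}(\mu_1, \mu_2, \dots, \mu_n)\, \mathbf{S}.$$

This is the singular value decomposition of $\mathbf{A}$, which is implemented in most computer libraries on linear algebra. Starting from the singular value decomposition of $\mathbf{A}$, we can easily compute, of course, $(\mathbf{A}^T\mathbf{A})^{1/2}$ and the polar decomposition of $\mathbf{A}$.

4.39 ¶. We notice that the singular value decomposition can be written in a more symmetric form if we extend $\mathbf{T}$ to a square orthogonal matrix $\mathbf{V} \in M_{N,N}(\mathbb{R})$, $\mathbf{V}^T\mathbf{V} = \mathrm{Id}$, and extend $\operatorname{diag}(\mu_1, \mu_2, \dots, \mu_n)$ to an $N \times n$ matrix $\Delta$ by adding $N - n$ null rows at the bottom. Then, again,

$$\mathbf{A} = \mathbf{V} \Delta \mathbf{S}$$

where $\mathbf{V} \in M_{N,N}(\mathbb{R})$, $\mathbf{V}^T\mathbf{V} = \mathrm{Id}$, $\mathbf{S} \in M_{n,n}(\mathbb{R})$, $\mathbf{S}^T\mathbf{S} = \mathrm{Id}$ and

$$\Delta = \begin{pmatrix} \mu_1 & 0 & \dots & 0 \\ 0 & \mu_2 & \dots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \dots & \mu_n \\ 0 & 0 & \dots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \dots & 0 \end{pmatrix}.$$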

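c. The Moore-Penrose inverse

Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces of dimension respectively $n$ and $m$. Denote by

$$P : X \to \ker A^\perp \qquad \text{and} \qquad Q : Y \to \operatorname{Im} A$$

the orthogonal projection operators to $\ker A^\perp$ and $\operatorname{Im} A$. Of course $Ax = Qy$ has at least a solution $x \in X$ for any $y \in Y$. Equivalently, there exists $x \in X$ such that $y - Ax \perp \operatorname{Im} A$. Since the set of solutions of $Ax = Qy$ is a translate of $\ker A$, we conclude that there exists a unique $x := A^\dagger y \in X$ such that

$$\begin{cases} y - Ax \perp \operatorname{Im} A, \\ x \in \ker A^\perp, \end{cases} \qquad \text{equivalently,} \qquad \begin{cases} Ax = Qy, \\ x = Px. \end{cases} \tag{4.8}$$

The linear map $A^\dagger : Y \to X$, $y \to A^\dagger y$, defined this way is called the Moore-Penrose inverse of $A : X \to Y$. From the definition

$$AA^\dagger = Q, \qquad A^\dagger A = P, \qquad \ker A^\dagger = \operatorname{Im} A^\perp = \ker Q, \qquad \operatorname{Im} A^\dagger = \ker A^\perp.$$

4.40 Proposition. $A^\dagger$ is the unique linear map $B : Y \to X$ such that

$$AB = Q, \qquad BA = P \qquad \text{and} \qquad \ker B = \ker Q; \tag{4.9}$$

moreover we have

$$A^*AA^\dagger = A^\dagger AA^* = A^*. \tag{4.10}$$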


Proof. We prove that $B = A^\dagger$ by showing that for all $y \in Y$ the vector $x := By$ satisfies (4.8). The first equality in (4.9) yields $Ax = ABy = Qy$ and the last two imply $x = By = BQy = BAx = Px$. Finally, from $AA^\dagger = Q$ and $A^\dagger A = P$, we infer that

$$A^*AA^\dagger = A^*Q = A^*, \qquad A^\dagger AA^* = PA^* = A^*,$$

using also that $A^*Q = A^*$ and $PA^* = A^*$ since $A$ and $A^*$ are such that $\operatorname{Im} A = (\ker A^*)^\perp$ and $\operatorname{Im} A^* = \ker A^\perp$. □

The equation (4.10) allows us to compute $A^\dagger$ easily when $A$ is injective or surjective.

4.41 Corollary. Let $A : X \to Y$ be a linear map between Euclidean or Hermitian spaces of dimension $n$ and $m$, respectively.
(i) If $\ker A = \{0\}$, then $n \le m$, $A^*A$ is invertible and

$$A^\dagger = (A^*A)^{-1}A^*;$$

moreover, if $A = U(A^*A)^{1/2}$ is the polar decomposition of $A$, then $A^\dagger = (A^*A)^{-1/2}U^*$.
(ii) If $\ker A^* = \{0\}$, then $n \ge m$, $AA^*$ is invertible, and

$$A^\dagger = A^*(AA^*)^{-1};$$

moreover, if $A = (AA^*)^{1/2}U^*$ is the polar decomposition of $A$, then $A^\dagger = U(AA^*)^{-1/2}$.

For more on the Moore-Penrose inverse, see Chapter 10.
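Case (i) of Corollary 4.41 is easy to test numerically. The sketch below assumes a hypothetical real $3 \times 2$ matrix with independent columns and uses exact rational arithmetic; the helper names (`pinv_injective`, `inverse_2x2`) are illustrative only.

```python
from fractions import Fraction


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]


def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def pinv_injective(A):
    """(A^T A)^{-1} A^T for a real matrix A with two independent columns."""
    At = transpose(A)
    return matmul(inverse_2x2(matmul(At, A)), At)


# Hypothetical data: an injective map from R^2 to R^3.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
A_dagger = pinv_injective(A)
P = matmul(A_dagger, A)  # projection onto (ker A)^perp, here the identity of R^2
Q = matmul(A, A_dagger)  # orthogonal projection onto Im A
```

With these definitions `P` is the identity, while `Q` is symmetric and idempotent, as expected of an orthogonal projection.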

4.2 Some Applications

In this final section, we illustrate methods of linear algebra in a few specific examples.

4.2.1 The method of least squares

a. The method of least squares

Suppose we have $m$ experimental data $y_1, y_2, \dots, y_m$ when performing an experiment of which we have a mathematical model that imposes that the data should be functions, $\phi(x)$, of a variable $x \in X$. Our problem is that of finding $x \in X$ in such a way that the theoretical data $\phi(x)$ be as close as possible to the data of the experiment.

We can formalize our problem as follows. We list the experimental data as a vector $y = (y_1, y_2, \dots, y_m) \in \mathbb{R}^m$ and represent the mathematical model as a map $\phi : X \to \mathbb{R}^m$. Then, we introduce a cost function $C = C(\phi(x), y)$ that evaluates the error between the expected result when the parameter is $x$, and the experimental data. Our problem then becomes that of finding a minimizer of the cost function $C$. If we choose
(i) the model of the data to be linear, i.e., $X$ is a vector space of dimension $n$ and $\phi = A : X \to \mathbb{R}^m$ is a linear operator,
(ii) as cost function, the function square distance between the expected and the experimental data,

$$C(x) = |Ax - y|^2 = (Ax - y|Ax - y), \tag{4.11}$$

we talk of the (linear) least squares problem.

4.42 Theorem. Let $X$ and $Y$ be Euclidean spaces, $A : X \to Y$ a linear map, $y \in Y$ and $C : X \to \mathbb{R}$ the function

$$C(x) := |Ax - y|_Y^2, \qquad x \in X.$$

The following claims are equivalent
(i) $x$ is a minimizer of $C$,
(ii) $y - Ax \perp \operatorname{Im} A$,
(iii) $x$ solves the canonical equation

$$A^*(Ax - y) = 0. \tag{4.12}$$

Consequently $C$ has at least a minimizer in $X$ and the space of minimizers of $C$ is a translate of $\ker A$.

Proof. Clearly minimizing $C$ is equivalent to finding $z = Ax \in \operatorname{Im} A$ of least distance from $y$. By the orthogonal projection theorem, $x$ is a minimizer if and only if $Ax$ is the orthogonal projection of $y$ onto $\operatorname{Im} A$. We therefore deduce that a minimizer $x \in X$ for $C$ exists, that for two minimizers $x_1, x_2$ of $C$ we have $Ax_1 = Ax_2$, i.e., $x_1 - x_2 \in \ker A$, and that (i) and (ii) are equivalent. Finally, since $\operatorname{Im} A^\perp = \ker A^*$, (ii) and (iii) are clearly equivalent. □

4.43 Remark. The equation (4.12) expresses the fact that the function $x \to |Ax - b|^2$ is stationary at a minimizer. In fact, compare 3.65, since $\nabla_x (z|x) = z$ and $\nabla_x (Lx|x) = 2Lx$ if $L$ is self-adjoint, we have

$$|Ax - b|^2 = |b|^2 - 2(b|Ax) + |Ax|^2, \qquad \nabla_x (b|Ax) = \nabla_x (A^*b|x) = A^*b, \qquad \nabla_x |Ax|^2 = \nabla_x (A^*Ax|x) = 2A^*Ax,$$

hence

$$\nabla_x |b - Ax|^2 = 2A^*(Ax - b).$$

As a consequence of Theorem 4.42 on account of (4.8) we can state the following.

4.44 Corollary. The unique minimizer of $C(x) = |Ax - y|_Y^2$ in $\operatorname{Im} A^* = \ker A^\perp$ is $x = A^\dagger y$.


b. The function of linear regression

Given $m$ vectors $x_1, x_2, \dots, x_m$ in a Euclidean space $X$ and $m$ corresponding numbers $y_1, y_2, \dots, y_m$, we want to find a linear map $L : X \to \mathbb{R}$ that minimizes the quantity

$$F(L) := \sum_{i=1}^m |y_i - L(x_i)|^2.$$

This is in fact a dual formulation of the linear least squares problem. By Riesz's theorem, to every linear map $L : X \to \mathbb{R}$ corresponds a unique vector $w_L \in X$ such that $L(y) := (y|w_L)$, and conversely. Therefore, we need to find $w \in X$ such that

$$C(w) := \sum_{i=1}^m |y_i - (x_i|w)|^2 \to \min.$$

If $y := (y_1, y_2, \dots, y_m) \in \mathbb{R}^m$ and $A : X \to \mathbb{R}^m$ is the linear map

$$Aw := \bigl( (x_1|w), (x_2|w), \dots, (x_m|w) \bigr),$$

we are again seeking a minimizer of $C : X \to \mathbb{R}$,

$$C(w) := |y - Aw|^2, \qquad w \in X.$$

Theorem 4.42 tells us that the set of minimizers is nonempty, it is a translate of $\ker A$ and the unique minimizer of $C$ in $\ker A^\perp = \operatorname{Im} A^*$ is $\overline{w} := A^\dagger y$. Notice that

$$A^*a = \sum_{i=1}^m a^i x_i, \qquad a = (a^1, a^2, \dots, a^m) \in \mathbb{R}^m,$$

hence, $w \in \operatorname{Im} A^* = \ker A^\perp$ if and only if $w$ is a linear combination of $x_1, x_2, \dots, x_m$. We therefore conclude that $A^\dagger y$ is the unique minimizer of the cost function $C$ that is a linear combination of $x_1, x_2, \dots, x_m$. The corresponding linear map $L(x) := (x|A^\dagger y)$ is called the function of linear regression.
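To make the construction concrete, the following sketch fits a line through four made-up data points by solving the canonical equation $A^*(Aw - y) = 0$ in $X = \mathbb{R}^2$, with $x_i = (1, t_i)$. The data and the helper name `regression_vector` are assumptions for illustration; the helper assumes the $x_i$ span $\mathbb{R}^2$, so the minimizer is unique.

```python
from fractions import Fraction


def regression_vector(xs, ys):
    """Solve A^T A w = A^T y, where row i of A is the vector x_i in R^2."""
    # Entries of the Gram matrix G = A^T A and of the right-hand side b = A^T y.
    g11 = sum(x[0] * x[0] for x in xs)
    g12 = sum(x[0] * x[1] for x in xs)
    g22 = sum(x[1] * x[1] for x in xs)
    b1 = sum(x[0] * y for x, y in zip(xs, ys))
    b2 = sum(x[1] * y for x, y in zip(xs, ys))
    det = g11 * g22 - g12 * g12
    # Cramer's rule for the 2 x 2 system G w = b.
    return ((b1 * g22 - g12 * b2) / det, (g11 * b2 - g12 * b1) / det)


# Hypothetical data: x_i = (1, t_i), so L(x_i) = w_0 + w_1 t_i is an affine fit in t.
xs = [(Fraction(1), Fraction(t)) for t in (0, 1, 2, 3)]
ys = [Fraction(v) for v in (1, 3, 4, 8)]
w = regression_vector(xs, ys)
residuals = [y - (x[0] * w[0] + x[1] * w[1]) for x, y in zip(xs, ys)]
```

The residual vector $y - Aw$ comes out orthogonal to both columns of $A$, which is claim (ii) of Theorem 4.42.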

4.2.2 Trigonometric polynomials

Let us reconsider in the more abstract setting of vector spaces some of the results about trigonometric polynomials, see e.g., Section 5.4.1 of [GM2]. Let $\mathcal{P}_{n,2\pi}$ be the class of trigonometric polynomials of degree $n$ with complex coefficients

$$\mathcal{P}_{n,2\pi} := \Bigl\{ P(x) = \sum_{k=-n}^{n} c_k e^{ikx} \;\Big|\; c_k \in \mathbb{C},\ k = -n, \dots, n \Bigr\}.$$

Recall that the vector $(c_{-n}, \dots, c_n) \in \mathbb{C}^{2n+1}$ is called the spectrum of $P(x) = \sum_{k=-n}^{n} c_k e^{ikx}$. Clearly, $\mathcal{P}_{n,2\pi}$ is a vector space over $\mathbb{C}$ of dimension at most $2n + 1$. The function $(P|Q) : \mathcal{P}_{n,2\pi} \times \mathcal{P}_{n,2\pi} \to \mathbb{C}$ defined by

$$(P|Q) := \frac{1}{2\pi} \int_{-\pi}^{\pi} P(t)\, \overline{Q(t)}\, dt$$

is a Hermitian product on $\mathcal{P}_{n,2\pi}$ that makes $\mathcal{P}_{n,2\pi}$ a Hermitian space. Since

$$(e^{ikx}|e^{ihx}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(k-h)x}\, dx = \delta_{hk},$$

see Lemma 5.45 of [GM2], we have the following.

4.45 Proposition. The trigonometric polynomials $\{e^{ikx}\}_{k=-n,\dots,n}$ form an orthonormal set of $2n + 1$ vectors in $\mathcal{P}_{n,2\pi}$ and we have the following.
(i) $\mathcal{P}_{n,2\pi}$ is a Hermitian space of dimension $2n + 1$.
(ii) The map $\widehat{\ } : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}$, that maps a trigonometric polynomial to its spectrum, is well defined since it is the coordinate system in $\mathcal{P}_{n,2\pi}$ relative to the orthonormal basis $\{e^{ikx}\}$. In particular, $\widehat{\ } : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}$ is a (complex) isometry.
(iii) (FOURIER COEFFICIENTS) For $k = -n, \dots, n$ we have

$$c_k = (P|e^{ikx}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} P(t) e^{-ikt}\, dt.$$

(iv) (ENERGY IDENTITY)

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} |P(t)|^2\, dt = \|P\|^2 := (P|P) = \sum_{k=-n}^{n} |(P|e^{ikx})|^2 = \sum_{k=-n}^{n} |c_k|^2.$$

a. Spectrum and products

Let $P(x) = \sum_{k=-n}^{n} c_k e^{ikx}$ and $Q(x) = \sum_{k=-n}^{n} d_k e^{ikx}$ be two trigonometric polynomials of order $n$. Their product is the trigonometric polynomial of order $2n$

$$P(x)Q(x) = \sum_{h=-n}^{n} c_h e^{ihx} \sum_{k=-n}^{n} d_k e^{ikx} = \sum_{h,k=-n}^{n} c_h d_k e^{i(h+k)x} = \sum_{p=-2n}^{2n} \Bigl( \sum_{h+k=p} c_h d_k \Bigr) e^{ipx}.$$

If we denote by $\{c_k\} * \{d_k\}$ the product in the sense of Cauchy of the spectra of $P$ and $Q$, we can state the following.

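4.46 Proposition. The spectrum of $P(x)Q(x)$ is the product in the sense of Cauchy of the spectra of $P$ and $Q$,

$$\widehat{(PQ)}_k = \widehat{P}_k * \widehat{Q}_k.$$

4.47 Definition. The convolution product of $P$ and $Q$ is defined by

$$P * Q(x) := \frac{1}{2\pi} \int_{-\pi}^{\pi} P(x + t)\, \overline{Q(t)}\, dt.$$

Notice that the operation $(P, Q) \to P * Q$ is linear in the first factor and antilinear in the second one. We have

4.48 Proposition. $P * Q$ is a trigonometric polynomial of degree $n$. Moreover the spectrum of $P * Q$ is the term-by-term product of the spectra of $P$ and $\overline{Q}$,

$$\widehat{(P * Q)}_k := \widehat{P}_k\, \overline{\widehat{Q}_k}.$$

Proof. In fact for $h, k = -n, \dots, n$, we have

$$e^{ihx} * e^{ikx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ih(x+t)} e^{-ikt}\, dt = e^{ihx}\, \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(h-k)t}\, dt = \delta_{hk}\, e^{ihx},$$

hence, if $P(x) = \sum_{k=-n}^{n} c_k e^{ikx}$ and $Q(x) = \sum_{k=-n}^{n} d_k e^{ikx}$, we have

$$P * Q(x) = \sum_{h=-n}^{n} \sum_{k=-n}^{n} c_h \overline{d_k}\, \delta_{hk}\, e^{ihx} = \sum_{k=-n}^{n} c_k \overline{d_k}\, e^{ikx}.$$

□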
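Both statements about spectra can be checked numerically. The sketch below uses made-up spectra for $n = 1$ and assumes the convolution carries a complex conjugate on the second factor, as its antilinearity suggests. The integral is replaced by an equispaced average, which is exact (up to rounding) for trigonometric polynomials of degree at most $2n$ when at least $2n + 1$ points are used. All function names are illustrative.

```python
import cmath


def evaluate(spectrum, x):
    """Evaluate sum_k c_k e^{ikx}; the list stores c_{-n}, ..., c_n in this order."""
    n = (len(spectrum) - 1) // 2
    return sum(c * cmath.exp(1j * (idx - n) * x) for idx, c in enumerate(spectrum))


def cauchy_product(c, d):
    """Spectrum of the pointwise product: entry p collects c_h d_k with h + k = p."""
    out = [0j] * (len(c) + len(d) - 1)
    for h, ch in enumerate(c):
        for k, dk in enumerate(d):
            out[h + k] += ch * dk
    return out


def convolution(c, d, x):
    """(1/2pi) * integral of P(x+t) conj(Q(t)) dt via an equispaced average in t."""
    n = (len(c) - 1) // 2
    N = 2 * (2 * n + 1)  # more than 2n points, so the average is exact
    pts = [2 * cmath.pi * j / N for j in range(N)]
    return sum(evaluate(c, x + t) * evaluate(d, t).conjugate() for t in pts) / N


c = [1 + 2j, 0.5, -1j]  # c_{-1}, c_0, c_1 (made-up values)
d = [2, 1j, 3 - 1j]     # d_{-1}, d_0, d_1 (made-up values)
x0 = 0.7
product_gap = abs(evaluate(c, x0) * evaluate(d, x0) - evaluate(cauchy_product(c, d), x0))
termwise = [ck * complex(dk).conjugate() for ck, dk in zip(c, d)]
convolution_gap = abs(convolution(c, d, x0) - evaluate(termwise, x0))
```

Both gaps should be at rounding level: multiplying polynomials convolves their spectra, and convolving polynomials multiplies their spectra term by term.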

b. Sampling of trigonometric polynomials

A trigonometric polynomial of degree $n$ can be reconstructed from its values on a suitable choice of $2n + 1$ points, see Section 5.4.1 of [GM2]. Set $x_j := \frac{2\pi}{2n+1} j$, $j = -n, \dots, n$; then the sampling map

$$C : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}, \qquad C(P) := (P(x_{-n}), \dots, P(x_n))$$

is invertible, in fact, see Theorem 5.49 of [GM2],

$$P(x) = \frac{1}{2n+1} \sum_{j=-n}^{n} P(x_j)\, D_n(x - x_j)$$

where $D_n(t)$ is the Dirichlet kernel of order $n$

$$D_n(t) := \sum_{k=-n}^{n} e^{ikt} = 1 + 2 \sum_{k=1}^{n} \cos kt.$$

[Figure 4.1. The scenario of trigonometric polynomials, spectra and samples.]

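4.49 Proposition. $\frac{1}{\sqrt{2n+1}}\, C : \mathcal{P}_{n,2\pi} \to \mathbb{C}^{2n+1}$ and its inverse $\sqrt{2n+1}\, C^{-1} : \mathbb{C}^{2n+1} \to \mathcal{P}_{n,2\pi}$ given by

$$\sqrt{2n+1}\, C^{-1}(z)(x) := \frac{1}{\sqrt{2n+1}} \sum_{j=-n}^{n} z_j D_n(x - x_j)$$

are isometries between $\mathcal{P}_{n,2\pi}$ and $\mathbb{C}^{2n+1}$.

Proof. In fact, $\frac{1}{\sqrt{2n+1}}\, C$ maps $e^{ikt}$, $k = -n, \dots, n$, to an orthonormal basis of $\mathbb{C}^{2n+1}$:

$$\frac{1}{2n+1} \bigl( C(e^{ikx}) \,\big|\, C(e^{ihx}) \bigr) = \frac{1}{2n+1} \sum_{j=-n}^{n} e^{i(k-h)x_j} = \delta_{hk}.$$

□

From the samples, we can directly compute the spectrum of $P$.

4.50 Proposition. Let $P(x) \in \mathcal{P}_{n,2\pi}$ and $x_j := \frac{2\pi}{2n+1} j$, $j = -n, \dots, n$. Then

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} P(t) e^{-ikt}\, dt = \frac{1}{2n+1} \sum_{j=-n}^{n} P(x_j) e^{-ikx_j}. \tag{4.13}$$

Proof. Since (4.13) is linear in $P$, it suffices to prove it when $P(x) = e^{ihx}$, $h = -n, \dots, n$. In this case, we have $\frac{1}{2\pi} \int_{-\pi}^{\pi} P(t) e^{-ikt}\, dt = \delta_{hk}$ and

$$\frac{1}{2n+1} \sum_{j=-n}^{n} P(x_j) e^{-ikx_j} = \frac{1}{2n+1} \sum_{j=-n}^{n} e^{i(h-k)x_j} = \frac{1}{2n+1} D_n(x_{h-k}) = \delta_{hk},$$

since $D_n(x_j) = 0$ for $j \neq 0$, $j \in [-n, n]$, and $D_n(0) = 2n + 1$. □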
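Formula (4.13) says that the Fourier coefficients can be read off from $2n + 1$ equispaced samples with no integration. A minimal sketch, with a made-up spectrum for $n = 2$ and illustrative function names:

```python
import cmath


def sample_points(n):
    """The 2n + 1 points x_j = 2 pi j / (2n + 1), j = -n, ..., n."""
    return [2 * cmath.pi * j / (2 * n + 1) for j in range(-n, n + 1)]


def spectrum_from_samples(values, n):
    """Apply (4.13): c_k = (1/(2n+1)) * sum_j P(x_j) e^{-i k x_j}, k = -n, ..., n."""
    xs = sample_points(n)
    return [
        sum(v * cmath.exp(-1j * k * x) for v, x in zip(values, xs)) / (2 * n + 1)
        for k in range(-n, n + 1)
    ]


n = 2
true_spectrum = [1j, -2, 0.5, 3, 1 - 1j]  # c_{-2}, ..., c_2 (made-up values)
# Sample P(x) = sum_k c_k e^{ikx} at the points x_j.
values = [
    sum(c * cmath.exp(1j * k * x) for k, c in zip(range(-n, n + 1), true_spectrum))
    for x in sample_points(n)
]
recovered = spectrum_from_samples(values, n)
max_error = max(abs(a - b) for a, b in zip(recovered, true_spectrum))
```

The recovered coefficients agree with the original ones up to rounding, which is the invertibility of the sampling map in coordinates.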


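c. The discrete Fourier transform

The relation between the values $\{P(t_j)\}$ of $P \in \mathcal{P}_{n,2\pi}$ at the $2n + 1$ points $t_j$ and the spectrum $\widehat{P}$ of $P$ in the previous paragraph is a special case of the so-called discrete Fourier transform. For each positive integer $N$, consider the $2\pi$-periodic function $E_N(t) : \mathbb{R} \to \mathbb{C}$ given by

$$E_N(t) := \sum_{k=0}^{N-1} e^{ikt} = \begin{cases} N & \text{if } t \text{ is a multiple of } 2\pi, \\[1mm] \dfrac{1 - e^{iNt}}{1 - e^{it}} & \text{otherwise.} \end{cases} \tag{4.14}$$

Let $\omega = e^{i\frac{2\pi}{N}}$ and let $1, \omega, \omega^2, \dots, \omega^{N-1}$ be the $N$th roots of one. For $h \in \mathbb{Z}$ we have

$$\frac{1}{N} \sum_{k=0}^{N-1} \omega^{hk} = \begin{cases} 1 & \text{if } h \text{ is a multiple of } N, \\ 0 & \text{otherwise,} \end{cases} \tag{4.15}$$

in particular,

$$\frac{1}{N} \sum_{k=0}^{N-1} \omega^{hk} = \delta_{0h} \qquad \text{if } -N < h < N. \tag{4.16}$$

The discrete Fourier transform of order $N$, $\mathrm{DFT}_N : \mathbb{C}^N \to \mathbb{C}^N$, is defined by $\mathrm{DFT}_N(y) := \mathbf{U}y$ rows by column, where

$$\mathbf{U} = [U^i_j], \qquad U^i_j := \frac{1}{N} \omega^{-ij} \qquad \forall i, j = 0, \dots, N-1.$$

The inverse discrete Fourier transform of order $N$, $\mathrm{IDFT}_N : \mathbb{C}^N \to \mathbb{C}^N$, is defined by $\mathrm{IDFT}_N(z) := \mathbf{V}z$ where

$$\mathbf{V} = [V^i_j], \qquad V^i_j = \omega^{ij} \qquad \forall i, j = 0, \dots, N-1.$$

4.51 Proposition. $\mathrm{IDFT}_N$ is the inverse of $\mathrm{DFT}_N$. Moreover, the operators $\sqrt{N}\, \mathrm{DFT}_N$ and $\frac{1}{\sqrt{N}}\, \mathrm{IDFT}_N$ are isometries of $\mathbb{C}^N$.

Proof. In fact, by (4.16)

$$\sum_{k=0}^{N-1} V^i_k U^k_j = \frac{1}{N} \sum_{k=0}^{N-1} \omega^{(i-j)k} = \delta_{ij},$$

i.e., $\mathbf{V} = \mathbf{U}^{-1}$ and, by the definition of $\mathbf{U}$ and $\mathbf{V}$, $\overline{\mathbf{U}}^T = \frac{1}{N} \mathbf{V}$, hence $\overline{(\sqrt{N}\, \mathbf{U})}^T = \frac{1}{\sqrt{N}} \mathbf{V} = (\sqrt{N}\, \mathbf{U})^{-1}$. □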
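The definitions of $\mathrm{DFT}_N$ and $\mathrm{IDFT}_N$ translate directly into code. The sketch below assumes the normalization stated above (factor $1/N$ on the direct transform, none on the inverse) and uses the plain $N^2$ matrix-vector product; the test vector is made up.

```python
import cmath


def dft(y):
    """DFT_N(y)_i = (1/N) * sum_j omega^{-ij} y_j, with omega = e^{2 pi i / N}."""
    N = len(y)
    return [
        sum(cmath.exp(-2j * cmath.pi * i * j / N) * y[j] for j in range(N)) / N
        for i in range(N)
    ]


def idft(z):
    """IDFT_N(z)_i = sum_j omega^{ij} z_j."""
    N = len(z)
    return [
        sum(cmath.exp(2j * cmath.pi * i * j / N) * z[j] for j in range(N))
        for i in range(N)
    ]


def norm(v):
    return sum(abs(t) ** 2 for t in v) ** 0.5


y = [1, 2j, -1.5, 0.25 + 1j, 3]  # made-up data, N = 5
roundtrip_error = max(abs(a - b) for a, b in zip(idft(dft(y)), y))
# sqrt(N) * DFT_N should preserve the Euclidean norm of C^N.
isometry_gap = abs(len(y) ** 0.5 * norm(dft(y)) - norm(y))
```

Many numerical libraries place the factor $1/N$ on the inverse transform instead, so results may differ from theirs by this scaling.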


Notice that, according to their definitions, we need $N^2$ multiplications to compute $\mathrm{DFT}_N$ (or $\mathrm{IDFT}_N$). There is an algorithm, that we shall not describe here, called the Fast Fourier Transform that, using the redundancy of some multiplications, computes $\mathrm{DFT}_N$ (or $\mathrm{IDFT}_N$) with a number of multiplications of order $O(N \log N)$.

Let $P(t) = \sum_{k=-n}^{n} c_k e^{ikt} \in \mathcal{P}_{n,2\pi}$ and let $N \ge 2n + 1$. A computation similar to the one in Proposition 4.50 shows that

$$\widehat{P}_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} P(t) e^{-ikt}\, dt = \frac{1}{N} \sum_{j=0}^{N-1} P(x_j) e^{-ikx_j} = (\mathrm{DFT}_N\, y)_k \tag{4.17}$$

where $y := (P(x_0), \dots, P(x_{N-1}))$ and $x_j := \frac{2\pi}{N} j$, $-N < j < N$; for negative $k$ the index of $\mathrm{DFT}_N\, y$ is understood modulo $N$. Thus the spectrum of $P$ is the $\mathrm{DFT}_N$ of its values at $x_j$ if $n < N/2$. On the other hand, if $z := (z_0, \dots, z_{N-1})$ is the vector defined by

$$z_k := \begin{cases} \widehat{P}_k & \text{if } 0 \le k \le n, \\ 0 & \text{if } n < k < N - n, \\ \widehat{P}_{k-N} & \text{if } N - n \le k < N, \end{cases}$$

and we recall that $\mathrm{IDFT}_N$ is the inverse of $\mathrm{DFT}_N$, we have

$$P(x_j) := (\mathrm{IDFT}_N\, z)_j,$$

i.e., the values of $P$ at $x_j$ are the $\mathrm{IDFT}_N$ of the spectrum of $P$.
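The passage from the spectrum to the samples can be sketched as follows. The convention assumed here, which is the usual one, stores the coefficients of nonnegative frequencies first and those of negative frequencies $k$ at position $k + N$; the helper `pad_spectrum` and the data are hypothetical.

```python
import cmath


def pad_spectrum(spectrum, N):
    """Arrange c_{-n}, ..., c_n as z_0, ..., z_{N-1}, with c_k stored at index k mod N."""
    n = (len(spectrum) - 1) // 2
    if N < 2 * n + 1:
        raise ValueError("need N >= 2n + 1 sampling points")
    z = [0j] * N
    for k in range(-n, n + 1):
        z[k % N] = spectrum[k + n]  # negative k lands at index k + N
    return z


def idft(z):
    """IDFT_N(z)_j = sum_k omega^{jk} z_k, with omega = e^{2 pi i / N}."""
    N = len(z)
    return [
        sum(cmath.exp(2j * cmath.pi * j * k / N) * z[k] for k in range(N))
        for j in range(N)
    ]


spectrum = [2, 1j, -1]  # c_{-1}, c_0, c_1 (made-up values)
N = 5
samples = idft(pad_spectrum(spectrum, N))
# Direct evaluation of P at x_j = 2 pi j / N for comparison.
direct = [
    sum(c * cmath.exp(1j * k * 2 * cmath.pi * j / N) for k, c in zip((-1, 0, 1), spectrum))
    for j in range(N)
]
gap = max(abs(a - b) for a, b in zip(samples, direct))
```

The agreement rests on $\omega^{j(k+N)} = \omega^{jk}$, which is why negative frequencies may be moved to the end of the vector.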

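4.52 Frequency spectrum. In applications, the $\mathrm{DFT}_N$ and $\mathrm{IDFT}_N$ may appear slightly differently. If $f$ is a $T_0$-periodic function, one lets $T := T_0/N$ be the period of sampling, so that $t_j := \frac{T_0}{N} j = jT$, $j = 0, 1, \dots, N-1$, are the sampling points and $\mathrm{DFT}_N$ produces the sequence

$$c_k := \frac{1}{N} \sum_{j=0}^{N-1} f(jT)\, e^{-i\frac{2\pi}{N} jk}.$$

In other words, the values of $\{c_k\}$ are regarded as the values of the component of frequency $\nu_k := \frac{k}{T_0} = \frac{k}{NT}$, i.e., as the samples of the so-called frequency spectrum $\widehat{f} : \mathbb{R} \to \mathbb{C}$ of $f$, defined by

$$\widehat{f}(\nu) := \begin{cases} c_k & \text{if } \nu = \nu_k, \\ 0 & \text{otherwise.} \end{cases}$$

The discrete Fourier transform and its inverse then rewrite as

$$\widehat{f}\Bigl(\frac{k}{NT}\Bigr) = \frac{1}{N} \sum_{j=0}^{N-1} f(jT)\, e^{-i\frac{2\pi}{N} jk}, \qquad f(jT) = \sum_{k=0}^{N-1} \widehat{f}\Bigl(\frac{k}{NT}\Bigr)\, e^{i\frac{2\pi}{N} jk}.$$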

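4.2.3 Systems of difference equations

Linear difference equations of first and second order are discussed e.g., in [GM2]. Here we shall discuss systems of linear difference equations.

a. Systems of linear difference equations

First let us consider systems of first order. Let $\mathbf{A} \in M_{k,k}(\mathbb{C})$. The homogeneous linear recurrence for the sequence $\{X_n\}$ in $\mathbb{C}^k$

$$\begin{cases} X_{n+1} = \mathbf{A}X_n, \quad n \ge 0, \\ X_0 \text{ given} \end{cases}$$

has the unique solution $X_n := \mathbf{A}^n X_0$ $\forall n$, as one can easily check.

4.53 Proposition. Given $\{F_n\}$ in $\mathbb{C}^k$, the recurrence

$$\begin{cases} X_{n+1} = \mathbf{A}X_n + F_{n+1}, \quad n \ge 0, \\ X_0 \text{ given} \end{cases}$$

has the solution

$$X_n := \mathbf{A}^n X_0 + \sum_{j=0}^{n} \mathbf{A}^{n-j} F_j \qquad \forall n \ge 0,$$

where we assume $F_0 := 0$.

Proof. In fact, for $n \ge 0$ we have

$$X_{n+1} = \mathbf{A}^{n+1} X_0 + \sum_{j=0}^{n+1} \mathbf{A}^{n+1-j} F_j = \mathbf{A}\mathbf{A}^n X_0 + \mathbf{A} \sum_{j=0}^{n} \mathbf{A}^{n-j} F_j + F_{n+1} = \mathbf{A} \Bigl( \mathbf{A}^n X_0 + \sum_{j=0}^{n} \mathbf{A}^{n-j} F_j \Bigr) + F_{n+1} = \mathbf{A}X_n + F_{n+1}.$$

□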
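The formula of Proposition 4.53 can be compared with direct iteration of the recurrence. The sketch below uses a made-up integer matrix and forcing terms so that the comparison is exact; the function names are illustrative.

```python
def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def apply_power(A, p, x):
    """Return A^p x by p successive multiplications."""
    for _ in range(p):
        x = matvec(A, x)
    return x


def closed_form(A, X0, F, n):
    """X_n = A^n X_0 + sum_{j=0}^{n} A^{n-j} F_j, where F[0] is the zero vector."""
    total = apply_power(A, n, X0)
    for j in range(1, n + 1):
        term = apply_power(A, n - j, F[j])
        total = [t + s for t, s in zip(total, term)]
    return total


def iterate(A, X0, F, n):
    """Run X_{m+1} = A X_m + F_{m+1} for m = 0, ..., n - 1."""
    X = X0
    for m in range(n):
        X = [a + b for a, b in zip(matvec(A, X), F[m + 1])]
    return X


A = [[1, 1], [0, 2]]  # made-up data
X0 = [1, -1]
F = [[0, 0], [1, 0], [0, 1], [2, 2], [-1, 3]]
agree = all(iterate(A, X0, F, n) == closed_form(A, X0, F, n) for n in range(5))
```

The closed form costs more than the iteration when evaluated naively; its interest is that it reduces the study of $X_n$ to that of the powers $\mathbf{A}^n$.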


4.54 Higher order linear difference equations. Every equation Xn-[-k H- dk-lXn-^k-l

H

h Go^n = /n+1,

n > 0

(4.18)

can be transformed into afcx A: system of difference equations of first order. In fact, if Xn := (Xn,Xn+l,...,Xn+fc-l)^ G C'',

Fn:=(0,0,...,0,/n^€C^ and A is the k x k matrix / 0 0

1 0

0 1

0 0

\ (4.19)

A:= 0 -ao

0 0 —oi —a2

-flfe-l

/

it is easily seen that ^n+1 — AX„ + Fn-\-l

(4.20)

and conversely, if {Xn} solves (4.20), then {xn}, Xn '-= X^ Vn, solves (4.18). In this way the theory of higher order linear difference equations is subsumed to that of first order systems. In this respect, one computes for the matrix A in (4.19) that k-i

det(AId - A) = A'^ + ^ a j X ^ . j=0

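This polynomial in $\lambda$ is called the characteristic polynomial of the difference equation (4.18).

b. Power of a matrix

Let us compute the power of $A$ in an efficient way. To do this we remark the following.

(i) If $B$ is similar to $A$, $A = S^{-1}BS$ for some $S$ with $\det S \ne 0$, then $A^2 = S^{-1}BSS^{-1}BS = S^{-1}B^2S$ and, by induction, $A^n = S^{-1}B^nS$ $\forall n$.

(ii) If $B$ is a block matrix with square blocks in the principal diagonal,

$$B = \begin{pmatrix} B_1 & 0 & \dots & 0\\ 0 & B_2 & \dots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \dots & B_k \end{pmatrix}, \qquad \text{then} \qquad B^n = \begin{pmatrix} B_1^n & 0 & \dots & 0\\ 0 & B_2^n & \dots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \dots & B_k^n \end{pmatrix}.$$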

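Let $\lambda_1, \lambda_2, \dots, \lambda_k$ be the distinct eigenvalues of $A$ with multiplicities $m_1, m_2, \dots, m_k$. For every $i$, let $p_i$ be the dimension of the eigenspace relative to $\lambda_i$ (the geometric multiplicity). Then, see Theorem 2.65, there exists a nonsingular matrix $S \in M_{k,k}(\mathbb{C})$ such that $J := S^{-1}AS$ has the Jordan form

$$J = \begin{pmatrix} J_{1,1} & 0 & \dots & 0\\ 0 & J_{1,2} & \dots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \dots & J_{k,p_k} \end{pmatrix}$$

where $i = 1, \dots, k$, $j = 1, \dots, p_i$ and

$$J_{i,j} = \begin{cases} (\lambda_i) & \text{if } J_{i,j} \text{ has dimension } 1,\\[4pt] \begin{pmatrix} \lambda_i & 1 & 0 & \dots & 0\\ 0 & \lambda_i & 1 & \dots & 0\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & 0 & \dots & \lambda_i & 1\\ 0 & 0 & \dots & 0 & \lambda_i \end{pmatrix} & \text{otherwise.} \end{cases}$$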

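Consequently $A^n = SJ^nS^{-1}$, and

$$J^n = \begin{pmatrix} J_{1,1}^n & 0 & \dots & 0\\ 0 & J_{1,2}^n & \dots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \dots & J_{k,p_k}^n \end{pmatrix}.$$

It remains to compute the power of each Jordan block. If $J' = J_{i,j} = (\lambda)$ has dimension one, then $J'^n = \lambda^n$. If instead $J' = J_{i,j}$ is a block of dimension $q$ at least two,

$$J' = \begin{pmatrix} \lambda & 1 & 0 & \dots & 0\\ 0 & \lambda & 1 & \dots & 0\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & 0 & \dots & \lambda & 1\\ 0 & 0 & \dots & 0 & \lambda \end{pmatrix},$$

then $J' = \lambda\,\mathrm{Id} + B$, $B^i_j := \delta_{i+1,j}$. Since

$$(B^r)^i_j = \begin{cases} \delta_{i+r,j} & \text{if } r < q,\\ 0 & \text{if } r \ge q, \end{cases}$$

we have $B^q = 0$. Since $\lambda\,\mathrm{Id}$ and $B$ commute, Newton's binomial formula yields

$$J'^n = (\lambda\,\mathrm{Id} + B)^n = \sum_{j=0}^{q-1} \binom{n}{j}\lambda^{n-j}B^j$$

(with the convention $\binom{n}{j} = 0$ for $j > n$), i.e.,

$$J'^n = \begin{pmatrix} \lambda^n & \binom{n}{1}\lambda^{n-1} & \binom{n}{2}\lambda^{n-2} & \dots & \binom{n}{q-1}\lambda^{n-q+1}\\ 0 & \lambda^n & \binom{n}{1}\lambda^{n-1} & \dots & \binom{n}{q-2}\lambda^{n-q+2}\\ \vdots & & \ddots & \ddots & \vdots\\ 0 & 0 & \dots & \lambda^n & \binom{n}{1}\lambda^{n-1}\\ 0 & 0 & \dots & 0 & \lambda^n \end{pmatrix}.$$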
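The binomial formula for the power of a Jordan block is easy to test on integer data. The sketch below compares repeated multiplication with the entrywise expression $\binom{n}{j-i}\lambda^{n-(j-i)}$; the function names are illustrative.

```python
from math import comb

def jordan_block(lam, q):
    """q x q block with lam on the diagonal and 1 on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(q)]
            for i in range(q)]

def mat_mul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def power_by_multiplication(block, n):
    q = len(block)
    result = [[int(i == j) for j in range(q)] for i in range(q)]
    for _ in range(n):
        result = mat_mul(result, block)
    return result

def power_by_formula(lam, q, n):
    """Entry (i, j) is C(n, j-i) * lam^(n-(j-i)) when 0 <= j-i <= n, and 0 otherwise."""
    result = [[0] * q for _ in range(q)]
    for i in range(q):
        for j in range(i, q):
            m = j - i
            if m <= n:
                result[i][j] = comb(n, m) * lam ** (n - m)
    return result
```

For instance `power_by_formula(3, 4, 5)` and `power_by_multiplication(jordan_block(3, 4), 5)` produce the same matrix; the case `lam = 0` reduces to the nilpotent matrix $B$, whose $n$th power has ones on the $n$th superdiagonal only.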

Notice that each element of $A^n = SJ^nS^{-1}$ has the form

$$\sum_{j=1}^{k} \lambda_j^n P_j(n)$$

where $\lambda_1, \lambda_2, \dots, \lambda_k$ are the eigenvalues of $A$ and $P_j(t)$ is a polynomial of degree at most $m_j - 1$, where $m_j$ is the algebraic multiplicity of $\lambda_j$. It follows that for $\rho > \max_i |\lambda_i|$ there is a constant $C_\rho$ such that every solution of $X_{n+1} = AX_n$ satisfies

$$|X_n| \le C_\rho\,\rho^n\,|X_0| \qquad \forall n.$$

In particular we have the following.

4.55 Theorem. If all eigenvalues of $A$ have modulus less than one, then every solution of $X_{n+1} = AX_n$ converges to zero as $n \to +\infty$.

Proof. Fix $\sigma > 0$ such that $\max_{i=1,\dots,k} |\lambda_i| < \sigma < 1$. As we have seen, there exists a constant $C_\sigma$ such that if $X_n$ is a solution of $X_{n+1} = AX_n$ $\forall n$, then

$$|X_n| \le C_\sigma\,\sigma^n\,|X_0| \qquad \forall n.$$

Since $0 < \sigma < 1$, $\sigma^n \to 0$, and the claim is proved. $\square$
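A small experiment illustrates both the estimate and Theorem 4.55. The sketch below iterates a $2 \times 2$ Jordan block with eigenvalue $1/2$; the matrix, the initial vector and the constants $\rho = 0.75$, $C = 2$ are illustrative choices, not values from the text.

```python
import math

def step(a, x):
    """One step of X_{n+1} = A X_n."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

def norms_along_orbit(a, x0, steps):
    """Euclidean norms |X_0|, |X_1|, ..., |X_steps|."""
    x = list(x0)
    out = [math.hypot(x[0], x[1])]
    for _ in range(steps):
        x = step(a, x)
        out.append(math.hypot(x[0], x[1]))
    return out

# Jordan block with the single eigenvalue 1/2 (modulus less than one).
A = [[0.5, 1.0], [0.0, 0.5]]
norms = norms_along_orbit(A, [0.0, 1.0], 60)
```

Here $X_n = (n\,2^{-(n-1)}, 2^{-n})$, so the norm first grows (`norms[1] > norms[0]`) because of the polynomial factor, and then decays geometrically. This is why the estimate needs a constant $C_\rho$ and a rate $\rho$ strictly larger than the largest modulus of the eigenvalues.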

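4.56 Example (Fibonacci numbers). Consider the sequence of Fibonacci numbers

$$\begin{cases} f_{n+2} = f_{n+1} + f_n, & n \ge 0,\\ f_0 = 0,\ f_1 = 1, \end{cases} \qquad (4.21)$$

that is given by

$$f_n = \frac{1}{\sqrt5}\bigg(\Big(\frac{1+\sqrt5}{2}\Big)^n - \Big(\frac{1-\sqrt5}{2}\Big)^n\bigg), \qquad (4.22)$$

see e.g., [GM2]. Let us find it again as an application of the above. Set $F_n := (f_n, f_{n+1})^T$; then

$$F_{n+1} = \begin{pmatrix} f_{n+1}\\ f_{n+2} \end{pmatrix} = \begin{pmatrix} f_{n+1}\\ f_n + f_{n+1} \end{pmatrix} = AF_n, \qquad F_0 = \begin{pmatrix} 0\\ 1 \end{pmatrix},$$

and $F_n = A^n \begin{pmatrix} 0\\ 1 \end{pmatrix}$, where

$$A = \begin{pmatrix} 0 & 1\\ 1 & 1 \end{pmatrix}.$$

The characteristic polynomial of $A$ is $\det(\lambda\,\mathrm{Id} - A) = \lambda(\lambda - 1) - 1$, hence $A$ has two distinct eigenvalues

$$\lambda := \frac{1+\sqrt5}{2}, \qquad \mu := \frac{1-\sqrt5}{2}.$$

An eigenvector relative to $\lambda$ is $(1, \lambda)$ and an eigenvector relative to $\mu$ is $(1, \mu)$. The matrix $A$ is diagonalizable as $A = S\Lambda S^{-1}$ where

$$S = \begin{pmatrix} 1 & 1\\ \lambda & \mu \end{pmatrix}, \qquad S^{-1} = \frac{1}{\lambda - \mu}\begin{pmatrix} -\mu & 1\\ \lambda & -1 \end{pmatrix}, \qquad \Lambda = \begin{pmatrix} \lambda & 0\\ 0 & \mu \end{pmatrix}.$$

It follows that

$$F_n = S\Lambda^nS^{-1}\begin{pmatrix} 0\\ 1 \end{pmatrix} = \frac{1}{\lambda - \mu}\begin{pmatrix} 1 & 1\\ \lambda & \mu \end{pmatrix}\begin{pmatrix} \lambda^n\\ -\mu^n \end{pmatrix}.$$

Consequently,

$$f_n = \frac{1}{\lambda - \mu}(\lambda^n - \mu^n) = \frac{1}{\sqrt5}\bigg(\Big(\frac{1+\sqrt5}{2}\Big)^n - \Big(\frac{1-\sqrt5}{2}\Big)^n\bigg).$$

4.2.4 An ODE system: small oscillations

Let $x_1, x_2, \dots, x_N$ be $N$ points in $\mathbb{R}^3$ carrying, respectively, nonzero masses $m_1, m_2, \dots, m_N$. Assume that each point exerts a force on the other points according to Hooke's law, i.e., the force exerted by the mass at $x_j$ on $x_i$ is proportional to the distance of $x_j$ from $x_i$ and directed along the line through $x_i$ and $x_j$.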

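By Newton's reaction law, the force $f_{ji}$ exerted by $x_i$ on $x_j$ is equal and opposite in direction to the force $f_{ij}$ exerted by $x_j$ on $x_i$, $f_{ji} = -f_{ij}$; consequently the elastic constants $k_{ij}$, $i \ne j$, satisfy the symmetry condition $k_{ij} = k_{ji}$. In conclusion, writing $f_{ij} = -k_{ij}(x_j - x_i)$, the total force exerted by the system on the mass at $x_i$ is

$$f_i = \sum_{\substack{j=1,N\\ j \ne i}} f_{ij} = -\sum_{\substack{j=1,N\\ j \ne i}} k_{ij}(x_j - x_i) = -\sum_{j=1}^{N} k_{ij}x_j,$$

where we set $k_{ii} := -\sum_{j=1,N,\ j \ne i} k_{ij}$. Newton's equation then takes the form

$$m_i x_i'' - f_i = 0, \qquad i = 1, \dots, N, \qquad (4.23)$$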

with the particularity that the $j$th component of the force depends only on the $j$th components of the positions. The system then splits into 3 systems of $N$ equations of second order, one for each coordinate. If we use the matrix notation, things simplify. Denote by $M := \mathrm{diag}\{m_1, m_2, \dots, m_N\}$ the positive diagonal matrix of masses, by $K := (k_{ij}) \in M_{N,N}(\mathbb{R})$ the symmetric matrix of elastic constants, and by $X^j(t) \in \mathbb{R}^N$ the $j$th coordinates of the points $x_1, \dots, x_N$,

$$X^j = (x_1^j, \dots, x_N^j), \qquad x_i =: (x_i^1, x_i^2, x_i^3),$$

i.e., the columns of the matrix

$$X(t) := [x_1(t)\,|\,x_2(t)\,|\,\dots\,|\,x_N(t)]^T \in M_{N,3}(\mathbb{R}).$$

Then (4.23) transforms into the system of equations

$$MX''(t) + KX(t) = 0 \qquad (4.24)$$

where the product is the product rows by columns and $X''(t)$ denotes the matrix of second derivatives of the entries of $X(t)$. Finally, the system (4.23) can be written as

$$X''(t) + M^{-1}KX(t) = 0, \qquad (4.25)$$

in the unknown $X : \mathbb{R} \to M_{N,3}(\mathbb{R})$. The matrix $M^{-1}K$ is self-adjoint with respect to the scalar product $(u|v)_M := (Mu|v)$ on $\mathbb{R}^N$, since $(M^{-1}Ku|v)_M = (Ku|v) = (u|Kv) = (u|M^{-1}Kv)_M$. Hence there are a basis $u_1, u_2, \dots, u_N$ of $\mathbb{R}^N$ made by eigenvectors of $M^{-1}K$ and real numbers $\lambda_1, \lambda_2, \dots, \lambda_N$ such that

$$(u_i|u_j)_M = \delta_{ij} \qquad \text{and} \qquad M^{-1}Ku_j = \lambda_j u_j;$$

notice that if all masses are equal, then $M^{-1}K$ is symmetric and $u_1, u_2, \dots, u_N$ are pairwise orthogonal also with respect to the Euclidean scalar product. Denoting by $P_j$ the projection operator onto $\mathrm{Span}\{u_j\}$, orthogonal with respect to $(\,|\,)_M$, we also have

$$\mathrm{Id} = \sum_{j=1}^{N} P_j, \qquad M^{-1}K = \sum_{j=1}^{N} \lambda_jP_j.$$

Thus, projecting (4.25) onto $\mathrm{Span}\{u_j\}$ we find

$$0 = P_j(0) = P_j(X'' + M^{-1}KX) = (P_jX)'' + \lambda_j(P_jX), \qquad \forall j = 1, \dots, N,$$

i.e., the system (4.25) splits into $N$ second order equations each in the unknowns of the matrix $P_jX(t)$. Since $K$ is positive, the eigenvalues $\lambda_j = (Ku_j|u_j)/(Mu_j|u_j)$ are positive, consequently each element of the matrix $P_jX(t)$ is a solution of the harmonic oscillator

$$y'' + \lambda_j y = 0,$$

hence

$$P_jX(t) = \cos(\sqrt{\lambda_j}\,t)\,P_jX(0) + \frac{\sin(\sqrt{\lambda_j}\,t)}{\sqrt{\lambda_j}}\,P_jX'(0).$$

In conclusion, since $\mathrm{Id} = \sum_{j=1}^{N} P_j$, we have

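$$X(t) = \sum_{j=1}^{N} P_jX(t) = \sum_{j=1}^{N}\Big(\cos(\sqrt{\lambda_j}\,t)\,P_jX(0) + \frac{\sin(\sqrt{\lambda_j}\,t)}{\sqrt{\lambda_j}}\,P_jX'(0)\Big). \qquad (4.26)$$

The numbers $\sqrt{\lambda_1}/(2\pi), \dots, \sqrt{\lambda_N}/(2\pi)$ are called the proper frequencies of the system. We may also use a functional notation

$$\cos(t\sqrt{A}) := \sum_{j=1}^{N} \cos(t\sqrt{\lambda_j})\,P_j, \qquad \frac{\sin(t\sqrt{A})}{\sqrt{A}} := \sum_{j=1}^{N} \frac{\sin(t\sqrt{\lambda_j})}{\sqrt{\lambda_j}}\,P_j,$$

and we can write (4.26) as

$$X(t) = \cos(t\sqrt{A})\,X(0) + \frac{\sin(t\sqrt{A})}{\sqrt{A}}\,X'(0),$$

where $A := M^{-1}K$.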
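Formula (4.26) can be tried on the smallest nontrivial case. The following sketch treats one coordinate of two points with masses $m_1 = 1$, $m_2 = 2$ and a hypothetical stiffness matrix $K$; for a $2 \times 2$ matrix with distinct eigenvalues the spectral projections are computed as $P_1 = (A - \lambda_2\mathrm{Id})/(\lambda_1 - \lambda_2)$ and $P_2 = (A - \lambda_1\mathrm{Id})/(\lambda_2 - \lambda_1)$, which avoids computing eigenvectors. All names and numerical values are illustrative assumptions.

```python
import math

def spectral_data(a):
    """Eigenvalues and spectral projections of a 2x2 matrix with two distinct real eigenvalues."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = math.sqrt(tr * tr - 4.0 * det)
    lams = ((tr - disc) / 2.0, (tr + disc) / 2.0)
    projections = []
    for lam, other in ((lams[0], lams[1]), (lams[1], lams[0])):
        scale = 1.0 / (lam - other)  # P = (A - other * Id) / (lam - other)
        projections.append([[(a[0][0] - other) * scale, a[0][1] * scale],
                            [a[1][0] * scale, (a[1][1] - other) * scale]])
    return lams, projections

def apply_matrix(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def position(a, x0, v0, t):
    """X(t) = sum_j cos(sqrt(lam_j) t) P_j X(0) + sin(sqrt(lam_j) t)/sqrt(lam_j) P_j X'(0)."""
    lams, projections = spectral_data(a)
    x = [0.0, 0.0]
    for lam, p in zip(lams, projections):
        w = math.sqrt(lam)
        px, pv = apply_matrix(p, x0), apply_matrix(p, v0)
        for i in range(2):
            x[i] += math.cos(w * t) * px[i] + math.sin(w * t) / w * pv[i]
    return x

m1, m2 = 1.0, 2.0
K = [[2.0, -1.0], [-1.0, 2.0]]            # hypothetical symmetric positive matrix
A = [[K[0][0] / m1, K[0][1] / m1],
     [K[1][0] / m2, K[1][1] / m2]]        # A = M^{-1} K
X0, V0 = [1.0, 0.0], [0.0, 0.5]
```

A finite-difference check of $X'' + AX = 0$ at some time $t$, together with $X(0) = X_0$, confirms that `position` solves the system. Note that $A$ is not symmetric here because the masses differ, yet its eigenvalues are real and positive.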

4.3 Exercises

4.57 ¶. Let $A$ be an $n \times n$ matrix and let $\lambda$ be its eigenvalue of greatest modulus. Show that $|\lambda| \le \sup_i\,(|a^i_1| + |a^i_2| + \dots + |a^i_n|)$.

4.58 ¶ Gram matrix. Let $\{f_1, f_2, \dots, f_m\}$ be $m$ vectors in $\mathbb{R}^n$. The matrix $G = [g_{ij}] \in M_{m,m}(\mathbb{R})$ defined by $g_{ij} = (f_i|f_j)$ is called Gram's matrix. Show that $G$ is nonnegative and it is positive if and only if $f_1, f_2, \dots, f_m$ are linearly independent.

4.59 ¶. Let $A, B : \mathbb{C}^n \to \mathbb{C}^n$ be self-adjoint and let $A$ be positive. Show that the eigenvalues of $A^{-1}B$ are real. Show also that $A^{-1}B$ is positive if $B$ is positive.

4.60 ¶. Let $A = [a^i_j] \in M_{n,n}(\mathbb{K})$ be self-adjoint and positive. Show that $\det A \le (\mathrm{tr}\,A/n)^n$ and deduce $\det A \le \prod_{i=1}^{n} a^i_i$. [Hint: Use the inequality between geometric and arithmetic means, see [GM1].]

4.61 ¶. Let $A \in M_{n,n}(\mathbb{K})$ and let $a_1, a_2, \dots, a_n \in \mathbb{K}^n$ be the columns of $A$. Prove Hadamard's formula $|\det A| \le \prod_{i=1}^{n} |a_i|$. [Hint: Consider $H = A^*A$.]

4.62 ¶. Let $A, B \in M_{n,n}(\mathbb{R})$ be symmetric and suppose that $A$ is positive. Then the number of positive, negative and zero eigenvalues, counted with their multiplicity, of $AB$ and of $B$ coincide.

4.63 ¶. Show that $\|N^*N\| = \|N\|^2$ if $N$ is normal.

4.64 ¶ Discrete Fourier transform. Let $T : \mathbb{C}^N \to \mathbb{C}^N$ be the cyclic forward shift operator $T((z_0, z_1, \dots, z_{N-1})) := (z_1, z_2, \dots, z_{N-1}, z_0)$. Show that
(i) $T$ is normal,
(ii) the $N$ eigenvalues of $T$ are the $N$th roots of 1,
(iii) the vectors

$$u_k := \frac{1}{\sqrt N}\big(1, \omega^k, \omega^{2k}, \dots, \omega^{k(N-1)}\big), \qquad \omega := e^{i\frac{2\pi}{N}}, \quad k = 0, \dots, N-1,$$

form an orthonormal basis of $\mathbb{C}^N$ of eigenvectors of $T$; finally the cosine directions $(z|u_k)$ of $z \in \mathbb{C}^N$ with respect to the basis $(u_0, \dots, u_{N-1})$ are given by the Discrete Fourier transform of $z$.

4.65 ¶. Let $A, B : X \to X$ be two self-adjoint operators on a Euclidean or Hermitian space. Suppose that all eigenvalues of $A - B$ are strictly positive. Order the eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$ of $A$ and $\mu_1, \mu_2, \dots, \mu_n$ of $B$ in a nondecreasing order. Show that $\lambda_i > \mu_i$ $\forall i = 1, \dots, n$. [Hint: Use the variational characterization of the eigenvalues.]

4.66 ¶. Let $A : X \to X$ be self-adjoint on a Euclidean or Hermitian space. Let $\lambda_1, \lambda_2, \dots, \lambda_n$ and $\mu_1, \mu_2, \dots, \mu_n$ be, respectively, the eigenvalues and the singular values of $A$ that we think of as ordered as $|\lambda_1| \le |\lambda_2| \le \dots \le |\lambda_n|$ and $\mu_1 \le \mu_2 \le \dots \le \mu_n$. Show that $|\lambda_i| = \mu_i$ $\forall i = 1, \dots, n$. [Hint: $A^*A = A^2$.]

4.67 ¶. Let $A : X \to X$ be a linear operator on a Euclidean or Hermitian space. Let $m$, $M$ be, respectively, the smallest and the greatest singular value of $A$. Show that $m \le |\lambda| \le M$ for any eigenvalue $\lambda$ of $A$.

4.68 ¶. Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces. Show that
(i) $(A^*A)^{1/2}$ maps $\ker A$ to $\{0\}$,
(ii) $(A^*A)^{1/2}$ is an isomorphism from $\ker A^\perp$ onto itself,
(iii) $(AA^*)^{1/2}$ is an isomorphism from $\mathrm{Im}\,A$ onto itself.

4.69 ¶. Let $A : X \to Y$ be a linear operator between two Euclidean or two Hermitian spaces. Let $(u_1, u_2, \dots, u_n)$ and $\mu_1, \mu_2, \dots, \mu_n$, $\mu_i \ge 0$, be such that $(u_1, u_2, \dots, u_n)$ is an orthonormal basis of $X$ and $(A^*A)^{1/2}x = \sum_i \mu_i(x|u_i)u_i$. Show that
(i) $(AA^*)^{1/2}y = \sum_{\mu_i \ne 0} \frac{1}{\mu_i}(y|Au_i)Au_i$ $\forall y \in Y$,
(ii) if $B$ denotes the restriction of $(A^*A)^{1/2}$ to $\ker A^\perp$, see Exercise 4.68, then

$$B^{-1}x = \sum_{\mu_i \ne 0} \frac{1}{\mu_i}(x|u_i)u_i \qquad \forall x \in \ker A^\perp,$$

(iii) if $C$ denotes the restriction of $(AA^*)^{1/2}$ to $\mathrm{Im}\,A$, see Exercise 4.68, then

$$C^{-1}y = \sum_{\mu_i \ne 0} \frac{1}{\mu_i^3}(y|Au_i)Au_i \qquad \forall y \in \mathrm{Im}\,A.$$

4.70 ¶. Let $A \in M_{N,n}(\mathbb{K})$, $N \ge n$, with $\mathrm{Rank}\,A = n$. Select $n$ vectors $u_1, u_2, \dots, u_n \in \mathbb{K}^n$ such that $Au_1, \dots, Au_n \in \mathbb{K}^N$ are orthonormal. [Hint: Find $U \in M_{n,n}(\mathbb{K})$ such that $AU$ is an isometry.]

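4.71 ¶. Let $A \in M_{N,n}(\mathbb{R})$ and $A = U\Delta V$, where $U \in O(N)$, $V \in O(n)$. According to 4.39, show that $A^+ = V^T\Delta'U^T$ where

$$\Delta' = \begin{pmatrix} \frac{1}{\mu_1} & 0 & \dots & 0 & \dots & 0\\ 0 & \frac{1}{\mu_2} & \dots & 0 & \dots & 0\\ \vdots & & \ddots & & & \vdots\\ 0 & 0 & \dots & \frac{1}{\mu_k} & \dots & 0\\ \vdots & & & & \ddots & \vdots\\ 0 & 0 & \dots & 0 & \dots & 0 \end{pmatrix},$$

$\mu_1, \mu_2, \dots, \mu_k$ being the nonzero singular values of $A$.

4.72 ¶. For $u : \mathbb{R} \to \mathbb{R}^2$, discuss the system of equations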

(Pu —-+ dt

2

. V-l

-l'

u = 0.

4.73 ¶. Let $A \in M_{n,n}(\mathbb{R})$ be a symmetric matrix. Discuss the following systems of ODEs

$$x'(t) + Ax(t) = 0, \qquad -i\,x'(t) + Ax(t) = 0, \qquad x''(t) + Ax(t) = 0 \ \text{ where } A \text{ is positive definite,}$$

and show that the solutions are given respectively by

$$e^{-tA}x(0), \qquad e^{-itA}x(0), \qquad \cos(t\sqrt{A})\,x(0) + \frac{\sin(t\sqrt{A})}{\sqrt{A}}\,x'(0).$$

4.74 ¶. Let $A$ be symmetric. Show that for the solutions of $x''(t) + Ax(t) = 0$ the energy is conserved. Assuming $A$ positive, show that $|x(t)| \le E/\lambda_1$ where $E$ is the energy of $x(t)$ and $\lambda_1$ the smallest eigenvalue of $A$.

4.75 ¶. Let $A$ be a Hermitian matrix. Show that $|x(t)| = \text{const}$ if $x(t)$ solves the Schrödinger equation $i\,x' + Ax = 0$.

Part II

Metrics and Topology

Felix Hausdorff (1869-1942), Maurice Fréchet (1878-1973) and René-Louis Baire (1874-1932).

5. Metric Spaces and Continuous Functions

The rethinking process of infinitesimal calculus, which was started with the definition of the limit of a sequence by Bernhard Bolzano (1781-1848) and Augustin-Louis Cauchy (1789-1857) at the beginning of the XIX century and was carried on with the introduction of the system of real numbers by Richard Dedekind (1831-1916) and Georg Cantor (1845-1918) and of the system of complex numbers, with the parallel development of the theory of functions by Camille Jordan (1838-1922), Karl Weierstrass (1815-1897), J. Henri Poincaré (1854-1912), G. F. Bernhard Riemann (1826-1866), Jacques Hadamard (1865-1963), Émile Borel (1871-1956), René-Louis Baire (1874-1932), Henri Lebesgue (1875-1941) during the whole of the XIX and beginning of the XX century, led to the introduction of new concepts such as open and closed sets, the point of accumulation and the compact set. These notions found their natural place and their correct generalization in the notion of a metric space, introduced by Maurice Fréchet (1878-1973) in 1906 and eventually developed by Felix Hausdorff (1869-1942) together with the more general notion of topological space.

The intuitive notion of a "continuous function" probably dates back to the classical age. It corresponds to the notion of deformation without "tearing". A function from X to Y is more or less considered to be continuous if, when x varies slightly, the target point y = f(x) also varies slightly. The critical analysis of this intuitive idea also led, with Bernhard Bolzano (1781-1848) and Augustin-Louis Cauchy (1789-1857), to the correct definition of continuity and the limit of a function and to the study of the properties of continuous functions. We owe the theorem of intermediate values to Bolzano and Cauchy; around 1860 Karl Weierstrass proved that continuous functions take on maximum and minimum values in a closed and bounded interval, and in 1870 Eduard Heine (1821-1881) studied uniform continuity.
The notion of a continuous function also appears in the work of J. Henri Poincare (1854-1912) in an apparently totally different context, in the so-called analysis situs, which is today's topology and algebraic topology. For Henri Poincare, analysis situs is the science that enables us to know the qualitative properties of geometrical figures. Poincare referred to the properties that are preserved when geometrical figures undergo any kind of deformation except those that introduce tearing and glueing of points. An intuitive idea for some of these aspects may be provided by the following examples.

Figure 5.1. Frontispieces of Les espaces abstraits by Maurice Fréchet (1878-1973) and of the Mengenlehre by Felix Hausdorff (1869-1942).

o Let us draw a disc on a rubber sheet. No matter how one pulls at the rubber sheet, without tearing it, the disc stays whole. Similarly, if one draws a ring, any way one pulls the rubber sheet without tearing or glueing any points, the central hole is preserved. Let us think of a loop of string that surrounds an infinite pole. In order to separate the string from the pole one has to break one of the two. Even more, if the string is wrapped several times around the pole, the linking number between string and pole is constant, regardless of the shape of the coils.
o We have already seen Euler's formula for polyhedra in [GM1]. It is a remarkable formula whose context is not classical geometry. It was Poincaré who extended it to all surfaces of the type of the sphere, i.e., surfaces that can be obtained as continuous deformations of a sphere without tearing or glueing.
o $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$ are clearly different objects as linear vector spaces. As we have seen, they have the same cardinality and are thus indistinguishable as sets. Therefore it is impossible to give meaning to the concept of dimension if one stays inside the theory of sets. One can show, instead, that their algebraic dimension is preserved by deformations without tearing or glueing.

At the core of this analysis of geometrical figures we have the notion of a continuous deformation, which corresponds to the notion of a continuous one-to-one map whose inverse is also continuous; such maps are called homeomorphisms. We have already discussed some relevant properties of continuous functions $f : \mathbb{R} \to \mathbb{R}$ and $f : \mathbb{R}^2 \to \mathbb{R}$ in [GM1] and [GM2]. Here we shall discuss continuity in a sufficiently general context, though not in the most general.

Poincaré himself was convinced of the enormous importance of extending the methods and ideas of his analysis situs to more than three dimensions.

... L'analysis situs à plus de trois dimensions présente des difficultés énormes; il faut pour tenter de les surmonter être bien persuadé de l'extrême importance de cette science. Si cette importance n'est pas bien comprise de tout le monde, c'est que tout le monde n'y a pas suffisamment réfléchi.¹

In the first twenty years of this century with the contribution, among others, of David Hilbert (1862-1943), Maurice Fréchet (1878-1973), Felix Hausdorff (1869-1942), Pavel Alexandroff (1896-1982) and Pavel Urysohn (1898-1924), the fundamental role of the notion of an open set in the study of continuity was made clear, and general topology was developed as the study of the properties of geometrical figures that are invariant with respect to homeomorphisms, thus linking back to Euler who, in 1736, had solved the famous problem of Königsberg's bridges with a topological method. There are innumerable successive applications, so much so that continuity and the structures related to it have become one of the most pervasive languages of mathematics. In this chapter and in the next, we shall discuss topological notions and continuity in the context of metric spaces.

5.1 Metric Spaces

5.1.1 Basic definitions

a. Metrics

5.1 Definition. Let $X$ be a set. A distance or metric on $X$ is a map $d : X \times X \to \mathbb{R}_+$ for which the following conditions hold:
(i) (IDENTITY) $d(x, y) > 0$ if $x \ne y \in X$, and $d(x, x) = 0$ $\forall x \in X$.
(ii) (SYMMETRY) $d(x, y) = d(y, x)$ $\forall x, y \in X$.
(iii) (TRIANGLE INEQUALITY) $d(x, y) \le d(x, z) + d(z, y)$ $\forall x, y, z \in X$.
A metric space is a set $X$ with a distance $d$. Formally we say that $(X, d)$ is a metric space if $X$ is a set and $d$ is a distance on $X$. The properties (i), (ii) and (iii) are often called metric axioms.

¹ The analysis situs in more than three dimensions presents enormous difficulties; in order to overcome them one has to be strongly convinced of the extreme importance of this science. If its importance is not well understood by everyone, it is because they have not sufficiently thought about it.


Figure 5.2. Time as distance.

5.2 Example. The Euclidean distance $d(x, y) := |x - y|$, $x, y \in \mathbb{R}$, is a distance on $\mathbb{R}$. On $\mathbb{R}^2$ and $\mathbb{R}^3$ distances are defined by the Euclidean distance, given for $n = 2, 3$ by

$$d(x, y) := \Big(\sum_{i=1}^{n} (x_i - y_i)^2\Big)^{1/2}$$

where $x := (x_1, x_2)$, $y := (y_1, y_2)$ if $n = 2$, or $x := (x_1, x_2, x_3)$, $y := (y_1, y_2, y_3)$ if $n = 3$. In other words, $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$ are metric spaces with the Euclidean distance.

5.3 Example. Imagine $\mathbb{R}^3$ as a union of strips $E_n := \{(x_1, x_2, x_3) \mid n \le x_3 < n + 1\}$, made by materials of different indices of refraction $\nu_n$. The time $t(A, B)$ needed for a light ray to go from $A$ to $B$ in $\mathbb{R}^3$ defines a distance on $\mathbb{R}^3$, see Figure 5.2.

5.4 Example. In the infinite cylinder $C = \{(x, y, z) \mid x^2 + y^2 = 1\} \subset \mathbb{R}^3$, we may define a distance between two points $P$ and $Q$ as the minimal length of the line on $C$, or geodesic, connecting $P$ and $Q$. Observe that we can always cut the cylinder along a directrix in such a way that the curve is not touched. If we unfold the cut cylinder to a plane, the distance between $P$ and $Q$ is the Euclidean distance of the two image points.

5.5 ¶. Of course $100\,|x - y|$ is also a distance on $\mathbb{R}$, only the scale factor has changed. More generally, if $f : \mathbb{R} \to \mathbb{R}$ is an injective map, then $d(x, y) := |f(x) - f(y)|$ is again a distance on $\mathbb{R}$.
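The metric axioms can be spot-checked on a finite sample of points. The following sketch is not a proof: it only tests the three axioms of Definition 5.1 on the chosen sample. It shows the role of injectivity in 5.5: with $f = \arctan$ the function $|f(x) - f(y)|$ passes, while with the non-injective $f(t) = t^2$ the identity axiom fails. The names `is_metric_on`, `scaled`, `via_atan`, `via_square` and the sample are illustrative.

```python
import itertools
import math

def is_metric_on(points, d, tol=1e-12):
    """Check identity, symmetry and the triangle inequality on a finite sample."""
    for x, y in itertools.product(points, repeat=2):
        if x == y and abs(d(x, y)) > tol:
            return False                      # d(x, x) must vanish
        if x != y and d(x, y) <= 0:
            return False                      # distinct points must be at positive distance
        if abs(d(x, y) - d(y, x)) > tol:
            return False                      # symmetry
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False                      # triangle inequality
    return True

sample = [-3.0, -1.0, 0.0, 1.0, 2.0, 10.0]

def scaled(x, y):
    return 100 * abs(x - y)

def via_atan(x, y):
    return abs(math.atan(x) - math.atan(y))   # f = arctan is injective

def via_square(x, y):
    return abs(x * x - y * y)                 # f(t) = t^2 is not injective
```

Here `via_square(-1.0, 1.0)` is zero although the two points differ, so the check rejects it.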

5.6 Definition. Let $(X, d)$ be a metric space. The open ball or spherical open neighborhood centered at $x_0 \in X$ of radius $\rho > 0$ is the set

$$B(x_0, \rho) := \{x \in X \mid d(x, x_0) < \rho\}.$$

Figure 5.3. Metrics on a cylinder and on the boundary of a cube.

Notice the strict inequality in the definition of $B(x_0, r)$. In $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^3$ with the Euclidean metric, $B(x_0, r)$ is, respectively, the open interval $]x_0 - r, x_0 + r[$, the open disc of center $x_0 \in \mathbb{R}^2$ and radius $r > 0$, and the ball bounded by the sphere of $\mathbb{R}^3$ of center $x_0 \in \mathbb{R}^3$ and radius $r > 0$. We say that a subset $E \subset X$ of a metric space is bounded if it is contained in some open ball. The diameter of $E \subset X$ is given by

$$\mathrm{diam}\,E := \sup\{d(x, y) \mid x, y \in E\},$$

and, trivially, $E$ is bounded iff $\mathrm{diam}\,E < +\infty$. Despite the suggestive language, the open balls of a metric space need be neither round nor convex; however they have some of the usual properties of discs in $\mathbb{R}^2$. For instance
(i) $B(x_0, r) \subset B(x_0, s)$ $\forall x_0 \in X$ and $0 < r \le s$,
(ii) $\cup_{r>0} B(x_0, r) = X$ $\forall x_0 \in X$,
(iii) $\cap_{r>0} B(x_0, r) = \{x_0\}$ $\forall x_0 \in X$,
(iv) $\forall x_0 \in X$ and $\forall z \in B(x_0, r)$ the open ball centered at $z$ and radius $\rho := r - d(x_0, z) > 0$ is contained in $B(x_0, r)$,
(v) for every couple of balls $B(x, r)$ and $B(y, s)$ with a nonvoid intersection and $\forall z \in B(x, r) \cap B(y, s)$, there exists $t > 0$ such that $B(z, t) \subset B(x, r) \cap B(y, s)$, in fact $t := \min(r - d(x, z), s - d(y, z))$,
(vi) for every $x, y \in X$ with $x \ne y$ the balls $B(x, r_1)$ and $B(y, r_2)$ are disjoint if $r_1 + r_2 \le d(x, y)$.

5.7 ¶. Prove the previous claims. Notice how essential the strict inequality in the definition of $B(x_0, \rho)$ is.

b. Convergence

A distance $d$ on a set $X$ allows us to define the notion of convergent sequence in $X$ in a natural way.

5.8 Definition. Let $(X, d)$ be a metric space. We say that the sequence $\{x_n\} \subset X$ converges to $x \in X$, and we write $x_n \to x$, if $d(x_n, x) \to 0$ in $\mathbb{R}$, that is, if for any $r > 0$ there exists $\bar n$ such that $d(x_n, x) < r$ for all $n \ge \bar n$.

The metric axioms at once yield that the basic facts we know for limits of sequences of real numbers also hold for limits of sequences in an arbitrary metric space. We have
(i) the limit of a sequence $\{x_n\}$ is unique, if it exists,
(ii) if $\{x_n\}$ converges, then $\{x_n\}$ is bounded,
(iii) computing the limit of $\{x_n\}$ consists in having a candidate $x \in X$ and then showing that the sequence of nonnegative real numbers $\{d(x_n, x)\}$ converges to zero,
(iv) if $x_n \to x$, then any subsequence of $\{x_n\}$ has the same limit $x$.


Thus, the choice of a distance on a given set X suffices to pass to the limit in X (in the sense specified by the metric d). However, given a set X, there is no distance on X that is reasonably absolute (even in R), but we may consider different distances in X. The corresponding convergences have different meanings and can be suited to treat specific problems. They all use the same general language, but the exact meaning of what convergence means is hidden in the definition of the distance. This flexibility makes the language of metric spaces useful in a large context.

5.1.2 Examples of metric spaces

Relevant examples of distances are provided by linear vector spaces on the fields $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$ in which we have defined a norm.

5.9 Definition. Let $X$ be a linear space over $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. A norm on $X$ is a function $\|\ \| : X \to \mathbb{R}_+$ satisfying the following properties
(i) (FINITENESS) $\|x\| \in \mathbb{R}$ $\forall x \in X$.
(ii) (IDENTITY) $\|x\| \ge 0$ and $\|x\| = 0$ if and only if $x = 0$.
(iii) (1-HOMOGENEITY) $\|\lambda x\| = |\lambda|\,\|x\|$ $\forall x \in X$, $\forall \lambda \in \mathbb{K}$.
(iv) (TRIANGLE INEQUALITY) $\|x + y\| \le \|x\| + \|y\|$ $\forall x, y \in X$.
If $\|\ \|$ is a norm on $X$, we say that $(X, \|\ \|)$ is a linear normed space or simply that $X$ is a normed space with norm $\|\ \|$.

Let $X$ be a linear space with norm $\|\ \|$. It is easy to show that the function $d : X \times X \to \mathbb{R}_+$ given by

$$d(x, y) := \|x - y\|, \qquad x, y \in X,$$

satisfies the metric axioms, hence defines a distance on $X$, called the natural distance in the normed space $(X, \|\ \|)$. Obviously, such a distance is translation invariant, i.e., $d(x + z, y + z) = d(x, y)$ $\forall x, y, z \in X$.

Trivial examples of metric spaces are provided by the nonempty subsets of a metric space. If $A$ is a subset of a metric space $(X, d)$, then the restriction of $d$ to $A \times A$ is, trivially, a distance on $A$. We say that $A$ is a metric space with the induced distance from $X$.

5.10 ¶. For instance, the cylinder $C := \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 = 1\}$ is a metric space with the Euclidean distance that, for $x, y \in C$, yields $d(x, y) :=$ length of the chord joining $x$ and $y$. The geodesic distance $d_g$ of Example 5.4, that is the length of the shortest path in $C$ between $x$ and $y$, defines another distance. $C$ with the geodesic distance $d_g$ has to be considered as another metric space different from $C$ with the Euclidean distance. A simple calculation shows that

$$\|x - y\| \le d_g(x, y) \le \frac{\pi}{2}\,\|x - y\|.$$
We shall now illustrate a few examples of metric spaces.

Figure 5.4. The ball centered at $(0,0)$ of radius 1 in $\mathbb{R}^2$ for the metrics $d_1$, $d_{1.3}$, $d_2$ and one further $d_p$. The ball centered at $(0,0)$ of radius one for the metric $d_\infty$ is the square $]-1, 1[\,\times\,]-1, 1[$.

a. Metrics on finite-dimensional vector spaces

5.11 ¶. As we have already seen, $\mathbb{R}^n$ with the Euclidean distance $|x - y|$ is a metric space. More generally, any Euclidean or Hermitian vector space $X$ is a normed space with norm given by

$$\|x\| := \sqrt{(x|x)},$$

cf. Chapter 3. $X$ is therefore a metric space with the induced distance $d(x, y) := \|x - y\|$, called the Euclidean distance of $X$.

5.12 ¶ ∞-distance. Set for $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$

$$\|x\|_\infty := \max(|x_1|, |x_2|, \dots, |x_n|).$$

Show that $x \to \|x\|_\infty$ is a norm on $\mathbb{R}^n$. Hence, $\mathbb{R}^n$, equipped with the distance

$$d_\infty(x, y) := \|x - y\|_\infty,$$

is a metric space different from the standard $\mathbb{R}^n$ with the Euclidean distance of Exercise 5.11. In $\mathbb{R}^2$, the ball centered at $(0,0)$ of radius one for the metric $d_\infty$ is the square $]-1, 1[\,\times\,]-1, 1[$, see Figure 5.4.

5.13 ¶ p-distance. Given a real number $p \ge 1$, we set for $x \in \mathbb{R}^n$

$$\|x\|_p := \Big(\sum_{i=1}^{n} |x_i|^p\Big)^{1/p}.$$

Show that $\|x\|_p$ is a norm on $\mathbb{R}^n$, hence $d_p(x, y) := \|x - y\|_p$ is a distance on $\mathbb{R}^n$. Observe that $\|\ \|_2$ and $d_2$ are the Euclidean norm and distance in $\mathbb{R}^n$. In $\mathbb{R}^2$, the ball centered at $(0,0)$ of radius one for the metric $d_p$ for some values of $p$ is shown in Figure 5.4. [Hint: The triangle inequality for the $p$-norm is called Minkowski's discrete inequality

$$\|a + b\|_p \le \|a\|_p + \|b\|_p, \qquad \forall a := (a_1, a_2, \dots, a_n),\ b = (b_1, b_2, \dots, b_n),$$

which follows for instance from Minkowski's inequality for integral norms, see [GM1]. Alternatively, we can proceed as follows. Suppose $a$ and $b$ are nonzero, otherwise the inequality is trivial, apply the convexity inequality $f(\lambda x + (1 - \lambda)y) \le \lambda f(x) + (1 - \lambda)f(y)$ to $f(t) = t^p$ with $x := |a_i|/\|a\|_p$, $y := |b_i|/\|b\|_p$, $\lambda := \|a\|_p/(\|a\|_p + \|b\|_p)$, and sum on $i$ from 1 to $n$, to get

$$\Big(\frac{\|a + b\|_p}{\|a\|_p + \|b\|_p}\Big)^p \le 1.]$$
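The $p$-norms of Exercises 5.12 and 5.13 are simple to compute, and a few numerical checks make the role of the assumption $p \ge 1$ visible. The sketch below is illustrative only: the function names and the sample vectors are not from the text, and a finite check is of course not a proof of Minkowski's inequality.

```python
def p_norm(x, p):
    """||x||_p for p >= 1, and the max norm for p = float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def p_distance(x, y, p):
    """d_p(x, y) := ||x - y||_p."""
    return p_norm([a - b for a, b in zip(x, y)], p)

def satisfies_triangle(a, b, p, tol=1e-12):
    """Minkowski's inequality ||a + b||_p <= ||a||_p + ||b||_p for one pair (a, b)."""
    total = [s + t for s, t in zip(a, b)]
    return p_norm(total, p) <= p_norm(a, p) + p_norm(b, p) + tol

a = [3.0, -4.0]
b = [1.0, 2.0]
```

For `a` one finds `p_norm(a, 1) == 7.0`, `p_norm(a, 2) == 5.0` and `p_norm(a, float("inf")) == 4.0`, and the values decrease toward the max norm as $p$ grows. With the exponent $1/2$ the same expression fails the triangle inequality on the pair $(1,0)$, $(0,1)$, since $(1 + 1)^2 = 4 > 2$.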

5.14 ¶ Product spaces. Let $(X_1, d^{(1)})$, $(X_2, d^{(2)})$, ..., $(X_n, d^{(n)})$ be $n$ metric spaces and let $Y = X_1 \times X_2 \times \dots \times X_n$ be the Cartesian product of $X_1, \dots, X_n$. Show that each of the functions defined on $Y \times Y$ by

$$d_p(x, y) := \Big(\sum_{i=1}^{n} d^{(i)}(x^i, y^i)^p\Big)^{1/p} \ \text{ if } p \ge 1, \qquad d_\infty(x, y) := \max\big\{d^{(i)}(x^i, y^i) \mid i = 1, \dots, n\big\}$$

for $x = (x^1, x^2, \dots, x^n)$, $y = (y^1, y^2, \dots, y^n) \in Y$, are distances on $Y$. Notice that if $X_1 = \dots = X_n = \mathbb{R}$ with the Euclidean distance, so that $Y$ is $\mathbb{R}^n$, then the distances $d_p(x, y)$ are just the distances in Exercises 5.13 and 5.12. Also show that if $\{x_k\} \subset Y$, $x_k := (x_k^1, x_k^2, \dots, x_k^n)$ $\forall k$, and $x = (x^1, x^2, \dots, x^n)$, then the following claims are equivalent.
(i) There exists $p \ge 1$ such that $d_p(x_k, x) \to 0$,
(ii) $d_p(x_k, x) \to 0$ $\forall p \ge 1$,
(iii) $d_\infty(x_k, x) \to 0$,
(iv) $\forall i = 1, \dots, n$ $d^{(i)}(x_k^i, x^i) \to 0$.

5.15 ¶ Discrete distance. Let $X$ be any set. The discrete distance on $X$ is given by

$$d(x, y) = \begin{cases} 1 & \text{if } x \ne y,\\ 0 & \text{if } x = y. \end{cases}$$

Show that $d$ is a distance on $X$ and that convergent sequences with respect to the discrete distance reduce to sequences that are eventually constant.

5.16 ¶ Codes distance. Let $X$ be a set that we think of as a set of symbols, and let $X^n = X \times X \times \dots \times X$ be the space of ordered words on $n$ symbols. Given two words $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n) \in X^n$, let

$$d(x, y) := \#\{i \mid x_i \ne y_i\}$$

be the number of positions in which $x$ and $y$ are different. Show that $d(x, y)$ defines a distance in $X^n$. Characterize the balls of $X^n$ relative to that distance. [Hint: Write $d(x, y) = \sum_{i=1}^{n} d(x_i, y_i)$ where $d$ is the discrete distance in $X$.]
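The codes distance of Exercise 5.16 is what is usually called the Hamming distance, and it is straightforward to compute. The sketch below also enumerates an open ball by brute force over a small alphabet; the names `hamming` and `ball` are illustrative.

```python
from itertools import product

def hamming(x, y):
    """Number of positions at which two words of equal length differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)

def ball(center, radius, alphabet):
    """Open ball {w : d(w, center) < radius} in the space of words over `alphabet`."""
    return [w for w in product(alphabet, repeat=len(center))
            if hamming(w, center) < radius]
```

For example `hamming("karolin", "kathrin")` is 3. Over the alphabet `"01"`, the open ball of radius 2 around `("0", "0", "0")` consists of the center and the three words differing from it in exactly one position, while every open ball of radius at most 1 reduces to its center, as for the discrete distance.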

Figure 5.5. The first page of the Thèse at the Faculty of Sciences of Paris by Maurice Fréchet (1878-1973), published in the Rendiconti del Circolo Matematico di Palermo.

b. Metrics on spaces of sequences

We now introduce some distances and norms on infinite-dimensional vector spaces.

5.17 Example ($\ell_\infty$ space). Consider the space of all real (or complex) sequences $x := (x_1, \dots)$. For $x = \{x_n\}$, $y := \{y_n\}$, set

$$\|x\|_\infty := \sup_n |x_n|, \qquad d_\infty(x, y) := \|x - y\|_\infty.$$

It is easy to show that $x \to \|x\|_\infty$ satisfies the axioms of a norm apart from the fact that it may take the value $+\infty$. Thus $x \to \|x\|_\infty$ is a norm in the space

$$\ell_\infty := \big\{x = \{x_n\} \mid \|x\|_\infty < +\infty\big\},$$

that is, on the linear vector space of bounded sequences. Consequently,

$$d_\infty(x, y) := \|x - y\|_\infty, \qquad x, y \in \ell_\infty,$$

is a distance on $\ell_\infty$, called the uniform distance. Convergence of $\{x_k\} \subset \ell_\infty$ to $x \in \ell_\infty$ in the uniform norm, called the uniform convergence, amounts to

$$\|x_k - x\|_\infty = \sup_i |x_k^i - x^i| \to 0 \quad \text{as } k \to \infty, \qquad (5.1)$$

where $x_k = (x_k^1, x_k^2, \dots)$ and $x = (x^1, x^2, \dots)$. Notice that the uniform convergence in (5.1) is stronger than the pointwise convergence

$$\forall i \quad x_k^i \to x^i \quad \text{as } k \to \infty.$$

For instance, let $\varphi(t) := te^{-t}$, $t \in \mathbb{R}_+$, and consider the sequence of sequences $\{x_k\}$ where $x_k := \{x_k^i\}_i$, $x_k^i := \varphi(i/k)$. Then $\forall i$ we have $x_k^i = \frac{i}{k}e^{-i/k} \to 0$ as $k \to \infty$, while

$$\|x_k - 0\|_\infty = \sup\Big\{\frac{i}{k}e^{-i/k} \ \Big|\ i = 0, 1, \dots\Big\} = \frac{1}{e} \not\to 0.$$
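The gap between pointwise and uniform convergence in this example can be observed numerically. In the sketch below the supremum over all indices is replaced by a maximum over finitely many terms, which is harmless here because $\varphi$ is decreasing for $t > 1$ and the index $i = k$ is included; the function names and the truncation are illustrative assumptions.

```python
import math

def phi(t):
    return t * math.exp(-t)

def component(k, i):
    """i-th term of the k-th sequence: x_k^i = phi(i / k)."""
    return phi(i / k)

def sup_norm_truncated(k, terms):
    """max over i = 0, ..., terms - 1 of |x_k^i|, standing in for the sup over all i."""
    return max(abs(component(k, i)) for i in range(terms))
```

For each fixed index, say $i = 5$, the values `component(k, 5)` tend to zero as `k` grows, whereas `sup_norm_truncated(k, 10 * k + 1)` stays equal to $1/e$ up to rounding for every `k`, the maximum being attained at $i = k$.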

Of course $\mathbb{R}^n$ with the metric $d_\infty$ in Exercise 5.12 is a subset of $\ell_\infty$ endowed with the induced metric $d_\infty$. This follows from the identification

$$(x^1, \dots, x^n) \leftrightarrow (x^1, \dots, x^n, 0, \dots, 0, \dots).$$

5.18 Example ($\ell_p$ spaces, $p \ge 1$). Consider the space of all real (or complex) sequences $x := (x_1, \dots)$. For $1 \le p < \infty$, $x = \{x_n\}$ and $y := \{y_n\}$ set

$$\|x\|_{\ell_p} := \Big(\sum_{n=1}^{\infty} |x_n|^p\Big)^{1/p}.$$

Trivially, $\|x\|_{\ell_p} = 0$ if and only if every element of the sequence $x$ is zero; moreover Minkowski's inequality

$$\|x + y\|_{\ell_p} \le \|x\|_{\ell_p} + \|y\|_{\ell_p}$$

holds, as it follows from Exercise 5.13 (passing to the limit as $n \to \infty$ in Minkowski's inequality in $\mathbb{R}^n$). Thus $\|\ \|_{\ell_p}$ satisfies the axioms of a norm apart from the fact that it may take the value $+\infty$. Hence, $\|\ \|_{\ell_p}$ is a norm in the linear space of sequences

$$\ell_p := \big\{x = \{x_n\} \mid \|x\|_{\ell_p} < +\infty\big\}.$$

Consequently, $d_{\ell_p}(x, y) := \|x - y\|_{\ell_p}$ is a distance on $\ell_p$. Convergence of $\{x_k\} \subset \ell_p$ to $x \in \ell_p$ amounts to

$$\sum_{i=1}^{\infty} |x_k^i - x^i|^p \to 0 \quad \text{as } k \to \infty,$$

where $x_k = (x_k^1, x_k^2, \dots)$ and $x = (x^1, x^2, \dots)$. Notice that $\mathbb{R}^n$ with the metric $d_p$ in Exercise 5.13 is a subset of $\ell_p$ endowed with the induced metric $d_{\ell_p}$. This follows for instance from the identification

$$(x^1, \dots, x^n) \leftrightarrow (x^1, \dots, x^n, 0, \dots, 0, \dots).$$

Finally, observe that $\|x\|_{\ell_q} \le \|x\|_{\ell_p}$ $\forall x$ if $1 \le p \le q$, hence

$$\ell_1 \subset \ell_p \subset \ell_q. \qquad (5.2)$$

Since there exist sequences $x = \{x_n\}$ such that $\|x\|_{\ell_q} < +\infty$ while $\|x\|_{\ell_p} = +\infty$ if $p < q$, as for instance $x := \big\{\big(\tfrac{1}{n}\big)^{1/p}\big\}$, the inclusions (5.2) are strict if $1 < p < q$. The case $p = 2$ is particularly relevant since the $\ell_2$ norm is induced by the scalar product

$$(x|y)_{\ell_2} := \sum_{i=1}^{\infty} x^i y^i, \qquad \|x\|_{\ell_2} = \sqrt{(x|x)_{\ell_2}}$$

(with $y^i$ replaced by its complex conjugate in the complex case). $\ell_2$ is called the Hilbert coordinate space, and the set $\{x = (x^1, x^2, \dots) \in \ell_2 \mid |x^i| \le \tfrac{1}{i}\ \forall i\}$ the Hilbert cube.

Figure 5.6. Tubular neighborhood of the graph of $f$.

c. Metrics on spaces of functions

The language of metric spaces is particularly relevant in dealing with different types of convergences of functions. As examples of metric spaces of functions, we then introduce a few normed spaces that are relevant in the sequel.

5.19 Example (Continuous functions). Denote by $C^0([0,1])$ the space of all continuous functions $f : [0,1] \to \mathbb{R}$. For $f : [0,1] \to \mathbb{R}$ set

$$\|f\|_{\infty,[0,1]} := \sup_{x \in [0,1]} |f(x)|.$$

We have
(i) $\|f\|_{\infty,[0,1]} < +\infty$ by Weierstrass's theorem,
(ii) $\|f\|_{\infty,[0,1]} = 0$ iff $f(x) = 0$ $\forall x$,
(iii) $\|\lambda f\|_{\infty,[0,1]} = |\lambda|\,\|f\|_{\infty,[0,1]}$,
(iv) $\|f + g\|_{\infty,[0,1]} \le \|f\|_{\infty,[0,1]} + \|g\|_{\infty,[0,1]}$.

To prove (iv) for instance, observe that for all $x \in [0,1]$, we have

$$|f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\|_{\infty,[0,1]} + \|g\|_{\infty,[0,1]},$$

hence the right-hand side is an upper bound for the values of $|f + g|$. The map $f \to \|f\|_{\infty,[0,1]}$ is then a norm on $C^0([0,1])$, called the uniform or infinity norm. Consequently $C^0([0,1])$ is a normed space and a metric space with the uniform distance

$$d_\infty(f, g) := \|f - g\|_{\infty,[0,1]} = \max_{t \in [0,1]} |f(t) - g(t)|.$$

In this space, the ball $B(f, \epsilon)$ of center $f$ and radius $\epsilon > 0$ is the set of all continuous functions $g \in C^0([0,1])$ such that

$$|g(x) - f(x)| < \epsilon \qquad \forall x \in [0,1]$$

or the family of all continuous functions with graphs in the tubular neighborhood of radius $\epsilon$ of the graph of $f$

$$U(f, \epsilon) := \big\{(x, y) \mid x \in [0,1],\ y \in \mathbb{R},\ |y - f(x)| < \epsilon\big\}, \qquad (5.3)$$

see Figure 5.6. The uniform convergence in $C^0([0,1])$, that is the convergence in the uniform norm, of $\{f_k\} \subset C^0([0,1])$ to $f \in C^0([0,1])$ amounts to computing

$$M_k := \|f_k - f\|_{\infty,[0,1]} = \sup_{t \in [0,1]} |f_k(t) - f(t)|$$

for every $k = 1, 2, \dots$ and to showing that $M_k \to 0$ as $k \to +\infty$.

Figure 5.7. The function $f_k$ in (5.4).

5.20 Example (Functions of class $C^1([0,1])$). Denote by $C^1([0,1])$ the space of all functions $f : [0,1] \to \mathbb{R}$ of class $C^1$, see [GM1]. For $f \in C^1([0,1])$, set

$$\|f\|_{C^1([0,1])} := \sup_{x \in [0,1]} |f(x)| + \sup_{x \in [0,1]} |f'(x)| = \|f\|_{\infty,[0,1]} + \|f'\|_{\infty,[0,1]}.$$

It is easy to check that $f \to \|f\|_{C^1([0,1])}$ is a norm in the vector space $C^1([0,1])$. Consequently,

$$d_{C^1([0,1])}(f, g) := \|f - g\|_{C^1([0,1])}$$

defines a distance in $C^1([0,1])$. In this case, a function $g \in C^1$ has a distance less than $\epsilon$ from $f$ if $\|f - g\|_{\infty,[0,1]} + \|f' - g'\|_{\infty,[0,1]} < \epsilon$; equivalently, if the graph of $g$ is in the tubular neighborhood $U(f, \epsilon_1)$ of the graph of $f$, and the graph of $g'$ is in the tubular neighborhood $U(f', \epsilon_2)$ of the graph of $f'$ with $\epsilon_1 + \epsilon_2 = \epsilon$, see (5.3). Moreover, convergence in the $C^1([0,1])$-norm of $\{f_k\} \subset C^1([0,1])$ to $f \in C^1([0,1])$, $\|f_k - f\|_{C^1([0,1])} \to 0$, amounts to

$$f_k \to f \ \text{ uniformly in } [0,1], \qquad f_k' \to f' \ \text{ uniformly in } [0,1].$$

Figures 5.8 and 5.9 show graphs of Lipschitz functions and functions of class $C^1([0,1])$ that are closer and closer to zero in the uniform norm, but with uniform norm of the derivatives larger than one.

5.21 Example (Integral metrics). Another norm and corresponding distance in $C^0([0,1])$ is given by the distance in the mean

$$\|f\|_{L^1([0,1])} := \int_0^1 |f(t)| \, dt, \qquad d_{L^1([0,1])}(f, g) := \|f - g\|_{L^1([0,1])} := \int_0^1 |f - g| \, dx.$$

5.22 ¶. Show that the $L^1$-norm in $C^0([0,1])$ satisfies the norm axioms.

Convergence with respect to the $L^1$-distance differs from the uniform one. For instance, for $k = 1, 2, \dots$ set

$$f_k(x) := \begin{cases} k(1 - k^2 x) & \text{if } 0 \le x \le 1/k^2, \\ 0 & \text{if } 1/k^2 < x \le 1. \end{cases} \tag{5.4}$$

We have $\|f_k\|_{\infty,[0,1]} = f_k(0) = k \to +\infty$ while $\|f_k\|_{L^1([0,1])} = 1/(2k) \to 0$, cf. Figure 5.7.

More generally, the $L^p([0,1])$-norm, $1 \le p < \infty$, on $C^0([0,1])$ is defined by

$$\|f\|_{L^p(0,1)} := \Big(\int_0^1 |f(x)|^p \, dx\Big)^{1/p}.$$

It turns out that $f \to \|f\|_{L^p(0,1)}$ satisfies the axioms of a norm, hence

$$d_{L^p([0,1])}(f, g) := \|f - g\|_{L^p([0,1])} := \Big(\int_0^1 |f - g|^p \, dx\Big)^{1/p}$$

is a distance in $C^0([0,1])$; it is called the $L^p([0,1])$-distance.

Figure 5.8. The Lebesgue example.

5.23 ¶. Show that the $L^p([0,1])$-norm in $C^0([0,1])$ satisfies the norm axioms. [Hint: The triangle inequality is in fact Minkowski's inequality, see [GM1].]
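The difference between uniform and $L^1$ convergence can be observed numerically. The sketch below is only an illustration: it assumes a triangular spike of height $k$ supported on $[0, 1/k^2]$ (a shape chosen to match the values $k$ and $1/(2k)$ quoted above), and it approximates the two norms by sampling, so the helper names `spike`, `sup_norm` and `l1_norm` and the grid size are hypothetical choices.

```python
def spike(k):
    """Triangular spike on [0, 1]: height k at 0, vanishing for x >= 1/k**2 (assumed shape)."""
    def f(x):
        return k * (1.0 - k * k * x) if x <= 1.0 / (k * k) else 0.0
    return f

def sup_norm(f, n=20000):
    """Approximate the uniform norm on [0, 1] by sampling n + 1 equally spaced points."""
    return max(abs(f(i / n)) for i in range(n + 1))

def l1_norm(f, n=20000):
    """Approximate the integral of |f| over [0, 1] by the midpoint rule."""
    return sum(abs(f((i + 0.5) / n)) for i in range(n)) / n

for k in (1, 2, 4, 8):
    f = spike(k)
    print(k, sup_norm(f), l1_norm(f))   # the first value grows, the second shrinks
```

The sampled uniform norms grow with $k$ while the approximate $L^1$ norms decrease, which is the behavior the example describes.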

5.1.3 Continuity and limits in metric spaces

a. Lipschitz-continuous maps between metric spaces

5.24 Definition. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and let $0 < \alpha \le 1$. We say that a function $f : X \to Y$ is $\alpha$-Hölder-continuous if there exists $L > 0$ such that

$$d_Y(f(x), f(y)) \le L \, d_X(x, y)^\alpha \quad \forall x, y \in X. \tag{5.5}$$

1-Hölder-continuous functions are also called Lipschitz continuous. The smallest constant $L$ for which (5.5) holds is called the $\alpha$-Hölder constant of $f$, often denoted by $[f]_\alpha$. When $\alpha = 1$, the 1-Hölder constant is also called the Lipschitz constant of $f$ and denoted by $[f]_1$, $\operatorname{Lip} f$ or $\operatorname{Lip}(f)$.

5.25 Example (The distance function). Let $(X, d)$ be a metric space. For any $x_0 \in X$, the function $f(x) := d(x, x_0) : X \to \mathbb{R}$ is a Lipschitz-continuous function with $\operatorname{Lip}(f) = 1$. In fact, from the triangle inequality, we get

$$|f(y) - f(x)| = |d(y, x_0) - d(x, x_0)| \le d(x, y) \quad \forall x, y \in X,$$

hence $f$ is Lipschitz continuous with $\operatorname{Lip}(f) \le 1$. Choosing $x = x_0$, we have

$$|f(y) - f(x_0)| = |d(y, x_0) - d(x_0, x_0)| = d(y, x_0),$$

thus $\operatorname{Lip}(f) \ge 1$.

Figure 5.9. On the left, the sequence $f_k(x) := k^{-1} \cos(kx)$ that converges uniformly to zero with slopes equibounded by one. On the right, $g_k(x) := k^{-1} \cos(k^2 x)$, that converges uniformly to zero, but with slopes that diverge to infinity. Given any function $f \in C^1([0,1])$, a similar phenomenon occurs for the sequences $f_k(x) := f(kx)/k$, $g_k(x) := f(k^2 x)/k$.

5.26 ¶ Distance from a set. Let $(X, d)$ be a metric space. The distance function from $x \in X$ to a nonempty subset $A \subset X$ is defined by

$$d(x, A) := \inf\{d(x, y) \mid y \in A\}.$$

It is easy to show that $f(x) := d(x, A) : X \to \mathbb{R}$ is a Lipschitz-continuous function with

$$\operatorname{Lip}(f) = \begin{cases} 0 & \text{if } d(x, A) = 0 \ \forall x, \\ 1 & \text{otherwise.} \end{cases}$$

If $d(x, A)$ is identically zero, then the claim is trivial. On the other hand, for any $x, y \in X$ and $z \in A$ we have $d(x, z) \le d(x, y) + d(y, z)$, hence, taking the infimum in $z$,

$$d(x, A) - d(y, A) \le d(x, y),$$

and interchanging $x$ and $y$,

$$|d(x, A) - d(y, A)| \le d(x, y),$$

that is, $x \to d(x, A)$ is Lipschitz continuous with Lipschitz constant not greater than one. Since there exists $x \notin A$ such that $d(x, A) > 0$, there exists a sequence $\{x_n\} \subset A$ such that

$$d(x, x_n) \le \Big(1 + \frac{1}{n}\Big) d(x, A).$$

Therefore,

$$|d(x, A) - d(x_n, A)| = d(x, A) \ge \frac{n}{n+1} \, d(x, x_n),$$

from which we infer that the Lipschitz constant of $x \to d(x, A)$ must not be smaller than one.
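The bound $|d(x, A) - d(y, A)| \le d(x, y)$ can be spot-checked numerically. The following sketch assumes the Euclidean plane and a small finite set $A$ (so that the infimum is a minimum); the names `d`, `dist_to_set` and the sample points are illustrative choices, not part of the text.

```python
import math
import random

def d(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def dist_to_set(p, A):
    """d(p, A) = inf of d(p, a) over a in A; A is a finite nonempty list here."""
    return min(d(p, a) for a in A)

random.seed(0)
A = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5)]
worst = 0.0
for _ in range(10000):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    y = (random.uniform(-5, 5), random.uniform(-5, 5))
    if x != y:
        quotient = abs(dist_to_set(x, A) - dist_to_set(y, A)) / d(x, y)
        worst = max(worst, quotient)
print(worst)   # largest observed difference quotient; it should not exceed 1
```

Taking $A = \{x_0\}$ recovers the distance function of Example 5.25 as a special case.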

b. Continuous maps in metric spaces

The notion of continuity that we introduced in [GM1], [GM2] for functions of one real variable can be extended to the context of the abstract metric structure. In fact, by paraphrasing the definition of continuity of functions $f : \mathbb{R} \to \mathbb{R}$ we get

5.27 Definition. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. We say that $f : X \to Y$ is continuous at $x_0$ if $\forall \epsilon > 0$ there exists $\delta > 0$ such that $d_Y(f(x), f(x_0)) < \epsilon$ whenever $d_X(x, x_0) < \delta$, i.e.,

$$\forall \epsilon > 0 \ \exists \delta > 0 \ \text{such that} \ f(B_X(x_0, \delta)) \subset B_Y(f(x_0), \epsilon). \tag{5.6}$$

We say that $f : X \to Y$ is continuous in $E \subset X$ if $f$ is continuous at every point $x_0 \in E$. When $E = X$ and $f : X \to Y$ is continuous at any point of $X$, we simply say that $f : X \to Y$ is continuous.

5.28 ¶. Show that $\alpha$-Hölder-continuous functions, $0 < \alpha \le 1$, in particular Lipschitz-continuous functions, between two metric spaces are continuous.

Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and $E \subset X$. Since $E$ is a metric space with the induced distance of $X$, Definition 5.27 also applies to the function $f : E \to Y$. Thus $f : E \to Y$ is continuous at $x_0 \in E$ if

$$\forall \epsilon > 0 \ \exists \delta > 0 \ \text{such that} \ f(B_X(x_0, \delta) \cap E) \subset B_Y(f(x_0), \epsilon) \tag{5.7}$$

and we say that $f : E \to Y$ is continuous if $f : E \to Y$ is continuous at any point $x_0 \in E$.

5.29 Remark. As in the case of functions of one real variable, the domain of the function $f$ is relevant in order to decide if $f$ is continuous or not. For instance, $f : X \to Y$ is continuous in $E \subset X$ if

$$\forall x_0 \in E \ \forall \epsilon > 0 \ \exists \delta > 0 \ \text{such that} \ f(B_X(x_0, \delta)) \subset B_Y(f(x_0), \epsilon), \tag{5.8}$$

while the restriction $f_{|E} : E \to Y$ of $f$ to $E$ is continuous in $E$ if

$$\forall x_0 \in E \ \forall \epsilon > 0 \ \exists \delta > 0 \ \text{such that} \ f(B_X(x_0, \delta) \cap E) \subset B_Y(f(x_0), \epsilon). \tag{5.9}$$

We deduce that the restriction property holds: if $f : X \to Y$ is continuous in $E$, then its restriction $f_{|E} : E \to Y$ to $E$ is continuous. The opposite is in general false, as simple examples show.

5.30 Proposition. Let $X, Y, Z$ be three metric spaces, and $x_0 \in X$. If $f : X \to Y$ is continuous at $x_0$ and $g : Y \to Z$ is continuous at $f(x_0)$, then $g \circ f : X \to Z$ is continuous at $x_0$. In particular, the composition of two continuous functions is continuous.

Proof. Let $\epsilon > 0$. Since $g$ is continuous at $f(x_0)$, there exists $\sigma > 0$ such that $g(B_Y(f(x_0), \sigma)) \subset B_Z(g(f(x_0)), \epsilon)$. Since $f$ is continuous at $x_0$, there exists $\delta > 0$ such that $f(B_X(x_0, \delta)) \subset B_Y(f(x_0), \sigma)$, consequently

$$g \circ f(B_X(x_0, \delta)) \subset g(B_Y(f(x_0), \sigma)) \subset B_Z(g \circ f(x_0), \epsilon). \qquad \Box$$

Continuity can be expressed in terms of convergent sequences. As in the proof of Theorem 2.46 of [GM2], one shows

5.31 Theorem. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. $f : X \to Y$ is continuous at $x_0 \in X$ if and only if $f(x_n) \to f(x_0)$ in $(Y, d_Y)$ whenever $\{x_n\} \subset X$, $x_n \to x_0$ in $(X, d_X)$.

c. Limits in metric spaces

Related to the notion of continuity is the notion of limit. Again, we want to rephrase $f(x) \to y_0$ as $x \to x_0$. For that we need $f$ to be defined near $x_0$, but not necessarily at $x_0$. For this purpose we introduce the

5.32 Definition. Let $X$ be a metric space and $A \subset X$. We say that $x_0 \in X$ is an accumulation point of $A$ if each ball centered at $x_0$ contains at least one point of $A$ distinct from $x_0$,

$$\forall r > 0 \quad B(x_0, r) \cap A \setminus \{x_0\} \ne \emptyset.$$

Accumulation points are also called cluster points.

5.33 ¶. Consider $\mathbb{R}$ with the Euclidean metric. Show that
(i) the set of points of accumulation of $A := ]a, b[$, $B = [a, b]$, $C = [a, b[$ is the closed interval $[a, b]$,
(ii) the set of points of accumulation of $A := ]0, 1[ \cup \{2\}$, $B = [0, 1] \cup \{2\}$, $C = [0, 1[ \cup \{2\}$ is the closed interval $[0, 1]$,
(iii) the set of points of accumulation of the rational numbers and of the irrational numbers is the whole $\mathbb{R}$.

We shall return to this notion, but for the moment the definition suffices.

5.34 Definition. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, let $E \subset X$ and let $x_0 \in X$ be a point of accumulation of $E$. Given $f : E \setminus \{x_0\} \to Y$, we say that $y_0 \in Y$ is the limit of $f(x)$ as $x \to x_0$, $x \in E$, and we write

$$f(x) \to y_0 \ \text{as } x \to x_0, \qquad \text{or} \qquad \lim_{\substack{x \to x_0 \\ x \in E}} f(x) = y_0,$$

if for any $\epsilon > 0$ there exists $\delta > 0$ such that $d_Y(f(x), y_0) < \epsilon$ whenever $x \in E$ and $0 < d_X(x, x_0) < \delta$. Equivalently,

$$\forall \epsilon > 0 \ \exists \delta > 0 \ \text{such that} \ f(B_X(x_0, \delta) \cap E \setminus \{x_0\}) \subset B_Y(y_0, \epsilon).$$

Notice that, while in order to deal with the continuity of $f$ at $x_0$ we only need $f$ to be defined at $x_0$, when we deal with the notion of limit we only need that $x_0$ be a point of accumulation of $E$. These two requirements are unrelated, since not all points of $E$ are points of accumulation and not all points of accumulation of $E$ are in $E$, see, e.g., Exercise 5.33. Moreover, the condition $0 < d_X(x, x_0)$ in the definition of limit expresses the fact that we can disregard the value of $f$ at $x_0$ (in case $f$ is defined at $x_0$). Also notice that the limit is unique if it exists, and that limits are preserved by restriction. To be precise, we have

5.35 Proposition. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. Suppose $F \subset E \subset X$ and let $x_0 \in X$ be a point of accumulation for $F$. If $f(x) \to y$ as $x \to x_0$, $x \in E$, then $f(x) \to y$ as $x \to x_0$, $x \in F$.

5.36 ¶. As for functions of one variable, the notions of limit and continuity are strongly related. Show the following.

Proposition. Let $X$ and $Y$ be two metric spaces, $E \subset X$ and $x_0 \in X$.
(i) If $x_0$ belongs to $E$ and is not a point of accumulation of $E$, then every function $f : E \to Y$ is continuous at $x_0$.
(ii) Suppose that $x_0$ belongs to $E$ and is a point of accumulation for $E$. Then
  a) $f : E \to Y$ is continuous at $x_0$ if and only if $f(x) \to f(x_0)$ as $x \to x_0$, $x \in E$,
  b) $f(x) \to y$ as $x \to x_0$, $x \in E$, if and only if the function $g : E \cup \{x_0\} \to Y$ defined by

$$g(x) := \begin{cases} f(x) & \text{if } x \in E \setminus \{x_0\}, \\ y & \text{if } x = x_0 \end{cases}$$

  is continuous at $x_0$.
We conclude with a change of variable theorem for limits, see e.g., Proposition 2.27 of [GM1] and Example 2.49 of [GM2].

5.37 Proposition. Let $X, Y, Z$ be metric spaces, $E \subset X$ and let $x_0$ be a point of accumulation for $E$. Let $f : E \to Y$, $g : f(E) \to Z$ be two functions and suppose that $f(x_0)$ is an accumulation point of $f(E)$. If
(i) $g(y) \to L$ as $y \to y_0$, $y \in f(E)$,
(ii) $f(x) \to y_0$ as $x \to x_0$, $x \in E$,
(iii) either $f(x_0) = y_0$, or $f(x) \ne y_0$ for all $x \in E$ and $x \ne x_0$,
then $g(f(x)) \to L$ as $x \to x_0$, $x \in E$.

d. The junction property

A property we have just hinted at in the case of real functions is the junction property, see Section 2.1.2 of [GM1], which is more significant for functions of several variables. Let $X$ be a metric space. We say that a family $\{U_\alpha\}$ of subsets of $X$ is locally finite at a point $x_0 \in X$ if there exists $r > 0$ such that $B(x_0, r)$ meets at most a finite number of the $U_\alpha$'s.

5.38 Proposition. Let $(X, d_X)$, $(Y, d_Y)$ be metric spaces, $f : X \to Y$ a function, $x_0 \in X$, and let $\{U_\alpha\}$ be a family of subsets of $X$ locally finite at $x_0$.
(i) Suppose that $x_0$ is a point of accumulation of $U_\alpha$ and that $f(x) \to y$ as $x \to x_0$, $x \in U_\alpha$, for all $\alpha$. Then $f(x) \to y$ as $x \to x_0$, $x \in X$.
(ii) If $x_0 \in \cap_\alpha U_\alpha$ and $f : U_\alpha \subset X \to Y$ is continuous at $x_0$ for all $\alpha$, then $f : X \to Y$ is continuous at $x_0$.

5.39 ¶. Prove Proposition 5.38.

5.40 Example. An assumption on the covering is necessary in order that the conclusions of Proposition 5.38 hold. Set $A := \{(x, y) \mid x^2 \dots\}$ and

$$f(x, y) := \begin{cases} 1 & \text{if } (x, y) \in A, \\ 0 & \text{otherwise.} \end{cases}$$

The function $f$ is discontinuous at $x_0 := (0, 0)$, since its oscillation is one in every ball centered at $x_0$. Denote by $U_m$ the straight line through the origin

$$U_m := \{(x, y) \mid y = mx\}, \quad m \in \mathbb{R}, \qquad U_\infty := \{(x, y) \mid x = 0\}.$$

The $U_\alpha$'s, $\alpha \in \mathbb{R} \cup \{\infty\}$, form a covering of $\mathbb{R}^2$ that is not locally finite at $x_0$ and, for any $\alpha \in \mathbb{R} \cup \{\infty\}$, the restriction of $f$ to each $U_\alpha$ is zero near the origin. In particular, each restriction $f_{|U_\alpha} : U_\alpha \to \mathbb{R}$ is continuous at the origin.

5.1.4 Functions from $\mathbb{R}^n$ into $\mathbb{R}^m$

It is important to be acquainted with the limit notion we have just introduced in an abstract context. For this purpose, in this section, we shall focus on mappings between Euclidean spaces and illustrate with a few examples some of the abstract notions previously introduced.

a. The vector space $C^0(A, \mathbb{R}^m)$

Denote by $e^i : \mathbb{R}^n \to \mathbb{R}$ the linear map that maps $x = (x^1, x^2, \dots, x^n) \in \mathbb{R}^n$ into its $i$th component, $e^i(x) := x^i$. Any map $f : X \to \mathbb{R}^n$ from a set $X$ into $\mathbb{R}^n$ writes as an $n$-tuple of real-valued functions $f(x) = (f^1(x), \dots, f^n(x))$, where for any $i = 1, \dots, n$ the function $f^i : X \to \mathbb{R}$ is given by $f^i(x) := e^i(f(x))$. From

$$|y^1|, |y^2|, \dots, |y^n| \le |y| \le \sum_{i=1}^{n} |y^i| \quad \forall y \in \mathbb{R}^n$$

we readily infer the following.

5.41 Proposition. The following claims hold.
(i) The maps $e^i : \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, n$, are Lipschitz continuous.
(ii) Let $(X, d)$ be a metric space. Then
  a) $f : X \to \mathbb{R}^n$ is continuous at $x_0 \in X$ if and only if all its components $f^1, f^2, \dots, f^n$ are continuous at $x_0$,
  b) if $f, g : X \to \mathbb{R}^n$ are continuous at $x_0$, then $f + g : X \to \mathbb{R}^n$ is continuous at $x_0$,
  c) if $f : X \to \mathbb{R}^n$ and $\lambda : X \to \mathbb{R}$ are continuous at $x_0$, then the map $\lambda f : X \to \mathbb{R}^n$ defined by $\lambda f(x) := \lambda(x) f(x)$ is continuous at $x_0$.

5.42 Example. The function $f : \mathbb{R}^3 \to \mathbb{R}$, $f(x, y, z) := \sin(x^2 y + z^2)$ is continuous in $\mathbb{R}^3$. In fact, if $\mathbf{x}_0 := (x_0, y_0, z_0)$, then the coordinate functions $\mathbf{x} = (x, y, z) \to x$, $\mathbf{x} \to y$, $\mathbf{x} \to z$ are continuous at $\mathbf{x}_0$ by Proposition 5.41 (i). By Proposition 5.41 (ii) c), $\mathbf{x} \to x^2 y$ and $\mathbf{x} \to z^2$ are continuous at $\mathbf{x}_0$, and by Proposition 5.41 (ii) b), $\mathbf{x} \to x^2 y + z^2$ is continuous at $\mathbf{x}_0$. Finally $\sin(x^2 y + z^2)$ is continuous since $\sin$ is continuous.

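5.43 Definition. Let $X$ and $Y$ be two metric spaces. We denote by $C^0(X, Y)$ the class of all continuous functions $f : X \to Y$.

As a consequence of Proposition 5.41, $C^0(X, \mathbb{R}^n)$ is a vector space. Moreover, if $\lambda \in C^0(X, \mathbb{R})$ and $f \in C^0(X, \mathbb{R}^n)$, then $\lambda f : X \to \mathbb{R}^n$ given by $\lambda f(x) := \lambda(x) f(x)$, $x \in X$, belongs to $C^0(X, \mathbb{R}^n)$. In particular,

5.44 Corollary. Polynomials in $n$ variables belong to $C^0(\mathbb{R}^n, \mathbb{R})$. Therefore, maps $f : \mathbb{R}^n \to \mathbb{R}^m$ whose components are polynomials of $n$ variables are continuous. In particular, linear maps $L \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ are continuous.

It is worth noticing that in fact

5.45 Proposition. Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be linear. Then $L$ is Lipschitz continuous in $\mathbb{R}^n$.

Proof. As $L$ is linear, we have

$$\operatorname{Lip}(L) := \sup_{\substack{x, y \in \mathbb{R}^n \\ x \ne y}} \frac{\|L(x) - L(y)\|_{\mathbb{R}^m}}{\|x - y\|_{\mathbb{R}^n}} = \sup_{\substack{x, y \in \mathbb{R}^n \\ x \ne y}} \frac{\|L(x - y)\|_{\mathbb{R}^m}}{\|x - y\|_{\mathbb{R}^n}} = \sup_{0 \ne z \in \mathbb{R}^n} \frac{\|L(z)\|_{\mathbb{R}^m}}{\|z\|_{\mathbb{R}^n}} =: \|L\|.$$

Let us prove that $\|L\| < +\infty$. Since $L$ is continuous at zero by Corollary 5.44, there exists $\delta > 0$ such that $\|L(w)\| < 1$ whenever $\|w\| < \delta$. For any nonzero $z \in \mathbb{R}^n$, set $w := \frac{\delta}{2\|z\|} z$. Since $\|w\| < \delta$, we have $\|L(w)\| < 1$. Therefore, writing $z = \frac{2\|z\|}{\delta} w$ and using the linearity of $L$,

$$\|L(z)\| = \Big\| \frac{2\|z\|}{\delta} L(w) \Big\| = \frac{2\|z\|}{\delta} \|L(w)\| < \frac{2}{\delta} \|z\|,$$

hence $\|L\| \le 2/\delta < +\infty$. $\Box$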
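The finiteness of $\|L\|$ can also be seen concretely for a matrix. The sketch below is an illustration under stated assumptions: the matrix `M` is an arbitrary example, the helper names are hypothetical, and the Frobenius norm is used only as a convenient explicit upper bound for the Lipschitz constant (it follows from the Cauchy-Schwarz inequality), not as the value of $\|L\|$ itself.

```python
import math
import random

def apply(M, x):
    """Matrix-vector product for a matrix M given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in M]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in v))

def frobenius(M):
    """Frobenius norm of M: an upper bound for sup |Mz| / |z| by Cauchy-Schwarz."""
    return math.sqrt(sum(a * a for row in M for a in row))

random.seed(1)
M = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]   # an example linear map from R^3 to R^2
ratios = []
for _ in range(5000):
    x = [random.gauss(0, 1) for _ in range(3)]
    y = [random.gauss(0, 1) for _ in range(3)]
    diff = [a - b for a, b in zip(x, y)]
    image_diff = [a - b for a, b in zip(apply(M, x), apply(M, y))]
    ratios.append(norm(image_diff) / norm(diff))
print(max(ratios), frobenius(M))   # observed quotients stay below the bound
```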
For a more detailed description of linear maps in normed spaces, see Chapters 9 and 10.

b. Some nonlinear continuous transformations from $\mathbb{R}^n$ into $\mathbb{R}^m$

We now present a few examples of nonlinear continuous transformations between Euclidean spaces.

5.46 Example. For $k = 0, 1, \dots$ consider the map $u_k : ]-1, 1[ \to \mathbb{R}^2$ given by

$$u_k(t) := \begin{cases} (\cos kt, \sin kt) & \text{if } t \in ]0, 2\pi/k[, \\ (1, 0) & \text{otherwise.} \end{cases}$$

This is a Lipschitz function whose graph is given in Figure 5.10. Notice that the graph of $u_k$, $\{(t, u_k(t))\}$, is a curve that "converges" as $k \to \infty$ to a horizontal line plus a vertical circle at 0. Compare with the function $\operatorname{sgn} x$ from $\mathbb{R}$ to $\mathbb{R}$.

Figure 5.10. The function $u_k$ in Example 5.46.

5.47 Example (Stereographic projection). Let

$$S^n := \{x \in \mathbb{R}^{n+1} \mid |x| = 1\}$$

be the unit sphere in $\mathbb{R}^{n+1}$. If $x = (x_1, \dots, x_n, x_{n+1}) \in \mathbb{R}^{n+1}$, let us denote the coordinates of $x$ by $(y, z)$ where $y = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ and $z = x_{n+1} \in \mathbb{R}$. With this notation, $S^n = \{(y, z) \in \mathbb{R}^n \times \mathbb{R} \mid |y|^2 + z^2 = 1\}$. Furthermore, denote by $P_S = (0, -1) \in S^n$ the South pole of $S^n$. The stereographic projection (from the South pole) is the map that projects from the South pole the sphere onto the $\{z = 0\}$ plane,

$$\sigma : S^n \setminus \{P_S\} \subset \mathbb{R}^{n+1} \to \mathbb{R}^n, \qquad (y, z) \to \frac{y}{1 + z}.$$

It is easily seen that $\sigma$ is injective, surjective and continuous with a continuous inverse given by

$$\sigma^{-1}(x) := \Big( \frac{2x}{1 + |x|^2}, \ \frac{1 - |x|^2}{1 + |x|^2} \Big),$$

that maps $x \in \mathbb{R}^n$ into the point of $S^n$ lying in the segment joining the South pole of $S^n$ with $x$, see Figure 5.11.

5.48 Example (Polar coordinates). The transformation

$$\sigma : E := \{(\rho, \theta) \mid \rho > 0, \ 0 \le \theta < 2\pi\} \to \mathbb{R}^2, \qquad (\rho, \theta) \to (\rho \cos\theta, \rho \sin\theta)$$

defines a map that is injective and continuous with range $\mathbb{R}^2 \setminus \{0\}$. The extension of the map to the third coordinate

$$\sigma : E \times \mathbb{R} \to \mathbb{R}^2 \times \mathbb{R} = \mathbb{R}^3, \qquad (\rho, \theta, z) \to (\rho \cos\theta, \rho \sin\theta, z)$$

defines the so-called cylindrical coordinates in $\mathbb{R}^3$.

5.49 Example (Spherical coordinates). The representation of points $(x, y, z) \in \mathbb{R}^3$ as

$$\begin{cases} x = \rho \sin\varphi \cos\theta, \\ y = \rho \sin\varphi \sin\theta, \\ z = \rho \cos\varphi, \end{cases}$$

see Figure 5.12, defines the spherical coordinates in $\mathbb{R}^3$. This in turn defines a continuous transformation $(\rho, \theta, \varphi) \to (x, y, z)$ from

$$E := \{(\rho, \theta, \varphi) \mid \rho > 0, \ 0 \le \theta < 2\pi, \ 0 \le \varphi \le \pi\} = ]0, +\infty[ \times [0, 2\pi[ \times [0, \pi]$$

into $\mathbb{R}^3 \setminus \{0\}$.
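Returning to the stereographic projection of Example 5.47, the following sketch checks numerically that the projection from the South pole and the map $x \to \big(2x/(1+|x|^2), (1-|x|^2)/(1+|x|^2)\big)$ undo each other. The function names `sigma` and `sigma_inv` and the test point are assumptions made for the illustration; the inverse formula used here is the one obtained by intersecting the sphere with the line through the South pole and $(x, 0)$.

```python
def sigma(y, z):
    """Stereographic projection from the South pole: (y, z) -> y / (1 + z), z != -1."""
    return [t / (1.0 + z) for t in y]

def sigma_inv(x):
    """Candidate inverse: x -> (2x / (1 + |x|^2), (1 - |x|^2) / (1 + |x|^2))."""
    s = sum(t * t for t in x)
    return [2.0 * t / (1.0 + s) for t in x], (1.0 - s) / (1.0 + s)

x = [0.3, -1.2]
y, z = sigma_inv(x)
print(sum(t * t for t in y) + z * z)   # close to 1: the point lies on the sphere
print(sigma(y, z))                     # close to x: the round trip returns the start
```

The origin of the plane is sent to the North pole $(0, 1)$, and points far from the origin are sent close to the South pole.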

Figure 5.11. The stereographic projection from the South pole.

Complex-valued functions of one complex variable provide examples of transformations of the plane.

5.50 Example ($w = z^n$). The map $z \to z^n$ defines a continuous transformation of $\mathbb{C}$ to $\mathbb{C}$. The inverse image of each nonzero $w \in \mathbb{C}$ is made by $n$ distinct points, given by the $n$ roots of $w$; those $n$ points collapse to zero when $w = 0$. If we write the transformation $w = z^n$ as

$$|w| = |z|^n, \qquad \operatorname{Arg} w = n \operatorname{Arg} z,$$

we see, identifying $\mathbb{C}$ with $\mathbb{R}^2$, that the circle of radius $r$ and center 0 is mapped into the circle of radius $r^n$ and center 0. Moreover, if a point goes clockwise along the circle, then the normalized point image $w/|w|$ goes along the unit circle clockwise $n$ times.

5.51 ¶. The map $z \to w = z^n$ restricted to $\varphi_0 < \operatorname{Arg} z < \varphi_1$ with $0 < \varphi_1 - \varphi_0 < 2\pi/n$ is injective and continuous.

5.52 ¶. Show that the map $z \to w = z^2$ maps the family of lines parallel to the axes (but the axes themselves) into two families of parabolas with the common axis as the real axis and the common foci at the origin, see Figure 5.13.
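The statement of Example 5.50 about the $n$ inverse images of a nonzero $w$ can be made concrete with a short computation in polar form. This is a sketch only; the helper name `nth_roots` and the sample values $w = 1 + i$, $n = 5$ are arbitrary choices for the illustration.

```python
import cmath
import math

def nth_roots(w, n):
    """The n distinct solutions z of z**n = w, for a nonzero complex w."""
    r, phi = cmath.polar(w)            # w = r * exp(i * phi)
    return [cmath.rect(r ** (1.0 / n), (phi + 2.0 * math.pi * k) / n)
            for k in range(n)]

roots = nth_roots(1 + 1j, 5)
for z in roots:
    print(z, z ** 5)   # each z**5 should be close to 1+1j
```

All the roots have the same modulus $|w|^{1/n}$ and their arguments differ by multiples of $2\pi/n$, which is why each sector of opening less than $2\pi/n$ contains at most one of them.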

5.53 Example (The Joukowski function). This is the map

$$\lambda(z) := \frac{1}{2}\Big(z + \frac{1}{z}\Big), \qquad z \ne 0,$$

which appears in several problems of aerodynamics. It is a continuous function defined in $\mathbb{C} \setminus \{0\}$. Since $\lambda(z) = \lambda(1/z)$, every point $w \ne \pm 1, 0$ has at most, and, in fact, exactly two distinct inverse images $z_1, z_2$ satisfying $z_1 z_2 = 1$.

Figure 5.12. Spherical coordinates.

Figure 5.13. The transformation $w = z^2$ maps families of lines parallel to the axes, except for the axes, into two families of parabolas with the common axis as the real axis and the common foci at the origin.

5.54 ¶. Show that $\lambda(z) = \frac{1}{2}(z + 1/z)$ is one-to-one from $\{|z| < 1, z \ne 0\}$ or $\{|z| > 1\}$ into the complement of the segment $\{w \mid -1 \le \Re w \le 1\}$. $\lambda$ maps the family of circles $\{z \mid |z| = r\}$, $0 < r < 1$, into a family of co-focal ellipses and maps the diameters $z = t e^{i\alpha}$, $-1 < t < 1$, $0 < \alpha < \pi$, into a family of co-focal hyperbolas, see Figure 5.14.

5.55 Example (The Möbius transformations). These maps, defined by

$$L(z) := \frac{az + b}{cz + d}, \qquad ad - bc \ne 0, \tag{5.10}$$

are continuous and injective from $\mathbb{C} \setminus \{-d/c\}$ into $\mathbb{C} \setminus \{a/c\}$ and have several relevant properties that we list below, asking the reader to show that they hold.

5.56 ¶. Show the following.
(i) $L(z) \to a/c$ as $|z| \to \infty$ and $|L(z)| \to \infty$ as $z \to -d/c$. Because of this, we write $L(\infty) = a/c$, $L(-d/c) = \infty$ and say that $L$ is continuous from $\mathbb{C} \cup \{\infty\}$ into itself.
(ii) Every rational function, i.e., the quotient of two complex polynomials, defines a continuous transformation of $\mathbb{C} \cup \{\infty\}$ into itself, as in (i).
(iii) The Möbius transformations $L(z)$ in (5.10) are the only rational functions from $\mathbb{C} \cup \{\infty\}$ into itself that are injective.
(iv) The Möbius transformations $(a_i z + b_i)/(c_i z + d_i)$, $i = 1, 2$, are identical if and only if $(a_1, b_1, c_1, d_1)$ is a nonzero multiple of $(a_2, b_2, c_2, d_2)$.
(v) The Möbius transformations form a group $G$ with respect to the composition of maps; the subset $H \subset G$, $H := \{z, \ 1 - z, \ 1/z, \ 1/(1 - z), \ (z - 1)/z, \ z/(z - 1)\}$ is a subgroup of $G$.
(vi) A Möbius transformation maps straight lines and circles into straight lines and circles (show this first for the map $1/z$, taking into account that the equations for straight lines and circles have the form $A(x^2 + y^2) + 2Bx + 2Cy + D = 0$ if $z = x + iy$).
(vii) The map in (5.10) maps circles and straight lines through $-d/c$ into a straight line and any other straight line or circle into a circle.

Figure 5.14. The Joukowski function maps circles $|z| = r$, $0 < r < 1$, and diameters $z = \pm t e^{i\alpha}$, $0 < t < 1$, $0 < \alpha < 2\pi$, respectively into a family of ellipses and of cofocal hyperbolas.

(viii) The only Möbius transformation with more than two fixed points is $z$. Two Möbius transformations are equal if they agree at three distinct points. There is a unique Möbius transformation that maps three distinct points $z_1, z_2, z_3 \in \mathbb{C} \cup \{\infty\}$ into three distinct points $w_1, w_2, w_3 \in \mathbb{C} \cup \{\infty\}$.
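The group structure of the Möbius transformations mentioned in (v) can be explored by representing $L(z) = (az + b)/(cz + d)$ through its coefficients: composing two transformations corresponds to multiplying the $2 \times 2$ coefficient matrices. The sketch below illustrates this under the assumption that the evaluation points avoid the poles; the names `mobius` and `compose` and the sample coefficients are hypothetical.

```python
def mobius(a, b, c, d):
    """Return z -> (a z + b) / (c z + d), assuming a d - b c != 0 and z != -d/c."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be nonzero")
    return lambda z: (a * z + b) / (c * z + d)

def compose(m1, m2):
    """Coefficients of L1 o L2, i.e. the product of the two coefficient matrices."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)

m1 = (1, 2, 3, 4)      # z -> (z + 2) / (3z + 4)
m2 = (0, 1, 1, 0)      # z -> 1 / z
z = 0.5 + 0.25j
lhs = mobius(*m1)(mobius(*m2)(z))     # apply 1/z first, then m1
rhs = mobius(*compose(m1, m2))(z)     # single transformation with composed coefficients
print(abs(lhs - rhs))                 # close to 0
```

Since coefficient tuples that differ by a nonzero factor define the same map, as stated in (iv), the composed coefficients are determined only up to such a factor.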

5.57 Example (Exponential and logarithm). The complex function $z \to \exp z$, see [GM2], is continuous from $\mathbb{C} \to \mathbb{C}$, periodic of period $2\pi i$ with image $\mathbb{C} \setminus \{0\}$. In particular $e^z$ does not vanish, and every nonzero $w$ has infinitely many preimages.

5.58 ¶. Taking into account what we have proved in [GM2], show the following.
(i) $w = e^z$ is injective with a continuous inverse in every strip parallel to the real axis of width $h < 2\pi$, and has as image the interior of an angle of $h$ radians and vertex at the origin;
(ii) $w = e^z$ maps every straight line which is not parallel to the axes into a logarithmic spiral, see Chapter 7.

c. The calculus of limits for functions of several variables

Though we may have appeared pedantic, we have always insisted on specifying the domain $E \subset X$ in which the independent variables varied. This is in fact particularly relevant when dealing with limits and continuity of functions of several variables, as in this case there are several reasonable ways of approaching a point $x_0$. Different choices may and, in general, do lead to different answers concerning the existence and/or the equality of the limits

$$\lim_{\substack{x \to x_0 \\ x \in E}} f(x) \qquad \text{and} \qquad \lim_{\substack{x \to x_0 \\ x \in F}} f(x).$$

Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces, $f : X \to Y$ and $x_0 \in X$ a point of accumulation of $X$.

Figure 5.15. The function in Example 5.59.

(i) If we find two sets $E_1, E_2$ such that $x_0$ is an accumulation point of both $E_1$ and $E_2$, and the restrictions $f : E_1 \subset X \to Y$ and $f : E_2 \subset X \to Y$ of $f$ have different limits, then $f$ has no limit when $x \to x_0$, $x \in E_1 \cup E_2$.
(ii) If we want to show that $f(x)$ has limit as $x \to x_0$, we may
  a) guess a possible limit $y_0 \in Y$, for instance computing the limit $y_0$ of a suitable restriction of $f$,
  b) show that the real-valued function $x \to d_Y(f(x), y_0)$ converges to zero as $x \to x_0$, for instance proving that $d_Y(f(x), y_0) \le h(x)$ for all $x \in X$, $x \ne x_0$, where $h : X \to \mathbb{R}$ is such that $h(x) \to 0$ as $x \to x_0$.

5.59 Example. Let $f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}$ be defined by $f(x, y) := xy/(x^2 + y^2)$ for $(x, y) \ne (0, 0)$. Let us show that $f$ has no limit as $(x, y) \to (0, 0)$. By contradiction, suppose that $f(x, y) \to L \in \mathbb{R}$ as $(x, y) \to (0, 0)$. Then for any sequence $\{(x_n, y_n)\} \subset \mathbb{R}^2 \setminus \{(0, 0)\}$ converging to $(0, 0)$ we find $f(x_n, y_n) \to L$. Choosing $(x_n, y_n) := (1/n, k/n)$, we have

$$f\Big(\frac{1}{n}, \frac{k}{n}\Big) = \frac{k}{1 + k^2},$$

hence, as $n \to \infty$, $L = k/(1 + k^2)$. Since $k$ is arbitrary, we have a contradiction. This is even more evident if we observe that $f$ is positively homogeneous of degree 0, i.e., $f(\lambda x, \lambda y) = f(x, y)$ for all $\lambda > 0$, i.e., $f$ is constant along half-lines from the origin, see Figure 5.15. It is then clear that $f$ has limit at $(0, 0)$ if and only if $f$ is constant, which is not the case. Notice that from the inequality $2|x||y| \le x^2 + y^2$ we can easily infer that $|f(x, y)| \le 1/2$ $\forall (x, y) \in \mathbb{R}^2 \setminus \{(0, 0)\}$, i.e., that $f$ is a bounded function.

5.60 Example. Let $f(x, y) := \sin(x^2 y)/(x^2 + y^2)$ for $(x, y) \ne (0, 0)$. In this case $(1/n, 0) \to (0, 0)$ and $f(1/n, 0) = 0 \to 0$. Thus 0 is the only possible limit as $(x, y) \to (0, 0)$; and, in fact it is, since

$$|f(x, y) - 0| = \frac{|\sin(x^2 y)|}{x^2 + y^2} \le |x| \, \frac{|x| \, |y|}{x^2 + y^2} \le \frac{1}{2} |x| \to 0$$

as $(x, y) \to (0, 0)$. Here we used $|\sin t| \le |t|$ $\forall t$, $2|x||y| \le x^2 + y^2$ $\forall (x, y)$ and that $(x, y) \to |x|$ is a continuous map in $\mathbb{R}^2$, see Proposition 5.41.
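The two behaviors in Examples 5.59 and 5.60 can be tabulated numerically. The sketch below is only a check on sample points and proves nothing by itself; the function names `f` and `g` and the chosen slopes are assumptions for the illustration.

```python
import math

def f(x, y):
    """f(x, y) = x*y / (x**2 + y**2), defined away from the origin (Example 5.59)."""
    return x * y / (x * x + y * y)

def g(x, y):
    """g(x, y) = sin(x**2 * y) / (x**2 + y**2), away from the origin (Example 5.60)."""
    return math.sin(x * x * y) / (x * x + y * y)

# Along (1/n, k/n) the values of f do not depend on n, only on the slope k.
for k in (0.0, 1.0, 2.0):
    print(k, [f(1.0 / n, k / n) for n in (10, 100, 1000)], k / (1 + k * k))

# For g the bound |g(x, y)| <= |x| / 2 forces the values to 0 near the origin.
for n in (10, 100, 1000):
    x, y = 1.0 / n, 2.0 / n
    print(n, g(x, y), abs(x) / 2)
```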

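We can also consider the restriction of $f$ to continuous paths from $x_0$, i.e., choose a map $\varphi : [0, 1] \to \mathbb{R}^n$ that is continuous at least at 0 with $\varphi(0) = x_0$ and $\varphi(t) \ne x_0$ for $t \ne 0$ and compute, if possible,

$$\lim_{t \to 0^+} f(\varphi(t)).$$

Such limits may or may not exist and their values depend on the chosen path, for a fixed $f$. Of course, if

$$\lim_{\substack{x \to x_0 \\ x \in E}} f(x) = L$$

then, on account of the restriction property and of the change of variable theorem,

$$\lim_{\substack{x \to x_0 \\ x \in F}} f(x) = L \qquad \text{and} \qquad \lim_{t \to 0^+} f(\varphi(t)) = L$$

respectively for any $F \subset E$ of which $x_0$ remains a point of accumulation and for any continuous path in $E$, $\varphi([0, 1]) \subset E$.

5.61 Example. Let us reconsider the function

$$f : \mathbb{R}^2 \setminus \{(0, 0)\} \to \mathbb{R}, \qquad f(x, y) := \frac{xy}{x^2 + y^2},$$

which is continuous in $\mathbb{R}^2 \setminus \{(0, 0)\}$. Suppose that we move from zero along the straight line $\{(x, y) \mid y = mx, \ x \in \mathbb{R}\}$ that we parametrize by $x \to \varphi(x) := (x, mx)$. Then

$$f(\varphi(x)) = f(x, mx) = \frac{m}{1 + m^2} \to \frac{m}{1 + m^2} \quad \text{as } x \to 0;$$

in particular, the previous limit depends on $m$, hence $f(x, y)$ has no limit as $(x, y) \to (0, 0)$. Set $E := \{(x, y) \mid x \in \mathbb{R}, \ 0 < y < \lambda x^2\}$. Then

$$\lim_{\substack{(x, y) \to (0, 0) \\ (x, y) \in E}} f(x, y) = 0.$$

In fact, in this case

$$0 \le \frac{|x| \, y}{x^2 + y^2} \le \frac{\lambda |x|^3}{x^2} = \lambda |x| \quad \text{in } E.$$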
5.62 E x a m p l e . The function

fi^,y) = { ^

if(x,v)eEM{(0,0)}, if (x, 3/) = (0,0)

is continuous in M? \ {(0,0)} but is not continuous at (0,0). Restricting / to a straight line through (0,0) parametrized as (p{t) = (ta, tb), t € K, gives

f(^it)) = fiat,bt) = - ^ ^ ^ = J^t^O

ast^O.

However, restricting / along the graph of the function y = ax^ parametrized as (^(x) := (x,ax'^), gives / ( x , a x ) = -—-—r— x^ + a^x^

-> — - — - , 1 + a^

asx-^0,

174

5. Metric Spaces and Continuous Functions

MONOGRARTE

MATEMATYCZNE

|

KOMITET REDAKCYJNY: S. B A M A C H . B. KKA5T£il. K. KV&ATOWSKI. s. MAatoBKiBWicz, w. saM.ntiitit

i «.

srtwHAVs

TOMin

TOPOLOGIE I E5PACE5 M £ T R I S A B L E 5 . &SPACE5 COMPLETE

C A S I M I R

K U R A T O W S K I

. X I'CCOie POtTTeCHNIQI,'! OZ IWOv

Z SUBWEMCJI FU»Z)0*ZU KCtTgKT K A * 0 » 0 W E ; W A R S Z A W A - L W d W

iqSS

Figure 5.16. Kazimierz Kuratowski (1896-1980) and the frontispiece of the first volume of his Topologie.

thus / has no Hmit as {x, y) —> (0, 0). Let us now consider the restriction of / to the set

E:=\{x,y)\x>0,

\y\ < x^}.

We have lim

/ ( x , y) — 0.

{x,y)eE In fact,

I x^y

•0

=

J.4 -1-2/2

x^ -\' 2/2

< kl -^ 0,

since (x, y) G £?.
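The contrast between lines and parabolas in Example 5.62 can be seen on a few sample values. In the sketch below the value assigned at the origin is taken to be 0, which is an assumption of this illustration, and the slope and coefficient 2 are arbitrary choices.

```python
def f(x, y):
    """f(x, y) = x**2 * y / (x**4 + y**2) away from the origin; f(0, 0) = 0 (assumed)."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)

for t in (1e-1, 1e-2, 1e-4):
    along_line = f(t, 2 * t)          # on the line y = 2x the values tend to 0
    along_parabola = f(t, 2 * t * t)  # on the parabola y = 2x^2 they stay near 2/5
    print(t, along_line, along_parabola)
```

Every straight line through the origin gives the limit 0, yet the parabola $y = ax^2$ gives $a/(1 + a^2)$, so testing lines alone is not enough to decide whether a limit exists.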

We conclude by observing that for functions $f : \mathbb{R}^n \to \mathbb{R}$ the expression

$$\lim_{x \to x_0} f(x) = +\infty$$

means that $\forall M \in \mathbb{R}$ there is $\delta > 0$ such that $f(x) > M$ $\forall x \in B(x_0, \delta) \setminus \{x_0\}$.

5.2 The Topology of Metric Spaces

In this section we introduce some families of subsets of a metric space $X$ that are defined by the metric structure, namely the families of open and closed sets. Recall that if $X$ is a set, $\mathcal{P}(X)$ denotes the set of all subsets of $X$: $A \in \mathcal{P}(X)$ if and only if $A \subset X$.

5.2.1 Basic facts

a. Open sets

5.63 Definition. A subset $A$ of a metric space $(X, d)$ is called an open set if for all $x \in A$ there exists a ball centered at $x$ contained in $A$, i.e.,

$$\forall x \in A \ \exists r_x > 0 \ \text{such that} \ B(x, r_x) \subset A. \tag{5.11}$$

5.64 Proposition. A subset $A$ of a metric space $X$ is open if and only if either $A$ is empty or is a union of open balls.

Proof. Let $A$ be open. Then either $A$ is empty or $A$ is trivially a union of open balls, $A = \cup_{x \in A} B(x, r_x)$. Conversely, (5.11) trivially holds if $A = \emptyset$. If instead $x \in A \ne \emptyset$, since we assume that $A$ is a union of balls, there is $y \in X$ and $\rho > 0$ such that $x \in B(y, \rho) \subset A$. Thus $y \in A$ and, setting $r := \rho - d(x, y)$, we have $r > 0$ and by the triangle inequality $B(x, r) \subset B(y, \rho) \subset A$. $\Box$

In particular,

5.65 Corollary. The open balls of a metric space $X$ are open sets.

5.66 ¶. Let $(X, d)$ be a metric space and $r > 0$. Show that $\{y \in X \mid d(y, x) > r\}$ is an open set in $X$.

5.67 ¶. Let $(X, d)$ be a metric space. Show that $\{x_n\} \subset X$ converges to $x \in X$ if and only if, for any open set $A$ such that $x \in A$, there exists $\bar{n}$ such that $x_n \in A$ for all $n \ge \bar{n}$.

The following is also easily seen.

5.68 Proposition. Let $(X, d)$ be a metric space. Then
(i) $\emptyset$ and $X$ are open sets,
(ii) if $\{A_\alpha\}$ is a family of open sets, then $\cup_\alpha A_\alpha$ is an open set, too,
(iii) if $A_1, A_2, \dots, A_n$ are finitely many open sets, then $\cap_{k=1}^{n} A_k$ is open.

5.69 ¶. By considering the open sets $\{ ]-\frac{1}{n}, \frac{1}{n}[ \ \mid n \in \mathbb{N}\}$, show that the intersection of infinitely many open sets need not be an open set.

b. Closed sets

Recall that the complement of $A \subset X$ is the set $A^c := X \setminus A$.

5.70 Definition. Let $X$ be a metric space. $F \subset X$ is called a closed set if $F^c = X \setminus F$ is an open set.

The de Morgan formulas

$$\Big(\bigcup_\alpha A_\alpha\Big)^c = \bigcap_\alpha A_\alpha^c, \qquad \Big(\bigcap_\alpha A_\alpha\Big)^c = \bigcup_\alpha A_\alpha^c$$

together with Proposition 5.68 yield at once the following.

5.71 Proposition. Let $X$ be a metric space. Then
(i) $\emptyset$ and $X$ are closed sets,
(ii) the intersection of any family of closed sets is a closed set,
(iii) the union of finitely many closed sets is a closed set.

5.72 ¶. Show that $[a, b]$, $[a, +\infty[$ and $[-a, +\infty[$ are closed sets in $\mathbb{R}$, while $[0, 1[$ is neither closed nor open.

5.73 ¶. Show that the set $\{\frac{1}{n} \mid n = 1, 2, \dots\}$ is neither closed nor open.

5.74 ¶. Show that any finite subset of a metric space is a closed set.

5.75 ¶. Show that the closed ball $\{x \in X \mid d(x, x_0) \le r\}$ and its boundary $\{x \in X \mid d(x, x_0) = r\}$ are closed sets.

One may characterize closed sets in terms of convergent sequences.

5.76 Proposition. Let $(X, d)$ be a metric space. A set $F \subset X$ is a closed set if and only if every convergent sequence with values in $F$ converges to a point of $F$.

Proof. Suppose that $F$ is closed and that $\{x_n\} \subset F$ converges to $x \in X$. Let us prove that $x \in F$. Assuming, on the contrary, that $x \notin F$, there exists $r > 0$ such that $B(x, r) \cap F = \emptyset$. As $\{x_n\} \subset F$, we have $d(x_n, x) \ge r$ $\forall n$, a contradiction since $d(x_n, x) \to 0$.

Conversely, suppose that, whenever $\{x_k\} \subset F$ and $x_k \to x$, we have $x \in F$, but $F$ is not closed. Thus $X \setminus F$ is not open, hence there exists a point $x \in X \setminus F$ such that $\forall r > 0$ $B(x, r) \cap F \ne \emptyset$. Choosing $r = 1, \frac{1}{2}, \frac{1}{3}, \dots$, we inductively construct a sequence $\{x_n\} \subset F$ such that $d(x_n, x) < \frac{1}{n}$, hence converging to $x$. Thus $x \in F$ by assumption, but $x \in X \setminus F$ by construction, a contradiction. $\Box$

c. Continuity

5.77 Theorem. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces and $f : X \to Y$. Then the following claims are equivalent
(i) $f$ is continuous,
(ii) $f^{-1}(B)$ is an open set in $X$ for any open ball $B$ of $Y$,
(iii) $f^{-1}(A)$ is an open set in $X$ for any open set $A$ in $Y$,
(iv) $f^{-1}(F)$ is a closed set in $X$ for any closed set $F$ in $Y$.

Proof. (i) $\Rightarrow$ (ii). Let $B$ be an open ball in $Y$ and let $x$ be a point in $f^{-1}(B)$. Since $f(x) \in B$, there exists a ball $B_Y(f(x),\epsilon) \subset B$. Since $f$ is continuous at $x$, there exists $\delta > 0$ such that $f(B_X(x,\delta)) \subset B_Y(f(x),\epsilon) \subset B$, that is, $B_X(x,\delta) \subset f^{-1}(B)$. As $x$ is arbitrary, $f^{-1}(B)$ is an open set in $X$.

(ii) $\Rightarrow$ (i). Suppose $f^{-1}(B)$ is open for any open ball $B$ of $Y$. Then, given $x_0$ and $\epsilon > 0$, $f^{-1}(B_Y(f(x_0),\epsilon))$ is open, hence there is $\delta > 0$ such that $B_X(x_0,\delta)$ is contained in $f^{-1}(B_Y(f(x_0),\epsilon))$, i.e., $f(B_X(x_0,\delta)) \subset B_Y(f(x_0),\epsilon)$, hence $f$ is continuous at $x_0$.

(ii) and (iii) are equivalent since every open set of $Y$ is a union of open balls and $f^{-1}(\bigcup_i A_i) = \bigcup_i f^{-1}(A_i)$ for any family $\{A_i\}$ of subsets of $Y$.

(iii) and (iv) are equivalent on account of the de Morgan formulas. □

5.78 ¶. Let $f, g : X \to Y$ be two continuous functions between metric spaces. Show that the set $\{x \in X \mid f(x) = g(x)\}$ is closed.

5.79 ¶. It is convenient to set

Definition. Let $(X,d)$ be a metric space. $U \subset X$ is said to be a neighborhood of $x_0 \in X$ if there exists an open set $A$ of $X$ such that $x_0 \in A \subset U$.

In particular
o $B(x_0,r)$ is a neighborhood of any $x \in B(x_0,r)$,
o $A$ is open if and only if $A$ is a neighborhood of any point of $A$.

Let $(X,d_X)$, $(Y,d_Y)$ be two metric spaces, let $x_0 \in X$ and let $f : X \to Y$. Show that $f$ is continuous at $x_0$ if and only if the inverse image of every neighborhood of $f(x_0)$ is a neighborhood of $x_0$.

Finally, we state a junction rule for continuous functions, see Proposition 5.38.

5.80 Proposition. Let $(X,d)$ be a metric space, and let $\{U_\alpha\}$ be a covering of $X$. Suppose that either all $U_\alpha$'s are open sets or all $U_\alpha$'s are closed and for any $x \in X$ there is an open ball that intersects only finitely many $U_\alpha$. Then (i) A F are continuous.

5.81 ¶. Some kind of assumption on the covering in Proposition 5.80 is necessary. If $X := [a,b]$, $x_0 \in [a,b]$, $U_x := \{x\}$ for all $x \in [a,b]$, show that the claims in Proposition 5.80 are false.

d. Continuous real-valued maps

Let $(X,d)$ be a metric space and $f : X \to \mathbb{R}$. From Theorem 5.77 we find that $f : X \to \mathbb{R}$ is continuous if and only if $f^{-1}(]a,b[)$ is an open set for every bounded interval $]a,b[ \subset \mathbb{R}$. Moreover,

5.82 Corollary. Let $f : X \to \mathbb{R}$ be a continuous function defined on a metric space $X$ and let $t \in \mathbb{R}$. Then

(i) $\{x \in X \mid f(x) > t\}$, $\{x \in X \mid f(x) < t\}$ are open sets,
(ii) $\{x \in X \mid f(x) \ge t\}$, $\{x \in X \mid f(x) \le t\}$ and $\{x \in X \mid f(x) = t\}$ are closed sets.

5.83 Proposition. Let $(X,d)$ be a metric space. Then $F \subset X$ is a closed set of $X$ if and only if $F = \{x \mid d(x,F) = 0\}$.

Proof. By Corollary 5.82, $\{x \mid d(x,F) = 0\}$ is closed, $x \to d(x,F)$ being Lipschitz continuous, see Example 5.25. Therefore $F = \{x \mid d(x,F) = 0\}$ implies that $F$ is closed. Conversely assume that $F$ is closed and that there exists $x \notin F$ such that $d(x,F) = 0$. Since $F$ is closed by assumption, there exists $r > 0$ such that $B(x,r) \cap F = \emptyset$. But then $d(x,F) \ge r > 0$, a contradiction. □

5.84 ¶. Prove the following

Proposition. Let $(X,d)$ be a metric space. Then
(i) $F \subset X$ is a closed set if and only if there exists a continuous function $f : X \to \mathbb{R}$ such that $F = \{x \in X \mid f(x) \le 0\}$,
(ii) $A \subset X$ is an open set if and only if there exists a continuous function $f : X \to \mathbb{R}$ such that $A = \{x \in X \mid f(x) < 0\}$.
Actually $f$ can be chosen to be a Lipschitz-continuous function.

[Hint: If $F$ is closed, choose $f(x) := d(x,F)$, while if $A$ is an open set, choose $f(x) = -d(x, X \setminus A)$.]
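
As a small numerical illustration of Proposition 5.83 and of the hint, the following Python sketch takes a finite (hence closed) set $F \subset \mathbb{R}$ and checks on random samples that $x \mapsto d(x,F)$ is 1-Lipschitz and vanishes at the points of $F$. The helper names `dist_to_set` and `lipschitz_violations`, the sampling interval and the tolerance are assumptions made for this sketch; a random check of this kind is not a proof.

```python
import random


def dist_to_set(x, F):
    """Distance from the real number x to the finite nonempty set F."""
    return min(abs(x - p) for p in F)


def lipschitz_violations(F, trials=10_000, seed=0, tol=1e-12):
    """Count sampled pairs (x, y) with |d(x,F) - d(y,F)| > |x - y| (+ tol)."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x = rng.uniform(-5.0, 5.0)
        y = rng.uniform(-5.0, 5.0)
        if abs(dist_to_set(x, F) - dist_to_set(y, F)) > abs(x - y) + tol:
            bad += 1
    return bad


if __name__ == "__main__":
    F = [0.0, 1.0, 2.5]
    print("violations:", lipschitz_violations(F))
    print("d(1, F) =", dist_to_set(1.0, F), " d(0.5, F) =", dist_to_set(0.5, F))
```

Here $F$ is recovered as the zero set of the sampled function, in line with $F = \{x \mid d(x,F) = 0\}$ for closed $F$.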

e. The topology of a metric space

5.85 Definition. The topology of a metric space $X$ is the family $\tau_X \subset \mathcal{P}(X)$ of its open sets.

It may happen that different distances $d_1$ and $d_2$ on the same set $X$ that define different families of balls produce the same family of open sets, for the same reason that a ball is a union of infinitely many squares and a square is a union of infinitely many balls. We say that the two distances are topologically equivalent if $(X,d_1)$ and $(X,d_2)$ have the same topology, i.e., the same family of open sets. The following proposition yields necessary and sufficient conditions in order that two distances be topologically equivalent.

5.86 Proposition. Let $d_1$, $d_2$ be two distances in $X$ and let $B_1(x,r)$ and $B_2(x,r)$ be the corresponding balls of center $x$ and radius $r$. The following claims are equivalent
(i) $d_1$ and $d_2$ are topologically equivalent,
(ii) every ball $B_1(x,r)$ is open for $d_2$ and every ball $B_2(x,r)$ is open for $d_1$,
(iii) $\forall x \in X$ and $r > 0$ there are $r_x, \rho_x > 0$ such that $B_2(x,r_x) \subset B_1(x,r)$ and $B_1(x,\rho_x) \subset B_2(x,r)$,
(iv) the identity map $i : X \to X$ is a homeomorphism between the metric spaces $(X,d_1)$ and $(X,d_2)$.
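
To make condition (iii) concrete, here is a hedged Python sketch for the special case of $\mathbb{R}^n$ with the maximum distance $d_\infty$ and the Euclidean distance $d_2$. It samples random pairs of points and checks the comparison $d_\infty \le d_2 \le \sqrt{n}\, d_\infty$, which gives the nested balls $B_2(x,r) \subset B_\infty(x,r)$ and $B_\infty(x, r/\sqrt{n}) \subset B_2(x,r)$. The function names, the sampling box and the tolerance are choices made for this illustration only.

```python
import math
import random


def d2(x, y):
    """Euclidean distance on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dinf(x, y):
    """Maximum (sup) distance on R^n."""
    return max(abs(a - b) for a, b in zip(x, y))


def comparison_holds(n=3, trials=10_000, seed=0, tol=1e-12):
    """Check d_inf <= d_2 <= sqrt(n) * d_inf on random pairs of points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(n)]
        y = [rng.uniform(-10, 10) for _ in range(n)]
        if dinf(x, y) > d2(x, y) + tol:
            return False
        if d2(x, y) > math.sqrt(n) * dinf(x, y) + tol:
            return False
    return True


if __name__ == "__main__":
    print(comparison_holds())
```

With these two inequalities one may take the same radius for the first inclusion and the radius $r/\sqrt{n}$ for the second, independently of the center $x$.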

Figure 5.17. $x$ is an interior point to $A$, $y$ is a boundary point to $A$ and $z$ is an exterior point to $A$. $x$ and $y$ are adherent points to $A$ and $z$ is not.

5.87 ¶. Show that the distances in $\mathbb{R}^n$ $d_\infty$ and $d_p$ $\forall p \ge 1$, see Exercise 5.13, are all topologically equivalent to the Euclidean distance $d_2$. If we substitute $\mathbb{R}^n$ with the infinite-dimensional vector space of sequences $\ell_1$, the three distances give rise to different open sets.

We say that a property of $X$ is a topological property of $X$ if it can be expressed only in terms of set operations and open sets. For instance, being an open or closed set, being the closure or the boundary of a set, or being a convergent sequence are topological properties of $X$, see Section 5.2.2 for more.

As we have seen, $f$ is continuous if and only if the inverse image of open sets is open. A trivial consequence, for instance, is that the composition of continuous functions is continuous, see Proposition 5.30. Also we see that the continuity of $f : X \to Y$ is strongly related to the topologies

$$\tau_X := \{A \subset X \mid A \text{ open in } X\}, \qquad \tau_Y := \{A \subset Y \mid A \text{ open in } Y\},$$

respectively on $X$ and $Y$, and in fact it depends on the metrics only through $\tau_X$ and $\tau_Y$. In other words, being a continuous function $f : X \to Y$ is a topological property of $X$ and $Y$.

f. Interior, exterior, adherent and boundary points

5.88 Definition. Let $X$ be a metric space and $A \subset X$. We say that $x_0 \in X$ is interior to $A$ if there is an open ball $B(x_0,r)$ such that $B(x_0,r) \subset A$; we say that $x_0$ is exterior to $A$ if $x_0$ is interior to $X \setminus A$; we say that $x_0$ is adherent to $A$ if it is not interior to $X \setminus A$; finally, we say that $x_0$ is a boundary point of $A$ if $x_0$ is neither interior to $A$ nor interior to $X \setminus A$.

The set of interior points to $A$ is denoted by $\overset{\circ}{A}$ or by $\mathrm{int}\,A$, the set of adherent points of $A$, called also the closure of $A$, is denoted by $\overline{A}$ or by $\mathrm{cl}(A)$, and finally the set of boundary points to $A$ is called the boundary of $A$ and is denoted by $\partial A$.

5.89 ¶. Let $(X,d)$ be a metric space and $B(x_0,r)$ be an open ball of $X$. Show that
(i) every point of $B(x_0,r)$ is interior to $B(x_0,r)$, i.e., $\mathrm{int}\,B(x_0,r) = B(x_0,r)$,
(ii) every point $x$ such that $d(x,x_0) = r$ is a boundary point to $B(x_0,r)$, i.e., $\partial B(x_0,r) = \{x \mid d(x,x_0) = r\}$,
(iii) every point $x$ with $d(x,x_0) > r$ is exterior to $B(x_0,r)$,

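(iv) every point $x$ such that $d(x,x_0) \le r$ is adherent to $B(x_0,r)$, i.e., $\mathrm{cl}(B(x_0,r)) = \{x \mid d(x,x_0) \le r\}$.

5.90 ¶. Let $X$ be a metric space and $A \subset X$. Show that
(i) $\mathrm{int}\,A \subset A$,
(ii) $\mathrm{int}\,A$ is an open set and actually the largest open set contained in $A$, $\mathrm{int}\,A = \bigcup \{U \mid U \text{ open}, U \subset A\}$,
(iii) $A$ is open if and only if $A = \mathrm{int}\,A$.

5.91 ¶. Let $X$ be a metric space and $A \subset X$. Show that
(i) $A \subset \overline{A}$,
(ii) $\overline{A}$ is closed and actually the smallest closed set that contains $A$, i.e., $\mathrm{cl}(A) = \bigcap \{F \mid F \text{ closed}, F \supset A\}$,
(iii) $A$ is closed if and only if $A = \overline{A}$,
(iv) $\overline{A} = \{x \in X \mid d(x,A) = 0\}$.

5.92 ¶. Let $X$ be a metric space and $A \subset X$. Show that
(i) $\partial A = \partial(X \setminus A)$,
(ii) $\partial A \cap \mathrm{int}\,A = \emptyset$, $\overline{A} = \partial A \cup \mathrm{int}\,A$, $\partial A = \overline{A} \setminus \mathrm{int}\,A$,
(iii) $\partial A = \overline{A} \cap \overline{A^c}$, in particular $\partial A$ is a closed set,
(iv) $\partial\partial A \subset \partial A$, $\overline{\overline{A}} = \overline{A}$, $\mathrm{int}\,\mathrm{int}\,A = \mathrm{int}\,A$,
(v) $A$ is closed if and only if $\partial A \subset A$,
(vi) $A$ is open if and only if $\partial A \cap A = \emptyset$.

5.93 ¶. Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces and $f : X \to Y$. Show that the following claims are equivalent
(i) $f : X \to Y$ is continuous,
(ii) $f(\overline{A}) \subset \overline{f(A)}$ for all $A \subset X$,
(iii) $\overline{f^{-1}(B)} \subset f^{-1}(\overline{B})$ for all $B \subset Y$.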

g. Points of accumulation

Let $A \subset X$ be a subset of a metric space. The set of points of accumulation, or cluster points, of $A$, denoted by $\mathcal{D}A$, is sometimes called the derived set of $A$. Trivially $\mathcal{D}A \subset \overline{A}$, and the set of adherent points to $A$ that are not points of accumulation of $A$, $\mathcal{I}(A) := \overline{A} \setminus \mathcal{D}A$, are the points $x \in A$ such that $B(x,r) \cap A = \{x\}$ for some $r > 0$. These points are contained in $A$, $\mathcal{I}(A) = A \setminus \mathcal{D}A \subset A$, and are called isolated points of $A$.

5.94 ¶. Show that $\mathcal{D}A \subset \overline{A}$ and that $A$ is closed if and only if $\mathcal{D}A \subset A$.

5.95 Proposition. Let $(X,d)$ be a metric space, $F \subset X$ and $x \in X$. We have
(i) $x$ is adherent to $F$ if and only if there exists a sequence $\{x_n\} \subset F$ that converges to $x$,
(ii) $x$ is an accumulation point for $F$ if and only if there exists a sequence $\{x_n\} \subset F$ taking distinct values in $F$ that converges to $x$; in particular,
a) $x$ is an accumulation point for $F$ if and only if there exists a sequence $\{x_n\} \subset F \setminus \{x\}$ that converges to $x$,
b) in every open set containing an accumulation point for $F$ there are infinitely many distinct points of $F$.

Proof. (i) If there is a sequence $\{x_n\} \subset F$ that converges to $x \in X$, in every neighborhood of $x$ there is at least a point of $F$, hence $x$ is adherent to $F$. Conversely, if $x$ is adherent to $F$, there is an $x_n \in B(x, \frac{1}{n}) \cap F$ for each $n$, hence $\{x_n\} \subset F$ and $x_n \to x$.
(ii) If moreover $x$ is a point of accumulation of $F$, we can choose $x_n \in F \setminus \{x\}$ and moreover $x_n \in B(x, r_n)$, $r_n := \min(d(x, x_{n-1}), \frac{1}{n})$. The sequence $\{x_n\}$ has the desired properties. □

h. Subsets and relative topology

Let $(X,d)$ be a metric space and $Y \subset X$. Then $(Y,d)$ is a metric space, too. The family of open sets in $Y$ induced by the distance $d$ is called the relative topology of $Y$. We want to compare the topology of $X$ and the relative topology of $Y$. The open ball in $Y$ with center $x \in Y$ and radius $r > 0$ is

$$B_Y(x,r) := \{y \in Y \mid d(y,x) < r\} = B_X(x,r) \cap Y.$$

5.96 Proposition. Let $(X,d)$ be a metric space and let $Y \subset X$. Then
(i) $B$ is open in $Y$ if and only if there exists an open set $A$ in $X$ such that $B = A \cap Y$,
(ii) $B$ is closed in $Y$ if and only if there exists a closed set $A$ in $X$ such that $B = A \cap Y$.

Proof. Since (ii) follows at once from (i), we prove (i). Suppose that $A$ is open in $X$ and let $x$ be a point in $A \cap Y$. Since $A$ is open in $X$, there exists a ball $B_X(x,r) \subset A$, hence $B_Y(x,r) = B_X(x,r) \cap Y \subset A \cap Y$. Thus $A \cap Y$ is open in $Y$. Conversely, suppose that $B$ is open in $Y$. Then for any $x \in B$ there is a ball $B_Y(x,r_x) = B_X(x,r_x) \cap Y \subset B$. The set $A := \bigcup \{B_X(x,r_x) \mid x \in B\}$ is an open set in $X$ and $A \cap Y = B$. □

Also the notions of interior, exterior, adherent and boundary points in $(Y,d)$ are related to the same notions in $(X,d)$, and whenever we want to emphasize the dependence on $Y$ of the interior, closure, derived and boundary sets we write $\mathrm{int}_Y(A)$, $\overline{A}^Y$, $\mathcal{D}_Y A$, $\partial_Y A$ instead of $\mathrm{int}\,A$, $\overline{A}$, $\mathcal{D}A$, $\partial A$.

5.97 Proposition. For any $A \subset Y$ we have
(i) $\mathrm{int}_Y(A) \supset \mathrm{int}_X(A) \cap Y$,
(ii) $\overline{A}^Y = \overline{A}^X \cap Y$,
(iii) $\mathcal{D}_Y A = \mathcal{D}_X A \cap Y$,
(iv) $\partial_Y A = \partial_X A \setminus \partial_X Y$.

Figure 5.18. $\partial A$ and $\partial_Y A$.

5.98 ¶. Let $Y := [0,1[ \subset \mathbb{R}$. The open balls of $Y$ are the subsets of the type $\{y \in [0,1[ \;\mid\; |y - x| < r\}$. If $x$ is not zero and $r$ is sufficiently small, $\{y \mid |y - x| < r\} \cap [0,1[$ is again an open interval with center $x$, $]x - r, x + r[$. But, if $x = 0$, then for $r < 1$, $B_Y(0,r) := [0,r[$. Notice that $x = 0$ is an interior point of $Y$ (for the relative topology of $Y$), but it is a boundary point for the topology of $X = \mathbb{R}$. This is in agreement with the intuition: in the first case we are considering $Y$ as a space in itself and nothing exists outside it, every point is an interior point and $\partial_Y Y = \emptyset$; in the second case $Y$ is a subset of $\mathbb{R}$ and $0$ is at the frontier between $Y$ and $\mathbb{R} \setminus Y$.

5.99 ¶. Prove the claims of this paragraph that we have not proved.

5.2.2 A digression on general topology

a. Topological spaces

As a further step of abstraction, following Felix Hausdorff (1869-1942) and Kazimierz Kuratowski (1896-1980), we can separate the topological structure of open sets from the metric structure, giving a set-theoretic definition of open sets in terms of their properties.

5.100 Definition. Let $X$ be a set. A topology in $X$ is a distinguished family of subsets $\tau \subset \mathcal{P}(X)$, called open sets, such that
o $\emptyset, X \in \tau$,
o if $\{A_\alpha\} \subset \tau$, then $\bigcup_\alpha A_\alpha \in \tau$,
o if $A_1, A_2, \dots, A_n \in \tau$, then $\bigcap_{k=1}^n A_k \in \tau$.
A set $X$ endowed with a topology is called a topological space. Sometimes we write it as $(X,\tau)$.

5.101 Definition. A function $f : X \to Y$ between topological spaces $(X,\tau_X)$ and $(Y,\tau_Y)$ is said to be continuous if $f^{-1}(B) \in \tau_X$ whenever $B \in \tau_Y$. $f : X \to Y$ is said to be a homeomorphism if $f$ is both injective and surjective and both $f$ and $f^{-1}$ are continuous, or, in other words, $A \in \tau_X$ if and only if $f(A) \in \tau_Y$. Two topological spaces are said to be homeomorphic if and only if there exists a homeomorphism between them.

Proposition 5.68 then reads as follows.

5.102 Proposition. Let $(X,d)$ be a metric space. Then the family formed by the empty set and by the sets that are the union of open balls of $X$ is a topology on $X$, called the topology on $X$ induced by the metric $d$.

The topological structure is more flexible than the metric structure, and allows us to greatly enlarge the notion of the space on which we can operate with continuous deformations. This is in fact necessary if one wants to deal with qualitative properties of geometric figures, in the old terminology, with analysis situs. We shall not dwell on these topics nor on the systematic analysis of different topologies that one can introduce on a set, i.e., on the study of general topology. However, it is proper to distinguish between metric properties and topological properties.

According to Felix Klein (1849-1925) a geometry is the study of the properties of figures or spaces that are invariant under the action of a certain set of transformations. For instance, Euclidean plane geometry is the study of the plane figures and of their properties that are invariant under similarity transformations.

Given a metric space $(X,d)$, a property of an object defined in terms of the set operations in $X$ and of the metric of $X$ is a metric property of $X$; for instance, whether $\{x_n\} \subset X$ is convergent or not is a metric property of $X$. More generally, in the class of metric spaces, the natural transformations are the isometries, i.e., those maps $h : (X,d_X) \to (Y,d_Y)$ that are one-to-one and do not change the distances, $d_Y(h(x),h(y)) = d_X(x,y)$. Also two metric spaces $(X,d_X)$ and $(Y,d_Y)$ are said to be isometric if there exists an isometry between them.
A metric invariant is a predicate defined on a class of metric spaces that is true (respectively, false) for all spaces isometric with $(X,d)$ whenever it is true (false) for $(X,d)$. With this language, the metric properties that make sense for a class of metric spaces, being evidently preserved by isometries, are metric invariants. And the Geometry of Metric Spaces, that is the study of metric spaces and of their metric properties, is in fact the study of metric invariants.

5.103 ¶. Let $(X,d_1)$ and $(Y,d_2)$ be two metric spaces and denote by $B_1(x,r)$ and $B_2(x,r)$ the balls centered at $x$ of radius $r$ for the metrics $d_1$ and $d_2$, respectively. Show that a one-to-one map $h : X \to Y$ is an isometry if and only if the action of $h$ preserves the balls, i.e.,

$$h(B_1(x,r)) = B_2(h(x),r) \qquad \forall x \in X, \ \forall r > 0.$$

Similarly, given a topological space $(X,\tau_X)$, a property of an object defined in terms of the set operations and open sets of $X$ is called a topological property of $X$; for instance, being an open or closed subset, being the closure or boundary of a subset, or being a convergent sequence in $X$ are topological properties of $X$. In the class of topological spaces, the natural group of transformations is the group of homeomorphisms, which are precisely the one-to-one maps whose actions preserve the open sets. Two topological spaces are said to be homeomorphic if there is a homeomorphism from one to the other. A topological invariant is a predicate defined on a class of topological spaces that is true (false) in any topological space that is homeomorphic to $X$ whenever it is true (false) on $X$. With this language, topological properties that make sense for a class of topological spaces, being evidently preserved by the homeomorphisms, are topological invariants. And topology, that is the study of objects and of their properties that are preserved by the action of homeomorphisms, is in fact the study of topological invariants.

b. Topologizing a set

On a set $X$ we may introduce several topologies, that is subsets of $\mathcal{P}(X)$. Since such subsets are ordered by inclusion, topologies are partially ordered by inclusion. On one hand, we may consider the indiscrete topology $\tau = \{\emptyset, X\}$ in which no other sets than $\emptyset$ and $X$ are open, thus there are no "small" neighborhoods. On the other hand, we can consider the discrete topology in which any subset is an open set, $\tau = \mathcal{P}(X)$, thus any point is an open set.

There is a kind of general procedure to introduce a topology in such a way that the sets of a given family $\mathcal{E} \subset \mathcal{P}(X)$ are all open sets. Of course we can take the discrete topology, but what is significant is the smallest family of subsets $\tau$ that contains $\mathcal{E}$ and is closed with respect to finite intersections and arbitrary unions. This is called the coarsest topology or the weakest topology for which $\mathcal{E} \subset \tau$. It is unique and can be obtained by adding to $\mathcal{E}$, if necessary, the empty set, $X$, the finite intersections of elements of $\mathcal{E}$ and the arbitrary unions of these finite intersections.
This previous construction is necessary, but in general it is quite complicated and $\mathcal{E}$ loses control on $\tau$, since $\tau$ builds up from finite intersections of elements of $\mathcal{E}$. However, if the family $\mathcal{E}$ has the following property, as for instance it happens for the balls of a metric space, this can be avoided. A basis $\mathcal{B}$ of $X$ is a family of subsets of $X$ with the following property: for every couple $U_\alpha$ and $U_\beta \in \mathcal{B}$ there is $U_\gamma \in \mathcal{B}$ such that $U_\gamma \subset U_\alpha \cap U_\beta$. We have the following.

5.104 Proposition. Let $\mathcal{B} = \{U_\alpha\}$ be a basis for $X$. Then the family $\tau$ consisting of $\emptyset$, $X$ and all the unions of members of $\mathcal{B}$ is the weakest topology in $X$ containing $\mathcal{B}$.

c. Separation properties

It is worth noticing that several separation properties that are trivial in a metric space do not hold, in general, in a topological space. The following claims,
(i) sets consisting of a single point are closed,
(ii) for any two distinct points $x$ and $y \in X$ there exist disjoint open sets $A$ and $B$ such that $x \in A$ and $y \in B$,
(iii) for any $x \in X$ and closed set $F \subset X$ with $x \notin F$ there exist disjoint open sets $A$ and $B$ such that $x \in A$ and $F \subset B$,
(iv) for any pair of disjoint closed sets $E$ and $F$ there exist disjoint open sets $A$ and $B$ such that $E \subset A$ and $F \subset B$,
are all true in a metric space, but do not hold in the indiscrete topology. A topological space is called a Hausdorff topological space if (ii) holds, regular if (iii) holds and normal if (iv) holds. It is easy to show that (i) and (iv) imply (iii), (i) and (iii) imply (ii), and (ii) implies (i).

We conclude by stating a theorem that tells us when a topological space is metrizable, i.e., when we can introduce a metric on it so that the topology is the one induced by the metric.

5.105 Theorem (Urysohn). A topological space $X$ with a countable basis is metrizable if and only if it is regular.

5.3 Completeness

a. Complete metric spaces

5.106 Definition. A sequence $\{x_n\}$ with values in a metric space $(X,d)$ is a Cauchy sequence if

$$\forall \epsilon > 0 \ \exists \nu \ \text{such that} \ d(x_n,x_m) < \epsilon \ \ \forall n, m \ge \nu.$$

It is easily seen that

5.107 Proposition. In a metric space
(i) every convergent sequence is a Cauchy sequence,
(ii) any subsequence of a Cauchy sequence is again a Cauchy sequence,
(iii) if $\{x_{k_n}\}$ is a subsequence of a Cauchy sequence $\{x_n\}$ such that $x_{k_n} \to x_0$, then $x_n \to x_0$.

5.108 Definition. A metric space $(X,d)$ is called complete if every Cauchy sequence converges in $X$.

By definition, being a Cauchy sequence and being a complete metric space are metric invariants. With Definition 5.108, Theorems 2.35 and 4.23 of [GM2] read as: $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{C}$ are complete metric spaces. Moreover, since

$$|x_1|, |x_2|, \dots, |x_n| \le \|x\| \le \sum_{i=1}^n |x_i| \qquad \forall x = (x_1, x_2, \dots, x_n),$$

$\{x_k\} \subset \mathbb{R}^n$ or $\mathbb{C}^n$ is a convergent sequence (respectively, Cauchy sequence) if and only if the sequences of coordinates $\{x_k^i\}$, $i = 1, \dots, n$, are convergent sequences (Cauchy sequences). Thus

5.109 Theorem. For all $n \ge 1$, $\mathbb{R}^n$ and $\mathbb{C}^n$ endowed with the Euclidean metric are complete metric spaces.

b. Completion of a metric space

Several useful metric spaces are complete. Notice that closed sets of a complete metric space are complete metric spaces with the induced distance. However, there are noncomplete metric spaces. The simplest significant examples are of course the open intervals of $\mathbb{R}$ and the set of rational numbers with the Euclidean distance.

Let $X$ be a metric space. A complete metric space $X^*$ is called a completion of $X$ if
(i) $X$ is isometric to a subspace $\hat{X}$ of $X^*$,
(ii) $\hat{X}$ is dense in $X^*$, i.e., $\mathrm{cl}\,\hat{X} = X^*$.
We have the following.

5.110 Theorem (Hausdorff). Every metric space $X$ has a completion and any two completions of $X$ are isometric.

Though every noncomplete metric space can be regarded as a subspace of its completion, it is worth remarking that from an effective point of view the real problem is to realize a suitable, handy model of this completion. For instance, the Hausdorff model, when applied to the rationals, constructs the completion as equivalence classes of Cauchy sequences of rationals, instead of real numbers. In the same way, the Hausdorff procedure applied to a metric space $X$ of functions produces a space of equivalence classes of Cauchy sequences. It would be desirable to obtain another class of functions as completion, instead. But this can be far from trivial. For instance, a space of functions that is the completion of $C^0([0,1])$ with the $L^1([0,1])$ distance can be obtained by the Lebesgue integration theory.

5.111 ¶. Show that a closed set $F$ of a complete metric space is complete with the induced metric.

5.112 ¶. Let $(X,d)$ be a complete metric space and $A \subset X$. Show that the closure of $A$ is a completion of $A$.

Proof of Theorem 5.110. In fact we outline the main steps, leaving to the reader the task of completing the arguments.

(i) We consider the family of all Cauchy sequences of $X$ and we say that two Cauchy sequences $\{y_n\}$ and $\{z_n\}$ are equivalent if $d(y_n,z_n) \to 0$ (i.e., if, a posteriori, $\{y_n\}$ and $\{z_n\}$ "have the same limit"). Denote by $\tilde{X}$ the set of equivalence classes obtained this way. Given two classes of equivalence $Y$ and $Z$ in $\tilde{X}$, let $\{y_n\}$ and $\{z_n\}$ be two representatives respectively of $Y$ and $Z$. Then one sees

Figure 5.19. Felix Hausdorff (1869-1942) and Rene-Louis Baire (1874-1932).

(a) $\{d(z_n,y_n)\}$ is a Cauchy sequence of real numbers, hence converges to a real number. Moreover, such a limit does not depend on the representatives $\{y_n\}$ and $\{z_n\}$ of $Y$ and $Z$, so that we can set $\tilde{d}(Z,Y) := \lim_n d(z_n,y_n)$.
(b) $\tilde{d}(Y,Z)$ is a distance in $\tilde{X}$.

(ii) Let $\hat{X}$ be the subspace of $\tilde{X}$ of the equivalence classes of the constant sequences with values in $X$. It turns out that $X$ is isometric to $\hat{X}$. Let $Y \in \tilde{X}$ and let $\{y_n\}$ be a representative of $Y$. Denote by $Y_\nu$ the class of all Cauchy sequences that are equivalent to the constant sequence $\{z_n\}$ where $z_n := y_\nu$ $\forall n$. Then it is easily seen that $Y_\nu \to Y$ in $\tilde{X}$ and that $\hat{X}$ is dense in $\tilde{X}$.

(iii) Let $\{Y_\nu\}$ be a Cauchy sequence in $\tilde{X}$. For all $\nu$ we choose $Z_\nu \in \hat{X}$ such that $\tilde{d}(Y_\nu, Z_\nu) < 1/\nu$ and we let $z_\nu \in X$ be a representative of $Z_\nu$. Then we see that $\{z_\nu\}$ is a Cauchy sequence in $X$ and, if $Z$ is the equivalence class of $\{z_\nu\}$, then $Y_\nu \to Z$. This proves that $\tilde{X}$ is complete.

(iv) It remains to prove that any two completions are isometric. Suppose that $\tilde{X}_1$ and $\tilde{X}_2$ are two completions of $X$. With the above notation, we find $\hat{X}_1 \subset \tilde{X}_1$ and $\hat{X}_2 \subset \tilde{X}_2$ that are isometric and one-to-one with $X$. Therefore $\hat{X}_1$ and $\hat{X}_2$ are isometric and in a one-to-one correspondence. Because $\hat{X}_1$ and $\hat{X}_2$ are dense respectively in $\tilde{X}_1$ and $\tilde{X}_2$, it is not difficult to extend the isometry $i : \hat{X}_1 \to \hat{X}_2$ to an isometry between $\tilde{X}_1$ and $\tilde{X}_2$. □

c. Equivalent metrics

Completeness is a metric invariant and not a topological invariant. This means that isometric spaces are either both complete or both noncomplete and that there exist metric spaces $X$ and $Y$ that are homeomorphic, but $X$ is complete and $Y$ is noncomplete. In fact, homeomorphisms preserve convergent sequences but not Cauchy sequences.

5.113 Example. Consider $X := \mathbb{R}$ endowed with the Euclidean metric and $Y := \mathbb{R}$ endowed with the distance

$$d(x,y) := \Big| \frac{x}{1+|x|} - \frac{y}{1+|y|} \Big|.$$

$X$ and $Y$ are homeomorphic: the map $h(x) := \frac{x}{1+|x|}$, $x \in \mathbb{R}$, is a homeomorphism of $\mathbb{R}$ onto $]-1,1[$ and $d(x,y) = |h(x) - h(y)|$, so that the identity map from $X$ to $Y$ is a homeomorphism. In particular both distances give rise to the same converging sequences. However the sequence $\{n\}$ is not a Cauchy sequence for the Euclidean distance, but it is a Cauchy sequence for the metric $d$ since for $n, m \in \mathbb{N}$, $m \ge n \ge \nu$,

$$d(m,n) = \Big| \frac{m}{1+m} - \frac{n}{1+n} \Big| \le 1 - \frac{n}{1+n} = \frac{1}{1+n} \le \frac{1}{1+\nu} \to 0 \qquad \text{as } \nu \to \infty.$$

Since $\{n\}$ does not converge in $(\mathbb{R},d)$, $Y = (\mathbb{R},d)$ is not complete.
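
The estimate in Example 5.113 can be checked with exact rational arithmetic. The following Python sketch, in which the function names `h`, `d` and `tail_bound_holds` and the window size are assumptions made for illustration, verifies $d(m,n) \le \frac{1}{1+\nu}$ for a finite range of integers $m \ge n \ge \nu$, while the Euclidean distance $|m - n|$ stays at least $1$ for $m \ne n$. A finite check of this kind only illustrates the inequality; it does not replace the argument above.

```python
from fractions import Fraction


def h(x):
    """The map x -> x / (1 + |x|) from R onto ]-1, 1[."""
    return x / (1 + abs(x))


def d(x, y):
    """The distance d(x, y) = |h(x) - h(y)| of Example 5.113."""
    return abs(h(x) - h(y))


def tail_bound_holds(nu, width=50):
    """Check d(m, n) <= 1/(1 + nu) for nu <= n <= m inside a finite window."""
    bound = Fraction(1, 1 + nu)
    for n in range(nu, nu + width):
        for m in range(n, n + width):
            if d(Fraction(m), Fraction(n)) > bound:
                return False
    return True


if __name__ == "__main__":
    print(d(Fraction(3), Fraction(1)))   # exact value of d(3, 1)
    print(tail_bound_holds(1), tail_bound_holds(100))
```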

Homeomorphic, but nonisometric spaces can sometimes have the same Cauchy sequences. A sufficient condition ensuring that Cauchy sequences with respect to different metrics on the same set $X$ are the same, is that the two metrics be equivalent, i.e., there exist constants $\lambda_1, \lambda_2 > 0$ such that

$$\lambda_1 d_1(x,y) \le d_2(x,y) \le \lambda_2 d_1(x,y).$$

5.114 ¶. Show that two metrics which are equivalent are also topologically equivalent, compare Proposition 5.86.
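
The reason why equivalent metrics have the same Cauchy sequences is a one-line estimate. As a sketch, assume two distances $d_1$, $d_2$ on $X$ and constants $\lambda_1, \lambda_2 > 0$ with $\lambda_1 d_1 \le d_2 \le \lambda_2 d_1$; then for any sequence $\{x_n\}$ and any indices $n, m$

```latex
d_2(x_n, x_m) \le \lambda_2\, d_1(x_n, x_m),
\qquad
d_1(x_n, x_m) \le \frac{1}{\lambda_1}\, d_2(x_n, x_m).
```

Hence if $d_1(x_n,x_m) < \epsilon/\lambda_2$ for all $n, m \ge \nu$, then $d_2(x_n,x_m) < \epsilon$ for the same indices, and symmetrically if $d_2(x_n,x_m) < \lambda_1 \epsilon$, then $d_1(x_n,x_m) < \epsilon$.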

d. The nested sequence theorem

An extension of Cantor's principle, or the nested intervals theorem in $\mathbb{R}$, see [GM2], holds in a complete metric space.

5.115 Proposition. Let $\{E_k\}$ be a monotone-decreasing sequence of nonempty sets, i.e., $\emptyset \ne E_{k+1} \subset E_k$, $k = 0, 1, \dots$, in a complete metric space $X$. If $\mathrm{diam}(E_k) \to 0$, then there exists one and only one point $x \in X$ with the following property: any ball centered at $x$ contains one, and therefore infinitely many, of the $E_k$'s. Moreover, if all the $E_k$ are closed, then $\bigcap_k E_k = \{x\}$.

As a special case we have the following.

5.116 Corollary. In a complete metric space a sequence of nested closed balls with diameters that converge to zero has a unique common point.

Notice that the conclusion of Corollary 5.116 does not hold if the diameters do not converge to zero: for $E_k := [k,+\infty[ \subset \mathbb{R}$ we have $\emptyset \ne E_{k+1} \subset E_k$ and $\bigcap_k E_k = \emptyset$.

5.117 ¶. Prove Proposition 5.115.

e. Baire's theorem

5.118 Theorem (Baire). Let $X$ be a complete metric space that can be written as a denumerable union of closed sets $\{E_i\}$, $X = \bigcup_{i=1}^\infty E_i$. Then at least one of the $E_i$'s contains a ball of $X$.

Proof. Suppose that none of the $E_i$'s contains a ball of $X$ and let $x_1 \notin E_1$. Since $E_1$ is closed, there is $r_1$ such that $\mathrm{cl}(B(x_1,r_1)) \cap E_1 = \emptyset$. Inside $\mathrm{cl}(B(x_1,r_1/2))$ there is now $x_2 \notin E_2$ (otherwise $\mathrm{cl}(B(x_1,r_1/2)) \subset E_2$, which is a contradiction) and $r_2$ such that $\mathrm{cl}(B(x_2,r_2)) \cap E_2 = \emptyset$; also we may choose $r_2 < r_1/2$. Iterating this procedure we find a monotone-decreasing family of closed balls $\{\mathrm{cl}(B(x_k,r_k))\}$, $\mathrm{cl}(B(x_1,r_1)) \supset \mathrm{cl}(B(x_2,r_2)) \supset \cdots$, such that $\mathrm{cl}(B(x_n,r_n)) \cap E_n = \emptyset$. Thus the common point to all these balls, that exists by Corollary 5.116, would not belong to any of the $E_n$, a contradiction. □

An equivalent formulation is the following.

5.119 Proposition. In a complete metric space $X$, the denumerable intersection of open dense sets is dense in $X$.

5.120 Definition. A subset $A$ of a metric space $X$ is said to be nowhere dense if its closure has no interior point, $\mathrm{int}\,\mathrm{cl}(A) = \emptyset$, equivalently, if $X \setminus \overline{A}$ is dense in $X$. A set is called meager or of the first category if it can be written as a countable union of nowhere dense sets. If a set is not of the first category, then we say that it is of the second category.

5.121 Proposition. In a complete metric space a meager set has no interior point, or, equivalently, its complement is dense.

Proof. Let $\{A_n\}$ be a family such that $\mathrm{int}\,\mathrm{cl}\,A_n = \emptyset$. Suppose there is an open set $U$ with $U \subset \bigcup_n A_n$. From $U \subset \bigcup_n A_n \subset \bigcup_n \overline{A_n}$ we deduce $\bigcap_n (\overline{A_n})^c \subset U^c$. Baire's theorem, see Proposition 5.119, then implies that $U^c$ is dense. Since $U^c$ is closed, we conclude that $U^c = X$, i.e., $U = \emptyset$. □

5.122 Corollary. A complete metric space is a set of second category.

This form of Baire's theorem is often used to prove existence results or to show that a certain property is generic, i.e., holds for "almost all points" in the sense of the category, i.e., that the set

$$X \setminus \{x \in X \mid p(x) \text{ holds}\}$$

is a meager set. In this way one can show¹, see also Chapters 9 and 10, the following.

5.123 Proposition. The class of continuous functions on the interval $[0,1]$ which have infinite right-derivative at every point is of second category in $C^0([0,1])$ with the uniform distance; in particular, there exist continuous functions that are nowhere differentiable.

Finally we notice that, though for a meager set $A$ we have $\mathrm{int}\,A = \emptyset$, we may have $\mathrm{int}\,\mathrm{cl}\,A \ne \emptyset$: consider $A := \mathbb{Q} \subset \mathbb{R}$.

¹ See, e.g., J. Dugundji, Topology, Allyn and Bacon Inc., Boston.

5.4 Exercises

5.124 ¶. Show that $|x - y|^2$ is not a distance in $\mathbb{R}$.

5.125 ¶. Let $(X,d)$ be a metric space and $M > 0$. Show that the functions

$$d_1(x,y) := \min(M, d(x,y)), \qquad d_2(x,y) := d(x,y)/(1 + d(x,y))$$

are also distances on $X$ that give rise to the same topology.

5.126 ¶. Plot the balls of the following metric in $\mathbb{C}$:

$$d(z,w) = \begin{cases} |z - w| & \text{if } \arg z = \arg w \text{ or } z = w, \\ |z| + |w| & \text{otherwise.} \end{cases}$$

5.127 ¶. Let $(X,d_X)$ be a metric space. Show that, if $f : [0,+\infty[ \to [0,+\infty[$ is concave and $f(0) = 0$, then

$$d(x,y) := f(d_X(x,y))$$

is a distance on $X$; in particular $d_X^\alpha(x,y)$ is a distance for any $\alpha$, $0 < \alpha \le 1$. Notice that instead $\|\cdot\|^\alpha$, $0 < \alpha < 1$, is not a norm, if $\|\cdot\|$ is a norm.

5.128 ¶. Let $f : (X,d_X) \to (Y,d_Y)$ be $\alpha$-Hölder continuous. Show that $f : (X,d_X^\alpha) \to (Y,d_Y)$ is Lipschitz continuous.

5.129 ¶. Let $S$ be the space of all sequences of real numbers. Show that the function $d : S \times S \to \mathbb{R}$ given by

if $x = \{x_n\}$, $y = \{y_n\}$, is a distance on $S$.

5.130 ¶ Constancy of sign. Let $X$ and $Y$ be metric spaces and $F$ be a closed set of $Y$, let $x_0$ be a point of accumulation for $X$ and $f : X \to Y$. If $f(x) \to y$ as $x \to x_0$ and $y \notin F$, then there exists $\delta > 0$ such that $f(B(x_0,\delta) \setminus \{x_0\}) \cap F = \emptyset$.

5.131 ¶. Let $(X,d)$ and $(Y,\delta)$ be two metric spaces and let $X \times Y$ be the product metric space with the metric

$$\rho((x_1,y_1),(x_2,y_2)) := \sqrt{d(x_1,x_2)^2 + \delta(y_1,y_2)^2}.$$

Show that the projection maps $\pi(x,y) = x$, $\tilde{\pi}(x,y) = y$,
(i) are continuous,
(ii) map open sets into open sets,
(iii) but, in general, do not map closed sets into closed sets.

5.132 ¶ Continuity of operations on functions. Let $* : Y \times Z \to W$ be a map which we think of as an operation. Given $f : X \to Y$ and $g : X \to Z$, we may then define the map $f * g : X \to W$ by $f * g(x) = f(x) * g(x)$, $x \in X$. Suppose that $X, Y, Z, W$ are metric spaces, and consider $Y \times Z$ as the product metric space with the distance as in Exercise 5.131. Show that if $f, g$ are continuous at $x_0$, and $*$ is continuous at $(f(x_0),g(x_0))$, then $f * g$ is continuous at $x_0$.

5.133 ¶. Show that
(i) the parametric equation of straight lines in $\mathbb{R}^n$, $t \to at + b$, $a, b \in \mathbb{R}^n$, is a continuous function,
(ii) the parametric equation of the helix in $\mathbb{R}^3$, $t \to (\cos t, \sin t, t)$, $t \in \mathbb{R}$, is a continuous function.

5.134 ¶. Let $(X,d_X)$ and $(Y,d_Y)$ be two metric spaces, $E \subset X$, $x_0$ be a point of accumulation of $E$ and $f : E \subset X \to Y$. Show that $f(x) \to y_0$ as $x \to x_0$, $x \in E$, if and only if $\forall \epsilon > 0$ $\exists \delta > 0$ such that $f(E \cap B(x_0,\delta) \setminus \{x_0\}) \subset B(y_0,\epsilon)$.

5.135 ¶. Show that the scalar product in $\mathbb{R}^n$, $(x|y) := \sum_{i=1}^n x^i y^i$, $\forall x = (x^1, x^2, \dots, x^n)$, $y = (y^1, y^2, \dots, y^n) \in \mathbb{R}^n$, is a continuous function of the $2n$ variables $(x,y) \in \mathbb{R}^{2n}$.

5.136 ¶. Find the maximal domain of definition of the following functions and decide whether they are continuous there:

1 + 2/2'

a;-logy'

\/xey — ye^.

5.137 ^ . Decide whether the following functions are continuous

J ^

if(x,2/)^(0,0),

f ^

if(x,2/)^(0,0),

0

if(x,2/) = (0,0).

\o

if(x,2/) = (0,0),

I | ^

if(x,2/)^(0,0),

J;;:^^ [ ^

if 2 / iiy-x^>y/W\' - x 2 > ^ and (a., 2 / ) / ( 0 , 0 ) ,

\o

if(x,2/) = (0,0),

\o

if(x,2/) = (0,0).

5.138 ^ . Compute, if they exist, the limits as (x,y) —> (0,0) of log2 {l-{-xy)

X sin(x2 + 3y^)

s i n x ( l — cosa?)

x^ sin^ y sin(x2 + 2/2^

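5.139 ¶. Consider $\mathbb{R}$ with the Euclidean topology. Show that
(i) if $A = ]a, b[$, we have $\operatorname{int} A = A$, $\overline{A} = [a, b]$ and $\partial A = \{a, b\}$,
(ii) if $A = [a, b[$, we have $\operatorname{int} A = ]a, b[$, $\overline{A} = [a, b]$ and $\partial A = \{a, b\}$,
(iii) if $A = [a, +\infty[$, we have $\operatorname{int} A = ]a, +\infty[$, $\overline{A} = A$ and $\partial A = \{a\}$,
(iv) if $A = \mathbb{Q} \subset \mathbb{R}$, we have $\operatorname{int} A = \emptyset$, $\overline{A} = \mathbb{R}$ and $\partial A = \mathbb{R}$,
(v) if $A = \{(x, y) \in \mathbb{R}^2 \mid x = y\}$, we have $\operatorname{int} A = \emptyset$, $\overline{A} = A$ and $\partial A = A$,
(vi) if $A = \mathbb{N} \subset \mathbb{R}$, we have $\operatorname{int} A = \emptyset$, $\overline{A} = A$ and $\partial A = A$.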

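5.140 ¶. Let $(X, d)$ be a metric space and $\{A_i\}$ be a family of subsets of $X$. Show that
$$\bigcup_i \overline{A_i} \subset \overline{\bigcup_i A_i}, \qquad \overline{\bigcap_i A_i} \subset \bigcap_i \overline{A_i}.$$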

5.141 ¶. Prove the following

Theorem. Any open set $A$ of $\mathbb{R}$ is either empty or a finite or denumerable union of disjoint open intervals with endpoints that do not belong to $A$.

[Hint: Show that (i) $\forall x \in A$ there is an interval $]\xi, \eta[$ with $x \in ]\xi, \eta[$ and $\xi, \eta \notin A$, (ii) if two such intervals $]\xi_1, \eta_1[$ and $]\xi_2, \eta_2[$ have a common point in $A$ and endpoints not in $A$, then they are equal, (iii) since each of those intervals contains a rational, they are at most countably many.]

Show that the previous theorem does not hold in $\mathbb{R}^n$. Show that we instead have the following.

Theorem. Every open set $A \subset \mathbb{R}^n$ is the union of a finite or countable family of cubes with disjoint interiors.

5.142 ¶. Prove the following theorem, see Exercise 5.141.

Theorem. Every closed set $F \subset \mathbb{R}$ can be obtained by taking out from $\mathbb{R}$ a finite or countable family of disjoint open intervals.

5.143 ¶. Let $X$ be a metric space.
(i) Show that $x_0 \in X$ is an interior point of $A \subset X$ if and only if there is an open set $U$ such that $x_0 \in U \subset A$.
(ii) Using only open sets, express that $x_0$ is an exterior point to $A$, an adherent point to $A$ and a boundary point to $A$.

5.144 ¶. Let $X$ be a metric space. Show that $A$ is open if and only if any sequence $\{x_n\}$ that converges to $x_0 \in A$ is definitively in $A$, i.e., $\exists \bar{n}$ such that $x_n \in A$ $\forall n \ge \bar{n}$.

5.145 ¶. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces and let $f : X \to Y$ be a continuous map. Show that
(i) if $y_0 \in Y$ is an interior point of $B \subset Y$ and if $f(x_0) = y_0$, then $x_0$ is an interior point of $f^{-1}(B)$,
(ii) if $x_0 \in X$ is adherent to $A \subset X$, then $f(x_0)$ is adherent to $f(A)$,
(iii) if $x_0 \in X$ is a boundary point of $A \subset X$, then $f(x_0)$ is a boundary point for $f(A)$,
(iv) if $x_0 \in X$ is a point of accumulation of $A \subset X$ and $f$ is injective, then $f(x_0)$ is a point of accumulation of $f(A)$.

5.146 ¶. Let $X$ be a metric space and $A \subset X$. Show that $x_0 \in X$ is an accumulation point for $A$ if and only if for every open set $U$ with $x_0 \in U$ we have $U \cap A \setminus \{x_0\} \ne \emptyset$. Show also that being an accumulation point for a set is a topological notion. [Hint: Use (iv) of Exercise 5.145.]

5.147 ¶. Let $X$ be a metric space and $A \subset X$. Show that $x$ is a point of accumulation of $A$ if and only if $x \in \overline{A \setminus \{x\}}$.

5.148 ¶. Let $X$ be a metric space. A set $A \subset X$ without points of accumulation in $X$, $\mathcal{D}(A) = \emptyset$, is called discrete. A set without isolated points, $\mathcal{I}(A) = \emptyset$, is called perfect. Of course every point of a discrete set is isolated, since $A \subset \overline{A} = \mathcal{I}(A) \subset A$. Show that the converse is false: a set of isolated points, $A = \mathcal{I}(A)$, need not be discrete. We may only deduce that $\mathcal{D}(A) = \partial A$.

5.149 ¶. Let $X$ be a metric space. Recall that a set $E \subset X$ is dense in $X$ if $\overline{E} = X$. Show that the following claims are equivalent:
(i) $D$ is dense in $X$,
(ii) every nonempty open set intersects $D$,
(iii) $D^c = X \setminus D$ has no interior points,
(iv) every open ball $B(x, r)$ intersects $D$.

5.150 ¶. $\mathbb{Q}$ is dense in $\mathbb{R}$, i.e., $\overline{\mathbb{Q}} = \mathbb{R}$, and $\partial \mathbb{Q} = \mathbb{R}$. Show that $\mathbb{R} \setminus \mathbb{Q}$ is dense in $\mathbb{R}$. Show that the set $E$ of points of $\mathbb{R}^n$ with rational coordinates and its complement are dense in $\mathbb{R}^n$.

5.151 ¶. Let $\Gamma$ be an additive subgroup of $\mathbb{R}$. Show that either $\Gamma$ is dense in $\mathbb{R}$ or $\Gamma$ is the subgroup of integer multiples of a fixed real number.

5.152 ¶. Let $X$ be a metric space. Show that $x_n \to x$ if and only if for every open set $A$ with $x \in A$ there is $\bar{n}$ such that $x_n \in A$ $\forall n \ge \bar{n}$. In particular, the notion of convergence is a topological notion.

5.153 ¶. The notion of a convergent sequence makes sense in a topological space. One says that $\{x_n\} \subset X$ converges to $x \in X$ if for every open set $A$ with $x \in A$ there is $\bar{n}$ such that $x_n \in A$ $\forall n \ge \bar{n}$. However, in this generality limits are not unique. If in $X$ we consider the indiscrete topology $\tau = \{\emptyset, X\}$, every sequence with values in $X$ converges to any point in $X$. Show that limits of converging sequences are unique in a Hausdorff topological space. Finally, let us notice that in an arbitrary topological space, closed sets cannot be characterized in terms of limits of sequences, see Proposition 5.76.

5.154 ¶. Let $(X, \tau)$ be a topological space. A set $F \subset X$ is called sequentially closed with respect to $\tau$ if every convergent sequence with values in $F$ has limit in $F$. Show that the family of sequentially closed sets satisfies the axioms of closed sets. Consequently there is a topology (a priori different from $\tau$) for which the closed sets are the family of sequentially closed sets.

5.155 ¶. Let $X$ be a metric space. Show that $\operatorname{diam} \overline{A} = \operatorname{diam} A$, but in general $\operatorname{diam} \operatorname{int} A < \operatorname{diam} A$.

5.156 ¶. Let $\emptyset \ne E \subset \mathbb{R}$ be bounded from above. Show that $\sup E \in \overline{E}$; if $\sup E \notin E$, then $\sup E$ is a point of accumulation of $E$; finally, show that there exist $\max E$ and $\min E$, if $E$ is nonempty, bounded and closed.

5.157 ¶. Let $X$ be a metric space. Show that $\partial A = \emptyset$ iff $A$ is both open and closed. Show that in $\mathbb{R}^n$ we have $\partial A = \emptyset$ iff $A = \emptyset$ or $A = \mathbb{R}^n$.

5.158 ¶. Let $X$ be a metric space. Show that $\partial \operatorname{int} A \subset \partial A$, and that it may happen that $\partial \operatorname{int} A \ne \partial A$.

5.159 ¶. Sometimes one says that $A$ is a regular open set if $A = \operatorname{int} \overline{A}$, and that $C$ is a regular closed set if $C = \overline{\operatorname{int} C}$. Show examples of regular and nonregular open and closed sets in $\mathbb{R}^2$ and $\mathbb{R}^3$. Show the following:
(i) The interior of a closed set is a regular open set, the closure of an open set is a regular closed set.
(ii) The complement of a regular open (closed) set is a regular closed (open) set.
(iii) If $A$ and $B$ are regular open sets, then $A \cap B$ is a regular open set; if $C$ and $D$ are regular closed sets then $C \cup D$ is a regular closed set.

5.160 ¶. Let $X$ be a metric space. A subset $D \subset X$ is dense in $X$ if and only if for every $x \in X$ we can find a sequence $\{x_n\}$ with values in $D$ such that $x_n \to x$.

5.161 ¶. Let $(X, d)$ and $(Y, d)$ be two metric spaces. Show that
(i) if $f : X \to Y$ is continuous, then $f : E \subset X \to Y$ is continuous in $E$ with the induced metric,

Figure 5.20. A Cauchy sequence in $C^0([0,1])$ with the $L^1$-metric, with a noncontinuous "limit".

(ii) $f : X \to Y$ is continuous if and only if $f : X \to f(X)$ is continuous.

5.162 ¶. Let $X$ and $Y$ be two metric spaces and let $f : X \to Y$. Show that $f$ is continuous if and only if $\partial f^{-1}(A) \subset f^{-1}(\partial A)$ $\forall A \subset Y$.

5.163 ¶ Open and closed maps. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. A map $f : X \to Y$ is called open (respectively, closed) if the image of an open (respectively, closed) set of $X$ is an open (respectively, closed) set in $Y$. Show that
(i) the coordinate maps $\pi_i : \mathbb{R}^n \to \mathbb{R}$, $x = (x_1, \dots, x_n) \to x_i$, $i = 1, \dots, n$, are open maps but not closed maps,
(ii) similarly the coordinate maps of a product $\pi_X : X \times Y \to X$, $\pi_Y : X \times Y \to Y$ given by $\pi_X(x, y) = x$, $\pi_Y(x, y) = y$ are open but in general not closed maps,
(iii) $f : X \to Y$ is an open map if and only if $f(\operatorname{int} A) \subset \operatorname{int} f(A)$ $\forall A \subset X$,
(iv) $f : X \to Y$ is a closed map if and only if $\overline{f(A)} \subset f(\overline{A})$ $\forall A \subset X$.

5.164 ¶. Let $f : X \to Y$ be injective. Show that $f$ is an open map if and only if it is a closed map.

5.165 ¶. A metric space $(X, d_X)$ is called topologically complete if there exists a distance $d$ in $X$ topologically equivalent to $d_X$ for which $(X, d)$ is complete. Show that being topologically complete is a topological invariant.

5.166 ¶. Let $(X, d)$ be a metric space. Show that the following two claims are equivalent:
(i) $(X, d)$ is a complete metric space,
(ii) if $\{F_\alpha\}$ is a family of closed sets of $X$ such that
a) any finite subfamily of $\{F_\alpha\}$ has nonempty intersection,
b) $\inf\{\operatorname{diam} F_\alpha\} = 0$,
then $\bigcap_\alpha F_\alpha$ is nonempty and consists of exactly one point.

5.167 ¶. Show that the irrational numbers in $[0,1]$ cannot be written as a countable union of closed sets in $[0,1]$. [Hint: Suppose they are, so that $[0,1] = \bigcup_{r \in \mathbb{Q}} \{r\} \cup \bigcup_i E_i$, and use Baire's theorem.]

5.168 ¶. Show that a complete metric space made of countably many points has at least an isolated point. In particular, a complete metric space without isolated points is not countable. Notice that, if $x_n \to x_\infty$ in $\mathbb{R}$, then $A := \{x_n \mid n = 1, 2, \dots\} \cup \{x_\infty\}$ with the induced distance is a countable complete metric space.

5.169 ¶. Show that $C^0([0,1])$ with the $L^1$-metric is not complete. [Hint: Consider the sequence in Figure 5.20.]

5.170 ¶. Show that $X = \{n \mid n = 0, 1, 2, \dots\}$ and $Y = \{1/n \mid n = 1, 2, \dots\}$ are homeomorphic as subspaces of $\mathbb{R}$, but $X$ is complete, while $Y$ is not complete.

6. Compactness and Connectedness

In this chapter we shall discuss, still in the metric context, two important topological invariants: compactness and connectedness.

6.1 Compactness

Let $E$ be a subset of $\mathbb{R}^2$. We ask ourselves whether there exists a point $x_0 \in E$ of maximal distance from the origin. Of course $E$ needs to be bounded, $\sup_{x \in E} d(0, x) < +\infty$, if we want a positive answer, and it is easily seen that if $E$ is not closed, our question may have a negative answer, for instance if $E = B(0, 1)$. Assuming $E$ bounded and closed, how can we prove existence? We can find a maximizing sequence, i.e., a sequence $\{x_k\} \subset E$ such that $d(0, x_k) \to \sup_{x \in E} d(0, x)$, and our question has a positive answer if $\{x_k\}$ is converging or, at least, if $\{x_k\}$ has a subsequence $\{x_{n_k}\}$ that converges to some point $x_0 \in E$. In fact, in this case, $d(0, x_{n_k}) \to d(0, x_0)$, since $x \to d(0, x)$ is continuous, and $d(0, x_{n_k}) \to \sup_{x \in E} d(0, x)$, too, thus concluding that $d(0, x_0) = \sup_{x \in E} d(0, x)$.

6.1.1 Compact spaces

a. Sequential compactness

6.1 Definition. Let $(X, d)$ be a metric space. A subset $K \subset X$ is said to be sequentially compact if every sequence $\{x_k\} \subset K$ has a subsequence $\{x_{n_k}\}$ that converges to a point of $K$.

Necessary conditions for compactness are stated in the following

Figure 6.1. Bernhard Bolzano (1781-1848) and the frontispiece of the work where the Bolzano-Weierstrass theorem appears.

6.2 Proposition. We have
(i) any sequentially compact metric space $(X, d)$ is complete;
(ii) any sequentially compact subset of a metric space $(X, d)$ is bounded, closed, and complete with the induced metric.

Proof. (i) Let $\{x_k\} \subset X$ be a Cauchy sequence. Sequential compactness allows us to extract a convergent subsequence; since $\{x_k\}$ is Cauchy, the entire sequence converges, see Proposition 5.107.
(ii) Let $K$ be sequentially compact. Every point $x \in \overline{K}$ is the limit of a sequence with values in $K$; by assumption $x \in K$, thus $\overline{K} = K$ and $K$ is closed. Suppose that $K$ is not bounded. Then there is a sequence $\{x_n\} \subset K$ such that $d(x_i, x_j) > 1$ $\forall i \ne j$. Such a sequence has no convergent subsequences, a contradiction. Finally, $K$ is complete by (i). □

b. Compact sets in $\mathbb{R}^n$

In general, bounded and closed sets of a metric space are not sequentially compact. However we have

6.3 Theorem. In $\mathbb{R}^n$, $n \ge 1$, a set is sequentially compact if and only if it is closed and bounded.

This follows from

6.4 Theorem (Bolzano-Weierstrass). Any infinite and bounded subset $E$ of $\mathbb{R}^n$, $n \ge 1$, has at least a point of accumulation.

Proof. Since $E$ is bounded, there is a cube $C_0$ of side $L$, so that
$$E \subset C_0 := [a_1^{(0)}, b_1^{(0)}] \times \dots \times [a_n^{(0)}, b_n^{(0)}], \qquad b_k^{(0)} - a_k^{(0)} = L.$$
Since $E$ is infinite, if we divide $C_0$ in $2^n$ equal subcubes, one of them
$$C_1 := [a_1^{(1)}, b_1^{(1)}] \times \dots \times [a_n^{(1)}, b_n^{(1)}], \qquad b_k^{(1)} - a_k^{(1)} = L/2,$$
contains infinitely many elements of $E$. By induction, we divide $C_i$ in $2^n$ equal subcubes with no common interiors, and choose one of them, $C_{i+1}$, that contains infinitely many elements of $E$. If
$$C_i := [a_1^{(i)}, b_1^{(i)}] \times \dots \times [a_n^{(i)}, b_n^{(i)}], \qquad b_k^{(i)} - a_k^{(i)} = L/2^i,$$
the vertices of $C_i$ converge,
$$a_k^{(i)} \to a_k^\infty, \qquad b_k^{(i)} \to b_k^\infty \qquad \text{and} \qquad a_k^\infty = b_k^\infty,$$
since for each $k = 1, \dots, n$ the sequences $\{a_k^{(i)}\}$ and $\{b_k^{(i)}\}$ are real-valued Cauchy sequences. The point $a := (a_1^\infty, \dots, a_n^\infty)$ is then an accumulation point for $E$, since for any $r > 0$, $C_i \subset B(a, r)$ for $i$ sufficiently large. □

Another useful consequence of the Bolzano-Weierstrass theorem is

6.5 Theorem. Any bounded sequence $\{x_k\}$ of $\mathbb{R}^n$ has a convergent subsequence.

Proof. If $\{x_k\}$ takes finitely many values, then at least one of them, say $a$, is taken infinitely often. If $\{p_k\}_{k \in \mathbb{N}}$ are the indices such that $x_{p_k} = a$, then $\{x_{p_k}\}$ converges, since it is constant. Assume now that $\{x_k\}$ takes infinitely many values. The Bolzano-Weierstrass theorem yields a point of accumulation $x_\infty$ for these values. Now we choose $p_1$ as the first index for which $|x_{p_1} - x_\infty| < 1$, $p_2$ as the first index greater than $p_1$ such that $|x_{p_2} - x_\infty| < 1/2$ and so on: then $\{x_{p_k}\}$ is a subsequence of $\{x_k\}$ and $x_{p_k} \to x_\infty$. □
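The bisection argument in the proof of the Bolzano-Weierstrass theorem can be imitated numerically in dimension one. The sketch below is only an illustration under stated assumptions: it works on a finite sample of a bounded sequence, so "the half containing infinitely many elements" is replaced by "the half containing at least as many sample points", and both the function name and the test sequence are hypothetical choices, not taken from the text.

```python
def bisection_accumulation_point(points, lo, hi, steps=10):
    """Imitate the bisection of the proof on a finite sample of a bounded sequence.

    At each step keep the half-interval holding at least as many sample
    points as the other half (a finite stand-in for "infinitely many").
    """
    current = [x for x in points if lo <= x <= hi]
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        left = [x for x in current if x < mid]
        right = [x for x in current if x >= mid]
        if len(left) >= len(right):
            hi, current = mid, left
        else:
            lo, current = mid, right
    return (lo + hi) / 2.0


# Hypothetical bounded sequence x_k = (-1)^k (1 + 1/k); it accumulates at -1 and +1.
sample = [(-1) ** k * (1.0 + 1.0 / k) for k in range(1, 10001)]
approx = bisection_accumulation_point(sample, -2.0, 2.0)
print(approx)  # a value close to the accumulation point -1
```

With a finite sample the number of bisection steps has to stay small compared with the sample size; otherwise the retained half no longer reflects where the sequence clusters.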

c. Coverings and $\epsilon$-nets

There are other ways to express compactness. Let $A$ be a subset of a metric space $X$. A covering of $A$ is a family $\mathcal{A} = \{A_\alpha\}$ of subsets of $X$ such that $A \subset \bigcup_\alpha A_\alpha$. We have already said that $\mathcal{A} = \{A_\alpha\}$ is an open covering of $A$ if each $A_\alpha$ is an open set, and that $\{A_\alpha\}$ is a finite covering of $A$ if the number of the $A_\alpha$'s is finite. A subcovering of a covering $\mathcal{A}$ of $A$ is a subfamily of $\mathcal{A}$ that is still a covering of $A$.

6.6 Definition. We say that a subset $A$ of a metric space $X$ is totally bounded if for any $\epsilon > 0$ there is a finite number of balls $B(x_i, \epsilon)$, $i = 1, 2, \dots, N$, of radius $\epsilon$, each centered at $x_i \in X$, such that $A \subset \bigcup_{i=1}^N B(x_i, \epsilon)$.

For a given $\epsilon > 0$, the corresponding balls are said to form an $\epsilon$-covering of $A$, and their centers, characterized by the fact that each point of $A$ has distance less than $\epsilon$ from some of the $x_i$'s, form a set $\{x_i\}$ called an $\epsilon$-net for $A$. With this terminology $A$ is totally bounded iff for every $\epsilon > 0$ there exists an $\epsilon$-net for $A$. Notice also that $A \subset X$ is totally bounded if and only if for every $\epsilon > 0$ there exists a finite covering $\{A_i\}$ of $A$ with sets having $\operatorname{diam} A_i < \epsilon$.
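For a finite set of points, an $\epsilon$-net can be built greedily: a point becomes a new center only if it is at distance at least $\epsilon$ from all centers chosen so far. The following is a minimal sketch, assuming points in the plane with the Euclidean distance; the function name and the grid data are hypothetical and serve only as an illustration.

```python
import math


def greedy_epsilon_net(points, eps):
    """Return centers such that every point lies within eps of some center."""
    centers = []
    for p in points:
        # Keep p as a new center only if no existing center is eps-close to it.
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers


# Hypothetical data: a 21 x 21 grid in the unit square.
grid = [(i / 20.0, j / 20.0) for i in range(21) for j in range(21)]
net = greedy_epsilon_net(grid, 0.25)
print(len(net))
```

By construction every point is either a center or within distance less than eps of one, and distinct centers are at least eps apart.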

6.7 Definition. We say that a subset $K$ of a metric space is compact if every open covering of $K$ contains a finite subcovering.

We have the following.

6.8 Theorem. Let $X$ be a metric space. The following claims are equivalent.
(i) $X$ is sequentially compact.
(ii) $X$ is complete and totally bounded.
(iii) $X$ is compact.

The implication (ii) $\Rightarrow$ (i) is known as the Hausdorff criterion and the implication (i) $\Rightarrow$ (iii) as the finite covering lemma.

Proof. (i) $\Rightarrow$ (ii) By Proposition 6.2, $X$ is complete. Suppose $X$ is not totally bounded. Then for some $r > 0$ no finite family of balls of radius $r$ can cover $X$. Start with $x_1 \in X$; since $B(x_1, r)$ does not cover $X$, there is $x_2 \in X$ such that $d(x_2, x_1) \ge r$. Since $\{B(x_1, r), B(x_2, r)\}$ does not cover $X$ either, there is $x_3 \in X$ such that $d(x_3, x_1) \ge r$ and $d(x_3, x_2) \ge r$. By induction, we construct a sequence $\{x_i\}$ such that $d(x_i, x_j) \ge r$ $\forall i > j$, hence $d(x_i, x_j) \ge r$ $\forall i \ne j$. Such a sequence has no convergent subsequence, but this contradicts the assumption.

(ii) $\Rightarrow$ (iii) By contradiction, suppose that $X$ has an open covering $\mathcal{A} = \{A_\alpha\}$ with no finite subcovering. Since $X$ is totally bounded, there exists a finite covering $\{C_i\}$ of $X$,
$$\bigcup_{i=1}^n C_i = X, \qquad \text{such that} \qquad \operatorname{diam} C_i < 1, \quad i = 1, \dots, n.$$
By the assumption, there exists at least $k_1$ such that $\mathcal{A}$ has no finite subcovering for $C_{k_1}$. Of course $X_1 := C_{k_1}$ is a metric space which is totally bounded; therefore we can cover $C_{k_1}$ with finitely many sets with diameter less than $1/2$, and $\mathcal{A}$ has no finite subcovering for one of them that we call $X_2$. By induction, we construct a sequence $\{X_i\}$ of subsets of $X$ with
$$X_1 \supset X_2 \supset \cdots, \qquad \operatorname{diam} X_i < 1/2^{i-1},$$
such that none of them can be covered by finitely many open sets of $\mathcal{A}$. Now we choose for each $k$ a point $x_k \in X_k$. Since $\{x_k\}$ is trivially a Cauchy sequence and $X$ is complete, $\{x_k\}$ converges to some $x_0 \in X$. Let $A_0 \in \mathcal{A}$ be such that $x_0 \in A_0$ and let $r$ be such that $B(x_0, r) \subset A_0$ ($A_0$ is an open set). For $k$ sufficiently large we then have $d(x, x_0) < r$ for all $x \in X_k$, i.e., $X_k \subset B(x_0, r) \subset A_0$. In conclusion, $X_k$ is covered by one open set in $\mathcal{A}$, a contradiction since by construction no finite subfamily of $\mathcal{A}$ could cover $X_k$.

(iii) $\Rightarrow$ (i) If, by contradiction, $\{x_k\}$ has no convergent subsequence, then $\{x_k\}$ is an infinite set without points of accumulation in $X$. For every $x \in X$ there is a ball $B(x, r_x)$ centered at $x$ that contains at most one point of $\{x_k\}$. The family of these balls $\mathcal{F} := \{B(x, r_x)\}_{x \in X}$ is an open covering of $X$ with no finite subcovering of $\{x_k\}$, hence of $X$, contradicting the assumption. □

6.9 Remark. Clearly the notions of compactness and sequential compactness are topological notions. They have a meaning in the more general setting of topological spaces, while the notion of totally bounded sets is just a metric notion. We shall not deal with compactness in topological spaces. We only mention that compactness and sequential compactness are not equivalent in the context of topological spaces.

6.10 ¶. Let $X$ be a metric space. Show that any closed subset of a compact set is compact.

6.11 ¶. Let $X$ be a metric space. Show that finite unions and generic intersections of compact sets are compact.

6.12 ¶. Show that a finite set is compact.

6.1.2 Continuous functions and compactness

a. The Weierstrass theorem

As in [GM2], continuity of $f : K \to \mathbb{R}$ and compactness of $K$ yield existence of a minimizer.

6.13 Definition. Let $f : X \to \mathbb{R}$. Points $x_-, x_+ \in X$ such that
$$f(x_-) = \inf_{x \in X} f(x), \qquad f(x_+) = \sup_{x \in X} f(x)$$
are called respectively a minimum point or a minimizer and a maximum point or a maximizer for $f : X \to \mathbb{R}$. A sequence $\{x_k\} \subset X$ such that $f(x_k) \to \inf_{x \in X} f(x)$ (resp. $f(x_k) \to \sup_{x \in X} f(x)$) is called a minimizing sequence (resp. a maximizing sequence).

Notice that any function $f : X \to \mathbb{R}$ defined on a set $X$ has a minimizing and a maximizing sequence. In fact, because of the properties of the infimum, there exists a sequence $\{y_k\} \subset f(X)$ such that $y_k \to \inf_{x \in X} f(x)$ (that may be $-\infty$), and for each $k$ there exists a point $x_k \in X$ such that $f(x_k) = y_k$, hence $f(x_k) \to \inf_{x \in X} f(x)$.

6.14 Theorem (Weierstrass). Let $f : X \to \mathbb{R}$ be a continuous real-valued function defined in a compact metric space. Then $f$ achieves its maximum and minimum values, i.e., there exist $x_-, x_+ \in X$ such that
$$f(x_-) = \inf_{x \in X} f(x), \qquad f(x_+) = \sup_{x \in X} f(x).$$

Proof. Let us prove the existence of a minimizer. Let $\{x_k\} \subset X$ be a minimizing sequence. Since $X$ is compact, it has a subsequence $\{x_{n_k}\}$ that converges to some $x_- \in X$. By continuity of $f$, $f(x_{n_k}) \to f(x_-)$, while by restriction
$$y_{n_k} := f(x_{n_k}) \to \inf_{x \in X} f(x).$$
The uniqueness of the limit yields $\inf_{x \in X} f(x) = f(x_-)$. □

In fact, we proved that, if $f : X \to \mathbb{R}$ is continuous and $X$ is compact, then any minimizing (resp. maximizing) sequence has a subsequence that converges to a minimum (resp. maximum) point.

b. Continuity and compactness

Compactness and sequential compactness are topological invariants. In fact, we have the following.

6.15 Theorem. Let $f : X \to Y$ be a continuous function between two metric spaces. If $X$ is compact, then $f(X)$ is compact.

Proof. Let $\{V_\alpha\}$ be an open covering of $f(X)$. Since $f$ is continuous, $\{f^{-1}(V_\alpha)\}$ is an open covering of $X$. Consequently, there are indices $\alpha_1, \dots, \alpha_N$ such that
$$X \subset f^{-1}(V_{\alpha_1}) \cup \dots \cup f^{-1}(V_{\alpha_N}),$$
hence
$$f(X) \subset V_{\alpha_1} \cup \dots \cup V_{\alpha_N},$$
i.e., $f(X)$ is compact. □

Another proof of Theorem 6.15. Let us prove that $f(X)$ is sequentially compact whenever $X$ is sequentially compact. If $\{y_n\} \subset f(X)$ and $\{x_n\} \subset X$ is such that $f(x_n) = y_n$ $\forall n$, since $X$ is sequentially compact, a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ converges to a point $x_0 \in X$. By continuity, the subsequence $\{f(x_{k_n})\}$ of $\{y_n\}$ converges to $f(x_0) \in f(X)$. Then Theorem 6.8 applies. □

6.16 ¶. Infer Theorem 6.14 from Theorem 6.15.

6.17 ¶. Suppose that $E$ is a noncompact metric space. Show that there exist
(i) $f : E \to \mathbb{R}$ continuous and unbounded,
(ii) $f : E \to \mathbb{R}$ continuous and bounded without maximizers and/or minimizers.

c. Continuity of the inverse function

Compactness also plays an important role in dealing with the continuity of the inverse function of invertible maps.

6.18 Theorem. Let $f : X \to Y$ be a continuous function between two metric spaces. If $X$ is compact, then $f$ is a closed function. In particular, if $f$ is injective, then the inverse function $f^{-1} : f(X) \to X$ is continuous.

Proof. Let $F \subset X$ be a closed set. Since $X$ is compact, $F$ is compact. From Theorem 6.15 we then infer that $f(F)$ is compact, hence closed. Suppose $f$ injective and let $g : f(X) \to X$ be the inverse of $f$. We then have $g^{-1}(E) = f(E)$ $\forall E \subset X$, hence $g^{-1}(F)$ is a closed set if $F$ is a closed set in $X$. □

6.19 Corollary. Let $f : X \to Y$ be a one-to-one, continuous map between two metric spaces. If $X$ is compact, then $f$ is a homeomorphism.

6.20 Example. The following example shows that the assumption of compactness in Theorem 6.18 cannot be avoided. Let $X = [0, 2\pi[$, $Y$ be the unit circle of $\mathbb{C}$ centered at the origin and $f(t) := e^{it}$, $t \in X$. Clearly $f(t) = \cos t + i \sin t$ is continuous and injective, but its inverse function $f^{-1}$ is not continuous at the point $(1, 0) = f(0)$.
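The discontinuity in Example 6.20 can be observed numerically. The sketch below is an illustration only: the inverse is realized through `math.atan2` normalized to $[0, 2\pi[$, and the value of `delta` is an arbitrary choice. It compares a point of the circle just below $(1, 0)$ with $(1, 0)$ itself: the two points are close in $Y$, while their preimages are almost $2\pi$ apart in $X$.

```python
import cmath
import math


def f(t):
    """The map t -> e^{it} from [0, 2*pi[ to the unit circle."""
    return cmath.exp(1j * t)


def f_inverse(z):
    """Inverse of f: the argument of z, normalized to [0, 2*pi[."""
    return math.atan2(z.imag, z.real) % (2.0 * math.pi)


delta = 1e-4
z_near = f(2.0 * math.pi - delta)   # a point of the circle close to (1, 0) = f(0)
gap_in_Y = abs(z_near - f(0.0))     # distance in the target: about delta
gap_in_X = abs(f_inverse(z_near) - f_inverse(f(0.0)))  # distance of preimages: about 2*pi
print(gap_in_Y, gap_in_X)
```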

6.1.3 Semicontinuity and the Fréchet-Weierstrass theorem

Going through the proof of Weierstrass's theorem we see that a weaker assumption suffices to prove existence of a minimizer. In fact, if instead of the continuity of $f$ we assume¹
$$f(x_-) \le \liminf_{k \to \infty} f(x_k) \qquad \text{whenever } \{x_k\} \text{ is such that } x_k \to x_-, \qquad (6.1)$$
then for any convergent subsequence $\{x_{n_k}\}$ of a minimizing sequence,
$$x_{n_k} \to x_0, \qquad f(x_{n_k}) \to \inf_{x \in X} f(x),$$
we have
$$\inf_{x \in X} f(x) \le f(x_0) \le \liminf_{k \to \infty} f(x_{n_k}) = \lim_{k \to \infty} f(x_{n_k}) = \inf_{x \in X} f(x),$$
i.e., again $f(x_0) = \inf_{x \in X} f(x)$. We therefore introduce the following definitions.

6.21 Definition. We say that a function $f : X \to \mathbb{R}$ defined on a metric space $X$ is sequentially lower semicontinuous at $x \in X$, s.l.s.c. for short, if
$$f(x) \le \liminf_{k \to \infty} f(x_k) \qquad \text{whenever } \{x_k\} \subset X \text{ is such that } x_k \to x.$$

6.22 Definition. We say that a subset $E$ of a metric space $X$ is relatively compact if its closure $\overline{E}$ is compact.

6.23 Definition. Let $X$ be a metric space. We say that $f : X \to \mathbb{R}$ is coercive if for all $t \in \mathbb{R}$ the level sets of $f$,
$$\{x \in X \mid f(x) \le t\},$$
are relatively compact.

Then we can state the following.

6.24 Theorem (Fréchet-Weierstrass). Let $X$ be a metric space and let $f : X \to \mathbb{R}$ be bounded from below, coercive and sequentially lower semicontinuous. Then $f$ takes its minimum value.

¹ See Exercises 6.26 and 6.28 for the definition of lim inf and related information.

Figure 6.2. Lebesgue's example of a sequence of curves of length $\sqrt{2}$ that converges in the uniform distance to a curve of length 1.
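The phenomenon behind Lebesgue's example can be checked on a concrete family of zig-zag curves. The sketch below assumes a particular realization, which need not coincide with the curves drawn in the figure: the graph over $[0,1]$ of a sawtooth with $k$ teeth and slopes $\pm 1$. Its length stays equal to $\sqrt{2}$ for every $k$, while its uniform distance from the zero function, whose graph has length 1, is $1/(2k)$.

```python
import math


def sawtooth_vertices(k):
    """Vertices of a zig-zag on [0, 1] with k teeth and slopes +1 and -1."""
    step = 1.0 / (2 * k)
    return [(j * step, step if j % 2 else 0.0) for j in range(2 * k + 1)]


def polyline_length(vertices):
    """Sum of the Euclidean lengths of consecutive segments."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))


for k in (1, 10, 100):
    verts = sawtooth_vertices(k)
    sup_distance = max(y for _, y in verts)   # uniform distance from the zero function
    print(k, polyline_length(verts), sup_distance)
```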

6.25 Example. There are many interesting examples of functions that are semicontinuous but not continuous: a typical example is the length of a curve. Though we postpone details, Lebesgue's example in Figure 6.2 shows that the function length, defined on the space of piecewise linear curves with the uniform distance, is not continuous. In fact $\operatorname{length}(f_k) = \sqrt{2}$, $f_k(x) \to f_\infty(x) := 0$ uniformly in $[0,1]$, and $\operatorname{length}(f_\infty) = 1 < \sqrt{2}$. We shall prove later that in fact the length functional is sequentially lower semicontinuous.

6.26 ¶. We say that $f : X \to \mathbb{R}$ is lower semicontinuous, for short l.s.c., if for all $t \in \mathbb{R}$ the sets $\{x \in X \mid f(x) \le t\}$ are closed. Sequential lower semicontinuity and lower semicontinuity are topological concepts; they turn out to be different, in general. Show that if $X$ is a metric space, then $f$ is lower semicontinuous if and only if $f$ is sequentially lower semicontinuous.

6.27 ¶. Let $X$ be a metric space. We recall, see e.g., [GM2], that $\ell \in \overline{\mathbb{R}}$ is the lim inf of $f : X \to \overline{\mathbb{R}}$ as $y \to x$, $\ell = \liminf_{y \to x} f(y)$, if $x$ is a point of accumulation of $X$ and
(i) $\forall m < \ell$ $\exists \delta$ such that $f(y) > m$ if $y \in B(x, \delta) \setminus \{x\}$,
(ii) $\forall m > \ell$ $\forall \delta > 0$ $\exists y_\delta \in B(x, \delta) \setminus \{x\}$ such that $f(y_\delta) < m$.
Show that the lim inf always exists and is given by
$$\liminf_{y \to x} f(y) = \sup_{r > 0} \inf_{B(x, r) \setminus \{x\}} f(y) = \lim_{r \to 0^+} \inf_{B(x, r) \setminus \{x\}} f(y).$$
Similarly we can define the lim sup of $f : X \to \overline{\mathbb{R}}$, so that
$$\limsup_{y \to x} f(y) = -\liminf_{y \to x} (-f(y)).$$
Explicitly define it and show that
$$\limsup_{y \to x} f(y) = \lim_{r \to 0^+} \sup_{B(x, r) \setminus \{x\}} f(y).$$
Finally, show that $f : X \to \overline{\mathbb{R}}$ is sequentially lower semicontinuous if and only if $\forall x \in X$ $f(x) \le \liminf_{y \to x} f(y)$.

6.28 ¶. Let $f, g : X \to \overline{\mathbb{R}}$ be defined on a metric space $X$. Show that
(i) $\liminf_{y \to x} f(y) \le \limsup_{y \to x} f(y)$,
(ii) $f(x) \le \liminf_{y \to x} f(y)$ if and only if $-f(x) \ge \limsup_{y \to x} -f(y)$, hence $f$ is lower semicontinuous if and only if $-f$ is upper semicontinuous,
(iii) $f(y) \to \ell$ as $y \to x$ if and only if $\liminf_{y \to x} f(y) = \limsup_{y \to x} f(y) = \ell$,
(iv) $\liminf_{y \to x} f + \liminf_{y \to x} g \le \liminf_{y \to x} (f + g)$,
(v) $\liminf_{x \to x_0} f(g(x)) = f(\liminf_{x \to x_0} g(x))$, if either $f$ is continuous at $L := \liminf_{x \to x_0} g(x)$ or $f(x) \ne L$ for any $x \ne x_0$,
(vi) $f$ is bounded from below in a neighborhood of $x$ if and only if $\liminf_{y \to x} f > -\infty$.

6.2 Extending Continuous Functions

6.2.1 Uniformly continuous functions

6.29 Definition. Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. We say that $f : X \to Y$ is uniformly continuous in $X$ if for any $\epsilon > 0$ there exists $\delta > 0$ such that $d_Y(f(x), f(y)) < \epsilon$ for all $x, y \in X$ with $d_X(x, y) < \delta$.

6.30 Remark. Uniform continuity is a global property, in contrast with continuity (at all points) which is a local property. A comparison is worthwhile:
(i) $f : X \to Y$ is continuous if $\forall x_0 \in X$, $\forall \epsilon > 0$ $\exists \delta > 0$ (in principle $\delta$ depends on $\epsilon$ and $x_0$) such that $d_Y(f(x), f(x_0)) < \epsilon$ whenever $d_X(x, x_0) < \delta$.
(ii) $f : X \to Y$ is uniformly continuous in $X$ if $\forall \epsilon > 0$ $\exists \delta > 0$ (in this case $\delta$ depends on $\epsilon$ but not on $x_0$) such that $d_Y(f(x), f(x_0)) < \epsilon$ whenever $d_X(x, x_0) < \delta$.
Of course, if $f$ is uniformly continuous in $X$, $f$ is continuous in $X$ and uniformly continuous on any subset of $X$. Moreover if $\{U_\alpha\}$ is a finite partition of $X$ and each $f_{|U_\alpha} : U_\alpha \to Y$ is uniformly continuous in $U_\alpha$, then $f : X \to Y$ is uniformly continuous in $X$.

6.31 ¶. Show that Lipschitz-continuous and more generally Hölder-continuous functions, see Definition 5.24, are uniformly continuous functions.

6.32 ¶. Show that $f : X \to Y$ is not uniformly continuous in $X$ if and only if there exist two sequences $\{x_n\}, \{y_n\} \subset X$ and $\epsilon_0 > 0$ such that $d_X(x_n, y_n) \to 0$ and $d_Y(f(x_n), f(y_n)) \ge \epsilon_0$ $\forall n$.

6.33 ¶. Show that (i) x^ and sin x^, $x \in \mathbb{R}$, are not uniformly continuous in $\mathbb{R}$, (ii) $1/x$ is not uniformly continuous in $]0, 1]$, (iii) sin x^, $x \in \mathbb{R}$, is not uniformly continuous in $\mathbb{R}$. Using directly Lagrange's theorem, show that (iii) x'^, $x \in [0, 1]$, is uniformly continuous in $[0, 1]$, (iv) $e^{-x}$, $x \in \mathbb{R}$, is uniformly continuous in $[0, +\infty[$.

6.34 ¶. Let $X$, $Y$ be two metric spaces and let $f : X \to Y$ be uniformly continuous. Show that the image of a Cauchy sequence is a Cauchy sequence in $Y$.
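The criterion of Exercise 6.32 can be tried out on $f(x) = 1/x$ in $]0, 1]$. The sketch below uses the assumed witnesses $x_n = 1/n$ and $y_n = 1/(2n)$, one possible choice among many: the distance between the points tends to zero, while the distance between the values equals $n$ and therefore never drops below $\epsilon_0 = 1$. The helper name `witness_gaps` is hypothetical.

```python
def f(x):
    """The function 1/x on ]0, 1]."""
    return 1.0 / x


def witness_gaps(n):
    """Return (|x_n - y_n|, |f(x_n) - f(y_n)|) for x_n = 1/n, y_n = 1/(2n)."""
    x_n, y_n = 1.0 / n, 1.0 / (2 * n)
    return abs(x_n - y_n), abs(f(x_n) - f(y_n))


for n in (1, 10, 100, 1000):
    print(n, *witness_gaps(n))
```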

6.35 Theorem (Heine-Cantor-Borel). Let $f : X \to Y$ be a continuous map between metric spaces. If $X$ is compact, then $f$ is uniformly continuous.

Proof. By contradiction, suppose that $f$ is not uniformly continuous. Then there is $\epsilon_0 > 0$, and two sequences $\{x_n\}, \{y_n\} \subset X$ such that
$$d_X(x_n, y_n) < \frac{1}{n}, \qquad \text{and} \qquad d_Y(f(x_n), f(y_n)) \ge \epsilon_0 \quad \forall n. \qquad (6.2)$$
Since $X$ is compact, $\{x_n\}$ has a convergent subsequence, $x_{k_n} \to x$, $x \in X$. The first inequality in (6.2) yields that $\{y_{k_n}\}$ converges to $x$, too. On account of the continuity of $f$,
$$d_Y(f(x_{k_n}), f(x)) \to 0, \qquad d_Y(f(y_{k_n}), f(x)) \to 0,$$
hence $d_Y(f(x_{k_n}), f(y_{k_n})) \to 0$: this contradicts the second inequality in (6.2). □

6.2.2 Extending uniformly continuous functions to the closure of their domains

Let $X$, $Y$ be metric spaces, $E \subset X$ and $f : E \to Y$ be a continuous function. Under which conditions is there a continuous extension of $f$ over $\overline{E}$, i.e., a continuous $g : \overline{E} \to Y$ such that $g = f$ in $E$? Notice that we do not want to change the target $Y$. Of course, such an extension may not exist, for instance if $E = ]0, 1]$ and $f(x) = 1/x$, $x \in ]0, 1]$. On the other hand, if it exists, it is unique. In fact, if $g_1$ and $g_2 : \overline{E} \to Y$ are two continuous extensions, then $A := \{x \in \overline{E} \mid g_1(x) = g_2(x)\}$ is closed and contains $E$, hence $A = \overline{E}$.

6.36 Theorem. Let $X$ and $Y$ be two metric spaces. Suppose that $Y$ is complete and that $f : E \subset X \to Y$ is a uniformly continuous map. Then $f$ extends uniquely to a continuous function on $\overline{E}$; moreover the extension is uniformly continuous in $\overline{E}$.

Proof. First we observe
(i) since $f$ is uniformly continuous in $E$, if $\{x_n\}$ is a Cauchy sequence in $E$, then $\{f(x_n)\}$ is a Cauchy sequence in $Y$, hence it converges in $Y$,
(ii) since $f$ is uniformly continuous, if $\{x_n\}, \{y_n\} \subset E$ are such that $x_n \to x$ and $y_n \to x$ for some $x \in X$, then the Cauchy sequences $\{f(x_n)\}$ and $\{f(y_n)\}$ have the same limit.
Define $F : \overline{E} \to Y$ as follows. For any $x \in \overline{E}$, let $\{x_n\} \subset E$ be such that $x_n \to x$. Define
$$F(x) := \lim_{n \to \infty} f(x_n).$$
We then leave to the reader the task of proving that
(i) $F$ is well defined, i.e., its definition makes sense, since for any $x$ the value $F(x)$ is independent of the chosen sequence $\{x_n\}$ converging to $x$,
(ii) $F$ extends $f$, i.e., $F(x) = f(x)$ $\forall x \in E$,
(iii) $F$ is uniformly continuous in $\overline{E}$. □

6.37 ¶. As a special case of Theorem 6.36, we notice that a function $f : E \subset X \to Y$, which is uniformly continuous on a dense subset $E \subset X$, extends to a uniformly continuous function defined on the whole $X$.

6.2.3 Extending continuous functions

Let $X$, $Y$ be metric spaces, $E \subset X$ and $f : E \to Y$ be a continuous function. Under which conditions can $f$ be extended to a continuous function $F : X \to Y$? This is a basic question for continuous maps.

a. Lipschitz-continuous functions

We first consider real-valued Lipschitz-continuous maps, $f : E \subset X \to \mathbb{R}$.

6.38 Theorem (McShane). Let $(X, d)$ be a metric space, $E \subset X$ and let $f : E \to \mathbb{R}$ be a Lipschitz map. Then there exists a Lipschitz-continuous map $F : X \to \mathbb{R}$ with the same Lipschitz constant as $f$, which extends $f$.

Proof. Let $L := \operatorname{Lip}(f)$. For $x \in X$ let us define
$$F(x) := \inf_{y \in E} \big( f(y) + L\, d(x, y) \big)$$
and show that it has the required properties. For $x \in E$ we clearly have $F(x) \le f(x)$ while, $f$ being Lipschitz, gives
$$f(x) \le f(y) + L\, d(x, y) \qquad \forall y \in E,$$
i.e., $f(x) \le F(x)$, thus concluding that $F(x) = f(x)$ $\forall x \in E$. Moreover, we have
$$F(x) \le \inf_{z \in E} \big( f(z) + L\, d(y, z) \big) + L\, d(x, y) = F(y) + L\, d(x, y)$$
and similarly $F(y) \le F(x) + L\, d(x, y)$. Hence $F$ is Lipschitz with $\operatorname{Lip}(F) \le L$. As $F$ is an extension of $f$, we trivially have $\operatorname{Lip}(F) \ge L$, thus concluding $\operatorname{Lip}(F) = L$. □
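The formula used in the proof of McShane's theorem can be evaluated directly when $E$ is finite, since the infimum becomes a minimum. The following is a minimal sketch under stated assumptions: $X = \mathbb{R}$ with the usual distance, and $f(y) = |y|$ sampled on a hypothetical finite set $E$, so that $L = 1$; the function name `mcshane_extension` is illustrative.

```python
def mcshane_extension(samples, lipschitz_constant):
    """Return F(x) = min over (y, f(y)) in samples of f(y) + L * |x - y|."""
    def extension(x):
        return min(fy + lipschitz_constant * abs(x - y) for y, fy in samples)
    return extension


# Hypothetical data: f(y) = |y| sampled on a finite set E of reals, so Lip(f) = 1.
E = [-2.0, -0.5, 0.0, 1.0, 3.0]
samples = [(y, abs(y)) for y in E]
F = mcshane_extension(samples, 1.0)
print([F(y) for y in E], F(2.0))
```

On the points of $E$ the extension reproduces the sampled values, and between them it stays Lipschitz with the same constant, as in the statement of the theorem.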

The previous theorem allows us to extend vector-valued Lipschitz-continuous maps $f : E \subset X \to \mathbb{R}^m$, but the Lipschitz extension will have, in principle, a Lipschitz constant at most $\sqrt{m}\, \operatorname{Lip}(f)$. Actually, a more elaborate argument allows us to prove the following.

6.39 Theorem (Kirszbraun). Let $f : E \subset \mathbb{R}^n \to \mathbb{R}^m$ be a Lipschitz-continuous map. Then $f$ has an extension $F : \mathbb{R}^n \to \mathbb{R}^m$ such that $\operatorname{Lip} F = \operatorname{Lip} f$.

In fact there exist several extensions of Kirszbraun's theorem that we will not discuss. We only mention that it may fail if either $\mathbb{R}^n$ or $\mathbb{R}^m$ is remetrized by some norm not induced by an inner product.

6.40 ¶ (Federer). Let $X$ be $\mathbb{R}^2$ with the infinity norm $\|x\|_\infty = \sup(|x^1|, |x^2|)$ and the map $f : A \subset X \to \mathbb{R}^2$, where $A := \{(-1, 1), (1, -1), (1, 1)\}$ and
$$f(-1, 1) := (-1, 0), \qquad f(1, -1) := (1, 0), \qquad f(1, 1) := (0, \sqrt{3}).$$
Show that $\operatorname{Lip}(f) = 1$, but $f$ has no 1-Lipschitz extension to $A \cup \{(0, 0)\}$.
208

6. Compactness and Connectedness

6.2.4 Tietze's theorem

An extension of Theorem 6.38 holds for continuous functions in a closed domain.

6.41 Theorem (Tietze). Let $X$ be a metric space, $E \subset X$ be a closed subset of $X$, and $f$ a continuous function from $E$ into $[-1,1]$ (respectively, $\mathbb{R}$). Then $f$ has a continuous extension from $X$ into $[-1,1]$ (respectively, $\mathbb{R}$).

Actually we have the following.

6.42 Theorem (Dugundji). Let $X$ be a metric space, $E$ a closed subset of $X$ and let $f$ be a continuous function from $E$ into $\mathbb{R}^n$. Then $f$ has a continuous extension from $X$ into $\mathbb{R}^n$; moreover the range of the extension is contained in the convex hull of $f(E)$.

We recall that the convex hull of a subset $E \subset \mathbb{R}^n$ is the intersection of all convex sets that contain $E$.

Proof of Tietze's theorem. First assume that $f$ is bounded. Then it is not restrictive to assume that $\inf_E f = 1$ and $\sup_E f = 2$. We shall prove that the function

$$F(x) := \begin{cases} f(x) & \text{if } x \in E, \\[4pt] \dfrac{\inf_{y \in E} \big( f(y)\, d(x,y) \big)}{d(x,E)} & \text{if } x \notin E \end{cases}$$

is a continuous extension of $f$ and $1 \le F(x) \le 2$ $\forall x \in X$. Since the last claim is trivial, we need to prove that $F$ is continuous in $X$. Decompose $X = \mathrm{int}\, E \cup (X \setminus E) \cup \partial E$.

If $x_0 \in \mathrm{int}\, E$, then $F$ is continuous at $x_0$ by assumption.

Let $x_0 \in X \setminus E$. In this case $x \to d(x,E)$ is continuous and strictly positive in an open neighborhood of $x_0$, therefore it suffices to prove that $h(x) := \inf_{y \in E} \big( f(y)\, d(x,y) \big)$ is continuous at $x_0$. We notice that for $y \in E$ and $x, x_0 \in X$ we have

$$f(y)\, d(x,y) \le f(y)\, d(x_0,y) + f(y)\, d(x,x_0) \le f(y)\, d(x_0,y) + 2\, d(x,x_0),$$

hence

$$h(x) \le h(x_0) + 2\, d(x,x_0)$$

and, exchanging $x$ with $x_0$,

$$|h(x) - h(x_0)| \le 2\, d(x,x_0).$$

This proves continuity of $h$ at $x_0$.

Let $x_0 \in \partial E$. For $\epsilon > 0$ let $r > 0$ be such that $|f(y) - f(x_0)| < \epsilon$ provided $d(y,x_0) < r$ and $y \in E$. For $x \in B(x_0, r/4)$ we have

$$\inf_{E \cap B(x_0,r)} \big( f(y)\, d(x,y) \big) \le f(x_0)\, d(x,x_0) \le 2\, \frac{r}{4} = \frac{r}{2}$$

and

$$\inf_{E \setminus B(x_0,r)} \big( f(y)\, d(x,y) \big) \ge \inf_{E \setminus B(x_0,r)} \big( d(x_0,y) - d(x,x_0) \big) \ge \frac{3}{4}\, r.$$

Therefore we find for $x$ with $d(x_0,x) < r/4$,

$$h(x) = \inf_{y \in E} \big( f(y)\, d(x,y) \big) = \inf_{E \cap B(x_0,r)} f(y)\, d(x,y)$$

and

$$d(x,E) = d(x, E \cap B(x_0,r)).$$

On the other hand, for $y \in E \cap B(x_0,r)$ we have $|f(x_0) - f(y)| < \epsilon$, hence

$$(f(x_0) - \epsilon)\, d(x,E) \le h(x) \le (f(x_0) + \epsilon)\, d(x,E) \qquad \text{if } x \in B(x_0, r/4),$$

i.e., $F$ is continuous at $x_0$.

Finally, if $f$ is not bounded, we apply the above to $g := \varphi \circ f$, $\varphi$ being the homeomorphism $\varphi : \mathbb{R} \to\, ]0,2[$, $\varphi(x) = 1 + \frac{x}{1+|x|}$. If $G$ extends continuously $g$, then $F := \varphi^{-1} \circ G$ continuously extends $f$. □
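The explicit extension used in the proof of Tietze's theorem can be evaluated directly in a toy case. The sketch below assumes that $E$ is a finite (hence closed) subset of $\mathbb{R}$ with the usual distance and that $f$ takes values in $[1,2]$; the name `tietze_extension` and the sample data are illustrative assumptions only.

```python
def tietze_extension(f):
    """Build F from the proof: F = f on E and, off E,
    F(x) = min_{y in E} f(y) * d(x, y) / d(x, E).

    `f` is a dict mapping the finitely many points of E (floats) to
    values in [1, 2]; d is the usual distance on the real line.
    """
    E = list(f)

    def F(x):
        if x in f:                      # x belongs to E
            return f[x]
        dist_to_E = min(abs(x - y) for y in E)
        h = min(f[y] * abs(x - y) for y in E)
        return h / dist_to_E

    return F


# Hypothetical sample data with inf f = 1 and sup f = 2.
f = {0.0: 1.0, 1.0: 2.0, 3.0: 1.5}
F = tietze_extension(f)
```

Evaluating `F` at points approaching a point of $E$ illustrates the boundary estimate of the proof: close to $x_0 \in E$ only the nearby points of $E$ contribute to the infimum, so the quotient stays close to $f(x_0)$.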

6.43 Remark. The extension $F : X \to \mathbb{R}$ of $f : E \subset X \to \mathbb{R}$ provided by Tietze's theorem is Lipschitz continuous outside $E$.

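Sketch of the proof of Theorem 6.42, assuming $X = \mathbb{R}^n$ and $E \subset X$ compact. Choose a countable dense set $\{e_k\}_k$ in $E$ and, for $x \notin E$ and $k = 1, 2, \dots$, set

$$\varphi_k(x) := \max\Big\{ 2 - \frac{|x - e_k|}{d(x,E)},\, 0 \Big\}.$$

The function

$$\tilde f(x) := \begin{cases} f(x), & x \in E, \\[4pt] \dfrac{\sum_{k \ge 1} 2^{-k} \varphi_k(x)\, f(e_k)}{\sum_{k \ge 1} 2^{-k} \varphi_k(x)}, & x \notin E \end{cases}$$

defines a continuous extension of $f$; moreover $\tilde f(\mathbb{R}^n)$ is contained in the convex hull of $f(E)$. □

6.44 ¶. Let $E$ and $F$ be two disjoint nonempty closed sets of a metric space $(X,d)$. Check that the function $f : X \to [0,1]$ given by

$$f(x) = \frac{d(x,E)}{d(x,E) + d(x,F)}$$

is continuous in $X$, has values in $[0,1]$, $f(x) = 0$ $\forall x \in E$ and $f(x) = 1$ $\forall x \in F$.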

6.45 ¶. Let $E$ and $F$ be two disjoint nonempty closed sets of a metric space $(X,d)$. Using the function $f$ in Exercise 6.44 show that there exist two open sets $A, B \subset X$ with $A \cap B = \emptyset$, $A \supset E$ and $B \supset F$.

Indeed Exercise 6.45 has an inverse.

6.46 Lemma (Urysohn lemma). Let $X$ be a topological space such that each couple of disjoint closed sets can be separated by two open disjoint sets. Then, given a couple of disjoint closed sets $E$ and $F$, there exists a continuous function $f : X \to [0,1]$ such that $f(x) = 1$ $\forall x \in E$ and $f(x) = 0$ $\forall x \in F$.

This lemma answers the problem of finding nontrivial continuous functions in a topological space and is a basic step in the construction of the so-called partition or decomposition of unity, a means that allows us to pass from local to global constructions and vice versa. Since we shall not need these results in a general context, we refrain from further comments and refer the reader to one of the treatises in general topology.

Figure 6.3. The first page of the Preface of Topologie by Kazimierz Kuratowski (1896-1980) and the frontispiece of a classic in general topology.

6.3 Connectedness

Intuitively, a space is connected if it does not consist of two separate pieces.

6.3.1 Connected spaces

6.47 Definition. A metric space $X$ is said to be connected if it is not the union of two nonempty disjoint open sets. A subset $E \subset X$ is connected if it is connected as a subspace of $X$.

This can be formulated in other ways. For example we say that two sets $A$ and $B$ of a metric space $X$ are separated if both $\overline{A} \cap B$ and $A \cap \overline{B}$ are empty, i.e., no point of $A$ is adherent to $B$ and no point of $B$ is adherent to $A$.

6.48 Proposition. Let $X$ be a metric space. The following properties are equivalent.
(i) $X$ is connected.
(ii) There are no nonempty closed sets $F, G$ in $X$ such that $F \cap G = \emptyset$ and $X = F \cup G$.
(iii) The only subsets of $X$ both open and closed are $\emptyset$ and $X$.
(iv) $X$ is not the union of two nonempty and separated subsets.


Proof. Trivially (i) $\Leftrightarrow$ (ii) $\Leftrightarrow$ (iii). Let us prove that (i) $\Rightarrow$ (iv). By contradiction, suppose $X = A \cup B$ where $A$ and $B$ are nonempty and separated. From $A \cap \overline{B} = \emptyset$ and $A \cup B = X$ we infer $A \subset \overline{B}^c$ and $\overline{B}^c \subset A$, hence $A = \overline{B}^c$, i.e., $A$ is an open set. Similarly we infer that $B$ is open, concluding that $X$ is not connected, a contradiction. Finally, let us prove that (iv) $\Rightarrow$ (ii). By contradiction, assume that $X$ is not connected. Then $X = A \cup B$ with $A$, $B$ closed, disjoint and nonempty, thus $\overline{A} \cap B = A \cap \overline{B} = A \cap B = \emptyset$. Thus $X$ is the union of two nonempty separated sets, a contradiction. □

a. Connected subsets

6.49 Theorem. $E \subset \mathbb{R}$ is connected if and only if $E$ is an interval.

Proof. If $E \subset \mathbb{R}$ is not an interval, there exist $x, y \in E$ and $z \notin E$ with $x < z < y$. Thus the sets $E_1 := E \cap\, ]-\infty, z[$ and $E_2 := E \cap\, ]z, +\infty[$ are nonempty and separated. Since $E = E_1 \cup E_2$, $E$ is not connected, a contradiction. Conversely, if $E$ is not connected, then $E = A \cup B$ with $A$ and $B$ nonempty and separated. Let $x \in A$ and $y \in B$ and, without loss of generality, suppose $x < y$. Define

$$z := \sup\big( A \cap [x,y] \big).$$

We have $z \in \overline{A}$, hence $z \notin B$; in particular $x \le z < y$. If $z \notin A$ then $x < z < y$ and $z \notin E$, i.e., $E$ is not an interval. Otherwise, if $z \in A$, then $z \notin \overline{B}$ and there exists $z_1$ such that $z < z_1$

b. Connected components

Because of Exercise 6.53, the following definition makes sense.

6.55 Definition. Let $X$ be a metric space. The connected component of $X$ containing $x_0 \in X$ is the largest connected subset $C_{x_0}$ of $X$ such that $x_0 \in C_{x_0}$.

6.56 Proposition. Let $X$ be a metric space. We have the following.
(i) The distinct connected components of the points of $X$ form a partition of $X$.
(ii) Each connected component $C \subset X$ is a closed set.
(iii) If $y \in C_x$, then $C_x = C_y$.
(iv) If $Y \subset X$ is a nonempty connected subset of $X$ that is both open and closed, then $Y$ is a connected component of $X$.

Observe that the connected components are not necessarily open. For instance, consider $X = \mathbb{Q}$, for which $C_x = \{x\}$ $\forall x \in \mathbb{Q}$. Of particular interest are the locally connected metric spaces, i.e., spaces $X$ such that for every $x \in X$ there exists $r_x > 0$ for which $B(x, r_x)$ is connected.

6.57 Proposition. Let $X$ be a metric space. The following claims are equivalent.
(i) Each connected component is open.
(ii) $X$ is locally connected.

Proof. (i) $\Rightarrow$ (ii). Each point in $X$ has a connected open neighborhood by (i), hence (ii) holds. (ii) $\Rightarrow$ (i). Let $C$ be a connected component of $X$, let $x \in C$ and, by assumption, let $B(x, r_x)$ be a connected ball centered at $x$. As $B(x, r_x)$ is connected, trivially $B(x, r_x) \subset C$, i.e., $C$ is open. □

6.58 Corollary. Every convex set of $\mathbb{R}^n$ is connected.

Proof. In fact every convex set $K \subset \mathbb{R}^n$ is the union of all segments joining a fixed point $x_0 \in K$ to points $x \in K$. Then Exercise 6.53 applies. □

The class of all connected sets of a metric space $X$ is a topological invariant. This follows at once from the following.

6.59 Theorem. Let $f : X \to Y$ be a continuous map between metric spaces. If $X$ is connected, then $f(X) \subset Y$ is connected.

Proof. Assume by contradiction that $f(X)$ is not connected. Then there exist open sets $C, D \subset Y$, both intersecting $f(X)$, such that $C \cap D \cap f(X) = \emptyset$ and $(C \cup D) \cap f(X) = f(X)$. Since $f$ is continuous, $A := f^{-1}(C)$ and $B := f^{-1}(D)$ are nonempty open sets in $X$ such that $A \cap B = \emptyset$ and $A \cup B = X$. A contradiction, since $X$ is connected. □

Since the intervals are the only connected subsets of $\mathbb{R}$, we again find the intermediate value theorem of [GM1] and, more generally,

6.60 Corollary. Let $f : X \to \mathbb{R}$ be a continuous function defined on a connected metric space. Then $f$ takes all values between any two that it assumes.

c. Segment-connected sets in $\mathbb{R}^n$

In $\mathbb{R}^n$ we can introduce a more restrictive notion of connectedness that in some respect is more natural. If $x, y \in \mathbb{R}^n$, a polyline joining $x$ to $y$ is the union of finitely many segments of the type

$$[x, x_1],\ [x_1, x_2],\ \dots,\ [x_{N-1}, y]$$

where $x_i \in \mathbb{R}^n$ and $[x_i, x_{i+1}]$ denotes the segment joining $x_i$ with $x_{i+1}$. It is easy to check that a polyline joining $x$ to $y$ can be seen as the image or trajectory of a piecewise linear function $\gamma : [0,1] \to \mathbb{R}^n$. Notice that piecewise linear functions are Lipschitz continuous.

6.61 Definition. We say that $A \subset \mathbb{R}^n$ is segment-connected if each pair of its points can be joined by a polyline that lies entirely in $A$.

If $A[x]$ denotes the set of all points that can be joined to $x$ by a polyline in $A$, we see that $A$ is segment-connected if and only if $A = A[x]$. Moreover we have the following.

6.62 Proposition. Any segment-connected $A \subset \mathbb{R}^n$ is connected.

Of course, not every connected set is segment-connected; indeed a circle in $\mathbb{R}^2$ is connected but not segment-connected. However, we have the following.

6.63 Theorem. Let $A$ be a nonempty open set of $\mathbb{R}^n$. Then $A$ is connected if and only if $A$ is segment-connected.

Proof. Let $x_0 \in A$, let $B := A[x_0]$ be the set of all points that can be connected with $x_0$ by a polyline in $A$ and let $C := A \setminus A[x_0]$. We now prove that both $B$ and $C$ are open. Once this is done, since $A$ is connected and $B$ is nonempty, we infer $A = A[x_0]$, hence $A$ is segment-connected. Let $x \in B$. Since $A$ is open, there exists $B(x,r) \subset A$. Since $x$ is connected with $x_0$ by a polyline, adding a further segment we can connect each point of $B(x,r)$ with $x_0$ by a polyline. Therefore $B(x,r) \subset B$ if $x \in B$, i.e., $B$ is open. Similarly, if $x \in C$, let $B(x,r) \subset A$. No point in $B(x,r)$ can be connected with $x_0$ by a polyline since otherwise, adding a further segment, we could connect $x$ with $x_0$. So $B(x,r) \subset C$ if $x \in C$, i.e., $C$ is open. □

d. Path-connectedness

Another notion of connection that makes sense in a topological space is joining by paths. Let $X$ be a metric space. A path or a curve in $X$ joining $x$ with $y$ is a continuous function $f : [0,1] \to X$ with $f(0) = x$ and $f(1) = y$. The image of the path is called the trajectory of the path.

6.64 Definition. A metric space $X$ is said to be path-connected if any two points in $X$ can be joined by a path.

Evidently $\mathbb{R}^n$ is path-connected. We have, as in Theorem 6.63, the following.

6.65 Proposition. Any path-connected metric space $X$ is connected.

The converse is however false in general.

6.66 ¶. Consider the set $A \subset \mathbb{R}^2$, $A = G \cup I$ where $G$ is the graph of $f(x) := \sin 1/x$, $0 < x \le 1$, and $I = \{0\} \times [-1,1]$. Show that $A$ is connected but not path-connected.


Similarly to connected sets, if $\{A_\alpha\} \subset X$ are path-connected with $\cap_\alpha A_\alpha \ne \emptyset$, then $A := \cup_\alpha A_\alpha$ is path-connected. Because of this, one can define the path-connected component of $X$ containing a given $x_0 \in X$ as the maximal subset of $X$ containing $x_0$ that is path-connected. However, examples show that the path-connected components are not closed, in general. But we have the following.

6.67 Proposition. Let $X$ be a metric space. The following claims are equivalent.
(i) Each path-connected component is open (hence closed).
(ii) Each point of $X$ has a path-connected open neighborhood.

Proof. (ii) follows trivially from (i). Conversely, let $C$ be a path-connected component of $X$, let $x \in C$ and by assumption let $B(x, r_x)$ be a path-connected ball centered at $x$. Then trivially $B(x, r_x) \subset C$, i.e., $C$ is open. Moreover $C$ is also closed since $X \setminus C$ is the union of the other path-connected components that are open sets, as we have proved. □

6.68 Corollary. An open set $A$ of $\mathbb{R}^n$ is connected if and only if it is path-connected.

Proof. Suppose that $A$ is connected and let $U \subset A$ be a nonempty open set. Each point $x \in U$ then has a ball $B(x,r) \subset U$ that is path-connected. By Proposition 6.67 any path-connected component $C$ of $A$ is open and closed in $A$. Since $A$ is connected, $C = A$. □

6.3.2 Some applications

Topological invariants can be used to prove that two given spaces are not homeomorphic.

6.69 Proposition. $\mathbb{R}$ and $\mathbb{R}^n$, $n > 1$, are not homeomorphic.

Proof. Assume, by contradiction, that $h : \mathbb{R}^n \to \mathbb{R}$ is a homeomorphism, and let $x_0$ be a point of $\mathbb{R}^n$. Then clearly $\mathbb{R}^n \setminus \{x_0\}$ is connected, but $h(\mathbb{R}^n \setminus \{x_0\}) = \mathbb{R} \setminus \{h(x_0)\}$ is not connected, a contradiction. □

Much more delicate is proving that

6.70 Theorem. $\mathbb{R}^n$ and $\mathbb{R}^m$, $n \ne m$, are not homeomorphic.

The idea of the proof remains the same. It will be sufficient to have a topological invariant that distinguishes between different $\mathbb{R}^n$. Similarly, one shows that $[0,1]$ and $[0,1]^n$, $n > 1$, are not homeomorphic even if a one-to-one correspondence exists.

6.71 ¶. Show that for any one-to-one mapping $h : [0,1]^n \to [0,1]$ neither $h$ nor $h^{-1}$ is continuous.


6.72 ¶. Show that the unit circle $S^1$ of $\mathbb{R}^2$ is not homeomorphic to $\mathbb{R}$. [Hint: Suppose $h : S^1 \to \mathbb{R}$ is a homeomorphism and let $x_0 \in S^1$. Then $S^1 \setminus \{x_0\}$ is connected, while $\mathbb{R} \setminus \{h(x_0)\}$ is not connected.]

6.73 Theorem. In $\mathbb{R}$ each closed interval is homeomorphic to $[-1,1]$; each open interval is homeomorphic to $]-1,1[$ and each half-open interval is homeomorphic to $]-1,1]$. Moreover, no two of these three intervals are homeomorphic.

Proof. The first part is trivial. To prove the second part, it suffices to argue as in Proposition 6.69, noticing that one can remove respectively 2, 0 or 1 points from the three standard intervals without destroying connectedness, but not more. □

6.74 ¶. Show that the unit sphere $S^n := \{x \in \mathbb{R}^{n+1} \mid |x| = 1\}$ in $\mathbb{R}^{n+1}$ is connected and that $S^1$ and $S^n$, $n > 1$, are not homeomorphic.

6.75 ¶. Let $A \subset \mathbb{R}^n$ and let $C \subset \mathbb{R}^n$ be a connected set containing points of both $A$ and $\mathbb{R}^n \setminus A$. Show that $C$ contains points of $\partial A$.

6.76 ¶. Show that the numbers of connected components and of path-connected components are topological invariants.

Theorem. Let $f : X \to Y$ be a continuous function. The image of each connected (path-connected) component of $X$ must lie in a connected (path-connected) component of $Y$. Moreover, if $f$ is a homeomorphism, then $f$ induces a one-to-one correspondence between connected (path-connected) components of $X$ and $Y$.

6.77 ¶. In set theory, the following theorem of Cantor-Bernstein holds, see Theorem 3.58 of [GM2].

Theorem. If there exist injective maps $X \to Y$ and $Y \to X$, then there exists a one-to-one map between $X$ and $Y$.

This theorem becomes false if we require also continuity.

Theorem (Kuratowski). There may exist continuous and one-to-one maps $f : X \to Y$ and $g : Y \to X$ between metric spaces and yet $X$ and $Y$ are not homeomorphic.

[Hint: Let $X, Y \subset \mathbb{R}$ be given by

$$X = \,]0,1[\, \cup \{2\} \cup \,]3,4[\, \cup \{5\} \cup \dots \cup \,]3n, 3n+1[\, \cup \{3n+2\} \cup \dots$$
$$Y = \,]0,1]\, \cup \,]3,4[\, \cup \{5\} \cup \dots \cup \,]3n, 3n+1[\, \cup \{3n+2\} \cup \dots$$

By Exercise 6.76, $X$ and $Y$ are not homeomorphic, since the component $]0,1]$ of $Y$ is not homeomorphic to any component of $X$, but the maps $f : X \to Y$ and $g : Y \to X$ given by

$$f(x) := \begin{cases} x & \text{if } x \ne 2, \\ 1 & \text{if } x = 2, \end{cases} \qquad\text{and}\qquad g(x) := \begin{cases} x/2 & \text{if } x \in\, ]0,1], \\ (x-2)/2 & \text{if } x \in\, ]3,4[, \\ x - 3 & \text{otherwise} \end{cases}$$

are continuous and one-to-one.]


6.4 Exercises

6.78 ¶. Show that a continuous map between compact spaces need not be an open map, i.e., need not map open sets into open sets.

6.79 ¶. Show that an open set in $\mathbb{R}^n$ has at most countably many connected components. Show that this is no longer true for closed sets.

6.80 ¶. The distance between two subsets $A$ and $B$ of a metric space is defined by

$$d(A,B) := \inf_{a \in A,\ b \in B} d(a,b).$$

Of course, the distance between two disjoint open sets or closed sets may be zero. Show that, if $A$ and $B$ are disjoint, $A$ is closed and $B$ is compact, then $d(A,B) > 0$. [Hint: Suppose $\exists\, a_n, b_n$ such that $d(a_n, b_n) \to 0$ ...]

6.81 ¶. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces, and let $(X \times Y, d_{X \times Y})$ be their Cartesian product with one of the equivalent distances in Exercise 5.14. Let $\pi : X \times Y \to X$ be the projection map onto the first component, $\pi(x,y) := x$. $\pi$ is an open map, see Exercise 5.131. Assuming $Y$ compact, show that $\pi$ is a closed map, i.e., maps closed sets into closed sets.

6.82 ¶. Let $f : X \to Y$ be a map between two metric spaces and suppose $Y$ is compact. Show that $f$ is continuous if and only if its graph

$$G_f := \{(x,y) \in X \times Y \mid x \in X,\ y = f(x)\}$$

is closed in $X \times Y$ endowed with one of the distances in Exercise 5.14. Show that, in general, the claim is false if $Y$ is not compact.

6.83 ¶. Let $K$ be a compact set in $\mathbb{R}^2$, and for every $x \in \mathbb{R}$ set $K_x := \{y \in \mathbb{R} \mid (x,y) \in K\}$ and $f(x) := \mathrm{diam}\, K_x$, $x \in \mathbb{R}$. Show that $f$ is upper semicontinuous.

6.84 ¶. A map $f : X \to Y$ is said to be proper if the inverse image of any compact set $K \subset Y$ is a compact set in $X$. Show that $f$ is a closed map if it is continuous and proper.

6.85 ¶. Show Theorem 6.35 using the finite covering property of $X$. [Hint: $\forall \epsilon > 0$ to every $x \in X$ we can associate a $\delta(x) > 0$ such that $d_Y(f(x), f(y)) < \epsilon/2$ whenever $y \in X$ and $d_X(x,y) < \delta(x)$. From the open covering $\{B(x, \delta(x))\}$ of $X$ we can extract a finite subcovering $\{B(x_i, \delta(x_i))\}_{i=1,\dots,N}$ such that $X \subset B(x_1, \delta(x_1)) \cup \dots \cup B(x_N, \delta(x_N))$. Set $\delta := \min\{\delta(x_1), \dots, \delta(x_N)\}$.]

6.86 ¶. Let $f : E \to \mathbb{R}^m$ be uniformly continuous on a bounded set $E$. Show that $f(E)$ is bounded. [Hint: The closure of $E$ is a compact set ...]

6.87 ¶. Show that
(i) if $f : X \to \mathbb{R}^n$ and $g : X \to \mathbb{R}^n$ are uniformly continuous, then $f + g$ and $\lambda f$, $\lambda \in \mathbb{R}$, are uniformly continuous,
(ii) if $f : X \to Y$ is uniformly continuous in $A \subset X$ and $B \subset X$, then $f$ is uniformly continuous in $A \cup B$.

6.88 ¶. Let $f, g : X \to \mathbb{R}$ be uniformly continuous. Give conditions such that $fg$ is uniformly continuous.


6.89 ¶. Show that the composition of uniformly continuous functions is uniformly continuous.

6.90 ¶. Concerning maps $f : [0, +\infty[\, \to \mathbb{R}$, show the following.
(i) If $f$ is continuous and $f(x) \to \lambda \in \mathbb{R}$ as $x \to +\infty$, then $f$ is uniformly continuous in $[0, +\infty[$.
(ii) If $f$ is continuous and has an asymptote, then $f$ is uniformly continuous in $[0, +\infty[$.
(iii) If $f : [0, +\infty[\, \to \mathbb{R}$ is uniformly continuous in $[0, +\infty[$, then there exist constants $A$ and $B$ such that $|f(x)| \le A|x| + B$ $\forall x \ge 0$.
(iv) If $f$ is bounded, then there exists a concave function $\omega(t)$, $t \ge 0$, such that $|f(x) - f(y)| \le \omega(|x - y|)$ $\forall x, y \ge 0$.

6.91 ¶. Let $K \subset X$ be a compact subset of a metric space $X$ and $x \in X \setminus K$. Show that there exists $y \in K$ such that $d(x,y) = d(x,K)$.

6.92 ¶. Let $X$ be a compact metric space and $f : X \to X$ be an isometry. Show that $f(X) = X$. [Hint: $f^2, f^3, \dots$ are isometries.]

6.93 ¶¶. Show that the set of points of $\mathbb{R}^2$ whose coordinates are not both rational is connected.

6.94 ¶. Let $B$ be an at most countable subset of $\mathbb{R}^n$, $n > 1$. Show that $C := \mathbb{R}^n \setminus B$ is segment-connected. [Hint: Assume that $0 \in C$ and show that each $x \in C$ can be connected with the origin by a path contained in $C$, thus $C$ is path-connected. Now if the segment $[0,x]$ is contained in $C$ we have reached the end of our proof, otherwise consider any segment $R$ transversal to $[0,x]$ and show that there is $z \in R$ such that the polyline $[0,z] \cup [z,x]$ does not intersect $B$.]

6.95 ¶. Let $f : \mathbb{R}^n \to \mathbb{R}$, $n > 1$, be continuous. Show that there are at most two points $y \in \mathbb{R}$ for which $f^{-1}(y)$ is at most countable. [Hint: Take into account Exercise 6.94.]

7. Curves

The intuitive idea of a curve is that of the motion of a point in space. This idea summarizes the heritage of the ancient Greeks, who used to think of curves as geometric figures characterized by specific geometric properties, such as the conics, and the heritage of the XVIII century, when, with the development of mechanics, curves were thought of as the trace of a moving point.

7.1 Curves in $\mathbb{R}^n$

7.1.1 Curves and trajectories

From a mathematical point of view, it is convenient to think of a curve as a continuous map $\gamma$ from an interval $I$ of $\mathbb{R}$ into $\mathbb{R}^n$, $\gamma \in C^0(I, \mathbb{R}^n)$. The image $\gamma(I)$ of a curve $\gamma \in C^0(I, \mathbb{R}^n)$ is called the trace or the trajectory of $\gamma$. We say that $\gamma : I \to \mathbb{R}^n$ is a parametrization of $\Gamma$ if $\gamma(I) = \Gamma$; intuitively, a curve is a (continuous) way to travel on $\Gamma$. If $x, y \in \mathbb{R}^n$, a curve $\gamma \in C^0([a,b], \mathbb{R}^n)$ such that $\gamma(a) = x$, $\gamma(b) = y$ is often called a path joining $x$ and $y$. A curve is what in kinematics is called the motion law of a material point, and its image or trajectory is the "line" in $\mathbb{R}^n$ spanned when the point moves.

If the basis in $\mathbb{R}^n$ is understood (as we shall do from now on, fixing the standard basis of $\mathbb{R}^n$), a curve $\gamma \in C^0(I, \mathbb{R}^n)$ writes as an $n$-tuple of continuous real-valued functions of one variable, $\gamma(t) = (\gamma^1(t), \gamma^2(t), \dots, \gamma^n(t))$, $\gamma^i : I \to \mathbb{R}$, $\gamma^i(t)$ being the $i$-th component of $\gamma(t)$ $\forall t \in I$. Let $k = 1, 2, \dots$, or $\infty$. We say that a curve $\gamma$ belongs to $C^k(I, \mathbb{R}^n)$ if all the components of $\gamma$ are real-valued functions of class $C^k(I, \mathbb{R})$, and that $\gamma$ is a curve of class $C^k$ if $\gamma \in C^k(I, \mathbb{R}^n)$. We also say that $\gamma : [a,b] \to \mathbb{R}^n$ is a closed curve of class $C^k$ if $\gamma$ is closed, $\gamma \in C^k([a,b], \mathbb{R}^n)$ and moreover, the derivatives of order up to $k$ of each component of $\gamma$ at $a$ and $b$ coincide,

$$D^j \gamma^i(a) = D^j \gamma^i(b) \qquad \forall i = 1, \dots, n,\ \forall j = 1, \dots, k.$$

If $\gamma : I \to \mathbb{R}^n$ is of class $C^1$, the vector

$$\gamma'(t_0) := \big( (\gamma^1)'(t_0), \dots, (\gamma^n)'(t_0) \big)$$
$t \in [0,1]$,

is an affine map, called the parametric equation of the line through $x$ in the direction of $y$. Thus its trajectory is the line $L \subset \mathbb{R}^n$ through $s(0) = x$ and $s(1) = y$ with constant vector velocity $s'(t) = y - x$. In kinematics, $s(t)$ is the position of a point traveling on the straight line $s(\mathbb{R})$ with constant velocity $|y - x|$ assuming $s(0) = x$ and $s(1) = y$. Therefore the restriction $s_{|[0,1]}$ of $s$,

$$s(t) = (1-t)x + ty, \qquad 0 \le t \le 1,$$

describes the uniform motion of a point starting from $x$ at time $t = 0$ and arriving in $y$ at time $t = 1$ with constant speed $|y - x|$ and is called the parametric equation of the segment joining $x$ to $y$.

7.2 Example (Uniform circular motion). The curve $\gamma : \mathbb{R} \to \mathbb{R}^2$ given by $\gamma(t) = (\cos t, \sin t)$ has as its trajectory the unit circle of $\mathbb{R}^2$, $\{(x,y) \mid x^2 + y^2 = 1\}$, with velocity one. In fact, $\gamma'(t) = (-\sin t, \cos t)$, thus $|\gamma'(t)| = 1$ $\forall t$. $\gamma$ describes the uniform circular motion of a point on the unit circle that starts at time $t = 0$ at $(1,0)$ and moves counterclockwise with angular velocity one, cf. [GM1]. Notice that $\gamma' \perp \gamma$ and $\gamma'' \perp \gamma'$ since

$$(\gamma'(t) \mid \gamma(t)) = \frac{1}{2} \frac{d|\gamma|^2}{dt}(t) = 0, \qquad (\gamma''(t) \mid \gamma'(t)) = \frac{1}{2} \frac{d|\gamma'|^2}{dt}(t) = 0.$$

Finally, observe that the restriction of $\gamma$ to $[0, 2\pi[$ runs on the unit circle once, since $\gamma_{|[0,2\pi[}$ is injective. The uniform circular motion is better described looking at $\mathbb{R}^2$ as the Gauss plane of complex numbers, see [GM2]. Doing so, we substitute $\gamma(t)$ with $t \to e^{it}$, $t \in \mathbb{R}$, since we have $e^{it} = \cos t + i \sin t$.

7.3 Example (Graphs). Let $f \in C^0(I, \mathbb{R}^n)$ be a curve. The graph of $f$,

$$G_f := \{(x,y) \in I \times \mathbb{R}^n \mid x \in I,\ y = f(x)\} \subset \mathbb{R}^{n+1},$$

has the standard parametrization, still denoted by $G_f$, $G_f : I \to \mathbb{R}^{n+1}$, $G_f(t) := (t, f(t))$, called the graph-curve of $f$. Observe that $G_f$ is an injective map, in particular $G_f$ is never a closed curve, $G_f$ is of class $C^k$ if $f$ is of class $C^k$, $k = 1, \dots, \infty$, and $G_f'(t) = (1, f'(t))$ if $f$ is of class $C^1$. A point that moves with the graph-curve law along the graph moves with horizontal component of the velocity field normalized to $+1$. Notice that $|G_f'(t)| \ge 1$ $\forall t$.
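The velocity claims of Examples 7.2 and 7.3 are easy to check numerically. The following sketch uses the exact derivatives of the uniform circular motion and, as an illustrative assumption, the graph-curve of $f(t) = \sin t$; all function names are hypothetical helpers introduced here.

```python
import math


def dot(u, v):
    """Euclidean inner product (u | v)."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm |u|."""
    return math.sqrt(dot(u, u))


# Uniform circular motion and its first two derivatives (Example 7.2).
def gamma(t):
    return (math.cos(t), math.sin(t))


def gamma_prime(t):
    return (-math.sin(t), math.cos(t))


def gamma_second(t):
    return (-math.cos(t), -math.sin(t))


# Velocity of the graph-curve G_f(t) = (t, f(t)) for the sample f = sin.
def graph_curve_velocity(t):
    return (1.0, math.cos(t))


samples = [0.1 * k for k in range(-50, 51)]
speeds = [norm(gamma_prime(t)) for t in samples]
orthogonality = [abs(dot(gamma_prime(t), gamma(t))) for t in samples]
graph_speeds = [norm(graph_curve_velocity(t)) for t in samples]
```

Here `speeds` should stay equal to one, `orthogonality` should vanish up to rounding, and `graph_speeds` should never drop below one, in agreement with $|G_f'(t)| \ge 1$.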


Figure 7.1. A cylindrical helix.

7.4 Example (Cylindrical helix). If $\gamma(t) = (a \cos t, a \sin t, bt)$, $t \in \mathbb{R}$, then $\gamma'(t) = (-a \sin t, a \cos t, b)$, $t \in \mathbb{R}$. We see that the point $\gamma(t)$ moves with constant (scalar) speed along a helix, see Figure 7.1.

7.5 Example (Different parametrizations). Different curves may have the same trace, as we have seen for uniform circular motion. As another example, the curves $\gamma_1(t) := (t, 0)$, $\gamma_2(t) := (t^3, 0)$ and $\gamma_3(t) := (t(t^2 - 1), 0)$, $t \in \mathbb{R}$, are different parametrizations of the abscissa-axis of $\mathbb{R}^2$; of course, the three parametrizations give rise to different motions along the $x$-axis. Similarly, the curves $\sigma_1(t) = (t^3, t^2)$ and $\sigma_2(t) = (t, (t^2)^{1/3})$, $t \in \mathbb{R}$, are different parametrizations of (a) Figure 7.2. Notice that $\sigma_1$ is a $C^\infty$-parametrization, while $\sigma_2$ is continuous but not of class $C^1$.

7.6 Example (Polar curves). Many curves are more conveniently described by a polar parametrization: instead of giving the evolution of Cartesian coordinates of $\gamma(t) := (x(t), y(t))$, we give two real functions $\theta(t)$ and $\rho(t)$ that describe respectively the angle evolution of $\gamma(t)$ measured from the positive part of the abscissa axis and the distance of $\gamma(t)$ from the origin, so that in Cartesian coordinates

$$\gamma(t) = \big( \rho(t) \cos\theta(t),\ \rho(t) \sin\theta(t) \big).$$

If the independent variable $t$ coincides with the angle $\theta$, $\theta(t) = t$, we obtain a polar curve $\gamma(\theta) = (\rho(\theta) \cos\theta, \rho(\theta) \sin\theta)$.

In the literature there are many classical curves that have been studied for their relevance in many questions. Listing them would be incredibly long, but we shall illustrate some of them in Section 7.1.3.

Figure 7.2. (a) $\gamma(t) = (t^3, t^2)$, (b) $\gamma(t) = (t^3 - 4t, t^2 - 4)$.


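a. The calculus

Essentially the entire calculus, with the exception of the mean value theorem, can be carried over to curves.

7.7 Definition. Let $\gamma \in C^0([a,b]; \mathbb{R}^n)$, $\gamma = (\gamma^1, \gamma^2, \dots, \gamma^n)$. The integral of $\gamma$ on $[a,b]$ is the vector in $\mathbb{R}^n$

$$\int_a^b \gamma(s)\, ds := \Big( \int_a^b \gamma^1(s)\, ds,\ \int_a^b \gamma^2(s)\, ds,\ \dots,\ \int_a^b \gamma^n(s)\, ds \Big).$$

7.8 Proposition. If $\gamma \in C^0([a,b]; \mathbb{R}^n)$, then

$$\Big| \int_a^b \gamma(s)\, ds \Big| \le \int_a^b |\gamma(s)|\, ds.$$

Proof. Suppose that $\int_a^b \gamma(s)\, ds \ne 0$, otherwise the claim is trivial. For all $v = (v^1, v^2, \dots, v^n) \in \mathbb{R}^n$ we have

$$\Big( v \,\Big|\, \int_a^b \gamma(s)\, ds \Big) = \sum_{i=1}^n v^i \int_a^b \gamma^i(s)\, ds = \int_a^b \sum_{i=1}^n v^i \gamma^i(s)\, ds = \int_a^b (v \mid \gamma(s))\, ds;$$

using Cauchy's inequality we deduce

$$\Big| \Big( v \,\Big|\, \int_a^b \gamma(s)\, ds \Big) \Big| = \Big| \int_a^b (v \mid \gamma(s))\, ds \Big| \le \int_a^b |(v \mid \gamma(s))|\, ds \le |v| \int_a^b |\gamma(s)|\, ds$$

for all $v \in \mathbb{R}^n$. Therefore it suffices to choose $v := \int_a^b \gamma(s)\, ds$ to find the desired result. □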
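The inequality of Proposition 7.8 can be illustrated numerically. The sketch below approximates both sides with a midpoint rule for the half circle $\gamma(t) = (\cos t, \sin t)$, $t \in [0, \pi]$; the helper names and the choice of quadrature are assumptions made only for this illustration.

```python
import math


def integrate_curve(gamma, a, b, n=1000):
    """Midpoint-rule approximation of the vector integral of gamma on [a, b]."""
    h = (b - a) / n
    dim = len(gamma(a))
    total = [0.0] * dim
    for k in range(n):
        value = gamma(a + (k + 0.5) * h)
        for i in range(dim):
            total[i] += value[i] * h
    return tuple(total)


def integrate_norm(gamma, a, b, n=1000):
    """Midpoint-rule approximation of the integral of |gamma| on [a, b]."""
    h = (b - a) / n
    return sum(math.hypot(*gamma(a + (k + 0.5) * h)) * h for k in range(n))


def half_circle(t):
    return (math.cos(t), math.sin(t))


vector_integral = integrate_curve(half_circle, 0.0, math.pi)
lhs = math.hypot(*vector_integral)                   # |integral of gamma|
rhs = integrate_norm(half_circle, 0.0, math.pi)      # integral of |gamma|
```

For this curve the exact values are $|\int_0^\pi \gamma| = |(0, 2)| = 2$ and $\int_0^\pi |\gamma| = \pi$, so the inequality is strict; the computed `lhs` and `rhs` approximate these two numbers.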

If $\gamma \in C^1([a,b], \mathbb{R}^n)$ and $n > 1$, the mean value theorem does not hold. Indeed, if $\gamma(t) = (\cos t, \sin t)$, $t \in [0, 2\pi]$, and $s \in [0, 2\pi]$ is such that $0 = \gamma(2\pi) - \gamma(0) = 2\pi \gamma'(s)$, we reach a contradiction, since $|\gamma'(s)| = |(-\sin s, \cos s)| = 1$. However, the fundamental theorem of calculus, when applied to the components, yields the following.

7.9 Theorem. Let $\gamma \in C^1([a,b]; \mathbb{R}^n)$. Then

$$\gamma(b) - \gamma(a) = \int_a^b \gamma'(s)\, ds.$$

Finally, we notice that Taylor's formula extends to curves simply writing it for each component,

$$\gamma(t) = \gamma(t_0) + \gamma'(t_0)(t - t_0) + \frac{1}{2} \gamma''(t_0)(t - t_0)^2 + \dots + \frac{1}{k!} \gamma^{(k)}(t_0)(t - t_0)^k + \frac{1}{k!} \int_{t_0}^t (t - s)^k \gamma^{(k+1)}(s)\, ds. \qquad (7.1)$$


Figure 7.3. Some trajectories: from the left, (a) simple curve, (b) simple closed curve, (c), (d), (f) curves that are not simple.

b. Self-intersections

Traces of curves may have self-intersections, i.e., in general, curves are not injective. In (b) Figure 7.2 the trace of the curve $\gamma(t) = (t^3 - 4t, t^2 - 4)$, $t \in \mathbb{R}$, self-intersects at the origin. One defines the multiplicity of a curve $\gamma \in C^0(I, \mathbb{R}^n)$ at $x \in \mathbb{R}^n$ as the number of $t$'s such that $\gamma(t) = x$,

$$N(\gamma, I, x) := \#\{t \in I \mid \gamma(t) = x\}.$$

Of course, the trace of $\gamma$ is the set of points with multiplicity at least 1. We shall distinguish two cases.

(i) $\gamma : I \to \mathbb{R}^n$ is not closed, i.e., $\gamma(a) \ne \gamma(b)$. In this case we say that $\gamma$ is simple if $\gamma$ is injective, i.e., all points of its trajectory have multiplicity 1. Notice that, if $I = [a,b]$, then $\gamma$ is simple if and only if $\gamma$ is a homeomorphism of $[a,b]$ onto $\gamma([a,b])$, $[a,b]$ being compact, see Corollary 6.19. In contrast, if $I$ is not compact, $I$ and $\gamma(I)$ in general are not homeomorphic. For instance let $I = [0, 2\pi[$ and let $\gamma(t) := (\cos t, \sin t)$, $t \in I$, be the uniform circular motion. Then $\gamma(I)$ is the unit circle, which is not homeomorphic to $I$, see Exercise 6.72.

(ii) $\gamma$ is a closed curve, i.e., $I = [a,b]$ and $\gamma(a) = \gamma(b)$. In this case we say that $\gamma$ is a simple closed curve if the restriction of $\gamma$ to $[a,b[$ is injective, or, equivalently, if all points of the trajectory of $\gamma$, but $\gamma(a) = \gamma(b)$, have multiplicity 1.

A (closed) curve $\gamma$ has self-intersections if it is not a (closed) simple curve.

7.10 ¶. Show that any closed curve $\gamma : [a,b] \to \mathbb{R}^n$ can be seen as a continuous map from the unit circle $S^1 \subset \mathbb{R}^2$. Furthermore show that its trace is homeomorphic to $S^1$ if $\gamma$ is simple.

7.11 ¶. Study the curves $(x(t), y(t))$,

$x(t) = 2t/(1 + t^2)$, $y(t) = (t^2 - 1)/(1 + t^2)$,
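The multiplicity function introduced in this subsection can be computed explicitly for the curve $\gamma(t) = (t^3 - 4t, t^2 - 4)$ of Figure 7.2 (b). The sketch below relies on the fact that the second component determines $t$ up to sign; the function name `multiplicity` and the tolerance `tol` are illustrative assumptions.

```python
import math


def gamma(t):
    """The curve of Figure 7.2 (b)."""
    return (t**3 - 4 * t, t**2 - 4)


def multiplicity(point, tol=1e-9):
    """Number of real t with gamma(t) = point.

    From t**2 - 4 = y the only candidates are t = +sqrt(y + 4) and
    t = -sqrt(y + 4); each is kept when it also matches the first coordinate.
    """
    x, y = point
    if y + 4 < 0:
        return 0
    root = math.sqrt(y + 4)
    candidates = {root, -root}  # a set: one element only when root == 0
    return sum(1 for t in candidates if abs(gamma(t)[0] - x) <= tol)
```

Evaluating `multiplicity` at the origin counts the two parameters $t = \pm 2$, which is the self-intersection mentioned in the text, while every other point of the trace is reached only once.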

c. Equivalent parametrizations

Many properties of curves are independent of the choice of the parameter, that is, are invariant under homeomorphic changes of the parameter. This is the case for the multiplicity function and, as we shall see later, for the length. For this reason, it is convenient to introduce the following definition.


7.12 Definition. Let $I$, $J$ be intervals and let $\gamma \in C^0(I, \mathbb{R}^n)$ and $\delta \in C^0(J, \mathbb{R}^n)$. We say that $\delta$ is equivalent to $\gamma$ if there is a continuous one-to-one map $h : J \to I$ such that

$$\delta(s) = \gamma(h(s)) \qquad \forall s \in J.$$

In other words $\delta$ is equivalent to $\gamma$ if $\delta$ reduces to $\gamma$ modulo a continuous change of variable in the time axis. Since the inverse $h^{-1} : I \to J$ of a continuous one-to-one map is also continuous, see [GM1], we have that $\gamma$ is equivalent to $\delta$ iff $\delta$ is equivalent to $\gamma$. Actually one sees that the relation of equivalence among curves is an equivalence relation. Trivially, two equivalent curves have the same trace and the same multiplicity function; the converse is in general false.

7.13 Example. $\gamma(t) = (\cos t, \sin t)$, $t \in [0, 2\pi]$, and $\delta(t) = (\cos t, \sin t)$, $t \in [0, 4\pi]$, have the same trace but are not equivalent since their multiplicity functions are different.

However, we have the following.

7.14 Theorem. Two simple curves with the same trace are equivalent.

Proof. Assume for simplicity that the two curves $\gamma \in C^0(I, \mathbb{R}^n)$ and $\delta \in C^0(J, \mathbb{R}^n)$, $I$ and $J$ being intervals, are not closed. Set $h := \gamma^{-1} \circ \delta$, which clearly is a one-to-one and continuous map from $J$ to $I$. $h$ is then a homeomorphism, see [GM1], and clearly

$$\delta(t) = \gamma \circ \gamma^{-1} \circ \delta(t) = \gamma(h(t)) \qquad \forall t \in J.$$

The notion of equivalence between curves can be made more precise. 7.15 Definition. Let 7 G C^{I,W) and S G C^(J,W) be two equivalent curves, and let h : J -^ I be a homeomorphism such that S(t) — j{h{t)) Wt e J. We say that 7 and 5 have the same orientation if h is monotoneincreasing and have opposite orientation if h is monotone-decreasing. Since every homeomorphism between intervals is either strictly increasing or strictly decreasing, see [GMl], two equivalent curves either have the same orientation or have opposite orientations. In this way, the set of curves can be partitioned into equivalence classes and each class decomposes into two disjoint subclasses: equivalent curves with the same orientation and equivalent curves with opposite orientation.

7.1.2 Regular curves and tangent vectors

a. Regular curves

We say that a curve $\gamma$ of class $C^1$ is regular if $\gamma'(t) \neq 0$ $\forall t$. It is also convenient to reconsider the notion of equivalence in the category of curves of class $C^1$.


7.16 Definition. Let $I$, $J$ be intervals. Two curves $\gamma \in C^1(I,\mathbb{R}^n)$, $\delta \in C^1(J,\mathbb{R}^n)$ are $C^1$-equivalent if there exists a one-to-one map $h : J \to I$ of class $C^1$ with $h'(s) \neq 0$ $\forall s \in J$ such that

$$\delta(s) = \gamma(h(s)) \qquad \forall s \in J.$$

Clearly $C^1$-equivalent curves have the same trace. We can prove that being $C^1$-equivalent is an equivalence relation between regular curves; actually we shall prove the following result after Proposition 7.37.

7.17 Theorem. Let $\gamma$ and $\delta$ be two curves of class $C^1$, and suppose they are regular. Then $\gamma$ and $\delta$ are $C^1$-equivalent if and only if they are $C^0$-equivalent, i.e., equivalent in the sense of Definition 7.12.

Since every function $h$ of class $C^1$ with $h' \neq 0$ everywhere is either strictly increasing or strictly decreasing, as $h'$ cannot change sign, any two $C^1$-equivalent curves either have the same orientation or have opposite orientation. In this way the set of $C^1$-curves can be partitioned into equivalence classes and each class decomposes into two disjoint subclasses: $C^1$-equivalent curves with the same orientation and $C^1$-equivalent curves with opposite orientation.

b. Tangent vectors

Let $\gamma : I \to \mathbb{R}^n$ be a simple, regular curve of class $C^1$ and let $\Gamma := \gamma(I)$ be its trace. If $x \in \Gamma$, there exists a unique $t \in I$ such that $\gamma(t) = x$.

7.18 Definition. The space of tangent vectors to the trace $\Gamma$ at $x \in \Gamma$ is defined as the space of all multiples of $\gamma'(t)$,

$$\mathrm{Tan}_x \Gamma := \mathrm{Span}\,\{\gamma'(t)\}, \qquad \gamma(t) = x.$$

The unit tangent vector to $\gamma$ at $x := \gamma(t)$ is defined by

$$\frac{\gamma'(t)}{|\gamma'(t)|}.$$

Notice that the previous definition makes sense since one proves that $\mathrm{Span}\,\{\gamma'(t)\}$, where $\gamma(t) = x$, depends only on the trace of $\gamma$ and on $x$. In fact, if $\gamma : I \to \mathbb{R}^n$ and $\delta : J \to \mathbb{R}^n$ are two simple regular curves with the same trace $\Gamma$, then Theorems 7.14 and 7.17 yield that $\gamma$ and $\delta$ are $C^1$-equivalent, i.e., there exists $h : J \to I$ one-to-one and of class $C^1$ with $h'(s) \neq 0$ $\forall s \in J$ such that $\delta(s) = \gamma(h(s))$ $\forall s \in J$. Differentiating, we get

$$\delta'(s) = h'(s)\,\gamma'(h(s)),$$

that is, $\delta'(s)$ and $\gamma'(t)$ are multiples of each other when $\delta(s) = \gamma(t) =: x$. Moreover,

$$\frac{\delta'(s)}{|\delta'(s)|} = \frac{h'(s)}{|h'(s)|}\,\frac{\gamma'(h(s))}{|\gamma'(h(s))|},$$


Figure 7.4. A polygonal line inscribed on a curve.

that is, two $C^1$-equivalent curves with the same orientation have the same unit tangent vector, and two $C^1$-equivalent curves with opposite orientation have opposite unit tangent vectors.

Remaining in a classical context, it is convenient to also introduce the families of piecewise-$C^1$ curves and piecewise regular curves.

7.19 Definition. A curve $\gamma : [a,b] \to \mathbb{R}^n$ is said to be piecewise-$C^1$ (respectively, piecewise regular) if $\gamma \in C^0([a,b],\mathbb{R}^n)$ and there exist finitely many points $a = t_0 < t_1 < \dots < t_N = b$ such that the restrictions $\gamma|_{[t_{i-1},t_i]}$ are of class $C^1$ (respectively, regular) for all $i = 1,\dots,N$.

We emphasize that in Definition 7.19 $\gamma$ is required to be continuous everywhere in $[a,b]$, while derivatives are required to exist everywhere except at finitely many points, where only left- and right-derivatives exist. Notice also that piecewise-$C^1$ curves are Lipschitz continuous.

7.20 ¶. Let $\gamma : [a,b] \to \mathbb{R}^n$ be a piecewise regular curve. Show that every point in $\gamma([a,b])$ has finite multiplicity. Show a piecewise regular curve that has infinitely many points of multiplicity 2.

7.21 ¶. Show that

$$\gamma(b) - \gamma(a) = \int_a^b \gamma'(s)\,ds$$

if $\gamma : [a,b] \to \mathbb{R}^n$ is piecewise $C^1$.

c. Length of a curve

Recall that a partition $\sigma$ of $[a,b]$ is a choice of finitely many points $t_0,\dots,t_N$ with $a = t_0 < t_1 < \dots < t_N = b$. Denote by $\mathcal{S}$ the family of partitions of $[a,b]$. For each partition $\sigma = \{t_0,t_1,\dots,t_N\} \in \mathcal{S}$ one computes the length $P(\sigma)$ of the polygonal line joining the points $\gamma(t_0), \gamma(t_1), \dots, \gamma(t_N)$ in the listed order, Figure 7.4,

$$P(\sigma) := \sum_{i=1}^{N} |\gamma(t_i) - \gamma(t_{i-1})|.$$


Figure 7.5. The graph of $f(x) = x \sin(1/x)$, $x \in [0,1]$, is not rectifiable.

7.22 Definition. Let $\gamma \in C^0([a,b];\mathbb{R}^n)$. The length of $\gamma$ is defined as

$$L(\gamma) := \sup\{P(\sigma) \mid \sigma \in \mathcal{S}\}$$

and we say that $\gamma$ is rectifiable or $\gamma$ has finite total variation if $L(\gamma) < +\infty$.

In other words the length of a curve is the supremum of the lengths of all inscribed polygonals. The following is easily seen.

7.23 Proposition. If $\gamma$ and $\delta$ are equivalent, then $L(\gamma) = L(\delta)$. In particular $\gamma$ and $\delta$ are either both rectifiable or not, and the length of a simple curve depends only on its trace.

7.24 ¶. Prove Proposition 7.23.

7.25 ¶. Let $\gamma : [a,b] \to \mathbb{R}^n$ be a curve and let $a < c < b$. Show that $L(\gamma) = L(\gamma|_{[a,c]}) + L(\gamma|_{[c,b]})$.

7.26 ¶. Show that if $\gamma(t) = (\cos t, \sin t)$, $t \in [0,2\pi]$, we have $L(\gamma) = 2\pi$, while if $\gamma(t) = (\cos t, \sin t)$, $t \in [0,4\pi]$, we have $L(\gamma) = 4\pi$.
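The supremum in Definition 7.22 can be explored numerically. The following Python sketch computes $P(\sigma)$ for the circle of Exercise 7.26 along uniform partitions; the helper names `polygonal_length` and `unit_circle` and the choice of uniform partitions are assumptions made for illustration only. The printed values increase towards $2\pi$ as the partition is refined, which is consistent with, but of course does not prove, $L(\gamma) = 2\pi$.

```python
import math


def polygonal_length(gamma, a, b, n):
    """Length P(sigma) of the polygonal line inscribed in gamma along the
    uniform partition of [a, b] into n subintervals."""
    points = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    # Sum of the Euclidean distances between consecutive vertices.
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def unit_circle(t):
    """gamma(t) = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))


if __name__ == "__main__":
    # Nested partitions: each one refines the previous one, so P can only grow.
    for n in (4, 16, 64, 256, 1024):
        print(n, polygonal_length(unit_circle, 0.0, 2 * math.pi, n))
```

Running the same function on $[0, 4\pi]$ gives values approaching $4\pi$: the polygonal length counts the circle twice, in agreement with the role of the multiplicity function in Example 7.13.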

7.27 Example. Curves $\gamma \in C^0([a,b];\mathbb{R}^n)$ need not be rectifiable, i.e., of finite length. Indeed the curve graph of $f$, $\gamma(x) = (x, f(x))$, where

$$f(x) := \begin{cases} x \sin(1/x) & \text{if } x \in\, ]0,1], \\ 0 & \text{if } x = 0, \end{cases}$$

has infinite length, see Figure 7.5. Indeed, if

$$x_n := \frac{1}{n\pi + \pi/2}, \qquad n \in \mathbb{N},$$

the length of $\gamma|_{[x_n,x_{n-1}]}$ is larger than $x_n |\sin(1/x_n)| = x_n$, hence for any $n \geq 1$

$$L(\gamma) \geq L(\gamma|_{[x_n,1]}) \geq \sum_{k=1}^{n} x_k = \sum_{k=1}^{n} \frac{1}{k\pi + \pi/2},$$

i.e., $L(\gamma) = \infty$, since the series on the right diverges. Notice that $\gamma$ belongs to $C^0([0,1],\mathbb{R}^2) \cap C^1(]0,1],\mathbb{R}^2)$, but $\gamma'$ is not bounded in a neighborhood of $0$.

Figure 7.6. A closed curve that is not rectifiable.

7.28 Example (The von Koch curve). Clearly a bounded region of the plane may be enclosed by a curve of arbitrarily large length, think of the coasts of Great Britain or of Figure 7.6. A curve of infinite length enclosing a finite area is the von Koch curve, which is constructed as follows. Start from an equilateral triangle and replace the middle third of each line segment with the two sides of an equilateral triangle whose third side is the middle third that we want to remove. Then one iterates the procedure indefinitely. One can show that the iterated curves converge uniformly to a curve, called the von Koch curve, which
(i) is a continuous simple curve,
(ii) has infinite length and encloses a finite area,
(iii) is not differentiable at any point.

Figure 7.7. An approximation of the von Koch curve.

7.29 ¶. Show that each iteration in the construction of von Koch's curve increases its length by a factor $4/3$, and, given any two points on the curve, the length of the arc between the two points is infinite. Finally, show that the surface enclosed by von Koch's curve is $8/5$ of the surface of the initial triangle.
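The two numerical claims of Exercise 7.29 can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the function name `koch_snowflake` is hypothetical, and lengths and areas are measured in units in which the initial triangle has side 1 and area 1, so that an equilateral triangle of side $s$ has area $s^2$.

```python
from fractions import Fraction


def koch_snowflake(steps):
    """Return the list of (perimeter, area) pairs after 0, 1, ..., steps
    iterations of the von Koch construction.

    Units (an assumption of this sketch): the initial triangle has side 1
    and area 1, hence an equilateral triangle of side s has area s**2.
    """
    sides, side_len = 3, Fraction(1)
    perimeter, area = Fraction(3), Fraction(1)
    history = [(perimeter, area)]
    for _ in range(steps):
        # Every side gets a new outward triangle built on its middle third.
        area += sides * (side_len / 3) ** 2
        # Every side is replaced by 4 sides, each one third as long.
        sides *= 4
        side_len /= 3
        perimeter = sides * side_len
        history.append((perimeter, area))
    return history


if __name__ == "__main__":
    for step, (p, a) in enumerate(koch_snowflake(6)):
        print(step, p, a, float(Fraction(8, 5) - a))
```

The perimeters form a geometric sequence of ratio $4/3$, hence diverge, while the gap between the areas and $8/5$ shrinks geometrically.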

7.30 Example (The Peano curve). Continuous nonsimple curves may be quite pathological. Giuseppe Peano (1858-1932) showed in 1890 an example of a continuous curve $\gamma : [0,1] \to [0,1] \times [0,1]$ whose trace is the entire square: any such curve is called a Peano curve. Following David Hilbert (1862-1943), one such curve may be constructed as follows. Consider the sequence of continuous curves $\gamma_i : [0,1] \to \mathbb{R}^2$ as in Figure 7.8. The curve at step $i$ is obtained by modifying the curve at step $(i-1)$ in an interval of width $2^{-i}$ and in a corresponding square of side $2^{-i}$ on the target. The sequence of these curves therefore converges uniformly to a continuous curve $\gamma$, whose trace is the entire square. Of course, $\gamma$ is not injective, otherwise we would conclude that $[0,1]$ is homeomorphic to the square $[0,1]^2$, compare Proposition 6.69.

Figure 7.8. Construction of a Peano curve according to Hilbert.

Another way of constructing a Peano curve, closer to the original proof of Peano who used ternary representations of reals, is the following. Represent each $x \in [0,1]$ in its dyadic expression, $x = \sum_{i=0}^{\infty} b_i/2^{i+1}$, $b_i \in \{0,1\}$, choosing not to have representations ending with period 1. If $x = \sum_{i=0}^{\infty} b_i/2^{i+1} \in [0,1]$, set

$$\gamma(x) := \Big( \sum_{i=0}^{\infty} \frac{b_{2i}}{2^{i+1}},\ \sum_{i=0}^{\infty} \frac{b_{2i+1}}{2^{i+1}} \Big).$$

Using the fact that the alignment "changes" by a small quantity if $x$ varies in a sufficiently small interval, we easily infer that $\gamma$ is continuous. On the other hand, $\gamma$ is trivially surjective.

No pathological behavior occurs for curves of class $C^1$. In particular, there is a formula for computing their length.

7.31 Theorem. Let $\gamma \in C^1([a,b];\mathbb{R}^n)$. Then $\gamma$ is rectifiable and

$$L(\gamma) = \int_a^b |\gamma'(s)|\,ds.$$

Proof. Let $\sigma \in \mathcal{S}$ be a partition of $[a,b]$ and $P(\sigma)$ the length of the polygonal line corresponding to $\sigma$. The fundamental theorem of calculus yields

$$\gamma(t_i) - \gamma(t_{i-1}) = \int_{t_{i-1}}^{t_i} \gamma'(s)\,ds, \qquad \text{hence} \qquad |\gamma(t_i) - \gamma(t_{i-1})| \leq \int_{t_{i-1}}^{t_i} |\gamma'(s)|\,ds.$$

Summing over $i$, we conclude

$$P(\sigma) = \sum_{i=1}^{N} |\gamma(t_i) - \gamma(t_{i-1})| \leq \int_a^b |\gamma'(s)|\,ds < \infty$$

for $\sigma$ arbitrary, i.e., $L(\gamma) = \sup_\sigma P(\sigma) \leq \int_a^b |\gamma'(s)|\,ds$. This shows that $\gamma$ is rectifiable. It remains to show that $\int_a^b |\gamma'(s)|\,ds \leq L(\gamma)$ or, equivalently, that for any $\epsilon > 0$ there is a partition $\sigma_\epsilon$ such that

$$\int_a^b |\gamma'(s)|\,ds \leq P(\sigma_\epsilon) + \epsilon\,(b-a). \tag{7.2}$$

We observe that for every $s \in [t_{i-1},t_i]$ we have

$$\gamma(t_i) - \gamma(t_{i-1}) = \int_{t_{i-1}}^{t_i} \gamma'(t)\,dt = \int_{t_{i-1}}^{t_i} (\gamma'(t) - \gamma'(s))\,dt + \gamma'(s)(t_i - t_{i-1}),$$

consequently

$$|\gamma'(s)| \leq \frac{|\gamma(t_i) - \gamma(t_{i-1})|}{t_i - t_{i-1}} + \epsilon \tag{7.3}$$

provided we choose the partition $\sigma_\epsilon := \{t_0,t_1,\dots,t_N\}$ in such a way that

$$|\gamma'(t) - \gamma'(s)| < \epsilon \qquad \text{if } s,t \in [t_{i-1},t_i]$$

(such a choice is possible since $\gamma' : [a,b] \to \mathbb{R}^n$ is uniformly continuous in $[a,b]$ by the Heine-Cantor theorem, Theorem 6.35). The conclusion (7.2) then follows integrating (7.3) with respect to $s$ on $[t_{i-1},t_i]$ and summing over $i$. $\square$

Figure 7.9. The first page of the paper of Giuseppe Peano (1858-1932) that appeared in Mathematische Annalen.

Of course Theorem 7.31 also holds for piecewise-$C^1$ curves: if $\gamma \in C^0([a,b])$, $a = t_0 < t_1 < \dots < t_N = b$ and $\gamma \in C^1([t_{i-1},t_i];\mathbb{R}^n)$ $\forall i = 1,\dots,N$, then

$$L(\gamma) = \sum_{i=1}^{N} \int_{t_{i-1}}^{t_i} |\gamma'(s)|\,ds.$$
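Theorem 7.31 can be illustrated numerically by comparing inscribed polygonal lengths with a quadrature of $|\gamma'|$. The sketch below uses the helix $\gamma(t) = (\cos t, \sin t, t)$ on $[0,2\pi]$, for which $|\gamma'| = \sqrt{2}$; the example curve, the function names and the midpoint rule are all assumptions chosen for illustration, not part of the text.

```python
import math


def helix(t):
    """gamma(t) = (cos t, sin t, t), a regular curve of class C^1 in R^3."""
    return (math.cos(t), math.sin(t), t)


def helix_speed(t):
    """|gamma'(t)| for the helix; identically sqrt(2) up to rounding."""
    return math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)


def polygonal_length(gamma, a, b, n):
    """P(sigma) along the uniform partition of [a, b] into n subintervals."""
    points = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def speed_integral(speed, a, b, n):
    """Midpoint-rule approximation of the integral of |gamma'| over [a, b]."""
    h = (b - a) / n
    return h * sum(speed(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    a, b = 0.0, 2 * math.pi
    print("integral of |gamma'|:", speed_integral(helix_speed, a, b, 1000))
    for n in (10, 100, 1000):
        print(n, polygonal_length(helix, a, b, n))
```

As the first half of the proof predicts, every polygonal length stays below the integral, and the gap closes as the partition is refined.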

7.32 Lipschitz curves. Lipschitz curves, i.e., curves $\gamma : [a,b] \to \mathbb{R}^n$ for which there is $L > 0$ such that

$$|\gamma(t) - \gamma(s)| \leq L\,|t - s| \qquad \forall t,s \in [a,b],$$

are rectifiable. In fact, for every partition $\sigma$ with $a = t_0 < t_1 < \dots < t_N = b$ we have

$$P(\sigma) = \sum_{i=1}^{N} |\gamma(t_{i-1}) - \gamma(t_i)| \leq L \sum_{i=1}^{N} (t_i - t_{i-1}) = L\,(b-a).$$

Quite a bit more complicated is the problem of finding an explicit formula for the length of a Lipschitz curve or, more generally, of a rectifiable curve. This was solved with the contributions of Henri Lebesgue (1875-1941), Giuseppe Vitali (1875-1932), Tibor Radó (1895-1965), Hans Rademacher (1892-1969) and Leonida Tonelli (1885-1946) using several results of a more refined theory of integration, known as Lebesgue integration theory.

7.33 ¶ The length formula holds for primitives. Let $\gamma : [a,b] \to \mathbb{R}^n$ be a curve. Suppose there exists a Riemann integrable function $\psi : [a,b] \to \mathbb{R}^n$ such that

$$\gamma(t) = \gamma(a) + \int_a^t \psi(s)\,ds \qquad \forall t \in [a,b].$$

Show that $\gamma$ is rectifiable and $L(\gamma) = \int_a^b |\psi(t)|\,dt$.

7.34 ¶. Show that two regular curves that are $C^1$-equivalent have the same length. [Hint: Use the formula of integration by substitution.]

7.35 Example (Length of graphs). Let $f \in C^1([a,b],\mathbb{R})$. The graph of $f$, $G_f : [a,b] \to \mathbb{R}^2$, $G_f(t) = (t,f(t))$, is regular and $G_f'(t) = (1,f'(t))$. Thus the length of $G_f$ is

$$L(G_f) = \int_a^b \sqrt{1 + |f'(t)|^2}\,dt.$$

7.36 Example (Length in polar coordinates). (i) Let $\rho : [a,b] \to \mathbb{R}_+$, $\theta : [a,b] \to \mathbb{R}$ be functions of class $C^1$ and let $\gamma(t) = (x(t),y(t))$ be the corresponding plane curve in polar coordinates, $\gamma(t) = (\rho(t)\cos\theta(t), \rho(t)\sin\theta(t))$. Since $|\gamma'|^2 = x'^2 + y'^2 = \rho'^2 + \rho^2\theta'^2$ we infer

$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2\theta'^2}\,dt;$$

in particular, for a polar curve with $\theta(t) = t$, $\gamma(t) = (\rho(t)\cos t, \rho(t)\sin t)$, we have

$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2}\,dt.$$

(ii) Let $\rho : [a,b] \to \mathbb{R}_+$, $\theta : [a,b] \to \mathbb{R}$ and $f$ be functions of class $C^1$ and let $\gamma(t) := (x(t),y(t),z(t))$, $t \in [a,b]$, be the curve in space defined by cylindrical coordinates $(\rho(t),\theta(t),f(\theta(t)))$, i.e., $\gamma(t) := (\rho(t)\cos\theta(t), \rho(t)\sin\theta(t), f(\theta(t)))$. Since $|\gamma'|^2 = \rho'^2 + \rho^2\theta'^2 + f'(\theta)^2\theta'^2$, we have

$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2\theta'^2 + f'^2\theta'^2}\,dt.$$

(iii) For a curve in spherical coordinates $(\rho(t),\theta(t),\varphi(t))$, that is, for the curve $\gamma(t) = (x(t),y(t),z(t))$, $t \in [a,b]$, where

$$x(t) = \rho(t)\sin\varphi(t)\cos\theta(t), \qquad y(t) = \rho(t)\sin\varphi(t)\sin\theta(t), \qquad z(t) = \rho(t)\cos\varphi(t),$$

the length is

$$L(\gamma) = \int_a^b \sqrt{\rho'^2 + \rho^2\varphi'^2 + \rho^2\sin^2\!\varphi\;\theta'^2}\,dt.$$
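As a hedged illustration of the polar formula in Example 7.36 (i) with $\theta(t) = t$, the sketch below integrates $\sqrt{\rho'^2 + \rho^2}$ numerically for Archimedes's spiral $\rho = a\theta$ and compares the result with the elementary closed form $\frac{a}{2}\big(\theta\sqrt{1+\theta^2} + \operatorname{arsinh}\theta\big)$. The function names and the midpoint rule are assumptions made for this example; the closed form is a standard antiderivative, not a formula taken from the text.

```python
import math


def polar_length(rho, drho, a, b, n=20000):
    """Midpoint-rule approximation of the integral over [a, b] of
    sqrt(rho'(theta)^2 + rho(theta)^2), the length of the polar curve."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * h
        total += math.hypot(drho(theta), rho(theta))
    return total * h


def archimedes_rho(theta, a=1.0):
    """rho = a * theta."""
    return a * theta


def archimedes_drho(theta, a=1.0):
    """d(rho)/d(theta) = a."""
    return a


def archimedes_length_closed_form(theta_max, a=1.0):
    """Exact length of rho = a*theta for theta in [0, theta_max]."""
    return 0.5 * a * (theta_max * math.sqrt(1.0 + theta_max ** 2)
                      + math.asinh(theta_max))


if __name__ == "__main__":
    turn = 2 * math.pi
    print(polar_length(archimedes_rho, archimedes_drho, 0.0, turn))
    print(archimedes_length_closed_form(turn))
```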


Figure 7.10. Arc length or curvilinear coordinate.

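d. Arc length and $C^1$-equivalence

Let $\gamma \in C^1([a,b];\mathbb{R}^n)$ be a curve of class $C^1$ and regular, $\gamma'(t) \neq 0$ $\forall t$. The function $s_\gamma : [a,b] \to \mathbb{R}$ that for each $t \in [a,b]$ gives the length of $\gamma|_{[a,t]}$,

$$s_\gamma(t) := L(\gamma|_{[a,t]}) = \int_a^t |\gamma'(\tau)|\,d\tau,$$

is called the arc length or curvilinear abscissa of $\gamma$. We have

(i) $s_\gamma(t)$ is continuous, nondecreasing and maps $[a,b]$ onto $[0,L]$, $L$ being the length of $\gamma$. Moreover $s_\gamma$ is differentiable at every point and

$$s_\gamma'(t) = |\gamma'(t)| \qquad \forall t \in [a,b],$$

(ii) since $\gamma$ is regular, $\gamma'(t) \neq 0$ $\forall t \in [a,b]$, $s_\gamma(t)$ is in fact strictly increasing; consequently, its inverse $t_\gamma : [0,L] \to [a,b]$ is strictly increasing, too, and by the differentiation theorem of the inverse, see [GM1], $t_\gamma$ is of class $C^1$ and

$$t_\gamma'(s) = \frac{1}{|\gamma'(t_\gamma(s))|} \qquad \forall s \in [0,L].$$

With the previous notation, the reparametrization by arc length of $\gamma$ is defined as the curve $\delta_\gamma : [0,L] \to \mathbb{R}^n$ given by

$$\delta_\gamma(s) := \gamma(t_\gamma(s)), \qquad s \in [0,L].$$

Differentiating, we get

$$|\delta_\gamma'(s)| = \Big| \frac{d\,\gamma(t_\gamma(s))}{ds} \Big| = |\gamma'(t_\gamma(s))|\,|t_\gamma'(s)| = 1 \qquad \forall s.$$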
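The construction of the arc length reparametrization can be imitated numerically: tabulate $s_\gamma$ on a fine grid, invert it by interpolation, and compose. The sketch below is a rough approximation under stated assumptions (polygonal lengths stand in for $s_\gamma$, linear interpolation stands in for $t_\gamma$, and the names `arc_length_table`, `reparametrize_by_arc_length` and `ellipse` are hypothetical); it then checks that chords of the reparametrized curve have length close to the parameter increment, i.e., that the speed is close to one.

```python
import bisect
import math


def arc_length_table(gamma, a, b, n=20000):
    """Grid ts and cumulative polygonal lengths s, approximating s_gamma(ts)."""
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    points = [gamma(t) for t in ts]
    s = [0.0]
    for p, q in zip(points, points[1:]):
        s.append(s[-1] + math.dist(p, q))
    return ts, s


def reparametrize_by_arc_length(gamma, a, b, n=20000):
    """Return (delta, L): delta approximates gamma o t_gamma on [0, L]."""
    ts, s = arc_length_table(gamma, a, b, n)
    length = s[-1]

    def t_of_s(sv):
        # Locate the grid cell [s[j-1], s[j]] containing sv, then interpolate.
        j = bisect.bisect_right(s, sv)
        j = min(max(j, 1), len(s) - 1)
        s0, s1 = s[j - 1], s[j]
        w = 0.0 if s1 == s0 else (sv - s0) / (s1 - s0)
        return ts[j - 1] + w * (ts[j] - ts[j - 1])

    def delta(sv):
        return gamma(t_of_s(sv))

    return delta, length


def ellipse(t):
    """A regular curve with non-constant speed: (2 cos t, sin t)."""
    return (2.0 * math.cos(t), math.sin(t))


if __name__ == "__main__":
    delta, L = reparametrize_by_arc_length(ellipse, 0.0, math.pi)
    h = L / 1000
    for k in (0, 250, 500, 750):
        print(k, math.dist(delta(k * h), delta((k + 1) * h)) / h)
```

The printed ratios are close to 1, whereas the same ratios computed for the original parametrization of the ellipse vary between roughly 1 and 2.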

As a consequence, the arc length reparametrization of a regular curve $\gamma$ of length $L$ is a curve $\delta : [0,L] \to \mathbb{R}^n$ that is $C^1$-equivalent to $\gamma$, has the same orientation as $\gamma$ and for which $|\delta'(s)| = 1$ $\forall s \in [0,L]$. It is actually the unique reparametrization with these properties.

7.37 Proposition. Let $\varphi : [a,b] \to \mathbb{R}^n$ and $\psi : [c,d] \to \mathbb{R}^n$ be two $C^1$-equivalent curves with the same orientation, $\psi(s) = \varphi(h(s))$ $\forall s \in [c,d]$, for some $h : [c,d] \to [a,b]$ of class $C^1$ with $h' > 0$, and length $L$. Then

$$s_\psi(t) = s_\varphi(h(t)) \quad \forall t \in [c,d], \qquad \text{and} \qquad t_\varphi(s) = h(t_\psi(s)) \quad \forall s \in [0,L],$$

hence $\delta_\psi(s) = \delta_\varphi(s)$ $\forall s \in [0,L]$.


Figure 7.11. Maria Agnesi (1718-1799) and a page from the Editio princeps of the works of Archimedes of Syracuse (287BC-212BC).

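Proof. If $\psi(s) = \varphi(h(s))$ $\forall s \in [c,d]$, $h \in C^1$, $h' > 0$, then for any $t \in [c,d]$

$$s_\psi(t) = \int_c^t |\psi'(\tau)|\,d\tau = \int_c^t |\varphi'(h(\tau))|\,h'(\tau)\,d\tau = \int_a^{h(t)} |\varphi'(r)|\,dr = s_\varphi(h(t)),$$

hence $\delta_\psi := \psi \circ t_\psi = \psi \circ h^{-1} \circ t_\varphi = \varphi \circ h \circ h^{-1} \circ t_\varphi = \varphi \circ t_\varphi = \delta_\varphi$. $\square$

Proof of Theorem 7.17. Assume that $\delta \in C^1([c,d],\mathbb{R}^n)$, $\gamma \in C^1([a,b],\mathbb{R}^n)$, $\gamma$ regular, $h : [c,d] \to [a,b]$ is continuous and increasing and $\delta(s) = \gamma(h(s))$ $\forall s \in [c,d]$. Then the functions

$$\beta(s) := L(\delta|_{[c,s]}) = \int_c^s |\delta'(\tau)|\,d\tau, \quad s \in [c,d], \qquad \alpha(t) := L(\gamma|_{[a,t]}) = \int_a^t |\gamma'(\tau)|\,d\tau, \quad t \in [a,b],$$

are of class $C^1$ and $\beta(s) = L(\delta|_{[c,s]}) = L(\gamma|_{[a,h(s)]}) = \alpha(h(s))$ $\forall s \in [c,d]$, see Proposition 7.37. Since $\gamma$ is regular, $\alpha(t)$ is invertible with inverse of class $C^1$, hence $h(s) = \alpha^{-1}(\beta(s))$ and $h$ is of class $C^1$. $\square$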

7.1.3 Some celebrated curves

Throughout the centuries, mathematicians, artists, scholars of natural sciences and laymen have had an interest in plane curves, their variety of forms, and their occurrence in many natural phenomena. As a consequence there is a large literature which attempts to classify plane curves according to their properties, focusing on their constructive aspects or by simply providing catalogs. In this section we shall present some of these famous curves.

Figure 7.12. (a) Archimedes's spiral, (b) Fermat's spiral, (c) Hyperbolic spiral.

a. Spirals

Spirals are probably among the best known curves, the first and simplest being the spiral of Archimedes. This is the curve described by a point that moves with constant velocity along a half-line that rotates with constant angular velocity around its origin. If the origin of the half-line is the origin of a Cartesian plane, we have $\rho = vt$, $\theta = \omega t$, thus the polar form of Archimedes's spiral is

$$\rho = a\theta, \qquad a := \frac{v}{\omega}.$$

Other spirals are obtained assuming that the motion along the half-line is accelerated, for instance $\rho = a\theta^n$. All these spirals begin at the origin at $\theta = 0$ and move away from the origin as $\theta$ increases.

Figure 7.13. (a) Lituus, (b) Logarithmic spiral, (c) Cayley's sextic.


Figure 7.14. (a) Cardioid, (b) Lemniscate, (c) L'Hospital cubic.

ARCHIMEDEAN SPIRALS. These are the curves defined by

$$\rho^m = a^m\theta.$$

Among them, see Figures 7.12 and 7.13, we mention
o Archimedes's spiral $\rho = a\theta$,
o Fermat's spiral $\rho^2 = a^2\theta$,
o the hyperbolic or inverse spiral $\rho = a/\theta$,
o the lituus $\rho^2 = a^2/\theta$,
o the logarithmic or equiangular spiral $\theta = \log_A \rho$, i.e., $\rho = A^\theta$, $A > 1$. It is the spiralis mirabilis of Jacob Bernoulli (1654-1705). It (actually, its tangent at every point) forms a constant angle with any ray from the origin, and every ray intersects the logarithmic spiral in a sequence of points with distances in a geometric progression. It is probably the spiral that one finds most frequently in nature, expressing growth proportional to the organism, as in shells, pine cones, sunflowers or in galaxies.

SINUSOIDAL SPIRALS. A large variety of curves is described by the sinusoidal spirals $\rho^n = a^n\cos(n\theta)$, $n$ rational. For instance,
o Cayley's sextic $\rho = 4a\cos^3(\theta/3)$, see Figure 7.13, that we can also write in an implicit form as the set of points $(x,y)$ such that $4(x^2 + y^2 - ax)^3 = 27a^2(x^2 + y^2)^2$,

o Cardioid $\rho = 2a(1 + \cos\theta)$, see Figure 7.14, that we can write implicitly as the set of points $(x,y)$ such that $(x^2 + y^2 - 2ax)^2 = 4a^2(x^2 + y^2)$,
o Lemniscate of Bernoulli $\rho^2 = a^2\cos(2\theta)$, see Figure 7.14, equivalently the set of points $(x,y)$ such that $(x^2 + y^2)^2 = a^2(x^2 - y^2)$,
o Cubic of de l'Hospital $\rho\cos^3(\theta/3) = a$, see Figure 7.14.

Other well-known spirals are, see Figure 7.15,

PARABOLIC SPIRALS. $(\rho - a)^2 = b^2\theta$.

EULER'S SPIRAL. $\gamma(t) = (x(t),y(t))$ where $x(t) = \pm\int_0^t \dots\,dt$, $y(t) = \pm\int_0^t \dots\,dt$.

Figure 7.15. (a) Parabolic spiral $a = 1$, $b = 0.7$, (b) Euler's spiral.

Figure 7.16. (a) The conchoid, (b) The conchoid of Nicomedes $a = 4$, $b = 2$, (c) Limaçon of Pascal $a = b = 1$.
CONCHOID OF NICOMEDES. It is the trace of the curve

$$\begin{cases} x = b + a\cos\theta, \\ y = (b + a\cos\theta)\tan\theta, \end{cases}$$

obtained by the change of variable $t = b\tan\theta$, see Figure 7.16. We can write it also in polar coordinates as

$$\rho(\theta) = a + \frac{b}{\cos\theta}$$

or as the set of points $(x,y)$ such that

$$(x - b)^2(x^2 + y^2) = a^2x^2.$$

LIMAÇON OF PASCAL. (Étienne Pascal (1588-1640), the father of Blaise Pascal.) It is the conchoid of a circle of radius $a$ with respect to a point $O$ on the circle. If $O$ is the origin and $\rho = 2a\cos\theta$ is the polar equation of the circle of center $(a,0)$ through $(0,0)$, the polar equation of the limaçon is $\rho = 2a\cos\theta + b$, see Figure 7.16. Choosing $b = 2a$ the limaçon becomes a cardioid.

CONCHOID OF DÜRER. Let $Q = (q,0)$ and $R = (0,r)$ be points such that $q + r = b$. The locus of points $P$ and $P'$, on the straight line through $Q$ and $R$, with distance $a$ from $Q$ is Dürer's conchoid (Albrecht Dürer (1471-1528)), see Figure 7.18. Its Cartesian equation may be found by eliminating $q$ and $r$ from the equations $b = q + r$, $y = -\frac{r}{q}x + r$ and the distance condition $(x - q)^2 + y^2 = a^2$.

c. Cissoids

Given two curves $C_1$ and $C_2$ and a fixed point $O$, we let $Q_1$ and $Q_2$ be the intersections of a line through $O$ with $C_1$ and $C_2$, respectively. The locus of points $P$ on such lines such that $OP = OQ_2 - OQ_1 = Q_1Q_2$ is the cissoid of $C_1$ and $C_2$ with respect to $O$, see (a) Figure 7.17. The cissoid of a circle and a tangent line with respect to the point of the circle that is opposite to the point of tangency is the cissoid of Diocles, introduced by Diocles (240BC-180BC) in his attempts at doubling the cube, see (b) Figure 7.17. If $O$ is the origin, and the circle has equation $(x - a/2)^2 + y^2 = a^2/4$, the intersection points are $C = a(1,\tan\theta)$, $B = a\cos\theta\,(\cos\theta,\sin\theta)$, hence Diocles's cissoid has the Cartesian equation $y^2(a - x) = x^3$, or, equivalently, polar equation $\rho = a\sin\theta\tan\theta$.

Figure 7.17. (a) The cissoid, (b) Cissoid of Diocles, (c) Folium of Descartes.

Figure 7.18. (a) Dürer's conchoid, (b) Oval of Cassini, (c) The devil curve.

d. Algebraic curves

These are loci of zeros of polynomials. The degree of the polynomial may be taken as a measure of complexity: curves that are zeros of second order polynomials are well classified, see Example 3.69. We list here a few more algebraic curves, see Figure 7.19.

WITCH OF AGNESI. It has an equation $y(x^2 + a^2) = a^3$ and it is the trace of the curve $\gamma(t) = (x(t),y(t))$ where $x(t) = at$, $y(t) = a/(1 + t^2)$, $t \in \mathbb{R}$.

STROPHOID OF BARROW. It has an equation $x(x - a)^2 = y^2(2a - x)$ and it is the trace of the curve $\gamma(t) = (x(t),y(t))$ where $x(t) = 2a\cos^2 t$, $y(t) = a\tan t\,(1 - 2\cos^2 t)$, $t \in \mathbb{R}$.

EIGHT CURVE or LEMNISCATE OF GERONO. It has an equation $x^4 = a^2(x^2 - y^2)$ and it is the trace of the curve $\gamma(t) = (x(t),y(t))$ where $x(t) = a\cos t$, $y(t) = a\sin t\cos t$, $t \in \mathbb{R}$.

CURVES OF LISSAJOUS. They are the traces of curves $\gamma(t) = (x(t),y(t))$ where $x(t) = a\sin(nt + d)$, $y(t) = b\sin t$, $t \in \mathbb{R}$, in which each coordinate moves as a simple harmonic motion. One shows that such curves are algebraic closed curves iff $n$ is rational.

FOLIUM OF DESCARTES. It has an equation $x^3 + y^3 = 3axy$ and arises as trace of the curve $\gamma(t) = (x(t),y(t))$ where $x(t) = 3at/(1 + t^3)$, $y(t) = 3at^2/(1 + t^3)$, $t \in \mathbb{R}$, $t \neq -1$, see Figure 7.17.

DEVIL'S CURVE. It has an equation $y^4 - x^4 + ay^2 + bx^2 = 0$, see Figure 7.18.

DOUBLE FOLIUM. It has an equation $(x^2 + y^2)^2 = 4axy^2$, see Figure 7.20.

TRIFOLIUM. It has an equation $(x^2 + y^2)(y^2 + x(x + a)) = 4axy^2$, see Figure 7.20.

OVALS OF CASSINI. They have equation $(x^2 + y^2 + a^2)^2 = b^4 + 4a^2x^2$, see Figure 7.18.

ASTROID. It has an equation $x^{2/3} + y^{2/3} = a^{2/3}$, see Figure 7.20.
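A quick way to gain confidence in a parametrization is to substitute it into the implicit equation and look at the residual. The sketch below does this for three of the curves above (the witch of Agnesi, the lemniscate of Gerono and the folium of Descartes); the value chosen for the parameter $a$ and all function names are hypothetical, introduced only for this illustration.

```python
import math

A = 1.5  # hypothetical value of the parameter a


def witch(t):
    """Witch of Agnesi: x = a t, y = a / (1 + t^2)."""
    return (A * t, A / (1.0 + t * t))


def gerono(t):
    """Lemniscate of Gerono: x = a cos t, y = a sin t cos t."""
    return (A * math.cos(t), A * math.sin(t) * math.cos(t))


def folium(t):
    """Folium of Descartes (t != -1): x = 3at/(1+t^3), y = 3at^2/(1+t^3)."""
    d = 1.0 + t ** 3
    return (3.0 * A * t / d, 3.0 * A * t * t / d)


def witch_residual(x, y):
    """y (x^2 + a^2) - a^3, zero on the witch of Agnesi."""
    return y * (x * x + A * A) - A ** 3


def gerono_residual(x, y):
    """x^4 - a^2 (x^2 - y^2), zero on the lemniscate of Gerono."""
    return x ** 4 - A * A * (x * x - y * y)


def folium_residual(x, y):
    """x^3 + y^3 - 3 a x y, zero on the folium of Descartes."""
    return x ** 3 + y ** 3 - 3.0 * A * x * y


if __name__ == "__main__":
    for t in (-3.0, -2.0, -0.5, 0.0, 0.3, 1.0, 2.5):
        print(t, witch_residual(*witch(t)),
              gerono_residual(*gerono(t)),
              folium_residual(*folium(t)))
```

All printed residuals vanish up to rounding errors.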

e. The cycloid

Nonalgebraic curves are called transcendental. Among them one of the most famous is the cycloid. This is the trajectory described by a fixed point (the tyre valve) of a circle (a tyre) rolling on a line, see Figure 7.21.

Figure 7.19. Some algebraic curves: from the top-left (a) the witch of Agnesi, (b) the strophoid of Barrow, (c) the lemniscate of Gerono, (d) the Lissajous curve for $n = 5$, $d = \pi/2$.

If the center of the circle is $C = (0,R)$, the radius $R$, $P = (0,0)$ and we parametrize the movement with the angle $\theta$ that $CP$ makes with the vertical through $C$, then $P = P(\theta)$, $C = C(\theta)$, the cycloid has period $2\pi$, and we have

$$P(\theta) - C(\theta) = \begin{pmatrix} -R\sin\theta \\ -R\cos\theta \end{pmatrix}.$$

Since the circle rolls, $C(\theta)$ simply translates parallel to the axis by $R\theta$, $C(\theta) = (R\theta, R)$. We then conclude that the cycloid is the trace of the curve $\gamma : \mathbb{R} \to \mathbb{R}^2$ defined by

$$\gamma(\theta) = \begin{pmatrix} R(\theta - \sin\theta) \\ R(1 - \cos\theta) \end{pmatrix}.$$

Figure 7.20. From the left: (a) the double folium, (b) the trifolium, (c) the astroid.

Figure 7.21. The cycloid.
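Combining the parametrization of the cycloid with the length formula of Theorem 7.31 gives the length of one arch, $\theta \in [0,2\pi]$. The sketch below evaluates the integral of $|\gamma'(\theta)| = 2R\,|\sin(\theta/2)|$ numerically; the classical value $8R$ used for comparison is a standard fact that is not stated in the text above, and the function names and the midpoint rule are assumptions of this example.

```python
import math


def cycloid(theta, R=1.0):
    """gamma(theta) = (R(theta - sin theta), R(1 - cos theta))."""
    return (R * (theta - math.sin(theta)), R * (1.0 - math.cos(theta)))


def cycloid_speed(theta, R=1.0):
    """|gamma'(theta)| = R sqrt((1 - cos theta)^2 + sin^2 theta)."""
    return R * math.hypot(1.0 - math.cos(theta), math.sin(theta))


def arch_length(R=1.0, n=10000):
    """Midpoint-rule approximation of the length of one arch of the cycloid."""
    h = 2 * math.pi / n
    return h * sum(cycloid_speed((i + 0.5) * h, R) for i in range(n))


if __name__ == "__main__":
    for R in (1.0, 2.5):
        print(R, arch_length(R), 8 * R)
```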

The cycloid solves at least two important and celebrated problems. As we know, the pendulum is not isochronous, but it is approximately isochronous for small oscillations, see Section 6.3.1 of [GM1]. Christiaan Huygens (1629-1695) found that the isochronal curve is the cycloid. Johann Bernoulli (1667-1748) showed that the cycloid is the curve of quickest descent, that is, the curve connecting two points on a vertical plane on which a movable point descends under the influence of gravitation in the quickest possible way. Other curves of the same nature as the cycloids are the epicycloids and the hypocycloids, see Figure 7.22. These are obtained from a circle that rolls around the inside or the outside of another circle (or another curve).

Figure 7.22. (a) The epicycloid $x(t) = 9\cos t - \cos(9t)$, $y(t) = 9\sin t - \sin(9t)$, (b) the hypocycloid $x(t) = 8\cos t + 2\cos(4t)$, $y(t) = 8\sin t - 2\sin(4t)$, (c) a catenary.

f. The catenary

Another celebrated transcendental curve is the catenary. It describes the form assumed by a perfectly flexible inextensible chain of uniform density hanging from two supports, already discussed by Galileo Galilei (1564-1642). Answering a challenge of Jacob Bernoulli (1654-1705), it was proved by Gottfried von Leibniz (1646-1716) and Christiaan Huygens (1629-1695) that the equation of the catenary hung at the same height at both sides is

$$y = \frac{a}{2}\big(e^{x/a} + e^{-x/a}\big) = a\cosh\frac{x}{a}.$$

Figure 7.23. The pendulum clock from the Horologium Oscillatorium of Christiaan Huygens (1629-1695).

7.2 Curves in Metric Spaces

Of course we may also consider curves in a general metric space $X$, as continuity is the only requirement. Let us start introducing the notion of total variation, a notion essentially due to Camille Jordan (1838-1922).

a. Functions of bounded variation and rectifiable curves

Let $X$ be a metric space and $f : [a,b] \subset \mathbb{R} \to X$ be any map. Denote by $\mathcal{S}$ the family of finite partitions $\sigma = \{t_0,\dots,t_N\}$ with $a = t_0 < t_1 < \dots < t_N = b$ of the interval $[a,b]$ and, in correspondence to each partition $\sigma$, set $|\sigma| := \max_{i=1,\dots,N}(|t_i - t_{i-1}|)$ and

$$V_\sigma(f) := \sum_{i=0}^{N-1} d(f(t_i),f(t_{i+1})),$$

that we have denoted by $P(\sigma)$ in the case of curves into $\mathbb{R}^n$.

Figure 7.24. Blaise Pascal (1623-1662) and the frontispiece of his Lettres de Dettonville about the Roulettes.

7.38 Definition. The total variation of a map $f : [a,b] \to X$ is the number (possibly $+\infty$)

$$V(f,[a,b]) := \sup_{\sigma \in \mathcal{S}} V_\sigma(f).$$

We say that $f$ has bounded total variation if $V(f,[a,b]) < \infty$. When the curve $f : [a,b] \to X$ is continuous, $V(f,[a,b])$ is called the length of $f$, and curves with bounded total variation, that is with finite length, are called rectifiable.

Either directly or repeating the arguments used in studying the length of curves into $\mathbb{R}^n$, it is easy to show the following.

7.39 Proposition. We have
(i) if $[a,b] \subset [c,d]$, then $V(f,[a,b]) \leq V(f,[c,d])$,
(ii) $V(f,[a,b]) \geq d(f(a),f(b))$ and, if $f$ is real-valued and increasing, then $V(f,[a,b]) = f(b) - f(a)$,
(iii) every Lipschitz-continuous function $f : [a,b] \to X$ has bounded total variation and $V(f,[a,b]) \leq \mathrm{Lip}(f)\,(b-a)$,
(iv) the total variation is a subadditive set-function, meaning $V(f,[a,b]) \leq V(f,[a,c]) + V(f,[c,b])$ if $a < c < b$; moreover, if $f$ is continuous at $c$, then $V(f,[a,b]) = V(f,[a,c]) + V(f,[c,b])$,
(v) if $f$ is continuous, $V(f,[a,b]) = \lim_{|\sigma| \to 0^+} V_\sigma(f)$.

Figure 7.25. Frontispieces of two editions of 1532 and of 1606 of Institutionum geometricarum by Albrecht Dürer (1471-1528).
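The sums $V_\sigma(f)$ of Definition 7.38 only require a distance on the target, so they are easy to experiment with. The following sketch, with hypothetical helper names and with $X = \mathbb{R}$ and the distance $|p - q|$ as the only metric actually exercised, illustrates Proposition 7.39 (ii) and (iii): for an increasing function every partition gives exactly $f(b) - f(a)$, while for $\sin$ on $[0,2\pi]$ the sums never exceed $\mathrm{Lip}(f)(b - a) = 2\pi$ and reach the total variation 4 as soon as the partition contains the extremal points.

```python
import math


def variation_on_partition(f, partition, dist):
    """V_sigma(f): sum of dist(f(t_i), f(t_{i+1})) along the partition."""
    values = [f(t) for t in partition]
    return sum(dist(p, q) for p, q in zip(values, values[1:]))


def uniform_partition(a, b, n):
    """The partition of [a, b] into n subintervals of equal width."""
    return [a + (b - a) * i / n for i in range(n + 1)]


def abs_dist(p, q):
    """The usual distance on the real line."""
    return abs(p - q)


def cube(t):
    """An increasing real function."""
    return t ** 3


if __name__ == "__main__":
    for n in (1, 7, 50):
        sigma = uniform_partition(-1.0, 2.0, n)
        print("cube", n, variation_on_partition(cube, sigma, abs_dist))
    for n in (3, 4, 1000):
        sigma = uniform_partition(0.0, 2 * math.pi, n)
        print("sin ", n, variation_on_partition(math.sin, sigma, abs_dist))
```

Passing `math.dist` as `dist` and a tuple-valued `f` gives the polygonal lengths $P(\sigma)$ of curves into $\mathbb{R}^n$.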

7.40 ¶. Let $f : [a,b] \to X \times X$ where $X$ is a metric space. Show that $f$ has bounded variation if and only if the two components of $f = (f_1,f_2)$, $f_{1,2} : [a,b] \to X$, have bounded variation.

We say that two curves $\varphi : [a,b] \to X$ and $\psi : [c,d] \to X$ into $X$ are equivalent if there exists a homeomorphism $h : [c,d] \to [a,b]$ such that $\psi(s) = \varphi(h(s))$ $\forall s \in [c,d]$. From the definitions we have the following.

7.41 Proposition. Two equivalent curves have the same total variation.

From (iv) and (v) of Proposition 7.39 we also have the following.

7.42 Proposition. Let $\varphi : [a,b] \to X$ be a rectifiable (continuous) curve. Then the real-valued function $t \to V(\varphi,[a,t])$, $t \in [a,b]$, is continuous and increasing.

7.43 ¶. Prove the claims in Propositions 7.39, 7.41 and 7.42.

b. Lipschitz and intrinsic reparametrizations

We saw that every regular Euclidean curve may be reparametrized with velocity one. For curves in an arbitrary metric space we have

7.44 Theorem. Let $\gamma : [a,b] \to X$ be a simple rectifiable curve on a metric space $X$ of length $L$. Then there exists a homeomorphism $\sigma : [0,L] \to [a,b]$ such that $\gamma \circ \sigma : [0,L] \to X$ is Lipschitz continuous with Lipschitz constant one.

We call that parametrization of the trace of $\gamma$ the intrinsic parametrization of $\gamma([a,b])$.

Proof. Let $x \in [a,b]$ and $V(x) := V(\gamma,[a,x])$. We have $L = V(\gamma,[a,b])$ and, on account of Proposition 7.42, $V(x)$ is continuous and increasing. Since $\gamma$ is simple, $V(x)$ is strictly increasing, hence a homeomorphism between $[a,b]$ and $[0,L]$. Set $\sigma := V^{-1}$. We then infer for $0 \leq x < y \leq L$

$$d(\gamma(\sigma(x)),\gamma(\sigma(y))) \leq V(\gamma,[\sigma(x),\sigma(y)]) = V(\sigma(y)) - V(\sigma(x)) = y - x,$$

i.e., $\gamma \circ \sigma$ is Lipschitz continuous with Lipschitz constant one. $\square$

Figure 7.26. The sets $E_k$ of the middle third Cantor set.

7.2.1 Real functions with bounded variation

It is worth adding a few more comments about the class of real-valued functions $f : [a,b] \to \mathbb{R}$ with bounded total variation, denoted by $BV([a,b])$.

7.45 Theorem. We have
(i) $BV([a,b])$ is a linear space and $\|f\| := |f(a)| + V(f,[a,b])$ is a norm on it,
(ii) $BV([a,b])$ contains the convex cone of increasing functions,
(iii) every $f \in BV([a,b])$ is the difference of two increasing functions.

Proof. We leave to the reader the task of proving (i) and (ii) and we prove (iii). For $f \in BV([a,b])$ and $t \in [a,b]$ set

$$\varphi(t) := V(f,[a,t]), \qquad \psi(t) := \varphi(t) + f(t), \qquad t \in [a,b].$$

For $x,y \in [a,b]$, $x < y$, we have

$$\psi(y) - \psi(x) = [\varphi(y) - \varphi(x)] + [f(y) - f(x)];$$

now, since joining a partition of $[a,x]$ with a partition of $[x,y]$ gives a partition of $[a,y]$, we have

$$\varphi(y) - \varphi(x) \geq V(f,[x,y]) \geq |f(y) - f(x)|,$$

in particular $\psi(y) - \psi(x) \geq 0$. Therefore $\varphi$ and $\psi$ are both increasing with bounded total variation, and $f(t) = \psi(t) - \varphi(t)$ $\forall t$. $\square$

A surprising consequence is the following.

7.46 Corollary. Every function in $BV([a,b])$ has left- and right-limits at every point of $[a,b]$.
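The decomposition used in the proof of Theorem 7.45 (iii) can be imitated on a grid. The sketch below is only a discrete approximation under stated assumptions: the cumulative sums of $|f(t_{i+1}) - f(t_i)|$ along a uniform grid stand in for $\varphi(t) = V(f,[a,t])$ (they approximate it from below), and the function name `jordan_decomposition` is hypothetical. It returns grid values of $\varphi$ and $\psi = \varphi + f$, both nondecreasing, with $f = \psi - \varphi$.

```python
import math


def jordan_decomposition(f, a, b, n=1000):
    """Grid values of phi ~ V(f, [a, t]) and psi = phi + f on a uniform grid.

    phi is computed as the variation of f along the grid points up to t,
    which is a lower approximation of the true total variation.
    """
    ts = [a + (b - a) * i / n for i in range(n + 1)]
    fs = [f(t) for t in ts]
    phi = [0.0]
    for u, v in zip(fs, fs[1:]):
        phi.append(phi[-1] + abs(v - u))
    psi = [p + y for p, y in zip(phi, fs)]
    return ts, phi, psi


if __name__ == "__main__":
    ts, phi, psi = jordan_decomposition(math.sin, 0.0, 2 * math.pi)
    for i in (0, 250, 500, 750, 1000):
        print(round(ts[i], 4), round(phi[i], 6), round(psi[i], 6))
```

For $f = \sin$ on $[0,2\pi]$ the last value of `phi` is the total variation 4, because the grid contains the points $\pi/2$ and $3\pi/2$ where the monotonicity of $\sin$ changes.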

Figure 7.27. An approximate Cantor-Vitali function.

If we reread the proof of (iii) Theorem 7.45, on account of Proposition 7.42 we infer 7.47 Proposition. Every continuous function f : [a, 6] -^ R with finite total variation is the difference of two continuous increasing functions. a. The Cantor-Vitali function The Cantor ternary set is defined, see of [GM2], as C = CikEk where £^0 •= [0,1], El is obtained from £"0 be removing the open middle third of EQ, and E'fc+i by renioving from each interval of Ek its open middle third. Define for fc = 0 , 1 , . . . and j = 1 , . . . , 2*^, the base points

^0,1 = 0

bk-\-i,j

k^kj + ^bk,j

ifj = l,...,2'^ ifj = 2^ + l , . . . , 2 ^ + \

then the intervals that have been removed from Ek-i to get Ek at step k are 1 2, •^k-l Ik-ij:=bk-i,j+S-'^']-,-l j = l,. and the intervals whose union is Ek are Jk,j:=bkj+3-^[0M

j=

l,-..X.

Therefore 00

C = k=0

^7=1

^

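Strongly related to Cantor's set is the Cantor-Vitali function introduced by Giuseppe Vitali (1875-1932). To define it, we first consider the approximate Cantor-Vitali functions $V_k : [0,1] \to \mathbb{R}$ defined inductively by

$$V_0(x) := x, \qquad V_{k+1}(x) := \begin{cases} \frac12 V_k(3x) & \text{if } x \in [0,1/3], \\ \frac12 & \text{if } x \in [1/3,2/3], \\ \frac12 + \frac12 V_k(3(x - 2/3)) & \text{if } x \in [2/3,1], \end{cases}$$

see Figure 7.27. One easily checks that for $k = 0,1,\dots$

(i) We have $V_k(0) = 0$, $V_k(1) = 1$, $V_k(b_{k,j}) = \frac{j-1}{2^k}$, $V_k(b_{k,j} + 3^{-k}) = \frac{j}{2^k}$, and

$$V_k(x) = \frac{2j-1}{2^{m+1}} \qquad \text{if } x \in I_{m,j},\ m = 0,\dots,k-1,\ j = 1,\dots,2^m.$$

(ii) We have

$$V_k(x) = \Big(\frac32\Big)^k \int_0^x \chi_{E_k}(t)\,dt,$$

where $\chi_{E_k}$ is the characteristic function of the set $E_k$ that we used to define the Cantor ternary set.

(iii) We have

$$|V_k(x) - V_k(y)| \le |x-y|^{\alpha} \qquad \forall x,y \in [0,1],$$

where $\alpha = \log 2/\log 3$; in particular the $V_k$'s are equi-Hölder. In fact, by symmetry it suffices to prove the claim for $x,y \in [0,3^{-k}]$, where $V_k$ is linear with slope $(3/2)^k$. For $0 \le x < y \le 3^{-k}$ we have

$$|V_k(x) - V_k(y)| = \Big(\frac32\Big)^k |x-y| = \Big(\frac32\Big)^k |x-y|^{1-\alpha}\,|x-y|^{\alpha} \le \Big(\frac32\Big)^k 3^{-k(1-\alpha)}\,|x-y|^{\alpha} = |x-y|^{\alpha},$$

as $2 \cdot 3^{-\alpha} = 1$.

(iv) We have

$$|V_{k+1}(x) - V_k(x)| \le 2^{-k+1} \qquad \forall x \in [0,1].$$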

In particular (iv) implies that the sequence $\{V_k\}$ converges uniformly to a function $V(x)$, which is by (iii) Hölder-continuous with exponent $\alpha = \log 2/\log 3$. The function $V$ is called the Cantor-Vitali function and satisfies the following properties:
o $V$ is nondecreasing, hence it has bounded total variation,
o in each interval of $[0,1] \setminus E_k$, $V(x) = V_k(x)$ is constant; in particular $V$ is differentiable outside the Cantor set with $V'(x) = 0$ $\forall x \in [0,1] \setminus C$,
o $V([0,1]) = [0,1]$, and $V$ maps $[0,1] \setminus C$ into the denumerable set $D := \{ y \in \mathbb{R} \mid y = \frac{j}{2^k},\ j = 0,1,\dots,2^k,\ k \in \mathbb{N} \}$, hence $V(C) \supset [0,1] \setminus D$ (in fact $V(C) = [0,1]$, since the endpoints of the removed intervals belong to $C$).
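The properties (i), (iii) and (iv) of the approximations $V_k$ can be checked numerically on a rational grid. This is a minimal sketch under stated assumptions: `V`, `base_points` and `check` are illustrative names, the recursion for the base points is the self-similar one $b \mapsto b/3$, $b \mapsto 2/3 + b/3$, and a finite grid can only support the inequalities, not prove them.

```python
import math
from fractions import Fraction

ALPHA = math.log(2) / math.log(3)  # Hoelder exponent log 2 / log 3

def V(k, x):
    """Approximate Cantor-Vitali function V_k at a rational x in [0, 1]."""
    x = Fraction(x)
    if k == 0:
        return x
    if x <= Fraction(1, 3):
        return V(k - 1, 3 * x) / 2
    if x <= Fraction(2, 3):
        return Fraction(1, 2)
    return Fraction(1, 2) + V(k - 1, 3 * x - 2) / 2

def base_points(k):
    """Left endpoints b_{k,1} < ... < b_{k,2^k} of the intervals of E_k."""
    pts = [Fraction(0)]
    for _ in range(k):
        pts = [b / 3 for b in pts] + [Fraction(2, 3) + b / 3 for b in pts]
    return pts

def check(k, n_grid=162):
    """Check (i), monotonicity, (iii) and (iv) for V_k on the grid i / n_grid."""
    grid = [Fraction(i, n_grid) for i in range(n_grid + 1)]
    vk = [V(k, x) for x in grid]
    vk1 = [V(k + 1, x) for x in grid]
    # (i): values at the endpoints of the intervals J_{k,j} (j is 0-based here).
    ok_i = all(
        V(k, b) == Fraction(j, 2**k)
        and V(k, b + Fraction(1, 3**k)) == Fraction(j + 1, 2**k)
        for j, b in enumerate(base_points(k))
    )
    ok_mono = all(a <= b for a, b in zip(vk, vk[1:]))
    # (iii): Hoelder bound, with a small tolerance for floating-point powers.
    ok_holder = all(
        float(abs(vk[i] - vk[j])) <= float(grid[j] - grid[i]) ** ALPHA + 1e-12
        for i in range(len(grid))
        for j in range(i + 1, len(grid))
    )
    # (iv): uniform distance between consecutive approximations.
    ok_iv = max(abs(a - b) for a, b in zip(vk, vk1)) <= Fraction(2, 2**k)
    return ok_i and ok_mono and ok_holder and ok_iv
```

Exact rational arithmetic is used for the values of $V_k$, so the identities in (i) are tested with `==`; only the Hölder bound needs floating point.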


7.48 Homeomorphisms do not preserve fractal dimensions. The function $\varphi(x) := x + V(x)$, $\varphi : [0,1] \to [0,2]$, is continuous and strictly increasing, hence a homeomorphism between $[0,1]$ and $[0,2]$. In Theorem 8.109 we shall see that the algebraic dimension of $\mathbb{R}^n$ is a topological invariant, that is, $\mathbb{R}^n$ and $\mathbb{R}^m$ are homeomorphic if and only if $n = m$. This is not true in general for the fractal dimension, see Chapter 8 of [GM2]. In fact, $\varphi$ maps the complement of Cantor's set in $[0,1]$ into a countable union of intervals of total measure 1, $\mathcal{H}^1(\varphi([0,1] \setminus C)) = 1$, hence $\mathcal{H}^1(\varphi(C)) = 1$ and $\dim_{\mathcal{H}}(\varphi(C)) = 1$, while $\dim_{\mathcal{H}}(C) = \log 2/\log 3$.

7.49 ¶. Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a Lipschitz-continuous map with Lipschitz inverse. Show that $f$ preserves the fractal dimension, $\dim_{\mathcal{H}}(f(A)) = \dim_{\mathcal{H}} A$. [Hint: Recall that $\mathcal{H}^k(f(A)) \le \mathrm{Lip}(f)^k\, \mathcal{H}^k(A)$, see Section 8.2.4 of [GM2].]

7.3 Exercises

7.50 ¶. We invite the reader to study some of the curves described in this chapter, to check that the figures are quite reasonable, and to compute the lengths of some of those curves and, when possible, the enclosed areas.

7.51 ¶. Compute the total variation of the following functions $f : [0,2] \to \mathbb{R}$: [...] $f(x) := 1$ if $x \in A$, $f(x) := 0$ if $x \notin A$.

7.52 ¶. Let $g(x) = \sqrt{x}$, $x \in [0,1]$, and let $f : [0,1] \to \mathbb{R}$ be given by $f(x) :=$ [...], $f(x) := 0$ otherwise. Show that $f$, $g$ and $g \circ f$ have bounded total variation.

7.53 ¶. Let $f, g \in BV([0,1])$. Show that $\min(f,g)$, $\max(f,g)$, $|f| \in BV([0,1])$.

7.54 ¶. Show that the Cantor middle third set $C$ is compact and perfect, i.e., int (C) =

8. Some Topics from the Topology of $\mathbb{R}^n$

As we have already stated, topology is the study of the properties shared by a geometric figure and all its bi-continuous transformations, i.e., the study of invariants by homeomorphisms. Its origin dates back to the problem of the Königsberg bridges and Euler's theorem about polyhedra, to Riemann's work on the geometric representation of functions, to Betti's work on the notion of multiconnectivity and, most of all, to the work of J. Henri Poincaré (1854-1912). Starting from his research on differential equations in mechanics, Poincaré introduced relevant topological notions and, in particular, the idea of associating to a geometric figure (using a rule that is common to all figures) an algebraic object, such as a group, that is a topological invariant for the figure and that one could compute. The fundamental group and homology groups are two important examples of algebraic objects introduced by Poincaré: this is the beginning of combinatorial or algebraic topology. With the development of what we call today general topology due to, among others, René-Louis Baire (1874-1932), Maurice Fréchet (1878-1973), Frigyes Riesz (1880-1956), Felix Hausdorff (1869-1942), Kazimierz Kuratowski (1896-1980), and the interaction between general and algebraic topology due to L. E. Brouwer (1881-1966), James Alexander (1888-1971), Solomon Lefschetz (1884-1972), Pavel Alexandroff (1896-1982), Pavel Urysohn (1898-1924), Heinz Hopf (1894-1971), L. Agranovich Lyusternik (1899-1981), Lev G. Schnirelmann (1905-1938), Harald Marston Morse (1892-1977), Eduard Čech (1893-1960), the study of topology in a wide sense is consolidated and in fact receives new incentives thanks to the work of Jean Leray (1906-1998), Élie Cartan (1869-1951), Georges de Rham (1903-1990).
Clearly, even a short introduction to these topics would divert us from our course; therefore we shall confine ourselves to illustrating some fundamental notions and basic results related to the topology of $\mathbb{R}^n$, to the notion of dimension and, most of all, to the existence of fixed points.


Figure 8.1. A homotopy.

8.1 Homotopy In this section we shall briefly discuss the different flavors of the notion of homotopy. They correspond to the intuitive idea of continuous deformation of one object into another.

8.1.1 Homotopy of maps and sets

a. Homotopy of maps

In the following, the ambient spaces $X$, $Y$, $Z$ will be metric spaces.

8.1 Definition. Two continuous maps $f, g : X \to Y$ are called homotopic if there exists a continuous map $H : [0,1] \times X \to Y$ such that $H(0,x) = f(x)$, $H(1,x) = g(x)$ $\forall x \in X$. In this case we say that $H$ establishes or is a homotopy of $f$ to $g$.

It is easy to show that the homotopy relation $f \sim g$ on maps from $X$ to $Y$ is an equivalence relation, i.e., it is
(i) (REFLEXIVE) $f \sim f$,
(ii) (SYMMETRIC) $f \sim g$ iff $g \sim f$,
(iii) (TRANSITIVE) if $f \sim g$ and $g \sim h$, then $f \sim h$.

Therefore $C^0(X,Y)$ can be partitioned into classes of homotopic functions. It is worth noticing that, since

$$C^0([0,1], C^0(X,Y)) = C^0([0,1] \times X, Y), \qquad (8.1)$$

we have the following.

8.2 Proposition. $f$ and $g \in C^0(X,Y)$ are homotopic if and only if they belong to the same path-connected component of $C^0(X,Y)$ endowed with the uniform distance. The subsets of $C^0(X,Y)$ of homotopy equivalent maps are the path-connected components of the metric space $C^0(X,Y)$ with the uniform distance.

Figure 8.2. Frontispieces of the introduction to combinatorial topology by Kurt Reidemeister (1893-1971) and Pavel Alexandroff (1896-1982) in its Italian translation.

8.3 ¶. Let $X$, $Y$ be metric spaces. Show the equality (8.1), which we understand as an isometry of metric spaces.

8.4 ¶. Let $Y$ be a convex subset of a normed linear space. Then every continuous map $f : X \to Y$ from an arbitrary metric space $X$ is homotopic to a constant. In particular, constant maps are homotopic to each other. [Hint: Fix $y_0 \in Y$ and consider the homotopy $H : [0,1] \times X \to Y$ given by $H(t,x) := t y_0 + (1-t) f(x)$.]

8.5 ¶. Let $X$ be a convex set of a normed linear space. Then every continuous map $f : X \to Y$ into an arbitrary metric space is homotopic to a constant function. [Hint: Fix $x_0 \in X$ and consider the homotopy $H : [0,1] \times X \to Y$ given by $H(t,x) := f(t x_0 + (1-t) x)$.]

8.6 ¶. Two constant maps are homotopic iff their values can be connected by a path.

8.7 ¶. Let $X$ be a linear normed space. Show that the homotopy classes of maps $f : X \to Y$ correspond to the path-connected components of $Y$.

According to Exercises 8.4, 8.5 and 8.6, all maps into $\mathbb{R}^n$ or defined on $\mathbb{R}^n$ are homotopic to constant maps. However, this is not always the case for maps from or into $S^n := \{x \mid \|x\| = 1\}$, the unit sphere of $\mathbb{R}^{n+1}$.

8.8 Proposition. We have
(i) Let $f, g : X \to S^n$ be two continuous maps such that $f(x)$ and $g(x)$ are never antipodal, i.e., $g(x) \ne -f(x)$ $\forall x \in X$; then $f$ and $g$ are homotopic. In particular, if $f : X \to S^n$ is not onto, then $f$ is homotopic to a constant.

Figure 8.3. The figure suggests a homotopy of closed curves, that is, a continuous family of closed paths, from a knotted loop to $S^1$. But it can be proved that there is no family of homeomorphisms of the ambient space $\mathbb{R}^3$ that, starting from the identity, deforms the initial knotted loop into $S^1$.

(ii) Let $B^{n+1} := \{x \in \mathbb{R}^{n+1} \mid |x| \le 1\}$. A continuous map $f : S^n \to Y$ is homotopic to a constant if and only if $f$ has a continuous extension $F : B^{n+1} \to Y$.

Proof. (i) Since $f(x)$ and $g(x)$ are never antipodal, the segment $t g(x) + (1-t) f(x)$, $t \in [0,1]$, never goes through the origin; a homotopy of $f$ to $g$ is then

$$H(t,x) := \frac{t g(x) + (1-t) f(x)}{|t g(x) + (1-t) f(x)|}.$$

Notice that $y \to y/|y|$ is the radial projection from $\mathbb{R}^{n+1} \setminus \{0\}$ onto the sphere $S^n$, hence $H(t,x)$ is the radial projection onto the sphere of the segment $t g(x) + (1-t) f(x)$, $t \in [0,1]$. The second part of the claim follows by choosing $y_0 \in S^n \setminus f(X)$ and $g(x) := -y_0$.

(ii) If $F : B^{n+1} \to Y$ is a continuous function such that $F(x) = f(x)$ $\forall x \in S^n$, then the map $H(t,x) := F(tx)$, $(t,x) \in [0,1] \times S^n$, is continuous, hence a homotopy of $H(0,x) = F(0)$ to $H(1,x) = f(x)$. Conversely, if $H : [0,1] \times S^n \to Y$ is a homotopy of a constant map $g(x) = p \in Y$ to $f$, $H(0,x) = p$, $H(1,x) = f(x)$ $\forall x \in S^n$, then the map $F : B^{n+1} \to Y$ defined by

$$F(x) := \begin{cases} H\big(|x|, \frac{x}{|x|}\big) & \text{if } x \ne 0, \\ p & \text{if } x = 0 \end{cases}$$

is a continuous extension of $f$ to $B^{n+1}$ with values into $Y$. $\square$
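The radial-projection homotopy in part (i) of the proof can be illustrated numerically. The sketch below assumes $X = [0,1]$ and two hypothetical maps `f`, `g` into $S^1$ whose angles differ by 2 radians, so that they are never antipodal; the function names are illustrative only.

```python
import math

def normalize(v):
    """Radial projection y -> y / |y| of a nonzero vector onto the unit sphere."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("segment passes through the origin")
    return tuple(c / n for c in v)

def homotopy(f, g, t, x):
    """H(t, x): radial projection of the point t*g(x) + (1 - t)*f(x)."""
    fx, gx = f(x), g(x)
    segment_point = tuple(t * b + (1.0 - t) * a for a, b in zip(fx, gx))
    return normalize(segment_point)

def dist(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical maps from [0, 1] into S^1; their angles differ by 2 < pi,
# so f(x) and g(x) are never antipodal.
def f(x):
    return (math.cos(x), math.sin(x))

def g(x):
    return (math.cos(x + 2.0), math.sin(x + 2.0))

midpoint = homotopy(f, g, 0.5, 0.3)  # a point of S^1 between f(0.3) and g(0.3)
```

If `g` were replaced by `-f`, the segment would pass through the origin at $t = 1/2$ and `normalize` would raise, which is the obstruction that the non-antipodal hypothesis rules out.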

b. Homotopy classes

Denote by $[X,Y]$ the set of homotopy classes of continuous maps $f : X \to Y$ and by $[f] \in [X,Y]$ the equivalence class of $f$. The following two propositions collect some elementary facts.

8.9 Proposition. We have
(i) (COMPOSITION) Let $f, f' : X \to Y$, $g, g' : Y \to Z$ be continuous maps. If $f \sim f'$ and $g \sim g'$, then $g \circ f \sim g' \circ f'$.
(ii) (RESTRICTION) If $f, g : X \to Y$ are homotopic and $A \subset X$, then $f_{|A}$ is homotopic to $g_{|A}$ as maps from $A$ to $Y$.
(iii) (CARTESIAN PRODUCT) $f, g : X \to Y_1 \times Y_2$ are homotopic if and only if $\pi_i \circ f$ and $\pi_i \circ g$ are homotopic (with values in $Y_i$), where $\pi_i$, $i = 1,2$, denote the projections on the factors.

A trivial consequence of Proposition 8.9 is that the set $[X,Y]$ is a topological invariant of both $X$ and $Y$. In a sense $[X,Y]$ gives the number of "different" ways that $X$ can be mapped into $Y$, hence measures the "topological complexity" of $Y$ relative to that of $X$.

Let $\varphi : X \to Y$ be a continuous map and let $Z$ be a metric space. Then $\varphi$ defines a pull-back map

$$\varphi^{\#} : [Y,Z] \to [X,Z]$$

defined by $\varphi^{\#}[f] := [f \circ \varphi]$, as Proposition 8.9 yields that the homotopy class of $f \circ \varphi$ depends only on the homotopy class of $f$. Similarly $\varphi$ induces a push-forward map

$$\varphi_{\#} : [Z,X] \to [Z,Y]$$

defined by $\varphi_{\#}[g] := [\varphi \circ g]$.

8.10 Proposition. We have the following.
(i) Let $\varphi, \psi : X \to Y$ be continuous and homotopic, $\varphi \sim \psi$. Then $\varphi^{\#} = \psi^{\#}$ and $\varphi_{\#} = \psi_{\#}$.
(ii) Let $\varphi : X \to Y$ and $\eta : Y \to Z$ be continuous. Then

$$(\eta \circ \varphi)^{\#} = \varphi^{\#} \circ \eta^{\#} \qquad \text{and} \qquad (\eta \circ \varphi)_{\#} = \eta_{\#} \circ \varphi_{\#}.$$

c. Homotopy equivalence of sets

8.11 Definition. Two metric spaces $X$ and $Y$ are said to be homotopy equivalent, or to have the same homotopy type, if there exist two continuous maps $f : X \to Y$ and $g : Y \to X$ such that $g \circ f \sim \mathrm{Id}_X$ and $f \circ g \sim \mathrm{Id}_Y$.

If $f : X \to Y$ and $g : Y \to X$ define a homotopy equivalence between $X$ and $Y$, then for every space $Z$ we infer from Proposition 8.10

$$g^{\#} \circ f^{\#} = \mathrm{Id}_{[Y,Z]}, \qquad f^{\#} \circ g^{\#} = \mathrm{Id}_{[X,Z]},$$

hence $[X,Z]$ and $[Y,Z]$ are in a one-to-one correspondence. Similarly

$$g_{\#} \circ f_{\#} = \mathrm{Id}_{[Z,X]}, \qquad f_{\#} \circ g_{\#} = \mathrm{Id}_{[Z,Y]},$$

hence $[Z,X]$ and $[Z,Y]$ are in a one-to-one correspondence.

8.12 Definition. A space $X$ is called contractible if it is homotopy equivalent to a space with only one point, equivalently, if the identity map $i : X \to X$ of $X$ is homotopic to a constant map.

By definition, if $X$ is contractible to $x_0 \in X$, then $X$ is homotopy equivalent to $\{x_0\}$, hence $[Z,X]$ and $[X,Z]$ reduce to a point for any space $Z$.

Figure 8.4. $\mathbb{R}^n$ is contractible.

8.13 Example. $\mathbb{R}^n$ is contractible. In fact, $H(t,x) := (1-t)x$, $(t,x) \in [0,1] \times \mathbb{R}^n$, contracts $\mathbb{R}^n$ to the origin.

In general, describing the set $[X,Y]$ is a very difficult task even for the simplest case of the homotopy of spheres, $[S^k, S^n]$, $k, n \ge 1$. However, the following may be useful.

8.14 Definition. Let $X$ be a metric space. We say that $A \subset X$ is a retract of $X$ if there exists a continuous map $\rho : X \to A$, called a retraction, such that $\rho(x) = x$ $\forall x \in A$. Equivalently, $A$ is a retract of $X$ if the identity map $\mathrm{Id}_A : A \to A$ extends to a continuous map $r : X \to A$. We say that $A \subset X$ is a deformation retract of $X$ if $A$ is a retract of $X$ and the identity map $\mathrm{Id}_X : X \to X$ is homotopic to a retraction of $X$ to $A$.

Let $A \subset X$ be a deformation retract of $X$ and denote by $i_A : A \to X$ the inclusion map. Since $\mathrm{Id}_X : X \to X$ is homotopic to the retraction map $r : X \to A$, we have

$$r \circ i_A = \mathrm{Id}_A, \qquad i_A \circ r = r \sim \mathrm{Id}_X,$$

hence $A$ and $X$ are homotopy equivalent. By the above, for every space $Z$ we have $[A,Z] = [X,Z]$ and $[Z,A] = [Z,X]$ as sets, thus reducing the computation of $[Z,X]$ and of $[X,Z]$ to that of $[Z,A]$ and $[A,Z]$, respectively. The following observation is useful.

8.15 Proposition. Let $A \subset X$ be a subset of a metric space $X$. Then $A$ is a deformation retract of $X$ if and only if $A$ is a retract of $X$ and $\mathrm{Id}_X : X \to X$ is homotopic to a continuous map $g : X \to A$.

Figure 8.5. $S^1$ is a deformation retract of the torus $T \subset \mathbb{R}^3$.

Figure 8.6. $S^1$ is a deformation retract of $B^2 \setminus \{0\}$.

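Proof. It is enough to prove sufficiency. Let $r : X \to A$ be a retraction and let $h : [0,1] \times X \to X$ be a homotopy of $\mathrm{Id}_X$ to $g$, $h(0,x) = x$, $h(1,x) = g(x)$ $\forall x \in X$. Then the map

$$k(t,x) := \begin{cases} h(2t, x) & \text{if } 0 \le t \le \frac12, \\ r(h(2-2t, x)) & \text{if } \frac12 < t \le 1 \end{cases}$$

is continuous since $h(1,x) = r(h(1,x))$ $\forall x$, and shows that $\mathrm{Id}_X$ is homotopic to $r : X \to A$. $\square$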

8.16 ¶. Show that every point of a space $X$ is a retract of $X$.

8.17 ¶. Show that $\{0,1\} \subset \mathbb{R}$ is not a retract of $\mathbb{R}$.

8.18 ¶. Show that a retract $A \subset X$ of a space $X$ is a closed set.

8.19 ¶. The possibility of retracting $X$ onto $A$ is related to the possibility of extending continuous maps on $A$ to continuous maps on $X$. Show the following.
Proposition. $A \subset X$ is a retract of $X$ if and only if for any topological space $Z$ any continuous map $f : A \to Z$ extends to a continuous map $F : X \to Z$.

8.20 ¶. Show that $S^n$ is a deformation retract of $B^{n+1} \setminus \{0\}$, see Figure 8.6.

8.21 ¶. With reference to Figure 8.8, show that $M \setminus \partial M$ is not a retract of $M$, but $M$ and $M \setminus \partial M$ are homotopy equivalent since they have a deformation retract in common.

Figure 8.7. The first two figures are homotopy equivalent since they are both deformation retracts of the third figure.

Figure 8.8. $M \setminus \partial M$ is not a retract of $M$, but $M$ and $M \setminus \partial M$ are homotopy equivalent.

d. Relative homotopy

Intuitively, see Figure 8.1, the maps $H_t : X \to Y$, $t \in [0,1]$, defined by $H_t(x) := H(t,x)$, are a continuous family of continuous maps that deform $f$ to $g$. In particular, it is important to note that, in considering homotopy of maps, the target space is relevant and must be kept fixed in the discussion. As we shall see in the sequel, maps with values in $Y$ that are nonhomotopic may become homotopic when seen as maps with values in $Z \supset Y$. Also, it is worth considering homotopies of a suitably restricted type. For instance, when working with paths with fixed endpoints, it is better to consider homotopies such that all curves $x \to H_t(x) := H(t,x)$, $x \in [0,1]$, have the same fixed endpoints for all $t \in [0,1]$. Similarly, when working with closed curves, it is worthwhile considering homotopies $H(t,x)$ such that every curve $x \to H_t(x) := H(t,x)$ is closed for all $t \in [0,1]$.

8.22 Definition. Let $\mathcal{C} \subset C^0(X,Y)$. We say that $f, g \in \mathcal{C}$ are homotopic relative to $\mathcal{C}$ if there exists a continuous map $H : [0,1] \times X \to Y$ such that $H(0,x) = f(x)$, $H(1,x) = g(x)$ and the curves $x \to H_t(x) := H(t,x)$ belong to $\mathcal{C}$ for all $t \in [0,1]$.

It is easy to check that the relative homotopy is an equivalence relation. The set of relative homotopy classes with respect to $\mathcal{C} \subset C^0(X,Y)$ is denoted by $[X,Y]_{\mathcal{C}}$. Some choices of the subset $\mathcal{C} \subset C^0(X,Y)$ are relevant.
(i) Let $Z \subset Y$ and $\mathcal{C} := \{f \in C^0(X,Y) \mid f(X) \subset Z\}$. In this case a homotopy relative to $\mathcal{C}$ is a homotopy of maps with values in $Z$.
(ii) Let $X = [0,1]$, $a, b \in Y$ and $\mathcal{C} := \{f \in C^0(X,Y) \mid f(0) = a,\ f(1) = b\}$. Then a homotopy relative to $\mathcal{C}$ is called a homotopy with fixed endpoints.
(iii) Let $X = [0,1]$, and let $\mathcal{C} := \{f \in C^0([0,1],Y) \mid f(0) = f(1)\}$ be the class of closed curves, or loops, in $Y$. In this case two curves homotopic relative to $\mathcal{C}$ are said to be loop-homotopic.
Recall that a closed curve $\gamma : [0,1] \to X$ can be reparametrized as a continuous map $\delta : S^1 \to X$ from the unit circle $S^1 \subset \mathbb{C}$. Now let $\gamma_1, \gamma_2 : [0,1] \to X$ be two loops and let $\delta_1, \delta_2 : S^1 \to X$ be two corresponding reparametrizations on $S^1$. Then, recalling that homotopies are simply paths in the space of continuous maps, it is trivial to show that $\gamma_1$ and $\gamma_2$ are loop-homotopic if and only if $\delta_1$ and $\delta_2$ are homotopic as maps from $S^1$ into $X$. Therefore $[[0,1],X]_{\mathcal{C}} = [S^1,X]$. Finally, notice that the intuitive idea of continuous deformation has several subtle aspects, see Figure 8.3.

8.1.2 Homotopy of loops

a. The fundamental group with base point

Let $X$ be a metric space and let $x_0 \in X$. It is convenient to consider loops $\gamma : [0,1] \to X$ with $\gamma(0) = \gamma(1) = x_0$. We call them loops with base point $x_0$. Also, one can introduce a restricted form of homotopy between loops with base point $x_0$ by considering loop-homotopies $H(t,x)$ such that $x \to H(t,x)$ has base point $x_0$ for every $t$. We denote the corresponding homotopy equivalence relation and homotopy classes, respectively, by $\sim_{x_0}$ and $[\ ]_{x_0}$. Finally, $\pi_1(X,x_0)$ denotes the set of classes of loops with base point $x_0$ with respect to loop-homotopy with base point $x_0$.

8.23 ¶. Show that $\pi_1(X,x_0)$ reduces to a point if $X$ is contractible and $x_0 \in X$. [Hint: Show that $\pi_1(X,x_0) \subset [S^1,X]$.]

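b. The group structure on $\pi_1(X,x_0)$

Given two loops $\varphi, \psi : [0,1] \to X$ with base point $x_0$, we may consider the junction of $\varphi$ and $\psi$, denoted by $\varphi * \psi$, as the loop with base point $x_0$ defined by

$$\varphi * \psi(t) := \begin{cases} \varphi(2t) & \text{if } t \in [0,1/2], \\ \psi(2t-1) & \text{if } t \in [1/2,1]. \end{cases}$$

Since the homotopies with fixed endpoints can be joined, too, we have $\varphi_1 * \psi_1 \sim \varphi * \psi$ if $\varphi_1 \sim \varphi$ and $\psi_1 \sim \psi$, so the junction is well defined on homotopy classes.

8.24 Proposition. We have
(i) (ASSOCIATIVITY) Let $f, g, h : [0,1] \to X$ be three loops with base point $x_0$. Then $([f]_{x_0} * [g]_{x_0}) * [h]_{x_0} = [f]_{x_0} * ([g]_{x_0} * [h]_{x_0})$.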

(ii) (RIGHT AND LEFT IDENTITIES) Let $f : [0,1] \to X$ be a loop with base point $x_0$ and let $e_{x_0} : [0,1] \to X$ be the constant map, $e_{x_0}(t) := x_0$. Then $[e_{x_0}]_{x_0} * [f]_{x_0} = [f]_{x_0} * [e_{x_0}]_{x_0} = [f]_{x_0}$.
(iii) (INVERSE) Let $e_{x_0} : [0,1] \to X$ be the constant map $e_{x_0}(t) := x_0$ and, for a loop $f : [0,1] \to X$ with base point $x_0$, let $\bar f : [0,1] \to X$ be the map $\bar f(t) := f(1-t)$. Then $[f]_{x_0} * [\bar f]_{x_0} = [\bar f]_{x_0} * [f]_{x_0} = [e_{x_0}]_{x_0}$.
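The junction and the reverse of loops are easy to write down concretely. The following is a minimal sketch for loops in the plane; `junction`, `reverse`, `loop` and `close` are illustrative names, and the loops based at $(1,0)$ that wind $n$ times around the origin are an assumed example, not taken from the text.

```python
import math

def junction(phi, psi):
    """Junction phi * psi: run phi on [0, 1/2], then psi on [1/2, 1]."""
    def joined(t):
        return phi(2.0 * t) if t <= 0.5 else psi(2.0 * t - 1.0)
    return joined

def reverse(f):
    """Reverse loop t -> f(1 - t), representing the inverse class."""
    return lambda t: f(1.0 - t)

def loop(n):
    """Loop based at (1, 0) that winds n times around the origin."""
    return lambda t: (math.cos(2 * math.pi * n * t), math.sin(2 * math.pi * n * t))

def close(p, q, tol=1e-9):
    """True if two points of the plane agree up to the tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))

f, g = loop(1), loop(2)
fg = junction(f, g)                      # first f, then g
f_then_back = junction(f, reverse(f))    # representative of [f] * [f-bar]
```

The junction is continuous at $t = 1/2$ precisely because both loops share the base point, here `f(1) == g(0) == (1, 0)` up to rounding.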

In this way the junction of loops defines a natural group structure on $\pi_1(X,x_0)$, where $[f]_{x_0}^{-1} = [\bar f]_{x_0}$.

8.25 Definition. Let $X$ be a space and $x_0 \in X$. The set $\pi_1(X,x_0)$ of homotopy classes of loops with base point $x_0$ has a natural group structure induced by the junction operation of loops. We then call $\pi_1(X,x_0)$ the fundamental group of $X$, or the first homotopy group of $X$, with base point $x_0$.

c. Changing base point

By definition $\pi_1(X,x_0)$ depends on the base point $x_0$. However, if $x_0, x_1 \in X$, suppose that there exists a path $\alpha : [0,1] \to X$ from $x_0$ to $x_1$ and let $\bar\alpha : [0,1] \to X$, $\bar\alpha(t) := \alpha(1-t)$, be the reverse path from $x_1$ to $x_0$. For every loop $\gamma$ with base point $x_0$, the curve $\bar\alpha * \gamma * \alpha$ is a loop with base point $x_1$. Since evidently $\bar\alpha * \gamma_1 * \alpha \sim \bar\alpha * \gamma_2 * \alpha$ if $\gamma_1 \sim \gamma_2$, $\alpha$ defines a map $\alpha_* : \pi_1(X,x_0) \to \pi_1(X,x_1)$ by

$$\alpha_*([\gamma]_{x_0}) := [\bar\alpha * \gamma * \alpha]_{x_1}, \qquad (8.2)$$

where we have denoted by $[\ ]_{x_0}$ and $[\ ]_{x_1}$, respectively, the homotopy classes of curves with base point $x_0$ and $x_1$. It is trivial to see that $\alpha_*$ is a group isomorphism, thus concluding the following.

8.26 Proposition. $\pi_1(X,x_0)$ and $\pi_1(X,x_1)$ are isomorphic as groups for all $x_0, x_1 \in X$ if $X$ is path-connected.

Thus, for a path-connected space $X$, all groups $\pi_1(X,x_0)$, $x_0 \in X$, are the same group up to an isomorphism. We call it the fundamental group or the first homotopy group of $X$, and we denote it by $\pi_1(X)$.

However, the map $\alpha_*$ defined by (8.2) depends explicitly on $\alpha$. For convenience, let $h_\alpha := \alpha_*$. Examples show that in general $h_\alpha \ne h_\beta$ if $\alpha$ and $\beta$ have the same endpoints, but we have

$$h_\beta \circ h_\alpha^{-1}([\gamma]_{x_1}) = h_\beta([\alpha \gamma \bar\alpha]_{x_0}) = [\bar\beta \alpha \gamma \bar\alpha \beta]_{x_1} = [\bar\beta \alpha]_{x_1} * [\gamma]_{x_1} * [\bar\alpha \beta]_{x_1}.$$

This implies that
(i) $h_\alpha = h_\beta$ if $\alpha$ and $\beta$ are homotopic with the same endpoints,
(ii) $h_\alpha$ is always the same map, independently of $\alpha$, if $\pi_1(X,x_1)$ is a commutative group.

Figure 8.9. Camille Jordan (1838-1922) and the frontispiece of the Japanese translation of the Lecture Notes on Elementary Topology and Geometry by I. M. Singer and J. A. Thorpe.

Thus, attaching a path to $x_0$ to any curve $\gamma : S^1 \to X$, we can construct a loop with base point $x_0$ and, at the homotopy level, this construction is actually a map $h : [S^1,X] \to \pi_1(X,x_0)$. It is clear that $h$ is one-to-one, since its inverse is just the inclusion of $\pi_1(X,x_0)$ into $[S^1,X]$.

8.27 Proposition. Let $X$ be path-connected. If $\pi_1(X)$ is commutative, then the map $h : [S^1,X] \to \pi_1(X)$ described above is bijective.

8.28 Definition. We say that a space $X$ is simply connected if $X$ is path-connected and $\pi_1(X,x_0)$ reduces to a point for some $x_0 \in X$ (equivalently for any $x_0 \in X$ by Proposition 8.26).

8.29 ¶. Show that $X$ is simply connected if $X$ is path-connected and contractible.

d. Invariance properties of the fundamental group

Let us now look at the action of continuous maps on the fundamental group. Let $X$, $Y$ be metric spaces and let $x_0 \in X$. To any continuous map $f : X \to Y$ one associates a map

$$f_{\#} : \pi_1(X,x_0) \to \pi_1(Y,f(x_0))$$

defined by $f_{\#}([\gamma]_{x_0}) := [f \circ \gamma]_{f(x_0)}$. It is easy to see that the above definition makes sense, and that actually $f_{\#}$ is a group homomorphism.

8.30 Proposition. We have the following.
(i) Let $f : X \to Y$ and $g : Y \to Z$ be two continuous maps. Then $(g \circ f)_{\#} = g_{\#} \circ f_{\#}$.
(ii) If $\mathrm{Id} : X \to X$ is the identity map and $x_0 \in X$, then $\mathrm{Id}_{\#}$ is the identity map on $\pi_1(X,x_0)$.
(iii) Suppose $Y$ is path-connected, and let $F : [0,1] \times X \to Y$ be a homotopy of two maps $f$ and $g$ from $X$ into $Y$. Then the curve $\alpha(t) := F(t,x_0)$, $t \in [0,1]$, joins $f(x_0)$ to $g(x_0)$ and $g_{\#} = \alpha_* \circ f_{\#}$.

Proof. (i) and (ii) are trivial. To prove (iii), it is enough to show that $f \circ \gamma$ and $\alpha * (g \circ \gamma) * \bar\alpha$ are homotopic for every loop $\gamma$ with base point $x_0$. A suitable homotopy is given by the map $H : [0,1] \times [0,1] \to Y$ defined by

$$H(t,x) := \begin{cases} \alpha(2x) & \text{if } x \le \frac{1-t}{2}, \\ F\Big(1-t,\ \gamma\Big(\frac{4x+2t-2}{3t+1}\Big)\Big) & \text{if } \frac{1-t}{2} \le x \le \frac{t+3}{4}, \\ \bar\alpha(4x-3) & \text{if } x \ge \frac{t+3}{4}. \end{cases}$$
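The three-piece homotopy in the proof of (iii) can be checked on a concrete case. The sketch below assumes $X = Y = \mathbb{C}$, the translation homotopy $F(t,z) = z + tc$ between $f = \mathrm{Id}$ and $g(z) = z + c$, the base point $x_0 = 1$ and the loop $\gamma(s) = e^{2\pi i s}$; all of these, and the constant `C`, are hypothetical choices made only for illustration.

```python
import cmath

C = 0.5 + 0.25j   # hypothetical translation vector
X0 = 1.0 + 0.0j   # base point x_0

def F(t, z):
    """Homotopy from f = F(0, .) (the identity) to g = F(1, .) (a translation)."""
    return z + t * C

def gamma(s):
    """Loop with base point X0."""
    return cmath.exp(2j * cmath.pi * s)

def alpha(t):
    """Curve t -> F(t, x_0) joining f(x_0) to g(x_0)."""
    return F(t, X0)

def alpha_bar(t):
    """Reverse curve of alpha."""
    return alpha(1.0 - t)

def H(t, x):
    """Homotopy from alpha * (g o gamma) * alpha_bar (t = 0) to f o gamma (t = 1)."""
    left = (1.0 - t) / 2.0
    right = (t + 3.0) / 4.0
    if x <= left:
        return alpha(2.0 * x)
    if x >= right:
        return alpha_bar(4.0 * x - 3.0)
    return F(1.0 - t, gamma((4.0 * x + 2.0 * t - 2.0) / (3.0 * t + 1.0)))

value = H(0.5, 0.5)  # a point on an intermediate loop based at f(x_0)
```

At the seams $x = (1-t)/2$ and $x = (t+3)/4$ all pieces take the value $\alpha(1-t)$, which is why the middle argument is rescaled by $3t+1$.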

Of course, Proposition 8.30 (i) and (ii) imply that a homeomorphism $h : X \to Y$ induces an isomorphism between $\pi_1(X,x_0)$ and $\pi_1(Y,h(x_0))$. Therefore, on account of Proposition 8.26, the fundamental group is a topological invariant of path-connected spaces. Actually, from (iii) of Proposition 8.30 we infer the following.

8.31 Theorem. Let $X$, $Y$ be two path-connected homotopy equivalent spaces. Then $\pi_1(X)$ and $\pi_1(Y)$ are isomorphic.

Proof. Let $f : X \to Y$, $g : Y \to X$ be continuous such that $g \circ f \sim \mathrm{Id}_X$ and $f \circ g \sim \mathrm{Id}_Y$, and let $x_0 \in X$. Then we have two induced maps

$$f_{\#} : \pi_1(X,x_0) \to \pi_1(Y,f(x_0)), \qquad g_{\#} : \pi_1(Y,f(x_0)) \to \pi_1(X,g(f(x_0))).$$

Let $H : [0,1] \times X \to X$ be the homotopy of $\mathrm{Id}_X$ to $g \circ f$ and let $K : [0,1] \times Y \to Y$ be the homotopy of $\mathrm{Id}_Y$ to $f \circ g$. If $\alpha_1(t) := H(t,x_0)$, $\alpha_2(t) := K(t,f(x_0))$, then by Proposition 8.30 we infer

$$g_{\#} \circ f_{\#} = (g \circ f)_{\#} = (\alpha_1)_* \circ (\mathrm{Id}_X)_{\#} = (\alpha_1)_*,$$

$$f_{\#} \circ g_{\#} = (f \circ g)_{\#} = (\alpha_2)_* \circ (\mathrm{Id}_Y)_{\#} = (\alpha_2)_*.$$

Since $(\alpha_1)_*$ and $(\alpha_2)_*$ are isomorphisms, $f_{\#}$ is injective and surjective. $\square$

8.1.3 Covering spaces

a. Covering spaces

A useful tool to compute, at least in some cases, the fundamental group, is the notion of covering space.

8.32 Definition. A covering of $Y$ is a continuous map $p : X \to Y$ from a topological space $X$, called the total space, onto $Y$ such that for all $x \in Y$ there exists an open set $U \subset Y$ containing $x$ such that $p^{-1}(U) = \cup_\alpha V_\alpha$, where the $V_\alpha$ are pairwise disjoint open sets and $p_{|V_\alpha}$ is a homeomorphism between $V_\alpha$ and $U$. Each $V_\alpha$ is called a slice of $p^{-1}(U)$.


Figure 8.10.

8.33 Example. Let $Y$ be any space. Consider the disjoint union of $k$ copies of $Y$, that we can write as a Cartesian product $X := Y \times \{1,2,\dots,k\}$. Then the projection map $p : X \to Y$, $p((y,i)) = y$, is a covering of $Y$.

8.34 Example. Let $S^1$ be the unit circle of $\mathbb{C}$. Then the circular motion $p : \mathbb{R} \to S^1$, $p(\theta) = e^{i 2\pi\theta}$, is a covering of $S^1$.

8.35 Example. Let $X \subset \mathbb{R}^3$ be the trace of the regular helix $\gamma(t) = (\cos t, \sin t, t)$. Then $p : X \to S^1$, where $p : \mathbb{R}^3 \to \mathbb{R}^2$, $p(x,y,z) := (x,y)$, is the orthogonal projection on $\mathbb{R}^2$, is another covering of $S^1$.

8.36 ¶. Let $p : X \to Y$ be a covering of $Y$. Suppose that $Y$ is connected and that for some point $y_0 \in Y$ the set $p^{-1}(y_0)$ is finite and contains $k$ points. Show that $p^{-1}(y)$ contains $k$ points for all $y \in Y$. In this case, we say that $p : X \to Y$ is a $k$-fold covering of $Y$.

8.37 ¶. Show that $p : \mathbb{R}_+ \to S^1$, $p(t) := e^{it}$, is not a covering of $S^1$.

8.38 ¶. Show that, if $p : \tilde X \to X$ and $q : \tilde Y \to Y$ are coverings, respectively, of $X$ and $Y$, then $p \times q : \tilde X \times \tilde Y \to X \times Y$, $p \times q(x,y) := (p(x), q(y))$, is a covering of $X \times Y$. In particular, if $p : \mathbb{R} \to S^1$ is defined by $p(t) := e^{i 2\pi t}$, then the map $p \times p : \mathbb{R} \times \mathbb{R} \to S^1 \times S^1$ is a covering of the torus $S^1 \times S^1$. Figure 8.10 shows the covering map for the standard torus of $\mathbb{R}^3$ that is homeomorphic to the torus $S^1 \times S^1 \subset \mathbb{R}^4$.

8.39 ¶. Think of $S^1$ as a subset of $\mathbb{C}$. Show that the map $p : S^1 \to S^1$, $p(z) = z^2$, is a two-fold covering of $S^1$. More generally, show that the map $S^1 \to S^1$ defined by $p(z) := z^n$ is an $|n|$-fold covering of $S^1$ if $n \in \mathbb{Z} \setminus \{0\}$.

8.40 ¶. Show that the map $p : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}_+ \times S^1$ defined by $p(s,\theta) = (s, e^{i\theta})$ is a covering of $\mathbb{R}_+ \times S^1$.

8.41 ¶. Show that the map $p$ on $\mathbb{R}_+ \times \mathbb{R}$ defined by $p(\rho,\theta) := \rho e^{i\theta}$ is a covering of $\mathbb{R}^2 \setminus \{0\}$.
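For the map $z \to z^n$ of Exercise 8.39 one can at least list the fibers explicitly. This is a small sketch, with `fiber` an illustrative helper name; it only exhibits the $|n|$ preimages of a point and does not replace the verification of the covering property asked for in the exercise.

```python
import cmath

def fiber(w, n):
    """All z on the unit circle with z**n == w, for |w| = 1 and a nonzero integer n."""
    theta = cmath.phase(w)
    m = abs(n)
    sign = 1 if n > 0 else -1
    # z = exp(i * sign * (theta + 2*pi*j) / m) satisfies z**n = exp(i * theta) = w.
    return [cmath.exp(1j * sign * (theta + 2 * cmath.pi * j) / m) for j in range(m)]

w = cmath.exp(0.7j)           # an arbitrary point of S^1
two_fold = fiber(w, 2)        # the two square roots of w
four_fold = fiber(w, -4)      # the four solutions of z**(-4) == w
```

The number of points in each fiber is $|n|$ and does not depend on `w`, in agreement with Exercise 8.36.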

b. Lifting of curves

In connection with coverings the notion of (continuous) lift is crucial.

8.42 Definition. Let $p : X \to Y$ be a covering of $Y$ and let $f : Z \to Y$ be a continuous map. A continuous map $\tilde f : Z \to X$ such that $p \circ \tilde f = f$ is called a lift of $f$ on $X$.


8.43 Example. Let $p : \mathbb{R} \to S^1$ be the covering of $S^1$ given by $p(t) = e^{it}$. A lift of $f : [0,1] \to S^1$ is a continuous map $h : [0,1] \to \mathbb{R}$ such that $f(t) = e^{i h(t)}$. Looking at $t$ as a time variable, $h(t)$ is the angular evolution of $f(t)$ as $f(t)$ moves on $S^1$.

8.44 Example. Not every function can be lifted. For instance, consider the covering $p : \mathbb{R} \to S^1$, $p(t) = e^{it}$. Then the identity map on $S^1$ cannot be lifted to a continuous map $h : S^1 \to \mathbb{R}$. In fact, parametrizing maps from $S^1$ as closed curves parametrized on $[0,2\pi]$, $h$ would be periodic. On the other hand, if $h$ was a lift of $z = e^{it}$, we would have $e^{it} = e^{i h(t)}$, which implies that $h(t) = t + \text{const}$, a contradiction.
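The angular evolution of Example 8.43 can be approximated by accumulating small angle increments along the curve. This is a minimal sketch, assuming the covering $p(t) = e^{it}$ and a sampling fine enough that consecutive points are never antipodal; `lift`, `identity_loop` and `wobble` are illustrative names.

```python
import cmath
import math

def lift(curve, h0, steps=1000):
    """Approximate lift h of curve: [0, 1] -> S^1 through p(t) = e^{it}.

    h0 must satisfy e^{i h0} = curve(0). Each step adds the angle between
    consecutive samples, taken in (-pi, pi].
    """
    h = [h0]
    prev = curve(0.0)
    for i in range(1, steps + 1):
        cur = curve(i / steps)
        h.append(h[-1] + cmath.phase(cur / prev))
        prev = cur
    return h

def identity_loop(s):
    """The identity of S^1, parametrized on [0, 1]."""
    return cmath.exp(2j * math.pi * s)

def wobble(s):
    """A closed curve on S^1 that does not wind around the circle."""
    return cmath.exp(1j * math.sin(2 * math.pi * s))

h_id = lift(identity_loop, 0.0)
h_wobble = lift(wobble, 0.0)
```

The lift of `identity_loop` ends $2\pi$ above where it starts, so it is not a closed curve in $\mathbb{R}$, which is the obstruction described in Example 8.44; the lift of `wobble` returns to its starting value.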

However, curves can be lifted to curves that are not necessarily closed. Let $X$ be a metric space. We say that $X$ is locally path-connected if every point $x \in X$ has an open path-connected neighborhood $U$.

8.45 Proposition. Let $p : X \to Y$ be a covering of $Y$ and let $x_0 \in X$. Suppose that $X$ and $Y$ are path-connected and locally path-connected. Then
(i) each curve $\beta : [0,1] \to Y$ with $\beta(0) = p(x_0)$ has a unique continuous lift $\alpha : [0,1] \to X$ such that $p \circ \alpha = \beta$ and $\alpha(0) = x_0$,
(ii) for every continuous map $k : [0,1] \times [0,1] \to Y$ with $k(0,0) = p(x_0)$, there is a unique continuous lift $h : [0,1] \times [0,1] \to X$ such that $h(0,0) = x_0$ and $p(h(t,s)) = k(t,s)$ for all $(t,s) \in [0,1] \times [0,1]$.

Proof. Step 1. Uniqueness in (i). Suppose that for the two curves $\alpha_1, \alpha_2$ we have $p(\alpha_1(t)) = p(\alpha_2(t))$ $\forall t \in [0,1]$ and $\alpha_1(0) = \alpha_2(0)$. The set $E := \{t \mid \alpha_1(t) = \alpha_2(t)\}$ is closed in $[0,1]$; since $p$ is a local homeomorphism, it is easily seen that $E$ is also open in $[0,1]$. Therefore $E = [0,1]$.

Step 2. Existence in (i). We consider the subset

$$E := \big\{ t \in [0,1] \ \big|\ \exists \text{ a continuous curve } \alpha_t : [0,t] \to X \text{ such that } \alpha_t(0) = x_0 \text{ and } p(\alpha_t(\theta)) = \beta(\theta)\ \forall \theta \in [0,t] \big\}$$

and shall prove that $E$ is open and closed in $[0,1]$; consequently, $E = [0,1]$ as it is not empty.

Let $\tau \in E$ and let $U$ be an open neighborhood of $\alpha_\tau(\tau)$ for which $p_{|U}$ is a homeomorphism. For $\sigma$ sufficiently small, $\sigma < \sigma_0$, the curve $s \to \gamma(s) := (p_{|U})^{-1}(\beta(s))$, $s \in [\tau, \tau+\sigma]$, is continuous, $\gamma(\tau) = \alpha_\tau(\tau)$ and $p(\gamma(s)) = \beta(s)$ $\forall s \in [\tau, \tau+\sigma]$. Therefore the curve $\alpha_{\tau+\sigma} : [0,\tau+\sigma] \to X$ defined by

$$\alpha_{\tau+\sigma}(s) := \begin{cases} \alpha_\tau(s) & \text{if } 0 \le s \le \tau, \\ \gamma(s) & \text{if } \tau < s \le \tau+\sigma \end{cases}$$

is a continuous lift of $\beta$ on $[0,\tau+\sigma]$; hence $\tau + \sigma \in E$ for all $0 < \sigma < \sigma_0$ if $\tau \in E$, or, in other words, $E$ is open in $[0,1]$.

We now prove that $E$ is closed by showing that $T := \sup E \in E$. Let $\{t_n\} \subset E$ be a nondecreasing sequence that converges to $T$ and, for every $n$, let $\alpha_n : [0,t_n] \to X$ be such that $p(\alpha_n(t)) = \beta(t)$ $\forall t \in [0,t_n]$. Because of the uniqueness, $\alpha_r(t) = \alpha_s(t)$ for all $t \in [0,r]$ if $r < s$; consequently a continuous curve $\alpha : [0,T[ \to X$ is defined so that $p(\alpha(t)) = \beta(t)$ $\forall t \in [0,T[$. It remains to show that we can extend $\alpha$ continuously at $T$. Let $V$ be an open neighborhood of $\beta(T)$ such that $p^{-1}(V) = \cup_j U_j$, where the $U_j$ are pairwise disjoint open sets that are homeomorphic to $V$. Then $\beta(t) \in V$ for $t < T$ [...] belong to a unique $U_j$, say $U_1$, for $t$ [...]

[...] $h_1, h_2 : [0,1] \times [0,1] \to X$ are such that $p \circ h_1 = p \circ h_2$ in $[0,1] \times [0,1]$ with $h_1(0,0) = h_2(0,0)$, from (i) we infer $h_1(P) = h_2(P)$, as $h_{1|\gamma_P} = h_{2|\gamma_P}$.

Let us prove existence. Again by (i) there is a curve $\alpha(t)$ with $\alpha(0) = x_0$ and $p(\alpha(t)) = k(0,t)$ for all $t$, and, for each $t$, a curve $s \to h(s,t)$ such that $p(h(s,t)) = k(s,t)$ with $h(0,t) = \alpha(t)$. Of course $h(0,0) = \alpha(0) = x_0$ and it remains to show that $h : [0,1] \times [0,1] \to X$ is continuous. Set $R_s := [0,s[ \times [0,1]$ and $R_0 := \{0\} \times [0,1]$. Suppose $h$ is not continuous and let $(\bar s, \bar t)$ be a point in the closure of the points of discontinuity of $h$. Let $U$ be an open and connected neighborhood of $h(\bar s, \bar t)$ such that $p_{|U}$ is a homeomorphism. By lifting $k$ through $(p_{|U})^{-1}$ we find a rectangle $P \subset \mathbb{R}^2$ that has $(\bar s, \bar t)$ as an interior point and a continuous function $w : P \to U$ with $w(\bar s, \bar t) = h(\bar s, \bar t)$ such that $p(w(s,t)) = k(s,t) = p(h(s,t))$ for all $(s,t) \in P$. Since $w$ and $h$ are continuous in $R_{\bar s}$, they agree in $R_{\bar s} \cap P$. On the other hand, both $h(s,t)$ and $w(s,t)$ lift the same function $k(s,t)$, thus by (i) they agree, hence $h(s,t) = w(s,t)$ is continuous in a neighborhood of $(\bar s, \bar t)$: a contradiction. $\square$

8.46 Proposition. Let $X$ and $Y$ be path-connected and locally path-connected metric spaces and let $p : X \to Y$ be a covering of $Y$. Let $\alpha, \beta : [0,1] \to Y$ be two curves with $\alpha(0) = \beta(0)$ and $\alpha(1) = \beta(1)$ that are homotopic with fixed endpoints and let $a, b : [0,1] \to X$ be their continuous lifts that start at the same point $a(0) = b(0)$. Then $a(1) = b(1)$, and $a$ and $b$ are homotopic with fixed endpoints.

Proof. From (i) of Proposition 8.45 we know that $\alpha, \beta$ can be lifted uniquely to two curves $a, b : [0,1] \to X$ with $a(0) = b(0) = a_0$, $p(a_0) = \alpha(0)$. Let $k : [0,1] \times [0,1] \to Y$ be a homotopy between $\alpha$ and $\beta$, i.e., $k(0,t) = \alpha(t)$, $k(1,t) = \beta(t)$, $k(s,0) = \alpha(0) = \beta(0)$, $k(s,1) = \alpha(1) = \beta(1)$. By (ii) of Proposition 8.45 we can lift $k$ to $h$, so that $p(h(s,t)) = k(s,t)$ and $h(0,0) = a(0) = b(0)$. Then $h$ is a homotopy between $a$ and $b$ and in particular $a(1) = h(0,1) = b(1)$. $\square$

8.47 Theorem. Let $X$ and $Y$ be path-connected and locally path-connected metric spaces and let $p : X \to Y$ be a covering of $Y$. If $Y$ is simply connected, then $p : X \to Y$ is a homeomorphism.

Proof. Suppose there are $x_1, x_2 \in X$ with $p(x_1) = p(x_2)$. Since $X$ is path-connected, there is a curve $a : [0,1] \to X$ with $a(0) = x_1$ and $a(1) = x_2$. Let $b : [0,1] \to X$ be the constant curve $b(t) = x_1$. The image curves $\alpha(t) := p(a(t))$ and $\beta(t) := p(b(t))$ are closed curves, hence homotopic, $Y$ being simply connected. Proposition 8.46 then yields $x_2 = a(1) = b(1) = x_1$. $\square$

8.48 Theorem. Let $X$ and $Y$ be path-connected and locally path-connected, and let $p : X \to Y$ be a covering of $Y$. Suppose that $Z$ is path-connected and simply connected. Then any continuous map $f : Z \to Y$ has a lift $\tilde f : Z \to X$. More precisely, given $z_0 \in Z$ and $x_0 \in X$ such that $p(x_0) = f(z_0)$, there exists a unique continuous map $\tilde f : Z \to X$ such that $\tilde f(z_0) = x_0$ and $p \circ \tilde f = f$.

Proof. Let $z \in Z$ and let $\gamma : [0,1] \to Z$ be a curve joining $z_0$ to $z$. Then the curve $\alpha(t) := f(\gamma(t))$, $t \in [0,1]$, in $Y$ has a lift to a curve $a : [0,1] \to X$ with $a(0) = x_0$, see (i) of Proposition 8.45, and Proposition 8.46 shows that $a(1)$ depends on $\alpha(1) = f(z)$ and does not depend on the particular curve $\gamma$. Thus we define $\tilde f(z) := a(1)$, and by definition $\tilde f(z_0) = x_0$ and $p \circ \tilde f = f$. We leave to the reader to check that $\tilde f$ is continuous. $\square$

c. Universal coverings and homotopy

8.49 Definition. Let $Y$ be a path-connected and locally path-connected metric space. A covering $p : X \to Y$ is said to be a universal covering of $Y$ if $X$ is path-connected, locally path-connected and simply connected.

From Theorems 8.47 and 8.48 we immediately infer

8.50 Theorem (Universal property). Let $X, Y, Z$ be path-connected and locally path-connected metric spaces. Let $p : X \to Y$, $q : Z \to Y$ be two coverings of $Y$ and suppose $Z$ simply connected. Then $q$ has a lift $\tilde q : Z \to X$ which is also a covering of $X$. Moreover $\tilde q$ is a homeomorphism if $X$ is simply connected, too.

The relevance of the universal covering space in computing the homotopy appears from the following.

8.51 Theorem. Let $X$ and $Y$ be path-connected and locally path-connected metric spaces and let $p : X \to Y$ be the universal covering of $Y$. Then for every $y_0 \in Y$ the sets $\pi_1(Y, y_0)$ and $p^{-1}(y_0) \subset X$ are in one-to-one correspondence.

Proof. Fix $q \in p^{-1}(y_0)$. For any closed curve $\alpha$ in $Y$ with base point $y_0$, denote by $a : [0,1] \to X$ its lift with $a(0) = q$. Clearly $a$ is a curve in $X$ which ends at $a(1) \in p^{-1}(y_0)$. Moreover, if $\beta$ is loop-homotopic to $\alpha$ in $Y$ and $b$ is its lift with $b(0) = q$, then necessarily $a(1) = b(1)$ by Proposition 8.46, so the map $\alpha \to a(1)$ is actually a map

$$\varphi_q : \pi_1(Y, y_0) \to p^{-1}(y_0).$$

Of course $\varphi_q$ is surjective since any curve in $X$ with endpoints in $p^{-1}(y_0)$ projects onto a closed loop in $Y$ with base point $y_0$. Moreover, if $\varphi_q([\gamma]) = \varphi_q([\delta])$, then the lifts $c$ and $d$ of $\gamma$ and $\delta$ that start at $q$ end at the same point; consequently $c$ and $d$ are homotopic with fixed endpoints, as $X$ is simply connected. Projecting the homotopy between $c$ and $d$ onto $Y$ yields $[\gamma] = [\delta]$. □

d. A global invertibility result

Existence of a universal covering $p : X \to Y$ of a space $Y$ can be proved in the setting of topological spaces. Observe that if $X$ and $Y$ are path-connected and locally path-connected, and if $p : X \to Y$ is a universal covering of $Y$, then $Y$ is locally simply connected, i.e., for every $y \in Y$ there exists an open set $V \subset Y$ containing $y$ such that every loop in $V$ with base point at $y$ is homotopic (in $Y$) to the constant loop $y$. It can be proved in the context of topological spaces that any path-connected, locally path-connected and locally simply connected $Y$ has a universal covering $p : X \to Y$. We do not deal with such a general problem and confine ourselves to discussing whether a given continuous map $f : X \to Y$ is a covering of $Y$.


Let $X, Y$ be metric spaces. A continuous map $f : X \to Y$ is a local homeomorphism if every $x \in X$ has an open neighborhood $U$ such that $f|_U$ is a homeomorphism onto its image. We say that $f$ is a proper map if $f^{-1}(K)$ is compact in $X$ for every compact $K \subset Y$. Clearly a homeomorphism from $X$ onto its image $f(X) \subset Y$ is a local homeomorphism and a proper map. Also, if $p : X \to Y$ is a covering of $Y$ then $p$ is a local homeomorphism. We have

8.52 Theorem. Let $X$ be path-connected and locally path-connected and let $f : X \to Y$ be a local homeomorphism and a proper map. Then $X$ and $f(X)$ are open, path-connected and locally path-connected and $f : X \to f(X)$ is a covering of $f(X)$.

Before proving Theorem 8.52, let us introduce the Banach indicatrix of $f : X \to Y$ as the map

$$N_f : Y \to \mathbb{N} \cup \{\infty\}, \qquad N_f(y) := \#\{x \in X \mid f(x) = y\}.$$

Evidently $f(X) = \{y \mid N_f(y) \ge 1\}$ and $f$ is injective iff $N_f(y) \le 1$ $\forall y$.

8.53 Lemma. Let $f : X \to Y$ be a local homeomorphism and a proper map. Then $N_f$ is bounded and locally constant on $f(X)$.

Proof. Since $f$ is a local homeomorphism, the set $f^{-1}(y) = \{x \in X \mid f(x) = y\}$ is discrete and in fact $f^{-1}(y)$ is finite, since $f$ is proper. Let $N_f(y) = k$ and $f^{-1}(y) = \{x_1, \dots, x_k\}$. Since $f$ is a local homeomorphism, we can find open disjoint neighborhoods $U_1$ of $x_1$, ..., $U_k$ of $x_k$ and an open neighborhood $V$ of $y$ such that $f|_{U_j} : U_j \to V$ are homeomorphisms. In particular, for every $\bar y \in V$ there is a unique $x_j \in U_j$ such that $f(x_j) = \bar y$. It follows that $N_f(\bar y) \ge k$ $\forall \bar y \in V$.

We now show that there exists a neighborhood $W$ of $y$, $W \subset V$, such that $N_f(\bar y) \le k$ holds for all $\bar y \in W$. Suppose, in fact, that there is no neighborhood $W$ such that $N_f(\bar y) \le k$ for $\bar y \in W$; then there is a sequence $\{y_i\} \subset V$, $y_i \to y$, with $N_f(y_i) > k$, and points $\xi_i \notin U_1 \cup \dots \cup U_k$ with $f(\xi_i) = y_i$. The set $f^{-1}(\{y_i\} \cup \{y\})$ is compact since $f$ is proper, thus, possibly passing to a subsequence, $\{\xi_i\}$ converges to a point $\xi$ and necessarily $\xi \notin U_1 \cup \dots \cup U_k$; passing to the limit we also find $f(\xi) = y$: a contradiction since $\xi$ is different from $x_1, \dots, x_k$. □

Proof of Theorem 8.52. From Lemma 8.53 we know that, for every $y \in f(X)$, $f^{-1}(y)$ contains finitely many points $\{x_1, x_2, \dots, x_N\}$ where $N$ is locally constant. If $U_i \ni x_i$ and $V_i \ni y$, $i = 1, \dots, N$, are open and homeomorphic sets, we then set

$$V := \bigcap_{i=1}^N V_i \cap \{\bar y \in Y \mid N_f(\bar y) = N\}, \qquad W_i := (f|_{U_i})^{-1}(V).$$

Clearly $V$ is open and $f^{-1}(V)$ is a finite union of disjoint open sets $W_i$ that are homeomorphic to $V$. □

As a consequence of Theorem 8.47 we then infer the following useful global invertibility theorem.

8.54 Theorem. Let $X$ be path-connected and locally path-connected, and let $f : X \to Y$ be a local homeomorphism that is proper. If $f(X)$ is simply connected, then $f$ is injective, hence a homeomorphism between $X$ and $f(X)$.

Proof. $f : X \to f(X)$ is a covering by Theorem 8.52. Theorem 8.47 then yields that $f$ is one-to-one, hence a homeomorphism of $X$ onto $f(X)$. □

8.1.4 A few examples

a. The fundamental group of $S^1$

The map $p : \mathbb{R} \to S^1$, $p(t) = e^{i2\pi t}$, is a universal covering of $S^1$. Therefore for any $x_0 \in S^1$, $p^{-1}(x_0) = \mathbb{Z}$ as sets. Therefore, see Theorem 8.51, one can construct an injective and surjective map

$$\varphi_{x_0} : \pi_1(S^1, x_0) \to \mathbb{Z}$$

that maps $[\alpha]$ to the end value $a(1) \in \mathbb{Z}$ of the lift $a$ of $\alpha$ with $a(0) = 0$. We have

8.55 Lemma. $\varphi_{x_0} : \pi_1(S^1, x_0) \to \mathbb{Z}$ is a group isomorphism.

Proof. Let $\alpha, \beta$ be two loops in $S^1$ with base point $x_0$ and $a, b$ the liftings with $a(0) = b(0) = 0$. If $n := \varphi_{x_0}([\alpha])$ and $m := \varphi_{x_0}([\beta])$, we define $c : [0,1] \to \mathbb{R}$ by

$$c(s) := \begin{cases} a(2s) & s \in [0, 1/2], \\ n + b(2s - 1) & s \in [1/2, 1]. \end{cases}$$

It is not difficult to check that $c$ is the lift of $\alpha * \beta$ with $c(0) = 0$ so that

$$\varphi_{x_0}([\alpha] * [\beta]) = \varphi_{x_0}([\alpha * \beta]) = c(1) = n + m = \varphi_{x_0}([\alpha]) + \varphi_{x_0}([\beta]).$$ □

Since $\varphi_{x_0}$ is a group isomorphism and $\mathbb{Z}$ is commutative, $\pi_1(S^1, x_0)$ is commutative, and there is an injective and surjective map $h : [S^1, S^1] \to \pi_1(S^1, x_0)$, see Proposition 8.27. The composition map

$$\deg : [S^1, S^1] \to \mathbb{Z}, \qquad \deg(\gamma) := \varphi_{x_0}(h([\gamma]))$$

is called the degree on $S^1$, and by construction we have the following.

8.56 Theorem. Two maps $f, g : S^1 \to S^1$ have the same degree if and only if they are homotopic.

Later we shall see that we can recover the degree mapping more directly.

8.57 ¶. Show that the fundamental group of $\mathbb{R}^2 \setminus \{0\}$ is $\mathbb{Z}$.

Figure 8.11. A figure eight.

b. The fundamental group of the figure eight

The figure eight is the union of two circles $A$ and $B$ with a point $x_0$ in common. If $a$ is a loop based at $x_0$ that goes clockwise once around $A$, and $a^{-1}$ is the loop that goes counterclockwise once around $A$, and similarly for $b$, $b^{-1}$, then the cycle $aba^{-1}b^{-1}$ is a loop that cannot be unknotted in $A \cup B$ while $aa^{-1}bb^{-1}$ can. More precisely, one shows that the fundamental group of the figure eight is the noncommutative free group on the generators $a$ and $b$. Indeed, this can be proved using the following special form of the so-called Seifert-Van Kampen theorem.

8.58 Theorem. Suppose $X = U \cup V$, where $U, V$ are open path-connected sets and $U \cap V$ is path-connected and simply connected. Then for any $x_0 \in U \cap V$, $\pi_1(X, x_0)$ is the free product of $\pi_1(U, x_0)$ and $\pi_1(V, x_0)$.

8.59 ¶. Show that the fundamental group of $\mathbb{R}^2 \setminus \{x_0, x_1\}$ is isomorphic to the fundamental group of the figure eight.

8.60 ¶. Show that $\pi_1(X \times Y, (x_0, y_0))$ is isomorphic to $\pi_1(X, x_0) \times \pi_1(Y, y_0)$, in particular the fundamental group of the torus $S^1 \times S^1$ is $\mathbb{Z} \times \mathbb{Z}$.

8.61 ¶. Let $X = A_1 \cup A_2 \cup \dots \cup A_n$ where each $A_i$ is homeomorphic to $S^1$, and $A_i \cap A_j = \{x_0\}$ if $i \ne j$. Show that $\pi_1(X, x_0)$ is the free group on $n$ generators $a_1, \dots, a_n$ where $a_i$ is represented by a path that goes around $A_i$ once.

8.62 ¶. Let $X$ be the space obtained by removing $n$ points of $\mathbb{R}^2$. Show that $\pi_1(X, x_0)$ is a free group on $n$ generators $a_1, \dots, a_n$, where $a_i$ is represented by a closed path which goes around the $i$th hole once.

c. The fundamental group of $S^n$, $n \ge 2$

The following result is also a consequence of Theorem 8.58.

8.63 Theorem. Let $X = U \cup V$ where $U$ and $V$ are simply connected open sets of $X$ and $U \cap V$ is path-connected. Then $X$ is simply connected, i.e., $\pi_1(X, x_0) = 0$.

As a consequence we have the following.

8.64 Proposition. The sphere $S^n \subset \mathbb{R}^{n+1}$ is simply connected, i.e., $\pi_1(S^n, x_0) = 0$ if $n \ge 2$.

Proof. Let $p_S$ and $p_N$ be, respectively, the south pole and the north pole of the sphere. The stereographic projection from the south (north) pole establishes a homeomorphism between $S^n \setminus \{p_S\}$ (respectively, $S^n \setminus \{p_N\}$) and $\mathbb{R}^n$. Thus $\pi_1(S^n \setminus \{p_S\}, x_0) = \pi_1(S^n \setminus \{p_N\}, x_0) = 0$ $\forall x_0 \ne p_S, p_N$. By Theorem 8.63 it suffices to show that $S^n \setminus \{p_S, p_N\}$ is path-connected. For that we notice that the stereographic projection is a homeomorphism between $S^n \setminus \{p_S, p_N\}$ and $\mathbb{R}^n \setminus \{0\}$ which in turn is path-connected if $n \ge 2$. □

Since the fundamental groups of $\mathbb{R}^{n+1} \setminus \{0\}$ and $S^n$ are isomorphic, see Theorem 8.31, equivalently we can state

8.65 Proposition. $\mathbb{R}^{n+1} \setminus \{0\}$ is simply connected if $n \ge 2$.

8.66 ¶. Show that $\mathbb{R}^n$, $n > 2$, and $\mathbb{R}^2$ are not homeomorphic.

8.1.5 Brouwer's degree

a. The degree of maps $S^1 \to S^1$

A more analytic presentation of the mapping degree for maps $S^1 \to S^1$ is the following. Think of $S^1$ as the unit circle in the complex plane, so that the rotations of $S^1$ write as complex multiplication, and represent loops in $S^1$ as maps $f : S^1 \to S^1$ or by $2\pi$-periodic functions $\theta \to f(e^{i\theta})$.

8.67 Lemma. Let $f : S^1 \to S^1$ be continuous. There exists a unique continuous function $h : \mathbb{R} \to \mathbb{R}$ such that

$$\begin{cases} f(e^{i\theta}) = f(1)\, e^{ih(\theta)}, \\ h(0) = 0. \end{cases} \tag{8.3}$$

Proof. Consider the covering $p : \mathbb{R} \to S^1$ of $S^1$ given by $p(t) := e^{it}$. The loop $g(z) := f(z)/f(1)$ has base point $1 \in S^1$. Then by the lifting argument, Proposition 8.45, there exists a lift $h : \mathbb{R} \to \mathbb{R}$ such that (8.3) holds. The uniqueness follows directly from (8.3). In fact, if $h_1, h_2$ verify (8.3), then $h_1(\theta) - h_2(\theta) = k(\theta) 2\pi$ where $k(\theta) \in \mathbb{Z}$. As $h_1$ and $h_2$ are continuous, $k(\theta)$ is constant, hence $k(\theta) = k(0) = 0$. □

Let $f : S^1 \to S^1$ be continuous and let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3). Of course, for every $\theta$ we have

$$h(\theta + 2\pi) - h(\theta) = 2k(\theta)\pi$$

for some integer $k(\theta) \in \mathbb{Z}$. Since $h$ is continuous, $k$ is continuous, hence constant. Observe that $k = \frac{h(2\pi) - h(0)}{2\pi} = \frac{h(2\pi)}{2\pi}$ and $k$ is independent of the initial point $f(1)$. In particular, $f : S^1 \to S^1$ and $f/f(1) : S^1 \to S^1$ have the same degree.
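The construction of the lift $h$ in (8.3) can be imitated numerically. The following is a minimal sketch, not the book's method: it samples $f$ on a fine subdivision of the circle and accumulates the phase increments between consecutive samples, assuming the subdivision is fine enough that consecutive values of $f$ are less than half a turn apart. The names `lift_increments`, `degree` and the parameter `samples` are illustrative choices.

```python
import cmath
import math


def lift_increments(f, samples=4096):
    """Sampled values h(theta_j), theta_j = 2*pi*j/samples, of the lift in (8.3).

    Assumes f maps the unit circle into nonzero complex numbers and that
    consecutive samples of f differ by less than half a turn.
    """
    h = [0.0]  # h(0) = 0 as required by (8.3)
    prev = f(1 + 0j)
    for j in range(1, samples + 1):
        theta = 2 * math.pi * j / samples
        cur = f(cmath.exp(1j * theta))
        # phase(cur / prev) lies in (-pi, pi]; for small steps it equals
        # the true increment h(theta_j) - h(theta_{j-1}).
        h.append(h[-1] + cmath.phase(cur / prev))
        prev = cur
    return h


def degree(f, samples=4096):
    """Integer k with h(2*pi) - h(0) = 2*pi*k, computed from the sampled lift."""
    h = lift_increments(f, samples)
    return round(h[-1] / (2 * math.pi))


if __name__ == "__main__":
    print(degree(lambda z: z ** 3))
    print(degree(lambda z: 1j * z))
```

Multiplying $f$ by a constant of modulus one leaves every ratio `cur / prev` unchanged, which mirrors the remark that $f$ and $f/f(1)$ have the same degree.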


8.68 Definition. Let $f : S^1 \to S^1$ and let $h$ be as in (8.3). There is a unique integer $d \in \mathbb{Z}$ such that

$$h(\theta + 2\pi) - h(\theta) = 2\pi d \qquad \forall \theta \in \mathbb{R}.$$

The number $d$ is called the winding number, or degree, of the map $f : S^1 \to S^1$, and it is denoted by $\deg(f)$.

8.69 Theorem. Two continuous maps $f_0, f_1 : S^1 \to S^1$ have the same degree if and only if they are homotopic.

Proof. Let $f : S^1 \to S^1$. We have already observed that $f(z)$ and $f(z)/f(1)$ have the same degree. On the other hand, $f(z)$ and $f(z)/f(1)$ are also trivially homotopic. To prove the theorem it is therefore enough to consider maps $f_0, f_1$ with the same base point, say $f_0(1) = f_1(1) = 1$.

(i) Assume $f_0, f_1$ are homotopic with base point $1 \in S^1$. By the lifting argument, the liftings $h_0, h_1$ of $f_0, f_1$ characterized by (8.3) have $h_0(2\pi) = h_1(2\pi)$, hence

$$\deg(f_0) = \frac{h_0(2\pi) - h_0(0)}{2\pi} = \frac{h_0(2\pi)}{2\pi} = \frac{h_1(2\pi)}{2\pi} = \frac{h_1(2\pi) - h_1(0)}{2\pi} = \deg(f_1).$$

(ii) Conversely, let $f : S^1 \to S^1$ be of degree $d$ with $f(1) = 1$ and let $h$ be given by (8.3). Then the map $k : [0,1] \times S^1 \to S^1$ defined by

$$k(t, \theta) := \exp\big(i\,(t\,h(\theta) + d\,(1 - t)\,\theta)\big)$$

establishes a homotopy of $f$ to the map $\varphi : S^1 \to S^1$, $\varphi(z) = z^d$. Therefore, if $f_0$ and $f_1$ have the same degree $d$ and base point $1 \in S^1$, then they are both homotopic to the same map $\varphi(z) = z^d$. □

Finally we observe that $\deg(z^d) = d$ $\forall d \in \mathbb{Z}$ and that, if $f$ and $g$ have the same base point, $\deg(g * f) = \deg(g) + \deg(f)$.

b. An integral formula for the degree

Let $f : S^1 \to S^1$ and let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3). Clearly, thinking of $\theta$ as a time variable, $h(\theta)$ is the angle evolution of the point $f(e^{i\theta})$ on the circle. The degree of $f$ corresponds to the total angle evolution, that is, to the number of revolutions that $f(z)$ does as $z$ goes around $S^1$ once counterclockwise, counting the revolutions positively if $f(z)$ goes counterclockwise and negatively if $f(z)$ goes clockwise.

Suppose $f : S^1 \to S^1$ is a loop of class $C^1$, that is, $\theta \to f(e^{i\theta})$ is of class $C^1$, and let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3). Differentiating (8.3) we get

$$i e^{i\theta} f'(e^{i\theta}) = i f(1) e^{ih(\theta)} h'(\theta) = i f(e^{i\theta}) h'(\theta)$$

and, taking the modulus, $|h'(\theta)| = |f'(e^{i\theta})|$. Therefore, $h'$ is the angular velocity of $f(z)$ times $\pm 1$ depending on the direction of motion of $f(z)$ when $z$ moves as $e^{i\theta}$ on the unit circle. In coordinates, writing $f := f_1 + i f_2$, we have $f' = f_1' + i f_2'$, hence

$$h'(\theta) = i e^{i\theta}\big(-f_2(e^{i\theta}) f_1'(e^{i\theta}) + f_1(e^{i\theta}) f_2'(e^{i\theta})\big).$$

We conclude using the fundamental theorem of calculus


Figure 8.12. Counting the degree.

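8.70 Proposition (Integral formula for the degree). Let $f : S^1 \to S^1$ be of class $C^1$. Then the lift $h$ of $f$ in (8.3) is given by

$$h(t) = \int_0^t h'(\theta)\, d\theta = \int_0^t i e^{i\theta}\big(-f_2(e^{i\theta}) f_1'(e^{i\theta}) + f_1(e^{i\theta}) f_2'(e^{i\theta})\big)\, d\theta. \tag{8.4}$$

In particular

$$\deg(f) = \frac{1}{2\pi} \int_0^{2\pi} h'(\theta)\, d\theta = \frac{1}{2\pi} \int_0^{2\pi} i e^{i\theta}\big(-f_2(e^{i\theta}) f_1'(e^{i\theta}) + f_1(e^{i\theta}) f_2'(e^{i\theta})\big)\, d\theta. \tag{8.5}$$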
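Formula (8.5) lends itself to a quick numerical check. The sketch below is only an illustration under stated assumptions: it works with $F(\theta) := f(e^{i\theta}) = F_1 + iF_2$, for which (8.3) gives $h'(\theta) = F_1 F_2' - F_2 F_1'$, approximates $F'$ by a central difference, and integrates with an equally spaced Riemann sum. The function name `degree_by_integral` and the parameters `samples` and `eps` are hypothetical choices, not part of the text.

```python
import cmath
import math


def degree_by_integral(f, samples=2000, eps=1e-6):
    """Approximate (1 / 2*pi) * integral over [0, 2*pi] of h'(theta).

    h'(theta) is computed as F1*F2' - F2*F1' with F(theta) = f(e^{i theta}),
    and F' is replaced by a central finite difference of step eps.
    """
    total = 0.0
    for j in range(samples):
        theta = 2 * math.pi * j / samples
        F = f(cmath.exp(1j * theta))
        dF = (f(cmath.exp(1j * (theta + eps)))
              - f(cmath.exp(1j * (theta - eps)))) / (2 * eps)
        total += F.real * dF.imag - F.imag * dF.real
    # Riemann sum: (2*pi / samples) * total, then divide by 2*pi.
    return total / samples


if __name__ == "__main__":
    print(degree_by_integral(lambda z: z ** 3))
    print(degree_by_integral(lambda z: z.conjugate()))
```

The result is a floating point number close to an integer; rounding it recovers the degree for smooth maps sampled finely enough.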

One can define the lifting and degree of smooth maps by (8.5), showing the homotopy invariance in the context of regular maps, and then extending the theory to continuous functions by an approximation procedure.

c. Degree and inverse image

The degree of $f : S^1 \to S^1$ is strongly related to the number of roots of the equation $f(x) = y$ counted with a suitable sign.

8.71 Proposition. Let $f : S^1 \to S^1$ be a continuous map with degree $d \in \mathbb{Z}$. For every $y \in S^1$, there exist at least $|d|$ points $x_1, x_2, \dots, x_{|d|}$ in $S^1$ such that $f(x_i) = y$, $i = 1, \dots, |d|$. Furthermore, if $f : S^1 \to S^1$ goes around $S^1$ never turning back, i.e., if $f(e^{i\theta}) = e^{ih(\theta)}$ where $h : [0, 2\pi] \to \mathbb{R}$ is strictly monotone, then the equation $f(x) = y$, $y \in S^1$, has exactly $|d|$ solutions.

Proof. Let $h : \mathbb{R} \to \mathbb{R}$ be as in (8.3) so that $h(2\pi) = 2\pi d$, and let $s \in [0, 2\pi[$ be such that $e^{is} = y$; here we assume $f(1) = 1$, the general case following by replacing $y$ with $y/f(1)$. For convenience suppose $d > 0$. The intermediate value theorem yields $d$ distinct points $\theta_1, \theta_2, \dots, \theta_d$ in $[0, 2\pi[$ such that $h(\theta_1) = s$, $h(\theta_2) = s + 2\pi$, ..., $h(\theta_d) = s + 2(d-1)\pi$, hence at least $d$ distinct points $x_j := e^{i\theta_j}$, $j = 1, \dots, d$, such that $f(x_j) = f(e^{i\theta_j}) = e^{ih(\theta_j)} = e^{is} = y$, see Figure 8.12. There are of course exactly $d$ points $x_1, x_2, \dots, x_d$ if $h$ is strictly monotone. □


With the previous notation, suppose $f : S^1 \to S^1$ is of class $C^1$, let $h$ be as in (8.3) and let $y \in S^1$ and $s \in [0, 2\pi[$ be such that $y = e^{is}$. Assume that $y$ is chosen so that the equation $f(x) = y$ has a finite number of solutions and set

$$N_+(f, y) := \#\{\theta \in S^1 \mid h(\theta) = s \pmod{2\pi},\ h'(\theta) > 0\},$$

$$N_-(f, y) := \#\{\theta \in S^1 \mid h(\theta) = s \pmod{2\pi},\ h'(\theta) < 0\}.$$

Then one sees that

$$\deg(f) = N_+(f, y) - N_-(f, y), \tag{8.6}$$

see Figure 8.12.

8.72 The fundamental theorem of algebra. Using the degree theory we can easily prove that every complex polynomial

$$P(z) := z^m + a_1 z^{m-1} + \dots + a_{m-1} z + a_0, \qquad m \ge 1,$$

has at least a complex root. Set $S_\rho := \{z \mid |z| = \rho\}$. For $\rho$ sufficiently large, $P$ maps $S_\rho$ in $\mathbb{R}^2 \setminus \{0\}$. Also $\deg(P/|P|) = m$. In fact, by considering the homotopy

$$P_t(z) := z^m + t\,(a_1 z^{m-1} + \dots + a_0), \qquad t \in [0,1],$$

of $P(z)$ to $z^m$, we have

$$|P_t(z)| \ge |z|^m \Big(1 - t\Big(\frac{|a_1|}{|z|} + \dots + \frac{|a_0|}{|z|^m}\Big)\Big) \qquad \forall z \ne 0.$$

Thus $|P_t(z)| > 0$ $\forall t \in [0,1]$ provided $|z|$ is large enough, consequently $P_t(z)/|P_t(z)|$, $t \in [0,1]$, $z \in S_\rho$, establishes a homotopy of $P/|P|$ to $z^m$ from $S_\rho$ into $S^1$, and we conclude that $\deg(P/|P|) = \deg(z^m) = m$. On the other hand, if $P$ had no root, then $(t, z) \to P(tz)/|P(tz)|$, $t \in [0,1]$, $z \in S_\rho$, would be a homotopy of $P/|P|$ to a constant map, forcing $m = 0$ by Theorem 8.69.

8.73 ¶. Show that $f : S^1 \to S^1$ has at least $d - 1$ fixed points if $\deg f = d$. [Hint: See Figure 8.12.]
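The degree count in 8.72 can be observed numerically. The following sketch is an illustration only: the cubic `P` is an arbitrary example chosen here, and `winding_number` is a hypothetical helper that accumulates phase increments of $P$ along the circle $|z| = \rho$, assuming $P$ has no zero on that circle and that the sampling is fine enough.

```python
import cmath
import math


def winding_number(g, radius, samples=8192):
    """Number of turns of g(z) around 0 as z runs once around |z| = radius."""
    total = 0.0
    prev = g(complex(radius, 0.0))
    for k in range(1, samples + 1):
        z = radius * cmath.exp(1j * 2 * math.pi * k / samples)
        cur = g(z)
        # Each increment is the phase of the ratio, taken in (-pi, pi].
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))


def P(z):
    # Example polynomial of degree m = 3 (an arbitrary choice).
    return z ** 3 + 2 * z - 5


if __name__ == "__main__":
    print(winding_number(P, 10.0))  # large circle: the term z**3 dominates
    print(winding_number(P, 0.1))   # small circle: P stays close to P(0) = -5
```

On the large circle the count agrees with $m$, while on a small circle around the origin it vanishes because $P$ stays near $P(0) \ne 0$; the jump between the two values is only possible if $P$ vanishes somewhere in between.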

d. The homological definition of degree for maps $S^1 \to S^1$

Let $f : S^1 \to \Sigma^1$ be a continuous map, where for convenience we have denoted the target space $S^1$ by $\Sigma^1$. We fix in $S^1$ and $\Sigma^1$ two orientations, for instance the counterclockwise orientation, and we divide $S^1$ in small arcs whose images by $f$ do not contain antipodal points (this is possible since $f : S^1 \to \Sigma^1$ is uniformly continuous) and let $z_1, \dots, z_n, z_{n+1} = z_1 \in S^1$ be the points of such a subdivision indexed according to the chosen orientation in $S^1$. For each $i = 1, \dots, n$ we denote by $\alpha_i$ the minimal arc connecting $f(z_i)$ with $f(z_{i+1})$. We give it the positive sign if $f(z_i)$ precedes $f(z_{i+1})$ with respect to the chosen orientation of $\Sigma^1$, negative otherwise. Finally, for $\zeta \in \Sigma^1$, $\zeta \ne f(z_i)$ $\forall i$, we denote by $p(\zeta)$ and $n(\zeta)$ the number of arcs $\alpha_i$, respectively positive and negative, that contain $\zeta$. Then

$$p(\zeta) - n(\zeta) = \deg(f) \in \mathbb{Z}$$

as we can see looking at the lift of $f$.


Figure 8.13. Frontispiece of lecture notes by Louis Nirenberg and a page from a paper by L. E. Brouwer (1881-1966).

8.2 Some Results on the Topology of $\mathbb{R}^n$

Though the presentation of these topics would require more space and advanced techniques, and in any case would lead us away from the main path, we think that it is worthwhile to present here some results that are relevant in the sequel. However, we shall confine ourselves to illustrating the ideas and refer to the literature for complete proofs and more details. In the next two paragraphs we collect a few relevant results on the topology of maps into $S^n$ that we freely use in the rest of this section.

8.2.1 Brouwer's theorem

a. Brouwer's degree

A topological degree, called Brouwer's degree, can be defined for continuous maps $f : S^n \to S^n$, $n \ge 2$, either by extending the homological type arguments in the case $n = 1$ or, more generally, in terms of homology groups or, analytically, in terms of a sum with sign of the numbers of inverse images of a point, either pointwise or in the mean. Intuitively, one counts how many times the target $S^n$ is covered algebraically by the source $S^n$ via the map $f$. Both approaches require the development of more advanced and relevant techniques; we refer the reader e.g., to

o J. Dugundji, Topology, Allyn and Bacon, 1966,
o L. Nirenberg, Topics in Nonlinear Functional Analysis, AMS-CIMS, New York, 2001,
o one of the several books on degree theory.

In this way we end up with a map

$$\deg : C^0(S^n, S^n) \to \mathbb{Z}$$

such that

(i) $\deg(\mathrm{Id}) = 1$,
(ii) $\deg(f) = 0$ if $f$ is constant,
(iii) $\deg(f) = (-1)^{n+1}$ if $f(x) = -x$,

and we have the following.

8.74 Theorem (Brouwer). Let $f_0, f_1 : S^n \to S^n$ be continuous and homotopic. Then $\deg(f_0) = \deg(f_1)$.

Indeed the degree completely characterizes the homotopy classes of continuous maps from $S^n$ into $S^n$. In fact, we have the following.

8.75 Theorem (Hopf). Two continuous maps of $S^n$ into itself are homotopic if and only if they have the same degree. Moreover, for each $d \in \mathbb{Z}$ there is a map $f : S^n \to S^n$ with $\deg(f) = d$.

A map $f : S^n \to S^n \subset \mathbb{R}^{n+1}$ is called antipodal if $f(-x) = -f(x)$ $\forall x \in S^n$. For instance, $\mathrm{Id} : S^n \to S^n$ and $-\mathrm{Id} : S^n \to S^n$ are antipodal.

8.76 Theorem (Borsuk antipodal theorem). Let $f : S^n \to S^n$ be a continuous antipodal map. Then $\deg(f)$ is odd; in particular $f$ is not homotopic to a constant map.

b. Extension of maps into $S^n$

The following two extension theorems for maps into $S^n$ are also crucial. We refer the reader e.g., to J. Dugundji, Topology, Allyn and Bacon, 1966.

8.77 Theorem. We have the following.
(i) Let $A \subset S^n$ be a closed set. Every continuous map $f : A \to S^n$ extends to a continuous map $F : S^n \to S^n$.
(ii) Let $A \subset S^{n+1}$ be closed and $f : A \to S^n$ be continuous. Pick a point $p_i \in U_i$ in every bounded connected component $U_i$ of $A^c := S^{n+1} \setminus A$. Then there is a continuous extension $F : S^{n+1} \setminus \cup_i \{p_i\} \to S^n$.

8.78 Theorem (Borsuk). Let $A \subset \mathbb{R}^k$, $k \ge 1$, be closed and let $f : A \to S^n$ be continuous. Then $f$ can be extended to a continuous map $F : \mathbb{R}^k \to S^n$ if and only if $f$ is homotopic to a constant map.

Observing that $A^c$ has a unique unbounded connected component if $A$ is a compact subset of $\mathbb{R}^{n+1}$, and using the stereographic projection, it is not difficult to infer from Theorem 8.77 (i), (ii) the following.

8.79 Theorem. We have the following.
(i) Let $A \subset \mathbb{R}^n$ be compact. Then any continuous function from $A$ into $S^n$ can be extended to a continuous $F : \mathbb{R}^n \to S^n$.
(ii) Let $A \subset \mathbb{R}^{n+1}$ be compact and $f : A \to S^n$ be continuous. Pick a point $p_i \in U_i$ in every connected component $U_i$ of $A^c$. Then $f$ can be extended to a continuous map $F : \mathbb{R}^{n+1} \setminus \cup_i \{p_i\} \to S^n$.

As a consequence of the Hopf theorem and Proposition 8.8 we immediately infer the following.

8.80 Proposition. A continuous map $f : S^n \to S^n$ has a continuous extension $F : \mathrm{cl}(B^{n+1}) \to S^n$ if and only if $\deg(f) = 0$.

8.81 Corollary. Let $f : S^n \to \mathbb{R}^{n+1} \setminus \{0\}$ be a continuous map. Then there exists a continuous extension $F : \mathrm{cl}(B^{n+1}) \to \mathbb{R}^{n+1} \setminus \{0\}$ of $f$ if and only if $\deg(f/|f|) = 0$.

c. Brouwer's fixed point theorem

Since the identity from $S^n$ into $S^n$ has degree one, and the constant maps have degree zero, from the homotopic invariance of the degree we conclude the following.

8.82 Theorem (Brouwer). The identity map $\mathrm{Id} : S^n \to S^n$ is not homotopic to a constant map.

In other words, we cannot peel an orange without piercing the peel. Brouwer's theorem, whose content is quite intuitive, at least in dimension $n = 2$, has several interesting and surprising consequences. In fact, we have the following.

8.83 Theorem. The following claims are equivalent.
(i) (BROUWER'S THEOREM) The identity map $\mathrm{Id} : S^n \to S^n$ is not homotopic to a constant map.
(ii) There is no continuous map $F : B \to S^n$, $B := \mathrm{cl}(B^{n+1})$, such that $F(x) = x$ $\forall x \in S^n$, that is, $S^n$ is not a retract of $B$.
(iii) (BROUWER'S FIXED POINT THEOREM, I) Every continuous map $f : B \to B$, $B := \mathrm{cl}(B^{n+1})$, has a fixed point, i.e., there is at least one $x \in B$ such that $f(x) = x$.

Proof. (i) $\Rightarrow$ (ii) If $F : B \to S^n$ is a continuous function with $F(x) = x$ $\forall x \in S^n$, then $H(t, x) := F(tx)$, $(t, x) \in [0,1] \times S^n$, is a homotopy of the identity to the constant map $F(0)$. A contradiction.

(ii) $\Rightarrow$ (iii) Suppose that there is a continuous $F : B \to B$ such that $F(x) \ne x$ for all $x \in B$. Then, and we leave this to the reader, the map $G : B \to S^n$ that maps $x \in B$ into the unique point of $S^n$ on the half-line from $F(x)$ to $x$ would be a continuous map from $B$ into $S^n$ with $G(x) = x$ $\forall x \in S^n$, contradicting (ii).


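(iii) $\Rightarrow$ (i) Suppose that there is a homotopy $H : [0,1] \times S^n \to S^n$ between the identity and a constant map, $H(1, x) = x$, $H(0, x) = p \in S^n$. Then the function $F : B \to S^n$ defined by

$$F(x) := \begin{cases} H\big(|x|, \frac{x}{|x|}\big) & \text{if } x \ne 0, \\ p & \text{if } x = 0, \end{cases}$$

would be a continuous extension of the identity on $S^n$ to $B$, hence $-F(x) : B \to B$ would have no fixed point. □

8.84 ¶. Let $U \subset \mathbb{R}^{n+1}$ be a bounded open set. Prove that there exists no continuous retraction $r : \overline{U} \to \partial U$ with $r(x) = x$ on $\partial U$. [Hint: Let $0 \in U$, $B(0, k) \supset \overline{U}$ and consider the continuous map $f : \overline{B}(0, k) \to \overline{B}(0, k)$ defined by

$$f(x) := \begin{cases} -k\, \dfrac{r(x)}{|r(x)|} & \text{if } x \in \overline{U}, \\[2mm] -k\, \dfrac{x}{|x|} & \text{if } x \in \overline{B}(0, k) \setminus U.] \end{cases}$$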

d. Fixed points and solvability of equations in $\mathbb{R}^{n+1}$

Going through the proof of Theorem 8.83, we can deduce a number of results concerning the solvability of equations of the type $F(x) = 0$. Let $f : S^n \to \mathbb{R}^{n+1} \setminus \{0\}$ be a continuous map. Since $f$ never vanishes, the map $f/|f|$ continuously maps $S^n$ into $S^n$. We call degree of $f$ with respect to the origin the number

$$\deg(f, 0) := \deg\Big(\frac{f}{|f|}\Big).$$

8.85 Proposition. Let $f : S^n \to \mathbb{R}^{n+1} \setminus \{0\}$ be a continuous map with $\deg(f, 0) \ne 0$. Then every continuous extension $F : B \to \mathbb{R}^{n+1}$ of $f$, $B := \mathrm{cl}(B^{n+1})$, has a zero in $B^{n+1}$.

Proof. Suppose this is not true. Then there exists a continuous extension $F : B \to \mathbb{R}^{n+1} \setminus \{0\}$ of $f$. Hence $F(x)/|F(x)|$ is a continuous map from $B$ into $S^n$ that agrees with $f(x)/|f(x)|$ on $S^n$. According to Proposition 8.80, $f/|f|$ has degree zero, a contradiction. □

Let us illustrate a few situations in which Proposition 8.85 applies.

8.86 Proposition. Let $F : \mathrm{cl}(B^{n+1}) \to \mathbb{R}^{n+1}$ be a continuous map such that $F(x)$ never points opposite to $x$ for all $x \in S^n$. Then $F(x) = 0$ has a solution.

Proof. Let $f := F|_{S^n} : S^n \to \mathbb{R}^{n+1}$. Since, by assumption, $F(x) + \lambda x \ne 0$ $\forall \lambda \ge 0$, $\forall x \in S^n$, $f$ has no zeros and therefore

$$h(t, x) := t f(x) + (1 - t) x, \qquad t \in [0,1],\ x \in S^n,$$

never vanishes. Hence $h(t, x)/|h(t, x)|$ is a homotopy of $f/|f| : S^n \to S^n$ to the identity map $\mathrm{Id} : S^n \to S^n$. It follows that $\deg(f, 0) = \deg(f/|f|) = 1$. We conclude, on account of Proposition 8.85, that $F$, being an extension of $f$, has at least one zero in $B^{n+1}$. □


8.87 Theorem (Brouwer's fixed point, II). Let $F : B \to \mathbb{R}^n$, $B := \mathrm{cl}(B^n)$, be a continuous map with $F(\partial B) \subset B$. Then $F$ has a fixed point.

Proof. Set $\varphi(x) := x - F(x)$, $x \in B$, and suppose $\varphi(x) \ne 0$ $\forall x \in \partial B$, otherwise we are through. In this case $\varphi(x)$ never points opposite to $x$ for each $x \in \partial B$. Indeed, if $x - F(x) + \lambda x = 0$ for some $\lambda \ge 0$ and $x \in \partial B$, then $F(x) = (1 + \lambda) x$. Now $\lambda > 0$ is impossible since $|F(x)| \le 1$, and, if $\lambda = 0$, then $F(x) = x$ on $\partial B$ which we have ruled out. Thus, by Proposition 8.86, $F(x) - x = 0$ has a solution inside $B$. □

It is worth noticing that Brouwer's theorem still holds if we replace $\mathrm{cl}(B^n)$ with any set which is homeomorphic to the closed ball of $\mathbb{R}^n$. Moreover, it also holds in the following form.

8.88 Theorem (Brouwer fixed point theorem, III). Every continuous map $f : K \to K$ from a convex compact set $K \subset \mathbb{R}^n$ into itself has a fixed point.

Proof. According to Dugundji's theorem, Theorem 6.42, $f$ has a continuous extension $F : \mathbb{R}^n \to \mathbb{R}^n$, whose image is contained in $K$, $K$ being convex and closed. If $B$ is a ball containing $K$, then $F(B) \subset B$ and by Brouwer's fixed point theorem, Theorem 8.87, $F$ has a fixed point $x \in B$, i.e., $F(x) = x$, and, since $F(x) \in K$, we conclude that $x \in K$. □

e. Fixed points and vector fields

Every $(n+1)$-dimensional vector field in a domain $A \subset \mathbb{R}^{n+1}$ may be regarded as a map $\varphi : A \subset \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$, once we fix the coordinates. If $\varphi$ is continuous and nonzero, the degree of $\varphi$ with respect to the origin is called the characteristic of the vector field. The properties of Brouwer's degree and Proposition 8.8 then read in terms of vector fields as follows.

8.89 Proposition. We have the following.
(i) (BROUWER) Let $\varphi$ be a nonvanishing vector field in $\mathrm{cl}(B^{n+1})$. Then $\varphi|_{S^n}$ has characteristic zero.
(ii) The outward normal to $B^{n+1}$ at $x \in S^n = \partial B^{n+1}$ is $x$. Therefore the outward normal field to $S^n$, $x \to x/|x|$, $x \in \mathbb{R}^{n+1} \setminus \{0\}$, has characteristic one.
(iii) The inward normal at $x \in S^n$ is $-x$. Therefore the inward normal field to $S^n$, $x \to -x/|x|$, $x \in \mathbb{R}^{n+1} \setminus \{0\}$, has characteristic $(-1)^{n+1}$.
(iv) Let $\varphi$ and $\psi$ be two continuous nonvanishing vector fields on $S^n$ that are never opposite on $S^n$. Then $\varphi$ and $\psi$ have the same characteristic.

Let us draw some consequences.

8.90 Proposition. Each nonvanishing vector field $\varphi : \mathrm{cl}(B^{n+1}) \to \mathbb{R}^{n+1}$ must contain at least an inward normal and an outward normal vector.

Proof. In fact, $\varphi|_{S^n}$ has characteristic zero by (i) Proposition 8.89. Since $\varphi|_{S^n}$ and the field of outward (inward) normals have different characteristics, we infer from Proposition 8.89 (iv) that $\varphi|_{S^n}$ must contain an inward (outward) normal. □


8.91 Theorem (Poincaré-Brouwer). Every continuous nonvanishing vector field on an even-dimensional sphere $S^{2n}$ must contain at least one normal vector. In particular, there can be no continuous nonvanishing tangential vector fields to $S^{2n}$.

Proof. By (ii), (iii) Proposition 8.89, the inward and outward normal vector fields in $S^{2n}$ have different characteristics. Since any unitary vector field must have characteristic differing from that of one of these two fields, the result follows from (iv) Proposition 8.89. □

8.92 Proposition. Let $f : S^{2n} \to S^{2n}$ be a continuous map. Then either $f$ has a fixed point $x = f(x)$, or there is an $x \in S^{2n}$ such that $f(x) = -x$.

Proof. Suppose $f : S^{2n} \to S^{2n}$ has no fixed point. Then the vector field $g : S^{2n} \to S^{2n}$ given by $g(x) := \frac{f(x) - x}{|f(x) - x|}$, $x \in S^{2n}$, is continuous and of modulus one. Thus it contains a normal vector, i.e., $f(x) - x = \lambda x$ for some $x \in S^{2n}$ and $\lambda \in \mathbb{R}$. Since $|f(x)| = |x| = 1$ we infer $1 = |f(x)| = |\lambda + 1||x| = |\lambda + 1|$, i.e., either $\lambda = 0$ or $\lambda = -2$. We cannot have $\lambda = 0$ since otherwise $f(x) - x = 0$ and $x$ would be a fixed point. Thus necessarily $\lambda = -2$, i.e., $f(x) = -x$. □

8.93 ¶. Let $\phi : \mathbb{R}^n \to \mathbb{R}^n$ be a continuous map that is coercive, that is

$$\frac{(\phi(x) \mid x)}{|x|} \to \infty \qquad \text{uniformly as } |x| \to \infty.$$

Show that $\phi$ is onto $\mathbb{R}^n$. [Hint: Show that for every $y \in \mathbb{R}^n$, $\phi(x) - y$ never points opposite to $x$ for $|x| = R$, $R$ large.]
8.94 %. Let (/) : R"^ —)> R'^ be a continuous map such that h m s u p ' , / ' < 1. Show that
. . , X i _ i , - l , X i - | . i , . . .,Xn) > 0,

fi{xi,.

. . , X i _ i , l , X i + i , . . . ,Xn) < 0.

Then there is at least one $x \in Q$ such that $f(x) = 0$. Show the equivalence between the above theorem and Brouwer's fixed point theorem. [Hint: To prove the theorem, first assume that strict inequalities hold. In this case show that for a suitable choice of $\epsilon_1, \dots, \epsilon_n \in \mathbb{R}$ the transformation

$$x_i' = x_i + \epsilon_i f_i(x), \qquad i = 1, \dots, n,$$

maps $Q$ into itself, and use Brouwer's theorem. In the general case, apply the above to $f(x) - \delta x$ and let $\delta$ tend to $0$. Conversely, if $F$ maps $Q$ into itself, consider the maps $f_i(x) = F_i(x) - x_i$, $i = 1, \dots, n$.]

8.96 ¶. Show that there is a nonvanishing tangent vector field on an odd-dimensional sphere $S^{2n-1}$. [Hint: Think of $S^{2n-1} \subset \mathbb{R}^{2n}$. Then the field $x = (x_1, x_2, \dots, x_{2n}) \to (-x_{n+1}, -x_{n+2}, \dots, -x_{2n}, x_1, x_2, \dots, x_n)$ defines a map from $S^{2n-1}$ into itself that has no fixed point.]
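The field suggested in the hint of 8.96 is easy to test on sample points. The sketch below is only a numerical sanity check, not a proof: it evaluates the field at random points of $S^{2n-1}$ and reports the scalar product with $x$ and the norm of the field. The helper names `tangent_field` and `random_point_on_sphere` are illustrative, and the random points are produced by normalizing Gaussian vectors.

```python
import math
import random


def tangent_field(x):
    """Map (x_1..x_2n) to (-x_{n+1}, ..., -x_{2n}, x_1, ..., x_n)."""
    n = len(x) // 2
    return [-c for c in x[n:]] + list(x[:n])


def random_point_on_sphere(dim, rng):
    """Random point of the unit sphere of R^dim (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    rng = random.Random(0)
    x = random_point_on_sphere(6, rng)   # a point of S^5 in R^6
    v = tangent_field(x)
    print(dot(x, v))                     # tangency: scalar product with x
    print(math.sqrt(dot(v, v)))          # the field has modulus one
```

The scalar product vanishes because the terms $-x_i x_{n+i}$ and $x_{n+i} x_i$ cancel in pairs, so the field is tangent and never zero on the sphere.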


Figure 8.14. Karol Borsuk (1905-1982) and a page from one of his papers.

8.2.2 Borsuk's theorem

Also Borsuk's theorem, Theorem 8.76, has interesting equivalent formulations and consequences.

8.97 Theorem. The following statements hold and are equivalent.
(i) (BORSUK-ULAM) There is no continuous antipodal map $f : S^n \to S^{n-1}$.
(ii) Each continuous $f : S^n \to \mathbb{R}^n$ sends at least one pair of antipodal points to the same point.
(iii) (LYUSTERNIK-SCHNIRELMANN) In each family of $n + 1$ closed subsets covering $S^n$ at least one set must contain a pair of antipodal points.

Proof. Borsuk's theorem $\Rightarrow$ (i) If $f : S^n \to S^{n-1}$ is a continuous antipodal map, and if we regard $S^{n-1}$ as the equator of $S^n$, $S^{n-1} \subset S^n$, $f$ would give us a nonsurjective map $f : S^n \to S^n$, hence homotopic to a constant. On the other hand $f$ has odd degree by Borsuk's theorem, a contradiction.

(i) $\Rightarrow$ (ii) Suppose that there is a continuous $g : S^n \to \mathbb{R}^n$ such that $g(x) \ne g(-x)$ $\forall x$. Then the map $f : S^n \to S^{n-1}$ defined by

$$f(x) := \frac{g(-x) - g(x)}{|g(-x) - g(x)|}$$

would yield a continuous antipodal map.

(ii) $\Rightarrow$ (iii) Let $F_1, \dots, F_{n+1}$ be $n + 1$ closed sets covering $S^n$ and let $\alpha : S^n \to S^n$ be the map $\alpha(x) = -x$. Suppose that $\alpha(F_i) \cap F_i = \emptyset$ for all $i = 1, \dots, n$, otherwise there is nothing to prove. Then we can find continuous functions $g_i : S^n \to [0,1]$ such that $g_i^{-1}(0) = F_i$ and $g_i^{-1}(1) = \alpha(F_i)$. Next we define $g : S^n \to \mathbb{R}^n$ as $g(x) = (g_1(x), \dots, g_n(x))$. By the assumption there is $x_0 \in S^n$ such that $g_i(x_0) = g_i(-x_0)$ $\forall i$, thus $x_0 \notin \cup_{i \le n} F_i$ and $x_0 \notin \cup_{i \le n} \alpha(F_i)$, consequently $x_0 \in F_{n+1} \cap \alpha(F_{n+1})$, that is, $F_{n+1}$ contains the antipodal points $x_0$ and $-x_0$.

(iii) $\Rightarrow$ (i) Let $f : S^n \to S^{n-1}$ be a continuous map. We decompose $S^{n-1}$ into $n + 1$ closed sets $A_1, \dots, A_{n+1}$ each of which has diameter less than two; this is possible by projecting the boundary of an $n$-simplex enclosing the origin onto $S^{n-1}$. Defining $F_i := f^{-1}(A_i)$, $i = 1, \dots, n + 1$, according to the assumption there is an $x_0$ and a $k$ such that $x_0 \in F_k \cap \alpha(F_k)$. But then $f(x_0)$ and $f(-x_0)$ belong to $A_k$ and so $f$ cannot be antipodal. □

8.98 Theorem. $\mathbb{R}^n$ is not homeomorphic to $\mathbb{R}^m$ if $n \ne m$.

Proof. Suppose $n > m$ and let $h : \mathbb{R}^n \to \mathbb{R}^m$ be a continuous map. Since $n - 1 \ge m$, from (ii) of Theorem 8.97 we conclude that $h|_{S^{n-1}} : S^{n-1} \to \mathbb{R}^m \subset \mathbb{R}^{n-1}$ must send two antipodal points into the same point, so that $h$ cannot be injective. □

8.99 Remark. As a curiosity, (ii) of Theorem 8.97 yields that at every instant there are two antipodal points on the earth with the same temperature and atmospheric pressure.

8.100 ¶. Show that every continuous map $f : S^n \to S^n$ such that $f(x) \ne f(-x)$ $\forall x$ is surjective.
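The one-dimensional case of (ii) in Theorem 8.97 can be checked numerically. The sketch below is only an illustration, not the argument of the text: for a continuous $2\pi$-periodic $f$, the function $h(\theta) = f(\theta) - f(\theta + \pi)$ satisfies $h(\pi) = -h(0)$, so bisection locates a pair of antipodal points of $S^1$ with equal values. The function `temperature` and the name `antipodal_coincidence` are hypothetical choices made for this example.

```python
import math

def antipodal_coincidence(f, tol=1e-12):
    """Bisection for theta in [0, pi] with f(theta) = f(theta + pi).

    Assumption: f is continuous and 2*pi-periodic (a function on S^1).
    """
    def h(theta):
        return f(theta) - f(theta + math.pi)

    lo, hi = 0.0, math.pi
    h_lo = h(lo)
    if h_lo == 0.0:
        return lo
    # By periodicity h(pi) = -h(0), so h changes sign on [0, pi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if h_mid == 0.0:
            return mid
        if (h_mid > 0) == (h_lo > 0):
            lo, h_lo = mid, h_mid   # sign unchanged: root lies to the right
        else:
            hi = mid                # sign changed: root lies to the left
    return 0.5 * (lo + hi)

def temperature(theta):
    # Hypothetical temperature along a great circle; any continuous
    # 2*pi-periodic function could be used instead.
    return 20.0 + 5.0 * math.sin(theta) + 2.0 * math.cos(3.0 * theta + 1.0)

theta0 = antipodal_coincidence(temperature)
print(theta0, temperature(theta0), temperature(theta0 + math.pi))
```

The two printed values agree up to the tolerance. The statement of Remark 8.99 concerns two quantities on $S^2$ and needs the full theorem; bisection only covers one quantity on a circle.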

8.2.3 Separation theorems

8.101 Definition. We say that a set $A \subset \mathbb{R}^{n+1}$ separates $\mathbb{R}^{n+1}$ if its complement $A^c := \mathbb{R}^{n+1} \setminus A$ is not connected.

8.102 Theorem. Let $A \subset \mathbb{R}^{n+1}$ be compact. Then

(i) each connected component of $\mathbb{R}^{n+1} \setminus A$ is a path-connected open set,
(ii) $A^c$ has exactly one unbounded connected component,
(iii) the boundary of each connected component of $A^c$ is contained in $A$,
(iv) if $A$ separates $\mathbb{R}^{n+1}$, but no proper subset does so, then the boundary of each connected component of $A^c$ is exactly $A$.

Proof. (i) follows, e.g., from Corollary 6.68, since connected components of $A^c$ are open sets.

(ii) Let $B$ be a closed ball such that $B \supset A$. Then $B^c$ is open, connected and $B^c \subset A^c$. Thus $B^c$ is contained in a unique connected component of $A^c$.

(iii) Let $U$ be any connected component of $A^c$ and $x \in \partial U$. We claim that $x$ does not belong to any connected component of $A^c$, consequently $x \notin A^c$. In fact, $x \notin U$, and, if $x$ was in some component $V$, there would exist $B(x,\epsilon) \subset V$. $B(x,\epsilon)$ would then also intersect $U$, thus $U \cap V \ne \emptyset$, a contradiction.

(iv) Let $U$ be any connected component of $A^c$. Since $A$ separates $\mathbb{R}^{n+1}$, there is another connected component $V$ of $A^c$ and, because $V \subset \mathbb{R}^{n+1} \setminus \overline{U}$, necessarily $\mathbb{R}^{n+1} \setminus \overline{U} \ne \emptyset$. Consequently $\mathbb{R}^{n+1} \setminus \partial U$ splits as $\mathbb{R}^{n+1} \setminus \partial U = U \cup (\mathbb{R}^{n+1} \setminus \overline{U})$, which are disjoint, open and nonempty, so $\partial U$ separates $\mathbb{R}^{n+1}$. Since by (iii) $\partial U \subset A$ and is closed, it follows from the hypotheses on $A$ that $\partial U = A$. □


8.103 Theorem (Borsuk's separation theorem). Let $A \subset \mathbb{R}^{n+1}$ be compact. Then $A$ separates $\mathbb{R}^{n+1}$ if and only if there exists a continuous map $f : A \to S^n$ that is not homotopic to a constant.

Proof. Define the map $\beta_p|_A$ as

$\beta_p(x) := \dfrac{x - p}{|x - p|}.$

Assume that $A$ separates $\mathbb{R}^{n+1}$. Then $\mathbb{R}^{n+1} \setminus A$ has at least one bounded component $U$. Choosing any $p \in U$ we shall show that $\beta_p|_A$ cannot be extended to a continuous function on the closed set $A \cup U$, consequently on $\mathbb{R}^{n+1}$; hence $\beta_p|_A$ is not homotopic to a constant map by Proposition 8.8. In fact, if $F : A \cup U \to S^n$ were a continuous extension of $\beta_p|_A$ we choose $R > 0$ such that $B(p,R) \supset A \cup U$ and define $g : \overline{B(p,R)} \to \partial B(p,R)$ by

$g(x) := p + R\,\dfrac{x - p}{|x - p|}$ if $x \in \overline{B(p,R)} \setminus U$, $\qquad g(x) := p + R\,F(x)$ if $x \in U$.

Then $g$ would be continuous in $\overline{B(p,R)}$ and $g = \mathrm{Id}$ on $\partial B(p,R)$: this contradicts Brouwer's theorem.

Conversely, suppose that $A$ does not separate $\mathbb{R}^{n+1}$. Then $A^c$ has exactly one connected component, which is necessarily unbounded. By Theorem 8.79, every continuous $f : A \to S^n$ extends to $F : \mathbb{R}^{n+1} \to S^n$. Therefore $F$ and consequently $f = F|_A$ are homotopic to a constant map. □

In particular, Borsuk's separation theorem tells us that the separation property is invariant by homeomorphisms.

8.104 Corollary. Let $A$ be a compact set in $\mathbb{R}^n$ and let $h : A \to \mathbb{R}^n$ be a homeomorphism onto its image. Then $A$ separates $\mathbb{R}^n$ if and only if $h(A)$ separates $\mathbb{R}^n$.

As a consequence we have the following.

8.105 Theorem (Jordan's separation theorem). A homeomorphic image of $S^n$ in $\mathbb{R}^{n+1}$ separates $\mathbb{R}^{n+1}$ and no proper closed subset of $S^n$ does so. In particular $h(S^n)$ is the complete boundary of each connected component of $\mathbb{R}^{n+1} \setminus h(S^n)$.

It is instead much more difficult to prove the following general Jordan's theorem.

8.106 Theorem (Jordan). Let $h : S^n \to \mathbb{R}^{n+1}$ be a homeomorphism between $S^n$ and its image. Then $\mathbb{R}^{n+1} \setminus h(S^n)$ has exactly two connected components, each having $h(S^n)$ as its boundary.

Jordan's theorem in the case $n = 1$ is also known as the Jordan curve theorem. We also have


8.107 Theorem (Jordan-Borsuk). Let $K$ be a compact subset of $\mathbb{R}^{n+1}$ such that $\mathbb{R}^{n+1} \setminus K$ has $k$ connected components, and let $h$ be a homeomorphism of $K$ onto its image in $\mathbb{R}^{n+1}$. Then $\mathbb{R}^{n+1} \setminus h(K)$ has $k$ connected components.

Particularly relevant is the following theorem that follows from Borsuk's separation theorem, Theorem 8.103.

8.108 Theorem (Brouwer's invariance of domain theorem). Let $U$ be an open set of $\mathbb{R}^{n+1}$ and let $h : U \subset \mathbb{R}^{n+1} \to \mathbb{R}^{n+1}$ be a homeomorphism between $U$ and its image. Then $h(U)$ is an open set in $\mathbb{R}^{n+1}$.

Proof. Let $y \in h(U)$. We shall show that there is an open set $W \subset \mathbb{R}^{n+1}$ such that $y \in W \subset h(U)$. Set $x = h^{-1}(y)$, and $B := B(x,\epsilon)$ so that $\overline{B} \subset U$. Then

(i) $\mathbb{R}^{n+1} \setminus h(\overline{B})$ is connected by Corollary 8.104, since $\overline{B}$ is homeomorphic to $\overline{B(0,1)}$ and $\overline{B(0,1)}$ does not separate $\mathbb{R}^{n+1}$,
(ii) $h(\overline{B} \setminus \partial B) = h(\overline{B}) \setminus h(\partial B)$ is connected since it is homeomorphic to $B(x,\epsilon)$.

By writing

$\mathbb{R}^{n+1} \setminus h(\partial B) = \big(\mathbb{R}^{n+1} \setminus h(\overline{B})\big) \cup h(\overline{B} \setminus \partial B)$

we see that $\mathbb{R}^{n+1} \setminus h(\partial B)$ is the union of two nonempty, disjoint connected sets, that are necessarily the connected components of $\mathbb{R}^{n+1} \setminus h(\partial B)$; since $h(\partial B)$ is compact, they are also open in $\mathbb{R}^{n+1}$. Thus we can take $W := h(\overline{B} \setminus \partial B)$. □

A trivial consequence of the domain invariance theorem is that if $A$ is any subset of $\mathbb{R}^{n+1}$ and $h : A \to \mathbb{R}^{n+1}$ is a homeomorphism between $A$ and its image $h(A)$, then $h$ maps the interior of $A$ onto the interior of $h(A)$ and the boundary of $A$ onto the boundary of $h(A)$. Using Theorem 8.108 we can also prove

8.109 Theorem. $\mathbb{R}^n$ and $\mathbb{R}^m$ are not homeomorphic if $n \ne m$.

Proof. Suppose $m > n$. If $\mathbb{R}^n$ were homeomorphic to $\mathbb{R}^m$, then the image of $\mathbb{R}^n$ into $\mathbb{R}^m$ under such homeomorphism would be open in $\mathbb{R}^m$. However, the image is not open under the map $(x_1, \dots, x_n) \to (x_1, \dots, x_n, 0, \dots, 0)$. □

8.3 Exercises

8.110 ¶ Euler's formula. Prove Euler's formula for convex polyhedra in $\mathbb{R}^3$: $V - E + F = 2$, where $V$ := # vertices, $E$ := # edges, $F$ := # faces, see Theorem 6.60 of [GM1]. [Hint: By taking out a face, deform the polyhedral surface into a plane polyhedral surface for which $V - E + F$ decreases by one. Thus it suffices to show that for the plane polyhedral surface we have $V - E + F = 1$. Triangularize the faces, noticing that this does not change $V - E + F$; eliminate the triangles from the exterior, which again does not change $V - E + F$, reducing in this way to a single triangle for which $V - E + F = 3 - 3 + 1 = 1$.]
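As a sanity check of the statement of Exercise 8.110 (not a proof), one can evaluate $V - E + F$ on the five Platonic solids. The sketch below uses the standard vertex, edge and face counts of these solids; the dictionary `PLATONIC` and the function name `euler_characteristic` are names chosen for this illustration.

```python
# (V, E, F) for the five Platonic solids.
PLATONIC = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}

def euler_characteristic(v, e, f):
    """Return V - E + F."""
    return v - e + f

for name, (v, e, f) in PLATONIC.items():
    print(name, euler_characteristic(v, e, f))

# The single triangle at the end of the hint: 3 vertices, 3 edges, 1 face.
print("triangle", euler_characteristic(3, 3, 1))
```

Each solid gives 2, while the plane triangle gives 1, in agreement with the reduction described in the hint.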


8.111 ¶. Prove

Proposition. Let $A$ be an open set of $\mathbb{C}$. $A$ is simply connected if and only if $A$ is path-connected and $\mathbb{C} \setminus A$ has no compact connected components.

[Hint: Use Jordan's theorem to show that $A^c$ has a bounded connected component if $A$ is not simply connected. To prove the converse, use that $\mathbb{R}^2 \setminus \{x_0\}$ is not simply connected.]

8.112 ¶. Prove

Theorem (Perron-Frobenius). Let $A = [a_{ij}]$ be an $n \times n$ matrix with $a_{ij} \ge 0$ $\forall i, j$. Then $A$ has an eigenvector $x$ with nonnegative coordinates corresponding to a nonnegative eigenvalue.

[Hint: If $Ax = 0$ for some $x \in D := \{x \in \mathbb{R}^n \mid x^i \ge 0\ \forall i,\ \sum_{i=1}^n x^i = 1\}$ we have finished the proof. Otherwise $f(x) := Ax / \big(\sum_{i=1}^n (Ax)^i\big)$ has a fixed point in $D$.]

8.113 ¶. Prove

Theorem (Rouché). Let $B = B(0,R)$ be a ball in $\mathbb{R}^n$ with center at the origin. Let $f, g \in C^0(\overline{B})$ with $|g(x)| < |f(x)|$ on $\partial B$. Then $\deg(f, 0) = \deg(f + g, 0)$.

Part III

Continuity in Infinite-Dimensional Spaces

Vito Volterra (1860-1940), David Hilbert (1862-1943) and Stefan Banach (1892-1945).

9. Spaces of Continuous Functions, Banach Spaces and Abstract Equations

The combination of the structure of a vector space with the structure of a metric space naturally produces the structure of a normed space and a Banach space, i.e., of a complete linear normed space. The abstract definition of a linear normed space first appears around 1920 in the works of Stefan Banach (1892-1945), Hans Hahn (1879-1934) and Norbert Wiener (1894-1964). In fact, it is in these years that the Polish school around Banach discovered the principles and laid the foundation of what we now call linear functional analysis. Here we shall restrict ourselves to introducing some definitions and illustrating some basic facts in Sections 9.1 and 9.4.

Important examples of Banach spaces are provided by spaces of continuous functions that play a relevant role in several problems. In Section 9.3 we shall discuss the completeness of these spaces, some compactness criteria for subsets of them, in particular the Ascoli-Arzelà theorem, and finally the density of subspaces of smoother functions in the class of continuous functions, such as the Stone-Weierstrass theorem.

Finally, Section 9.5 is dedicated to establishing some principles that ensure the existence of solutions of functional equations in a general context. We shall discuss the fixed point theorems of Banach and of Caccioppoli-Schauder, the Leray-Schauder principle and the method of super- and subsolutions. Later, in Chapter 11 we shall discuss some applications of these principles.

9.1 Linear Normed Spaces

9.1.1 Definitions and basic facts

9.1 Definition. Let $X$ be a linear space over $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. A norm on $X$ is a function $\|\ \| : X \to \mathbb{R}_+$ satisfying the following properties

(i) $\|x\| \in \mathbb{R}$ $\forall x \in X$,
(ii) $\|x\| \ge 0$ and $\|x\| = 0$ if and only if $x = 0$,
(iii) $\|\lambda x\| = |\lambda|\,\|x\|$ $\forall x \in X$, $\forall \lambda \in \mathbb{K}$,
(iv) $\|x + y\| \le \|x\| + \|y\|$ $\forall x, y \in X$.


Figure 9.1. Stefan Banach (1892-1945) and the frontispiece of the Théorie des opérations linéaires.

If $\|\ \|$ is a norm on $X$, we say that $(X, \|\ \|)$ is a linear normed space or simply that $X$ is a normed space with norm $\|\ \|$. Let $X$ be a linear space. A norm on $X$ induces a natural distance on $X$, defined by

$d(x,y) := \|x - y\| \qquad \forall x, y \in X,$

which is invariant by translations, i.e., $d(x+z, y+z) = d(x,y)$ $\forall x, y, z \in X$. Therefore, topological notions such as open sets, closed sets, compact sets, convergence of sequences, etc., and metric notions, such as completeness and Cauchy sequences, see Chapter 5, are well defined in a linear normed space. For instance, if $X$ is a normed space with norm $\|\ \|$, we say that $\{x_n\} \subset X$ converges to $x \in X$ if $\|x_n - x\| \to 0$ as $n \to \infty$. Notice also that the norm $\|\ \| : X \to \mathbb{R}$ is a continuous function and actually a Lipschitz-continuous function,

$\big|\,\|x\| - \|y\|\,\big| \le \|x - y\|,$

see Example 5.25.

9.2 Definition. A real (complex) normed space $(X, \|\ \|)$ that is complete with respect to the distance $d(x,y) := \|x - y\|$ is called a real (complex) Banach space.

9.3 Remark. By Hausdorff's theorem, see Chapter 5, every normed linear space $X$ can be completed into a metric space, that is, $X$ is homeomorphic to a dense subset of a complete metric space. Indeed, the completed metric space and the homeomorphism inherit the linear structure, as one easily


sees. Thus every normed space $X$ is isomorphic to a dense subset of a Banach space.

9.4 Example. With the notation above:

(i) $\mathbb{R}$ with the Euclidean norm $|x|$ is a Banach space. In fact, $|x|$ is a norm on $\mathbb{R}$, and Cauchy sequences converge in norm, compare Theorem 2.35 of [GM2].
(ii) $\mathbb{R}^n$, $n \ge 1$, is a normed space with the Euclidean norm

$|x| := \Big(\sum_{i=1}^n |x^i|^2\Big)^{1/2}, \qquad x = (x^1, x^2, \dots, x^n),$

see Example 3.2. It is also a Banach space, see Section 5.3.
(iii) Similarly, $\mathbb{C}^n$ is a Banach space with the norm $\|z\| := \big(\sum_{i=1}^n |z^i|^2\big)^{1/2}$, $z = (z^1, z^2, \dots, z^n)$.

9.5 ¶ Convex sets. In a linear space, we may consider convex subsets and convex functions.

Definition. $E \subset X$ is convex if $\lambda x + (1 - \lambda) y \in E$ for all $x, y \in E$ and for all $\lambda \in [0,1]$. $f : X \to \mathbb{R}$ is called convex if $f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$ for all $x, y \in X$ and all $\lambda \in [0,1]$.

Show that the balls $B(x_0, r) := \{x \in X \mid \|x - x_0\| < r\}$ of a normed space $X$ are convex.

a. Norms induced by inner and Hermitian products

Let $X$ be a real (complex) linear space with an inner (Hermitian) product $(x|y)$. Then $\|x\| := \sqrt{(x|x)}$ is a norm on $X$, see Propositions 3.7 and 3.16. But in general, norms on linear vector spaces are not induced by inner or Hermitian products.

9.6 Proposition. Let $\|\ \|$ be a norm on a real (respectively, complex) normed linear space $X$. A necessary and sufficient condition for the existence of an inner (Hermitian) product $(\ |\ )$ such that $\|x\|^2 = (x|x)$ $\forall x \in X$ is that the parallelogram law holds,

$\|x + y\|^2 + \|x - y\|^2 = 2\big(\|x\|^2 + \|y\|^2\big) \qquad \forall x, y \in X.$

9.7 ¶. Show Proposition 9.6. [Hint: First show that if $\|x\|^2 = (x|x)$, then the parallelogram law holds. Conversely, in the real case set

$(x|y) := \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big)$

and show that it is an inner product, while in the complex case, set

$(x|y) := \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big) + \frac{i}{4}\big(\|x + iy\|^2 - \|x - iy\|^2\big),$

and show that $(x|y)$ is a Hermitian product.]

9.8 ¶. For $p \ge 1$, $\|x\|_p := \big(\sum_{i=1}^n |x^i|^p\big)^{1/p}$, $x = (x^1, x^2, \dots, x^n)$, is a norm in $\mathbb{R}^n$, cf. Exercise 5.13. Show that it is induced by an inner product if and only if $p = 2$.
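The parallelogram law gives a quick numerical test for Exercise 9.8. The following sketch (an illustration, not a proof) evaluates the defect $\|x+y\|_p^2 + \|x-y\|_p^2 - 2(\|x\|_p^2 + \|y\|_p^2)$ on the two standard basis vectors of $\mathbb{R}^2$, where it equals $2 \cdot 2^{2/p} - 4$ and therefore vanishes only for $p = 2$. The function names `p_norm` and `parallelogram_defect` are chosen for this example.

```python
def p_norm(x, p):
    """The p-norm of a finite sequence x, for p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2), computed in the p-norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (p_norm(s, p) ** 2 + p_norm(d, p) ** 2
            - 2.0 * (p_norm(x, p) ** 2 + p_norm(y, p) ** 2))

e1, e2 = [1.0, 0.0], [0.0, 1.0]
for p in (1.0, 1.5, 2.0, 3.0):
    print(p, parallelogram_defect(e1, e2, p))
```

A nonzero defect for a single pair of vectors already rules out an inner product; a zero defect for one pair proves nothing by itself.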


b. Equivalent norms

9.9 Definition. Two norms $\|\ \|_1$ and $\|\ \|_2$ on a linear vector space $X$ are said to be equivalent if there exist two constants $0 < m \le M$ such that

$m\,\|x\|_1 \le \|x\|_2 \le M\,\|x\|_1 \qquad \forall x \in X. \qquad (9.1)$

If $\|\ \|_1$ and $\|\ \|_2$ are equivalent, then trivially the normed spaces $(X, \|\ \|_1)$ and $(X, \|\ \|_2)$ have the same convergent sequences (to the same limits) and the same Cauchy sequences. Therefore $(X, \|\ \|_1)$ is a Banach space if and only if $(X, \|\ \|_2)$ is a Banach space. Since the induced distances are translation invariant, we have the following.

9.10 Proposition. Let $\|\ \|_1$ and $\|\ \|_2$ be two norms on a linear vector space $X$. The following statements are equivalent

(i) $\|\ \|_1$ and $\|\ \|_2$ are equivalent norms,
(ii) the relative induced distances are topologically equivalent,
(iii) for any $\{x_n\} \subset X$, $\|x_n\|_1 \to 0$ if and only if $\|x_n\|_2 \to 0$.

Proof. Obviously (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii). Let us prove that (iii) $\Rightarrow$ (i). (iii) implies that the identity map $i : (X, \|\ \|_1) \to (X, \|\ \|_2)$ is continuous at $0$. Therefore there exists $\delta > 0$ such that $\|z\|_2 \le 1$ if $\|z\|_1 \le \delta$. For $x \in X$, $x \ne 0$, if $z := (\delta / \|x\|_1)\,x$ we have $\|z\|_1 = \delta$, hence $\|z\|_2 \le 1$, i.e., $\|x\|_2 \le \frac{1}{\delta}\|x\|_1$. Exchanging the roles of $\|\ \|_1$ and $\|\ \|_2$ and repeating the argument, we also get the inequality

$\|x\|_1 \le \frac{1}{\delta_1}\,\|x\|_2 \qquad \forall x \in X$

for some $\delta_1 > 0$, hence (i) is proved. □

9.11 ¶. Let $X$ and $Y$ be two Banach spaces. Show that their Cartesian product, called the direct sum, is a Banach space with the norm $\|(x,y)\|_{1, X \times Y} := \|x\|_X + \|y\|_Y$. Show that

$\|(x,y)\|_{p, X \times Y} := \big(\|x\|_X^p + \|y\|_Y^p\big)^{1/p}, \quad p \ge 1, \qquad \|(x,y)\|_{\infty, X \times Y} := \max(\|x\|_X, \|y\|_Y),$

are equivalent norms.

c. Series in normed spaces

In a linear vector space $X$, finite sums of elements of $X$ are elements of $X$. Therefore, given a sequence $\{x_n\}$ in $X$, we can consider the series $\sum_{n=0}^\infty x_n$, i.e., the sequence of partial sums $\big\{\sum_{k=0}^n x_k\big\}$. If, moreover, $X$ is a normed space, we can inquire about the convergence of series in $X$.

9.12 Definition. Let $X$ be a normed vector space with norm $\|\ \|$. A series $\sum_{n=0}^\infty x_n$, $x_n \in X$, is said to be convergent in $X$ if the sequence of its partial sums, $s_n := \sum_{k=0}^n x_k$, converges in $X$, i.e., there exists $s \in X$ such that $\|s_n - s\| \to 0$. In this case we write

$s = \sum_{k=0}^\infty x_k$

instead of $\|s_n - s\| \to 0$, and $s$ is said to be the sum of the series.


9.13 Remark. Writing $s = \sum_{k=0}^\infty x_k$ might make one forget that the sum of the series $s$ is a limit. In dubious cases, for instance if more than one convergence is involved, it is worth specifying in which normed space $(X, \|\ \|_X)$, equivalently with respect to which norm $\|\ \|_X$, the limit has been computed by writing

$s = \sum_{k=0}^\infty x_k$ in the norm of $X$, or $s = \sum_{k=0}^\infty x_k$ in $X$,

or, even better, writing

$\Big\| s - \sum_{k=0}^n x_k \Big\|_X \to 0.$

9.14 Definition. Let $X$ be a normed space with norm $\|\ \|$. We say that the series $\sum_{n=0}^\infty x_n$, $\{x_n\} \subset X$, is absolutely convergent if the series of the norms $\sum_{n=0}^\infty \|x_n\|$ converges in $\mathbb{R}$.

We have seen, compare Proposition 2.39 of [GM2], that every absolutely convergent series in $\mathbb{R}$ is convergent. In general, we have the following.

9.15 Proposition. Let $X$ be a normed space with norm $\|\ \|$. Then all the absolutely convergent series of elements of $X$ converge in $X$ if and only if $X$ is a Banach space. Moreover, if $\sum_{n=0}^\infty x_n$ is convergent, then

$\Big\| \sum_{k=p}^\infty x_k \Big\| \le \sum_{k=p}^\infty \|x_k\|.$

Proof. Let $X$ be a Banach space, and let $\sum_{k=0}^\infty x_k$ be absolutely convergent. The sequence of partial sums of $\sum_{k=0}^\infty \|x_k\|$ is a Cauchy sequence in $\mathbb{R}$, hence $\sum_{k=p}^q \|x_k\| \to 0$ as $p, q \to \infty$. From the triangle inequality we infer that

$\Big\| \sum_{k=p}^q x_k \Big\| \le \sum_{k=p}^q \|x_k\|,$

hence $\big\| \sum_{k=p}^q x_k \big\| \to 0$ as $p, q \to \infty$, i.e., the sequence of partial sums of $\sum_{k=0}^\infty x_k$ is a Cauchy sequence in $X$. Consequently, it converges in norm in $X$, since $X$ is a Banach space.

Conversely, let $\{x_k\} \subset X$ be a Cauchy sequence. By induction select $n_1$ such that $\|x_n - x_{n_1}\| < 1$ if $n \ge n_1$, then $n_2 > n_1$ such that $\|x_n - x_{n_2}\| < 1/2$ if $n \ge n_2$ and so on. Then $\{x_{n_k}\}$ is a subsequence of $\{x_k\}$ such that

$\|x_{n_{k+1}} - x_{n_k}\| < 2^{-k+1} \qquad \forall k,$

and consequently the series $\sum_{k=1}^\infty (x_{n_{k+1}} - x_{n_k})$ is absolutely convergent, hence convergent to a point $y \in X$ by assumption, i.e.,

$\Big\| \sum_{k=1}^p (x_{n_{k+1}} - x_{n_k}) - y \Big\| \to 0 \qquad \text{as } p \to +\infty.$


Since the sum telescopes, this simply amounts to $\|x_{n_{p+1}} - x\| \to 0$, $x := y + x_{n_1}$; hence $\{x_{n_k}\}$ converges to $x$, and, as $\{x_n\}$ is a Cauchy sequence, we conclude that in fact the entire sequence $\{x_n\}$ converges to $x$. Finally, the estimate follows from the triangle inequality

$\Big\| \sum_{k=p}^q x_k \Big\| \le \sum_{k=p}^q \|x_k\|$

as $q \to \infty$, since we are able to pass to the limit as $\sum_{k=0}^\infty x_k$ converges. □

9.16 ¶ Commutativity. Let $X$ be a Banach space and let $\{x_n\} \subset X$ be such that $\sum_n x_n$ is absolutely convergent. Then $\sum_n x_{\sigma(n)}$, for $\{x_{\sigma(n)}\}$ a rearrangement of $\{x_n\}$, is also absolutely convergent, and $\sum_n x_n = \sum_n x_{\sigma(n)}$.

9.17 ¶ Associativity. Let $X$ be a Banach space and let $\{x_n\} \subset X$. Let $\{I_n\}$ be a sequence of nonempty subsets of $\mathbb{N}$ with $I_n \cap I_m = \emptyset$ for $n \ne m$ and $\bigcup_n I_n = \mathbb{N}$. If $\sum_n x_n$ is absolutely convergent, then

$\sum_n \Big( \sum_{k \in I_n} x_k \Big)$

is absolutely convergent and $\sum_{k=0}^\infty x_k = \sum_{n=1}^\infty \big( \sum_{k \in I_n} x_k \big)$.

d. Finite-dimensional normed linear spaces

In a finite-dimensional vector space, there is only one topology induced by a norm: the Euclidean topology. In fact, if $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, we have the following.

9.18 Theorem. In $\mathbb{K}^n$ any two norms are equivalent.

Proof. It suffices to prove that any norm $p$ on $\mathbb{K}^n$ is equivalent to the Euclidean norm $|\ |$. Let $(e_1, e_2, \dots, e_n)$ be the standard basis of $\mathbb{K}^n$. If $x = (x^1, x^2, \dots, x^n)$ and $y = (y^1, y^2, \dots, y^n)$, we have

$|p(x) - p(y)| \le p(x - y) \le \sum_{i=1}^n |x^i - y^i|\, p(e_i) \le \Big( \sum_{i=1}^n p(e_i)^2 \Big)^{1/2} |x - y|,$

hence $p : \mathbb{K}^n \to \mathbb{R}_+$ is continuous. Since the unit sphere $B := \{x \in \mathbb{K}^n \mid |x| = 1\}$ of $\mathbb{K}^n$ is compact, we infer that $p$ attains a maximum value $M$ and a minimum value $m$ on $B$. Since the minimum value is attained, we infer that $m > 0$, otherwise $p$ would be zero at some point of $B$. Therefore $0 < m \le p(x) \le M$ on $B$, and, on account of the 1-homogeneity of the norm,

$m\,|x| \le p(x) \le M\,|x| \qquad \forall x \in \mathbb{K}^n,$

i.e., $p$ is equivalent to the Euclidean norm. □

9.19 Corollary. Every finite-dimensional normed space $X$ is a Banach space. In particular, any finite-dimensional subspace of $X$ is closed and $K \subset X$ is compact if and only if $K$ is closed and bounded.


Proof. Let $p$ be a norm on $X$ and let $\mathcal{E} : \mathbb{K}^n \to X$ be a coordinate map on $X$. Since $\mathcal{E}$ is linear and nonsingular, $p \circ \mathcal{E}$ is a norm on $\mathbb{K}^n$ and $\mathcal{E}$ is trivially an isometry between the two normed spaces $(\mathbb{K}^n, p \circ \mathcal{E})$ and $(X, p)$. Since $p \circ \mathcal{E}$ is equivalent to the Euclidean norm, $(\mathbb{K}^n, p \circ \mathcal{E})$ is a Banach space and therefore $(X, p)$ is a Banach space, too. The second claim is obvious. □

9.20 ¶. Let $X$ be a normed space of dimension $n$. Then any system of coordinates $\mathcal{E} : X \to \mathbb{K}^n$ is a linear continuous map between $\mathbb{K}^n$ with the Euclidean metric and the normed space $X$.
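The constants $m$ and $M$ in the proof of Theorem 9.18 can be estimated by sampling the Euclidean unit sphere. The sketch below does this for the norm $p(x) = \sum_i |x^i|$ on $\mathbb{R}^3$, for which the exact values are $m = 1$ and $M = \sqrt{3}$; random sampling only brackets them from the inside. The choice of norm, the function names and the sampling parameters are assumptions made for this illustration.

```python
import math
import random

def euclid(x):
    """Euclidean norm |x|."""
    return math.sqrt(sum(t * t for t in x))

def one_norm(x):
    """The norm p(x) = sum of |x^i|."""
    return sum(abs(t) for t in x)

def sample_bounds(p, n, trials=20000, seed=0):
    """Estimate min and max of p on the Euclidean unit sphere of R^n by sampling."""
    rng = random.Random(seed)
    lo, hi = float("inf"), 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = euclid(x)
        if r == 0.0:
            continue  # skip the (practically impossible) zero sample
        value = p([t / r for t in x])  # p evaluated at a point with |x| = 1
        lo, hi = min(lo, value), max(hi, value)
    return lo, hi

n = 3
m_est, M_est = sample_bounds(one_norm, n)
print(m_est, M_est, math.sqrt(n))
```

By 1-homogeneity the estimated constants then give $m\,|x| \le p(x) \le M\,|x|$ approximately for all $x$, which is the content of (9.1) for this pair of norms.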

A key ingredient in the proof of Theorem 9.18 is the fact that the closed unit ball in $\mathbb{K}^n$ is compact. This property is characteristic of finite-dimensional spaces.

9.21 Theorem (Riesz). The closed unit ball of a normed linear space $X$ is compact if and only if $X$ is finite dimensional.

For the proof we need the following lemma, due to Frigyes Riesz (1880-1956), which in this context plays the role of the orthogonal projection theorem in spaces with inner or Hermitian products, see Theorem 3.27 and Chapter 10.

9.22 Lemma. Let $Y$ be a closed proper linear subspace of a normed space $(X, \|\ \|)$. Then there exists $\bar{x} \in X$ such that $\|\bar{x}\| = 1$ and $\|\bar{x} - x\| \ge 1/2$ $\forall x \in Y$.

Proof. Take $x_0 \in X \setminus Y$ and define $d := \inf\{\|y - x_0\| \mid y \in Y\}$. We have $d > 0$, otherwise we could find $\{y_n\} \subset Y$ with $y_n \to x_0$ and $x_0 \in Y$ since $Y$ is closed. Take $y_0 \in Y$ with $\|y_0 - x_0\| \le 2d$ and set $\bar{x} := \dfrac{x_0 - y_0}{\|x_0 - y_0\|}$. Clearly $\|\bar{x}\| = 1$ and $y_0 + y\,\|x_0 - y_0\| \in Y$ if $y \in Y$, hence

$\|\bar{x} - y\| = \Big\| \dfrac{x_0 - y_0}{\|x_0 - y_0\|} - y \Big\| = \dfrac{\big\| x_0 - y_0 - y\,\|x_0 - y_0\| \big\|}{\|x_0 - y_0\|} \ge \dfrac{d}{2d} = \dfrac{1}{2}.$ □

Proof of Theorem 9.21. Let $B := \{x \in X \mid \|x\| \le 1\}$. If $X$ has dimension $n$, and $\mathcal{E} : \mathbb{K}^n \to X$ is a system of coordinates, then $\mathcal{E}$ is an isomorphism, hence a homeomorphism. Since $B$ is bounded and closed, $\mathcal{E}^{-1}(B)$ is also bounded and closed, hence compact in $\mathbb{K}^n$, see Corollary 9.19. Therefore $B = \mathcal{E}(\mathcal{E}^{-1}(B))$ is compact in $X$.

We now prove that $B$ is not compact if $X$ has infinite dimension. Take $x_1$ with $\|x_1\| = 1$. By Lemma 9.22, we find $x_2$ with $\|x_2\| = 1$ at distance at least $1/2$ from the subspace $\mathrm{Span}\{x_1\}$, in particular $\|x_1 - x_2\| \ge \frac{1}{2}$. Again by Lemma 9.22, we find $x_3$ with $\|x_3\| = 1$ at distance at least $1/2$ from $\mathrm{Span}\{x_1, x_2\}$, in particular $\|x_3 - x_1\| \ge \frac{1}{2}$ and $\|x_3 - x_2\| \ge \frac{1}{2}$. Iterating this procedure we construct a sequence $\{x_n\}$ of points in the unit sphere such that $\|x_i - x_j\| \ge \frac{1}{2}$ $\forall i, j$, $i \ne j$. Therefore $\{x_n\}$ has no convergent subsequence, hence the unit sphere is not compact, and neither is $B$. □

9.23 Remark. We emphasize that, in any infinite-dimensional normed space, we have constructed a sequence of unit vectors no subsequence of which is Cauchy.
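A concrete instance of Remark 9.23 is given by the coordinate sequences $e_i$ in the space $\ell_2$ discussed in Section 9.1.2: they are unit vectors with $\|e_i - e_j\| = \sqrt{2}$ for $i \ne j$. The sketch below checks this on finite truncations, which is all a computer can represent; the truncation dimension `dim` and the function names are assumptions of the example.

```python
import math

def unit_vector(i, dim):
    """The i-th coordinate vector, truncated to the first dim entries."""
    return [1.0 if k == i else 0.0 for k in range(dim)]

def dist2(x, y):
    """Euclidean (l_2) distance between two finite sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

dim = 6
vectors = [unit_vector(i, dim) for i in range(dim)]
min_gap = min(dist2(vectors[i], vectors[j])
              for i in range(dim) for j in range(i + 1, dim))
print(min_gap, math.sqrt(2.0))
```

The minimal gap does not shrink as `dim` grows, which is why no subsequence of $\{e_i\}$ can be Cauchy in the infinite-dimensional space.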


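9.1.2 A few examples

In Sections 9.2 and 9.4 we shall discuss, respectively, the relevant Banach spaces of linear continuous operators and of bounded continuous functions. Here we begin with a few examples.

a. The space $\ell_p$, $1 \le p < \infty$

Let $Y$ be a normed space with norm $\|\ \|_Y$ and let $p \ge 1$. For a sequence $\xi = \{\xi_i\} \subset Y$ we define

$\|\xi\|_{\ell_p(Y)} := \Big( \sum_{i=1}^\infty \|\xi_i\|_Y^p \Big)^{1/p}.$

Then the space of sequences

$\ell_p(Y) := \big\{ \xi = \{\xi_i\} \subset Y \;\big|\; \|\xi\|_{\ell_p(Y)} < +\infty \big\}$

is a linear space with norm $\|\xi\|_{\ell_p(Y)}$. Moreover, we have the following.

9.24 Proposition. $\ell_p(Y)$ is a Banach space if $Y$ is a Banach space.

Proof. Let $\{\xi_k\}$, $\xi_k := \{\xi_i^{(k)}\}$, be a Cauchy sequence in $\ell_p(Y)$. Since for any $i$

$\|\xi_i^{(k)} - \xi_i^{(h)}\|_Y \le \|\xi_k - \xi_h\|_{\ell_p(Y)}, \qquad (9.2)$

the sequence $\{\xi_i^{(k)}\}_k$ is a Cauchy sequence in $Y$, hence it has a limit $\xi_i \in Y$,

$\xi_i^{(k)} \to \xi_i \qquad \text{as } k \to \infty.$

We then set $\xi := \{\xi_i\}$ and prove that $\{\xi_k\}$ converges to $\xi$ in $\ell_p(Y)$. Fix $\epsilon > 0$, then for all $n, m \ge n_0(\epsilon)$ we have

$\|\xi_n - \xi_m\|_{\ell_p(Y)} < \epsilon,$

hence, for all $r \in \mathbb{N}$,

$\sum_{i=1}^r \|\xi_i^{(n)} - \xi_i^{(m)}\|_Y^p < \epsilon^p,$

and, since $x \to \|z - x\|_Y$ is continuous in $Y$, as $m \to \infty$,

$\sum_{i=1}^r \|\xi_i^{(n)} - \xi_i\|_Y^p \le \epsilon^p$

for $n \ge n_0(\epsilon)$ and all $r$. Letting $r \to \infty$, we find $\|\xi_n - \xi\|_{\ell_p(Y)} \le \epsilon$ for $n \ge n_0$, i.e., $\xi_n \to \xi$ in $\ell_p(Y)$. Finally, the triangle inequality shows that $\xi \in \ell_p(Y)$. □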


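b. A normed space that is not Banach

The map

$f \to \|f\|_p := \Big( \int_a^b |f(t)|^p\,dt \Big)^{1/p}, \qquad p \ge 1,$

defines a norm on the space of continuous functions $C^0([a,b], \mathbb{R})$. Indeed, if $Y$ is a linear normed space with norm $\|\ \|_Y$,

$\|f\|_p = \|f\|_{L^p(]a,b[,Y)} := \Big( \int_a^b \|f(t)\|_Y^p\,dt \Big)^{1/p}, \qquad p \ge 1,$

defines a norm on the space of continuous functions with values in $Y$. In fact, $t \to \|f(t)\|_Y$ is a continuous real-valued map, hence Riemann integrable, thus $\|f\|_p$ is well defined. Clearly $\|f\|_p = 0$ if and only if $f(t) = 0$ $\forall t$, and $\|f\|_p$ is positively homogeneous of degree 1. It remains to prove the triangle inequality for $f \to \|f\|_p$, called the Minkowski inequality,

$\|f + g\|_p \le \|f\|_p + \|g\|_p \qquad \forall f, g \in C^0([a,b], Y).$

The claim is trivial if one of the two functions is zero. Otherwise, we use the convexity of $\phi : Y \to \mathbb{R}$ where $\phi(y) := \|y\|_Y^p$, i.e.,

$\|\lambda x + (1 - \lambda) y\|_Y^p \le \lambda \|x\|_Y^p + (1 - \lambda) \|y\|_Y^p \qquad \forall x, y \in Y,\ \forall \lambda \in [0,1], \qquad (9.3)$

with $x = f(t)/\|f\|_p$, $y = g(t)/\|g\|_p$ and $\lambda = \|f\|_p / (\|f\|_p + \|g\|_p)$, so that $\lambda x + (1 - \lambda) y = (f(t) + g(t)) / (\|f\|_p + \|g\|_p)$, and we integrate over $[a,b]$ to get

$\dfrac{1}{(\|f\|_p + \|g\|_p)^p} \int_a^b \|f(t) + g(t)\|_Y^p\,dt \le \lambda + (1 - \lambda) = 1,$

that is Minkowski's inequality.

It turns out that $C^0([a,b], \mathbb{R})$ normed with $\|\ \|_p$ is not complete, see Example 9.25. Its completion is denoted by $L^p(]a,b[)$. Its characterization is one of the outcomes of the Lebesgue theory of integration.

9.25 Example. Define, see Figure 9.2,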
that is Minkowski's inequality. It turns out that C^([a,6],R) normed with \\ \\p is not complete, see Example 9.25. Its completion is denoted by L^(]a, 6[). Its characterization is one of the outcomes of the Lebesgue theory of integration. 9.25 E x a m p l e . Define, see Figure 9.2,

fo

if - 1 < t <0,

fn{t) = { nt

if 0 < t < 1/n, - ' ' if 1/n < t < 1

1

and

r/.x ; 0 if - 1 < t < 0, f(t) = < 1 1 if 0 < t < 1.

The sequence {/n} converges to / in norm

\\U-m=

r^"(l-nt)Pdt
If p G C ^ ( [ - l , 1]) is the Umit of {/n}, then | | / - g\\-p — 0, consequently p = / = 0 on [—1,0] and p = / = 1 on ]0,1], a contradiction, since g is continuous.
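The distance computed in Example 9.25 is easy to confirm numerically. The sketch below approximates $\int_{-1}^{1} |f_n - f|^p\,dt$ with a midpoint rule and compares it with the closed form $1/((p+1)n)$. The quadrature rule, the number of cells and the function names are choices made for this illustration only.

```python
def f_n(t, n):
    """Piecewise linear ramp: 0 on [-1, 0], n*t on [0, 1/n], 1 on [1/n, 1]."""
    if t <= 0.0:
        return 0.0
    return min(n * t, 1.0)

def f(t):
    """The discontinuous limit: 0 on [-1, 0], 1 on ]0, 1]."""
    return 1.0 if t > 0.0 else 0.0

def dist_p_pow(n, p, cells=20000):
    """Midpoint-rule approximation of the integral of |f_n - f|^p over [-1, 1]."""
    h = 2.0 / cells
    total = 0.0
    for k in range(cells):
        t = -1.0 + (k + 0.5) * h  # midpoint of the k-th cell
        total += abs(f_n(t, n) - f(t)) ** p
    return total * h

p = 2.0
for n in (1, 10, 100):
    print(n, dist_p_pow(n, p), 1.0 / ((p + 1.0) * n))
```

The two columns agree to several digits and both tend to $0$, so $\{f_n\}$ is a Cauchy sequence in $(C^0([-1,1]), \|\ \|_p)$ whose only candidate limit is discontinuous.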


Figure 9.2. Pointwise approximation of the Heaviside function.

c. Spaces of bounded functions

Let $A$ be any set and $Y$ be a normed space with norm $\|\ \|_Y$. The uniform norm of a function $f : A \to Y$ is defined by the number (possibly infinity)

$\|f\|_{B(A,Y)} := \sup_{x \in A} \|f(x)\|_Y.$

$\|\ \|_{B(A,Y)}$ defines a norm on the space of bounded functions $f : A \to Y$

$B(A,Y) := \big\{ f : A \to Y \;\big|\; \|f\|_{B(A,Y)} < +\infty \big\}$

which then becomes a normed space. The norm $\|f\|_{B(A,Y)}$ on $B(A,Y)$ is also denoted by $\|f\|_{\infty,A}$ or even by $\|f\|_\infty$ when no confusion can arise. The topology induced on $B(A,Y)$ by the uniform norm is called the topology of uniform convergence, see Example 5.19. In particular, we say that a sequence $\{f_n\} \subset B(A,Y)$ converges uniformly in $A$ to $f \in B(A,Y)$ and we write

$f_n(x) \to f(x)$ uniformly in $A$,

if $\|f_n - f\|_{B(A,Y)} \to 0$.

9.26 Proposition. If $Y$ is a Banach space, then $B(A,Y)$ is a Banach space.

Proof. Let $\{f_n\} \subset B(A,Y)$ be a Cauchy sequence with respect to $\|\ \|_\infty$. For any $\epsilon > 0$ there is an $n_0$ such that $\|f_n - f_m\|_\infty < \epsilon$ for all $n, m \ge n_0$. Therefore, for all $x \in A$ and $n, m \ge n_0$

$\|f_n(x) - f_m(x)\|_Y < \epsilon. \qquad (9.4)$

Consequently, for all $x \in A$, $\{f_n(x)\}$ is a Cauchy sequence in $Y$, hence it converges to an element $f(x) \in Y$. Letting $m \to \infty$ in (9.4), we find

$\|f_n(x) - f(x)\|_Y \le \epsilon \qquad \forall n \ge n_0 \text{ and } \forall x \in A,$

i.e., $\|f - f_n\|_\infty \le \epsilon$ for $n \ge n_0$, hence $\|f\|_\infty \le \|f_n\|_\infty + \epsilon$, i.e., $f \in B(A,Y)$ and $f_n \to f$ uniformly in $B(A,Y)$ since $\epsilon$ is arbitrary. □

9.27 ¶. Let $Y$ be finite dimensional and let $(e_1, e_2, \dots, e_n)$ be a basis of $Y$. We can write $f$ as $f(x) = f_1(x) e_1 + \dots + f_n(x) e_n$. Thus $f \in B(A,Y)$ if and only if all the components of $f$ are bounded real functions.


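d. The space $\ell_\infty(Y)$

A special case occurs when $A = \mathbb{N}$. In this case $B(A,Y)$ is the space of bounded sequences of $Y$, that we better denote by

$\ell_\infty(Y) := B(\mathbb{N}, Y).$

Therefore, by Proposition 9.26, $\ell_\infty(Y)$ is a Banach space with the uniform norm

$\|\xi\|_{\ell_\infty(Y)} := \|\xi\|_{B(\mathbb{N},Y)} = \sup_i \|\xi_i\|_Y,$

if $Y$ is complete.

9.28 ¶. Show that for $1 \le p < q < \infty$ we have

(i) $\ell_p(\mathbb{R}) \subset \ell_q(\mathbb{R}) \subset \ell_\infty(\mathbb{R})$,
(ii) $\ell_p(\mathbb{R})$ is a proper subspace of $\ell_q(\mathbb{R})$,
(iii) the identity map $\mathrm{Id} : \ell_p(\mathbb{R}) \to \ell_q(\mathbb{R})$ is continuous,
(iv) $\ell_1(\mathbb{R})$ is a dense subset of $\ell_q(\mathbb{R})$ with respect to the convergence in $\ell_q(\mathbb{R})$.

9.29 ¶. Show that, if $p, q > 1$ and $1/p + 1/q = 1$, then

$\Big| \sum_{n=1}^\infty \xi_n \eta_n \Big| \le \|\xi\|_{\ell_p(\mathbb{R})}\, \|\eta\|_{\ell_q(\mathbb{R})} \qquad (9.5)$

for all $\{\xi_n\} \in \ell_p(\mathbb{R})$ and $\{\eta_n\} \in \ell_q(\mathbb{R})$. Moreover, show that

$\|\xi\|_{\ell_p(\mathbb{R})} = \sup \Big\{ \Big| \sum_{n=1}^\infty \xi_n \eta_n \Big| \;\Big|\; \|\eta\|_{\ell_q(\mathbb{R})} \le 1 \Big\}. \qquad (9.6)$

[Hint: For proving (9.5) use the Young inequality $ab \le a^p/p + b^q/q$. Using (9.5), show that $\ge$ holds in (9.6). By a suitable choice $b = b(a)$ and again using Young's inequality, finally show equality in (9.6).]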
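Inequality (9.5) and the supremum in (9.6) can be explored on finitely supported sequences. In the sketch below, `holder_gap` returns the right side minus the left side of (9.5), and `extremal_eta` builds the candidate $\eta_n = \operatorname{sign}(\xi_n)\,|\xi_n|^{p-1} / \|\xi\|_p^{p-1}$, which is one standard choice for attaining the supremum; this choice, the random test data and all names are assumptions of the illustration rather than part of the exercise text.

```python
import random

def lp_norm(xi, p):
    """The l_p norm of a finitely supported sequence."""
    return sum(abs(t) ** p for t in xi) ** (1.0 / p)

def holder_gap(xi, eta, p):
    """Right side minus left side of (9.5); nonnegative if (9.5) holds."""
    q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
    lhs = abs(sum(a * b for a, b in zip(xi, eta)))
    return lp_norm(xi, p) * lp_norm(eta, q) - lhs

def extremal_eta(xi, p):
    """Candidate eta with unit l_q norm and sum xi_n * eta_n = ||xi||_p (xi != 0)."""
    scale = lp_norm(xi, p) ** (p - 1.0)
    return [(1.0 if a >= 0 else -1.0) * abs(a) ** (p - 1.0) / scale for a in xi]

rng = random.Random(1)
xi = [rng.uniform(-1.0, 1.0) for _ in range(50)]
eta = [rng.uniform(-1.0, 1.0) for _ in range(50)]
p = 3.0
q = p / (p - 1.0)
best = extremal_eta(xi, p)
print(holder_gap(xi, eta, p))
print(lp_norm(best, q), sum(a * b for a, b in zip(xi, best)), lp_norm(xi, p))
```

The first printed number is nonnegative, and the last line shows a sequence of unit $\ell_q$ norm whose pairing with $\xi$ reproduces $\|\xi\|_{\ell_p}$ up to rounding.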

9.2 Spaces of Bounded and Continuous Functions

In this section we discuss some basic properties of the space of continuous and bounded functions from a metric space into a Banach space.

9.2.1 Uniform convergence

a. Uniform convergence

Let $X$ be a metric space and let $Y$ be a normed space with norm $\|\ \|_Y$. Then, as we have seen in Proposition 9.26, the space $B(X,Y)$ of bounded


functions from $X$ into $Y$ is a normed space with uniform norm, and $B(X,Y)$ is a Banach space provided $Y$ is a Banach space. We denote by $C_b(X,Y)$ the subspace of $B(X,Y)$ of bounded and continuous functions from $X$ into $Y$,

$C_b(X,Y) := C^0(X,Y) \cap B(X,Y).$

Observe that, by the Weierstrass theorem, $C_b(X,Y) = C^0(X,Y)$ if $X$ is compact, and that, trivially, $C_b(X,Y)$ is a normed space with uniform norm.

9.30 Proposition. $C_b(X,Y)$ is a closed subspace of $B(X,Y)$.

Proof. Let $\{f_n\} \subset C_b(X,Y)$ be such that $f_n \to f$ uniformly. For any $\epsilon > 0$, we choose $n_0 = n_0(\epsilon)$ such that $\|f - f_{n_0}\|_{\infty,X} < \epsilon$. It follows that

$\|f\|_{\infty,X} \le \|f - f_{n_0}\|_{\infty,X} + \|f_{n_0}\|_{\infty,X} < +\infty,$

i.e., $f \in B(X,Y)$. Moreover, since $f_{n_0}$ is continuous, for a fixed $x_0 \in X$ there exists $\delta > 0$ such that $\|f_{n_0}(x) - f_{n_0}(x_0)\|_Y < \epsilon$ whenever $x \in X$ and $d_X(x, x_0) < \delta$. Thus, for $d_X(x, x_0) < \delta$, we deduce that

$\|f(x) - f(x_0)\|_Y \le \|f(x) - f_{n_0}(x)\|_Y + \|f_{n_0}(x) - f_{n_0}(x_0)\|_Y + \|f_{n_0}(x_0) - f(x_0)\|_Y < 3\epsilon,$

i.e., $f$ is continuous at $x_0$. In conclusion, $f \in C^0(X,Y) \cap B(X,Y)$. □

Immediate consequences are the following corollaries.

9.31 Corollary. The uniform limit of a sequence of continuous functions is continuous.

9.32 Corollary. Let $X$ be a metric space and let $Y$ be a Banach space. Then $C_b(X,Y)$ with uniform norm is a Banach space.

9.33 ¶. Show that the space $C^1([a,b], \mathbb{R})$ of real functions $f : [a,b] \to \mathbb{R}$, which are of class $C^1$, is a Banach space with the norm

$\|f\|_{C^1} := \sup_{x \in [a,b]} |f(x)| + \sup_{x \in [a,b]} |f'(x)|.$

[Hint: If $\{f_k\}$ is a Cauchy sequence in $C^1([a,b])$, show that $f_k \to f$, $f_k' \to g$, uniformly. Then passing to the limit in

$f_k(x) - f_k(a) = \int_a^x f_k'(t)\,dt,$

show that $f$ is differentiable and $f'(x) = g(x)$ $\forall x$.]

9.34 ¶. Let $X$ be a metric space and let $Y$ be a complete metric space. Show that the space of bounded and continuous functions from $X$ into $Y$, endowed with the metric

$d_\infty(f,g) := \sup_{x \in X} d_Y(f(x), g(x)),$

is a complete metric space.


Figure 9.3. Consider a wave shaped function, e.g., $f(x) = 1/(1 + x^2)$, and its translates $f_n(x) := 1/(1 + (x + n)^2)$. Then $\|f_n\|_\infty = 1$, while $f_n(x) \to 0$ for all $x \in \mathbb{R}$.

b. Pointwise and uniform convergence

Let $A$ be a set and let $Y$ be a normed space with norm $|\cdot|_Y$. We say that $\{f_n\}$, $f_n : A \to Y$, converges pointwise to $f : A \to Y$ in $A$ if
$$|f_n(x) - f(x)|_Y \to 0 \qquad \forall x \in A,$$
while we say that $\{f_n\}$ converges uniformly to $f$ in $A$ if
$$\|f_n - f\|_{\infty,A} \to 0.$$
Since for all $x \in A$ we have $|f_n(x) - f(x)|_Y \le \|f_n - f\|_{\infty,A}$, uniform convergence trivially implies pointwise convergence, while the converse is generally false. For instance, a sequence of continuous functions may converge pointwise to a discontinuous function, and in this case the convergence cannot be uniform, as shown by the sequence $f_n(x) := x^n$, $x \in [0,1]$, that converges to the function $f$ which vanishes for all $x \in [0,1[$, while $f(1) = 1$. Of course, a sequence of continuous functions may also converge pointwise and not uniformly to a continuous function, compare Figure 9.3.

More explicitly, $f_n \to f$ pointwise in $A$ if
$$\forall x \in A,\ \forall \epsilon > 0\ \ \exists\, \bar n = \bar n(x,\epsilon) \text{ such that } |f_n(x) - f(x)|_Y < \epsilon \text{ for all } n \ge \bar n,$$
while $f_n \to f$ uniformly in $A$ if
$$\forall \epsilon > 0\ \ \exists\, \bar n = \bar n(\epsilon) \text{ such that } |f_n(x) - f(x)|_Y < \epsilon \text{ for all } n \ge \bar n \text{ and all } x \in A.$$
Therefore, we have pointwise convergence or uniform convergence according to whether the index $\bar n$ depends on or is independent of the point $x$.
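The difference between the two notions can be checked numerically. The following is a minimal sketch and not part of the original text: the helper names `sup_on_grid`, `power_gap` and `translate` are illustrative, and a maximum over a finite grid only bounds the true supremum from below.

```python
def sup_on_grid(g, points):
    """Maximum of |g| over a finite grid (a lower bound for the sup norm)."""
    return max(abs(g(x)) for x in points)


def power_gap(n, points):
    """Grid estimate of sup |x**n - 0| over the given points of [0, 1)."""
    return sup_on_grid(lambda x: x ** n, points)


def translate(n):
    """f_n(x) = 1 / (1 + (x + n)^2), the translated bump of Figure 9.3."""
    return lambda x: 1.0 / (1.0 + (x + n) ** 2)


if __name__ == "__main__":
    grid = [k / 10000 for k in range(10000)]  # sample points of [0, 1)
    for n in (1, 10, 100, 1000):
        # value at the fixed point x = 0.5 versus the grid maximum
        print(n, 0.5 ** n, power_gap(n, grid))
```

At the fixed point $x = 0.5$ the values $x^n$ collapse to zero, while the grid maximum over $[0,1[$ stays close to 1 for every $n$; this is what one expects from convergence that is pointwise but not uniform on $[0,1[$. Similarly, `translate(n)` always takes the value 1 at $x = -n$.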

c. A convergence diagram

For series of functions $f_n : A \to Y$, we shall write
$$f(x) = \sum_{n=1}^\infty f_n(x) \qquad \forall x \in A$$


9. Spaces of Continuous Functions, Banach Spaces and Abstract Equations

Figure 9.4. The relationships among the different notions of convergence for series of functions. The diagram relates absolute convergence in $B(A,Y)$, absolute convergence in $Y$, uniform convergence (i.e., convergence in $B(A,Y)$), and convergence in $Y$.

if the partial sums converge pointwise in $A$, and
$$f(x) = \sum_{n=1}^\infty f_n(x) \qquad \text{uniformly in } A$$
if the partial sums converge uniformly. Simply writing $\sum_{n=1}^\infty f_n(x) = f(x)$ is, in fact, ambiguous.

Summarizing, we introduced four different types of convergence for series of functions from a set $A$ into a normed space $Y$. More precisely, if $\{f_n\} \subset B(A,Y)$ and $f \in B(A,Y)$, we say that
(i) $\sum_{n=0}^\infty f_n(x)$ converges pointwise to $f$ if for all $x \in A$ we have $\sum_{n=0}^\infty f_n(x) = f(x)$ in $Y$, i.e., for all $x \in A$, $\|\sum_{n=0}^p f_n(x) - f(x)\|_Y \to 0$ as $p \to \infty$,
(ii) $\sum_{n=0}^\infty f_n(x)$ converges absolutely in $Y$ for all $x \in A$, i.e., for any fixed $x \in A$, the series of nonnegative real numbers $\sum_{n=0}^\infty \|f_n(x)\|_Y$ converges,
(iii) $\sum_{n=0}^\infty f_n(x)$ converges uniformly in $A$ to $f$ if $\sum_{n=0}^\infty f_n = f$ in $B(A,Y)$, i.e., $\|\sum_{n=0}^p f_n - f\|_{B(A,Y)} \to 0$ as $p \to \infty$,
(iv) $\sum_{n=0}^\infty f_n(x)$ converges absolutely in $B(A,Y)$ if the series of nonnegative real numbers $\sum_{n=0}^\infty \|f_n\|_{B(A,Y)}$ converges.

Clearly (iv) implies (ii), and (iii) implies (i). Moreover, (iv) implies (iii) and (ii) implies (i) if $Y$ is a Banach space; the other implications are false, see Example 9.35 below.

9.35 Example. Consider functions $f : \mathbb{R}_+ \to \mathbb{R}$. Choosing $f_n(x) := (-1)^n/n$, we see that $\sum_{n=1}^\infty f_n(x)$ converges pointwise and uniformly, but not absolutely in $\mathbb{R}$ or in $B(\mathbb{R}_+,\mathbb{R})$.

Let $f(x) := \sin x / x$, $x > 0$, and, for any $n \in \mathbb{N}$,
$$f_n(x) := \begin{cases} \dfrac{\sin x}{x} & \text{if } n \le x < n+1, \\ 0 & \text{otherwise.} \end{cases}$$
Since $c_1/n \le \|f_n\|_\infty \le c_2/n$, $\sum_n f_n$ does not converge absolutely in $B(\mathbb{R}_+,\mathbb{R})$. But $f(x) = \sum_{n=0}^\infty f_n(x)$ converges pointwise $\forall x \in \mathbb{R}_+$ and also absolutely in $\mathbb{R}$ for all $x \in \mathbb{R}_+$. Finally,
$$f(x) - \sum_{n=0}^p f_n(x) = \begin{cases} \dfrac{\sin x}{x} & \text{if } x \ge p+1, \\ 0 & \text{otherwise,} \end{cases}$$
hence
$$\Big\| f - \sum_{n=0}^p f_n \Big\|_\infty \le \frac{1}{p+1} \to 0 \qquad \text{as } p \to \infty,$$
therefore $\sum_n f_n$ converges uniformly, that is, in $B(\mathbb{R}_+,\mathbb{R})$. Here the convergence is uniform in $B(\mathbb{R}_+,\mathbb{R})$ but not absolute in $B(\mathbb{R}_+,\mathbb{R})$, because the functions $f_n$ take their maxima at different points and the maximum of the sum is much smaller than the sum of the maxima.
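The two behaviours in Example 9.35 can be observed numerically. The sketch below is illustrative only: it assumes the blocks $f_n = \sin x/x$ on $[n, n+1[$, the function names are hypothetical, and sup norms are estimated on a finite grid, so they are lower bounds.

```python
import math


def block_sup(n, samples=2000):
    """Grid estimate of ||f_n||_inf, with f_n = sin(x)/x on [n, n+1) and 0 elsewhere."""
    best = 0.0
    for j in range(samples):
        x = n + (j + 0.5) / samples  # midpoints, so x = 0 is never hit
        best = max(best, abs(math.sin(x)) / x)
    return best


def sum_of_sups(p):
    """Partial sum of the sup norms ||f_0|| + ... + ||f_p|| (grows like log p)."""
    return sum(block_sup(n) for n in range(p + 1))


def tail_sup(p, width=50):
    """Grid estimate of sup |sin x / x| over [p+1, p+1+width), the remainder after p terms."""
    return max(block_sup(n) for n in range(p + 1, p + 1 + width))
```

The partial sums of the sup norms keep growing, roughly like a harmonic series, while the sup norm of the remainder is bounded by $1/(p+1)$; this matches uniform convergence without absolute convergence in $B(\mathbb{R}_+,\mathbb{R})$.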

9.36 Theorem (Dini). Let $X$ be a compact metric space and let $\{f_n\}$ be a monotonic sequence of continuous functions $f_n : X \to \mathbb{R}$ that converges pointwise to a continuous function $f$. Then $f_n$ converges uniformly to $f$.

9.37 ¶. Show Dini's theorem. [Hint: Assuming that $f_n$ converges by decreasing to 0, for all $\epsilon > 0$ and for all $x \in X$ there exists a neighborhood $V_x$ of $x$ such that $|f_n(y)| < \epsilon$ $\forall y \in V_x$ for all $n$ larger than some $\bar n(x)$. Then use the compactness of $X$. Alternatively, use the uniform continuity theorem, Theorem 6.35.]

9.38 ¶. Exhibit a sequence $\{f_n\}$ that converges pointwise to zero and does not converge uniformly in any interval of $\mathbb{R}$. [Hint: Choose an ordering of the rationals $\{r_n\}$ and consider the sequence $f_n(x) := \ldots$]

[…]
$$\rho(E,F) := \sup\Big\{ \sup_{x\in E} d(x,F),\ \sup_{x\in F} d(x,E) \Big\}.$$
Show that $\rho$ is a distance on $\mathcal{C}$. Now suppose that $X$ is a compact metric space and $Y$ is a normed space. Show that $\{f_n\}$ converges uniformly to $f$ if and only if the graphs in $X \times Y$ of the $f_n$'s converge to the graph of $f$ with respect to the Hausdorff distance.
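For finite point sets the Hausdorff distance $\rho$ above can be computed directly from its definition. The sketch below is only an illustration under that finiteness assumption: the names `hausdorff` and `graph` are hypothetical, and a sampled graph merely approximates the true graph of a function.

```python
import math


def hausdorff(E, F):
    """Hausdorff distance between two finite, nonempty sets of points in the plane."""
    def one_sided(A, B):
        # sup over a in A of the distance from a to the set B
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(one_sided(E, F), one_sided(F, E))


def graph(f, a, b, m=200):
    """Sample the graph of f on [a, b] at m + 1 equally spaced abscissas."""
    xs = [a + (b - a) * i / m for i in range(m + 1)]
    return [(x, f(x)) for x in xs]
```

When two functions are sampled at the same abscissas, the distance between the sampled graphs is at most the largest vertical gap between them, so a small uniform distance forces a small Hausdorff distance between the samples.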

d. Uniform convergence on compact subsets

9.41 Example. We have seen in Theorem 7.14 of [GM2] that a power series with radius of convergence $\rho > 0$ converges totally, hence uniformly, on every disk of radius $r < \rho$. This does not mean that it converges uniformly in the open disk $\{|z| < \rho\}$. For instance, the geometric series $\sum_{n=0}^\infty x^n$ has radius of convergence 1 and, if $|x| < 1$,
$$\frac{1}{1-x} - \sum_{n=0}^p x^n = \frac{x^{p+1}}{1-x};$$
consequently, for all $p$,
$$\sup_{x\in]-1,1[} \Big| \frac{1}{1-x} - \sum_{n=0}^p x^n \Big| = +\infty.$$
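The remainder formula for the geometric series makes the contrast easy to test. This is a small sketch with hypothetical helper names, not taken from the text.

```python
def remainder(x, p):
    """1/(1-x) minus the partial sum x^0 + ... + x^p, computed directly for |x| < 1."""
    partial = sum(x ** n for n in range(p + 1))
    return 1.0 / (1.0 - x) - partial


def closed_form(x, p):
    """The same remainder written as x^(p+1) / (1 - x)."""
    return x ** (p + 1) / (1.0 - x)
```

On a fixed disk $|x| \le r < 1$ the remainder is at most $r^{p+1}/(1-r)$ and tends to zero, whereas for any fixed $p$ it is arbitrarily large for $x$ close to 1.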

Figure 9.5. Giulio Ascoli (1843-1896) and the first page of Sulle serie di funzioni by Cesare Arzelà (1847-1912).

9.42 Definition. Let $A \subset \mathbb{R}^n$ and let $Y$ be a normed space. We say that a sequence of functions $\{f_n\}$, $f_n : A \to Y$, converges uniformly on compact subsets of $A$ to $f : A \to Y$ if for every compact subset $K \subset A$ we have $\|f_n - f\|_{\infty,K} \to 0$ as $n \to \infty$.

9.43 ¶. Let $f_n, f : \overline{\Omega} \to \mathbb{R}$ be continuous functions defined on the closure of an open set $\Omega$ of $\mathbb{R}^n$. Show that $\{f_n\}$ converges uniformly to $f$ on $\overline{\Omega}$ if and only if $\{f_n\}$ converges uniformly to $f$ in $\Omega$.

9.44 ¶. Let $\{f_n\}$, $f_n : A \subset \mathbb{R}^n \to Y$, be a sequence of continuous functions that converges uniformly on compact subsets of $A$ to $f : A \to Y$. Show that $f$ is continuous.

9.2.2 A compactness theorem

At the end of the nineteenth century, especially in the works of Vito Volterra (1860-1940), Giulio Ascoli (1843-1896) and Cesare Arzelà (1847-1912), there appears the idea of considering functions $\mathcal{F}$ whose values depend on the values of a function, the so-called funzioni di linee, functions of lines; one of the main motivations came from the calculus of variations. This eventually led to the notion of abstract spaces of Maurice Fréchet (1878-1973). In this context, a particularly relevant result is the compactness criterion now known as the Ascoli-Arzelà theorem.

a. Equicontinuous functions

Let $X$ be a metric space and $Y$ a normed space.


9.45 Definition. We say that a subset $\mathcal{F}$ of $B(X,Y)$ is equibounded, or uniformly bounded, by some constant $M > 0$ if we have
$$\|f(x)\|_Y \le M \qquad \forall x \in X,\ \forall f \in \mathcal{F}.$$
We say that the family of functions $\mathcal{F}$ is equicontinuous if for all $\epsilon > 0$ there is $\delta > 0$ such that
$$\|f(x) - f(y)\|_Y < \epsilon \qquad \forall x, y \in X \text{ with } d_X(x,y) < \delta, \text{ and } \forall f \in \mathcal{F}.$$

9.46 Definition (Hölder-continuous functions). Let $X, Y$ be metric spaces. We say that a function $f : X \to Y$ is Hölder-continuous with exponent $\alpha$, $0 < \alpha \le 1$, if there is a constant $M$ such that
$$d_Y(f(x),f(y)) \le M\, d_X(x,y)^\alpha \qquad \forall x, y \in X,$$
and we denote by $C^{0,\alpha}(X,Y)$ the space of these functions.

Clearly the space $C^{0,1}(X,Y)$ is the space $\mathrm{Lip}(X,Y)$ of Lipschitz-continuous functions from $X$ into $Y$. On $C^{0,\alpha}(X,Y) \cap B(X,Y)$, $0 < \alpha \le 1$, we introduce the norm
$$\|f\|_{C^{0,\alpha}} := \sup_{x\in X} \|f(x)\|_Y + \sup_{x,y\in X,\ x\neq y} \frac{\|f(x) - f(y)\|_Y}{d_X(x,y)^\alpha}, \qquad (9.7)$$
and it is easy to show the following.

9.47 Proposition. $C^{0,\alpha}(X,Y) \cap B(X,Y)$ endowed with the norm (9.7) is a Banach space if $Y$ is a Banach space.

Bounded subsets, with respect to the norm (9.7), of $C^{0,\alpha}(X,Y) \cap B(X,Y)$ provide examples of equicontinuous families. See the exercises at the end of this chapter for more on Hölder-continuous functions.

b. The Ascoli-Arzelà theorem

9.48 Theorem (Ascoli-Arzelà). Every sequence of functions $\{f_n\}$ in $C^0([a,b])$ which is equibounded and equicontinuous has a subsequence that is uniformly convergent.

More generally, we have the following.

9.49 Theorem. Let $X$ be a compact metric space. A subset $\mathcal{F}$ of $C^0(X,\mathbb{R})$ is relatively compact if and only if $\mathcal{F}$ is equibounded and equicontinuous.

Proof. We recall that a subset of a metric space is relatively compact or precompact if and only if it is totally bounded, see Theorem 6.8. If $\mathcal{F}$ is relatively compact, then $\mathcal{F}$ is totally bounded and, in particular, equibounded in $C^0(X,\mathbb{R})$. For any $\epsilon > 0$, let $f_1, \dots, f_{n_\epsilon} \in \mathcal{F}$ be an $\epsilon$-net for $\mathcal{F}$, i.e., $\forall f \in \mathcal{F}$, $\|f - f_i\| < \epsilon$ for some $f_i$. Since the $f_i$'s are uniformly continuous, there is a $\delta_\epsilon$ such that
$$d_X(x,y) < \delta_\epsilon \quad \text{implies} \quad |f_i(x) - f_i(y)| < \epsilon, \qquad i = 1, 2, \dots, n_\epsilon.$$

Figure 9.6. Two pages respectively from Le curve limiti di una varietà data di curve by Giulio Ascoli (1843-1896) and from Sulle funzioni di linee by Cesare Arzelà (1847-1912).

Given $f \in \mathcal{F}$ we choose $i_0$, $1 \le i_0 \le n_\epsilon$, in such a way that $\|f - f_{i_0}\| < \epsilon$. Then
$$|f(x) - f(y)| \le |f(x) - f_{i_0}(x)| + |f_{i_0}(x) - f_{i_0}(y)| + |f_{i_0}(y) - f(y)| \le 2\|f - f_{i_0}\| + |f_{i_0}(x) - f_{i_0}(y)| < 3\epsilon$$
for $d_X(x,y) < \delta_\epsilon$, hence $\mathcal{F}$ is an equicontinuous family.

Conversely, suppose $\mathcal{F}$ is equibounded and equicontinuous. Again by Theorem 6.8, it suffices to show that $\mathcal{F}$ is totally bounded. Let $\epsilon > 0$. From the compactness of $X$ and the equicontinuity of $\mathcal{F}$, we infer that there exists a finite family of open balls $B(x_i, r_i)$, $i = 1, \dots, n$, that cover $X$ and such that
$$|f(y) - f(x_i)| < \epsilon \qquad \forall y \in B(x_i, r_i),\ \forall f \in \mathcal{F}.$$
Since the set $K := \{f(x_i) \mid 1 \le i \le n,\ f \in \mathcal{F}\}$ is bounded, we find $y_1, y_2, \dots, y_m \in \mathbb{R}$ such that $K \subset \cup_{j=1}^m B(y_j, \epsilon)$. The set $\mathcal{F}$ is covered by the finite union of the sets
$$\mathcal{F}_\pi := \{ f \in \mathcal{F} \mid f(x_i) \in B(y_{\pi(i)}, \epsilon),\ i = 1, \dots, n \},$$
with $\pi$ varying among the maps $\pi : \{1, \dots, n\} \to \{1, \dots, m\}$. Therefore, it suffices to show that $\mathrm{diam}\, \mathcal{F}_\pi \le 4\epsilon$. Since for $f_1, f_2 \in \mathcal{F}_\pi$ and $x \in B(x_i, r_i)$ we have
$$|f_1(x) - f_2(x)| \le |f_1(x) - f_1(x_i)| + |f_1(x_i) - y_{\pi(i)}| + |y_{\pi(i)} - f_2(x_i)| + |f_2(x_i) - f_2(x)| < 4\epsilon,$$
the proof is concluded. □

9.50 ¶. Notice that the sequence $\{f_n\}$ of wave shaped functions in Figure 9.3,
$$f_n(x) := \frac{1}{1 + (x+n)^2}, \qquad x \in \mathbb{R},\ n \in \mathbb{N},$$
is equicontinuous and equibounded, but not relatively compact.
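One way to see the lack of compactness numerically is to look at the mutual distances of the translates. The sketch below uses hypothetical helper names and only evaluates a lower bound of the uniform distance at one point.

```python
def f(n, x):
    """The translated bump f_n(x) = 1 / (1 + (x + n)^2)."""
    return 1.0 / (1.0 + (x + n) ** 2)


def sup_distance_lower_bound(n, m):
    """|f_n - f_m| evaluated at x = -n, a lower bound for ||f_n - f_m||_inf."""
    return abs(f(n, -n) - f(m, -n))
```

Since $f_n(-n) = 1$ and $f_m(-n) \le 1/2$ for $m \neq n$, any two distinct translates stay at uniform distance at least $1/2$, so no subsequence can be Cauchy in the uniform norm. This does not contradict Theorem 9.49, because $\mathbb{R}$ is not compact.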

9.3 Approximation Theorems


9.51 ¶. Theorem 9.49 can be formulated in slightly more general forms that are proved to hold with the same proof of Theorem 9.49.

Theorem. Let $X$ be a compact metric space, and let $Y$ be a Banach space. A subset $\mathcal{F} \subset C(X,Y)$ is relatively compact if and only if $\mathcal{F}$ is equicontinuous and, for every $x$, the set $\mathcal{F}_x$ of all values $f(x)$ of $f \in \mathcal{F}$ is relatively compact in $Y$.

Theorem. Let $X$ and $Y$ be metric spaces. Suppose $X$ is compact. A sequence $\{f_n\} \subset C(X,Y)$ converges uniformly if and only if $\{f_n\}$ is equicontinuous and there exists a compact set $K \subset Y$ such that $\{f_n(x)\}$ is contained in a $\delta$-neighborhood of $K$ for $n$ sufficiently large.

9.52 ¶. Show the following.

Proposition. Let $X, Y$ be two metric spaces and let $\Omega \subset X$ be compact. Then the subsets of $C^{0,\alpha}(\Omega,Y)$ that are bounded in the $\|\cdot\|_{C^{0,\alpha}}$ norm are relatively compact in $C^0(\Omega,Y)$.

9.3 Approximation Theorems In this section we deal with the following questions: Can we approximate a continuous function uniformly, and with given precision, by a polynomial? Under which conditions are classes of smooth functions dense with respect to the uniform convergence in the class of continuous functions?

9.3.1 Weierstrass and Bernstein theorems

a. Weierstrass's approximation theorem

In 1885 Karl Weierstrass (1815-1897) proved the following.

9.53 Theorem (Weierstrass, I). Every continuous function in a closed bounded interval $[a,b]$ is the uniform limit of a sequence of polynomials.

In particular, for every $n$ there exists a polynomial $Q_n(x)$ (of degree $d = d(n)$ sufficiently large) such that $|f(x) - Q_n(x)| < 2^{-n}$ $\forall x \in [a,b]$. If we set
$$P_1(x) := Q_1(x), \qquad P_n(x) := Q_n(x) - Q_{n-1}(x), \quad n > 1,$$
we therefore conclude that every continuous function $f(x)$ can be written in a closed and bounded interval as the (infinite) sum of polynomials,
$$f(x) = \sum_{n=1}^\infty P_n(x) \qquad \text{uniformly in } [a,b].$$


We recall that, in general, a continuous function is not the sum of a power series, since the sum of a power series is at least a function of class $C^\infty$, compare [GM2]. Many proofs of Weierstrass's theorem are nowadays available; in this section we shall illustrate some of them. This will allow us to discuss a number of facts that are individually relevant.

A first proof of Theorem 9.53. We first observe, following Henri Lebesgue (1875-1941), that in order to approximate uniformly in $[a,b]$ any continuous function, it suffices to approximate the function $|x|$, $x \in [-1,1]$. In fact, any continuous function in $[a,b]$ can be approximated, uniformly in $[a,b]$, by continuous and piecewise linear functions. Thus it suffices to approximate continuous and piecewise linear functions. Let $f(x)$ be one of such functions. Then there exist points $x_0 = a < x_1 < x_2 < \dots < x_r < x_{r+1} = b$ such that $f'(x)$ takes a constant value $d_k$ in each interval $]x_k, x_{k+1}[$. Then, in $[a,b]$ we have
$$f(x) = f(a) + \sum_{k=0}^r (d_k - d_{k-1})\, \varphi_{x_k}(x), \qquad d_{-1} = 0,$$
where $\varphi_c(x) := \max(x - c, 0) = \frac12\big((x - c) + |x - c|\big)$. If we are able to approximate $|x - x_k|$, $x \in [a,b]$, uniformly by polynomials $\{Q_{k,n}\}$, then the polynomials
$$P_n(x) := f(a) + \sum_{k=0}^r (d_k - d_{k-1})\, \frac12 \big((x - x_k) + Q_{k,n}(x)\big)$$
approximate $f(x)$ uniformly in $[a,b]$. By a linear change of variable, it then suffices to approximate $|x|$ uniformly in $[-1,1]$.

This can be done in several ways. For instance, noticing that if $x \in [-1,1]$, then $1 - |x|$ solves the equation in $z$
$$z = \frac12 \big[ z^2 + (1 - x^2) \big],$$
one considers the discrete process
$$z_{n+1}(x) = \frac12 \big[ z_n^2(x) + (1 - x^2) \big], \quad n \ge 0, \qquad z_0(x) = 1.$$
It is then easily seen that the polynomials $z_n(x)$ satisfy
(i) $z_n(x) \ge 0$ in $[-1,1]$,
(ii) $z_n(x) \ge z_{n+1}(x)$,
(iii) $z_n(x)$ converges pointwise to $1 - |x|$ if $x \in [-1,1]$.
Since $1 - |x|$ is continuous, Dini's theorem, Theorem 9.36, yields that the polynomials $z_n(x)$ converge uniformly to $1 - |x|$ in $[-1,1]$.

Alternatively one shows, using the binomial series, that
$$\sqrt{1 - x} = \sum_{n=0}^\infty c_n x^n, \qquad c_n := \binom{1/2}{n} (-1)^n,$$
in $]-1,1[$. Then one proves that the series converges absolutely in $C^0([-1,1])$, hence uniformly in $[-1,1]$. In fact, we observe that $c_n := \binom{1/2}{n}(-1)^n$ is negative for $n \ge 1$ hence,

Figure 9.7. The first page of Weierstrass's paper on approximation by polynomials and the Leçons sur les propriétés extrémales by Sergei Bernstein (1880-1968).

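$$\sum_{n=0}^N |c_n| = 2 - \sum_{n=0}^N c_n = 2 - \lim_{x\to 1^-} \sum_{n=0}^N c_n x^n \le 2 - \lim_{x\to 1^-} \sum_{n=0}^\infty c_n x^n = 2 - \lim_{x\to 1^-} \sqrt{1 - x} = 2.$$
Replacing $1 - x$ with $x^2$, it follows that
$$|x| = \sum_{n=0}^\infty c_n (1 - x^2)^n \qquad \text{uniformly in } [-1,1].$$

9.54 ¶. Add details to the previous proof.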
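Both constructions in the proof above are easy to run. The sketch below is only illustrative: the function names are hypothetical, and the recurrence $c_{n+1} = c_n (n - \tfrac12)/(n+1)$ for the binomial coefficients is a standard identity used here for convenience rather than something stated in the text.

```python
def z_iter(x, steps):
    """Iterate z_0 = 1, z_{k+1} = (z_k^2 + 1 - x^2) / 2 and return z_steps(x)."""
    z = 1.0
    for _ in range(steps):
        z = 0.5 * (z * z + 1.0 - x * x)
    return z


def abs_by_series(x, terms):
    """Partial sum of sum_n c_n (1 - x^2)^n with c_n = (-1)^n * binom(1/2, n)."""
    t = 1.0 - x * x
    c, power, total = 1.0, 1.0, 0.0
    for n in range(terms):
        total += c * power
        c *= (n - 0.5) / (n + 1)  # c_{n+1} = c_n * (n - 1/2) / (n + 1)
        power *= t
    return total
```

Here `1 - z_iter(x, k)` and `abs_by_series(x, N)` are both polynomial approximations of $|x|$ on $[-1,1]$. Convergence is slow near $x = 0$ in both cases, which is consistent with the fact that $|x|$ is not differentiable there.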

b. Bernstein's polynomials

Another proof of Theorem 9.53, grounded in probabilistic ideas, see Exercise 9.57, and giving explicit formulas for the approximating polynomials, is due to Sergei Bernstein (1880-1968). It is enough to consider functions defined in $[0,1]$ instead of in a generic interval $[a,b]$.

9.55 Definition. Let $f \in C^0([0,1])$. Bernstein polynomials of $f$ are
$$B_n(x) := \sum_{k=0}^n f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1 - x)^{n-k}, \qquad n \ge 1.$$


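9.56 Theorem (Bernstein). Bernstein's polynomials of $f$ converge uniformly in $[0,1]$ to $f$.

Proof. We split the proof into three steps.

Step 1. The following identities hold
$$\sum_{k=0}^n \binom{n}{k} x^k (1 - x)^{n-k} = 1, \qquad (9.8)$$
$$\sum_{k=0}^n (k - nx)^2 \binom{n}{k} x^k (1 - x)^{n-k} = nx(1 - x). \qquad (9.9)$$
The first is trivial: it follows from the binomial formula $(a + b)^n = \sum_{k=0}^n \binom{n}{k} a^k b^{n-k}$ by choosing $a = x$ and $b = 1 - x$. The second needs some computation. Fix $n \ge 1$. Starting from the identities
$$\sum_{k=0}^n k \binom{n}{k} y^k = y\, D\Big( \sum_{k=0}^n \binom{n}{k} y^k \Big) = ny(y + 1)^{n-1},$$
$$\sum_{k=0}^n k^2 \binom{n}{k} y^k = ny(ny + 1)(y + 1)^{n-2},$$
we replace $y$ by $x/(1 - x)$ and multiply each of the equalities by $(1 - x)^n$. It follows that
$$\sum_{k=0}^n k \binom{n}{k} x^k (1 - x)^{n-k} = nx,$$
$$\sum_{k=0}^n k^2 \binom{n}{k} x^k (1 - x)^{n-k} = nx(nx + 1 - x).$$
Multiplying (9.8) and the two previous identities respectively by $n^2x^2$, $-2nx$ and $1$, and summing, we infer (9.9).

Step 2. As $x(1 - x) \le \frac14$, (9.9) yields
$$\sum_{k=0}^n \Big( \frac{k}{n} - x \Big)^2 \binom{n}{k} x^k (1 - x)^{n-k} \le \frac{1}{4n}. \qquad (9.10)$$
Fix $\delta > 0$ and $x \in [0,1]$, and denote by $\Delta_n(x)$ the set of $k$ in $\{0, 1, \dots, n\}$ such that
$$\Big| \frac{k}{n} - x \Big| \ge \delta;$$
(9.10) then yields
$$\sum_{k \in \Delta_n(x)} \binom{n}{k} x^k (1 - x)^{n-k} \le \frac{1}{4n\delta^2}, \qquad (9.11)$$
that is, for $n$ large, the terms that mostly contribute to the sum in (9.8) are the ones with index $k$ such that $|k/n - x| < \delta$.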

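Step 3. Set $M := \sup_{x\in[0,1]} |f(x)|$ and, given $\epsilon > 0$, let $\delta$ be such that $|f(x) - f(y)| < \epsilon$ for $|x - y| < \delta$. Then we have
$$B_n(x) - f(x) = \sum_{k=0}^n \Big[ f\Big(\frac{k}{n}\Big) - f(x) \Big] \binom{n}{k} x^k (1 - x)^{n-k} = \sum_{k \in \Gamma_n(x)} \dots + \sum_{k \in \Delta_n(x)} \dots,$$
where
$$\Gamma_n(x) := \{0, \dots, n\} \setminus \Delta_n(x) = \Big\{ k \in \{0, \dots, n\} \ \Big|\ \Big| \frac{k}{n} - x \Big| < \delta \Big\}.$$
For $k \in \Gamma_n(x)$, i.e., $|k/n - x| < \delta$, we have $|f(k/n) - f(x)| < \epsilon$, hence
$$\Big| \sum_{k \in \Gamma_n(x)} \Big[ f\Big(\frac{k}{n}\Big) - f(x) \Big] \binom{n}{k} x^k (1 - x)^{n-k} \Big| \le \epsilon \sum_{k=0}^n \binom{n}{k} x^k (1 - x)^{n-k} = \epsilon;$$
on the other hand, if $|k/n - x| \ge \delta$, (9.11) yields
$$\Big| \sum_{k \in \Delta_n(x)} \Big[ f\Big(\frac{k}{n}\Big) - f(x) \Big] \binom{n}{k} x^k (1 - x)^{n-k} \Big| \le 2M \sum_{k \in \Delta_n(x)} \binom{n}{k} x^k (1 - x)^{n-k} \le \frac{M}{2n\delta^2}.$$
Therefore, we conclude for $n$ large enough so that $M/(2n\delta^2) < \epsilon$,
$$|B_n(x) - f(x)| < 2\epsilon \qquad \text{uniformly in } [0,1]. \qquad □$$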
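Definition 9.55 and identity (9.9) translate directly into code. The following is a minimal sketch, assuming only the Python standard library; the function names are illustrative.

```python
from math import comb


def bernstein(f, n, x):
    """Value at x of the n-th Bernstein polynomial of f on [0, 1]."""
    return sum(
        f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def second_moment(n, x):
    """Left-hand side of (9.9): sum of (k - n x)^2 * C(n, k) x^k (1 - x)^(n - k)."""
    return sum(
        (k - n * x) ** 2 * comb(n, k) * x ** k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )
```

For $f(x) = x$ the Bernstein polynomials reproduce $f$, and for $f(x) = x^2$ one gets $x^2 + x(1-x)/n$, which follows from the moment identities of Step 1. For a merely continuous function such as $|x - \tfrac12|$ the uniform error decays slowly, of order $n^{-1/2}$ in this example.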

9.57 ¶. The previous proof has the following probabilistic formulation. Let $0 \le p \le 1$ and let $X_n(p)$ be a random variable with binomial distribution
$$P(\{X_n(p) = r/n\}) = \binom{n}{r} p^r (1 - p)^{n-r}.$$
If $f : [0,1] \to \mathbb{R}$ is a function, the expectation of $f(X_n(t))$ is given by
$$E[f(X_n(t))] = \sum_{r=0}^n f\Big(\frac{r}{n}\Big) \binom{n}{r} t^r (1 - t)^{n-r},$$
and one shows in the theory of probability that $E[f(X_n(t))]$ converges uniformly to $f$.

c. Weierstrass's approximation theorem for periodic functions

We denote by $C^0_T$ the class of continuous periodic functions in $\mathbb{R}$ with period $T > 0$.

9.58 Theorem (Weierstrass, II). Every function $f \in C^0_T$ is the uniform limit of trigonometric polynomials with period $T$.


In Section 9.3 we shall give a direct proof of this theorem and in Section 11.5 we shall give another proof of it: the Fejér theorem. It is worth noticing that, in general, a continuous function is neither the uniform nor the pointwise sum of a trigonometric series.

Here we shall prove that the claims in Theorems 9.58 and 9.53 are equivalent. By a linear change of variable we may assume that $T = 2\pi$. First let us prove the following.

9.59 Lemma. Let $f \in C^0([-\pi,\pi])$ be even. Then for any $\epsilon > 0$ there is an even trigonometric polynomial
$$T(x) := \sum_{k=0}^n a_k \cos kx$$
such that $|f(x) - T(x)| < \epsilon$ $\forall x \in [-\pi,\pi]$.

Proof. We apply Theorem 9.53 to the continuous function $g(y) := f(\arccos y)$, $y \in [-1,1]$, to obtain
$$\Big| f(\arccos y) - \sum_{k=0}^n c_k y^k \Big| < \epsilon \qquad \text{in } [-1,1],$$
hence
$$\Big| f(x) - \sum_{k=0}^n c_k \cos^k x \Big| < \epsilon \qquad \text{in } [0,\pi].$$
Since both $f$ and $\sum_{k=0}^n c_k \cos^k x$ are even, the estimate holds in $[-\pi,\pi]$. To conclude, it suffices to notice that $\sum_{k=0}^n c_k \cos^k x$ is an even trigonometric polynomial. □

Proof of Theorem 9.58. Let $f \in C^0_{2\pi}(\mathbb{R})$. We consider the two even functions in $[-\pi,\pi]$
$$f(x) + f(-x), \qquad (f(x) - f(-x)) \sin x.$$
Then Lemma 9.59 yields for any $\epsilon > 0$
$$f(x) + f(-x) = T_1(x) + \alpha_1(x), \qquad (f(x) - f(-x)) \sin x = T_2(x) + \alpha_2(x),$$
$T_1(x)$ and $T_2(x)$ being two even trigonometric polynomials, and for the remainders $\alpha_1$ and $\alpha_2$ one has $|\alpha_1(x)|, |\alpha_2(x)| < \epsilon$ in $[-\pi,\pi]$. Multiplying the first equation by $\sin^2 x$ and the second by $\sin x$, summing and dividing by 2, we find
$$f(x) \sin^2 x = T_3(x) + \alpha_3(x), \qquad (9.12)$$
for $T_3(x)$ a trigonometric polynomial and $\|\alpha_3\|_{\infty,[-\pi,\pi]} < 2\epsilon$. The same argument applies to $f(x - \pi/2)$, yielding
$$f\Big(x - \frac{\pi}{2}\Big) \sin^2 x = T_4(x) + \alpha_4(x),$$
where $T_4$ is a trigonometric polynomial and $\|\alpha_4\|_{\infty,[-\pi,\pi]} < 2\epsilon$. By changing the variable $x$ in $x + \frac{\pi}{2}$, we then infer
$$f(x) \cos^2 x = T_5(x) + \alpha_5(x), \qquad (9.13)$$
where $T_5(x) := T_4(x + \frac{\pi}{2})$ and $\|\alpha_5\|_{\infty,[-\pi,\pi]} < 2\epsilon$. Summing (9.12) and (9.13) we finally conclude the proof. □


9.60 Remark. Actually, the two Weierstrass theorems are equivalent. We have already proved Theorem 9.58 using Theorem 9.53. We now outline how to deduce the first Weierstrass theorem, Theorem 9.53, from Theorem 9.58, leaving the details to the reader. Given $f \in C^0([-\pi,\pi])$, the function
$$g(x) := f(x) + \frac{f(-\pi) - f(\pi)}{2\pi}\, x$$
satisfies $g(\pi) = g(-\pi)$, hence $g$ can be extended to a continuous periodic map of period $2\pi$. According to Theorem 9.58, for any $\epsilon > 0$ we find a trigonometric polynomial
$$T_\epsilon(x) := a_0 + \sum_{k=1}^{n(\epsilon)} (a_k \cos kx + b_k \sin kx)$$
with $|g(x) - T_\epsilon(x)| < \epsilon$ for all $x \in [-\pi,\pi]$. Next, we approximate $\sin kx$ and $\cos kx$ by polynomials (e.g., by Taylor polynomials), concluding that there is a polynomial $Q_\epsilon(x)$ with $|T_\epsilon(x) - Q_\epsilon(x)| < \epsilon$ $\forall x \in [-\pi,\pi]$, hence
$$|g(x) - Q_\epsilon(x)| < 2\epsilon \qquad \text{in } [-\pi,\pi].$$

9.3.2 Convolutions and Dirac approximations

We now introduce a procedure that allows us to find smooth approximations of functions.

a. Convolution product

Here we confine ourselves to considering only continuous functions defined on the entire line. The choice of the entire line as a domain is not a restriction, since every continuous function on an interval $[a,b]$ can be extended to a continuous function in $\mathbb{R}$ and, actually for any $\delta > 0$, to a continuous function that vanishes outside $[a - \delta, b + \delta]$.

9.61 Example (Integral means). Let $f : \mathbb{R} \to \mathbb{R}$ be continuous. For any $\delta > 0$ consider the mean function of $f$
$$f_\delta(x) := \frac{1}{2\delta} \int_{x-\delta}^{x+\delta} f(\xi)\, d\xi, \qquad x \in \mathbb{R}. \qquad (9.14)$$
Simple consequences of the fundamental theorem of calculus are
(i) $f_\delta(x)$ is Lipschitz continuous,
(ii) $f_\delta(x) \to f(x)$ pointwise,
while from the estimate
$$|f_\delta(x) - f(x)| \le \sup_{|y-x|<\delta} |f(y) - f(x)|$$
and Theorem 6.35
(iii) $f_\delta(x) \to f(x)$ uniformly on every bounded interval of $\mathbb{R}$.


The above allows us, of course, to uniformly approximate continuous functions with Lipschitz-continuous functions on every bounded interval.
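The integral means of Example 9.61 can be approximated with a simple quadrature. The sketch below is illustrative only: it uses a midpoint rule with a hypothetical parameter `m` for the number of subintervals, so the values are numerical approximations of $f_\delta$.

```python
def integral_mean(f, x, delta, m=1000):
    """Midpoint-rule approximation of (1 / (2 delta)) * integral of f over [x - delta, x + delta]."""
    h = 2.0 * delta / m
    total = sum(f(x - delta + (j + 0.5) * h) for j in range(m))
    return total * h / (2.0 * delta)
```

For $f(x) = |x|$ one finds $f_\delta(x) = |x|$ when $|x| \ge \delta$ and $f_\delta(0) = \delta/2$, so the uniform error is $\delta/2$ and tends to zero with $\delta$, in agreement with (iii).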

9.62 Definition. Let $f, g : \mathbb{R} \to \mathbb{R}$ be two Riemann integrable functions. Suppose that $g(x - t) f(t)$ is summable in $\mathbb{R}$ for any $x \in \mathbb{R}$. Then the function
$$g * f(x) := \int_{\mathbb{R}} g(x - y) f(y)\, dy, \qquad x \in \mathbb{R},$$
called the convolution product of $f$ and $g$, is well defined.

Clearly the map $(f, g) \to g * f$
(i) is a bilinear operator,
(ii) is commutative, $g * f = f * g$, since
$$g * f(x) = \int_{\mathbb{R}} g(x - y) f(y)\, dy = \int_{\mathbb{R}} g(z) f(x - z)\, dz = f * g(x),$$
(iii) if $f$ and $g$ are summable in $[a,b]$ and $f$ vanishes outside the interval $[a,b]$, then $g * f$ is well defined in $\mathbb{R}$ and
$$|g * f(x)| \le \int_a^b |g(x - y)|\, |f(y)|\, dy \le \|g\|_{\infty,[x-b,x-a]} \int_a^b |f(y)|\, dy. \qquad (9.15)$$

9.63 Example. The function $f_\delta$ in (9.14) is the convolution product of $f$ and
$$g(x) := \begin{cases} \dfrac{1}{2\delta} & \text{if } |x| \le \delta, \\ 0 & \text{if } |x| > \delta. \end{cases}$$

9.64 Example. If $g(t) = \sum_{k=0}^n c_k t^k$ is a polynomial of degree $n$, then for any $f$ that vanishes outside an interval,
$$g * f(x) = \sum_{k=0}^n c_k \sum_{j=0}^k \binom{k}{j} (-1)^{k-j} \Big( \int y^{k-j} f(y)\, dy \Big) x^j$$
is again a polynomial of degree at most $n$.

9.65 Example. If $f = 0$ outside $[-\pi,\pi]$, then
$$e^{ikx} * f(x) = \int_{-\pi}^{\pi} f(y)\, e^{ik(x-y)}\, dy = 2\pi c_k e^{ikx},$$
i.e., $e^{ikx} * f$ is the $k$th harmonic component of the periodic extension of $f$, compare Section 11.5.
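The formula of Example 9.64 says that convolving with a polynomial kernel only requires the moments of $f$. The sketch below is a hedged illustration of that bookkeeping: the function name and the convention that `moments[m]` holds $\int y^m f(y)\,dy$ are assumptions made here, not notation from the text.

```python
from math import comb


def conv_poly_coeffs(c, moments):
    """Coefficients, in increasing powers of x, of g * f for g(t) = sum_k c[k] t^k.

    moments[m] must contain the integral of y^m f(y) dy for m = 0, ..., len(c) - 1.
    """
    out = [0.0] * len(c)
    for k, ck in enumerate(c):
        for j in range(k + 1):
            out[j] += ck * comb(k, j) * (-1) ** (k - j) * moments[k - j]
    return out
```

As a check, take $f$ equal to 1 on $[0,1]$ and 0 elsewhere, so that the moments are $1, \tfrac12, \tfrac13$, and $g(t) = t^2$: a direct computation gives $\int_0^1 (x - y)^2\, dy = x^2 - x + \tfrac13$, which is what the formula returns.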

9.66 Theorem. Let $g \in C^k(\mathbb{R})$, and let $f$ be Riemann summable. Suppose that either $f$ or $g$ vanishes outside a bounded interval $[a,b]$. Then $g * f \in C^k(\mathbb{R})$ and $D_k(g * f)(x) = (D_k g) * f(x)$ $\forall x \in \mathbb{R}$.


Proof. We prove the claim when $f = 0$ outside $[a,b]$; the other case, $g = 0$ outside $[a,b]$, is similar. By (9.15) we then have
$$|g * f(x)| \le \|g\|_{\infty,[x-b,x-a]}\, \|f\|_1, \qquad \|f\|_1 := \int_a^b |f(y)|\, dy,$$
hence
$$\|g * f\|_{\infty,[c,d]} \le \|g\|_{\infty,[c-b,d-a]}\, \|f\|_1.$$

(i) We now prove that $g * f \in C^0(\mathbb{R})$ if $g \in C^0(\mathbb{R})$. In fact,
$$g * f(x + h) - g * f(x) = \int \big( g(x - y + h) - g(x - y) \big) f(y)\, dy = G * f(x),$$
where $G(x) := g(x + h) - g(x)$. Therefore, using (9.15), we get
$$|g * f(x + h) - g * f(x)| \le \|g(t + h) - g(t)\|_{\infty,[x-b,x-a]}\, \|f\|_1 \to 0 \qquad (9.16)$$
as $h \to 0$ since $\|g(t + h) - g(t)\|_{\infty,[x-b,x-a]} \to 0$, because of the uniform continuity of $g$ on compact sets.

(ii) Similarly, we prove that $g * f \in C^1(\mathbb{R})$ if $g \in C^1(\mathbb{R})$. We have
$$\frac{g * f(x + h) - g * f(x)}{h} - \int g'(x - y) f(y)\, dy = H * f(x),$$
where $H(x) := \frac{g(x+h) - g(x)}{h} - g'(x)$, hence
$$\Big| \frac{g * f(x + h) - g * f(x)}{h} - g' * f(x) \Big| \le \Big\| \frac{g(t + h) - g(t)}{h} - g'(t) \Big\|_{\infty,[x-b,x-a]} \|f\|_1.$$
Since
$$\Big| \frac{g(x + h) - g(x)}{h} - g'(x) \Big| = \Big| \frac1h \int_x^{x+h} (g'(y) - g'(x))\, dy \Big| \le \frac{1}{|h|} \Big| \int_x^{x+h} |g'(y) - g'(x)|\, dy \Big| \le \sup_{|y-x|<|h|} |g'(y) - g'(x)| \to 0 \quad \text{as } h \to 0$$
because of the uniform continuity of $g'$ on compact sets, we then conclude that $g * f$ is differentiable at $x$ and that $(g * f)'(x) = g' * f(x)$ $\forall x \in \mathbb{R}$. Finally $(g * f)' = g' * f$ is continuous by (i).

(iii) The general case is then proved by induction. □

9.67 Remark. Let $f$ and $g$ be summable and let one of them vanish outside a bounded interval. If $f$ instead of $g$ is of class $C^k(\mathbb{R})$, then, recalling that $g * f = f * g$, we infer from Theorem 9.66 that $g * f \in C^k(\mathbb{R})$ and $D_k(g * f)(x) = g * (D_k f)(x)$. Therefore if both $f$ and $g$ are of class $C^k(\mathbb{R})$, then
$$D_k(g * f)(x) = (D_k g) * f(x) = g * (D_k f)(x)$$
and, in general, $g * f$ is as smooth as the smoother of $f$ and $g$.


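b. Mollifiers

9.68 Definition. A function $k(x) \in C^\infty(\mathbb{R})$ such that
$$k(x) = k(-x), \qquad k(x) \ge 0, \qquad k(x) = 0 \text{ for } |x| \ge 1, \qquad \int_{\mathbb{R}} k(x)\, dx = 1$$
is called a smoothing kernel.

9.69 ¶. The function
$$\varphi(x) := \begin{cases} \exp\Big( -\dfrac{1}{1 - x^2} \Big) & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1 \end{cases}$$
is $C^\infty(\mathbb{R})$, nonnegative, even and with finite integral. Hence the map $k(x) := \frac{1}{A} \varphi(x)$, where $A := \int_{\mathbb{R}} \varphi(x)\, dx$, is a smoothing kernel.

Given a smoothing kernel $k(x)$, we can generate the family
$$k_\epsilon(x) := \epsilon^{-1} k\Big( \frac{x}{\epsilon} \Big), \qquad \epsilon > 0.$$
Trivially, $k_\epsilon(-x) = k_\epsilon(x)$ and
$$k_\epsilon \in C^\infty(\mathbb{R}), \qquad k_\epsilon(x) \ge 0, \qquad k_\epsilon(x) = 0 \text{ for } |x| \ge \epsilon, \qquad \int_{\mathbb{R}} k_\epsilon(x)\, dx = 1.$$
Also $\|k_\epsilon\|_\infty = \|k\|_\infty / \epsilon$.

9.70 Definition. Given a smoothing kernel $k$, the mollifiers or smoothing operators $S_\epsilon$ are defined by
$$S_\epsilon f(x) := k_\epsilon * f(x) = \int_{\mathbb{R}} k_\epsilon(x - y) f(y)\, dy.$$
We have
$$S_\epsilon f(x) = k_\epsilon * f(x) = \int_{x-\epsilon}^{x+\epsilon} k_\epsilon(x - y) f(y)\, dy = \frac{1}{\epsilon} \int_{\mathbb{R}} k\Big( \frac{x - y}{\epsilon} \Big) f(y)\, dy = \int_{-1}^{1} k(z) f(x - \epsilon z)\, dz.$$
Since the functions $k_\epsilon$ are of class $C^\infty$, the functions $S_\epsilon f(x)$, $x \in \mathbb{R}$, are of class $C^\infty$ by Theorem 9.66. Moreover, as shown by the next theorem, they converge to $f$ in norms that are as strong as the differentiability of $f$; for instance, they converge uniformly or in norm $C^1$ if $f$ is continuous or if $f$ is of class $C^1$, respectively.

9.71 Proposition. Let $f \in C^0(\mathbb{R})$. Then
(i) $S_\epsilon f \in C^\infty(\mathbb{R})$, $\forall \epsilon > 0$;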


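(ii) if $f = 0$ in $[a,b]$, then $S_\epsilon f(x) = 0$ in $[a + \epsilon, b - \epsilon]$;
(iii) $(S_\epsilon f)'(x) = \dfrac{1}{\epsilon^2} \displaystyle\int_{\mathbb{R}} k'\Big( \frac{x - y}{\epsilon} \Big) f(y)\, dy$;
(iv) $S_\epsilon f \to f$ as $\epsilon \to 0$ uniformly in any bounded interval $[a,b]$.
Moreover, if $f \in C^1(\mathbb{R})$, then $(S_\epsilon f)'(x) = (S_\epsilon f')(x)$ $\forall x \in \mathbb{R}$ and $(S_\epsilon f)'(x) \to f'(x)$ uniformly on any bounded interval $[a,b]$.

Proof. (i), (iii) follow from Theorem 9.66, and (ii) follows from the definition. (iv) If $f \in C^0(\mathbb{R})$ and $x \in \mathbb{R}$ we have
$$|f(x) - S_\epsilon f(x)| = \Big| \int_{\mathbb{R}} k_\epsilon(x - y) [f(y) - f(x)]\, dy \Big| = |k_\epsilon * (f - f(x))(x)| \le \sup_{|y-x|<\epsilon} |f(y) - f(x)| \int_{\mathbb{R}} k_\epsilon(y)\, dy = \sup_{|y-x|<\epsilon} |f(y) - f(x)|.$$
Since $f$ is uniformly continuous on bounded intervals in $\mathbb{R}$, $\sup_{|y-x|<\epsilon} |f(y) - f(x)| \to 0$, consequently $|S_\epsilon f(x) - f(x)| \to 0$ as $\epsilon \to 0$ uniformly on compact sets of $\mathbb{R}$. If $f \in C^1(\mathbb{R})$, we have already proved in Theorem 9.66 that $S_\epsilon f$ is of class $C^1$ and that $(S_\epsilon f)'(x) = S_\epsilon f'(x)$. Applying (iv) to $S_\epsilon f$ and $S_\epsilon f'$ we then reach the claim. □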
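The smoothing operators can be tried out numerically. The sketch below is an illustration under stated assumptions: it uses the bump $\varphi(x) = \exp(-1/(1-x^2))$ as kernel, replaces the integral $\int_{-1}^{1} k(z) f(x - \epsilon z)\,dz$ by a midpoint rule with a hypothetical parameter `m`, and normalises the discrete weights so that they sum to one.

```python
import math


def phi(x):
    """Bump function exp(-1 / (1 - x^2)) on ]-1, 1[, extended by 0."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def mollify(f, x, eps, m=2000):
    """Midpoint-rule approximation of S_eps f(x) = integral of k(z) f(x - eps z) over [-1, 1]."""
    h = 2.0 / m
    nodes = [-1.0 + (j + 0.5) * h for j in range(m)]
    weights = [phi(z) for z in nodes]
    total = sum(weights)  # discrete stand-in for the normalising constant A
    return sum(w * f(x - eps * z) for w, z in zip(weights, nodes)) / total
```

Because the discrete weights are nonnegative and sum to one, the computed value is an average of values of $f$ on $[x - \epsilon, x + \epsilon]$; hence the estimate $|S_\epsilon f(x) - f(x)| \le \sup_{|y-x|<\epsilon} |f(y) - f(x)|$ from the proof also holds for this discretisation.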

c. Approximation of the Dirac mass

The family $\{k_\epsilon\}$ is often referred to as an approximation of the Dirac delta. In applications, the Dirac $\delta$ is often "defined" as a function vanishing at every point but zero and with the property that
$$\int_{-\infty}^{+\infty} \delta(x)\, dx = 1;$$
sometimes it is "defined", with respect to convolution, as if it would operate like
$$\int_{-\infty}^{+\infty} \delta(x - y) f(y)\, dy = f(x).$$
Of course, no such function exists in the classical sense; but it can be thought of as a linear operator from $C^0(\mathbb{R})$ into $\mathbb{R}$,
$$\delta(f) := f(0).$$
We shall avoid dealing directly with $\delta$, as the correct context for doing this is the theory of distributions, and we set

9.72 Definition. A sequence of nonnegative functions $D_n : \mathbb{R} \to \mathbb{R}$ with the properties that for any interval $[a,b]$ and for any $\rho > 0$ we have
$$\int_{B(0,\rho)} D_n(x)\, dx \to 1, \qquad \int_{[a,b] \setminus B(0,\rho)} D_n(x)\, dx \to 0, \qquad \text{as } n \to \infty, \qquad (9.17)$$
is called an approximation of the $\delta$.

Figure 9.8. Approximations of the Dirac $\delta$.

9.73 ¶. Let $\{D_n\}$ be an approximation of $\delta$ and let $f$ be a continuous function in $[a,b]$. Show that
$$\lim_{n\to\infty} \int_a^b D_n(x - y) f(y)\, dy = f(x) \qquad \forall x \in ]a,b[.$$

It is easy to prove the following.

9.74 Theorem. Let $\{D_n\}$ be an approximation of $\delta$. Suppose that each $D_n$ is continuous in $\mathbb{R}$ and let $f$ be a continuous function in $[a,b]$. Then the functions
$$f_n(x) := \int_a^b D_n(x - y) f(y)\, dy, \qquad x \in [a,b],$$
converge uniformly to $f$ in every interval $[c,d]$ strictly contained in $[a,b]$.

Theorem 9.74 uses, in an essential way, the fact that the approximations of $\delta$ are nonnegative. For instance, the result in Theorem 9.74 does not hold for the sequence of the Dirichlet kernels, since the Fourier series of $f$ does not converge to $f$ if $f$ is merely continuous, compare Section 11.5.

9.75 ¶. Prove Theorem 9.74.

9.76 ¶. Consider in $[-1,1]$ the sequence of functions
$$D_n(x) := c_n (1 - x^2)^n \qquad \text{where} \qquad c_n := \frac{1}{\int_{-1}^{1} (1 - x^2)^n\, dx}.$$
Show that for every $\rho \in ]0,1[$
$$\lim_{n\to\infty} \frac{\int_\rho^1 (1 - t^2)^n\, dt}{\int_0^1 (1 - t^2)^n\, dt} = 0.$$
Infer that $\{D_n\}$ is an approximation of $\delta$, hence, for $f$ continuous in $[0,1]$, the functions
$$f_n(x) := c_n \int_0^1 \big( 1 - (t - x)^2 \big)^n f(t)\, dt$$
converge uniformly to $f$ on compact sets of $]0,1[$. Finally, observing that the functions $f_n(x)$ are actually polynomials of degree not greater than $2n$, called Stieltjes polynomials, deduce from the above Weierstrass's theorem, Theorem 9.53.
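The concentration of the kernels $c_n(1 - x^2)^n$ and the resulting polynomial approximations can be observed numerically. The sketch below is only illustrative: it assumes $f$ is given on $[0,1]$, uses a midpoint rule with a hypothetical parameter `m`, and normalises with $c_n = 1/(2\int_0^1 (1-t^2)^n\,dt)$, which uses the evenness of the kernel.

```python
def landau_mass(n, lo, hi, m=4000):
    """Midpoint-rule approximation of the integral of (1 - t^2)^n over [lo, hi]."""
    h = (hi - lo) / m
    return sum((1.0 - (lo + (j + 0.5) * h) ** 2) ** n for j in range(m)) * h


def landau_approx(f, n, x, m=4000):
    """Approximation of c_n * integral over [0, 1] of (1 - (t - x)^2)^n f(t) dt."""
    c_n = 1.0 / (2.0 * landau_mass(n, 0.0, 1.0, m))
    h = 1.0 / m
    ts = [(j + 0.5) * h for j in range(m)]
    return c_n * sum((1.0 - (t - x) ** 2) ** n * f(t) for t in ts) * h
```

For large $n$ almost all of the mass of the kernel sits near the origin, so `landau_approx(f, n, x)` is close to $f(x)$ at interior points $x$ of $]0,1[$.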

Figure 9.9. Frontispieces of L'approximation des fonctions by Charles de la Vallée-Poussin (1866-1962) and of I. P. Natanson, Constructive Function Theory, New York, 1964.

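Consider the functions
$$D_n(t) := c_n \cos^{2n}\Big( \frac{t}{2} \Big), \qquad t \in [-\pi,\pi],$$
where, see 2.66 of [GM2],
$$c_n := \frac{1}{\int_{-\pi}^{\pi} \cos^{2n}\big( \frac{t}{2} \big)\, dt} = \frac{1}{2\pi} \frac{(2n)!!}{(2n-1)!!}.$$
As proved below, we have the following.

9.77 Lemma. The sequence $\{D_n\}$ is an approximation of $\delta$.

Hence, as a consequence of Theorem 9.74, we can state the following.

9.78 Theorem (de la Vallée Poussin). Let $f \in C^0([-\pi,\pi])$. The functions
$$T_n(x) := \frac{1}{2\pi} \frac{(2n)!!}{(2n-1)!!} \int_{-\pi}^{\pi} f(t) \cos^{2n}\Big( \frac{t - x}{2} \Big)\, dt \qquad (9.18)$$
converge uniformly to $f$ in every interval $[a,b]$ with $-\pi < a < b < \pi$.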
Proof of Lemma 9.77. (i) Since $\cos t$ is decreasing in $[0,\pi/2]$, we have for $0 < \rho < \pi/2$
$$\int_\rho^{\pi/2} \cos^{2n}(t)\,dt \le \Bigl(\frac{\pi}{2}-\rho\Bigr)\cos^{2n}(\rho) \le \frac{\pi}{2}\cos^{2n}(\rho);$$
on the other hand, since $\cos t$ is concave in $[0,\pi/2]$, we have $\cos t \ge 1 - 2t/\pi$, hence
$$\int_0^{\pi/2}\cos^{2n}(t)\,dt \ge \int_0^{\pi/2}\Bigl(1-\frac{2t}{\pi}\Bigr)^{2n}dt = \frac{\pi}{2(2n+1)};$$
we therefore conclude
$$\frac{\int_\rho^{\pi/2}\cos^{2n}(t)\,dt}{\int_0^{\pi/2}\cos^{2n}(t)\,dt} \le \frac{2(2n+1)}{\pi}\,\frac{\pi}{2}\,\cos^{2n}(\rho) \to 0 \qquad \text{as } n\to\infty$$
and
$$\lim_{n\to\infty}\frac{\int_0^{\rho}\cos^{2n}(t)\,dt}{\int_0^{\pi/2}\cos^{2n}(t)\,dt} = 1. \tag{9.19}$$
□

The functions $T_n(x)$ in (9.18) are often called de la Vallée Poussin integrals.

9.79 Remark. Let $g \in C^0(\mathbb{R})$ be a periodic function with period $2\pi$. Applying Theorem 9.78 to $f(x) := g(3x)$, $x \in [-\pi,\pi]$, we deduce the uniform convergence of $\{T_n(x)\}$ to $g(3x)$ in $[-\pi/3,\pi/3]$, i.e., the uniform convergence of $\{T_n(x/3)\}$ to $g(x)$ in $[-\pi,\pi]$. Since the $T_n$'s are trigonometric polynomials of degree at most $2n$, we may deduce at once the second Weierstrass theorem from Theorem 9.78.
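The convergence of the de la Vallée Poussin integrals can be observed numerically. The following is only a sketch under stated assumptions: the integral is replaced by a midpoint rule with `m` nodes, and the names `vp_constant`, `vp_integral` and the test function $f(t) = |t|$ are our own choices for illustration, not part of the text.

```python
import math

def double_factorial_ratio(n):
    """Return (2n)!! / (2n-1)!! as a float, built up factor by factor."""
    r = 1.0
    for k in range(1, n + 1):
        r *= (2 * k) / (2 * k - 1)
    return r

def vp_constant(n):
    """Normalizing constant c_n = (2n)!! / (2*pi*(2n-1)!!)."""
    return double_factorial_ratio(n) / (2 * math.pi)

def vp_integral(f, n, x, m=4000):
    """Midpoint-rule approximation (an assumption of this sketch) of
    T_n(x) = c_n * int_{-pi}^{pi} f(t) cos^{2n}((t - x)/2) dt."""
    h = 2 * math.pi / m
    total = 0.0
    for j in range(m):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * math.cos((t - x) / 2) ** (2 * n)
    return vp_constant(n) * total * h

# The kernel has unit mass, and T_n f(x) approaches f(x) as n grows.
mass = vp_integral(lambda t: 1.0, 10, 0.0)
errors = [abs(vp_integral(abs, n, 0.5) - 0.5) for n in (5, 20, 80)]
```

Here `mass` should be close to 1, and the entries of `errors` should decrease with $n$, in agreement with Lemma 9.77 and Theorem 9.78.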

9.3.3 The Stone-Weierstrass theorem

Weierstrass's theorems can be generalized to and seen as consequences of the following theorem proved in 1937 by Marshall Stone (1903-1989) and known as the Stone-Weierstrass theorem.

Let $X$ be a compact metric space and let $C(X) = C^0(X,\mathbb{R})$ be the Banach space of continuous functions with uniform norm. An algebra of functions $\mathcal{A}$ is a real (complex) linear space of functions $f : X \to \mathbb{R}$ (respectively, $f : X \to \mathbb{C}$) such that $fg \in \mathcal{A}$ if $f$ and $g \in \mathcal{A}$. We say that $\mathcal{A}$ distinguishes between the points of $X$ if for any two distinct points $x$ and $y$ in $X$ there is a function $f$ in $\mathcal{A}$ such that $f(x) \ne f(y)$. We say that $\mathcal{A}$ contains the constants if the constant functions belong to $\mathcal{A}$.

9.80 Theorem (Stone-Weierstrass). Let $X$ be a compact metric space and let $\mathcal{A}$ be an algebra of continuous real-valued functions, $\mathcal{A} \subset C^0(X,\mathbb{R})$. If $\mathcal{A}$ contains the constants and if it also distinguishes between the points of $X$, then $\mathcal{A}$ is dense in the Banach space $C^0(X,\mathbb{R})$.

Let $\mathcal{A}$ be an algebra of bounded and continuous functions. As we have seen, the function $|y|$ can be approximated uniformly in $[-1,1]$ by polynomials. Consequently, if $f \in \mathcal{A}$, by considering instead of $f$ the function $h := f/\|f\|_\infty$, we can approximate $x \to |f(x)|$ uniformly by the functions $P_n(f(x))$ where $\{P_n\}$ is a sequence of polynomials. Since the $P_n(f(x))$'s belong to $\mathcal{A}$, as $\mathcal{A}$ is an algebra of functions and $f \in \mathcal{A}$, we conclude that $|f|$ belongs to the uniform closure of $\mathcal{A}$, and also
$$\max(f,g) := \frac12\bigl(f+g+|f-g|\bigr), \qquad \min(f,g) := \frac12\bigl(f+g-|f-g|\bigr)$$
are in the uniform closure of $\mathcal{A}$, if both $f$ and $g$ are in the uniform closure of $\mathcal{A}$. A linear space of functions $\mathcal{R}$ with the property that $\max(f,g)$ and $\min(f,g)$ are in $\mathcal{R}$ if $f$ and $g \in \mathcal{R}$ is called a linear lattice: the above can then be restated by saying that the closure of $\mathcal{A}$ is a linear lattice. To prove that $\mathcal{A}$ is dense in $C^0(X,\mathbb{R})$, it therefore suffices to prove the following.

9.81 Theorem. Let $X$ be a compact metric space. A linear lattice $\mathcal{R} \subset C^0(X,\mathbb{R})$ is dense, provided it contains the constants and distinguishes between the points in $X$.

Proof. First we show that, for any $f \in C^0(X,\mathbb{R})$ and any couple of distinct points $x,y \in X$, we can find a function $\psi_{x,y} \in \mathcal{R}$ such that
$$\psi_{x,y}(x) = f(x), \qquad \psi_{x,y}(y) = f(y).$$
In fact, by hypothesis, we can choose $w \in \mathcal{R}$ such that $w(x) \ne w(y)$; then the function
$$\psi_{x,y}(t) := \frac{(f(x)-f(y))\,w(t) - (f(x)\,w(y) - f(y)\,w(x))}{w(x)-w(y)}$$
has the required property.

Given $f \in C^0(X,\mathbb{R})$, $\epsilon > 0$ and $y \in X$, for every $x \in X$ we find a ball $B(x,r_x)$ such that $\psi_{x,y}(t) > f(t) - \epsilon$ $\forall t \in B(x,r_x)$. Since $X$ is compact, we can cover it by a finite number of these balls $\{B(x_i,r_{x_i})\}$ and we set $\varphi_y := \max_i \psi_{x_i,y}$. Then $\varphi_y(y) = f(y)$, $\varphi_y(t) > f(t) - \epsilon$ $\forall t \in X$, and $\varphi_y \in \mathcal{R}$ since $\mathcal{R}$ is a lattice. We now let $y$ vary, and for any $y$ we find $B(y,r_y)$ such that $\varphi_y(t) < f(t) + \epsilon$ $\forall t \in B(y,r_y)$. Again covering $X$ by a finite number of these balls $\{B(y_i,r_{y_i})\}$, and setting $\varphi := \min_i \varphi_{y_i}$, we conclude $\varphi \in \mathcal{R}$ and $|\varphi(t) - f(t)| < \epsilon$ $\forall t \in X$, i.e., the claim. □
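The two elementary constructions used above, the lattice operations written through the absolute value and the two-point interpolant $\psi_{x,y}$, can be sketched in a few lines. This is only an illustration: the function names and the sample choices $f = \exp$, $w(t) = t$ are hypothetical and not part of the text.

```python
import math

def lattice_max(f, g):
    """max(f, g) = (f + g + |f - g|) / 2, pointwise."""
    return lambda t: 0.5 * (f(t) + g(t) + abs(f(t) - g(t)))

def lattice_min(f, g):
    """min(f, g) = (f + g - |f - g|) / 2, pointwise."""
    return lambda t: 0.5 * (f(t) + g(t) - abs(f(t) - g(t)))

def two_point_interpolant(f, w, x, y):
    """psi(t) = ((f(x) - f(y)) w(t) - (f(x) w(y) - f(y) w(x))) / (w(x) - w(y)).
    Requires w(x) != w(y), i.e. that w separates x and y."""
    denom = w(x) - w(y)
    if denom == 0:
        raise ValueError("w does not separate x and y")
    a = (f(x) - f(y)) / denom
    b = (f(x) * w(y) - f(y) * w(x)) / denom
    return lambda t: a * w(t) - b

# Sample use: psi agrees with exp at the two chosen points 0 and 1.
psi = two_point_interpolant(math.exp, lambda t: t, 0.0, 1.0)
upper = lattice_max(math.sin, math.cos)
```

With these choices `psi` is the affine function through $(0,1)$ and $(1,e)$, and `upper(t)` coincides with $\max(\sin t, \cos t)$.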

Of course real polynomials in $[a,b]$ form an algebra of continuous functions that contains the constants and distinguishes between the points of $[a,b]$. Thus the Stone-Weierstrass theorem implies the first Weierstrass theorem and, even more, we have the following.

9.82 Corollary. Every real-valued continuous function on a compact set $K \subset \mathbb{R}^n$ is the uniform limit in $K$ of a sequence of polynomials in $n$ variables.

Theorem 9.80 does not extend to algebras of complex-valued functions. In fact, in the theory of functions of complex variables one shows that the uniform limits of polynomials are actually analytic functions, and the map $z \to \bar z$, which is continuous, is not analytic. However, we have the following.

9.83 Theorem. Let $\mathcal{A} \subset C^0(X,\mathbb{C})$ be an algebra of continuous complex-valued functions defined on a compact metric space $X$. Suppose that $\mathcal{A}$ distinguishes between the points in $X$, contains all constant functions and contains the conjugate $\bar f$ of $f$ if $f \in \mathcal{A}$. Then $\mathcal{A}$ is dense in $C^0(X,\mathbb{C})$.


Figure 9.10. René-Louis Baire (1874-1932) and the frontispiece of his Leçons sur les fonctions discontinues.

Proof. Denote by $\mathcal{A}_0$ the subalgebra of $\mathcal{A}$ of real-valued functions. Of course $\Re f = \frac12(f+\bar f)$ and $\Im f = \frac{1}{2i}(f-\bar f)$ belong to $\mathcal{A}_0$ if $f \in \mathcal{A}$. Since $f(x) \ne f(y)$ implies that $\Re f(x) \ne \Re f(y)$ or $\Im f(x) \ne \Im f(y)$, $\mathcal{A}_0$ also distinguishes between the points of $X$ and, trivially, contains the real constants. It follows that $\mathcal{A}_0$ is dense in $C^0(X,\mathbb{R})$ and consequently, $\mathcal{A}$ is dense in $C^0(X,\mathbb{C})$. □

The real-valued trigonometric polynomials
$$a_0 + \sum_{k=1}^{n} (a_k\cos kx + b_k \sin kx) \tag{9.20}$$
form an algebra that distinguishes between the points of $[0,2\pi[$ and contains the constants. Thus, trigonometric polynomials are dense among continuous real-valued periodic functions of period $2\pi$. More generally, from Theorem 9.83 we infer the following.

9.84 Theorem. All continuous complex-valued functions defined on the unit sphere $\{z \in \mathbb{C} \mid |z| = 1\}$ are uniform limits of complex-valued trigonometric polynomials
$$\sum_{k=-n}^{n} c_k e^{ik\theta}.$$

9.3.4 The Yosida regularization

a. Baire's approximation theorem

The next theorem relates semicontinuous functions to continuous functions.

9.85 Theorem (Baire). Let $X$ be a metric space and let $f : X \to \mathbb{R}$ be a function that is bounded from above and upper semicontinuous. Then there is a decreasing sequence of continuous, actually Lipschitz continuous, functions $\{f_n\}$ such that $f_n(x) \to f(x)$ for all $x \in X$.

Proof. Consider the so-called Yosida regularization of $f$,
$$f_n(x) := \sup_{y \in X}\{f(y) - n\,d(y,x)\}.$$
Obviously $f(x) \le f_n(x) \le \sup f$ and $f_n(x) \ge f_{n+1}(x)$. We shall now show that each $f_n$ is Lipschitz continuous with Lipschitz constant not greater than $n$. Let $x,y \in X$ and assume that $f_n(x) \ge f_n(y)$. For all $\eta > 0$ there is $x' \in X$ such that
$$f_n(x) < f(x') - n\,d(x,x') + \eta,$$
hence
$$0 \le f_n(x) - f_n(y) \le f(x') - n\,d(x,x') + \eta - \bigl(f(x') - n\,d(y,x')\bigr) = n\bigl(d(y,x') - d(x,x')\bigr) + \eta \le n\,d(x,y) + \eta,$$
thus
$$|f_n(x) - f_n(y)| \le n\,d(x,y)$$
since $\eta$ is arbitrary.

Let us show that $f_n(x_0) \downarrow f(x_0)$. Denote by $M$ the $\sup_{x \in X} f(x)$. Since $f(x_0) \ge \limsup_{x\to x_0} f(x)$, for any $\lambda > f(x_0)$ there is a spherical neighborhood $B(x_0,\delta)$ of $x_0$ such that $f(x) < \lambda$ $\forall x \in B(x_0,\delta)$, hence
$$f(x) - n\,d(x_0,x) \le \begin{cases} \lambda & \text{if } d(x,x_0) < \delta, \\ M - n\delta & \text{if } d(x,x_0) \ge \delta. \end{cases}$$
Then $f(x) - n\,d(x_0,x) \le \lambda$ $\forall x \in X$, provided $n$ is sufficiently large, $n \ge (M - f(x_0))/\delta$, hence
$$f(x_0) \le f_n(x_0) = \sup_{x \in X}\bigl(f(x) - n\,d(x_0,x)\bigr) \le \lambda.$$
Since $\lambda > f(x_0)$ is arbitrary, we conclude $f(x_0) = \lim_{n\to\infty} f_n(x_0)$. □
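The properties of the Yosida regularization established in the proof can be checked by hand on a finite metric space. The sketch below takes as $X$ a finite grid of points of the real line with the usual distance (our assumption, chosen so that the supremum is a maximum) and as $f$ an upper semicontinuous step function; the name `yosida_upper` is hypothetical.

```python
def yosida_upper(points, values, n):
    """f_n(x) = max_y ( f(y) - n * |x - y| ) over a finite set of points of the real line."""
    return [max(fy - n * abs(x - y) for y, fy in zip(points, values)) for x in points]

# X is the grid {-1, -0.95, ..., 1}; f is the step function equal to 1 on [0, 1] and 0 elsewhere.
points = [k / 20 for k in range(-20, 21)]
values = [1.0 if x >= 0 else 0.0 for x in points]

f1 = yosida_upper(points, values, 1)    # 1-Lipschitz majorant of f
f2 = yosida_upper(points, values, 2)    # 2-Lipschitz majorant, below f1
f40 = yosida_upper(points, values, 40)  # on this grid n = 40 already reproduces f
```

On this grid one finds $f \le f_2 \le f_1$, each $f_n$ has Lipschitz constant at most $n$, and $f_n = f$ as soon as $n$ exceeds $(M - \min f)$ divided by the grid spacing, which mirrors the condition $n \ge (M - f(x_0))/\delta$ in the proof.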

Suppose that $X = \mathbb{R}^n$. An immediate consequence of Dini's theorem, Theorem 9.36, and of Baire's theorem, is the following.

9.86 Theorem. Let $f : \mathbb{R}^n \to \mathbb{R}$ be a function that is bounded from above and upper semicontinuous. Then there exists a sequence of Lipschitz-continuous functions that converges uniformly on compact sets to $f$.

b. Approximation in metric spaces

Yosida regularization also turns out to be useful to approximate uniformly continuous functions from a metric (or normed) space into $\mathbb{R}$ by Lipschitz-continuous functions. Let $X$ be a normed space with norm $\|\ \|$.

9.87 Proposition. The class of uniformly continuous functions $f : X \to \mathbb{R}$ is closed with respect to the uniform convergence.

Recall that the modulus of continuity of $f : X \to \mathbb{R}$ is defined for all $t \in [0,+\infty[$ by
$$\omega_f(t) := \sup\bigl\{ |f(x)-f(y)| \bigm| x,y \in X,\ \|x-y\| \le t \bigr\}. \tag{9.21}$$
Clearly $f$ is uniformly continuous on $X$ if and only if $\omega_f(t) \to 0$ as $t \to 0$.

9.88 ¶. Prove Proposition 9.87.

Lipschitz-continuous functions from $X$ to $\mathbb{R}$ are of course uniformly continuous, therefore uniform limits of Lipschitz-continuous functions are uniformly continuous too, on account of Proposition 9.87. We shall now prove the converse, compare Example 9.61.

9.89 Theorem. Every uniformly continuous function $f : X \to \mathbb{R}$ is the uniform limit of a sequence of Lipschitz-continuous functions.

In order to prove Theorem 9.89, we introduce the function $\delta_f(s)$ that measures the uniform distance of $f$ from the class $\mathrm{Lip}_s(X)$ of Lipschitz-continuous functions $g : X \to \mathbb{R}$ with Lipschitz constants not greater than $s$,
$$\delta_f(s) := \inf\bigl\{\|f-g\|_\infty \bigm| g \in \mathrm{Lip}_s(X)\bigr\}.$$

9.90 ¶. Show that $s \to \delta_f(s)$ is nonincreasing and that $f$ is the uniform limit of a sequence of Lipschitz-continuous functions if and only if $\delta_f(s) \to 0$ as $s \to \infty$.

Then we introduce the Yosida regularization of $f : X \to \mathbb{R}$ by
$$f_s(x) := \inf_{y \in X}\bigl\{f(y) + s\|x-y\|\bigr\}.$$

9.91 ¶. Show that
(i) $f_s$ is Lipschitz-continuous with Lipschitz constant $s$,
(ii) $f_s(x) \le f_t(x)$ $\forall x$ if $s < t$, and, actually,
(iii) $f_s$ is the largest $s$-Lipschitz-continuous function among functions less than or equal to $f$.

9.92 Proposition. Let $f : X \to \mathbb{R}$ be a uniformly continuous function. Then
$$\delta_f(s) = \frac12 \sup_{t \ge 0}\{\omega_f(t) - st\}. \tag{9.22}$$
Moreover, the minimum distance of $f$ from $\mathrm{Lip}_s(X)$ is attained at $g_s(x) := f_s(x) + \delta_f(s)$, i.e., $\delta_f(s) = \|f - g_s\|_\infty$.

Proof. Let $g \in \mathrm{Lip}_s(X)$. Then
$$|f(x) - f(y)| \le 2\|f-g\|_\infty + s\|x-y\|.$$
For $x,y$ such that $\|x-y\| \le t$, by taking the infimum with respect to $g$, we infer
$$|f(x) - f(y)| \le 2\delta_f(s) + st$$
and, taking the supremum in $x$ and $y$ with $\|x-y\| \le t$, we get $\omega_f(t) \le 2\delta_f(s) + st$, hence
$$\frac12 \sup_{t \ge 0}\{\omega_f(t) - st\} \le \delta_f(s). \tag{9.23}$$
Let us prove that the inequality (9.23) is actually an equality, and the second part of the claim. For $x,y \in X$ we have
$$f(x) - f(y) \le \omega_f(\|x-y\|) = \bigl(\omega_f(\|x-y\|) - s\|x-y\|\bigr) + s\|x-y\| \le \sup_{t \ge 0}\{\omega_f(t) - st\} + s\|x-y\|.$$
By taking the supremum in $y$ we get
$$0 \le f(x) - f_s(x) \le \sup_{t\ge 0}\{\omega_f(t) - st\} \le 2\delta_f(s) \qquad \forall x \in X,$$
hence, by (9.23), we infer
$$\|f - f_s\|_\infty \le \sup_{t \ge 0}\{\omega_f(t) - st\} \le 2\delta_f(s). \tag{9.24}$$
Therefore, for $g_s(x) := f_s(x) + \delta_f(s)$ we have
$$\|f - g_s\|_\infty \le \delta_f(s),$$
and, since $g_s \in \mathrm{Lip}_s(X)$, we conclude $\|f - g_s\|_\infty = \delta_f(s)$. Moreover, by (9.24),
$$\delta_f(s) = \|f - g_s\|_\infty = \|f - f_s\|_\infty - \delta_f(s) \le \sup_{t \ge 0}\{\omega_f(t) - st\} - \delta_f(s) \le \delta_f(s),$$
i.e.,
$$\frac12 \sup_{t \ge 0}\{\omega_f(t) - st\} = \delta_f(s).$$
□

9.93 ¶. Show that if $f^s := -(-f)_s$, then $f^s \in \mathrm{Lip}_s(X)$ and
$$\delta_f(s) = \|f - g^s\|_\infty, \qquad g^s(x) := f^s(x) - \delta_f(s).$$
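Formula (9.22) can be tested on a finite set of points, where every supremum and infimum is attained. The sketch below is only an illustration under assumptions of ours: $X$ is replaced by a finite grid in $[0,1]$ regarded as a metric space, $f$ is the square root, and the helper names are hypothetical. On a finite set, $\sup_t\{\omega_f(t) - st\}$ is attained at some $t = |x-y|$, so it reduces to a maximum over pairs of points.

```python
def inf_convolution(points, values, s):
    """Yosida regularization f_s(x) = min_y ( f(y) + s * |x - y| ) on a finite point set."""
    return [min(fy + s * abs(x - y) for y, fy in zip(points, values)) for x in points]

def half_sup_modulus(points, values, s):
    """(1/2) sup_t ( omega_f(t) - s t ), computed as a maximum over pairs of points."""
    return 0.5 * max(abs(fx - fy) - s * abs(x - y)
                     for x, fx in zip(points, values)
                     for y, fy in zip(points, values))

points = [k / 50 for k in range(0, 51)]
values = [x ** 0.5 for x in points]   # uniformly continuous, but not Lipschitz near 0
s = 2.0

f_s = inf_convolution(points, values, s)
delta = half_sup_modulus(points, values, s)   # right-hand side of (9.22)
g_s = [v + delta for v in f_s]                # candidate best approximation
distance = max(abs(a - b) for a, b in zip(values, g_s))
```

Up to rounding, `distance` coincides with `delta`, which is the content of the second part of Proposition 9.92 in this finite setting.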


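Proof of Theorem 9.89. It is enough to prove that $\inf_{s>0} \delta_f(s) = 0$. First, we notice that $\omega_f(t)$ is nondecreasing and subadditive. In fact, if $x,y$ are such that $\|x-y\| \le a+b$ and we write
$$z := \frac{b}{a+b}\,x + \frac{a}{a+b}\,y,$$
then $\|z-x\| \le a$ and $\|z-y\| \le b$; consequently, $|f(x) - f(z)| \le \omega_f(a)$ and $|f(y) - f(z)| \le \omega_f(b)$, which yield at once
$$\omega_f(a+b) \le \omega_f(a) + \omega_f(b).$$
Next, we observe that for any $\epsilon > 0$ and $t = m\epsilon + \sigma$, $m \in \mathbb{N}$, $\sigma < \epsilon$, we have $m \le t/\epsilon$ and
$$\omega_f(t) = \omega_f(m\epsilon + \sigma) \le m\,\omega_f(\epsilon) + \omega_f(\sigma) \le \omega_f(\epsilon)\,\frac{t}{\epsilon} + \omega_f(\epsilon).$$
Therefore
$$\delta_f(s) = \frac12 \sup_{t \ge 0}\{\omega_f(t) - st\} \le \frac12 \sup_{t \ge 0}\Bigl\{\Bigl(\frac{\omega_f(\epsilon)}{\epsilon} - s\Bigr)t\Bigr\} + \frac12\,\omega_f(\epsilon) = \frac12\,\omega_f(\epsilon) \qquad \text{for } s \ge \frac{\omega_f(\epsilon)}{\epsilon}.$$
From the last inequality we easily infer $\inf_{s>0} \delta_f(s) = 0$. □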

9.4 Linear Operators

9.4.1 Basic facts

In finite-dimensional spaces, linear maps are continuous, but this is no longer true in infinite-dimensional normed spaces, see Example 9.96.

9.94 ¶. Show the following.

Proposition. Let $X$ and $Y$ be normed spaces. Suppose that $X$ is finite dimensional. Then every linear map $L : X \to Y$ is continuous.

The following proposition characterizes linear maps between two Banach spaces that are continuous.

9.95 Proposition. Let $X$ and $Y$ be normed spaces and let $L : X \to Y$ be a linear map. Then the following conditions are equivalent:
(i) $L$ is continuous in $X$,
(ii) $L$ is continuous at $0$,
(iii) $L$ is bounded on the unit ball, i.e., there exists $K \ge 0$ such that $\|L(x)\|_Y \le K$ for all $x$ with $\|x\|_X \le 1$,
(iv) there exists $K \ge 0$ such that $\|L(x)\|_Y \le K\|x\|_X$ for all $x \in X$,
(v) $L$ is Lipschitz continuous.

Proof. If $L$ is continuous, then trivially, (ii) holds. If (ii) holds, then there exists $\delta > 0$ such that $\|L(x)\|_Y \le 1$ provided $\|x\|_X \le \delta$. This yields $\|L(x)\|_Y \le 1/\delta$ if $\|x\|_X \le 1$, since $L$ is linear, i.e., (iii). Assuming (iii) and the linearity of $L$, we infer that for all $x \ne 0$
$$\|L(x)\|_Y = \|x\|_X\,\Bigl\|L\Bigl(\frac{x}{\|x\|_X}\Bigr)\Bigr\|_Y \le K\|x\|_X,$$
i.e., (iv). (iv) in turn implies (v) since
$$\|L(x) - L(y)\|_Y = \|L(x-y)\|_Y \le K\|x-y\|_X \qquad \forall x,y \in X,$$
and trivially, (v) implies (i). □

9.96 E x a m p l e . Let X be a normed space and let {cn} C X be a countable system of independent vectors with ||en|| = 1, and let y C X be the subspace of finite linear combinations of {cn}. Consider the operator L :Y —^R defined on {cn} by L{en) '-= n Vn and linearly extended to Y. Evidently L is linear and not bounded.

Linear maps between Banach spaces are often called linear operators.

a. Continuous linear forms and hyperplanes

Consider a linear map $L : X \to \mathbb{K}$ defined on a linear normed space $X$, often called also a linear form. If $L$ is not identically zero, we can find $\bar x$ such that $\bar x \notin \ker L$ and we can decompose every $x \in X$ as
$$x = \frac{L(x)}{L(\bar x)}\,\bar x + \Bigl(x - \frac{L(x)}{L(\bar x)}\,\bar x\Bigr),$$
in other words $X = \mathrm{Span}\,\{\bar x\} \oplus \ker L$. However it may happen that $\ker L$ is dense in $X$.

9.97 Proposition. Let $L : X \to \mathbb{R}$ be a linear map defined on a normed space $X$. Then $\ker L$ is closed if and only if $L$ is continuous.

Proof. Trivially, $\ker L := L^{-1}(0)$ is closed if $L$ is continuous. Conversely, if $\ker L = X$, then $L$ is constant, hence continuous. Otherwise we can choose $\bar x$ such that $L(\bar x) = 1$. Since $\ker L$ is closed, also $H := \bar x + \ker L$ is closed; since $0 \notin H$, we can then find a ball $B(0,r)$ such that $B(0,r) \cap H = \emptyset$. We now prove that $L$ is continuous showing that
$$|L(x)| \le 1 \qquad \forall x \in B(0,r).$$
In fact, if $|L(x)| > 1$ for some $x \in B(0,r)$, then
$$\Bigl\|\frac{x}{L(x)}\Bigr\| = \frac{1}{|L(x)|}\,\|x\| < r \qquad \text{while} \qquad L\Bigl(\frac{x}{L(x)}\Bigr) = 1.$$
Since $H = \{x \mid L(x) = 1\}$, we conclude that $x/L(x) \in H \cap B(0,r)$, a contradiction. □

9.98 Corollary. If $L : X \to \mathbb{R}$ is a linear map on a normed space $X$, then $\ker L$ is either closed or dense in $X$.

In fact, the closure of $\ker L$ is a linear subspace that may agree either with $\ker L$ or with $X$.


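b. The space of linear continuous maps

For any linear map $L : X \to Y$ between two normed spaces with norms $\|\ \|_X$ and $\|\ \|_Y$, we define
$$\|L\|_{\mathcal{L}(X,Y)} := \sup_{\|x\|_X \le 1} \|L(x)\|_Y, \tag{9.25}$$
or, equivalently,
$$\|L\|_{\mathcal{L}(X,Y)} = \inf\bigl\{K \in \mathbb{R} \bigm| \|L(x)\|_Y \le K\|x\|_X\ \forall x \in X\bigr\},$$
so that
$$\|L(x)\|_Y \le \|L\|_{\mathcal{L}(X,Y)}\,\|x\|_X \qquad \forall x \in X.$$
One can shorten this to $\|Lx\| \le \|L\|\,\|x\|$. We denote by $\mathcal{L}(X,Y)$ the space of linear continuous maps from $X$ to $Y$ endowed with the norm (9.25).

9.99 Proposition. If $Y$ is a Banach space, then $\mathcal{L}(X,Y)$ is a Banach space.

Proof. Let $\{L_n\}$ be a Cauchy sequence in $\mathcal{L}(X,Y)$. For any $\epsilon > 0$ there is $n_0(\epsilon)$ such that $\|L_n - L_m\| \le \epsilon$ for $n,m \ge n_0$. In particular, $\|L_n(x) - L_m(x)\|_Y \le \epsilon\|x\|_X$ for all $x \in X$, i.e., for every $x \in X$, $\{L_n(x)\}$ is a Cauchy sequence in $Y$, hence converges to an element $L(x) \in Y$, $L_n(x) \to L(x)$ as $n \to \infty$. Letting $n$ go to infinity in
$$L_n(x+y) = L_n(x) + L_n(y), \qquad L_n(\lambda x) = \lambda L_n(x),$$
we see that $L$ is linear. Letting $m \to \infty$ in $\|L_n(x) - L_m(x)\|_Y \le \epsilon$, valid for $\|x\|_X \le 1$ and $n,m \ge n_0(\epsilon)$, we also find $\|L_n(x) - L(x)\|_Y \le \epsilon$ for $\|x\|_X \le 1$ and $n \ge n_0(\epsilon)$. This implies $\|L\| \le \|L_n\| + \epsilon$ and $\|L_n - L\| \le \epsilon$ for $n \ge n_0$, which in turn yields $L_n \to L$ in $\mathcal{L}(X,Y)$. □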

c. Norms on matrices

For any $n$, let $\mathbb{K} = \mathbb{R}$ ($\mathbb{K} = \mathbb{C}$) and consider $\mathbb{R}^n$ (respectively $\mathbb{C}^n$) as a Euclidean (Hermitian) space endowed with the standard Euclidean (Hermitian) product and let $|\ |$ be the associated norm. Let $L : \mathbb{K}^n \to \mathbb{K}^m$ be linear, let $\mathbf{L} \in M_{m,n}(\mathbb{K})$ be the associated matrix, $L(x) =: \mathbf{L}x$, and let $\mu_1, \mu_2, \dots, \mu_n$ be the singular values of $L$, that is the eigenvalues of the matrix $(\mathbf{L}^*\mathbf{L})^{1/2}$ ordered in nondecreasing order. Then
$$\|L\|^2 = \sup_{|x|=1} |L(x)|^2 = \sup_{|x|=1} (L^*L(x)\,|\,x) = \mu_n^2.$$
Now define the $\ell^2$-norm of $L \in \mathcal{L}(\mathbb{K}^n,\mathbb{K}^m)$ by
$$\|L\|_2 := \Bigl(\sum_{i,j} |\mathbf{L}^i_j|^2\Bigr)^{1/2}.$$
Of course, $\|\ \|_2$ and $\|\ \|$ are equivalent norms in $\mathcal{L}(\mathbb{K}^n,\mathbb{K}^m)$, since $\mathcal{L}(\mathbb{K}^n,\mathbb{K}^m)$ is finite dimensional. More precisely we compute
$$\|L\|_2^2 = \mathrm{tr}(\mathbf{L}^*\mathbf{L}) = \mathrm{tr}(\mathbf{L}\mathbf{L}^*) = \mu_1^2 + \dots + \mu_n^2,$$
and therefore, we have the following.

9.100 Proposition. Let $\mathbf{L} \in M_{m,n}(\mathbb{K})$. Then $\|L\|$ is the maximum of the singular values of $L$ and $\|L\| \le \|L\|_2 \le \sqrt{n}\,\|L\|$. Moreover, $\|L\| = \|L\|_2$ if and only if $\mathrm{Rank}\,L = 1$.

Proof. Let $\mu_1, \mu_2, \dots, \mu_n$ be the singular values of $L$ ordered in nondecreasing order. By the above, $\|L\| = \mu_n \le \|L\|_2 \le \sqrt{n}\,\|L\|$, while the equality $\|L\| = \|L\|_2$ is equivalent to $\mu_1 = \dots = \mu_{n-1} = 0$, and this happens if and only if $\mathrm{Rank}\,L = 1$. □

9.101 ¶. Let $T : \ell_2 \to \mathbb{K}$ be a bounded operator and for $i = 1, 2, \dots$, let $e_i = \{\delta_{in}\}_n$. Then
$$\|T\|^2 = \sum_{j=1}^{\infty} |T(e_j)|^2.$$
d. Pointwise and uniform convergence for operators In £(X, Y) we may define two notions of convergence. 9.102 Definition. Let {L^} C

C{X,Y).

(i) We say that {Ln} converges pointwise to L i/Vx G X we have \\Ln{x) - L{X)\\Y

^

0,

(ii) we say that {Ln} converges to L in norm or uniformly, if \\Ln-

L\\c{X,Y)=

sup \\x\\x
\\Ln{x)

- L{X)\\Y

-^ 0

aS U ^ OO.

Trivially, Ln -^ L pointwise ii Ln -^ L uniformly. But the converse is in general false and holds true if X is finite dimensional. 9.103 E x a m p l e . Recall that a sequence {xn} is in ^2(K) if and only if E f c l i ^l < +oo. For any n G N let e^^) := {Skn}k- Of course, ||e(^)||2 = 1 Vn. Consider the sequence of linear forms {Ln} on i2{^) defined by Ln{{xk}) = XnFor any x G ^2 W we have X]fcLi ^fc < +00, hence Ln{x) = Xn ^>- 0 as n ^>- 00, i.e., Ln —> 0 pointwise. On the other hand, \\Ln-0\\c(i^^^)

= \\Ln\\cie2,R)=

sup l|a;||2
and {Ln} does not converge uniformly to 0.

|Ln(x)|>Ln(e(^)) = l


e. The algebra End(X)

Let $X$, $Y$ and $Z$ be linear normed spaces and let $f : X \to Y$ and $g : Y \to Z$ be linear continuous operators. The composition $g \circ f : X \to Z$ is again a linear continuous operator from $X$ to $Z$, and for every $x \in X$ we have
$$\|(g \circ f)(x)\|_Z \le \|g\|\,\|f(x)\|_Y \le \|g\|\,\|f\|\,\|x\|_X,$$
hence
$$\|g \circ f\| \le \|g\|\,\|f\|. \tag{9.26}$$

9.104 Example. In general we may have the strict inequality $\|g \circ f\| < \|g\|\,\|f\|$. For instance, if $X = \mathbb{R}^2$ and $f$ and $g$ are the orthogonal projections on the axes, we have $\|f\| = \|g\| = 1$ and $f \circ g = g \circ f = 0$, hence $\|g \circ f\| = \|f \circ g\| = 0$.

9.105 E x a m p l e . Let T : R"^ --^ R'^ he defined by T(x) = Tx where T := I 0 < € «

1. Then T-'^(x)

= T'^x

where T - ^ = [^

| . We then compute

\0 \\T\\ = 1, | | T - i | | = 1/6 and ||T|| | | T - i | | »

1 = ||Id|| =

1/eJ \\T-'oT\\.

Let $X$ be a Banach space with norm $\|\ \|_X$, denote the Banach space $\mathcal{L}(X,X)$ by $\mathrm{End}(X)$ and the norm on $\mathrm{End}(X)$ by $\|\ \|$. The product of composition defines in $\mathrm{End}(X)$ a structure of algebra in which the product satisfies the inequality (9.26): this is expressed by saying that $\mathrm{End}(X)$ is a Banach algebra. Clearly, if $L \in \mathrm{End}(X)$ and $L^n = L \circ L \circ \dots \circ L$, then by (9.26) we have
$$\|L^n\| \le \|L\|^n. \tag{9.27}$$
Again, in general, we may have a strict inequality.

9.106 Proposition. Let $X$ be a Banach space and $L \in \mathrm{End}(X)$. If $\|L\| < 1$, then $\mathrm{Id} - L$ is invertible,
$$(\mathrm{Id} - L)^{-1} = \sum_{n=0}^{\infty} L^n \qquad \text{in } \mathrm{End}(X)$$
and $\|(\mathrm{Id} - L)^{-1}\| \le \frac{1}{1 - \|L\|}$. In particular, for any $y \in X$ the equation $x - Lx = y$ has a unique solution, $x = \sum_{n=0}^\infty L^n y$, and $\|x\| \le \frac{1}{1-\|L\|}\,\|y\|$.

Proof. The series $\sum_{n=0}^\infty L^n$ is absolutely convergent, since
$$\sum_{n=0}^{\infty} \|L^n\| \le \sum_{n=0}^{\infty} \|L\|^n = \frac{1}{1 - \|L\|},$$
hence convergent. In particular, $S := \sum_{n=0}^\infty L^n \in \mathrm{End}(X)$ and $\|S\| \le \frac{1}{1-\|L\|}$. Finally
$$(\mathrm{Id} - L)\sum_{k=0}^{n} L^k = \mathrm{Id} - L^{n+1} \to \mathrm{Id} \qquad \text{in } \mathrm{End}(X)$$
since $\|L^{n+1}\| \le \|L\|^{n+1} \to 0$. □
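In finite dimension the series of Proposition 9.106 can be summed directly. The following sketch, with hypothetical helper names and a sample $2 \times 2$ matrix of our choosing whose norm is less than $1$, compares the partial sums of $\sum_k L^k$ with the inverse of $\mathrm{Id} - L$.

```python
def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_inverse(l, terms=200):
    """Partial sum sum_{k=0}^{terms-1} L^k, which approximates (Id - L)^{-1} when ||L|| < 1."""
    n = len(l)
    total = identity(n)
    power = identity(n)
    for _ in range(terms - 1):
        power = mat_mul(power, l)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

L = [[0.2, 0.3], [0.1, 0.4]]                 # entries small enough that ||L|| < 1
S = neumann_inverse(L)
I_minus_L = [[1.0 - 0.2, -0.3], [-0.1, 1.0 - 0.4]]
product = mat_mul(I_minus_L, S)              # should be close to the identity
```

Since $\|L^{n+1}\| \le \|L\|^{n+1}$, the truncation error decays geometrically, so a moderate number of terms already gives the inverse to machine precision in this example.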

f. The exponential of an operator

Again by (9.27) we get, similarly to Proposition 9.106, the following.

9.107 Proposition. Let $X$ be a Banach space and $L \in \mathrm{End}(X)$.
(i) Let $f(z) := \sum_{n=0}^\infty a_n z^n$ be a power series with radius of convergence $\rho > 0$. If $\|L\| < \rho$, then the series $\sum_{n=0}^\infty a_n L^n$ converges in $\mathrm{End}(X)$ and defines a linear continuous operator
$$f(L) := \sum_{n=0}^{\infty} a_n L^n \in \mathrm{End}(X).$$
(ii) The series $\sum_{k=0}^\infty \frac{1}{k!} L^k$ converges in $\mathrm{End}(X)$ and defines the linear continuous operator
$$e^L = \exp(L) := \sum_{k=0}^{\infty} \frac{1}{k!}\,L^k \in \mathrm{End}(X).$$

9.108 ¶. Show the following.

Proposition. Let $X$ be a Banach space and let $A, B \in \mathrm{End}(X)$. Then we have
(i) $\bigl(\mathrm{Id} + \frac{A}{n}\bigr)^n \to e^A$ in $\mathrm{End}(X)$,
(ii) $\|e^A\| \le e^{\|A\|}$,
(iii) if $A$ and $B$ commute, i.e., $AB = BA$, then $e^{A+B} = e^A e^B$,
(iv) if $A$ has an inverse, then $(e^A)^{-1} = e^{-A}$,
(v) if $X$ is finite-dimensional, $X = \mathbb{R}^n$, we have
$$e^{PAP^{-1}} = P e^A P^{-1} \quad \text{if } P \text{ has an inverse}, \qquad \det e^A = e^{\mathrm{tr}\,A} \quad \text{if } A \text{ is symmetric}.$$
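Some of the identities in 9.108 can be checked numerically for $2 \times 2$ matrices. This is a sketch only: the truncation of the exponential series after 40 terms, the choice $n = 2^{20}$ in $(\mathrm{Id} + A/n)^n$ and all function names are assumptions of ours.

```python
import math

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(a, terms=40):
    """Partial sum of the series sum_k A^k / k!."""
    n = len(a)
    total = identity(n)
    term = identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def euler_power(a, doublings=20):
    """(Id + A/n)^n with n = 2**doublings, computed by repeated squaring."""
    n = 2 ** doublings
    m = [[(1.0 if i == j else 0.0) + a[i][j] / n for j in range(len(a))]
         for i in range(len(a))]
    for _ in range(doublings):
        m = mat_mul(m, m)
    return m

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

J = [[0.0, 1.0], [-1.0, 0.0]]   # J^2 = -Id, so exp(J) = cos(1) Id + sin(1) J
A = [[1.0, 0.5], [0.5, 2.0]]    # symmetric, tr A = 3
exp_J = mat_exp(J)
exp_A = mat_exp(A)
approx_J = euler_power(J)       # illustrates (i): close to exp_J
```

Here `det2(exp_A)` should agree with $e^{3}$, illustrating (v), and `approx_J` should agree with `exp_J` up to an error of order $1/n$.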

9.4.2 Fundamental theorems In this subsection, we briefly illustrate four of the most important theorems about the structure of linear continuous operators on normed spaces. The first three, the principle of uniform boundedness, the open mapping theorem and the closed graph theorem are a consequence of Baire's category theorem, see Chapter 5, and are due to Stefan Banach (1892-1945); the fourth one, known as Hahn-Banach theorem, was proved independently by Hans Hahn (1879-1934) in 1926 and by Banach in 1929.


a. The principle of uniform boundedness

The following important theorem is known as the Banach-Steinhaus theorem and also as the principle of uniform boundedness.

9.109 Theorem (Banach-Steinhaus). Let $X$ be a Banach space and $Y$ be a normed linear space. Let $\{T_\alpha\}$ be a family of bounded linear operators from $X$ to $Y$ indexed on an arbitrary set $A$ (possibly nondenumerable). If
$$\sup_{\alpha \in A} \|T_\alpha x\|_Y \le C(x) < +\infty \qquad \forall x \in X,$$
then there exists a constant $C$ such that
$$\sup_{\alpha \in A} \|T_\alpha\|_{\mathcal{L}(X,Y)} \le C.$$

Proof. For $n \in \mathbb{N}$ set $X_n := \{x \in X \mid \|T_\alpha x\|_Y \le n\ \forall \alpha \in A\}$. Each $X_n$ is closed and by hypothesis $\cup_n X_n = X$. By Baire's category theorem, it follows that there exist $n_0 \in \mathbb{N}$, $x_0 \in X$ and $r_0 > 0$ such that $B(x_0,r_0) \subset X_{n_0}$, that is
$$\|T_\alpha(x_0 + r_0 z)\|_Y \le n_0 \qquad \forall \alpha \in A,\ \forall z \in B(0,1),$$
hence
$$\|T_\alpha(z)\|_Y = \frac{1}{r_0}\,\|T_\alpha(r_0 z + x_0 - x_0)\|_Y \le \frac{1}{r_0}\bigl(n_0 + \|T_\alpha(x_0)\|_Y\bigr) \le \frac{n_0 + C(x_0)}{r_0} =: C.$$
□

The following corollaries are trivial consequences.

9.110 Corollary. Let $\{T_k\}$ be a sequence of bounded linear operators from a Banach space $X$ into a normed space $Y$. Suppose that for each $x \in X$, $\lim_{k \to \infty} T_k x =: Tx$ exists in $Y$. Then the limit operator $T$ is also a bounded linear operator from $X$ to $Y$ and we have
$$\|T\|_{\mathcal{L}(X,Y)} \le \liminf_{k \to \infty} \|T_k\|_{\mathcal{L}(X,Y)}.$$

9.111 Corollary. Let I ((6,6,..-,^n,0,0,...))

i f ? = (Ci,?2,---)

are clearly continuous and l / n ( 0 ~^ ^{0 i^ ^p- Therefore L is continuous by the Banach-Steinhaus theorem. D The following theorem, again due t o Banach, is also a consequence of Baire's category theorem.


9.112 T h e o r e m . Given a sequence of bounded linear operators {T^} from a Banach space X into a normed linear space Y, the set {x ex\

liminf ||Tfcx||y < + o o ) fc—••oo

either coincides with X or is a set of the first category of X. This in turn implies the following. 9.113 Corollary (Principle of c o n d e n s a t i o n of singularities). For p = 1 , 2 , . . . , let {Tp,q}, q = 1 , 2 , . . . , be a sequence of bounded linear operators from a Banach space into a normed space Yp. Assume that for each p there exists Xp £ X such that limsupq_^QQ \\Tp^qXp\\c(^X,Yp) = oo- Then the set

is of second

\x e X\ limsup||Tp,q||£(x,y„) = +oo for all p = 1,2,3,..

A

*^

•'

•

q-^oo

category.

The above principle gives a general method of finding functions with many singularities. For instance one can find in this way a continuous function x(t) of period 27r such that the partial sum of its Fourier expansion n

Snf(t) :=

h Y ^ (ttfc cos kt -h bk sin kt) ^

k=i

satisfies the condition limsup |5n/(*)l = ^^ n—+00

in a set P C [0, 27r] which has the power of the continuum.^

¹ For proofs we refer the interested reader to e.g., K. Yosida, Functional Analysis, Springer-Verlag, Berlin, 1964.

b. The open mapping theorem

9.114 Theorem (Banach's open mapping theorem). Let $X$ and $Y$ be Banach spaces and let $T$ be a surjective bounded linear operator from $X$ into $Y$. Then $T$ is open, i.e., it maps open sets of $X$ onto open sets of $Y$.

Proof. We divide the proof into two steps.

Step 1. First we prove that there is a $\delta > 0$ such that
$$\overline{T B_X(0,1)} \supset B_Y(0,2\delta).$$
Set $X_n := n\,\overline{T B_X(0,1)}$. All $X_n$ are closed and, since $T$ is surjective, $\cup_{n=1}^\infty X_n = Y$. By Baire's category theorem, see Theorem 5.118, it follows that for some $n$, $X_n$ has a nonvoid interior. By homogeneity, $\overline{T B_X(0,1)}$ has a nonvoid interior, too, i.e., there exist $y_0 \in Y$ and $\delta > 0$ such that $B_Y(y_0,4\delta) \subset \overline{T B_X(0,1)}$. By symmetry $-y_0 \in \overline{T B_X(0,1)}$, and, as $\overline{T B_X(0,1)}$ is convex, $B_Y(0,2\delta) \subset \overline{T B_X(0,1)}$.

Step 2. We shall now prove that
$$T B_X(0,1) \supset B_Y(0,\delta),$$
that is the claim. Observe that by Step 1 and homogeneity
$$\overline{T B_X(0,r)} \supset B_Y(0,2\delta r) \qquad \forall r > 0. \tag{9.28}$$
We want to prove that the equation $Tx = y$ has a solution $x \in B_X(0,1)$ for any $y \in B_Y(0,\delta)$. Let $y \in Y$ be such that $\|y\|_Y < \delta$. By (9.28) there exists $x_1 \in X$ such that $\|x_1\|_X < 1/2$ and $\|Tx_1 - y\|_Y < \delta/2$. Similarly, considering the equation $Tx = y - Tx_1$, one can find $x_2 \in X$ such that $\|x_2\|_X < 1/4$ and $\|y - Tx_1 - Tx_2\|_Y < \delta/4$. By induction, we then construct points $x_n \in X$ such that $\|x_n\|_X < 2^{-n}$ and $\|y - \sum_{k=1}^n Tx_k\|_Y < \delta/2^n$. Therefore the series $\sum_{k=1}^\infty x_k$ is absolutely convergent in $X$ with sum of the norms less than $1$, hence it converges to some $x \in X$ with $\|x\|_X < 1$, and $\|y - Tx\|_Y = 0$. □

9.115 ¶. Show the converse of the open mapping theorem: if $T : X \to Y$ is an open, bounded linear operator between Banach spaces, then $T$ is surjective.

A trivial consequence of Theorem 9.114 is the following.

9.116 Corollary (Banach's continuous inverse theorem). Let $X$, $Y$ be Banach spaces and let $T : X \to Y$ be a surjective and one-to-one bounded linear operator. Then $T^{-1}$ is a bounded operator.

9.117 Remark. Let $X$ and $Y$ be Banach spaces and let $T : X \to Y$ be a linear continuous operator. Often one says that the equation $Tx = y$ is well posed if for any $y \in Y$ it has a unique solution $x \in X$ which depends continuously on $y$. Corollary 9.116 says that the equation $Tx = y$ is well posed if $X$ and $Y$ are Banach spaces and $Tx = y$ is uniquely solvable $\forall y \in Y$.

c. The closed graph theorem

Let $X$, $Y$ be two Banach spaces. Then $X \times Y$ endowed with the norm
$$\|(x,y)\|_{X \times Y} := \|x\|_X + \|y\|_Y$$
is also a Banach space.

9.118 Theorem (Banach's closed graph theorem). Let $X$ and $Y$ be Banach spaces and let $T : X \to Y$ be a linear operator. Then $T : X \to Y$ is bounded if and only if its graph
$$G_T := \{(x,y) \in X \times Y \mid y = Tx\}$$
is closed in $X \times Y$.

Proof. If $T$ is continuous, then trivially $G_T$ is closed. Conversely, $G_T$ is a closed linear subspace of $X \times Y$, hence $G_T$ is a Banach space with the induced norm of $X \times Y$. The linear map $\pi : G_T \to X$, $\pi((x,Tx)) := x$, is a bounded linear operator that is one-to-one and onto; hence, by the Banach continuous inverse theorem, the inverse map of $\pi$, $\pi^{-1} : X \to G_T$, $x \to (x,Tx)$, is a bounded linear operator, i.e., $\|x\|_X + \|Tx\|_Y \le C\|x\|_X$ for some constant $C$. $T$ is therefore bounded. □


Figure 9.11. Hans Hahn (1879-1934) and Hugo Steinhaus (1887-1972).

d. The Hahn-Banach theorem

The Hahn-Banach theorem is one of the most important results in linear functional analysis. Basically, it allows one to extend to the whole space a bounded linear operator defined on a subspace in a controlled way. In particular, it enables us to show that the dual space, i.e., the space of linear bounded forms on $X$, is rich.

9.119 Theorem (Hahn-Banach, analytical form). Let $X$ be a real normed space and let $p : X \to \mathbb{R}$ be a sublinear functional, that is, satisfying
$$p(x+y) \le p(x) + p(y), \qquad p(\lambda x) = \lambda p(x) \qquad \forall \lambda > 0,\ \forall x,y \in X.$$
Let $Y$ be a linear subspace of $X$ and let $f : Y \to \mathbb{R}$ be a linear functional such that $f(x) \le p(x)$ $\forall x \in Y$. Then $f$ can be extended to a linear functional $F : X \to \mathbb{R}$ satisfying
$$F(x) = f(x) \quad \forall x \in Y, \qquad F(x) \le p(x) \quad \forall x \in X.$$

Proof. Denote by $\mathcal{K}$ the set of all pairs $(Y_\alpha, g_\alpha)$ where $Y_\alpha$ is a linear subspace of $X$ such that $Y_\alpha \supset Y$ and $g_\alpha$ is a linear functional on $Y_\alpha$ satisfying
$$g_\alpha(x) = f(x) \quad \forall x \in Y, \qquad g_\alpha(x) \le p(x) \quad \forall x \in Y_\alpha.$$
We define an order in $\mathcal{K}$ by $(Y_\alpha,g_\alpha) \le (Y_\beta,g_\beta)$ if $Y_\alpha \subset Y_\beta$ and $g_\alpha = g_\beta$ on $Y_\alpha$. Then $\mathcal{K}$ becomes a partially ordered set. Every totally ordered subset $\{(Y_\alpha,g_\alpha)\}$ clearly has an upper bound $(Y',g')$ given by $Y' = \cup_\beta Y_\beta$, $g' = g_\beta$ on $Y_\beta$. Hence, by Zorn's lemma, see e.g., Section 3.3 of [GM2], there is a maximal element $(Y_0,g_0)$. If we show that $Y_0 = X$, then the proof is complete with $F = g_0$. We shall assume that $Y_0 \ne X$ and derive a contradiction.

Let $y_1 \notin Y_0$ and consider
$$Y_1 := \mathrm{Span}\,(Y_0 \cup \{y_1\}) = \{x = y + \lambda y_1 \mid y \in Y_0,\ \lambda \in \mathbb{R}\};$$
notice that $y \in Y_0$ and $\lambda \in \mathbb{R}$ are uniquely determined by $x$, otherwise we get $y_1 \in Y_0$. Define $g_1 : Y_1 \to \mathbb{R}$ by $g_1(y + \lambda y_1) := g_0(y) + \lambda c$. If we can choose $c$ in such a way that
$$g_1(y + \lambda y_1) = g_0(y) + \lambda c \le p(y + \lambda y_1)$$
for all $\lambda \in \mathbb{R}$, $y \in Y_0$, then $(Y_1,g_1) \in \mathcal{K}$ and $(Y_0,g_0) \le (Y_1,g_1)$, $Y_1 \ne Y_0$. This contradicts the maximality of $(Y_0,g_0)$.

To choose $c$, we notice that for $x,y \in Y_0$
$$g_0(y) - g_0(x) = g_0(y-x) \le p(y-x) \le p(y + y_1) + p(-y_1 - x).$$
Hence
$$-p(-y_1 - x) - g_0(x) \le p(y + y_1) - g_0(y).$$
This implies that
$$A := \sup_{x \in Y_0}\bigl\{-p(-y_1 - x) - g_0(x)\bigr\} \le \inf_{y \in Y_0}\bigl\{p(y + y_1) - g_0(y)\bigr\} =: B.$$
Thus we can choose $c$ such that $A \le c \le B$. Then
$$c \le p(y + y_1) - g_0(y) \qquad \forall y \in Y_0,$$
$$-p(-y_1 - y) - g_0(y) \le c \qquad \forall y \in Y_0.$$
Multiplying the first inequality by $\lambda$, $\lambda > 0$, and the second by $\lambda$, $\lambda < 0$, and replacing $y$ with $y/\lambda$ we conclude that for all $\lambda \ne 0$, and trivially for $\lambda = 0$,
$$\lambda c \le p(y + \lambda y_1) - g_0(y).$$
□

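9.120 Theorem (Hahn-Banach). Let $X$ be a normed linear space over $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ and let $Y$ be a linear subspace of $X$. Then for every $f \in \mathcal{L}(Y,\mathbb{K})$ there exists $F \in \mathcal{L}(X,\mathbb{K})$ such that
$$F(x) = f(x) \quad \forall x \in Y, \qquad \|F\|_{\mathcal{L}(X,\mathbb{K})} = \|f\|_{\mathcal{L}(Y,\mathbb{K})}.$$

Proof. If $X$ is a real normed space, then the assertion follows from Theorem 9.119 with $p(x) = \|f\|_{\mathcal{L}(Y,\mathbb{R})}\,\|x\|_X$. To prove that $\|F\|_{\mathcal{L}(X,\mathbb{R})} \le \|f\|_{\mathcal{L}(Y,\mathbb{R})}$, notice that $F(x) = \theta|F(x)|$, $\theta := \pm 1$, then
$$|F(x)| = \theta F(x) = F(\theta x) \le p(\theta x) = \|f\|_{\mathcal{L}(Y,\mathbb{R})}\,\|\theta x\|_X = \|f\|_{\mathcal{L}(Y,\mathbb{R})}\,\|x\|_X.$$
This shows $\|F\|_{\mathcal{L}(X,\mathbb{R})} \le \|f\|_{\mathcal{L}(Y,\mathbb{R})}$. The opposite inequality is obvious.

Suppose now that $X$ and $Y$ are complex normed spaces. Consider the real-valued map $h(x) := \Re f(x)$, $x \in Y$. $h$ is an $\mathbb{R}$-linear bounded form on $Y$ considered as a real normed space since
$$|h(x)| \le |f(x)| \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\,\|x\|_X \qquad \forall x \in Y,$$
thus the first part of the proof yields an $\mathbb{R}$-linear bounded map $H : X \to \mathbb{R}$ such that $H(x) = h(x)$ $\forall x \in Y$ and $|H(x)| \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\,\|x\|_X$ $\forall x \in X$. Now define $F(x) := H(x) - iH(ix)$ $\forall x \in X$, hence $H(x) = \Re F(x)$. It is easily seen that $F : X \to \mathbb{C}$ is a $\mathbb{C}$-linear map and extends $f$. It remains to show that
$$|F(x)| \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\,\|x\|_X \qquad \forall x \in X.$$
For $x \in X$, we can write $F(x) = re^{i\theta}$ with $r \ge 0$. Hence
$$|F(x)| = r = \Re\bigl(e^{-i\theta}F(x)\bigr) = \Re F(e^{-i\theta}x) = H(e^{-i\theta}x) \le \|f\|_{\mathcal{L}(Y,\mathbb{C})}\,\|e^{-i\theta}x\|_X = \|f\|_{\mathcal{L}(Y,\mathbb{C})}\,\|x\|_X.$$
□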

Simple consequences are the following corollaries.

9.121 Corollary. Let $X$ be a normed space and let $x \in X$. Then there exists $F \in \mathcal{L}(X,\mathbb{R})$ such that
$$F(x) = \|x\|_X, \qquad \|F\|_{\mathcal{L}(X,\mathbb{R})} = 1.$$

9.122 Corollary. Let $X$ be a normed space. Then for all $x \in X$
$$\|x\|_X = \sup\bigl\{F(x) \bigm| F \in \mathcal{L}(X,\mathbb{R}),\ \|F\|_{\mathcal{L}(X,\mathbb{R})} \le 1\bigr\}.$$

9.123 Corollary. Let $Y$ be a linear subspace of a normed linear space $X$. If $Y$ is not dense in $X$, then there exists $F \in \mathcal{L}(X,\mathbb{R})$, $F \ne 0$, such that $F(y) = 0$ $\forall y \in Y$.

9.124 ¶. Prove Corollaries 9.121, 9.122 and 9.123.

We can give a geometric formulation of the Hahn-Banach theorem that is very useful. For the sake of simplicity, from now on we shall assume that $X$ is a real normed space, even though the following results also hold for complex normed spaces.

A closed affine hyperplane in $X$ is a set of the form

$$H := \{x \in X \mid F(x) = \alpha\}$$

where $F \in \mathcal{L}(X,\mathbb{R})$ and $\alpha \in \mathbb{R}$. It defines the two half-spaces

$$H_- := \{x \in X \mid F(x) \le \alpha\}, \qquad H_+ := \{x \in X \mid F(x) \ge \alpha\}.$$

We say that $H$ separates the sets $A$ and $B$ if $A \subset H_-$ and $B \subset H_+$.

9.125 Lemma (Gauge function). Let $C \subset X$ be an open convex subset of the real normed space $X$ and let $0 \in C$. Define

$$p(x) := \inf\Big\{\alpha > 0 \;\Big|\; \frac{x}{\alpha} \in C\Big\}.$$

Then
(i) $p$ is sublinear,
(ii) $\exists M$ such that $0 \le p(x) \le M\|x\|_X$,
(iii) $C = \{x \in X \mid p(x) < 1\}$.

Proof. If $B(0,r) \subset C$, we clearly have $p(x) \le \frac{1}{r}\|x\|_X$ $\forall x \in X$, that is (ii).

Let us prove (iii). Suppose $x \in C$. Since $C$ is open, $(1+\epsilon)x \in C$ if $\epsilon$ is small. Hence $p(x) \le \frac{1}{1+\epsilon} < 1$. Conversely, if $p(x) < 1$, there is $\alpha$, $0 < \alpha < 1$, such that $\alpha^{-1}x \in C$, hence $x = \alpha(\alpha^{-1}x) + (1-\alpha)0 \in C$.

Finally, let us prove (i). Trivially $p(\lambda x) = \lambda p(x)$ for $\lambda > 0$. For all $x, y \in X$ and $\epsilon > 0$ we know that

$$\frac{x}{p(x)+\epsilon},\ \frac{y}{p(y)+\epsilon} \in C,$$

consequently,

$$\frac{t\,x}{p(x)+\epsilon} + \frac{(1-t)\,y}{p(y)+\epsilon} \in C \quad \forall t \in [0,1].$$

In particular, for

$$t := \frac{p(x)+\epsilon}{p(x)+p(y)+2\epsilon}$$

we obtain

$$\frac{x+y}{p(x)+p(y)+2\epsilon} \in C.$$

This yields $p(x+y) \le p(x) + p(y) + 2\epsilon$ and the claim follows, since $\epsilon$ is arbitrary. $\square$
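As a numerical illustration of Lemma 9.125, the following sketch approximates the gauge function $p(x) = \inf\{\alpha > 0 \mid x/\alpha \in C\}$ by bisection. The function name `gauge`, the membership test `in_C`, and the example set (an open ellipse in $\mathbb{R}^2$) are assumptions made only for this illustration; they do not come from the text.

```python
import math

def gauge(x, in_C, hi=1e6, tol=1e-12):
    """Approximate p(x) = inf{a > 0 : x/a in C} by bisection.

    Assumptions (not checked): C is open, convex and contains 0, so that
    x/a lies in C exactly when a > p(x); x/hi is assumed to lie in C.
    """
    if all(c == 0.0 for c in x):
        return 0.0
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if in_C([c / mid for c in x]):
            hi = mid   # mid > p(x): shrink from above
        else:
            lo = mid   # mid <= p(x): raise the lower bound
    return hi

# Hypothetical example set: the open ellipse x^2/4 + y^2 < 1 in R^2,
# whose gauge is known in closed form: p(x, y) = sqrt(x^2/4 + y^2).
def in_ellipse(v):
    return v[0] ** 2 / 4.0 + v[1] ** 2 < 1.0

x, y = (1.0, 1.0), (-3.0, 0.5)
s = (x[0] + y[0], x[1] + y[1])
p_x, p_y, p_s = gauge(x, in_ellipse), gauge(y, in_ellipse), gauge(s, in_ellipse)
print(p_x, p_y, p_s)  # p_s should not exceed p_x + p_y (sublinearity)
```

For this set the computed values can be compared with the closed form, and property (iii) can be checked directly: a point belongs to the ellipse exactly when its gauge is smaller than one.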

9.126 Proposition. Let $C \subset X$ be an open convex subset of the real normed space $X$ and let $\bar{x} \in X$, $\bar{x} \notin C$. Then there exists $f \in \mathcal{L}(X, \mathbb{R})$ such that $f(x) < f(\bar{x})$ $\forall x \in C$. In particular, $C$ and $\bar{x}$ are separated by the closed affine hyperplane $\{x \mid f(x) = f(\bar{x})\}$.

Proof. By translation we can assume $0 \in C$ and introduce the gauge function $p(x)$ by Lemma 9.125. If $Y := \operatorname{Span}\{\bar{x}\}$ and $g : \operatorname{Span}\{\bar{x}\} \to \mathbb{R}$ is the linear map $g(t\bar{x}) := t$, it is clear that $g(x) \le p(x)$ $\forall x \in \operatorname{Span}\{\bar{x}\}$. By Theorem 9.119, there exists a linear extension $f$ of $g$ such that $f(x) \le p(x)$ $\forall x \in X$. In particular, we have $f(\bar{x}) = 1$ and $f$ is bounded because of (ii) of Lemma 9.125. On the other hand, $f(x) < 1$ $\forall x \in C$ by (iii) of Lemma 9.125. $\square$

9.127 Theorem (Hahn-Banach theorem, geometrical form). Let $A$ and $B$ be two nonempty disjoint convex sets of a real normed space $X$. Suppose $A$ is open. Then $A$ and $B$ can be separated by a closed affine hyperplane.

Proof. Set $C := A - B = \{x - y \mid x \in A,\ y \in B\}$. Trivially $C$ is convex and open, as $C = \bigcup_{y \in B}(A - y)$; moreover, $0 \notin C$ since $A \cap B = \emptyset$. By Proposition 9.126 there exists $f \in \mathcal{L}(X,\mathbb{R})$ such that $f(z) < 0$ $\forall z \in C$, i.e., $f(x) < f(y)$ $\forall x \in A$, $\forall y \in B$. If we choose $\alpha$ such that

$$\sup_{x \in A} f(x) \le \alpha \le \inf_{y \in B} f(y),$$

the hyperplane $\{x \mid f(x) = \alpha\}$ separates $A$ and $B$. $\square$
9.5 Some General Principles for Solving Abstract Equations

In this final section we establish some fundamental principles concerning the solvability of abstract equations

$$Au = f$$

where $A : X \to Y$ is a continuous function, also called a continuous nonlinear operator, between Banach spaces. These principles are fully appreciated, for instance, when dealing with the theory of ordinary or partial differential equations; however, in Chapter 11 we shall illustrate some of their applications.

9.5.1 The Banach fixed point theorem

Many problems take the form of finding a fixed point for a suitable transformation. For instance, if $A$ maps $X$ into $X$, where $X$ is a vector space, the equation $Au = 0$ is equivalent to $Au + u = u$, i.e., to finding a fixed point for the operator $A + \mathrm{Id}$. The contraction mapping theorem, proved by Stefan Banach (1892-1945) in 1922, an elementary version of which we saw in Theorem 8.48 in [GM2], is surely one of the simplest results that ensures the existence of a fixed point and also gives a procedure to determine it. The method has its origins in the method of successive approximations of Emile Picard (1856-1941) and may be regarded as an abstract formulation of it. Let $\{x_n\}$ be defined by

$$x_n = F(x_{n-1}).$$

If $\{x_n\}$ converges to $x$ and $F$ is continuous, then $x$ is a fixed point of $F$, $F(x) = x$.

a. The fixed point theorem

Let $X$ be a metric space. A map $T : X \to X$ is said to be $k$-contractive if $d(T(x),T(y)) \le k\,d(x,y)$ $\forall x, y \in X$, or, in other words, if $T : X \to X$ is Lipschitz continuous with Lipschitz constant less than or equal to $k$. If $0 \le k < 1$, $T$ is often simply called a contraction or a contractive mapping. A point $x \in X$ for which $Tx = x$ is called a fixed point for $T$. The contraction principle states that contractions have a unique fixed point.

9.128 Theorem (The fixed point theorem of Banach). Let $X$ be a complete metric space and let $T : X \to X$ be $k$-contractive with $0 \le k < 1$. Then $T$ has a unique fixed point $x$. Moreover, given $x_0 \in X$, the sequence $\{x_n\}$ defined recursively by $x_{n+1} = T(x_n)$ converges with an exponential rate to the fixed point, and the following estimates hold

$$d(x_{n+1},x) \le k\,d(x_n,x), \qquad d(x_n,x) \le \frac{k^n}{1-k}\,d(x_1,x_0), \qquad d(x_{n+1},x) \le \frac{k}{1-k}\,d(x_{n+1},x_n).$$

Figure 9.12. The frontispiece of Leçons sur quelques équations fonctionnelles by Emile Picard (1856-1941) and the first page of a celebrated paper by Jean Leray (1906-1998) and Juliusz Schauder (1899-1943) appeared in Journal de Mathématiques in 1933.

Proof. The proof is as in Theorem 8.48 of [GM2]. First we prove uniqueness. If $x, y$ are two fixed points, from $d(x,y) = d(Tx,Ty) \le k\,d(x,y)$, $0 \le k < 1$, we infer $d(x,y) = 0$, i.e., $x = y$. Then we prove existence. Choose any $x_0 \in X$ and let $x_{n+1} := T(x_n)$, $n \ge 0$. We have

$$d(x_{n+1},x_n) \le k\,d(x_n,x_{n-1}) \le \dots \le k^n d(x_1,x_0) = k^n d(T(x_0),x_0),$$

hence for $p > n$

$$d(x_p,x_n) \le \sum_{j=n}^{p-1} d(x_{j+1},x_j) \le \sum_{j=n}^{p-1} k^j d(x_1,x_0) \le \frac{k^n}{1-k}\,d(x_1,x_0).$$

Therefore $d(x_p,x_n) \to 0$ as $n, p \to \infty$, i.e., $\{x_n\}$ is a Cauchy sequence, hence it has a limit $x \in X$, and $x$ is a fixed point, as is easily seen by passing to the limit in $x_{n+1} = T(x_n)$. Finally, we leave to the reader the proof of the convergence estimates. $\square$
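The iteration of Theorem 9.128 can be sketched in a few lines. The example below is only an illustration under stated assumptions: the helper names `banach_iterate` and `a_priori_steps` are hypothetical, the map $T(x) = (\cos x + 1)/2$ on $\mathbb{R}$ is chosen here because $|T'| \le 1/2$, and the stopping rule uses the bound $\frac{k}{1-k}\,d(x_{n+1},x_n)$ on the distance from the fixed point, which follows from the contraction property and the triangle inequality.

```python
import math

def banach_iterate(T, x0, k, tol, dist=lambda a, b: abs(a - b), max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) for a k-contractive T, 0 < k < 1.

    Stops as soon as the a posteriori bound k/(1-k) * d(x_{n+1}, x_n)
    on d(x_{n+1}, x) is at most tol. Returns (iterate, steps, bound).
    """
    x = x0
    for n in range(1, max_iter + 1):
        x_next = T(x)
        bound = k / (1.0 - k) * dist(x_next, x)
        x = x_next
        if bound <= tol:
            return x, n, bound
    raise RuntimeError("no convergence within max_iter iterations")

def a_priori_steps(k, d10, tol):
    """Smallest n with k**n / (1 - k) * d(x_1, x_0) <= tol (0 < k < 1)."""
    if d10 == 0.0:
        return 0
    return max(0, math.ceil(math.log(tol * (1.0 - k) / d10) / math.log(k)))

# Assumed example: T(x) = (cos x + 1) / 2 is 1/2-contractive on R.
T = lambda x: (math.cos(x) + 1.0) / 2.0
k, x0, tol = 0.5, 0.0, 1e-10

x, steps, bound = banach_iterate(T, x0, k, tol)
predicted = a_priori_steps(k, abs(T(x0) - x0), tol)
print(x, steps, predicted)
```

The a priori count depends only on $k$ and $d(x_1,x_0)$ and is therefore pessimistic in general; the a posteriori rule usually stops earlier because it measures the actual step length.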

Notice that the second estimate in Theorem 9.128 allows us to evaluate the number of iterations that are sufficient to reach a desired accuracy; the third estimate allows us to evaluate the accuracy of the current iterate as an approximate value of $x$ in terms of $d(x_{n+1},x_n)$.

9.129 ¶. Show that $T : X \to X$ has a unique fixed point if its $m$th iterate $T^m = T \circ T \circ \dots \circ T$ is a $k$-contractive mapping with $0 \le k < 1$. [Hint: $x$ and $Tx$ are both fixed points of $T^m$.]

9.130 ¶. Let $X := C^0([a,b])$ and let

$$Tf(t) := \int_a^t f(s)\,ds, \qquad a \le t \le b.$$

Show that

$$T^m f(t) = \frac{1}{(m-1)!}\int_a^t (t-s)^{m-1} f(s)\,ds, \qquad a \le t \le b,$$

is a contractive map if $m$ is sufficiently large.

9.131 Proposition. Let $X$ be a Banach space and $T : X \to X$ a $k$-contractive map with $0 \le k < 1$. Then $\mathrm{Id} - T$ is a bijection from $X$ into itself, i.e., for every $y \in X$ the equation $x - Tx = y$ has a unique solution; moreover

$$\operatorname{Lip}\big((\mathrm{Id} - T)^{-1}\big) \le \frac{1}{1-k}. \qquad (9.29)$$

Proof. For any $y \in X$ the equation $x - Tx = y$ is equivalent to $x = y + Tx =: F(x)$. Since $F$ is $k$-contractive and $k < 1$, the fixed point theorem shows that $x - Tx = y$ has a unique solution for any given $y \in X$, i.e., $\mathrm{Id} - T$ is bijective. Finally, if $x_1 - Tx_1 = y_1$ and $x_2 - Tx_2 = y_2$, then

$$\|x_1 - x_2\| \le \|(x_1 - Tx_1) - (x_2 - Tx_2)\| + \|Tx_1 - Tx_2\| \le \|y_1 - y_2\| + k\|x_1 - x_2\|,$$

i.e.,

$$\|x_1 - x_2\| \le \frac{1}{1-k}\,\|y_1 - y_2\|. \qquad \square$$
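The statement of Proposition 9.131 can be tried out numerically. The sketch below solves $x - Tx = y$ in $\mathbb{R}^2$ with the sup norm by iterating $F(x) = y + Tx$ and then compares the distance of two solutions with the bound $\frac{1}{1-k}\|y_1 - y_2\|$. The map `T`, the helper names and the data `y1`, `y2` are hypothetical choices made for this illustration only.

```python
import math

def solve_id_minus_T(T, y, tol=1e-13, max_iter=10_000):
    """Solve x - T(x) = y in R^n (sup norm) by iterating F(x) = y + T(x)."""
    x = list(y)
    for _ in range(max_iter):
        x_new = [yi + ti for yi, ti in zip(y, T(x))]
        if max(abs(a - b) for a, b in zip(x_new, x)) <= tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

def T(x):
    # Assumed example: a 1/2-contractive map of R^2 in the sup norm,
    # since sin and cos are 1-Lipschitz.
    return [0.5 * math.sin(x[1]), 0.5 * math.cos(x[0])]

def sup_dist(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

k = 0.5
y1, y2 = [1.0, 2.0], [1.3, 1.5]
x1, x2 = solve_id_minus_T(T, y1), solve_id_minus_T(T, y2)

# Left: distance of the solutions; right: the bound from (9.29).
print(sup_dist(x1, x2), sup_dist(y1, y2) / (1.0 - k))
```

The first printed number should not exceed the second, in agreement with (9.29).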

9.132 ¶. Let $X$ be a Banach space and $T : X \to X$ a Lipschitz-continuous map. Show that the equation $Tx + \mu x = y$ is solvable for any $y$, provided $|\mu|$ is sufficiently large.

9.133 ¶. Let $X$ be a Banach space and $B : X \times X \to \mathbb{R}$ a bilinear continuous form such that $|B(x,y)|$
b. The continuity method

The solvability of a linear equation $L_1x = y$ can be reduced to the solvability of a simpler equation $L_0x = y$ by means of the following.

9.134 Theorem (The continuity method). Let $X$ be a Banach space, $Y$ a normed space and $L_0$, $L_1$ two linear continuous functions from $X$ to $Y$. For $t \in [0,1]$ consider the family of linear continuous functions $L_t : X \to Y$ given by

$$L_t := (1-t)L_0 + tL_1$$

and suppose that there exists a constant $C$ such that the following a priori estimates hold

$$\|x\|_X \le C\|L_tx\|_Y \quad \forall x \in X,\ \forall t \in [0,1]. \qquad (9.30)$$

Then, of course, the functions $L_t$, $t \in [0,1]$, are injective; moreover, $L_1$ is surjective if and only if $L_0$ is surjective.

Proof. Injectivity follows from the linearity and (9.30). Suppose now that $L_s$ is surjective for some $s$. Then $L_s : X \to Y$ is invertible and by (9.30) $\|(L_s)^{-1}\| \le C$. We shall now prove that the equation $L_tx = y$ can be solved for any $y \in Y$ provided $t$ is close to $s$. For this we notice that $L_tx = y$ is equivalent to

$$L_sx = y + (L_s - L_t)x = y + (t-s)L_0x - (t-s)L_1x$$

which, in turn, is equivalent to

$$x = L_s^{-1}y + (t-s)L_s^{-1}(L_0 - L_1)x =: Tx$$

since $L_s : X \to Y$ has an inverse. Then we observe that $\|Tx - Tz\|_X \le C|t-s|\,(\|L_0\| + \|L_1\|)\,\|x - z\|_X$; consequently $T$ is a contractive map if

$$|t-s| < \delta := \frac{1}{C(\|L_0\| + \|L_1\|)}$$

and we conclude that $L_t$ is surjective for all $t$ with $|t-s| < \delta$. Since $\delta$ is independent of $s$, starting from a surjective map $L_0$ we successively find that $L_t$ with $t \in [0,\delta]$, $[0,2\delta]$, $\dots$ is surjective. We therefore prove that $L_1$ is surjective in a finite number of steps. Exchanging the roles of $L_0$ and $L_1$ gives the converse. $\square$

9.135 Remark. Notice that the proof of Theorem 9.134 says that, assuming (9.30), the subset of $[0,1]$

$$S := \{s \in [0,1] \mid L_s : X \to Y \text{ is surjective}\}$$

is open and closed in $[0,1]$. Therefore $S = [0,1]$ provided $S \ne \emptyset$.

Figure 9.13. George Birkhoff (1884-1944) and a page from a paper by Birkhoff and Kellogg in Transactions, 1922.


9.5.2 The Caccioppoli-Schauder fixed point theorem

Compared to the fixed point theorem of Banach, the fixed point theorem of Caccioppoli and Schauder is more sophisticated: it extends the finite-dimensional fixed point theorem of Brouwer to infinite-dimensional spaces.

9.136 Theorem (The fixed point theorem of Brouwer). Let $K$ be a nonempty, compact and convex set of $\mathbb{R}^n$ and let $f$ be a continuous map mapping $K$ into itself. Then $f$ has at least one fixed point in $K$.

The generalization to infinite dimensions and to the abstract setting is due to Juliusz Schauder (1899-1943) at the beginning of the twenties of the past century; however, in specific situations it also appears in some of the works of George Birkhoff (1884-1944) and Oliver Kellogg (1878-1957) of 1922 and of Renato Caccioppoli (1904-1959) (independently of Juliusz Schauder) of the 1930's, in connection with the study of partial differential equations.

Brouwer's theorem relies strongly on the continuity of the map $f$ and, in particular, on the property that those maps have of transforming bounded sets of a finite-dimensional linear space into relatively compact sets. As we have seen in Theorem 9.21, such a property is no longer valid in infinite dimensions, thus we need to restrict ourselves to continuous maps that transform bounded sets into relatively compact sets. In fact, the following example shows that a fixed point theorem such as Brouwer's cannot hold for continuous functions from the unit ball of an infinite-dimensional space into itself.

9.137 Example. Consider the map $f : \ell_2 \to \ell_2$ given by

Clearly $f$ maps the unit ball of $\ell_2$ into itself, is continuous and has no fixed point.

a. Compact maps

9.138 Definition. Let $X$ and $Y$ be normed spaces. The (non)linear operator $A : X \to Y$ is called compact if
(i) $A$ is continuous,
(ii) $A$ maps bounded sets of $X$ into relatively compact subsets of $Y$; equivalently, for any bounded sequence $\{x_k\} \subset X$ we can extract a subsequence $\{x_{n_k}\}$ such that $\{Ax_{n_k}\}$ is convergent.

9.139 Example. Consider the integral operator $A : C^0([a,b]) \to C^0([a,b])$ that maps $u \in C^0([a,b])$ into $Au(x) \in C^0([a,b])$ defined by

$$Au(x) := \int_a^b F(x,y,u(y))\,dy$$

where $F(x,y,u)$ is a continuous real-valued function in $\mathbb{R}^3$. For $r > 0$ set $Q_r := \{(x,y,u) \in \mathbb{R}^3 \mid x, y \in [a,b],\ |u| \le r\}$ and $M_r := \{u \in C^0([a,b]) \mid \|u\|_\infty \le r\}$.

Proposition. $A : M_r \to C^0([a,b])$ is a compact operator.

Proof. (i) First we prove that $A : M_r \to C^0([a,b])$ is continuous. Fix $\epsilon > 0$ and observe that, $F$ being uniformly continuous in $Q_r$, there exists $\delta > 0$ such that

$$|F(x,y,u) - F(x,y,v)| < \epsilon$$

if $(x,y,u), (x,y,v) \in Q_r$ and $|u - v| < \delta$. Consequently, we have

$$|F(x,y,u(y)) - F(x,y,v(y))| < \epsilon$$

for $u, v \in M_r$ with $\|u - v\|_{\infty,[a,b]} < \delta$, hence

$$\|Au - Av\|_{\infty,[a,b]} = \sup_{x \in [a,b]}\Big|\int_a^b \big[F(x,y,u(y)) - F(x,y,v(y))\big]\,dy\Big| \le \epsilon(b-a). \qquad (9.31)$$

(ii) It remains to show that $A$ maps bounded sets into relatively compact sets. To do that, it suffices to show that $A(M_r)$ is relatively compact in $C^0([a,b])$. We now check that $A(M_r) \subset C^0([a,b])$ is a set of equibounded and equicontinuous functions. Then the Ascoli-Arzelà theorem, Theorem 9.48, yields the required property. In fact, the equiboundedness of functions in $A(M_r)$ follows from

$$\|Au\|_\infty \le (b-a)\sup_{(x,y,z) \in Q_r}|F(x,y,z)|,$$

while the equicontinuity of functions in $A(M_r)$ follows as in (9.31), using the uniform continuity of $F$ on $Q_r$ in the variable $x$: $|Au(x) - Au(x')| \le \epsilon(b-a)$ for all $u \in M_r$ whenever $|x - x'| < \delta$. $\square$

Compact operators arise as limits of maps with finite rank, as shown by the following theorem.

9.140 Theorem. Let $X$ and $Y$ be Banach spaces and $M \subset X$ a nonempty bounded set. We have
(i) If $\{A_n\}$, $A_n : M \to Y$, is a sequence of compact operators that converges to $A : M \to Y$ in $\mathcal{B}(M,Y)$, i.e., $\|A_n - A\|_{\mathcal{B}(M,Y)} \to 0$ as $n \to \infty$, then $A$ is compact.
(ii) Suppose $A : M \to Y$ is compact. Then there exists a sequence $\{A_n\}$ of continuous operators $A_n : M \to Y$ such that $\|A_n - A\|_{\infty,M} \to 0$ as $n \to \infty$ and each $A_n$ has range in a finite-dimensional subspace of $Y$ as well as in the convex envelope of $A(M)$.

Proof. (i) Fix $\epsilon > 0$ and choose $n$ so that $\|A_n - A\|_{\infty,M} < \epsilon$. Since $A_n(M)$ is relatively compact, we can cover $A_n(M)$ with a finite number of balls, $A_n(M) \subset \bigcup_{i=1}^I B(x_i,\epsilon)$. Therefore $A(M) \subset \bigcup_{i=1}^I B(x_i,2\epsilon)$, i.e., $A(M)$ is totally bounded, hence $A(M)$ has compact closure, compare Theorem 6.8.

(ii) Since $A(M)$ is relatively compact, for each $n$ there is a $\frac{1}{n}$-net, i.e., elements $y_j \in A(M)$, $j = 1, \dots, J_n$, such that $A(M) \subset \bigcup_{j=1}^{J_n} B(y_j, 1/n)$, or, equivalently,

$$\min_j \|Ax - y_j\| < \frac{1}{n} \quad \forall x \in M.$$

Figure 9.14. Renato Caccioppoli (1904-1959) and Carlo Miranda (1912-1982).

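Define the so-called Schauder operators

$$A_nx := \frac{\sum_{j=1}^{J_n} a_j(x)\,y_j}{\sum_{j=1}^{J_n} a_j(x)}$$

where, for $x \in M$ and $j = 1, \dots, J_n$,

$$a_j(x) := \max\Big\{\frac{1}{n} - \|Ax - y_j\|,\ 0\Big\}.$$

It is easily seen that the functions $a_j : M \to \mathbb{R}$ are continuous and do not vanish simultaneously; moreover

$$\|A_nx - Ax\| = \frac{\big\|\sum_{j} a_j(x)(y_j - Ax)\big\|}{\sum_{j} a_j(x)} \le \frac{1}{n} \quad \forall x \in M;$$

the claim then easily follows. $\square$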
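A finite-dimensional sketch may help to visualize the Schauder operators. It assumes the usual form $A_nx = \sum_j a_j(x)\,y_j / \sum_j a_j(x)$ with weights $a_j(x) = \max\{1/n - \|Ax - y_j\|, 0\}$, where $\{y_j\}$ is a $\frac{1}{n}$-net of $A(M)$. The map `A`, the set $M = [0,1]^2$ and the grid used to build the net are hypothetical choices for this illustration; the grid spacing is taken small enough that the images of the grid points form a $\frac{1}{n}$-net, because this particular `A` has Lipschitz constant at most $\sqrt{3}$.

```python
import math

def A(x):
    # Assumed example: a continuous map of the unit square M = [0,1]^2 into R^2
    return (math.sin(x[0] + x[1]), math.cos(x[0]))

def dist(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])

def schauder_operator(A, net, n):
    """Return A_n with A_n(x) = sum_j a_j(x) y_j / sum_j a_j(x),
    where a_j(x) = max(1/n - |A(x) - y_j|, 0) and net = [y_1, ..., y_J]."""
    def A_n(x):
        Ax = A(x)
        weights = [max(1.0 / n - dist(Ax, y), 0.0) for y in net]
        total = sum(weights)  # positive when net is a 1/n-net of A(M)
        return tuple(sum(w * y[i] for w, y in zip(weights, net)) / total
                     for i in range(2))
    return A_n

n = 10
grid = [i / 20 for i in range(21)]             # spacing 0.05 on [0, 1]
net = [A((s, t)) for s in grid for t in grid]  # images of the grid points
A_n = schauder_operator(A, net, n)

# Sample points of M and the largest observed error |A_n x - A x|.
samples = [((0.013 * i) % 1.0, (0.037 * i) % 1.0) for i in range(200)]
worst = max(dist(A_n(x), A(x)) for x in samples)
print(worst, 1.0 / n)
```

Each value $A_nx$ is a convex combination of net points lying within $1/n$ of $Ax$, which is why the observed error stays below $1/n$ and why the range of $A_n$ lies in the convex envelope of $A(M)$.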

b. The Caccioppoli-Schauder theorem

9.141 Theorem (Caccioppoli-Schauder). Let $M \subset X$ be a closed, bounded, convex nonempty subset of a Banach space $X$. Every compact operator $A : M \to M$ from $M$ into itself has at least one fixed point.

Proof. Let $u_0 \in M$. Replacing $u$ with $u - u_0$ we may assume that $0 \in M$. From (ii) of Theorem 9.140 there are finite-dimensional subspaces $X_n \subset X$ and continuous operators $A_n : M \to X_n$ such that $\|Au - A_nu\| \le \frac{1}{n}$ $\forall u \in M$ and $A_n(M) \subset \operatorname{co}(A(M))$. The subset $M_n := X_n \cap M$ is bounded, closed, convex and nonempty (since $0 \in M_n$) and $A_n(M_n) \subset M_n$. Brouwer's theorem then yields a fixed point for $A_n : M_n \to M_n$, i.e.,

$$u_n \in M_n, \qquad A_nu_n = u_n,$$

hence

$$\|Au_n - u_n\| = \|Au_n - A_nu_n\| \le \frac{1}{n} \to 0.$$

Since the sequence $\{u_n\} \subset M$ is bounded and $A$ is compact, passing to a subsequence still denoted by $\{u_n\}$, we deduce that $\{Au_n\}$ converges to an element $v \in X$. On the other hand $v \in M$, since $M$ is closed, and

$$\|u_n - v\| \le \|v - Au_n\| + \|Au_n - u_n\| \to 0 \quad \text{as } n \to \infty;$$

thus $u_n \to v$, and from $\|Au_n - u_n\| \to 0$ we infer $Av = v$, taking into account the continuity of $A$. $\square$

c. The Leray-Schauder principle

A consequence of the Caccioppoli-Schauder theorem is the following principle, which is very useful in applications, proved by Jean Leray (1906-1998) and Juliusz Schauder (1899-1943) in 1934 in the more general context of degree theory and often referred to as the fixed point theorem of Helmut Schaefer (1925- ).

9.142 Theorem (Schaefer). Let $X$ be a Banach space and $A : X \to X$ a compact operator. Suppose that the following a priori estimate holds: there is a positive number $r > 0$ such that, if $u \in X$ solves

$$u = tAu \quad \text{for some } 0 < t < 1,$$

then $\|u\| < r$. Then the equation

$$v = Av, \qquad v \in X,$$

has at least one solution.

Proof. Let $M := \{u \in X \mid \|u\| \le r\}$ and consider the composition $B$ of $A$ with the retraction on the ball, i.e.,

$$Bu := \begin{cases} Au & \text{if } \|Au\| \le r, \\[2pt] \dfrac{rAu}{\|Au\|} & \text{if } \|Au\| > r. \end{cases}$$

$B$ maps $M$ to $M$, is continuous and maps bounded sets into relatively compact sets, since $A$ is compact. Therefore the Caccioppoli-Schauder theorem yields a fixed point $\bar{u} \in M$ for $B$, $B\bar{u} = \bar{u}$. Now, if $\|A\bar{u}\| \le r$, $\bar{u}$ is also a fixed point for $A$; otherwise $\|A\bar{u}\| > r$ and

$$\bar{u} = B\bar{u} = \frac{r}{\|A\bar{u}\|}A\bar{u} = tA\bar{u}, \qquad t := \frac{r}{\|A\bar{u}\|} < 1,$$

hence $\|\bar{u}\| < r$ by the a priori estimate. It follows that also $\|B\bar{u}\| < r$, which is impossible in this case since then $\|B\bar{u}\| = r$. Therefore $\|A\bar{u}\| \le r$, i.e., $\bar{u} = B\bar{u} = A\bar{u}$, and $\bar{u}$ is again a fixed point for $A$. $\square$

Theorems 9.134 and 9.142 may be regarded as special cases of a sort of general principle: a priori estimates on the possible solution yield existence of a solution.

9.5.3 The method of super- and sub-solutions In this section we state an abstract formulation of the following principle that reminds us of the intermediate value theorem: to find a solution, it often suffices to find a subsolution and a supersolution.


Figure 9.15. Juliusz Schauder (1899-1943) and Jean Leray (1906-1998).

a. Ordered Banach spaces

9.143 Definition. An order cone in a Banach space $X$ is a subset $X_+$ such that
(i) $X_+$ is closed, convex, nonempty and $X_+ \ne \{0\}$,
(ii) if $u \in X_+$ and $\lambda \ge 0$, then $\lambda u \in X_+$,
(iii) if $u \in X_+$ and $-u \in X_+$, then $u = 0$.

An order cone $X_+ \subset X$ defines a partial order in $X$,

$$u \le v \quad \text{if and only if} \quad v - u \in X_+,$$

and we say that $X$ is an ordered Banach space (by $X_+$). In this case intervals in $X$ are well defined,

$$[u,w] := \{v \in X \mid u \le v \le w\}.$$

9.144 Definition. An order cone $X_+$ is called normal if there is a number $c > 0$ such that $\|u\| \le c\|v\|$ whenever $0 \le u \le v$.

$\ldots \ge 0\ \forall x \in [a,b]\}$ is a normal order cone.

9.147 ¶. Let $u, v, w, u_n, v_n$ be elements of a Banach space $X$ ordered by an order cone $X_+$. Show that
(i) $u \le v$ and $v \le w$ imply $u \le w$,
(ii) $u \le v$ and $v \le u$ imply $u = v$,
(iii) if $u \le v$, then $u + w \le v + w$ and $\lambda u \le \lambda v$ $\forall \lambda \ge 0$, $\forall w \in X$,
(iv) if $u_n \le v_n$, $u_n \to u$ and $v_n \to v$ as $n \to \infty$, then $u \le v$,
(v) if $X_+$ is normal, then $u \le v \le w$ implies

$$\|v - u\| \le c\|w - u\| \quad \text{and} \quad \|w - v\| \le c\|w - u\|.$$

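b. Fixed points via sub- and super-solutions

9.148 Theorem. Let $X$ be a Banach space ordered by a normal order cone, let $u_0, v_0 \in X$ and let $A : [u_0,v_0] \subset X \to X$ be a (possibly nonlinear) compact operator. Suppose that $A$ is monotone increasing, i.e., $Au \le Av$ whenever $u \le v$, and that $u_0$ is a subsolution and $v_0$ a supersolution of the equation $u = Au$, i.e., $u_0 \le Au_0$ and $Av_0 \le v_0$. Then the two sequences $\{u_n\}$ and $\{v_n\}$ defined by

$$u_{n+1} = Au_n \quad \text{and} \quad v_{n+1} = Av_n \qquad \forall n \ge 0,$$

started respectively from $u_0$ and $v_0$, converge respectively to solutions $u_-$ and $u_+$ of the equation $u = Au$. Moreover

$$u_0 \le u_1 \le \dots \le u_n \le \dots \le u_- \le u_+ \le \dots \le v_n \le \dots \le v_0.$$

Proof. By induction

$$u_0 \le \dots \le u_n \le v_n \le \dots \le v_0,$$

since $A$ is monotone. From (v) of Exercise 9.147

$$\|u_0 - u_n\| \le c\,\|v_0 - u_0\| \quad \forall n,$$

i.e., $\{u_n\}$ is bounded. As $A$ is compact, there exists $u_- \in X$ such that for a subsequence $\{u_{k_n}\}$ of $\{u_n\}$ we have $Au_{k_n} \to u_-$ as $n \to \infty$. Finally $u_- = Au_-$, since $A$ is continuous. One operates similarly with $\{v_n\}$.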
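The two monotone iterations can be observed in the simplest possible setting. The sketch below assumes $X = \mathbb{R}$ ordered by the cone $[0,+\infty[$ and the increasing map $A(u) = \sqrt{u+2}$, with $u_0 = 0$ as a subsolution and $v_0 = 5$ as a supersolution; these choices and the helper name `monotone_iterates` are hypothetical and serve only as an illustration of the scheme $u_{n+1} = Au_n$, $v_{n+1} = Av_n$.

```python
import math

def monotone_iterates(A, u0, v0, steps):
    """Return the lists [u_0, ..., u_steps] and [v_0, ..., v_steps]
    generated by u_{n+1} = A(u_n) and v_{n+1} = A(v_n)."""
    us, vs = [u0], [v0]
    for _ in range(steps):
        us.append(A(us[-1]))
        vs.append(A(vs[-1]))
    return us, vs

# Assumed scalar example: A(u) = sqrt(u + 2) is increasing,
# u0 = 0 satisfies u0 <= A(u0) and v0 = 5 satisfies A(v0) <= v0;
# the only fixed point of A in [0, 5] is u = 2.
A = lambda u: math.sqrt(u + 2.0)
us, vs = monotone_iterates(A, 0.0, 5.0, 60)
print(us[-1], vs[-1])
```

In this example the increasing sequence $\{u_n\}$ and the decreasing sequence $\{v_n\}$ squeeze the same fixed point, so $u_- = u_+$; in general the two limits may differ.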

9.149 Remark. Notice that the conclusion of Theorem 9.148 still holds if we require that $A$ be monotone on the sequences $\{u_n\}$ and $\{v_n\}$ defined by $u_{n+1} = Au_n$ and $v_{n+1} = Av_n$, started respectively at $u_0$ and $v_0$, instead of being monotone in $[u_0,v_0]$.

9.6 Exercises

9.150 ¶. Show that in a normed space $(X, \|\ \|)$ the norm $\|\ \| : X \to \mathbb{R}_+$ is a Lipschitz-continuous function with best Lipschitz constant one, i.e.,

$$\big|\,\|x\| - \|y\|\,\big| \le \|x - y\| \quad \forall x, y \in X.$$

$\ldots\ 1$, is a convex function.

9.153 ¶. Convexity can replace the triangle inequality. Prove the following claim.

Figure 9.16. Frontispieces of two volumes on functional analysis.

Proposition. Let $X$ be a linear space and let $f : X \to \mathbb{R}_+$ be a function such that
(i) $f(x) \ge 0$, $f(x) = 0$ iff $x = 0$,
(ii) $f$ is positively homogeneous of degree one: $f(\lambda x) = |\lambda| f(x)$ $\forall x \in X$, $\forall \lambda > 0$,
(iii) the set $\{x \mid f(x) \le 1\}$ is convex.
Then $f(x)$ is a norm on $X$.

9.154 ¶. Prove the following variant of Lemma 9.22.

Lemma (Riesz). Let $Y$ be a closed proper linear subspace of a normed space $X$. Then, for every $\epsilon > 0$, there exists a point $z \in X$ with $\|z\| = 1$ such that $\|z - y\| > 1 - \epsilon$ for all $y \in Y$.

9.155 ¶. Show that $BV([a,b])$ is a Banach space with the norm

$$\|f\|_{BV} := \sup_{x \in [a,b]}|f(x)| + V_a^b(f).$$

[Hint: Compare Chapter 7 for the involved definitions.]

9.156 ¶. Show that in $C^0([a,b])$ the norms $\|\ \|_\infty$ and $\|\ \|_{L^p}$ are not equivalent.

9.157 ¶. Show that in $C^1([0,1])$

$$|x(0)| + \int_0^1 |x'(t)|\,dt$$

defines a norm, and that the convergence in this norm implies uniform convergence.

9.158 ¶. Denote by $c_0$ the linear space of infinitesimal real sequences $\{x_n\}$ and by $c_{00}$ the linear subspace of $c_0$ of sequences with only finitely many nonzero elements. Show that $c_0$ is closed in $\ell_\infty$ while $c_{00}$ is not closed.

9.159 ¶. Recall, see e.g. Section 2.2 of [GM1], that the oscillation of a function $f : \mathbb{R} \to \mathbb{R}$ over an interval around $x$ of radius $\delta$ is defined as

$$\omega_{f,\delta}(x) := \sup_{|y-x|<\delta}|f(y) - f(x)|$$

and that $f : \mathbb{R} \to \mathbb{R}$ is continuous at $x$ if and only if $\omega_{f,\delta}(x) \to 0$ as $\delta \to 0$. Show that $\omega_{f,\delta}(x) \to 0$ uniformly on every bounded interval of $\mathbb{R}$. [Hint: Use Theorem 6.35.]

9.160 ¶. Let $f \in C^1(\mathbb{R})$. Show that $\ldots$ uniformly in every bounded interval of $\mathbb{R}$. [Hint: Use Theorem 6.35.]

9.161 ¶. Let $f : \,]x_0 - 1, x_0 + 1[\, \subset \mathbb{R} \to \mathbb{R}$ be differentiable at $x_0$. Show that the blow-up sequence $\{f_n\}$, $f_n(z) :=$

Y

> } Kxo)z,

n

compare Section 3.1 of [GMl], converges uniformly on every bounded interval of R. [Hint: Use Theorem 6.35.] 9.162 %, Compute, if it exists, /•4

lim /

fn{x)dx,

/n(x):=-(e^/^-l).

9 . 1 6 3 f. Discuss the uniform convergence of the sequences of real functions in [0,1] f^{x)

:=(-l)^n(a:H-l)x'",

- ( 2 + sin(nj:))ei-^°^(^^)a:. n

9 . 1 6 4 f. Discuss the uniform convergence of the following real series E^^-cos ( n=l

^

)

.

f:(e^-e-^)arctan^, n=l

n

E, n3a:2 + 1'

^ ^ /arctan(na;) 7 r \ ^ ^

n=0

n=l

n=l

n=2

l

^j^

"^ n /

'

9.165 ¶. Show that $\{u \in C^1([0,1]) \mid \int_0^1 u(x)\,dx = 0\}$ is a linear subspace of $C^0([0,1])$ that is not closed.

9.166 ¶. Show that $\{u \in C^0([0,1]) \mid u(0) = 1\}$ is closed, convex and dense in $C^0([0,1])$.

9.167 ¶. Show that $\{u \in C^0(\mathbb{R}) \mid \lim_{x \to \pm\infty} u(x) = 0\}$ is a closed subspace of $C^0(\mathbb{R})$.

9.168 ¶. Show that the subspace $C^0_c(\mathbb{R})$ of $C^0(\mathbb{R})$ of functions with compact support is not closed.

9.169 ¶. Let $X$ be a compact metric space and $\mathcal{F} \subset C^0(X)$. Show that $\mathcal{F}$ is equicontinuous if one of the following conditions holds:
(i) the functions in $\mathcal{F}$ are equi-Lipschitz, i.e., $\exists M$ such that

$$|f(x) - f(y)| \le M\,d(x,y) \quad \forall x, y \in X,\ \forall f \in \mathcal{F},$$

(ii) the functions in $\mathcal{F}$ are equi-Hölder, i.e., $\exists M$ and $\alpha$, $0 < \alpha \le 1$, such that

$$|f(x) - f(y)| \le M\,d(x,y)^\alpha \quad \forall x, y \in X,\ \forall f \in \mathcal{F}.$$

9.170 ¶. Let $\mathcal{F} \subset C^0([a,b])$. Show that any of the following conditions implies equicontinuity of the family $\mathcal{F}$.
(i) the functions in $\mathcal{F}$ are of class $C^1$ and there exists $M > 0$ such that

$$|f'(x)| \le M \quad \forall x \in [a,b],\ \forall f \in \mathcal{F},$$

(ii) the functions in $\mathcal{F}$ are of class $C^1$ and there exist $M > 0$ and $p > 1$ such that

$$\int_a^b |f'|^p\,dx \le M \quad \forall f \in \mathcal{F}.$$

9.171 ¶. Let $\mathcal{F} \subset C^0([a,b])$ be a family of equicontinuous functions. Show that any of the following conditions implies equiboundedness of the functions in $\mathcal{F}$.
(i) $\exists C$, $\exists x_0 \in [a,b]$ such that $|f(x_0)| \le C$ $\forall f \in \mathcal{F}$,
(ii) $\exists C$ such that $\forall f \in \mathcal{F}$ $\exists x \in [a,b]$ with $|f(x)| \le C$,
(iii) $\exists C$ such that $\int_a^b |f(t)|\,dt \le C$ $\forall f \in \mathcal{F}$.

9.172 ¶. Let $Q$ be a set and let $X$ be a metric space. Prove that a subset $H$ of the space of bounded functions from $Q$ in $X$ with the uniform norm is relatively compact if and only if, for any $\epsilon > 0$, there exists a finite partition $Q_1, \dots, Q_{n_\epsilon}$ of $Q$ such that the total variation of every $u \in H$ in every $Q_i$ is not greater than $\epsilon$.

9.173 ¶. Show that a subset $K \subset \ell_p$, $1 \le p < \infty$, is compact if and only if $K$ is bounded, closed and $\forall \epsilon > 0$ $\exists n_\epsilon$ such that $\sum_{n=n_\epsilon}^\infty |x_n|^p \le \epsilon$ for all $\{x_n\} \in K$.

9.174 ¶. Let $X$ be a complete metric space with the property that the bounded and closed subsets of $C^0(X)$ are compact. Show that $X$ consists of a finite number of points.

9.175 ¶. Hölder-continuous functions. Let $\Omega$ be a bounded open subset of $\mathbb{R}^n$. According to Definition 9.46 the space of Hölder-continuous functions with exponent $\alpha$ in $\Omega$, $C^{0,\alpha}(\overline{\Omega})$, $0 < \alpha \le 1$ (also called Lipschitz continuous if $\alpha = 1$), is defined as the linear space of continuous functions in $\Omega$ such that

$$\|u\|_{0,\alpha,\Omega} := \sup_\Omega |u| + [u]_{0,\alpha,\Omega} < +\infty$$

where

$$[u]_{0,\alpha,\Omega} := \sup_{x,y \in \Omega,\ x \ne y}\frac{|u(x) - u(y)|}{|x-y|^\alpha}.$$

One also defines $C^{0,\alpha}_{loc}(\Omega)$ as the space of functions that belong to $C^{0,\alpha}(\overline{A})$ for all relatively compact open subsets $A$, $A \subset\subset \Omega$. Show that $C^{0,\alpha}(\overline{\Omega})$ is a Banach space with the norm $\|\ \|_{0,\alpha,\Omega}$.

9.176 ¶. Show that the space $C^k([a,b])$ is a Banach space with norm

$$\|u\|_{C^k} := \sum_{h=0}^k \|D^hu\|_{\infty,[a,b]}.$$

Define $C^{k,\alpha}([a,b])$ as the linear space of functions in $C^k([a,b])$ with Hölder-continuous $k$th derivative with exponent $\alpha$, i.e., such that

$$\|u\|_{k,\alpha,[a,b]} := \|u\|_{C^k([a,b])} + [D^ku]_{0,\alpha,[a,b]} < +\infty.$$

Show that $C^{k,\alpha}([a,b])$ is a Banach space with norm $\|\ \|_{k,\alpha,[a,b]}$.

9.177 ¶. Show that the immersion of $C^{0,\beta}([a,b])$ into $C^{0,\alpha}([a,b])$ is compact if $0 \le \alpha < \beta \le 1$. More generally, show that the immersion of $C^{h,\alpha}([a,b])$ into $C^{k,\beta}([a,b])$ is compact if $k + \beta < h + \alpha$.

9.178 ¶. Let $\Omega \subset \mathbb{R}^2$ be defined by

$$\Omega := \{(x,y) \in \mathbb{R}^2 \mid y < |x|^{1/2},\ x^2 + y^2 < 1\}.$$

By considering the function

$$u(x,y) := \begin{cases} (\operatorname{sgn} x)\,y^\beta & \text{if } y > 0, \\ 0 & \text{if } y \le 0, \end{cases}$$

where $1 < \beta < 2$, show that $u \in C^1(\overline{\Omega})$, but $u \notin C^{0,\alpha}(\overline{\Omega})$ if $\beta/2 < \alpha < 1$.

9.179 ¶. Prove the following

Proposition. Let $\Omega$ be a bounded open set in $\mathbb{R}^n$ satisfying one of the following conditions
(i) $\Omega$ is convex,
(ii) $\Omega$ is star-shaped,
(iii) $\partial\Omega$ is locally the graph of a Lipschitz-continuous function.
Then $C^{h,\alpha}(\overline{\Omega}) \subset C^{k,\beta}(\overline{\Omega})$ and the immersion is compact if $k + \beta < h + \alpha$.

[Hint: Show that in all cases there exist a constant $M$ and an integer $n$ such that $\forall x, y \in \Omega$ there are at most $n$ points $z_1, z_2, \dots, z_n$ with $z_1 = x$ and $z_n = y$ such that $\sum_{i=1}^{n-1}|z_i - z_{i+1}| \le M|x - y|$. Use Lagrange mean value theorem.]

9.180 ¶. Show that the space of Lipschitz-continuous functions in $[a,b]$ is dense in $C^0([a,b])$. [Hint: Use the mean value theorem.]

9.181 ¶. Show that the space of Lipschitz-continuous functions in $[a,b]$ with Lipschitz constant not greater than $k$ agrees with the closure in $C^0([a,b])$ of the functions of class $C^1$ with $\sup_x |f'(x)| \le k$.

9.182 ¶. Let $\lambda > 0$. Show that

$$\Big\{u \in C^0([0,+\infty[) \;\Big|\; \sup_{[0,+\infty[}|u|e^{-\lambda x} < +\infty\Big\}$$

is complete with respect to the metric $d(f,g) := \sup_x\{|f(x) - g(x)|\,e^{-\lambda x}\}$.

9.183 ¶. Let $f : [0,1] \to [0,1]$ be a diffeomorphism with $f'(x) > 0$ $\forall x \in [0,1]$. Show that there exists a sequence of polynomials $P_n(x)$, which are diffeomorphisms from $[0,1]$ into $[0,1]$, that converges uniformly to $f$ in $[0,1]$. [Hint: Use Weierstrass's approximation theorem.]

9.184 ¶. Define for $A = [a^i_j] \in M_{n,n}(\mathbb{K})$, $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$,

$$\|A\| := \sup\Big\{\frac{|Ax|}{|x|} \;\Big|\; x \ne 0\Big\}.$$

Show that
(i) $|Ax| \le \|A\|\,|x|$ $\forall x \in \mathbb{K}^n$,
(ii) $\|A\| = \sup\{(Ax|y) \mid |x| = |y| = 1\}$,

(iii) l|A|P<Er,,=iH)'
||A||=

sup ||A(2)||=max(^|Ai.|),

(ii) if|N| = Ni:=Er=ik*l,then ||A||=

sup ||A(2)|| = m a ^ ( f ^ | A 5 | ) .

9.186 ¶. Let $A, B \in M_{2,2}(\mathbb{R})$ be given by

Then $AB \ne BA$. Compute $\exp(A)$, $\exp(B)$, $\exp(A)\exp(B)$, $\exp(B)\exp(A)$ and $\exp(A + B)$.

9.187 ¶. Define

$$\mathcal{N}(n) := \{N \in \operatorname{End}(\mathbb{C}^n) \mid N \text{ is normal}\},$$
$$\mathcal{U}(n) := \{N \in \operatorname{End}(\mathbb{C}^n) \mid N \text{ is unitary}\},$$
$$\mathcal{H}(n) := \{N \in \operatorname{End}(\mathbb{C}^n) \mid N \text{ is self-adjoint}\},$$
$$\mathcal{H}_+(n) := \{N \in \operatorname{End}(\mathbb{C}^n) \mid N \text{ is self-adjoint and positive}\}.$$

Show that
(i) if $N \in \mathcal{N}(n)$ has spectral resolution $N = \sum_{j=1}^n \lambda_jP_j$, then $\exp(N) \in \mathcal{N}(n)$ and has the spectral resolution $\exp(N) = \sum_{j=1}^n e^{\lambda_j}P_j$,
(ii) $\exp$ is one-to-one from $\mathcal{H}(n)$ onto $\mathcal{H}_+(n)$,
(iii) the operator $H \to \exp(iH)$ is one-to-one from $\mathcal{H}(n)$ onto $\mathcal{U}(n)$.

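9.188 ¶. Let $L \in \operatorname{End}(\mathbb{C}^n)$. Then $\mathrm{Id} - L$ is invertible if and only if 1 is not an eigenvalue of $L$. If $L$ is normal, then $L = \sum_{j=1}^n \lambda_jP_j$ and we have

$$(\mathrm{Id} - L)^{-1} = \sum_{j=1}^n \frac{1}{1-\lambda_j}P_j.$$

If $\|L\| < 1$, then all eigenvalues have modulus smaller than one and

$$(\mathrm{Id} - L)^{-1} = \sum_{k=0}^\infty \sum_{j=1}^n \lambda_j^kP_j = \sum_{k=0}^\infty L^k.$$

9.189 ¶. Let $T, T^{-1} \in \operatorname{End}(X)$. Show that if $S \in \operatorname{End}(X)$ and $\|S - T\| < 1/\|T^{-1}\|$, then $S^{-1}$ exists, is a bounded operator and

$$\|S^{-1} - T^{-1}\| \le \frac{\|T^{-1}\|^2\,\|S - T\|}{1 - \|S - T\|\,\|T^{-1}\|}.$$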

9.190 ¶. Let $X$ and $Y$ be Banach spaces. We denote by $\operatorname{Isom}(X,Y)$ the subset of all continuous isomorphisms from $X$ into $Y$, that is, the subset of $\mathcal{L}(X,Y)$ of linear continuous operators $L : X \to Y$ with continuous inverse. Prove the following.

Theorem. We have
(i) $\operatorname{Isom}(X,Y)$ is an open set of $\mathcal{L}(X,Y)$.
(ii) The map $f \to f^{-1}$ from $\operatorname{Isom}(X,Y)$ into itself is continuous.

[Hint: In the case of finite-dimensional spaces, it suffices to observe that the determinant is a continuous function.]

9.191 ¶. Show that, if $f$ is linear and preserves the distances, then $f \in \operatorname{Isom}(X,Y)$.

9.192 ¶. Show that the linear map $D : C^1([0,1]) \subset C^0([0,1]) \to C^0([0,1])$ that maps $f$ to $f'$ is not continuous with respect to the uniform convergence. Show that also the map from $C^0$ into $\mathbb{R}$ with domain $C^1$,

$$f \in C^1([0,1]) \subset C^0([0,1]) \to f'(1/2) \in \mathbb{R},$$

is not continuous. In particular, notice that linear subspaces of a normed space are not necessarily closed.

9.193 ¶. Fix $a = \{a_n\} \in \ell_\infty$ and consider the linear operator $L : \ell_1 \to \ell_1$, $(Lx)_n = a_nx_n$. Show that
(i) $\|L\| = \|a\|_{\ell_\infty}$,
(ii) $L$ is injective iff $a_n \ne 0$ $\forall n$,
(iii) $L$ is surjective and $L^{-1}$ is continuous if and only if $\inf |a_n| > 0$.

9.194 ¶. Show that the equation $2u = \cos u + 1$ has a unique solution in $C^0([0,1])$.

10. Hilbert Spaces, Dirichlet's Principle and Linear Compact Operators

In a normed space, we can measure the length of a vector but not the angle formed by two vectors. This is instead possible in a Hilbert space, i.e., a Banach space whose norm is induced by an inner (or Hermitian) product. The inner (Hermitian) product allows us to measure the length of a vector, the distance between two vectors and the angle formed by them.

The abstract theory of Hilbert spaces originated from the theory of integral equations of Vito Volterra (1860-1940) and Ivar Fredholm (1866-1927), subsequently developed by David Hilbert (1862-1943) and J. Henri Poincaré (1854-1912) and reformulated, mainly by Erhard Schmidt (1876-1959), as a theory of linear equations with infinitely many unknowns. The axiomatic presentation, based on the notion of inner product, appeared around the 1930s and is due to John von Neumann (1903-1957) in connection with the developments of quantum mechanics.

In this chapter, we shall illustrate the geometry of Hilbert spaces. In Section 10.2 we discuss the orthogonality principles, in particular the projection theorem and the abstract Dirichlet principle. Then, in Section 10.4 we shall discuss the spectrum of compact operators, partially generalizing to infinite dimensions the theory of finite-dimensional eigenvalues, see Chapter 4.

10.1 Hilbert Spaces

A Hilbert space is a real (complex) Banach space whose norm is induced by an inner (Hermitian) product.

10.1.1 Basic facts

a. Definitions and examples

10.1 Definition. A real (complex) linear space, endowed with an inner or scalar (respectively Hermitian) product $(\,\cdot \mid \cdot\,)$ is called a pre-Hilbert space.

Figure 10.1. David Hilbert (1862-1943) and the Theorie der linearen Integralgleichungen.

We have discussed algebraic properties of the inner and Hermitian products in Chapter 3. We recall, in particular, that in a pre-Hilbert space $H$ the function

$$\|x\| := \sqrt{(x \mid x)}, \qquad x \in H, \tag{10.1}$$

defines a norm on $H$ for which the Cauchy-Schwarz inequality,

$$|(x \mid y)| \le \|x\|\, \|y\| \qquad \forall x, y \in H,$$

holds. Moreover, Carnot's theorem

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2 + 2\Re (x \mid y) \qquad \forall x, y \in H$$

and the parallelogram law

$$\|x + y\|^2 + \|x - y\|^2 = 2\left(\|x\|^2 + \|y\|^2\right) \qquad \forall x, y \in H$$

hold. In Chapter 3 we also discussed the geometry of real and complex pre-Hilbert spaces of finite dimension. Here we add some considerations that are relevant for spaces of infinite dimension.

A pre-Hilbert space $H$ is naturally a normed space and has a natural topology induced by the inner product. In particular, if $\{x_n\} \subset H$ and $x \in H$, then $x_n \to x$ means that $\|x_n - x\| = (x_n - x \mid x_n - x)^{1/2} \to 0$ as $n \to \infty$. As for any normed vector space, the norm is continuous. We also have the following.

10.2 Proposition. The inner (or Hermitian) product in a pre-Hilbert space $H$ is continuous on $H \times H$, i.e., if $x_n \to x$ and $y_n \to y$ in $H$, then $(x_n \mid y_n) \to (x \mid y)$. In particular, if $(x \mid y) = 0$ for all $y$ in a dense subset $Y$ of $H$, we have $x = 0$.


Proof. In fact

$$|(x_n \mid y_n) - (x \mid y)| = |(x_n - x \mid y_n) + (x \mid y_n - y)| \le \|x_n - x\|\, \|y_n\| + \|x\|\, \|y_n - y\|;$$

the claim then follows since the sequence $\{\|y_n\|\}$ is bounded, being convergent. If $Y$ is a dense subset of $H$, we find for any $x \in H$ a sequence $\{y_n\} \subset Y$ such that $y_n \to x$. Taking the limit in $(x \mid y_n) = 0$, we get $(x \mid x) = 0$. □

10.3 ¶ Differentiability of the inner product. Let $u : \,]a,b[\, \to H$ be a map from an interval of $\mathbb{R}$ into a pre-Hilbert space $H$. We can extend the notion of derivative in this context. We say that $u$ is differentiable at $t_0 \in\, ]a,b[$ if the limit

$$u'(t_0) := \lim_{t \to t_0} \frac{u(t) - u(t_0)}{t - t_0} \in H$$

exists. Check that

Proposition. If $u, v : \,]a,b[\, \to H$ are differentiable in $]a,b[$, so is $t \mapsto (u(t) \mid v(t))$ and

$$\frac{d}{dt}\,(u(t) \mid v(t)) = (u'(t) \mid v(t)) + (u(t) \mid v'(t)) \qquad \forall t \in\, ]a,b[.$$

10.4 Definition. A pre-Hilbert space $H$ that is complete with respect to the induced norm, $\|x\| := \sqrt{(x \mid x)}$, is called a Hilbert space.

10.5 ¶. Every pre-Hilbert space $H$, being a metric space, can be completed. Show that its completion $\widetilde{H}$ is a Hilbert space with an inner product that agrees with the original one when restricted to $H$.

Exercise 10.5 and Theorem 9.21 yield at once the following.

10.6 Proposition. Every finite-dimensional pre-Hilbert space is complete, hence a Hilbert space. In particular, any finite-dimensional subspace of a pre-Hilbert space is complete, hence closed. The closed unit ball of a Hilbert space $H$ is compact if and only if $H$ is finite dimensional.

10.7 Example. The space of square integrable real sequences

$$\ell_2 = \ell_2(\mathbb{R}) := \Big\{ x = \{x_n\} \ \Big|\ x_n \in \mathbb{R},\ \sum_{i=1}^{\infty} |x_i|^2 < \infty \Big\}$$

is a Hilbert space with inner product $(x \mid y) := \sum_{i=1}^{\infty} x_i y_i$, compare Section 9.1.2. Similarly, the space of square integrable complex sequences

$$\ell_2(\mathbb{C}) := \Big\{ x = \{x_n\} \ \Big|\ x_n \in \mathbb{C},\ \sum_{i=1}^{\infty} |x_i|^2 < \infty \Big\}$$

is a Hilbert space with the Hermitian product $(x \mid y) := \sum_{i=1}^{\infty} x_i \overline{y_i}$.


10.8 Example. In $C^0([a,b])$

$$(f \mid g) := \int_a^b f(x)\, g(x)\, dx$$

defines an inner product with induced norm $\|f\|_2 := \big( \int_a^b |f(t)|^2\, dt \big)^{1/2}$. As we have seen in Section 9.1.2, $C^0([a,b])$ is not complete with respect to this norm. Similarly

$$(f \mid g) := \int_0^1 f(x)\, \overline{g(x)}\, dx$$

defines in $C^0([0,1], \mathbb{C})$ a pre-Hilbert structure for which $C^0([0,1], \mathbb{C})$ is not complete.

b. Orthogonality

Two vectors $x$ and $y$ of a pre-Hilbert space are said to be orthogonal, and we write $x \perp y$, if $(x \mid y) = 0$. The Pythagorean theorem holds for pairwise orthogonal vectors $x_1, x_2, \dots, x_n$:

$$\Big\| \sum_{i=1}^{n} x_i \Big\|^2 = \sum_{i=1}^{n} \|x_i\|^2 .$$

Actually, if $H$ is a real pre-Hilbert space, $x \perp y$ if and only if $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.

A denumerable set of vectors $\{e_n\}$ is called orthonormal if $(e_h \mid e_k) = \delta_{hk}$ $\forall h, k$. Of course, orthonormal vectors are linearly independent.

10.9 Example. Here are a few examples.

(i) In $\ell_2$, the sequence $e_1 = (1, 0, \dots)$, $e_2 = (0, 1, \dots)$, $\dots$, is orthonormal. Notice that it is not a linear basis in the algebraic sense.

(ii) In $C^0([a,b], \mathbb{R})$ with the $L^2$-inner product

$$(f \mid g)_{L^2} := \int_a^b f(x)\, g(x)\, dx$$

the trigonometric system

$$\frac{1}{\sqrt{b-a}}, \quad \sqrt{\frac{2}{b-a}}\, \cos\Big( n\, \frac{2\pi x}{b-a} \Big), \quad \sqrt{\frac{2}{b-a}}\, \sin\Big( n\, \frac{2\pi x}{b-a} \Big), \qquad n = 1, 2, \dots$$

is orthonormal, compare Lemma 5.45 of [GM2].

(iii) In $C^0([a,b], \mathbb{C})$ with the Hermitian $L^2$-product

$$(f \mid g)_{L^2} := \int_a^b f(x)\, \overline{g(x)}\, dx,$$

the trigonometric system

$$\frac{1}{\sqrt{b-a}}\, \exp\Big( i\, \frac{2k\pi x}{b-a} \Big), \qquad k \in \mathbb{Z},$$

forms again an orthonormal system.
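As a numerical sanity check of the real trigonometric system on $[a,b]$, one can approximate its Gram matrix with a quadrature rule. This is a minimal sketch, assuming the normalization $1/\sqrt{b-a}$ for the constant and $\sqrt{2/(b-a)}$ for the sines and cosines; the function names, the interval $[-1,2]$ and the midpoint rule are illustrative choices, not part of the text.

```python
import math

def l2_inner(f, g, a, b, samples=2000):
    """Midpoint-rule approximation of the integral of f*g over [a, b]."""
    h = (b - a) / samples
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h)
                   for i in range(samples))

def trig_system(a, b, max_n):
    """Constant, cosines and sines up to frequency max_n, normalized in L^2(a, b)."""
    length = b - a
    functions = [lambda x: 1.0 / math.sqrt(length)]
    for n in range(1, max_n + 1):
        functions.append(lambda x, n=n: math.sqrt(2.0 / length)
                         * math.cos(2 * math.pi * n * x / length))
        functions.append(lambda x, n=n: math.sqrt(2.0 / length)
                         * math.sin(2 * math.pi * n * x / length))
    return functions

a, b = -1.0, 2.0
system = trig_system(a, b, 3)

# Gram matrix: entry (i, j) approximates (f_i | f_j); it should be the identity.
gram = [[l2_inner(f, g, a, b) for g in system] for f in system]
```

The computed Gram matrix is the identity up to rounding, which is what orthonormality means for these seven functions.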


10.1.2 Separable Hilbert spaces and basis

a. Complete systems and basis

Let $H$ be a pre-Hilbert space. We recall that a set $E$ of vectors in $H$ is said to be linearly independent if any finite choice of vectors in $E$ is linearly independent. A set $E \subset H$ of linearly independent vectors such that any vector in $H$ is a finite linear combination of vectors in $E$ is called an algebraic basis of $H$.

We say that a system of vectors $\{e_\alpha\}_{\alpha \in A}$ in a pre-Hilbert space $H$ is complete if the smallest closed linear subspace that contains them is $H$, or equivalently, if all finite linear combinations of the $\{e_\alpha\}$ are dense in $H$. Operatively, $\{e_\alpha\}_{\alpha \in A} \subset H$ is complete if for every $x \in H$, there exists a sequence $\{x_n\}$ of finite linear combinations of the $e_\alpha$'s,

$$\sum_{i=1}^{k} c_i\, e_{\alpha_i}, \qquad \alpha_1, \dots, \alpha_k \in A,$$

that converges to $x$.

10.10 Definition. A complete denumerable system $\{e_n\}$ of a pre-Hilbert space $H$ of linearly independent vectors is called a basis of $H$.¹

b. Separable Hilbert spaces

A metric space $X$ is said to be separable if there exists a denumerable and dense family in $X$. Suppose now that $H$ is a separable pre-Hilbert space, and $\{x_n\}$ is a denumerable dense subset of $H$; then necessarily $\{x_n\}$ is a complete system in $H$. Therefore, if we inductively eliminate from the family $\{x_n\}$ all elements that are linearly dependent on the preceding ones, we construct an at most denumerable basis of vectors $\{y_n\}$ of $H$. Even more, applying the iterative process of Gram-Schmidt, see Chapter 3, to the basis $\{y_n\}$, we produce an at most denumerable orthonormal basis of $H$, thus concluding that every separable pre-Hilbert space has an at most denumerable orthonormal basis.

The converse holds, too. If $\{e_n\}$ is an at most denumerable complete system in $H$ and, for all $n$, $V_n$ is the family of the linear combinations of $e_1, e_2, \dots, e_n$ with rational coefficients (or, in the complex case, with coefficients with rational real and imaginary parts), then $\bigcup_n V_n$ is dense in $H$. We therefore can state the following.

10.11 Theorem. A pre-Hilbert space $H$ is separable if and only if it has an at most denumerable orthonormal basis.

¹ Notice that a basis, in the sense just defined, need not be a basis in the algebraic sense. In fact, though every element in $H$ is the limit of finite linear combinations of elements of $\{e_\alpha\}$, it need not be a finite linear combination of elements of $\{e_\alpha\}$. Actually, it is a theorem that any algebraic basis of an infinite-dimensional Banach space has a nondenumerable cardinality.
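The two-step construction used for separable spaces (discard the vectors that depend linearly on the preceding ones, then orthonormalize) can be sketched in finite dimension. The code below is a hedged illustration in $\mathbb{R}^3$ with the Euclidean inner product; the tolerance `tol`, the sample vectors and the function names are assumptions made for the example.

```python
import math

def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(xi * yi for xi, yi in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize `vectors`, skipping those that depend on earlier ones."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components of w along the orthonormal vectors found so far.
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # otherwise v is (numerically) dependent: drop it
            basis.append([wi / norm for wi in w])
    return basis

# The second vector is a multiple of the first and is eliminated.
vectors = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
basis = gram_schmidt(vectors)
```

Here four vectors are given, one is discarded as dependent, and the remaining three are turned into an orthonormal basis of $\mathbb{R}^3$.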

Figure 10.2. Frontispieces of Geometria nello Spazio Hilbertiano by Giuseppe Vitali (1875-1932) and a volume on normed spaces.

10.12 Example. The following is an example of a nonseparable pre-Hilbert space: the space of all real functions $f$ that are nonzero in at most a denumerable set of points $\{t_i\}$ (varying with $f$) and moreover satisfy $\sum_i f(t_i)^2 < \infty$, with inner product $(x \mid y) = \sum x(t)\, y(t)$, the sum being restricted to points where $x(t)\, y(t) \neq 0$.

10.13 Remark. Using Zorn's lemma, one can show that every Hilbert space has an orthonormal basis (nondenumerable if the space is nonseparable); also there exist nonseparable pre-Hilbert spaces with no orthonormal basis.

Let $H$ be a separable Hilbert space, let $\{e_n\}$ be an orthonormal basis of $H$ and let $P_n : H \to H$ be the orthogonal projection onto the finite-dimensional subspace $H_n := \mathrm{Span}\,\{e_1, e_2, \dots, e_n\}$. If $L : H \to Y$ is a linear operator from $H$ into a linear normed space $Y$, set $L_n(x) := L \circ P_n(x)$ $\forall x \in H$. Since the $L_n$'s are obviously continuous, $H_n$ being finite dimensional, and $\|L_n(x) - L(x)\|_Y \to 0$ $\forall x \in H$, we infer from the Banach-Steinhaus theorem the following.

10.14 Proposition. Any linear map $L : H \to Y$ from a separable Hilbert space into a normed space $Y$ is bounded.

Therefore linear unbounded operators $L : D \to Y$ on a separable Hilbert space $H$ are necessarily defined only on a dense proper subset $D \subsetneq H$. There exist instead noncontinuous linear operators from a nonseparable Hilbert space into $\mathbb{R}$.

10.15 Example. Let $X$ be the Banach space $c_0$ of infinitesimal real sequences, cf. Exercise 9.158, and let $f : X \to \mathbb{R}$ be defined by $f((\alpha_1, \alpha_2, \dots)) := \alpha_1$. Then $\ker f = \{(\alpha_n) \in c_0 \mid \alpha_1 = 0\}$ is closed. To get an example of a dense hyperplane, let $e^n$ be the element of $c_0$ such that $e^n_k = \delta_{k,n}$ and let $x^0$ be the element of $c_0$ given by $x^0_n = 1/n$, so that $\{x^0, e^1, e^2, \dots\}$ is a linearly independent set in $c_0$. Denote by $B$ a Hamel basis (i.e., an algebraic basis) in $c_0$ which contains $\{x^0, e^1, e^2, \dots\}$, and set

$$B = \{x^0, e^1, e^2, \dots\} \cup \{b^i \mid i \in I\}$$

where $b^i \neq x^0, e^n$ for any $i$ and $n$. Define

$$f : c_0 \to \mathbb{R}, \qquad f\Big( \alpha_0 x^0 + \sum_{n \ge 1} \alpha_n e^n + \sum_{i} \beta_i b^i \Big) = \alpha_0 .$$

Since $e^n \in \ker f$ $\forall n \ge 1$, $\ker f$ is dense in $c_0$ but clearly $\ker f \neq c_0$.

10.16 ¶. Formulate similar examples in the Hilbert space of Example 10.12.

c. Fourier series and $\ell_2$

We shall now show that there exist essentially only two separable Hilbert spaces: $\ell_2(\mathbb{R})$ and $\ell_2(\mathbb{C})$. As we have seen, if $H$ is a finite-dimensional pre-Hilbert space, and $(e_1, e_2, \dots, e_n)$ is an orthonormal basis of $H$, we have

$$x = \sum_{j=1}^{n} (x \mid e_j)\, e_j, \qquad \|x\|^2 = \sum_{j=1}^{n} |(x \mid e_j)|^2 .$$

We now extend these formulas to separable Hilbert spaces.

Let $H$ be a separable pre-Hilbert space and let $\{e_n\}$ be an orthonormal set of $H$. For $x \in H$, the Fourier coefficients of $x$ with respect to $\{e_n\}$ are defined as the sequence $\{(x \mid e_j)\}_j$, and the Fourier series of $x$ as the series

$$\sum_{j=1}^{\infty} (x \mid e_j)\, e_j,$$

whose partial $n$-sum is the orthogonal projection $P_n(x)$ of $x$ into the finite-dimensional space $V_n := \mathrm{Span}\,\{e_1, e_2, \dots, e_n\}$,

$$P_n(x) = \sum_{j=1}^{n} (x \mid e_j)\, e_j .$$

Three questions naturally arise: what is the image of

$$\mathcal{F}(x) := \{(x \mid e_j)\}_j, \qquad x \in H?$$

Does the Fourier series of $x$ converge? Does it converge to $x$? The rest of this section will answer these questions.


10.17 Proposition (Bessel's inequality). Let $\{e_n\}$ be an orthonormal set in the pre-Hilbert space $H$. Then

$$\sum_{k=1}^{\infty} |(x \mid e_k)|^2 \le \|x\|^2 \qquad \forall x \in H. \tag{10.2}$$

Proof. Since for all $n$ the orthogonal projection of $x$ on the finite-dimensional subspace $V_n := \mathrm{Span}\,\{e_1, e_2, \dots, e_n\}$ is $P_n(x) = \sum_{k=1}^{n} (x \mid e_k)\, e_k$, the Pythagorean theorem yields

$$\sum_{k=1}^{n} |(x \mid e_k)|^2 = \|P_n(x)\|^2 = \|x\|^2 - \|x - P_n(x)\|^2 \le \|x\|^2 .$$

When $n \to \infty$, we get the Bessel inequality (10.2). □
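The identity $\|P_n(x)\|^2 = \|x\|^2 - \|x - P_n(x)\|^2$ used in the proof can be seen on a toy case. The sketch below works in $\mathbb{R}^3$ with an orthonormal set of two vectors that is deliberately not complete; the vectors and names are illustrative assumptions.

```python
import math

def dot(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

# An orthonormal set of two vectors in R^3 (not complete).
s = 1.0 / math.sqrt(2.0)
e1 = [s, s, 0.0]
e2 = [s, -s, 0.0]

x = [1.0, 2.0, 3.0]

# Fourier coefficients of x and the sum appearing in Bessel's inequality.
coeffs = [dot(x, e1), dot(x, e2)]
bessel_sum = sum(c * c for c in coeffs)

# Orthogonal projection of x onto Span{e1, e2} and the residual x - P(x).
projection = [coeffs[0] * a + coeffs[1] * b for a, b in zip(e1, e2)]
residual = [xi - pi for xi, pi in zip(x, projection)]

norm_sq = dot(x, x)
residual_sq = dot(residual, residual)
```

In this example the inequality is strict: the gap `norm_sq - bessel_sum` equals `residual_sq`, the squared distance of `x` from the span of the two vectors.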

10.18 Proposition. Let $\{e_n\}$ be an at most denumerable orthonormal set in a pre-Hilbert space $H$. The following claims are equivalent.

(i) $\{e_n\}$ is complete.
(ii) $\forall x \in H$ we have $x = \sum_{k=1}^{\infty} (x \mid e_k)\, e_k$ in $H$, equivalently $\|x - P_n(x)\| \to 0$ as $n \to \infty$.
(iii) (PARSEVAL'S FORMULA) $\|x\|^2 = \sum_{k=1}^{\infty} |(x \mid e_k)|^2$ $\forall x \in H$ holds.
(iv) $\forall x, y \in H$ we have

$$(x \mid y) = \sum_{j=1}^{\infty} (x \mid e_j)\, \overline{(y \mid e_j)} .$$

In this case $x = 0$ if $(x \mid e_k) = 0$ $\forall k$.

Proof. (i) ⇒ (ii). Suppose the set $\{e_n\}$ is complete. For every $x \in H$ and $n \in \mathbb{N}$, we find finite combinations of $e_1, e_2, \dots, e_n$ that converge to $x$,

$$s_n := \sum_{k=1}^{n} \alpha_k^{(n)} e_k, \qquad \|x - s_n\| \to 0 .$$

If $P_n(x) = \sum_{k=1}^{n} (x \mid e_k)\, e_k$ is the orthogonal projection of $x$ in $V_n = \mathrm{Span}\,\{e_1, \dots, e_n\}$, we have, as $s_n \in V_n$,

$$\|x - P_n(x)\| \le \|x - s_n\| \to 0,$$

therefore $x = \sum_{k=1}^{\infty} (x \mid e_k)\, e_k$ in $H$. The converse (ii) ⇒ (i) is trivial.

(ii) ⇔ (iii) follows from

$$\sum_{k=1}^{n} |(x \mid e_k)|^2 = \|P_n(x)\|^2 = \|x\|^2 - \|x - P_n(x)\|^2$$

when $n \to \infty$.

(ii) implies (iv) since the inner product is continuous:

$$(x \mid y) = \Big( \sum_{i=1}^{\infty} (x \mid e_i)\, e_i \ \Big|\ \sum_{j=1}^{\infty} (y \mid e_j)\, e_j \Big) = \sum_{i,j=1}^{\infty} (x \mid e_i)\, \overline{(y \mid e_j)}\, (e_i \mid e_j) = \sum_{j=1}^{\infty} (x \mid e_j)\, \overline{(y \mid e_j)},$$

and (iv) trivially implies (iii). Finally (iii) implies that $x = 0$ if $(x \mid e_k) = 0$ $\forall k$. □

10.19 Proposition. Let $H$ be a Hilbert space and let $\{e_n\}$ be an orthonormal set of $H$. Given any sequence $\{c_k\}$ such that $\sum_{j=1}^{\infty} |c_j|^2 < \infty$, the series $\sum_{j=1}^{\infty} c_j e_j$ converges in $H$. If moreover $\{e_n\}$ is complete, then

$$x = \sum_{j=1}^{\infty} (x \mid e_j)\, e_j \qquad \forall x \in H.$$

Proof. Define $x_n := \sum_{j=1}^{n} c_j e_j$. As

$$\|x_{n+p} - x_n\|^2 = \sum_{j=n+1}^{n+p} |c_j|^2,$$

$\{x_n\}$ is a Cauchy sequence in $H$, hence it converges to $y := \sum_{j=1}^{\infty} c_j e_j \in H$. On account of the continuity of the scalar product

$$(y \mid e_j) = \Big( \sum_{i=1}^{\infty} c_i e_i \ \Big|\ e_j \Big) = \sum_{i=1}^{\infty} c_i\, (e_i \mid e_j) = c_j$$

for all $j$. If $x \in H$ and $c_j := (x \mid e_j)$ $\forall j$, then $(x - y \mid e_j) = 0$ $\forall j$, and, since $\{e_n\}$ is complete, Proposition 10.18 yields $x = y$. □

Let $H$ be a pre-Hilbert space. Let us explicitly interpret the previous results as information on the linear map defined by

$$\mathcal{F}(x) := \{(x \mid e_j)\}_j, \qquad x \in H,$$

that maps $x \in H$ into the sequence of its Fourier coefficients.

◦ Bessel's inequality says that $\mathcal{F}(x) \in \ell_2$ $\forall x \in H$ and that $\mathcal{F} : H \to \ell_2$ is continuous, actually

$$\|\mathcal{F}(x)\|_{\ell_2}^2 = \sum_{j=1}^{\infty} |(x \mid e_j)|^2 \le \|x\|^2 ,$$

◦ if $\{e_n\}$ is a complete orthonormal set in $H$, then Parseval's formula says that $\mathcal{F} : H \to \ell_2$ is an isometry between $H$ and its image $\mathcal{F}(H) \subset \ell_2$, in particular $\mathcal{F} : H \to \ell_2$ is injective,
◦ if $H$ is complete and $\{e_n\}$ is a complete orthonormal set, then, according to Proposition 10.19,
  - the series $\sum_{j=1}^{\infty} c_j e_j$ converges in $H$ for every choice of the sequence $\{c_j\} \in \ell_2$, that is, $\mathcal{F}$ is surjective onto $\ell_2$,
  - the inverse map of $\mathcal{F}$, $\mathcal{F}^{-1} : \ell_2 \to H$, is given by

$$\mathcal{F}^{-1}(\{c_j\}) := \sum_{j=1}^{\infty} c_j e_j .$$

Therefore, we can state the following.


10.20 Theorem. Every separable Hilbert space $H$ over $\mathbb{R}$ (respectively over $\mathbb{C}$) is isometric to $\ell_2(\mathbb{R})$ (respectively to $\ell_2(\mathbb{C})$). More precisely, given an orthonormal basis $\{e_n\} \subset H$, the coordinate map $\mathcal{E} : \ell_2(\mathbb{K}) \to H$ ($\mathbb{K} = \mathbb{R}$ if $H$ is real, resp. $\mathbb{K} = \mathbb{C}$ if $H$ is complex), given by

$$\mathcal{E}(\{c_k\}) := \sum_{k=1}^{\infty} c_k e_k,$$

is a surjective isometry of Hilbert spaces and its inverse maps any $x \in H$ into the sequence of the corresponding Fourier coefficients $\{(x \mid e_j)\}_j$.

Finally, we conclude with the following.

10.21 Theorem (Riesz-Fischer). Let $H$ be a Hilbert space and let $\{e_n\}$ be an at most denumerable orthonormal set of $H$. Then the following statements are equivalent.

(i) $\{e_n\}$ is a basis of $H$.
(ii) $\forall x \in H$ we have $x = \sum_{j=1}^{\infty} (x \mid e_j)\, e_j$.
(iii) $\|x - P_n(x)\| \to 0$, where $P_n$ is the orthogonal projection onto $V_n := \mathrm{Span}\,\{e_1, e_2, \dots, e_n\}$.
(iv) (PARSEVAL'S FORMULA or ENERGY EQUALITY) $\|x\|^2 = \sum_{j=1}^{\infty} |(x \mid e_j)|^2$ holds $\forall x \in H$.
(v) $(x \mid y) = \sum_{j=1}^{\infty} (x \mid e_j)\, \overline{(y \mid e_j)}$ $\forall x, y \in H$.
(vi) If $(x \mid e_j) = 0$ $\forall j$, then $x = 0$.

Proof. The equivalences of (i), (ii), (iii), (iv), (v) and the implication (i) ⇒ (vi) were proved in Proposition 10.18. It remains to show that (vi) implies (i). Suppose that $\{e_n\}$ is not complete. Then there is $y \in H$ with $\|y\|^2 > \sum_{j=1}^{\infty} |(y \mid e_j)|^2$, while, on the other hand, Bessel's inequality and Proposition 10.19 show that there is $z \in H$ such that $z := \sum_{j=1}^{\infty} (y \mid e_j)\, e_j$, and by Parseval's formula, $\|z\|^2 = \sum_{j=1}^{\infty} |(y \mid e_j)|^2$. Consequently $\|z\|^2 < \|y\|^2$. But, on account of the continuity of the scalar product,

$$(z \mid e_k) = \Big( \sum_{j=1}^{\infty} (y \mid e_j)\, e_j \ \Big|\ e_k \Big) = \sum_{j=1}^{\infty} (y \mid e_j)\, (e_j \mid e_k) = (y \mid e_k),$$

i.e., $(y - z \mid e_k) = 0$ $\forall k$. Then by (vi) $y = z$, a contradiction. □

d. Some orthonormal polynomials in $L^2$

Let $I$ be an interval on $\mathbb{R}$ and let $p : I \to \mathbb{R}$ be a continuous function that is positive in the interior of $I$ and such that for all $n \ge 0$

$$\int_I |t|^n\, p(t)\, dt < +\infty .$$

The function $p$ is often called a weight in $I$. The subspace $V_p$ of $C^0(I, \mathbb{C})$ of functions $x(t)$ such that

$$\int_I |x(t)|^2\, p(t)\, dt < \infty$$

is a linear space and

$$(x \mid y) := \int_I x(t)\, \overline{y(t)}\, p(t)\, dt$$

defines a Hermitian product on it. This way $V_p$ is a pre-Hilbert space. Also, one easily sees that the monomials $\{t^n\}$, $n \ge 0$, are linearly independent; Gram-Schmidt's orthonormalization process then produces orthonormal polynomials $\{P_n(t)\}$ of degree $n$ with respect to the weight $p$. Classical examples are

◦ JACOBI POLYNOMIALS $J_n$. They correspond to the choice $I := [-1, 1]$, $p(t) := (1 - t)^\alpha (1 + t)^\beta$, $\alpha, \beta > -1$.
◦ LEGENDRE POLYNOMIALS $P_n$. They correspond to the choice $\alpha = \beta = 0$ in Jacobi polynomials $J_n$, i.e., $I = [-1, 1]$, $p(t) := 1$.
◦ CHEBYSHEV POLYNOMIALS $T_n$. They correspond to the choice $\alpha = \beta = -1/2$ in Jacobi polynomials $J_n$, i.e., $I = [-1, 1]$, $p(t) := 1/\sqrt{1 - t^2}$.
◦ LAGUERRE POLYNOMIALS $L_n$. They correspond to the choice $I = [0, +\infty[$, $p(t) := e^{-t}$.
◦ HERMITE POLYNOMIALS $H_n$. They correspond to the choice $I := \,]-\infty, +\infty[$, $p(t) := e^{-t^2}$.

One can show that the polynomials $\{J_n\}$, $\{P_n\}$, $\{T_n\}$, $\{L_n\}$, $\{H_n\}$ form, respectively, a basis in $V_p$.

Denoting by $\{R_n\}$ the system of orthonormal polynomials with respect to $p(t)$ obtained by applying the Gram-Schmidt procedure to $\{t^n\}$, $n \ge 0$, the $R_n$'s have interesting properties. First, we explicitly notice the following properties:

◦ (A1) for all $n$, $R_n$ is orthogonal to any polynomial of degree less than $n$,
◦ (A2) for all $n$ the polynomial $R_n(t) - t R_{n-1}(t)$ has degree less than $n$, hence $(t R_{n-1} \mid R_n) = (R_n \mid R_n)$,
◦ (A3) for all $x, y, z \in V_p$ we have $(xy \mid z) = (x y \overline{z} \mid 1) = (x \mid \overline{y} z)$.

10.22 Proposition (Zeros of $R_n$). Every $R_n$ has $n$ real distinct roots in the interior of $I$.


Proof. Since $\int_I R_n(t)\, p(t)\, dt = 0$, it follows that $R_n$ changes sign at least once in $I$. Let $t_1 < \dots < t_r$ be the points in $\mathrm{int}(I)$ in which $R_n$ changes sign. Let us show that $r = n$. Suppose $r < n$ and let $Q(t) := (t - t_1)(t - t_2) \cdots (t - t_r)$; then $R_n Q$ has constant sign, hence $(R_n \mid Q) \neq 0$, which contradicts property (A1). □

10.23 Proposition (Recurrence relations). There exist two sequences $\{\lambda_n\}$, $\{\mu_n\}$ of real numbers such that for $n \ge 2$

$$R_n(t) = (t + \lambda_n)\, R_{n-1}(t) - \mu_n R_{n-2}(t).$$

Proof. Since $\deg(R_n - t R_{n-1}) \le n - 1$, then

$$R_n(t) - t R_{n-1}(t) = \sum_{i=0}^{n-1} c_i R_i(t),$$

and for $i \le n - 1$, we have $-(t R_{n-1} \mid R_i) = c_i (R_i \mid R_i)$. By (A3), we have $(t R_{n-1} \mid R_i) = (R_{n-1} \mid t R_i)$, hence, if $i + 1 < n - 1$, then $(t R_{n-1} \mid R_i) = 0$, from which $c_i = 0$ for $i \neq n - 2, n - 1$. For $i = n - 2$, property (A2) shows that

$$-(R_{n-1} \mid R_{n-1}) = c_{n-2}\, (R_{n-2} \mid R_{n-2}),$$

hence $c_{n-2} < 0$. □
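A three-term recurrence makes orthogonal polynomials easy to generate. As a hedged illustration, the sketch below uses the classical Legendre recurrence $n Q_n = (2n-1)\, t\, Q_{n-1} - (n-1)\, Q_{n-2}$ with the normalization $Q_n(1) = 1$ (so the $Q_n$ are orthogonal but not orthonormal) and verifies orthogonality on $[-1,1]$ in exact rational arithmetic. Polynomials are stored as coefficient lists, lowest degree first; all helper names are hypothetical.

```python
from fractions import Fraction

def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(c, p):
    """Multiply a polynomial by the scalar c."""
    return [c * a for a in p]

def poly_mul(p, q):
    """Product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def legendre(max_n):
    """Q_0, ..., Q_max_n from n Q_n = (2n - 1) t Q_{n-1} - (n - 1) Q_{n-2}."""
    qs = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(2, max_n + 1):
        t_times_prev = [Fraction(0)] + qs[n - 1]  # multiply by t
        term1 = poly_scale(Fraction(2 * n - 1, n), t_times_prev)
        term2 = poly_scale(Fraction(-(n - 1), n), qs[n - 2])
        qs.append(poly_add(term1, term2))
    return qs[: max_n + 1]

def integrate_sym(p):
    """Exact integral of p over [-1, 1] (odd powers integrate to zero)."""
    return sum(Fraction(2, k + 1) * a for k, a in enumerate(p) if k % 2 == 0)

qs = legendre(5)
# Gram matrix with weight p(t) = 1: expected to be diagonal with entries 2/(2n+1).
gram = [[integrate_sym(poly_mul(p, q)) for q in qs] for p in qs]
```

With exact fractions the off-diagonal entries of `gram` vanish identically, and the diagonal entries come out as $2/(2n+1)$.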

10.24 ¶. Define

$$Q_n(t) := \frac{1}{2^n n!}\, \frac{d^n}{dt^n} (t^2 - 1)^n .$$

(i) Integrating by parts, show that $\{Q_n\}$ is an orthogonal system in $[-1, 1]$ with respect to $p(t) = 1$, and that $(Q_n \mid Q_n) = 2/(2n + 1)$.
(ii) Show that $Q_n(1) = 1$ and that $Q_n$ is given in terms of the Legendre polynomials $\{P_n\}$ by $Q_n(t) = P_n(t)/P_n(1)$. Finally, compute $P_n(1)$.
(iii) Show that the polynomials $\{Q_n\}$ satisfy the recurrence relation

$$n Q_n = (2n - 1)\, t\, Q_{n-1} - (n - 1)\, Q_{n-2}$$

and solve the linear ODE

$$\frac{d}{dt}\Big( (1 - t^2)\, \frac{dQ_n}{dt} \Big) + n(n + 1)\, Q_n = 0 .$$

10.25 ¶. In $V_p$ with $p(t) := e^{-t}$ and $I = [0, +\infty[$, define

$$Q_n(t) := \frac{e^{t}}{n!}\, \frac{d^n}{dt^n} \big( t^n e^{-t} \big).$$

(i) Show that $\deg Q_n = n$ and that $\{Q_n\}$ is a system of orthogonal polynomials in $V_p$. Then compute $(Q_n \mid Q_n)$.
(ii) Show that $Q_n(0) = 1$, and, in terms of Laguerre polynomials,

$$Q_n(t) = \frac{L_n(t)}{L_n(0)} .$$

Compute then $L_n(0)$.
(iii) … respect to $\{Q_n\}$.
(iv) Show that $\sum_n \gamma_{n,\alpha}\, Q_n(t) = e^{-\alpha t}$ in $V_p$.

Figure 10.3. The frontispieces of two classical monographs.

(v) Changing variable and using the Stone-Weierstrass theorem, show that $\{e^{-nt}\}$ is a basis in $V_p$.

10.26 ¶. Define the polynomials $Q_n(t)$ by

$$\frac{d^n}{dt^n}\, e^{-t^2} = (-1)^n\, Q_n(t)\, e^{-t^2} .$$

(i) Show that $\{Q_n\}$ is an orthogonal system in $V_p$ with $I = \,]-\infty, +\infty[$ and $p(t) = e^{-t^2}$. Show that each $Q_n(t)$ is proportional to the Hermite polynomial $H_n$.
(ii) Show that $Q_0 = 1$, $Q_1 = 2t$ and that for $n \ge 2$

$$Q_n(t) = 2t\, Q_{n-1}(t) - 2(n - 1)\, Q_{n-2}(t).$$

(iii) Show that $Q_n$ satisfies $Q_n''(t) - 2t\, Q_n'(t) + 2n\, Q_n(t) = 0$ and that $Q_n'(t) = 2n\, Q_{n-1}(t)$.

10.2 The Abstract Dirichlet's Principle and Orthogonality

The aim of this section is to illustrate some aspects of the linear geometry of Hilbert spaces, mainly in connection with the abstract formulation of the Dirichlet principle. In its concrete formulation, this principle has played a fundamental role in the geometric theory of functions by Riemann, in the theory of partial differential equations, for instance, when dealing with gravitational or electromagnetic fields, and in the calculus of variations. On the other hand, in its abstract formulation it turns out to be a simple orthogonal projection theorem.

a. The abstract Dirichlet's principle

Let $H$ be a real (complex) Hilbert space with scalar (respectively Hermitian) product $(\,\cdot \mid \cdot\,)$ and norm $\|u\| := \sqrt{(u \mid u)}$. Let $\mathbb{K} = \mathbb{R}$ if $H$ is real or $\mathbb{K} := \mathbb{C}$ if $H$ is complex. Recall that a linear continuous functional $L$ on $H$ is a linear map $L : H \to \mathbb{K}$ such that

$$|L(u)| \le K\, \|u\| \qquad \forall u \in H; \tag{10.3}$$

the smallest constant $K$ for which (10.3) holds is called the norm of $L$, denoted $\|L\|$, so that

$$|L(u)| \le \|L\|\, \|u\| \qquad \forall u \in H,$$

and, see Section 9.4,

$$\|L\| = \sup_{\|u\| = 1} |L(u)| = \sup_{u \neq 0} \frac{|L(u)|}{\|u\|} .$$

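We denote by $H^* = \mathcal{L}(H, \mathbb{K})$ the space of linear continuous functionals on $H$, called the dual of $H$.

10.27 Theorem (Abstract Dirichlet's principle). Let $H$ be a real or complex Hilbert space and let $L \in H^*$. The functional $\mathcal{F} : H \to \mathbb{R}$ defined by

$$\mathcal{F}(u) := \tfrac{1}{2} \|u\|^2 - \Re(L(u)) \tag{10.4}$$

achieves a unique minimum point $\bar{u}$ in $H$, and every minimizing sequence, i.e., every sequence $\{u_k\} \subset H$ such that $\mathcal{F}(u_k) \to \inf_{v \in H} \mathcal{F}(v)$, converges to $\bar{u}$ in $H$. Moreover $\bar{u}$ is characterized as the unique solution of the linear equation

$$(\varphi \mid \bar{u}) = L(\varphi) \qquad \forall \varphi \in H. \tag{10.5}$$

In particular $\|\bar{u}\| = \|L\|$.

Figure 10.4. The Dirichlet's principle.

Proof. Let us prove that $\mathcal{F}$ has a minimum point. First we notice that $\mathcal{F}$ is bounded from below, since, recalling the inequality $2ab \le a^2 + b^2$, we have for all $v \in H$

$$\mathcal{F}(v) \ge \tfrac{1}{2} \|v\|^2 - \|L\|\, \|v\| \ge \tfrac{1}{2} \|v\|^2 - \tfrac{1}{2} \|v\|^2 - \tfrac{1}{2} \|L\|^2 = -\tfrac{1}{2} \|L\|^2,$$

hence $\lambda := \inf_{v \in H} \mathcal{F}(v) \in \mathbb{R}$. Then we observe that, by the parallelogram law,

$$\tfrac{1}{4} \|u - v\|^2 = \tfrac{1}{2} \|u\|^2 + \tfrac{1}{2} \|v\|^2 - \Big\| \frac{u + v}{2} \Big\|^2 - \Re(L(u)) - \Re(L(v)) + 2\, \Re\Big( L\Big( \frac{u + v}{2} \Big) \Big) = \mathcal{F}(u) + \mathcal{F}(v) - 2\, \mathcal{F}\Big( \frac{u + v}{2} \Big). \tag{10.6}$$

Thus, if $\{u_k\}$ is a minimizing sequence, by (10.6)

$$\tfrac{1}{4} \|u_k - u_h\|^2 = \mathcal{F}(u_k) + \mathcal{F}(u_h) - 2\, \mathcal{F}\Big( \frac{u_h + u_k}{2} \Big) \le \mathcal{F}(u_k) + \mathcal{F}(u_h) - 2\lambda \to 0$$

as $h, k \to \infty$. Therefore $\{u_k\}$ is a Cauchy sequence in $H$ and converges to some $\bar{u} \in H$; by continuity $\mathcal{F}(u_k) \to \mathcal{F}(\bar{u})$, hence $\mathcal{F}(\bar{u}) = \lambda$. This proves existence of the minimizer $\bar{u}$. If $\{v_k\}$ is another minimizing sequence for $\mathcal{F}$, (10.6) yields $\|u_k - v_k\| \to 0$, and this proves that the minimizer is unique and that every minimizing sequence converges to $\bar{u}$ in $H$.

Let us show that $\bar{u}$ solves (10.5). Fix $\varphi \in H$, and consider the real function $\epsilon \mapsto \mathcal{F}(\bar{u} + \epsilon \varphi)$, that is, the second order polynomial in $\epsilon$

$$\mathcal{F}(\bar{u} + \epsilon \varphi) = \tfrac{1}{2} \|\varphi\|^2 \epsilon^2 + \Re\big[ (\varphi \mid \bar{u}) - L(\varphi) \big]\, \epsilon + \mathcal{F}(\bar{u}),$$

with minimum point at $\epsilon = 0$. We deduce $\Re\big( (\varphi \mid \bar{u}) - L(\varphi) \big) = 0$ $\forall \varphi \in H$, hence, as $z = 0$ if $\Re(\mu z) = 0$ $\forall \mu \in \mathbb{C}$,

$$(\varphi \mid \bar{u}) - L(\varphi) = 0 \qquad \forall \varphi \in H. \tag{10.7}$$

Conversely, if $v$ solves (10.5), then for every $\varphi \in H$

$$\mathcal{F}(v + \varphi) = \tfrac{1}{2} \|v\|^2 + \Re(v \mid \varphi) + \tfrac{1}{2} \|\varphi\|^2 - \Re(L(v)) - \Re(L(\varphi)) = \mathcal{F}(v) + \Re\big( (\varphi \mid v) - L(\varphi) \big) + \tfrac{1}{2} \|\varphi\|^2 = \mathcal{F}(v) + \tfrac{1}{2} \|\varphi\|^2,$$

hence $\mathcal{F}(v + \varphi) \ge \mathcal{F}(v)$ $\forall \varphi \in H$, i.e., $v$ is a minimum point for $\mathcal{F}$ in $H$. This proves that (10.5) has a unique solution, the minimum point $\bar{u}$ of $\mathcal{F} : H \to \mathbb{R}$.

Finally, we infer from (10.5)

$$\|L\| = \sup_{\varphi \neq 0} \frac{|L(\varphi)|}{\|\varphi\|} = \sup_{\varphi \neq 0} \frac{|(\varphi \mid \bar{u})|}{\|\varphi\|} = \|\bar{u}\| .$$

□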
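In finite dimension the statement of Theorem 10.27 can be observed directly. The following is a sketch under explicit assumptions: $H = \mathbb{R}^3$ with the Euclidean product, the functional is $L(u) = (u \mid b)$ for a fixed vector `b`, and the minimizing sequence is produced by a plain gradient iteration; none of these choices, nor the function names, come from the text.

```python
import math

def dot(x, y):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(x, y))

def energy(u, b):
    """F(u) = 1/2 |u|^2 - L(u), with L(u) = (u | b)."""
    return 0.5 * dot(u, u) - dot(u, b)

def minimizing_sequence(b, steps=200, rate=0.5):
    """Gradient iteration u <- u - rate * (u - b); the gradient of F at u is u - b."""
    u = [0.0] * len(b)
    for _ in range(steps):
        u = [ui - rate * (ui - bi) for ui, bi in zip(u, b)]
    return u

b = [3.0, -4.0, 12.0]            # |b| = 13, which is the norm of L
u_min = minimizing_sequence(b)

norm_u = math.sqrt(dot(u_min, u_min))
min_value = energy(u_min, b)      # expected -1/2 |b|^2 = -84.5

# Equation (10.5) tested on the canonical basis: (phi | u_min) = L(phi) = (phi | b).
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
residuals = [dot(phi, u_min) - dot(phi, b) for phi in basis]
```

The iterates converge to `b`, the norm of the minimizer equals $\|L\| = 13$, and the minimum value is $-\tfrac{1}{2}\|L\|^2$, in agreement with the lower bound found at the start of the proof.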

Figure 10.5. Frigyes Riesz (1880-1956) and the frontispiece of a classical monograph.

b. Riesz's theorem

In particular, we have proved the following.

10.28 Theorem (Riesz). For every linear continuous functional $L \in H^*$ there exists a unique $u_L \in H$ such that

$$L(\varphi) = (\varphi \mid u_L) \qquad \forall \varphi \in H. \tag{10.8}$$

Moreover $\|u_L\| = \|L\|$.

Actually, we have also proved that Riesz's theorem and the abstract Dirichlet's principle are equivalent.

10.29 Continuous dependence and Riesz's operator. If $u$ solves the minimum problem (10.4), or equation (10.8), we have $\|u\| = \|L\|$. This implies that the solution of (10.4) or (10.8) depends continuously on $L$. In fact, if $L_n, L \in H^*$ and $\|L_n - L\| \to 0$, and if $u_n, u \in H$ solve $(\varphi \mid u_n) = L_n(\varphi)$ and $(\varphi \mid u) = L(\varphi)$ $\forall \varphi \in H$, then

$$(\varphi \mid u_n - u) = (L_n - L)(\varphi) \qquad \forall \varphi \in H,$$

hence $\|u_n - u\| = \|L_n - L\| \to 0$.

10.30 Riesz's operator. The map $\Gamma : H^* \to H$ that associates to each $L \in H^*$ the solution $u_L$ of (10.8) is called Riesz's map. It is easily seen that $\Gamma : H^* \to H$ is linear and by Riesz's theorem we have $\|\Gamma(L)\| = \|u_L\| = \|L\|$, i.e., not only is $\Gamma$ continuous, but the following holds.

10.31 Theorem. Riesz's map $\Gamma : H^* \to H$ is an isometry between $H^*$ and $H$.

c. The orthogonal projection theorem

Let us now extend the orthogonal projection theorem onto finite-dimensional subspaces, see Chapter 3, to closed subspaces of a Hilbert space.

Let $H$ be a Hilbert space and $V$ a subspace of $H$. If $f \in H$, then the map $L : V \to \mathbb{K}$, $v \mapsto L(v) := (v \mid f)$ is a linear continuous operator on $V$ with $\|L\| \le \|f\|$, since $|(v \mid f)| \le \|f\|\, \|v\|$ $\forall v \in V$ by the Cauchy-Schwarz inequality. Since a closed linear subspace $V$ of a Hilbert space $H$ is again a Hilbert space with the induced inner product, a simple consequence of Theorem 10.27 is the following.

10.32 Theorem (Projection theorem). Let $V$ be a closed linear subspace of a Hilbert space $H$. Then for every $f \in H$ there is a unique point $u \in V$ of minimum distance from $f$, that is,

$$\|f - u\| = \mathrm{dist}(f, V) := \inf\, \{ \|f - \varphi\| \mid \varphi \in V \}.$$

Moreover, $u$ is characterized as the unique point such that $f - u$ is orthogonal to $V$, i.e.,

$$(f - u \mid \varphi) = 0 \qquad \forall \varphi \in V.$$

Proof. We have for all $v \in V$

$$\|v - f\|^2 = \|v\|^2 - 2\, \Re(v \mid f) + \|f\|^2 .$$

Theorem 10.27, when applied to $\mathcal{F}(v) := \|v\|^2 - 2\, \Re(v \mid f)$, $v \in V$, which is twice the functional (10.4) with $L(v) = (v \mid f)$, yields existence of a unique minimizer $u \in V$ of $\|v - f\|^2$, hence of $v \mapsto \|v - f\|$. The characterization of $u$ given by Riesz's theorem states, in our case, that $u$ is also the unique solution of

$$2\, (\varphi \mid u) = 2\, (\varphi \mid f) \qquad \forall \varphi \in V.$$

□

Let $V$ be a subspace of a Hilbert space $H$. We denote by $V^\perp$ the class of vectors of $H$ orthogonal to $V$,

$$V^\perp := \{ x \in H \mid (x \mid v) = 0 \ \forall v \in V \}.$$

Clearly $V^\perp$ is a closed subspace of $H$.

10.33 Corollary. If $V$ is a linear closed subspace of a Hilbert space $H$, then $H = V \oplus V^\perp$, i.e., every $u \in H$ uniquely decomposes as $u = v + w$, where $v \in V$ and $w \in V^\perp$.

10.34 ¶. Show that, if $V$ is a linear subspace of a Hilbert space $H$, then $V^\perp$ is closed and that $(V^\perp)^\perp$ is the closure of $V$.


10.35 ¶. Show that the orthogonal projection theorem is in fact equivalent to Riesz's theorem and, consequently, to the abstract Dirichlet's principle. [Hint: We give the scheme of the proof, leaving it to the reader to add details. Uniqueness and $\|u_L\| = \|L\|$ follow from (10.8). Let us prove the existence of a solution of (10.8). Suppose $L$ is not identically zero; then $\ker L = L^{-1}(\{0\})$ is a linear closed proper subspace of $H$ and there exists $u_0 \in (\ker L)^\perp$ such that $u_0 \neq 0$ and $L(u_0) = 1$. Since $u - L(u)\, u_0 \in \ker L$ $\forall u \in H$, we have

$$u = w + L(u)\, u_0 \qquad \text{with } w \in \ker L \text{ and } u_0 \in (\ker L)^\perp .$$

Multiplying by $u_0$, we then find $L(u) = \Big( u \ \Big|\ \dfrac{u_0}{\|u_0\|^2} \Big)$.]

d. Projection operators

Let $V$ be a linear closed subspace of a Hilbert space $H$. The projection theorem defines a linear continuous operator $P_V : H \to H$ that maps $f \in H$ into its orthogonal projection $P_V f \in V$; of course $\|P_V\| \le 1$ and $\mathrm{Im}(P_V) = V$. Also

$$P_V^2 = P_V \circ P_V = P_V$$

and the formula $H = V \oplus V^\perp$ can be written as

$$\mathrm{Id} = P_V + P_{V^\perp}, \qquad P_V P_{V^\perp} = P_{V^\perp} P_V = 0 .$$

For the reader's convenience, we only prove that $P_V(f + g) = P_V(f) + P_V(g)$. Trivially, $f + g - P_V(f + g) \perp V$ and $f + g - P_V f - P_V g \perp V$; since there is a unique $h \in V$ such that $f + g - h \perp V$, we conclude $P_V(f + g) = P_V(f) + P_V(g)$.

10.36 ¶. Let $P : H \to H$ be a linear operator such that $P^2 = P$. Then $P$ is continuous if and only if $\ker P$ and $\mathrm{Im}\, P$ are closed.

10.37 ¶. If $V$ is a closed subspace of a Hilbert space $H$ and $\{e_n\}$ is a denumerable orthonormal basis of $V$, then the orthogonal projection of $x \in H$ is given by

$$P_V x := \sum_{j=1}^{\infty} (P_V x \mid e_j)\, e_j = \sum_{j=1}^{\infty} (x \mid e_j)\, e_j .$$
10.3 Bilinear Forms

From now on we shall only consider real vector spaces, though one could develop similar results for sesquilinear forms on complex vector spaces.

10.3.1 Linear operators and bilinear forms a. Linear operators Let i / be a real Hilbert space. As we know, the space C{H^ H) of linear continuous operators, also called hounded operators from H into i7, is a Banach space with the norm

11^11=" ' -IFxT^o

IFII

If $T \in \mathcal{L}(H,H)$, we denote by $N(T)$ and $R(T)$ respectively the kernel and the image or range of $T$. Since $T$ is continuous, $N(T) = T^{-1}(\{0\})$ is closed in $H$, while in general $R(T)$ is not closed. The restriction of $T$ to $N(T)^\perp$, $T : N(T)^\perp \to R(T)$, is of course a linear bijection; therefore, from Banach's open mapping theorem, cf. Section 9.4, we infer the following.

10.38 Proposition. Let $T \in \mathcal{L}(H,H)$. Then $T$ has a closed range in $H$ if and only if there exists $C > 0$ such that $\|x\| \le C\|Tx\|$ $\forall x \in N(T)^\perp$, that is, if and only if $T^{-1} : R(T) \to N(T)^\perp$ is a bounded operator, or, equivalently, if and only if $T : N(T)^\perp \to R(T)$ is an isomorphism.

b. Adjoint operator
Let $X, Y$ be two real Hilbert spaces endowed with their inner products $(\,|\,)_X$ and $(\,|\,)_Y$, and let $T \in \mathcal{L}(X,Y)$. For any $y \in Y$ the map $x \mapsto (Tx|y)_Y$ is a linear continuous form on $X$, hence Riesz's theorem yields a unique element $T^*y \in X$ such that

$$(x|T^*y)_X = (Tx|y)_Y \qquad \forall x \in X,\ \forall y \in Y. \tag{10.9}$$

It is easily seen that the map $T^* : Y \to X$ just defined is a linear operator called the adjoint of $T$. Moreover, from (10.9) $T^*$ is a bounded operator with $\|T^*\| = \|T\|$. Obviously, if $S, T \in \mathcal{L}(H,H)$,

$$(TS)^* = S^*T^*, \qquad (T^*)^* = T.$$

10.39 ¶. Suppose that $P : H \to H$ is a linear continuous operator such that $P^2 = P$ and $P^* = P$. Show that $V := P(H)$ is a closed subspace of $H$ and that $P$ is the orthogonal projection onto $V$.

An operator $T : H \to H$ on a Hilbert space $H$ is called self-adjoint if $T^* = T$, i.e., $(x|Ty) = (Tx|y)$ $\forall x, y \in H$. It follows from (10.9) that $R(T)^\perp = N(T^*)$. Consequently $\overline{R(T)} = N(T^*)^\perp$ and, using the open mapping theorem, we conclude the following.

10.40 Corollary. Let $T \in \mathcal{L}(H,H)$ be a bounded operator with closed range. Then we have


10. Hilbert Spaces, Dirichlet's Principle and Linear Compact Operators

(i) The equation $Tx = y$ is solvable if and only if $y \perp N(T^*)$,
(ii) $T$ is an isomorphism between $N(T)^\perp$ and $R(T) = N(T^*)^\perp$. In particular the Moore-Penrose inverse $T^\dagger : H \to H$, defined by composing the orthogonal projection onto $R(T)$ with the inverse of $T|_{N(T)^\perp}$, is a bounded operator.
(iii) We have $H = R(T) \oplus N(T^*)$.

Proof. (i) For $x, y \in H$ we have $(y|Tx) = (T^*y|x)$. Hence, $R(T)^\perp = N(T^*)$. Therefore, by considering the orthogonals and recalling that $R(T)$ is closed,

$$R(T) = (R(T)^\perp)^\perp = N(T^*)^\perp.$$

(ii) follows from the open mapping theorem and (iii) follows from (i) by considering the projections onto $N(T^*)^\perp$ and $N(T^*)$. □

10.41 ¶. Let $A \in \mathcal{L}(X,Y)$ be a bounded operator between Hilbert spaces. Show that
(i) $N(A) = N(A^*A)$,
(ii) $\overline{R(A^*)} = \overline{R(A^*A)}$,
(iii) $R(A^*) = R(A^*A)$ if and only if $R(A) = N(A^*)^\perp$, i.e., if and only if $R(A)$ is closed,
(iv) if $R(A^*A)$ is closed, then $R(A^*)$ and $R(A)$ are closed.

10.42 ¶. Let $H$ be a Hilbert space and let $T$ be a self-adjoint operator. Show that $T$ is continuous. [Hint: Show that $T$ has a closed graph.]

c. Bilinear forms
Let $H$ be a real vector space. A map $B : H \times H \to \mathbb{R}$ which is linear on each factor is called a bilinear form. A bilinear form $B : H \times H \to \mathbb{R}$ is called continuous or bounded if, for some constant $\Lambda$, we have

$$|B(u,v)| \le \Lambda\,\|u\|\,\|v\| \qquad \forall u, v \in H,$$

and it is called coercive if there is $\lambda > 0$ such that

$$B(u,u) \ge \lambda\|u\|^2 \qquad \forall u \in H.$$

Finally, $B(u,v)$ is said to be symmetric if

$$B(u,v) = B(v,u) \qquad \forall u, v \in H.$$

Any linear operator $T : H \to H$ defines a bilinear form by $B(v,u) := (v|Tu)$, and $B$ is bounded if $T$ is bounded since $|B(v,u)| \le \|T\|\,\|v\|\,\|u\|$. Conversely, given a continuous bilinear form $B : H \times H \to \mathbb{R}$ on a real Hilbert space $H$,

$$|B(u,v)| \le \Lambda\,\|u\|\,\|v\| \qquad \forall u, v \in H, \tag{10.10}$$

for every fixed $u \in H$ the map $v \mapsto B(v,u)$ is a linear continuous form on $H$, hence by Riesz's theorem, there exists $Tu \in H$ such that

$$B(v,u) = (v|Tu) \qquad \forall v \in H. \tag{10.11}$$


It is easy to see that $T$ is linear and, from (10.10), that $\|T\| \le \Lambda$ since

$$\|Tu\|^2 = B(Tu,u) \le \Lambda\,\|Tu\|\,\|u\|.$$

Consequently, by (10.11), there is a complete equivalence between bilinear continuous forms on a real Hilbert space $H$ and bounded linear operators from $H$ into $H$. Also, by (10.11), coercive bilinear continuous forms correspond to bounded operators called coercive, i.e., such that for some $\lambda > 0$

$$(u|Tu) \ge \lambda\|u\|^2 \qquad \forall u \in H.$$

Moreover, self-adjoint operators correspond to bilinear symmetric forms; in fact

$$B(v,u) - B(u,v) = (v|Tu) - (u|Tv) = (v|(T - T^*)u) \qquad \forall u, v \in H.$$
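The correspondence between operators and bilinear forms can be checked by hand in finite dimension. The following is a minimal sketch on $H = \mathbb{R}^2$ with the Euclidean inner product; the matrix `T` and the vectors `u`, `v` are made-up illustrative data, not taken from the text. It evaluates $B(v,u) = (v|Tu)$, verifies the identity $B(v,u) - B(u,v) = (v|(T - T^*)u)$, and reads a coercivity constant off the smallest eigenvalue of the symmetric part of $T$.

```python
# Finite-dimensional illustration on H = R^2 (hypothetical data).
import math

def dot(v, u):
    return sum(a * b for a, b in zip(v, u))

def matvec(T, u):
    return [dot(row, u) for row in T]

def transpose(T):
    return [list(col) for col in zip(*T)]

def bilinear_form(T):
    """Return the form B(v, u) := (v | T u) associated with the matrix T."""
    return lambda v, u: dot(v, matvec(T, u))

def sym_eigs_2x2(S):
    """Eigenvalues (ascending) of a symmetric 2x2 matrix S."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

T = [[2.0, 1.0], [-1.0, 3.0]]          # a non-symmetric matrix
Tt = transpose(T)                       # the adjoint T* in R^2
B = bilinear_form(T)
u, v = [1.0, 2.0], [3.0, -1.0]

# B(v,u) - B(u,v) equals (v | (T - T*) u).
skew = [[T[i][j] - Tt[i][j] for j in range(2)] for i in range(2)]
asym_gap = B(v, u) - B(u, v)
asym_rhs = dot(v, matvec(skew, u))

# (u | T u) = (u | S u) where S is the symmetric part of T, so the smallest
# eigenvalue of S is a coercivity constant when it is positive.
S = [[(T[i][j] + Tt[i][j]) / 2.0 for j in range(2)] for i in range(2)]
lam, _ = sym_eigs_2x2(S)
```

Here the form is coercive but not symmetric, which is the situation treated later by the Lax-Milgram theorem.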

10.3.2 Coercive symmetric bilinear forms

a. Inner products
Clearly, every symmetric continuous coercive bilinear form on $H$ defines in $H$ a new scalar product, which in turn induces a norm that is equivalent to the original, since

$$\lambda\|u\|^2 \le B(u,u) \le \Lambda\|u\|^2 \qquad \forall u \in H.$$

Replacing $(u|v)$ with $B(u,v)$, Dirichlet's principle and Riesz's theorem read as follows.

10.43 Theorem. Let $H$ be a real Hilbert space with inner product $(\,|\,)$ and norm $\|u\| := \sqrt{(u|u)}$ and let $B : H \times H \to \mathbb{R}$ be a symmetric, continuous and coercive bilinear form on $H$, i.e., $B(u,v) = B(v,u)$ and for some $\Lambda \ge \lambda > 0$

$$|B(u,v)| \le \Lambda\,\|u\|\,\|v\|, \qquad B(u,u) \ge \lambda\|u\|^2 \qquad \forall u, v \in H;$$

finally, let $L$ be a continuous linear form on $H$. Then the following equivalent claims hold:

(i) (ABSTRACT DIRICHLET'S PRINCIPLE). The functional

$$\mathcal{F}(u) := \tfrac{1}{2}B(u,u) - L(u)$$

has a unique minimizer $\bar u \in H$, every minimizing sequence converges to $\bar u$ in $H$, and $\bar u$ solves

$$B(\varphi,\bar u) = L(\varphi) \qquad \forall \varphi \in H.$$

(ii) (RIESZ'S THEOREM). The equation

$$B(\varphi,u) = L(\varphi) \qquad \forall \varphi \in H \tag{10.12}$$

has a unique solution $u_L \in H$.


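Moreover $\bar u = u_L$ and $\|u_L\| \le \frac{1}{\lambda}\|L\|$.

The continuity estimate for $u_L$ follows from

$$\sqrt{\lambda}\,\|u_L\| \le \sqrt{B(u_L,u_L)} = \sup_{\varphi \neq 0} \frac{|L(\varphi)|}{\sqrt{B(\varphi,\varphi)}} \le \frac{1}{\sqrt{\lambda}}\,\|L\|.$$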

In terms of operators Theorem 10.43 may be rephrased as follows.

10.44 Theorem. Let $T$ be a continuous, coercive, self-adjoint operator on $H$, i.e.,

$$\|Tu\| \le \|T\|\,\|u\|, \qquad (u|Tu) \ge \lambda\|u\|^2, \quad \lambda > 0, \quad \forall u \in H, \qquad T = T^*.$$

Then $T$ is invertible with continuous inverse, and $\|T^{-1}\| \le 1/\lambda$.

Proof. From coercivity we infer that

$$\lambda\|u\|^2 \le |(u|Tu)| \le \|u\|\,\|Tu\|, \tag{10.13}$$

hence $N(T) = \{0\}$ and $T^{-1} : R(T) \to H$ is continuous. $T$ is therefore an isomorphism between $H$ and $R(T)$, and therefore $R(T)$ is closed, $R(T) = \overline{R(T)} = N(T^*)^\perp = N(T)^\perp = H$, and (10.13) rewrites as $\|T^{-1}u\| \le \frac{1}{\lambda}\|u\|$ $\forall u \in H$. □

A variational proof of Theorem 10.44. For any $y \in H$, consider the bounded linear form $L : H \to \mathbb{R}$, $L(\varphi) := (\varphi|y)$, and the bilinear form $B : H \times H \to \mathbb{R}$, $B(\varphi,u) := (\varphi|Tu)$. $B$ is bounded and symmetric, $T$ being bounded and self-adjoint. Moreover, the coercivity of $T$ implies that $B(u,u) \ge \lambda\|u\|^2$ $\forall u \in H$. By the abstract Dirichlet principle or Riesz's theorem, Theorem 10.43, we find $x \in H$ such that

$$(\varphi|Tx) = B(\varphi,x) = (\varphi|y) \qquad \forall \varphi \in H,$$

that is, $Tx = y$. Finally, from the coercivity assumption we infer $\lambda\|x\|^2 \le |(x|Tx)| \le \|x\|\,\|Tx\|$, that is, $\|T^{-1}\| \le \frac{1}{\lambda}$. □

b. Green's operator
Given a bilinear form $B$ in a real Hilbert space as above, the Green operator associated to $B$ is the operator $\Gamma_B : H^* \to H$ that maps $L \in H^*$ into the unique solution $u_{L,B} \in H$ of $B(\varphi,u_{L,B}) = L(\varphi)$ $\forall \varphi \in H$. It is easily seen that $\Gamma_B$ is linear, and the estimate $\|u_{L,B}\| \le \frac{1}{\lambda}\|L\|$ says that $\Gamma_B$ is continuous. Of course, if $\Gamma$ is the Riesz operator and $T : H \to H$ is an isomorphism such that $B(v,u) = (v|Tu)$, then $\Gamma_B = T^{-1} \circ \Gamma$.

10.45 ¶. Under the assumptions of Theorem 10.43, let $K \subset H$ be a closed convex set of a real Hilbert space. Show that
(i) the functional $\mathcal{F}(u)$ has a unique minimizer $\bar u \in K$,
(ii) $\bar u$ is the unique solution of the variational inequality

$$u \in K, \qquad B(u, u - v) \le L(u - v) \qquad \forall v \in K.$$


c. Ritz's method
The Dirichlet principle answers the question of the existence and uniqueness of the minimizer of $\mathcal{F}(u) := \frac{1}{2}B(u,u) - L(u)$ and characterizes such a minimizer as the unique solution of $B(v,u_L) = L(v)$ $\forall v \in H$. But how can one compute $u_L$? If $H$ is a separable Hilbert space, there is an easy answer. In fact, since $B(u,v)$ is an inner product, we can find a complete system $\{e_n\}$ in $H$ which is orthonormal with respect to $B$,

$$B(e_i,e_j) = \delta_{ij},$$

such that every $u \in H$ uniquely writes as $u = \sum_{j=1}^{\infty} B(u,e_j)\,e_j$, compare Theorem 10.20. If $B(\varphi,u) = L(\varphi)$ $\forall \varphi \in H$, then $B(e_j,u) = L(e_j)$, thus we have the following.

10.46 Theorem (Ritz's method). Let $H$ be a separable real Hilbert space, $B$ a symmetric coercive bilinear form, $L \in H^*$ and $\{e_n\}$ a complete orthonormal system with respect to $B$. Then $L(v) = B(v,u)$ $\forall v \in H$ has the unique solution

$$u_L = \sum_{j=1}^{\infty} L(e_j)\,e_j.$$

This, of course, allows us to set up a procedure that, starting from a denumerable dense set of vectors $\{x_n\}$, computes a system of orthonormal vectors with respect to $B(\,,\,)$ by the Gram-Schmidt method, and yields the approximations $\sum_{j=1}^{N} L(e_j)\,e_j$ of $u_L$.

10.47 ¶. With the notation of Theorem 10.46, show that for every integer $N \ge 1$, $u_N := \sum_{j=1}^{N} L(e_j)\,e_j$ is the solution in $\mathrm{Span}\{e_1, e_2, \dots, e_N\}$ of the system of $N$ linear equations

$$B(v,u_N) = L(v) \qquad \forall v \in \mathrm{Span}\{e_1, e_2, \dots, e_N\},$$

and the unique minimizer of

$$\tfrac{1}{2}B(v,v) - \Re(L(v)), \qquad v \in \mathrm{Span}\{e_1, e_2, \dots, e_N\}.$$

10.48 ¶. Show that the following error estimate for Ritz's method holds:

$$\frac{\lambda}{2}\,\|u - u_N\|^2 \le \mathcal{F}(u_N) - \inf_H \mathcal{F},$$

where $\mathcal{F}(u) := \frac{1}{2}B(u,u) - L(u)$. [Hint: Compute $\mathcal{F}(u+v) - \mathcal{F}(u)$.]
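Ritz's procedure can be tried out in finite dimension, where the "dense set" is simply a basis. The sketch below is an illustration under stated assumptions, not the authors' implementation: $H = \mathbb{R}^3$, $B(v,u) := v \cdot Su$ for a made-up symmetric positive definite matrix `S`, and $L(v) := f \cdot v$ for a made-up vector `f`. It orthonormalizes the standard basis with respect to $B$ by Gram-Schmidt and accumulates the partial sums $u_N = \sum_{j \le N} L(e_j)\,e_j$.

```python
# Ritz's method on H = R^3 (S and f are hypothetical illustrative data).

def dot(v, u):
    return sum(a * b for a, b in zip(v, u))

def matvec(S, u):
    return [dot(row, u) for row in S]

def ritz(S, f):
    """Return the Ritz approximations u_1, ..., u_n for B(v, u) = L(v)."""
    n = len(f)
    B = lambda v, u: dot(v, matvec(S, u))
    basis, u, approximations = [], [0.0] * n, []
    for k in range(n):
        # Start from the k-th standard basis vector x_k.
        x = [1.0 if i == k else 0.0 for i in range(n)]
        # Gram-Schmidt step with respect to the inner product B.
        for e in basis:
            c = B(x, e)
            x = [xi - c * ei for xi, ei in zip(x, e)]
        norm = B(x, x) ** 0.5
        e_k = [xi / norm for xi in x]
        basis.append(e_k)
        # Add the term L(e_k) e_k of the expansion of u_L.
        coeff = dot(f, e_k)
        u = [ui + coeff * ei for ui, ei in zip(u, e_k)]
        approximations.append(list(u))
    return approximations

def energy(S, f, u):
    """F(u) = B(u, u)/2 - L(u)."""
    return 0.5 * dot(u, matvec(S, u)) - dot(f, u)

S = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
f = [1.0, 2.0, 3.0]
approximations = ritz(S, f)
u_L = approximations[-1]
residual = [a - b for a, b in zip(matvec(S, u_L), f)]
energies = [energy(S, f, u) for u in approximations]
```

In this setting the last approximation solves $Su = f$ up to rounding, and the energies $\mathcal{F}(u_N)$ do not increase with $N$, since each $u_N$ minimizes $\mathcal{F}$ over a larger subspace.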


d. Linear regression
Let $H, Y$ be Hilbert spaces and let $A \in \mathcal{L}(H,Y)$. Given $y \in Y$, we may find the minimum points $u \in H$ of the functional

$$\mathcal{F}(u) := \|Au - y\|_Y^2, \qquad u \in H. \tag{10.14}$$

From the orthogonal projection theorem, we immediately infer the following.

10.49 Proposition. Let $A \in \mathcal{L}(H,Y)$ be a bounded operator with closed range. Then the functional (10.14) has a minimum point $\bar u \in H$ and, if $u \in H$ is another minimizer of (10.14), then $u - \bar u \in N(A)$. Moreover, all minimum points are characterized as the points $u \in H$ such that $Au - y \perp R(A)$, i.e., as the solutions of

$$A^*(Au - y) = 0. \tag{10.15}$$

If $N(A) = \{0\}$, as $N(A) = N(A^*A)$, (10.15) has a unique solution, $u = (A^*A)^{-1}A^*y$. If $N(A) \neq \{0\}$, the minimizer is not unique, so it is worth computing the minimizer of least norm, equivalently the only minimizer that belongs to $N(A)^\perp$, or the solution of

$$\begin{cases} A^*(Au - y) = 0, \\ u \in N(A)^\perp. \end{cases} \tag{10.16}$$

Recall that, $R(A)$ being closed, the map $A|_{N(A)^\perp} : N(A)^\perp \to R(A)$ is an isomorphism by the open mapping theorem. Consequently

$$A^\dagger y := \big(A|_{N(A)^\perp}\big)^{-1} Q y, \qquad y \in Y,$$

where $Q$ is the orthogonal projection onto $R(A)$, defines a bounded linear operator $A^\dagger : Y \to H$ called the Moore-Penrose inverse of $A$. It is trivial to check that the solution $u$ of (10.16) is $u = A^\dagger y$.

In the simplest case, $N(A^*) = \{0\}$, we have $R(A) = Y$ and (10.15) is equivalent to solving $Au = y$. Since we want to find a solution in $N(A)^\perp$, it is worth solving $AA^*z = y$ so that $u = A^\dagger y = A^*(AA^*)^{-1}y$. In general, however, both $AA^*$ and $A^*A$ are singular and, in order to compute $A^\dagger y$, we resort to an approximation argument. Consider the penalized functional

$$\mathcal{F}_\lambda(u) := \|Au - y\|_Y^2 + \lambda\|u\|_H^2, \qquad u \in H, \tag{10.17}$$

where $\lambda > 0$, that we may also write as

$$\mathcal{F}_\lambda(u) = \|y\|_Y^2 - 2(Au|y)_Y + \|Au\|_Y^2 + \lambda\|u\|_H^2.$$

Observing that $L(u) := (Au|y)_Y = (A^*y|u)_H$ belongs to $\mathcal{L}(H,\mathbb{R})$ and that

$$B(v,u) := (Av|Au)_Y + \lambda(v|u)_H = \lambda(v|u)_H + (v|A^*Au)_H$$


is a symmetric, bounded, coercive, bilinear form on $H$, it follows from the abstract Dirichlet principle, Theorem 10.43, that $\mathcal{F}_\lambda$ has a unique minimizer $u_\lambda \in H$ given by the unique solution of

$$(\varphi|A^*Au_\lambda)_H + \lambda(\varphi|u_\lambda)_H = (\varphi|A^*y)_H \qquad \forall \varphi \in H,$$

i.e.,

$$(\lambda\,\mathrm{Id} + A^*A)\,u_\lambda = A^*y. \tag{10.18}$$

We also get, multiplying both sides of (10.18) by $u_\lambda$,

$$\lambda\|u_\lambda\|_H^2 + \|Au_\lambda\|_Y^2 = (y|Au_\lambda)_Y,$$

from which we infer the estimate, independent of $\lambda$,

$$\|Au_\lambda\|_Y \le \|y\|_Y. \tag{10.19}$$

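10.50 Proposition. Let $A \in \mathcal{L}(H,Y)$ be a bounded operator with closed range and, for $\lambda > 0$, let

$$u_\lambda := (\lambda\,\mathrm{Id} + A^*A)^{-1}A^*y \in H$$

be the unique minimizer of (10.17). Then $\{u_\lambda\}$ converges to $A^\dagger y$ in $H$ and

$$\big\|(\lambda\,\mathrm{Id} + A^*A)^{-1}A^* - A^\dagger\big\| \to 0 \qquad \text{as } \lambda \to 0^+.$$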

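Proof. Since $R(A)$ is closed by hypothesis, there exists $C > 0$ such that

$$\|v\|_H \le C\|Av\|_Y \qquad \forall v \in N(A)^\perp. \tag{10.20}$$

Since $\lambda u_\lambda = A^*(y - Au_\lambda) \in R(A^*) \subset N(A)^\perp$, we get in particular from (10.19) and (10.20)

$$\|u_\lambda\|_H \le C\|Au_\lambda\|_Y \le C\|y\|_Y \qquad \forall \lambda > 0.$$

Let now $\lambda, \mu > 0$. From (10.18) we have

$$-(\lambda u_\lambda - \mu u_\mu) = A^*A(u_\lambda - u_\mu),$$

from which we infer

$$\begin{aligned}
\|A(u_\lambda - u_\mu)\|_Y^2 &= -(u_\lambda - u_\mu \,|\, \lambda u_\lambda - \mu u_\mu)_H \\
&\le \|u_\lambda - u_\mu\|_H\,\|\lambda u_\lambda - \mu u_\mu\|_H \\
&\le \|u_\lambda - u_\mu\|_H \big(|\lambda|\,\|u_\lambda - u_\mu\|_H + |\lambda - \mu|\,\|u_\mu\|_H\big) \\
&\le \lambda\|u_\lambda - u_\mu\|_H^2 + |\lambda - \mu|\,\|u_\mu\|_H\,\|u_\lambda - u_\mu\|_H.
\end{aligned}$$

Taking into account (10.20) and the boundedness of the $u_\mu$'s, $\|u_\mu\|_H \le C\|y\|_Y$, we then infer

$$\|u_\lambda - u_\mu\|_H \le 2C^3\,\|y\|_Y\,|\lambda - \mu| \tag{10.21}$$

provided $2C^2\lambda < 1$. For any $\{\lambda_k\}$, $\lambda_k \to 0^+$, we then infer from (10.21) that $\{u_{\lambda_k}\}$ is a Cauchy sequence in $N(A)^\perp$, hence converges to some $\bar u \in N(A)^\perp$. Passing to the limit in (10.18), we also get $A^*(A\bar u - y) = 0$, since $\{u_\lambda\}$ is bounded, i.e., $\bar u = A^\dagger y$, as required. □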
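The convergence $u_\lambda \to A^\dagger y$ can be observed on a small singular example. The sketch below assumes $H = Y = \mathbb{R}^2$ and uses a made-up rank-one matrix `A` and data `y`; the helper `solve` is an ordinary Gaussian elimination written for this illustration only. For this choice the least-norm minimizer is $A^\dagger y = (1,1)$, and a direct computation gives $u_\lambda = \frac{4}{4+\lambda}(1,1)$.

```python
# Penalized least squares (lam*Id + A^T A) u = A^T y on a singular example.
# A and y are hypothetical data; A has rank one, so A^T A is not invertible.

def solve(M, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            m = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= m * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def penalized_minimizer(A, y, lam):
    """Return u_lam solving (lam*Id + A^T A) u = A^T y, cf. (10.18)."""
    rows, cols = len(A), len(A[0])
    M = [[sum(A[k][i] * A[k][j] for k in range(rows)) + (lam if i == j else 0.0)
          for j in range(cols)] for i in range(cols)]
    rhs = [sum(A[k][i] * y[k] for k in range(rows)) for i in range(cols)]
    return solve(M, rhs)

A = [[1.0, 1.0], [1.0, 1.0]]
y = [1.0, 3.0]
pseudo_solution = [1.0, 1.0]   # least-norm minimizer of |Au - y|, computed by hand

lambdas = [1.0, 1e-2, 1e-4]
errors = []
for lam in lambdas:
    u = penalized_minimizer(A, y, lam)
    errors.append(sum((a - b) ** 2 for a, b in zip(u, pseudo_solution)) ** 0.5)
```

The errors shrink roughly linearly in $\lambda$, in agreement with the Lipschitz-type estimate (10.21).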


10.3.3 Coercive nonsymmetric bilinear forms

Riesz's theorem extends to nonsymmetric bilinear forms.

a. The Lax-Milgram theorem
As for finite systems of linear equations in a finite number of unknowns, in order to solve $Tx = y$, it is often worth first solving $TT^*x = y$ or $T^*Tx = T^*y$, since $TT^*$ and $T^*T$ are self-adjoint. We proceed in this way to prove the following.

10.51 Theorem (Lax-Milgram). Let $B(u,v)$ be a continuous and coercive bilinear form on a Hilbert space $H$, i.e., there exist $\Lambda \ge \lambda > 0$ such that

$$|B(u,v)| \le \Lambda\,\|u\|\,\|v\|, \qquad B(u,u) \ge \lambda\|u\|^2 \qquad \forall u, v \in H.$$

Then for all $L \in H^*$ there exists a unique $u_L \in H$ such that

$$B(v,u_L) = L(v) \qquad \forall v \in H; \tag{10.22}$$

moreover $\|u_L\| \le \frac{1}{\lambda}\|L\|$, i.e., Green's operator associated to $B$, $\Gamma_B : H^* \to H$, $\Gamma_B(L) := u_L$, is continuous.

Proof. Let $T : H \to H$ be the continuous linear operator associated to $B$ by

$$B(v,u) = (v|Tu) \qquad \forall u, v \in H.$$

The bilinear form

$$\tilde B(u,v) := (TT^*u\,|\,v) = (T^*u\,|\,T^*v)$$

is trivially continuous and symmetric; it is also coercive, in fact,

$$\lambda^2\|u\|^4 \le |B(u,u)|^2 = |(u\,|\,T^*u)|^2 \le \|u\|^2\,\|T^*u\|^2 = \|u\|^2\,\tilde B(u,u).$$

Riesz's theorem, Theorem 10.43, then yields a unique $\tilde u_L \in H$ such that

$$\tilde B(v,\tilde u_L) = L(v) \qquad \forall v \in H.$$

Thus $u_L := T^*\tilde u_L$ is a solution for (10.22). Uniqueness follows from the coercivity of $B$. □

Equivalently we can state the following.

10.52 Theorem. Let $T : H \to H$ be a continuous and coercive linear operator,

$$\|Tu\| \le \Lambda\|u\|, \qquad (u|Tu) \ge \lambda\|u\|^2 \qquad \forall u \in H,$$

where $\Lambda \ge \lambda > 0$. Then $T$ is injective and surjective; moreover its inverse $T^{-1}$ is a linear continuous and coercive operator with $\|T^{-1}\| \le \lambda^{-1}$.

10.53 ¶. Show the equivalence of Theorems 10.51 and 10.52.

10.54 ¶. Read Theorem 10.52 when $H = \mathbb{R}^n$; in particular, interpret coercivity in terms of eigenvalues of the symmetric part of the matrix associated to $T$.
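The device used in the proof of the Lax-Milgram theorem, namely solving the symmetric problem for $TT^*$ first and then applying $T^*$, can be followed step by step in $\mathbb{R}^2$. The sketch below uses a made-up non-symmetric coercive matrix `T` (its symmetric part has smallest eigenvalue 2) and a made-up right-hand side `y`; the $2 \times 2$ symmetric system is solved by Cramer's rule purely for brevity.

```python
# Solving T u = y through the symmetric problem T T* w = y, u = T* w.
# T and y are hypothetical illustrative data on H = R^2.

def dot(v, u):
    return sum(a * b for a, b in zip(v, u))

def matvec(M, u):
    return [dot(row, u) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[dot(row, col) for col in Bt] for row in A]

T = [[2.0, 1.0], [-1.0, 3.0]]
y = [1.0, 2.0]
Tt = transpose(T)

# The symmetric, coercive auxiliary operator T T*.
G = matmul(T, Tt)

# Cramer's rule for the 2x2 system G w = y.
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
w = [(G[1][1] * y[0] - G[0][1] * y[1]) / det,
     (G[0][0] * y[1] - G[1][0] * y[0]) / det]

# Back-substitution u = T* w gives the solution of the original problem.
u = matvec(Tt, w)

coercivity = 2.0   # smallest eigenvalue of the symmetric part diag(2, 3)
norm_u = dot(u, u) ** 0.5
norm_y = dot(y, y) ** 0.5
```

For these data $u = (1/7, 5/7)$, and the bound $\|u\| \le \|y\|/\lambda$ of Theorem 10.52 holds with room to spare.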


b. Faedo-Galerkin method
If $H$ is a separable Hilbert space, the solution $u_L$ of the linear equation (10.22) can be approximated by a procedure similar to the one of Ritz.

Let $H$ be a separable Hilbert space and let $\{e_n\}$ be a complete orthonormal system in $H$. For every integer $N$, we define $V_N := \mathrm{Span}\{e_1, \dots, e_N\}$, and let $P_N : H \to H$ be the orthogonal projection on $V_N$ and $u_N \in V_N$ be the solution of the equation

$$B(\varphi,u_N) = L(\varphi) \qquad \forall \varphi \in V_N, \tag{10.23}$$

i.e., in coordinates, $u_N := \sum_{j=1}^{N} x^j e_j$ where

$$\sum_{j=1}^{N} B(e_i,e_j)\,x^j = L(e_i) \qquad \forall i = 1, \dots, N.$$

Notice that the system has a unique solution since the matrix $\mathbf{B}$, $\mathbf{B}_{ij} = B(e_i,e_j)$, has $N$ linearly independent columns as $B$ is coercive.

10.55 Theorem (Faedo-Galerkin). The sequence $\{u_N\}$ converges to $u_L$ in $H$.

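Proof. We have

$$\begin{aligned}
\lambda\|u_N - u_L\|^2 &\le B(u_N - u_L, u_N - u_L) \\
&= B(u_N,u_N) + B(u_L,u_L) - B(u_N,u_L) - B(u_L,u_N) \\
&= B(u_L, u_L - u_N),
\end{aligned}$$

since $B(u_N,u_L) = L(u_N) = B(u_N,u_N)$. It suffices to show that for every $\varphi \in H$

$$B(\varphi, u_N - u_L) \to 0 \qquad \text{as } N \to \infty. \tag{10.24}$$

We first observe that the sequence $\{u_N\}$ is bounded in $H$ by $\|L\|/\lambda$ since $\lambda\|u_N\|^2 \le B(u_N,u_N) = L(u_N) \le \|L\|\,\|u_N\|$. On the other hand, we infer from (10.22) and (10.23) that

$$B(P_N\varphi, u_N - u_L) = 0 \qquad \forall \varphi \in H, \tag{10.25}$$

hence

$$B(\varphi, u_N - u_L) = B(\varphi - P_N\varphi, u_N - u_L) + B(P_N\varphi, u_N - u_L) = B(\varphi - P_N\varphi, u_N - u_L),$$

and

$$|B(\varphi, u_N - u_L)| \le \Lambda\,\|u_N - u_L\|\,\|\varphi - P_N\varphi\| \le 2\,\frac{\Lambda}{\lambda}\,\|L\|\,\|\varphi - P_N\varphi\|.$$

Then (10.24) follows since $\|\varphi - P_N\varphi\| \to 0$ as $N \to \infty$. □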
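The Galerkin systems (10.23) are easy to assemble when everything is finite-dimensional. The sketch below is only an illustration under stated assumptions: $H = \mathbb{R}^6$ with the standard basis, $B(v,u) := v \cdot Tu$ for a made-up non-symmetric tridiagonal matrix `T` whose symmetric part is the familiar second-difference matrix, and $L(v) := f \cdot v$. Since $B(e_i,e_j) = T_{ij}$, the $N$-th Galerkin system is the leading $N \times N$ block of `T`; the helper `solve` is a plain Gaussian elimination written for this example.

```python
# Faedo-Galerkin approximations in H = R^6 (T and f are hypothetical data).
import math

def solve(M, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            m = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= m * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def galerkin(T, f, N):
    """Solve B(e_i, u_N) = L(e_i) for i < N and pad u_N with zeros."""
    block = [row[:N] for row in T[:N]]
    return solve(block, f[:N]) + [0.0] * (len(f) - N)

def norm(v):
    return math.sqrt(sum(a * a for a in v))

n = 6
T = [[0.0] * n for _ in range(n)]
for i in range(n):
    T[i][i] = 2.0
    if i + 1 < n:
        T[i][i + 1] = -0.5   # symmetric part -1, skew part +0.5
        T[i + 1][i] = -1.5   # symmetric part -1, skew part -0.5
f = [1.0] * n

# Coercivity constant: smallest eigenvalue of the symmetric part tridiag(-1, 2, -1).
lam = 2.0 - 2.0 * math.cos(math.pi / (n + 1))

approximations = [galerkin(T, f, N) for N in range(1, n + 1)]
u_L = approximations[-1]
errors = [norm([a - b for a, b in zip(u, u_L)]) for u in approximations]
bound = norm(f) / lam
```

Each $u_N$ obeys the a priori bound $\|u_N\| \le \|L\|/\lambda$ used in the proof; in finite dimension the last approximation coincides with $u_L$.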


10.4 Linear Compact Operators

In Chapter 4 we presented a rather complete study of linear operators in finite-dimensional spaces. The study of linear operators in infinite-dimensional spaces is more complicated. As we have seen, several important linear operators are not continuous, and moreover, linear continuous operators may have a nonclosed range: we may have $\{x_k\} \subset H$, $\{y_k\} \subset Y$ such that $Tx_k = y_k$, $y_k \to y \in Y$, but the equation $Tx = y$ has no solution.

Here we shall confine ourselves to discussing compact perturbations of the identity, for which we prove that the range or image is closed. We notice however that for some applications this is not sufficient, and a spectral theory for both bounded and unbounded self-adjoint operators has been developed. But we shall not deal with these topics.

10.4.1 Fredholm-Riesz-Schauder theory

a. Linear compact operators
Let $H$ be a real or complex Hilbert space. Recall, cf. Chapter 9,

10.56 Definition. A linear operator $K : H \to H$ is said to be compact if and only if $K$ is continuous and maps bounded sets into sets with compact closure. The set of compact operators in $H$ is denoted by $\mathcal{K}(H,H)$.

Therefore $K : H \to H$ is compact if and only if $K$ is continuous and every bounded sequence $\{u_n\} \subset H$ has a subsequence $\{u_{k_n}\}$ such that $K(u_{k_n})$ converges in $H$. Also $\mathcal{K}(H,H) \subset \mathcal{L}(H,H)$. Moreover, every linear continuous operator with finite-dimensional range is a compact operator; in particular every linear operator on $H$ is compact if $H$ has finite dimension. On the other hand, since the identity map on $H$ is not compact if $\dim H = +\infty$, we conclude that $\mathcal{K}(H,H)$ is a proper subset of $\mathcal{L}(H,H)$ if $\dim H = +\infty$.

Exercise 10.89 shows that compact operators need not have finite-dimensional range. However, cf. Theorem 9.140,

10.57 Theorem. $\mathcal{K}(H,H)$ is the closure of the space of the linear continuous operators of finite-dimensional range.

Proof. Suppose that the sequence $\{A_n\}$ of linear continuous operators with finite-dimensional range converges to $A \in \mathcal{L}(H,H)$, $\|A_n - A\| \to 0$. Then by (i) of Theorem 9.140 $A$ is compact. Conversely, suppose that $A$ is compact, and let $B$ be the unit ball of $H$. Then $A(B)$ has compact closure, hence for all $n$ there is a $1/n$-net covering $A(B)$, i.e., there are points $y_1, y_2, \dots, y_N \in A(B)$, $N = N(n)$, such that $A(B) \subset \bigcup_{j=1}^{N} B(y_j, 1/n)$. Define $V_n := \mathrm{Span}\{y_1, y_2, \dots, y_N\}$, let $P_n : H \to V_n$ be the orthogonal projection onto $V_n$ and $A_n := P_n \circ A$. Clearly each $A_n$ has finite-dimensional range, thus it suffices to prove that $\|A_n - A\| \to 0$. For all $x \in B$ we find $i \in \{1, 2, \dots, N\}$ such that $\|Ax - y_i\| < 1/n$, hence, since $P_n y_i = y_i$ and $\|P_n z\| \le \|z\|$,


Figure 10.6. Marcel Riesz (1886-1969) and the frontispiece of a volume by Frigyes Riesz (1880-1956).

$$\|P_nAx - Ax\| \le \|P_nAx - P_ny_i\| + \|P_ny_i - Ax\| \le 2\|Ax - y_i\| < 2/n$$

for all $x \in B$. □

10.58 Proposition. Let $K \in \mathcal{K}(H,H)$. Then the adjoint $K^*$ of $K$ is compact, and $AK$ and $KA$ are compact provided $A \in \mathcal{L}(H,H)$.

Proof. The second part of the claim is trivial. We shall prove the first part. Let $\{u_n\} \subset H$ be a bounded sequence, $\|u_n\| \le M$. Then $\{K^*u_n\}$ is also bounded, hence $\{KK^*u_n\}$ has a subsequence, still denoted by $\{KK^*u_n\}$, that converges. This implies that $\{K^*u_n\}$ is a Cauchy sequence since

$$\|K^*u_i - K^*u_j\|^2 = (K^*(u_i - u_j)\,|\,K^*(u_i - u_j)) = (u_i - u_j\,|\,KK^*(u_i - u_j)) \le 2M\,\|KK^*(u_i - u_j)\|.$$

b. The alternative theorem
Let $A \in \mathcal{L}(H,H)$ be a bounded operator with bounded inverse. A linear operator $T \in \mathcal{L}(H,H)$ of the form $T = A + K$, where $K \in \mathcal{K}(H,H)$, is called a compact perturbation of $A$. Typical examples are the compact perturbations of the identity, $T = \mathrm{Id} + K$, $K \in \mathcal{K}(H,H)$, to which we can always reduce, since $T = A + K = A(\mathrm{Id} + A^{-1}K)$.

The following theorem, that we already know in finite dimension, holds for compact perturbations of the identity. It is due to Frigyes Riesz (1880-1956) and extends previous results of Ivar Fredholm (1866-1927).


10.59 Theorem (Alternative). Let $H$ be a Hilbert space and let $T = A + K : H \to H$ be a compact perturbation of an operator $A \in \mathcal{L}(H,H)$ with bounded inverse. Then
(i) $R(T)$ is closed,
(ii) $N(T)$ and $N(T^*)$ are finite-dimensional linear subspaces; moreover, $\dim N(T) - \dim N(T^*) = 0$.

The following lemma will be needed in the proof of the theorem.

10.60 Lemma. Let $T = \mathrm{Id} + K$ be a compact perturbation of the identity. If $\{x_n\} \subset H$ is a bounded sequence such that $Tx_n \to y$, then there exist a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ and $x \in H$ such that

$$x_{k_n} \to x \qquad \text{and} \qquad Tx = y.$$

Proof. Since $\{x_n\}$ is bounded and $K$ is compact, we find a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ and $z \in H$ such that $Kx_{k_n} \to z$. It follows that $x_{k_n} = Tx_{k_n} - Kx_{k_n} \to y - z =: x$ and, since $K$ is continuous, $Tx_{k_n} \to Tx = x + Kx = x + z = y$. □

Proof of Theorem 10.59. Since $T = A + K = A(\mathrm{Id} + A^{-1}K)$, $A$ has a bounded inverse and $A^{-1}K$ is compact, we shall assume without loss of generality that $A = \mathrm{Id}$.

Step 1. First we show that there is a constant $C > 0$ such that

$$\|x\| \le C\|Tx\| \qquad \forall x \in N(T)^\perp. \tag{10.26}$$

Suppose this is not true. Then there exists a sequence $\{x_n\} \subset N(T)^\perp$ such that $\|x_n\| = 1$ and $\|T(x_n)\| \to 0$. $T(x_n) \to 0$ and Lemma 10.60 yield a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ and $x \in H$ such that $x_{k_n} \to x$ and $Tx = 0$. The first condition yields $x \in N(T)^\perp$ and $\|x\| = 1$, while the second yields $x \in N(T)$. A contradiction.

It follows from (10.26) that $T$ is an isomorphism between the Hilbert space $N(T)^\perp$ and $R(T)$, hence $R(T)$ is complete, thus closed. This proves (i).

Step 2. By Lemma 10.60 every bounded sequence in $N(T)$ has a convergent subsequence. Riesz's theorem, Theorem 9.21, then yields that $\dim N(T) < +\infty$. Similarly, one shows that $\dim N(T^*) < \infty$.

The rest of the claim is trivial if $K$ is self-adjoint. Otherwise, we may proceed as follows, also compare 10.62 below. We use the fact that every compact operator is the limit of operators with finite-dimensional range, Theorem 10.57. First we assume $T = \mathrm{Id} + K$, $K$ of finite-dimensional range. In this case $K : N(K)^\perp \to R(K)$ is an isomorphism, in particular $\dim R(K$ …

$$(\mathrm{Id} - Q)\sum_{j=0}^{\infty} Q^j = \mathrm{Id}.$$

In particular, $\mathrm{Id} - Q$ is invertible with bounded inverse $\sum_{j=0}^{\infty} Q^j$. Therefore we can write

$$T = \mathrm{Id} + K = \mathrm{Id} - Q + K_1 = (\mathrm{Id} - Q)\big(\mathrm{Id} + (\mathrm{Id} - Q)^{-1}K_1\big) =: A(\mathrm{Id} + B),$$

where $B$ has finite-dimensional range; the claim (ii) then follows from Step 2. □


c. Some facts related to the alternative theorem
We collect here a few different proofs of some of the claims of the alternative theorem, since they are of interest by themselves.

10.61 $R(\mathrm{Id} + K)$ is closed. As we know, this is equivalent to $R(T) = N(T^*)^\perp$, i.e., to showing that for every $f \in N(T^*)^\perp$ the equation $Tu := u + K(u) = f$ is solvable. To show this, we can use Riesz's theorem.

Given $f \in N(T^*)^\perp$, we try to solve $TT^*v = f$, i.e.,

$$b(\varphi,v) = (\varphi|f) \qquad \forall \varphi \in H, \tag{10.27}$$

where

$$b(\varphi,v) := (TT^*v\,|\,\varphi) = (T^*v\,|\,T^*\varphi).$$

If $v \in H$ solves (10.27), then $u := T^*v$ solves $Tu = f$, since $(Tu|\varphi) = (f|\varphi)$ $\forall \varphi \in H$.

We notice that $N(TT^*) = N(T^*)$, therefore the bilinear bounded form $b(\varphi,v)$ is symmetric if $H$ is real (sesquilinear if $H$ is complex) and well defined on the Hilbert space $N(TT^*)^\perp$. We claim that $b(\varphi,v)$ is coercive on $N(TT^*)^\perp$,

$$b(\varphi,\varphi) \ge c\,\|\varphi\|^2 \qquad \forall \varphi \in N(TT^*)^\perp. \tag{10.28}$$

Otherwise, there exists a sequence $\{e_n\} \subset N(T^*)^\perp$ with $\|e_n\| = 1$ and

$$b(e_n,e_n) = \|e_n + K^*e_n\|^2 \to 0.$$

By Lemma 10.60, applied to $T^* = \mathrm{Id} + K^*$, there exist $e \in H$ and a subsequence $\{e_{k_n}\}$ of $\{e_n\}$ such that

$$e_{k_n} \to e, \qquad T^*e = e + K^*e = 0;$$

in particular $\|e\| = 1$, $e \in N(T^*)$ and $e \in N(T^*)^\perp$, a contradiction.

We then conclude that $b(\varphi,v)$ is an inner product on $N(T^*)^\perp$ (a Hermitian product if $H$ is complex), equivalent to the original one. Applying Riesz's theorem, we then find $v \in N(T^*)^\perp$ such that

$$b(\varphi,v) = (\varphi|f) \qquad \forall \varphi \in N(T^*)^\perp, \qquad \|v\| \le \frac{1}{c}\,\|f\|. \tag{10.29}$$

It remains to show that $v$ solves (10.27). If $P$ is the orthogonal projection of $H$ onto $N(T^*)^\perp$, then (10.29) is equivalent to

$$b(P\varphi,v) = (P\varphi|f) \qquad \forall \varphi \in H.$$

On the other hand,

$$(\varphi - P\varphi\,|\,f) = 0, \qquad b(\varphi - P\varphi, v) = (\varphi - P\varphi\,|\,TT^*v) = 0,$$

since $\varphi - P\varphi \in N(T^*)$ and $f \in N(T^*)^\perp$, hence $b(\varphi,v) = b(P\varphi,v) = (P\varphi|f) = (\varphi|f)$.

10.62 Another proof of $\dim N(T) = \dim N(T^*)$.

Step 1. Let us prove the equality if $T$ or $T^*$ is injective. Let $H_1 := R(T)$ and, by induction, $H_{j+1} := T(H_j)$. $\{H_j\}$ is a nonincreasing sequence of closed subspaces of $H$. We claim that there exists $\bar n$ such that $H_n = H_{\bar n}$ $\forall n \ge \bar n$. If not, we can find $\{e_n\} \subset R(T)$ with $\|e_n\| = 1$ and $e_n \in H_n \cap H_{n+1}^\perp$. Since for $n > m$, $T(e_n), T(e_m), e_n \in H_{m+1}$, $e_m \in H_{m+1}^\perp$, and

$$Ke_n - Ke_m = (e_n + Ke_n) - (e_m + Ke_m) - e_n + e_m =: z + e_m,$$

we may infer

$$\|Ke_n - Ke_m\|^2 = \|z\|^2 + \|e_m\|^2 \ge 1,$$

a contradiction, since $\{K(e_n)\}$ has a convergent subsequence.


If $N(T) = \{0\}$ and $H_1 = R(T) \neq H$, then necessarily $H_{j+1} \neq H_j$ $\forall j$ since $T$ is injective, and this is not possible, as we have seen. Hence $H = R(T)$ and $N(T^*) = R(T)^\perp = \{0\}$. If $N(T^*) = \{0\}$, then repeating the above consideration for $\mathrm{Id} + K^*$ we get $N(T) = \{0\}$.

Step 2. Let us prove that $\dim N(T) \ge \dim R(T)^\perp$. Assume that $\dim N(T) < \dim R(T)^\perp$. Then there exists a linear continuous operator $L$ that maps the finite-dimensional space $N(T)$ into the finite-dimensional space $R(T)^\perp$ with $L$ injective but not surjective. Let us extend $L$ as a linear operator from $H$ to $R(T)^\perp$ by setting $Lx = 0$ $\forall x \in N(T)^\perp$. Then $L$ has a finite-dimensional range, thus it is compact. Now we claim that $N(\mathrm{Id} + L + K) = \{0\}$. In fact $u + Ku + Lu = 0$ implies $Tu = u + Ku = -Lu$ and, since $Tu \in R(T)$ and $Lu \in R(T)^\perp$, we infer $Tu = Lu = 0$, i.e., $u \in N(T)$ and $u \in N(T)^\perp$, since $L$ is injective when restricted to $N(T)$; in conclusion $u = 0$. Step 1 then says that $\mathrm{Id} + K + L$ is surjective. This is a contradiction, since the equation $u + Ku + Lu = v$ has no solution when $v \in R(T)^\perp$, $v \notin R(L)$.

Step 3. Replacing $K$ by $K^*$ in the above proves that

$$\dim R(T)^\perp = \dim N(T^*) \ge \dim R(T^*)^\perp = \dim N(T),$$

which completes the proof.

10.63 Yet another proof of $\dim N(T) = \dim N(T^*)$. Let $H$ be a separable Hilbert space, $T = \mathrm{Id} + K$ be a compact perturbation of the identity, and let $\{e_j\}$ be a complete orthonormal system for $H$, ordered in such a way that $N(T) + N(T^*)$ is generated by the first elements of the system.

Proposition. Let $V_n = \mathrm{Span}\{e_1, e_2, \dots, e_n\}$ and let $P_n$ be the orthogonal projection onto $V_n$. Then there exist a constant $\gamma > 0$ and an integer $n_0$ such that $\forall n \ge n_0$

$$\|P_nT(\varphi)\| \ge \gamma\,\|\varphi\| \qquad \forall \varphi \in V_n \cap N(T)^\perp.$$

Proof. Suppose the conclusion is not true; then for a sequence $n_i \to \infty$ and vectors $\varphi_i \in V_{n_i} \cap N(T)^\perp$ we have

$$\|P_{n_i}T\varphi_i\| \to 0, \qquad \|\varphi_i\| = 1. \tag{10.30}$$

By Lemma 10.60, for a subsequence $\{\varphi_{k_i}\}$ and $\varphi \in H$ we then have $K\varphi_{k_i} \to -\varphi$. Since $P_nx \to x$ as $n \to \infty$, we infer

$$\|P_{n_{k_i}}K\varphi_{k_i} + \varphi\| \le \|P_{n_{k_i}}K\varphi_{k_i} - K\varphi_{k_i}\| + \|K\varphi_{k_i} + \varphi\| \to 0,$$

hence $\varphi_{k_i} \to \varphi$ in $H$, since $\varphi_i = P_{n_i}T\varphi_i - P_{n_i}K(\varphi_i)$, and finally $\varphi + K\varphi = 0$. In particular $\|\varphi\| = 1$ and $\varphi \in N(T) \cap N(T)^\perp$, a contradiction. □

From the previous proposition, if $\{\varphi_1, \varphi_2, \dots, \varphi_s\}$ is a family of linearly independent vectors, then $P_nT(\varphi_1), \dots, P_nT(\varphi_s)$ are also linearly independent, at least for $n$ large enough; on the other hand, since $R(T) = N(T^*)^\perp$, the vectors $P_nT(\varphi_1), \dots, P_nT(\varphi_s)$ belong to $P_nR(T) = V_n \cap N(T^*)^\perp$. Hence we have

$$\dim V_n \cap N(T^*)^\perp \ge \dim V_n \cap N(T)^\perp$$

for $n$ large enough. Similarly one proves $\dim V_n \cap N(T)^\perp \ge \dim V_n \cap N(T^*)^\perp$, hence

$$\dim V_n \cap N(T)^\perp = \dim V_n \cap N(T^*)^\perp$$

for $n$ large enough. The claim then follows by considering the orthogonal complements.


d. The alternative theorem in Banach spaces
The alternative theorem generalizes to the so-called Fredholm operators between Banach spaces $X$ and $Y$, of which compact perturbations of the identity are special cases.

Let $X$ be a Banach space on $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ and $X^* := \mathcal{L}(X,\mathbb{K})$ its dual space, which is a Banach space with the dual norm

$$\|\varphi\| := \sup_{\|x\| = 1} |\varphi(x)| \qquad \forall \varphi \in X^*.$$

If $\varphi \in X^*$ and $x \in X$, we often write $\langle \varphi, x \rangle$ for $\varphi(x)$. Clearly, the bilinear map $\langle\ ,\ \rangle : X^* \times X \to \mathbb{K}$, defined by $\langle \varphi, x \rangle = \varphi(x)$, is continuous,

$$|\langle \varphi, x \rangle| \le \|\varphi\|\,\|x\| \qquad \forall \varphi \in X^*,\ \forall x \in X.$$

In general, $X^*$ is not isomorphic to $X$, contrary to the case of Hilbert spaces. If $X$ and $Y$ are Banach spaces and if $T : X \to Y$ is a linear bounded operator, the dual or adjoint operator $T^* : Y^* \to X^*$ is defined by

$$\langle T^*(\varphi), x \rangle := \langle \varphi, Tx \rangle. \tag{10.31}$$

$T^*$ is continuous and $\|T^*\| = \|T\|$.

10.64 ¶. Let $T \in \mathcal{L}(H,H)$, where $H$ is a Hilbert space. We then have two notions of adjoint operators: the operator $T^* : H \to H$ in (10.9) and the operator $\tilde T : H^* \to H^*$ defined in (10.31). Show that, if $G : H^* \to H$ is Riesz's operator, then $\tilde T = G^{-1} \circ T^* \circ G$.

For a subset $V \subset X$ of a Banach space $X$, we define

$$V^\perp := \{\varphi \in X^* \mid \langle \varphi, x \rangle = 0\ \forall x \in V\},$$

called the annihilator of $V$. Notice that $V^\perp$ is closed in $X^*$. We have

10.65 Lemma. Let $T : X \to X$ be a bounded linear operator. Then

$$N(T^*) = R(T)^\perp, \qquad R(T^*) = N(T)^\perp.$$

The class of linear compact operators on a Banach space, denoted by $\mathcal{K}(X,X)$, is a closed subset of $\mathcal{L}(X,X)$. But in general these operators are not limits of linear operators with finite-dimensional range, contrary to the case $X = H$, where $H$ is a Hilbert space, as shown by a famous example due to Lindemann and Strauss. Recall that we can always approximate $K \in \mathcal{K}(X,X)$ by nonlinear operators with range contained in a finite-dimensional subspace, see Theorem 9.140. We can now state, but we omit the proof, the following result.

10.66 Theorem (Alternative). Let $X$ be a Banach space and let $T = A + K : X \to X$ be a compact perturbation of an isomorphism $A : X \to X$. Then
(i) $R(T)$ is closed,
(ii) $N(T)$ and $N(T^*)$ have finite dimension, and $\dim N(T) = \dim N(T^*)$.

Consequently, we have the following.

10.67 Corollary (Alternative). Let $A, K \in \mathcal{L}(X,X)$ where $A$ is a linear isomorphism of $X$ and $K$ is compact, and set $T := A + K$. Then the equation $Ax + Kx = y$ is solvable if and only if $y \in N(T^*)^\perp$.


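e. The spectrum of compact operators

10.68 Definition. Let $H$ be a Hilbert space on $\mathbb{K}$, $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, and let $L \in \mathcal{L}(H,H)$ be a bounded linear operator on $H$. The resolvent $\rho(L)$ of the operator $L$ is defined as the set

$$\rho(L) := \{\lambda \in \mathbb{K} \mid (\lambda\,\mathrm{Id} - L)^{-1} \text{ is a bounded operator}\}$$

and its complement $\sigma(L) = \mathbb{K} \setminus \rho(L)$ is called the spectrum of $L$. By the open mapping theorem

$$\sigma(L) = \{\lambda \in \mathbb{K} \mid \lambda\,\mathrm{Id} - L \text{ is not injective or not surjective}\}.$$

10.69 Definition. Let $L \in \mathcal{L}(H,H)$. Then the pointwise spectrum of $L$ is defined as

$$\sigma_p(L) := \{\lambda \in \mathbb{K} \mid \lambda\,\mathrm{Id} - L \text{ is not injective}\}. \tag{10.32}$$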

The points in $\sigma_p(L)$ are called eigenvalues of $L$, and the elements of $N(\lambda\,\mathrm{Id} - L)$ are called the eigenvectors of $L$ corresponding to $\lambda$. Of course, $\sigma_p(L) \subset \sigma(L)$ and, if $\dim H < +\infty$, $\sigma_p(L) = \sigma(L)$ as, in this case, a linear operator is injective if and only if it is surjective. If $\dim H = +\infty$, there exist, as we know, linear bounded operators which are injective but not surjective, hence, in general, $\sigma_p(L) \neq \sigma(L)$.

10.70 Remark. In the sequel we shall deal with compact operators $L$. For these operators the equality $\sigma_p(L) = \sigma(L)$ also follows from the alternative theorem of the previous section.

As in the finite-dimensional case, see Proposition 4.5, eigenvectors corresponding to distinct eigenvalues are linearly independent. Moreover

$$\sigma(L) \subset \{\lambda \in \mathbb{K} \mid |\lambda| \le \|L\|\}, \tag{10.33}$$

because, if $|\lambda| > \|L\|$, then $\big\|\frac{1}{\lambda}L\big\| < 1$, therefore, see Proposition 9.106, $\mathrm{Id} + \frac{1}{\lambda}L$, equivalently $\lambda\,\mathrm{Id} + L$, is invertible and

$$(\lambda\,\mathrm{Id} + L)^{-1} = \sum_{j=0}^{\infty} (-1)^j \lambda^{-j-1} L^j;$$

the same argument applied to $-L$ shows that $\lambda\,\mathrm{Id} - L$ is invertible, hence $\lambda \in \rho(L)$.

The following theorem gives a complete description of the spectrum of a linear compact operator.
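The Neumann series behind (10.33) can be summed numerically. The sketch below works with $\lambda\,\mathrm{Id} - L$, for which the series reads $\sum_{j \ge 0} \lambda^{-j-1}L^j$ without alternating signs; the $2 \times 2$ matrix `L` and the value `lam` are made-up data chosen so that $|\lambda| > \|L\|$. It checks that the partial sum is, up to rounding, an inverse of $\lambda\,\mathrm{Id} - L$.

```python
# Partial sums of the Neumann series for (lam*Id - L)^{-1} (hypothetical data).

def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_resolvent(L, lam, terms):
    """Return sum_{j < terms} lam^{-(j+1)} L^j."""
    n = len(L)
    power = identity(n)                       # L^0
    total = [[0.0] * n for _ in range(n)]
    for j in range(terms):
        scale = lam ** (-(j + 1))
        total = [[total[r][c] + scale * power[r][c] for c in range(n)]
                 for r in range(n)]
        power = matmul(power, L)              # next power of L
    return total

L = [[0.5, 0.2], [0.1, 0.3]]   # operator norm below 0.63, well under lam
lam = 2.0
R = neumann_resolvent(L, lam, 60)

# (lam*Id - L) R should be the identity up to rounding.
shifted = [[(lam if i == j else 0.0) - L[i][j] for j in range(2)] for i in range(2)]
product = matmul(shifted, R)
deviation = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(2) for j in range(2))
```

The truncation error decays geometrically with ratio $\|L\|/|\lambda|$, so a few dozen terms already reach machine precision for these data.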


10.71 Theorem. Let $H$ be a Hilbert space with $\dim H = +\infty$ and let $K \in \mathcal{K}(H,H)$ be a compact operator. Then

(i) $0 \in \sigma(K)$,
(ii) $K$ has either a finite number of eigenvalues or an infinite sequence of eigenvalues that converges to 0,
(iii) the eigenspaces corresponding to nonzero eigenvalues have finite dimension,
(iv) if $\lambda \neq 0$ and $\lambda$ is not an eigenvalue for $K$, then $\lambda\,\mathrm{Id} - K$ is an isomorphism of $H$ and $(\lambda\,\mathrm{Id} - K)^{-1}$ is continuous,
(v) $\sigma(K) \setminus \{0\} = \sigma_p(K) \setminus \{0\}$.

Proof. (i) In fact $R(K) \neq H$, since $K$ is compact.

(ii) From (10.33) the set $\Lambda$ of eigenvalues is bounded, thus either $\Lambda$ is finite or $\Lambda$ has an accumulation point. Let us prove that in the latter case $\Lambda$ has only 0 as an accumulation point; we then conclude that $\Lambda$ is denumerable, actually a sequence converging to zero.

Suppose $\{\lambda_n\}$ is a sequence of distinct nonzero eigenvalues with corresponding eigenvectors $\{u_n\}$ such that $\lambda_n \to \lambda \neq 0$. Set $\mu_n := 1/\lambda_n$ and $V_n := \mathrm{Span}\{u_1, u_2, \dots, u_n\}$, and notice that, if $w := \sum_{j=1}^{n} c_j u_j \in V_n$, then $\lambda_n w - Kw = \sum_{j=1}^{n} c_j(\lambda_n - \lambda_j)u_j \in V_{n-1}$. We now construct a new sequence $\{v_n\}$ with $\|v_n\| = 1$ by choosing $v_1 \in V_1$ and, for $n \ge 2$, $v_n \in V_n \cap V_{n-1}^\perp$. Clearly $v_n$ is an eigenvector corresponding to $\lambda_n$ and, according to the previous remark, $v_n - \mu_nKv_n \in V_{n-1}$. For $n > m$ we then find $v_n - \mu_nKv_n,\ \mu_mKv_m \in V_{n-1}$, $v_n \in V_{n-1}^\perp$ and

$$K(\mu_nv_n - \mu_mv_m) = v_n - (v_n - \mu_nKv_n + \mu_mKv_m) =: v_n - z,$$

with $v_n \in V_{n-1}^\perp$ and $z \in V_{n-1}$. Thus we conclude

$$\|K(\mu_nv_n) - K(\mu_mv_m)\|^2 = \|v_n\|^2 + \|z\|^2 \ge 1,$$

a contradiction, since $\{\mu_nv_n\}$ is bounded and $K$ is compact. In conclusion $\lambda = 0$.

(iii), (iv) are part of the claims of the alternative theorem, and (v) follows from (iv). □

10.72 Remark. Actually, Theorem 10.71 holds under the more general assumption that $H$ is a Banach space. In this case it is known as the Riesz-Schauder theorem.

10.4.2 Compact self-adjoint operators

Let us discuss more specifically the spectral properties of linear self-adjoint operators.

a. Self-adjoint operators

10.73 Proposition. Let $H$ be a real Hilbert space and $L : H \to H$ be a bounded self-adjoint linear operator. Set
$$m := \inf_{|u|=1} (Lu|u), \qquad M := \sup_{|u|=1} (Lu|u).$$
Then

10. Hilbert Spaces, Dirichlet's Principle and Linear Compact Operators

(i) eigenvectors corresponding to distinct eigenvalues are orthogonal,
(ii) $m, M \in \sigma(L)$,
(iii) $\|L\| = \sup_{\|u\|=1} |(u|Lu)| = \max(|m|, |M|)$.

Also, if $L$ is a bounded self-adjoint operator in a complex Hilbert space, then $(u|Lu) \in \mathbb{R}$ $\forall u \in H$; consequently all eigenvalues are real, and moreover (i), (ii) and (iii) hold.

Proof. (i) In fact $Lu = \lambda u$ and $Lv = \mu v$, $\lambda, \mu \in \mathbb{R}$, $\lambda \neq \mu$, yield
$$(\lambda - \mu)(u|v) = (Lu|v) - (u|Lv) = 0.$$

We now prove that there is a constant $C$, depending only on $L$, such that for all $u \in H$
$$\|Mu - Lu\| \leq C\,|(Mu - Lu|u)|^{1/2}, \qquad \|mu - Lu\| \leq C\,|(mu - Lu|u)|^{1/2}. \tag{10.34}$$
The bilinear form $b(u,v) := (Mu - Lu|v)$ is symmetric and nonnegative, $b(u,u) \geq 0$; the Cauchy-Schwarz inequality then yields
$$|(Mu - Lu|v)| \leq |(Mu - Lu|u)|^{1/2}\,|(Mv - Lv|v)|^{1/2} \leq C\,|(Mu - Lu|u)|^{1/2}\,\|v\|.$$
By choosing $v = Mu - Lu$, the first of (10.34) follows. A similar argument yields the second of (10.34).

(ii) Let us prove that $M \in \sigma(L)$; similarly one proves that $m \in \sigma(L)$. Let $\{u_k\}$ be a sequence such that $\|u_k\| = 1$ and $(Lu_k|u_k) \to M$. Because of (10.34), $Mu_k - Lu_k \to 0$ in $H$. If $M$ is in the resolvent set, then $M\,\mathrm{Id} - L$ is one-to-one and onto with continuous inverse because of the open mapping theorem. Thus
$$u_k = (M\,\mathrm{Id} - L)^{-1}(Mu_k - Lu_k) \to 0,$$
which contradicts $\|u_k\| = 1$.

(iii) Set $a := \sup_{\|u\|=1} |(Lu|u)|$; of course $\max(|M|, |m|) = a$ and $a \leq \|L\|$. Let us show that $\|L\| \leq a$. Since $L$ is self-adjoint,
$$4(Lu|v) = (L(u+v)|u+v) - (L(u-v)|u-v),$$
hence, according to the parallelogram law,
$$4|(Lu|v)| \leq a\left(\|u+v\|^2 + \|u-v\|^2\right) = 2a\left(\|u\|^2 + \|v\|^2\right).$$
Replacing $u$ and $v$ with $\epsilon u$, $v/\epsilon$ respectively, $\epsilon > 0$, we find
$$4|(Lu|v)| \leq 2a \min_{\epsilon > 0}\left(\epsilon^2\|u\|^2 + \frac{1}{\epsilon^2}\|v\|^2\right) = 4a\,\|u\|\,\|v\|.$$
Hence, if $v := Lu$, we have $\|Lu\|^2 \leq a\,\|u\|\,\|Lu\|$, i.e., $\|L\| \leq a$.

In the complex case we have
$$(Lu|u) = (u|L^*u) = (u|Lu) = \overline{(Lu|u)},$$
hence $(Lu|u) \in \mathbb{R}$. We leave to the reader the completion of the proof. □

We notice that the proof of (ii) of Proposition 10.73 uses the continuity of $(M\,\mathrm{Id} - L)^{-1}$ when $M \in \rho(L)$. If $L$ is compact, this is a consequence of the alternative theorem and the open mapping theorem is not actually needed.


10.74 Corollary. Let $L : H \to H$ be a linear compact self-adjoint operator. Then there exists an eigenvalue $\lambda$ of $L$ such that $\|L\| = |\lambda|$.

Proof. If $L = 0$, then $\lambda = 0$ is an eigenvalue. If $L \neq 0$, then $\|L\| = \max(|m|, |M|) \neq 0$ and $M, m \in \sigma(L)$. Assuming $\|L\| = |M|$, then $M \neq 0$ and, according to Theorem 10.71, $M \in \sigma_p(L)$, i.e., $M$ is an eigenvalue of $L$.

Alternatively, we can proceed more directly as follows. Let $\{u_n\}$ be a sequence with $\|u_n\| = 1$ such that $(Lu_n|u_n) \to M$; then $(Mu_n - Lu_n|u_n) \to 0$, and by (10.34) $Mu_n - Lu_n \to 0$ in $H$. Since $L$ is compact, there is a subsequence $\{u_{k_n}\}$ of $\{u_n\}$ such that $Lu_{k_n}$ converges; since $M \neq 0$ and $Mu_{k_n} - Lu_{k_n} \to 0$, also $u_{k_n} \to u$ for some $u \in H$, hence
$$Mu - Lu = 0, \qquad \|u\| = 1,$$
i.e., $M$ is an eigenvalue for $L$. □

b. Spectral theorem

10.75 Theorem (Spectral theorem). Let $H$ be a real or complex Hilbert space and $K$ a linear self-adjoint compact operator. Denote by $W$ the family of finite linear combinations of eigenvectors of $K$ corresponding to nonzero eigenvalues. Then $W$ is dense in $N(K)^{\perp}$. In particular, $N(K)^{\perp}$ has an at most denumerable orthonormal basis of eigenvectors of $K$. If $P_j$ is the orthogonal projection on the eigenspace corresponding to the nonzero eigenvalue $\lambda_j$, then
$$K = \sum_{j=1}^{\infty} \lambda_j P_j \qquad \text{in } \mathcal{L}(H,H).$$

Proof. We order the nonzero eigenvalues as
$$\lambda_i \neq \lambda_j \ \text{ for } i \neq j, \qquad |\lambda_1| \geq |\lambda_2| \geq |\lambda_3| \geq \dots$$
and set $N_j := N(\lambda_j\,\mathrm{Id} - K)$ for the finite-dimensional eigenspace corresponding to $\lambda_j$. According to Proposition 10.73
$$N_j \perp N_k \ \text{ for } j \neq k \qquad \text{and} \qquad N(K) \perp N_j \ \ \forall j,$$
hence $N(K) \subset W^{\perp}$. To prove that $W$ is dense in $N(K)^{\perp}$, it suffices to show that $\overline{W} = N(K)^{\perp}$ or $W^{\perp} = N(K)$. Define
$$W_n := \begin{cases} \{0\} & \text{if } K \text{ has no nonzero eigenvalues}, \\ \bigoplus_{j=1}^{n} N_j & \text{if } K \text{ has at least } n \text{ nonzero eigenvalues}, \\ W_p & \text{if } K \text{ has only } p < n \text{ nonzero eigenvalues}, \end{cases}$$
and $V_n := W_n^{\perp}$, $V := W^{\perp}$. Trivially $V = W^{\perp} = \bigcap_n V_n$. Notice that, since $K$ is self-adjoint, $K(W_n^{\perp}) \subset W_n^{\perp}$ if $K(W_n) \subset W_n$, and the linear operator $K_{|V_n} \in \mathcal{L}(V_n, V_n)$ is again compact and self-adjoint. Moreover, the spectrum of $K_{|V_n}$ is made by the eigenvalues of $K$ different from $\{\lambda_1, \lambda_2, \dots, \lambda_n\}$. Therefore by Corollary 10.74
$$\|K_{|V_n}\| = \begin{cases} |\lambda_{n+1}| & \text{if } K \text{ has at least } n+1 \text{ eigenvalues}, \\ 0 & \text{otherwise}. \end{cases} \tag{10.35}$$
If $K$ has a finite number of eigenvalues, then $V = V_n$ for $n$ large and (10.35) yields $K(V_n) = \{0\}$, i.e., $V = \bigcap_n V_n \subset N(K)$.

If $K$ has a denumerable set $\{\lambda_n\}$ of eigenvalues, then $|\lambda_n| \to 0$ by Theorem 10.71, hence
$$\|K_{|V}\| \leq \|K_{|V_n}\| = |\lambda_{n+1}| \to 0$$
and $K(V) = \{0\}$, i.e., $V \subset N(K)$. Choosing an orthonormal set of eigenvectors in each eigenspace $N_j$, we can produce an orthonormal system $\{e_n\}$ of eigenvectors of $K$ corresponding to nonzero eigenvalues, i.e., such that
$$(e_i|e_j) = \delta_{ij}, \qquad Ke_j = \lambda_j e_j,$$
that is complete in the closure $\overline{W} = N(K)^{\perp}$ of $W$.

Let us prove the last part of the claim. Let $P_j$ and $Q_n$ be the orthogonal projections, respectively, on $N_j$ and $W_n$. Since the eigenspaces are orthogonal, we have $K(Q_n(x)) = \sum_{j=1}^{n} \lambda_j P_j(x)$ $\forall x \in H$, hence
$$Kx - \sum_{j=1}^{n} \lambda_j P_j(x) = K(x - Q_n(x)),$$
and therefore
$$\Big\|Kx - \sum_{j=1}^{n} \lambda_j P_j(x)\Big\| \leq \|K_{|V_n}\|\,\|x - Q_n(x)\| \leq |\lambda_{n+1}|\,\|x\|,$$
that is,
$$\Big\|K - \sum_{j=1}^{n} \lambda_j P_j\Big\| \leq |\lambda_{n+1}|;$$
the conclusion then follows since $|\lambda_n| \to 0$ as $n \to \infty$. □

c. Compact normal operators

A linear bounded operator $T \in \mathcal{L}(H,H)$ in a complex Hilbert space $H$ is called normal if $T^*T = TT^*$. It is easy to show that if $T$ is normal, then

(i) $N(T) = N(T^*T) = N(TT^*) = N(T^*)$,
(ii) $N(T - \lambda\,\mathrm{Id}) = N(T^* - \overline{\lambda}\,\mathrm{Id})$,

that is, $T$ and $T^*$ have the same eigenspaces and conjugate eigenvalues. If $T$ is normal, the operators
$$A := \frac{T + T^*}{2}, \qquad B := \frac{T - T^*}{2i}$$
are self-adjoint and commute, $AB = BA$. Two linear compact self-adjoint operators that commute have the same spectral resolution; see Theorem 4.29 for the finite-dimensional case.

10.76 Theorem. Let $H$ be a complex Hilbert space and $A, B$ two linear compact self-adjoint operators in $H$ such that $AB = BA$. Then there exists a denumerable orthonormal system $\{e_n\}$ which is complete in $(N(A) \cap N(B))^{\perp}$ and made by common eigenvectors of $A$ and $B$. If $\lambda_j$ and $\mu_j$ are, respectively, the eigenvalue of $A$ and the eigenvalue of $B$ relative to $e_j$, and $P_j : H \to H$ is the orthogonal projection onto $\mathrm{Span}\{e_j\}$, $P_j x := (x|e_j)e_j$, then
$$A = \sum_{j=1}^{\infty} \lambda_j P_j, \qquad B = \sum_{j=1}^{\infty} \mu_j P_j \qquad \text{in } \mathcal{L}(H,H).$$


Figure 10.7. Two pages from two papers by John von Neumann (1903-1957) in Mathematische Annalen.

Proof. Let $W$ be as in Theorem 10.75 for the operator $A$. As in the finite-dimensional case, see Proposition 4.27, for every eigenvalue $\lambda$ of $A$ we find a basis of the corresponding eigenspace $N(\lambda\,\mathrm{Id} - A)$ made by eigenvectors of $B$. By induction we then find a denumerable orthonormal system which is complete in $\overline{W}$ and made of common eigenvectors $\{e_n\}$ of $A$ and $B$. By Theorem 10.75 then $\overline{W} = N(A)^{\perp}$ and $\{e_n\}$ is a basis of $N(A)^{\perp}$ of common eigenvectors of $A$ and $B$. Now $AB = BA$ implies that $B(N(A)) \subset N(A)$. Therefore, applying the spectral theorem to $B_{|N(A)}$, we find further eigenvectors $\{u_n\}$ of $B$ corresponding to nonzero eigenvalues that form a basis of $N(A) \cap N(B)^{\perp}$. The family $\{e_n\} \cup \{u_n\}$ is now a denumerable orthonormal set of eigenvectors common to $A$ and $B$ that is complete in $(N(A) \cap N(B))^{\perp}$. The second part of the claim easily follows by applying Theorem 10.75 to $A$ and $B$. □

10.77 Corollary. Let $H$ be a complex Hilbert space and let $T : H \to H$ be a compact normal operator. Then there exists a denumerable basis $\{e_n\}$ in $H$ of common eigenvectors of $T$ and $T^*$. If $P_j$ denotes the orthogonal projection on $\mathrm{Span}\{e_j\}$ and $\lambda_j$ is the corresponding eigenvalue, then
$$T = \sum_{j=1}^{\infty} \lambda_j P_j, \qquad T^* = \sum_{j=1}^{\infty} \overline{\lambda_j} P_j \qquad \text{in } \mathcal{L}(H,H).$$

Proof. Set $A := (T + T^*)/2$ and $B := (T - T^*)/(2i)$. We can apply Theorem 10.76 and find a basis $\{e_n\}$ in $(N(A) \cap N(B))^{\perp}$, i.e., a basis in $\ker(T)^{\perp} = \ker(T^*)^{\perp}$ made by common eigenvectors of $T = A + iB$ and $T^* = A - iB$. □

d. The Courant-Hilbert-Schmidt theory

In several instances one is led to discuss the existence and uniqueness of solutions in a Hilbert space $H$ of equations of the type
$$a(\varphi, u) - \lambda k(\varphi, u) = F(\varphi) \qquad \forall \varphi \in H \tag{10.36}$$
where $F \in H^*$, and $a(\varphi, u)$, $k(\varphi, u)$ are bounded bilinear forms in $H$. As we have seen, by Riesz's theorem, there exist bounded operators $A, K \in \mathcal{L}(H,H)$ and $f \in H$ such that
$$a(\varphi, u) := (\varphi|Au), \qquad k(\varphi, u) := (\varphi|Ku), \qquad F(\varphi) = (\varphi|f)$$
for all $u, \varphi \in H$. Then (10.36) reads equivalently as the linear equation in $H$
$$(A - \lambda K)u = f. \tag{10.37}$$
With the previous notation suppose that

- $A$ is continuous, self-adjoint and coercive on $H$, i.e., there exists $\nu > 0$ such that
$$a(u,u) \geq \nu\,\|u\|^2 \qquad \forall u \in H, \tag{10.38}$$
- $K$ is compact, self-adjoint and positive, i.e.,
$$k(u,u) = (u|Ku) > 0 \qquad \forall u \neq 0,\ u \in H. \tag{10.39}$$

With these assumptions, the corresponding bilinear forms are continuous and symmetric; moreover $a(v,u)$ defines an inner product in $H$ equivalent to the original one $(v|u)$ since
$$\nu\,\|u\|^2 \leq a(u,u) \leq \|A\|\,\|u\|^2.$$
Finally, see Theorem 10.44, $A$ has a continuous inverse. The operator $A - \lambda K$ is therefore a compact perturbation of an isomorphism, and, since $A$ and $K$ are self-adjoint, the alternative theorem yields the following.

10.78 Theorem. The equation $Au - \lambda Ku = f$ has a solution if and only if $f$ is orthogonal to the solutions of $Au - \lambda Ku = 0$.

Now we want to study the equation
$$Au - \lambda Ku = 0,$$
equivalently,
$$a(\varphi, u) - \lambda k(\varphi, u) = 0 \qquad \forall \varphi \in H,$$
which can be rewritten as
$$\frac{1}{\lambda}u - A^{-1}Ku = 0.$$
With the assumptions we have made

o $A^{-1}K$ is a linear compact operator,
o $A^{-1}K$ is positive, since $a(u, A^{-1}Ku) = (u|AA^{-1}Ku) = (u|Ku) > 0$ for $u \neq 0$,
o $A^{-1}K$ is self-adjoint with respect to the inner product $a(v,u)$, since
$$a(v, A^{-1}Ku) = (v|Ku) = (u|Kv) = a(u, A^{-1}Kv).$$

Figure 10.8. Lord William Strutt Rayleigh (1842-1919) and the frontispiece of his Theory of Sound.

10.79 Definition. We shall say that $\lambda \neq 0$ is an eigenvalue of $(A,K)$ and that $u$ is an eigenvector of $(A,K)$ corresponding to $\lambda$ if $1/\lambda$ is an eigenvalue of $A^{-1}K$ and $u$ is a corresponding eigenvector, i.e., a solution of $Au - \lambda Ku = 0$.

The theory previously developed, when applied to the self-adjoint compact operator $A^{-1}K$ in the Hilbert space $H$ with the inner product $a(v,u)$, yields the following.

10.80 Theorem. Let $H$ be an infinite-dimensional Hilbert space and let $A, K \in \mathcal{L}(H,H)$ be self-adjoint, with $A$ coercive and $K$ compact and positive. The equation $Au - \lambda Ku = 0$ has zero as its unique solution except for a sequence $\{\lambda_n\}$ of positive real numbers such that $\lambda_n \to +\infty$. For any such $\lambda_n$, the vector space of solutions of $Au - \lambda_n Ku = 0$ is finite dimensional. Moreover, if $W$ is the family of finite linear combinations of eigenvectors of $(A,K)$, then $W$ is dense in $H$. In particular, there exists a complete system $\{e_j\}$ in $H$ of eigenvectors of $(A,K)$ such that
$$a(e_i, e_j) = \lambda_j\delta_{ij}, \qquad k(e_i, e_j) = \delta_{ij} \qquad \forall i, j.$$

Proof. The eigenvalues of $A^{-1}K$ are positive since $A^{-1}K$ is positive. Since $A^{-1}K$ is compact, $A^{-1}K$ has a denumerable sequence of eigenvalues $\{\mu_n\}$ with $\mu_n \to 0^+$, and the corresponding eigenspaces are finite dimensional by Theorem 10.71. Consequently $Au - \lambda Ku = 0$ has nonzero solutions for the sequence $\lambda_n = 1/\mu_n \to +\infty$. The spectral theorem yields the density of $W$ in $H$ and the existence of a basis $\{u_j\}$ of eigenvectors of $A^{-1}K$, orthonormal with respect to the inner product $a(v,u)$,
$$a(u_i, u_j) = \delta_{ij}, \qquad \frac{1}{\lambda_j}u_j - A^{-1}Ku_j = 0.$$
Therefore, $\lambda_j k(u_i, u_j) = a(u_i, u_j) = \delta_{ij}$ and, if we set $e_j := \sqrt{\lambda_j}\,u_j$, we conclude
$$a(e_i, e_j) = \lambda_j\delta_{ij}, \qquad k(e_i, e_j) = \delta_{ij}.$$
□

e. Variational characterization of eigenvalues

10.81 Theorem. Let $H$, $A$ and $K$ be as in Theorem 10.80. Let $\{e_n\}$ be a basis in $H$ of eigenvectors of $(A,K)$ ordered in such a way that the corresponding eigenvalues $\lambda_n$ form a nondecreasing sequence $\{\lambda_n\}$, $\lambda_n \leq \lambda_{n+1}$. Then each $\lambda_n$ is the minimum of the Rayleigh's quotients
$$\lambda_1 = \min_{u \neq 0} \frac{a(u,u)}{k(u,u)},$$
$$\lambda_n = \min_{u \neq 0}\left\{ \frac{a(u,u)}{k(u,u)} \ \Big|\ k(u, e_i) = 0 \ \ \forall i = 1, \dots, n-1 \right\} \qquad \text{if } n > 1.$$

Proof. For $u \in H$ write $u = \sum_{j=1}^{\infty} c_j e_j$ so that $k(u, e_j) = c_j$ and $k(u,u) = \sum_{j=1}^{\infty} |c_j|^2$. If $u \in V_n := \{u \mid k(u, e_i) = 0,\ \forall i = 1, \dots, n-1\}$, then $c_i = 0$ for $i = 1, \dots, n-1$, hence
$$k(u,u) = \sum_{j=n}^{\infty} |c_j|^2$$
while
$$a(u,u) = \sum_{j=n}^{\infty} \lambda_j|c_j|^2 \geq \lambda_n\sum_{j=n}^{\infty} |c_j|^2 = \lambda_n k(u,u).$$
Therefore $\frac{a(u,u)}{k(u,u)} \geq \lambda_n$ on $V_n \setminus \{0\}$. On the other hand, $e_n \in V_n$ and $a(e_n, e_n) = \lambda_n k(e_n, e_n)$. □
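The statement of Theorem 10.81 can be tried out numerically in a finite-dimensional toy case. The sketch below is only an illustration under simplifying assumptions that are not in the text: $H = \mathbb{R}^3$, and both forms are diagonal, $a(u,v) = \sum_i a_i u_i v_i$ and $k(u,v) = \sum_i k_i u_i v_i$ with $a_i, k_i > 0$, so that the eigenvalues of $(A,K)$ are simply the ratios $a_i/k_i$. The names `a_diag`, `k_diag`, `rayleigh` and `sample_quotients` and the numerical values are hypothetical.

```python
import random

# Hypothetical diagonal data: a(u,v) = sum a_i u_i v_i, k(u,v) = sum k_i u_i v_i.
a_diag = [2.0, 3.0, 10.0]
k_diag = [1.0, 0.5, 2.0]


def rayleigh(u):
    """Rayleigh quotient a(u,u) / k(u,u) for the diagonal forms above."""
    num = sum(a * x * x for a, x in zip(a_diag, u))
    den = sum(k * x * x for k, x in zip(k_diag, u))
    return num / den


# Eigenvalues a_i / k_i in nondecreasing order, remembering the coordinate index.
eigen = sorted((a / k, i) for i, (a, k) in enumerate(zip(a_diag, k_diag)))
lambdas = [lam for lam, _ in eigen]
first_index = eigen[0][1]  # coordinate carrying the first eigenvector e_1


def sample_quotients(trials=2000, seed=0):
    """Smallest sampled quotient on H and on V_2 = {u : k(u, e_1) = 0}."""
    rng = random.Random(seed)
    free, constrained = [], []
    for _ in range(trials):
        u = [rng.uniform(-1.0, 1.0) for _ in a_diag]
        if all(x == 0.0 for x in u):
            continue
        free.append(rayleigh(u))
        w = list(u)
        w[first_index] = 0.0  # enforces k(w, e_1) = 0 in the diagonal case
        if any(x != 0.0 for x in w):
            constrained.append(rayleigh(w))
    return min(free), min(constrained)
```

With these data the sampled quotients never fall below $\lambda_1$ on all of $H$, nor below $\lambda_2$ on the constrained set, in agreement with the theorem; sampling of course only suggests the minimum, it does not prove it.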

Moreover, with the previous notation and assumptions, we have the following.

10.82 Theorem (Min-max characterization). Denote by $S$ a generic subspace of dimension $n - 1$. Then we have
$$\lambda_n = \max_{S} \min\left\{ \frac{a(u,u)}{k(u,u)} \ \Big|\ u \neq 0,\ k(u,z) = 0 \ \ \forall z \in S \right\}.$$

Proof. The inequality $\leq$ follows from Theorem 10.81. Let $S$ be a linear subspace of $H$ of dimension $n - 1$ and $V_n := \mathrm{Span}\{e_1, e_2, \dots, e_n\}$. Choose a nonzero vector $u_0 := \sum_{i=1}^{n} \alpha_i e_i$ so that $k(u_0, z) = 0$ $\forall z \in S$; this is possible since $\dim S = n - 1$. Then
$$k(u_0, u_0) = \sum_{i=1}^{n} \alpha_i^2, \qquad a(u_0, u_0) = \sum_{i=1}^{n} \lambda_i\alpha_i^2 \leq \lambda_n\sum_{i=1}^{n} \alpha_i^2 = \lambda_n k(u_0, u_0),$$
hence
$$\min\left\{ \frac{a(u,u)}{k(u,u)} \ \Big|\ u \neq 0,\ k(u,z) = 0 \ \ \forall z \in S \right\} \leq \lambda_n.$$
□

10.5 Exercises

10.83 ¶. Let $P_V$ be the orthogonal projection onto the closed subspace $V$ of a Hilbert space $H$. Show that

o $P_V + P_W$ is a projection onto a closed subspace if and only if $P_V P_W = 0$, and in this case $P_V + P_W = P_{V \oplus W}$,
o $P_V P_W$ is a projection onto a closed subspace if and only if $P_V P_W = P_W P_V$, and in this case $P_V P_W = P_{V \cap W}$.

10.84 ¶. Let $S, T \in \mathcal{L}(H,H)$. Show that $(S + T)^* = S^* + T^*$, $(ST)^* = T^*S^*$, $(\lambda T)^* = \overline{\lambda}\,T^*$, $\mathrm{Id}^* = \mathrm{Id}$, $T^{**} = T$.

10.85 ¶. Let $T \in \mathcal{L}(H,H)$. Show that $\|T\|^2 = \|T^*\|^2 = \|TT^*\| = \|T^*T\|$.

10.86 ¶. Show that Hilbert's cube $\{x \in \ell_2 \mid |x_n| \leq 1/n\}$ is compact, while $\{x \in \ell_2 \mid |x_n| \leq 1\}$ is not compact. Show also that Hilbert's cube has no interior points, i.e., its complement is dense.

10.87 ¶. Show the following.

Proposition. Let $L \in \mathcal{L}(H,H)$ be a bounded self-adjoint operator on a real or complex Hilbert space and
$$m := \inf_{|u|=1} (Lu|u), \qquad M := \sup_{|u|=1} (Lu|u).$$
Then
(i) $\sigma_p(L) \subset [m, M]$,
(ii) we have $(Lu|u) = M\,\|u\|^2$ (resp. $(Lu|u) = m\,\|u\|^2$) if and only if $u$ is an eigenvector of $L$ corresponding to $M$ (respectively $m$).

[Hint: (i) Proceed by contradiction using Riesz's theorem; in the complex case, first show that $\sigma_p(L) \subset \mathbb{R}$. (ii) Use (10.34).]

10.88 ¶. Show the following.

Proposition. Let $H$ be a Hilbert space, $\{\lambda_j\}$ a sequence of nonzero real numbers converging to $0$, $\{e_j\}$ an orthonormal set in $H$ and $P_j : H \to H$ the orthogonal projection onto $\mathrm{Span}\{e_j\}$. Show that the series $\sum_{j=1}^{\infty} \lambda_j P_j$ converges in $\mathcal{L}(H,H)$. Moreover, if
$$K := \sum_{j=1}^{\infty} \lambda_j P_j \qquad \text{in } \mathcal{L}(H,H), \tag{10.40}$$
then
(i) $\forall x \in H$, $Kx = \sum_{j=1}^{\infty} \lambda_j (x|e_j)e_j$,
(ii) for all $j$, $\lambda_j$ is a nonzero eigenvalue of $K$ and $Ke_j = \lambda_j e_j$,
(iii) the sequence $\{\lambda_j\}$ is the set of all nonzero eigenvalues of $K$,
(iv) $K$ is self-adjoint and compact,
(v) $\{e_j\}$ is a basis of $\ker K^{\perp}$, in particular $\ker K^{\perp}$ is separable,
(vi) if $\lambda \neq 0$ and $\lambda \neq \lambda_j$ $\forall j$, then $\lambda\,\mathrm{Id} - K$ is an isomorphism of $H$ into itself.

If $H$ is a complex Hilbert space and $\lambda_j \in \mathbb{C}$, then (i), (ii), (iii), (v), (vi) still hold and $K$ is compact and normal.

[Hint: (vi) follows from (v) and the alternative theorem. Moreover, an explicit bound for $(\lambda\,\mathrm{Id} - K)^{-1}$ follows from (i), assuming $H$ separable. Choose an orthonormal basis $\{z_n\}$ in $\ker K$. Then show that the equation $(\lambda\,\mathrm{Id} - K)x = y$ has a unique solution
$$x = (\lambda\,\mathrm{Id} - K)^{-1}y = \sum_{k=1}^{\infty} \frac{(y|e_k)}{\lambda - \lambda_k}\,e_k + \frac{1}{\lambda}\sum_{n} (y|z_n)\,z_n.$$
Then
$$\|x\| \leq \frac{1}{d}\,\|y\|$$
if $d := \inf_{k \in \mathbb{N}} |\lambda - \lambda_k|$.]

10.89 ¶. Let $H$ be separable and let $\{e_n\}$ be a basis of $H$. Consider the linear operator $T(e_j) = \frac{1}{j}e_j$, $j \geq 1$, i.e.,
$$T(u) := \sum_{j=1}^{\infty} \frac{1}{j}(u|e_j)\,e_j.$$
Show that $T$ is compact with a nonclosed range. [Hint: Show that $T$ is the limit in $\mathcal{L}(H,H)$ of a sequence of linear operators with finite-dimensional range. Then show that $v \in R(T)$ if and only if $\sum_{j=1}^{\infty} j^2|(v|e_j)|^2 < +\infty$. Choose $v_0 := \sum_{j=1}^{\infty} j^{-3/2}e_j$, $v_n := \sum_{j=1}^{\infty} j^{-(3/2 + 1/n)}e_j$ and show that $v_0 \notin R(T)$, $v_n \in R(T)$ and $|v_n - v_0| \to 0$.]

10.90 ¶. With the notation of Theorem 10.80 show the so-called completeness relations
$$k(u,v) = \sum_{i=1}^{\infty} k(u, e_i)\,k(v, e_i), \qquad a(u,v) = \sum_{i=1}^{\infty} \lambda_i\,k(u, e_i)\,k(v, e_i).$$

11. Some Applications

In this chapter we shall illustrate some of the applications of the abstract principles we stated in the previous chapter to specific concrete problems. Our aim is to show the usefulness of the abstract language in formulating and answering specific questions, identifying their specific characteristics and recognizing common features of problems that a priori are very different. Of course, the abstract approach mostly follows and is motivated by concrete questions, but later we see the approach as the most direct way to understand many questions, and even the most natural. Clearly, the problems we are going to discuss deserve more careful and detailed study because of their relevance, but this is out of our present scope and, in any case, often not possible because of the limited topics we have so far developed. For instance, in this chapter, we shall only use uniform convergence, since the use of integral norms, besides being more complex, requires the notion of Lebesgue's integration, without which any presentation would sound artificial.

11.1 Two Minimum Problems

11.1.1 Minimal geodesics in metric spaces

Let $X$ be a connected metric space such that every two of its points can be connected by a continuous path. One of the simplest problems of calculus of variations is to find a continuous curve of minimal length connecting two given points. A first question is deciding when such a minimal connection exists. Here, we shall see how the Fréchet-Weierstrass theorem, Theorem 6.24, and the Ascoli-Arzelà theorem, Theorem 9.48, lead to an answer.

a. Semicontinuity of the length

Let $X$ be a metric space. Recall that $f : X \to \mathbb{R}$ is called lower semicontinuous if the level sets $\Gamma_{f,\lambda} := \{x \mid f(x) \leq \lambda\}$ of $f$ are closed for all $\lambda \in \mathbb{R}$, equivalently if $f^{-1}(]\lambda, +\infty[)$ is open for all $\lambda \in \mathbb{R}$. Observing that, if $f = \sup_i f_i$, then $f^{-1}(]\lambda, +\infty[) = \bigcup_i f_i^{-1}(]\lambda, +\infty[)$, we conclude the following.

Figure 11.1. Frontispieces of Commentarii Petropoli vol. 3 (1732) and of the paper by Leonhard Euler (1707-1783) De linea brevissima.

11.1 Proposition. Let $f_i : X \to \mathbb{R}$, $i \in I$, be a family of lower semicontinuous functions on a metric space $X$. Then $f := \sup_i f_i$ is a lower semicontinuous function.

Let $(X, d)$ be a metric space. As we have seen, cf. Example 6.25, the length functional $L : C^0([a,b], X) \to \mathbb{R}$, which for each continuous curve $\varphi : [a,b] \to X$ gives its length, is not a continuous functional with respect to uniform convergence in $C^0([a,b], X)$. But we have the following.

11.2 Theorem (Semicontinuity). The length functional $L(\varphi)$ is lower semicontinuous in $C^0([a,b], X)$.

Proof. Recall that we have
$$L(\varphi) = \sup_{S \in \mathcal{S}} V_S(\varphi)$$
where
$$V_S(f) := \sum_{i=0}^{N-1} d(f(t_i), f(t_{i+1})), \qquad S = \{t_0 = a < t_1 < \dots < t_N = b\}.$$
Since the functional $f \to V_S(f)$ is continuous for every fixed subdivision $S$ of $[a,b]$, the result follows from Proposition 11.1. □
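The gap between continuity and lower semicontinuity of the length can be seen on a standard example, sketched below under assumptions of our own: the curves live in the Euclidean plane, and `zigzag(n)` is a hypothetical sawtooth curve on $[0,1]$ with $n$ teeth and slopes $\pm 1$. These curves converge uniformly to the segment from $(0,0)$ to $(1,0)$, whose length is $1$, while each of them has length $\sqrt{2}$. The function `polygonal_length` computes the quantity $V_S$ of the proof for a given subdivision.

```python
import math


def polygonal_length(curve, subdivision):
    """V_S(curve): sum of the distances between consecutive points curve(t_i)."""
    pts = [curve(t) for t in subdivision]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def zigzag(n):
    """Sawtooth curve on [0, 1] with n teeth of height 1/(2n) and slopes +-1."""
    def curve(t):
        s = t * n
        return (t, abs(s - round(s)) / n)
    return curve


def segment(t):
    """The limit curve: the straight segment from (0, 0) to (1, 0)."""
    return (t, 0.0)


def uniform_subdivision(num_intervals):
    return [i / num_intervals for i in range(num_intervals + 1)]


def report(n):
    """Length of zigzag(n) and its sup-distance from the segment."""
    S = uniform_subdivision(4 * n)  # contains every corner of zigzag(n)
    curve = zigzag(n)
    length = polygonal_length(curve, S)
    sup_dist = max(math.dist(curve(t), segment(t)) for t in S)
    return length, sup_dist
```

Calling `report(n)` for increasing `n` gives a sup-distance $1/(2n)$ tending to zero and a length that stays at $\sqrt{2}$, so the limit curve is strictly shorter than the approximating ones: the length can drop in the limit, but by Theorem 11.2 it cannot jump up.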

b. Compactness

The intrinsic reparametrization theorem, Theorem 7.44, can be reformulated as: For every family of parametric curves $\{C_i\}$ in $X$ of length strictly less than $k$, there exists a family of curves $\{C_i'\}$ parametrized in $[0,1]$, thus belonging to $C^0([0,1], X)$, such that $C_i$ and $C_i'$ are equivalent for all $i \in I$, in particular they have the same length, and the curves $C_i'$ are equi-Lipschitz with Lipschitz constant less than $k$. Assuming $X$ compact, the Ascoli-Arzelà theorem yields the following.

11.3 Theorem (Compactness). Let $X$ be a compact metric space and let $\{C_i\}$ be a family of parametrized curves of length strictly less than $k$. Then the family $\{C_i\}$ is relatively compact with respect to uniform convergence. More precisely, one can reparametrize the curves $C_i$ on $[0,1]$ in such a way that they belong to $\mathrm{Lip}_k([0,1], X)$, and therefore $\{C_i\}$ is a relatively compact subset of $C^0([0,1], X)$.

c. Existence of minimal geodesics

An immediate consequence of Theorems 11.2 and 11.3, on account of the Fréchet-Weierstrass theorem, is the following.

11.4 Theorem (Existence). Let $X$ be an arc-connected compact metric space and $P, Q$ two points of $X$. There exists a simple rectifiable curve of minimal length joining $P$ to $Q$, provided there exists at least a rectifiable curve connecting $P$ and $Q$.

Proof. Since there exists at least a rectifiable curve connecting $P$ and $Q$,
$$\lambda := \inf\{L(\gamma) \mid \gamma \text{ connecting } P \text{ and } Q\} < +\infty.$$
Let $k > \lambda$ and let
$$K := \{\varphi : [0,1] \to X \mid \varphi \in \mathrm{Lip}_k([0,1], X),\ \varphi(0) = P,\ \varphi(1) = Q\}.$$
By the Ascoli-Arzelà theorem, $K$ is compact in $C^0([0,1], X)$, hence there is $\varphi_0 \in K$ such that
$$L(\varphi_0) = \inf\{L(\gamma) \mid \gamma \text{ connecting } P \text{ and } Q\} \tag{11.1}$$
by the Fréchet-Weierstrass theorem. The map $\varphi_0$ need not be injective a priori. However, the intrinsic reparametrization $\psi : [0, L(\varphi_0)] \to X$, see Theorem 7.44, which is equivalent to $\varphi_0$, having Lipschitz constant one, satisfies
$$L(\psi, [x_1, x_2]) = |x_1 - x_2|$$
and is injective. In fact, if $\psi(x_1) = \psi(x_2)$ with $x_1 < x_2$, deleting the loop corresponding to the interval $]x_1, x_2[$, we would still get a curve connecting $P$ and $Q$, but of length strictly less than $L(\varphi_0)$, contradicting (11.1). □

11.5 ¶. Show that the compactness assumption on $X$ in Theorem 11.4 is necessary. In particular, discuss the cases when $X$ equals the closed unit cube minus an interior open segment and minus a closed interior segment.

11.1.2 A minimum problem in a Hilbert space

In this section we shall show how the theorem ensuring the existence of minimizers for quadratic coercive functionals generalizes to convex coercive functionals in a Hilbert space.

a. Weak convergence in Hilbert spaces

Let $X$ be a Banach space. We say that a sequence $\{x_n\} \subset X$ converges weakly to $x \in X$, and we write
$$x_n \rightharpoonup x,$$
if $F(x_n) \to F(x)$ $\forall F \in X^*$, i.e., for every linear continuous functional $F : X \to \mathbb{R}$ on $X$. On account of Riesz's representation theorem, we have the following.

11.6 Proposition. A sequence $\{u_n\}$ in a Hilbert space $H$ converges weakly to $u \in H$ iff $(u_n|v) \to (u|v)$ $\forall v \in H$.

If $H$ is finite dimensional, weak and strong convergence agree, since weak convergence amounts to the convergence of the components in an orthonormal basis. On the contrary, if $H$ has infinite dimension, the two notions of convergence differ. In fact, while from the inequality
$$|(u_n - u|v)| \leq \|v\|\,\|u_n - u\|$$
we get that strong convergence, $\|u_n - u\| \to 0$, implies weak convergence $u_n \rightharpoonup u$, the opposite is not true. Consider, for instance, a denumerable orthonormal set $\{e_n\} \subset H$. Then Bessel's inequality yields $(e_n|v) \to 0$ $\forall v \in H$, i.e., $e_n \rightharpoonup 0$, while $\{e_n\}$ does not converge since
$$\|e_n - e_m\|^2 = \|e_n\|^2 - 2(e_n|e_m) + \|e_m\|^2 = 2 \qquad \forall n \neq m.$$
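The orthonormal example above can be checked on a computer only in a truncated form, so the following is a sketch under an explicit assumption: $\ell_2$ is replaced by $\mathbb{R}^N$ with a fixed, hypothetical truncation dimension `N`, and the test vector is $v = (1/j)_{j \leq N}$. The pairings $(e_n|v) = 1/n$ become small as $n$ grows, while the mutual distances of the $e_n$ stay equal to $\sqrt{2}$.

```python
N = 200  # truncation dimension: an assumption of this sketch, not of the text


def basis_vector(n):
    """n-th standard basis vector of R^N (indices start at 1)."""
    return [1.0 if j == n else 0.0 for j in range(1, N + 1)]


def inner(u, v):
    """Euclidean inner product (u|v)."""
    return sum(x * y for x, y in zip(u, v))


# A fixed test vector v with coordinates 1/j.
v = [1.0 / j for j in range(1, N + 1)]

# (e_n|v) = 1/n: the pairings decay as n grows.
pairings = [inner(basis_vector(n), v) for n in (1, 10, 100)]

# ||e_3 - e_7||^2 = 2: the basis vectors never get close to each other.
diff = [x - y for x, y in zip(basis_vector(3), basis_vector(7))]
dist_sq = inner(diff, diff)
```

In the genuine infinite-dimensional setting the same computation holds for every $v \in H$ by Bessel's inequality; the truncation only serves to make the vectors representable.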

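Weak convergence is one of the major tools in modern analysis. Here we only state one of its most useful features.

11.7 Theorem. Every bounded sequence in a separable Hilbert space has a subsequence that is weakly convergent.

Proof. Let $\{x_n\}$, $\|x_n\| \leq M$, be a bounded sequence in $H$, and let $\{e_i\}$ be a basis of $H$. $\{x_n\}$ has a subsequence $\{x_n^{(1)}\}$ such that $(x_n^{(1)}|e_1) \to \alpha_1$. Similarly $\{x_n^{(1)}\}$ has a subsequence $\{x_n^{(2)}\}$ such that $(x_n^{(2)}|e_2) \to \alpha_2$, and so on. By a diagonal process we then find a subsequence $\{x_{k_n}\}$ of $\{x_n\}$ such that $(x_{k_n}|e_i) \to \alpha_i \in \mathbb{R}$ $\forall i$, and of course $|\alpha_i| \leq M$ $\forall i$. Now consider the map $T : H \to \mathbb{R}$ given by
$$T(y) := \sum_{i=1}^{\infty} (y|e_i)\,\alpha_i, \qquad y \in H.$$
$T$ is linear and bounded, $\|T\| \leq M$, as
$$\Big|\sum_{i=1}^{N} (y|e_i)\,\alpha_i\Big| = \lim_{n \to \infty}\Big|\Big(x_{k_n}\,\Big|\,\sum_{i=1}^{N} (y|e_i)e_i\Big)\Big| \leq M\,\|y\| \qquad \forall N,$$
hence the representation theorem of Riesz yields the existence of $x_T \in H$ such that $T(y) = (y|x_T)$ $\forall y \in H$ and $\|x_T\| = \|T\| \leq M$. In particular $x_T = \sum_{i=1}^{\infty} \alpha_i e_i \in H$. We now prove that $\{x_{k_n}\}$ converges weakly to $x_T$. For that, set $z_n := x_{k_n} - x_T$ and let $y$ be any vector in $H$. For any fixed $\epsilon > 0$ choose $N$ sufficiently large so that for $y_N := \sum_{i=1}^{N} (y|e_i)e_i$ we have $\|y - y_N\| < \epsilon$. Then
$$|(z_n|y - y_N)| \leq \|z_n\|\,\|y - y_N\| \leq 2M\epsilon.$$
On the other hand, $(z_n|y_N) = \sum_{i=1}^{N} (y|e_i)(z_n|e_i) \to 0$ as $n \to \infty$, hence $|(z_n|y_N)| < \epsilon$ for $n$ larger than some $\overline{n} = \overline{n}(N, \epsilon)$. Thus
$$|(z_n|y)| \leq (2M + 1)\epsilon \qquad \forall n \geq \overline{n}.$$
□

11.8 Remark. The last part of the proof actually shows that in a separable Hilbert space, for a bounded sequence $\{x_n\}$, weak convergence $(x_n - x|y) \to 0$ $\forall y$ amounts to the convergence of the components $(x_n - x|e_i) \to 0$ $\forall i$ in an orthonormal basis.

11.9 ¶. Show that the compactness theorem, Theorem 10.52, holds in a generic Hilbert space which is not necessarily separable. [Hint: Apply Theorem 10.52 to the closure $H_0$ of the family of finite combinations of $\{x_n\}$, which is a separable Hilbert space. Then find $x \in H_0$ and a subsequence $\{x_{k_n}\}$ such that $(x_{k_n} - x|y) \to 0$ $\forall y \in H_0$. Then, use the orthogonal projection theorem onto $H_0$ to show that actually $(x_{k_n} - x|y) \to 0$ $\forall y \in H$.]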

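11.10 Theorem (Banach-Saks). Every bounded sequence $\{v_n\} \subset H$ weakly convergent to $v \in H$ has a subsequence $\{v_{k_n}\}$ such that
$$\frac{1}{n}\sum_{i=1}^{n} v_{k_i} \to v \qquad \text{in the norm of } H.$$

Proof. Set $u_n := v_n - v$. Then for a positive $M$ we have $\|u_n\| \leq M$ for all $n$, and, since $u_n \rightharpoonup 0$, we can extract from $\{u_n\}$ a subsequence $\{u_{k_n}\}$ in such a way that $u_{k_1} := u_1$,
$$(u_{k_2}|u_{k_1}) \leq 1, \qquad (u_{k_3}|u_{k_1}),\ (u_{k_3}|u_{k_2}) \leq \frac{1}{2}, \qquad \dots, \qquad (u_{k_{p+1}}|u_{k_i}) \leq \frac{1}{p} \quad \forall i = 1, \dots, p.$$
Therefore
$$\Big\|\frac{1}{n}\sum_{j=1}^{n} u_{k_j}\Big\|^2 = \frac{2}{n^2}\sum_{j=1}^{n}\sum_{i<j} (u_{k_i}|u_{k_j}) + \frac{1}{n^2}\sum_{j=1}^{n} \|u_{k_j}\|^2 \leq \frac{2}{n} + \frac{M^2}{n},$$
which tends to $0$ as $n \to \infty$. □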
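The expansion used at the end of the proof can be checked numerically on the simplest weakly null sequence. The sketch below is an illustration under our own assumptions: the vectors are the first $n$ standard basis vectors of $\mathbb{R}^n$ (standing in for an orthonormal sequence, for which all mixed products vanish), and the function names are hypothetical. It compares the squared norm of the arithmetic mean computed directly with the expanded double sum of the proof; both equal $1/n$.

```python
def inner(u, v):
    """Euclidean inner product (u|v)."""
    return sum(x * y for x, y in zip(u, v))


def mean_norm_sq_direct(vectors):
    """|| (1/n) * sum_j u_j ||^2 computed from the mean vector itself."""
    n = len(vectors)
    mean = [sum(col) / n for col in zip(*vectors)]
    return inner(mean, mean)


def mean_norm_sq_expanded(vectors):
    """(1/n^2) * (2 * sum_{i<j} (u_i|u_j) + sum_j ||u_j||^2), as in the proof."""
    n = len(vectors)
    cross = sum(inner(vectors[i], vectors[j]) for j in range(n) for i in range(j))
    squares = sum(inner(u, u) for u in vectors)
    return (2.0 * cross + squares) / n ** 2


def orthonormal_family(n):
    """The standard basis e_1, ..., e_n of R^n."""
    return [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
```

For an orthonormal family no extraction of a subsequence is needed, since the mixed products are already zero; in general the subsequence in the proof is chosen precisely to make those mixed terms small.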


Figure 11.2. Frontispieces of two classical monographs that, in particular, deal with semicontinuity on integral functionals.

b. Existence of minimizers of convex coercive functionals

Let $\mathcal{F} : H \to \mathbb{R}$ be a convex functional on a real Hilbert space $H$. This means that the function $\varphi(\lambda) := \mathcal{F}(\lambda u + (1 - \lambda)v)$ is convex in $[0,1]$ for all $u, v \in H$. A typical example of a convex functional is the quadratic functional
$$\mathcal{F}(u) = \frac{1}{2}\|u\|^2 - L(u)$$
where $L$ is a bounded linear form on $H$, that we have encountered in dealing with the abstract Dirichlet principle. Then we have, compare Proposition 5.61 of [GM1], the following.

11.11 Proposition (Jensen's inequality). A functional $\mathcal{F} : H \to \mathbb{R}$ is convex if and only if for every finite convex combination
$$\sum_{i=1}^{m} \alpha_i u_i, \qquad \sum_{i=1}^{m} \alpha_i = 1, \qquad \alpha_i \geq 0,$$
of points $u_i \in H$ we have
$$\mathcal{F}\Big(\sum_{i=1}^{m} \alpha_i u_i\Big) \leq \sum_{i=1}^{m} \alpha_i\,\mathcal{F}(u_i).$$

Proof. Clearly Jensen's inequality with two points amounts to convexity. So it suffices to prove it assuming $\mathcal{F}$ convex. We give a direct proof by induction on $m$. Assume the claim holds for $m - 1$ points. Set $\alpha := \alpha_1 + \dots + \alpha_{m-1}$, so that $\alpha_m = 1 - \alpha$. If $\alpha = 0$ or $\alpha = 1$ the claim is proved, otherwise $0 < \alpha < 1$ and, if
$$u := \frac{1}{\alpha}\sum_{i=1}^{m-1} \alpha_i u_i,$$
then
$$0 \leq \frac{\alpha_i}{\alpha} \leq 1, \qquad \sum_{i=1}^{m-1} \frac{\alpha_i}{\alpha} = 1, \qquad \sum_{i=1}^{m} \alpha_i u_i = \alpha u + (1 - \alpha)u_m,$$
hence
$$\mathcal{F}\Big(\sum_{i=1}^{m} \alpha_i u_i\Big) = \mathcal{F}(\alpha u + (1 - \alpha)u_m) \leq \alpha\,\mathcal{F}(u) + (1 - \alpha)\,\mathcal{F}(u_m) \leq \sum_{i=1}^{m} \alpha_i\,\mathcal{F}(u_i)$$
by the inductive assumption. □

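11.12 Theorem. Let $\mathcal{F} : H \to \mathbb{R}$ be a continuous, convex, bounded from below and coercive functional, meaning
$$\inf_{u \in H} \mathcal{F}(u) > -\infty, \qquad \mathcal{F}(u) \to +\infty \ \text{ as } \|u\| \to +\infty.$$
Then $\mathcal{F}$ has a minimizer in $H$.

Proof. Let $\{u_n\}$ be a minimizing sequence, $\mathcal{F}(u_n) \to \inf_{u \in H} \mathcal{F}(u)$. Since for large $n$
$$-\infty < \inf_{u \in H} \mathcal{F}(u) \leq \mathcal{F}(u_n) \leq \inf_{u \in H} \mathcal{F}(u) + 1,$$
the sequence $\{u_n\}$ is bounded by coercivity. Using the Banach-Saks theorem we find $u \in H$, and we can extract a subsequence $\{u_{k_n}\}$ of $\{u_n\}$ such that $u_{k_n} \rightharpoonup u$ and
$$v_n := \frac{1}{n}\sum_{i=1}^{n} u_{k_i} \to u \qquad \text{in the norm of } H.$$
Jensen's inequality yields
$$\mathcal{F}(v_n) = \mathcal{F}\Big(\frac{1}{n}\sum_{i=1}^{n} u_{k_i}\Big) \leq \frac{1}{n}\sum_{i=1}^{n} \mathcal{F}(u_{k_i}),$$
thus $\mathcal{F}(v_n) \to \inf_H \mathcal{F}$ since $\mathcal{F}(u_{k_i}) \to \inf_H \mathcal{F}$ as $i \to \infty$, i.e., $\{v_n\}$ is a minimizing sequence, too. Finally
$$\inf_H \mathcal{F} \leq \mathcal{F}(u) = \lim_{n \to \infty} \mathcal{F}(v_n) = \inf_H \mathcal{F}$$
on account of the continuity of $\mathcal{F}$. □

11.13 ¶. Show that Theorem 11.12 still holds if $\mathcal{F}$ is convex with values in $\mathbb{R}$, bounded from below and lower semicontinuous.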
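For the quadratic functional $\mathcal{F}(u) = \frac{1}{2}\|u\|^2 - L(u)$ mentioned above, the minimizer of Theorem 11.12 can be written down explicitly, which makes a small sanity check possible. The sketch below assumes, for illustration only, that $H = \mathbb{R}^3$ and that $L(u) = (f|u)$ for a hypothetical vector `f`; then the minimizer is $u = f$, and $\mathcal{F}(f + h) - \mathcal{F}(f) = \frac{1}{2}\|h\|^2$. The helper `convex_on_segment` tests the two-point convexity inequality on a grid of parameters.

```python
f = [1.0, -2.0, 0.5]  # hypothetical representer of the linear form L(u) = (f|u)


def inner(u, v):
    """Euclidean inner product (u|v)."""
    return sum(x * y for x, y in zip(u, v))


def F(u):
    """Quadratic functional F(u) = |u|^2 / 2 - (f|u)."""
    return 0.5 * inner(u, u) - inner(f, u)


def gap(h):
    """F(f + h) - F(f); algebraically this equals |h|^2 / 2."""
    shifted = [a + b for a, b in zip(f, h)]
    return F(shifted) - F(f)


def convex_on_segment(u, v, steps=10, tol=1e-12):
    """Check F(t u + (1 - t) v) <= t F(u) + (1 - t) F(v) for t on a grid."""
    for i in range(steps + 1):
        t = i / steps
        w = [t * a + (1.0 - t) * b for a, b in zip(u, v)]
        if F(w) > t * F(u) + (1.0 - t) * F(v) + tol:
            return False
    return True
```

The identity for `gap` shows directly that $f$ is the unique minimizer in this toy case; the abstract theorem is needed exactly when no such closed formula is available.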


11.2 A Theorem by Gelfand and Kolmogorov

In this section we shall prove that a topological space $X$ is identified by the space of continuous functions on it. If we think of $X$ as a geometric world and of a map from $X$ into $\mathbb{R}$ as an observable of $X$, we can say: if we know enough observables, say the continuous observables, then we know our world. Let us begin by proving the following.

11.14 Proposition. Every metric space $(X, d)$ can be isometrically embedded in $C^0(X)$.

Proof. Fix $p \in X$ and consider the map $\varphi : X \to C^0(X, \mathbb{R})$ that maps $a \in X$ into $f_a : X \to \mathbb{R}$ defined by
$$f_a(x) := d(x, a) - d(x, p).$$
Trivially, $f_a \in C^0(X, \mathbb{R})$, $f_a$ is bounded since $|f_a(x)| \leq d(a, p)$, and
$$|f_a(x) - f_b(x)| = |d(x, a) - d(x, b)| \leq d(a, b),$$
i.e., $\|f_a - f_b\|_{\infty} \leq d(a, b)$; on the other hand for $x = b$ we have $|f_a(b) - f_b(b)| = d(a, b)$, hence $\varphi$ is an isometry. □

11.15 ¶. Show that every separable metric space $(X, d)$ can be isometrically embedded in $\ell_{\infty}$. [Hint: Let $\{x_n\}$ be a dense sequence in $X$ and let $\varphi : X \to \ell_{\infty}$ be given by $\varphi(x)_n := d(x, x_n) - d(x_1, x_n)$. Show that $\varphi$ is an isometry.]
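The embedding of Proposition 11.14 is concrete enough to be tabulated. The sketch below is an illustration under assumptions of our own: $X$ is a hypothetical finite set of points in the plane with the Euclidean distance, so that a function on $X$ is just a list of values and the sup norm is a maximum. It builds $f_a(x) = d(x,a) - d(x,p)$ and checks that $\|f_a - f_b\|_\infty = d(a,b)$ for every pair.

```python
import math
from itertools import combinations

# Hypothetical finite sample of a metric space X (Euclidean plane).
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (-2.0, 5.0)]
p = points[0]  # the fixed base point of the proof


def d(x, y):
    """The metric of X; here the Euclidean distance."""
    return math.dist(x, y)


def f(a):
    """The function f_a(x) = d(x, a) - d(x, p), tabulated on the finite set X."""
    return [d(x, a) - d(x, p) for x in points]


def sup_distance(g, h):
    """Sup norm of g - h for functions tabulated on X."""
    return max(abs(s - t) for s, t in zip(g, h))


def is_isometric():
    """True if ||f_a - f_b||_inf equals d(a, b) for all pairs a, b in X."""
    return all(
        math.isclose(sup_distance(f(a), f(b)), d(a, b))
        for a, b in combinations(points, 2)
    )
```

The maximum in `sup_distance` is attained at $x = b$ (and at $x = a$), exactly as in the proof; the base point `p` only serves to make each $f_a$ bounded and plays no role in the distances.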

Let $X$ be a topological space, see Chapter 5. The set $C^0(X,\mathbb{R})$ is a linear space and actually a commutative algebra with identity, since the product of two continuous functions is continuous. Let $R$ and $R'$ be two commutative algebras. A map $\varphi : R \to R'$ is said to be a homomorphism from $R$ into $R'$ if $\varphi(a+b) = \varphi(a) + \varphi(b)$ and $\varphi(ab) = \varphi(a)\varphi(b)$. If, moreover, $\varphi$ is bijective we say that $R$ and $R'$ are isomorphic. Clearly $C^0(X,\mathbb{R})$ is completely determined by $X$, in the sense that every topological isomorphism $\varphi : X \to Y$ determines an isomorphism of the commutative algebras $C^0(Y,\mathbb{R})$ and $C^0(X,\mathbb{R})$, the isomorphism being the composition product $f \to f \circ \varphi$. If $X$ is compact, the converse also holds.

11.16 Theorem (Gelfand-Kolmogorov). Let $X$ be a compact topological space. Then $C^0(X,\mathbb{R})$ determines $X$.

We confine ourselves to outlining the proof of Theorem 11.16. An ideal $\mathcal{I}$ of the algebra $R$ is a subset of $R$ such that $a, b \in \mathcal{I} \Rightarrow a - b \in \mathcal{I}$ and $a \in \mathcal{I}, r \in R \Rightarrow a \cdot r \in \mathcal{I}$. $R$ is clearly the unique ideal that contains the identity of $R$. An ideal is called proper if $\mathcal{I} \neq R$ and maximal if it is not strictly contained in any proper ideal. Finally, we notice that $R/\mathcal{I}$ is a field if and only if $\mathcal{I}$ is maximal.

11.17 Lemma. Let $X$ be a compact topological space. $\mathcal{I}$ is a proper maximal ideal of $C^0(X)$ if and only if there is $x_0 \in X$ such that $\mathcal{I} = \{ f \in C^0(X) \mid f(x_0) = 0 \}$.

Proof. For any $f \in \mathcal{I}$, the set $f^{-1}(0)$ is closed and $f^{-1}(0) \neq \emptyset$. Otherwise $1/f$ belongs to $C^0(X)$, hence $1 \in \mathcal{I}$, and $\mathcal{I}$ is not proper. Let $f_1, \dots, f_n \in \mathcal{I}$. The function $f := \sum_{i=1}^n f_i^2$ is in $\mathcal{I}$ and $f^{-1}(0) = \bigcap_i f_i^{-1}(0) \neq \emptyset$. Since $X$ is compact, $\bigcap \{ f^{-1}(0) \mid f \in \mathcal{I} \} \neq \emptyset$. In particular, there is $x_0 \in X$ such that $f(x_0) = 0$ $\forall f \in \mathcal{I}$. On the other hand, $\{ f \mid f(x_0) = 0 \}$ is an ideal, hence $\mathcal{I} = \{ f \mid f(x_0) = 0 \}$. □

The spectrum of a commutative algebra with unity is then defined by
$$\operatorname{spec} R := \{ \mathcal{I} \mid \mathcal{I} \text{ maximal ideal of } R \}.$$
Trivially, if $R$ is isomorphic to $C^0(X,\mathbb{R})$, $R \simeq C^0(X,\mathbb{R})$, then also the maximal ideals of $R$ and $C^0(X,\mathbb{R})$ are in one-to-one correspondence, hence by Lemma 11.17
$$\operatorname{spec} R \simeq \operatorname{spec} C^0(X) \simeq X.$$
To conclude the proof of Theorem 11.16, we need to introduce a topology on the space $\operatorname{spec} C^0(X,\mathbb{R})$ in such a way that $\operatorname{spec} C^0(X,\mathbb{R}) \simeq X$ becomes a topological isomorphism. For that, we notice that, if $\mathcal{I}$ is a maximal ideal of $C^0(X,\mathbb{R})$, then $C^0(X,\mathbb{R})/\mathcal{I} \simeq \mathbb{R}$, hence the so-called evaluation maps $f(\mathcal{I})$, that map $(f,\mathcal{I})$ into $[f] \in C^0(X,\mathbb{R})/\mathcal{I} \simeq \mathbb{R}$, have a sign. Now, if we fix the topology on $\operatorname{spec} R \simeq \operatorname{spec} C^0(X,\mathbb{R})$ by choosing as a basis of neighborhoods the finite intersections of
$$U(f) := \{ \mathcal{I} \mid \mathcal{I} \text{ proper maximal ideal with } f(\mathcal{I}) > 0 \},$$
it is not difficult to show that the isomorphism $X \to \operatorname{spec} C^0(X,\mathbb{R})$ is continuous. Since $X$ is compact and the points in $\operatorname{spec} C^0(X,\mathbb{R})$ are separated by open neighborhoods, it follows that the isomorphism is actually a topological isomorphism.

Theorem 11.16 has a stronger formulation that we shall not deal with, but that we want to state. A Banach space with a product which makes it an algebra such that $\|xy\| \le \|x\| \, \|y\|$ is called a Banach algebra. An involution on a Banach algebra $R$ is an operation $x \to x^*$ such that $(x + y)^* = x^* + y^*$, $(\lambda x)^* = \bar{\lambda} x^*$, $(xy)^* = y^* x^*$ and $(x^*)^* = x$. A Banach algebra with an involution is called a C*-algebra. Examples of C*-algebras are:
(i) the space of complex-valued continuous functions from a topological space with involution $f \to \bar{f}$,
(ii) the space of linear bounded operators on a Hilbert space with the involution given by $A \to A^*$, $A^*$ being the adjoint of $A$.
Again, the space of proper maximal ideals of a commutative C*-algebra, endowed with a suitable topology, is called the spectrum of the algebra.

Theorem (Gelfand-Naimark). A commutative C*-algebra is isometrically isomorphic to the algebra of complex-valued continuous functions on its spectrum.

11.3 Ordinary Differential Equations

The Banach fixed point theorem in suitable spaces of continuous functions plays a key role in the study of existence, uniqueness and continuous dependence on the data of solutions of ordinary differential equations.


11.3.1 The Cauchy problem

Let $D$ be an open set in $\mathbb{R} \times \mathbb{R}^n$, $n \ge 1$, and $F(t,y) : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ be a continuous function. A solution of the system of ordinary equations
$$\frac{d}{dt} x(t) = F(t, x(t)) \tag{11.2}$$
is the data of an interval $]\alpha, \beta[ \subset \mathbb{R}$ and a function $x \in C^1(]\alpha,\beta[; \mathbb{R}^n)$ such that (11.2) holds for all $t \in ]\alpha,\beta[$. In particular, $(t, x(t))$ should belong to $D$ for all $t \in ]\alpha,\beta[$. Geometrically, if we interpret $F(t,x)$ as a vector field in $D$, then $x(t)$ is a solution of (11.2) if and only if its graph curve $t \to (t, x(t))$ is of class $C^1$, has trajectory in $D$, and velocity equal to $(1, F(t,x(t)))$ for all $t$. For this reason, solutions of (11.2) are called integral curves of the system.

a. Velocities of class $C^k(D)$

In the sequel, at times we need a fact that comes from the differential calculus for functions of several variables that we are not discussing in this volume. Let $\Omega \subset \mathbb{R}^n$ be an open set. We say that a function $f : \Omega \to \mathbb{R}$ is of class $C^k(\Omega)$, $k \ge 1$, if $f$ possesses continuous partial derivatives up to order $k$. One can prove that, if $f \in C^k(\Omega)$ and $\gamma : [a,b] \to \Omega$ is a $C^k$ curve in $\Omega$, then $f \circ \gamma : [a,b] \to \mathbb{R}$ is of class $C^k([a,b])$. For $k = 1$ we have the chain rule
$$\frac{d}{dt} f(\gamma(t)) = Df(\gamma(t)) \, \gamma'(t),$$
where $Df(x) := \big( \frac{\partial f}{\partial x^1}(x), \frac{\partial f}{\partial x^2}(x), \dots, \frac{\partial f}{\partial x^n}(x) \big)$ is the matrix of partial derivatives of $f$ and the product $Df(\gamma(t)) \gamma'(t)$ is the standard matrix product.

A trivial consequence is that integral curves, when they exist, possess one derivative more than the function velocity $F(t,x(t))$. This is true by definition if $F$ is merely continuous. If, moreover, $F(t,x) \in C^k$ and $x(t) \in C^1$, we successively find from the equation $x'(t) = F(t,x(t))$ that $x'(t) \in C^1$, $x'(t) \in C^2$, ..., $x'(t) \in C^k$. In particular, if $F(t,x)$ has continuous partial derivatives of any order, then the integral curves are $C^\infty$. It is worth noticing that if $F \in C^1(D)$, then by the chain rule
$$x''(t) = \frac{\partial F}{\partial t}(t, x(t)) + D_x F(t, x(t)) \, x'(t),$$
where $D_x F$ is the matrix of partial derivatives with respect to the $x$ variables and the product $D_x F(t,x(t)) \, x'(t)$ is understood as the matrix product. For the sequel, it is convenient to set


11.18 Definition. We say that a function $F(t,x) : [\alpha,\beta] \times B(x_0,b) \to \mathbb{R}^n$ is Lipschitz in $x$ uniformly with respect to $t$ if there exists $L > 0$ such that
$$|F(t,x) - F(t,y)| \le L |x - y| \qquad \forall (t,x), (t,y) \in [\alpha,\beta] \times B(x_0,b). \tag{11.3}$$
Let $D$ be an open set in $\mathbb{R} \times \mathbb{R}^n$. We say that a function $F(t,x) : D \to \mathbb{R}^n$ is locally Lipschitz in $x$ uniformly with respect to $t$ if for any $\widetilde{D} := [\alpha,\beta] \times B(x_0,b)$ strictly contained in $D$ there exists $L := L(\alpha,\beta,x_0,b)$ such that
$$|F(t,x) - F(t,y)| \le L |x - y| \qquad \forall (t,x), (t,y) \in \widetilde{D}.$$

11.19 ¶. Show that the function $f(t,x) = \operatorname{sgn}(t) |x|$, $(t,x) \in [-1,1] \times [-1,1]$, is Lipschitz in $x$ uniformly with respect to $t$.

11.20 ¶. Let $D = [a,b] \times [c,d]$ be a closed rectangle in $\mathbb{R} \times \mathbb{R}$. Show that, if for all $t \in [a,b]$ the function $x \to f(t,x)$ has derivative $f_x(t,x)$ on $[c,d]$ and $(t,x) \to f_x(t,x)$ is continuous in $D$, then $f$ is Lipschitz in $x$ uniformly with respect to $t$. [Hint: Use the mean value theorem.]

11.21 ¶. Show the following. Let $D$ be an open set of $\mathbb{R} \times \mathbb{R}^n$ and let $F(t,x) \in C^1(D)$. Then $F$ is locally Lipschitz in $x$ uniformly with respect to $t$. [Hint: For any $(t_0,x_0) \in D$, choose $a, b \in \mathbb{R}$ such that $\widetilde{D} := \{ (t,x) \mid |t - t_0| < a, |x - x_0| < b \}$ is strictly contained in $D$. Then, for $(t,x_1), (t,x_2) \in \widetilde{D}$, consider the curve $\gamma(s) := (t, (1-s)x_1 + s x_2)$, $s \in [0,1]$, whose image is in $\widetilde{D}$, and apply the mean value theorem to $F(\gamma(s))$, $s \in [0,1]$.]

b. Local existence and uniqueness

Assume $(t_0,x_0) \in D$. We seek a local solution $x(t) := (x_1(t), \dots, x_n(t)) \in C^1([t_0 - r, t_0 + r], \mathbb{R}^n)$ for some $r > 0$ of the Cauchy problem relative to the system (11.2), i.e.,
$$\begin{cases} x'(t) = F(t, x(t)), \\ x(t_0) = x_0. \end{cases} \tag{11.4}$$
We have the following.

11.22 Proposition. Let $D$ be an open set in $\mathbb{R} \times \mathbb{R}^n$, $n \ge 1$, and let $F(t,x) : D \to \mathbb{R}^n$ be a continuous function. Then $x(t) \in C^1([t_0 - r, t_0 + r], \mathbb{R}^n)$ solves (11.4) if and only if $x(t)$ belongs to $C^0([t_0 - r, t_0 + r], \mathbb{R}^n)$ and satisfies the integral equation
$$x(t) = x_0 + \int_{t_0}^{t} F(\tau, x(\tau)) \, d\tau \qquad \forall t \in [t_0 - r, t_0 + r]. \tag{11.5}$$


Proof. Set $I := [t_0 - r, t_0 + r]$. If $x \in C^1(I, \mathbb{R}^n)$ solves (11.4), then by integration $x$ satisfies (11.5). Conversely, if $x \in C^0(I, \mathbb{R}^n)$ and satisfies (11.5), then, by the fundamental theorem of calculus, $x(t)$ is differentiable and $x'(t) = F(t, x(t))$ in $I$; in particular it has a continuous derivative. Moreover, (11.5) for $t = t_0$ yields $x(t_0) = x_0$. □

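Let us start with a local existence and uniqueness result.

11.23 Theorem (Picard-Lindelöf). Let $F(t,x) : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ be a continuous function with domain $D := \{ (t,x) \in \mathbb{R} \times \mathbb{R}^n \mid |t - t_0| < a, |x - x_0| < b \}$. Suppose
(i) $F(t,x)$ is bounded in $D$, $|F(t,x)| \le M$,
(ii) $F(t,x)$ is Lipschitz in $x$ uniformly with respect to $t$,
$$|F(t,x) - F(t,y)| \le k |x - y| \qquad \forall (t,x), (t,y) \in D.$$
Then the Cauchy problem (11.4) has a unique solution in $[t_0 - r, t_0 + r]$ where
$$r < \min\Big( a, \frac{b}{M}, \frac{1}{k} \Big).$$

Proof. Let $r$ be as in the claim and $I_r := [t_0 - r, t_0 + r]$. According to Proposition 11.22, we have to prove that the equation
$$x(t) = x_0 + \int_{t_0}^{t} F(\tau, x(\tau)) \, d\tau$$
has a unique solution $x(t) \in C^0(I_r, \mathbb{R}^n)$. Let $y_1, y_2 \in C^0(I_r, \mathbb{R}^n)$ be two solutions of (11.5). Then for all $t \in I_r$
$$|y_1(t) - y_2(t)| \le \Big| \int_{t_0}^{t} |F(s, y_1(s)) - F(s, y_2(s))| \, ds \Big| \le k \, |t - t_0| \, \|y_1 - y_2\|_{\infty, I_r},$$
hence
$$\|y_1 - y_2\|_{\infty, I_r} \le k r \, \|y_1 - y_2\|_{\infty, I_r}.$$
Since $kr < 1$, then $y_1 = y_2$ in $I_r$.

To show existence, we show that the map $x \to T[x]$ given by
$$T[x](t) := x_0 + \int_{t_0}^{t} F(\tau, x(\tau)) \, d\tau$$
is a contraction on
$$X := \{ x \in C^0(I_r, \mathbb{R}^n) \mid x(t_0) = x_0, \ |x(t) - x_0| \le b \ \forall t \in I_r \}$$
that is closed in $C^0(I_r, \mathbb{R}^n)$, hence a complete metric space. Clearly $t \to T[x](t)$ is a continuous function in $I_r$, $T[x](t_0) = x_0$ and
$$|T[x](t) - x_0| \le \Big| \int_{t_0}^{t} |F(\tau, x(\tau))| \, d\tau \Big| \le M |t - t_0| \le M r < b,$$
so that $T$ maps $X$ into itself; moreover
$$|T[x](t) - T[y](t)| \le \Big| \int_{t_0}^{t} |F(\tau, x(\tau)) - F(\tau, y(\tau))| \, d\tau \Big| \le k \, |t - t_0| \, \|x - y\|_{\infty} \le k r \, \|x - y\|_{\infty, I_r}.$$
The fixed point theorem of Banach, Theorem 9.128, yields a (actually, a unique) fixed point $T[x] = x$ in $X$. In other words, the equation (11.5) has a unique solution. □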


Taking into account the proof of the fixed point theorem we see that the solution $x(t)$ of (11.4) is the uniform limit of Picard's successive approximations
$$x_0(t) := x_0, \qquad \text{and, for } n \ge 1, \qquad x_n(t) := x_0 + \int_{t_0}^{t} F(\tau, x_{n-1}(\tau)) \, d\tau.$$
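Picard's successive approximations can be imitated numerically. The sketch below is an illustration under stated assumptions, not the construction used in the text: the integral is replaced by the trapezoidal rule on a uniform grid, only the forward half $[t_0, t_0 + r]$ of the interval is treated, and the function name `picard_iterates` and the test problem $x' = x$, $x(0) = 1$ are chosen here for the example.

```python
import math

def picard_iterates(F, t0, x0, r, n_iter, n_grid=1000):
    """Approximate the n_iter-th Picard iterate on [t0, t0 + r] (scalar case)."""
    h = r / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    x = [x0] * (n_grid + 1)                      # x_0(t) := x0
    for _ in range(n_iter):
        g = [F(t, xi) for t, xi in zip(ts, x)]   # integrand F(tau, x_{n-1}(tau))
        new = [x0]
        for i in range(n_grid):
            # cumulative trapezoidal rule for the integral from t0 to ts[i + 1]
            new.append(new[-1] + 0.5 * h * (g[i] + g[i + 1]))
        x = new
    return ts, x

# Assumed test problem: x' = x, x(0) = 1, whose solution is exp(t).
ts, x = picard_iterates(lambda t, x: x, 0.0, 1.0, 0.5, n_iter=20)
error = abs(x[-1] - math.exp(0.5))
print(error)
```

For this test problem the exact iterates are the Taylor polynomials of $e^t$, so the remaining error comes essentially from the quadrature.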

The Picard-Lindelöf theorem allows us to discuss the uniqueness for the initial value problem (11.4).

11.24 Theorem (Uniqueness). Let $D \subset \mathbb{R} \times \mathbb{R}^n$ be a bounded domain, let $F(t,x) : D \to \mathbb{R}^n$ be a continuous function that is also locally Lipschitz in $x$ uniformly in $t$, and let $(t_0,x_0) \in D$. Then any two solutions $x_1 : I \to \mathbb{R}^n$, $x_2 : J \to \mathbb{R}^n$ defined, respectively, on open intervals $I$ and $J$ containing $t_0$ of the initial value problem
$$\begin{cases} x'(t) = F(t, x(t)), \\ x(t_0) = x_0, \end{cases}$$
are equal on $I \cap J$.

Proof. It is enough to assume $I \subset J$. Define
$$E := \{ t \in I \mid x_1(t) = x_2(t) \}.$$
Obviously $t_0 \in E$ and $E$ is closed relatively to $I$, as $x_1, x_2$ are continuous. We now prove that $E$ is open in $I$, concluding $E = I$ since $I$ is an interval, compare Chapter 5. Let $t^* \in E$, define $x^* := x_1(t^*) = x_2(t^*)$. Let $a, b \in \mathbb{R}_+$ be such that $\widetilde{D} := \{ (t,x) \in D \mid |t - t^*| < a, |x - x^*| < b \}$ is strictly contained in $D$. $F$ being bounded and locally Lipschitz in $x$ uniformly with respect to $t$ in $\widetilde{D}$, the Picard-Lindelöf theorem applies on $\widetilde{D}$. Since $x_1(t)$ and $x_2(t)$ both solve the initial value problem starting at $(t^*, x^*)$, we conclude that $x_1(t) = x_2(t)$ on a small interval around $t^*$. Thus $E$ is open. □

c. Continuation of solutions

We have seen that the initial value problem has a solution that exists on a possibly small interval. Does a solution in a larger interval exist? As we have seen, given two solutions $x_1 : I \to \mathbb{R}^n$, $x_2 : J \to \mathbb{R}^n$ of the same initial value problem, one can glue them together to form a new function $x : I \cup J \to \mathbb{R}^n$, that is again a solution of the same initial value problem but defined on a possibly larger interval. We say that $x$ is a continuation of both $x_1$ and $x_2$. Therefore, Theorem 11.24 allows us to define the maximal solution, or simply the solution, as the solution defined on the largest possible interval.

11.25 Lemma. Suppose that $F : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ is continuous in $D$, and let $x(t)$ be a solution of the initial value problem
$$\begin{cases} x'(t) = F(t, x(t)), \\ x(t_0) = x_0 \end{cases}$$
in the bounded interval $\gamma < t < \delta$; in particular $(t, x(t)) \in D$ $\forall t \in ]\gamma, \delta[$. If $F$ is bounded near $(\delta, x(\delta))$, then $x(t)$ can be continuously extended on $\delta$. Moreover, if $(\delta, x(\delta)) \in D$, then the extension is $C^1$ up to $\delta$. A similar result holds also at $(\gamma, x(\gamma))$.

Proof. Suppose that $|F(t,x)| \le M$ $\forall (t,x)$ and let $x(t)$, $t \in ]\gamma, \delta[$, be a solution. For $t_1, t_2 \in ]\gamma, \delta[$ we have
$$|x(t_2) - x(t_1)| \le \Big| \int_{t_1}^{t_2} |F(t, x(t))| \, dt \Big| \le M |t_1 - t_2|,$$
i.e., $x$ is Lipschitz on $]\gamma, \delta[$, therefore it can be continuously extended to $[\gamma, \delta]$. The second part of the claim follows by using (11.5) to get for $t < \delta$
$$\frac{x(t) - x(\delta)}{t - \delta} = \frac{1}{t - \delta} \int_{\delta}^{t} F(s, x(s)) \, ds$$
and letting $t \to \delta^-$. □

Now if, for instance, $(\delta, x(\delta))$ is not on the boundary of $D$ and we can solve the initial value problem with initial datum $x(\delta)$ at $t_0 = \delta$, we can continue the solution in the $C^1$ sense, because of Proposition 11.22, beyond the time $\delta$, thus concluding the following.

11.26 Theorem (Continuation of solutions). Let $F(t,x)$ be continuous in an open set $D \subset \mathbb{R} \times \mathbb{R}^n$ and locally Lipschitz in $x$ uniformly with respect to $t$. Then the unique (maximal) solution of $x'(t) = F(t, x(t))$ with $x(t_0) = x_0$ extends forwards and backwards till the closure of its graph eventually meets the boundary of $D$. More precisely, any (maximal) solution $x(t)$ is defined on an interval $]\alpha, \beta[$ with the following property: for any given compact set $K \subset D$, there is $\delta = \delta(K) > 0$ such that
$$(t, x(t)) \notin K \qquad \text{for } t \notin [\alpha + \delta, \beta - \delta].$$

Recalling Exercise 11.21, we get the following.

11.27 Corollary. Let $D$ be an open domain in $\mathbb{R} \times \mathbb{R}^n$, and let $F \in C^1(D)$. Then every (maximal) solution of $x'(t) = F(t, x(t))$ can be extended forwards and backwards till the closure of its graph eventually reaches $\partial D$.

11.28 Corollary. Let $D := ]a,b[ \times \mathbb{R}^n$ ($a$ and $b$ may be, respectively, $-\infty$ and $+\infty$) and let $F(t,x) : D \to \mathbb{R}^n$ be continuous and locally Lipschitz in $x$ uniformly with respect to $t$. Then every locally bounded (maximal) solution of $x' = F(t,x)$ is defined on the entire interval $]a,b[$.

Proof. Let $|x(t)| \le M$. Should the maximal solution be defined on $]\alpha, \beta[$ with, say, $\beta < b$, then the graph of $x$ would be contained in the compact set $[\alpha, \beta] \times \overline{B(0,M)}$ strictly contained in $]a,b[ \times \mathbb{R}^n$. This contradicts Theorem 11.26. □


Of course, if $F$ is bounded in $D := ]a,b[ \times \mathbb{R}^n$, all solutions of $x' = F(t,x)$ are automatically locally bounded since their velocities are bounded, so the previous corollary applies. For a weaker condition and stronger result, see Exercise 11.33.

11.29 Example. Consider the initial value problem $x' = x^2$, $x(0) = 1$, in $D_a := \{ (t,x) \mid t \in \mathbb{R}, |x| < a \}$. Since $|F| \le a^2$ in $D_a$, the continuation theorem applies. In fact, the maximal solution $1/(1-t)$, $t \in ]-\infty, 1 - \frac{1}{a}[$, has a graph that extends backwards till $-\infty$ and forward until it touches $\partial D_a$.
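The claims of Example 11.29 are easy to confirm numerically. The following sketch makes two checks under illustrative assumptions (the value $a = 4$, the sample time $t = 0.3$ and the function names are chosen here): that $1/(1-t)$ satisfies $x' = x^2$, using a central difference, and that it reaches the level $a$ exactly at $t = 1 - 1/a$.

```python
def x_exact(t):
    """Maximal solution of x' = x^2, x(0) = 1."""
    return 1.0 / (1.0 - t)

def exit_time(a):
    """Time at which the graph of x_exact meets the boundary |x| = a."""
    return 1.0 - 1.0 / a

a = 4.0                      # assumed sample value of the parameter a
t_exit = exit_time(a)        # 1 - 1/4

# Central-difference check of the differential equation at a sample time.
h, t = 1e-6, 0.3
derivative = (x_exact(t + h) - x_exact(t - h)) / (2 * h)
residual = abs(derivative - x_exact(t) ** 2)
print(t_exit, x_exact(t_exit), residual)
```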

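d. Systems of higher order equations

We notice that a differential equation of order $n$ in normal form in the scalar unknown $x(t)$
$$\frac{d^n}{dt^n} x(t) = f\Big( t, x(t), \frac{d}{dt} x(t), \dots, \frac{d^{n-1}}{dt^{n-1}} x(t) \Big) \tag{11.6}$$
can be written, by defining
$$x_1(t) := x(t), \qquad x_2(t) := \frac{d}{dt} x_1(t), \qquad \dots, \qquad x_n(t) := \frac{d}{dt} x_{n-1}(t),$$
as the first order system
$$\begin{cases} x_1'(t) = x_2(t), \\ x_2'(t) = x_3(t), \\ \quad \vdots \\ x_n'(t) = f(t, x_1(t), x_2(t), \dots, x_n(t)) \end{cases}$$
or, compactly, as
$$y'(t) = F(t, y(t))$$
for the vector-valued unknown $y(t) := (x_1(t), x_2(t), \dots, x_n(t))$ and $F : D \subset \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ given by
$$F(t, x_1, \dots, x_n) := (x_2, x_3, \dots, x_n, f(t, x_1, x_2, \dots, x_n)).$$
Consequently, the Cauchy problem for (11.6) is
$$\begin{cases} x^{(n)}(t) = f(t, x(t), x'(t), x''(t), \dots, x^{(n-1)}(t)), \\ x(t_0) = x_0, \\ x'(t_0) = x_1, \\ x''(t_0) = x_2, \\ \quad \vdots \\ x^{(n-1)}(t_0) = x_{n-1}. \end{cases} \tag{11.7}$$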
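The reduction of an equation of order $n$ to a first order system is mechanical, and a short sketch may make it concrete. The code below is an illustration only: the helper names `as_first_order` and `euler`, the use of the explicit Euler scheme, and the test equation $x'' = -x$ are assumptions made for this example and are not taken from the text.

```python
import math

def as_first_order(f, n):
    """Build F(t, y) = (y_2, ..., y_n, f(t, y_1, ..., y_n)) from the scalar f."""
    def F(t, y):
        if len(y) != n:
            raise ValueError("state vector must have length n")
        return list(y[1:]) + [f(t, *y)]
    return F

def euler(F, t0, y0, t1, steps):
    """Explicit Euler scheme for y' = F(t, y) on [t0, t1] (illustrative choice)."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        dy = F(t, y)
        y = [yi + h * di for yi, di in zip(y, dy)]
        t += h
    return y

# Assumed test problem: x'' = -x, x(0) = 0, x'(0) = 1, with solution sin(t).
F = as_first_order(lambda t, x, v: -x, 2)
y_end = euler(F, 0.0, [0.0, 1.0], 1.0, 100000)
print(y_end[0], math.sin(1.0))
```

The initial data of (11.7) become the single vector $y(t_0) = (x_0, x_1, \dots, x_{n-1})$, which is what `euler` receives as `y0`.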


Along the same line, the initial value problem for a system of higher order equations can be reformulated as a Cauchy problem for a system of first order equations, to which we can apply the theory just developed.

e. Linear systems

For linear systems
$$x'(t) = A(t) x(t) + g(t), \tag{11.8}$$
where $A(t)$ is an $n \times n$ matrix and $g(t) \in \mathbb{R}^n$, we have the following.

11.30 Theorem. Suppose that $A(t)$ and $g(t)$ are continuous in $[a,b]$ and that $t_0 \in [a,b]$ and $x_0 \in \mathbb{R}^n$. Then the solution of (11.8) with initial value $x(t_0) = x_0$ exists on the entire interval.

Proof. Assume for simplicity that $t_0 \in ]a,b[$. The field $F(t,x) := A(t)x + g(t)$ is continuous in $D := ]a,b[ \times \mathbb{R}^n$ and locally Lipschitz in $x$ uniformly with respect to $t$,
$$|F(t,x) - F(t,y)| \le \sup_{t \in [\alpha,\beta]} \|A(t)\| \, |x - y| \qquad \forall a < \alpha < \beta < b, \ \forall x, y \in \mathbb{R}^n.$$
Therefore, a solution of (11.8) exists in a small interval of time around $t_0$, according to the Picard-Lindelöf theorem. To show that the solution can be continued on the whole interval $]a,b[$, it suffices to show, according to Corollary 11.28, that $x(t)$ is bounded. In fact, we have
$$x(t) - x(t_0) = \int_{t_0}^{t} A(s) x(s) \, ds + \int_{t_0}^{t} g(s) \, ds.$$
For $t > t_0$ we then conclude that
$$|x(t)| \le |x_0| + \max |g| \, (b - a) + \sup_{t \in [a,b]} \|A(t)\| \int_{t_0}^{t} |x(s)| \, ds,$$
and the boundedness follows from Gronwall's inequality below. □

11.31 Proposition (Gronwall's inequality). Suppose that $k$ is a nonnegative constant and that $f$ and $g$ are two nonnegative continuous functions in $[\alpha,\beta]$ such that
$$f(t) \le k + \int_{\alpha}^{t} f(s) g(s) \, ds, \qquad t \in [\alpha,\beta].$$
Then
$$f(t) \le k \exp\Big( \int_{\alpha}^{t} g(s) \, ds \Big).$$

Proof. Set $U(t) := k + \int_{\alpha}^{t} f(s) g(s) \, ds$. Then we have
$$f(t) \le U(t), \qquad U'(t) = f(t) g(t) \le g(t) U(t), \qquad U(\alpha) = k,$$
in particular
$$\frac{d}{dt} \Big[ U(t) \exp\Big( -\int_{\alpha}^{t} g(s) \, ds \Big) \Big] = \big( U'(t) - g(t) U(t) \big) \exp\Big( -\int_{\alpha}^{t} g(s) \, ds \Big) \le 0,$$
hence
$$U(t) \exp\Big( -\int_{\alpha}^{t} g(s) \, ds \Big) - U(\alpha) \le 0,$$
that is, $f(t) \le U(t) \le k \exp\big( \int_{\alpha}^{t} g(s) \, ds \big)$. □

Figure 11.3. Thomas Gronwall (1877-1932) and a page from one of his papers.

11.32 ¶. Let $w : [a,b] \to \mathbb{R}^n$ be of class $C^1([a,b])$. Assume that
$$|w'(t)| \le a(t) |w(t)| + b(t) \qquad \forall t \in [a,b]$$
where $a(t)$, $b(t)$ are nonnegative functions of class $C^0([a,b])$. Show that
$$|w(t)| \le \Big( |w(t_0)| + \Big| \int_{t_0}^{t} b(s) \, ds \Big| \Big) \exp\Big( \Big| \int_{t_0}^{t} a(s) \, ds \Big| \Big)$$
for every $t, t_0 \in [a,b]$. [Hint: Apply Gronwall's lemma to $f(t) := |w(t)|$.]

11.33 ¶. Let $F(t,x) : I \times \mathbb{R}^n \to \mathbb{R}^n$ be continuous and locally Lipschitz in $x$ uniformly with respect to $t$. Suppose that there exist nonnegative continuous functions $a(t)$ and $b(t)$ such that $|F(t,x)| \le a(t)|x| + b(t)$. Show that all the solutions of $x' = F(t,x)$ can be extended to the entire interval $I$.

f. A direct approach to Cauchy problem for linear systems

For the reader's convenience we shall give here a more direct approach to the uniqueness and existence of the solution of the initial value problem
$$\begin{cases} X'(t) = A(t) X(t) + F(t) \quad \forall t \in [a,b], \\ X(t_0) = X_0, \end{cases} \qquad t_0 \in [a,b], \tag{11.9}$$

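where $X_0 \in \mathbb{C}^n$ and the functions $t \to A(t)$ and $t \to F(t)$ are given continuous functions defined in $[a,b]$ with values, respectively, in $M_{n,n}(\mathbb{C})$ and $\mathbb{C}^n$. Recall that $\|A(t)\| := \sup_{|x| = 1} |A(t)x|$ denotes the norm of the matrix $A(t)$ and set
$$M := \sup_{t \in [a,b]} \|A(t)\|.$$
As we have seen, see Proposition 11.22, $X(t)$, $t \in [a,b]$, solves (11.9) if and only if $t \to X(t)$ is of class $C^0([a,b])$ and solves the integral equation
$$X(t) = X_0 + \int_{t_0}^{t} \big( A(s) X(s) + F(s) \big) \, ds, \tag{11.10}$$
that is, iff $X(t)$ is a fixed point for the map
$$T : X(t) \to T(X)(t) := X_0 + \int_{t_0}^{t} \big( A(s) X(s) + F(s) \big) \, ds. \tag{11.11}$$
Let $\gamma > 0$. The function on $C^0([a,b], \mathbb{C}^n)$ defined by
$$\|X\|_\gamma := \sup_{t \in [a,b]} \big( |X(t)| \, e^{-\gamma |t - t_0|} \big)$$
is trivially a norm on $C^0([a,b])$. Moreover, it is equivalent to the uniform norm on $C^0([a,b], \mathbb{C}^n)$ since
$$e^{-\gamma |b - a|} \, \|X\|_{\infty,[a,b]} \le \|X\|_\gamma \le \|X\|_{\infty,[a,b]}.$$

11.34 Proposition. Let $C_\gamma$ denote $C^0([a,b], \mathbb{C}^n)$ endowed with the norm $\|\cdot\|_\gamma$; it is a Banach space for every $\gamma > 0$. Moreover, $T$ is a contraction map on $C_\gamma$ if $\gamma > M := \sup_{t \in [a,b]} \|A(t)\|$.

Proof. In fact, $\forall X, Y \in C_\gamma$ and $t \in [a,b]$, we have
$$|TX(t) - TY(t)| = \Big| \int_{t_0}^{t} A(s) (X(s) - Y(s)) \, ds \Big| \le M \, \|X - Y\|_\gamma \Big| \int_{t_0}^{t} e^{\gamma |s - t_0|} \, ds \Big| \le \frac{M}{\gamma} \, \|X - Y\|_\gamma \, e^{\gamma |t - t_0|}.$$
Multiplying the last inequality by $e^{-\gamma |t - t_0|}$ and taking the sup norm gives
$$\|TX - TY\|_\gamma \le \frac{M}{\gamma} \, \|X - Y\|_\gamma.$$
□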


11.35 Theorem. The initial value problem (11.9) has a unique solution $X(t)$ of class $C^1([a,b])$, and
$$|X(t)| \le \Big( |X_0| + \Big| \int_{t_0}^{t} |F(s)| \, ds \Big| \Big) \exp\Big( \Big| \int_{t_0}^{t} \|A(s)\| \, ds \Big| \Big).$$
Moreover, $X(t)$ is the uniform limit in $C^0([a,b], \mathbb{C}^n)$ of the sequence $\{X_n(t)\}$ of functions defined inductively by
$$X_0(t) := X_0, \qquad X_{n+1}(t) := X_0 + \int_{t_0}^{t} \big( A(s) X_n(s) + F(s) \big) \, ds. \tag{11.12}$$

Proof. Choose $\gamma > M$. Then $T : C_\gamma \to C_\gamma$ is a contraction map. Therefore, by the Banach fixed point theorem, $T$ has a unique fixed point. Going into its proof, we get the approximations. Finally, the estimate on $|X(t)|$ follows from (11.10) and the Gronwall Lemma. □

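11.36 Remark. In the special case $a = -\infty$, $b = +\infty$, $t_0 = 0$, $F(t) = 0$ $\forall t$ and $A(t) = A$ constant, (11.12) reduces to
$$X_n(t) = \Big( \sum_{k=0}^{n} \frac{A^k t^k}{k!} \Big) X_0,$$
hence the solution of the initial value problem for the homogeneous linear system with constant coefficients
$$\begin{cases} X'(t) = A X(t), \\ X(0) = X_0 \end{cases}$$
is
$$X(t) = \Big( \sum_{n=0}^{\infty} \frac{A^n t^n}{n!} \Big) X_0 =: \exp(tA) X_0 \qquad \forall t \in \mathbb{R},$$
the series converging uniformly on bounded sets of $\mathbb{R}$, and
$$|X(t)| \le |X_0| \exp\big( |t - t_0| \, \|A\| \big) \qquad \forall t \in \mathbb{R}.$$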
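The partial sums of Remark 11.36 can be computed directly. The sketch below is a plain illustration under stated assumptions: the function names, the $2 \times 2$ rotation generator $A$ and the number of terms are chosen here for the example. It accumulates $X_n(t) = \sum_{k \le n} \frac{t^k A^k}{k!} X_0$ term by term, so no matrix power is formed explicitly.

```python
import math

def mat_vec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def picard_linear(A, X0, t, n):
    """Partial sum X_n(t) = sum_{k=0}^{n} (t^k A^k / k!) X0."""
    term = list(X0)      # k = 0 term
    total = list(X0)
    for k in range(1, n + 1):
        # next term = (t A / k) applied to the previous term
        term = [t / k * c for c in mat_vec(A, term)]
        total = [s + c for s, c in zip(total, term)]
    return total

# Assumed test system: X' = A X with A the generator of rotations,
# X(0) = (1, 0), whose solution is (cos t, -sin t).
A = [[0.0, 1.0], [-1.0, 0.0]]
X = picard_linear(A, [1.0, 0.0], 1.0, 25)
print(X, (math.cos(1.0), -math.sin(1.0)))
```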

g. Continuous dependence on data

We now show that the local solution $x(t; t_0, x_0)$ of the Cauchy problem
$$\begin{cases} x' = F(t,x), \\ x(t_0) = x_0 \end{cases}$$
depends continuously on the initial point $(t_0, x_0)$, and in fact is continuous in $(t, t_0, x_0)$.

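11.37 Theorem. Let $F(t,x)$ and $F_x(t,x)$ be bounded and continuous in a region $D$. Also suppose that in $D$ we have
$$|F(t,x)| \le M, \qquad |F_x(t,x)| \le k.$$
Then, for any $\epsilon > 0$ there exists $\delta > 0$ such that
$$|x(t; t_0, x_0) - x(\bar{t}; \bar{t}_0, \bar{x}_0)| < \epsilon$$
provided $|t - \bar{t}| < \delta$, $|t_0 - \bar{t}_0| < \delta$ and $|x_0 - \bar{x}_0| < \delta$, and $t, \bar{t}$ are in a common interval of existence $[\alpha, \beta]$.

Proof. Set $\phi(t) := x(t; t_0, x_0)$, $\psi(t) := x(t; \bar{t}_0, \bar{x}_0)$. From
$$\phi(t) = x_0 + \int_{t_0}^{t} F(s, \phi(s)) \, ds, \qquad \psi(t) = \bar{x}_0 + \int_{\bar{t}_0}^{t} F(s, \psi(s)) \, ds,$$
$$\int_{t_0}^{t} F(s, \phi(s)) \, ds = \int_{\bar{t}_0}^{t} F(s, \phi(s)) \, ds + \int_{t_0}^{\bar{t}_0} F(s, \phi(s)) \, ds$$
we infer
$$\phi(t) - \psi(t) = x_0 - \bar{x}_0 + \int_{\bar{t}_0}^{t} \big[ F(s, \phi(s)) - F(s, \psi(s)) \big] \, ds + \int_{t_0}^{\bar{t}_0} F(s, \phi(s)) \, ds,$$
$$|\phi(t) - \psi(t)| \le |x_0 - \bar{x}_0| + k \Big| \int_{\bar{t}_0}^{t} |\phi(s) - \psi(s)| \, ds \Big| + M |t_0 - \bar{t}_0| \le \delta + k \Big| \int_{\bar{t}_0}^{t} |\phi(s) - \psi(s)| \, ds \Big| + M \delta.$$
Gronwall's inequality then yields
$$|\phi(t) - \psi(t)| \le \delta (1 + M) \exp\big( k |t - \bar{t}_0| \big) \le \delta (1 + M) \exp\big( k (\beta - \alpha) \big).$$
Since
$$|\psi(t) - \psi(\bar{t})| \le \Big| \int_{\bar{t}}^{t} |F(s, \psi(s))| \, ds \Big| \le M |t - \bar{t}| \le M \delta,$$
we conclude
$$|\phi(t) - \psi(\bar{t})| \le |\phi(t) - \psi(t)| + |\psi(t) - \psi(\bar{t})| \le \delta (1 + M) \exp\big( k (\beta - \alpha) \big) + \delta M$$
if $|t - \bar{t}| < \delta$. □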

11.38 ¶. Let $F(t,x)$ and $G(t,x)$ be as in Theorem 11.37, and let $\phi(t)$ and $\psi(t)$ be, respectively, solutions of the Cauchy problems
$$\begin{cases} x' = F(t,x), \\ x(t_0) = x_0 \end{cases} \qquad \text{and} \qquad \begin{cases} x' = G(t,x), \\ x(t_0) = \bar{x}_0. \end{cases}$$
Show that
$$|\phi(t) - \psi(t)| \le \big( |x_0 - \bar{x}_0| + \epsilon (\beta - \alpha) \big) \exp\big( k (t - t_0) \big)$$
if $|F(t,x) - G(t,x)| < \epsilon$.

h. The Peano theorem

We shall now prove existence for the Cauchy problem (11.4) assuming only continuity on the velocity field $F(t,x)$. As we know, in this case we cannot have uniqueness, see Example 6.16 of [GM1].

11.39 Theorem (Peano). Let $F(t,x)$ be a bounded continuous function in a domain $D$, and let $(t_0,x_0)$ be a point in $D$. Then there exists at least one solution of
$$\begin{cases} x' = F(t,x), \\ x(t_0) = x_0. \end{cases}$$

Proof. Let $|F(t,x)| \le M$ and let $\widetilde{D} := \{ (t,x) \in \mathbb{R} \times \mathbb{R}^n \mid |t - t_0| \le a, |x - x_0| \le b \}$ be strictly contained in $D$. If $r < \min\{a, b/M\}$ we have seen that
$$T[x](t) := x_0 + \int_{t_0}^{t} F(\tau, x(\tau)) \, d\tau$$
maps the closed and convex set
$$X := \{ x \in C^0([t_0 - r, t_0 + r], \mathbb{R}^n) \mid x(t_0) = x_0, \ |x(t) - x_0| \le b \}$$
in itself, see Theorem 11.23. The operator $T$ is continuous; in fact, since $F$ is uniformly continuous in $\widetilde{D}$, $\forall \epsilon > 0$ $\exists \eta$ such that
$$|F(t,x) - F(t,x')| < \epsilon \qquad \forall t \in [a,b] \quad \text{if } |x - x'| < \eta,$$
hence
$$|F(t, x_n(t)) - F(t, x_\infty(t))| < \epsilon \qquad \forall t \in [a,b]$$
for large enough $n$ if $x_n(t) \to x_\infty(t)$ uniformly. Then we have
$$\|T[x_n] - T[x_\infty]\|_\infty \le \Big| \int_{t_0}^{t} |F(t, x_n(t)) - F(t, x_\infty(t))| \, dt \Big| \le \epsilon (b - a).$$
Moreover
$$|T[x](t') - T[x](t)| = \Big| \int_{t}^{t'} F(\tau, x(\tau)) \, d\tau \Big| \le M |t - t'|,$$
and we conclude by the Ascoli-Arzelà theorem that $T : X \to X$ is compact. The Caccioppoli-Schauder theorem yields the existence of at least one fixed point $x(t)$, $x(t) = T[x](t)$; this concludes the proof. □

Notice that the solutions can be continued, cf. Lemma 11.25, possibly in a nonunique way. Therefore any solution can be continued as a solution forwards and backwards in time till the closure of the graph of the extension eventually meets the boundary of the domain $D$.

11.40 ¶ Comparison principle. Let $f : [a,b] \times \mathbb{R} \to \mathbb{R}$ be a function that is Lipschitz on each rectangle $[a,b] \times [-A,A]$ and let $\alpha(t), \beta(t)$ be two functions such that
$$\alpha(t) \le \beta(t), \qquad \alpha'(t) \le f(t, \alpha(t)), \qquad \beta'(t) \ge f(t, \beta(t)) \qquad \forall t \in [a,b].$$
Show that every solution of
$$\begin{cases} x' = f(t,x), \\ x(a) = x_0, \end{cases} \qquad \alpha(a) \le x_0 \le \beta(a),$$
satisfies $\alpha(t) \le x(t) \le \beta(t)$ $\forall t \in [a,b]$. In particular, there is a solution that is defined on the entire interval.

11.41 ¶ Peano's phenomenon. Consider the Cauchy problem
$$x'(t) = f(t, x(t)), \qquad x(t_0) = x_0 \qquad \text{in } [a,b], \tag{11.13}$$
where $f(t,x)$ is a continuous function. Show that
(i) there exist a minimal and a maximal solution, i.e., solutions $\underline{x}(t)$ and $\overline{x}(t)$ of (11.13) such that for any other solution $x(t)$ of (11.13) we have $\underline{x}(t) \le x(t) \le \overline{x}(t)$,
(ii) if the minimal and the maximal solutions of (11.13) exist in $[t_0, t_0 + \delta]$, then through every point $(\bar{t}, \bar{x})$ with $\bar{t} \in [t_0, t_0 + \delta]$ and $\bar{x} \in [\underline{x}(\bar{t}), \overline{x}(\bar{t})]$ there passes a solution of (11.13).
[Hint: To show existence of a maximal solution, show that, if $x_n(t)$ solves $x' = f(t,x) + \frac{1}{n}$, then, possibly passing to a subsequence, $\{x_n\}$ converges to a maximal solution.]

11.42 ¶. Study the following Cauchy problem passing to polar coordinates $(\rho, \theta)$:
$$\begin{cases} x_1'(t) = x_1(t) - \dfrac{x_2(t) \sqrt{|x_2(t)|}}{\sqrt{x_1^2(t) + x_2^2(t)}}, \\[2ex] x_2'(t) = x_2(t) + \dfrac{x_1(t) \sqrt{|x_2(t)|}}{\sqrt{x_1^2(t) + x_2^2(t)}}, \\[2ex] x_1(0) = 1, \quad x_2(0) = 0. \end{cases}$$
11.3.2 Boundary value problems

For second order equations it is useful to consider, besides the initial value problem, so-called boundary value problems in which the values of the solution or of its derivative, or a combination of these values, are prescribed at the boundary of the interval. For instance, suppose we want to find the linear motion of a particle under the external force $F(t, x(t), x'(t))$ starting at time $t = 0$ in $x_0$ and ending at time $t = 1$ in $x_1$, i.e., we want to solve the Dirichlet problem
$$\begin{cases} x''(t) = F(t, x(t), x'(t)) \quad \text{in } ]0,1[, \\ x(0) = x_0, \\ x(1) = x_1. \end{cases}$$

11.43 ¶. Check that the problem
$$x'' + x = 0 \ \text{ in } [0, t_1], \qquad x(0) = 0, \qquad x(t_1) = x_1$$
(i) has a unique solution if $t_1 \neq n\pi$, $n \in \mathbb{Z}$, and $x_1 \in \mathbb{R}$,
(ii) has infinitely many solutions if $t_1 = n\pi$, $n \in \mathbb{Z}$, and $x_1 = 0$,
(iii) has no solutions if $t_1 = n\pi$, $n \in \mathbb{Z}$, and $x_1 \neq 0$.
Discuss also the same problem for the equation $x'' + \lambda x = 0$.
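The trichotomy of Exercise 11.43 follows from the fact that the solutions of $x'' + x = 0$ with $x(0) = 0$ are exactly $c \sin t$, so the second boundary condition reads $c \sin t_1 = x_1$. A minimal sketch of this case analysis follows; the function name `classify_bvp` and the tolerance used to decide when $\sin t_1$ counts as zero in floating point are assumptions of this example.

```python
import math

def classify_bvp(t1, x1, tol=1e-12):
    """Classify x'' + x = 0, x(0) = 0, x(t1) = x1.

    Solutions with x(0) = 0 are c*sin(t); returns ('unique', c),
    ('infinite', None) or ('none', None).
    """
    s = math.sin(t1)
    if abs(s) > tol:              # t1 is not a multiple of pi
        return ("unique", x1 / s)
    if abs(x1) <= tol:            # c*0 = 0 holds for every c
        return ("infinite", None)
    return ("none", None)         # c*0 = x1 != 0 is impossible

print(classify_bvp(math.pi / 2, 3.0))
print(classify_bvp(math.pi, 0.0))
print(classify_bvp(2 * math.pi, 1.0))
```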

11.44 Theorem. Let $F(t,x,y)$ be a continuous function in the domain $D := \{ (t,x,y) \mid t \in [0,1], |x| \le a, |y| \le a \}$. Moreover, suppose that $F(t,x,y)$ is Lipschitz in $(x,y)$ uniformly with respect to $t$, i.e., there exists $\mu > 0$ such that
$$|F(t,x_1,y_1) - F(t,x_2,y_2)| \le \mu \big( |x_1 - x_2| + |y_1 - y_2| \big)$$
for every $(t,x_1,y_1), (t,x_2,y_2) \in D$. Then for $|\lambda|$ sufficiently small the problem
$$\begin{cases} x'' = \lambda F(t, x, x'), \\ x(0) = x(1) = 0 \end{cases} \tag{11.14}$$
has a unique solution $x(t) \in C^2([0,1])$. Moreover $|x(t)| \le a$ and $|x'(t)| \le a$ $\forall t \in [0,1]$.

Proof. If $x(t)$ solves $x'' = \lambda F(t, x(t), x'(t))$, then
$$x'(t) = A + \lambda \int_0^t F(\tau, x(\tau), x'(\tau)) \, d\tau = A + \lambda \frac{d}{dt} \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau)) \, d\tau,$$
and
$$x(t) = A t + B + \lambda \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau)) \, d\tau;$$
the boundary conditions yield
$$B = 0, \qquad A + \lambda \int_0^1 (1 - \tau) F(\tau, x(\tau), x'(\tau)) \, d\tau = 0.$$
Thus, $x(t)$ is of class $C^2([0,1])$ and solves (11.14) if and only if $x(t)$ is of class $C^1([0,1])$ and solves
$$x(t) = \lambda \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau)) \, d\tau - \lambda t \int_0^1 (1 - \tau) F(\tau, x(\tau), x'(\tau)) \, d\tau. \tag{11.15}$$
Now consider the class
$$X := \Big\{ x \in C^1([0,1]) \ \Big| \ x(0) = 0, \ \sup_{[0,1]} |x'(t)| \le a \Big\}$$
endowed with the metric
$$d(x_1, x_2) := \sup_{t \in [0,1]} |x_1'(t) - x_2'(t)|$$
that is equivalent to the $C^1$ metric $\|x_1 - x_2\|_{\infty,[0,1]} + \|x_1' - x_2'\|_{\infty,[0,1]}$. It is easily seen that $(X,d)$ is a complete metric space and that the map $x(t) \to T[x](t)$ given by
$$T[x](t) := \lambda \int_0^t (t - \tau) F(\tau, x(\tau), x'(\tau)) \, d\tau - \lambda t \int_0^1 (1 - \tau) F(\tau, x(\tau), x'(\tau)) \, d\tau$$
maps $X$ into itself and is a contraction provided $|\lambda|$ is sufficiently small. The Banach fixed point theorem then yields a unique solution $x \in X$. On the other hand, (11.15) implies that any solution belongs to $X$ if $|\lambda|$ is sufficiently small, hence the solution is unique. □

a. The shooting method

A natural approach to show existence of scalar solutions to the boundary value problem
$$\begin{cases} x'' = F(t, x, x') \quad \text{in } ]0, \bar{t}[, \\ x(0) = 0, \\ x(\bar{t}) = \bar{x} \end{cases} \tag{11.16}$$
consists in showing first existence of solutions $y(t,\lambda)$ of the initial value problem
$$\begin{cases} y'' = F(t, y, y') \quad \text{in } [0, \bar{t}], \\ y(0) = 0, \\ y'(0) = \lambda, \end{cases} \tag{11.17}$$
defined in the interval $[0, \bar{t}]$, and then showing that the scalar equation
$$y(\bar{t}, \lambda) = \bar{x}$$
has at least a solution $\lambda$; in this case the function $y(t,\lambda)$ clearly solves (11.16). Since $y(\bar{t},\lambda)$ is continuous in $\lambda$ by Theorem 11.37, to solve the last equation it suffices to show that there are values $\lambda_1$ and $\lambda_2$ such that $y(\bar{t}, \lambda_1) < \bar{x} < y(\bar{t}, \lambda_2)$. This approach is usually referred to as the shooting method, introduced in 1905 by Carlo Severini (1872-1951).

11.45 Theorem. Let $F(t,x,y)$ be a continuous function in a domain $D$. The problem (11.16) has at least a solution, provided that $\bar{t}$ and $\bar{x}/\bar{t}$ are sufficiently small.

Proof. Suppose $|F(t,x,y)| \le M'$; choose $M > M'$ and a sequence of Lipschitz functions $F_k(t,x,y)$ that converge uniformly to $F(t,x,y)$ with
$$|F_k(t,x,y)| \le M \qquad \forall k, \ \forall t, x, y.$$
Problem (11.17) for $F_k$ transforms into the Cauchy problem for the first order system
$$\begin{cases} z'(t) = G_k(t, z(t)), \\ z(0) = (0, \lambda) \end{cases} \tag{11.18}$$
where $z(t) = (x(t), y(t))$ and $G_k(t,z) = (y, F_k(t,x,y))$. Now if $b > 0$ is chosen so that $\widetilde{D} := \{ (t,z) \mid |t| < a, \ |z - (0,\lambda)| < b \}$ is in the domain of $G_k(t,z)$, and we proceed as in the proof of Peano's theorem, we find a solution $z_{k,\lambda}$ of (11.18) defined in $[0,r]$ with
$$r < \min\Big\{ a, \frac{b}{M} \Big\}. \tag{11.19}$$
Since $G_k$ is a Lipschitz function, $z_{k,\lambda}$ is in fact the unique solution of (11.18) and depends continuously on $\Lambda := (0,\lambda)$. If $x_{k,\lambda}(t)$ is the first component of $z_{k,\lambda}$, we have, see Theorem 11.44,
$$x_{k,\lambda}(t) = \lambda t + \int_0^t (t - \tau) F_k(\tau, x_{k,\lambda}(\tau), x_{k,\lambda}'(\tau)) \, d\tau,$$
hence
$$\lambda r - r^2 M \le x_{k,\lambda}(r) \le \lambda r + r^2 M$$
and in particular,
$$x_{k,\lambda}(r) < \bar{x} \ \text{ if } \lambda r + r^2 M < \bar{x}, \qquad x_{k,\lambda}(r) > \bar{x} \ \text{ if } \lambda r - r^2 M > \bar{x}. \tag{11.20}$$
It follows from (11.19) that the assumptions in (11.20) hold for two values of $\lambda$ if $r$ and $\bar{x}/r$ are small enough, concluding that there is a solution $x_k \in C^2([0,r])$ to the boundary value problem
$$\begin{cases} x_k''(t) = F_k(t, x_k, x_k'), \\ x_k(0) = 0, \\ x_k(r) = \bar{x}. \end{cases} \tag{11.21}$$
As in Theorem 11.44, we see that the family $\{x_k(t)\}$ is equibounded with equicontinuous derivatives, thus, by the Ascoli-Arzelà theorem, a subsequence converges to $x$ in the space $C^1([0,r])$, and passing to the limit in the integral form of (11.21), we see actually that $x \in C^2([0,r])$ and solves (11.16) in $[0,r]$. □
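The shooting method described above translates into a simple numerical procedure: integrate the initial value problem for a trial slope $\lambda$, then adjust $\lambda$ until the endpoint value hits the target. The sketch below is one possible illustration, not the argument of the proof: the classical Runge-Kutta scheme, the bisection search, the function names `endpoint` and `shoot`, and the test problem $y'' = -y$ are all assumptions made for this example.

```python
import math

def endpoint(F, lam, t_end, steps=2000):
    """Integrate y'' = F(t, y, y'), y(0) = 0, y'(0) = lam with RK4; return y(t_end)."""
    h = t_end / steps
    t, y, v = 0.0, 0.0, lam
    for _ in range(steps):
        k1y, k1v = v, F(t, y, v)
        k2y, k2v = v + 0.5 * h * k1v, F(t + 0.5 * h, y + 0.5 * h * k1y, v + 0.5 * h * k1v)
        k3y, k3v = v + 0.5 * h * k2v, F(t + 0.5 * h, y + 0.5 * h * k2y, v + 0.5 * h * k2v)
        k4y, k4v = v + h * k3v, F(t + h, y + h * k3y, v + h * k3v)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return y

def shoot(F, t_end, target, lam_lo, lam_hi, tol=1e-10):
    """Bisection on the slope lam, assuming y(t_end, lam_lo) < target < y(t_end, lam_hi)."""
    if not (endpoint(F, lam_lo, t_end) < target < endpoint(F, lam_hi, t_end)):
        raise ValueError("the two trial slopes do not bracket the target")
    while lam_hi - lam_lo > tol:
        mid = 0.5 * (lam_lo + lam_hi)
        if endpoint(F, mid, t_end) < target:
            lam_lo = mid
        else:
            lam_hi = mid
    return 0.5 * (lam_lo + lam_hi)

# Assumed test problem: y'' = -y, y(0) = 0, y(1) = 0.5, so y = lam*sin(t)
# and the exact slope is 0.5 / sin(1).
lam = shoot(lambda t, y, v: -y, 1.0, 0.5, 0.0, 2.0)
print(lam, 0.5 / math.sin(1.0))
```

Bisection only uses the continuity of $\lambda \to y(\bar{t}, \lambda)$ and the existence of two slopes that undershoot and overshoot the target, which is the same ingredient used in the existence argument.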

b. A maximum principle

Let $u \in C^2(]0,1[) \cap C^0([0,1])$, but $[0,1]$ can be replaced by any bounded interval. If $u$ has a local maximum point $x_0$ in the interior of $[0,1]$, then
$$u'(x_0) = 0 \qquad \text{and} \qquad u''(x_0) \le 0. \tag{11.22}$$
If, moreover, $u$ satisfies the differential inequality
$$u'' + b(x) u' > 0, \tag{11.23}$$
then clearly (11.22) does not hold at points of $]0,1[$, thus the maximum of $u$ is at $0$ or $1$, that is, at the boundary of $[0,1]$. If we allow the nonstrict inequality
$$u'' + b(x) u' \ge 0,$$
the constant functions, which have a maximum at every point, are allowed; but this is the only exception. In fact, we have the following.

11.46 Theorem (Maximum principle). Let $u$ be a function of class $C^2(]x_1,x_2[) \cap C^0([x_1,x_2])$ that satisfies the differential inequality
$$u'' + b(x) u' \ge 0 \qquad \text{in } ]x_1, x_2[$$
where $b(x)$ is a function that is bounded below. Then $u$ is constant, if it has an interior maximum point.

Proof. By contradiction, suppose $x_0 \in ]x_1,x_2[$ is an interior maximum point and $u$ is not constant, so that there is $\bar{x}$ such that $u(\bar{x}) < u(x_0)$. Assume for instance $\bar{x} \in ]x_0, x_2[$ and consider the function
$$z(x) := e^{\alpha (x - x_0)} - 1, \qquad x \in [x_1, x_2],$$
where $\alpha$ is a positive constant to be chosen. Trivially $z(x) < 0$ in $]x_1, x_0[$, $z(x_0) = 0$, $z(x) > 0$ in $]x_0, x_2[$ and
$$z'' + b(x) z' = \big( \alpha^2 + b(x) \alpha \big) e^{\alpha (x - x_0)} > 0 \qquad \text{in } [x_1, x_2]$$
if $\alpha > \max\big( 0, -\inf_{x \in [x_1,x_2]} b(x) \big)$. Also consider the function
$$w(x) := u(x) + \epsilon z(x)$$
where $\epsilon > 0$ has to be chosen. We have $w(x_0) = u(x_0)$, $w(x) < u(x) \le u(x_0) = w(x_0)$ for $x < x_0$, and $w(\bar{x}) = u(\bar{x}) + \epsilon z(\bar{x}) < u(x_0)$ if $\epsilon < \frac{u(x_0) - u(\bar{x})}{z(\bar{x})}$. With the previous choices of $\alpha$ and $\epsilon$, the function $w$ has an interior maximum point in $]x_1, x_2[$, but $w'' + b(x) w' > 0$: a contradiction. □

11.47 ¶. In the previous proof, $z(x) := e^{\alpha (x - x_0)} - 1$ is one of the possible choices. Show for instance that $z(x) := (x - x_1)^\alpha - (x_0 - x_1)^\alpha$ does it as well.

11.48 Theorem. Let $u \in C^2(]x_1,x_2[) \cap C^1([x_1,x_2])$ be a nonconstant solution of the differential inequality

$$u''(x) + b(x)u'(x) \ge 0 \quad \text{in } ]x_1,x_2[$$

where $b(x)$ is bounded from below. Then $u'(x_1) < 0$ if $u$ has a maximum value at $x_1$, and $u'(x_2) > 0$ if $u$ has a maximum value at $x_2$.

Proof. As in Theorem 11.46 we find $w'(x_1) = u'(x_1) + \epsilon\alpha \le 0$ if $w$ has a maximum value at $x_1$. □

Similarly we get the following.

11.49 Theorem (Maximum principle). Let $b(x)$ and $c(x)$ be two functions with $b(x)$ bounded from below and $c(x) \le 0$ in $[x_1,x_2]$. Suppose that $u \in C^2(]x_1,x_2[) \cap C^1([x_1,x_2])$ satisfies the differential inequality

$$u'' + b(x)u'(x) + c(x)u \ge 0 \quad \text{in } ]x_1,x_2[.$$

Then
(i) either $u$ is constant or $u$ has no nonnegative maximum at an interior point,
(ii) if $u$ is not constant and has a nonnegative maximum at $x_1$ (respectively, at $x_2$), then $u'(x_1) < 0$ (respectively, $u'(x_2) > 0$).

An immediate consequence is the following comparison and uniqueness theorem for the Dirichlet boundary value problem for linear second order equations.

11.50 Theorem (Comparison principle). Let $u_1$ and $u_2$ be two functions in $C^2(]x_1,x_2[) \cap C^0([x_1,x_2])$ that solve the differential equation

$$u''(x) + b(x)u'(x) + c(x)u(x) = f(x)$$

where $b$, $c$ and $f$ are bounded functions and $c(x) \le 0$.
(i) If $u_1 \ge u_2$ at $x_1$ and $x_2$, then $u_1 \ge u_2$ in $[x_1,x_2]$,
(ii) if $u_1 = u_2$ at $x_1$ and $x_2$, then $u_1 = u_2$ in $[x_1,x_2]$.

11.51 ¶. Add details to the proofs of Theorems 11.49 and 11.50. By considering the equations $u'' + u = 0$ and $u'' - u = 0$ show that Theorem 11.49 is optimal.


c. The method of super- and sub-solutions

Consider the boundary value problem

$$\begin{cases} -u'' + \lambda u = f(x) & \text{in } ]0,1[, \\ u(0) = u(1) = 0. \end{cases} \tag{11.24}$$

The comparison principle, Theorem 11.50, says that it has at most one solution if $\lambda \ge 0$, and, since we know the general integral, (11.24) has a unique solution. Let $G$ be the Green operator that maps $f \in C^0([0,1])$ to the unique $C^2([0,1])$ solution of (11.24). $G$ is trivially continuous; since $C^2([0,1])$ embeds into $C^0([0,1])$ compactly, $G$ is compact from $C^0([0,1])$ into $C^0([0,1])$; finally, by the maximum principle, $G$ is monotone: if $f \le g$, then $Gf \le Gg$.

Consider now the boundary value problem

$$\begin{cases} -u'' = f(x,u), \\ u(0) = u(1) = 0 \end{cases}$$

where we assume $f : [0,1] \times \mathbb{R} \to \mathbb{R}$ to be continuous, differentiable in $u$ for every fixed $x$, with $f_u(x,u)$ continuous and bounded, $|f_u(x,u)| \le k$ $\forall (x,u) \in [0,1] \times \mathbb{R}$. By choosing $\lambda$ sufficiently large, we see that $f(x,u) + \lambda u$ is increasing in $u$ and we may apply to the problem

$$\begin{cases} -u'' + \lambda u = f(x,u) + \lambda u, \\ u(0) = u(1) = 0 \end{cases} \tag{11.25}$$

the argument in Theorem 11.46, inferring that, if $\underline u$ and $\overline u$ are, respectively, a subsolution and a supersolution for $-u'' = f(x,u)$, i.e.,

$$\begin{cases} -\underline u'' \le f(x,\underline u), \\ \underline u(0),\ \underline u(1) \le 0, \\ \underline u \in C^2([0,1]), \end{cases} \qquad \begin{cases} -\overline u'' \ge f(x,\overline u), \\ \overline u(0),\ \overline u(1) \ge 0, \\ \overline u \in C^2([0,1]), \end{cases}$$

then setting $Tu := G(f(x,u(x)) + \lambda u(x))$ and

$$\begin{cases} u_0 := \underline u, \\ u_{n+1} := Tu_n, \end{cases} \qquad \begin{cases} v_0 := \overline u, \\ v_{n+1} := Tv_n \end{cases} \qquad \text{for } n \ge 0,$$

the sequences $\{u_n\}$ and $\{v_n\}$ converge uniformly to a solution of $u = Tu$, i.e., to a function of class $C^2$ that solves (11.25). Hence we conclude

Figure 11.4. Two pages from a paper by Sergei Bernstein (1880-1968).

11.52 Theorem. Let $f(x,u)$ be a smooth function with $|f_u(x,u)| \le k$ $\forall (x,u)$. Assume that there exist a subsolution and a supersolution for

$$\begin{cases} -u'' = f(x,u) & \text{in } [0,1], \\ u(0) = u(1) = 0. \end{cases}$$

Then there also exists a solution.

We also have the following.

11.53 Theorem. Let $f(t,p) : [0,+\infty[ \times \mathbb{R} \to \mathbb{R}$ be a function of class $C^1$ that is periodic of period $p$ in $t$. If the equation $x''(t) = f(t,x(t))$ has a subsolution $\underline x(t)$ and a supersolution $\overline x(t)$ that are periodic of period $p$ with $\underline x(t) \le \overline x(t)$ for all $t$, then it also has a solution in between, of period $p$.

11.54 ¶. Prove Theorem 11.53. [Hint: Follow this scheme.
(i) Choose $M$ so that $f(t,x) - Mx$ is decreasing.
(ii) Inductively define a sequence of $p$-periodic functions by $x_0(t) := \underline x(t)$ and $x_{n+1}(t)$, $n \ge 0$, as the solution of
$$x_{n+1}''(t) - Mx_{n+1}(t) = f(t,x_n(t)) - Mx_n(t).$$
(iii) Show that $x_n(t) \le x_{n+1}(t) \le \overline x(t)$.
(iv) Show that the sequences $\{x_n'\}$ and $\{x_n''\}$ are equibounded, in particular $\{x_n\}$ and $\{x_n'\}$ have subsequences that converge, and actually that $\{x_n\}$, $\{x_n'\}$ and $\{x_n''\}$ converge uniformly to $x_\infty$, $x_\infty'$ and $x_\infty''$, respectively.
(v) Finally, show that $x_\infty$ is the solution we are looking for.]
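The monotone iteration behind Theorem 11.52 can be tried numerically. The following is only a sketch under stated assumptions, not the construction used in the text: the nonlinearity $f(x,u) = \cos u$, the value $\lambda = 1$, the pair $\underline u = 0$, $\overline u = x(1-x)/2$, the grid size and the helper names `solve_dirichlet` and `T` are all hypothetical choices made for illustration, and the Green operator $G$ is replaced by a finite-difference solve.

```python
import math

def solve_dirichlet(rhs, lam, h):
    """Thomas algorithm for the finite-difference form of -u'' + lam*u = rhs, u(0) = u(1) = 0."""
    n = len(rhs)
    diag = 2.0 / h ** 2 + lam   # main diagonal entry
    off = -1.0 / h ** 2         # sub- and super-diagonal entry
    c = [0.0] * n               # modified super-diagonal
    d = [0.0] * n               # modified right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u

def f(x, u):
    # hypothetical nonlinearity with |f_u| <= 1
    return math.cos(u)

def T(u, xs, lam, h):
    """One step u -> G(f(x, u) + lam*u) of the monotone iteration."""
    rhs = [f(x, ui) + lam * ui for x, ui in zip(xs, u)]
    return solve_dirichlet(rhs, lam, h)

N = 100
h = 1.0 / N
xs = [i * h for i in range(1, N)]          # interior grid nodes
lam = 1.0                                  # makes cos(u) + lam*u nondecreasing in u
lower = [0.0 for _ in xs]                  # subsolution: 0 <= cos(0)
upper = [x * (1.0 - x) / 2.0 for x in xs]  # supersolution: -upper'' = 1 >= cos(upper)
u, v = lower, upper
for _ in range(30):
    u, v = T(u, xs, lam, h), T(v, xs, lam, h)
gap = max(abs(a - b) for a, b in zip(u, v))
print(gap)
```

In this example the increasing sequence started at the subsolution and the decreasing sequence started at the supersolution meet at the same grid function, which stays between the two barriers.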


d. A theorem by Bernstein

We conclude our excursus in the field of ODEs by the following result.

11.55 Theorem (Bernstein). Let $F(x,u,p) : [a,b] \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a continuous function such that
(i) there exists $M > 0$ such that $uF(x,u,0) > 0$ if $|u| > M$,
(ii) there exist continuous nonnegative functions $a(x,u)$ and $b(x,u)$ such that
$$|F(x,u,p)| \le a(x,u)|p|^2 + b(x,u) \qquad \forall (x,u,p) \in [a,b] \times \mathbb{R} \times \mathbb{R}.$$
Then the problem

$$\begin{cases} u'' = F(x,u,u') & \text{in } ]a,b[, \\ u(a) = u(b) = 0 \end{cases}$$

has a solution.

The original theorem† by Bernstein, instead of (i), requires the stronger assumption that $F$ be of class $C^1$ and that for some positive constant $k$ one has $F_u(x,u,p) \ge k > 0$ for all $(x,u,p)$. Its proof uses the shooting method. We shall instead use Schaefer's theorem, Theorem 9.142.

Proof. As we have seen, the operator that maps every $v \in C^1([a,b])$ into the solution of the problem

$$\begin{cases} u'' = F(x,v(x),v'(x)), \\ u(a) = 0, \quad u(b) = 0 \end{cases}$$

is compact. Therefore, according to Schaefer's theorem, it suffices to show that, under the assumptions of Theorem 11.55, there exists $r > 0$ such that, whenever the function $v \in C^2([a,b])$ solves

$$\begin{cases} v'' = \lambda F(x,v,v'), \\ v(a) = v(b) = 0, \end{cases}$$

for some $\lambda \in [0,1]$, then $\|v\|_{C^2([a,b])} < r$.

ESTIMATE OF $\|v\|_\infty$. Let $x_0$ be a maximum point for $v^2(x)$. We may assume $x_0 \in ]a,b[$, otherwise $v = 0$; therefore we have $v'(x_0) = 0$ and

$$0 \ge \frac{d^2}{dx^2} v^2 \Big|_{x=x_0} = 2v'^2(x_0) + 2v(x_0)v''(x_0) = 2\lambda\, v(x_0)F(x_0,v(x_0),0);$$

the assumption (i) then implies $|v(x_0)| \le M$, hence $\|v\|_\infty \le M$.

ESTIMATE OF $\|v'\|_\infty$. Let $\mu$ be a positive constant and let $A$ and $B$ be bounds for $a(x,u)$ and $b(x,u)$ when $x \in [a,b]$ and $|u| \le M$. Multiplying the equation for $v$ by $e^{-\mu v}$ we find

$$(v'e^{-\mu v})' = e^{-\mu v}(\lambda F - \mu v'^2) \le e^{-\mu v}\big((\lambda A - \mu)v'^2 + \lambda B\big),$$

hence

$$(v'e^{-\mu v})' \le \lambda B e^{\mu M}$$

if $\mu > A$. Similarly, multiplying the equation for $v$ by $e^{\mu v}$, we find

$$(v'e^{\mu v})' \ge -\lambda B e^{\mu M}$$

if $\mu > A$. Since $v'$ vanishes at some point in $]a,b[$, integrating these two inequalities from that point and using $|v| \le M$ we deduce for all $x \in [a,b]$

$$-\lambda B e^{2\mu M}(b-a) \le v'(x) \le \lambda B e^{2\mu M}(b-a),$$

therefore $\|v'\|_\infty \le c(A,B,M)$, since $\|v\|_\infty \le M$ by the first estimate.

ESTIMATE OF $\|v''\|_\infty$. This is now trivial, since from the equation we have

$$|v''(x)| \le |F(x,v(x),v'(x))| \le c(M),$$

$F$ being continuous in $[a,b] \times [-M,M] \times [-c(A,B,M), c(A,B,M)]$. □

† S.N. Bernstein, Sur les équations du calcul des variations, Ann. Sci. Éc. Norm. Sup. Paris 29 (1912) 431-485.

Figure 11.5. Ivar Fredholm (1866-1927) and a page from one of his papers.

11.4 Linear Integral Equations

11.4.1 Some motivations

In several instances we have encountered integral equations, as convolution operators or, when solving linear equations, as integral equations of the type

$$y(t) = y_0 + \int_{t_0}^t f(s,y(s))\,ds;$$

for instance, the linear system $x'(t) = A(t)x(t)$ can be written as

$$x(t) = x(t_0) + \int_{t_0}^t A(s)x(s)\,ds. \tag{11.26}$$

(11.26) is an example of Volterra's equation, i.e., of equations of the form

$$f(t) = a\,x(t) + \int_0^t k(t,\tau)x(\tau)\,d\tau. \tag{11.27}$$

a. Integral form of second order equations

The equation $x''(t) - A(t)x(t) = 0$, $t \in [\alpha,\beta]$, can be written as a Volterra equation. In fact, integrating, we get

$$x'(t) = c_1 + \int_{t_0}^t A(s)x(s)\,ds$$

and, integrating again,

$$\begin{aligned} x(t) &= c_0 + c_1(t-t_0) + \int_{t_0}^t \Big( \int_{t_0}^\tau A(s)x(s)\,ds \Big) d\tau \\ &= c_0 + c_1(t-t_0) + \int_{t_0}^t (t-s)A(s)x(s)\,ds \\ &=: F(t) + \int_{t_0}^t (t-s)A(s)x(s)\,ds, \end{aligned} \tag{11.28}$$

with $F(t) := c_0 + c_1(t-t_0)$ and $G : [\alpha,\beta] \times [\alpha,\beta] \to \mathbb{R}$ given by

$$G(t,s) := \begin{cases} (t-s)A(s) & \text{if } s \le t, \\ 0 & \text{if } s > t. \end{cases}$$

b. Materials with memory

Hooke's law states that the actual stress $\sigma$ is proportional to the actual strain $\epsilon$. At the end of the 1800s, Boltzmann and Volterra observed that the past history of the deformations of the body cannot always be neglected. In these cases the actual stress $\sigma$ depends not only on the actual strain, but on the whole of the deformations the body was subjected to in the past, hence at every instant $t$

$$\sigma(t) = a\,\epsilon(t) + F[\epsilon(\tau)]_0^t,$$

where $F$ is a functional depending on all values of $\epsilon(\tau)$, $0 \le \tau \le t$. In the linear context, Volterra proposed the following analytical model for $F$:

$$F[\epsilon(\tau)]_0^t := \int_0^t k(t,\tau)\epsilon(\tau)\,d\tau.$$

This leads to the study of equations of the type

$$\sigma(t) = a\,\epsilon(t) + \int_0^t k(t,\tau)\epsilon(\tau)\,d\tau,$$

that are called Volterra's integral equations of first and second kind according to whether $a = 0$ or $a \ne 0$.

c. Boundary value problems

Consider the boundary value problem

$$\begin{cases} x'' - A(t)x = 0, \\ x(0) = a, \quad x(L) = b. \end{cases} \tag{11.29}$$

From (11.28) we infer

$$x(t) = c_1 + c_2 t + \int_0^t (t-s)A(s)x(s)\,ds$$

and, taking into account the boundary conditions,

$$c_1 = a, \qquad c_2 = \frac{b-a}{L} - \frac{1}{L}\int_0^L (L-s)A(s)x(s)\,ds,$$

we conclude that

$$\begin{aligned} x(t) &= a + \frac{b-a}{L}\,t - \frac{t}{L}\int_0^L (L-s)A(s)x(s)\,ds + \int_0^t (t-s)A(s)x(s)\,ds \\ &= a + \frac{b-a}{L}\,t - \int_0^t \frac{s(L-t)}{L}A(s)x(s)\,ds - \int_t^L \frac{t(L-s)}{L}A(s)x(s)\,ds. \end{aligned}$$

In other words, $x(t)$ solves (11.29) if and only if $x(t)$ solves the integral equation, called Fredholm equation,

$$x(t) = F(t) + \int_0^L G(t,s)x(s)\,ds$$

where $F(t) := a + \frac{b-a}{L}\,t$ and $G : [0,L] \times [0,L] \to \mathbb{R}$ is given by

$$G(t,s) := -\frac{A(s)}{L}\begin{cases} s(L-t) & \text{if } s \le t, \\ t(L-s) & \text{if } t < s. \end{cases}$$


Figure 11.6. An elastic thread.

d. Equilibrium of an elastic thread

Consider an elastic thread of length $\ell$ which readily changes its shape, but which requires a force $c\,d\ell$ to increase its length by $d\ell$ according to Hooke's law. At rest, the position of the thread is horizontal (the segment $AB$) under the action of the tensile force $T_0$, which is very large compared to any other force under consideration. If we apply a vertical force $p$ at $C$, for which $x = \xi$, the thread will assume the form in Figure 11.6. Assume that $\delta = CC_0$ is very small compared to $AC_0$ and $C_0B$ (as a consequence of the smallness of $p$ compared with $T_0$) so that, disregarding terms of the order $\delta^2$ (compared with $\xi^2$), the tension of the thread remains equal to $T_0$. Then the condition of equilibrium of forces is

$$T_0\frac{\delta}{\xi} + T_0\frac{\delta}{\ell-\xi} = p, \qquad \text{i.e.,} \qquad \delta = \frac{p(\ell-\xi)\xi}{T_0\ell}.$$

Denoting by $y(x)$ the vertical deflection at a point of abscissa $x$, we have $y(x) = G(x,\xi)\,p$ where

$$G(x,\xi) := \begin{cases} \dfrac{x(\ell-\xi)}{T_0\ell} & 0 \le x \le \xi, \\[2mm] \dfrac{(\ell-x)\xi}{T_0\ell} & \xi \le x \le \ell. \end{cases}$$

Now suppose that a continuously distributed force with length density $p(\xi)$ acts on the thread. By the principle of superposition the thread will assume the shape

$$y(x) = \int_0^\ell G(x,\xi)p(\xi)\,d\xi. \tag{11.30}$$

If we seek the distribution density $p(\xi)$ so that the thread is in the shape $y(x)$, we are led to study Fredholm's integral equation in (11.30).

e. Dynamics of an elastic thread

Suppose now that a force, which varies with the time $t$ and has density at $\xi$ given by

$$p(\xi)\sin\omega t, \qquad \omega > 0,$$

acts on the thread. Suppose that during the motion the abscissa of every point of the thread remains unchanged and that the thread oscillates according to $y = y(x)\sin\omega t$. Then we find that at time $t$ the piece of thread between $\xi$ and $\xi + \Delta\xi$ is acted upon by the force $p(\xi)\sin(\omega t)\,\Delta\xi$ plus the force of inertia

$$-\rho(\xi)\,\Delta\xi\,\frac{d^2y}{dt^2} = \rho(\xi)\,y(\xi)\,\omega^2\sin(\omega t)\,\Delta\xi,$$

where $\rho(\xi)$ is the density of mass of the thread at $\xi$, and the equation (11.30) takes the form

$$y(x)\sin\omega t = \int_0^\ell G(x,\xi)\big[p(\xi)\sin\omega t + \omega^2\rho(\xi)y(\xi)\sin\omega t\big]\,d\xi. \tag{11.31}$$

If we set

$$\int_0^\ell G(x,\xi)p(\xi)\,d\xi =: f(x), \qquad G(x,\xi)\rho(\xi) =: k(x,\xi), \qquad \omega^2 =: \lambda,$$

(11.31) takes the form of a Fredholm equation

$$y(x) = \lambda\int_0^\ell k(x,\xi)y(\xi)\,d\xi + f(x). \tag{11.32}$$

11.56 ¶. Show that, if in (11.32) we assume $\rho(\xi)$ constant and $f$ smooth, then $y(x)$ solves

$$\begin{cases} y''(x) + \omega^2 c\,y(x) = f''(x), \\ y(0) = 0, \\ y(\ell) = 0, \end{cases} \tag{11.33}$$

where $c = \rho/T_0$. Show also that, conversely, if $y$ solves (11.33), then it also solves (11.32).

11.57 ¶. In the case $\rho = \text{const}$, show that the unique solution of (11.33) is

$$y(x) = -\frac{\sin\mu x}{\mu\sin\mu\ell}\int_0^\ell f''(\xi)\sin\mu(\ell-\xi)\,d\xi + \frac{1}{\mu}\int_0^x f''(\xi)\sin\mu(x-\xi)\,d\xi$$

if $\sin\mu\ell \ne 0$, $\mu := \omega\sqrt{c}$. Instead, if $\sin\mu\ell = 0$, i.e., $\mu = \mu_k$ where

$$\mu_k := \frac{k\pi}{\ell}, \qquad \omega_k := \frac{k\pi}{\ell\sqrt{c}}, \qquad \lambda_k := \frac{k^2\pi^2}{\ell^2 c}, \qquad k \in \mathbb{Z},$$

then (11.33) is solvable if and only if

$$\int_0^\ell f''(\xi)\sin\mu(\ell-\xi)\,d\xi = 0,$$

equivalently, since $\sin\mu_k(\ell-\xi) = \pm\sin\mu_k\xi$, iff

$$\int_0^\ell f''(\xi)\sin\mu_k\xi\,d\xi = 0.$$

In particular, if $f(x) = 0$ and $\mu = \mu_k$, all solutions are given by

$$y(x) = C\sin\mu_k x, \qquad C \in \mathbb{R},$$

and the natural oscillations of the thread are given by

$$y = C\sin\mu_k x\,\sin\omega_k t.$$

Compare the above with the alternative theorem of Fredholm in Chapter 10.

Figure 11.7. Vito Volterra (1860-1940) and the frontispiece of the first volume of his collected works.

11.4.2 Volterra integral equations

A linear integral equation in the unknown $x(t)$, $t \in [a,b]$, of the type

$$x(t) = f(t) + \int_a^b k(t,\tau)x(\tau)\,d\tau,$$

where $f(t)$ and $k(t,\tau)$ are given functions, is called a Fredholm equation of the second kind, while a Fredholm equation of the first kind has the form

$$\int_a^b k(t,\tau)x(\tau)\,d\tau = f(t).$$

The function $k(t,\tau)$ is called the kernel of the integral equation. If the kernel satisfies $k(t,\tau) = 0$ for $t < \tau$, so that the integral extends only over $[a,t]$, the Fredholm equations of first and second kind are called Volterra equations. However, it is convenient to treat Volterra equations separately.

11.58 Theorem. Let $k(t,\tau)$ be a continuous kernel in $[a,b] \times [a,b]$ and let $f \in C^0([a,b])$. Then the Volterra integral equation

$$x(t) = f(t) + \lambda\int_a^t k(t,\tau)x(\tau)\,d\tau$$

has a unique solution in $C^0([a,b])$ for all values of $\lambda$.

Proof. The transformation

$$T[x](t) := f(t) + \lambda\int_a^t k(t,\tau)x(\tau)\,d\tau$$

maps $C^0([a,b])$ into itself. Moreover, setting $M := \max|k(t,\tau)|$, for all $t \in [a,b]$ we have

$$|T[x_1](t) - T[x_2](t)| \le |\lambda|\,M\,(t-a)\,\|x_1 - x_2\|_{\infty,[a,b]},$$

hence

$$|T^2[x_1](t) - T^2[x_2](t)| \le |\lambda|^2 M^2 \frac{(t-a)^2}{2}\,\|x_1 - x_2\|_{\infty,[a,b]}$$

and by induction, if $T^n := T \circ \dots \circ T$ $n$ times,

$$|T^n[x_1](t) - T^n[x_2](t)| \le |\lambda|^n M^n \frac{(t-a)^n}{n!}\,\|x_1 - x_2\|_{\infty,[a,b]} \qquad \forall t \in [a,b].$$

If $n$ is sufficiently large, so that $|\lambda|^n M^n (b-a)^n/n! < 1$, we conclude that $T^n$ is a contraction, hence it has a unique fixed point $x \in C^0([a,b])$. If $n = 1$ the proof is done; otherwise $Tx$ is also a fixed point of $T^n$, so necessarily we again have $Tx = x$ by uniqueness. □
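The proof of Theorem 11.58 is constructive: iterating $T$ from any starting function converges to the solution. A minimal numerical sketch of this iteration follows; the function name `picard_volterra`, the trapezoid quadrature, the grid size and the test problem $x(t) = 1 + \int_0^t x(\tau)\,d\tau$ (whose solution is $e^t$) are assumptions made for illustration and do not come from the text.

```python
import math

def picard_volterra(f, k, lam, a, b, n_grid=200, n_iter=40):
    """Iterate x <- f + lam * int_a^t k(t, tau) x(tau) dtau on a uniform grid."""
    h = (b - a) / n_grid
    ts = [a + i * h for i in range(n_grid + 1)]
    x = [f(t) for t in ts]                 # start the iteration from x = f
    for _ in range(n_iter):
        new_x = []
        for i, t in enumerate(ts):
            # composite trapezoid rule for the integral over [a, t]
            integral = 0.0
            for j in range(i):
                left = k(t, ts[j]) * x[j]
                right = k(t, ts[j + 1]) * x[j + 1]
                integral += 0.5 * h * (left + right)
            new_x.append(f(t) + lam * integral)
        x = new_x
    return ts, x

# hypothetical test problem: f = 1, k = 1, lam = 1 on [0, 1]
ts, x = picard_volterra(lambda t: 1.0, lambda t, tau: 1.0, 1.0, 0.0, 1.0)
err = max(abs(xi - math.exp(t)) for t, xi in zip(ts, x))
print(err)
```

The remaining error comes from the quadrature rule rather than from the iteration, which reflects the factorial decay $|\lambda|^n M^n (b-a)^n/n!$ in the proof.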

11.4.3 Fredholm integral equations in $C^0$

11.59 Theorem. Let $k(t,\tau)$ be a continuous kernel in $[a,b] \times [a,b]$ and let $f \in C^0([a,b])$. The Fredholm integral equation

$$x(t) = f(t) + \lambda\int_a^b k(t,\tau)x(\tau)\,d\tau$$

has a unique solution $x(t)$ in $C^0([a,b])$, provided $|\lambda|$ is sufficiently small.

Proof. Trivially, the transformation

$$T[x](t) := f(t) + \lambda\int_a^b k(t,\tau)x(\tau)\,d\tau$$

maps $C^0([a,b])$ into itself and is contractive for $\lambda$ close to zero; in fact, if $M := \max|k(t,\tau)|$,

$$\begin{aligned} |T[x_1](t) - T[x_2](t)| &\le |\lambda|\int_a^b |k(t,\tau)|\,|x_1(\tau) - x_2(\tau)|\,d\tau \\ &\le |\lambda|\,M\,(b-a)\,\|x_1 - x_2\|_{\infty,[a,b]} \le \frac12\,\|x_1 - x_2\|_{\infty,[a,b]} \end{aligned}$$

if $|\lambda|\,M\,(b-a) \le 1/2$. □

In order to understand what happens for large $\lambda$, observe that the transformation

$$T[x](t) := \int_a^b k(t,\tau)x(\tau)\,d\tau$$

is linear, continuous and compact, see Example 9.139. The Riesz-Schauder theorem in Remark 10.72 then yields the following.

11.60 Theorem. Let $k(t,\tau) \in C^0([a,b] \times [a,b])$ and $f \in C^0([a,b])$. The equation

$$\lambda x(t) = f(t) + \int_a^b k(t,\tau)x(\tau)\,d\tau \tag{11.34}$$

has a set of eigenvalues $\Lambda$ with the only accumulation point $\lambda = 0$. Each eigenvalue $\lambda \ne 0$ has finite multiplicity and for any $\lambda$, $\lambda \ne 0$ and $\lambda \notin \Lambda$, (11.34) has a unique solution.

Further information concerning the eigenvalue case requires the use of a different space norm, the integral norm $\|\ \|_2$, and therefore a description of the completion $L^2((a,b))$ of $C^0((a,b))$ that we have not yet treated.

11.5 Fourier's Series

In 1747 Jean d'Alembert (1717-1783) showed that the general solution of the wave equation

$$\frac{\partial^2 u}{\partial t^2} = a^2\frac{\partial^2 u}{\partial x^2}, \tag{11.35}$$

that transforms into

$$\frac{\partial^2 u}{\partial r\,\partial s} = 0$$

by the change of variables $r = x + at$, $s = x - at$, is given by

$$u(t,x) = \varphi(x + at) + \psi(x - at),$$

where $\varphi$ and $\psi$ are, in principle, generic functions.

Figure 11.8. Frontispieces of two celebrated works by Joseph Fourier (1768-1830) and G. F. Bernhard Riemann (1826-1866).

Slightly later, in 1753, Daniel Bernoulli (1700-1782) proposed a different approach. Starting with the observation of Brook Taylor (1685-1731) that the functions

$$\sin\Big(\frac{n\pi x}{\ell}\Big)\cos\Big(\frac{n\pi a(t-\beta)}{\ell}\Big), \qquad n = 1, 2, \dots \tag{11.36}$$

are solutions of the equation (11.35) and satisfy the boundary conditions $u(t,0) = u(t,\ell) = 0$, Bernoulli came to the conclusion that all solutions of (11.35) could be represented as superpositions of the tones in (11.36). An outcome of this was that every function could be represented as a sum of analytic functions, and, indeed,

$$\frac{4}{\pi}\sum_{n=0}^\infty \frac{\sin(2n+1)x}{2n+1} = \begin{cases} 1 & \text{if } 0 < x < \pi, \\ 0 & \text{if } x = \pi, \\ -1 & \text{if } \pi < x < 2\pi. \end{cases}$$

Bernoulli's result caused numerous disputes that lasted well into the nineteenth century that even included the notion of function and, eventually,


was clarified with the contributions of Joseph Fourier (1768-1830), Lejeune Dirichlet (1805-1859), G. F. Bernhard Riemann (1826-1866) and many other mathematicians. The methods developed in this context, in particular the idea that a physical system near its equilibrium position can be described as superposition of vibrations and the idea that space analysis can be transformed into a frequency analysis, turned out to be of fundamental relevance both in physics and mathematics.

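11.5.1 Definitions and preliminaries

We denote by $L^1_{2\pi}$ the space of complex-valued $2\pi$-periodic functions in $\mathbb{R}$ that are summable on a period, for instance in $[-\pi,\pi]$. For $k \in \mathbb{Z}$, the $k$th Fourier coefficient of $f \in L^1_{2\pi}$ is the complex number

$$c_k = c_k(f) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt,$$

often denoted by $\hat f(k)$ or $\hat f_k$.

11.61 Definition. The Fourier $n$th partial sum of $f \in L^1_{2\pi}$ is the trigonometric polynomial of order $n$ given by

$$S_nf(x) := \sum_{k=-n}^{n} c_k e^{ikx}.$$

The Fourier series of $f$ is the sequence of its Fourier partial sums and their limit

$$Sf(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx} := \lim_{n\to\infty} S_nf(x) = \lim_{n\to\infty}\sum_{k=-n}^{n} c_k e^{ikx}.$$

If $f \in L^1_{2\pi}$ is real-valued, then

$$\overline{c_k} = c_{-k} \qquad \forall k \in \mathbb{Z}$$

since $\overline{f(t)} = f(t)$ and

$$\overline{\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt} = \int_{-\pi}^{\pi} \overline{f(t)}\,e^{ikt}\,dt = \int_{-\pi}^{\pi} f(t)e^{ikt}\,dt.$$

The partial sums of the Fourier series of a real-valued function have the form

$$S_nf(x) = c_0 + \sum_{k=1}^{n}\big(c_k e^{ikx} + \overline{c_k}\,e^{-ikx}\big) = c_0 + \sum_{k=1}^{n} \Re\big(2c_k e^{ikx}\big),$$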

Figure 11.9. The Dirichlet kernel with $n = 5$. Observe that the zeros of $D_n(t)$ are equidistributed: $x_k := \frac{2k\pi}{2n+1}$, $x_k \ne 2j\pi$, $k, j \in \mathbb{Z}$.

thus, decomposing $c_k$ in its real and imaginary parts, $c_k =: (a_k - ib_k)/2$, that is, setting

$$a_k := \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(kt)\,dt, \qquad b_k := \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin(kt)\,dt,$$

we find the trigonometric series

$$\begin{aligned} S_nf(x) &= \frac{a_0}{2} + \sum_{k=1}^{n} \Re\big((a_k - ib_k)(\cos(kx) + i\sin(kx))\big) \\ &= \frac{a_0}{2} + \sum_{k=1}^{n} (a_k\cos kx + b_k\sin kx). \end{aligned} \tag{11.37}$$

However, the complex notation is handier even for real-valued functions.

11.62 ¶. Show that the operator mapping every function in $L^1_{2\pi}$ into the sequence of its Fourier coefficients, $f \to \{\hat f(k)\}$, has the following properties:
(i) it is linear: $(\lambda f + \mu g)^\wedge(k) = \lambda\hat f(k) + \mu\hat g(k)$ $\forall \lambda, \mu \in \mathbb{C}$, $\forall f, g \in L^1_{2\pi}$,
(ii) $\widehat{fg}(k) = (\hat f * \hat g)(k)$, see Proposition 4.46,
(iii) $\widehat{f * g}(k) = \hat f(k)\hat g(k)$, see Proposition 4.48,
(iv) if $g(t) = f(-t)$, then $\hat g(k) = \hat f(-k)$,
(v) if $g(t) = f(t - \varphi)$, then $\hat g(k) = e^{-ik\varphi}\hat f(k)$,
(vi) if $f$ is real and even, then its Fourier coefficients are real and its Fourier series is a cosine series,
(vii) if $f$ is real and odd, its Fourier coefficients are imaginary and its Fourier series is a sine series,
(viii) if $f$ has continuous derivative, or more generally $f$ is continuous and $f'$ is piecewise continuous, then $c_k(f') = ik\,c_k(f)$ $\forall k \in \mathbb{Z}$.
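The relation $c_k = (a_k - ib_k)/2$ and the partial sums $S_nf$ can be explored numerically. The sketch below is illustrative only: the helper names `fourier_coefficient` and `partial_sum`, the midpoint quadrature and the sample function $f(t) = t$ on $[-\pi,\pi[$ are assumptions, not part of the text.

```python
import cmath
import math

def fourier_coefficient(f, k, n_quad=4000):
    """Midpoint-rule approximation of c_k = (1/2pi) * int_{-pi}^{pi} f(t) e^{-ikt} dt."""
    h = 2 * math.pi / n_quad
    total = 0j
    for j in range(n_quad):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * k * t)
    return total * h / (2 * math.pi)

def partial_sum(coeffs, n, x):
    """S_n f(x) = sum over |k| <= n of c_k e^{ikx}."""
    return sum(coeffs[k] * cmath.exp(1j * k * x) for k in range(-n, n + 1))

f = lambda t: t                      # hypothetical sample: sawtooth on [-pi, pi[
n = 50
coeffs = {k: fourier_coefficient(f, k) for k in range(-n, n + 1)}

# real form recovered from c_k = (a_k - i b_k) / 2
a1 = 2 * coeffs[1].real
b1 = -2 * coeffs[1].imag
s = partial_sum(coeffs, n, math.pi / 2)
print(a1, b1, s)
```

For this odd sample function the cosine coefficient $a_1$ vanishes and $b_1$ is close to 2, in agreement with properties (vi) and (vii), while the partial sum at $x = \pi/2$ is real up to rounding and close to $f(\pi/2)$.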

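a. Dirichlet's kernel

The Dirichlet kernel of order $n$ is defined by

$$D_n(x) := 1 + 2\sum_{k=1}^{n}\cos(kx) = \sum_{k=-n}^{n} e^{ikx}, \qquad x \in \mathbb{R}.$$

As we have seen in Section 5.4 of [GM2], $D_n(t)$ is a trigonometric polynomial of order $n$ and $2\pi$-periodic, $D_n(t)$ is even, and

$$D_n(t) = \begin{cases} 2n+1 & \text{if } t = 2k\pi,\ k \in \mathbb{Z}, \\[1mm] \dfrac{\sin((n+1/2)t)}{\sin(t/2)} & \text{otherwise}. \end{cases}$$

The Fourier coefficients of $\{D_n(t)\}$ are trivially

$$c_k(D_n) = \begin{cases} 1 & \text{if } |k| \le n, \\ 0 & \text{if } |k| > n. \end{cases}$$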

Therefore it is not surprising that we have the following.

11.63 Lemma. For every $f \in L^1_{2\pi}(\mathbb{R})$ we have

$$S_nf(x) = \frac{1}{2\pi}\int_0^{\pi} \big(f(x+t) + f(x-t)\big)D_n(t)\,dt \qquad \forall x \in \mathbb{R}.$$

Proof. In fact

$$\begin{aligned} S_nf(x) &= \sum_{k=-n}^{n} c_k e^{ikx} = \frac{1}{2\pi}\sum_{k=-n}^{n}\int_{-\pi}^{\pi} f(t)e^{ik(x-t)}\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)D_n(x-t)\,dt \\ &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)D_n(t-x)\,dt = \frac{1}{2\pi}\int_{-\pi-x}^{\pi-x} f(x+t)D_n(t)\,dt \\ &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x+t)D_n(t)\,dt = \frac{1}{2\pi}\int_0^{\pi} \big(f(x+t) + f(x-t)\big)D_n(t)\,dt, \end{aligned}$$

where we used, in the fourth equality, that $D_n(t)$ is even and, in the second to last equality, that for a $2\pi$-periodic function we have

$$\int_a^{a+2\pi} u(t)\,dt = \int_{-\pi}^{\pi} u(t)\,dt \qquad \forall a \in \mathbb{R}.$$
□

Finally we explicitly notice that, though $\int_{-\pi}^{\pi} D_n(t)\,dt = 2\pi$, we have

$$\int_{-\pi}^{\pi} |D_n(t)|\,dt = O(\log n).$$

This prevents us from estimating the modulus of integrals involving $D_n(t)$ by estimating the integral of the modulus.
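The closed form of $D_n$ and the slow growth of the mean of $|D_n|$ can be checked numerically. This is a sketch for illustration only; the names `dirichlet_sum`, `dirichlet_closed` and `mean_abs`, the midpoint quadrature and the sampled orders are assumptions that do not appear in the text.

```python
import math

def dirichlet_sum(n, t):
    """D_n(t) from the defining cosine sum."""
    return 1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, n + 1))

def dirichlet_closed(n, t):
    """D_n(t) from the closed form sin((n + 1/2) t) / sin(t / 2)."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:          # t is (numerically) a multiple of 2*pi
        return 2.0 * n + 1.0
    return math.sin((n + 0.5) * t) / s

def mean_abs(n, n_quad=20000):
    """Midpoint approximation of (1/2pi) * int_{-pi}^{pi} |D_n(t)| dt."""
    h = 2 * math.pi / n_quad
    total = sum(abs(dirichlet_closed(n, -math.pi + (j + 0.5) * h))
                for j in range(n_quad))
    return total * h / (2 * math.pi)

lebesgue = {n: mean_abs(n) for n in (1, 4, 16, 64)}
print(lebesgue)
```

The printed means increase with $n$, but only slowly, which is consistent with the $O(\log n)$ behaviour stated above.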


Figure 11.10. The frontispieces of two volumes on trigonometric series by Henri Lebesgue (1875-1941) and Leonida Tonelli (1885-1946).

11.5.2 Pointwise convergence

If $P$ is a trigonometric polynomial, $P \in \mathcal{P}_{n,2\pi}$, then $P$ agrees with its Fourier series, $P(x) = \sum_{k=-n}^{n} c_k e^{ikx}$ $\forall x \in \mathbb{R}$, see Section 5.4 of [GM2]. But this does not hold for every $f \in L^1_{2\pi}$. Given $f \in L^1_{2\pi}$, we then ask ourselves under which assumptions on $f$ the Fourier series of $f$ converges and converges to $f$.

a. The Riemann-Lebesgue theorem

The theorem below states that a rapidly oscillating function with a summable profile has an integral that converges to zero when the frequency of its oscillations tends to infinity, as a result of the compensation of positive and negative contributions due to oscillations, even though the $L^1$ norms are far from zero.

11.64 Theorem (Riemann-Lebesgue). Let $f : ]a,b[ \to \mathbb{R}$ be a Riemann summable function in $]a,b[$. For every interval $]c,d[ \subset ]a,b[$ we have

$$\int_c^d f(t)e^{i\lambda t}\,dt \to 0 \qquad \text{as } |\lambda| \to \infty$$

uniformly with respect to $c$ and $d$.

Proof. (i) Assume first that $f$ is a step function, and let $\sigma := \{x_0 = a, x_1, \dots, x_n = b\}$ be a subdivision of $]a,b[$ so that $f(x) = \alpha_k$ on $[x_{k-1},x_k]$. Then

$$\Big|\int_c^d f(t)e^{i\lambda t}\,dt\Big| \le \frac{2}{|\lambda|}\sum_{k=1}^{n}|\alpha_k|.$$

This proves the theorem in this case.

(ii) Let $f$ be summable in $]a,b[$ and $\epsilon > 0$. By truncating $f$ suitably, we find a bounded Riemann integrable function $h_\epsilon$ such that $\int_a^b |f(t) - h_\epsilon(t)|\,dt < \epsilon$, and in turn a step function $g_\epsilon : (a,b) \to \mathbb{R}$ with $\int_a^b |h_\epsilon(t) - g_\epsilon(t)|\,dt < \epsilon$. Consequently $\int_a^b |f(t) - g_\epsilon(t)|\,dt < 2\epsilon$ and from

$$\int_c^d f(t)e^{i\lambda t}\,dt = \int_c^d g_\epsilon(t)e^{i\lambda t}\,dt + \int_c^d \big(f(t) - g_\epsilon(t)\big)e^{i\lambda t}\,dt$$

we infer

$$\Big|\int_c^d f(t)e^{i\lambda t}\,dt\Big| \le \Big|\int_c^d g_\epsilon(t)e^{i\lambda t}\,dt\Big| + \int_a^b |f(t) - g_\epsilon(t)|\,dt \le \Big|\int_c^d g_\epsilon(t)e^{i\lambda t}\,dt\Big| + 2\epsilon.$$

The conclusion then follows by applying part (i) to $g_\epsilon$. □

11.65 Corollary. Let $f$ be Riemann summable in $]a,b[$. Then

$$\int_c^d f(s)\sin\Big(\big(n + \tfrac12\big)s\Big)\,ds \to 0 \qquad \text{as } n \to \infty$$

uniformly with respect to the interval $]c,d[ \subset ]a,b[$.

11.66 ¶. Show the following.

Proposition. Let $f \in L^1_{2\pi}$. Then we have

$$\int_{\delta < |t| < \pi} f(t)D_n(t)\,dt \to 0 \qquad \text{as } n \to \infty$$

for every $\delta > 0$.

11.67 ¶. Show Theorem 11.64 integrating by parts if $f$ is of class $C^1([a,b])$.

11.68 ¶. Let $f \in L^1_{2\pi}$ and let $\{c_k(f)\}$ be the sequence of its Fourier coefficients. Show that $|c_k(f)| \to 0$ as $k \to \pm\infty$.

b. Regular functions and Dini test

11.69 Definition. We say in this context that $f \in L^1_{2\pi}$ is regular at $x \in \mathbb{R}$ if there exist real numbers $L^\pm(x)$ and $M^\pm(x)$ such that

$$\begin{aligned} &\lim_{t\to 0^+} f(x+t) = L^+(x), &&\lim_{t\to 0^-} f(x+t) = L^-(x), \\ &\lim_{t\to 0^+} \frac{f(x+t) - L^+(x)}{t} = M^+(x), &&\lim_{t\to 0^-} \frac{f(x+t) - L^-(x)}{t} = M^-(x). \end{aligned} \tag{11.38}$$

Of course, if $f$ is differentiable at $x$, then $f$ is regular at $x$ with $L^\pm(x) = f(x)$ and $M^\pm(x) = f'(x)$. Discontinuous functions with left and right limits at $x$ and bounded slope near $x$ are evidently regular at $x$. In particular square waves, sawtooth ramps and $C^1$ functions are regular at every $x \in \mathbb{R}$. It is easy to see that if $f$ is regular at $x$ then the function

$$\varphi_x(t) := \frac{f(x+t) + f(x-t) - L^+(x) - L^-(x)}{t} \tag{11.39}$$

is bounded hence Riemann integrable in $]0,\pi]$.

11.70 Definition. We say that a $2\pi$-periodic piecewise-continuous map $f : \mathbb{R} \to \mathbb{C}$ is Dini-regular at $x \in \mathbb{R}$ if there exist real numbers $L^\pm(x)$ such that

$$\int_0^{\pi} \Big|\frac{f(x+t) + f(x-t) - L^+(x) - L^-(x)}{t}\Big|\,dt < +\infty. \tag{11.40}$$

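11.71 Theorem (Dini's test). Let $f \in L^1_{2\pi}(\mathbb{R})$ be Dini-regular at $x \in \mathbb{R}$ and let $L^+(x)$, $L^-(x)$ be as in (11.40). Then $S_nf(x) \to (L^+(x) + L^-(x))/2$.

Proof. We may assume that $x \in [-\pi,\pi]$. Since $\frac{1}{2\pi}\int_0^{\pi} D_n(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^0 D_n(t)\,dt = 1/2$, we have

$$\begin{aligned} S_nf(x) - \frac{L^+(x) + L^-(x)}{2} &= \frac{1}{2\pi}\int_0^{\pi} \big(f(x+t) + f(x-t) - L^+ - L^-\big)D_n(t)\,dt \\ &= \frac{1}{2\pi}\int_0^{\pi} \varphi_x(t)\,t\,D_n(t)\,dt, \end{aligned} \tag{11.41}$$

where $\varphi_x(t)$ is as in (11.39). Set $h(t) := \varphi_x(t)\,\frac{t}{\sin(t/2)}$, so that $|h(t)| \le \pi|\varphi_x(t)|$ in $]0,\pi]$; hence $h$ is summable in $]0,\pi[$ by (11.40), and Corollary 11.65 yields

$$S_nf(x) - \frac{L^+(x) + L^-(x)}{2} = \frac{1}{2\pi}\int_0^{\pi} h(t)\sin((n + 1/2)t)\,dt \to 0.$$
□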

In particular, if $f$ is continuous, $2\pi$-periodic and satisfies the Dini condition at every $x$, then $S_nf(x) \to f(x)$ $\forall x \in \mathbb{R}$ pointwise.

11.72 Example. Let $0 < \alpha \le 1$ and $A \subset \mathbb{R}$. Recall that $f : A \to \mathbb{R}$ is said to be $\alpha$-Hölder-continuous if there exists $K > 0$ such that

$$|f(x) - f(y)| \le K|x - y|^\alpha \qquad \forall x, y \in A.$$

We claim that a $2\pi$-periodic function that is $\alpha$-Hölder-continuous on $[a,b]$ satisfies the Dini test at every $x \in ]a,b[$. In fact, if $\delta = \delta_x := \min(|x - a|, |x - b|)$, then

$$\int_0^{\pi} \Big|\frac{f(x+t) - f(x) + f(x-t) - f(x)}{t}\Big|\,dt \le 2K\int_0^{\delta} t^{\alpha-1}\,dt + \frac{1}{\delta}\int_{\delta}^{\pi} |f(x+t) + f(x-t) - 2f(x)|\,dt < +\infty.$$

11.73 ¶. Show that the $2\pi$-periodic extension of $\sqrt{|t|}$, $t \in [-\pi,\pi]$, is 1/2-Hölder-continuous.

11.74 Example. Show that, if $f$ is continuous and satisfies the Dini test at $x$, then $L^+(x) = L^-(x) = f(x)$.

11.75 ¶. Show that the $2\pi$-periodic extension of $f(t) := 1/\log(1/|t|)$, $t \in [-\pi,\pi]$, does not satisfy the Dini test at 0.

11.5.3 $L^2$-convergence and the energy equality

a. Fourier's partial sums and orthogonality

Denote by $\|f\|_2$ the quadratic mean over a period of $f$,

$$\|f\|_2^2 := \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt,$$

and by $L^2_{2\pi}$ the space of integrable functions with $\|f\|_2 < \infty$. The Hermitian bilinear form and the corresponding "norm"

$$(f|g) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\overline{g(t)}\,dt, \qquad \|f\|_2 := \Big(\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt\Big)^{1/2}$$

are not a Hermitian product and a norm in $L^2_{2\pi}$, since $\|f\|_2 = 0$ does not imply $f(t) = 0$ $\forall t$, but they do define a Hermitian product and a norm in $L^2_{2\pi} \cap C^0(\mathbb{R})$, since $\|f\|_2 = 0$ implies $f = 0$ if $f$ is continuous. Alternatively, we may identify functions $f$ and $g$ in $L^2_{2\pi}$ if $\|f - g\|_2 = 0$, and again $(f|g)$ and $\|f\|_2$ define a Hermitian product and a norm on the equivalence classes of $L^2_{2\pi}$. As is usual, we still denote by $L^2_{2\pi}$ the space of equivalence classes. It is easily seen that $L^2_{2\pi}$ is a pre-Hilbert space with $(f|g)$. Notice that two nonidentical continuous functions belong to different equivalence classes. Since $e^{ikx}$, $k \in \mathbb{Z}$, belong to $L^2_{2\pi}$ and

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{ikx}e^{-ihx}\,dx = \delta_{hk},$$

we have the following.

11.76 Proposition. The trigonometric system $\{e^{ikx} \mid k \in \mathbb{Z}\}$ is an orthonormal system in $L^2_{2\pi}$.

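Since

$$c_k(f) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt = (f|e^{ikx}),$$

we have

$$S_nf(x) = \sum_{k=-n}^{n} (f|e^{ikx})\,e^{ikx}, \qquad x \in \mathbb{R},$$

i.e., the Fourier series of $f$ is the abstract Fourier series with respect to the trigonometric orthonormal system. Therefore the results of Section 10.1.2 apply; in particular the Bessel inequality holds,

$$\sum_{k=-\infty}^{\infty} |c_k(f)|^2 \le \|f\|_2^2,$$

as well as Proposition 10.18, in particular

$$\|f - S_nf\|_2 \le \|f - P\|_2 \qquad \forall P \in \mathcal{P}_{n,2\pi}.$$

Recall also that for a trigonometric polynomial $P \in \mathcal{P}_{n,2\pi}$ the Pythagorean theorem holds:

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |P(t)|^2\,dt = \sum_{k=-n}^{n} |c_k(P)|^2.$$

b. A first uniform convergence result

11.77 Theorem. Let $f \in C^1(\mathbb{R})$ be $2\pi$-periodic. Then $S_nf \to f$ uniformly in $\mathbb{R}$.

Proof. Since $S_nf(x) \to f(x)$ $\forall x$, it suffices to show the uniform convergence of $S_nf$. We notice that $f' \in L^2_{2\pi}$ and that, by integration by parts,

$$c_k(f') = ik\,c_k(f) \qquad \forall k \in \mathbb{Z},$$

hence, if $k \ne 0$,

$$|c_k(f)| \le \frac{|c_k(f')|}{|k|} \le |c_k(f')|^2 + \frac{1}{k^2},$$

where we have used the inequality $|ab| \le a^2 + b^2$. Since $\sum_{k=-\infty}^{\infty} |c_k(f')|^2$ converges by Bessel's inequality, we therefore conclude that $\sum_{k=-\infty}^{\infty} |c_k(f)|$ converges; consequently

$$\sum_{k=-\infty}^{\infty} c_k(f)e^{ikx}$$

converges absolutely in $C^0(\mathbb{R})$ since $\|e^{ikx}\|_{\infty,\mathbb{R}} = 1$ $\forall k$. □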

11.78 ¶. Let $f \in C^n(\mathbb{R})$ be $2\pi$-periodic and let $\{c_k\}$ be its Fourier coefficients. Show that $|k|^n|c_k| \to 0$ as $|k| \to \infty$.

For stronger results about uniform convergence of Fourier series see Section 11.5.4.

Figure 11.11. Antoni Zygmund (1900-1992) and the frontispiece of the first edition of volume I of his Trigonometric Series.

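c. Energy equality

We have, compare Chapter 9, the following.

11.79 Lemma. $C^1(\mathbb{R}) \cap L^2_{2\pi}$ is dense in $L^2_{2\pi}$.

Proof. Let $f \in L^2_{2\pi}$ and $\epsilon > 0$. There is a bounded Riemann integrable function $h_\epsilon$ with $\|f - h_\epsilon\|_2 < \epsilon$ and a step function $k_\epsilon$ in $[-\pi,\pi]$ such that $\|k_\epsilon\|_\infty \le M_\epsilon$ and $\|h_\epsilon - k_\epsilon\|_1 < (\pi\epsilon^2)/M_\epsilon$, where $M_\epsilon := \|h_\epsilon\|_\infty$; consequently

$$\|h_\epsilon - k_\epsilon\|_2^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |h_\epsilon - k_\epsilon|^2\,dt \le \frac{2M_\epsilon}{2\pi}\int_{-\pi}^{\pi} |h_\epsilon - k_\epsilon|\,dt < \epsilon^2.$$

First, approximating $k_\epsilon$ by a Lipschitz function, then smoothing the edges, we find $l_\epsilon \in C^1([-\pi,\pi])$ with $\|k_\epsilon - l_\epsilon\|_2 < \epsilon$. Finally we modify $l_\epsilon$ near $\pi$ and $-\pi$ to obtain a new function $g_\epsilon$ with $g_\epsilon(-\pi) = g_\epsilon(\pi) = g_\epsilon'(-\pi) = g_\epsilon'(\pi) = 0$. Extending $g_\epsilon$ to a periodic function in $\mathbb{R}$, we finally get $g_\epsilon \in C^1(\mathbb{R}) \cap L^2_{2\pi}$ and $\|f - g_\epsilon\|_2 < 4\epsilon$. □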

Now we can state the following.

11.80 Theorem. For every $f \in L^2_{2\pi}$ we have $\|S_nf - f\|_2 \to 0$. Therefore, the trigonometric system $\{e^{ikt}\}$, $k \in \mathbb{Z}$, is orthonormal and complete in $L^2_{2\pi}$; moreover, for any $f \in L^2_{2\pi}$ the energy equality or Parseval's identity holds:

$$\sum_{k=-\infty}^{\infty} |c_k(f)|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt.$$

Proof. Given $f \in L^2_{2\pi}$ and $\epsilon > 0$, let $g \in C^1(\mathbb{R}) \cap L^2_{2\pi}$ be such that $\|f - g\|_2 < \epsilon$. Since $S_ng$ is a trigonometric polynomial of order at most $n$, and $S_nf$ is the point of minimal $L^2_{2\pi}$ distance from $f$ among such polynomials, we have

$$\|f - S_nf\|_2 \le \|f - S_ng\|_2 \le \|f - g\|_2 + \|g - S_ng\|_2 \le \epsilon + \|g - S_ng\|_\infty,$$

and the claim follows since $\|g - S_ng\|_\infty \to 0$ as $n \to \infty$. The rest of the claim is now stated in Proposition 10.18. □

11.81 ¶. Show that, if the Fourier series of $f \in L^2_{2\pi}$ converges uniformly, then it converges to $f$. In particular, if the Fourier coefficients $c_k$ of $f$ satisfy

$$\sum_{k=-\infty}^{+\infty} |c_k| < +\infty,$$

then $f(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx}$ in the sense of uniform convergence in $\mathbb{R}$.

11.5.4 Uniform convergence

a. A variant of the Riemann-Lebesgue theorem

Let us state a variant of the Riemann-Lebesgue theorem that is also related to the Dirichlet estimate for the series of products.

11.82 Proposition (Second theorem of mean value). Let $f$ and $g$ be Riemann integrable functions in $]a,b[$. Suppose moreover that $f$ is nonnegative and nondecreasing, and denote by $M$ and $m$, respectively, the maximum and the minimum values of $x \to \int_x^b g(t)\,dt$, $x \in [a,b]$. Then we have

$$m f(b) \le \int_a^b f(t)g(t)\,dt \le M f(b).$$

In particular, there exists $c \in [a,b]$ such that

$$\int_a^b f(t)g(t)\,dt = f(b)\int_c^b g(t)\,dt.$$

Proof. Choose a constant $d$ such that $g(t) + d > 0$ in $]a,b[$. If $f$ is differentiable, the claim follows easily integrating by parts $\int_a^b f(t)(g(t)+d)\,dt$. The general case can be treated by approximation (but we have not developed the correct means yet) or using the formula of summation by parts, see Section 6.5 of [GM2]. For the reader's convenience we give the explicit computation. Let $\sigma = \{x_0 = a, x_1, \dots, x_n = b\}$ be a partition of $[a,b]$. Denote by $\Delta_k$ the interval $[x_{k-1},x_k]$, set $G(x) := \int_x^b g(t)\,dt$ and $\sigma_n := \sum_{k=1}^{n} f(x_k)(x_k - x_{k-1})$. Since $G(x_n) = G(b) = 0$, we have

$$\begin{aligned}
\int_a^b f(t)(g(t)+d)\,dt &= \sum_{k=1}^{n} \int_{\Delta_k} f(t)(g(t)+d)\,dt \le \sum_{k=1}^{n} f(x_k)\bigl(G(x_{k-1}) - G(x_k)\bigr) + d\,\sigma_n \\
&= f(x_1)G(x_0) + \sum_{k=1}^{n-1} G(x_k)\bigl(f(x_{k+1}) - f(x_k)\bigr) + d\,\sigma_n \\
&\le M\Bigl(f(x_1) + \sum_{k=1}^{n-1} \bigl(f(x_{k+1}) - f(x_k)\bigr)\Bigr) + d\,\sigma_n \\
&= M f(b) + d\,\sigma_n.
\end{aligned}$$

Since $\sigma_n \to \int_a^b f(t)\,dt$ as the mesh of the partition tends to zero, we infer

$$\int_a^b f(t)g(t)\,dt \le M f(b).$$

Similarly, we get $\int_a^b f(t)g(t)\,dt \ge m f(b)$. The second part of the claim follows from the intermediate value theorem since $x \to \int_x^b g(t)\,dt$ is continuous. □

Figure 11.12. Ulisse Dini (1845-1918) and the frontispiece of his Serie di Fourier.

From the Riemann-Lebesgue lemma, see Exercise 11.66, for any $f \in L^1_{2\pi}$ and $\delta > 0$ we have for every fixed $x$

$$\int_\delta^\pi f(x+t)D_n(t)\,dt \to 0 \qquad \text{as } n \to \infty.$$

For future use we prove the following.

11.83 Proposition. Let $f \in L^1_{2\pi}$ and $\delta > 0$. Then

$$\int_\delta^\pi f(x+t)D_n(t)\,dt \to 0 \qquad \text{uniformly in } x \in \mathbb{R} \text{ as } n \to \infty.$$

Proof. Since $1/\sin(t/2)$ is decreasing in $]0,\pi]$, the second theorem of mean value yields $\xi = \xi(x) \in [\delta,\pi]$ such that

$$\int_\delta^\pi f(x+t)D_n(t)\,dt = \frac{1}{\sin(\delta/2)}\int_\delta^\xi f(x+t)\sin((n+1/2)t)\,dt.$$

On the other hand,

$$\begin{aligned}
\int_\delta^\xi f(x+t)\sin((n+1/2)t)\,dt &= \int_{\delta+x}^{\xi+x} f(t)\sin((n+1/2)(t-x))\,dt \\
&= \cos((n+1/2)x)\int_{\delta+x}^{\xi+x} f(t)\sin((n+1/2)t)\,dt - \sin((n+1/2)x)\int_{\delta+x}^{\xi+x} f(t)\cos((n+1/2)t)\,dt,
\end{aligned}$$

and the last two integrals converge uniformly to zero in $[-\pi,\pi]$, see Exercise 11.62. Thus $\int_\delta^\pi f(x+t)D_n(t)\,dt \to 0$ uniformly in $[-\pi,\pi]$, hence in $\mathbb{R}$. □

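b. Uniform convergence for Dini-continuous functions

Let $f \in C^{0,\alpha}(\mathbb{R}) \cap L^2_{2\pi}$ be a $2\pi$-periodic and $\alpha$-Hölder-continuous function. It is easy to see that $f$ is continuous and Dini-regular at every $x \in \mathbb{R}$. In fact, if $K$ denotes the Hölder constant of $f$, then for any $\delta \in ]0,\pi[$

$$\int_0^\pi \frac{|f(x+t) + f(x-t) - 2f(x)|}{t}\,dt \le 2K\int_0^\delta t^{-1+\alpha}\,dt + \int_\delta^\pi \frac{|f(x+t) + f(x-t) - 2f(x)|}{t}\,dt < +\infty.$$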

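Therefore $S_nf(x) \to f(x)$ $\forall x \in \mathbb{R}$ by the Dini test theorem, Theorem 11.71. We have the following.

11.84 Theorem. If $f$ is $2\pi$-periodic and of class $C^{0,\alpha}(\mathbb{R})$, $0 < \alpha \le 1$, then $S_nf(x) \to f(x)$ uniformly in $\mathbb{R}$.

Proof. Let $\delta > 0$ to be chosen later. We have

$$S_nf(x) - f(x) = \frac{1}{2\pi}\int_{|t| \le \delta} (f(x+t) - f(x))D_n(t)\,dt + \frac{1}{2\pi}\int_{\delta \le |t| \le \pi} (f(x+t) - f(x))D_n(t)\,dt =: I_1(\delta,n,x) + I_2(\delta,n,x).$$

Let $\epsilon > 0$. Since $f$ is $\alpha$-Hölder-continuous there exists $K > 0$ such that

$$|f(x+t) - f(x)| \le K|t|^\alpha \qquad \forall x \in \mathbb{R},\ \forall t \in [-\pi,\pi],$$

hence, using $\sin(t/2) \ge t/\pi$ in $[0,\pi]$,

$$|I_1(\delta,n,x)| \le \frac{K}{\pi}\int_0^\delta t^\alpha\,\frac{|\sin((n+1/2)t)|}{\sin(t/2)}\,dt \le K\int_0^\delta t^{-1+\alpha}\,dt = \frac{K}{\alpha}\,\delta^\alpha.$$

We can therefore choose $\delta$ in such a way that $|I_1(\delta,n,x)| < \epsilon$ uniformly with respect to $x$ and $n$. On the other hand $|I_2(\delta,n,x)| < \epsilon$ uniformly with respect to $x$ for $n$ sufficiently large by Proposition 11.83, concluding that

$$|S_nf(x) - f(x)| < 2\epsilon \qquad \text{uniformly in } x$$

for $n$ sufficiently large. □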

With the same proof we also infer the following.

11.85 Theorem (Dini's test). Let $f \in C^0(\mathbb{R}) \cap L^1_{2\pi}$ be a $2\pi$-periodic and continuous function with modulus of continuity $\omega(\delta)$, $|f(x) - f(y)| \le \omega(\delta)$ if $|x - y| \le \delta$, such that $\omega(\delta)/\delta$ is summable in a neighborhood of $\delta = 0$. Then $S_nf \to f$ uniformly in $\mathbb{R}$.

c. Riemann's localization principles

The convergence of Fourier's partial sums is a local property in the following sense.

11.86 Proposition. If $g, h \in L^1_{2\pi}$ and $g = h$ in a neighborhood of a point $x$, then $S_ng(x) - S_nh(x) \to 0$ as $n \to \infty$.

Proof. Assume $f := g - h$ vanishes in $[x-\delta, x+\delta]$, $\delta > 0$. Then, for every $t \in [0,\delta]$ we have $f(x+t) = f(x-t) = 0$, hence

$$S_nf(x) - f(x) = \frac{1}{2\pi}\int_\delta^\pi (f(x+t) + f(x-t))D_n(t)\,dt.$$

Since $(f(x+t) + f(x-t))/\sin(t/2)$ is summable in $(\delta,\pi)$, the result follows from the Riemann-Lebesgue theorem. □

11.87 Proposition. If $f \in L^1_{2\pi}$ and $f = 0$ in $]a,b[$, then $S_nf(x) \to 0$ uniformly on every interval $[c,d]$ with $a < c < d < b$.

Proof. Let us show that $S_nf(x) \to 0$ uniformly in $[a+\delta, b-\delta]$, $0 < \delta < (b-a)/2$. For $x \in [a+\delta, b-\delta]$ and $0 < t < \delta$ we have $f(x+t) = f(x-t) = 0$, hence

$$S_nf(x) = \frac{1}{2\pi}\int_\delta^\pi (f(x+t) + f(x-t))D_n(t)\,dt.$$

The claim follows from Proposition 11.83. □

The localization principle says that, when studying the pointwise convergence in an open interval $]a,b[$ or the uniform convergence in a closed interval inside $]a,b[$ of the Fourier series of a function $f$, we can modify $f$ outside of $]a,b[$. With this observation we easily get the following.

11.88 Corollary. Let $f \in L^1_{2\pi}$ be a function that is of class $C^1([a,b])$. Then $\{S_nf(x)\}$ converges uniformly to $f(x)$ in any closed interval strictly contained in $]a,b[$.

11.5.5 A few complementary facts

a. The primitive of the Dirichlet kernel

Denote by $G_n(x)$ the primitive of the Dirichlet kernel,

$$G_n(x) := \int_0^x D_n(t)\,dt.$$

It is easy to realize that $G_n(x)$ is odd, nonnegative in $[0,\pi]$, and takes its maximum value in $[0,\pi]$ at the first zero $x_n := \frac{2\pi}{2n+1}$ of $D_n$. Thus

$$G_n(x) \le G_n(x_n) = \int_0^{2\pi/(2n+1)} \frac{\sin((n+1/2)t)}{\sin(t/2)}\,dt = \frac{2}{2n+1}\int_0^\pi \frac{\sin s}{\sin(s/(2n+1))}\,ds \le 2\pi$$

independently of $n$; in particular,

$$\Bigl|\int_c^d D_n(t)\,dt\Bigr| \le 2\pi \qquad (11.42)$$

for all $c, d \in [0,\pi]$. Also, by Exercise 11.66, or directly by an integration by parts, it is easily seen that, given any $\delta > 0$, there is a constant $c(\delta)$ such that

$$|G_n(\pi) - G_n(x)| = \Bigl|\int_x^\pi D_n(t)\,dt\Bigr| \le \frac{c(\delta)}{n} \qquad (11.43)$$

for all $x \in [\delta,\pi]$. For future use we now show that

$$\lim_{n\to\infty} G_n(x_n) = 2\int_0^\pi \frac{\sin s}{s}\,ds. \qquad (11.44)$$

Figure 11.13. The graph of $G_5(x)$ in $[-\pi,\pi]$.

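In order to do that, we first notice that

$$\frac{2}{\pi}\,t \le \sin t \quad \text{and} \quad 0 \le t - \sin t \le \frac{t^3}{6}, \qquad \text{i.e.,} \qquad 0 \le \frac{1}{\sin t} - \frac{1}{t} \le \frac{\pi}{12}\,t, \qquad t \in ]0,\pi/2],$$

hence

$$\Bigl|\int_0^{x_n} D_n(t)\,dt - \int_0^{x_n} \frac{\sin((n+1/2)t)}{t/2}\,dt\Bigr| \le \frac{\pi}{48}\Bigl(\frac{2\pi}{2n+1}\Bigr)^2 \to 0 \qquad (11.45)$$

as $n \to \infty$. Equality (11.44) then follows as

$$\int_0^{x_n} \frac{\sin((n+1/2)t)}{t/2}\,dt = \frac{2}{2n+1}\int_0^\pi \frac{\sin s}{s/(2n+1)}\,ds = 2\int_0^\pi \frac{\sin s}{s}\,ds. \qquad (11.46)$$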
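The limit in (11.44) and the bound (11.42) can be observed numerically. The sketch below is an illustration under stated assumptions: it uses the closed form $G_n(x) = x + 2\sum_{k=1}^n \frac{\sin kx}{k}$, obtained by integrating $D_n(t) = 1 + 2\sum_{k=1}^n \cos kt$ term by term, and a composite Simpson rule for $\int_0^\pi \frac{\sin s}{s}\,ds$; the function names and the constant $\pi/48$ in the comparison are choices made here, not taken from the text.

```python
import math


def G(n, x):
    """Primitive of the Dirichlet kernel D_n(t) = 1 + 2*sum_{k=1}^n cos(kt), from 0 to x."""
    return x + 2.0 * sum(math.sin(k * x) / k for k in range(1, n + 1))


def first_zero(n):
    """First positive zero x_n = 2*pi/(2n+1) of D_n."""
    return 2.0 * math.pi / (2 * n + 1)


def sine_integral_pi(intervals=2000):
    """Composite Simpson approximation of int_0^pi sin(s)/s ds (intervals must be even)."""
    f = lambda s: 1.0 if s == 0.0 else math.sin(s) / s
    h = math.pi / intervals
    acc = f(0.0) + f(math.pi)
    for j in range(1, intervals):
        acc += (4.0 if j % 2 else 2.0) * f(j * h)
    return acc * h / 3.0


LIMIT = 2.0 * sine_integral_pi()

# Gap between G_n(x_n) and its limit, for a few orders n.
gaps = {n: G(n, first_zero(n)) - LIMIT for n in (1, 5, 50, 500)}
```

For each `n` the gap should be nonnegative and shrink like $x_n^2$, in agreement with the estimate preceding (11.46), while $G_n(x_n)$ stays below $2\pi$.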

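Figure 11.14. The sawtooth $h(x)$ and its Fourier partial sum of order 5 in $[-\pi,\pi]$.

b. Gibbs's phenomenon

Consider the $2\pi$-periodic function $h$ defined by periodically extending the function

$$h(t) := \begin{cases} -\pi - t & \text{if } -\pi \le t < 0, \\ 0 & \text{if } t = 0, \\ \pi - t & \text{if } 0 < t \le \pi. \end{cases} \qquad (11.47)$$

Its Fourier coefficients are easily computed to be

$$c_0 = 0, \qquad c_k = \frac{1}{ik}, \quad k \ne 0,$$

hence

$$S_nh(x) = \sum_{\substack{k=-n \\ k \ne 0}}^{n} \frac{e^{ikx}}{ik} = \int_0^x D_n(t)\,dt - x \qquad \text{or} \qquad S_nh(x) = 2\sum_{k=1}^{n} \frac{\sin kx}{k}. \qquad (11.48)$$

In particular $S_nh(0) = 0$ $\forall n$, and, by Dini's test, Theorem 11.71,

$$2\sum_{k=1}^{\infty} \frac{\sin kx}{k} = h(x) \qquad \text{pointwise in } \mathbb{R}.$$

The energy equality yields

$$\sum_{\substack{k=-\infty \\ k \ne 0}}^{\infty} \frac{1}{k^2} = \frac{1}{2\pi}\int_{-\pi}^{\pi} |h(t)|^2\,dt = \frac{\pi^2}{3} \qquad \text{or} \qquad \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}.$$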

As we have already seen, we have the following, of which we give a direct proof.

11.89 Theorem. For any $\delta > 0$ the Fourier series of $h$ converges uniformly to $h$ in $[\delta,\pi]$.

Proof. We know that $S_nh(x)$ converges pointwise to $h$, therefore it suffices to show that

$$\sum_{k \ne 0} \frac{e^{ikx}}{ik} \qquad (11.50)$$

converges uniformly in $[\delta,\pi]$. We apply Dirichlet's theorem for series of products, see Section 6.5 of [GM2], respectively, to the series with positive and negative indices with $a_k = 1/(ik)$ and $b_k := e^{ikx}$, to find that

$$|S_nh(x) - h(x)| = \Bigl|\sum_{|k| \ge n+1} \frac{e^{ikx}}{ik}\Bigr| \le \frac{4}{n+1}\,\frac{1}{|1 - e^{ix}|},$$

hence

$$\|S_nh - h\|_{\infty,[\delta,\pi]} \le \frac{4}{n+1}\,\frac{1}{|1 - e^{i\delta}|} \to 0 \qquad \text{as } n \to \infty.$$

Alternatively, from (11.48) we infer

$$S_nh(x) - h(x) = \int_0^x D_n(t)\,dt - \pi = -\int_x^\pi D_n(s)\,ds$$

and, by (11.43),

$$|S_nh(x) - h(x)| \le \frac{c(\delta)}{n} \qquad \text{uniformly in } [\delta,\pi].$$ □

However the Fourier series of $h$ does not converge uniformly in $[0,\pi]$.

11.90 Proposition. We have

$$\|S_nh\|_{\infty,[0,\pi]} \to 2\int_0^\pi \frac{\sin s}{s}\,ds.$$

Proof. Let $y_n$ be the point where $S_nh(x)$ attains its maximum value in $[0,\pi]$,

$$M_n := \sup_{[0,\pi]} S_nh(x) = S_nh(y_n),$$

and let $x_n := \frac{2\pi}{2n+1}$. Since $x_n$ is the maximum point of $G_n(x)$, we have

$$G_n(y_n) - x_n \le G_n(x_n) - x_n = S_nh(x_n) \le S_nh(y_n) = G_n(y_n) - y_n \le G_n(x_n) - y_n.$$

This implies $0 \le y_n \le x_n$ and $-x_n \le S_nh(y_n) - G_n(x_n) \le -y_n$, hence

$$|M_n - G_n(x_n)| \le x_n = \frac{2\pi}{2n+1}. \qquad (11.51)$$

The conclusion follows from (11.44). □

We can rewrite the statement in Proposition 11.90 as

$$\|S_nh\|_{\infty,[0,\pi]} \to \Bigl(\frac{2}{\pi}\int_0^\pi \frac{\sin s}{s}\,ds\Bigr)\,\|h\|_{\infty,[0,\pi]}.$$

Since

$$\frac{2}{\pi}\int_0^\pi \frac{\sin s}{s}\,ds = 1.178979\ldots$$

we see that, while $S_nh(x) \to h(x)$ for all $x \in \mathbb{R}$, near 0 the partial sum $S_nh(x)$ always has a maximum which stays away from the maximum of $h$, that is $\|h\|_{\infty,[0,\pi]} = \pi$, by a positive quantity of about 9% of the jump $2\pi$ of $h$ at 0: this is the Gibbs phenomenon, which is in fact typical of Fourier series at jump points; but we shall not enter into this subject.
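The overshoot can be observed directly. The following sketch, under the assumption that the partial sums of the sawtooth are $S_nh(x) = 2\sum_{k=1}^n \frac{\sin kx}{k}$, evaluates $S_nh$ on a fine grid of $[0, 2x_n]$ with $x_n = 2\pi/(2n+1)$ and compares its maximum with $2\int_0^\pi \frac{\sin s}{s}\,ds$, computed here with a composite Simpson rule. The order $n = 200$ and the grid size are arbitrary illustrative choices.

```python
import math


def partial_sum(n, x):
    """Fourier partial sum S_n h(x) = 2 * sum_{k=1}^n sin(kx)/k of the sawtooth h."""
    return 2.0 * sum(math.sin(k * x) / k for k in range(1, n + 1))


def sine_integral_pi(intervals=2000):
    """Composite Simpson approximation of int_0^pi sin(s)/s ds (intervals must be even)."""
    f = lambda s: 1.0 if s == 0.0 else math.sin(s) / s
    h = math.pi / intervals
    acc = f(0.0) + f(math.pi)
    for j in range(1, intervals):
        acc += (4.0 if j % 2 else 2.0) * f(j * h)
    return acc * h / 3.0


n = 200
x_n = 2.0 * math.pi / (2 * n + 1)

# The maximum of S_n h on [0, pi] is attained in [0, x_n]; sample [0, 2*x_n].
grid = [2.0 * x_n * j / 400 for j in range(401)]
M_n = max(partial_sum(n, x) for x in grid)

limit = 2.0 * sine_integral_pi()   # limit of the sup norms of S_n h on [0, pi]
ratio = limit / math.pi            # overshoot factor relative to max h = pi
```

By (11.51) the computed maximum `M_n` differs from `limit` by at most about $x_n$, so it stays well above $\pi$ however large `n` is taken.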

11.5.6 The Dirichlet-Jordan theorem

The pointwise convergence of the Fourier series of a continuous or summable function is a subtle question and goes far beyond Dini's test, Theorem 11.71. An important result, proved by Lejeune Dirichlet (1805-1859), shows in fact that a $2\pi$-periodic function which has only a finite number of jumps and maxima and minima has a Fourier series that converges pointwise to $(L^+ + L^-)/2$, where $L^\pm := \lim_{y \to x^\pm} f(y)$; in particular $S_nf(x) \to f(x)$ at the points of continuity. The same proof applies to functions with bounded variation, see Theorem 11.91.

In 1876 Paul du Bois-Reymond (1831-1889) exhibited a continuous function whose Fourier series diverges at one point, thereby showing that continuity alone does not suffice for the pointwise convergence of the Fourier series. We shall present a different example due to Lipot Fejér (1880-1959). Starting from this example one can construct continuous functions whose Fourier series do not converge in a denumerable dense set, for instance, the rationals. In the 1920's Andrey Kolmogorov (1903-1987) exhibited a continuous function with Fourier series divergent on a set with the power of the continuum, and Hugo Steinhaus (1887-1972) exhibited a continuous function whose Fourier series converges pointwise everywhere, but does not converge uniformly in any interval. Eventually, the question was clarified in 1966 by Lennart Carleson. Here we collect some complements.

a. The Dirichlet-Jordan test

11.91 Theorem (Dirichlet-Jordan). Let $f$ be a $2\pi$-periodic function with bounded total variation in $[a,b]$.
(i) For every $x \in ]a,b[$ we have $S_nf(x) \to (L^+ + L^-)/2$ where $L^\pm := \lim_{y \to x^\pm} f(y)$.
(ii) If $f$ is also continuous in $]a,b[$, then $S_nf(x) \to f(x)$ uniformly in any closed interval strictly contained in $]a,b[$.

Figure 11.15. The amplitude of the harmonics of $Q_{n,\mu}(x)$ (horizontal axis marks: $\mu$, $\mu + n$, $\mu + 2n$).

Proof. Let $[a,b]$ be an interval with $b - a < 2\pi$. Since every function with bounded variation in $[a,b]$ is the sum of an increasing function and of a decreasing function, we may also assume that $f$ is nondecreasing in $[a,b]$.

(i) Let $x \in ]a,b[$, $g_x(t) := f(x+t) - L^+ + f(x-t) - L^-$ where $L^\pm := \lim_{y \to x^\pm} f(y)$. We have

$$\begin{aligned}
S_nf(x) - \frac{L^+ + L^-}{2} &= \frac{1}{2\pi}\int_0^\pi \bigl(f(x+t) - L^+ + f(x-t) - L^-\bigr)D_n(t)\,dt \\
&= \frac{1}{2\pi}\int_0^\delta g_x(t)D_n(t)\,dt + \frac{1}{2\pi}\int_\delta^\pi g_x(s)D_n(s)\,ds \\
&=: I_1 + I_2, \qquad (11.52)
\end{aligned}$$

where $\delta > 0$ is to be chosen later. Since $f(x+t) - L^+$ and $-(f(x-t) - L^-)$ are nondecreasing near $t = 0$ and nonnegative, the second theorem of mean value and (11.42) yield

$$|I_1| \le 2\pi\bigl(|f(x+\delta) - L^+| + |f(x-\delta) - L^-|\bigr), \qquad (11.53)$$

while (11.43) yields

$$|I_2| \le \frac{c(\delta)}{n}. \qquad (11.54)$$

Therefore, given $\epsilon > 0$, we can choose $\delta > 0$ in such a way that

$$|f(x+\delta) - L^+| + |f(x-\delta) - L^-| < \epsilon$$

to obtain from (11.53) and (11.54) that

$$\Bigl|S_nf(x) - \frac{L^+ + L^-}{2}\Bigr| < 2\pi\epsilon + \frac{c(\delta)}{n}.$$

That proves the pointwise convergence at $x$.

(ii) In this case for every $x \in ]a,b[$ we have $L^+ = L^- = f(x)$ and it suffices to estimate $I_1$ and $I_2$ in (11.52) uniformly in $[a+\sigma, b-\sigma]$, $0 < \sigma < (b-a)/2$. Since $f$ is uniformly continuous in $[a+\sigma/2, b-\sigma/2]$, given $\epsilon > 0$, we can choose $\delta$, $0 < \delta < \sigma/2$, in such a way that $|f(x+\delta) - f(x)| + |f(x-\delta) - f(x)| < \epsilon$ uniformly with respect to $x$ in $[a+\sigma, b-\sigma]$, hence from (11.53) $|I_1| \le 2\pi\epsilon$ uniformly in $[a+\sigma, b-\sigma]$. The uniform estimate of $|I_2|$ is instead the claim of Proposition 11.83.

Finally, if $b - a \ge 2\pi$, it suffices to write $[a,b]$ as a finite union of intervals of length less than $2\pi$ and apply the above to them. □

11.92 Remark. Notice that the Dirichlet-Jordan theorem is in fact a claim on monotone functions. Monotone functions are continuous except on a denumerable set of jump points, that is not necessarily discrete.


b. Fejér example

Let $\mu \in \mathbb{N}$ be a natural number to be chosen later. For every $n \in \mathbb{N}$ consider the trigonometric polynomial

$$Q_{n,\mu}(x) := \sum_{k=1}^{n} \frac{\cos((n+\mu-k)x) - \cos((n+\mu+k)x)}{k} = 2\sin((n+\mu)x)\sum_{k=1}^{n} \frac{\sin kx}{k},$$

see Figure 11.15. It is a cosine polynomial with harmonics of order $\mu, \mu+1, \dots, n+\mu-1, n+\mu+1, \dots, 2n+\mu$. Now choose
o a sequence $\{a_k\}$ of positive numbers in such a way that $\sum_{k=1}^{\infty} a_k < +\infty$,
o a sequence $\{n_k\}$ of nonnegative integers such that $a_k \log n_k$ does not converge to zero,
o a sequence $\{\mu_k\}$ of nonnegative integers such that $\mu_{k+1} > \mu_k + 2n_k$,
and set

$$Q_k(x) := Q_{n_k,\mu_k}(x).$$

Since the sums $\sum_{k=1}^{n} \frac{\sin kx}{k}$ are equibounded, see (11.42) and (11.48), the polynomials $Q_{n,\mu}(x)$ are equibounded independently of $n, \mu \in \mathbb{N}$ and $x \in \mathbb{R}$. Consequently $\sum_{k=1}^{\infty} a_kQ_k(x)$ converges absolutely in $C^0(\mathbb{R})$ to a continuous function

$$f(x) := \sum_{k=1}^{\infty} a_kQ_k(x), \qquad x \in \mathbb{R},$$

which is $2\pi$-periodic and even, being a sum of cosines. The Fourier series of $f$ is then a cosine series

$$Sf(x) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k\cos(kx).$$

We now show that $S_nf(0)$ has no limit as $n \to \infty$. Since $f$ is a uniform limit, we can integrate term by term to get the Fourier coefficients

$$c_j = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(jt)\,dt = \sum_{k=1}^{\infty} \frac{a_k}{\pi}\int_{-\pi}^{\pi} Q_k(t)\cos(jt)\,dt.$$

Because of the choice of the $\mu_k$, the harmonics of $Q_k$ and $Q_h$, $h \ne k$, are distinct; in particular

$$\sum_{j=\mu_k}^{\mu_k+n_k-1} c_j = a_k\sum_{j=1}^{n_k} \frac{1}{j} > a_k\log n_k.$$

Consequently, we deduce for the Fourier partial sums of $f$ at 0

$$S_{\mu_k+n_k-1}f(0) - S_{\mu_k-1}f(0) = \sum_{j=\mu_k}^{\mu_k+n_k-1} c_j > a_k\log n_k.$$

Therefore $S_nf(0)$ does not converge, because of our choice of $\{n_k\}$. A possible choice of the previous constants is

$$a_k := \frac{1}{k^2}, \qquad n_k = 2^{k^2}, \qquad \mu_k = 2^{k^2},$$

which yields $a_k\log(n_k) = \log 2$.
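The two elementary facts behind the Fejér example can be checked numerically: the product form of $Q_{n,\mu}$, which follows from $\cos(A-B) - \cos(A+B) = 2\sin A\sin B$, and the lower bound $\sum_{j=1}^{n} 1/j > \log n$. The sketch below is only illustrative; the function names are hypothetical, and the constants $a_k = 1/k^2$, $n_k = 2^{k^2}$ are taken as one admissible choice satisfying $a_k\log n_k = \log 2$, which is an assumption of this example.

```python
import math


def Q_cos(n, mu, x):
    """Q_{n,mu}(x) written as a difference of cosines."""
    return sum(
        (math.cos((n + mu - k) * x) - math.cos((n + mu + k) * x)) / k
        for k in range(1, n + 1)
    )


def Q_sin(n, mu, x):
    """Q_{n,mu}(x) written as 2*sin((n+mu)x) * sum_{k=1}^n sin(kx)/k."""
    return 2.0 * math.sin((n + mu) * x) * sum(
        math.sin(k * x) / k for k in range(1, n + 1)
    )


def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / j for j in range(1, n + 1))


# Assumed constants: a_k = 1/k^2 and n_k = 2^(k^2), so that a_k*log(n_k) = log 2.
products = [math.log(2 ** (k * k)) / (k * k) for k in range(1, 6)]
```

Since `harmonic(n_k)` exceeds `log(n_k)`, the block of coefficients contributed by $Q_k$ adds more than $\log 2$ to the partial sums at 0 for every $k$, which is what prevents convergence.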

Figure 11.16. Paul du Bois-Reymond (1831-1889) and Lipot Fejer (1880-1959).

11.5.7 Fejér's sums

Let $f$ be a continuous and $2\pi$-periodic function. The Fourier partial sums of $f$ need not provide a good approximation of $f$, neither uniformly nor pointwise; on the other hand $f$ can be approximated uniformly by trigonometric polynomials, see Theorem 9.58. A specific interesting approximation was pointed out by Lipot Fejér (1880-1959). Let $f \in L^1_{2\pi}$ and $S_nf(x) = \sum_{k=-n}^{n} c_ke^{ikx}$. Fejér's sums of $f$ are defined by

$$F_nf(x) := \frac{1}{n+1}\sum_{k=0}^{n} S_kf(x).$$

Trivially $F_nf(x)$ are trigonometric polynomials of order $n$ that can be written as

$$F_nf(x) = \frac{1}{n+1}\sum_{k=0}^{n}\sum_{j=-k}^{k} c_je^{ijx} = \sum_{j=-n}^{n} \Bigl(1 - \frac{|j|}{n+1}\Bigr)c_je^{ijx}.$$

We have

11.93 Theorem (Fejér). Let $f \in L^1_{2\pi} \cap C^0(\mathbb{R})$. The Fejér sums $F_nf(x)$ converge to $f$ uniformly in $\mathbb{R}$.

Before proving Fejér's theorem, let us state a few properties of the Fejér kernel defined by

$$F_n(x) := \frac{1}{n+1}\sum_{k=0}^{n} D_k(x),$$

where $D_k$ denotes the Dirichlet kernel of order $k$.

11.94 Proposition. We have

$$F_n(x) = \begin{cases} n+1 & \text{if } x = 2k\pi,\ k \in \mathbb{Z}, \\[4pt] \dfrac{1}{n+1}\Bigl(\dfrac{\sin((n+1)x/2)}{\sin(x/2)}\Bigr)^2 & \text{otherwise.} \end{cases}$$

Proof. Trivially

$$F_n(0) = \frac{1}{n+1}\sum_{k=0}^{n} D_k(0) = \frac{1}{n+1}\sum_{k=0}^{n} (2k+1) = \frac{(n+1)^2}{n+1} = n+1.$$

Observing that, for $x \ne 2k\pi$,

$$F_n(x) = \frac{1}{n+1}\,\frac{\sin(x/2) + \dots + \sin((n+1/2)x)}{\sin(x/2)}$$

and that the numerator is the imaginary part of

$$e^{ix/2} + e^{i3x/2} + \dots + e^{i(2n+1)x/2} = \frac{e^{ix/2}\bigl(e^{i(n+1)x} - 1\bigr)}{e^{ix} - 1} = e^{i(n+1)x/2}\,\frac{2i\sin((n+1)x/2)}{2i\sin(x/2)} = e^{i(n+1)x/2}\,\frac{\sin((n+1)x/2)}{\sin(x/2)},$$

we see that

$$F_n(x) = \frac{1}{n+1}\Bigl(\frac{\sin((n+1)x/2)}{\sin(x/2)}\Bigr)^2.$$ □
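The closed form of the Fejér kernel can be compared numerically with its definition as an average of Dirichlet kernels. This is a minimal sketch assuming the normalization $D_k(x) = 1 + 2\sum_{j=1}^{k}\cos(jx)$ (so that $D_k(0) = 2k+1$, as used in the proof above); the function names and the tolerance used to detect $x = 2k\pi$ are choices made here.

```python
import math


def dirichlet(k, x):
    """Dirichlet kernel D_k(x) = 1 + 2*sum_{j=1}^k cos(jx)."""
    return 1.0 + 2.0 * sum(math.cos(j * x) for j in range(1, k + 1))


def fejer_average(n, x):
    """Fejer kernel from its definition: the mean of D_0, ..., D_n."""
    return sum(dirichlet(k, x) for k in range(n + 1)) / (n + 1)


def fejer_closed(n, x):
    """Fejer kernel from the closed form of Proposition 11.94."""
    s = math.sin(x / 2.0)
    if abs(s) < 1e-12:          # x is (numerically) a multiple of 2*pi
        return float(n + 1)
    return (math.sin((n + 1) * x / 2.0) / s) ** 2 / (n + 1)


points = (0.0, 0.3, 1.0, 2.5, math.pi)
max_gap = max(
    abs(fejer_average(n, x) - fejer_closed(n, x))
    for n in (0, 3, 10)
    for x in points
)
```

The closed form also makes the nonnegativity of $F_n$ evident, a property that the Dirichlet kernels themselves do not have.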

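11.95 Proposition. Fejér's kernel has the following properties:
(i) $F_n(x) \ge 0$,
(ii) $F_n(x)$ is even,
(iii) $\frac{1}{2\pi}\int_{-\pi}^{\pi} F_n(t)\,dt = 1$,
(iv) $F_n(x)$ attains its maximum value at $2k\pi$, $k \in \mathbb{Z}$,
(v) for all $\delta > 0$, $F_n(x) \to 0$ uniformly in $[\delta,\pi]$ as $n \to \infty$,
(vi) there exists a constant $A > 0$ such that $F_n(x) \le \frac{A}{(n+1)x^2}$ for all $n \in \mathbb{N}$ and $x \ne 0$ in $[-\pi,\pi]$,
(vii) $\{F_n\}$ is an approximation of the Dirac mass $\delta$.

Proof. (i), (ii), (iii), (iv), (v) are trivial; (vi) follows from the estimate $\sin t \ge 2t/\pi$ in $]0,\pi/2]$. Finally (vii) follows from (iii) and (v). □

Proof of Fejér's theorem, Theorem 11.93. First we observe that

$$F_nf(x) - f(x) = \frac{1}{2\pi}\int_0^\pi \bigl(f(x+t) + f(x-t) - 2f(x)\bigr)F_n(t)\,dt.$$

Thus, if we set $g(t) := f(x+t) + f(x-t) - 2f(x)$,

$$F_nf(x) - f(x) = \frac{1}{2\pi}\int_0^\delta g(t)F_n(t)\,dt + \frac{1}{2\pi}\int_\delta^\pi g(t)F_n(t)\,dt =: I_1 + I_2.$$

Now, given $\epsilon > 0$, we can choose $\delta$ so that $|f(x+t) + f(x-t) - 2f(x)| < 2\epsilon$ for all $t \in [0,\delta]$ uniformly in $x$, since $f$ is uniformly continuous. Hence, by (iii),

$$|I_1| \le \frac{2\epsilon}{2\pi}\int_0^\delta F_n(t)\,dt \le \frac{2\epsilon}{2\pi}\int_0^\pi F_n(t)\,dt = \epsilon.$$

On the other hand, by (vi), $|I_2| \le 4\|f\|_\infty A/((n+1)\delta^2)$, hence

$$|F_nf(x) - f(x)| \le \epsilon + 4\|f\|_\infty\,\frac{A}{(n+1)\delta^2}.$$

The right-hand side does not depend on $x$ and is smaller than $2\epsilon$ for $n$ sufficiently large, which proves the uniform convergence. □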
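The smoothing effect of Fejér's sums can be seen on the sawtooth with Fourier partial sums $2\sum_{k=1}^{n}\frac{\sin kx}{k}$. That function is not continuous, so Fejér's theorem does not apply to it as stated; the sketch below only illustrates that, the Fejér kernel being nonnegative with mean value 1, the Fejér sums never exceed $\sup|h| = \pi$, whereas the Fourier partial sums overshoot. The order $n = 50$ and the grid are arbitrary illustrative choices.

```python
import math


def fourier_sum(n, x):
    """Fourier partial sum of the sawtooth: 2 * sum_{k=1}^n sin(kx)/k."""
    return 2.0 * sum(math.sin(k * x) / k for k in range(1, n + 1))


def fejer_sum(n, x):
    """Fejer sum: the k-th harmonic is damped by the factor 1 - k/(n+1)."""
    return 2.0 * sum(
        (1.0 - k / (n + 1)) * math.sin(k * x) / k for k in range(1, n + 1)
    )


n = 50
grid = [math.pi * j / 2000 for j in range(2001)]   # uniform grid of [0, pi]

max_fourier = max(fourier_sum(n, x) for x in grid)
max_fejer = max(fejer_sum(n, x) for x in grid)
```

On this grid `max_fourier` exceeds $\pi$ while `max_fejer` does not, consistently with properties (i) and (iii) of the Fejér kernel.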

A. Mathematicians and Other Scientists

Maria Agnesi (1718-1799) Pavel Alexandroff (1896-1982) James Alexander (1888-1971) Archimedes of Syracuse (287BC-212BC) Cesare Arzela (1847-1912) Giulio Ascoli (1843-1896) Rene-Louis Baire (1874-1932) Stefan Banach (1892-1945) Isaac Barrow (1630-1677) Giusto Bellavitis (1803-1880) Daniel Bernoulli (1700-1782) Jacob Bernoulli (1654-1705) Johann Bernoulli (1667-1748) Sergei Bernstein (1880-1968) Wilhelm Bessel (1784-1846) Jacques Binet (1786-1856) George Birkhoff (1884-1944) Bernhard Bolzano (1781-1848) Emile Borel (1871-1956) Karol Borsuk (1905-1982) L. E. Brouwer (1881-1966) Renato Caccioppoli (1904-1959) Georg Cantor (1845-1918) Alfredo Capelli (1855-1910) Lennart Carleson (1928- ) Lazare Carnot (1753-1823) Elie Cartan (1869-1951) Giovanni Cassini (1625-1712) Augustin-Louis Cauchy (1789-1857) Arthur Cayley (1821-1895) Eduard Cech (1893-1960) Pafnuty Chebyshev (1821-1894) Richard Courant (1888-1972) Gabriel Cramer (1704-1752) Jean d'Alembert (1717-1783) Georges de Rham (1903-1990) Richard Dedekind (1831-1916) Rene Descartes (1596-1650) Ulisse Dini (1845-1918) Diocles (240BC-180BC) Paul Dirac (1902-1984) Lejeune Dirichlet (1805-1859) Paul du Bois-Reymond (1831-1889) James Dugundji (1919-1985)

Albrecht Durer (1471-1528) Euclid of Alexandria (325BC-265BC) Leonhard Euler (1707-1783) Alessandro Faedo (1914-2001) Herbert Federer (1920- ) Lipot Fejer (1880-1959) Pierre de Fermat (1601-1665) Sir Ronald Fisher (1890-1962) Joseph Fourier (1768-1830) Maurice Frechet (1878-1973) Ivar Fredholm (1866-1927) Georg Frobenius (1849-1917) Boris Galerkin (1871-1945) Galileo Galilei (1564-1642) Carl Friedrich Gauss (1777-1855) Israel Moiseevitch Gelfand (1913- ) Camille-Christophe Gerono (1799-1891) J. Willard Gibbs (1839-1903) Jorgen Gram (1850-1916) Hermann Grassmann (1808-1877) George Green (1793-1841) Thomas Gronwall (1877-1932) Jacques Hadamard (1865-1963) Hans Hahn (1879-1934) Georg Hamel (1877-1954) William R. Hamilton (1805-1865) Felix Hausdorff (1869-1942) Oliver Heaviside (1850-1925) Eduard Heine (1821-1881) Charles Hermite (1822-1901) David Hilbert (1862-1943) Otto Holder (1859-1937) Robert Hooke (1635-1703) Heinz Hopf (1894-1971) Guillaume de l'Hopital (1661-1704) Christiaan Huygens (1629-1695) Carl Jacobi (1804-1851) Johan Jensen (1859-1925) Camille Jordan (1838-1922) Oliver Kellogg (1878-1957) Felix Klein (1849-1925) Helge von Koch (1870-1924) Andrey Kolmogorov (1903-1987) Leopold Kronecker (1823-1891)


Kazimierz Kuratowski (1896-1980) Joseph-Louis Lagrange (1736-1813) Edmond Laguerre (1834-1886) Pierre-Simon Laplace (1749-1827) Caspar Lax (1487-1560) Henri Lebesgue (1875-1941) Solomon Lefschetz (1884-1972) Adrien-Marie Legendre (1752-1833) Gottfried von Leibniz (1646-1716) Jean Leray (1906-1998) Sophus Lie (1842-1899) Ernst Lindelof (1870-1946) Rudolf Lipschitz (1832-1903) Jules Lissajous (1822-1880) L. Agranovich Lyusternik (1899-1981) James Clerk Maxwell (1831-1879) Edward McShane (1904-1989) Arthur Milgram (1912-1961) Hermann Minkowski (1864-1909) Carlo Miranda (1912-1982) August Mobius (1790-1868) Harald Marston Morse (1892-1977) Mark Naimark (1909-1978) Nicomedes (280BC-210BC) des Chenes M . - A. Parseval (1755-1836) Blaise Pascal (1623-1662) Etienne Pascal (1588-1640) Giuseppe Peano (1858-1932) Oskar Perron (1880-1975) Emile Picard (1856-1941) J. Henri Poincare (1854-1912) Diadochus Proclus (411-485) Pythagoras of Samos (580BC-520BC) Hans Rademacher (1892-1969) Tibor Rado (1895-1965) Lord William Strutt Rayleigh (1842-1919)

Kurt Reidemeister (1893-1971) G. F. Bernhard Riemann (1826-1866) Frigyes Riesz (1880-1956) Marcel Riesz (1886-1969) Eugene Rouche (1832-1910) Adhemar de Saint Venant (1797-1886) Stanislaw Saks (1897-1942) Helmut Schaefer (1925- ) Juliusz Schauder (1899-1943) Erhard Schmidt (1876-1959) Lev G. Schnirelmann (1905-1938) Hermann Schwarz (1843-1921) Karl Seifert (1907-1996) Takakazu Seki (1642-1708) Carlo Severini (1872-1951) Hugo Steinhaus (1887-1972) Thomas Jan Stieltjes (1856-1894) Marshall Stone (1903-1989) James Joseph Sylvester (1814-1897) Brook Taylor (1685-1731) Heinrich Tietze (1880-1964) Leonida Tonelli (1885-1946) Stanislaw Ulam (1909-1984) Pavel Urysohn (1898-1924) Charles de la Vallee-Poussin (1866-1962) Egbert van Kampen (1908-1942) Alexandre Vandermonde (1735-1796) Giuseppe Vitali (1875-1932) Vito Volterra (1860-1940) John von Neumann (1903-1957) Karl Weierstrass (1815-1897) Norbert Wiener (1894-1964) Kosaku Yosida (1909-1990) William Young (1863-1942) Nikolay Zhukovsky (1847-1921) Max Zorn (1906-1993) Antoni Zygmund (1900-1992)

There exist many web sites dedicated to the history of mathematics; we mention, e.g., http://www-history.mcs.st-and.ac.uk/~history.

B. Bibliographical Notes

We collect here a few suggestions for the readers interested in delving deeper into some of the topics treated in this volume.

Concerning linear algebra the reader may consult
o P. D. Lax, Linear Algebra, Wiley & Sons, New York, 1997,
o S. Lang, Linear Algebra, Addison-Wesley, Reading, 1966,
o A. Quarteroni, R. Sacco, F. Saleri, Numerical Mathematics, Springer-Verlag, New York, 2000,
o G. Strang, Introduction to Applied Mathematics, Wellesley-Cambridge Press, 1961.

Of course, curves and surfaces are discussed in many textbooks. We mention
o M. do Carmo, Differential Geometry of Curves and Surfaces, Prentice Hall Inc., New Jersey, 1976,
o A. Gray, Modern Differential Geometry of Curves and Surfaces, CRC Press, Boca Raton, 1993.

Concerning general topology and topology the reader may consult, among the many volumes that are available,
o J. Dugundji, Topology, Allyn and Bacon, Inc., Boston, 1966,
o K. Jänich, Topology, Springer-Verlag, Berlin, 1994,
o I. M. Singer, J. A. Thorpe, Lecture Notes on Elementary Topology and Geometry, Springer-Verlag, New York, 1967,
o J. W. Vick, Homology Theory. An Introduction to Algebraic Topology, Springer-Verlag, New York, 1994.

With special reference to degree theory and existence of fixed points we mention
o A. Granas, J. Dugundji, Fixed Point Theory, Springer-Verlag, New York, 2003,
o L. Nirenberg, Topics in Nonlinear Functional Analysis, Courant Institute of Mathematical Sciences, New York University, 1974.

The literature on Banach and Hilbert spaces, linear operators, spectral theory and linear and nonlinear functional analysis is incredibly wide. Here we mention only a few titles
o N. J. Akhiezer, I. M. Glazman, Theory of Linear Operators in Hilbert Spaces, Dover, New York, 1983,
o H. Brezis, Analyse Fonctionnelle, Masson, Paris, 1983,
o A. Friedman, Foundations of Modern Analysis, Dover, New York, 1970,
and also
o N. Dunford, J. Schwartz, Linear Operators, John Wiley, New York, 1988,
o K. Yosida, Functional Analysis, Springer-Verlag, Berlin, 1974,
as well as the celebrated
o R. Courant, D. Hilbert, Methods of Mathematical Physics, Interscience Publishers, 1953,
o F. Riesz, B. Sz. Nagy, Leçons d'Analyse Fonctionnelle, Gauthier-Villars, Paris, 1965.

C. Index

accumulation point, 164 algebra - End (X), 326 - ideal, 402 - - maximal, 402 - - proper, 402 - of functions, 316 - spectrum, 403 - with identity, 402 algorithm - Gram-Schmidt, 85, 99 ball - open, 152 Banach - algebra, 326, 403 - closed graph theorem, 330 - continuous inverse theorem, 330 - fixed point theorem, 335 - indicatrix, 265 - open mapping theorem, 329 - space, 286 - ordered, 343 basis, 43 - dual, 54 - orthonormal, 85 bilinear form - bounded, 370 - coercive, 370 bilinear forms, 95 - signature, 97 bracket - Lie, 38 Carnot's formula, 81 cluster point, 164 coefficients - Fourier, 433 compact set, 200 - relatively, 203 - sequentially, 197 conies, 106

connected - component, 211 - set, 210 continuity - for metric spaces, 163 continuity method, 337 contractible spaces^ 253 convergence - in a metric sp£ice, 153 - pointwise, 157, 297 - uniform, 157, 294, 296 on compact subsets, 310 - weak, 398 convex hull, 208 convolution, 309, 310 - integral means, 309 coordinates - cylindrical, 168 - polar, 168 - spherical, 168 covectors, 54 covering, 165, 199, 260 - locally finite, 165 - net, 199 criterion - Hausdorff, 200 cube - Hilbert, 158 curve, 219 - arc length reparametrization, 232 - closed, 219 - cylindrical helix, 221 - cylindrical representation length, 231 - equivalent, 224 - intrinsic parametrization, 243 - length, 227 in cylindrical coordinates, 231 in polar coordinates, 231 in spherical coordinates, 231 minimal, 397 of graphs, 231

-

semicontinuity, 395 Lipschitz-continuous, 230 orientation, 224 parametrization, 219 Peano, 228 piecewise regular, 226 piecewise-C^, 226 polar representation, 221 length, 231 rectifiable, 227 regular, 224 self-intersection, 223 simple, 223 spherical representation length, 231 tangent vectors, 225 total variation, 241 trace, 219 trajectory, 219 von Koch, 228

decomposition - polar, 125 - singulai' value, 126 definitively, 192 degree, 268 - integral formula, 269 - mapping - - degree, 268 - on 5 1 , 266 - with respect to a point, 275 delta - Dirac, 313 - - approximation, 313 - Kronecker, 12 dense set, 192 determinant, 33, 34 - area, 31 - Laplace's formula, 36 - of a product, 35 - of the transpose, 35 - Vandermonde, 39 diameter, 153 Dini - regular, 438 - test, 438, 444 Dirichlet - problem, 416 discrete Fourier transform, 134, 144 - inverse, 134 distance, 81, 84, 151, 154, 161, 286 - between sets, 216 - codes, 156 - discrete, 156 - Euclidean, 155 - from a set, 162 - Hausdorff, 299 - in ip, 158

- in the mean, 160 - integral, 160 - uniform, 157, 159 duality, 55 eigenspace, 58 eigenvalue, 58, 384, 391 - min-max characterization, 392 - multiplicity algebraic, 62 geometric, 62 - real and complex, 66 - variational characterization, 115 eigenvector, 58 energy equality, 360, 441 example - Fejer, 451 exponential operator, 327 Fejer - example, 451 - sums, 452 fixed point, 335 force, 92 forms - bilinear, 95 - linear, 54 - quadratic, 115 formula - Binet, 35 - Carnot, 81 - degree, 269 - Euler, 281 - Grassmann, 18, 47 - Hadamard, 143 - inverse matrix, 30 - Laplace, 36 - Parseval, 358, 441 - polarity, 80, 83 - rank, 49 of matrices, 16 Fourier - coefficients, 357, 433 - series, 357, 433 uniform convergence, 444 Fredholm's alternative, 50 function, see map - BauEich's indicatrix, 265 - bounded total variation, 244 - closed, 194, 216 - coercive, 203 - continuous, 163, 182 image of a compact, 202 image of a connected set, 212 inverse of, 202 - convex, 287 - exponential, 171 - Holder-continuous, 161


- homeomorphism, 182 - Joukowski, 169 - limit, 164 - Lipschitz-continuous, 161 extension, 207 - logarithm, 171 - lower semicontinuous, 204 - Mobius, 170 - open, 194, 216 - proper, 216 - sequentially semicontinuous, 203 - total variation, 241 - uniformly continuous, 205 functions - equibounded, 301 - equicontinuous, 301 - Holder-continuous, 301 - homotopic, 250 fundamental group, 258

- Bessel, 358, 440 - Cauchy-Schwarz, 80, 83, 352 - Gronwall, 410 - Jensen, 400 - Minkowski, 155, 158, 293 - triangular, 81, 84 - variational, 372 inner product - continuity, 352 integral - de la Vallee Poussin, 316 integral equations - Fredholm, 426, 428, 429 - Volterra, 425, 426, 429 invariant - metric, 183 - topological, 184 isolated point, 180 isometries, 87

gauge function, 333 geodesic, 152 - distance, 154 Gibbs phenomenon, 448 Green operator, 421 group - fundamental, 258 - linear, 50 - orthogonal, 88 - unitary, 88

kernel - de la Vallee-Poussin, 315 - Dirichlet, 435

Holder function, 161 Hausdorff criterion, 200 Hermitian product - continuity, 352 Hilbert space, 158, 353 - basis, 355 - complete system, 355 - dual, 364 - Fourier series, 357 - pre, 351 - separable, 355 - weak convergence, 398 Hubert's cube, 393 homeomorphism, 182 homotopy, 250 - equivalence, 253 - first group, 258 - relative, 256 - with fixed endpoints, 256 ideal, 402 - maximal, 402 - proper, 402 identity - Jacobi, 38 - parallelogram, 80, 287 inequality

law - parallelogram, 287 least squares, 129 - canonical equation, 129 lemma - Gronwall, 410 - Riemann-Lebesgue, 436 - Uryshon, 209 liminf, 204 limit point, 164 limsup, 204 linear - combination, 42 - equation, 50 - operator, 44 characteristic polynomial, 60 - eigenspace, 58 - eigenvalue, 58 - eigenvector, 58 - subsp£u:e, 4 - systems, 22 Cramer's rule, 36 linear difference - equations - - o f higher order, 137 linear difference equations - systems, 136 linear regression, 374 Lipschitz - constant, 161 - function, 161 map - affine, 37


- compact, 339 - linear, 44 - - affine, 37 - - associated matrix, 48 automorphism, 50 endomorphism, 50 - - graph, 109 image, 45 - - kernel, 45 - - rank, 45 - proper, 265 - Riesz, 91, 367 mapping - degree, 268 matrix - algebra, 11 - associated to a linear map, 48 - block, 39, 137 - characteristic of a, 35 - cofactors, 36 - complementing minor, 34 - congruent to, 102 - determinant, 33, 34 - diagonal, 12 - diagonizable, 60 - eigenspace, 58 - eigenvalue, 58 - eigenvector, 58 - Gauss reduced, 26 pivots, 26 - Gram, 82, 85, 96, 101, 143 - identity, 12 - inverse, 12, 36 - Jordan's - - basis, 72 canonical form, 70 - Jordan's formula, 137 - LR decomposition, 30 - nilpotent, 69 - nonsingular, 15 - orthogonal, 88 - polar form, 125 - power, 137 - product, 11 - rank, 16 - similar to, 60 - singular value decomposition, 126 - singular values, 125 - spectrum, 58 - stair-shaped, 26 pivots, 26 - symmetric, 38 - trace, 38, 61 - transpose, 13 - triangular - - lower, 12 - - upper, 12 - unitary, 88

maximum point, 201 method - continuity, 337 - Faedo-Galerkin, 377 - Gauss elimination, 25 - Gram-Schmidt, 106 - Jacobi, 100 - least squares, 128 - Picard, 335, 407 - Ritz, 373 error estimate, 373 - shooting, 418 - super- and sub-solutions, 344 - variational for the eigenvalues, 116, 118 metric, 97 - Artin, 103 - Euclidean, 103 - invariant, 183 - Lorentz, 103 - nondegenerate, 97 - positive, 97 - pseudoeuclidean, 103 metric axioms, 151 metric space, 151

- C \ 160 - compact, 200 - complete, 185 - completion, 186 - connected, 210 - connected component, 211 - continuity in, 163 - convergence in, 153 - immersion in ℓ_∞, 402 - immersion in C^, 402 - locally connected, 212 - path-connected, 213 - sequentially compact, 197 metrics, 151 - equivalent, 188 - in a product space, 156 - topologically equivalent, 178 minimal geodesics, 397 minimizing sequence, 201 minimum point, 201 Minkowski - discrete inequality, 155 - inequality, 158 Minkowski inequality, 293 minor - complementing, 34 modulus of continuity, 320 mollifiers, 312 Moore-Penrose inverse, 369, 374 neighborhood, 177 norm, 79, 154, 285 - C^'", 301 - C i , 296

- equivalent norms, 288 - L^∞, 294 - ℓ_p, 292 - L^p, 293 - uniform, 294 - uniform or infinity, 159 normed space, 154, 285

- ℒ(X,Y), 324 - series, 288 absolute convergence, 289 normed spaces - convex sets, 287 numbers - Fibonacci, 140 ODE - Cauchy problem, 405 - comparison theorem, 420 - continuation of solutions, 408 - Gronwall's lemma, 410 - integral curves, 404 - maximum principle, 419, 420 - Picard approximations, 407 - shooting method, 418 operator - adjoint, 93, 369 - closed range, 369 - commuting, 388 - compact, 378 - compact perturbation, 379 - eigenvalue, 384 - eigenvector, 384 - Green, 372, 421 - linear - - antisymmetric, 121 - - isometry, 121 - - normal, 121 - - positive, 117 - - self-adjoint, 121 - - symmetric, 121 - normal, 121, 388 - positive, 117 - powers, 119 - projection, 111, 368 - resolvent, 384 - Riesz, 366 - self-adjoint, 111, 369 - singular values, 125 - spectrum, 384 pointwise, 384 - square root, 120 operators - bounded, 324 - compact, 339 - exponential, 327 - pointwise convergence, 325 - Schauder, 341 - uniform convergence, 325

order cone, 343 parallelogram law, 80, 83, 352 path, 219 Peano curve, 228 Peano's phenomenon, 416 perfect set, 192 phenomenon - Peano, 416 point - adherent, 179 - boundary, 179 - cluster, 164 - exterior, 179 - interior, 179 - isolated, 180 - limit, 164 - of accumulation, 164 polynomials - Bernstein, 306 - Hermite, 361 - Jacobi, 361 - Laguerre, 361 - Legendre, 361 - Stieltjes, 314 - Tchebychev, 361 principle - abstract Dirichlet's, 364, 371 - Cantor, 188 - maximum, 419, 420 - of condensation of singularities, 329 - of uniform boundedness, 328 - Riemann's localization, 445 problem - Dirichlet, 416 product - Hermitian, 82 - inner, 79 - scalar, 79 projection - stereographic, 168 quadratic forms, 104 quadrics, 107 rank, 16 - of the transpose, 17 Rayleigh's quotient, 392 resolvent, 384 retraction, 254 scalars, 4, 41 segment-connected set, 213 semicontinuous function - sequentially, 203 sequence - Cauchy, 185

- convergent, 153 series, 288 - Fourier, 357, 433 set - boundary of, 179 - bounded, 153 totally, 199 - closed, 175 - closure of, 179 - compact, 200 - - sequentially, 197 - complement of, 175 - connected, 210 in ℝ, 211 - convex, 287 - convex hull of, 208 - dense, 192 - derived of, 180 - discrete, 192 - interior, 179 - meager, 189 - neighborhood, 177 - nowhere dense, 189 - of the first category, 189 - of the second category, 189 - open, 175 - perfect, 192 - regular closed, 193 - regular open, 193 - relatively compact, 203 - segment-connected, 213 - separated, 210 small oscillations, 141 - normal modes, 143 - proper frequencies, 143 smoothing kernel, 312 space - C-?, 346 - C^'", 301 - Cfe, 296

- L^p, 293 - ℓ_∞, 295 - ℓ_p, 292 - CO, 356 - CO, 159 - contractible, 253 - ℓ_∞, 157 - L^p, 161 - ℓ_p, 158 - Hilbert, 353 - Hilbert's, 158 - locally path-connected, 262 - L^2(]a, b[), 354 - ℓ_2, 353 - pre-Hilbert, 351 - simply connected, 259 - topologically complete, 194 spectral theorem, 387

spectrum, 58, 384 - characterization, 385 - pointwise, 384 subsolution, 344 subspace - orthogonal, 90 supersolution, 344 test - Dini, 438, 444 theorem - alternative, 94, 380, 383 - Baire, 188 - Baire of approximation, 319 - Banach's fixed point, 335 - Banach-Saks, 399 - Banach-Steinhaus, 328 - Bernstein, 306, 423 - Binet, 35 - Bolzano-Weierstrass, 198 - Borsuk, 273 - Borsuk's separation, 280 - Borsuk-Ulam, 278 - Brouwer, 273 - Brouwer's fixed point, 274, 276, 339 - Brouwer's invariance domain, 281 - Caccioppoli-Schauder, 341 - Cantor-Bernstein, 215 - Carnot, 81, 352 - Cayley-Hamilton, 67 - closed graph, 330 - comparison, 420 - continuation of solutions, 408 - continuous inverse, 330 - Courant, 116 - Cramer, 36 - de la Vallée Poussin, 315 - Dini, 299, 438 - Dirichlet-Jordan, 449 - Dugundji, 208 - existence of minimal geodesics, 397 of minimizers of convex coercive functionals, 401 - Fejér, 452 - finite covering, 200 - Fréchet-Weierstrass, 203 - Fredholm, 94 - Fredholm's alternative, 50 - fundamental of algebra, 271 - Gelfand-Kolmogorov, 402 - Gelfand-Naimark, 403 - generalized eigenvectors, 69 - Gram-Schmidt, 85 - Hahn-Banach, 331, 332, 334 - Hausdorff, 186 - Heine-Cantor-Borel, 205 - Hopf, 273

- intermediate value, 212 - Jacobi, 100 - Jordan, 280 - Jordan's canonical form, 72 - Jordan's separation, 280 - Jordan-Borsuk, 281 - Kirszbraun, 207 - Kronecker, 35 - Kuratowski, 215 - Lax-Milgram, 376 - Lyusternik-Schnirelmann, 278 - McShane, 207 - Miranda, 277 - nested sequence, 188 - open mapping, 329 - Peano, 415 - Perron-Frobenius, 282 - Picard-Lindelöf, 406 - Poincaré-Brouwer, 277 - polar decomposition, 125 - projection, 89, 367 - Pythagoras, 81, 84, 86, 354 - Riemann-Lebesgue, 436 - Riesz, 91, 291, 366, 371 - Riesz-Fischer, 360 - Riesz-Schauder, 385 - Rouché, 282 - Rouché-Capelli, 23 - Schaefer's fixed point, 342 - second mean value, 442 - Seifert-Van Kampen, 267 - simultaneous diagonalization, 117 - spectral, 112, 122, 385 - spectral resolution, 114 - stability for systems of linear difference equations, 140 - Stone-Weierstrass, 316 - Sylvester, 98, 101 - Tietze, 208 - Urysohn, 185 - Weierstrass, 201 - Weierstrass's approximation, 303 - Weierstrass's approximation for periodic functions, 307 theory - Courant-Hilbert-Schmidt, 389 completeness relations, 394 topological - invariant, 184 - property, 184 - space, 182 topological space - contractible, 253 - deformation retract, 254 - Hausdorff, 184 - retract, 254 - simply connected, 259

topology, 178, 182 - basis, 184 - discrete, 184 - indiscrete, 184 - of uniform convergence, 294 totally bounded set, 199 trigonometric polynomials, 130 - energy identity, 131 - Fourier coefficients, 131 - sampling, 132 tubular neighborhood, 159 variational - inequality, 372 vector space, 41 - K^n, 3 - automorphism, 50 - basis, 5, 43 canonical basis of K^n, 9 - orthonormal, 85 - coordinate system, 46 - dimension, 8, 45 - direct sum, 18, 47 - dual, 54 - Euclidean, 79 norm, 81 - Hermitian, 82 norm, 84 - linear combination, 4, 42 - linear subspace, 4 implicit representation, 18 parametric representation, 18 - ordered basis, 9 - subspace, 42 - supplementary, 47 - supplementary linear subspaces, 18 vectors, 41 - linearly dependent, 5 - linearly independent, 5, 42 - norm, 79 - orthogonal, 80, 84, 354 - orthonormal, 85 - span of, 42 von Koch curve, 228 work, 92 Yosida regularization, 319, 320

Printed in the United States of America
