Igor Kriz Aleš Pultr
Introduction to Mathematical Analysis
Igor Kriz, Department of Mathematics, University of Michigan, Ann Arbor, MI, USA
Aleš Pultr, Department of Applied Mathematics (KAM), Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
ISBN 978-3-0348-0635-0
ISBN 978-3-0348-0636-7 (eBook)
DOI 10.1007/978-3-0348-0636-7
Springer Basel Heidelberg New York Dordrecht London
Library of Congress Control Number: 2013941992
© Springer Basel 2013
This work is subject to copyright. All rights are reserved by the Publisher.
Printed on acid-free paper Springer Basel AG is part of Springer Science+Business Media (www.birkhauser-science.com)
To Sophie
To Jitka
Preface
This book is a result of a long-term project which originated in courses we taught to undergraduate students who specialize in mathematics. These students had ε-δ calculus before, but there did not seem to be a suitable comprehensive textbook for a follow-up course in analysis. We wanted to write such a textbook based on our courses, but that was not the only goal. Teaching bright students is about introducing them to mathematics. Therefore, we wanted to write a book which the students may want to keep after the course is over, and which could serve them as a bridge to higher mathematics. Such a book would necessarily exceed the scope of their courses.
We start with standard material of second-year analysis: multi-variable differential calculus, Lebesgue integration, ordinary differential equations and vector calculus. What makes all this go smoothly is that we introduce some basic concepts of point set topology first. Since our aim is to be completely rigorous and as self-contained as possible, we also include a Preliminaries chapter on the basic topics of one-variable calculus, and two Appendices on the necessary concepts of linear algebra. This pretty much comprises the first part of our book.
With the foundations covered, it is possible to venture much further. The common theme of the second part of our book is the interplay between analysis and geometry. After a second installment of point set topology, we are quickly able to introduce complex analysis, and after some multi-linear algebra, also manifolds, differential forms and the general Stokes' Theorem. The methods of manifolds and complex analysis combine in a treatment of Riemann surfaces. Basic methods of the calculus of variations are applied to a theory of geodesics, which in turn leads to basic tensor calculus and Riemannian geometry.
Finally, infinite-dimensional spaces, which have already made an appearance in multiple places throughout the text, are treated more systematically in a chapter on the basic concepts of functional analysis, and another on a few of its applications.
The total amount of material in this book cannot be covered in any single-year course. An instructor of a course based on this book should probably aim for covering the first part, and take his or her picks in the second part. As already mentioned, we hope to motivate the student to hold on to their textbook, and use it for further study in years to come. They will eventually get to more advanced books in analysis and beyond, but here they can get, relatively quickly, their first glimpse of a big picture.
Because of this, the aim of our book is not limited to undergraduate students. This text may equally well serve a graduate student or a mathematician at any career stage who would like a quick source or reference on basic topics of analysis. A scientist (for example in physics or chemistry) who may have always been using analysis in their work can use this book to go back and fill in the rigorous details and mathematical foundations. Finally, an instructor of analysis, even if not using this book as a textbook, may want to use it as a reference for those pesky proofs which usually get skipped in most courses: we do quite a few of them.
Ann Arbor, USA: Igor Kriz
Prague 1, Czech Republic: Aleš Pultr
Contents
Part I A Rigorous Approach to Advanced Calculus
1 Preliminaries
  1 Real and complex numbers
  2 Convergent and Cauchy sequences
  3 Continuous functions
  4 Derivatives and the Mean Value Theorem
  5 Uniform convergence
  6 Series. Series of functions
  7 Power series
  8 A few facts about the Riemann integral
  9 Exercises
2 Metric and Topological Spaces I
  1 Basics
  2 Subspaces and products
  3 Some topological concepts
  4 First remarks on topology
  5 Connected spaces
  6 Compact metric spaces
  7 Completeness
  8 Uniform convergence of sequences of functions. Application: Tietze’s Theorems
  9 Exercises
3 Multivariable Differential Calculus
  1 Real and vector functions of several variables
  2 Partial derivatives. Defining the existence of a total differential
  3 Composition of functions and the chain rule
  4 Partial derivatives of higher order. Interchangeability
  5 The Implicit Functions Theorem I: The case of a single equation
  6 The Implicit Functions Theorem II: The case of several equations
  7 An easy application: regular mappings and the Inverse Function Theorem
  8 Taylor’s Theorem, Local Extremes and Extremes with Constraints
  9 Exercises
4 Integration I: Multivariable Riemann Integral and Basic Ideas Toward the Lebesgue Integral
  1 Riemann integral on an n-dimensional interval
  2 Continuous functions are Riemann integrable
  3 Fubini’s Theorem in the continuous case
  4 Uniform convergence and Dini’s Theorem
  5 Preparing for an extension of the Riemann integral
  6 A modest extension
  7 A definition of the Lebesgue integral and an important lemma
  8 Sets of measure zero; the concept of “almost everywhere”
  9 Exercises
5 Integration II: Measurable Functions, Measure and the Techniques of Lebesgue Integration
  1 Lebesgue’s Theorems
  2 The class Λ (measurable functions)
  3 The Lebesgue measure
  4 The integral over a set
  5 Parameters
  6 Fubini’s Theorem
  7 The Substitution Theorem
  8 Hölder’s inequality, Minkowski’s inequality and L^p-spaces
  9 Exercises
6 Systems of Ordinary Differential Equations
  1 The problem
  2 Converting a system of ODE’s to a system of integral equations
  3 The Lipschitz property and a solution of the integral equation
  4 Existence and uniqueness of a solution of an ODE system
  5 Stability of solutions
  6 A few special differential equations
  7 General substitution, symmetry and infinitesimal symmetry of a differential equation
  8 Symmetry and separation of variables
  9 Exercises
7 Systems of Linear Differential Equations
  1 The definition and the existence theorem for a system of linear differential equations
  2 Spaces of solutions
  3 Variation of constants
  4 A Linear differential equation of nth order with constant coefficients
  5 Systems of LDE with constant coefficients. An application of Jordan’s Theorem
  6 Exercises
8 Line Integrals and Green’s Theorem
  1 Curves and line integrals
  2 Line integrals of the first kind (= according to length)
  3 Line integrals of the second kind
  4 The complex line integral
  5 Green’s Theorem
  6 Exercises
Part II Analysis and Geometry
9 Metric and Topological Spaces II
  1 Separable and totally bounded metric spaces
  2 More on compact spaces
  3 Baire’s Category Theorem
  4 Completion
  5 More on topological spaces: Separation
  6 The space of continuous functions revisited: The Arzelà-Ascoli Theorem and the Stone-Weierstrass Theorem
  7 Exercises
10 Complex Analysis I: Basic Concepts
  1 The derivative of a complex function. Cauchy-Riemann conditions
  2 From the complex line integral to primitive functions
  3 Cauchy’s formula
  4 Taylor’s formula, power series, and a uniqueness theorem
  5 Applications: Liouville’s Theorem, the Fundamental Theorem of Algebra and a remark on conformal maps
  6 Laurent series, isolated singularities and the Residue Theorem
  7 Exercises
11 Multilinear Algebra
  1 Hom and dual vector spaces
  2 Multilinear maps and the tensor product
  3 The exterior (Grassmann) algebra
  4 Exercises
12 Smooth Manifolds, Differential Forms and Stokes’ Theorem
  1 Smooth manifolds
  2 Tangent vectors, vector fields and differential forms
  3 The exterior derivative and integration of differential forms
  4 Integration of differential forms and Stokes’ Theorem
  5 Exercises
13 Complex Analysis II: Further Topics
  1 The Riemann Mapping Theorem
  2 Holomorphic isomorphisms of disks onto polygons and the Schwartz-Christoffel formula
  3 Riemann surfaces, coverings and complex differential forms
  4 The universal covering and multi-valued functions
  5 Complex analysis beyond holomorphic functions
  6 Exercises
14 Calculus of Variations and the Geodesic Equation
  1 The basic problem of the calculus of variations, and the Euler-Lagrange equations
  2 A few special cases and examples
  3 The geodesic equation
  4 The geometry of geodesics
  5 Exercises
15 Tensor Calculus and Riemannian Geometry
  1 Tensor calculus
  2 Affine connections
  3 Tensors associated with an affine connection: torsion and curvature
  4 Riemann manifolds
  5 Riemann surfaces and surfaces with Riemann metric
  6 Exercises
16 Banach and Hilbert Spaces: Elements of Functional Analysis
  1 Banach and Hilbert spaces
  2 Uniformly convex Banach spaces
  3 Orthogonal complements and continuous linear forms
  4 Infinite sums in a Hilbert space and Hilbert bases
  5 The Hahn-Banach Theorem
  6 Dual Banach spaces and reflexivity
  7 The duality of L^p-spaces
  8 Images of Banach spaces under bounded linear maps
  9 Exercises
17 A Few Applications of Hilbert Spaces
  1 Some preliminaries: Integration by a measure
  2 The spaces L^p(X, C) and the Radon-Nikodym Theorem
  3 Application: The Fundamental Theorem of (Lebesgue) Calculus
  4 Fourier series and the discrete Fourier transformation
  5 The continuous Fourier transformation
  6 Exercises
A Linear Algebra I: Vector Spaces
  1 Vector spaces and subspaces
  2 Linear combinations, linear independence
  3 Basis and dimension
  4 Inner products and orthogonality
  5 Linear mappings
  6 Congruences and quotients
  7 Matrices and linear mappings
  8 Exercises
B Linear Algebra II: More about Matrices
  1 Transforming a matrix. Rank
  2 Systems of linear equations
  3 Determinants
  4 More about determinants
  5 The Jordan canonical form of a matrix
  6 Exercises
Bibliography
Index of Symbols
Index
Introduction
The main purpose of this introduction is to tell the reader what to expect while reading this book, and to give advice on how to read it.
We assume the reader to be acquainted with the basics of differential and integral calculus in one variable, as traditionally covered in the first year of study. Nevertheless, we include, for the reader’s convenience, in Chapter 1, a few pivotal theoretical points of analysis in one variable: continuity, derivatives, convergence of sequences and series of functions, the Mean Value Theorem, Taylor expansion, and the single-variable Riemann integral. The purpose of including this material is two-fold. First, we would like this text to be as self-contained as possible: we wish to spare the reader a tedious search, in another text, for an elementary fact he or she may have forgotten. The second, and perhaps more important reason, is to focus attention on facts of elementary differential and integral calculus that have deeper aspects, and are fundamental to more advanced topics. In connection with this, we also review in the exercises to Chapter 1 definitions of elementary functions and proofs, from the first principles, of their properties needed later. What we omit at this stage is a proof of the existence of real numbers; the reader probably knows it from elsewhere, but if not, there will be an opportunity to come back and do it as an exercise to Chapter 9.
An entirely different prerequisite is linear algebra. While not a part of mathematical analysis in the narrowest sense, it contains many necessary techniques. In fact, differential calculus (in particular in more than one variable) can be without much exaggeration understood as the study of linear approximations of more general mappings, and a basic knowledge in dealing with the linear case is indispensable.
The reader’s skills in these topics (determinants, linear equations, operations with matrices, and others) may determine to a considerable degree his or her success with a large part of this book. Because of this, we feel it is appropriate to include linear algebra in this text as a reference. In order not to slow down the narrative, we do so in two appendices: Appendix A for more theoretical topics such as vector spaces and linear mappings, and Appendix B for more computational questions regarding matrices, culminating with a treatment of the Jordan canonical form.
Let us turn to the main body of this book. It is divided into two parts. One of our main goals is to present a rigorous treatment of the traditional topics of advanced calculus: multivariable differentiation, (Lebesgue) integration, and differential equations. All this is covered by Part I, including basic facts about line integrals and Green’s Theorem.
In Part II we use the techniques developed in Part I to approach phenomena of geometrical nature which the reader may already have encountered without proof, or will certainly encounter in further studies. Using the tools developed, they can be probed into considerable depth without too much further difficulty.
Part I. We think it essential to start rigorous advanced calculus with the basic notions of (at least metric) topology. Concepts such as neighborhood, open set, closure and convergence viewed narrowly just in the context of the Euclidean space R^n do not give a satisfactory picture of what is going on (and besides, would not be sufficient for what will come later). In Chapter 2, we discuss these concepts first in the context of metric spaces. This generality is, strictly speaking, already sufficient for most of our purposes. Yet, it is useful to learn about the more general topological spaces to be able to distinguish what really depends on the metric and what does not. For this reason, our treatment of space in this chapter is an interweaving narrative of metric and topological spaces with the goal of presenting an adequate general outlook. We stick here, however, to the simpler facts and concepts needed in the nearest chapters (the more advanced topics on spaces are postponed to Chapter 9) and the reader will certainly not find this chapter hard.
With the basic knowledge of metric topology we are ready for multivariable differential calculus. This is covered in Chapter 3. We start with the basic notions of partial derivative and total differential, and the chain rule. Emphasizing the role of the total differential as a linear map is key to more coordinate-free approaches to analysis, vital when investigating manifolds (later in Chapter 12). Next, we prove from first principles the Implicit Function Theorem.
This is the first more complicated analytic proof in our text; the reader is advised to pay detailed attention to this material, and certainly not to skip it, as it is a good model of what a proof in analysis looks and feels like. The chapter is concluded with material related to the multivariable version of Taylor’s Theorem, and to calculating extremes and saddles of multivariable functions.
The following two chapters are devoted to integration. In Chapter 4 we start with the multivariable Riemann integral over a product of intervals. It becomes clear very quickly, however, that a more versatile theory is needed. For example, we want to take integrals of unbounded functions, or integrals over more general types of sets. Or, we would like to know when we can take limits or derivatives behind the integral sign. We would like to understand precisely why “the boundary does not matter” when taking a multivariable integral, and how and why we can change variables in a non-linear way. All this leads to the concept of Lebesgue integral, which is somewhat notorious for being time-consuming because of the abstract concepts it entails. There is, however, a way around that. A method of P.J. Daniell (going back to 1918 and unjustly neglected for decades) allows a straightforward introduction of the Lebesgue integral starting with monotone limits of Riemann integrals of continuous functions with compact support. The necessary technical theorems can be proved very quickly and we present them in the second half of Chapter 4. In Chapter 5, we go on to present the more technical aspects of the Lebesgue integral. We explain how to take limits and derivatives behind the integral sign, prove Fubini’s Theorem, define the Lebesgue measure and prove its basic properties. Further, we
introduce Borel sets and prove criteria of measurability. We present a rigorous proof of a multivariable substitution theorem. Finally, we introduce L^p-spaces: while this may seem like an early place, we will have enough integration theory at this point to do so, and to prove their basic properties. This is useful, as the L^p-spaces often occur throughout analysis (for example, in this book, we will use them in Chapters 13 and 15 in proving the existence of a complex structure on an oriented surface with a Riemann metric). We will return to the study of L^p-spaces in Chapter 16, where they provide the most basic examples in functional analysis.
Next, having covered differentiation and integration, we turn to differential equations. We restrict our attention to ordinary differential equations (ODEs), as partial differential equations have quite a different flavor and constitute a vast field of their own, far beyond a general course in analysis (even an advanced one). For a text on partial differential equations, we refer the reader, for example, to [5]. Chapter 6 on (general) ordinary differential equations is in fact independent of Chapters 4 and 5 and uses only the material of Chapters 1, 2 and 3. We introduce the concept of a Lipschitz function and prove the local existence and uniqueness theorem for systems of ODEs (the Picard-Lindelöf Theorem). We also discuss stability of solutions and differentiation with respect to parameters. Further, we discuss the basic method for separation of variables, and finally discuss global and infinitesimal symmetries of systems of ODEs (thus motivating further study of vector fields); also, we explain how the methods of separation of variables discussed earlier are related to symmetries of the system.
Chapter 7 covers some aspects of linear differential equations (LDEs). The global existence theorem is proved, and the affine set of solutions of a linear system is discussed.
We show how to use the Wronskian for recognizing a fundamental system (basis) of the space of solutions of a homogeneous system of LDEs, and how to get solutions of a non-homogeneous system from the homogeneous one using the variation of constants. Also, we present a method of solving systems of LDEs with constant coefficients, easier in the case of a single higher order LDE, and requiring the Jordan canonical form of a matrix from Appendix B in the harder general case. Chapter 8, concluding Part I, treats parametric curves, line integrals of the first and second kind and the complex line integral. At the end we prove Green’s Theorem, which we will need when dealing with complex derivatives, but which is also an elementary warm-up for the general Stokes’s Theorem. Part II. Now our perspective changes. The traditional items of advanced calculus have been mostly covered and we turn to topics interesting from the point of view of geometry. To proceed, perhaps by now not surprisingly, we need another installment of topological foundations. This is done in Chapter 9, presenting more material on topological spaces (separability, compactness, separation axioms and the Urysohn Theorem) as well as on metric spaces (completion, Baire’s Category Theorem). In the last section we prove the Stone-Weierstrass Theorem providing a remarkably general method to obtain useful dense sets in spaces of functions, and the Arzelà-Ascoli Theorem, which greatly clarifies the meaning of uniform convergence, and will be useful in Chapter 10 when proving the Riemann Mapping Theorem.
Next, in Chapter 10 we introduce the basic methods of complex analysis. The fundamental facts can be derived almost immediately from the complex line integral and Green’s Theorem of Chapter 8. The conclusions, however, are powerful and surprising. Unlike differentiable functions of a real variable, complex functions with a complex derivative (holomorphic functions) are much more rigid. They are determined, for example, by their values on a convergent sequence of points, and the existence of a derivative automatically implies the existence of derivatives of all orders; on the other hand, “geometrically very smooth” functions may not have a complex derivative. Thus, our view of the differential calculus as we know it from the real case is turned upside down. Yet, complex analysis has important real applications, such as for instance the explanation of the convergence properties of a Taylor series. Other applications presented are the Fundamental Theorem of Algebra, and an important geometric one, the Jordan Curve Theorem. We then go on to cover other basic methods of complex analysis, such as Laurent series, the classification of isolated singularities, the Residue Theorem and the Argument Principle, which has several interesting applications, including the Open Mapping Theorem. Next, we also have to upgrade our knowledge of linear algebra; more specifically, we must get acquainted with the techniques of multilinear algebra. This is done in Chapter 11, which includes dual vector spaces, tensor products, and the exterior (Grassmann) algebra. Thus equipped, we can now study calculus on manifolds. This is done in Chapter 12. We define smooth manifolds, tangent vectors, vector fields, and differential forms. Further, we present the exterior derivative, de Rham complex, and the de Rham cohomology. A general form of Stokes’s Theorem is proved and related to the operators grad, div and curl as introduced in traditional calculus courses. 
A combination of the study of manifolds with complex analysis in one variable leads to the concept of Riemann surfaces. Their basic theory is presented in Chapter 13. We begin the chapter with the Riemann Mapping Theorem, showing conformal equivalence (holomorphic isomorphism) of simply connected proper open subsets of $\mathbb{C}$. We also present the Schwarz-Christoffel integrals giving conformal equivalence between open convex polygons and the unit open disk; examples include elliptic integrals. Then we introduce the theory of Riemann surfaces and coverings, and construct universal coverings. We will see that even if we are not interested in the abstraction of manifolds, this formalism will greatly enhance our understanding of complex integration: we will now be able, for example, to integrate a holomorphic function over a homotopy class of continuous paths. We will also be able to understand how to make rigorous the concept of a “multi-valued holomorphic function”, which was strongly suggested by the methods of Chapter 10, yet could not be adequately approached by its methods. Finally, studying complex differential forms on Riemann surfaces will lead us to the basic notions of “$dz$-$d\bar{z}$-calculus”, which is very helpful in complex analysis. To demonstrate, we will apply this to extending some of the methods of complex analysis beyond the case of holomorphic functions.
Chapter 14 is devoted, primarily, to the basic problem of the calculus of variations in one independent variable, and to the Euler-Lagrange equation for critical functions. Here, only the material of Part I is used. In the second half of the chapter we define a Riemann metric on an open subset of $\mathbb{R}^n$ and discuss the geodesic equation in more detail. Also, we prove the local minimality of geodesics. A part of the reason for introducing this material is a motivation of the topics investigated in Chapter 15 where we combine it with the material on manifolds to obtain the basic concepts of Riemannian geometry. We start with tensor calculus and then move on to affine connections, Riemann metrics on manifolds and curvature, and give a local characterization of the Euclidean space as the Riemannian manifold with zero curvature tensor. Using the methods of the last section of Chapter 13, we also show how to construct a complex structure on an oriented Riemann manifold in dimension 2. Chapter 16 concerns Hilbert and Banach spaces, and introduces the basic concepts of functional analysis. Here we need as prerequisites only the techniques from Part I and Chapter 9. We start with the definition and basic properties of Hilbert spaces. We show that a Hilbert space provides, in a sense, an infinite-dimensional extension of the nice properties of the finite-dimensional vector spaces with inner product. Banach spaces are also introduced; their theory is much harder, but nevertheless we are able to prove a few neat results. Starting with Hahn-Banach’s Theorem, we go on to examining duals of Banach spaces, proving, for example, that the dual of $L_p$ is $L_q$ for $1 < p < \infty$, $1/p + 1/q = 1$. We will also prove the Open Mapping Theorem and the Closed Graph Theorem. In Chapter 17 we present some applications, mainly of Hilbert spaces. (One tends to use Hilbert spaces wherever possible, precisely because they are much easier.)
We will prove the Radon-Nikodym Theorem, and use it to prove a version of the Fundamental Theorem of Calculus for the Lebesgue integral, a fairly hard fact. In the framework of Hilbert spaces we also define Fourier series and (continuous) Fourier transformation. As a fringe benefit, we will introduce Borel measures. The theory of $L_p$-spaces generalizes to this case, and includes some interesting new examples. When using this book as a textbook for a course, an instructor should aim to cover most of Part I in the first semester. After this, one can work with Part II, basically, on three independent tracks, thus customizing the course as needed, and as time permits. For Chapters 10 and 14, no additional prerequisites beyond Part I are needed. All the other chapters require Chapter 9; just this added to Part I suffices for the study of Hilbert spaces. In the remaining group, multilinear algebra of Chapter 11 has to precede manifolds in Chapter 12 and Riemannian geometry in Chapter 15, which also uses the facts from Chapter 14, and its last section uses Chapter 10. Chapter 13 uses Chapter 10 and Chapter 12. Using this dependence of chapters, an instructor may decide about the topics for the next semester, possibly assigning some material to the students as independent
reading. There is no need to cover entire chapters; there are endless possibilities for how to mix and match topics to create an interesting course. The student (or reader) is, in any case, most strongly encouraged to keep the book for further study. As already mentioned, we anticipate that graduate students of mathematics, mathematicians and scientists in areas using analysis, as well as instructors of courses in analysis will find this book useful as a reference, and will find their own ways through the topics. In the Bibliography section, the reader will find suggestions for further reading. In the more advanced sections of this text, we often introduce concepts (such as “Lie group” or “de Rham cohomology”) which arise as a natural culmination of our discussion, but whose systematic development is beyond the scope of this book. These concepts are meant to motivate further study. We would like to emphasize that our list of literature is by no means meant to be complete. The books we do suggest all have a fairly close connection to the present text, and to mathematical analysis. They contain more detailed information, as well as suggestions of further literature. Finally, we would like to say a few words about sources. The overall conception of the book is original: we designed the logic of the interdependence of topics, and the strategy for their presentation. Many proofs are, in fact, also “original” in the sense that we made up our own arguments to fit best the particular stage of the presentation (the book contains no new mathematical results). Given the scope of the project, however, we did, in some cases, consult lecture notes, other books and occasionally even research papers for particular proofs. All the books used are listed in the Bibliography at the end of the book.
In the case of research papers, we give the name of who we believe is the original author of the proof, but do not include explicit journal references, as we feel an effort of being even partially fair would lead to a web of references which would only bewilder a first-time student of the subject. We did want to mention, however, that there are also quite a few proofs which seem to have become “standard” in this field (including, sometimes, particular notation), and whose original author we were not able to track down. We would like to thank all of those, who, by inventing those proofs, contributed to this book implicitly. We would also like to thank colleagues and students who read parts of our book, and gave us valuable comments. Last but not least, the authors gratefully acknowledge the support of CE-ITI of Charles University, the Michigan Center for Theoretical Physics and the NSF.
Part I A Rigorous Approach to Advanced Calculus
1 Preliminaries
The typical reader of this text will have had a rigorous “$\delta$-$\varepsilon$” first year calculus course, using a text such as, for example, [22]. Such a course will have included definitions and basic properties of the standard elementary functions (polynomials, rational functions, exponentials and logarithms, trigonometric and cyclometric functions), the concept of continuity of a real function and the fact that continuity is preserved under standard constructions (sum, product, composition, etc.), and the basic rules of computing derivatives. We review here mainly the more theoretical aspects of these topics. The reasons for reviewing them are two-fold. The first reason is that we would like this text to be as self-contained as possible. The second reason is that some of the basic results have, in fact, substantial depth in them, and the more advanced topics on which this book focuses make heavy use of them. Not reviewing such topics would at times even create a danger of circular arguments.
1 Real and complex numbers
Perhaps it is useful to go over a few basic conventions first. By a map or mapping from a set $S$ to a set $T$ we mean a rule which assigns to each element of $S$ precisely one element of $T$. Two rules are considered the same if they always produce the same value (in $T$) on the same input element (of $S$). Therefore, technically, a map is a binary relation, i.e. a set $R$ of pairs $(x, y)$, $x \in S$, $y \in T$, such that for each $s \in S$, there is precisely one $(s, y) \in R$. The sets $S$, $T$ are called the domain and codomain, respectively. We will denote a map $f$ from a set $S$ to a set $T$ by $f : S \to T$. For such a map, and a set $X \subseteq S$, we will denote by $f[X]$ the set of all elements $f(x)$ such that $x \in X$. Similarly, for $Y \subseteq T$, we will denote by $f^{-1}[Y]$ the set of all $x \in S$ such that $f(x) \in Y$. The set $f[X]$ is called the image of the set $X$ under the map $f$, and the set $f^{-1}[Y]$ is called the pre-image of the set $Y$ under $f$. This use of the square bracket may perhaps seem unusually pedantic, but will soon pay off in the text below. The image $f[S]$ of the domain is sometimes called the image of the map $f$.
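For finite sets, the image $f[X]$ and the pre-image $f^{-1}[Y]$ can be computed directly from their definitions. The following is only a small illustrative sketch: the function names `image` and `preimage`, the finite domain `S` and the squaring map are assumptions made for the example and do not come from the text.

```python
def image(f, X):
    """Return f[X] = {f(x) : x in X} for a finite set X."""
    return {f(x) for x in X}


def preimage(f, S, Y):
    """Return f^{-1}[Y] = {x in S : f(x) in Y}.

    The domain S has to be passed explicitly, since a Python
    callable does not carry its domain with it.
    """
    return {x for x in S if f(x) in Y}


# Hypothetical example: the squaring map on a small finite domain.
S = {-2, -1, 0, 1, 2}
square = lambda x: x * x

print(image(square, {1, 2}))      # the set containing 1 and 4
print(preimage(square, S, {4}))   # the set containing -2 and 2
print(preimage(square, S, {3}))   # the empty set: 3 is not a value
```

Note that the pre-image is defined for every subset of the codomain, whether or not the map has an inverse; this is one reason for keeping the square-bracket notation separate from $f^{-1}(y)$.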
To comment briefly on the use of inclusion symbols, throughout this book, we generally use $\subseteq$ to denote a subset with possible equality; when equality is excluded, we use $\subsetneq$. We generally avoid the somewhat ambiguous symbol $\subset$. When we do use it, it means $\subseteq$ in a context where equality is a priori excluded for an obvious reason not entering the logic of the argument (this may happen, for example, for a finite subset of the real numbers when we are not using the finiteness to conclude that the complement is non-empty). Returning to the subject of mappings, for a map $f : S \to T$, and $U \subseteq S$, often it is useful to have a special symbol for the map $g : U \to T$ which is defined by $g(x) = f(x)$ when $x \in U$. The map $g$ is called the restriction of $f$ to the subset $U$, and denoted by $f|_U$ or $f|U$. A map $f : S \to T$ is called onto if $f[S] = T$, and is called one-to-one (briefly 1-1) if for every $s_1, s_2 \in S$, $f(s_1) = f(s_2) \Rightarrow s_1 = s_2$. Onto maps are also called surjective and one-to-one maps are called injective. A bijective map is a map which is both surjective and injective. The composition of maps $f : S \to T$, $g : T \to U$ will be denoted by $g \circ f$; thus $(g \circ f)(x) = g(f(x))$ for $x \in S$. In fact, the circle is often omitted, and instead of $g \circ f$, one simply writes $gf$. One must, of course, make sure there is no possibility of confusion with multiplication. The identity map $\mathrm{Id}_S$ on a set $S$ is defined simply by $\mathrm{Id}_S(x) = x$ for every $x \in S$. Note that a bijective map $f : S \to T$ has an inverse, i.e. a map $f^{-1} : T \to S$ such that $f^{-1} \circ f = \mathrm{Id}_S$, $f \circ f^{-1} = \mathrm{Id}_T$. We will use the symbol $\mathbb{N}$ to denote the set of (positive) natural numbers $\{1, 2, \dots\}$. The set of non-negative integers will be denoted by $\mathbb{N}_0$, and the set of all integers by $\mathbb{Z}$. The set of all rational numbers will be denoted by $\mathbb{Q}$. The set $\mathbb{R}$ of real numbers needs more attention.
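On finite sets, the notions of injective, surjective and bijective maps, and of composition and inverse, can be checked mechanically. The sketch below is only an illustration under assumed names: `is_injective`, `is_surjective`, `compose` and the sample maps are hypothetical and chosen for this example.

```python
def is_injective(f, S):
    """True if f(s1) == f(s2) implies s1 == s2 on the finite domain S."""
    values = [f(x) for x in S]
    return len(values) == len(set(values))


def is_surjective(f, S, T):
    """True if the image f[S] equals the whole codomain T."""
    return {f(x) for x in S} == set(T)


def compose(g, f):
    """Return the composition g o f, i.e. x -> g(f(x))."""
    return lambda x: g(f(x))


S = [0, 1, 2]
T = [0, 1, 2, 3]

f = lambda x: x + 1                 # a map S -> T
print(is_injective(f, S))           # True
print(is_surjective(f, S, T))       # False: 0 is not attained

shift = lambda x: (x + 1) % 3       # a bijection S -> S
unshift = lambda x: (x - 1) % 3     # its inverse
identity_on_S = compose(unshift, shift)
print([identity_on_S(x) for x in S])  # [0, 1, 2]
```

The example also shows why the codomain matters: the same rule `f` becomes surjective if the codomain is shrunk to `[1, 2, 3]`.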
1.1 Let us summarize the structure of the set $\mathbb{R}$ of real numbers as it will be used in this text. We do not give a rigorous construction of the real numbers at this point. Such a construction, however, will emerge in the context of our discussion of completeness in Chapter 9, where it is reviewed as an exercise. First, $\mathbb{R}$ is a field, that is, there are binary operations, addition $+$ and multiplication $\cdot$ (which will often be indicated simply by juxtaposition), that are associative (that is, $a + (b + c) = (a + b) + c$ and $a(bc) = (ab)c$) and commutative (that is, $a + b = b + a$ and $ab = ba$) and related by the distributivity law ($a(b + c) = ab + ac$). There are neutral elements, zero $0$ and one (also called unit) $1$, such that $a + 0 = a$ and $a \cdot 1 = a$. With each $a \in \mathbb{R}$ we have associated an element $-a \in \mathbb{R}$ such that $a + (-a) = 0$; almost the same holds for the multiplication, where we have for every non-zero $a$ an element $a^{-1}$ (also denoted by $\frac{1}{a}$) such that $a \cdot a^{-1} = 1$. Furthermore, there is a linear order $\le$ on $\mathbb{R}$ (a binary relation such that $a \le a$, that $a \le b$ and $b \le a$ implies $a = b$, that $a \le b$ and $b \le c$ implies $a \le c$, and finally that for any $a, b$ either $a \le b$ or $a = b$ or $a \ge b$), and this order is preserved by addition and by multiplication by elements that are $\ge 0$.
Then, we have the absolute value $|a|$, equal to $a$ if $a \ge 0$ and to $-a$ if $a \le 0$. One often views $\mathbb{R}$ as a line with $|a - b|$ representing the distance between $a$ and $b$. For $M \subseteq \mathbb{R}$, we say that $a$ is an upper (resp. lower) bound of $M$ if $x \le a$ (resp. $x \ge a$) for all $x \in M$. A supremum (resp. infimum), denoted by $\sup M$ (resp. $\inf M$), is the least upper bound (resp. greatest lower bound), if it exists. Thus, the supremum $s$ of $M$ is characterized by the properties
(1) $\forall x \in M$, $x \le s$, and
(2) if $x \le a$ for all $x \in M$ then $s \le a$
(similarly for the infimum with $\ge$ instead of $\le$). Condition (2) is often expediently replaced by
(2') if $a < s$ then there is an $x \in M$ such that $a < x$
(realize that (1)&(2) is indeed equivalent to (1)&(2')). It is a specific property of the ordered field $\mathbb{R}$ that each non-empty $M \subseteq \mathbb{R}$ that has an upper bound has a supremum or, equivalently, that each non-empty $M \subseteq \mathbb{R}$ that has a lower bound has an infimum.
In mathematical analysis, it is often customary to use the symbols $\infty = +\infty$ and $-\infty$. The supremum (resp. infimum) of the empty set is defined to be $-\infty$ (resp. $\infty$), and the supremum (resp. infimum) of a set with no upper bound (resp. no lower bound) is defined to be $\infty$ (resp. $-\infty$). Accordingly, it is customary to write $-\infty < a < +\infty$ for any real number $a$, and to define $\infty + \infty = (-\infty) \cdot (-\infty) = \infty$, and $(-\infty) + (-\infty) = (-\infty) \cdot \infty = -\infty$, $a \cdot (\pm\infty) = \pm\infty$ resp. $a \cdot (\pm\infty) = \mp\infty$ for $a > 0$ resp. $a < 0$. It is important to keep in mind, however, that the symbols $\infty$, $-\infty$ are not real numbers, and expressions such as $\infty - \infty$ or $0 \cdot \infty$ are undefined (although see Section 6 of Chapter 4 for an exception).
If $M$ is a subset of $\mathbb{R}$ and $\sup(M) \in M$ (resp. $\inf(M) \in M$), we say that the supremum (resp. infimum) is attained, and speak of a maximum resp. minimum. In this case, we may use the notation $\max M$, $\min M$. It is important to keep in mind that, unlike the supremum and infimum, a maximum and/or minimum of a non-empty bounded subset of $\mathbb{R}$ may not exist. A non-empty finite subset of $\mathbb{R}$, however, always has a maximum and a minimum. Variants of notation associated with suprema and infima (resp. maxima and minima) are often used. For example, instead of $\sup M$, one may write $\sup_{x \in M} x$, and similarly for the infimum, etc.
Let us fix notations for open and closed intervals in $\mathbb{R}$: As usual, $(a, b)$ means the set of all $x \in \mathbb{R}$ such that $a < x < b$, where $a, b$ are real numbers or $\pm\infty$. We will denote by $\langle a, b \rangle$ the corresponding closed interval, i.e. the set of all $x \in \mathbb{R} \cup \{\pm\infty\}$ such that $a \le x \le b$. The reader can fill in the meaning of the symbols $\langle a, b)$, $(a, b \rangle$.
1.2 The field of complex numbers $\mathbb{C}$ can be represented as $\mathbb{R} \times \mathbb{R}$ with addition
$$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2)$$
and multiplication
$$(x_1, x_2)(y_1, y_2) = (x_1 y_1 - x_2 y_2, \; x_1 y_2 + x_2 y_1);$$
we have the zero $(0, 0)$ and the unit $(1, 0)$. It is an easy exercise to check that $\mathbb{C}$ has the arithmetic properties of a field (associativity, commutativity, distributivity) and that $-(x_1, x_2) = (-x_1, -x_2)$ and
$$(x_1, x_2)^{-1} = \left( \frac{x_1}{x_1^2 + x_2^2}, \; \frac{-x_2}{x_1^2 + x_2^2} \right).$$
The field of complex numbers, however, has no reasonable order. One introduces the complex conjugate of $x = (x_1, x_2)$ as $\overline{x} = (x_1, -x_2)$. It is easy to see that
$$\overline{x + y} = \overline{x} + \overline{y} \quad \text{and} \quad \overline{x \cdot y} = \overline{x} \cdot \overline{y}. \qquad (*)$$
Further, there is the absolute value (also called the modulus) defined by setting
$$|x| = \sqrt{x \overline{x}} = \sqrt{x_1^2 + x_2^2}$$
(thus, $x^{-1} = \dfrac{\overline{x}}{|x|^2}$). If we view $\mathbb{C}$ as the Euclidean plane (one often speaks of the Gaussian plane) then $|x|$ is the standard distance of $x$ from $(0, 0)$, and $|x - y|$ is the standard Pythagorean distance. Usually one sets $i = (0, 1)$ and writes $x_1 + i x_2$ for $(x_1, x_2)$ (note that the multiplication rule in $\mathbb{C}$ comes from distributivity and the equality $i^2 = -1$). In the other direction, one puts
$$\mathrm{Re}(x_1 + i x_2) = x_1, \quad \mathrm{Im}(x_1 + i x_2) = x_2,$$
and calls these real numbers the real resp. imaginary part of $x_1 + i x_2$. We have a natural embedding of fields $(x \mapsto (x, 0)) : \mathbb{R} \to \mathbb{C}$ which will be used without further mention; note that this embedding respects the absolute value.
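The description of $\mathbb{C}$ as pairs of real numbers translates directly into code. The following minimal sketch implements the operations of 1.2 on Python tuples; the helper names (`c_add`, `c_mul`, `c_conj`, `c_abs`, `c_inv`) are assumptions made for this illustration, and floating-point numbers only approximate real numbers.

```python
import math


def c_add(x, y):
    """(x1, x2) + (y1, y2) = (x1 + y1, x2 + y2)."""
    return (x[0] + y[0], x[1] + y[1])


def c_mul(x, y):
    """(x1, x2)(y1, y2) = (x1*y1 - x2*y2, x1*y2 + x2*y1)."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def c_conj(x):
    """Complex conjugate: (x1, x2) -> (x1, -x2)."""
    return (x[0], -x[1])


def c_abs(x):
    """Modulus sqrt(x1^2 + x2^2)."""
    return math.hypot(x[0], x[1])


def c_inv(x):
    """Multiplicative inverse conj(x) / |x|^2 of a non-zero pair."""
    d = x[0] ** 2 + x[1] ** 2
    if d == 0:
        raise ZeroDivisionError("(0, 0) has no inverse")
    return (x[0] / d, -x[1] / d)


i = (0.0, 1.0)
x = (3.0, 4.0)

print(c_mul(i, i))          # (-1.0, 0.0), i.e. i^2 = -1
print(c_abs(x))             # 5.0
print(c_mul(x, c_inv(x)))   # approximately (1.0, 0.0)
```

Python's built-in `complex` type uses the same multiplication rule, so `complex(*c_mul(x, y))` should agree with `complex(*x) * complex(*y)` up to rounding.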
1.3 Theorem. For the absolute value of complex numbers one has
$$|x + y| \le |x| + |y|.$$
Proof. Let $x = x_1 + i x_2$ and $y = y_1 + i y_2$. We can assume $y \ne 0$. For any real number $\lambda$ we have $0 \le (x_j + \lambda y_j)^2 = x_j^2 + 2\lambda x_j y_j + \lambda^2 y_j^2$, $j = 1, 2$. Adding these inequalities, we obtain
$$0 \le |x|^2 + 2\lambda (x_1 y_1 + x_2 y_2) + \lambda^2 |y|^2.$$
Setting $\lambda = -\dfrac{x_1 y_1 + x_2 y_2}{|y|^2}$ yields
$$0 \le |x|^2 - 2\,\frac{(x_1 y_1 + x_2 y_2)^2}{|y|^2} + \frac{(x_1 y_1 + x_2 y_2)^2}{|y|^4}\,|y|^2 = |x|^2 - \frac{(x_1 y_1 + x_2 y_2)^2}{|y|^2}$$
and hence $(x_1 y_1 + x_2 y_2)^2 \le |x|^2 |y|^2$. Consequently,
$$|x + y|^2 = (x_1 + y_1)^2 + (x_2 + y_2)^2 = |x|^2 + 2(x_1 y_1 + x_2 y_2) + |y|^2 \le |x|^2 + 2|x||y| + |y|^2 = (|x| + |y|)^2.$$
□
1.3.1 Corollary. If $x = x_1 + i x_2$ and $y = y_1 + i y_2$ then $|x - y| \le |x_1 - y_1| + |x_2 - y_2|$ and $|x_j - y_j| \le |x - y|$.
1.3.2 Comment: A function is basically the same thing as a map, although in many texts (including this one), the term function is reserved for a map whose codomain is a set whose elements we perceive as numbers, or at least some closely related generalizations. For example, the codomain may be $\mathbb{R}$, $\mathbb{C}$ or a subset of one of these sets, or it may be, say, $[0, \infty]$. Sometimes, we will allow the codomain to consist even of $n$-tuples of numbers, see for example Chapter 3. While many basic courses define functions simply by formulas without worrying about the domain and codomain, in a rigorous view of the subject, specifying domains and codomains is essential for capturing even the most basic phenomena: Consider, for example, the function
$$f(x) = x^2. \qquad (*)$$
If we specify the domain as $\mathbb{R}$, the function certainly cannot have an inverse no matter what the codomain is, since it is not injective. If we do specify the domain, say, as $[0, \infty)$, and the codomain as $\mathbb{R}$, there is still no inverse, since the function is not onto. If, however, (*) is considered as a function $f : [0, \infty) \to [0, \infty)$, then there is an inverse, which is rather useful, namely $f^{-1}(x) = \sqrt{x}$.
1.4 Polynomials and their roots
1.4.1 Recall that a polynomial with coefficients in $\mathbb{R}$ resp. $\mathbb{C}$ is an expression which is either $0$ (the zero polynomial) or is of the form
$$p(x) \equiv a_n x^n + \cdots + a_1 x + a_0 \quad \text{with } a_j \in \mathbb{R} \text{ resp. } \mathbb{C} \qquad (*)$$
for some $n \in \mathbb{N}_0$, where $a_n \ne 0$. Technically, then, a non-zero polynomial is simply the $(n + 1)$-tuple of real (resp. complex) numbers $(a_0, \dots, a_n)$. (This is the information we would have to specify if we were to store the polynomial, say, on a computer.) The number $n$ is called the degree of the polynomial $p(x)$. The degree of the zero polynomial is not defined. Of course, the polynomial (*) also determines a function
$$(x \mapsto a_n x^n + \cdots + a_1 x + a_0) : \mathbb{R} \to \mathbb{R} \text{ resp. } \mathbb{C} \to \mathbb{C}.$$
The zero polynomial determines a function, too, namely one which is constantly $0$. In analysis, it is quite common to identify a polynomial with the function it determines (although note carefully that the domain and codomain of the function corresponding to a polynomial with real coefficients will change if its coefficients are considered as complex numbers). Nevertheless, this identification is permissible, since two different polynomials over $\mathbb{R}$ (resp. $\mathbb{C}$) never correspond to the same function. To see this, note that it suffices to show that a non-zero polynomial does not correspond to the $0$ function (by passing to the difference). For this, simply note that if $|x_0|$ is very large, then
$$|a_n x_0^n| > |a_{n-1}|\,|x_0^{n-1}| + \cdots + |a_0| \ge |a_{n-1} x_0^{n-1} + \cdots + a_0|,$$
and hence $p(x_0) \ne 0$ by the triangle inequality. In fact, much more is true: a polynomial of degree $n$ can be zero at no more than $n$ different points of $\mathbb{R}$ (or $\mathbb{C}$). Define a complex root of a polynomial $p(x)$ to be a number $c \in \mathbb{C}$ such that $p(c) = 0$. If $c \in \mathbb{R}$, we speak of a real root.
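The remark about storing a polynomial as the tuple $(a_0, \dots, a_n)$ can be made concrete. The sketch below evaluates the function determined by such a tuple using Horner's rule, which is a standard evaluation scheme and not something prescribed by the text; the name `poly_eval` and the sample polynomial are assumptions for this illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate a_n x^n + ... + a_1 x + a_0 at x.

    coeffs is the tuple (a_0, ..., a_n), lowest degree first.
    Horner's rule rewrites the sum as (...(a_n x + a_{n-1}) x + ...) x + a_0.
    """
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


# Hypothetical example: x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3).
p = (-6, 11, -6, 1)

print([poly_eval(p, x) for x in (1, 2, 3)])  # [0, 0, 0]
print(poly_eval(p, 4))                       # 6
print(poly_eval(p, 1j))                      # 10j
```

The same tuple determines a function on $\mathbb{R}$ or on $\mathbb{C}$ depending on which arguments are allowed, which mirrors the remark about the domain and codomain changing with the coefficient field.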
1.4.2 Lemma. If $p(x)$ is a polynomial with coefficients in $\mathbb{C}$ with root $c \in \mathbb{C}$, then there exists a unique polynomial $q(x)$ with coefficients in $\mathbb{C}$ such that
$$p(x) = q(x)(x - c).$$
Moreover, if $p(x)$ has degree $n$, then $q(x)$ has degree $n - 1$. If the coefficients of $p(x)$ and the number $c$ are real, then the coefficients of the polynomial $q(x)$ are real.
Proof. For existence, recall (or observe by chain cancellation) that for $k \in \mathbb{N}$,
$$x^k - c^k = (x - c)(x^{k-1} + x^{k-2} c + \cdots + x c^{k-2} + c^{k-1}).$$
Therefore, $p(x) - p(c) = a_n (x^n - c^n) + \cdots + a_1 (x - c)$ can be written as $x - c$ times another polynomial. If $c$ is a root of $p(x)$, then $p(c) = 0$ by definition, so our statement follows. For uniqueness, note that for a non-zero polynomial $q(x)$ of degree $k$, the polynomial $q(x)(x - c)$ has degree $k + 1$, and hence is non-zero. □
We immediately have the following
1.4.3 Corollary. A polynomial $p(x)$ of degree $n$ with coefficients in $\mathbb{R}$ or $\mathbb{C}$ has at most $n$ distinct roots.
1.4.4 Proposition. Let $c$ be a (possibly complex) root of a polynomial $p$ with coefficients in $\mathbb{R}$. Then the complex conjugate $\overline{c}$ is also a root of $p$.
Proof. By 1.2.(*), $p(\overline{c}) = \overline{p(c)}$. □
The Fundamental Theorem of Algebra (which will be proved in Chapter 10, Theorem 5.2) states that every polynomial of degree $\ge 1$ has a root in $\mathbb{C}$. By Lemma 1.4.2, we then see that every polynomial of degree $n$ with coefficients in $\mathbb{C}$ can be written uniquely (up to order of factors) as
$$p(x) = a_n (x - c_1) \cdots (x - c_n)$$
for some complex numbers $c_1, \dots, c_n$. (The uniqueness is proved by induction.) Note that the numbers $c_1, \dots, c_n$ may not be all distinct. When $c = c_i$ for exactly $k > 0$ different values $i \in \{1, \dots, n\}$, we say that the root $c$ has multiplicity $k$.
Applying Proposition 1.4.4 inductively, if a polynomial $p(x)$ has real coefficients, then the multiplicity of the root $c$ is equal to the multiplicity of $\overline{c}$.
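The factor $q(x)$ of Lemma 1.4.2 can be computed explicitly. The sketch below uses synthetic division, a standard procedure that is not spelled out in the text; the function name `divide_by_root` and the coefficient convention $(a_0, \dots, a_n)$ are assumptions made for this illustration. It returns both $q$ and the remainder, which equals $p(c)$ and is therefore $0$ exactly when $c$ is a root.

```python
def divide_by_root(coeffs, c):
    """Divide p(x) by (x - c).

    coeffs is (a_0, ..., a_n), lowest degree first. Returns (q, r) where
    q is the coefficient tuple of the quotient and r = p(c) is the
    remainder, so that p(x) = q(x) * (x - c) + r.
    """
    n = len(coeffs) - 1
    q = [0] * n
    carry = 0
    # b_{n-1} = a_n, then b_{k-1} = a_k + c * b_k for k = n-1, ..., 1.
    for k in range(n, 0, -1):
        carry = coeffs[k] + carry * c
        q[k - 1] = carry
    remainder = coeffs[0] + carry * c
    return tuple(q), remainder


# Hypothetical example: x^3 - 6x^2 + 11x - 6 with the root c = 1.
p = (-6, 11, -6, 1)
q, r = divide_by_root(p, 1)
print(q, r)   # (6, -5, 1) 0, i.e. q(x) = x^2 - 5x + 6

# A real polynomial with the conjugate pair of roots i and -i: x^2 + 1.
q1, r1 = divide_by_root((1, 0, 1), 1j)
q2, r2 = divide_by_root(q1, -1j)
print(r1 == 0, r2 == 0)   # True True
```

In the second example, dividing first by $x - i$ and then by $x + i$ leaves zero remainder both times, in line with Proposition 1.4.4.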
2 Convergent and Cauchy sequences
2.1 A sequence $(x_n)_n$ in $\mathbb{R}$ or in $\mathbb{C}$ is said to converge to $x$ if
$$\forall \varepsilon > 0 \;\; \exists n_0 \text{ such that } n \ge n_0 \Rightarrow |x_n - x| < \varepsilon.$$
We write $\lim_n x_n = x$ or simply $\lim x_n = x$.
The reader is certainly familiar with the easy facts such as $\lim(x_n + y_n) = \lim x_n + \lim y_n$ or $\lim(x_n y_n) = \lim x_n \cdot \lim y_n$, etc.
2.2 A sequence $(x_n)_n$ in $\mathbb{R}$ or in $\mathbb{C}$ is said to be Cauchy if
$$\forall \varepsilon > 0 \;\; \exists n_0 \text{ such that } m, n \ge n_0 \Rightarrow |x_m - x_n| < \varepsilon.$$
Observation. Every convergent sequence is Cauchy. (If we have the implication $n \ge n_0 \Rightarrow |x_n - x| < \varepsilon$ then $m, n \ge n_0 \Rightarrow |x_m - x_n| \le |x_m - x| + |x - x_n| < 2\varepsilon$.)
2.3 Theorem. If $a \le x_n \le b$ for all $n$, then the sequence $(x_n)_n$ contains a convergent subsequence $(x_{k_n})_n$, and $a \le \lim_n x_{k_n} \le b$.
Proof. Let $a \le x_n \le b$ for all $n$. Set
$$M = \{x \mid \exists \text{ infinitely many } n \text{ such that } x \le x_n\}.$$
This set is non-empty ($a \in M$) and bounded (no $x > b$ is in $M$) and hence there is a finite $s = \sup M$. By the definition, each
$$K_n = \{k \mid s - \tfrac{1}{n} < x_k < s + \tfrac{1}{n}\}$$
is infinite, and we can choose, first, $x_{k_1}$ such that $s - 1 < x_{k_1} < s + 1$, and if $k_1 < \cdots < k_n$ are chosen with $k_j \in K_j$ we can choose a $k_{n+1} \in K_{n+1}$ such that $k_{n+1} > k_n$. Then obviously $\lim_n x_{k_n} = s$, and equally obviously $a \le s \le b$. □
2.4 Theorem. (Bolzano - Cauchy) Every Cauchy sequence of real numbers converges.
Proof. Since for some $m$ and all $n \ge m$, $|x_n - x_m| < 1$, a Cauchy sequence is bounded and hence, by Theorem 2.3, it contains a subsequence $x_{k_1}, \dots, x_{k_n}, \dots$ converging to an $x$. But then $\lim_n x_n = x$: indeed, choose for an $\varepsilon > 0$ an $n_0$ such that for $m, n \ge n_0$ we have $|x_m - x_n| < \varepsilon$ and $|x - x_{k_n}| < \varepsilon$. Then, since $k_n \ge n$, $|x - x_n| \le |x - x_{k_n}| + |x_{k_n} - x_n| < 2\varepsilon$ for $n \ge n_0$. □
2.5 From 1.3.1, we see that if $(x_n = x_{n1} + i x_{n2})_n$ is a sequence of complex numbers then $(x_n)_n$ converges if and only if both $(x_{nj})_n$, $j = 1, 2$, converge, and $(x_n)_n$ is Cauchy if and only if both $(x_{nj})_n$ are Cauchy. Consequently we can infer from Theorem 2.4 the following
Corollary. Every Cauchy sequence of complex numbers converges.
3 Continuous functions
3.1 Recall that a real (resp. complex) function of one real (resp. complex) variable is a mapping $f : X \to \mathbb{R}$ (resp. $X \to \mathbb{C}$) with $X \subseteq \mathbb{R}$ (resp. $X \subseteq \mathbb{C}$). In the real case $X$ will be most often an interval, that is, a set $J \subseteq \mathbb{R}$ such that $x, y \in J$ and $x \le z \le y$ implies that $z \in J$. Recall the standard notation from 1.1 for (bounded) open and closed intervals:
$$(a, b) = \{x \mid a < x < b\} \quad \text{and} \quad \langle a, b \rangle = \{x \mid a \le x \le b\}.$$
The intervals $\langle a, b \rangle$ will be often referred to as compact intervals; the reason for this terminology will become apparent in Chapter 2 below. A function $f : X \to \mathbb{R}$ resp. $\mathbb{C}$ is said to be continuous if
$$\forall x \in X \;\; \forall \varepsilon > 0 \;\; \exists \delta > 0 \text{ such that } |y - x| < \delta \Rightarrow |f(y) - f(x)| < \varepsilon. \qquad (3.1.1)$$
3.2 Proposition. A function $f$ is continuous if and only if for every convergent sequence $(x_n)_n$ in $X$ with limit in $X$ one has $f(\lim x_n) = \lim f(x_n)$.
Proof. If $f$ is continuous, if $x = \lim x_n \in X$ and if $\varepsilon > 0$ then first choose a $\delta > 0$ as in (3.1.1) and then an $n_0$ such that $|x_n - x| < \delta$ for $n \ge n_0$. Then $|f(x_n) - f(x)| < \varepsilon$ for $n \ge n_0$. Now suppose $f$ is not continuous. Then there is an $x \in X$ and an $\varepsilon > 0$ such that for every $\delta > 0$ there is a $y(\delta)$ such that $|y(\delta) - x| < \delta$ and $|f(y(\delta)) - f(x)| \ge \varepsilon$. Set $x_n = y(\frac{1}{n})$. Then $\lim_n x_n = x$ while $f(x_n)$ cannot converge to $f(x)$. □
3.3 Theorem. (The Intermediate Value Theorem) Let $J$ be an interval, let $f : J \to \mathbb{R}$ be a continuous function, and let for some $u < v$ in $J$, $\min(f(u), f(v)) \le K \le \max(f(u), f(v))$. Then there is an $x \in \langle u, v \rangle$ such that $f(x) = K$.
Proof. Since a restriction of a continuous function is obviously continuous, since $f$ is continuous if and only if $-f$ is, and since if $f$ is continuous then any $x \mapsto f(x) - K$ with $K$ fixed is continuous, it suffices to prove that if $f : \langle a, b \rangle \to \mathbb{R}$ is continuous and $f(a) \le 0 \le f(b)$ then there is a $c \in \langle a, b \rangle$ such that $f(c) = 0$. Set $c = \sup\{x \in \langle a, b \rangle \mid f(x) \le 0\}$. Suppose $f(c) > 0$. Then for $\varepsilon = f(c)$ we have a $\delta > 0$ such that for $c - \delta < x \le c$, $f(x) > f(c) - \varepsilon = 0$, while by the definition of the supremum there should exist an $x > c - \delta$, $x \le c$, such that $f(x) \le 0$. Similarly we cannot have $f(c) < 0$ (in which case $c < b$) because for $\varepsilon = -f(c)$ we would have a $\delta > 0$ with $f(x) < f(c) + \varepsilon = 0$ for $c \le x < c + \delta$, contradicting the definition of $c$ again. Thus, $f(c) = 0$. □
3.4 Theorem. A continuous function $f : \langle a, b \rangle \to \mathbb{R}$ on a compact interval attains a maximum and a minimum.
Proof for the maximum. Set $M = \{f(x) \mid x \in \langle a, b \rangle\}$. If $M$ is not bounded from above, choose $x_n$ with $f(x_n) > n$ and consider a convergent subsequence $(x_{k_n})_n$ with limit $y$ (Theorem 2.3). We have $f(y) = \lim_n f(x_{k_n})$ by Proposition 3.2, which is impossible because the sequence $(f(x_{k_n}))_n$ is unbounded. Hence $M$ is bounded from above and has a finite supremum $s$. Now choose $x_n$ with $s - \frac{1}{n} < f(x_n) \le s$, and a convergent subsequence $(x_{k_n})_n$ with limit $y \in \langle a, b \rangle$ to obtain $f(y) = \lim_n f(x_{k_n}) = s$. □
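The Intermediate Value Theorem (3.3) guarantees a zero of a continuous $f$ on $\langle a, b \rangle$ whenever $f(a) \le 0 \le f(b)$. The proof above locates it as a supremum; numerically, one usually approximates such a point by bisection, which is a different (though closely related) technique and not the argument used in the text. The following is a minimal sketch; the name `bisect_root`, the tolerance and the sample function are assumptions for the example.

```python
def bisect_root(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) <= 0 <= f(b).

    Each step halves the interval while keeping f(a) <= 0 <= f(b),
    so the Intermediate Value Theorem still applies to the smaller interval.
    """
    if f(a) > 0 or f(b) < 0:
        raise ValueError("this sketch assumes f(a) <= 0 <= f(b)")
    for _ in range(max_iter):
        if b - a <= tol:
            break
        m = (a + b) / 2.0
        if f(m) <= 0:
            a = m   # a zero lies in [m, b]
        else:
            b = m   # a zero lies in [a, m]
    return (a + b) / 2.0


# Hypothetical example: f(x) = x^2 - 2 on [0, 2]; the zero is sqrt(2).
root = bisect_root(lambda x: x * x - 2.0, 0.0, 2.0)
print(root)   # close to 1.41421356...
```

The intervals produced form a nested sequence whose endpoints are Cauchy sequences, so their common limit exists by Theorem 2.4, and continuity forces the value there to be zero.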
3.5 A function $f$ is said to be uniformly continuous if
$$\forall \varepsilon > 0 \;\; \exists \delta > 0 \text{ such that } \forall x, y, \;\; |y - x| < \delta \Rightarrow |f(y) - f(x)| < \varepsilon.$$
3.5.1 Theorem. A continuous function on $\langle a, b \rangle$ is uniformly continuous.
Proof. Suppose not. Then there exists an $\varepsilon > 0$ such that
$$\forall n \;\; \exists x_n, y_n \text{ such that } |x_n - y_n| < \frac{1}{n} \text{ and } |f(x_n) - f(y_n)| \ge \varepsilon.$$
Choose a convergent subsequence $(x_{k_n})_n$ and then a convergent subsequence $(y_{k_{m_n}})_n$ of $(y_{k_n})_n$. Then we have $\lim_n x_{k_{m_n}} = \lim_n y_{k_{m_n}}$, contradicting Proposition 3.2 and the inequality $|\lim_n f(x_{k_{m_n}}) - \lim_n f(y_{k_{m_n}})| \ge \varepsilon$. □
4 Derivatives and the Mean Value Theorem
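4.1 Let $f : X \to \mathbb{R}$ be a function, $X \subseteq \mathbb{R}$. We say that $f$ has a limit $A$ at a point $a$ and write
$$\lim_{x \to a} f(x) = A$$
if it is defined on $(u, v) \smallsetminus \{a\}$ for some $u < a < v$ and if
$$\forall \varepsilon > 0 \;\; \exists \delta > 0 \text{ such that } x \in (a - \delta, a + \delta) \smallsetminus \{a\} \Rightarrow |f(x) - A| < \varepsilon.$$
Note that $f$ does not have to be defined in $a$, and if it is, $\lim_{x \to a} f(x) = A$ does not say anything about the value $f(a)$.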
4.2 Let $J$ be an open interval. A function $f : J \to \mathbb{R}$ has a derivative $A$ in a point $x$ if
$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = A$$
(that is, if the limit on the left-hand side exists, and if it is equal to $A$). The reader is certainly familiar with the notation
$$A = f'(x), \quad \text{or} \quad \frac{df(x)}{dx},$$
and with the basic computation rules like $(f + g)' = f' + g'$ or $(fg)' = f'g + fg'$ etc.
4.3 Theorem. A function $f$ has a derivative $A$ at the point $x$ if and only if there is a function $\mu$ defined on some $(-\delta, \delta) \smallsetminus \{0\}$ ($\delta > 0$) such that $\lim_{h \to 0} \mu(h) = 0$ and
$$f(x + h) - f(x) = Ah + h\mu(h).$$
Proof. If such a $\mu$ exists we have for $h \in (-\delta, \delta) \smallsetminus \{0\}$,
$$\frac{f(x + h) - f(x)}{h} = A + \mu(h)$$
and hence $\lim_{h \to 0} \dfrac{f(x + h) - f(x)}{h} = A$. On the other hand, if the derivative exists then we can set $\mu(h) = \dfrac{f(x + h) - f(x)}{h} - A$. □
4.3.1 Corollary. If $f$ has a non-zero derivative at a point $x$ then $f(x)$ is neither a maximum nor a minimum value of $f$. (A maximum resp. minimum value of a function $f$ is the maximum resp. minimum, if one exists, of the set of values of $f$.)
(Indeed, consider $f(x + h) - f(x) = h(A + \mu(h))$ for $|\mu(h)| < |A|$: the factor $A + \mu(h)$ then has the sign of $A$, so the difference changes sign with $h$.)
A point at which a function $f$ has zero derivative or the derivative does not exist is called a critical point. Corollary 4.3.1 implies that critical points are the only points at which a function $f$ can have a minimum or a maximum. It is, of course, not guaranteed that a critical point would be an actual minimum or maximum (take the point $x = 0$ for the function $f(x) = x^3$). However, see Theorem 4.7 below for a partial converse of the Corollary.
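Theorem 4.3 says that the derivative $A$ is exactly the number for which the error term in $f(x + h) - f(x) = Ah + h\mu(h)$ satisfies $\mu(h) \to 0$. The sketch below tabulates this error term numerically; the function name `mu`, the sample function $f(t) = t^3$ and the point $x = 2$ are assumptions chosen for the illustration, and floating-point rounding limits how small $h$ can usefully be taken.

```python
def mu(f, x, A, h):
    """Error term of Theorem 4.3: (f(x + h) - f(x)) / h - A, for h != 0."""
    return (f(x + h) - f(x)) / h - A


# Hypothetical example: f(t) = t^3 at x = 2, where f'(2) = 12.
f = lambda t: t ** 3
x = 2.0

# With the correct A = 12 the error term equals 6h + h^2 and shrinks with h.
for h in (1e-1, 1e-2, 1e-3):
    print(h, mu(f, x, 12.0, h))

# With a wrong candidate A = 11 the error term tends to 1, not to 0.
for h in (1e-1, 1e-2, 1e-3):
    print(h, mu(f, x, 11.0, h))
```

For this particular $f$ one can check by hand that $\mu(h) = 3xh + h^2$, which makes the convergence to $0$ visible without any computation.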
4.4
The Mean Value Theorem
4.4.1 Theorem. (Rolle) Let $f$ be continuous in $\langle a,b\rangle$ and let it have a derivative in $(a,b)$. Let $f(a) = f(b)$. Then there is a $c \in (a,b)$ such that $f'(c) = 0$.
Proof. If $f$ is constant then $f'(c) = 0$ for all $c$. If not then, as $f(a) = f(b)$, either its maximum or its minimum (recall Theorem 3.4) has to be attained at a $c \in (a,b)$. By 4.3.1, $f'(c) = 0$. □
4.4.2 Theorem. (The Mean Value Theorem, Lagrange's Theorem) Let $f$ be continuous in $\langle a,b\rangle$ and let it have a derivative in $(a,b)$. Then there is a $c \in (a,b)$ such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
More generally, if, furthermore, $g$ is a function with the same properties and such that $g(b) \ne g(a)$ and $g'(x) \ne 0$ in $(a,b)$, then there is a $c \in (a,b)$ such that
$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}.$$
Proof. Set $F(x) = (f(x) - f(a))(g(b) - g(a)) - (f(b) - f(a))(g(x) - g(a))$. Then $F(a) = F(b) = 0$ and $F'(x) = f'(x)(g(b) - g(a)) - g'(x)(f(b) - f(a))$. By Theorem 4.4.1 there is a $c \in (a,b)$ with $F'(c) = 0$, and the second formula follows on dividing by $g'(c)(g(b) - g(a))$. For the first one, set $g(x) = x$. □
4.4.3 The Mean Value Theorem is often used in the following form (to be compared with 4.3): let $x$, $x+h$ both be in an interval in which $f$ has a derivative. Then
$$f(x+h) - f(x) = f'(x + \theta h)\, h \quad \text{for some } \theta \in (0,1).$$
(Use 4.4.2 for $\langle x, x+h\rangle$ resp. $\langle x+h, x\rangle$.)
4.4.4 Corollary. If $f$ is continuous in $\langle a,b\rangle$ and if it has a positive (resp. negative) derivative in $(a,b)$ then it strictly increases (resp. decreases) in $\langle a,b\rangle$ (i.e. $x < y \Rightarrow f(x) < f(y)$ resp. $x < y \Rightarrow f(x) > f(y)$). If $f' \equiv 0$ in $(a,b)$ then $f$ is constant.
(For, $f(y) - f(x) = f'(c)(y - x)$ for some $c$ between $x$ and $y$.)
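The point $c$ of Theorem 4.4.2 can be located numerically in simple cases. The following is a minimal sketch, assuming that $f'(t)$ minus the secant slope changes sign between $a$ and $b$ so that bisection applies; the helper name `mean_value_point` and the sample function $f(x) = x^3$ on $\langle 0, 2\rangle$ are illustrative choices, not part of the text.

```python
def mean_value_point(f, df, a, b, tol=1e-12):
    """Find c in (a, b) with df(c) = (f(b) - f(a)) / (b - a) by bisection.

    Assumption (not guaranteed in general): df(t) - slope has opposite
    signs at a and b, so a sign change can be bracketed.
    """
    slope = (f(b) - f(a)) / (b - a)
    g = lambda t: df(t) - slope
    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change of f' - slope on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid      # the sign change lies in the left half
        else:
            lo = mid      # the sign change lies in the right half
    return (lo + hi) / 2


f = lambda x: x ** 3          # sample function (illustrative)
df = lambda x: 3 * x ** 2     # its derivative
c = mean_value_point(f, df, 0.0, 2.0)
print(c, df(c))               # df(c) should match the secant slope 4
```

Here the secant slope is $(8 - 0)/2 = 4$, so the point found should satisfy $3c^2 = 4$, that is, $c = 2/\sqrt{3}$.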
4.5
The second derivative, convex and concave functions
Suppose f has a derivative f 0 .x/ at every x 2 J , where J is an open interval. Thus, we have a new real function f 0 W J ! R and this function may have a derivative again. In such a case we speak of the second derivative.
4.5.1 A function $f$ is said to be convex resp. concave on an interval $\langle a,b\rangle$ if for any two $x < y$ in $\langle a,b\rangle$ and any $z = tx + (1-t)y$ ($0 < t < 1$) between these arguments,
$$f(z) \le t f(x) + (1-t) f(y) \quad \text{resp.} \quad f(z) \ge t f(x) + (1-t) f(y)$$
(that is, the points of the graph of $f$ lie below (resp. above) the straight line connecting the points $(x, f(x))$ and $(y, f(y))$).
4.5.2 Proposition. Let $f$ be continuous on $\langle a,b\rangle$ and let $f$ have a non-negative (resp. non-positive) second derivative on $(a,b)$. Then it is convex (resp. concave) on $\langle a,b\rangle$.
Proof. In the notation above we have
$$y - z = y - tx - (1-t)y = t(y - x), \qquad z - x = (1-t)(y - x).$$
Let the second derivative be non-negative. Then, by Theorem 4.4.2, we have $x < u < z < v < y$ and $u < w < v$ such that
$$\frac{f(y) - f(z)}{y - z} - \frac{f(z) - f(x)}{z - x} = f'(v) - f'(u) = f''(w)(v - u) \ge 0$$
so that
$$\frac{f(y) - f(z)}{t(y - x)} \ge \frac{f(z) - f(x)}{(1-t)(y - x)},$$
hence $(1-t)(f(y) - f(z)) \ge t(f(z) - f(x))$ and finally $t f(x) + (1-t) f(y) \ge f(z)$. □
4.5.3 An application: Young's inequality
We have
Proposition. Let $a, b > 0$ and let $p, q > 1$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Then
$$ab \le \frac{a^p}{p} + \frac{b^q}{q}.$$
Proof. Since $\ln''(x) = -\frac{1}{x^2} < 0$, $\ln$ is concave; thus if, say, $a^p < b^q$ we have
$$\ln\left(\frac{1}{p} a^p + \frac{1}{q} b^q\right) \ge \frac{1}{p}\ln(a^p) + \frac{1}{q}\ln(b^q) = \ln a + \ln b = \ln(ab)$$
and since $\ln$ increases, the inequality follows. □
4.6
Derivatives of higher order and Taylor’s Theorem
Just as we defined the first and second derivative of a function on an open interval $J$, we may iterate the process to define the third, fourth derivative, etc. In general, we speak of the derivative of $n$'th order, and define
$$f^{(0)} = f, \quad f^{(1)} = f' \quad \text{and further} \quad f^{(n+1)} = (f^{(n)})'.$$
(Of course, as before, for a given function, such higher derivatives may or may not exist.)
4.6.1 Theorem. (Taylor) Let $f$ have derivatives up to order $n+1$ in an open interval containing $a$ and $x$, $a \ne x$. Then there is a $c$ in the open interval between $a$ and $x$ such that
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}.$$
Proof. Fix $x$ and $a$ and define a function $R(t)$ of one real variable $t$ by setting
$$R(t) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(t)}{k!}(x-t)^k.$$
Then we have
$$R'(t) = \frac{dR(t)}{dt} = -\sum_{k=0}^{n} \frac{f^{(k+1)}(t)}{k!}(x-t)^k + \sum_{k=1}^{n} \frac{f^{(k)}(t)}{k!}\, k (x-t)^{k-1}.$$
Substituting $k = l + 1$ in the second sum we obtain
$$R'(t) = -\sum_{k=0}^{n} \frac{f^{(k+1)}(t)}{k!}(x-t)^k + \sum_{l=0}^{n-1} \frac{f^{(l+1)}(t)}{l!}(x-t)^l = -\frac{f^{(n+1)}(t)}{n!}(x-t)^n.$$
Now define $g(t) = (x-t)^{n+1}$. Then $g'(t) = -(n+1)(x-t)^n$ and $g(x) = 0$. Since also $R(x) = 0$ we obtain from Theorem 4.4.2,
$$\frac{R(a)}{g(a)} = \frac{R(a) - R(x)}{g(a) - g(x)} = \frac{R'(c)}{g'(c)} = \frac{f^{(n+1)}(c)(x-c)^n}{n!(n+1)(x-c)^n}$$
and hence
$$R(a) = \frac{f^{(n+1)}(c)}{n!(n+1)}\, g(a) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}$$
and the statement follows, since $R(a) = f(x) - \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k$, that is, $f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + R(a)$. □
4.7
Local extremes
One immediate consequence of Taylor's Theorem is a partial converse of Corollary 4.3.1. Suppose a function $f$ is defined on an open interval containing a point $x_0$. We say that $x_0$ is a local maximum (resp. local minimum) of $f$ if there exists a $\delta > 0$ such that for all $x \in (x_0 - \delta, x_0 + \delta)$ with $x \ne x_0$, $f(x) < f(x_0)$ (resp. $f(x) > f(x_0)$). We have the following
Theorem. Let $f$ be a function such that $f'$ and $f''$ exist and are continuous on an open interval $(a,b)$ containing a point $x_0$. Suppose further that $f'(x_0) = 0$, $f''(x_0) < 0$ (resp. $f''(x_0) > 0$). Then $x_0$ is a local maximum (resp. local minimum) of $f$.
Proof. Let us treat the case of $f''(x_0) = q > 0$; the proof in the other case is analogous. By Taylor's Theorem (with $n = 1$ and $f'(x_0) = 0$), for $x \in (a,b)$, $x \ne x_0$, there exists a point $c$ in the open interval between $x_0$ and $x$ such that
$$f(x) = f(x_0) + \frac{f''(c)}{2}(x - x_0)^2. \qquad (*)$$
Since $f''$ is continuous, there exists a $\delta > 0$ such that $f''$ is positive on $(x_0 - \delta, x_0 + \delta)$; in particular $f''(c) > 0$ whenever $x \in (x_0 - \delta, x_0 + \delta)$, because $c$ lies between $x_0$ and $x$. Then it follows immediately from (*) that if $x \in (x_0 - \delta, x_0 + \delta)$, $x \ne x_0$, then $f(x) > f(x_0)$. □
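The remainder formula of Theorem 4.6.1, on which the proof above relies, can be checked numerically. The sketch below takes $f = \exp$ and $a = 0$ (an illustrative choice; the functions `taylor_exp` and `lagrange_bound` are hypothetical helpers, and `math.exp` stands in for $e^x$). Since every derivative of $\exp$ is $\exp$ and $c$ lies between $0$ and $x$, the remainder is at most $e^{\max(x,0)} |x|^{n+1}/(n+1)!$.

```python
import math


def taylor_exp(x, n):
    """Taylor polynomial of exp of order n centered at a = 0, evaluated at x."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def lagrange_bound(x, n):
    """Upper bound for |f^(n+1)(c)| * |x|^(n+1) / (n+1)! with f = exp.

    Uses exp(c) <= exp(max(x, 0)) for every c between 0 and x.
    """
    return math.exp(max(x, 0.0)) * abs(x) ** (n + 1) / math.factorial(n + 1)


x, n = 1.5, 8
error = abs(math.exp(x) - taylor_exp(x, n))   # actual size of the remainder
bound = lagrange_bound(x, n)                  # bound from Theorem 4.6.1
print(error, bound)
```

The printed error should not exceed the printed bound; increasing `n` shrinks both.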
5
Uniform convergence
5.1 Let $f_n$ be real or complex functions defined on a set $X$. We write $\lim_n f_n = f$, or briefly $f_n \to f$, if $\lim_n f_n(x) = f(x)$ for all $x \in X$, and say that $f_n$ converge to $f$ pointwise. This convergence is not very satisfactory: consider $f_n : \langle 0,1\rangle \to \mathbb{R}$ defined by $f_n(x) = x^n$, an example where all the $f_n$ are continuous while the limit $f$ is not. We shall need to work with a stronger concept. A sequence of (real or complex) functions $(f_n)_n$ is said to converge to $f$ uniformly if
$$\forall \varepsilon > 0 \ \exists n_0 \text{ such that } \forall n \ge n_0 \ \forall x, \ |f_n(x) - f(x)| < \varepsilon.$$
This is often indicated by writing $f_n \rightrightarrows f$.
5.2 Theorem. Let $f_n$ be continuous and let $f_n \rightrightarrows f$. Then $f$ is continuous.
Proof. Take an $x_0 \in X$ and an $\varepsilon > 0$. Choose an $n$ such that for all $x$, $|f_n(x) - f(x)| < \frac{\varepsilon}{3}$, and then (using the continuity of this $f_n$) a $\delta > 0$ such that $|f_n(x_0) - f_n(x)| < \frac{\varepsilon}{3}$ for $|x_0 - x| < \delta$. Then for $|x_0 - x| < \delta$,
$$|f(x_0) - f(x)| \le |f(x_0) - f_n(x_0)| + |f_n(x_0) - f_n(x)| + |f_n(x) - f(x)| < \varepsilon. \qquad □$$
5.3 Theorem. Let $f_n$ have derivatives on an open interval $J$, let $f_n \to f$ and let $f_n' \rightrightarrows g$. Then $f$ has a derivative and $f' = g$.
Proof. By the Mean Value Theorem we have, for some $0 < \theta < 1$,
$$\left|\frac{f(x+h) - f(x)}{h} - g(x)\right| = \left|\frac{f(x+h) - f_n(x+h)}{h} - \frac{f(x) - f_n(x)}{h} + \frac{f_n(x+h) - f_n(x)}{h} - g(x)\right|$$
$$= \left|\frac{f(x+h) - f_n(x+h)}{h} - \frac{f(x) - f_n(x)}{h} + f_n'(x + \theta h) - g(x)\right|$$
$$\le \frac{1}{|h|}|f(x+h) - f_n(x+h)| + \frac{1}{|h|}|f(x) - f_n(x)| + |f_n'(x + \theta h) - g(x + \theta h)| + |g(x + \theta h) - g(x)|.$$
Fix an $h \ne 0$ small enough that $|g(x + \theta h) - g(x)| < \frac{\varepsilon}{4}$ for all $\theta \in (0,1)$. Then choose an $n$ such that
(1) $|f(x+h) - f_n(x+h)| < \frac{\varepsilon}{4}|h|$ and $|f(x) - f_n(x)| < \frac{\varepsilon}{4}|h|$, and
(2) $|f_n'(x + \theta h) - g(x + \theta h)| < \frac{\varepsilon}{4}$.
(Inequality (2) is where we need the convergence to be uniform: we do not know the exact position of $x + \theta h$.) Then
$$\left|\frac{f(x+h) - f(x)}{h} - g(x)\right| < \frac{1}{|h|}\frac{\varepsilon}{4}|h| + \frac{1}{|h|}\frac{\varepsilon}{4}|h| + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \varepsilon. \qquad □$$
6
Series. Series of functions
6.1 Let $(a_n)_n$ be a sequence of real or complex numbers. The associated series (or sum of a series) $\sum_{n=1}^{\infty} a_n$ (briefly, $\sum a_n$ if there is no danger of confusion) is the limit $\lim_n \sum_{k=1}^{n} a_k$ provided it exists; in such a case we say that $\sum a_n$ converges, and we say that it converges absolutely if $\sum |a_n|$ converges.
6.2
Consequences of Absolute Convergence
6.2.1 Proposition. An absolutely convergent series converges. More generally, if $|a_n| \le b_n$ and $\sum b_n$ converges then $\sum a_n$ converges.
Proof. Set $s_n = \sum_{k=1}^{n} a_k$ and $\bar{s}_n = \sum_{k=1}^{n} b_k$. For $m \ge n$ we have
$$|s_m - s_n| = \left|\sum_{k=n+1}^{m} a_k\right| \le \sum_{k=n+1}^{m} |a_k| \le \sum_{k=n+1}^{m} b_k = |\bar{s}_m - \bar{s}_n|.$$
Thus, if $(\bar{s}_n)_n$ is convergent, hence Cauchy, then $(s_n)_n$ is Cauchy and hence convergent. □
6.2.2 Proposition. The series $\sum a_n$ converges absolutely if and only if for every $\varepsilon > 0$ there is an $n_0$ with $\sum_{k \in K} |a_k| < \varepsilon$ for every finite $K \subseteq \{n \mid n \ge n_0\}$.
Proof. The formula is equivalent to stating that $\sum_{k=n}^{m} |a_k| < \varepsilon$ for $n_0 \le n \le m$. Thus, the condition amounts to stating that the sequence $\left(\sum_{k=1}^{n} |a_k|\right)_n$ is Cauchy. □
6.2.3 Theorem. Let $\sum a_n$ converge absolutely. Then for all bijections $p$ from the set of natural numbers $\{1, 2, \dots\}$ to itself the sums $\sum_{n=1}^{\infty} a_{p(n)}$ are equal.
Proof. Let $\sum_{n=1}^{\infty} a_{p(n)} = s$ for a bijection $p$. Choose $n_1$ sufficiently large such that $\sum_{k \in K} |a_k| < \frac{\varepsilon}{2}$ for every finite $K \subseteq \{n \mid n \ge n_1\}$ and, further, an $n_0$ such that for $n \ge n_0$ we have
$$\left|\sum_{k=1}^{n} a_{p(k)} - s\right| < \frac{\varepsilon}{2} \quad \text{and} \quad \{p(1), \dots, p(n)\} \supseteq \{1, \dots, n_1\}.$$
Now if $n \ge \max\{p(1), \dots, p(n_0)\}$ and we consider $K = \{1, \dots, n\} \setminus \{p(1), \dots, p(n_0)\}$, we obtain
$$\left|\sum_{k=1}^{n} a_k - s\right| = \left|\sum_{k=1}^{n_0} a_{p(k)} + \sum_{k \in K} a_k - s\right| \le \left|\sum_{k=1}^{n_0} a_{p(k)} - s\right| + \sum_{k \in K} |a_k| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \qquad □$$
6.3 It is worth taking this a little further. A set $S$ is called countable if there exists a bijection $\varphi : \{1, 2, \dots\} \to S$. Note that this is the same as ordering $S$ into an infinite sequence $s_1, s_2, \dots$ where the $s_i$ go through all elements of $S$, and each element occurs exactly once. Let us say that $\sum_{s \in S} a_s$ converges absolutely if
$$\sup_{K \subseteq S \text{ finite}} \ \sum_{s \in K} |a_s|$$
is finite. By Proposition 6.2.2, this is equivalent to $\sum_{n=1}^{\infty} a_{\varphi(n)}$ converging absolutely for one specified bijection $\varphi$ (which can be arbitrary). Theorem 6.2.3 then shows that when this occurs, then $\sum_{s \in S} a_s$ is well-defined. Here is an example where this point of view helps:
6.3.1 Theorem. Let $S_1, S_2, \dots$ be disjoint finite or countable sets, and let $S = \bigcup_i S_i$. Then the set $S$ is finite or countable. Furthermore, if $\sum_{s \in S} a_s$ converges absolutely, then
$$\sum_{i=1}^{\infty} \left(\sum_{s \in S_i} a_s\right) = \sum_{s \in S} a_s, \qquad (*)$$
and the left-hand side converges absolutely.
Proof. The case when $S$ is finite is not interesting. Otherwise, we may order the elements of $S$ into an infinite sequence as follows: Assume each of the sets $S_i$ is ordered into a (finite or infinite) sequence. Then let $T_n$ consist of all the $i$'th elements (if any) of $S_j$ such that $1 \le i, j \le n$. Then clearly each $T_n$ is finite, $T_n \subseteq T_{n+1}$, and $\bigcup T_i = S$. Thus, we can order $S$ by taking all the elements of $T_1$, then all the remaining elements of $T_2$, etc. Thus, $S$ is countable.
Now let us investigate (*). The supremum $\sup \sum_{s \in K} |a_s|$ over finite subsets $K$ of $S_i$ is less than or equal to the analogous supremum over finite subsets $K$ of $S$, which shows that each $\sum_{s \in S_i} a_s$ converges absolutely. Further, for a finite subset $K \subseteq \{1, 2, \dots\}$,
$$\sum_{i \in K} \left|\sum_{s \in S_i} a_s\right| \le \sum_{i \in K} \sum_{s \in S_i} |a_s| \le \sum_{i \in K} \sup \sum_{s \in L_i} |a_s|$$
where the supremum on the right-hand side is over all finite subsets $L_i \subseteq S_i$. We see that the right-hand side is finite by our assumption of absolute convergence over $S$, and therefore the left-hand side of (*) converges absolutely.
Finally, to prove equality in (*), use a variation of the above proof of the fact that $S$ is countable: Let $T_n$ consist of sufficiently many elements of $S_1, \dots, S_n$ such that the sum
$$\sum_{i=1}^{n} \left(\sum_{s \in T_n \cap S_i} a_s\right) \quad \text{differs from} \quad \sum_{i=1}^{n} \left(\sum_{s \in S_i} a_s\right)$$
by less than $1/n$. Then the limit of these particular partial sums is the left-hand side of (*), but is also equal to the right-hand side by absolute convergence. □
6.3.2 Corollary. Let $\sum_{m=0}^{\infty} a_m$, $\sum_{n=0}^{\infty} b_n$ be absolutely convergent series. Then
$$\left(\sum_{m=0}^{\infty} a_m\right)\left(\sum_{n=0}^{\infty} b_n\right) = \sum_{n=0}^{\infty} \left(\sum_{k=0}^{n} a_k b_{n-k}\right), \qquad (*)$$
with the right-hand side converging absolutely.
Proof. By the assumption, the supremum of $\sum_{m \in K, n \in L} |a_m| \cdot |b_n|$ over $K$, $L$ finite subsets of $\{0, 1, 2, \dots\}$ is finite, thus proving that $\sum_{S} a_m b_n$ is absolutely convergent, where $S$ is the set of all pairs of numbers $0, 1, \dots$. The rest follows from Theorem 6.3.1. □
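The product formula of Corollary 6.3.2 can be illustrated with truncated series. The sketch below uses two geometric series, $a_m = 2^{-m}$ and $b_n = 3^{-n}$, chosen here only because their sums ($2$ and $3/2$) are known; the helper `cauchy_product` and the truncation length `N` are assumptions of this example.

```python
def cauchy_product(a, b):
    """Return c_n = sum_{k=0}^{n} a_k * b_{n-k} for n < min(len(a), len(b))."""
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


N = 60                                   # truncation length (illustrative)
a = [0.5 ** m for m in range(N)]         # partial data of a series with sum 2
b = [(1 / 3) ** n for n in range(N)]     # partial data of a series with sum 3/2
c = cauchy_product(a, b)

lhs = sum(a) * sum(b)                    # product of the two (truncated) sums
rhs = sum(c)                             # (truncated) sum of the Cauchy product
print(lhs, rhs)
```

Both printed values should be close to $2 \cdot \tfrac{3}{2} = 3$; the discarded tails are of the order $2^{-59}$.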
6.4 Let $(f_n)_n$ be a sequence of real or complex functions (defined on an $X \subseteq \mathbb{R}$ resp. $\mathbb{C}$). The series $\sum_{n=1}^{\infty} f_n$ is defined as a function with values $\sum_{n=1}^{\infty} f_n(x)$ whenever the last series converges. If $f(x) = \sum_{n=1}^{\infty} f_n(x)$ converges (resp. converges absolutely) for all $x \in X$ we say that $\sum f_n$ converges (resp. converges absolutely). If $\sum_{k=1}^{n} f_k(x) \rightrightarrows f(x)$ we say that $\sum_{n=1}^{\infty} f_n$ converges uniformly.
6.4.1 Since finite sums of continuous functions are continuous and since $(f_1 + \cdots + f_n)' = f_1' + \cdots + f_n'$ we obtain from Theorem 5.2, Theorem 5.3 and Proposition 6.2.1
Corollary.
1. Let $f_n$ be continuous and let $\sum f_n$ converge uniformly. Then the resulting function is continuous.
2. Let $f = \sum_n f_n$ converge, let $f_n'$ exist and let $\sum f_n'$ converge uniformly. Then $f'$ exists and is equal to $\sum f_n'$ (that is, the derivative of $\sum f_n$ can be obtained by taking derivatives of the individual summands).
3. The statements 1 (resp. 2) apply to the case of $|f_n(x)| \le a_n$ with $\sum a_n$ convergent; here the convergence is, moreover, absolute.
7
Power series
7.1 A power series with center $c$ is a series $\sum_{n=1}^{\infty} a_n (x - c)^n$. For now we will limit ourselves to the real context; later, in Chapter 10, we will discuss power series in the complex case.
7.2 The limes superior (sometimes also called the upper limit) of a sequence $(a_n)_n$ of real numbers is the number
$$\limsup_n a_n = \inf_n \sup_{k \ge n} a_k.$$
It obviously exists if the sequence $(a_n)_n$ is bounded; if not we set $\limsup_n a_n = +\infty$. It is easy to see that $\limsup_n a_n = \lim a_n$ whenever the latter exists. The limes inferior (or lower limit) is defined analogously with $\inf$ and $\sup$ switched.
7.2.1 Proposition. Let $\limsup a_n = \inf_n \sup_{k \ge n} a_k = a$ and $\lim_n b_n = b$. Let $a_n, b_n \ge 0$ and let $a, b$ be finite. Then $\limsup a_n b_n = ab$.
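Proof. Choose an $\varepsilon > 0$ and a $K > a + b$. Take an $\eta > 0$ such that $K > a + b + \eta$ and $K\eta < \varepsilon$. There is an $n_0$ such that
$$n \ge n_0 \ \Rightarrow \ \sup_{k \ge n} a_k < a + \eta \ \text{ and } \ b - \eta < b_n < b + \eta.$$
Hence $a_n b_n < (a + \eta)(b + \eta) = ab + \eta(a + b + \eta) < ab + K\eta < ab + \varepsilon$ for all $n \ge n_0$. Moreover, since $\sup_{k \ge n} a_k \ge a$, for every $n \ge n_0$ there exists a $k(n) \ge n$ such that $a - \eta < a_{k(n)} < a + \eta$ and $b - \eta < b_{k(n)} < b + \eta$, so that, the terms being non-negative,
$$a_{k(n)} b_{k(n)} \ge ab - \eta(a + b) > ab - K\eta > ab - \varepsilon.$$
We see that $a_n b_n < ab + \varepsilon$ for all $n \ge n_0$ while $ab - \varepsilon < a_{k(n)} b_{k(n)}$ for some $k(n) \ge n$, and conclude that $\limsup a_n b_n = ab$. □
7.2.2 For a power series $\sum_{n=1}^{\infty} a_n (x - c)^n$ define the radius of convergence
$$\rho = \rho((a_n)_n) = \frac{1}{\limsup \sqrt[n]{|a_n|}}$$
if $\limsup \sqrt[n]{|a_n|} \ne 0$; otherwise set $\rho((a_n)_n) = +\infty$.
Theorem. Let $r < \rho((a_n)_n)$. Then the power series $\sum_{n \ge 1} a_n (x - c)^n$ converges absolutely and uniformly on the set $\{x \mid |x - c| \le r\}$. On the other hand, if $|x - c| > \rho$ then $\sum_{n=1}^{\infty} a_n (x - c)^n$ does not converge.
Proof. I. Let $|x - c| \le r < \rho$. Choose a $q$ such that
$$r \cdot \inf_n \sup_{k \ge n} \sqrt[k]{|a_k|} < q < 1.$$
Then there is an $n$ such that
$$r \cdot \sup_{k \ge n} \sqrt[k]{|a_k|} < q \quad \text{and hence} \quad r \sqrt[k]{|a_k|} < q \ \text{ for all } k \ge n.$$
Choose a $K \ge 1$ such that $|a_k| r^k \le K q^k$ for $k \le n$. Then $|a_k (x - c)^k| \le K q^k$ for all $k$ and $|x - c| \le r$, and hence by Corollary 6.4.1, the series converges on $\{x \mid |x - c| \le r\}$ absolutely and uniformly.
II. Let $|x - c| > \rho$; then $|x - c| \cdot \inf_n \sup_{k \ge n} \sqrt[k]{|a_k|} > 1$, hence $|x - c| \cdot \sup_{k \ge n} \sqrt[k]{|a_k|} > 1$ for all $n$, and hence for each $n$ there is a $k(n) \ge n$ such that $|x - c| \cdot \sqrt[k(n)]{|a_{k(n)}|} > 1$, and hence $|a_{k(n)} (x - c)^{k(n)}| > 1$ and the series cannot converge: its summands do not even converge to 0. □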
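The radius of convergence from 7.2.2 can be estimated from finitely many coefficients. The following is a rough sketch, assuming the finite tail maximum of $\sqrt[n]{|a_n|}$ is a reasonable stand-in for the limes superior; the helper `root_test_radius` and the sample coefficients $a_n = 2^n/n$ (for which $\rho = 1/2$) are illustrative assumptions.

```python
import math


def root_test_radius(coeff, n_start, n_end):
    """Approximate 1 / limsup |a_n|^(1/n) using only the tail n_start..n_end.

    This is a finite approximation of the limes superior, not its exact value.
    """
    tail_sup = max(abs(coeff(n)) ** (1.0 / n) for n in range(n_start, n_end + 1))
    return math.inf if tail_sup == 0 else 1.0 / tail_sup


coeff = lambda n: 2.0 ** n / n           # sample coefficients, rho = 1/2
rho = root_test_radius(coeff, 500, 1000)

# Inside the radius (|x - c| = 0.4) the partial sums settle down;
# here sum_{n>=1} 0.8^n / n equals -ln(1 - 0.8) = ln 5.
inside = sum(coeff(n) * 0.4 ** n for n in range(1, 400))

# Outside the radius (|x - c| = 0.6) single summands already exceed 1.
term_outside = abs(coeff(200) * 0.6 ** 200)

print(rho, inside, term_outside)
```

The estimate of $\rho$ approaches $1/2$ only slowly, because $\sqrt[n]{n} \to 1$ slowly; this matches the role of that limit in 7.3.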
7.3 Consider the series
$$\sum_{n=1}^{\infty} n a_n (x - c)^{n-1}. \qquad (*)$$
Obviously it converges if and only if $\sum_{n=1}^{\infty} n a_n (x - c)^n$ does, and hence its radius of convergence is
$$\frac{1}{\limsup \sqrt[n]{n |a_n|}}.$$
By Proposition 7.2.1, $\limsup \sqrt[n]{n |a_n|} = \limsup \sqrt[n]{n} \sqrt[n]{|a_n|} = \lim \sqrt[n]{n} \cdot \limsup \sqrt[n]{|a_n|} = \limsup \sqrt[n]{|a_n|}$ (since $\lim \sqrt[n]{n} = \lim e^{\frac{1}{n} \ln n} = e^0 = 1$). Thus, the radius of convergence of the series (*) is the same as that of the original $\sum a_n (x - c)^n$, and since $n a_n (x - c)^{n-1}$ is the derivative of $a_n (x - c)^n$ we conclude from 5.3 and 6.4.1 that for $|x - c| < \rho$ the series $\sum a_n (x - c)^n$ has a derivative, and that it is obtained as the sum of the derivatives of the individual summands. So far, this derivative has to be understood in the real context. In fact, however, the statement is valid for complex power series as well; see Chapter 10.
7.4
Remark
If we proceed to compute the higher derivatives summand-wise, we obtain
$$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1)\, a_n (x - c)^{n-k}.$$
In particular
$$f^{(k)}(c) = k!\, a_k, \quad \text{and hence} \quad a_k = \frac{f^{(k)}(c)}{k!}.$$
Thus, if a function can be written as a power series with a center $c$ then the coefficients $a_n$ are uniquely determined (they do depend on the $c$, of course). Compare this with the formula in 4.6.1. It should be noted, though, that in real analysis it can easily happen that a function $f$ has all derivatives without being representable as a power series: the remainder $\frac{f^{(n+1)}(t)}{(n+1)!}(x - c)^{n+1}$ may not converge to zero with increasing $n$ (see Exercise (13)). In fact, it is interesting to note that many important constructions in real analysis, such as the smooth partition of unity which we will need in Chapter 12, depend on the use of such functions.
8
A few facts about the Riemann integral
8.1 A partition of a compact interval $\langle a,b\rangle$ is a sequence
$$D : a = t_0 < t_1 < \cdots < t_n = b.$$
The mesh of the partition is the maximum of the numbers $|t_{i+1} - t_i|$. A partition $D' : a = t'_0 < t'_1 < \cdots < t'_m = b$ refines $D$ if $\{t_j \mid j = 1, \dots, n\} \subseteq \{t'_j \mid j = 1, \dots, m\}$. Let $f : \langle a,b\rangle \to \mathbb{R}$ be a bounded function (this means that the set of values of $f$ is bounded). Define the lower and upper sum of $f$ in $D$ as
$$s(f, D) = \sum_{j=1}^{n} m_j (t_j - t_{j-1}) \quad \text{and} \quad S(f, D) = \sum_{j=1}^{n} M_j (t_j - t_{j-1})$$
where $m_j = \inf\{f(x) \mid x \in \langle t_{j-1}, t_j\rangle\}$ and $M_j = \sup\{f(x) \mid x \in \langle t_{j-1}, t_j\rangle\}$.
8.1.1 Proposition.
1. If $D'$ refines $D$ then $s(f, D) \le s(f, D')$ and $S(f, D) \ge S(f, D')$.
2. For any two partitions $D_1, D_2$, $s(f, D_1) \le S(f, D_2)$.
Proof. If $t_{k-1} = t'_l < t'_{l+1} < \cdots < t'_{l+r} = t_k$ and $a_j = \sup\{f(x) \mid x \in \langle t'_{l+j-1}, t'_{l+j}\rangle\}$, $A = \sup\{f(x) \mid x \in \langle t_{k-1}, t_k\rangle\}$ then
$$\sum_j a_j (t'_{l+j} - t'_{l+j-1}) \le \sum_j A (t'_{l+j} - t'_{l+j-1}) = A (t_k - t_{k-1})$$
and $S(f, D') \le S(f, D)$ follows. Similarly for the lower sums.
Let $D$ be a common refinement of $D_1$ and $D_2$ (easily obtained, e.g., from the union of the elements of the two partitions). Then
$$s(f, D_1) \le s(f, D) \le S(f, D) \le S(f, D_2). \qquad □$$
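8.2 By Proposition 8.1.1, we can define the lower resp. upper Riemann integral of $f$ over $\langle a,b\rangle$ by setting
$$\underline{\int_a^b} f(x)\,dx = \sup_D s(f, D) \quad \text{resp.} \quad \overline{\int_a^b} f(x)\,dx = \inf_D S(f, D).$$
If $\underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx$ we denote the common value by
$$\int_a^b f(x)\,dx \quad \text{or briefly} \quad \int_a^b f$$
and call it the Riemann integral of $f$ over $\langle a,b\rangle$.
8.2.1 Proposition. $\int_a^b f$ exists if and only if for every $\varepsilon > 0$ there is a partition $D$ such that
$$S(f, D) - s(f, D) < \varepsilon.$$
Proof. I. Let $\int_a^b f$ exist and let $\varepsilon > 0$. There is a partition $D_1$ such that $S(f, D_1) < \int_a^b f + \frac{\varepsilon}{2}$ and a partition $D_2$ such that $s(f, D_2) > \int_a^b f - \frac{\varepsilon}{2}$. Then we have, for the common refinement $D$ of $D_1$ and $D_2$,
$$S(f, D) - s(f, D) < \int_a^b f + \frac{\varepsilon}{2} - \int_a^b f + \frac{\varepsilon}{2} = \varepsilon.$$
II. Let the statement hold. Choose an $\varepsilon > 0$ and a $D$ such that $S(f, D) - s(f, D) < \varepsilon$. Then
$$\overline{\int_a^b} f \le S(f, D) < s(f, D) + \varepsilon \le \underline{\int_a^b} f + \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, $\overline{\int_a^b} f = \underline{\int_a^b} f$. □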
8.3 Theorem. For every continuous function $f : \langle a,b\rangle \to \mathbb{R}$ the Riemann integral $\int_a^b f$ exists. In fact, more strongly, for every sequence $D_n$ of partitions of $\langle a,b\rangle$ whose mesh approaches 0 with $n \to \infty$, we have
$$\lim_{n\to\infty} s(f, D_n) = \lim_{n\to\infty} S(f, D_n) = \int_a^b f.$$
Proof. Let $\varepsilon > 0$. By 3.5.1, $f$ is uniformly continuous. Hence there exists a $\delta > 0$ such that
$$|x - y| < \delta \ \Rightarrow \ |f(x) - f(y)| < \frac{\varepsilon}{b - a}.$$
Choose a partition $D : a = t_0 < t_1 < \cdots < t_n = b$ such that $t_j - t_{j-1} < \delta$ for all $j = 1, \dots, n$. Then $M_j - m_j = \sup\{f(x) \mid x \in \langle t_{j-1}, t_j\rangle\} - \inf\{f(y) \mid y \in \langle t_{j-1}, t_j\rangle\} \le \sup\{|f(x) - f(y)| \mid x, y \in \langle t_{j-1}, t_j\rangle\} \le \frac{\varepsilon}{b - a}$ and hence
$$S(f, D) - s(f, D) = \sum (M_j - m_j)(t_j - t_{j-1}) \le \frac{\varepsilon}{b - a} \sum (t_j - t_{j-1}) = \frac{\varepsilon}{b - a}(b - a) = \varepsilon. \qquad □$$
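The lower and upper sums of 8.1 are easy to compute when $f$ is non-decreasing, since then the infimum and supremum on each subinterval are attained at the left and right endpoints. The sketch below makes exactly that assumption (it is not valid for general $f$); the helper `lower_upper_sums` and the sample function $f(x) = x^2$ on $\langle 0,1\rangle$ are illustrative.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper sums of a non-decreasing f over the uniform
    partition of <a, b> into n subintervals.

    Assumption: f is non-decreasing, so m_j = f(t_{j-1}) and M_j = f(t_j).
    """
    ts = [a + (b - a) * j / n for j in range(n + 1)]
    lower = sum(f(ts[j - 1]) * (ts[j] - ts[j - 1]) for j in range(1, n + 1))
    upper = sum(f(ts[j]) * (ts[j] - ts[j - 1]) for j in range(1, n + 1))
    return lower, upper


f = lambda x: x * x                      # non-decreasing on <0, 1>
for n in (10, 100, 1000):
    s, S = lower_upper_sums(f, 0.0, 1.0, n)
    print(n, s, S, S - s)                # S - s shrinks with the mesh 1/n
```

For this $f$ the difference $S - s$ equals $(f(1) - f(0))/n$, so it tends to 0 with the mesh, and both sums squeeze the value $1/3$.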
8.4 Theorem. (The Integral Mean Value Theorem) Let $f$ be a continuous function on $\langle a,b\rangle$, $M = \max\{f(x) \mid x \in \langle a,b\rangle\}$ and $m = \min\{f(x) \mid x \in \langle a,b\rangle\}$ (they exist by 3.4). Then there exists a $c \in \langle a,b\rangle$ such that
$$\int_a^b f(x)\,dx = f(c)(b - a).$$
Proof. From the definition one immediately obtains that
$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$
Thus there is a $K$, $m \le K \le M$, such that $\int_a^b f(x)\,dx = K(b - a)$. By 3.3, there exists a $c$ such that $K = f(c)$. □
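8.5 Proposition. Let $a < b < c$ and let $f$ be a bounded function defined on $\langle a,c\rangle$. Then
$$\underline{\int_a^b} f + \underline{\int_b^c} f = \underline{\int_a^c} f \quad \text{and} \quad \overline{\int_a^b} f + \overline{\int_b^c} f = \overline{\int_a^c} f.$$
Proof. Denote by $\mathcal{D}(u,v)$ the set of all partitions of $\langle u,v\rangle$. For $D_1 \in \mathcal{D}(a,b)$ and $D_2 \in \mathcal{D}(b,c)$ define $D_1 + D_2 \in \mathcal{D}(a,c)$ as the union of the two sequences. Obviously
$$s(f, D_1 + D_2) = s(f, D_1) + s(f, D_2).$$
We have
$$\underline{\int_a^b} f + \underline{\int_b^c} f = \sup_{D_1 \in \mathcal{D}(a,b)} s(f, D_1) + \sup_{D_2 \in \mathcal{D}(b,c)} s(f, D_2)$$
$$= \sup\{s(f, D_1) + s(f, D_2) \mid D_1 \in \mathcal{D}(a,b),\ D_2 \in \mathcal{D}(b,c)\}$$
$$= \sup\{s(f, D_1 + D_2) \mid D_1 \in \mathcal{D}(a,b),\ D_2 \in \mathcal{D}(b,c)\}$$
$$= \sup\{s(f, D) \mid D \in \mathcal{D}(a,c)\} = \underline{\int_a^c} f,$$
the penultimate equality holding because each $D \in \mathcal{D}(a,c)$ can be refined to a partition of the form $D_1 + D_2$ by adding the point $b$. The upper integrals are treated in the same way. □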
8.5.1 Convention
For $b < a$ we will write formally $\int_a^b f$ for $-\int_b^a f$. Then we have, for any $a, b, c$,
$$\int_a^b f + \int_b^c f = \int_a^c f.$$
8.6 Theorem. (The Fundamental Theorem of Calculus) Let $f$ be continuous on $\langle a,b\rangle$. For $x \in \langle a,b\rangle$ set
$$F(x) = \int_a^x f(t)\,dt.$$
Then we have $F'(x) = f(x)$ for all $x \in (a,b)$.
Proof. Let $h \ne 0$. By 8.5 and 8.4 we have
$$F(x+h) - F(x) = \int_a^{x+h} f - \int_a^x f = \int_x^{x+h} f = f(x + \theta h)\, h$$
with some $\theta \in \langle 0,1\rangle$. Thus,
$$\frac{1}{h}(F(x+h) - F(x)) = f(x + \theta h)$$
and, as $f$ is continuous, $\lim_{h\to 0} f(x + \theta h) = f(x)$. □
8.6.1 Corollary. If $f$ and $G$ are continuous on $\langle a,b\rangle$ and if $G' = f$ in $(a,b)$ then
$$\int_a^b f(x)\,dx = G(b) - G(a).$$
(By 4.4.4, $\int_a^x f(t)\,dt - G(x)$ is a constant $C$. Thus, $\int_a^b f = \int_a^b f - \int_a^a f = G(b) + C - (G(a) + C) = G(b) - G(a)$.)
9
Exercises
(1) Assuming the Fundamental Theorem of Algebra, prove that every non-zero polynomial with coefficients in $\mathbb{R}$ is a product of polynomials with coefficients in $\mathbb{R}$ each of which has degree $\le 2$. [Hint: Use 1.4.4.]
(2) Prove that the set $\mathbb{R}$ of all real numbers is not countable (we say it is uncountable). [Hint: Prove that the numbers $\sum_{k=0}^{\infty} a_k 2^{-k}$ are all well-defined and different for all choices $a_k \in \{0,1\}$. If there were a sequence $\sum_{k=0}^{\infty} a_{k,n} 2^{-k}$, $n \in \mathbb{N}$, of all these numbers, then the number $\sum_{k=0}^{\infty} (1 - a_{k,k}) 2^{-k}$ would be different from all of them - a contradiction.]
(3) (a) Prove directly that the function $e^x = \sum_{n=0}^{\infty} \frac{1}{n!} x^n$ satisfies $e^x e^y = e^{x+y}$. [Hint: Use Corollary 6.3.2.]
(b) Prove that $e^x \ne 0$ for any $x \in \mathbb{R}$. [Hint: Use (a).]
(c) Prove that $e^x$ is a continuous function on $\mathbb{R}$ which takes on only positive values. [Hint: Use Theorem 7.2.2, Theorem 5.2 and Theorem 3.3.]
(4) Using the definition from Exercise (3), prove that $(e^x)' = e^x$. [Hint: Corollary 6.4.1 is relevant.]
(5) (a) Prove that $e^x$ is an increasing function on $\mathbb{R}$. [Hint: Use Exercises (3) and (4).]
(b) Prove that $\lim_{x\to -\infty} e^x = 0$, $\lim_{x\to \infty} e^x = \infty$. [Hint: Use (a) and Exercise (3).]
(6) (a) Prove that there exists a function $\ln(x) : \{x \in \mathbb{R} \mid x > 0\} \to \mathbb{R}$ inverse to $e^x$. [Hint: Use Exercise (5) (b).]
(b) Prove that $(\ln(x))' = 1/x$. [Hint: This follows from the chain rule; a direct proof can also be given using Theorem 4.3.]
(7) For $a \in \mathbb{R}$, $x > 0$, define $x^a = e^{a \ln(x)}$. Using the chain rule, prove that $(x^a)' = a x^{a-1}$.
(8) Define functions $\sin(x)$, $\cos(x)$ by $\sin(x) = \sum_{n=0}^{\infty} (-1)^n x^{2n+1}/(2n+1)!$, $\cos(x) = \sum_{n=0}^{\infty} (-1)^n x^{2n}/(2n)!$.
(a) Prove that $\cos(x) = \cos(-x)$, $\sin(x) = -\sin(-x)$ (i.e. $\cos(x)$ is even and $\sin(x)$ is odd).
(b) Prove that $(\sin(x))' = \cos(x)$, $(\cos(x))' = -\sin(x)$. [Hint: Corollary 6.4.1 is relevant.]
(9) Prove that there exists a minimum number $a > 0$ such that $\cos(a) = 0$. This number $a$ is called $\pi/2$. Prove that $\cos(x)$ is decreasing in the interval $(0, \pi/2)$. [Hint: By Exercise (8), we have $\cos''(x) = -\cos(x)$, while $\cos(0) = 1$, $(\cos(0))' = 0$. This means that $(\cos(x))'$ is negative in some interval $(0, \varepsilon)$, $\varepsilon > 0$, and $(\cos(x))'$ is decreasing on any interval $(0, a)$ on which $\cos(x) > 0$. Let $\cos'(\varepsilon/2) = -b$, $\cos(\varepsilon/2) = c$, $b, c > 0$. Then $\cos(\varepsilon/2 + t) \le c - bt$ if $\varepsilon/2 < \varepsilon/2 + t < a$ ($a$ as above). From this, it follows we cannot have $a - \varepsilon/2 > c/b$.]
(10) (a) Prove that $\cos(x \pm y) = \cos(x)\cos(y) \mp \sin(x)\sin(y)$, $\sin(x \pm y) = \sin(x)\cos(y) \pm \cos(x)\sin(y)$. [Hint: analogous to Exercise (3).]
(b) Prove that $\sin(\pi/2) = 1$, $\sin(x)$ is increasing on the interval $(0, \pi/2)$, and $\cos(\pi/2 - x) = \sin(x)$. [Hint: Let $\sin(\pi/2) = a$. Apply (a) to show that $\cos(\pi/2 - x) = a \sin(x)$, $\sin(\pi/2 - x) = a \cos(x)$, and therefore $a^2 = 1$. Observe that we must then have $a = 1$ because $\sin(x)$ is increasing on the interval $(0, \pi/2)$ by Exercise (8).]
(11) Prove that $\cos(x)$ and $\sin(x)$ are both periodic with period $2\pi$, their values (on $x$ real) are between $-1$ and $1$, and describe their maxima and minima, and intervals on which they are decreasing resp. increasing. [Hint: Use Exercise (10) and the fact that $\cos(x)$ is even to prove that $\cos(x + \pi) = -\cos(x)$, etc.]
(12) Now consider the definition of $e^x$ from Exercise (3) for a complex number $x$.
(a) Prove that $e^x$ is well-defined (i.e. the series converges) for all $x \in \mathbb{C}$, and that $e^x e^y = e^{x+y}$ for $x, y \in \mathbb{C}$. [Interpret this as separate statements about the real and imaginary parts.]
(b) Prove that for a complex number $\alpha$, the functions $\mathrm{Re}(e^{\alpha x})$, $\mathrm{Im}(e^{\alpha x})$ are continuous and differentiable in the real variable $x$, and that $(e^{\alpha x})' = \alpha e^{\alpha x}$ [this is, again, to be interpreted as equalities of the real and imaginary parts].
(c) Prove the equalities $\cos(x) = \frac{e^{ix} + e^{-ix}}{2}$, $\sin(x) = \frac{e^{ix} - e^{-ix}}{2i}$ for $x \in \mathbb{R}$.
[Remark: The attentive reader surely noticed that something is missing here; we should learn how to differentiate with respect to a complex variable $x$. (!) However, we will have to build up a lot more foundations, and wait until Chapter 10 below, to understand that rigorously.]
(13) Let $f(x) = e^{-1/x}$ for $x > 0$, $f(x) = 0$ for $x \le 0$. Prove that $f^{(n)}(0) = 0$ for all $n \ge 1$.
2
Metric and Topological Spaces I
A key to rigorous multivariable calculus is a basic understanding of point set topology in the framework of metric spaces. Covering these basic concepts is the purpose of this chapter. We will see that studying these concepts in detail will really pay off in the chapters below. While studying metric spaces, we will discover certain concepts which are independent of metric, and seem to beg for a more general context. This is why, in the process, we will introduce topological spaces as well.
1
Basics
1.1 Let $\mathbb{R}_+$ denote the set of all non-negative real numbers together with $+\infty$. A metric space is a set $X$ endowed with a metric (or distance function, briefly distance) $d : X \times X \to \mathbb{R}_+$ such that
(M1) $d(x,y) = 0$ if and only if $x = y$,
(M2) $d(x,y) = d(y,x)$, and
(M3) $d(x,y) + d(y,z) \ge d(x,z)$.
Condition (M3) is called the triangle inequality; the reader will easily guess why. The elements of a metric space are usually referred to as points. Very often one considers distance functions which take on finite values only, but allowing infinite distances comes in handy sometimes.
1.1.1 Examples
(a) The set $\mathbb{R}$ of real numbers with the distance function $d(x,y) = |x - y|$.
(b) The set (plane) $\mathbb{C}$ of complex numbers, again with the distance $|x - y|$; note, however, that here the fact that it satisfies the triangle inequality is much less trivial than in the previous case (see Theorem 1.3 of Chapter 1).
(c) The Euclidean space $\mathbb{R}_m = \{(x_1, \dots, x_m) \mid x_j \in \mathbb{R}\}$
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 2, © Springer Basel 2013
with $d((x_1, \dots, x_m), (y_1, \dots, y_m)) = \sqrt{\sum_j (x_j - y_j)^2}$.
Comment: In linear algebra, there are good reasons for distinguishing row and column vectors, and equally good reasons why the ordinary Euclidean space $\mathbb{R}^n$ should consist of column vectors. This is the reason why we used the subscript notation $\mathbb{R}_n$ above for row vectors, which are easier to write down (compare with A.7.3). From the point of view of metric and topological spaces, however, the distinction between row and column vectors has no meaning. Because of that, in this chapter, we will use the symbols $\mathbb{R}^n$ and $\mathbb{R}_n$ interchangeably, not distinguishing between row and column vectors.
(d) $C(\langle a,b\rangle)$, the set of all continuous real functions on the interval $\langle a,b\rangle$, with $d(f,g) = \max_x |f(x) - g(x)|$.
(e) The set $F(X)$ of all bounded real functions on a set $X$ with $d(f,g) = \sup_x |f(x) - g(x)|$.
(f) The unit circle $S^1 = \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$ where for two points $P, Q \in S^1$, $d(P,Q)$ is the lesser of the two angles between the lines $\{tP \mid t \in \mathbb{R}\}$ and $\{tQ \mid t \in \mathbb{R}\}$.
(g) Any set $S$ with the metric given by $d(x,y) = 0$ if $x = y \in S$ and $d(x,y) = 1$ if $x \ne y \in S$. This is known as the discrete space.
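The axioms (M1) to (M3) can be sanity-checked on finite samples of points. The sketch below does this for the Euclidean metric of Example (c) and the discrete metric of Example (g), and shows that the squared Euclidean distance (a deliberately chosen non-example, not taken from the text) violates the triangle inequality. All function names and the sample points are hypothetical; a finite check of this kind is an illustration, not a proof.

```python
import itertools
import math


def euclidean(x, y):
    """Euclidean distance between two tuples of equal length."""
    return math.sqrt(sum((xj - yj) ** 2 for xj, yj in zip(x, y)))


def discrete(x, y):
    """Discrete metric: 0 for equal points, 1 otherwise."""
    return 0.0 if x == y else 1.0


def squared(x, y):
    """Squared Euclidean distance; fails (M3) in general."""
    return euclidean(x, y) ** 2


def check_metric_axioms(d, points, tol=1e-12):
    """Check (M1)-(M3) on all triples from a finite sample of points."""
    for x, y, z in itertools.product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):          # (M1)
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # (M2)
            return False
        if d(x, y) + d(y, z) < d(x, z) - tol:   # (M3)
            return False
    return True


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 2.0), (-1.5, 3.0)]
print(check_metric_axioms(euclidean, points))
print(check_metric_axioms(discrete, points))
print(check_metric_axioms(squared, points))
```

The squared distance fails on the collinear points $(0,0)$, $(1,0)$, $(2,0)$, where $1 + 1 < 4$.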
1.2
Norms
The metrics in Examples 1.1.1 (a)–(e) in fact all come from a more special situation, which plays an especially important role. A norm on a vector space $V$ (over real or complex numbers) is a mapping $\|\cdot\| : V \to \mathbb{R}$ such that
(1) $\|x\| \ge 0$, and $\|x\| = 0$ only if $x = o$,
(2) $\|x + y\| \le \|x\| + \|y\|$, and
(3) $\|\alpha x\| = |\alpha| \cdot \|x\|$.
1.2.1 A normed vector space is a (real or complex) vector space $V$ provided with a norm. (The term normed linear space is also common.) Since we have
$$\|x - z\| = \|x - y + y - z\| \le \|x - y\| + \|y - z\|,$$
the function $\rho(x,y) = \|x - y\|$ is a metric on $V$, called the metric associated with the norm. In this sense, we can always view a normed linear space as a metric space.
1.2.2 Examples
1. Any of the following formulas yields a norm in $\mathbb{R}^n$.
(a) $\|x\| = \max_j |x_j|$,
(b) $\|x\| = \sum_j |x_j|$,
(c) $\|x\| = \sqrt{\sum_j x_j^2}$.
Notice that (c) gives the metric space in Example 1.1.1 (c).
2. In the space of bounded real functions on a set $X$ we can consider the norm
$$\|\varphi\| = \sup\{|\varphi(x)| \mid x \in X\}.$$
The associated metric gives rise to Example 1.1.1 (e) above.
1.2.3 A particularly important example
Example 1.2.2 (c) is, in fact, a special case of the following construction: On a (real or complex) vector space with an inner product (see 4.2 of Appendix A), we have a norm
$$\|x\| = \sqrt{x \cdot x}.$$
Indeed: (1) of 1.2 is obvious. Further, by the Cauchy-Schwarz inequality (see 4.4 of Appendix A),
$$\|x + y\|^2 = (x + y)(x + y) = xx + xy + yx + yy = |xx + xy + yx + yy| \le \|x\|^2 + |xy| + |yx| + \|y\|^2 \le \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$
Finally, $\|\alpha x\| = \sqrt{(\alpha x)(\alpha x)} = \sqrt{\alpha \bar{\alpha} (xx)} = |\alpha| \cdot \|x\|$. □
1.3
Convergence
A sequence $x_1, x_2, \dots$ of points of a metric space converges to a point $x$ whenever for every $\varepsilon > 0$, there exists an $n_0$ such that for all $n \ge n_0$, we have $d(x_n, x) < \varepsilon$. This is expressed by writing
$$\lim_{n\to\infty} x_n = x \quad \text{or} \quad \lim_n x_n = x \quad \text{or just} \quad \lim x_n = x.$$
We then speak of a convergent sequence. Note that obviously
(*) any subsequence $(x_{k_n})_n$ of a convergent sequence converges to the same limit.
1.3.1 Examples
(a) The usual convergence in $\mathbb{R}$ or $\mathbb{C}$.
(b) Consider the examples in 1.1.1 (d) and (e). Observe that the convergence of a sequence of functions $f_1, f_2, \dots$ in these spaces is what one usually calls uniform convergence of functions.
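Example (b) can be made concrete with the functions $f_n(x) = x^n$ from 5.1 of Chapter 1, whose pointwise limit on $\langle 0,1\rangle$ is 0 for $x < 1$ and 1 at $x = 1$. The sketch below is only an illustration under stated assumptions: `pointwise_limit`, `gap_at_witness`, `sup_on_half_interval` and the sampling grid are hypothetical helpers. For every $n$ the point $x_n = (1/2)^{1/n}$ gives $|f_n(x_n) - f(x_n)| = 1/2$, so the sup-distance on $\langle 0,1\rangle$ stays at least $1/2$, while on $\langle 0, 1/2\rangle$ it equals $2^{-n}$.

```python
def pointwise_limit(x):
    """Pointwise limit of x**n on <0, 1>."""
    return 1.0 if x == 1.0 else 0.0


def gap_at_witness(n):
    """|f_n(x) - f(x)| at the witness point x = (1/2)**(1/n) in (0, 1)."""
    x = 0.5 ** (1.0 / n)
    return abs(x ** n - pointwise_limit(x))


def sup_on_half_interval(n, samples=1000):
    """Maximum of |x**n - 0| over a grid in <0, 1/2>; since x**n is
    increasing there, the maximum sits at the grid point x = 1/2."""
    grid = [j / (2 * samples) for j in range(samples + 1)]
    return max(abs(x ** n - pointwise_limit(x)) for x in grid)


for n in (5, 50, 500):
    # The first column stays near 1/2; the second tends to 0.
    print(n, gap_at_witness(n), sup_on_half_interval(n))
```

So $(f_n)_n$ converges in the metric of 1.1.1 (e) on $\langle 0, 1/2\rangle$ but not on $\langle 0,1\rangle$, although it converges pointwise on both.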
1.4 Two metrics $d_1, d_2$ on the same set $X$ are said to be equivalent if there exist positive real numbers $\alpha, \beta$ such that for every $x, y \in X$,
$$\alpha d_1(x,y) \le d_2(x,y) \le \beta d_1(x,y).$$
Note that we have an obvious
1.4.1 Observation. If $d_1$ and $d_2$ are equivalent then $(x_n)_n$ converges in $(X, d_1)$ if and only if it converges in $(X, d_2)$.
1.5 Let $(X, d)$ and $(Y, d')$ be metric spaces. A map $f : X \to Y$ is said to be continuous if for every $x \in X$ and every $\varepsilon > 0$ there is a $\delta > 0$ such that, for every $y$ in $X$,
$$d(x,y) < \delta \ \Rightarrow \ d'(f(x), f(y)) < \varepsilon. \qquad \text{(ct)}$$
Later on we will need a stronger concept: a mapping $f : X \to Y$ is said to be uniformly continuous if for every $\varepsilon > 0$ there is a $\delta > 0$ such that, for all $x, y$ in $X$,
$$d(x,y) < \delta \ \Rightarrow \ d'(f(x), f(y)) < \varepsilon. \qquad \text{(uct)}$$
Note the subtle difference between the two concepts. In the former the $\delta$ can depend on $x$, while in the latter it depends on the $\varepsilon$ only. For example, $f = (x \mapsto x^2) : \mathbb{R} \to \mathbb{R}$ is continuous but not uniformly continuous.
It is easy to prove

1.5.1 Proposition. A composition $g \circ f$ of continuous (resp. uniformly continuous) maps $f$ and $g$ is continuous (resp. uniformly continuous).

1.5.2 Here is another easy but important

Observation. Let $d, d_1$ be equivalent metrics on $X$ and let $d', d_1'$ be equivalent metrics on $Y$. Then a map $f : X \to Y$ is continuous (resp. uniformly continuous) with respect to $d, d'$ if and only if it is continuous (resp. uniformly continuous) with respect to $d_1, d_1'$.

1.6 Proposition. A map $f : (X, d) \to (Y, d')$ is continuous if and only if for every convergent sequence $(x_n)_n$ in $(X, d)$, the sequence $(f(x_n))_n$ is convergent and
$$f(\lim x_n) = \lim f(x_n).$$
(Compare with Proposition 3.2 of Chapter 1.)

Proof. $\Rightarrow$: Let $\lim x_n = x$. Consider the $\delta > 0$ from (ct) taken for the $x$ and an $\varepsilon > 0$. There is an $n_0$ such that $n \ge n_0$ implies $d(x_n, x) < \delta$. Then for $n \ge n_0$, $d'(f(x_n), f(x)) < \varepsilon$.

$\Leftarrow$: Suppose $f$ is not continuous. Then there is an $x \in X$ and an $\varepsilon_0 > 0$ such that for every $\delta > 0$ there exists an $x(\delta)$ such that $d(x(\delta), x) < \delta$ while $d'(f(x(\delta)), f(x)) \ge \varepsilon_0$. Now set $x_n = x(\frac{1}{n})$; obviously $\lim x_n = x$ and $(f(x_n))_n$ does not converge to $f(x)$. □
2 Subspaces and products

2.1 Let $(X, d)$ be a metric space and let $X' \subseteq X$ be an arbitrary subset. Obviously $(X', d')$ where $d'$ is $d$ restricted to $X' \times X'$ is a metric space again.

Examples. (a) Intervals in the real line. (b) More generally, the typical subspaces of the Euclidean space $\mathbb{R}^m$ one usually works with: $n$-dimensional intervals (by which we mean cartesian products of $n$-tuples of intervals), polyhedra, balls, spheres, etc. (c) The space $C(\langle a, b\rangle)$ from 1.1.1 (d) is a subspace of the $F(\langle a, b\rangle)$ from 1.1.1 (e).

Convention. Unless otherwise stated we will think of subsets of spaces automatically as subspaces.
2.1.1 Observations. 1. Let $(X', d')$ be a subspace of $(X, d)$. Then the embedding map $j = (x \mapsto x) : X' \to X$ is uniformly continuous. Consequently, a restriction $f|X' : (X', d') \to (Y, \bar d)$ of a continuous (resp. uniformly continuous) $f : (X, d) \to (Y, \bar d)$ is continuous (resp. uniformly continuous).

2. Let $f : (X, d) \to (Y, \bar d)$ be a continuous (resp. uniformly continuous) map and let $Y' \subseteq Y$ be a subspace such that $f[X] \subseteq Y'$. Then $f' = (x \mapsto f(x)) : (X, d) \to (Y', \bar d')$ is continuous (resp. uniformly continuous).

Proof. 1. For $\varepsilon > 0$ take $\delta = \varepsilon$. For the consequence recall 1.5.1 and the fact that $f|X' = fj$. 2. For $x$ and $\varepsilon > 0$ use the same $\delta$ as for $f$. □
2.2 Let $(X_i, d_i)$, $i = 1, \dots, m$, be metric spaces. On the cartesian product
$$\prod_{i=1}^{m} X_i = X_1 \times \cdots \times X_m$$
consider the following distances:
$$\rho((x_1, \dots, x_m), (y_1, \dots, y_m)) = \sqrt{\sum_{i=1}^{m} d_i(x_i, y_i)^2},$$
$$\sigma((x_1, \dots, x_m), (y_1, \dots, y_m)) = \sum_{i=1}^{m} d_i(x_i, y_i), \quad\text{and}$$
$$d((x_1, \dots, x_m), (y_1, \dots, y_m)) = \max_i d_i(x_i, y_i).$$
($\sigma$ and $d$ satisfy (M1), (M2) and (M3) obviously. The triangle inequality of $\rho$ needs some simple reasoning; one can use, for instance, Theorem 4.4 from Appendix A. In fact, we will rarely use this metric in the context of the topology of multivariable functions. However, note its geometrical significance: it yields the standard Pythagorean metric in the space $\mathbb{R}^m$ viewed as $\mathbb{R} \times \cdots \times \mathbb{R}$.)

2.2.1 Proposition. The distance functions $\rho$, $\sigma$ and $d$ are equivalent metrics.

Proof.
$$\rho((x_i)_i, (y_i)_i) \le \sqrt{\sum_{i=1}^{m} \max_j d_j(x_j, y_j)^2} = \sqrt{m}\; d((x_i)_i, (y_i)_i).$$
Obviously $d((x_i)_i, (y_i)_i) \le \rho((x_i)_i, (y_i)_i)$ and $d((x_i)_i, (y_i)_i) \le \sigma((x_i)_i, (y_i)_i)$, and finally
$$\sigma((x_i)_i, (y_i)_i) \le \sum_{i=1}^{m} \max_j d_j(x_j, y_j) = m\, d((x_i)_i, (y_i)_i). \qquad □$$
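The chain of inequalities behind the equivalence of the three product distances can be spot-checked numerically. This is a hedged sketch, not part of the text: the function names are ours, and the input is taken to be the list of coordinatewise distances $d_i(x_i, y_i) \ge 0$ rather than the points themselves.

```python
import math


def euclid_dist(ds):
    """Square root of the sum of squared coordinate distances."""
    return math.sqrt(sum(t * t for t in ds))


def sum_dist(ds):
    """Sum of the coordinate distances."""
    return sum(ds)


def max_dist(ds):
    """Maximum of the coordinate distances."""
    return max(ds)


def bounds_hold(ds, tol=1e-12):
    """Check max <= euclid <= sqrt(m)*max and max <= sum <= m*max."""
    m = len(ds)
    lo = max_dist(ds)
    return (lo <= euclid_dist(ds) + tol
            and euclid_dist(ds) <= math.sqrt(m) * lo + tol
            and lo <= sum_dist(ds) + tol
            and sum_dist(ds) <= m * lo + tol)
```

The small tolerance `tol` only absorbs floating-point rounding; the inequalities themselves are exact.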
2.2.2 The space $\prod X_i$ endowed with any of the metrics $\rho$, $\sigma$, $d$ (typically, with $d$) will be referred to as the product of the spaces $(X_i, d_i)$, $i = 1, \dots, m$.

Theorem. 1. The projections $p_j = ((x_1, \dots, x_m) \mapsto x_j) : \prod_i (X_i, d_i) \to (X_j, d_j)$ are uniformly continuous.

2. A sequence
$$(x_1^1, \dots, x_m^1),\ (x_1^2, \dots, x_m^2),\ (x_1^3, \dots, x_m^3),\ \dots \tag{*}$$
converges in $\prod (X_i, d_i)$ if and only if each of the sequences
$$x_j^1, x_j^2, x_j^3, \dots \tag{**}$$
converges in the respective $(X_j, d_j)$.

3. Let $f_j : (Y, d') \to (X_j, d_j)$ be continuous (resp. uniformly continuous). Then the mapping
$$f = (y \mapsto (f_1(y), \dots, f_m(y))) : (Y, d') \to \prod (X_i, d_i)$$
(the unique mapping such that $p_j f = f_j$ for all $j$) is continuous (resp. uniformly continuous).

Proof. 1. We have $d((x_i)_i, (y_i)_i) \ge d_j(x_j, y_j)$. Thus, it suffices to take $\delta = \varepsilon$.

2. If (*) converges then each (**) converges by 1 and 1.6. Conversely, let each (**) converge to some $x_j$. For $\varepsilon > 0$ choose $n_j$ such that for $k \ge n_j$, $d_j(x_j^k, x_j) < \varepsilon$, and consider $n_0 = \max_j n_j$. Then for $k \ge n_0$, $d_j(x_j^k, x_j) < \varepsilon$ for all $j$, and hence $\max_j d_j(x_j^k, x_j) < \varepsilon$.

3. For continuity this immediately follows from 2 and 1.6; for uniform continuity, take the smallest of the $\delta$'s provided by the individual $f_j$. □
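3 Some topological concepts

3.1 Neighborhoods

First, define the $\varepsilon$-ball with center $x$ as
$$\Omega(x, \varepsilon) = \{y \mid d(x, y) < \varepsilon\}.$$
A subset $U \subseteq X$ is a neighborhood of a point $x \in X$ if there exists an $\varepsilon > 0$ such that $\Omega(x, \varepsilon) \subseteq U$.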
Remark: While the concept of an $\varepsilon$-ball depends on the concrete metric, the concept of neighborhood does not change if we replace a metric by an equivalent one. In fact, we can change the metric much more radically; see Exercise (5) below.

3.1.1 Observations. 1. If $U$ is a neighborhood of $x$ and $U \subseteq V$ then $V$ is a neighborhood of $x$. 2. If $U_1, U_2$ are neighborhoods of $x$ then so is $U_1 \cap U_2$.

(1: for $V$ use the same $\Omega(x, \varepsilon)$. 2: if $\Omega(x, \varepsilon_i) \subseteq U_i$ then $\Omega(x, \min(\varepsilon_1, \varepsilon_2)) \subseteq U_1 \cap U_2$.)
3.2 Open and closed sets

A subset $U \subseteq (X, d)$ is open if it is a neighborhood of each of its points. A subset $A \subseteq (X, d)$ is closed if for every sequence $(x_n)_n$, $x_n \in A$, convergent in $(X, d)$, the limit $\lim x_n$ is in $A$.

3.2.1 Proposition. 1. $X$ and $\emptyset$ are open. If $U$ and $V$ are open then $U \cap V$ is open, and if $U_i$, $i \in J$, are open ($J$ arbitrary) then $\bigcup_{i \in J} U_i$ is open.

2. $U$ is open if and only if $X \smallsetminus U$ is closed.

3. $X$ and $\emptyset$ are closed. If $A$ and $B$ are closed then $A \cup B$ is closed, and if $A_i$, $i \in J$, are closed then $\bigcap_{i \in J} A_i$ is closed.

Proof. 1 is straightforward (use 3.1.1).

2: Let $U$ be open, $A = X \smallsetminus U$. The limit $x$ of a sequence $(x_n)_n$ that is all in $A$ cannot be in $U$: otherwise there would be an $\varepsilon > 0$ such that $\Omega(x, \varepsilon) \subseteq U$, and the $x_n$'s with sufficiently large $n$ would have to be in such $\Omega(x, \varepsilon)$. On the other hand, if $U$ is not open, then there is an $x \in U$ such that for every $n$, $\Omega(x, \frac{1}{n}) \nsubseteq U$. Therefore, we can choose points $x_n \in \Omega(x, \frac{1}{n}) \cap A$ with $x = \lim x_n \in U = X \smallsetminus A$.

3 follows from 1, 2 and the formulas relating intersections and unions with complements. □
3.3 Closure

Let $A$ be a general subset of a metric space $X = (X, d)$. For a point $x \in X$, define the distance of $x$ from $A$ by
$$d(x, A) = \inf\{d(x, a) \mid a \in A\}.$$
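Note that if $x \in A$ then $d(x, A) = 0$ but $d(x, A)$ can be $0$ even if $x \notin A$. The closure of a set $A$ in $(X, d)$ is the set
$$\overline{A} = \{x \mid d(x, A) = 0\}.$$
This definition seems to depend heavily on the distance function. But we have

3.3.1 Proposition. 1. The set $\overline{A}$ is closed, and it is the smallest closed set containing $A$. In other words,
$$\overline{A} = \bigcap \{B \text{ closed} \mid A \subseteq B\}.$$
2. A point $x \in X$ is in $\overline{A}$ if and only if for each of its neighborhoods $U$, $U \cap A \ne \emptyset$ (in other words, if and only if for each open $U \ni x$, $U \cap A \ne \emptyset$).

Proof. 1: $U = X \smallsetminus \overline{A}$ is open, since if $x \notin \overline{A}$ there is an $\varepsilon > 0$ such that $\Omega(x, 2\varepsilon) \cap A = \emptyset$ and hence by the triangle inequality $\Omega(x, \varepsilon) \cap \overline{A} = \emptyset$. Let $B$ be closed and $B \supseteq A$. Let $x \in \overline{A}$. For each $n$ choose an $x_n \in A$ (and hence in $B$) such that $d(x, x_n) < \frac{1}{n}$. Then $x = \lim x_n$ is in $B$. The correctness of the formula follows from 3.2.1.

2 is obvious: in yet other words we are speaking about the balls $\Omega(x, \varepsilon)$ intersecting $A$. □

3.3.2 Proposition. 1. $\overline{\emptyset} = \emptyset$, $A \subseteq \overline{A}$, and $A \subseteq B \Rightarrow \overline{A} \subseteq \overline{B}$, 2. $\overline{A \cup B} = \overline{A} \cup \overline{B}$, and 3. $\overline{\overline{A}} = \overline{A}$.

Proof. 1 is trivial. 2: By 1, $\overline{A \cup B} \supseteq \overline{A} \cup \overline{B}$. Now let $x \in \overline{A \cup B}$; $x$ is or is not in $\overline{A}$. In the latter case, all sufficiently close elements from $A \cup B$ have to be in $B$ and hence $x \in \overline{B}$. 3: By 3.3.1 1, $\overline{A}$ is closed and since it contains $B = \overline{A}$, it also contains $\overline{B} = \overline{\overline{A}}$. □

We also define the interior $\mathrm{Int}(A) = X \smallsetminus \overline{X \smallsetminus A}$. The interior of $A$ is also denoted by $A^\circ$. It immediately follows from Proposition 3.3.1 that the interior is the union of all open sets contained in $A$. The boundary of $A$ is defined as $\partial A = \overline{A} \smallsetminus \mathrm{Int}(A)$.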
3.4 Continuity can be expressed in terms of the concepts introduced in this section. We have
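Theorem. The following statements on a mapping $f : (X, d) \to (Y, d')$ are equivalent.
(1) $f$ is continuous.
(2) For every $x \in X$ and every neighborhood $V$ of $f(x)$ there is a neighborhood $U$ of $x$ such that $f[U] \subseteq V$.
(3) For every $U$ open in $(Y, d')$ the preimage $f^{-1}[U]$ is open in $(X, d)$.
(4) For every $A$ closed in $(Y, d')$ the preimage $f^{-1}[A]$ is closed in $(X, d)$.
(5) For every subset $A \subseteq X$, $f[\overline{A}] \subseteq \overline{f[A]}$.
(6) For every subset $B \subseteq Y$, $\overline{f^{-1}[B]} \subseteq f^{-1}[\overline{B}]$.

Proof. (1)$\Rightarrow$(2): Let $V$ be a neighborhood of $f(x)$ with $\Omega(f(x), \varepsilon) \subseteq V$. Choose a $\delta > 0$ as in (ct) for $x$ and $\varepsilon$. Then $f[\Omega(x, \delta)] \subseteq \Omega(f(x), \varepsilon) \subseteq V$, and $\Omega(x, \delta)$ is a neighborhood of $x$.

(2)$\Rightarrow$(3): If $U \subseteq Y$ is open and $x \in f^{-1}[U]$ then $f(x) \in U$ and $U$ is a neighborhood of $f(x)$. Hence there is a neighborhood $V$ of $x$ such that $f[V] \subseteq U$ and we have $x \in V \subseteq f^{-1}[U]$, making $f^{-1}[U]$ a neighborhood of $x$.

(3)$\Leftrightarrow$(4) by 3.2.1 2, since $f^{-1}[-]$ preserves complements.

(4)$\Rightarrow$(5): We have $A \subseteq f^{-1}[f[A]] \subseteq f^{-1}[\overline{f[A]}]$. Since $f^{-1}[\overline{f[A]}]$ is closed, we have by 3.3.1 $\overline{A} \subseteq f^{-1}[\overline{f[A]}]$ and the statement follows.

(5)$\Rightarrow$(6): We have, by (5), $f[\overline{f^{-1}[B]}] \subseteq \overline{f[f^{-1}[B]]} \subseteq \overline{B}$ and hence $\overline{f^{-1}[B]} \subseteq f^{-1}[\overline{B}]$.

(6)$\Rightarrow$(1): Fix $x$ and $\varepsilon > 0$ and set $B = Y \smallsetminus \Omega(f(x), \varepsilon)$. Since $d'(f(x), B) \ge \varepsilon$, we have $f(x) \notin \overline{B}$, that is, $x \notin f^{-1}[\overline{B}]$. Hence, by (6), $x \notin \overline{f^{-1}[B]}$ and there is a $\delta > 0$ such that $\Omega(x, \delta) \cap f^{-1}[B] = \emptyset$. Thus if $d(x, y) < \delta$ then $f(y) \notin B$, that is, $f(y) \in \Omega(f(x), \varepsilon)$. □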
3.5 A continuous mapping $f : (X, d) \to (Y, d')$ is called a homeomorphism if there is a continuous mapping $g : (Y, d') \to (X, d)$ such that
$$fg = \mathrm{id}_Y \quad\text{and}\quad gf = \mathrm{id}_X.$$
If there exists a homeomorphism $f : (X, d) \to (Y, d')$ we say that the spaces $(X, d)$ and $(Y, d')$ are homeomorphic. Note that if $d$ and $d'$ are equivalent metrics then the identity map $\mathrm{id}_X : (X, d) \to (X, d')$ is a homeomorphism. But $\mathrm{id}_X : (X, d) \to (X, d')$ can be a homeomorphism even when $d$ and $d'$ are far from being equivalent (consider, e.g., the interval $\langle 0, \frac{\pi}{2})$ with the standard metric $d$ and with $d'(x, y) = |\tan x - \tan y|$).

A property of a space or a concept related to spaces is said to be topological if it is preserved under all homeomorphisms. For example, by Theorem 3.4, for a set to be a neighborhood of a point, or to be open resp. closed, or to be the closure of a set, are topological concepts. By 1.6, convergence is a topological concept. Continuity is a topological concept, but uniform continuity is not. This suggests the possibility of formulating a notion of a space based only on topological properties. We will explore this in the next section.
4 First remarks on topology

Very often, a choice of metric is not really important. We may be interested just in continuity, and a concrete choice of metric may be somewhat beside the point. For example, note that the "natural" Pythagorean metric would have been a real burden in dealing with the product. Sometimes it even happens that one has a natural notion of continuity, or convergence, without having a metric defined first. It may even happen that there is no reasonable way to define a metric. This leads to a more general notion of a space, called a topological space. The idea is to describe the structure of interest simply by distinguishing whether a subset $U \subseteq X$ containing $x$ "surrounds" (is a neighborhood of) $x$, or declaring some subsets open resp. closed, or specifying an operator of closure. We will present here three variants of the definition, which turn out to be equivalent.
4.1 We will start with the neighborhood approach, which was historically the first one (introduced by Hausdorff in 1914). It is convenient to denote by $P(X)$ the power set of $X$, which means the set of all subsets of $X$ (including the empty set and $X$). With every $x \in X$, one associates a set $\mathcal{U}(x) \subseteq P(X)$, called the system of the neighborhoods of $x$, satisfying the following axioms:
(1) For each $U \in \mathcal{U}(x)$, $x \in U$,
(2) If $U \in \mathcal{U}(x)$ and $U \subseteq V \subseteq X$ then $V \in \mathcal{U}(x)$,
(3) If $U, V \in \mathcal{U}(x)$ then $U \cap V \in \mathcal{U}(x)$, and
(4) For every $U \in \mathcal{U}(x)$ there is a $V \in \mathcal{U}(x)$ such that $U \in \mathcal{U}(y)$ for every $y \in V$.
One then defines a (possibly empty) subset $U$ of $X$ to be open if $U$ is a neighborhood of each of its points. One defines a subset $A$ of $X$ to be closed if the complement
$X \smallsetminus A$ of $A$ is open. The closure of a subset $S$ of $X$ is defined by the formula $\overline{S} = \{x \mid \forall U \in \mathcal{U}(x),\ U \cap S \ne \emptyset\}$.
4.2 Nowadays probably the most common approach to the structure of topology is to define open sets first as a set of subsets of $X$ satisfying certain axioms. It may be perhaps less intuitive, but it turns out to be much simpler technically. In this approach, a topology on a set $X$ is a subset $\tau \subseteq P(X)$ satisfying
(1) $\emptyset, X \in \tau$,
(2) $U, V \in \tau \Rightarrow U \cap V \in \tau$,
(3) $U_i \in \tau,\ i \in J \Rightarrow \bigcup_{i \in J} U_i \in \tau$.
In other words, we may simply say that a topology is a subset of the set $P(X)$ of all subsets of $X$ which is closed under all unions and all finite intersections. (To include (1), we allow the union of an empty set of subsets of $X$, which is said to be $\emptyset$, and the intersection of an empty set of subsets of $X$, which is said to be $X$.) One then defines a closed set as a complement of an open set; $U$ is a neighborhood of $x$ if there is an open $V$ such that $x \in V \subseteq U$, and the closure is defined by the formula
$$\overline{A} = \bigcap \{B \mid A \subseteq B,\ B \text{ closed}\}.$$
A subset $A \subseteq X$ is called dense if $\overline{A} = X$.

Remark: It is possible to start equivalently with closed sets first and then define open sets as their complements; the axioms of closed sets are obtained by expressing the axioms for open sets in terms of their complements (see Exercise (9)).
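On a finite set the open-set axioms of 4.2 can be checked mechanically. The sketch below is an illustration under that finiteness assumption only (the function name `is_topology` is ours): for a finite family, closure under pairwise unions already gives closure under all unions.

```python
from itertools import combinations


def is_topology(X, tau):
    """Check axioms (1)-(3) of 4.2 for a family tau of subsets of a finite set X."""
    X = frozenset(X)
    tau = {frozenset(U) for U in tau}
    # Every member must be a subset of X.
    if any(not U <= X for U in tau):
        return False
    # Axiom (1): the empty set and X itself are open.
    if frozenset() not in tau or X not in tau:
        return False
    # Axioms (2) and (3): pairwise intersections and unions stay in tau;
    # for a finite family this covers arbitrary unions as well.
    for U, V in combinations(tau, 2):
        if U & V not in tau or U | V not in tau:
            return False
    return True
```

For instance, the chain $\emptyset \subseteq \{1\} \subseteq \{1, 2\} \subseteq \{1, 2, 3\}$ passes, while $\{\emptyset, \{1\}, \{2\}, \{1, 2, 3\}\}$ fails because $\{1\} \cup \{2\}$ is missing.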
4.3 Or, one can start with a closure operator $u : P(X) \to P(X)$ satisfying
(1) $u(\emptyset) = \emptyset$ and $A \subseteq u(A)$,
(2) $u(A \cup B) = u(A) \cup u(B)$ and
(3) $u(u(A)) = u(A)$.
$A$ is declared closed if $u(A) = A$, the open sets are complements of the closed ones, and $U$ is a neighborhood of $x$ if $x \notin u(X \smallsetminus U)$.
4.4 In fact one usually thinks of a topological space as a set endowed with all the above mentioned notions simultaneously, and the only question is which of them
one considers primitive concepts and which are defined afterwards. The resulting structure is the same. (See the Exercises.)
4.5 A topology is not always obtained from a metric (if it is we speak of a metrizable space). Here are two rather easy examples. (a) Take an infinite set $X$ and declare $U \subseteq X$ to be open if either it is empty or if $X \smallsetminus U$ is finite. (b) Take a partially ordered set $(X, \le)$ and declare $U$ to be open if $U = \{x \mid \exists y \in U,\ x \ge y\}$. (Note: this topology is metrizable for certain special choices of partial orderings, but certainly not in general.) Non-metrizable spaces of importance are of course seldom defined as easily as this. But it should be noted that many non-metrizable spaces are of interest today.
4.6 A mapping $f : X \to Y$ between topological spaces is continuous if for every $x \in X$ and every neighborhood $V$ of $f(x)$ there is a neighborhood $U$ of $x$ such that $f[U] \subseteq V$ (cf. (2) in Theorem 3.4). If we replace in 3.4 the metric definition of continuity (1) with the definition we just made, we have the following more general result:

Theorem. Let $X, Y$ be topological spaces. Then the following statements on a mapping $f : X \to Y$ are equivalent.
(1) $f$ is continuous.
(2) For every $U$ open in $Y$ the preimage $f^{-1}[U]$ is open in $X$.
(3) For every $A$ closed in $Y$ the preimage $f^{-1}[A]$ is closed in $X$.
(4) For every subset $A \subseteq X$, $f[\overline{A}] \subseteq \overline{f[A]}$.
(5) For every subset $B \subseteq Y$, $\overline{f^{-1}[B]} \subseteq f^{-1}[\overline{B}]$.

Proof. Most of the implications can be proved by the same reasoning as in 3.4. The only one needing a simple adjustment is (5)$\Rightarrow$(1): Let (5) hold and let $V$ be a neighborhood of $f(x)$. Thus, $f(x) \notin \overline{Y \smallsetminus V}$, that is, $x \notin f^{-1}[\overline{Y \smallsetminus V}]$, and hence, by (5), $x \notin \overline{f^{-1}[Y \smallsetminus V]}$. Hence, $U = X \smallsetminus \overline{f^{-1}[Y \smallsetminus V]} \subseteq f^{-1}[V]$ is a neighborhood of $x$, and $f[U] \subseteq f f^{-1}[V] \subseteq V$. □
4.7 The system of open sets constituting a topology $\tau$ is often determined by a so-called basis, which means a subset $\mathcal{B} \subseteq \tau$ such that
$$B_1, B_2 \in \mathcal{B} \Rightarrow B_1 \cap B_2 \in \mathcal{B} \quad\text{and}\quad \text{for every } U \in \tau,\ U = \bigcup \{B \mid B \in \mathcal{B},\ B \subseteq U\}.$$
(For example, the set of all open intervals, or the set of all open intervals with rational endpoints, are bases of the standard topology of the real line $\mathbb{R}$.) One may wish to define a topological space where some particular subsets are open, thus specifying a subset $\mathcal{S} \subseteq P(X)$ of such sets without any a priori properties. One easily sees that the smallest topology containing $\mathcal{S}$ is the set of all unions of finite intersections of elements of $\mathcal{S}$. Then one speaks of $\mathcal{S}$ as of a subbasis of the topology obtained. The preimages of (finite) intersections are (finite) intersections of preimages, and preimages of unions are unions of preimages. Consequently we obtain from 4.6 an important

Observation. A mapping $f : (X, \tau) \to (Y, \theta)$ is continuous if and only if there is a subbasis $\mathcal{S}$ of $\theta$ such that each $f^{-1}[S]$ with $S \in \mathcal{S}$ is open. (Thus e.g. to make sure a real function $f : X \to \mathbb{R}$ is continuous it suffices to check that all the $f^{-1}[(-\infty, a)]$ and $f^{-1}[(a, +\infty)]$ are open.)
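The description of the smallest topology containing a subbasis, as all unions of finite intersections, translates directly into a brute-force computation on a finite set. This is a sketch for small examples only (the name `topology_from_subbasis` is ours, and the running time is exponential in the number of sets); the empty intersection is taken to be $X$ and the empty union to be $\emptyset$, as in 4.2.

```python
from itertools import combinations


def topology_from_subbasis(X, S):
    """Smallest topology on the finite set X containing every member of S."""
    X = frozenset(X)
    S = [frozenset(s) for s in S]
    # Step 1: all finite intersections of members of S (empty intersection = X).
    basis = {X}
    for r in range(1, len(S) + 1):
        for combo in combinations(S, r):
            basis.add(frozenset.intersection(*combo))
    # Step 2: all unions of basis members (empty union = empty set).
    basis = list(basis)
    tau = {frozenset()}
    for r in range(1, len(basis) + 1):
        for combo in combinations(basis, r):
            tau.add(frozenset().union(*combo))
    return tau
```

With $X = \{1, 2, 3\}$ and the subbasis $\{\{1, 2\}, \{2, 3\}\}$, the intersection $\{2\}$ appears in step 1, and the resulting topology has five open sets.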
4.8 Let $(X, \tau)$ be a topological space and let $Y \subseteq X$ be a subset. We define the subspace of $(X, \tau)$ carried (or induced) by $Y$ as
$$(Y, \tau|Y) \quad\text{where}\quad \tau|Y = \{U \cap Y \mid U \in \tau\}.$$
Since for the embedding map $j : Y \to X$, $j^{-1}[U] = U \cap Y$, the map $j$ is continuous; furthermore, if $f : (Z, \theta) \to (X, \tau)$ is a continuous map such that $f[Z] \subseteq Y$ then the map $(z \mapsto f(z)) : (Z, \theta) \to (Y, \tau|Y)$ is continuous as well. Note that this is in accordance with the concept of subspace in the metric case: the metric subspace (cf. 2.1) has the topology just described, obtained from the topology of the larger metric space.
4.8.1 Convention Unless otherwise stated, the subsets of a topological space will be understood to be endowed with the induced topology, and we will subject the terminology to this convention. Thus we will speak of "connected subsets" or "compact subsets" etc. (see below) or on the other hand of an "open subspace" or "closed subspace", etc.
5 Connected spaces
One of the simplest notions defined for topological spaces is connectedness.
5.1 A topological space $X$ is said to be connected if for any two open sets $U, V \subseteq X$ which satisfy $U \cap V = \emptyset$ and $U \cup V = X$, we have $U = \emptyset$ (and hence $V = X$), or $V = \emptyset$ (and hence $U = X$). It is also common, for a subset $S \subseteq X$, to say that $S$ is connected if $S$ is a connected topological space with respect to the induced topology. Note that this is equivalent to saying that for open sets $U, V \subseteq X$ such that $U \cap V \cap S = \emptyset$ and $U \cup V \supseteq S$, we have $U \supseteq S$ or $V \supseteq S$. The following observations are immediate.

5.1.1 Proposition. Let $X$ be a connected space and $f : X \to Y$ a continuous map which is onto. Then $Y$ is connected.

Proof. Suppose $U, V \subseteq Y$ are open, $U \cap V = \emptyset$, $U \cup V = Y$. Then $f^{-1}[U] \cap f^{-1}[V] = \emptyset$, $f^{-1}[U] \cup f^{-1}[V] = X$, so $f^{-1}[U] = \emptyset$ or $f^{-1}[V] = \emptyset$, which implies $U = \emptyset$ or $V = \emptyset$ since $f$ is onto. □

5.1.2 Proposition. Let $S_i \subseteq X$, $i \in I$, and let each $S_i$ be connected. Suppose further that for every $i, j \in I$, there exist $i_0, \dots, i_k \in I$, $i_0 = i$, $i_k = j$, such that $S_{i_t} \cap S_{i_{t+1}} \ne \emptyset$. Then
$$S = \bigcup_{i \in I} S_i$$
is connected.

Proof. Suppose $U, V$ are open in $X$, $U \cup V \supseteq S$, $U \cap V \cap S = \emptyset$. Suppose further $U \cap S$ is non-empty. Then there exists an $i \in I$ such that $U \cap S_i \ne \emptyset$, and hence $U \supseteq S_i$ since $S_i$ is connected. Now select any $j \in I$ and let $i_0, \dots, i_k$ be as in the statement of the Proposition. By induction on $t$, we see that $U \cap S_{i_t} \ne \emptyset$, and hence $U \supseteq S_{i_t}$ since $S_{i_t}$ is connected. Thus, $U \supseteq S_j$. Since $j \in I$ was arbitrary, $U \supseteq S$. □

5.1.3 Corollary. A product $X \times Y$ of two connected metric spaces $X, Y$ is connected.

Proof. Choose a point $x \in X$ and consider the sets $S_0 = \{x\} \times Y$, $S_y = X \times \{y\}$ for $y \in Y$. Then $S_i$, $i \in Y \sqcup \{0\}$, satisfy the assumptions of Proposition 5.1.2. □

5.1.4 Proposition. The closure of a connected subset $S$ of a topological space is connected.
Proof. If $U, V \subseteq \overline{S}$ satisfy $U \cap V = \emptyset$, $U \cup V = \overline{S}$ and $U, V$ are non-empty open in $\overline{S}$, then $U \cap S$, $V \cap S$ are non-empty and open in $S$, their union is $S$ and their intersection is empty, contradicting the assumption that $S$ is connected. □
5.2 Connectedness of the real numbers

The fact that the set $\mathbb{R}$ of all real numbers is connected is "intuitively obvious", but must be proved with care. Let us start with a preliminary result.

5.2.1 Lemma. Every open set $U \subseteq \mathbb{R}$ is a union of countably (or finitely) many disjoint open intervals.

Proof. We know that $U$ is a union of countably many open intervals $U_i$, $i = 1, 2, \dots$, since open intervals $(q_1, q_2)$, $q_1, q_2 \in \mathbb{Q}$, form a basis of the topology of $\mathbb{R}$. Note also that if $V, W$ are open intervals and $V \cap W \ne \emptyset$, then $V \cup W$ is an open interval, and that an increasing union of open intervals is an open interval. Now consider an equivalence relation $\sim$ on $\{1, 2, \dots\}$ where $i \sim j$ if and only if there exist $i_0, \dots, i_k$ such that $i_0 = i$, $i_k = j$ and $U_{i_t} \cap U_{i_{t+1}} \ne \emptyset$. Then the sets
$$\bigcup_{i \in C} U_i$$
where $C$ are equivalence classes with respect to $\sim$ are disjoint open intervals whose union is $U$. □

5.2.2 Theorem. The connected subsets of $\mathbb{R}$ are precisely (open, closed, half-open, bounded, unbounded, etc.) intervals.

Proof. Let us first prove that intervals are connected. Let $J$ be an interval. Suppose $U, V$ are open in $\mathbb{R}$, $U \cup V \supseteq J$, $U \cap V \cap J = \emptyset$. Suppose $U \cap J$ is non-empty. By Lemma 5.2.1, $U$ is a disjoint union of countably many open intervals $U_i$, $i \in I \ne \emptyset$. Without loss of generality, none of the sets $U_i$ is disjoint with $J$. Choose $i \in I$, and suppose $U_i = (a, b)$ does not contain $J$. Then $(a, b) \cup J$ is an interval containing but not equal to $(a, b)$, so $a \in J$ or $b \in J$. Let, without loss of generality, $b \in J$. Then $b \notin V$, $b \notin U_j$, $j \ne i$, since $V \cap J$ and $U_j$, $j \ne i$, are disjoint with $U_i$ and $V$, $U_j$ are open. Thus, $b \in J \smallsetminus (U \cup V)$, which is a contradiction.

On the other hand, suppose that $S \subseteq \mathbb{R}$ is connected but isn't an interval. Then there exist points $x < z < y$, $x, y \in S$, $z \notin S$. But then $S \subseteq (-\infty, z) \cup (z, \infty)$, which contradicts the assumption that $S$ is connected. □

5.2.3 Corollary. The Euclidean space $\mathbb{R}^n$ is connected.

Proof. This follows from Theorem 5.2.2 and Corollary 5.1.3. □
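For a finite family of open intervals, the grouping used in the proof of Lemma 5.2.1 (chains of pairwise intersecting intervals merge into one interval) can be carried out by a single sorted pass. The sketch below is only the finite analogue; the function name is ours, and intervals are represented as pairs `(a, b)` standing for $(a, b)$.

```python
def merge_open_intervals(intervals):
    """Write a finite union of open intervals (a, b) as disjoint open intervals."""
    merged = []
    # Ignore empty intervals, then process from left to right.
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        # Open intervals overlap only if the new left end lies strictly
        # below the current right end: (0, 1) and (1, 2) do not intersect.
        if merged and a < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            merged.append((a, b))
    return merged
```

Note that $(0, 1)$ and $(1, 2)$ stay separate: their union misses the point $1$, so it is not an interval.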
5.3 Path-connected spaces

A topological space $X$ is called path-connected if for any two points $x, y \in X$, there exists a continuous map $\phi : \langle 0, 1\rangle \to X$ such that $\phi(0) = x$, $\phi(1) = y$. By Theorem 5.2.2, Proposition 5.1.1 and Proposition 5.1.2, a path-connected space is connected. See Exercise (14) for an example of a closed subset of $\mathbb{R}^2$ which is connected but not path-connected.

5.3.1 Proposition. Let $U \subseteq \mathbb{R}^n$ be a connected open set (with the induced topology). Then $U$ is path-connected.

Proof. If $U$ is empty, it is clearly path-connected. Suppose $U$ is non-empty. Choose a point $x \in U$. Let $V \subseteq U$ be the set of all points $y \in U$ for which there exists a continuous map $\phi : \langle 0, 1\rangle \to U$ such that $\phi(0) = x$, $\phi(1) = y$. We claim that $V$ is open in $U$: this is the same as being open in $\mathbb{R}^n$. If $\phi$ is as above, $\Omega(y, \varepsilon) \subseteq U$, and $z \in \Omega(y, \varepsilon)$, extend $\phi$ to a map $\langle 0, 2\rangle \to U$ by putting $\phi(1 + t) = tz + (1 - t)y$ for $t \in \langle 0, 1\rangle$. Clearly $\phi$ is continuous, and defining $\psi : \langle 0, 1\rangle \to U$ by $\psi(t) = \phi(2t)$ shows $z \in V$.

We also claim, however, that $V$ is closed in $U$: Let $y_n \to y$, $y_n \in V$, $y \in U$. Since $U$ is open, there exists an $\varepsilon > 0$ with $\Omega(y, \varepsilon) \subseteq U$. Then there exists an $n$ such that $y_n \in \Omega(y, \varepsilon)$. Then we proceed the same way as above: Let $\phi : \langle 0, 1\rangle \to U$, $\phi(0) = x$, $\phi(1) = y_n$. Extend $\phi$ to a map $\langle 0, 2\rangle \to U$ by putting $\phi(1 + t) = ty + (1 - t)y_n$ for $t \in \langle 0, 1\rangle$. Putting again $\psi(t) = \phi(2t)$ shows that $y \in V$.

Since $V \ne \emptyset$ (since $x \in V$), and since $V$ is open and closed in $U$, we must have $V = U$, since $U$ is connected. □
5.4 Connected components

Let $X$ be a topological space. Let $\sim$ be a relation on $X$ where $x \sim y$ if and only if there exists a connected subset $S \subseteq X$ such that $x, y \in S$. Then $\sim$ is an equivalence relation (transitivity follows from Proposition 5.1.2). The equivalence classes of $\sim$ are called the connected components of $X$. Also by Proposition 5.1.2, connected components are connected subsets of $X$. An immediate consequence of Proposition 5.1.4 is the following:

5.4.1 Lemma. Connected components of $X$ are closed subsets of $X$. □

Connected components may not be open: consider $\mathbb{Q}$ (with the topology induced from $\mathbb{R}$). Then the connected components are single points. We have, however,

5.4.2 Lemma. Let $U \subseteq \mathbb{R}^n$ be an open set. Then the connected components of $U$ are open in $U$ (hence in $\mathbb{R}^n$).
Proof. Let $x \in U$. Then there exists $\varepsilon > 0$ such that $\Omega(x, \varepsilon) \subseteq U$, but $\Omega(x, \varepsilon)$ is homeomorphic to $\mathbb{R}^n$ and hence connected by Corollary 5.2.3, so $\Omega(x, \varepsilon)$ is contained in the connected component of $x$. Since this is true for every point $x$, the connected components are open. □
5.5 A result on bounded closed intervals

The proof of the following result will seem, in nature, related to the proof of the fact that the real numbers are connected. While this is true, it turns out to be mainly due to special properties of the real numbers. The result itself is a reformulation of compactness, a notion which we will discuss in the next section. An understanding of this connection for general metric spaces, however, will have to be postponed until Chapter 9 below.

By an open interval (resp. bounded closed interval) in $\mathbb{R}^n$ we mean a set of the form $\prod_{k=1}^{n} (a_k, b_k)$ (resp. of the form $\prod_{k=1}^{n} \langle a_k, b_k\rangle$, $-\infty < a_k, b_k < \infty$).

Theorem. For every bounded closed interval $K$ in $\mathbb{R}^n$ and every set $S$ of open intervals such that $K \subseteq \bigcup_{I \in S} I$, there exists a finite subset $F \subseteq S$ such that
$$K \subseteq \bigcup_{I \in F} I.$$

Proof. Let us first consider the case $n = 1$. Let $\langle a, b\rangle$ be contained in a union of a set $S$ of open intervals. Let $t \in \langle a, b\rangle$ be the supremum of the set $M$ of all $s \in \langle a, b\rangle$ such that $\langle a, s\rangle$ is contained in a union of some finite subset of $S$. We want to prove that $t = b$. Assume, then, that $t < b$. Then there exists a $J \in S$ such that $t \in J$. On the other hand, by the definition of supremum, there exist $s_i \in M$ such that $s_i \nearrow t$. Then, for some $i$, $s_i \in J$. But we also know that there exists a finite subset $F \subseteq S$ whose union contains $\langle a, s_i\rangle$. Then the union of the finite subset $F \cup \{J\}$ contains $\langle a, x\rangle$ for every $x \in J$, contradicting $t = \sup M$. (The same argument applied to $t = b$ shows that $b \in M$.)

Now let us consider general $n$. Assume, by induction, that the statement holds with $n$ replaced by $n - 1$. Let $K = \langle a_1, b_1\rangle \times \cdots \times \langle a_n, b_n\rangle$. Then for every point $x \in \langle a_1, b_1\rangle$, there exists, by the induction hypothesis, a finite subset $F_x \subseteq S$ such that $\{x\} \times \langle a_2, b_2\rangle \times \cdots \times \langle a_n, b_n\rangle \subseteq \bigcup F_x$. Let $I_x$ be the intersection of all the (1-dimensional) intervals $I_1$ where $I_1 \times \cdots \times I_n \in F_x$. Then $\langle a_1, b_1\rangle$ is contained in the union of the open intervals $I_x$, $x \in \langle a_1, b_1\rangle$, and hence there are finitely many points $x_1, \dots, x_k \in \langle a_1, b_1\rangle$ such that $\langle a_1, b_1\rangle \subseteq \bigcup_{i=1}^{k} I_{x_i}$. Then $K$ is contained in the union of the open intervals in $F_{x_1} \cup \cdots \cup F_{x_k}$. □
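The case $n = 1$ of the proof pushes a covered initial segment $\langle a, s\rangle$ further and further to the right. When the family of open intervals is itself a finite list, the same idea becomes a simple greedy procedure. This is only an illustration under that assumption (the theorem concerns arbitrary families, and the function name is ours); intervals are pairs `(lo, hi)` standing for $(lo, hi)$.

```python
def finite_subcover(a, b, intervals):
    """Choose intervals from the list that cover [a, b]; return None if impossible."""
    chosen = []
    t = a  # invariant: every point of [a, t) lies in some chosen interval
    while True:
        # Intervals containing the first point not yet known to be covered.
        candidates = [(lo, hi) for lo, hi in intervals if lo < t < hi]
        if not candidates:
            return None  # the point t is not covered by the family
        best = max(candidates, key=lambda iv: iv[1])
        chosen.append(best)
        if best[1] > b:
            return chosen  # [a, b] is now covered
        t = best[1]  # the right endpoint itself is still uncovered
```

Each round moves `t` to a strictly larger right endpoint, so the loop ends after at most as many rounds as there are intervals in the list.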
Corollary. For every bounded closed interval $K$ in $\mathbb{R}^n$ and every set $Q$ of open sets such that $K \subseteq \bigcup_{I \in Q} I$, there exists a finite subset $F \subseteq Q$ such that $K \subseteq \bigcup_{I \in F} I$.

(Apply the theorem to the set $S$ of all open intervals which are contained in one of the open sets in $Q$.)
6 Compact metric spaces

6.1 A metric space $X$ is said to be compact if each sequence $(x_n)_n$ in $X$ contains a convergent subsequence. Thus, in particular, a bounded closed interval $\langle a, b\rangle$ in $\mathbb{R}$ is compact (recall Theorem 2.3 of Chapter 1).

6.2 Proposition. 1. A subspace of a compact space is compact if and only if it is closed. 2. If $f : X \to Y$ is continuous then the image $f[A]$ of any compact $A \subseteq X$ is compact.

Proof. 1. Let $A$ be a closed subspace of a compact $X$. Let $(x_n)_n$ be a sequence of points of $A$. There is a subsequence $x_{k_1}, x_{k_2}, x_{k_3}, \dots$ converging in $X$. Since $A$ is closed, the limit is in $A$. Now let $A$ not be closed. Then there is a sequence $(x_n)_n$ of elements of $A$ convergent in $X$, with the limit $x$ in $X \smallsetminus A$; since each subsequence converges to $x$, there is none converging to a point in $A$.

2. Let $(y_n)_n$ be a sequence in $f[A]$. Choose $x_i \in A$ such that $y_i = f(x_i)$. Since $A$ is compact we have a subsequence $x_{k_1}, x_{k_2}, x_{k_3}, \dots$ converging to an $x \in A$. Then by 1.6, $y_{k_1}, y_{k_2}, y_{k_3}, \dots$ converges to $f(x)$. □
6.2.1 Note that from the second part of the proof of the first statement we obtain an immediate Observation. A compact subspace of any metric space X is closed in X . Remark. Thus we have a slightly surprising consequence: if X is compact, Y is a general metric space and if f W X ! Y is a continuous mapping then, besides
preimages of closed sets being closed, also the images of closed sets are closed. We will learn more about this phenomenon in Chapter 9 below. For now, let us record the following

6.2.2 Corollary. Let $f : X \to Y$ be a continuous bijective (i.e. one to one and onto) map of metric spaces where $X$ is compact. Then $f$ is a homeomorphism.

6.3 Proposition. Let $X$ be a compact metric space. Then for each continuous real function $f$ on $X$ there exist $x_1, x_2 \in X$ such that
$$f(x_1) = \min\{f(x) \mid x \in X\} \quad\text{and}\quad f(x_2) = \max\{f(x) \mid x \in X\}.$$
(Compare with 3.4 of Chapter 1.)

Proof. A non-empty compact subspace $A$ of $\mathbb{R}$ has a minimal and a maximal point, namely $\inf A$ and $\sup A$, which are obviously limits of sequences in $A$. Apply this to $A = f[X]$, compact by 6.2. □

6.4 Proposition. (Finite) products of compact spaces are compact.

Proof. We will begin with the product $X \times Y$ of two compact metric spaces; the extension to a general finite product follows by induction. Let
$$(x_1, y_1),\ (x_2, y_2),\ (x_3, y_3),\ \dots \tag{*}$$
be a sequence of points of $X \times Y$. In $X$, choose a convergent subsequence $(x_{k_n})_n$ of $(x_n)_n$. Now take the sequence $(y_{k_n})_n$ in $Y$ and choose a convergent subsequence $(y_{k_{r_n}})_n$. Then by 2.2.2.2 (and (1.2.1)),
$$(x_{k_{r_1}}, y_{k_{r_1}}),\ (x_{k_{r_2}}, y_{k_{r_2}}),\ (x_{k_{r_3}}, y_{k_{r_3}}),\ \dots$$
is a convergent subsequence of (*). □
A metric space $(X, d)$ is bounded if there exists a number $K$ such that for all $x, y \in X$, $d(x, y) < K$. From the triangle inequality we immediately see that this is equivalent to any of the following statements: there is a $K$ such that for every $x$, $X \subseteq \Omega(x, K)$; for every $x$ there is a $K$ such that $X \subseteq \Omega(x, K)$.
6.5 Theorem. A subspace of the Euclidean space $\mathbb{R}^m$ is compact if and only if it is bounded and closed.

Proof. I. From Theorem 2.3 of Chapter 1, we already know that a bounded closed interval is compact.

II. Now let $X$ be a bounded closed subspace of $\mathbb{R}^m$. Since it is bounded there are intervals $\langle a_i, b_i\rangle$, $i = 1, \dots, m$, such that $X \subseteq J = \langle a_1, b_1\rangle \times \cdots \times \langle a_m, b_m\rangle$. By 6.4 and I, $J$ is compact. The subspace $X$ is closed in $\mathbb{R}^m$, hence in $J$, and hence it is compact by 6.2.

III. Let $X$ not be closed in $\mathbb{R}^m$. Then it is not compact, by 6.2.1.

IV. Let $X$ not be bounded. Choose arbitrarily $x_1$ and then $x_n$ such that $d(x_1, x_n) > n$. A convergent sequence is always bounded (all but finitely many of its elements are in the $\varepsilon$-ball of the limit). Thus, $(x_n)_n$ cannot have a convergent subsequence as it has no bounded one. □
6.6 We have already observed that uniform continuity is a much stronger property than continuity (even the real function $x \mapsto x^2$ is not uniformly continuous). But the situation is different for compact spaces. We have

Theorem. Let $X, Y$ be metric spaces and let $X$ be compact. Then a mapping $f : X \to Y$ is uniformly continuous if and only if it is continuous. (Compare with Theorem 3.5.1 of Chapter 1.)

Proof. Let $f$ be continuous but not uniformly continuous. Negating the definition, there is an $\varepsilon_0 > 0$ such that for every $\delta > 0$ there are $x(\delta), y(\delta)$ such that
$$d(x(\delta), y(\delta)) < \delta \quad\text{while}\quad d'(f(x(\delta)), f(y(\delta))) \ge \varepsilon_0.$$
Consider $x_n = x(\frac{1}{n})$ and $y_n = y(\frac{1}{n})$. Choose a convergent subsequence $(x_{k_n})_n$ of $(x_n)_n$ and a convergent subsequence $(y_{k_{r_n}})_n$ of $(y_{k_n})_n$, set $\tilde{x}_n = x_{k_{r_n}}$ and $\tilde{y}_n = y_{k_{r_n}}$, and finally $x = \lim \tilde{x}_n$ and $y = \lim \tilde{y}_n$. As $d(\tilde{x}_n, \tilde{y}_n) < \frac{1}{n}$, $x = y$. This is a contradiction since by continuity $f(x) = \lim f(\tilde{x}_n)$ and $f(y) = \lim f(\tilde{y}_n)$ and $d'(f(\tilde{x}_n), f(\tilde{y}_n))$ is always at least $\varepsilon_0$. □
7 Completeness
7.1 A sequence $(x_n)_n$ in a metric space $(X, d)$ is said to be Cauchy if
$$\forall \varepsilon > 0\ \exists n_0 \text{ such that } \forall m, n \ge n_0,\ d(x_m, x_n) < \varepsilon.$$

7.2 Proposition. 1. Every convergent sequence is Cauchy. 2. Let a Cauchy sequence $(x_n)_n$ contain a convergent subsequence; then the whole sequence $(x_n)_n$ converges. 3. Every Cauchy sequence is bounded.

Proof. 1. Let $\lim x_n = x$. For $\varepsilon > 0$ choose an $n_0$ such that $d(x_n, x) < \frac{\varepsilon}{2}$ for all $n \ge n_0$. Then for $m, n \ge n_0$,
$$d(x_m, x_n) \le d(x_m, x) + d(x, x_n) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

2. Let $(x_n)_n$ be Cauchy and let $(x_{k_n})_n$ be a subsequence converging to a point $x$. Choose an $n_1$ such that for $m, n \ge n_1$, $d(x_m, x_n) < \frac{\varepsilon}{2}$, and an $n_2$ such that for $n \ge n_2$, $d(x_{k_n}, x) < \frac{\varepsilon}{2}$. Set $n_0 = \max(n_1, n_2)$. Since $k_n \ge n$ we have, for $n \ge n_0$,
$$d(x_n, x) \le d(x_n, x_{k_n}) + d(x_{k_n}, x) < \varepsilon.$$

3. Choose $n_0$ such that for $m, n \ge n_0$, $d(x_m, x_n) < 1$. Then for any $n$,
$$d(x_n, x_{n_0}) < 1 + \max_{k \le n_0} d(x_{n_0}, x_k). \qquad □$$
7.3 A metric space $(X, d)$ is said to be complete if every Cauchy sequence in $X$ converges.

7.3.1 Proposition. A subspace $A$ of a complete space $X$ is complete if and only if it is closed.

Proof. Let $A$ be closed. If a sequence is Cauchy in $A$, it is Cauchy in $X$ and hence convergent. Since $A$ is closed, the limit of the sequence has to be in $A$. If $A$ is not closed there is a sequence $(x_n)_n$ with $x_n \in A$, convergent in $X$ to an $x \in X \smallsetminus A$. Then $(x_n)_n$ is Cauchy in $X$ and hence in $A$ as well; but all of its subsequences converge to $x$ and hence do not converge in $A$. $\square$
7.4 Proposition. A compact metric space is complete.

Proof. Let $(x_n)_n$ be a Cauchy sequence in a compact metric space $X$. Then it has a convergent subsequence, and by 7.2, item 2, it converges. $\square$

7.5 Theorem. The Euclidean space $\mathbb{R}^m$ (in particular, the real line $\mathbb{R}$) is complete. Consequently, a subspace of $\mathbb{R}^m$ is complete if and only if it is closed.

Proof. Let $(x_n)_n$ be a Cauchy sequence in $\mathbb{R}^m$. By 7.2, item 3, it is bounded and hence
$$\{x_n \mid n = 1, 2, \dots\} \subseteq J = \langle a_1, b_1\rangle \times \cdots \times \langle a_m, b_m\rangle$$
for sufficiently large intervals $\langle a_j, b_j\rangle$. By 6.4, $J$ is compact and hence complete by 7.4, so $(x_n)_n$ converges in $J$ and hence it converges in $\mathbb{R}^m$. The consequence follows from 7.3.1. $\square$

Remark. The special case of the real line is the well-known Bolzano-Cauchy Theorem (Theorem 2.4 of Chapter 1).
7.6 The following is the well-known Banach Fixed Point Theorem. At first sight it may seem that its use will be rather limited: the assumption is very strong. But the reader will perhaps be surprised by the generality of one of the applications in 3.3 of Chapter 6.

Theorem. Let $(X, d)$ be a complete metric space. Let $f : X \to X$ be a mapping such that there is a $q < 1$ with
$$d(f(x), f(y)) \le q \cdot d(x, y) \qquad (*)$$
for all $x, y \in X$. Then there is precisely one $x \in X$ such that $f(x) = x$.

Proof. Choose any $x_1 \in X$ and then, inductively, $x_{n+1} = f(x_n)$. Set $C = d(x_1, x_2)$. By the assumption we have
$$d(x_2, x_3) \le C q,\quad d(x_3, x_4) \le C q^2,\ \dots,\ d(x_n, x_{n+1}) \le C q^{n-1}.$$
Thus, by the triangle inequality, for $m \ge n + 1$,
$$d(x_n, x_m) \le C(q^{n-1} + q^n + \cdots + q^{m-2}) \le C q^{n-1}(1 + q + q^2 + \cdots) = \frac{C q^{n-1}}{1 - q}.$$
Hence, $(x_n)_n$ is a Cauchy sequence and we have a limit $x = \lim x_n$. Now a mapping $f$ satisfying (*) is clearly continuous and hence we have
$$f(x) = f(\lim x_n) = \lim f(x_n) = \lim x_{n+1} = x.$$
Finally, if $f(x) = x$ and $f(y) = y$ then $d(x, y) = d(f(x), f(y)) \le q \cdot d(x, y)$ with $q < 1$, which is possible only if $d(x, y) = 0$. $\square$
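The iteration $x_{n+1} = f(x_n)$ from the proof, together with the estimate $d(x_n, x) \le \frac{C q^{n-1}}{1-q}$ obtained by letting $m \to \infty$, can be tried out numerically. The sketch below assumes a concrete example not taken from the text: $f = \cos$ on the complete space $\langle 0, 1\rangle$, which maps the interval into itself and satisfies (*) with $q = \sin 1 < 1$ by the mean value theorem. The helper name `fixed_point_iterates` is hypothetical.

```python
import math

def fixed_point_iterates(f, x1, steps):
    """Return [x_1, x_2, ..., x_steps] with x_{n+1} = f(x_n)."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(f(xs[-1]))
    return xs

q = math.sin(1.0)                      # contraction constant of cos on <0, 1>
xs = fixed_point_iterates(math.cos, 1.0, 100)
C = abs(xs[0] - xs[1])                 # C = d(x_1, x_2)
x_star = xs[-1]                        # numerical stand-in for the fixed point

# Compare the actual distances d(x_n, x) with the bound C q^(n-1) / (1 - q).
bound_holds = all(
    abs(xs[n - 1] - x_star) <= C * q ** (n - 1) / (1 - q) + 1e-12
    for n in range(1, 31)
)
print(x_star, bound_holds)
```

The last iterate is only an approximation of the fixed point, so the comparison is a plausibility check rather than a proof of the estimate.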
7.7 An Example: Spaces of continuous functions
Let $X = (X, d)$ be a metric space. Denote by $C(X)$ the space of all bounded continuous real functions $f : X \to \mathbb{R}$, endowed with the metric
$$d(f, g) = \sup_{x \in X} |f(x) - g(x)|.$$
(The function $d$ thus defined really is a metric. Obviously $d(f, g) = 0$ implies $f = g$, and $d(f, g) = d(g, f)$. Suppose $d(f, g) + d(g, h) < d(f, h)$; then there is an $x \in X$ such that $d(f, g) + d(g, h) < |f(x) - h(x)|$, but then in particular $|f(x) - g(x)| + |g(x) - h(x)| < |f(x) - h(x)|$, a contradiction.)

Remark. Of course, by 2.4.2, if $X$ is compact then $C(X)$ is the space of all continuous functions on $X$.

7.7.1 Observation. The convergence in $C(X)$ is exactly the uniform convergence defined in 8.1. (We have $d(f, g) < \varepsilon$ if and only if for all $x \in X$, $|f(x) - g(x)| < \varepsilon$.)

7.7.2 Proposition. The space $C(X)$ with the metric defined above is complete.

Proof. Let $(f_n)_n$ be a Cauchy sequence in $C(X)$. Then, since $|f_n(x) - f_m(x)| \le d(f_n, f_m)$ for each $x \in X$, every $(f_n(x))_n$ is a Cauchy sequence in $\mathbb{R}$, and hence a convergent one. Set
$$f(x) = \lim_n f_n(x).$$

Claim. The sequence $(f_n)_n$ converges to $f$ uniformly.

Proof of the Claim. Consider an $\varepsilon > 0$. There exists an $n_0$ such that for $m, n \ge n_0$,
$$\forall x,\ |f_n(x) - f_m(x)| < \frac{\varepsilon}{2}$$
and hence
$$\lim_{m\to\infty} |f_n(x) - f_m(x)| = |f_n(x) - \lim_{m\to\infty} f_m(x)| = |f_n(x) - f(x)| \le \frac{\varepsilon}{2} < \varepsilon.$$
Thus, for $n \ge n_0$ and for all $x \in X$, $|f_n(x) - f(x)| < \varepsilon$. $\square$

Proof of the Proposition continued. By the Claim and 8.2, $f$ is continuous. Now there exists an $n_0$ such that for all $n, m \ge n_0$,
$$d(f_n, f_m) = \sup_x |f_n(x) - f_m(x)| < 1$$
and hence, taking the limit, we obtain $|f_n(x) - f(x)| \le 1$ for all $x$. Thus, if $|f_{n_0}(x)| \le K$ we have $|f(x)| \le K + 1$ for all $x$. Now we know that $f$ is bounded and continuous, hence $f \in C(X)$, and by 7.7.1 and the Claim again, $(f_n)_n$ converges to $f$ in $C(X)$. $\square$
7.7.3 Let $a, b \in \mathbb{R} \cup \{-\infty, +\infty\}$. Put
$$C(X; a, b) = \{f \in C(X) \mid \forall x,\ a \le f(x) \le b\}.$$

Proposition. The subspace $C(X; a, b)$ is closed in $C(X)$. Consequently, it is complete.

Proof. Recall 8.1.1. Since uniform convergence implies pointwise convergence, if $a \le f_n(x) \le b$ and $f_n$ converge to $f$ then $a \le f(x) \le b$ and $f \in C(X; a, b)$. The consequence follows from 7.3.1. $\square$
8 Uniform convergence of sequences of functions. Application: Tietze's Theorems
On various occasions we have seen that general facts the reader knew about real functions of one real variable held generally, and the proofs did not really need anything but replacing $|x - y|$ by the distance $d(x, y)$. For example, this was the case when studying the relationship between continuity and convergence, or when proving that continuous maps of compact spaces are automatically uniformly continuous; or the fact about maxima and minima of real functions on a compact space (where in fact the general proof was in a way simpler, or more transparent, due to the observation that the image of a compact space is compact). In this section we will introduce yet another case of such a mechanical extension, namely the behavior of uniformly convergent sequences of mappings, resp. uniformly convergent series of real functions. As an application we will present the rather important Tietze Theorems on extension of continuous maps.
8.1 Let $(X, d)$, $(Y, d')$ be metric spaces. A sequence of mappings $f_1, f_2, f_3, \dots : X \to Y$ is said to converge uniformly to $f$ if for every $\varepsilon > 0$ there is an $n_0$ such that for all $n \ge n_0$ and for all $x \in X$,
$$d'(f_n(x), f(x)) < \varepsilon.$$
This is usually indicated by $f_n \rightrightarrows f$.

8.1.1 Remarks
1. Note that if $f_n \rightrightarrows f$ then
$$\lim f_n(x) = f(x) \text{ for all } x. \qquad (*)$$
The statement (*) alone (called pointwise convergence) is much weaker, and would not suffice as an assumption in 8.2 below.
2. Also note that in the above definition, one uses the metric structure in $(Y, d')$ only. See 8.2.1 below.

8.2 Proposition. Let $f_n \rightrightarrows f$ for mappings $(X, d) \to (Y, d')$. Let all the functions $f_n$ be continuous. Then $f$ is continuous.

Proof. For $\varepsilon > 0$ choose $n$ such that $d'(f_n(x), f(x)) < \frac{\varepsilon}{3}$ for all $x$. Fix a point $x$. Since $f_n$ is continuous there is a $\delta > 0$ such that $d(x, y) < \delta$ implies $d'(f_n(x), f_n(y)) < \frac{\varepsilon}{3}$. Now we have the implication
$$d(x, y) < \delta \ \Rightarrow\ d'(f(x), f(y)) \le d'(f(x), f_n(x)) + d'(f_n(x), f_n(y)) + d'(f_n(y), f(y)) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$
$\square$
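The difference between pointwise and uniform convergence discussed in 8.1.1 and 8.2 can be made concrete with a small experiment. The sketch below uses two example sequences chosen here, not in the text: $x \mapsto x^n$ on $\langle 0, 1\rangle$, which converges pointwise to a discontinuous limit, and $x \mapsto x/n$, which converges uniformly to $0$. The helper `sup_distance` only samples a finite grid, so it yields a lower estimate of the true supremum.

```python
def sup_distance(f, g, points):
    """Largest |f(x) - g(x)| over the sample points (a lower estimate of the sup)."""
    return max(abs(f(x) - g(x)) for x in points)

grid = [i / 1000 for i in range(1001)]  # sample points of the interval <0, 1>

def pointwise_limit(x):
    """Pointwise limit of x^n on <0, 1>: 0 for x < 1 and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

def power(n):
    return lambda x: x ** n

def scaled(n):
    return lambda x: x / n

def zero(x):
    return 0.0

power_gaps = [sup_distance(power(n), pointwise_limit, grid) for n in (1, 10, 100)]
scaled_gaps = [sup_distance(scaled(n), zero, grid) for n in (1, 10, 100)]
print(power_gaps)   # stays close to 1: no uniform convergence
print(scaled_gaps)  # 1/n: uniform convergence
```

This matches Proposition 8.2: the limit of $x^n$ is not continuous at $1$, so the convergence cannot be uniform.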
8.2.1 Note that an analogous proposition also holds for a topological space $(X, \tau)$ instead of a metric one. In the proof replace the requirement of $\delta$ by a neighborhood $U$ of $x$ such that $f_n[U]$ is contained in the $\frac{\varepsilon}{3}$-ball around $f_n(x)$, and use for $y \in U$ the triangle inequality as before.

8.3 Corollary. Let $f_n : (X, d) \to \mathbb{R}$ be continuous functions, let $\sum a_n$ be a convergent series of real numbers, and let for every $n$ and every $x$, $|f_n(x)| \le a_n$. Then the functions $g_n(x) = \sum_{k=1}^{n} f_k(x)$ converge uniformly to $\sum_{k=1}^{\infty} f_k(x)$ and hence $g = (x \mapsto \sum_{k=1}^{\infty} f_k(x))$ is a continuous function.
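The mechanism behind Corollary 8.3 is that the distance between two partial sums is controlled by a tail of $\sum a_n$, independently of $x$. A minimal sketch, assuming the example $f_k(x) = \sin(kx)/k^2$ with $a_k = 1/k^2$ (an illustration chosen here, not taken from the text):

```python
import math

def partial_sum(n, x):
    """g_n(x) = sum_{k=1}^{n} sin(k x) / k^2."""
    return sum(math.sin(k * x) / k ** 2 for k in range(1, n + 1))

def tail_bound(n, m):
    """sum_{k=n+1}^{m} a_k with a_k = 1 / k^2."""
    return sum(1.0 / k ** 2 for k in range(n + 1, m + 1))

grid = [2 * math.pi * i / 200 for i in range(201)]  # sample points of <0, 2*pi>
n, m = 10, 200
observed = max(abs(partial_sum(m, x) - partial_sum(n, x)) for x in grid)
bound = tail_bound(n, m)
print(observed, bound)
```

Since $|g_m(x) - g_n(x)| \le \sum_{k=n+1}^{m} a_k$ for every $x$, the observed gap never exceeds the bound, and the bound tends to $0$ as $n$ grows.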
8.4 Lemma. Let $A, B$ be disjoint closed subsets of a metric space $(X, d)$ and let $\alpha, \beta$ be real numbers. Then there is a continuous function
$$\varphi = \Phi(A, B; \alpha, \beta) : X \to \mathbb{R}$$
such that
$$\varphi[A] \subseteq \{\alpha\},\quad \varphi[B] \subseteq \{\beta\} \quad\text{and}\quad \min\{\alpha, \beta\} \le \varphi(x) \le \max\{\alpha, \beta\}. \qquad (\Phi)$$

Proof. Set
$$\varphi(x) = \alpha + (\beta - \alpha)\frac{d(x, A)}{d(x, A) + d(x, B)}.$$
This definition is correct: $d(x, A) + d(x, B) = 0$ yields $d(x, A) = d(x, B) = 0$ and by closedness $x \in A$ and $x \in B$; but $A$ and $B$ are disjoint. Furthermore, for any set $C$ the function $x \mapsto d(x, C)$ is continuous (by the triangle inequality, $d(y, C) \le d(x, C) + d(x, y)$ and hence $|d(x, C) - d(y, C)| \le d(x, y)$), so that $\varphi$, obtained by arithmetic operations from continuous functions, is continuous as well. The properties listed in $(\Phi)$ are obvious. $\square$

8.5 Theorem. (Tietze) Let $A$ be a closed subspace of a metric space $X$ and let $J$ be a compact interval in $\mathbb{R}$. Then each continuous mapping $f : A \to J$ can be extended to a continuous $g : X \to J$ (that is, there is a continuous $g$ such that $g|A = f$).

Proof. For a degenerate interval $\langle a, a\rangle$ the statement is trivial and all the other compact intervals are homeomorphic; if the statement holds for $J_1$ and if $h : J \to J_1$ is a homeomorphism we can, for $f : A \to J$, extend $hf$ to a $\bar{g} : X \to J_1$ and then take $g = h^{-1}\bar{g}$. Thus we can choose the $J$ arbitrarily. For our purposes, $J = \langle -1, 1\rangle$ will be particularly convenient.
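Set $A_1 = f^{-1}[\langle -1, -\frac{1}{3}\rangle]$ and $B_1 = f^{-1}[\langle \frac{1}{3}, 1\rangle]$ and consider
$$\varphi_1 = \Phi(A_1, B_1; -\tfrac{1}{3}, \tfrac{1}{3}).$$
We obviously have
$$\forall x \in A,\quad |f(x) - \varphi_1(x)| \le \frac{2}{3}.$$
Set $f_1 = f - \varphi_1$ (restricted to $A$). Suppose we already have continuous $f = f_0, f_1, \dots, f_n : A \to \langle -1, 1\rangle$ and $\varphi_1, \varphi_2, \dots, \varphi_n : X \to \langle -1, 1\rangle$ such that for all $k = 1, \dots, n$,
$$|\varphi_k(x)| \le \frac{2^{k-1}}{3^k},\quad f_k(x) = f_{k-1}(x) - \varphi_k(x) \quad\text{and}\quad |f_k(x)| \le \frac{2^k}{3^k}. \qquad (*)$$
Then set
$$A_{n+1} = f_n^{-1}\Big[\Big\langle -\frac{2^n}{3^n}, -\frac{2^n}{3^{n+1}}\Big\rangle\Big],\quad B_{n+1} = f_n^{-1}\Big[\Big\langle \frac{2^n}{3^{n+1}}, \frac{2^n}{3^n}\Big\rangle\Big],$$
$$\varphi_{n+1} = \Phi\Big(A_{n+1}, B_{n+1}; -\frac{2^n}{3^{n+1}}, \frac{2^n}{3^{n+1}}\Big) \quad\text{and}\quad f_{n+1} = f_n - \varphi_{n+1}.$$
Thus we obtain sequences of continuous functions $\varphi_1, \varphi_2, \dots, \varphi_k, \dots$ and $f = f_0, f_1, \dots, f_k, \dots$ satisfying (*) for all $k$. By 8.3, we have a continuous function
$$g = \Big(x \mapsto \sum_{k=1}^{\infty} \varphi_k(x)\Big) : X \to \mathbb{R}$$
and since $|g(x)| \le \sum_{k=1}^{\infty} \frac{2^{k-1}}{3^k} = 1$, we can view it as a continuous function $g : X \to \langle -1, 1\rangle$. Now let $x \in A$. We have
$$f(x) = \varphi_1(x) + f_1(x) = \varphi_1(x) + \varphi_2(x) + f_2(x) = \cdots = \varphi_1(x) + \cdots + \varphi_n(x) + f_n(x)$$
and since $\lim_n f_n(x) = 0$ we conclude that $f(x) = g(x)$. $\square$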
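The function $\Phi(A, B; \alpha, \beta)$ of Lemma 8.4, used repeatedly in the proof above, is explicit enough to be computed. The following sketch assumes the simplest setting, chosen here for illustration: $X = \mathbb{R}$, and $A$, $B$ disjoint closed intervals given by their endpoints. The names `dist_to_interval` and `separating_function` are hypothetical.

```python
def dist_to_interval(x, a, b):
    """Distance from the real number x to the closed interval <a, b>."""
    return max(a - x, 0.0, x - b)

def separating_function(x, A, B, alpha, beta):
    """phi(x) = alpha + (beta - alpha) * d(x, A) / (d(x, A) + d(x, B))."""
    dA = dist_to_interval(x, *A)
    dB = dist_to_interval(x, *B)
    # The denominator is nonzero because A and B are disjoint and closed.
    return alpha + (beta - alpha) * dA / (dA + dB)

A, B = (0.0, 1.0), (2.0, 3.0)
alpha, beta = -1.0 / 3.0, 1.0 / 3.0
samples = [-1.0 + 0.05 * i for i in range(101)]  # points of <-1, 4>
values = [separating_function(x, A, B, alpha, beta) for x in samples]
print(min(values), max(values))
```

On $A$ the function equals $\alpha$, on $B$ it equals $\beta$, and in between it interpolates continuously, which is exactly what one step of the Tietze construction needs.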
8.5.1 Theorem. (Tietze's Real Line Theorem) Let $A$ be a closed subspace of a metric space $X$. Then each continuous mapping $f : A \to \mathbb{R}$ can be extended to a continuous $g : X \to \mathbb{R}$.

Proof. We can replace $\mathbb{R}$ by any space homeomorphic with $\mathbb{R}$ (recall the first paragraph of the previous proof). We will take the open interval $(-1, 1)$ instead and extend a map $f : A \to (-1, 1)$. By 8.5, $f$ can be extended to a $\bar{g} : X \to \langle -1, 1\rangle$. Such a $\bar{g}$ can, however, reach the values $-1$ or $1$ and hence is not an extension as desired. To remedy the situation, consider $B = \bar{g}^{-1}[\{-1, 1\}]$, which is a closed set disjoint with $A$, consider the function $\varphi = \Phi(B, A; 0, 1)$ from 8.4 (so that $\varphi$ is $0$ on $B$ and $1$ on $A$), and define
$$g(x) = \bar{g}(x) \cdot \varphi(x).$$
Now we have $f(x) = \bar{g}(x) = g(x)$ for $x \in A$, and $|g(x)| < 1$ for all $x \in X$: if $\bar{g}(x) = 1$ or $-1$ then $\varphi(x) = 0$. $\square$
8.5.2 A subspace $R$ of a space $Y$ is said to be a retract of $Y$ if there exists a continuous $r : Y \to R$ such that $r(x) = x$ for all $x \in R$. A metric space $Y$ is injective if for every metric space $X$ and closed $A \subseteq X$, each continuous $f : A \to Y$ can be extended to a continuous $g : X \to Y$. (Thus, we have learned above that $\mathbb{R}$ and any compact interval are injective spaces.)

Theorem. Every retract of a Euclidean space is injective.

Proof. First we will prove that a Euclidean space itself is injective. Consider it as the product
$$\mathbb{R}^m = \underbrace{\mathbb{R} \times \cdots \times \mathbb{R}}_{m \text{ times}}$$
with the projections $p_j((x_1, \dots, x_m)) = x_j$. Let $f : A \to \mathbb{R}^m$ be a continuous mapping. Then we have by 8.5.1 continuous $g_j : X \to \mathbb{R}$ such that $g_j|A = p_j f$. By 2.2.2 we have the continuous
$$g = (x \mapsto (g_1(x), \dots, g_m(x))) : X \to \mathbb{R}^m$$
and for $x \in A$ we obtain $g(x) = (p_1 f(x), \dots, p_m f(x)) = f(x)$. Now let $Y$ be a retract of $\mathbb{R}^m$ with a retraction $r : \mathbb{R}^m \to Y$ and an inclusion map $j : Y \to \mathbb{R}^m$ (thus, $rj = \mathrm{id}$). Now if $f : A \to Y$ (or, rather, $jf : A \to \mathbb{R}^m$) is extended to $\bar{g} : X \to \mathbb{R}^m$, the desired extension $g$ is $r\bar{g}$. $\square$
9 Exercises
(1) Prove 1.4.1.
(2) Prove Proposition 1.5.1.
(3) Prove Observation 1.5.2.
(4) Prove that $f : (X, d) \to (Y, d')$ is continuous if and only if for each convergent sequence $(x_n)_n$ in $(X, d)$ the sequence $(f(x_n))_n$ is convergent (not specifying the limits).
(5) (a) Consider the set of real numbers $\mathbb{R}$. Prove that the function $d'(x, y) = |x^3 - y^3|$ is a metric which is not equivalent to the metric $d$ given in Example 1.1.1 (a). (b) Prove that nevertheless, neighborhoods with respect to $d$ are the same as neighborhoods with respect to $d'$.
(6) Each $\varepsilon$-ball $\{y \mid d(x, y) < \varepsilon\}$ is open (use the triangle inequality).
(7) Let $Y$ be a subspace of $(X, d)$. $U$ is open (closed) in $Y$ if and only if there exists an open (closed) $V$ in $X$ such that $U = V \cap Y$. The closure of $A$ in $Y$ is $\overline{A} \cap Y$ where $\overline{A}$ is the closure in $X$ (discuss this from the various aspects of closure as presented in 3.3).
(8) Find an example when uniform continuity is not preserved under homeomorphism.
(9) Write down a definition of topology based on closed subsets of $X$.
(10) Check that the closures as defined in 4.1 and 4.2 satisfy the requirements of 4.3.
(11) Starting with open sets, define neighborhoods, and from them define closure as indicated above. Prove that you get the same as the closure defined from open sets directly.
(12) Start with open sets, define neighborhoods, and then open sets as in 4.1. Prove that the open sets thus defined are precisely the same sets as the original ones (note the role of the somewhat clumsy requirement (4) in 4.1).
(13) Preserving connectedness is not the same as continuity. Give an example of a map $f : X \to Y$ such that for every connected $S \subseteq X$, $f[S]$ is connected (with the induced topology from $Y$), but $f$ is not continuous. [Hint: Take $X = \mathbb{Q}$, the rational numbers.]
(14) Let $X \subseteq \mathbb{R}^2$ be the union of the set of all points $(0, y)$, $y \in \langle -1, 1\rangle$, and the set of all points $(x, \sin(1/x))$, $x > 0$, with the induced topology. (a) Prove that $X \subseteq \mathbb{R}^2$ is a closed subset. (b) Prove that $X$ is connected but not path-connected.
(15) Let $U \subseteq \mathbb{R}^n$ be a connected open set, and let $x, y \in U$. Prove that there exist $x_0, \dots, x_k \in U$, $x_0 = x$, $x_k = y$, such that for each $t = 0, \dots, k-1$ the straight line segment connecting $x_t, x_{t+1}$ is contained in $U$. [Hint: mimic the proof of Proposition 5.3.1.]
(16) Path-connected components are defined the same way as connected components in 5.4, with the word "connected" replaced by the word "path-connected". Are path-connected components necessarily closed? Prove or give a counterexample.
(17) Check that convergence in the metric spaces defined in 1.1.1 (d), (e) is precisely uniform convergence.
(18) Prove an analogue of Proposition 8.2 for uniform continuity instead of continuity.
(19) Let $K$ be the set of all real numbers of the form $\sum_{k=1}^{\infty} a_k 3^{-k}$, where $a_k \in \{0, 2\}$. (This is called the Cantor set.) Prove that $K$ is compact. Prove that $K$ contains no compact interval with more than one point.
(20) Prove that a subspace of $\mathbb{R}^m$ is injective if and only if it is a retract.
3 Multivariable Differential Calculus
In this chapter, we will learn multivariable differential calculus. We will develop the multivariable versions of the concept of a derivative, and prove the Implicit Function Theorem. We will also learn how to use derivatives to find extremes of multivariable functions. To understand multivariable differential calculus, one must be familiar with linear algebra. We assume that the typical reader of this book will already have had a course in linear algebra, but for convenience we review the basic concepts in Appendices A and B. We refer periodically to results of these Appendices, and we recommend that the reader who has seen some linear algebra simply start reading the present chapter, and refer to these results in the Appendices as needed. Notationally, the most important are the conventions in Sections 1.3 and 7.3 of Appendix A below: $\mathbb{R}^n$ will be the space of real $n$-dimensional column vectors (matrices of type $n \times 1$). To avoid awkward notation, however, we will usually write rows and decorate them with the superscript $T$, which means transposition (Subsection 7.3 in Appendix A). Row or column vectors will be denoted by bold-faced letters, such as $\mathbf{v}$. The zero vector (origin) will be denoted by $\mathbf{o}$.
1 Real and vector functions of several variables
1.1 We will deal with real functions of several real variables, that is, mappings $f : D \to \mathbb{R}$ with a domain $D \subseteq \mathbb{R}^n$. Typically, $D$ will be open. Interchangeably with $f(\mathbf{x})$ where, in accordance with convention 7.3 of Appendix A, $\mathbf{x} = (x_1, \dots, x_n)^T$, we will also write $f(x_1, \dots, x_n)$. When $\mathbf{x} \in \mathbb{R}^m$, $\mathbf{y} \in \mathbb{R}^n$, notations such as $f(\mathbf{x}, \mathbf{y})$, $f(\mathbf{x}, y_1, \dots, y_n)$ will also be allowed for a function $f$ of $m + n$ variables. Given such a function $f$, we will often be concerned with the associated functions of one variable
$$\varphi(t) = f(x_1, \dots, x_{k-1}, t, x_{k+1}, \dots, x_n), \quad x_j\ (j \ne k) \text{ fixed}. \qquad (1)$$
It is useful to realize right away that the study of an $f : D \to \mathbb{R}$ cannot be reduced to the system of all such functions of one variable. For instance, all of the functions (1) may be continuous while $f$ itself is not. See the following example. Set
$$f(x, y) = \begin{cases} \dfrac{(x - y)^2}{x^2 + y^2} & \text{for } (x, y) \ne (0, 0), \\ 1 & \text{for } (x, y) = (0, 0). \end{cases} \qquad (2)$$
Then each $f(a, -)$ and each $f(-, b)$ is continuous, but $f$ is not: the sequence $(\frac{1}{n}, \frac{1}{n})$ converges to $(0, 0)$ while $\lim f(\frac{1}{n}, \frac{1}{n}) = 0 \ne f(0, 0)$.
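The example can be checked by direct evaluation. The sketch below assumes that the function in (2) is $f(x, y) = \frac{(x - y)^2}{x^2 + y^2}$ away from the origin and $f(0, 0) = 1$; it compares the values along a coordinate axis with the values along the diagonal.

```python
def f(x, y):
    """The function of example (2): 1 at the origin, (x - y)^2 / (x^2 + y^2) elsewhere."""
    if x == 0.0 and y == 0.0:
        return 1.0
    return (x - y) ** 2 / (x ** 2 + y ** 2)

# Along the x-axis the values agree with f(0, 0) = 1 ...
along_x_axis = [f(1.0 / n, 0.0) for n in range(1, 6)]
# ... while along the diagonal they are 0, although (1/n, 1/n) tends to (0, 0).
along_diagonal = [f(1.0 / n, 1.0 / n) for n in range(1, 6)]
print(along_x_axis, along_diagonal)
```

So restricting to lines parallel to the axes hides the discontinuity at the origin, which only shows up when both variables move at once.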
1.2 Recall again Conventions 1.3 and 7.3 of Appendix A. It is important to note that a vector function
$$\mathbf{f} = (f_1, \dots, f_m)^T : D \to \mathbb{R}^m, \quad f_j : D \to \mathbb{R},$$
is continuous if and only if all the $f_j$ are continuous (recall Theorem 2.2.2 of Chapter 2).
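1.3 Composition

Vector functions $\mathbf{f} : D \to \mathbb{R}^m$, $D \subseteq \mathbb{R}^n$, and $\mathbf{g} : D' \to \mathbb{R}^k$, $D' \subseteq \mathbb{R}^m$, can be composed whenever $\mathbf{f}[D] \subseteq D'$, and we shall write
$$\mathbf{g} \circ \mathbf{f} : D \to \mathbb{R}^k \quad (\text{if there is no danger of confusion, } \mathbf{g}\mathbf{f} : D \to \mathbb{R}^k)$$
for the composition (sending $\mathbf{x}$ to $\mathbf{g}(\mathbf{f}(\mathbf{x}))$), without pedantically restricting $\mathbf{f}$ to a map $\mathbf{f}' : D \to D'$ first.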
2 Partial derivatives. Defining the existence of a total differential
2.1 Let $f : D \to \mathbb{R}$ be a real function of $n$ variables. The partial derivative of $f$ by $x_k$ (or, the $k$-th partial derivative) at the point $(x_1, \dots, x_n)$ is the (ordinary) derivative of the function of 1.1 (1), i.e. the limit
$$\lim_{h \to 0} \frac{f(x_1, \dots, x_{k-1}, x_k + h, x_{k+1}, \dots, x_n) - f(x_1, \dots, x_n)}{h}. \qquad (*)$$
The standard notation is
$$\frac{\partial f(x_1, \dots, x_n)}{\partial x_k} \quad\text{or}\quad \frac{\partial f}{\partial x_k}(x_1, \dots, x_n);$$
in case of multiple variables denoted by different letters, say for $f(x, y)$, we write, of course,
$$\frac{\partial f(x, y)}{\partial x} \quad\text{and}\quad \frac{\partial f(x, y)}{\partial y}, \quad\text{etc.}$$
This notation is slightly inconsistent: the $x_k$ in the "denominator" $\partial x_k$ just indicates focusing on the $k$-th variable while the $x_k$ in the $f(x_1, \dots, x_n)$ in the "numerator" refers to an actual value of the argument. When confusion is possible, one can write more specifically
$$\left.\frac{\partial f(x_1, \dots, x_n)}{\partial x_k}\right|_{(x_1, \dots, x_n) = (a_1, \dots, a_n)}.$$
However, we will use this notation only occasionally.

Example.
$$\frac{\partial (x^2 + e^{xy + \sin(y)})}{\partial x} = 2x + y e^{xy + \sin(y)}, \qquad \frac{\partial (x^2 + e^{xy + \sin(y)})}{\partial y} = (x + \cos(y)) e^{xy + \sin(y)}.$$

2.1.1 It can happen (and typically it does) that partial derivatives $\frac{\partial f(x_1, \dots, x_n)}{\partial x_k}$ exist for all $(x_1, \dots, x_n)$ in some domain $D' \subseteq D$. In such case, we obtain a function
$$\frac{\partial f}{\partial x_k} : D' \to \mathbb{R}.$$
It is usually obvious from the context whether, speaking of a partial derivative, we have in mind a function or just a number, as in the definition 2.1, (*) above.
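The two formulas of the Example in 2.1 can be compared against difference quotients of the form (*). The sketch below is only a numerical plausibility check; it uses a symmetric difference quotient (a choice made here for better accuracy, not the one-sided quotient of the definition), and the helper name `difference_quotient` and the test point are hypothetical.

```python
import math

def f(x, y):
    return x ** 2 + math.exp(x * y + math.sin(y))

def df_dx(x, y):
    """Partial derivative by x, as computed in the Example."""
    return 2 * x + y * math.exp(x * y + math.sin(y))

def df_dy(x, y):
    """Partial derivative by y, as computed in the Example."""
    return (x + math.cos(y)) * math.exp(x * y + math.sin(y))

def difference_quotient(func, point, k, h=1e-6):
    """Symmetric difference quotient in the k-th variable (k is 0-based)."""
    plus = list(point)
    minus = list(point)
    plus[k] += h
    minus[k] -= h
    return (func(*plus) - func(*minus)) / (2 * h)

point = (0.3, 0.7)
gap_x = abs(difference_quotient(f, point, 0) - df_dx(*point))
gap_y = abs(difference_quotient(f, point, 1) - df_dy(*point))
print(gap_x, gap_y)
```

Only the $k$-th coordinate is varied in each quotient, which is precisely the passage to the one-variable function of 1.1 (1).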
2.2 We shall write
$$\|\mathbf{x}\| = \max_i |x_i|$$
for the distance of $\mathbf{x}$ from $\mathbf{o}$ (for our purposes we could have taken any of the equivalent distances (recall Subsection 2.2 of Chapter 2) such as the Euclidean norm $\sqrt{\mathbf{x}\mathbf{x}}$ where $\mathbf{x}\mathbf{x}$ is the dot product (see Appendix A, 4.3); our choice is perhaps the most convenient technically because of its simple behavior with respect to products).

We say that $f(x_1, \dots, x_n)$ has a total differential at a point $\mathbf{a} = (a_1, \dots, a_n)$ if there exists a function $\mu$ continuous in a neighborhood $U$ of $\mathbf{o}$ which satisfies $\mu(\mathbf{o}) = 0$ (in an alternate but equivalent formulation, one requires $\mu$ to be defined in $U \smallsetminus \{\mathbf{o}\}$ and satisfy $\lim_{\mathbf{h} \to \mathbf{o}} \mu(\mathbf{h}) = 0$), and numbers $A_1, \dots, A_n$ such that
$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{k=1}^{n} A_k h_k + \|\mathbf{h}\| \mu(\mathbf{h}) \qquad (2.2.1)$$
(using the dot product, we may write $f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \mathbf{A}\mathbf{h} + \|\mathbf{h}\| \mu(\mathbf{h})$).

2.3 Proposition. Let a function $f$ have a total differential at a point $\mathbf{a}$, as in the definition above. Then
1. $f$ is continuous in $\mathbf{a}$.
2. $f$ has all the partial derivatives in $\mathbf{a}$ and one has
$$\frac{\partial f(\mathbf{a})}{\partial x_k} = A_k.$$

Proof. 1. Writing $\mathbf{x}$ for the point $\mathbf{a}$, we have
$$|f(\mathbf{x}) - f(\mathbf{y})| \le |\mathbf{A}(\mathbf{x} - \mathbf{y})| + |\mu(\mathbf{x} - \mathbf{y})|\,\|\mathbf{x} - \mathbf{y}\|$$
and the limit of the right-hand side for $\mathbf{y} \to \mathbf{x}$ is clearly 0.

2. We have
$$\frac{1}{h}\big(f(x_1, \dots, x_{k-1}, x_k + h, x_{k+1}, \dots, x_n) - f(x_1, \dots, x_n)\big) = A_k + \mu((0, \dots, 0, h, 0, \dots, 0)) \frac{\|(0, \dots, h, \dots, 0)\|}{h},$$
and the limit of the right-hand side is clearly $A_k$. $\square$
2.4 Directional derivatives

It may now seem silly to prefer the basis vectors in $\mathbb{R}^n$ when defining partial derivatives. In effect, for any vector $\mathbf{v} \in \mathbb{R}^n$, one can define a directional derivative of $f$ by $\mathbf{v}$ by
$$\partial_{\mathbf{v}} f(\mathbf{x}) = \lim_{h \to 0} \frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h}.$$
(Caution: Some calculus textbooks use a different convention, calling the $\partial_{\mathbf{v}/\|\mathbf{v}\|}$ the directional derivative when $\mathbf{v} \ne \mathbf{o}$, the point being that it only depends on the "direction" of $\mathbf{v}$. The notion as we defined it, without requiring any assumption on $\mathbf{v}$, and moreover linear in $\mathbf{v}$, is much more natural for use in geometry, as we will see later.) In any case, the following fact is proved precisely in the same way as Proposition 2.3:

Proposition. If a function $f$ has a total differential at a point $\mathbf{a}$, and $\mathbf{v} \in \mathbb{R}^n$ is any vector, then the corresponding directional derivative exists and one has
$$\partial_{\mathbf{v}} f(\mathbf{a}) = \sum_{k=1}^{n} A_k v_k.$$
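The formula $\partial_{\mathbf{v}} f(\mathbf{a}) = \sum A_k v_k$ can be compared with the defining difference quotient on a concrete function. The sketch below assumes the example $f(x, y) = x^2 y + \sin y$, whose partial derivatives $2xy$ and $x^2 + \cos y$ are continuous, so that the $A_k$ are these partial derivatives; the function, the point and the vector are choices made here for illustration.

```python
import math

def f(x, y):
    return x ** 2 * y + math.sin(y)

def gradient(x, y):
    """The numbers A_1, A_2: the partial derivatives of f at (x, y)."""
    return (2 * x * y, x ** 2 + math.cos(y))

def directional_quotient(a, v, h):
    """(f(a + h v) - f(a)) / h, whose limit for h -> 0 defines the directional derivative."""
    shifted = (a[0] + h * v[0], a[1] + h * v[1])
    return (f(*shifted) - f(*a)) / h

a, v = (1.0, 0.5), (3.0, -2.0)
predicted = sum(A_k * v_k for A_k, v_k in zip(gradient(*a), v))
quotients = [directional_quotient(a, v, h) for h in (1e-2, 1e-4, 1e-6)]
print(predicted, quotients)
```

Note that $\mathbf{v}$ is not normalized here, in line with the convention of 2.4; doubling $\mathbf{v}$ doubles the directional derivative.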
2.5 The formula
$$f(a_1 + h_1, \dots, a_n + h_n) - f(a_1, \dots, a_n) = f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{k=1}^{n} A_k h_k + \|\mathbf{h}\| \mu(\mathbf{h})$$
may be interpreted as saying that in a small neighborhood of $\mathbf{a}$, the function $f$ is well approximated by the affine function (see Appendix A, 5.9)
$$L(x_1, \dots, x_n) = f(a_1, \dots, a_n) + \sum A_k (x_k - a_k):$$
by the required properties of $\mu$, the error term is much smaller than the difference $\mathbf{x} - \mathbf{a}$.

In case of just one variable, there is no distinction between having a derivative at $a$ and having a total differential at the same point. In case of more than one variable, however, the difference between having all partial derivatives and having a total differential at a point is tremendous. A function $f$ may have all partial derivatives in an open set without even being continuous there: in the example 1.1 (2), both partial derivatives exist everywhere. If we consider a single point, there are even much simpler examples, say the function $f$ defined by $f(x, 0) = f(0, y) = 0$ for all $x, y$, and $f(x, y) = 1$ otherwise. Then both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ still exist at the point $(0, 0)$.

What is happening geometrically is this: If we think of a function $f$ as represented by its "graph", the hypersurface
$$S = \{(x_1, \dots, x_n, f(x_1, \dots, x_n)) \mid (x_1, \dots, x_n) \in D\} \subseteq \mathbb{R}^{n+1}, \qquad (*)$$
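the partial derivatives describe just the tangent lines in the directions of the coordinate axes, while a total differential guarantees the existence of an entire tangent hyperplane. Possessing continuous partial derivatives is another matter, though.

2.6 Theorem. Let $f$ have continuous partial derivatives in a neighborhood of a point $\mathbf{a}$. Then $f$ has a total differential at $\mathbf{a}$.

Proof. Let
$$\mathbf{h}^{(0)} = \mathbf{h},\quad \mathbf{h}^{(1)} = (0, h_2, \dots, h_n),\quad \mathbf{h}^{(2)} = (0, 0, h_3, \dots, h_n) \quad\text{etc.}$$
(so that $\mathbf{h}^{(n)} = \mathbf{o}$). Then we have
$$f(\mathbf{a} + \mathbf{h}) - f(\mathbf{a}) = \sum_{k=1}^{n} \big(f(\mathbf{a} + \mathbf{h}^{(k-1)}) - f(\mathbf{a} + \mathbf{h}^{(k)})\big) =: M.$$
The points $\mathbf{a} + \mathbf{h}^{(k-1)}$ and $\mathbf{a} + \mathbf{h}^{(k)}$ differ only in the $k$-th coordinate. By Lagrange's Theorem, there are $0 \le \theta_k \le 1$ such that
$$f(\mathbf{a} + \mathbf{h}^{(k-1)}) - f(\mathbf{a} + \mathbf{h}^{(k)}) = \frac{\partial f(\mathbf{c}^{(k)})}{\partial x_k} h_k, \quad\text{where } \mathbf{c}^{(k)} = (a_1, \dots, a_{k-1}, a_k + \theta_k h_k, a_{k+1} + h_{k+1}, \dots, a_n + h_n),$$
and hence we can proceed with
$$M = \sum \frac{\partial f(\mathbf{c}^{(k)})}{\partial x_k} h_k = \sum \frac{\partial f(\mathbf{a})}{\partial x_k} h_k + \sum \Big(\frac{\partial f(\mathbf{c}^{(k)})}{\partial x_k} - \frac{\partial f(\mathbf{a})}{\partial x_k}\Big) h_k = \sum \frac{\partial f(\mathbf{a})}{\partial x_k} h_k + \|\mathbf{h}\| \sum \Big(\frac{\partial f(\mathbf{c}^{(k)})}{\partial x_k} - \frac{\partial f(\mathbf{a})}{\partial x_k}\Big) \frac{h_k}{\|\mathbf{h}\|}.$$
Set
$$\mu(\mathbf{h}) = \sum \Big(\frac{\partial f(\mathbf{c}^{(k)})}{\partial x_k} - \frac{\partial f(\mathbf{a})}{\partial x_k}\Big) \frac{h_k}{\|\mathbf{h}\|}.$$
Since $\left|\frac{h_k}{\|\mathbf{h}\|}\right| \le 1$, since $\mathbf{c}^{(k)} \to \mathbf{a}$ for $\mathbf{h} \to \mathbf{o}$, and since the functions $\frac{\partial f}{\partial x_k}$ are continuous, $\lim_{\mathbf{h} \to \mathbf{o}} \mu(\mathbf{h}) = 0$. $\square$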
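The statement of Theorem 2.6 can be observed numerically by computing the error function $\mu$ from (2.2.1) for shrinking $\mathbf{h}$. The sketch below assumes the example $f(x, y) = x^2 y + \sin y$ (continuous partial derivatives $2xy$ and $x^2 + \cos y$) and uses the maximum norm of 2.2; the function, the point and the direction of $\mathbf{h}$ are illustrative choices made here.

```python
import math

def f(x, y):
    return x ** 2 * y + math.sin(y)

def partials(x, y):
    return (2 * x * y, x ** 2 + math.cos(y))

def max_norm(h):
    """The norm ||h|| = max_i |h_i| of 2.2."""
    return max(abs(c) for c in h)

def mu(a, h):
    """Solve (2.2.1) for mu(h), taking A_k to be the partial derivatives at a."""
    A = partials(*a)
    linear = sum(A_k * h_k for A_k, h_k in zip(A, h))
    increment = f(a[0] + h[0], a[1] + h[1]) - f(*a)
    return (increment - linear) / max_norm(h)

a = (1.0, 0.5)
values = [abs(mu(a, (t, -t))) for t in (1e-1, 1e-2, 1e-3)]
print(values)
```

The values shrink roughly in proportion to $\|\mathbf{h}\|$, consistent with $\lim_{\mathbf{h} \to \mathbf{o}} \mu(\mathbf{h}) = 0$.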
2.7 Thus, focusing on an open set in the domain of a function, we may write schematically
$$\text{continuous PD} \ \Rightarrow\ \text{TD} \ \Rightarrow\ \text{PD}$$
(where PD stands for all partial derivatives and TD for total differential). Note that neither of the implications can be reversed. We have already discussed the second one; for the first one, recall that for functions of one variable the existence of a derivative at a point coincides with the existence of a total differential there, but a derivative is not necessarily a continuous function even when it exists at every point of an open set. In the rest of this chapter, simply assuming that partial derivatives exist will almost never be enough. Sometimes the existence of the total differential will suffice, but more often than not we will assume the existence of continuous partial derivatives.
3 Composition of functions and the chain rule
3.1 Theorem. Let $f(\mathbf{x})$ have a total differential in a point $\mathbf{a}$. Let real functions $g_k(t)$ have derivatives at a point $b$ and let $g_k(b) = a_k$ for all $k = 1, \dots, n$. Put
$$F(t) = f(\mathbf{g}(t)) = f(g_1(t), \dots, g_n(t)).$$
Then $F$ has a derivative in $b$, and
$$F'(b) = \sum_{k=1}^{n} \frac{\partial f(\mathbf{a})}{\partial x_k} g_k'(b).$$

Proof. Consider the formula 2.2.1. Applying it to our function $f$, we get
$$\frac{1}{h}(F(b + h) - F(b)) = \frac{1}{h}\big(f(\mathbf{g}(b + h)) - f(\mathbf{g}(b))\big) = \frac{1}{h}\big(f(\mathbf{g}(b) + (\mathbf{g}(b + h) - \mathbf{g}(b))) - f(\mathbf{g}(b))\big)$$
$$= \sum_{k=1}^{n} A_k \frac{g_k(b + h) - g_k(b)}{h} + \mu(\mathbf{g}(b + h) - \mathbf{g}(b)) \max_k \frac{|g_k(b + h) - g_k(b)|}{h}.$$
Now $\lim_{h \to 0} \mu(\mathbf{g}(b + h) - \mathbf{g}(b)) = 0$ since the functions $g_k$ are continuous at $b$, and $\max_k \frac{|g_k(b + h) - g_k(b)|}{h}$ is bounded in a sufficiently small neighborhood of 0, since the $g_k$ have derivatives. Thus, the limit of the last summand is zero and we have
$$\lim_{h \to 0} \frac{1}{h}(F(b + h) - F(b)) = \lim_{h \to 0} \sum_{k=1}^{n} A_k \frac{g_k(b + h) - g_k(b)}{h} = \sum_{k=1}^{n} A_k \lim_{h \to 0} \frac{g_k(b + h) - g_k(b)}{h} = \sum_{k=1}^{n} \frac{\partial f(\mathbf{a})}{\partial x_k} g_k'(b).$$
$\square$
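The formula of Theorem 3.1 can be checked against a direct difference quotient of $F$. The sketch below assumes an example chosen here: $f(x, y) = x^2 y$ and the curve $\mathbf{g}(t) = (\cos t, \sin t)$, so that $F(t) = \cos^2 t \sin t$.

```python
import math

def f(x, y):
    return x ** 2 * y

def partials_f(x, y):
    return (2 * x * y, x ** 2)

def g(t):
    return (math.cos(t), math.sin(t))

def g_prime(t):
    return (-math.sin(t), math.cos(t))

def F(t):
    """The composition F(t) = f(g_1(t), g_2(t))."""
    return f(*g(t))

b = 0.8
a = g(b)  # the point a with g_k(b) = a_k
chain_rule_value = sum(A_k * dg_k for A_k, dg_k in zip(partials_f(*a), g_prime(b)))

h = 1e-6
difference_quotient = (F(b + h) - F(b - h)) / (2 * h)  # symmetric quotient for accuracy
print(chain_rule_value, difference_quotient)
```

Both numbers approximate $F'(b) = -2\cos b \sin^2 b + \cos^3 b$, the derivative obtained by differentiating $F$ directly.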
3.1.1 Corollary. Let $f(\mathbf{x})$ have a total differential at a point $\mathbf{a}$. Let real functions $g_k(t_1, \dots, t_r)$ have partial derivatives at $\mathbf{b} = (b_1, \dots, b_r)$ and let $g_k(\mathbf{b}) = a_k$ for all $k = 1, \dots, n$. Then
$$(f \circ \mathbf{g})(t_1, \dots, t_r) = f(\mathbf{g}(\mathbf{t})) = f(g_1(\mathbf{t}), \dots, g_n(\mathbf{t}))$$
has all the partial derivatives at $\mathbf{b}$, and
$$\frac{\partial (f \circ \mathbf{g})(\mathbf{b})}{\partial t_j} = \sum_{k=1}^{n} \frac{\partial f(\mathbf{a})}{\partial x_k} \frac{\partial g_k(\mathbf{b})}{\partial t_j}.$$

3.1.2 Remark. The assumption of the existence of total differential in 3.1 is essential and it is easy to see why. Recall the geometric intuition from 2.5. The $n$-tuple of functions $\mathbf{g} = (g_1, \dots, g_n)$ represents a parametrized curve in $D$, and $f \circ \mathbf{g}$ is then a curve on the hypersurface $S$ of 2.5, (*). The partial derivatives of $f$, or the tangent lines of $S$ in the directions of the coordinate axes, have in general nothing to do with the behaviour on this curve.
3.2 What is the total differential?

The perceptive reader has noticed that in fact, while we defined what it means that a function has a total differential, we have not yet defined the total differential as an object. To remedy this, let us go one step further and consider in 3.1.1 a mapping $\mathbf{f} = (f_1, \dots, f_s)^T : D \to \mathbb{R}^s$. Take its composition $\mathbf{f} \circ \mathbf{g}$ with a mapping $\mathbf{g} : D' \to \mathbb{R}^n$ (recall the convention in 1.3). Then we get
$$\frac{\partial (\mathbf{f} \circ \mathbf{g})_i}{\partial t_j} = \sum_k \frac{\partial f_i}{\partial x_k} \frac{\partial g_k}{\partial t_j}. \qquad (3.2.1)$$
This formula is often referred to as the chain rule. It certainly has not escaped the reader's attention that the right-hand side is the product of matrices
$$\Big(\frac{\partial f_i}{\partial x_k}\Big)_{i,k} \Big(\frac{\partial g_k}{\partial t_j}\Big)_{k,j}.$$
Recall that the multiplication of matrices is the matrix of the composition of the linear maps the matrices represent (see Theorem 7.6 of Appendix A). In view of this, it is natural to define the total differential
$$D\mathbf{f}_{\mathbf{x}_0} : \mathbb{R}^n \to \mathbb{R}^s$$
of the map $\mathbf{f}$ at a point $\mathbf{x}_0 \in D$ as the linear map $f_A : \mathbb{R}^n \to \mathbb{R}^s$ associated with the matrix
$$A = \left.\Big(\frac{\partial f_i(\mathbf{x})}{\partial x_j}\Big)_{i,j}\right|_{\mathbf{x}_0}.$$
For the purposes of practical calculation, in fact, the map $D\mathbf{f}_{\mathbf{x}_0}$ and its associated matrix $A$ are often identified. The chain rule can be then stated in the form
$$D(\mathbf{f} \circ \mathbf{g})_{\mathbf{v}_0} = D(\mathbf{f})_{\mathbf{g}(\mathbf{v}_0)} \circ D(\mathbf{g})_{\mathbf{v}_0}.$$
Compare it with the one variable rule
$$(f \circ g)'(t) = f'(g(t)) \cdot g'(t);$$
for $1 \times 1$ matrices we of course have $(a)(b) = (ab)$.

Note that additionally, the total differential at this point can be used to define an affine approximation $\mathbf{f}^{\mathrm{aff}}_{\mathbf{x}_0}$ of the map $\mathbf{f}$ at the point $\mathbf{x}_0$ (an affine map approximating $\mathbf{f}$ near $\mathbf{x}_0$, see Appendix A, 5.9):
$$\mathbf{f}^{\mathrm{aff}}_{\mathbf{x}_0}(\mathbf{x}) = \mathbf{f}(\mathbf{x}_0) + D\mathbf{f}_{\mathbf{x}_0}(\mathbf{x} - \mathbf{x}_0).$$
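The matrix form of the chain rule, $D(\mathbf{f} \circ \mathbf{g})_{\mathbf{v}_0} = D(\mathbf{f})_{\mathbf{g}(\mathbf{v}_0)} \circ D(\mathbf{g})_{\mathbf{v}_0}$, can be tested by approximating all three matrices of partial derivatives with difference quotients. The sketch below uses two maps $\mathbb{R}^2 \to \mathbb{R}^2$ chosen here for illustration; the helpers `jacobian` and `matmul` are hypothetical names, and the matrices are numerical approximations only.

```python
import math

def f(x):
    """An example map f : R^2 -> R^2."""
    return [x[0] * x[1], math.sin(x[0]) + x[1] ** 2]

def g(t):
    """An example map g : R^2 -> R^2."""
    return [t[0] + 2.0 * t[1], t[0] * t[1]]

def jacobian(func, point, h=1e-6):
    """Approximate matrix (d func_i / d x_j)_{i,j} at point by symmetric differences."""
    columns = []
    for j in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(plus), func(minus)
        columns.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    # columns[j][i] holds d func_i / d x_j; transpose into rows indexed by i.
    return [[columns[j][i] for j in range(len(point))] for i in range(len(columns[0]))]

def matmul(A, B):
    """Product of matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

v0 = [0.3, -0.7]
lhs = jacobian(lambda t: f(g(t)), v0)               # matrix of D(f o g) at v0
rhs = matmul(jacobian(f, g(v0)), jacobian(g, v0))   # product of the two matrices
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_gap)
```

The order of the factors matters: the matrix of $\mathbf{f}$ is evaluated at $\mathbf{g}(\mathbf{v}_0)$ and stands on the left, just as in the composition of the linear maps.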
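3.3 Lagrange's Formula in several variables

Recall that a subset $D \subseteq \mathbb{R}^n$ is said to be convex if
$$\mathbf{x}, \mathbf{y} \in D \ \Rightarrow\ \forall t,\ 0 \le t \le 1,\ (1 - t)\mathbf{x} + t\mathbf{y} = \mathbf{x} + t(\mathbf{y} - \mathbf{x}) \in D.$$

Proposition. Let a real function $f$ have continuous partial derivatives in a convex open set $D \subseteq \mathbb{R}^n$. Then for any two points $\mathbf{x}, \mathbf{y} \in D$, there exists a $\theta$, $0 \le \theta \le 1$, such that
$$f(\mathbf{y}) - f(\mathbf{x}) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{x} + \theta(\mathbf{y} - \mathbf{x}))}{\partial x_j} (y_j - x_j).$$

Proof. Set $F(t) = f(\mathbf{x} + t(\mathbf{y} - \mathbf{x}))$. Then $F = f \circ \mathbf{g}$ where $\mathbf{g}$ is defined by $g_j(t) = x_j + t(y_j - x_j)$, and
$$F'(t) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{g}(t))}{\partial x_j} g_j'(t) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{g}(t))}{\partial x_j} (y_j - x_j).$$
Hence by Lagrange's formula in one variable,
$$f(\mathbf{y}) - f(\mathbf{x}) = F(1) - F(0) = F'(\theta)$$
which yields the statement of the proposition. $\square$

Remark. The formula is often used in the form
$$f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) = \sum_{j=1}^{n} \frac{\partial f(\mathbf{x} + \theta\mathbf{h})}{\partial x_j} h_j.$$
Compare this with the formula for total differential.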
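The proposition only asserts that a suitable $\theta$ exists; for a concrete function one can search for it numerically. The sketch below assumes the example $f(x, y) = x^2 + y^2$ with $\mathbf{x} = (0, 0)$ and $\mathbf{y} = (1, 2)$, chosen here, and finds $\theta$ by bisection. Bisection is a convenience that relies on a sign change of the gap function on $\langle 0, 1\rangle$, which holds in this example but is not guaranteed in general.

```python
def f(p):
    x, y = p
    return x ** 2 + y ** 2

def grad_f(p):
    x, y = p
    return (2 * x, 2 * y)

def lagrange_gap(theta, x, y):
    """sum_j df/dx_j(x + theta (y - x)) (y_j - x_j) minus (f(y) - f(x))."""
    point = tuple(xj + theta * (yj - xj) for xj, yj in zip(x, y))
    total = sum(d * (yj - xj) for d, xj, yj in zip(grad_f(point), x, y))
    return total - (f(y) - f(x))

def bisect(func, lo, hi, steps=60):
    """Plain bisection, assuming func changes sign between lo and hi."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

x, y = (0.0, 0.0), (1.0, 2.0)
theta = bisect(lambda t: lagrange_gap(t, x, y), 0.0, 1.0)
print(theta)
```

Here $F(t) = 5t^2$, so $F(1) - F(0) = 5 = F'(\theta) = 10\theta$ gives $\theta = \frac{1}{2}$, which the search reproduces.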
3.4 It may be of interest that the formula for the derivative of a product of single-variable functions is a consequence of the chain rule. Set $h(u, v) = u \cdot v$ so that $\frac{\partial h}{\partial u} = v$ and $\frac{\partial h}{\partial v} = u$. Then
$$(f(x)g(x))' = \frac{\partial h(f(x), g(x))}{\partial u} f'(x) + \frac{\partial h(f(x), g(x))}{\partial v} g'(x) = g(x)f'(x) + f(x)g'(x).$$
4 Partial derivatives of higher order. Interchangeability
4.1 Similarly to the second derivative of a function of one variable, we may consider partial derivatives of a partial derivative, i.e. of a function of the form $g(\mathbf{x}) = \frac{\partial f(\mathbf{x})}{\partial x_k}$,
$$\frac{\partial g(\mathbf{x})}{\partial x_l}.$$
The result, if it exists, is then denoted by
$$\frac{\partial^2 f(\mathbf{x})}{\partial x_k \partial x_l}.$$
More generally, we may iterate this process to obtain
$$\frac{\partial^r f(\mathbf{x})}{\partial x_{k_1} \partial x_{k_2} \dots \partial x_{k_r}}.$$
These functions, when they exist, are called partial derivatives of order $r$. For example,
$$\frac{\partial^3 f(x, y, z)}{\partial x \partial y \partial z} \quad\text{and}\quad \frac{\partial^3 f(x, y, z)}{\partial x \partial x \partial x}$$
are derivatives of third order (even though in the first case, we have taken a partial derivative by each variable only once). To simplify notation, taking partial derivatives by the same variable more than once consecutively may be indicated by an exponent, e.g.,
$$\frac{\partial^5 f(x, y)}{\partial x^3 \partial y^2} = \frac{\partial^5 f(x, y)}{\partial x \partial x \partial x \partial y \partial y}, \qquad \frac{\partial^5 f(x, y)}{\partial x^2 \partial y^2 \partial x} = \frac{\partial^5 f(x, y)}{\partial x \partial x \partial y \partial y \partial x}.$$
4.2 Consider the function
$$f(x, y) = x \sin(y^2 + x).$$
Compute
$$\frac{\partial f(x, y)}{\partial x} = \sin(y^2 + x) + x \cos(y^2 + x) \quad\text{and}\quad \frac{\partial f(x, y)}{\partial y} = 2xy \cos(y^2 + x).$$
Computing the second-order derivatives, we obtain
$$\frac{\partial^2 f}{\partial x \partial y} = 2y \cos(y^2 + x) - 2xy \sin(y^2 + x) = \frac{\partial^2 f}{\partial y \partial x}.$$
Whether it is surprising or not, it suggests a conjecture that higher order partial derivatives do not depend on the order of differentiation. In effect, this is true, provided all the derivatives in question are continuous.

4.2.1 Proposition. Let $f(x, y)$ be a function such that the partial derivatives $\frac{\partial^2 f}{\partial x \partial y}$ and $\frac{\partial^2 f}{\partial y \partial x}$ are defined and continuous in a neighborhood of a point $(x, y)$. Then we have
$$\frac{\partial^2 f(x, y)}{\partial x \partial y} = \frac{\partial^2 f(x, y)}{\partial y \partial x}.$$
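Proof. Consider the function of a real variable $h$ defined by the formula
$$F(h) = \frac{f(x + h, y + h) - f(x, y + h) - f(x + h, y) + f(x, y)}{h^2}.$$
If we set
$$\varphi_h(y) = f(x + h, y) - f(x, y) \quad \text{and} \quad \psi_k(x) = f(x, y + k) - f(x, y),$$
we have
$$F(h) = \frac{1}{h^2} \left( \varphi_h(y + h) - \varphi_h(y) \right) = \frac{1}{h^2} \left( \psi_h(x + h) - \psi_h(x) \right).$$
Let us compute the first expression. The function $\varphi_h$, which is a function of one variable $y$, has the derivative
$$\varphi_h'(y) = \frac{\partial f(x + h, y)}{\partial y} - \frac{\partial f(x, y)}{\partial y}$$
and hence by 3.3, we have
$$F(h) = \frac{1}{h^2} \left( \varphi_h(y + h) - \varphi_h(y) \right) = \frac{1}{h} \varphi_h'(y + \theta_1 h) = \frac{1}{h} \left( \frac{\partial f(x + h, y + \theta_1 h)}{\partial y} - \frac{\partial f(x, y + \theta_1 h)}{\partial y} \right).$$
Using 3.3 again, we obtain
$$F(h) = \frac{\partial}{\partial x} \left( \frac{\partial f(x + \theta_2 h, y + \theta_1 h)}{\partial y} \right) \tag{*}$$
for some $\theta_1, \theta_2$ between 0 and 1. Similarly, computing $\frac{1}{h^2} (\psi_h(x + h) - \psi_h(x))$, we obtain
$$F(h) = \frac{\partial}{\partial y} \left( \frac{\partial f(x + \theta_4 h, y + \theta_3 h)}{\partial x} \right) \tag{**}$$
for some $\theta_3, \theta_4$ between 0 and 1. Now since both $\frac{\partial}{\partial y} \left( \frac{\partial f}{\partial x} \right)$ and $\frac{\partial}{\partial x} \left( \frac{\partial f}{\partial y} \right)$ are continuous at the point $(x, y)$, we can compute $\lim_{h \to 0} F(h)$ from either of the formulas (*) or (**) and obtain
$$\lim_{h \to 0} F(h) = \frac{\partial^2 f(x, y)}{\partial x \partial y} = \frac{\partial^2 f(x, y)}{\partial y \partial x}.$$
□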
Remark. Look what happens: $F(h)$ (and its possible limit at 0) is an attempt to compute the second partial derivative in one step. The continuity of $\frac{\partial^2 f}{\partial x \partial y}$ and $\frac{\partial^2 f}{\partial y \partial x}$ makes sure that it is, in fact, possible.
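The remark can be illustrated numerically. The following is only a sketch: it evaluates the quotient $F(h)$ from the proof of 4.2.1 for the function of 4.2 and compares it with the closed form of the mixed derivative computed there. The sample point `(0.7, 1.3)` and the step sizes are arbitrary assumptions, not taken from the text.

```python
import math

def f(x, y):
    # Example from 4.2: f(x, y) = x * sin(y^2 + x)
    return x * math.sin(y * y + x)

def mixed_partial_exact(x, y):
    # Closed form from 4.2: 2y cos(y^2 + x) - 2xy sin(y^2 + x)
    return 2 * y * math.cos(y * y + x) - 2 * x * y * math.sin(y * y + x)

def F(h, x, y):
    # The quotient F(h) from the proof of Proposition 4.2.1
    return (f(x + h, y + h) - f(x, y + h) - f(x + h, y) + f(x, y)) / (h * h)

x0, y0 = 0.7, 1.3  # arbitrary sample point (assumption of this sketch)
for h in (1e-1, 1e-2, 1e-3):
    # F(h) should approach the mixed second derivative as h shrinks
    print(h, F(h, x0, y0), mixed_partial_exact(x0, y0))
```

As $h$ decreases, the printed values of $F(h)$ should approach the exact mixed derivative, in line with the limit computed in the proof.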
4.3 Iterating the interchanges allowed by 4.2.1, we easily obtain, as a corollary,
Theorem. Let a function $f$ of $n$ variables possess continuous partial derivatives up to the order $k$. Then the values of these derivatives depend only on the number of times a partial derivative is taken in each of the individual variables $x_1, \dots, x_n$.
4.3.1 Thus, under the assumption of the theorem, we can write a general partial derivative of the order $r \le k$ as
$$\frac{\partial^r f}{\partial x_1^{r_1} \partial x_2^{r_2} \cdots \partial x_n^{r_n}} \quad \text{with } r_1 + r_2 + \cdots + r_n = r$$
where, of course, $r_j = 0$ is allowed and indicates the absence of the symbol $\partial x_j$.
5
The Implicit Functions Theorem I: The case of a single equation
5.1 Suppose we have a function of $n + 1$ variables, which we will write as $F(x, y)$, and consider the problem of finding a solution $y = f(x)$ of the equation
$$F(x, y) = 0. \tag{5.1.1}$$
Even in very simple cases we can hardly expect a unique solution. Take for example $F(x, y) = x^2 + y^2 - 1$. Then for $|x| > 1$ there is no solution $y = f(x)$. For $|x_0| < 1$, on some open interval containing $x_0$, we have two solutions
$$f(x) = \sqrt{1 - x^2} \quad \text{and} \quad g(x) = -\sqrt{1 - x^2}.$$
This is better, but we have two values at each point, contradicting the definition of a function. To achieve uniqueness, we have to restrict not only the values of $x$, but also the values of $y$ to an interval $(y_0 - \Delta, y_0 + \Delta)$ (where $F(x_0, y_0) = 0$). That is, if we have a particular solution $(x_0, y_0)$ we must restrict our attention to a “window”
$$(x_0 - \delta, x_0 + \delta) \times (y_0 - \Delta, y_0 + \Delta)$$
through which we see a unique solution. In our example, there is also the case $(x_0, y_0) = (1, 0)$, where there is a unique solution, but no suitable window as above, since in every neighborhood of $(1, 0)$, there are no solutions on the right-hand side of $(1, 0)$, and two solutions to the left. In another example,
$$y^2 - |x| = 0,$$
the solution $(0, 0)$ can be extended indefinitely both ways, but still there is no neighborhood of $(0, 0)$ in which there would be a unique solution.
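The “window” idea can be sketched numerically for the circle example. The code below is a minimal illustration, not part of the text's argument: it looks for the zero of $y \mapsto F(x, y)$ inside a chosen $y$-window by bisection, assuming the sign pattern $F(x, y_0 - \Delta) < 0 < F(x, y_0 + \Delta)$ and monotonicity in $y$. The helper name `solve_in_window` and the concrete windows are hypothetical choices.

```python
import math

def F(x, y):
    # Example from 5.1: the unit circle x^2 + y^2 - 1 = 0
    return x * x + y * y - 1.0

def solve_in_window(x, y_lo, y_hi, tol=1e-12):
    """Bisection for the zero of y -> F(x, y) in (y_lo, y_hi).

    Assumes F(x, y_lo) < 0 < F(x, y_hi) and that F(x, .) is increasing
    on the window; returns None when the sign condition fails.
    """
    if not (F(x, y_lo) < 0.0 < F(x, y_hi)):
        return None
    while y_hi - y_lo > tol:
        mid = 0.5 * (y_lo + y_hi)
        if F(x, mid) < 0.0:
            y_lo = mid
        else:
            y_hi = mid
    return 0.5 * (y_lo + y_hi)

# Window around the solution (0.6, 0.8): a unique y is found for x near 0.6.
print(solve_in_window(0.65, 0.3, 1.3), math.sqrt(1 - 0.65 ** 2))
# Window around (1, 0): the sign condition fails on both sides of x = 1.
print(solve_in_window(0.9, -0.5, 0.5), solve_in_window(1.1, -0.5, 0.5))
```

Near $(0.6, 0.8)$ the search returns the value of $\sqrt{1 - x^2}$, while near $(1, 0)$ no window of this kind satisfies the sign condition, matching the discussion above.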
5.2 Actually, the above examples cover more or less all the exceptions that can occur for “reasonable” functions $F$.
Theorem. Let $F(x, y)$ be a function of $n + 1$ variables defined in a neighborhood of a point $(x^0, y_0)$. Let $F$ have continuous partial derivatives up to the order $r \ge 1$ and let
$$F(x^0, y_0) = 0 \quad \text{and} \quad \left| \frac{\partial F(x^0, y_0)}{\partial y} \right| \ne 0.$$
Then there exist $\delta > 0$ and $\Delta > 0$ such that for every $x$ with $\|x - x^0\| < \delta$ there exists precisely one $y$ with $|y - y_0| < \Delta$ such that
$$F(x, y) = 0.$$
Furthermore, if we write $y = f(x)$ for this unique value $y$, then the function
$$f : (x_1^0 - \delta, x_1^0 + \delta) \times \cdots \times (x_n^0 - \delta, x_n^0 + \delta) \to \mathbb{R}$$
has continuous partial derivatives up to the order $r$.
Proof. As before, we write $\|x\| = \max_i |x_i|$. Let
$$J(<\gamma) = \{x \mid \|x - x^0\| < \gamma\} \quad \text{and} \quad J(\gamma) = \{x \mid \|x - x^0\| \le \gamma\}$$
(thus, the “window” interval we are seeking is $J(<\delta) \times (y_0 - \Delta, y_0 + \Delta)$).
Without loss of generality, let, say,
$$\frac{\partial F(x^0, y_0)}{\partial y} > 0.$$
Since the first partial derivatives of $F$ are continuous, there exist $a > 0$, $K$, $\delta_1 > 0$ and $\Delta > 0$ such that for all $(x, y) \in J(\delta_1) \times \langle y_0 - \Delta, y_0 + \Delta \rangle$, we have
$$\frac{\partial F(x, y)}{\partial y} \ge a \quad \text{and} \quad \left| \frac{\partial F(x, y)}{\partial x_i} \right| \le K \tag{5.2.1}$$
(use Theorem 6.6 of Chapter 2).
I. The function $f$: For fixed $x \in J(\delta_1)$, we will consider the function of one variable $y \in (y_0 - \Delta, y_0 + \Delta)$ defined by
$$\varphi_x(y) = F(x, y).$$
Thus, $\varphi_x'(y) = \frac{\partial F(x, y)}{\partial y} > 0$ and hence all $\varphi_x(y)$ with $x \in J(\delta_1)$ are increasing functions of $y$, and
$$\varphi_{x^0}(y_0 - \Delta) < \varphi_{x^0}(y_0) = 0 < \varphi_{x^0}(y_0 + \Delta).$$
By 2.6 and 2.3, $F$ is continuous, and hence there is a $\delta$, $0 < \delta \le \delta_1$, such that
$$\forall x \in J(<\delta), \quad \varphi_x(y_0 - \Delta) < 0 < \varphi_x(y_0 + \Delta).$$
Now by Theorem 3.3 of Chapter 1, there is precisely one $y \in (y_0 - \Delta, y_0 + \Delta)$ ($\varphi_x$ is one-to-one since it is increasing) such that $\varphi_x(y) = 0$, that is, $F(x, y) = 0$. Define this to be $f(x)$.
II. The first derivatives. We will fix an index $j$, abbreviate the $(j - 1)$-dimensional vector $x_1, \dots, x_{j-1}$ by $x_b$ (“the $x_i$’s before”) and the $(n - j)$-dimensional vector $x_{j+1}, \dots, x_n$ by $x_a$ (“the $x_i$’s after”); thus, we may write
$$x = (x_b, x_j, x_a).$$
Compute $\frac{\partial f}{\partial x_j}$ as the derivative of $\psi(t) = f(x_b, t, x_a)$. By 3.3, we have
$$\begin{aligned}
0 &= F(x_b, t + h, x_a, \psi(t + h)) - F(x_b, t, x_a, \psi(t)) \\
&= F(x_b, t + h, x_a, \psi(t) + (\psi(t + h) - \psi(t))) - F(x_b, t, x_a, \psi(t)) \\
&= \frac{\partial F(x_b, t + \theta h, x_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial x_j}\, h \\
&\quad + \frac{\partial F(x_b, t + \theta h, x_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial y}\, (\psi(t + h) - \psi(t))
\end{aligned}$$
and hence
$$\psi(t + h) - \psi(t) = -h\, \frac{\dfrac{\partial F(x_b, t + \theta h, x_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial x_j}}{\dfrac{\partial F(x_b, t + \theta h, x_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial y}} \tag{5.2.2}$$
for some $\theta$ between 0 and 1. Thus by (5.2.1),
$$|\psi(t + h) - \psi(t)| \le |h| \cdot \left| \frac{K}{a} \right|$$
and $f$ is continuous (note that we have not known that before). Using this fact, we can compute from (5.2.2)
$$\lim_{h \to 0} \frac{\psi(t + h) - \psi(t)}{h} = \lim_{h \to 0} \left( - \frac{\dfrac{\partial F(x_b, t + \theta h, x_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial x_j}}{\dfrac{\partial F(x_b, t + \theta h, x_a, \psi(t) + \theta(\psi(t + h) - \psi(t)))}{\partial y}} \right) = - \frac{\dfrac{\partial F(x_b, t, x_a, \psi(t))}{\partial x_j}}{\dfrac{\partial F(x_b, t, x_a, \psi(t))}{\partial y}}.$$
III. The higher derivatives. Note that we have not only proved the existence of the first derivative of $f$, but also the formula
$$\frac{\partial f(x)}{\partial x_j} = - \frac{\partial F(x, f(x))}{\partial x_j} \cdot \left( \frac{\partial F(x, f(x))}{\partial y} \right)^{-1}. \tag{5.2.3}$$
From this we can inductively compute the higher derivatives of $f$ (using the standard rules of differentiation) as long as the derivatives
$$\frac{\partial^r F}{\partial x_1^{r_1} \cdots \partial x_n^{r_n} \partial y^{r_{n+1}}}$$
exist and are continuous. □
5.3 We have obtained the formula (5.2.3) while proving that $f$ has a derivative. If we knew beforehand that $f$ has a derivative, we could deduce (5.2.3) immediately from the chain rule. In effect, we have $0 \equiv F(x, f(x))$; taking a derivative of both sides we obtain
$$0 = \frac{\partial F(x, f(x))}{\partial x_j} + \frac{\partial F(x, f(x))}{\partial y} \cdot \frac{\partial f(x)}{\partial x_j}.$$
Differentiating further, we obtain inductively linear equations from which we can compute the values of all the derivatives guaranteed by the theorem.
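Formula (5.2.3) can be checked on the circle example of 5.1. The sketch below compares $-F_x / F_y$ with a central-difference derivative of the explicit upper branch $\sqrt{1 - x^2}$; the sample point $x = 0.6$ and the step `h` are assumptions made only for this illustration.

```python
import math

def dF_dx(x, y):
    # Partial derivative of F(x, y) = x^2 + y^2 - 1 with respect to x
    return 2.0 * x

def dF_dy(x, y):
    # Partial derivative of F(x, y) = x^2 + y^2 - 1 with respect to y
    return 2.0 * y

def implicit_derivative(x, y):
    # Formula (5.2.3) for n = 1: f'(x) = -F_x(x, f(x)) / F_y(x, f(x))
    return -dF_dx(x, y) / dF_dy(x, y)

def f(x):
    # Explicit upper branch of the solution, available in this simple case
    return math.sqrt(1.0 - x * x)

x, h = 0.6, 1e-6
numeric = (f(x + h) - f(x - h)) / (2.0 * h)  # central difference
print(implicit_derivative(x, f(x)), numeric)
```

Both numbers should be close to $-0.75$, the slope of the upper half-circle at $x = 0.6$.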
5.4
Remark
The solution $f$ in 5.2 has as many derivatives as the initial $F$. But note the restriction $r \ge 1$. One usually thinks of the 0-th derivative as the function itself. The theorem does not guarantee a continuous solution $f$ of an equation $F(x, f(x)) = 0$ with continuous $F$. Even just for the existence of $f$ we have used the first derivatives.
6
The Implicit Functions Theorem II: The case of several equations
6.1
A warm-up: what happens in the case of two equations
Suppose we try to find a solution $y_i = f_i(x)$, $i = 1, 2$, of a pair of equations
$$F_1(x, y_1, y_2) = 0, \qquad F_2(x, y_1, y_2) = 0$$
in a neighborhood of a point $(x^0, y_1^0, y_2^0)$ (at which the equalities hold). We will apply the “substitution method” based on Theorem 5.2. First we will think of the second equation as an equation for the unknown $y_2$; in a neighborhood of $(x^0, y_1^0, y_2^0)$ we obtain $y_2$ as a function $\psi(x, y_1)$. Substitute this into the first equation to obtain
$$G(x, y_1) = F_1(x, y_1, \psi(x, y_1));$$
if we find, in a neighborhood of $(x^0, y_1^0)$, a solution $y_1 = f_1(x)$, we can substitute it into $\psi$ and obtain $y_2 = f_2(x) = \psi(x, f_1(x))$.
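What did we have to assume? First, of course, we have to have the continuous partial derivatives of the functions $F_i$. Then, to be able to obtain $\psi$ by 5.2 the way we did, we need to have
$$\frac{\partial F_2}{\partial y_2}(x^0, y_1^0, y_2^0) \ne 0. \tag{6.1.1}$$
Finally, we also need to have
$$\frac{\partial G}{\partial y_1}(x^0, y_1^0) \ne 0;$$
by 3.1.1, this is equivalent to
$$\frac{\partial F_1}{\partial y_1} + \frac{\partial F_1}{\partial y_2} \frac{\partial \psi}{\partial y_1} \ne 0. \tag{6.1.2}$$
Now we have (recall (5.2.3))
$$\frac{\partial \psi}{\partial y_1} = - \left( \frac{\partial F_2}{\partial y_2} \right)^{-1} \frac{\partial F_2}{\partial y_1}$$
and (6.1.2) becomes
$$\left( \frac{\partial F_2}{\partial y_2} \right)^{-1} \left( \frac{\partial F_1}{\partial y_1} \frac{\partial F_2}{\partial y_2} - \frac{\partial F_1}{\partial y_2} \frac{\partial F_2}{\partial y_1} \right) \ne 0,$$
that is,
$$\frac{\partial F_1}{\partial y_1} \frac{\partial F_2}{\partial y_2} - \frac{\partial F_1}{\partial y_2} \frac{\partial F_2}{\partial y_1} \ne 0.$$
This formula should be conspicuously familiar. Indeed, it is (see the notation for determinants from Subsection 3.3 of Appendix B)
$$\begin{vmatrix} \dfrac{\partial F_1}{\partial y_1} & \dfrac{\partial F_1}{\partial y_2} \\[2ex] \dfrac{\partial F_2}{\partial y_1} & \dfrac{\partial F_2}{\partial y_2} \end{vmatrix} = \det \left( \frac{\partial F_i}{\partial y_j} \right)_{i,j} \ne 0. \tag{6.1.3}$$
Note that if we assume that this determinant is non-zero we have either
$$\frac{\partial F_2}{\partial y_2}(x^0, y_1^0, y_2^0) \ne 0$$
and/or
$$\frac{\partial F_2}{\partial y_1}(x^0, y_1^0, y_2^0) \ne 0,$$
so if the latter holds, we can start by solving $F_2(x, y_1, y_2) = 0$ for $y_1$ instead of $y_2$. Thus the condition (6.1.3) suffices.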
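Condition (6.1.3) can be illustrated on a small system. The pair of equations below is a hypothetical example, not taken from the text: $F_1 = y_1 + y_2 - x$ and $F_2 = y_1 y_2 - 1$, with the known solution $(x, y_1, y_2) = (2.5, 2, 0.5)$, where the determinant equals $y_1 - y_2 = 1.5 \ne 0$. The sketch uses Newton's iteration (a standard numerical technique, swapped in here for the substitution method of the text) to follow the solution to a nearby $x$.

```python
import math

def F(x, y1, y2):
    # Hypothetical sample system (an assumption of this sketch):
    #   F1 = y1 + y2 - x,   F2 = y1 * y2 - 1
    return (y1 + y2 - x, y1 * y2 - 1.0)

def jacobian_det(y1, y2):
    # det [[dF1/dy1, dF1/dy2], [dF2/dy1, dF2/dy2]] = det [[1, 1], [y2, y1]]
    return y1 - y2

def solve_near(x, y1, y2, steps=50):
    """Newton iteration in (y1, y2) for fixed x, started at a nearby solution."""
    for _ in range(steps):
        f1, f2 = F(x, y1, y2)
        det = jacobian_det(y1, y2)
        # Solve J * (d1, d2) = (f1, f2) by Cramer's rule, J = [[1, 1], [y2, y1]]
        d1 = (f1 * y1 - f2) / det
        d2 = (f2 - y2 * f1) / det
        y1, y2 = y1 - d1, y2 - d2
    return y1, y2

y1, y2 = solve_near(2.6, 2.0, 0.5)
print(y1, y2, jacobian_det(y1, y2))
```

For this particular system the result can be compared with the closed form $y_1 = (x + \sqrt{x^2 - 4})/2$, $y_2 = 1/y_1$.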
6.2
The Jacobian
For a system of functions $\mathbf{F}(x, y) = (F_1(x, y_1, \dots, y_m), \dots, F_m(x, y_1, \dots, y_m))$ and variables $y_1, \dots, y_m$, define the Jacobi determinant (briefly, the Jacobian)
$$\frac{D(\mathbf{F})}{D(y)} = \det \left( \frac{\partial F_i}{\partial y_j} \right)_{i,j = 1, \dots, m}.$$
6.3 By extending the substitution procedure indicated in 6.1, we will now prove the general Implicit Function Theorem.
Theorem. Let $F_i(x, y_1, \dots, y_m)$, $i = 1, \dots, m$, be functions of $n + m$ variables with continuous partial derivatives up to an order $k \ge 1$. Let
$$\mathbf{F}(x^0, y^0) = \mathbf{o}$$
and let
$$\frac{D(\mathbf{F})}{D(y)}(x^0, y^0) \ne 0.$$
Then there exist $\delta > 0$ and $\Delta > 0$ such that for every
$$x \in (x_1^0 - \delta, x_1^0 + \delta) \times \cdots \times (x_n^0 - \delta, x_n^0 + \delta)$$
there exists precisely one
$$y \in (y_1^0 - \Delta, y_1^0 + \Delta) \times \cdots \times (y_m^0 - \Delta, y_m^0 + \Delta)$$
such that
$$\mathbf{F}(x, y) = \mathbf{o}.$$
Furthermore, if we write this $y$ as a vector function $\mathbf{f}(x) = (f_1(x), \dots, f_m(x))$, then the functions $f_i$ have continuous partial derivatives up to the order $k$.
Proof. We proceed by induction. By Theorem 5.2, the statement holds for $m = 1$. Now assume it holds for a given $m$, and let us have a system of equations
$$F_i(x, y), \quad i = 1, \dots, m + 1$$
satisfying the assumptions above (i.e. the unknown vector $y$ is $(m + 1)$-dimensional). Then, in particular, in the Jacobian determinant we cannot have a column consisting entirely of zeros, and hence, after possibly renumbering the $F_i$’s, we may assume without loss of generality that
$$\frac{\partial F_{m+1}}{\partial y_{m+1}}(x^0, y^0) \ne 0.$$
If we write $\tilde{y} = (y_1, \dots, y_m)$, we then have by the induction hypothesis $\delta_1 > 0$ and $\Delta_1 > 0$ such that for
$$(x, \tilde{y}) \in (x_1^0 - \delta_1, x_1^0 + \delta_1) \times \cdots \times (x_n^0 - \delta_1, x_n^0 + \delta_1) \times \cdots \times (y_m^0 - \delta_1, y_m^0 + \delta_1),$$
there exists precisely one $y_{m+1} = \psi(x, \tilde{y})$ satisfying
$$F_{m+1}(x, \tilde{y}, y_{m+1}) = 0 \quad \text{and} \quad |y_{m+1} - y_{m+1}^0| < \Delta_1.$$
This $\psi$ has continuous partial derivatives up to the order $k$ and hence so have the functions
$$G_i(x, \tilde{y}) = F_i(x, \tilde{y}, \psi(x, \tilde{y})), \quad i = 1, \dots, m + 1$$
(the last of which, $G_{m+1}$, is identically zero). By 3.1.1, we then have
$$\frac{\partial G_j}{\partial y_i} = \frac{\partial F_j}{\partial y_i} + \frac{\partial F_j}{\partial y_{m+1}} \frac{\partial \psi}{\partial y_i}. \tag{6.3.1}$$
Now consider the determinant
$$\frac{D(\mathbf{F})}{D(y)} = \begin{vmatrix}
\dfrac{\partial F_1}{\partial y_1} & \cdots & \dfrac{\partial F_1}{\partial y_m} & \dfrac{\partial F_1}{\partial y_{m+1}} \\
\vdots & & \vdots & \vdots \\
\dfrac{\partial F_m}{\partial y_1} & \cdots & \dfrac{\partial F_m}{\partial y_m} & \dfrac{\partial F_m}{\partial y_{m+1}} \\[2ex]
\dfrac{\partial F_{m+1}}{\partial y_1} & \cdots & \dfrac{\partial F_{m+1}}{\partial y_m} & \dfrac{\partial F_{m+1}}{\partial y_{m+1}}
\end{vmatrix}.$$
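Add to the $i$-th column the product of the last column with the scalar $\frac{\partial \psi}{\partial y_i}$. By (6.3.1), taking into account the fact that $G_{m+1} \equiv 0$ and hence
$$\frac{\partial G_{m+1}}{\partial y_i} = \frac{\partial F_{m+1}}{\partial y_i} + \frac{\partial F_{m+1}}{\partial y_{m+1}} \frac{\partial \psi}{\partial y_i} = 0,$$
we obtain
$$\frac{D(\mathbf{F})}{D(y)} = \begin{vmatrix}
\dfrac{\partial G_1}{\partial y_1} & \cdots & \dfrac{\partial G_1}{\partial y_m} & \dfrac{\partial F_1}{\partial y_{m+1}} \\
\vdots & & \vdots & \vdots \\
\dfrac{\partial G_m}{\partial y_1} & \cdots & \dfrac{\partial G_m}{\partial y_m} & \dfrac{\partial F_m}{\partial y_{m+1}} \\[2ex]
0 & \cdots & 0 & \dfrac{\partial F_{m+1}}{\partial y_{m+1}}
\end{vmatrix} = \frac{\partial F_{m+1}}{\partial y_{m+1}} \cdot \frac{D(G_1, \dots, G_m)}{D(y_1, \dots, y_m)}.$$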
Thus,
$$\frac{D(G_1, \dots, G_m)}{D(y_1, \dots, y_m)} \ne 0$$
and hence by the induction hypothesis there are $\delta_2 > 0$, $\Delta_2 > 0$ such that for $|x_i - x_i^0| < \delta_2$ there is a uniquely determined $\tilde{y}$ with $|y_i - y_i^0| < \Delta_2$ such that
$$G_i(x, \tilde{y}) = 0 \quad \text{for } i = 1, \dots, m$$
and that the resulting $f_i(x)$ have continuous partial derivatives up to the order $k$. If we define, further,
$$f_{m+1}(x) = \psi(x, f_1(x), \dots, f_m(x))$$
we obtain a solution $\mathbf{f}$ of the original system of equations $\mathbf{F}(x, y) = \mathbf{o}$.
The proof is almost finished but not quite. What about the uniqueness of the solution within the constraints $\|x - x^0\| < \delta$ and $\|y - y^0\| < \Delta$? Does uniqueness in the two steps of the proof above (solving $F_{m+1}(x, y_1, \dots, y_m, y_{m+1}) = 0$ for $y_{m+1}$, and then $\mathbf{G}(x, \tilde{y}) = \mathbf{o}$ for $y_1, \dots, y_m$) really guarantee that a different solution cannot be found by some other procedure (e.g. reversing the order of variables)? Luckily, in this particular proof, this turns out not to be a serious problem. Choose $0 < \Delta \le \delta_1, \Delta_1, \Delta_2$ and then $0 < \delta < \delta_1, \delta_2$ and, moreover, sufficiently small so that for $|x_i - x_i^0| < \delta$ one has $|f_j(x) - f_j(x^0)| < \Delta$ (the last to make sure that the $\Delta$-interval contains at least one solution). Now let
$$\mathbf{F}(x, y) = \mathbf{o}, \quad \|x - x^0\| < \delta \quad \text{and} \quad \|y - y^0\| < \Delta. \tag{6.3.2}$$
We have to prove that then necessarily $y_i = f_i(x)$ for all $i$. Since $|x_i - x_i^0| < \delta \le \delta_1$ for $i = 1, \dots, n$, $|y_i - y_i^0| < \Delta \le \delta_1$ for $i = 1, \dots, m$ and $|y_{m+1} - y_{m+1}^0| < \Delta \le \Delta_1$ we have, necessarily, $y_{m+1} = \psi(x, \tilde{y})$. Thus, by (6.3.2), $\mathbf{G}(x, \tilde{y}) = \mathbf{o}$ and since $|x_i - x_i^0| < \delta \le \delta_2$ and $|y_i - y_i^0| < \Delta \le \Delta_2$ we have indeed $y_i = f_i(x)$. □
7
An easy application: regular mappings and the Inverse Function Theorem
7.1 Let $U \subseteq \mathbb{R}^n$ be an open set. A mapping $\mathbf{f} : U \to \mathbb{R}^n$ is said to be regular if each $f_i$ has continuous partial derivatives $\frac{\partial f_i}{\partial x_j}$ and if for all $x \in U$, we have
$$\frac{D(\mathbf{f})}{D(x)}(x) \ne 0.$$
7.2 Proposition. Let $\mathbf{f} : U \to \mathbb{R}^n$ be a regular mapping. Then the image $\mathbf{f}[V]$ of every open $V \subseteq U$ is open.
Proof. Let $\mathbf{f}(x^0) = y^0$. Define $\mathbf{F} : V \times \mathbb{R}^n \to \mathbb{R}^n$ by setting
$$F_i(x, y) = f_i(x) - y_i. \tag{7.2.1}$$
Thus $\mathbf{F}(x^0, y^0) = \mathbf{o}$ and $\frac{D(\mathbf{F})}{D(x)} \ne 0$, and hence, by 6.3, there exist $\delta > 0$ and $\Delta > 0$ such that for every $y$ with $\|y - y^0\| < \delta$, there exists (precisely one, but this is not important at this moment) $x$ with $\|x - x^0\| < \Delta$ and $F_i(x, y) = f_i(x) - y_i = 0$. This means that we have $\mathbf{f}(x) = y$ (note that the roles of the $x_i$ and the $y_i$ are reversed from the usual convention: here, the $y_i$ are the independent variables). Thus, we have
$$\Omega(y^0, \delta) = \{y \mid \|y - y^0\| < \delta\} \subseteq \mathbf{f}[V].$$
□
Remark. Compare this fact with the characterization of continuous maps in Theorem 3.4 of Chapter 2: for regular maps, both images and preimages of open sets are open.
7.3 Proposition. Let $\mathbf{f} : U \to \mathbb{R}^n$ be a regular mapping. Then for each $x^0 \in U$ there exists an open neighborhood $V$ such that the restriction $\mathbf{f}|V$ is one-to-one. Moreover, the mapping $\mathbf{g} : \mathbf{f}[V] \to \mathbb{R}^n$ inverse to $\mathbf{f}|V$ is regular.
Proof. Consider the $\mathbf{F}$ from (7.2.1) again. We have, for a sufficiently small $\Delta > 0$, precisely one $x = \mathbf{g}(y)$ such that $\mathbf{F}(x, y) = \mathbf{o}$ and $\|x - x^0\| < \Delta$. This $\mathbf{g}$ has, furthermore, continuous partial derivatives. We have, by 3.2,
$$D(\mathrm{Id}) = D(\mathbf{f} \circ \mathbf{g}) = D\mathbf{f} \cdot D\mathbf{g}.$$
By the chain rule,
$$\frac{D(\mathbf{f})}{D(x)} \cdot \frac{D(\mathbf{g})}{D(y)} = \det D\mathbf{f} \cdot \det D\mathbf{g} = 1$$
and hence for each $y \in \mathbf{f}[V]$, $\frac{D(\mathbf{g})}{D(y)}(y) \ne 0$. □
7.3.1 Corollary. If a regular mapping $\mathbf{f} : U \to \mathbb{R}^n$ is one-to-one, then the inverse $\mathbf{g} : \mathbf{f}[U] \to \mathbb{R}^n$ is regular as well.
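The relation $\det D\mathbf{f} \cdot \det D\mathbf{g} = 1$ from the proof of 7.3 can be checked numerically. The mapping below, $\mathbf{f}(x, y) = (e^x \cos y, e^x \sin y)$, is a sample choice made for this sketch (it does not appear in the text); it is regular on all of $\mathbb{R}^2$, yet only locally one-to-one, since it is $2\pi$-periodic in $y$. The Jacobians are approximated by central differences, so the product is only approximately 1.

```python
import math

def f(x, y):
    # Sample regular mapping (assumption): Jacobian determinant is e^(2x) > 0
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def g(u, v):
    # Local inverse of f on the strip -pi < y < pi
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

def jacobian_det(func, p, q, h=1e-6):
    # Central-difference approximation of the Jacobian determinant at (p, q)
    a1, a2 = func(p + h, q)
    b1, b2 = func(p - h, q)
    c1, c2 = func(p, q + h)
    d1, d2 = func(p, q - h)
    d11, d21 = (a1 - b1) / (2 * h), (a2 - b2) / (2 * h)
    d12, d22 = (c1 - d1) / (2 * h), (c2 - d2) / (2 * h)
    return d11 * d22 - d12 * d21

x0, y0 = 0.3, 0.5
u0, v0 = f(x0, y0)
print(jacobian_det(f, x0, y0) * jacobian_det(g, u0, v0))  # close to 1
print(f(x0, y0), f(x0, y0 + 2 * math.pi))                 # same image point
```

The second line of output shows two distinct points with (numerically) the same image, which is why 7.3 only asserts injectivity on a neighborhood.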
8
Taylor’s Theorem, Local Extremes and Extremes with Constraints
8.1
Taylor’s Theorem
A function $f$ defined on an open set of $\mathbb{R}^n$ is called a $C^r$-function if $f$ is continuous and possesses continuous partial derivatives up to (and including) order $r$. A function which is $C^r$ for all $r \in \mathbb{N}$ is called $C^\infty$. $C^\infty$-functions will also be called smooth, while $C^1$-functions will be called continuously differentiable. (Terminology in the literature varies; some texts use the word smooth for $C^1$. We shall never do so in the present text.) Taylor’s Theorem for multivariable functions may look more intimidating, but we will see that it is an easy consequence of the corresponding single variable theorem:
Theorem. (Taylor) Let $f$ be a $C^{r+1}$-function defined on an open convex subset $U \subseteq \mathbb{R}^n$, and let $a \in U$. Then for every point $x \in U$, $x \ne a$, there exists a point $c$ on the open line segment connecting $a$ and $x$ such that
$$\begin{aligned}
f(x) = {} & \sum_{k=0}^{r} \; \sum_{\substack{k_1 + \cdots + k_n = k \\ k_i \ge 0}} \frac{1}{k_1! \cdots k_n!} \frac{\partial^k f(a)}{(\partial x_1)^{k_1} \cdots (\partial x_n)^{k_n}} (x_1 - a_1)^{k_1} \cdots (x_n - a_n)^{k_n} \\
& + \sum_{\substack{k_1 + \cdots + k_n = r + 1 \\ k_i \ge 0}} \frac{1}{k_1! \cdots k_n!} \frac{\partial^{r+1} f(c)}{(\partial x_1)^{k_1} \cdots (\partial x_n)^{k_n}} (x_1 - a_1)^{k_1} \cdots (x_n - a_n)^{k_n}.
\end{aligned} \tag{*}$$
Proof. Simply use Theorem 4.5.1 for the function
$$g(t) = f(a + t(x - a)).$$
The formula (*) follows immediately from the observation
$$g^{(k)}(t) = \sum_{\substack{k_1 + \cdots + k_n = k \\ k_i \ge 0}} \frac{k!}{k_1! \cdots k_n!} \left. \frac{\partial^k f(s)}{(\partial s_1)^{k_1} \cdots (\partial s_n)^{k_n}} \right|_{s = a + t(x - a)} (x_1 - a_1)^{k_1} \cdots (x_n - a_n)^{k_n} \tag{**}$$
which follows by applying the chain rule repeatedly. □
It is useful to note that the affine approximation in the sense of 3.2 of the function f at a point a is simply the sum of the constant and linear terms of its Taylor expansion.
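As a small numerical illustration of Taylor's Theorem with $r = 2$ and $n = 2$, the sketch below uses the sample function $f(x, y) = e^x \cos y$ at $a = (0, 0)$ (a choice made for this example, not taken from the text). Its second-order Taylor polynomial is $1 + x + \frac{1}{2}(x^2 - y^2)$, and since the remainder in (*) consists of third-order terms, halving the displacement should reduce the error by a factor of roughly $2^3 = 8$.

```python
import math

def f(x, y):
    # Sample smooth function (assumption of this sketch)
    return math.exp(x) * math.cos(y)

def taylor2(x, y):
    # Second-order Taylor polynomial of f at a = (0, 0); at the origin
    # f = 1, f_x = 1, f_y = 0, f_xx = 1, f_xy = 0, f_yy = -1
    return 1.0 + x + 0.5 * (x * x - y * y)

def error(t):
    # Approximation error at the point (t, t)
    return abs(f(t, t) - taylor2(t, t))

for t in (0.2, 0.1, 0.05):
    print(t, error(t))
print("ratio:", error(0.1) / error(0.05))  # roughly 8 for a cubic remainder
```

Dropping the quadratic term in `taylor2` gives the affine approximation mentioned above, whose error would then scale quadratically instead.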
8.2
Local extremes and critical points
Let $f$ be a function defined on an open subset $U \subseteq \mathbb{R}^n$ and let $x^0 \in U$. In analogy with the one-variable case (4.7 of Chapter 1), we say that $f$ has a local minimum (resp. local maximum) at $x^0$ if there exists a $\delta > 0$ such that for every $x \in \Omega(x^0, \delta)$ with $x \ne x^0$, we have $f(x) > f(x^0)$ (resp. $f(x) < f(x^0)$). A local minimum or a local maximum is referred to by the joint term local extreme. On the other hand, $x^0$ is called a critical point of $f$ if either $f$ does not have a total differential at $x^0$, or the total differential is 0. The following is then a direct consequence, for example, of Proposition 2.3 and Corollary 4.3.1 of Chapter 1.
8.2.1 Proposition. A local minimum or local maximum of a function $f : U \to \mathbb{R}$ is a critical point of $f$.
8.3
The Hessian
Just as in 4.7 of Chapter 1, we would like a partial converse of Proposition 8.2.1 based on second derivatives. We will see, however, that in the multivariable case, the geometry is intrinsically more complicated. Suppose a function $f : U \to \mathbb{R}$ is $C^2$ on some open set $U \subseteq \mathbb{R}^n$. One considers the Hessian matrix $H$ of type $n \times n$ whose $(i, j)$’th entry is
$$\frac{\partial^2 f}{\partial x_i \partial x_j}.$$
This is a symmetric matrix by Proposition 4.2.1, and hence has an associated real symmetric bilinear form. If the Hessian is non-degenerate at a critical point $x^0$, we call $x^0$ a non-degenerate critical point. We have the following
Theorem. Suppose $f$ is $C^2$ on an open set $U \subseteq \mathbb{R}^n$ containing a non-degenerate critical point $x^0$. Then the following holds: if the Hessian $H(x^0)$ is positive-definite (resp. negative-definite) at $x^0$, then $x^0$ is a local minimum (resp. local maximum). If the Hessian is indefinite, then $x^0$ is neither a local minimum nor a local maximum. Such a point $x^0$ is called a saddle point.
Proof. By Taylor’s Theorem 8.1, for any $\delta > 0$ for which $\Omega(x^0, \delta) \subseteq U$, for every $x \in \Omega(x^0, \delta)$, $x \ne x^0$, there exists a point $c$ on the open line segment connecting $x^0$ and $x$ such that
$$f(x) = f(x^0) + \frac{1}{2} (x - x^0)^T H(c) (x - x^0) \tag{8.3.1}$$
(the first-order terms vanish because $x^0$ is a critical point of the $C^2$ function $f$, so its total differential at $x^0$ is 0).
Then we conclude that if $H(c)$ is positive-definite (resp. negative-definite), we have $f(x) > f(x^0)$ (resp. $f(x) < f(x^0)$). If $H(c)$ is indefinite, then, by definition, both positive and negative values will occur. However, in the statement of the theorem, we have $H(x^0)$, not $H(c)$. To remedy this situation, we proceed as follows: Consider
$$\Phi(v, c) = v^T H(c) v$$
as a function of $(v, c) \in X$ where
$$X = \{(v, c) \mid v \in \mathbb{R}^n, \; v \cdot v = 1, \; c \in \overline{\Omega(x^0, \delta/2)}\}.$$
Then by our assumptions, $\Phi$ is continuous. However, $X \subseteq \mathbb{R}^{2n}$ is compact by Theorem 6.5 of Chapter 2, and hence by Theorem 6.6 of Chapter 2, $\Phi$ is uniformly continuous. Now suppose $H(x^0)$ is positive-definite. The closed subset $X_0 \subseteq X$ consisting of all $(v, c)$ where $c = x^0$ is compact, and hence $\Phi$ has a minimum value $m$ on $X_0$ by Proposition 6.3 of Chapter 2. Since $H(x^0)$ is positive-definite, we have $m > 0$. Now by the uniform continuity of $\Phi$, there exists a $\delta' > 0$ such that for all $c \in \Omega(x^0, \delta')$ with $(v, c) \in X$, we have $\Phi(v, c) > 0$, and hence $H(c)$ is positive-definite also. The case of $H(x^0)$ negative-definite is handled analogously. When $H(x^0)$ is indefinite non-degenerate, there exist $(v_1, x^0) \in X_0$, $(v_2, x^0) \in X_0$ such that $\Phi(v_1, x^0) > 0$, $\Phi(v_2, x^0) < 0$. Since $\Phi$ is continuous, there exists a $\delta' > 0$ such that for $c \in \Omega(x^0, \delta')$, $\Phi(v_1, c) > 0$, $\Phi(v_2, c) < 0$, and hence $H(c)$ is indefinite. □
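For $n = 2$ the theorem can be turned into a small classification routine. The sketch below assumes Sylvester's criterion for symmetric $2 \times 2$ matrices (a standard linear-algebra fact, not proved in this section): $H$ is positive-definite when $h_{xx} > 0$ and $\det H > 0$, negative-definite when $h_{xx} < 0$ and $\det H > 0$, and indefinite when $\det H < 0$. The function name `classify` and the test functions are illustrative choices.

```python
def classify(hxx, hxy, hyy):
    """Classify a critical point of a C^2 function of two variables from
    the entries of its symmetric Hessian [[hxx, hxy], [hxy, hyy]] there."""
    det = hxx * hyy - hxy * hxy
    if det == 0:
        return "degenerate"       # the theorem of 8.3 does not apply
    if det < 0:
        return "saddle point"     # indefinite Hessian
    # det > 0: definite, sign decided by the upper-left entry
    return "local minimum" if hxx > 0 else "local maximum"

# Hessians at the critical point (0, 0) of some simple functions:
print(classify(2.0, 0.0, 2.0))    # f = x^2 + y^2
print(classify(-2.0, 0.0, -2.0))  # f = -x^2 - y^2
print(classify(2.0, 0.0, -2.0))   # f = x^2 - y^2
print(classify(0.0, 1.0, 0.0))    # f = x * y
```

The degenerate case is reported separately because the theorem makes no claim there; for example $f = x^4 + y^4$ and $f = x^4 - y^4$ both have the zero Hessian at the origin but behave differently.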
8.4
Global extremes
Suppose $f : X \to \mathbb{R}$ is a continuous function on a compact subset $X \subseteq \mathbb{R}^n$. Then by Proposition 6.3 of Chapter 2, $f$ attains a (global) minimum and maximum on $X$ at some points $x_1, x_2 \in X$. Can we find these points in practice? This is a classic example of an optimization problem, which, as the reader can imagine, has many applications outside of mathematics. The first method that comes to mind is computing all critical points, and checking the values to see at which of these points the maximum (resp. minimum) occurs. This is generally an adequate method when $n = 1$. A typical (although not general, see Exercise (19) in Chapter 2 above) example of a set $X$ is a compact interval or a finite union of compact intervals. If it happens (as it often does) that the equation $f'(x) = 0$ has only finitely many solutions, then the only other critical points to check are the finitely many boundary points of the intervals.
One immediately realizes, however, that the method in this form does not work even for a perfectly “reasonable” compact subset $X \subseteq \mathbb{R}^n$ when $n > 1$, such as, for example, a cube (or, more generally, a region with corners as we introduce it in Chapter 12 below). The point is that the boundary of such sets $X$ will in general be infinite (in fact, uncountable, see Exercise (2) of Chapter 1), and will consist entirely of critical points as defined above, so there is no way of checking all of them.
To see what else we can do, let us consider a simple example. Suppose a function $f(x, y)$ is continuously differentiable on some open set containing the ball $B = \{(x, y) \mid x^2 + y^2 \le 1\}$, and suppose we are to find the global extremes of $f$ on the compact set $B$. In the interior of $B$, we can then solve the equations
$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} = 0. \tag{*}$$
On the boundary, the extreme may not satisfy the equations (*), but we note that the boundary is itself the set of solutions of the “nice” equation
$$x^2 + y^2 = 1. \tag{C}$$
It is certainly worth asking if some generalization of (*) might hold, which would allow us to solve the problem. Note that generically speaking, we expect a single equation in the boundary case, since in addition to it, we still have the equation (“constraint”) (C).
8.5
Local Extremes with constraints. Lagrange multipliers
The problem we encountered at the end of Subsection 8.4 can be formalized as follows: Let $U \subseteq \mathbb{R}^n$ be open, and let $f : U \to \mathbb{R}$ be a real function. Let, further, $g_i : U \to \mathbb{R}$ be real functions, $i = 1, \dots, k$. A point $x^0 \in U$ is called a local minimum (resp. maximum) subject to the constraints
$$g_i(x) = 0, \quad i = 1, \dots, k \tag{*}$$
if $x = x^0$ satisfies (*) and there exists a $\delta > 0$ such that for every $x \in \Omega(x^0, \delta)$, $x \ne x^0$, which satisfies (*) we have $f(x) > f(x^0)$ (resp. $f(x) < f(x^0)$). We have the following
Theorem. Let $f, g_1, \dots, g_k$ be real functions defined in an open set $D \subseteq \mathbb{R}^n$, and suppose they are continuously differentiable. Suppose further that the rank of the matrix
$$M = \begin{pmatrix}
\dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_n} \\
\vdots & & \vdots \\
\dfrac{\partial g_k}{\partial x_1} & \cdots & \dfrac{\partial g_k}{\partial x_n}
\end{pmatrix}$$
is exactly $k$ at each point of $D$. Suppose $f$ has a local extreme subject to the constraints (*) at a point $x = a = (a_1, \dots, a_n)$. Then there exist numbers $\lambda_1, \dots, \lambda_k$ (known as Lagrange multipliers) such that for each $i = 1, \dots, n$, we have
$$\frac{\partial f(a)}{\partial x_i} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(a)}{\partial x_i} = 0.$$
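Proof. See Subsection 2.4 of Appendix B. If the matrix $M$ has rank $k$, then at least one of the $k \times k$ submatrices of $M$ is regular, and hence has a non-zero determinant. Without loss of generality, let us assume that at the extremal point we have, say,
$$\begin{vmatrix}
\dfrac{\partial g_1}{\partial x_1} & \cdots & \dfrac{\partial g_1}{\partial x_k} \\
\vdots & & \vdots \\
\dfrac{\partial g_k}{\partial x_1} & \cdots & \dfrac{\partial g_k}{\partial x_k}
\end{vmatrix} \ne 0. \tag{1}$$
If this holds, we have by 6.3 in a neighborhood of the point $a$ functions $\phi_i(x_{k+1}, \dots, x_n)$ (let us write $\tilde{x}$ for $(x_{k+1}, \dots, x_n)$) with continuous partial derivatives such that
$$g_i(\phi_1(\tilde{x}), \dots, \phi_k(\tilde{x}), \tilde{x}) = 0 \quad \text{for } i = 1, \dots, k.$$
Thus, an extreme (i.e. local maximum or local minimum) of $f(x)$ at $a$ subject to the given constraints implies the corresponding extreme property (without constraints) of the function
$$F(\tilde{x}) = f(\phi_1(\tilde{x}), \dots, \phi_k(\tilde{x}), \tilde{x})$$
at $\tilde{a}$, and hence by (1),
$$\frac{\partial F(\tilde{a})}{\partial x_i} = 0 \quad \text{for } i = k + 1, \dots, n,$$
and this is, by 3.1.1, equivalent to
$$\sum_{r=1}^{k} \frac{\partial f(a)}{\partial x_r} \frac{\partial \phi_r(\tilde{a})}{\partial x_i} + \frac{\partial f(a)}{\partial x_i} = 0 \quad \text{for } i = k + 1, \dots, n. \tag{2}$$
Taking derivatives of the constant functions $g_j(\phi_1(\tilde{x}), \dots, \phi_k(\tilde{x}), \tilde{x}) = 0$ we obtain for $j = 1, \dots, k$,
$$\sum_{r=1}^{k} \frac{\partial g_j(a)}{\partial x_r} \frac{\partial \phi_r(\tilde{a})}{\partial x_i} + \frac{\partial g_j(a)}{\partial x_i} = 0 \quad \text{for } i = k + 1, \dots, n. \tag{3}$$
Now we will use (1) again, for another purpose. By Theorem B.2.5.1, the system of linear equations
$$\frac{\partial f(a)}{\partial x_i} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(a)}{\partial x_i} = 0, \quad i = 1, \dots, k,$$
has a unique solution $\lambda_1, \dots, \lambda_k$. Those are the equalities from the statement, but, so far, for $i \le k$ only. It remains to be shown that the same equalities hold also for $i > k$. In effect, by (2) and (3), for $i > k$ we obtain
$$\begin{aligned}
\frac{\partial f(a)}{\partial x_i} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(a)}{\partial x_i}
&= - \sum_{r=1}^{k} \frac{\partial f(a)}{\partial x_r} \frac{\partial \phi_r(\tilde{a})}{\partial x_i} - \sum_{j=1}^{k} \lambda_j \sum_{r=1}^{k} \frac{\partial g_j(a)}{\partial x_r} \frac{\partial \phi_r(\tilde{a})}{\partial x_i} \\
&= - \sum_{r=1}^{k} \left( \frac{\partial f(a)}{\partial x_r} + \sum_{j=1}^{k} \lambda_j \frac{\partial g_j(a)}{\partial x_r} \right) \frac{\partial \phi_r(\tilde{a})}{\partial x_i} = - \sum_{r=1}^{k} 0 \cdot \frac{\partial \phi_r(\tilde{a})}{\partial x_i} = 0.
\end{aligned}$$
□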
8.6
Remarks
1. The functions $f, g_i$ were assumed to be defined in an open $D$ so that we can take derivatives whenever we need them. In particular, this was used in the computation of $\frac{\partial F(\tilde{a})}{\partial x_i}$, and the resulting equality (3) in Theorem 8.5 above. Take the unit ball $B$ from the end of 8.4 with, as an example, $f(x, y) = x + 2y$. Then the formulas $x + 2y$ and $x^2 + y^2 - 1$ make sense on all of $\mathbb{R}^2$.
2. The force of the statement in 8.5 is in asserting the existence of $\lambda_1, \dots, \lambda_k$ that satisfy more than $k$ equations, thus creating equations for the $\lambda_i$’s. In the above mentioned example, we have $\frac{\partial f}{\partial x} = 1$ and $\frac{\partial f}{\partial y} = 2$, $g(x, y) = x^2 + y^2 - 1$ and hence $\frac{\partial g}{\partial x} = 2x$ and $\frac{\partial g}{\partial y} = 2y$. There is one $\lambda$ that has to satisfy two equations
$$1 + 2\lambda x = 0 \quad \text{and} \quad 2 + 2\lambda y = 0.$$
This is possible only if $y = 2x$. Hence, as $x^2 + y^2 = 1$ we obtain $5x^2 = 1$ and hence $x = \pm \frac{1}{\sqrt{5}}$; this localizes the extremes to $\left( \frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right)$ and $\left( -\frac{1}{\sqrt{5}}, -\frac{2}{\sqrt{5}} \right)$.
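The example of Remark 2 can be verified directly. The sketch below checks that the two candidate points satisfy both Lagrange equations with a single multiplier, and compares the resulting values $\pm\sqrt{5}$ with a brute-force scan of the unit circle. The sampling density `n = 20000` is an arbitrary assumption, and the scan is only a sanity check, not a proof.

```python
import math

def f(x, y):
    # Objective from Remark 2 of 8.6
    return x + 2.0 * y

def g(x, y):
    # Constraint g(x, y) = 0 describes the unit circle
    return x * x + y * y - 1.0

# Candidates derived in the text: y = 2x and 5x^2 = 1
candidates = [(s / math.sqrt(5.0), 2.0 * s / math.sqrt(5.0)) for s in (1.0, -1.0)]

def multiplier(x, y):
    # The lambda solving the first equation 1 + 2*lambda*x = 0
    return -1.0 / (2.0 * x)

def brute_force_extremes(n=20000):
    # Sample the circle uniformly in the angle and scan the values of f
    values = [f(math.cos(2.0 * math.pi * k / n), math.sin(2.0 * math.pi * k / n))
              for k in range(n)]
    return min(values), max(values)

for x, y in candidates:
    lam = multiplier(x, y)
    # Residual of the second equation 2 + 2*lambda*y = 0 and the value of f
    print((x, y), lam, 2.0 + 2.0 * lam * y, f(x, y))
print(brute_force_extremes())
```

The candidate with positive coordinates gives the maximum $\sqrt{5}$ and the other one the minimum $-\sqrt{5}$ of $f$ on the circle.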
8.7 A problem of finding extremes with constraints may not be related to extremes at boundary points. Here is an example of another nature. Let us ask the question which rectangular parallelepiped of a given surface area has the largest volume. Denoting the lengths of the edges by $x_1, \dots, x_n$, the surface area is
$$S(x_1, \dots, x_n) = 2 x_1 \cdots x_n \left( \frac{1}{x_1} + \cdots + \frac{1}{x_n} \right)$$
and the volume is
$$V(x_1, \dots, x_n) = x_1 \cdots x_n.$$
Thus, we have
$$\frac{\partial V}{\partial x_i} = \frac{1}{x_i}\, x_1 \cdots x_n \quad \text{and} \quad \frac{\partial S}{\partial x_i} = \frac{2}{x_i} (x_1 \cdots x_n) \left( \frac{1}{x_1} + \cdots + \frac{1}{x_n} \right) - 2 x_1 \cdots x_n \frac{1}{x_i^2}.$$
If we write $y_i = \frac{1}{x_i}$ and $s = y_1 + \cdots + y_n$ and divide the equation from the theorem by $x_1 \cdots x_n$, we obtain
$$2 \lambda y_i (s - y_i) + y_i = 0, \quad \text{or} \quad y_i = s + \frac{1}{2\lambda}.$$
Thus, all the $x_i$ are equal and the unique solution is the cube.
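For $n = 3$ the conclusion can be checked by a crude search. In the sketch below the surface area is fixed at $S = 6$ (the value for the unit cube, an arbitrary normalization chosen for this example), the constraint $2(x_1 x_2 + x_1 x_3 + x_2 x_3) = S$ is solved for $x_3$, and the volume is maximized over a grid in $(x_1, x_2)$. The grid step `0.05` is an assumption; the search only suggests, and does not prove, that the cube is optimal.

```python
def volume_on_constraint(x1, x2, S=6.0):
    # Solve 2*(x1*x2 + x1*x3 + x2*x3) = S for x3, then return (volume, x3)
    x3 = (S / 2.0 - x1 * x2) / (x1 + x2)
    return x1 * x2 * x3, x3

# Grid over x1, x2 in (0, 3); keep only points where x3 > 0, i.e. x1*x2 < S/2
grid = [0.05 * k for k in range(1, 60)]
best = max(
    (volume_on_constraint(a, b)[0], a, b)
    for a in grid
    for b in grid
    if a * b < 3.0
)
volume, x1, x2 = best
print(volume, x1, x2, volume_on_constraint(x1, x2)[1])
```

The best grid point should be $x_1 = x_2 = 1$ with $x_3 = 1$ and volume 1, i.e. the unit cube.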
9
Exercises
(1) Prove that the function
$$f(x, y) = \begin{cases} \dfrac{(x^2 - y)^2}{x^4 + y^2} & \text{for } (x, y) \ne (0, 0), \\[2ex] 1 & \text{for } (x, y) = (0, 0) \end{cases}$$
becomes continuous when restricted to any straight line in $\mathbb{R}^2$. Prove, however, that $f$ is not continuous.
(2) Let $f(x, y) : \mathbb{R}^2 \to \mathbb{R}$ be the function defined by
$$f(x, y) = e^{-\frac{x}{y} - \frac{y}{x}} \quad \text{for } x, y \ne 0$$
and by $f(x, 0) = f(0, y) = 0$. Prove that $f$ has partial derivatives of all orders on $\mathbb{R}^2$, but is not continuous. [Hint: for $x, y \ne 0$, inductively, all partial derivatives (including higher ones) are of the form $Q(x, y) f(x, y)$ where $Q$ is a rational function. Taking limits of such functions along vertical or horizontal lines to points of the form $(0, y)$, $(x, 0)$, however, the limit is always 0. Therefore, by the mean value theorem, the (possibly higher) partial derivative in question is also 0 at those points.]
(3) Prove Proposition 2.4 in detail.
(4) Prove that if in 3.1.1 the functions $g_k$ have total differentials in $b$ then $f \circ g$ has one as well.
(5) Derive, similarly as in 3.4, a formula for the derivative of $\frac{f}{g}$.
(6) How many different expressions 4.3.1 are there?
(7) Find the first three summands in the Taylor expansion of the solution of (5.2.3).
(8) Give a counterexample to the statement of Theorem 5.2 when we drop the assumption $r \ge 1$.
(9) Implicit differentiation. Let $\mathbf{F} : U \to \mathbb{R}^m$, $U \subseteq \mathbb{R}^{n+m}$ open, be as in Theorem 6.3 and let $\mathbf{f} : V \to \mathbb{R}^m$, $V \subseteq \mathbb{R}^n$ open, be the map mentioned in Theorem 6.3. Let $D_x \mathbf{F} : \mathbb{R}^n \to \mathbb{R}^m$ and $D_y \mathbf{F} : \mathbb{R}^m \to \mathbb{R}^m$ be linear maps such that $D\mathbf{F}(x, y) = D_x \mathbf{F}(x) + D_y \mathbf{F}(y)$. Using the chain rule, prove that then
$$D\mathbf{f}|_x = -\left( D_y \mathbf{F}|_{(x, \mathbf{f}(x))} \right)^{-1} \left( D_x \mathbf{F}|_{(x, \mathbf{f}(x))} \right).$$
(10) Prove formula 8.1 (**) in detail.
(11) Prove Proposition 8.2.1 in detail.
(12) (a) Find a maximum and minimum of the function $f(x, y) = ax + by$ on the set $B = \{(x, y) \mid x^2 + y^2 \le 1\} \subseteq \mathbb{R}^2$ for every choice of values of the constants $a, b \in \mathbb{R}$.
(b) Find a minimum and maximum of the function $f(x, y) = x^2 + 2y^2$ on the set $B$.
4
Integration I: Multivariable Riemann Integral and Basic Ideas Toward the Lebesgue Integral
1
Riemann integral on an n-dimensional interval
In the first part of this chapter we will present a simple generalization of the one-dimensional Riemann integral which the reader already knows (see Section 8 of Chapter 1). To start with, we will consider the integral only for functions defined on n-dimensional intervals (= “bricks”) and we will be concerned, basically, with continuous functions. Later, the domains and functions to be integrated will become much more general.
1.1 A compact interval in the n-dimensional Euclidean space $\mathbb{R}^n$ is a product
$$J = \langle a_1, b_1 \rangle \times \cdots \times \langle a_n, b_n \rangle$$
where $\langle a_k, b_k \rangle$ are compact intervals in $\mathbb{R}$. A partition $D$ of such an interval is an n-tuple $(D_1, \dots, D_n)$ where the $D_i$ are partitions of the intervals $\langle a_i, b_i \rangle$, that is, sequences
$$D_i : a_i = t_{i1} < t_{i2} < \cdots < t_{i,n_i} = b_i, \tag{*}$$
often also viewed as sequences of intervals
$$\langle t_{i1}, t_{i2} \rangle, \langle t_{i2}, t_{i3} \rangle, \dots, \langle t_{i,n_i - 1}, t_{i,n_i} \rangle.$$
The partition $D$ above is called a refinement of a partition $D' = (D_1', \dots, D_n')$ if the sequences (*) above are subsequences of the sequences
$$D_i' : a_i = t_{i1}' < t_{i2}' < \cdots < t_{i,n_i'}' = b_i.$$
We have the obvious
1.1.1 Observation. Any two partitions have a common refinement.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7_4, © Springer Basel 2013
1.2 A member of a partition $D = (D_1, \dots, D_n)$ is any of the intervals (bricks)
$$\langle t_{1,i_1}, t_{1,i_1 + 1} \rangle \times \cdots \times \langle t_{n,i_n}, t_{n,i_n + 1} \rangle$$
where the $t_{i,j}$ are as in (*). The set of all members of a partition $D$ will be denoted by $|D|$. The volume of an interval $J = \langle a_1, b_1 \rangle \times \cdots \times \langle a_n, b_n \rangle$ is the number
$$\operatorname{vol} J = \prod_{i=1}^{n} (b_i - a_i).$$
Let $f$ be a bounded function on an interval $J$ and let $D$ be a partition of $J$. The lower (resp. upper) sum of $f$ in $D$ is the number
$$s(f, D) = \sum_{K \in |D|} m_K \operatorname{vol} K \quad \text{resp.} \quad S(f, D) = \sum_{K \in |D|} M_K \operatorname{vol} K$$
where $m_K = \inf\{f(x) \mid x \in K\}$ and $M_K = \sup\{f(x) \mid x \in K\}$.
1.2.1 From the definitions of suprema and infima we immediately see that if $D$ refines $D'$ then
$$s(f, D) \ge s(f, D') \quad \text{and} \quad S(f, D) \le S(f, D'), \tag{*}$$
and taking into account a common refinement we immediately obtain
Observation. For any two partitions $D, D'$ we have
$$s(f, D) \le S(f, D').$$
Now we can define the lower and the upper Riemann integral of $f$ over $J$ by setting
$$\underline{\int_J} f = \sup_D s(f, D) \quad \text{and} \quad \overline{\int_J} f = \inf_D S(f, D),$$
and if these two values coincide we speak of the Riemann integral of $f$ over $J$ and write
$$\int_J f$$
or, if we wish to emphasize the variables,
$$\int_J f(x_1, \dots, x_n)\, dx_1 \cdots dx_n \quad \text{or} \quad \int_J f(x)\, dx.$$
We then speak of a Riemann integrable function.
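The lower and upper sums of 1.2 are easy to compute for simple integrands. The sketch below treats $f(x, y) = xy$ on $J = \langle 0, 1 \rangle \times \langle 0, 1 \rangle$ with the uniform $N \times N$ partition; it relies on the assumption that $f$ is nondecreasing in each variable on $J$, so that on every brick the infimum sits at the lower-left corner and the supremum at the upper-right corner. The helper name `lower_upper_sums` is hypothetical.

```python
def lower_upper_sums(f, N):
    """Lower and upper sums of f over <0,1> x <0,1> for the uniform
    N x N partition.

    Assumes f is nondecreasing in each variable, so that on each brick
    inf f is attained at the lower-left corner and sup f at the
    upper-right corner.
    """
    vol = 1.0 / (N * N)  # volume of each brick
    s = S = 0.0
    for i in range(N):
        for j in range(N):
            s += f(i / N, j / N) * vol
            S += f((i + 1) / N, (j + 1) / N) * vol
    return s, S

for N in (4, 16, 64):
    s, S = lower_upper_sums(lambda x, y: x * y, N)
    print(N, s, S, S - s)
```

For this integrand the sums bracket the value $1/4$ and their difference equals $1/N$, so refining the partition squeezes the lower and upper integrals together.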
1.3 The following easy fact can be left to the reader (it can be proved by a literal repetition of the one variable case, see Exercise (1)).
Proposition. If $f, g$ are Riemann integrable and if $\alpha, \beta$ are real numbers then $\alpha f + \beta g$ is Riemann integrable and we have
$$\int_J (\alpha f + \beta g) = \alpha \int_J f + \beta \int_J g.$$
1.4
Almost disjoint unions of intervals
An interval $J = \langle a_1, b_1\rangle \times \cdots \times \langle a_n, b_n\rangle$ is an almost disjoint union of a pair of intervals $J^i = \langle a_1^i, b_1^i\rangle \times \cdots \times \langle a_n^i, b_n^i\rangle$, $i = 1, 2$, if for some $k$ we have
$$\forall i \ne k,\quad (a_i, b_i) = (a_i^1, b_i^1) = (a_i^2, b_i^2),$$
and
$$a_k = a_k^1,\ b_k^1 = a_k^2,\ b_k^2 = b_k \quad\text{or}\quad a_k = a_k^2,\ b_k^2 = a_k^1,\ b_k^1 = b_k.$$
An interval $J$ is an almost disjoint union of intervals $J_1, J_2, \dots, J_n$ if it can be produced recursively from $J_1, \dots, J_n$ by taking almost disjoint unions of pairs, using each $J_i$ precisely once.

1.4.1 Proposition. Let $J$ be an almost disjoint union of intervals $J_i$, $i = 1, \dots, n$. Then
$$\underline{\int}_J f = \sum_{i=1}^{n} \underline{\int}_{J_i} f \quad\text{and}\quad \overline{\int}_J f = \sum_{i=1}^{n} \overline{\int}_{J_i} f.$$

Proof. It suffices to prove the statement for an almost disjoint union of a pair of intervals $J^1, J^2$, and for this case it suffices to realize that each partition of $J$ can be refined into a pair of partitions of the $J^i$'s, and, on the other hand, from any pair of partitions of the $J^i$'s we can obtain, using common refinements, a partition of $J$. □
2 Continuous functions are Riemann integrable

2.1 Theorem. A function $f$ is Riemann integrable if and only if for every $\varepsilon > 0$ there exists a partition $D$ such that
$$S(f,D) - s(f,D) < \varepsilon.$$

Proof. If the formula holds, then, for each $\varepsilon > 0$,
$$\overline{\int}_J f \le S(f,D) < s(f,D) + \varepsilon \le \underline{\int}_J f + \varepsilon \le \overline{\int}_J f + \varepsilon.$$
On the other hand, if $\underline{\int}_J f = \overline{\int}_J f$ then by definition there are $D', D''$ such that $S(f,D') - s(f,D'') < \varepsilon$; take a common refinement $D$ of $D', D''$ and use 1.2.1 (*). □

Theorem. Every continuous function on an interval $J$ is Riemann integrable.

Proof. By Theorem 6.6 of Chapter 2, $f$ is uniformly continuous. Take an $\varepsilon > 0$ and choose a $\delta > 0$ such that for the distance in $\mathbb{R}^n$ we have
$$d(x,y) < \delta \ \Rightarrow\ |f(x) - f(y)| < \frac{\varepsilon}{\mathrm{vol}\,J}.$$
Further, choose a partition $D$ such that
$$\forall K \in |D|,\ \forall x, y \in K,\quad d(x,y) < \delta.$$
Then we have for $m_K = \inf\{f(x) \mid x \in K\}$ and $M_K = \sup\{f(x) \mid x \in K\}$,
$$M_K - m_K \le \frac{\varepsilon}{\mathrm{vol}\,J},$$
and since obviously
$$\sum_{K \in |D|} \mathrm{vol}\,K = \mathrm{vol}\,J,$$
we have
$$S(f,D) - s(f,D) = \sum_{K \in |D|} (M_K - m_K)\,\mathrm{vol}\,K \le \frac{\varepsilon}{\mathrm{vol}\,J} \sum_{K \in |D|} \mathrm{vol}\,K = \varepsilon.$$
□
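The criterion of 2.1 can be illustrated in one variable. The following sketch assumes, purely for convenience, a non-decreasing function on $\langle a, b\rangle$ and uniform partitions, so that the difference $S(f,D) - s(f,D)$ can be computed from endpoint values; the function names are hypothetical.

```python
def riemann_gap(f, a, b, n):
    """S(f, D) - s(f, D) for the uniform partition D of <a, b> into n pieces.

    Assumption: f is non-decreasing on <a, b>, so on each piece the infimum
    and supremum are the values at the left and right endpoints.
    """
    h = (b - a) / n
    pts = [a + k * h for k in range(n + 1)]
    return sum((f(pts[k + 1]) - f(pts[k])) * h for k in range(n))

def partition_size_for(f, a, b, eps):
    """Double n until the uniform partition satisfies S - s < eps."""
    n = 1
    while riemann_gap(f, a, b, n) >= eps:
        n *= 2
    return n

square = lambda x: x * x
n = partition_size_for(square, 0.0, 1.0, 1e-3)
print(n, riemann_gap(square, 0.0, 1.0, n))
```

For a monotone function the gap telescopes to $(f(b) - f(a))(b - a)/n$, which is why a fine enough partition always exists in this special case; the theorem above replaces monotonicity by uniform continuity.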
2.2 The following statements are straightforward (they hold more generally, but we will need them so far for continuous functions only).

Proposition. Let $f, g$ be continuous functions. Then
1. $|\int f| \le \int |f|$.
2. If $f \le g$ then $\int f \le \int g$.
3. In particular if $f(x) \le C$ for all $x \in J$ then $\int_J f \le C \cdot \mathrm{vol}\,J$.

3 Fubini’s Theorem in the continuous case
3.1 Theorem. Let $J' \subseteq \mathbb{R}^m$, $J'' \subseteq \mathbb{R}^n$ be intervals, $J = J' \times J''$. Let $f$ be a continuous function defined on $J$. Then
$$\int_J f(x,y)\,d(x,y) = \int_{J'} \Big( \int_{J''} f(x,y)\,dy \Big) dx = \int_{J''} \Big( \int_{J'} f(x,y)\,dx \Big) dy.$$
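A numerical sanity check of the statement of Theorem 3.1 can look as follows. This is only a sketch: the midpoint rule, the test function $f(x,y) = x e^{xy}$ on $\langle 0,2\rangle \times \langle 0,1\rangle$ and the grid size are choices made here for illustration. Note that the three approximations are the same finite sum in different orders, so their agreement is automatic; the informative comparison is with the exact value $e^2 - 3$.

```python
import math

def midpoint_1d(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over <a, b>."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def double_integral(f, ax, bx, ay, by, n):
    """Midpoint-rule approximation over the n x n grid of bricks."""
    hx, hy = (bx - ax) / n, (by - ay) / n
    return hx * hy * sum(
        f(ax + (i + 0.5) * hx, ay + (j + 0.5) * hy)
        for i in range(n) for j in range(n)
    )

f = lambda x, y: x * math.exp(x * y)
n = 200
# Integrate in y first, then in x.
dx_outer = midpoint_1d(lambda x: midpoint_1d(lambda y: f(x, y), 0, 1, n), 0, 2, n)
# Integrate in x first, then in y.
dy_outer = midpoint_1d(lambda y: midpoint_1d(lambda x: f(x, y), 0, 2, n), 0, 1, n)
direct = double_integral(f, 0, 2, 0, 1, n)
exact = math.e ** 2 - 3
print(dx_outer, dy_outer, direct, exact)
```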
Proof. We will prove the first equality, the second one is analogous. Put $F(x) = \int_{J''} f(x,y)\,dy$. We will prove that
$$\int_J f = \int_{J'} F.$$
This will also include the fact that the latter integral exists; this could be easily shown by proving, using uniform continuity, that $F$ is continuous. But we will get it during the proof for free anyway. Choose a partition $D$ of $J$ such that
$$\int_J f - \varepsilon \le s(f,D) \le S(f,D) \le \int_J f + \varepsilon.$$
The partition $D$ (as any partition of $J$) obviously consists of a partition $D'$ of $J'$ and a partition $D''$ of $J''$, and we have
$$|D| = \{K' \times K'' \mid K' \in |D'|,\ K'' \in |D''|\}$$
and each member appears as precisely one $K' \times K''$. We have
$$F(x) \le \sum_{K'' \in |D''|} \max_{y \in K''} f(x,y)\,\mathrm{vol}\,K''$$
and hence
$$S(F,D') \le \sum_{K' \in |D'|} \max_{x \in K'} \Big( \sum_{K'' \in |D''|} \max_{y \in K''} f(x,y)\,\mathrm{vol}\,K'' \Big) \mathrm{vol}\,K'$$
$$\le \sum_{K' \in |D'|} \sum_{K'' \in |D''|} \max_{(x,y) \in K' \times K''} f(x,y)\,\mathrm{vol}\,K''\,\mathrm{vol}\,K'$$
$$= \sum_{K' \times K'' \in |D|} \max_{z \in K' \times K''} f(z)\,\mathrm{vol}(K' \times K'') = S(f,D)$$
and similarly $s(f,D) \le s(F,D')$. Hence we have
$$\int_J f - \varepsilon \le s(F,D') \le \underline{\int}_{J'} F \le \overline{\int}_{J'} F \le S(F,D') \le \int_J f + \varepsilon,$$
and therefore $\int_{J'} F$ exists and is equal to $\int_J f$. □

4 Uniform convergence and Dini’s Theorem
4.1 Theorem. Let $f_n$ be continuous real functions on a compact interval $J$ and let them converge uniformly to a function $f$. Then
$$\int_J f = \lim_{n \to \infty} \int_J f_n.$$
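Theorem 4.1 can be observed numerically. In the sketch below the sequence $f_n(x) = \sqrt{x^2 + 1/n^2}$ on $\langle -1, 1\rangle$ is an example chosen here, not one from the text; it converges uniformly to $|x|$ with $0 \le f_n(x) - |x| \le 1/n$, so the integrals should differ by at most $\mathrm{vol}\,J \cdot 1/n = 2/n$. Integrals are approximated by a midpoint rule.

```python
import math

def midpoint(g, a, b, m=20000):
    """Midpoint-rule approximation of the integral of g over <a, b>."""
    h = (b - a) / m
    return h * sum(g(a + (k + 0.5) * h) for k in range(m))

def f_n(n):
    """f_n(x) = sqrt(x^2 + 1/n^2), which tends to |x| uniformly."""
    return lambda x: math.sqrt(x * x + 1.0 / (n * n))

limit_integral = midpoint(abs, -1.0, 1.0)  # integral of |x| over <-1, 1>
errors = {n: abs(midpoint(f_n(n), -1.0, 1.0) - limit_integral) for n in (1, 10, 100)}
print(limit_integral, errors)
```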
Proof. Choose an $\varepsilon > 0$ and an $n_0$ such that, for $n \ge n_0$,
$$|f_n(x) - f(x)| < \frac{\varepsilon}{\mathrm{vol}\,J}.$$
The symbols $m_K$ and $M_K$ will be as in 1.2, and the corresponding values for $f_n$ will be denoted by $m_K^n$ and $M_K^n$. Thus we have
$$|m_K - m_K^n|,\ |M_K - M_K^n| < \frac{\varepsilon}{\mathrm{vol}\,J}$$
so that
$$|s(f,D) - s(f_n,D)| \le \sum_{K \in |D|} |m_K - m_K^n|\,\mathrm{vol}\,K < \varepsilon$$
(again we use the fact that $\sum_{K \in |D|} \mathrm{vol}\,K = \mathrm{vol}\,J$) and similarly
$$|S(f,D) - S(f_n,D)| < \varepsilon.$$
Choose a partition $D$ such that
$$\int_J f - \varepsilon \le s(f,D) \le S(f,D) \le \int_J f + \varepsilon.$$
Then
$$\int_J f - 2\varepsilon \le s(f,D) - \varepsilon \le s(f_n,D) \le \int_J f_n \le S(f_n,D) \le S(f,D) + \varepsilon \le \int_J f + 2\varepsilon,$$
and we conclude that $\lim \int_J f_n = \int_J f$. □

4.2 Notation
A sequence $(f_n)_n$ of functions is said to be increasing if for all $x$
$$f_1(x) \le f_2(x) \le \cdots \le f_n(x) \le \cdots$$
(usually this is referred to as non-decreasing, but “increasing” is shorter and there will be no danger of confusion). Similarly we speak of a decreasing sequence. In the remainder of this chapter, we will allow infinite values, that is, a function will be a mapping $f : \mathbb{R}^m \to \mathbb{R} \cup \{-\infty, +\infty\}$. Consequently, an increasing (resp. decreasing) sequence $(f_n)_n$ always has a limit, namely the supremum resp. infimum. We write
$$f_n \nearrow f \quad\text{resp.}\quad f_n \searrow f$$
and if there is a danger of confusion (e.g. in double indexing) we emphasize the varying index as in
$$f_{nk} \nearrow_k f_n, \qquad f_{nk} \searrow_k f_n.$$
The notation $a_n \searrow a$, $a_n \nearrow a$ may be used also for monotone sequences of numbers. The constant zero function will be denoted, hopefully without danger of confusion, simply by 0.

4.3 Theorem. (Dini) Let $f_n$ be continuous real functions on a compact metric space $X$ and let $f_n \searrow 0$. Then $f_n$ converge to 0 uniformly.

Proof. It suffices to prove that $m_n = \max_x f_n(x)$ converges to zero, because then $|f_n(x) - 0| < \varepsilon$ for sufficiently large $n$ independently of the choice of $x \in X$. Suppose it does not. Reducing, possibly, $f_n$ to a subsequence, we obtain an example with
$$f_n \searrow 0 \quad\text{and}\quad \forall n,\ m_n > \varepsilon_0$$
for a fixed $\varepsilon_0 > 0$. Since $X$ is a compact metric space, there exist $x_n$ such that $f_n(x_n) = m_n$, and we can choose a subsequence of $x_n$ converging to some $x \in X$. After reducing to a subsequence, we may assume without loss of generality that we have
$$f_n \searrow 0, \quad \forall n\ f_n(x_n) > \varepsilon_0 \quad\text{and}\quad \lim_n x_n = x.$$
Now for $k \ge n$, $f_n(x_k) \ge f_k(x_k) > \varepsilon_0$ and hence
$$f_n(x) = \lim_k f_n(x_k) \ge \varepsilon_0 \quad\text{for all } n.$$
This is a contradiction with $\lim_n f_n(x) = 0$. □
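Dini's Theorem can be illustrated on $\langle 0, 1\rangle$ with two sequences chosen here as examples (they do not appear in the text). The sequence $x^n(1 - x)$ decreases to 0 everywhere, so its maxima must tend to 0. The sequence $x^n$ is also decreasing, but its pointwise limit at $x = 1$ is 1, so the hypothesis $f_n \searrow 0$ fails and the maxima stay at 1. The supremum is only estimated on a finite grid.

```python
def sup_on_grid(g, a, b, m=10001):
    """Maximum of g over m equally spaced points of <a, b> (a grid estimate of the sup)."""
    return max(g(a + (b - a) * k / (m - 1)) for k in range(m))

ns = (1, 10, 100, 1000)

# Hypotheses of Dini's Theorem hold: x^n (1 - x) decreases to 0 on <0, 1>.
dini = [sup_on_grid(lambda x, n=n: x ** n * (1 - x), 0.0, 1.0) for n in ns]

# Hypotheses fail: x^n decreases, but its limit at x = 1 is 1, not 0.
no_dini = [sup_on_grid(lambda x, n=n: x ** n, 0.0, 1.0) for n in ns]

print(dini)
print(no_dini)
```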
4.4 From 4.1 and 4.3, we immediately obtain the following

Corollary. Let $f_n$ be continuous real functions on a compact interval $J$ and let $f_n \searrow 0$. Then
$$\lim_n \int_J f_n = 0.$$

5 Preparing for an extension of the Riemann integral
5.1 For many purposes, the Riemann integral is not sufficiently general. For example, we may be interested in computing integrals such as
$$\int_0^1 \frac{dx}{\sqrt{x}} = 2\sqrt{x}\,\Big|_0^1 = 2,$$
which however is incorrect in the setting we considered so far, since the Riemann integral on the left-hand side does not exist. While in this particular case there is a quick fix in the form of “improper Riemann integrals” (which we do not treat here), clearly, a more systematic solution is needed: What about a function $f$ where $f(x)$ is 0 for $x$ rational and 1 for $x$ irrational? (This function is known as the Dirichlet function.) Obviously, $f$ is not Riemann integrable, but should we define
$$\int_0^1 f(x)\,dx = 1$$
to express that modifying the value of the function which is constantly equal to 1 on countably many points should not change the value of the integral? More generally, can one define the integral in such a way that we have
$$\lim \int f_n = \int \lim f_n \tag{*}$$
in a situation more general than the case of a uniform limit? Clearly, it is unreasonable to expect (*) in complete generality: for example, consider functions $f_n$ where $f_n$ is constant $n$ on the interval $(0, 1/n)$ and constant 0 elsewhere. Then $f_n \to 0$, while each of the functions $f_n$ has (Riemann) integral equal to 1.

Given all these questions, it is remarkable that there is a satisfactory answer: people more or less agree on one standard extension of the Riemann integral to a much larger class of functions, known as the Lebesgue integral. While there are different approaches to the Lebesgue integral, and the concept is somewhat notorious for taking a long time to cover, we will present here a relatively quick yet rigorous approach of defining the Lebesgue integral simply by starting with certain special cases of (*) as the definition of the value of $\int \lim f_n$, and then showing that this leads to a consistent theory. This approach to the Lebesgue integral is due to P. J. Daniell.
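The counterexample to (*) mentioned in 5.1 ($f_n$ equal to $n$ on $(0, 1/n)$ and 0 elsewhere) is easy to tabulate exactly. In the sketch below the integral is computed from the closed form "height times length" rather than numerically, and the sample point $x = 1/7$ is an arbitrary choice.

```python
from fractions import Fraction

def f(n, x):
    """f_n: constant n on the open interval (0, 1/n), zero elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral(n):
    """Riemann integral of f_n: height n times the length 1/n of (0, 1/n)."""
    return n * Fraction(1, n)

x = Fraction(1, 7)
pointwise = [f(n, x) for n in range(1, 12)]
integrals = [integral(n) for n in range(1, 12)]
print(pointwise)   # eventually 0 at the fixed point x
print(integrals)   # constantly 1
```

At any fixed $x$ the values are 0 as soon as $1/n \le x$, while every integral equals 1, so the limit of the integrals is 1 and the integral of the limit is 0.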
5.2 The class Z
The support of a function $f : \mathbb{R}^n \to \mathbb{R}$ is the closure of the set $\{x \in \mathbb{R}^n \mid f(x) \ne 0\}$. The support of $f$ is denoted by $\mathrm{supp}(f)$. Thus, a function has compact support if and only if it vanishes outside a compact subset $X \subseteq \mathbb{R}^n$. There is obviously the smallest interval $J_0$ containing the set $X$. Any interval $J$ containing $J_0$ is easily represented as an almost disjoint union of a set of intervals containing $J_0$ and such that $f$ is zero on all the other members of the system. Thus by 1.4, the integral $\int_J f$ does not depend on the choice of the interval $J$ containing the support $X$ of $f$. We will denote the common value by
$$If$$
(we will reserve the standard symbol $\int$ for an extended integral defined later). The set of all continuous functions with compact support in $\mathbb{R}^n$ will be denoted by $Z$.

Let us summarize the basic facts we will use below: We have a class $Z$ of functions defined on $\mathbb{R}^n$ such that
(Z1) for all $\alpha, \beta \in \mathbb{R}$ and $f, g \in Z$, $\alpha f + \beta g \in Z$,
(Z2) if $f \in Z$ then $|f| \in Z$,
and a mapping $I : Z \to \mathbb{R}$ such that
(I1) if $f \ge 0$ then $If \ge 0$,
(I2) $I$ is a linear map, and
(I3) if $f_n \searrow 0$ then $If_n \searrow 0$
(for (I3), use 4.4, realizing that the support of $f_n$ is contained in the support of $f_1$). Below, we will consistently use only the facts (Zj) and (Ij) and their consequences. For example, let $\max(f,g)$ (resp. $\min(f,g)$) denote the function whose value at a point $x$ is $\max(f(x), g(x))$ (resp. $\min(f(x), g(x))$), and let $f^+ = \max(f, 0)$, $f^- = -\min(f, 0)$. Note that
$$\max(f,g) = \tfrac{1}{2}(f + g + |f - g|) \quad\text{and}\quad \min(f,g) = \tfrac{1}{2}(f + g - |f - g|).$$
Thus, we easily deduce that
$$f \le g \ \Rightarrow\ If \le Ig, \quad\text{and}$$
$$f, g \in Z \ \Rightarrow\ \max(f,g),\ \min(f,g),\ f^+,\ f^- \in Z.$$
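The two identities for $\max$ and $\min$ are what make (Z1) and (Z2) sufficient for the closure properties above. They can be checked pointwise on sample functions; the sketch below uses exact rational arithmetic, and the functions `f`, `g` and the sample points are arbitrary illustrative choices.

```python
from fractions import Fraction

def fmax(f, g):
    """max(f, g) built only from sums, scalar multiples and absolute values."""
    return lambda x: (f(x) + g(x) + abs(f(x) - g(x))) / 2

def fmin(f, g):
    """min(f, g) built the same way."""
    return lambda x: (f(x) + g(x) - abs(f(x) - g(x))) / 2

zero = lambda x: Fraction(0)
pos = lambda f: fmax(f, zero)                    # f^+ = max(f, 0)
neg = lambda f: (lambda x: -fmin(f, zero)(x))    # f^- = -min(f, 0)

f = lambda x: x * x - 1
g = lambda x: 2 - x
points = [Fraction(k, 4) for k in range(-12, 13)]

ok_max = all(fmax(f, g)(x) == max(f(x), g(x)) for x in points)
ok_min = all(fmin(f, g)(x) == min(f(x), g(x)) for x in points)
ok_parts = all(pos(f)(x) - neg(f)(x) == f(x) and pos(f)(x) + neg(f)(x) == abs(f(x))
               for x in points)
print(ok_max, ok_min, ok_parts)
```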
6 A modest extension
6.1 Define
$$Z^{up} = \{f : \mathbb{R}^n \to (-\infty, +\infty] \mid \exists f_n \in Z,\ f_n \nearrow f\},$$
$$Z^{dn} = \{f : \mathbb{R}^n \to [-\infty, +\infty) \mid \exists f_n \in Z,\ f_n \searrow f\},$$
$$Z^* = Z^{up} \cup Z^{dn}.$$

Remark. We choose, of course, the topology on $(-\infty, +\infty]$ where a set is a neighborhood of $+\infty$ if and only if it contains some interval $(K, +\infty]$. This makes $f_n \nearrow f$ well defined: it means that $f_n$ is an increasing sequence of functions in $Z$ such that for each $x \in \mathbb{R}^n$, the sequence $f_n(x)$ converges to $f(x)$ in $(-\infty, +\infty]$. The treatment of $Z^{dn}$ is symmetrical. We will refer to $f$ as a monotone limit of the functions $f_n \nearrow f$ or $f_n \searrow f$. The functions in $Z^*$ are not necessarily continuous, they do not have to have a compact support, and can (obviously) reach infinite values. Also note that $Z \subseteq Z^{up} \cap Z^{dn}$ and this inclusion is not an equality.

6.2 Proposition. Let $f, g \in Z^*$ be monotone limits of sequences of functions $f_n \in Z$ and $g_n \in Z$, respectively. Let $f \le g$. Then
$$\lim If_n \le \lim Ig_n.$$

Proof. (a) If $f_n \nearrow f$ and $g_n \searrow g$ then $f_n \le f \le g \le g_n$.
(b) Let $f_n \nearrow f$ and $g_n \nearrow g$. For a fixed $k$ set $h_n = \min(g_n, f_k)$. Then the sequence $(h_n)$ increases and we have $\lim h_n = \min(g, f_k) = f_k$, and hence
$$h_n \nearrow_n f_k, \quad\text{that is,}\quad (f_k - h_n) \searrow_n 0$$
and we obtain, by (I3), that $\lim_n Ih_n = If_k$. Now $g_n \ge h_n$, hence $Ig_n \ge Ih_n$, and hence
$$\lim_n Ig_n \ge If_k$$
for each $k$ so that finally $\lim_n If_n \le \lim_k Ig_k$.
(c) If $f_n \searrow f$ and $g_n \searrow g$ use (b) for $-f$, $-g$.
(d) Let $f_n \searrow f$ and $g_n \nearrow g$. Then $f_n - g_n \le h_n = (f_n - g_n)^+$; since $h_n \searrow 0$ we have $\lim Ih_n = 0$ and finally
$$\lim If_n - \lim Ig_n = \lim I(f_n - g_n) \le 0.$$
□

6.3 A Corollary and a Definition

For $f \in Z^*$, we can define
$$If = \lim_n If_n$$
where $f_n$ is an arbitrary monotone sequence of functions in $Z$ converging (pointwise) to $f$.
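As an illustration of the definition in 6.3, the characteristic function of the open interval $(0,1)$ is not continuous, yet it is an increasing limit of continuous functions with compact support and so lies in $Z^{up}$. The sketch below uses the trapezoids $f_n(x) = \min(1, n \cdot \mathrm{dist}(x, \mathbb{R} \setminus (0,1)))$, a choice made here for illustration; their integrals are computed from the elementary area formula, valid for $n \ge 2$.

```python
from fractions import Fraction

def f(n, x):
    """Trapezoid f_n(x) = min(1, n * dist(x, complement of (0, 1)))."""
    dist = max(Fraction(0), min(x, 1 - x))
    return min(Fraction(1), n * dist)

def I(n):
    """Integral of f_n for n >= 2: trapezoid with parallel sides 1 and 1 - 2/n, height 1."""
    return 1 - Fraction(1, n)

x = Fraction(1, 50)
values = [f(n, x) for n in (2, 10, 50, 100)]     # increases to 1 = c_(0,1)(x)
integrals = [I(n) for n in (2, 10, 100, 1000)]   # increases to 1
print(values)
print(integrals)
```

With this sequence the definition gives $I c_{(0,1)} = \lim_n (1 - 1/n) = 1$, the expected length of the interval.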
6.4 A few immediate facts

For the purposes of integration, it is convenient to adopt the convention $0 \cdot \infty = 0 \cdot (-\infty) = 0$. We will use this convention for the remainder of this chapter, and in Chapter 5.
(a) $f \in Z^{up}$ if and only if $-f \in Z^{dn}$.
(b) If $f, g \in Z^{up}$ resp. $Z^{dn}$ then $f + g \in Z^{up}$ resp. $Z^{dn}$ and we have $I(f + g) = If + Ig$.
(c) If $f \in Z^{up}$ and $\alpha \ge 0$ resp. $\alpha \le 0$ then $\alpha f \in Z^{up}$ resp. $Z^{dn}$ and we have $I(\alpha f) = \alpha If$.
(d) If $f, g \in Z^*$ and $f \le g$ then $If \le Ig$.
(e) If $f, g \in Z^{up}$ then $\max(f,g), \min(f,g) \in Z^{up}$.

6.5 Proposition. Let $f_n \in Z^{up}$ and $f_n \nearrow f$. Then $f \in Z^{up}$ and $If_n \nearrow If$. Similarly for $f_n \in Z^{dn}$ and $f_n \searrow f$.

Proof. Choose $f_{nk} \in Z$ such that $f_{nk} \nearrow_k f_n$ and set
$$g_n = \max\{f_{ij} \mid 1 \le i, j \le n\}.$$
(The maximum of finitely many functions is defined by applying the definition of 5.2 recursively; alternately, take the maximum of the values at one point at a time.) Then $g_n \nearrow g$ for some $g$. Since
$$g_n(x) = f_{ij}(x) \le f_i(x) \quad\text{for some } i, j \le n$$
we have
$$g_n \le f_n \le f. \tag{1}$$
On the other hand, for $k \ge n$ we have $g_k \ge f_{nk}$ and hence
$$g \ge f_n. \tag{2}$$
By (1) and (2), $g_n \nearrow f$. Regarding the value of $If$, by (2), $If = Ig \ge If_n$ and hence $If \ge \lim If_n$; on the other hand, by (1), $If = \lim Ig_n \le \lim If_n$. □

7 A definition of the Lebesgue integral and an important lemma
In this section, we will define the well-known Lebesgue integral by the method of Daniell. This approach differs from the original Lebesgue construction based on defining a measure first. Here we will obtain measure later as a consequence of an already defined integral. We will see in Chapter 5 that the basic properties of measure will follow practically for free.
7.1 For an arbitrary function $f : \mathbb{R}^n \to [-\infty, +\infty]$, let
$$\underline{\int} f = \sup\{Ig \mid g \le f,\ g \in Z^{dn}\} \quad\text{and}\quad \overline{\int} f = \inf\{Ig \mid g \ge f,\ g \in Z^{up}\}.$$
$\underline{\int} f$ resp. $\overline{\int} f$ is called the lower resp. upper (Lebesgue) integral of $f$.

Remark. This notation will not interfere with the notation for the lower and upper Riemann integral introduced in 1.2 and used through Section 4. While the meanings of both notations are in fact different, we will not encounter the lower and upper Riemann integral any longer (with the exception of the Exercises).

7.2 Proposition. (1) $\underline{\int} f = \sup\{Ig \mid g \le f,\ g \in Z^*\}$ and $\overline{\int} f = \inf\{Ig \mid g \ge f,\ g \in Z^*\}$.
(2) $\underline{\int} f \le \overline{\int} f$.
(3) If $f \le g$ then $\underline{\int} f \le \underline{\int} g$ and $\overline{\int} f \le \overline{\int} g$.

Proof. (1) Assume that, say, the second equality does not hold. Then there exists a $g \ge f$, $g \in Z^{dn}$ such that $Ig < \overline{\int} f$. Let $g_n \searrow g$ with $g_n \in Z$. Then there has to be a $k$ such that $Ig_k < \overline{\int} f$. This is a contradiction, since $g_k \ge g \ge f$ and $g_k \in Z \subseteq Z^{up}$. (2) and (3) are trivial. □
7.3 From 7.2 (1), we immediately obtain the following

Corollary. For $f \in Z^*$ we have $\underline{\int} f = \overline{\int} f = If$.

7.4 Denote by
$$L$$
the set of all functions $f$ such that $\underline{\int} f = \overline{\int} f$ and such that the common value is finite. Such functions are called (Lebesgue) integrable, the common finite value is called the Lebesgue integral of $f$ and denoted by
$$\int f.$$
We will keep this notation for a while to distinguish the Lebesgue integral from the types of integral developed earlier. Note, however, that in practice, other notations are also common, for example, if $x_1, \dots, x_n$ are the standard coordinates in $\mathbb{R}^n$, one commonly writes
$$\int f(x_1, \dots, x_n)\,dx_1 \dots dx_n \quad\text{or}\quad \int f(x)\,dx$$
for the Lebesgue integral also.

Remark. The assumption of finiteness of the common value is essential. Functions with infinite $\underline{\int} f = \overline{\int} f$ can in general misbehave. We will have functions with infinite Lebesgue integral later, but their class will have to be restricted (see 7.9 below).

7.5 Proposition. A function $f : \mathbb{R}^n \to [-\infty, +\infty]$ satisfies $f \in L$ if and only if for every $\varepsilon > 0$ there exist $g_1 \in Z^{dn}$ and $g_2 \in Z^{up}$, $g_1 \le f \le g_2$, such that $Ig_i$ are finite and $Ig_2 - Ig_1 < \varepsilon$.

Proof. The implication $\Rightarrow$ is obvious. $\Leftarrow$: If $g_i$ are as assumed in the statement, then
$$Ig_1 \le \underline{\int} f \le \overline{\int} f \le Ig_2 \le Ig_1 + \varepsilon$$
so that $\overline{\int} f - \underline{\int} f$ is smaller than any $\varepsilon > 0$. □

7.6 Convention
Functions from $L$ can have infinite values. Let us agree that in case of $f(x) = +\infty$ and $g(x) = -\infty$ the value $f(x) + g(x)$ will be chosen arbitrarily. We will see that for our purposes such arbitrariness in the definition of $f + g$ does not matter.

7.7 Proposition.
(1) If $f, g \in L$ then $f + g \in L$ and one has $\int (f + g) = \int f + \int g$.
(2) If $f \in L$ then any $\alpha f \in L$ and one has $\int \alpha f = \alpha \int f$.
(3) If $f, g \in L$ then $\max(f,g) \in L$ and $\min(f,g) \in L$.
(4) If $f, g \in L$ and $f \le g$ then $\int f \le \int g$.
(5) If $f \in L$ then $f^+, f^- \in L$.
(6) If $f \in L$ then $|f| \in L$ and $|\int f| \le \int |f|$.

Proof. (1) We shall use 7.5. Choose $f_1, g_1 \in Z^{dn}$ and $f_2, g_2 \in Z^{up}$ such that $f_1 \le f \le f_2$, $g_1 \le g \le g_2$ and $If_2 - If_1 < \varepsilon$, $Ig_2 - Ig_1 < \varepsilon$. Then
$$f_1 + g_1 \le f + g \le f_2 + g_2 \tag{*}$$
and the statement follows (realize that the inequalities hold also at the ambiguous points mentioned in the convention of 7.6: if, say, $f(x) = +\infty$ and $g(x) = -\infty$ then $f_2(x) = +\infty$ and $g_1(x) = -\infty$; $f_1(x)$ has to be less than $+\infty$, as a limit of a decreasing sequence of finite numbers, and similarly $g_2(x) > -\infty$, so that the inequalities (*) are satisfied trivially).
(2) follows immediately from 7.5.
(3) Take the $f_i, g_i$ as in (1) to obtain
$$\max(f_1, g_1) \le \max(f,g) \le \max(f_2, g_2) \quad\text{and}\quad \min(f_1, g_1) \le \min(f,g) \le \min(f_2, g_2)$$
and realize that
$$\max(f_2, g_2) - \max(f_1, g_1) \le (f_2 - f_1) + (g_2 - g_1).$$
Similarly for the minimum.
(4) is obvious and (5) follows from (3).
(6) $|\int f| = |\int (f^+ - f^-)| = |\int f^+ - \int f^-| \le \int f^+ + \int f^- = \int |f|$. □

7.8 Lemma. If $f_n \in L$ and if $f_n \nearrow f$ then
$$\lim \int f_n = \overline{\int} f.$$
Remarks before the proof. 1. This lemma is very important and will play a crucial role below.
2. As $\int f_n \le \underline{\int} f$, we have trivially $\lim \int f_n \le \underline{\int} f$. Hence, under the assumptions of the lemma, we have
$$\lim_n \int f_n = \underline{\int} f = \overline{\int} f.$$

Proof. We obviously have $\lim \int f_n \le \overline{\int} f$, and if $\lim \int f_n = +\infty$ the equality is trivial. Thus, we can assume that the limit is finite. By the definition of $\int f_n$ choose $g_n \in Z^{up}$, $g_n \ge f_n$ such that
$$\int f_n + \frac{\varepsilon}{2^{n+1}} > Ig_n.$$
Set $h_n = \max\{g_i \mid i = 1, \dots, n\}$. Then $h_n \in Z^{up}$ and the sequence $h_n$ is increasing so that by 6.5, $h = \lim h_n \in Z^{up}$. Now $h_n \ge g_n \ge f_n$ and hence $h \ge f$, and $Ih \ge \overline{\int} f$. Here is an important

Claim. $h_n - f_n \le (g_1 - f_1) + (g_2 - f_2) + \cdots + (g_n - f_n)$.

(Indeed, at each point $x$, we have $g_j(x) - f_j(x) = h_n(x) - f_j(x)$ for some $j \le n$. The summands are non-negative, and hence the inequality holds for $j = n$; otherwise the sum is greater than or equal to $h_n(x) - f_j(x) + g_n(x) - f_n(x) = h_n(x) - f_n(x) + g_n(x) - f_j(x) \ge h_n(x) - f_n(x) + g_n(x) - f_n(x) \ge h_n(x) - f_n(x)$.)

Thus we have
$$Ih_n - \int f_n \le \sum_{i=1}^{n} \frac{\varepsilon}{2^{i+1}} < \varepsilon$$
so that $Ih_n \le \int f_n + \varepsilon$ and finally $\overline{\int} f \le Ih = \lim Ih_n \le \lim \int f_n + \varepsilon$. □

7.9 Some more notation
Set
$$L^{up} = \{f \mid \exists f_n \in L,\ f_n \nearrow f\}, \quad L^{dn} = \{f \mid \exists f_n \in L,\ f_n \searrow f\}, \quad\text{and}\quad L^* = L^{up} \cup L^{dn}.$$
Now we obtain from 7.8 the following

7.9.1 Corollary. For each $f \in L^*$ we have $\underline{\int} f = \overline{\int} f$. Consequently,
$$L^{up} \cap L^{dn} = L.$$

7.9.2 Convention
For $f \in L^*$ we will use the symbol $\int f$ for the common value of $\underline{\int} f$ and $\overline{\int} f$, even when it is infinite. However, we will not refer to such functions as integrable.

7.9.3 Proposition. If $f \in L^*$ and if the integral $\int f$ from 7.9.2 is finite then $f \in L$ and the integral coincides with the standard integral in $L$.

Proof. Let, say, $f \in L^{up}$, let $f_n \nearrow f$ with $f_n \in L$. Then by Lemma 7.8 and part 2 of the Remark in 7.8, $\int f = \lim \int f_n = \underline{\int} f = \overline{\int} f$. □

8 Sets of measure zero; the concept of “almost everywhere”
8.1 The characteristic function of a subset $M \subseteq \mathbb{R}^m$ will be denoted by $c_M$ (that is, $c_M(x) = 1$ if $x \in M$ and $c_M(x) = 0$ otherwise). We have
$$M \subseteq N \quad\text{if and only if}\quad c_M \le c_N,$$
$$c_{M \cup N} = \max(c_M, c_N) \quad\text{and}\quad c_{M \cap N} = \min(c_M, c_N),$$
and if $M_1 \subseteq M_2 \subseteq \cdots \subseteq M_n \subseteq \cdots$, $M = \bigcup_{n=1}^{\infty} M_n$, then
$$c_{M_n} \nearrow c_M.$$
$M$ is a set of measure zero if $\overline{\int} c_M = 0$ (then, since $c_M \ge 0$, we also have $\underline{\int} c_M = 0$ and hence $c_M \in L$).

8.2 Proposition. (1) If $M$ is a set of measure zero and $N \subseteq M$ then $N$ is a set of measure zero.
(2) If $M_n$ are sets of measure zero then also $\bigcup_{n=1}^{\infty} M_n$ is a set of measure zero.

Proof. (1) is trivial. For (2), consider $N_n = M_1 \cup \cdots \cup M_n$. Then $c_{N_n} \le c_{M_1} + \cdots + c_{M_n}$ and hence $N_n$ is a set of measure zero by 7.7. Now, with $M = \bigcup_{n=1}^{\infty} M_n$, we have $c_{N_n} \nearrow c_M$ and hence $\overline{\int} c_M = 0$ by 7.8. □

8.3 Let $V(x)$ be a statement about points in $\mathbb{R}^m$. We say that $V$ holds almost everywhere (briefly, a.e.) if the set $\{x \mid \text{not } V(x)\}$ is a set of measure zero. If $f(x) = g(x)$ almost everywhere, we will write
$$f \sim g.$$

8.4 Proposition. (1) If $f \in L$ then $f(x)$ is finite almost everywhere.
(2) If $f \in L^{up}$ (resp. $L^{dn}$) then $f(x) > -\infty$ (resp. $< +\infty$) almost everywhere.

Proof. (1) Recall the convention on sums in 7.6, and Proposition 7.7 (1). We may define $f + (-f)$ equally well as 0 or as $c_M$ where $M = \{x \mid f(x) = \pm\infty\}$ and hence $\int c_M = \int 0 = 0$.
(2) When $f \in L^{up}$, take $f_n \in L$ with $f_n \nearrow f$. Then $\{x \mid f(x) = -\infty\} \subseteq \{x \mid f_1(x) = \pm\infty\}$ and the latter set is a set of measure zero by (1). The case of $f \in L^{dn}$ is analogous. □
8.5 Proposition. If $f \sim g$ then $\underline{\int} f = \underline{\int} g$ and $\overline{\int} f = \overline{\int} g$.

Proof. We will consider the case of $\overline{\int}$ (the other case is analogous). If we do not have $\overline{\int} f = \overline{\int} g = +\infty$ we can assume that $\overline{\int} f < +\infty$. Set $M = \{x \mid f(x) \ne g(x)\}$ and $r_n = n \cdot c_M$. By 7.8 we have $\overline{\int} r = 0$ for $r = \lim r_n$. Choose $h_1, h_2 \in Z^{up}$ such that $h_1 \ge f$, $h_2 \ge r$, $Ih_1 < \overline{\int} f + \varepsilon$ and $Ih_2 < \varepsilon$. Then we have $h_1 + h_2 \in Z^{up}$, $h_1 + h_2 \ge g$, and hence $\overline{\int} g \le Ih_1 + Ih_2 < \overline{\int} f + 2\varepsilon$. Thus, $\overline{\int} g \le \overline{\int} f$, in particular $\overline{\int} g < +\infty$, and we can repeat the procedure with $f, g$ interchanged. □

8.6 Corollary. (1) If $f \in L$ and $f \sim g$ then $g \in L$.
(2) If $f \in L^{up}$ resp. $L^{dn}$ and $f \sim g$ then $g \in L^{up}$ resp. $L^{dn}$.

8.7 Proposition. If $f \ge 0$ and $\int f = 0$ then $f \sim 0$.

Proof. Set $M_n = \{x \mid f(x) \ge \frac{1}{n}\}$. Since $0 \le c_{M_n} \le nf$ we have $\int c_{M_n} = 0$, hence $M_n$ is a set of measure zero, and consequently $\{x \mid f(x) \ne 0\} = \bigcup_{n=1}^{\infty} M_n$ is a set of measure zero. □

9 Exercises

(1) Prove Proposition 1.3.
(2) Prove Proposition 2.2.
(3) Prove the second equality in Theorem 3.1.
(4) Prove that the lower Riemann integral of a bounded function on an interval in $\mathbb{R}^n$ is always less than or equal to the lower Lebesgue integral, and that the upper Riemann integral is always greater than or equal to the upper Lebesgue integral. Conclude that a Riemann integrable function on an interval is Lebesgue integrable and that both integrals are equal.
(5) Prove by definition that the Lebesgue integral of the function equal to $x^{-q}$ on $\langle 0, b\rangle$ and 0 elsewhere, where $0 < q < 1$, $b > 0$ are constants, exists, and compute it. [Hint: Consider the functions equal to $x^{-q}$ on $\langle a, b\rangle$ where $0 < a < b$ and 0 elsewhere.]
(6) Prove that the Lebesgue integral of the function $f(x) = \dfrac{1}{1 + x^2}$ exists and compute it. [Hint: see the hint to Exercise (5).]
(7) Prove that the function which is equal to 1 on every irrational number in $\langle 0, 1\rangle$ and 0 elsewhere is Lebesgue integrable and calculate its Lebesgue integral.
(8) Prove that the Cantor set of Exercise (19) in Chapter 2 has measure 0. [Hint: Express its characteristic function as an appropriate monotone limit.]
(9) By a generalized Cantor set, we shall mean the intersection $S = \bigcap S_i$ of sets $S_0 \supseteq S_1 \supseteq S_2 \supseteq \dots$ constructed as follows: We put $S_0 = \langle 0, 1\rangle$. The set $S_n$ is a union of $2^n$ closed intervals $\langle a_i, b_i\rangle$, $i = 1, \dots, 2^n$, and for some number $\varepsilon_n > 0$, $\varepsilon_n < \dfrac{b_i - a_i}{2}$, we have
$$S_{n+1} = S_n \setminus \bigcup_{i=1}^{2^n} \Big( \frac{a_i + b_i}{2} - \varepsilon_n,\ \frac{a_i + b_i}{2} + \varepsilon_n \Big).$$
(a) Prove that there exist generalized Cantor sets which are not of measure 0.
(b) Derive a necessary and sufficient condition (in terms of the numbers $\varepsilon_i$) for the set $S$ to be of measure 0.
(10) (a) Prove that for two generalized Cantor sets $S$, $T$, there exists a monotone homeomorphism $\phi : \langle 0, 1\rangle \to \langle 0, 1\rangle$ such that $\phi[S] = T$. [Hint: Construct such map with $S$, $T$ replaced by $S_n$, $T_n$ and prove that the sequence of those maps converges uniformly. Use a separate argument to show that the limit is monotone.]
(b) Conclude that for a homeomorphism $\langle 0, 1\rangle \to \langle 0, 1\rangle$, a continuous image of a set of measure 0 may not be of measure 0.
(11) Let $f : \mathbb{R} \to \langle 0, 1\rangle$ be defined as follows: If $x$ is irrational, then $f(x) = 0$. If $x = a/b$ where $a \in \mathbb{Z}$, $b \in \mathbb{N}$ and the greatest common divisor of $a$ and $b$ is 1, then $f(a/b) = 1/b$. Prove that $f$ is continuous almost everywhere. [Hint: Try to guess the set of all points at which $f$ is continuous.]
5 Integration II: Measurable Functions, Measure and the Techniques of Lebesgue Integration

I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7_5, © Springer Basel 2013

1 Lebesgue’s Theorems

1.1 Theorem. (Lebesgue’s Monotone Convergence Theorem) Let $f_n \in L^{up}$ and let $f_n \nearrow f$ a.e. Then $f \in L^{up}$ and $\int f = \lim \int f_n$. Similarly for $f_n \in L^{dn}$ and $f_n \searrow f$.

Proof. Let us treat the case $f_n \nearrow f$, $f_n \in L^{up}$, the other case is analogous. Choose $f_{nk} \in L$ such that $f_{nk} \nearrow_k f_n$ and set
$$g_n = \max\{f_{ij} \mid i, j \le n\}.$$
Now $g_n \nearrow g$ with $g_n \in L$. Since $g_n \le f$ we have $g \le f$. On the other hand, however, $g_p \ge f_{np}$ for $p \ge n$ and hence $g \ge f_n$, and finally $g \ge f$. Thus, $f = g \in L^{up}$.
Now consider the value of $\int f$. If $\lim \int f_n = +\infty$ the equality is trivial; hence we can assume that $\lim \int f_n$ is finite. Then $f_n \in L$ and we can use 7.8 of Chapter 4 to obtain $\lim \int f_n = \overline{\int} f = \int f$. □

Remark. This statement is also known as Levi’s Theorem.

1.2 Theorem. (Lebesgue’s Dominated Convergence Theorem) Let $f_n \in L$. Assume $\lim f_n(x) = f(x)$ a.e., and let there exist a $g \in L$ such that $|f_n(x)| \le g(x)$ a.e. Then $f \in L$ and $\int f = \lim \int f_n$.

Remark. The attentive reader may worry about the seemingly sloppy formulation: does one mean “almost everywhere one has that for all $n$ that $|f_n(x)| \le g(x)$” or “for each $n$ one has that $|f_n(x)| \le g(x)$ almost everywhere”? But it is an easy exercise (Exercise (1)) to show these two statements are equivalent.

Proof. By 8.5 of Chapter 4, we may omit “almost everywhere” from the assumptions. Set
$$h_n = \sup\{f_k \mid k \ge n\}, \qquad g_n = \inf\{f_k \mid k \ge n\}.$$
Since $\max_{j=0,\dots,p} f_{n+j} \nearrow_p h_n$ we have $h_n \in L^{up}$, and similarly $g_n \in L^{dn}$. But we have, moreover,
$$-g \le g_n \le f_n \le h_n \le g$$
and hence $\int g_n$ and $\int h_n$ are finite and we have in fact $g_n, h_n \in L$, and consequently $g_n \in L^{up}$ and $h_n \in L^{dn}$ and we can use Lebesgue’s Monotone Convergence Theorem. Now obviously $g_n \nearrow f$ and $h_n \searrow f$, by Lebesgue’s Monotone Convergence Theorem we have $\lim \int g_n = \lim \int h_n = \int f$, and finally since $g_n \le f_n \le h_n$ we conclude that $\int f = \lim \int f_n$. □

1.3 Proposition. Let $g \in L$, let $f_n \in L^*$, let $f_n \ge g$ a.e. and let $\lim_n f_n(x) = f(x)$ a.e. Then $f \in L^{up}$. Similarly for $f_n \le g$ we obtain $f \in L^{dn}$.
Proof. Since $-\infty < \int g \le \int f_n$, $f_n \in L^{up}$ (if $f_n \in L^{dn}$ it has, hence, a finite integral so that, by 7.9.3 of Chapter 4, $f_n \in L \subseteq L^{up}$ as well). Set $\varphi = \sup_n f_n$. We have $\max_{k \le n} f_k \nearrow_n \varphi$ and hence $\varphi \in L^{up}$ by 1.1, and there exist $\varphi_n \in L$ such that $\varphi_n \nearrow \varphi$. Obviously $\varphi \ge f \ge g$ and we can assume that $\varphi_n \ge g$ (else replace $\varphi_n$ by $\max(\varphi_n, g)$). Set
$$g_{kn} = \min(\varphi_k, f_n).$$
We have $g \le g_{kn} \le \varphi_k$ and hence $g_{kn} \in L$ and, moreover, we can use Lebesgue’s Dominated Convergence Theorem for $\lim_n g_{kn}$ and obtain
$$\min(\varphi_k, f) = \lim_n g_{kn} \in L.$$
Now we conclude that $\min(\varphi_k, f) \nearrow_k f$ and hence $f \in L^{up}$. □

2 The class Λ (measurable functions)

2.1 As before, $\lim_n f_n = f$ will be abbreviated by writing $f_n \to f$. Let
$$\Lambda = \{f \mid \exists f_n \in L,\ f_n \to f\}$$
(unlike in the definition of $L^{up}$ and $L^{dn}$ there is no assumption on the nature of the convergence). Functions which belong to $\Lambda$ are called (Lebesgue) measurable.

2.2 Proposition. If $f \sim g$ and $f \in \Lambda$ then $g \in \Lambda$.

Proof. Let $f_n \in L$ and $f_n \to f$. Define $M = \{x \mid f(x) \ne g(x)\}$ and set
$$g_n(x) = g(x) \text{ for } x \in M, \qquad g_n(x) = f_n(x) \text{ otherwise.}$$
Then by 8.6 of Chapter 4, $g_n \in L$, and obviously $g_n \to g$. □
2.3 From 1.3, we immediately see the following

Corollary. If $f \in \Lambda$ and $f \ge 0$ then $f \in L^{up}$.

2.4 The following is trivial.

Proposition. (a) If $f, g \in \Lambda$ and if $f + g$ makes sense a.e. then $f + g \in \Lambda$.
(b) If $f \in \Lambda$ and $\alpha \in \mathbb{R}$ then $\alpha f \in \Lambda$.
(c) If $f, g \in \Lambda$ then $\max(f,g), \min(f,g) \in \Lambda$.
(d) If $f \in \Lambda$ then $|f| \in \Lambda$.

2.5 Proposition. $f \in \Lambda$ if and only if both $f^+$ and $f^-$ are in $L^{up}$.

Proof. If $f_n$ are in $L$ and $f_n \to f$ then obviously $f_n^+ \to f^+$ and $f_n^- \to f^-$. Use 2.3. The other implication is trivial. □

2.5.1 Corollary. Let $f \in \Lambda$ and let there exist a $g \in L$ such that $|f| \le g$. Then $f \in L$.

2.6 Proposition. If $f_n \in \Lambda$ and if $f_n \to f$ a.e. then $f \in \Lambda$.

Proof. We have $f_n^+, f_n^- \in L^{up}$ and $f_n^+ \to f^+$, $f_n^- \to f^-$. Thus, by 1.3, both $f^+$ and $f^-$ are in $L^{up}$. □

2.7 Proposition. $f \in L^*$ if and only if $f^+$ and $f^-$ are in $L^{up}$ and if the difference $\int f^+ - \int f^-$ makes sense. Consequently, $f \in \Lambda \setminus L^*$ if and only if $f^+$ and $f^-$ are in $L^{up}$ and $\int f^+ = \int f^- = +\infty$.

Proof. $\Rightarrow$: Let, say, $f \in L^{up}$ and let $f_n \nearrow f$ and $f_n \in L$. As $f_1 = f_1^+ - f_1^- \le f = f^+ - f^-$ we have $f^- \le f_1^- \in L$ and hence the value of $\int f^-$ is finite.
$\Leftarrow$: If $\int f^+ - \int f^-$ makes sense then at least one of the integrals is finite and either $f^+$ or $f^-$ is in $L$. Thus, $f^+ - f^-$ is either in $L^{up}$ or in $L^{dn}$. □
2.8 Remark

Some of the statements proved in this section may be somewhat surprising. It turned out, for example, that for integrability of a limit of integrable functions, the nature of the limiting process is not very important: all one needs is that the positive and negative parts of the limit not both have infinite integrals. For the value of the integral of the limit, on the other hand, the nature of the convergence obviously matters a great deal.

3 The Lebesgue measure
3.1 A set $A \subseteq \mathbb{R}^m$ is said to be (Lebesgue) measurable if the characteristic function $c_A$ is in $\Lambda$ (then, of course, it is in $L^{up}$, by 2.3). We put
$$\mu(A) = \int c_A$$
and call $\mu(A)$ the (Lebesgue) measure of $A$. Note that this terminology is in accordance with 8.4 of Chapter 4 (see Exercise (4)).

3.2 General facts

If $A, B \subseteq \mathbb{R}^m$ are measurable, $A \cup B$ is measurable and $\mu(A \cup B) \le \mu(A) + \mu(B)$ (by 2.4, we have $c_{A \cup B} = \max(c_A, c_B)$ $(\le c_A + c_B)$ in $\Lambda$) and if $A, B$ are disjoint then
$$\mu(A \cup B) = \mu(A) + \mu(B) \tag{3.2.1}$$
as then $c_{A \cup B} = c_A + c_B$.

But we have much more: the measure is countably additive ($\sigma$-additive, as this fact is usually referred to). Here are some facts on measurability.

Proposition. (1) Let $A_n$, $n = 1, 2, \dots$, be measurable sets. Then $\bigcup_{n=1}^{\infty} A_n$ is measurable. If for any two $n \ne k$ the intersection $A_n \cap A_k$ is a set of measure zero then
$$\mu\Big( \bigcup_{n=1}^{\infty} A_n \Big) = \sum_{n=1}^{\infty} \mu(A_n).$$
(2) The intersection of a countable system of measurable sets is measurable.
(3) If $A, B$ are measurable then the difference $A \setminus B$ is measurable.
(4) $\mu(\emptyset) = 0$ and for measurable sets $A \subseteq B$, $\mu(A) \le \mu(B)$.

Proof. (1) We have $c_{A_1 \cup \cdots \cup A_n} \nearrow_n c_{\bigcup_{n=1}^{\infty} A_n}$ and hence $c_{\bigcup_{n=1}^{\infty} A_n} \in L^{up}$. In the almost disjoint case we obtain the value from the finite additivity (3.2.1) and from Lebesgue’s Monotone Convergence Theorem.
(3) $c_{A \setminus B} = \max(c_A - c_B, 0)$. (See 2.4.)
(2) From (1), (3).
(4) is trivial. □

3.3 Special sets
Proposition. (1) Every open set in Rm is measurable. (2) Every closed set in Rm is measurable. (3) For the interval J D ha1 ; b1 i ham ; bm i, one has .J / D .b1 a1 /.b2 a2 / .bn an /: (4) Every countable set is measurable, with measure 0. Proof. The Euclidean distance in Rm will be denoted by .x; y/. (1) It suffices to show that bounded open sets are measurable: for a general open U consider the open [ balls Bn D fx j .x; .0; : : : ; 0// < ng and use Proposition 3.2 (1) for U D U \ Bn . n
Thus, let U be a bounded open set. Set
122
5 Integration II: Measurable Functions, Measure and the Techniques: : :
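$$A_n = \{x \mid \rho(x, \mathbb{R}^m \setminus U) \ge \tfrac{1}{n}\}$$
and define $f_n : \mathbb{R}^m \to \mathbb{R}$ by
$$f_n(x) = \frac{\rho(x, \mathbb{R}^m \setminus U)}{\rho(x, \mathbb{R}^m \setminus U) + \rho(x, A_n)}.$$
Since $A_n$ and $\mathbb{R}^m \setminus U$ are disjoint closed sets, $f_n$ is a continuous map. Since $f_n(x) = 0$ for $x \notin U$, we have $f_n \in Z \subseteq \Lambda$. Now if $x \in U$ then $\rho(x, \mathbb{R}^m \setminus U) \ge \frac{1}{n_0}$ for some $n_0$ and hence $x \in A_n$, and $f_n(x) = 1$, for all $n \ge n_0$. Thus, $f_n \to c_U$ and $c_U \in \Lambda$.
(2) Use (1) and 3.2 (3).
(3) Note that for a bounded closed set $C$ we can use a similar procedure as in (1): this time set
$$A_n = \{x \mid \rho(x, C) \ge \tfrac{1}{n}\}$$
and define $f_n : \mathbb{R}^m \to \mathbb{R}$ by
$$f_n(x) = \frac{\rho(x, A_n)}{\rho(x, A_n) + \rho(x, C)}.$$
Now obviously $f_n(x) = 1$ for $x \in C$ and $f_n(x) = 0$ for $\rho(x, C) \ge \frac{1}{n}$. Furthermore, if $k \ge n$ then $\rho(x, A_k) \le \rho(x, A_n)$, and $f_k(x) \le f_n(x)$. Thus, $f_n \searrow c_C$. In particular this holds for the interval $J$. Moreover, $f_n(x) = 0$ outside
$$\langle a_1 - \tfrac{1}{n}, b_1 + \tfrac{1}{n}\rangle \times \dots \times \langle a_m - \tfrac{1}{n}, b_m + \tfrac{1}{n}\rangle$$
and $0 \le f_n(x) \le 1$ so that by the standard estimate of Riemann integrals
$$(b_1 - a_1)\cdots(b_m - a_m) \le \int f_n \le \big(b_1 - a_1 + \tfrac{2}{n}\big)\cdots\big(b_m - a_m + \tfrac{2}{n}\big)$$
and $\int c_J = (b_1 - a_1)(b_2 - a_2)\cdots(b_m - a_m)$ by Lebesgue's Monotone Convergence Theorem (actually already by Dini's Theorem).
(4) By (3), $\mu(\{x\}) = 0$. Use 3.2 (1). □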
3.4
The set $\mathcal{B}$ of Borel sets
The smallest class of subsets of $\mathbb{R}^m$ containing all open subsets and closed under
• complements,
• countable unions, and
• countable intersections
(of course, the last follows from the first two) is called the class of Borel sets, and denoted by $\mathcal{B}$. Thus, all the open and closed sets are Borel. However, we have more complicated sets. For example, an $F_\sigma$ set is a countable union of closed subsets, and a $G_\delta$ set is an intersection of countably many open subsets. Going on, a $G_{\delta\sigma}$ set is a union of countably many $G_\delta$ sets, and an $F_{\sigma\delta}$ set is an intersection of countably many $F_\sigma$ sets, and so on. All sets produced in this way are Borel by definition. From 3.2 and 3.3 we immediately obtain
3.4.1 Corollary. Every Borel set is measurable.
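3.5 Let us conclude this section with a trivial remark. From 3.2 (1) and 2.2 (1), we immediately obtain the frequently used, somewhat paradoxical observation that for every $\varepsilon > 0$ there exists a dense open set $U$ of the unit interval $I$ such that $\mu(U) < \varepsilon$: order all the rationals in $I$ in a sequence $r_1, r_2, \dots, r_n, \dots$ and set
$$U = \bigcup_{n=1}^{\infty} \Big(r_n - \frac{\varepsilon}{2^{n+2}},\ r_n + \frac{\varepsilon}{2^{n+2}}\Big)$$
(where $(a, b)$ designate open intervals). The $n$-th interval has length $\varepsilon / 2^{n+1}$, so $\mu(U) \le \sum_{n=1}^{\infty} \varepsilon / 2^{n+1} = \varepsilon / 2 < \varepsilon$.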
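The length bookkeeping in 3.5 can be checked with exact rational arithmetic. The following is only a sketch: the enumeration of the rationals by increasing denominator and the helper names `rationals_in_unit_interval`, `intervals` and `total_length` are assumptions made for illustration, and only finitely many intervals are examined.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    found = [Fraction(0), Fraction(1)]
    q = 2
    while len(found) < count:
        for p in range(1, q):
            if gcd(p, q) == 1:
                found.append(Fraction(p, q))
        q += 1
    return found[:count]

def intervals(eps, count):
    """Open intervals (r_n - eps/2^(n+2), r_n + eps/2^(n+2)), n = 1..count."""
    rs = rationals_in_unit_interval(count)
    return [(r - eps / 2 ** (n + 2), r + eps / 2 ** (n + 2))
            for n, r in enumerate(rs, start=1)]

def total_length(ivs):
    """Sum of the lengths; an upper bound for the measure of the union."""
    return sum(b - a for a, b in ivs)

eps = Fraction(1, 100)
ivs = intervals(eps, 50)
print(total_length(ivs) < eps / 2)
```

The partial sums stay below $\varepsilon/2$ no matter how many intervals are taken, which is the reason the union has measure less than $\varepsilon$ although it contains every rational of $I$.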
4
The integral over a set
4.1 Unlike the additivity of the classes $L$ etc., we do not have similarly well-behaved multiplicativity properties. Nevertheless, multiplying by characteristic functions $c_M$ of measurable sets $M$ does give satisfactory results.
Proposition. Let $M$ be a measurable set and let $f \in L$. Then $c_M f \in L$.
Proof. Put $\varphi_n = \min(n c_M, \max(f, -n c_M))$. Then $\varphi_n \in \Lambda$ and since $|\varphi_n| \le |f|$ we have $c_M f = \lim \varphi_n$ in $L$ by 2.5.1. □
4.2 By 4.1, we can define for a measurable set $M$ and $f \in L$,
$$\int_M f \overset{\mathrm{df}}{=} \int c_M f,$$
the integral of $f$ over $M$.
4.3 Proposition. Let $M_n$, $n = 1, 2, \dots$ be measurable.
(a) Let for $n \ne k$, $M_n, M_k$ be almost disjoint (i.e. $\mu(M_n \cap M_k) = 0$), $f \in \Lambda$, and assume that for $M = \bigcup M_n$, $\int_M f$ makes sense. Then
$$\int_M f = \sum_{n=1}^{\infty} \int_{M_n} f.$$
(b) Let $M_1 \subseteq M_2 \subseteq \cdots$, $M = \bigcup M_n$ and assume that $\int_M f$ makes sense. Then
$$\int_M f = \lim_n \int_{M_n} f.$$
(c) Let $M_1 \supseteq M_2 \supseteq \cdots$, $M = \bigcap M_n$ and assume that $\int_{M_1} f$ makes sense. Then
$$\int_M f = \lim_n \int_{M_n} f.$$
Proof. For $f \ge 0$ the statement immediately follows from Lebesgue's Monotone Convergence Theorem and the fact that the sum formula obviously holds for finitely many $M_n$. Thus, we have the equality for $f^+$ and $f^-$. Now if $\int_M f$ makes sense then by 2.7 one of $\int_M f^+$, $\int_M f^-$ is finite, and hence at least one of the series $\sum_{n=1}^{\infty} \int_{M_n} f^+$, $\sum_{n=1}^{\infty} \int_{M_n} f^-$ converges, and since the summands are non-negative, it converges absolutely. Thus,
$$\int_M f = \int_M f^+ - \int_M f^- = \sum_{n=1}^{\infty} \int_{M_n} f^+ - \sum_{n=1}^{\infty} \int_{M_n} f^- = \sum_{n=1}^{\infty} \int_{M_n} (f^+ - f^-),$$
the last reshuffling being made possible by the absolute convergence of at least one of the series (and the other's being a sum of non-negative numbers).
(b) Apply (a) for $M_1, M_2 \setminus M_1, M_3 \setminus M_2, \dots$.
(c) Set $N_n = M_1 \setminus M_n$. Then $M = M_1 \setminus \bigcup N_n$. Use (b). □
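A concrete instance of part (a) may help: take $f(x) = x$ on $M = (0, 1\rangle$, split into the disjoint pieces $M_n = (\frac{1}{n+1}, \frac{1}{n}\rangle$. The sketch below uses exact fractions; the function names are hypothetical and the closed form of $\int x$ over an interval is taken from elementary calculus.

```python
from fractions import Fraction

def integral_of_x(a, b):
    """Exact integral of f(x) = x over an interval with endpoints a <= b."""
    return (b * b - a * a) / 2

def partial_sum(count):
    """Sum of the integrals of f over M_1, ..., M_count,
    where M_n = (1/(n+1), 1/n]."""
    return sum(integral_of_x(Fraction(1, n + 1), Fraction(1, n))
               for n in range(1, count + 1))

whole = integral_of_x(Fraction(0), Fraction(1))  # integral over (0, 1]
for count in (1, 10, 100):
    # the gap equals the integral over the remaining piece (0, 1/(count+1)]
    print(count, whole - partial_sum(count))
```

The gap between the partial sums and $\int_M f = \frac12$ is exactly $\frac{1}{2(N+1)^2}$ after $N$ pieces, so the series converges to the integral over $M$ as the proposition states.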
4.3.1 Remark. For the general statement, the assumption that $\int_M f$ make sense is essential. The point is that we could have both $\int_M f^+$ and $\int_M f^-$ infinite.
4.4
Criteria of measurability
For many purposes, we need a criterion by which sets and functions are measurable. Let us begin with the following definition: For a Borel set $X \subseteq \mathbb{R}^m$, a function $f : X \to \langle -\infty, \infty\rangle$ is called Borel measurable if
For every Borel $S \subseteq \langle -\infty, \infty\rangle$, $f^{-1}[S]$ is Borel. (+)
Theorem. A function $f : X \to \langle -\infty, \infty\rangle$ is (Lebesgue) measurable if and only if there exists a Borel measurable function equal to $f$ almost everywhere.
4.4.1 Corollary. A subset $S \subseteq \mathbb{R}^m$ is measurable if and only if there exists a Borel set $B \subseteq \mathbb{R}^m$ such that $S \setminus B$ and $B \setminus S$ are sets of measure 0.
Comment: Note that since the inverse image preserves unions, intersections and complements, we may equivalently replace every Borel set $S$ in (+) by either every interval $\langle -\infty, a)$, $a \in \mathbb{R}$ or every interval $(a, \infty\rangle$, $a \in \mathbb{R}$.
Proof of the Theorem: We begin by considering the easy implication. First, suppose $f$ is Borel measurable. Then so are $f^+$ and $f^-$, so by Proposition 2.5, we may assume $f \ge 0$. Then define
$$f_n(x) = \frac{k}{2^n} \quad \text{when } \frac{k}{2^n} \le f(x) < \frac{k+1}{2^n}. \qquad (*)$$
Then clearly $f_n \nearrow f$. Further, each $f_n$ is an increasing limit of a sequence of functions each of which takes on only finitely many values, the inverse images of which are Borel, and hence measurable sets. Therefore $f_n \in L^{up}$, and hence $f \in L^{up}$. Now a function equal to $f$ almost everywhere is measurable by Corollary 8.6 of Chapter 4. To prove the converse implication, we first prove some lemmas.
4.4.2 Lemma. If $f_n \nearrow f$ or $f_n \searrow f$ and the functions $f_n$ are Borel-measurable, so is $f$.
Proof. Consider $f_n \nearrow f$ (the case of $f_n \searrow f$ clearly follows by taking negatives). Note that $f(x) > a$ if and only if there exists an $n$ such that $f_n(x) > a$, so $f^{-1}[(a, \infty\rangle] = \bigcup_n f_n^{-1}[(a, \infty\rangle]$, so our statement follows from the Comment. □
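The dyadic approximation (*) from the proof of the Theorem is easy to compute at a single point, which makes the claim $f_n \nearrow f$ tangible. The sketch below is an illustration only; the name `dyadic_lower` is hypothetical, and floating-point numbers stand in for real values of $f(x) \ge 0$.

```python
import math

def dyadic_lower(value, n):
    """Value f_n(x) from (*): the largest k/2^n with k/2^n <= value,
    for value = f(x) >= 0. An infinite value is returned unchanged."""
    if math.isinf(value):
        return value
    return math.floor(value * 2 ** n) / 2 ** n

value = math.pi  # plays the role of f(x) at one fixed point x
for n in range(6):
    print(n, dyadic_lower(value, n))
```

Each refinement step either keeps the previous value or adds $2^{-(n+1)}$, so the sequence is non-decreasing and stays within $2^{-n}$ of $f(x)$.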
4.4.3 Lemma. If $f \ge 0$, $f$ is Borel measurable, and $\int f = 0$, then $f = 0$ almost everywhere.
Proof. Otherwise, $\mu(f^{-1}[(1/n, \infty\rangle]) > 0$ for some $n = 1, 2, \dots$. But then
$$\int f \ge \int_{f^{-1}[(1/n, \infty\rangle]} f \ge \frac{1}{n}\, \mu(f^{-1}[(1/n, \infty\rangle]) > 0.$$
A contradiction. □
4.4.4 Lemma. If $f$, $g$ are Borel measurable, so is $-f$ and, if $f, g \ge 0$, also $f + g$.
Proof. The statement for $-f$ is immediate. For $f + g$, note that $(f + g)(x) < a$ if and only if there exist rational numbers $q$, $r$ such that $f(x) < q$, $g(x) < r$ and $q + r < a$ and thus, $(f + g)^{-1}[(-\infty, a)]$ is the (countable) union of the Borel sets $f^{-1}[(-\infty, q)] \cap g^{-1}[(-\infty, r)]$. □
Now let $f$ be measurable. Then by Lemma 4.4 of Chapter 4, and Proposition 2.5, it suffices to prove the statement for $f^+$, $f^-$, and hence, by Lemma 4.4.2, for $f \in L$. When $f \in L$, by 4.7.5, there exist $g_n \in Z^{dn}$ such that $g_n \le g_{n+1} \le f$ and
$$\int g_n \nearrow \int f.$$
Similarly, there exist $h_n \in Z^{up}$ such that $h_n \ge h_{n+1} \ge f$ and
$$\int h_n \searrow \int f.$$
By the Comment, functions in $Z^{up}$ and $Z^{dn}$ are clearly Borel-measurable, so if we put $g = \lim g_n$, $h = \lim h_n$, $g$ and $h$ are Borel-measurable functions,
$g \le f \le h$ and
$$\int (h - g) = 0.$$
By Lemma 4.4.4, $h - g$ is Borel measurable. Let $B = \{x \mid (h - g)(x) = 0\}$. By Lemma 4.4.3, $X \setminus B$ is a set of measure 0. Therefore, we can take $h$ as the Borel measurable function required by the statement. □
4.5 Corollary. A function $f : \mathbb{R}^m \to \langle -\infty, \infty\rangle$ is measurable if and only if for every interval $B = (a, \infty\rangle$ (alternately, every interval $B = \langle -\infty, a)$), $f^{-1}[B]$ is measurable.
Proof. If $f$ is measurable then, by Theorem 4.4, it is equal to a Borel-measurable function almost everywhere, and hence clearly satisfies our condition by Proposition 8.2 (1) of Chapter 4. If, on the other hand, $f$ satisfies our criterion then, as in the Comment above, $f^{-1}[B]$ is Lebesgue measurable for every Borel set $B$. As above, we may pass to the functions $f^+$ and $f^-$, and hence may assume that $f \ge 0$. Now the formula (*) again produces an increasing sequence of measurable functions converging to $f$, and hence $f$ is measurable. □
5
Parameters
5.1 Theorem. Let $T$ be a metric space, $t_0 \in T$, and let $f : T \times \mathbb{R}^m \to \mathbb{R} \cup \{+\infty, -\infty\}$ be a function such that
(1) for almost all $x$, $f(-, x)$ is continuous at the point $t_0$,
(2) there is a neighborhood $U$ of $t_0$ such that the functions $f(t, -)$ belong to $L$ for all $t \in U \setminus \{t_0\}$, and
(3) there exists a $g \in L$ and a neighborhood $U$ of $t_0$ such that for almost all $x$ and for all $t \in U \setminus \{t_0\}$ one has $|f(t, x)| \le g(x)$.
Then $f(t_0, -)$ is in $L$ and we have
$$\int f(t_0, -) = \lim_{t \to t_0} \int f(t, -).$$
Proof. Choose $t_n \in U \setminus \{t_0\}$ such that $\lim_n t_n = t_0$ and use the Lebesgue Dominated Convergence Theorem. □
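5.2 Theorem. Let $f : \mathbb{R} \times \mathbb{R}^m \to \mathbb{R} \cup \{+\infty, -\infty\}$ be such that in a neighborhood $U$ of $t_0$
(1) there exist partial derivatives $\frac{\partial f(t, x)}{\partial t}$ for almost all $x$,
(2) there exists a $g \in L$ such that for almost all $x$ and for all $t \in U$ one has
$$\left|\frac{\partial f(t, x)}{\partial t}\right| \le g(x),$$
(3) and for $t \in U$ there exists $\int f(t, -)$.
Then there exists the integral $\int \frac{\partial f(t_0, -)}{\partial t}$ and one has
$$\int \frac{\partial f(t_0, -)}{\partial t} = \frac{d}{dt} \int f(t_0, -).$$
Proof. We have $\frac{\partial f(t_0, x)}{\partial t} = \lim_{h \to 0} \frac{1}{h}(f(t_0 + h, x) - f(t_0, x))$. Set $\varphi(h, x) = \frac{1}{h}(f(t_0 + h, x) - f(t_0, x))$. By Lagrange's Theorem we have, for some $0 < \theta < 1$ depending on $h$ and $x$,
$$|\varphi(h, x)| = \left|\frac{\partial f(t_0 + \theta h, x)}{\partial t}\right| \le g(x)$$
and hence we can apply Theorem 5.1. □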
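Theorem 5.2 can be tried out numerically on a simple case. The sketch below assumes the example $f(t, x) = \sin(tx)$ for $x \in \langle 0, 1\rangle$ (and $0$ elsewhere), for which $|\partial f / \partial t| \le 1$ on the unit interval, so the characteristic function of $\langle 0, 1\rangle$ serves as the dominating $g$. The midpoint rule and the central difference quotient are stand-ins chosen for illustration; they are not part of the theorem.

```python
import math

def midpoint_integral(func, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def integral_of_f(t):
    """Approximates the integral of f(t, -) = sin(t x) over [0, 1]."""
    return midpoint_integral(lambda x: math.sin(t * x), 0.0, 1.0)

def integral_of_derivative(t):
    """Approximates the integral of df/dt = x cos(t x) over [0, 1]."""
    return midpoint_integral(lambda x: x * math.cos(t * x), 0.0, 1.0)

def difference_quotient(t, h=1e-4):
    """Central difference approximation of d/dt of the integral of f(t, -)."""
    return (integral_of_f(t + h) - integral_of_f(t - h)) / (2 * h)

t0 = 1.3
print(integral_of_derivative(t0), difference_quotient(t0))
```

Both numbers approximate the derivative of $(1 - \cos t)/t$ at $t_0$, which is the closed form of $\int_0^1 \sin(tx)\,dx$.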
6
Fubini's Theorem
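In this section we will have to indicate the dimension of the Euclidean space we work in. When working in $\mathbb{R}^m$, we will decorate the symbols $Z, Z^{up}, L$ etc. with subscripts: $Z_m, Z^{up}_m, L_m$ etc., and for the integral symbols we will use $\int^{(m)}, \overline{\int}^{(m)}, \underline{\int}^{(m)}$ instead of $\int, \overline{\int}, \underline{\int}$. We will abandon the integral symbol $I$ since we already know that for $f \in Z$ we have $If = \int f$. Finally, to avoid confusion in the case of two variables we will sometimes use the classical
$$\int f(x, y)\,dy \ \text{ or } \ \int f(x, y)\,dx \quad \text{for} \quad \int f(x, -) \ \text{ or } \ \int f(-, y).$$
6.1 Lemma. For a function $f$ defined on $\mathbb{R}^{m+n}$ define functions $\overline{F}$ and $\underline{F}$ on $\mathbb{R}^m$ by setting
$$\overline{F}(x) = \overline{\int}^{(n)} f(x, y)\,dy \quad \Big(\text{resp. } \underline{F}(x) = \underline{\int}^{(n)} f(x, y)\,dy\Big).$$
Then one has
$$\overline{\int}^{(m+n)} f \ge \overline{\int}^{(m)} \overline{F} \quad \Big(\text{resp. } \underline{\int}^{(m+n)} f \le \underline{\int}^{(m)} \underline{F}\Big).$$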
Proof. I. If $f \in Z_{m+n}$ then we have equalities, by the case of Fubini's Theorem for the Riemann integral of continuous maps on compact intervals. Furthermore, when $\overline{F} = \underline{F} = F$, we have $F \in Z_m$: Indeed, choose a compact interval $J$ carrying the function $f$. The function $F$ obviously has compact support, contained in the projection of $J$ (the values elsewhere are integrals of 0). Further, let $K$ be the volume of $J$. For an $\varepsilon > 0$ there exists a $\delta > 0$ such that for $\rho(x, x') < \delta$, we have $|f(x, y) - f(x', y)| < \frac{\varepsilon}{K}$, independently of $y$. Therefore, we have
$$|F(x) - F(x')| < \frac{\varepsilon}{K} \cdot K = \varepsilon,$$
and $F$ is continuous.
II. Now let $f_k \in Z_{m+n}$, $f_k \nearrow_k f$. Then
$$f_k(x, -) \nearrow f(x, -)$$
for all $x$. Therefore, we still have
$$F_k(x) = \int^{(n)} f_k(x, y)\,dy \nearrow \int^{(n)} f(x, y)\,dy = F(x)$$
and also
$$\int^{(m+n)} f = \lim_k \int^{(m+n)} f_k = \lim_k \int^{(m)} F_k = \int^{(m)} F.$$
III. Now let $f$ be general and let $g \in Z^{up}$ be such that $g \ge f$. Put $G(x) = \int^{(n)} g(x, y)\,dy$. Then $G \ge \overline{F}$, and by II we have
$$\int^{(m+n)} g = \int^{(m)} G \ge \overline{\int}^{(m)} \overline{F}$$
and hence
$$\overline{\int} f = \inf\Big\{\int g \ \Big|\ g \in Z^{up},\ g \ge f\Big\} \ge \overline{\int}\, \overline{F}.$$
□
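Theorem. (Fubini) Let $f \in L_{m+n}$. Then for almost all $x$ there exists the integral $\int^{(n)} f(x, y)\,dy$. If we denote its value by $F(x)$, and define the values $F(x)$ arbitrarily in the remaining points, we have $F \in L_m$ and
$$\int^{(m+n)} f = \int^{(m)} F.$$
Proof. Put $\overline{F}(x) = \overline{\int} f(x, y)\,dy$ and $\underline{F}(x) = \underline{\int} f(x, y)\,dy$. By Lemma 6.1, we have
$$\int f = \overline{\int} f \ge \overline{\int}\,\overline{F} \ge \left\{ \begin{matrix} \overline{\int}\,\underline{F} \\ \underline{\int}\,\overline{F} \end{matrix} \right\} \ge \underline{\int}\,\underline{F} \ge \underline{\int} f = \int f.$$
Let $f$ be in $L_{m+n}$ with a finite integral. Then the values are finite and we obtain, first of all, that $\overline{\int}\,\overline{F} = \underline{\int}\,\overline{F}$ is finite and hence $\overline{F} \in L_m$, and similarly $\underline{F} \in L_m$. Further, $\int \overline{F} = \int \underline{F}$ and hence $\int (\overline{F} - \underline{F}) = 0$ and hence $\overline{F} - \underline{F} = 0$ almost everywhere, by 4.7. For $f$ with an infinite integral, use Lebesgue's Monotone Convergence Theorem. □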
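As a small numerical illustration of Fubini's Theorem, one can integrate a continuous function over a rectangle in both orders. The sketch assumes the example $f(x, y) = x y^2$ on $\langle 0, 1\rangle \times \langle 0, 2\rangle$ (extended by $0$), whose integral is $\frac12 \cdot \frac83 = \frac43$; the midpoint rule is only an approximation device chosen here.

```python
def midpoint_integral(func, a, b, steps=400):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def f(x, y):
    return x * y ** 2

def inner_over_y(x):
    """F(x): integral of f(x, y) dy over [0, 2]."""
    return midpoint_integral(lambda y: f(x, y), 0.0, 2.0)

def inner_over_x(y):
    """Integral of f(x, y) dx over [0, 1]."""
    return midpoint_integral(lambda x: f(x, y), 0.0, 1.0)

dy_first = midpoint_integral(inner_over_y, 0.0, 1.0)  # integrate F over x
dx_first = midpoint_integral(inner_over_x, 0.0, 2.0)  # the other order
print(dy_first, dx_first)
```

Both orders of integration agree with each other and with $\frac43$ up to the discretization error of the rule.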
7
The Substitution Theorem
In this section, we will prove a substitution theorem for multivariable integrals. The reader should be aware that a much more general substitution theorem is valid (see [18]). In this text, we would basically be happy with a substitution theorem for the Riemann integral of a continuous bounded function where the coordinate change is a diffeomorphism with bounded partial derivatives (as needed, for example, in the Stokes Theorem in Chapter 12 below). However, we will typically need to integrate over Borel sets, which makes the Lebesgue integral relevant. The purpose of this section is to give a rigorous, but otherwise as straightforward as possible, proof of the version of the theorem needed here.
7.1 Recall the set $\mathcal{B}$ of all Borel sets in $\mathbb{R}^m$. Let $U \subseteq \mathbb{R}^m$ be an open set. Define
$$\mathcal{B}_U = \{S \in \mathcal{B} \mid S \subseteq U\}.$$
Note that clearly, $\mathcal{B}_U$ is the smallest set of subsets of $U$ closed under complements and countable unions, which contains all open subsets of $U$. Let us also write
$$\mathcal{I}_U = \{\langle a_1, b_1) \times \dots \times \langle a_n, b_n) \mid \langle a_1, b_1\rangle \times \dots \times \langle a_n, b_n\rangle \subseteq U\}.$$
7.2 Lemma. Let $U \subseteq \mathbb{R}^m$ be open and let $S = S_0 \in \mathcal{I}_U$. Then there exist $S_1, S_2, \dots, S_n, \dots$ such that $i \ne j \Rightarrow S_i \cap S_j = \emptyset$ for $i, j = 0, 1, 2, \dots$ and
$$U = \bigcup_{i=0}^{\infty} S_i.$$
Proof. Let $S_0 = \langle a_1, b_1) \times \dots \times \langle a_n, b_n)$. Assume $S_0$ is non-empty. (If $S_0$ is empty, choose $S_1 \in \mathcal{I}_U$ arbitrary and proceed with $S_1$ in place of $S_0$.) Let $d_i(k) = (b_i - a_i)/2^k$. Assuming $S_0 \ne \emptyset$, let $T_0 = \{S_0\}$. Suppose we have already defined $T_0, \dots, T_{k-1}$. Let $T_k$ be the set of all
$$\langle r_1, s_1) \times \dots \times \langle r_n, s_n) \subseteq U \setminus (T_0 \cup \dots \cup T_{k-1})$$
where $s_i = r_i + d_i(k)$, $r_i = a_i + \ell_i d_i(k)$ for some $\ell_i \in \mathbb{Z}$, $i = 1, \dots, n$. Let
$$\{S_1, S_2, \dots\} = T_1 \cup T_2 \cup \dots.$$
By definition, the $S_i$'s are disjoint and one easily checks that $\bigcup S_i$ is open and closed in $U$. □
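7.3 Lemma. For $S \in \mathcal{I}_U$, there exist open sets $U \supseteq V_1 \supseteq \dots \supseteq V_k \supseteq \dots$ such that
$$S = \bigcap_{k=1}^{\infty} V_k.$$
Proof. Using the same notation as in Lemma 7.2, take
$$V_k = \big(a_1 - \tfrac{1}{k}, b_1\big) \times \dots \times \big(a_n - \tfrac{1}{k}, b_n\big).$$
□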
7.4 Proposition. Let $\mathcal{S}_U$ be the smallest set of subsets of $U$ which satisfies
(1) $\mathcal{I}_U \subseteq \mathcal{S}_U$;
(2) When $S_1, S_2, \dots \in \mathcal{S}_U$ are disjoint, then
$$\bigcup_{i=1}^{\infty} S_i \in \mathcal{S}_U; \qquad (+)$$
(3) When $S \in \mathcal{S}_U$, we have $U \setminus S \in \mathcal{S}_U$.
Then $\mathcal{S}_U = \mathcal{B}_U$.
Comment: This proposition is a special case of a more abstract theorem known as Dynkin's Lemma. The proof is essentially the same; the greater generality would be of no use to us.
Proof. Let $S \in \mathcal{S}_U$. Let
$$\mathcal{S}_U(S) = \{T \in \mathcal{S}_U \mid S \cap T \in \mathcal{S}_U\}. \qquad (7.4.1)$$
Step 1: If $S \in \mathcal{S}_U$, then the conditions (2) and (3) above hold with $\mathcal{S}_U$ replaced by $\mathcal{S}_U(S)$.
Proof. (2) is trivial by distributivity. To prove (3), when $S, T, S \cap T \in \mathcal{S}_U$, then
$$S \cap (U \setminus T) = U \setminus ((U \setminus S) \cup (S \cap T)) \in \mathcal{S}_U,$$
since $(U \setminus S) \cap (S \cap T) = \emptyset$. □
Step 2: When $S \in \mathcal{I}_U$, clearly $\mathcal{I}_U \subseteq \mathcal{S}_U(S)$. Therefore, by Step 1, $\mathcal{S}_U(S) = \mathcal{S}_U$.
Step 3: Now let $S \in \mathcal{S}_U$. By Step 2, $\mathcal{I}_U \subseteq \mathcal{S}_U(S)$. Therefore, by Step 1, $\mathcal{S}_U(S) = \mathcal{S}_U$.
Step 4: By Step 3 (note (7.4.1)) and (2), (+) holds for any $S_1, S_2, \dots \in \mathcal{S}_U$ (without assuming disjointness). By Lemma 7.2 (with $S_0 = \emptyset$ and $U$ replaced by $V$), every open subset $V \subseteq U$ satisfies $V \in \mathcal{S}_U$. Therefore, $\mathcal{B}_U \subseteq \mathcal{S}_U$.
Step 5: Note that by Lemma 7.3, $\mathcal{I}_U \subseteq \mathcal{B}_U$, and hence, by definition, $\mathcal{S}_U \subseteq \mathcal{B}_U$. □
7.5
Assumption
Assume now $U \subseteq \mathbb{R}^m$ is an open set, and $F : U \to \mathbb{R}^m$ is an injective map with continuous first partial derivatives which satisfies
$$\det(DF_x) \ne 0 \quad \text{for all } x \in U.$$
(Then $F$ is regular, and by 7.2, 7.3 of Chapter 3, its image is open and its inverse also satisfies the Assumption.) Recall 3.2 of Chapter 3 for a discussion of $DF_x$. The attentive reader has noticed that $\det(DF_x)$ is a special case of the Jacobian considered in 6.2 of Chapter 3 when the variables $x$ of 6.2 of Chapter 3 are not present and $y$ is labeled as $x$. Many texts, in fact, reserve the term for this special case.
7.6 Lemma. Let $S \in \mathcal{I}_U$. Then
$$\mu(F[S]) \le \int_S |\det(DF_x)|\,dx. \qquad (*)$$
Proof. Note first that by Lemma 7.3 and the fact that $F$ is a homeomorphism onto its image, $F[S]$ is Borel. Next, one proves (*) in the case when $F$ is an affine map (see 5.9 of Appendix A). By the multiplicative property of the determinant with respect to composition, translation-invariance of Lebesgue measure, Fubini's Theorem and Gauss elimination, it then suffices to prove (*) for $n = 1$ (which is obvious) and for the map
$$\begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}. \qquad (+)$$
For the case of (+), since $\mu$ is clearly invariant under translation, it suffices to prove the statement for
$$S = \langle 0, b_1) \times \langle 0, b_2), \quad b_1, b_2 > 0.$$
Then
$$F[S] \subseteq \bigcup_{i=0}^{n-1} \Big\langle \frac{i a b_2}{n} - |a|\frac{b_2}{n},\ \frac{i a b_2}{n} + b_1 + |a|\frac{b_2}{n}\Big) \times \Big\langle \frac{i b_2}{n},\ \frac{(i+1) b_2}{n}\Big).$$
The Lebesgue measure of the right-hand side, with $n = 2^k$, $k \to \infty$, clearly approaches $b_1 b_2$, while $F[S]$ is an intersection of this decreasing sequence of sets.
For the case of general $F$ satisfying our assumption, by countable additivity, it suffices to consider the case when the $\mathbb{R}^m$-closure $\overline{S}$ of $S$ is contained in $U$. Then since the partial derivatives of $F$ are continuous on $\overline{S}$, they are uniformly continuous by Theorem 6.6 of Chapter 2. Therefore, for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for $a = (a_1, \dots, a_n) \in \overline{S}$, $y \in \langle a_1, b_1) \times \dots \times \langle a_n, b_n)$ and
$$0 < b_i - a_i < \delta, \qquad (7.6.1)$$
we have
$$\left|\frac{\partial F_i(y)}{\partial x_j} - \frac{\partial F_i(a)}{\partial x_j}\right| < \varepsilon.$$
By the Mean Value Theorem, then, assuming (7.6.1),
$$F[\langle a_1, b_1) \times \dots \times \langle a_n, b_n)]$$
is a subset of
$$F(a) + DF_a[\langle -\varepsilon(b_1 - a_1), (b_1 - a_1) + \varepsilon(b_1 - a_1)) \times \dots \times \langle -\varepsilon(b_n - a_n), (b_n - a_n) + \varepsilon(b_n - a_n))].$$
From the affine case, we conclude that
$$\mu(F[\langle a_1, b_1) \times \dots \times \langle a_n, b_n)]) \le (1 + 2\varepsilon)^m\, |\det(DF_a)|\, (b_1 - a_1) \cdots (b_n - a_n).$$
Since $\varepsilon > 0$ was arbitrary, our statement follows. □
7.7 Lemma. Let $S \in \mathcal{I}_U$ and let $f : F[U] \to \mathbb{R}$ be a non-negative continuous function. Then
$$\int_{F[S]} f \le \int_S (f \circ F)\,|\det(DF_x)|\,dx. \qquad (*)$$
This also holds with $S$ replaced by an open subset $V \subseteq U$.
Proof. Let $S = \langle a_1, b_1) \times \dots \times \langle a_n, b_n)$. Let, for integers $0 \le i_1 < 2^k, \dots, 0 \le i_n < 2^k$, $S_k(i_1, \dots, i_n)$ denote the set
$$\Big\langle a_1 + \frac{i_1 (b_1 - a_1)}{2^k},\ a_1 + \frac{(i_1 + 1)(b_1 - a_1)}{2^k}\Big) \times \dots \times \Big\langle a_n + \frac{i_n (b_n - a_n)}{2^k},\ a_n + \frac{(i_n + 1)(b_n - a_n)}{2^k}\Big).$$
Then define "step functions" $f_k$ by
$$f_k(x) = \inf_{z \in S_k(i_1, \dots, i_n)} f(z) \quad \text{for } x \in S_k(i_1, \dots, i_n).$$
Then $f_k \nearrow f$ and with $f$ replaced by $f_k$, the statement for $S \in \mathcal{I}_U$ holds by Lemma 7.6. For $V$ open, the statement holds by Lemma 7.2 (with $S_0 = \emptyset$, $U$ replaced by $V$). □
7.8 Proposition. For $V \subseteq U$ open, $f : F[U] \to \mathbb{R}$ non-negative continuous, we have
$$\int_{F[V]} f = \int_V (f \circ F)\,|\det(DF_x)|\,dx.$$
The statement also holds with $V$ replaced by $S \in \mathcal{I}_U$.
Proof. First note that the statement for $S \in \mathcal{I}_U$ follows from the statement for $V$ open by Lemma 7.3. For $V$ open, the inequality $\le$ follows from Lemma 7.7. The inequality $\ge$ follows from Lemma 7.7 with $f$ replaced by $(f \circ F)\,|\det(DF_x)|$, $F$ replaced by $F^{-1}$, $F[U]$ replaced by $U$ and $V$ replaced by $F[V]$ (recall that the set $F[U]$ is open). □
7.9 Theorem. (The Substitution Theorem) Let $F$ satisfy Assumption 7.5, and let $f : F[U] \to \mathbb{R}$ be a continuous function. Let $S \in \mathcal{B}_U$. Then
$$\int_{F[S]} f = \int_S (f \circ F)\,|\det(DF_x)|\,dx, \qquad (+)$$
provided that the integral on at least one side of the equation exists and is finite.
Proof. By considering $f^+ = \max(f, 0)$, $f^- = -\min(f, 0)$ (recall 5.2 of Chapter 4), we may assume $f \ge 0$. By Proposition 7.8 (for $S \in \mathcal{I}_U$), and by the additivity of the integral, we clearly have (+) for all $S \in \mathcal{S}_U$ and hence our statement follows from Proposition 7.4. □
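The Substitution Theorem can be checked numerically on a map with a non-constant determinant. The sketch below assumes the example $F(s, t) = (s, st)$ on $U = (0, \infty) \times \mathbb{R}$, which is injective with $\det(DF_{(s,t)}) = s \ne 0$, the set $S = \langle 1, 2) \times \langle 0, 1)$ and $f(u, v) = u + v$. Then $F[S] = \{(u, v) \mid 1 \le u < 2,\ 0 \le v < u\}$, and the left-hand side is computed by slicing this set (Fubini). All names and the midpoint rule are choices made for illustration.

```python
def midpoint_integral(func, a, b, steps=200):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def f(u, v):
    return u + v

def F(s, t):
    return (s, s * t)

def abs_det_DF(s, t):
    # DF = [[1, 0], [t, s]], so det(DF) = s
    return abs(s)

# Left-hand side: integral of f over F[S] = {1 <= u < 2, 0 <= v < u}.
left = midpoint_integral(
    lambda u: midpoint_integral(lambda v: f(u, v), 0.0, u), 1.0, 2.0)

# Right-hand side: integral over S of (f o F) |det DF|.
right = midpoint_integral(
    lambda s: midpoint_integral(
        lambda t: f(*F(s, t)) * abs_det_DF(s, t), 0.0, 1.0), 1.0, 2.0)

print(left, right)
```

Both sides equal $\int_1^2 \frac32 u^2\,du = \frac72$ in closed form, and the two approximations agree up to discretization error.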
8
Hölder's inequality, Minkowski's inequality and $L^p$-spaces
In this section, we will introduce $L^p$-spaces, $1 \le p \le \infty$, which are a very basic source of examples in analysis. The true significance of those spaces in mathematics will emerge in Chapters 16 and 17 below. However, their definition and basic properties are often used throughout analysis, and thus now is a good place to treat them. In this section, let $B$ be a Borel subset of $\mathbb{R}^n$. Let $f$ be a real measurable function defined on $B$. We will write (assuming that the right-hand side is finite) for $1 \le p < \infty$,
$$\|f\|_p = \Big(\int_B |f|^p\Big)^{\frac{1}{p}}.$$
In this section, we will tend to write $\int$ instead of $\int_B$, since the set $B$ will not change. For $p = \infty$, one defines
$$\|f\|_\infty = \inf\{M \ge 0 \mid |f(x)| \le M \text{ almost everywhere on } B\},$$
again, assuming this number is finite.
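8.1 Theorem. (Hölder's inequality) Let $p, q > 1$ and let $\frac{1}{p} + \frac{1}{q} = 1$. We have
$$\int |fg| \le \|f\|_p \|g\|_q.$$
Proof. Put $\alpha = \|f\|_p$, $\beta = \|g\|_q$. Then
$$\int \Big|\frac{1}{\alpha} f\Big|^p = \int \Big|\frac{1}{\beta} g\Big|^q = 1.$$
Set $\tilde{f} = \frac{1}{\alpha} f$ and $\tilde{g} = \frac{1}{\beta} g$. By Young's inequality 4.5.3 of Chapter 1 we have
$$|\tilde{f}(x)\tilde{g}(x)| \le \frac{|\tilde{f}(x)|^p}{p} + \frac{|\tilde{g}(x)|^q}{q},$$
and hence
$$\frac{1}{\alpha\beta} \int |fg| = \int |\tilde{f}\tilde{g}| \le \frac{1}{p} \int |\tilde{f}|^p + \frac{1}{q} \int |\tilde{g}|^q = \frac{1}{p} + \frac{1}{q} = 1,$$
and finally
$$\int |fg| \le \alpha\beta = \|f\|_p \|g\|_q.$$
□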
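For step functions on $B = \langle 0, 1)$ the integrals in Hölder's inequality are finite sums, so the inequality can be checked directly. The sketch assumes step functions given by their values on equal subintervals; `p_norm` and `integral_abs_product` are hypothetical helper names, and the sample values are arbitrary.

```python
def p_norm(values, p):
    """||f||_p for the step function on [0, 1) that takes values[i] on the
    i-th of len(values) equal subintervals."""
    n = len(values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

def integral_abs_product(f_values, g_values):
    """Integral of |f g| over [0, 1) for two step functions on the same grid."""
    n = len(f_values)
    return sum(abs(a * b) for a, b in zip(f_values, g_values)) / n

f_values = [1.0, -2.0, 0.5, 3.0]
g_values = [2.0, 1.0, -4.0, 0.25]
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1

lhs = integral_abs_product(f_values, g_values)
rhs = p_norm(f_values, p) * p_norm(g_values, q)
print(lhs <= rhs)
```

Replacing `g_values` by `[abs(v) ** (p - 1) for v in f_values]` makes $|g|^q = |f|^p$, and then the two sides coincide up to rounding.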
8.1.1 Observation. If $|f|^p$ and $|g|^q$ are linearly dependent, then
$$\int |fg| = \|f\|_p \|g\|_q.$$
Remark: The equality holds if and only if the functions are dependent, but we will not need the other implication.
Proof. Let, say, $|g|^q = \alpha |f|^p$. Then
$$\|g\|_q = \Big(\int |g|^q\Big)^{\frac{1}{q}} = \alpha^{\frac{1}{q}} \Big(\int |f|^p\Big)^{\frac{1}{q}} = \alpha^{\frac{1}{q}} (\|f\|_p)^{\frac{p}{q}}$$
and hence
$$\|f\|_p \|g\|_q = \alpha^{\frac{1}{q}} (\|f\|_p)^{1 + \frac{p}{q}} = \alpha^{\frac{1}{q}} (\|f\|_p^p)^{\frac{p+q}{pq}} = \alpha^{\frac{1}{q}} \|f\|_p^p.$$
On the other hand we also have
$$\int |f||g| = \int \big(|f|\, \alpha^{\frac{1}{q}} |f|^{\frac{p}{q}}\big) = \alpha^{\frac{1}{q}} \int |f|^{\frac{p}{q} + 1} = \alpha^{\frac{1}{q}} \int |f|^{p(\frac{1}{q} + \frac{1}{p})} = \alpha^{\frac{1}{q}} \|f\|_p^p.$$
□
8.2 Theorem. (Minkowski's inequality) We have, for $1 \le p \le \infty$,
$$\|f + g\|_p \le \|f\|_p + \|g\|_p$$
whenever the right-hand side is defined.
Proof. The inequality is obvious for $p = 1$ and $p = \infty$, hence we can assume that $\infty > p > 1$. Recall Proposition 4.5.2 of Chapter 1. For $p \ge 1$ and $x \ge 0$, the function $h(x) = x^p$ is convex (since $h''(x) = p(p-1)x^{p-2} \ge 0$) and hence we have
$$|f + g|^p \le \Big(\frac{1}{2}|2f| + \frac{1}{2}|2g|\Big)^p \le \frac{1}{2}|2f|^p + \frac{1}{2}|2g|^p = 2^{p-1}|f|^p + 2^{p-1}|g|^p.$$
Thus, first, if the integrals $\int |f|^p$ and $\int |g|^p$ are finite, also the integral of the sum $\int |f + g|^p$ is finite, and $\|f + g\|_p$ makes sense. If it is zero then the inequality holds. Thus suppose it is not zero. We have
$$(\|f + g\|_p)^p = \int |f + g|^p \le \int (|f| + |g|)|f + g|^{p-1} = \int |f||f + g|^{p-1} + \int |g||f + g|^{p-1}.$$
Proceed, using Hölder's inequality, taking into account that $\frac{1}{q} = 1 - \frac{1}{p} = \frac{p-1}{p}$ and hence $q = \frac{p}{p-1}$:
$$\dots \le \Big(\Big(\int |f|^p\Big)^{\frac{1}{p}} + \Big(\int |g|^p\Big)^{\frac{1}{p}}\Big) \Big(\int |f + g|^{(p-1)\frac{p}{p-1}}\Big)^{\frac{p-1}{p}} = (\|f\|_p + \|g\|_p)(\|f + g\|_p)^{p-1}.$$
Hence
$$(\|f + g\|_p)^p \le (\|f\|_p + \|g\|_p)(\|f + g\|_p)^{p-1}$$
and Minkowski's inequality follows dividing both sides by $(\|f + g\|_p)^{p-1}$. □
8.3
The definition of $L^p$
Denote by $\mathcal{L}^p(B)$ the set of all measurable functions on $B$ for which
$$\|f\|_p < \infty.$$
By Theorem 8.2, $\mathcal{L}^p(B)$ is a vector space over $\mathbb{R}$, and it may appear that
$$\|f - g\|_p \qquad (8.3.1)$$
therefore defines a norm on $\mathcal{L}^p(B)$ in the sense of 1.2.1 of Chapter 2. This is, however, not true for the simple reason that two functions $f, g$ which are equal almost everywhere have distance 0! It is immediately obvious, on the other hand, that the converse is also true, since we have the following fact.
8.3.1 Lemma. If $f : B \to [0, \infty]$ and $\int_B f = 0$, then $f = 0$ almost everywhere on $B$.
Proof. Let, for $\varepsilon > 0$, $E_\varepsilon = \{x \in B \mid f(x) > \varepsilon\}$. Then clearly $\int_B f \ge \varepsilon \mu(E_\varepsilon)$, so $\mu(E_\varepsilon) = 0$. The set $E = E_{1/1} \cup E_{1/2} \cup \dots \cup E_{1/n} \cup \dots$ therefore satisfies $\mu(E) = 0$, but we have $E = \{x \in B \mid f(x) \ne 0\}$. □
Thus, we see that (8.3.1) gives a well-defined norm on the quotient space
$$L^p(B) = \mathcal{L}^p(B)/\mathcal{L}_0$$
where $\mathcal{L}_0$ is the subspace of functions which are 0 almost everywhere. (See Section 6 of Appendix A for the definition of a quotient vector space.) More precisely, the formula (8.3.1) is applied to representatives $f$, $g$ of two equivalence classes constituting the quotient space $L^p(B)$, but does not depend on the choice of representatives. Additionally, by what we just observed, the distance of two equivalence classes which are not equal cannot be 0. In the context of the normed vector spaces $L^p(B)$, it is common to identify a function $f$ notationally with the coset to which it belongs, i.e. to write $f \in L^p(B)$. This slight imprecision does not tend to cause difficulties.
8.4
A comment on complex functions
Sometimes, we are interested in an analogue of the $L^p$-spaces for complex functions. In this context, the following simple result is useful:
8.4.1 Lemma. Let $f : B \to \mathbb{C}$ be an integrable function. Then
$$\Big|\int_B f\Big| \le \int_B |f|.$$
Proof. Let $\alpha$ be such that $|\alpha| = 1$ and $\alpha \int_B f = |\int_B f|$. Then
$$\Big|\int_B f\Big| = \alpha \int_B f = \int_B \alpha f = \int_B \mathrm{Re}(\alpha f) \le \int_B |f|.$$
(The last equality follows from the fact that $\int_B \alpha f$ is real.) □
Therefore, Minkowski's inequality also holds for complex-valued functions by the following argument:
$$\|f + g\|_p \le \|\,|f| + |g|\,\|_p \le \|\,|f|\,\|_p + \|\,|g|\,\|_p = \|f\|_p + \|g\|_p.$$
The case of $p = \infty$ needs a separate (easy) discussion, see Exercise (17). Note that a complex analogue of Hölder's inequality follows from the real case immediately. Thus, we can define the normed vector spaces $L^p(B, \mathbb{C})$, $1 \le p \le \infty$ completely analogously as the spaces $L^p(B)$, with real functions replaced by complex ones.
8.5
Completeness of the spaces $L^p$
8.5.1 Lemma. (Fatou's Lemma) Let $f_n : B \to [0, \infty]$ be measurable functions. Then
$$\int_B (\liminf_{n \to \infty} f_n) \le \liminf_{n \to \infty} \int_B f_n.$$
Proof. Let $g_n = \inf_{m \ge n} f_m$. We have
$$\int_B g_n \le \inf_{m \ge n} \int_B f_m,$$
while $g_n \nearrow \liminf_{n \to \infty} f_n$, so the statement follows by passing to the limit by the Lebesgue Monotone Convergence Theorem. □
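The inequality in Fatou's Lemma can be strict. A standard example, assumed here for illustration, is $f_n = n$ on the open interval $(0, \frac{1}{n})$ and $0$ elsewhere: every $\int f_n$ equals $1$, while $\liminf f_n(x) = 0$ at every point. The sketch below checks this with exact fractions; the helper names are hypothetical.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n on the open interval (0, 1/n), and 0 elsewhere."""
    return n if 0 < x < Fraction(1, n) else 0

def integral_f(n):
    """Exact integral of f_n: height n times length 1/n."""
    return n * Fraction(1, n)

def tail_values(x, start, stop):
    """Values f_n(x) for start <= n < stop, to inspect the liminf at x."""
    return [f(n, x) for n in range(start, stop)]

x = Fraction(1, 7)
print(integral_f(5), tail_values(x, 7, 12))
```

For a fixed $x > 0$ the values $f_n(x)$ vanish as soon as $\frac{1}{n} \le x$, so the left-hand side of the lemma is $0$ while the right-hand side is $1$.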
8.5.2 Theorem. The spaces $L^p(B)$ and $L^p(B, \mathbb{C})$, $1 \le p \le \infty$, are complete metric spaces.
Proof. Consider, for example, the complex case (the proof in the real case is the same). Let $f_n : B \to \mathbb{C}$ represent a Cauchy sequence in $L^p$. Then there exist $n_1 < n_2 < \dots < n_k < \dots$ such that
$$\sum_{k=1}^{\infty} \|f_{n_k} - f_{n_{k+1}}\|_p < \infty.$$
For $p < \infty$, this means (by Minkowski's inequality and the Monotone Convergence Theorem, which give $\|\sum_k |f_{n_k} - f_{n_{k+1}}|\,\|_p \le \sum_k \|f_{n_k} - f_{n_{k+1}}\|_p$) that
$$\sum_{k=1}^{\infty} |f_{n_k}(x) - f_{n_{k+1}}(x)| < \infty$$
almost everywhere, so $(f_{n_k}(x))_k$ is a Cauchy sequence in $\mathbb{C}$ almost everywhere in $x \in B$, so the sequence of functions $f_{n_k}$ converges in a set $S \subseteq B$ such that $\mu(B \setminus S) = 0$. In the case $p = \infty$, the same conclusion also holds, and moreover, in that case, the convergence is uniform (Exercise (18)). Now let $f(x) = \lim f_{n_k}(x)$ for $x \in S$, and $f(x) = 0$ for $x \in B \setminus S$. In the case of $p = \infty$, we are done. For $p < \infty$, by Fatou's Lemma 8.5.1,
$$\int_B |f_n - f|^p \le \liminf_{k \to \infty} \int_B |f_n - f_{n_k}|^p. \qquad (8.5.1)$$
If we choose $n$ such that $\|f_n - f_m\|_p < \varepsilon$ for all $m \ge n$, then the right-hand side of (8.5.1) is at most $\varepsilon^p$. Hence the right-hand side of (8.5.1), and with it the left-hand side, converges to 0 with $n \to \infty$ because the sequence $f_n$ is Cauchy. □
8.6
An inequality between $L^p$ norms
8.6.1 Lemma. Let $1 < p$ and let $B \subseteq \mathbb{R}^n$ be a Borel subset such that $\mu(B) < \infty$. Then
$$\Big(\frac{1}{\mu(B)} \int_B |f(x)|\Big)^p \le \frac{1}{\mu(B)} \int_B |f(x)|^p.$$
Proof. Put
$$x_0 = \frac{1}{\mu(B)} \int_B |f(x)|.$$
Since $(x^p)'' > 0$ on $(0, \infty)$, the derivative of $x^p$ is increasing on $(0, \infty)$. Therefore, if we let $a$ be the value of $(x^p)' = p x^{p-1}$ at $x_0$ and let $b = (x_0)^p - a x_0$, we have $a x_0 + b = (x_0)^p$ and the derivative of $ax + b$ is $\ge (x^p)'$ on $(0, x_0)$ and $\le (x^p)'$ on $(x_0, \infty)$. We conclude that
$$ax + b \le x^p \quad \text{for all } x \in (0, \infty).$$
Now compute:
$$\frac{1}{\mu(B)} \int_B |f(x)|^p \ge \frac{1}{\mu(B)} \int_B (a|f(x)| + b) = a x_0 + b = (x_0)^p,$$
as claimed. □
8.6.2 Theorem. Let $\mu(B) < \infty$, $1 \le r \le p \le \infty$. Then, for a measurable function $f$ on $B$,
$$\|f\|_r \le \mu(B)^{\frac{1}{r} - \frac{1}{p}} \|f\|_p.$$
In particular, $L^p(B)$ is a subspace of $L^r(B)$, and the inclusion is continuous (and similarly in the complex case).
Proof. Clearly, the case of $p = \infty$ is a direct consequence of the definition. Additionally, it suffices to consider the case $r = 1$ (otherwise, replace $f$ by $|f|^r$ and $p$ by $p/r$). The case of $r = 1$ and $p < \infty$ follows from Lemma 8.6.1. □
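The estimate of Theorem 8.6.2 can be tested on step functions, for which the norms are finite sums. The sketch assumes a set $B$ of total measure `measure` cut into equal pieces, with the function constant on each piece; `p_norm` and `norm_bound` are hypothetical names and the sample values are arbitrary.

```python
import math

def p_norm(values, p, measure):
    """||f||_p of the step function taking values[i] on the i-th of
    len(values) equal pieces of a set of total measure `measure`."""
    if math.isinf(p):
        return max(abs(v) for v in values)
    piece = measure / len(values)
    return (sum(abs(v) ** p for v in values) * piece) ** (1.0 / p)

def norm_bound(values, r, p, measure):
    """Right-hand side mu(B)^(1/r - 1/p) * ||f||_p of the estimate."""
    inv_p = 0.0 if math.isinf(p) else 1.0 / p
    return measure ** (1.0 / r - inv_p) * p_norm(values, p, measure)

values = [0.5, -3.0, 2.0, 0.0, 1.5]
measure = 2.5
for r, p in [(1.0, 2.0), (2.0, 5.0), (1.5, math.inf)]:
    print(r, p, p_norm(values, r, measure) <= norm_bound(values, r, p, measure))
```

For a constant function both sides equal $|c|\,\mu(B)^{1/r}$, so the constant $\mu(B)^{\frac1r - \frac1p}$ cannot be improved.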
9
Exercises
(1) Prove the statement contained in Remark 1.2.
(2) Consider a modification of Theorem 1.1 where one replaces $L^{up}$, $L^{dn}$ by $L$. Is this modified statement true? Prove or disprove.
(3) Prove Proposition 2.4.
(4) Prove that sets of measure 0 as defined in 8.4 of Chapter 4 are precisely Lebesgue measurable sets of measure 0, as defined in 3.1.
(5) (a) Prove that if $A_1 \subseteq A_2 \subseteq \dots$ are measurable sets and $A = \bigcup A_i$, then
$$\mu(A) = \lim \mu(A_i). \qquad (*)$$
(b) Now let $A_1 \supseteq A_2 \supseteq \dots$, $A = \bigcap A_i$. Give an example when (*) does not hold. Formulate a reasonable hypothesis which fixes the problem. [Hint: Finiteness.]
(6) Let $M \subseteq \mathbb{R}^m$ be a measurable set, and let $f : M \to \mathbb{R}$ be a function such that for every Borel set $S \subseteq \mathbb{R}$, $f^{-1}[S]$ is measurable. Prove that then the function $\tilde{f}$ defined by
$$\tilde{f}(x) = \begin{cases} f(x) & \text{for } x \in M, \\ 0 & \text{otherwise} \end{cases}$$
is measurable.
(7) Give an example of a measurable function $f : \mathbb{R}^m \to \mathbb{R}$ such that there exists a measurable set $S \subseteq \mathbb{R}$ where $f^{-1}[S]$ is not measurable.
(8) Prove the following strengthening of Corollary 4.4.1: Let $S$ be a Lebesgue measurable set in $\mathbb{R}^m$. Then there exists a subset $K \subseteq S$ of type $F_\sigma$ (a countable union of compact sets) such that $\mu(S \setminus K) = 0$. [Hint: First note that for a real function $f \in Z^{dn}$, $f^{-1}[\langle a, \infty)]$ is closed. Now in the proof of 4.4, we produced a non-decreasing sequence of $Z^{dn}$-functions $f_n \le c_S$ such that $f_n \nearrow c_S$ almost everywhere. Let $K$ be the union of $f_n^{-1}[\langle 1/2, \infty)]$.]
(9) Prove that if $S$ is a Lebesgue measurable set in $\mathbb{R}^m$, then there exists a set $U$ of type $G_\delta$ (countable intersection of open sets) containing $S$ such that $\mu(U \setminus S) = 0$.
(10) Prove that a bounded function on a compact interval $\langle a, b\rangle$ is Riemann-integrable if and only if it is continuous almost everywhere. (An analogue in $\mathbb{R}^m$ also holds and can be proved using analogous methods.) [Hint: For necessity, take a sequence of partitions for which both the upper and lower Riemann sums converge to the integral; prove that the function is continuous outside of the union of any set of closed intervals which are neighborhoods of all the points $t_i$ involved in all these partitions (recall 8.1 of Chapter 1). For sufficiency, let $f$ be continuous almost everywhere. Let $F_n$ be the set of all $x_0 \in \langle a, b\rangle$ such that $\limsup_{x \to x_0} |f(x) - f(x_0)| \ge 1/n$. Then $F_n$ is closed, and, by assumption, covered by a set $S$ of countably many open intervals the sum of lengths of which is $< 1/n$. For a point $x_0 \notin F_n$, consider a $\delta_{x_0} > 0$ such that for $|x - x_0| < \delta_{x_0}$, $|f(x) - f(x_0)| < 1/n$. Then $\langle a, b\rangle$ is contained in the union of the elements of $S$ and all the open intervals $(x_0 - \delta_{x_0}/2,\ x_0 + \delta_{x_0}/2)$, $x_0 \notin F_n$. Hence, by 5.5 of Chapter 2, $\langle a, b\rangle$ is contained in a union of elements of a finite subset $S_n$. Show that the partitions by the boundary points of all the intervals in $S_n$ give upper and lower Riemann sums which converge to the same number with $n \to \infty$.]
(11) Evaluate the integral
$$\int_0^{\pi/2} \frac{\ln(1 + \cos(a)\cos(x))}{\cos(x)}\,dx$$
for $0 < a < \pi$. [Hint: Find the derivative with respect to $a$ first.]
(12) Compute
$$\int_E xy$$
where $E$ is the tetrahedron in $\mathbb{R}^3$ with vertices $(0, 0, 0)^T, (1, 1, 1)^T, (2, 3, 4)^T, (3, 6, 7)^T$. [Hint: Use linear substitution.]
(13) Spherical coordinates. For $m \ge 2$, consider the map
$$\sigma_m : (0, \infty) \times (-\pi/2, \pi/2)^{m-2} \times (0, 2\pi) \to \mathbb{R}^m$$
given as follows. If we denote the variables in the target as $x_1, \dots, x_m$ and the variables in the source as $r, t_1, \dots, t_{m-1}$ then
$$x_1 = r\cos(t_1) \cdots \cos(t_{m-1}), \qquad x_i = r\cos(t_1) \cdots \cos(t_{m-i})\sin(t_{m-i+1}), \quad i = 2, \dots, m.$$
Prove that
$$|\det D\sigma_m|_{(r, t_1, \dots, t_{m-1})}| = r^{m-1} (\cos(t_1))^{m-2} \cos(t_2)^{m-3} \cdots \cos(t_{m-2})^1.$$
[Hint: Express $\sigma_m = \psi \circ \phi$ where $\phi$ is given by the formula
$$y_1 = t_{m-1}, \quad (y_2, \dots, y_m) = \sigma_{m-1}(r, t_1, \dots, t_{m-2})$$
and $\psi$ is given by $x_1 = y_2\cos(y_1)$, $x_2 = y_2\sin(y_1)$, $x_i = y_i$ for $3 \le i \le m$. Use the chain rule.]
(14) Using Exercise (13), compute the volume $\mu(D^m)$ where
$$D^m = \{(x_1, \dots, x_m) \mid \sum x_i^2 \le r\} \subseteq \mathbb{R}^m.$$
(15) Prove that
$$\int_{\mathbb{R}} e^{-t^2}\,dt = \sqrt{\pi}.$$
[Hint: First compute
$$\int_{\mathbb{R}^2} e^{-x^2 - y^2}$$
using 2-dimensional spherical (= polar) coordinates. The integral in question is the square root of the result. Why?]
(16) Let $U$ be an open subset of $\mathbb{R}^n$, and let $F : U \to \mathbb{R}^n$ be a map satisfying Assumption 7.5. Prove that if $U$ is connected, then $\det(DF)$ does not change signs on $U$. [Hint: Recall 5.1.1 of Chapter 2.]
(17) Define in detail the metric space $L^\infty(B, \mathbb{C})$.
(18) Complete the details of the proof of Theorem 8.5.2 for $p = \infty$.
(19) Using the method of Lemma 8.6.1, prove the following Jensen inequality: If $\varphi$ is a convex function on $(0, \infty)$, then
$$\varphi\Big(\frac{1}{\mu(B)} \int_B |f(x)|\Big) \le \frac{1}{\mu(B)} \int_B \varphi(|f(x)|).$$
(20) ("Baby $L^p$") Define, on $\mathbb{R}^n$ or $\mathbb{C}^n$, for $1 \le p < \infty$,
$$\|(x_1, \dots, x_n)\|_p = (|x_1|^p + \dots + |x_n|^p)^{1/p}$$
(and similarly for $\mathbb{C}^n$). Prove that this makes $\mathbb{R}^n$, $\mathbb{C}^n$ into normed vector spaces. What is the appropriate definition in the case of $p = \infty$?
6
Systems of Ordinary Differential Equations
1
The problem
1.1 A system of ordinary differential equations (briefly, ODE's) is a problem of finding functions $y_1(x), \dots, y_n(x)$ on some open interval in $\mathbb{R}$ such that
$$y_k'(x) = f_k(x, y_1(x), \dots, y_n(x)) \quad \text{for } k = 1, \dots, n \qquad (1.1.1)$$
where $f_k$ are continuous functions of $n + 1$ real variables. Note that then $y_i$, since they are required to have a derivative, must in particular be continuous, and the derivative is then also continuous by (1.1.1). The expression "ordinary" indicates that there appear only derivatives of functions of one variable, not partial derivatives of functions of several variables. Using the vector symbols $\mathbf{y}$, $\mathbf{f}$ as in Chapter 3, we can describe the task by writing
$$\mathbf{y}'(x) = \mathbf{f}(x, \mathbf{y}(x)).$$
1.2 We may encounter systems involving higher derivatives, such as for example
$$y_1^{(4)} = f_1(x, y_1, y_2, y_1', y_2', y_1'', y_2'', y_1''', y_2'''),$$
$$y_2''' = f_2(x, y_1, y_2, y_1', y_2', y_1'', y_2'', y_1''').$$
This appears to call for a generalization of the original problem. But in fact, such systems are easily converted to systems of ODE's as above: in this particular case, introduce additional variables
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 6, © Springer Basel 2013
$$z_1 = y_1,\ z_2 = y_2,\ z_3 = y_1',\ z_4 = y_2',\ z_5 = y_1'',\ z_6 = y_2'' \text{ and } z_7 = y_1''',$$
making the two equations into the equivalent system of the form (1.1.1):
$$z_1' = z_3,\quad z_2' = z_4,\quad z_3' = z_5,\quad z_4' = z_6,\quad z_5' = z_7,$$
$$z_6' = f_2(x, z_1, \dots, z_7),$$
$$z_7' = f_1(x, z_1, \dots, z_7, f_2(x, z_1, \dots, z_7)).$$
The reader certainly sees how to apply this procedure in a general situation
$$\begin{aligned}
y_1^{(k_1)} &= f_1(x, y_1, \dots, y_1^{(k_1)}, \dots, y_n, \dots, y_n^{(k_n)}),\\
&\ \,\vdots\\
y_n^{(k_n)} &= f_n(x, y_1, \dots, y_1^{(k_1)}, \dots, y_n, \dots, y_n^{(k_n)}).
\end{aligned} \tag{1.2.1}$$
Introduce additional variables for all the derivatives of $y_i$ of order less than the highest order derivative of $y_i$ which occurs in the system, and rewrite the original system in terms of the additional variables, introducing additional equations relating the new variables as derivatives of each other (see Exercises (1), (2)). To be explicit, one sometimes refers to a system of the form (1.1.1) as a system of first-order ODE’s, but we already see that such systems are all we need to consider.
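The reduction can be tried out on a small case. The following is a minimal sketch, not taken from the text: it assumes the second-order equation $y'' = -y$, rewrites it with $z_1 = y$, $z_2 = y'$ as $z_1' = z_2$, $z_2' = -z_1$, and integrates the first-order system with a hand-written classical Runge-Kutta step. The names `rk4` and `first_order_form` are illustrative only.

```python
import math

def rk4(f, x0, z0, x1, steps):
    """Integrate z' = f(x, z) from x0 to x1 with the classical Runge-Kutta method."""
    h = (x1 - x0) / steps
    x, z = x0, list(z0)
    for _ in range(steps):
        k1 = f(x, z)
        k2 = f(x + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)])
        k3 = f(x + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)])
        k4 = f(x + h, [zi + h * ki for zi, ki in zip(z, k3)])
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        x += h
    return z

def first_order_form(x, z):
    # y'' = -y with z1 = y, z2 = y' becomes z1' = z2, z2' = -z1.
    z1, z2 = z
    return [z2, -z1]

# Initial conditions y(0) = 1, y'(0) = 0; the exact solution is y = cos x.
y_at_1, dy_at_1 = rk4(first_order_form, 0.0, [1.0, 0.0], 1.0, 200)
print(y_at_1, math.cos(1.0))
```

The two printed numbers should agree to many digits, which illustrates that nothing is lost by passing to the first-order form.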
1.3 We may, in fact, encounter even more general systems, namely a system of equations of the form
$$\begin{aligned}
F_1(x, y_1, \dots, y_1^{(k_1)}, \dots, y_n, \dots, y_n^{(k_n)}) &= 0,\\
&\ \,\vdots\\
F_m(x, y_1, \dots, y_1^{(k_1)}, \dots, y_n, \dots, y_n^{(k_n)}) &= 0.
\end{aligned} \tag{1.3.1}$$
In such a case, we will always assume that $m = n$ and that the Jacobian of the $F_i$’s in the variables corresponding to $y_1^{(k_1)}, \dots, y_n^{(k_n)}$ is non-zero. Then, using the Implicit Function Theorem 6.3 of Chapter 3, the system (1.3.1) can be converted (at least locally) to the system (1.2.1), and hence again, by the method explained there, to a first-order system of the form (1.1.1). If $m \ne n$ or the Jacobian in question is 0, the problem (1.3.1) will be considered ill-posed from our point of view.
Note that whether the problem (1.3.1) is well-posed depends on the values of $x$, the $y_i$’s and their derivatives up to $y_i^{(k_i-1)}$, and the (number) solution of the resulting equations for the $y_i^{(k_i)}$’s. We will see, however, that this is in the spirit of the theory we will develop, as in solving the system (1.2.1), we get to specify $x$, the $y_i$’s and their derivatives up to $y_i^{(k_i-1)}$ as initial conditions. (This is equivalent to specifying $x$ and the $y_i$ as initial conditions in the system (1.1.1).) The translations of 1.2 and 1.3 serve a theoretical purpose. They may often be difficult to carry out in practice. In many cases, different reductions may be more advantageous. (See Exercise (3).)
1.4 Remarks

1. To simplify notation, we write $y' = f(x, y)$ instead of the more correct $y'(x) = f(x, y(x))$, etc. Thus, the symbol $y$ may feature both as a variable in a function $f$ of two variables, and as a name of a function $y(x)$.

2. Differential equations play a fundamental role in various applications. Let us just mention a simple geometric interpretation of the ODE $y' = f(x, y)$: the function $f(x, y)$ determines directions at individual points $(x, y)$ of the plane $\mathbb{R}^2$; the graphs of the desired solutions are curves following the prescribed directions.
2 Converting a system of ODE’s to a system of integral equations
2.1 Theorem. Let $(a, b)$ be an open interval containing a number $x_0$. Let $\eta_1, \dots, \eta_n$ be arbitrary real numbers. Then the functions $y_1(x), \dots, y_n(x)$ constitute a solution of the ODE system
$$y_j'(x) = f_j(x, y_1(x), \dots, y_n(x)), \quad j = 1, \dots, n \tag{2.1.1}$$
in this interval such that, moreover, $y_j(x_0) = \eta_j$ if and only if they satisfy the equations
$$y_j(x) = \int_{x_0}^{x} f_j(t, y_1(t), \dots, y_n(t))\,dt + \eta_j. \tag{2.1.2}$$
Proof. This is an easy consequence of the Fundamental Theorem of Calculus. If (2.1.1) is satisfied then one has
$$y_j(x) = \int_{x_0}^{x} f_j(t, y_1(t), \dots, y_n(t))\,dt + c_j$$
for some constants $c_j$. If, moreover, $y_j(x_0) = \eta_j$ we obtain for $x = x_0$,
$$\eta_j = y_j(x_0) = \int_{x_0}^{x_0} f_j(\dots)\,dt + c_j = 0 + c_j.$$
On the other hand, if the functions $y_j(x)$ satisfy (2.1.2) then by taking the derivative by $x$ we obtain that $y_j'(x) = f_j(x, y_1(x), \dots, y_n(x))$, and setting $x = x_0$ we conclude that $y_j(x_0) = \eta_j$. □
2.2 Remark

This very easy translation of our problem has in fact a quite surprising consequence. Let us illustrate it on the equation $y' = f(x, y)$. Denote by $D$ the operator of taking the derivative, and by $F$ the operator transforming $y(x)$ to $f(x, y(x))$. Further, define an operator $J$ by setting
$$J(y)(x) = \int_{x_0}^{x} f(t, y(t))\,dt.$$
The original task was to solve the equation
$$D(y) = F(y). \tag{*}$$
This looks somewhat scary: for example, if we take the space $X = C((a, b))$ of bounded continuous functions on $(a, b)$ as considered in 7.7 of Chapter 2, the operator $D$ is not even defined on $X$, as not every continuous function has a derivative. It seems that in order to treat the equation by means of spaces of functions, we would have to think hard what space to work on, and what metric to choose to make both sides of the equation (*) continuous. Such problems do, indeed, arise with some types of differential equations. However, in case of our system (1.1.1), Theorem 2.1 gives a way out: After the translation we obtain the equation
$$y = J(y) \tag{**}$$
where $J$ is (as we will see) continuous. Furthermore, this is a fixed-point problem about which we already know something (see 7.6 of Chapter 2); indeed, the Banach Fixed Point Theorem will be of great help.
3 The Lipschitz property and a solution of the integral equation
3.1 Let $f(x, y_1, \dots, y_n)$ be a function in $n + 1$ (real) variables. It is said to be Lipschitz in the variables $y_1, \dots, y_n$ if there exists a number $M$ such that
$$|f(x, y_1, \dots, y_n) - f(x, z_1, \dots, z_n)| \le M \max_i |y_i - z_i|.$$
We say that $f$ is locally Lipschitz in $y_1, \dots, y_n$ if for each $u^0 = (x_0, y_1^0, \dots, y_n^0)$ of the domain in question there is an open $U \ni u^0$ such that the restriction $f|U$ is Lipschitz.

3.2 Observation. If a function $f(x, y_1, \dots, y_n)$ has continuous partial derivatives $\frac{\partial f}{\partial y_j}$ then it is locally Lipschitz.
(Indeed take a point $u^0 = (x_0, y_1^0, \dots, y_n^0)$, an open set $U \ni u^0$ and an $M$ such that
$$\left|\frac{\partial f(x, y_1, \dots, y_n)}{\partial y_j}\right| \le \frac{M}{n}.$$
Then by the Mean Value Theorem, we have for $(x, y_1, \dots, y_n), (x, z_1, \dots, z_n) \in U$,
$$|f(x, y_1, \dots, y_n) - f(x, z_1, \dots, z_n)| = \Big|\sum_j \frac{\partial f(\dots)}{\partial y_j}(y_j - z_j)\Big| \le \sum_j \left|\frac{\partial f(\dots)}{\partial y_j}\right| |y_j - z_j| \le n \cdot \frac{M}{n} \max_j |y_j - z_j|.)$$
3.3 Theorem. Let $f_j(x, y_1, \dots, y_n)$, $j = 1, \dots, n$ be continuous and Lipschitz in the variables $y_1, \dots, y_n$ in a neighborhood of a point $u = (x_0, \eta_1, \dots, \eta_n)$. Then there is an $a > 0$ such that in the interval $(x_0 - a, x_0 + a)$ the system of equations
$$u_j(x) = \int_{x_0}^{x} f_j(t, u_1(t), \dots, u_n(t))\,dt + \eta_j$$
has precisely one solution $u_1, \dots, u_n$.
Proof. First, choose a neighborhood
$$U = (x_0 - \alpha', x_0 + \alpha') \times (\eta_1 - \beta', \eta_1 + \beta') \times \dots \times (\eta_n - \beta', \eta_n + \beta')$$
on which $f|U$ is Lipschitz. Now choose $0 < \alpha < \alpha'$ and $0 < \beta < \beta'$. We have an $M$ such that $|x_0 - x| \le \alpha$, $|\eta_j - y_j| \le \beta$, $|\eta_j - z_j| \le \beta$ implies that
$$|f_j(x, y_1, \dots, y_n) - f_j(x, z_1, \dots, z_n)| \le M \max_i |y_i - z_i|.$$
Since $f$ is continuous we also have an $A$ such that $|f_j(x, y_1, \dots, y_n)| \le A$ in the compact interval
$$\langle x_0 - \alpha, x_0 + \alpha\rangle \times \langle \eta_1 - \beta, \eta_1 + \beta\rangle \times \dots \times \langle \eta_n - \beta, \eta_n + \beta\rangle$$
(recall Proposition 6.3 of Chapter 2). Choose an $a$ such that
(1) $0 < a \le \alpha$,
(2) $a \le \frac{\beta}{A}$, and
(3) $a \le \frac{q}{M}$ for some $q < 1$.
Consider the space of continuous functions $C = C((x_0 - a, x_0 + a))$ (recall 7.7 of Chapter 2) and the subspaces
$$Y_j = \{u \mid u \in C,\ \eta_j - \beta \le u(x) \le \eta_j + \beta\}.$$
All the $Y_j$ are complete metric spaces and hence also the product
$$Y = Y_1 \times Y_2 \times \dots \times Y_n$$
with, say, the maximum metric
$$\rho(u, v) = \max_j \rho_j(u_j, v_j),$$
where $\rho_j(\varphi, \psi) = \sup_x |\varphi(x) - \psi(x)|$, is complete (7.7.2 and 7.3.1 of Chapter 2). Now define for $u = (u_1, \dots, u_n)$
$$J(u) = (J_1(u), \dots, J_n(u))$$
where
$$J_j(u)(x) = \int_{x_0}^{x} f_j(t, u_1(t), \dots, u_n(t))\,dt + \eta_j.$$
Since
$$|J_j(u)(x) - \eta_j| = \Big|\int_{x_0}^{x} f_j(t, u_1(t), \dots, u_n(t))\,dt\Big| \le \int_{x_0}^{x} |f_j(t, u_1(t), \dots, u_n(t))|\,dt \le |x_0 - x| \cdot A \le a \cdot A \le \beta,$$
$J$ is a mapping $Y \to Y$, and our problem is to find a fixed point of $J$. We have
$$\begin{aligned}
\rho(J(u), J(v)) &= \max_k \sup_x |J_k(u)(x) - J_k(v)(x)|\\
&= \max_k \sup_x \Big|\int_{x_0}^{x} f_k(t, u_1(t), \dots)\,dt - \int_{x_0}^{x} f_k(t, v_1(t), \dots)\,dt\Big|\\
&= \max_k \sup_x \Big|\int_{x_0}^{x} \big(f_k(t, u_1(t), \dots) - f_k(t, v_1(t), \dots)\big)\,dt\Big|\\
&\le \max_k \sup_x \int_{x_0}^{x} |f_k(t, u_1(t), \dots) - f_k(t, v_1(t), \dots)|\,dt = c.
\end{aligned}$$
Since we have
$$|f_k(t, u_1(t), \dots) - f_k(t, v_1(t), \dots)| \le M \max_j |u_j(t) - v_j(t)| \le M \max_j \sup_x |u_j(x) - v_j(x)| = M \rho(u, v)$$
we obtain
$$\rho(J(u), J(v)) \le c \le \max_j \sup_x |x - x_0| \cdot M \rho(u, v) \le a \cdot M \rho(u, v) \le q \cdot \rho(u, v).$$
Thus, $J\colon Y \to Y$ satisfies the condition of the Banach Fixed Point Theorem 7.6 of Chapter 2 and we conclude that there is precisely one $u$ such that $J(u) = u$, that is, precisely one solution of our integral equations on the interval $(x_0 - a, x_0 + a)$. □
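The fixed point in the Banach theorem is the limit of the iterates $u, J(u), J(J(u)), \dots$, so the proof is in principle constructive. The following is a minimal sketch under assumptions not made in the text: we take the single equation $y' = y$ with $x_0 = 0$ and $\eta = 1$, represent functions as polynomials with exact rational coefficients, and apply $J$ a few times. The helper names `picard_step` and `picard_iterates` are illustrative only.

```python
from fractions import Fraction

def picard_step(poly, eta):
    """Apply J(u)(x) = eta + integral_0^x u(t) dt to a polynomial u.

    The polynomial is a list of coefficients [c0, c1, ...] meaning
    c0 + c1*x + c2*x**2 + ...; here f(t, u) = u, so the integrand is u itself.
    """
    integral = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]
    integral[0] += eta
    return integral

def picard_iterates(eta, count):
    u = [Fraction(eta)]  # start from the constant function u_0(x) = eta
    for _ in range(count):
        u = picard_step(u, Fraction(eta))
    return u

u5 = picard_iterates(1, 5)
print(u5)
```

After five steps the coefficients are $1, 1, \frac12, \frac16, \frac1{24}, \frac1{120}$, the beginning of the power series of $e^x$, which is the solution of $y' = y$, $y(0) = 1$.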
4 Existence and uniqueness of a solution of an ODE system

4.1 Using 2.1, we immediately infer from Theorem 3.3 the following

Theorem. (The Picard-Lindelöf Theorem) Let $f_j(x, y_1, \dots, y_n)$, $j = 1, \dots, n$ be continuous and let them be Lipschitz with respect to $y_1, \dots, y_n$ in a neighborhood of a point $u = (x_0, \eta_1, \dots, \eta_n)$. Then for a sufficiently small $a > 0$ the system
$$y_j'(x) = f_j(x, y_1(x), \dots, y_n(x)), \quad j = 1, \dots, n$$
has precisely one solution on $(x_0 - a, x_0 + a)$ such that $y_j(x_0) = \eta_j$ for all $j$.
Remark. Thus, unlike the uniqueness in 3.3, the solution is unique with respect to the extra conditions $y_j(x_0) = \eta_j$. These requirements are usually referred to as the initial conditions.

4.2 The solutions in 4.1 are of a local character, that is, they are guaranteed in a small neighborhood of the initial point $x_0$ only. Now we will head to solutions of a more global character, defined as far as possible. To start with, we will speak of a local solution $(u, J)$ defined on an open interval $J$ and we will endeavour to extend the $J$.

4.2.1 Lemma. Under the conditions of 4.1, let $J, K$ be open intervals, let $x_0 \in J \cap K$, and let $(u, J)$ and $(v, K)$ be local solutions such that $u(x_0) = v(x_0)$. If $f$ is continuous and Lipschitz with respect to the $y_j$ in the domain in which we consider our system, we have $u|J \cap K = v|J \cap K$.

Proof. By 4.1, if $u$ and $v$ coincide at a point they coincide in some of its open neighborhoods. Thus, $U = \{x \mid u(x) = v(x),\ x \in J \cap K\}$ is an open subset of $J \cap K$. From the continuity of $u$, $v$ it follows that $U$ is closed as well. Since $J \cap K$ is an interval, hence connected by 5.2.2 of Chapter 2, and since $U$ is non-empty, $U = K \cap J$. □

4.2.2 Take the union $J$ of all the intervals on which there exists a solution $u$ satisfying $u_j(x_0) = \eta_j$. By Lemma 4.2.1, there exists a solution $(u, J)$ with the domain $J$. Such maximal solutions are called the characteristics of the given ODE system. In this terminology we can summarize the preceding facts in the following

Theorem. Let $U$ be an open subset of $\mathbb{R}^{n+1}$ and let $\mathbf{f}(x, y_1, \dots, y_n)\colon U \to \mathbb{R}^n$ be continuous and locally Lipschitz in $y_1, \dots, y_n$. Then for each $(x_0, \eta_1, \dots, \eta_n) \in U$ there is a unique characteristic $u$ such that $u_j(x_0) = \eta_j$.
4.3 Consider a differential equation
$$y^{(n)} = f(x, y, y', \dots, y^{(n-1)}). \tag{4.3.1}$$
From the method of 1.2 and from Theorem 4.2.2, we obtain the following
Corollary. Let $U$ be an open subset of $\mathbb{R}^{n+1}$ and let $f(x, y_1, y_2, \dots, y_n)$ be continuous and locally Lipschitz in $y_1, \dots, y_n$. Then for each $(x_0, \eta_1, \dots, \eta_n) \in U$ there exists precisely one solution $y$ of the equation (4.3.1) with maximal interval domain such that
$$y^{(k)}(x_0) = \eta_{k+1}, \quad k = 0, \dots, n - 1.$$
4.4 Examples

1. The domain of a characteristic may not be equal to the domain on which a differential equation is defined. For example, the differential equation
$$y' = 1 + y^2$$
has solutions $y = \tan(x + C)$ where $C$ is any constant, as easily verified. (See the next section for a more systematic method for finding the solution.)

2. The Lipschitz condition in the assumptions is essential. Consider the equation
$$y' = 3y^{2/3}.$$
We have the solutions $y(x) = (x + c)^3$. The function $f(x, y) = 3y^{2/3}$ is Lipschitz in $y$ in all the points but the $(x, 0)$. And indeed in these exceptional points we have solutions
$$y(x) = \begin{cases} (x - a)^3 & \text{for } x \le a,\\ 0 & \text{for } a \le x \le b,\\ (x - b)^3 & \text{for } x \ge b, \end{cases}$$
all of them satisfying $y(x_0) = 0$ for any $x_0 \in (a, b)$.
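The failure of uniqueness in Example 2 can be checked numerically. The sketch below is only an illustration with assumed values $a = -1$, $b = 2$: it reads $y^{2/3}$ as $|y|^{2/3}$ so that the right-hand side is real for negative $y$, and it measures how well the glued function and the zero function satisfy the equation, using a central difference for $y'$. The names `glued` and `residual` are illustrative.

```python
def f(y):
    # Right-hand side 3*y^(2/3); abs() keeps the real cube root for negative y.
    return 3.0 * abs(y) ** (2.0 / 3.0)

def glued(x, a=-1.0, b=2.0):
    """The solution that is (x-a)^3 left of a, 0 on [a, b], (x-b)^3 right of b."""
    if x <= a:
        return (x - a) ** 3
    if x >= b:
        return (x - b) ** 3
    return 0.0

def residual(y, x, h=1e-5):
    """|y'(x) - f(y(x))| with y' approximated by a central difference."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return abs(dy - f(y(x)))

points = [-3.0, -1.5, -1.0, 0.0, 1.0, 2.0, 2.5, 4.0]
worst_glued = max(residual(glued, x) for x in points)
worst_zero = max(residual(lambda x: 0.0, x) for x in points)
print(worst_glued, worst_zero)
```

Both residuals are negligible, and both functions vanish at every $x_0 \in (a, b)$, so two different solutions pass through the same initial point.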
5 Stability of solutions

5.1 The problems of stability

Consider the equations
$$y_j'(x) = f_j(x, y_1(x), \dots, y_n(x)), \quad y_j(x_0) = \eta_j, \quad j = 1, \dots, n$$
solved as in 4.1. The solution depends (uniquely) on the $\eta_j$. A question naturally arises whether this dependence is continuous. For example, if it were not continuous,
the use of the solution in practical applications would be rather suspect, as the effect of small errors in initial conditions would be unpredictable. Furthermore, a practical setting often contains additional parameters, so the system becomes
$$\begin{aligned}
y_j'(x, \alpha_1, \dots, \alpha_k) &= f_j(x, y_1(x, \alpha_1, \dots, \alpha_k), \dots, y_n(x, \alpha_1, \dots, \alpha_k), \alpha_1, \dots, \alpha_k),\\
y_j(x_0, \alpha_1, \dots, \alpha_k) &= \eta_j, \quad j = 1, \dots, n.
\end{aligned} \tag{*}$$
As before, the derivative is taken by $x$ (while technically, this is a partial derivative, the convention is to continue using the ordinary derivative symbol to emphasize the fact that we have one system of ordinary differential equations for each value of the parameters). As, again, in practice the parameters are known only approximately, the solution makes practical sense only if it depends continuously on the parameters $\alpha_i$.
The two stability problems can be reduced to one. Fix initial conditions $\eta_j^0$, consider
$$\beta_i = \eta_i - \eta_i^0, \quad z_i = y_i - \beta_i$$
and define
$$g_j(x, z_1, \dots, z_n, \alpha_1, \dots, \alpha_k, \beta_1, \dots, \beta_n) = f_j(x, z_1 + \beta_1, \dots, z_n + \beta_n, \alpha_1, \dots, \alpha_k)$$
which turns the combined task (*) into
$$\begin{aligned}
z_j'(x, \alpha_1, \dots, \alpha_k, \beta_1, \dots, \beta_n) &= g_j(x, z_1(x, \alpha_1, \dots, \beta_n), \dots, z_n(x, \alpha_1, \dots, \beta_n), \alpha_1, \dots, \alpha_k, \beta_1, \dots, \beta_n),\\
z_j(x_0, \alpha_1, \dots, \alpha_k, \beta_1, \dots, \beta_n) &= \eta_j^0, \quad j = 1, \dots, n
\end{aligned}$$
with the initial values $\eta_j^0$ fixed. Thus, it suffices to study the dependence of the system on parameters only, with initial conditions fixed; in the notation (*), this means we will study stability with respect to $\alpha_1, \dots, \alpha_k$, with $\eta_j$ fixed.
5.1.1 Remark. One can also convert the combined stability problem into a problem concerning initial conditions only. But the trick with parameters is more expedient and we will concentrate on that.

5.2 Lemma. (Gronwall’s inequality) Let $F$ be a non-negative real-valued function on an interval $\langle a, b\rangle$ and let there exist positive constants $C, K$ such that for all $x \in \langle a, b\rangle$ we have
$$F(x) \le C + K \int_a^x F(t)\,dt.$$
Then for all $x \in \langle a, b\rangle$,
$$F(x) \le C e^{K(x-a)}.$$
Proof. Put
$$G(x) = C + K \int_a^x F(t)\,dt.$$
Then we have
$$F(x) \le G(x) \quad \text{and} \quad G'(x) = K F(x) \le K G(x).$$
Since $G(x) > 0$, we have $\frac{G'(x)}{G(x)} \le K$ and hence
$$\int_a^x \frac{G'(t)}{G(t)}\,dt \le K \int_a^x 1\,dt = K(x - a).$$
Substituting in the first integral $y = G(t)$, we obtain
$$\int_{G(a)}^{G(x)} \frac{dy}{y} = \ln G(x) - \ln G(a) = \ln G(x) - \ln C,$$
so that
$$\ln G(x) \le \ln C + K(x - a),$$
and hence
$$G(x) \le C e^{K(x-a)}.$$
Using $F(x) \le G(x)$ again, we obtain the desired inequality. □
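As a quick check (an added illustration, not part of the original argument), the bound of the lemma cannot be improved: the function $F(x) = C e^{K(x-a)}$ turns the assumed inequality into an equality.

```latex
\[
C + K\int_a^x C e^{K(t-a)}\,dt
  = C + C\bigl(e^{K(x-a)} - 1\bigr)
  = C e^{K(x-a)} = F(x).
\]
```

So this $F$ satisfies the hypothesis with equality and attains the bound $C e^{K(x-a)}$ at every $x$.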
5.3 To simplify notation, in the proof of the following theorem we will write $\alpha$ for $\alpha_1, \dots, \alpha_k$ and use the symbol $\|\alpha\|$ for
$$\max_{j=1,\dots,k} |\alpha_j|.$$
Similarly, for a system $y_1, \dots, y_n$ resp. $y_1(x), \dots, y_n(x)$ we will write
$$\|y\| = \max_{j=1,\dots,n} |y_j| \quad \text{or} \quad \|y(x)\| = \max_{j=1,\dots,n} |y_j(x)|.$$
Theorem. Let $f_j(x, y_1, \dots, y_n, \alpha_1, \dots, \alpha_k)$ be functions continuous in all variables and Lipschitz in the variables $y_j$ and $\alpha_j$ in some neighborhood of a point
$$(x_0, \eta_1, \dots, \eta_n, \alpha_1^0, \dots, \alpha_k^0). \tag{5.3.1}$$
Then the solution $y_j(x, \alpha_1, \dots, \alpha_k)$ of the system of equations
$$\begin{aligned}
y_j'(x, \alpha_1, \dots, \alpha_k) &= f_j(x, y_1(x, \alpha_1, \dots, \alpha_k), \dots, y_n(x, \alpha_1, \dots, \alpha_k), \alpha_1, \dots, \alpha_k),\\
y_j(x_0, \alpha_1, \dots, \alpha_k) &= \eta_j, \quad j = 1, \dots, n
\end{aligned}$$
is continuous in all variables in some neighborhood $U$ of the point (5.3.1). Moreover, if $K$ is a Lipschitz constant for the variables $y_1, \dots, y_n$, $\alpha_1, \dots, \alpha_k$, we have an estimate on $U$:
$$|y_j(x, \alpha_1, \dots, \alpha_k) - y_j(x, \beta_1, \dots, \beta_k)| \le \max_{i=1,\dots,k} |\alpha_i - \beta_i| \, e^{K(x-a)} \tag{5.3.2}$$
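for all $j = 1, \dots, n$.
Proof. We have
$$\begin{aligned}
|y_j(x, \alpha) - y_j(x, \beta)| &= \Big|\int_a^x \big(f_j(t, y_1(t, \alpha), \dots, y_n(t, \alpha), \alpha) - f_j(t, y_1(t, \beta), \dots, y_n(t, \beta), \beta)\big)\,dt\Big|\\
&\le \int_a^x |f_j(t, y_1(t, \alpha), \dots, y_n(t, \alpha), \alpha) - f_j(t, y_1(t, \beta), \dots, y_n(t, \beta), \beta)|\,dt\\
&\le \int_a^x \big(|f_j(t, y_1(t, \alpha), \dots, \alpha) - f_j(t, y_1(t, \alpha), \dots, \beta)|\\
&\qquad\quad + |f_j(t, y_1(t, \alpha), \dots, \beta) - f_j(t, y_1(t, \beta), \dots, \beta)|\big)\,dt\\
&\le \int_a^x \big(K \|\alpha - \beta\| + K \|y(t, \alpha) - y(t, \beta)\|\big)\,dt,
\end{aligned}$$
so
$$\|y(x, \alpha) - y(x, \beta)\| \le K \int_a^x \big(\|\alpha - \beta\| + \|y(t, \alpha) - y(t, \beta)\|\big)\,dt.$$
If we set $F(x) = \|\alpha - \beta\| + \|y(x, \alpha) - y(x, \beta)\|$, we obtain
$$F(x) \le \|\alpha - \beta\| + K \int_a^x F(t)\,dt.$$
By Lemma 5.2, we now have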
$$F(x) \le \|\alpha - \beta\| e^{K(x-a)}$$
and since $\|y(x, \alpha) - y(x, \beta)\| \le F(x)$, the estimate (5.3.2) follows. □

5.4 Remark

Recall that the existence and uniqueness in Theorem 4.1 was proved using the Banach Fixed Point Theorem 7.6 of Chapter 2. The reader may naturally ask whether the stability theorem (at least the continuity) is not an easy consequence of a general property of such fixed points. That is, we think of the following problem. Let us have metric spaces $X, T$ and a mapping
$$f\colon X \times T \to X$$
such that $d(f(x, t), f(y, t)) \le r_t \cdot d(x, y)$ where the $r_t < 1$ depend on $t \in T$ only. Define $F(t) \in X$ by the equation $f(F(t), t) = F(t)$. How does $F(t)$ depend on $t$? There are fairly general facts known on this subject, but they do not fit well with our present topic. Due to the special character of our equations it is, luckily enough, easy to show the dependence by an explicit estimate, as we have done.
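The estimate (5.3.2) can be observed on a concrete equation. The following sketch makes assumptions of its own: the scalar equation $y' = \sin y + \alpha$ with $y(0) = 0$, so that $a = x_0 = 0$ and $K = 1$ is a Lipschitz constant in both $y$ and $\alpha$, and a hand-written Runge-Kutta integrator named `solve`.

```python
import math

def solve(alpha, x_end, steps=2000):
    """RK4 for y' = sin(y) + alpha, y(0) = 0; returns y(x_end)."""
    f = lambda y: math.sin(y) + alpha
    h = x_end / steps
    y = 0.0
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + h / 2 * k1)
        k3 = f(y + h / 2 * k2)
        k4 = f(y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

alpha, beta, K = 0.5, 0.6, 1.0  # K = 1 bounds both partial derivatives of f
checks = []
for x in (0.5, 1.0, 2.0):
    gap = abs(solve(alpha, x) - solve(beta, x))
    bound = abs(alpha - beta) * math.exp(K * x)
    checks.append((x, gap, bound))
    print(x, gap, bound)
```

In each printed row the gap between the two solutions stays below the bound $|\alpha - \beta| e^{K(x - x_0)}$, as the theorem predicts.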
5.5 The solution of a system of differential equations is (under reasonable conditions) not only continuously dependent on parameters. In fact we can even take derivatives. Consider, again, the system of equations
$$y_i'(x, \alpha) = f_i(x, y_1(x, \alpha), \dots, y_n(x, \alpha), \alpha), \quad y_i(x_0, \alpha) = \eta_i, \quad i = 1, \dots, n, \tag{5.5.1}$$
satisfying the conditions from 4.1 (where we write, similarly as before, $y_i'$, not $\frac{\partial y_i}{\partial x}$, for the derivatives by $x$, to keep in mind the fact that we are dealing with an ordinary differential equation).

Theorem. Let $f_i(x, y_1, \dots, y_n, \alpha_1, \dots, \alpha_k)$ be continuous functions defined on an open neighborhood of a point (5.3.1), continuously differentiable with respect to $y_j$ and $\alpha_p$. Then the solutions $y_i(x, \alpha)$ of the system (5.5.1), which exist and are unique on some open neighborhood $U$ of (5.3.1), are differentiable with respect to $\alpha_p$, $p = 1, \dots, k$ on $U$, and the functions
$$z_i(x, \alpha) = \frac{\partial y_i(x, \alpha)}{\partial \alpha_p}$$
satisfy the system of equations
$$z_i'(x, \alpha) = \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha) \cdot z_j + \frac{\partial f_i}{\partial \alpha_p}(x, y(x, \alpha), \alpha), \quad z_i(x_0, \alpha) = 0, \quad i = 1, \dots, n, \tag{5.5.2}$$
where we write briefly $y$ for $y_1, \dots, y_n$ and $\alpha$ for $\alpha_1, \dots, \alpha_k$.

Remarks.
1. The continuous differentiability with respect to $y_j$ and $\alpha_p$ makes, of course, the functions $f_i$ locally Lipschitz with respect to these variables.
2. The system (5.5.1) is viewed as solved and the $y_i(x, \alpha)$ constitute the (unique) solution. The equations (5.5.2) contain these functions as already given, not as something dependent on the $z_i$. Thus, the right-hand sides of the equations in (5.5.2) are Lipschitz with respect to $z_j$ and therefore the system has a solution. Our task will be to prove that the individual $z_i$’s are the partial derivatives of the $y_i$ by $\alpha_p$.
3. The reader has certainly not overlooked that the equations for $z_i$ which we hoped to be the $\frac{\partial y_i}{\partial \alpha}$ come naturally in the form (5.5.2): if we already knew $y_i$ to have derivatives, we would obtain the equality by taking derivatives of the equalities in (5.5.1). But this we do not know yet.

Proof. First of all, note that the problem is immediately reduced to the case $k = 1$: We may treat all parameters but one as constant for the existence of a single partial derivative; once equation (5.5.2) is proved, we can use Theorem 5.3 to prove continuity of the partial derivatives in all the $\alpha_p$’s. Thus, let us assume $k = 1$, and write $\alpha$ for $\alpha_p$. Let $y_i$ be a solution of the system (5.5.1) and $z$ a solution of the system (5.5.2). Put
$$u_i(x, \alpha, h) = \frac{1}{h}\big(y_i(x, \alpha + h) - y_i(x, \alpha)\big)$$
and
$$v_i(x, \alpha, h) = u_i(x, \alpha, h) - z_i(x, \alpha).$$
Thus,
$$\frac{\partial v_i}{\partial x}(x, \alpha, h) = \frac{\partial u_i}{\partial x}(x, \alpha, h) - z_i'(x, \alpha) = \frac{\partial u_i}{\partial x}(x, \alpha, h) - \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha) \cdot z_j - \frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha).$$
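Let us compute the derivative $\frac{\partial u_i(x, \alpha, h)}{\partial x}$:
$$\begin{aligned}
\frac{\partial u_i(x, \alpha, h)}{\partial x} &= \frac{1}{h}\big(y_i'(x, \alpha + h) - y_i'(x, \alpha)\big)\\
&= \frac{1}{h}\big(f_i(x, y(x, \alpha + h), \alpha + h) - f_i(x, y(x, \alpha), \alpha + h)\big)\\
&\quad + \frac{1}{h}\big(f_i(x, y(x, \alpha), \alpha + h) - f_i(x, y(x, \alpha), \alpha)\big).
\end{aligned}$$
By the Mean Value Theorem we may continue, writing $\Delta y$ for $y_1(x, \alpha + h) - y_1(x, \alpha), \dots, y_n(x, \alpha + h) - y_n(x, \alpha)$, that is, $h u_1(x, \alpha, h), \dots, h u_n(x, \alpha, h)$,
$$= \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha) + \theta_1 \Delta y, \alpha + h) \cdot u_j(x, \alpha, h) + \frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha + \theta_2 h).$$
Let us now consider $\frac{\partial v_i}{\partial x}$. Since $u_i(x, \alpha, h) = v_i(x, \alpha, h) + z_i(x, \alpha)$, we obtain
$$\begin{aligned}
\left|\frac{\partial v_i}{\partial x}(x, \alpha, h)\right| &\le \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha) + \theta_1 \Delta y, \alpha + h)\right| \cdot |v_j(x, \alpha, h)|\\
&\quad + \sum_{j=1}^{n} \left|\Big(\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha) + \theta_1 \Delta y, \alpha + h) - \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha)\Big) \cdot z_j(x, \alpha)\right|\\
&\quad + \left|\frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha + \theta_2 h) - \frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha)\right|,
\end{aligned}$$
and further
$$\begin{aligned}
\left|\frac{\partial v_i}{\partial x}(x, \alpha, h)\right| &\le \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha) + \theta_1 \Delta y, \alpha + h)\right| \cdot |v_j(x, \alpha, h)|\\
&\quad + \sum_{j=1}^{n} \left|\Big(\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha) + \theta_1 \Delta y, \alpha + h) - \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha + h)\Big) \cdot z_j(x, \alpha)\right|\\
&\quad + \sum_{j=1}^{n} \left|\Big(\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha + h) - \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha)\Big) \cdot z_j(x, \alpha)\right|\\
&\quad + \left|\frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha + \theta_2 h) - \frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha)\right|.
\end{aligned}$$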
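Choose a compact neighbourhood of $(x_0, y(x_0, \alpha), \alpha)$ and a $K$ sufficiently large to have, in this range,
$$\max_i \sum_{j=1}^{n} \left|\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha)\right| \le K.$$
Now let $\varepsilon > 0$. From the Lipschitz property we see that for $h$ sufficiently small we have for all $x$ sufficiently close to $x_0$ to stay in the aforementioned range
$$\begin{aligned}
&\sum_{j=1}^{n} \left|\Big(\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha) + \theta_1 \Delta y, \alpha + h) - \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha + h)\Big) \cdot z_j(x, \alpha)\right|\\
&\quad + \sum_{j=1}^{n} \left|\Big(\frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha + h) - \frac{\partial f_i}{\partial y_j}(x, y(x, \alpha), \alpha)\Big) \cdot z_j(x, \alpha)\right|\\
&\quad + \left|\frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha + \theta_2 h) - \frac{\partial f_i}{\partial \alpha}(x, y(x, \alpha), \alpha)\right| < \varepsilon
\end{aligned}$$
and hence
$$\left|\frac{\partial v_i}{\partial x}(x, \alpha, h)\right| \le \varepsilon + K \sum_{j=1}^{n} |v_j(x, \alpha, h)|$$
so that
$$\sum_{i=1}^{n} \left|\frac{\partial v_i}{\partial x}(x, \alpha, h)\right| \le n\varepsilon + nK \sum_{j=1}^{n} |v_j(x, \alpha, h)|,$$
and consequently, since $v_i(x_0, \alpha, h) = 0$,
$$\sum_{i=1}^{n} |v_i(x, \alpha, h)| \le \int_{x_0}^{x} \Big(n\varepsilon + nK \sum_{i=1}^{n} |v_i(t, \alpha, h)|\Big)\,dt.$$
Thus, for $F(x) = n\varepsilon + nK \sum_{i=1}^{n} |v_i(x, \alpha, h)|$ we have
$$F(x) \le n\varepsilon + nK \int_{x_0}^{x} F(t)\,dt$$
and can apply Gronwall’s inequality to obtain, for each individual $i$,
$$|v_i(x, \alpha, h)| = \left|\frac{1}{h}\big(y_i(x, \alpha + h) - y_i(x, \alpha)\big) - z_i(x, \alpha)\right| \le \frac{\varepsilon\,(e^{nK(x - x_0)} - 1)}{K},$$
and since $\varepsilon > 0$ was arbitrary we conclude that
$$\lim_{h \to 0} \frac{1}{h}\big(y_i(x, \alpha + h) - y_i(x, \alpha)\big) = z_i(x, \alpha).$$
□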
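The statement of the theorem can be tested on a case where everything is explicit. The sketch below assumes the equation $y' = \alpha y$, $y(0) = 1$ (our choice, not the text's), for which (5.5.2) reads $z' = \alpha z + y$, $z(0) = 0$, with solution $z = x e^{\alpha x}$. It integrates the pair $(y, z)$ numerically and compares $z$ with the difference quotient $u(x, \alpha, h)$ from the proof. The function names are illustrative.

```python
import math

def rhs(state, alpha):
    # state = (y, z): y' = alpha*y is (5.5.1), z' = alpha*z + y is (5.5.2).
    y, z = state
    return (alpha * y, alpha * z + y)

def integrate(alpha, x_end, steps=1000):
    """RK4 for the augmented system with y(0) = 1, z(0) = 0."""
    h = x_end / steps
    s = (1.0, 0.0)
    for _ in range(steps):
        k1 = rhs(s, alpha)
        k2 = rhs(tuple(a + h / 2 * b for a, b in zip(s, k1)), alpha)
        k3 = rhs(tuple(a + h / 2 * b for a, b in zip(s, k2)), alpha)
        k4 = rhs(tuple(a + h * b for a, b in zip(s, k3)), alpha)
        s = tuple(a + h / 6 * (p + 2 * q + 2 * r + t)
                  for a, p, q, r, t in zip(s, k1, k2, k3, k4))
    return s

alpha, x_end, h = 0.7, 1.5, 1e-5
y, z = integrate(alpha, x_end)
# Difference quotient u(x, alpha, h) built from the exact solution exp(alpha*x).
u = (math.exp((alpha + h) * x_end) - math.exp(alpha * x_end)) / h
print(z, u, x_end * math.exp(alpha * x_end))
```

The three printed values nearly coincide: the solution of the linear system (5.5.2), the difference quotient in $\alpha$, and the closed form $x e^{\alpha x}$.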
6 A few special differential equations
6.1 First of all, let us realize that in the situations where the theorem on the existence and uniqueness is applicable, we do not really have to be concerned about the correctness of the procedure we use (e.g. working with $\frac{dy}{dx}$ as if it were a fraction, failing to control whether there might not be a zero in a denominator, etc.). If we obtain a function satisfying the equation (and initial conditions), it has to be the one and only solution we are looking for, by Theorem 4.2.2. This is a perfect example of the importance of theoretical work for calculations.

6.2 We have already encountered a differential equation without knowing it. Namely, looking for a primitive function of a function $f$ is the ODE
$$y' = f(x).$$
In general, to determine a primitive function is by no means an easy task (indeed it is often impossible to obtain a formula in terms of elementary functions). It is, however, customary to think of an ODE as solved if it is reduced to formulas in primitive functions.
6.3 Separation of variables

The equation
$$y' = f(x) g(y)$$
can be treated as follows: rewrite it as
$$\frac{1}{g(y(x))} \cdot y'(x) = f(x)$$
and compare the primitive functions of both sides (these are indicated by plain $\int$). We obtain
$$\Big(\int \frac{1}{g}\Big)(y(x)) = \Big(\int f\Big)(x) + C.$$
This somewhat clumsy computation can be, more intuitively, modified as follows. Take the equation as
$$\frac{dy}{dx} = f(x) g(y),$$
proceed to
$$\frac{dy}{g(y)} = f(x)\,dx$$
and “integrate”
$$\int \frac{dy}{g(y)} = \int f(x)\,dx + C.$$
Examples.
1. For $y' = y \sin x$ we obtain
$$\int \frac{dy}{y} = \int \sin x\,dx + C,$$
hence $\ln |y| = -\cos x + C$ yielding
$$|y| = e^{-\cos x + C}, \quad \text{that is,} \quad y = D \cdot e^{-\cos x}.$$
2. Similarly, the equation $y' = 1 + y^2$ of Example 4.4.1 is transformed to
$$\int \frac{dy}{1 + y^2} = \int dx + C,$$
yielding $\arctan y = x + C$ and finally $y = \tan(x + C)$.
3. For
$$y' = -\frac{x}{y}$$
we obtain $\int y\,dy = -\int x\,dx + C$, hence $\frac{1}{2} y^2 = -\frac{1}{2} x^2 + c$ and finally $x^2 + y^2 = r^2$. This is a very intuitive example: What curves are perpendicular in each $(x, y)$ to the vector $(x, y)$? Of course, the circles with their centers at $(0, 0)$.
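As remarked in 6.1, a formula obtained by such formal manipulation only has to be checked against the equation. The following sketch does this numerically for Example 1, with an initial condition $y(0) = 2$ chosen here for illustration, so that $D = 2e$; the integrator `rk4_scalar` is a plain textbook Runge-Kutta step, not something taken from the text.

```python
import math

def rk4_scalar(f, x0, y0, x1, steps=1000):
    """Classical RK4 for a scalar equation y' = f(x, y)."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return y

# Example 1: y' = y sin x with y(0) = 2, so D = 2e in y = D * exp(-cos x).
D = 2.0 * math.e
numeric = rk4_scalar(lambda x, y: y * math.sin(x), 0.0, 2.0, 3.0)
closed_form = D * math.exp(-math.cos(3.0))
print(numeric, closed_form)
```

The numerically integrated value at $x = 3$ and the closed form $D e^{-\cos 3}$ agree to within the accuracy of the integrator.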
6.4 To solve the equation
$$y' = f(ax + by),$$
substitute $z(x) = ax + by(x)$. Then we have $z' = b \cdot y' + a = b \cdot f(z) + a$, a particularly simple example of the equations from 6.3 where the right-hand side is independent of $x$ (such an example is known as an autonomous equation).
6.5 (The “homogeneous equation” - not to be confused with homogeneous linear differential equations in Chapter 7 below.) To solve the equation
$$y' = f\Big(\frac{y}{x}\Big),$$
(in other words, $y' = F(x, y)$ where $F$ is such that for any $t$, $F(x, y) = F(tx, ty)$), substitute $z = \frac{y}{x}$. Then we obtain
$$z' = \frac{y' x - y}{x^2} = \frac{y'}{x} - \frac{z}{x} = (f(z) - z) \cdot \frac{1}{x},$$
again an equation with separated variables.
6.6 The equation
$$y' = f\Big(\frac{ax + by + c}{\alpha x + \beta y + \gamma}\Big) \tag{6.6.1}$$
would be of the type 6.5 if we had $c = \gamma = 0$. If not, let us try to force it. Let $x_0, y_0$ be a solution of the linear (algebraic) equations
$$ax + by + c = 0,$$
$$\alpha x + \beta y + \gamma = 0.$$
Then
$$\frac{ax + by + c}{\alpha x + \beta y + \gamma} = \frac{a(x - x_0) + b(y - y_0)}{\alpha(x - x_0) + \beta(y - y_0)}.$$
If we substitute
$$\xi = x - x_0, \quad z = y - y_0,$$
we obtain $z(\xi) = y(\xi + x_0) - y_0$ and $\frac{dx}{d\xi} = 1$ so that
$$\frac{dz}{d\xi} = y'(\xi + x_0) = f\Big(\frac{a\xi + bz}{\alpha\xi + \beta z}\Big).$$
The linear algebraic equations above may fail to have a solution: namely we could have had $(a, b) = K \cdot (\alpha, \beta)$ or $K \cdot (a, b) = (\alpha, \beta)$. Then, however, the equation (6.6.1) is already of the form $y' = F(Ax + By)$ as it is, and we can use the procedure from 6.4.
6.7 The linear equation $y' = a(x)y + b(x)$; first encounter with variation of constants

First, solve the equation $y' = a(x)y$. This is a case of separated variables and by the method from 6.3, we obtain a solution
$$u_1(x) = c \cdot e^{\int a(x)\,dx}. \tag{6.7.1}$$
Let us try to find a solution of the original equation in the form
$$y(x) = c(x) \cdot u_1(x)$$
(because of replacing the constant $c$ from (6.7.1) by a function in $x$ one speaks of a variation of constant; in a more general setting, it will be used in Chapter 7 below). Thus, we should have the equality $y' = c' u_1 + c u_1'$ and since $u_1' = a u_1$, we have, further,
$$y' = c' u_1 + c a u_1 = c' u_1 + a y.$$
Thus, we need a $c(x)$ such that $b(x) = c'(x) u_1(x)$ and this equality is satisfied by
$$c(x) = \int \frac{b(x)}{u_1(x)}\,dx + K.$$
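To see the recipe at work, here is a short worked example with data chosen by us, $a(x) = 1$ and $b(x) = e^x$, that is, the equation $y' = y + e^x$.

```latex
\[
u_1(x) = e^{x}, \qquad
c(x) = \int \frac{e^{x}}{e^{x}}\,dx + K = x + K, \qquad
y(x) = (x + K)\,e^{x}.
\]
\[
\text{Check: } y' = e^{x} + (x + K)\,e^{x} = y + e^{x}.
\]
```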
6.8 At least one second-order equation
In physics, we encounter the equation
$$y'' = f(y).$$
Such an equation can be solved as follows. First, multiply both sides by $y'$ to obtain
$$y' y'' = f(y) y',$$
that is,
$$\Big(\frac{1}{2}(y')^2\Big)' = \Big(\Big(\int f\Big) \circ y\Big)'$$
and further
$$\frac{1}{2}(y')^2 = \Big(\int f\Big) \circ y + C$$
($\circ$ indicates composition of functions) and finally
$$y' = \sqrt{2\Big(\int f\Big) \circ y + C},$$
a case of separated variables.
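The simplest instance, chosen here only as an illustration, is the oscillator equation $y'' = -y$, where $f(y) = -y$ and $\int f = -\frac{1}{2}y^2$. Writing the constant as $r^2$ with $r > 0$, the procedure gives the following.

```latex
\[
y' = \sqrt{r^{2} - y^{2}}
\quad\Longrightarrow\quad
\int \frac{dy}{\sqrt{r^{2} - y^{2}}} = \int dx + c
\quad\Longrightarrow\quad
\arcsin\frac{y}{r} = x + c
\quad\Longrightarrow\quad
y = r\sin(x + c).
\]
\[
\text{Check: } y'' = -r\sin(x + c) = -y.
\]
```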
7 General substitution, symmetry and infinitesimal symmetry of a differential equation

7.1 One may ask how, looking at a differential equation, one finds the substitution which allows us to separate variables. Of course, in most cases, it is not possible. When it is, however, there is, in fact, a general strategy for finding the substitution, relating separation of variables to symmetry. To study symmetry, it is convenient to write a system of differential equations in a form in which the right-hand side does not depend explicitly on $x$:
$$y_i' = f_i(y_1, \dots, y_n). \tag{7.1.1}$$
Clearly, this is a special case of the system (1.1.1). On the other hand, a system of the form (1.1.1) can be always reduced to the form (7.1.1) by introducing an additional variable $y_0$:
$$y_0' = 1, \quad y_i' = f_i(y_0, y_1, \dots, y_n).$$
7.2 Now assume we have a system of the form (7.1.1). We may write it in vector notation, putting $\mathbf{y} = (y_1, \dots, y_n)^T$, $\mathbf{f} = (f_1, \dots, f_n)^T$ (recall that reconciling the direction of composition of maps with matrix multiplication favors viewing vectors as columns here, see e.g. Appendix A, 7.5):
$$\mathbf{y}' = \mathbf{f}(\mathbf{y}). \tag{7.2.1}$$
Let us point out a geometric interpretation of the system (7.2.1). Denote the independent variable by $t$. A solution $\mathbf{y}(t)$ can be interpreted as a parametric curve with the parameter $t$. Then the equation (7.2.1) says that the tangent (“velocity”) vector of the curve $\mathbf{y}$ at the point $t$ is equal to $\mathbf{f}(\mathbf{y}(t))$. A function $U \to \mathbb{R}^n$ on a subset $U \subseteq \mathbb{R}^n$, when we interpret its values as vectors, is called a vector field. The curves $\mathbf{y}(t)$ are called integral curves of the vector field. One sometimes denotes the solution as
$$\mathbf{y}(t) = \exp(t\mathbf{f})\,\mathbf{y}(0), \tag{7.2.2}$$
although this is somewhat misleading, given the fact that the solution is not an exponential even in the case of $n = 1$ unless $\mathbf{f}$ is constant, and cannot be figured out explicitly in general when $n > 1$.

7.3 Let us now study how a vector field changes when we change variables. By a substitution at $\mathbf{y}_0 \in \mathbb{R}^n$ we shall mean a smooth map $\phi\colon U \to \mathbb{R}^n$ where $U$ is an open neighborhood of $\mathbf{y}_0$, whose differential at $\mathbf{y}_0$ is non-singular. Writing $\mathbf{z} = \phi(\mathbf{y})$, then, by the chain rule, we get from (7.2.1) a system of differential equations for $\mathbf{z}$,
$$\mathbf{z}' = D\phi|_{\phi^{-1}(\mathbf{z})} \cdot \mathbf{f}(\phi^{-1}(\mathbf{z})),$$
(the operation on the right-hand side is matrix multiplication), so from the point of view of differential equations, $\phi$ transforms the vector field $\mathbf{f}$ to the vector field
$$\mathbf{g}(\mathbf{z}) = D\phi|_{\phi^{-1}(\mathbf{z})} \cdot \mathbf{f}(\phi^{-1}(\mathbf{z}))$$
in an open neighborhood of $\mathbf{z}_0 = \phi(\mathbf{y}_0)$. In other words, the differential equation (7.2.1), expressed in the variables $\mathbf{z}$, reads
$$\mathbf{z}' = \mathbf{g}(\mathbf{z}). \tag{7.3.1}$$
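7.4 We will call $\phi$ a symmetry (at $\mathbf{y}_0$) if the differential equations (7.2.1) and (7.3.1) coincide, i.e. we have $\mathbf{g}(\mathbf{z}) = \mathbf{f}(\mathbf{z})$, or
$$\mathbf{f}(\phi(\mathbf{y})) = D\phi|_{\mathbf{y}} \cdot \mathbf{f}(\mathbf{y}). \tag{7.4.1}$$
However, we are less interested in a single symmetry than in a (continuous) family of symmetries. By this, we mean a smooth map $\Phi\colon \mathbb{R}^{n+1} \to \mathbb{R}^n$, which, denoting the first variable as $\varepsilon$, and writing $\Phi(\varepsilon, ?)$ as $\phi_\varepsilon\colon \mathbb{R}^n \to \mathbb{R}^n$, has the property that each $\phi_\varepsilon$ is a symmetry, and $\phi_0 = \mathrm{Id}$ (in other words, $\phi_0(\mathbf{y}) = \mathbf{y}$). Given a family of symmetries, what is happening near $\varepsilon = 0$? Let
$$\mathbf{u} = \frac{\partial \phi_\varepsilon}{\partial \varepsilon}\Big|_{\varepsilon = 0}. \tag{7.4.2}$$
Then considering the condition (7.4.1) for $\phi = \phi_\varepsilon$ and differentiating by $\varepsilon$ at $\varepsilon = 0$, we get that
$$\frac{\partial}{\partial \varepsilon} \mathbf{f}(\phi_\varepsilon(\mathbf{y}))\Big|_{\varepsilon = 0} = D\mathbf{f} \cdot \mathbf{u}(\mathbf{y}) = \partial_{\mathbf{u}} \mathbf{f}(\mathbf{y}),$$
$$\frac{\partial}{\partial \varepsilon} \big(D\phi_\varepsilon|_{\mathbf{y}} \cdot \mathbf{f}(\mathbf{y})\big)\Big|_{\varepsilon = 0} = D\mathbf{u}|_{\mathbf{y}} \cdot \mathbf{f}(\mathbf{y}) = \partial_{\mathbf{f}} \mathbf{u}(\mathbf{y})$$
(here on the right-hand side we use the notation $\partial_{\mathbf{u}} \mathbf{f}(\mathbf{y}) = \frac{d}{dt} \mathbf{f}(\mathbf{y} + t\mathbf{u})\big|_{t=0}$, see 2.4 of Chapter 3).

7.5 For two smooth vector fields $\mathbf{u}$, $\mathbf{f}$, we write
$$[\mathbf{u}, \mathbf{f}] = \partial_{\mathbf{u}}(\mathbf{f}) - \partial_{\mathbf{f}}(\mathbf{u}),$$
and call this the Lie bracket of vector fields. This is, again, a vector field. The derivative of the condition (7.4.1) at $\varepsilon = 0$ then reads
$$[\mathbf{u}, \mathbf{f}] = 0. \tag{7.5.1}$$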
A smooth vector field $u$ defined on an open neighborhood of $y_0$ which satisfies (7.5.1) will be called an infinitesimal symmetry of the differential equation (7.2.1) at $y_0$. For technical reasons (dealing with possibly different domains of definition), we will consider two infinitesimal symmetries at $y_0$ equal when they coincide on an open neighborhood of $y_0$.
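To make the condition (7.5.1) concrete, here is a minimal numerical sketch that is not taken from the text. It approximates the directional derivatives $\partial_u f$ and $\partial_f u$ by central differences and evaluates the Lie bracket at a point. The fields `rotation_field`, `spiral_field` and `shear_field`, the helper names, and the step size `h` are all illustrative assumptions; the planar field `spiral_field` is chosen to be invariant under rotations, so the rotation field should have (numerically) vanishing bracket with it, while the shear field should not.

```python
def directional_derivative(field, direction, point, h=1e-5):
    """Central-difference approximation of d/dt field(point + t*direction) at t = 0."""
    plus = field([p + h * d for p, d in zip(point, direction)])
    minus = field([p - h * d for p, d in zip(point, direction)])
    return [(a - b) / (2 * h) for a, b in zip(plus, minus)]


def lie_bracket(u, f, point):
    """Approximate [u, f] = d_u(f) - d_f(u) at the given point."""
    du_f = directional_derivative(f, u(point), point)
    df_u = directional_derivative(u, f(point), point)
    return [a - b for a, b in zip(du_f, df_u)]


def rotation_field(p):
    """Generator of rotations about the origin (hypothetical example field)."""
    x, y = p
    return [-y, x]


def spiral_field(p):
    """A rotation-invariant planar field (hypothetical example field)."""
    x, y = p
    r2 = x * x + y * y
    return [-y + x * (1 - r2), x + y * (1 - r2)]


def shear_field(p):
    """A field that is not expected to be an infinitesimal symmetry."""
    x, y = p
    return [y, 0.0]


point = [1.0, 0.5]
print("rotation:", lie_bracket(rotation_field, spiral_field, point))
print("shear:   ", lie_bracket(shear_field, spiral_field, point))
```

The first bracket is zero up to discretization error, matching the fact that rotations map solutions of this particular system to solutions; the second is visibly non-zero.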
7.6 It is worth pointing out two properties of the Lie bracket of vector fields:
$$[u, v] = -[v, u], \tag{7.6.1}$$
$$[u, [v, w]] + [v, [w, u]] + [w, [u, v]] = 0. \tag{7.6.2}$$
The equality (7.6.2) is called the Jacobi identity. Generally, a vector space over $\mathbb{R}$ or $\mathbb{C}$ with a binary operation $[\cdot, \cdot]$ which is linear in each coordinate and satisfies the equalities (7.6.1), (7.6.2) is called a Lie algebra. Thus, in particular, smooth vector fields defined on the same open subset of $\mathbb{R}^n$ form a Lie algebra, as do infinitesimal symmetries of the differential equation (7.2.1) at a given point $y_0$ (this follows from the Jacobi identity).
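A quick sanity check of (7.6.1) and (7.6.2), offered as an illustration rather than as part of the text: for linear vector fields the bracket reduces to a matrix commutator. Here $A$, $B$ are arbitrary constant $n \times n$ matrices introduced only for this example, and $\partial_u v = Dv \cdot u$ as in 7.4.

```latex
u(y) = Ay, \quad v(y) = By
\;\Longrightarrow\;
\partial_u v = Dv \cdot u = BAy, \qquad \partial_v u = Du \cdot v = ABy,
\\
[u, v](y) = (BA - AB)\,y = -[v, u](y).
```

Antisymmetry is visible directly, and the Jacobi identity for such fields becomes the familiar identity for matrix commutators.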
7.7
Comment
Several concepts of this and the next section are closely related to Chapter 12 below. After finishing that chapter, the reader may be ready to tie this in together in some highly interesting and important geometrical notions which are beyond the scope of this text. For example, the notion of Lie algebra just mentioned leads to the notion of a Lie group. In Chapter 12, we will develop enough techniques to introduce the concept of a Lie group, and will mention it briefly in Exercises (6), (7), (8) of Chapter 12. Lie groups are a major field of mathematical study. We recommend [9, 10] for further reading.
8
Symmetry and separation of variables
8.1 Given a single infinitesimal symmetry $u$, then
$$\exp(\varepsilon u) \tag{8.1.1}$$
(used in the sense of the notation (7.2.2)) is a continuous family of symmetries. This is because in case of $\phi_\varepsilon$ equal to (8.1.1), by definition, the derivative of the condition
(7.4.1) by $\varepsilon$ is the same at every point $\varepsilon$, and is equal to $[u, f] = 0$. Now given an infinitesimal symmetry $u$ of the equation (7.2.1) at a point $y_0$, and assuming
$$u(y_0) \neq 0, \tag{8.1.2}$$
then, without loss of generality, we may assume that
$$u, f_2(y_0), \dots, f_n(y_0) \text{ form a basis of } \mathbb{R}^n. \tag{8.1.3}$$
(By Steinitz' Theorem 2.6 of Appendix A, this can be always achieved after permuting the coordinates $f_i$.) Assuming (8.1.3) holds, consider the following smooth map $U \to \mathbb{R}^n$ defined in an open neighborhood $U$ of $y_0$:
$$\Phi((z_1, \dots, z_n)^T) = \exp((z_1 - y_1^0)u)\,(y_1^0, z_2, \dots, z_n)^T. \tag{8.1.4}$$
We have set things up in such a way that $\Phi(y_0) = y_0$ (although obviously that is not important), and by (8.1.3) and the Implicit Function Theorem, the map $\Phi$ has a smooth inverse $\Psi$ in an open neighborhood of $y_0$. We consider the substitution
$$z = \Psi(y). \tag{8.1.5}$$
Because $u$ is an infinitesimal symmetry, the differential equation expressed in the variables $z$, i.e. (7.3.1), has a family of symmetries
$$z_1 \mapsto z_1 + \varepsilon, \tag{8.1.6}$$
$$z_i \mapsto z_i \quad \text{for } i = 2, \dots, n. \tag{8.1.7}$$
This means that the function $g$ does not depend on the variable $z_1$, and thus, we have reduced the number of variables by 1: we have a system of $n - 1$ differential equations in the variables $z_2, \dots, z_n$, and an equation for $z_1'$ in terms of $z_2, \dots, z_n$. For $n = 2$, this implies a complete solution (separation of variables). Of course, to make this method work, we must be able to evaluate (8.1.1), which, a priori, is a system of $n$ differential equations. However, in some cases, symmetries may be more easily visible than direct solutions.
8.2 It is useful to mention one generalization. By a generalized symmetry of the equation (7.2.1) we shall mean a substitution $z = \phi(y)$ such that
$$f(\phi(y)) = \alpha(y)\, D\phi|_y \cdot f(y) \tag{8.2.1}$$
for some function $\alpha: U \to \mathbb{R}$ (i.e., a scalar). The significance of a generalized symmetry is that it preserves the direction, but not the magnitude of the tangent vectors to the integral curves. Thus, roughly speaking, a generalized symmetry preserves the integral curves as sets, but not their parametrization. The infinitesimal version of this condition is
$$[u, f] = \beta f \tag{8.2.2}$$
for another scalar function $\beta: U \to \mathbb{R}$. Again, generalized infinitesimal symmetries of (7.2.1) at a point form a Lie algebra, a derivative at $0$ of a continuous system of generalized symmetries is a generalized infinitesimal symmetry, and conversely, (8.1.1) for a generalized infinitesimal symmetry is a continuous family of generalized symmetries. In the case of a generalized symmetry, we may still apply the substitution (8.1.4). As a result, (8.1.6), (8.1.7) will be a generalized symmetry of the system (7.3.1). In this case, we know that the function $g(z)/g_1(z)$ does not depend on $z_1$, so (7.3.1) reduces to a system of $n - 1$ equations
$$\frac{dz_i}{dz_1} = \frac{g_i(z)}{g_1(z)}, \quad i = 2, \dots, n.$$
Note, however, that now unless the factor $\alpha$ of the generalized symmetry has some special form, we still end up with a general first-order differential equation for the variable $z_1$.
8.2.1 Example Consider the homogeneous differential equation
$$y' = f\left(\frac{y}{x}\right).$$
In symmetric form, this is
$$y' = f\left(\frac{y}{x}\right), \quad x' = 1.$$
We have an obvious family of generalized symmetries
$$\phi_\lambda (x, y)^T = \lambda (x, y)^T \tag{*}$$
(to conform with the above notation, $\varepsilon = \lambda - 1$). The corresponding infinitesimal symmetry is
$$u(x, y)^T = (x, y)^T,$$
which exponentiates to
$$\exp(zu)(x, y)^T = e^z (x, y)^T,$$
so (fixing, say, $x^0 = 1$ and calling the new variables $z$, $v$), the substitution becomes
$$(x, y)^T = (e^{z-1}, y^0 e^{z-1} v)^T,$$
or
$$z = 1 + \ln(x), \quad v = \frac{y}{y^0 x}. \tag{**}$$
Up to scalar multiple, the formula for $v$ is the substitution from the last section. It is worthwhile noting, however, that in the present form, we obtain the autonomous equation
$$\frac{dv}{dz} = \frac{1}{y^0}\left(f(y^0 v) - y^0 v\right)$$
(which we may not have noticed in the last section). Obviously, the rather simple form of the generalized infinitesimal symmetry allows us to recover $z$ in this case.
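As a check on Example 8.2.1, one can verify the infinitesimal condition (8.2.2) directly for $u = (x, y)^T$. The computation below is our own and rests on two assumptions: $f$ is differentiable, and the vector field of the symmetric form is written as $F = (1, f(y/x))^T$ with coordinates ordered $(x, y)$ (the symbol $F$ does not appear in the text).

```latex
\partial_u F = DF \cdot u
  = \begin{pmatrix} 0 & 0 \\ -\dfrac{y}{x^2}\, f'(y/x) & \dfrac{1}{x}\, f'(y/x) \end{pmatrix}
    \begin{pmatrix} x \\ y \end{pmatrix}
  = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\qquad
\partial_F u = Du \cdot F = F,
\\
[u, F] = \partial_u F - \partial_F u = -F .
```

So in this case the scalar function in (8.2.2) is the constant $-1$: the scaling field preserves the direction field of the equation but rescales the parametrization.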
8.3
Example
The fact that for $n = 2$, a symmetry leads to separation of variables, raises the question whether the separated equation
$$y' = a(x) y \tag{8.3.1}$$
always has an infinitesimal symmetry. In fact, we plainly see that making a substitution in $x$ (independent of $y$) introduces multiplication by a function of $x$, so we should be able to make a substitution in $x$ which would eliminate the factor $a(x)$, and the equation would become autonomous. This suggests an infinitesimal symmetry of the form
$$u = (k(x), 0)^T. \tag{8.3.2}$$
The condition (7.5.1) becomes
$$k'(x) = k(x) a(x), \tag{8.3.3}$$
which can be solved. In fact, it is the original equation, so this is no simplification, but we have found a symmetry, which, as we will see, is useful. Note also that the fact that the equation (8.3.3) coincides with (8.3.1) has a geometric reason: Choosing a non-zero characteristic $C$ of the equation (8.3.1), the vector field the value of which at each $(x, y)^T$ is the vertical vector from $(x, 0)^T$ to the characteristic $C$ is a symmetry because any other characteristic, considered as a function of $x$, is a constant multiple of the function with graph $C$.
8.4
Example
The symmetry (8.3.2) (subject to the condition (8.3.3)) plainly also is a symmetry of the equation
$$y' = a(x) y + b(x) \tag{8.4.1}$$
(since (8.3.2) has $0$ Lie bracket with $(0, b(x))^T$). Thus, we may use this symmetry to solve the equation (8.4.1). The substitution we get by choosing $y^0 = (0, 0)$, $y_1 = y$, $y_2 = x$, is
$$y = z_1 k(z_2), \quad x = z_2.$$
Setting $z = z_1$, we get
$$z' = \frac{y' k(x) - y k'(x)}{k(x)^2} = \frac{b(x)}{k(x)},$$
which is solvable by an integral, as desired.
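The recipe of Example 8.4 can be turned into a small numerical sketch: compute $k$ from $k' = ka$ with $k(x_0) = 1$, integrate $z' = b/k$, and return $y = zk$. Everything below (the function names, the Simpson quadrature, the test equation $y' = 2xy + x$ with its closed form $y = (y(0) + \tfrac12)e^{x^2} - \tfrac12$) is an illustrative assumption, not something prescribed by the text.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def solve_linear(a, b, x, y_init, x_init=0.0):
    """Solve y' = a(x) y + b(x), y(x_init) = y_init, via z = y / k with k' = k a."""

    def k(t):
        # k(t) = exp(integral of a from x_init to t), so k(x_init) = 1 and k' = k a.
        return math.exp(simpson(a, x_init, t))

    # z' = b / k, and z(x_init) = y_init / k(x_init) = y_init.
    z = y_init + simpson(lambda t: b(t) / k(t), x_init, x)
    return z * k(x)


# Hypothetical test equation: y' = 2x y + x with y(0) = 1.
approx = solve_linear(lambda t: 2 * t, lambda t: t, 1.0, 1.0)
exact = 1.5 * math.exp(1.0) - 0.5
print(approx, exact)
```

The two printed values agree to quadrature accuracy, which is the content of "solvable by an integral": no differential equation solver is needed once the symmetry $k$ is known.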
9
Exercises
(1) Convert the differential equation y 000 D .y 00 /2
x C ln.x/ sin.y 2 /
into a system of first-order ODE’s. (2) Convert the system of ordinary differential equations y 00 D
z0 y 0 C y3; zCy Cx
z00 D ln.z0 C cos.y 0 C z// C 3 into a system of first-order ODE’s.
(3) (a) Using Exercise (9) of Chapter 3, describe a procedure of converting a system of equations of the form (1.3.1) to a system of the form (1.2.1) with $k_n$ raised to $k_n + 1$ without assuming we can find the implicit function explicitly. (Note that this may even be useful in the case $k_0 = \dots = k_n = 0$.) (b) Using this method, convert the differential equation
$$y' + \sin(x + y + y') = 0$$
into an ordinary (explicit) second-order differential equation.
(4) State and prove an analogue of Corollary 4.3 for the general system (1.2.1) of 1.2.
(5) Solve the differential equation
$$y' = \frac{2x}{e^y}.$$
(6) Solve the differential equation
$$y' = \frac{x^2 + y^2}{xy}.$$
(7) Solve the differential equation
$$y' = x + \frac{y}{x}.$$
(8) Prove the Jacobi identity for vector fields.
(9) Prove that infinitesimal symmetries of a system of differential equations at a point $y^0$ form a Lie algebra under the operation of Lie bracket of vector fields.
(10) Prove that generalized infinitesimal symmetries of a system of differential equations at a point $y^0$ form a Lie algebra under the operation of Lie bracket of vector fields.
(11) Prove that a generalized infinitesimal symmetry exponentiates to a generalized symmetry.
(12) Find an infinitesimal symmetry of the equation
$$y' = f(ax + by)$$
and recover the solution.
(13) Find a generalized infinitesimal symmetry of the equation
$$y' = f\left(\frac{ax + by + c}{\alpha x + \beta y + \gamma}\right)$$
and use it to find the solution.
7
Systems of Linear Differential Equations
Systems of linear differential equations have many special properties, the most important of which is that a characteristic is defined in any open interval in which the system is defined (in contrast with ODE, see Example 4.4.1 of Chapter 6). In this chapter, we prove this important “no blow-up” theorem, and discuss the linear character of the set of solutions. We also describe a method for solving completely the important class of systems of linear differential equations with constant coefficients.
1
The definition and the existence theorem for a system of linear differential equations
1.1 Let $a_{ij}, b_i$ be continuous functions on an open interval $J$. A system of linear differential equations (briefly, LDE's) is the following special case of the system of ODE's (1.1.1) of Chapter 6:
$$y_i'(x) = \sum_{j=1}^{n} a_{ij}(x) y_j(x) + b_i(x), \quad i = 1, \dots, n. \tag{L}$$
Recall that such systems arise naturally as equations for partial derivatives of solutions of general differential equations by a parameter (see (5.5.2) of Chapter 6). A linear (differential) equation of order $n$, where $a_i, b$ are continuous on $J$, is
$$y^{(n)}(x) + a_{n-1}(x) y^{(n-1)} + \dots + a_1(x) y' + a_0(x) y = b(x). \tag{$\tilde{\mathrm{L}}$}$$
Again, the system ($\tilde{\mathrm{L}}$) is easy to translate to a system of the form (L) by the method of 1.2. In fact, again, one may call (L) a system of first order LDE's, define systems of higher order LDE's, and then show such systems are equivalent to systems of first
order LDE using the method of 1.2 of Chapter 6. Consequently, it suffices, again, to develop a theory for first-order systems (L). However, in some practical situations, it is advantageous to treat the special case of a single higher order equation ($\tilde{\mathrm{L}}$) separately, as we will see below.
If all the functions $b_i$ are zero (in the case ($\tilde{\mathrm{L}}$), if $b$ is zero), we speak of homogeneous equations resp. equation. The homogeneous counterpart of an (L) resp. ($\tilde{\mathrm{L}}$) will be indicated by (L-hom) resp. ($\tilde{\mathrm{L}}$-hom).
1.2 Lemma. Let $f$ be continuous and bounded on the half-open interval $\langle a, b)$. Define a value of $f$ at $b$ arbitrarily. Then there exists the (Riemann) integral $\int_a^b f(t)\,dt$ and we have
$$\int_a^b f(t)\,dt = \lim_{x \to b} \int_a^x f(t)\,dt.$$
Comment: We prove this result here directly to make this chapter (and Chapter 6 above) largely self-contained, and independent of the techniques of the Lebesgue integral as introduced in Chapters 4, 5. The attentive reader, however, should see how the present statement follows from a much stronger result in Exercise (10) of Chapter 5, Exercise (4) of Chapter 4, and the Lebesgue Dominated Convergence Theorem.
Proof. The Riemann integrals $\int_a^x$ trivially exist (because of the continuity). Let $|f(x)| \le C$ and let $\varepsilon > 0$. Thus, we can choose partitions $D(x)$ of $\langle a, x\rangle$ such that
$$\int_a^x f - \frac{\varepsilon}{2} \le s(f|\langle a, x\rangle, D(x)) \le S(f|\langle a, x\rangle, D(x)) \le \int_a^x f + \frac{\varepsilon}{2} \tag{*}$$
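(notation from Section 8 of Chapter 1). Let $x > b - \frac{\varepsilon}{2C}$. Define a partition $D'(x)$ of $\langle a, b\rangle$ by adding the interval $\langle x, b\rangle$ to $D(x)$. Then we have
$$\begin{aligned}
s(f|\langle a, x\rangle, D(x)) - \frac{\varepsilon}{2} &\le s(f|\langle a, x\rangle, D(x)) - (b - x)C \le s(f, D'(x)) \le \underline{\int_a^b} f \\
&\le \overline{\int_a^b} f \le S(f, D'(x)) \le S(f|\langle a, x\rangle, D(x)) + (b - x)C \le S(f|\langle a, x\rangle, D(x)) + \frac{\varepsilon}{2}.
\end{aligned} \tag{**}$$
From (*) and (**), we obtain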
$$\int_a^x f - \varepsilon \le \underline{\int_a^b} f \le \overline{\int_a^b} f \le \int_a^x f + \varepsilon,$$
hence
$$\left| \int_a^x f - \underline{\int_a^b} f \right| \le \varepsilon \quad \text{and} \quad \left| \int_a^x f - \overline{\int_a^b} f \right| \le \varepsilon$$
and finally
$$\underline{\int_a^b} f = \lim_{x \to b} \int_a^x f = \overline{\int_a^b} f. \qquad \square$$
1.3 Theorem. Let $a_{ij}(x), b_i(x)$ be continuous on an interval $J$, let $x_0 \in J$ and let $\eta_j$, $j = 1, \dots, n$, be arbitrary real numbers. Then the LDE system
$$y_i'(x) = \sum_{j=1}^{n} a_{ij}(x) y_j(x) + b_i(x), \quad i = 1, \dots, n$$
has precisely one solution $y_1, \dots, y_n$, defined on the whole of $J$, such that $y_j(x_0) = \eta_j$.
Proof. Uniqueness follows from the general Theorem 4.1 of Chapter 6, from which we also know that there exists a solution defined on a neighborhood of the point $x_0$. We will prove that this solution can be extended to the whole of $J$. We will construct the extension on the part of the interval to the right of $x_0$; the extension to the left is analogous. Recall 2.1 of Chapter 6 and denote by $M$ the set of all $z \in J$, $z \ge x_0$ such that there is a solution of the equations
$$y_i(x) = \int_{x_0}^{x} \Big( \sum_{j=1}^{n} a_{ij}(t) y_j(t) + b_i(t) \Big)\,dt + \eta_i$$
on $\langle x_0, z\rangle$. Set $s = \sup M$. If the set $M$ is not all of $J \cap \langle x_0, +\infty)$, we have (1) $s$ finite, and (2) $s \in J \setminus M$. ((1) is obvious; regarding (2), either $s < \sup J$ and there is a solution in one of its neighborhoods, or $s \notin M$ while $s \in J$, since it is the only point at which $J \cap \langle x_0, +\infty)$ can differ from $M$.) Since $a_{ij}$ and $b_i$ are continuous functions defined on $\langle x_0, s\rangle$, they are bounded on this interval, say
$$|a_{ij}(x)| \le A \quad \text{and} \quad |b_i(x)| \le B.$$
Choose $C$, $\alpha$ sufficiently large to have $\alpha > 2nA$ and
$$B(s - x_0) + \max_i |\eta_i| < \frac{C}{2} e^{\alpha x}$$
and moreover such that the set
$$\tilde{M} = \{x \mid x \in M \text{ and } |y_i(x)| < C e^{\alpha x}\}$$
is non-empty. The set $\tilde{M}$ is obviously open in $M$. The set $M$ is obviously connected. Thus, if we prove that $\tilde{M}$ is closed we will see (recall Section 5 of Chapter 2) that $\tilde{M} = M$, and to do that it suffices to show that $\tilde{M}$ is closed under limits of increasing sequences (if $|y_i(y)| < C e^{\alpha y}$ holds for $y$'s arbitrarily close above $x$ it obviously holds for $x$ as well). To this end, consider an increasing sequence $x_n$ of points of $\tilde{M}$ and let $\lim_n x_n = \xi$. Important: we do not assume $\xi \in M$; this will be a consequence of this part of the proof, and will be used later. From continuity, we immediately see that $|y_i(x)| \le C e^{\alpha x}$ on the interval $\langle x_0, \xi)$ and hence by Lemma 1.2 we know that the Riemann integral
$$\int_{x_0}^{\xi} \Big( \sum_j a_{ij}(t) y_j(t) + b_i(t) \Big)\,dt$$
exists and is equal to
$$\lim_{x \to \xi} \int_{x_0}^{x} \Big( \sum_j a_{ij}(t) y_j(t) + b_i(t) \Big)\,dt = \lim_n y_i(x_n) - \eta_i.$$
If we define $y_i(\xi)$ as the limit $\lim_n y_i(x_n)$ (if $y_i$ has been already defined at $\xi$, this coincides, by continuity, with the original value) we have extended the solution of our LDE system to $\xi$ (hence, in particular, we have $\xi \in M$). We have, however,
$$\left| \int_{x_0}^{\xi} \Big( \sum_j a_{ij}(t) y_j(t) + b_i(t) \Big)\,dt + \eta_i \right| \le \int_{x_0}^{\xi} (nAC e^{\alpha t} + B)\,dt + |\eta_i| \le \frac{nAC}{\alpha} e^{\alpha \xi} + B(\xi - x_0) + |\eta_i| < C e^{\alpha \xi},$$
so $\xi$ is not only in $M$, but in fact in $\tilde{M}$. Therefore, $\tilde{M} = M$. Now we will take advantage of the fact that in our procedure, we did not assume $\xi$ to be in $M$: the supremum point $s$ can be written as a limit of an increasing sequence of elements from $M$ (equal to $\tilde{M}$), in contradiction with $s \in J \setminus M$, which followed from the assumption that $M \ne J \cap \langle x_0, +\infty)$. $\square$
1.4 Corollary. Let $a_i(x)$ $(i = 0, \dots, n-1)$ and $b(x)$ be continuous on an interval $J$, let $x_0 \in J$ and let $\eta_i$, $i = 0, \dots, n-1$, be arbitrary. Then the equation
$$y^{(n)} + \sum_{j=0}^{n-1} a_j(x) y^{(j)}(x) = b(x)$$
has precisely one solution on the interval $J$ satisfying the conditions $y^{(j)}(x_0) = \eta_j$ for all $j = 0, \dots, n-1$.
2
Spaces of solutions
2.1 In this section, the continuous functions $a_{ij}, b_i$ are defined on an open interval $J$. We denote by $C(J)$ the $\mathbb{R}$-vector space of all continuous functions on $J$. Further, we denote the vector space $C(J) \times \dots \times C(J)$ ($n$ times, the $n$-th power of $C(J)$) by $C^n(J)$.
2.2 Theorem. The system of all solutions of the LDE system (L) constitutes an affine subset $y_0 + W$ of $C^n(J)$, and the system of all solutions of the $n$-th order equation ($\tilde{\mathrm{L}}$) constitutes an affine subset $y_0 + W$, where the vector subspaces $W$ are the sets of all solutions of the associated homogeneous equations.
Proof will be done for (L). Obviously if $y = (y_1, \dots, y_n)$ and $z = (z_1, \dots, z_n)$ solve the associated homogeneous system (L-hom) then so does any $\alpha y + \beta z$, and the system of all the solutions of (L-hom) is a vector subspace of $C^n(J)$. Now if $y_0 = (y_{01}, \dots, y_{0n})$ solves (L), that is, if $y_{0i}' = \sum_j a_{ij} y_{0j} + b_i$, and if $y$ solves (L-hom), that is, $y_i' = \sum_j a_{ij} y_j$, then $y_{0i}' + y_i' = \sum_j a_{ij}(y_{0j} + y_j) + b_i$ and $y_0 + y$ solves (L). On the other hand if $z$ is an arbitrary solution of (L) then $z_i' - y_{0i}' = \sum_j a_{ij}(z_j - y_{0j})$ so that $z - y_0 \in W$ and $z = y_0 + (z - y_0) \in y_0 + W$. $\square$
Remark. Of course the principle is the same as in the solution of systems of algebraic linear equations.
Theorem. The dimensions of (both of) the affine sets from the previous theorem are $n$.
Proof. Again, we will prove the statement for the system (L). Let $y_1, \dots, y_p$ be solutions of (L-hom) and let $p > n$. Take an $x_0 \in J$. Then the system of algebraic linear equations
$$\begin{aligned}
y_{11}(x_0)\alpha_1 + y_{21}(x_0)\alpha_2 + \dots + y_{p1}(x_0)\alpha_p &= 0 \\
&\;\;\vdots \\
y_{1n}(x_0)\alpha_1 + y_{2n}(x_0)\alpha_2 + \dots + y_{pn}(x_0)\alpha_p &= 0
\end{aligned}$$
has a non-trivial solution $\alpha_1, \dots, \alpha_p$ (in fact, the vector space of such solutions has dimension at least $p - n$). Set
$$y = \sum_{i=1}^{p} \alpha_i y_i.$$
In particular we have $y(x_0) = (0, \dots, 0)$. But we already know such a solution, namely zero: $o = (\mathrm{const}_0, \dots, \mathrm{const}_0)$. From uniqueness, it now follows that $\sum_{i=1}^{p} \alpha_i y_i = o$, i.e. that the system $y_1, \dots, y_p$ is linearly dependent; hence, the dimension of $W$ is at most $n$. On the other hand consider the solutions $y_i$ of (L-hom) such that $y_{ij}(x_0)$ is $1$ for $i = j$ and $0$ otherwise. Then we obtain a linearly independent system: if $\sum \alpha_i y_i = o$ then in particular $\sum \alpha_i y_i(x_0) = 0$, that is, $\sum_i \alpha_i \delta_{ij} = 0$ and all the $\alpha_i$ are zero. $\square$
2.3
The Wronski determinants (Wronskians)
For solutions $y_1, \dots, y_n$ of (L-hom), one introduces the determinant
$$W(y_1, \dots, y_n)(x) = \begin{vmatrix} y_{11}(x) & \dots & y_{1n}(x) \\ \vdots & & \vdots \\ y_{n1}(x) & \dots & y_{nn}(x) \end{vmatrix}.$$
For solutions $y_1, \dots, y_n$ of the equation ($\tilde{\mathrm{L}}$-hom), one introduces
$$W(y_1, \dots, y_n)(x) = \begin{vmatrix} y_1(x) & \dots & y_n(x) \\ y_1'(x) & \dots & y_n'(x) \\ \vdots & & \vdots \\ y_1^{(n-1)}(x) & \dots & y_n^{(n-1)}(x) \end{vmatrix}.$$
The functions $W(y_1, \dots, y_n)(x)$ are called the Wronski determinants of the equations in question.
Remark. Note that the latter is in fact a special case of the former obtained from the standard translation as in 1.2 of Chapter 6.
2.4 Theorem. The following statements are equivalent for a system of solutions $y_1, \dots, y_n$ of the system (L-hom) (the interval $J$ is as before):
(1) the solutions $y_1, \dots, y_n$ are linearly independent,
(2) $W(y_1, \dots, y_n)(x) \ne 0$ at all $x \in J$,
(3) there exists an $x_0 \in J$ such that $W(y_1, \dots, y_n)(x_0) \ne 0$.
Similarly for the equation ($\tilde{\mathrm{L}}$-hom). If the conditions hold, the system $y_1, \dots, y_n$ is called a fundamental system of solutions.
Proof. We will prove the statement for the case ($\tilde{\mathrm{L}}$), just for a change. (1)$\Rightarrow$(2): Suppose (2) does not hold and we have an $x_0 \in J$ such that
$$W(y_1, \dots, y_n)(x_0) = \begin{vmatrix} y_1(x_0) & \dots & y_n(x_0) \\ y_1'(x_0) & \dots & y_n'(x_0) \\ \vdots & & \vdots \\ y_1^{(n-1)}(x_0) & \dots & y_n^{(n-1)}(x_0) \end{vmatrix} = 0.$$
Then the system of algebraic linear equations
$$\begin{aligned}
y_1(x_0)\alpha_1 + y_2(x_0)\alpha_2 + \dots + y_n(x_0)\alpha_n &= 0, \\
y_1'(x_0)\alpha_1 + y_2'(x_0)\alpha_2 + \dots + y_n'(x_0)\alpha_n &= 0, \\
&\;\;\vdots \\
y_1^{(n-1)}(x_0)\alpha_1 + y_2^{(n-1)}(x_0)\alpha_2 + \dots + y_n^{(n-1)}(x_0)\alpha_n &= 0
\end{aligned}$$
has a non-trivial solution $\alpha_1, \dots, \alpha_n$. If we set $y = \sum \alpha_i y_i$ we have in particular $y(x_0) = y'(x_0) = \dots = y^{(n-1)}(x_0) = 0$. This holds for the trivial constant zero solution as well and hence, by uniqueness, $\sum \alpha_i y_i = \mathrm{const}_0$ and our solutions are linearly dependent. The implications (2)$\Rightarrow$(3) and (3)$\Rightarrow$(1) are trivial. $\square$
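For instance (an illustration of ours, not from the text), the equation $y'' - 3y' + 2y = 0$ has the solutions $e^{x}$ and $e^{2x}$, and criterion (3) of Theorem 2.4 is checked by a single determinant:

```latex
W(e^{x}, e^{2x})(x) =
\begin{vmatrix} e^{x} & e^{2x} \\ e^{x} & 2e^{2x} \end{vmatrix}
= 2e^{3x} - e^{3x} = e^{3x} \neq 0 .
```

By contrast, for the dependent pair $e^{x}$, $3e^{x}$ the two columns are proportional and the Wronskian vanishes identically.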
3
Variation of constants
This is a method which allows us to find the system of solutions of the system (L) (resp. ($\tilde{\mathrm{L}}$)), provided we know a fundamental system of solutions of the system (L-hom) (resp. ($\tilde{\mathrm{L}}$-hom)). Again, the latter is a special case of the former, but in this case we will present both cases explicitly.
3.1
The system (L)
Suppose we have a basis $y_1, \dots, y_n$ of solutions of (L-hom). We will try to find a solution of (L) in the form
$$y_0(x) = \sum_{i=1}^{n} c_i(x) y_i(x)$$
(recall 6.7 of Chapter 6). We have
$$y_{ij}' = \sum_k a_{jk} y_{ik}$$
and hence
$$\begin{aligned}
y_{0j}' &= \sum_i c_i' y_{ij} + \sum_i c_i y_{ij}' = \sum_i c_i' y_{ij} + \sum_{i,k} c_i a_{jk} y_{ik} \\
&= \sum_i c_i' y_{ij} + \sum_k a_{jk} \sum_i c_i y_{ik} = \sum_i c_i' y_{ij} + \sum_k a_{jk} y_{0k},
\end{aligned}$$
and hence the problem is in finding functions $c_i(x)$ such that
$$\sum_i c_i'(x) y_{ij}(x) = b_j(x).$$
This is easily done using the Cramer rule (Appendix B, 4.2). If we denote by $W_i(x)$ the Wronskian in which we replace the $i$-th column by
$$\begin{pmatrix} b_1(x) \\ \vdots \\ b_n(x) \end{pmatrix}$$
we obtain
$$c_i'(x) = \frac{W_i(x)}{W(y_1, \dots, y_n)}$$
with the denominator non-zero by 2.4, and conclude that
$$c_i(x) = \int \frac{W_i(x)}{W(y_1, \dots, y_n)}.$$
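The procedure of 3.1 can be sketched numerically for a $2 \times 2$ system. The example below is an assumption of ours: the matrix $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, the right-hand side $b(x) = (0, x)^T$, the fundamental system $(\cos x, -\sin x)$, $(\sin x, \cos x)$, and all function names are chosen for illustration. The functions $c_i'$ are found by the Cramer rule and integrated from $0$ by Simpson's rule, which yields the particular solution vanishing at $0$, namely $(x - \sin x,\ 1 - \cos x)$.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3


def fundamental(x):
    """Rows are basis solutions y_1, y_2 of y' = A y with A = [[0, 1], [-1, 0]]."""
    return [[math.cos(x), -math.sin(x)],
            [math.sin(x), math.cos(x)]]


def rhs(x):
    """Inhomogeneous term b(x) = (b_1, b_2), chosen for illustration."""
    return [0.0, x]


def c_prime(x):
    """Solve sum_i c_i'(x) y_ij(x) = b_j(x) for (c_1', c_2') by the Cramer rule."""
    y = fundamental(x)
    b = rhs(x)
    # The linear system has coefficient matrix m[j][i] = y_i[j].
    m = [[y[0][0], y[1][0]],
         [y[0][1], y[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    c1 = (b[0] * m[1][1] - m[0][1] * b[1]) / det
    c2 = (m[0][0] * b[1] - b[0] * m[1][0]) / det
    return [c1, c2]


def particular_solution(x, x_init=0.0):
    """y_0(x) = sum_i c_i(x) y_i(x) with c_i(x_init) = 0."""
    c = [simpson(lambda t, i=i: c_prime(t)[i], x_init, x) for i in range(2)]
    y = fundamental(x)
    return [c[0] * y[0][j] + c[1] * y[1][j] for j in range(2)]


print(particular_solution(2.0), [2.0 - math.sin(2.0), 1.0 - math.cos(2.0)])
```

Choosing a different lower limit of integration changes $y_0$ only by a solution of (L-hom), in line with Theorem 2.2.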
3.2
The equation ($\tilde{\mathrm{L}}$)
Consider a basis $y_1(x), \dots, y_n(x)$ of solutions of ($\tilde{\mathrm{L}}$-hom). Let us look for a solution in the form
$$y(x) = \sum c_i(x) y_i(x).$$
We have
$$y_i^{(n)}(x) + \sum_{j=0}^{n-1} a_j y_i^{(j)} = 0.$$
Thus, if we require
$$\sum c_i'(x) y_i^{(k)}(x) = 0 \tag{*}$$
for $k = 0, \dots, n-2$, we will have
$$y'(x) = \sum c_i(x) y_i'(x), \quad \dots, \quad y^{(n-1)}(x) = \sum c_i(x) y_i^{(n-1)}(x).$$
Let us add a further requirement
$$\sum c_i'(x) y_i^{(n-1)}(x) = b(x). \tag{**}$$
Then we have
$$y^{(n)}(x) = \sum c_i(x) y_i^{(n)}(x) + b(x)$$
and conclude that
$$y^{(n)}(x) + \sum_k a_k(x) y^{(k)}(x) = b(x).$$
The requirements (*) and (**) constitute, again, a system of algebraic linear equations solvable using the Cramer rule (again with the non-zero Wronskian in the denominator) to obtain $c_i'(x)$. Finally, take the primitive functions to obtain $c_i(x)$.
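A small worked instance of the requirements (*) and (**), chosen by us for illustration: take $y'' - 3y' + 2y = e^{3x}$ with the fundamental system $y_1 = e^{x}$, $y_2 = e^{2x}$ of the homogeneous equation.

```latex
\begin{aligned}
c_1' e^{x} + c_2' e^{2x} &= 0, \\
c_1' e^{x} + 2 c_2' e^{2x} &= e^{3x}
\end{aligned}
\;\Longrightarrow\;
c_2' = e^{x}, \quad c_1' = -e^{2x}
\;\Longrightarrow\;
c_2 = e^{x}, \quad c_1 = -\tfrac{1}{2} e^{2x},
\\
y = c_1 y_1 + c_2 y_2 = -\tfrac{1}{2} e^{3x} + e^{3x} = \tfrac{1}{2} e^{3x},
\qquad
y'' - 3y' + 2y = \left(\tfrac{9}{2} - \tfrac{9}{2} + 1\right) e^{3x} = e^{3x}.
```

The constants of integration were taken to be zero; keeping them adds an arbitrary combination of $e^{x}$ and $e^{2x}$, which gives the general solution.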
4
A linear differential equation of nth order with constant coefficients
In this and the following section we will consider linear differential equations with constant coefficients $a_i$, resp. $a_{ij}$. In view of the previous section, it suffices to solve the corresponding homogeneous equations. If these are solved, the general case can be computed by variation of constants; note that the right-hand sides $b$ resp. $b_i$ do not have to be constant.
4.1
The Characteristic Polynomial
Consider the problem of finding a function $y$ satisfying
$$y^{(n)} + a_{n-1} y^{(n-1)} + \dots + a_1 y' + a_0 y = 0, \tag{*}$$
where $a_k$ are real numbers.
We already know that it suffices to find $n$ linearly independent solutions. Let us try
$$y(x) = e^{\lambda x}.$$
We have $y^{(k)}(x) = \lambda^k e^{\lambda x}$, and hence the equation (*) will be satisfied if (and only if)
$$e^{\lambda x}(\lambda^n + a_{n-1}\lambda^{n-1} + \dots + a_1 \lambda + a_0) = 0,$$
that is, since $e^{\lambda x} \ne 0$, if and only if
$$p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \dots + a_1 \lambda + a_0 = 0.$$
The polynomial $p$ is called the characteristic polynomial of the equation (*). Thus we see that if $\lambda$ is a root of the characteristic polynomial of (*) then $y(x) = e^{\lambda x}$ is a solution of this equation.
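The statement of 4.1 is easy to test numerically in the second-order case. The sketch below is ours: the function names, the finite-difference step `h`, and the sample equation $y'' - 3y' + 2y = 0$ are illustrative assumptions. It computes the roots of $p(\lambda) = \lambda^2 + a_1\lambda + a_0$ and evaluates a finite-difference approximation of $y'' + a_1 y' + a_0 y$ for $y = e^{\lambda x}$.

```python
import cmath


def quadratic_roots(a1, a0):
    """Roots of the characteristic polynomial p(lam) = lam**2 + a1*lam + a0."""
    disc = cmath.sqrt(a1 * a1 - 4 * a0)
    return (-a1 + disc) / 2, (-a1 - disc) / 2


def residual(lam, a1, a0, x, h=1e-4):
    """Finite-difference value of y'' + a1*y' + a0*y at x for y(t) = exp(lam*t)."""
    def y(t):
        return cmath.exp(lam * t)

    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d2 + a1 * d1 + a0 * y(x)


# Sample equation y'' - 3y' + 2y = 0 (hypothetical choice of coefficients).
for lam in quadratic_roots(-3.0, 2.0):
    print(lam, abs(residual(lam, -3.0, 2.0, 0.3)))

# A value of lam that is not a root leaves a residual of size |p(lam)| e^{lam x}.
print(0.5, abs(residual(0.5, -3.0, 2.0, 0.3)))
```

Because `cmath` is used throughout, the same code also handles characteristic polynomials with complex roots, where the exponentials are complex-valued.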
4.2 If $\lambda_1, \dots, \lambda_n$ are distinct numbers then the functions $e^{\lambda_1 x}, \dots, e^{\lambda_n x}$ constitute a linearly independent system. This is easily proved using the Wronski and Vandermonde determinants. For our purposes this would not suffice, though. We will need a stronger
Lemma. Let $\lambda_1, \dots, \lambda_k$ be distinct complex numbers and let $p_1(x), \dots, p_k(x)$ be polynomials. Let
$$\sum_{j=1}^{k} p_j(x) e^{\lambda_j x}$$
be identically zero. Then all the polynomials $p_j$ are zero.
Proof. Suppose not. Then among the counterexamples, choose one such that
(a) the maximum of the degrees of the polynomials $p_j$ is the least possible, and
(b) the number of the polynomials $p_j$ with this maximum degree is the least possible.
Here the degree of a constant non-zero polynomial is defined to be $0$, and the degree of the constant zero is defined to be $-1$. Thus, taking derivative of a non-zero polynomial decreases the degree by one. We have identically
$$\sum_{j=1}^{k} p_j(x) e^{\lambda_j x} = 0. \tag{4.2.1}$$
Taking the derivative we obtain
$$\sum_{j=1}^{k} p_j'(x) e^{\lambda_j x} + \sum_{j=1}^{k} p_j(x) \lambda_j e^{\lambda_j x} = 0. \tag{4.2.2}$$
Let, say, $p_1$ have the maximum degree. Subtracting (4.2.1) multiplied by $\lambda_1$ from (4.2.2), we obtain
$$p_1'(x) e^{\lambda_1 x} + \sum_{j=2}^{k} \big( (\lambda_j - \lambda_1) p_j(x) + p_j'(x) \big) e^{\lambda_j x} = 0. \tag{4.2.3}$$
Now the degree of the polynomial at $e^{\lambda_1 x}$ has decreased and none of the other degrees has increased. Thus, the formula (4.2.3) cannot be a counterexample to the statement and hence we have to have
$$p_1'(x) \equiv 0, \quad \text{and} \quad (\lambda_j - \lambda_1) p_j(x) + p_j'(x) \equiv 0 \text{ for } j > 1.$$
From the second equation we immediately see that all the $p_j$ with $j > 1$ are identically zero (since $\lambda_1 \ne \lambda_j$). The first one immediately yields only that $p_1$ has to be a constant, but $C e^{\lambda_1 x}$ is zero only if $C = 0$. $\square$
4.3 Corollary. Let $\lambda_1, \dots, \lambda_k$ be distinct complex numbers. Then the system of functions
$$\begin{aligned}
& e^{\lambda_1 x},\ x e^{\lambda_1 x},\ \dots,\ x^{s_1} e^{\lambda_1 x}, \\
& e^{\lambda_2 x},\ x e^{\lambda_2 x},\ \dots,\ x^{s_2} e^{\lambda_2 x}, \\
& \dots \\
& e^{\lambda_k x},\ x e^{\lambda_k x},\ \dots,\ x^{s_k} e^{\lambda_k x}
\end{aligned}$$
with arbitrary non-negative integers $s_j$ is linearly independent.
4.4
The simplest case
If the characteristic polynomial has $n$ distinct real roots $\lambda_1, \dots, \lambda_n$ then we have, by 4.1 and 4.3, the fundamental system of solutions
$$e^{\lambda_1 x}, \dots, e^{\lambda_n x}.$$
The problem is, hence, what to do with the complex roots, and how to deal with a possible multiplicity of some of the roots.
4.5
Complex roots
We are dealing with an LDE in real variables. Thus the characteristic polynomial has real coefficients and consequently each of the roots which is not real is accompanied by its complex conjugate as another root. That is, if $\beta_j \ne 0$ in a root $\lambda_j = \alpha_j + i\beta_j$ then there is a $k \ne j$ with $\lambda_k = \alpha_j - i\beta_j$. The two complex functions $e^{\lambda_j x}$, $e^{\lambda_k x}$ are then in our basis replaced by
$$e^{\alpha_j x} \cos \beta_j x \quad \text{and} \quad e^{\alpha_j x} \sin \beta_j x. \tag{4.5.1}$$
Replacing $e^{ix}$ and $e^{-ix}$ by linear combinations of $\cos x$ and $\sin x$, and vice versa, in the present context, is justified by Exercise (12) of Chapter 1. We will gain a much better understanding of this in Chapter 10 below.
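As an illustration of (4.5.1) with an equation of our own choosing: for $y'' + 2y' + 5y = 0$ the characteristic polynomial $\lambda^2 + 2\lambda + 5$ has the roots $-1 \pm 2i$, so the real basis should be $e^{-x}\cos 2x$, $e^{-x}\sin 2x$. A direct check for the first function:

```latex
y = e^{-x}\cos 2x, \qquad
y' = e^{-x}(-\cos 2x - 2\sin 2x), \qquad
y'' = e^{-x}(-3\cos 2x + 4\sin 2x),
\\
y'' + 2y' + 5y = e^{-x}\big[(-3 - 2 + 5)\cos 2x + (4 - 4)\sin 2x\big] = 0 .
```

The check for $e^{-x}\sin 2x$ is analogous.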
4.6
Multiple roots
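Define an operator
$$L(y) = y^{(n)} + \sum_{j=0}^{n-1} a_j y^{(j)}$$
to be applied on functions $y(x, \lambda)$ of two real variables. Thus we have
$$L(y) = \frac{\partial^n y}{\partial x^n} + \sum_{j=0}^{n-1} a_j \frac{\partial^j y}{\partial x^j}.$$
By 4.2.1 of Chapter 3, we obtain
$$\frac{\partial}{\partial \lambda} L(y) = \frac{\partial}{\partial \lambda} \frac{\partial^n y}{\partial x^n} + \dots = \frac{\partial^n}{\partial x^n} \frac{\partial y}{\partial \lambda} + \dots = L\left( \frac{\partial y}{\partial \lambda} \right)$$
and more generally
$$\frac{\partial^k}{\partial \lambda^k} L(y) = L\left( \frac{\partial^k y}{\partial \lambda^k} \right).$$
In particular for $y(x, \lambda) = e^{\lambda x}$ we have $L(y) = e^{\lambda x} p(\lambda)$ and hence
$$L(x^k e^{\lambda x}) = L\left( \frac{\partial^k y}{\partial \lambda^k} \right) = \frac{\partial^k}{\partial \lambda^k} \big( e^{\lambda x} p(\lambda) \big).$$
By induction we easily learn that
$$\frac{\partial^k}{\partial \lambda^k} \big( e^{\lambda x} p(\lambda) \big) = \sum_{j=0}^{k} \binom{k}{j} p^{(j)}(\lambda) x^{k-j} e^{\lambda x}.$$
If $\lambda$ is a $k$-multiple root of $p$ we have
$$p(\lambda) = p'(\lambda) = \dots = p^{(k-1)}(\lambda) = 0$$
and hence the equation $L(y) = 0$ is satisfied, besides $e^{\lambda x}$, also by
$$x e^{\lambda x},\ x^2 e^{\lambda x},\ \dots,\ x^{k-1} e^{\lambda x}.$$
Thus we obtain $k$ solutions, and if we apply this to all the roots we obtain $n$ solutions, independent by 4.3, and hence the fundamental system of solutions we needed. For a conjugate pair of complex roots $\alpha + i\beta$, $\alpha - i\beta$ we take, of course,
$$e^{\alpha x}\cos \beta x,\ x e^{\alpha x}\cos \beta x,\ \dots,\ x^{k-1} e^{\alpha x}\cos \beta x,\quad e^{\alpha x}\sin \beta x,\ x e^{\alpha x}\sin \beta x,\ \dots,\ x^{k-1} e^{\alpha x}\sin \beta x.$$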
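The smallest instance of this argument, with an equation picked by us for illustration, is $y'' - 2y' + y = 0$: here $p(\lambda) = (\lambda - 1)^2$, so $p(1) = p'(1) = 0$ and $\lambda = 1$ is a double root. The general formula with one derivative in $\lambda$ and the direct check read:

```latex
L(x e^{\lambda x}) = \frac{\partial}{\partial\lambda}\big(e^{\lambda x} p(\lambda)\big)
 = \big(x\,p(\lambda) + p'(\lambda)\big) e^{\lambda x},
\\
y = x e^{x}: \quad y' = (1 + x) e^{x}, \quad y'' = (2 + x) e^{x}, \quad
y'' - 2y' + y = (2 + x - 2 - 2x + x)\, e^{x} = 0 .
```

So $e^{x}$ and $x e^{x}$ form a fundamental system for this equation.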
5
Systems of LDE with constant coefficients. An application of Jordan’s Theorem
5.1 Consider a system of first-order linear differential equations
$$y' = Ay. \tag{5.1.1}$$
In fact, let us carefully consider two contexts in which (5.1.1) makes sense. The first context is, as above, when $A$ is a constant $n \times n$ matrix over $\mathbb{R}$, and $y: \mathbb{R} \to \mathbb{R}^n$ is an unknown vector-valued function. However, it also makes sense to consider the case when $A$ is an $n \times n$ matrix over $\mathbb{C}$, and the unknown function is $y: \mathbb{R} \to \mathbb{C}^n$. This case makes sense since we may identify $\mathbb{C} \cong \mathbb{R}^2$, and such a system of $n$ complex-valued first-order differential equations can therefore be interpreted as a system of $2n$ real-valued first-order linear differential equations. Let us emphasize, however, that in this discussion, the independent variable remains real. The advantage of considering (5.1.1) over $\mathbb{C}$ is that over $\mathbb{C}$, every matrix is similar to a matrix in Jordan canonical form. Changing basis to the basis in which the matrix is in Jordan form gives a substitution which allows us to solve the system of equations. Even more explicitly, this can be said as follows: consider a $k \times k$ Jordan block of the matrix $A$ with respect to an eigenvalue $\lambda$. This corresponds to $k$ vectors $u_1, \dots, u_k \in \mathbb{C}^n$ such that
$$Au_1 = \lambda u_1, \quad Au_j = \lambda u_j + u_{j-1}, \quad j = 2, \dots, k. \tag{5.1.2}$$
Then this data give the following solutions of the system (5.1.1):
$$\begin{aligned}
& u_1 e^{\lambda x}, \\
& u_2 e^{\lambda x} + u_1 x e^{\lambda x}, \\
& \dots \\
& u_k e^{\lambda x} + u_{k-1} x e^{\lambda x} + \dots + u_1 \frac{x^{k-1}}{(k-1)!} e^{\lambda x}.
\end{aligned} \tag{5.1.3}$$
Taking the solutions (5.1.3) for all Jordan blocks gives a fundamental system of solutions, which we can see by taking the determinant of their values at $0$ (where we get the base change matrix from the Jordan basis to the standard basis); recall from Theorem 2.4 that a system of $n$ solutions whose values are independent at one point is a fundamental system of solutions.
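The second solution in (5.1.3) can be verified in one line from (5.1.2); the computation below is our own check and uses nothing beyond the relations $Au_1 = \lambda u_1$ and $Au_2 = \lambda u_2 + u_1$:

```latex
\frac{d}{dx}\left(u_2 e^{\lambda x} + u_1 x e^{\lambda x}\right)
 = (\lambda u_2 + u_1)\, e^{\lambda x} + \lambda u_1\, x e^{\lambda x}
 = A u_2\, e^{\lambda x} + A u_1\, x e^{\lambda x}
 = A\left(u_2 e^{\lambda x} + u_1 x e^{\lambda x}\right).
```

The remaining solutions in (5.1.3) are checked the same way, the factor $\frac{x^{m}}{m!}$ being chosen so that its derivative is $\frac{x^{m-1}}{(m-1)!}$.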
5.2 Let us now consider the case when the system (5.1.1) is over $\mathbb{R}$. Then, the matrix $A$ is a real matrix. This means that for every solution $y$ over $\mathbb{C}$,
$$\mathrm{Re}(y), \quad \mathrm{Im}(y) \tag{5.2.1}$$
are real solutions of (5.1.1). Taking all such solutions for all Jordan blocks gives a system of real solutions which, when considered over $\mathbb{C}$, generate the vector space of all the complex solutions and hence must contain a basis of the space of real solutions (which can be found explicitly by finding a set of columns which form a basis of the matrix of values at $0$).
5.2.1 Example Consider the system (5.1.1) with 0
0 B1 ADB @0 1
1 0 0 0
0 0 0 1
1 0 0C C: 1 A 0
Then one sees right away that the characteristic polynomial is A .x/ D .x 2 C 1/2 ; and the Jordan canonical form is 0
i B1 J DB @0 1
1 0 0 0 i 0 0C C: 0 i 0A 0 1 i
Let us consider the Jordan block corresponding to the eigenvalue i . By solving systems of linear equations, we find that u1 D .0; 0; 1; i /T ; u2 D .2i; 2; 0; 1/T : Note that we could equivalently take a scalar multiple of both vectors by the same non-zero complex number. Thus, (5.1.3) produces solutions .0; 0; e ix ; i e ix /T ; .2i; 2; 0; 1/T e ix C .0; 0; 1; i /T xe ix : Taking real and imaginary parts, we get four real solutions .0; 0; cos.x/; sin.x//T ; .0; 0; sin.x/; cos.x//T ; .2 sin.x/; 2 cos.x/; x cos.x/; cos.x/ x sin.x//T ; .2 cos.x/; 2 sin.x/; x sin.x/; sin.x/ C x cos.x//T : Since the data obtained from the other Jordan block can be taken complex conjugate, we know that these solutions span the space of all complex solutions, and hence form a fundamental system of real solutions.
5.3
Remark
As mentioned above, a single differential equation with constant coefficients
$$y^{(n)} + a_1 y^{(n-1)} + \dots + a_n y = 0$$
can be converted to a system of first-order linear differential equations (5.1.1) where
$$A = \begin{pmatrix}
0 & 1 & 0 & \dots & 0 \\
0 & 0 & 1 & \dots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \dots & 1 \\
-a_n & -a_{n-1} & -a_{n-2} & \dots & -a_1
\end{pmatrix}.$$
We clearly have
$$\chi_A(x) = x^n + a_1 x^{n-1} + \dots + a_n. \tag{5.3.1}$$
In fact, we call $A$ the characteristic matrix of the polynomial (5.3.1). We may ask when, conversely, a system of first-order linear differential equations with constant coefficients may be converted by a substitution to a single $n$-th order linear differential equation. Clearly, this is equivalent to asking which square matrices are similar to characteristic matrices. We will solve this question in the exercises.
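The relation between the matrix $A$ of Remark 5.3 and the polynomial (5.3.1) can be spot-checked with a few lines of code. This is a sketch under our own assumptions: the function names are hypothetical, and the sample polynomial $x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)$ is chosen only because its roots are known. For each root $\lambda$ the vector $(1, \lambda, \lambda^2)^T$, which records $y, y', y''$ for $y = e^{\lambda x}$ at $x = 0$, should be an eigenvector of $A$ with eigenvalue $\lambda$.

```python
def companion_matrix(coeffs):
    """Matrix A of Remark 5.3 for y^(n) + a_1 y^(n-1) + ... + a_n y = 0.

    coeffs is the list [a_1, ..., a_n].
    """
    n = len(coeffs)
    rows = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    # Last row: -a_n, -a_{n-1}, ..., -a_1.
    rows.append([-coeffs[n - 1 - j] for j in range(n)])
    return rows


def mat_vec(m, v):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def char_poly_value(coeffs, x):
    """Value of x^n + a_1 x^(n-1) + ... + a_n at x."""
    n = len(coeffs)
    return x ** n + sum(a * x ** (n - 1 - k) for k, a in enumerate(coeffs))


coeffs = [-6, 11, -6]          # x^3 - 6x^2 + 11x - 6, a hypothetical example
A = companion_matrix(coeffs)
for lam in (1, 2, 3):
    v = [lam ** k for k in range(len(coeffs))]
    print(lam, char_poly_value(coeffs, lam), mat_vec(A, v), [lam * c for c in v])
```

For each root the last two printed lists coincide, which is the eigenvector relation $Av = \lambda v$.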
6
Exercises
(1) Prove that the Wronskian $W(x)$ of any $n$ solutions of the system $\mathbf{y}' = A(x)\mathbf{y}$ satisfies the differential equation $W'(x) = \operatorname{tr}(A)W(x)$. (Here for a square matrix $A$, $\operatorname{tr}(A)$ is the sum of its diagonal terms.)
(2) The differential equation
$$y'' + \frac{y'}{x} - \frac{y}{x^2} = 0$$
has solutions $y = x$, $y = \frac{1}{x}$. Find all solutions of the differential equation
$$y'' + \frac{y'}{x} - \frac{y}{x^2} = e^x.$$
(3) Find a fundamental system of solutions of the equation $y^{(3)} - y^{(2)} + 8y' + 12y = 0$.
(4) Find all solutions of the system of LDE's $y_1' = y_1 - y_2 + xe^x$, $y_2' = y_1 + 3y_2 + x^2$.
(5) Find a fundamental system of real solutions of the system of LDE's (5.1.1) with 0
1 B 1 ADB @ 0 1
1 1 0 0
0 0 1 1
1 1 1C C: 1A 1
(6) Prove that the characteristic polynomial of a characteristic matrix $A_p$ of a polynomial $p$ with highest coefficient 1 is equal to $p$.
(7) A cyclic vector for a linear transformation $f: V \to V$ is a vector $v \in V$ such that the vectors $v, f(v), \dots, f^N(v), \dots$ span the vector space $V$. (As usual, we will identify an $n \times n$ matrix with the linear transformation $\mathbb{R}^n \to \mathbb{R}^n$ it defines by matrix multiplication.) Prove that a matrix is similar to a characteristic matrix if and only if it has a cyclic vector.
(8) Suppose an $n \times n$ matrix $A$ over $\mathbb{C}$ has only one eigenvalue $\lambda$. Prove that $A$ has a cyclic vector if and only if $A$ is similar to a Jordan block. [Hint: If $v$ is a cyclic vector, prove that $(\lambda I - A)^j v$ for $j \geq 0$ span $\mathbb{C}^n$.]
(9) Prove that if $f: V \to V$ is a linear transformation, $v \in V$ is a cyclic vector and $W \subseteq V$ is a subspace such that $f(W) \subseteq W$, then the image $v + W$ of $v$ in $V/W$ is a cyclic vector for the induced linear transformation $f/W: V/W \to V/W$.
(10) Using the results of Exercises 8 and 9, prove that a square matrix $A$ over $\mathbb{C}$ has a cyclic vector if and only if $A$ has exactly one Jordan block for each eigenvalue $\lambda$. (Note: such matrices are sometimes called regular, which however may be confusing since this notion has nothing to do with non-singularity.)
(11) Suppose you know a cyclic vector of an $n \times n$ (constant) matrix $A$. Explain how you can use the method of Section 4 (which is simpler than the method of Section 5) for solving the system of LDE's $\mathbf{y}' = A\mathbf{y}$.
8
Line Integrals and Green’s Theorem
In this chapter, we introduce the line integral and prove Green’s Theorem which relates a line integral over a closed curve (or curves) in R2 to the ordinary integral of a certain quantity over the region enclosed by the curve(s). Making rigorous sense of what this last concept means is a big part of the work. Much of the material of this section is subsumed by the more general treatment of Stokes’ Theorem in manifolds of arbitrary dimension in Chapter 12 below. However, there are two important reasons to present Green’s Theorem first. The first reason is that Green’s Theorem is much more elementary, and does not require the added abstraction, and algebra and topology material needed for Stokes’ Theorem. The other important reason is that Green’s Theorem can be, in fact, used directly to set up the foundations of basic complex analysis, which we do in the next chapter, and which is rather nice to do without having to go into Stokes’ Theorem in a general dimension.
1
Curves and line integrals
1.1 A parametrization of a (piecewise continuously differentiable) curve in $\mathbb{R}^n$ is a continuous map
$$\boldsymbol{\phi} = (\phi_1, \dots, \phi_n)^T : \langle a, b\rangle \to \mathbb{R}^n$$
(recall our convention 1.1 of Chapter 3 of denoting vector functions by bold-faced letters) such that there exists a partition
$$a = a_0 < a_1 < \cdots < a_k = b$$
of the interval $\langle a, b\rangle$ for which we have:
(1) On each of the intervals $\langle a_{i-1}, a_i\rangle$, each of the functions $\phi_j$ has a continuous derivative (use one-sided derivatives at the endpoints).
(2) For every $i$ there exists a $j$ such that $\phi_j'(t)$ is positive or negative on all of $\langle a_{i-1}, a_i\rangle$ (again, take one-sided derivatives at the endpoints).
We will sometimes also speak of a parametrized piecewise continuously differentiable curve.
Comment: Instead of condition (2), one may simply require that $\boldsymbol{\phi}'(t) \neq \mathbf{0}$ on $\langle a_{i-1}, a_i\rangle$, as the interval can then be subdivided into finitely many intervals on each of which condition (2) holds.
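1.2 We say that two parametrizations $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$, $\boldsymbol{\psi}: \langle c, d\rangle \to \mathbb{R}^n$ are weakly equivalent, and write
$$\boldsymbol{\phi} \sim \boldsymbol{\psi},$$
if there exists a homeomorphism $\alpha: \langle a, b\rangle \to \langle c, d\rangle$ such that $\boldsymbol{\psi} \circ \alpha = \boldsymbol{\phi}$. Note that the relation $\sim$ really is an equivalence relation, i.e. that it is reflexive, symmetric and transitive (see Exercise (2)). Equivalence classes with respect to $\sim$ will be called piecewise continuously differentiable curves.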
1.3
Remark
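Clearly, $\boldsymbol{\phi} \sim \boldsymbol{\psi}$ implies $\boldsymbol{\phi}[\langle a, b\rangle] = \boldsymbol{\psi}[\langle c, d\rangle]$. On the other hand, if $\boldsymbol{\phi}$ and $\boldsymbol{\psi}$ are one-to-one and we have $\boldsymbol{\phi}[\langle a, b\rangle] = \boldsymbol{\psi}[\langle c, d\rangle]$, then we have $\boldsymbol{\phi} \sim \boldsymbol{\psi}$. In effect, consider the maps $\overline{\boldsymbol{\phi}}: \langle a, b\rangle \to \boldsymbol{\phi}[\langle a, b\rangle]$, $\overline{\boldsymbol{\psi}}: \langle c, d\rangle \to \boldsymbol{\psi}[\langle c, d\rangle]$ defined by the same formulas as $\boldsymbol{\phi}$, $\boldsymbol{\psi}$. Since the relevant spaces are compact, $\overline{\boldsymbol{\phi}}$, $\overline{\boldsymbol{\psi}}$ are homeomorphisms (see 6.2.2 of Chapter 2). Put $\alpha = \overline{\boldsymbol{\psi}}^{-1} \circ \overline{\boldsymbol{\phi}}$.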
1.4 Proposition. The map $\alpha$ from the definition of weak equivalence is piecewise continuously differentiable.
Proof. Let $a_0, \dots, a_r$, $c_0, \dots, c_s$ be the partitions figuring in Definition 1.1 for the parametrizations $\boldsymbol{\phi}$, $\boldsymbol{\psi}$. Let $b_0, \dots, b_k$ be a common refinement of $a_0, \dots, a_r$ and $\alpha^{-1}(c_0), \dots, \alpha^{-1}(c_s)$. On the interval $\langle \alpha(b_{i-1}), \alpha(b_i)\rangle$, choose $j$ such that $\psi_j$ is a one-to-one continuously differentiable map with non-zero derivative. Then $\psi_j^{-1}$ has a derivative and we have $\alpha(t) = \psi_j^{-1}(\phi_j(t))$ on $\langle b_{i-1}, b_i\rangle$. □
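1.5 We say that two parametrizations $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$, $\boldsymbol{\psi}: \langle c, d\rangle \to \mathbb{R}^n$ are equivalent, and write $\boldsymbol{\phi} \approx \boldsymbol{\psi}$, if there exists an increasing homeomorphism $\alpha$ such that $\boldsymbol{\psi} \circ \alpha = \boldsymbol{\phi}$. The equivalence classes of $\approx$ are called oriented piecewise continuously differentiable curves.
Proposition. Suppose a parametrization $\boldsymbol{\phi}$ of a piecewise continuously differentiable curve is one-to-one with the possible exception $\boldsymbol{\phi}(a) = \boldsymbol{\phi}(b)$. Then the $\sim$-equivalence class of $\boldsymbol{\phi}$ contains precisely two $\approx$-equivalence classes.
Proof. Since $\approx \;\Rightarrow\; \sim$, a $\sim$-equivalence class is a union of $\approx$-equivalence classes. Define $\iota: \langle a, b\rangle \to \langle a, b\rangle$ by $\iota(t) = -t + b + a$. Then we have $\boldsymbol{\phi} \circ \iota \sim \boldsymbol{\phi}$ but (by injectivity) not $\boldsymbol{\phi} \circ \iota \approx \boldsymbol{\phi}$. Therefore, there are at least two $\approx$-equivalence classes in each $\sim$-equivalence class. When $\boldsymbol{\phi} \sim \boldsymbol{\psi} \sim \boldsymbol{\xi}$ but the homeomorphisms $\alpha$, $\beta$ from Definition 1.2 for the pairs $\boldsymbol{\phi}, \boldsymbol{\psi}$ and $\boldsymbol{\psi}, \boldsymbol{\xi}$ are not increasing, then $\beta\alpha$ is increasing, and hence $\boldsymbol{\phi} \approx \boldsymbol{\xi}$. □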
1.6
A Remark and a Convention
The geometric idea of a curve is modelled well by the concepts of 1.2 (see 1.3). A parametrization can be interpreted as additional information about “time” at which we are at a particular point when travelling along the curve. In an oriented curve, we do not care about the precise time at which we are at a particular point, but we do want to keep track of the direction of travel.
We will often speak freely of an (oriented or unoriented) curve $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$. Of course, what we really mean is the corresponding equivalence class of parametrizations.
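1.7 Let $K$, $L$ be oriented curves with parametrizations $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$, $\boldsymbol{\psi}: \langle a_1, b_1\rangle \to \mathbb{R}^n$ such that $\boldsymbol{\phi}(b) = \boldsymbol{\psi}(a_1)$. Without loss of generality, we may assume $a_1 = b$ (otherwise, we may replace $\boldsymbol{\psi}$, say, by the parametrization $\boldsymbol{\psi} \circ \alpha$ where $\alpha(t) = a_1 + (t - b)$). Let us then write $c = b_1$, and define $\boldsymbol{\phi} * \boldsymbol{\psi}: \langle a, c\rangle \to \mathbb{R}^n$ by
$$(\boldsymbol{\phi} * \boldsymbol{\psi})(t) = \begin{cases} \boldsymbol{\phi}(t) & \text{for } t \in \langle a, b\rangle, \\ \boldsymbol{\psi}(t) & \text{for } t \in \langle b, c\rangle. \end{cases}$$
Then $\boldsymbol{\phi} * \boldsymbol{\psi}$ is a parametrization of a new oriented curve which we will denote by
$$L + K.$$
This parametrized curve is well-defined in terms of $K$, $L$; moreover, this operation is associative (see Exercises (3), (4), (5) below).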
1.8 Recall the map $\iota$ of 1.5. If $L$ is an oriented curve with parametrization $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$, define $-L$ as the oriented curve determined by $\boldsymbol{\phi} \circ \iota$. Thus, $-L$ is “the other oriented curve” which represents the same (unoriented) curve, and accordingly, we shall refer to $-L$ as the oriented curve with opposite orientation. Again, $-L$ does not depend on the particular parametrization of the oriented curve $L$.
1.9
Some terminology
A curve with a one-to-one parametrization is sometimes called a simple arc; a curve with a representation $\boldsymbol{\phi}$ such that $\boldsymbol{\phi}(a) = \boldsymbol{\phi}(b)$ but $\boldsymbol{\phi}(x) \neq \boldsymbol{\phi}(y)$ unless $x = y$ or $\{x, y\} = \{a, b\}$ is called a simple closed curve. The word ‘closed’ in this context has, of course, a different meaning than a closed subset of a topological space.
2
Line integrals of the first kind (= according to length)
2.1 Recall our definition of Euclidean norm $\|\mathbf{u}\| = \sqrt{\mathbf{u} \cdot \mathbf{u}}$ where $\cdot$ is the dot product. Then $\|\mathbf{u} - \mathbf{v}\|$ is the ordinary Euclidean distance. While this distance function was unimportant (even awkward) in topological considerations, and could be replaced by any equivalent metric, in the present section it will play a crucial role.
2.2 Recall the definition of the Riemann integral and view it informally as a kind of summation of the function $f$ over the length of an interval. Since an interval is a very special case of a piecewise continuously differentiable curve (where the parametrization is the identity), we may wonder if the Riemann integral could be generalized to a situation where the domain is (the image of) a piecewise continuously differentiable curve. This intuition indeed works. By a partition of a parametrized piecewise continuously differentiable curve $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$ we will mean a sequence of points
$$\boldsymbol{\phi}(t_0), \boldsymbol{\phi}(t_1), \dots, \boldsymbol{\phi}(t_k) \tag{*}$$
where $t_0 < t_1 < \cdots < t_k$ is a partition of the interval $\langle a, b\rangle$. The mesh of a partition (*) is the maximum of the numbers $\|\boldsymbol{\phi}(t_{i-1}) - \boldsymbol{\phi}(t_i)\|$. Note that since $\langle a, b\rangle$ is a compact space, $\boldsymbol{\phi}$ is a uniformly continuous map and hence if the mesh of a sequence of partitions goes to 0, so does the mesh of their $\boldsymbol{\phi}$-images. Now consider a continuous real function $f$ defined (at least) on $\boldsymbol{\phi}[\langle a, b\rangle]$. In analogy with the Riemann integral (recall, in particular, Theorem 8.3 of Chapter 1), let us investigate sums of the form
$$\sum_{i=1}^{k} f(\boldsymbol{\phi}(t_i))\,\|\boldsymbol{\phi}(t_i) - \boldsymbol{\phi}(t_{i-1})\|$$
and let us see if they converge to a particular value when the mesh goes to 0. By the Mean Value Theorem, for suitable points $\theta_{ij}$ between $t_{i-1}$ and $t_i$,
$$\sum_i f(\boldsymbol{\phi}(t_i)) \sqrt{\sum_{j=1}^{n} (\phi_j(t_i) - \phi_j(t_{i-1}))^2} = \sum_i f(\boldsymbol{\phi}(t_i)) \sqrt{\sum_j \phi_j'(\theta_{ij})^2 (t_i - t_{i-1})^2} = \sum_i f(\boldsymbol{\phi}(t_i)) \sqrt{\sum_j \phi_j'(\theta_{ij})^2}\,(t_i - t_{i-1})$$
which, by Theorem 8.3 of Chapter 1, when the mesh goes to 0, converges to
$$\int_a^b f(\boldsymbol{\phi}(t))\,\|\boldsymbol{\phi}'(t)\|\,dt.$$
When $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$ is a parametrization of a piecewise continuously differentiable curve $L$, we call the number
$$\int_a^b f(\boldsymbol{\phi}(t))\,\|\boldsymbol{\phi}'(t)\|\,dt \tag{**}$$
the line integral of the first kind (or integral according to length) of the function $f$ over the curve $L$, and denote it by
$$\int_L f \quad\text{or}\quad \int_L f(\mathbf{x})\,\|d\mathbf{x}\|.$$
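The passage from the polygonal sums to the integral (**) can be watched numerically. The following is a rough sketch only: the unit circle, the function $f(x, y) = x^2$, the uniform partition and all function names are choices made for this illustration, and the integral is approximated by a midpoint rule.

```python
import math


def first_kind_polygonal(f, phi, a, b, k):
    """Sum of f(phi(t_i)) * ||phi(t_i) - phi(t_{i-1})|| over a uniform partition of <a, b>."""
    ts = [a + (b - a) * i / k for i in range(k + 1)]
    total = 0.0
    for t_prev, t in zip(ts, ts[1:]):
        p, q = phi(t_prev), phi(t)
        total += f(q) * math.dist(p, q)
    return total


def first_kind_parametric(f, phi, dphi, a, b, k):
    """Midpoint-rule approximation of the integral of f(phi(t)) * ||phi'(t)|| dt."""
    h = (b - a) / k
    total = 0.0
    for i in range(k):
        t = a + (i + 0.5) * h
        total += f(phi(t)) * math.hypot(*dphi(t)) * h
    return total


def circle(t):
    """Parametrization of the unit circle on <0, 2*pi>."""
    return (math.cos(t), math.sin(t))


def circle_prime(t):
    """Derivative of the parametrization above."""
    return (-math.sin(t), math.cos(t))
```

With $f = 1$ the polygonal sum is the length of an inscribed polygon and approaches $2\pi$; with $f(x, y) = x^2$ both functions approach $\int_0^{2\pi} \cos^2 t\,dt = \pi$.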
2.2.1 Comment The formula (**) makes sense, of course, for any integrable function $f$, in which case the integral (**) exists as a Lebesgue integral. A similar comment will apply to all the types of curve integrals we shall introduce. It is useful to note, however, that in the context of the present chapter, we are not interested in such a level of generality, and are happy to assume that the function $f$ is continuous, in which case the Lebesgue integral is the same as the Riemann integral. Nevertheless, even with that in mind, the Lebesgue integral techniques we developed in Chapter 5 are still needed, for example in arguments such as differentiating behind the integral sign in Proposition 3.7 or the use of multivariable substitution in Section 5 below.
2.3 Proposition. The expression in the definition of the line integral of the first kind is independent of parametrization.
Proof. Let $\boldsymbol{\phi}$ and $\boldsymbol{\psi}$ be as in 1.2, let $\boldsymbol{\phi} = \boldsymbol{\psi} \circ \alpha$. By 1.4, $\alpha$ is piecewise continuously differentiable (except at finitely many points where there are, at most, discontinuities of the first kind, i.e. such that the corresponding one-sided limits exist), and hence we have
$$\|\boldsymbol{\phi}'(t)\| = \sqrt{\sum_j \phi_j'(t)^2} = \sqrt{\sum_j \psi_j'(\alpha(t))^2 (\alpha'(t))^2} = \sqrt{\sum_j \psi_j'(\alpha(t))^2}\;|\alpha'(t)| = \|\boldsymbol{\psi}'(\alpha(t))\|\,|\alpha'(t)|$$
and hence by the Substitution Theorem (for the Riemann integral in one variable), we have
$$\int_a^b f(\boldsymbol{\phi}(t))\,\|\boldsymbol{\phi}'(t)\|\,dt = \int_a^b f(\boldsymbol{\psi}(\alpha(t)))\,\|\boldsymbol{\psi}'(\alpha(t))\|\,|\alpha'(t)|\,dt = \int_c^d f(\boldsymbol{\psi}(\tau))\,\|\boldsymbol{\psi}'(\tau)\|\,d\tau. \qquad \square$$
(The attentive reader will recall from the theory of the single variable Riemann integral substitution that if $\alpha$ is decreasing, the absolute value in $|\alpha'(t)|$ is nevertheless correct because of an interchange of bounds.)
2.4
Remark
The length of a curve $L$ is defined as the integral of the first kind of the function 1 over $L$, i.e.
$$\int_L 1 = \int_a^b \|\boldsymbol{\phi}'\|.$$
3
Line integrals of the second kind
3.1 Let $\boldsymbol{\phi}: \langle a, b\rangle \to \mathbb{R}^n$ be a parametrization of a piecewise continuously differentiable oriented curve $L$, and let $\mathbf{f} = (f_1, \dots, f_n)^T$ be a vector function defined (at least) on $\boldsymbol{\phi}[\langle a, b\rangle]$. The line integral of the second kind of the function $\mathbf{f}$ over the oriented curve $L$ is the number
$$\int_L \mathbf{f} = \int_a^b \mathbf{f}(\boldsymbol{\phi}(t)) \cdot \boldsymbol{\phi}'(t)\,dt = \sum_{j=1}^{n} \int_a^b f_j(\boldsymbol{\phi}(t))\,\phi_j'(t)\,dt$$
(note, in the middle expression, the dot product of vectors). When there is a danger of confusion, we will denote line integrals of the first and second kind explicitly by
$$\text{(I)}\int_L, \qquad \text{(II)}\int_L.$$
In the literature, the line integral of the second kind is also often denoted by
$$\int_L (f_1\,dx_1 + \cdots + f_n\,dx_n).$$
This notation, in fact, conforms to the notation of differential forms, which we will see later in Chapter 12. When $\mathbf{x} = (x_1, \dots, x_n)^T$, we will also use the notation
$$\int_L \mathbf{f}(\mathbf{x}) \cdot d\mathbf{x}.$$
3.2 The “physical” meaning of the line integral of the second kind: We travel around the curve $L$ from the beginning point to the end point. $\int_L \mathbf{F}$ is then the work done when we exert the force $\mathbf{F}$ at each given point of the curve.
3.3 Proposition. The expression in the definition of the line integral of the second kind does not depend on the choice of parametrization of an oriented piecewise continuously differentiable curve.
Proof. Let $\boldsymbol{\phi} = \boldsymbol{\psi} \circ \alpha$. Now, of course, $\alpha'(t) > 0$ (with the possible exception of finitely many points, where $\alpha'$ has, at most, discontinuities of the first kind). We have
$$\sum_{j=1}^{n} \int_a^b f_j(\boldsymbol{\phi}(t))\,\phi_j'(t)\,dt = \sum_{j=1}^{n} \int_a^b f_j(\boldsymbol{\psi}(\alpha(t)))\,\psi_j'(\alpha(t))\,\alpha'(t)\,dt = \sum_{j=1}^{n} \int_c^d f_j(\boldsymbol{\psi}(\tau))\,\psi_j'(\tau)\,d\tau. \qquad \square$$
Observation. $\displaystyle\int_{-L} \mathbf{f} = -\int_L \mathbf{f}$.
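The definition of the line integral of the second kind and the sign change under reversal of orientation can be illustrated as follows. This is a sketch under stated assumptions: the field $\mathbf{f}(x, y) = (-y, x)$, the unit circle, the midpoint rule and all names are chosen for the illustration, and the reversed curve is obtained by composing with the map $t \mapsto -t + b + a$ of 1.5.

```python
import math


def second_kind(f, phi, dphi, a, b, k=2000):
    """Midpoint-rule approximation of the integral of f(phi(t)) . phi'(t) dt over <a, b>."""
    h = (b - a) / k
    total = 0.0
    for i in range(k):
        t = a + (i + 0.5) * h
        total += sum(fj * dj for fj, dj in zip(f(phi(t)), dphi(t))) * h
    return total


def rotation_field(p):
    """The vector field f(x, y) = (-y, x)."""
    x, y = p
    return (-y, x)


def circle(t):
    """Unit circle, traversed counter-clockwise on <0, 2*pi>."""
    return (math.cos(t), math.sin(t))


def circle_prime(t):
    return (-math.sin(t), math.cos(t))


def reversed_circle(t):
    """The same circle composed with t -> -t + 2*pi, i.e. with opposite orientation."""
    return circle(2 * math.pi - t)


def reversed_circle_prime(t):
    # Chain rule: the inner map has derivative -1.
    dx, dy = circle_prime(2 * math.pi - t)
    return (-dx, -dy)
```

Here the integrand $\mathbf{f}(\boldsymbol{\phi}(t)) \cdot \boldsymbol{\phi}'(t)$ is identically 1 on the counter-clockwise circle, so the first integral is $2\pi$ and the second is $-2\pi$.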
3.4 We immediately see the following
Proposition. Let $K$, $L$ be oriented piecewise continuously differentiable curves such that $K + L$ is defined. Then
$$\int_{K+L} \mathbf{f} = \int_K \mathbf{f} + \int_L \mathbf{f}.$$
3.5 Now let $f$ be a (scalar) function defined on $\boldsymbol{\phi}[\langle a, b\rangle]$ where $\boldsymbol{\phi}$ is a parametrization of a piecewise continuously differentiable oriented curve. On the same set, define
$$\mathbf{f}(\boldsymbol{\phi}(t)) = f(\boldsymbol{\phi}(t))\,\frac{\boldsymbol{\phi}'(t)}{\|\boldsymbol{\phi}'(t)\|}.$$
From the definitions, one has immediately
$$\text{(I)}\int_L f = \text{(II)}\int_L \mathbf{f}.$$
Thus, the line integral of the first kind can be reduced to the line integral of the second kind.
3.6
Remarks
1. The traditional terms “of the first kind” and “of the second kind” therefore should not be interpreted as expressing the order of importance. The line integral of the second kind is in fact more fundamental, and the integral of the first kind can be reduced to it. Perhaps the reason for the terminology is that the line integral of the first kind is the more naive notion.
2. The function $f$ or $\mathbf{f}$ often is defined on an open set containing $\boldsymbol{\phi}[\langle a, b\rangle]$. This will play a crucial role in the proof of Green’s Theorem.
3.7 Since continuous functions on a compact set are bounded, we obtain immediately from Theorem 5.2 of Chapter 5 the following
3.8 Proposition. Let $\mathbf{f}(\alpha, \mathbf{x})$ be a continuous vector function defined in an open set $U$ of $\mathbb{R}^n$ such that $\dfrac{\partial f_j(\alpha, \mathbf{x})}{\partial \alpha}$ is continuous on $U$ for each $j$. Then the line integral of the second kind satisfies
$$\frac{d}{d\alpha} \int_L \mathbf{f}(\alpha, \mathbf{x}) \cdot d\mathbf{x} = \int_L \frac{\partial \mathbf{f}(\alpha, \mathbf{x})}{\partial \alpha} \cdot d\mathbf{x}.$$
4
The complex line integral
4.1 For a complex function of one real variable, $f(t) = f_1(t) + if_2(t)$ where $f_1$, $f_2$ are real functions, one introduces the Riemann integral by the formula
$$\int_a^b f(t)\,dt = \int_a^b f_1(t)\,dt + i\int_a^b f_2(t)\,dt.$$
4.2 Recall that on the field of complex numbers $\mathbb{C}$, we use the distance function $d(x, y) = |x - y|$, which is the same as the Euclidean distance when we identify $\mathbb{C}$ with $\mathbb{R}^2$ by $x + iy \mapsto (x, y)$. We will use this identification freely to define piecewise continuously differentiable functions in $\mathbb{C}$, etc., but now note that the values $\phi(t)$ are elements of the field $\mathbb{C}$ and hence can be subjected to the multiplication in $\mathbb{C}$, which is different from the dot multiplication in $\mathbb{R}^2$ (for example in that the result is again an element of $\mathbb{C}$ rather than $\mathbb{R}$). This distinction, in fact, is the main point of the present section. Because of this, when working with complex-valued functions, we will not use bold-faced letters as we did in the case of vector functions.
4.3 Let $\phi: \langle a, b\rangle \to \mathbb{C}$ be a parametrization of an oriented piecewise continuously differentiable curve $L$ and let $f$ be a (continuous) complex function of one complex variable defined on some set containing $\phi[\langle a, b\rangle]$. The complex line integral
$$\int_L f(z)\,dz$$
is introduced by the formula
$$\int_a^b f(\phi(t))\,\phi'(t)\,dt \tag{*}$$
(independence of the (oriented) parametrization will be discussed below in 4.4). Again, note with caution that while the formula (*) is similar to the definition of the line integral of the second kind, it is different and “more mysterious” in that it involves complex multiplication. For example, there is no simple interpretation of the complex curve integral similar to the interpretations given in 2.2 or 3.2.
4.4 It is, however, again possible to express the complex line integral in terms of line integrals of the second kind.
Theorem. Let $f$ be a complex function of one complex variable. Let $f(z) = f_1(z) + if_2(z)$ where $f_1$, $f_2$ are real functions of one complex variable. Then the complex line integral satisfies
$$\int_L f(z)\,dz = \text{(II)}\int_L (f_1, -f_2)^T + i\,\text{(II)}\int_L (f_2, f_1)^T.$$
Proof. We have
$$\int_a^b f(\phi(t))\,\phi'(t)\,dt = \int_a^b (f_1(\phi(t)) + if_2(\phi(t)))(\phi_1'(t) + i\phi_2'(t))\,dt$$
$$= \int_a^b \big(f_1(\phi(t))\,\phi_1'(t) + (-f_2(\phi(t)))\,\phi_2'(t)\big)\,dt + i\int_a^b \big(f_2(\phi(t))\,\phi_1'(t) + f_1(\phi(t))\,\phi_2'(t)\big)\,dt$$
$$= \int_L (f_1, -f_2)^T + i\int_L (f_2, f_1)^T. \qquad \square$$
Remark: This theorem also implies that the complex line integral does not depend on the parametrization of an oriented piecewise continuously differentiable curve. (Of course, reversal of orientation results in a reversal of sign.)
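The decomposition of the theorem can be tried out on a concrete case. The sketch below is an illustration only: it takes $f(z) = 1/z$ on the counter-clockwise unit circle (a choice made here, as are the function names and the midpoint rule) and compares the complex line integral computed from the defining formula with the combination of two line integrals of the second kind.

```python
import cmath
import math


def complex_line_integral(f, phi, dphi, a, b, k=2000):
    """Midpoint-rule approximation of the integral of f(phi(t)) * phi'(t) dt (complex product)."""
    h = (b - a) / k
    total = 0j
    for i in range(k):
        t = a + (i + 0.5) * h
        total += f(phi(t)) * dphi(t) * h
    return total


def second_kind(g, phi, dphi, a, b, k=2000):
    """(II)-integral of a planar field g along phi, with C identified with R^2."""
    h = (b - a) / k
    total = 0.0
    for i in range(k):
        t = a + (i + 0.5) * h
        g1, g2 = g(phi(t))
        dz = dphi(t)
        total += (g1 * dz.real + g2 * dz.imag) * h
    return total


def f(z):
    return 1 / z


def unit_circle(t):
    return cmath.exp(1j * t)


def unit_circle_prime(t):
    return 1j * cmath.exp(1j * t)


def field_real(z):
    """The field (f1, -f2) from the theorem."""
    w = f(z)
    return (w.real, -w.imag)


def field_imag(z):
    """The field (f2, f1) from the theorem."""
    w = f(z)
    return (w.imag, w.real)
```

Both computations give approximately $2\pi i$. Since $|f| = 1$ on the circle and the length is $2\pi$, the value also respects the (deliberately generous) bound $4Ad$ of Lemma 4.5.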
4.5 The estimate in the following statement is not particularly tight. However, it will prove useful in Chapter 10 below.
Lemma. Let $L$ be a piecewise continuously differentiable curve in $\mathbb{C}$ of length $d$ (recall 2.4) with parametrization $\phi: \langle a, b\rangle \to \mathbb{C}$, and assume a complex function $f$ on $\phi[\langle a, b\rangle]$ satisfies $|f(z)| \leq A$. Then we have
$$\left|\int_L f(z)\,dz\right| \leq 4Ad.$$
Proof. We have
$$\left|\int_a^b f(\phi(t))\,\phi'(t)\,dt\right| = \left|\int_a^b f_1\phi_1' - \int_a^b f_2\phi_2' + i\int_a^b f_2\phi_1' + i\int_a^b f_1\phi_2'\right|$$
$$\leq \left|\int_a^b f_1\phi_1'\right| + \left|\int_a^b f_2\phi_2'\right| + \left|\int_a^b f_2\phi_1'\right| + \left|\int_a^b f_1\phi_2'\right|$$
$$\leq \int_a^b |f_1|\,|\phi_1'| + \int_a^b |f_2|\,|\phi_2'| + \int_a^b |f_2|\,|\phi_1'| + \int_a^b |f_1|\,|\phi_2'|$$
$$\leq 4\int_a^b A|\phi'| = 4A\int_a^b |\phi'| = 4Ad. \qquad \square$$
5
Green’s Theorem
5.1
Smooth partition of unity: a “baby version”
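Let $Z \subseteq \mathbb{R}^n$ be a compact set, and let $S$ be a set of open subsets of $\mathbb{R}^n$ whose union contains $Z$. A smooth partition of unity subordinate to $S$ is a set of finitely many smooth functions $\psi_i: \mathbb{R}^n \to \mathbb{R}$, $i = 1, \dots, k$ such that the image of each $\psi_i$ is contained in $\langle 0, 1\rangle$, the support of each $\psi_i$ is compact and contained in one of the sets from $S$, and $\psi = \sum_{i=1}^{k} \psi_i$ has the property that $\psi(x) = 1$ for every $x \in Z$.
Lemma. Let $Z \subseteq \mathbb{R}^n$ be a compact set. For every set $S$ of open subsets of $\mathbb{R}^n$ whose union contains $Z$, there exists a smooth partition of unity subordinate to $S$.
Proof. First of all, $Z$ is bounded by 6.5 of Chapter 2, and hence contained in a bounded closed interval $K = \langle A_1, B_1\rangle \times \cdots \times \langle A_n, B_n\rangle$. Consider the set $T$ consisting of all bounded open intervals whose closures are either contained in one of the sets from $S$, or are disjoint from $Z$. By 5.5 of Chapter 2, $K$ is contained in the union of the elements of a finite subset $F$ of $T$. Now consider the function $\rho(x)$ equal to $e^{-1/x}$ for $x > 0$, and equal to 0 for $x \leq 0$ (see Exercise (13) of Chapter 1). For an interval $J = (a_1, b_1) \times \cdots \times (a_n, b_n)$, let
$$\lambda_J(x) = \prod_{i=1}^{n} \rho(x_i - a_i)\,\rho(b_i - x_i).$$
Consider further the functions $\mu_{i,A}(x) = \rho(x_i - B_i)$, $\mu_{i,B}(x) = \rho(A_i - x_i)$, and let $\sigma$ be the sum of all these functions, i.e. of the $\lambda_J$ with $J \in F$ and of the $\mu_{i,A}$, $\mu_{i,B}$ with $i = 1, \dots, n$. Then $\psi_J = \lambda_J/\sigma$, $J \in F$ form a smooth partition of unity (only those $J$ whose closure is contained in one of the sets from $S$ are needed, since the remaining $\lambda_J$ vanish on $Z$). □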
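The construction in the proof can be followed in dimension one. The sketch below is a simplified illustration: it takes $K = \langle 0, 1\rangle$ and three open intervals covering $K$ in the role of the finite set $F$ (these intervals, and all names in the code, are assumptions made for the example), and it uses the function equal to $e^{-1/x}$ for $x > 0$ and to $0$ otherwise.

```python
import math


def rho(x):
    """Smooth function equal to exp(-1/x) for x > 0 and to 0 for x <= 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


# K = <A_END, B_END>; the open intervals below play the role of the finite cover F.
A_END, B_END = 0.0, 1.0
INTERVALS = [(-0.2, 0.4), (0.3, 0.8), (0.7, 1.2)]


def lam(J, x):
    """Bump function that is positive exactly on the open interval J = (a, b)."""
    a, b = J
    return rho(x - a) * rho(b - x)


def sigma(x):
    """Sum of all lam_J plus the two functions that are positive outside K."""
    outside = rho(x - B_END) + rho(A_END - x)
    return sum(lam(J, x) for J in INTERVALS) + outside


def psi(J, x):
    """Normalized bump; the psi_J add up to 1 at every point of K."""
    return lam(J, x) / sigma(x)
```

On $K$ the two "outside" terms vanish, so the values `psi(J, x)` sum to 1 there, while each `psi(J, x)` is zero outside `J`.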
5.2 By a domain we shall mean an open subset $U$ of $\mathbb{R}^2$ (or of $\mathbb{C}$) which has compact closure $\overline{U}$ (which, by 6.5 of Chapter 2, is equivalent to being bounded). This condition may not be very strong, but we will see that it will play an absolutely crucial role in the proof of Green’s Theorem. Let $L_1, \dots, L_k$ be oriented piecewise continuously differentiable simple closed curves in $\mathbb{R}^2$ with disjoint images and with parametrizations $c_1, \dots, c_k$. We will say that $L_1 \amalg \cdots \amalg L_k$ is the boundary of a domain $U$ oriented counter-clockwise if the images of $L_i$ are contained in $\overline{U}$ and for every $x \in \overline{U} \smallsetminus U$ there exists an open neighborhood $V_x$ of $x$ and an injective regular map with bounded partial derivatives
$$\phi_x: V_x \to \Omega(o, 1) = \{x \in \mathbb{R}^2 \mid \|x\| < 1\}$$
with $\det(D\phi_x) > 0$, a number $\alpha \in (0, 2\pi)$ and numbers $a > 0$, $b \in \mathbb{R}$ and $i \in \{1, \dots, k\}$ such that
(1) $c_i(b) = x$;
(2) $\phi_x[U \cap V_x] = \{(r\cos\theta, r\sin\theta) \mid 0 < r < 1,\ 0 < \theta < \alpha\}$;
(3) $\Big(\bigcup_{j=1}^{k} \operatorname{Im}(c_j)\Big) \cap V_x = c_i[(b - a, b + a)]$;
(4) For $s \in \langle -1, 0\rangle$, we have $\phi_x c_i(as + b) = (-s\cos(\alpha), -s\sin(\alpha))$ and for $s \in \langle 0, 1\rangle$, we have $\phi_x c_i(as + b) = (s, 0)$.
5.3
Comment
Informally, the above definition says simply that the boundary of $U$ is a union of the images of the $L_i$’s and that at a neighborhood of every point of the boundary, locally $U$ looks like a wedge of an open disk (the wedge may also be a half-disk) whose boundary is parametrized linearly by one of the curves $c_i$ in the same direction as the increasing parametrization of $(-1, 1)$ is with respect to the upper half-disk $\{(x, y) \in \Omega(o, 1) \mid y \geq 0\}$. Note, however, the great generality this allows, for example a disk $D$ with several open disks removed whose disjoint closures are in the interior of $D$, or similarly with polygons, etc. The beauty of the upcoming proof is that it uses no intuitive properties of such situations except the formal properties given in the definition;
for example, we do not use any intuitive notion of “interior” or “exterior” of the curves Li , and although the expression “counter-clockwise” matches the intuition, Definition 5.2 is not based on intuition. Another way of putting this is to note that our definition of boundary is purely local in the sense that it is completely described by requirements on neighborhoods of individual points of C.
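5.4 Let
$$\int_{L_1 \amalg \cdots \amalg L_k} \mathbf{f} = \int_{L_1} \mathbf{f} + \cdots + \int_{L_k} \mathbf{f}.$$
Theorem. (Green’s Theorem) Let $U$ be a domain in $\mathbb{R}^2$ and let $L_1, \dots, L_k$ be oriented piecewise continuously differentiable simple closed curves with disjoint images such that $L_1 \amalg \cdots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise. Let $M = \overline{U}$ and let $\mathbf{f}: V \to \mathbb{R}^2$ be a function with continuous (first) partial derivatives for some $V \supseteq M$ open. Then we have
$$\int_{L_1 \amalg \cdots \amalg L_k} \mathbf{f} = \int_M \left(\frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\right). \tag{5.4.1}$$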
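Proof. First, we note that the theorem is valid for $U = (0, K) \times (0, K)$, $K > 0$, $k = 1$ and $c_1: \langle 0, 4\rangle \to \mathbb{R}^2$ defined by
$$c_1(t) = \begin{cases} (Kt, 0) & \text{for } 0 \leq t \leq 1, \\ (K, K(t - 1)) & \text{for } 1 \leq t \leq 2, \\ (K(3 - t), K) & \text{for } 2 \leq t \leq 3, \\ (0, K(4 - t)) & \text{for } 3 \leq t \leq 4. \end{cases}$$
In this special case only, let $L_i$ denote the side of the square parametrized by $c_1$ on $\langle i - 1, i\rangle$. Applying Fubini’s Theorem and the Fundamental Theorem of Calculus in one variable, we get
$$\int_M \frac{\partial f_2}{\partial x_1} = \int_0^K \int_0^K \frac{\partial f_2(x_1, x_2)}{\partial x_1}\,dx_1\,dx_2 = \int_0^K (f_2(K, x_2) - f_2(0, x_2))\,dx_2 = \int_{L_2} \mathbf{f} + \int_{L_4} \mathbf{f}.$$
Similarly, we have
$$-\int_M \frac{\partial f_1}{\partial x_2} = -\int_0^K \int_0^K \frac{\partial f_1(x_1, x_2)}{\partial x_2}\,dx_2\,dx_1 = \int_0^K (f_1(x_1, 0) - f_1(x_1, K))\,dx_1 = \int_{L_1} \mathbf{f} + \int_{L_3} \mathbf{f}.$$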
Adding these two formulas gives the statement in the present case. Amazingly, this is the only concrete case of the theorem we need to prove by direct calculation. Now consider the general case. First we need to observe that the statement (5.4.1) doesn’t change if we perform a (2-variable) substitution by a diffeomorphism $\mathbf{F}: V \to V'$ (see Theorem 7.9 of Chapter 5). This is easy to accept, but somewhat harder to do in detail. The reason is that even in two variables, the concepts we set up so far do not transform in the simplest possible way under coordinate change. We will understand this better in Chapter 12 below. To do the calculation we need, let us write
$$(x_1, x_2)^T = \mathbf{F}((r_1, r_2)^T),$$
so identifying, at a point, the linear map $D\mathbf{F}$ with its associated matrix, we have
$$D\mathbf{F} = \begin{pmatrix} \dfrac{\partial x_1}{\partial r_1} & \dfrac{\partial x_1}{\partial r_2} \\[2ex] \dfrac{\partial x_2}{\partial r_1} & \dfrac{\partial x_2}{\partial r_2} \end{pmatrix}.$$
Now consider a parametrized vector function $\mathbf{f}: \mathbb{R}^2 \to \mathbb{R}^2$ where we understand the independent variables to be $x_1, x_2$ (i.e. “the $x_1x_2$-plane”). Let $L$ be an oriented piecewise continuously differentiable curve in $\mathbb{R}^2$, which we understand as “the $r_1r_2$-plane”. We will denote, slightly imprecisely, by $\mathbf{F}[L]$ the “$\mathbf{F}$-image of the curve $L$ in the $x_1x_2$-plane”, i.e. the oriented curve obtained by composing the parametrization of $L$ with the map $\mathbf{F}$. The key observation then is that the definition of the line integral of the second kind gives
$$\int_{\mathbf{F}[L]} \mathbf{f} \cdot d\mathbf{x} = \int_L ((D\mathbf{F})^T (\mathbf{f} \circ \mathbf{F})) \cdot d\mathbf{r}. \tag{5.4.2}$$
(Note that if we wrote the integrand of a line integral of the second kind as a row instead of column vector, the transposition on the right hand side of (5.4.2) would be unnecessary; again, we will understand this better in Chapter 12 below.)
Denoting the integrand on the right hand side of (5.4.2) by $\mathbf{g}$, we have, in coordinates,
$$\mathbf{g} = \begin{pmatrix} \dfrac{\partial x_1}{\partial r_1} f_1 + \dfrac{\partial x_2}{\partial r_1} f_2 \\[2ex] \dfrac{\partial x_1}{\partial r_2} f_1 + \dfrac{\partial x_2}{\partial r_2} f_2 \end{pmatrix}.$$
Now compute:
$$\frac{\partial g_2}{\partial r_1} - \frac{\partial g_1}{\partial r_2} = \frac{\partial x_1}{\partial r_2}\frac{\partial f_1}{\partial r_1} + \frac{\partial^2 x_1}{\partial r_1 \partial r_2} f_1 + \frac{\partial x_2}{\partial r_2}\frac{\partial f_2}{\partial r_1} + \frac{\partial^2 x_2}{\partial r_1 \partial r_2} f_2 - \frac{\partial x_1}{\partial r_1}\frac{\partial f_1}{\partial r_2} - \frac{\partial^2 x_1}{\partial r_1 \partial r_2} f_1 - \frac{\partial x_2}{\partial r_1}\frac{\partial f_2}{\partial r_2} - \frac{\partial^2 x_2}{\partial r_1 \partial r_2} f_2. \tag{5.4.3}$$
We see that the second order terms cancel out, and after applying the chain rule
$$\frac{\partial f_i}{\partial r_j} = \frac{\partial f_i}{\partial x_1}\frac{\partial x_1}{\partial r_j} + \frac{\partial f_i}{\partial x_2}\frac{\partial x_2}{\partial r_j},$$
the right hand side of (5.4.3) becomes a sum of eight terms, four of which cancel out, leaving
$$\left(\frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\right)\left(\frac{\partial x_1}{\partial r_1}\frac{\partial x_2}{\partial r_2} - \frac{\partial x_1}{\partial r_2}\frac{\partial x_2}{\partial r_1}\right) = \left(\frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\right)\det(D\mathbf{F}),$$
which is what we need to transform (5.4.1) from the $x$-coordinates to the $r$-coordinates, provided $\det(D\mathbf{F}) > 0$ (see Theorem 7.9 and Exercise (16) of Chapter 5).
Now by compactness of $\overline{U}$ (our main assumption!), there exist open sets $V_1, \dots, V_m$ of $\mathbb{R}^2$ such that $V_1 \cup \cdots \cup V_m \supseteq \overline{U}$ and for each $i$, we have either $V_i \subseteq U$ or $x \in V_i \subseteq V_x$ for some $x \in \overline{U} \smallsetminus U$. Let $u_i$ be a smooth partition of unity subordinate to the cover $(V_i)$. We shall prove the formula (5.4.1) for each of the functions $u_i\mathbf{f}$, $i = 1, \dots, m$. We distinguish four cases:
Case 1: $V_i \subseteq U$. By linear substitution, we may assume $V_i \subseteq (0, K) \times (0, K)$. Thus, the statement for $u_i\mathbf{f}$ follows from the special case already proved (the left hand side of (5.4.1) with $\mathbf{f}$ replaced by $u_i\mathbf{f}$ is 0).
Case 2: $x \in V_i \subseteq V_x$ and $0 < \alpha < \pi$. By $\mathbb{R}^2$-linear substitution, we may assume $\alpha = \pi/2$. In this case, choose $K = 1$ and extend the map $u_i\mathbf{f} \circ (\phi_x)^{-1}$ to an open set containing $\langle 0, K\rangle \times \langle 0, K\rangle$ by 0. Again, the statement reduces to the special case already proved (noting that for this new function, the contributions to the left hand side of (5.4.1) for $1 \leq t \leq 3$ are 0).
Case 3: $\alpha = \pi$. By the linear substitution
$$r = \tfrac{1}{2}(x_1 + 1), \qquad s = x_2,$$
applied to the function $u_i\mathbf{f} \circ (\phi_x)^{-1}$, the statement reduces to the special case already proved with $K = 1$. (Note that for this function, the contributions to the left hand side of (5.4.1) with $1 \leq t \leq 4$ are 0.)
Case 4: $\pi < \alpha < 2\pi$. By $\mathbb{R}^2$-linear substitution applied to the function $u_i\mathbf{f} \circ (\phi_x)^{-1}$, we may assume $\alpha = 3\pi/2$. Then extend the function $u_i\mathbf{f} \circ (\phi_x)^{-1}$ to an open neighborhood in $\mathbb{R}^2$ of the set
$$Z = (\langle -1, 1\rangle \times \langle -1, 1\rangle) \smallsetminus ((0, 1\rangle \times \langle -1, 0)).$$
Express $Z = Z_1 \cup Z_2$ where $Z_1 = \langle -1, 1\rangle \times \langle 0, 1\rangle$, $Z_2 = \langle -1, 0\rangle \times \langle -1, 0\rangle$. The sets are not disjoint, but the intersection has measure 0. For the sets $Z_1$, $Z_2$ and restrictions of the function $u_i\mathbf{f} \circ (\phi_x)^{-1}$, the statement follows from Cases 3 and 2, respectively. When adding the left hand sides of formula (5.4.1) for these functions, the contributions from the line segment $\langle -1, 0\rangle \times \{0\}$ cancel out. □
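The special case of the square, on which the whole proof rests, is easy to test numerically. The following sketch is an illustration under assumptions made here: it uses $K = 1$, the field $\mathbf{f}(x_1, x_2) = (x_1x_2, x_1^2)$, a counter-clockwise parametrization of the boundary of the square on $\langle 0, 4\rangle$ with one unit of parameter per side, and midpoint rules for both integrals; all names are hypothetical.

```python
def c1(t, K=1.0):
    """Boundary of (0, K) x (0, K), counter-clockwise, one side per unit of t."""
    if t <= 1:
        return (K * t, 0.0)
    if t <= 2:
        return (K, K * (t - 1))
    if t <= 3:
        return (K * (3 - t), K)
    return (0.0, K * (4 - t))


def c1_prime(t, K=1.0):
    """Derivative of c1 in the interior of each side."""
    if t < 1:
        return (K, 0.0)
    if t < 2:
        return (0.0, K)
    if t < 3:
        return (-K, 0.0)
    return (0.0, -K)


def f(x1, x2):
    """Sample field f = (x1 * x2, x1 ** 2)."""
    return (x1 * x2, x1 ** 2)


def curl_f(x1, x2):
    """d f2 / d x1 - d f1 / d x2 for the sample field: 2*x1 - x1."""
    return 2 * x1 - x1


def boundary_integral(K=1.0, k=200):
    """Line integral of the second kind of f over the boundary (midpoint rule per side)."""
    h = 1.0 / k
    total = 0.0
    for side in range(4):
        for i in range(k):
            t = side + (i + 0.5) * h
            p, d = c1(t, K), c1_prime(t, K)
            f1, f2 = f(*p)
            total += (f1 * d[0] + f2 * d[1]) * h
    return total


def area_integral(K=1.0, k=200):
    """Integral of curl_f over the square (midpoint rule on a k x k grid)."""
    h = K / k
    cells = (curl_f((i + 0.5) * h, (j + 0.5) * h) for i in range(k) for j in range(k))
    return sum(cells) * h * h
```

For this field both sides of (5.4.1) equal $1/2$: the right and top sides of the square contribute $1$ and $-1/2$ to the line integral, and the other two sides contribute $0$.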
6
Exercises
(1) Prove the statement of the comment in 1.1.
(2) Prove that the relation $\sim$ in 1.2 is an equivalence relation.
(3) Prove that in 1.7, $\boldsymbol{\phi} * \boldsymbol{\psi}$ is a piecewise continuously differentiable parametrization of a curve.
(4) Prove that the parametrized curve $K + L$ defined in 1.7 depends only on the parametrized curves $K$ and $L$, and not on their parametrizations.
(5) Prove that in 1.7, we have $(K + L) + M = K + (L + M)$.
(6) Prove directly that the factor $\dfrac{\boldsymbol{\phi}'(t)}{\|\boldsymbol{\phi}'(t)\|}$ of 3.5 at each point of $\boldsymbol{\phi}[\langle a, b\rangle]$ (where defined) does not depend on the parametrization of the piecewise continuously differentiable oriented curve. Prove also that reversal of orientation of the curve results in multiplication of this factor by $-1$.
(7) Compute the complex line integral
$$\int_L e^z\,dz$$
where $L$ is the straight line segment in $\mathbb{C}$ from $2 + 3i$ to $1 + i$.
(8) Write out in detail the simplification of the right hand side of (5.4.3) using the chain rule.
(9) Compute
$$\text{(II)}\int_L y^2\,dx + 2xy\,dy$$
where $L$ is the boundary of the upper unit half-disk $\{(x, y)^T \in \mathbb{R}^2 \mid x^2 + y^2 < 1,\ y > 0\}$ oriented counter-clockwise.
(10) Prove that if $L_1 \amalg \cdots \amalg L_k$ is the boundary of a domain $U$ oriented counter-clockwise, then the area of $U$ is equal to
$$\frac{1}{2}\int_{L_1 \amalg \cdots \amalg L_k} x\,dy - y\,dx.$$
[Hint: Use Green’s Theorem.]
(11) Using Green’s Theorem and Theorem 4.4, compute the complex line integral
$$\int_L z^2\,dz$$
where $L$ is the boundary of the square $\{x + iy \mid 0 < x < 1,\ 0 < y < 1\}$ oriented counter-clockwise.
Part II Analysis and Geometry
9
Metric and Topological Spaces II
For the remaining chapters of this text, we must revisit our foundations. Specifically, it is time to upgrade our knowledge of both metric and topological spaces. For example, in the upcoming discussion of manifolds in Chapter 12, we will need separability. We will need a characterization of compactness by properties of open covers. Also, it is natural to define manifolds as topological and not metric spaces which prompts the development of separation axioms, with a focus on normality. On the other hand, when discussing Hilbert spaces in Chapters 16 and 17, we will need completion, extension of uniformly continuous maps, and the Stone-Weierstrass Theorem. These are the topics we will discuss in the present chapter.
1
Separable and totally bounded metric spaces
1.1
A few concepts
A subset M X of a topological space is said to be dense if M D X . A space is separable if it contains an at most countable dense subset. (At most countable means finite or countable.). A cover of a space .X; / ( is the set of all open sets, recall Subsection 4.2 of Chapter 2) is a subset U such that [ U D X: Note that we only consider covers by open sets. (In other texts, this requirement is sometimes dropped, in which case our concept would be called an open cover.) A subcover V of a cover U is a subset V U that is itself a cover. A space X is said to be Lindel¨of if every cover of X contains an at most countable subcover. 1.2 Theorem. The following statements about a metric space X D .X; d / are equivalent. I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 9, © Springer Basel 2013
213
214
9 Metric and Topological Spaces II
(1) $X$ is separable. (2) The topology of $X$ has a countable basis. (3) $X$ is Lindelöf.
Proof. (1)$\Rightarrow$(2): Let $M$ be a countable dense subset of $X$. Put $\mathcal{B} = \{\Omega(m,r) \mid m \in M,\ r \text{ rational}\}$. We will prove that $\mathcal{B}$ is a basis. Take an open $U$, an $x \in U$, and an $\varepsilon > 0$ such that $\Omega(x,\varepsilon) \subseteq U$. Now choose an $m \in M$ such that $d(x,m) < \frac13\varepsilon$ and a rational $r$ such that $\frac13\varepsilon < r < \frac23\varepsilon$. Then $x \in \Omega(m,r) \subseteq \Omega(x,\varepsilon) \subseteq U$: indeed, if $d(m,y) < r$ we have $d(x,y) \le d(x,m) + d(m,y) < (\frac13 + \frac23)\varepsilon = \varepsilon$.
(2)$\Rightarrow$(3): Let $\mathcal{B}$ be a countable basis and let $\mathcal{U}$ be an arbitrary open cover. Put $\mathcal{B}' = \{B \in \mathcal{B} \mid \exists U \in \mathcal{U},\ B \subseteq U\}$. Then $\mathcal{B}'$ is a countable cover, and if we choose for each $B \in \mathcal{B}'$ a $U_B \in \mathcal{U}$ with $B \subseteq U_B$ then also $\{U_B \mid B \in \mathcal{B}'\}$ is a countable cover.
(3)$\Rightarrow$(1): For every positive natural number $n$, choose a countable subcover of the cover $\{\Omega(x,\frac1n) \mid x \in X\}$, say $\Omega(x_{n1},\frac1n), \dots, \Omega(x_{nk},\frac1n), \dots$. Then the set $\{x_{nk} \mid n,k = 1,2,\dots\}$ is dense in $X$. □
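The step (1)$\Rightarrow$(2) can be tried out on the real line with $M = \mathbb{Q}$. The following is a minimal sketch, not part of the original text: the function name `rational_ball` and the choice of a grid of rationals with denominator about $6/\varepsilon$ are assumptions made for illustration only.

```python
from fractions import Fraction
import math

def rational_ball(x: float, eps: float) -> tuple[Fraction, Fraction]:
    """Return rational (m, r) with x in Omega(m, r) and Omega(m, r) inside Omega(x, eps)."""
    xq, eq = Fraction(x), Fraction(eps)   # a float is an exact rational number
    q = math.ceil(6 / eps)                # grid step 1/q is roughly eps/6
    m = Fraction(round(xq * q), q)        # nearest grid point: |x - m| <= 1/(2q) < eps/3
    r = eq / 2                            # a rational strictly between eps/3 and 2*eps/3
    return m, r

m, r = rational_ball(math.pi, 0.01)
```

Here `m` plays the role of the point of the dense set and `r` the rational radius; the inclusion $\Omega(m,r) \subseteq \Omega(x,\varepsilon)$ holds because $d(x,m) + r < \varepsilon$.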
1.2.1 Remarks
1. This is a very specific fact concerning metric spaces. In a general topological space one has only the (very easy) implications (2)$\Rightarrow$(3) and (2)$\Rightarrow$(1) and nothing more.
2. In the literature, the existence of a countable basis is often called the second axiom of countability.
1.2.2 Obviously, if $X$ has a countable basis $\mathcal{B}$ then each subspace $Y \subseteq X$ has one, namely $\mathcal{B}|Y = \{U \cap Y \mid U \in \mathcal{B}\}$. Hence we have
Corollary. A subspace of a separable metric space is separable. Equivalently, a subspace of a Lindelöf metric space is Lindelöf.
The first of these statements hardly comes as a surprise (it is easy to prove it directly, too). But the second one should sound somewhat strange. We will see shortly (in 2.3 below) that the Lindelöf property is very close to compactness, and compactness is (very obviously) not preserved on subspaces. Again, this corollary is characteristic for metric spaces. In a general topological context neither of the two statements holds.
1.3 A metric space $X$ is said to be totally bounded if for each $\varepsilon > 0$ there exists a finite subset $M(\varepsilon)$ of $X$ such that for every $x \in X$ we have
$$d(x, M(\varepsilon)) < \varepsilon. \qquad \text{(TB)}$$
1.3.1 A totally bounded space is always bounded but a bounded space is not necessarily totally bounded: take any infinite set and define $d(x,y) = 1$ for $x \ne y$. But we have
Proposition. A subspace $X$ of the Euclidean space $\mathbb{R}^n$ is totally bounded if and only if it is bounded.
Proof. If $X$ is bounded, then we have $X \subseteq \langle -N,N\rangle \times \dots \times \langle -N,N\rangle$ for a sufficiently large natural $N$. Choose a natural number $k$ such that $\frac{N}{k} < \frac{\varepsilon}{2}$ and put
$$M = \{s = (\tfrac{s_1}{k}, \dots, \tfrac{s_n}{k}) \mid s_i \text{ are integers},\ -Nk \le s_i \le Nk\}.$$
For every $x \in X$, there exists an $s \in M$ such that $d(x,s) < \frac{\varepsilon}{2}$. For an $s \in M$, choose an $x(s) \in X$ such that $d(x(s),s) < \frac{\varepsilon}{2}$, if such $x(s)$ exists, and put $M_X = \{x(s) \mid s \in M \text{ such that } x(s) \text{ exists}\}$. Then, by the triangle inequality, we have, for every $x \in X$, $d(x, M_X) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$. □
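The grid argument of the proof can be imitated on a finite sample of a bounded set. This is only a sketch under stated assumptions: the name `epsilon_net` is hypothetical, the sample stands in for $X$, and the grid fineness is chosen as $k > \sqrt{n}/\varepsilon$ (an assumption that makes the nearest grid point lie within $\varepsilon/2$ in the Euclidean metric), which is not necessarily the choice of $k$ made in the text.

```python
import math
import random

def epsilon_net(points, eps):
    """Pick one representative x(s) of `points` per occupied grid point s."""
    n = len(points[0])
    k = math.floor(math.sqrt(n) / eps) + 1      # then sqrt(n)/(2k) < eps/2
    reps = {}
    for x in points:
        s = tuple(round(c * k) for c in x)      # nearest grid point s/k, within sqrt(n)/(2k) of x
        reps.setdefault(s, x)                   # keep the first sample point seen near s as x(s)
    return list(reps.values())

random.seed(0)
X = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(500)]
net = epsilon_net(X, 0.25)
```

Every sample point and its representative lie within $\varepsilon/2$ of a common grid point, so the triangle inequality gives $d(x, M_X) < \varepsilon$ exactly as in the proof.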
1.4 Proposition. A metric space $X$ is totally bounded if and only if every sequence in $X$ contains a Cauchy subsequence.
Proof. I. Let $X$ be totally bounded. Consider the sets $M(\frac1n)$ from the definition 1.3. Now consider a sequence $(x_i)_{i=1,2,\dots}$ in $X$. If the set $P = \{x_i \mid i = 1,2,\dots\}$ is finite, then our sequence contains a constant subsequence, which is, of course, Cauchy. Otherwise choose first $m_1 \in M(1)$ so that $P_1 = P \cap \Omega(m_1,1)$ is infinite, and then $k_1$ with $x_{k_1} \in P_1$. Now assuming we have $m_j \in M(\frac1j)$, $j = 1,\dots,n-1$, such that $P_j = P_{j-1} \cap \Omega(m_j,\frac1j)$ are infinite, and $k_1 < k_2 < \dots < k_{n-1}$ such that $x_{k_j} \in P_j$, choose $m_n \in M(\frac1n)$ with $P_n = P_{n-1} \cap \Omega(m_n,\frac1n)$ infinite, and a $k_n > k_{n-1}$ such that $x_{k_n} \in P_n$. Then the sequence $x_{k_1}, x_{k_2}, x_{k_3}, \dots$ is Cauchy: for $i,j \ge n$ both $x_{k_i}$ and $x_{k_j}$ lie in $P_n \subseteq \Omega(m_n,\frac1n)$, and hence $d(x_{k_i},x_{k_j}) < \frac2n$.
II. Let $X$ not be totally bounded. Then there exists an $\varepsilon > 0$ such that for every finite $M \subseteq X$ there exists an $x \in X$ with $d(x,M) \ge \varepsilon$. Pick an arbitrary point $x_1$; assuming we have chosen $x_1,\dots,x_n$, pick an $x_{n+1} \in X$ so that $d(x_{n+1},\{x_1,\dots,x_n\}) \ge \varepsilon$. The resulting sequence obviously contains no Cauchy subsequence. □
1.5 Proposition. A totally bounded metric space is separable.
Proof. It suffices to take $M = \bigcup_{n=1}^{\infty} M(\frac1n)$. □
2 More on compact spaces
2.1 A point $x$ of a space $X$ is said to be an accumulation point of a subset $M \subseteq X$ if for every neighborhood $U$ of $x$ the intersection $U \cap M$ is infinite. Here is a simple reformulation of the definition of compactness we have used so far (i.e. requiring that every sequence have a convergent subsequence).
2.1.1 Proposition. A metric space $X$ is compact if and only if every infinite set $M \subseteq X$ has an accumulation point.
Proof. I. Assume that every sequence in $X$ has a convergent subsequence. Let $M$ be an infinite subset of $X$. Choose a one-to-one (not necessarily onto) mapping $\varphi : \mathbb{N} \to M$ and a convergent subsequence $(\varphi(k_n))_n$ of $(\varphi(n))_n$. Then $\lim_n \varphi(k_n)$ is an accumulation point of $M$.
II. Assume that every infinite $M \subseteq X$ has an accumulation point. Let $(x_n)_n$ be a sequence in $X$. If $M = \{x_n \mid n = 1,2,\dots\}$ is finite then $(x_n)_n$ contains a constant, and hence convergent, subsequence. Assume that $M$ is infinite, and let $x$ be one of its accumulation points. Choose $k_1$ arbitrarily, and assuming $x_{k_1},\dots,x_{k_{n-1}}$, $k_1 < \dots < k_{n-1}$, are chosen, choose $k_n > k_{n-1}$ so that $x_{k_n} \in \Omega(x,\frac1n)$ (such a $k_n$ exists since $x$ is an accumulation point, and on the other hand, only finitely many $j$, namely those with $j \le k_{n-1}$, are disqualified by the definition of a subsequence). Now obviously $\lim_n x_{k_n} = x$. □
2.2 Theorem. A metric space $X$ is compact if and only if it is complete and totally bounded.
Proof. I. If $X$ is compact then it is complete by 7.4 of Chapter 2. If $X$ were not totally bounded, there would exist an $\varepsilon > 0$ such that for every finite subset $M$ there is a point $x$ with $d(x,M) \ge \varepsilon$. Choose $x_1$ arbitrarily and assuming $x_1,\dots,x_n$ are already chosen, pick an $x_{n+1}$ such that $d(x_{n+1},\{x_1,\dots,x_n\}) \ge \varepsilon$. Then $\{x_n \mid n = 1,2,\dots\}$ is infinite and obviously has no accumulation point.
II. If $(x_n)_n$ is a sequence in a totally bounded complete metric space then by 1.4 it contains a Cauchy subsequence, and by completeness, this subsequence converges. □
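The greedy choice of points $x_{n+1}$ with $d(x_{n+1},\{x_1,\dots,x_n\}) \ge \varepsilon$ used in part I can be run on a finite sample, where it must stop. The sketch below is illustrative only; `greedy_separated` is a hypothetical name and the grid in the unit square is an assumed stand-in for a totally bounded space.

```python
import math

def greedy_separated(points, eps):
    """Greedily select points that are pairwise at distance >= eps."""
    chosen = []
    for p in points:
        # p is accepted only if it is eps-far from everything chosen so far
        if all(math.dist(p, c) >= eps for c in chosen):
            chosen.append(p)
    return chosen

grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
net = greedy_separated(grid, 0.3)
```

When the process stops, every rejected point is closer than $\varepsilon$ to a chosen one, so the chosen points form a finite set $M(\varepsilon)$ as in (TB); in a space that is not totally bounded the selection never stops for a suitable $\varepsilon$.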
2.2.1 Remark This fact is a generalization of Theorem 6.5 of Chapter 2 stating that a subset $X \subseteq \mathbb{R}^n$ is compact if and only if it is closed and bounded. We know that $\mathbb{R}^n$ is complete (7.5 of Chapter 2), and hence, by 7.5 of Chapter 2 again, $X$ is complete if and only if it is closed; by 1.3.1, for $X \subseteq \mathbb{R}^n$, boundedness and total boundedness are equivalent.
2.3 The following is the famous Heine-Borel Theorem. One can think of it as a generalization of 5.5 of Chapter 2 to arbitrary metric spaces.
Theorem. A metric space is compact if and only if each of its (open) covers contains a finite subcover.
Proof. I. Let $X$ be compact and let $\mathcal{U}'$ be a cover of $X$ which has no finite subcover. By 2.2 and 1.5, $X$ is separable, hence by 1.2 it is Lindelöf, and hence $\mathcal{U}'$ has a countable subcover $\mathcal{U} = \{U_1, U_2, \dots, U_n, \dots\}$. By assumption, $\mathcal{U}$ has no finite subcover. Now our strategy is to discard the $U_i$'s which are "redundant" in the given order. More precisely, let $V_1 = U_j$ for the lowest $j$ for which $U_j \ne \emptyset$, and assuming $V_j$, $j = 1,\dots,n-1$, are already chosen, let $V_n = U_j$ for the lowest $j$ such that
$$U_j \setminus \bigcup_{i=1}^{n-1} V_i \ne \emptyset$$
(by assumption, the finite system $\{V_1,\dots,V_{n-1}\}$ cannot be a cover). Choose
$$x_n \in V_n \setminus \bigcup_{i=1}^{n-1} V_i$$
and put $M = \{x_n \mid n = 1,2,\dots\}$. Now
• $x_n \notin \{x_1,\dots,x_{n-1}\}$ (since $\{x_1,\dots,x_{n-1}\} \subseteq \bigcup_{i=1}^{n-1} V_i$) and hence $M$ is infinite,
• $V_1,\dots,V_n,\dots$ is a cover since each discarded $U_j$ is contained in the union of the $V_i$'s, and
• $V_n \cap M \subseteq \{x_1,\dots,x_n\}$ and hence is finite.
This is a contradiction: the set $M$ must have an accumulation point $x$, this $x$ is an element of some $V_n$, but this neighborhood of $x$ meets $M$ in finitely many points only.
II. Assume each cover of $X$ has a finite subcover and assume $M \subseteq X$ has no accumulation point. Then for every $x \in X$ there exists an open neighborhood $U_x$ such that $U_x \cap M$ is finite. Choose a finite subcover $U_{x_1},\dots,U_{x_n}$. Then
$$M = \Big(\bigcup_{i=1}^{n} U_{x_i}\Big) \cap M = \bigcup_{i=1}^{n} (U_{x_i} \cap M)$$
is a finite union of finite sets and hence is finite. Thus every infinite subset of $X$ has an accumulation point, and $X$ is compact by 2.1.1. □
2.4 Theorem 2.3 suggests the following definition of compactness for general topological spaces, which we will adopt from now on: A topological space is said to be compact if each of its (open) covers has a finite subcover. Similarly as in the special case of metric spaces (recall 6.2 of Chapter 2) we have
2.4.1 Proposition. Let $f : X \to Y$ be a continuous map and let $X$ be compact. Then the subspace $f[X]$ of $Y$ is compact.
Proof. Let $U_i$, $i \in J$, be open in $Y$ and let $f[X] \subseteq \bigcup_{i \in J} U_i$. Then $X \subseteq f^{-1}[\bigcup_{i \in J} U_i] = \bigcup_{i \in J} f^{-1}[U_i]$ and hence there exist $i_1,\dots,i_n$ such that
$$X \subseteq \bigcup_{j=1}^{n} f^{-1}[U_{i_j}] = f^{-1}\Big[\bigcup_{j=1}^{n} U_{i_j}\Big].$$
This is equivalent to $f[X] \subseteq \bigcup_{j=1}^{n} U_{i_j}$. □
From this statement we obtain, again, the following important generalization of Proposition 6.3 of Chapter 2:
2.4.2 Proposition. Let $X$ be a compact topological space. Then every continuous real function $f : X \to \mathbb{R}$ has both a maximum and a minimum.
2.4.3 Proposition. A closed subspace $Y$ of a compact topological space $X$ is compact.
Proof. Let $U_i$, $i \in J$, be open sets in $X$ such that $\bigcup_{i \in J} U_i \supseteq Y$. Then $\{U_i \mid i \in J\} \cup \{X \setminus Y\}$ is an open cover of $X$ and hence there exists a finite subcover $U_{i_1},\dots,U_{i_n}, X \setminus Y$ of $X$. Since $Y \cap (X \setminus Y) = \emptyset$ we have $Y \subseteq \bigcup_{j=1}^{n} U_{i_j}$. □
2.4.4 Remark Unlike the case of metric spaces, a compact subspace of a topological space is not necessarily closed: for example, any subspace of a finite topological space is compact, but not every subset may be closed. This, in fact, is one of the motivations of separation axioms, which can be used to remedy this situation, and which will be discussed in Section 5 below.
3 Baire's Category Theorem
3.1 A subset $A$ of a topological space $X$ is said to be nowhere dense if $X \setminus \overline{A}$ is dense in $X$, that is, if $\overline{X \setminus \overline{A}} = X$ (recall that $\overline{A}$ denotes the closure of $A$, i.e. the intersection of all closed subsets of $X$ which contain $A$). In other words, $A$ is nowhere dense if and only if for each non-empty open $U$ the intersection $U \cap (X \setminus \overline{A})$ is non-empty. Consequently we obtain
3.2 Observation. A union of finitely many nowhere dense subsets of $X$ is nowhere dense. (If $A, B$ are nowhere dense and $U$ is non-empty open then $U \cap (X \setminus \overline{A})$ is non-empty open and hence $U \cap (X \setminus \overline{A}) \cap (X \setminus \overline{B}) = U \cap (X \setminus (\overline{A} \cup \overline{B})) = U \cap (X \setminus \overline{A \cup B})$ is non-empty.)
3.3 A subset $A \subseteq X$ is of the first category (or meager) in $X$ if $A$ is a countable union $\bigcup_{n=1}^{\infty} A_n$ with $A_n$ nowhere dense. From 3.2 we immediately see that
A subset $A \subseteq X$ is of the first category in $X$ if it is a union $\bigcup_{n=1}^{\infty} A_n$ of an increasing sequence $A_1 \subseteq A_2 \subseteq \cdots$ of nowhere dense subsets.
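3.4 Theorem. (Baire's Category Theorem) If $X$ is a complete metric space then $X$ is not of the first category in $X$.
Proof. Let $A_1 \subseteq A_2 \subseteq \dots \subseteq A_n \subseteq \cdots$ be an increasing sequence of nowhere dense subsets of a complete metric space $X$. We will prove that $X \setminus \bigcup_n A_n \ne \emptyset$.
Since a closure of a nowhere dense set is nowhere dense, we may assume without loss of generality that the sets $A_n$ are closed. The set $A_1$ is nowhere dense closed and hence there exists an $x_1 \in X \setminus A_1$ and an $\varepsilon_1$, $0 < \varepsilon_1 < 1$, such that $\Omega(x_1,2\varepsilon_1) \cap A_1 = \emptyset$. Now $\Omega(x_1,\varepsilon_1)$ is a non-empty open set and hence $\Omega(x_1,\varepsilon_1) \cap (X \setminus A_2) \ne \emptyset$ and we have an $x_2$ and an $\varepsilon_2$, $0 < \varepsilon_2 < \frac12$, such that $\Omega(x_2,2\varepsilon_2) \subseteq \Omega(x_1,\varepsilon_1) \cap (X \setminus A_2)$, i.e.
$$\Omega(x_2,2\varepsilon_2) \cap A_2 = \emptyset \quad\text{and}\quad \Omega(x_2,2\varepsilon_2) \subseteq \Omega(x_1,\varepsilon_1).$$
Now assume we already have $x_1,\dots,x_n$ and $\varepsilon_1,\dots,\varepsilon_n$, $0 < \varepsilon_k < \frac1k$, such that
$$\Omega(x_k,2\varepsilon_k) \cap A_k = \emptyset \ \text{ for } k \le n, \quad\text{and}\quad \Omega(x_k,2\varepsilon_k) \subseteq \Omega(x_{k-1},\varepsilon_{k-1}) \ \text{ for } 1 < k \le n.$$
Since $\Omega(x_n,\varepsilon_n)$ is a non-empty open set, we have a non-empty open $\Omega(x_n,\varepsilon_n) \cap (X \setminus A_{n+1})$ and hence there is an $x_{n+1}$ and an $\varepsilon_{n+1}$ with $0 < \varepsilon_{n+1} < \frac{1}{n+1}$ such that
$$\Omega(x_{n+1},2\varepsilon_{n+1}) \cap A_{n+1} = \emptyset \quad\text{and}\quad \Omega(x_{n+1},2\varepsilon_{n+1}) \subseteq \Omega(x_n,\varepsilon_n).$$
Since $\overline{\Omega(x,\varepsilon)} \subseteq \Omega(x,2\varepsilon)$ (if $d(y,\Omega(x,\varepsilon)) = 0$ we can find a $z \in \Omega(x,\varepsilon)$ such that $d(y,z) < \varepsilon$), setting $B_n = \overline{\Omega(x_n,\varepsilon_n)}$ we obtain a sequence
$$\Omega(x_1,2\varepsilon_1) \supseteq B_1 \supseteq \Omega(x_2,2\varepsilon_2) \supseteq B_2 \supseteq \Omega(x_3,2\varepsilon_3) \supseteq B_3 \supseteq \cdots$$
such that $\Omega(x_k,2\varepsilon_k) \cap A_k = \emptyset$ (and hence $B_k \cap A_k = \emptyset$). For $k \ge n$ we have $x_k \in \Omega(x_n,2\varepsilon_n)$ and since $\varepsilon_n < \frac1n$ the sequence $(x_n)_n$ is Cauchy, and by completeness it has a limit $x \in X$. Furthermore, for $k \ge n$ we have $x_k \in B_n$, and since $B_n$ is closed, $x \in B_n$. Thus, $x \in \bigcap_n B_n$. Since $B_n \cap A_n = \emptyset$ we have $\bigcap_k B_k \cap A_n = \emptyset$ and finally $\bigcap_k B_k \cap \bigcup_n A_n = \emptyset$. Therefore, $x \notin \bigcup_n A_n$. □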
4 Completion
4.1 Let $X = (X,d)$ be a metric space. On the set of Cauchy sequences $(x_n)_n$ in $X$, introduce an equivalence relation
$$(x_n)_n \sim (x_n')_n \ \equiv_{\mathrm{df}} \ \lim_n d(x_n,x_n') = 0$$
($\sim$ is obviously reflexive and symmetric, and the transitivity immediately follows from the triangle inequality).
4.2 Lemma. 1. If $(x_n)_n$ and $(y_n)_n$ are Cauchy sequences in $X$ then $(d(x_n,y_n))_n$ is a Cauchy, and hence convergent, sequence in $\mathbb{R}$.
2. If $(x_n)_n \sim (x_n')_n$ and $(y_n)_n \sim (y_n')_n$ then $\lim_n d(x_n,y_n) = \lim_n d(x_n',y_n')$.
Proof. 1. From the triangle inequality, we immediately see that $|d(x_m,y_m) - d(x_n,y_n)| \le d(x_m,x_n) + d(y_m,y_n)$. Thus, if $d(x_m,x_n), d(y_m,y_n) < \frac{\varepsilon}{2}$, then $|d(x_m,y_m) - d(x_n,y_n)| < \varepsilon$.
2. $d(x_n,y_n) \le d(x_n,x_n') + d(x_n',y_n') + d(y_n',y_n)$ and hence $\lim d(x_n,y_n) \le \lim d(x_n',y_n')$, and by symmetry also $\lim d(x_n',y_n') \le \lim d(x_n,y_n)$. □
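4.3 Denote by $\widetilde{X}$ the set of all the $\sim$-equivalence classes of Cauchy sequences in $(X,d)$. For $\xi, \eta \in \widetilde{X}$, define
$$\widetilde{d}(\xi,\eta) = \lim d(x_n,y_n) \quad\text{where } (x_n)_n \in \xi \text{ and } (y_n)_n \in \eta.$$
Observation. $\widetilde{X} = (\widetilde{X},\widetilde{d})$ is a metric space. (The definition of $\widetilde{d}$ is correct by 4.2: obviously $\widetilde{d}$ is symmetric and satisfies the triangle inequality, and if $\widetilde{d}(\xi,\eta) = 0$ and $(x_n)_n \in \xi$, $(y_n)_n \in \eta$ then we obtain $(x_n)_n \sim (y_n)_n$ by comparing the definitions of $\sim$ and $\widetilde{d}$.)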
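The equivalence of 4.1 can be observed on two different Cauchy sequences of rationals that both approach $\sqrt{2}$. This is a sketch for illustration only; the two sequences and the function names `decimal_sqrt2` and `bisection_sqrt2` are assumptions and do not come from the text.

```python
from fractions import Fraction
from math import isqrt

def decimal_sqrt2(n: int) -> Fraction:
    """The n-digit decimal truncation of sqrt(2), a rational number."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

def bisection_sqrt2(n: int) -> Fraction:
    """Midpoint after n bisection steps for x*x = 2 on the interval <1, 2>."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# d(x_n, y_n) for a few indices n; it tends to 0, so the two sequences are equivalent
gaps = [abs(decimal_sqrt2(n) - bisection_sqrt2(n)) for n in (5, 10, 20)]
```

Both sequences therefore represent the same point of the completion, and the distance of that point from the class of a constant sequence $q, q, q, \dots$ is the limit of $|x_n - q|$.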
4.4 A bijection (i.e. a one-to-one onto map) $f : (X,d) \to (X',d')$ is called an isometry if
$$\forall x,y \quad d'(f(x),f(y)) = d(x,y). \qquad (*)$$
(Note that (*) implies that $f$ is one-to-one. Thus, to verify that a mapping satisfying this condition is an isometry it suffices to prove that it is onto.) If such a mapping exists we say that the spaces $(X,d)$ and $(X',d')$ are isometric. A map satisfying the condition (*) without assuming that it is onto will be called an isometric embedding.
Proposition. Every metric space is isometric to a dense subspace of a complete metric space.
Proof. For $x \in X$ define $\widetilde{x} \in \widetilde{X}$ as the class containing the constant sequence $x, x, x, \dots$. Obviously the mapping $\iota = (x \mapsto \widetilde{x}) : X \to \iota[X] = \{\widetilde{x} \mid x \in X\} \subseteq \widetilde{X}$ is an isometry.
I. $\iota[X]$ is dense in $\widetilde{X}$. Consider an arbitrary $\varepsilon > 0$. For a $\xi \in \widetilde{X}$, choose a representative $(x_n)_n$ and an $n_0$ such that $d(x_m,x_n) < \varepsilon$ for $m,n \ge n_0$. Then
$$\widetilde{d}(\xi,\widetilde{x}_{n_0}) = \lim_n d(x_n,x_{n_0}) \le \varepsilon.$$
II. $\widetilde{X}$ is complete. Let $(\xi_n)_n$ be a Cauchy sequence in $\widetilde{X}$. Since $\iota[X]$ is a dense subset of $\widetilde{X}$, we can choose an $x_n \in X$ such that $\widetilde{d}(\xi_n,\widetilde{x}_n) < \frac1n$. For an $\varepsilon > 0$, choose an $n_0$ such that $\widetilde{d}(\xi_m,\xi_n) < \varepsilon$ whenever $m,n \ge n_0$. Then
$$d(x_m,x_n) = \widetilde{d}(\widetilde{x}_m,\widetilde{x}_n) \le \widetilde{d}(\widetilde{x}_m,\xi_m) + \widetilde{d}(\xi_m,\xi_n) + \widetilde{d}(\xi_n,\widetilde{x}_n) < \tfrac1m + \varepsilon + \tfrac1n$$
and we see that $(x_n)_n$ is a Cauchy sequence. Denote by $\xi$ the equivalence class of $(x_n)_n$. We will show that this $\xi$ is a limit, in $\widetilde{X}$, of the sequence $(\xi_n)_n$. Take an arbitrary $\varepsilon > 0$ and an $n_0$ such that $\frac{1}{n_0} < \frac{\varepsilon}{2}$ and for $n,k \ge n_0$, $d(x_n,x_k) < \frac{\varepsilon}{2}$ (this can be done since we already know that $(x_n)_n$ is a Cauchy sequence). Then for $n \ge n_0$, we have
$$\widetilde{d}(\xi_n,\xi) \le \widetilde{d}(\xi_n,\widetilde{x}_n) + \widetilde{d}(\widetilde{x}_n,\xi) < \tfrac{\varepsilon}{2} + \lim_k d(x_n,x_k) \le \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon.$$ □
4.5 An isometric embedding of a metric space $X$ into a complete metric space with a dense image is called a completion of $X$.
Proposition. Up to isometry, there exists precisely one completion of a metric space $X$. More precisely, in the notation of 4.4, if $\varphi : X \to Y$ is a completion then there exists an isometry $f : \widetilde{X} \to Y$ such that $f \circ \iota = \varphi$.
Proof. If we denote by $\rho$ the metric on $Y$, we have
$$\forall x,y \in X,\ \rho(\varphi(x),\varphi(y)) = d(x,y) \quad\text{and}\quad \overline{\varphi[X]} = Y.$$
For a $\xi \in \widetilde{X}$, choose a representative $(x_n)_n$ and put
$$f(\xi) = \lim_n \varphi(x_n) \quad (\text{in } Y)$$
(by the isometric embedding requirement, $(\varphi(x_n))_n$ is Cauchy and hence convergent in $Y$; if $(x_n)_n \sim (y_n)_n$, then again by the isometric embedding requirement, $\lim_n \rho(\varphi(x_n),\varphi(y_n)) = \lim_n d(x_n,y_n) = 0$, and hence $\lim_n \varphi(x_n) = \lim_n \varphi(y_n)$ so that the definition does not depend on the choice of a representative). We have $f(\widetilde{x}) = \varphi(x)$ (the limit of a constant sequence), and since a metric is (obviously) a continuous function, we have
$$\rho(f(\xi),f(\eta)) = \rho(\lim_n \varphi(x_n), \lim_n \varphi(y_n)) = \lim_n \rho(\varphi(x_n),\varphi(y_n)) = \lim_n d(x_n,y_n) = \widetilde{d}(\xi,\eta).$$
Thus, $f$ is an isometric embedding, and it remains to show that $f$ is onto. Take a $y \in Y$. Since $\varphi[X]$ is dense, there exists a sequence $(x_n)_n$ in $X$ such that $\lim \varphi(x_n) = y$. Thus in particular $(\varphi(x_n))_n$ is Cauchy, and, since $\varphi$ is an isometric embedding, so is $(x_n)_n$. If we denote by $\xi$ the equivalence class of $(x_n)_n$, we obtain $f(\xi) = \lim \varphi(x_n) = y$. □
4.6 Extension of uniformly continuous maps
When discussing the Fourier transform in Chapter 17, we will need the following important result on extension of uniformly continuous maps to the completion.
Proposition. Let $(X,d)$, $(X',d')$ be metric spaces, let $(X',d')$ be complete and let $Y$ be a dense subspace of $X$. Then each uniformly continuous $f : Y \to X'$ has a unique uniformly continuous extension $g : X \to X'$.
Proof. For an $x \in X$, choose a sequence $(x_n)_n$ in $Y$ such that $\lim x_n = x$, and set $g(x) = \lim_n f(x_n)$. (Clearly, this definition is forced by the assumption of continuity of $g$, which already proves uniqueness.) Let us show that this is a correct definition of a mapping: $(x_n)_n$ is a Cauchy sequence, hence, by the uniform continuity of $f$, $(f(x_n))_n$ is Cauchy and hence convergent; if $(y_n)_n$ is another sequence in $Y$ converging to $x$ we have a Cauchy sequence $f(x_1), f(y_1), f(x_2), f(y_2), \dots, f(x_n), f(y_n), \dots$ converging to both $\lim_n f(x_n)$ and $\lim_n f(y_n)$. Considering the constant sequence, $g(x) = f(x)$ for $x \in Y$.
Now let $\varepsilon > 0$. Choose $\varepsilon_1, \eta > 0$ such that $\varepsilon_1 + 2\eta < \varepsilon$, and a $\delta > 0$ such that $d(u,v) < \delta$ implies $d'(f(u),f(v)) < \varepsilon_1$ for $u,v \in Y$. Let $d(x,y) < \delta$, and let $(x_n)_n$, $(y_n)_n$ be sequences in $Y$ converging to $x$ and $y$ respectively. Choose $n$ sufficiently large such that $d(x_n,y_n) < \delta$ and $d'(f(x_n),g(x)),\ d'(f(y_n),g(y)) < \eta$. Then
$$d'(g(x),g(y)) \le d'(g(x),f(x_n)) + d'(f(x_n),f(y_n)) + d'(f(y_n),g(y)) < \eta + \varepsilon_1 + \eta < \varepsilon.$$ □
5 More on topological spaces: Separation
Topological spaces are seldom used in the generality of Chapter 2, Section 4. For various purposes, extra assumptions are usually added. In analysis, we typically encounter so-called separation axioms, (in fact, typically, the stronger ones), which we will briefly introduce in this section. It is worth noting that in this context, separation refers to separation of points or subsets by open sets; it is not related to separability as defined in Section 1 above.
5.1 $T_0$ and $T_1$
A topological space $X$ is said to be $T_0$ if for any two distinct points $x,y \in X$ there exists an open set $U$ such that either $x \notin U \ni y$ or $y \notin U \ni x$. This is equivalent to requiring that $\overline{\{x\}} = \overline{\{y\}}$ implies that $x = y$. A space is said to be $T_1$ if for any two distinct points $x,y \in X$ there is an open set $U$ such that $y \notin U \ni x$. This is equivalent to requiring that every finite set be closed.
It should be noted that while there is not much use for spaces that are not T0 , spaces which are not T1 are used a lot (typically, however, in applications outside analysis).
5.2 $T_2$, or the Hausdorff axiom
A space is Hausdorff (or $T_2$) if for any two distinct points $x,y \in X$ there are disjoint open sets $U,V$ such that $x \in U$ and $y \in V$. Hausdorff spaces are already "analysis-friendly"; for instance they admit concepts of convergence in which limits are unique. We will not discuss such topics but will present the following fact which has been promised before.
5.2.1 Proposition. In a Hausdorff space every compact subset is closed.
Proof. Let $A \subseteq X$ be a compact subspace. Fix an $x \notin A$. We will prove that there is a neighborhood of $x$ that is disjoint from $A$. For each $a \in A$ choose disjoint open sets $U_a \ni a$ and $V_a \ni x$. Then $\{U_a \mid a \in A\}$ is a cover of $A$ and hence there is a finite subcover $U_{a_1},\dots,U_{a_n}$. Set $V = \bigcap_{i=1}^{n} V_{a_i}$. Then $V \cap \bigcup_{i=1}^{n} U_{a_i} = \emptyset$ and hence $V \cap A = \emptyset$. □
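On a finite set the axioms $T_0$, $T_1$ and $T_2$ of 5.1 and 5.2 can be checked by brute force. The following is a minimal sketch with hypothetical function names; a topology is assumed to be given as a list of `frozenset`s, and the two-point example with open sets $\emptyset$, $\{1\}$, $\{0,1\}$ (often called the Sierpinski space) is chosen for illustration.

```python
from itertools import combinations

def is_t0(points, opens):
    # some open set contains exactly one of any two distinct points
    return all(any((x in U) != (y in U) for U in opens)
               for x, y in combinations(points, 2))

def is_t1(points, opens):
    # for every ordered pair x != y some open set contains x but not y
    return all(any(x in U and y not in U for U in opens)
               for x in points for y in points if x != y)

def is_hausdorff(points, opens):
    # distinct points have disjoint open neighborhoods
    return all(any(x in U and y in V and not (U & V) for U in opens for V in opens)
               for x, y in combinations(points, 2))

points = [0, 1]
sierpinski = [frozenset(), frozenset({1}), frozenset({0, 1})]
discrete = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
```

In the two-point example the subset $\{1\}$ is compact (it is finite) but not closed, which shows that the Hausdorff assumption in 5.2.1 cannot simply be dropped.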
From 2.4.1, we obtain the following generalization of 7.2 of Chapter 2.
5.2.2 Corollary. Let $f : X \to Y$ be a continuous map, let $X$ be compact and let $Y$ be Hausdorff. Then for every closed $A \subseteq X$, the image $f[A]$ is closed. Thus in particular such an $f : X \to Y$ that is bijective is a homeomorphism.
5.3 Regularity and complete regularity ($T_3$ and $T_{3+\frac12}$)
A space $X$ is regular, or $T_3$, if for every $x \in X$ and every closed $A \subseteq X$ such that $x \notin A$, there are open disjoint $U,V$ such that $x \in U$ and $A \subseteq V$. $X$ is completely regular, or $T_{3+\frac12}$, if for every $x \in X$ and every closed $A \subseteq X$ such that $x \notin A$ there is a continuous mapping $\varphi : X \to \langle 0,1\rangle$ such that $\varphi(x) = 0$ and $\varphi[A] \subseteq \{1\}$. Obviously a completely regular space is regular: take the assumed $\varphi$ and set $U = \varphi^{-1}[\langle 0,\frac12)]$ and $V = \varphi^{-1}[(\frac12,1\rangle]$.
5.3.1 Proposition. A topological space $X$ is regular if and only if for every open $U \subseteq X$,
$$U = \bigcup\{V \mid V \text{ open},\ \overline{V} \subseteq U\}.$$
Proof. I. Let $X$ be regular and let $x \in U$. Then $x \notin X \setminus U$ and there are disjoint open sets $V \ni x$ and $W \supseteq X \setminus U$. Now $V \subseteq X \setminus W \subseteq U$ and since $X \setminus W$ is closed, $\overline{V} \subseteq U$.
II. Let the condition hold, let $A$ be closed, and let $x \notin A$. Then $x \in \bigcup\{V \mid V \text{ open},\ \overline{V} \subseteq X \setminus A\}$ and hence there is an open set $V \ni x$ such that $A \subseteq X \setminus \overline{V}$. □
5.4 Normality
A space is normal (or $T_4$) if for any two disjoint closed subsets $A,B \subseteq X$, there exist disjoint open sets $U,V$ such that $A \subseteq U$ and $B \subseteq V$.
5.4.1 Remarks
1. After 5.3, the reader may expect an axiom $T_{4+\frac12}$ requiring a separation of disjoint closed sets by continuous real functions. This, however, already follows from normality as we will see in 5.4.6 below. On the other hand, complete regularity does not follow from regularity.
2. Of course we have $T_2 \Rightarrow T_1 \Rightarrow T_0$ while we do not have such implications for the higher separation axioms ($T_3$ does not imply $T_2$, $T_4$ does not imply $T_3$). The reason is that the higher separation axioms in fact do not require that points be closed. In practice, one usually works with $T_3\,\&\,T_1$, $T_{3+\frac12}\,\&\,T_1$ and $T_4\,\&\,T_1$ and then the expected implications from "higher" to "lower" separation axioms naturally hold.
5.4.2 Proposition. Every metric space $(X,d)$ is normal.
Proof (Recall 8.4 of Chapter 2). For disjoint closed sets $A,B \subseteq X$ define a mapping $\varphi : X \to \langle 0,1\rangle$ by setting
$$\varphi(x) = \frac{d(x,A)}{d(x,A) + d(x,B)}.$$
Since $A,B$ are closed and disjoint we cannot have simultaneously $d(x,A) = 0$ and $d(x,B) = 0$ and hence $d(x,A) + d(x,B) > 0$ for all $x$. Thus, $\varphi$ is continuous and we can take $U = \varphi^{-1}[\langle 0,\frac12)]$ and $V = \varphi^{-1}[(\frac12,1\rangle]$. □
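The separating function of this proof is easy to evaluate when $A$ and $B$ are finite subsets of the real line. The sketch below is illustrative only; the names `dist_to_set` and `phi` and the sample sets are assumptions.

```python
def dist_to_set(x, S):
    """d(x, S) for a finite non-empty set S of real numbers."""
    return min(abs(x - s) for s in S)

def phi(x, A, B):
    """phi(x) = d(x, A) / (d(x, A) + d(x, B)) for disjoint finite sets A and B."""
    dA, dB = dist_to_set(x, A), dist_to_set(x, B)
    return dA / (dA + dB)

A = [0.0, 1.0]
B = [3.0, 5.0]
```

The value is $0$ on $A$, $1$ on $B$, and a point belongs to $U$ or to $V$ according to whether it is closer to $A$ or to $B$; for instance `phi(1.5, A, B)` is $0.25$, so $1.5 \in U$.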
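5.4.3 Proposition. Every regular Lindelöf topological space is normal.
Proof. Let $X$ be regular Lindelöf and let $A,B$ be closed and disjoint sets. For $a \in A$, choose open disjoint sets $U_a \ni a$ and $V_a' \supseteq B$. Then $\{U_a \mid a \in A\} \cup \{X \setminus A\}$ is a cover of $X$ and therefore we have a countable subcover $X \setminus A, U_1, \dots, U_n, \dots$. Thus we have obtained open sets $U_1,\dots,U_n,\dots$ such that
$$\bigcup_n U_n \supseteq A \quad\text{and}\quad \overline{U_n} \cap B = \emptyset.$$
Taking, instead, the unions $U_1,\ U_1 \cup U_2,\ U_1 \cup U_2 \cup U_3, \dots$ we can assume that $U_1 \subseteq U_2 \subseteq \dots \subseteq U_n \subseteq \cdots$. Similarly we can find open sets $V_1 \subseteq V_2 \subseteq \dots \subseteq V_n \subseteq \cdots$ such that
$$\bigcup_n V_n \supseteq B \quad\text{and}\quad \overline{V_n} \cap A = \emptyset.$$
Now set
$$\widetilde{U}_n = U_n \setminus \bigcup_{j=1}^{n} \overline{V_j}, \quad \widetilde{V}_n = V_n \setminus \bigcup_{j=1}^{n} \overline{U_j}, \quad U = \bigcup_n \widetilde{U}_n, \quad\text{and}\quad V = \bigcup_n \widetilde{V}_n.$$
We have $A \subseteq U$ (no point of $A$ appears in any of the subtracted $\overline{V_j}$) and $B \subseteq V$, and $U \cap V = \bigcup_{m,n} (\widetilde{U}_m \cap \widetilde{V}_n) = \emptyset$, since in any of the intersections $\widetilde{U}_m \cap \widetilde{V}_n$, we have either $m \le n$ or $m \ge n$. □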
5.4.4 Proposition. Every compact Hausdorff space is normal.
Proof. By 5.4.3, it suffices to prove that the space is regular. Let $A$ be closed and $x \notin A$. For $a \in A$ choose disjoint open sets $U_a \ni a$ and $V_a \ni x$. Then $\{U_a \mid a \in A\}$ is a cover of $A$ and, since $A$ is compact by 2.4.3, there is a finite subcover $U_{a_1},\dots,U_{a_n}$. Set $U = \bigcup_{i=1}^{n} U_{a_i}$ and $V = \bigcap_{i=1}^{n} V_{a_i}$. Then $x \in V$, $A \subseteq U$ and $U \cap V = \emptyset$. □
5.4.5 Lemma. Let $Q \subseteq \langle 0,1\rangle$ be a dense subset. Let us have in a topological space $X$ open sets $U_q$, $q \in Q$, such that
$$q < r \ \Rightarrow\ \overline{U_q} \subseteq U_r.$$
Define a mapping $\varphi : X \to \langle 0,1\rangle$ by setting $\varphi(x) = \inf\{q \mid x \in U_q\}$. Then $\varphi$ is continuous.
Proof. Set $M(x) = \{q \mid x \in U_q\}$. Since obviously $q \in M(x)$ and $q < r$ imply $r \in M(x)$, we have $q > \varphi(x) \Rightarrow x \in U_q$ and hence
$$x \notin U_q \ \Rightarrow\ \varphi(x) \ge q. \qquad (*)$$
For $q < \varphi(x)$ take an $r$ with $q < r < \varphi(x)$; then $x \notin U_r$ and we see that
$$q < \varphi(x) \ \Rightarrow\ x \notin \overline{U_q}. \qquad (**)$$
Let $\varphi(x) \in (\alpha,\beta)$ (the cases $\varphi(x) = 0$ or $1$ are only simpler and can be left to the reader). Choose $q,r \in Q$ with $\alpha < q < \varphi(x) < r < \beta$. Then by the implications above,
$$x \in U_r \setminus \overline{U_q} \quad\text{and}\quad \forall y \in U_r \setminus \overline{U_q},\ \varphi(y) \in (\alpha,\beta).$$
Thus, the neighborhood $U_r \setminus \overline{U_q}$ of $x$ is being mapped into $(\alpha,\beta)$ and we see that $\varphi$ is continuous. □
5.4.6 Proposition. (Urysohn's Theorem) Let $A,B$ be disjoint closed subsets of a normal space $X$. Then there is a continuous mapping $\varphi : X \to \langle 0,1\rangle$ such that $\varphi[A] \subseteq \{0\}$ and $\varphi[B] \subseteq \{1\}$.
Proof. Let $Q$ be the set of all dyadic rationals between $0$ and $1$, that is, the numbers
$$\frac{k}{2^n}, \quad n = 1,2,\dots;\ k = 1,2,\dots,2^n - 1.$$
Choose disjoint open $U(\frac12)$, $V$ such that $A \subseteq U(\frac12)$ and $B \subseteq V$ (so that $\overline{U(\frac12)} \subseteq X \setminus B$). Now let $U(\frac{k}{2^m})$ be already chosen for $m \le n$ so that
$$q < r \ \Rightarrow\ \overline{U(q)} \subseteq U(r).$$
For $k = 0,\dots,2^n - 1$, choose disjoint open sets $U(\frac{2k+1}{2^{n+1}})$, $V$ such that
$$\overline{U(\tfrac{k}{2^n})} \subseteq U(\tfrac{2k+1}{2^{n+1}}) \quad\text{and}\quad X \setminus U(\tfrac{k+1}{2^n}) \subseteq V$$
(and hence $\overline{U(\frac{2k+1}{2^{n+1}})} \subseteq U(\frac{k+1}{2^n})$), where for $k = 0$ we take the set $A$ instead of $\overline{U(0)}$ and for $k = 2^n - 1$ we take $B$ instead of $X \setminus U(1)$. Thus we obtain inductively a system $U(q)$, $q \in Q$, satisfying the requirements of Lemma 5.4.5, and the statement follows. □
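The formula $\varphi(x) = \inf\{q \mid x \in U_q\}$ of Lemma 5.4.5 can be evaluated in a toy case. The sketch below assumes $X = \mathbb{R}$, $A = (-\infty,0\rangle$, $B = \langle 1,\infty)$ and the particular system $U_q = (-\infty,q)$, which satisfies $\overline{U_q} \subseteq U_r$ for $q < r$; it also assumes the convention that the infimum of the empty set is $1$, and it restricts $q$ to dyadic rationals of one fixed level. The name `phi_level` is hypothetical.

```python
from fractions import Fraction

def phi_level(x, level):
    """inf{q | x in U_q} over dyadic q = k / 2**level with U_q = (-inf, q)."""
    dyadics = (Fraction(k, 2 ** level) for k in range(1, 2 ** level))
    # x lies in U_q exactly when x < q; an empty infimum is taken to be 1
    return min((q for q in dyadics if x < q), default=Fraction(1))
```

For growing `level` the values approach the continuous function that is $0$ on $A$, $1$ on $B$ and equal to $x$ in between, with an error of at most $2^{-\text{level}}$.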
5.4.7 Remarks
1. In particular, every Lindelöf regular space is completely regular. It should be noted that, with the exception of $T_3 \not\Rightarrow T_{3+\frac12}$, proving that a lower separation axiom does not imply a higher one is easy. This exception, on the contrary, was a hard nut to crack (and had been an open problem for quite some time). Proposition 5.4.6 shows why: the counterexample has to use uncountable reasoning in a substantial way.
2. Lemma 5.4.5 can be used to reformulate complete regularity without referring to the real numbers. Recall 5.3.1. Denote by $\prec$ the relation
$$V \prec U \ \equiv_{\mathrm{df}}\ \overline{V} \subseteq U.$$
It is in general not interpolative (that is, for $V \prec U$ we do not necessarily have a $W$ such that $V \prec W \prec U$). If we denote by $\prec\!\prec$ the largest interpolative subrelation of $\prec$ then completely regular spaces can be characterized as those where each open $U$ is the union $\bigcup\{V \mid V \prec\!\prec U\}$.
6 The space of continuous functions revisited: The Arzelà-Ascoli Theorem and the Stone-Weierstrass Theorem
Certain very strong theorems hold about the space $C(K)$ of (necessarily bounded) continuous real functions on a compact metric space $K$ with the supremum metric considered in 7.7 of Chapter 2. We will prove two such results in this section, and use them in Chapters 10 and 17 below.
6.1 The Arzelà-Ascoli Theorem
A sequence of functions $f_n \in C(K)$ is called uniformly bounded if there exists a number $M$ such that $|f_n(x)| < M$ for every $n$ and every $x \in K$. Therefore, being uniformly bounded is the same thing as $f_n \in \Omega(0,M)$ for all $n$, for a fixed $M > 0$, where $0$ is the constant zero function. Additionally, the sequence of functions $(f_n)_n$ is called equicontinuous if for every $\varepsilon > 0$, there exists a $\delta > 0$ such that for every $x,y \in K$ and every $n \in \mathbb{N}$,
$$d(x,y) < \delta \ \Rightarrow\ |f_n(x) - f_n(y)| < \varepsilon.$$
Thus, this means that the functions $f_n$ are all uniformly continuous with the same bound $\delta$ depending on $\varepsilon$, independent of $n$.
6.2 Theorem. (The Arzelà-Ascoli Theorem) Let $K$ be a compact metric space. Then any uniformly bounded equicontinuous sequence of functions $(f_n)_n$ in $C(K)$ has a uniformly convergent subsequence (i.e. a subsequence convergent in $C(K)$).
Proof. By Theorem 2.2, the space $K$ is totally bounded. Therefore, for each $\varepsilon > 0$, there is a finite subset $S_\varepsilon \subseteq K$ such that for every $x \in K$, $d(x,y) < \varepsilon$ for at least one $y \in S_\varepsilon$. Now let
$$S = \bigcup_k S_{1/k} = \{x_1, x_2, x_3, \dots\}.$$
By uniform boundedness, the sequence $(f_n(x_1))_n$ is bounded in $\mathbb{R}$, so there exists a subsequence $(f_{i_{1n}})_n$ of $(f_n)_n$ such that the sequence $(f_{i_{1n}}(x_1))_n$ converges. Next, there exists a subsequence $(f_{i_{2n}})_n$ of $(f_{i_{1n}})_n$ such that $(f_{i_{2n}}(x_2))_n$ converges. Repeating this procedure, we may successively pick subsequences $(f_{i_{jn}})_n$ such that $(f_{i_{jn}}(x_j))_n$ converges. Note however that then since we picked each sequence as a subsequence of the previous one, the "diagonal" subsequence $(f_{i_{nn}})_n$ converges at every point of $S$. Now let $\varepsilon > 0$. Take $\delta = \delta(\varepsilon/3)$ from the definition of equicontinuity, choose $k$ with $1/k \le \delta$, and let $N$ be such that for $m,n > N$, $|f_{i_{mm}}(s) - f_{i_{nn}}(s)| < \varepsilon/3$ for every $s$ in the finite set $S_{1/k}$. Then, by the triangle inequality, $|f_{i_{mm}}(x) - f_{i_{nn}}(x)| < \varepsilon$ for every $x \in K$ (since there exists an $s \in S_{1/k}$ with $d(x,s) < \delta(\varepsilon/3)$), showing that the subsequence $(f_{i_{nn}})_n$ is Cauchy in $C(K)$. Since however $C(K)$ is complete (by Proposition 7.7.2), this subsequence converges in $C(K)$. □
Sometimes we are interested in working in the space $C(X)$ of bounded real continuous functions on a space $X$ which is not compact. In that case, the assumptions of equicontinuity and uniform boundedness, and consequently the conclusion of uniform convergence, are often too strong. One strategy for getting around this is the following: We say that a topological space $X$ is $\sigma$-compact if it is a union of countably many compact subsets.
6.3 Theorem. Suppose that $X$ is a $\sigma$-compact metric space. Then every sequence $(f_n)_n$ in $C(X)$ which is equicontinuous and bounded on every compact $K \subseteq X$ has a subsequence which is uniformly convergent on every compact $K \subseteq X$.
Proof. We use the "diagonal method" one more time. Let
$$X = \bigcup_{n=1}^{\infty} K_n$$
for $K_n$ compact. Then using Theorem 6.2, choose a subsequence $(f_{i_{1n}})_n$ which converges uniformly on $K_1$. Within this subsequence, choose another subsequence $(f_{i_{2n}})_n$ which converges uniformly on $K_2$. Proceeding in the same way, keep choosing consecutive subsequences, so that $(f_{i_{jn}})_n$ converges uniformly on $K_j$. Then the "diagonal" subsequence $(f_{i_{nn}})_n$ satisfies the requirement. □
Another important problem in analysis is approximation, i.e. the problem of finding a convenient subset dense in a given metric space $X$. We will now prove a very strong approximation theorem for the space $C(K)$ of real functions on a compact metric space $K$, for which we will find an application in Chapter 17 below, in our treatment of Fourier series.
6.4 The Stone-Weierstrass Theorem: Assumptions and statement
Notice that the space $C(K)$ has the structure of a vector space over $\mathbb{R}$, and that the operations of addition and multiplication by a scalar are continuous. In addition to this, $C(K)$ also has an operation of product of functions, which is also continuous. We will consider subsets $A \subseteq C(K)$ satisfying the following assumptions:
(1) $A$ is a vector subspace of $C(K)$, contains the constant function $1$ with value $1$, and for $f,g \in A$, we have $f \cdot g \in A$. (We say that $A$ is a unital subalgebra of $C(K)$.)
(2) For any two distinct points $x,y \in K$, there exists a function $f \in A$ such that $f(x) \ne f(y)$ (we say that $A$ separates points).
6.4.1 Theorem. (The Stone-Weierstrass Theorem) Let $A$ be a unital subalgebra of $C(K)$ which separates points. Then $A$ is a dense subset of $C(K)$.
The proof of this theorem will occupy the remainder of this section. However, let us observe one thing right away: since the operations of addition of functions, multiplication of functions and multiplication by a scalar are continuous functions $C(K) \times C(K) \to C(K)$, $\mathbb{R} \times C(K) \to C(K)$, the closure of a unital subalgebra is a unital subalgebra. Therefore, the statement of the theorem will follow if we can prove that every closed unital subalgebra of $C(K)$ which separates points is equal to $C(K)$.
6.5 An important step in the proof of the theorem is the fact that the square root (and hence the absolute value) of a non-negative continuous function on a compact interval is a uniform limit of polynomials. To prove this, we use the Taylor expansion of $\sqrt{1-x}$.
Lemma. Let $0 < b < 1$. Then the Taylor expansion of $\sqrt{1-x}$ at the point $x = 0$ converges absolutely uniformly in the interval $\langle -b,b\rangle$.
While it is possible to prove this fact in an elementary way, a much easier proof will follow from the methods of complex analysis. Because of this, we will skip the proof at this point, and refer the reader to Exercise (8) of Chapter 10 where we define rigorously the function $\sqrt{1-x}$ for $x \in \mathbb{C}$, $\mathrm{Re}(x) < 1$, and prove that the (complex) radius of convergence of its Taylor series is $1$.
Comment: In fact, using a lemma of Abel's, the upper bound of uniform convergence can be extended to $1$. However, we do not need that fact.
6.6 Lemma. Let $A \subseteq C(K)$ be a closed unital subalgebra.
(1) If $f \in A$ and $f \ge 0$, then $\sqrt{f} \in A$.
(2) If $f \in A$, then $|f| \in A$.
(3) If $f,g \in A$, then $\max(f,g), \min(f,g) \in A$.
Proof. (1) Without loss of generality, $\max |f| < 1$. By Lemma 6.5,
$$\sqrt{f + 1/n} = \sum_{k=0}^{\infty} (-1)^k \binom{1/2}{k} (1 - 1/n - f)^k$$
converges uniformly for $n = 1,2,\dots$, and hence
$$\sqrt{f + 1/n} \in A. \qquad (6.6.1)$$
The function $\sqrt{x}$ is continuous, and hence, by Theorem 6.6 of Chapter 2, uniformly continuous on $\langle 0,2\rangle$, which implies that $\sqrt{f + 1/n}$ converges to $\sqrt{f}$ uniformly with $n \to \infty$, and hence $\sqrt{f} \in A$.
(2) This follows from the formula $|f| = \sqrt{f^2}$ and from (1).
(3) This follows from (2) and the fact that
$$\max(f,g) = \tfrac12 (f + g + |f - g|), \quad \min(f,g) = \tfrac12 (f + g - |f - g|).$$ □
6.7 Proof of Theorem 6.4.1:
Let $A \subseteq C(K)$ be a closed unital subalgebra which separates points, and let $f \in C(K)$. Given $\varepsilon > 0$, we will construct a $g \in A$ such that for every $x \in K$,
$$|f(x) - g(x)| < \varepsilon. \tag{*}$$
Since $\varepsilon > 0$ was arbitrary, this will imply that $f$ is a limit of a uniformly convergent sequence of elements of $A$, and hence $f \in A$ since $A$ is closed. Since $f$ was arbitrary, $A = C(K)$, which implies the statement of the theorem.
To construct $g$, consider two points $s \neq t \in K$. Since $A$ separates points, we may choose $h \in A$ such that $h(s) \neq h(t)$. Now define, for $v \in K$,
$$f_{s,t}(v) = f(s) + (f(t) - f(s)) \, \frac{h(v) - h(s)}{h(t) - h(s)}.$$
Clearly, $f_{s,t} \in A$, and
$$f_{s,t}(s) = f(s), \qquad f_{s,t}(t) = f(t).$$
Now fixing $s$, let
$$U_t = \{ v \in K \mid f_{s,t}(v) < f(v) + \varepsilon \}.$$
Then $U_t = (f_{s,t} - f)^{-1}[(-\infty, \varepsilon)]$, and since $f_{s,t}, f$ are continuous, $U_t$ is open. On the other hand, $s, t \in U_t$, and hence $(U_t)_{t \neq s}$ is an open cover of $K$. Since $K$ is compact, this open cover has a finite subcover $(U_{t_1}, \dots, U_{t_m})$. Putting
$$h_s = \min(f_{s,t_1}, \dots, f_{s,t_m}),$$
we have $h_s \in A$ by Lemma 6.6, and
$$h_s < f + \varepsilon, \qquad h_s(s) = f(s).$$
Now let
$$V_s = \{ v \in K \mid h_s(v) > f(v) - \varepsilon \}.$$
Then $V_s = (h_s - f)^{-1}[(-\varepsilon, \infty)]$, and hence $V_s$ is open. Since $s \in V_s$, $(V_s)_{s \in K}$ is an open cover of $K$. Since $K$ is compact, this cover has a finite subcover $(V_{s_1}, \dots, V_{s_p})$. Let
$$g = \max(h_{s_1}, \dots, h_{s_p}).$$
Then $g \in A$ by Lemma 6.6, and
$$f - \varepsilon < g < f + \varepsilon,$$
as desired. □
7 Exercises
(1) Prove directly that a subspace of a separable metric space is separable.
(2) Prove that a subspace of a totally bounded metric space is totally bounded.
(3) Prove that a (finite) product of totally bounded metric spaces is totally bounded.
(4) Using Baire's theorem, prove that an increasing function $f: \langle 0, 1 \rangle \to \mathbb{R}$ is Lipschitz on a dense subset of $\langle 0, 1 \rangle$.
(5) Prove a modification of Baire's Category Theorem where "complete metric space" is replaced by "compact Hausdorff space".
(6) Prove that an onto isometry of metric spaces is a homeomorphism.
(7) A rigorous construction of real numbers. Note carefully that the field of real numbers $\mathbb{R}$ cannot be constructed as a completion of the metric space $\mathbb{Q}$ of rational numbers directly using 4.1, since the definition of the metric in Lemma 4.2 uses the real numbers, thereby making such an argument circular. Nevertheless, this difficulty can be circumvented, and the approach of 4.1 can be used to define $\mathbb{R}$ after all. Following the logically correct sequence of steps is the point of the present exercise.
(a) Consider, on $\mathbb{Q}$, the metric $d(a, b) = |a - b|$. Now define $\mathbb{R}$ as the set of equivalence classes of Cauchy sequences with respect to the equivalence relation $\sim$ defined in 4.1. Prove that $\mathbb{R}$ is a field with respect to the operations of addition and multiplication of Cauchy sequences, which contains $\mathbb{Q}$ as the subfield of (equivalence classes of) constant sequences.
(b) Write, for a Cauchy sequence $x = (x_i)_i$ in $\mathbb{Q}$, $x > 0$ when there exists an $N$ such that $x_i > 0$ for every $i > N$. Prove that if $x \sim y$, then $x > 0$ if and only if $y > 0$. (Caution: note that this fails if we tried to use $\geq$ instead of $>$.)
(c) Define, for $a \in \mathbb{R}$, $|a| = a$ when $a > 0$ and $|a| = -a$ $(= 0 - a)$ otherwise. Prove that $d(a, b) = |a - b|$ is a metric on $\mathbb{R}$ and that $\mathbb{R}$ is a complete metric space with respect to this metric.
(d) The material of 4.1 is now rigorous without previously assuming a construction of $\mathbb{R}$. Verify (caution, it is very nearly a tautology) that the metric space $\mathbb{R}$ is indeed the completion of the metric space $\mathbb{Q}$ as defined in 4.1.
(8) Prove that any open set in $\mathbb{R}^n$ is $\sigma$-compact.
(9) Prove the following converse to the Arzelà-Ascoli Theorem: If $X$ is a compact metric space and $(f_n)_n$ is a uniformly convergent sequence in $C(X)$, then it is uniformly bounded and equicontinuous.
(10) Prove the following result known as the Weierstrass Approximation Theorem: For a continuous function $f: \langle a, b \rangle \to \mathbb{R}$, there exists a sequence of polynomials (with real coefficients) $p_n(x)$ which, when restricted to $\langle a, b \rangle$, converge uniformly to $f$.
(11) Prove that the set of all polynomials in the variables $\sin(nx)$, $\cos(nx)$, $n = 0, 1, 2, \dots$ is dense in $C(\langle 0, 2\pi \rangle)$. Is the set of all polynomials in the variables $\sin(nx)$, $n = 0, 1, 2, \dots$ dense in $C(\langle 0, 2\pi \rangle)$? Prove or disprove.
10 Complex Analysis I: Basic Concepts
In this chapter, we will develop the basic principles of the analysis of complex functions of one complex variable. As we will see, using the results of Chapter 8, these developments come almost for free. Yet, the results are of great significance. On the one hand, complex analysis gives a perfect computation of the convergence of a Taylor expansion, which is of use even if we are looking at functions of one real variable (for example, power functions with a real power). On the other hand, the very rigid, almost “algebraic”, behavior of holomorphic functions is a striking mathematical phenomenon important for the understanding of areas of higher mathematics such as algebraic geometry ([8]). In this chapter, the reader will also see a proof of the Fundamental Theorem of Algebra and, in Exercise (4), a version of the famous Jordan Theorem on simple curves in the plane.
1
The derivative of a complex function. Cauchy-Riemann conditions
1.1 From 1.2 of Chapter 1, recall the complex conjugate $\bar{z} = x - iy$ of $z = x + iy$ and the absolute value $|z| = \sqrt{z \bar{z}}$, the easy rules
$$\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2, \qquad \overline{z_1 \cdot z_2} = \bar{z}_1 \cdot \bar{z}_2 \qquad \text{and} \qquad |z_1 \cdot z_2| = |z_1| \cdot |z_2|,$$
and the slightly harder triangle inequality
$$|z_1 + z_2| \leq |z_1| + |z_2|.$$
Further recall from 4.2 of Chapter 8 that the set of complex numbers $\mathbb{C}$ is identified with the Euclidean plane, with the distance $|z_1 - z_2|$ equal to the Euclidean distance in $\mathbb{R}^2$.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 10, © Springer Basel 2013
1.2 Let $U \subseteq \mathbb{C}$ be an open subset and let $f: U \to \mathbb{C}$ be a mapping, i.e. a complex function of one variable. We can compute, in the field $\mathbb{C}$, the values $\frac{1}{h}(f(z + h) - f(z))$ for $h \neq 0 = 0 + i0$, and, analogously to the case of real functions of one variable, consider the limit
$$\lim_{h \to 0} \frac{f(z + h) - f(z)}{h}$$
(but this time in the metric space $\mathbb{C}$), if it exists. If the limit exists, we speak (again) of a derivative of $f$ at $z$. More generally, one can introduce, in the obvious way, partial derivatives of functions $f: U_1 \times \dots \times U_n \to \mathbb{C}$ of several complex variables. One uses the same notation as in the real case:
$$f', \quad f'(z), \quad \frac{df}{dz}, \quad \text{etc.}$$
By precisely the same procedure as in the real case we can prove the formulas
$$(f + g)' = f' + g', \qquad (\alpha f)' = \alpha f', \qquad (f \cdot g)' = f' g + f g'$$
(the second of which concerns the multiplication by a complex constant), the composition rule
$$(f \circ g)'(z) = f'(g(z)) \cdot g'(z)$$
and the formula $(z^n)' = n z^{n-1}$, so we can take derivatives of polynomials exactly as in the real case.
1.3 What we cannot do, however, is adopt the interpretation of a derivative as describing a tangent, or expressing smoothness, as in the real case. The function $f(z) = \bar{z}$ is certainly as smooth as a map can be: geometrically it is just mirroring the plane along the real axis. But we have here
$$\frac{f(z + h) - f(z)}{h} = \frac{\overline{z + h} - \bar{z}}{h} = \frac{\bar{h}}{h},$$
an expression that has no limit for $h$ approaching 0: on the real axis, i.e. for $h = h_1 + i0$, we have constantly the value $\frac{\bar{h}}{h} = \frac{h_1}{h_1} = 1$ while on the imaginary axis, i.e. for $h = 0 + ih_2$, we have $\frac{\bar{h}}{h} = \frac{-h_2}{h_2} = -1$.
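The two directional values computed above are easy to reproduce with floating point complex numbers. The sketch below is only an illustration under our own assumptions (the sample point `z`, the step size and the helper names are hypothetical); it contrasts the conjugation map with $z \mapsto z^2$, whose quotient approaches $2z$ from both directions.

```python
def difference_quotient(f, z, h):
    """The quotient (f(z + h) - f(z)) / h for a non-zero complex increment h."""
    return (f(z + h) - f(z)) / h


def conjugate(z):
    return z.conjugate()


def square(z):
    return z * z


z = 1.0 + 2.0j   # hypothetical sample point
step = 1e-6      # hypothetical small increment

# Approach along the real axis (h = h1 + i0) and the imaginary axis (h = 0 + i h2).
conj_along_real = difference_quotient(conjugate, z, complex(step, 0.0))
conj_along_imag = difference_quotient(conjugate, z, complex(0.0, step))
square_along_real = difference_quotient(square, z, complex(step, 0.0))
square_along_imag = difference_quotient(square, z, complex(0.0, step))
```

For the conjugation the two quotients stay near $1$ and $-1$ respectively, so no limit exists, while for the square both are close to $2z$.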
In other words, while the condition of existence of complex derivative does imply the existence of total differential of the function f considered as a map R2 ! R2 (or U ! R2 where U is an open set in R2 ), the converse is not true: the existence of a complex derivative is a much stronger condition. We will see below in 5.3 that it has a different interpretation, namely of f preserving orientation and angles: smoothness follows.
1.4
Cauchy–Riemann conditions
Writing $z = x + iy$, we can view a complex function $f: U \to \mathbb{C}$ as
$$f(z) = P(x, y) + iQ(x, y)$$
where $P, Q$ are real functions in two real variables. We will now show that the differentiability of $f$ implies certain equations between the partial derivatives of $P$ and $Q$.
1.4.1 Theorem. Let a complex function $f$ have a (complex) derivative at a point $z = x + iy$. Then the functions $P, Q$ have partial derivatives at $(x, y)$ and we have
$$\frac{\partial P(x, y)}{\partial x} = \frac{\partial Q(x, y)}{\partial y} \qquad \text{and} \qquad \frac{\partial P(x, y)}{\partial y} = -\frac{\partial Q(x, y)}{\partial x}. \tag{CR}$$
The derivative of $f$ is then given by the formulas
$$f'(z) = \frac{\partial P(x, y)}{\partial x} + i \frac{\partial Q(x, y)}{\partial x} = \frac{\partial Q(x, y)}{\partial y} - i \frac{\partial P(x, y)}{\partial y}.$$
Remark. The equations (CR) are referred to as the Cauchy-Riemann conditions. The theorem shows that these conditions are necessary for complex differentiability. We will show in Theorem 1.5 below that the conditions are also sufficient when $f$ is continuously differentiable. A theorem of Looman and Menchoff states, more generally, that the conditions are also sufficient assuming only that $f$ is continuous, but we will not need that result here. The conditions (CR) alone, without any additional assumption on $f$, however, do not imply differentiability (see Exercise (2)).
Proof. Put $h = h_1 + ih_2$. We have
$$\frac{1}{h}(f(z + h) - f(z)) = \frac{1}{h_1 + ih_2}(P(x + h_1, y + h_2) - P(x, y)) + \frac{i}{h_1 + ih_2}(Q(x + h_1, y + h_2) - Q(x, y)). \tag{*}$$
For $h_2 = 0$ (and $h_1 \neq 0$) this yields in particular
$$\frac{1}{h_1}(P(x + h_1, y) - P(x, y)) + \frac{i}{h_1}(Q(x + h_1, y) - Q(x, y)) \tag{**}$$
while for $h_1 = 0$ (and $h_2 \neq 0$) we obtain
$$-\frac{i}{h_2}(P(x, y + h_2) - P(x, y)) + \frac{1}{h_2}(Q(x, y + h_2) - Q(x, y)). \tag{***}$$
If the expression (*) has a limit for $h \to 0$, the expression (**) has the same limit for $h_1 \to 0$, namely
$$\frac{\partial P(x, y)}{\partial x} + i \frac{\partial Q(x, y)}{\partial x} \quad (= f'(z))$$
and similarly (***) yields, for $h_2 \to 0$,
$$\frac{\partial Q(x, y)}{\partial y} - i \frac{\partial P(x, y)}{\partial y} \quad (= f'(z)).$$
Comparing the real and the imaginary parts, we obtain the desired equations. □
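The conditions (CR) and the formula $f'(z) = \partial P / \partial x + i \, \partial Q / \partial x$ can be checked numerically by finite differences. The following sketch makes its own assumptions: the central-difference step, the sample point and the helper names are hypothetical, and the exponential function from Python's `cmath` module is used as a test case whose derivative is known.

```python
import cmath


def partial_derivatives(f, x, y, step=1e-6):
    """Central-difference estimates (P_x, P_y, Q_x, Q_y) for f = P + iQ at (x, y)."""
    fx = (f(complex(x + step, y)) - f(complex(x - step, y))) / (2 * step)
    fy = (f(complex(x, y + step)) - f(complex(x, y - step))) / (2 * step)
    return fx.real, fy.real, fx.imag, fy.imag


def cauchy_riemann_defect(f, x, y):
    """|P_x - Q_y| + |P_y + Q_x|; approximately zero when (CR) holds at (x, y)."""
    p_x, p_y, q_x, q_y = partial_derivatives(f, x, y)
    return abs(p_x - q_y) + abs(p_y + q_x)


x, y = 0.3, -0.7   # hypothetical sample point
defect_exp = cauchy_riemann_defect(cmath.exp, x, y)
defect_conj = cauchy_riemann_defect(lambda w: w.conjugate(), x, y)

# f'(z) recovered as P_x + i Q_x, to be compared with exp(z) itself.
p_x, p_y, q_x, q_y = partial_derivatives(cmath.exp, x, y)
derivative_exp = complex(p_x, q_x)
```

For the exponential the defect is at the level of rounding error, while for the conjugation of 1.3 it equals 2, since there $P_x = 1$ and $Q_y = -1$.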
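1.5 Theorem. Let $P, Q$ be real functions of two variables with continuous partial derivatives, let $f(z) = P(x, y) + iQ(x, y)$ and let the conditions (CR) be satisfied at some point $z = x + iy \in U$. Then $f$ has a derivative at $z$.
Proof. Put $h = h_1 + ih_2$. We have
$$\frac{1}{h}(f(z + h) - f(z)) = \frac{1}{h}\bigl(P(x + h_1, y + h_2) - P(x, y) + iQ(x + h_1, y + h_2) - iQ(x, y)\bigr)$$
$$= \frac{1}{h}\bigl(P(x + h_1, y + h_2) - P(x + h_1, y) + P(x + h_1, y) - P(x, y) + i(Q(x + h_1, y + h_2) - Q(x + h_1, y) + Q(x + h_1, y) - Q(x, y))\bigr).$$
Denote the right-hand side by $u$. Using the Mean Value Theorem, we obtain
$$P(x + h_1, y + h_2) - P(x + h_1, y) + P(x + h_1, y) - P(x, y) = \frac{\partial P(x + h_1, y + \alpha h_2)}{\partial y} h_2 + \frac{\partial P(x + \beta h_1, y)}{\partial x} h_1$$
and similarly
$$Q(x + h_1, y + h_2) - Q(x + h_1, y) + Q(x + h_1, y) - Q(x, y) = \frac{\partial Q(x + h_1, y + \gamma h_2)}{\partial y} h_2 + \frac{\partial Q(x + \delta h_1, y)}{\partial x} h_1$$
with some $0 < \alpha, \beta, \gamma, \delta < 1$. Now replace each of these four partial derivatives by its value at $(x, y)$ plus the corresponding difference, and use (CR) at $(x, y)$, that is, $\frac{\partial P}{\partial y} = -\frac{\partial Q}{\partial x}$ and $\frac{\partial Q}{\partial y} = \frac{\partial P}{\partial x}$. Thus,
$$u = \frac{1}{h}\left(\frac{\partial P(x, y)}{\partial x}(h_1 + ih_2) + i \frac{\partial Q(x, y)}{\partial x}(h_1 + ih_2) + d_1 h_1 + d_2 h_2\right) = \frac{\partial P(x, y)}{\partial x} + i \frac{\partial Q(x, y)}{\partial x} + d_1 \frac{h_1}{h} + d_2 \frac{h_2}{h},$$
where
$$d_1 = \frac{\partial P(x + \beta h_1, y)}{\partial x} - \frac{\partial P(x, y)}{\partial x} + i\left(\frac{\partial Q(x + \delta h_1, y)}{\partial x} - \frac{\partial Q(x, y)}{\partial x}\right),$$
$$d_2 = \frac{\partial P(x + h_1, y + \alpha h_2)}{\partial y} - \frac{\partial P(x, y)}{\partial y} + i\left(\frac{\partial Q(x + h_1, y + \gamma h_2)}{\partial y} - \frac{\partial Q(x, y)}{\partial y}\right).$$
Since the partial derivatives are continuous, the differences $d_1, d_2$ tend to 0 with $h \to 0$, and since $\left|\frac{h_i}{h}\right| \leq 1$, the statement follows. □
1.6 Holomorphic functions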
A complex function $f: U \to \mathbb{C}$ on an open set $U \subseteq \mathbb{C}$ with continuous partial derivatives which satisfies the Cauchy-Riemann conditions is said to be holomorphic. It can be shown that a complex function is holomorphic on $U$ if and only if it has a complex derivative on $U$. (By what we already proved, sufficiency is the non-trivial part.) This is the famous theorem of Goursat which can be found, for example, in [1]. From the chain rule, it is again immediate that for holomorphic functions $f, g$ in an open set $U$, the functions $f + g$, $f - g$, $f \cdot g$ are holomorphic, as is $f / g$ provided that $g$ is non-zero at all points of $U$.
1.7 Recall the complex line integral from Section 4 above. Later we will need the following fact. It is an easy consequence of 3.7 and 4.4 of Chapter 8, but we shall spell things out, mainly to exercise the Cauchy-Riemann conditions.
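Theorem. Let $f(\zeta, z)$ be a continuous complex function of two variables which is holomorphic in $\zeta$ in some open set $U \subseteq \mathbb{C}$. Then the complex line integral $\int_L$ satisfies
$$\frac{d}{d\zeta} \int_L f(\zeta, z)\,dz = \int_L \frac{\partial f(\zeta, z)}{\partial \zeta}\,dz.$$
Proof. Set $F(\zeta) = \int_L f(\zeta, z)\,dz$ and write
$$f(\zeta, z) = P(\alpha, \beta, x, y) + iQ(\alpha, \beta, x, y)$$
where $\zeta = \alpha + i\beta$. From 4.4 of Chapter 8, we see that $F(\zeta) = \mathcal{P}(\alpha, \beta) + i\mathcal{Q}(\alpha, \beta)$ where
$$\mathcal{P}(\alpha, \beta) = (\mathrm{II})\int_L (P(\alpha, \beta, x, y), -Q(\alpha, \beta, x, y)) = \int_L (P(\alpha, \beta, x, y)\,dx - Q(\alpha, \beta, x, y)\,dy),$$
$$\mathcal{Q}(\alpha, \beta) = (\mathrm{II})\int_L (Q(\alpha, \beta, x, y), P(\alpha, \beta, x, y)) = \int_L (Q(\alpha, \beta, x, y)\,dx + P(\alpha, \beta, x, y)\,dy).$$
Since $f$ is holomorphic in $\zeta$, we have
$$\frac{\partial P}{\partial \alpha} = \frac{\partial Q}{\partial \beta} \qquad \text{and} \qquad \frac{\partial P}{\partial \beta} = -\frac{\partial Q}{\partial \alpha}$$
so that by 3.7 of Chapter 8,
$$\frac{\partial \mathcal{P}}{\partial \alpha} = (\mathrm{II})\int_L \left(\frac{\partial P}{\partial \alpha}, -\frac{\partial Q}{\partial \alpha}\right) = (\mathrm{II})\int_L \left(\frac{\partial Q}{\partial \beta}, \frac{\partial P}{\partial \beta}\right) = \frac{\partial \mathcal{Q}}{\partial \beta},$$
$$\frac{\partial \mathcal{P}}{\partial \beta} = (\mathrm{II})\int_L \left(\frac{\partial P}{\partial \beta}, -\frac{\partial Q}{\partial \beta}\right) = (\mathrm{II})\int_L \left(-\frac{\partial Q}{\partial \alpha}, -\frac{\partial P}{\partial \alpha}\right) = -\frac{\partial \mathcal{Q}}{\partial \alpha}, \tag{1.7.1}$$
so that $F(\zeta)$ is holomorphic and hence has a derivative. By 1.4,
$$\frac{\partial f}{\partial \zeta} = \frac{\partial P}{\partial \alpha} + i \frac{\partial Q}{\partial \alpha}$$
and hence by (1.7.1) and 1.4 again,
$$\int_L \frac{\partial f(\zeta, z)}{\partial \zeta}\,dz = (\mathrm{II})\int_L \left(\frac{\partial P}{\partial \alpha}, -\frac{\partial Q}{\partial \alpha}\right) + i\,(\mathrm{II})\int_L \left(\frac{\partial Q}{\partial \alpha}, \frac{\partial P}{\partial \alpha}\right) = \frac{\partial \mathcal{P}}{\partial \alpha} + i \frac{\partial \mathcal{Q}}{\partial \alpha} = \frac{dF}{d\zeta}.$$
□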
2 From the complex line integral to primitive functions
2.1 Theorem. Let $U$ be a domain in $\mathbb{C}$. Let $L_1, \dots, L_k$ be simple piecewise continuously differentiable closed curves with disjoint images such that $L_1 \sqcup \dots \sqcup L_k$ is the boundary of $U$ oriented counter-clockwise (see 5.2 of Chapter 8). Let $f$ be a holomorphic function defined on an open set $V$ containing $\overline{U}$. Then the complex line integral of $f$ satisfies
$$\int_{L_1} f(z)\,dz + \dots + \int_{L_k} f(z)\,dz = 0.$$
Proof. Put again $f(z) = P(x, y) + iQ(x, y)$. By 4.4 of Chapter 8, we have
$$\int_{L_i} f = (\mathrm{II})\int_{L_i} (P, -Q) + i\,(\mathrm{II})\int_{L_i} (Q, P)$$
and by Green's formula (5.4.1) of Chapter 8, the sum of these terms over $i = 1, \dots, k$ is equal to
$$\int_U \left(-\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) + i \int_U \left(\frac{\partial P}{\partial x} - \frac{\partial Q}{\partial y}\right).$$
By the Cauchy-Riemann conditions, both the summands are zero. □
2.2 Consider two oriented simple arcs $P_1, P_2$ expressed by parametrizations $\phi_i: \langle \alpha_i, \beta_i \rangle \to \mathbb{C}$ such that $\phi_1(\alpha_1) = \phi_2(\alpha_2)$ and $\phi_1(\beta_1) = \phi_2(\beta_2)$ and $\phi_1(x) \neq \phi_2(y)$ unless $x = \alpha_1$ and $y = \alpha_2$ or $x = \beta_1$ and $y = \beta_2$. Then $L = -P_2 + P_1$ is a piecewise continuously differentiable simple closed curve. If $L$ is the boundary of a domain $U$ and $f$ is holomorphic on an open subset of $\mathbb{C}$ containing $\overline{U}$, then by 2.1,
$$\int_{P_1} f = \int_{P_2} f.$$
2.3 Let $f$ be holomorphic in a convex open set $U \subseteq \mathbb{C}$. For $a, b \in U$ define
$$\int_a^b f(z)\,dz = \int_{L(a,b)} f(z)\,dz$$
where $L(a, b)$ is parametrized by $\phi: \langle 0, 1 \rangle \to \mathbb{C}$, $\phi(t) = a + t(b - a)$.
Fix $a \in U$ and write for $u \in U$,
$$F(u) = \int_a^u f(z)\,dz.$$
Theorem. We have $F'(z) = f(z)$.
Proof. We claim that
$$\int_{L(a,u+h)} f(z)\,dz = \int_{L(a,u)} f(z)\,dz + \int_{L(u,u+h)} f(z)\,dz. \tag{2.3.1}$$
In effect, this is trivial when the points $a$, $u$ and $u + h$ are collinear. Otherwise the piecewise continuously differentiable simple curves $P_1 = L(a, u + h)$ and $P_2 = L(a, u) + L(u, u + h)$, $h \in \langle 0, 1 \rangle$, satisfy the assumptions of 2.2 and hence (2.3.1) follows from 3.4 and 4.4 of Chapter 8. Now, by (2.3.1),
$$\frac{1}{h}(F(u + h) - F(u)) = \frac{1}{h} \int_0^1 f(u + th)\,h\,dt = \int_0^1 f(u + th)\,dt = \int_0^1 P(u + th)\,dt + i \int_0^1 Q(u + th)\,dt$$
which with real $h \to 0$ approaches $P(u) + iQ(u)$, by the Mean Value Theorem. □
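The construction of $F$ can be imitated numerically. The sketch below is an illustration under our own assumptions: the segment integral is approximated by a midpoint rule, the function $f(z) = z^2$, the base point $a = 0$ and all names are hypothetical choices, and the tolerances one can expect depend on the number of steps.

```python
def segment_integral(f, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of f over L(a, b),
    parametrized by phi(t) = a + t (b - a), 0 <= t <= 1, so phi'(t) = b - a."""
    total = 0j
    for j in range(steps):
        t = (j + 0.5) / steps
        total += f(a + t * (b - a))
    return total * (b - a) / steps


def f(z):
    return z * z


a = 0j   # hypothetical base point in the convex set U = C


def F(u):
    """Numerical stand-in for F(u), the integral of f from a to u."""
    return segment_integral(f, a, u)


u = 1.0 + 2.0j
h = 1e-4 * (1.0 + 1.0j)

# Formula (2.3.1): going from a to u + h directly, or via u, gives the same value.
additivity_gap = abs(F(u + h) - (F(u) + segment_integral(f, u, u + h)))

# The difference quotient of F approximates f(u).
quotient = (F(u + h) - F(u)) / h
```

Here `F(u)` should be close to $u^3 / 3$, the gap in (2.3.1) should be negligible, and `quotient` should agree with $f(u) = u^2$ up to an error of order $|h|$.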
2.4
Comment
By analogy with the theory of real functions, we call F a primitive function of f if F 0 D f . It is easy to observe that the difference between two primitive functions on an open set is locally constant, i.e. constant on each connected component. Indeed, by 1.4, we can reduce this to the fact that a real function with partial derivatives equal to 0 on an open set is locally constant. In particular, on a convex open set U , any two primitive functions differ by a constant.
2.5 It is curious to observe that the proof of Theorem 2.3 can be "transported" (with only minor modifications) by a (real) injective regular map. More precisely, identifying $\mathbb{C}$ with $\mathbb{R}^2$, let $\phi: U \to V$ be a bijective regular map in the sense of Subsection 7.1 of Chapter 3. Then the proof of Theorem 2.3 remains valid with the line segments $L(a, b)$ replaced by their $\phi$-images. We obtain therefore the following
Proposition. If $V$ is an open set in $\mathbb{C}$ such that there exists a bijective (real) regular map $\phi: U \to V$ where $U$ is convex, then every holomorphic function on $V$ has a primitive function.
As it turns out, the converse is also true. In fact, in Section 1 of Chapter 13, we shall prove much more, namely that unless $U = \mathbb{C}$, the map $\phi$ can be chosen to be holomorphic. This is the famous Riemann Mapping Theorem.
3
Cauchy’s formula
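3.1 Lemma. Let $K_r$ be a circle with center in a point $z$ and radius $r > 0$, oriented counter-clockwise. Then we have
$$\int_{K_r} \frac{d\zeta}{\zeta - z} = 2\pi i.$$
Proof. Parametrize $K_r$ by $\phi: \langle 0, 2\pi \rangle \to \mathbb{C}$, $\phi(t) = z + r(\cos(t) + i \sin(t))$. Then we have $\phi'(t) = r(-\sin(t) + i \cos(t))$, and hence
$$\int_{K_r} \frac{d\zeta}{\zeta - z} = \int_0^{2\pi} \frac{r(-\sin(t) + i \cos(t))}{r(\cos(t) + i \sin(t))}\,dt = \int_0^{2\pi} i\,dt = 2\pi i.$$
□
3.2 Notice that the integral computed in 3.1 is not required to vanish by Theorem 2.1 because the integrand is not defined (and in fact, goes to infinity) at $\zeta = z$.
3.3 Theorem. (Cauchy's formula) Let $f$ be holomorphic in an open disk $\Omega(z, R)$ with $R > r > 0$. Then we have
$$\frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta = f(z).$$
Proof. We have
$$\int_{K_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \int_{K_r} \frac{f(z)}{\zeta - z}\,d\zeta + \int_{K_r} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta.$$
The first summand on the right-hand side is equal to $2\pi i f(z)$ by 3.1. We shall prove that the second summand is 0. Since
$$f'(z) = \lim_{\zeta \to z} \frac{f(\zeta) - f(z)}{\zeta - z},$$
the quantity $\frac{f(\zeta) - f(z)}{\zeta - z}$ is bounded on the set $U \setminus \{z\}$ for some open neighborhood $U$ of $z$ (and hence, by continuity, on $\overline{\Omega}(z, r) \setminus \{z\}$). Let
$$\left|\frac{f(\zeta) - f(z)}{\zeta - z}\right| < A \quad \text{in } \overline{\Omega}(z, r) \setminus \{z\}.$$
By Lemma 4.5 of Chapter 8, for $0 < s < r$, we have
$$\left|\int_{K_s} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta\right| \leq 4A \cdot 2\pi s = 8A\pi s.$$
In particular,
$$\lim_{s \to 0} \int_{K_s} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta = 0. \tag{*}$$
Now we will apply 2.1 to $U = \Omega(z, r) \setminus \overline{\Omega}(z, s)$, with $k = 2$, $L_1 = K_r$, $L_2 = -K_s$ (that is, $K_s$ with the reversed orientation). By (*), we have
$$\int_{K_r} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta = \lim_{s \to 0} \left(\int_{K_r} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta - \int_{K_s} \frac{f(\zeta) - f(z)}{\zeta - z}\,d\zeta\right) = \text{(by 2.1)} = \lim_{s \to 0} 0 = 0.$$
□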
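Both Lemma 3.1 and Cauchy's formula lend themselves to a quick numerical check using the parametrization $\phi(t) = z + r(\cos(t) + i \sin(t))$ from the proof of 3.1. The following is a sketch under our own assumptions: the integral is replaced by a Riemann sum with equally spaced parameter values, and the point `z`, the radius and the test function $e^\zeta$ are hypothetical choices.

```python
import cmath
import math


def circle_integral(g, center, radius, steps=2000):
    """Riemann-sum approximation of the line integral of g over the circle
    phi(t) = center + radius * (cos t + i sin t), oriented counter-clockwise."""
    total = 0j
    dt = 2.0 * math.pi / steps
    for j in range(steps):
        direction = cmath.exp(1j * j * dt)       # cos(t) + i sin(t)
        zeta = center + radius * direction       # phi(t)
        total += g(zeta) * (1j * radius * direction) * dt   # g(phi(t)) phi'(t) dt
    return total


z = 0.3 + 0.2j   # hypothetical center
r = 1.0          # hypothetical radius

# Lemma 3.1: the integral of 1 / (zeta - z) over K_r.
lemma_value = circle_integral(lambda zeta: 1.0 / (zeta - z), z, r)

# Theorem 3.3 for f = exp: (1 / 2 pi i) times the integral of f(zeta) / (zeta - z).
cauchy_value = circle_integral(lambda zeta: cmath.exp(zeta) / (zeta - z), z, r) / (2j * math.pi)
```

The first value should agree with $2\pi i$ and the second with $e^z$, both up to rounding error.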
3.4 Theorem. A holomorphic complex function on an open set $U$ has complex derivatives of all orders on $U$.
Proof. By 1.7, we may differentiate the integrand in Cauchy's formula repeatedly by $z$, giving
$$f^{(k)}(z) = \frac{k!}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - z)^{k+1}}\,d\zeta. \tag{3.4.1}$$
□
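Formula (3.4.1) gives a way to approximate derivatives of a holomorphic function from its values on a circle. The sketch below assumes, as an illustration only, a Riemann sum over equally spaced points of $K_r$; the function name, the default radius and the test functions are our own hypothetical choices.

```python
import cmath
import math


def derivative_by_cauchy(f, z, k, radius=1.0, steps=2000):
    """Approximate f^(k)(z) by formula (3.4.1), with the integral over K_r
    replaced by a Riemann sum. Assumes f is holomorphic on a disk around z
    of radius larger than `radius`."""
    total = 0j
    dt = 2.0 * math.pi / steps
    for j in range(steps):
        offset = radius * cmath.exp(1j * j * dt)   # zeta - z for zeta on K_r
        # f(zeta) / (zeta - z)^(k+1) times phi'(t) dt, where phi'(t) = i (zeta - z)
        total += f(z + offset) / offset ** (k + 1) * (1j * offset) * dt
    return math.factorial(k) * total / (2j * math.pi)


z = 0.5 + 0.0j
third_derivative_of_sin = derivative_by_cauchy(cmath.sin, z, 3)
second_derivative_of_quartic = derivative_by_cauchy(lambda w: w ** 4, 1.0 + 0.0j, 2)
```

The third derivative of $\sin$ at $0.5$ should be close to $-\cos(0.5)$, and the second derivative of $z^4$ at $1$ close to $12$.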
3.5 Corollary. A continuous complex function $f$ on a convex open set in $\mathbb{C}$ is holomorphic if and only if it has a primitive function.
Proof. If $f$ is holomorphic then it has a primitive function $F$ by Theorem 2.3. If $f$ has a primitive function $F$ then $F$ is holomorphic since $f$ is continuous. Now apply Theorem 3.4 to the function $F$. □
We also get the following
3.6 Theorem. (Weierstrass's Theorem) Suppose that $f_n$ is a sequence of holomorphic functions defined on an open set $U \subseteq \mathbb{C}$ which converge to a function $f(z)$ uniformly on every compact subset of $U$. Then $f$ is a holomorphic function on $U$. Furthermore, $f_n'$ converge to $f'$ uniformly on every compact subset of $U$.
Proof. Using Cauchy's formula (Theorem 3.3) with $f$ replaced by $f_n$, and taking the limit after the integral sign using Lebesgue's Dominated Convergence Theorem implies the same formula for $f$, proving that $f$ is holomorphic. Further, using the same argument on formula (3.4.1) ($k = 1$), we see that $f_n'$ converges to $f'$, and further that the convergence is uniform in a disk with center $z$ and radius $r/2$. A compact set is covered by finitely many such disks by the Heine-Borel Theorem 2.3 of Chapter 9, which implies that the convergence of derivatives is uniform on a compact set. □
The following result will be useful for applying the Arzelà-Ascoli Theorem 6.2 to sequences of analytic functions.
3.7 Theorem. A sequence $(f_n)_n$ of holomorphic functions defined on an open set $U \subseteq \mathbb{C}$ which is uniformly bounded on every compact subset $K \subseteq U$ is equicontinuous on every compact subset $K \subseteq U$.
Proof. Let $z_0 \in U$, and assume $\overline{\Omega}(z_0, r) \subseteq U$. Let $M$ be the boundary of $\Omega(z_0, r)$, oriented counterclockwise. For a holomorphic function $f$ on $U$ and $z \in \Omega(z_0, r)$, we get
$$f(z) - f(z_0) = \frac{1}{2\pi i} \int_M \left(\frac{1}{\zeta - z} - \frac{1}{\zeta - z_0}\right) f(\zeta)\,d\zeta = \frac{z - z_0}{2\pi i} \int_M \frac{f(\zeta)\,d\zeta}{(\zeta - z)(\zeta - z_0)}. \tag{*}$$
If $|f(t)| < C$ for all $t \in M$, and if $z \in \Omega(z_0, r/2)$, then the absolute value of the right-hand side of (*) is less than or equal to
$$\frac{4C |z - z_0|}{r}. \tag{C}$$
Now let $K \subseteq U$ be a compact subset. We claim that there exists an $r > 0$ and a compact set $L$, $K \subseteq L \subseteq U$ such that every point of distance $\leq r$ from some point of $K$ belongs to $L$. (For every point $x \in K$, there is a number $s(x) > 0$ such that $\Omega(x, s(x)) \subseteq U$. By the Heine-Borel Theorem 2.3 of Chapter 9, $K$ is covered by finitely many of the open disks $\Omega(x_i, s(x_i)/3)$, for some points $x_i$, $i = 1, \dots, k$. Let $s = \min\{s(x_i) \mid i = 1, \dots, k\}$. Then we may put $r = s/3$, $L = \bigcup_{i=1}^{k} \overline{\Omega}(x_i, 2s(x_i)/3)$.)
Now let $C$ be a uniform bound on $|f_n(z)|$ for $z \in L$. Then in (C) we may always use these values of $C$ and $r$. We see that then at least for $z, t \in K$, $|z - t| < r/2$,
$$|f_n(z) - f_n(t)| < \frac{4C |z - t|}{r},$$
which implies equicontinuity on $K$. □
Note that in the preceding proof, we have proved more than equicontinuity, namely a uniform Lipschitz constant.
3.8
Remarks
1. Note that the statements 3.4 and 3.5 are in sharp contrast with real analysis.
2. We will see that Cauchy's formula in complex analysis plays an analogous role to the Mean Value Theorem in real analysis. It is, however, a much stronger tool, which makes certain concepts (such as the Taylor series) much easier.
3. Note the role of the integrand $\frac{f(\zeta)}{\zeta - z}$ going to infinity at the point $z$. Note that all the information about the integral in 3.3 is contained in an arbitrarily small neighborhood of $z$.
4. By the same argument, the circle $K_r$ could be replaced by any closed simple curve $L$ which is the boundary of a domain $U$ oriented counter-clockwise and such that $z \in U$.
4
Taylor’s formula, power series, and a uniqueness theorem
4.1 Theorem. (Taylor's formula) Let $f$ be holomorphic in a neighborhood of a point $c \in \mathbb{C}$. Then, in a sufficiently small neighborhood of $c$, we have
$$f(z) = f(c) + \frac{1}{1!} f'(c)(z - c) + \frac{1}{2!} f''(c)(z - c)^2 + \dots + \frac{1}{n!} f^{(n)}(c)(z - c)^n + \dots.$$
Proof. We have
$$\frac{1}{\zeta - z} = \frac{1}{\zeta - c} \cdot \frac{1}{1 - \frac{z - c}{\zeta - c}}. \tag{*}$$
Consider a circle $K_r$ with center $c$ and radius $r$ such that $f$ is holomorphic in $\Omega(c, R)$ for some $R > r$. Let $z$ be such that $|z - c| < r$, so that
$$\left|\frac{z - c}{\zeta - c}\right| < 1$$
for a point $\zeta$ of the circle $K_r$. From (*), we obtain
$$\frac{1}{\zeta - z} = \frac{1}{\zeta - c}\left(1 + \frac{z - c}{\zeta - c} + \left(\frac{z - c}{\zeta - c}\right)^2 + \dots + \left(\frac{z - c}{\zeta - c}\right)^n + \dots\right)$$
$$= \frac{1}{\zeta - c} + (z - c)\frac{1}{(\zeta - c)^2} + (z - c)^2 \frac{1}{(\zeta - c)^3} + \dots + (z - c)^n \frac{1}{(\zeta - c)^{n+1}} + \dots.$$
Thus, from Cauchy's formula and Lebesgue's Dominated Convergence Theorem (note that we are dealing with continuous functions and therefore all partial sums have a uniform constant bound), we get:
$$f(z) = \frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{\zeta - c}\,d\zeta + (z - c)\frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - c)^2}\,d\zeta + \dots + (z - c)^n \frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - c)^{n+1}}\,d\zeta + \dots.$$
By the formula in the proof of Theorem 3.4, we have
$$\frac{1}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - c)^{n+1}}\,d\zeta = \frac{1}{n!} f^{(n)}(c).$$
□
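The proof expresses each Taylor coefficient $\frac{1}{n!} f^{(n)}(c)$ as a Cauchy integral over $K_r$, and this can be turned into a small numerical experiment. The sketch below is only illustrative: the Riemann-sum discretization, the function names, and the choices $f = \exp$, $c = 0.2 + 0.1i$, $r = 1$ are our own assumptions.

```python
import cmath
import math


def taylor_coefficients(f, c, count, radius, steps=2000):
    """Approximate f^(n)(c) / n! for n = 0, ..., count - 1 by Riemann sums for
    (1 / 2 pi i) * integral over K_r of f(zeta) / (zeta - c)^(n+1) d zeta."""
    dt = 2.0 * math.pi / steps
    samples = []
    for j in range(steps):
        offset = radius * cmath.exp(1j * j * dt)   # zeta - c for zeta on K_r
        samples.append((offset, f(c + offset)))
    coefficients = []
    for n in range(count):
        # d zeta = i (zeta - c) dt, so the integrand reduces to f(zeta) / (zeta - c)^n.
        total = sum(value / offset ** n for offset, value in samples)
        coefficients.append(total / steps)
    return coefficients


def taylor_partial_sum(coefficients, c, z):
    """Evaluate the partial sum of a_n (z - c)^n over the given coefficients."""
    return sum(a * (z - c) ** n for n, a in enumerate(coefficients))


c = 0.2 + 0.1j
coefficients = taylor_coefficients(cmath.exp, c, count=15, radius=1.0)
z = c + (0.5 + 0.5j)   # a point with |z - c| < r
approximation = taylor_partial_sum(coefficients, c, z)
```

For the exponential the computed coefficients should be close to $e^c / n!$, and the partial sum at `z` close to $e^z$. For small radii and large $n$ the division by $(\zeta - c)^n$ amplifies rounding errors, so the sketch should not be pushed to many coefficients.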
4.2 Note that repeating verbatim the proofs in Section 7 of Chapter 1, we get the following
Proposition. A (complex) power series
$$\sum_{k=0}^{\infty} a_k (z - c)^k \tag{*}$$
converges absolutely and uniformly in a circle with center $c$ and any radius
$$s < r = \liminf \frac{1}{\sqrt[n]{|a_n|}}$$
and diverges outside of the closed circle with center $c$ and radius $r$. (The number $r$ is called the radius of convergence of the power series (*).) Moreover, the power series
$$\sum_{k=1}^{\infty} k a_k (z - c)^{k-1}$$
has the same radius of convergence as (*), and the series (*) may be differentiated term by term.
4.3 The power series
$$e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \dots,$$
$$\sin(z) = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots,$$
$$\cos(z) = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \dots$$
will now be considered the definitions of the functions $e^z$, $\sin(z)$, $\cos(z)$ for $z$ complex (the radius of convergence of these series is $\infty$). Therefore, we have
$$e^{iz} = \cos(z) + i \sin(z), \qquad e^{-iz} = \cos(z) - i \sin(z),$$
and also
$$\cos(z) = \frac{e^{iz} + e^{-iz}}{2}, \qquad \sin(z) = \frac{e^{iz} - e^{-iz}}{2i}.$$
4.4 A uniqueness theorem
Lemma. Let $f$, $g$ be holomorphic in an open set $U$, and let $c \in U$, $c = \lim c_n$, $c_n \neq c$. Suppose $f(c_n) = g(c_n)$ for all $n$. Then $f = g$ in some neighborhood of $c$.
Proof. It suffices to prove that if $f(c_n) = 0$ for all $n$, then $f \equiv 0$ in some neighborhood of $c$. By Taylor's formula, we have
$$f(z) = \sum_{k=0}^{\infty} a_k (z - c)^k$$
for some constants $a_k$. It suffices to prove that
$$a_k = 0 \text{ for all } k. \tag{*}$$
Assuming (*) does not hold, let $n$ be the smallest number such that $a_n \neq 0$. Then in some neighborhood of $c$,
$$f(z) = (z - c)^n (a_n + a_{n+1}(z - c) + a_{n+2}(z - c)^2 + \dots).$$
The function in the parentheses on the right-hand side is continuous (it is a uniform limit of continuous functions), and not zero at $c$; thus, it is non-zero in some neighborhood of $c$, and so is $(z - c)^n$ for $z \neq c$. Hence $f(z) \neq 0$ for all $z \neq c$ in that neighborhood, contradicting our assumptions. □
Theorem. Assume $f, g$ are holomorphic on a connected open set $U$, and let $c \in U$, $c = \lim c_n$, $c \neq c_n$, and $f(c_n) = g(c_n)$ for all $n$. Then $f \equiv g$ on $U$.
Proof. Let
$$M = \{ z \in U \mid f(u) = g(u) \text{ for all } u \text{ in some neighborhood of } z \}.$$
$M$ is clearly open, and by the lemma, it is also closed and non-empty. Since $U$ is connected, we have $M = U$. □
4.5 The algebra of power series
Note that on two power series of the form 4.2.(*) with the same $c$, we can perform addition, subtraction and multiplication (in the case of multiplication, note that only finitely many terms with the same power $(z - c)^k$ are added). As an inverse operation to this purely algebraic multiplication, note that it is also possible to divide by any power series 4.2.(*) with $a_0 \neq 0$, figuring the coefficients of the ratio by a recursive procedure. It will be important for us that when these purely algebraic operations are performed on power series with a positive radius of convergence representing Taylor series at $c$ of holomorphic functions $f$, $g$, the power series resulting from an algebraic operation converges and is the Taylor series of $f + g$, $f - g$, $f \cdot g$ or $f / g$ (the division requires $g(c) \neq 0$). All of these statements are more or less obvious with the exception of the division. Here we note that since $g(c) \neq 0$, we have $g(z) \neq 0$ in some disk $\Omega(c, r)$, $r > 0$. Therefore, $f / g$ is a holomorphic function in a disk with center $c$, and hence has a Taylor expansion at $c$. Multiplying this Taylor expansion with the Taylor expansion of $g$ at $c$ algebraically, we then get the Taylor expansion of $f$ at $c$ by uniqueness. This implies that the Taylor expansion of $f / g$ at $c$ is the algebraic ratio of the Taylor expansions of $f$ and $g$ at $c$.
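The recursive procedure for division mentioned above can be made explicit. If $q = a / b$ with $b_0 \neq 0$, comparing coefficients in $a = q \cdot b$ gives $a_n = \sum_{k=0}^{n} q_k b_{n-k}$, and hence $q_n = (a_n - \sum_{k < n} q_k b_{n-k}) / b_0$. The following is a minimal sketch of this recursion; the function names are our own, and exact rational arithmetic from Python's `fractions` module is an assumption made to avoid rounding.

```python
from fractions import Fraction


def multiply_series(a, b, terms):
    """First `terms` coefficients of the product of two power series."""
    product = [Fraction(0)] * terms
    for i in range(min(len(a), terms)):
        for j in range(min(len(b), terms - i)):
            product[i + j] += a[i] * b[j]
    return product


def divide_series(a, b, terms):
    """First `terms` coefficients of a / b, assuming b[0] != 0.
    Uses q_n = (a_n - sum_{k < n} q_k * b_{n-k}) / b_0."""
    if b[0] == 0:
        raise ValueError("division requires a non-zero constant term")
    quotient = []
    for n in range(terms):
        remainder = Fraction(a[n]) if n < len(a) else Fraction(0)
        for k in range(n):
            if n - k < len(b):
                remainder -= quotient[k] * b[n - k]
        quotient.append(remainder / b[0])
    return quotient


# Truncated Taylor coefficients of sin and cos at c = 0.
sin_series = [0, 1, 0, Fraction(-1, 6), 0, Fraction(1, 120)]
cos_series = [1, 0, Fraction(-1, 2), 0, Fraction(1, 24), 0]

tan_series = divide_series(sin_series, cos_series, 6)
geometric = divide_series([1], [1, -1], 6)
```

Dividing the series of $\sin$ by that of $\cos$ yields the beginning of the Taylor series of $\tan$, namely $z + z^3/3 + 2z^5/15$, and dividing $1$ by $1 - z$ yields the geometric series. Multiplying the quotient back by the divisor recovers the dividend up to the truncation order.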
5
Applications: Liouville’s Theorem, the Fundamental Theorem of Algebra and a remark on conformal maps
5.1 Theorem. (Liouville's Theorem) Suppose $f$ is holomorphic and bounded in all of $\mathbb{C}$. Then $f$ is constant.
Proof. By the formula (3.4.1) with $k = 1$, for any circle $K_r$ with center $z$ and radius $r$ we have
$$f'(z) = \frac{1!}{2\pi i} \int_{K_r} \frac{f(\zeta)}{(\zeta - z)^2}\,d\zeta.$$
Suppose $|f(z)| \leq A$ for all $z \in \mathbb{C}$. For a point $\zeta$ on $K_r$, we then have
$$\left|\frac{f(\zeta)}{(\zeta - z)^2}\right| \leq \frac{A}{r^2},$$
and by 4.5 of Chapter 8, we have
$$|f'(z)| \leq 4 \cdot \frac{1!}{2\pi} \cdot \frac{A}{r^2} \cdot 2\pi r = \frac{4A}{r}.$$
Since $r > 0$ was arbitrary, we must have $f'(z) \equiv 0$ and hence $f$ must be constant. □
5.2 Theorem. (The Fundamental Theorem of Algebra) Every non-constant polynomial has at least one root in $\mathbb{C}$.
Proof. Suppose a polynomial
$$p(z) = z^n + a_{n-1} z^{n-1} + \dots + a_1 z + a_0, \quad n \geq 1$$
has no root in $\mathbb{C}$. Then the function
$$f(z) = \frac{1}{p(z)}$$
is defined and holomorphic on all of $\mathbb{C}$. Let $R = 2n \max(|a_0|, \dots, |a_n|)$ (where $a_n = 1$). For $|z| \geq R$, we then have
$$|p(z)| \geq |z|^n - |a_{n-1} z^{n-1} + \dots + a_1 z + a_0| \geq |z|^n - |z|^{n-1} \frac{R}{2} = |z|^{n-1}\left(|z| - \frac{R}{2}\right) \geq \frac{R^n}{2}$$
and hence
$$|f(z)| \leq \frac{2}{R^n}.$$
On the other hand, on $\{ z \mid |z| \leq R \}$, $f$ is bounded because it is continuous. Thus, $f$ is bounded on all of $\mathbb{C}$, and by Liouville's Theorem, it is constant, and hence so is $p(z)$. This is a contradiction since we assumed $n \geq 1$. □
5.3 A conformal map is a regular map $f: U \to \mathbb{R}^n$ defined on an open set $U \subseteq \mathbb{R}^n$ such that for any two non-zero vectors $u, v \in \mathbb{R}^n$, and every point $z \in U$, the angle between the vectors $Df_z(u)$, $Df_z(v)$ is the same as the angle between $u$ and $v$. (Recall that the angle $0 \leq \alpha \leq \pi$ between non-zero vectors $u$, $v$ is defined by $\cos(\alpha) = u \cdot v / (\|u\| \cdot \|v\|)$.) Note that for $n = 1$, any regular map is conformal. For $n > 2$, it can be shown that every conformal map is locally a constant multiple of an isometry, a fact which we will not show here. However, for $n = 2$ we have the following result. Identify, again, $\mathbb{C}$ with $\mathbb{R}^2$ by $x + iy \mapsto (x, y)$, and drop the bold-faced letters.
Theorem. For $n = 2$, a regular map $f$ as in 1.1 is conformal if and only if on each connected component of $U$, $f$ is either holomorphic or the complex conjugate of a holomorphic function (such a function is often called antiholomorphic).
Proof: This is really a statement entirely about the $\mathbb{R}$-linear map $Df_z$ for each $z \in U$ (see Exercise (1)), which is a consequence of the following
Lemma. A regular $\mathbb{R}$-linear map $A: \mathbb{C} \to \mathbb{C}$ preserves angles between pairs of non-zero vectors if and only if $A$ is given either by the formula $Az = \lambda z$ or $Az = \lambda \bar{z}$ for some $\lambda \neq 0 \in \mathbb{C}$.
The proof of the lemma goes as follows: Representing $A$ by a $2 \times 2$ matrix using the basis $1$, $i$, by our assumption, the columns of $A$ must be non-zero and orthogonal. Since multiplication by a non-zero complex number preserves angles by the geometric interpretation of complex numbers, we may assume (by composing, if necessary, $A$ with multiplication by a suitable non-zero complex number) that the first column of $A$ is $(1, 0)^T$. By orthogonality, the second column is then $(0, a)^T$ for some non-zero (real) $a$. However, the requirement that $A(1, 1)^T$, $A(1, -1)^T$ be orthogonal gives $a^2 = 1$. If $a = 1$, $Az = z$ and if $a = -1$, $Az = \bar{z}$. □
6
Laurent series, isolated singularities and the Residue Theorem
6.1
Laurent series
Let f be a holomorphic function defined on an annulus R1 < jz cj < R2 for some a 2 C. Let Lr be a circle with center c and radius r oriented counterclockwise. Define Z f ./ 1 d f1 .z/ D 2 i Lr z for some jz cj < r < R2 and f2 .z/ D
1 2 i
Z Ls
f ./ d z
for some R1 < s < jz cj. The exact choice of r or s does not change the value by Theorem 2.1. Furthermore, we have f .z/ D f1 .z/ C f2 .z/ for R1 < jz cj < R2 .
(6.1.1)
6 Laurent series, isolated singularities and the Residue Theorem
255
To see this, consider a circle $K$ with center $z$ and a small radius oriented counterclockwise, and apply Theorem 2.1 to the function $\frac{f(\zeta)}{\zeta - z}$ of the variable $\zeta$ with simple closed curves $L_r$, $L_s$, $K$, along with Cauchy's formula (Theorem 3.3). By differentiating under the integral sign (Theorem 1.7), and the fact that the value does not depend on $r$, we see that the function $f_1(z)$ is holomorphic in the disk $|z - c| < R_2$, and hence has a Taylor expansion. In case of the function $f_2(z)$, it is convenient to perform the substitution
$$\omega = \frac{1}{\zeta - c}, \quad \zeta = c + \frac{1}{\omega}, \quad d\zeta = -\frac{1}{\omega^2}\,d\omega,$$
and similarly
$$t = \frac{1}{z - c}, \quad z = c + \frac{1}{t},$$
so that
$$\frac{1}{\zeta - z} = -\frac{\omega t}{\omega - t}.$$
For the function $g(t) = f_2(z(t))$, this gives
$$g(t) = \frac{t}{2\pi i}\int_M \frac{f(c + 1/\omega)}{\omega(\omega - t)}\,d\omega$$
where $M$ is the circle with center $0$ and radius $1/s > |t|$ oriented counterclockwise (note that the substitution reverses orientation, so we have a total of 4 minus signs, which result in a plus). Again by differentiating under the integral sign, we see that $g(t)/t$ is a holomorphic function in the disk $|t| < 1/R_1$, and hence has a Taylor expansion. (Note: when performing the substitution, we implicitly used the fact that in complex line integrals, we may treat differentials the same way as in ordinary single-variable integral substitution - see Exercise (11) below.) Writing the Taylor series of $g(t)$ in the variable $(z - c)$, we obtain an expansion of the form
$$f_2(z) = \sum_{n<0} a_n (z - c)^n,$$
which leads to the following result:
Theorem. A holomorphic function $f(z)$ in an annulus $R_1 < |z - c| < R_2$ has an expansion
$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - c)^n \tag{6.1.2}$$
which is absolutely convergent in the annulus $R_1 < |z - c| < R_2$, and the convergence is uniform on every compact subset. Furthermore, the coefficients $a_n$ are uniquely determined by $f$. (This is called the Laurent expansion of the function $f(z)$.) Proof. The existence of the expansion (6.1.2) follows from the expansions for the functions $f_1$, $f_2$ in the variable $z - c$ discussed above. Moreover, the convergence properties of the series (6.1.2) follow from our already discussed theory of power series. Regarding uniqueness, note that the coefficients $a_n$ can be calculated by Cauchy integrals, which can be performed term by term by the convergence properties of the power series (see Exercise (13) below). $\square$
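As a numerical illustration of the last remark (not part of the original text), the coefficients can be approximated from the Cauchy integral $a_n = \frac{1}{2\pi i}\int_{L_r} f(\zeta)(\zeta - c)^{-n-1}\,d\zeta$. The sketch below assumes the example function $f(z) = 1/(z(1-z))$ on $0 < |z| < 1$, whose expansion is $\sum_{n \geq -1} z^n$; the helper name `laurent_coefficient` and the equally spaced sampling rule are choices made here for illustration only.

```python
import cmath
import math

def laurent_coefficient(f, c, n, radius, samples=2048):
    """Approximate a_n = (1/(2*pi*i)) * integral of f(zeta) * (zeta - c)^(-n-1) d(zeta)
    over the circle |zeta - c| = radius, using equally spaced sample points."""
    total = 0j
    for j in range(samples):
        theta = 2 * math.pi * j / samples
        w = radius * cmath.exp(1j * theta)   # w = zeta - c
        # d(zeta) = i * w * d(theta), so the integrand reduces to f(c + w) / w^n
        total += f(c + w) / w ** n
    return total / samples

def f(z):
    # Example function (an assumption of this sketch): 1/z + 1 + z + z^2 + ...
    return 1 / (z * (1 - z))

# Coefficients a_{-3}, ..., a_3 on the circle of radius 1/2 around c = 0
coeffs = {n: laurent_coefficient(f, 0, n, 0.5) for n in range(-3, 4)}
```

Here `coeffs[-1]`, `coeffs[0]`, `coeffs[1]`, ... come out close to 1, while `coeffs[-2]` and `coeffs[-3]` come out close to 0, in agreement with the expansion above.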
6.2
Classification of isolated singularities and the Residue Theorem
Let $U$ be an open subset of $\mathbb{C}$, and let $c \in U$. A holomorphic function $f$ defined on $U \smallsetminus \{c\}$ is said to have an isolated singularity at $c$. In this case, $f$ has a Laurent expansion (6.1.2) at $c$ with $R_1 = 0$. Isolated singularities are classified using this expansion: If $a_n = 0$ for all $n < 0$, we say that $f$ has a removable singularity at $c$. Clearly, in that case, one can extend $f$ to $U$ by setting $f(c) = a_0$. (For a stronger statement, see Exercise (14).) On the other hand, if the set of all $n$ for which $a_n \neq 0$ is not bounded below, then we say that $f$ has an essential singularity at $c$. If $n > 0$ is such that $a_{-n} \neq 0$ and $a_m = 0$ for all $m < -n$, then we say that $f$ has a pole of order $n$ at $c$. Symmetrically, if $n > 0$ and $a_n \neq 0$ while $a_m = 0$ for $m < n$, we say that $f$ has a zero of order $n$ at $c$, although that is not really a singularity. If $a_m = 0$ for $m < -n$, $n \geq 0$, we say that $f$ has at most a pole of order $n$ at $c$, and if $a_m = 0$ for $m < n$, $n > 0$, we say that $f$ has a zero of order at least $n$ at $c$. We say that $f(z)$ has at most a pole at $c$ if it does not have an essential singularity there. From the uniqueness of the Laurent expansion, it immediately follows that $f$ has at most a pole of order $n$ at $c$ if and only if $f(z)(z - c)^n$ is holomorphic in $U$, while $f$ has a zero of order at least $n$ at $c$ if and only if $f(z)/(z - c)^n$ is holomorphic in $U$. Note that the remarks 4.5 on algebraic operations with power series obviously extend to Laurent series of functions which have at most a pole at the point $c$. When essential singularities are present, however, multiplication and division may involve infinite sums of coefficients at the same power, and hence the purely algebraic operations are undefined. We will also need the following result on essential singularities:
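Proposition. Let $f$ be a holomorphic function on $\Omega(c, r) \smallsetminus \{c\}$ with an essential singularity at $c$ and let $A \in \mathbb{C}$. Then for each $\varepsilon > 0$ and each $\delta > 0$, $f(z) \in \Omega(A, \varepsilon)$ for infinitely many $z \in \Omega(c, \delta)$. Proof. Assuming the opposite, there exist an $A \in \mathbb{C}$, an $\varepsilon > 0$ and a $\delta > 0$ such that (after making $\delta$ smaller if necessary) $|f(z) - A| \geq \varepsilon$ for $z \in \Omega(c, \delta) \smallsetminus \{c\}$. But then the function
$$\frac{1}{f(z) - A}$$
is bounded near $c$, so it has a removable singularity at $c$ (see Exercise (14)), and hence $f(z)$ has at most a pole at $c$. $\square$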
For a function $f$ with an isolated singularity at $c$ as above, we define the residue of $f$ at $c$ as
$$\operatorname{res}_{z=c} f(z) = a_{-1}. \tag{6.2.1}$$
Since we may integrate the Laurent series term by term, it follows that if $L$ is a circle in $U$ with center $c$ oriented counter-clockwise such that the interior of the circle is also contained in $U$, then
$$\operatorname{res}_{z=c} f(z) = \frac{1}{2\pi i}\int_L f(\zeta)\,d\zeta. \tag{6.2.2}$$
From this and Theorem 2.1, we then immediately get the following fact: Theorem. (The Residue Theorem) Let $U$ be a domain in $\mathbb{C}$ and let $L_1, \dots, L_k$ be simple piecewise continuously differentiable closed curves with disjoint images such that $L_1 \amalg \dots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise. Let further $c_1, \dots, c_m$ be finitely many distinct points in $U$, and let $f$ be a holomorphic function on $V \smallsetminus \{c_1, \dots, c_m\}$ where $V \supseteq \overline{U}$ is an open set. Then
$$\int_{L_1} f(z)\,dz + \dots + \int_{L_k} f(z)\,dz = 2\pi i\left(\operatorname{res}_{z=c_1} f(z) + \dots + \operatorname{res}_{z=c_m} f(z)\right).$$
$\square$
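A small numerical check of the Residue Theorem may help fix the statement; it is only a sketch, not part of the original text. We assume the example $f(z) = 1/(z^2 + 1)$, which has simple poles at $\pm i$ with residues $\pm\frac{1}{2i}$. A circle of radius 1 around $i$ encloses only the pole $i$, so the integral should be $2\pi i \cdot \frac{1}{2i} = \pi$; a circle of radius 2 around 0 encloses both poles, so the integral should be 0. The helper `contour_integral` is a hypothetical name introduced here.

```python
import cmath
import math

def contour_integral(f, center, radius, samples=4096):
    """Approximate the integral of f over the counter-clockwise circle
    |z - center| = radius using equally spaced sample points."""
    total = 0j
    for j in range(samples):
        theta = 2 * math.pi * j / samples
        w = radius * cmath.exp(1j * theta)
        # z = center + w, dz = i * w * d(theta)
        total += f(center + w) * 1j * w
    return total * (2 * math.pi / samples)

def f(z):
    # Example integrand (an assumption of this sketch), poles at i and -i
    return 1 / (z * z + 1)

around_i = contour_integral(f, 1j, 1.0)     # encloses only the pole at i
around_both = contour_integral(f, 0, 2.0)   # encloses both poles
```

The first value is close to $\pi$ and the second is close to 0, matching the sums of residues inside the respective circles.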
6.3
Applications: The Argument Principle and Rouché's Theorem
The Residue Theorem has the following celebrated consequence. We say that a function $f$ is meromorphic in an open set $U \subseteq \mathbb{C}$ if $f$ is holomorphic and non-zero on $U \smallsetminus S$ for a discrete set $S \subseteq U$, and $f$ has at most a pole at each $c \in S$. Then we define the degree of $f$ at $c \in U$ as
$$\deg_c(f) = \begin{cases} n & \text{if } f \text{ has a zero of order } n \text{ at } c \\ -n & \text{if } f \text{ has a pole of order } n \text{ at } c \\ 0 & \text{otherwise.} \end{cases}$$
6.3.1 Theorem. (The Argument Principle) Let $U$ be a domain in $\mathbb{C}$ and let $L_1, \dots, L_k$ be simple piecewise continuously differentiable closed curves with disjoint images such that $L_1 \amalg \dots \amalg L_k$ is the boundary of $U$ oriented counterclockwise. Let $f$ be a meromorphic function on $V \supseteq \overline{U}$ with no zeros or poles on $L_1 \amalg \dots \amalg L_k$. Then
$$\frac{1}{2\pi i}\sum_{j=1}^{k}\int_{L_j} \frac{f'(z)}{f(z)}\,dz = \sum_{c \in U} \deg_c(f).$$
(Note that since $\overline{U}$ is compact, the sum on the right-hand side has only finitely many non-zero terms.) Proof. If $f(z) = (z - c)^n g(z)$ with $g(c) \neq 0$, then
$$f'(z) = n(z - c)^{n-1} g(z) + (z - c)^n g'(z),$$
so that
$$\frac{f'(z)}{f(z)} = \frac{n}{z - c} + \frac{g'(z)}{g(z)},$$
and hence
$$\operatorname{res}_{z=c} \frac{f'(z)}{f(z)} = n.$$
The statement then follows directly from the Residue Theorem (see Exercise (17)). $\square$ Comment: The argument of a number $z \in \mathbb{C} \smallsetminus \{0\}$ is defined as the angle $\operatorname{Arg}(z) = \alpha$ such that $z = |z| e^{i\alpha}$. Since this $\alpha$ is only defined up to adding an integral multiple of $2\pi$, one usually normalizes by requiring $0 \leq \operatorname{Arg}(z) < 2\pi$ (this is the argument in the narrower, normalized sense). It follows that in a connected open set $U$ where there exists a holomorphic function $\operatorname{Ln}(z)$ which satisfies $e^{\operatorname{Ln}(z)} = z$, we have $\operatorname{Arg}(z) = \operatorname{Im}(\operatorname{Ln}(z)) + 2\pi k$ for some $k \in \mathbb{Z}$. If $U \subseteq \mathbb{C}$ is, say, a convex open set on which $f(z)$ has no zero, then $f'(z)/f(z)$ has a primitive function $\operatorname{Ln}(f(z))$ whose imaginary part differs from $\operatorname{Arg}(f(z))$ by
$2\pi k$, $k \in \mathbb{Z}$. The whole point is, however, that by Lemma 3.1, $\operatorname{Ln}(z)$ cannot be well-defined on the whole set $\mathbb{C} \smallsetminus \{0\}$; roughly speaking, when we follow a circle with center $0$ once around counter-clockwise, the value of the logarithm will increase by $2\pi i$ (note that its real part won't change: it is just its imaginary part, the argument, which will increase by $2\pi$). Thus, Theorem 6.3.1 in the case $k = 1$ makes precise the following intuitive assertion: when we follow around a simple closed curve on which $f(z)$ has no zero and which is the boundary, oriented counter-clockwise, of a domain $U$, the increase of the argument of $f$ along this curve is equal to $2\pi$ times the number of zeros of $f$ inside $U$. Let $f$ be a holomorphic function on $U$ which is non-zero outside of a finite set of points. Then $f$ is meromorphic, and the sum of degrees of $f$ at all the points $a \in U$ (which has only finitely many non-zero summands) is called the number of zeros of the function $f$ in the set $U$. (Thus, this is a count of zeros with "multiplicities".) 6.3.2 Corollary. (Rouché's Theorem) Let $U$ be a domain in $\mathbb{C}$ and let $L_1, \dots, L_k$ be simple piecewise continuously differentiable closed curves with disjoint images such that $L_1 \amalg \dots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise. Let $f$, $g$ be holomorphic on $V \supseteq \overline{U}$ and satisfy
$$|f(z) - g(z)| < |f(z)| \quad \text{for } z \in L_1 \amalg \dots \amalg L_k.$$
Then $f$, $g$ have the same number of zeros in $U$. (Note that again, since $\overline{U}$ is compact, by Theorem 4.4, $f$ and $g$ have only finitely many zeros in $U$.) Proof. By assumption, $f$ and $g$ have no zeros on $L_1 \amalg \dots \amalg L_k$, and we have
$$\left|1 - \frac{g(z)}{f(z)}\right| < 1 \quad \text{for } z \in L_1 \amalg \dots \amalg L_k.$$
Thus, if we put $F(z) = g(z)/f(z)$, then
$$F[L_1 \amalg \dots \amalg L_k] \subseteq \Omega(1, 1),$$
where $\Omega(1, 1)$ is the open disk with center $1$ and radius $1$. Then $1/z$ has a primitive function on $\Omega(1, 1)$, which we will denote by $\operatorname{Ln}(z)$. The chain rule then implies
$$(\operatorname{Ln}(F(z)))' = \frac{F'(z)}{F(z)}.$$
Therefore,
$$\frac{1}{2\pi i}\sum_{j=1}^{k}\int_{L_j} \frac{F'(z)}{F(z)}\,dz = 0,$$
and our statement follows from the Argument Principle, since $F'/F = g'/g - f'/f$. $\square$
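The two results above can be illustrated numerically; the following is only a sketch under stated assumptions, not part of the original text. We take $f(z) = 5z$ and $g(z) = z^5 + 5z + 1$ on the unit disk: on $|z| = 1$ we have $|f(z) - g(z)| = |z^5 + 1| \leq 2 < 5 = |f(z)|$, so Rouché's Theorem predicts that $g$ has exactly one zero in the unit disk, like $f$. The helper `count_zeros` (a hypothetical name) approximates the integral in the Argument Principle with equally spaced sample points.

```python
import cmath
import math

def count_zeros(f, df, center, radius, samples=4096):
    """Approximate (1/(2*pi*i)) * integral of f'(z)/f(z) dz over the
    counter-clockwise circle |z - center| = radius."""
    total = 0j
    for j in range(samples):
        theta = 2 * math.pi * j / samples
        w = radius * cmath.exp(1j * theta)
        z = center + w
        total += df(z) / f(z) * 1j * w        # dz = i * w * d(theta)
    integral = total * (2 * math.pi / samples)
    return integral / (2j * math.pi)

# Example pair (an assumption of this sketch) satisfying |f - g| < |f| on |z| = 1
def f(z):
    return 5 * z

def df(z):
    return 5

def g(z):
    return z ** 5 + 5 * z + 1

def dg(z):
    return 5 * z ** 4 + 5

zeros_f = count_zeros(f, df, 0, 1.0)
zeros_g = count_zeros(g, dg, 0, 1.0)
```

Both `zeros_f` and `zeros_g` come out close to 1, which is the count with multiplicities predicted by the corollary.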
6.3.3 Theorem. Suppose a holomorphic function $f(z)$ defined on an open set $U \subseteq \mathbb{C}$ is such that for some $z_0 \in U$, $f(z_0) = w_0$ and the function $f(z) - w_0$ has a zero of order $n$ at $z_0$. Then there exists an $\varepsilon_0 > 0$ such that for $0 < \varepsilon < \varepsilon_0$, there exists a $\delta > 0$ such that for all $c \in \mathbb{C}$ with $|c - w_0| < \delta$, the number of zeros of $f(z) - c$ in $\Omega(z_0, \varepsilon)$ is $n$. Proof. Let $L_\varepsilon$ be the circle with center $z_0$ and radius $\varepsilon$ oriented counterclockwise. We will study the integral
$$\frac{1}{2\pi i}\int_{L_\varepsilon} \frac{f'(z)}{f(z) - c}\,dz. \tag{*}$$
Choose $\varepsilon_0 > 0$ so that $f(z) - w_0 \neq 0$ for $z \in \Omega(z_0, \varepsilon_0) \smallsetminus \{z_0\}$. (Such an $\varepsilon_0 > 0$ must exist or else $f(z)$ is constant in a neighborhood of $z_0$ by Theorem 4.4.) Then if we choose $0 < \varepsilon < \varepsilon_0$, since $L_\varepsilon$ is compact, there exists a $\delta > 0$ such that the denominator of the integrand of (*) is non-zero for all $c \in \Omega(w_0, \delta)$. Therefore, the integral (*) is defined and continuous as a function of $c$ in that domain. However, we know by the Argument Principle that (*) is a non-negative integer, namely the number of zeros of the function $f(z) - c$ in the disk $\Omega(z_0, \varepsilon)$. Thus, it must be constant, equal to its value $n$ at $c = w_0$, as claimed. $\square$
6.3.4 Corollary. (The Holomorphic Open Mapping Theorem) A non-constant holomorphic function on a connected open set $U \subseteq \mathbb{C}$ maps open sets onto open sets. Proof. Note that in particular in the conclusion of Theorem 6.3.3, every element of $\Omega(w_0, \delta)$ is in the image of $f$. $\square$
An immediate consequence is then the following: 6.3.5 Corollary. (The maximum principle) If $f(z)$ is holomorphic and non-constant in a connected open set $U \subseteq \mathbb{C}$, then $|f(z)|$ has no maximum in $U$. Proof. By Corollary 6.3.4, for any $z \in U$, all points in a neighborhood of $f(z)$ are in the image of $f$, so this will include points of greater absolute value. $\square$
Another consequence of the Argument Principle is the following 6.3.6 Theorem. (Hurwitz's Theorem) Let $f_n$ be holomorphic functions on a connected open set $U \subseteq \mathbb{C}$ which converge uniformly on every compact subset of $U$ to a function $f : U \to \mathbb{C}$. Assume further that $f_n(z) \neq 0$ for all $n$ and all $z \in U$. Then either $f$ is identically $0$ on $U$, or $f(z) \neq 0$ for all $z \in U$. Proof. We know from Weierstrass's Theorem (Theorem 3.6) that $f(z)$ is a holomorphic function on $U$. Suppose $f(z)$ is not identically $0$. Then by Theorem 4.4, for any point $z_0 \in U$, there exists a number $r > 0$ such that $f(z)$ is defined and not
equal to $0$ for $0 < |z - z_0| \leq r$. In particular, by Proposition 6.3 of Chapter 2, $|f(z)|$ has a minimum on the circle $K = \{z \in \mathbb{C} \mid |z - z_0| = r\}$, and this minimum is positive. It follows that $1/f_n(z)$ converges uniformly to $1/f(z)$ on $K$, and by Weierstrass's Theorem 3.6 also $f_n'(z)$ converges uniformly to $f'(z)$ on $K$. By Lebesgue's Dominated Convergence Theorem, considering $K$ oriented counter-clockwise, we conclude that
$$\lim_{n \to \infty} \frac{1}{2\pi i}\int_K \frac{f_n'(z)}{f_n(z)}\,dz = \frac{1}{2\pi i}\int_K \frac{f'(z)}{f(z)}\,dz. \tag{*}$$
Now by the Argument Principle, the expression under the limit in (*) is the number of zeros of $f_n$ inside the circle $K$, which is $0$, while the right-hand side is the number of zeros of $f$ inside $K$. In particular, $f(z_0) \neq 0$, and the statement follows, since $z_0$ was arbitrary. $\square$
6.4
Example: The values of the Riemann zeta function at even integers $k \geq 2$
The Riemann zeta function is
$$\zeta(s) = \sum_{m=1}^{\infty} \frac{1}{m^s} \quad \text{for } \operatorname{Re}(s) > 1.$$
A lot can be said about the Riemann zeta function, but here we want to show how the Residue Theorem can be applied to evaluating $\zeta(k)$ for $k \geq 2$ an even integer, which is a typical example of an application of the theorem. (The evaluation of $\zeta(k)$ for odd integers $k > 2$ is still an open problem.) First, note that $e^z - 1$ has a simple (= order 1) zero at $z = 0$, and hence
$$\frac{z}{e^z - 1}$$
has a removable singularity at $0$, and hence has a Taylor expansion at $z = 0$:
$$\frac{z}{e^z - 1} = \sum_{j=0}^{\infty} \frac{B_j}{j!} z^j.$$
The numbers $B_j$ are called the Bernoulli numbers. One has
$$B_0 = 1,\quad B_1 = -1/2,\quad B_2 = 1/6,\quad B_3 = 0,\quad B_4 = -1/30,\ \dots$$
(See Exercise (18).) Now consider the function
$$f(z) = \frac{2\pi i}{z^k (e^{2\pi i z} - 1)}.$$
Then, by definition, for $k \geq 2$ an even integer,
$$\operatorname{res}_{z=0} f(z) = \frac{(2\pi i)^k B_k}{k!}.$$
On the other hand, clearly $f(z)$ has a simple (= order 1) pole at each $m \in \mathbb{Z} \smallsetminus \{0\}$, and using Taylor series at $z = m$, one gets
$$\operatorname{res}_{z=m} f(z) = \frac{1}{m^k}.$$
Also, clearly, $f(z)$ has no other poles. Let $L$ be a rectangle with sides $\pm(n + \frac{1}{2}) + ti$, $\pm ni + t$, $t \in \mathbb{R}$ in the appropriate ranges, oriented counterclockwise. By the Residue Theorem,
$$\int_L f(z)\,dz = 2\pi i\left(\frac{(2\pi i)^k B_k}{k!} + 2\sum_{m=1}^{n} \frac{1}{m^k}\right). \tag{+}$$
On the other hand, the left-hand side tends to $0$ with $n \to \infty$. In effect, we claim that
$$|e^{2\pi i z} - 1| > C \tag{*}$$
on $L$, where $C > 0$ is a constant independent of $n$. To see this, note that on the vertical sides of the rectangle, $e^{2\pi i z} - 1$ is a negative real number less than $-1$, while on the horizontal sides, $e^{2\pi i z}$ is a complex number of constant absolute value, which with $n \to \infty$ tends to $0$ on the upper side, and to $\infty$ on the lower side. This proves (*), and since we are further dividing by $z^k$, $k \geq 2$, the absolute value of the integrand on the left-hand side of (+) is $< K/n^2$ for a constant $K$ independent of $n$. Since the length of $L$ grows only linearly in $n$, this implies that the left-hand side of (+) converges to $0$ with $n \to \infty$. We conclude the following Theorem. For every even integer $k \geq 2$,
$$\zeta(k) = -\frac{(2\pi i)^k B_k}{2(k!)}.$$
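The formula is easy to test numerically; the following sketch is an illustration added here, not part of the original text. It computes the Bernoulli numbers exactly from the recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \geq 1$ (obtained by multiplying the series for $z/(e^z - 1)$ by the series for $e^z - 1$ and comparing coefficients), and then evaluates $-\frac{(2\pi i)^k B_k}{2(k!)}$ using $i^k = (-1)^{k/2}$ for even $k$. The function names `bernoulli_numbers` and `zeta_even` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, count):
        # Solve sum_{j=0}^{m} C(m+1, j) * B_j = 0 for B_m
        partial = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-partial / (m + 1))
    return B

def zeta_even(k, B):
    """Evaluate -(2*pi*i)^k * B_k / (2 * k!) for an even integer k >= 2."""
    sign = (-1) ** (k // 2)          # i^k for even k
    return -sign * (2 * pi) ** k * float(B[k]) / (2 * factorial(k))

B = bernoulli_numbers(9)
zeta2 = zeta_even(2, B)   # expected to agree with pi^2 / 6
zeta4 = zeta_even(4, B)   # expected to agree with pi^4 / 90
```

For comparison, the partial sum $\sum_{m=1}^{10000} m^{-4}$ agrees with `zeta4` to about twelve decimal places.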
7
Exercises
(1) Prove from first principles that for a holomorphic function $f : U \to \mathbb{C}$ where $U$ is open, $f$, thought of as a map from an open set of $\mathbb{R}^2$ to $\mathbb{R}^2$, has a total differential at every point.
(2) Prove that the function of one complex variable
$$f(z) = \begin{cases} e^{-z^{-4}} & \text{if } z \neq 0 \\ 0 & \text{if } z = 0 \end{cases}$$
satisfies the Cauchy-Riemann conditions (CR) everywhere in $\mathbb{C}$, but is not holomorphic. [This example is due to H. Looman.]
(3) Consider the function of one complex variable
$$f(z) = \begin{cases} z^5/|z|^4 & \text{if } z \neq 0 \\ 0 & \text{if } z = 0. \end{cases}$$
Prove that $f$ is continuous everywhere in $\mathbb{C}$, satisfies the Cauchy-Riemann conditions (CR) at $z = 0$, but does not have a complex derivative at $z = 0$.
(4) Jordan's Theorem. Let $L$ be an oriented closed simple piecewise continuously differentiable curve in $\mathbb{C}$. Assume for simplicity that there exists a parametrization $c : \langle a, b\rangle \to \mathbb{C}$ of $L$ where the partition $a = a_0 < \dots < a_k = b$ mentioned in 1.1 of Chapter 8 satisfies $c'_+(a_i) \neq -\lambda c'_-(a_i)$ for $i = 1, \dots, k-1$, $c'_+(a) \neq -\lambda c'_-(b)$ for any $\lambda > 0$.
1. Prove that for every $x \in c[\langle a, b\rangle]$, there exists an open neighborhood $V_x$ and a diffeomorphism $\varphi_x : V_x \to \Omega(0, 1)$ with $\det(D\varphi_x) > 0$, a number $\alpha \in (0, 2\pi)$ and numbers $a > 0$, $b \in \mathbb{R}$ such that $c(b) = x$, $c[\langle a, b\rangle] \cap V_x = c[(b - a, b + a)]$ and for $s \in \langle -1, 0\rangle$, we have $\varphi_x c(as + b) = (-s\cos(\alpha), -s\sin(\alpha))$ and for $s \in \langle 0, 1\rangle$, we have $\varphi_x c(as + b) = (s, 0)$.
2. Define, for a point $z \in \mathbb{C} \smallsetminus c[\langle a, b\rangle]$,
$$\operatorname{ind}_c(z) = \frac{1}{2\pi i}\int_L \frac{d\zeta}{\zeta - z}.$$
Consider the notation from part 1 of this exercise. Let $t_1 = (q\cos(\beta), q\sin(\beta))$, $t_2 = (q\cos(\gamma), q\sin(\gamma))$ where $0 < q < 1$, $0 < \beta < \alpha < \gamma < 2\pi$. Let $z_i = \varphi_x^{-1}(t_i)$, $i = 1, 2$. Prove that
$$\operatorname{ind}_c(z_1) = \operatorname{ind}_c(z_2) + 1.$$
[Hint: Assume without loss of generality $b = 0$, $a = 1$, $\varphi_x = \mathrm{Id}$. Let $q < r < 1$. Consider the curve $c_1$ parametrized by the restriction of $c$ to $\langle -r, r\rangle$. Let $c_2 : [0, \alpha] \to \mathbb{C}$ and $c_3 : [\alpha, 2\pi] \to \mathbb{C}$ be defined by $t \mapsto (r\cos(t), r\sin(t))$. Let $L_i$ be the oriented curve parametrized by $c_i$ for $i = 1, 2, 3$. Use Remark 8.6.(4) for the curves $L_1 + L_2$, $L_3 - L_1$, $L_2 + L_3$.]
3. Prove that $\operatorname{ind}_c(z)$ is constant on connected components of $\mathbb{C} \smallsetminus c[\langle a, b\rangle]$.
4. Let $c[\langle a, b\rangle] \subseteq \Omega(0, R)$ and let $|z| \geq R$. Prove that $\operatorname{ind}_c(z) = 0$.
5. Prove that there exists a point $x$ of $c[\langle a, b\rangle]$ for which, in the notation of part 2 of this Exercise, $\operatorname{ind}_c(z_1) = 0$ or $\operatorname{ind}_c(z_2) = 0$. Note that either alternative can arise depending on the orientation of $c$. [Hint: Let $x \in \operatorname{Im}(c)$ be a point with maximal real part.]
6. Let $U_i$ be the connected component of $\mathbb{C} \smallsetminus c[\langle a, b\rangle]$ which contains the point $z_i$, $i = 1, 2$. Prove that $\overline{U_i} \smallsetminus U_i = c[\langle a, b\rangle]$. [Hint: Use part 1 and compactness.]
7. Prove from part 5 that $U_1 \cup U_2 \cup c[\langle a, b\rangle]$ is open, and equal to its closure, hence equal to $\mathbb{C}$. Hence, $\mathbb{C} \smallsetminus c[\langle a, b\rangle]$ has precisely two connected components, namely $U_1$ and $U_2$ (note that, by parts 2 and 3, $U_1 \neq U_2$).
(5) Prove that the set of all $z \in \mathbb{C}$ such that $e^z = 1$ is precisely the set $\{2k\pi i \mid k \in \mathbb{Z}\}$. [Hint: Recall Exercises (12), (11) of Chapter 1.]
(6) Prove that if $\operatorname{Re}(t) > 0$, then there exists a unique $z \in \mathbb{C}$ with $-\pi/2 < \operatorname{Im}(z) < \pi/2$ such that $e^z = t$. Denote $z = \ln(t)$. Prove that the complex derivative of $\ln(z)$ is $1/z$.
(7) For $\operatorname{Re}(z) > 0$, $a \in \mathbb{C}$, define $z^a = e^{a\ln(z)}$. Mimic Exercise (7) of Chapter 1 to show that the complex derivative of $z^a$ is $az^{a-1}$.
(8) Define, for $a \in \mathbb{C}$,
$$\binom{a}{0} = 1$$
and for $k \in \{1, 2, \dots\}$,
$$\binom{a}{k} = \frac{a(a-1)\cdots(a-k+1)}{k!}.$$
Prove Newton's formula, which states that for $z \in \mathbb{C}$ with $|z| < 1$, we have
$$(1 + z)^a = \sum_{n=0}^{\infty} \binom{a}{n} z^n.$$
(9) Suppose that $f$ is a holomorphic function on $\mathbb{C}$, and suppose there exist non-zero numbers $a, b \in \mathbb{C}$ such that we do not have $qa = b$ for any $q \in \mathbb{Q}$, and such that $f(z + a) = f(z)$, $f(z + b) = f(z)$ for all $z \in \mathbb{C}$. Prove that then $f$ is constant. [Note that there is more than one case to consider.]
(10) Prove that a non-constant holomorphic function $f : \mathbb{C} \to \mathbb{C}$ satisfies $\overline{f[\mathbb{C}]} = \mathbb{C}$. [Hint: If $a \notin \overline{f[\mathbb{C}]}$, then the function $1/(f(z) - a)$ is holomorphic and bounded.]
(11) Prove that if $L$ is a parametrized oriented piecewise smooth curve in an open set $U \subseteq \mathbb{C}$, $h : U \to \mathbb{C}$ is a holomorphic injective function and $f$ is a holomorphic function on $h[U]$, then
$$\int_{h[L]} f(t)\,dt = \int_L f(h(z))h'(z)\,dz.$$
(Note that the notation $h[L]$ applied to a parametrized curve is slightly imprecise, but the meaning is clear.)
(12) Prove that if $f(z)$ is a holomorphic function on an annulus $R_1 < \|z - a\| < R_2$ which cannot be holomorphically extended to any annulus $r_1 < \|z - a\| < r_2$ for $r_1 \leq R_1$, $r_2 \geq R_2$ where equality does not arise in both cases, then the Laurent expansion (6.1.2) diverges outside the annulus $R_1 \leq \|z - a\| \leq R_2$.
(13) Prove that
$$\int_K (\zeta - a)^n\,d\zeta = 0$$
where $K$ is a circle with center $a$ oriented counter-clockwise, and $n \in \mathbb{Z}$, $n \neq -1$.
(14) Let $U \subseteq \mathbb{C}$ be an open set, and let $a \in U$. Let $f$ be a holomorphic function on $U \smallsetminus \{a\}$. Suppose that $f$ is bounded in some neighborhood of $a$. Prove that then $f$ has a removable singularity at $a$. [Hint: Consider the function $f_2$ of Subsection 6.1. Then $f_2$ is bounded in a neighborhood of $a$ because $f$ is. This means that the function $g$ of Subsection 6.1 is holomorphic and bounded in all of $\mathbb{C}$. Apply Liouville's Theorem.]
(15) Prove that the complex function $f(z) = e^{1/z}$ for $z \neq 0$ has an essential singularity at $z = 0$. Conclude that the Taylor expansion of $f$ at $a \in \mathbb{C} \smallsetminus \{0\}$ has radius of convergence $\|a\|$. (Compare to Exercise (13) of Chapter 1.)
(16) Prove that a function $f$ as in Subsection 6.2 has a pole of order $n$ at $a$ if and only if $g(z) = f(z)(z - a)^n$ is holomorphic in $U$ and $g(a) \neq 0$. Similarly, prove that $f$ has a zero of order $n$ at $a$ if and only if $h(z) = f(z)/(z - a)^n$ is holomorphic in $U$ and $h(a) \neq 0$.
(17) Let $U$ be a connected open subset of $\mathbb{C}$ and let $f, g$ be meromorphic functions on $U$. Prove that $fg$, $f/g$ are meromorphic on $U$.
(18) Prove that $B_k = 0$ if $k > 2$ is an odd integer.
11
Multilinear Algebra
Now that we strengthened our foundations in topology, algebra needs an upgrade as well. We already know the concept of a bilinear map, which we have used, for example, in Chapter 3, Section 8. Of the more general multilinear maps, we encountered one additional example: the determinant. In this chapter, we will study multilinear maps in some depth. This is essential for our treatment of differential forms in Chapter 12 below, as well as for tensor calculus in Chapter 15 below. In this chapter and the next, we will drop the bold-faced letter convention of 1.2 of Chapter 3, as it is generally not used in this context.
1
Hom and dual vector spaces
1.1 In this Chapter, the symbol $\mathbb{F}$ stands for either the field $\mathbb{R}$ of real numbers or the field $\mathbb{C}$ of complex numbers. Let $V$, $W$ be vector spaces over $\mathbb{F}$. Denote by $\operatorname{Hom}_{\mathbb{F}}(V, W)$ the set of all linear maps (homomorphisms of $\mathbb{F}$-vector spaces) $f : V \to W$. Observe that $\operatorname{Hom}_{\mathbb{F}}(V, W)$ is again a vector space: for $f, g \in \operatorname{Hom}_{\mathbb{F}}(V, W)$, we have a linear map $f + g \in \operatorname{Hom}_{\mathbb{F}}(V, W)$ defined by $(f + g)(x) = f(x) + g(x)$, and when $\lambda \in \mathbb{F}$, we also have a linear map $\lambda f$ defined by
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 11, © Springer Basel 2013
$(\lambda f)(x) = \lambda f(x)$. The required identities are obvious. A case of special interest is when $W = \mathbb{F}$. Then we write $V^* = \operatorname{Hom}_{\mathbb{F}}(V, \mathbb{F})$, and call $V^*$ the dual of the vector space $V$ and often refer to elements of $V^*$ as linear forms on $V$.
1.2
Covariance and contravariance
Observe the behavior of $\operatorname{Hom}_{\mathbb{F}}(V, W)$ with respect to linear maps. First, let $\varphi : W \to W'$ be a linear map. We can naturally define a map $\varphi_* : \operatorname{Hom}_{\mathbb{F}}(V, W) \to \operatorname{Hom}_{\mathbb{F}}(V, W')$ by composition with $\varphi$:
$$f \in \operatorname{Hom}_{\mathbb{F}}(V, W) \mapsto \varphi \circ f \in \operatorname{Hom}_{\mathbb{F}}(V, W').$$
The map $\varphi_*$ is clearly linear. Moreover, this construction clearly preserves the identity, and if we have another linear map $\psi : W' \to W''$, we have
$$\psi_* \circ \varphi_* = (\psi \circ \varphi)_*.$$
Next, consider a linear map $\varphi : V \to V'$. Again, we can define naturally a map $\varphi^*$ on $\operatorname{Hom}_{\mathbb{F}}(?, W)$ by $g \mapsto \varphi^*(g) = g \circ \varphi$. This map,
$$\varphi^* : \operatorname{Hom}_{\mathbb{F}}(V', W) \to \operatorname{Hom}_{\mathbb{F}}(V, W),$$
however, goes in the opposite direction! Again, $\varphi^*$ is clearly linear, and this construction preserves identity. Also, it preserves composition, but this time in the reversed order: If $\psi : V' \to V''$ is a linear map, then
$$(\psi \circ \varphi)^* = \varphi^* \circ \psi^*.$$
This behavior, i.e. reversing the direction of maps and the order of composition, is referred to as contravariance and the opposite of contravariance, i.e. preserving the direction of maps and order of composition, is then referred to as covariance. Thus, the construction $\varphi^*$ is contravariant and the construction $\varphi_*$ is covariant.
These are basic concepts of category theory where the constructions $\varphi \mapsto \varphi_*$ and $\varphi \mapsto \varphi^*$ are referred to as covariant and contravariant functors, which means that they preserve the identity, and preserve or reverse the order of composition. For our purposes, however, we do not need to investigate these concepts further. The interested reader can look at [12]. At this moment, the most important fact for us is that the dual is contravariant, i.e. for a linear map of vector spaces $\varphi : V \to W$ we obtain a linear map $\varphi^* : W^* \to V^*$.
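In coordinates, contravariance is easy to observe; the following sketch is an illustration added here under stated assumptions, not part of the original text. We assume that linear maps between spaces $\mathbb{F}^n$ are represented by matrices (lists of rows) and that a linear form is a $1 \times n$ row vector, so that pulling a form $g$ back along a map with matrix $A$ is the product $gA$. The names `matmul` and `pullback` are hypothetical helpers.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def pullback(A):
    """Return the map g -> g o A on linear forms, with g a 1 x n row vector."""
    return lambda g: matmul(g, A)

phi = [[1, 2], [0, 1], [3, 0]]   # phi : V = F^2 -> V' = F^3
psi = [[1, 0, 1], [2, 1, 0]]     # psi : V' = F^3 -> V'' = F^2
g = [[5, 7]]                     # a linear form on V''

# (psi o phi)^* applied to g, versus phi^* applied to psi^*(g)
lhs = pullback(matmul(psi, phi))(g)
rhs = pullback(phi)(pullback(psi)(g))
```

Both `lhs` and `rhs` equal `[[34, 45]]`: the composition $\psi \circ \varphi$ pulls back as $\varphi^* \circ \psi^*$, with the order reversed.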
1.3
The dual basis
Suppose now that a vector space $V$ is finite-dimensional, and let $(v_1, \dots, v_n)$ be an ordered basis of $V$. Then, by the definition of a basis, there exist linear forms $f_1, \dots, f_n \in V^*$ such that
$$f_i(v_j) = \begin{cases} 1 & \text{when } i = j \\ 0 & \text{else.} \end{cases}$$
Proposition. $(f_1, \dots, f_n)$ is an ordered basis of $V^*$ (and is referred to as the dual (ordered) basis of $(v_1, \dots, v_n)$). Proof. We know that any linear form $f : V \to \mathbb{F}$ satisfies
$$f(\lambda_1 v_1 + \dots + \lambda_n v_n) = \lambda_1 f(v_1) + \dots + \lambda_n f(v_n).$$
We conclude that
$$f = f(v_1) f_1 + \dots + f(v_n) f_n,$$
so the forms $f_i$ generate $V^*$. They are also linearly independent, since evaluating a relation $\mu_1 f_1 + \dots + \mu_n f_n = 0$ at $v_j$ gives $\mu_j = 0$. $\square$
Note that finite-dimensionality was used in the expression for $f$ above, where we would get an undefined infinite sum, were the basis infinite.
1.4
The double dual
Let $V$ be a vector space. Then there is a map $\gamma : V \to (V^*)^*$
defined naturally as follows: Let $v \in V$ and $(f : V \to \mathbb{F}) \in V^*$. Then define $(\gamma(v))(f) = f(v)$. Proposition. The map $\gamma$ is an isomorphism when $V$ is finite-dimensional. Proof. Let $V$ have an ordered basis $(v_1, \dots, v_n)$. Let $(f_1, \dots, f_n)$ be the dual ordered basis, and let $(w_1, \dots, w_n)$ be the dual ordered basis of $(f_1, \dots, f_n)$. By definition, we have $\gamma(v_i) = w_i$. $\square$
1.5
Duals of inner product spaces
For a general finite-dimensional space $V$, there is no "naturally defined" isomorphism $V \to V^*$. This statement can actually be made more precise, but we won't need that. For a finite-dimensional real vector space $V$ with inner product $\langle u, v\rangle$, however, the situation is different. We can define a linear map
$$\beta : V \to V^* \tag{*}$$
by
$$(\beta(v))(w) = \langle v, w\rangle. \tag{**}$$
It is easily seen that this is an isomorphism, since when $(v_1, \dots, v_n)$ is an orthonormal ordered basis, $(\beta(v_1), \dots, \beta(v_n))$ is the dual basis. When $V$ is a finite-dimensional inner product vector space over $\mathbb{C}$, the situation is somewhat more complicated. If we attempt to define an isomorphism (*) by the formula (**), we find by the properties of the complex inner product that the map we obtain is anti-linear, not linear. It is possible to define the complex conjugate space $\overline{V}$ which is the same as $V$ as a real vector space, and the multiplication by $\lambda \in \mathbb{C}$ on $\overline{V}$ is defined as the multiplication by the complex conjugate $\overline{\lambda}$ on $V$. Then the formula (**) defines an isomorphism $\overline{V} \to V^*$.
2
Multilinear maps and the tensor product
2.1 Let $V_1, \dots, V_n, W$ be vector spaces over the same field $\mathbb{F}$ (again, we assume $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$). A multilinear map from $V_1 \times \dots \times V_n$ into $W$ is a map of sets $\varphi$ which is linear in each coordinate. This means that for fixed $v_i \in V_i$, $i \neq k$ ($i, k = 1, \dots, n$), and for $x, y \in V_k$, $\lambda \in \mathbb{F}$, we have
$$\varphi(v_1, \dots, v_{k-1}, x + y, v_{k+1}, \dots, v_n) = \varphi(v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n) + \varphi(v_1, \dots, v_{k-1}, y, v_{k+1}, \dots, v_n)$$
and
$$\varphi(v_1, \dots, v_{k-1}, \lambda x, v_{k+1}, \dots, v_n) = \lambda\varphi(v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n).$$
When $n = 2$ resp. $n = 3$, we speak of a bilinear resp. trilinear map, etc. Beware the standard mistake: Note that a multilinear map is not linear (except in special cases such as when $V_1 = \dots = V_n = 0$). For example, in the bilinear case, the linear additivity formula gives
$$2\varphi(x, z + t) = \varphi(2x, z + t) = \varphi(x, z) + \varphi(x, t),$$
while the multilinear additivity formula gives
$$\varphi(x, z + t) = \varphi(x, z) + \varphi(x, t).$$
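The difference between bilinearity and linearity can be checked on the determinant, the example of a multilinear map mentioned at the beginning of this chapter. The sketch below is an illustration added here, not part of the original text; it treats the $2 \times 2$ determinant as a function `det2` of its two columns, and the helper names are hypothetical.

```python
def det2(x, y):
    """Determinant of the 2 x 2 matrix with columns x and y."""
    return x[0] * y[1] - x[1] * y[0]

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, u):
    return (c * u[0], c * u[1])

x, y, z, t = (1, 2), (3, 5), (0, 1), (2, -1)

# Multilinear additivity and homogeneity, one coordinate at a time
additive_in_second = det2(x, add(z, t)) == det2(x, z) + det2(x, t)
homogeneous_in_first = det2(scale(3, x), y) == 3 * det2(x, y)

# Linearity on V x V would require det2(x + y, z + t) = det2(x, z) + det2(y, t)
linear_additivity = det2(add(x, y), add(z, t)) == det2(x, z) + det2(y, t)
```

The first two flags are `True` and the last one is `False` (the two sides are $-14$ and $-12$), so `det2` is bilinear but not linear.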
2.2
The tensor product by universality
Since a multilinear map from $V_1 \times \dots \times V_n$ to $W$ is not linear, it raises the question whether the information contained in a multilinear map could be equivalently replaced by the information contained in a linear map. In mathematics, a standard approach to such a situation is by looking for a universal object: Is there a vector space $W_0$ and a multilinear map $\iota$ from $V_1 \times \dots \times V_n$ to $W_0$ such that for every multilinear map $\varphi$ from $V_1 \times \dots \times V_n$ to any vector space $W$ there exists a unique linear map $\varphi_0 : W_0 \to W$ such that $\varphi_0 \circ \iota = \varphi$? We express this by a diagram (the arrows labelled multi mean multilinear maps):
$$\begin{array}{ccc} V_1 \times \dots \times V_n & \xrightarrow{\ \iota\ (\text{multi})\ } & W_0 \\ & {\scriptstyle \varphi\ (\text{multi})} \searrow & \downarrow {\scriptstyle \varphi_0} \\ & & W \end{array}$$
The arrow $\varphi_0$ (dotted in the original diagram) means that the map (in this case linear) exists and is determined by the other data. For given vector spaces $V_1, \dots, V_n$, such a universal vector space $W_0$ indeed exists. It is called the tensor product, and denoted by
$$W_0 = V_1 \otimes \dots \otimes V_n.$$
Of course, the existence is yet to be proved. However, let us observe that just from the universal property, if the tensor product exists, it must be unique up to a preferred (we say canonical) isomorphism: Suppose $\iota' : V_1 \times \dots \times V_n \to W_0'$ is another universal multilinear map. Then by the universality of $W_0$, there exists a linear map $\alpha : W_0 \to W_0'$ such that
$$\alpha\iota = \iota'.$$
Similarly, by the universality of $W_0'$, there exists a linear map $\beta : W_0' \to W_0$ such that
$$\beta\iota' = \iota.$$
But now, since we have
$$\beta\alpha\iota = \iota,$$
by the uniqueness part of the universal property of $\iota$, we must have $\beta\alpha = \mathrm{Id}$, and similarly $\alpha\beta = \mathrm{Id}$, so $\alpha$ and $\beta$ are linear isomorphisms.
2.3
The existence of the tensor product
Proposition. Let $V_1, \dots, V_n$ be vector spaces. Then there exists a vector space $V_1 \otimes \dots \otimes V_n$ (the tensor product) which satisfies the universality property from the last paragraph. Proof. The construction is not very inspiring. Recall from Appendix A, 5.6 the construction of the free vector space. Now take the free vector space
$$F(V_1 \times \dots \times V_n) \tag{*}$$
on the (typically infinite) basis $V_1 \times \dots \times V_n$ (forgetting, for the moment, the vector space structure of $V_1, \dots, V_n$ completely). Then there is a canonical (i.e. obvious) map
$$\iota_0 : V_1 \times \dots \times V_n \to F(V_1 \times \dots \times V_n),$$
namely sending each $x = (v_1, \dots, v_n) \in V_1 \times \dots \times V_n$ to the free generator of the same name. This map $\iota_0$ is just a map of sets; there is no reason even to suspect that it may be multilinear. Now, however, we apply our technique of factorization. Namely, in (*), take the vector subspace $Z$ generated by all the elements
$$(v_1, \dots, v_{k-1}, x + y, v_{k+1}, \dots, v_n) - (v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n) - (v_1, \dots, v_{k-1}, y, v_{k+1}, \dots, v_n)$$
and
$$(v_1, \dots, v_{k-1}, \lambda x, v_{k+1}, \dots, v_n) - \lambda(v_1, \dots, v_{k-1}, x, v_{k+1}, \dots, v_n)$$
where $v_i \in V_i$, $i \neq k$, $i, k = 1, \dots, n$, $x, y \in V_k$ and $\lambda \in \mathbb{F}$. Now define $W_0$ as the factor
$$W_0 = F(V_1 \times \dots \times V_n)/Z.$$
Therefore, by definition of the factor space, we have a canonical projection
$$\pi : F(V_1 \times \dots \times V_n) \to W_0.$$
Put $\iota = \pi\iota_0$. Then we immediately see that $\iota$ is a multilinear map, because the definition of multilinearity is precisely equivalent to asserting that our generators of $Z$ go to $0$. Now by definition of basis, any map of sets $\varphi : V_1 \times \dots \times V_n \to W$ determines a unique linear map
$$\Phi : F(V_1 \times \dots \times V_n) \to W$$
such that $\Phi(v_1, \dots, v_n) = \varphi(v_1, \dots, v_n)$. If $\varphi$ is moreover multilinear, then $\Phi[Z] = 0$, so by the Homomorphism Theorem, there exists a (necessarily unique) linear map
$$\varphi_0 : F(V_1 \times \dots \times V_n)/Z \to W$$
such that $\varphi_0\pi = \Phi$. $\square$
Notation: One usually denotes
$$v_1 \otimes \dots \otimes v_n = \iota(v_1, \dots, v_n) \in V_1 \otimes \dots \otimes V_n.$$
Let us also remark that to be completely precise, we should denote our tensor product by $V_1 \otimes_{\mathbb{F}} \dots \otimes_{\mathbb{F}} V_n$ to distinguish the field. We will, however, typically not use this longer notation unless confusion can arise. Perhaps the most important convention is that in most of advanced mathematics, a multilinear map $\varphi : V_1 \times \dots \times V_n \to W$ is generally identified with the corresponding linear map $\varphi_0 : V_1 \otimes \dots \otimes V_n \to W$, which means that the two concepts are no longer distinguished explicitly, and the linear variant is written in all formulas.
2.4
The tensor product and bases
To avoid excessive indexing, assume here that $n = 2$ and investigate the tensor product $V \otimes W$ of vector spaces $V$, $W$ with ordered bases $(v_1, \dots, v_m)$ and $(w_1, \dots, w_p)$. (See Exercise (2).)
Proposition. The set $\{v_i \otimes w_j \mid i = 1, \dots, m;\ j = 1, \dots, p\}$ is a basis of $V \otimes W$. Proof. By the uniqueness explained in 2.2, it suffices to exhibit a bilinear map $\iota$ from $V \times W$ into the free vector space on the set of symbols $v_i \otimes w_j$ which satisfies the universal property for bilinear maps. Put
$$\iota\left(\sum_{i=1}^{m} \lambda_i v_i, \sum_{j=1}^{p} \mu_j w_j\right) = \sum_{i,j} \lambda_i\mu_j\, v_i \otimes w_j.$$
Bilinearity is an immediate consequence of associativity and distributivity. For universality, simply note that every bilinear map $\varphi : V \times W \to U$ into another vector space $U$ must satisfy
$$\varphi\left(\sum_{i=1}^{m} \lambda_i v_i, \sum_{j=1}^{p} \mu_j w_j\right) = \sum_{i,j} \lambda_i\mu_j\, \varphi(v_i, w_j),$$
so the map required by the universality property is uniquely given by the formula
$$\varphi_0(v_i \otimes w_j) = \varphi(v_i, w_j).$$
$\square$
2.5
The tensor product and duals
Let $V$, $W$ be vector spaces over $\mathbb{F}$. Note that we have canonical maps
$$\sigma : V^* \otimes W \to \operatorname{Hom}(V, W), \qquad \tau : V^* \otimes W^* \to (V \otimes W)^*.$$
Specifically, let $v \in V$, $w \in W$, and let $(f : V \to \mathbb{F}) \in V^*$, $(g : W \to \mathbb{F}) \in W^*$.
Define
$$(\sigma(f \otimes w))(v) = f(v)w, \qquad (\tau(f \otimes g))(v \otimes w) = f(v)\,g(w).$$
Proposition. Let $V$, $W$ be finite-dimensional vector spaces. Then the linear maps $\sigma$, $\tau$ defined above are isomorphisms. Proof. Let $V, W$ have ordered bases $(v_1, \dots, v_m)$ and $(w_1, \dots, w_n)$, let the dual ordered bases be $(f_1, \dots, f_m)$, $(g_1, \dots, g_n)$. We already know that the space $\operatorname{Hom}(V, W)$ is isomorphic to the space of $(n \times m)$-matrices by assigning to a linear map $\varphi : V \to W$ its matrix with respect to the bases $(v_i)$, $(w_j)$. Denote by $\varphi_{i,j} \in \operatorname{Hom}(V, W)$ the linear map whose matrix has $1$ in the $j$'th row and $i$'th column and $0$'s elsewhere. Then clearly the set of all $\varphi_{i,j}$, $i = 1, \dots, m$; $j = 1, \dots, n$ is a basis of $\operatorname{Hom}(V, W)$, and we have
$$\sigma(f_i \otimes w_j) = \varphi_{i,j},$$
thus proving the statement for $\sigma$. Now let, on $(V \otimes W)^*$, $(e_{i,j})$ be the dual basis to the basis $(v_i \otimes w_j)$. Then by definition
$$\tau(f_i \otimes g_j) = e_{i,j},$$
thus proving the statement about $\tau$. $\square$
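In coordinates, the first of these maps sends $f \otimes w$ to a rank-one matrix; the following sketch is an illustration added here, not part of the original text. It assumes $V = \mathbb{F}^3$ and $W = \mathbb{F}^2$ with their standard bases, represents the form $f$ by its list of values on the basis vectors, and uses the hypothetical helpers `outer` and `apply`.

```python
def outer(w, f):
    """Matrix of the map v -> f(v) * w: entry (row j, column i) is w[j] * f[i]."""
    return [[wj * fi for fi in f] for wj in w]

def apply(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

f = [2, -1, 3]      # values of a linear form on the standard basis of V = F^3
w = [1, 4]          # a vector of W = F^2
v = [1, 1, 2]       # a test vector of V

M = outer(w, f)                                # matrix of the image of f (x) w
f_of_v = sum(a * b for a, b in zip(f, v))      # f(v) = 2 - 1 + 6 = 7
image = apply(M, v)                            # should equal f(v) * w

# The image of (dual basis form f_1) (x) (basis vector w_2) is the matrix unit
# with 1 in row 2, column 1, as in the proof above
unit = outer([0, 1], [1, 0, 0])
```

Here `image` equals `[7, 28]`, which is $f(v)w$, and `unit` is `[[0, 0, 0], [1, 0, 0]]`.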
3
The exterior (Grassmann) algebra
3.1
Alternating (multilinear) maps
Let $V$, $W$ be vector spaces over the field $\mathbb{F}$ (which, again, we assume to be equal to $\mathbb{R}$ or $\mathbb{C}$). Recall that a multilinear map
$$\varphi : \underbrace{V \times \dots \times V}_{k \text{ times}} \to W$$
can be identified with a linear map
$$\varphi : \underbrace{V \otimes \dots \otimes V}_{k \text{ times}} \to W$$
(the left hand side is, of course, also denoted by $V^{\otimes k}$). The multilinear map $\varphi$ is called alternating if for any permutation
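$\sigma : \{1, \dots, k\} \to \{1, \dots, k\}$, and any vectors $v_i \in V$, we have
$$\varphi(v_{\sigma(1)} \otimes \dots \otimes v_{\sigma(k)}) = \operatorname{sgn}(\sigma)\,\varphi(v_1 \otimes \dots \otimes v_k).$$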
3.2 It is natural to ask if there is a universal object for alternating multilinear maps just as the tensor product was for multilinear maps, i.e. if for every vector space $V$ and every $k = 0, 1, 2, \dots$ there exists a vector space $W_a$ and an alternating map $\alpha: V^{\otimes k} \to W_a$ such that for every alternating map $\phi: V^{\otimes k} \to W$ there exists a unique linear map $\phi_a: W_a \to W$ such that $\phi = \phi_a \circ \alpha$, or, expressed by a diagram,
$$\begin{array}{ccc} V^{\otimes k} & \xrightarrow{\ \alpha\ (\mathrm{alt})\ } & W_a \\ & {\scriptstyle \phi\ (\mathrm{alt})}\searrow & \big\downarrow {\scriptstyle \phi_a} \\ & & W. \end{array}$$
The notation alt means an alternating map. Such an object indeed exists, as we shall prove in 3.3. It is called the exterior power, and is denoted by $\Lambda^k(V)$. It is also unique up to canonical isomorphism by the same argument as the tensor product (see Exercise (6)).
3.3 The existence of the exterior power
Proposition. The vector space $W_a = \Lambda^k(V)$ and the map $\alpha: V^{\otimes k} \to \Lambda^k(V)$ with the universal property described in Section 3.2 exist.
Proof. Let $Z \subseteq V^{\otimes k}$ be the vector subspace generated by all elements of the form
$$(v_{\sigma(1)} \otimes \dots \otimes v_{\sigma(k)}) - \mathrm{sgn}(\sigma)\,(v_1 \otimes \dots \otimes v_k)$$
where $\sigma$ is a permutation on $\{1, \dots, k\}$ and $v_i \in V$. Then by definition, the quotient map $\alpha: V^{\otimes k} \to \Lambda^k(V) = V^{\otimes k}/Z$ is alternating (since all the generators of $Z$ being 0 is a translation of the definition of an alternating map). Let $\phi: V^{\otimes k} \to W$ be an alternating map. Then, again, by definition, $\phi[Z] = 0$, so by the Homomorphism Theorem, there exists a unique linear map
$$\phi_a: \Lambda^k(V) = V^{\otimes k}/Z \to W$$
such that $\phi = \phi_a \circ \alpha$, as claimed. $\square$
Notation: For $v_1, \dots, v_k \in V$, one writes
$$v_1 \wedge \dots \wedge v_k = \alpha(v_1 \otimes \dots \otimes v_k) \in \Lambda^k(V).$$
3.4 Exterior powers and bases
Proposition. Let $V$ be a finite-dimensional vector space with ordered basis $(v_1, \dots, v_n)$. Then the set
$$\{v_{i_1} \wedge \dots \wedge v_{i_k} \mid 1 \le i_1 < \dots < i_k \le n\} \qquad (1)$$
is a basis of $\Lambda^k(V)$.
Proof. Again, we will use the uniqueness which follows from the universal property. Let $W_a'$ be the free vector space on the set (1). A linear map on a vector space can be defined by specifying its values on the basis elements. The basis elements of $V^{\otimes k}$ are
$$v_{i_1} \otimes \dots \otimes v_{i_k}, \qquad i_1, \dots, i_k \in \{1, \dots, n\}. \qquad (2)$$
Define thus $\alpha': V^{\otimes k} \to W_a'$ by sending the element (2) to 0 if two of the numbers $i_1, \dots, i_k$ are equal, and to
$$\mathrm{sgn}(\sigma)\, v_{i_{\sigma(1)}} \wedge \dots \wedge v_{i_{\sigma(k)}} \quad \text{if } i_{\sigma(1)} < \dots < i_{\sigma(k)}.$$
Now let $\phi: V^{\otimes k} \to W$ be an alternating map. Define $\phi_a': W_a' \to W$ by
$$\phi_a'(v_{i_1} \wedge \dots \wedge v_{i_k}) = \phi(v_{i_1} \otimes \dots \otimes v_{i_k}). \qquad (3)$$
Then for a basis element $x$ in (2),
$$\phi(x) = \phi_a' \alpha'(x) \qquad (4)$$
follows from the definition of an alternating map (in particular, note that if two coordinates of $x$ coincide, swapping these two coordinates only changes the sign but not $x$, so we get $\phi(x) = -\phi(x)$, implying $\phi(x) = 0$). Note also that the definition (3) is thereby forced by (4), so $\alpha'$ has the universal property of 3.2. $\square$
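The basis described in the proposition can be enumerated mechanically. The following is a minimal sketch, not part of the original text: the function name `exterior_power_basis` and the encoding of a basis element $v_{i_1} \wedge \dots \wedge v_{i_k}$ by its increasing index tuple are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def exterior_power_basis(n, k):
    """Index tuples (i_1 < ... < i_k) labelling the basis elements
    v_{i_1} ^ ... ^ v_{i_k} of the k-th exterior power of an
    n-dimensional space (hypothetical helper, 1-based indices)."""
    return list(combinations(range(1, n + 1), k))

basis = exterior_power_basis(4, 2)
print(basis)                     # [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(len(basis) == comb(4, 2))  # True
```

Counting these tuples gives a binomial coefficient, and for $k > n$ the list is empty.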
3.5 Remark
Note that if $\dim(V) = n$, then we have
$$\dim \Lambda^k(V) = \binom{n}{k}.$$
In particular, for $k > n$, we have $\Lambda^k(V) = 0$, and we also have
$$\dim(\Lambda^n(V)) = 1.$$
This means that the space of alternating multilinear maps $V^{\otimes n} \to F$ is 1-dimensional. Specifying an isomorphism
$$\theta: V \to F^n$$
(where the right hand side is the space of columns), one such non-zero alternating map is
$$v_1 \otimes \dots \otimes v_n \mapsto \det(\theta(v_1), \dots, \theta(v_n)).$$
(Here the argument of the determinant is simply the $n \times n$ matrix with the columns listed put in that order.) Thus, we have proved that any alternating multilinear map $V^{\otimes n} \to F$ is a constant multiple of the determinant!
When $F = \mathbb{R}$, a choice of one of the two connected components of $\Lambda^n(V) \setminus \{0\}$ is called an orientation of the vector space $V$, and the other orientation is called the opposite orientation. A linear isomorphism $f: V \to V$ is said to preserve orientation if the linear isomorphism $\Lambda^n(f)$ (see Exercise (7)) restricted to $\Lambda^n(V) \setminus \{0\}$ preserves the chosen connected component. Otherwise, $f$ is said to reverse orientation.
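The alternating property of the determinant can be checked on a small example. This is a sketch under stated assumptions: vectors are given directly as columns (so the isomorphism with the space of columns is taken to be the identity), the determinant is computed by the Leibniz formula, and the helper names `sign` and `det` are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(columns):
    """Leibniz-formula determinant of the square matrix with the given columns."""
    n = len(columns)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col, row in enumerate(perm):
            term *= columns[col][row]
        total += term
    return total

cols = [(1, 2, 0), (0, 1, 3), (4, 0, 1)]
sigma = (2, 0, 1)                                  # a cyclic (even) permutation
permuted = [cols[sigma[i]] for i in range(3)]
print(det(permuted) == sign(sigma) * det(cols))    # True: det is alternating
print(det([cols[0], cols[0], cols[2]]))            # 0: a repeated column
```

The sign of `det` applied to the images of a basis under a linear isomorphism is also what decides whether orientation is preserved, as in Exercise (8).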
3.6 The exterior product
Let $V$, $Z$, $W$ be vector spaces. Consider two numbers $k, \ell = 0, 1, 2, \dots$. It is useful to study multilinear maps
$$\phi: V^{\otimes k} \otimes Z^{\otimes \ell} \to W$$
which are alternating in the first $k$ coordinates and the last $\ell$ coordinates separately. By this, we mean that
$$\phi(v_{\sigma(1)} \otimes \dots \otimes v_{\sigma(k+\ell)}) = \mathrm{sgn}(\sigma)\, \phi(v_1 \otimes \dots \otimes v_{k+\ell})$$
whenever $v_1, \dots, v_k \in V$, $v_{k+1}, \dots, v_{k+\ell} \in Z$, and $\sigma$ is a permutation which satisfies $\sigma(\{1, \dots, k\}) = \{1, \dots, k\}$ (and hence, of course, also $\sigma(\{k+1, \dots, k+\ell\}) = \{k+1, \dots, k+\ell\}$). It turns out that the universal object for multilinear maps alternating in the first $k$ and last $\ell$ coordinates separately is $\Lambda^k(V) \otimes \Lambda^\ell(Z)$. More precisely, we have the following
Proposition. The map
$$\alpha_2 = \alpha \otimes \alpha: V^{\otimes k} \otimes Z^{\otimes \ell} \to \Lambda^k(V) \otimes \Lambda^\ell(Z)$$
(see Exercises (3) and (4)) is alternating in the first $k$ and last $\ell$ coordinates separately. For any vector space $W$ and any linear map $\phi: V^{\otimes k} \otimes Z^{\otimes \ell} \to W$ alternating in the first $k$ and last $\ell$ coordinates separately, there exists a unique linear map
$$\phi_2: \Lambda^k(V) \otimes \Lambda^\ell(Z) \to W$$
such that $\phi = \phi_2 \alpha_2$.
Proof. It is obvious that $\alpha_2$ is alternating in the first $k$ and the last $\ell$ coordinates separately. Consider a map $\phi$ as in the statement of the proposition. Then for $w \in Z^{\otimes \ell}$ fixed, $v \mapsto \phi(v \otimes w)$ is an alternating map on $V^{\otimes k}$, which gives us a map
$$\phi_w: \Lambda^k(V) \to W. \qquad (*)$$
Fixing now $v \in \Lambda^k(V)$, on the other hand, (*) is clearly linear and alternating in $w$, thus giving us a map
$$\phi_v: \Lambda^\ell(Z) \to W. \qquad (**)$$
Therefore, (**) specifies a bilinear map $\Lambda^k(V) \times \Lambda^\ell(Z) \to W$, and hence a linear map
$$\phi_2: \Lambda^k(V) \otimes \Lambda^\ell(Z) \to W.$$
We have $\phi = \phi_2 \alpha_2$ by definition, which in turn uniquely determines $\phi_2$ since $\Lambda^k(V) \otimes \Lambda^\ell(Z)$ is generated by elements of the form $(v_1 \wedge \dots \wedge v_k) \otimes (v_{k+1} \wedge \dots \wedge v_{k+\ell})$, $v_1, \dots, v_k \in V$, $v_{k+1}, \dots, v_{k+\ell} \in Z$. $\square$
The whole point of the proposition for our purposes is that for $V = Z$, when a map $\phi: V^{\otimes (k+\ell)} \to W$ is alternating, it is clearly alternating in the first $k$ and last $\ell$ coordinates separately, so the universal property in the proposition (for $W = \Lambda^{k+\ell}(V)$) gives a map
$$\wedge: \Lambda^k(V) \otimes \Lambda^\ell(V) \to \Lambda^{k+\ell}(V).$$
We will think of this map as a kind of a product, called the exterior product, i.e. write, for $x \in \Lambda^k(V)$, $y \in \Lambda^\ell(V)$,
$$x \wedge y \in \Lambda^{k+\ell}(V).$$
One has, of course, for $v_1, \dots, v_{k+\ell} \in V$,
$$(v_1 \wedge \dots \wedge v_k) \wedge (v_{k+1} \wedge \dots \wedge v_{k+\ell}) = v_1 \wedge \dots \wedge v_{k+\ell}.$$
In the above notation, we therefore see that
$$x \wedge y = (-1)^{k\ell}\, y \wedge x$$
since $(-1)^{k\ell}$ is the sign of the permutation swapping the first $k$ with the last $\ell$ coordinates (without changing their individual orders). Note that if we put
$$\Lambda(V) = \bigoplus_{k=0}^{\infty} \Lambda^k(V)$$
(let $\Lambda^0(V) = F$), this defines an actual bilinear product
$$\wedge: \Lambda(V) \otimes \Lambda(V) \to \Lambda(V).$$
This product is associative and unital in the sense that
$$(x \wedge y) \wedge z = x \wedge (y \wedge z), \qquad 1 \wedge x = x \wedge 1 = x.$$
One calls $\Lambda(V)$ the exterior algebra (or the Grassmann algebra).
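The sign rule for the exterior product can be seen concretely on basis elements. The sketch below is an illustration only: it assumes a fixed ordered basis, represents an element of the exterior algebra as a dictionary from increasing index tuples to coefficients, and the helper names `sort_sign` and `wedge` are hypothetical.

```python
def sort_sign(indices):
    """Return (sign, sorted tuple) for a sequence of indices, or (0, None)
    if an index repeats (the wedge of the basis vectors then vanishes)."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, None
    sign = 1
    # Bubble sort, flipping the sign at every transposition.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(x, y):
    """Exterior product of elements given as {increasing index tuple: coefficient}."""
    result = {}
    for I, a in x.items():
        for J, b in y.items():
            sign, K = sort_sign(I + J)
            if sign:
                result[K] = result.get(K, 0) + sign * a * b
    return {K: c for K, c in result.items() if c != 0}

x = {(1,): 2, (3,): 1}        # 2 v1 + v3, degree k = 1
y = {(1, 2): 1, (2, 4): 5}    # v1^v2 + 5 v2^v4, degree l = 2
print(wedge(x, y) == wedge(y, x))        # True, since (-1)^(1*2) = 1
print(wedge({(1,): 1}, {(2,): 1}))       # {(1, 2): 1}
print(wedge({(2,): 1}, {(1,): 1}))       # {(1, 2): -1}
```

Working with index tuples keeps the example independent of any particular vector space; only the sign bookkeeping of the exterior algebra is modelled.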
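3.7 The exterior product and duality
Let $V$ be a vector space over $\mathbb{R}$ or $\mathbb{C}$. Define a linear map
$$\lambda: \Lambda^k(V^*) \to (\Lambda^k(V))^*$$
by
$$(\lambda(f_1 \wedge \dots \wedge f_k))(v_1 \wedge \dots \wedge v_k) = \sum_{\sigma} \mathrm{sgn}(\sigma)\, f_{\sigma(1)}(v_1) \cdots f_{\sigma(k)}(v_k)$$
where the sum is over all permutations $\sigma$ on the set $\{1, \dots, k\}$.
Proposition. If $V$ is a finite-dimensional vector space, then the map $\lambda$ defined above is an isomorphism.
Proof. Let $(v_1, \dots, v_n)$ be an ordered basis of $V$ and let $(f_1, \dots, f_n)$ be the dual ordered basis of $V^*$. Then for $1 \le i_1 < \dots < i_k \le n$, we have
$$(v_{i_1} \wedge \dots \wedge v_{i_k})^* = \lambda(f_{i_1} \wedge \dots \wedge f_{i_k}),$$
where the left-hand side denotes the corresponding element of the basis dual to the basis of $\Lambda^k(V)$ from Proposition 3.4. $\square$
3.8 The Hodge * operator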
Now let $V$ be a finite-dimensional real inner product space of dimension $n$. (The complex case can be treated also, but we don't need it; see e.g. [8]). Then $\Lambda^k(V)$ is naturally an inner product space where the inner product is defined by
$$\langle v_1 \wedge \dots \wedge v_k,\ w_1 \wedge \dots \wedge w_k \rangle = \sum_{\sigma} \mathrm{sgn}(\sigma)\, \langle v_{\sigma(1)}, w_1 \rangle \cdots \langle v_{\sigma(k)}, w_k \rangle$$
where the sum is over all permutations $\sigma$ on $\{1, \dots, k\}$. It is useful to note that if $(v_1, \dots, v_n)$ is an ordered orthonormal basis of $V$, then the basis given by Proposition 3.4 is orthonormal.
Now let $V$ be, in addition, oriented. Recall from Remark 3.5 that $\dim(\Lambda^n(V)) = 1$ and note that an orientation specifies a connected component $C$ of $\Lambda^n(V) \setminus \{0\}$. Now there exists a unique $\omega \in C$ with $\langle \omega, \omega \rangle = 1$. There exists a unique linear isomorphism
$$\varepsilon: \Lambda^n(V) \to \mathbb{R}$$
such that $\varepsilon(\omega) = 1$. Then we have a linear map
$$\mu: \Lambda^k(V) \to (\Lambda^{n-k}(V))^*$$
defined by
$$(\mu(v_1 \wedge \dots \wedge v_k))(v_{k+1} \wedge \dots \wedge v_n) = \varepsilon(v_1 \wedge \dots \wedge v_n).$$
Now define the Hodge * operator
$$*: \Lambda^k(V) \to \Lambda^{n-k}(V)$$
as the composition
$$\Lambda^k(V) \xrightarrow{\ \mu\ } (\Lambda^{n-k}(V))^* \xrightarrow{\ \cong\ } \Lambda^{n-k}(V)$$
where the second map is given by the inner product on $\Lambda^{n-k}(V)$. Note that when $(v_1, \dots, v_n)$ is an ordered orthonormal basis of $V$, then $v_1 \wedge \dots \wedge v_n$ is equal either to $\omega$ or $-\omega$. We say that the basis is oriented if
$$v_1 \wedge \dots \wedge v_n = \omega.$$
Then we see readily that for an oriented ordered orthonormal basis $(v_1, \dots, v_n)$ of $V$, we have
$$*(v_1 \wedge \dots \wedge v_k) = v_{k+1} \wedge \dots \wedge v_n.$$
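On basis elements the Hodge operator reduces to a sign computation. The sketch below assumes the standard oriented orthonormal basis $e_1, \dots, e_n$ of $\mathbb{R}^n$ and encodes $e_{i_1} \wedge \dots \wedge e_{i_k}$ by its increasing index tuple; the function name `hodge_star_basis` is hypothetical. The sign is that of the permutation listing the chosen indices first and the complementary indices after them, which follows from the last formula above by reordering the basis.

```python
def hodge_star_basis(I, n):
    """For an increasing tuple I of indices from 1..n, return (sign, J) with
    *(e_I) = sign * e_J, where J is the increasing complement of I."""
    J = tuple(j for j in range(1, n + 1) if j not in I)
    perm = I + J
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return (-1 if inversions % 2 else 1), J

print(hodge_star_basis((1,), 3))    # (1, (2, 3)):  *e1 = e2 ^ e3
print(hodge_star_basis((2,), 3))    # (-1, (1, 3)): *e2 = -e1 ^ e3
print(hodge_star_basis((1, 2), 3))  # (1, (3,)):    *(e1 ^ e2) = e3
```

In $\mathbb{R}^3$ this reproduces the familiar correspondence between $e_1 \wedge e_2$ and $e_3$ that underlies the cross product.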
4 Exercises
(1) Let $V$, $W$ be finite-dimensional vector spaces, and let $\phi: V \to W$ be a linear map. Fix ordered bases $(v_1, \dots, v_n)$ of $V$ and $(w_1, \dots, w_m)$ of $W$. Let $(f_1, \dots, f_n)$ and $(g_1, \dots, g_m)$ be the dual ordered bases. Let $A$ be the matrix of the map $\phi$ with respect to the ordered bases $(v_1, \dots, v_n)$, $(w_1, \dots, w_m)$. Prove that the matrix of the map $\phi^*$ with respect to the dual ordered bases is the transposed matrix $A^T$.
(2) Write down a basis of $V_1 \otimes \dots \otimes V_n$ in terms of chosen bases of $V_1, \dots, V_n$ for general $n$.
(3) "Functoriality" of the tensor product: For linear maps $f: V \to V'$, $g: W \to W'$, construct a map $f \otimes g: V \otimes W \to V' \otimes W'$ in such a way that $\mathrm{Id} \otimes \mathrm{Id}$ is the identity, and the construction preserves compositions.
(4) Prove commutativity, associativity and unitality of the tensor product, i.e. construct isomorphisms
$$V \otimes W \to W \otimes V, \qquad V \otimes (W \otimes Z) \to (V \otimes W) \otimes Z, \qquad F \otimes V \to V$$
which form commutative diagrams with the linear maps constructed in Exercise (3). (This property is called "naturality".)
(5) Prove that for any vector spaces $V$, $W$, $Z$, there is a canonical isomorphism
$$\mathrm{Hom}(V, \mathrm{Hom}(W, Z)) \cong \mathrm{Hom}(V \otimes W, Z).$$
(6) Prove the uniqueness of $\Lambda^n(V)$ based on its universal property discussed in 3.2.
(7) Prove "functoriality" of $\Lambda^n$, i.e. for a linear map $f: V \to W$, construct a linear map $\Lambda^n(f): \Lambda^n(V) \to \Lambda^n(W)$ which preserves identity maps and compositions.
(8) Prove that for a finite-dimensional vector space $V$, a linear isomorphism $f: V \to V$ preserves orientation if and only if $\det(f) > 0$.
(9) Prove the associativity and unitality property of the exterior product defined in Section 3.6.
(10) Let $V$ be a real finite-dimensional inner product space. Prove the commutativity of the diagram
$$\begin{array}{ccc} \Lambda^k(V^*) & \xrightarrow{\ \cong\ } & (\Lambda^k(V))^* \\ \big\uparrow {\scriptstyle \cong} & & \big\uparrow {\scriptstyle \cong} \\ \Lambda^k(V) & \xrightarrow{\ \mathrm{Id}\ } & \Lambda^k(V) \end{array}$$
where the vertical maps are given by the inner products in $V$ and $\Lambda^k(V)$.
12 Smooth Manifolds, Differential Forms and Stokes' Theorem
In this chapter, we will introduce smooth manifolds (“locally Euclidean spaces”). A theory of differential forms, which we will exhibit, allows us to set up a general theory of integration on such spaces, and to generalize Green’s Theorem in Chapter 8 to the general Stokes Theorem in arbitrary dimension. In the process of introducing these topics, we will touch on the field of algebraic topology. For basic information on this topic, the reader may look at [20]. For a more advanced introduction to algebraic topology from the point of view of differential forms, we recommend [3]. For an introduction to topics which are more abstract, we recommend [13, 14].
1 Smooth manifolds
1.1 Topological manifolds
A topological manifold of dimension $n$ (briefly a topological $n$-manifold) is a metrizable separable topological space $M$ (metrizable means that there exists a metric on $M$ which induces the given topology on $M$) such that for every $x \in M$ there exists an open neighborhood $U_x$ of $x$ and an injective open map
$$h_x: U_x \to \mathbb{R}^n \qquad (*)$$
(open map means that the image of every set open in the domain is open in the codomain). The neighborhood $U_x$ is called a coordinate neighborhood, and the function $h_x$ is called a coordinate system, or coordinate system at $x$. The map assigning to each $x \in M$ a coordinate neighborhood and a coordinate system is called an atlas. The coordinate systems of an atlas are also referred to as charts.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 12, © Springer Basel 2013
Remarks:
1. Note that instead of requiring $h_x$ to be open, we could have equivalently required that $h_x$ be a homeomorphism onto its image (see Exercise (1)).
2. Since we assume $M$ is separable and metrizable, it has a countable basis (see 1.2 of Chapter 9). The reason we don't actually say $M$ is a metric space is that we do not want to specify the metric: the metric has no geometric significance, and is only a technical tool at this point. While there are metrics on manifolds which do have a geometrical significance, we will only see these when we develop more structure (such as the concept of a Riemann metric in Chapter 15).
3. Note that the pairs $(U_x, h_x)$ may coincide for different points $x$. For example, for $M = \mathbb{R}^n$, the atlas may contain only one coordinate system, namely $\mathbb{R}^n$ with the identity map $\mathrm{Id}: \mathbb{R}^n \to \mathbb{R}^n$, which can be equal to $(U_x, h_x)$ for all $x \in \mathbb{R}^n$. In other interesting cases, the atlas may contain only finitely many coordinate systems (in fact, note that by definition, a compact manifold always has such a finite atlas). The reader may wonder why we don't simply speak of atlases as open covers $\mathcal{U}$, with coordinate systems on each $U \in \mathcal{U}$. This is merely a technical point: it turns out that being able to denote a coordinate neighborhood of a point by a single symbol simplifies many arguments.
4. Because we required separability, by our definition, an uncountable discrete set is not a manifold. There is an alternative definition, calling a manifold any (possibly uncountable) disjoint union of manifolds in our sense. (In a disjoint union $\coprod_i M_i$, a set $U$ is open if and only if each $U \cap M_i$ is open in $M_i$.)
1.2 Smooth manifolds
A smooth manifold of dimension $n$ (briefly a smooth $n$-manifold) is a topological manifold $M$ with an atlas $(U_x, h_x)$ such that for every $x, y \in M$, the composition
$$h_x[U_x \cap U_y] \xrightarrow{\ (h_x)^{-1}\ } U_x \cap U_y \xrightarrow{\ h_y\ } h_y[U_x \cap U_y] \qquad (C)$$
is a smooth map, i.e. a map which is continuous and has partial derivatives of all orders which are also continuous. (Note that the domain and codomain of (C) are open subsets of $\mathbb{R}^n$; also note that the intersection of $U_x$ and $U_y$ may be empty; in that case, the condition (C) is void.)
Remarks:
1. Note that this definition is completely intuitive: it simply says that in a coordinate neighborhood, we can speak of smooth real functions, and that these concepts are compatible when we pass from one coordinate neighborhood to another.
2. Note that the continuity of all higher partial derivatives does not follow from their existence, even on an open set (see Exercise (2) of Chapter 3).
1.3 Differentiable maps
Let $M$ and $N$ be smooth manifolds of dimensions $m$ and $n$, respectively. A map $f: M \to N$ is called a $C^r$-map if $f$ is continuous and for every $x \in M$, the composition
$$h_x[f^{-1}[U_{f(x)}] \cap U_x] \xrightarrow{\ h_x^{-1}\ } f^{-1}[U_{f(x)}] \cap U_x \xrightarrow{\ f\ } U_{f(x)} \xrightarrow{\ h_{f(x)}\ } \mathbb{R}^n$$
has continuous partial derivatives up to order $r \in \mathbb{N}$. Note that the source of the composition is an open subset of a Euclidean space. A $C^\infty$ map is a map which is $C^r$ for all $r$. A $C^r$-diffeomorphism ($r \ge 1$) is a homeomorphism $f: M \to N$ such that both $f$, $f^{-1}$ are $C^r$. A $C^\infty$-diffeomorphism will be referred to simply as a diffeomorphism. Two smooth manifolds $M$, $N$ for which there exists a diffeomorphism $M \to N$ are called diffeomorphic.
For a point $x \in M$, a smooth coordinate system at $x$ consists of an open neighborhood $U$ of $x$ and a (smooth) diffeomorphism $h: U \to V$ where $V$ is an open subset of $\mathbb{R}^n$. Given a smooth manifold $M$ with a given atlas $(U_x, h_x)$, any other atlas $(V_x, k_x)$ on the topological manifold $M$ is considered an atlas on the smooth manifold $M$ if the identity on $M$ is a diffeomorphism from the manifold defined by the atlas $(U_x, h_x)$ to the manifold defined by the atlas $(V_x, k_x)$.
1.4 Examples
(1) Any open subset of a Euclidean space $\mathbb{R}^n$ is, of course, a smooth manifold, and $C^r$-maps between such manifolds are simply maps for which the required partial derivatives (in the old sense) exist and are continuous.
(2) More generally, an open subset $U$ of a smooth manifold $M$ automatically inherits a structure of a smooth manifold.
(3) Suppose $f: \mathbb{R}^n \to \mathbb{R}$ is a $C^\infty$-function. Define
$$M = \{(x, f(x)) \in \mathbb{R}^{n+1} \mid x \in \mathbb{R}^n\}.$$
Then $M$ is a smooth manifold with a single coordinate neighborhood $\mathbb{R}^n$ and the coordinate function $(x, f(x)) \mapsto x$. The smooth manifold $M$ is known as the graph of the function $f$. The identity embedding $M \subseteq \mathbb{R}^{n+1}$ is a $C^\infty$-map and the projection $M \to \mathbb{R}^n$ given by $(x, f(x)) \mapsto x$ is a diffeomorphism.
(4) The first "non-trivial" example of a smooth manifold is the $n$-sphere
$$S^n = \Big\{(x_0, \dots, x_n) \in \mathbb{R}^{n+1} \ \Big|\ \sum_{i=0}^{n} x_i^2 = 1\Big\}.$$
For every $x = (x_0, \dots, x_n) \in S^n$, there exists a $k \in \{0, \dots, n\}$ such that $x_k \ne 0$. Choose, for each $x$, such a $k$ and an $\varepsilon$ satisfying $0 < \varepsilon < 1 - \sqrt{1 - x_k^2}$, so that the open ball $\Omega(h_x(x), \varepsilon)$ below lies inside the open unit ball of $\mathbb{R}^n$. Then we can take
$$h_x: (y_0, \dots, y_n) \mapsto (y_0, \dots, y_{k-1}, y_{k+1}, \dots, y_n), \qquad U_x = h_x^{-1}[\Omega(h_x(x), \varepsilon)],$$
where $h_x$ is understood as restricted to the hemisphere of points $y \in S^n$ whose coordinate $y_k$ has the same sign as $x_k$ (on which it is injective).
(5) If $M$, $N$ are smooth manifolds, then $M \times N$ is naturally a smooth manifold where the coordinate neighborhood of a point $(x, y) \in M \times N$ is $U_x \times U_y$ and the coordinate function is $h_{(x,y)}(z, t) = (h_x(z), h_y(t))$. The product projections $M \times N \to M$, $M \times N \to N$ are $C^\infty$-maps.
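The sphere charts of Example (4) can be tried out numerically. This is a minimal sketch rather than the text's construction verbatim: indices are 0-based, the hemisphere is selected by an explicit `sign` argument, and the names `chart` and `chart_inverse` are hypothetical.

```python
import math

def chart(y, k):
    """Drop the k-th coordinate of a point y of the sphere S^n."""
    return y[:k] + y[k + 1:]

def chart_inverse(z, k, sign):
    """Recover the point of S^n whose k-th coordinate has the given sign
    from its image z under `chart`; z must lie in the open unit ball."""
    missing = sign * math.sqrt(1.0 - sum(t * t for t in z))
    return z[:k] + (missing,) + z[k:]

x = (0.6, 0.0, -0.8)                    # a point of S^2 with x_2 != 0
z = chart(x, 2)                         # coordinates of x in the chart k = 2
back = chart_inverse(z, 2, sign=-1)     # lands on x again (up to rounding)
# Transition to the chart k = 0 (valid here because x_0 = 0.6 != 0):
transition = chart(chart_inverse(z, 2, sign=-1), 0)
print(z, transition)
```

The transition map is built from coordinate projections and the square root of a positive quantity, which is why it is smooth on the overlap of two such charts.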
1.5 Smooth partition of unity
Let $M$ be a smooth manifold and let $(U_i)_{i \in I}$ be an open cover of $M$. A smooth partition of unity subordinate to the cover $(U_i)$ is a system of smooth functions $u_i: M \to \mathbb{R}$ such that for every $x \in M$, $0 \le u_i(x) \le 1$, $\overline{u_i^{-1}[(0, 1]]} \subseteq U_i$ (i.e. the support of $u_i$ is contained in $U_i$), and for every $x \in M$ there exists an open neighborhood $V_x$ of $x$ and a finite subset $I_x \subseteq I$ such that for all $y \in V_x$, $i \in I \setminus I_x$, we have $u_i(y) = 0$ and
$$\sum_{i \in I} u_i = 1. \qquad (1.5.1)$$
(Note that the expression on the left-hand side of (1.5.1) makes sense because on $V_x$, it can be defined as the sum over $I_x$.)
A refinement of an open cover $(U_i)_{i \in I}$ is an open cover $(V_j)_{j \in J}$ such that for every $j \in J$, there exists an $i \in I$ such that $V_j \subseteq U_i$. A cover $(U_i)_{i \in I}$ is called locally finite if for every $x \in M$, there exists an open neighborhood $V_x$ and a finite subset $I_x \subseteq I$ such that for $i \in I \setminus I_x$, $V_x \cap U_i = \emptyset$.
Lemma. For every open cover $(A_i)_{i \in I}$ of a smooth manifold $M$, there exists an atlas $(U_j, h_j)_{j \in J}$ such that $J$ is countable, the cover $(U_j)$ is locally finite and is a refinement of the cover $(A_i)$, we have $h_j[U_j] = \Omega(o, 3)$ (the open ball of radius 3 centered at the origin $o$), and $(h_j^{-1}[\Omega(o, 1)])_{j \in J}$ is also a cover of $M$.
Proof. Since $M$ has a countable basis by Theorem 1.2 of Chapter 9, any open cover has a countable subcover. Since clearly every point of $M$ has a compact neighborhood, there exists a countable cover by open sets whose closures are compact, which is a refinement of $(A_i)$. Assume, without loss of generality, that $(A_i)_{i \in I}$ itself is such a cover, and that, moreover, $I = \{1, 2, \dots\}$. Now define $K_1 = \overline{A_1}$, and assuming $K_1, \dots, K_i$ are defined, let
$$K_{i+1} = \overline{A_1} \cup \dots \cup \overline{A_r}$$
where $r > i$ is the smallest number such that $K_i \subseteq A_1 \cup \dots \cup A_r$. (Note that such a number exists by compactness.) Denote by $X^\circ$ the interior of a set $X \subseteq M$, i.e.
$$X^\circ = M \setminus \overline{(M \setminus X)}.$$
Now setting $K_0 = K_{-1} = \emptyset$, one has
$$M = \bigcup_{i=1}^{\infty} (K_i \setminus K_{i-1}^\circ) \qquad \text{and} \qquad K_{i-1} \subseteq K_i^\circ.$$
For each $x \in K_i \setminus K_{i-1}^\circ$, we can find an open neighborhood $U_x \subseteq K_{i+1}^\circ \setminus K_{i-2}$ which is contained in one of the $A_i$'s, and a diffeomorphism $h_x: U_x \to \Omega(o, 3)$ such that $h_x(x) = o$. Furthermore, there are finite sets $S_i \subseteq K_i \setminus K_{i-1}^\circ$ so that the sets $h_x^{-1}[\Omega(o, 1)]$, $x \in S_i$, cover $K_i \setminus K_{i-1}^\circ$. The system $(U_x, h_x)_{x \in \bigcup S_i}$ is the required atlas. $\square$
Theorem. For any open cover $(A_i)$ of a smooth manifold $M$ there exists a smooth partition of unity subordinate to $(A_i)$.
Proof. Let $\lambda: \mathbb{R} \to \mathbb{R}$ be the function defined by $\lambda(t) = e^{-1/t}$ for $t > 0$ and $\lambda(t) = 0$ for $t \le 0$. Then $\lambda$ is smooth (see Exercise (3) of Chapter 1). Hence the function $g: \mathbb{R}^n \to \mathbb{R}$ defined by
$$g(x) = \frac{\lambda(2 - \|x\|)}{\lambda(2 - \|x\|) + \lambda(\|x\| - 1)}$$
is smooth, satisfies $0 \le g(x) \le 1$ for every $x \in \mathbb{R}^n$, and
$$g(x) = 0 \text{ for } \|x\| \ge 2, \qquad g(x) = 1 \text{ for } \|x\| \le 1.$$
Now take the atlas $(U_j, h_j)$ from the statement of the Lemma, let $g_j = g \circ h_j$ on $U_j$ (extended by 0 outside $U_j$) and define
$$u_j = \frac{g_j}{\sum_{k \in J} g_k} \quad \text{for } j \in J.$$
(Note that the right-hand side is well defined by local finiteness, and that the denominator is positive because the sets $h_j^{-1}[\Omega(o, 1)]$ cover $M$.) $\square$
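The cutoff function from the proof is easy to evaluate numerically. The sketch below is only an illustration in the special case of the real line, with charts assumed to be translations $x \mapsto x - c$ for integer centers $c$ (so that the unit intervals around the centers cover the points used); the names `lam`, `g` and `partition_of_unity` are hypothetical.

```python
import math

def lam(t):
    """Smooth function equal to exp(-1/t) for t > 0 and 0 otherwise."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def g(r):
    """Smooth cutoff in the radius r = ||x||: 1 for r <= 1, 0 for r >= 2."""
    return lam(2.0 - r) / (lam(2.0 - r) + lam(r - 1.0))

def partition_of_unity(x, centers):
    """Values u_j(x) built from the bumps g(|x - c|) for the given centers."""
    bumps = [g(abs(x - c)) for c in centers]
    total = sum(bumps)          # positive as long as some |x - c| < 2
    return [b / total for b in bumps]

print(g(0.5), g(1.5), g(2.5))   # 1.0 0.5 0.0
values = partition_of_unity(0.3, range(-3, 4))
print(sum(values))              # 1.0 up to rounding error
```

Only the finitely many bumps whose centers lie within distance 2 of the point contribute, which is the local finiteness used in the proof.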
2 Tangent vectors, vector fields and differential forms
The notion of a tangent vector to a smooth manifold models the geometric intuition (for example, the instantaneous velocity of a point moving in the manifold). As we learned in the previous section, however, we must model everything in terms of coordinate neighborhoods.
2.1 Tangent vectors
Let $M$ be a smooth $n$-manifold and let $x \in M$. Consider the set $\widetilde{TM}_x$ of all triples $(U, h, v)$ where $U$ is an open neighborhood of $x$, $h: U \to V$ is a diffeomorphism for some $V \subseteq \mathbb{R}^n$ open, and $v \in \mathbb{R}^n$. Now introduce the following equivalence relation $\sim$ on $\widetilde{TM}_x$: We put $(U, h, v) \sim (V, k, w)$ if there exists an open neighborhood $W$ of $x$ contained in $U \cap V$ such that if we denote by $f$ the composition
$$h[W] \xrightarrow{\ h^{-1}\ } W \xrightarrow{\ k\ } k[W],$$
then
$$Df_{h(x)}(v) = w.$$
(Recall that $D$ denotes the total differential, see 3.2 of Chapter 3). It is easy to verify that this is indeed an equivalence relation. The set of equivalence classes of $\widetilde{TM}_x$ is denoted by $TM_x$ and its elements are called tangent vectors to $M$ at $x$. A representative of a $\sim$-equivalence class will be called a representative of a tangent vector. The tangent vector represented by a triple $(U, h, v)$ will be sometimes denoted by $[(U, h, v)]$. When this gets too cumbersome, we will also refer to $v$ as the vector $[(U, h, v)]$ in the coordinate system $h: U \to V$.
Lemma. Let $u \in TM_x$. Then for every neighborhood $U$ of $x$ and every diffeomorphism $h: U \to V$ for $V \subseteq \mathbb{R}^n$ open, there exists a unique representative $(U, h, v)$ of the tangent vector $u$.
Proof. If $(V, k, w)$ is any representative of $u$, put
$$v = \big(D(k \circ h^{-1})_{h(x)}\big)^{-1}(w).$$
(Note that $k \circ h^{-1}$ is defined in a neighborhood of $h(x)$.) By definition, this proves existence. To prove uniqueness, note that by definition, clearly we cannot have $(U, h, v) \sim (U, h, v')$ for $v \ne v' \in \mathbb{R}^n$. $\square$
Note that by the lemma it immediately follows that $TM_x$ has a natural structure of an $\mathbb{R}$-vector space, and that moreover, this vector space is $n$-dimensional. In effect, let $U$ be an open neighborhood of $x$ and let $h: U \to V$ be a diffeomorphism onto an open subset of $\mathbb{R}^n$. Let
$$[(U, h, v)] + [(U, h, w)] = [(U, h, v + w)], \qquad \lambda [(U, h, v)] = [(U, h, \lambda v)]$$
where $\lambda \in \mathbb{R}$. Correctness of the definitions of these operations (i.e. independence of the results of chosen representatives) follows from the linearity of the differential in $\mathbb{R}^n$.
As noted above, a coordinate system $h: U \to V$ at $x \in M$ identifies $TM_x$ with $\mathbb{R}^n$. Putting $h = (h_1, \dots, h_n)$, $h_i: U \to \mathbb{R}$, it is useful to denote the ordered basis of $TM_x$ corresponding to the standard basis of $\mathbb{R}^n$ by
$$\Big(\frac{\partial}{\partial h_1}, \dots, \frac{\partial}{\partial h_n}\Big). \qquad (*)$$
The reason for this notation is that if $f: U \to \mathbb{R}$ is a smooth function, in the spirit of the chain rule, it makes sense to write
$$\frac{\partial f(x)}{\partial h_i} = \frac{\partial}{\partial h_i} f(x) = \frac{\partial (f \circ h^{-1})}{\partial x_i}(h(x))$$
where on the right-hand side, $x_i$ denotes the standard $i$'th coordinate of $\mathbb{R}^n$, as used in Chapter 3. It is also useful to notice that when $U \subseteq \mathbb{R}^n$ is an open subset, $x \in U$, we have a canonical identification $\mathbb{R}^n \cong TU_x$ via
$$v \mapsto [(U, \mathrm{Id}, v)].$$
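As an illustration of how one tangent vector is represented in two coordinate systems, consider the following worked example. It is an assumption made for illustration, not part of the original text: the manifold is the open right half-plane, one chart is the identity, and the other is given by polar coordinates.

```latex
% Example (illustrative): U = \{(x,y) \in \mathbb{R}^2 \mid x > 0\}, with
% h = \mathrm{Id} and k(x,y) = (r,\theta) = (\sqrt{x^2+y^2},\ \arctan(y/x)).
\[
D(k \circ h^{-1})_{(x,y)} =
\begin{pmatrix}
 x/r & y/r \\
 -y/r^{2} & x/r^{2}
\end{pmatrix},
\qquad r = \sqrt{x^{2}+y^{2}} .
\]
% Hence (U, h, v) \sim (U, k, w) for v = (v_1, v_2) exactly when
\[
w = \Big( \frac{x v_1 + y v_2}{r},\ \frac{-y v_1 + x v_2}{r^{2}} \Big).
\]
% At the point (1,0), the vector v = (0,1) gives w = (0,1):
% the tangent vector \partial/\partial y coincides there with \partial/\partial\theta.
```

The same tangent vector thus has different coordinate columns in different charts, related by the differential of the transition map.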
2.2 The total differential on manifolds
Let $M$, $N$ be smooth manifolds and let $f: M \to N$ be a $C^1$-map. Let $x \in M$. We define the total differential of $f$ at $x$,
$$Df_x: TM_x \to TN_{f(x)},$$
as follows: Let $V$ be an open neighborhood of $f(x)$ and let $k: V \to W$ be a diffeomorphism where $W \subseteq \mathbb{R}^m$ is open. Then define
$$Df_x([(U, h, v)]) = [(V, k, D(k \circ f \circ h^{-1})_{h(x)}(v))].$$
This definition is correct by the chain rule (in Euclidean spaces) and $Df_x$ is linear by linearity of differentials (in Euclidean spaces). Additionally, note that it generalizes the definition of total differential 3.2 of Chapter 3 when we identify the tangent space of an open subset of $\mathbb{R}^n$ at every point with $\mathbb{R}^n$.
If we have a real $C^\infty$-function $f: U \to \mathbb{R}$ from some $U \subseteq M$ open, we usually write $df(x)$ instead of $Df_x$. From this point of view, $df$ can also be viewed as a $C^\infty$ 1-form (see 2.3 below). Similar statements, of course, hold for $C^r$ and smooth functions. In particular, it is useful to note that if $h: U \to V$ is a coordinate system at $x \in M$, and $h = (h_1, \dots, h_n)$, then $(dh_1, \dots, dh_n)$ is a basis of $(TM_x)^*$ dual to the basis 2.1 (*) of $TM_x$.
In preparation for the next subsection, note also that by the properties of duals and exterior products (see Chapter 11), we have canonical linear maps
$$(Df_x)^*: (TN_{f(x)})^* \to (TM_x)^*, \qquad \Lambda^k((Df_x)^*): \Lambda^k((TN_{f(x)})^*) \to \Lambda^k((TM_x)^*).$$
A smooth map $f: M \to N$ is called an immersion (resp. submersion) if for every $x \in M$, $Df_x$ is injective (resp. onto). An immersion which is a homeomorphism onto its image is also called an embedding or an inclusion of a submanifold. We will then also refer to $f[M]$ as a submanifold of $N$.
2.3 Smooth vector fields and differential forms
Let $M$ be a smooth $n$-manifold. Then a vector field $v$ on $M$ (resp. a $k$-form $\omega$ on $M$) is a map assigning to each $x \in M$ an element $v(x) \in TM_x$ (resp. $\omega(x) \in \Lambda^k((TM_x)^*)$). A differential form is a common term for $k$-forms for any $k$. A vector field (resp. a $k$-form) is called $C^r$ if for every $x \in M$ there exists an open neighborhood $U$ and a diffeomorphism $h: U \to V$ for $V \subseteq \mathbb{R}^n$ open such that the map defined on $U$ by
$$y \mapsto Dh_y(v(y)) \in \mathbb{R}^n \qquad \text{resp.} \qquad y \mapsto \Lambda^k\big(((Dh_y)^{-1})^*\big)(\omega(y)) \in \Lambda^k((\mathbb{R}^n)^*)$$
is a $C^r$ map, where the right-hand side uses the identification of the tangent spaces of an open subset of $\mathbb{R}^n$ at the end of Section 2.1. $C^\infty$ vector fields and $k$-forms are also called smooth.
It is useful to note that if $h: U \to V$ is a smooth coordinate system at some point $x \in M$, $h = (h_1, \dots, h_n)$, then immediately from the definition, the vector space of all smooth vector fields on $U$ is
$$\Big\{ \sum_{i=1}^{n} f_i \frac{\partial}{\partial h_i} \ \Big|\ f_i: U \to \mathbb{R} \text{ smooth functions} \Big\},$$
and the space of all smooth 1-forms on $U$ is
$$\Big\{ \sum_{i=1}^{n} f_i\, dh_i \ \Big|\ f_i: U \to \mathbb{R} \text{ smooth functions} \Big\}.$$
Thus, the smooth vector field or 1-form is completely determined by the $n$-tuple of smooth functions $(f_1, \dots, f_n)$, and vice versa, the functions $f_i$ are determined by the vector field (resp. differential form) and the coordinate system. Using Proposition 3.4 of Chapter 11, we can extend this to $k$-forms. The space of all smooth $k$-forms on $U$ is isomorphic to
$$\Big\{ \sum_{1 \le i_1 < \dots < i_k \le n} f_{i_1, \dots, i_k}\, dh_{i_1} \wedge \dots \wedge dh_{i_k} \ \Big|\ f_{i_1, \dots, i_k}: U \to \mathbb{R} \text{ smooth} \Big\},$$
and the smooth functions $f_{i_1, \dots, i_k}$ are completely determined by a smooth $k$-form.
It is also useful to realize that if $(U_i, h_i)$ is a smooth atlas of $M$, and we have smooth vector fields $v_i$ resp. smooth $k$-forms $\omega_i$ on each $U_i$ such that
$$v_i|_{U_i \cap U_j} = v_j|_{U_i \cap U_j} \qquad \text{resp.} \qquad \omega_i|_{U_i \cap U_j} = \omega_j|_{U_i \cap U_j},$$
then this uniquely determines a smooth vector field resp. smooth $k$-form on $M$. In other words, smooth vector fields and $k$-forms can be described by a collection of local descriptions in the charts of an atlas. Analogous statements are, of course, true with "smooth" replaced by $C^r$.
2.4 Products and functoriality
For a smooth vector field $v$ and a smooth $k$-form $\omega$ on $M$, and a smooth function $f: M \to \mathbb{R}$, the product $fv$ is again a smooth vector field, and $f\omega$ is a smooth $k$-form (here the products are evaluated point-wise, i.e. $(fv)(x) = f(x)v(x)$ for all $x \in M$, and similarly for the differential form). Additionally, for a smooth $\ell$-form $\eta$ on $M$, we have a smooth $(k + \ell)$-form $\omega \wedge \eta$ defined using the exterior product 3.6 of Chapter 11:
$$(\omega \wedge \eta)(x) = \omega(x) \wedge \eta(x) \in \Lambda^{k+\ell}((TM_x)^*).$$
There are, also, analogous statements for $C^r$.
Now let $f: M \to N$ be a smooth map. Using the maps constructed at the end of 2.2, for a smooth $k$-form $\omega$ on $N$, we obtain a smooth $k$-form $f^*\omega$ on $M$. Explicitly, for $x \in M$,
$$(f^*\omega)(x) = \Lambda^k((Df_x)^*)(\omega(f(x))) \in \Lambda^k((TM_x)^*).$$
This correspondence, of course, sends the identity to the identity, and $(f \circ g)^*(\omega) = g^*(f^*(\omega))$. Thus, we conclude that differential forms are contravariant in smooth maps (in the sense of 1.2 of Chapter 11). There are, of course, analogous statements for smooth replaced by $C^r$.
It may be surprising that vector fields are neither covariant nor contravariant in smooth maps: One can see this by realizing that vectors are covariant, while smooth functions are contravariant. Vector fields can be made, however, covariant in diffeomorphisms: Let $f: M \to N$ be a diffeomorphism and let $v$ be a smooth vector field on $M$. Then we can define a smooth vector field $f_*v$ on $N$ by
$$(f_*v)(x) = Df_{f^{-1}(x)}(v(f^{-1}(x))) \in TN_x.$$
2.4.1 Comment
The meaning of the symbols $f^*$ and $f_*$ here is related to, but not quite the same as in 1 of Chapter 11. Note that, for example, in the current situation, $f$ is not a linear map. Nevertheless, using the same symbol in both situations is quite standard in this case.
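A standard worked example of the pullback of forms is the passage to polar coordinates. The choice of map and the coordinate names $(r, \theta)$ and $(x, y)$ are assumptions made for illustration; the computation uses that the pullback of the differential of a function is the differential of the composed function (the chain rule) and that pullback respects the exterior product.

```latex
% Example (illustrative): f : (0,\infty) \times \mathbb{R} \to \mathbb{R}^2,
% f(r,\theta) = (r\cos\theta,\ r\sin\theta).
\begin{align*}
f^{*}(dx) &= d(r\cos\theta) = \cos\theta\, dr - r\sin\theta\, d\theta, \\
f^{*}(dy) &= d(r\sin\theta) = \sin\theta\, dr + r\cos\theta\, d\theta, \\
f^{*}(dx \wedge dy) &= f^{*}(dx) \wedge f^{*}(dy)
  = r\cos^{2}\theta\, dr \wedge d\theta - r\sin^{2}\theta\, d\theta \wedge dr
  = r\, dr \wedge d\theta .
\end{align*}
```

The factor $r$ is the determinant of the differential of $f$, in line with Remark 3.5 of Chapter 11 on alternating maps of top degree.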
2.5 A Slice Theorem
The attentive reader has noticed a similarity of this material with our remarks on substitution in differential equations. In fact, much of what we observed in Section 7 of Chapter 6 can be done coordinate-free. Let us make this concrete in one aspect, which will be instructive as a contrast with what we will do with differential forms:
Proposition. Let $v$ be a smooth vector field on a smooth $n$-manifold $M$, and suppose $v(x) \ne 0$ for all $x \in M$ (we speak of a non-vanishing vector field). Let $x \in M$. Then there exists a coordinate system $h: U \to V$ at $x$ such that the vector field $h_*(v|_U)$ on $V$ is constant and equal to $(1, 0, \dots, 0) \in \mathbb{R}^n$ (using the identification from the end of 2.1).
Proof. Let $k: U_1 \to V_1$ be any coordinate system at $x$ and consider the vector field $k_*v$. We can treat this vector field as a system of differential equations on $V_1$: For a smooth function $f: (a, b) \to V_1$, the equation is
$$f'(t) = \sum_{i=1}^{n} (k_*v)(f(t))_i\, \frac{\partial}{\partial x_i}, \qquad (*)$$
that is, $f_i'(t) = (k_*v)(f(t))_i$ for $i = 1, \dots, n$. Now we know that this system has a smooth solution in a neighborhood of a point of $V_1$. Specifically, consider vectors $w_2, \dots, w_n \in \mathbb{R}^n$ such that
$$(k_*v)(k(x)), w_2, \dots, w_n \text{ are linearly independent (hence form a basis of } \mathbb{R}^n\text{)}. \qquad (**)$$
Then by Theorem 4.1 of Chapter 6, there exists an open neighborhood $V_2 \subseteq \mathbb{R}^n$ of $o$ and a smooth map $\phi: V_2 \to V_1$ such that
$$\phi(0, a_2, \dots, a_n) = k(x) + \sum_{i=2}^{n} a_i w_i$$
and for any constants $a_2, \dots, a_n$ such that $(t, a_2, \dots, a_n) \in V_2$, the function
$$f(t) = \phi(t, a_2, \dots, a_n)$$
satisfies the equation (*). Additionally, by our assumption (**), the map $\phi$ is regular at 0, so by the Inverse Function Theorem 7.3 of Chapter 3, there exists an open neighborhood $V$ of 0 such that the restriction $\phi|_V: V \to \phi[V]$ is a diffeomorphism. Now put $U = k^{-1}[\phi[V]]$ and $h = (\phi^{-1} \circ k)|_U$. $\square$
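A simple instance of the proposition is the rotation field in the plane, which polar coordinates straighten out. The example, including the choice of the right half-plane as coordinate neighborhood, is an illustrative assumption and not taken from the original text.

```latex
% Example (illustrative): on U = \{(x,y) \in \mathbb{R}^2 \mid x > 0\} let
\[
v = -y\,\frac{\partial}{\partial x} + x\,\frac{\partial}{\partial y},
\qquad
h(x,y) = (\theta, r) = \big(\arctan(y/x),\ \sqrt{x^{2}+y^{2}}\big).
\]
% The solutions of f'(t) = v(f(t)) are f(t) = (r_0\cos(t+\theta_0),\ r_0\sin(t+\theta_0)),
% along which \theta increases with unit speed and r stays constant. Therefore
\[
h_{*}(v|_{U}) = (1, 0),
\]
% i.e. in the coordinate system h the field v is the constant field \partial/\partial\theta.
```

The coordinate lines of $h$ in the first variable are exactly the solution curves of the field, which is the idea behind the proof.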
3 The exterior derivative and integration of differential forms
3.1 The exterior derivative
The set of all smooth $k$-forms on a smooth $n$-manifold $M$ is clearly a vector space over $\mathbb{R}$; it is denoted by $\Omega^k(M)$. We will now construct a linear map
$$d: \Omega^k(M) \to \Omega^{k+1}(M). \qquad (1)$$
In terms of a smooth coordinate system $h: U \to V$, one writes
$$d\Big( \sum_{1 \le i_1 < \dots < i_k \le n} f_{i_1, \dots, i_k}\, dh_{i_1} \wedge \dots \wedge dh_{i_k} \Big) = \sum_{1 \le i_1 < \dots < i_k \le n} df_{i_1, \dots, i_k} \wedge dh_{i_1} \wedge \dots \wedge dh_{i_k} \qquad (2)$$
$$= \sum_{1 \le i_1 < \dots < i_k \le n}\ \sum_{j=1}^{n} \frac{\partial f_{i_1, \dots, i_k}}{\partial h_j}\, dh_j \wedge dh_{i_1} \wedge \dots \wedge dh_{i_k}.$$
Lemma. The formula (2) does not depend on the choice of coordinate system.
Proof. One first notices that for a smooth function $f$, $df$ is independent of coordinate system by the chain rule, and that for smooth functions $f$, $g$, one has the Leibniz rule
$$d(fg) = f\,dg + g\,df.$$
Now let g W U ! W be another coordinate system. By the chain rule, we have dhi D
n X @hi dgj : @gj j D1
Now differentiating fi1 ;:::;ik dhi1 ^ ^ dhik
(3)
in the h-coordinate system and converting to the g-coordinate system, we obtain X
dfi1 ;:::;ik
@hi1 @hik ::: @gj1 @gjk
dgj1 ^ ^ dgjk
(4)
where the sum is over all possible choices 1 j1 ; : : : ; jk n (the numbers jp do not have to form an increasing sequence in p). Now converting (3) to the g-coordinate system first, we obtain X
fi1 ;:::;ik
@hi1 @hik ::: @gj1 @gjk
dgj1 ^ ^ dgjk :
(5)
Now differentiating (5) in the g-coordinate system, we must form @hik @hi1 ::: d fi1 ;:::;ik @gj1 @gjk
(6)
and then multiply by dgj1 ^ ^ dgjk . However, by the Leibniz rule, we may differentiate fi1 ;:::;ik and the partial derivative factors separately, and the key point is that when we differentiate @hip ; @gjp we get a double partial derivative @2 hip : @gjp @gjp0 In the resulting sum, however, each such term will appear twice, with the attached dgjp and dgjp0 terms swapped. Thus, by the rules of computation in the exterior algebra, the two terms in each such pair appear with opposite signs, and hence cancel out. t u
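As a small worked illustration of the lemma (an added example, not part of the original argument), consider the open set $U=\{(x,y)\in\mathbb{R}^2 \mid x>0\}$ with the Cartesian coordinates $(x,y)$ and the polar coordinates $(r,\theta)$, $x=r\cos\theta$, $y=r\sin\theta$, and the 1-form $\omega=x\,dy-y\,dx$. Computing by formula (2) in Cartesian coordinates gives

$$d\omega=dx\wedge dy-dy\wedge dx=2\,dx\wedge dy.$$

In polar coordinates, $dx=\cos\theta\,dr-r\sin\theta\,d\theta$ and $dy=\sin\theta\,dr+r\cos\theta\,d\theta$, so

$$\omega=r^2\,d\theta,\qquad d\omega=2r\,dr\wedge d\theta.$$

The two answers agree, since

$$dx\wedge dy=(\cos\theta\,dr-r\sin\theta\,d\theta)\wedge(\sin\theta\,dr+r\cos\theta\,d\theta)=r\cos^2\theta\,dr\wedge d\theta-r\sin^2\theta\,d\theta\wedge dr=r\,dr\wedge d\theta.$$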
3.2 The de Rham complex, de Rham cohomology and Betti numbers
Lemma. We have

$$d\circ d=0:\Omega^k(M)\to\Omega^{k+2}(M).$$

Proof. Using formula (2) from 3.1 in a coordinate system $h:U\to V$, the $(k+2)$-form

$$d\Big(d\Big(\sum_{1\le i_1<\dots<i_k\le n} f_{i_1,\dots,i_k}\,dh_{i_1}\wedge\dots\wedge dh_{i_k}\Big)\Big)$$

is a sum of expressions of the form

$$\frac{\partial^2 f_{i_1,\dots,i_k}}{\partial h_j\,\partial h_\ell}\,dh_\ell\wedge dh_j\wedge dh_{i_1}\wedge\dots\wedge dh_{i_k},$$

but each of these terms appears twice with $j$ and $\ell$ in opposite orders, and therefore with opposite signs, and hence the entire expression vanishes (of course, the terms with $j=\ell$ vanish immediately). □

We therefore obtain a sequence of vector spaces and linear maps

$$\Omega^0(M)\xrightarrow{\ d\ }\Omega^1(M)\xrightarrow{\ d\ }\cdots\xrightarrow{\ d\ }\Omega^n(M) \tag{*}$$

(note that we can only have $\Omega^k(M)\neq 0$ for $0\le k\le n$) such that

$$d\circ d=0.$$

The sequence (*) is called the de Rham complex of the smooth manifold $M$, and is denoted by $\Omega^*(M)$. A $k$-form $\omega$ is called closed if $d\omega=0$ and is called exact if there exists a $(k-1)$-form $\eta$ such that

$$\omega=d\eta.$$

(We consider the 0-form $0$ exact.) Then the set of all closed $k$-forms is a vector subspace of $\Omega^k(M)$ which is denoted by $Z^k(M)$, and the set of all exact $k$-forms is then a vector subspace of $Z^k(M)$ which is denoted by $B^k(M)$.
The quotient $\mathbb{R}$-vector space

$$H^k_{DR}(M)=Z^k(M)/B^k(M)$$

is called the $k$'th de Rham cohomology vector space of $M$. We write

$$b_k(M)=\dim(H^k_{DR}(M))$$

and call this the $k$'th Betti number of $M$ (it can, of course, be infinite, see Exercise (17)). Betti numbers are fundamental characteristics of manifolds. For example, they are computable in practice, and they turn out to be topological invariants, which means that two homeomorphic manifolds have the same Betti numbers. Also, Betti numbers can be defined for topological manifolds, and in fact, for all topological spaces. This leads to an area of mathematics called algebraic topology (see, for example, [3, 13, 14, 20]). Unfortunately, in this text, a systematic treatment of Betti numbers would take us too far afield, and we will confine ourselves to a few basic exercises (Exercises (11), (12), (13), (14), (15), (16), (17)).
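To make the distinction between closed and exact forms concrete, here is a standard example (added for illustration, it is not taken from the text above). On $M=\mathbb{R}^2\smallsetminus\{0\}$ consider the 1-form

$$\omega=\frac{-y\,dx+x\,dy}{x^2+y^2}.$$

It is closed, since

$$\frac{\partial}{\partial x}\Big(\frac{x}{x^2+y^2}\Big)=\frac{y^2-x^2}{(x^2+y^2)^2}=\frac{\partial}{\partial y}\Big(\frac{-y}{x^2+y^2}\Big),$$

so the coefficient of $dx\wedge dy$ in $d\omega$ vanishes. It is not exact: pulling $\omega$ back along $\phi(t)=(\cos t,\sin t)$ gives $\phi^*\omega=dt$, whose integral over $\langle 0,2\pi\rangle$ is $2\pi$, whereas for $\omega=df$ the same integral would equal $f(\phi(2\pi))-f(\phi(0))=0$. Hence $\omega$ represents a non-zero element of $H^1_{DR}(M)$, and $b_1(\mathbb{R}^2\smallsetminus\{0\})\ge 1$.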
4 Integration of differential forms and Stokes' Theorem

4.1 Orientation of smooth manifolds
Let $M$ be a smooth $n$-manifold. An orientation of $M$ is a choice of orientation of the space $(TM_x)^*$ for each $x\in M$ such that for each $x\in M$ there exists a coordinate system $h:U\to V$ at $x$ for which the orientation is constant when we use the identification from the end of 2.1. Two orientations of $M$ are considered equal if they are equal at every point $x\in M$. An orientation may not exist (see Exercise (18) below). A smooth manifold for which there exists an orientation is called orientable. Recall from 3.5 of Chapter 11 that a non-zero element of $\Lambda^n((TM_x)^*)$ determines an orientation of $(TM_x)^*$. Hence a form $\omega\in\Omega^n(M)$ such that $\omega(x)\neq 0$ for all $x\in M$ determines an orientation of $M$.

Lemma. Every orientation of a smooth $n$-manifold $M$ is determined by a form $\omega\in\Omega^n(M)$ such that $\omega(x)\neq 0$ for all $x\in M$ (which is then often called the volume form). Moreover, two such forms $\omega,\eta$ determine the same orientation if and only if there exists a smooth, everywhere positive (in particular nowhere vanishing) function $k:M\to\mathbb{R}$ such that $\omega=k\eta$.

Proof. To prove the first statement (existence), take a smooth atlas $(U_i,h_i)$ such that a form $\omega_i$ as required exists for the restriction of our orientation to $U_i$ (such an atlas exists by the definition of orientation). Now take a smooth partition of unity $u_i$ subordinate to the cover $U_i$, and put

$$\omega=\sum_i u_i\omega_i.$$

To prove the second statement, let $\omega$, $\eta$ determine the same orientation. Choose a smooth atlas $(U_i,h_i)$. Then $\omega|_{U_i}=f_i\,dh_1\wedge\dots\wedge dh_n$, $\eta|_{U_i}=g_i\,dh_1\wedge\dots\wedge dh_n$. Define $k(x)=f_i(x)/g_i(x)$ when $x\in U_i$. This does not depend on the chart, since $\eta(x)\neq 0$ determines $k(x)$ uniquely, and $k$ is positive because $\omega$ and $\eta$ determine the same orientation. □

Let $M$, $N$ be oriented $n$-manifolds and let $\omega$ be a volume form on $N$ specifying the orientation. We say that a diffeomorphism $\phi:M\to N$ preserves orientation if the volume form $\phi^*\omega$ specifies the given orientation on $M$.
4.2 Integration
Let $\omega$ be a smooth $n$-form on $\mathbb{R}^n$. Then we may write $\omega=f\,dx_1\wedge\dots\wedge dx_n$ for a smooth function $f:\mathbb{R}^n\to\mathbb{R}$. Let $B\subseteq\mathbb{R}^n$ be a Borel set such that $\overline{B}$ is compact. Define

$$\int_B\omega=\int_B f\,dx_1\dots dx_n. \tag{1}$$

Now let $M$ be a smooth oriented $n$-manifold and let $\omega$ be a smooth $n$-form on $M$. Let $B\subseteq M$ be a Borel set such that $\overline{B}$ is compact. Then there exists a smooth atlas $(U_i,h_i)_{i\in I}$ of $M$ such that $h_i$ preserves orientation if we take the standard orientation $dx_1\wedge\dots\wedge dx_n$ on $h_i[U_i]$, and such that there exists a finite subset $F\subseteq I$ where $U_i\cap B=\emptyset$ for $i\notin F$ (take an orientation-preserving atlas, choose a finite subcover containing $\overline{B}$ and intersect the remaining charts with $M\smallsetminus\overline{B}$). Let $u_i$ be a smooth partition of unity subordinate to the cover $U_i$. Now put

$$\int_B\omega=\sum_{i\in F}\int_{h_i[B\cap U_i]}u_i\,(h_i^{-1})^*\omega \tag{2}$$

(recall 2.4).

Lemma. The number (2) does not depend on the choice of the atlas $(U_i,h_i)$ (subject to the given conditions).

Proof. First note that if $U,V\subseteq\mathbb{R}^n$ are open sets such that $B\subseteq U$, $\phi:U\to V$ is an orientation-preserving diffeomorphism, and $\omega$ is a smooth $n$-form, then

$$\int_B\omega=\int_{\phi[B]}(\phi^{-1})^*\omega \tag{3}$$

as defined by (1), by the Substitution Theorem 7.9 of Chapter 5.
Now let $(U_i,h_i)_{i\in I}$, $(U'_i,h'_i)_{i\in I'}$ be two atlases as in the statement of the lemma. First, note that by the (finite) additivity of the integral, we may assume $I=I'$, $U_i=U'_i$. We may still have $h_i\neq h'_i$, but the invariance of the integral under this choice follows from (3). □

Remark: Note that our notation is slightly inconsistent. In (2), we should display the orientation of the manifold $M$. In (1), on the other hand, we assume the standard orientation of $\mathbb{R}^n$, i.e. the orientation defined by the $n$-form $dx_1\wedge\dots\wedge dx_n$. A reversal of orientation results, of course, in a reversal of sign.
4.3 Regions with corners
4.3.1 Let $M$ be an oriented smooth $n$-manifold. By a region with corners in $M$ we mean a compact subset $K\subseteq M$ such that for every $x\in K\smallsetminus K^\circ$, there exists an orientation-preserving coordinate system $h:U\to V$ at $x$ in $M$ such that $V=(-1,1)^{\times n}$ and there exists a $k\in\{0,\dots,n\}$ such that

$$h[K\cap U]=\langle 0,1)^{\times k}\times(-1,1)^{\times(n-k)} \tag{1}$$

or

$$h[K\cap U]=(-1,1)^{\times n}\smallsetminus\big((-1,0)^{\times k}\times(-1,1)^{\times(n-k)}\big). \tag{2}$$

(We use the symbol $S^{\times n}$ for the $n$-th Cartesian power of a set $S$ here to reduce the chance of confusion.) A special case worth pointing out is the case when one always has $k\le 1$. In this case, we call $K$ a compact $n$-dimensional submanifold with boundary. Note that then our coordinate system gives $K\smallsetminus K^\circ$ the structure of an $(n-1)$-dimensional compact submanifold of $M$.
4.3.2 Integrating over the boundary

Now let $\eta$ be a smooth $(n-1)$-form on $M$. Consider an atlas $(U_i,h_i)_{i\in I}$ of $M$ such that there exists a finite subset $F\subseteq I$ where $K\cap U_i=\emptyset$ when $i\notin F$, and $(U_i,h_i)$ satisfy (1) or (2) when $i\in F$. Let $u_i$ be a smooth partition of unity subordinate to the cover $U_i$. Let $F_1$, resp. $F_2$ denote the set of all $i\in F$ for which $h=h_i$ satisfies (1) (resp. (2)). Denote by

$$c_j:\mathbb{R}^{n-1}\to\mathbb{R}^n$$

the map given by

$$(x_1,\dots,x_{n-1})\mapsto(x_1,\dots,x_{j-1},0,x_j,\dots,x_{n-1}).$$

Then define

$$\begin{aligned}
\int_{\partial K}\eta={}&\sum_{i\in F_1}\sum_{j=1}^{k}(-1)^j\int_{\langle 0,1)^{\times(k-1)}\times(-1,1)^{\times(n-k)}}u_i\,(h_i^{-1}\circ c_j)^*\eta\\
&+\sum_{i\in F_2}\sum_{j=1}^{k}(-1)^j\int_{(-1,0\rangle^{\times(k-1)}\times(-1,1)^{\times(n-k)}}u_i\,(h_i^{-1}\circ c_j)^*\eta.
\end{aligned} \tag{*}$$
It can be proved that the expression (*) does not depend on the choice of atlas with the properties required above. However, this is a bit tedious and we will omit the proof, as it is not needed for proving Stokes' Theorem. When stating the theorem in the next paragraph, we will simply assume that an atlas as above has been chosen. It is worth noting, however, that in the special case of a compact $n$-dimensional submanifold with boundary, it follows that the integral defined by (*) coincides with

$$\int_{\partial K}\iota^*\eta$$

where $\iota:\partial K\to M$ is the inclusion of a submanifold, as discussed above. Note however that then one must be careful about orientation. The correct orientation at a point $x\in\partial K$, $x\in U_i$, is by

$$-(h_i^{-1})^*(dx_2\wedge\dots\wedge dx_n)\in\Lambda^{n-1}\big((T(\partial K)_x)^*\big).$$

(The minus sign comes from the fact that the added first vector of the ordered basis representing the orientation of $TM_x$ should point "outside" from the boundary, which, in our setup, happens to be in the negative direction.)

4.4 Theorem. (Stokes' Theorem) Let $M$ be a smooth $n$-manifold and let $\eta\in\Omega^{n-1}(M)$. Let $K$ be a region with corners in $M$ and let $(U_i,h_i)$ and $u_i$ be chosen as in 4.3. Then

$$\int_{\partial K}\eta=\int_K d\eta. \tag{*}$$
Proof. The statement and the proof are both straightforward generalizations of our treatment of Green's Theorem. (In fact, the part of the proof dealing with substitutions becomes simpler, since Stokes' Theorem is stated in terms of differential forms, which are contravariant.) Again, the key step is to prove the result for the case of a cube: $M=\mathbb{R}^n$,

$$K=\prod_{j=1}^{n}\langle a_j,b_j\rangle,$$

$a_j<b_j$. In this case, suppose without loss of generality that

$$\eta=f\,dx_1\wedge\dots\wedge dx_{j-1}\wedge dx_{j+1}\wedge\dots\wedge dx_n.$$

Then

$$d\eta=(-1)^{j+1}\frac{\partial f}{\partial x_j}\,dx_1\wedge\dots\wedge dx_n.$$

Then by Fubini's Theorem and the Fundamental Theorem of Calculus in one variable,

$$\int_K d\eta=\int_{\prod_{\ell\neq j}\langle a_\ell,b_\ell\rangle}(-1)^{j+1}\big(f(x_1,\dots,x_{j-1},b_j,x_{j+1},\dots,x_n)-f(x_1,\dots,x_{j-1},a_j,x_{j+1},\dots,x_n)\big)\,dx_1\cdots dx_{j-1}\,dx_{j+1}\cdots dx_n=\int_{\partial K}\eta.$$

(Note that on the right-hand side, the summands corresponding to coordinates other than the $j$'th coordinate vanish.)

Now in the general case, one proves the theorem by considering each of the summands of 4.3.2 (*) separately, applying the case of the cube to the smooth $(n-1)$-form

$$u_i\,(h_i^{-1}\circ c_j)^*\eta.$$

When $i\in F_1$, one uses the cube

$$\langle 0,1\rangle^{\times k}\times\langle -1,1\rangle^{\times(n-k)}.$$

When $i\in F_2$, one sums over the cubes

$$\langle -1,0\rangle^{\times(\ell-1)}\times\langle 0,1\rangle\times\langle -1,1\rangle^{\times(n-\ell)}$$

with $\ell=1,\dots,k$. Again, the summands not relevant to the statement are $0$ or appear twice with opposite signs. □
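The cube case can be sanity-checked numerically. The following is a minimal sketch for $n=2$ (where Stokes' Theorem reduces to Green's Theorem), assuming a hypothetical test form $\eta=P\,dx_1+Q\,dx_2$ on the rectangle $\langle 0,1\rangle\times\langle 0,2\rangle$ and a simple midpoint rule; the function names and the chosen form are illustrative and not part of the text.

```python
import math

def integrate_1d(func, a, b, steps=400):
    """Midpoint-rule approximation of the integral of func over <a, b>."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width

def boundary_integral(P, Q, a1, b1, a2, b2):
    """Integral of eta = P dx1 + Q dx2 over the boundary of the rectangle,
    traversed counterclockwise (the orientation induced by dx1 ^ dx2)."""
    bottom = integrate_1d(lambda x: P(x, a2), a1, b1)
    right = integrate_1d(lambda y: Q(b1, y), a2, b2)
    top = -integrate_1d(lambda x: P(x, b2), a1, b1)
    left = -integrate_1d(lambda y: Q(a1, y), a2, b2)
    return bottom + right + top + left

def interior_integral(dQ_dx, dP_dy, a1, b1, a2, b2):
    """Integral of d(eta) = (dQ/dx1 - dP/dx2) dx1 ^ dx2 over the rectangle."""
    def inner(x):
        return integrate_1d(lambda y: dQ_dx(x, y) - dP_dy(x, y), a2, b2)
    return integrate_1d(inner, a1, b1)

# Hypothetical test form: eta = -x2^3 dx1 + sin(x1) * x2 dx2.
P = lambda x, y: -y ** 3
Q = lambda x, y: math.sin(x) * y
dP_dy = lambda x, y: -3 * y ** 2       # partial derivative of P in x2
dQ_dx = lambda x, y: math.cos(x) * y   # partial derivative of Q in x1

lhs = boundary_integral(P, Q, 0.0, 1.0, 0.0, 2.0)
rhs = interior_integral(dQ_dx, dP_dy, 0.0, 1.0, 0.0, 2.0)
print(lhs, rhs)
```

Both printed numbers should agree up to the discretization error of the midpoint rule; for this particular form the exact common value is $2\sin(1)+8$.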
4.5 Three special cases: grad, div and curl
On open submanifolds $U\subseteq\mathbb{R}^n$, smooth 1-forms are identified with $\mathbb{R}^n$-valued functions. The differential

$$d:\Omega^0(U)\to\Omega^1(U)$$

is then identified with a map grad from the space of smooth functions on $U$ to the space of $\mathbb{R}^n$-valued smooth functions (or, equivalently, $n$-tuples of smooth functions) on $U$. The corresponding case of the Stokes Theorem is the "Fundamental Theorem of Line Integrals" which says that for an oriented piecewise smooth curve $L$ represented by $\phi:\langle a,b\rangle\to\mathbb{R}^n$, we have

$$(\mathrm{II})\int_L\operatorname{grad}(f)=f(\phi(b))-f(\phi(a)). \tag{*}$$

(Note that our current setup is slightly different: to get a special case of Theorem 4.4, we would have to formulate (*) on smooth 1-manifolds rather than piecewise smooth curves, but both statements are equally easy to prove, see Exercise (20).)

Smooth $(n-1)$-forms can also be identified with smooth 1-forms, and smooth $n$-forms can be identified with smooth functions, using the Hodge $*$-operator. For a function $F:U\to\mathbb{R}^n$, denote by $\omega_F$ the 1-form $\sum F_i\,dx_i$. Then we put

$$\operatorname{div}(F)=*(d(*\omega_F)),$$

and for a region with corners $K\subseteq U$, we put

$$\int_{\partial K}F=\int_{\partial K}*\omega_F.$$

(In this form, this integral is also known as flux.) Then the Stokes Theorem takes the form

$$\int_{\partial K}F=\int_K\operatorname{div}(F).$$

When $n=3$, one also denotes by $\operatorname{curl}(F)$ the $\mathbb{R}^3$-valued function associated with the 1-form

$$*(d\omega_F).$$

In coordinates, we obtain

$$\operatorname{curl}(F)=\Big(\frac{\partial F_3}{\partial x_2}-\frac{\partial F_2}{\partial x_3},\ \frac{\partial F_1}{\partial x_3}-\frac{\partial F_3}{\partial x_1},\ \frac{\partial F_2}{\partial x_1}-\frac{\partial F_1}{\partial x_2}\Big).$$
Let $M$ be a 2-dimensional submanifold of $\mathbb{R}^3$, let $K$ be a region with corners in $M$ and let $F$ be an $\mathbb{R}^3$-valued function defined in an open subset of $\mathbb{R}^3$ containing $M$. Then the Stokes Theorem takes on the form

$$\int_K\operatorname{curl}(F)=\int_{\partial K}F.$$

Observe that the right-hand side may be interpreted as a sum of line integrals of the second kind.
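The Fundamental Theorem of Line Integrals from this section lends itself to a quick numerical check. The sketch below assumes a hypothetical function $f(x,y,z)=xy+\sin z$ and a helix as the curve; the names `f`, `grad_f`, `curve` and the midpoint rule are illustrative choices, not taken from the text.

```python
import math

def f(x, y, z):
    """Hypothetical smooth test function on R^3."""
    return x * y + math.sin(z)

def grad_f(x, y, z):
    """Gradient of f, computed by hand."""
    return (y, x, math.cos(z))

def curve(t):
    """A helix phi: <0, 1> -> R^3 representing the curve L."""
    return (math.cos(t), math.sin(t), 2.0 * t)

def curve_velocity(t):
    """Derivative phi'(t) of the helix."""
    return (-math.sin(t), math.cos(t), 2.0)

def line_integral_second_kind(field, phi, dphi, a, b, steps=2000):
    """Midpoint-rule approximation of the integral of field(phi(t)) . phi'(t) dt over <a, b>."""
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * width
        value = field(*phi(t))
        velocity = dphi(t)
        total += sum(v * w for v, w in zip(value, velocity)) * width
    return total

a, b = 0.0, 1.0
lhs = line_integral_second_kind(grad_f, curve, curve_velocity, a, b)
rhs = f(*curve(b)) - f(*curve(a))
print(lhs, rhs)
```

The two printed values should agree up to the quadrature error, regardless of which smooth curve joins the two endpoints.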
5 Exercises
(1) Prove that the definitions of a manifold and a smooth manifold would remain equivalent if we require the coordinate maps $h_x$ to be homeomorphisms.

(2) Prove in detail that the definition given in Example 1.4 (4) really specifies a smooth manifold and that the inclusion $S^n\subseteq\mathbb{R}^{n+1}$ is a $C^\infty$-map.

(3) Prove that the function used in the proof of Theorem 1.5 is smooth.

(4) Recall the example of the manifold $S^n$ from the last section. For $x\in S^n$, construct an isomorphism of vector spaces

$$\phi_x:T(S^n)_x\cong\{w\in\mathbb{R}^{n+1}\mid x\cdot w=0\}$$

such that for every smooth map $f:\mathbb{R}^{n+1}\to\mathbb{R}^{n+1}$ which satisfies $f[S^n]\subseteq S^n$ we have a commutative diagram

$$\begin{array}{ccc}
T(S^n)_x & \xrightarrow{\ \phi_x\ } & \mathbb{R}^{n+1}\\
{\scriptstyle D(f|_{S^n})}\downarrow & & \downarrow{\scriptstyle Df}\\
T(S^n)_{f(x)} & \xrightarrow{\ \phi_{f(x)}\ } & \mathbb{R}^{n+1}.
\end{array}$$

(5) Recall the notion of Lie bracket of smooth vector fields from 7.5 of Chapter 6. Let us generalize this notion to vector fields on manifolds. In other words, let $u$, $v$ be vector fields which on some open set $U$ with smooth coordinates $h_1,\dots,h_n$ are given by

$$u=\sum_{i=1}^{n}f_i\frac{\partial}{\partial h_i},\qquad v=\sum_{i=1}^{n}g_i\frac{\partial}{\partial h_i}$$

for smooth functions $f_i$, $g_i$. Define

$$[u,v]=\sum_{i,j=1}^{n}\Big(f_i\frac{\partial g_j}{\partial h_i}-g_i\frac{\partial f_j}{\partial h_i}\Big)\frac{\partial}{\partial h_j}.$$
Prove that this is a well-defined operation on smooth vector fields on a smooth manifold $M$, and that it satisfies (7.6.1) and (7.6.2) of Chapter 6.

(6) A Lie group is a smooth manifold $G$ which is also a group (see B.3.1), such that the operations of multiplication $\cdot:G\times G\to G$ and inverse $(\,)^{-1}:G\to G$ are smooth maps (see 1.4, (5)). Prove that the groups $GL_n(\mathbb{R})$, $GL_n(\mathbb{C})$ (see Appendix B, Exercise (6)) are open subsets of the real vector spaces of all $n\times n$ real (resp. complex) matrices, and are Lie groups by considering their group structure and the induced smooth manifold structure.

(7) Let $G$ be a Lie group. A vector field $v$ on the manifold $G$ is called left invariant if $(DL_g)(v(e))=v(g)$ for every $g\in G$ where $e$ is the unit element and $L_g:G\to G$ is the diffeomorphism given by left multiplication by $g$. Prove that the $\mathbb{R}$-vector space of left invariant vector fields on $G$ is isomorphic to $TG_e$ by

$$v\mapsto v(e).$$

(8) Prove that if $G$ is a Lie group, then the vector space $\mathfrak{g}$ of left-invariant smooth vector fields on $G$ forms a sub-algebra of the Lie algebra of all smooth vector fields discussed in Exercise (5) in the sense that the Lie bracket of two left invariant vector fields is left invariant. This $\mathfrak{g}$ is called the Lie algebra associated with the Lie group $G$, and can be shown to encode a large part of the Lie group structure of $G$. (For further reading, see for example [9, 10].)

(9) Find two smooth 1-forms $\omega$, $\eta$ on $\mathbb{R}^2$ such that for every $x\in\mathbb{R}^2$ we have $\omega(x),\eta(x)\neq 0$ and there does not exist any non-empty open set $U\subseteq\mathbb{R}^2$ and $\phi:U\to\mathbb{R}^2$ with $\phi^*\eta=\omega|_U$. Compare with 2.5. [Hint: use the exterior derivative.]

(10) Prove that for a smooth $k$-form $\omega$ and a smooth $\ell$-form $\eta$,

$$d(\omega\wedge\eta)=(d\omega)\wedge\eta+(-1)^k\omega\wedge d\eta.$$

(11) Generalize the proof of Lemma 3.1 to prove that for a smooth map $f:M\to N$ and a smooth $k$-form $\omega\in\Omega^k(N)$, we have

$$d(f^*\omega)=f^*(d\omega).$$

Conclude that we have a canonical linear map

$$f^*:Z^k(N)\to Z^k(M)$$
which restricts to

$$f^*:B^k(N)\to B^k(M)$$

and hence determines a linear map

$$f^*:H^k_{DR}(N)\to H^k_{DR}(M).$$

Hence, de Rham cohomology is contravariant in smooth maps.

(12) Prove that diffeomorphic smooth manifolds have the same Betti numbers [Hint: use Exercise (11)].

(13) Note that a smooth 0-form is the same thing as a smooth function. Prove that a smooth 0-form is closed if and only if it is locally constant. Conclude that $b_0(M)$ is the number of connected components of $M$.

(14) Let $\phi:\mathbb{R}\to S^1$ be the smooth map defined by

$$\phi(t)=(\cos(t),\sin(t)).$$

Prove that a smooth 1-form $f\,dx$ on $\mathbb{R}$ is equal to $\phi^*\omega$ for some $\omega\in\Omega^1(S^1)$ if and only if $f$ is a smooth periodic function with period $2\pi$. Prove that $\omega$ is exact if and only if

$$\int_0^{2\pi}f(x)\,dx=0.$$

Conclude that $b_1(S^1)=1$. Conclude also that $b_1(S^1\times S^1)\neq 0$. [Hint: Consider the smooth map $S^1\times S^1\to S^1$ given by $(x,y)\mapsto x$ and the smooth map $S^1\to S^1\times S^1$ given by $x\mapsto(x,a)$ for some constant $a$. Use Exercise (11).]

(15) Prove that $b_n(\mathbb{R}^n)=0$ (this is a special case of the Poincaré lemma, which says that $b_i(\mathbb{R}^n)=0$ for $i>0$). [Hint: writing an $\omega\in\Omega^n(\mathbb{R}^n)$ as $f\,dx_1\wedge\dots\wedge dx_n$, put

$$g(x_1,\dots,x_n)=\int_0^{x_1}f(t,x_2,\dots,x_n)\,dt$$

and consider the form $g\,dx_2\wedge\dots\wedge dx_n$.]

(16) Let $\omega\in Z^1(S^2)$. Let $U^+=S^2\smallsetminus\{(0,0,1)\}$, $U^-=S^2\smallsetminus\{(0,0,-1)\}$, $U=U^+\cap U^-$. Then $U^+$ and $U^-$ are diffeomorphic to $\mathbb{R}^2$, so by Exercise (15), there exist smooth 0-forms (i.e. smooth functions) $f:U^+\to\mathbb{R}$, $g:U^-\to\mathbb{R}$ such that $df=\omega|_{U^+}$, $dg=\omega|_{U^-}$. Additionally, $d(f|_U-g|_U)=0$ and hence $f|_U-g|_U$ is locally constant and hence constant, since $U$ is connected. Let $c=f|_U-g|_U$. Define a function $h:S^2\to\mathbb{R}$ by $h(x)=f(x)$ for $x\in U^+$,
$h(x)=g(x)+c$ for $x\in U^-$. Prove that $dh=\omega$, and hence $b_1(S^2)=0$. Conclude (see Exercise (14)) that the smooth manifolds $S^2$ and $S^1\times S^1$ are not diffeomorphic, an intuitively obvious, but highly non-trivial fact.

(17) Prove that $b_1(\mathbb{C}\smallsetminus\mathbb{Z})=\infty$.

(18) The Möbius strip. Consider here $S^1$ as the unit circle in $\mathbb{C}$. Let

$$M=\{(x,z)\in\mathbb{C}\times S^1\mid x^2/z\in\mathbb{R}\}.$$

Prove that $M$ is not orientable. [Hint: Consider the immersion and submersion $f:\mathbb{R}\times\mathbb{R}\to M$ given by $(x,t)\mapsto(xe^{\pi it},e^{2\pi it})$. Prove that a 2-form $h\,dx\wedge dy\in\Omega^2(\mathbb{R}^2)$ of the form $f^*\omega$ for a 2-form $\omega\in\Omega^2(M)$ must satisfy $h(0,1)=-h(0,0)$ and that therefore $\omega$ cannot be nowhere vanishing.]

(19) Consider, on $S^{n-1}$, the smooth $(n-1)$-form

$$\omega=\sum_{i=1}^{n}(-1)^{i+1}x_i\,dx_1\wedge\dots\wedge dx_{i-1}\wedge dx_{i+1}\wedge\dots\wedge dx_n.$$

Prove that

$$\int_{S^{n-1}}\omega\neq 0.$$

Conclude that $b_{n-1}(S^{n-1})\ge 1$. [Hint: use Stokes' Theorem, the Hodge $*$ operator and spherical coordinates.]

(20) Prove the Fundamental Theorem of Line Integrals, 4.5 (*). [Hint: After composing with the map $\phi$, it becomes essentially a special case of the Fundamental Theorem of Calculus for the Riemann integral, but a little bit of care is needed since $L$ is only piecewise smooth.]
13 Complex Analysis II: Further Topics
There are some extremely important concepts in complex analysis which we did not cover in Chapter 10, and which ultimately lead up to several other areas of mathematics. First of all, quite a bit more can be said about conformal maps. Under very general conditions, one open subset of $\mathbb{C}$ can be mapped holomorphically bijectively onto another. We prove one such result, the famous Riemann Mapping Theorem. In many situations, such maps can even be written down explicitly. Those are the Schwarz-Christoffel formulas, which have applications in cartography, as the basic condition on mappings in cartography is to be conformal (since distortion of distances in a topographical map is generally considered more allowable than distortion of angles). Yet, the Schwarz-Christoffel formulas also lead to elliptic integrals, which are "inverse" to elliptic functions (see for example [11]). A major topic not covered in Chapter 10 is the question of "multi-valued holomorphic maps" such as, for example, the natural logarithm on $\mathbb{C}\smallsetminus\{0\}$ (or, for that matter, elliptic integrals). What is the appropriate theoretical underpinning for such functions? It turns out that now is the right moment for us to study such questions, since we have already learned about manifolds. In this chapter, we will study complex manifolds of complex dimension 1, which are called Riemann surfaces. It turns out that the right way of thinking about multivalued functions on an open subset $U$ of $\mathbb{C}$ is as functions defined on a certain Riemann surface which is a covering of $U$ (not to be confused with open covers as studied in 1.1 of Chapter 9). In the process of developing this concept, we will also learn a lot more about complex integration (we will develop, for example, integration of holomorphic functions along continuous paths and will show that if two paths are homotopic, i.e. one can be continuously deformed to another, the integrals are the same).
At the same time, we will also explore striking ways in which complex differential forms behave on Riemann surfaces, which will greatly enhance our understanding of complex integration. Finally, we will see that methods of complex analysis extend even to functions which are not holomorphic, generalizing, for example, the Cauchy formula to functions which are continuously differentiable but not holomorphic. These methods will be very useful in Chapter 15 below, where we will construct compatible complex structures on oriented surfaces with Riemann metrics.
As is the case with the concept of manifolds, the study of coverings has a close connection with algebraic topology, which we will not explore here in detail. We will, however, briefly introduce the concept of the fundamental group and give two examples in Exercises (15) and (16). For more information on Riemann surfaces, we recommend the book [6], and for a very concise yet informative study of the fundamental group and coverings in an abstract topological setting, [13]. For very interesting ventures to higher dimensions, [8] may be an excellent source.
1 The Riemann Mapping Theorem
In this section, we will consider bijective holomorphic maps $f:U\to V$ where $U,V$ are open subsets of $\mathbb{C}$. Note that by Theorem 6.3.3 of Chapter 10, we must have $f'(z)\neq 0$ for $z\in U$, and hence $f^{-1}:V\to U$ is also a bijective holomorphic function. Such functions will be called holomorphic isomorphisms, and if $U=V$, holomorphic automorphisms.
1.1 Holomorphic self-maps of $\mathbb{C}$ and the unit disk
1.1.1 Proposition. The only injective holomorphic functions on $\mathbb{C}$ are $f(z)=az+b$ with $a\neq 0$.

Proof. Let us study the singularity of the function $f(1/z)$ at $z=0$. If this singularity is removable, then $f$ is bounded, and hence constant by Liouville's Theorem 5.1 of Chapter 10, contradicting our assumptions. If $f(1/z)$ has a pole of order $k>1$ at $0$, then for $\varepsilon>0$ sufficiently small, there exists, by Theorem 6.3.3 of Chapter 10, a $\delta>0$ such that $1/f(1/z)-a$ has exactly $k$ zeros in $\Omega(0,\varepsilon)$ for every $a\in\Omega(0,\delta)$. Note that these $k$ zeros may include zeros of order $>1$, but not for $\delta$ sufficiently small, since otherwise the holomorphic function $(1/f(1/z))'$ would have zeros arbitrarily close to $0$, and hence would be constantly $0$ by Theorem 4.4 of Chapter 10. However, if the $k$ zeros are all different, this contradicts injectivity of $f$. Finally, if $f(1/z)$ has an essential singularity at $0$, let $f(0)=A$. Then by the Holomorphic Open Mapping Theorem 6.3.4 of Chapter 10, for every $r>0$ there exists an $\varepsilon>0$ such that $f|_{\Omega(0,r)}$ takes on every value in $\Omega(A,\varepsilon)$. On the other hand, applying Proposition 6.2 of Chapter 10 to $f(1/z)$, we see that there are (infinitely many) $z$ with $|z|>r$ such that $f(z)\in\Omega(A,\varepsilon)$ which, again, contradicts injectivity. We have concluded that $f(1/z)$ has a pole of order 1 at $z=0$. Then the function $(f(z)-A)/z$ is holomorphic and bounded on $\mathbb{C}$, and hence is constant by Liouville's Theorem 5.1. □

It is, however, convenient to consider a slightly larger class of maps called Möbius transformations. These maps are formally defined as maps $\mathbb{C}\cup\{\infty\}\to\mathbb{C}\cup\{\infty\}$ by formulas

$$f_A(z)=\frac{az+b}{cz+d}$$

where $A$ is a matrix of complex numbers

$$A=\begin{pmatrix}a&b\\c&d\end{pmatrix},$$

and we assume $\det(A)\neq 0$. One readily verifies that

$$f_A\circ f_B=f_{AB},$$

and thus all Möbius transformations are bijective maps $\mathbb{C}\cup\{\infty\}\to\mathbb{C}\cup\{\infty\}$. We will understand that better in Section 3 below. While in the formalism we introduced, Möbius transformations with $c\neq 0$ are, by definition, not holomorphic functions on $\mathbb{C}$, they can be useful in mapping injectively holomorphically certain open subsets of $\mathbb{C}$ onto one another (see Exercises 1, 2).

1.1.2 Lemma. (Schwarz's Lemma) If $f(z)$ is a holomorphic function on $\Omega(0,1)$ which satisfies the conditions $|f(z)|\le 1$ for all $z\in\Omega(0,1)$ and $f(0)=0$, then $|f(z)|\le|z|$ for $z\in\Omega(0,1)$, and $|f'(0)|\le 1$. If additionally $|f(z_0)|=|z_0|$ for some $z_0\in\Omega(0,1)$, $z_0\neq 0$, or $|f'(0)|=1$, then $f(z)=cz$ for all $z\in\Omega(0,1)$ for some constant $c$ with $|c|=1$.

Proof. Consider the function

$$g(z)=\begin{cases}f(z)/z&\text{for } z\in\Omega(0,1)\smallsetminus\{0\},\\ f'(0)&\text{for } z=0.\end{cases}$$

This function is holomorphic on $\Omega(0,1)$ (see Exercise (14)). By the maximum principle 6.3.5 of Chapter 10, then, the maximum of the function $|g(z)|$ on $\overline{\Omega(0,r)}$ can occur only on the boundary for any $0<r<1$. By assumption, $|g(z)|\le 1/r$ for $|z|=r$, and hence also for $|z|\le r$. Passing to the limit $r\to 1$, we get $|g(z)|\le 1$ for $z\in\Omega(0,1)$, which is the first claim. If equality arises for a single point $z_0\in\Omega(0,1)$, the function $|g|$ has a maximum at that point, and hence $g$ must be constant. In order for the equality to actually arise, however, the constant must have absolute value 1. □

1.1.3 Corollary. Let $f$ be a holomorphic automorphism of $\Omega(0,1)$ such that $f(0)=0$ and $f'(0)$ is a positive real number. Then $f(z)=z$ for all $z\in\Omega(0,1)$.

Proof. First note that by Theorem 6.3.3 of Chapter 10, $f'(z)\neq 0$ for all $z\in\Omega(0,1)$, and thus $f^{-1}:\Omega(0,1)\to\Omega(0,1)$ is also a holomorphic map. Thus, Schwarz's Lemma can be applied to both $f$ and $f^{-1}$. In particular, $|f'(0)|\le 1$ and $1/|f'(0)|=|(f^{-1})'(0)|\le 1$, and hence equality must arise. Therefore, by Schwarz's Lemma again, $f(z)=cz$ with $|c|=1$, but the assumption that $f'(0)$ is a positive real number then implies $c=1$. □
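The composition rule $f_A\circ f_B=f_{AB}$ from 1.1.1 can be spot-checked numerically. The following is a minimal sketch with hypothetical sample matrices; it also illustrates, as a standard example not taken from the text, that the Möbius transformation $z\mapsto(z-i)/(z+i)$ (the Cayley transform) sends sample points of the upper half-plane into the unit disk. The helper names `mobius` and `matmul2` are illustrative.

```python
def mobius(A, z):
    """Apply f_A(z) = (a z + b) / (c z + d) for A = ((a, b), (c, d)).
    Assumes c z + d != 0, i.e. z is not sent to infinity."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)

def matmul2(A, B):
    """Product of two 2x2 complex matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

# Hypothetical sample matrices with non-zero determinant.
A = ((1 + 2j, 3), (1j, 2))      # det(A) = 2 + 1j
B = ((2, -1j), (1, 1 + 1j))     # det(B) = 2 + 3j
z = 0.3 - 0.7j

composed = mobius(A, mobius(B, z))        # f_A(f_B(z))
via_product = mobius(matmul2(A, B), z)    # f_{AB}(z)
print(composed, via_product)

# Cayley transform: maps the open upper half-plane into the open unit disk.
CAYLEY = ((1, -1j), (1, 1j))
samples = [0.5 + 0.1j, -3 + 2j, 10j]
in_disk = [abs(mobius(CAYLEY, w)) < 1 for w in samples]
print(in_disk)
```

Working with matrices rather than formulas is the design choice that makes bijectivity transparent: the inverse of $f_A$ is simply $f_{A^{-1}}$.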
1.2 The Riemann Mapping Theorem
An open subset $U\subseteq\mathbb{C}$ is called simply connected if $U$ is connected and every holomorphic function on $U$ has a primitive function.

Lemma. Let $U\subseteq\mathbb{C}$ be a simply connected open set and let $\phi:U\to\mathbb{C}\smallsetminus\{0\}$ be a holomorphic function. Then there exists a holomorphic function $\operatorname{Ln}(\phi(z))$ on $U$ such that $e^{\operatorname{Ln}(\phi(z))}=\phi(z)$ for all $z\in U$.

Proof. Since $U$ is simply connected, the function

$$\frac{\phi'(z)}{\phi(z)}$$

has a primitive function on $U$, which we will denote by $\operatorname{Ln}(\phi(z))$. This function is determined up to an additive constant. But using the chain rule and the product rule, we find that

$$\Big(\frac{e^{\operatorname{Ln}(\phi(z))}}{\phi(z)}\Big)'=0,$$

so this function is constant. The additive constant can therefore be chosen in such a way that

$$e^{\operatorname{Ln}(\phi(z))}=\phi(z).$$

□
Theorem. (The Riemann Mapping Theorem) Let $U\subsetneq\mathbb{C}$ be an open simply connected set, and let $z_0\in U$. Then there exists a unique holomorphic isomorphism $f:U\to\Omega(0,1)$ such that $f(z_0)=0$ and $f'(z_0)$ is a positive real number.

Proof. First of all, note that uniqueness follows from Corollary 1.1.3, since if $f_1$, $f_2$ were two maps satisfying the conclusion of the Theorem, then $f_2\circ f_1^{-1}$ would be a holomorphic automorphism of $\Omega(0,1)$ which fixes $0$ and has positive real derivative at $0$.

To prove existence, we will first prove that there exists an injective holomorphic map $f:U\to\Omega(0,1)$ with $f(z_0)=0$ where $f'(z_0)$ is a positive real number. In effect, let $a\notin U$. Apply Lemma 1.2 to the function $\phi(z)=z-a$. Thus, we have a function $\operatorname{Ln}(z-a)$ such that

$$e^{\operatorname{Ln}(z-a)}=z-a.$$

Now let

$$h(z)=e^{\operatorname{Ln}(z-a)/2}.$$

Then

$$h(z)^2=z-a\quad\text{on } U,$$

which means that for any $z,t\in U$ with $z\neq t$,

$$h(z)\neq\pm h(t),$$

and also $h(z)\neq -h(z)$, since $h$ has no zero. By the Holomorphic Open Mapping Theorem 6.3.4 of Chapter 10, there is an $r>0$ such that

$$\Omega(h(z_0),r)\subseteq h[U].$$

Therefore,

$$h[U]\cap\Omega(-h(z_0),r)=\emptyset.$$

This means that for $z\in U$,

$$|h(z)+h(z_0)|\ge r,$$

and in particular,

$$2|h(z_0)|\ge r.$$

Now consider the function

$$f_0(z)=\frac{r}{4}\cdot\frac{|h'(z_0)|}{|h(z_0)|^2}\cdot\frac{h(z_0)}{h'(z_0)}\cdot\frac{h(z)-h(z_0)}{h(z)+h(z_0)}.$$

First, note that the denominator is non-zero. Clearly, we have $f_0(z_0)=0$. From the chain rule, in fact,

$$f_0'(z_0)=(r/8)\cdot|h'(z_0)|/|h(z_0)|^2>0.$$

Additionally, $f_0$ is a composition of a Möbius transformation with the injective function $h$. Thus, $f_0$ is injective. Finally,

$$\left|\frac{h(z)-h(z_0)}{h(z)+h(z_0)}\right|=|h(z_0)|\cdot\left|\frac{1}{h(z_0)}-\frac{2}{h(z)+h(z_0)}\right|\le\frac{4|h(z_0)|}{r}$$

which shows that $|f_0(z)|\le 1$ for $z\in U$. Of course, strict inequality must hold by the Holomorphic Open Mapping Theorem 6.3.4 of Chapter 10.

Now let $N$ be the supremum of all the values $f'(z_0)$ over the set $S$ of all injective holomorphic functions $f:U\to\Omega(0,1)$ which satisfy $f(z_0)=0$ and $f'(z_0)>0$. (Note that a priori, one may have $N=+\infty$.) There exists, however, a sequence $(f_n)_n$ of functions in $S$ such that

$$\lim_{n\to\infty}f_n'(z_0)=N.$$

Clearly, the sequence $f_n$ is uniformly bounded and by Theorem 3.7 of Chapter 10, is also equicontinuous on every compact subset of $U$. By Theorem 6.3 of Chapter 9 (a consequence of the Arzelà-Ascoli Theorem), there exists a subsequence $(f_{i_n})_n$ which converges uniformly on every compact subset $K\subseteq U$. Denote the limit function by $f$. We know by Weierstrass's Theorem 3.6 of Chapter 10 that $f$ is holomorphic, $f'(z_0)=N$ (and thus $N<\infty$), and $|f(z)|\le 1$ for every $z\in U$, but, again, a strict inequality must arise by Theorem 6.3.4 of Chapter 10.

We will now show that $f$ is injective. In effect, let $z_1\in U$. Then the functions $f_{i_n}(z)-f_{i_n}(z_1)$ have no zero in $U\smallsetminus\{z_1\}$, and hence $f(z)-f(z_1)$, which is not constant because $f'(z_0)=N>0$, has no zero in $U\smallsetminus\{z_1\}$ by Hurwitz's Theorem 6.3.6 of Chapter 10. Since $z_1$ was arbitrary, $f$ is injective as claimed.

We claim that the function $f:U\to\Omega(0,1)$ is onto. Assume, for contradiction, that $w_0\in\Omega(0,1)\smallsetminus f[U]$. Let

$$\phi(z)=\frac{f(z)-w_0}{1-\overline{w_0}f(z)}.$$

Note that this function is injective since it is the composition of a Möbius transformation with the injective function $f$. Applying Lemma 1.2 to this function $\phi(z)$, we can find a function $\operatorname{Ln}(\phi(z))$ on $U$ which satisfies

$$e^{\operatorname{Ln}(\phi(z))}=\phi(z).$$

Let, again,

$$g(z)=e^{\operatorname{Ln}(\phi(z))/2},$$

so that

$$\phi(z)=(g(z))^2.$$

Let

$$F(z)=\frac{|g'(z_0)|}{g'(z_0)}\cdot\frac{g(z)-g(z_0)}{1-\overline{g(z_0)}g(z)}.$$

Note that we have $F\in S$. (Compare Exercise (2).)

Note that $f=\psi\circ F$ for a certain holomorphic function $\psi:\Omega(0,1)\to\Omega(0,1)$ which satisfies $\psi(0)=0$. Concretely, $\psi$ is a composition of a holomorphic automorphism of $\Omega(0,1)$ which maps $0$ to $g(z_0)$ followed by squaring, and a holomorphic automorphism of $\Omega(0,1)$ which maps $0$ to $w_0$. Since $\psi$ is obviously not a linear map, we have $|\psi'(0)|<1$ by Schwarz's Lemma 1.1.2, so $F'(z_0)>f'(z_0)=N$ (since both are positive real numbers and $f'(z_0)=\psi'(0)F'(z_0)$), which is a contradiction. □
1.3 Comments
1. It is clear why the case $U=\mathbb{C}$ must be excluded in the statement of the Riemann Mapping Theorem: By Liouville's Theorem 5.1 of Chapter 10, any bounded holomorphic map defined on $\mathbb{C}$ is constant.

2. Excluding the case of $U=\mathbb{C}$, as already remarked, Proposition 2.5 of Chapter 10 provides a converse to the Riemann Mapping Theorem when stated for real-regular images of convex open sets. Perhaps much more importantly, however, Proposition 2.5 of Chapter 10 serves as a source of examples for the Theorem. While our definition of a simply connected set above precisely fits the proof of the Theorem, it is not a condition which is easy to verify. On the other hand, constructing real injective regular maps on convex sets, as in Proposition 2.5 of Chapter 10, is easy (for example, see Exercise (4)).
2 Holomorphic isomorphisms of disks onto polygons and the Schwarz-Christoffel formula

2.1 Convex polygons
We will examine holomorphic isomorphisms between .0; 1/ and open polygons. We will restrict here to convex polygons, although the restriction is not really necessary (in the sense that the same formula implies to non-convex polygons as well). However, convex polygons are much easier to treat rigorously. By an open half-plane of angle a , 0 a < 2, we shall mean a subset of C consisting of all the numbers b C z where Im.ze i a / > 0
(2.1.1)
for some constant b 2 C. A closed half-plane is the closure of an open half-plane of the same angle. An open convex polygon P is a bounded non-empty intersection of open halfplanes P1 ; : : : ; Pk of angles a1 ; : : : ak , 0 < a1 < < ak 2. We put a0 D ak 2. The corresponding closed convex polygon is its closure P . Consider the points zi , where fzi g D @Pi \ @Pi 1 , i D 1; : : : ; k, and we let P0 D Pk and z0 D zk . Let us assume, without loss of generality, that k is the smallest possible for the given P . Then the points zi are called the vertices of P and the number ˇi D .ai ai 1 /
13 Complex Analysis II: Further Topics
the exterior angle at the vertex $z_i$, $i = 1, \dots, k$ (angles at vertices are measured here in units of $\pi$). Setting $\alpha_i = 1 - \beta_i$, the angle $\alpha_i$ is called the interior angle at the vertex $z_i$. The boundary $\partial P$ is the union of the closed line segments $L_i$ between the points $z_i$, $z_{i+1}$, $i = 0, \dots, k-1$.

Now let $z_0 \in \mathbb{C}$ and $0 < \alpha < 1$. By Lemma 1.2, the function
$$(z - z_0)^{\alpha} = e^{\alpha \operatorname{Ln}(z - z_0)} \tag{2.1.2}$$
can then be defined on any open half-plane $P$ whose boundary contains $z_0$, and inspection shows that this function can be extended to a bijective continuous function mapping $\overline{P}$ onto a closed angle of value $\alpha\pi$. Let $f(w)$ be a holomorphic function defined in an open neighborhood $U$ of a point $w_0 \in \mathbb{C}$, let $f(w_0) = z_0$ and let $f'(w_0) \ne 0$. Define
$$g(w) = (f(w) - z_0)^{\alpha}.$$
Then we just proved that $g(w)$ is defined on $f^{-1}[P]$ where $P$ is an open half-plane whose boundary contains $z_0$.

2.2 Lemma. The function $g(w)(w - w_0)^{-\alpha}$ extends holomorphically to an open neighborhood of $w_0$, and the extension is nonzero there.

Proof. By our assumption, the Taylor expansion of $f(w) - z_0$ at $w_0$ is of the form
$$(w - w_0) \sum_{n=0}^{\infty} a_{n+1} (w - w_0)^n$$
where $a_1 \ne 0$. However, since $z^{\alpha}$ can be defined as a holomorphic function in the neighborhood of any non-zero point, we may then write
$$g(w)(w - w_0)^{-\alpha} = \left( \sum_{n=0}^{\infty} a_{n+1} (w - w_0)^n \right)^{\alpha},$$
which is a holomorphic function in the neighborhood of $w_0$. $\square$
2.3 Lemma. Let $f : P \to \Omega(0,1)$ be a holomorphic isomorphism where $P$ is an open polygon. Then $f$ extends to a homeomorphism $\overline{f} : \overline{P} \to \overline{\Omega(0,1)}$. Furthermore, if we denote by $g(w)$ the inverse of $\overline{f}$, assuming that $g(w_i) = z_i$ where $z_1, \dots, z_k$ are the vertices of $P$, then for $w \ne w_1, \dots, w_k$ in the domain of $g$, $g$ can be extended to a holomorphic function with non-zero derivative in a neighborhood of $w$, and additionally
$$(g(w) - z_i)(w - w_i)^{-\alpha_i} \tag{2.3.1}$$
can be extended to a holomorphic function on an open neighborhood of $w_i$, which is non-zero there.

Proof. Consider a point $z \in \partial P$, $z \ne z_1, \dots, z_k$. Let us use the above notation for $P$. Assume, without loss of generality, that $z \in L_0$, and $L_0 \subset \mathbb{R}$. Consider the set $Q = \operatorname{Int}(\overline{P} \cup \{\overline{z} \mid z \in \overline{P}\})$ (note that $\overline{z}$ means the complex conjugate of $z$ while $\overline{P}$ is the closure), and the set $R = \mathbb{C} \setminus \{z \in \mathbb{R} \mid |z| \ge 1\}$. For $z \in L_0 \setminus \{z_0, z_1\}$, by the Riemann Mapping Theorem 1.2, there exists a unique holomorphic isomorphism $g : Q \to R$ such that $g(z) = 0$, and $g'(z)$ is a positive real number. Since, however, the map $z \mapsto \overline{g(\overline{z})}$ is another solution, it must be equal to $g(z)$ or, in other words, $g(\overline{z}) = \overline{g(z)}$. It then follows from the Intermediate Value Theorem that $g[P]$ is contained in the upper half-plane (the open half-plane with angle 0), and hence must be equal to it. Thus, $g$ restricts to a holomorphic isomorphism from $P$ onto the upper half-plane. Replacing $\Omega(0,1)$ by the upper half-plane (which we may do by a Möbius transformation), we see that $f$ extends holomorphically to an open neighborhood of $z$.

Let us now consider the points $z = z_i$. Assume, without loss of generality, $i = 0$, $L_0 \subset \mathbb{R}$, $z_0 = 0$. Then denoting by $\overline{\psi}$ the continuous extension of the function (2.1.2) for $\alpha = \alpha_0$, to the closed upper half-plane, let $\psi$ be the restriction of $\overline{\psi}^{-1}$ to $\overline{P}$. Then we have
$$\psi[\overline{P}] \cap \mathbb{R} = \psi[L_{k-1} \cup L_0].$$
Let this image be the interval $\langle s, t \rangle$ where $s < 0 < t$. Now applying the argument of the previous paragraph to the holomorphic isomorphism $\Phi$ from the set
$$\operatorname{Int}(\psi[\overline{P}] \cup \{\overline{z} \mid z \in \psi[\overline{P}]\})$$
to $\mathbb{C} \setminus ((-\infty, s\rangle \cup \langle t, \infty))$ which maps $0$ to $0$ with a positive real derivative at $0$, we again see that the map must be symmetric under complex conjugation, and hence must map $\psi[P]$ holomorphically bijectively onto the open upper half-plane.

Therefore (after composing with a Möbius transformation to pass from the upper half-plane to $\Omega(0,1)$), $\Phi \circ \psi$ gives a continuous extension of $f$ to a neighborhood of $z_0$ in $\overline{P}$, and, in fact, also a holomorphic extension of $f \circ \psi^{-1}$ to an open neighborhood of $0$. Now the open neighborhoods of all points $z \in \partial P$ cover $\partial P$ which is compact, and hence by the Heine-Borel Theorem 2.3 of Chapter 9, we may cover $\partial P$ by finitely many such neighborhoods. By the uniqueness theorem for holomorphic functions 4.4 of Chapter 10, the local extensions of $f$ agree on the intersections of the neighborhoods, which proves the existence of the continuous map $\overline{f}$.
Now the statement about the holomorphic extension of the function (2.3.1) follows from Lemma 2.2 (applied to this same function $g$). $\square$

2.4 Theorem. (The Schwarz-Christoffel formula) Let $P$ be a convex polygon as above, and let $f : P \to \Omega(0,1)$ be a holomorphic isomorphism. Let $\overline{f}(z_i) = w_i$ (see Lemma 2.3). Then the inverse $g$ of the map $f$ is given by the formula
$$g(w) = C \int_0^w \prod_{i=1}^{k} (u - w_i)^{-\beta_i} \, du + D \tag{2.4.1}$$
for some constants $C, D \in \mathbb{C}$.

Proof. Apply Lemma 2.3. Differentiating, we get that the function
$$h(w) = g'(w) \prod_{i=1}^{k} (w - w_i)^{\beta_i} \tag{*}$$
extends holomorphically to an open set $U$ containing $\overline{\Omega(0,1)}$, and the extension has no zero on $U$. We will show that the argument of the function (*) is, in fact, constant on the boundary $\partial\Omega(0,1)$. First, consider the argument (in the sense of Subsection 6.3 of Chapter 10) of the function $g'(w)$ for $w$ in the segment of the unit circle between the points $w_i$ and $w_{i+1}$. But we see that $g(w)$ on that segment is a composition of a linear function with $\operatorname{Ln}(z)$, and using the chain rule,
$$\operatorname{Arg}(g'(w)) = C_i - \operatorname{Arg}(w)$$
where $C_i$ is a constant in the circle segment between $w_i$ and $w_{i+1}$. (Comment: In this case, we consider the argument in the broader sense, i.e. determined only up to an integral multiple of $2\pi$.) Now the key point is that the slope of the side of $P$ changes by $\pi\beta_i$ when passing the point $w_i$, which immediately gives
$$C_i - C_{i-1} = \pi\beta_i.$$
On the other hand, by basic geometry of an isosceles triangle, we have
$$\operatorname{Arg}(w - w_i) = \pm\frac{\pi}{2} + \frac{\operatorname{Arg}(w_i)}{2} + \frac{\operatorname{Arg}(w)}{2}.$$
Therefore, for $w$ on the circle segment between $w_i$ and $w_{i+1}$,
$$\operatorname{Arg}\left( \prod_{j=1}^{k} (w - w_j)^{\beta_j} \right) = Q_i + \Big( \sum_{j=1}^{k} \beta_j \Big) \operatorname{Arg}(w)/2 = Q_i + \operatorname{Arg}(w) \tag{**}$$
for some constant $Q_i$. When passing $w_i$, the sign of the term $\pm\pi/2$ contributed to (**) by the factor $(w - w_i)^{\beta_i}$ changes, which shows
$$Q_i - Q_{i-1} = -\pi\beta_i.$$
We see then that the argument of the function $h(w)$ of (*) is constant on the unit circle. Thus, the holomorphic function $h(w)$ on $U$ maps the unit circle into a set of the form $S = \{t e^{ib} \mid t > 0\}$ for $b$ constant (a ray). Applying the Maximum Principle 6.3.5 of Chapter 10 to the holomorphic functions
$$e^{i h(w) e^{-ib}}, \quad e^{-i h(w) e^{-ib}},$$
we then see that $h(w)$ maps the whole set $\Omega(0,1)$ to $S$, and thus, by the Open Mapping Theorem 6.3.4 is constant. Integrating gives the statement of the theorem. $\square$

Comment: The numbers $w_i$ are not determined by Theorem 2.4 or any of the above discussion. They are difficult to determine analytically except in a few very special situations (see Exercises (6), (7)).
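As a numerical illustration of formula (2.4.1), the following Python sketch treats one of the special situations where the points $w_i$ can be guessed by symmetry. It assumes (this choice is not taken from the text) the square case $k = 4$, $\beta_i = 1/2$, with $w_i$ the fourth roots of unity, $C = 1$ and $D = 0$. Then $\prod_i (u - w_i)^{-1/2}$ is a constant multiple of $(1 - u^4)^{-1/2}$, and the principal branch of the square root can be used on the disk. The helper names `simpson` and `g` are hypothetical. The check is that points of the arc between $w = 1$ and $w = i$ land on the straight side $x + y = K$ joining $g(1) = K$ and $g(i) = iK$.

```python
import cmath
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def g(w):
    """Integral of (1 - u^4)^(-1/2) from 0 to w along the straight segment u = t*w."""
    # du = w dt; the principal square root is continuous here because
    # Re(1 - u^4) > 0 for |u| < 1.
    return w * simpson(lambda t: 1 / cmath.sqrt(1 - (t * w) ** 4), 0.0, 1.0)

# K = g(1). The substitution t = sin(phi) removes the endpoint singularity:
# dt / sqrt(1 - t^4) = dphi / sqrt(1 + sin(phi)^2).
K = simpson(lambda p: 1 / math.sqrt(1 + math.sin(p) ** 2), 0.0, math.pi / 2)

for theta in (0.3, math.pi / 6, math.pi / 4, 1.2):
    z = g(cmath.exp(1j * theta))
    # The deviation from the side x + y = K should be at rounding level.
    print(f"theta = {theta:.3f}   g = {z:.6f}   Re + Im - K = {z.real + z.imag - K:.1e}")
```

The test points deliberately avoid the points $w_i$ themselves, where the integrand is singular and a plain Simpson rule would converge slowly.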
3 Riemann surfaces, coverings and complex differential forms
We will now use what we learned about complex analysis to discuss a partial “complex analog” of some of the material of Chapter 12. While this may seem like an abstract exercise, it actually turns out to be an extremely useful device, which will greatly enhance our understanding of topics already covered, such as Möbius transformations, simply connected open subsets of $\mathbb{C}$, and even primitive functions.
3.1 Riemann Surfaces: the basic definitions
Much of the theory of smooth manifolds of Chapter 12 can be directly translated to form a theory of “complex manifolds” by simply replacing $\mathbb{R}$ with $\mathbb{C}$ and smooth functions by holomorphic functions. However, there are some notable exceptions which require care. First of all, to discuss a theory of complex manifolds in an arbitrary dimension, we would first have to study analysis in several complex variables. While the reader could probably fill in the basic definitions, the theory of several complex variables is a special area of analysis with many subtleties, which exceeds the realm of this book. For a good introduction to that subject, we recommend [15].

Because of this, we will restrict our attention to complex dimension 1. A complex manifold of complex dimension 1 is called a Riemann surface. (It has, of course, topological dimension 2, and 2-dimensional manifolds are often called surfaces.) Thus, we have the following definition: A Riemann surface is a 2-dimensional topological manifold $\Sigma$ with an atlas $(U_x, h_x)$ (where the coordinate maps are understood as maps into $\mathbb{C}$) such that the compositions (C) of Subsection 1.2 of Chapter 12 are holomorphic maps. Analogously with Subsection 1.3 of Chapter 12, a map $f : \Sigma_1 \to \Sigma_2$ of Riemann surfaces is called holomorphic if $f$ is continuous, and for every $x \in \Sigma_1$, the composition
$$h_x[f^{-1}[U_{f(x)}] \cap U_x] \xrightarrow{\;h_x^{-1}\;} f^{-1}[U_{f(x)}] \cap U_x \xrightarrow{\;f\;} U_{f(x)} \xrightarrow{\;h_{f(x)}\;} \mathbb{C}$$
is holomorphic. As expected, a holomorphic map $f : \Sigma \to \mathbb{C}$ for a Riemann surface $\Sigma$ will be called a holomorphic function on $\Sigma$. The treatment of tangent vectors of Riemann surfaces also parallels directly the smooth case, i.e. Subsection 2.1 of Chapter 12. Of course, the tangent space $T\Sigma_x$ to a Riemann surface at a point $x \in \Sigma$ is a complex line, i.e. a vector space over $\mathbb{C}$ of dimension 1.

The first difference between Riemann surfaces and smooth manifolds is that Remark 1 of Subsection 1.1 of Chapter 12 does not apply to Riemann surfaces. In other words, we cannot assume that the coordinate maps are onto $\mathbb{C}$: if we did, then there would not be enough examples. By Proposition 1.1.1, an open subset $U \subsetneq \mathbb{C}$, which we can consider as a Riemann surface where the (single) coordinate map is the inclusion, does not have an atlas whose coordinate systems would be holomorphic maps onto $\mathbb{C}$.

Another substantial difference is the absence of a “holomorphic partition of unity”. In other words, the discussion of Subsection 1.5 of Chapter 12 does not have a holomorphic analogue. For example, a holomorphic function on a connected open set which is 0 outside of a compact subset is necessarily constant 0 by Theorem 4.4 of Chapter 10.

On the other hand, also in contrast with the case of real manifolds, note again that for a bijective holomorphic map of Riemann surfaces $f : \Sigma_1 \to \Sigma_2$, by Theorem 6.3.3 of Chapter 10, $Df_x \ne 0$ for every $x \in \Sigma_1$, and thus $f^{-1}$ is also a bijective holomorphic map. Again, such maps will be called holomorphic isomorphisms, and holomorphic automorphisms if $\Sigma_1 = \Sigma_2$.
3.2 The first examples
It turns out that we already have a number of examples of Riemann surfaces. Of course, open subsets of $\mathbb{C}$ are immediate examples. A first “non-trivial” example is the complex projective space $\mathbb{C}P^1$: As a set, it is $\mathbb{C} \cup \{\infty\}$. It is topologized as $S^2$, with $\mathbb{C}$ identified homeomorphically with $S^2 \setminus \{a\}$ for any chosen point $a \in S^2$. Then the atlas has two charts: one is $\mathbb{C}$ and the identity on $\mathbb{C}$, the other is $\mathbb{C}P^1 \setminus \{0\}$ with the chart defined by
$$z \mapsto \begin{cases} 1/z & \text{for } z \ne \infty, \\ 0 & \text{for } z = \infty. \end{cases}$$
Now it is pretty much obvious from the definition that the Möbius transformations of Subsection 1.1 are holomorphic automorphisms of $\mathbb{C}P^1$, and it is not difficult to check that they are the only ones (see Exercise (10)). Moreover, for an open set $U \subseteq \mathbb{C}$, note that a meromorphic function on $U$ is precisely the same thing as a holomorphic map $U \to \mathbb{C}P^1$. Because of this, one calls a holomorphic map $f : \Sigma \to \mathbb{C}P^1$ a meromorphic function on the Riemann surface $\Sigma$.

Here is another example: Let $a, b$ be complex numbers linearly independent over $\mathbb{R}$. Introduce an equivalence relation $\sim$ on $\mathbb{C}$ where $z_1 \sim z_2$ if $z_1 - z_2 = ka + \ell b$ where $k, \ell$ are integers. The set $E$ of equivalence classes with respect to this equivalence relation is called an elliptic curve. (The use of the term “curve” here stems from algebraic geometry, where one develops methods for defining geometric objects, called varieties, over general fields. A 1-dimensional variety is called a curve. A non-singular curve over the field $\mathbb{C}$ is then, in particular, a Riemann surface.) Denote the equivalence class of $z \in \mathbb{C}$ by $[z]$; an element of an equivalence class is called its representative. Clearly, we have a projection
$$\pi : \mathbb{C} \to E$$
given by
$$\pi(z) = [z].$$
We may define a metric on $E$ by letting the distance of two classes $[z_0]$, $[t_0]$ be
$$\min |z - t|$$
where $z \in [z_0]$, $t \in [t_0]$. The reason the minimum exists is that the subset
$$L = \{ka + \ell b \mid k, \ell \in \mathbb{Z}\}$$
is discrete. The projection $\pi$ is then continuous. Since $L$ is discrete, there exists an $\varepsilon > 0$ such that $\Omega(0, 2\varepsilon) \cap L = \{0\}$. Then for any $z \in \mathbb{C}$, $\pi|_{\Omega(z, \varepsilon)}$ is a homeomorphism onto $\Omega([z], \varepsilon)$. Thus, the inverses of these restrictions can be taken for an atlas, making $E$ a Riemann surface. Meromorphic functions on $E$ are the same data as doubly periodic meromorphic functions on $\mathbb{C}$. Such functions are called elliptic functions. See Exercise (8) for one method by which examples of elliptic functions can be constructed.
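The metric on an elliptic curve described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lattice_distance` and the parameter `window` are hypothetical, and the search over a small window of lattice points around the rounded coordinates is a heuristic that is adequate when the basis $a, b$ is not too skewed (for a very skewed basis, a larger window or a reduced basis would be needed).

```python
import itertools

def lattice_distance(z, t, a, b, window=2):
    """Approximate distance of the classes [z], [t] in C/L, L = {k*a + l*b}."""
    d = complex(z) - complex(t)
    a, b = complex(a), complex(b)
    det = a.real * b.imag - a.imag * b.real
    if det == 0:
        raise ValueError("a and b must be linearly independent over R")
    # Solve d = x*a + y*b for real x, y (Cramer's rule).
    x = (d.real * b.imag - d.imag * b.real) / det
    y = (a.real * d.imag - a.imag * d.real) / det
    k0, l0 = round(x), round(y)
    # Minimize |d - (k*a + l*b)| over lattice points near (k0, l0).
    return min(
        abs(d - (k0 + dk) * a - (l0 + dl) * b)
        for dk, dl in itertools.product(range(-window, window + 1), repeat=2)
    )

# Square lattice a = 1, b = i: the classes of 0.9 and 0.1 are at distance 0.2,
# although the representatives themselves are 0.8 apart.
print(lattice_distance(0.9, 0.1, 1, 1j))
```

Two representatives of the same class have distance zero up to rounding, which reflects that the function is well-defined on classes.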
3.3 Coverings
3.3.1 Let $\Sigma$ be a Riemann surface. A holomorphic map $\pi : T \to \Sigma$, where $T$ is another Riemann surface, is called a covering if for every $z \in \Sigma$, there exists an open neighborhood $V_z$ such that $\pi^{-1}[V_z]$ is a disjoint union of open subsets $U_i$, $i \in I$, such that for each $i$, the restriction $\pi|_{U_i} : U_i \to V_z$ is a holomorphic isomorphism. We call $V_z$ a fundamental neighborhood. Note that an open subset of a fundamental neighborhood which contains $z$ is also a fundamental neighborhood. Obviously, a holomorphic isomorphism is a covering. If $E$ is an elliptic curve, the projection $\pi : \mathbb{C} \to E$ discussed in Subsection 3.2 is a covering. For yet another example, see Exercise (14).
3.3.2 Coverings from (local) primitive functions Another example of a covering, which will be of great significance to us, is obtained as follows: Let $U \subseteq \mathbb{C}$ be an open subset, and let $f : U \to \mathbb{C}$ be a holomorphic function. Let $U_f$ be equal to $U \times \mathbb{C}$ as a set. Denote by $\pi : U_f \to U$ the projection to the first factor: $\pi(z, t) = z$. Introduce, however, a topology on $U_f$ as follows: Let its basis consist of all sets of the form
$$W_{V,F} = \{(z, F(z)) \mid z \in V\} \tag{*}$$
where $V \subseteq U$ is an open subset and $F$ is a primitive function of $f$ on $V$. By Theorem 2.3 of Chapter 10, every point of $U_f$ is contained in one of the sets (*), and in fact $(W_{V,F}, \pi|_{W_{V,F}})$ form an atlas of a Riemann surface $U_f$, and, furthermore, the projection $\pi$ is a covering. (Convex open subsets of $U$ can be taken as fundamental neighborhoods.)
3.3.3 Paths and homotopy We will now briefly investigate topological properties of coverings. By a path in a topological space $X$, one means a continuous map $\omega : \langle 0,1 \rangle \to X$. The points $\omega(0)$ and $\omega(1)$ are called the beginning point and end point, respectively. A homotopy of paths $\omega$, $\eta$ with the same beginning point and the same end point is a continuous map $h : \langle 0,1 \rangle \times \langle 0,1 \rangle \to X$ such that $h(s,0) = \omega(s)$, $h(s,1) = \eta(s)$, $h(0,t) = \omega(0)$, $h(1,t) = \omega(1)$ for all $s, t \in \langle 0,1 \rangle$. We write $h : \omega \simeq \eta$. Our main result is the following

3.3.4 Theorem. Let $\pi : T \to \Sigma$ be a covering, and let $\omega : \langle 0,1 \rangle \to \Sigma$ be a path. Let a point $x \in T$ be such that $\pi(x) = \omega(0)$. Then there exists a unique path $\tilde\omega$ in $T$ such that $\tilde\omega(0) = x$ and $\pi(\tilde\omega(t)) = \omega(t)$ for all $t \in \langle 0,1 \rangle$. Furthermore, if $\omega \simeq \eta$, then $\tilde\omega \simeq \tilde\eta$ (in particular, $\tilde\omega$ and $\tilde\eta$ have the same endpoints). One refers to the path $\tilde\omega$ as a lifting of the path $\omega$.

Proof. Let $A_t$ be an open interval containing the point $t \in \langle 0,1 \rangle$ such that $\omega[A_t \cap \langle 0,1 \rangle]$ is contained in a fundamental neighborhood. By Theorem 5.5 of Chapter 2, $\langle 0,1 \rangle$ is covered by finitely many of the open intervals $A_t$. Denoting their end points by $0 = t_0 < t_1 < \dots < t_k = 1$, each of the images $\omega[\langle t_i, t_{i+1} \rangle]$ is contained in a fundamental neighborhood. We can prove by induction on $i$ that a lift $\tilde\omega_i$ of $\omega|_{\langle 0, t_i \rangle}$ with beginning point $x$ exists and is unique: in fact, assuming this for a given $i$, i.e. that $\tilde\omega_i$ exists, let $V$ be a fundamental neighborhood containing $\omega[\langle t_i, t_{i+1} \rangle]$, and let $V_j$ be the open subset given by the definition of a covering which is mapped homeomorphically to $V$ by the restriction $\pi_i$ of the projection $\pi$, and has the property that $\tilde\omega_i(t_i) \in V_j$. Then for $t \in \langle t_i, t_{i+1} \rangle$, define
$$\tilde\omega_{i+1}(t) = \pi_i^{-1}(\omega(t)).$$
Clearly, this extends $\tilde\omega_i$ to the required $\tilde\omega_{i+1}$, and further this extension is uniquely determined, since $\pi_i$ is a homeomorphism. Now we can put $\tilde\omega = \tilde\omega_k$, and we have both existence and uniqueness.

Regarding the homotopy, let $h : \omega \simeq \eta$. We shall construct a lift of this homotopy to $T$. Note that we already know the lift exists and is uniquely determined by applying the path lifting theorem separately to the path $h(?, a)$ with each fixed $a$. However, we must prove that this lift $\tilde h : \langle 0,1 \rangle \times \langle 0,1 \rangle \to T$ is continuous. To this end, we must repeat, to some extent, our above argument for paths: The set $\langle 0,1 \rangle \times \langle 0,1 \rangle$ is compact, and hence is covered by finitely many rectangles $\langle s, s' \rangle \times \langle t, t' \rangle$ the closures of whose images lie in fundamental neighborhoods. Taking the finite sets of all such $s, s'$ and $t, t'$, we obtain partitions $0 = s_0 < s_1 < \dots < s_\ell = 1$, $0 = t_0 < t_1 < \dots < t_m = 1$ where the $h$-image of each rectangle $\langle s_i, s_{i+1} \rangle \times \langle t_j, t_{j+1} \rangle$ is in a fundamental neighborhood $U_{i,j}$. For each $j$, we then prove by induction on $i$ that $\tilde h|_{\langle 0, s_i \rangle \times \langle t_j, t_{j+1} \rangle}$ is continuous; indeed, suppose the statement is true for a given $i$ (and a fixed $j$). Then by the connectedness of intervals and the induction hypothesis, $\tilde h[\{s_i\} \times \langle t_j, t_{j+1} \rangle]$ is contained in one of the disjoint open sets which, by $\pi$, map homeomorphically onto $U_{i,j}$. Inverting the homeomorphism, we obtain the statement for $i + 1$.

To see that $\tilde h(1, t)$ is constant in $t$, note that $\pi^{-1}[\{\omega(1)\}]$ is discrete, and a continuous function from a connected space to a discrete space is constant. $\square$

Remark: It is useful to note that the proof of this theorem was purely topological and did not make any use of the holomorphic structure.
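The step-by-step construction in the proof of the path lifting theorem can be imitated numerically. The sketch below uses a standard covering which is not discussed in the text above and is assumed here only for illustration: the exponential map from $\mathbb{C}$ onto $\mathbb{C}$ minus the origin. On a short step the quotient of consecutive path values is close to 1, so the principal logarithm plays the role of the local inverse of the projection on a fundamental neighborhood. The name `lift_through_exp` and the parameter `steps` are hypothetical.

```python
import cmath
import math

def lift_through_exp(path, start, steps=1000):
    """Lift a path in C minus {0} through exp; exp(start) must equal path(0).

    Returns the list of lifted values at the parameters n/steps. The number of
    steps is assumed large enough that consecutive values of the path stay in a
    common fundamental neighbourhood.
    """
    lifted = [start]
    prev = path(0.0)
    for n in range(1, steps + 1):
        cur = path(n / steps)
        # Local inverse of exp: add the principal logarithm of the small ratio.
        lifted.append(lifted[-1] + cmath.log(cur / prev))
        prev = cur
    return lifted

# A loop winding once around 0: the lift is not closed, it ends at 2*pi*i.
circle = lambda t: cmath.exp(2j * math.pi * t)
print(lift_through_exp(circle, 0j)[-1])

# A loop that does not wind around 0 is homotopic to a constant path,
# so its lift returns to the beginning point log(3).
small_loop = lambda t: 2 + cmath.exp(2j * math.pi * t)
print(lift_through_exp(small_loop, cmath.log(3))[-1], cmath.log(3))
```

The two outputs illustrate the second part of the theorem: homotopic paths have lifts with the same end point, so a loop whose lift fails to close cannot be homotopic to a constant path.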
3.4 Complex and holomorphic differential forms
3.4.1 Integration on Riemann surfaces Let us begin by a brief discussion of complex line integrals on Riemann surfaces. A Riemann surface $\Sigma$ is certainly a smooth manifold, and by the material of Chapter 12, for a differential 1-form $\omega$ on $\Sigma$ and a continuously differentiable map $L : \langle a, b \rangle \to \Sigma$, we may integrate
$$\int_L \omega = \int_a^b L^*(\omega). \tag{*}$$
This definition extends, as before, in an obvious way to piecewise continuously differentiable curves $L$, and is independent of parametrization in the sense of Chapter 8. Therefore, the key point is specifying the differential 1-form $\omega$. What one means by complex integral is that using complex multiplication, we can introduce 1-forms with complex coefficients (also called complex-valued differential forms). We obtain those by applying $? \otimes_{\mathbb{R}} \mathbb{C}$ to the spaces $TM_x$, $\Lambda^k(TM_x)$. (When tensoring over $\mathbb{R}$ with $\mathbb{C}$, we consider $\mathbb{C}$ as an $\mathbb{R}$-vector space. However, note that $? \otimes_{\mathbb{R}} \mathbb{C}$ covariantly turns $\mathbb{R}$-vector spaces into $\mathbb{C}$-vector spaces by using the multiplication in $\mathbb{C}$.) Thus, a smooth complex-valued $k$-form assigns to each $x \in M$, an element of $\Lambda^k(TM_x) \otimes_{\mathbb{R}} \mathbb{C}$ which becomes smooth upon identification of $TM_x$ with $\mathbb{C} \cong \mathbb{R}^2$ when $x \in U$ and $U$ is a coordinate neighborhood. Identifying $\mathbb{C}$ with $\mathbb{R}^2$, a complex-valued $k$-form on a Riemann surface is then precisely the same thing as a pair of real $k$-forms: its real and imaginary part. This construction could, in fact, be done for any (real) smooth manifold.

We remarked in Subsection 4.4 of Chapter 8 that the complex line integral over a piecewise continuously differentiable curve in an open subset $U \subseteq \mathbb{C}$ we used so extensively in Chapter 10 (and the present chapter) can be expressed in terms of the line integral of the second kind. In the more modern context of complex-valued differential forms, this is expressed by the simple but somewhat profound formula
$$dz = dx + i\,dy. \tag{**}$$
In (**), we identify $\mathbb{C}$ with $\mathbb{R}^2$ by $z = x + iy$. The right-hand side of (**) is then a differential 1-form with complex coefficients, so we may integrate it over piecewise continuously differentiable curves $L$ in $\mathbb{C}$. When integrating the left-hand side of (**) over $L$, we mean, on the other hand, the corresponding complex line integral. This is, then, the same thing as treating $dz$ as a complex-valued 1-form. Using complex multiplication, we then have additional complex 1-forms
$$\omega = f(z)\,dz$$
for a complex continuously real-differentiable function $f(z)$. A line integral of the complex-valued 1-form $\omega$ is then the same thing as the complex line integral as treated earlier, thus explaining in this way a complex line integral as an integral of a complex-valued 1-form.
3.4.2 Holomorphic 1-Forms on a Riemann surface On a general Riemann surface $\Sigma$, we no longer have a preferred form $dz$, but we do have one on a coordinate neighborhood with a holomorphic coordinate system $z$. Using the complex chain rule, we see that if $z = z(t)$ is a holomorphic function of another holomorphic coordinate $t$, then
$$dz = z'(t)\,dt$$
where $z'$ denotes the complex derivative (note that we only need to make sense of this on an open subset of $\mathbb{C}$). This means that 1-forms on a coordinate system, which can be given as
$$f(z)\,dz,$$
where $f$ is a holomorphic function, transform to 1-forms of the same kind upon holomorphic change of coordinates. Such complex-valued 1-forms are called holomorphic 1-forms on the Riemann surface $\Sigma$. Now in analogy with Subsection 2.4 of Chapter 10, for a holomorphic 1-form $\omega$ on a Riemann surface $\Sigma$, a primitive function (if one exists) is a function $F : \Sigma \to \mathbb{C}$ such that
$$dF = \omega. \tag{*}$$
To see that this is the right generalization, note that on an open set $U \subseteq \mathbb{C}$, indeed, $dF = f(z)\,dz$ is equivalent to $F'(z) = f(z)$, see Exercise (12). Note that therefore by what we proved in Chapter 10, it immediately follows that a primitive function to a holomorphic 1-form (if one exists) is necessarily holomorphic.
Even if a primitive function does not exist, note that the construction 3.3.2 immediately generalizes to give, for any holomorphic 1-form $\omega$ on a Riemann surface $\Sigma$, a covering
$$\pi : \Sigma_\omega \to \Sigma.$$
Again, $\Sigma_\omega = \Sigma \times \mathbb{C}$ as a set, and the topology has basis consisting of sets 3.3.2 (*), where $V \subseteq \Sigma$ is open, and $F$ is a primitive function of $\omega$ on $V$.
3.5 The basis $dz$, $d\overline{z}$
We are not, however, always interested just in holomorphic 1-forms. It is therefore natural to also introduce the “complex conjugate 1-form”
$$d\overline{z} = dx - i\,dy$$
on a coordinate neighborhood $U$ of a Riemann surface, where $z$ is the coordinate. Then $dz$, $d\overline{z}$ at each point of a coordinate neighborhood $U$ clearly form a basis of the (complex) dual of the complexified tangent space $T\Sigma_x \otimes_{\mathbb{R}} \mathbb{C}$. Under a holomorphic change of coordinates $z = z(t)$, the 1-form $d\overline{z}$ transforms by
$$d\overline{z} = \overline{z'(t)}\,d\overline{t}.$$
It follows that the $\mathbb{C}$-vector spaces of forms on $U$
$$\{\phi(z)\,dz \mid \phi \text{ smooth } \mathbb{C}\text{-valued}\}, \quad \{\phi(z)\,d\overline{z} \mid \phi \text{ smooth } \mathbb{C}\text{-valued}\}$$
are preserved by holomorphic change of coordinates. Such forms are called 1-forms of type $(1,0)$, resp. of type $(0,1)$. In fact, note that if we define, for
$$\omega = f(z)\,dz + g(z)\,d\overline{z}$$
with $f, g$ smooth on $U$,
$$\overline{\omega} = \overline{f(z)}\,d\overline{z} + \overline{g(z)}\,dz,$$
then this “complex conjugation” operator is invariant under holomorphic coordinate change, and switches the spaces of $(1,0)$-forms and $(0,1)$-forms. In view of this, it is helpful also to write the basis of complex vector fields dual to $dz$, $d\overline{z}$ on $U$:
$$\frac{\partial}{\partial z} = \frac{1}{2}\left( \frac{\partial}{\partial x} - i\frac{\partial}{\partial y} \right), \quad \frac{\partial}{\partial \overline{z}} = \frac{1}{2}\left( \frac{\partial}{\partial x} + i\frac{\partial}{\partial y} \right).$$
Note that in this notation, the Cauchy-Riemann equations for a function $f$ can be expressed simply by
$$\frac{\partial f}{\partial \overline{z}} = 0. \tag{3.5.1}$$
In other words, a continuously differentiable function $f : \Sigma \to \mathbb{C}$ is holomorphic if and only if it satisfies (3.5.1) in holomorphic coordinates. Let us now examine how the new basis behaves with respect to the exterior differential and the exterior product. Regarding exterior product, note that
$$dz \wedge d\overline{z} = -2i\,dx \wedge dy. \tag{3.5.2}$$
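The Cauchy-Riemann condition (3.5.1) is easy to test numerically. The following Python sketch approximates the two complex vector fields dual to the basis of 1-forms by central differences in the real coordinates $x$ and $y$; the function name `wirtinger` and the step size `h` are hypothetical choices made only for this illustration.

```python
def wirtinger(f, z, h=1e-5):
    """Central-difference approximations of (df/dz, df/dzbar) at the point z."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # partial derivative in x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # partial derivative in y
    return (fx - 1j * fy) / 2, (fx + 1j * fy) / 2

z0 = 0.7 - 0.4j

# Holomorphic example: df/dzbar vanishes and df/dz is the complex derivative 3*z0**2.
print(wirtinger(lambda z: z ** 3, z0))

# Non-holomorphic example: for complex conjugation, df/dz = 0 and df/dzbar = 1.
print(wirtinger(lambda z: z.conjugate(), z0))
```

For the function $z\overline{z}$ the same routine returns approximately $(\overline{z_0}, z_0)$, in agreement with treating $z$ and $\overline{z}$ as independent variables when applying the two operators.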
Regarding the exterior differential, one has, of course, for a complex continuously (real)-differentiable function $f$,
$$df = \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial \overline{z}}\,d\overline{z}.$$
Recalling the Cauchy-Riemann condition in the form (3.5.1), it then becomes natural to write for a form $\omega_0 \in \{1, dz, d\overline{z}, dz \wedge d\overline{z}\}$, and a complex continuously differentiable function $f$ on $U$,
$$\partial(f\omega_0) = \frac{\partial f}{\partial z}\,dz \wedge \omega_0, \quad \overline{\partial}(f\omega_0) = \frac{\partial f}{\partial \overline{z}}\,d\overline{z} \wedge \omega_0$$
(the point, of course, being that $d\omega_0 = 0$). Of course, we have
$$d = \partial + \overline{\partial}.$$
One readily verifies that $\partial$ and $\overline{\partial}$ are invariant under a change of holomorphic coordinate (see Exercise (13)). Because of that, $\partial$ and $\overline{\partial}$ are well-defined on any Riemann surface $\Sigma$.

Note that on a compact Riemann surface, there may exist non-trivial holomorphic 1-forms. For example, the form $dz$ obviously determines a well-defined holomorphic 1-form on any elliptic curve as defined in Subsection 3.2. Compare this with Exercise (9) which asserts that every holomorphic function on a compact Riemann surface is constant. In fact, note that if $\Sigma$ is a compact Riemann surface, then the space $\Omega^1_{Hol}(\Sigma)$ embeds canonically into the de Rham cohomology with complex coefficients
$$H^1_{DR}(\Sigma, \mathbb{C}) = H^1_{DR}(\Sigma) \otimes_{\mathbb{R}} \mathbb{C}.$$
This is because if a holomorphic 1-form $\omega$ satisfies $\omega = df$, then $f$ is necessarily holomorphic and hence constant, and hence $\omega = 0$. In fact, one can prove that for $\Sigma$ compact, there is a canonical isomorphism
$$H^1_{DR}(\Sigma, \mathbb{C}) \cong \Omega^1_{Hol}(\Sigma) \oplus \overline{\Omega^1_{Hol}(\Sigma)}.$$

Let us remark that the 1-form $dz$, of course, pulls back to any open subset $U \subseteq \mathbb{C}$, and hence also to any covering $\pi : V \to U$. We shall simplify notation by denoting $\pi^* dz = d(z \circ \pi)$ also by $dz$, thus defining “complex integration” of functions on any Riemann surface $V$ equipped with a covering $\pi : V \to U$ where $U \subseteq \mathbb{C}$ is an open subset. Since every point $z \in V$ has an open neighborhood which is mapped by $\pi$ holomorphically bijectively onto an open subset of $U$, a complex derivative of holomorphic functions $f : V \to \mathbb{C}$ is then also defined, as is the concept of a primitive function of $f$ on open subsets of $V$.
3.6 Complex line integrals revisited
In Chapter 8, we investigated extensively the implications of reparametrizing a piecewise continuously differentiable parametrized curve $L$. Note that in particular, we can make the domain of the parametrization the interval $\langle 0,1 \rangle$, which lets us consider the parametrized curve $L$ as a path in the sense of Subsection 3.3. Note also that reparametrizations result in homotopic paths.

3.6.1 Theorem. Let $\Sigma$ be a Riemann surface, and let $\omega$ be a holomorphic 1-form on $\Sigma$. Let $L$, $M$ be piecewise continuously differentiable parametrized curves in $\Sigma$ which are homotopic as paths (in particular, they have the same beginning points and the same end points). Then
$$\int_L \omega = \int_M \omega.$$

Proof. Consider the covering $\Sigma_\omega$ of $\Sigma$ corresponding to the local primitive functions of $\omega$ (see 3.3.2 and 3.4.2). Let $z_0$ be the beginning point of the parametrized curves $L$, $M$. Let $\tilde L$ (resp. $\tilde M$) be a lift of the path $L$ (resp. $M$) to $\Sigma_\omega$ with beginning point $(z_0, 0)$. Let $(z_1, K_1)$, $(z_2, K_2)$ be the end points of $\tilde L$, $\tilde M$. We claim that
$$\int_L \omega = K_1, \quad \int_M \omega = K_2.$$
In effect, find again $0 = t_0 < t_1 < \dots < t_k = 1$ such that $L[\langle t_i, t_{i+1} \rangle]$, $M[\langle t_i, t_{i+1} \rangle]$ for each chosen $i$ are contained in a fundamental neighborhood of the covering, and use the properties of primitive functions. But then since $L$, $M$ are homotopic, $(z_1, K_1) = (z_2, K_2)$ by Theorem 3.3.4, which proves our statement. $\square$

3.6.2 Corollary. Let $U \subseteq \mathbb{C}$ be an open set, let $f : U \to \mathbb{C}$ be a holomorphic function and let $L$, $M$ be piecewise continuously differentiable parametrized curves in $U$ which are homotopic as paths. Then
$$\int_L f(z)\,dz = \int_M f(z)\,dz. \qquad \square$$

Note that it would be quite difficult to prove this directly using the techniques of Chapter 10, in particular since there is no theory of line integrals of the second kind over continuous paths: we have really used the force of Theorem 3.3.4 here. However, for open subsets of $\mathbb{C}$, we can go even further. Recall the definition of a simply connected open set from Subsection 1.2.

3.6.3 Theorem. For a connected open set $U \subsetneq \mathbb{C}$, the following are equivalent:
(1) $U$ is simply connected (i.e. every holomorphic function on $U$ has a primitive function).
(2) $U$ is holomorphically isomorphic to $\Omega(0,1)$.
(3) Let $a, b \in U$. Then any two paths $\omega$, $\eta$ in $U$ with beginning point $a$ and end point $b$ are homotopic.

Proof. (1) implies (2) by the Riemann Mapping Theorem 1.2. (2) implies (3) because $\Omega(0,1)$ is a convex set: We may define the homotopy simply by $h(s,t) = (1-t)\omega(s) + t\eta(s)$. To see that (3) implies (1), suppose that $U$ is a connected open subset of $\mathbb{C}$ satisfying (3). Let $f$ be a holomorphic function on $U$. Let $\Sigma$ be the covering 3.3.2 corresponding to the primitive function of $f$, and let $\Sigma_0$ be a connected component of $\Sigma$. By definition, the restriction of the projection $\pi_0 : \Sigma_0 \to U$ is a covering. We claim, in fact, that it is a holomorphic isomorphism. By Theorem 3.3.4, and the fact that $U$ is path-connected, $\pi_0$ is onto. Thus, if it is not a holomorphic isomorphism, it cannot be injective, i.e. there must be two distinct points $x, y \in \Sigma_0$ with $\pi_0(x) = \pi_0(y)$. But $\Sigma_0$ is connected, and since it is a manifold, also path-connected, so there is a path $\omega$ in $\Sigma_0$ with beginning point $x$ and end point $y$. Then the projection $\pi_0 \circ \omega$ in $U$ has the same beginning point and end point $\pi_0(x) = \pi_0(y)$, but cannot be homotopic to the constant path by Theorem 3.3.4, since its lift $\omega$ has a different beginning point and end point. The contradiction proves that $\pi_0$ is a holomorphic isomorphism; the second coordinate of $\pi_0^{-1}(z)$ is then a primitive function of $f$ on $U$. $\square$
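The statement of Corollary 3.6.2 can be observed numerically. The Python sketch below is an illustration under assumptions not taken from the text: the function is $f(z) = 1/z$ on $\mathbb{C}$ minus the origin, the integrals are approximated by a midpoint rule on short chords, and the names `line_integral` and `polygon` are hypothetical. Two homotopic paths from $1$ to $-1$ in the closed upper half-plane give the same integral, while a path through the lower half-plane, which is not homotopic to them in this domain, gives a different value.

```python
import cmath
import math

def line_integral(f, path, steps=30000):
    """Midpoint-rule approximation of the integral of f(z) dz along path: [0, 1] -> C."""
    total = 0j
    prev = path(0.0)
    for n in range(1, steps + 1):
        cur = path(n / steps)
        total += f((prev + cur) / 2) * (cur - prev)
        prev = cur
    return total

def polygon(points):
    """Piecewise linear path through the given points, parametrized by [0, 1]."""
    def path(t):
        s = t * (len(points) - 1)
        k = min(int(s), len(points) - 2)
        return points[k] + (s - k) * (points[k + 1] - points[k])
    return path

f = lambda z: 1 / z
upper_arc = lambda t: cmath.exp(1j * math.pi * t)    # from 1 to -1 above 0
upper_polygon = polygon([1, 1 + 1j, -1 + 1j, -1])    # homotopic to upper_arc
lower_arc = lambda t: cmath.exp(-1j * math.pi * t)   # from 1 to -1 below 0

I_arc = line_integral(f, upper_arc)
I_polygon = line_integral(f, upper_polygon)
I_lower = line_integral(f, lower_arc)
print(I_arc, I_polygon, I_lower)
```

The first two values agree (both approximate $i\pi$), while the third approximates $-i\pi$. This is consistent with the equivalence of (1) and (3) in Theorem 3.6.3: on this domain, which is not simply connected, $1/z$ has no primitive function.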
4 The universal covering and multi-valued functions
Theorem 3.6.3 suggests the following definition: A Riemann surface $\Sigma$ is called simply connected if it is connected, and if any two paths $\omega$, $\eta$ which have the same beginning point and the same end point are homotopic. The Riemann Mapping Theorem actually has a generalization called the Uniformization Theorem stating that every simply connected Riemann surface is holomorphically isomorphic to $\Omega(0,1)$, $\mathbb{C}$ or $\mathbb{C}P^1$, but we shall not prove this here (see, however, Exercise (17)).

4.1 Theorem. Every connected Riemann surface $\Sigma$ has a covering $\pi : \tilde\Sigma \to \Sigma$ where $\tilde\Sigma$ is simply connected. (This covering is called the universal covering of $\Sigma$.)

Proof. Select a point $x_0 \in \Sigma$. Define $\tilde\Sigma$ as the set of homotopy classes (i.e. equivalence classes with respect to the relation of homotopy) of paths $\omega$ with beginning point $x_0$. The homotopy class of a path $\omega$ will be denoted by $[\omega]$. We have an obvious map $\pi : \tilde\Sigma \to \Sigma$, sending a class $[\omega]$ to the end point of $\omega$ (by our definition of homotopy, this does not depend on the choice of a representative). Therefore, it remains to define a structure of a Riemann surface on $\tilde\Sigma$ and to prove that it is simply connected and that $\pi$ is a covering.

It is helpful here to introduce the operation of concatenation of paths, which is a generalization of the operation $+$ on parametrized continuously differentiable curves we considered in Chapter 8: If $\omega$, $\eta$ are paths in $\Sigma$ where the end point of $\omega$ is the beginning point of $\eta$, define the path $\omega * \eta$ by
$$(\omega * \eta)(t) = \begin{cases} \omega(2t) & \text{for } 0 \le t \le 1/2, \\ \eta(2t - 1) & \text{for } 1/2 \le t \le 1. \end{cases}$$
Note that (just as for piecewise continuously differentiable curves) concatenation is associative up to homotopy. Also similarly as for curves, the operation $-L$ of Chapter 8 has a generalization to paths: the inverse path of $\omega$ is defined by
$$\overline{\omega}(t) = \omega(1 - t).$$
One readily proves that $\omega * \overline{\omega}$ is homotopic to a constant path, as is $\overline{\omega} * \omega$.

To proceed further, let $[\omega] \in \tilde\Sigma$, $\pi[\omega] = x$. Let $(U_x, h_x)$ be a coordinate system of $\Sigma$ at $x$, and let $V \subseteq h_x[U_x]$ be a convex open subset containing $h_x(x)$. Then let $U_{[\omega],V} \subseteq \tilde\Sigma$ be the set consisting of all classes $[\omega * ((h_x)^{-1} \circ L)]$ where $L$ is a linearly parametrized line segment in $V$ with beginning point $h_x(x)$. Note that this class does not depend on the choice of the representative $\omega$ of the class $[\omega]$. Note that by definition, $\pi$ maps $U_{[\omega],V}$ bijectively onto $h_x^{-1}[V]$. (Note: our notation implies that a fixed coordinate system is specified at each $x \in \Sigma$; otherwise, the notation $U_{[\omega],V}$ must be modified to reflect the coordinate system.)
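The operations of concatenation and inverse path used in the proof translate directly into code if a path is modelled as a Python function on the interval from 0 to 1. The sketch below is only an illustration; the names `concat`, `inverse` and `contract` are hypothetical. The function `contract` writes down one explicit homotopy, in the convention $h(s, 0) = $ first path and $h(s, 1) = $ second path, from the concatenation of a path with its inverse to the constant path.

```python
def concat(omega, eta):
    """Concatenation: omega at double speed on [0, 1/2], then eta on [1/2, 1]."""
    def path(t):
        return omega(2 * t) if t <= 0.5 else eta(2 * t - 1)
    return path

def inverse(omega):
    """Inverse path: omega traversed backwards."""
    return lambda t: omega(1 - t)

def contract(omega):
    """Homotopy h(s, t) from concat(omega, inverse(omega)) to the constant path."""
    def h(s, t):
        u = 2 * s if s <= 0.5 else 2 - 2 * s   # goes up to 1 and back to 0
        return omega((1 - t) * u)              # shrink the excursion as t grows
    return h

omega = lambda t: t              # segment from 0 to 1 in C
eta = lambda t: 1 + 1j * t       # segment from 1 to 1 + i

path = concat(omega, eta)
print(path(0.0), path(0.25), path(0.5), path(1.0))

h = contract(omega)
print(h(0.5, 0.0), h(0.5, 0.5), h(0.5, 1.0))   # the midpoint retracts from 1 to 0
```

The beginning point and end point stay fixed during the deformation, as the definition of a homotopy of paths requires.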
4 The universal covering and multi-valued functions
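4.1.1 Lemma. If $\pi([\omega]) = \pi([\eta])$ (i.e. $\omega$ and $\eta$ have the same end point $x$) and $[\omega] \ne [\eta]$ (i.e. $\omega$ and $\eta$ are not homotopic), then for any convex open subset $V \subseteq h_x[U_x]$,
\[
U_{[\omega],V} \cap U_{[\eta],V} = \emptyset.
\]
Proof. If $\omega * ((h_x)^{-1} \circ L) \simeq \eta * ((h_x)^{-1} \circ L)$, then
\[
\omega * ((h_x)^{-1} \circ L) * \overline{((h_x)^{-1} \circ L)} \simeq \eta * ((h_x)^{-1} \circ L) * \overline{((h_x)^{-1} \circ L)}
\]
(note: this uses associativity of $*$ up to homotopy), which in turn implies $\omega \simeq \eta$ (which uses the inverse property). □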
We still need to make yet another observation.

4.1.2 Lemma. Let $\omega$, $V$ be as above and let $y \in h_x^{-1}[V]$. Then there exists a path $\eta$ and an $\varepsilon > 0$ such that for a convex open $W \subseteq \Omega(h_y(y), \varepsilon)$, $U_{[\eta],W} \subseteq U_{[\omega],V}$.

Proof. It suffices to choose $\varepsilon > 0$ such that $h_y^{-1}[\Omega(h_y(y), \varepsilon)] \subseteq h_x^{-1}[V]$. We set
\[
\eta = \omega * ((h_x)^{-1} \circ L)
\]
where $L$ is a linearly parametrized line segment with beginning point $h_x(x)$ and end point $h_x(y)$. To prove that $U_{[\eta],W} \subseteq U_{[\omega],V}$, let $M$ be a linearly parametrized line segment in $W$ with beginning point $h_y(y)$ and end point $h_y(z)$. We need to prove that
\[
[\eta * (h_y^{-1} \circ M)] \in U_{[\omega],V}. \tag{*}
\]
To this end, note that by associativity of $*$,
\[
\eta * (h_y^{-1} \circ M) \simeq \omega * \bigl((h_x^{-1} \circ L) * (h_y^{-1} \circ M)\bigr).
\]
Now we have
\[
(h_x^{-1} \circ L) * (h_y^{-1} \circ M) = h_x^{-1} \circ \bigl(L * (h_x \circ h_y^{-1} \circ M)\bigr).
\]
The path $L * (h_x \circ h_y^{-1} \circ M)$ in $V$ is in general not a linearly parametrized line segment, but it is homotopic to one since $V$ is a convex set. This proves (*). □
13 Complex Analysis II: Further Topics
Now by Lemma 4.1.2, we can give $\tilde\Sigma$ a topology where a subset $U$ is a neighborhood of an $[\omega] \in U$ if and only if it contains a subset of the form $U_{[\omega],V}$ (we need the lemma to conclude that this definition is correct in the sense that a set we call a neighborhood indeed contains an open subset). Lemma 4.1.1 then implies that for $U = h_x^{-1}[V]$ as above, $\pi^{-1}[U]$ is a disjoint union of open subsets $U_{[\omega],V}$ over all the $[\omega]$ with $\pi([\omega]) = x$. We can then define an atlas of $\tilde\Sigma$ as the open sets $U_{[\omega],V}$ together with the coordinate maps $h_x \circ \pi$ where $\pi([\omega]) = x$. It then follows that $\tilde\Sigma$ is a path-connected Riemann surface and $\pi$ is a covering - except for one detail: we must prove that $\tilde\Sigma$ is separable.

To this end, let $(U_i)$ be a countable basis of $\Sigma$ such that each $U_i$ is connected and contained in a convex subset of a coordinate neighborhood. Then each $U_i$ is a fundamental neighborhood. We will prove that for each $x \in \Sigma$, the set $\pi^{-1}[\{x\}]$ is countable; then the connected components of the sets $\pi^{-1}[U_i]$ form a countable basis of $\tilde\Sigma$. But note that by compactness as above, for every path $\omega$ with beginning point $x_0$ and end point $x$, there exist $0 = t_0 < t_1 < \dots < t_m = 1$ and a finite sequence $i_1, \dots, i_m$ such that $\omega[\langle t_{j-1}, t_j \rangle] \subseteq U_{i_j}$ for $j = 1, \dots, m$. In particular, $U_{i_j} \cap U_{i_{j+1}} \ne \emptyset$. One then proves by induction that there exist unique connected components $\tilde U_{i_j}$ of $\pi^{-1}[U_{i_j}]$ such that $\tilde U_{i_j} \cap \tilde U_{i_{j+1}} \ne \emptyset$. Since clearly $[\omega] \in \tilde U_{i_m}$, and since there are only countably many such finite sequences $i_1, \dots, i_m$, there are only countably many $[\omega]$ with $\pi([\omega]) = x$, as claimed.

Finally, we shall prove that $\tilde\Sigma$ is simply connected. But this is easy. First of all, $\tilde\Sigma$ is path-connected by construction (since the lift of a path $\omega$ with beginning point $x_0$ has, by definition, end point $[\omega]$). Next, suppose that $\alpha$, $\beta$ are two paths in $\tilde\Sigma$ with the same beginning point $[\omega]$ and the same end point $[\eta]$. But this means $[\omega * (\pi \circ \alpha)] = [\eta] = [\omega * (\pi \circ \beta)]$, i.e. $\omega * (\pi \circ \alpha) \simeq \omega * (\pi \circ \beta)$, which implies $\pi \circ \alpha \simeq \pi \circ \beta$ (by concatenating with $\overline{\omega}$), which implies $\alpha \simeq \beta$ by Theorem 3.3.4. □
4.2 Base points, universality and multi-valued functions
Let $\Sigma$ be a connected Riemann surface and let $x_0 \in \Sigma$. We refer to such a chosen point as a base point of $\Sigma$. Note that we already used the base point in the construction of the universal covering $\tilde\Sigma$, and that in fact that construction comes with a preferred base point $\tilde x_0$, represented by the constant path at $x_0$. We have $\pi(\tilde x_0) = x_0$. We refer to a covering $\pi : T \to \Sigma$ with a choice of base points $\pi(\tilde x_0) = x_0$ as a based covering. The term ‘universal covering’ (which should really pedantically be called “universal based covering”) is justified by the following fact:

4.2.1 Theorem. Let $\Sigma$ be a connected Riemann surface. Consider the based universal covering $\pi : \tilde\Sigma \to \Sigma$, with base points $\tilde x_0 \mapsto x_0$, and let $\psi : T \to \Sigma$ be any based covering, with base points $y_0 \mapsto x_0$. Then there exists a unique based covering $\phi : \tilde\Sigma \to T$ such that $\phi(\tilde x_0) = y_0$. In fact, we have $\psi \circ \phi = \pi$.
Proof. Let $x \in \tilde\Sigma$. Let $\omega$ be a path in $\tilde\Sigma$ with beginning point $\tilde x_0$ and end point $x$. By Theorem 3.3.4, there is a unique lift $\eta$ of the path $\pi \circ \omega$ to $T$ with beginning point $y_0$. Let $\phi(x)$ be the end point of $\eta$. (Note in fact that this definition is forced by the path lifting property, which already implies uniqueness.) On the other hand, also note that our definition of $\phi(x)$ did not depend on the choice of the path $\omega$, since any two such paths are homotopic as $\tilde\Sigma$ is simply connected. Because of this, if $U$ is a connected fundamental open neighborhood of a point $z \in \Sigma$ for both the coverings $\pi$ and $\psi$, and if $U_i$ (resp. $U_j$) is the open disjoint summand of $\pi^{-1}[U]$ (resp. $\psi^{-1}[U]$) such that $\pi' = \pi|_{U_i} : U_i \to U$ (resp. $\psi' = \psi|_{U_j}$) and which contains $x$ (resp. $\phi(x)$), then $\phi|_{U_i}$ is given by the formula $\psi'^{-1} \circ \pi'$, which shows that $\phi$ is a covering with such fundamental neighborhoods $U_j$. (Note: if $y \in T$ is not in the connected component of the base point, then it won’t be in the image of $\phi$, so the fundamental neighborhood of $y$ can be chosen to be the whole connected component.) □

We immediately get the following

4.2.2 Theorem. A connected Riemann surface $\Sigma$ is simply connected if and only if every covering $\psi : T \to \Sigma$ with $T$ connected is a holomorphic isomorphism.

Proof. Suppose $\Sigma$ is simply connected and $\psi : T \to \Sigma$ is a covering with $T$ connected. We already remarked that $\psi$ is onto by Theorem 3.3.4. Suppose $\psi(x_1) = \psi(x_2)$. Let $\omega$ be a path in $T$ with beginning point $x_1$ and end point $x_2$. Then $\psi \circ \omega$ has a beginning point equal to its end point, and hence is homotopic to the constant path since $\Sigma$ is simply connected. Thus, by Theorem 3.3.4, $x_1 = x_2$. Thus, $\psi$ is also injective, and thus is a holomorphic isomorphism.

On the other hand, suppose $\Sigma$ is connected but not simply connected. Then the universal covering $\pi : \tilde\Sigma \to \Sigma$ cannot be a holomorphic isomorphism, since $\tilde\Sigma$ is simply connected. □

4.2.3 Corollary. (Uniqueness of universal covering) Let $\Sigma$ be a connected Riemann surface. A based universal covering $\pi : \tilde\Sigma \to \Sigma$ with base points $\tilde x_0 \mapsto x_0$ is unique in the sense that for any other based universal covering $\psi : T \to \Sigma$ with base points $z_0 \mapsto x_0$, there exists a unique holomorphic isomorphism $\phi : \tilde\Sigma \to T$ such that $\phi(\tilde x_0) = z_0$. In fact, $\psi \circ \phi = \pi$.

Proof. By Theorem 4.2.1, there exists a unique covering $\phi : \tilde\Sigma \to T$ with the specified properties. Since $T$ is simply connected, by Theorem 4.2.2, $\phi$ is a holomorphic isomorphism. □
4.2.4 Multi-valued functions

Let $\Sigma$ be a based connected Riemann surface with base point $x_0$, and let $\pi : \tilde\Sigma \to \Sigma$ be a based universal covering with base points $\tilde x_0 \mapsto x_0$. Then we define a multi-valued holomorphic function on $\Sigma$ based at $x_0$ as a holomorphic function on $\tilde\Sigma$.
Note that then, in particular, the multivalued function based at a point $x_0$ does have a well-defined “value” at the point $x_0$. Multivalued holomorphic functions based at $x_0$ form an algebra in the sense that they contain (ordinary) holomorphic functions (a holomorphic function $f$ is identified with the multi-valued function $f \circ \pi$), and have well-defined operations of addition and multiplication. Much more is true, of course: for example, if $f$ is a multi-valued holomorphic function based at $x_0$ and $g : \mathbb{C} \to \mathbb{C}$ is an ordinary holomorphic function, then there is a well-defined multivalued holomorphic function $g \circ f$ based at $x_0$.

Note that by Corollary 4.2.3, the choice of $\tilde\Sigma$ does not matter in the sense that multi-valued holomorphic functions defined via any other based holomorphic universal covering are related to those defined via $\tilde\Sigma$ by a preferred bijection, namely the one induced by the based holomorphic isomorphism between $\tilde\Sigma$ and $T$, and that this bijection preserves all the operations in sight. It is important to note, however, that unless $\Sigma$ is simply connected, there is no preferred way of identifying the algebras of multivalued holomorphic functions based at different base points of $\Sigma$.

Examples of multi-valued holomorphic functions on Riemann surfaces can be obtained from holomorphic 1-forms $\omega$: Note that we have a primitive function $F$ of $\omega$ well-defined on any connected component of the covering $\Sigma_\omega$, and hence, by Theorem 4.2.1, on the universal cover. This is referred to as the multi-valued primitive function of $\omega$. Note that a discussion of base points is not so important here, since no matter how we choose base points, two multi-valued primitive functions of the same holomorphic 1-form will differ by a constant. In particular, for connected open sets $U \subseteq \mathbb{C}$, we have a well-defined notion (up to additive constant) of a multivalued primitive function based at $z_0 \in U$ of a given multi-valued function based at $z_0$.
For example, the multi-valued primitive function of
\[
f(z) = \frac{1}{z - z_0}
\]
on $\mathbb{C} \setminus \{z_0\}$ with value equal to 0 at the base point, which is chosen to project to $z_0 + 1 \in \mathbb{C} \setminus \{z_0\}$, is called the multivalued logarithm $\ln(z - z_0)$. Choosing an arbitrary $\alpha \in \mathbb{C}$, we then obtain the multivalued function
\[
(z - z_0)^{\alpha} = e^{\alpha \ln(z - z_0)}
\]
on $\mathbb{C} \setminus \{z_0\}$, also based at $z_0 + 1 \in \mathbb{C} \setminus \{z_0\}$. Sometimes different conventions of base points are appropriate (see below). In any case, no matter what base point we specify, the multi-valued logarithm is well defined up to adding an integral multiple of $2\pi i$, and $(z - z_0)^{\alpha}$ is well defined up to a non-zero multiplicative constant of the form $e^{2\pi i k \alpha}$ with $k \in \mathbb{Z}$ (which has modulus 1 when $\alpha$ is real).
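To see where these ambiguities come from, one can compare the values obtained at two points of the universal covering lying over the same $z$. The sketch below assumes the two points are joined by the lift of a loop winding $k$ times counter-clockwise around $z_0$; the symbols $\tilde z$, $\tilde z'$, $\gamma$ and $k$ are illustrative and not notation from the text.

```latex
% Assumption: \tilde z and \tilde z' project to the same point z, and are joined
% by the lift of a loop \gamma winding k times counter-clockwise around z_0.
\[
\ln(\tilde z' - z_0) - \ln(\tilde z - z_0)
  = \int_{\gamma} \frac{d\zeta}{\zeta - z_0}
  = 2\pi i k ,
\]
\[
(\tilde z' - z_0)^{\alpha}
  = e^{\alpha\,(\ln(\tilde z - z_0) + 2\pi i k)}
  = e^{2\pi i k \alpha}\,(\tilde z - z_0)^{\alpha}.
\]
% For real \alpha the factor e^{2\pi i k \alpha} has modulus 1;
% for integer \alpha it equals 1, and the power is single-valued.
```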
4.2.5 Example

The behavior of multi-valued functions can be quite complicated. Consider the multivalued function
\[
f(z) = z^a (z-1)^b \tag{1}
\]
on $U = \mathbb{C} \setminus \{0, 1\}$. Assume, for simplicity, $a, b > 0$ to be real numbers. (Note that there exist unique multi-valued functions $z^a$, $(z-1)^b$ based at any chosen point $0 < z_0 < 1$ whose values at the base point are positive real numbers.) Now let $F$ be the multi-valued primitive function of $f$ on $U$. (Let, for example, the value of $F$ at the base point $\tilde z_0$, $\pi(\tilde z_0) = z_0$, be 0.) Now let $K$ be the circle with center 0 and radius $z_0$ (and beginning point $z_0$) oriented counter-clockwise, and let $L$ be the circle with center 1 and radius $1 - z_0$ (and beginning point $z_0$) oriented counter-clockwise. Let $\omega$ be a concatenation of $m$ copies of $K$ and $n$ copies of $L$ (in any fixed order), and let $\tilde z_1$ be the end point of the lift $\tilde\omega$ to the universal covering with beginning point $\tilde z_0$. Then one immediately sees that
\[
f(\tilde z_1) = f(\tilde z_0)\, e^{2\pi(ma + nb)i}. \tag{2}
\]
Let us now examine the behavior of the function $F$: First note that the integrals
\[
A = \int_0^{z_0} z^a (z-1)^b\, dz, \qquad B = \int_{z_0}^{1} z^a (z-1)^b\, dz
\]
actually exist in the sense of ordinary real analysis, and are equal to (finite) positive real numbers. Additionally, the integrals of (1) over a circle with radius $\varepsilon$ and center 0 or 1 go to 0 with $\varepsilon \to 0$. Because of this, if $\tilde K$ is a lift of $K$ to the universal covering with beginning point $\tilde z_1$ as above, we have, denoting the end point by $\tilde z_2$,
\[
F(\tilde z_2) - F(\tilde z_1) = e^{2\pi(ma + nb)i}\,(e^{2\pi a i} - 1)\,A, \tag{3}
\]
while if $\tilde L$ is a lift of $L$ to the universal cover with beginning point $\tilde z_1$ as above and end point $\tilde z_3$, we have
\[
F(\tilde z_3) - F(\tilde z_1) = e^{2\pi(ma + nb)i}\,(1 - e^{2\pi b i})\,B. \tag{4}
\]
Note that the operations (3), (4) do not commute: if we begin at $\tilde z_1$ and follow first $K$ and then $L$, the value of the primitive function increases by
\[
e^{2\pi(ma + nb)i}\,(e^{2\pi a i} - 1)\,A + e^{2\pi((m+1)a + nb)i}\,(1 - e^{2\pi b i})\,B,
\]
while following $L$ first and then $K$ beginning from the same point $\tilde z_1$ gives an increase of
\[
e^{2\pi(ma + (n+1)b)i}\,(e^{2\pi a i} - 1)\,A + e^{2\pi(ma + nb)i}\,(1 - e^{2\pi b i})\,B.
\]
These two values are in general not equal. Because of this, it is not true, contrary to what one may naively expect, that $F(z)/f(z)$ would be a single-valued function on $U$ (in the sense that it would be a composition of an ordinary holomorphic function on $U$ with $\pi$). Note that we also see that the end points of the lifts of $K * L$ and $L * K$ to the universal covering with the same beginning point are, in fact, different. Up to normalization, the function $F$ belongs to a family of functions called hypergeometric functions; they are, in some sense, the “simplest” multivalued holomorphic functions on a connected open subset of $\mathbb{C}$ for which this phenomenon occurs.
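The failure of commutativity can be quantified by subtracting the two increments. In the sketch below, $c$, $\Delta_{KL}$ and $\Delta_{LK}$ are shorthand introduced only for this computation; $A$, $B$, $a$, $b$, $m$, $n$ are as in the example.

```latex
% c := e^{2\pi(ma+nb)i} (illustrative shorthand); A, B > 0 as in the example.
% \Delta_{KL}: increase of F along K followed by L; \Delta_{LK}: along L followed by K.
\begin{align*}
\Delta_{KL} &= c\,(e^{2\pi a i}-1)\,A + c\,e^{2\pi a i}\,(1-e^{2\pi b i})\,B,\\
\Delta_{LK} &= c\,e^{2\pi b i}\,(e^{2\pi a i}-1)\,A + c\,(1-e^{2\pi b i})\,B,\\
\Delta_{KL}-\Delta_{LK}
  &= c\,(e^{2\pi a i}-1)\,(1-e^{2\pi b i})\,(A+B).
\end{align*}
% Since |c| = 1 and A + B > 0, the difference vanishes
% if and only if a or b is an integer.
```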
4.3 The fundamental group
Let $\Sigma$ be a connected Riemann surface with base point $x_0$. Denote by $\pi_1(\Sigma, x_0)$ the set of all homotopy classes of paths in $\Sigma$ with beginning point and end point $x_0$. Recall the proof of Theorem 4.1, and specifically the operation $*$ of concatenation of paths. From the arguments given there, it follows that $*$ gives a well-defined binary operation on $\pi_1(\Sigma, x_0)$. Moreover, note that the constant path at $x_0$ is a unit element for the operation $*$. Also observe that if, for a path $\omega$, we define a path $\overline{\omega}$ given by
\[
\overline{\omega}(t) = \omega(1-t),
\]
then $[\overline{\omega}]$ is the inverse of $[\omega]$ with respect to $*$. Thus, the set $\pi_1(\Sigma, x_0)$ with the operation $*$ is a group in the sense of Appendix B, 3.1. This group is called the fundamental group of $\Sigma$ with base point $x_0$.

There are many interesting and deep connections between the fundamental group and coverings, which we cannot explore in this text, in part because we do not develop the theory of groups in any substantial way. After filling in the necessary algebra, say, in [2], the reader can find more information in [6, 13, 20]. There is, however, one connection between the fundamental group and the universal cover which is too beautiful and striking to pass up. Consider a based universal cover
\[
\pi : (\tilde\Sigma, \tilde x_0) \to (\Sigma, x_0)
\]
of a connected Riemann surface $\Sigma$. A deck transformation is a homeomorphism $f : \tilde\Sigma \to \tilde\Sigma$ such that the following diagram commutes:
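\[
\begin{array}{ccc}
\tilde\Sigma & \xrightarrow{\;\;f\;\;} & \tilde\Sigma \\
 & {\scriptstyle \pi}\searrow \quad \swarrow{\scriptstyle \pi} & \\
 & \Sigma &
\end{array}
\]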
(In other words, such that $\pi \circ f = \pi$.) Note that a deck transformation is automatically a holomorphic isomorphism, and that deck transformations form a group with respect to composition of maps. Denote this group by $\Gamma$. Now define a map
\[
\Phi : \Gamma \to \pi^{-1}[\{x_0\}]
\]
by letting, for a deck transformation $f$,
\[
\Phi(f) = f(\tilde x_0).
\]
Define, on the other hand, a map
\[
\Psi : \pi_1(\Sigma, x_0) \to \pi^{-1}[\{x_0\}]
\]
as follows: Let $\omega$ be a path in $\Sigma$ with beginning point and end point $x_0$. Let $\tilde\omega$ be the path in $\tilde\Sigma$ which is the unique lifting of $\omega$ with beginning point $\tilde x_0$ (see Theorem 3.3.4). Then let $\Psi([\omega])$ be the end point of $\tilde\omega$. By Theorem 3.3.4, this does not depend on the choice of the representative $\omega$ of the class $[\omega] \in \pi_1(\Sigma, x_0)$. The following result can often be used to compute the fundamental group (see Exercise (15)).

Theorem. The maps $\Phi$ and $\Psi$ are bijections. Moreover, the composition $\Phi^{-1} \circ \Psi$ is a homomorphism (hence isomorphism) of groups.

Proof. The fact that $\Phi$ is bijective is a special case of the universality Theorem 4.2.1. To show that $\Psi$ is onto, recall that $\tilde\Sigma$ is connected, and hence path-connected. Let $y \in \pi^{-1}[\{x_0\}]$ and let $\eta$ be a path in $\tilde\Sigma$ from $\tilde x_0$ to $y$. Put $\omega = \pi \circ \eta$. Then $\Psi([\omega]) = y$. To prove injectivity, note that the $\eta$ just mentioned is unique up to homotopy since $\tilde\Sigma$ is simply connected, and composing with $\pi$ gives uniqueness of $[\omega]$.

To prove that $\Phi^{-1} \circ \Psi$ is a homomorphism of groups, let $\eta$, $\omega$ be paths in $\Sigma$ with beginning points and end points $x_0$ and let $\tilde\eta$, $\tilde\omega$ be their lifts to $\tilde\Sigma$ with beginning point $\tilde x_0$. Let, on the other hand, $\hat\eta$ be the lift of $\eta$ to $\tilde\Sigma$ whose beginning point is the end point $\tilde x_1$ of $\tilde\omega$. Now let $f$ be a deck transformation which sends $\tilde x_0$ to $\tilde x_1$. Then by uniqueness of path lifting, $f \circ \tilde\eta = \hat\eta$. In particular, if we denote the end point of $\tilde\eta$ by $\hat x_0$ and the end point of $\hat\eta$ by $\hat x_1$, then
\[
f(\hat x_0) = \hat x_1.
\]
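Let $g$ be the deck transformation which sends $\tilde x_0$ to $\hat x_0$. We see that
\[
\Phi(f \circ g) = \hat x_1 = \Psi([\omega * \eta]),
\]
so
\[
f \circ g = \Phi^{-1} \circ \Psi([\omega * \eta]),
\]
while
\[
f = \Phi^{-1} \circ \Psi([\omega]), \qquad g = \Phi^{-1} \circ \Psi([\eta]),
\]
which is what we wanted to prove. □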
4.4 Comment
The reader no doubt noticed that the concepts of covering, universal covering, and fundamental group do not use the structure of a Riemann surface very substantially. They can, indeed, be defined for more general topological spaces. In order for the nice theorems we presented to be true, however, some “local assumptions” about the topological spaces involved must be included. The book [20] contains an easily accessible discussion of coverings in a more general topological context. One case which works very well is the case of smooth (or even topological) manifolds. Definition 3.3.1, Theorem 3.3.4, Theorem 4.1, Theorem 4.2.1, the definition of fundamental group in 4.3 and Theorem 4.3 remain valid if we replace “Riemann surface” by “smooth manifold” (resp. “topological manifold”) and “holomorphic isomorphism” by “diffeomorphism” (resp. “homeomorphism”). Yet, the case of Riemann surfaces, which we discussed above, is particularly striking, and in this context, coverings were first discovered by Riemann.
5 Complex analysis beyond holomorphic functions
We are now ready to extend Cauchy’s formula (Theorem 3.3 of Chapter 10) to the case of any continuously (real)-differentiable function. Let us write an integration variable $\zeta = s + it$ (to distinguish from the standard convention $z = x + iy$).

5.1 Theorem. (The Cauchy-Green formula) Let $U$ be a domain in $\mathbb{C}$. Let $L_1, \dots, L_k$ be simple piecewise continuously differentiable closed curves with disjoint images such that $L_1 \amalg \dots \amalg L_k$ is the boundary of $U$ oriented counter-clockwise. Let $f$ be defined and have continuous real partial derivatives on an open set $V \subseteq \mathbb{C}$ containing $\overline{U}$. Then for $z \in U$, we have
\[
-\frac{1}{\pi} \int_U \frac{(\partial f/\partial \overline{\zeta})\, ds\, dt}{\zeta - z} + \frac{1}{2\pi i} \sum_{j=1}^{k} \int_{L_j} \frac{f(\zeta)\, d\zeta}{\zeta - z} = f(z). \tag{5.1.1}
\]
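Proof. It is actually almost the same as the proof of Theorem 3.3 of Chapter 10. Using the language of Subsection 3.5, we may rewrite (5.1.1) as
\[
-\frac{1}{2\pi i} \int_U d\!\left( \frac{f(\zeta)\, d\zeta}{\zeta - z} \right) + \frac{1}{2\pi i} \sum_{j=1}^{k} \int_{L_j} \frac{f(\zeta)\, d\zeta}{\zeta - z} = f(z). \tag{*}
\]
On the other hand, for $\varepsilon > 0$ small, if we denote by $K$ the boundary of $\Omega(z, \varepsilon)$ oriented counter-clockwise, then Green’s Theorem 5.4 of Chapter 8 gives
\[
-\frac{1}{2\pi i} \int_{U \setminus \Omega(z,\varepsilon)} d\!\left( \frac{f(\zeta)\, d\zeta}{\zeta - z} \right) + \frac{1}{2\pi i} \sum_{j=1}^{k} \int_{L_j} \frac{f(\zeta)\, d\zeta}{\zeta - z} = \frac{1}{2\pi i} \int_K \frac{f(\zeta)\, d\zeta}{\zeta - z}. \tag{**}
\]
When $\varepsilon \to 0$, the right-hand side tends to $f(z)$ by the same argument as in the proof of Theorem 3.3 of Chapter 10. So it remains to prove that
\[
\lim_{\varepsilon \to 0} \int_{\Omega(z,\varepsilon)} \frac{(\partial f/\partial \overline{\zeta})\, ds\, dt}{\zeta - z} = 0,
\]
which, since $\partial f/\partial \overline{\zeta}$ is continuous and hence bounded near $z$, follows from
\[
\lim_{\varepsilon \to 0} \int_{\Omega(z,\varepsilon)} \frac{ds\, dt}{|\zeta - z|} = 0,
\]
which is an obvious calculation (for example in polar coordinates, where the integral equals $2\pi\varepsilon$). □

5.2 The “inverse” Cauchy-Riemann operator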
The Cauchy-Green formula is the starting point of applying methods of complex analysis to classes of functions which are not necessarily holomorphic, but merely satisfy differentiability conditions in the real sense. We will use these methods in Section 5 of Chapter 15, when we will construct a complex structure on an oriented surface with a Riemann metric. Recall Hölder’s inequality (Theorem 8.1 of Chapter 5)
\[
\left\| \int_{\mathbb{C}} f(\zeta)\, g(z - \zeta)\, ds\, dt \right\|_{\infty} \le \|f\|_p\, \|g\|_q \quad \text{for } \frac{1}{p} + \frac{1}{q} = 1. \tag{5.2.1}
\]
We will focus here on functions defined on an open disk
\[
D = \Omega(0, 1).
\]
(We could, of course, equivalently work on any other disk.) Where needed, we may extend such functions to $\mathbb{C}$ by 0. Note also that for $z \in D$, using polar coordinates, we have
\[
\zeta \mapsto \frac{1}{\zeta - z} \in L^q(D) \quad \text{for every } q < 2.
\]
Thus, by (5.2.1), for $p > 2$, we have a well-defined operator $P : L^p(D) \to L^{\infty}(\mathbb{C})$ defined by
\[
(P(f))(z) = -\frac{1}{\pi} \int_D \frac{f(\zeta)\, ds\, dt}{\zeta - z}.
\]
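The claim that $1/(\zeta - z)$ lies in $L^q(D)$ for $q < 2$ can be checked directly. The following sketch bounds the integral for $z \in D$ by enlarging $D$ to the open disk $\Omega(z, 2)$ of radius 2 about $z$ (an illustrative choice made here) and passing to polar coordinates $\zeta = z + re^{i\theta}$.

```latex
% Assumption: z \in D, so D \subseteq \Omega(z,2); polar coordinates centered at z.
\[
\int_D \frac{ds\,dt}{|\zeta - z|^{q}}
\;\le\; \int_{\Omega(z,2)} \frac{ds\,dt}{|\zeta - z|^{q}}
\;=\; \int_0^{2\pi}\!\!\int_0^{2} r^{1-q}\,dr\,d\theta
\;=\; \frac{2\pi\,2^{2-q}}{2-q}
\qquad (q < 2).
\]
% For q \ge 2 the radial integral diverges at r = 0,
% which is why the restriction q < 2 (equivalently p > 2) appears.
```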
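We will also need another version of this operator, defined by the formula
\[
(P_1(f))(z) = -\frac{1}{\pi} \int_{\mathbb{C}} f(\zeta) \left( \frac{1}{\zeta - z} - \frac{1}{\zeta} \right) ds\, dt.
\]
Note that the function
\[
\zeta \mapsto \frac{1}{\zeta - z} - \frac{1}{\zeta}
\]
is in $L^q(\mathbb{C} \setminus \Omega(0, 2|z|))$ for every $q > 1$, and thus $P_1(f)$ is defined for any function $f \in L^p(\mathbb{C})$, $p > 2$, and produces a (not necessarily bounded) complex function defined everywhere on $\mathbb{C}$.

5.2.1 Lemma. Let $f$ be a continuous function on $\mathbb{C}$ with support in $D$ such that
\[
|f(z) - f(t)| \le K |z - t|^{\alpha} \quad \text{for some } \alpha > 0 \tag{5.2.2}
\]
for some constant $K$. Then $P(f)$ is a continuously differentiable function on $\mathbb{C}$. In fact, we have
\[
\frac{\partial P(f(z))}{\partial z} = -\frac{1}{\pi} \int_D \frac{f(\zeta) - f(z)}{(\zeta - z)^2}\, ds\, dt, \qquad \frac{\partial P(f(z))}{\partial \overline{z}} = f(z). \tag{5.2.3}
\]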
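If $f$ is a continuous function on $\mathbb{C}$ which is in $L^p(\mathbb{C})$, $p > 2$, and satisfies (5.2.2), then $P_1(f)$ is a continuously differentiable function on $\mathbb{C}$ and
\[
\frac{\partial P_1(f(z))}{\partial z} = -\frac{1}{\pi} \int_{\mathbb{C}} \left( \frac{f(\zeta) - f(z)}{(\zeta - z)^2} - \frac{f(\zeta) - f(0)}{\zeta^2} \right) ds\, dt, \qquad \frac{\partial P_1(f(z))}{\partial \overline{z}} = f(z) - f(0). \tag{5.2.4}
\]
Proof. Let us first prove the statement for $P$. Using polar coordinates, one easily proves the identity
\[
-\frac{1}{\pi} \int_D \frac{ds\, dt}{\zeta - z} = \overline{z} \quad \text{for } z \in D. \tag{1}
\]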
Using this formula for $z \in D$ and small $|\Delta z|$, we have
\[
(P(f))(z + \Delta z) - (P(f))(z) = -\frac{1}{\pi} \int_D \frac{f(\zeta) - f(z)}{(\zeta - z - \Delta z)(\zeta - z)}\, \Delta z\, ds\, dt + f(z)\, \overline{\Delta z}. \tag{2}
\]
Dividing by $\Delta z$ and taking limits $\Delta z \to 0$ along the real and imaginary directions (i.e. for $\Delta z = \Delta x$ and for $\Delta z = i\Delta y$), and combining the resulting real partial derivatives, we obtain the formulas (5.2.3). Note carefully that the function
\[
\frac{f(\zeta) - f(z)}{|\zeta - z|^{\alpha}}
\]
(with arbitrary value at $\zeta = z$) is, by assumption, bounded in $\zeta \in D$. After dividing by $\Delta z$, the limit behind the integral sign can be taken by the Lebesgue Dominated Convergence Theorem after we restrict the integral to $D \setminus \Omega(z, 2|\Delta z|)$. The remaining integral is bounded by a constant times
\[
\int_{\Omega(z, 2|\Delta z|)} \frac{ds\, dt}{|\zeta - z|^{1-\alpha}\, |\zeta - z - \Delta z|}. \tag{3}
\]
We must show that (3) converges to 0 with $\Delta z \to 0$. The integral (3) is certainly finite, and without loss of generality, $z = 0$. Now a substitution $\zeta = \xi\, \Delta z$ shows that (3) is proportional to $|\Delta z|^{\alpha}$, and hence tends to 0 with $\Delta z \to 0$, as needed. Proving the continuity of
\[
\frac{\partial P(f(z))}{\partial z}
\]
is actually easier: we may use the Lebesgue Dominated Convergence Theorem directly on the entire range of integration after shifting the integration variable by $z$. If $z$ is not in the support of $f$, (2) still remains valid since $f(z) = 0$ (formula (1) is not used in that case).

The case of $P_1$ is analogous: Instead of (2), we have
\[
(P_1(f))(z + \Delta z) - (P_1(f))(z) = -\frac{1}{\pi} \int_{\mathbb{C}} \left( \frac{f(\zeta) - f(z)}{(\zeta - z - \Delta z)(\zeta - z)} - \frac{f(\zeta) - f(0)}{(\zeta - \Delta z)\,\zeta} \right) \Delta z\, ds\, dt + (f(z) - f(0))\, \overline{\Delta z}.
\]
The Lebesgue dominated convergence argument can then be applied on the set
\[
\mathbb{C} \setminus \bigl( \Omega(z, 2|\Delta z|) \cup \Omega(0, 2|\Delta z|) \bigr),
\]
and we use the estimate (5.2.2) at the point $z$ on $\Omega(z, 2|\Delta z|)$ and at the point $z = 0$ on $\Omega(0, 2|\Delta z|)$. The rest of the argument is the same. □

As an application, we get the following extension of Liouville’s Theorem, which will be useful in Section 5 of Chapter 15.

5.3 Theorem. Let $f$ be a function on $\mathbb{C}$ with continuous first (real) partial derivatives. Assume that
\[
\lim_{z \to \infty} f(z) = 0
\]
and that there exists a function $A(z)$ with continuous first (real) partial derivatives and compact support such that
\[
\frac{\partial f}{\partial \overline{z}} = A f.
\]
Then $f(z) = 0$ for all $z \in \mathbb{C}$.

Proof. Assume without loss of generality that the support of $A(z)$ is contained in $D$. Put $F(z) = f(z)\, e^{-(P(A))(z)}$. Using Lemma 5.2.1, we compute
\[
\frac{\partial F}{\partial \overline{z}} = e^{-(P(A))(z)} \left( \frac{\partial f(z)}{\partial \overline{z}} - f(z) A(z) \right) = 0.
\]
Thus, $F$ is a holomorphic function on $\mathbb{C}$, and since $P(A)$ is bounded, $F$ tends to 0 at $\infty$, so it is zero by Liouville’s Theorem 5.1 of Chapter 10. □

Finally, we will prove two easy inequalities involving the operator $P$, which will also be useful in Section 5 of Chapter 15:

5.3.1 Lemma. (1) If $f$ is a continuously differentiable function on $\mathbb{C}$ with support in $D$, then
\[
|(P(f))(z)| \le \frac{8\, \|f\|_{\infty}}{1 + |z|}.
\]
(2) For every $p > 2$ there exists a constant $C_p$ such that if $f \in L^p(\mathbb{C})$, then
\[
|(P_1(f))(z_1) - (P_1(f))(z_2)| \le C_p\, \|f\|_p\, |z_1 - z_2|^{1 - 2/p}.
\]
Proof. For (1), clearly it suffices to prove that
\[
\int_D \frac{ds\, dt}{|\zeta - z|} \le \frac{8\pi}{1 + |z|}. \tag{*}
\]
First of all, by polar coordinates, we clearly have
\[
\int_{\Omega(0,r)} \frac{dx\, dy}{\sqrt{x^2 + y^2}} = 2\pi r.
\]
Thus, for $|z| \le 1$, we may use $r = 2$ to show that the left-hand side of (*) is less than or equal to $4\pi$. For $|z| > 1$, the idea is to integrate $1/|\zeta - z|$ over the intersection of
\[
\Omega(z, 1 + |z|) \setminus \Omega(z, |z| - 1) \tag{**}
\]
with the smallest angle with center $z$ which contains $D$. As already remarked, the integral of $1/|\zeta - z|$ over (**) is $4\pi$, so the integral over the intersection of (**) with an angle of size $\alpha$ will be $2\alpha$. The angle in question has size
\[
2 \arcsin(1/|z|) \le \frac{\pi}{|z|} \le \frac{2\pi}{1 + |z|},
\]
which is good enough.

To prove (2), use Hölder’s inequality with $1/p + 1/q = 1$ (put $\eta = \zeta/|z|$, $u = s/|z|$, $v = t/|z|$; by rotational symmetry, the last integral below does not depend on the direction of $z$):
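\[
\left| \int_{\mathbb{C}} \left( \frac{f(\zeta)}{\zeta - z} - \frac{f(\zeta)}{\zeta} \right) ds\, dt \right|
\le \|f\|_p \left( \int_{\mathbb{C}} \frac{ds\, dt}{|\zeta(\zeta - z)|^{q}} \right)^{1/q} |z|
= \|f\|_p \left( \int_{\mathbb{C}} \frac{du\, dv}{|\eta(\eta - 1)|^{q}} \right)^{1/q} |z|^{(2/q) - 1}.
\]
Applying this to the function $f(\zeta + z_2)$ at the point $z_1 - z_2$ gives the claimed inequality (note that $(2/q) - 1 = 1 - 2/p$). □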
6 Exercises
(1) Prove that a Möbius transformation maps an open disk, an open half-plane or the complement of a closed disk onto an open disk, open half-plane or the complement of a closed disk. Prove furthermore that for any two subsets of $\mathbb{C} \cup \{\infty\}$ of any two of the above three types, there exists a Möbius transformation mapping one onto the other.

(2) Let $w_0 \in \Omega(0, 1)$. Consider the Möbius transformation
\[
f(z) = \frac{z - w_0}{1 - \overline{w_0}\, z}.
\]
Prove that this gives a holomorphic automorphism of $\Omega(0, 1)$ which maps $w_0$ to 0. [Hint: Consider the effect of this Möbius transformation on $|z| = 1$.]

(3) Construct a non-constant (non-injective) holomorphic function $f$ on $\mathbb{C}$ which is not onto.

(4) Let $a < b \in \mathbb{R}$ and let $f, g : [a, b] \to \mathbb{R}$ be continuous real functions which are continuously differentiable in $(a, b)$ and such that for $a < x < b$, $f(x) < g(x)$. Prove that the set $\{x + iy \in \mathbb{C} \mid a < x < b,\ f(x) < y < g(x)\}$ is simply connected.

(5) Find an elementary function which maps the set $\{z \in \Omega(0, 1) \mid \mathrm{Re}(z) > 0,\ \mathrm{Im}(z) > 0\}$ bijectively holomorphically onto $\Omega(0, 1)$. [Hint: Find, in this order, holomorphic isomorphisms of the set described onto an open half-disk, an open quadrant, an open half-plane, $\Omega(0, 1)$.]

(6) Show that if the polygon $P$ is a triangle, then in Theorem 2.4, the points $w_1, w_2, w_3$ can be chosen to be any three points on the unit circle which occur in this order when the circle is oriented counter-clockwise. [Hint: Using the maps of Exercise (2) and rotations, show that there is a holomorphic automorphism of $\Omega(0, 1)$ which extends holomorphically to an open set containing $\overline{\Omega(0, 1)}$, and maps a given choice of points $w_1, w_2, w_3$ to any other such given choice.]

(7) Determine a choice of the points $w_i$ when $P$ is a regular $k$-gon.
(8) Using the Schwarz-Christoffel formula, write down an explicit formula (with one free parameter) for a function $f$ mapping bijectively holomorphically the upper half-plane onto a rectangle. Such formulas are called elliptic integrals. Using complex conjugation (similarly as in Lemma 2.3), prove that the inverse function $g$ extends to a meromorphic function on $\mathbb{C}$, which is doubly periodic, with periods equal to the sides of the rectangle (such functions are called elliptic functions). For information on elliptic functions, the reader may look at [11].

(9) Prove that every holomorphic function on a compact connected Riemann surface is constant.

(10) Prove that the Möbius transformations are the only holomorphic automorphisms of $\mathbb{C}P^1$. [Hint: Use Proposition 1.1.1.]

(11) Prove that non-constant meromorphic functions on $\mathbb{C}P^1$ are precisely rational functions, i.e. functions of the form $p(z)/q(z)$ where $p(z)$, $q(z)$ are polynomials, $q(z)$ not identically zero. [Hint: Multiply (resp. divide) such a function $f(z)$ by the product of all factors $(z - z_i)^{k_i}$ where $z_i$ is a pole (resp. zero) of order $k_i$ in $\mathbb{C}$ (infinitely many zeroes or poles would mean $f$ is a constant 0 or $\infty$ by the Uniqueness Theorem 4.4 of Chapter 10). Then we may assume without loss of generality that the restriction of $f$ to $\mathbb{C}$ has neither zeroes nor poles. Now if $f(\infty) \ne \infty$, then $f$ is bounded on $\mathbb{C}$, while if $f(\infty) = \infty$, then $1/f(z)$ is bounded on $\mathbb{C}$. In either case, $f$ is constant by Liouville’s Theorem 5.1 of Chapter 10.]

(12) Prove in detail that for $U \subseteq \mathbb{C}$ an open set, $F(z)$ is a primitive function for the 1-form $f(z)\,dz$ with $f$ holomorphic if and only if $F(z)$ is a primitive function of $f(z)$.

(13) Prove in detail that the definitions of the operators $\partial$, $\overline{\partial}$ on differential forms on a Riemann surface are invariant under holomorphic change of coordinates.

(14) Prove that the function $e^z$, considered as a holomorphic map $\mathbb{C} \to \mathbb{C} \setminus \{0\}$, is a covering and that, in fact, it is the universal covering of $\mathbb{C} \setminus \{0\}$.

(15) From Exercise (14), construct an isomorphism $\pi_1(\mathbb{C} \setminus \{0\}, x_0) \to \mathbb{Z}$ for any base point $x_0$.

(16) Prove that $\pi_1(\mathbb{C} \setminus \{0, 1\}, x_0)$ is not abelian for any base point $x_0$. [Hint: Use Example 4.2.5.]

(17) Prove that the Riemann surface $\mathbb{C}P^1$ is simply connected. [Hint: Use smooth partition of unity to prove that every path is homotopic to a piecewise continuously differentiable parametrized curve. For any two piecewise continuously differentiable curves, there exists a point which is in neither of their images.]

(18) Prove that a connected Riemann surface $\Sigma$ (or, for that matter, a connected manifold, see Comment 4.4) with a point $x_0 \in \Sigma$ is simply connected if and only if $\pi_1(\Sigma, x_0)$ is the trivial group (i.e. a group with a single element).
(19) Recall the concept of a Lie group from Chapter 12, Exercise (6). Prove that the fundamental group of a Lie group is commutative (see Comment 4.4). [Hint: the concatenation of paths is homotopic to the point-wise product, using the group operation.]

(20) Prove that if $\pi : \Gamma \to G$ is a covering and $G$ is a Lie group (cf. Comment 4.4) with both $G$ and $\Gamma$ connected, then $\Gamma$ can be given a structure of a Lie group such that $\pi$ is a homomorphism of groups.

(21) Define for $f \in L^p(D)$, $p > 2$,
\[
(Q(f))(z) = -\frac{1}{\pi} \int_D \frac{f(\zeta)\, ds\, dt}{\overline{\zeta} - \overline{z}}.
\]
Assuming (5.2.2), calculate
\[
\frac{\partial Q(f(z))}{\partial z}, \qquad \frac{\partial Q(f(z))}{\partial \overline{z}}.
\]
14 Calculus of Variations and the Geodesic Equation
The aim of this chapter is to give a glimpse of the main principle of the calculus of variations which, in its most basic problem, concerns minimizing certain types of linear functions on the space of continuously differentiable curves in Rn with fixed beginning point and end point. For further study in this subject, we recommend [7]. We derive the Euler-Lagrange equation which can be used to axiomatize a large part of classical mechanics. We then consider in more detail the possibly most fundamental example of the calculus of variations, namely the problem of finding the shortest curve connecting two points in an open set in Rn with an arbitrary given (smoothly varying) inner product on its tangent space. The Euler-Lagrange equation in this case is known as the geodesic equation. The smoothly varying inner product captures the idea of curved space. Thus, solving the geodesic equation here goes a long way toward motivating the basic techniques of Riemannian geometry, which we will develop in the next chapter.
1 The basic problem of the calculus of variations, and the Euler-Lagrange equations
1.1 For the purposes of this chapter, define a continuously differentiable function
\[
\mathbf{y} : \langle a, b \rangle \to \mathbb{R}^n \tag{*}
\]
as a function with the property that the function defined as the derivative of $\mathbf{y}$ on $(a, b)$ and as the respective one-sided derivatives at $a$ and $b$ is everywhere defined and continuous on $\langle a, b \rangle$. Now consider the vector space $V = V_{a,b,\mathbf{p},\mathbf{q}}$ of all continuously differentiable functions (*) such that
\[
\mathbf{y}(a) = \mathbf{p} \quad \text{and} \quad \mathbf{y}(b) = \mathbf{q}
\]
for some fixed values $\mathbf{p}, \mathbf{q} \in \mathbb{R}^n$. Let
\[
L = L(t, x_1, \dots, x_n, v_1, \dots, v_n) : \langle a, b \rangle \times \mathbb{R}^{2n} \to \mathbb{R}
\]
be a function with continuous first partial derivatives (again, take one-sided derivatives at $a$, $b$ when applicable). The most basic problem of the calculus of variations is looking for the extremes of the function (we use the term functional)
\[
S : V \to \mathbb{R}
\]
given by
\[
S(\mathbf{y}) = \int_a^b L(t, \mathbf{y}(t), \mathbf{y}'(t))\, dt.
\]
Note that $S$ is continuous when we consider the metric on $V$ given by the norm
\[
\|\mathbf{y}\| = \sup_{t \in \langle a, b \rangle} \|\mathbf{y}(t)\| + \sup_{t \in \langle a, b \rangle} \|\mathbf{y}'(t)\|.
\]
(Here we may choose any of the usual norms on $\mathbb{R}^n$, for example the maximum one.) However, in the kind of formal investigation we are going to do, even this will play only a peripheral role.

Lemma. Let $f : \langle a, b \rangle \to \mathbb{R}$ be a continuous function such that
\[
\int_a^b f(t) h(t)\, dt = 0
\]
for all continuously differentiable functions $h$ such that $h(a) = h(b) = 0$. Then $f \equiv 0$.

Proof. Suppose $f$ is not identically zero. Then $f(t_0) \ne 0$ for some $t_0 \in (a, b)$. Suppose, without loss of generality, $f(t_0) > 0$. Since $f$ is continuous, there exists an $\varepsilon > 0$ such that $f(t) > 0$ for all $t \in (t_0 - \varepsilon, t_0 + \varepsilon)$. Now let $h$ be a continuously differentiable function which is positive on some non-empty interval contained in $(t_0 - \varepsilon, t_0 + \varepsilon)$, and 0 elsewhere (we may use the “baby version” of smooth partition of unity 5.1 of Chapter 8). Then
\[
\int_a^b f(t) h(t)\, dt > 0,
\]
a contradiction. □
1.2 Theorem. (The Euler-Lagrange equations) Suppose the functional $S : V \to \mathbb{R}$ has an extreme at a function $y \in V$. Then the function $y$ satisfies the system of differential equations
$$\left.\frac{\partial L}{\partial x_i}\right|_{x=y,\,v=y'} = \frac{d}{dt}\left(\left.\frac{\partial L}{\partial v_i}\right|_{x=y,\,v=y'}\right).$$
Comment: It is often customary to write the equations in the form
$$\frac{\partial L}{\partial y_i} = \frac{d}{dt}\frac{\partial L}{\partial y_i'},$$
but there is some danger in such a notation, since in the partial derivatives, we must treat $y_i$, $y_i'$ as formal symbols plugged in for the arguments $x_i$, $v_i$ of $L$, while the derivative by $t$ is the actual total derivative by the independent variable $t$.
Proof of the theorem: Choose any continuously differentiable function $h : \langle a,b\rangle \to \mathbb{R}$ such that $h(a) = h(b) = 0$. Consider the real function of $n$ variables $u = (u_1,\dots,u_n)$
$$\Phi_h(u_1,\dots,u_n) = \int_a^b L(t, y(t) + u h(t), y'(t) + u h'(t))\,dt.$$
If the functional $S$ has an extreme at $y$, then $\Phi_h$ has an extreme at $o$, and since it has continuous partial derivatives by the chain rule everywhere, we must have
$$\frac{\partial \Phi_h(o)}{\partial u_i} = 0.$$
Denoting by $e_i$ the $i$’th standard basis vector of $\mathbb{R}^n$, compute (now for a real number $u$)
$$\frac{1}{u}\bigl(\Phi_h(u e_i) - \Phi_h(o)\bigr) = \frac{1}{u}\int_a^b \bigl(L(t, y(t) + u e_i h(t), y'(t) + u e_i h'(t)) - L(t, y(t), y'(t))\bigr)\,dt$$
$$= \frac{1}{u}\left(\int_a^b \frac{\partial L(t, y(t) + \theta u e_i h(t), y'(t) + u e_i h'(t))}{\partial x_i}\,u h(t)\,dt + \int_a^b \frac{\partial L(t, y(t), y'(t) + \eta u e_i h'(t))}{\partial v_i}\,u h'(t)\,dt\right)$$
for some $0 < \theta, \eta < 1$. On the right hand side, we used the Mean Value Theorem 3.3 of Chapter 3 twice. Note that the $u$ factor cancels out, and using $h(a) = h(b) = 0$ and integration by parts in the second integral, we get
$$= \int_a^b \left(\frac{\partial L(\dots)}{\partial x_i} - \frac{d}{dt}\frac{\partial L(\dots)}{\partial v_i}\right) h(t)\,dt.$$
Letting $u \to 0$, the left hand side tends to $\partial\Phi_h(o)/\partial u_i = 0$. Now use Lemma 1.1. □
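The statement of Theorem 1.2 can be probed numerically. The following is only a sketch under stated assumptions: the sample Lagrangian $L(t,x,v) = v^2/2 - x^2/2$, the interval $\langle 0,1\rangle$, the test function $h(t) = t(1-t)$ and the helper names `action` and `first_variation` are all illustrative choices, not taken from the text. For this $L$ the Euler-Lagrange equation reads $y'' = -y$, so the derivative of $u \mapsto S(y + uh)$ at $u = 0$ should vanish for $y = \sin t$ and should generally not vanish for other functions.

```python
import math

def action(y, a, b, n=2000):
    """Midpoint-rule approximation of S(y) = integral of L(t, y, y') over [a, b]
    for the sample Lagrangian L(t, x, v) = v**2/2 - x**2/2 (an assumption)."""
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t0, t1 = a + k * dt, a + (k + 1) * dt
        x = 0.5 * (y(t0) + y(t1))      # value at the midpoint of the subinterval
        v = (y(t1) - y(t0)) / dt       # slope on the subinterval
        total += (0.5 * v * v - 0.5 * x * x) * dt
    return total

def first_variation(y, h, a, b, u=1e-4):
    """Central difference approximating d/du S(y + u*h) at u = 0."""
    plus = action(lambda t: y(t) + u * h(t), a, b)
    minus = action(lambda t: y(t) - u * h(t), a, b)
    return (plus - minus) / (2 * u)

a, b = 0.0, 1.0
h = lambda t: t * (1.0 - t)            # satisfies h(a) = h(b) = 0

# y = sin t solves y'' = -y, the Euler-Lagrange equation of this L.
critical = first_variation(math.sin, h, a, b)
# y = t^2 does not solve it.
non_critical = first_variation(lambda t: t * t, h, a, b)
print(critical, non_critical)
```

The first printed number should be close to zero, while the second should be close to $\int_0^1 (2t(1-2t) - t^3(1-t))\,dt = -23/60$, which is what integration by parts predicts for $y = t^2$.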
1.3
Comment
The main idea of the proof resembles the idea of the total differential of a function of finitely many variables (see Exercise (1) below for a more concrete statement). It may seem we got something for free: how come we can find extremes of functionals on a space of continuously differentiable functions as easily as extremes of functions of finitely many variables? There is, however, one major catch: with the space V not being compact (not even locally), there is no guarantee an extreme of the functional S on V exists at all! Therefore, Theorem 1.2 is not nearly as strong as it may seem, giving only candidates for a possible extreme. Similarly as in the case of functions of finitely many variables, we call these candidates critical functions. Highly nontrivial methods are generally needed to show that a given critical function is in fact an extreme (we will see an example of that below).
2
A few special cases and examples
Simplifications in the form of the Euler-Lagrange equations occur in certain special cases.
2.1
When L does not depend on x
In this case, the Euler-Lagrange equations become
$$\frac{\partial L(t, y(t), y'(t))}{\partial v_i} = K_i$$
where $K_1,\dots,K_n \in \mathbb{R}$ are constants.
Example: Let $n = 1$. Let us verify that the shortest graph of a function connecting two points $(a,p)$, $(b,q)$ in $\mathbb{R}^2$, $a < b$, is indeed a straight line. The formula for the arc length of a graph of a function $y = y(t)$ is
$$S(y) = \int_a^b \sqrt{1 + y'(t)^2}\,dt,$$
and hence
$$L(t,x,v) = \sqrt{1+v^2}.$$
Therefore, we have
$$\frac{\partial L}{\partial v} = \frac{v}{\sqrt{1+v^2}},$$
and we get a differential equation
$$\frac{y'}{\sqrt{1+y'^2}} = K,$$
or
$$(1-K^2)\,y'^2 = K^2.$$
Thus we get a critical function if and only if $y'$ is constant.
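As a quick numerical illustration of this example, one can compare the arc length of the straight line with that of nearby graphs having the same endpoints. This is a minimal sketch: the endpoints, the sine-shaped perturbation and the helper names `arc_length` and `bumped` are assumptions made for the illustration only.

```python
import math

def arc_length(y, a, b, n=2000):
    """Polygonal approximation of the integral of sqrt(1 + y'(t)^2) over [a, b]."""
    dt = (b - a) / n
    return sum(
        math.hypot(dt, y(a + (k + 1) * dt) - y(a + k * dt))
        for k in range(n)
    )

a, b, p, q = 0.0, 2.0, 1.0, 4.0
line = lambda t: p + (q - p) * (t - a) / (b - a)

def bumped(c):
    # Same endpoints as the line, since the sine term vanishes at t = a and t = b.
    return lambda t: line(t) + c * math.sin(math.pi * (t - a) / (b - a))

straight = arc_length(line, a, b)
print(straight, math.hypot(b - a, q - p))
for c in (0.5, 0.1, -0.1, -0.5):
    print(c, arc_length(bumped(c), a, b))
```

Every perturbed graph should come out strictly longer than the straight line, whose length agrees with the Euclidean distance between the endpoints.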
2.2
When L does not depend on t
Then we have
$$\frac{d}{dt}L(y(t), y'(t)) = \sum_{i=1}^n \left(\frac{\partial L(y(t), y'(t))}{\partial x_i}\,y_i' + \frac{\partial L(y(t), y'(t))}{\partial v_i}\,y_i''\right),$$
which, using the Euler-Lagrange equation, is equal to
$$\sum_{i=1}^n \left(\frac{d}{dt}\left(\frac{\partial L(y(t), y'(t))}{\partial v_i}\right) y_i' + \frac{\partial L(y(t), y'(t))}{\partial v_i}\,y_i''\right) = \frac{d}{dt}\left(\sum_{i=1}^n \frac{\partial L(y(t), y'(t))}{\partial v_i}\,y_i'\right).$$
Thus, we obtain
$$\frac{d}{dt}\left(\sum_{i=1}^n \frac{\partial L}{\partial v_i}\,y_i' - L\right) = 0,$$
or in other words
$$\sum_{i=1}^n \frac{\partial L}{\partial v_i}\,y_i' - L = K \qquad (2.2.1)$$
is a “conserved quantity” in $t$. This expression is called the Hamiltonian. When $n = 1$, the equation (2.2.1) can be used directly instead of the Euler-Lagrange equation to find critical functions.
Example: The brachistochrone problem. Design a shape of a roller-coaster track in the $tx$ plane such that the car starting at the point $(0, s)$ reaches the point $(r, 0)$ in the shortest possible time. (Gravity is assumed to pull in the negative direction of the $x$ axis. Caution: here $x$ is the vertical coordinate, and $t$ is the horizontal coordinate, not time!)
We may choose units such that the mass of the car is 1, as is the acceleration of gravity. Then the potential energy at the point $(0, s)$ is $s$, and hence by conservation of energy, at a point $(t, x)$, the kinetic energy is $(s - x)$. Thus, if the component of the velocity in the $t$ direction is $w$, we have
$$\frac{1}{2}w^2(1 + x'^2) = s - x,$$
and hence
$$w = \sqrt{\frac{2(s-x)}{1+x'^2}},$$
or
$$L(t,x,v) = 1/w = \sqrt{\frac{1+v^2}{2(s-x)}}.$$
Thus, (2.2.1) gives (up to the sign of the constant $K$)
$$\sqrt{\frac{1+x'^2}{2(s-x)}} - x'\,\frac{x'}{\sqrt{1+x'^2}\,\sqrt{2(s-x)}} = K,$$
which yields
$$1 = K\sqrt{2(s-x)(1+x'^2)}$$
or
$$1 = 2K^2(s-x)(1+x'^2).$$
It is not difficult to verify that the solution can be expressed parametrically as
$$t(\theta) = A(\sin(\theta) + \theta) + B, \qquad x(\theta) = s - A(1 + \cos(\theta)) \qquad (2.2.2)$$
for suitable constants $A$, $B$. This curve is called a cycloid.
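The cycloid can be checked against the first integral $1 = 2K^2(s-x)(1+x'^2)$ numerically. The sketch below assumes the parametrization $t(\theta) = A(\theta + \sin\theta) + B$, $x(\theta) = s - A(1+\cos\theta)$ with $x$ the height, and uses illustrative values of $s$, $A$, $B$ that are not from the text; the function names are hypothetical. Along the curve, $(s-x)(1+x'^2)$ should then equal the constant $2A$, which corresponds to $A = 1/(4K^2)$.

```python
import math

s, A, B = 1.0, 0.25, 0.0   # illustrative values, not from the text

def t_of(theta):
    return A * (theta + math.sin(theta)) + B   # horizontal coordinate

def x_of(theta):
    return s - A * (1.0 + math.cos(theta))     # vertical coordinate (height)

def first_integral(theta, d=1e-6):
    """(s - x) * (1 + x'^2) with x' = dx/dt = (dx/dtheta) / (dt/dtheta),
    both derivatives taken by central differences."""
    dx = (x_of(theta + d) - x_of(theta - d)) / (2 * d)
    dt = (t_of(theta + d) - t_of(theta - d)) / (2 * d)
    slope = dx / dt
    return (s - x_of(theta)) * (1.0 + slope * slope)

# Parameter values away from theta = +-pi, where dt/dtheta vanishes.
values = [first_integral(th) for th in (-2.5, -1.0, 0.3, 1.2, 2.0)]
print(values)   # each entry should be close to 2 * A
```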
2.3
Lagrangian mechanics
In physics, the motion of a system of finitely many particles in $\mathbb{R}^3$ can be described using the Euler-Lagrange equations. Consider all the coordinates of all the particles together, so we have a variation problem in $\mathbb{R}^{3n}$, where the coordinates of the $i$-th particle are coordinates number $3i-2$, $3i-1$, $3i$. The basic principle for writing down the Lagrangian is
$$L = \text{kinetic energy} - \text{potential energy}. \qquad (2.3.1)$$
In the basic setup of Newtonian mechanics, the particles have masses $m_i$, and the kinetic energy is
$$\sum_{i=1}^n \frac{1}{2}\,m_i\,(v_{3i-2}^2 + v_{3i-1}^2 + v_{3i}^2). \qquad (2.3.2)$$
A kinetic energy formula of this form, i.e. essentially the form $\sum \frac{1}{2}mv^2$, is referred to as a standard kinetic term. The potential energy term is more variable. Assuming the particles act on one another by gravity, Newton’s law of gravity gives potential energy
$$-\sum_{i<j} \frac{G\,m_i m_j}{\sqrt{\sum_{k=0}^{2}(x_{3i-k} - x_{3j-k})^2}} \qquad (2.3.3)$$
where $G$ is Newton’s universal constant of gravity. We may generalize this further by including a conservative force field acting on each particle, $F_i = \operatorname{grad}(\phi_i)$, $\phi_i : \mathbb{R}^3 \to \mathbb{R}$, $i = 1,\dots,n$, in which case we add to the potential energy the term
$$-\sum_{i=1}^n \phi_i(x_{3i-2}, x_{3i-1}, x_{3i}). \qquad (2.3.4)$$
According to the recipe (2.3.1), the (original) Lagrangian is obtained by taking the standard kinetic term (2.3.2), and subtracting the potential terms (2.3.3), (2.3.4), thus getting
$$L(x_1,\dots,x_{3n}, v_1,\dots,v_{3n}) = \sum_{i=1}^n \frac{1}{2}\,m_i\,(v_{3i-2}^2 + v_{3i-1}^2 + v_{3i}^2) + \sum_{i<j} \frac{G\,m_i m_j}{\sqrt{\sum_{k=0}^{2}(x_{3i-k} - x_{3j-k})^2}} + \sum_{i=1}^n \phi_i(x_{3i-2}, x_{3i-1}, x_{3i}).$$
Lagrange’s principle states that the equation of motion is given by the critical function for this Lagrangian on a time interval $\langle a,b\rangle$ with given positions at the times $a$ and $b$, i.e. that it is subject to the Euler-Lagrange equations 1.2. We will not prove this here. In fact, a mathematical “proof” in this setting is not to be expected: we are referring to a system of physical particles. What could be proved, however, is that Lagrange’s equations are equivalent to Newton’s. Observe that in the presence of the standard kinetic term (2.3.2), the Hamiltonian (2.2.1) of 2.2 has the physical meaning of the total energy of the system, which, indeed, should be conserved by the law of conservation of energy. The Lagrangian mechanics setup may seem like nothing new, since it only recovers Newton’s equations, and, in fact, is even less general, since it requires a conservative force field. However, the Lagrangian turns out to be extremely beneficial for generalizations. In fact, most of modern physics uses the Lagrangian formalism.
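The equivalence with Newton’s equations can be illustrated in the simplest case. The following worked computation is a sketch under simplifying assumptions: a single particle of mass $m$ (so $n = 1$), no gravitational interaction term, and a force field written as $F = \operatorname{grad}(\phi)$, so that the Lagrangian is the standard kinetic term plus $\phi$. The Euler-Lagrange equations of 1.2 then reduce to Newton’s second law.

```latex
\begin{align*}
L(x_1,x_2,x_3,v_1,v_2,v_3) &= \tfrac{1}{2}\,m\,(v_1^2+v_2^2+v_3^2) + \phi(x_1,x_2,x_3),\\
\frac{\partial L}{\partial x_k} &= \frac{\partial \phi}{\partial x_k},
\qquad
\frac{\partial L}{\partial v_k} = m\,v_k,\\
\frac{\partial \phi}{\partial x_k}\bigl(y(t)\bigr) &= \frac{d}{dt}\bigl(m\,y_k'(t)\bigr) = m\,y_k''(t)
\qquad (k = 1,2,3),\\
\text{i.e.}\qquad m\,y''(t) &= \operatorname{grad}(\phi)\bigl(y(t)\bigr) = F\bigl(y(t)\bigr).
\end{align*}
```

In this special case the Hamiltonian (2.2.1) works out to $\tfrac{1}{2}m\|y'\|^2 - \phi(y)$, the kinetic energy plus the potential energy $-\phi$.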
3
The geodesic equation
Let us return to mathematics. Perhaps the single most important example of the Euler-Lagrange equation is the geodesic equation in a Riemann metric (although it should be pointed out that the equation does have a physical meaning, describing in fact the motion of a light ray in a gravity field in Einstein’s general relativity).
3.1
A Riemann metric on an open subset of Rn
Let $U \subseteq \mathbb{R}^n$ be an open subset. Let $g_{ij} : U \to \mathbb{R}$, $i, j = 1,\dots,n$, be smooth functions such that for each $x \in U$, $g = (g_{ij})_{i,j}$ is a positive definite symmetric matrix. We will interpret $g(x)$ as the associated matrix of a (real) inner product $\langle u, v\rangle_g$ of tangent vectors at the point $x \in U$, which will be called a Riemann metric on $U$ (see 7.7 of Appendix A). The key point is that the tangent space of $U$ is canonically identified with $\mathbb{R}^n$ via the coordinate map which is simply the embedding $U \subseteq \mathbb{R}^n$. As a generalization of formula (**) in Subsection 2.2 of Chapter 8, we will, then, define the length with respect to the Riemann metric $g$ of a piecewise continuously differentiable curve represented by a map $\gamma : \langle a,b\rangle \to U$ by the formula
$$s_g(\gamma) = \int_a^b \sqrt{\langle \gamma'(t), \gamma'(t)\rangle_g}\,dt. \qquad (3.1.1)$$
(See Exercise (8).)
We will be interested in the variational problem of minimizing the functional (3.1.1) over the set of continuously differentiable curves $\gamma$ with given boundary points $\gamma(a) = A$, $\gamma(b) = B \in U$. Before getting into this seriously, we will introduce a notational convention which is helpful when figuring out numerical examples in complicated formulas with many indices: often, we are making multiple sums over indices, for example, $i = 1,\dots,n$ over terms where the index $i$ occurs, and is equal, in two factors entering the formula. In this, and the following chapter, we will make the convention that
(3.1.2) When an index $i$ appears in more than one factor of a product, then $i$ will occur in precisely two such factors, once as a subscript and once as a superscript. This notation shall mean summation over all permissible values of $i$, which must be the same in both factors in question; the summation symbol $\sum_i$ will be omitted.
Thus, using this convention, the components of the function $\gamma$ will be written with superscripts, $\gamma^i$, $i = 1,\dots,n$, and the formula (3.1.1) above will assume the form
$$s_g(\gamma) = \int_a^b \sqrt{g_{ij}\,{\gamma'}^i {\gamma'}^j} = \int_a^b \sqrt{g_{ij}(\gamma(t))\,{\gamma'}^i(t)\,{\gamma'}^j(t)}\,dt. \qquad (3.1.3)$$
The convention (3.1.2) may seem unreasonably restrictive, but turns out adequate in the types of formulas we will encounter. It is known as (one version of) the Einstein convention. When two quantities share an index as a subscript in one and a superscript in the other (and summation over all permissible values is to be performed), we call the quantities coupled. We can see already in (3.1.3) in comparison with (3.1.1) that the Einstein convention can make formulas more explicit. In the next chapter, when talking about the more general context of manifolds, we will talk about tensors, and will give the Einstein convention a deeper interpretation.
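To make the convention concrete, the expression $g_{ij}v^iv^j$ can be spelled out with the two omitted summation symbols restored. This is a small sketch: the matrix, the vector and the function name `quadratic_form` are sample choices, not taken from the text.

```python
def quadratic_form(g, v):
    """g_ij v^i v^j with the sums over i and j written out explicitly."""
    n = len(v)
    return sum(g[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

g = [[2.0, 1.0],
     [1.0, 3.0]]        # a symmetric positive definite sample matrix
v = [1.0, -2.0]

# Terms: 2*1*1 + 1*1*(-2) + 1*(-2)*1 + 3*(-2)*(-2) = 2 - 2 - 2 + 12
print(quadratic_form(g, v))
```

The integrand of (3.1.3) is the square root of exactly this kind of double sum, evaluated at each point of the curve.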
3.2
A trick: modifying the functional
We see immediately that the Euler-Lagrange equation for the functional (3.1.3) will be a pain because of the square root in the Lagrangian. This problem has a surprisingly simple solution, which, at first, cannot possibly seem right: simply omit the square root! Thus, we will consider the functional
$$S_g(\gamma) = \int_a^b g_{ij}\,{\gamma'}^i {\gamma'}^j. \qquad (3.2.1)$$
To justify this, recall that by Lemma 8.6.1 of Chapter 5,
$$S_g(\gamma) \geq \frac{1}{b-a}\,(s_g(\gamma))^2,$$
while equality arises if and only if $g_{ij}(\gamma(t))\,{\gamma'}^i(t)\,{\gamma'}^j(t)$ is constant in $t$ (keep in mind that we are using the Einstein convention). This condition is called parametrization by arc length. Note that any continuously differentiable curve can be parametrized by arc length: Letting
$$s(t) = \int_a^t \sqrt{g_{ij}(\gamma(\tau))\,{\gamma'}^i(\tau)\,{\gamma'}^j(\tau)}\,d\tau,$$
we obtain an increasing continuously differentiable map with positive derivative $s$ from $\langle a,b\rangle$ to the interval $\langle 0, s_g(\gamma)\rangle$; composing $\gamma$ with $s^{-1}$ is a parametrization by arc length. This shows that if the functional $S_g$ indeed has a minimum in the space of continuously differentiable curves with fixed boundary points $A$, $B$, then the minimum curve also minimizes the functional $s_g$, and furthermore is parametrized by arc length!
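The inequality between the two functionals, and the role of constant speed, can be observed numerically. The sketch below rests on assumptions chosen for the illustration: the sample metric $g_{ij} = \delta_{ij}/y^2$ on the upper half-plane, two parametrizations of the same vertical segment, and the hypothetical helper `length_and_energy`. The curve $(0, e^{t^2})$ has speed $2t$, while $(0, e^{t})$ has constant speed 1; both have length 1 on $\langle 0,1\rangle$.

```python
import math

def length_and_energy(curve, metric, a, b, n=4000):
    """Midpoint approximations of the length s_g (3.1.3) and of S_g (3.2.1)."""
    dt = (b - a) / n
    length = energy = 0.0
    for k in range(n):
        p0, p1 = curve(a + k * dt), curve(a + (k + 1) * dt)
        mid = [(u + w) / 2 for u, w in zip(p0, p1)]
        vel = [(w - u) / dt for u, w in zip(p0, p1)]
        g = metric(mid)
        speed2 = sum(g[i][j] * vel[i] * vel[j]
                     for i in range(len(vel)) for j in range(len(vel)))
        length += math.sqrt(speed2) * dt
        energy += speed2 * dt
    return length, energy

# Sample metric (an assumption): g_ij = delta_ij / y^2 on the upper half-plane.
metric = lambda p: [[1 / p[1] ** 2, 0.0], [0.0, 1 / p[1] ** 2]]
a, b = 0.0, 1.0
uneven = lambda t: (0.0, math.exp(t * t))   # speed 2t: not constant
steady = lambda t: (0.0, math.exp(t))       # speed 1: constant

len_uneven, en_uneven = length_and_energy(uneven, metric, a, b)
len_steady, en_steady = length_and_energy(steady, metric, a, b)
print(len_uneven, en_uneven)   # length about 1, S_g about 4/3
print(len_steady, en_steady)   # length about 1, S_g about 1
```

For the unevenly parametrized curve $S_g$ strictly exceeds $s_g^2/(b-a)$, and for the constant-speed one the two sides agree up to discretization error.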
3.3
The Euler-Lagrange equation for the modified functional: the geodesic equation
The modified functional (3.2.1) of 3.2 gives us the Lagrangian
$$L(x, v) = g_{ij}(x)\,v^i v^j, \qquad (3.3.1)$$
using the notation $x = (x^1,\dots,x^n)$, $v = (v^1,\dots,v^n)$ and the Einstein convention. We have
$$\frac{\partial L}{\partial v^i} = 2g_{ij}(x)\,v^j.$$
Note here that from the point of view of the Einstein convention, we must treat the $i$ in $\frac{\partial}{\partial v^i}$ as a subscript. By the chain rule, we therefore have
$$\frac{d}{dt}\frac{\partial L(x, x')}{\partial v^i} = 2g_{ij}(x^j)'' + 2\frac{\partial g_{ij}}{\partial x^k}(x^j)'(x^k)' = 2g_{ij}(x^j)'' + \left(\frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{ik}}{\partial x^j}\right)(x^j)'(x^k)'.$$
The last step may seem to do nothing, but we will see later that it is useful to have the quantity coupled to $(x^j)'(x^k)'$ symmetrical in $j, k$ (it will help eliminate a certain, somewhat counterintuitive, quantity known as torsion).
Also note that by the chain rule, we have
$$\frac{\partial L(x, x')}{\partial x^i} = \frac{\partial g_{jk}}{\partial x^i}(x^j)'(x^k)',$$
and hence the Euler-Lagrange equation becomes (after cancelling 2),
$$g_{ij}(x^j)'' + \frac{1}{2}\left(\frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^i}\right)(x^j)'(x^k)' = 0, \qquad (3.3.2)$$
which is called the geodesic equation. It is useful to write
$$\Gamma_{ijk} = \frac{1}{2}\left(\frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^i}\right). \qquad (3.3.3)$$
Then (3.3.2) becomes
$$g_{ij}(x^j)'' + \Gamma_{ijk}(x^j)'(x^k)' = 0. \qquad (3.3.4)$$
As we learned from the theory of differential equations in Chapter 6, it is useful to have the highest derivative in explicit form. In the present case, it suffices to multiply by the matrix $g^{-1}$ inverse to $g$. To conform with the Einstein convention, it is customary to denote the $(i,j)$’th entry of the matrix $g^{-1}$ as $g^{ij}$. Then we have
$$g^{ij}g_{jk} = \delta^i_k$$
where
$$\delta^i_k = 1 \text{ when } i = k, \qquad \delta^i_k = 0 \text{ otherwise}$$
is called the Kronecker $\delta$ (see also Appendix A, 7.2). Thus, putting
$$\Gamma^i_{jk} = g^{i\ell}\,\Gamma_{\ell jk},$$
the geodesic equation becomes
$$(x^i)'' + \Gamma^i_{jk}(x^j)'(x^k)' = 0. \qquad (3.3.5)$$
The symbols $\Gamma_{ijk}$ or $\Gamma^i_{jk}$ are known as Christoffel symbols of the first resp. second kind. Parametrized curves satisfying the geodesic equation are called geodesics parametrized by arc length, or simply geodesics. Let us keep in mind, however, that geodesics are merely critical for the functional (3.2.1) of 3.2. We have not proved that geodesics minimize the length of continuously differentiable curves with given boundary points. In fact, this is false in general (see Exercise (7) (c) of Chapter 15 below). Yet, for the sake of geometry, we are clearly interested at least in some minimum length statement regarding geodesics, and it is important to note that the variational tools we supplied do not give that. We will prove such a statement in the next section using different methods.
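The Christoffel symbols can be computed mechanically from a metric. The following sketch assumes, purely for illustration, the Euclidean plane in polar coordinates $(r, \theta)$, where $g = \mathrm{diag}(1, r^2)$; the function names are hypothetical and the partial derivatives are approximated by central differences. It evaluates $\Gamma_{ijk} = \frac{1}{2}(\partial_k g_{ij} + \partial_j g_{ik} - \partial_i g_{jk})$ and then raises the first index with $g^{i\ell}$.

```python
def metric(x):
    """Sample metric (an assumption): the Euclidean plane in polar
    coordinates x = (r, theta), where g = diag(1, r^2)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]

def d_metric(x, k, eps=1e-5):
    """Central-difference approximation of the matrix (dg_ij / dx^k)."""
    hi = list(x); hi[k] += eps
    lo = list(x); lo[k] -= eps
    gh, gl = metric(hi), metric(lo)
    n = len(x)
    return [[(gh[i][j] - gl[i][j]) / (2 * eps) for j in range(n)]
            for i in range(n)]

def christoffel_first(x):
    """Gamma_ijk = (d_k g_ij + d_j g_ik - d_i g_jk) / 2."""
    n = len(x)
    dg = [d_metric(x, k) for k in range(n)]   # dg[k][i][j] = d_k g_ij
    return [[[0.5 * (dg[k][i][j] + dg[j][i][k] - dg[i][j][k])
              for k in range(n)] for j in range(n)] for i in range(n)]

def christoffel_second(x):
    """Gamma^i_jk = g^{il} Gamma_ljk; g is diagonal here, so g^{ii} = 1/g_ii."""
    n = len(x)
    g, first = metric(x), christoffel_first(x)
    return [[[first[i][j][k] / g[i][i] for k in range(n)]
             for j in range(n)] for i in range(n)]

G = christoffel_second([2.0, 0.7])
print(G[0][1][1], G[1][0][1], G[1][1][0])
```

At $r = 2$ the printed values should be close to $-2$, $0.5$ and $0.5$, matching the closed forms $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$. The shortcut in `christoffel_second` only works for diagonal metrics; a general metric would need a matrix inverse.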
4
The geometry of geodesics
The purpose of this section is to study geodesics in more detail, and eventually to prove that locally they really are the curves of minimal length connecting two points with respect to a given Riemann metric.
4.1
Dependence on boundary conditions, the exponential map
Recall now Theorem 6.5.5 where we investigated the dependence of the solution of an ordinary differential equation on initial conditions. At this point, we are interested in dealing with smooth functions. Let us distill the result we will need here:
4.1.1 Lemma. Let $U \subseteq \mathbb{R}^n$ be an open set, and let $f : \mathbb{R} \times U \to \mathbb{R}^n$ be a smooth function. Consider points $t_0 \in \mathbb{R}$, $x_0 \in U$. Then there exists an open neighborhood $V$ of $(t_0, x_0)$ in $\mathbb{R} \times U$ and a unique smooth function $y : V \to U$ such that $y(t_0, x) = x$ and
$$\frac{\partial y(t, x)}{\partial t} = f(t, y(t, x)) \qquad (*)$$
for all $(t, x) \in V$.
Proof. As explained in Subsection 5.1 of Chapter 6, we can treat dependence on initial conditions as dependence on parameters. From this point of view, the existence and uniqueness of a continuous solution $y$ as claimed follows from Theorem 5.3 of Chapter 6, and its continuous differentiability in all variables follows from Theorem 5.5 of Chapter 6. Now applying the equations (5.5.2) of Chapter 6 for the partial derivatives, we obtain, by induction, the existence and continuity of all higher partial derivatives. □
By 1.2 of Chapter 6, an analogue of Lemma 4.1.1 also holds for systems of higher order differential equations. Applying this specifically to the case of the geodesic equation (3.3.5) of 3.3, we obtain the following
4.1.2 Corollary. For a smooth Riemann metric $g$ on an open set $U \subseteq \mathbb{R}^n$ and a point $P \in U$, pick an isometry
$$\varphi : (\mathbb{R}^n, \langle\cdot,\cdot\rangle) \to (\mathbb{R}^n, \langle\cdot,\cdot\rangle_{g_P}).$$
(Here on the left hand side, $\langle\cdot,\cdot\rangle$ denotes the dot product, see Appendix A, Section 4.3.) Then there exists a convex open neighborhood $V$ of $o \in \mathbb{R}^n$, and a unique smooth map $\psi : V \to U$ such that
(1) $\psi(o) = P$,
(2) for each $v \in \mathbb{R}^n$, $\psi(vt)$ considered as a function of $t$ in an open neighborhood of $0$ in which $vt \in V$ is a $g$-geodesic parametrized by arc length (in the sense of 3.3),
(3) $\partial_v \psi(o) = \varphi(v)$.
The smooth map $\psi$ of Corollary 4.1.2 is often denoted by $\exp$ and called the exponential map.
4.2
Behavior of geodesics with respect to lengths and angles
Let us first verify that solutions of the equation (3.3.5) of 3.3 are indeed parametrized by arc length with respect to the Riemann metric $g$. While we argued in 3.2 that this must be true for parametric curves minimizing the functional (3.2.1), note that we have so far only proved that the solutions of (3.3.5) are critical. Hence, that argument cannot be used rigorously.
4.2.1 Lemma. Let $x : (a,b) \to U$ be a solution of the equation (3.3.5). Then we have
$$(g_{ij}(x^i)'(x^j)')' = 0$$
(using the Einstein convention).
Proof. Let us compute the Hamiltonian (2.2.1) of 2.2 for the Lagrangian (3.3.1) of 3.3:
$$\frac{\partial L(x, x')}{\partial v^i}(x^i)' - L(x, x') = 2g_{ij}(x^i)'(x^j)' - g_{ij}(x^i)'(x^j)' = g_{ij}(x^i)'(x^j)'.$$
Thus, the quantity whose constancy in $t$ we are trying to prove is in effect the Hamiltonian. Hence, our statement follows from 2.2. □
Note that the proof of Lemma 4.2.1 suggests multiplying the Lagrangian (3.3.1) of 3.3 by a factor of $1/2$, and calling it energy.
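The conservation statement of Lemma 4.2.1 can be observed on a numerically integrated geodesic. The sketch below makes several assumptions for the sake of illustration: the metric is the Euclidean one in polar coordinates, $g = \mathrm{diag}(1, r^2)$, whose nonzero Christoffel symbols are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$; the initial data are arbitrary; and the integrator is a hand-written classical Runge-Kutta step. The quantity $g_{ij}(x^i)'(x^j)'$ should stay constant up to integration error.

```python
def geodesic_rhs(state):
    """First-order form of (x^i)'' = -Gamma^i_jk (x^j)' (x^k)' for
    g = diag(1, r^2): r'' = r * theta'^2, theta'' = -2 r' theta' / r."""
    r, th, dr, dth = state
    return (dr, dth, r * dth * dth, -2.0 * dr * dth / r)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, dt / 2))
    k3 = geodesic_rhs(shifted(state, k2, dt / 2))
    k4 = geodesic_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed_squared(state):
    r, _, dr, dth = state
    return dr * dr + r * r * dth * dth      # g_ij (x^i)' (x^j)'

state = (1.0, 0.0, 0.3, 0.8)                # r, theta, r', theta' at t = 0
initial = speed_squared(state)
drift = 0.0
for _ in range(2000):                       # integrate up to t = 2
    state = rk4_step(state, 1e-3)
    drift = max(drift, abs(speed_squared(state) - initial))
print(initial, drift)
```

The printed drift should be many orders of magnitude smaller than the initial value of the conserved quantity.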
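4.2.2 Now we will prove that when we shift a geodesic to a nearby geodesic, the angle of the shift is also conserved, provided that we do not change the scale of parametrization. More precisely, let solutions of the geodesic equation (3.3.5) of 3.3 depend on some smooth parameter $u$ in the space of initial conditions, as in the proof of Lemma 4.1.1. Let us assume further that
$$\frac{\partial\bigl(g_{ij}(x^i)'(x^j)'\bigr)}{\partial u} = 0. \qquad (1)$$
Note that by Lemma 4.2.1, it suffices to verify this condition at one point, and the condition indeed means that we are not changing the scale of arc length parametrization with $u$. Now let
$$z = \frac{\partial x}{\partial u},$$
as, again, in the proof of Lemma 4.1.1.
Lemma. We have
$$(g_{ij}\,z^i (x^j)')' = 0. \qquad (2)$$
Proof. Compute
$$(g_{ij}\,z^i (x^j)')' = \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial t}\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial t} + g_{ij}\frac{\partial^2 x^i}{\partial u\,\partial t}\frac{\partial x^j}{\partial t} + g_{ij}\frac{\partial x^i}{\partial u}\frac{\partial^2 x^j}{(\partial t)^2}. \qquad (3)$$
Now (1) implies that
$$\frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial u}\frac{\partial x^i}{\partial t}\frac{\partial x^j}{\partial t} + 2g_{ij}\frac{\partial^2 x^i}{\partial u\,\partial t}\frac{\partial x^j}{\partial t} = 0. \qquad (4)$$
Subtracting $1/2$ times (4) from the right hand side of (3), we get
$$= \frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial t}\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial t} - \frac{1}{2}\frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial u}\frac{\partial x^i}{\partial t}\frac{\partial x^j}{\partial t} + g_{ij}\frac{\partial x^i}{\partial u}\frac{\partial^2 x^j}{(\partial t)^2}. \qquad (5)$$
Using the geodesic equation (3.3.5) of 3.3 for $\frac{\partial^2 x^j}{(\partial t)^2}$, we see that the last term is equal to
$$-g_{i\ell}\,\Gamma^\ell_{jk}\,z^i (x^j)'(x^k)' = -\Gamma_{ijk}\,z^i (x^j)'(x^k)'.$$
Using the definition of $\Gamma_{ijk}$ in 3.3, this is equal to
$$-\frac{1}{2}\frac{\partial x^i}{\partial u}\left(\frac{\partial g_{ij}}{\partial x^k} + \frac{\partial g_{ik}}{\partial x^j} - \frac{\partial g_{jk}}{\partial x^i}\right)\frac{\partial x^j}{\partial t}\frac{\partial x^k}{\partial t}.$$
This is equal to
$$-\frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial t}\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial t} + \frac{1}{2}\frac{\partial g_{ij}}{\partial x^k}\frac{\partial x^k}{\partial u}\frac{\partial x^i}{\partial t}\frac{\partial x^j}{\partial t}$$
by renaming variables, which shows that (5) is 0. □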
Remark: In comparison with Lemma 4.2.1, we may ask if Lemma 4.2.2 has a similarly conceptual proof (our proof was by calculation from the definition of the Christoffel symbols). Such a conceptual proof indeed exists, and is related to our comments in Sections 7 and 8 of Chapter 6: the condition (1) indicates that the Lagrangian has an infinitesimal symmetry. By a similar but somewhat more elaborate argument to the discussion in Chapter 6, this always implies a conserved quantity known as a Noether current, which is the cause of the conservation law proved in Lemma 4.2.2. Discussing this more systematically, however, exceeds the scope of this text.
4.3
Minimality of geodesics
Let us now consider an open subset $U \subseteq \mathbb{R}^n$ with a smooth Riemann metric $g$ and a point $P \in U$. Choose an isometry $\varphi$ as in Corollary 4.1.2, and let $\psi : V \to U$, $\psi(o) = P$, be the corresponding exponential map. By the Inverse Function Theorem 7.3 of Chapter 3, we may further assume that $\psi$ is a diffeomorphism onto its image.
4.3.1 Lemma. Let
$$S_r = \{x \in \mathbb{R}^n \mid \|x\| = r\}.$$
Assuming $S_r \subseteq V$, $x \in S_r$, $v \in T(S_r)_x$, then $(D\psi)_x(v)$ is $g$-orthogonal to $(D\psi)_x(x)$. In other words, $T(\psi(S_r))_{\psi(x)}$ is $g$-orthogonal to $(D\psi)_x(x)$.
Caution: It is not claimed, and, as we will see in the next section, certainly not true in general, that $\psi$ would be an isometry!
Proof. We will use Lemma 4.2.2. Let $\tilde{x} = x/\|x\| = x/r$. Consider the geodesic $\psi(t\tilde{x})$. By the definition of $\psi$, and the fact that it is a diffeomorphism onto its image when restricted to $V$, the space $T(\psi(S_r))_{\psi(x)}$ is spanned by the vectors $z(r)$ of 4.2.2 with respect to the boundary condition change
$$x(0, u) = P, \qquad x'(0, u) = \varphi\!\left(\frac{\tilde{x} + uw}{\|\tilde{x} + uw\|}\right) \qquad (*)$$
where $\langle x, w\rangle = 0$. The condition (1) of 4.2.2 is then satisfied (at $t = 0$ and hence, by Lemma 4.2.1, for all $t \in (-r, r)$) by the fact that $\varphi$ is an isometry. By (*), $g_{ij}\,z^i (x^j)' = 0$ at $t = 0$, and hence, by Lemma 4.2.2, also at $t = r = \|x\|$, which implies the statement of Lemma 4.3.1. □
Assume now, without loss of generality, that $V$ is the open ball $\Omega(o, R)$ for some $R > 0$.
4.3.2 Theorem. Let $y : \langle 0, a\rangle \to \psi(V)$ be a continuously differentiable curve such that $y(0) = P$ and $y(a) \in \psi(S_r)$. Then, recalling the notation (3.1.1), we have
$$s_g(y) \geq r,$$
and equality is attained if and only if $y$ is a geodesic, i.e. $y \sim \tilde{y}$ where $\tilde{y}$ is a geodesic parametrized by arc length.
Proof. Consider the function $h : \psi(V) \to \mathbb{R}$ given by
$$h(x) = \|\psi^{-1}(x)\|.$$
Then the function $h$ is smooth on $\psi(V) \setminus \{P\}$. By Lemma 4.3.1, the vector
$$\left(\frac{\partial h(x)}{\partial x^i}\right)_i \qquad (a)$$
is a positive multiple of the derivative at $t = \|\psi^{-1}(x)\|$ of the geodesic
$$\psi\bigl(t \cdot \psi^{-1}(x)/\|\psi^{-1}(x)\|\bigr). \qquad (b)$$
We have
$$g^{ij}\frac{\partial h(x)}{\partial x^i}\frac{\partial h(x)}{\partial x^j} = 1. \qquad (c)$$
(Change coordinates so that one coordinate vector will be the derivative of (b) at $t = \|\psi^{-1}(x)\|$ and the other, $g$-orthogonal coordinate vectors will be tangent vectors at $x$ to $\psi(S_{\|\psi^{-1}(x)\|})$. Then the contributions to (c) in the new coordinates from all but $i = j = 1$ will be 0, and the contribution from the first coordinate is 1 by the fact that the geodesic (b) is parametrized by arc length, and $\varphi$ is an isometry.) Hence, by the finite-dimensional Cauchy-Schwarz inequality (see 4.4 of Appendix A), for any $z \in \mathbb{R}^n$,
$$\sqrt{g_{ij}(x)\,z^i z^j} \geq \frac{\partial h(x)}{\partial x^i}\,z^i$$
where equality arises if and only if $z$ is a positive multiple of (a). Therefore,
$$s_g(y) = \int_{\langle 0,a\rangle} \sqrt{g_{ij}(y(t))\,(y^i)'(t)\,(y^j)'(t)}\,dt \geq \int_{\langle 0,a\rangle} h(y(t))'\,dt = h(y(a)) - h(y(0)) = r$$
where equality arises if and only if $y'(t)$ is a positive multiple of a tangent vector of a geodesic of the form (b) for $y(t) = x$ almost everywhere in $t$ (and hence everywhere, by continuity). □
5
Exercises
(1) Prove that for $y \in V_{a,b,p,q}$, and $h \in V_{a,b,o,o}$, we have
$$S(y + h) - S(y) = \int_a^b D_y(t)\,h(t)\,dt + M(h)\cdot\|h\|$$
where $M : V_{a,b,o,o} \to \mathbb{R}$ satisfies
$$\lim_{h \to 0} M(h) = 0$$
and
$$(D_y(t))_i = \frac{\partial L(t, y(t), y'(t))}{\partial x_i} - \frac{d}{dt}\left(\frac{\partial L(t, y(t), y'(t))}{\partial v_i}\right).$$
The function $D_y(t)$ is an example of what we call a Fréchet derivative, although it is more common to consider this concept on normed vector spaces (while $V_{a,b,p,q}$ is an affine space). [Hint: Mimic the proof of Theorem 1.2, but keep in mind that $h$ plays a slightly different role, $h(t)$ now having values in $\mathbb{R}^n$.]
(2) Find the critical functions $y : \langle a,b\rangle \to \mathbb{R}$ for the functional
$$S(y) = \int_a^b y(x)\sqrt{1 + (y'(x))^2}\,dx$$
on continuously differentiable functions $y \in V_{a,b,p,q}$, $p, q > 0 \in \mathbb{R}$.
(3) Find the Euler-Lagrange equation for the functional
$$S(y) = \int_0^1 y^2 (y')^2\,dx.$$
(4) Find the critical functions $y : \langle a,b\rangle \to \mathbb{R}$ for the functional
$$S(y) = \int_a^b (p^2 (y')^2 + q^2 y^2)\,dx.$$
(5) By reversing the coordinates in Example 2.2 (i.e. making the vertical coordinate the independent and the horizontal coordinate the dependent variable), find an alternate solution to the brachistochrone problem using the method of Example 2.1.
(6) Find the critical functions for the functional
$$S(u, v) = \int_0^1 ((u')^2 + (v')^2 + u'v')\,dx.$$
(7) Prove in detail the parametric form (2.2.2) of the solution of the brachistochrone problem.
(8) Prove that the formula (3.1.1) of 3.1 does not depend on the parametrization of a piecewise continuously differentiable curve L.
(9) The hyperbolic plane is the upper half-plane of complex numbers, i.e. the set
$$H = \{x + iy \in \mathbb{C} \mid y > 0\}$$
with the Riemannian metric $g_{ij}$ associated, at a point $x + iy \in H$, with the matrix
$$\begin{pmatrix} 1/y^2 & 0 \\ 0 & 1/y^2 \end{pmatrix}.$$
Using the geodesic equation, determine the geodesics in $H$.
(10) (Spherical geometry) Consider, on
$$\mathbb{C} = \{x + iy \mid x, y \in \mathbb{R}\},$$
the Riemann metric $g_{ij}$ associated, at a point $x + iy \in \mathbb{C}$, with the matrix
$$\begin{pmatrix} 1/(1 + x^2 + y^2) & 0 \\ 0 & 1/(1 + x^2 + y^2) \end{pmatrix}.$$
Using the geodesic equations, determine the geodesics in this space.
Tensor Calculus and Riemannian Geometry
15
The attentive reader probably noticed that the concept of a Riemann metric on an open subset of Rn which we introduced in the last chapter, and the related material on geodesics, beg for a generalization to manifolds. Although this is not quite as straightforward as one might imagine, the work we have done in the last chapter gets us well underway. A serious problem we must address, of course, is how the concepts we introduced behave under change of coordinates. It turns out that what we have said on covariance and contravariance in manifolds is not quite enough: we need to discuss the notation of tensor calculus. Additionally, it turns out that discussing geodesics in a Riemann metric directly would cause us to copy many expressions over and over unnecessarily. There is a natural intermediate notion which axiomatizes the Christoffel symbols of the second kind directly, without referring to a Riemann metric. This gives rise to the concept of an affine connection. In the presence of an affine connection, we can discuss geodesics, but also the important geometric concepts of torsion and curvature. We will show that vanishing of torsion and curvature characterizes, in an appropriate sense, the canonical affine connection on Rn (the flat connection). We will define the notion of a Riemann manifold, and show how it canonically specifies an affine connection, known as the Levi-Civita connection. This will lead us to the concept of curvature of a Riemann manifold. We will show that locally, a Riemann manifold with zero curvature is isometric to an open subset of Rn . We will also show that every oriented Riemann manifold in dimension 2 has a compatible structure of a Riemann surface. Although we make no reference to physics, the present chapter gives a good rigorous foundation for the mathematics of general relativity theory. In fact, the notation we use (writing out the indices in tensors) is closer to physics than is customary in most mathematical texts. 
As we shall see, this notation does not sacrifice rigor, and can make calculations with tensors more transparent by showing explicitly which coordinates we are contracting. To comment on the title of this chapter, by tensor calculus, one usually means the basic development of tensor fields, their transformation under changes of coordinates, and the covariant derivative. Riemannian geometry develops the same concepts further and on a higher level of abstraction. By making a kind of a vertical slice through the concepts, we are hoping to make advanced geometry more accessible to the reader. Riemannian geometry is a vast subject, and here we only explore its very beginnings. For further study of differential geometry and Riemannian geometry, we recommend [10, 16, 21]. From here on, we commonly drop the bold-faced letter convention from 1.2 of Chapter 3. Exceptions will be made where we specifically need to refer to material of previous chapters, such as Comment 1.1 below.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7_15, © Springer Basel 2013
1
Tensor calculus
1.1
Tensors and tensor fields
Let $M$ be a smooth manifold and let $x \in M$. An $m$-times contravariant and $n$-times covariant tensor (or, more briefly, tensor of type $(m, n)$) at $x$ is simply an element of
$$(TM_x)^{\otimes m} \otimes (TM_x^*)^{\otimes n}. \qquad (1.1.1)$$
A smooth tensor field $T$ on $M$ of type $(m, n)$ is a map assigning to each $x \in M$ a tensor $T_x$ at $x$ of type $(m, n)$ such that for a smooth coordinate system $h : U \to \mathbb{R}^N$ at any point $x \in M$,
$$(\underbrace{h_* \otimes \dots \otimes h_*}_{m+n \text{ times}})\,T_y, \qquad y \in U \qquad (1.1.2)$$
depends smoothly on $y$. (By $h_*$, we mean $Dh_x$ on the contravariant coordinates and $(D(h^{-1})_{h(x)})^*$ on the covariant coordinates. Note also that the tangent space of $\mathbb{R}^N$ is identified canonically with $\mathbb{R}^N$, so the target of (1.1.2) is canonically identified with the same finite-dimensional vector space for all $y \in U$.) For example, therefore, a smooth tensor field of type $(1, 0)$ is the same as a smooth vector field, and a smooth tensor field of type $(0, 1)$ is the same as a smooth 1-form.
Comment: The reader may wonder why the convention on using the terms covariant and contravariant when referring to tensors is opposite to the functoriality we observed in 2.4 of Chapter 12. The reason is that the traditional terminology on tensors (which we follow here) focuses on coordinates rather than the objects themselves. In other words, one does not refer to functoriality with respect to smooth maps, but with respect to coordinate change, which turns out to be the opposite. To give an example, to use the notation of Chapter 14, a tangent vector would be, in local coordinates $h^1,\dots,h^n$, written as
$$v = v^i\frac{\partial}{\partial h^i}$$
(using the Einstein convention). From the point of view of tensor calculus, we write $v$ simply as $v^i$. Note that composing the coordinate system $h$ with $\varphi$ where $\varphi : U \to V$ is a diffeomorphism of open subsets of $\mathbb{R}^n$, we have
$$\frac{\partial}{\partial(\varphi \circ h)^i} = \frac{\partial h^j}{\partial \varphi^i}\frac{\partial}{\partial h^j}$$
by the chain rule, and thus with respect to the coordinates $(\varphi \circ h)^i$, the coordinates of $v$ will be
$$D\varphi\,(v^1,\dots,v^n)^T,$$
i.e. they transform by the inverse of the matrix $(\partial h^j/\partial\varphi^i)$ which transforms the basis vectors.
1.2
A coordinate-free meaning for indices
Even though we have not specified coordinates, it is often customary to give a tensor of type $(m, n)$ $m$ different superscripts and $n$ different subscripts, e.g.
$$T^{i_1 i_2 \dots i_m}_{j_1 j_2 \dots j_n}.$$
The superscripts and subscripts are formal symbols each one of which refers simply to a particular factor of (1.1.1). For example a tensor of type $(2, 2)$ may be then denoted by
$$T^{ij}_{k\ell}.$$
This notation has immediate benefits. For example, the Einstein convention now makes sense for tensors: for tensors $T$, $S$, by the symbol
$$T^{\dots i \dots}_{\dots}\,S^{\dots}_{\dots i \dots} = S^{\dots}_{\dots i \dots}\,T^{\dots i \dots}_{\dots}$$
we mean the image of $S \otimes T$ under the map which applies the evaluation map
$$(TM_x) \otimes (TM_x)^* \to \mathbb{R}$$
to the coordinates of $S$ and $T$ labeled by $i$. We stipulate that each index will occur at most twice, but there may be multiple pairs of coinciding indices, in which case we apply multiple evaluation maps: For example,
$$T^{ij}_{k\ell}\,S^{k\ell}_{ij} \in \mathbb{R}$$
makes sense for two tensors of type $(2, 2)$ at the same point $x \in M$. This operation is often referred to as contraction.
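Once a basis is chosen, a contraction becomes an explicit sum over the repeated indices. The following sketch computes $T^{ij}_{k\ell}S^{k\ell}_{ij}$ for component arrays; the storage layout (upper indices first), the sample components and the function name `contract_full` are all assumptions made for illustration, since components depend on the chosen coordinates while the contraction itself does not.

```python
from itertools import product

def contract_full(T, S, n):
    """T^{ij}_{kl} S^{kl}_{ij}: sum over all four repeated indices.
    Components are stored as T[i][j][k][l], upper indices first."""
    return sum(T[i][j][k][l] * S[k][l][i][j]
               for i, j, k, l in product(range(n), repeat=4))

n = 2
# Sample components (assumptions): T^{ij}_{kl} = delta^i_k delta^j_l,
# and S^{kl}_{ij} = k + l + i + j.
T = [[[[1.0 if (i == k and j == l) else 0.0 for l in range(n)]
       for k in range(n)] for j in range(n)] for i in range(n)]
S = [[[[float(k + l + i + j) for j in range(n)]
       for i in range(n)] for l in range(n)] for k in range(n)]

# Only terms with k = i and l = j survive, giving the sum of 2*(i + j).
print(contract_full(T, S, n))
```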
The other benefit is that we can easily talk about symmetric and antisymmetric tensors: Recall that for two vector spaces $V$, $W$ there is a canonical interchange map
$$V \otimes W \to W \otimes V, \qquad v \otimes w \mapsto w \otimes v.$$
A tensor
$$T^{\dots i \dots j \dots}_{\dots} \quad\text{or}\quad T^{\dots}_{\dots i \dots j \dots}$$
is called symmetric (resp. antisymmetric) in the coordinates $i, j$ if applying the interchange map to those coordinates gives again $T$ (resp. $-T$). We may also say that a tensor is symmetric (resp. antisymmetric) in a set of coordinates $S$ if it is symmetric (resp. antisymmetric) in any pair $i, j \in S$. Realize that then, for example, a smooth tensor field of type $(0, k)$ antisymmetric in all its coordinates is the same thing as a smooth $k$-form.
One example needs to be discussed explicitly: recall from 2.5 of Chapter 11 that the canonical map
$$\iota_{V,W} : V^* \otimes W \to \operatorname{Hom}(V, W), \qquad (\iota(f \otimes w))(v) = f(v)\,w$$
is an isomorphism when $V$, $W$ are finite-dimensional. We then have a smooth tensor field of type $(1, 1)$ on any manifold $M$ which, at any point $x \in M$, is given by
$$\iota^{-1}_{TM_x, TM_x}(\mathrm{Id}_{TM_x}).$$
This tensor field is denoted by $\delta^i_j$.
1.3
Comment
The reader probably noticed the difference between the way subscripts and superscripts are used in the context of tensors on a Riemann manifold, and the way we used them in the last chapter: in the last chapter, an index $i$ stood simply for the $i$'th coordinate, where $i$ is a number, and the Einstein convention was used to sum terms where the same $i$ occurs twice. In the context of tensors, no number is plugged in for $i$; it is simply a label denoting which factor of the tensor product we are working with, and the Einstein convention means an application of the evaluation map. Conveniently, these two points of view are somewhat interchangeable: if we pick a local coordinate system $h$, then we have a basis $\partial/\partial h^i$ of $TM_x$, and a dual basis $dh^i$ of $(TM_x)^*$, and the evaluation map can indeed be computed by summing products of terms coupling a basis element with the corresponding element of the dual basis.
Nevertheless, one must be careful to note that the coordinate-free tensor context is more restrictive: the tensor notation should be used only for quantities which are intrinsically coordinate-free. For example, let us take the Christoffel symbols $\Gamma^k_{ij}$. On an open set in $\mathbb{R}^n$, the tangent space is canonically identified with $\mathbb{R}^n$, so we could certainly view $\Gamma^k_{ij}$ as a tensor of type $(1, 2)$. The trouble is, however, that if we change coordinates, i.e. apply a diffeomorphism to another open subset of $\mathbb{R}^n$, this will not preserve the tangent space identification, and we find that it would not preserve the tensor $\Gamma^k_{ij}$ we just defined, i.e. that for each choice of coordinates, we would get a different tensor. Usually, this is expressed by saying that $\Gamma^k_{ij}$ is not a tensor and transforms according to different rules (see Exercise (1) below). It is more accurate, however, to say that there is no canonical tensor given by the Christoffel symbols.
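A small worked example may make the last point concrete. It assumes the usual expression of the Christoffel symbols through the metric, $\Gamma^k_{ij} = \tfrac12 g^{k\ell}(\partial_i g_{j\ell} + \partial_j g_{i\ell} - \partial_\ell g_{ij})$, and uses the Euclidean metric on $\mathbb{R}^2 \smallsetminus \{0\}$ written in two coordinate systems.

```latex
\begin{align*}
\text{Cartesian } (x, y):&\quad g = dx^2 + dy^2, &
  \Gamma^k_{ij} &= 0 \ \text{ for all } i, j, k, \\
\text{polar } (r, \theta):&\quad g = dr^2 + r^2\, d\theta^2, &
  \Gamma^r_{\theta\theta} &= -r, \quad
  \Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}, \quad
  \text{all others } 0.
\end{align*}
```

The components of a tensor in two bases are related by an invertible linear map, so a tensor that vanishes in one coordinate system vanishes in every coordinate system. The symbols above vanish in Cartesian coordinates but not in polar ones, so they cannot be the components of a single tensor of type $(1, 2)$.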
2
Affine connections
2.1
The definition of an affine connection
There is no general natural way of taking a derivative of a vector field by another vector field on a smooth manifold. However, we can give a manifold additional structure which enables such operations, and specify axioms which make this operation “behave like a derivative”. This leads to the notion of an affine connection. Consider the $\mathbb{R}$-vector space $W(M)$ of all smooth vector fields on a smooth manifold $M$. An affine connection (or, more briefly, connection) on $M$ is a bilinear map
$$W(M) \times W(M) \to W(M), \quad (u, v) \mapsto \nabla_u(v)$$
such that for a smooth function $f: M \to \mathbb{R}$, we have
$$\nabla_{fu}(v) = f\,\nabla_u(v) \tag{2.1.1}$$
and
$$\nabla_u(fv) = \partial_u f \cdot v + f\,\nabla_u(v). \tag{2.1.2}$$
By $\partial_u f$ we mean the function which, at $x \in M$, is the directional derivative at $x$ of $f$ by the vector $u(x)$. (Note that (2.1.2) can be interpreted as a kind of Leibniz rule.)
2.2
Locality
Perhaps the first thing to notice about affine connections is that they are “local” in the following sense: the value of $\nabla_u(v)$ at a point $x \in M$ clearly depends only on
the value $u(x)$ and the value of $v$ on the image of any continuously differentiable oriented curve $\gamma: (-a, a) \to M$ (such that $\gamma'(t) \neq 0$) where $\gamma(0) = x$ and $\gamma'(0) = u(x)$.
Choosing vector fields $e_1, \dots, e_N$ such that $e_1(x), \dots, e_N(x)$ form a basis of $TM_x$, by (2.1.1) and bilinearity, the values $\nabla_{e_i}(v)$ clearly determine the value of $\nabla_u(v)$ at $x$ for any $u$. To prove the statement about the $v$ variable, first note that $\nabla_u(v)(x)$ and $\nabla_u(w)(x)$ are equal if $v$, $w$ coincide in a neighborhood of $x$: in such a case, there exists a smooth function $h: M \to \mathbb{R}$ such that $h$ is constant $1$ in a neighborhood of $x$, and $hv = hw$. This implies our claim by axiom (2.1.2). Now choose local coordinates $h: U \to \mathbb{R}^n$, $h(x) = 0$. Then without loss of generality, $M = h[U]$, which is an open set in $\mathbb{R}^n$ (in other words, $h$ is the inclusion). We can write
$$v = f^i e_i, \quad w = g^i e_i,$$
and by our assumption, $f^i$ and $g^i$ coincide on the image $\gamma[(-a, a)]$ of $\gamma$. In particular,
$$\partial_u f^i(x) = \partial_u g^i(x).$$
Consequently, again, our claim follows from axiom (2.1.2).
Consider now a function $v$ assigning to $y \in \gamma[(-a, a)]$ an element in $TM_y$ which is smooth in the sense that $Dh \circ v \circ \gamma: (-\varepsilon, \varepsilon) \to \mathbb{R}^n$ is a smooth function where $h$ are local coordinates at $x$. We shall call such a function a smooth vector field defined on $\gamma[(-a, a)]$. By the Implicit Function Theorem, a smooth vector field defined on $\gamma[(-a, a)]$ extends to a smooth vector field in a neighborhood of $x$, and by the previous remarks, $\nabla_u(v)(x)$ is well defined even though $v$ is not a priori a smooth vector field defined on a neighborhood of $x$. This is sometimes important.
2.3
Examples
1. The most basic example is the canonical connection in $\mathbb{R}^n$: since the tangent space of $\mathbb{R}^n$ is canonically identified with $\mathbb{R}^n$, vector fields are canonically identified with $\mathbb{R}^n$-valued functions, and we may simply define the value of $\nabla_u(v)$ at $x$ as the $u(x)$-directional derivative of $v$ (considered as an $\mathbb{R}^n$-valued function) at $x$.
2. Let us now generalize this example in the spirit of the previous section. Let $U$ be an open subset of $\mathbb{R}^n$ and let $g_{ij}$ be a Riemann metric on $U$. Define
$$\nabla_u(v) = u^i\left(\frac{\partial v^j}{\partial x^i}\, e_j + v^j\, \Gamma^k_{ij}\, e_k\right) \tag{2.3.1}$$
using the standard coordinates $x^i$ in $\mathbb{R}^n$, and letting $e_i$ be the standard basis of $\mathbb{R}^n$. The axioms (2.1.1), (2.1.2) are readily verified. To explain where this formula comes from, note that by the chain rule, if $x = x(t)$ is a geodesic, then (2.3.1) gives precisely
$$\nabla_{x'(t)}\, x' = 0, \tag{2.3.2}$$
which really “looks like” a generalization of the equation of a straight line in $\mathbb{R}^n$, although the left-hand side must be taken in the sense of the remarks made in 2.2. Perhaps the main purpose of this section is to develop this example further, generalize it to Riemann manifolds and show its significance to a Riemann metric; but we need to develop the general theory of connections further first.
2.4
Parallel transport and geodesics
2.4.1 Let $\gamma: \langle a, b\rangle \to M$ be a continuously differentiable parametrized curve in a smooth manifold $M$ with affine connection $\nabla$ (as usual, we assume that $\gamma'(t) \neq 0$ for any $t \in \langle a, b\rangle$ and take the one-sided derivatives at the boundary points). Consider the equation
$$\nabla_{\gamma'(t)}\, y(\gamma(t)) = 0 \tag{*}$$
where $y$ is a smooth vector field defined on $\gamma[\langle a, b\rangle]$. Clearly, we can treat this problem locally, and hence we may work in a coordinate neighborhood $U$ of $M$, where we have a smooth coordinate system $h: U \to \mathbb{R}^N$. Let $e_i = \partial/\partial h^i$. Writing
$$y = x^i e_i,$$
the equation (*) becomes a system of first-order linear differential equations in the coefficients $x^i$. Thus, by Theorem 1.3 of Chapter 7, there is a unique solution to the equation (*) with given value
$$v = y(\gamma(a)) \in TM_{\gamma(a)}.$$
This solution is called the parallel transport of the vector $v$ along the parametrized curve $\gamma$ with respect to the affine connection $\nabla$. It is important to note, however, that performing parallel transport on a vector $v = y(\gamma(a)) \in TM_{\gamma(a)}$ along a parametrized closed curve $\gamma$ may produce a different vector $v \neq y(\gamma(b)) \in TM_{\gamma(b)} = TM_{\gamma(a)}$. This is related to two quantities known as torsion and curvature associated with the affine connection $\nabla$, which we will discuss in the next section.
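The closing remark can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the unit sphere with coordinates $(\theta, \varphi)$ (colatitude and longitude) and the standard Christoffel symbols of the round metric, $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta$, which are not derived in the text. The function name `transport_along_latitude` and the choice of a classical fourth-order Runge-Kutta integrator are likewise assumptions. It solves the linear system (*) once around the closed curve $\theta = \theta_0$.

```python
import math

def transport_along_latitude(theta0, y0, steps=2000):
    """Parallel-transport y0 = (y^theta, y^phi) once around the circle theta = theta0.

    The curve is gamma(t) = (theta0, t), t in [0, 2*pi], so gamma' = (0, 1) and
    equation (*) reads dy^k/dt = -Gamma^k_{phi j} y^j.
    """
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(y):
        y_theta, y_phi = y
        # Gamma^theta_{phi phi} = -s*c and Gamma^phi_{phi theta} = c/s
        return (s * c * y_phi, -(c / s) * y_theta)

    h = 2.0 * math.pi / steps
    y = tuple(y0)
    for _ in range(steps):  # classical RK4 steps
        k1 = rhs(y)
        k2 = rhs((y[0] + 0.5 * h * k1[0], y[1] + 0.5 * h * k1[1]))
        k3 = rhs((y[0] + 0.5 * h * k2[0], y[1] + 0.5 * h * k2[1]))
        k4 = rhs((y[0] + h * k3[0], y[1] + h * k3[1]))
        y = (y[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             y[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return y

# Colatitude pi/3: the transported vector comes back rotated by the angle pi.
result = transport_along_latitude(math.pi / 3, (1.0, 0.0))
```

For $\theta_0 = \pi/3$ the system is a rotation with angular speed $\cos\theta_0 = 1/2$, so after one loop the vector returns as approximately $-v$, although the curve is closed.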
2.4.2 Geodesics
We can now also see that the concept of a geodesic generalizes to any smooth manifold with an affine connection. In effect, if, in a local coordinate system $h$, we write $e_i = \partial/\partial h^i$, and define Christoffel symbols of the connection $\nabla$ by
$$\nabla_{e_i}(e_j) = \Gamma^k_{ij}\, e_k, \tag{*}$$
then in this generalized sense, any affine connection in local coordinates is given by the formula (2.3.1) of 2.3 (by the axioms of 2.1). We then see that the “geodesic equation” (2.3.2) written in coordinates becomes a (non-linear) second-order ordinary differential equation, and hence locally has solutions uniquely determined by the value and derivative at a single point (by Corollary 4.1.2 of Chapter 14).
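For the reader who wants to see the second-order equation explicitly, here is a short derivation, a sketch obtained by substituting $u = v = x'(t)$ into (2.3.1) and using the chain rule $\frac{dx^i}{dt}\frac{\partial}{\partial x^i} = \frac{d}{dt}$ along the curve.

```latex
\begin{align*}
\nabla_{x'(t)}\, x'
  &= \frac{dx^i}{dt}\left(\frac{\partial}{\partial x^i}\Big(\frac{dx^j}{dt}\Big)\, e_j
     + \frac{dx^j}{dt}\,\Gamma^k_{ij}\, e_k\right) \\
  &= \left(\frac{d^2 x^k}{dt^2}
     + \Gamma^k_{ij}\,\frac{dx^i}{dt}\,\frac{dx^j}{dt}\right) e_k ,
\end{align*}
\qquad\text{so (2.3.2) reads}\qquad
\frac{d^2 x^k}{dt^2} + \Gamma^k_{ij}\,\frac{dx^i}{dt}\,\frac{dx^j}{dt} = 0,
\quad k = 1, \dots, n.
```

The equation is quadratic in the first derivatives, which is why it is non-linear, and it is of second order, which is why a solution is determined by $x(t_0)$ and $x'(t_0)$.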
3
Tensors associated with an affine connection: torsion and curvature
Recall the vector space $W(M)$ of smooth vector fields on $M$. We will prove the following
3.1 Lemma. Suppose we have a multi-linear function
$$\Phi: \underbrace{W(M) \times \dots \times W(M)}_{k \text{ times}} \to W(M)$$
which has the property that
$$\Phi(u_1, \dots, u_{i-1}, y u_i, u_{i+1}, \dots, u_k)_x = y(x)\,\Phi(u_1, \dots, u_k)_x \tag{*}$$
for every smooth function $y: M \to \mathbb{R}$ and every $i = 1, \dots, k$. Then $\Phi(u_1, \dots, u_k)_x$ only depends on $(u_1)_x, \dots, (u_k)_x$, and defines a smooth tensor field of type $(1, k)$.
Remark: Multi-linearity of $\Phi$, as we defined it, guarantees condition (*) for a constant function $y$.
Proof. By the same reasoning as in 2.2, the value $\Phi(u_1, \dots, u_k)_x$ depends only on values of $u_i$ in an open neighborhood $U$ of $x$. We may assume $U$ to be a coordinate neighborhood with a coordinate function $h: U \to \mathbb{R}^N$, and let $e_i = \partial/\partial h^i$. Then we may write
$$u_i = {}_i y^j\, e_j$$
for smooth functions ${}_i y^j$ on $U$, so (*) implies that $\Phi(u_1, \dots, u_k)_x$ is the sum of
$${}_1 y^{j_1}(x) \cdots {}_k y^{j_k}(x)\,\Phi(e_{j_1}, \dots, e_{j_k})_x$$
over all possible choices $1 \le j_1, \dots, j_k \le N$. This implies our statement. □
3.2 Let $\nabla$ be an affine connection on a smooth manifold $M$. We will give two examples of quantities satisfying the assumptions of Lemma 3.1, namely
$$T(u, v) = \nabla_u(v) - \nabla_v(u) - [u, v]$$
and
$$R(u, v, w) = \nabla_u \nabla_v(w) - \nabla_v \nabla_u(w) - \nabla_{[u,v]}(w)$$
where $[u, v]$ is the Lie bracket of the smooth vector fields $u, v$ (see Section 7 of Chapter 6, and Exercise (5)).
Lemma. The functions $T$, $R$ satisfy the hypotheses of Lemma 3.1, and hence define smooth tensor fields $T^k_{ij}$, $R^\ell_{ijk}$. Furthermore, both of these tensors are antisymmetric in the coordinates $i, j$.
Remark: The tensors $T^k_{ij}$, $R^\ell_{ijk}$ are called the torsion tensor and curvature tensor, respectively.
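Proof of the Lemma: Multilinearity is obvious, as is antisymmetry in the specified coordinates. Condition (*) of Lemma 3.1 is a direct calculation:
$$\begin{aligned}
T(yu, v) &= \nabla_{yu}(v) - \nabla_v(yu) - [yu, v] \\
&= y\nabla_u(v) - y\nabla_v(u) - (\partial_v y)\cdot u - y[u, v] + (\partial_v y)\cdot u = y\,T(u, v),
\end{aligned}$$
$$\begin{aligned}
R(yu, v, w) &= \nabla_{yu}\nabla_v(w) - \nabla_v\nabla_{yu}(w) - \nabla_{[yu,v]}(w) \\
&= y\nabla_u\nabla_v(w) - \nabla_v(y\nabla_u(w)) - \nabla_{y[u,v]}w + \nabla_{(\partial_v y)u}\,w \\
&= y\nabla_u\nabla_v(w) - y\nabla_v\nabla_u(w) - \partial_v y\cdot\nabla_u(w) - y\nabla_{[u,v]}w + \partial_v y\cdot\nabla_u(w) = y\,R(u, v, w),
\end{aligned}$$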
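$$\begin{aligned}
R(u, v, yw) &= \nabla_u\nabla_v(yw) - \nabla_v\nabla_u(yw) - \nabla_{[u,v]}(yw) \\
&= \nabla_u(y\nabla_v(w)) + \nabla_u(\partial_v y\cdot w) - \nabla_v(y\nabla_u(w)) - \nabla_v((\partial_u y)\cdot w) - y\nabla_{[u,v]}(w) - \partial_{[u,v]}y\cdot w \\
&= y\nabla_u\nabla_v(w) + \partial_u y\,\nabla_v(w) + \partial_v y\,\nabla_u(w) + \partial_u\partial_v y\cdot w \\
&\quad - y\nabla_v\nabla_u(w) - \partial_v y\,\nabla_u(w) - (\partial_u y)\nabla_v(w) - \partial_v\partial_u y\cdot w \\
&\quad - y\nabla_{[u,v]}w - \partial_u\partial_v y\cdot w + \partial_v\partial_u y\cdot w = y\,R(u, v, w).
\end{aligned}$$
The other cases follow by antisymmetry. □
3.3
Example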
The connection defined in Example 2 of 2.3 has zero torsion. This immediately follows from the fact that
$$\Gamma^k_{ij} = \Gamma^k_{ji}. \tag{3.3.1}$$
Compare this to the beginning of Subsection 3.3 of Chapter 14, where we specifically defined the Christoffel symbols in such a way as to make (3.3.1) true. In fact, more generally, we see from the comments made in 2.4.2 and formula (2.3.1) of 2.3 that any affine connection has zero torsion if and only if, in local coordinates, it satisfies (3.3.1) in the sense of 2.4.2.
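The equivalence just stated can be checked in one line. The computation below is a sketch using only the definition of $T$ in 3.2, the Christoffel symbols of 2.4.2, and the fact that coordinate vector fields $e_i = \partial/\partial h^i$ commute.

```latex
\begin{align*}
T(e_i, e_j)
  &= \nabla_{e_i}(e_j) - \nabla_{e_j}(e_i) - [e_i, e_j] \\
  &= \Gamma^k_{ij}\, e_k - \Gamma^k_{ji}\, e_k - 0
   = \big(\Gamma^k_{ij} - \Gamma^k_{ji}\big)\, e_k ,
\end{align*}
\qquad\text{i.e.}\qquad
T^k_{ij} = \Gamma^k_{ij} - \Gamma^k_{ji}.
```

Since $T$ is a tensor by Lemma 3.2, it vanishes if and only if its components on the coordinate fields vanish, which is exactly the symmetry (3.3.1).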
3.4
A characterization of the Euclidean connection
Theorem. Let $M$ be a smooth manifold with an affine connection $\nabla$, and let $x \in M$. Then there exists an open neighborhood of $x$ in which $\nabla$ has torsion and curvature tensors equal identically to 0 if and only if there exists an open neighborhood $U$ of $x$ and a coordinate system $h: U \to \mathbb{R}^n$ which sends $\nabla$ restricted to $U$ to the canonical connection (Example 1 of 2.3) on $\mathbb{R}^n$, restricted to $h[U]$.
Proof. Clearly, the Euclidean connection has torsion and curvature 0, and hence the existence of the coordinate system $h: U \to \mathbb{R}^n$ with the specified properties implies that $\nabla$ is torsion and curvature free on $U$. On the other hand, consider a connection $\nabla$ on $M$ which is torsion and curvature free on an open neighborhood of $x$. Choose a basis $e_1, \dots, e_n$ of $TM_x$. Let $\gamma: (-a_1, a_1) \to M$, $\gamma(0) = x$, $\gamma'(0) = e_1$ be a geodesic with respect to $\nabla$. Now denote the parallel transport of $e_2$ along $\gamma$ also by $e_2$ at each point $\gamma(t_1)$, $t_1 \in (-a_1, a_1)$. Let $\gamma_{t_1}: (-a_2, a_2) \to M$ be a geodesic with $\gamma_{t_1}(0) = \gamma(t_1)$, $\gamma_{t_1}'(0) = e_2$. Note that we may assume the number $a_2 > 0$ is independent of $t_1$ because of smooth dependence of geodesics on boundary conditions (the argument of 2.4 extends verbatim to this situation). By the same argument, we may also consider $\gamma$ as a smooth function
$$\gamma: (-a_1, a_1) \times (-a_2, a_2) \to M.$$
We will denote the two independent variables by $t_1 \in (-a_1, a_1)$, $t_2 \in (-a_2, a_2)$. We clearly have
$$\left[\frac{\partial}{\partial t_1}, \frac{\partial}{\partial t_2}\right] = 0 \tag{1}$$
by the commutation of partial derivatives. Write
$$e_i = \frac{\partial}{\partial t_i}, \tag{2}$$
$i = 1, 2$. By the fact that $\nabla$ has 0 curvature, parallel transports along the curves $\gamma_{t_1, ?}$ and $\gamma_{?, t_2}$ with constant $t_1$ resp. $t_2$ therefore commute. We conclude in particular that
$$\nabla_{e_2}(e_1) = 0,$$
since it is true at $t_2 = 0$ by our definition. Since $\nabla$ has 0 torsion, we also have
$$\nabla_{e_1}(e_2) = 0,$$
and since the curvature is 0,
$$\nabla_{e_2}\nabla_{e_1}(e_1) = \nabla_{e_1}\nabla_{e_2}(e_1) = 0.$$
Hence, in fact,
$$\nabla_{e_1}(e_1) = 0,$$
since it is true at $t_2 = 0$ by our definitions. In conclusion,
$$\nabla_{e_i}(e_j) = 0 \tag{3}$$
for $i, j \in \{1, 2\}$. Now assume, by induction, that we have a function
$$\gamma: (-a_1, a_1) \times \dots \times (-a_k, a_k) \to M$$
such that if we define (2), then (3) is true for all $i, j \in \{1, \dots, k\}$. If $k < n$, denote the parallel transport of $e_{k+1}$ to any of the points $\gamma(t_1, \dots, t_k)$ by the curves $\gamma(t_1, \dots, t_{i-1}, ?, t_{i+1}, \dots, t_k)$ (with only one $t_i$ non-constant) by $e_{k+1}$. Smooth dependence on boundary conditions implies that $\gamma$ is a smooth function of the $k + 1$ variables $t_1, \dots, t_{k+1}$ on some set
$$(-a_1, a_1) \times \dots \times (-a_{k+1}, a_{k+1})$$
and applying the above argument to individual pairs of coordinates gives (3) for $i, j \in \{1, \dots, k + 1\}$. Thus, we may assume $k = n$. But then $\gamma$ is locally the inverse of a local coordinate system on $M$ at $x$ (by the Inverse Function Theorem), and (3) implies that this coordinate system carries the connection $\nabla$ to the Euclidean connection of Example 1 of 2.3, as claimed. □
4
Riemann manifolds
The purpose of this section is to put, finally, everything together. We define a connection canonically associated with a Riemann metric on a smooth manifold, called the Levi-Civita connection. We define the curvature of a Riemann manifold, and prove that vanishing of the curvature locally characterizes Euclidean geometry up to isometry.
4.1
Riemann metrics
A (smooth) Riemann metric on a smooth manifold $M$ is a smooth tensor field of type $(0, 2)$, denoted usually by $g_{ij}$, which is symmetric and such that for each $x \in M$, the symmetric bilinear form on $TM_x$ defined by
$$g(u, v) = g_{ij}\, u^i v^j$$
is positive-definite (and hence defines a real inner product). A smooth manifold with a Riemann metric is called a Riemann manifold. The fact that we considered an inner product on $TM_x$ (as opposed to $TM_x^*$) is merely a convention: we claim that given a Riemann metric $g_{ij}$, there exists a unique tensor of type $(2, 0)$ denoted by $g^{ij}$ such that
$$g_{ij}\, g^{jk} = \delta_i^k,$$
which, moreover, defines a positive-definite symmetric bilinear form on $TM_x^*$: picking an ordered basis $B$ of $TM_x$, the matrix of $g^{ij}$ with respect to the ordered basis of $TM_x^*$ dual to $B$ is the inverse of the matrix of $g_{ij}$ with respect to $B$, which is also positive-definite (see Exercise (2)). Similarly, we could have started with a positive-definite symmetric tensor $g^{ij}$, and a positive-definite symmetric tensor $g_{ij}$ would be determined.
An isometry is a smooth diffeomorphism $f: M \to N$ between Riemann manifolds with Riemann metrics $g$, $\tilde g$ such that $f^*\tilde g = g$.
4.1.1 Lemma. Every smooth manifold $M$ has a Riemann metric.
Proof. The statement is certainly true if we replace $M$ by one of its coordinate neighborhoods $U_i$ (since for an open subset of $\mathbb{R}^n$, we can take the standard inner product on $\mathbb{R}^n$). Let ${}_i g$ be the Riemann metric on $U_i$, and let $u_i$ be a smooth partition of unity subordinate to the open cover $(U_i)$. Then
$$\sum_i u_i\,({}_i g)$$
is a Riemann metric on $M$. (Note that a linear combination of finitely many positive-definite symmetric matrices with positive coefficients is a positive-definite symmetric matrix.) □
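The parenthetical remark in the proof can be tried out on a small case. The sketch below is only an illustration: the helper names `is_positive_definite_2x2` and `weighted_sum`, the two sample Gram matrices and the weights (playing the role of the values $u_i(x)$ of a partition of unity at one point) are all hypothetical. Positive-definiteness of a symmetric $2 \times 2$ matrix is tested by Sylvester's criterion.

```python
def is_positive_definite_2x2(m):
    """Sylvester's criterion for a symmetric 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return b == c and a > 0 and a * d - b * c > 0

def weighted_sum(weights, matrices):
    """Entrywise sum of w * m over the given 2x2 matrices."""
    return [[sum(w * m[r][s] for w, m in zip(weights, matrices))
             for s in range(2)] for r in range(2)]

# Two local inner products at the same point, and weights with sum 1.
g1 = [[2.0, 1.0], [1.0, 1.0]]
g2 = [[1.0, -1.0], [-1.0, 3.0]]
weights = [0.25, 0.75]

combined = weighted_sum(weights, [g1, g2])
```

Here `combined` is again symmetric and positive-definite. The same computation with a negative weight can fail, which is why the non-negativity of the functions $u_i$ matters in the proof.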
4.1.2 The induced Riemann metric
Lemma 4.1.1 is often very useful technically, but is perhaps not very geometric: the Riemann metric which we proved to exist has no geometric meaning. Typically, we are dealing with a situation where a Riemann metric is given and we are interested in its properties. The most common way a Riemann metric can be given is as follows: suppose we are given a Riemann metric on a smooth manifold $N$, and suppose $\iota: M \subset N$ is a smooth submanifold (we could more generally consider the situation when $\iota$ is an immersion). Then we have a naturally induced Riemann metric on $M$, simply because for $x \in M$, we have an embedding $TM_x \subset TN_x$. To show that this induced Riemann metric is smooth, recall that $g_{ij}$ is contravariant with respect to $\iota$, so we know that $(g_{ij})_M = \iota^*((g_{ij})_N)$ is a smooth tensor field.
4.2
Riemann metrics and connections
Let $g_{ij}$ be a Riemann metric on a smooth manifold $M$, and let $\nabla$ be an affine connection on $M$. We say that the connection $\nabla$ is compatible with the Riemann metric $g_{ij}$ if $g_{ij}$ is preserved by parallel transport, i.e. for a smooth parametrized curve with boundary points $x$, $y$, and two vectors $u, v \in TM_x$, if $\tilde u$, $\tilde v$ are the parallel transports of $u, v$ to $TM_y$, we have
$$g_{ij}\,\tilde u^i \tilde v^j = g_{ij}\, u^i v^j. \tag{4.2.1}$$
An “infinitesimal version” of this condition is (dropping the indices)
$$\partial_u(g(v, w)) = g(\nabla_u(v), w) + g(v, \nabla_u(w)). \tag{2}$$
(See Exercise (4).)
Theorem. For every Riemann metric $g$ on a smooth manifold $M$, there exists a unique affine connection $\nabla$ on $M$ which is compatible with $g$ and has 0 torsion. This affine connection $\nabla$ is known as the Levi-Civita connection.
Proof. We shall prove uniqueness first. Suppose we have a torsion free affine connection $\nabla$ compatible with the Riemann metric $g$. Let $u, v, w$ be smooth vector fields on $M$. Compute from (2) and the fact that $\nabla$ is torsion free:
$$\begin{aligned}
&\partial_u(g(v, w)) + \partial_v(g(u, w)) - \partial_w(g(u, v)) \\
&\quad = g(\nabla_u v, w) + g(\nabla_u w, v) + g(\nabla_v u, w) + g(\nabla_v w, u) - g(\nabla_w u, v) - g(\nabla_w v, u) \\
&\quad = 2g(\nabla_u v, w) - g([u, v], w) + g([u, w], v) + g([v, w], u).
\end{aligned}$$
Therefore,
$$g(\nabla_u v, w) = \tfrac12\big(\partial_u(g(v, w)) + \partial_v(g(u, w)) - \partial_w(g(u, v)) + g([u, v], w) - g([u, w], v) - g([v, w], u)\big).$$
Hence, $\nabla_u v$ is determined by $g$.
Now we will prove existence. We will first treat the case when $M = U$ is an open subset of $\mathbb{R}^n$. In this case, consider the connection (2.3.1) constructed in Example 2 of 2.3. We already know from Example 3.3 that this connection is torsion free. To verify that this connection is compatible with the metric $g$, by the chain rule, it suffices to verify the condition (2) in the case when $u = e_i$, $v = e_j$, $w = e_k$. Thus, we need to show that
$$\partial_{e_i}(g(e_j, e_k)) = g(\nabla_{e_i} e_j, e_k) + g(e_j, \nabla_{e_i} e_k),$$
which translates to
$$\frac{\partial g_{jk}}{\partial x^i} = \Gamma_{kij} + \Gamma_{jik},$$
which follows directly from equation (3.3.3) of Chapter 14.
Now let $M$ be an arbitrary smooth Riemann manifold, and let $(U_i)$ be a coordinate cover of $M$. Then by what we just proved, and by locality of connections, we have smooth torsion free connections on each $U_i$ which are compatible with $g$. By uniqueness, further, the connections corresponding to $U_i$ and $U_j$ coincide on $U_i \cap U_j$. Thus, these connections together define a torsion free affine connection on $M$ compatible with $g$. □
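The uniqueness argument also yields an explicit coordinate formula. The following is a sketch that specializes the identity for $g(\nabla_u v, w)$ to coordinate vector fields $u = e_i$, $v = e_j$, $w = e_k$, for which all Lie brackets vanish; the Christoffel symbols are taken in the sense of 2.4.2, and $g^{mk}$ is the inverse metric of 4.1.

```latex
\begin{align*}
g(\nabla_{e_i} e_j, e_k) = \Gamma^\ell_{ij}\, g_{\ell k}
  &= \frac12\left(\frac{\partial g_{jk}}{\partial x^i}
     + \frac{\partial g_{ik}}{\partial x^j}
     - \frac{\partial g_{ij}}{\partial x^k}\right), \\
\Gamma^m_{ij}
  &= \frac12\, g^{mk}\left(\frac{\partial g_{jk}}{\partial x^i}
     + \frac{\partial g_{ik}}{\partial x^j}
     - \frac{\partial g_{ij}}{\partial x^k}\right).
\end{align*}
```

The second line follows from the first by contracting with $g^{mk}$ and using $g_{\ell k}\, g^{km} = \delta_\ell^m$. The right-hand side is visibly symmetric in $i, j$, in agreement with (3.3.1).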
4.3
The curvature tensor of a Riemann manifold, and a characterization of Euclidean geometry
Let M be a smooth manifold with a Riemann metric g. To this data, we have uniquely associated the Levi-Civita connection r by Theorem 4.2. The curvature tensor R of the Levi-Civita connection is called the curvature tensor of the Riemann manifold M . The culmination of our work is the following result, which characterizes Euclidean geometry in the world of Riemann manifolds!
Theorem. Let $M$ be a Riemann manifold, and let $x \in M$. Then there exists an open neighborhood of $x$ on which $R = 0$ if and only if there exists an open neighborhood $U$ of $x$ and a smooth map $h: U \to \mathbb{R}^n$ which is an isometry onto its image.
Proof. The necessity of 0 curvature for the existence of $h$ follows directly from Theorem 3.4, and the sufficiency almost does. In effect, if curvature vanishes in a neighborhood of $x$, from Theorem 3.4, we get an open neighborhood $U$ of $x$ and a map $h: U \to \mathbb{R}^n$ which is a diffeomorphism onto its image such that $h$ maps the Levi-Civita connection on $U$ to the Euclidean connection on $h[U]$. Clearly, we may then assume that $U = M$ and $h$ is the identity. Note however that we have not proved that the map $h$ preserves Riemann metrics. In effect, we must investigate the question: what Riemann metrics is the Euclidean connection $\nabla$ compatible with? To answer this question, assume, without loss of generality, that $U$ is connected (in fact, we could assume without loss of generality that it is an open ball). We see from the formulation (4.2.1) of compatibility of the connection with the metric that given an inner product $g_x$ on $TM_x$ for a chosen point $x \in U$, there is at most one Riemann metric $g_{ij}$ on $U$ with which $\nabla$ is compatible and such that $(g_{ij})_x = g_x$ (since the inner product on $TM_y$ for all $y \in U$ is then determined by parallel transport). Since, however, for the Euclidean connection, parallel transport is simply the identity when we make the canonical identification of $TM_y$ with $\mathbb{R}^n$, for any inner product $g_x$ on $TM_x = \mathbb{R}^n$, there is precisely one Riemann metric with which $\nabla$ is compatible, namely the one specified by the same inner product on all $TM_y = \mathbb{R}^n$. Since any two inner product spaces of the same dimension are isomorphic, to get the desired isometry, it suffices to pick an affine map $\alpha: \mathbb{R}^n \to \mathbb{R}^n$ which takes the inner product on $TM_x$ to the standard inner product on $\mathbb{R}^n$ for a single point $x \in U$. We may then put $h = \alpha|_U$. □
Remark: For a general Riemann metric, it is not so easy to characterize all Riemann metrics with which its Levi-Civita connection is compatible, although (for connected manifolds) it remains true that such Riemann metrics are characterized by the inner product they give on $TM_x$ at a single point. Which of these inner products are allowable, however, is related to the notion of holonomy, which we do not discuss here. We refer the interested reader to [21].
5
Riemann surfaces and surfaces with Riemann metric
Despite the fact that both concepts are attributed to Riemann, a Riemann surface is not the same thing as a Riemann manifold which is a surface (i.e. has dimension 2). A Riemann surface $\Sigma$ is of course, in particular, a 2-dimensional manifold, and hence Lemma 4.1.1 applies. Additionally, $\Sigma$ comes with the structure of a complex manifold, but that is not the same thing as a Riemann metric.
5.1
The compatible complex structure
When putting a Riemann metric on a Riemann surface, we are usually only interested in compatible metrics, which means that for any tangent vector $u \in T\Sigma_x$ for any $x \in \Sigma$, $u$ is orthogonal to $iu$. Nevertheless, the method of Lemma 4.1.1 readily applies to prove the following
Lemma. Every Riemann surface $\Sigma$ has a compatible Riemann metric.
Proof. On an open subset of $\mathbb{C}$, the metric on $\mathbb{C}$ identified with $\mathbb{R}^2$ via the isomorphism
$$z \mapsto (\mathrm{Re}(z), \mathrm{Im}(z)) \tag{5.1.2}$$
is clearly compatible. Let, again, $U_i$ be the coordinate neighborhoods of $\Sigma$, let ${}_i g$ be a compatible Riemann metric on $U_i$ and let $u_i$ be a smooth partition of unity subordinate to $(U_i)$. Then, as before,
$$\sum_i u_i\,({}_i g)$$
is the desired compatible Riemann metric on $\Sigma$. □
In this context, it is also appropriate to make the following
Observation. Every Riemann surface $\Sigma$ comes with a canonical (i.e. preferred) orientation.
Proof. We will produce a nowhere vanishing 2-form on $\Sigma$. In fact, on an open subset of $\mathbb{C}$, we can simply take the form $dx\,dy$ where $x$ and $y$ are the first and second coordinates of $\mathbb{R}^2$ (i.e. $z = x + iy$). Note again that the coordinates of a complex number $\lambda z = \tilde x + i\tilde y$, $\lambda = |\lambda| e^{i\alpha} \in \mathbb{C}$, are given by
$$\begin{pmatrix}\tilde x \\ \tilde y\end{pmatrix} = |\lambda| \begin{pmatrix}\cos(\alpha) & -\sin(\alpha) \\ \sin(\alpha) & \cos(\alpha)\end{pmatrix}\begin{pmatrix}x \\ y\end{pmatrix}.$$
We conclude that
$$d\tilde x\,d\tilde y = |\lambda|^2\,dx\,dy. \tag{5.1.3}$$
Now let $(U_i)$ be the coordinate neighborhoods of $\Sigma$ and let $\omega_i$ be a 2-form induced as above from $dx\,dy$ by the complex coordinate $z = x + iy$ on $U_i$. The key observation is that, by (5.1.3), on the intersection $U_i \cap U_j$,
$$\omega_i = h\,\omega_j$$
where $h$ is a positive smooth real function. Thus, if $u_i$ is, again, a smooth partition of unity subordinate to $(U_i)$, then
$$\omega = \sum_i u_i\,\omega_i$$
is the nowhere vanishing 2-form on $\Sigma$ we were seeking. Simultaneously, it follows that the form obtained from any other complex atlas is a multiple of $\omega$ by a positive smooth real function. □
5.2
The complex structure on an oriented surface with a Riemann metric: reduction to the equation of holomorphic disks
The orientation constructed in the Observation is called a compatible orientation on $\Sigma$. In view of the Observation and Lemma 5.1, it is a natural question if there is a converse, i.e. if every 2-dimensional oriented Riemann manifold has a structure of a Riemann surface with which the Riemann metric and orientation are compatible. The answer is affirmative, but the proof turns out to be quite hard. We will need the full force of the methods of Section 5 of Chapter 13.
Let $\Sigma$ be a 2-dimensional oriented manifold with a Riemann metric. Our task is to construct a complex structure compatible with the metric. Let $x \in \Sigma$. Clearly, it is enough to construct a conformal orientation-preserving coordinate $u: U \to \mathbb{C}$ (with non-singular differential at $x$). It turns out that it is somewhat easier to construct the inverse of the coordinate function $u$, which we will denote by $f = f(z)$. Note that, without loss of generality, we may assume that $U = \Sigma$ is an open subset of $\mathbb{C}$ and $x = 0 = u(x)$, so the function $f$ we seek should map an open neighborhood of 0 onto $U$, $f(0) = 0$, and $Df_0$ should be non-singular and orientation preserving. What is, however, the condition of compatibility of complex structure with Riemann metric in this setting?
To understand this, note that a 2-dimensional oriented inner product $\mathbb{R}$-vector space $V$ comes with a canonical complex structure $J$, which means a linear map $J: V \to V$ such that $J^2 = -\mathrm{Id}$. In fact, define $Jv$ to be the vector of length $\|v\|$ which is orthogonal to $v$ and has the property that $v \wedge Jv$ has positive orientation. In this setting, the Riemann metric therefore specifies, at each $z \in U$, a complex structure $J_z$ on $\mathbb{C} = TU_z$, which varies smoothly as a function of $z$. This is referred to as an almost complex structure. We are therefore seeking a smooth function $f(z)$ defined in a neighborhood of 0 such that
$$Df_z(it) = J_{f(z)}\,Df_z(t), \quad f(0) = 0, \quad \det(Df_0) \neq 0. \tag{5.2.1}$$
This is our first encounter with the equation of holomorphic disks. In order to solve the equation, however, it is more convenient to write it in terms of complex
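differential 1-forms. A complex differential 1-form $\alpha$ on $U$ is said to be of $J$-type $(1, 0)$ if for every $v \in \mathbb{C}$, and every $z \in U$,
$$\alpha(z)(J_z v) = i\,\alpha(z)(v).$$
(Note that a 1-form of type $(1, 0)$ with respect to the standard complex structure $i$ is simply of the form
$$\phi(z)\,dz,$$
where $\phi(z)$ is a smooth function, i.e. not necessarily a holomorphic 1-form.)
Now by definition, there exists a smooth function $\mu: U \to \mathbb{C}$ such that
$$dz = \alpha + \mu(z)\,\bar\alpha$$
where $\alpha$ is of $J$-type $(1, 0)$. We have
$$d\bar z = \bar\alpha + \bar\mu\,\alpha,$$
and hence
$$\alpha = \frac{dz - \mu\,d\bar z}{1 - |\mu|^2}.$$
Thus, the complex 1-form $dz - \mu(z)\,d\bar z$ is of $J$-type $(1, 0)$ and the condition of $f$ being $J$-holomorphic means that
$$f^*(dz - \mu(z)\,d\bar z) = \lambda(z)\,dz$$
for a smooth function $\lambda(z)$. We have
$$f^*(dz) = \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial \bar z}\,d\bar z,$$
$$f^*(d\bar z) = \overline{(f^*)(dz)} = \frac{\partial \bar f}{\partial z}\,dz + \frac{\partial \bar f}{\partial \bar z}\,d\bar z.$$
Thus, we have
$$f^*(dz - \mu(z)\,d\bar z) = \left(\frac{\partial f}{\partial z} - \mu(f(z))\frac{\partial \bar f}{\partial z}\right)dz + \left(\frac{\partial f}{\partial \bar z} - \mu(f(z))\frac{\partial \bar f}{\partial \bar z}\right)d\bar z.$$
The condition that this be a form of type $(1, 0)$ with respect to the standard complex structure then reads
$$\partial f/\partial \bar z = \mu(f(z))\,\overline{\partial f/\partial z}. \tag{5.2.2}$$
(Note that $\partial \bar f/\partial \bar z = \overline{\partial f/\partial z}$.)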
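As a sanity check on (5.2.2), consider the special case $\mu \equiv 0$, in which $dz$ itself is of $J$-type $(1, 0)$, so that $J$ is the standard complex structure. The following sketch writes $f = u + iv$ with $u, v$ real and $z = x + iy$, and uses $\partial/\partial\bar z = \tfrac12(\partial/\partial x + i\,\partial/\partial y)$.

```latex
\begin{align*}
\mu \equiv 0:\qquad
\frac{\partial f}{\partial \bar z}
  = \frac12\left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right)
  + \frac{i}{2}\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right) = 0
\quad\Longleftrightarrow\quad
\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y},\quad
\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.
\end{align*}
```

So in the flat case (5.2.2) reduces to the Cauchy-Riemann equations, and its solutions are exactly the holomorphic functions. For general $\mu$, the equation measures how far $f$ is from being holomorphic with respect to the standard structure.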
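Our goal is then to solve the differential equation (5.2.2). To this end, we will make one more reduction. Applying $\partial/\partial z$ to (5.2.2) and writing
$$g(z) = \frac{\partial \mu}{\partial z}, \quad h(z) = \frac{\partial \mu}{\partial \bar z}, \tag{*}$$
we obtain
$$\begin{aligned}
\partial^2 f/(\partial z\,\partial\bar z) - \mu(f(z))\,\partial^2 \bar f/(\partial z\,\partial\bar z)
&= \overline{\partial f/\partial z}\cdot\big(g(f(z))(\partial f/\partial z) + h(f(z))(\partial \bar f/\partial z)\big) \\
&= \overline{\partial f/\partial z}\cdot\big(g(f(z))(\partial f/\partial z) + h(f(z))\,\overline{\mu(f(z))}\,(\partial f/\partial z)\big) \\
&= \big(g(f(z)) + \overline{\mu(f(z))}\,h(f(z))\big)\cdot|\partial f/\partial z|^2.
\end{aligned}$$
(The second equality uses the equation (5.2.2).) Putting
$$b(z) = g(z) + \overline{\mu(z)}\,h(z), \tag{5.2.3}$$
we therefore have
$$\partial^2 f/(\partial z\,\partial\bar z) - \mu(f(z))\,\partial^2 \bar f/(\partial z\,\partial\bar z) = b(f(z))\,|\partial f/\partial z|^2.$$
The complex conjugate equation is
$$-\overline{\mu(f(z))}\,\partial^2 f/(\partial z\,\partial\bar z) + \partial^2 \bar f/(\partial z\,\partial\bar z) = \overline{b(f(z))}\,|\partial f/\partial z|^2.$$
Putting
$$a(z) = \frac{b(z) + \mu(z)\,\overline{b(z)}}{1 - |\mu(z)|^2}, \tag{5.2.4}$$
this gives the equation
$$\partial^2 f/(\partial z\,\partial\bar z) = a(f(z))\,|\partial f/\partial z|^2. \tag{5.2.5}$$
Our strategy is first to solve the equation (5.2.5), and then show that the solution (with suitable conditions) also satisfies (5.2.2), and hence (5.2.1). Before doing so, however, let us briefly consider what restriction we can place on the function $a(z)$. Note that this function is related to the smooth function $\mu(z)$ by the equations (*), (5.2.3) and (5.2.4). On the function $\mu(z)$ we can certainly impose the relation
$$\mu(0) = 0,$$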
since we are free to choose the differential of $f$ to preserve the complex structure at 0. Further, by substituting $t = \delta z$ for $\delta > 0$ small if necessary, we can make $\mu(z)$ and its first several chosen partial derivatives arbitrarily small in a chosen neighborhood of 0, and further, since we are only interested in a correct solution in a neighborhood of 0, we may assume $\mu(z) = 0$ for $|z| > 1/2$. Using the equations (*), (5.2.3) and (5.2.4), we can translate this to similar conditions on $a(z)$, i.e., for any fixed chosen $\delta > 0$, we can assume
$$a(0) = 0, \quad a(z) = 0 \text{ for } |z| > 1/2, \quad |a(z)|, |\partial a/\partial z|, |\partial a/\partial \bar z| < \delta \text{ for all } z \in \mathbb{C}. \tag{5.2.6}$$
5.3 Theorem. There exists a $\delta > 0$ such that for a smooth function $a(z)$ satisfying (5.2.6), there exists a solution $f(z)$ to the equation (5.2.5) with $\partial f/\partial z$, $\partial f/\partial \bar z$ continuous, $f(0) = 0$,
$$\lim_{z \to \infty} f(z) = \infty, \tag{5.3.1}$$
$(\partial f/\partial z)(0) \neq 0$ and
$$\lim_{z \to \infty} \frac{\partial f}{\partial \bar z} = 0. \tag{5.3.2}$$
Proof. Recall Section 5.2 of Chapter 13. We will find a solution of the form
$$f(z) = z + P_1(\varphi(z)), \quad \varphi \in L^3(\mathbb{C}). \tag{5.3.3}$$
Define
$$(A(\varphi))(z) = a(z + P_1(\varphi)) \cdot |\varphi(z) + 1|^2.$$
Let us consider first the equation
$$\frac{\partial \varphi}{\partial \bar z} = A(\varphi). \tag{5.3.4}$$
In effect, we will solve the equation (5.3.4) in the set $Q_\varepsilon$ of continuous bounded functions $\varphi$ on $\mathbb{C}$ which satisfy
$$|\varphi(z)| \le \frac{\varepsilon}{1 + |z|} \tag{5.3.5}$$
with the metric induced from the metric on the space $C(\mathbb{C})$ of bounded continuous functions on $\mathbb{C}$ (the supremum metric). Note that obviously, $Q_\varepsilon$ is a closed subset of $C(\mathbb{C})$.
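The parameter $\varepsilon > 0$ will be chosen later, but note that (5.3.5) implies
$$Q_\varepsilon \subset L^3(\mathbb{C}).$$
Since
$$|(P_1(\varphi))(z)| \le C_3 K \varepsilon\,|z|^{1/3}$$
where
$$K = \left(\int_{\mathbb{C}} \frac{dx\,dy}{(1 + |z|)^3}\right)^{1/3},$$
choosing $C_3 K \varepsilon < 1/2$ guarantees $|z + P_1(\varphi)| > 1/2$ for $|z| > 1 - \delta$ for some $\delta > 0$, so
$$\mathrm{supp}((A(\varphi))(z)) \subseteq D. \tag{5.3.6}$$
Let us also assume $0 < \varepsilon < 1$. Now by choosing $\delta > 0$ sufficiently small, we may assume $|A(\varphi)| < \varepsilon/8$ and
$$|A(\varphi) - A(\psi)| \le \tfrac12\,|\varphi - \psi| \tag{5.3.7}$$
for $\varphi, \psi \in Q_\varepsilon$. (Again, we are considering the norm in $C(\mathbb{C})$.) Now put
$$\varphi_1 = 0, \quad \varphi_{n+1} = P(A(\varphi_n)).$$
By Lemma 5.3.1 (1) of Chapter 13, we have $\varphi_n \in Q_\varepsilon$, and by (5.3.7), $(\varphi_n)$ is a Cauchy sequence in $Q_\varepsilon$. Put
$$\varphi = \lim_{n \to \infty} \varphi_n.$$
Since $P$ is continuous on $C(\mathbb{C})$, we have
$$\varphi = PA(\varphi), \quad |\varphi(z)| \le \varepsilon/(1 + |z|), \quad |A(\varphi)| < \varepsilon/8. \tag{5.3.8}$$
Now by (5.3.6), $A(\varphi)$ has support in $D$, so by Lemma 5.3.1 (2) of Chapter 13,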
$$|\varphi(z) - \varphi(t)| < K|z - t|^{1/3}$$
for a suitable constant $K$. By Lemma 5.3.1 (2) of Chapter 13 again, there exist constants $L, \kappa > 0$ such that
$$|A\varphi(z) - A\varphi(t)| < L|z - t|^{\kappa},$$
and hence $\varphi$ is continuously differentiable by Lemma 5.2.1 of Chapter 13, and moreover satisfies (5.3.4).
Now consider the function $f(z)$ defined by (5.3.3). First note that by the definition of $P_1$, $f(0) = 0$. The equality (5.3.1) follows from the second estimate (5.3.8) and from Lemma 5.3.1 (2) of Chapter 13. Also,
$$\frac{\partial P_1(\varphi(z))}{\partial z} = \varphi(z) - \varphi(0)$$
by formula (5.2.4) of Lemma 5.2.1 of Chapter 13. Therefore, we have in (5.3.3)
$$\frac{\partial f}{\partial z} = \varphi(z) + 1 - \varphi(0),$$
and $f$ is continuously differentiable on $\mathbb{C}$ by Lemma 5.2.1 of Chapter 13. Therefore, $f$ solves the equation (5.2.5), and $\partial f/\partial z$ is non-zero at the point $z = 0$ because
$$|\varphi(0)| \le \varepsilon.$$
To prove (5.3.2), it suffices to prove that
$$\lim_{z \to \infty} \frac{\partial P_1(\varphi)}{\partial \bar z} = 0. \tag{5.3.9}$$
Because of the second estimate (5.3.8), we can write 1 P1 . .z// D
Z C
1 ./ dsdt C z
Z C
./ dsdt:
The second summand is constant in z, the first one is, by substitution D z, D u C iv, 1
Z C
. C z/ dudv:
Differentiating after the integral sign gives
1
Z @ .z C /=@z C
1 dudv D
Z @ ./=@ D
dsdt : z
(5.3.10)
Note that the integrand on the right-hand side 0 outside D, which lets us restrict the integration from C to D. This also implies that taking derivatives after the integral sign is legal by Theorem 5.2 of Chapter 5. Now the right-hand side of (5.3.10) obviously tends to 0 with z ! 1, which proves (5.3.9). t u 5.4 Proposition. Any solution f .z/ of the equation (5.2.5) which satisfies the conditions of Theorem 5.3 is also a solution of the equation (5.2.2). Proof. Let f be as assumed. Then, recalling (5.2.4), we have @2 f =.@z@z/ .f .z//@2 f =.@z@z/ D .a.f .z// .f .z//a.f .z///j@f =@zj2 D b.f .z//j@f =@zj2 : Using the chain rule, we obtain from (5.2.3) @ @.f .z// .@f =@z .f .z//@f =@z/ C @f =@z @z @z @f @f D @f =@z g.f .z// : C .f .z// h.f .z// @z @z From this, we obtain @ .@f =@z .f .z//@f =@z/ @z D @f =@z h.f .z// @f =@z .f .z// .@f =@z/ : Setting F .z/ D @f =@z .f .z// .@f =@z/; we therefore have @F D A.z/ F .z/ @z where A.z/ D @f =@z h.f .z// is a continuously differentiable function with compact support. Further, we have
$$\lim_{z\to\infty} F(z) = 0$$
(by (5.3.1) and the fact that $\mu$ has compact support). Hence, $F(z) = 0$ for all $z \in \mathbb{C}$ by Theorem 5.3 of Chapter 13, which proves our statement. □

Therefore, we have finished the proof of the following result.

5.5 Theorem. Every oriented smooth surface $\Sigma$ with a Riemann metric has a compatible complex structure. □

Note that in view of the comments of Subsection 5.3 of Chapter 10 and the Riemann Mapping Theorem 1.2 of Chapter 13, this can be equivalently phrased to say that for every surface $\Sigma$ with a Riemann metric, and any point $x \in \Sigma$, any sufficiently small simply connected open neighborhood of $x$ can be mapped conformally bijectively onto $\Omega(0,1)$. In cartography, this theorem is of major significance: together with the Riemann Mapping Theorem, it shows that we can make a flat local chart of any (smooth) landscape in the shape of any simply connected open set in $\mathbb{C}$ (other than $\mathbb{C}$ itself) which preserves surface angles.
6
Exercises
(1) Let $M$ be a smooth manifold with an affine connection and let $U$ be an open subset of $M$. Let $x^i$, $y^i$ be two different coordinate systems on $U$, and let $\Gamma^k_{ij}$ be the Christoffel symbols with respect to the coordinates $x^i$, and $\bar\Gamma^k_{ij}$ the Christoffel symbols with respect to $y^i$. Prove that
$$\bar\Gamma^k_{ij} = \frac{\partial x^p}{\partial y^i}\,\frac{\partial x^q}{\partial y^j}\,\frac{\partial y^k}{\partial x^r}\,\Gamma^r_{pq} + \frac{\partial y^k}{\partial x^m}\,\frac{\partial^2 x^m}{\partial y^i\,\partial y^j}.$$
Note that the second term is the “error term for the symbol ijk behaving as a tensor of type .2; 1/”. (2) Prove that the inverse of a positive-definite symmetric matrix is positivedefinite. [Hint: We have x T Ax > 0 when x ¤ 0, and we want to prove y T .A1 /y > 0 for y ¤ 0. Consider y D Ax.] (3) Let M be a Riemann manifold with Riemann metric g. Define, for x; y 2 M , .x; y/ D inf sg .y/ y
where y is a parametrized continuously differentiable curve with boundary points x; y. Prove that the function is a metric and that the associated topology to is the topology on M which is a part of the definition of a
manifold. [Hint: Use Theorem 4.3.2 of Chapter 14. Keep in mind that one of the things to show is that .x; y/ D 0 implies x D y.] (4) Prove that the conditions (4.2.1) and (2) of 4.2 are equivalent. [Hint: Integrating condition (2) along a curve where r 0 .v/ D r 0 .w/ D 0, u D 0 gives (4.2.1). This also means that (4.2.1) implies (2) at points where ru .v/ D ru .w/ D 0. Fixing local coordinates, the general case then follows by the chain rule.] (5) Volume associated with a Riemann metric: (a) Let g be a Riemann metric defined on a bounded open subest U Rn . Assuming B U is a Borel set, define volg .B/ D
Z q det.gij /: B
Prove that this definition is invariant under diffeomorphism, provided we transform gij as a tensor of type .2; 0/. (b) Let M be a Riemann manifold with coordinate atlas .Up ; hp /p2P and let up be a smooth partition of unity subordinate to Up . Recall that P can be chosen to be countable, since we defined manifolds to have a countable basis. Let B be a Borel subset of M . Prove that we can write B as a disjoint union of Borel sets Bp , p 2 P , such that Bp Up . Put volg .B/ D
X
vol.hp / g .hp ŒBp /:
p
Prove that volg .B/ does not depend on the choices (i.e. the atlas and the set Bp ). (6) Let W ha; bi ! .0; 1/ be a smooth function (taking one-sided derivatives at the boundary points). Consider the smooth map of manifolds W .a; b/ S 1 ! R3 given by .x; e 2 i t / 7! .x; .x/ cos.t/; .x/ sin.t//: Prove that is an embedding of manifolds. Let g be the Riemannian metric on M D Im./ induced from R3 . Find an explicit formula for the volume (=“area”) of M in terms of the function . Find the function which minimize the surface area of M subject to given values .a/; .b/ > 0. You may assume without proof such smooth function exists. [Hint: compare with Exercise (2) of Chapter 14.] (7) (a) Consider the 2-sphere S 2 D f.x; y; z/ 2 R3 j x 2 C y 2 C z2 D 1g
with the Riemann metric induced from R3 . State precisely and prove that geodesics are precisely segments of great circles parametrized by arc length. (b) Generalize this to the n-sphere. (c) Construct a Riemann metric on R2 in which there exists a geodesic with boundary points A, B which does not minimize the distance functional among continuously differentiable curves with boundary points A, B. [Hint: Remove a point from S 2 , and induce a Riemann metric on R2 from the Riemann metric (a) via the radial projection diffeomorphism.] (8) Let M N be a smooth submanifold, and let g be the Riemann metric on M induced by a Riemann metric gQ on N . If we denote by r resp. rQ the Levi-Civita connection of g resp. g, Q prove that .ru .v//x is the g-orthogonal Q projection of rQ u .v/ onto TM x for x 2 M (note that rQ u .v/ is only defined in the sense of 2.2). Use this to compute the curvature tensor of S 2 with the Riemann metric induced from R3 . Conclude that no non-empty open set of S 2 is isometric to an open set of R2 (with the respective Riemann metrics). This fact was first rigorously proved by Gauss. (9) Prove that every 1-dimensional manifold is diffeomorphic either to S 1 or to R. [Hint: Use Lemma 4.1.1 and parametrization by arc length.] (10) Consider the ball S in R3 given by the equation x 2 C y 2 C .z 1/2 D 1: Identifying the xy-plane with C by z D x C iy; define a map from S X f.0; 0; 2/g to C by mapping a point P on S with the point Q in the xy-plane such that P , Q and .0; 0; 2/ lie on a straight line. This is called the stereographic projection. If we take on S the induced Riemann metric from R3 , and the standard complex structure on C, prove that the stereographical projection gives a coordinate system of a compatible complex structure on S (or, equivalently, a conformal map). [Hint: This can be done using basic trigonometry. A particularly elegant solution can be obtained by comparing the isometries of S with M¨obius transformations on C [ f1g.]
Banach and Hilbert Spaces: Elements of Functional Analysis
16
Let us now turn to infinite-dimensional geometry. The simplest such structure is probably that of a Hilbert space. It is highly relevant for analysis, and plays a key role in such areas as stochastic analysis and quantum physics. In this chapter we will discuss the basics of this concept; in the next one we will present some of its uses. In the process we will also introduce the more general Banach spaces. Some facts about Hilbert spaces readily generalize to Banach ones, but deeper theorems in this much broader area require separate methods. These methods comprise a vast area of mathematics called functional analysis. For good texts on this subject we can recommend, e.g., [17, 19]. In this chapter we will be able to present some of the simpler highlights of functional analysis, in particular the Hahn-Banach Theorem and some of its consequences, and the duality of Lp spaces.
1
Banach and Hilbert spaces
1.1 In this chapter we will work with vector spaces over the field $\mathbb{R}$ of real numbers and the field $\mathbb{C}$ of complex numbers (see Appendix A). Since the case of $\mathbb{C}$ is perhaps less familiar, we will emphasize it, especially in the theory of Hilbert spaces. All we say for $\mathbb{C}$ there remains true essentially verbatim over the field $\mathbb{R}$ as well, and the reader is encouraged to consider what changes are appropriate in the real case (mostly, complex conjugation disappears). In the case of Banach spaces, the cases of $\mathbb{R}$ and $\mathbb{C}$ are sometimes really different. In those cases, we will spell out both alternatives in detail. Now recall the notion of an inner product from 4.2 of Appendix A and its associated norm (and hence metric) from 1.2.3 of Chapter 2. Recall also the general notion of a norm as introduced in 1.2 of Chapter 2.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 16, © Springer Basel 2013
If a normed vector space is complete (in the sense of Section 7 of Chapter 2) we speak of a Banach space. If, moreover, the norm has been obtained from an inner product as in 1.2.3 of Chapter 2, we speak of a Hilbert space. By an isomorphism of Banach spaces, we mean a vector space isomorphism which is also a homeomorphism. An isometric isomorphism (briefly isometry) is an isomorphism of vector spaces which preserves the norm. Note that an isometric isomorphism of Banach spaces which are Hilbert also necessarily preserves the inner product (Exercise (2)).

Examples: In particular, $\mathbb{R}^n$ (resp. $\mathbb{C}^n$) equipped with the standard Pythagorean metric is an example of a real (resp. complex) Hilbert space, and more generally, each of the norms $\|v\|_p$ makes $\mathbb{R}^n$, $\mathbb{C}^n$ into a Banach space (Exercise (20) of Chapter 5). More interestingly, let $B \subseteq \mathbb{R}^n$ be a Borel subset. Recall the spaces $L^p(B)$, $L^p(B,\mathbb{C})$ of Section 8 of Chapter 5. In the present terminology, Theorem 8.5.2 of Chapter 5 says that $L^p(B)$ and $L^p(B,\mathbb{C})$, $1 \le p \le \infty$, are real resp. complex Banach spaces. In fact, on $L^2(B)$, $L^2(B,\mathbb{C})$ we have a real (resp. complex) inner product defined by
$$f \cdot g = \int_B f\,\bar g$$
which is finite by the Cauchy-Schwarz inequality applied at every point. Since the norm on $L^2$ is the norm corresponding to this inner product, the spaces $L^2(B)$ and $L^2(B,\mathbb{C})$ are real and complex Hilbert spaces. The spaces $L^p(B)$, $L^p(B,\mathbb{C})$ are, in some sense, the most fundamental examples.

1.2 Theorem. A norm is a uniformly continuous map $V \to \mathbb{R}$.

Proof. We have $\|x\| = \|y + (x-y)\| \le \|y\| + \|x-y\|$ and similarly with the roles of $x$ and $y$ reversed, so
$$\bigl|\,\|x\| - \|y\|\,\bigr| \le \|x-y\|.$$
□

1.3 An important convention

A subspace of a Banach resp. Hilbert space is a subset that is a Banach resp. Hilbert space in the inherited structure. In particular, it is required to be complete. Thus, by Proposition 7.3.1 of Chapter 2, subspaces of a Banach resp. Hilbert space are precisely closed linear (vector) subspaces.
2
Uniformly convex Banach spaces
2.1 A normed linear space $V$ is said to be uniformly convex if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for all $x, y \in V$ we have the implication
$$\Bigl(\|x\| = \|y\| = 1 \ \text{and}\ \Bigl\|\frac{x+y}{2}\Bigr\| > 1 - \delta\Bigr) \;\Rightarrow\; \|x - y\| < \varepsilon.$$
To avoid $\delta$-$\varepsilon$ arguments, it is sometimes convenient to rephrase this as the following obviously equivalent statement: For sequences of elements $(x_n)$, $(y_n)$ in $V$,
$$\Bigl(\|x_n\| = \|y_n\| = 1 \ \text{and}\ \Bigl\|\frac{x_n+y_n}{2}\Bigr\| \to 1\Bigr) \;\Rightarrow\; \|x_n - y_n\| \to 0$$
(here the symbol $\to$ indicates the limit with $n \to \infty$).

Explanation. This condition expresses the intuitive notion of convexity of the (unit) ball in the space as a sort of "bulging". If you take for instance the norm from Example 1.2.2(a) of Chapter 2, the unit ball is a cube; it does not really bulge: two elements $x, y$ on any of its faces may be far from each other while the distance of the mean point $\frac{x+y}{2}$ from the center is still 1. In Example 1.2.2(c) of Chapter 2, on the other hand, if we move $x$, $y$ on the border from each other, the point $\frac{x+y}{2}$ moves away from the border. Draw a picture.

2.2 Theorem. A Hilbert space is uniformly convex.

Proof. Choose an $\varepsilon > 0$ and set $\delta = 1 - \sqrt{1 - \frac{\varepsilon^2}{4}}$. If $\|x\| = \|y\| = 1$ and $\bigl\|\frac{x+y}{2}\bigr\| > 1 - \delta = \sqrt{1 - \frac{\varepsilon^2}{4}}$, then we have
$$1 - \frac{\varepsilon^2}{4} < \frac14 (x+y)(x+y) = \frac14 (1 + yx + xy + 1) = \frac14 (2 + yx + xy)$$
and consequently $xy + yx > 2 - \varepsilon^2$, so
$$\|x-y\|^2 = (x-y)(x-y) = \|x\|^2 + \|y\|^2 - xy - yx = 2 - (xy + yx) < \varepsilon^2.$$
□
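The choice of $\delta$ in the proof of Theorem 2.2 can be tested numerically. The following Python sketch is only an illustration under stated assumptions: it works in $\mathbb{R}^2$, samples finitely many unit vectors, and all function names (`delta`, `check_euclidean`, `norm_max`, ...) are hypothetical helpers that do not come from the text. It also reproduces the cube example from the Explanation in 2.1, where the maximum norm fails to "bulge".

```python
import math

def delta(eps):
    """The delta chosen in the proof of Theorem 2.2: 1 - sqrt(1 - eps^2/4)."""
    return 1.0 - math.sqrt(1.0 - eps * eps / 4.0)

def norm2(v):
    """Euclidean (Hilbert space) norm on R^2."""
    return math.hypot(v[0], v[1])

def norm_max(v):
    """Maximum norm on R^2; its unit ball is a square."""
    return max(abs(v[0]), abs(v[1]))

def midpoint(x, y):
    return ((x[0] + y[0]) / 2.0, (x[1] + y[1]) / 2.0)

def diff(x, y):
    return (x[0] - y[0], x[1] - y[1])

def check_euclidean(eps, samples=360):
    """True if, for all sampled Euclidean unit vectors x, y,
    ||(x+y)/2|| > 1 - delta(eps) forces ||x - y|| < eps."""
    d = delta(eps)
    pts = [(math.cos(2 * math.pi * k / samples),
            math.sin(2 * math.pi * k / samples)) for k in range(samples)]
    for x in pts:
        for y in pts:
            if norm2(midpoint(x, y)) > 1.0 - d and not norm2(diff(x, y)) < eps:
                return False
    return True

# Two points on one face of the max-norm unit ball: the midpoint still has
# norm 1 although the points are at distance 2 from each other.
cube_x, cube_y = (1.0, 1.0), (1.0, -1.0)
```

A sampled check of this kind is of course no proof; it only shows the implication of 2.1 holding on a grid for the Euclidean norm and failing for the maximum norm.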
2.3 Lemma. Let $y_n, z_n$ be elements of a uniformly convex Banach space such that
$$\lim \|y_n\| = \lim \|z_n\| = \lim \Bigl\|\frac{y_n+z_n}{2}\Bigr\| = 1.$$
Then $\lim \|y_n - z_n\| = 0$.

Proof. First, we obviously have
$$\lim \Bigl\|\frac{z_n}{\|z_n\|} - \frac{z_n}{\|y_n\|}\Bigr\| = \lim \frac{1}{\|z_n\|}\,\Bigl\|\Bigl(1 - \frac{\|z_n\|}{\|y_n\|}\Bigr) z_n\Bigr\| = 0. \qquad (2.3.1)$$
Since the norm is a continuous function, it follows from (2.3.1) and the assumptions that
$$\lim \Bigl\|\frac12\Bigl(\frac{z_n}{\|z_n\|} + \frac{y_n}{\|y_n\|}\Bigr)\Bigr\| = \lim \Bigl\|\frac{1}{\|y_n\|}\cdot\frac{z_n+y_n}{2} + \frac12\Bigl(\frac{z_n}{\|z_n\|} - \frac{z_n}{\|y_n\|}\Bigr)\Bigr\| = 1$$
and hence we obtain, by the uniform convexity, that
$$\lim \Bigl\|\frac{y_n}{\|y_n\|} - \frac{z_n}{\|z_n\|}\Bigr\| = 0$$
and we conclude, using (2.3.1) again, that
$$\lim \|y_n - z_n\| = \lim \|y_n\|\,\Bigl\|\frac{y_n}{\|y_n\|} - \frac{z_n}{\|z_n\|} + \frac{z_n}{\|z_n\|} - \frac{z_n}{\|y_n\|}\Bigr\| = 0.$$
□
2.4 Theorem. Let $K$ be a closed convex subset of a uniformly convex Banach space $B$ and let $a \in B$. Then there exists precisely one element $y \in K$ such that
$$\|y - a\| = \inf\{\|x - a\| \mid x \in K\}.$$

Proof. The maps $x \mapsto x - a$ and $x \mapsto \alpha x$ (with $\alpha \ne 0$) are obviously homeomorphisms preserving convexity. Thus, except for the trivial case of $a \in K$, we can assume that
$$a = o \quad\text{and}\quad \inf\{\|x\| \mid x \in K\} = 1.$$
Then there exists a sequence $(x_n)$, $n = 1, 2, \dots$, of elements of $K$ such that $\lim \|x_n\| = 1$. Since $K$ is convex we have
$$1 \le \Bigl\|\frac{x_n + x_m}{2}\Bigr\| \le \frac12\bigl(\|x_n\| + \|x_m\|\bigr). \qquad (2.4.1)$$
Suppose that the sequence $(x_n)_n$ is not Cauchy. Then there exist subsequences $(y_n)_n$ and $(z_n)_n$ such that for some $\varepsilon_0 > 0$ and all $n$,
$$\|y_n - z_n\| \ge \varepsilon_0.$$
However, we have $\lim \|y_n\| = \lim \|z_n\| = 1$ and by (2.4.1) also $\lim \bigl\|\frac{y_n+z_n}{2}\bigr\| = 1$ and hence by Lemma 2.3, $\lim \|y_n - z_n\| = 0$, a contradiction. Thus, $(x_n)_n$ is a Cauchy sequence and if we set $y = \lim x_n$ we have $y \in K$ (since $K$ is closed) and $\|y\| = 1$. If we had $\|z\| = 1$ for another $z \in K$ we would have, according to the same reasoning as above, a Cauchy sequence $y, z, y, z, \dots, y, z, \dots$, which is possible only if $y = z$. □
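To see why uniform convexity matters in Theorem 2.4, it may help to compare two norms on $\mathbb{R}^2$. The sketch below is an illustration only, with hypothetical names (`minimizers`, `K`, `a`) and a finite sample of the segment $K = \{(t,0) \mid -1 \le t \le 1\}$. It looks for the sample points of $K$ nearest to $a = (0,1)$ in the Euclidean norm and in the maximum norm, the latter being the non-bulging example from 2.1.

```python
import math

def dist_euclid(u, v):
    """Distance in the Euclidean norm (a Hilbert space norm)."""
    return math.hypot(u[0] - v[0], u[1] - v[1])

def dist_max(u, v):
    """Distance in the maximum norm (not uniformly convex)."""
    return max(abs(u[0] - v[0]), abs(u[1] - v[1]))

def minimizers(dist, a, points):
    """All sample points at minimal distance from a."""
    best = min(dist(x, a) for x in points)
    return [x for x in points if dist(x, a) == best]

# Finite sample of the closed convex set K = {(t, 0) : -1 <= t <= 1}.
K = [(k / 10.0, 0.0) for k in range(-10, 11)]
a = (0.0, 1.0)

nearest_euclid = minimizers(dist_euclid, a, K)  # a single point
nearest_max = minimizers(dist_max, a, K)        # every sample point of K
```

In the Euclidean norm the nearest point is unique, as the theorem predicts. In the maximum norm every point of the segment is at distance 1 from $a$, so existence holds but uniqueness fails.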
3
Orthogonal complements and continuous linear forms
3.1 Similarly as in 4.6 of Appendix A, we define for a subspace $M$ of a Hilbert space $H$
$$M^\perp = \{x \mid xy = 0 \text{ for all } y \in M\}.$$
Note that from the property $xx = 0 \Rightarrow x = o$ of the scalar product it follows that
$$M \cap M^\perp = \{o\}.$$
Also note that $M^\perp$ is a Hilbert subspace: it is obviously a vector subspace of $H$, and it is closed since the mapping $((x,y) \mapsto xy): H \times H \to \mathbb{C}$ resp. $\mathbb{R}$ is continuous (indeed, we have $|xy - x'y'| = |xy - xy' + xy' - x'y'| \le |xy - xy'| + |xy' - x'y'| \le \|x\|\,\|y - y'\| + \|y'\|\,\|x - x'\|$).

3.2 Theorem. Let $M$ be a (Hilbert) subspace of a Hilbert space $H$. Then each $x \in H$ can be uniquely written as
$$x = y + z \quad\text{with}\quad y \in M \text{ and } z \in M^\perp.$$

Proof. Using 2.4 (which applies since $H$ is uniformly convex by 2.2 and $M$ is closed and convex), consider the element $y \in M$ for which $\|x - y\| = \min\{\|x - u\| \mid u \in M\}$ and put $z = x - y$. For a general non-zero $u \in M$ we have
$$\Bigl\|z - \frac{zu}{uu}\,u\Bigr\| \ge \|x - y\| = \|z\|,$$
and hence
$$\|z\|^2 \le \|z\|^2 - \frac{uz}{uu}\,zu - \frac{zu}{uu}\,uz + \frac{zu}{uu}\,\frac{uz}{uu}\,uu = \|z\|^2 - \frac{(zu)(uz)}{uu},$$
hence
$$|zu|^2 = (zu)\overline{(zu)} = (zu)(uz) \le 0,$$
so $zu = 0$, and finally $z \in M^\perp$.

If we have $x = z + y = z' + y'$ with $y, y' \in M$ and $z, z' \in M^\perp$ then $y - y' = z' - z$ and these differences are in $M \cap M^\perp = \{o\}$. □

3.3 Theorem. $(M^\perp)^\perp = M$ for all subspaces $M$.

Proof. Obviously $M \subseteq (M^\perp)^\perp$. Now let $x \in (M^\perp)^\perp$. Using 3.2, write $x = y + z$ with $y \in M$ and $z \in M^\perp$. Then
$$zz = zx - zy = 0 - 0 = 0,$$
and hence $z = o$ and $x = y \in M$. □
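The decomposition of Theorem 3.2 can be made concrete in the simplest case where $M$ is the line spanned by a single non-zero vector $u$ in $\mathbb{C}^n$. The Python sketch below is an illustration under that assumption; the helper names `inner`, `project_onto_line` and `decompose` are hypothetical. It uses the coefficient $\frac{xu}{uu}$ that also appears in the proof, with the inner product linear in the first and conjugate-linear in the second argument.

```python
def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def project_onto_line(x, u):
    """Orthogonal projection of x onto M = span{u}, for non-zero u."""
    coeff = inner(x, u) / inner(u, u)
    return [coeff * c for c in u]

def decompose(x, u):
    """Return (y, z) with x = y + z, y in span{u} and z orthogonal to u."""
    y = project_onto_line(x, u)
    z = [a - b for a, b in zip(x, y)]
    return y, z

# Illustrative data (not from the text).
x = [1 + 2j, 3, -1j]
u = [1, 1j, 0]
y, z = decompose(x, u)
```

For these vectors one gets $y = \bigl(\tfrac12 - \tfrac12 i\bigr)u$, and $z$ is orthogonal to $u$, so that $\|x\|^2 = \|y\|^2 + \|z\|^2$ (here $15 = 1 + 14$).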
3.4 By Theorem 3.3, the mapping $M \mapsto M^\perp$ is a bijection of the set of all subspaces of $H$ onto itself. Since it obviously reverses order by inclusion (i.e. $M_1 \subseteq M_2 \Rightarrow M_2^\perp \subseteq M_1^\perp$), we have
$$(M \cap N)^\perp = M^\perp + N^\perp \quad\text{and}\quad (M + N)^\perp = M^\perp \cap N^\perp$$
where $M + N$ is the smallest subspace containing both $M$ and $N$.

3.5 Theorem. Let $V, V'$ be normed vector spaces (real or complex). Then the following statements for a linear operator $f: V \to V'$ are equivalent.
(1) $f$ is continuous.
(2) $f$ is uniformly continuous.
(3) There exists a number $K$ such that $\|x\| \le 1 \Rightarrow \|f(x)\| \le K$.

Because of condition (3) of the theorem, continuous linear operators between normed linear spaces are also referred to as bounded.

Proof. (2)⇒(1) is trivial.
(1)⇒(3): Suppose the implication does not hold. Then there exist $x_k \in V$ such that $\|x_k\| \le 1$ and $\|f(x_k)\| \ge k$. Put $y_k = \frac1k x_k$. Then $\lim y_k = o$ while $\|f(y_k)\| \ge 1$ and hence $f(y_k)$ cannot converge to $o = f(o)$.
(3)⇒(2): Suppose such a $K > 0$ exists. For $\varepsilon > 0$, put $\delta = \frac{1}{K}\varepsilon$. Now if $\|x - y\| < \delta$, then $\bigl\|\frac{K}{\varepsilon}(x - y)\bigr\| \le 1$, and hence
$$\|f(x) - f(y)\| = \|f(x - y)\| = \frac{\varepsilon}{K}\,\Bigl\|f\Bigl(\frac{K}{\varepsilon}(x - y)\Bigr)\Bigr\| \le \frac{\varepsilon}{K}\,K = \varepsilon.$$
□
3.5.1 This leads to a concept of a norm of a continuous linear map $f: V \to V'$ between normed vector spaces defined by
$$\|f\| = \sup\{\|f(x)\| \mid \|x\| \le 1\}.$$
It is an easy exercise to show that it is indeed a norm on the vector space $L(V, V')$ of all continuous linear maps $f: V \to V'$ (with the natural addition and multiplication by scalars).

3.5.2 A linear form on a real or complex normed vector space $V$ is a continuous linear mapping $V \to \mathbb{R}$ resp. $V \to \mathbb{C}$. Similarly as in 1.1 of Chapter 11, we will denote by $V^*$ the space of all linear forms on $V$. This is called the dual space of the normed vector space $V$. Note, however, that, unlike in 1.1 of Chapter 11, we now take the continuous linear forms only. The definition from 3.5.1 yields a norm on $V^*$ defined by
$$\|\varphi\| = \sup\{|\varphi(x)| \mid \|x\| \le 1\}.$$

3.5.3 Similarly as in 1.2 of Chapter 11, we have for a continuous linear mapping $f: V \to V'$ a linear mapping $f^*: (V')^* \to V^*$ defined by
$$f^*(\varphi) = \varphi \circ f$$
(if $f, \varphi$ are continuous then the composition $\varphi \circ f$ is continuous as well). We will show that $f^*$ is continuous. This is an immediate consequence of the following

3.6 Lemma. We have $\|f^*(\varphi)\| \le \|f\|\,\|\varphi\|$.

Proof. We have $\|f^*(\varphi)\| = \|\varphi \circ f\| = \sup\{|\varphi(f(x))| \mid \|x\| \le 1\}$. If $f = 0$ there is nothing to prove. Otherwise, if $\|x\| \le 1$ then $\|f(x)\| \le \|f\|$. Thus, $\bigl\|\frac{1}{\|f\|} f(x)\bigr\| \le 1$ and
$$\frac{1}{\|f\|}\,|\varphi(f(x))| = \Bigl|\varphi\Bigl(\frac{1}{\|f\|} f(x)\Bigr)\Bigr| \le \|\varphi\|.$$
□

3.6.1 Theorem. (The Riesz Representation Theorem) Let $H$ be a Hilbert space. Then
- for every $a \in H$, the mapping $(x \mapsto xa): H \to \mathbb{C}$ is a linear form, and
- on the other hand every linear form $\varphi: H \to \mathbb{C}$ is given by the formula $(x \mapsto xa)$ for a uniquely determined $a \in H$.
Proof. The first statement is obvious. Now let $\varphi: H \to \mathbb{C}$ be a continuous linear mapping. If it is constant (and hence, zero everywhere) we can set $\varphi(x) = xo$. Otherwise
$$M = \{x \mid \varphi(x) = 0\}$$
is a subspace unequal to $H$ and hence $M^\perp \ne \{o\}$, by 3.2. First we will show that $\dim M^\perp = 1$. Indeed, let $o \ne x, y \in M^\perp$ and consider $u = \varphi(y)x - \varphi(x)y$. Then
$$\varphi(u) = \varphi(y)\varphi(x) - \varphi(x)\varphi(y) = 0$$
and hence $u \in M \cap M^\perp = \{o\}$. Thus, $\varphi(y)x - \varphi(x)y = o$ and since $x, y$ are non-zero in $M^\perp$, $\varphi(x), \varphi(y)$ are non-zero and $x, y$ are linearly dependent. Thus, we have
$$M^\perp = \{\alpha b \mid \alpha \in \mathbb{C}\}$$
for some $b \ne o$. Now by 3.2, a general $x \in H$ can be written as
$$x = x_M + \alpha(x)\,b \quad\text{with}\quad x_M \in M.$$
Hence we have
$$\varphi(x) = \alpha(x)\varphi(b) \quad\text{and}\quad xb = \alpha(x)(bb).$$
Comparing these two equations we obtain
$$\varphi(x) = xa \quad\text{where}\quad a = \frac{\overline{\varphi(b)}}{bb}\,b.$$
The uniqueness is obvious (if $a \ne b$ then $0 \ne (a - b)(a - b)$ and hence $xa \ne xb$ for $x = a - b$). □

3.7 Lemma. Let $\varphi: H \to \mathbb{C}$ be given by $\varphi(x) = xa$. Then
$$\|\varphi\| = \sup\{|\varphi(x)| \mid \|x\| \le 1\} = \|a\|.$$

Proof. If $\|x\| \le 1$ then $|\varphi(x)| = |xa| \le \|x\|\,\|a\| \le \|a\|$. On the other hand, for $a \ne o$ we have $\varphi\bigl(\frac{1}{\|a\|}a\bigr) = \frac{1}{\|a\|}\,aa = \|a\|$. □
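In the finite-dimensional Hilbert space $\mathbb{C}^n$ the representing vector of Theorem 3.6.1 can be read off from the values of $\varphi$ on the standard basis: with the convention $xa = \sum x_i \bar a_i$ one needs $a_i = \overline{\varphi(e_i)}$. The following Python sketch illustrates this and the norm formula of Lemma 3.7. The linear form `phi` and the helper names are hypothetical choices made for the example, not taken from the text.

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def riesz_vector(phi, n):
    """Vector a in C^n with phi(x) = inner(x, a); a_i = conj(phi(e_i))."""
    a = []
    for i in range(n):
        e = [0j] * n
        e[i] = 1 + 0j          # i-th standard basis vector
        a.append(complex(phi(e)).conjugate())
    return a

def phi(x):
    """An illustrative linear form on C^2 (assumed for this example)."""
    return (2 - 1j) * x[0] + 3j * x[1]

a = riesz_vector(phi, 2)
norm_a = math.sqrt(inner(a, a).real)
unit_a = [c / norm_a for c in a]   # the unit vector on which |phi| is largest
```

Here `a` is $(2+i,\,-3i)$ and `phi(unit_a)` equals $\|a\| = \sqrt{14}$, matching $\|\varphi\| = \|a\|$.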
3.7.1 A map $f$ between vector spaces over $\mathbb{C}$ is said to be antilinear if it preserves addition and sends $\alpha z$ to $\bar\alpha f(z)$.

Theorem. The correspondence $\theta = \theta_H: H \to H^*$ defined by $\theta(a)(x) = xa$ is bijective, antilinear and preserves norms.

Proof. $\theta$ is one-one onto by Theorem 3.6.1, and it preserves norms by Lemma 3.7. We have $\theta(a + b)(x) = x(a + b) = xa + xb = \theta(a)(x) + \theta(b)(x)$, and $\theta(\alpha z)(x) = x(\alpha z) = \bar\alpha(xz)$. □

3.7.2 Remark. Note that in the case of Hilbert spaces over $\mathbb{R}$, the mappings $\theta_H$ are norm preserving isomorphisms.
3.8 Let $f: H \to H'$ be a continuous linear mapping. By 3.5.3 we have a continuous linear mapping $f^*: (H')^* \to H^*$ dual to $f$. On the other hand, in view of 3.7, we have a continuous linear mapping associated with $f$ going in the same direction, namely the $g$ from the commutative diagram
$$\begin{array}{ccc}
H & \xrightarrow{\ \theta_H\ } & H^* \\
{\scriptstyle f}\,\big\downarrow & & \big\downarrow\,{\scriptstyle g = \theta_{H'}\, f\, \theta_H^{-1}} \\
H' & \xrightarrow{\ \theta_{H'}\ } & (H')^*.
\end{array}$$
This calls for a closer analysis. For a continuous linear mapping $f$ and a fixed $y \in H'$ we have the linear form, obviously continuous,
$$h = (x \mapsto f(x) \cdot y).$$
By Theorem 3.6.1, there is, hence, a $z \in H$ such that
$$h = (x \mapsto x \cdot z).$$
Setting $z = f^{Ad}(y)$ we obtain a mapping $f^{Ad}: H' \to H$ satisfying the formula
$$\forall x, y: \quad f(x) \cdot y = x \cdot f^{Ad}(y).$$
This mapping $f^{Ad}$ is referred to as the mapping adjoint to $f$. We will show that the mapping $g$ from the diagram above is equal to $(f^{Ad})^*$. Indeed, we have
$$(f^{Ad})^*(\theta_H(a))(x) = (\theta_H(a) \circ f^{Ad})(x) = \theta_H(a)(f^{Ad}(x)) = f^{Ad}(x) \cdot a = \overline{a \cdot f^{Ad}(x)} = \overline{f(a) \cdot x} = x \cdot f(a) = \theta_{H'}(f(a))(x).$$
3.9 A continuous linear mapping $f: H \to H$ is said to be Hermitian if it is adjoint to itself, that is, if $f = f^{Ad}$, explicitly
$$f(x) \cdot y = x \cdot f(y) \quad\text{for all } x, y \in H.$$

Remark. Hermitian mappings (one also speaks of Hermitian operators) play an important role in theoretical physics. It is a useful exercise to show that Hermitian operators $\mathbb{C}^n \to \mathbb{C}^n$ are associated with matrices $A$ such that
$$A = A^* = \bar A^T$$
(we have in mind the complex case with $xy = \sum x_i \bar y_i$ and the complex conjugate matrix defined by $\overline{(a_{ij})}_{ij} = (\bar a_{ij})_{ij}$; $A^T$ is, as usual, the transposed matrix). Recall from 7.2 of Appendix A that $A^*$ is sometimes called the adjoint matrix.

3.9.1 Recall that the eigenvalues of a linear operator $f: H \to H$ are numbers $\lambda$ such that $f(u) = \lambda u$ for a non-zero $u$, and that the $u$'s satisfying such equations are called eigenvectors (compare 5.1 of Appendix B). We have

Theorem. 1. All the eigenvalues of a Hermitian operator $f$ are real.
2. Two eigenvectors associated with different eigenvalues are orthogonal.

Proof. 1. Let $f(u) = \lambda u$ and $u \ne o$. Then we have
$$\lambda(u \cdot u) = (\lambda u) \cdot u = f(u) \cdot u = u \cdot f(u) = u \cdot (\lambda u) = \bar\lambda(u \cdot u).$$
2. Let $f(u) = \alpha u$ and $f(v) = \beta v$, $\alpha \ne \beta$. Since $\beta$ is real by 1, we have
$$(\alpha - \beta)uv = \alpha(uv) - \beta(uv) = (\alpha u)v - u(\beta v) = f(u)v - uf(v) = 0.$$
□

4
Infinite sums in a Hilbert space and Hilbert bases
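4.1 We say that a system $(x_j)_{j \in J}$ of elements of a Hilbert space has a sum $x$ and write
$$x = \sum_J x_j$$
if for every $\varepsilon > 0$ there exists a finite $J(\varepsilon) \subseteq J$ such that for every finite $K$ such that $J(\varepsilon) \subseteq K \subseteq J$ we have
$$\Bigl\|x - \sum_{j \in K} x_j\Bigr\| < \varepsilon.$$

Observation. If a sum of $(x_j)_{j \in J}$ exists then it is uniquely determined. (Indeed, let the statement above hold for $x$ and $y$. Then, for a finite $K$ containing both of the finite sets provided by the definition for $x$ and for $y$,
$$\|x - y\| \le \Bigl\|x - \sum_{j \in K} x_j\Bigr\| + \Bigl\|y - \sum_{j \in K} x_j\Bigr\| < 2\varepsilon.)$$

4.2 Theorem. $(x_j)_{j \in J}$ has a sum if and only if for every $\varepsilon > 0$ there exists a finite subset $K(\varepsilon) \subseteq J$ such that for each finite subset $K \subseteq J$ satisfying $K \cap K(\varepsilon) = \emptyset$ one has $\bigl\|\sum_K x_j\bigr\| < \varepsilon$.

Proof. ⇒: Consider an $\varepsilon > 0$ and put $K(\varepsilon) = J(\frac{\varepsilon}{2})$. Let $K$ be finite and such that $K \cap K(\varepsilon) = \emptyset$. Then we have
$$\Bigl\|\sum_K x_j\Bigr\| = \Bigl\|\sum_{K \cup K(\varepsilon)} x_j - \sum_{K(\varepsilon)} x_j\Bigr\| \le \Bigl\|\sum_{K \cup K(\varepsilon)} x_j - x\Bigr\| + \Bigl\|\sum_{K(\varepsilon)} x_j - x\Bigr\| < \varepsilon.$$
⇐: Set $K_n = K(1) \cup K(\frac12) \cup \dots \cup K(\frac1n)$ and $y_n = \sum_{j \in K_n} x_j$. From the assumption we easily see that $(y_n)_n$ is a Cauchy sequence and hence it has a limit $x = \lim y_n$. We will show that $x = \sum_J x_j$.
Choose an $\varepsilon > 0$ and an $n$ such that $\|x - y_n\| < \frac{\varepsilon}{2}$ and at the same time $\frac1n < \frac{\varepsilon}{2}$. Take a finite $K \supseteq K_n$ and set $L = K \setminus K_n$. Then
$$\Bigl\|x - \sum_K x_j\Bigr\| = \Bigl\|x - y_n - \sum_L x_j\Bigr\| \le \|x - y_n\| + \Bigl\|\sum_L x_j\Bigr\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
(The last inequality uses $L \cap K_n = \emptyset$.) □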
4.3 Theorem. A system $(x_j)_{j \in J}$ has a sum $x$ if and only if either $J$ is finite and $\sum_{j \in J} x_j = x$ in the ordinary sense, or the following conditions hold simultaneously:
(a) for at most countably many $j$, $x_j \ne o$,
(b) whenever we order the $x_j \ne o$ in a sequence $x_1, x_2, \dots$ we have
$$\lim_n \sum_{k=1}^n x_k = x$$
with the same result $x$.

Proof. We will use the same notation as above.
⇒: The set $L = \bigcup_{n=1}^\infty K(\frac1n)$ is countable and if $j \notin L$ then $\|x_j\| < \frac1n$ for all $n$, and hence $x_j = o$. Thus, without loss of generality,
$$J = \{1, 2, \dots, n, \dots\}.$$
For $\varepsilon > 0$ choose $n_\varepsilon$ such that $J(\varepsilon) \subseteq \{1, 2, \dots, n_\varepsilon\}$. Then for $n \ge n_\varepsilon$, we obviously have
$$\Bigl\|x - \sum_{k=1}^n x_k\Bigr\| < \varepsilon.$$
⇐: Suppose the sum $\sum_J x_j$ does not exist. Choose a fixed order $x_1, x_2, \dots$. Then the limit $x = \lim_n \sum_{k=1}^n x_k$ either does not exist or it does but it is not $\sum_J x_j$. In the latter case, by the definition, there exists an $a > 0$ such that
$$\forall \text{ finite } L \subseteq J \ \ \exists \text{ finite } K(L) \text{ such that } L \subseteq K(L) \subseteq J \text{ and } \Bigl\|\sum_{K(L)} x_j - x\Bigr\| \ge a.$$
Put
$$A_1 = \{1\},\quad B_1 = K(A_1),\quad A_2 = \{1, 2, \dots, \max B_1 + 1\},\quad B_2 = K(A_2)$$
and further, assuming $A_1, \dots, A_n$, $B_1, \dots, B_n$ are already determined, put
$$A_{n+1} = \{1, 2, \dots, \max B_n + 1\},\quad B_{n+1} = K(A_{n+1}).$$
Now $A_1 \subseteq B_1 \subsetneq A_2 \subseteq B_2 \subsetneq A_3 \subseteq \cdots$ and
$$\lim_n \Bigl\|\sum_{A_n} x_j - x\Bigr\| = 0 \quad\text{while}\quad \Bigl\|\sum_{B_n} x_j - x\Bigr\| \ge a. \qquad (4.3.1)$$
If we rearrange the sequence $x_1, x_2, \dots$ into a sequence $y_1, y_2, \dots$ by taking successively all $x_j$'s from the blocks
$$A_1,\ B_1 \setminus A_1,\ A_2 \setminus B_1,\ \dots,\ A_n \setminus B_{n-1},\ B_n \setminus A_n,\ A_{n+1} \setminus B_n,\ \dots$$
(the $x_j$ in the individual blocks ordered arbitrarily), we see that in view of (4.3.1), $\lim_n \sum_{k=1}^n y_k$ does not exist. □
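Condition (b) of Theorem 4.3 is a genuine restriction even in the one-dimensional Hilbert space $\mathbb{R}$. As an illustration (the example and the function names below are not from the text), take $x_k = (-1)^{k+1}/k$. In the natural order the partial sums tend to $\ln 2$, while the classical rearrangement "two positive terms, then one negative term" tends to $\frac32 \ln 2$. Hence this system has no sum in the sense of 4.1. The Python sketch computes both partial sums.

```python
import math

def standard_partial_sum(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in the natural order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged_partial_sum(n_blocks):
    """Same terms, reordered in blocks: two positive terms, one negative,
    i.e. 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ..."""
    total = 0.0
    for m in range(1, n_blocks + 1):
        total += 1.0 / (4 * m - 3) + 1.0 / (4 * m - 1) - 1.0 / (2 * m)
    return total

natural = standard_partial_sum(20000)       # close to ln 2
rearranged = rearranged_partial_sum(20000)  # close to 1.5 * ln 2
```

The two values differ by roughly $0.35$, so the limit depends on the chosen order.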
4.4 Theorem. Let $\sum_J x_j$ and $\sum_J y_j$ exist in a Hilbert space $H$. Then
(1) $\sum_J \alpha x_j$ exists and is equal to $\alpha \sum_J x_j$,
(2) $\sum_J (x_j + y_j)$ exists and is equal to $\sum_J x_j + \sum_J y_j$, and
(3) for every $z$ the sum $\sum_J (x_j z)$ exists and is equal to $\bigl(\sum_J x_j\bigr) z$.

Proof. (1) and (2) are straightforward.
(3): The mapping $(x \mapsto xz)$ is continuous. By Theorem 4.3, we can think of the system $(x_j)_j$ as of a sequence $x_1, x_2, \dots$ with the sum $x = \lim \sum_{k=1}^n x_k$ and conclude that
$$xz = \Bigl(\lim \sum_{k=1}^n x_k\Bigr) z = \lim \sum_{k=1}^n (x_k z) = \sum_J (x_j z).$$
□
4.5 Similarly as in 4.5 of Appendix A, we will speak of an orthogonal system $(x_j)_{j \in J}$ if $x_j x_k = 0$ whenever $j \ne k$. If, moreover, $\|x_j\| = 1$ for all $j \in J$ we say that the system is orthonormal.

4.6 Theorem. (Generalized Pythagoras' Theorem) An orthogonal system $(x_j)_j$ in a Hilbert space has a sum if and only if the system $(\|x_j\|^2)_j$ has a sum in $\mathbb{R}$. In that case, we have
$$\Bigl\|\sum_J x_j\Bigr\|^2 = \sum_J \|x_j\|^2.$$

Proof. I. Existence:
⇒: Consider the sets $K(\varepsilon)$ from 4.2. If $K \subseteq J$ is finite and $K \cap K(\varepsilon) = \emptyset$ then, using orthogonality,
$$\sum_K \|x_j\|^2 = \sum_{j,k \in K} x_j x_k = \Bigl(\sum_K x_j\Bigr)\Bigl(\sum_K x_j\Bigr) = \Bigl\|\sum_K x_j\Bigr\|^2 < \varepsilon^2.$$
⇐: Reason as in the ⇒ implication but in reverse, using, this time, the sets $K(\varepsilon^2)$.
II. The equality: Set $x = \sum_J x_j$. By 4.4(3), we have
$$xx = \Bigl(\sum_J x_j\Bigr) x = \sum_J (x_j x) = \sum_J x_j \Bigl(\sum_J x_k\Bigr) = \sum_J \sum_J (x_j x_k) = \sum_j x_j x_j.$$
□
4.7 Theorem. (Bessel's inequality) Let $(x_j)_{j \in J}$ be an orthonormal system in a Hilbert space $H$. Then for each element $x \in H$, the sum $\sum_J |xx_j|^2$ exists and one has
$$\sum_J |xx_j|^2 \le \|x\|^2.$$

Proof. Let $K \subseteq J$ be a finite subset. We have
$$0 \le \Bigl\|x - \sum_K (xx_j)x_j\Bigr\|^2 = \Bigl(x - \sum_K (xx_j)x_j\Bigr)\Bigl(x - \sum_K (xx_j)x_j\Bigr)$$
$$= xx - \sum_K (xx_j)(x_j x) - \sum_K (xx_j)(x_j x) + \sum_{j,k \in K} (xx_j)\overline{(xx_k)}(x_j x_k)$$
$$= xx - \sum_K (xx_j)\overline{(xx_j)} - \sum_K (xx_j)\overline{(xx_j)} + \sum_K (xx_j)\overline{(xx_j)}$$
$$= xx - \sum_K (xx_j)\overline{(xx_j)} = xx - \sum_K |xx_j|^2$$
and hence
$$\sum_K |xx_j|^2 \le \|x\|^2.$$
Thus, the sum $\sum_J |xx_j|^2$ absolutely converges (recall 6.2 and 6.3 of Chapter 1). □
4.8 From 4.7 and 4.6, we immediately obtain the following

Corollary. If $(x_j)_{j \in J}$ is an orthonormal system in $H$ then for every $x \in H$ there exists the sum
$$\sum_J (xx_j)x_j.$$

4.9 Theorem. (Parseval's equality) One has $\sum_J |xx_j|^2 = \|x\|^2$, that is the Bessel inequality becomes equality, if and only if $x = \sum_J (xx_j)x_j$.

Proof. Recall the beginning of the proof of Theorem 4.7: instead of the inequality $0 \le \bigl\|x - \sum_K (xx_j)x_j\bigr\|^2$ consider $0 = \bigl\|x - \sum_J (xx_j)x_j\bigr\|^2$ and observe that the formulas in the statement express the same fact. □
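Bessel's inequality and Parseval's equality can be watched side by side in $\mathbb{C}^3$. In the Python sketch below (illustrative data and hypothetical helper names, none of them from the text), `e1, e2` form an orthonormal system that is not a basis, so the inequality is strict for the chosen $x$. Adding `e3` completes the system and turns the inequality into equality.

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def bessel_sum(x, system):
    """Sum of |x . e|^2 over the vectors e of an orthonormal system."""
    return sum(abs(inner(x, e)) ** 2 for e in system)

r = 1 / math.sqrt(2)
e1 = [1, 0, 0]
e2 = [0, r, 1j * r]
e3 = [0, r, -1j * r]   # completes e1, e2 to an orthonormal basis of C^3
x = [1 + 1j, 2, 3]

norm_sq = inner(x, x).real            # ||x||^2 = 15
partial = bessel_sum(x, [e1, e2])     # 2 + 6.5 = 8.5, strictly smaller
full = bessel_sum(x, [e1, e2, e3])    # 15 up to rounding: Parseval
```

The gap $15 - 8.5 = 6.5$ is exactly $|xe_3|^2$, the squared length of the component of $x$ missed by the incomplete system.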
4.10 A Hilbert basis of a Hilbert space $H$ is a maximal orthonormal system in $H$, that is, an orthonormal system $(x_j)_{j \in J}$ such that no non-zero $x \in H$ is orthogonal to all of the $x_j$, $j \in J$. Using Zorn's lemma (for the system of all orthonormal systems ordered by inclusion), one easily proves the following

Proposition. Every Hilbert space has a Hilbert basis.

Remark. There is a terminological conflict: a Hilbert basis of $H$ is not a basis of $H$ as a vector space; the point is not in the orthogonality: we already have the concept of an orthogonal basis in a vector space with a scalar product, and a Hilbert basis is in general not that either. It does not generate the space: a general element is not necessarily a linear combination of its elements. But, as we will see, a general element can be expressed as an "infinite linear combination" of the elements of a Hilbert basis.

4.11 Theorem. Let $(x_j)_{j \in J}$ be an orthonormal system in a Hilbert space $H$. Then the following statements are equivalent.
(1) $(x_j)_{j \in J}$ is a Hilbert basis.
(2) If $x$ is orthogonal to all the $x_j$, $j \in J$, then $x = o$.
(3) For every $x \in H$ one has $x = \sum_J (xx_j)x_j$.
(4) For every two $x, y \in H$ one has
$$xy = \sum_J (xx_j)\overline{(yx_j)}.$$
(5) For every $x \in H$ one has
$$\|x\| = \sqrt{\sum_J |xx_j|^2}.$$

Proof. (1)⇔(2) is just a reformulation of the definition.
(2)⇒(3): The sum $\sum (xx_j)x_j$ exists by 4.8. For every $x \in H$, one has $\bigl(x - \sum (xx_j)x_j\bigr)x_k = xx_k - xx_k = 0$ for each $k$ and hence by (2), $x - \sum (xx_j)x_j = o$.
(3)⇒(4): We have
$$xy = \Bigl(\sum_j (xx_j)x_j\Bigr)\Bigl(\sum_k (yx_k)x_k\Bigr) = \sum_{j,k} (xx_j)\overline{(yx_k)}\,x_j x_k = \sum_j (xx_j)\overline{(yx_j)}.$$
(4)⇒(5): Set $y = x$.
(5)⇒(1): Suppose (1) does not hold. Choose an element $x$ such that $\|x\| = 1$ and $xx_j = 0$ for all $j$. Then $\|x\| = 1 \ne 0 = \sqrt{\sum_J |xx_j|^2}$. □
5
The Hahn-Banach Theorem
Let us now turn our attention to Banach spaces. Recall that linear maps $f: V \to \mathbb{R}$, $f: V \to \mathbb{C}$ for a real resp. complex vector space $V$ are called linear forms.

5.1 Theorem. (Hahn-Banach) Let $V$ be a real vector space and let $p: V \to \mathbb{R}$ be a function such that
(a) for all $x, y \in V$, $p(x + y) \le p(x) + p(y)$ and
(b) for every $x \in V$ and $r \in \langle 0, \infty)$, $p(rx) = r\,p(x)$.
Let $V_0$ be a vector subspace of $V$ and let $f_0$ be a linear form on $V_0$ such that
$$f_0(x) \le p(x) \quad\text{for all } x \in V_0.$$
Then there exists a linear form $f$ on $V$ such that
$$f_0 = f|V_0 \quad\text{and}\quad f(x) \le p(x) \text{ for all } x \in V.$$

Proof. Consider the system $\mathcal{W}$ of all pairs $(W, g)$ where $W \supseteq V_0$ is a vector subspace of $V$ and $g: W \to \mathbb{R}$ a linear form such that $g|V_0 = f_0$ and that $g(x) \le p(x)$ for all $x \in W$. On $\mathcal{W}$ define an order $\sqsubseteq$ by the formula
$$(W_1, g_1) \sqsubseteq (W_2, g_2) \quad\overset{\mathrm{df}}{\equiv}\quad W_1 \subseteq W_2 \text{ and } g_2|W_1 = g_1.$$
Let $C = \{(W_i, g_i) \mid i \in J\} \subseteq \mathcal{W}$ be a chain in this order. Setting $W = \bigcup_{i \in J} W_i$ and defining $g: W \to \mathbb{R}$ by $g(x) = g_i(x)$ for $x \in W_i$, we obtain a $(W, g)$ majorizing all the $(W_i, g_i)$. By Zorn's Lemma, there is, hence, a $(W, g) \in \mathcal{W}$ maximal in the order $\sqsubseteq$. We will prove the statement of the theorem by showing that $W = V$.

Suppose $W \ne V$. Choose $a \in V \setminus W$ and let
$$W' = \{x + ra \mid x \in W,\ r \in \mathbb{R}\}.$$
For arbitrary $x, y \in W$ we have
$$g(x) + g(y) = g(x + y) \le p(x + a + y - a) \le p(x + a) + p(y - a)$$
and hence
$$g(y) - p(y - a) \le -g(x) + p(x + a).$$
Since $x, y$ are arbitrary there is a real number $\alpha$ such that
$$\forall x, y \in W: \quad g(y) - p(y - a) \le \alpha \le -g(x) + p(x + a) \qquad (*)$$
(for instance $\alpha = \sup_y (g(y) - p(y - a))$, or $\alpha = \inf_x (-g(x) + p(x + a))$). Now define a linear form
$$h: W' \to \mathbb{R}$$
by letting $h(x + ra) = g(x) + r\alpha$ (this is correct: if $x + ra = y + sa$ then $(r - s)a = y - x \in W$, hence $s = r$ and $x = y$). Let $r > 0$. Since, by (*),
$$g\bigl(\tfrac1r x\bigr) + \alpha \le p\bigl(\tfrac1r x + a\bigr),$$
we have
$$h(x + ra) = r\bigl(g(\tfrac1r x) + \alpha\bigr) \le r\,p\bigl(\tfrac1r x + a\bigr) = p(x + ra).$$
Similarly if $r < 0$ we use the inequality
$$g\bigl(-\tfrac1r x\bigr) - \alpha \le p\bigl(-\tfrac1r x - a\bigr)$$
to obtain
$$h(x + ra) = -r\bigl(g(-\tfrac1r x) - \alpha\bigr) \le -r\,p\bigl(-\tfrac1r x - a\bigr) = p(x + ra).$$
Since trivially $h(x + 0 \cdot a) \le p(x + 0 \cdot a)$ we conclude that $h(y) \le p(y)$ for all $y \in W'$, contradicting the maximality of $(W, g)$. □
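The one-step extension in the proof can be traced on a small example. The sketch below is an illustration under assumptions not in the text: $V = \mathbb{R}^2$, $p$ is the $\ell_1$ norm, $W = \{(t, 0)\}$ with $g(t, 0) = t$, and $a = (0, 1)$. The bounds in (*) are approximated over an integer grid, and all function names are hypothetical.

```python
def p(v):
    """Sublinear function on R^2; here the l1 norm (illustrative choice)."""
    return abs(v[0]) + abs(v[1])

def g(t):
    """Linear form on W = {(t, 0)}: g(t, 0) = t, which satisfies g <= p on W."""
    return t

def alpha_interval(a, grid):
    """Approximate the two bounds in (*) over points (t, 0), t in grid:
    sup_y (g(y) - p(y - a)) and inf_x (-g(x) + p(x + a))."""
    lower = max(g(t) - p((t - a[0], -a[1])) for t in grid)
    upper = min(-g(t) + p((t + a[0], a[1])) for t in grid)
    return lower, upper

def h(v, alpha):
    """Extension of g to R^2 = W + R a with a = (0, 1): h(t, s) = t + alpha s."""
    return v[0] + alpha * v[1]

def dominated(alpha, grid):
    """True if h <= p at all grid points (t, s)."""
    return all(h((t, s), alpha) <= p((t, s)) for t in grid for s in grid)

grid = range(-50, 51)
lower, upper = alpha_interval((0, 1), grid)   # (-1, 1)
```

Any $\alpha$ between the two bounds gives an extension dominated by $p$, which shows that the extension is in general far from unique. A value outside the interval, such as $\alpha = 1.5$, violates $h \le p$ already at the point $(0, 1)$.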
5.2 Corollary. (Hahn-Banach's Theorem - the complex version) Let $V$ be a complex vector space, and let $\rho : V \to \langle 0, \infty)$ satisfy
(a) for all $x, y \in V$, $\rho(x + y) \le \rho(x) + \rho(y)$, and
(b) for every $x \in V$ and $r \in \mathbb{C}$, $\rho(rx) = |r|\rho(x)$.
Let $V_0$ be a vector subspace of $V$ and let $f_0$ be a linear form on $V_0$ such that
$$|f_0(x)| \le \rho(x) \quad \text{for all } x \in V_0.$$
Then there exists a linear form $f$ on $V$ such that
$$f_0 = f|_{V_0} \quad \text{and} \quad |f(x)| \le \rho(x) \quad \text{for all } x \in V.$$
Proof. View $V$ as a vector space over $\mathbb{R}$. By Hahn-Banach's Theorem, there exists a linear map $g : V \to \mathbb{R}$ such that $g|_{V_0} = \mathrm{Re}(f_0)$ and $g(x) \le \rho(x)$. Then there exists a (unique) complex-linear map $f : V \to \mathbb{C}$ such that $\mathrm{Re}(f) = g$, namely $f(x) = g(x) - i\,g(ix)$. In particular, by uniqueness, $f|_{V_0} = f_0$. Now for every $x \in V$, there exists a complex number $\lambda$ of modulus 1 such that
$$\lambda f(x) = |f(x)|.$$
Thus, $|f(x)| = f(\lambda x)$. Hence $f(\lambda x) \in \mathbb{R}$, and hence $f(\lambda x) = g(\lambda x)$. Now compute:
$$|f(x)| = g(\lambda x) \le \rho(\lambda x) = |\lambda|\rho(x) = \rho(x).$$
$\square$
5.3 As an easy consequence of the Hahn-Banach Theorem we obtain

Proposition. Let $L$ be a normed real or complex vector space and let $M$ be a vector subspace of $L$. Let $g$ be a continuous linear form on $M$. Then there exists a continuous linear form $f$ on $L$ extending $g$ such that $\|f\| = \|g\|$ (the norms in $L^*$ and $M^*$, respectively).

Proof. Use Theorem 5.1 resp. Corollary 5.2 with $V = L$, $V_0 = M$ and $\rho(x) = \|g\| \cdot \|x\|$. $\square$
5.4 And here is another one.

Proposition. Let $L$ be a normed vector space and let $M$ be a closed vector subspace. Let $M \ne L$. Then there is a continuous non-zero linear form $f$ on $L$ such that $f|_M$ is constant zero.

Remark. Note that we speak of continuity but not of the norm: the norm of $f|_M$ is zero and would not help us.

Proof. Choose an $a \in L \smallsetminus M$. Since $M$ is closed, $\inf\{\|x - a\| \mid x \in M\} = d > 0$. Define a linear form $g$ on $M' = \{x + ra \mid x \in M,\ r \in \mathbb{R}\}$ by setting $g(x + ra) = r$. For $r \ne s$ we have
$$\|(x + ra) - (y + sa)\| = \|x - y + (r - s)a\| = |r - s| \cdot \left\|\tfrac{1}{r - s}(x - y) + a\right\| \ge |r - s|\,d$$
and hence $g$ is continuous. Now extend $g$ to a continuous linear form on $L$ using the Hahn-Banach Theorem. $\square$
6 Dual Banach spaces and reflexivity
6.1 Recall the definition 3.5.2 of the dual $L^*$ of a normed vector space $L$.

Proposition. $L^*$ is always complete (and consequently is always a Banach space).

Proof. To fix ideas, let us consider the real case (the complex case is analogous). Suppose $(f_n)$ is a Cauchy sequence in $L^*$. Let $B$ be the unit ball in $L$. Then, by definition, the restrictions $f_n|_B$ form a Cauchy sequence in the space $C(B)$ of bounded continuous functions on $B$, which we discussed in Chapter 2 (and, in fact, the $L^*$-distances $\|f_m - f_n\|$ are equal to the $C(B)$-distances). However, we already know that the space $C(B)$ is complete, and thus the sequence $(f_n|_B)$ converges uniformly to a function $f_0 : B \to \mathbb{R}$. Then it is immediate that the function $f \in L^*$ defined by $f(v) = \|v\| \cdot f_0(v/\|v\|)$ for $v \ne o$, and $f(o) = 0$, is the limit of the sequence $(f_n)$ in $L^*$. $\square$
6.2 Recall from Section 3.6 that for a continuous linear mapping $f : L \to M$, we have a continuous linear mapping
$$f^* : M^* \to L^*$$
defined by setting $f^*(\varphi) = \varphi \circ f$, and that we have $\|f^*\| \le \|f\|$. In fact, the norms are equal.

Proposition. We have $\|f^*\| = \|f\|$.

Proof. To fix ideas, let us consider the real case (the complex case is analogous). We may assume $f \ne 0$. Choose an $\varepsilon$ with $0 < \varepsilon < \|f\|$ and an $x_0 \in L$ such that $0 < \|x_0\| \le 1$ and $\|f(x_0)\| \ge \|f\| - \varepsilon$; in particular $f(x_0) \ne o$. On the vector subspace $\{r f(x_0) \mid r \in \mathbb{R}\}$ define a linear form $g$ by setting $g(r f(x_0)) = r\|f(x_0)\|$. Then $\|g\| = 1$ (the unit ball is $\{r f(x_0) \mid |r| \le 1/\|f(x_0)\|\}$) and hence there is, by Proposition 5.3, a linear form $\varphi \in M^*$ such that $\|\varphi\| \le 1$ and $\varphi(f(x_0)) = \|f(x_0)\|$. Thus,
$$\|f^*\| \ge \|f^*(\varphi)\| = \|\varphi \circ f\| \ge |\varphi(f(x_0))| = \|f(x_0)\| \ge \|f\| - \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, we conclude that $\|f^*\| = \|f\|$. $\square$
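As a finite-dimensional sanity check of the equality $\|f^*\| = \|f\|$, the following Python sketch works in an assumed toy setting that is not taken from the text: $L = M = \mathbb{R}^3$ with the $\ell^1$ norm, so that the duals carry the $\ell^\infty$ norm, $f$ is given by a matrix `A` and $f^*$ by its transpose. Both operator norms are computed independently by enumerating the extreme points of the respective unit balls (a convex function on a polytope attains its maximum at a vertex). All names are illustrative.

```python
from itertools import product

# Assumed toy setting (not from the text): L = M = (R^3, l^1 norm), so the
# duals carry the l^infinity norm, f is the matrix A and f* is its transpose.
A = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0],
     [2.0, 1.0, 4.0]]

def apply(mat, v):
    """Matrix-vector product with plain lists."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in mat]

def transpose(mat):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*mat)]

def norm1(v):
    return sum(abs(c) for c in v)

def norm_inf(v):
    return max(abs(c) for c in v)

def op_norm_l1(mat):
    """||f|| for f: l^1 -> l^1; the sup over the unit ball is attained at +-e_j."""
    n = len(mat[0])
    basis = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    return max(norm1(apply(mat, e)) for e in basis)

def op_norm_linf(mat):
    """||f*|| for f*: l^inf -> l^inf; the sup is attained at sign vectors."""
    n = len(mat[0])
    return max(norm_inf(apply(mat, list(s)))
               for s in product((-1.0, 1.0), repeat=n))

print(op_norm_l1(A), op_norm_linf(transpose(A)))
```

The two printed numbers agree, as the Proposition predicts; the check is of course no substitute for the Hahn-Banach argument, which is what makes the statement work in infinite dimensions.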
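6.3 For a normed linear space $L$ define
$$\nu = \nu_L : L \to L^{**}$$
by setting
$$(\nu(x))(\varphi) = \varphi(x).$$

6.3.1 Proposition. $\nu$ is a linear map preserving norm, and for every continuous linear map $f : L \to M$ we have a commutative diagram
$$\begin{array}{ccc} L & \xrightarrow{\ \nu_L\ } & L^{**} \\ {\scriptstyle f}\downarrow & & \downarrow{\scriptstyle f^{**}} \\ M & \xrightarrow{\ \nu_M\ } & M^{**} \end{array}$$

Proof. Again, to fix ideas, let us work in the real case. The complex case is the same. Checking that $\nu$ is linear is straightforward. Consider the formula
$$\|\nu(x)\| = \sup\{|\nu(x)(f)| \mid \|f\| \le 1\} = \sup\{|f(x)| \mid \|f\| \le 1\}.$$
By Lemma 3.6, $|f(x)| \le \|f\| \cdot \|x\|$ and hence we see that $\|\nu(x)\| \le \|x\|$. Now fix an $x \ne o$ and define a linear form $g : L' = \{rx \mid r \in \mathbb{R}\} \to \mathbb{R}$ by setting $g(rx) = r\|x\|$. The unit ball in $L'$ is the set $\{rx \mid |r| \le 1/\|x\|\}$ and hence $\|g\| = 1$. By Proposition 5.3, we can extend $g$ to a linear form $f$ on $L$ with $\|f\| = 1$, and we have $\nu(x)(f) = f(x) = \|x\|$. Thus, $\|\nu(x)\| \ge \|x\|$.

Finally, let $f : L \to M$ be a continuous linear map, $x \in L$ and $\varphi \in M^*$. We have
$$((f^{**} \circ \nu_L)(x))(\varphi) = (f^{**}(\nu_L(x)))(\varphi) = (\nu_L(x) \circ f^*)(\varphi) = \nu_L(x)(f^*(\varphi)) = \nu_L(x)(\varphi \circ f) = \varphi(f(x)) = (\nu_M(f(x)))(\varphi) = ((\nu_M \circ f)(x))(\varphi),$$
that is, $f^{**} \circ \nu_L = \nu_M \circ f$. $\square$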
6.4 A Banach space $B$ is said to be reflexive if the mapping $\nu_B$ is surjective (and hence a norm preserving isomorphism).

6.4.1 Remark: We have seen in Theorem 3.7.1 that the dual space of a Hilbert space $H$ is antilinearly isomorphic to $H$ by the inner product. Composing the antilinear isomorphisms
$$H \to H^* \to (H^*)^*,$$
one gets the map $\nu$ of 6.3, and thus a Hilbert space is always reflexive.

6.5 Proposition. Let a Banach space $B$ not be reflexive. Then neither is the Banach space $B^*$.

Proof. Since $B$ is complete, the vector subspace $\nu_B[B]$ of $B^{**}$ is also complete (it is norm-isomorphic) and hence, by Proposition 7.3.1 of Chapter 2, closed in $B^{**}$. By Proposition 5.4, there exists an $F \in B^{***}$, a linear form on $B^{**}$, that is non-zero but identically zero on $\nu_B[B]$. We will show that it is not in $\nu_{B^*}[B^*]$. Suppose it is, that is, $F = \nu_{B^*}(f)$ for a linear form $f$ on $B$. In particular, for each $\nu_B(x)$ we have $F(\nu_B(x)) = 0$. Thus,
$$0 = \nu_{B^*}(f)(\nu_B(x)) = \nu_B(x)(f) = f(x)$$
for all $x$, hence $f = o$ and finally also $F = \nu_{B^*}(o)$ is identically zero, a contradiction. $\square$
6.6 The weak topology

The following construction works over $\mathbb{R}$ or $\mathbb{C}$. To fix ideas, let us work over $\mathbb{C}$. The treatment over $\mathbb{R}$ is analogous. Let $W$ be a Banach space and let $W^*$ be its dual. The weak topology of $W^*$ (with respect to $W$) has a basis of open sets determined by all possible choices of elements $f_1, \dots, f_n \in W$ and open sets $U_1, \dots, U_n \subseteq \mathbb{C}$. The basis element corresponding to this data is
$$\{X \in W^* \mid X(f_1) \in U_1, \dots, X(f_n) \in U_n\}.$$

6.6.1 Lemma. Let $V$ be a normed vector space. Then the unit ball $B$ of $(V^*)^*$ is the closure of the image $B_1$ of the unit ball of $V$ under the canonical map $V \to (V^*)^*$, with respect to the weak topology (with respect to $V^*$).

Proof. To prove that $B$ is contained in the closure of $B_1$ with respect to the weak topology, it suffices to show that every open set $U$ in the weak topology disjoint with $B_1$ is also disjoint with $B$. For open sets $U$ which are of the form
$$F_1^{-1}[U_1] \cap \dots \cap F_n^{-1}[U_n]$$
with $U_1, \dots, U_n$ open and $F_1, \dots, F_n \in V^*$ (such sets form a basis of the topology), we may as well take the quotient of both $V$ and $(V^*)^*$ by the annihilator of $F_1, \dots, F_n$ (i.e. the subspace of elements which have 0 evaluation on $F_1, \dots, F_n$). The map induced on the quotients from the canonical map $V \to (V^*)^*$, however, is the canonical map $W \to (W^*)^*$ where $W$ is the quotient of $V$ by the annihilator of $F_1, \dots, F_n$, which is an isomorphism since $W$ is finite-dimensional.

On the other hand, if $X \notin B$, there exists an $F \in V^*$ such that $\|F\| = 1$, $X(F) > 1$. This means that the open set determined by $F_1 = F$ and $U_1 = (1 + (X(F) - 1)/2, \infty)$ contains $X$ but is disjoint from $B_1$, thus showing that $X$ is not in the closure of $B_1$ with respect to the weak topology. $\square$

6.7 Theorem. (The Milman-Pettis Theorem) Every uniformly convex Banach space $V$ is reflexive.

Proof (The proof we present here is due to J.R. Ringrose). Let $V$ be a uniformly convex Banach space. By uniform convexity, for every $\varepsilon > 0$ it is possible to choose a $\delta = \delta(\varepsilon) > 0$ such that if $x, y \in V$ satisfy
$$\|x\|, \|y\| \le 1, \quad \|x + y\| \ge 2 - \delta,$$
then
$$\|x - y\| < \varepsilon.$$
Now suppose $V$ is a uniformly convex Banach space which is not reflexive. Let $B$ be the closed unit ball in $(V^*)^*$, and let $B_1$ be the image of the closed unit ball in $V$ under the canonical map $V \to (V^*)^*$. Then $B$ is contained in the closure of $B_1$ under the weak topology (with respect to the space $V^*$). Assuming $B \ne B_1$, since the canonical embedding $V \to (V^*)^*$ is an isometry, by completeness, the image is closed, and thus $B_1$ is a closed subset of $B$. This means that there exist an $\varepsilon > 0$ and an $X \in B$ such that, in $(V^*)^*$,
$$\Omega(X, 2\varepsilon) \cap B_1 = \emptyset. \qquad (*)$$
Now choose an $F \in V^*$ such that $\|F\| = 1$ and $|X(F) - 1| < \frac{1}{2}\delta$ where $\delta = \delta(\varepsilon)$. Then put
$$\mathcal{V} = \{Y \in (V^*)^* \mid |Y(F) - 1| < \tfrac{1}{2}\delta\}.$$
If $Y, Y_1 \in \mathcal{V} \cap B_1$, we have $|Y(F) + Y_1(F)| > 2 - \delta$, and hence $\|Y + Y_1\| > 2 - \delta$, and therefore $\|Y - Y_1\| < \varepsilon$. Fixing $Y$, we deduce that
$$\mathcal{V} \cap B_1 \subseteq Y + \varepsilon B.$$
Since, however, the right-hand set is closed in $(V^*)^*$ under the weak topology (with respect to $V^*$), while $X$ is in the closure of $\mathcal{V} \cap B_1$ with respect to the weak topology (since, in that topology, $\mathcal{V}$ is open), we deduce that $X \in Y + \varepsilon B$. This is a contradiction with (*). $\square$
7 The duality of $L^p$-spaces

We begin with the following result:

7.1 Theorem. For $1 < p < \infty$, the spaces $L^p(B)$, $L^p(B, \mathbb{C})$ are uniformly convex.
7.2 Reduction to the real case

The remainder of this section will consist of a proof of Theorem 7.1. The first thing we should realize is that the real and complex cases are actually somewhat different, since in the complex case the definition of $L^p$ uses the complex absolute value, which, in effect, is a Hilbert space norm on $\mathbb{C} = \mathbb{R}^2$. Because of this, we don't have an obvious isomorphism of $L^p(B, \mathbb{C})$, considered as a real Banach space, to a real $L^p$-space (although we won't prove that they are not isomorphic). Of course, $L^p(B)$ is embedded into $L^p(B, \mathbb{C})$ isometrically, and hence the uniform convexity of $L^p(B, \mathbb{C})$ implies the uniform convexity of $L^p(B)$. We will, however, be interested in the opposite implication, as the proof of uniform convexity of $L^p(B)$ is, in fact, somewhat simpler.

Assume, therefore, that we already know that $L^p(B)$ is uniformly convex, and let $(f_n)$, $(g_n)$ be sequences in $L^p(B, \mathbb{C})$ such that
$$\|f_n\|_p = \|g_n\|_p = 1, \quad \left\|\frac{f_n + g_n}{2}\right\|_p \to 1.$$
Then certainly
$$\|\,|f_n|\,\|_p = \|\,|g_n|\,\|_p = 1,$$
and
$$\left\|\frac{f_n + g_n}{2}\right\|_p \le \left\|\frac{|f_n| + |g_n|}{2}\right\|_p \le 1$$
(the second inequality by the triangle inequality), so
$$\left\|\frac{|f_n| + |g_n|}{2}\right\|_p \to 1,$$
and hence, by the uniform convexity of $L^p(B)$,
$$\|\,|f_n| - |g_n|\,\|_p \to 0.$$
This means that there exist measurable functions $\alpha_n : B \to \mathbb{C}$, $|\alpha_n(x)| = 1$ for all $x$, such that
$$\|f_n - \alpha_n g_n\|_p \to 0. \qquad (*)$$
From the uniform convexity of Hilbert spaces (applied to the 1-dimensional complex Hilbert space $\mathbb{C}$), we know that for each $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$|\alpha_n(x) - 1| > \varepsilon \ \Rightarrow\ \left|\frac{g_n(x) + \alpha_n(x) g_n(x)}{2}\right| < (1 - \delta)^{1/p}\,|g_n(x)|.$$
Denote by $S_n$ the set of all $x \in B$ such that $|\alpha_n(x) - 1| > \varepsilon$, and denote by $c_n = c_{S_n}$ its characteristic function (i.e. the function equal to 1 on $S_n$ and 0 elsewhere). Then
$$\left(\left\|\frac{g_n + \alpha_n g_n}{2}\right\|_p\right)^p \le (\|g_n\|_p)^p - \delta\,(\|g_n c_n\|_p)^p.$$
Taking $n \to \infty$ and using (*), we obtain
$$\lim_{n \to \infty} \|g_n c_n\|_p = 0,$$
and hence, since $|1 - \alpha_n| \le \varepsilon$ outside $S_n$ and $|1 - \alpha_n| \le 2$ on $S_n$,
$$\lim_{n \to \infty} \|f_n - g_n\|_p \le \lim_{n \to \infty} \|f_n - \alpha_n g_n\|_p + \lim_{n \to \infty} \|(1 - \alpha_n) g_n\|_p \le \lim_{n \to \infty} (\varepsilon \|g_n\|_p + 2\|g_n c_n\|_p) = \varepsilon.$$
Since $\varepsilon > 0$ was arbitrary, we are done: it suffices to prove the uniform convexity of $L^p(B)$.
7.3 The uniform convexity of $L^p(B)$

We will now show a simple argument proving the uniform convexity of $L^p(B)$ which does not generalize to the complex case, thus explaining in particular why the reduction 7.2 pays off.

7.3.1 Lemma. Let $1 \le p < \infty$ and let $f, g$ be non-negative real functions which represent elements in $L^p(B)$. Then
$$(\|f + g\|_p)^p \ge (\|f\|_p)^p + (\|g\|_p)^p.$$

Proof. Note that for non-negative numbers $x, y$ and $p \ge 1$, we have
$$(x + y)^p \ge x^p + y^p.$$
In effect, dividing by $y^p$, we may assume without loss of generality $y = 1$, and then
$$(x + 1)^p - x^p = \int_x^{x+1} p\,t^{p-1}\,dt$$
is a non-decreasing function in $x$, and hence is $\ge 1$. Now we have
$$(\|f + g\|_p)^p - (\|f\|_p)^p - (\|g\|_p)^p = \int_B \big((f(t) + g(t))^p - f(t)^p - g(t)^p\big) \ge 0,$$
as claimed. $\square$

7.3.2 Lemma. If, in a normed vector space, sequences $(x_n)$, $(y_n)$ satisfy
$$\|x_n\| \to 1, \quad \|x_n + y_n\|^p + \|x_n - y_n\|^p \to 2,$$
then
$$\|x_n + y_n\| \to 1, \quad \|x_n - y_n\| \to 1.$$

Proof. Using the compactness of the interval $\langle 0, 3\rangle$, by picking a subsequence, we may assume, without loss of generality, that
$$\|x_n + y_n\| \to \alpha, \quad \|x_n - y_n\| \to \beta$$
for some $\alpha, \beta \ge 0$. Now we have
$$\alpha + \beta = \lim_{n \to \infty} (\|x_n + y_n\| + \|x_n - y_n\|) \ge \lim_{n \to \infty} \|2x_n\| = 2,$$
while $\alpha^p + \beta^p = 2$. Thus,
$$\left(\tfrac{1}{2}(\alpha + \beta)\right)^p \ge \tfrac{1}{2}(\alpha^p + \beta^p) = 1,$$
and hence, since $t^p$ is a convex function on $\langle 0, \infty)$, $\alpha = \beta$ and equality occurs, so that $\alpha = \beta = 1$. $\square$
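The two scalar inequalities used in Lemmas 7.3.1 and 7.3.2 are easy to probe numerically. The following Python sketch is only a spot check on random samples, not a proof; the function names `superadditive_gap` and `midpoint_gap` and the sampling ranges are assumptions made for the illustration.

```python
import random

def superadditive_gap(x, y, p):
    """Return (x + y)**p - x**p - y**p, which is >= 0 for x, y >= 0 and p >= 1."""
    return (x + y) ** p - x ** p - y ** p

def midpoint_gap(a, b, p):
    """Return (a**p + b**p)/2 - ((a + b)/2)**p, >= 0 by convexity of t**p."""
    return 0.5 * (a ** p + b ** p) - (0.5 * (a + b)) ** p

random.seed(0)
samples = [(random.uniform(0, 5), random.uniform(0, 5), random.uniform(1, 4))
           for _ in range(1000)]
# Small negative tolerance to absorb floating-point rounding.
assert all(superadditive_gap(x, y, p) >= -1e-9 for x, y, p in samples)
assert all(midpoint_gap(a, b, p) >= -1e-9 for a, b, p in samples)

# For p > 1 the midpoint gap vanishes only when a == b (strict convexity).
print(midpoint_gap(1.0, 1.0, 3.0), midpoint_gap(0.5, 1.5, 3.0))
```

The last line illustrates the final step of the proof of Lemma 7.3.2: with $\alpha + \beta \ge 2$ and $\alpha^p + \beta^p = 2$, a strictly positive midpoint gap is impossible, which forces $\alpha = \beta$.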
7.3.3 Proof that $L^p(B)$ is uniformly convex

Suppose $(f_n)$, $(g_n)$ are sequences in $L^p(B)$ such that
$$\|f_n\|_p = \|g_n\|_p = 1, \quad \left\|\frac{f_n + g_n}{2}\right\|_p \to 1.$$
Put
$$x_n = \frac{f_n + g_n}{2}, \quad y_n = \frac{f_n - g_n}{2}.$$
Then
$$\|x_n + y_n\|_p = \|f_n\|_p = 1 = \|g_n\|_p = \|x_n - y_n\|_p,$$
and hence
$$\begin{aligned} 2 &= (\|x_n + y_n\|_p)^p + (\|x_n - y_n\|_p)^p \\ &= \int_B \big(|x_n(t) + y_n(t)|^p + |x_n(t) - y_n(t)|^p\big) \\ &= \int_B \big(\big|\,|x_n(t)| + |y_n(t)|\,\big|^p + \big|\,|x_n(t)| - |y_n(t)|\,\big|^p\big) \\ &= (\|\,|x_n| + |y_n|\,\|_p)^p + (\|\,|x_n| - |y_n|\,\|_p)^p. \end{aligned}$$
(Note that in the third equality, it is crucial that $x_n(t)$, $y_n(t)$ are real numbers.) Now by Lemma 7.3.2,
$$\|\,|x_n| + |y_n|\,\|_p \to 1.$$
Using Lemma 7.3.1,
$$(\|y_n\|_p)^p \le (\|\,|x_n| + |y_n|\,\|_p)^p - (\|x_n\|_p)^p \to 0,$$
as claimed. This concludes the proof that $L^p(B)$ is uniformly convex, and hence, by Subsection 7.2, the proof of Theorem 7.1. $\square$

7.4 Theorem. Let $B$ be a Borel subset of $\mathbb{R}^n$. Let $1 < p < \infty$ and let
$$\frac{1}{p} + \frac{1}{q} = 1$$
(then, of course, also $1 < q < \infty$). We have isometric isomorphisms of Banach spaces
$$U_q : L^q(B) \cong (L^p(B))^* \quad \text{and} \quad U_q : L^q(B, \mathbb{C}) \cong (L^p(B, \mathbb{C}))^*$$
given by
$$(U_q(y))(x) = \int_B x \cdot y. \qquad (7.4.1)$$
Proof. Let us prove the complex case (the real case is analogous). By Hölder's inequality, the integral (7.4.1) exists, and we have
$$|(U_q(y))(x)| \le \|y\|_q \cdot \|x\|_p.$$
Since $U_q(y)$ is linear, we therefore have $U_q(y) \in (L^p(B, \mathbb{C}))^*$ with
$$\|U_q(y)\| \le \|y\|_q. \qquad (*)$$
To deduce that $U_q$ is an isometry, we need to show that the norms are in fact equal. Let, therefore, $y \in L^q(B, \mathbb{C})$ be such that $\|y\|_q = 1$. Let $\alpha : B \to \mathbb{C}$ be a measurable function such that $|\alpha(t)| = 1$ for $t \in B$ and
$$\alpha(t) y(t) = |y(t)|.$$
Define $x(t) = |y(t)|^{q/p} \alpha(t)$. Then $x \in L^p(B, \mathbb{C})$, and $\|x\|_p = 1$. We compute, using $q/p + 1 = q$:
$$(U_q(y))(x) = \int_B x y = \int_B |y|^{q/p} |y| = \int_B |y|^q = 1,$$
thus proving the equality in (*). Thus, $U_q$ is an isometric embedding, and since $L^q(B, \mathbb{C})$ is complete, the image of $U_q$ is closed. We need to show this map is onto. However, if $U_q$ is not onto, then by Proposition 5.4, there exists a non-zero $\omega \in ((L^p(B, \mathbb{C}))^*)^*$ such that $\omega(U_q(y)) = 0$ for all $y \in L^q(B, \mathbb{C})$. However, since $L^p(B, \mathbb{C})$ is uniformly convex by Theorem 7.1, it is reflexive by Theorem 6.7, and hence $\omega = \nu(x)$ for some non-zero $x \in L^p(B, \mathbb{C})$. We conclude that $(U_p(x))(y) = 0$ for all $y \in L^q(B, \mathbb{C})$, which contradicts the fact that $U_p$ is an isometry. $\square$
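The construction of the norming element $x = |y|^{q/p}\alpha$ can be tried out in a discrete analogue. The following Python sketch replaces $B$ by a finite set with the counting measure, so that the integral (7.4.1) becomes a finite sum; this replacement, the sample vector `raw` and the names `lp_norm` and `norming_element` are assumptions made for the illustration and do not come from the text.

```python
def lp_norm(v, p):
    """l^p norm of a finite complex sequence (counting-measure analogue of L^p)."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def norming_element(y, p):
    """Given y with ||y||_q = 1, return x = |y|^(q/p) * alpha, where alpha*y = |y|."""
    q = p / (p - 1.0)                      # conjugate exponent, 1/p + 1/q = 1
    x = []
    for c in y:
        r = abs(c)
        # alpha has modulus 1 and satisfies alpha * c = |c|; any unit value works if c = 0
        alpha = (c.conjugate() / r) if r > 0 else 1.0
        x.append((r ** (q / p)) * alpha)
    return x

p = 3.0
q = p / (p - 1.0)
raw = [1 + 2j, -0.5j, 0.0, 3.0, -1 - 1j]
scale = lp_norm(raw, q)
y = [complex(c) / scale for c in raw]       # normalise so that ||y||_q = 1

x = norming_element(y, p)
pairing = sum(a * b for a, b in zip(x, y))  # discrete version of (7.4.1)
print(lp_norm(x, p), pairing)
```

Up to rounding, both printed values equal 1, matching $\|x\|_p = 1$ and $(U_q(y))(x) = 1$ in the proof.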
8 Images of Banach spaces under bounded linear maps
8.1 Recall that a map $f : X \to Y$ between topological spaces is open if the image of each open subset of $X$ is open. It is relatively open if its restriction $X \to f[X]$ is open.

In this section, we will write, for subsets $S, T$ of a vector space $V$ and a point $x \in V$,
$$x + S = \{x + y \mid y \in S\}, \quad S + T = \{x + y \mid x \in S,\ y \in T\}$$
and similarly $x - S$, $S - T$ etc. We have an immediate

8.1.1 Observation. A linear map $f : M \to N$ between normed vector spaces is open if and only if the image $f[U]$ of every neighbourhood of zero in $M$ is a neighbourhood of zero in $N$.

8.1.2 Corollary. An open linear map $f : M \to N$ is onto.

Proof. $f[M]$ contains an open neighbourhood $U$ of $o$, so there exists an $\varepsilon > 0$ such that $\|v\| < \varepsilon \Rightarrow v \in f[M]$. But scalar multiples of elements of $U$ are also in $f[M]$ since $f$ is linear, and these include all elements of $N$. $\square$

8.2 Proposition. Let $M, N$ be normed vector spaces and let $f : M \to N$ be an open continuous linear map. If $M$ is complete then $N$ is also complete.

Proof. Let $(y_n)$ be a Cauchy sequence in $N$. Let $B$ be the unit ball in $M$. Then since $f$ is open, there exists a $\delta > 0$ such that $f[B]$ contains all vectors of norm $\le \delta$. By passing to a subsequence, if necessary, we may assume that
$$\|y_n - y_{n+1}\| < \frac{1}{2^n}.$$
Now $f$ is onto, so there is an $x_1 \in M$ such that $f(x_1) = y_1$. By induction, then, we may choose $x_n$ such that $f(x_n) = y_n$ and
$$\|x_n - x_{n+1}\| < \frac{1}{2^n \delta}.$$
Then $(x_n)$ is a Cauchy sequence. Let $x = \lim x_n$. Then $f(x) = \lim y_n$ by continuity. $\square$

8.3 Lemma. Let $M, M_1$ be normed vector spaces such that $M$ is complete. Let $f : M \to M_1$ be a continuous linear map such that for each neighbourhood $U$ of $o$ in $M$ the closure $\overline{f[U]}$ of the image is a neighbourhood of $o$ in $M_1$. Then for each neighbourhood $U$ of $o$ the image $f[U]$ is a neighbourhood of $o$ (and hence $f$ is open).
Proof. Choose a neighbourhood $U$ of $o$ and an $\alpha > 0$ such that $\{x \in M \mid \|x\| \le \alpha\} \subseteq U$. Let
$$U_n = \left\{x \ \Big|\ \|x\| \le \frac{\alpha}{2^n}\right\}, \quad V_n = \overline{f[U_n]}.$$
Thus, every $V_n$ is a neighbourhood of $o$ in $M_1$. We will prove that $f[U]$ is a neighbourhood of zero by showing that $V_1 \subseteq f[U]$. To this end, let $y \in V_1$ be arbitrary; we look for an $x \in U$ such that $y = f(x)$. We will find inductively $x_k \in U_k$, $k = 1, 2, \dots$, such that for all $n$,
$$y - \sum_{k=1}^{n} f(x_k) \in V_{n+1} \quad \text{and} \quad \left\|y - \sum_{k=1}^{n} f(x_k)\right\| < \frac{1}{n}. \qquad (*)$$
First, since $(y - V_2) \cap \{z \mid \|y - z\| < 1\}$ is a neighbourhood of $y$ and $y$ is in the closure of $f[U_1]$, we have a
$$y_1 \in (y - V_2) \cap \{z \mid \|y - z\| < 1\} \cap f[U_1],$$
that is, a $y_1 = f(x_1)$ with $x_1 \in U_1$ such that $\|y - f(x_1)\| < 1$ and $y_1 = y - v$ with $v \in V_2$, that is, $y - f(x_1) = v \in V_2$.

Now suppose we already have $x_1, \dots, x_n$ such that (*) holds. Then $y - \sum_{k=1}^{n} f(x_k) \in \overline{f[U_{n+1}]}$, and since
$$\left(\left(y - \sum_{k=1}^{n} f(x_k)\right) - V_{n+2}\right) \cap \left\{z \ \Big|\ \left\|y - \sum_{k=1}^{n} f(x_k) - z\right\| < \frac{1}{n+1}\right\}$$
is a neighbourhood of $y - \sum_{k=1}^{n} f(x_k)$, there is an $x_{n+1} \in U_{n+1}$ such that
$$\left(y - \sum_{k=1}^{n} f(x_k)\right) - f(x_{n+1}) = y - \sum_{k=1}^{n+1} f(x_k) \in V_{n+2}, \quad \text{and}$$
$$\left\|y - \sum_{k=1}^{n} f(x_k) - f(x_{n+1})\right\| = \left\|y - \sum_{k=1}^{n+1} f(x_k)\right\| < \frac{1}{n+1},$$
which are the conditions (*) with $n + 1$ replacing $n$.

Since $x_k \in U_k$, clearly, the sequence
$$\left(\sum_{k=1}^{n} x_k\right)_n$$
is Cauchy, and if we denote its limit by $x$, then
$$f(x) = \lim_{n \to \infty} \sum_{k=1}^{n} f(x_k) = y.$$
Finally, $\|x\| \le \sum_{k=1}^{\infty} \alpha/2^k = \alpha$, and hence $x \in U$. $\square$
Recall the definition of a meager set (set of the first category) from 3.3 of Chapter 9, and Theorem 3.4 of Chapter 9 stating that no complete space is meager in itself (Baire's Category Theorem).

8.4 Theorem. Let $M, N$ be normed vector spaces, $M$ a complete one. Let $f : M \to N$ be a continuous linear map. Then there holds precisely one of the following statements.
(1) $f[M]$ is complete and $f$ is relatively open.
(2) $f[M]$ is meager in itself and $f$ is not open; moreover, there is a neighbourhood $U$ of $o$ such that $f[U]$ is nowhere dense in $f[M]$.

Proof. The two alternatives exclude each other by Baire's Category Theorem (Theorem 3.4 of Chapter 9).

I. Suppose there is a neighbourhood $U$ of zero such that $f[U]$ is nowhere dense in $f[M]$. Then $f$ is obviously not open. Furthermore, $M = \bigcup_{n=1}^{\infty} nU$ and hence $f[M] = \bigcup_{n=1}^{\infty} n f[U]$. Obviously, if $A$ is nowhere dense, then $nA$ is nowhere dense also. Thus, $f[M]$ is meager in itself.

II. Let none of the $f[U]$ with $U$ a neighbourhood of zero be nowhere dense. Thus, each such $\overline{f[U]}$ (closure in $f[M]$) is a neighbourhood of some of its points. We will prove that in fact it is a neighbourhood of $o$, and the statement will follow from Proposition 8.2 and Lemma 8.3.

Let $U$ be a neighbourhood of zero in $M$. By continuity of the addition we have a neighbourhood $V'$ of zero such that $V' + V' \subseteq U$, and by continuity of the map $x \mapsto -x$, $-V'$ is a neighbourhood of zero, and finally also $V = V' \cap (-V')$ is a neighbourhood of $o$. The set $\overline{f[V]}$ is a neighbourhood of a point $y_0$, and since $V = -V$, it is also a neighbourhood of $-y_0$. Consider the homeomorphism $\varphi = (y \mapsto y - y_0)$. It maps $\overline{f[V]}$ onto $\overline{f[V]} - y_0$, and since
$$\overline{f[V]} - y_0 \subseteq \overline{f[V]} + \overline{f[V]} \subseteq \overline{f[U]},$$
we have $\varphi(\overline{f[V]}) \subseteq \overline{f[U]}$, and since $\varphi(y_0) = o$ and $\varphi$ is a homeomorphism, $\overline{f[U]}$ is a neighbourhood of $o$. $\square$
8.5 As an immediate corollary we obtain an important

Theorem. Let $M, N$ be Banach spaces and let $f : M \to N$ be a bijective continuous linear map. Then $f$ is a homeomorphism.

Proof. Since $f[M] = N$ is complete, alternative (2) of Theorem 8.4 is excluded by Baire's Category Theorem. Hence $f$ is open, and $f^{-1}$ is continuous. $\square$

8.6 Note that, somewhat surprisingly, we have in Theorem 8.5 the continuity of $f^{-1}$ implied by the continuity of $f$ (reminiscent of the mappings between compact Hausdorff spaces, and, even more basically, the behaviour of algebraic homomorphisms). We will present, as a consequence of Theorem 8.5, another case of an "inverted implication".

Let $X_1, X_2$ be metric spaces; consider a mapping $f : X_1 \to X_2$ and its graph
$$G = \{(x, f(x)) \mid x \in X_1\} \subseteq X_1 \times X_2.$$
If $f$ is continuous then the graph $G$ is obviously closed in $X_1 \times X_2$ (the sequence $(x_n, f(x_n))$ either converges to $(\lim x_n, f(\lim x_n))$ or does not converge at all). Equally obviously, closedness of the graph $G$ does not imply continuity (consider a discontinuous one-one onto map $f$ with continuous $f^{-1}$). For Banach spaces we have, however,

8.6.1 Theorem. (The Closed Graph Theorem) Let $M_i$, $i = 1, 2$, be Banach spaces and let $f : M_1 \to M_2$ be a linear map with a closed graph $G = \{(x, f(x)) \mid x \in M_1\} \subseteq M_1 \times M_2$. Then $f$ is continuous.

Proof. Consider the space $M_1 \times M_2$ with the norm
$$\|(x_1, x_2)\| = \max(\|x_1\|, \|x_2\|).$$
This is a Banach space (a product of two complete metric spaces is complete). The graph $G = \{(x, f(x)) \mid x \in M_1\}$ is a closed vector subspace of $M_1 \times M_2$ and hence it is, again, a Banach space. Now the projection $p_1 = ((x, y) \mapsto x) : G \to M_1$ is a continuous map. It is linear, one-one and onto, and hence, by Theorem 8.5, the inverse $p_1^{-1} : M_1 \to G$ is continuous. Since also $p_2 = ((x, y) \mapsto y) : G \to M_2$ is continuous, the composition $f = p_2 \circ p_1^{-1} : M_1 \to M_2$ is continuous. $\square$
8.6.2 Remark: The completeness hypothesis in Theorem 8.6.1 is essential. Consider the space $C(\langle a, b\rangle)$ of continuous real functions on a closed interval $\langle a, b\rangle$ with the norm $\|\varphi\| = \max_{t \in \langle a, b\rangle} |\varphi(t)|$. Take the subspace $M \subseteq C(\langle a, b\rangle)$ consisting of the functions with a continuous derivative (one-sided in $a$ and $b$). Now $M$ is a normed vector space (not complete, though) and the convergence in $M$ is uniform convergence. By Theorem 5.3 of Chapter 1, if functions $x_n$ converge to $x$ and if the derivatives $x_n'$ converge to $y$ then $x'$ exists and $x' = y$. Thus, the mapping $D = (x \mapsto x') : M \to C(\langle a, b\rangle)$ of taking the derivative has a closed graph. Obviously, however, $D$ is not continuous; in fact it is continuous at no point $x \in M$.
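The discontinuity of $D$ can be seen on a concrete sequence. The following Python sketch uses the functions $x_n(t) = \sin(nt)/n$ on $\langle 0, \pi\rangle$, which are an assumed example not taken from the text, and approximates the maximum norms on a uniform grid; the helper `sup_norm` is hypothetical.

```python
import math

def sup_norm(func, a, b, steps=20000):
    """Approximate max |func(t)| on [a, b] by sampling a uniform grid."""
    return max(abs(func(a + (b - a) * i / steps)) for i in range(steps + 1))

a, b = 0.0, math.pi
for n in (1, 10, 100):
    x_n = lambda t, n=n: math.sin(n * t) / n   # x_n tends to 0 uniformly
    dx_n = lambda t, n=n: math.cos(n * t)      # its derivative D(x_n)
    print(n, sup_norm(x_n, a, b), sup_norm(dx_n, a, b))
```

The norms of $x_n$ shrink like $1/n$ while the norms of $D(x_n)$ stay equal to 1, so $D$ is not continuous at $o$; by linearity, it is then continuous at no point.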
9 Exercises
(1) Prove that any finite-dimensional vector space $V$ with an inner product is a Hilbert space. Prove that the norms associated with any two inner products on $V$ define equivalent metrics.

(2) Prove that if $f : H \to H'$ is an isometric isomorphism of Banach spaces where $H, H'$ are Hilbert spaces, then $f(u) \cdot f(v) = u \cdot v$. [Hint: there is a formula expressing the dot product from its associated norm.]

(3) Prove that the closure of the unit ball $\Omega(o, 1)$ in a Hilbert space $H$ is compact if and only if $H$ is finite-dimensional.

(4) Give an example of a bounded linear operator $F : H \to H$, where $H$ is a Hilbert space, whose image is not closed.

(5) Prove that the symbol $\|f\|$ defined in 3.5.1 is a norm on the space $L(B, B')$ of continuous linear maps $B \to B'$ for Banach spaces $B, B'$.

(6) Prove the statement of 3.4 in detail.

(7) Let $V$ be a finite-dimensional Hilbert (= inner product) space over $\mathbb{C}$ and let $f : V \to V$ be a Hermitian operator. Define, for $x, y \in V$, $B(x, y) = f(x) \cdot y$. Prove that $B$ is a Hermitian form.

(8) Let $H, J$ be Hilbert spaces. A linear operator $F : H \to J$ is called compact if $F[B]$ is compact where $B = \{x \in H \mid \|x\| \le 1\}$.
(a) Prove that if $F$ is compact then for any bounded closed subset $S \subseteq H$, $F[S]$ is compact.
(b) Prove that a compact operator is always bounded.
(c) An operator $F : H \to J$ between Hilbert spaces is called finite if its image is finite-dimensional. Prove that a finite operator is always compact.
(d) Give an example of a compact operator between Hilbert spaces which is not finite.

(9) Prove that if $F : H \to J$ is a compact linear operator between Hilbert spaces, then there exists an $x \in H$ such that $\|x\| = 1$ and $\|F(x)\| = \|F\| \cdot \|x\|$. [Hint: Consider $y \in F[B]$ to be of maximal norm (note that the norm is continuous and $F[B]$ is compact).]

(10) Let $F : H \to J$ be a compact linear operator where $H$, $J$ are Hilbert spaces.
(a) Prove that there exist orthonormal systems $(e_i)_{i \in \mathbb{N}}$, $(f_i)_{i \in \mathbb{N}}$ in $H$ and $J$ respectively and numbers $s_1 \ge s_2 \ge \cdots$ such that
$$F(e_n) = s_n f_n \qquad \text{(i)}$$
and
$$F \text{ is } 0 \text{ on the orthogonal complement of the closure of the vector subspace generated by } e_1, e_2, \dots. \qquad \text{(ii)}$$
Prove further that the numbers $s_n$ are uniquely determined and that the orthonormal systems $(e_i)$, $(f_i)$ are uniquely determined up to a scalar multiple if $s_1 > s_2 > \cdots$. The numbers $s_i$ are known as singular values of the operator $F$. [Hint: $s_1 = \|F\|$. Use Exercise (9) and pass to orthogonal complements.]
(b) Prove that
$$\lim_{n \to \infty} s_n = 0. \qquad \text{(iii)}$$
Conversely, prove that if $F : H \to J$ is an operator which satisfies (i), (ii) and (iii), then $F$ is compact.

(11) A compact linear operator $F : H \to J$ between Hilbert spaces is called trace class if its singular values satisfy
$$\sum_{n=1}^{\infty} s_n < \infty.$$
Prove that when an operator $F : H \to H$ for a Hilbert space $H$ is trace class, and, for every Hilbert basis $(e_i)_{i \in I}$ of $H$,
$$F(e_j) = \sum_{i \in I} a_{ij} e_i,$$
then the series $\sum_{i \in I} a_{ii}$ is absolutely convergent and does not depend on the choice of Hilbert basis. This number is denoted by $\mathrm{tr}(F)$ and called the trace of $F$.

(12) A compact linear operator $F : H \to J$ between Hilbert spaces is called Hilbert-Schmidt if its singular values satisfy
$$\sum_{n=1}^{\infty} (s_n)^2 < \infty.$$
Let $(e_i)_{i \in I}$ be a Hilbert basis of $H$. For two Hilbert-Schmidt operators $F, G : H \to J$, define
$$F \cdot G = \sum_{i \in I} F(e_i) \cdot G(e_i).$$
Prove that this is a well-defined inner product on the space $HS(H, J)$ of all Hilbert-Schmidt linear operators, and that moreover $HS(H, J)$ with this inner product is a Hilbert space.

(13) Prove that if $L$ is a uniformly convex Banach space and $0 \ne h \in L^*$, then there exists a $z \in L$ such that $\|z\| = 1$ and $h(z) = \|h\|$. [Hint: Choose a sequence $z_n$ in the unit ball of $L$ such that $h(z_n) \to \|h\|$. Uniform convexity implies that it is Cauchy.]

(14) Let $B$ be a Borel set in $\mathbb{R}^n$ such that $\mu(B) > 0$. Prove that $L^1(B)$, $L^1(B, \mathbb{C})$, $L^\infty(B)$, $L^\infty(B, \mathbb{C})$ are not uniformly convex. [Hint: It suffices to consider the "baby" version - see Exercise (20) of Chapter 5.]

(15) Let $F : L \to M$ be a bounded operator where $L, M$ are Banach spaces, and the vector space $M/F[L]$ is finite-dimensional. Prove that then $F[L]$ is closed in $M$. [Hint: There is a finite-dimensional vector space $V$ and an extension $\tilde{F} : L \oplus V \to M$ which is onto, and maps $V$ isomorphically onto $M/F[L]$. Now $\tilde{F}$ is open and the image, under $\tilde{F}$, of the open subset $L \times (V \smallsetminus \{0\})$ is $M \smallsetminus F[L]$.]
17 A Few Applications of Hilbert Spaces

In the previous chapter we developed, with the help of analysis, an understanding of Hilbert (and Banach) spaces as a kind of satisfactory generalization of linear algebra to infinite-dimensional spaces. In particular, we developed modified notions of duals and bases which behave well in this situation. The real force of Hilbert spaces, however, is that they naturally occur in a variety of contexts. In physics, specifically in quantum mechanics, a (complex) Hilbert space is the basic structure on a state space, which is the fundamental concept of the theory.

In this chapter, we will remain in mathematics, and give examples of Hilbert spaces which occur as certain spaces of functions (generally known as $L^2$-spaces). We will then explore two particular roles $L^2$-spaces play. First, they provide us with a useful technical tool. To illustrate this, we will prove the Radon-Nikodym Theorem on derivatives of measures. We will then apply that result to proving a Lebesgue integral version of the Fundamental Theorem of Calculus, which is ultimately a very satisfactory, but also very difficult theorem. In some sense, this theorem brings the story of the Lebesgue integral, which we used extensively (although often implicitly) throughout this book, to a conclusion.

The second use of $L^2$-spaces, and generally Hilbert spaces, is as a rigorous foundation for modelling intuitive geometric ideas. We will illustrate this on the concepts of Fourier series and the Fourier transformation.
1 Some preliminaries: Integration by a measure
In most of this book, we worked with the Lebesgue integral which we constructed by passing to limits from the Riemann integral. As a result, we obtained a construction of the Lebesgue measure. At this point, however, we need to talk about measures in greater generality. In this section, we summarize the basics of integration theory with respect to more general measures.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7 17, © Springer Basel 2013
1.1 To avoid excessive definitions, we will only consider so-called Borel measures. For the completely abstract concepts of measure and integration, the reader is referred to [18]. Let $X \subseteq \mathbb{R}^n$ be a Borel subset. By a Borel measure on $X$ we shall mean a map $\mu$ which assigns to each Borel subset $S \subseteq X$ a number $\mu(S) \in [0, \infty]$. We require that for disjoint subsets $S_1, S_2, \dots, S_n, \dots$ of $X$, we have
$$\mu(S_1 \cup S_2 \cup \dots) = \sum_{n=1}^{\infty} \mu(S_n)$$
(a property known as $\sigma$-additivity). Note that when $E_1 \subseteq E_2 \subseteq \dots$, then $\sigma$-additivity, applied to the sets $E_{n+1} \smallsetminus E_n$, implies
$$\mu(E_n) \nearrow \mu\left(\bigcup E_n\right).$$
Example: By Proposition 3.2 and Corollary 3.4.1 of Chapter 4, we know that the Lebesgue measure on $\mathbb{R}^n$ can be considered as a Borel measure on $\mathbb{R}^n$ (if we ignore the fact that it is defined on even more general sets).
1.2 Definition and basic facts about integration of non-negative real functions with respect to a measure

First, by a simple function on $X$ we mean a function expressed in the form
$$s = \sum_{i=1}^{n} a_i c_{A_i} \qquad (1.2.1)$$
where $A_i \subseteq X$ are Borel subsets, and $0 \le a_i < \infty$. We define the integral of a simple function with respect to a Borel measure $\mu$ by
$$\int_X s\,d\mu = \sum_{i=1}^{n} a_i \mu(A_i). \qquad (1.2.2)$$
If $\mu(A_i) = \infty$ and $a_i = 0$, we set the $i$'th summand equal to 0. Note carefully that a priori, the integral of $s$ as defined may depend on the expression (1.2.1). However, it doesn't (see Exercise (1)). Even without knowing that fact, however, we define for a Borel measurable function $f : X \to [0, \infty]$ (recall 4.4 of Chapter 4)
$$\int_X f\,d\mu = \sup \int_X s\,d\mu \qquad (1.2.3)$$
where the supremum is taken over all simple functions (1.2.1) such that $s \le f$. Note: For a Borel set $B \subseteq X$, $\int_B f\,d\mu$ may be defined simply as the integral of the restriction of $f$ by the restriction of $\mu$ to $B$.

1.2.1 Lemma. For any Borel function $f : X \to [0, \infty]$ there exist simple functions $s_n$ such that $s_n \nearrow f$.

Proof. Put $s_n(x) = k/2^n$ when $0 \le k \le n2^n$ and $x$ is such that
$$k/2^n \le f(x) < (k + 1)/2^n,$$
and put $s_n(x) = n$ when $f(x) \ge n + 1/2^n$. $\square$
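The approximation in Lemma 1.2.1 amounts to rounding $f(x)$ down to a multiple of $2^{-n}$ and capping the result at $n$. The following Python sketch evaluates $s_n$ at single points under that reading; the function name `dyadic_approx` and the sample function `f` are assumptions made for the illustration.

```python
import math

def dyadic_approx(value, n):
    """s_n at a point where f takes `value`: the largest k/2^n <= value, capped at n."""
    if value >= n:                 # also covers value == infinity
        return float(n)
    return math.floor(value * 2 ** n) / 2 ** n

f = lambda x: x * x                # assumed sample function on [0, infinity)
for x in (0.3, 1.7, 2.5):
    print(x, [dyadic_approx(f(x), n) for n in range(1, 8)])
```

Each printed list is non-decreasing and approaches $f(x)$ from below, which is the pointwise statement $s_n \nearrow f$; at points where $f(x) = \infty$ the values are simply $1, 2, 3, \dots$.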
1.2.2 Corollary. When $\mu$ is the Lebesgue measure and $f : X \to [0, \infty]$ is Borel measurable, then (1.2.3) is the Lebesgue integral of $f$ over $X$.

Proof. Use Lemma 1.2.1, the Lebesgue Monotone Convergence Theorem (Theorem 1.1 of Chapter 5), and recall definition 3.1 of Chapter 5. $\square$

1.2.3 Theorem. (the Lebesgue Monotone Convergence Theorem for a Borel measure) Let $f_n \nearrow f$, where $f_n : X \to [0, \infty]$ are Borel-measurable functions. Then
$$\lim_{n \to \infty} \int_X f_n\,d\mu = \int_X f\,d\mu.$$

Proof. First note that
$$\int_X f_n\,d\mu \le \int_X f_{n+1}\,d\mu,$$
so the limit makes sense. If $s \le f_n$ is a simple function, then clearly $s \le f$. This implies the inequality $\le$. For the opposite inequality, let $s \le f$ be a simple function and let $0 < c < 1$. Let $E_n$ be the set of all $x \in X$ such that $c\,s(x) \le f_n(x)$. Clearly, $E_n \subseteq E_{n+1}$, and $\bigcup E_n = X$, so
$$c \int_X s\,d\mu = \int_X cs\,d\mu = \lim_{n \to \infty} \int_{E_n} cs\,d\mu \le \lim_{n \to \infty} \int_{E_n} f_n\,d\mu \le \lim_{n \to \infty} \int_X f_n\,d\mu.$$
(The second equality follows from $\sigma$-additivity.) Now taking the supremum of the left-hand side of the inequality we just derived over all $0 < c < 1$ and all simple functions $s \le f$, we obtain the inequality $\ge$ of the statement. $\square$
430
17 A Few Applications of Hilbert Spaces
1.2.4 Lemma. When $f, g: X \to [0,\infty]$ are Borel measurable functions, and $c \in [0,\infty)$, we have
\[ \int_X (cf)\,d\mu = c\int_X f\,d\mu, \qquad \int_X (f+g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \]

Proof. The first equality is obvious, since simple functions $\le f$ correspond bijectively to simple functions $\le cf$ by multiplication by $c$. For the second equality, let, by Lemma 1.2.1, $s_n \nearrow f$, $s_n' \nearrow g$ where $s_n$, $s_n'$ are simple functions. We have $s_n + s_n' \nearrow f + g$, so by the Lebesgue Monotone Convergence Theorem,
\[ \int_X (f+g)\,d\mu = \lim \int_X (s_n + s_n')\,d\mu = \lim \int_X s_n\,d\mu + \lim \int_X s_n'\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \]
□
1.2.5 Comment Let $\mu$ be a Borel measure on $X$ and $u: X \to [0,\infty]$ a Borel-measurable function. Then it follows from Lemma 1.2.4 and the Lebesgue Monotone Convergence Theorem that
\[ E \mapsto \int_E u\,d\mu \]
is a Borel measure on $X$; this Borel measure is often denoted by $u\mu$.
1.3 Integration of complex functions over a measure

Let $\mu$ be a Borel measure. A Borel-measurable function $f: X \to \mathbb{C}$ is called $\mu$-integrable if
\[ \int_X |f|\,d\mu < \infty. \]
(This is clearly equivalent to requiring that $\mathrm{Re}(f)^+$, $\mathrm{Im}(f)^+$, $\mathrm{Re}(f)^-$ and $\mathrm{Im}(f)^-$ all have finite integrals.) We then put
\[ \int_X f\,d\mu = \int_X \mathrm{Re}(f)^+\,d\mu - \int_X \mathrm{Re}(f)^-\,d\mu + i\int_X \mathrm{Im}(f)^+\,d\mu - i\int_X \mathrm{Im}(f)^-\,d\mu. \]
By Proposition 2.7 of Chapter 5, a Borel-measurable function is integrable by the Lebesgue measure if and only if it has a finite Lebesgue integral, and the integral just defined equals its Lebesgue integral.

1.3.1 Lemma. Let $f, g: X \to \mathbb{C}$ be $\mu$-integrable functions, and let $\alpha \in \mathbb{C}$. Then $\alpha f$, $f + g$ are $\mu$-integrable and we have
\[ \int_X \alpha f\,d\mu = \alpha\int_X f\,d\mu, \qquad \int_X (f+g)\,d\mu = \int_X f\,d\mu + \int_X g\,d\mu. \]

Proof. The second formula immediately follows from Lemma 1.2.4. To prove the first formula, one first notes that for $\alpha \ge 0$ it follows from Lemma 1.2.4, then one checks it for $\alpha = -1$ and $\alpha = i$, and uses the second formula to pass to the case of $\alpha$ arbitrary. □

1.3.2 Theorem. (the Lebesgue Dominated Convergence Theorem) Suppose $f_n: X \to \mathbb{C}$ are Borel measurable functions and assume that $f_n \to f$, and there exists a $\mu$-integrable function $g: X \to [0,\infty]$ such that for all $n$, $|f_n| \le g$. Then
\[ \lim_{n\to\infty}\int_X f_n\,d\mu = \int_X f\,d\mu. \]

Proof. We have $|f_n - f| \le 2g$; so by Fatou's lemma (the proof of Lemma 8.5.1 of Chapter 5 works for any Borel measure),
\[ \int_X 2g\,d\mu \le \liminf_{n\to\infty}\int_X (2g - |f - f_n|)\,d\mu = \liminf_{n\to\infty}\Big(\int_X 2g\,d\mu - \int_X |f - f_n|\,d\mu\Big) = \int_X 2g\,d\mu - \limsup_{n\to\infty}\int_X |f - f_n|\,d\mu. \]
Subtracting $\int_X 2g\,d\mu$ from both sides, and using that the integrals are non-negative,
\[ \limsup_{n\to\infty}\int_X |f - f_n|\,d\mu = 0 \]
and hence
\[ \lim_{n\to\infty}\int_X |f - f_n|\,d\mu = 0. \]
An analogue of Lemma 8.4.1 of Chapter 5 also holds by the same proof. Therefore,
\[ \lim_{n\to\infty}\int_X (f - f_n)\,d\mu = 0, \]
which implies our statement by Lemma 1.3.1. □
2 The spaces $L^p_\mu(X,\mathbb{C})$ and the Radon-Nikodym Theorem

2.1 The spaces $L^p_\mu(X,\mathbb{C})$
The definition of spaces $L^p$, $1 \le p \le \infty$, with respect to an arbitrary Borel measure parallels completely the discussion of the case of the Lebesgue measure in Section 8 of Chapter 5. In particular, let $X \subseteq \mathbb{R}^n$ be a Borel subset and let $\mu$ be a Borel measure on $X$. Let, for $1 \le p < \infty$, $L^p_\mu(X,\mathbb{C})$ denote the set of equivalence classes of all Borel-measurable functions $f: X \to \mathbb{C}$ such that
\[ \int_X |f|^p\,d\mu < \infty \tag{2.1.1} \]
with respect to the equivalence relation $\sim$ of being equal almost everywhere (i.e. $f \sim g$ if and only if $\mu(\{x \in X \mid f(x) \ne g(x)\}) = 0$). The relation $\sim$ is a congruence, so $L^p_\mu(X,\mathbb{C})$ inherits a structure of a $\mathbb{C}$-vector space from the set of all functions satisfying (2.1.1). Again, elements of $L^p_\mu(X,\mathbb{C})$ are often (slightly imprecisely but usually harmlessly) identified with their representative functions. Again, we define $\|f\|_p$ to be the $p$'th root of the left-hand side of (2.1.1). For $p = \infty$, we define, again, $\|f\|_\infty$ to be the infimum of $M \le \infty$ such that $|f(x)| \le M$ almost everywhere, and we define $L^\infty_\mu(X,\mathbb{C})$ to be the quotient of the vector space of functions with $\|f\|_\infty < \infty$ by the congruence of being equal almost everywhere.

An analogue of Minkowski's inequality (Theorem 8.2 of Chapter 5) holds by the same proof, thus providing us with a norm on $L^p_\mu(X,\mathbb{C})$. The proof of Theorem 8.5.2 of Chapter 5 extends to the case of Borel measures to prove that the spaces $L^p_\mu(X,\mathbb{C})$ are complete, and hence are Banach spaces. In fact, all the theory of the spaces $L^p(B)$, $L^p(B,\mathbb{C})$ we built up in Chapter 16 extends verbatim to the case of the spaces $L^p_\mu(X)$, $L^p_\mu(X,\mathbb{C})$. In particular, for $1 < p < \infty$, these Banach spaces are uniformly convex and hence are reflexive; we simply didn't want to complicate the discussion in Chapter 16 with unnecessary generality where we didn't need it. (However, see Exercises 6, 7 below.)

It is worthwhile pointing out, though, that the case of Borel measures gives some interesting examples which we haven't seen before: Let $S$ be a countable set with the measure in which every element has measure 1. Then the space $L^p(S)$ is isometric to the space of all sequences $(a_n)_{n\in\mathbb{N}}$ such that
\[ \sum_{n=1}^{\infty} |a_n|^p < \infty \]
with the norm
\[ \|(a_n)_n\| = \Big(\sum |a_n|^p\Big)^{1/p}. \]
Such spaces are denoted by $\ell^p$ ($\ell^p(\mathbb{C})$ in the complex case).

As before, a special role belongs to the spaces $L^2_\mu(X)$, $L^2_\mu(X,\mathbb{C})$. By the Cauchy-Schwarz inequality, for $f, g \in L^2_\mu(X,\mathbb{C})$,
\[ \int_X f\bar{g}\,d\mu < \infty, \tag{2.1.2} \]
so the formula (2.1.2) defines an inner product on $L^2_\mu(X,\mathbb{C})$. Since the norm comes from the inner product, $L^2_\mu(X,\mathbb{C})$ with the inner product (2.1.2) is a Hilbert space (and similarly, $L^2_\mu(X)$ is a real Hilbert space).
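For finitely supported sequences the $\ell^p$ norm above is a finite sum, so Minkowski's inequality can be checked directly. The following is a small illustrative sketch; the function name `lp_norm` and the sample vectors are choices made here, not notation from the text.

```python
def lp_norm(a, p):
    """The l^p norm (sum |a_n|^p)^(1/p) of a finite sequence, for 1 <= p < infinity."""
    return sum(abs(x) ** p for x in a) ** (1.0 / p)

# Two finitely supported sequences and their termwise sum.
a = [1.0, -2.0, 0.5]
b = [0.5, 1.0, -3.0]
s = [x + y for x, y in zip(a, b)]

# Minkowski's inequality ||a + b||_p <= ||a||_p + ||b||_p for a few exponents.
minkowski_holds = all(
    lp_norm(s, p) <= lp_norm(a, p) + lp_norm(b, p) + 1e-12 for p in (1, 2, 3)
)
```

This only tests the triangle inequality on one pair of vectors; it is an illustration of the statement, not evidence for the general theorem.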
2.2 The Radon-Nikodym Theorem

Let $\mu$, $\nu$ be Borel measures on a Borel set $X \subseteq \mathbb{R}^n$. We say that $\nu$ is absolutely continuous with respect to $\mu$ if for every Borel set $S \subseteq X$, $\mu(S) = 0$ implies $\nu(S) = 0$.

Theorem. Suppose that $\mu$, $\nu$ are Borel measures on a Borel set $X \subseteq \mathbb{R}^n$, $\mu(X) < \infty$, $\nu(X) < \infty$ and $\nu$ is absolutely continuous with respect to $\mu$. Then there exists a Borel measurable function $h: X \to [0,\infty)$ such that $\nu = h\mu$ (see Comment 1.2.5). The function $h$ is called the Radon-Nikodym derivative of $\nu$ by $\mu$.

Proof. Consider the measure $\rho = \mu + \nu$. We then have $\rho(X) < \infty$ and for every Borel set $S \subseteq X$, $\nu(S) \le \rho(S)$. Then every function $f \in L^2_\rho(X,\mathbb{C})$ is $\rho$-integrable, hence $\nu$-integrable. Define
\[ I(f) = \int_X f\,d\nu. \]
Clearly, $I: L^2_\rho(X,\mathbb{C}) \to \mathbb{C}$ is a $\mathbb{C}$-linear map. We claim that $I$ is continuous. By Theorem 3.5 of Chapter 16, it suffices to prove that there exists a number $K$ such that $\|f\|_{\rho,2} \le 1$ implies $|I(f)| \le K$. Let $S = \{x \in X \mid |f(x)| \ge 1\}$. Then
\[ |I(f)| \le \int_S |f|\,d\nu + \int_{X \smallsetminus S} |f|\,d\nu \le \int_S |f|^2\,d\nu + \nu(X \smallsetminus S) \le \|f\|^2_{\rho,2} + \nu(X \smallsetminus S) \le 1 + \nu(X). \]
By the Riesz Representation Theorem 3.6.1 of Chapter 16, there exists a $g \in L^2_\rho(X,\mathbb{C})$ such that for all $f \in L^2_\rho(X,\mathbb{C})$,
\[ \int_X f\,d\nu = \int_X fg\,d\rho. \tag{2.2.1} \]
But we claim that in fact
\[ \rho(\{x \mid 0 \le g < 1\}) = \rho(X). \tag{2.2.2} \]
In effect, $\rho(S) > 0$ where $S$ is any of the sets $\{x \in X \mid \mathrm{Im}(g) > 1/n\}$, $\{x \in X \mid \mathrm{Im}(g) < -1/n\}$, $\{x \in X \mid \mathrm{Re}(g) < -1/n\}$, $\{x \in X \mid \mathrm{Re}(g) \ge 1\}$, would violate (2.2.1) for $f = c_S$ (in particular, for $S = \{x \in X \mid \mathrm{Re}(g) \ge 1\}$, we would get $\nu(S) \ge \rho(S) = \mu(S) + \nu(S)$, so $\mu(S) = 0$ which contradicts the assumption $\rho(S) > 0$ by absolute continuity). Thus the above sets $S$ have $\rho(S) = 0$, which proves (2.2.2). Now rewrite (2.2.1) as
\[ \int_X f(1-g)\,d\nu = \int_X fg\,d\mu. \tag{2.2.3} \]
Put $h = g/(1-g)$ where defined, and $h = 0$ elsewhere. Now let
\[ E_n = \{x \in X \mid g(x) < 1 - 1/n\}. \]
Then for a Borel set $S \subseteq E_n$, $f = c_S/(1-g)$ is bounded non-negative Borel-measurable, and hence is in $L^2_\rho(X,\mathbb{C})$, so applying (2.2.3) gives $\nu(S) = \int_S h\,d\mu$. If $S \subseteq X \smallsetminus \bigcup E_n$, then $\rho(S) = 0$ hence $\mu(S) = \nu(S) = 0$, so $\nu(S) = \int_S h\,d\mu$. Now any Borel subset of $X$ is a countable union of sets for which the statement was just proved. □
2.3 The following statement will be useful in the next section.

Lemma. Let $X \subseteq \mathbb{R}^n$ be a Borel set, and let $\mu$, $\nu$ be Borel measures on $X$ such that $\nu(X) < \infty$. Then $\nu$ is absolutely continuous with respect to $\mu$ if and only if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $\mu(S) < \delta$ implies $\nu(S) < \varepsilon$.
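Proof. Let the $\delta$-$\varepsilon$ condition hold. Then a set $S$ with $\mu(S) = 0$ satisfies the hypothesis for every $\delta$, and hence $\nu(S) < \varepsilon$ for every $\varepsilon > 0$. Conversely, let $\nu$ be absolutely continuous with respect to $\mu$. Suppose the $\delta$-$\varepsilon$ condition does not hold, i.e. there exists an $\varepsilon > 0$ and sets $E_i$ with $\mu(E_i) \le 1/2^i$ such that $\nu(E_i) \ge \varepsilon$. Put $A_i = E_i \cup E_{i+1} \cup \cdots$. Then $\nu(A_i) \ge \varepsilon$, $A_i \supseteq A_{i+1}$, $\mu(A_i) \le 1/2^{i-1}$, and hence $\mu(\bigcap A_i) = 0$, $\nu(\bigcap A_i) \ge \varepsilon$ by $\sigma$-additivity, contradicting absolute continuity. □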
3 Application: The Fundamental Theorem of (Lebesgue) Calculus

In this section, we derive an application of the Radon-Nikodym Theorem which is the analogue, for the Lebesgue integral, of the Fundamental Theorem of Calculus, stating, roughly, that the derivative and the integral are inverse operations. We begin with the part about the derivative of the integral. This part does not need the Radon-Nikodym Theorem, but it shows that things are much harder than in the case of the Riemann integral. Throughout this section, we will work with real functions; all statements immediately follow for complex-valued functions by treating them as pairs of real functions.
3.1 Absolute continuity of functions

A function $f: \langle a,b\rangle \to \mathbb{R}$ is called absolutely continuous if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for any $m$-tuple of non-empty disjoint intervals $\langle a_i, b_i\rangle \subseteq \langle a,b\rangle$, $i = 1,\dots,m$ which satisfy
\[ \sum_{i=1}^{m} (b_i - a_i) < \delta, \]
we have
\[ \sum_{i=1}^{m} |f(b_i) - f(a_i)| < \varepsilon. \]
An absolutely continuous function is clearly continuous (take $m = 1$).
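A simple source of examples for Definition 3.1 is a Lipschitz function: if $|f(x) - f(y)| \le L|x - y|$, then $\delta = \varepsilon/L$ works. The sketch below checks this for $\sin$ (which is 1-Lipschitz) on one hand-picked family of disjoint intervals; the helper names `total_increment` and `delta_for_lipschitz` are hypothetical and introduced only for this illustration.

```python
import math

def total_increment(f, intervals):
    """Sum of |f(b_i) - f(a_i)| over a family of intervals (a_i, b_i)."""
    return sum(abs(f(b) - f(a)) for a, b in intervals)

def delta_for_lipschitz(eps, lipschitz_constant):
    """For an L-Lipschitz function, delta = eps / L satisfies Definition 3.1."""
    return eps / lipschitz_constant

# Three disjoint intervals of total length 0.045.
intervals = [(0.0, 0.01), (0.5, 0.52), (1.0, 1.015)]
eps = 0.05
delta = delta_for_lipschitz(eps, 1.0)  # sin is 1-Lipschitz
total_length = sum(b - a for a, b in intervals)
increment = total_increment(math.sin, intervals)
```

Here the total length is below $\delta$ and the total increment is below $\varepsilon$, as the definition requires for this particular family.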
3.2 The derivative of an integral

Consider now the situation when $f: \langle a,b\rangle \to \mathbb{R}$ is a Lebesgue integrable function. (Recall that by Theorem 4.4 of Chapter 5, we may assume that $f$ is Borel measurable.) Now define a function $F: \langle a,b\rangle \to \mathbb{R}$ by
\[ F(x) = \int_{\langle a,x\rangle} f. \]
(The integral is with respect to the Lebesgue measure $\lambda$.)

Proposition. The function $F: \langle a,b\rangle \to \mathbb{R}$ is absolutely continuous.

Proof. The measure $|f|\lambda$ (see Comment 1.2.5) is clearly absolutely continuous with respect to $\lambda$. Our statement therefore follows from Lemma 2.3 and Lemma 8.4.1 of Chapter 5. □

Theorem. The function $F$ has a derivative almost everywhere in $\langle a,b\rangle$ and we have $F'(x) = f(x)$ almost everywhere in $\langle a,b\rangle$.

Proof. Recall that for every $\delta > 0$, there exists a continuous function $g: \langle a,b\rangle \to \mathbb{R}$ such that
\[ \int_{\langle a,b\rangle} |f - g| < \delta. \]
(By our definition of the Lebesgue integral, we may replace $f$ by a function in $Z^{up}$, which can then be replaced by a continuous function.) Now our statement is true for $g$ in place of $f$ by the corresponding statement for the Riemann integral (Theorem 8.6 of Chapter 1). Now let us investigate the function
\[ h = f - g. \]
Let $\varepsilon > 0$. Let $B$ be the set of all $x \in \langle a,b\rangle$ for which there exists a $t(x) > 0$ with $a \le x - t(x) < x + t(x) \le b$ such that
\[ \int_{\langle x - t(x),\, x + t(x)\rangle} |h| > \varepsilon\, t(x). \]
Let $K$ be a compact subset of the open set $B$. Then there exist $x_1,\dots,x_N$ such that
\[ \bigcup_{i=1}^{N} (x_i - t(x_i),\, x_i + t(x_i)) \supseteq K. \]
Note that we may find $i_1 < \cdots < i_m$ such that the intervals $(x_{i_j} - t(x_{i_j}),\, x_{i_j} + t(x_{i_j}))$ are disjoint, and
\[ \bigcup_{j=1}^{m} (x_{i_j} - 3t(x_{i_j}),\, x_{i_j} + 3t(x_{i_j})) \supseteq K. \tag{3.2.1} \]
In fact, assume without loss of generality that
\[ t(x_1) \ge \cdots \ge t(x_N). \]
Then it suffices to let $i_{j+1}$ be the smallest number $i > i_j$ such that $(x_i - t(x_i),\, x_i + t(x_i))$ is disjoint from $(x_{i_k} - t(x_{i_k}),\, x_{i_k} + t(x_{i_k}))$ for $k \le j$. By (3.2.1), we see that
\[ \lambda(K) \le 6\sum_{j=1}^{m} t(x_{i_j}) < \frac{6}{\varepsilon}\sum_{j=1}^{m} \int_{\langle x_{i_j} - t(x_{i_j}),\, x_{i_j} + t(x_{i_j})\rangle} |h| \le \frac{6}{\varepsilon}\int_{\langle a,b\rangle} |h| < \frac{6\delta}{\varepsilon}. \]
Since $K \subseteq B$ was an arbitrary compact subset, we conclude
\[ \lambda(B) \le \frac{6\delta}{\varepsilon}. \]
Now the point is that for every $\varepsilon > 0$ we can choose $\delta > 0$ such that $\lambda(B)$ is arbitrarily small. Let
\[ C = \{a \le x \le b \mid |h(x)| \ge \varepsilon\}. \]
Clearly,
\[ \lambda(C) \le \frac{\delta}{\varepsilon}. \]
However, for
\[ x \in \langle a,b\rangle \smallsetminus (B \cup C), \tag{3.2.2} \]
for every $t > 0$ such that $a \le x - t < x + t \le b$, we have for both $J = \langle x - t, x\rangle$ and $J = \langle x, x + t\rangle$
\[ \frac{1}{t}\Big|\int_J f - \int_J g\Big| \le \frac{1}{t}\int_J |h| \le \varepsilon, \]
while $|f(x) - g(x)| < \varepsilon$, and thus
\[ \Big|\Big(\frac{1}{t}\int_J f\Big) - f(x)\Big| < 3\varepsilon \quad \text{for sufficiently small } t > 0 \tag{3.2.3} \]
for $x$ as in (3.2.2). Now the sets $B$, $C$ depend on $\delta$ and $\varepsilon$, but writing $B = B(\delta,\varepsilon)$, $C = C(\delta,\varepsilon)$, (3.2.3) holds for
\[ x \in \langle a,b\rangle \smallsetminus \bigcap_n \big(B(1/n,\varepsilon) \cup C(1/n,\varepsilon)\big), \]
which is almost everywhere. Since $\varepsilon$ was arbitrary, considering $\varepsilon = 1/k$, $k = 1, 2, \dots$, we see that $F'(x) = f(x)$ almost everywhere on $\langle a,b\rangle$, as claimed. □
3.3 The integral of the derivative

Let us now consider the harder direction, namely the integral of the derivative of a function $F: \langle a,b\rangle \to \mathbb{R}$. By Proposition 3.2, it suffices to consider the case when $F$ is absolutely continuous.

Theorem. Let $F: \langle a,b\rangle \to \mathbb{R}$ be absolutely continuous. Then $F'(x)$ exists almost everywhere and for every $x \in \langle a,b\rangle$,
\[ \int_{\langle a,x\rangle} F' = F(x) - F(a). \]
The proof will consist of several steps. First assume that
\[ F \text{ is increasing.} \tag{*} \]
We start with

3.3.1 Lemma. Let (*) hold and let $F$ be absolutely continuous. Let $S \subseteq \langle a,b\rangle$ satisfy $\lambda(S) = 0$. Then $\lambda(F[S]) = 0$.

Proof. Suppose, without loss of generality, $a, b \notin S$. By Exercise (9) of Chapter 5, there exists for every $\delta > 0$ an open set $U \supseteq S$ such that $\lambda(U) < \delta$. Then we may express $U$ as a countable disjoint union of open intervals $(a_i, b_i)$, $i = 1, 2, \dots$ (Lemma 5.2.1 of Chapter 2). By the definition of absolute continuity (applied to $i = 1,\dots,n$ and taking a limit with $n \to \infty$) we see that for every $\varepsilon > 0$ there exists a $\delta > 0$ for which $\lambda(F[S]) < \varepsilon$. Since $\varepsilon > 0$ was arbitrary, $\lambda(F[S]) = 0$, as desired. □
3.3.2 Proof of the Theorem under the hypothesis (*) Since $F$ is increasing and continuous on a compact interval, $F^{-1}$ is continuous on $F[\langle a,b\rangle]$, hence Borel measurable. Hence, we can define a Borel measure $\nu$ on $\langle a,b\rangle$ by
\[ \nu(S) = \lambda(F[S]). \]
Further, by the lemma, $\nu$ is absolutely continuous with respect to the Lebesgue measure $\lambda$, and hence satisfies the assumptions of the Radon-Nikodym Theorem. Let $h$ be the Radon-Nikodym derivative of $\nu$ by $\lambda$. Then applying the statement of Theorem 2.2 to the sets $\langle a,x\rangle$, we get
\[ \int_{\langle a,x\rangle} h\,d\lambda = F(x) - F(a), \]
as claimed. The fact that $h$ is the derivative of $F$ almost everywhere follows from Theorem 3.2. □

3.3.3 Lemma. Let $F: \langle a,b\rangle \to \mathbb{R}$ be absolutely continuous. Let
\[ G(x) = \sup \sum_{i=1}^{N} |F(t_i) - F(t_{i-1})| \]
where the supremum is over all $N$ and all choices of points
\[ a = t_0 < \cdots < t_N = x. \]
Then the functions $G$, $G - F$, $G + F$ are increasing and absolutely continuous. (The function $G$ is called the total variation of the function $F$.)

Proof. Let $a \le y < x \le b$. The supremum in the definition of $G(x)$ clearly will not change if we take it only over such tuples $(t_i)$ which additionally satisfy $t_i = y$ for some $i$. This shows that
\[ G(x) - G(y) = \sup \sum_{i=1}^{N} |F(t_i) - F(t_{i-1})| \tag{*} \]
where the supremum is taken over all
\[ y = t_0 < \cdots < t_N = x. \]
Now choose an $\varepsilon > 0$. Then if $F$ satisfies the condition of absolute continuity with a particular $\delta > 0$, (*) (applied to $y = a_i$, $x = b_i$ for each individual $i$ in the definition 3.1) shows that $G$ satisfies the condition of absolute continuity for the same $\delta$. To show that $G - F$ and $G + F$ are non-decreasing, note that by definition, for $a \le y < x \le b$,
\[ G(x) - G(y) \ge |F(x) - F(y)|, \]
and hence
\[ G(x) - G(y) \ge \pm(F(x) - F(y)), \]
as required. □
3.3.4 Proof of the Theorem in the general case Clearly, an $\mathbb{R}$-linear combination of absolutely continuous functions is absolutely continuous. Let $F$ be as in the hypothesis of the theorem. Then the conclusion holds with $F$ replaced by the increasing functions $G + F + x$, $G + x$, and hence, by the linearity of derivatives and integrals, for
\[ F = (G + F + x) - (G + x). \]
□

4 Fourier series and the discrete Fourier transformation
In the preceding sections, we obtained strong theorems (Theorems 2.2 and 3.3) which used the theory of Hilbert spaces in their proofs, but Hilbert spaces were not a part of the final statements. The role of Hilbert spaces in this and the next section is different, namely as a framework in which intuitive statements can be easily made rigorous. Of course, much more can be said on the subjects we touch on here, but what we say is a good example of the role the concept plays, for example, in mathematical physics.
4.1 The discrete Fourier transform ($L^2$-Fourier series)

We begin with an auxiliary result.
4.1.1 The subspace of continuous functions with compact support in $L^p$

Let $U \subseteq \mathbb{R}^n$ be an open set. Recall that the support $\mathrm{supp}(f)$ of a function $f: U \to \mathbb{R}$ is the closure in $U$ of the set of all $x \in U$ such that $f(x) \ne 0$. The set (vector space) of continuous functions on $U$ with compact support is denoted by $C_c(U)$. Similarly, the space of continuous complex functions with compact support on $U$ is denoted by $C_c(U,\mathbb{C})$.

Theorem. Let $U \subseteq \mathbb{R}^n$ be an open set and let $1 \le p < \infty$. Then the set $C_c(U)$ (resp. $C_c(U,\mathbb{C})$) is dense in $L^p(U)$ (resp. $L^p(U,\mathbb{C})$).

Proof. Let us prove the complex case; the real case is analogous. Let $K \subseteq U$ be a compact set. We will first prove that in $L^p(U,\mathbb{C})$,
\[ c_K \in \overline{C_c(U,\mathbb{C})} \tag{4.1.1} \]
(recall that $c_K$ is the characteristic function, which has value 1 on $K$ and 0 elsewhere). In effect, $K$ is contained in the union of all balls $\Omega(x,\varepsilon_x)$ with $x \in K$, $\varepsilon_x < 1/k$ which are contained in $U$, and hence in finitely many of those balls by compactness. Let $U_k$ be the union of these finitely many open balls. Then by Tietze's Theorem (Theorem 8.5 of Chapter 2), there exists a continuous function $f_k: \mathbb{R}^n \to \langle 0,1\rangle$ such that $f_k(x) = 1$ for $x \in K$, and $f_k(x) = 0$ for $x \notin U_k$. Clearly, $f_k$ has compact support, and for $k$ sufficiently large, $\mathrm{supp}(f_k) \subseteq U$. Then $f_k \searrow c_K$, and we have
\[ \lim_{k\to\infty} \|f_k - c_K\|_p = 0 \]
by the Lebesgue Dominated Convergence Theorem, which implies (4.1.1). Next, we claim that (4.1.1) extends to any $F_\sigma$-set $K$ which satisfies $\lambda(K) < \infty$: this is because any such set is a union of countably many compact sets $K_n$, and we may assume $K_1 \subseteq K_2 \subseteq \cdots$ and use the fact that by the Lebesgue Dominated Convergence Theorem,
\[ \lim_{k\to\infty} \|c_{K_k} - c_K\|_p = 0. \]
Finally, recall from Exercise (8) of Chapter 5 that for every measurable set $S \subseteq U$ with $\lambda(S) < \infty$, there exists an $F_\sigma$-set $K \subseteq S$, $\lambda(S \smallsetminus K) = 0$, so in $L^p(U,\mathbb{C})$, $c_S$ is in the closure of $C_c(U,\mathbb{C})$. Consequently, so is any non-negative simple function $s$ with finite integral (which is equivalent to $s^p$ having a finite integral). Now for any $f \ge 0$, $f \in L^p(U,\mathbb{C})$, there are non-negative simple functions $s_n$ with $s_n \nearrow f$. Then
\[ \lim_{k\to\infty} \|f - s_k\|_p = 0 \]
by Lebesgue's Dominated Convergence Theorem, and hence $f \in \overline{C_c(U,\mathbb{C})}$, which implies the same conclusion about any $f \in L^p(U,\mathbb{C})$ (by considering $\mathrm{Re}(f)^+$, $\mathrm{Re}(f)^-$, $\mathrm{Im}(f)^+$, $\mathrm{Im}(f)^-$). □
4.1.2 Comments
1. Note that unlike our previous results on $L^p$, Theorem 4.1.1 does not readily generalize to an arbitrary Borel measure.
2. Also note that $C_c(U)$ is certainly not dense in $L^\infty(U)$. Since the complement of a measure 0 set in $U$ is necessarily dense, on $C_c(U)$, $L^\infty$-convergence is uniform convergence, and thus the closure of $C_c(U)$ in $L^\infty(U)$ consists, in particular, of continuous functions.
4.1.3 The Discrete Fourier Transform Theorem Consider the space $L^2(\langle 0,2\pi\rangle, \mathbb{C})$ (but we could adapt our arguments to any compact interval of non-zero length, see Exercise (12)). Then by explicit calculation,
\[ \frac{1}{\sqrt{2\pi}} e^{inx}, \quad n \in \mathbb{Z} \tag{4.1.2} \]
form an orthonormal system in $L^2(\langle 0,2\pi\rangle, \mathbb{C})$.

Theorem. The system (4.1.2) forms an orthonormal basis of $L^2(\langle 0,2\pi\rangle, \mathbb{C})$.

Proof. Consider the space
\[ S^1 = \{z \in \mathbb{C} \mid |z| = 1\} \]
with the topology induced by $\mathbb{C}$. Now consider the $\mathbb{R}$-vector subspace $\mathcal{A} \subseteq C(S^1,\mathbb{R})$ spanned by the functions $z^n + z^{-n}$, $i(z^n - z^{-n})$, $n \in \mathbb{Z}$. Then $\mathcal{A}$ is closed under multiplication, contains a non-zero constant function and separates points, and hence satisfies the hypotheses of the Stone-Weierstrass Theorem 6.4.1 of Chapter 9. Consequently, every continuous function $f: S^1 \to \mathbb{R}$ is a uniform limit of a sequence of elements of $\mathcal{A}$. Composing with the map $e^{ix}$, we see that in particular, every continuous function $g: (0,2\pi) \to \mathbb{R}$ with compact support is a uniform limit of functions $g_n$ which are finite linear combinations of the functions $\sin(nx)$, $\cos(nx)$, $n \in \mathbb{Z}$. Therefore, every continuous function $g: (0,2\pi) \to \mathbb{C}$ with compact support is a uniform limit of functions $g_n$ where each $g_n$ is a finite linear combination of the functions $e^{inx}$, $n \in \mathbb{Z}$. By the Lebesgue Dominated Convergence Theorem, a sequence in $L^2(\langle 0,2\pi\rangle,\mathbb{C})$ which converges uniformly converges in $L^2$. Since the functions $g_n$ are (finite) linear combinations of the elements (4.1.2), $g$ is in the closure of the subspace spanned by (4.1.2). Thus, our statement follows from Theorem 4.1.1. □
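The "explicit calculation" behind the orthonormality of (4.1.2) can be mirrored numerically. The sketch below approximates the inner product $\int_0^{2\pi} f\bar{g}$ by a uniform Riemann sum, which is exact up to rounding for the exponentials involved; the names `inner`, `e` and `gram` are illustrative choices made here, not notation from the text.

```python
import cmath
import math

def inner(f, g, samples=2048):
    """Riemann-sum approximation of the L^2 inner product on [0, 2*pi]."""
    h = 2 * math.pi / samples
    return h * sum(f(k * h) * g(k * h).conjugate() for k in range(samples))

def e(n):
    """The n-th element of the system (4.1.2)."""
    return lambda x: cmath.exp(1j * n * x) / math.sqrt(2 * math.pi)

# Gram matrix of the elements with indices -2, ..., 2.
gram = {(m, n): inner(e(m), e(n)) for m in range(-2, 3) for n in range(-2, 3)}
```

The resulting Gram matrix agrees with the identity matrix up to floating-point error, which is what orthonormality means for these five elements.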
4.1.4 As already remarked in Section 2.1 above, sometimes one denotes by $\ell^2(\mathbb{C})$ the space $L^2_\mu(\mathbb{Z},\mathbb{C})$ where $\mu$ is the counting measure on $\mathbb{Z}$, i.e. $\mu(S)$ is the number of elements of $S$ when $S$ is finite, and $\mu(S) = \infty$ for $S$ infinite. Then the assignment
\[ \sum_{n\in\mathbb{Z}} \frac{a(n)}{\sqrt{2\pi}} e^{inx} \mapsto (a: \mathbb{Z} \to \mathbb{C}) \tag{*} \]
defines an isomorphism
\[ L^2(\langle 0,2\pi\rangle,\mathbb{C}) \to \ell^2(\mathbb{C}) \]
which is sometimes referred to as the discrete Fourier transformation and the expression on the left-hand side of (*) of an element $f \in L^2(\langle 0,2\pi\rangle,\mathbb{C})$ is called its Fourier series. Much hard mathematics concerns convergence of Fourier series in other spaces than $L^2$. Note, however, that by Theorem 4.9 of Chapter 16, we have an expression for the coefficients $a_n = a(n)$:
\[ a_n = \frac{1}{\sqrt{2\pi}} \int_{\langle 0,2\pi\rangle} f(x) e^{-inx}. \tag{**} \]
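Formula (**) can be evaluated numerically for a concrete function. The sketch below uses a midpoint rule for $f(x) = x$ on $\langle 0,2\pi\rangle$; the function name `fourier_coefficient` and the sample count are assumptions of this illustration. For comparison, integration by parts gives $\int_0^{2\pi} x e^{-inx}\,dx = 2\pi i/n$ for $n \ne 0$, so $a_n = i\sqrt{2\pi}/n$, and $a_0 = \pi\sqrt{2\pi}$.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4096):
    """Midpoint-rule approximation of a_n = (2*pi)^(-1/2) * integral of f(x) e^{-inx}."""
    h = 2 * math.pi / samples
    total = sum(
        f((k + 0.5) * h) * cmath.exp(-1j * n * (k + 0.5) * h)
        for k in range(samples)
    )
    return h * total / math.sqrt(2 * math.pi)

identity = lambda x: x
a0 = fourier_coefficient(identity, 0)
a1 = fourier_coefficient(identity, 1)
a3 = fourier_coefficient(identity, 3)
```

The numerical values agree with the closed forms only up to the quadrature error of the midpoint rule, so comparisons should use a tolerance.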
5 The continuous Fourier transformation

5.1 The continuous Fourier transformation formula

While Exercise (15) of the previous Section gives a basis of $L^2(\mathbb{R},\mathbb{C})$, one may ask if there is a more compelling analogue of formula (**) which would apply to $L^2(\mathbb{R},\mathbb{C})$. There is a surprisingly simple answer, namely to apply (**) for a continuous parameter instead of $n \in \mathbb{Z}$, and integrate over all of $\mathbb{R}$, thus obtaining, again, a function on $\mathbb{R}$: Define for a function $f: \mathbb{R} \to \mathbb{C}$ and for $t \in \mathbb{R}$,
\[ \hat{f}(t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x) e^{-ixt}\,dx. \tag{5.1.1} \]
(The integral on the right-hand side is the Lebesgue integral; we include the symbol $dx$ to emphasize that we are integrating in the variable $x$.)

Despite the simplicity of the generalization, it is immediately visible that the situation will be more complicated than in the case of the discrete Fourier transformation. For example, we cannot expect the formula (5.1.1) to work for every $f \in L^2(\mathbb{R},\mathbb{C})$: in order for (5.1.1) to make sense, $f$ must be integrable. Conversely, suppose (5.1.1) does make sense. Do we have $\hat{f} \in L^2(\mathbb{R},\mathbb{C})$? We will answer these questions partially: We will apply the continuous Fourier transform formula (5.1.1) to a certain subspace of functions called "rapidly decreasing functions", and extend it to an isometric isomorphism of Hilbert spaces
\[ \mathcal{F}: L^2(\mathbb{R},\mathbb{C}) \xrightarrow{\ \cong\ } L^2(\mathbb{R},\mathbb{C}). \]
Again, much deeper and more specific convergence theorems exist, but we will not discuss them in this text.

5.2 Lemma. (The Riemann-Lebesgue lemma) Let $f: \mathbb{R} \to \mathbb{C}$ be an integrable function. Then $\hat{f}: \mathbb{R} \to \mathbb{C}$ is continuous and we have
\[ \lim_{t\to\infty} \hat{f}(t) = \lim_{t\to-\infty} \hat{f}(t) = 0. \]
Proof. When $t_n \to t$, then $f(x)e^{-it_n x} \to f(x)e^{-itx}$, while $|f(x)e^{-it_n x}| = |f(x)|$. Thus, $\hat{f}(t_n) \to \hat{f}(t)$ by the Lebesgue Dominated Convergence Theorem. This proves continuity. To prove the limit formula, first consider the case when $f = c_{(a,b\rangle}$, $a < b$: we have
\[ \Big|\int_{(a,b\rangle} e^{-itx}\,dx\Big| = \frac{1}{|t|}\,|e^{-itb} - e^{-ita}| \]
and the right-hand side goes to 0 with $|t| \to \infty$. By a step function we shall now mean a (finite) $\mathbb{C}$-linear combination of the functions $c_{(a,b\rangle}$ (with varying $a < b$). Then we claim that for every integrable function $f: \mathbb{R} \to \mathbb{C}$ and every $\varepsilon > 0$, there exists a step function $s$ such that
\[ \int_{\mathbb{R}} |f - s| < \varepsilon. \]
First, this is true for continuous functions with compact supports (by the convergence of the Riemann integral). Then it is true for non-negative functions in $Z^{dn}$ and hence for all integrable functions by the Lebesgue Monotone Convergence Theorem and linearity of integrals. But
\[ |\hat{f}(t) - \hat{s}(t)| \le \int_{\mathbb{R}} |f - s| < \varepsilon, \]
and thus the limit formula for $s$ implies the limit formula for $f$. □
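The first step of the proof, the decay for the characteristic function of an interval, can be made concrete. The sketch below evaluates the closed form of $\int_a^b e^{-itx}\,dx$ for $t \ne 0$ and compares its absolute value with the bound $2/|t|$; the function name `indicator_transform_integral` is hypothetical.

```python
import cmath

def indicator_transform_integral(a, b, t):
    """Closed form of the integral of exp(-i t x) over (a, b], valid for t != 0."""
    return (cmath.exp(-1j * t * b) - cmath.exp(-1j * t * a)) / (-1j * t)

frequencies = (10.0, 100.0, 1000.0)
magnitudes = [abs(indicator_transform_integral(0.0, 1.0, t)) for t in frequencies]
```

Each magnitude is at most $2/|t|$, so the values tend to 0 as $|t| \to \infty$, in line with the lemma for this special $f$.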
5.3 Lemma. Let $f: \mathbb{R} \to \mathbb{C}$ be such that both $f(x)$ and $x f(x)$ are integrable. Then $\hat{f}(t)$ is differentiable, and
\[ \frac{d\hat{f}}{dt} = \widehat{-ixf(x)}(t). \]
(Note: By the right-hand side, we mean the Fourier transform of $-ixf(x)$, which is a function of $t$.)

Proof. Under the conditions given, we have
\[ \frac{d}{dt}\int_{\mathbb{R}} f(x)e^{-itx}\,dx = \int_{\mathbb{R}} \frac{\partial f(x)e^{-itx}}{\partial t}\,dx = \int_{\mathbb{R}} (-ix)f(x)e^{-itx}\,dx \]
by Theorem 5.2 of Chapter 5 (differentiation under the integral sign). □
5.4 Lemma. Let $f: \mathbb{R} \to \mathbb{C}$ have a continuous derivative, and assume $f(x)$ and $f'(x)$ are integrable, and that
\[ \lim_{x\to\pm\infty} f(x) = 0. \]
Then we have
\[ \widehat{f'}(t) = it\hat{f}(t). \]

Proof. Compute:
\[ \sqrt{2\pi}\,\widehat{f'}(t) = \int_{\mathbb{R}} f'(x)e^{-itx}\,dx = \lim_{a\to\infty}\int_{\langle -a,a\rangle} f'(x)e^{-itx}\,dx \]
\[ = \lim_{a\to\infty}\Big(f(a)e^{-ita} - f(-a)e^{ita} + \int_{\langle -a,a\rangle} itf(x)e^{-itx}\,dx\Big) = it\int_{\mathbb{R}} f(x)e^{-itx}\,dx = \sqrt{2\pi}\,it\hat{f}(t). \]
The passages to the limit follow from the Lebesgue Dominated Convergence Theorem. The middle equality is integration by parts (for the Riemann integral). □

5.5 Lemma. Let $f, g: \mathbb{R} \to \mathbb{C}$ be integrable functions and let $a > 0$. Then
\[ \int_{\mathbb{R}} f(ax)\hat{g}(x)\,dx = \int_{\mathbb{R}} g(ay)\hat{f}(y)\,dy. \tag{5.5.1} \]
(Again, on both sides, we mean the Lebesgue integral.)

Proof. First note that both sides of (5.5.1) make sense by the Riemann-Lebesgue lemma, since $\hat{f}$ and $\hat{g}$ are continuous and bounded. Next, consider the integral
\[ \int_{\mathbb{R}^2} f(x)g(t)e^{-itx/a}. \]
Clearly, this integral exists (replace the integrand by $|f(x)| \cdot |g(t)|$), and is equal, up to the common factor $1/(a\sqrt{2\pi})$, to both sides of (5.5.1) by Fubini's Theorem and linear substitutions $x/a = u$ and $t/a = v$. □
5.6 Rapidly decreasing functions

A function $f: \mathbb{R} \to \mathbb{C}$ is called rapidly decreasing (or Schwarzian) if $f$ has all derivatives, and for all numbers $m, n = 0, 1, 2, \dots$, we have
\[ \lim_{x\to\pm\infty} x^m f^{(n)}(x) = 0. \]
(Note that the term "rapidly decreasing" is a misnomer, since these functions are, in fact, never decreasing.) Note that any smooth function with compact support is rapidly decreasing (since all its derivatives will have, again, compact support). The vector space of all rapidly decreasing functions $f: \mathbb{R} \to \mathbb{C}$ is denoted by $\mathcal{S}$.

Lemma. Let $f \in \mathcal{S}$. Then $\hat{f} \in \mathcal{S}$.

Proof. By induction, using Lemmas 5.3 and 5.4, $t^m \hat{f}^{(n)}(t)$ is a (finite) linear combination of functions of the form $\widehat{x^k f^{(\ell)}(x)}(t)$. Use the assumption and the Riemann-Lebesgue lemma. □
5.7 The Fourier Inversion Theorem

Define the inverse Fourier transform $\tilde{f}$ by
\[ \tilde{f}(t) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)e^{itx}\,dx. \]
Then by definition,
\[ \tilde{f} = \overline{\hat{\bar{f}}} \]
where $\bar{x}$ is the complex conjugate of $x$. It follows that the inverse Fourier transform maps $\mathcal{S}$ to $\mathcal{S}$.

Theorem. For $f \in \mathcal{S}$, the inverse Fourier transform of the Fourier transform of $f$ is equal to $f$.

Proof. Let $f, g \in \mathcal{S}$. By the Lebesgue Dominated Convergence Theorem and Lemma 5.6, we may pass to the limit $a \to 0$ in Lemma 5.5, getting
\[ f(0)\int_{\mathbb{R}} \hat{g} = g(0)\int_{\mathbb{R}} \hat{f}. \tag{5.7.1} \]
Setting
\[ g(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \]
and using Exercise (15) of Chapter 5, and Exercise (18) below, (5.7.1) becomes
\[ f(0) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \hat{f}, \]
which is the special case of the formula we desire at the point $x = 0$. The general case follows from Exercise (16) below. □

Corollary. For $f, g \in \mathcal{S}$, we have
\[ \langle f, g\rangle = \langle \hat{f}, \hat{g}\rangle. \]

Proof. We have
\[ \langle \hat{f}, \hat{g}\rangle = \int_{\mathbb{R}} \hat{f}\,\overline{\hat{g}} = \int_{\mathbb{R}} \hat{f}\,\tilde{\bar{g}} = \int_{\mathbb{R}} \tilde{\hat{f}}\,\bar{g} = \int_{\mathbb{R}} f\bar{g} = \langle f, g\rangle. \]
□
Lemma. $\mathcal{S}$ is dense in $L^2(\mathbb{R},\mathbb{C})$.

Proof. In effect, by Theorem 4.1.1, continuous functions with compact support are dense in $L^2(\mathbb{R},\mathbb{C})$, but we claim that if $f$ is a continuous function with compact support $K$, then there exists an $L \supseteq K$ compact and $f_n$ smooth with $\mathrm{supp}(f_n) \subseteq L$ which converge to $f$ uniformly (and hence in $L^2$). In effect, let $U$ be any open neighborhood of $K$ such that $\overline{U}$ is compact. Let $\varepsilon > 0$. Since $f$ is uniformly continuous, there exists a $\delta > 0$ such that for $x \in K$, $y \in \Omega(x,\delta)$, $|f(x) - f(y)| < \varepsilon$. Further, by compactness, for $\delta > 0$ sufficiently small and all $x \in K$, $\Omega(x,\delta) \subseteq U$. Now choose a smooth partition of unity $u_x$ subordinate to the open cover by the balls $\Omega(x,\delta)$ and $\mathbb{R} \smallsetminus K$, and let
\[ g(t) = \sum_{x\in K} u_x(t)f(x). \]
Then $|f(t) - g(t)| < 2\varepsilon$ for all $t \in K$, while $g$ is smooth and $\mathrm{supp}(g) \subseteq U$. □
5.8 Theorem. The maps $\mathcal{S} \to \mathcal{S}$ given by $f \mapsto \hat{f}$, $f \mapsto \tilde{f}$ extend to linear isometries
\[ \mathcal{F}, \mathcal{F}^{-1}: L^2(\mathbb{R},\mathbb{C}) \to L^2(\mathbb{R},\mathbb{C}) \]
which are inverse to each other.

Proof. An isometry of inner product spaces is always injective. Thus, by Corollary 5.7, the Fourier transform gives an injective linear map $\mathcal{S} \to \mathcal{S}$, and by Theorem 5.7, it is onto. Hence it is a linear isomorphism, and the inverse Fourier transform is an inverse linear isomorphism. Hence, the inverse Fourier transform is also an isometry (of course, this could also be proved directly). Now composing either the Fourier transform or the inverse Fourier transform $\mathcal{S} \to \mathcal{S}$ with the inclusion $\mathcal{S} \subseteq L^2(\mathbb{R},\mathbb{C})$, we obtain uniformly continuous maps into a complete metric space, which can therefore be uniquely extended to a uniformly continuous map
\[ L^2(\mathbb{R},\mathbb{C}) \to L^2(\mathbb{R},\mathbb{C}) \]
by Proposition 4.6 of Chapter 9. These maps are clearly linear isometries by continuity of the inner product and vector space operations, and are inverse to each other by uniqueness of the extension. □
6 Exercises

(1) Prove that the expression (1.2.2) of 1.2 does not depend on the expression of a simple function (1.2.1).
(2) Prove that the function $\mathrm{vol}_g(B)$ on Borel subsets $B$ of a Riemann manifold $M$ from Exercise (5) of Chapter 15 is a Borel measure on $M$.
(3) Prove that for two Riemann metrics $g_1$, $g_2$ on a smooth manifold $M$, the Borel measure $\mathrm{vol}_{g_1}$ is absolutely continuous with respect to the Borel measure $\mathrm{vol}_{g_2}$. Conclude that it makes sense to speak of a measure 0 set in a smooth manifold, even when we do not specify a Riemann metric.
(4) Extend the Radon-Nikodym Theorem to the case when there exist subsets $X_1, X_2, \dots \subseteq X$ such that $X = \bigcup X_n$ and $\mu(X_n) < \infty$. (The measure $\mu$ on $X$ is then called $\sigma$-finite.) Note that we are keeping the assumption $\nu(X) < \infty$. [Hint: Apply Theorem 2.2 for each $X_n$ instead of $X$.]
(5) Prove uniqueness in the Radon-Nikodym Theorem, i.e. prove that if two functions $h_1$, $h_2$ in the statement of Theorem 2.2 satisfy the conclusion, then they are equal almost everywhere.
(6) Prove that if $B \subseteq \mathbb{R}^n$ is a Borel set and $\mu$ is a $\sigma$-finite Borel measure on $B$, then there is an isomorphism of Banach spaces $(L^1_\mu(B))^* \cong L^\infty_\mu(B)$, and similarly in the complex case. [Hint: Extend the Radon-Nikodym Theorem to a situation where instead of the measure $\nu$ we have a continuous linear functional on $L^1_\mu(B)$ under the condition $\mu(X) < \infty$ - the proof is the same! The "Radon-Nikodym derivative" $h$ is the function in $L^\infty$ which we are seeking; Exercise (4) is also relevant. To prove that there is a bound $M$ such that $|h(x)| < M$ almost everywhere, assume for contradiction that $|h(x)| > 2^n$ on a subset $X_n$ of positive measure, $X_n$ disjoint. Then there exists an integrable function $f$ with $\int_{X_n} |f| \le 1/2^n$, $\int_{X_n} fh = 1$.]
(7) Prove that if $U$ is a non-empty open set in $\mathbb{R}^n$, then the spaces $L^1(U)$, $L^1(U,\mathbb{C})$ are not reflexive. [Hint: Use 4.1.2. Let $V$ be the closure of $C_c(U)$ in $L^\infty(U)$. Prove that there is a non-zero continuous linear form $X$ on $L^\infty(U)$ which is 0 on $C_c(U)$. Consequently, $X$ cannot come from $L^1(U)$. (Consider Exercise (6).)]
(8) Prove that in Lemma 2.3, the assumption $\nu(X) < \infty$ is needed. Find a counterexample and describe where the proof goes wrong when we omit this condition.
(9) The requirement in Definition 3.1 that the intervals $\langle a_i, b_i\rangle$ be disjoint is needed. Give an example showing that we get a different notion if we drop it.
(10) Prove that while $x^2\sin(1/x^2)$ has a derivative everywhere, it is not absolutely continuous on $\langle -1,1\rangle$, and thus the Lebesgue integral of its derivative does not exist.
(11) Let $F: \langle a,b\rangle \to \mathbb{R}$ be Lipschitz (see 3.1 of Chapter 6). Prove that $F$ has a derivative almost everywhere.
(12) In analogy with 4.1, find a Hilbert basis of the space $L^2(\langle a,b\rangle)$ for $a < b$.
(13) Using 4.1, find a real orthonormal basis of the real Hilbert space $L^2(\langle 0,2\pi\rangle,\mathbb{R})$.
(14) Let $f: \langle 0,2\pi\rangle \to \mathbb{R}$ be defined by (a) $f(x) = x$, (b) $f(x) = 1$ for $0 \le x \le \pi$ and $f(x) = 0$ else. Compute the Fourier series of $f$.
(15) Prove that the functions $e_{m,n}: \mathbb{R} \to \mathbb{C}$ where $e_{m,n}(x) = \frac{1}{\sqrt{2\pi}}e^{inx}$ when $2\pi m \le x < 2\pi(m+1)$ and $e_{m,n}(x) = 0$ otherwise, form an orthonormal basis of $L^2(\mathbb{R},\mathbb{C})$.
(16) Let $f: \mathbb{R} \to \mathbb{C}$ be an integrable function and let $a \in \mathbb{R}$. Define a function $f_a: \mathbb{R} \to \mathbb{C}$ by $f_a(x) = f(x + a)$. Prove that $\widehat{f_a}(t) = e^{ita}\hat{f}(t)$.
(17) Define the convolution of functions $f, g: \mathbb{R} \to \mathbb{C}$ by
\[ f * g(t) = \int_{\mathbb{R}} f(x)g(t - x)\,dx. \]
Prove that if $f$ and $g$ are integrable then the convolution is well defined almost everywhere, and, with the normalization of (5.1.1), one has
\[ \widehat{f * g} = \sqrt{2\pi}\,\hat{f} \cdot \hat{g}. \]
[Hint: Use Fubini's Theorem.]
(18) Prove that the function $e^{-x^2/2}$ is rapidly decreasing and that its Fourier transform is the same function.
A Linear Algebra I: Vector Spaces

1 Vector spaces and subspaces

1.1 Let $\mathbb{F}$ be a field (in this book, it will always be either the field of reals $\mathbb{R}$ or the field of complex numbers $\mathbb{C}$). A vector space $V = (V, +, o, \alpha\cdot(-)\ (\alpha \in \mathbb{F}))$ over $\mathbb{F}$ is a set $V$ with a binary operation $+$, a constant $o$ and a collection of unary operations (i.e. maps) $\alpha\cdot: V \to V$ labelled by the elements of $\mathbb{F}$, satisfying
(V1) $(x + y) + z = x + (y + z)$,
(V2) $x + y = y + x$,
(V3) $0 \cdot x = o$,
(V4) $\alpha \cdot (\beta \cdot x) = (\alpha\beta) \cdot x$,
(V5) $1 \cdot x = x$,
(V6) $(\alpha + \beta) \cdot x = \alpha \cdot x + \beta \cdot x$, and
(V7) $\alpha \cdot (x + y) = \alpha \cdot x + \alpha \cdot y$.
Here, we write $\alpha \cdot x$ and we will write also $\alpha x$ for the result $\alpha\cdot(x)$ of the unary operation $\alpha\cdot$ in $x$. Often, one uses the expression "multiplication of $x$ by $\alpha$"; but it is useful to keep in mind that what we really have is a collection of unary operations (see also 5.1 below).

The elements of a vector space are often referred to as vectors. In contrast, the elements of the field $\mathbb{F}$ are then often referred to as scalars. In view of this, it is useful to reflect for a moment on the true meaning of the axioms (equalities) above. For instance, (V4), often referred to as the "associative law", in fact states that the composition of the functions $V \to V$ labelled by $\beta$, $\alpha$ is labelled by the product $\alpha\beta$ in $\mathbb{F}$, the "distributive law" (V6) states that the (pointwise) sum of the mappings labelled by $\alpha$ and $\beta$ is labelled by the sum $\alpha + \beta$ in $\mathbb{F}$, and (V7) states that each of the maps $\alpha\cdot$ preserves the sum $+$. See Example 3 in 1.2.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7, © Springer Basel 2013
1.2 Examples
Vector spaces are ubiquitous. We present just a few examples; the reader will certainly be able to think of many more.
1. The $n$-dimensional row vector space $\mathbb{F}_n$. The elements of $\mathbb{F}_n$ are the $n$-tuples $(x_1, \dots, x_n)$ with $x_i \in \mathbb{F}$, the addition is given by $(x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n)$, $o = (0, \dots, 0)$, and the $\alpha$'s operate by the rule $\alpha \cdot ((x_1, \dots, x_n)) = (\alpha x_1, \dots, \alpha x_n)$. Note that $\mathbb{F}_1$ can be viewed as the field $\mathbb{F}$. However, although the operations $\alpha\cdot$ come from the binary multiplication in $\mathbb{F}$, their role in a vector space is different. See 5.1 below.
2. Spaces of real functions. The set $F(M)$ of all real functions on a set $M$, with pointwise addition and multiplication by real numbers, is obviously a vector space over $\mathbb{R}$. Similarly, we have the vector space $C(J)$ of all the continuous functions on an interval $J$, or e.g. the space $C^1(J)$ of all continuously differentiable functions on an open interval $J$, or the space $C^\infty(J)$ of all smooth functions on $J$, i.e. functions which have all higher derivatives. There are also analogous $\mathbb{C}$-vector spaces of complex functions.
3. Let $V$ be the set of positive reals. Define $x \oplus y = xy$, $o = 1$, and for arbitrary $\alpha \in \mathbb{R}$, $\alpha \cdot x = x^\alpha$. Then $(V, \oplus, o, \alpha\cdot(-)\ (\alpha \in \mathbb{R}))$ is a vector space (see Exercise (1)).
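Example 3 can be tested numerically. The following Python sketch is an illustration only and not part of the original text: the names `vadd`, `smul`, `ZERO` and `check_axioms` are hypothetical, and floating-point arithmetic means the axioms (V1)-(V7) are checked only up to rounding on a chosen sample, which is of course not a proof.

```python
import math

# Hypothetical helper names; V is the set of positive reals from Example 3.
def vadd(x, y):
    """The vector 'addition' of Example 3, realised as ordinary multiplication."""
    return x * y

def smul(alpha, x):
    """The unary operation labelled by alpha, realised as x ** alpha."""
    return x ** alpha

ZERO = 1.0  # the constant o of this vector space

def close(a, b):
    return math.isclose(a, b, rel_tol=1e-9)

def check_axioms(x, y, z, alpha, beta):
    """Return True if (V1)-(V7) hold, up to rounding, for the given sample."""
    return all([
        close(vadd(vadd(x, y), z), vadd(x, vadd(y, z))),                       # (V1)
        close(vadd(x, y), vadd(y, x)),                                         # (V2)
        close(smul(0.0, x), ZERO),                                             # (V3)
        close(smul(alpha, smul(beta, x)), smul(alpha * beta, x)),              # (V4)
        close(smul(1.0, x), x),                                                # (V5)
        close(smul(alpha + beta, x), vadd(smul(alpha, x), smul(beta, x))),     # (V6)
        close(smul(alpha, vadd(x, y)), vadd(smul(alpha, x), smul(alpha, y))),  # (V7)
    ])
```

For instance, `check_axioms(2.0, 3.5, 0.25, -1.5, 2.0)` evaluates all seven identities on one sample; (V6) here is just the rule $x^{\alpha+\beta} = x^\alpha x^\beta$.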
1.3 An important convention
We have distinguished above the elements of the vector space and the elements of the field by using roman and greek letters. This is a good convention for a definition, but in the row vector spaces $\mathbb{F}_n$, which will play a particular role below, it is somewhat clumsy. Instead, we will use for an arithmetic vector a bold-faced variant of the letter denoting the coordinates. Thus, $\mathbf{x} = (x_1, \dots, x_n)$, $\mathbf{a} = (a_1, \dots, a_n)$, etc. Similarly we will write $\mathbf{f} = (f_1, \dots, f_n)$ for the $n$-tuple of functions $f_j : X \to \mathbb{R}$ resp. $\mathbb{C}$ (after all, they can be viewed as mappings $\mathbf{f} : X \to \mathbb{F}_n$), and so on.
These conventions make reading about vectors much easier, and we will maintain them as long as possible (for example in our discussion of multivariable differential calculus in Chapter 3). The fact is, however, that in certain more advanced settings the conventions become cumbersome or even ambiguous (for example in the context of tensor calculus in Chapter 15), and because of this, in the later chapters of this book we eventually abandon them, as one usually does in more advanced topics of analysis. We do, however, use the symbol $o$ universally for the zero element of a general vector space, so that in $\mathbb{F}_n$ we have $o = (0, 0, \dots, 0)$.
1.4 We have the following trivial
Observation. In any vector space $V$, for all $x \in V$, we have $x + o = x$ and there exists precisely one $y$ such that $x + y = o$, namely $y = (-1)x$.
(Indeed, $x + o = 1 \cdot x + 0 \cdot x = (1 + 0)x = x$ and $x + (-1)x = 1x + (-1)x = (1 + (-1))x = 0 \cdot x = o$, and if $x + y = o$ and $x + z = o$ then $y = y + (x + z) = (y + x) + z = z$.)
1.5 (Vector) subspaces
A subspace of a vector space $V$ is a subset $W \subseteq V$ that is itself a vector space with the operations inherited from $V$. Since the equations required in $V$ hold for special as well as general elements, we have a trivial
Observation. A subset $W \subseteq V$ of a vector space is a subspace if and only if
(a) $o \in W$,
(b) if $x, y \in W$ then $x + y \in W$, and
(c) for all $\alpha \in \mathbb{F}$ and $x \in W$, $\alpha x \in W$.
1.5.1 Also the following statement is immediate. Proposition. The intersection of an arbitrary set of subspaces of a vector space V is a subspace of V .
1.6 Generating sets
By 1.5.1, we see that for each subset $M$ of $V$ there exists the smallest subspace $W \subseteq V$ containing $M$, namely
$$L(M) = \bigcap \{ W \mid W \text{ subspace of } V \text{ and } M \subseteq W \}.$$
For $M$ finite, we use the notation $L(u_1, \dots, u_n)$ instead of $L(\{u_1, \dots, u_n\})$.
Obviously $L(\emptyset) = \{o\}$. We say that $M$ generates $L(M)$; in particular if $L(M) = V$ we say that $M$ is a generating set (of $V$). One often speaks of a set of generators but we have to keep in mind that this does not imply each of its elements generates $V$, which would be a much stronger statement. If there exists a finite generating system we say that $V$ is finitely generated, or finite-dimensional.
1.7 The sum of subspaces
Let $W_1, W_2$ be subspaces. Unlike the intersection $W_1 \cap W_2$, the union $W_1 \cup W_2$ is generally (and typically) not a subspace. But we have the smallest subspace containing both $W_1$ and $W_2$, namely $L(W_1 \cup W_2)$. It will be denoted by $W_1 + W_2$ and called the sum of $W_1$ and $W_2$. (One often uses the symbol ‘$\oplus$’ instead of ‘$+$’ when one also has $W_1 \cap W_2 = \{o\}$.)
2 Linear combinations, linear independence
2.1 A linear combination of a system $x_1, \dots, x_n$ of elements of a vector space $V$ over $\mathbb{F}$ is a formula
$$\alpha_1 x_1 + \cdots + \alpha_n x_n \qquad \Big(\text{briefly, } \sum_{j=1}^{n} \alpha_j x_j\Big). \tag{*}$$
The “system” in question is to be understood as the sequence, although the order in which it is presented will play no role. However, a possible repetition of an individual element is essential. Note that we spoke of (*) as of a “formula”. That is, we had in mind the full information involved (more pedantically, we could speak of the linear combination as of the sequence together with the mapping $\{1, \dots, n\} \to \mathbb{F}$ sending $j$ to $\alpha_j$). The vector obtained as the result of the indicated operations should be referred to as the result of the linear combination (*). We will follow this convention consistently to begin with; later, we will speak of a linear combination $\sum_{j=1}^{n} \alpha_j x_j$ more loosely, trusting that the reader will be able to tell from the context whether we mean the explicit formula or its result.
2.2 A linear combination (*) is said to be non-trivial if at least one of the $\alpha_j$ is non-zero. A system $x_1, \dots, x_n$ is linearly dependent if there exists a non-trivial linear combination (*) with result $o$. Otherwise, we speak of a linearly independent system.
2.2.1 Proposition.
1. If $x_1, \dots, x_n$ is linearly dependent resp. independent then for any permutation $\pi$ of $\{1, \dots, n\}$ the system $x_{\pi(1)}, \dots, x_{\pi(n)}$ is linearly dependent resp. independent.
2. A subsystem of a linearly independent system is linearly independent.
3. Let $\beta_2, \dots, \beta_n$ be arbitrary. Then $x_1, \dots, x_n$ is linearly independent if and only if the system $x_1 + \sum_{j=2}^{n} \beta_j x_j, x_2, \dots, x_n$ is.
4. A system $x_1, \dots, x_n$ is linearly dependent if and only if one of its members is a (result of a) linear combination of the others. In particular, any system containing $o$ is linearly dependent. Similarly, if there exist $j \ne k$ such that $x_j = x_k$ then $x_1, \dots, x_n$ is linearly dependent.
Proof. 1. is trivial.
2. A non-trivial linear combination demonstrating the dependence of the smaller system demonstrates the dependence of the bigger one if we put $\alpha_j = 0$ for the remaining summands.
3. It suffices to prove one implication; the other follows by symmetry, since the first system can be obtained from the second by using the coefficients $-\beta_j$. Thus, let $\alpha_1 (x_1 + \sum_{j=2}^{n} \beta_j x_j) + \alpha_2 x_2 + \cdots + \alpha_n x_n = o$ with an $\alpha_k \ne 0$. Then we have
$$\alpha_1 x_1 + (\alpha_2 + \alpha_1 \beta_2) x_2 + \cdots + (\alpha_n + \alpha_1 \beta_n) x_n = o$$
and it is a non-trivial linear combination of the $x_1, \dots, x_n$: indeed either $\alpha_1 \ne 0$ or $(\alpha_k + \alpha_1 \beta_k) = \alpha_k \ne 0$.
4. If $\alpha_1 x_1 + \cdots + \alpha_n x_n = o$ (briefly, $\sum_{j=1}^{n} \alpha_j x_j = o$) with $\alpha_k \ne 0$ then $x_k = \sum_{j \ne k} \frac{-\alpha_j}{\alpha_k} x_j$. On the other hand, if $x_k = \sum_{j \ne k} \alpha_j x_j$ we have the non-trivial linear combination $x_k + \sum_{j \ne k} (-\alpha_j) x_j = o$. □
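For row vectors with rational entries, linear independence can be decided mechanically. The sketch below is an illustration that is not part of the original text: it assumes the standard fact that a system of $n$ vectors is independent exactly when Gaussian elimination leaves $n$ non-zero rows, and the function names `rank` and `is_independent` are hypothetical. It is then used to try out statement 3 of 2.2.1 on one example.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # look for a row at position >= r with a non-zero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_independent(vectors):
    """True if the system (a list, repetitions allowed) is linearly independent."""
    return rank(vectors) == len(vectors)

# Statement 3 of 2.2.1 on an example: replace x1 by x1 + 2*x2 - 5*x3.
xs = [(1, 0, 2), (0, 1, 1), (1, 1, 0)]
new_first = tuple(a + 2 * b - 5 * c for a, b, c in zip(*xs))
modified = [new_first, xs[1], xs[2]]
```

Here `is_independent(xs)` and `is_independent(modified)` agree, as the proposition predicts, while a system containing $o$ or a repeated vector is reported as dependent.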
2.3 Conventions
We speak of a linearly independent finite set $X \subseteq V$ if $X$ is independent when ordered as a sequence without repetition. A general subset $X \subseteq V$ is said to be independent if each of its finite subsets is independent.
2.4 Theorem. Let $M$ be an arbitrary subset of a vector space $V$. Then $L(M)$ is the set of all the (results of) linear combinations of finite subsystems of $M$.
Proof. The set of all such results of linear combinations is obviously a subspace of $V$. On the other hand, a subspace $W$ containing $M$ has to contain all the (results of) linear combinations of elements of $M$. □
2.5 Proposition. $L(u_1, \dots, u_n) \subseteq L(v_1, \dots, v_k)$ if and only if each of the $u_j$'s is a linear combination of $v_1, \dots, v_k$.
Proof. If it is, the inclusion follows from 2.4 since $L(u_1, \dots, u_n)$ is the smallest subspace containing all the $u_j$; if we have the inclusion then the $u_j$'s are the desired linear combinations, again by 2.4. □
2.6 Theorem. (Steinitz' Theorem, or The Exchange Theorem) Let $v_1, \dots, v_k$ be a linearly independent system in a vector space $V$ and let $\{u_1, \dots, u_n\}$ be a generating set. Then
(1) $k \le n$, and
(2) there exists a bijection $\varphi : \{1, \dots, n\} \to \{1, \dots, n\}$ (i.e. a permutation of the set $\{1, \dots, n\}$) such that $\{v_1, \dots, v_k, u_{\varphi(k+1)}, \dots, u_{\varphi(n)}\}$ is a generating set.
Proof by induction. If $k = 1$ we have $v_1 = \sum_{j=1}^{n} \alpha_j u_j$ and since $v_1 \ne o$ by 2.2, there exists at least one $u_{j_0}$ with $\alpha_{j_0} \ne 0$. Now
$$u_{j_0} = \frac{1}{\alpha_{j_0}} v_1 + \sum_{j \ne j_0} \Big(-\frac{\alpha_j}{\alpha_{j_0}}\Big) u_j$$
and we have, by 2.5, $L(v_1, u_1, \dots, u_{j_0 - 1}, u_{j_0 + 1}, \dots, u_n) = L(u_1, \dots, u_n) = V$. Rearrange the $u_j$ by exchanging $u_1$ with $u_{j_0}$.
Now let the statement hold for $k$ and let us have a linearly independent system $v_1, \dots, v_k, v_{k+1}$. Then $v_1, \dots, v_k$ is linearly independent and we have, after a rearrangement of the $u_j$, $L(v_1, \dots, v_k, u_{k+1}, \dots, u_n) = V$. Since $v_{k+1} \in V$ we have
$$v_{k+1} = \sum_{j=1}^{k} \alpha_j v_j + \sum_{j=k+1}^{n} \alpha_j u_j.$$
We cannot have all the $\alpha_j$ with $j > k$ equal to zero: since $v_1, \dots, v_k, v_{k+1}$ are independent, this would contradict 2.2.1 4. Thus, $\alpha_{j_0} \ne 0$ for some $j_0 > k$ and hence, first, $n \ge k + 1$; and, second, after rearranging the $u_j$'s to exchange $u_{j_0}$ with $u_{k+1}$ we obtain
$$\frac{1}{\alpha_{k+1}} v_{k+1} = \sum_{j=1}^{k} \frac{\alpha_j}{\alpha_{k+1}} v_j + u_{k+1} + \sum_{j=k+2}^{n} \frac{\alpha_j}{\alpha_{k+1}} u_j,$$
and hence
$$u_{k+1} = -\sum_{j=1}^{k} \frac{\alpha_j}{\alpha_{k+1}} v_j + \frac{1}{\alpha_{k+1}} v_{k+1} - \sum_{j=k+2}^{n} \frac{\alpha_j}{\alpha_{k+1}} u_j,$$
and $L(v_1, \dots, v_k, v_{k+1}, u_{k+2}, \dots, u_n) = L(u_1, \dots, u_n) = V$ by 2.5 again. □
3 Basis and dimension
3.1 We have observed a somewhat complementary behaviour of generating sets and independent systems: the former remain a generating set if more elements are added, the latter remain independent if some elements are deleted. This suggests the importance of minimal generating sets and maximal independent ones. We will see they are, basically, the same. The resulting concept is of fundamental importance in linear algebra. A basis of a vector space V is a subset that is both generating and linearly independent.
3.1.1 Observation. In a vector space $V$,
(1) if $u_1, \dots, u_n$ is a generating set then each $x$ can be written as $x = \sum_{j=1}^{n} \alpha_j u_j$,
(2) if $u_1, \dots, u_n$ is linearly independent then each $x$ can be written in at most one way as $x = \sum_{j=1}^{n} \alpha_j u_j$,
(3) if $u_1, \dots, u_n$ is a basis then each $x$ can be written in precisely one way as $x = \sum_{j=1}^{n} \alpha_j u_j$.
((1) is in 2.4; as for (2), if $\sum_{j=1}^{n} \alpha_j u_j = \sum_{j=1}^{n} \beta_j u_j$ then $\sum_{j=1}^{n} (\alpha_j - \beta_j) u_j = o$ and $\alpha_j - \beta_j = 0$; (3) is a combination of (1) and (2).)
3.2 Theorem.
1. Every (finite) generating system $u_1, \dots, u_n$ contains a basis.
2. Every linearly independent system $v_1, \dots, v_n$ of a finitely generated vector space can be extended to a basis.
3. All bases of a finitely generated vector space have the same number of elements.
Proof. 1. If $u_1, \dots, u_n$ are linearly independent we already have a basis. Else there is, by 2.2.1 4, an element $u_j$, say $u_n$ (which we can achieve by rearrangement), that is a linear combination of the others. Then by 2.5, $L(u_1, \dots, u_{n-1}) = L(u_1, \dots, u_n) = V$ and we can repeat the procedure with the generating system $u_1, \dots, u_{n-1}$. After repeating the procedure sufficiently many times we finish with a generating system $u_1, \dots, u_k$ that is linearly independent. (Note that this last system can be empty if the preceding system $u_1$ consisted of $u_1 = o$ only; the empty system is formally independent, and constitutes a basis of the trivial vector space $\{o\}$.)
2. From 1 we already know that $V$ has a basis $u_1, \dots, u_n$ and from 2.6 we infer that after rearrangement we have a generating system
$$v_1, \dots, v_k, u_{k+1}, \dots, u_n \tag{*}$$
and this, by 1 again, has to contain a basis. But this basis cannot be a proper subset of (*), by 2.6, since there exists an independent system $u_1, \dots, u_n$.
3. If $u_1, \dots, u_n$ and $v_1, \dots, v_k$ are bases then by 2.6, $k \le n$ and $n \le k$. □
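Statement 1 of Theorem 3.2 can be turned into a small procedure for row vectors with rational entries. The sketch below is an illustration only: instead of deleting dependent elements from the end, as in the proof, it uses a greedy variant that walks through the generators and keeps a vector exactly when it is not a linear combination of those already kept. The function name `extract_basis` and the internal echelon bookkeeping are assumptions of this sketch, not notation from the text.

```python
from fractions import Fraction

def extract_basis(generators):
    """Return a sublist of `generators` that is a basis of L(generators)."""
    echelon = []  # pairs (pivot index, reduced row), one per kept vector
    basis = []
    for u in generators:
        row = [Fraction(c) for c in u]
        # subtract multiples of the reduced rows found so far
        for pivot, e in echelon:
            if row[pivot] != 0:
                factor = row[pivot] / e[pivot]
                row = [a - factor * b for a, b in zip(row, e)]
        # a non-zero remainder means u is not in the span of the kept vectors
        pivot = next((i for i, c in enumerate(row) if c != 0), None)
        if pivot is not None:
            echelon.append((pivot, row))
            basis.append(u)
    return basis
```

Applied to the generating system $(1,0,0), (2,0,0), (0,1,0), (1,1,0), (0,0,0), (0,0,3)$, the procedure keeps the first, third and last vector; applied to the single vector $o$ it returns the empty system, in agreement with the remark in the proof.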
3.3 The common cardinality of all bases of a finitely generated vector space $V$ is called the dimension of $V$ and denoted by $\dim V$.
From 2.6 and 3.2 we immediately obtain
Corollary. Let $\dim V = n$. Then
1. every generating system $u_1, \dots, u_n$ is a basis, and
2. every linearly independent system $u_1, \dots, u_n$ is a basis.
3.4 Theorem. A subspace $W$ of a finitely generated vector space $V$ is finitely generated and we have $\dim W \le \dim V$. If $\dim W = \dim V$ then $W = V$.
Proof. We just have to show that $W$ is finitely generated; the other statements are consequences of the already proved facts (since a basis of $W$ is a linearly independent system in $V$). Suppose $W$ is not finitely generated. Then, first, it contains a non-zero element $u_1$. Suppose we have already found a linearly independent system $u_1, \dots, u_n$ in $W$. Since $W \ne L(u_1, \dots, u_n)$ there exists a $u_{n+1} \in W \setminus L(u_1, \dots, u_n)$. Then, by 2.2.1 4, $u_1, \dots, u_n, u_{n+1}$ is linearly independent, and we can construct inductively an arbitrarily large independent system, contradicting 2.6. □
3.5 Remark
We have learned that every finitely generated vector space has a basis. In fact, one can easily prove, using Zorn's lemma, that every vector space has one. Indeed, let $\{I_j \mid j \in J\}$ be a chain of independent subsets of $V$. Then $I = \bigcup \{I_j \mid j \in J\}$ is an independent set again, since any finite subset $M = \{x_1, \dots, x_n\} \subseteq I$ is independent: if $x_k \in I_{j_k}$ then $M \subseteq I_r$, the largest of the $I_{j_k}$, $k = 1, \dots, n$. Thus there exists a maximal independent set $B$ and this $B$ is a basis: if there were $x \notin L(B)$ we would have $\{x\} \cup B$ independent, by 2.2.1 4, contradicting the maximality.
Recall the sum of subspaces from 1.7. We have
3.6 Theorem. Let $W_1, W_2$ be finitely generated subspaces of a vector space $V$. Then
$$\dim W_1 + \dim W_2 = \dim(W_1 \cap W_2) + \dim(W_1 + W_2).$$
Proof. Consider a basis $u_1, \dots, u_k$ of $W_1 \cap W_2$. By 3.2, there exist bases $u_1, \dots, u_k, v_{k+1}, \dots, v_r$ of $W_1$, and $u_1, \dots, u_k, w_{k+1}, \dots, w_s$ of $W_2$. Then the system $u_1, \dots, u_k, v_{k+1}, \dots, v_r, w_{k+1}, \dots, w_s$
obviously generates $W_1 + W_2$ and hence our statement will follow if we prove that it is linearly independent (and hence a basis), since then $\dim(W_1 + W_2) = r + s - k$. To this end, let
$$\sum_{j=1}^{k} \alpha_j u_j + \sum_{j=k+1}^{r} \beta_j v_j + \sum_{j=k+1}^{s} \gamma_j w_j = o.$$
Then we have
$$\sum_{j=k+1}^{r} \beta_j v_j = -\sum_{j=1}^{k} \alpha_j u_j - \sum_{j=k+1}^{s} \gamma_j w_j \in W_1 \cap W_2$$
and since it also can be written as $\sum_{j=1}^{k} \delta_j u_j$, all the $\beta_j$ are zero, by 3.1.1. Consequently,
$$\sum_{j=1}^{k} \alpha_j u_j + \sum_{j=k+1}^{s} \gamma_j w_j = o$$
and since $u_1, \dots, u_k, w_{k+1}, \dots, w_s$ is a basis, also all the $\alpha_i$ and $\gamma_i$ are zero. □
4 Inner products and orthogonality
4.1 In this section, it is important that we work with vector spaces over $\mathbb{R}$ or $\mathbb{C}$. Since all the formulas in the real context will be special cases of the respective complex ones, the proofs will be done in $\mathbb{C}$. Recall the complex conjugate $\bar{z} = z_1 - i z_2$ of $z = z_1 + i z_2$, the formulas $\overline{z + z'} = \bar{z} + \bar{z'}$ and $\overline{z \cdot z'} = \bar{z} \cdot \bar{z'}$, the absolute value $|z| = \sqrt{z \bar{z}}$, and realize that for a real $z$ this absolute value is the standard one.
4.2 An inner product in a vector space $V$ over $\mathbb{C}$ resp. $\mathbb{R}$ is a mapping $((x, y) \mapsto x \cdot y) : V \times V \to \mathbb{C}$ resp. $\mathbb{R}$ such that
(1) $u \cdot u \ge 0$ (in particular always real), and $u \cdot u = 0$ only if $u = o$,
(2) $u \cdot v = \overline{v \cdot u}$ ($u \cdot v = v \cdot u$ in the real case),
(3) $(\alpha u) \cdot v = \alpha (u \cdot v)$, and
(4) $u \cdot (v + w) = u \cdot v + u \cdot w$.
We usually write simply $uv$ for $u \cdot v$, and $u^2$ for $uu$. Note that
$$u(\alpha v) = \overline{(\alpha v) u} = \overline{\alpha (vu)} = \bar{\alpha}\, \overline{(vu)} = \bar{\alpha} (uv)$$
and, using similarly twice the complex conjugate, $(v + w) u = vu + wu$.
Remark: The notation for an inner product sometimes varies. The most common alternate notation to $x \cdot y$ is $\langle x, y \rangle$ (although one must beware of possible confusion with our notation for closed intervals). The notation is particularly convenient when we want to express the dependence of the product on some other data, such as a matrix (see Section 7.7 below).
Further, we introduce the norm $\|u\| = \sqrt{uu}$.
4.3 An important example
In the row vector space $\mathbb{F}_n$ we will use, without further mention, the inner product
$$\mathbf{x}\mathbf{y} = \sum_{j=1}^{n} x_j \overline{y_j} \qquad \Big(\text{in the real case } \mathbf{x}\mathbf{y} = \sum_{j=1}^{n} x_j y_j\Big)$$
(see Exercise (2)). This specific example of an inner product is sometimes referred to as the dot product.
4.4 Theorem. (The Cauchy-Schwarz inequality) We have $|xy| \le \sqrt{xx}\,\sqrt{yy}$.
Proof. For every scalar $\lambda$ we have
$$0 \le (x + \lambda y)(x + \lambda y) = xx + (\lambda y)x + x(\lambda y) + (\lambda y)(\lambda y) = xx + \lambda (yx) + \bar{\lambda} (xy) + \lambda \bar{\lambda} (yy). \tag{*}$$
If $y = o$ then the inequality in the statement holds trivially. Else set $\lambda = -\frac{xy}{yy}$ to obtain from (*)
$$0 \le xx - \frac{xy}{yy}(yx) - \frac{yx}{yy}(xy) + \frac{(xy)(yx)}{(yy)(yy)}(yy) = xx - \frac{xy}{yy}(yx)$$
and hence $(xy)\overline{(xy)} = (xy)(yx) \le (xx)(yy)$. Take square roots. □
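The inequality and the choice of $\lambda$ in the proof can be tried out on the dot product of 4.3. The following Python sketch is an illustration with hypothetical names (`inner`, `norm`, `minimal_value`) and arbitrarily chosen sample vectors; it works in floating point, so equalities hold only up to rounding.

```python
import math

def inner(x, y):
    """Dot product of 4.3 on complex n-tuples: sum of x_j * conj(y_j)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    """The norm ||x|| = sqrt(xx); xx is real and non-negative."""
    return math.sqrt(inner(x, x).real)

def minimal_value(x, y):
    """Value of (x + lam*y)(x + lam*y) at lam = -(xy)/(yy), assuming y != o."""
    lam = -inner(x, y) / inner(y, y)
    z = [a + lam * b for a, b in zip(x, y)]
    return inner(z, z).real

# Sample vectors (an arbitrary choice made for this illustration).
x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, -1j, 4 + 0j]
lhs = abs(inner(x, y))   # |xy|
rhs = norm(x) * norm(y)  # sqrt(xx) * sqrt(yy)
```

For these vectors `lhs` does not exceed `rhs`, and `minimal_value(x, y)` agrees with $xx - |xy|^2/(yy)$, the non-negative quantity that the proof extracts from (*).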
4.5 Vectors $u, v$ are said to be orthogonal if $uv = 0$. Note that the only vector orthogonal to itself is $o$. A system $u_1, \dots, u_n$ is said to be orthogonal if $u_j u_k = 0$ whenever $j \ne k$. It is orthonormal if, moreover, $\|u_j\| = 1$ for all $j$.
4.5.1 Proposition. An orthogonal system consisting of non-zero elements (in particular, an orthonormal system) is linearly independent.
Proof. Multiply $o = \sum \alpha_j u_j$ by $u_k$ from the right. We obtain $0 = (\sum \alpha_j u_j) u_k = \sum \alpha_j (u_j u_k) = \alpha_k (u_k u_k)$. Since $u_k u_k \ne 0$, $\alpha_k = 0$. □
4.5.2 Theorem. (The Gram-Schmidt orthogonalization process) For every basis $u_1, \dots, u_n$ of a vector space $V$ with inner product there exists an orthonormal basis $v_1, \dots, v_n$ such that for each $k = 1, 2, \dots, n$,
$$L(v_1, \dots, v_k) = L(u_1, \dots, u_k).$$
If $u_1, \dots, u_r$ is orthonormal we can have $v_j = u_j$ for $j \le r$.
Proof. Start with $v_1 = \frac{u_1}{\|u_1\|}$. If we already have an orthonormal system $v_1, \dots, v_k$ such that $L(v_1, \dots, v_r) = L(u_1, \dots, u_r)$ for all $r \le k$, set
$$w = u_{k+1} - \sum_{j=1}^{k} (u_{k+1} v_j) v_j.$$
For all $v_r$, $r \le k$, we have
$$w v_r = u_{k+1} v_r - \sum_{j=1}^{k} (u_{k+1} v_j)(v_j v_r) = u_{k+1} v_r - u_{k+1} v_r = 0.$$
We have $w \ne o$ since otherwise $u_{k+1} = \sum_{j=1}^{k} (u_{k+1} v_j) v_j \in L(v_1, \dots, v_k) = L(u_1, \dots, u_k)$, contradicting the linear independence of $u_1, \dots, u_k, u_{k+1}$. Thus we can set
$$v_{k+1} = \frac{w}{\|w\|}$$
and obtain an orthonormal system $v_1, \dots, v_k, v_{k+1}$ and $L(v_1, \dots, v_k, v_{k+1}) = L(u_1, \dots, u_k, u_{k+1})$ by 2.5. Finally observe that if $u_1, \dots, u_r$ was already orthonormal, the procedure yields $v_j = u_j$ until $j = r$. □
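The inductive step of the proof is already an algorithm. A minimal Python sketch for complex $n$-tuples with the dot product of 4.3 might look as follows; the names `inner`, `norm` and `gram_schmidt`, as well as the rounding tolerance `1e-12` used to detect $w = o$, are assumptions of this illustration and not part of the text.

```python
import math

def inner(x, y):
    """Dot product of 4.3: sum of x_j * conj(y_j)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(basis):
    """Orthonormalise a linearly independent list of complex n-tuples."""
    ortho = []
    for u in basis:
        # w = u - sum_j (u v_j) v_j, as in the proof of 4.5.2
        w = [complex(c) for c in u]
        for v in ortho:
            coeff = inner(u, v)
            w = [wi - coeff * vi for wi, vi in zip(w, v)]
        length = norm(w)
        if length < 1e-12:  # assumed tolerance: w = o up to rounding
            raise ValueError("input vectors are not linearly independent")
        ortho.append([wi / length for wi in w])
    return ortho
```

For example, `gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1j]])` returns three vectors whose pairwise dot products are $\delta_{jk}$ up to rounding, and whose first member is $u_1 / \|u_1\|$. In floating point the later vectors gradually lose orthogonality for badly conditioned inputs; the sketch makes no attempt to correct this.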
4.6 The orthogonal complement of a subspace $W$ of a vector space $V$ with inner product is the set
$$W^\perp = \{u \in V \mid uv = 0 \text{ for all } v \in W\}.$$
From the properties in 4.2 we immediately obtain
4.6.1 Observations.
1. $W^\perp$ is a subspace of $V$ and we have $W^\perp \cap W = \{o\}$ and the implication $W_1 \subseteq W_2 \Rightarrow W_2^\perp \subseteq W_1^\perp$.
2. $L(v_1, \dots, v_n)^\perp = \{u \mid u v_j = 0 \text{ for all } j = 1, \dots, n\}$.
4.6.2 Theorem. Let $V$ be a finite-dimensional vector space with inner product. Then we have, for subspaces $W, W_j \subseteq V$,
(1) $W \oplus W^\perp = V$,
(2) $\dim W^\perp = \dim V - \dim W$,
(3) $(W^\perp)^\perp = W$, and
(4) $(W_1 \cap W_2)^\perp = W_1^\perp + W_2^\perp$ and $(W_1 + W_2)^\perp = W_1^\perp \cap W_2^\perp$.
Proof. (1) and (2): Let $u_1, \dots, u_k$ be an orthonormal basis of $W$. By 2.6 and 4.5.2 we can extend it to an orthonormal basis $u_1, \dots, u_k, u_{k+1}, \dots, u_n$ of $V$. If $x = \sum_{j=1}^{n} \alpha_j u_j$ is in $W^\perp$ we have $0 = x u_r = \sum_{j=1}^{n} \alpha_j (u_j u_r) = \alpha_r$ for $r \le k$ and $x \in L(u_{k+1}, \dots, u_n)$. On the other hand, if $x \in L(u_{k+1}, \dots, u_n)$ then $x \in W^\perp$ by 4.6.1 2. Thus, $W^\perp = L(u_{k+1}, \dots, u_n)$, and (1) and (2) follow.
(3) Obviously $W \subseteq (W^\perp)^\perp$. By (2), $\dim W = \dim (W^\perp)^\perp$ and hence $W = (W^\perp)^\perp$ by 3.4.
(4) Obviously $W_i^\perp \subseteq (W_1 \cap W_2)^\perp$ and hence $W_1^\perp + W_2^\perp \subseteq (W_1 \cap W_2)^\perp$, and similarly $W_1^\perp \cap W_2^\perp \supseteq (W_1 + W_2)^\perp$. Now, using (3) and 4.6.1 1 we obtain
$$(W_1 \cap W_2)^\perp = ((W_1^\perp)^\perp \cap (W_2^\perp)^\perp)^\perp \subseteq ((W_1^\perp + W_2^\perp)^\perp)^\perp = W_1^\perp + W_2^\perp,$$
and
$$(W_1 + W_2)^\perp = ((W_1^\perp)^\perp + (W_2^\perp)^\perp)^\perp \supseteq ((W_1^\perp \cap W_2^\perp)^\perp)^\perp = W_1^\perp \cap W_2^\perp.$$
□
4.7 Hermitian and Symmetric Bilinear Forms
For a vector space $V$ over $\mathbb{C}$, a mapping $V \times V \to \mathbb{C}$ satisfying all the axioms of 4.2 except axiom (1) is called a Hermitian form. If we replace $\mathbb{C}$ by $\mathbb{R}$ in this definition, we speak of a symmetric bilinear form (over $\mathbb{R}$). For Hermitian and symmetric bilinear forms, one usually does not use the notation $\cdot$, but a letter, for example $B(u, v)$, $u, v \in V$. (Note that by axiom (2) of 4.2, $B(v, v)$ is always a real number.) A Hermitian (resp. real symmetric bilinear) form $B$ is then called positive definite (resp. negative definite) if $B$ is an inner product (resp. $-B$ is an inner product). $B$ is called indefinite if it is neither positive nor negative definite. A Hermitian resp. real symmetric bilinear form $B$ is called degenerate if there exists a non-zero vector $v \in V$ such that for every $w \in V$, $B(v, w) = 0$. Otherwise, $B$ is called non-degenerate. Clearly, every degenerate Hermitian or real symmetric bilinear form is indefinite. Whether a real symmetric bilinear form is non-degenerate, and whether it is positive or negative definite, is important in multivariable differential calculus (see Section 8 of Chapter 3). Hermitian forms behave analogously in many ways. It is therefore natural to ask: Given a Hermitian or real symmetric bilinear form, can we decide if it is positive or negative definite? Doing this algorithmically requires solving systems of linear equations, which we will review in Appendix B, so we will postpone the solution of this problem to Appendix B.2.6 below.
5 Linear mappings
5.1 Let $V, W$ be vector spaces. A mapping $f : V \to W$ is said to be linear if
for all $x, y \in V$, $f(x + y) = f(x) + f(y)$, and
for all $\alpha \in \mathbb{F}$ and $x \in V$, $f(\alpha x) = \alpha f(x)$.
Note that the “multiplication by elements of $\mathbb{F}$” really acts as individual unary operations (recall 1.1). In particular, a linear mapping $f : \mathbb{F} \to \mathbb{F}$ with $\mathbb{F}$ viewed as $\mathbb{F}_1$ (recall 1.2 1) satisfies $f(ax) = a f(x)$, not $f(ax) = f(a) f(x)$. A linear mapping $f : V \to W$ is an isomorphism if there is a linear mapping $g : W \to V$ such that $fg = \mathrm{id}$ and $gf = \mathrm{id}$; $V$ and $W$ are then said to be isomorphic.
We have an immediate 5.1.1 Observation. A composition of linear mappings is a linear mapping.
5.2 Examples
1. The projections $p_k = ((x_1, \dots, x_n) \mapsto x_k) : \mathbb{F}_n \to \mathbb{F}_1$ are linear mappings.
2. The mapping $((x_1, x_2, x_3) \mapsto (x_2, x_1 - x_3)) : \mathbb{F}_3 \to \mathbb{F}_2$ is linear.
3. Recall 1.2 2. For a fixed $x \in X$, the mapping $(\varphi \mapsto \varphi(x)) : F(X) \to \mathbb{R}_1$ is linear.
4. Let $J$ be an open interval. Recall 1.2 2 again. Taking the derivative at a point $a \in J$ is a linear mapping from $C^1(J)$ to $\mathbb{R}_1$.
See the Exercises for more examples.
5.3 Theorem. Let $f : V \to W$ be a linear mapping such that $f[V] = W$, let $g : V \to Z$ be a linear mapping, and let $h : W \to Z$ be a mapping such that $hf = g$. Then $h$ is linear.
Proof. For each $w \in W$ choose an element $\sigma(w) \in V$ such that $f(\sigma(w)) = w$. We have $h(x + y) = h(f(\sigma(x)) + f(\sigma(y))) = hf(\sigma(x) + \sigma(y)) = g(\sigma(x) + \sigma(y)) = g\sigma(x) + g\sigma(y) = hf\sigma(x) + hf\sigma(y) = h(x) + h(y)$ and similarly $h(\alpha x) = h(\alpha f(\sigma(x))) = hf(\alpha \sigma(x)) = g(\alpha \sigma(x)) = \alpha g(\sigma(x)) = \alpha hf\sigma(x) = \alpha h(x)$. □
Note. This is a general fact about homomorphisms between algebraic structures.
5.3.1 Corollary. Every linear mapping $f : V \to W$ that is one-one and onto is an isomorphism.
(Indeed, there is a $g : W \to V$ such that $fg = \mathrm{id}$ and $gf = \mathrm{id}$. Since $f$ is onto and $\mathrm{id}$ is linear, $g$ is linear.)
5.3.2 Corollary. If $\dim V = n$ then $V$ is isomorphic to $\mathbb{F}_n$.
(Choose a basis $u_1, \dots, u_n$ and define a mapping $f : \mathbb{F}_n \to V$ by setting $f((x_1, \dots, x_n)) = \sum x_i u_i$. This $f$ is obviously linear and by 3.1.1 (3) it is one-one and onto.)
5.4 Proposition. Let $f : V \to W$ be a linear mapping. If $f$ is one-one then it sends every linearly independent system to a linearly independent one; if $f$ is onto then it sends every generating set to a generating one. Consequently, isomorphisms preserve generating sets, linearly independent ones, and bases.
Proof. Let $f$ be one-one and let $\sum \alpha_j f(x_j) = o$. Then $f(\sum \alpha_j x_j) = f(o)$ and $\sum \alpha_j x_j = o$, so that if $x_1, \dots, x_n$ are linearly independent, all the $\alpha_j$ are zero.
Let $f$ be onto and let $M$ generate $V$. For a $y \in W$ choose an $x \in V$ such that $f(x) = y$ and write $x$ as $\sum \alpha_i u_i$ with $u_i \in M$. Then $y = f(x) = f(\sum \alpha_i u_i) = \sum \alpha_i f(u_i)$ with $f(u_i) \in f[M]$. □
5.5 Theorem. Let $u_1, \dots, u_n$ be a basis of a vector space $V$, let $W$ be a vector space and let $\varphi : \{u_1, \dots, u_n\} \to W$ be an arbitrary mapping. Then there exists precisely one linear mapping $f : V \to W$ such that $f(u_i) = \varphi(u_i)$ for each $i$.
Proof. Since every element of $V$ can be written as $x = \sum \alpha_j u_j$ there is at most one such $f$: we must have $f(x) = \sum \alpha_j \varphi(u_j)$. On the other hand, if $x = \sum \alpha_j u_j$ and $y = \sum \beta_j u_j$ then $x + y = \sum (\alpha_j + \beta_j) u_j$ and it is, by 3.1.1, the only such representation. Similarly for $\alpha x = \sum \alpha \alpha_j u_j$. Thus, setting
$$f(x) = \sum \alpha_j \varphi(u_j) \quad \text{where } x = \sum \alpha_j u_j$$
yields a linear mapping $f : V \to W$ such that $f(u_i) = \varphi(u_i)$. □
5.6 The Free Vector Space on a Set S
In view of Theorem 5.5, it is an interesting question if for any set $S$, we can find a vector space with a basis $B$ and a bijection $\varphi : S \xrightarrow{\ \cong\ } B$. This is called the free $\mathbb{F}$-vector space on the set $S$, and denoted by $\mathbb{F}S$ (it is customary to treat $\varphi$ as the identity, which is usually OK, since it is specified). Of course, for $S$ finite, we may simply take $\mathbb{F}_n$ where $n$ is the cardinality of $S$. However, for $S$ infinite, the Cartesian product $\mathbb{F}^S$ turns out not to be the right construction. Rather, we set
$$\mathbb{F}S = \{ a : S \to \mathbb{F} \mid \text{there exists a finite subset } F \subseteq S \text{ such that } a(s) = 0 \text{ for } s \in S \setminus F \}.$$
The operations of addition and multiplication by a scalar are done point-wise. In fact, this is a vector subspace of $\mathbb{F}^S$, which is the space of all maps $S \to \mathbb{F}$. The basis $B$ in question is the set of all maps $a_s : S \to \mathbb{F}$, $s \in S$, where $a_s(s) = 1$ and $a_s(t) = 0$ for $t \ne s$. It is easily verified that this is a basis. One usually treats the map $S \to \mathbb{F}S$, $s \mapsto a_s$ as an inclusion, so $a_s$ becomes identified with $s$.
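The finitely supported maps $S \to \mathbb{F}$ have a natural computer representation: one stores only the finitely many non-zero values. The following Python sketch illustrates this for rational coefficients; the class name `FreeVector` and the helper `basis_vector` are hypothetical, and the choice of a dictionary keyed by elements of $S$ is an assumption of this illustration.

```python
from fractions import Fraction

class FreeVector:
    """A finitely supported map S -> F, stored as a dict of its non-zero values."""

    def __init__(self, coeffs=None):
        # drop zero entries so that equal maps have equal dictionaries
        self.coeffs = {s: Fraction(c) for s, c in (coeffs or {}).items() if c != 0}

    def __add__(self, other):
        out = dict(self.coeffs)
        for s, c in other.coeffs.items():
            out[s] = out.get(s, 0) + c  # point-wise addition
        return FreeVector(out)

    def __rmul__(self, alpha):
        # point-wise multiplication by the scalar alpha, written alpha * a
        return FreeVector({s: Fraction(alpha) * c for s, c in self.coeffs.items()})

    def __eq__(self, other):
        return self.coeffs == other.coeffs

    def __call__(self, s):
        """The value a(s); zero outside the finite support."""
        return self.coeffs.get(s, Fraction(0))

def basis_vector(s):
    """The map a_s with a_s(s) = 1 and a_s(t) = 0 for t != s."""
    return FreeVector({s: 1})
```

With this representation the element `2 * basis_vector("a") + 3 * basis_vector("b")` is the map taking the value 2 at `"a"`, 3 at `"b"` and 0 elsewhere, and every `FreeVector` is a linear combination of finitely many `basis_vector(s)`, whatever the size of $S$.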
5.7 Affine subsets
Let $W$ be a subspace of a vector space $V$ and let $x_0 \in V$. A subset of the form $x_0 + W = \{x_0 + w \mid w \in W\}$ is called an affine subset of $V$ (or affine set in $V$).
5.7.1 Proposition. Let $L$ be an affine set in $V$. Then the subspace $W$ in the representation $L = x_0 + W$ is uniquely determined, while for $x_0$ one can take an arbitrary element of $L$. The space $W$ is sometimes referred to as the associated vector subspace of $V$, and the dimension of $W$ is referred to as the dimension of $L$.
Proof. We have $w \in W$ if and only if $w = x - y$ with $x, y \in L$ (indeed, $(x_0 + u) - (x_0 + v) = u - v \in W$ and on the other hand, if $w \in W$ then $w = (x_0 + w) - x_0$). Now let $x_1 = x_0 + w_0$ be arbitrary, $w_0 \in W$. Then for any $w \in W$ we have $x_1 + w = x_0 + (w_0 + w) \in L$, and $x_0 + w = x_1 - w_0 + w$. □
5.8 Theorem. Let $f : V \to Z$ be a linear mapping. Then
(1) $W = f^{-1}[\{o\}]$ is a subspace of $V$, and
(2) the $f^{-1}[\{z\}]$ are precisely the affine sets in $V$ of the form $v + W$ with $f(v) = z$.
Proof. (1): If $f(x) = f(y) = o$ then $f(\alpha x + \beta y) = o$.
(2) Let $f(v_0) = z$. Then for each $w \in W$ we have $f(v_0 + w) = f(v_0) + f(w) = z + o = z$ and on the other hand, if $f(v) = z$ then $f(v - v_0) = z - z = o$, hence $v - v_0 \in W$, and $v = v_0 + (v - v_0)$. □
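Theorem 5.8 can be observed on a small example. The sketch below is an illustration only: it takes the map of Example 2 in 5.2 to be $(x_1, x_2, x_3) \mapsto (x_2, x_1 - x_3)$ (an assumption about the intended sign), and the names `f`, `add`, `kernel_element`, `z` and `v0` are hypothetical.

```python
def f(x):
    """The linear map (x1, x2, x3) -> (x2, x1 - x3), assumed as in Example 2 of 5.2."""
    x1, x2, x3 = x
    return (x2, x1 - x3)

def add(x, y):
    """Coordinate-wise sum of two tuples."""
    return tuple(a + b for a, b in zip(x, y))

def kernel_element(t):
    """A typical element of W = f^{-1}[{o}], found by solving x2 = 0, x1 - x3 = 0."""
    return (t, 0, t)

z = (2, 5)
v0 = (5, 2, 0)  # one particular solution of f(v) = z
# elements of the affine set v0 + W, for a few integer parameters t
preimage_samples = [add(v0, kernel_element(t)) for t in range(-3, 4)]
```

Every element of `preimage_samples` is mapped to `z`, and any other solution of $f(v) = z$, for instance $(7, 2, 2)$, differs from `v0` by an element of the kernel, here `kernel_element(2)`.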
5.9 Affine maps
By an affine map between affine subsets $L \subseteq V$, $M \subseteq W$ of vector spaces $V$, $W$ we shall mean simply a map $f : L \to M$ which is of the form $f(x) = y_0 + g(x - x_0)$ where $x_0 \in L$, $y_0 \in M$, and $g$ is a linear map between the associated vector subspaces. It is possible to say a lot more about affine subsets and affine maps. Alternately, many calculus texts do not mention them at all and refer to affine subsets as “linear subsets”, and affine maps imprecisely as “linear maps [in the broader sense]”. We decided to make the compromise of keeping the terminology precise without dwelling on details which would not be useful to us.
6 Congruences and quotients
6.1 A congruence on a vector space $V$ is an equivalence relation $E \subseteq V \times V$ (we will write $xEy$ for $(x, y) \in E$) such that
$$xEy \;\Rightarrow\; (\alpha x)E(\alpha y) \text{ for all } \alpha \in \mathbb{F}, \text{ and}$$
$$x_i E y_i,\ i = 1, 2 \;\Rightarrow\; (x_1 + x_2)E(y_1 + y_2).$$
For the equivalence (congruence) classes $[x], [y]$ set
$$[x] + [y] = [x + y] \quad \text{and} \quad \alpha [x] = [\alpha x]$$
(this is correct: if $x' \in [x]$ and $y' \in [y]$ then $x'Ex$ and $y'Ey$ and hence $(x' + y')E(x + y)$ and $x' + y' \in [x + y]$; similarly for $[\alpha x]$). It is easy to check that the set of equivalence classes with these operations constitutes a vector space, denoted by $V/E$, and that $p_E = (x \mapsto [x]) : V \to V/E$ is a linear mapping onto.
6.2 Theorem. The formulas $E \mapsto W_E = \{x \mid xEo\}$ and $W \mapsto E_W = \{(x, y) \mid x - y \in W\}$ constitute a one-one correspondence between the congruences on $V$ and subspaces of $V$. The congruence classes of $E$ are precisely the affine sets $x + W_E$.
Proof. Obviously $W_E = \{x \mid xEo\}$ is a subspace. If $W$ is a subspace then $E_W$ is a congruence: trivially $xE_W x$; if $xE_W y$ then $x - y \in W$, hence $y - x = -(x - y) \in W$ and $yE_W x$; and if $xE_W y$ and $yE_W z$ then $x - z = (x - y) + (y - z) \in W$ and $xE_W z$; if $x_i E_W y_i$ then $(x_1 - y_1) + (x_2 - y_2) \in W$, that is, $(x_1 + x_2) - (y_1 + y_2) \in W$; and finally if $xE_W y$ we have $x - y \in W$ and hence $\alpha x - \alpha y \in W$, that is, $\alpha x E_W \alpha y$. Now $x \in W_{E_W}$ if and only if $xE_W o$ if and only if $x = x - o \in W$, and $xE_{W_E} y$ if and only if $x - y \in W_E$ if and only if $(x - y)Eo$ if and only if $xEy$.
Finally, if $y \in [x]$ then $yEx$, hence $(y - x)Eo$, that is, $y - x \in W_E$, and $y = x + (y - x) \in x + W_E$. If $y \in (x + W_E)$ then $y = x + w$ with $w \in W_E$ and $y - x = w \in W_E$, hence $yEx$. □
6.2.1 If $W$ is a subspace of $V$ we will use, in view of 6.2, the symbol $V/W$ instead of $V/E_W$. We call the vector space $V/W$ the quotient space (or factor) of $V$ by the subspace $W$.
6.3 Let $f : V \to Z$ be a linear mapping. The subspace $f^{-1}[\{o\}]$ of $V$ is called the kernel of $f$ and denoted by $\operatorname{Ker} f$.
Theorem. (The homomorphism theorem for vector spaces) For every linear mapping $f : V \to Z$ and every subspace $W \subseteq \operatorname{Ker} f$ there is a homomorphism $h : V/W \to Z$ defined by $h(x + W) = f(x)$. If $f$ is onto, so is $h$. If $W = \operatorname{Ker} f$, $h$ is one-to-one.
Proof. Using the projection $V/W \to V/\operatorname{Ker} f$, $x + W \mapsto x + \operatorname{Ker} f$, it suffices to consider the case $W = \operatorname{Ker} f$. If $x + \operatorname{Ker} f = y + \operatorname{Ker} f$ then $x - y \in \operatorname{Ker} f$, hence $f(x) - f(y) = o$ and $f(x) = f(y)$. Thus, the mapping $h$ is correctly defined. Since $hp = f$ for the linear mapping $p = (x \mapsto [x]) : V \to V/\operatorname{Ker} f$, which is onto, $h$ is a linear mapping, by 5.3. Now $h$ is obviously onto if $f$ is. If $x + \operatorname{Ker} f \ne y + \operatorname{Ker} f$ then $x - y \notin \operatorname{Ker} f$ and $f(x) - f(y) = f(x - y) \ne o$ so that $h$ is one-one. □
7 Matrices and linear mappings
7.1 Matrices
In this section we will deal with vector spaces over the field of complex or real numbers. A matrix of the type $m \times n$ is an array
$$A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}$$
where the entries $a_{jk}$ are numbers, real or complex, according to the context. If $m$ and $n$ are obvious we often write simply $A = (a_{jk})_{j,k}$ or $(a_{jk})_{jk}$. Sometimes the $jk$-th entry of a matrix $A$ is denoted by $A_{jk}$. The row vectors $(a_{j1}, \dots, a_{jn})$, $j = 1, \dots, m$ are called the rows of the matrix $A$, and the $(a_{1k}, \dots, a_{mk})$, $k = 1, \dots, n$ are called the columns of $A$. Hence, a matrix of the type $m \times n$ is sometimes referred to as a matrix with $m$ rows and $n$ columns. Matrices of the type $m \times m$ are called square matrices.
7.2 Basic operations with matrices
Transposition. Let $A = (a_{jk})_{jk}$ be an $m \times n$ matrix. The $n \times m$ matrix
$$A^T = (a'_{jk})_{jk} \quad \text{where } a'_{jk} = a_{kj}$$
is called the transposed matrix of $A$. There is a variant of this construction over the field $\mathbb{C}$: If $A$ is a matrix over $\mathbb{C}$, we denote by $A^*$ the complex conjugate of $A^T$, i.e. the matrix obtained from $A^T$ by replacing every entry by its complex conjugate. This is sometimes called the adjoint matrix of $A$. A (necessarily square) matrix $A$ which satisfies $A^T = A$ (resp. $A^* = A$) is called symmetric (resp. Hermitian).
Multiplication. Let $A = (a_{jk})_{jk}$ be an $m \times n$ matrix and let $B = (b_{jk})_{jk}$ be an $n \times p$ matrix. The product of $A$ and $B$ is the matrix
$$AB = (c_{jk})_{jk} \quad \text{where } c_{jk} = \sum_{r=1}^{n} a_{jr} b_{rk}.$$
The unit matrices are the matrices of type $n \times n$ defined by
$$I = I_n = (\delta_{jk})_{jk} \quad \text{where } \delta_{jk} = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases}$$
We obviously have $(AB)^T = B^T A^T$, $(AB)^* = B^* A^*$, and $AI = A$ and $IA = A$ whenever defined.
7 Matrices and linear mappings
471
The motivation for the definition of the product will be apparent in 7.6 below, where we will also learn more about its properties.
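The definitions of 7.2 can be tried out on small integer matrices. The following is only a minimal sketch: matrices are represented as lists of rows, and the helper names `transpose`, `matmul` and `identity` are our own choices, not notation from the text.

```python
def transpose(A):
    """Return A^T: the (j, k) entry of the result is the (k, j) entry of A."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B: c_jk = sum_r a_jr * b_rk."""
    n = len(B)
    assert all(len(row) == n for row in A), "types m x n and n x p are required"
    return [[sum(A[j][r] * B[r][k] for r in range(n)) for k in range(len(B[0]))]
            for j in range(len(A))]


def identity(n):
    """The unit matrix I_n = (delta_jk)."""
    return [[1 if j == k else 0 for k in range(n)] for j in range(n)]


A = [[1, 2, 3], [4, 5, 6]]       # type 2 x 3
B = [[1, 0], [2, -1], [0, 3]]    # type 3 x 2
AB = matmul(A, B)                # type 2 x 2

# (AB)^T = B^T A^T, AI = A and IA = A
transpose_rule_holds = transpose(AB) == matmul(transpose(B), transpose(A))
unit_rule_holds = matmul(A, identity(3)) == A and matmul(identity(2), A) == A
```

Note that the order of the factors is reversed in the transposition rule; with the factors in the original order the product $A^T B^T$ would here even have a different type ($3 \times 3$).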
7.3 Row and column vectors as matrices

A vector $x = (x_1, \dots, x_n) \in \mathbb{F}_n$ will be viewed as a matrix of the type $1 \times n$. Also, we will consider the column vectors, matrices of type $n \times 1$,

$$x^T = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$

Clearly, all column vectors of a given dimension $n$ also form a vector space over $\mathbb{F}$, known as the n-dimensional column vector space and denoted as $\mathbb{F}^n$. We will see that in spite of the fact that it is more convenient to write rows than columns, the space of columns is more convenient in the sense that for columns, composition of linear maps corresponds to multiplication of matrices without reversing orders (see Theorem 7.6 below). Because of this, nearly all courses in linear algebra now use the space of column vectors and not row vectors as the default model of an n-dimensional vector space. We will follow this convention in this text as well. In particular, we will extend the convention 1.3 to column vectors.
7.4 The standard bases of $\mathbb{F}_n$, $\mathbb{F}^n$

In the row vector space $\mathbb{F}_n$, we will consider the basis

$$e_1, \dots, e_n \quad\text{where } (e_j)_k = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k \end{cases}$$

and in $\mathbb{F}^n$, we will consider the basis

$$e^1, \dots, e^n \quad\text{where } e^i = (e_i)^T$$

(this notation conforms with 1.3; of course $(e_j)_k = \delta_{jk}$ from 7.2). The $e_j$'s from $\mathbb{F}_m$ and $\mathbb{F}_n$ with $m \neq n$ differ (and similarly for $e^j$), but this rarely causes confusion. In the rare cases where it can we will display the dimension $n$ as ${}_n e_j$, ${}_n e^j$. Obviously we have

$$x = \sum_{j=1}^{n} x_j e_j. \tag{7.4.1}$$
7.5 The linear maps $f_A$, $f^A$

Let $A$ be a matrix of type $m \times n$. Define a mapping

$$f_A : \mathbb{F}_m \to \mathbb{F}_n \quad\text{by setting } f_A(x) = xA,$$

and a mapping

$$f^A : \mathbb{F}^n \to \mathbb{F}^m \quad\text{by setting } f^A(x) = Ax.$$

7.5.1 Theorem. The mappings $f_A$, $f^A$ are linear and the formula $A \mapsto f_A$ resp. $A \mapsto f^A$ yields a bijective correspondence between matrices of type $m \times n$ and the set of all linear mappings $\mathbb{F}_m \to \mathbb{F}_n$ resp. $\mathbb{F}^n \to \mathbb{F}^m$.

Proof. We will prove the statement about row spaces. The statement for column spaces is analogous (see Exercise (10)). The linearity of the mappings is an immediate consequence of the definition of a product of matrices. We have

$$(e_j A)_{1k} = \sum_{r=1}^{m} e_{jr} a_{rk} = a_{jk} \tag{*}$$

and hence if $A \neq B$, there exist $r, s$ such that $a_{rs} \neq b_{rs}$. Thus, $f_A \neq f_B$. Now let $f : \mathbb{F}_m \to \mathbb{F}_n$ be an arbitrary linear mapping. Consider the $a_{jk}$ uniquely defined by the formula

$$f({}_m e_j) = \sum_{k=1}^{n} a_{jk} ({}_n e_k)$$

and define $A$ as the array $(a_{jk})_{jk}$. We have, by (*),

$$f(x) = f\Big(\sum_j x_j ({}_m e_j)\Big) = \sum_j x_j f({}_m e_j) = \sum_j x_j \sum_k a_{jk} ({}_n e_k) = \sum_k \Big(\sum_j x_j a_{jk}\Big)({}_n e_k),$$

and hence $f(x)_{1k} = (xA)_{1k}$ and finally $f(x) = xA$. □
7.6 Theorem. In the representation of linear mappings from 7.5 we have

$$f_I = \mathrm{id}, \quad f_{AB} = f_B \circ f_A, \quad\text{and}\quad f^I = \mathrm{id}, \quad f^{AB} = f^A \circ f^B.$$

Proof. We will only prove the statement for row vectors. The statement for column vectors is analogous (see Exercise (11)). The first formula is obvious. Now let $A$, $B$ be matrices of types $m \times n$ resp. $n \times p$. If two linear maps agree on a basis they obviously coincide. We have

$$f_B(f_A({}_m e_j)) = f_B\Big(\sum_k a_{jk} ({}_n e_k)\Big) = \sum_k a_{jk} f_B({}_n e_k) = \sum_k a_{jk} \Big(\sum_r b_{kr} ({}_p e_r)\Big) = \sum_r \Big(\sum_k a_{jk} b_{kr}\Big) {}_p e_r = f_{AB}({}_m e_j). \qquad \square$$
7.6.1 From the associativity of composition of mappings and from the uniqueness of the matrix in the representation of linear mappings as $f_A$ we immediately obtain

Corollary. Multiplication of matrices is associative, that is, $A(BC) = (AB)C$ whenever defined.
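Theorem 7.6 and the Corollary can be checked numerically on one example. This is a sketch under our own conventions: row and column vectors are both stored as plain lists, and the names `f_row` (for $f_A$), `f_col` (for $f^A$) and `matmul` are hypothetical helpers, not part of the text.

```python
def matmul(A, B):
    """Matrix product of lists of rows, c_jk = sum_r a_jr * b_rk."""
    return [[sum(A[j][r] * B[r][k] for r in range(len(B))) for k in range(len(B[0]))]
            for j in range(len(A))]


def f_row(A, x):
    """f_A(x) = xA for a row vector x of length m and an m x n matrix A."""
    return [sum(x[j] * A[j][k] for j in range(len(A))) for k in range(len(A[0]))]


def f_col(A, x):
    """f^A(x) = Ax for a column vector x of length n and an m x n matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1, 2, 0], [0, 1, 1]]      # type 2 x 3
B = [[1, 0], [2, 1], [0, 3]]    # type 3 x 2
C = [[1, 1], [0, 2]]            # type 2 x 2
AB = matmul(A, B)

x = [1, -1]   # read as a column vector in F^2
y = [2, 5]    # read as a row vector in F_2

# Columns keep the order of the factors, rows reverse it.
column_rule = f_col(AB, x) == f_col(A, f_col(B, x))   # f^{AB} = f^A o f^B
row_rule = f_row(AB, y) == f_row(B, f_row(A, y))      # f_{AB} = f_B o f_A
associative = matmul(A, matmul(B, C)) == matmul(AB, C)
```

A single example of course proves nothing; it only illustrates which of the two compositions corresponds to the product $AB$.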
7.6.2 Different bases, base change

At this point we must mention the fact that the association between matrices and linear maps works for arbitrary finite-dimensional vector spaces $V$, $W$. Let $B = (v_1, \dots, v_n)$ resp. $C = (w_1, \dots, w_m)$ be sequences of distinct vectors in $V$ resp. $W$ which, when considered as sets, form bases of $V$ and $W$ (we speak of ordered bases). Then for an $m \times n$ matrix $A$ over $\mathbb{F}$, we have an associated linear map

$${}_{B,C} f^A : V \to W$$

given by

$${}_{B,C} f^A(v_j) = \sum_{i=1}^{m} a_{ij} w_i.$$

Clearly (for example, by considering the isomorphisms between $V$, $\mathbb{F}^n$ and $W$, $\mathbb{F}^m$ mapping $B$ and $C$ to the standard bases), this again defines a bijective correspondence between $m \times n$ matrices over $\mathbb{F}$ and linear maps from $V$ to $W$. We will say that the linear map ${}_{B,C} f^A$ is associated to the matrix $A$ with respect to the bases $B$ and $C$, and, vice versa, that $A$ is the matrix associated with the linear map (or simply matrix of the linear map) $f = {}_{B,C} f^A$ with respect to the bases $B$, $C$. An analogue of Theorem 7.6 of course holds, i.e.

$${}_{B,D} f^{A_1 A_2} = {}_{C,D} f^{A_1} \circ {}_{B,C} f^{A_2} \tag{*}$$

for an $m \times n$ matrix $A_1$ and an $n \times p$ matrix $A_2$, and ordered bases $B$, $C$, $D$ of $p$- resp. $n$- resp. $m$-dimensional spaces $U$, $V$, $W$.

For two ordered bases $B$, $B'$ of the same finite-dimensional vector space $V$, the matrix of $\mathrm{Id} : V \to V$ with respect to the basis $B$ in the domain and $B'$ in the codomain is sometimes referred to as the base change matrix from the basis $B$ to the basis $B'$. By (*), base change matrices can be used to relate matrices of linear maps with respect to different bases, both in the domain and codomain.
7.7 Hermitian matrices and Hermitian forms

Given a Hermitian (resp. symmetric) matrix $A$ of type $n \times n$ over $\mathbb{C}$ (resp. over $\mathbb{R}$), we have a Hermitian (resp. symmetric bilinear) form $B$ on $\mathbb{C}^n$ (resp. $\mathbb{R}^n$) given by

$$B(x, y) = y^* A x.$$

When $B$ is positive-definite, this becomes an inner product, also denoted by $\langle x, y \rangle_B$. (In the real case, of course, $y^* = y^T$.) Conversely, the axioms immediately imply that every Hermitian (resp. symmetric bilinear) form on $\mathbb{C}^n$ (resp. $\mathbb{R}^n$) arises in this way. We will say that the form $B$ is associated with the matrix $A$ and vice versa. Sometimes we simplify the terminology and call a Hermitian (resp. real symmetric) matrix positive definite resp. negative definite resp. indefinite if the corresponding property holds for its associated Hermitian (resp. symmetric bilinear) form.
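As an illustration of the formula $B(x, y) = y^* A x$, the following sketch evaluates the form for one Hermitian $2 \times 2$ matrix using Python's built-in complex numbers. The function name `hermitian_form` and the sample data are our own assumptions; the entries are small Gaussian integers, so the floating-point arithmetic happens to be exact here.

```python
def hermitian_form(A, x, y):
    """B(x, y) = y^* A x = sum_{j,k} conj(y_j) * a_jk * x_k."""
    n = len(A)
    return sum(y[j].conjugate() * A[j][k] * x[k] for j in range(n) for k in range(n))


# A equals its own conjugate transpose, i.e. A is Hermitian.
A = [[2, 1j], [-1j, 3]]
x = [1 + 1j, 2]
y = [1j, 1 - 1j]

value_xx = hermitian_form(A, x, x)   # real, since B(x, x) equals its own conjugate
conjugate_symmetric = hermitian_form(A, x, y) == hermitian_form(A, y, x).conjugate()
```

For this particular $A$ the value $B(x, x)$ comes out positive, as one expects for a positive-definite form; a systematic test of definiteness is a separate matter.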
8 Exercises

(1) Prove the statement made in Example 1.2 3.
(2) Prove that the dot-product from 4.2 satisfies the definition of an inner product, and more generally the $B$ defined in Subsection 7.7 is a Hermitian (resp. symmetric bilinear) form.
(3) Prove that every Hermitian (resp. symmetric bilinear) form on $\mathbb{C}^n$ (resp. $\mathbb{R}^n$) is associated with a Hermitian (resp. symmetric) matrix.
(4) Take the vector space $V$ from 1.23. Prove that $(x \mapsto \ln x)$ is an isomorphism $V \to \mathbb{R}^1$.
(5) Prove that if $\cdot_1$, $\cdot_2$ are inner products on a (real or complex) vector space $V$, and $\lambda, \mu > 0$, then $\lambda(\cdot_1) + \mu(\cdot_2)$ is an inner product.
(6) Prove that linear maps $\mathbb{F} \to \mathbb{F}$ are precisely the mappings $(x \mapsto ax)$ where $a \in \mathbb{F}$ is fixed.
(7) Prove that if $\langle a, b \rangle$ is a closed interval then $\big(\varphi \mapsto \int_a^b \varphi(x)\,dx\big)$ is a linear mapping $C(\langle a, b \rangle) \to \mathbb{R}^1$.
(8) Prove that the set of all $a_s$, $s \in S$ in 5.6 forms a basis of the free vector space $\mathbb{F}S$ on a set $S$.
(9) Prove that an affine map $f : L \to M$ between affine subsets of vector spaces $V$, $W$ can be made to satisfy the definition 5.9 with any choice of the element $x_0 \in L$. Is an analogous statement true for $y_0 \in M$?
(10) Prove the statement of Theorem 7.5.1 for column vectors.
(11) Prove the statement of Theorem 7.6 for column vectors.
(12) Prove that the set of all matrices of type $m \times n$ with entries in $\mathbb{F}$ is a vector space over $\mathbb{F}$ where addition is addition of matrices, and multiplication by a scalar $\lambda \in \mathbb{F}$ is the operation which multiplies each entry by $\lambda$. Is this vector space finite-dimensional? What is its dimension?
B Linear Algebra II: More about Matrices

1 Transforming a matrix. Rank

1.1 Elementary row and column operations

Recall Section A.7. Let $A$ be a matrix of type $m \times n$. The vector subspace $\operatorname{Row}(A)$ of $\mathbb{F}_n$ generated by the rows of $A$ is called the row space of $A$ and the vector subspace $\operatorname{Col}(A)$ of $\mathbb{F}^m$ generated by the columns is called the column space of $A$. An elementary row (resp. column) operation on $A$ is any of the following three transformations of the matrix.

(E1) A permutation of the rows (resp. columns).
(E2) Multiplication of one of the rows (resp. columns) by a non-zero number.
(E3) Adding to a row (resp. column) a linear combination of the other rows (resp. columns).

1.1.1 Observation. An elementary row (resp. column) operation does not change the row resp. column space.
1.2 The column space is, of course, changed by a row operation (and the row space is changed by a column operation). We have, however, the following

Proposition. An elementary row (resp. column) operation preserves the dimension of the column (resp. row) space.

Proof. Let $p$ be a permutation of the set $\{1, 2, \dots, n\}$. Define $\varphi_p : \mathbb{F}^n \to \mathbb{F}^n$ by setting

$$\varphi_p(x_1, \dots, x_n) = (x_{p(1)}, \dots, x_{p(n)}).$$
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7, © Springer Basel 2013
Obviously $\varphi_p$ is an isomorphism: trivially it is linear, and it has an inverse, namely $\varphi_{p^{-1}}$. Further, for a non-zero $a$ define

$$\mu_a(x_1, x_2, \dots, x_n) = (a x_1, x_2, \dots, x_n).$$

Again, it is an isomorphism, with the inverse $\mu_{a^{-1}}$. Finally, setting for numbers $b_2, \dots, b_n$,

$$\psi(x_1, x_2, \dots, x_n) = \Big(x_1 + \sum_{j=2}^{n} b_j x_j,\ x_2, \dots, x_n\Big),$$

we obtain an isomorphism with the inverse sending $(x_1, x_2, \dots, x_n)$ to

$$\Big(x_1 - \sum_{j=2}^{n} b_j x_j,\ x_2, \dots, x_n\Big).$$

Now performing elementary row operations on $A$ we transform the column space by the isomorphisms $\varphi_p$, $\mu_a$ and $\psi$; an isomorphism sends a basis to a basis (A.5.4) and hence preserves dimension. □

1.3 Theorem. For any matrix $A$, the dimensions of the row and column spaces coincide.

Proof. By 1.1.1 and 1.2, the dimensions are unchanged after arbitrarily many row and column operations. If $a_{jk} = 0$ for all $j, k$ then both the dimensions are zero. Let there be an $a_{jk} \neq 0$. Performing (E1), we can move the $a_{jk}$ to the position $(1, 1)$ and multiplying the (now) first row by $\frac{1}{a_{jk}}$ we have our matrix transformed to

$$\begin{pmatrix} 1 & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mn} \end{pmatrix}.$$

Now we will perform the operations (E3), subtracting the $b_{j1}$ multiple of the first row from the $j$-th one, and when this is finished we do the same with the columns, thus obtaining the matrix transformed to

$$\begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & a^{(2)}_{22} & \cdots & a^{(2)}_{2n} \\ \vdots & \vdots & & \vdots \\ 0 & a^{(2)}_{m2} & \cdots & a^{(2)}_{mn} \end{pmatrix}.$$

If all the $a^{(2)}_{jk}$ with $j, k \geq 2$ are zero, the dimensions of the two spaces are 1. Otherwise choose an $a^{(2)}_{jk} \neq 0$, move it to the position $(2, 2)$ by (E1) operations (without affecting the first row and column) and repeat the procedure as above to obtain

$$\begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & a^{(3)}_{33} & \cdots & a^{(3)}_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & a^{(3)}_{m3} & \cdots & a^{(3)}_{mn} \end{pmatrix}.$$

After sufficiently many repetitions of the procedure we have $a^{(r+1)}_{jk} = 0$ for all $j, k > r$ and have a matrix

$$B = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & \ddots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \end{pmatrix}$$

with the first $r$ diagonal entries 1 and all the others zero, and hence the dimensions of both the row and the column spaces are equal to $r$. □

1.4 The common dimension of the row and column spaces is called the rank of the matrix and denoted by $\operatorname{rank} A$.
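The rank can be computed mechanically by elimination. The sketch below is a simplification of the procedure in the proof of 1.3: it uses row operations only (which by 1.1.1 do not change the row space) and counts the pivots, working with exact fractions. The function name `rank` and the sample matrix are our own choices.

```python
from fractions import Fraction


def rank(A):
    """Rank of a matrix (list of rows) by row elimination in exact arithmetic."""
    M = [[Fraction(v) for v in row] for row in A]
    m, n = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for col in range(n):
        # (E1): look for a row at or below position r with a non-zero entry in this column.
        pivot_row = next((i for i in range(r, m) if M[i][col] != 0), None)
        if pivot_row is None:
            continue
        M[r], M[pivot_row] = M[pivot_row], M[r]
        # (E3): clear the entries below the pivot.
        for i in range(r + 1, m):
            factor = M[i][col] / M[r][col]
            M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[r])]
        r += 1
        if r == m:
            break
    return r


A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]
A_transposed = [list(col) for col in zip(*A)]
```

Applying `rank` to `A` and to `A_transposed` gives the same number, in accordance with Theorem 1.3: the rows of the transposed matrix are the columns of `A`.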
2 Systems of linear equations

2.1 Let $A = (a_{jk})_{jk}$ be a matrix of type $m \times n$ and let $b_1, \dots, b_m$ be numbers. A system of linear equations is a name for the task of determining $x_1, \dots, x_n \in \mathbb{F}$ such that

$$\begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\ &\;\;\vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m. \end{aligned} \tag{2.1.1}$$

If $(b_1, \dots, b_m) = o$ we speak of a homogeneous system, and when replacing the original $b$ by $o$ we speak of the homogeneous system associated with (2.1.1). The matrix $A$ is called the matrix of the system and the matrix

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \cdots & a_{mn} & b_m \end{pmatrix}$$

is referred to as the augmented matrix of the system.
2.2 Three views of the task

1. Recall A.7.5. We seek an $x$ such that

$$A x^T = b^T.$$

Thus we have a linear map $f : \mathbb{F}^n \to \mathbb{F}^m$ and would like to determine the set $f^{-1}[b^T]$.

2. If we denote by $c_1, \dots, c_n$ the columns of $A$, we are seeking numbers $x_1, \dots, x_n$ such that

$$\sum_{j=1}^{n} x_j c_j = b^T.$$

3. The associated homogeneous system can be understood as seeking the $x$ such that

$$x \cdot \overline{a}_j = 0 \quad\text{for all } j = 1, \dots, m$$

where $\cdot$ is the dot product and $\overline{a}_j = (\overline{a_{j1}}, \dots, \overline{a_{jn}})$ are the complex conjugates of the rows of $A$ (this approach is valid for $\mathbb{F} = \mathbb{R}, \mathbb{C}$, which, as remarked above, are the only contexts we are interested in). Thus, the set of solutions of the associated homogeneous system coincides with the orthogonal complement

$$L(\overline{a}_1, \dots, \overline{a}_m)^{\perp}.$$

Now the dimension of $L(\overline{a}_1, \dots, \overline{a}_m)$ is the same as that of the row space, that is, equal to the rank $r$ of $A$: if we perform the procedure from Theorem A.3.2 (the Gram-Schmidt process) on the system $\overline{a}_1, \dots, \overline{a}_m$, we end up with a basis of the same size as when starting with $a_1, \dots, a_m$ (since $\overline{a}_j$ is a linear combination of the other $\overline{a}_k$'s if and only if $a_j$ is a linear combination of the other $a_k$'s). Thus, by Theorem A.4.6.2, the dimension of the subspace of solutions of a homogeneous system is $n - \operatorname{rank} A$.
2.3 From 2.2 2, we immediately obtain

2.3.1 Theorem (Frobenius). A system of linear equations has a solution if and only if the rank of the matrix of the system is the same as the rank of the augmented one. (That is: if and only if the right-hand side column is in the column space of $A$.)

From 2.2 1 and 2.2 3, we obtain

2.3.2 Theorem. If a system of linear equations has a solution $x_0$, then the set of all solutions is an affine set $x_0 + W$ where $W$ is the set of all solutions of the associated homogeneous system. The dimension of this affine set is $n - \operatorname{rank} A$.
2.4 The Gauss Elimination Method

By 2.3.2, to determine the set of all solutions of the system (2.1.1), it suffices to find one of its solutions $x_0$ and $s = n - r$ linearly independent solutions $x_1, \dots, x_s$ of the associated homogeneous system, where $r = \operatorname{rank} A$. The general solution is then

$$x_0 + \sum_{j=1}^{s} \alpha_j x_j, \qquad \alpha_j \in \mathbb{F} \text{ arbitrary}.$$

First observe that elementary row operations on the augmented matrix preserve the solution set. Column operations change the solution set and will not be used, with the exception of the (E1) performed on the $A$-part of the augmented matrix: this is relatively harmless; we will only have to keep track of the permuted coordinates of solutions.

Start with the augmented matrix and transform it by (E1) operations so that the $(1, 1)$ entry is non-zero, moving there a non-zero entry of $A$, say from the column $j_1$. Remember $j_1$. Then multiply the first row by the inverse of its (now non-zero) first entry to obtain

$$\begin{pmatrix} 1 & a'_{12} & \cdots & a'_{1n} & b'_1 \\ a'_{21} & a'_{22} & \cdots & a'_{2n} & b'_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a'_{m1} & a'_{m2} & \cdots & a'_{mn} & b'_m \end{pmatrix}$$

and then subtract from the $j$-th rows, $j = 2, \dots, m$, the $a'_{j1}$ multiple of the first one. Now we have

$$\begin{pmatrix} 1 & a'_{12} & \cdots & a'_{1n} & b'_1 \\ 0 & a''_{22} & \cdots & a''_{2n} & b''_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & a''_{m2} & \cdots & a''_{mn} & b''_m \end{pmatrix}.$$

We repeat the procedure in the part of the matrix with indices $\geq 2$ (during this, of course, the $a'_{12}, \dots, a'_{1n}$ are permuted, too; again, the column index $j_2$ of the entry moved to the $(2, 2)$ position is to be remembered). After repeating the procedure $r - 1$ times we obtain a matrix

$$\begin{pmatrix} 1 & c_{12} & c_{13} & \cdots & c_{1r} & \cdots & c_{1n} & \tilde b_1 \\ 0 & 1 & c_{23} & \cdots & c_{2r} & \cdots & c_{2n} & \tilde b_2 \\ 0 & 0 & 1 & \cdots & c_{3r} & \cdots & c_{3n} & \tilde b_3 \\ \vdots & & & \ddots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & \cdots & c_{rn} & \tilde b_r \\ 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & 0 \\ \vdots & & & & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & 0 \end{pmatrix}$$

(note that because of Frobenius' Theorem the right-hand side becomes zero after the $r$-th row or else the system has no solution) corresponding to a system of equations

$$\begin{aligned} y_1 + c_{12} y_2 + c_{13} y_3 + \cdots + c_{1r} y_r + c_{1,r+1} y_{r+1} + \cdots + c_{1n} y_n &= \tilde b_1, \\ y_2 + c_{23} y_3 + \cdots + c_{2r} y_r + c_{2,r+1} y_{r+1} + \cdots + c_{2n} y_n &= \tilde b_2, \\ &\;\;\vdots \\ y_r + c_{r,r+1} y_{r+1} + \cdots + c_{rn} y_n &= \tilde b_r \end{aligned}$$

with the same system of solutions if we set $y_k = x_{j_k}$. The one solution $y_0$ of the system can be obtained by setting $y_{0,r+1} = y_{0,r+2} = \cdots = y_{0,n} = 0$, $y_{0,r} = \tilde b_r$, and then recursively

$$y_{0,k-1} = -\sum_{j=k}^{n} c_{k-1,j}\, y_{0j} + \tilde b_{k-1}.$$

A basis $y_i$ ($i = 1, \dots, s = n - r$) of the vector space of solutions of the associated homogeneous system can be then obtained by setting $y_{i,r+i} = 1$, $y_{i,r+j} = 0$ otherwise, and then recursively

$$y_{i,k-1} = -\sum_{j=k}^{n} c_{k-1,j}\, y_{ij}.$$
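The elimination can be sketched in code. The version below is a variant of the method described above rather than a literal transcription: instead of permuting columns it records the pivot columns, and it also clears the entries above each pivot (the Gauss-Jordan form), so that no back substitution is needed. The function name `solve_linear_system`, the return convention and the exact arithmetic with `Fraction` are our own assumptions.

```python
from fractions import Fraction


def solve_linear_system(A, b):
    """Return (x0, basis): one solution and a basis of the solutions of the
    associated homogeneous system, or None if the system has no solution."""
    m, n = len(A), len(A[0])
    # Augmented matrix with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    pivots = []  # pivot column of each processed row
    row = 0
    for col in range(n):
        pivot_row = next((i for i in range(row, m) if M[i][col] != 0), None)
        if pivot_row is None:
            continue
        M[row], M[pivot_row] = M[pivot_row], M[row]       # (E1)
        pivot = M[row][col]
        M[row] = [v / pivot for v in M[row]]               # (E2)
        for i in range(m):                                 # (E3)
            if i != row and M[i][col] != 0:
                factor = M[i][col]
                M[i] = [vi - factor * vr for vi, vr in zip(M[i], M[row])]
        pivots.append(col)
        row += 1
        if row == m:
            break
    # Frobenius: below the last pivot row the right-hand side must vanish.
    if any(M[i][n] != 0 for i in range(row, m)):
        return None
    x0 = [Fraction(0)] * n
    for i, col in enumerate(pivots):
        x0[col] = M[i][n]
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for i, col in enumerate(pivots):
            v[col] = -M[i][free]
        basis.append(v)
    return x0, basis


solution = solve_linear_system([[1, 2, 1], [2, 4, 0]], [3, 2])
no_solution = solve_linear_system([[1, 1], [1, 1]], [1, 2])
```

In the first example the matrix has rank 2 and $n = 3$, so the homogeneous system has a one-dimensional space of solutions, in agreement with 2.3.2.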
2.5 Regular matrices

A matrix $A = (a_{ij})_{ij}$ of type $n \times n$ is said to be regular (or non-singular) if $\operatorname{rank} A = n$. In such a case, each system of equations

$$\sum_{j=1}^{n} a_{ij} x_j = b_i, \qquad i = 1, 2, \dots, n$$

has precisely one solution: it has a solution since the augmented matrix, being of type $n \times (n + 1)$, cannot have a bigger rank than $n$; on the other hand, the dimension of the set of solutions is $n - n = 0$. By 1.3, a matrix $A$ is regular if and only if $A^T$ is regular.

2.5.1 Theorem. The following statements about a square matrix $A$ are equivalent.
(1) $A$ is regular.
(2) There exists a matrix $U$ such that $AU = I$.
(3) There exists a matrix $V$ such that $VA = I$.
(4) The matrix $A$ has a unique inverse matrix, that is, there is a unique $U$ such that $UA = AU = I$.

Notation. The inverse matrix of $A$ will be denoted by $A^{-1}$.

Proof. (1)⇒(2),(3): Notation from 2.2 1 and A.7.4. For each $e^i$ on the right-hand side there is a solution $x_i$ such that

$$A x_i^T = e_i^T \ (= e^i).$$

Thus, $\sum_k a_{jk} x_{ik} = \delta_i^j$, and if we set $u_{ij} = x_{ji}$ we have $\sum_k a_{jk} u_{ki} = \delta_i^j$, that is, we have a $U$ such that $AU = I$. The statement (3) is obtained by applying this reasoning to $A^T$ and using A.7.2.

(2)⇒(1): Let $\sum_j a_{ij} u_{jk} = \delta_{ik}$. Fix $k$ and set $x_j = u_{jk}$. Then in the notation of 2.2 2 we have for the columns $c_j$ of $A$,

$$\sum_j x_j c_j = e^k.$$

Thus, the column space contains all the $e^k$ and hence its dimension is $n$.

(2)&(3)⇒(4): If $AU = I$ and $VA = I$ we have $V = V(AU) = (VA)U = U$.

(4)⇒(2) is trivial. □
2.6 Deciding if a Hermitian form is positive-definite or negative-definite

Recall now our problem from A.4.7 of deciding if a Hermitian (or real symmetric bilinear) form is positive-definite or negative-definite. Consider a Hermitian form $B$ on a finite-dimensional complex vector space $V$ (the case of a real symmetric bilinear form is analogous). Then perform the following procedure:

Start with $k = 0$. Suppose we have constructed vectors $v_1, \dots, v_k \in V$ such that $B(v_i, v_i) \neq 0$, $B(v_i, v_j) = 0$ for $i \neq j$. Note that the vectors $v_i$ must be linearly independent. (In effect, suppose

$$\sum_{i=1}^{k} a_i v_i = 0.$$

Applying $B(?, v_i)$, we get $a_i = 0$.) Then, using a system of linear equations, find a non-zero vector $w \in V$ such that $B(v_i, w) = 0$ for all $i = 1, \dots, k$. If no such $w$ exists, then by 2.2 3, $k \geq \dim(V)$, and by linear independence, equality arises, so the $v_i$'s form a basis of $V$. In this case, if the signs of the real numbers $B(v_i, v_i)$ are all positive (resp. negative), $B$ is positive-definite (resp. negative-definite). Otherwise, $B$ is indefinite.

Suppose the vector $w$ exists. If $B(w, w) \neq 0$, put $v_{k+1} = w$ and repeat the procedure with $k$ replaced by $k + 1$. If $B(w, w) = 0$, find a vector $u \in V$ such that $B(w, u) \neq 0$. If no such $u$ exists, $B$ is degenerate. If $u$ exists, then (for a form that, like the one in A.7.7, is linear in the first variable)

$$4B(w, u) = B(u + w, u + w) + iB(iu + w, iu + w) - B(-u + w, -u + w) - iB(-iu + w, -iu + w)$$

by the axioms, so choosing $v_{k+1}$ as one of the vectors $u + w$, $-u + w$, $iu + w$, $-iu + w$, the vector $v_{k+1}$ will satisfy $B(v_{k+1}, v_{k+1}) \neq 0$. Repeat the procedure with $k$ replaced by $k + 1$.
3 Determinants

3.1 A group $G$ is a set with a binary operation $\cdot$ which satisfies associativity, has a unit element $e$ and an inverse unary operation $(?)^{-1}$. Explicitly, the axioms are

$$(a \cdot b) \cdot c = a \cdot (b \cdot c), \quad a \cdot e = e \cdot a = a, \quad x \cdot x^{-1} = x^{-1} \cdot x = e.$$

For groups $G$, $H$, a map $f : G \to H$ is called a homomorphism of groups if we have

$$f(a \cdot b) = f(a) \cdot f(b) \quad\text{for all } a, b \in G.$$

A bijective homomorphism of groups is called an isomorphism (of groups). Obviously, the inverse of an isomorphism is again an isomorphism. Immediate examples of groups include the set $\mathbb{Z}$ of all integers with the operation $+$, the set $\{1, -1\}$ with the operation $\cdot$ (multiplication), as well as $\mathbb{R}$ or $\mathbb{C}$ with the operation $+$ or $\mathbb{R}^{\times} = \mathbb{R} \smallsetminus \{0\}$, $\mathbb{C}^{\times} = \mathbb{C} \smallsetminus \{0\}$ with the operation $\cdot$. Note that all those groups have the additional property that

$$a \cdot b = b \cdot a$$

where $\cdot$ is the operation. Groups satisfying this property are called commutative or abelian. We will soon encounter examples of groups which are not abelian. We will not develop the theory of groups at all here (and the reader is referred to [2] and [4] for more on abstract algebra), but they do come up naturally in the context of the determinant. In particular we will use the obvious fact that the mappings $G \to G$,

$$x \mapsto x^{-1} \quad\text{and}\quad x \mapsto ax \text{ for a fixed } a \in G,$$

are bijections (the first is inverse to itself, the other one to $x \mapsto a^{-1}x$). It then follows that if $G$ is finite and $f : G \to \mathbb{R}$ or $\mathbb{C}$ is any mapping then

$$\sum_{x \in G} f(x) = \sum_{x \in G} f(x^{-1}) = \sum_{x \in G} f(ax) \tag{3.1.1}$$

(all three are the same sum, only rearranged).
3.2 The sign of a permutation

We will be concerned with the group $P(n)$ of permutations of the set $\{1, 2, \dots, n\}$, i.e. bijections $\{1, 2, \dots, n\} \to \{1, 2, \dots, n\}$, where the operation is composition. A permutation $p \in P(n)$ will be usually encoded as a sequence

$$(k_1, \dots, k_n) \quad\text{where } k_j = p(j).$$

A transposition is a permutation interchanging two of the elements and keeping all the others.

3.2.1 Theorem. 1. Every permutation can be obtained as a composition of transpositions. 2. If $p \in P(n)$ can be represented as a composition of an even (resp. odd) number of transpositions then in any such representation the number of transpositions is even (resp. odd).

Proof. 1. By induction. The statement is obvious for $n = 1, 2$. Now let it hold for $P(n)$ and let $p$ be a permutation of $\{1, \dots, n, n+1\}$. Consider the transposition $\tau$ interchanging $n+1$ with $p(n+1)$ (if $p(n+1) = n+1$ set $\tau = \mathrm{id}$). Now $q = \tau \circ p$ sends $n+1$ to $n+1$, hence $\{1, \dots, n\}$ to $\{1, \dots, n\}$. The restriction $q'$ of $q$ to $\{1, \dots, n\}$ can be written as $q' = \tau'_1 \circ \cdots \circ \tau'_r$ with transpositions $\tau'_j$. Extending these to transpositions $\tau_j$ of $\{1, \dots, n, n+1\}$ we obtain a representation

$$p = \tau \circ \tau_1 \circ \cdots \circ \tau_r.$$

2. Encode $p$ as $(k_1, \dots, k_n)$ and set

$$I(p) = \{(i, j) \mid i < j \text{ and } k_i > k_j\}, \qquad \iota(p) = \# I(p)$$

($\#$ indicates the number of elements). We will prove that for any transposition $\tau$ the number $|\iota(\tau \circ p) - \iota(p)|$ is odd; since $\iota(\mathrm{id}) = 0$ the statement will follow. Let $\tau$ exchange $\alpha$ with $\beta$, $\alpha < \beta$, let $q = \tau \circ p$. Then we have

$$p = (k_1, \dots, k_{\alpha-1}, k_\alpha, k_{\alpha+1}, \dots, k_{\beta-1}, k_\beta, k_{\beta+1}, \dots, k_n)$$

and

$$q = (k_1, \dots, k_{\alpha-1}, k_\beta, k_{\alpha+1}, \dots, k_{\beta-1}, k_\alpha, k_{\beta+1}, \dots, k_n).$$

We obviously have $(i, j) \in I(p)$ if and only if $(i, j) \in I(q)$ for $i, j \neq \alpha, \beta$; or $i < \alpha$ and $j \in \{\alpha, \beta\}$, or $\beta < j$ and $i \in \{\alpha, \beta\}$. Thus we have to discuss the cases (a) $(\alpha, j)$ with $\alpha < j < \beta$, (b) $(j, \beta)$ with $\alpha < j < \beta$, and (c) $(\alpha, \beta)$. In cases (a) and (b) we have together an even number of changes: we have $(\alpha, j) \in I(p)$ if and only if $(j, \beta) \notin I(q)$, and $(j, \beta) \in I(p)$ if and only if $(\alpha, j) \notin I(q)$; thus if there are $s$ many $(\alpha, j) \in I(p)$ and $t$ many $(j, \beta) \in I(p)$ we have $s + t$ such pairs in $I(p)$ and $u - s + u - t = 2u - (s + t)$ such pairs in $I(q)$ where $u = \beta - \alpha - 1$ is the number of the indices $j$ in question. The case (c) stands alone, and it is in precisely one of the $I(p)$, $I(q)$. □

3.2.2 Notation and observation. We define

$$\operatorname{sgn} p = \begin{cases} +1 & \text{if } p \text{ is a composition of an even number of transpositions,} \\ -1 & \text{if } p \text{ is a composition of an odd number of transpositions.} \end{cases}$$

From the definition we immediately infer that

$$\operatorname{sgn} \mathrm{id} = 1, \quad \operatorname{sgn}(p \circ q) = \operatorname{sgn} p \cdot \operatorname{sgn} q \quad\text{and}\quad \operatorname{sgn} p^{-1} = \operatorname{sgn} p.$$

Permutations $p$ with $\operatorname{sgn} p = 1$ (resp. $\operatorname{sgn} p = -1$) are called even (resp. odd).

3.2.3 Corollary. The map $\operatorname{sgn} : P(n) \to \{1, -1\}$ sending a permutation to its sign is a homomorphism of groups, where on $\{1, -1\}$, we consider the operation of multiplication.
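The proof of 3.2.1 suggests a direct way to compute the sign: every transposition changes the number of pairs $i < j$ with $k_i > k_j$ by an odd amount, so the sign is $+1$ exactly when that number is even. The sketch below uses this; permutations are encoded as tuples $(k_1, \dots, k_n)$ as in the text, and the function names `sign` and `compose` are our own.

```python
from itertools import permutations


def sign(p):
    """Sign of a permutation encoded as (k_1, ..., k_n), via the parity of the
    number of pairs i < j with k_i > k_j."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def compose(p, q):
    """The composition p o q, i.e. j -> p(q(j)), for permutations of 1..n."""
    return tuple(p[q[j] - 1] for j in range(len(q)))


# sgn is a homomorphism: check sgn(p o q) = sgn p * sgn q on all of P(4).
all_p4 = list(permutations(range(1, 5)))
multiplicative = all(sign(compose(p, q)) == sign(p) * sign(q)
                     for p in all_p4 for q in all_p4)
```

For example, `sign((2, 1, 3))` is $-1$ (a single transposition), while `sign((2, 3, 1))` is $+1$ (a composition of two transpositions).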
3.3 The determinant of a matrix $A = (a_{ij})_{ij}$ of type $n \times n$ is the number

$$\det A = \sum_{p \in P(n)} \operatorname{sgn} p \cdot a_{1,p(1)} \cdots a_{n,p(n)}.$$

It is often indicated as

$$\begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}.$$

Thus for instance $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ (and this is about the only case of a determinant easily and transparently computed from the basic definition).
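For very small matrices the defining formula can be evaluated literally. The following sketch sums over all $n!$ permutations, which is hopeless for larger $n$ but makes the definition concrete; indices are 0-based here, and the names `sign` and `det_by_definition` are our own.

```python
from itertools import permutations
from math import prod


def sign(p):
    """Sign of a permutation of 0..n-1 given as a tuple, via the parity of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def det_by_definition(A):
    """det A = sum over all permutations p of sgn(p) * a_{1,p(1)} * ... * a_{n,p(n)}."""
    n = len(A)
    return sum(sign(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))


two_by_two = det_by_definition([[1, 2], [3, 4]])              # ad - bc
three_by_three = det_by_definition([[2, 0, 1], [1, 3, 2], [1, 1, 4]])
```

Already for $n = 10$ the sum has more than three million terms, which is why the determinant is computed differently in practice.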
3.3.1 Proposition. 1. $\det A^T = \det A$. 2. If $B$ is obtained from a square matrix $A$ by permuting the rows or columns following a permutation $p \in P(n)$ then $\det B = \operatorname{sgn} p \cdot \det A$.

Proof. 1. Rearranging the factors we obtain the formula

$$a_{1p(1)} \cdots a_{np(n)} = a_{p^{-1}(1)1} \cdots a_{p^{-1}(n)n}$$

and since $\operatorname{sgn} p^{-1} = \operatorname{sgn} p$ we can rewrite the formula from the definition as

$$\det A = \sum_{p \in P(n)} \operatorname{sgn} p^{-1} \cdot a_{p^{-1}(1)1} \cdots a_{p^{-1}(n)n}$$

which is, by (3.1.1), equal to

$$\sum_{p \in P(n)} \operatorname{sgn} p \cdot a_{p(1)1} \cdots a_{p(n)n}.$$

2. It suffices to prove it for a permutation of rows. We have $B = (a_{p(i)j})_{ij}$ so that

$$\det B = \sum_{q \in P(n)} \operatorname{sgn} q \cdot a_{p(1)q(1)} \cdots a_{p(n)q(n)}.$$

Rearranging the factors and using 3.2.2, we obtain

$$\det B = \sum_{q \in P(n)} \operatorname{sgn} q \cdot a_{1,qp^{-1}(1)} \cdots a_{n,qp^{-1}(n)} = \operatorname{sgn} p \sum_{q \in P(n)} \operatorname{sgn}(qp^{-1}) \cdot a_{1,qp^{-1}(1)} \cdots a_{n,qp^{-1}(n)}$$

and by (3.1.1),

$$= \operatorname{sgn} p \sum_{q \in P(n)} \operatorname{sgn} q \cdot a_{1,q(1)} \cdots a_{n,q(n)} = \operatorname{sgn} p \cdot \det A. \qquad \square$$

3.3.2 Corollary. If there are in a matrix $A$ two equal columns or rows then $\det A = 0$. (For, transposing such two rows yields $\det A = -\det A$.)

From the formula for $\det A$ we immediately get the following

3.4 Theorem. A determinant is linear in each of its rows (resp. columns). That is, if $A$ is a matrix of type $n \times n$ and if $A_j(x)$ is obtained from $A$ by replacing the $j$-th row by $x$ then the mapping

$$(x \mapsto \det A_j(x)) : \mathbb{F}_n \to \mathbb{R} \text{ resp. } \mathbb{C}$$

is linear.
3.4.1 Convention. The notation $A_j(x)$ will be kept in the remainder of this chapter. Furthermore, we will use the symbol $A^j(x^T)$ for the matrix in which the $j$-th column is replaced by $x^T$.

3.4.2 Theorem. If $B$ is obtained from $A$ by adding to a row (resp. column) a linear combination of the other rows (columns) then $\det B = \det A$.

Proof. Let $a_1, \dots, a_n$ be the rows of $A$. We have $A = A_i(a_i)$ and $B = A_i\big(a_i + \sum_{j \neq i} \alpha_j a_j\big)$. By 3.3.2, $\det A_i(a_j) = 0$ for $j \neq i$ and hence

$$\det B = \det A_i\Big(a_i + \sum_{j \neq i} \alpha_j a_j\Big) = \det A_i(a_i) + \sum_{j \neq i} \alpha_j \det A_i(a_j) = \det A. \qquad \square$$

3.4.3 Proposition. Let $a_{ij} = 0$ for $i > j$. Then $\det A = a_{11} a_{22} \cdots a_{nn}$. More explicitly,

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1,n-1} & a_{1n} \\ 0 & a_{22} & a_{23} & \cdots & a_{2,n-1} & a_{2n} \\ 0 & 0 & a_{33} & \cdots & a_{3,n-1} & a_{3n} \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & a_{nn} \end{vmatrix} = a_{11} a_{22} \cdots a_{nn}.$$

Proof. This follows again from the definition: if $p \neq \mathrm{id}$ then there is an $i$ with $i > p(i)$. □

3.4.4 Computing a determinant. Using elementary operations of the type (E1) and (E3) we can easily transform the matrix in our determinant into the form as in 3.4.3; then we will have the value as the product of the elements on the diagonal. The (E3) operations do not change the value (see 3.4.2). We have to be more careful with the (E1) operations, though. Since computing the sign may not be quite transparent, it is prudent to use transpositions only, and whenever one is performed, to automatically multiply one of the rows or columns by $-1$.
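The recipe of 3.4.4 can be sketched as follows. As a small deviation from the text, the sketch does not multiply a row by $-1$ after each exchange but keeps a running sign instead, which has the same effect on the result. The function name `det_by_elimination` and the use of exact fractions are our own choices.

```python
from fractions import Fraction


def det_by_elimination(A):
    """Determinant via (E1) row exchanges and (E3) row subtractions,
    followed by the product of the diagonal as in 3.4.3."""
    M = [[Fraction(v) for v in row] for row in A]
    n = len(M)
    sign = 1
    for c in range(n):
        pivot_row = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot_row is None:
            return Fraction(0)      # no pivot in this column: the matrix is not regular
        if pivot_row != c:
            M[c], M[pivot_row] = M[pivot_row], M[c]
            sign = -sign            # one transposition of rows changes the sign
        for i in range(c + 1, n):
            factor = M[i][c] / M[c][c]
            M[i] = [vi - factor * vc for vi, vc in zip(M[i], M[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]
    return result
```

For instance, `det_by_elimination([[0, 2], [3, 4]])` needs one row exchange and yields $-6$. The number of arithmetic operations grows only like $n^3$, in contrast to the $n!$ terms of the definition.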
4 More about determinants

4.1 Minors and the inverse matrix

Denote by $A^{(i,j)}$ the matrix obtained from $A$ by deleting the $i$-th row and the $j$-th column. The number

$$\alpha_{ij} = (-1)^{i+j} \det A^{(i,j)}$$

is called the $(i, j)$-th minor of $A$.

4.1.1 Recall the notation from 3.4.1. We have the following

Theorem. $\det A_i(x) = \sum_{j=1}^{n} x_j \alpha_{ij}$ and $\det A^j(x^T) = \sum_{i=1}^{n} x_i \alpha_{ij}$.

Proof. We shall treat the case of rows (the case of columns is analogous). Since $x = \sum_j x_j e_j$ we have

$$\det A_i(x) = \sum_j x_j \det A_i(e_j).$$

Now, clearing the rest of the $j$-th column by (E3) operations, which do not change the determinant (3.4.2),

$$\det A_i(e_j) = \begin{vmatrix} a_{1,1} & \cdots & a_{1,j-1} & 0 & a_{1,j+1} & \cdots & a_{1,n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{i-1,1} & \cdots & a_{i-1,j-1} & 0 & a_{i-1,j+1} & \cdots & a_{i-1,n} \\ 0 & \cdots & 0 & 1 & 0 & \cdots & 0 \\ a_{i+1,1} & \cdots & a_{i+1,j-1} & 0 & a_{i+1,j+1} & \cdots & a_{i+1,n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n,1} & \cdots & a_{n,j-1} & 0 & a_{n,j+1} & \cdots & a_{n,n} \end{vmatrix}.$$

Exchange subsequently the $i$-th row with the $(i-1)$-th one, then the $(i-1)$-th row with the $(i-2)$-th one, etc., and then, similarly operating with the columns, we move the 1 from the $(i, j)$-th to the $(1, 1)$-th position and obtain

$$\det A_i(e_j) = (-1)^{i+j} \begin{vmatrix} 1 & o \\ y^T & A^{(i,j)} \end{vmatrix} = (-1)^{i+j} \det A^{(i,j)} = \alpha_{ij}. \qquad \square$$

4.1.2 Corollary. In particular, for $x$ the $k$-th row (resp. column) of $A$, we obtain

$$\sum_{j=1}^{n} a_{kj} \alpha_{ij} = \sum_{j=1}^{n} a_{jk} \alpha_{ji} = \delta_i^k \det A; \quad\text{hence}\quad A \cdot (\alpha_{jk})^T_{jk} = I \cdot \det A$$

from which we immediately get a formula for the inverse matrix,

$$A^{-1} = \Big(\frac{\alpha_{ij}}{\det A}\Big)^T_{ij}.$$

4.2 Cramer's Rule

Recall the representation of a system of linear equations as

$$A x^T = b^T \tag{*}$$

from 2.2 1. If $A$ is a regular matrix we can multiply this formula by $A^{-1}$ from the left to obtain

$$x^T = A^{-1} A x^T = A^{-1} b^T.$$

Thus, by 4.1.2 we obtain

$$x_i = \frac{1}{\det A} \sum_j \alpha_{ji} b_j.$$

The sum is then by 4.1.1 equal to $\det A^i(b^T)$ so that we obtain the formula (Cramer's Rule)

$$x_i = \frac{\det A^i(b^T)}{\det A}.$$

Of course computing the solutions using this formula would be much harder than using the Gauss Elimination. It is, however, useful for theoretical purposes.
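Both formulas, the inverse matrix from 4.1.2 and Cramer's Rule, are easy to try on a small example. In the sketch below the determinant is computed by expansion along the first row, which is Theorem 4.1.1 applied to $x$ equal to the first row; indices are 0-based (this does not affect the parity of $i + j$), and the helper names `delete`, `det`, `alpha`, `inverse` and `cramer` are our own assumptions. As the text warns, this is suitable for tiny matrices only.

```python
from fractions import Fraction


def delete(A, i, j):
    """The matrix A^(i,j): A without its i-th row and j-th column (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by expansion along the first row; the empty matrix has determinant 1."""
    if not A:
        return Fraction(1)
    return sum((-1) ** j * A[0][j] * det(delete(A, 0, j)) for j in range(len(A)))


def alpha(A, i, j):
    """The number alpha_ij = (-1)^(i+j) * det A^(i,j)."""
    return (-1) ** (i + j) * det(delete(A, i, j))


def inverse(A):
    """A^{-1}: the (i, j) entry is alpha_ji / det A (note the transposition)."""
    d, n = det(A), len(A)
    return [[alpha(A, j, i) / d for j in range(n)] for i in range(n)]


def cramer(A, b):
    """x_i = det A^i(b^T) / det A, where A^i(b^T) has its i-th column replaced by b."""
    d = det(A)
    return [det([row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(A)]) / d
            for i in range(len(A))]


A = [[2, 1], [1, 3]]
b = [3, 5]
x = cramer(A, b)
A_inv = inverse(A)
```

Both functions assume that `A` is regular; for a matrix with determinant zero the division fails, in line with Proposition 4.4 below.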
4.3 Determinants and products of matrices

4.3.1 Lemma. Let $A$, $B$ be square matrices and let $C$ be a matrix of the form

$$\begin{pmatrix} A & M \\ O & B \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} A & O \\ M & B \end{pmatrix}$$

where $O$ indicates a system of zero entries while the entries at $M$ are arbitrary. Then $\det C = \det A \cdot \det B$.

Proof. It suffices to treat the first case. Transform the matrix as indicated in 3.4.4 to obtain

$$\begin{pmatrix} A' & M' \\ O & B' \end{pmatrix} \quad\text{with}\quad A' = \begin{pmatrix} a'_{11} & a'_{12} & \cdots & a'_{1m} \\ 0 & a'_{22} & \cdots & a'_{2m} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a'_{mm} \end{pmatrix}, \quad B' = \begin{pmatrix} b'_{11} & b'_{12} & \cdots & b'_{1n} \\ 0 & b'_{22} & \cdots & b'_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & b'_{nn} \end{pmatrix}.$$

If we perform the operations first in the first $m$ rows and columns only, and then in the remaining ones, the left upper part corresponds to the transformation of the matrix $A$ and the right lower one is the matrix $B$ transformed. Thus we have $\det A = a'_{11} a'_{22} \cdots a'_{mm}$, $\det B = b'_{11} b'_{22} \cdots b'_{nn}$ and $\det C = a'_{11} a'_{22} \cdots a'_{mm} b'_{11} b'_{22} \cdots b'_{nn} = \det A \cdot \det B$. □

4.3.2 Theorem. Let $A$, $B$ be matrices of type $n \times n$. Then $\det AB = \det A \cdot \det B$.

Proof. Consider the matrix

$$C = \begin{pmatrix} a_{11} & \cdots & a_{1n} & -1 & 0 & \cdots & 0 \\ a_{21} & \cdots & a_{2n} & 0 & -1 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} & 0 & 0 & \cdots & -1 \\ 0 & \cdots & 0 & b_{11} & b_{12} & \cdots & b_{1n} \\ \vdots & & \vdots & \vdots & & & \vdots \\ 0 & \cdots & 0 & b_{n1} & b_{n2} & \cdots & b_{nn} \end{pmatrix}.$$

To the $i$-th column add the $a_{1i}$ multiple of the $(n+1)$-th column, the $a_{2i}$ multiple of the $(n+2)$-th column, etc. until the $a_{ni}$ multiple of the $2n$-th column. Then the upper left part annihilates, and the lower left part, with the entries $\sum_k b_{jk} a_{ki}$, becomes $BA$, schematically

$$\begin{pmatrix} O & -I_n \\ BA & B \end{pmatrix}.$$

Now let us exchange the $i$-th and $(n+i)$-th columns and, to compensate the change of sign, multiply after each of these exchanges the $i$-th column by $-1$. We obtain

$$D = \begin{pmatrix} I_n & O \\ -B & BA \end{pmatrix}$$

and still $\det C = \det D$. By Lemma 4.3.1, $\det C = \det A \cdot \det B$ and $\det D = \det I \cdot \det BA = \det BA$. Thus $\det BA = \det B \cdot \det A$ for arbitrary $A$, $B$, which (after exchanging the roles of $A$ and $B$) is the statement. □

4.4 Proposition. A square matrix $A$ is regular if and only if $\det A \neq 0$.

Proof. If $A$ is not regular then one of the rows is a linear combination of the others and $\det A = 0$ by 3.4.2 (subtracting this combination produces a zero row, and a determinant with a zero row vanishes by 3.4). If $A$ is regular it has an inverse $A^{-1}$. Thus by 4.3.2,

$$\det A \cdot \det A^{-1} = \det AA^{-1} = \det I = 1$$

and hence $\det A \neq 0$. □
4.5 The determinant of a linear map

Let $V$ be a finite-dimensional vector space over $\mathbb{F}$ and let $f : V \to V$ be a linear map. Then Theorem 4.3.2 enables us to define the determinant $\det(f)$ of the linear map $f$ as the determinant of the matrix $A$ of $f$ with respect to the same ordered basis $B$ in the domain and the codomain (see A.7.6.2). Note that the choice of the basis $B$ does not matter because if we choose another basis $B'$ and denote the base change matrix from $B$ to $B'$ by $M$, then the matrix of $f$ with respect to $B'$ in the domain and codomain is $MAM^{-1}$, and

$$\det(MAM^{-1}) = \det(M) \det(A) \det(M)^{-1} = \det(A).$$
5 The Jordan canonical form of a matrix

5.1 Eigenvalues and eigenvectors of a matrix
An eigenvalue of a matrix $A$ is a number $\lambda \in \mathbb{F}$ such that there exists a non-zero column vector $v$ with
\[ Av = \lambda v. \tag{5.1.1} \]
The column vector $v$ is then called an eigenvector of $A$ (associated with the eigenvalue $\lambda$).

Note. These concepts are very useful (see an application in Chapter 7). One interpretation is that of a generalized fixed point. If we recall the linear mapping $f^A : \mathbb{F}_n \to \mathbb{F}_n$, we see that we have here an "almost fixed point" $v$ with $f^A(v) = \lambda v$. In the set of all lines through the origin $\{\lambda v \mid \lambda \in \mathbb{F}\}$ ($v \neq o$), which has a lot of structure and is called the $(n-1)$-dimensional projective space, the directions generated by eigenvectors become fixed points of the action by $f^A$.
5.1.1 Determining eigenvalues: the characteristic polynomial

The formula (5.1.1), that is, $\sum_{k=1}^{n} a_{jk} v_k = \lambda v_j$, can be viewed as $\sum_{k=1}^{n} a_{jk} v_k = \sum_{k=1}^{n} \lambda \delta_{jk} v_k$, rewritten as $\sum_{k=1}^{n} (\lambda \delta_{jk} - a_{jk}) v_k = 0$, or
\[ (\lambda I - A) v^T = o. \tag{5.1.2} \]
Now this is a system of linear equations that has a nonzero solution if and only if $\operatorname{rank}(\lambda I - A) < n$, that is, by 4.4, if and only if
\[ \chi_A(\lambda) = \det(\lambda I - A) = 0. \]
The expression $\chi_A(\lambda)$ is easily seen to be a polynomial in $\lambda$ with coefficients in $\mathbb{F}$. It is called the characteristic polynomial of $A$. We will also apply it to arguments more general than the numbers from $\mathbb{F}$; see the next paragraph.
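For small matrices the coefficients of $\chi_A$ can be computed mechanically. The sketch below does not expand $\det(\lambda I - A)$ as in the text; it uses the Faddeev-LeVerrier recursion instead, a different technique that yields the same polynomial and needs only matrix products and traces. The function names `char_poly` and `evaluate` and the sample matrix are assumptions introduced for this illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(lambda*I - A), lowest degree first."""
    n = len(a)
    coeffs = [Fraction(0)] * n + [Fraction(1)]  # c_n = 1
    m = [[Fraction(0)] * n for _ in range(n)]   # M_0 = O
    for k in range(1, n + 1):
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        m = matmul(a, m)
        for i in range(n):
            m[i][i] += coeffs[n - k + 1]
        # c_{n-k} = -trace(A * M_k) / k
        am = matmul(a, m)
        coeffs[n - k] = -sum(am[i][i] for i in range(n)) / k
    return coeffs


def evaluate(coeffs, x):
    """Value of the polynomial at a number x (Horner's scheme)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result


A = [[2, 1], [1, 2]]          # hypothetical example
chi = char_poly(A)            # lambda^2 - 4*lambda + 3
print(chi)
print(evaluate(chi, 1), evaluate(chi, 3))  # both roots are eigenvalues of A
```

For this matrix the vectors $(1, -1)^T$ and $(1, 1)^T$ satisfy (5.1.1) with $\lambda = 1$ and $\lambda = 3$ respectively, in agreement with the roots of the computed polynomial.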
5.2 The algebra of matrices of type $n \times n$
Matrices of type $n \times n$ can be added by the rule
\[ A + B = (a_{jk} + b_{jk})_{jk} \quad \text{where } A = (a_{jk})_{jk} \text{ and } B = (b_{jk})_{jk}, \]
and multiplied by $\alpha \in \mathbb{F}$ by setting
\[ \alpha A = (\alpha a_{jk})_{jk}. \]
This is of course the same as computing in the vector space of $n \times n$ matrices over $\mathbb{F}$. (Recall that the zero vector is the zero matrix $O$, i.e. the matrix with all entries $0$.) Note that the $\lambda I - A$ in (5.1.2) agrees with this notation. For convenience we sometimes write the multiplication by numbers also from the right, as $A\alpha$. Furthermore we have the multiplication of matrices $AB$, and we easily deduce that
\[ (A + B)C = AC + BC, \qquad A(B + C) = AB + AC, \]
\[ 0 \cdot A = O \quad \text{and} \quad AO = OA = O, \]
and
\[ (\alpha A)B = A(\alpha B) = \alpha(AB). \]
This structure is called the algebra of matrices (of type $n \times n$). It will be denoted by $\mathcal{A}_n$. Thus, we can consider polynomials with coefficients in $\mathcal{A}_n$.

5.2.1 Lemma. Let $A \in \mathcal{A}_n$, and let $p(x) = C_k x^k + \dots + C_1 x + C_0$ where $C_0, \dots, C_k \in \mathcal{A}_n$ commute with $A$. Then there exists a polynomial $q(x)$ with coefficients in $\mathcal{A}_n$ such that
\[ p(x) = (xI - A)q(x) + p(A). \]
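Proof. Apply division of polynomials with remainder by $xI - A$; we work with polynomials with coefficients in $\mathcal{A}_n$. All matrices involved as coefficients commute with $A$. □

5.3 Theorem. (Cayley-Hamilton) Plugging a matrix $A \in \mathcal{A}_n$ into its own characteristic polynomial gives
\[ \chi_A(A) = O. \]
Proof. Let $B(\lambda) = \lambda I - A$, and let
\[ C(\lambda)_{jk} = (-1)^{j+k} \det B(\lambda)_{(j,k)}. \]
By Cramer's rule,
\[ (\lambda I - A)\, C(\lambda)^T = I \chi_A(\lambda). \]
Applying Lemma 5.2.1, we have
\[ (\lambda I - A)\, C(\lambda)^T = (\lambda I - A)\, q(\lambda) + \chi_A(A), \]
or
\[ (\lambda I - A)(C(\lambda)^T - q(\lambda)) = \chi_A(A). \]
Examining the highest power of $\lambda$ which occurs in $C(\lambda)^T - q(\lambda)$, we see that $C(\lambda)^T - q(\lambda) = 0$, proving the statement of the Theorem. □

5.4 By a Jordan block we mean a matrix of the form
\[
\begin{pmatrix}
\lambda & 0 & \dots & 0 & 0 \\
1 & \lambda & \dots & 0 & 0 \\
0 & 1 & \dots & 0 & 0 \\
\dots & \dots & \dots & \dots & \dots \\
0 & 0 & \dots & \lambda & 0 \\
0 & 0 & \dots & 1 & \lambda
\end{pmatrix}.
\]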
A matrix similar to a matrix $A$ is a matrix of the form $B^{-1}AB$ where $B$ is an invertible matrix. A direct sum of square matrices $A_1, \dots, A_k$ is the matrix
\[
\begin{pmatrix}
A_1 & 0 & \dots & 0 \\
0 & A_2 & \dots & 0 \\
\dots & \dots & \dots & \dots \\
0 & 0 & \dots & A_k
\end{pmatrix}.
\]
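To make the two definitions concrete, the following sketch builds Jordan blocks and their direct sum, and then checks the Cayley-Hamilton Theorem on the result. It is an illustration under assumptions: the helper names (`jordan_block`, `direct_sum`, `shifted_power`) and the chosen eigenvalues and sizes are introduced here. Since the matrix is lower triangular, its characteristic polynomial is the product of the factors $x - d$ over the diagonal entries $d$.

```python
def jordan_block(lam, size):
    """Jordan block with lam on the diagonal and 1 directly below it."""
    return [
        [lam if i == j else 1 if i == j + 1 else 0 for j in range(size)]
        for i in range(size)
    ]


def direct_sum(*blocks):
    """Block-diagonal matrix with the given square matrices on the diagonal."""
    n = sum(len(b) for b in blocks)
    result = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                result[offset + i][offset + j] = entry
        offset += len(b)
    return result


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def shifted_power(a, lam, exponent):
    """The matrix (A - lam*I) raised to the given non-negative power."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(exponent):
        result = matmul(result, shifted)
    return result


# Hypothetical example: blocks of sizes 2 and 1 for eigenvalue 2, size 3 for 5.
J = direct_sum(jordan_block(2, 2), jordan_block(2, 1), jordan_block(5, 3))

# Characteristic polynomial of J is (x - 2)^3 (x - 5)^3; plug J into it.
chi_of_J = matmul(shifted_power(J, 2, 3), shifted_power(J, 5, 3))
print(all(entry == 0 for row in chi_of_J for entry in row))
```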
5.5 A vector space $V$ is a direct sum of subspaces $U_1, \dots, U_r$ if $V = U_1 + \dots + U_r$ and $U_j \cap \sum_{k \neq j} U_k = \{o\}$ for every $j$; in other words, if each $v \in V$ can be written as a unique sum $v = v_1 + \dots + v_r$ with $v_j \in U_j$. We then write $V = U_1 \oplus \dots \oplus U_r$. From now on, we will work over the field $\mathbb{F} = \mathbb{C}$.

5.5.1 Lemma. For an eigenvalue $\lambda$ of $A$ put
\[ U_\lambda = \{ v \in \mathbb{C}_n \mid (\lambda I - A)^N v = 0 \text{ for some } N = 0, 1, 2, \dots \}. \]
Then $\mathbb{C}_n$ is the direct sum of the spaces $U_\lambda$.

Proof. Let us write
\[ \chi_A(x) = \prod_{i=1}^{k} (x - \lambda_i)^{n_i}; \]
thus, the $\lambda_i$ are the (distinct) eigenvalues of $A$, and $\sum n_i = n$. Define subspaces $W_i \subseteq \mathbb{C}_n$, $i = 0, \dots, k$, and linear transformations
\[ f_i : W_{i-1} \to W_i, \quad i = 1, \dots, k, \]
as follows:
\[ W_0 = \mathbb{C}_n, \qquad f_i = (A - \lambda_i I)^{n_i}|_{W_{i-1}}, \qquad W_i = f_i[W_{i-1}]. \]
By definition,
\[ \operatorname{Ker}(f_i) \subseteq U_{\lambda_i}, \quad i = 1, \dots, k. \tag{1} \]
By the Cayley-Hamilton Theorem,
\[ W_k = 0. \tag{2} \]
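Since the $f_i$ are onto, we have by (2),
\[ \dim(\operatorname{Ker}(f_1)) + \dots + \dim(\operatorname{Ker}(f_k)) = n; \]
hence, by (1),
\[ \sum_{i=1}^{k} \dim(U_{\lambda_i}) \geq n. \]
Thus, it suffices to show that if
\[ \sum_{i=1}^{k} v_i = 0, \quad v_i \in U_{\lambda_i}, \tag{3} \]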
then $v_1 = \dots = v_k = 0$. Let
\[ n_i = \min\{ N \mid (A - \lambda_i I)^N v_i = 0 \} \]
(from here on, $n_i$ denotes this minimum rather than the multiplicity above). Suppose $n_{i_0} \neq 0$. Then replacing each vector $v_i$ by $v_i' = (A - \lambda_{i_0} I) v_i$, the vectors $v_i'$ still satisfy (3) in place of the $v_i$'s. When we make this replacement, the number $n_{i_0}$ decreases by $1$, while the numbers $n_i$, $i \neq i_0$, remain unchanged. After applying this procedure finitely many times, we achieve a situation where $n_{i_1} = 1$ for some $i_1$, and $n_i = 0$ for $i \neq i_1$. Then (3) reads
\[ v_{i_1} = 0, \]
which contradicts $n_{i_1} = 1$. Thus, we have proved that $n_i = 0$ for all $i$, in other words $v_i = 0$, which is what we needed to show. □

5.5.2 Theorem. (Jordan) Every $n \times n$ matrix is similar to a direct sum of Jordan blocks. Moreover, up to order, the Jordan blocks are uniquely determined. (We refer to this direct sum as the Jordan canonical form of the matrix $A$.)

Proof. We will exhibit a proof which will allow us to find the Jordan blocks and the matrix $T$ realizing the similarity explicitly (assuming we already have the eigenvalues). Fix an eigenvalue $\lambda$. We shall exhibit a basis of $U_\lambda$ with respect to which the matrix of the linear transformation $A|_{U_\lambda}$ is a direct sum of Jordan blocks. Put $f_\lambda = \lambda I - A$. Define subspaces
\[ U_{\lambda 0} \subseteq U_{\lambda 1} \subseteq \dots \subseteq U_{\lambda m} \tag{1} \]
of $U_\lambda$ inductively by
\[ U_{\lambda 0} = 0, \qquad U_{\lambda, i+1} = f_\lambda^{-1}[U_{\lambda i}]. \]
We see that if we let $m$ be the first number such that $U_{\lambda m} = U_\lambda$, then all the inclusions (1) are strict. Let $v_{j1}, \dots, v_{j q_j}$ be a set of vectors in $U_{\lambda j}$ which projects to a basis of
\[ U_{\lambda, j} / (U_{\lambda, j-1} + f_\lambda[U_{\lambda, j+1}]), \quad j = 1, \dots, m \]
(recall A.6.2.1). Then
\[ v_{ji},\ f_\lambda(v_{ji}),\ \dots,\ (f_\lambda)^{j-1} v_{ji}, \quad j = 1, \dots, m, \ i = 1, \dots, q_j \]
is by definition the desired basis. Combining these bases over all eigenvalues $\lambda$ gives, by Lemma 5.5.1, a basis with respect to which the linear transformation $A$ is a sum of Jordan blocks. Further, the sizes of the Jordan blocks determine and are determined by the dimensions of the spaces $U_{\lambda j}$, which in turn depend only on the matrix $A$. This implies the uniqueness statement. □
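The last remark of the proof can be turned into a small computation: the dimensions $d_j = \dim U_{\lambda j} = \dim \operatorname{Ker}(A - \lambda I)^j$ determine the block sizes, since $d_j - d_{j-1}$ is the number of blocks of size at least $j$, and hence $2d_j - d_{j-1} - d_{j+1}$ is the number of blocks of size exactly $j$. The sketch below assumes exact rational arithmetic and a known eigenvalue; the function names `rank` and `jordan_block_sizes` and the sample matrix are assumptions made for this illustration. It finds only the block sizes, not the basis constructed in the proof.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def jordan_block_sizes(a, lam):
    """Map block size -> number of Jordan blocks of that size for eigenvalue lam."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    dims = [0]  # dims[j] = dim Ker (A - lam*I)^j
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    while True:
        power = matmul(power, shifted)
        d = n - rank(power)
        if d == dims[-1]:
            break
        dims.append(d)
    dims.append(dims[-1])  # the sequence is constant from here on
    sizes = {}
    for j in range(1, len(dims) - 1):
        count = 2 * dims[j] - dims[j - 1] - dims[j + 1]
        if count:
            sizes[j] = count
    return sizes


# Hypothetical example with the single eigenvalue 2.
A = [[3, 1, 0], [-1, 1, 0], [0, 0, 2]]
print(jordan_block_sizes(A, 2))
```

For this matrix $A - 2I$ has rank $1$ and square zero, so the kernel dimensions are $2$ and $3$, which corresponds to one block of size $1$ and one block of size $2$.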
6 Exercises

(1) Write down a detailed proof of Theorem 2.3.2.

(2) Find all solutions of the system of linear equations over $\mathbb{R}$
\[
\begin{aligned}
x + 2y + 3z + 4t + u &= 10, \\
2x + 4y + 2z + 5t + u &= 8, \\
3x + 6y + 5z + 9t + 2u &= 1.
\end{aligned}
\]

(3) Prove that a Hermitian form over $\mathbb{C}^n$ (resp. symmetric bilinear form over $\mathbb{R}^n$) is non-degenerate if and only if its associated matrix is regular.

(4) Decide whether the symmetric bilinear form on $\mathbb{R}^3$ associated with the matrix
\[ \begin{pmatrix} 4 & 6 & 1 \\ 6 & 8 & 2 \\ 1 & 2 & 4 \end{pmatrix} \]
is non-degenerate, and whether it is positive-definite, negative-definite or indefinite.

(5) Compute the determinant of the matrix
\[ \begin{pmatrix} 2 & 1 & 3 & 4 \\ 2 & 2 & 4 & 5 \\ 1 & 4 & 3 & 3 \\ 3 & 5 & 6 & 8 \end{pmatrix}. \]

(6) Prove that the set of all $n \times n$ matrices over $\mathbb{R}$ (resp. $\mathbb{C}$) of non-zero determinant with the operation of matrix multiplication is a group. This group is called the general linear group and denoted by $GL_n(\mathbb{R})$ (resp. $GL_n(\mathbb{C})$).

(7) Prove that $\det : GL_n(\mathbb{F}) \to \mathbb{F}^{\times}$ is a homomorphism of groups, where $\mathbb{F}$ stands for $\mathbb{R}$ or $\mathbb{C}$ and $\mathbb{F}^{\times}$ is the multiplicative group of its non-zero elements.

(8) Prove that the determinant of a square matrix with entries in $\mathcal{A}_n$ in which two rows (or two columns) coincide is $0$. [Hint: the same product appears once with a $+$ and once with a $-$.]

(9) Write down an explicit condition on when a $2 \times 2$ matrix
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
($a, b, c, d \in \mathbb{C}$) is regular, and write down a closed formula for its inverse.

(10) Determine the Jordan canonical form of the matrix
\[ A = \begin{pmatrix} 1 & 1 & 0 & 3 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
and find a non-singular matrix $P$ such that $P^{-1}AP$ is in Jordan form.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7, © Springer Basel 2013
Index of Symbols

$(a, b)$ open interval
$\langle a, b \rangle$ closed interval
$A^*$ adjoint matrix
$A^T$ transposed matrix
$A^{-1}$ inverse matrix
$C(X)$ space of bounded continuous functions
$C^r$, $C^\infty$ degrees of smoothness
$F_\sigma$, $G_\delta$, $F_{\sigma\delta}$, ... types of Borel sets
$L_p$
$R^{\ell}_{ijk}$ curvature tensor
$T^{k}_{ij}$ torsion tensor
$V^*$ dual vector space
$W(y_1, \dots, y_n)$ Wronskian
$W^\perp$ orthogonal complement
$[u, v]$ Lie bracket of vector fields
$\Lambda$ Lebesgue measurable functions
$e_i$, $e^i$ standard bases
$o$ the zero element of a vector space
$u \cdot v$ inner product, dot product
$\mathbf{v}$ row or column vector
$\chi_A$ characteristic polynomial of a matrix
$\delta^j_i$ Kronecker delta
$\det A$, $|A|$ determinant of a matrix
$\dim(V)$ dimension of a vector space
$(\mathrm{I})\int_L$ line integral of the first kind
$(\mathrm{II})\int_L$ line integral of the second kind
$\int f$ Lebesgue integral
$\int_J f$ Riemann integral over an $n$-dimensional interval
$\int_L f(z)\,dz$ complex line integral
$\int_B \omega$ integral of a differential form
$\int_M f$ Lebesgue integral over a set
$\int_X f\,d\mu$ integral by a measure
$\int_a^b f(x)\,dx$ the integral
$\ell^p$, $\ell^p(\mathbb{C})$
$\hat{f}$ the Fourier transform formula
$\ln(x)$ natural logarithm
$\mathbb{C}$ the field of complex numbers
$\mathbb{F}S$ free vector space on a set $S$
$\mathbb{F}$ field of real or complex numbers
$\mathbb{F}_n$ the space of column vectors
$\mathbb{F}^n$ row vector space
$\mathbb{R}$ the field of real numbers
$Z$ functions with compact support on $\mathbb{R}^n$
$Z^{up}$, $Z^{dn}$, $Z^*$ sets of certain limits of compactly supported functions
$\mathcal{B}$ Borel sets
$\mathcal{F}$ Fourier transformation
$\mathcal{F}^{-1}$ inverse Fourier transformation
$\mathcal{S}$ the space of rapidly decreasing functions
$\mathcal{L}$ Lebesgue integrable functions
$\mathcal{L}^{up}$, $\mathcal{L}^{dn}$, $\mathcal{L}^*$ functions with a (possibly infinite) Lebesgue integral
$TM_x$ the tangent space at a point $x$
$\operatorname{sgn} p$ sign of a permutation
$\partial f / \partial x_i$ partial derivative
$\partial_v f$ directional derivative
$Df$ total differential
$d$ exterior derivative
$\pi$ definition of
$\pi_1(\Sigma, x_0)$ fundamental group
$\simeq$
$\sin(x)$, $\cos(x)$ trigonometric functions
$\operatorname{Col}(A)$ column space
$\operatorname{Row}(A)$ row space
$\operatorname{Im}(z)$
$\operatorname{Re}(z)$
$\operatorname{Arg}(z)$
grad, div, curl operators on vector fields
$\operatorname{rank} A$ rank of a matrix
$\tilde{f}$ inverse Fourier transformation formula
$\Gamma^i_{jk}$ Christoffel symbols of the second kind
$\Gamma_{ijk}$ Christoffel symbols of the first kind
$\Omega(M)$ the de Rham complex of $M$
$\Omega(x, \varepsilon)$ open ball
$\Omega^k(M)$ the vector space of $k$-forms
$\|f\|_p$
$\|x\|$ the norm of $x$
$c_M$ characteristic function
$e^x$ exponential function
$f[X]$ the image of a set under a map
$f^*$ dual linear map
$f^{-1}[X]$ the pre-image of a set under a map
$f^{Ad}$ adjoint linear operator
$f_A$, $f^A$ linear maps associated with a matrix
$f_n \to f$ pointwise convergence
$f_n \nearrow f$ increasing limit
$f_n \rightrightarrows f$ uniform convergence
$f_n \searrow f$ decreasing limit
$\operatorname{Hom}(U, V)$ the vector space of homomorphisms
$\operatorname{Ker}(f)$ kernel
$\mathcal{P}(X)$ power set