Functional Analysis for the Applied Sciences


Author: Gheorghe Morosanu


Universitext


Series editors
Sheldon Axler, San Francisco State University, San Francisco, CA, USA
Carles Casacuberta, Universitat de Barcelona, Barcelona, Spain
John Greenlees, University of Warwick, Coventry, UK
Angus MacIntyre, Queen Mary University of London, London, UK
Kenneth Ribet, University of California, Berkeley, CA, USA
Claude Sabbah, École Polytechnique, CNRS, Université Paris-Saclay, Palaiseau, France
Endre Süli, University of Oxford, Oxford, UK
Wojbor A. Woyczyński, Case Western Reserve University, Cleveland, OH, USA

Universitext is a series of textbooks that presents material from a wide variety of mathematical disciplines at master’s level and beyond. The books, often well class-tested by their author, may have an informal, personal, even experimental approach to their subject matter. Some of the most successful and established books in the series have evolved through several editions, always following the evolution of teaching curricula, into very polished texts. Thus as research topics trickle down into graduate-level teaching, first textbooks written for new, cutting-edge courses may make their way into Universitext.

More information about this series at http://www.springer.com/series/223


Gheorghe Moroşanu
Romanian Academy of Sciences, Bucharest, Romania
Department of Mathematics, Babeş-Bolyai University, Cluj-Napoca, Romania

ISSN 0172-5939 ISSN 2191-6675 (electronic) Universitext ISBN 978-3-030-27152-7 ISBN 978-3-030-27153-4 (eBook) https://doi.org/10.1007/978-3-030-27153-4 Mathematics Subject Classification (2010): 32A70 © Springer Nature Switzerland AG 2019 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Dedicated to my wife, Carmen

Preface

The goal of this book is to present in a friendly manner some of the main results and techniques in Functional Analysis and use them to explore various areas in mathematics and its applications. Special attention is paid to creating appropriate frameworks towards solving different problems in the field of differential and integral equations. In fact, the flavor of this book is given by the fine interplay between the tools offered by Functional Analysis and some specific problems which are of interest in the Applied Sciences.

The table of contents of the book (see below) offers a fairly good description of the material. In contrast with other books in the field, we present in Chap. 1 the real number system, describing the Cantor–Méray model which is most appropriate for our purposes here. Indeed, it is based on a completion procedure, allowing the extension from rational numbers to real numbers. This procedure involves the concepts of limit and infinity that are specific to analysis. We consider the Cantor–Méray construction as the cornerstone of mathematical analysis, which is why we pay attention to this subject which is usually assumed well known.

In order to help the reader to understand the richness of ideas and methods offered by Functional Analysis, we have included a section of exercises at the end of each chapter. Some of these exercises supplement the theoretical material discussed in the corresponding chapter, while others are mathematical problems that are related to the real world. Some of the exercises are borrowed from other books, being reformulated and/or presented in a form adapted to the needs of the corresponding chapter. We do not indicate the books where individual exercises come from, but all those sources are included in the reference list of our book. In any event, we do not claim originality in such cases. Other exercises were invented by us to offer the reader enough material to understand the theoretical part of the book and gain expertise in solving practical problems.

In the last chapter of the book (Chap. 12), we provide solutions to almost all exercises. This is in contrast to many other books which include exercises without solutions. For easy exercises, we provide hints or final solutions, and answers to very easy exercises are left to the reader. I encourage everybody to spend some time working on an exercise before looking at its solution. We shall refer to an exercise by indicating the chapter and exercise numbers (and not the section number). For example, Exercise 11.3 will mean Exercise 3 in the last section of Chap. 11 (which is Sect. 11.3 in this case).

The book is addressed to graduate students and researchers in applied mathematics and neighboring fields of science.

I would like to thank the anonymous reviewers whose pertinent comments improved the initial version of the book. Special thanks are due to a former American student of mine, Ivan Andrus, who wrote the first draft of the present book as lecture notes for my Functional Analysis lectures in 2010. He also carefully checked the final version of the book and suggested several minor changes. I am also indebted to my former student Liviu Nicolaescu for reading the first part of the book and correcting some errors. Last but not least, I would like to thank Mrs. Elizabeth Loew, Executive Editor at Springer, for our very kind cooperation that led to the successful completion of this book project.

Cluj-Napoca, Romania
Gheorghe Moroşanu

Contents

1 Introduction
  1.1 Sets
  1.2 Sequences
  1.3 Real Numbers
  1.4 Complex Numbers
  1.5 Linear Spaces
  1.6 Exercises

2 Metric Spaces
  2.1 Definitions
  2.2 Completeness
  2.3 Compact Sets
  2.4 Continuous Functions on Compact Sets
  2.5 The Banach Contraction Principle
  2.6 Exercises

3 The Lebesgue Integral and $L^p$ Spaces
  3.1 Measurable Sets in $\mathbb{R}^k$
  3.2 Measurable Functions
  3.3 The Lebesgue Integral
  3.4 $L^p$ Spaces
  3.5 Exercises

4 Continuous Linear Operators and Functionals
  4.1 Definitions, Examples, Operator Norm
  4.2 Main Principles of Functional Analysis
  4.3 Compact Linear Operators
  4.4 Linear Functionals, Dual Spaces, Weak Topologies
  4.5 Exercises

5 Distributions, Sobolev Spaces
  5.1 Test Functions
  5.2 Friedrichs’ Mollification
  5.3 Scalar Distributions
    5.3.1 Some Operations with Distributions
    5.3.2 Convergence in Distributions
    5.3.3 Differentiation of Distributions
    5.3.4 Differential Equations for Distributions
  5.4 Sobolev Spaces
  5.5 Bochner’s Integral
  5.6 Vector Distributions, $W^{m,p}(a, b; X)$ Spaces
  5.7 Exercises

6 Hilbert Spaces
  6.1 Examples
  6.2 Jordan–von Neumann Characterization Theorem
  6.3 Projections in Hilbert Spaces
  6.4 The Riesz Representation Theorem
  6.5 Lax–Milgram Theorem
  6.6 Fourier Series Expansions
  6.7 Exercises

7 Adjoint, Symmetric, and Self-adjoint Linear Operators
  7.1 The Adjoint of a Linear Operator
  7.2 Adjoints of Operators on Hilbert Spaces
    7.2.1 The Case of Compact Operators
  7.3 Symmetric Operators and Self-adjoint Operators
  7.4 Exercises

8 Eigenvalues and Eigenvectors
  8.1 Definition and Examples
  8.2 Main Results
  8.3 Eigenvalues of $-\Delta$ Under the Dirichlet Boundary Condition
  8.4 Eigenvalues of $-\Delta$ Under the Robin Boundary Condition
  8.5 Eigenvalues of $-\Delta$ Under the Neumann Boundary Condition
  8.6 Some Comments
  8.7 Exercises

9 Semigroups of Linear Operators
  9.1 Definitions
  9.2 Some Properties of $C_0$-Semigroups
  9.3 Uniformly Continuous Semigroups
  9.4 Groups of Linear Operators. Definitions and Link to Operator Semigroups
  9.5 Translation Semigroups
  9.6 The Hille–Yosida Generation Theorem
  9.7 The Lumer–Phillips Theorem
  9.8 The Feller–Miyadera–Phillips Theorem
  9.9 A Perturbation Result
  9.10 Approximation of Semigroups
  9.11 The Inhomogeneous Cauchy Problem
  9.12 Applications
    9.12.1 The Heat Equation
    9.12.2 The Wave Equation
    9.12.3 The Transport Equation
    9.12.4 The Telegraph System
  9.13 Exercises

10 Solving Linear Evolution Equations by the Fourier Method
  10.1 First Order Linear Evolution Equations
  10.2 Second Order Linear Evolution Equations
  10.3 Examples
  10.4 Exercises

11 Integral Equations
  11.1 Volterra Equations
  11.2 Fredholm Equations
  11.3 Exercises

12 Answers to Exercises
  12.1 Answers to Exercises for Chap. 1
  12.2 Answers to Exercises for Chap. 2
  12.3 Answers to Exercises for Chap. 3
  12.4 Answers to Exercises for Chap. 4
  12.5 Answers to Exercises for Chap. 5
  12.6 Answers to Exercises for Chap. 6
  12.7 Answers to Exercises for Chap. 7
  12.8 Answers to Exercises for Chap. 8
  12.9 Answers to Exercises for Chap. 9
  12.10 Answers to Exercises for Chap. 10
  12.11 Answers to Exercises for Chap. 11

Bibliography

Chapter 1

Introduction

This chapter comprises definitions, notation, and basic results related to set theory, real and complex numbers, and linear spaces.

1.1 Sets

We assume that the reader is familiar with the basic concepts and results of set theory. However, we are going to recall or specify some concepts and symbols that will be frequently used in this book.

First of all, in this book the notation $A \subset B$ or $B \supset A$ indicates that every element (member) of the set $A$ is also an element of the set $B$. In particular, $A \subset A$. The empty set, i.e., the set with no elements, will be denoted as usual by $\emptyset$. The empty set is a subset of every set $A$: $\emptyset \subset A$. The sets $A$, $B$ are equal, $A = B$, if and only if $A \subset B$ and $B \subset A$.

We assume that the sets $\mathbb{N} = \{1, 2, \dots\}$ (natural numbers), $\mathbb{Z} = \{\dots, -2, -1, 0, 1, 2, \dots\}$ (integers), and $\mathbb{Q} = \{0\} \cup \{\pm m/n;\ m, n \in \mathbb{N},\ (m, n) = 1\}$ (rational numbers) are well known, including their axiomatic definitions. Here $(m, n)$ denotes the greatest common divisor of $m$ and $n$.

A set $A$ is called countable if there exists an injective function from $A$ to $\mathbb{N}$. If one can find a bijective function from $A$ to $\mathbb{N}$, then $A$ is called countably infinite. In particular, $\mathbb{N}$, $\mathbb{Z}$, and $\mathbb{Q}$ are countably infinite sets. In fact, a countable set is either finite or countably infinite.
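As an informal illustration of countability (not part of the axiomatic development), the following Python sketch writes down one explicit bijection from $\mathbb{Z}$ onto $\mathbb{N}$ and lists $\mathbb{Q}$ without repetition. The names `z_to_n` and `rationals` are our own, and ordering the fractions by $m + n$ is just one convenient choice among many.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def z_to_n(z: int) -> int:
    """One bijection from Z onto N = {1, 2, ...}: 0, 1, -1, 2, -2, ... -> 1, 2, 3, 4, 5, ..."""
    if z == 0:
        return 1
    return 2 * z if z > 0 else -2 * z + 1


def rationals():
    """List Q without repetition: 0 first, then +m/n and -m/n in lowest terms, ordered by m + n."""
    yield Fraction(0)
    for total in count(2):              # total = m + n
        for m in range(1, total):
            n = total - m
            if gcd(m, n) == 1:          # keep only reduced fractions, as in the definition of Q
                yield Fraction(m, n)
                yield Fraction(-m, n)


# The position of a rational in this list is its image under a bijection Q -> N.
first_terms = list(islice(rationals(), 9))
print(first_terms)
```

Since every reduced fraction $\pm m/n$ appears exactly once, at the stage `total = m + n`, the position of a rational number in this list defines a bijection from $\mathbb{Q}$ to $\mathbb{N}$.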


Ordered Sets. A partial order on a given set $A$ is a binary relation $\le$ over $A$ satisfying the following conditions for $x, y, z \in A$:
(a) $x \le x$;
(b) if $x \le y$ and $y \le x$, then $x = y$;
(c) if $x \le y$ and $y \le z$, then $x \le z$.

We say that $x < y$ if $x \le y$ and $x \ne y$. The symbols $\ge$ and $>$ have natural meanings: $x \ge y$ iff $y \le x$, and $x > y$ iff $y < x$. If $A$ is endowed with a partial order, then $A$ is called a partially ordered set. For example, $\mathbb{N}$ is partially ordered with respect to the divisibility relation ($m \le n$ if $m$ is a divisor of $n$); also, the set of subsets of a given set $S$ is partially ordered by the inclusion relation. Note that in these examples there are pairs of elements which are not comparable with respect to the corresponding order, which is why the order is called partial.

If $A$ is a set with a partial order $\le$, then a subset $B \subset A$ is said to be totally ordered (or a chain) if any two elements $x, y \in B$ are comparable, i.e., either $x \le y$ or $y \le x$ (including the case $x = y$).

Let $B$ be a subset of $A$. An element $z \in A$ is an upper bound for $B$ if $x \le z$ for all $x \in B$. If $B$ has an upper bound, it is said to be bounded above. An element $m \in A$ is a maximal element of $A$ if there is no $x \in A$, $x \ne m$, such that $m \le x$. A maximal element of $A$ is not necessarily an upper bound for $A$. The set $A$ is called inductive if any totally ordered subset of $A$ has an upper bound.

Now, let us recall an important result which is known as Zorn’s Lemma:¹

Theorem 1.1 (Zorn’s Lemma). Every nonempty, partially ordered, inductive set has a maximal element.

If $B$ is a nonempty subset of a partially (possibly totally) ordered set $A$, the supremum of $B$, denoted $\sup B$, is defined as the least upper bound of $B$. An element $b \in A$ is the least upper bound of $B$ if and only if
(i) $x \le b$ for all $x \in B$;
(ii) if $a < b$ then $a$ is not an upper bound of $B$, i.e., there exists an $x \in B$ such that $a < x$.

If $\sup B$ exists, then it is unique. If $B$ has a greatest element $b$ (i.e., $b \in B$ and $x \le b$ for all $x \in B$), then $b = \sup B$.

¹ Max August Zorn, German mathematician, 1906–1993.
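The notions above can be checked by brute force on a small example. The following Python sketch takes the finite set $\{1, \dots, 10\}$ with the divisibility order (our own choice of example, with hypothetical helper names `divides` and `is_chain`) and verifies conditions (a)–(c), exhibits a chain and a non-comparable pair, and lists the maximal elements and the upper bounds.

```python
from itertools import product


def divides(m: int, n: int) -> bool:
    """The divisibility order on N: m <= n means that m is a divisor of n."""
    return n % m == 0


A = range(1, 11)  # the finite set {1, ..., 10} with the divisibility order

# Conditions (a), (b), (c) of a partial order, checked exhaustively on A.
reflexive = all(divides(x, x) for x in A)
antisymmetric = all(x == y for x, y in product(A, A) if divides(x, y) and divides(y, x))
transitive = all(divides(x, z) for x, y, z in product(A, A, A)
                 if divides(x, y) and divides(y, z))


def is_chain(B) -> bool:
    """B is totally ordered if any two of its elements are comparable."""
    return all(divides(x, y) or divides(y, x) for x, y in product(B, B))


# m is maximal if no other element of A lies above it.
maximal = [m for m in A if not any(x != m and divides(m, x) for x in A)]
# z is an upper bound of A (inside A) if every element of A divides z.
upper_bounds = [z for z in A if all(divides(x, z) for x in A)]

print(reflexive, antisymmetric, transitive)
print(is_chain([1, 2, 4, 8]), is_chain([4, 6]))
print(maximal, upper_bounds)
```

In this example there are several maximal elements, yet none of them is an upper bound of the whole set, which illustrates the remark that a maximal element is not necessarily an upper bound.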

1.2 Sequences

A sequence in a nonempty set $X$ is an ordered list of elements from $X$, and can be defined as a function $f : D \to X$ whose domain $D$ is a countable, totally ordered set. The case when $D$ is finite is not considered in this book. We shall mostly consider that $D = \mathbb{N}$, and the sequence is usually denoted $(a_n)_{n \in \mathbb{N}}$, or simply $(a_n)$, where $a_n = f(n)$ for all $n \in \mathbb{N}$. Sometimes we consider infinite subsets of $\mathbb{N}$, for instance, $D = \{m, m + 1, \dots\}$, $m \in \mathbb{N}$, $m > 1$, and in this case the sequence is denoted $(a_n)_{n \ge m}$.

A sequence can also be indicated by listing its terms: $(a_n)_{n \in \mathbb{N}} = (a_1, a_2, \dots)$. For example, $(1, 3, 5, 7, \dots)$ is the sequence of odd natural numbers. It is worth pointing out that a term (element) can appear several times in a sequence, e.g., $(a_n)_{n \in \mathbb{N}} = (0, 1, 0, 1, 0, 1, \dots)$, where $a_{2k-1} = 0$ and $a_{2k} = 1$ for all $k \in \mathbb{N}$.

A subsequence of a given sequence $(a_n)_{n \in \mathbb{N}} = (a_1, a_2, \dots)$ is a new sequence $(b_k)_{k \in \mathbb{N}}$, obtained by removing some terms from $(a_1, a_2, \dots)$ and preserving the order of the remaining terms, i.e., $b_k = a_{n_k}$, $k \in \mathbb{N}$, where $n_1 < n_2 < \cdots$.

We close this section by noting that further details on sequences will be discussed later.
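The definition of a subsequence, $b_k = a_{n_k}$ with $n_1 < n_2 < \cdots$, can be mirrored directly in code. The sketch below is only an illustration: the function names `a` and `subsequence` are hypothetical, and only finitely many terms of each (infinite) sequence are inspected.

```python
from itertools import count, islice


def a(n: int) -> int:
    """The sequence (0, 1, 0, 1, ...): a_{2k-1} = 0 and a_{2k} = 1 for k = 1, 2, ..."""
    return 0 if n % 2 == 1 else 1


def subsequence(term, indices):
    """Yield b_k = a_{n_k} for a strictly increasing stream of indices n_1 < n_2 < ..."""
    previous = 0
    for n in indices:
        if n <= previous:
            raise ValueError("indices must be strictly increasing")
        previous = n
        yield term(n)


even_indices = (2 * k for k in count(1))          # n_k = 2k
odd_indices = (2 * k - 1 for k in count(1))       # n_k = 2k - 1

print(list(islice(subsequence(a, even_indices), 5)))  # first five terms of (a_{2k})
print(list(islice(subsequence(a, odd_indices), 5)))   # first five terms of (a_{2k-1})
```

The check on the indices enforces the requirement that the order of the remaining terms be preserved.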

1.3 Real Numbers

While everybody feels comfortable dealing with rational numbers, in order to understand the larger set of real numbers some effort is needed. Real numbers are needed since the set of rational numbers $\mathbb{Q}$ is not sufficiently large for many purposes. For example, the equation $p^2 = 2$ has no solution in $\mathbb{Q}$. This assertion was first proved by Euclid.² In fact, it was observed that the diagonal and the side of any square are incommensurable, i.e., the length $p$ of the diagonal of the unit square is not a rational number. Indeed, $p$ must satisfy the equation $p^2 = 2$. One needs to find a number $p$ (which cannot be a rational one) to represent the length of that diagonal. Many other similar examples appear when trying to express areas, volumes, weights, etc. So, it was really necessary to enlarge the set $\mathbb{Q}$ to obtain a set $\mathbb{R}$, called the set of real numbers, within which inconveniences such as those described above do not occur. The elements of $\mathbb{R} \setminus \mathbb{Q}$ will be called irrational numbers. In particular, the irrational number $\sqrt{2}$ will be the precise representation for the length of the diagonal of the unit square. In fact, we will see that the equation $p^2 = 2$ discussed above has two solutions in $\mathbb{R}$, $+\sqrt{2}$ and $-\sqrt{2}$. Roughly speaking, $\mathbb{R}$ is the completion of $\mathbb{Q}$, as we will explain below.

² Greek mathematician, known as father of Geometry, born around 330 BC, presumably in Alexandria, Egypt.

First of all, let us recall an axiomatic definition of $\mathbb{R}$: $\mathbb{R}$ is an ordered field, containing $\mathbb{Q}$ as a subfield, and having the least upper bound property. More precisely, $\mathbb{R}$, endowed with two internal operations, addition and multiplication, denoted “+” and “·”, and a total order, denoted “≤”, satisfies the following axioms:

(A1) $x + y = y + x$ for all $x, y \in \mathbb{R}$;
(A2) $(x + y) + z = x + (y + z)$ for all $x, y, z \in \mathbb{R}$;
(A3) there exists an element $0 \in \mathbb{R}$ such that $x + 0 = x$ for all $x \in \mathbb{R}$;
(A4) for all $x \in \mathbb{R}$ there exists an element $-x \in \mathbb{R}$ such that $x + (-x) = 0$;
(M1) $xy = yx$ for all $x, y \in \mathbb{R}$ (note that here and in what follows $x \cdot y$ is also denoted $xy$);
(M2) $(xy)z = x(yz)$ for all $x, y, z \in \mathbb{R}$;
(M3) there exists an element $1 \in \mathbb{R}$, $1 \ne 0$, such that $1 \cdot x = x$ for all $x \in \mathbb{R}$;
(M4) for all $x \in \mathbb{R} \setminus \{0\}$ there exists an element $x^{-1} \in \mathbb{R}$ (called the inverse of $x$, also denoted $\frac{1}{x}$ or $1/x$) such that $x \cdot x^{-1} = 1$;
(D) $x(y + z) = xy + xz$ for all $x, y, z \in \mathbb{R}$ (the distributive law);
(O1) if $x, y \in \mathbb{R}$ and $x \le y$, then $x + z \le y + z$ for all $z \in \mathbb{R}$;
(O2) if $x, y \in \mathbb{R}$ and $x \ge 0$, $y \ge 0$, then $xy \ge 0$;
(LUBP) for every nonempty subset $A$ of $\mathbb{R}$ that is bounded above (i.e., $A$ has an upper bound) there exists $\sup A \in \mathbb{R}$.

The axiom (LUBP) is called the least upper bound property (which is why it is so denoted) or the completeness axiom (this name will be clarified in the following).

Remark 1.2. The fact that $\mathbb{Q}$ is a subfield of $\mathbb{R}$ means that $\mathbb{Q} \subset \mathbb{R}$ and the operations of addition and multiplication in $\mathbb{R}$ are also internal operations in $\mathbb{Q}$. In fact, any ordered field $K$ contains a subfield $\mathbb{Q}_K$ which is isomorphic to $\mathbb{Q}$. Indeed, the function $g : \mathbb{Q} \to K$, defined by $g(m/n) = (m \cdot 1_K) \cdot (n \cdot 1_K)^{-1}$, is an injective morphism, so $g(\mathbb{Q})$ is a subfield of $K$ isomorphic to $\mathbb{Q}$. Thus, the condition from the definition above, that $\mathbb{R}$ contains $\mathbb{Q}$ as a subfield, is superfluous if we admit that $\mathbb{Q}$ is unique up to isomorphism. We merely wanted to make it clear that $\mathbb{R}$ is an extension of $\mathbb{Q}$.

Remark 1.3. It is worth pointing out that the extension from rational numbers to real numbers is the result of a long investigative process extended over more than 2000 years. The problem was clarified in the nineteenth century. There are several models for $\mathbb{R}$ defined by the above system of axioms, such as the Stolz–Weierstrass model,³ based on decimal expansions; Dedekind’s model,⁴ based on the so-called Dedekind cuts; and the Cantor–Méray model.⁵ All these models are based on approximation (as are all models of $\mathbb{R}$). We shall describe the Cantor–Méray construction, which involves Cauchy sequences of rational numbers and uses the basic properties of $\mathbb{Q}$ as an ordered field. Intuitively speaking, according to this construction, $\mathbb{R}$ will consist of all rational numbers, plus “limits” of Cauchy⁶ sequences in $\mathbb{Q}$ which are not rational numbers. The most important step in this construction (completion procedure) will be to show that the completeness axiom is satisfied by this model, denoted $\mathbb{R}_{C\text{-}M}$ ($C$-$M$ comes from Cantor–Méray), thus ensuring that any Cauchy sequence of rational numbers is “convergent” (has a “limit”) in $\mathbb{R}_{C\text{-}M}$.
But such “limits” cannot be used in this construction (one cannot define real numbers by themselves!), so instead we consider as elements of $\mathbb{R}_{C\text{-}M}$ the equivalence classes of Cauchy rational sequences (two sequences being equivalent if the corresponding sequence of differences approaches zero); one considers equivalence classes because the sequence which is supposed to define (“converge to”) a real number is not unique. Finally, we will prove that any two copies of $\mathbb{R}$ are isomorphic, thus concluding that $\mathbb{R}$ is unique up to isomorphism.

³ Otto Stolz, Austrian mathematician, 1842–1905; Karl Weierstrass, German, known as father of modern analysis, 1815–1897.
⁴ Richard Dedekind, German mathematician, 1831–1916.
⁵ Georg Cantor, German mathematician, 1845–1918; Charles Méray, French mathematician, 1835–1911.
⁶ Augustin-Louis Cauchy, French mathematician, engineer and physicist, 1789–1857.

Before presenting in detail the Cantor–Méray model, we will make a few comments and derive some abstract results regarding $\mathbb{R}$ as defined by the axioms above.

Remark 1.4. It is easily seen that (LUBP) implies that for any nonempty set $A \subset \mathbb{R}$ which is bounded below (i.e., has a lower bound), there exists the greatest lower bound of $A$, denoted $\inf A \in \mathbb{R}$. In fact, $\inf A = -\sup\{x \in \mathbb{R};\ -x \in A\}$. The converse implication is also true, so one may replace (LUBP) by this equivalent statement.

Remark 1.5. It is worth pointing out that the (LUBP) is precisely what makes the difference between $\mathbb{R}$ and $\mathbb{Q}$. Indeed, $\mathbb{Q}$ is an ordered field, but does not satisfy the (LUBP), as illustrated by the following counterexample. Let $A \subset \mathbb{Q}$ denote the set $\{p \in \mathbb{Q} : p > 0,\ p^2 < 3\}$. $A$ is nonempty, since $1 \in A$. Obviously, $A$ is bounded above (e.g., 2 is an upper bound of $A$). Assume by contradiction that there exists a number $\alpha \in \mathbb{Q}$ which is the least upper bound of $A$, $\alpha = \sup A$. Then $\alpha \ge 1$ and we need to examine the following three possibilities: $\alpha^2 < 3$, $\alpha^2 = 3$, and $\alpha^2 > 3$.

If $\alpha^2 < 3$, then $(2\alpha + 3)/(\alpha + 2) > \alpha$, and $(2\alpha + 3)/(\alpha + 2) \in A$, so $\alpha$ is not even an upper bound of $A$.

The case $\alpha \in \mathbb{Q}$, $\alpha^2 = 3$ is impossible (prove it!).

Finally, if $\alpha^2 > 3$, then $\beta := (2\alpha + 3)/(\alpha + 2) \in \mathbb{Q}$, $\beta > 0$ (since $\alpha \in \mathbb{Q}$, $\alpha \ge 1$), and $\alpha - \beta = (\alpha^2 - 3)/(\alpha + 2) > 0$, hence $\beta < \alpha$. On the other hand, $3 - \beta^2 = (3 - \alpha^2)/(\alpha + 2)^2 < 0$, so $\beta^2 > 3$. It follows that $\beta$ is an upper bound for $A$, with $\beta < \alpha$. This contradicts the fact that $\alpha = \sup A$.

Since none of the above cases is possible, there is no rational number $\alpha$ such that $\alpha = \sup A$. Therefore $\mathbb{Q}$ does not satisfy the (LUBP). Note that if $A$ is considered as a subset of $\mathbb{R}$, then there exists $\sup A = \sqrt{3} \in \mathbb{R} \setminus \mathbb{Q}$ (see below).
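The map $\alpha \mapsto \beta = (2\alpha + 3)/(\alpha + 2)$ used in Remark 1.5 can be explored numerically with exact rational arithmetic. The following Python sketch (the names `step` and `iterate` are ours, and the starting points 2 and 1 are arbitrary choices) applies the map repeatedly: starting from a rational upper bound of $A$ it produces ever smaller rational upper bounds, and starting from an element of $A$ it produces ever larger elements of $A$.

```python
from fractions import Fraction


def step(alpha: Fraction) -> Fraction:
    """The map alpha -> beta = (2*alpha + 3) / (alpha + 2) from Remark 1.5."""
    return (2 * alpha + 3) / (alpha + 2)


def iterate(alpha: Fraction, n: int) -> list:
    """Return [alpha, step(alpha), step(step(alpha)), ...] with n applications of step."""
    values = [alpha]
    for _ in range(n):
        values.append(step(values[-1]))
    return values


uppers = iterate(Fraction(2), 4)   # starts with 2, whose square exceeds 3
lowers = iterate(Fraction(1), 4)   # starts with 1, which belongs to A

for u in uppers:
    print(u, u * u > 3)            # each term is a rational upper bound of A
for l in lowers:
    print(l, l * l < 3)            # each term is an element of A
```

No term of either list can be the least upper bound of $A$ in $\mathbb{Q}$, because the next term is always a better candidate; this is exactly the mechanism of the counterexample.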
Now, we present a result known as the Archimedean⁷ property:

⁷ Archimedes of Syracuse, 287–212 BC.

Theorem 1.6. If $x, y \in \mathbb{R}$ and $x > 0$, then there exists $n \in \mathbb{N}$ such that $nx > y$.

Proof. Assume that, on the contrary, $nx \le y$ for all $n \in \mathbb{N}$, so the set $A = \{nx;\ n \in \mathbb{N}\}$ is bounded above. Then the (LUBP) implies that there exists $\alpha = \sup A \in \mathbb{R}$. Since $\alpha - x < \alpha$, there exists an element of $A$, say $mx$, with $m \in \mathbb{N}$, such that $\alpha - x < mx$, which is equivalent to $\alpha < (m + 1)x \in A$. This contradicts the fact that $\alpha$ is an upper bound of $A$.

Theorem 1.7. $\mathbb{Q}$ is dense in $\mathbb{R}$, i.e., between any two distinct real numbers there is a rational number.

Proof. Let $x, y \in \mathbb{R}$, $x < y$. Since $y - x > 0$ it follows by the Archimedean property that there exists an $n \in \mathbb{N}$ such that

$$n(y - x) > 1. \tag{1.3.1}$$

By the same Archimedean property there exist $w, z \in \mathbb{N}$ such that $-w < -nx < z$. In fact, $w$ can be replaced by $m := -\sup\{r \in \mathbb{Z} : -w \le r < -nx\}$, so $nx < m$. Moreover,

$$nx < m \le nx + 1. \tag{1.3.2}$$

By (1.3.1) and (1.3.2) we can conclude that $x < m/n < y$.

Theorem 1.8 (existence of $n$-th roots of positive reals). For all $x \in \mathbb{R}$, $x > 0$, and for all $n \in \mathbb{N}$, $n \ge 2$, there exists a unique $y \in \mathbb{R}$, $y > 0$, such that $y^n = x$.

Proof. The uniqueness of $y$ follows from the implication $0 < y_1 < y_2 \Rightarrow y_1^n < y_2^n$. To prove the existence of $y$ consider the set $A = \{t \in \mathbb{R};\ t > 0,\ t^n < x\}$. $A$ is nonempty, since it contains $t_1 = x/(1 + x)$. Indeed, $t_1^n < t_1 < x$. $A$ is also bounded above (for example, $1 + x$ is an upper bound for $A$). By the (LUBP) there exists $y = \sup A \in \mathbb{R}$, $y > 0$. Let us prove that $y^n = x$. Assuming that $y^n < x$, we have for $0 < \varepsilon < 1$,

$$(y + \varepsilon)^n - y^n = \varepsilon\left[(y + \varepsilon)^{n-1} + y(y + \varepsilon)^{n-2} + \cdots + y^{n-1}\right] < \varepsilon n (y + 1)^{n-1}.$$

Hence

$$(y + \varepsilon)^n < y^n + \varepsilon z, \tag{1.3.3}$$

where $z = n(y + 1)^{n-1}$. By the Archimedean property, there is a $k \in \mathbb{N}$, $k \ge 2$, such that $\varepsilon = 1/k$ satisfies

$$\varepsilon z < x - y^n. \tag{1.3.4}$$

From (1.3.3) and (1.3.4) it follows that $(y + \varepsilon)^n < x$, i.e., $y + \varepsilon \in A$, which contradicts the fact that $y = \sup A$. We can also show that $y^n > x$ leads to a contradiction. Hence, $y^n = x$.

The $n$-th root $y$ of the real number $x > 0$ is denoted $\sqrt[n]{x}$ ($\sqrt{x}$ if $n = 2$) or $x^{1/n}$. At this moment, we can see that in particular the equation $p^2 = 2$ can be solved in $\mathbb{R}$: $p^2 = 2 \Leftrightarrow p^2 - (\sqrt{2})^2 = 0 \Leftrightarrow (p - \sqrt{2})(p + \sqrt{2}) = 0$, so there are two solutions, $p = \sqrt{2}$ and $p = -\sqrt{2}$. The number $\sqrt{2}$, which is irrational, represents in particular the length of the diagonal of the unit square. So, the difficulty pointed out by Euclid can be handled in $\mathbb{R}$. Similarly, $\sqrt{3}$ is an irrational number representing the length of the diagonal of the unit cube.

Remark 1.9. Sometimes it is useful to represent numbers by points on a straight line. First, let us mark arbitrarily two distinct points $O$ and $A$ on the straight line to represent the numbers 0 and 1. The line segment $OA$ is called the unit segment. If we choose a point $P$ to the right of $A$, such that $OP$ consists of $m$ unit segments, $m \in \mathbb{N}$, $m \ge 2$, then $P$ represents the natural number $m$. The negative integers are similarly represented by points on the left of $O$, following the natural order $\dots, -3, -2, -1$. So now we have a directed straight line, called the number line, including the positive half-line (on the right of $O$) and the negative half-line. One can also associate with any rational number a point on the number line. For example, if one divides $OA$ into 2 equal parts and chooses a point $R$ on the positive half-line, such that $OR$ is equal to 3 such parts, then $R$ represents $3/2$. Obviously, the points corresponding to distinct rational numbers are distinct too. Note that the set of points on the number line corresponding to all the rational numbers does not cover the number line. For example, the point $D$ corresponding to the length of the diagonal of the unit square (i.e., $\sqrt{2}$) is on the number line ($D$ being constructible by using a ruler and compass), but it does not correspond to any rational number. We will discuss later the representation of irrational numbers by points on the number line.

Sequences of Real Numbers. A sequence $(a_n)_{n \in \mathbb{N}}$ in $\mathbb{R}$ is said to be increasing (or nondecreasing) if $a_n \le a_{n+1}$ for all $n \in \mathbb{N}$. If $a_n < a_{n+1}$ for all $n \in \mathbb{N}$, then $(a_n)$ is called strictly increasing. Similarly, if the order relations “≤” and “<” are replaced by “≥” and “>”, we obtain the definitions for a decreasing (or nonincreasing) sequence, and a strictly decreasing sequence, respectively.
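The monotone sequences just defined arise naturally from the set $A = \{t > 0;\ t^n < x\}$ used in the proof of Theorem 1.8. The following Python sketch is only an illustration under our own assumptions: it uses exact rational arithmetic and bisection (a technique not used in the proof itself), and the name `bisect_root` and the starting interval $[0, 1 + x]$ are hypothetical choices. It produces an increasing sequence of rationals $l_k$ with $l_k^n < x$ and a decreasing sequence of upper bounds $u_k$ of $A$.

```python
from fractions import Fraction


def bisect_root(x, n: int, steps: int):
    """Bracket the n-th root of x > 0 by rationals: l**n < x <= u**n at every step."""
    lo, hi = Fraction(0), Fraction(1 + x)   # 1 + x is an upper bound of A, as in the proof
    lowers, uppers = [lo], [hi]
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid                        # mid**n < x, so mid can serve as a lower endpoint
        else:
            hi = mid                        # mid**n >= x, so mid is an upper bound of A
        lowers.append(lo)
        uppers.append(hi)
    return lowers, uppers


lowers, uppers = bisect_root(Fraction(2), 2, 20)
print(float(lowers[-1]), float(uppers[-1]))   # both endpoints are close to sqrt(2)
print(uppers[-1] - lowers[-1])                # the gap is halved at every step
```

Here `lowers` is increasing and `uppers` is decreasing in the sense of the definitions above (not necessarily strictly), and the gap $u_k - l_k = (1 + x)/2^k$ becomes arbitrarily small, while no rational endpoint ever has square exactly equal to 2.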

A sequence $(a_n)_{n \in \mathbb{N}}$ in $\mathbb{R}$ is said to be bounded above (bounded below) if there exists an $M \in \mathbb{R}$ such that $a_n \le M$ ($a_n \ge M$, respectively) for all $n \in \mathbb{N}$. If $(a_n)$ is bounded both above and below, then it is called bounded.

A sequence $(a_n)_{n \in \mathbb{N}}$ in $\mathbb{R}$ is said to be convergent if there exists a number $a \in \mathbb{R}$ (called the limit of $(a_n)$) such that

$$\forall \varepsilon \in \mathbb{R},\ \varepsilon > 0,\ \exists N = N(\varepsilon) \in \mathbb{N} \ \text{such that}\ \forall n > N,\ |a_n - a| < \varepsilon.$$

Here, $|\cdot|$ means the absolute value function, i.e., $|x| = x$ if $x \ge 0$, and $|x| = -x$ if $x < 0$. The above definition (of a convergent sequence) will be discussed again later in a more general framework. Here we are interested in some properties of sequences of real numbers. It is easily seen that any convergent sequence is bounded, and its limit is unique. Next, we state the so-called Monotone Convergence Theorem:

Theorem 1.10 (Monotone Convergence Theorem). Any sequence $(a_n)_{n \in \mathbb{N}}$ in $\mathbb{R}$ which is increasing (decreasing) and bounded is convergent.

Proof. We consider the case when $(a_n)$ is increasing and bounded (the other case is similar). Since the set of all $a_n$’s (where repetitions are eliminated) is bounded above, it follows by (LUBP) that there exists its supremum $a \in \mathbb{R}$. Thus, for all $\varepsilon \in \mathbb{R}$, $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $a - \varepsilon < a_N$. Since $(a_n)$ is increasing, we have $a - \varepsilon < a_n$ for all $n > N$, so

$$|a_n - a| = a - a_n < \varepsilon \quad \forall n > N.$$

We continue with the following result known as Bolzano–Weierstrass’ Theorem.⁸

⁸ Bernard Bolzano, Bohemian mathematician, logician, philosopher, and theologian, 1781–1848.

Theorem 1.11 (Bolzano–Weierstrass). Every bounded sequence in $\mathbb{R}$ has a convergent subsequence.

Proof. Let $(a_n)_{n \in \mathbb{N}}$ be a bounded sequence in $\mathbb{R}$. Let $k$ be a natural number with the property $a_k > a_m$ for all $m > k$. Assume there are infinitely many such $k$’s, say $k = n_j$, $n_1 < n_2 < \cdots < n_j < \cdots$. Then, the subsequence $(a_{n_j})_{j \in \mathbb{N}}$ is strictly decreasing, hence convergent since it is also bounded (cf. Theorem 1.10).

If the set of such k’s is ﬁnite (possibly empty), we denote by K the maximum of such k’s. Obviously, for n1 = K + 1 there exists an n2 ∈ N, such that an1 ≤ an2 . Now, since n2 does not belong to the set of k’s, there exists an n3 ∈ N such that an2 ≤ an3 . Continuing this procedure we obtain a subsequence (anj )j∈N which is increasing and bounded, hence convergent (cf. Theorem 1.10). A sequence (an )n∈N in R is said to be a Cauchy sequence if ∀ε ∈ R, ε > 0, ∃N = N (ε) ∈ N such that ∀ n, m > N, |an − am | < ε. Theorem 1.12. A sequence in R is Cauchy if and only if it is convergent. Proof. Let (an )n∈N be a Cauchy sequence in R. It is easily seen that (an ) is bounded. Thus, by Theorem 1.11, there is a convergent subsequence, say (ank )k∈N . Let a ∈ R be its limit. By the triangle inequality (which obviously holds in R), we have |an − a| ≤ |an − ank | + |ank − a|. Using this inequality we easily conclude that (an ) is convergent (with the same limit a). The converse implication is trivial. The facts recalled above, derived from the axiomatic deﬁnition of R, are important in real analysis and also help us understand the Cantor– M´eray model for R. The Cantor–M´ eray Construction. Assume that Q (the ordered ﬁeld of rational numbers) is known. We want to extend Q to obtain a larger ordered ﬁeld satisfying in addition the (LUBP). Denote by SQ the collection of all Cauchy sequences of rational numbers. When deﬁning a Cauchy sequence in Q we require ε ∈ Q, ε > 0 (since the extension of Q is not yet known). Deﬁne the following equivalence relation in SQ (an ) ∼ (bn ) iﬀ ∀ε ∈ Q, ε > 0, ∃N ∈ N such that ∀n > N, |an − bn | < ε. For example, the sequences (an ), (bn ), (cn ), deﬁned by an = 1/n, bn = n/(n2 + 1), cn = 0 for all n ≥ 1, belong to the same equivalence

1.3 Real Numbers


class, i.e., the class of the constant sequence $(0, 0, \dots)$, which can be identified with $0 \in \mathbb{Q}$. We identify any $r \in \mathbb{Q}$ with the equivalence class of the constant sequence $(r, r, \dots)$.

Let us denote by $\mathbb{R}_{C-M}$ the set of all equivalence classes in $S_{\mathbb{Q}}$ (with respect to the equivalence relation defined above). Obviously, $\mathbb{Q}$ can be regarded as a subset of $\mathbb{R}_{C-M}$ (in view of the natural identification mentioned above). Now, one defines in a natural manner the operations of addition and multiplication in $\mathbb{R}_{C-M}$. Specifically, if $a, b$ are classes in $\mathbb{R}_{C-M}$ with representatives $(a_n), (b_n) \in S_{\mathbb{Q}}$, then $a + b$ and $ab$ are defined as the equivalence classes of $(a_n + b_n)$ and $(a_n b_n)$, respectively. Also, $a \le b$ if for all $\varepsilon \in \mathbb{Q}$, $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ such that $b_n - a_n \ge -\varepsilon$ for all $n \ge N$. Note that the strict inequality $a < b$ (i.e., $a \le b$ and $a \ne b$) can be equivalently expressed as follows: there exists an $\varepsilon_0 \in \mathbb{Q}$, $\varepsilon_0 > 0$, such that $b_n - a_n \ge \varepsilon_0$ for all $n$ large enough. Likewise, these definitions do not depend on specific representatives.

It is easily seen that $\mathbb{R}_{C-M}$ is an ordered field satisfying axioms (A1)–(A4), (M1)–(M4), (D), and (O1)–(O2). Let us now prove that $\mathbb{R}_{C-M}$ also satisfies the (LUBP). Let $\Omega$ be a nonempty subset of $\mathbb{R}_{C-M}$ which is bounded above, with an upper bound $a \in \mathbb{R}_{C-M}$. One may assume that $a$ is the class of a constant sequence $(u_0, u_0, \dots)$ with $u_0 \in \mathbb{Q}$ (if this is not the case, we can use the information that a Cauchy sequence in $\mathbb{Q}$ has an upper bound in $\mathbb{Q}$, so $a$ can be replaced by the class of a constant sequence $(u_0, u_0, \dots)$, where $u_0$ is a large rational number). Let us pick an $s_0 \in \Omega$ and a rational number $l_0$ such that $l_0 < s_0$, where $l_0$ is identified with the class of the constant sequence $(l_0, l_0, \dots)$.

Next, we construct two sequences of rational numbers $(u_n)$ and $(l_n)$ as follows: $u_1 = u_0$ and $l_1 = l_0$, then, successively, for $n = 1, 2, \dots$, either $u_{n+1} = (u_n + l_n)/2$, $l_{n+1} = l_n$ if $(u_n + l_n)/2$ is an upper bound of $\Omega$, or $u_{n+1} = u_n$ and $l_{n+1} = (u_n + l_n)/2$ if $(u_n + l_n)/2$ is not an upper bound of $\Omega$. By induction we can see that $u_n$ is an upper bound of $\Omega$ for all $n \in \mathbb{N}$, while $l_n$ is not an upper bound of $\Omega$ for any $n \in \mathbb{N}$. Obviously, $(u_n)$ and $(l_n)$ are Cauchy sequences, so their classes $u, l \in \mathbb{R}_{C-M}$, and in fact $u = l$, since

$$|u_n - l_n| = u_n - l_n = (u_0 - l_0)/2^{n-1}, \quad n \ge 1.$$

It is also obvious that $u$ is an upper bound of $\Omega$. Let us prove that $u$ is the least upper bound: $u = \sup \Omega$. Assume that there exists a smaller upper bound, say $v \in \mathbb{R}_{C-M}$, $v < u = l$. Since $l_k \le l_{k+1}$ for all $k \in \mathbb{N}$, there exists an $N \in \mathbb{N}$ such that $v < l_N$. But $l_N$ is not an upper bound of $\Omega$, hence $v$ (which is smaller than $l_N$) cannot be an upper bound of $\Omega$, leading to a


contradiction. Therefore, $\mathbb{R}_{C-M}$ satisfies all the axioms and is indeed a model for $\mathbb{R}$.

Remark 1.13. Let us summarize: any element $x \in \mathbb{R}_{C-M}$ is the equivalence class of a Cauchy sequence in $\mathbb{Q}$, say $(r_n)$ (this could be a constant sequence if $x \in \mathbb{Q}$); since $\mathbb{R}_{C-M}$ is a model for $\mathbb{R}$ (a complete ordered field), we know that $(r_n)$ is convergent (see Theorem 1.12); its limit (which is independent of the choice of $(r_n)$ in the class $x$) can be identified with $x$. So now we have a clear representation of $\mathbb{R}_{C-M}$, including rational and irrational numbers.

The Real Number System (Model) is Unique up to Isomorphism. Let $\hat{\mathbb{R}}$ be another model for $\mathbb{R}$. As before, we admit that $\mathbb{Q}$ is unique up to isomorphism, so $\mathbb{Q}$ is a subfield of both $\mathbb{R}_{C-M}$ and $\hat{\mathbb{R}}$. Since $\mathbb{Q}$ is dense in $\hat{\mathbb{R}}$ (see Theorem 1.7), for any $x \in \hat{\mathbb{R}}$ there exists a sequence of rational numbers $(r_n)$ that converges to $x$ (this sequence can be the constant sequence $(x, x, \dots)$ if $x \in \mathbb{Q}$). Of course, $(r_n)$ is a Cauchy sequence. We associate with such an $x$ the class of $(r_n)$ with respect to the equivalence relation "$\sim$" defined above; this class does not depend on the chosen sequence, since any two rational sequences converging to $x$ are equivalent. So we have defined a mapping $\phi : \hat{\mathbb{R}} \to \mathbb{R}_{C-M}$, $\phi(x) =$ the class of $(r_n)$. It is easily seen that $\phi$ is a bijection, and

$$\phi(x + y) = \phi(x) + \phi(y) \quad \forall x, y \in \hat{\mathbb{R}},$$
$$\phi(x \cdot y) = \phi(x) \cdot \phi(y) \quad \forall x, y \in \hat{\mathbb{R}},$$
$$x > 0 \implies \phi(x) > 0.$$

Therefore, $\hat{\mathbb{R}}$ is isomorphic to $\mathbb{R}_{C-M}$, hence any two real number models are isomorphic. So in what follows we will consider that the real number system is unique and denote it by $\mathbb{R}$.

The Dedekind–Cantor Axiom on Continuity of a Straight Line. We discussed in Remark 1.9 how to represent rational numbers on a directed straight line. Now, taking into account the Cantor–Méray construction, we can complete the procedure by representing irrational numbers. We see that to every real number there corresponds a unique point of the directed straight line, and the correspondence is one-to-one. The Dedekind–Cantor axiom stipulates that there are no gaps on the line after representing all real numbers, that is, there is a one-to-one correspondence between $\mathbb{R}$ and the points of the directed straight line. The directed straight line will be called the real line, and real numbers will sometimes be called points.
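The bisection construction of $(u_n)$ and $(l_n)$ used above to prove the (LUBP) can be tried out with exact rational arithmetic. The following is only a minimal sketch: the function names and the sample set $\Omega = \{x \in \mathbb{Q} : x^2 < 2\}$ are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def bisect_sup(is_upper_bound, lower, upper, steps):
    """Run the (u_n, l_n) bisection from the proof of the (LUBP).

    is_upper_bound(q) decides whether the rational q is an upper bound
    of Omega; `upper` must be an upper bound and `lower` must not be one.
    Returns the lists [u_1, ..., u_{steps+1}] and [l_1, ..., l_{steps+1}].
    """
    u, l = Fraction(upper), Fraction(lower)
    us, ls = [u], [l]
    for _ in range(steps):
        mid = (u + l) / 2
        if is_upper_bound(mid):
            u = mid  # u_{n+1} = (u_n + l_n)/2, l_{n+1} = l_n
        else:
            l = mid  # u_{n+1} = u_n, l_{n+1} = (u_n + l_n)/2
        us.append(u)
        ls.append(l)
    return us, ls


def bounds_sample_set(q):
    """Hypothetical example: q is an upper bound of {x in Q : x*x < 2}."""
    return q > 0 and q * q > 2


if __name__ == "__main__":
    us, ls = bisect_sup(bounds_sample_set, 1, 2, 20)
    print(float(ls[-1]), float(us[-1]))  # both close to the supremum
```

Every $u_n$ stays an upper bound, no $l_n$ is one, and the gap halves at each step, exactly as in the proof; the common class of the two sequences plays the role of $\sup \Omega$.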


The Extended Real Number System. Sometimes it is necessary to describe mathematically what happens "beyond" real numbers. For example, $1/x$ gets closer and closer to zero when $x$ gets larger and larger. Having in mind that the point on the real line corresponding to $x$ goes far away to the right, we usually say that $x$ tends to infinity, and write $x \to +\infty$. The fact that $1/x$ tends to zero as $x \to \infty$ can be written as $\frac{1}{+\infty} = 0$. Similar situations require the introduction of the symbol $-\infty$. So we are led to the so-called extended real number system,

$$\overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}.$$

The usual order in $\mathbb{R}$ is preserved, and we define $-\infty < x < +\infty$ $\forall x \in \mathbb{R}$. Then $+\infty$ ($-\infty$) is an upper bound (lower bound, respectively) of every nonempty subset of $\overline{\mathbb{R}}$. Moreover, any nonempty subset has a least upper bound. For instance, $E = \{x + \frac{1}{x} : x \in \mathbb{R},\ x \ne 0\}$ has $\sup E = +\infty$ and $\inf E = -\infty$. The symbol $+\infty$ is also denoted by $\infty$. In accordance with our intuition, we adopt the following conventions:

$$\frac{x}{\infty} = \frac{x}{-\infty} = 0, \quad x + \infty = \infty, \quad x - \infty = -\infty \quad \forall x \in \mathbb{R};$$
$$x \cdot \infty = \infty, \quad x \cdot (-\infty) = -\infty \quad \forall x \in \mathbb{R},\ x > 0;$$
$$x \cdot \infty = -\infty, \quad x \cdot (-\infty) = +\infty \quad \forall x \in \mathbb{R},\ x < 0;$$
$$\infty + \infty = \infty, \quad -\infty - \infty = -\infty, \quad \infty \cdot \infty = \infty, \quad \infty \cdot (-\infty) = -\infty, \quad (-\infty) \cdot (-\infty) = +\infty.$$

On the other hand, operations like

$$0 \cdot (\pm\infty), \quad \infty - \infty, \quad \frac{\infty}{\infty}$$

are not accepted. For example, $x/(1 + x^2)$ approaches $0$ as $x \to \infty$, while $x/(1 + \sqrt{x})$ approaches $+\infty$ as $x \to \infty$. Thus, the quotient of two large numbers may approach either $0$ or $\infty$. That is why we say that $\frac{\infty}{\infty}$ does not make sense. Note that $\overline{\mathbb{R}}$ does not form a field (why?).

We assume familiarity of the reader with sequences and series of real numbers. For information see, for example, [33, 41, 42].
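As an aside, IEEE 754 floating-point arithmetic follows the same conventions for $\pm\infty$ and flags the excluded expressions with NaN ("not a number"). The sketch below uses only Python's standard `math` module; the helper names are hypothetical, and floats are of course only an approximation of $\overline{\mathbb{R}}$.

```python
import math

INF = math.inf


def conventions_hold(x):
    """Check the conventions listed above for a finite real x != 0."""
    ok = (x / INF == 0) and (x / -INF == 0)
    ok = ok and (x + INF == INF) and (x - INF == -INF)
    if x > 0:
        ok = ok and (x * INF == INF) and (x * -INF == -INF)
    else:
        ok = ok and (x * INF == -INF) and (x * -INF == INF)
    return ok


def undefined_forms():
    """The three excluded expressions; each evaluates to NaN."""
    return [0 * INF, INF - INF, INF / INF]


if __name__ == "__main__":
    print(conventions_hold(3.5), conventions_hold(-2.0))
    print(undefined_forms())
```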


The Number e. Sometimes checking whether a real number is irrational is not a trivial task. The number commonly denoted by $e$ is an example in this respect. It is defined as the sum of a series, namely

$$e = \sum_{n=0}^{\infty} \frac{1}{n!},$$

where $n! = 1 \cdot 2 \cdot 3 \cdots n$ for $n \ge 1$, and $0! = 1$. Let $s_n$ denote the partial sum of the series, i.e., $s_n = \sum_{k=0}^{n} \frac{1}{k!}$. By the ratio test we see that the series converges, hence $e \in \mathbb{R}$. More precisely,

$$2 < e < 1 + \sum_{k=0}^{\infty} \frac{1}{2^k} = 3. \tag{1.3.5}$$

Note that

$$e - s_n < \frac{1}{(n+1)!} \sum_{k=0}^{\infty} \frac{1}{(k+1)^2},$$

hence

$$0 < e - s_n < \frac{2}{n!\,n}. \tag{1.3.6}$$

Let us now prove that $e$ is irrational. Assume the contrary, that $e = p/q$, where $p, q \in \mathbb{N}$, $(p, q) = 1$. In fact, $q > 1$ (see (1.3.5)). From (1.3.6) we infer that

$$0 < q!\left(\frac{p}{q} - s_q\right) < \frac{2}{q}. \tag{1.3.7}$$

Observing that $q!\,s_q \in \mathbb{N}$, we have $m := q!\left(\frac{p}{q} - s_q\right) \in \mathbb{N}$. Since $q \ge 2$, we deduce from (1.3.7) that $0 < m < 1$, which is impossible (there is no integer between 0 and 1). Therefore $e$ is irrational, as claimed.

Remark 1.14. By an argument from Rudin [42, p. 64] we see that $e$ is also the limit of the sequence $(x_n)_{n\in\mathbb{N}}$ defined by $x_n = \left(1 + \frac{1}{n}\right)^n$. Using the binomial formula we can write

$$x_n = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right).$$

Then, for all $n, m \in \mathbb{N}$, $n \ge m \ge 2$, we have

$$1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \cdots + \frac{1}{m!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{m-1}{n}\right) \le x_n \le s_n,$$

which implies

$$1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{m!} \le \liminf x_n \le \limsup x_n \le e.$$

Letting $m \to \infty$ on the left-hand side, we conclude that $e = \lim x_n$ exists, as claimed.
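The estimate (1.3.6) and the inequality $x_n \le s_n$ from Remark 1.14 can be checked numerically for small $n$. This is an illustrative sketch only: the function names are made up here, the sums are exact rationals, and the value of $e$ is the double-precision constant `math.e`, so the comparison is meaningful only while $e - s_n$ is well above rounding error (roughly $n \le 10$).

```python
from fractions import Fraction
from math import e, factorial


def partial_sum(n):
    """s_n = sum_{k=0}^{n} 1/k!, computed exactly."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


def power_seq(n):
    """x_n = (1 + 1/n)^n, computed exactly."""
    return (1 + Fraction(1, n)) ** n


if __name__ == "__main__":
    for n in range(1, 11):
        gap = Fraction(e) - partial_sum(n)        # approximately e - s_n
        bound = Fraction(2, factorial(n) * n)     # right-hand side of (1.3.6)
        print(n, float(gap), float(bound), float(power_seq(n)))
```

The printed rows show how much faster $s_n$ approaches $e$ than $x_n$ does.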

1.4

Complex Numbers

We assume that the reader is familiar with the complex field. In what follows we just recall its construction and some notation.

Let $\mathbb{C}$ denote the Cartesian⁹ product $\mathbb{R} \times \mathbb{R}$ equipped with two internal operations, addition and multiplication, defined as follows:

$$(x, y) + (u, v) = (x + u, y + v), \qquad (x, y) \cdot (u, v) = (xu - yv, xv + yu).$$

It is easy to check that $\mathbb{C}$ is a field, with $(0, 0)$ and $(1, 0)$ in the role of 0 and 1, respectively. In particular, for any $z = (x, y) \in \mathbb{C}$, $z \ne (0, 0)$, we have

$$z^{-1} = \left(\frac{x}{x^2 + y^2}, \frac{-y}{x^2 + y^2}\right).$$

Note that the set $\mathbb{R}_1 := \{(x, 0);\ x \in \mathbb{R}\}$ is a subfield of $\mathbb{C}$ with respect to the above operations, which read in this case

$$(x, 0) + (u, 0) = (x + u, 0), \qquad (x, 0) \cdot (u, 0) = (xu, 0).$$

Thus any $(x, 0)$ can be identified with $x$, and $\mathbb{R}_1$ with these operations can be identified with $\mathbb{R}$ with the usual operations of addition and multiplication. So $\mathbb{R}$ can be viewed as a subfield of $\mathbb{C}$. Any $z = (x, y) \in \mathbb{C}$ can be decomposed as

$$z = (x, 0) + (y, 0) \cdot (0, 1),$$

so in view of the above identification, we can write $z = x + yi$, where $i := (0, 1)$. Note that $(0, 1) \cdot (0, 1) = (-1, 0)$, thus we can write $i^2 = -1$; $i$ is called the imaginary unit. Summarizing, we can write

$$\mathbb{C} = \{x + yi;\ x, y \in \mathbb{R}\}$$

and observe that the two operations initially defined can be viewed as the addition and multiplication similar to those used for real numbers if we admit that $i^2 = -1$. The elements $z = x + yi$ of $\mathbb{C}$ are called complex numbers, and $\mathbb{C}$ is known as the complex field or complex number system.

For a complex number $z = x + yi$, the real numbers $x$ and $y$ are called the real part and the imaginary part of $z$, respectively (denoted $x = \operatorname{Re} z$, $y = \operatorname{Im} z$). Complex numbers $z = x + yi$ can be represented by points (of coordinates $x$, $y$) in the complex plane, which is determined by two orthogonal directed straight lines with the same unit, the $x$-axis (real axis) and the $y$-axis (imaginary axis). Let $\bar{z} = x - yi$ be the complex conjugate of $z = x + yi$. Note that $z \cdot \bar{z} = x^2 + y^2 \in \mathbb{R}$. The number $|z| = \sqrt{x^2 + y^2}$ is called the magnitude of $z$, and it represents the length of the segment connecting the origin $O$ of the complex plane and the point of coordinates $x$ and $y$ corresponding to $z$.

⁹ René Descartes, latinized Renatus Cartesius, French mathematician, philosopher, and scientist, 1596–1650.
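The pair construction of $\mathbb{C}$ recalled above is easy to experiment with. In the sketch below, complex numbers are plain tuples `(x, y)`; the function names are illustrative assumptions, and Python's built-in `complex` type is used only as a cross-check.

```python
from fractions import Fraction


def add(z, w):
    """(x, y) + (u, v) = (x + u, y + v)."""
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    """(x, y) * (u, v) = (xu - yv, xv + yu)."""
    x, y = z
    u, v = w
    return (x * u - y * v, x * v + y * u)


def inv(z):
    """z^{-1} = (x/(x^2+y^2), -y/(x^2+y^2)) for z != (0, 0), exactly."""
    x, y = z
    n = x * x + y * y
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(x) / n, Fraction(-y) / n)


if __name__ == "__main__":
    i = (0, 1)
    print(mul(i, i))                      # the pair identified with -1
    print(mul((3, 4), inv((3, 4))))       # the pair identified with 1
    print(complex(*mul((1, 2), (3, 4))), (1 + 2j) * (3 + 4j))
```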

1.5

Linear Spaces

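Recall that a nonempty set $X$ is said to be a linear space (or vector space) over a field $\mathbb{K}$ if there exist a binary operation on $X$, called addition, $+ : X \times X \to X$, and an external binary operation, called scalar multiplication, $\cdot : \mathbb{K} \times X \to X$, such that the following axioms are satisfied:

(A1) $(x + y) + z = x + (y + z)$ $\forall x, y, z \in X$;

(A2) $x + y = y + x$ $\forall x, y \in X$;

(A3) $\exists\, 0 \in X$, called zero, such that $x + 0 = x$ $\forall x \in X$;

(A4) $\forall x \in X$ $\exists\, {-x} \in X$ such that $x + (-x) = 0$;

(A5) $1 \cdot x = x$ $\forall x \in X$, where 1 is the unit element of the field $\mathbb{K}$;

(A6) $\alpha(\beta x) = (\alpha\beta)x$ $\forall \alpha, \beta \in \mathbb{K}$, $\forall x \in X$;

(A7) $(\alpha + \beta)x = \alpha x + \beta x$ $\forall \alpha, \beta \in \mathbb{K}$, $\forall x \in X$;

(A8) $\alpha(x + y) = \alpha x + \alpha y$ $\forall \alpha \in \mathbb{K}$, $\forall x, y \in X$.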

The first four axioms ensure that $X$ is an Abelian¹⁰ group with respect to addition. In the following $\mathbb{K}$ will be either the field $\mathbb{R}$ of real numbers or the field $\mathbb{C}$ of complex numbers, and $X$ will be called a real or complex space, respectively.

¹⁰ Niels Henrik Abel, Norwegian mathematician, 1802–1829.

A nonempty subset $Y$ of $X$ which is a linear space with respect to the same operations is called a subspace of $X$. In fact, a necessary and sufficient condition for a nonempty subset $Y$ of $X$ to be a subspace is that $Y$ be closed under the operations, i.e.,

$$\forall x, y \in Y,\ \forall \alpha \in \mathbb{K}: \quad x + y \in Y, \quad \alpha x \in Y.$$

If $S$ is a nonempty subset of a linear space $X$, we denote by $\operatorname{Span} S$ the collection of all finite linear combinations of elements of $S$, i.e.,

$$\operatorname{Span} S = \left\{ \sum_{i=1}^{k} \alpha_i x_i = \alpha_1 x_1 + \cdots + \alpha_k x_k;\ k \in \mathbb{N},\ \alpha_i \in \mathbb{K},\ x_i \in S,\ i = 1, \dots, k \right\}.$$

Obviously, $\operatorname{Span} S$ is a linear subspace of $X$, called the linear subspace generated by $S$ (and $S$ is said to be a system of generators).

We recall that $x_1, x_2, \dots, x_k \in X$ (where $X$ is a linear space) are said to be linearly dependent if there exist some scalars $\alpha_1, \dots, \alpha_k \in \mathbb{K}$, not all zero, such that

$$\alpha_1 x_1 + \cdots + \alpha_k x_k = 0.$$

Otherwise, the vectors $x_1, x_2, \dots, x_k$ are called linearly independent (and $\{x_1, x_2, \dots, x_k\}$ is said to be a linearly independent system). In this case, $S = \{x_1, x_2, \dots, x_k\}$ is a basis of the space $Y = \operatorname{Span} S$ (which could be the whole of $X$), and we say that $Y$ has dimension $k$, $\dim Y = k$, and any vector $x \in Y$ can be uniquely expressed as a linear combination,

$$x = \sum_{i=1}^{k} \alpha_i x_i = \alpha_1 x_1 + \cdots + \alpha_k x_k,$$

where $\alpha_1, \dots, \alpha_k \in \mathbb{K}$ are called coordinates of $x$ with respect to the basis $S$. A basis is not unique. A linear space $X$ is infinite dimensional (written as $\dim X = \infty$) if for any $k \in \mathbb{N}$ there exist $k$ vectors in $X$ which are linearly independent. If $X$ contains only the null vector, then by convention we define $\dim X = 0$.

Recall that two linear spaces $X$, $Y$ over the same field $\mathbb{K}$ are said to be isomorphic if there exists a bijection $\phi : X \to Y$ which satisfies

$$\phi(\alpha x + \beta y) = \alpha\phi(x) + \beta\phi(y) \quad \forall \alpha, \beta \in \mathbb{K},\ \forall x, y \in X.$$

If either of the two (isomorphic) spaces is finite dimensional then the other is also finite dimensional and $\dim X = \dim Y$ (prove it!).


Scalar Product. An important concept that allows the extension of some properties of classical Euclidean geometry to general linear spaces is the scalar product. A scalar product (or inner product) on a linear space $X$ is a mapping from $X \times X$ to $\mathbb{K}$, denoted $(\cdot, \cdot)$, which satisfies the following axioms:

($a_1$) $(x, y) = \overline{(y, x)}$ $\forall x, y \in X$;

($a_2$) $(x + y, z) = (x, z) + (y, z)$ $\forall x, y, z \in X$;

($a_3$) $(\alpha x, y) = \alpha(x, y)$ $\forall \alpha \in \mathbb{K}$, $\forall x, y \in X$;

($a_4$) $(x, x) \ge 0$ $\forall x \in X$, and $(x, x) = 0 \iff x = 0$.

We have denoted by $\overline{(y, x)}$ the complex conjugate of $(y, x)$ (obviously, $\overline{(y, x)} = (y, x)$ if $\mathbb{K} = \mathbb{R}$). A space $X$ together with such a product is called an inner product space. It is easily seen that $(x, \alpha y) = \bar{\alpha}(x, y)$ for all $\alpha \in \mathbb{K}$ and all $x, y \in X$. Two vectors $x, y \in X$ are called orthogonal if their scalar product is equal to zero: $(x, y) = 0$ (this is sometimes denoted $x \perp y$). One can also define the length of a vector $x \in X$ as $\|x\| = \sqrt{(x, x)}$. The mapping $x \mapsto \|x\|$ satisfies the following properties:

(i) $\|x\| = 0 \iff x = 0$;

(ii) $\|\alpha x\| = |\alpha| \cdot \|x\|$ $\forall \alpha \in \mathbb{K}$, $\forall x \in X$;

(iii) $\|x + y\| \le \|x\| + \|y\|$ $\forall x, y \in X$.

The mapping $\|\cdot\| : X \to [0, \infty)$ defined above is a norm on $X$, and $X$ is a normed space. In general a mapping from $X$ to $[0, \infty)$ satisfying (i), (ii), (iii) is called a norm on $X$. A given space may have many different norms, but the above is a special norm, being generated by a scalar product. While (i) and (ii) are trivial, property (iii) (called the triangle inequality) follows from the Bunyakovsky–Cauchy–Schwarz¹¹ inequality:

$$|(x, y)| \le \|x\| \cdot \|y\| \quad \forall x, y \in X, \tag{1.5.8}$$

which is valid in any normed space whose norm is generated by a scalar product. Indeed,

¹¹ Viktor Y. Bunyakovsky, Russian mathematician, 1804–1889; Karl Hermann Amandus Schwarz, German mathematician, 1843–1921.

$$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}(x, y) + \|y\|^2 \le \|x\|^2 + 2|(x, y)| + \|y\|^2 \le \|x\|^2 + 2\|x\| \cdot \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

which clearly implies (iii). As far as (1.5.8) is concerned, its proof is based on the inequality

$$0 \le \|x + \alpha y\|^2 = \|x\|^2 + 2\operatorname{Re}\big(\bar{\alpha}(x, y)\big) + |\alpha|^2 \|y\|^2, \tag{1.5.9}$$

for all $\alpha \in \mathbb{K}$ and all $x, y \in X$. In fact, we can assume $x \ne 0$ and $y \ne 0$ (otherwise (1.5.8) is trivial). Now replacing $\alpha = -(x, y)/\|y\|^2$ in (1.5.9) gives $0 \le \|x\|^2 - |(x, y)|^2/\|y\|^2$, which is (1.5.8).

We continue with some examples.

Example 1. For a given $n \in \mathbb{N}$, consider $X = \mathbb{R}^n$, which is the set of all ordered $n$-tuples (here arranged as $n \times 1$ matrices) $x = (x_1, \dots, x_n)^T$, where $x_1, \dots, x_n \in \mathbb{R}$. It is easily seen that $X = \mathbb{R}^n$ is a linear space over $\mathbb{R}$ with respect to the usual operations of addition and scalar multiplication:

$$x + y = (x_1 + y_1, \dots, x_n + y_n)^T, \qquad \alpha x = (\alpha x_1, \dots, \alpha x_n)^T,$$

for all $x = (x_1, \dots, x_n)^T$, $y = (y_1, \dots, y_n)^T \in X$, $\alpha \in \mathbb{R}$. The null (zero) element of $X$ is $(0, 0, \dots, 0)^T$, while the inverse of any $x = (x_1, \dots, x_n)^T \in X$ with respect to the addition is $(-x_1, \dots, -x_n)^T$. The usual scalar product of $X = \mathbb{R}^n$ is defined by

$$(x, y) = \sum_{i=1}^{n} x_i y_i \quad \forall x = (x_1, \dots, x_n)^T,\ y = (y_1, \dots, y_n)^T \in X,$$

and the corresponding norm is

$$\|x\| = \sqrt{(x, x)} = \sqrt{\sum_{i=1}^{n} x_i^2}.$$

If $n = 1$ the above scalar product is the usual multiplication in $\mathbb{R}$, while the corresponding norm is the absolute value. If $n = 2$ or $n = 3$ then the above scalar product is nothing else but the scalar product (dot product) of two vectors in the Euclidean plane or space, respectively, while the corresponding norm of a vector represents its length.
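For a concrete feel, the usual scalar product and norm of $\mathbb{R}^n$ can be evaluated on sample vectors, together with inequality (1.5.8) and the triangle inequality (iii). This is a small sketch with made-up function names and floating-point arithmetic, so equalities are tested only up to rounding.

```python
import math


def inner(x, y):
    """Usual scalar product of R^n: sum of x_i * y_i."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm generated by the scalar product: sqrt((x, x))."""
    return math.sqrt(inner(x, x))


def cauchy_schwarz_holds(x, y, tol=1e-12):
    """|(x, y)| <= ||x|| * ||y||, up to a rounding tolerance."""
    return abs(inner(x, y)) <= norm(x) * norm(y) + tol


def triangle_holds(x, y, tol=1e-12):
    """||x + y|| <= ||x|| + ||y||, up to a rounding tolerance."""
    s = [a + b for a, b in zip(x, y)]
    return norm(s) <= norm(x) + norm(y) + tol


if __name__ == "__main__":
    x, y = (1.0, 2.0, 3.0), (4.0, -5.0, 6.0)
    print(inner(x, y), norm(x) * norm(y))
    print(cauchy_schwarz_holds(x, y), triangle_holds(x, y))
```

Equality in (1.5.8) occurs for proportional vectors, e.g. $y = 2x$, which is a convenient sanity check.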


Orthogonality of two vectors means the usual geometric orthogonality. For this reason $X = \mathbb{R}^n$ so equipped is called Euclidean $n$-space. By extension, a general normed space whose norm is generated by a scalar (inner) product is called a generalized Euclidean space (or inner product space, as previously mentioned).

Analogously, $\mathbb{C}^n$ is a linear space over $\mathbb{C}$ with respect to the usual operations of addition and scalar multiplication. Here, the usual scalar product is defined by

$$(x, y) = \sum_{i=1}^{n} x_i \overline{y_i} \quad \forall x = (x_1, \dots, x_n)^T,\ y = (y_1, \dots, y_n)^T \in \mathbb{C}^n,$$

and the corresponding (Euclidean) norm is

$$\|x\| = \sqrt{(x, x)} = \sqrt{\sum_{i=1}^{n} |x_i|^2}.$$

Note that any $n$-dimensional linear space $X$ over $\mathbb{K}$ is isomorphic to $\mathbb{K}^n$. Indeed, an isomorphism $\phi : X \to \mathbb{K}^n$ is the mapping which associates with any $x \in X$ the vector constructed with the coordinates of $x$ with respect to a basis in $X$. Thus any such space $X$ can be equipped with a scalar product as follows:

$$(x, y)_X := (\phi(x), \phi(y)) = \sum_{i=1}^{n} \phi(x)_i \cdot \overline{\phi(y)_i} \quad \forall x, y \in X.$$

Therefore, any finite dimensional linear space $X$ can be organized as a generalized Euclidean (inner product) space.

Example 2. Let $X$ be the set of all functions from $[0, 1]$ to $\mathbb{K}$. Obviously, $X$ is a linear space with respect to the usual operations of addition and scalar multiplication

$$(f + g)(t) = f(t) + g(t), \quad (\alpha f)(t) = \alpha f(t) \quad \forall t \in [0, 1],\ \forall f, g \in X,\ \forall \alpha \in \mathbb{K}.$$

Consider the set $Y$ of all polynomial functions $f : [0, 1] \to \mathbb{K}$ (i.e., $f(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_k t^k$, $a_0, \dots, a_k \in \mathbb{K}$, $k \in \{0\} \cup \mathbb{N}$). Obviously $Y$ is a (proper) subspace of $X$. Note that $Y$ is infinite dimensional ($\dim Y = \infty$) and hence so is $X$. Indeed, for any $k \in \mathbb{N}$ the set of polynomials $\{1, t, t^2, \dots, t^k\} \subset Y$ is an independent system. We can define on $Y$ the scalar product

$$(f, g) = \int_0^1 f(t) \cdot \overline{g(t)}\, dt \quad \forall f, g \in Y,$$

and the corresponding norm $\|f\| = \sqrt{(f, f)} = \left(\int_0^1 |f(t)|^2\, dt\right)^{1/2}$. Another norm on $Y$ is the following:

$$\|f\|_* = \sup_{t \in [0, 1]} |f(t)| \quad \forall f \in Y,$$

but this one is not generated by a scalar product. Indeed, if we assume that $\|\cdot\|_*$ is generated by a scalar product, then it must satisfy the parallelogram law

$$\|f + g\|_*^2 + \|f - g\|_*^2 = 2\left(\|f\|_*^2 + \|g\|_*^2\right) \quad \forall f, g \in Y, \tag{1.5.10}$$

which is valid in any inner product space. But, for example, the polynomial functions $f(t) = t$, $g(t) = 1 - t$ do not satisfy (1.5.10), which confirms our assertion above.

Now, let $Z$ be the set of polynomials $f \in Y$ of degree less than or equal to $n - 1$, for a given natural number $n$. This is a finite dimensional subspace of $Y$, with basis $\{1, t, t^2, \dots, t^{n-1}\}$ and dimension $n$. Therefore, $Z$ is isomorphic to $\mathbb{K}^n$. A natural isomorphism between $Z$ and $\mathbb{K}^n$ is the mapping which associates with any polynomial function $f(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_{n-1} t^{n-1}$ the $n$-dimensional vector $(a_0, a_1, a_2, \dots, a_{n-1})^T \in \mathbb{K}^n$. Thus, besides the scalar product above, one can define on $Z$ another scalar product

$$(f, g)_Z = \sum_{i=0}^{n-1} a_i \cdot \overline{b_i},$$

for all $f(t) = a_0 + a_1 t + \cdots + a_{n-1} t^{n-1}$, $g(t) = b_0 + b_1 t + \cdots + b_{n-1} t^{n-1}$, where $a_i, b_i \in \mathbb{K}$, $i \in \{0, 1, \dots, n - 1\}$. This scalar product generates a new norm on $Z$,

$$\|f\|_Z = \sqrt{(f, f)_Z} = \left(\sum_{i=0}^{n-1} |a_i|^2\right)^{1/2} \quad \forall f(t) = a_0 + a_1 t + \cdots + a_{n-1} t^{n-1} \in Z.$$
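The failure of the parallelogram law (1.5.10) for the sup norm, with $f(t) = t$ and $g(t) = 1 - t$, can be verified directly; the integral norm, by contrast, satisfies it. In this hedged sketch, real polynomials are stored as coefficient lists $[a_0, a_1, \dots]$ (an assumed representation), the integral norm is computed exactly from $\int_0^1 t^{i+j}\,dt = 1/(i+j+1)$, and the sup norm is sampled on a grid, which is exact here only because the maxima of the polynomials involved are attained at the endpoints.

```python
from fractions import Fraction


def combine(f, g, sign=1):
    """Coefficients of f + sign*g for real polynomials f, g."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a + sign * b for a, b in zip(f, g)]


def l2_norm_sq(f):
    """||f||^2 = integral over [0,1] of f(t)^2, computed exactly."""
    return sum(Fraction(a * b, i + j + 1)
               for i, a in enumerate(f) for j, b in enumerate(f))


def sup_norm_sq(f, grid=1000):
    """Square of max |f(t)| over grid points k/grid (an approximation
    of the sup norm in general; exact for the example below)."""
    values = (abs(sum(Fraction(a) * Fraction(k, grid) ** i
                      for i, a in enumerate(f)))
              for k in range(grid + 1))
    return max(values) ** 2


def parallelogram_sides(norm_sq, f, g):
    """Return (left-hand side, right-hand side) of (1.5.10)."""
    lhs = norm_sq(combine(f, g)) + norm_sq(combine(f, g, -1))
    rhs = 2 * (norm_sq(f) + norm_sq(g))
    return lhs, rhs


if __name__ == "__main__":
    f, g = [0, 1], [1, -1]                 # f(t) = t, g(t) = 1 - t
    print(parallelogram_sides(sup_norm_sq, f, g))
    print(parallelogram_sides(l2_norm_sq, f, g))
```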


Orthogonal Systems. Let $S \subset X \setminus \{0\}$ be a nonempty countable set, where $X$ is an inner product space with a scalar product $(\cdot, \cdot)$ and the norm $\|\cdot\|$ generated by it. We further assume that $S$ is a linearly independent system (otherwise we eliminate those vectors which are linear combinations of other vectors from $S$). Recall that an infinite set is linearly independent if all finite subsets of it are independent. Consider first the case when $S$ is a finite independent system, $S = \{u_1, \dots, u_n\}$. Starting from $S$ one can construct an orthogonal system $\tilde{S} = \{v_1, \dots, v_n\}$, i.e., $(v_i, v_i) \ne 0$ and $(v_i, v_j) = 0$ if $i \ne j$. In what follows we present the Gram–Schmidt¹² orthogonalization method. To create $\tilde{S}$ let

$$v_1 = u_1, \quad v_2 = u_2 + \alpha v_1,$$

such that $v_2 \perp v_1$, i.e.,

$$0 = (v_2, v_1) = (u_2 + \alpha v_1, v_1) = (u_2, v_1) + \alpha \|v_1\|^2,$$

giving

$$\alpha = -\frac{(u_2, v_1)}{\|v_1\|^2}.$$

Note that $v_1 = u_1 \ne 0$ (by assumption) and also $v_2 \ne 0$. To see that, we suppose by contradiction that $v_2 = 0$, i.e., $u_2 + \alpha u_1 = 0$. But this is impossible since $u_1, u_2$ are independent vectors. After having determined the first $p$ members of $\tilde{S}$ define

$$v_{p+1} = u_{p+1} + \sum_{j=1}^{p} \beta_j v_j,$$

so that $v_{p+1} \perp v_k$ for all $k = 1, \dots, p$, that is,

$$0 = (u_{p+1}, v_k) + \beta_k \|v_k\|^2, \qquad \beta_k = -\frac{(u_{p+1}, v_k)}{\|v_k\|^2}, \quad k = 1, \dots, p.$$

¹² Jørgen Pedersen Gram, Danish actuary and mathematician, 1850–1916; Erhard Schmidt, Baltic German mathematician, 1876–1959.

As before, $v_{p+1} \ne 0$, since each $v_k$, $k = 1, \dots, p$, is a linear combination of $u_1, \dots, u_k$, so that

$$v_{p+1} = u_{p+1} + \sum_{k=1}^{p} \theta_k u_k$$

for some scalars $\theta_k$, and $u_1, u_2, \dots, u_p, u_{p+1}$ are independent vectors. Continue the process until finished. Since $\tilde{S}$ is an orthogonal system, it follows that it is independent (prove it!). $\tilde{S}$ can be simply replaced by an orthonormal (independent) system $\hat{S} = \{w_1, \dots, w_n\}$, by defining

$$w_j = \|v_j\|^{-1} v_j, \quad j = 1, \dots, n.$$

In particular, any $n$-dimensional inner product space possesses an orthonormal basis (since any basis can be replaced by an orthonormal one).

If $S$ is a countably infinite, independent system in $X$, $S = \{u_1, u_2, \dots, u_n, \dots\}$, then using the same Gram–Schmidt method, one can construct an orthonormal system $\hat{S} = \{w_1, w_2, \dots, w_n, \dots\}$, i.e., $(w_i, w_j) = \delta_{ij}$, where $\delta_{ij}$ is the Kronecker¹³ symbol, $\delta_{ii} = 1$ and $\delta_{ij} = 0$ for $i \ne j$.

Linear, Bilinear, Sesquilinear, Quadratic Forms. Let $X$ be a linear space over $\mathbb{K}$. A function $f : X \to \mathbb{K}$ is said to be a linear form on $X$ if

$$f(\alpha x + \beta y) = \alpha f(x) + \beta f(y) \quad \forall \alpha, \beta \in \mathbb{K},\ \forall x, y \in X.$$

The set of all linear forms on $X$, denoted $X^*$, is a linear space with respect to the usual operations on functions and is called the dual of $X$. If $X$ is finite dimensional, with a basis $B = \{u_1, \dots, u_n\}$, $n \in \mathbb{N}$, then any linear form $f$ has a specific form:

$$f(x) = \sum_{i=1}^{n} a_i \alpha_i \quad \forall x = \sum_{i=1}^{n} \alpha_i u_i \in X,$$

where $a_i = f(u_i)$ are called coefficients of the linear form with respect to the basis $B$. $X^*$ is isomorphic to $\mathbb{K}^n$ (hence $\dim X^* = n$), since the mapping associating each $f \in X^*$ to the vector $(f(u_1), \dots, f(u_n))^T \in \mathbb{K}^n$ is an isomorphism (prove it!).

A function $a : X \times X \to \mathbb{K}$ which is linear with respect to each variable is called a bilinear form on $X$ (more precisely, $a(\cdot, y)$ is a linear form for all $y \in X$, and $a(x, \cdot)$ is also a linear form for all $x \in X$).

¹³ Leopold Kronecker, German mathematician, 1823–1891.
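The Gram–Schmidt recursion described earlier, $v_{p+1} = u_{p+1} + \sum_j \beta_j v_j$ with $\beta_k = -(u_{p+1}, v_k)/\|v_k\|^2$, translates almost line by line into code. The sketch below works in $\mathbb{R}^n$ with the usual scalar product and exact rational arithmetic for the orthogonal system; the function names and the sample vectors are assumptions made for illustration, and the normalization step falls back to floating point because of the square root.

```python
import math
from fractions import Fraction


def inner(x, y):
    """Usual scalar product of R^n."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(us):
    """Orthogonal system v_1, ..., v_n built from independent u_1, ..., u_n."""
    vs = []
    for u in us:
        v = [Fraction(c) for c in u]
        for w in vs:
            beta = -inner(u, w) / inner(w, w)   # beta_k = -(u, v_k)/||v_k||^2
            v = [vi + beta * wi for vi, wi in zip(v, w)]
        if all(c == 0 for c in v):
            raise ValueError("input vectors are linearly dependent")
        vs.append(v)
    return vs


def normalize(vs):
    """Orthonormal system w_j = v_j / ||v_j|| (floating point)."""
    return [[float(c) / math.sqrt(inner(v, v)) for c in v] for v in vs]


if __name__ == "__main__":
    vs = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
    print(vs)
    print(normalize(vs))
```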


If $\mathbb{K} = \mathbb{C}$ and the above condition on $a(x, \cdot)$ is replaced by the linearity of the complex conjugate function $\overline{a(x, \cdot)}$ ($x \in X$), then $a$ is said to be a sesquilinear form on $X$. For example, a scalar product on $X$ is a sesquilinear form.

If $X$ is finite dimensional, with a basis $B = \{u_1, \dots, u_n\}$, and $a$ is a bilinear form on $X$, then for all $x = \sum_{i=1}^{n} \alpha_i u_i$, $y = \sum_{j=1}^{n} \beta_j u_j \in X$ we have

$$a(x, y) = \sum_{i,j=1}^{n} c_{ij} \alpha_i \beta_j, \tag{1.5.11}$$

where $c_{ij} = a(u_i, u_j)$, $i, j = 1, \dots, n$. Hence $a$ is represented by the matrix $C = (c_{ij})$ (which depends on the basis of $X$). If $a$ is a sesquilinear form, then instead of (1.5.11) we have

$$a(x, y) = \sum_{i,j=1}^{n} c_{ij} \alpha_i \overline{\beta_j}.$$

A bilinear form $a : X \times X \to \mathbb{R}$ is said to be symmetric if $a(x, y) = a(y, x)$ for all $x, y \in X$. If $X$ is finite dimensional, then the symmetry of a bilinear form $a$ is expressed by the symmetry of the matrix associated with $a$ (the symmetry of that matrix being independent of the basis of space $X$). Any symmetric bilinear form $a$ defines a quadratic form $F : X \to \mathbb{R}$ by setting $F(x) = a(x, x)$. Given a quadratic form $F$, one can recover the corresponding bilinear form $a$. Indeed, from

$$a(x + y, x + y) = a(x, x) + 2a(x, y) + a(y, y),$$

we deduce

$$a(x, y) = \frac{1}{2}\left[a(x + y, x + y) - a(x, x) - a(y, y)\right] = \frac{1}{2}\left[F(x + y) - F(x) - F(y)\right].$$

A quadratic form $F$ is said to be positive definite (positive semidefinite) if $F(x) > 0$ for all $x \in X$, $x \ne 0$ ($F(x) \ge 0$ for all $x \in X$, respectively). $F$ is called negative definite (negative semidefinite) if $-F$ is positive definite (positive semidefinite, respectively). If $F$ is a positive definite quadratic form on the real linear space $X$ then the corresponding $a$ is a scalar product on $X$.
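The recovery of a symmetric bilinear form from its quadratic form can be illustrated in coordinates, using formula (1.5.11) with a symmetric matrix $C$. The following sketch is an assumption-laden toy: the names `bilinear`, `quadratic`, `polarize` and the sample matrix are hypothetical, and the space is $\mathbb{R}^n$ with its usual basis.

```python
from fractions import Fraction


def bilinear(C, x, y):
    """a(x, y) = sum over i, j of c_ij * x_i * y_j, as in (1.5.11)."""
    n = len(C)
    return sum(C[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def quadratic(C, x):
    """F(x) = a(x, x)."""
    return bilinear(C, x, x)


def polarize(F, x, y):
    """a(x, y) = (F(x + y) - F(x) - F(y)) / 2 for a symmetric form a."""
    s = [a + b for a, b in zip(x, y)]
    return Fraction(F(s) - F(x) - F(y), 2)


if __name__ == "__main__":
    C = [[2, 1], [1, 3]]                      # symmetric sample matrix
    F = lambda v: quadratic(C, v)
    x, y = (1, 2), (3, -1)
    print(bilinear(C, x, y), polarize(F, x, y))   # the two values agree
```

The agreement relies on the symmetry of $C$; for a non-symmetric matrix the polarization formula returns only the symmetric part of the form.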


If $X$ is a real $n$-dimensional linear space, with a basis $B = \{u_1, \dots, u_n\}$, and $F$ is a quadratic form on $X$, then

$$F(x) = a(x, x) = \sum_{i,j=1}^{n} c_{ij} \alpha_i \alpha_j,$$

where the $\alpha_j$'s are the coordinates of $x$ with respect to the basis $B$ (in particular, the components of $x$ if $X = \mathbb{R}^n$ with its usual basis). It can be shown (using the well-known Gauss¹⁴ method) that for any such quadratic form $F$, there is a convenient basis of $X$ such that $F$ can be written as follows:

$$F(x) = \lambda_1 \beta_1^2 + \lambda_2 \beta_2^2 + \cdots + \lambda_n \beta_n^2,$$

where $\beta_1, \dots, \beta_n$ are the coordinates of $x$ with respect to the new basis, and $\lambda_1, \dots, \lambda_n \in \mathbb{R}$ (some of these $\lambda$'s could be zero). In fact, starting from the new basis, one can simply define another basis, such that $F$ can be written under the following canonical form:

$$F(x) = \sum_{i=1}^{n} \varepsilon_i \gamma_i^2, \qquad \varepsilon_j \in \{-1, 0, +1\},\ j = 1, \dots, n,$$

where the $\gamma_i$'s are the coordinates of $x$ with respect to the last basis. Obviously, $F$ is positive definite (positive semidefinite) if and only if $\varepsilon_j = 1$, $j = 1, \dots, n$ ($\varepsilon_j \in \{0, +1\}$, $j = 1, \dots, n$, respectively).

Let us also recall that for a quadratic form $F : X \to \mathbb{R}$ ($X$ being an $n$-dimensional real linear space), whose matrix $C$ with respect to a basis of $X$ has nonzero NW principal minors (i.e., $\Delta_i \ne 0$, $i = 1, \dots, n$), there always exists a decomposition (called Jacobi's formula¹⁵) as follows:

$$F(x) = \sum_{i=1}^{n} \frac{\Delta_{i-1}}{\Delta_i} \beta_i^2,$$

where $\Delta_0 := 1$, and $\beta_1, \dots, \beta_n$ are the coordinates of $x$ with respect to a new basis of $X$. Therefore, $F$ is positive definite (negative definite) if and only if $\Delta_i > 0$, $i = 1, \dots, n$ (respectively, $(-1)^i \Delta_i > 0$, $i = 1, \dots, n$). These are known as Sylvester's conditions.¹⁶

¹⁴ Carl Friedrich Gauss, German mathematician and physicist, 1777–1855.
¹⁵ Carl Gustav Jacob Jacobi, German mathematician, 1804–1851.
¹⁶ James Joseph Sylvester, English mathematician, 1814–1897.
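Sylvester's conditions are straightforward to test on a given symmetric matrix by computing its NW (leading) principal minors. The sketch below is one possible way to do this, with exact rational Gaussian elimination for the determinants; the helper names and the sample matrix are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(c) for c in row] for row in M]
    n, d = len(A), Fraction(1)
    for k in range(n):
        pivot = next((r for r in range(k, n) if A[r][k] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != k:
            A[k], A[pivot] = A[pivot], A[k]
            d = -d                              # a row swap flips the sign
        d *= A[k][k]
        for r in range(k + 1, n):
            factor = A[r][k] / A[k][k]
            A[r] = [a - factor * b for a, b in zip(A[r], A[k])]
    return d


def leading_minors(C):
    """Delta_1, ..., Delta_n: determinants of the NW i x i blocks."""
    return [det([row[:i] for row in C[:i]]) for i in range(1, len(C) + 1)]


def is_positive_definite(C):
    return all(m > 0 for m in leading_minors(C))


def is_negative_definite(C):
    return all((-1) ** i * m > 0
               for i, m in enumerate(leading_minors(C), start=1))


def jacobi_coefficients(C):
    """Delta_{i-1} / Delta_i with Delta_0 = 1 (all minors must be nonzero)."""
    minors = [Fraction(1)] + leading_minors(C)
    return [minors[i - 1] / minors[i] for i in range(1, len(minors))]


if __name__ == "__main__":
    C = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]   # sample symmetric matrix
    print(leading_minors(C), is_positive_definite(C))
    print(jacobi_coefficients(C))
```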


If $X$ is a complex linear space and $a : X \times X \to \mathbb{C}$ is a sesquilinear form, then $a$ is called Hermitian¹⁷ if $a(x, y) = \overline{a(y, x)}$ for all $x, y \in X$. Such a form $a$ defines a quadratic form $F(x) = a(x, x)$, $x \in X$, with values in $\mathbb{R}$. If $X$ is an $n$-dimensional complex linear space, then one can find a basis in $X$ such that $F$ takes the form

$$F(x) = \sum_{i=1}^{n} \lambda_i \beta_i \overline{\beta_i} = \sum_{i=1}^{n} \lambda_i |\beta_i|^2,$$

where $\lambda_i \in \mathbb{R}$, $i = 1, \dots, n$, and $\beta_i$, $i = 1, \dots, n$, are the coordinates of $x$ with respect to that basis. The Jacobi formula also works in this complex case, and Sylvester's conditions remain valid.

We close this chapter by inviting the reader to consult other books to find more information on the topics addressed in this chapter, such as [6, 16, 28, 33, 37, 41, 42, 51].

1.6

Exercises

1. Let $A, B, C$ be some arbitrary subsets of a universe $U$. Show that
   (a) $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C) = (A \setminus B) \setminus C$;
   (b) $A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$;
   (c) $(A \cap B) \setminus C = A \cap (B \setminus C) = (A \setminus C) \cap B$;
   (d) $(A \cup B) \setminus C = (A \setminus C) \cup (B \setminus C)$.

2. Let $A, B, C$ be given sets, which are subsets of a universe $U$. Determine the set $X \subset U$ satisfying $A \cap X = B$ and $A \cup X = C$. The same question if $X$ satisfies $A \setminus X = B$ and $X \setminus A = C$.

3. Prove that for all sets $A, B, C$ satisfying $A \cap C = B \cap C$ and $A \cup C = B \cup C$, we have $A = B$.

¹⁷ Charles Hermite, French mathematician, 1822–1901.


4. Let $A, B, C, D$ be some arbitrary subsets of a universe $U$. Which of the following statements are true (where "$\times$" denotes the Cartesian product)?
   (a) $(A \cap B) \times (C \cap D) = (A \times C) \cap (B \times D)$;
   (b) $(A \cup B) \times (C \cup D) = (A \times C) \cup (B \times D)$;
   (c) $(A \setminus B) \times C = (A \times C) \setminus (B \times C)$.

5. Let $A$ be a set with a partial order $\le$. Show that if $A$ has a smallest element $a = \min A$, then $a$ is the unique minimal element of $A$.

6. Let $A = \{a_n;\ a_n = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n+1)},\ n \in \mathbb{N}\}$. Find $\inf A$ and $\sup A$.

7. Define on $\mathbb{C}$ (the set of complex numbers) the binary relation $\preceq$ as follows:
   $$z_1 = x_1 + y_1 i \preceq z_2 = x_2 + y_2 i \iff x_1 \le x_2 \ \text{and}\ y_1 \le y_2.$$
   Show that
   (a) $\preceq$ is a partial order on $\mathbb{C}$, but not a total order on $\mathbb{C}$;
   (b) for each $a \ge 0$, $\preceq$ is a total order on $X_a = \{z = x + yi \in \mathbb{C};\ y = ax\}$ (i.e., $X_a$ is a chain);
   (c) there exists a partial order on $\mathbb{C}$ such that, for each $a < 0$, $X_a$ defined as above is a chain of $\mathbb{C}$ with respect to this partial order.

8. Show that the sequence $(a_n)_{n \ge 1}$ defined by
   $$a_1 = \sqrt{2}, \quad a_n = \sqrt{2 + a_{n-1}}, \quad n \ge 2,$$
   is convergent and calculate its limit.

9. Let $a$ be a given real number and let $(a_n)_{n \ge 1}$ be a sequence in $\mathbb{R}$ such that any subsequence of it has a convergent subsequence whose limit is $a$. Show that $a_n \to a$.

10. Let $X$ be a vector space. If $\{v_1, v_2, v_3\} \subset X$ is a linearly independent system, then show that $\{v_1 + v_2, v_2 + v_3, v_3 + v_1\}$ is too.


11. Let $X$ be the real vector space of all functions $f : \mathbb{R} \to \mathbb{R}$. Show that each of the following systems of functions in $X$,
    $$S_1 = \{1, \cos x, (\cos x)^2\}, \qquad S_2 = \{e^x, xe^x, \dots, x^k e^x\},\ k \in \mathbb{N},$$
    is linearly independent.

12. Let $X$ be the real vector space of all continuous functions $f : [0, 1] \to \mathbb{R}$. Consider on $X$ the scalar product
    $$(f, g) = \int_0^1 f(t) \cdot g(t)\, dt \quad \forall f, g \in X,$$
    and the induced norm.
    (a) Which of the following systems of functions in $X$ is linearly independent?
        (i) $f_1(t) = 1$, $f_2(t) = t$, $f_3(t) = t^2$;
        (ii) $f_1(t) = 1 - t$, $f_2(t) = t(1 - t)$, $f_3(t) = 1 - t^2$;
        (iii) $f_1(t) = 1$, $f_2(t) = e^t$, $f_3(t) = 2e^{-t}$;
        (iv) $f_1(t) = 3t$, $f_2(t) = t + 5$, $f_3(t) = -2t^2$, $f_4(t) = (t + 1)^2$;
        (v) $f_1(t) = (t + 1)^2$, $f_2(t) = t^2 - 1$, $f_3(t) = 2t^2 + 2t - 3$;
        (vi) $f_1(t) = 1$, $f_2(t) = 1 + t$, $f_3(t) = 1 + t + t^2$, $\dots$, $f_k(t) = 1 + t + t^2 + \cdots + t^{k-1}$, where $k$ is a given natural number.
    (b) Let $Y$ be the vector subspace of $X$ generated by $B = \{f_1, f_2, f_3\}$, where $f_1(t) = 1$, $f_2(t) = t$, $f_3(t) = t^2$ for $t \in [0, 1]$. By using the Gram–Schmidt method, construct an orthonormal basis in $Y$ with respect to the above scalar product.

13. Show that the system $B = \{1, t - 1, (t - 1)^2, (t - 1)^3\}$ is a basis of the real vector space $X$ of all polynomials of degree $\le 3$ with real coefficients, and find the coordinates of a polynomial $p = p(t) \in X$ with respect to this basis.

14. Let $X$ be a linear space equipped with a scalar product $(\cdot, \cdot)$. Show that a system $S = \{x_1, x_2, \dots, x_k\} \subset X$ is linearly independent if and only if the following determinant (called the Gram determinant) is nonzero:
    $$\det\big((x_i, x_j)\big)_{1 \le i, j \le k} \ne 0.$$


15. Show that the mapping $(\cdot, \cdot) : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ defined by
    $$(x, y) = \frac{1}{2} \sum_{i=1}^{3} x_i y_i - \frac{1}{4}(x_1 y_2 + x_1 y_3 + x_2 y_3)$$
    is a scalar product. Construct a basis of $\mathbb{R}^3$ which is orthonormal with respect to this scalar product.

16. Let $X$ be the real vector space of polynomials of degree $\le m$ with real coefficients, where $m$ is a given natural number. Find the expression of the linear form $f : X \to \mathbb{R}$ defined by
    $$f(p) = \int_0^1 p(t)\, dt \quad \forall p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_m t^m \in X$$
    with respect to each of the bases
    $$B = \{1, t, t^2, \dots, t^m\}, \qquad B' = \{1, 1 + t, 1 + t + t^2, \dots, 1 + t + t^2 + \cdots + t^m\}.$$

17. Let $X$ be a vector space over $\mathbb{R}$. A bilinear form $a : X \times X \to \mathbb{R}$ is said to be antisymmetric if $a(x, y) = -a(y, x)$ $\forall x, y \in X$. Show that
    (i) a bilinear form $a : X \times X \to \mathbb{R}$ is antisymmetric $\iff$ $a(x, x) = 0$ $\forall x \in X$;
    (ii) any bilinear form on $X$ is the sum of a symmetric bilinear form and an antisymmetric one.

18. Let $A$ be an $n \times n$ matrix with real entries, and let $B = aI_n + A^T A$, where $A^T$ denotes the transpose of $A$, $I_n$ denotes the $n \times n$ identity matrix, and $a > 0$. Show that the quadratic form $F : \mathbb{R}^n \to \mathbb{R}$ whose matrix with respect to the canonical basis of $\mathbb{R}^n$ is $B$ is positive definite. What about the case $a = 0$?

19. Consider the quadratic form $F : \mathbb{R}^3 \to \mathbb{R}$,
    $$F(x) = x_1^2 + x_2^2 + 3x_3^2 + 4x_1 x_2 + 2x_1 x_3 + 2x_2 x_3 \quad \forall x \in \mathbb{R}^3.$$
    Determine a basis of $\mathbb{R}^3$ such that $F(x) = \sum_{i=1}^{n} \varepsilon_i \xi_i^2$ (here $n = 3$), where $\xi_1, \dots, \xi_n$ are the coordinates of $x$ with respect to this basis, and $\varepsilon_j \in \{-1, 0, +1\}$, $j = 1, \dots, n$. Check whether $F$ is positive definite, negative definite, or neither.


20. Use Sylvester's conditions to show that the quadratic form $F : \mathbb{R}^4 \to \mathbb{R}$,
$$F(x) = 2x_1^2 + 2x_2^2 + x_3^2 + 4x_4^2 - x_1 x_2 + x_1 x_3 + x_2 x_4 - x_3 x_4 \quad \forall x \in \mathbb{R}^4,$$
is positive definite. Determine a basis of $\mathbb{R}^4$ such that $F$ can be written as a sum of squares with respect to this basis.
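Sylvester's conditions in Exercise 20 require every leading principal minor of the symmetric matrix of $F$ to be positive. The following Python sketch is only a numerical sanity check, not a substitute for the requested proof. It assumes the usual convention that the off-diagonal entries of the matrix are half the coefficients of the mixed terms $x_i x_j$; the helper name `leading_principal_minors` is hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def leading_principal_minors(a):
    """Return the determinants of the top-left 1x1, 2x2, ..., nxn blocks of a."""
    n = len(a)
    minors = []
    for k in range(1, n + 1):
        # Copy the k x k block as exact fractions.
        m = [[Fraction(a[i][j]) for j in range(k)] for i in range(k)]
        det = Fraction(1)
        for col in range(k):
            # Find a nonzero pivot; a row swap flips the sign of the determinant.
            pivot = next((r for r in range(col, k) if m[r][col] != 0), None)
            if pivot is None:
                det = Fraction(0)
                break
            if pivot != col:
                m[col], m[pivot] = m[pivot], m[col]
                det = -det
            det *= m[col][col]
            # Eliminate the entries below the pivot.
            for r in range(col + 1, k):
                factor = m[r][col] / m[col][col]
                for c in range(col, k):
                    m[r][c] -= factor * m[col][c]
        minors.append(det)
    return minors

h = Fraction(1, 2)
# Assumed matrix of F: diagonal = coefficients of x_i^2,
# off-diagonal = half the coefficients of x_i x_j.
A = [
    [2, -h, h, 0],
    [-h, 2, 0, h],
    [h, 0, 1, -h],
    [0, h, -h, 4],
]
minors = leading_principal_minors(A)
print(minors)
print(all(m > 0 for m in minors))
```

Positivity of all four printed minors is exactly what Sylvester's conditions ask for; the exercise still requires computing them by hand and finding the diagonalizing basis.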

Chapter 2

Metric Spaces

Metric spaces offer a sufficiently large framework for most of the problems we discuss in this book.

2.1 Definitions

Definition 2.1. A metric (or a distance function) on a nonempty set $X$ is a function $d : X \times X \to [0, \infty)$ satisfying

(M1) $d(x, y) = 0 \iff x = y$;

(M2) $d(x, y) = d(y, x)$ $\forall x, y \in X$;

(M3) $d(x, y) \le d(x, z) + d(z, y)$ $\forall x, y, z \in X$.
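As an illustration of Definition 2.1, the axioms (M1)–(M3) can be tested on a finite sample of points. The sketch below is a hedged example: the helper `violated_axioms` and the sample are hypothetical, and passing the test on a finite sample does not prove that a function is a metric, whereas a single violation does show that it is not one. It compares the usual distance $|x - y|$ on $\mathbb{R}$ with the squared distance $(x - y)^2$, which is symmetric and vanishes only on the diagonal but fails the triangle inequality.

```python
from itertools import product

def violated_axioms(d, points, tol=1e-12):
    """Return the set of axioms among M1, M2, M3 that d violates on the sample."""
    bad = set()
    for x, y in product(points, repeat=2):
        # (M1): d(x, y) = 0 exactly when x = y.
        if (d(x, y) <= tol) != (x == y):
            bad.add("M1")
        # (M2): symmetry.
        if abs(d(x, y) - d(y, x)) > tol:
            bad.add("M2")
    for x, y, z in product(points, repeat=3):
        # (M3): triangle inequality.
        if d(x, y) > d(x, z) + d(z, y) + tol:
            bad.add("M3")
    return bad

sample = [-2.0, -0.5, 0.0, 1.0, 3.0]

def usual(x, y):
    return abs(x - y)

def squared(x, y):
    return (x - y) ** 2

print(violated_axioms(usual, sample))    # no violation found on this sample
print(violated_axioms(squared, sample))  # the triangle inequality fails
```

For instance, with $x = -2$, $y = 3$, $z = 0$ the squared distance gives $25 > 4 + 9$, so (M3) is violated.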

A set $X$ equipped with a metric $d$ is called a metric space and is sometimes denoted $(X, d)$. Any set $X \ne \emptyset$ can be equipped with a metric. The "simplest" metric is the so-called discrete metric, which is defined by
$$d(x, y) = 1 \ \text{ if } x, y \in X,\ x \ne y, \quad \text{and} \quad d(x, x) = 0 \ \ \forall x \in X.$$
Note that this metric is not very useful in practice, but is suitable for counterexamples.

Now let $(X, \|\cdot\|)$ be a normed (linear) space. Then $X$ can be equipped with the metric
$$d(x, y) = \|x - y\|, \quad x, y \in X. \tag{2.1.1}$$

© Springer Nature Switzerland AG 2019. G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_2


Note also that any finite dimensional linear space can be equipped with a norm (e.g., with the Euclidean norm, see the previous chapter), and hence with the metric generated by that norm (cf. (2.1.1)). If $(X, d)$ is a metric space and $\emptyset \ne Y \subset X$, then $Y$ is also a metric space with respect to $d$ restricted to $Y \times Y$.

Definition 2.2. Let $(X, d)$ be a metric space. For $x_0 \in X$ and $r > 0$ define
$$B(x_0, r) := \{x \in X;\ d(x, x_0) < r\},$$
which is called the open ball centered at $x_0$ with radius $r$.

Definition 2.3. A nonempty set $A \subset (X, d)$ is said to be open if for each $x \in A$ there exists an $\varepsilon > 0$ such that $B(x, \varepsilon) \subset A$. By convention the empty set is considered open.

Obviously, the collection $\tau$ of all open sets forms a topology:

(a) $\emptyset, X \in \tau$;

(b) the union of any sub-collection of $\tau$ is in $\tau$;

(c) the intersection of any finite sub-collection of $\tau$ is in $\tau$.

Note that the intersection of an infinite collection of open sets may not be open. For example, in $X = \mathbb{R}$, with $d(x, y) = |x - y|$, we have for a fixed $x_0 \in \mathbb{R}$
$$\bigcap_{n=1}^{\infty} \Big(x_0 - \frac{1}{n},\ x_0 + \frac{1}{n}\Big) = \{x_0\},$$
and obviously $\{x_0\}$ does not belong to the (usual) topology of $\mathbb{R}$ defined by $|\cdot|$.

In what follows $X$ denotes a metric space endowed with the topology $\tau$ generated by its metric $d$ (see above), called the metric topology. If $d$ is defined by a norm, i.e., $d(x, y) = \|x - y\|$ ($x, y \in X$), then $\tau$ is called a norm topology.

A set $V \subset X$ is said to be a neighborhood of a point $p \in X$ if there is an $r > 0$ such that $B(p, r) \subset V$. In particular, any open set $D$ is a neighborhood of any $p \in D$.


A set $C \subset X$ is said to be closed (with respect to the topology $\tau$) if $X \setminus C$ is open (i.e., $X \setminus C \in \tau$). In particular, for any $x_0 \in X$ and $r > 0$, we have $B(x_0, r) \in \tau$, and the closed ball
$$\overline{B}(x_0, r) := \{x \in X;\ d(x, x_0) \le r\}$$
is closed, i.e., $X \setminus \overline{B}(x_0, r) \in \tau$ (prove these assertions!).

A subset $A$ of a metric space $(X, d)$ is said to be bounded (with respect to $d$) if it is contained in a closed ball (equivalently, in an open ball). Otherwise, $A$ is called unbounded (with respect to $d$). For example, $\mathbb{N} \subset \mathbb{R}$ is bounded with respect to the discrete metric on $\mathbb{R}$, but is unbounded with respect to the usual metric (generated by the absolute value function $|\cdot|$).

A sequence $(a_n)_{n \in \mathbb{N}}$ in $(X, d)$ is said to be convergent if there exists $a \in X$ such that $d(a_n, a) \to 0$. This is denoted $a_n \to a$, or $\lim_{n \to \infty} a_n = a$, or $\lim a_n = a$, and we say that $(a_n)$ converges to $a$, or that $a$ is the limit of $(a_n)$. It is easily seen that the limit is unique.

Let $S$ be a nonempty subset of a metric space $(X, d)$. $S$ is closed if and only if the limit of any convergent sequence of points in $S$ is also a point of $S$ (prove it!).

A point $p \in (X, d)$ is called an accumulation point (or limit point) of a set $S \subset X$ if $(V \cap S) \setminus \{p\} \ne \emptyset$ for every neighborhood $V$ of $p$. Note that $p$ is not necessarily an element of $S$. If $q \in S$ and $q$ is not an accumulation point of $S$, then $q$ is an isolated point of $S$. Obviously, $p$ is an accumulation point of $S$ if and only if there exists a sequence $(p_n)$ in $S \setminus \{p\}$ such that $p_n \to p$. By the above assertion, $S$ is closed if and only if $S$ contains all its accumulation points.

Let $(a_n)_{n \in \mathbb{N}}$ be a sequence in $(X, d)$. A point $p \in X$ is called a cluster point of $(a_n)$ if for every $\varepsilon > 0$ there are infinitely many indices $n$ such that $d(a_n, p) < \varepsilon$ (in other words, $(a_n)$ has a subsequence converging to $p$).

A point $p \in S$ is called an interior point of $S$ if there is an $r > 0$ such that $B(p, r) \subset S$. The set of all interior points of $S$ is called the interior of $S$, and is denoted $\operatorname{Int} S$. Obviously,
• $\operatorname{Int} S$ is the union of all open subsets of $S$, and hence $\operatorname{Int} S$ is an open set (possibly $\emptyset$);
• $S$ is open if and only if $S = \operatorname{Int} S$.


For a set $S \subset (X, d)$ the closure of $S$, denoted $\operatorname{Cl} S$ or $\overline{S}$, is the intersection of all closed sets containing $S$. Clearly, $\operatorname{Cl} S$ is a closed set, and
• $S$ is closed if and only if $S = \operatorname{Cl} S$;
• $\operatorname{Cl} S = S \cup \{\text{accumulation points of } S\}$.

A metric space $(X, d)$ is called separable if it has a countable, dense subset $S$, i.e., $\operatorname{Cl} S = X$ (the closure being related to the metric topology generated by $d$). For example, $\mathbb{R}$ is separable with respect to its usual topology (since $\mathbb{Q}$ is dense in $\mathbb{R}$ with respect to this topology), but is not separable with respect to the discrete topology, i.e., the topology associated with the discrete metric on $\mathbb{R}$. This is because any subset of $\mathbb{R}$ is closed with respect to the discrete topology, so the only dense subset of $\mathbb{R}$ is $\mathbb{R}$ itself, which is not countable.

The boundary of a set $S \subset (X, d)$ is defined to be the set
$$\partial S := \operatorname{Cl} S \cap \operatorname{Cl}(X \setminus S).$$
Obviously, $\partial S = \partial(X \setminus S)$, and $p \in \partial S$ if and only if $B(p, \varepsilon) \cap S \ne \emptyset$ and $B(p, \varepsilon) \cap (X \setminus S) \ne \emptyset$ for all $\varepsilon > 0$.

2.2 Completeness

We start this section with the definition of a Cauchy sequence, which is essential in what follows.

Definition 2.4. A sequence $(a_n)_{n \in \mathbb{N}}$ in a metric space $(X, d)$ is called a Cauchy sequence if for all $\varepsilon > 0$ there exists an $N = N(\varepsilon) \in \mathbb{N}$ such that $d(a_n, a_m) < \varepsilon$ for all $m, n > N$.

It is easy to see that any convergent sequence in a metric space is a Cauchy sequence. The converse implication is not true in general.

Definition 2.5. A metric space $(X, d)$ is called complete if every Cauchy sequence $(a_n)_{n \in \mathbb{N}}$ in $X$ converges (i.e., there exists a point $a \in X$ such that $d(a_n, a) \to 0$).

For example, $\mathbb{R}$ with its usual topology (defined by $|\cdot|$) is a complete metric space (as shown in the previous chapter, see Theorem 1.12). More generally, for any $n \in \mathbb{N}$, $\mathbb{R}^n$, equipped with the Euclidean metric (generated by the Euclidean norm), is complete, because a Cauchy sequence in $\mathbb{R}^n$ is Cauchy in each coordinate. In fact, we will see later that $\mathbb{R}^n$ endowed with any norm is complete.

On the other hand, the metric space $(\mathbb{Q}, d)$, where $d(x, y) = |x - y|$ ($x, y \in \mathbb{Q}$), is not complete. For example, the sequence in $\mathbb{Q}$ defined by
$$a_1 = 2, \quad a_{n+1} = \frac{1}{2}\Big(a_n + \frac{2}{a_n}\Big), \quad n = 1, 2, \dots$$
is convergent in $(\mathbb{R}, |\cdot|)$ (since $a_n \ge \sqrt{2}$ and $a_{n+1}/a_n \le 1$, $n = 1, 2, \dots$), hence Cauchy with respect to $|\cdot|$, but its limit is $\sqrt{2} \notin \mathbb{Q}$.

Now, let us examine another example. Let $S$ be a nonempty set. Define
$$B(S; \mathbb{R}) = \{f : S \to \mathbb{R};\ f(S) \text{ is bounded}\},$$
where the boundedness condition on $f(S)$ means: $\exists M > 0$ such that $|f(s)| \le M$ for all $s \in S$. Obviously, $X = B(S; \mathbb{R})$ is a real linear space with respect to the usual operations (addition and scalar multiplication). It can be equipped with a norm $\|\cdot\|$ defined by
$$\|f\| := \sup_{s \in S} |f(s)| \quad \forall f \in X,$$
which gives a metric $d : X \times X \to [0, \infty)$, $d(f, g) = \|f - g\|$ ($f, g \in X$). Moreover, it is easily seen that $(X, d)$ is a complete metric space. The key condition ensuring the completeness of $X$ is the completeness of $\mathbb{R}$ with respect to its usual metric. Convergence in $X = B(S; \mathbb{R})$ is called uniform convergence on $S$. It is stronger than pointwise convergence. In particular,
$$d(f_n, f) = \|f_n - f\| \to 0 \ \Longrightarrow\ \lim_{n \to \infty} f_n(s) = f(s) \quad \forall s \in S,$$
but the converse implication is not true in general.

Definition 2.6. A normed space $(X, \|\cdot\|)$ which is complete (i.e., $(X, d)$ is complete for $d(x, y) = \|x - y\|$, $x, y \in X$) is called a Banach space.

In particular, $B(S; \mathbb{R})$ with the norm above (called the uniform convergence norm) is a Banach space. The subset $X_K = \{f \in B(S; \mathbb{R}) : |f(s)| \le K\ \forall s \in S\}$, where $K$ is a given positive constant, is a complete metric space with respect to the same metric (generated by the uniform convergence norm), since $X_K$ is closed in $B(S; \mathbb{R})$. Note that $X_K$ is not a Banach space because it is not a linear space. In general, if $(X, d)$ is a complete metric space, then any nonempty closed set $Y \subset X$ is also a complete metric space with the metric $d$ restricted to $Y \times Y$.

Definition 2.7. Two metric spaces $(X_1, d_1)$, $(X_2, d_2)$ are isometric if there exists a bijection $\varphi : X_1 \to X_2$ such that $d_2(\varphi(x), \varphi(y)) = d_1(x, y)$ for all $x, y \in X_1$.

An important result, due to Hausdorff,¹ says that any metric space can be extended (uniquely up to isometry) to a complete metric space (see [44, Chapter 2]). More precisely we have

¹ Felix Hausdorff, German mathematician, 1868–1942.

Theorem 2.8. For any metric space $(X, d)$ there exists a complete metric space $(\bar{X}, \bar{d})$ such that

(j) there exists $X_1 \subset \bar{X}$ such that $(X, d)$ is isometric to $(X_1, \bar{d})$;

(jj) $X_1$ is dense in $(\bar{X}, \bar{d})$.

$(\bar{X}, \bar{d})$ with the above properties is unique up to isometry.

Proof. One can construct an extension (completion) of $(X, d)$ by a procedure similar to that used in the previous chapter to construct the Cantor–Méray model for $\mathbb{R}$ starting from rational numbers. Specifically, let $E$ denote the set of all Cauchy sequences in $(X, d)$. $E$ is nonempty as it contains the constant sequences $(c, c, \dots)$, $c \in X$. We define an equivalence relation in $E$ as follows: $(a_n), (b_n) \in E$ are equivalent iff $d(a_n, b_n) \to 0$. In particular, two sequences that converge in $(X, d)$ to the same limit are equivalent. It is easily seen that the relation defined above is indeed an equivalence relation. Let $\bar{X}$ be the collection of all equivalence classes in $E$ with respect to this equivalence relation. Denote by $A, B, C, \dots$ the classes of the sequences $(a_n), (b_n), (c_n), \dots$ Now, define $\bar{d} : \bar{X} \times \bar{X} \to [0, \infty)$ by
$$\bar{d}(A, B) = \lim_{n \to \infty} d(a_n, b_n) \quad \forall A, B \in \bar{X}. \tag{2.2.2}$$
The limit above exists since
$$|d(a_{n+p}, b_{n+p}) - d(a_n, b_n)| \le d(a_{n+p}, a_n) + d(b_{n+p}, b_n),$$
which says that $(d(a_n, b_n))$ is a Cauchy sequence in $(\mathbb{R}, |\cdot|)$. Note also that the limit in (2.2.2) does not depend on representatives: if $(a_n')$ and $(b_n')$ are other representatives of $A$ and $B$, respectively, then
$$|d(a_n, b_n) - d(a_n', b_n')| \le d(a_n, a_n') + d(b_n, b_n') \to 0.$$
Thus $\bar{d}$ is well defined. It is easy to check that $\bar{d}$ is a metric.

Now, let $\psi : X \to \bar{X}$ be the mapping which associates with every $a \in X$ the class $A$ of the constant sequence $(a, a, \dots)$: $\psi(a) = A$. Obviously, $\psi$ is injective, so if we denote $X_1 = \psi(X)$, then $\psi$ is a bijection between $X$ and $X_1$. Moreover, for any $A = \psi(a)$, $B = \psi(b) \in X_1$ we have
$$\bar{d}(\psi(a), \psi(b)) = \bar{d}(A, B) = \lim d(a, b) = d(a, b).$$
Hence $(X, d)$ and $(X_1, \bar{d})$ are isometric, i.e., (j) holds true.

Let us now prove (jj). To this purpose, let $A$ be an arbitrary element of $\bar{X}$ and let $(a_n)$ be a representative of $A$. For each $k \in \mathbb{N}$ denote by $A_k$ the class of the constant sequence $(a_k, a_k, \dots)$. Since $(a_n)$ is a Cauchy sequence in $(X, d)$, we can write
$$\forall \varepsilon > 0,\ \exists N \in \mathbb{N} :\ d(a_{k+p}, a_k) < \varepsilon \quad \forall k > N,\ p \in \mathbb{N}.$$
Therefore,
$$\bar{d}(A, A_k) = \lim_{m \to \infty} d(a_m, a_k) \le \varepsilon \quad \forall k > N.$$
This shows that $A$ is approximated by $A_k \in X_1$, hence (jj) holds true.

In order to prove that $(\bar{X}, \bar{d})$ is complete, let $(A_n)$ be a Cauchy sequence in $(\bar{X}, \bar{d})$. For each class $A_k$ there is a class $B_k \in X_1$ such that $\bar{d}(A_k, B_k) < 1/k$ (see (jj)). Notice that $B_k$ is the class of some constant sequence $(b_k, b_k, \dots)$ with $b_k \in X$. We can show that $(b_k)$ is a Cauchy sequence in $(X, d)$:
$$d(b_k, b_m) = \bar{d}(B_k, B_m) \le \bar{d}(B_k, A_k) + \bar{d}(A_k, A_m) + \bar{d}(A_m, B_m) \le \frac{1}{k} + \frac{1}{m} + \bar{d}(A_k, A_m) \to 0,$$
as $k, m \to \infty$, so the class $B$ of the sequence $(b_k)$ belongs to $\bar{X}$. We claim that $B$ is the limit of $(A_k)$ with respect to $\bar{d}$. Indeed, given $\varepsilon > 0$,
$$\bar{d}(B, A_k) \le \bar{d}(B, B_k) + \bar{d}(B_k, A_k) \le \lim_{m \to \infty} d(b_m, b_k) + \frac{1}{k} < \varepsilon$$
for large enough $k$, so $\lim_{k \to \infty} \bar{d}(A_k, B) = 0$, as claimed. Therefore $(\bar{X}, \bar{d})$ is complete.

Finally, we need to show that any two complete metric spaces $(\bar{X}, \bar{d})$ and $(\hat{X}, \hat{d})$ satisfying (j) and (jj) are isometric. Let $X_1 \subset \bar{X}$ and $X_2 \subset \hat{X}$ be such that each of these spaces is isometric to $(X, d)$. Let $g : (X, d) \to (X_1, \bar{d})$ and $h : (X, d) \to (X_2, \hat{d})$ be the corresponding isometries. Then $(X_1, \bar{d})$ and $(X_2, \hat{d})$ are isometric, and $\theta = h \circ g^{-1}$ is an isometry between these spaces.

Let $A$ be an arbitrary element of $\bar{X}$. By (jj) there exists a sequence $(A_n)$ in $X_1$ such that $\bar{d}(A_n, A) \to 0$. Obviously, $B_n = \theta(A_n) \in X_2$ and $(B_n)$ is a Cauchy sequence in $(\hat{X}, \hat{d})$, so it is convergent since $(\hat{X}, \hat{d})$ is complete. Let $B \in \hat{X}$ be its limit: $\hat{d}(B_n, B) \to 0$. Denote by $\tilde{\theta}$ the mapping that takes $A$ to $B$. Note that $B$ does not depend on the choice of $(A_n)$, so it is unique for each $A$, i.e., $\tilde{\theta}$ is well defined. In fact, $\tilde{\theta}$ is an extension of $\theta$ to the whole $\bar{X}$. It is easily seen that $\tilde{\theta}$ is a bijection between $\bar{X}$ and $\hat{X}$.

It remains to prove that $\tilde{\theta}$ is an isometry. Let $A, A' \in \bar{X}$ and let $(A_n)$, $(A_n')$ be sequences in $X_1$ which converge, respectively, to $A$ and $A'$ with respect to $\bar{d}$. Let $B$, $B'$ be the limits of $B_n = \theta(A_n)$ and $B_n' = \theta(A_n')$ in $(\hat{X}, \hat{d})$. By letting $n$ tend to infinity in the equation
$$\hat{d}(B_n, B_n') = \bar{d}(A_n, A_n'),$$
we obtain
$$\hat{d}\big(\tilde{\theta}(A), \tilde{\theta}(A')\big) = \hat{d}(B, B') = \bar{d}(A, A'),$$
by using the inequality
$$|\bar{d}(A_n, A_n') - \bar{d}(A, A')| \le \bar{d}(A_n, A) + \bar{d}(A_n', A'),$$
and the similar one for $B_n$, $B_n'$, $B$, $B'$. Therefore, $(\bar{X}, \bar{d})$ and $(\hat{X}, \hat{d})$ are indeed isometric.

Remark 2.9. Let $X$ be a nonempty subset of a given complete metric space $(Z, d)$. Then $(\operatorname{Cl} X, d)$ is also a complete metric space, where $\operatorname{Cl} X$ is the closure of $X$ in $(Z, d)$, also denoted $\bar{X}$. Clearly, $(\operatorname{Cl} X, d)$ plays the role of $(\bar{X}, \bar{d})$ in Theorem 2.8, so $(\operatorname{Cl} X, d)$ can be regarded as the completion of $X$ with respect to $d$. To illustrate this case, consider $X = (0, 1]$ and $Z = \mathbb{R}$ with $d(x, y) = |x - y|$. Then $\operatorname{Cl} X = [0, 1]$ and so $([0, 1], d)$ is the completion of $((0, 1], d)$ (which is not itself complete). Further examples will be discussed later, including examples involving function spaces.
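The rational sequence $a_1 = 2$, $a_{n+1} = \frac{1}{2}(a_n + 2/a_n)$, used earlier in this section to show that $(\mathbb{Q}, |\cdot|)$ is not complete, can be computed exactly. The following Python sketch (the function name `heron_terms` is hypothetical) keeps every term as an exact fraction, so each term is visibly a point of $\mathbb{Q}$, while $a_n^2 - 2$ stays positive and shrinks rapidly; the limit $\sqrt{2}$ only exists in the completion $\mathbb{R}$.

```python
from fractions import Fraction

def heron_terms(count):
    """First `count` terms of a_1 = 2, a_{n+1} = (a_n + 2/a_n)/2, as exact rationals."""
    a = Fraction(2)
    terms = []
    for _ in range(count):
        terms.append(a)
        a = (a + 2 / a) / 2
    return terms

terms = heron_terms(6)
for n, a in enumerate(terms, start=1):
    # a*a - 2 is strictly positive for every term: no term equals sqrt(2).
    print(n, a, float(a * a - 2))
```

The printed fractions decrease and their squares approach 2 from above, in line with the inequalities $a_n \ge \sqrt{2}$ and $a_{n+1}/a_n \le 1$ quoted in the text.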


Note that in Theorem 2.8 we had to construct $\bar{X}$ because it was not a priori known.

We continue with Baire's Theorem,² which is used to derive some important principles of Functional Analysis: the Uniform Boundedness Principle, the Open Mapping Theorem, and the Closed Graph Theorem (see Theorems 4.7, 4.8, and 4.10).

² René-Louis Baire, French mathematician, 1874–1932.

Theorem 2.10 (Baire). Let $(X, d)$ be a complete metric space and let $X_n \subset X$, $n \in \mathbb{N}$, be closed sets satisfying
$$\operatorname{Int} X_n = \emptyset \quad \forall n \in \mathbb{N}. \tag{2.2.3}$$
Then,
$$\operatorname{Int} \Big(\bigcup_{n=1}^{\infty} X_n\Big) = \emptyset. \tag{2.2.4}$$

Proof. Notice first that for all $F \subset X$ we have
$$\operatorname{Cl}(X \setminus F) =: \overline{X \setminus F} = X \iff \operatorname{Int} F = \emptyset.$$
So by (2.2.3) $D_n = X \setminus X_n$ is dense in $X$ and is also open for all $n \in \mathbb{N}$. We have to show that (2.2.4) holds or, equivalently, that $M := \bigcap_{n=1}^{\infty} D_n$ is dense in $X$, i.e., for every nonempty open set $D \subset X$ we have $D \cap M \ne \emptyset$. Fix such an open set $D$ and choose some $x_0 \in D$ and $r_0 > 0$ such that the closed ball $\overline{B}(x_0, r_0) \subset D$. Since $D_1$ is open and dense in $X$ there exist $x_1 \in B(x_0, r_0) \cap D_1$ and $r_1 > 0$ such that
$$\overline{B}(x_1, r_1) \subset B(x_0, r_0) \cap D_1, \quad 0 < r_1 < \frac{r_0}{2}.$$
By induction one can find sequences $(x_n)$ and $(r_n)$ such that
$$\overline{B}(x_{n+1}, r_{n+1}) \subset B(x_n, r_n) \cap D_{n+1}, \quad 0 < r_{n+1} < \frac{r_n}{2},$$
for $n = 0, 1, 2, \dots$ It is easily seen that $(x_n)$ is Cauchy, hence convergent (since $(X, d)$ is complete). If $a$ denotes its limit then $a \in D \cap M$, hence $D \cap M \ne \emptyset$, as claimed.

2.3 Compact Sets

Let $A$ be a subset of a metric space $(X, d)$. A cover of $A$ is a collection of sets $\{D_i\}_{i \in I}$ whose union contains $A$:
$$A \subset \bigcup_{i \in I} D_i,$$
where $I$ is a finite or infinite index set. If all $D_i$ are open sets then $\{D_i\}_{i \in I}$ is called an open cover.

Definition 2.11. A subset $A$ of $(X, d)$ is called compact if every open cover of $A$ has a finite subcover.

The next result is a characterization of compact sets in metric spaces.

Theorem 2.12. A subset $A$ of a metric space $(X, d)$ is compact if and only if every sequence in $A$ has a subsequence that converges to a point of $A$ (in other words, $A$ is sequentially compact).

Proof. The proof is divided into several steps.

Step 1: If $A$ is compact then $A$ is closed. We need to show that $X \setminus A$ is open. Let $x \in X \setminus A$. If $y \in A$ we have $d(y, x) > 0$ and so $y$ belongs to $D_n = \{z \in X;\ d(z, x) > 1/n\}$ for $n \in \mathbb{N}$ large enough. Thus $\{D_n\}_{n \in \mathbb{N}}$ is an open cover of $A$. Since $A$ is compact, there is a finite subcover of $A$. In fact, since the sets $D_n$ increase with $n$, this subcover can be reduced to one set $D_N$ with $N$ large. By construction $B(x, \frac{1}{N}) \subset X \setminus A$, and hence $X \setminus A$ is open, therefore $A$ is closed, as claimed.

Step 2: If $A$ is compact and $B \subset A$ is closed, then $B$ is compact. If $\{D_i\}_{i \in I}$ is an open cover of $B$, then $\{D_i\}_{i \in I} \cup \{X \setminus B\}$ is an open cover of $A$. Since $A$ is compact, we can extract a finite subcover of $A$, say $\{D_{i_1}, D_{i_2}, \dots, D_{i_m}, X \setminus B\}$. Thus $\{D_{i_1}, D_{i_2}, \dots, D_{i_m}\}$ is a finite subcover of $B$ extracted from $\{D_i\}_{i \in I}$.

Step 3: $A$ being compact implies $A$ is sequentially compact. Assume, by contradiction, that there is a sequence $(x_n)$ in $A$ that has no convergent subsequence. So $(x_n)$ has infinitely many distinct points $y_1 = x_{n_1}$, $y_2 = x_{n_2}, \dots$ such that for each $y_m$ there is an open ball $B(y_m, r_m)$ containing no other $y_i$ (otherwise $y_m$ is a cluster point of the sequence $(y_i)$). The set $C = \{y_1, y_2, \dots\}$ is closed since it has no accumulation points (an accumulation point of $C$ would be the limit of a subsequence of $(x_n)$). By Step 2, $C$ is compact. On the other hand, $\{B(y_m, r_m)\}_{m \in \mathbb{N}}$ is an open cover of $C$ which has no finite subcover. This contradiction shows that $(x_n)$ must have a convergent subsequence. Its limit belongs to $A$, since $A$ is closed (see Step 1).

Step 4: If $A$ is sequentially compact, then for every open cover $\{D_i\}_{i \in I}$ of $A$, there exists an $r > 0$ such that $\forall y \in A$, $B(y, r)$ is contained in some $D_i$. Assume to the contrary that this is not the case. Thus there exists an open cover $\{D_i\}$ of $A$ such that $\forall n \in \mathbb{N}$ there is some $y_n \in A$ so that $B(y_n, \frac{1}{n})$ is not contained in any $D_i$. By hypothesis $(y_n)$ has a subsequence $(z_1 = y_{n_1}, z_2 = y_{n_2}, \dots)$ converging to some $z \in A$. Obviously, $z$ belongs to some $D_{i_0}$ and since $D_{i_0}$ is open and $z_n \to z$, we can choose some large $N$ such that $B(z_N, \frac{1}{N}) \subset D_{i_0}$, which is a contradiction.

Step 5: $A$ being sequentially compact implies that for all $\varepsilon > 0$ there is a finite number of open balls of radius $\varepsilon$ covering $A$ (i.e., $A$ is totally bounded). We need to analyze the case when $A$ is not finite, otherwise the conclusion is obvious. Assume that $A$ is not totally bounded, i.e., for some $\varepsilon > 0$ we cannot cover $A$ with finitely many open balls of radius $\varepsilon$. Choose $y_1 \in A$ and $y_2 \in A \setminus B(y_1, \varepsilon)$. By the same assumption there exists a point $y_3 \in A \setminus \big(B(y_1, \varepsilon) \cup B(y_2, \varepsilon)\big)$. Repeating this process we obtain a sequence
$$y_n \in A \setminus \bigcup_{i=1}^{n-1} B(y_i, \varepsilon),$$
which satisfies $d(y_n, y_m) \ge \varepsilon$ for all $n, m \in \mathbb{N}$, $n \ne m$. In other words, $(y_n)$ has no Cauchy subsequence and hence has no convergent subsequence, thus contradicting sequential compactness.

Step 6: If $A$ is sequentially compact then $A$ is compact. Let $\{D_i\}$ be an open cover of $A$. Associate with this cover a positive $r$ given by Step 4. By Step 5 (see also its proof) there is a finite number of points, say $y_1, y_2, \dots, y_p \in A$, such that
$$A \subset \bigcup_{j=1}^{p} B(y_j, r).$$
By Step 4, each ball $B(y_j, r)$ is contained in some $D_{i_j}$. Hence $\{D_{i_1}, D_{i_2}, \dots, D_{i_p}\}$ is a finite (open) subcover of $A$.
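The selection procedure of Step 5 (pick a point, then repeatedly pick a point outside all balls chosen so far) can be tried on a finite sample. The sketch below is only an illustration under stated assumptions: the sample is a hypothetical grid in the unit square with the Euclidean distance, and `greedy_epsilon_net` is an illustrative name. On a finite set the procedure necessarily stops, producing centers that are pairwise at distance at least $\varepsilon$ and whose open $\varepsilon$-balls cover the sample; in the proof the same procedure is assumed never to stop, which yields the contradiction.

```python
import math

def greedy_epsilon_net(points, eps, dist):
    """Select centers so that each new center lies outside all earlier open eps-balls."""
    centers = []
    for p in points:
        # p is kept only if it is at distance >= eps from every center chosen so far.
        if all(dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

# Hypothetical sample: a 21 x 21 grid in the unit square.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
eps = 0.3
centers = greedy_epsilon_net(grid, eps, math.dist)

print(len(centers), "centers for", len(grid), "sample points")
# Every sample point lies in the open eps-ball around some center.
print(all(any(math.dist(p, c) < eps for c in centers) for p in grid))
```

A point that is rejected by the loop is, by construction, within distance less than $\varepsilon$ of an earlier center, which is why the selected balls cover the whole sample.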


Definition 2.13. A set $A \subset (X, d)$ is called relatively compact if $\operatorname{Cl} A$ is compact.

Corollary 2.14. A set $A \subset (X, d)$ is relatively compact if and only if every sequence in $A$ has a convergent subsequence.

Proof. Assume that any sequence in $A$ has a convergent subsequence (its limit being a point of $\operatorname{Cl} A$). Then $\operatorname{Cl} A$ is sequentially compact (hence compact) in $(X, d)$. Indeed, if $(x_n)$ is a sequence in $\operatorname{Cl} A$, then there exists a sequence $(y_n)$ in $A$ such that $d(x_n, y_n) < 1/n$ for all $n \in \mathbb{N}$. As $(y_n)$ has a convergent subsequence $(y_{n_k})$, it follows that $(x_{n_k})$ is also convergent. So the statement of the corollary holds true by Theorem 2.12.

Now let us recall a result due to Bolzano and Weierstrass.

Theorem 2.15 (Bolzano–Weierstrass). Every bounded sequence in $\mathbb{R}^k$ endowed with the Euclidean norm has a convergent subsequence.

Proof. This theorem is known for $k = 1$ (see Theorem 1.11) and extends easily to $\mathbb{R}^k$: a bounded sequence in $\mathbb{R}^k$ is bounded in each coordinate.

From the proof of Theorem 2.12 we see that every compact set in a metric space is closed and bounded. The converse implication is not true in general. However, we have the following result attributed to Heine and Borel.³

³ Heinrich Eduard Heine, German mathematician, 1821–1881; Émile Borel, French mathematician, 1871–1956.

Corollary 2.16 (Heine–Borel). Let $\emptyset \ne A \subset \mathbb{R}^k$ endowed with the usual Euclidean metric. $A$ is compact if and only if $A$ is closed and bounded (with respect to the same metric).

Proof. The forward implication is valid in every metric space, as observed above. Conversely, assume that $A$ is closed and bounded. Then any sequence in $A$ is bounded so it has a convergent subsequence (cf. Theorem 2.15). Its limit belongs to $A$ because $A$ is closed. Thus $A$ is sequentially compact, hence compact by Theorem 2.12.

Remark 2.17. The Heine–Borel Theorem extends to any finite dimensional space with the Euclidean metric but may not be true for other metrics. As an example, consider $(\mathbb{R}, d_0)$, where $d_0$ is the discrete metric
$$d_0(x, y) = \begin{cases} 0, & x = y, \\ 1, & x \ne y. \end{cases}$$
Let $A = \mathbb{N} \subset \mathbb{R}$. $A$ is bounded with respect to $d_0$ because $A \subset B(0, 2)$. $A = \mathbb{N}$ is closed with respect to $d_0$, but it is not compact because the open cover $\{B(n, 1/2)\}_{n \in \mathbb{N}}$ has no finite subcover.

A collection of subsets of $(X, d)$ is said to have the finite intersection property if the intersection of every finite sub-collection of the family is nonempty.

Theorem 2.18. If a collection of compact subsets of a metric space $(X, d)$, say $\{K_i\}_{i \in I}$, has the finite intersection property, then $\bigcap_{i \in I} K_i \ne \emptyset$.

Proof. The statement is trivial if $I$ is a finite set, so let us assume that $I$ is infinite. Assume to the contrary that $\bigcap_{i \in I} K_i = \emptyset$. Hence
$$X = \bigcup_{i \in I} (X \setminus K_i) = (X \setminus K_{i_0}) \cup \Big(\bigcup_{i \in I_1} (X \setminus K_i)\Big), \tag{2.3.5}$$
where $i_0 \in I$ is an arbitrary but fixed index, and $I_1 = I \setminus \{i_0\}$. It follows that
$$K_{i_0} \subset \bigcup_{i \in I_1} (X \setminus K_i).$$
As $K_{i_0}$ is compact and $\{X \setminus K_i\}_{i \in I_1}$ is an open cover of $K_{i_0}$, there is a finite set $J \subset I_1$ such that $K_{i_0} \subset \bigcup_{i \in J} (X \setminus K_i)$. Therefore (see (2.3.5)),
$$X = \bigcup_{i \in J_1} (X \setminus K_i),$$
where $J_1 = J \cup \{i_0\}$, or equivalently $\emptyset = \bigcap_{i \in J_1} K_i$, which contradicts our assumption because $J_1$ is finite.

2.4 Continuous Functions on Compact Sets

Let $(X, d)$ and $(X_1, d_1)$ be two metric spaces. A function $f : D \subset (X, d) \to (X_1, d_1)$ is said to be continuous at some point $x_0 \in D$ if for every neighborhood $V \subset (X_1, d_1)$ of $f(x_0)$ there exists a neighborhood $U \subset (X, d)$ of $x_0$ such that $f(U \cap D) \subset V$, or equivalently
$$\forall \varepsilon > 0,\ \exists \delta > 0 :\ \forall x \in D,\ d(x, x_0) < \delta \ \Rightarrow\ d_1(f(x), f(x_0)) < \varepsilon. \tag{2.4.6}$$
$U$ and $\delta$ depend on $\varepsilon$ and $x_0$. The continuity of $f$ at $x_0 \in D$ can also be equivalently expressed by using sequences:
$$x_n \in D,\ d(x_n, x_0) \to 0 \ \Longrightarrow\ d_1(f(x_n), f(x_0)) \to 0.$$
If $f$ is continuous at all $x_0 \in D$ then we say that $f$ is continuous on $D$ (or simply continuous). The function $f$ is called uniformly continuous on $D$ if $\delta$ in (2.4.6) can be the same for all $x_0 \in D$, i.e., $\delta$ is independent of $x_0 \in D$ (it depends only on $\varepsilon$).

Theorem 2.19. If $D \subset (X, d)$ is a nonempty compact set and $f : D \to (X_1, d_1)$ is continuous (on $D$), then the following hold:
• $f$ is uniformly continuous on $D$;
• the set $f(D) := \{f(x);\ x \in D\}$ is compact in $(X_1, d_1)$;
• $C(D; X_1) := \{f : D \to (X_1, d_1);\ f \text{ continuous on } D\}$ is a metric space with respect to the metric $\tilde{d}(f, g) = \sup_{x \in D} d_1(f(x), g(x))$. If in addition $(X_1, d_1)$ is complete, then $(C(D; X_1), \tilde{d})$ is also complete.

The proof is left to the reader as an exercise.

Theorem 2.20 (Weierstrass). If $D \subset (X, d)$ is a nonempty compact set and $f : D \to \mathbb{R}$ is continuous ($\mathbb{R}$ being equipped with the usual metric), then $f(D)$ is closed and bounded, and there exist $x_0, y_0 \in D$ such that $f(x_0) = \inf f(D)$ and $f(y_0) = \sup f(D)$.


Proof. The first part of the theorem follows from Theorem 2.19, which in particular says that $f(D)$ is compact (in $\mathbb{R}$), hence closed and bounded. So the infimum and supremum of $f(D)$, denoted $m$ and $M$, are finite numbers. Now, for all $n \in \mathbb{N}$ there exists an $x_n \in D$ such that
$$m \le f(x_n) < m + \frac{1}{n}. \tag{2.4.7}$$
As $D$ is a compact set, $(x_n)$ has a subsequence which converges to some $x_0 \in D$. This fact, combined with (2.4.7) and the continuity of $f$, implies $m = f(x_0)$. Similarly, there is a point $y_0 \in D$ such that $M = f(y_0)$.

Equivalent Norms.

Let $X$ be a linear space over $\mathbb{K}$ (as usual $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$). Two norms on $X$, say $\|\cdot\|$ and $\|\cdot\|_*$, are said to be equivalent if there exist two positive constants $C_1$, $C_2$ such that
$$C_1 \|x\| \le \|x\|_* \le C_2 \|x\| \quad \forall x \in X. \tag{2.4.8}$$

Obviously, two equivalent norms on $X$ generate the same topology on $X$. If $X$ is a $k$-dimensional linear space, $k \in \mathbb{N}$, with a basis $B = \{u_1, \dots, u_k\}$, then $X$ can be equipped with different norms, such as
$$\|x\|_{\max} = \max_{1 \le i \le k} |\alpha_i|, \qquad \|x\|_p = \Big(\sum_{i=1}^{k} |\alpha_i|^p\Big)^{1/p}, \quad p \in [1, \infty),$$
for all $x = \sum_{i=1}^{k} \alpha_i u_i \in X$. Note that $\|\cdot\|_2$ is precisely the Euclidean norm of $X$ introduced before.

Theorem 2.21. If $X$ is a $k$-dimensional linear space, $k \in \mathbb{N}$, then any two norms on $X$ are equivalent.

Proof. It is enough to show that any norm $\|\cdot\|$ on $X$ is equivalent to the Euclidean norm $\|\cdot\|_2$. On the one hand, for any $x = \sum_{i=1}^{k} \alpha_i u_i \in X$, we have
$$\|x\| \le \sum_{i=1}^{k} |\alpha_i| \cdot \|u_i\| \le \max_{1 \le i \le k} \|u_i\| \cdot \sum_{i=1}^{k} |\alpha_i| \le \sqrt{k}\, \max_{1 \le i \le k} \|u_i\| \cdot \|x\|_2. \tag{2.4.9}$$
We have used the triangle inequality and the Bunyakovsky–Cauchy–Schwarz inequality. Denoting $C := \sqrt{k}\, \max_{1 \le i \le k} \|u_i\|$, we can derive from (2.4.9)
$$\|x\| \le C \|x\|_2 \quad \forall x \in X. \tag{2.4.10}$$
In order to get the other inequality we use Theorem 2.20. Observe that $\|\cdot\|$ is a continuous function on $(X, \|\cdot\|_2)$:
$$\big|\, \|x\| - \|x_0\| \,\big| \le \|x - x_0\| \le C \|x - x_0\|_2,$$
so $\|\cdot\|$ is bounded and attains its infimum, say $C_1$, on the unit sphere $S_2(0, 1) = \{x \in X;\ \|x\|_2 = 1\}$ (which is compact in $(X, \|\cdot\|_2)$), i.e.,
$$\|x\| \ge C_1 \quad \forall x \in S_2(0, 1). \tag{2.4.11}$$
$C_1$ cannot be zero since it is the value of $\|\cdot\|$ at a point in $S_2(0, 1)$. From (2.4.11) we easily derive
$$C_1 \|x\|_2 \le \|x\| \quad \forall x \in X. \tag{2.4.12}$$
According to (2.4.10) and (2.4.12), the two norms are equivalent, as claimed.

Remark 2.22. In infinite dimensional linear spaces there exist norms which are not equivalent. For instance, let us consider the following two norms on the real linear space $X = C[a, b] := C([a, b]; \mathbb{R})$, $-\infty < a < b < +\infty$,
$$\|f\| = \sup\{|f(t)|;\ a \le t \le b\}, \qquad \|f\|_1 = \int_a^b |f(t)|\, dt.$$
We have
$$\|f\|_1 \le (b - a) \|f\| \quad \forall f \in X,$$
i.e., the sup-norm $\|\cdot\|$ is stronger than $\|\cdot\|_1$. But the two norms are not equivalent. Indeed, let $(f_n)$ be the sequence in $X$ defined by
$$f_n(t) = \begin{cases} 0, & a \le t \le b - \frac{1}{n}, \\ nt + 1 - nb, & b - \frac{1}{n} < t \le b, \end{cases}$$
where $n \in \mathbb{N}$, $n > 1/(b - a)$. Clearly $\|f_n\| = 1$, but
$$\|f_n\|_1 = \int_{b - \frac{1}{n}}^{b} |nt + 1 - nb|\, dt = \frac{1}{2n},$$
so there does not exist a constant $C$ such that $\|f_n\| \le C \|f_n\|_1$ for all $n$, because $\|f_n\|_1 \to 0$ as $n \to \infty$.
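The two norms of $f_n$ in Remark 2.22 can be evaluated exactly, which makes the failure of equivalence concrete. The sketch below assumes $[a, b] = [0, 1]$ by default; the names `f` and `norms` are illustrative. Since $f_n$ is nondecreasing and piecewise linear with a single kink at $b - 1/n$, its supremum is attained at $t = b$ and the trapezoidal rule on the nodes $a$, $b - 1/n$, $b$ integrates it exactly.

```python
from fractions import Fraction

def f(n, t, b):
    """The function f_n of Remark 2.22: zero up to b - 1/n, then linear up to 1 at t = b."""
    return Fraction(0) if t <= b - Fraction(1, n) else n * t + 1 - n * b

def norms(n, a=Fraction(0), b=Fraction(1)):
    """Return (sup-norm, L1-norm) of f_n on [a, b]; requires n > 1/(b - a)."""
    kink = b - Fraction(1, n)
    nodes = [a, kink, b]
    values = [abs(f(n, t, b)) for t in nodes]
    sup_norm = max(values)
    # Trapezoidal rule, exact here because f_n is linear on each subinterval.
    l1_norm = sum((t1 - t0) * (v0 + v1) / 2
                  for t0, t1, v0, v1 in zip(nodes, nodes[1:], values, values[1:]))
    return sup_norm, l1_norm

for n in (2, 10, 100, 1000):
    sup_norm, l1_norm = norms(n)
    # The ratio sup-norm / L1-norm equals 2n, so it is unbounded in n.
    print(n, sup_norm, l1_norm, sup_norm / l1_norm)
```

The unbounded ratio is exactly the obstruction to an inequality $\|f\| \le C\|f\|_1$ valid for all $f \in C[a, b]$.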


Remark 2.23. It follows from Theorem 2.21 that any norm on a finite dimensional linear space generates the same topology as that defined by the Euclidean norm, so any topological result involving the Euclidean norm is also valid with respect to any other norm. In particular, the Heine–Borel Theorem is valid in any finite dimensional linear space equipped with any norm.

Throughout the rest of this book, $\mathbb{R}^k$ and any other finite dimensional linear space is always considered as a normed space, equipped with the norm topology (generated by any convenient norm), unless otherwise specified.

The next result is a characterization (due to Riesz⁴) of the finite dimensionality of normed spaces clarifying the Heine–Borel Theorem.

⁴ Frigyes Riesz, Hungarian mathematician, 1880–1956.

Theorem 2.24 (Riesz). Let $(X, \|\cdot\|)$ be a normed linear space. $X$ is finite dimensional if and only if every closed bounded subset of $X$ is compact.

In order to prove Theorem 2.24, we need the following lemma.

Lemma 2.25. Let $(X, \|\cdot\|)$ be a normed space. Let $X_1 \subset X$ be a proper, closed linear subspace of $X$. Then there exists $x_0 \in X \setminus X_1$ such that
$$\|x_0\| = 1, \qquad \|x - x_0\| \ge \frac{1}{2} \quad \forall x \in X_1.$$

Proof. Choose $x_1 \in X \setminus X_1$ and let $\rho = d(x_1, X_1) := \inf\{\|x_1 - z\|;\ z \in X_1\}$. We first prove $\rho > 0$. Suppose $\rho = 0$. Then there exists a sequence $z_n \in X_1$ such that $\|x_1 - z_n\| < 1/n$, hence $z_n \to x_1$. As $X_1$ is closed, this implies $x_1 \in X_1$, which is a contradiction. By the definition of $\rho$ there exists $x_2 \in X_1$ such that $\|x_1 - x_2\| < 2\rho$. Let
$$x_0 = \frac{1}{\|x_1 - x_2\|}\,(x_1 - x_2).$$
Clearly $x_0 \in X \setminus X_1$ and $\|x_0\| = 1$. For $x \in X_1$ we have
$$\|x - x_0\| = \big\| x - \|x_1 - x_2\|^{-1} (x_1 - x_2) \big\| = \frac{1}{\|x_1 - x_2\|}\, \|x_1 - v\| \ge \frac{1}{2\rho}\, \|x_1 - v\| \ge \frac{1}{2\rho}\, \rho = \frac{1}{2},$$
where $v = x_2 + \|x_1 - x_2\|\, x \in X_1$.

Proof of Theorem 2.24. The necessity part follows from the Heine–Borel Theorem extended to finite dimensional linear spaces (see Remark 2.23). To prove sufficiency, assume by way of contradiction that $X$ is not finite dimensional, i.e., there exist infinitely many distinct points in $X$, say $x_1, x_2, \dots$, such that for all $n \in \mathbb{N}$, $B_n = \{x_1, x_2, \dots, x_n\}$ is a linearly independent system. Let $X_n = \operatorname{Span} B_n$. Now, $X_n$ is a closed subspace (being finite dimensional) and $X_n \subset X_{n+1}$ (proper inclusion) for all $n \in \mathbb{N}$. By Lemma 2.25, applied in $X_{n+1}$, there exists $y_n \in X_{n+1} \setminus X_n$ for $n \in \mathbb{N}$ such that $\|y_n\| = 1$ and $\|y_n - x\| \ge 1/2$ $\forall x \in X_n$. In particular $\|y_n - y_m\| \ge 1/2$ for all $m, n \in \mathbb{N}$, $m \ne n$. So $(y_n)$ has no Cauchy subsequence, hence no convergent subsequence. On the other hand, $y_n \in \operatorname{Cl} B(0, 1)$ $\forall n \in \mathbb{N}$, so $(y_n)$ should have a convergent subsequence (since $\operatorname{Cl} B(0, 1)$ is compact by assumption). This contradiction completes the proof.

Arzelà–Ascoli Criterion⁵

⁵ Cesare Arzelà, Italian mathematician, 1847–1912; Giulio Ascoli, Italian mathematician, 1843–1896.

Let $(X, d)$ and $(X_1, d_1)$ be metric spaces and let $\emptyset \ne A \subset X$. Denote as usual by $C(A; X_1)$ the set of all continuous functions from $A \subset (X, d)$ to $(X_1, d_1)$.

Definition 2.26. A family of functions $\mathcal{F} \subset C(A; X_1)$ is called equicontinuous if for all $\varepsilon > 0$ and all $x \in A$ there exists $\delta > 0$ such that $y \in A$ and $d(x, y) < \delta$ implies $d_1(f(x), f(y)) < \varepsilon$ for all $f \in \mathcal{F}$, i.e., $\delta = \delta(\varepsilon, x)$ is independent of $f$.


Deﬁnition 2.27. If in addition δ = δ(ε) (i.e., δ is independent of x and f ), then F is uniformly equicontinuous, i.e., ∀ε > 0, ∀x, y ∈ A, d(x, y) < δ implies d1 (f (x), f (y)) < ε for all f ∈ F. Remark 2.28. If A ⊂ (X, d) is compact and F ⊂ C(A; X1 ) is equicontinuous, then F is uniformly equicontinuous (see Exercise 2.22 below). Note also that if A is compact then C(A; X1 ) is a metric space with ˜ g) = sup respect to the metric d(f, x∈A d1 (f (x), g(x)); if in addition ˜ is complete too, and in partic(X1 , d1 ) is complete then (C(A; X1 ), d) k ular C(A; R ), k ∈ N, is a Banach space with respect to the sup-norm f C(A; Rk ) = supx∈A f (x), where · is a norm of Rk . Theorem 2.29 (Arzel`a–Ascoli Criterion). Let ∅ = A ⊂ (X, d) be compact. Assume that F ⊂ C(A, Rk ) is equicontinuous and bounded in C(A; Rk ) (i.e., ∃M > 0 such that f (x) ≤ M , ∀x ∈ A, ∀f ∈ F). Then F is relatively compact in C(A; Rk ) equipped with the sup-norm. Proof. For any δ > 0 we have A ⊂ ∪x∈A B(x, δ) and since A is compact, there exists a ﬁnite subcover, so that A ⊂ ∪pj=1 B(yj , δ). Let Cδ = {y1 , y2 , . . . , yp } and consider C = ∪i∈N C1/i . C is dense in A and countable so C = {x1 , x2 , . . . }. In order to prove that F is relatively compact in C(A; Rk ) it suﬃces to show that any sequence in F has a convergent subsequence in this space (cf. Corollary 2.14). So let (fn )n∈N be a sequence in F. Since F is bounded in C(A; Rk ), then (fn (x1 )N ) is bounded in Rk so there exists a subsequence of (fn ), f11 , f12 , . . . , f1n , . . . which is convergent at x = x1 . By the same assumption this subsequence has a subsequence f21 , f22 , . . . , f2n , . . . which is convergent at x = x2 (and at x = x1 as well). Continuing the process we obtain successive subsequences fm1 , fm2 , . . . , fmn , . . . .. . Think of it as an inﬁnite matrix and consider the diagonal sequence (gn ) = (f11 , f22 , . . . , fnn , . . . ) which converges at any point of C. 
On the other hand, as F is equicontinuous and A is compact, F is in fact


uniformly equicontinuous, i.e., for every ε > 0 there exists a δ = δ(ε) > 0 such that

∀z, w ∈ A, d(z, w) < δ implies ‖g_n(z) − g_n(w)‖ < ε ∀n ∈ N. (2.4.13)

We can choose δ = 1/i, with i ∈ N sufficiently large. Now, for a given ε fix such a δ = 1/i, so C_δ = C_{1/i} is a finite set C_δ = {y_1, ..., y_p} ⊂ C. If x ∈ A then it belongs to a ball B(y_j, δ) for some j ∈ {1, ..., p} and we have, by (2.4.13) and the convergence of (g_n(y_j)),

‖g_n(x) − g_m(x)‖ ≤ ‖g_n(x) − g_n(y_j)‖ + ‖g_n(y_j) − g_m(y_j)‖ + ‖g_m(y_j) − g_m(x)‖ < ε + ε + ε = 3ε ∀n, m > N(ε, j).

Therefore,

‖g_n − g_m‖_{C(A; R^k)} ≤ 3ε ∀n, m > N(ε) := max_{j∈{1,...,p}} N(ε, j).

As C(A; R^k) is a Banach space it follows that (g_n) converges in this space.

Notice that in the above proof we have used two essential arguments: the completeness of the space (R^k, ‖·‖) (implying that C(A; R^k) is a Banach space) and the fact that the set {f(x); f ∈ F} is bounded in R^k (equivalently, relatively compact in this space) for all x ∈ A. So the following generalization holds true:

Theorem 2.30. Let F ⊂ C(A; X_1) where A ≠ ∅ is a compact subset of (X, d) and (X_1, d_1) is a complete metric space. Assume that F is equicontinuous and {f(x); f ∈ F} is relatively compact in (X_1, d_1) for all x ∈ A. Then F is relatively compact in C(A; X_1).

Peano’s Existence Theorem⁶

In what follows we illustrate the Arzelà–Ascoli Criterion with Peano’s Existence Theorem, which is a fundamental result in the theory of ordinary differential equations.

Theorem 2.31 (Peano). Let a, b ∈ (0, ∞), t_0 ∈ R, x_0 ∈ R^k, and let R^k be equipped with the norm ‖v‖ = max_{1≤i≤k} |v_i|. Let D be the set

D = {(t, v) ∈ R × R^k; |t − t_0| ≤ a, ‖v − x_0‖ ≤ b} ⊂ R^{k+1}

Giuseppe Peano, Italian mathematician, 1858–1932.


and let f : D → R^k be a continuous function. Then there exists a continuously differentiable function x : [t_0 − δ, t_0 + δ] → R^k satisfying the equation

x′(t) = f(t, x(t)) ∀t ∈ [t_0 − δ, t_0 + δ], (2.4.14)

and the initial (Cauchy) condition

x(t_0) = x_0, (2.4.15)

where δ = min(a, b/M) with M = sup{‖f(t, v)‖; (t, v) ∈ D}. M is assumed to be a positive number, because the case M = 0 ⇐⇒ f ≡ 0 is trivial.

Proof. We shall use Euler’s method of polygonal lines.⁷ Since f ∈ C(D; R^k) and D is compact, f is uniformly continuous, i.e., ∀ε > 0, ∃δ_1 = δ_1(ε) > 0 such that

(t, v_1), (s, v_2) ∈ D, |t − s| < δ_1, ‖v_1 − v_2‖ < δ_1 ⇒ ‖f(t, v_1) − f(s, v_2)‖ < ε.

We shall only prove existence on I := [t_0, t_0 + δ]. By symmetry we get the other side; however, we have to check that the solution is differentiable at t = t_0. Given

x(t) = x_r(t) for t ∈ [t_0, t_0 + δ], x(t) = x_l(t) for t ∈ [t_0 − δ, t_0],

we have

x′_−(t_0) = (dx_l/dt)(t_0) = f(t_0, x_0) = (dx_r/dt)(t_0) = x′_+(t_0).

Consider the uniform subdivision Δ : t_0 < t_1 < ··· < t_N = t_0 + δ, i.e., t_j = t_0 + j h_ε, j = 0, 1, ..., N, with h_ε ≤ min{δ_1(ε), δ_1(ε)/M}. Now for a given ε > 0 construct φ_ε : I → R^k as

φ_ε(t) = x_0 for t = t_0, φ_ε(t) = φ_ε(t_j) + (t − t_j) f(t_j, φ_ε(t_j)) for t_j < t ≤ t_{j+1}.

Leonhard Euler, Swiss mathematician, physicist, astronomer, logician, and engineer, 1707–1783.


The graph of φ_ε is a polygonal line called Euler’s polygonal line and we shall see that it approximates, for ε small, the trajectory of the solution of problem (2.4.14) and (2.4.15). For k = 1 Euler’s polygonal line can be visualized in the (t, x) coordinate plane. Consider the family F = {φ_ε; ε > 0}. Let us first show that φ_ε is well defined on I for all ε > 0. On the interval [t_0, t_1], φ_ε(t) = x_0 + (t − t_0) f(t_0, x_0) and

‖φ_ε(t) − x_0‖ ≤ M(t − t_0) ≤ Mδ ≤ b,

so (t, φ_ε(t)) ∈ D. In particular (t_1, φ_ε(t_1)) ∈ D. So on [t_1, t_2], φ_ε(t) = φ_ε(t_1) + (t − t_1) f(t_1, φ_ε(t_1)) is well defined and

‖φ_ε(t) − x_0‖ ≤ ‖φ_ε(t) − φ_ε(t_1)‖ + ‖φ_ε(t_1) − x_0‖ ≤ (t − t_1)M + (t_1 − t_0)M ≤ (t − t_0)M ≤ Mδ ≤ b,

so by induction φ_ε is well defined and continuous on I and

‖φ_ε(t) − x_0‖ ≤ M(t − t_0) ≤ Mδ ≤ b (2.4.16)

for all t ∈ I. Thus ‖φ_ε(t)‖ ≤ ‖φ_ε(t) − x_0‖ + ‖x_0‖ ≤ b + ‖x_0‖. Therefore, F is a bounded subset of C(I; R^k).

In order to apply the Arzelà–Ascoli Theorem, we need to show that F is equicontinuous. If t, s ∈ [t_j, t_{j+1}] then ‖φ_ε(t) − φ_ε(s)‖ ≤ M|t − s|. If t, s are in different intervals, say t ∈ [t_p, t_{p+1}], s ∈ [t_q, t_{q+1}] with p < q, then

‖φ_ε(t) − φ_ε(s)‖ ≤ ‖φ_ε(s) − φ_ε(t_q)‖ + ‖φ_ε(t_q) − φ_ε(t_{q−1})‖ + ··· + ‖φ_ε(t_{p+1}) − φ_ε(t)‖ ≤ M(s − t_q) + M(t_q − t_{q−1}) + ··· + M(t_{p+1} − t) ≤ M(s − t) = M|t − s|,

so F is equicontinuous; in fact it is Lipschitz equicontinuous. Thus by the Arzelà–Ascoli Criterion there is a sequence ε_n → 0+ such that


φ_{ε_n} converges in C(I; R^k) to some φ ∈ C(I; R^k) as n → ∞. Also (see (2.4.16)) ‖φ(t) − x_0‖ ≤ b, so (t, φ(t)) ∈ D for all t ∈ I. Now it simply remains to prove that x = φ(t) is a solution of problem (2.4.14) and (2.4.15). Define

g_{ε_n}(t) = φ′_{ε_n}(t) − f(t, φ_{ε_n}(t)) for t ≠ t_j^n, g_{ε_n}(t) = 0 otherwise,

where {t_j^n} is the subdivision of I corresponding to ε_n. If t_j^n < t < t_{j+1}^n then φ′_{ε_n}(t) = f(t_j^n, φ_{ε_n}(t_j^n)). For t ∈ (t_j^n, t_{j+1}^n), we have |t − t_j^n| ≤ h_{ε_n} ≤ δ_1(ε_n), and

‖φ_{ε_n}(t) − φ_{ε_n}(t_j^n)‖ ≤ M|t − t_j^n| ≤ M h_{ε_n} ≤ δ_1(ε_n).

The final inequality holds by the definition of h_ε. Because f is uniformly continuous,

‖g_{ε_n}(t)‖ ≤ ε_n ∀n, ∀t ∈ I,

so g_{ε_n} converges uniformly to 0. On the other hand, for all t ∈ I,

∫_{t_0}^{t} g_{ε_n}(s) ds = φ_{ε_n}(t) − x_0 − ∫_{t_0}^{t} f(s, φ_{ε_n}(s)) ds. (2.4.17)

Now since φ_{ε_n} converges uniformly to φ on I and f is continuous on D, f(s, φ_{ε_n}(s)) → f(s, φ(s)) uniformly on I as n → ∞. Therefore, passing to the limit in (2.4.17), we get

φ(t) = x_0 + ∫_{t_0}^{t} f(s, φ(s)) ds, t ∈ I,

so x = φ(t) is a solution to the given Cauchy problem (2.4.14) and (2.4.15).

Remark 2.32. There is no guarantee of uniqueness. For example the Cauchy problem

x′(t) = 2√|x(t)|, x(0) = 0,


with a = b = 1, D = [−1, 1] × [−1, 1], f(t, v) = 2√|v|, has the following solutions:

x_1(t) = 0 for −1 ≤ t ≤ 1;
x_2(t) = −t² for −1 ≤ t ≤ 0, x_2(t) = 0 for 0 < t ≤ 1;
x_3(t) = 0 for −1 ≤ t ≤ 0, x_3(t) = t² for 0 < t ≤ 1;
x_4(t) = −t² for −1 ≤ t ≤ 0, x_4(t) = t² for 0 < t ≤ 1.

(The signs are forced: since 2√|x| ≥ 0, every solution is nondecreasing.) Note that all these solutions are defined on the whole interval [−1, 1], even if the existence interval given by Peano’s Theorem is smaller: δ = min{a, b/M} = min{1, 1/2} = 1/2. A solution which is defined on the whole initial interval ([t_0 − a, t_0 + a] in the case of problem (2.4.14) and (2.4.15)) is called a global solution. In particular, the above four solutions are global solutions. In fact, there are infinitely many solutions of the above Cauchy problem (see Exercise 2.28 below). Peano’s Theorem provides only a local solution, i.e., a solution defined on an interval around t_0 which in general is smaller than the initial interval. If f in (2.4.14) is defined on an open set Ω ⊂ R^{k+1} then one can associate with each pair (t_0, x_0) ∈ Ω a box D ⊂ Ω so that Peano’s Theorem gives a local solution to problem (2.4.14) and (2.4.15) defined on an interval which depends on (t_0, x_0).

By requiring additional conditions one can guarantee uniqueness. For example, we get uniqueness if, in addition, f satisfies a Lipschitz condition: ∃L > 0 such that

‖f(t, v_1) − f(t, v_2)‖ ≤ L‖v_1 − v_2‖ (2.4.18)

for all (t, v_1), (t, v_2) ∈ D. Let x = φ(t), y = ψ(t) for t ∈ I = [t_0, t_0 + δ] be two solutions of problem (2.4.14) and (2.4.15). Then

‖φ(t) − ψ(t)‖ ≤ L ∫_{t_0}^{t} ‖φ(s) − ψ(s)‖ ds,

or, equivalently,

d/dt [ e^{−Lt} ∫_{t_0}^{t} ‖φ(s) − ψ(s)‖ ds ] ≤ 0,


for all t ∈ I. It follows easily that φ(t) = ψ(t) for all t ∈ I. Uniqueness on [t_0 − δ, t_0] follows by converting problem (2.4.14) and (2.4.15) on [t_0 − δ, t_0] into a similar Cauchy problem on [0, δ] by using the change τ = t_0 − t. Therefore, we can state the following result.

Theorem 2.33. Under the assumptions of Peano’s Theorem (Theorem 2.31), plus (2.4.18), there exists a unique function x ∈ C^1([t_0 − δ, t_0 + δ]; R^k) satisfying (2.4.14) and (2.4.15), where δ is the same as in Theorem 2.31.

Remark 2.34. Peano’s Theorem is no longer valid in infinite dimensions, i.e., if R^k is replaced by an infinite dimensional Banach space (see [18]).

Euler’s Difference Scheme. If x = φ(t) is unique, then φ_ε → φ in C(I; R^k) as ε → 0+, so the polygonal line corresponding to φ_ε approximates the graph of φ. Let Δ : t_0 < t_1 < ··· < t_N = t_0 + δ with t_j = t_0 + jh and h = δ/N. The points (t_j, φ_ε(t_j)) give us the polygonal line approximation. Denoting φ_j := φ_ε(t_j) we have

φ_{j+1} = φ_j + h f(t_j, φ_j), j = 0, 1, ..., N − 1, φ_0 = x_0.

This is an explicit difference scheme, called Euler’s scheme. Its solution provides the vertices of a polygonal line approximation, so Euler’s scheme is important for the numerical analysis of the solutions of differential equations.
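As an illustration only, Euler’s scheme φ_{j+1} = φ_j + h f(t_j, φ_j) can be sketched in a few lines of Python for the scalar case k = 1. The function names `euler_scheme` and `max_vertex_error` are hypothetical, and the test problem x′ = x, x(0) = 1 (with a = b = 1, hence M = 2 and δ = 1/2, exact solution e^t) is an assumption chosen for this sketch; it does not come from the text.

```python
import math


def euler_scheme(f, t0, x0, delta, n_steps):
    """Vertices (t_j, phi_j) of Euler's polygonal line on [t0, t0 + delta].

    Scalar case (k = 1) only; f is called as f(t, x).
    """
    h = delta / n_steps                      # step size h = delta / N
    ts = [t0 + j * h for j in range(n_steps + 1)]
    phis = [x0]                              # phi_0 = x0
    for j in range(n_steps):
        # phi_{j+1} = phi_j + h * f(t_j, phi_j)
        phis.append(phis[j] + h * f(ts[j], phis[j]))
    return ts, phis


def max_vertex_error(n_steps):
    """Largest gap between the Euler vertices and exp(t).

    Assumed test problem: x' = x, x(0) = 1 on [0, 1/2].
    """
    ts, phis = euler_scheme(lambda t, x: x, 0.0, 1.0, 0.5, n_steps)
    return max(abs(p - math.exp(t)) for t, p in zip(ts, phis))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, max_vertex_error(n))
```

For this Lipschitz right-hand side the solution is unique, so the error at the vertices is expected to shrink as N grows, in line with the convergence φ_ε → φ stated above.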

2.5 The Banach Contraction Principle

We saw in the previous section that under the assumptions of Peano’s Existence Theorem (Theorem 2.31) plus the Lipschitz condition (2.4.18) the Cauchy problem

x′(t) = f(t, x(t)), x(t_0) = x_0 (2.5.19)

has a unique solution x ∈ C^1(I; R^k), where I = [t_0 − δ, t_0 + δ], with δ as defined in the statement of Theorem 2.31. This (existence and uniqueness) result can also be derived by applying the general Banach⁸

Stefan Banach, Polish mathematician, 1892–1945.


Contraction Principle (also known as the Banach Fixed Point Theorem) we present below. Before stating this principle let us explain how problem (2.5.19) can be reduced to a fixed point problem. Note that problem (2.5.19) is equivalent to the integral equation

x(t) = x_0 + ∫_{t_0}^{t} f(s, x(s)) ds. (2.5.20)

Denote X = {v ∈ C(I; R^k); ‖v(t) − x_0‖ ≤ b, t ∈ I}. This is a complete metric space since it is a closed subset of the Banach space C(I; R^k) equipped with the sup-norm, denoted ‖·‖_C, which gives the metric d(u, v) = ‖u − v‖_C. Define on X the map (operator) T by

(Tv)(t) = x_0 + ∫_{t_0}^{t} f(s, v(s)) ds, ∀v ∈ X.

We prefer the notation Tv instead of T(v). It is easily seen that under the assumptions above Tv ∈ X for all v ∈ X, i.e., T : X → X. Equation (2.5.20) can be simply written as

x = Tx, (2.5.21)

so the above Cauchy problem (or Eq. (2.5.20)) reduces to solving Eq. (2.5.21) in X. In other words, the Cauchy problem (2.5.19) has a unique solution x defined on I if and only if T has a unique fixed point x: x = Tx. We do not go into further details concerning the above Cauchy problem, or Eq. (2.5.20), since later on we will address Volterra equations, which are more general. We simply wanted to motivate the Banach Contraction Principle, which is applicable to many other problems.

Theorem 2.35 (Banach Contraction Principle). Let (X, d) be a complete metric space, and assume T : X → X is a contraction, i.e., ∃α ∈ (0, 1) such that d(Tx, Ty) ≤ α d(x, y) for all x, y ∈ X. Then T has a unique fixed point (i.e., ∃! x∗ ∈ X such that Tx∗ = x∗).

Proof. We will use the method of successive approximations. Define a sequence x_n = Tx_{n−1} for n ∈ N with x_0 ∈ X arbitrary. We have by induction

d(x_{n+1}, x_n) ≤ α^n d(x_1, x_0) = α^n d(Tx_0, x_0), ∀n ∈ N. (2.5.22)


We now prove that (x_n) is Cauchy in (X, d):

d(x_{n+p}, x_n) ≤ d(x_{n+p}, x_{n+p−1}) + d(x_{n+p−1}, x_{n+p−2}) + ··· + d(x_{n+1}, x_n),

which by (2.5.22) is

≤ α^n (1 + α + ··· + α^{p−1}) d(Tx_0, x_0) = α^n (1 − α^p)/(1 − α) · d(Tx_0, x_0) ≤ α^n/(1 − α) · d(Tx_0, x_0).

So it is Cauchy in (X, d) (as α^n → 0), and since (X, d) is complete, x_n converges to some x∗ ∈ X: d(x_n, x∗) → 0. Now,

d(x∗, Tx∗) ≤ d(Tx∗, x_n) + d(x_n, x∗) = d(Tx∗, Tx_{n−1}) + d(x_n, x∗) ≤ α d(x∗, x_{n−1}) + d(x_n, x∗),

which converges to 0 as n → ∞, so d(x∗, Tx∗) ≤ 0 and thus x∗ is a fixed point of T. We now wish to show that x∗ is unique. Suppose that y∗ is also a fixed point of T; then d(x∗, y∗) = d(Tx∗, Ty∗) ≤ α d(x∗, y∗), so (1 − α) d(x∗, y∗) ≤ 0, which implies x∗ = y∗.

Remark 2.36. The assumption α < 1 in Theorem 2.35 is essential, as the following counterexample from Natanson⁹ [38, p. 571] shows. If X = R, and T : R → R is given by Tx = x + π/2 − arctan x, then T has no fixed point because π/2 − arctan x > 0 ∀x ∈ R. On the other hand, by the Mean Value Theorem, we have for all x, y ∈ R, x ≠ y,

|Tx − Ty| = |x − y − arctan x + arctan y| = |x − y − (x − y)/(1 + z²)| for some z between x and y

= |x − y| · (1 − 1/(1 + z²))

< |x − y|,

Isidor P. Natanson, Russian mathematician, 1906–1963.


so, even though the inequality is strict, the best possible constant is α = 1 and hence T is not a contraction. Thus, the fact that this T has no fixed point is not surprising.

Remark 2.37. From the above proof we see that

d(x_n, x∗) ≤ α^n/(1 − α) · d(Tx_0, x_0),

which gives us an approximation of x∗.

Remark 2.38. Suppose that T^k = T ∘ ··· ∘ T (k factors), k ≥ 2, is a contraction (even though T may not be); then there is a unique fixed point for T.

Proof. A fixed point of T is obviously a fixed point of T^k. Conversely, if x∗ is a fixed point of T^k (which exists and is unique by Theorem 2.35) then Tx∗ = T^{k+1}x∗ = T^k(Tx∗), so both x∗ and Tx∗ are fixed points of T^k, and consequently Tx∗ = x∗.
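The method of successive approximations and the a priori bound d(x_n, x∗) ≤ α^n/(1 − α) · d(Tx_0, x_0) can be tried numerically. The sketch below is an illustration under assumptions not made in the text: it takes X = [0, 1] with the usual metric and T = cos, which maps [0, 1] into itself and is a contraction there with α = sin 1 < 1 (by the Mean Value Theorem). It also iterates Natanson’s map Tx = x + π/2 − arctan x to show the iterates drifting upward without settling. The helper name `successive_approximations` is hypothetical.

```python
import math


def successive_approximations(T, x0, n_iter):
    """Return [x_0, x_1, ..., x_n] with x_n = T(x_{n-1})."""
    xs = [x0]
    for _ in range(n_iter):
        xs.append(T(xs[-1]))
    return xs


# Assumed example: T = cos on X = [0, 1], contraction constant alpha = sin(1).
alpha = math.sin(1.0)
xs = successive_approximations(math.cos, 1.0, 200)
x_star = xs[-1]                      # numerical stand-in for the fixed point
d0 = abs(math.cos(1.0) - 1.0)        # d(T x_0, x_0)

# A priori estimate from the proof: d(x_n, x*) <= alpha^n / (1 - alpha) * d0.
# The 1e-12 slack only absorbs floating-point rounding.
bound_holds = all(
    abs(xs[n] - x_star) <= alpha**n / (1.0 - alpha) * d0 + 1e-12
    for n in range(50)
)

# Natanson's map: |Tx - Ty| < |x - y| but no alpha < 1 works, and the
# iterates increase at every step, so they never reach a fixed point.
natanson = successive_approximations(
    lambda x: x + math.pi / 2 - math.atan(x), 0.0, 1000
)
strictly_increasing = all(b > a for a, b in zip(natanson, natanson[1:]))
```

Here `x_star` is only the 200th iterate, used as an approximation of x∗; the comparison against the bound is a sanity check, not a proof.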

2.6 Exercises

1. Let A_1, A_2, ... be subsets of a metric space. Prove that Cl(∪_{i=1}^{n} A_i) = ∪_{i=1}^{n} Cl A_i for all n ∈ N and Cl(∪_{i=1}^{∞} A_i) ⊃ ∪_{i=1}^{∞} Cl A_i. Show by an example that the latter inclusion can be proper.

2. Let A be a subset of a metric space. Do A and Cl A always have the same interior? Do A and Int A always have the same closure?

3. Prove that if X ≠ ∅ and d_0 is the discrete metric on X, then any subset of (X, d_0) is open.

4. Let ∅ ≠ A ⊂ (X, d). Prove that p ∈ Cl A ⇔ inf{d(p, x) : x ∈ A} = 0.

5. Let (X, d) be a metric space, ∅ ≠ A ⊂ (X, d), and let (Y, ‖·‖) be a Banach space. Denote BC(A; Y) := {f : (A, d) → (Y, ‖·‖); f continuous and bounded}. Prove that BC(A; Y) is a Banach space with respect to the sup-norm: ‖f‖_sup = sup_{x∈A} ‖f(x)‖.


6. Find the accumulation points of the following subsets of R² (equipped with the Euclidean metric):
(a) Z × Z;
(b) Q × Q;
(c) {(m/n, 1/n); m, n ∈ Z, n ≠ 0};
(d) {(1/n + 1/m, 0); m, n ∈ Z \ {0}}.

7. Find the boundaries of the following sets:
(a) A = [0, 1] ∩ Q;
(b) B = {1/n; n ∈ N};
(c) C = {(x, y) ∈ R²; x² − y² > 1}.

8. Let (X, d) be a linear metric space with d defined by a norm ‖·‖ (i.e., d(x, y) = ‖x − y‖, ∀x, y ∈ X). Prove that the closure of any open ball B(x, r) := {v ∈ X; d(v, x) < r} in (X, d) is the closed ball B̄(x, r) := {v ∈ X; d(v, x) ≤ r}. Show that this property fails to be true if X is equipped with the discrete metric d_0.

9. Show that any Cauchy sequence in a metric space can have at most one cluster point.

10. Find the cluster points of the following sequences:
(a) x_n = sin(2π√(n² + 3n)), n = 1, 2, ...;
(b) y_n = sin(π√(n² + n)), n = 1, 2, ...

11. Show that B := {f ∈ C([0, 1]; R) : f(x) > 0 for all x ∈ [0, 1]} is open in C([0, 1]; R) equipped with the metric generated by the sup-norm. What is the closure of B in this metric (in fact Banach) space?

12. Denote BC(R; R) := {f : R → R; f is continuous and f(R) is bounded}. Let D := {f ∈ BC(R; R); f(x) > 0 for all x ∈ R}. Is D open in BC(R; R) equipped with the sup-norm? If not, what is Int D? What is Cl D?


13. Find an open cover of (0, 1] ⊂ (R, |·|) which has no finite subcover.

14. Find a necessary and sufficient condition for a discrete subset of a metric space (X, d) to be compact. [Recall that S ⊂ (X, d) is discrete if all its elements are isolated.]

15. If A is a nonempty compact subset of a metric space (X, d), then A is separable (i.e., there exists a countable subset S of A such that A = Cl S).

16. Let A, B be nonempty subsets of a normed space (X, ‖·‖) equipped with the topology given by the metric d defined by d(x, y) = ‖x − y‖, x, y ∈ X. We have the following:
(a) If A, B are both compact sets, then A + B := {u + v; u ∈ A, v ∈ B} is compact, too;
(b) If A is closed and B is compact, then A + B is closed, but not necessarily compact (give a counterexample).

17. Let f : [0, ∞) → R, f(x) = sin(π(2x − 1)) for x ∈ [1/2, 1], f(x) = 0 otherwise. Let f_n(x) = f(2^n x) for x ∈ [0, 1], n ∈ N. Show that F = {f_n; n ∈ N} is closed and bounded in C[0, 1] := C([0, 1]; R) equipped with the sup-norm, but not compact.

18. Let (X, d) be a complete metric space. If ∅ ≠ A ⊂ X is a totally bounded set, show that A is relatively compact.

19. Let l^1 be the set of all sequences of real numbers a = (a_n)_{n∈N} satisfying Σ_{n=1}^{∞} |a_n| < ∞. Show that
(a) l^1 is a Banach space over R with respect to the norm ‖a‖ = Σ_{n=1}^{∞} |a_n|, a ∈ l^1;
(b) the set A = {a = (a_n)_{n∈N} ∈ l^1; Σ_{n=1}^{∞} n|a_n| ≤ 1} is compact in (l^1, ‖·‖) (i.e., in (X, d), where X = l^1 and d is the metric generated by ‖·‖: d(a, b) = ‖a − b‖, a, b ∈ l^1).


20. Let F be the set of all functions f : D = [0, 1] → R,

f(x) = Σ_{n=1}^{∞} a_n sin(nπx),

where a = (a_n)_{n∈N} is a sequence in R satisfying Σ_{n=1}^{∞} n|a_n| ≤ 1. Show that F is a compact subset of C[0, 1] := C([0, 1]; R) equipped with the sup-norm. Does the result hold if the domain of the f’s is D = R?

21. Let −∞ < a < b < ∞, u_n ∈ C^1([a, b]; R), n = 1, 2, ..., such that (u_n)_{n∈N} and (u_n′)_{n∈N} are bounded in L^p([a, b]; R), p ∈ (1, ∞), equipped with the usual norm. Show that (u_n) has a subsequence which is convergent in C([a, b]; R) with respect to the sup-norm. (Information on L^p spaces is available in Chap. 3 below.)

22. Let (X, d), (Y, ρ) be metric spaces, and let F ⊂ C(A; Y), where ∅ ≠ A ⊂ X. If A is compact (with respect to d) and F is an equicontinuous family, then F is uniformly equicontinuous.

23. For a ∈ R consider fa : [0, 1] → R, fa (x) = 1+ax2 x2 . Show that F = {f_a; a ∈ R} is relatively compact in C[0, 1] := C([0, 1]; R) equipped with the sup-norm, but not compact.

24. (a) Prove Gronwall’s lemma, namely given

u(t) ≤ a(t) + ∫_{t_0}^{t} b(s)u(s) ds, t ∈ I = [t_0, T],

where u, a, b : I → R are all continuous functions and b ≥ 0, then

u(t) ≤ a(t) + ∫_{t_0}^{t} a(s)b(s) e^{∫_{s}^{t} b(τ)dτ} ds ∀t ∈ I.

In particular, prove Bellman’s lemma, which states that in the case a is a constant function, i.e., a(t) = C ∀t ∈ I, then

u(t) ≤ C e^{∫_{t_0}^{t} b(s)ds}, t ∈ I.

(b) Let x = x(t) : [t_0 − δ, t_0 + δ] → R^k be a solution given by Theorem 2.31 (Peano’s Theorem). Assume (in addition to continuity on D) that f satisfies the Lipschitz condition (2.4.18). Use Bellman’s lemma to prove that x is the unique solution of the corresponding Cauchy problem.


25. Prove that if

(1/2) x(t)² ≤ (1/2) c² + ∫_{t_0}^{t} f(s)x(s) ds ∀t ∈ I = [t_0, T],

where c ∈ R, f, x ∈ C(I) := C(I; R), f(t) ≥ 0 for all t ∈ I, then

|x(t)| ≤ |c| + ∫_{t_0}^{t} f(s) ds ∀t ∈ I.

26. Show that the following Cauchy problem in R

x′(t) = 1 + t² + x(t)²/(1 + x(t)²); x(0) = 0,

has a unique solution defined on R.

27. Do the same for the Cauchy problem

x′(t) = 2e^{−t²} + ln(1 + x(t)²); x(0) = 0.

28. Show that the Cauchy problem x′(t) = 2√|x(t)|, t ∈ R; x(0) = 0, has infinitely many solutions defined on R.

29. Show that for every x_0 ∈ R the Cauchy problem x (t) = 1 + t 1 + x(t)2 , t ≥ 0; x(0) = x_0, has a unique solution defined on a bounded interval.

30. Show that the Cauchy problem x′(t) = t² + x(t)², x(0) = 0, has a solution whose maximal interval is (−T, T), with √2/2 < T < ∞.

31. Let ∅ ≠ Ω ⊂ R^{k+1}, k ≥ 2, be an open set, and let f : Ω → R^k be a continuous function. Then, for any (t_0, x_0) ∈ Ω, the Cauchy problem

(CP) x′(t) = f(t, x(t)), x(t_0) = x_0,


has at least one solution defined on an interval around t_0. If, in addition, f satisfies the condition: ∀ compact K ⊂ Ω, ∃L_K > 0 such that ∀(t, u), (t, v) ∈ K, ‖f(t, u) − f(t, v)‖ ≤ L_K ‖u − v‖, where ‖·‖ is a norm of R^k, then the (local) solution of (CP) is unique.

32. Consider in an interval I ⊂ R the Cauchy problem

x′(t) = A(t)x(t) + b(t), t ∈ I, x(t_0) = x_0,

where t_0 ∈ I, x_0 = (x_{01}, x_{02}, ..., x_{0k})^T ∈ R^k, A(t) = (a_{ij}(t)) is a k × k matrix, and b(t) = (b_1(t), ..., b_k(t))^T with a_{ij}, b_j ∈ C(I) := C(I; R), i, j = 1, 2, ..., k. Show that the above Cauchy problem has a unique solution on the whole interval I.

33. Let T : B̄(0, 1) → B̄(0, 1) be a map satisfying

∀x, y ∈ B̄(0, 1), d_2(Tx, Ty) ≤ d_2(x, y),

where B̄(0, 1) is the closed unit ball of (R^k, d_2), and d_2 is the Euclidean metric. Show that T has at least one fixed point.

34. Prove that for every f ∈ C[0, 1] := C([0, 1]; R) and α ∈ (0, 1) the integral equation

x(t) = f(t) + ∫_{0}^{1} e^{−ts} cos(αx(s)) ds, t ∈ [0, 1],

has a unique solution x ∈ C[0, 1].

35. Let (X, ‖·‖) be a Banach space and let f : [0, ∞) × X → X be a continuous function satisfying

‖f(t, x_1) − f(t, x_2)‖ ≤ a(t)‖x_1 − x_2‖, t ∈ [0, ∞), x_1, x_2 ∈ X,

where a ∈ C([0, ∞); R). Show that the Cauchy problem x′(t) = f(t, x(t)), t ≥ 0; x(0) = x_0, has a unique solution x ∈ C^1([0, ∞); X).

Chapter 3

The Lebesgue Integral and L^p Spaces

In this chapter we discuss Lebesgue¹ measurable sets, Lebesgue measurable functions, Lebesgue integration, and L^p spaces. These spaces, equipped with appropriate norms, are significant examples of Banach spaces.

3.1 Measurable Sets in R^k

Here we essentially follow [46]. First of all, for any closed cube C ⊂ R^k, C = [a_1, b_1] × [a_2, b_2] × ··· × [a_k, b_k], where b_i − a_i = c > 0, i = 1, 2, ..., k, we denote v(C) := c^k (which is called the volume of C). A collection of cubes in R^k is said to be almost disjoint if the interiors of the cubes are disjoint. It is easily seen that every open set D ⊂ R^k (equipped with the usual norm topology) can be written as a countable union of almost disjoint closed cubes: D = ∪_{j=1}^{∞} C_j. To prove this, consider a grid in R^k of closed cubes of side length 1/n, with n sufficiently large, retaining the cubes of the grid that are completely contained in D. Then, we bisect each cube of the above grid that was not retained into 2^k cubes with side length 1/(2n) and

Henri Léon Lebesgue, French mathematician, 1875–1941.

© Springer Nature Switzerland AG 2019 G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_3


retain those new cubes that are contained in D. Thus, repeating indefinitely the procedure, we construct a countable collection of almost disjoint closed cubes whose union equals D, as claimed.

Now, for any set M ⊂ R^k, we define the exterior measure of M by

m_e(M) = inf Σ_{j=1}^{∞} v(C_j),

where the infimum is taken over all countable covers of M, ∪_{j=1}^{∞} C_j ⊃ M, with closed cubes C_j.

Some Remarks on the Exterior Measure

(a) Obviously, the exterior measure of a singleton is zero, and m_e(∅) = 0.

(b) If M_1 ⊂ M_2 ⊂ R^k, then m_e(M_1) ≤ m_e(M_2).

(c) If C is a closed cube in R^k, then m_e(C) = v(C). Indeed, we clearly have m_e(C) ≤ v(C), and in order to prove the converse inequality it suffices to show that for any cover by closed cubes ∪_{j=1}^{∞} C_j ⊃ C, we have

v(C) ≤ Σ_{j=1}^{∞} v(C_j). (3.1.1)

Let ε > 0 be arbitrary but fixed. Choose for each j an open cube C′_j ⊃ C_j such that v(C′_j) ≤ (1 + ε)v(C_j). Since {C′_j}_{j=1}^{∞} is an open cover of the compact set C, there exists a finite subcover {C′_{j_1}, ..., C′_{j_m}}, C ⊂ ∪_{i=1}^{m} C′_{j_i}. It follows that

v(C) ≤ (1 + ε) Σ_{i=1}^{m} v(C_{j_i}) ≤ (1 + ε) Σ_{j=1}^{∞} v(C_j).

As ε was arbitrarily chosen, this implies (3.1.1).

(d) If C is an open cube in R^k, then m_e(C) = v(C̄).

(e) If M = ∪_{j=1}^{∞} M_j, then

m_e(M) ≤ Σ_{j=1}^{∞} m_e(M_j). (3.1.2)


We can assume m_e(M_j) < ∞ for all j ∈ N, otherwise the inequality is trivially satisfied. For arbitrary ε > 0 we can choose for each j a cover by closed cubes M_j ⊂ ∪_{q=1}^{∞} C_{j,q} such that

Σ_{q=1}^{∞} v(C_{j,q}) < m_e(M_j) + ε/2^j.

Then, M ⊂ ∪_{j,q=1}^{∞} C_{j,q}, hence

m_e(M) ≤ Σ_{j,q=1}^{∞} v(C_{j,q}) ≤ Σ_{j=1}^{∞} (m_e(M_j) + ε/2^j) = Σ_{j=1}^{∞} m_e(M_j) + ε,

which implies (3.1.2).

(f) For every M ⊂ R^k, we have m_e(M) = inf{m_e(D); D open, D ⊃ M}. Clearly, m_e(M) ≤ inf{m_e(D); D open, D ⊃ M}. For the converse inequality, let ε > 0 and choose a cover of M by closed cubes, M ⊂ ∪_{j=1}^{∞} C_j, such that

Σ_{j=1}^{∞} v(C_j) < m_e(M) + ε/2.

Choose for every j an open cube C′_j such that C_j ⊂ C′_j and

v(C′_j) ≤ v(C_j) + ε/2^{j+1}.


Then, denoting D′ = ∪_{j=1}^{∞} C′_j, we have that D′ is an open set containing M and, by (e),

m_e(D′) ≤ Σ_{j=1}^{∞} m_e(C′_j) = Σ_{j=1}^{∞} v(C′_j) ≤ Σ_{j=1}^{∞} (v(C_j) + ε/2^{j+1}) = Σ_{j=1}^{∞} v(C_j) + ε/2 < m_e(M) + ε.

Hence, inf{m_e(D); D open, D ⊃ M} ≤ m_e(M), as claimed.

(g) If M is a countable union of almost disjoint closed cubes, M = ∪_{j=1}^{∞} C_j, then m_e(M) = Σ_{j=1}^{∞} v(C_j). Indeed, by (c) and (e), m_e(M) ≤ Σ_{j=1}^{∞} v(C_j), and for the converse inequality we consider, for a fixed m ∈ N and an arbitrary but fixed ε > 0, closed cubes C̃_j ⊂ Int(C_j), j = 1, ..., m, such that

v(C_j) < v(C̃_j) + ε/2^j, j = 1, ..., m.

Then,

m_e(M) ≥ m_e(∪_{j=1}^{m} C̃_j) = Σ_{j=1}^{m} v(C̃_j) ≥ Σ_{j=1}^{m} v(C_j) − ε,

which implies m_e(M) ≥ Σ_{j=1}^{∞} v(C_j).

Definition 3.1. A set M ⊂ R^k is Lebesgue measurable (or simply measurable) if for every ε > 0 there exists an open set D such that D ⊃ M and m_e(D \ M) < ε. If M is measurable, we define the Lebesgue measure (or measure) of M by m(M) := m_e(M).

Some Properties of Measurable Sets

(A) It follows from the above definition that every open set is measurable.


(B) If m_e(M) = 0, then M is measurable and m(M) = 0. Indeed, we know (see (f) above) that 0 = m_e(M) = inf{m_e(D); D open, D ⊃ M}, so for any ε > 0 there exists an open set D_ε such that D_ε ⊃ M and m_e(D_ε) < ε. As D_ε \ M ⊂ D_ε, we have m_e(D_ε \ M) < ε.

(C) If M = ∪_{j=1}^{∞} M_j, where each M_j is measurable, then M is measurable. Indeed, for a given ε > 0, we can choose for each j an open set D_j, D_j ⊃ M_j, such that m_e(D_j \ M_j) < ε/2^j. Hence D = ∪_{j=1}^{∞} D_j is open, D ⊃ M and

D \ M ⊂ ∪_{j=1}^{∞} (D_j \ M_j) ⟹ m_e(D \ M) ≤ Σ_{j=1}^{∞} m_e(D_j \ M_j) < ε.

(D) If K ⊂ R^k is a compact set, then K is measurable. Since K is compact, hence bounded, we have m_e(K) < ∞. For any ε > 0 there exists an open set D, D ⊃ K, such that m_e(D) < m_e(K) + ε/2 (cf. (f)). The open set D \ K can be written as a countable union of almost disjoint closed cubes: D \ K = ∪_{j=1}^{∞} C_j. Now, for a given p ∈ N, K_1 = ∪_{j=1}^{p} C_j is a compact set with K_1 ∩ K = ∅, K ∪ K_1 ⊂ D, and

m_e(D) ≥ m_e(K ∪ K_1) = m_e(K) + m_e(K_1) = m_e(K) + Σ_{j=1}^{p} v(C_j),

which implies that

Σ_{j=1}^{p} v(C_j) ≤ m_e(D) − m_e(K) < ε/2,

hence

m_e(D \ K) ≤ m_e(∪_{j=1}^{∞} C_j) ≤ Σ_{j=1}^{∞} m_e(C_j) = Σ_{j=1}^{∞} v(C_j) ≤ ε/2 < ε,

so K is indeed measurable. It follows that


(D1) any closed set F ⊂ R^k is measurable. Indeed, F can be written as a countable union of compact sets, F = ∪_{n=1}^{∞} (F ∩ B̄(0, n)), where B̄(0, n) denotes the closed ball of radius n centered at the origin, so the assertion follows from (C) and (D).

(E) If M ⊂ R^k is measurable, then R^k \ M is also measurable. To prove this, observe first that for all n ∈ N there exists an open set D_n such that M ⊂ D_n and m_e(D_n \ M) < 1/n. Since R^k \ D_n is a closed set, it is measurable, hence E := ∪_{n=1}^{∞} (R^k \ D_n) is also measurable (cf. (C)). We have E ⊂ R^k \ M and R^k \ (M ∪ E) ⊂ D_n \ M, hence m_e(R^k \ (M ∪ E)) < 1/n for all n ∈ N. Therefore m_e(R^k \ (M ∪ E)) = 0, so R^k \ (M ∪ E) is measurable (cf. (B)). Since R^k \ M = [R^k \ (M ∪ E)] ∪ E, we conclude by (C) that R^k \ M is measurable, as claimed.

(F) Any countable intersection of measurable sets is also a measurable set. This follows easily from ∩_{j=1}^{∞} M_j = R^k \ [∪_{j=1}^{∞} (R^k \ M_j)] (see also (C) and (E)).

Now let us state an important result related to measurable sets:

Theorem 3.2. If {M_n}_{n=1}^{∞} is any collection of disjoint measurable sets, then m(∪_{n=1}^{∞} M_n) = Σ_{n=1}^{∞} m(M_n).

Proof. In a first stage, we assume that each M_n is bounded. Let ε > 0 be arbitrary but fixed. Since R^k \ M_n is measurable, for any n ∈ N there exists a closed set F_n ⊂ M_n such that m_e(M_n \ F_n) < ε/2^n. For each fixed p ∈ N, F_1, ..., F_p are compact and disjoint, and, denoting M = ∪_{n=1}^{∞} M_n, we have

m(M) ≥ m(∪_{n=1}^{p} F_n) = Σ_{n=1}^{p} m(F_n) ≥ Σ_{n=1}^{p} m(M_n) − ε,

which implies, since ε and p are arbitrary, m(M) ≥ Σ_{n=1}^{∞} m(M_n). This concludes the proof in the case when each M_n is bounded, since the converse inequality is also satisfied (see (e)). In the general case, we consider the closed


cubes C_i centered at the origin with side length i ∈ N and define M_{n,1} = M_n ∩ C_1, M_{n,i} = M_n ∩ (C_i \ C_{i−1}), i = 2, 3, ... Then M_n = ∪_i M_{n,i}, M = ∪_{n,i} M_{n,i}, so, as each M_{n,i} is bounded, we can use what we obtained above to write

m(M) = Σ_{n,i} m(M_{n,i}) = Σ_n Σ_i m(M_{n,i}) = Σ_n m(M_n).

Remark 3.3. There are subsets of R^k which are not Lebesgue measurable. See, for example, [46, p. 24].

Remark 3.4. Denote by A the collection of all measurable subsets of R^k. According to the usual terminology, as ∅ ∈ A and (E) and (C) hold, A is a σ-algebra on R^k. As the Lebesgue measure m is a nonnegative function on A satisfying m(∅) = 0 and Theorem 3.2, the triple (R^k, A, m) is a measure space. This definition of a measure space can also be used for sets other than R^k. In particular, if Ω ⊂ R^k is a Lebesgue measurable set and we define B = {B ∩ Ω; B ∈ A}, then (Ω, B, m) is a measure space (where m is the restriction to B of the Lebesgue measure defined above).

3.2 Measurable Functions

In what follows we consider the measure space (R^k, A, m) defined in the previous section. Note that similar considerations apply to any other measure space. Assume that R = R^1 is equipped with the usual topology.

Definition 3.5. A function f : R^k → R is called measurable if for all λ ∈ R the set {f > λ} := {x ∈ R^k; f(x) > λ} is measurable (i.e., it belongs to A).

Remark 3.6. Equivalent definitions are obtained if the set {f > λ} is replaced by {f ≥ λ}, {f < λ}, or {f ≤ λ}, λ ∈ R. Indeed, if {f > λ} is measurable for all λ ∈ R then so is {f ≥ λ} = ∩_{n=1}^{∞} {f > λ − 1/n} ∀λ ∈ R, hence so is {f < λ} = R^k \ {f ≥ λ} ∀λ ∈ R, and so on (the other implications are trivially satisfied).


Theorem 3.7. f : R^k → R is measurable if and only if for every open set D ⊂ R the set f^{−1}(D) := {x ∈ R^k; f(x) ∈ D} is measurable.

Proof. The set D = (λ, ∞) is open for any λ ∈ R. If f^{−1}(D) is assumed to be measurable for every open set D ⊂ R, then f is measurable, since f^{−1}((λ, ∞)) = {f > λ}. Conversely, let us assume that f is measurable. If ∅ ≠ D ⊂ R is an open set then it can be represented as a countable union of disjoint open intervals. Indeed, for x ∈ D denote by I(x) the maximal open interval containing x and included in D. If x, y are distinct points in D, then I(x), I(y) either coincide or are disjoint. Obviously, D = ∪_{x∈D} I(x). Since each I(x) contains a rational number, there are at most countably many distinct intervals I(x), so D = ∪_{n=1}^{∞} I_n. Since f is measurable, we have f^{−1}(I_n) ∈ A for all n ∈ N, which implies f^{−1}(D) = ∪_{n=1}^{∞} f^{−1}(I_n) ∈ A.

We say that a property (P) holds almost everywhere (abbreviated a.e.) in Ω ⊂ R^k if it holds in Ω \ E with m(E) = 0; in other words, (P) holds for almost all (abbreviated a.a.) x ∈ Ω.

Theorem 3.8. Let f, g : R^k → R. If f is measurable and g = f a.e., then g is also measurable.

Proof. Denote E = {g ≠ f}. We have for any λ ∈ R, {g > λ} ∪ E = {f > λ} ∪ E ∈ A, hence G := {g > λ} ∪ E ∈ A. Since {g > λ} differs from G by a set of measure zero, it follows that {g > λ} ∈ A.

Observe that equality a.e. is an equivalence relation in the set of all measurable functions.

Theorem 3.9. If f : R^k → R is measurable and g : R → R is continuous, then g ∘ f is measurable.

Proof. As g is a continuous function, for any open set D ⊂ R, g^{−1}(D) is open, too. Hence, as f is measurable, we conclude that (g ∘ f)^{−1}(D) = f^{−1}(g^{−1}(D)) is measurable for any open set D ⊂ R.

Remark 3.10. It follows from the above result that, if f is measurable, then so are the functions λf (λ ∈ R), |f|^p (p > 0), f⁺ = max{f, 0}, f⁻ = − min{f, 0}, etc.


Theorem 3.11. If f, g are measurable, then so are f + g and fg. If, in addition, g ≠ 0 a.e., then f/g is measurable.

Proof. For any λ ∈ R we have

{f + g > λ} = ∪_{q∈Q} {f > q > λ − g} = ∪_{q∈Q} ({f > q} ∩ {g > λ − q}),

where Q is the set of rational numbers. It follows that f + g is measurable. The function fg is also measurable since

fg = (1/4)[(f + g)² − (f − g)²].

In order to prove the last statement, it suffices to prove that 1/g is measurable. This follows from {1/g > λ} = ({g > 0} ∩ {λg < 1}) ∪ ({g < 0} ∩ {λg > 1}).

Theorem 3.12. If (f_n)_{n∈N} is a sequence of measurable functions, then all of sup_{n∈N} f_n, inf_{n∈N} f_n, lim sup_{n→∞} f_n, and lim inf_{n→∞} f_n are measurable. In particular, if f_n → f a.e. then f is measurable.

Proof. For any λ ∈ R we have {sup_{n∈N} f_n > λ} = ∪_{n∈N} {f_n > λ}, which implies that sup_{n∈N} f_n is measurable. The function inf_{n∈N} f_n is also measurable since it is equal to − sup_{n∈N} (−f_n). The other statements follow from

lim sup_{n→∞} f_n = inf_i {sup_{n≥i} f_n}, lim inf_{n→∞} f_n = sup_i {inf_{n≥i} f_n},

which coincide a.e. with f = lim f_n when f_n → f a.e.

Now, let us recall the definition of the characteristic function of a set E, denoted χ_E,

χ_E(x) = 1 if x ∈ E, χ_E(x) = 0 if x ∉ E.

Let E ⊂ R^k. It is easily seen that χ_E is measurable if and only if E is measurable.

Definition 3.13. A function f : R^k → R is called a simple function if it has the form

f(x) = Σ_{i=1}^{p} y_i χ_{M_i}(x), (3.2.3)

where p ∈ N, y_i ∈ R, i = 1, ..., p, and the M_i’s are disjoint, measurable subsets of R^k, with m(M_i) < ∞, i = 1, ..., p.


Any simple function is measurable, as a finite linear combination of characteristic functions of measurable sets. Normally, in the above definition $y_1, \dots, y_p$ are distinct numbers.

Theorem 3.14. If $f : \mathbb{R}^k \to \mathbb{R}$ is a measurable function, then there exists a sequence of simple functions $(f_n)_{n \in \mathbb{N}}$ such that
\[ |f_n(x)| \le |f_{n+1}(x)|, \quad x \in \mathbb{R}^k,\ n = 1, 2, \dots \tag{3.2.4} \]
and
\[ \lim_{n \to \infty} f_n(x) = f(x), \quad x \in \mathbb{R}^k. \tag{3.2.5} \]
If, in addition, $f \ge 0$, then one can find $f_n \ge 0$, $n = 1, 2, \dots$

Proof. We assume first that f is a nonnegative measurable function. For a given $n \in \mathbb{N}$, define the following subsets of $\mathbb{R}^k$
\[ M_j = \Big\{ \frac{j-1}{2^n} \le f < \frac{j}{2^n} \Big\}, \quad j = 1, 2, \dots, n2^n, \quad \text{and} \quad P_n = \{f \ge n\}, \]
which are all measurable. Let
\[ g_n(x) = \sum_{j=1}^{n2^n} \frac{j-1}{2^n}\, \chi_{M_j}(x) + n \chi_{P_n}(x), \quad x \in \mathbb{R}^k,\ n = 1, 2, \dots \]
It is easily seen that $0 \le g_n \le g_{n+1}$ and $0 \le f(x) - g_n(x) \le 1/2^n$ whenever $f(x) \le n$, hence $g_n \to f$. Thus, the sequence $(g_n)$ satisfies all the properties for a sequence $(f_n)$ mentioned in the statement of the theorem, except for $m(P_n) < \infty$, $m(M_j) < \infty$ for all $n \in \mathbb{N}$, $j = 1, 2, \dots, n2^n$ (see Definition 3.13). This inconvenience can be easily removed as follows. For any $n \in \mathbb{N}$, consider the closed cube $C_n$ centered at the origin with side length n and define
\[ f_n(x) = g_n(x) \chi_{C_n}(x) = \sum_{j=1}^{n2^n} \frac{j-1}{2^n}\, \chi_{M_j \cap C_n}(x) + n \chi_{P_n \cap C_n}(x), \quad x \in \mathbb{R}^k. \]

It is easily seen that $(f_n)$ satisfies all the desired properties, including $0 \le f_n(x) \le f_{n+1}(x)$, $x \in \mathbb{R}^k$, $n = 1, 2, \dots$ For a general measurable function f one can use the decomposition $f = f^+ - f^-$, which implies $|f| = f^+ + f^-$. Since $f^+$ and $f^-$ are both measurable and nonnegative, it follows from the proof above that there exist sequences $(f_n^+)$ and $(f_n^-)$ that satisfy the properties mentioned above and approximate $f^+$ and $f^-$, respectively. Then $(f_n = f_n^+ - f_n^-)$ is a sequence of simple functions satisfying (3.2.4) and (3.2.5).

Remark 3.15. Taking into account Theorems 3.12 and 3.14, one can say that a function $f : \mathbb{R}^k \to \mathbb{R}$ is measurable (in the sense of Definition 3.5) if and only if f is the limit of a sequence of simple functions $(f_n)$, i.e., $f_n(x) \to f(x)$, as $n \to \infty$, for a.a. $x \in \mathbb{R}^k$. This equivalent condition can be used to define the notion of an X-valued measurable function, where X is a Banach space.
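The dyadic construction in the proof of Theorem 3.14 can be tried out numerically. The following is a minimal sketch, assuming k = 1 and a sample function f(x) = x² chosen only for illustration; the helper names `dyadic_approx` and `cube_cutoff` are hypothetical and not part of the text. It evaluates $g_n$ pointwise and checks the two properties used in the proof on a finite set of sample points.

```python
import math

def dyadic_approx(f, n):
    """Return g_n from the proof of Theorem 3.14 for a nonnegative function f."""
    def g_n(x):
        y = f(x)
        if y >= n:                       # x lies in P_n = {f >= n}
            return float(n)
        # x lies in M_j with (j-1)/2^n <= f(x) < j/2^n, where g_n equals (j-1)/2^n
        return math.floor(y * 2**n) / 2**n
    return g_n

def cube_cutoff(g, n):
    """Return f_n = g_n * chi_{C_n} for k = 1, where C_n = [-n/2, n/2]."""
    return lambda x: g(x) if abs(x) <= n / 2 else 0.0

f = lambda x: x * x                      # sample nonnegative function (an assumption)
xs = [i / 10 for i in range(-30, 31)]    # sample points in [-3, 3]

for n in range(1, 8):
    g_n, g_next = dyadic_approx(f, n), dyadic_approx(f, n + 1)
    for x in xs:
        assert 0.0 <= g_n(x) <= g_next(x) <= f(x)   # increasing, and below f
        if f(x) <= n:
            assert f(x) - g_n(x) <= 2.0**-n         # error at most 1/2^n
```

A check on finitely many points is of course no proof; it only illustrates why the estimate $0 \le f - g_n \le 1/2^n$ holds on $\{f \le n\}$.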

3.3 The Lebesgue Integral

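If $f : \mathbb{R}^k \to \mathbb{R}$ is a simple function as in (3.2.3), the Lebesgue integral of f is defined by
\[ \int_{\mathbb{R}^k} f(x)\,dx := \sum_{i=1}^{p} m(M_i) \cdot y_i. \tag{3.3.6} \]
If Ω is a measurable subset of $\mathbb{R}^k$ then $g = f\chi_\Omega$ is also a simple function and we define
\[ \int_{\Omega} f(x)\,dx := \int_{\mathbb{R}^k} f(x)\chi_\Omega(x)\,dx. \]
Denote by S the set of all simple functions $f : \mathbb{R}^k \to \mathbb{R}$. It is easily seen that S is a linear space over $\mathbb{R}$ with respect to the usual operations: addition of functions and scalar multiplication. We have the following statements:

• $\int_{\mathbb{R}^k} (\alpha f + \beta g)\,dx = \alpha \int_{\mathbb{R}^k} f\,dx + \beta \int_{\mathbb{R}^k} g\,dx$ for all $f, g \in S$, $\alpha, \beta \in \mathbb{R}$;

• $f, g \in S$, $f \le g \implies \int_{\mathbb{R}^k} f\,dx \le \int_{\mathbb{R}^k} g\,dx$;

• If $\Omega_1, \Omega_2 \subset \mathbb{R}^k$ are disjoint measurable sets with $m(\Omega_i) < \infty$, i = 1, 2, then
\[ \int_{\Omega_1 \cup \Omega_2} f\,dx = \int_{\Omega_1} f\,dx + \int_{\Omega_2} f\,dx; \]

• If $f \in S$, then so is $|f|$ and
\[ \Big| \int_{\mathbb{R}^k} f\,dx \Big| \le \int_{\mathbb{R}^k} |f|\,dx. \]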


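The proofs are easy and are left to the reader.

In what follows we are concerned with the Lebesgue integration of nonnegative measurable functions. Denote by $S^+$ the set of all nonnegative simple functions $f : \mathbb{R}^k \to \mathbb{R}$ (i.e., functions of the form (3.2.3), where each $y_i \ge 0$).

Definition 3.16. A nonnegative measurable function $f : \mathbb{R}^k \to \mathbb{R}$ is called integrable in the sense of Lebesgue (or simply integrable) if
\[ \sup\Big\{ \int_{\mathbb{R}^k} s\,dx;\ s \in S^+,\ s \le f \Big\} < +\infty, \]
and in this case we denote
\[ \int_{\mathbb{R}^k} f\,dx := \sup\Big\{ \int_{\mathbb{R}^k} s\,dx;\ s \in S^+,\ s \le f \Big\}. \]
If $\sup\{ \int_{\mathbb{R}^k} s\,dx;\ s \in S^+,\ s \le f \} = \infty$, we write $\int_{\mathbb{R}^k} f\,dx = \infty$.

Note that if f is a nonnegative simple function, i.e., a function of the form (3.2.3) with $y_i \ge 0$, $i = 1, \dots, p$, then using this definition we reobtain $\int_{\mathbb{R}^k} f(x)\,dx = \sum_{i=1}^{p} m(M_i) \cdot y_i$.

If $f : \mathbb{R}^k \to \mathbb{R}$ is a nonnegative integrable function and $\Omega \subset \mathbb{R}^k$ is a measurable set, then $\int_\Omega f\,dx := \int_{\mathbb{R}^k} f\chi_\Omega\,dx$.

We have the following immediate statements for $f, g : \mathbb{R}^k \to \mathbb{R}$ nonnegative measurable functions:

• $f \le g \implies \int_{\mathbb{R}^k} f\,dx \le \int_{\mathbb{R}^k} g\,dx$;

• If $\Omega_1, \Omega_2 \subset \mathbb{R}^k$ are measurable sets with $\Omega_1 \subset \Omega_2$, then $\int_{\Omega_1} f\,dx \le \int_{\Omega_2} f\,dx$.

We also have:

• If $f : \mathbb{R}^k \to \mathbb{R}$ is a nonnegative measurable function, then: f = 0 a.e. if and only if $\int_{\mathbb{R}^k} f\,dx = 0$.

Proof. Observe first that if f = 0 a.e., then for any $s \in S^+$ with $s \le f$ we have s = 0 a.e., so $\int_{\mathbb{R}^k} s\,dx = 0$. Therefore $\int_{\mathbb{R}^k} f\,dx = 0$. Conversely, let us assume that $\int_{\mathbb{R}^k} f\,dx = 0$. Define $\Omega_n = \{x \in \mathbb{R}^k;\ f(x) \ge 1/n\}$, $n \in \mathbb{N}$. We have for all $n \in \mathbb{N}$
\[ 0 = \int_{\mathbb{R}^k} f\,dx \ge \int_{\mathbb{R}^k} \frac{1}{n} \chi_{\Omega_n}\,dx = \frac{1}{n}\, m(\Omega_n). \]
So $m(\Omega_n) = 0$ for all $n \in \mathbb{N}$ $\implies$ $m(\{f > 0\}) = m\big(\bigcup_{n=1}^{\infty} \Omega_n\big) = 0$ $\implies$ f = 0 almost everywhere.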


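Let us now state the so-called Monotone Convergence Theorem or Beppo Levi's theorem.²

² Beppo Levi, Italian mathematician, 1875–1961.

Theorem 3.17 (Monotone Convergence Theorem). Let $0 \le f_1 \le f_2 \le \dots \le f_n \le \dots$ be a sequence of measurable functions. Denote $f(x) := \lim_{n \to \infty} f_n(x)$. Then
\[ \lim_{n \to \infty} \int_{\mathbb{R}^k} f_n\,dx = \int_{\mathbb{R}^k} f\,dx. \]

Proof. Obviously, the limit on the left-hand side exists and
\[ \lim_{n \to \infty} \int_{\mathbb{R}^k} f_n\,dx \le \int_{\mathbb{R}^k} f\,dx. \]
In order to prove the converse inequality, let $s \in S^+$, $s \le f$, and let $\varepsilon \in (0, 1)$. Define $M_n = \{x \in \mathbb{R}^k;\ f_n(x) \ge \varepsilon s(x)\}$, $n \in \mathbb{N}$. We have $\mathbb{R}^k = \bigcup_{n=1}^{\infty} M_n$. Indeed, if $x \in \mathbb{R}^k$ and f(x) = 0, then s(x) = 0, so $x \in M_1$. If f(x) > 0, then $f(x) > \varepsilon s(x)$, hence $x \in M_n$ for n large enough. Next,
\[ \int_{\mathbb{R}^k} f_n\,dx \ge \int_{M_n} f_n\,dx \ge \varepsilon \int_{M_n} s\,dx. \]
Since $M_n \subset M_{n+1}$ for all $n \in \mathbb{N}$, the last inequality implies
\[ \lim_{n \to \infty} \int_{\mathbb{R}^k} f_n\,dx \ge \varepsilon \int_{\mathbb{R}^k} s\,dx, \]
hence, as $\varepsilon \in (0, 1)$ was arbitrary,
\[ \lim_{n \to \infty} \int_{\mathbb{R}^k} f_n\,dx \ge \int_{\mathbb{R}^k} s\,dx \quad \forall s \in S^+,\ s \le f. \]
This implies
\[ \lim_{n \to \infty} \int_{\mathbb{R}^k} f_n\,dx \ge \int_{\mathbb{R}^k} f\,dx, \]
as claimed.

Remark 3.18. Combining Theorems 3.14 and 3.17, we infer that for any nonnegative integrable function $f : \mathbb{R}^k \to \mathbb{R}$, there exists an increasing sequence $(s_n)_{n \in \mathbb{N}}$ in $S^+$ such that $s_n \to f$ pointwise (or a.e.) and $\int_{\mathbb{R}^k} s_n\,dx \to \int_{\mathbb{R}^k} f\,dx$. Using this observation, one can readily deduce that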


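• if $f, g : \mathbb{R}^k \to \mathbb{R}$ are nonnegative integrable functions, then so is $f + g$ and
\[ \int_{\mathbb{R}^k} (f + g)\,dx = \int_{\mathbb{R}^k} f\,dx + \int_{\mathbb{R}^k} g\,dx. \]

We also have

• $\int_{\mathbb{R}^k} \alpha f\,dx = \alpha \int_{\mathbb{R}^k} f\,dx$ for all $\alpha \ge 0$.

The next result is known as Fatou's lemma.³

Theorem 3.19. Let $f_n : \mathbb{R}^k \to \mathbb{R}$ be a sequence of nonnegative measurable functions. Set $f = \liminf_{n \to \infty} f_n$. Then,
\[ \int_{\mathbb{R}^k} f\,dx \le \liminf_{n \to \infty} \int_{\mathbb{R}^k} f_n\,dx. \tag{3.3.7} \]

Proof. Denote $g_n = \inf_{m \ge n} f_m$, $n \in \mathbb{N}$. Since $(g_n)$ is an increasing sequence, we have
\[ f = \sup_{n \in \mathbb{N}} g_n = \lim_{n \to \infty} g_n. \]
By the Monotone Convergence Theorem we have
\[ \lim_{n \to \infty} \int_{\mathbb{R}^k} g_n\,dx = \int_{\mathbb{R}^k} f\,dx. \tag{3.3.8} \]
On the other hand, since $g_n \le f_n$, $n \in \mathbb{N}$, we have
\[ \int_{\mathbb{R}^k} g_n\,dx \le \int_{\mathbb{R}^k} f_n\,dx, \quad n \in \mathbb{N}. \tag{3.3.9} \]
Combining (3.3.8) and (3.3.9) yields (3.3.7).

Now, we are going to define the Lebesgue integral for a general measurable function $f : \mathbb{R}^k \to \mathbb{R}$. One can use the decomposition $f = f^+ - f^-$. Obviously, f is measurable if and only if both $f^+$ and $f^-$ are measurable.

Definition 3.20. A measurable function $f : \mathbb{R}^k \to \mathbb{R}$ is called integrable if both $f^+$ and $f^-$ are integrable, and in this case
\[ \int_{\mathbb{R}^k} f\,dx := \int_{\mathbb{R}^k} f^+\,dx - \int_{\mathbb{R}^k} f^-\,dx. \]

³ Pierre Joseph Louis Fatou, French mathematician, 1878–1929.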


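Denote by $L(\mathbb{R}^k)$ the set of all (measurable and) integrable functions $f : \mathbb{R}^k \to \mathbb{R}$. One can prove by elementary arguments the following statements:

• If $f : \mathbb{R}^k \to \mathbb{R}$ is measurable, then so is $|f|$ and $f \in L(\mathbb{R}^k) \iff |f| \in L(\mathbb{R}^k)$;

• If $f, g : \mathbb{R}^k \to \mathbb{R}$ are measurable, $g \in L(\mathbb{R}^k)$ and $|f| \le g$, then $f \in L(\mathbb{R}^k)$;

• If $f \in L(\mathbb{R}^k)$ and $\alpha \in \mathbb{R}$, then $\alpha f \in L(\mathbb{R}^k)$ and
\[ \int_{\mathbb{R}^k} \alpha f\,dx = \alpha \int_{\mathbb{R}^k} f\,dx. \]

We also have

• If $f, g \in L(\mathbb{R}^k)$, then $f + g \in L(\mathbb{R}^k)$ and
\[ \int_{\mathbb{R}^k} (f + g)\,dx = \int_{\mathbb{R}^k} f\,dx + \int_{\mathbb{R}^k} g\,dx. \]

Proof. Assume $f, g \in L(\mathbb{R}^k)$. Then $f^+, f^-, g^+, g^-, f + g, (f + g)^+, (f + g)^-$ are measurable, and $f^+, f^-, g^+, g^- \in L(\mathbb{R}^k)$. From $(f + g)^+ \le f^+ + g^+$ and $(f + g)^- \le f^- + g^-$ we infer that $(f + g)^+, (f + g)^- \in L(\mathbb{R}^k)$, which implies $f + g \in L(\mathbb{R}^k)$. On the other hand,
\[ (f + g)^+ - (f + g)^- = f + g = f^+ - f^- + g^+ - g^-, \]
so
\[ (f + g)^+ + f^- + g^- = (f + g)^- + f^+ + g^+, \]
which involves only nonnegative integrable functions. Hence,
\[ \int_{\mathbb{R}^k} (f + g)^+\,dx + \int_{\mathbb{R}^k} f^-\,dx + \int_{\mathbb{R}^k} g^-\,dx = \int_{\mathbb{R}^k} (f + g)^-\,dx + \int_{\mathbb{R}^k} f^+\,dx + \int_{\mathbb{R}^k} g^+\,dx, \]
which gives the desired equality.

• Let $f, g : \mathbb{R}^k \to \mathbb{R}$ be such that $f \in L(\mathbb{R}^k)$ and g = f a.e. Then, $g \in L(\mathbb{R}^k)$ and $\int_{\mathbb{R}^k} g\,dx = \int_{\mathbb{R}^k} f\,dx$.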


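Proof. From g = f a.e. we derive $g^+ = f^+ \ge 0$ a.e. and $g^- = f^- \ge 0$ a.e., so
\[ \int_{\mathbb{R}^k} g^+\,dx = \int_{\mathbb{R}^k} f^+\,dx, \qquad \int_{\mathbb{R}^k} g^-\,dx = \int_{\mathbb{R}^k} f^-\,dx, \]
and the result follows.

• If $f, g \in L(\mathbb{R}^k)$ and $f \le g$ a.e., then
\[ \int_{\mathbb{R}^k} f\,dx \le \int_{\mathbb{R}^k} g\,dx. \]
The proof is easy.

• For every $f \in L(\mathbb{R}^k)$ we have
\[ \Big| \int_{\mathbb{R}^k} f\,dx \Big| \le \int_{\mathbb{R}^k} |f|\,dx. \]

Proof. We know that $f \in L(\mathbb{R}^k) \Rightarrow |f| \in L(\mathbb{R}^k)$. We have
\[ \int_{\mathbb{R}^k} f\,dx = \int_{\mathbb{R}^k} f^+\,dx - \int_{\mathbb{R}^k} f^-\,dx \le \int_{\mathbb{R}^k} f^+\,dx + \int_{\mathbb{R}^k} f^-\,dx = \int_{\mathbb{R}^k} |f|\,dx. \]
Similarly,
\[ -\int_{\mathbb{R}^k} f\,dx \le \int_{\mathbb{R}^k} |f|\,dx, \]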

so the result follows.

Theorem 3.21. Let $f \in L(\mathbb{R}^k)$. Then, for every ε > 0 there exists δ > 0, such that for every measurable set $M \subset \mathbb{R}^k$ with m(M) < δ, we have $\int_M |f|\,dx < \varepsilon$.

Proof. For $n \in \mathbb{N}$ define
\[ g_n(x) = \begin{cases} |f(x)| & \text{if } |f(x)| \le n, \\ n & \text{if } |f(x)| > n. \end{cases} \]
Observe that, for every $n \in \mathbb{N}$, $0 \le g_n \le |f|$, so $g_n \in L(\mathbb{R}^k)$. Moreover, $(g_n)$ is an increasing sequence converging pointwise to $|f|$. By Beppo Levi,
\[ \lim_{n \to \infty} \int_{\mathbb{R}^k} g_n\,dx = \int_{\mathbb{R}^k} |f|\,dx, \]
so, for a given ε > 0, there exists an $N \in \mathbb{N}$ such that
\[ \int_{\mathbb{R}^k} (|f| - g_N)\,dx < \frac{\varepsilon}{2}. \tag{3.3.10} \]
Choosing δ = ε/(2N), we have for all $M \in \mathcal{A}$ with m(M) < δ,
\[ \int_M g_N\,dx \le \int_M N\,dx = N m(M) < \frac{\varepsilon}{2}. \tag{3.3.11} \]
Now, we derive from (3.3.10) and (3.3.11),
\[ \int_M |f|\,dx = \int_M (|f| - g_N)\,dx + \int_M g_N\,dx < \varepsilon. \]
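The choice of N and δ in the proof of Theorem 3.21 can be followed on a concrete function. The sketch below assumes k = 1 and the sample function f(x) = x^(-1/2) on (0, 1), extended by zero elsewhere; both the function and the helper names are illustrative assumptions, not taken from the text. For this f and any integer N ≥ 1, a direct computation gives $\int (|f| - g_N)\,dx = 1/N$, and since f is decreasing on (0, 1), the set M = (0, δ) carries the largest integral among sets of measure δ.

```python
import math

def choose_N_and_delta(eps):
    """Pick N and delta as in the proof of Theorem 3.21 for f(x) = x**(-1/2) on (0, 1).

    For this particular f the truncation g_N = min(|f|, N) satisfies
    integral(|f| - g_N) = 1/N (hand computation, valid for N >= 1), so any
    integer N > 2/eps gives (3.3.10); delta = eps/(2N) as in the proof.
    """
    N = math.floor(2 / eps) + 1
    delta = eps / (2 * N)
    return N, delta

def integral_near_zero(delta):
    """Exact value of the integral of x**(-1/2) over (0, delta), for 0 < delta <= 1."""
    return 2 * math.sqrt(delta)

for eps in (1.0, 0.1, 0.01):
    N, delta = choose_N_and_delta(eps)
    assert 1 / N < eps / 2                    # inequality (3.3.10) for this f
    assert integral_near_zero(delta) < eps    # conclusion of the theorem on M = (0, delta)
```

Note that f is unbounded near 0, so the bound cannot come from a uniform estimate on f; it comes from the truncation argument, exactly as in the proof.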

Recall that the equality a.e. is an equivalence relation in the linear space of measurable functions, in particular in $L(\mathbb{R}^k)$. Denote by $L^1(\mathbb{R}^k)$ the quotient space $L(\mathbb{R}^k)/\!\sim$, where ∼ stands for this equivalence relation. In general, any equivalence class in $L^1(\mathbb{R}^k)$ is identified with a representative of the corresponding class, which is usually selected to be the most regular one. If $\Omega \subset \mathbb{R}^k$ is a measurable set, we can similarly define $L^1(\Omega) := L(\Omega)/\!\sim$. Based on this identification, we can say that the above theory works for functions (in fact classes of functions) belonging to $L^1(\mathbb{R}^k)$ or to $L^1(\Omega)$.

The next result is known as Lebesgue's Dominated Convergence Theorem.

Theorem 3.22 (Lebesgue's Dominated Convergence Theorem). Let $\Omega \subset \mathbb{R}^k$ be a measurable set, possibly $\Omega = \mathbb{R}^k$. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence in $L^1(\Omega)$ such that

(a) $f_n(x) \to f(x)$ a.e. on Ω;

(b) there exists $g \in L^1(\Omega)$ such that $|f_n(x)| \le g(x)$ a.e. on Ω, for all $n \in \mathbb{N}$.

Then, $f \in L^1(\Omega)$ and $\lim_{n \to \infty} \int_\Omega |f_n(x) - f(x)|\,dx = 0$.

Proof. According to (a), f is measurable. Passing to the limit in (b) we get $|f| \le g$ a.e., so $f \in L^1(\Omega)$. Set $h_n := |f_n - f|$. We have $h_n \to 0$ a.e. on Ω and $h_n \le \tilde{g} := g + |f| \in L^1(\Omega)$. Applying Fatou's lemma to the sequence $(\tilde{g} - h_n)$, we get
\[ \int_\Omega \tilde{g}\,dx \le \liminf_{n \to \infty} \int_\Omega (\tilde{g} - h_n)\,dx = \int_\Omega \tilde{g}\,dx - \limsup_{n \to \infty} \int_\Omega h_n\,dx, \]
which implies
\[ \limsup_{n \to \infty} \int_\Omega h_n\,dx \le 0. \]
Thus
\[ \lim_{n \to \infty} \int_\Omega h_n\,dx = 0. \]

3.4 Lp Spaces

Throughout this section Ω denotes a measurable subset of $\mathbb{R}^k$ (possibly $\Omega = \mathbb{R}^k$). As usual, any class of measurable functions with respect to the equality a.e. will be identified with one of its representatives. We have already defined the space $L^1(\Omega)$ as being the set of all functions $f : \Omega \to \mathbb{R}$ which are integrable over Ω, i.e., f is measurable and $\int_\Omega |f|\,dx < \infty$. This definition can be extended as follows:
\[ L^p(\Omega) := \{f : \Omega \to \mathbb{R};\ f \text{ is measurable and } |f|^p \in L^1(\Omega)\}, \]
for $1 \le p < \infty$. We also define
\[ L^\infty(\Omega) := \{f : \Omega \to \mathbb{R};\ f \text{ is measurable and there exists } C \ge 0 \text{ such that } |f(x)| \le C \text{ a.e. on } \Omega\}. \]
It is easily seen that, for every $1 \le p \le \infty$, $L^p(\Omega)$ is a linear space over $\mathbb{R}$. Now, for $1 < p < \infty$ denote by q the conjugate of p, i.e.,
\[ \frac{1}{p} + \frac{1}{q} = 1. \]
Recall the so-called Young's inequality, valid for all $a, b \ge 0$:
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \tag{3.4.12} \]
For a, b > 0 this inequality follows from the fact that the log function is concave on (0, ∞), so
\[ \log\Big( \frac{1}{p} a^p + \frac{1}{q} b^q \Big) \ge \frac{1}{p} \log a^p + \frac{1}{q} \log b^q = \log(ab). \]
Now, we set for $1 \le p < \infty$
\[ \|f\|_{L^p(\Omega)} := \Big( \int_\Omega |f(x)|^p\,dx \Big)^{1/p} \quad \forall f \in L^p(\Omega), \]
and
\[ \|f\|_{L^\infty(\Omega)} := \inf\{C;\ |f(x)| \le C \text{ a.e. on } \Omega\} \quad \forall f \in L^\infty(\Omega). \]
We are going to prove that these are norms. To this purpose, we need the following auxiliary result which is known as Hölder's inequality.⁴

⁴ Otto Ludwig Hölder, German mathematician, 1859–1937.

Lemma 3.23 (Hölder's Inequality). Let $1 < p < \infty$. If $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$, then $fg \in L^1(\Omega)$ and
\[ \int_\Omega |fg|\,dx \le \|f\|_{L^p(\Omega)} \|g\|_{L^q(\Omega)}, \tag{3.4.13} \]
where q is the conjugate of p.

Proof. If f = 0 a.e. on Ω, then (3.4.13) is trivially satisfied, so we can assume $\|f\|_{L^p(\Omega)} > 0$. By Young's inequality we have
\[ |fg| \le \frac{1}{p} |f|^p + \frac{1}{q} |g|^q \quad \text{a.e. on } \Omega. \]
This shows that $fg \in L^1(\Omega)$ and
\[ \int_\Omega |fg|\,dx \le \frac{1}{p} \|f\|_{L^p(\Omega)}^p + \frac{1}{q} \|g\|_{L^q(\Omega)}^q. \]
By replacing in this inequality f by αf with α > 0, we obtain
\[ \int_\Omega |fg|\,dx \le \frac{\alpha^{p-1}}{p} \|f\|_{L^p(\Omega)}^p + \frac{1}{\alpha q} \|g\|_{L^q(\Omega)}^q, \]
whose right-hand side achieves its minimum for $\alpha = \|g\|_{L^q(\Omega)}^{q/p} / \|f\|_{L^p(\Omega)}$, thus (3.4.13) follows.

Theorem 3.24. $\|\cdot\|_{L^p(\Omega)}$ is a norm in $L^p(\Omega)$ for all $1 \le p \le \infty$.

Proof. The result is trivial for p = 1. Now, if $f \in L^\infty(\Omega)$, then
\[ |f(x)| \le \|f\|_{L^\infty(\Omega)} \quad \text{a.e. on } \Omega. \tag{3.4.14} \]
Indeed, we infer from the definition of $\|\cdot\|_{L^\infty(\Omega)}$ that, for each $n \in \mathbb{N}$, there exists a constant $C_n$ such that
\[ \|f\|_{L^\infty(\Omega)} \le C_n < \|f\|_{L^\infty(\Omega)} + \frac{1}{n} \quad \text{and} \quad |f(x)| \le C_n, \]


for $x \in \Omega \setminus A_n$ with $m(A_n) = 0$. Setting $A = \bigcup_{n=1}^{\infty} A_n$, we have m(A) = 0 and $|f(x)| \le C_n$, $x \in \Omega \setminus A$. As $C_n \to \|f\|_{L^\infty(\Omega)}$ we derive (3.4.14) by passing to the limit in the last inequality. Using (3.4.14) one can easily prove that $\|\cdot\|_{L^\infty(\Omega)}$ is a norm in $L^\infty(\Omega)$.

Now, let us consider the case $1 < p < \infty$. We have only to prove the triangle inequality (since the other axioms are trivially satisfied). For $f, g \in L^p(\Omega)$, we have
\[ \|f + g\|_{L^p(\Omega)}^p = \int_\Omega |f + g|^{p-1} |f + g|\,dx \le \int_\Omega |f + g|^{p-1} |f|\,dx + \int_\Omega |f + g|^{p-1} |g|\,dx. \tag{3.4.15} \]
Noting that $|f + g|^{p-1} \in L^q(\Omega)$, we obtain by Hölder's inequality
\[ \|f + g\|_{L^p(\Omega)}^p \le \|f + g\|_{L^p(\Omega)}^{p-1} \big( \|f\|_{L^p(\Omega)} + \|g\|_{L^p(\Omega)} \big), \]
which implies
\[ \|f + g\|_{L^p(\Omega)} \le \|f\|_{L^p(\Omega)} + \|g\|_{L^p(\Omega)}. \]

Theorem 3.25. For every $1 \le p \le \infty$, $L^p(\Omega)$ equipped with $\|\cdot\|_{L^p(\Omega)}$ is a Banach space.

Proof. The fact that $\|\cdot\|_{L^p(\Omega)}$ is a norm was shown before (see Theorem 3.24). So we only need to prove that $L^p(\Omega)$ is complete with respect to this norm. We distinguish two cases.

Case 1: $1 \le p < \infty$. Let $(f_n)_{n \in \mathbb{N}}$ be a Cauchy sequence in $L^p(\Omega)$. Then there exists a subsequence $(f_{n_m})_{m \in \mathbb{N}}$ which satisfies
\[ \|f_{n_{m+1}} - f_{n_m}\|_{L^p(\Omega)} \le \frac{1}{2^m}, \quad m = 1, 2, \dots \tag{3.4.16} \]
Indeed, one may first choose $n_1 \in \mathbb{N}$ such that $\|f_m - f_n\|_{L^p(\Omega)} \le 1/2$ for all $m, n \ge n_1$; then choose $n_2 \in \mathbb{N}$, $n_2 > n_1$, such that $\|f_m - f_n\|_{L^p(\Omega)} \le 1/2^2$ for all $m, n \ge n_2$, and so on. We are going to show that there is a function $f \in L^p(\Omega)$ such that $\|f_{n_m} - f\|_{L^p(\Omega)} \to 0$, as $m \to \infty$. If we show this, the initial sequence $(f_n)$ will be convergent in $L^p(\Omega)$,


as a Cauchy sequence with a convergent subsequence. For simplicity, we redenote $f_m := f_{n_m}$, so (3.4.16) becomes
\[ \|f_{m+1} - f_m\|_{L^p(\Omega)} \le \frac{1}{2^m}, \quad m = 1, 2, \dots \tag{3.4.17} \]
Set
\[ g_n(x) = \sum_{i=1}^{n} |f_{i+1}(x) - f_i(x)|. \]
According to (3.4.17), we have $\|g_n\|_{L^p(\Omega)} \le 1$, $n = 1, 2, \dots$ By the Monotone Convergence Theorem, $g_n(x)$ converges a.e. to a finite limit g(x), and $g \in L^p(\Omega)$. Now, for $m \ge n \ge 2$ and for almost all $x \in \Omega$,
\[ |f_m(x) - f_n(x)| \le |f_m(x) - f_{m-1}(x)| + \dots + |f_{n+1}(x) - f_n(x)| = g_{m-1}(x) - g_{n-1}(x) \le g(x) - g_{n-1}(x). \tag{3.4.18} \]
It follows that for almost all $x \in \Omega$, $(f_n(x))_{n \in \mathbb{N}}$ is Cauchy, so it converges to some f(x). We also obtain for almost all $x \in \Omega$
\[ |f(x) - f_n(x)| \le g(x), \quad n = 2, 3, \dots \]
so, in particular, $f \in L^p(\Omega)$. As $|f_n - f|^p \to 0$ a.e. on Ω and $|f_n - f|^p \le g^p \in L^1(\Omega)$, we are in a position to apply the Dominated Convergence Theorem to conclude that $\|f_n - f\|_{L^p(\Omega)} \to 0$.

Case 2: p = ∞. Let $(f_n)$ be a Cauchy sequence in $L^\infty(\Omega)$. So, for any $j \in \mathbb{N}$, there exists $N_j \in \mathbb{N}$ such that
\[ \|f_n - f_m\|_{L^\infty(\Omega)} \le \frac{1}{j} \quad \forall n, m \ge N_j. \]
Hence, there exists a set $M_j$ with $m(M_j) = 0$ such that
\[ |f_n(x) - f_m(x)| \le \frac{1}{j} \quad \forall x \in \Omega \setminus M_j,\ m, n \ge N_j. \tag{3.4.19} \]
Obviously, the set $M = \bigcup_{j=1}^{\infty} M_j$ has measure zero. For each $x \in \Omega \setminus M$ the sequence $(f_n(x))$ is Cauchy and therefore convergent to some $f(x) \in \mathbb{R}$. Now, we deduce from (3.4.19)
\[ |f_n(x) - f(x)| \le \frac{1}{j} \quad \forall x \in \Omega \setminus M,\ n \ge N_j, \]
hence $f \in L^\infty(\Omega)$ and
\[ \|f_n - f\|_{L^\infty(\Omega)} \le \frac{1}{j} \quad \forall n \ge N_j. \]
So $(f_n)$ converges to f in $L^\infty(\Omega)$.
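Hölder's inequality (3.4.13) and the triangle inequality from Theorem 3.24 can be checked numerically on step functions, for which the $L^p$ norms are finite sums. This is a minimal sketch, assuming Ω = (0, 1) and functions that are constant on n cells of equal length 1/n; the helper names `lp_norm` and `integral_abs_product` and the random test data are illustrative assumptions.

```python
import random

def lp_norm(values, p):
    """L^p(0,1) norm of the step function equal to values[i] on [i/n, (i+1)/n)."""
    n = len(values)
    if p == float("inf"):
        return max(abs(v) for v in values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

def integral_abs_product(f, g):
    """Integral over (0,1) of |f g| for two step functions on the same n cells."""
    return sum(abs(a * b) for a, b in zip(f, g)) / len(f)

random.seed(0)                       # fixed seed so the run is reproducible
n = 64
f = [random.uniform(-2, 2) for _ in range(n)]
g = [random.uniform(-2, 2) for _ in range(n)]
tol = 1e-12                          # slack for floating-point rounding

for p in (1.5, 2.0, 3.0, 7.0):
    q = p / (p - 1.0)                # conjugate exponent: 1/p + 1/q = 1
    # Hoelder's inequality (3.4.13)
    assert integral_abs_product(f, g) <= lp_norm(f, p) * lp_norm(g, q) + tol
    # triangle inequality for the L^p norm
    s = [a + b for a, b in zip(f, g)]
    assert lp_norm(s, p) <= lp_norm(f, p) + lp_norm(g, p) + tol
```

Such a check only samples a few exponents and one pair of functions; it is meant as a sanity test of the formulas, not as evidence for the general statements.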

3.5 Exercises

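1. A set $\Omega \subset \mathbb{R}^k$ is measurable ⇐⇒ for every ε > 0 there exists a closed set $F \subset \Omega$ such that $m(\Omega \setminus F) < \varepsilon$.

2. Let $\Omega \subset \mathbb{R}^k$ be a measurable set with $m(\Omega) < \infty$. Show that, for every ε > 0, there exists a compact set $K \subset \Omega$ such that $m(\Omega \setminus K) < \varepsilon$.

3. Let $A \subset \Omega \subset B \subset \mathbb{R}^k$, where A, B are measurable sets with $m(A) = m(B) < \infty$. Then Ω is measurable.

4. Let $h \in \mathbb{R}^k \setminus \{0\}$ and $\alpha \in \mathbb{R}$. Show that for every measurable set $\Omega \subset \mathbb{R}^k$ we have

(a) $\Omega_h := \{x + h;\ x \in \Omega\}$ is measurable and $m(\Omega_h) = m(\Omega)$ (translation invariance);

(b) $\alpha\Omega := \{\alpha x;\ x \in \Omega\}$ is measurable and $m(\alpha\Omega) = |\alpha|^k m(\Omega)$.

5. Let $h \in \mathbb{R}^k \setminus \{0\}$ and $\alpha \in \mathbb{R} \setminus \{0\}$. If $f \in L^1(\mathbb{R}^k)$, then so are the functions $x \mapsto f(x - h)$, $x \mapsto f(\alpha x)$ and
\[ \int_{\mathbb{R}^k} f(x - h)\,dx = \int_{\mathbb{R}^k} f(x)\,dx, \qquad \int_{\mathbb{R}^k} f(\alpha x)\,dx = \frac{1}{|\alpha|^k} \int_{\mathbb{R}^k} f(x)\,dx. \]

6. Let $-\infty < a < b < +\infty$ and let $f : [a, b] \to \mathbb{R}$ be a bounded function. If f is Riemann integrable then $f \in L^1(a, b) := L^1((a, b); \mathbb{R})$, and the two integrals coincide:
\[ (L)\int_a^b f(x)\,dx = (R)\int_a^b f(x)\,dx. \]
Use the (Dirichlet) function $D : [0, 1] \to \mathbb{R}$,
\[ D(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \cap [0, 1], \\ 0 & \text{if } x \in [0, 1] \setminus \mathbb{Q}, \end{cases} \]
to show that the converse implication is not true in general.

7. Let $f_n : [0, 1] \to \mathbb{R}$ be defined by
\[ f_n(x) = \frac{n x^{n-1}}{1 + x}, \quad x \in [0, 1],\ n \in \mathbb{N}. \]
Show that
\[ \lim_{n \to \infty} \int_0^1 f_n(x)\,dx = \frac{1}{2}. \]

8. Show that $f : [1, \infty) \to \mathbb{R}$ defined by $f(x) = x^{-2} \ln x$, $x \in [1, \infty)$, is Lebesgue integrable and
\[ \int_1^\infty f(x)\,dx = 1. \]

9. Show that
\[ \lim_{n \to \infty} \int_0^n \Big( 1 + \frac{x}{n} \Big)^n e^{-2x}\,dx = 1. \]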

10. Let $f : [0, \infty) \to \mathbb{R}$ be a continuous function such that
\[ \lim_{x \to \infty} f(x) = a, \]
where $a \in \mathbb{R}$. Show that, for every $b \in (0, \infty)$,
\[ \lim_{n \to \infty} \int_0^b f(nx)\,dx = ab. \]

11. Let $f : [0, 1] \to \mathbb{R}$ be defined by
\[ f(x) = \begin{cases} 0 & \text{if } x = 0, \\ \sqrt{n} & \text{if } x \in \big( \frac{1}{n+1}, \frac{1}{n} \big],\ n \in \mathbb{N}. \end{cases} \]
Show that

(a) f is not Riemann integrable on [0, 1];

(b) $f \in L^p(0, 1)$ for $1 \le p < 2$, and $f \notin L^p(0, 1)$ for $2 \le p \le \infty$.

12. Show that the following functions are not Lebesgue integrable:

(a) $f(x) = \frac{1}{x}$, $x \in (0, 1)$;

(b) $g(x) = \sin x + \cos x$, $x \in (0, \infty)$.

13. Let $f \in C[0, 1] := C([0, 1]; \mathbb{R})$ be such that f(0) = 0 and f is differentiable at x = 0. Prove that $g : (0, 1) \to \mathbb{R}$, defined by $g(x) = x^{-3/2} f(x)$, $x \in (0, 1)$, belongs to $L^1(0, 1)$.

14. If $f \in L^1(0, 1)$, show that
\[ \int_0^1 x^n f(x)\,dx \to 0 \quad \text{as } n \to \infty. \]

15. Let $\Omega \subset \mathbb{R}^k$ be a measurable set with $m(\Omega) < \infty$ and let $1 \le p < q \le \infty$. Prove that $L^q(\Omega) \subset L^p(\Omega)$ and
\[ \|f\|_{L^p(\Omega)} \le m(\Omega)^{(q-p)/pq} \|f\|_{L^q(\Omega)} \quad \forall f \in L^q(\Omega). \]

16. Let $\Omega \subset \mathbb{R}^k$ be a measurable set with $m(\Omega) < \infty$ and let $f \in L^\infty(\Omega)$. Prove that
\[ \lim_{p \to \infty} \|f\|_{L^p(\Omega)} = \|f\|_{L^\infty(\Omega)}. \]

Chapter 4

Continuous Linear Operators and Functionals

In this chapter we discuss linear operators between linear spaces, but our presentation is restricted at this stage to the space of continuous (bounded) linear operators between normed spaces. When the target space is either $\mathbb{R}$ or $\mathbb{C}$, they are called (continuous linear) functionals and are used to define dual spaces and weak topologies. Unless otherwise specified, this chapter only considers linear spaces over the field $\mathbb{K}$, with $\mathbb{K}$ being $\mathbb{R}$ or $\mathbb{C}$. When two or more linear spaces are involved then all of them will be over the same field.

4.1 Definitions, Examples, Operator Norm

We begin this section with some basic definitions.

Definition 4.1. Let X, Y be linear spaces and let $A : D(A) \subset X \to Y$. A is called a linear operator if D(A) is a linear subspace of X and
\[ A(\alpha x + \beta y) = \alpha Ax + \beta Ay \quad \forall \alpha, \beta \in \mathbb{K},\ \forall x, y \in D(A). \]
We denote the range of A by R(A), i.e., $R(A) = \{Ax;\ x \in D(A)\}$. The range R(A) is a linear subspace of Y. We say that A is injective or one-to-one if N(A), the nullspace of A, defined by $N(A) = \{x \in D(A);\ Ax = 0\}$, is precisely {0}. The operator A is called surjective or onto if R(A) = Y.

© Springer Nature Switzerland AG 2019. G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_4


Example 1. Let $X = \mathbb{R}^n$, $Y = \mathbb{R}^m$ with $n, m \in \mathbb{N}$. Let M be an m × n matrix with real entries. Then $A : D(A) = X \to Y$ defined by
\[ Au = Mu \quad \forall u = (u_1, \dots, u_n)^T \in X \]
is a linear operator, and in fact all linear maps between these spaces can be represented in this way. Here we consider that the elements of both X and Y are column vectors. If m = 1 then A is a linear form on X, as defined in Chap. 1.

Example 2. For $X = Y = C[a, b] := C([a, b]; \mathbb{R})$ with $-\infty < a < b < \infty$, the derivative operator $Af = f'$ is defined on $D(A) = C^1[a, b]$ (which is the set of all continuously differentiable functions $f : [a, b] \to \mathbb{R}$), and its range is R(A) = C[a, b] = Y, so A is surjective. Note that A is not injective because its nullspace $N(A) := \{f \in D(A);\ Af = 0\} \neq \{0\}$ (more precisely, N(A) consists of all constant functions).

Example 3. For X = Y = C[a, b], $-\infty < a < b < \infty$, the antiderivative operator
\[ (Af)(t) = \int_a^t f(s)\,ds \]
is defined on D(A) = C[a, b] = X. It is injective because Af = 0 implies f = 0. However A is not surjective because (Af)(a) = 0 for all $f \in D(A) = C[a, b]$, and thus R(A) is a proper subset of Y = C[a, b].

Proposition 4.2. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be normed (linear) spaces and let $A : X \to Y$ be a linear operator. Then the following are equivalent:

1. A is continuous on X;

2. A is continuous at x = 0;

3. A maps bounded subsets of X to bounded subsets of Y;

4. There exists c > 0 such that $\|Au\|_Y \le c\|u\|_X$ for all $u \in X$.

Proof. An exercise.

Remark 4.3. If X, Y are finite dimensional spaces then both of them can be equipped with norms, and every linear operator between the two spaces is continuous (prove it!). In fact, any such operator can


be represented by a matrix which depends on the bases of the two spaces. So continuity of linear operators is interesting only in the case of infinite dimensional linear spaces.

Remark 4.4. A linear operator $A : D(A) \subset X \to Y$ is said to be bounded if
\[ \sup\{\|Ax\|_Y;\ x \in D(A),\ \|x\|_X \le 1\} < \infty. \tag{4.1.1} \]
Otherwise, A is called unbounded. Obviously, any continuous linear operator from $(X, \|\cdot\|_X)$ to $(Y, \|\cdot\|_Y)$ is bounded. Conversely, if $A : D(A) \subset X \to Y$ is a bounded linear operator, then denoting by $\hat{c}$ the supremum in (4.1.1) we have
\[ \|Ax\|_Y \le \hat{c}\,\|x\|_X \quad \forall x \in D(A), \tag{4.1.2} \]
so A is continuous from $(D(A), \|\cdot\|_X)$ to $(Y, \|\cdot\|_Y)$ (see Proposition 4.2). That is why continuous linear operators are also called bounded. Note that if A is a continuous (bounded) linear operator from $(D(A), \|\cdot\|_X)$ to $(Y, \|\cdot\|_Y)$, then A can be extended by continuity to a continuous linear operator $A_1 : D(A_1) = X_1 \to Y_1$, where $X_1$, $Y_1$ denote the completions of D(A) and Y with respect to $\|\cdot\|_X$ and $\|\cdot\|_Y$, respectively.

For $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ normed spaces, denote
\[ L(X, Y) = \{A : X \to Y;\ A \text{ is linear and continuous}\}. \]
Obviously, L(X, Y) is a linear space. It is a normed space with the so-called operator norm
\[ \|A\| = \sup\{\|Au\|_Y;\ u \in X,\ \|u\|_X \le 1\}. \]
Clearly, we have
\[ \|Au\|_Y \le \|A\| \cdot \|u\|_X \quad \forall u \in X. \]
If $(Z, \|\cdot\|_Z)$ is another normed space, and $A \in L(X, Y)$, $B \in L(Z, X)$, then $AB \in L(Z, Y)$ and
\[ \|AB\| \le \|A\| \cdot \|B\|, \]
where AB denotes the composition $A \circ B$.


In the case X = Y we simply write L(X) = L(X, X).

Examples. If X = C[0, 1] equipped with the usual sup-norm, the antiderivative operator $A : X \to X$, $(Af)(t) = \int_0^t f(s)\,ds$, $t \in [0, 1]$, $f \in X$, is linear and continuous (hence bounded) with $\|A\| = 1$. On the other hand, for the same space X, the derivative operator $B : D(B) = C^1[0, 1] \subset X \to X$, $Bf = f'$, is linear but unbounded because for $f_n(t) = t^n$, $t \in [0, 1]$, $n \in \mathbb{N}$, we have $\|f_n\| = 1$, while $\|Bf_n\| = n \to \infty$.

Remark 4.5. If $X \neq \{0\}$, then
\[ \|A\| = \sup\{\|Au\|_Y;\ u \in X,\ \|u\|_X = 1\}. \tag{4.1.3} \]

Proof. If we denote the right-hand side by a, then clearly
\[ a \le \|A\|. \tag{4.1.4} \]
Now, from the inequality
\[ \big\| A\big( \|u\|_X^{-1} u \big) \big\|_Y \le a \quad \forall u \in X \setminus \{0\} \]
we derive
\[ \|Au\|_Y \le a\|u\|_X \quad \forall u \in X. \tag{4.1.5} \]
By taking the supremum in (4.1.5) over all $u \in X$, $\|u\|_X \le 1$, we find $\|A\| \le a$, which combined with (4.1.4) proves (4.1.3).

Theorem 4.6. If $(X, \|\cdot\|_X)$ is a normed space and $(Y, \|\cdot\|_Y)$ is a Banach space, then L(X, Y) is a Banach space with respect to the operator norm.

Proof. We know that L(X, Y) is a normed space, so we have to show that it is complete. For the sake of simplicity we redenote by $\|\cdot\|$ both the norms $\|\cdot\|_X$ and $\|\cdot\|_Y$. Consider a Cauchy sequence $(A_n)$ in L(X, Y), i.e.,
\[ \forall \varepsilon > 0\ \exists N_\varepsilon \text{ such that } \|A_n - A_m\| < \varepsilon \quad \forall n, m > N_\varepsilon. \]
For the same ε, we have
\[ \|A_n v - A_m v\| \le \varepsilon\|v\| \quad \forall v \in X,\ n, m > N_\varepsilon. \]


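Thus $(A_n v)$ is a Cauchy sequence in Y for every $v \in X$, and it converges in Y since Y is Banach. So we have an operator $A : X \to Y$, $Av = \lim_{n \to \infty} A_n v$, and because each $A_n$ is linear, A is as well. Since for all $v \in X$
\[ \|Av\| \le \|Av - A_{N_\varepsilon + 1} v\| + \|A_{N_\varepsilon + 1} v\| \le \varepsilon\|v\| + \|A_{N_\varepsilon + 1}\| \cdot \|v\| = \big( \varepsilon + \|A_{N_\varepsilon + 1}\| \big) \|v\|, \]
we see that A is continuous, so $A \in L(X, Y)$. Since $\|A_n v - Av\| \le \varepsilon$ for $v \in X$ such that $\|v\| \le 1$ and $n > N_\varepsilon$, we get $\|A_n - A\| \le \varepsilon$ for all $n > N_\varepsilon$, which implies that $A_n \to A$ in L(X, Y).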
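The two operators from the Examples on C[0, 1] can be probed numerically on the monomials $t^n$. The sketch below is only an illustration under stated assumptions: the sup-norm is approximated by a maximum over a uniform grid that contains the endpoint t = 1, where the relevant maxima are attained for these particular functions, and the helper names are hypothetical.

```python
def sup_norm(func, grid):
    """Maximum of |func| over the grid (a stand-in for the sup-norm on [0, 1])."""
    return max(abs(func(t)) for t in grid)

grid = [i / 1000 for i in range(1001)]   # uniform grid containing 0 and 1

def power(n):
    return lambda t: t ** n                      # f_n(t) = t^n

def power_derivative(n):
    return lambda t: n * t ** (n - 1)            # (B f_n)(t) = n t^(n-1), n >= 1

def power_antiderivative(n):
    return lambda t: t ** (n + 1) / (n + 1)      # (A f_n)(t) = t^(n+1) / (n+1)

# Derivative operator B: the ratio ||B f_n|| / ||f_n|| equals n, so it is unbounded.
ratios_B = [sup_norm(power_derivative(n), grid) / sup_norm(power(n), grid)
            for n in (1, 5, 25, 125)]

# Antiderivative operator A: the ratio ||A f_n|| / ||f_n|| equals 1/(n+1) <= 1,
# and the value 1 is attained for the constant function f_0 = 1.
ratios_A = [sup_norm(power_antiderivative(n), grid) / sup_norm(power(n), grid)
            for n in (0, 1, 5, 25)]

assert ratios_B == [1.0, 5.0, 25.0, 125.0]
assert all(r <= 1.0 for r in ratios_A) and ratios_A[0] == 1.0
```

The growing ratios for B reflect that no constant c can satisfy item 4 of Proposition 4.2, while the ratios for A stay below 1, in agreement with $\|A\| = 1$.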

4.2 Main Principles of Functional Analysis

In this section we present some important principles of Functional Analysis: the Uniform Boundedness Principle, the Open Mapping Theorem, and the Closed Graph Theorem. We begin with the Uniform Boundedness Principle, which was proven by Banach and Steinhaus.¹

¹ Hugo Steinhaus, Polish mathematician, 1887–1972.

Theorem 4.7 (Banach–Steinhaus, Uniform Boundedness Principle). Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be Banach spaces and let $\{T_i\}_{i \in I} \subset L(X, Y)$ be a collection of operators satisfying
\[ \sup_{i \in I} \|T_i x\|_Y < \infty \quad \forall x \in X. \tag{4.2.6} \]
Then,
\[ \sup_{i \in I} \|T_i\| < \infty. \tag{4.2.7} \]

Proof. Denote
\[ X_n = \{x \in X;\ \sup_{i \in I} \|T_i x\|_Y \le n\}, \quad n \in \mathbb{N}. \]
Obviously, $X_n$ is a closed set for every $n \in \mathbb{N}$, and by (4.2.6) we have
\[ X = \bigcup_{n=1}^{\infty} X_n. \]
It follows by Baire's Theorem (Theorem 2.10) that there exists an $n_0 \in \mathbb{N}$ such that $\operatorname{Int} X_{n_0} \neq \emptyset$, i.e., there is a ball $B(x_0, r_0) \subset X_{n_0}$, $r_0 > 0$. Hence,
\[ \|T_i(x_0 + r_0 w)\|_Y \le n_0 \quad \forall i \in I,\ \forall w \in B(0, 1), \]
which implies
\[ r_0 \|T_i\| \le n_0 + \|T_i x_0\|_Y \quad \forall i \in I. \]
This shows that (4.2.7) holds true (see also (4.2.6)).

Theorem 4.8 (Open Mapping Theorem). Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be Banach spaces. If $A : D(A) \subset X \to Y$ is a linear, continuous, and surjective operator, then A maps open sets in X to open sets in Y.

Proof. It suffices to prove that there exists a constant r > 0 such that
\[ B_Y(0, r) \subset A(B_X(0, 1)), \tag{4.2.8} \]
where $B_X(0, 1)$, $B_Y(0, r)$ denote the open balls in X and Y centered at 0 with radii 1 and r, respectively. In order to prove (4.2.8) we shall first show the existence of a constant $r_1 > 0$ such that
\[ B_Y(0, r_1) \subset \operatorname{Cl} A(B_X(0, 1)). \tag{4.2.9} \]
Denote $Y_n = n \operatorname{Cl} A(B_X(0, 1))$, $n \in \mathbb{N}$. Since A is surjective, we have $Y = \bigcup_{n \in \mathbb{N}} Y_n$. By Baire's Theorem (Theorem 2.10) $\operatorname{Int} Y_{n_0} \neq \emptyset$ for some $n_0 \in \mathbb{N}$, hence $\operatorname{Int} \operatorname{Cl} A(B_X(0, 1)) \neq \emptyset$. So, for some $y_0 \in Y$ and some $r_1 > 0$, we have
\[ B_Y(y_0, 2r_1) \subset \operatorname{Cl} A(B_X(0, 1)). \tag{4.2.10} \]
Adding the fact that $-y_0 \in \operatorname{Cl} A(B_X(0, 1))$ to (4.2.10), we obtain
\[ B_Y(0, 2r_1) \subset \operatorname{Cl} A(B_X(0, 1)) + \operatorname{Cl} A(B_X(0, 1)) = 2 \operatorname{Cl} A(B_X(0, 1)) \]
(since $\operatorname{Cl} A(B_X(0, 1))$ is a convex set), hence (4.2.9) holds true. Now we are going to prove (4.2.8) by using (4.2.9) with $r_1 = 2r$, i.e.,
\[ B_Y(0, 2r) \subset \operatorname{Cl} A(B_X(0, 1)). \tag{4.2.11} \]
Choose an arbitrary $y \in B_Y(0, r)$. By (4.2.11) we have
\[ \forall \varepsilon > 0\ \exists v \in B_X(0, 1/2) \text{ such that } \|y - Av\|_Y < \varepsilon. \tag{4.2.12} \]
In particular, for ε = r/2 there exists a $v_1 \in B_X(0, 1/2)$ with
\[ \|y - Av_1\|_Y < \frac{r}{2^1}. \]
Now choosing $y - Av_1$ instead of y and $\varepsilon = r/2^2$, we can find, by the same argument applied at half the scale, some $v_2 \in B_X(0, 1/2^2)$ with
\[ \|(y - Av_1) - Av_2\|_Y < \frac{r}{2^2}. \]
Continuing the process we find $v_n \in B_X(0, 1/2^n)$ such that
\[ \|y - A(v_1 + v_2 + \dots + v_n)\|_Y < \frac{r}{2^n}. \tag{4.2.13} \]
Obviously, $x_n = v_1 + v_2 + \dots + v_n$ defines a Cauchy sequence in X, hence $x_n$ converges to some $x \in X$ with $\|x\|_X < 1$ and y = Ax since $A \in L(X, Y)$ (see (4.2.13)). As y was an arbitrary vector in $B_Y(0, r)$ the proof of (4.2.8) is complete.

Remark 4.9. If $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ are Banach spaces and $A \in L(X, Y)$ is bijective, then $A^{-1} \in L(Y, X)$. This follows from (4.2.8).

Theorem 4.10 (Closed Graph Theorem). Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be Banach spaces. If $A : X \to Y$ is a linear operator and its graph $G(A) := \{(x, Ax);\ x \in X\}$ is closed in X × Y (in other words, A is a closed operator), then $A \in L(X, Y)$.

Proof. Define on X the norm
\[ \|x\|_A = \|x\|_X + \|Ax\|_Y, \quad x \in X, \]
which is called the graph norm. Since G(A) is a closed set in $(X, \|\cdot\|_X) \times (Y, \|\cdot\|_Y)$, it follows that $(X, \|\cdot\|_A)$ is a Banach space. Obviously,
\[ \|x\|_X \le \|x\|_A \quad \forall x \in X, \]


so the identity operator $I : (X, \|\cdot\|_A) \to (X, \|\cdot\|_X)$ is continuous. So, by Remark 4.9, its inverse $I^{-1} = I \in L((X, \|\cdot\|_X), (X, \|\cdot\|_A))$, i.e., there exists a constant C > 0 such that
\[ \|x\|_A \le C\|x\|_X \quad \forall x \in X. \]
In particular,
\[ \|Ax\|_Y \le C\|x\|_X \quad \forall x \in X, \]
which means A is continuous from $(X, \|\cdot\|_X)$ to $(Y, \|\cdot\|_Y)$.

4.3 Compact Linear Operators

If X, Y are normed spaces and $A : X \to Y$ is a linear operator then A is called compact or completely continuous if A takes bounded sets of X into relatively compact subsets of Y.

Example. Let X = Y = C[a, b], $-\infty < a < b < +\infty$, equipped with the usual sup-norm, and let $A : X \to X$ be defined by
\[ (Af)(t) = \int_a^b k(t, s) f(s)\,ds \quad \forall f \in X,\ \forall t \in [a, b], \]
where $k \in C([a, b] \times [a, b])$. Obviously A is a linear operator. Moreover, it follows from the Arzelà–Ascoli Criterion that A is a compact operator. The key argument here is that the equicontinuity condition is a consequence of the uniform continuity of k.

A compact linear operator is clearly continuous (see Proposition 4.2). Denote
\[ K(X, Y) = \{A \in L(X, Y);\ A \text{ is compact}\}. \]
It is clear that K(X, Y) is a linear subspace of L(X, Y). Moreover, we have the following theorem.

Theorem 4.11. If X is a normed space and Y is a Banach space, then K(X, Y) is a closed linear subspace of L(X, Y), i.e., K(X, Y) is a Banach space with respect to the operator norm (see Theorem 4.6).


Proof. We shall denote by $\|\cdot\|$ all the three norms of X, Y, and L(X, Y). Let $(A_n)$ be a sequence in K(X, Y) which converges to some $A \in L(X, Y)$, namely $\|A_n - A\| \to 0$. Let r > 0 be arbitrary but fixed. For ε > 0 there exists $m \in \mathbb{N}$ sufficiently large such that
\[ \|A_m - A\| < \frac{\varepsilon}{3r}. \tag{4.3.14} \]
Let $(x_n)$ be a sequence in the ball $B(0, r) \subset X$. Since $A_m$ is compact there exists a subsequence of $(x_n)$, say $(x_{n_k})_{k \ge 1}$, such that $(A_m x_{n_k})_{k \ge 1}$ is convergent, hence Cauchy. Thus, for any ε > 0 (which can be the same as above), there exists $N \in \mathbb{N}$ such that
\[ \|A_m x_{n_k} - A_m x_{n_j}\| < \frac{\varepsilon}{3} \quad \forall k, j > N. \tag{4.3.15} \]
Using (4.3.14) and (4.3.15) we deduce
\[ \begin{aligned} \|Ax_{n_k} - Ax_{n_j}\| &\le \|Ax_{n_k} - A_m x_{n_k}\| + \|A_m x_{n_k} - A_m x_{n_j}\| + \|A_m x_{n_j} - Ax_{n_j}\| \\ &\le \|A - A_m\| \cdot \|x_{n_k}\| + \|A_m x_{n_k} - A_m x_{n_j}\| + \|A_m - A\| \cdot \|x_{n_j}\| \\ &< r \cdot \frac{\varepsilon}{3r} + \frac{\varepsilon}{3} + r \cdot \frac{\varepsilon}{3r} = \varepsilon, \end{aligned} \]
in other words, $(Ax_{n_k})$ is Cauchy, hence convergent (Y being a Banach space), and therefore $A \in K(X, Y)$.

Remark 4.12. It is worth pointing out that if $A \in K(X, Y)$, where X is a normed space and Y is a Hilbert space (see Chap. 6), then there exists a sequence $(A_n)_{n \ge 1}$ in L(X, Y), such that the range of $A_n$ is finite dimensional (hence $A_n$ is compact) for all $n \ge 1$ and $\|A_n - A\| \to 0$. For the proof of this nice result see Brezis² [6, Remark 1, pp. 157–158].

4.4 Linear Functionals, Dual Spaces, Weak Topologies

We begin this section by defining the important concept of a dual space.

Definition 4.13. Let $(X, \|\cdot\|)$ be a normed space. Define the dual of X, denoted $X^*$, by
$$X^* = \{f : X \to \mathbb{K};\ f \text{ is linear and continuous}\},$$
so $X^*$ is in fact $L(X, \mathbb{K})$. The elements of $X^*$ are called functionals. Since $(\mathbb{K}, |\cdot|)$ is a Banach space, $X^*$ is also a Banach space with respect to
$$\|f\| = \sup\{|f(v)|;\ v \in X,\ \|v\| \le 1\}.$$
By definition
$$|f(v)| \le \|f\| \cdot \|v\| \quad \forall v \in X,\ \forall f \in X^*.$$

² Haim Brezis, French mathematician, born 1944.

Example 1. Let X be the linear space of all sequences of real numbers $(x_n)_{n\ge 1}$ satisfying
$$\sum_{n=1}^{\infty} |x_n| < \infty.$$
X is usually denoted by $l^1$ and is a Banach space (over $\mathbb{R}$) with respect to the norm
$$\|(x_n)\| = \sum_{n=1}^{\infty} |x_n|.$$
See Exercise 2.19. It is easily seen that any functional $f \in X^*$ has the form
$$f((x_n)) = \sum_{n=1}^{\infty} a_n x_n,$$
where $(a_n)$ is a bounded sequence in $\mathbb{R}$. $X^*$ is usually denoted by $l^\infty$ and is a Banach space with the norm
$$\|(a_n)\|_\infty = \sup_{n\ge 1} |a_n|.$$
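The duality between $l^1$ and $l^\infty$ in Example 1 can be checked on a finite truncation. The following is only a sketch: the coefficient list `a`, the sample vector `x`, and all function names are illustrative assumptions, not part of the text. It verifies the bound $|f(x)| \le \|a\|_\infty \|x\|_1$ and that the bound is attained at a unit vector.

```python
# Finite truncation of Example 1: f(x) = sum_n a_n x_n on l^1, with (a_n) bounded.
def l1_norm(x):
    return sum(abs(t) for t in x)

def sup_norm(a):
    return max(abs(t) for t in a)

def functional(a):
    """Return f with f(x) = sum_n a_n x_n (finite truncation)."""
    return lambda x: sum(an * xn for an, xn in zip(a, x))

a = [0.5, -2.0, 1.25, 0.0]      # hypothetical bounded coefficients
f = functional(a)

# |f(x)| <= ||a||_inf * ||x||_1 for a sample vector x
x = [1.0, -1.0, 3.0, 2.0]
bound_holds = abs(f(x)) <= sup_norm(a) * l1_norm(x)

# The bound is attained at the unit vector e_j where |a_j| is largest.
j = max(range(len(a)), key=lambda i: abs(a[i]))
e_j = [1.0 if i == j else 0.0 for i in range(len(a))]
attained = abs(f(e_j)) == sup_norm(a) * l1_norm(e_j)
```

In this truncated setting the operator norm of f therefore equals the sup-norm of its coefficients, in line with the identification of the dual of $l^1$ with $l^\infty$.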

Example 2. Let $X = C[a,b]$, $-\infty < a < b < +\infty$, with the sup-norm, denoted $\|\cdot\|$. For a fixed $v \in X$ define $f : X \to \mathbb{R}$ by
$$f(u) = \int_a^b u(t)v(t)\,dt \quad \forall u \in X.$$
We see that f is linear and also continuous because
$$|f(u)| \le (b-a)\|v\| \cdot \|u\| \quad \forall u \in X,$$
and therefore $f \in X^*$.


Now, consider the same space $X = C[a,b]$ equipped with another norm, namely the $L^2$-norm, and the same functional f, which can be expressed as the scalar product
$$f(u) = (u,v)_{L^2(a,b)} \quad \forall u \in X.$$
Again, f is linear and by the Bunyakovsky–Cauchy–Schwarz inequality
$$|f(u)| \le \|v\|_{L^2(a,b)} \cdot \|u\|_{L^2(a,b)} \quad \forall u \in X,$$
so $f \in (X, \|\cdot\|_{L^2(a,b)})^*$.

Question: Given $f \in (X, \|\cdot\|_{L^2(a,b)})^*$, does there exist $v \in X = C[a,b]$ such that $f(u) = (u,v)_{L^2(a,b)}$ for all $u \in X$? We shall show later (Theorem 6.10) that such a v exists in $L^2(a,b)$, but not necessarily in $X = C[a,b]$.

In what follows we present the Hahn³–Banach Theorem on the extension of linear (not necessarily continuous) $\mathbb{R}$-valued functionals.

Theorem 4.14 (Hahn–Banach). Let X be a real linear space, and let $p : X \to \mathbb{R}$ be a map which satisfies
$$p(x+y) \le p(x) + p(y) \quad \forall x, y \in X,$$
$$p(\alpha x) = \alpha p(x) \quad \forall \alpha > 0,\ x \in X.$$
If Y is a linear subspace of X and $f : Y \to \mathbb{R}$ is a linear functional satisfying $f(x) \le p(x)$ for all $x \in Y$, then there exists a linear functional $g : X \to \mathbb{R}$ such that
$$g(x) = f(x) \quad \forall x \in Y, \qquad g(x) \le p(x) \quad \forall x \in X.$$

Proof. The case $Y = X$ is trivial, so we assume that Y is a proper subspace of X. Consider the collection E of all linear extensions of f in the above sense, i.e., $h \in E$ if and only if $D(h)$ is a linear subspace of X, $Y \subset D(h)$, h is linear, h extends f, and $h(x) \le p(x)$ for all $x \in D(h)$. Clearly $f \in E$, so E is nonempty. Define on E the order relation
$$h_1 \preceq h_2 \iff D(h_1) \subset D(h_2) \text{ and } h_2(x) = h_1(x)\ \ \forall x \in D(h_1).$$
We wish to apply Zorn's Lemma, so let $G = \{h_i\}_{i\in I}$ be a totally ordered subset of E and consider the functional h defined by
$$D(h) = \bigcup_{i\in I} D(h_i), \qquad h(x) = h_i(x) \ \text{ if } x \in D(h_i) \text{ for some } i \in I.$$
Obviously, h is well defined, belongs to E, and is an upper bound for G. Hence E is inductive, so by Zorn's Lemma E has a maximal element $g \in E$.

To complete the proof let us show that $D(g) = X$. Assume by contradiction that this is not the case, so there exists $x_0 \in X \setminus D(g)$. Consider $Z = \operatorname{Span}(\{x_0\} \cup D(g))$, and define on Z a linear functional $\tilde g$ of the form
$$\tilde g(t x_0 + x) = \alpha t + g(x), \quad t \in \mathbb{R},\ x \in D(g),$$
where $\alpha$ is a real parameter. We shall prove that there exists an $\alpha$ such that $\tilde g \in E$, i.e.,
$$\alpha t + g(x) \le p(t x_0 + x) \quad \forall x \in D(g),\ t \in \mathbb{R}. \tag{4.4.16}$$
In particular,
$$g(x) + \alpha \le p(x + x_0) \quad \forall x \in D(g),$$
$$g(y) - \alpha \le p(y - x_0) \quad \forall y \in D(g),$$
hence $\alpha$ should satisfy
$$g(y) - p(y - x_0) \le \alpha \le p(x + x_0) - g(x) \quad \forall x, y \in D(g),$$
which is equivalent to
$$\sup_{y\in D(g)} [g(y) - p(y - x_0)] \le \alpha \le \inf_{x\in D(g)} [p(x + x_0) - g(x)].$$
Such an $\alpha$ exists indeed, since
$$g(y) - p(y - x_0) \le p(x + x_0) - g(x) \iff g(x+y) \le p(x + x_0) + p(y - x_0),$$
and the latter is valid for all $x, y \in D(g)$ because $g(x+y) \le p(x+y) \le p(x + x_0) + p(y - x_0)$. It is easy to check that $\tilde g$ with this $\alpha$ satisfies (4.4.16): for $t > 0$ divide (4.4.16) by t and use the upper bound on $\alpha$ with $x/t$ in place of x; for $t < 0$ divide by $-t$ and use the lower bound with $y = -x/t$. So $\tilde g \in E$. But $\tilde g$ is a proper extension of g (since $D(g)$ is a proper subset of $D(\tilde g) = Z$) and this contradicts the maximality of g.

³ Hans Hahn, Austrian mathematician, 1879–1934.
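The one-dimensional extension step in the proof can be made concrete in $X = \mathbb{R}^2$. The sketch below is an illustration under assumed choices that do not come from the text: $D(g) = \{(s,0)\}$, $g(s,0) = s$, $x_0 = (0,1)$, and two sublinear maps `p1` (the $l^1$-norm) and `p2` (the Euclidean norm). It approximates the admissible interval for $\alpha$ over a finite grid of values of s.

```python
import math

def alpha_interval(p, grid):
    """Approximate [sup_y (g(y) - p(y - x0)), inf_x (p(x + x0) - g(x))] over a grid of s.

    Here D(g) = {(s, 0)}, g(s, 0) = s and x0 = (0, 1) (illustrative choices).
    """
    lower = max(s - p((s, -1.0)) for s in grid)   # y = (s, 0), y - x0 = (s, -1)
    upper = min(p((s, 1.0)) - s for s in grid)    # x = (s, 0), x + x0 = (s, 1)
    return lower, upper

def p1(v):
    """l^1-norm on R^2."""
    return abs(v[0]) + abs(v[1])

def p2(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])

grid = [k / 10.0 for k in range(-1000, 1001)]
lo1, hi1 = alpha_interval(p1, grid)   # approximately [-1, 1]
lo2, hi2 = alpha_interval(p2, grid)   # a small interval around 0
```

With the $l^1$-norm every $\alpha$ in roughly $[-1, 1]$ gives an admissible extension, so the extension is not unique; with the Euclidean norm the interval shrinks towards the single value $\alpha = 0$ as the grid is enlarged.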


Corollary 4.15. Let $(X, \|\cdot\|)$ be a normed space and let Y be a linear subspace of X. If $f \in Y^* := (Y, \|\cdot\|)^*$, then there exists an extension g of f such that $g \in X^* := (X, \|\cdot\|)^*$ and $\|g\|_{X^*} = \|f\|_{Y^*}$.

Proof. If $\mathbb{K} = \mathbb{R}$ then we can apply the Hahn–Banach Theorem with $p(x) = \|f\|_{Y^*}\|x\|$ to derive the existence of a linear extension $g : X \to \mathbb{R}$ satisfying
$$g(x) \le \|f\|_{Y^*}\|x\| \quad \forall x \in X.$$
Since $-g(x) = g(-x)$ satisfies a similar inequality, we have $g \in X^*$ and $\|g\|_{X^*} \le \|f\|_{Y^*}$. Obviously, the converse inequality is also satisfied, so $\|g\|_{X^*} = \|f\|_{Y^*}$.

If $\mathbb{K} = \mathbb{C}$ define $q(x) := \operatorname{Re} f(x)$ for all $x \in Y$. Then
$$f(x) = q(x) - i\,q(ix) \quad \forall x \in Y,$$
and
$$|q(x)| \le \|f\|_{Y^*}\|x\| \quad \forall x \in Y. \tag{4.4.17}$$
Now, if we regard X, Y as real linear spaces and take into account (4.4.17), we deduce from the first part of the proof the existence of a continuous linear functional $h : X \to \mathbb{R}$ which extends q and satisfies
$$|h(x)| \le \|f\|_{Y^*}\|x\| \quad \forall x \in X. \tag{4.4.18}$$
Set
$$g(x) = h(x) - i\,h(ix), \quad x \in X.$$
The functional $g : X \to \mathbb{C}$ is an extension of f and is linear on the complex space X. Let us prove that $|g(x)| \le \|f\|_{Y^*}\|x\|$ for all $x \in X$. Indeed, for each $x \in X$, $g(x)$ can be written as $g(x) = re^{i\theta}$, $r \ge 0$, so by (4.4.18)
$$|g(x)| = r = \operatorname{Re}\big(e^{-i\theta} g(x)\big) = \operatorname{Re}\, g(e^{-i\theta}x) = h(e^{-i\theta}x) \le \|f\|_{Y^*}\|e^{-i\theta}x\| = \|f\|_{Y^*}\|x\| .$$


Therefore, $g \in X^*$ and $\|g\|_{X^*} \le \|f\|_{Y^*}$. As the converse inequality is trivially satisfied, we have $\|g\|_{X^*} = \|f\|_{Y^*}$.

Remark 4.16. In fact, even Theorem 4.14 above can be extended to the complex case $\mathbb{K} = \mathbb{C}$ by a similar procedure.

Corollary 4.17. Let $(X, \|\cdot\|)$ be a normed space. Then for every $x_0 \in X \setminus \{0\}$ there exists a functional $g \in X^*$ such that $\|g\|_{X^*} = 1$ and $g(x_0) = \|x_0\|$.

Proof. Apply Corollary 4.15 with $Y = \operatorname{Span}\{x_0\}$ and $f : Y \to \mathbb{K}$ defined by $f(x) = t\|x_0\|$ for $x = t x_0$, $t \in \mathbb{K}$.

Corollary 4.18. Let $(X, \|\cdot\|)$ be a normed space. Then for every $x \in X$ we have
$$\|x\| = \sup\{|f(x)|;\ f \in X^*,\ \|f\|_{X^*} \le 1\}, \tag{4.4.19}$$
where the sup is attained.

Proof. For $x = 0$, (4.4.19) is obvious. Let $x \in X \setminus \{0\}$ and denote by a the right-hand side of (4.4.19). Clearly, $a \le \|x\|$. In fact, $a = \|x\|$ by virtue of Corollary 4.17.

Remark 4.19. Let $(X, \|\cdot\|)$ be a normed space. Define
$$J(x) = \{x^* \in X^*;\ \|x^*\|_{X^*} = \|x\|,\ x^*(x) = \|x\|^2\}.$$
From Corollary 4.17 we see that $J(x)$ is nonempty for all $x \in X$. In general, $J(x)$ is not a singleton, but there are cases when this happens for all $x \in X$ (e.g., if X is a Hilbert space, as will be shown later). The set-valued map $x \mapsto J(x)$ is called the duality map from X to $X^*$.

Recall that, given a normed space $(X, \|\cdot\|)$, the strong (norm) topology of X is the metric topology generated by $d(x,y) = \|x - y\|$ for $x, y \in X$. In fact, we can consider that X is a Banach space (in other words, $\|\cdot\|$ is complete, or d is complete), otherwise we can use the completion procedure (see Theorem 2.8) to reach this framework.


Definition 4.20. The weak topology of X is the one generated by neighborhoods of the origin of the form
$$V_{x_1^*, x_2^*, \dots, x_m^*;\,\varepsilon} = \{x \in X;\ |x_j^*(x)| < \varepsilon,\ j = 1, 2, \dots, m\},$$
for all finite systems of functionals $\{x_1^*, x_2^*, \dots, x_m^*\} \subset X^*$ and for all $\varepsilon > 0$. We write $x_n \xrightarrow{w} x$ or $x_n \rightharpoonup x$ to mean convergence in the weak topology, i.e., $x^*(x_n) \to x^*(x)$ for all $x^* \in X^*$.

Remark 4.21. If $x_n \to x$, i.e., $\|x_n - x\| \to 0$, then $x_n \xrightarrow{w} x$. Indeed, for all $x^* \in X^*$,
$$|x^*(x_n) - x^*(x)| = |x^*(x_n - x)| \le \|x^*\| \cdot \|x_n - x\|,$$
which tends to 0. The converse is not true in general, and we shall see some examples later. However, if X is finite dimensional then strong and weak convergence are equivalent. Indeed, by choosing particular functionals, one can see that weak convergence reduces to convergence on coordinates.

Definition 4.22. In $X^*$, besides the strong topology and the weak topology, defined by means of functionals from $X^{**} := (X^*)^*$ (the bidual of X), we have the so-called weak-star topology $w^*$, starting from another neighborhood basis consisting of
$$V_{x_1, x_2, \dots, x_m;\,\varepsilon} = \{x^* \in X^*;\ |x^*(x_j)| < \varepsilon,\ j = 1, 2, \dots, m\},$$
for all finite systems $\{x_1, x_2, \dots, x_m\} \subset X$ and for all $\varepsilon > 0$. So convergence $x_n^* \xrightarrow{w^*} x^*$ means $x_n^*(x) \to x^*(x)$ for all $x \in X$, i.e., pointwise convergence for a sequence of functionals. In general this is different from w-convergence.

In general X is embedded into $X^{**}$, which is to say that there is an injection $i : x \mapsto f_x$ defined by $f_x(x^*) = x^*(x)$ for all $x^* \in X^*$. Clearly, $i \in L(X, X^{**})$ since
$$|f_x(x^*)| \le \|x^*\| \cdot \|x\|.$$
Moreover, using Corollary 4.17, we see that i is an isometry. If $i : X \to X^{**}$ is onto (surjective), then X is said to be reflexive. In particular Hilbert spaces are reflexive, as will be shown later.

Remark 4.23. It is easily seen that if X is reflexive then $w = w^*$ on $X^*$.

4.5 Exercises

1. Let X, Y be linear spaces. Find a necessary and sufficient condition for a subset $G \subset X \times Y$ to be the graph of a linear operator from X into Y.

2. Let X, Y be normed spaces over $\mathbb{R}$. Show that if $A : X \to Y$ is a continuous operator satisfying the condition
$$A(x_1 + x_2) = Ax_1 + Ax_2 \quad \forall x_1, x_2 \in X,$$
then A is linear (hence $A \in L(X,Y)$).

3. Let $-\infty < a < b < +\infty$. Find the operator norm of $A \in L(X)$ given by $(Af)(t) = t f(t)$, $t \in [a,b]$, $f \in X$, when
(i) $X = C[a,b]$ with the sup-norm;
(ii) $X = L^p(a,b)$, with the usual norm, for some $1 \le p < \infty$.

4. Let $X = C[a,b]$, where $-\infty < a < b < +\infty$. Assume that X is equipped with the usual sup-norm and consider the operator A defined by
$$(Af)(t) = \int_a^t g(s) f(s)\,ds, \quad f \in X,\ t \in [a,b],$$
where g is a given function in $L^1(a,b)$ with $g(s) \ge 0$ for almost all $s \in (a,b)$. Show that A is a compact linear operator from X into itself (i.e., $A \in K(X) \subset L(X)$) and calculate $\|A\|$.

5. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be normed spaces. Show that a linear operator $A : X \to Y$ is continuous if and only if the following implication holds:
$$(*)\quad x_n \in X,\ \|x_n\|_X \to 0 \implies (\|Ax_n\|_Y) \text{ is a bounded sequence.}$$

6. Let $(X, \|\cdot\|_X)$ be a normed space and let $(Y, \|\cdot\|_Y)$ be a Banach space. Show that, for any sequence $(A_n)_{n\in\mathbb{N}}$ in $L(X,Y)$ satisfying $\|A_n\| \le a_n$ for all $n \in \mathbb{N}$ with $\sum_{n=1}^{\infty} a_n < \infty$, the series $\sum_{n=1}^{\infty} A_n$ is convergent in $L(X,Y)$.

7. Let $(X, \|\cdot\|)$ be a Banach space. Show that
(i) for all $A \in L(X)$ the series
$$I + \frac{1}{1!}A + \frac{1}{2!}A^2 + \cdots + \frac{1}{n!}A^n + \cdots$$
is convergent in $L(X)$ with its usual operator norm (the sum of the series being denoted $e^A$). Here I denotes the identity operator on X.
(ii) for all $A \in L(X)$ with $\|A\| < 1$, $I - A$ is invertible and $(I - A)^{-1} \in L(X)$.

8. Let $(X, \|\cdot\|)$ be a Banach space. Show that for every pair of operators $A, B \in L(X)$ that commute (i.e., $AB = BA$) one has $e^A e^B = e^{A+B}$ (for the notation see the previous exercise).

9. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be Banach spaces. Let $(T_n)_{n\in\mathbb{N}}$ be a sequence in $L(X,Y)$ which is pointwise convergent, i.e., for all $x \in X$ there exists $y_x \in Y$ such that $\|T_n x - y_x\|_Y \to 0$. Define $T : X \to Y$ by $Tx = y_x$, $x \in X$. Show that
(a) $(\|T_n\|)_{n\in\mathbb{N}}$ is bounded in $(\mathbb{K}, |\cdot|)$;
(b) $T \in L(X,Y)$;
(c) $\|T\| \le \liminf \|T_n\|$.

10. Let $(X, \|\cdot\|)$ be a Banach space and let S be a nonempty subset of X such that for all $f \in X^*$ the set $f(S) = \{f(x);\ x \in S\}$ is bounded in $(\mathbb{K}, |\cdot|)$. Prove that S is bounded in $(X, \|\cdot\|)$.

11. Let X be a Banach space and let $A : X \to X^*$ be a linear operator satisfying
$$(Ax)(y) = (Ay)(x) \quad \forall x, y \in X.$$
Show that A is a continuous operator, i.e., $A \in L(X, X^*)$.

12. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be Banach spaces. If $A : D(A) \subset X \to Y$ is a linear closed operator with $D(A)$ closed in $(X, \|\cdot\|_X)$, then prove there exists a constant $C > 0$ such that $\|Ax\|_Y \le C\|x\|_X$ for all $x \in D(A)$.

13. Let X be a Banach space and let $A : X \to X^*$ be a linear operator satisfying $(Ax)(x) \ge 0$ for all $x \in X$. Show that $A \in L(X, X^*)$.

14. Let X be a linear space, equipped with two norms, $\|\cdot\|_1$ and $\|\cdot\|_2$, such that X is Banach for both norms. Assume there exists a constant $C > 0$ such that $\|x\|_2 \le C\|x\|_1$ for all $x \in X$. Show that $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent.

15. Let X be an n-dimensional linear space, with $n \in \mathbb{N}$. Let $B = \{u_1, u_2, \dots, u_n\}$ be a basis in X. For any linear functional $f : X \to \mathbb{K}$ we have
$$f(u) = \sum_{i=1}^{n} \alpha_i f_i \quad \forall u = \sum_{i=1}^{n} \alpha_i u_i \in X,$$
where $f_i := f(u_i)$, $i = 1, 2, \dots, n$. Obviously, any such f is continuous with respect to any norm of X, i.e., $f \in X^*$. Compute the norm of f, $\|f\|_{X^*}$, explicitly, in terms of the $f_i$'s, when the norm of X is defined by
(i) $\|u\|_\infty = \max_{1\le i\le n} |\alpha_i|$ for all $u = \sum_{i=1}^{n} \alpha_i u_i \in X$;
(ii) $\|u\|_1 = \sum_{i=1}^{n} |\alpha_i|$ for all $u = \sum_{i=1}^{n} \alpha_i u_i \in X$;
(iii) $\|u\|_p = \big(\sum_{i=1}^{n} |\alpha_i|^p\big)^{1/p}$ for all $u = \sum_{i=1}^{n} \alpha_i u_i \in X$, where $p \in (1, \infty)$.

16. Let $X = \{u \in C[0,1];\ u(0) = 0\}$ with the usual sup-norm. Let $f : X \to \mathbb{R}$ be defined by
$$f(u) = \int_0^1 u(s)\,ds \quad \forall u \in X.$$
Show that $f \in X^*$ and compute $\|f\|_{X^*}$. Can one find some $u \in X$ such that $\|u\|_{\sup} = 1$ and $f(u) = \|f\|_{X^*}$?

Chapter 5

Distributions, Sobolev Spaces

In this chapter we first present test functions, which are then used to introduce scalar distributions. The space $\mathcal{D}'(\Omega)$ of distributions is analyzed in detail and some related applications are discussed: the interpretation of the density of a mass concentrated at a point by means of the Dirac distribution, solving the Poisson equation in $\mathcal{D}'(\Omega)$, solving ordinary differential equations in $\mathcal{D}'(\mathbb{R})$, solving the equation of the vibrating string with non-smooth initial displacement function, and the boundary controllability for a problem associated with the same wave equation. We also introduce and discuss Sobolev spaces. In order to introduce vector distributions we shall present in a separate section the Bochner integral for vector functions. Vector distributions and $W^{k,p}(a,b;X)$ spaces are then presented. These will later be used in solving problems associated with parabolic and hyperbolic PDEs.

5.1 Test Functions

Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set in $\mathbb{R}^k$ (which is equipped with the usual topology). For $u : \Omega \to \mathbb{R}$ define the support of u by
$$\operatorname{supp} u = \overline{\{x \in \Omega;\ u(x) \ne 0\}}.$$
For a given $m \in \mathbb{N}$, let $C^m(\Omega)$ denote the set of all functions $u : \Omega \to \mathbb{R}$ such that u, and all its n-th order partial derivatives, $1 \le n \le m$, exist and are continuous. Further let
$$C_0^m(\Omega) = \{u \in C^m(\Omega);\ \operatorname{supp} u \text{ is a compact (bounded) set} \subset \Omega\}.$$
For $m = \infty$ extend the definitions above in the obvious way. The elements (functions) in $C_0^\infty(\Omega)$ are called test functions since they serve as arguments of distributions that will be defined later.

© Springer Nature Switzerland AG 2019. G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_5

A typical example of a test function is $\varphi : \Omega = \mathbb{R}^k \to \mathbb{R}$ defined by
$$\varphi(x) = \begin{cases} \exp\Big(\dfrac{1}{\|x\|_2^2 - 1}\Big), & \|x\|_2 < 1, \\[1ex] 0, & \|x\|_2 \ge 1, \end{cases}$$
where $\|\cdot\|_2$ is the Euclidean norm. The function $\varphi \in C^\infty(\mathbb{R}^k)$ (prove it!) and so $\varphi \in C_0^\infty(\mathbb{R}^k)$ with $\operatorname{supp}\varphi = \overline{B(0,1)}$. For later use we also define
$$\omega(x) = C\varphi(x) \ \text{ with } C > 0 \text{ such that } \int_{\mathbb{R}^k} \omega(x)\,dx = 1. \tag{5.1.1}$$
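The normalising constant C in (5.1.1) has no closed form, but it is easy to approximate. The sketch below treats only the case $k = 1$ and uses a composite midpoint rule; the helper `integrate` and the number of subintervals are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def phi(x):
    """Bump function phi(x) = exp(1/(x^2 - 1)) for |x| < 1, and 0 otherwise (case k = 1)."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def integrate(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

mass = integrate(phi, -1.0, 1.0)   # approximates the integral of phi over R
C = 1.0 / mass                     # normalising constant in (5.1.1) for k = 1

def omega(x):
    """Normalised test function omega = C * phi."""
    return C * phi(x)
```

In one dimension the integral of $\varphi$ is about 0.444, so C is about 2.25; in higher dimensions the same idea applies with a k-dimensional quadrature.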

Obviously, $C_0^\infty(\Omega)$ is a real linear space with respect to the usual operations (addition of functions and scalar multiplication). In what follows, we introduce the usual topology on $C_0^\infty(\Omega)$. To this purpose, we must first discuss a few important concepts.

Seminorms, Locally Convex Spaces, Inductive Limit

Let X be a linear space over $\mathbb{K}$ (as usual, $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$). A function $p : X \to \mathbb{R}$ is called a seminorm if the following conditions are satisfied:
(j) $p(x+y) \le p(x) + p(y)$, $x, y \in X$,
(jj) $p(\alpha x) = |\alpha| p(x)$, $\alpha \in \mathbb{K}$, $x \in X$.

If p is a seminorm, then $p(0) = 0$, while the case when $p(x) = 0$ for some $x \ne 0$ is not excluded. We also have
$$|p(x_1) - p(x_2)| \le p(x_1 - x_2) \quad \forall x_1, x_2 \in X, \tag{5.1.2}$$
which in particular (taking $x_2 = 0$) shows that $p(x) \ge 0$ for all $x \in X$. Indeed, by (j), $p(x_1) - p(x_2) \le p(x_1 - x_2)$ and so $p(x_1 - x_2) = p(x_2 - x_1) \ge p(x_2) - p(x_1)$. Obviously, (5.1.2) follows from these two inequalities.

We will use seminorms to equip X with a topology. If p is a seminorm and M is the set $\{x \in X;\ p(x) < \varepsilon\}$, where $\varepsilon$ is a positive constant, then, obviously, $0 \in M$ and M is convex, balanced (i.e., $x \in M$ and $|\alpha| \le 1$ implies $\alpha x \in M$), and absorbing (i.e., for any $x \in X$ there exists an $\alpha > 0$ such that $\alpha^{-1}x \in M$).

Let $F = \{p_i : X \to \mathbb{R};\ i \in I\}$ be a family of seminorms satisfying the axiom of separation: for any $y \in X$, $y \ne 0$, there exists $j \in I$ such that $p_j(y) \ne 0$. Consider the collection $\mathcal{V}(0)$ of all sets which are finite intersections of sets $\{x \in X;\ p_i(x) < \varepsilon_i\}$, $i \in I$, $\varepsilon_i > 0$. Such an intersection looks like
$$V = \{x \in X;\ p_i(x) < \varepsilon_i,\ i = 1, \dots, n\},$$
where $\{p_1, \dots, p_n\} \subset F$ and $\{\varepsilon_1, \dots, \varepsilon_n\} \subset (0, \infty)$. Obviously, V is a convex, balanced, and absorbing set. Each $V \in \mathcal{V}(0)$ is considered to be a neighborhood of $0 \in X$ and $y + V := \{y + v;\ v \in V\}$ a neighborhood of any $y \in X$. Now, a set $D \subset X$ which is a neighborhood of any $y \in D$ is called open. Indeed, the collection $\tau$ of all such sets, plus $\emptyset \subset X$, satisfies the axioms of a topology, so $(X, \tau)$ is a topological space.

Using the separating property of F we can infer that singletons are closed sets. Indeed, let $y \in X$ be a given point. For each $x \in X$, $x \ne y$, let $D_x$ be an open set containing x but not y. Then $D = \bigcup_{x\ne y} D_x$ is open and its complement is $\{y\}$, so the singleton $\{y\}$ is closed, as claimed. Note that if F does not satisfy the axiom of separation then the closedness of singletons is not guaranteed.

It is easily seen that the mappings $X \times X \ni (x,y) \mapsto x + y \in X$ and $\mathbb{K} \times X \ni (\alpha, x) \mapsto \alpha x \in X$ are both continuous, so X is a topological linear space.

Definition. A topological linear space X is called locally convex if every open set containing 0 includes a convex, balanced, and absorbing open subset.

To summarize, we can say that any linear space X equipped (as above) with the topology generated by a family of seminorms $\{p_i;\ i \in I\}$ satisfying the axiom of separation is a locally convex space in which any seminorm $p_i$ is continuous (cf. (5.1.2)).

Conversely, any locally convex space X is a topological linear space whose topology is generated by a collection of seminorms. In order to show this, we define for a convex, balanced, and absorbing set $M \subset X$ the so-called Minkowski functional:
$$p_M(x) = \inf\{\alpha;\ \alpha > 0,\ \alpha^{-1}x \in M\}, \quad x \in X.$$
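The Minkowski functional can be approximated numerically from a membership test alone. The following is a sketch under stated assumptions: the set M (an open ellipse in $\mathbb{R}^2$), the bisection bounds, and the names `minkowski` and `in_ellipse` are all hypothetical choices made for illustration.

```python
import math

def minkowski(in_M, x, hi=1e6, tol=1e-12):
    """Approximate p_M(x) = inf{a > 0 : x/a in M} by bisection.

    Assumes M is convex, balanced and absorbing, so that {a > 0 : x/a in M}
    is an interval unbounded above, and that x/hi already lies in M.
    """
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if in_M(tuple(t / mid for t in x)):
            hi = mid      # x/mid is in M, so p_M(x) <= mid
        else:
            lo = mid      # x/mid is not in M, so p_M(x) >= mid
    return hi

# Hypothetical example: M is the open ellipse x1^2/4 + x2^2 < 1 in R^2,
# whose Minkowski functional is p_M(x) = sqrt(x1^2/4 + x2^2).
def in_ellipse(v):
    return v[0] ** 2 / 4.0 + v[1] ** 2 < 1.0

x, y = (3.0, 4.0), (-1.0, 2.0)
p_x = minkowski(in_ellipse, x)
p_y = minkowski(in_ellipse, y)
p_sum = minkowski(in_ellipse, (x[0] + y[0], x[1] + y[1]))
```

For this ellipse the computed values agree with the closed-form expression, and `p_sum <= p_x + p_y` reflects the subadditivity of the Minkowski functional of a convex set.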


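Observe that $\{x \in X;\ p_M(x) < 1\} \subset M \subset \{x \in X;\ p_M(x) \le 1\}$. $p_M$ is a seminorm on X. Indeed, by the convexity of M and the obvious relations
$$\frac{x}{p_M(x) + \varepsilon} \in M, \qquad \frac{y}{p_M(y) + \varepsilon} \in M, \qquad \varepsilon > 0,$$
we deduce
$$\frac{p_M(x) + \varepsilon}{p_M(x) + p_M(y) + 2\varepsilon} \cdot \frac{x}{p_M(x) + \varepsilon} + \frac{p_M(y) + \varepsilon}{p_M(x) + p_M(y) + 2\varepsilon} \cdot \frac{y}{p_M(y) + \varepsilon} \in M,$$
i.e., $(x+y)/(p_M(x) + p_M(y) + 2\varepsilon) \in M$. Hence
$$p_M(x+y) \le p_M(x) + p_M(y) + 2\varepsilon \ \ \forall \varepsilon > 0 \implies p_M(x+y) \le p_M(x) + p_M(y).$$
We also have $p_M(\alpha x) = |\alpha| p_M(x)$, since M is balanced.

So, the topology of a given locally convex space X is the one generated by the collection of seminorms obtained as the Minkowski functionals associated with convex, balanced, and absorbing open subsets of X.

Definition. Let X be a linear space over $\mathbb{K}$ and let $\{X_\alpha;\ \alpha \in J\}$ be a collection of linear subspaces of X such that $X = \bigcup_{\alpha\in J} X_\alpha$. Suppose that each $X_\alpha$ is a locally convex space such that, if $X_{\alpha_1} \subset X_{\alpha_2}$, then the topology of $X_{\alpha_1}$ coincides with the relative topology of $X_{\alpha_1}$ as a subset of $X_{\alpha_2}$. Every convex, balanced, and absorbing set $D \subset X$ is considered open if and only if $D \cap X_\alpha$ is an open set of $X_\alpha$ containing $0 \in X_\alpha$ for all $\alpha \in J$. If X is a locally convex space with respect to the topology defined in this way, then X is called the inductive limit of the $X_\alpha$'s.

Now let us return to $C_0^\infty(\Omega)$. For any compact $K \subset \Omega$ define the set
$$\mathcal{D}_K(\Omega) = \{\varphi \in C_0^\infty(\Omega);\ \operatorname{supp}\varphi \subset K\},$$
which is a linear subspace of $C_0^\infty(\Omega)$. For $m \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$ and $K \subset \Omega$ compact,
$$p_{K,m}(\varphi) = \sup_{x\in K,\ |\alpha|\le m} |D^\alpha \varphi(x)|$$
is a seminorm on $\mathcal{D}_K(\Omega)$, where $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_k) \in \mathbb{N}_0^k$, $|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_k$, and the $\alpha$-derivative of $\varphi$ is defined as
$$D^\alpha \varphi = \frac{\partial^{|\alpha|}\varphi}{\partial x_1^{\alpha_1}\,\partial x_2^{\alpha_2} \cdots \partial x_k^{\alpha_k}}.$$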


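Note that the order of differentiation is not important since $\varphi$ is a smooth function. If $\alpha = (0, 0, \dots, 0)$, then $D^\alpha \varphi = \varphi$ by convention. Then $\mathcal{D}_K(\Omega)$ is a locally convex space and, if $K_1 \subset K_2$, the topology of $\mathcal{D}_{K_1}(\Omega)$ coincides with the relative topology of $\mathcal{D}_{K_1}(\Omega)$ as a subset of $\mathcal{D}_{K_2}(\Omega)$. Then $C_0^\infty(\Omega)$ can be regarded as the inductive limit of the $\mathcal{D}_K(\Omega)$'s, where K ranges over all compact subsets of $\Omega$. The space $C_0^\infty(\Omega)$, topologized in this way, is denoted by $\mathcal{D}(\Omega)$. One of the seminorms defining the topology of $\mathcal{D}(\Omega)$ is
$$p(\varphi) = \sup_{x\in\Omega} |\varphi(x)|, \quad \varphi \in C_0^\infty(\Omega).$$
If $D = \{\varphi \in C_0^\infty(\Omega);\ p(\varphi) < 1\}$ and K is a compact subset of $\Omega$, then
$$D \cap \mathcal{D}_K(\Omega) = \{\varphi \in \mathcal{D}_K(\Omega);\ p_K(\varphi) := \sup_{x\in K} |\varphi(x)| < 1\}.$$

Theorem 5.1. Convergence of a sequence $\varphi_n \to 0$ in $\mathcal{D}(\Omega)$ means that the following conditions are satisfied:
(a) there exists a compact set $K \subset \Omega$ such that $\operatorname{supp}\varphi_n \subset K$ for all n;
(b) $D^\alpha \varphi_n \to 0$ uniformly on K as $n \to \infty$ for all $\alpha \in \mathbb{N}_0^k$.

Proof. If (a) is satisfied, then (b) is satisfied, too. So all we need to do is to prove (a). Assume by contradiction that (a) is not satisfied. So there exists a sequence $(x_j)_{j\ge 1}$ in $\Omega$ with no cluster point in $\Omega$ and a subsequence $(\varphi_{n_j})_{j\ge 1}$ such that $\varphi_{n_j}(x_j) \ne 0$ for all $j \ge 1$. Define a seminorm $p : C_0^\infty(\Omega) \to \mathbb{R}$ by
$$p(\varphi) = 2\sum_{j=1}^{\infty} \sup_{x\in K_j\setminus K_{j-1}} \frac{|\varphi(x)|}{|\varphi_{n_j}(x_j)|}, \quad \varphi \in C_0^\infty(\Omega),$$
where the sequence of compacts $K_1 \subset K_2 \subset \cdots \subset \Omega$ satisfies $\bigcup_{j\ge 1} K_j = \Omega$ and $x_j \in K_j \setminus K_{j-1}$, $j = 1, 2, \dots$, $K_0 = \emptyset$. Clearly, the set $V = \{\varphi \in C_0^\infty(\Omega);\ p(\varphi) < 1\}$ is a neighborhood of $0 \in C_0^\infty(\Omega)$ and none of the $\varphi_{n_j}$ belongs to V, which gives a contradiction.

Obviously, the convergence $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$ means that $(\varphi_n)$ satisfies condition (a) with some compact $K \subset \Omega$, and $D^\alpha \varphi_n \to D^\alpha \varphi$ uniformly on K as $n \to \infty$ for all $\alpha \in \mathbb{N}_0^k$.

Example 1. For $\Omega = \mathbb{R}^k$ let $\varphi_n(x) = \frac{1}{n}\omega(x)$, where $\omega$ is the test function defined by (5.1.1). Then $K = \overline{B(0,1)}$ and all derivatives of $\varphi_n$ converge uniformly to 0, so $\varphi_n \to 0$ in $\mathcal{D}(\Omega)$.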


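Example 2. For $\Omega = \mathbb{R}^k$ let $\psi_n(x) = \frac{1}{n}\omega\big(\frac{1}{n}x\big)$ for $x \in \mathbb{R}^k$. Then $D^\alpha \psi_n \to 0$ uniformly as $n \to \infty$ for all $\alpha \in \mathbb{N}_0^k$, but there is no K satisfying (a). In fact $\operatorname{supp}\psi_n = \overline{B(0,n)}$, therefore $(\psi_n)$ does not converge in $\mathcal{D}(\Omega)$.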

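5.2 Friedrichs' Mollification

Friedrichs' mollification will allow us to associate with "bad functions" very good approximate functions.¹ Consider again the test function $\omega : \mathbb{R}^k \to \mathbb{R}$ defined in the previous section, i.e.,
$$\omega(x) = \begin{cases} C\exp\Big(\dfrac{1}{\|x\|_2^2 - 1}\Big), & \|x\|_2 < 1, \\[1ex] 0, & \|x\|_2 \ge 1, \end{cases}$$
with $C > 0$ such that
$$\int_{\mathbb{R}^k} \omega(x)\,dx = \int_{B(0,1)} \omega(x)\,dx = 1.$$

Definition 5.2. For $\varepsilon > 0$ define
$$\omega_\varepsilon(x) = \frac{1}{\varepsilon^k}\,\omega\Big(\frac{1}{\varepsilon}x\Big) \quad \text{for all } x \in \mathbb{R}^k.$$
This is called the mollifier.

The mollifier $\omega_\varepsilon$ has the following properties:
1. $\omega_\varepsilon \in C^\infty(\mathbb{R}^k)$;
2. $\operatorname{supp}\omega_\varepsilon = \overline{B(0,\varepsilon)}$;
3. $\int_{\mathbb{R}^k} \omega_\varepsilon(x)\,dx = \int_{B(0,\varepsilon)} \omega_\varepsilon(x)\,dx = 1$.

Definition 5.3. Let $f \in L^1_{loc}(\mathbb{R}^k)$, i.e., f is a real measurable function and $f \in L^1(K)$ for any compact $K \subset \mathbb{R}^k$. For $\varepsilon > 0$ define $f_\varepsilon$, the Friedrichs mollification of f, as
$$f_\varepsilon(x) = (\omega_\varepsilon * f)(x),$$
where $*$ denotes the convolution product,
$$= \int_{\mathbb{R}^k} \omega_\varepsilon(x - y) f(y)\,dy,$$
and by changing variables,
$$= \int_{\mathbb{R}^k} \omega_\varepsilon(y) f(x - y)\,dy = \int_{B(0,\varepsilon)} \omega_\varepsilon(y) f(x - y)\,dy$$
for almost all $x \in \mathbb{R}^k$.

¹ Kurt Otto Friedrichs, German-American mathematician, 1901–1982.

If $f \in L^1_{loc}(\Omega)$, then f can be extended as $f = 0$ for $x \in \mathbb{R}^k \setminus \Omega$, and we can define $f_\varepsilon$ as before. For $\varepsilon > 0$ and $f \in L^1_{loc}(\mathbb{R}^k)$, we have
1. $f_\varepsilon \in C^\infty(\mathbb{R}^k)$;
2. $\operatorname{supp} f_\varepsilon \subset \operatorname{supp} f + \overline{B(0,\varepsilon)}$, i.e., not much larger than $\operatorname{supp} f$;
3. If f has compact support, so does $f_\varepsilon$.

Proposition 5.4. If $f \in C_0(\Omega)$, then $f_\varepsilon(x) \to f(x)$ uniformly in $\Omega$ as $\varepsilon \to 0^+$, where
$$C_0(\Omega) = \{u \in C(\Omega);\ u \text{ has compact (bounded) support} \subset \Omega\}.$$

Proof. Set $K = \operatorname{supp} f$ and $K' = K + \overline{B(0,\varepsilon_0)}$, where $\varepsilon_0 > 0$. Then $\operatorname{supp} f_\varepsilon \subset K' \subset \Omega$ for $0 < \varepsilon \le \varepsilon_0$, if $\varepsilon_0$ is small enough. For $0 < \varepsilon \le \varepsilon_0$ and $x \in K'$,
$$|f_\varepsilon(x) - f(x)| = \Big|\int_{\mathbb{R}^k} f(x - y)\omega_\varepsilon(y)\,dy - \int_{\mathbb{R}^k} f(x)\omega_\varepsilon(y)\,dy\Big| \le \int_{B(0,\varepsilon)} |f(x - y) - f(x)|\,\omega_\varepsilon(y)\,dy.$$
f is continuous with compact support, hence uniformly continuous, so for any $\eta > 0$, $|f(x - y) - f(x)| < \eta$ for all $y \in B(0,\varepsilon)$ with $\varepsilon > 0$ small. Thus $\sup_{x\in\Omega} |f_\varepsilon(x) - f(x)| \le \eta$ for all $\varepsilon > 0$ sufficiently small, hence $f_\varepsilon \to f$ uniformly in $\Omega$ as $\varepsilon \to 0^+$.

Theorem 5.5. If $f \in L^p(\Omega)$ for some $1 \le p < \infty$, then (the restriction to $\Omega$ of) $f_\varepsilon$ is in $L^p(\Omega)$ for all $\varepsilon > 0$ and
1. $\|f_\varepsilon\|_{L^p(\Omega)} \le \|f\|_{L^p(\Omega)}$ for all $\varepsilon > 0$,
2. $\|f_\varepsilon - f\|_{L^p(\Omega)} \to 0$ as $\varepsilon \to 0^+$.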


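Proof. It suffices to consider $\Omega = \mathbb{R}^k$, because we can extend f to $\mathbb{R}^k$ as before, and the two conclusions of the theorem for the extension of f will imply the same conclusions for $f \in L^p(\Omega)$. Consider first the case $p = 1$, i.e., $f \in L^1(\mathbb{R}^k)$. Note that
$$(x, y) \mapsto |f(y)|\,\omega_\varepsilon(x - y) \tag{5.2.3}$$
is measurable on $\mathbb{R}^k \times \mathbb{R}^k$ and
$$\int_{\mathbb{R}^k} |f(y)|\,\omega_\varepsilon(x - y)\,dx = |f(y)| \underbrace{\int_{\mathbb{R}^k} \omega_\varepsilon(x - y)\,dx}_{=1} = |f(y)|$$
for almost all $y \in \mathbb{R}^k$. Next,
$$\int_{\mathbb{R}^k}\Big(\int_{\mathbb{R}^k} |f(y)|\,\omega_\varepsilon(x - y)\,dx\Big)dy = \int_{\mathbb{R}^k} |f(y)|\,dy = \|f\|_{L^1(\mathbb{R}^k)} < \infty. \tag{5.2.4}$$
Thus, by the Fubini–Tonelli Theorem (see, e.g., [51, p. 18]), function (5.2.3) is a member of $L^1(\mathbb{R}^k \times \mathbb{R}^k)$ and
$$\int_{\mathbb{R}^k} |f_\varepsilon(x)|\,dx = \int_{\mathbb{R}^k}\Big|\int_{\mathbb{R}^k} \omega_\varepsilon(x - y) f(y)\,dy\Big|dx \le \int_{\mathbb{R}^k} |f(y)| \underbrace{\Big(\int_{\mathbb{R}^k} \omega_\varepsilon(x - y)\,dx\Big)}_{=1} dy = \|f\|_{L^1(\mathbb{R}^k)},$$
so that $\|f_\varepsilon\|_{L^1(\mathbb{R}^k)} \le \|f\|_{L^1(\mathbb{R}^k)}$, as claimed.

We now consider the case $1 < p < \infty$ for the same function (5.2.3). Then $f_\varepsilon \in L^p(\mathbb{R}^k)$ and, denoting by $p'$ the conjugate of p (i.e., $(1/p) + (1/p') = 1$), we have
$$|f_\varepsilon(x)| \le \int_{\mathbb{R}^k} |f(y)|\,\omega_\varepsilon(x - y)\,dy = \int_{\mathbb{R}^k} \omega_\varepsilon(x - y)^{1/p'}\,\omega_\varepsilon(x - y)^{1/p}\,|f(y)|\,dy,$$
which by Hölder's inequality is
$$\le \underbrace{\Big(\int_{\mathbb{R}^k} \omega_\varepsilon(x - y)\,dy\Big)^{1/p'}}_{=1}\Big(\int_{\mathbb{R}^k} \omega_\varepsilon(x - y)|f(y)|^p\,dy\Big)^{1/p},$$
so that
$$|f_\varepsilon(x)|^p \le \int_{\mathbb{R}^k} \omega_\varepsilon(x - y)|f(y)|^p\,dy,$$
and integrating,
$$\int_{\mathbb{R}^k} |f_\varepsilon(x)|^p\,dx \le \int_{\mathbb{R}^k}\Big(\int_{\mathbb{R}^k} \omega_\varepsilon(x - y)|f(y)|^p\,dy\Big)dx = \int_{\mathbb{R}^k} |f(y)|^p \underbrace{\Big(\int_{\mathbb{R}^k} \omega_\varepsilon(x - y)\,dx\Big)}_{=1} dy = \int_{\mathbb{R}^k} |f(y)|^p\,dy = \|f\|^p_{L^p(\mathbb{R}^k)},$$
so that $\|f_\varepsilon\|_{L^p(\mathbb{R}^k)} \le \|f\|_{L^p(\mathbb{R}^k)}$, which concludes the proof of the first statement of the theorem.

Before we continue the proof of the theorem we shall prove two auxiliary results.

Lemma 5.6. For all compact $K \subset \Omega$ there exists an open neighborhood V of K such that $\overline{V} \subset \Omega$ and a continuous map $g : \Omega \to \mathbb{R}$ satisfying $g(x) = 1$ for all $x \in K$, $g(x) = 0$ for all $x \in \Omega \setminus V$, and $0 \le g(x) \le 1$ for all $x \in \Omega$.

Proof. Let $K \subset \Omega$ be a compact set. Consider $\delta > 0$ small and let V be a $\delta$-neighborhood of K whose closure $\overline{V}$ lies in $\Omega$. Let $W = \Omega \setminus V$ and
$$\rho(x) = d(x, W) := \inf_{w\in W} \|x - w\|_2,$$
which is a continuous function.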


Now let $\alpha = \inf_{x\in K} \rho(x) > 0$, and let $g(x) = \min\{1, \frac{1}{\alpha}\rho(x)\}$, which is also a continuous function. Clearly $g(x) = 1$ for $x \in K$, $g(x) = 0$ for $x \in W = \Omega \setminus V$, and $0 \le g(x) \le 1$ for $x \in V \setminus K$.

Lemma 5.7. $C_0(\Omega)$ is dense in $L^p(\Omega)$ for all $1 \le p < \infty$:
$$\overline{C_0(\Omega)}^{\,L^p(\Omega)} = L^p(\Omega)$$
(i.e., every $L^p(\Omega)$ function can be approximated by $C_0(\Omega)$ functions with respect to the usual norm of $L^p(\Omega)$).

Proof. Let $u \in L^p(\Omega)$. We have $u = u^+ - u^-$, where both $u^+$ and $u^-$ are nonnegative $L^p(\Omega)$ functions. So, it suffices to consider nonnegative $L^p(\Omega)$ functions u, which we approximate by simple functions
$$s = \sum_{i=1}^{m} y_i \chi_{M_i},$$
where the sets $M_i \subset \Omega$ are mutually disjoint and measurable with $m(M_i) < \infty$, and the $\chi_{M_i}$ are their characteristic functions. Consider a sequence of simple functions $(s_n)$ such that $0 \le s_n \le u$ and $s_n \to u$ as $n \to \infty$ for almost all $x \in \Omega$, so $s_n \to u$ in $L^p(\Omega)$ (by the Lebesgue Dominated Convergence Theorem). Thus u can be approximated by simple functions and so all reduces to approximating characteristic functions $u = \chi_M$, where $M \subset \Omega$ is a Lebesgue measurable set with $m(M) < \infty$. In fact, we only need to consider $K \subset M$ compact such that $m(M \setminus K) = m(M) - m(K)$ is small (see Exercise 3.2), so
$$\int_\Omega |\chi_K - \chi_M|^p\,dx = \int_{M\setminus K} 1\,dx = m(M \setminus K)$$
is small. Now, choose V as in Lemma 5.6 such that $m(V \setminus K) < \varepsilon^p$; then there exists $g \in C_0(\Omega)$ such that
$$\int_\Omega |g - \chi_K|^p\,dx = \int_{V\setminus K} g^p\,dx \le \int_{V\setminus K} 1\,dx = m(V \setminus K) < \varepsilon^p,$$
so $\|g - \chi_K\|_{L^p(\Omega)} < \varepsilon$. Thus the characteristic functions $u = \chi_M$ can indeed be approximated by $C_0(\Omega)$ functions.


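Proof of Theorem 5.5, continuation. Consider $f \in L^p(\Omega)$ and approximate it using Lemma 5.7: for $\theta > 0$ there exists $g \in C_0(\Omega)$ such that
$$\|f - g\|_{L^p(\Omega)} < \frac{\theta}{3}. \tag{5.2.5}$$
We have
$$\|f_\varepsilon - f\|_{L^p(\Omega)} \le \|f_\varepsilon - g_\varepsilon\|_{L^p(\Omega)} + \|g_\varepsilon - g\|_{L^p(\Omega)} + \|g - f\|_{L^p(\Omega)},$$
which by the first statement of the theorem is
$$\le 2\|f - g\|_{L^p(\Omega)} + \|g_\varepsilon - g\|_{L^p(\Omega)},$$
so by (5.2.5)
$$< \frac{2}{3}\theta + \|g_\varepsilon - g\|_{L^p(\Omega)},$$
which by Proposition 5.4 is
$$\le \frac{2}{3}\theta + \underbrace{\text{constant} \cdot \|g_\varepsilon - g\|_{C(K')}}_{<\,\theta/3} < \theta$$
for all $\varepsilon > 0$ small. Since $\theta > 0$ is arbitrary,
$$\limsup_{\varepsilon\to 0^+} \|f_\varepsilon - f\|_{L^p(\Omega)} = 0 \implies \lim_{\varepsilon\to 0^+} \|f_\varepsilon - f\|_{L^p(\Omega)} = 0.$$
This completes the proof.

The following is a fundamental theorem.

Theorem 5.8. Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set. We have
$$\overline{C_0^\infty(\Omega)}^{\,L^p(\Omega)} = L^p(\Omega)$$
for all $1 \le p < \infty$ (i.e., every $L^p(\Omega)$ function can be approximated by test functions).

Proof. Let $f \in L^p(\Omega)$. By Lemma 5.7, for all $\eta > 0$ there exists $g \in C_0(\Omega)$ such that $\|f - g\|_{L^p(\Omega)} < \eta/2$. On the other hand, $g_\varepsilon \in C_0^\infty(\Omega)$ for $\varepsilon > 0$ small, and by Theorem 5.5 $\|g_\varepsilon - g\|_{L^p(\Omega)} < \eta/2$ for $\varepsilon > 0$ small. Therefore,
$$\|f - g_\varepsilon\|_{L^p(\Omega)} \le \|f - g\|_{L^p(\Omega)} + \|g_\varepsilon - g\|_{L^p(\Omega)} < \frac{\eta}{2} + \frac{\eta}{2} = \eta$$
for $\varepsilon > 0$ small.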
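The uniform convergence in Proposition 5.4 can be observed numerically in dimension $k = 1$. The sketch below is an illustration only: the hat function, the midpoint quadrature, and the normalisation of the discrete weights (which replaces the constant C) are assumptions made for the example.

```python
import math

def phi(t):
    """Unnormalised bump function on (-1, 1)."""
    return math.exp(1.0 / (t * t - 1.0)) if abs(t) < 1.0 else 0.0

def mollify(f, eps, n=200):
    """Return f_eps = omega_eps * f in dimension k = 1, using a midpoint rule in y.

    The quadrature weights are normalised to sum to 1, mirroring the condition
    that omega_eps integrates to 1.
    """
    h = 2.0 * eps / n
    nodes = [-eps + (i + 0.5) * h for i in range(n)]
    weights = [phi(y / eps) for y in nodes]
    total = sum(weights)
    weights = [w / total for w in weights]
    return lambda x: sum(w * f(x - y) for w, y in zip(weights, nodes))

def hat(x):
    """A function in C_0(R): 1 - |x| on [-1, 1], zero elsewhere."""
    return max(0.0, 1.0 - abs(x))

grid = [-2.0 + 0.01 * i for i in range(401)]

def sup_error(eps):
    """Approximate sup |f_eps - f| over the grid."""
    f_eps = mollify(hat, eps)
    return max(abs(f_eps(x) - hat(x)) for x in grid)

errors = {eps: sup_error(eps) for eps in (0.5, 0.25, 0.125)}
```

Because the hat function is Lipschitz with constant 1 and the mollifier only averages over $|y| < \varepsilon$, each error stays below $\varepsilon$, and the errors decrease as $\varepsilon$ is halved.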


Theorem 5.9. If $f \in L^1_{loc}(\Omega)$ is such that
$$\int_\Omega f(x)\varphi(x)\,dx = 0 \quad \forall \varphi \in C_0^\infty(\Omega), \tag{5.2.6}$$
then $f = 0$ a.e. on $\Omega$.

Proof. First of all let us extend (5.2.6) to
$$\int_\Omega f(x)g(x)\,dx = 0 \tag{5.2.7}$$
for all $g \in L^\infty(\Omega)$ such that g vanishes almost everywhere on $\Omega \setminus K$, where $K \subset \Omega$ is a compact set. Obviously, such a function g belongs in particular to $L^1(\Omega)$ and (by Theorem 5.5) $\|g_\varepsilon - g\|_{L^1(\Omega)} \to 0$ as $\varepsilon \to 0^+$. Hence, there exists a sequence $\varepsilon_j \to 0$ such that
$$g_{\varepsilon_j}(x) \to g(x) \ \text{ as } j \to \infty \text{ for a.a. } x \in \Omega. \tag{5.2.8}$$
Therefore by (5.2.6) we have
$$\int_\Omega f(x) g_{\varepsilon_j}(x)\,dx = 0, \tag{5.2.9}$$
for j large enough such that $\operatorname{supp} g_{\varepsilon_j} \subset K'$, where $K'$ is a compact set, $K \subset K' \subset \Omega$. We also have
$$|f(x) g_{\varepsilon_j}(x)| \le |f(x)| \cdot |g_{\varepsilon_j}(x)| \le |f(x)| \int_\Omega |g(y)|\,\omega_{\varepsilon_j}(x - y)\,dy \le |f(x)| \cdot \|g\|_{L^\infty(\Omega)},$$
for j large enough and for almost all $x \in K'$. So we can apply the Lebesgue Dominated Convergence Theorem (see also (5.2.8) and (5.2.9)) to get (5.2.7) for all $g \in L^\infty(\Omega)$ such that g vanishes a.e. on $\Omega \setminus K$.

Now choose an arbitrary compact set $K \subset \Omega$ and let $g = \operatorname{sign} f \cdot \chi_K$. Then by (5.2.7) we have
$$\int_\Omega f g\,dx = \int_\Omega |f|\chi_K\,dx = \int_K |f|\,dx = 0,$$
which implies $f = 0$ for almost all $x \in K$. Since K is arbitrary, $f = 0$ a.e. on $\Omega$.

5.3 Scalar Distributions

Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set. Recall that $C_0^\infty(\Omega)$, topologized as the inductive limit of the $\mathcal{D}_K(\Omega)$'s, where K runs over all compact subsets of $\Omega$, is denoted by $\mathcal{D}(\Omega)$ (see Sect. 5.1).

Definition 5.10. A functional $u : \mathcal{D}(\Omega) \to \mathbb{R}$ is said to be a (scalar) distribution (on $\Omega$) if u is linear and continuous, i.e.,
• $u(\alpha_1\varphi_1 + \alpha_2\varphi_2) = \alpha_1 u(\varphi_1) + \alpha_2 u(\varphi_2)$ for all $\alpha_1, \alpha_2 \in \mathbb{R}$ and all $\varphi_1, \varphi_2 \in \mathcal{D}(\Omega)$;
• $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$ implies $u(\varphi_n) \to u(\varphi)$ in $\mathbb{R}$.
In fact, it is enough to consider $\varphi = 0$ for the second condition because of linearity.

Let $\mathcal{D}'(\Omega)$ denote the set of all distributions on $\Omega$. It is easily seen that $\mathcal{D}'(\Omega)$ is a real linear space. Sometimes we shall write $(u, \varphi)$ instead of $u(\varphi)$. Notice that, in general, a distribution is not defined pointwise on $\Omega$, unless it is a regular distribution, i.e., a distribution defined by a usual function, as explained below.

Regular Distributions

Let $u \in L^1_{loc}(\Omega)$ and define $\tilde u : \mathcal{D}(\Omega) \to \mathbb{R}$ by
$$\tilde u(\varphi) = \int_\Omega u(x)\varphi(x)\,dx \quad \forall \varphi \in \mathcal{D}(\Omega).$$
Since $\varphi$ has compact support, $u\varphi \in L^1(\Omega)$ and so $\tilde u$ is well defined. Clearly, $\tilde u$ is linear and continuous and therefore a distribution. Note that the mapping $i : L^1_{loc}(\Omega) \to \mathcal{D}'(\Omega)$, $i(u) = \tilde u$, is injective. Since i is linear, injectivity can be seen by showing the implication
$$\tilde u = i(u) = 0 \implies u = 0 \text{ for a.a. } x \in \Omega.$$
This is indeed the case by Theorem 5.9. We now simply identify $\tilde u$ with u and write
$$u(\varphi) = \int_\Omega u(x)\varphi(x)\,dx \quad \forall \varphi \in \mathcal{D}(\Omega).$$
A distribution which arises this way is called a regular distribution.


The Dirac Distribution2 Let Ω = Rk and deﬁne (δ, φ) = δ(φ) = φ(0) for all φ ∈ D(Ω). It is linear and continuous, so δ ∈ D (Ω). δ is called the Dirac distribution or delta function, to follow the original denomination, even though it is not in fact a function. Claim: The distribution δ is not a regular distribution. Proof. Suppose, by way of contradiction, that there exists a function f ∈ L1loc (Rk ) such that f (x)φ(x) dx ∀φ ∈ D(Rk ) . (δ, φ) = Rk

This means

f φ dx = φ(0) Rk

and, in particular, f φ dx = 0 Rk

hence

∀φ ∈ D(Rk ) ,

∀φ ∈ D(Rk ), supp φ ⊂ Rk \ {0} ,

Rk \{0}

f φ dx = 0 ∀φ ∈ D(Rk \ {0}) .

Then according to Theorem 5.9, f = 0 for almost all x ∈ Rk \ {0} so f = 0 for almost all x ∈ Rk , thus φ(0) = (δ, φ) = 0 for all φ ∈ D(Rk ) which is false. A physical interpretation of δ will be provided later. For a given x0 ∈ Rk one can deﬁne a similar Dirac distribution, denoted δx0 , by (δx0 , φ) = φ(x0 ) ∀φ ∈ D(Rk ) . The Dirac distribution associated with x0 = 0 is precisely δ. Of course, linear combinations of Dirac distributions are also distributions. In fact, the space of distributions is a large one, as shown below.

² Paul Adrien Maurice Dirac, English theoretical physicist, 1902–1984.

5.3 Scalar Distributions

5.3.1 Some Operations with Distributions

Besides addition and scalar multiplication there are some further operations we can perform on distributions.

• Multiplication by a C^∞ function. For u ∈ D′(Ω) and a ∈ C^∞(Ω), define au by
\[ (au, \varphi) := (u, a\varphi) \qquad \forall \varphi \in \mathcal{D}(\Omega). \]
Note that aφ is still a test function, and au is linear and continuous on D(Ω), so au ∈ D′(Ω). This is a generalization of the usual multiplication of functions. Indeed, if u ∈ L^1_loc(Ω) (i.e., u is a regular distribution), then
\[ (au, \varphi) = (u, a\varphi) = \int_\Omega u\,(a\varphi)\,dx = \int_\Omega (au)\,\varphi\,dx, \]
so (au)(x) = a(x)u(x) for almost all x ∈ Ω.

• Reflection about the origin. For the sake of simplicity, consider Ω = R^k. Let u ∈ D′(R^k). Sometimes we write u(x) instead of u even though it is not a function. For example, this helps denote the reflection of u:
\[ (u(-x), \varphi(x)) := (u(x), \varphi(-x)) \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^k). \]
Clearly u(−x) ∈ D′(R^k). Notice that if u ∈ L^1_loc(R^k), then u(−x) ∈ D′(R^k) is precisely the regular distribution generated by the function x ↦ u(−x), since
\[ \int_{\mathbb{R}^k} u(-x)\varphi(x)\,dx = \int_{\mathbb{R}^k} u(x)\varphi(-x)\,dx \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^k), \]
and this explains the notation for the reflection of the distribution u.

• Translation by a vector. For u ∈ D′(R^k) and h ∈ R^k, define u(x + h) by
\[ (u(x+h), \varphi(x)) := (u(x), \varphi(x-h)) \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^k). \]
It is clear that u(x + h) ∈ D′(R^k). Again, the notation u(x + h) is justified by the case when u is a locally integrable function. Note that the Dirac distribution δ_{x_0} defined before is precisely δ(x − x_0) in terms of the above notation.
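The three operations above are all defined by moving the operation onto the test function. A minimal Python sketch of this idea for k = 1 follows; modelling a distribution as a callable acting on test functions, and the helper names `multiply`, `reflect`, `translate`, and `dirac_at`, are illustrative assumptions and not notation from the text.

```python
import math
from typing import Callable

TestFn = Callable[[float], float]          # a test function phi on R
Distribution = Callable[[TestFn], float]   # the functional phi -> (u, phi)

def delta(phi: TestFn) -> float:
    """Dirac distribution: (delta, phi) = phi(0)."""
    return phi(0.0)

def multiply(a: Callable[[float], float], u: Distribution) -> Distribution:
    """(a u, phi) := (u, a phi)."""
    return lambda phi: u(lambda x: a(x) * phi(x))

def reflect(u: Distribution) -> Distribution:
    """(u(-x), phi(x)) := (u(x), phi(-x))."""
    return lambda phi: u(lambda x: phi(-x))

def translate(u: Distribution, h: float) -> Distribution:
    """(u(x + h), phi(x)) := (u(x), phi(x - h))."""
    return lambda phi: u(lambda x: phi(x - h))

def dirac_at(x0: float) -> Distribution:
    """delta_{x0} = delta(x - x0), i.e. a translation with h = -x0."""
    return translate(delta, -x0)

def phi(x: float) -> float:
    """A non-even test function supported in (-2, 2) (illustrative choice)."""
    s = 4.0 - x * x
    return (2.0 + x) * math.exp(-1.0 / s) if s > 0.0 else 0.0

print(dirac_at(1.0)(phi), phi(1.0))             # the two values agree
print(reflect(dirac_at(1.0))(phi), phi(-1.0))   # reflection gives delta_{-1}
```

The design choice here is that no operation ever evaluates a distribution at a point; each one only rewrites the test function and then delegates to the original functional.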

5.3.2 Convergence in Distributions

Let (u_n)_{n∈N} be a sequence in D′(Ω). We say that (u_n) converges in D′(Ω) if there exists u ∈ D′(Ω) such that
\[ \lim_{n\to\infty} (u_n, \varphi) = (u, \varphi) \qquad \forall \varphi \in \mathcal{D}(\Omega). \tag{5.3.10} \]
In fact, D′(Ω) is sequentially complete, so if the limit in (5.3.10) exists for every φ, then the functional u it defines is automatically in D′(Ω). More precisely,

Claim: If (u_n, φ) is convergent for all φ ∈ D(Ω), then the functional u : D(Ω) → R defined by (5.3.10) is linear and continuous.

Proof. While the linearity of u follows trivially from (5.3.10), its continuity is not immediate; see Gel'fand and Shilov [17].³ Assume that u is not continuous, i.e., there exists a sequence φ_n → 0 in D(Ω) such that, on a subsequence again denoted φ_n, we have
\[ |u(\varphi_n)| \ge \delta > 0 \qquad \forall n. \tag{5.3.11} \]
Choosing another subsequence, we may assume that for all n
\[ \sup_{x\in\Omega} |D^\alpha \varphi_n(x)| < \frac{1}{2^{2n}} \qquad \forall\, |\alpha| \le n. \tag{5.3.12} \]
We consider ψ_n = 2^n φ_n. By (5.3.12) we get
\[ \psi_n \to 0 \quad \text{in } \mathcal{D}(\Omega). \tag{5.3.13} \]
On the other hand (see (5.3.11)),
\[ |u(\psi_n)| \ge 2^n \delta \to \infty. \tag{5.3.14} \]

³ Israel M. Gel'fand, Russian mathematician, 1913–2009; Georgiy E. Shilov, Russian mathematician, 1917–1975.


Let us now extract new subsequences, say (ũ_n) and (ψ̃_n). In view of (5.3.14) we can pick a ψ̃_1 such that |(u, ψ̃_1)| > 1. Thus, by virtue of (5.3.10), we can choose ũ_1 such that |(ũ_1, ψ̃_1)| > 1. Now, assuming that ũ_j and ψ̃_j have been chosen for j = 1, 2, ..., n − 1, we can pick (by the continuity of the ũ_j's and by (5.3.14)) a test function ψ̃_n such that
\[ |(\tilde u_k, \tilde\psi_n)| < \frac{1}{2^{\,n-k}}, \qquad k = 1, 2, \dots, n-1, \tag{5.3.15} \]
\[ |(u, \tilde\psi_n)| > \sum_{j=1}^{n-1} |(u, \tilde\psi_j)| + n + 1. \tag{5.3.16} \]
Taking into account (5.3.10) and (5.3.16), we can pick ũ_n such that
\[ |(\tilde u_n, \tilde\psi_n)| > \sum_{j=1}^{n-1} |(\tilde u_n, \tilde\psi_j)| + n + 1. \tag{5.3.17} \]
So by induction we obtain (ũ_n) and (ψ̃_n) satisfying (5.3.15) and (5.3.17). Set
\[ \psi = \sum_{n=1}^{\infty} \tilde\psi_n. \]
Since (ψ̃_n) is a subsequence of (ψ_n), the above series is convergent in D(Ω) (see (5.3.12)), hence ψ ∈ D(Ω). Now, let us estimate |(ũ_n, ψ)| by using the decomposition
\[ (\tilde u_n, \psi) = \sum_{j \ne n} (\tilde u_n, \tilde\psi_j) + (\tilde u_n, \tilde\psi_n). \tag{5.3.18} \]
From (5.3.15) we get
\[ \sum_{j=n+1}^{\infty} |(\tilde u_n, \tilde\psi_j)| < \sum_{j=n+1}^{\infty} \frac{1}{2^{\,j-n}} = 1. \tag{5.3.19} \]
Finally, from (5.3.17), (5.3.18), and (5.3.19) we see that |(ũ_n, ψ)| > n, which contradicts (5.3.10).

As an example, consider the sequence of Friedrichs mollifiers u_n = ω_{1/n}, i.e., u_n(x) = n^k ω(nx) for x ∈ Ω = R^k, n ∈ N, where ω is the test function defined before in (5.1.1). The graphs of the u_n's for k = 1 or k = 2 can be visualized in corresponding coordinate systems to observe the behavior of the u_n's as n gets larger and larger. The pointwise limit of (u_n) is as follows:
\[ \lim_{n\to\infty} u_n(x) = \begin{cases} 0, & x \ne 0, \\ +\infty, & x = 0, \end{cases} \]
which is not an R-valued function. On the other hand, viewing the u_n's as distributions, we have u_n → δ in D′(R^k):
\[ (u_n, \varphi) = \int_{\mathbb{R}^k} u_n(x)\varphi(x)\,dx = \int_{B(0,\frac1n)} u_n(x)\varphi(x)\,dx \to \varphi(0) = (\delta, \varphi) \]
for all φ ∈ D(R^k).

Physical Interpretation of the Dirac Distribution: The Dirac distribution represents, for instance, the density of a unit mass concentrated at some point. To explain that, let us suppose that a unit mass, ideally concentrated at the origin of a coordinate system in R^3, is instead distributed uniformly in B(0, 1/n) ⊂ R^3. Thus the corresponding mass density is given by
\[ \delta_n(x) = \begin{cases} \dfrac{3n^3}{4\pi}, & \|x\|_2 \le \frac1n, \\[4pt] 0, & \text{otherwise}, \end{cases} \]
and obviously the total mass is ∫ δ_n dx = 1. For n → ∞ the mass concentrates at x = 0. Obviously, δ_n(x) → 0 as n → ∞ for all x ≠ 0, and δ_n(0) → +∞, so δ_n does not converge pointwise to a function. However, δ_n → δ in D′(R^3) as n → ∞:
\[ (\delta_n, \varphi) = \frac{3n^3}{4\pi} \int_{B(0,\frac1n)} \varphi(x)\,dx \to \varphi(0) = (\delta, \varphi) \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^3), \]
so δ can indeed be interpreted as the density of an idealized point mass.
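The convergence (δ_n, φ) → φ(0) can be observed numerically. The following is a minimal sketch, assuming a radial test function φ(x) = exp(−1/(1 − |x|²)) for |x| < 1 (so φ(0) = e^{−1}) and a midpoint rule in spherical coordinates; the function names and the step count are illustrative choices.

```python
import math

def phi_radial(r: float) -> float:
    """Radial test function phi(x) = exp(-1/(1 - |x|^2)) for |x| < 1, else 0."""
    return math.exp(-1.0 / (1.0 - r * r)) if r < 1.0 else 0.0

def pairing(n: int, steps: int = 2000) -> float:
    """(delta_n, phi) = (3 n^3 / (4 pi)) * integral of phi over B(0, 1/n).

    For radial phi the ball integral equals 4 pi * int_0^{1/n} phi(r) r^2 dr,
    so the factor 4 pi cancels; the radial integral uses the midpoint rule.
    """
    h = (1.0 / n) / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += phi_radial(r) * r * r * h
    return 3.0 * n**3 * total

for n in (1, 10, 100):
    print(n, pairing(n), "target phi(0) =", math.exp(-1.0))
```

As n grows, the printed pairings approach φ(0), although δ_n(0) itself grows like n³.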

5.3.3 Differentiation of Distributions

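For u ∈ C^1(Ω) and φ ∈ D(Ω) one can write
\[ \Big( \frac{\partial u}{\partial x_i}, \varphi \Big) = \int_\Omega \frac{\partial u}{\partial x_i}\,\varphi\,dx = \int_{\operatorname{supp}\varphi} \frac{\partial u}{\partial x_i}\,\varphi\,dx, \qquad i = 1, \dots, k, \]
and, switching to a rectangular cell including supp φ to ease computation,
\[ = \int_{\text{cell}} \frac{\partial u}{\partial x_i}\,\varphi\,dx, \]
and integrating by parts,
\[ = -\int_{\text{cell}} u\,\frac{\partial \varphi}{\partial x_i}\,dx = -\int_\Omega u\,\frac{\partial \varphi}{\partial x_i}\,dx = -\Big( u, \frac{\partial \varphi}{\partial x_i} \Big). \]
Hence, if u ∈ C^1(Ω) we have
\[ \Big( \frac{\partial u}{\partial x_i}, \varphi \Big) = -\Big( u, \frac{\partial \varphi}{\partial x_i} \Big) \qquad \forall \varphi \in \mathcal{D}(\Omega),\ i = 1, \dots, k. \tag{5.3.20} \]
If u is an arbitrary distribution, then we use (5.3.20) as the definition of ∂u/∂x_i, which is also an element of D′(Ω). Whenever u is a smooth function, its distributional derivative defined by (5.3.20) coincides with the classical derivative of u. Since u ∈ D′(Ω) implies ∂u/∂x_i ∈ D′(Ω) for i = 1, ..., k, we deduce by induction that every distribution u ∈ D′(Ω) is infinitely differentiable, and we have
\[ (D^\alpha u, \varphi) = (-1)^{|\alpha|} (u, D^\alpha \varphi) \qquad \forall \varphi \in \mathcal{D}(\Omega),\ \alpha = (\alpha_1, \dots, \alpha_k) \in \mathbb{N}_0^k. \tag{5.3.21} \]
By convention D^{(0,0,...,0)} u = u. It is clear from (5.3.21) that mixed derivatives in the sense of distributions do not depend on the order of differentiation.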
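Definition (5.3.20) can be tested numerically on a function that is not C^1. The sketch below assumes k = 1 and u(x) = |x|, whose distributional derivative is the sign function (a standard fact, not taken from this section), and compares −(u, φ′) with (sign, φ) by the midpoint rule. The test function, its support, and the step count are illustrative assumptions.

```python
import math

def phi(x: float) -> float:
    """Test function supported in (-0.7, 1.3); deliberately not even."""
    s = 1.0 - (x - 0.3) ** 2
    return math.exp(-1.0 / s) if s > 0.0 else 0.0

def dphi(x: float) -> float:
    """Classical derivative of phi (chain rule applied to exp(-1/s))."""
    s = 1.0 - (x - 0.3) ** 2
    return phi(x) * (-2.0 * (x - 0.3) / s**2) if s > 0.0 else 0.0

def integrate(f, a: float, b: float, steps: int = 40000) -> float:
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

def sign(x: float) -> float:
    return 1.0 if x > 0.0 else (-1.0 if x < 0.0 else 0.0)

# Right-hand side of (5.3.20) with u(x) = |x|: -(u, phi')
lhs = -integrate(lambda x: abs(x) * dphi(x), -0.7, 1.3)
# Pairing of the candidate derivative sign(x) with phi
rhs = integrate(lambda x: sign(x) * phi(x), -0.7, 1.3)
print(lhs, rhs)
```

The two printed numbers agree up to quadrature error, which is what one expects if sign(x) is the derivative of |x| in the sense of (5.3.20).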


Let us now discuss some examples.

Example 1. Consider the Heaviside function H, defined on Ω = R by
\[ H(x) = \begin{cases} 1, & x \ge 0, \\ 0, & x < 0. \end{cases} \]
We use Ḣ to indicate the pointwise derivative, and H′ for the derivative in D′(R). Obviously Ḣ(x) = 0 for x ≠ 0, hence Ḣ = 0 a.e. On the other hand, H′ = δ: for all φ ∈ D(R) we have
\[ (H', \varphi) = -(H, \dot\varphi) = -\int_{-\infty}^{\infty} H(x)\dot\varphi(x)\,dx = -\int_0^{\infty} \dot\varphi(x)\,dx = -\varphi(x)\Big|_{x=0}^{x=\infty}, \]
and since φ has compact support,
\[ = \varphi(0) = (\delta, \varphi). \]
So, if u is not smooth, the distributional derivative may not coincide with the pointwise derivative.

Example 2. Consider u = u(x_1, x_2) : Ω = R^2 → R,
\[ u = x_1 H(x_2) = \begin{cases} x_1, & x_2 \ge 0, \\ 0, & x_2 < 0. \end{cases} \]
By a straightforward computation we find that the distributional derivative D^{(1,0)} u = H(x_2), which coincides with the classical partial derivative ∂u/∂x_1. On the other hand,
\[ (D^{(0,1)} u, \varphi) = -\int_{\mathbb{R}^2} u\,\frac{\partial \varphi}{\partial x_2}\,dx_1\,dx_2 = -\int_{-\infty}^{+\infty} x_1 \Big( \int_0^{+\infty} \frac{\partial \varphi}{\partial x_2}\,dx_2 \Big) dx_1 = \int_{-\infty}^{+\infty} x_1 \varphi(x_1, 0)\,dx_1 \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^2). \]
Note that D^{(0,1)} u is not a regular distribution. Indeed, assuming the contrary, we obtain D^{(0,1)} u = 0 almost everywhere in R^2 by using test functions with support in R × (0, +∞) and in R × (−∞, 0), while D^{(0,1)} u cannot be zero. So D^{(0,1)} u is different from the classical partial derivative ∂u/∂x_2 (which is zero almost everywhere).

Example 3. Let Ω = R^3 and consider u = 1/r, where r = ‖x‖_2 = √(x_1² + x_2² + x_3²). We want to calculate
\[ \Delta u = \sum_{i=1}^{3} \frac{\partial^2 u}{\partial x_i^2}, \]
where the derivatives are in the sense of distributions. The operator Δ is called the Laplace operator (or Laplacian).⁴ Note that u is singular at the origin, so we do not compute the distribution Δu directly. We replace u with
\[ u_n = \begin{cases} \dfrac1r, & r \ge \frac1n, \\[4pt] 0, & r < \frac1n, \end{cases} \]
which belongs to L^1_loc(R^3) for all n ∈ N. For any test function φ ∈ D(R^3) we have
\[ (\Delta u_n, \varphi) = (u_n, \Delta\varphi), \]
and since u_n is regular,
\[ = \int_{r \ge \frac1n} \frac{\Delta\varphi}{r}\,dx. \]
We wish to accept
\[ (\Delta u, \varphi) = \Big( \Delta\frac1r, \varphi \Big) = \lim_{n\to\infty} \int_{r \ge \frac1n} \frac{\Delta\varphi}{r}\,dx \]
as the definition of Δu, but of course we must show that this limit exists. For a fixed φ ∈ D(R^3), define the spherical shell
\[ S_n = \Big\{ x \in \mathbb{R}^3;\ \frac1n \le r \le a \Big\}, \]

⁴ Pierre-Simon Laplace, French mathematician and astronomer, 1749–1827.


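where a is large enough that supp φ ⊂ B(0, a). We then use the second Green formula⁵ (see, for example, [14, p. 628]) to deduce that
\[ \int_{r \ge \frac1n} \Big( \frac1r \Delta\varphi - \varphi \underbrace{\Delta\frac1r}_{=0} \Big) dx = \int_{S_n} \Big( \frac1r \Delta\varphi - \varphi\,\Delta\frac1r \Big) dx, \]
and, changing the direction of the normal and consequently the sign in the double integral below (we can ignore the outer edge of the shell since φ vanishes there),
\[ = -\int_{r = \frac1n} \Big( \frac1r \frac{\partial\varphi}{\partial r} - \varphi\,\frac{\partial}{\partial r}\frac1r \Big) d\sigma, \]
and since r = 1/n on the edge,
\[ = -n \int_{r = \frac1n} \frac{\partial\varphi}{\partial r}\,d\sigma - n^2 \int_{r = \frac1n} \varphi\,d\sigma, \]
and because ∂φ/∂r is bounded,
\[ = -n\,O\Big(\frac{1}{n^2}\Big) - n^2 \int_{r = \frac1n} \varphi\,d\sigma, \]
which as n → ∞ becomes
\[ = -4\pi\varphi(0) = -4\pi(\delta, \varphi). \]
So, the limit exists and
\[ \lim_{n\to\infty} \int_{r \ge \frac1n} \frac{\Delta\varphi}{r}\,dx = -4\pi(\delta, \varphi). \]
Hence
\[ \Big( \Delta\frac1r, \varphi \Big) = -4\pi(\delta, \varphi) \qquad \forall \varphi \in C_0^\infty(\mathbb{R}^3), \]
that is to say, Δ(1/r) = −4πδ.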

⁵ George Green, British mathematical physicist, 1793–1841.
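The limit in Example 3 can be checked numerically for a radial test function. The sketch below assumes φ(x) = exp(−1/(1 − r²)) for r < 1, uses the radial form Δφ = φ″ + (2/r)φ′ of the Laplacian in R^3, and evaluates the truncated integral by the midpoint rule; the derivative formulas for the bump and all function names are illustrative and written out by hand.

```python
import math

def bump(r: float) -> float:
    s = 1.0 - r * r
    return math.exp(-1.0 / s) if s > 0.0 else 0.0

def bump_d1(r: float) -> float:
    """First derivative of the bump."""
    s = 1.0 - r * r
    return bump(r) * (-2.0 * r / s**2) if s > 0.0 else 0.0

def bump_d2(r: float) -> float:
    """Second derivative of the bump."""
    s = 1.0 - r * r
    if s <= 0.0:
        return 0.0
    return bump(r) * (4.0 * r * r / s**4 - 2.0 / s**2 - 8.0 * r * r / s**3)

def truncated_integral(n: int, steps: int = 20000) -> float:
    """Integral of (Laplace phi)/r over {|x| >= 1/n} for radial phi.

    In spherical coordinates this is 4 pi * int_{1/n}^{1} (r phi'' + 2 phi') dr.
    """
    a, b = 1.0 / n, 1.0
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        r = a + (i + 0.5) * h
        total += (r * bump_d2(r) + 2.0 * bump_d1(r)) * h
    return 4.0 * math.pi * total

target = -4.0 * math.pi * bump(0.0)   # -4 pi phi(0)
for n in (2, 10, 1000):
    print(n, truncated_integral(n), "target:", target)
```

For large n the truncated integral approaches −4πφ(0), in line with Δ(1/r) = −4πδ.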


This result can be easily generalized to higher dimensions by showing that for k ≥ 3,
\[ \Delta \frac{1}{r^{k-2}} = -(k-2)\,a_k\,\delta \quad \text{in } \mathcal{D}'(\mathbb{R}^k), \]
where a_k is the "area" of the unit hypersphere in R^k. Also, for k = 2 we have
\[ \Delta \ln\frac1r = -2\pi\delta \quad \text{in } \mathcal{D}'(\mathbb{R}^2), \]
so that defining for k ≥ 2
\[ E(x) = \begin{cases} -\dfrac{1}{(k-2)\,a_k}\,\dfrac{1}{r^{k-2}}, & k \ge 3, \\[6pt] -\dfrac{1}{2\pi} \ln\dfrac1r, & k = 2, \end{cases} \]
we have
\[ \Delta E = \delta \quad \text{in } \mathcal{D}'(\mathbb{R}^k). \tag{5.3.22} \]
E is called the fundamental solution of the Laplacian Δ. In particular, it can be used to find a solution to the Poisson equation⁶
\[ \Delta u = f(x), \qquad x \in \mathbb{R}^k. \tag{5.3.23} \]
Assume that f ∈ L^∞(R^k) and vanishes almost everywhere outside a compact set. Then the function
\[ u(x) = (E * f)(x) = \int_{\mathbb{R}^k} E(x-y) f(y)\,dy \tag{5.3.24} \]
is well defined (since E is locally summable) and satisfies Eq. (5.3.23). Indeed, we first notice that for all y ∈ R^k and φ ∈ C_0^∞(R^k)
\[ \int_{\mathbb{R}^k} E(x-y)\,\Delta\varphi(x)\,dx = (E(x-y), \Delta\varphi(x)) = (\Delta_x E(x-y), \varphi(x)), \]
and taking into account (5.3.22),
\[ = (\delta(x-y), \varphi(x)) = (\delta(x), \varphi(x+y)) = \varphi(y). \]

⁶ Siméon Denis Poisson, French mathematician, engineer, and physicist, 1781–1840.


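Now,
\[ \begin{aligned} (\Delta u, \varphi) = (u, \Delta\varphi) &= \int_{\mathbb{R}^k} u(x)\,\Delta\varphi(x)\,dx \\ &= \int_{\mathbb{R}^k} \Big( \int_{\mathbb{R}^k} E(x-y) f(y)\,dy \Big) \Delta\varphi(x)\,dx \\ &= \int_{\mathbb{R}^k} f(y) \Big( \int_{\mathbb{R}^k} E(x-y)\,\Delta\varphi(x)\,dx \Big) dy, \end{aligned} \]
and using the last relation above,
\[ = \int_{\mathbb{R}^k} f(y)\varphi(y)\,dy = (f, \varphi) \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^k), \]
which implies Δu = f, as claimed.

Remark 5.11. We point out (without proof) the following result known as Weyl's regularity lemma⁷ (see, e.g., [47]): if ∅ ≠ Ω ⊂ R^k is open, f ∈ C^∞(Ω), u ∈ D′(Ω) and Δu = f in D′(Ω), then u ∈ C^∞(Ω).

The following result says that differentiation in D′(Ω) is a continuous operation.

Proposition 5.12. Suppose that ∅ ≠ Ω ⊂ R^k is open. If u_n → u in D′(Ω), then ∂u_n/∂x_i → ∂u/∂x_i in D′(Ω) for all i = 1, ..., k.

Proof. Let u_n → u in D′(Ω). We have for all φ ∈ D(Ω)
\[ \Big( \frac{\partial u_n}{\partial x_i}, \varphi \Big) = -\Big( u_n, \frac{\partial \varphi}{\partial x_i} \Big), \]
and since ∂φ/∂x_i is a test function, as n → ∞ the right-hand side converges to
\[ -\Big( u, \frac{\partial \varphi}{\partial x_i} \Big) = \Big( \frac{\partial u}{\partial x_i}, \varphi \Big). \]

⁷ Hermann Weyl, German mathematician, theoretical physicist, and philosopher, 1885–1955.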


Remark 5.13. As an immediate consequence of Proposition 5.12, if u_n → u in D′(Ω), then D^α u_n → D^α u in D′(Ω) for all α = (α_1, ..., α_k) ∈ N_0^k with |α| > 0.

Series in D′(Ω)

Suppose (u_n)_{n∈N} is a sequence in D′(Ω). Then we can associate with this sequence the series u_1 + u_2 + ··· + u_n + ··· and say that it converges in D′(Ω) if the sequence of partial sums s_n = u_1 + ··· + u_n converges, s_n → u in D′(Ω), and write u_1 + u_2 + ··· + u_n + ··· = u. By Remark 5.13, s_n → u implies D^α s_n → D^α u in D′(Ω) for all α, hence we can differentiate the series term by term as many times as we wish, i.e.,
\[ D^\alpha u_1 + D^\alpha u_2 + \cdots + D^\alpha u_n + \cdots = D^\alpha u \quad \text{in } \mathcal{D}'(\Omega). \]
This is not the case in classical analysis. For example, with Ω = R, u_n(x) = (1/n) sin(nx) converges uniformly to 0 as n → ∞ (and uniform convergence implies convergence in D′(Ω)), but u_n′(x) = cos(nx) does not converge, even pointwise. However, it does converge in D′(R). In fact, u_n^{(j)} → 0 as n → ∞ in D′(R) for all j = 1, 2, ....
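The statement that cos(nx) → 0 in D′(R), although cos(nx) has no pointwise limit, can be illustrated numerically. A minimal sketch, assuming the test function φ(x) = exp(−1/(1 − x²)) on (−1, 1) and a midpoint rule (both illustrative choices):

```python
import math

def phi(x: float) -> float:
    """Test function supported in (-1, 1)."""
    s = 1.0 - x * x
    return math.exp(-1.0 / s) if s > 0.0 else 0.0

def pairing(n: int, steps: int = 20000) -> float:
    """(cos(n x), phi) = integral of cos(n x) phi(x) over (-1, 1), midpoint rule."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += math.cos(n * x) * phi(x) * h
    return total

for n in (0, 1, 10, 50, 200):
    print(n, pairing(n))
```

The pairings shrink rapidly as n grows because the oscillations of cos(nx) average out against the smooth φ, while the values cos(nx) themselves keep oscillating between −1 and 1.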

5.3.4 Differential Equations for Distributions

Consider Ω = R, u, b ∈ D′(R) and smooth functions a_1, a_2, ..., a_n ∈ C^∞(R). Then, if u^{(j)} indicates the j-th derivative of u in D′(R), the differential equations
\[ u^{(n)} + a_1 u^{(n-1)} + \cdots + a_{n-1} u^{(1)} + a_n u = 0, \tag{E$_0$} \]
\[ u^{(n)} + a_1 u^{(n-1)} + \cdots + a_{n-1} u^{(1)} + a_n u = b \tag{E} \]
make sense. Classically, there are nice solutions u to (E_0) and they are solutions in the sense of distributions as well. In fact, there are no other solutions to (E_0) in D′(R) as long as a_i ∈ C^∞(R), as proven below.

The equation u′ = 0 has constant solutions C in D′(R) since for all φ ∈ D(R)
\[ (C', \varphi) = -(C, \varphi') = -C \int_{-\infty}^{\infty} \varphi'\,dt = 0. \]
But are constant functions the only solutions to the equation u′ = 0 in the sense of distributions? We answer this question in the following way. If u ∈ D′(R) and u′ = 0 in D′(R), we have
\[ 0 = (u', \varphi) = -(u, \varphi') \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}). \tag{5.3.25} \]
Given φ ∈ D(R) define
\[ \psi(t) = \varphi(t) - \omega(t) \int_{-\infty}^{\infty} \varphi(s)\,ds \]
for all t ∈ R, with ω = ω(t) defined as in (5.1.1) (where k = 1). Note that ψ ∈ D(R) and ∫_{−∞}^{+∞} ψ dt = 0. Define
\[ \varphi_1(t) = \int_{-\infty}^{t} \psi(s)\,ds, \qquad t \in \mathbb{R}, \]
and notice that φ_1 ∈ D(R) and φ_1′ = ψ. Now for all φ ∈ D(R)
\[ (u, \varphi) = (u, \psi) + \Big( \int_{-\infty}^{\infty} \varphi(s)\,ds \Big) \underbrace{(u, \omega)}_{\text{constant } =:\, C} = (u, \varphi_1') + C \int_{-\infty}^{\infty} \varphi\,ds, \]
and according to (5.3.25),
\[ = (C, \varphi), \]
thus u = C. Therefore, any distributional solution of the equation u′ = 0 is a constant distribution (i.e., a distribution generated by a constant function).

Now consider the linear differential system
\[ \begin{cases} u_1' = a_{11} u_1 + a_{12} u_2 + \cdots + a_{1n} u_n, \\ u_2' = a_{21} u_1 + a_{22} u_2 + \cdots + a_{2n} u_n, \\ \quad\vdots \\ u_n' = a_{n1} u_1 + a_{n2} u_2 + \cdots + a_{nn} u_n, \end{cases} \tag{5.3.26} \]


where the a_ij ∈ C^∞(R). Denoting by A(x) the matrix (a_ij(x)) and by u the column vector (u_1, ..., u_n)^T, we can rewrite (5.3.26) as
\[ u' = Au. \tag{5.3.27} \]
Let X = X(t) be a fundamental matrix of the system (5.3.26). We know from the classical theory of linear differential systems (see, e.g., [8, 11]) that X is invertible and X′ = AX for all t ∈ R. Consider the transformation u = Xz; then in the sense of distributions
\[ u' = X'z + Xz' = AXz + Xz', \]
and by (5.3.27),
\[ = AXz. \]
Hence Xz′ = 0, so, having in mind the fact that X is invertible, we deduce that z′ = 0. We have denoted by u′ and z′ the column vectors whose components are the distributional derivatives of u_1, ..., u_n and z_1, ..., z_n, respectively. As z′ = 0, z must be a constant vector z = c ∈ R^n and we find u = Xc. Therefore, there are no solutions in D′(R)^n to system (5.3.26) other than the classical ones.

Finally, consider the homogeneous Eq. (E_0). Since it can be written in the vector form (5.3.27), which has only classical solutions, so does (E_0). The non-homogeneous case (E) has a general solution which is obtained by adding to the general solution of (E_0) a particular solution to (E) in the sense of distributions. Indeed, if u_p ∈ D′(R) is such a particular solution, and u is an arbitrary solution in D′(R) of (E), then u − u_p is a (classical) solution of (E_0), hence a linear combination of the functions belonging to the fundamental system of solutions.

Example. Consider in D′(R) the differential equation
\[ u'' - 2u' + u = 2\delta(t-1), \qquad t \in \mathbb{R}. \tag{5.3.28} \]
In order to solve this equation, we first notice that if u is a distributional solution of it, then
\[ u'' - 2u' + u = 0 \quad \text{in } \mathcal{D}'(\Omega_i),\ i = 1, 2, \]


where Ω_1 = (−∞, 1), Ω_2 = (1, +∞). Therefore, u is a classical solution of the corresponding homogeneous equation within each of these two intervals, i.e., u is a function (regular distribution) of the form
\[ u(t) = \begin{cases} (c_1 t + c_2)e^t, & t \in (-\infty, 1), \\ (c_3 t + c_4)e^t, & t \in (1, +\infty), \end{cases} \]
where c_1, c_2, c_3, c_4 are real constants. Not all these functions u are solutions of the given differential equation. The fact that such a function u is a solution means
\[ \int_{-\infty}^{1} \big( u\varphi'' + 2u\varphi' + u\varphi \big)\,dt + \int_{1}^{\infty} \big( u\varphi'' + 2u\varphi' + u\varphi \big)\,dt = 2\varphi(1) \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}). \]
Integrating by parts and bearing in mind that u is a classical solution of the homogeneous equation in (−∞, 1) and also in (1, ∞), plus the fact that φ(1) and φ′(1) can be any real numbers, we obtain
\[ u(1+0) = u(1-0), \qquad \dot u(1+0) - \dot u(1-0) = 2, \]
so
\[ c_3 = c_1 + 2e^{-1}, \qquad c_4 = c_2 - 2e^{-1}. \]
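These jump conditions can be sanity-checked numerically. Taking c_1 = c_2 = 0 gives c_3 = 2e^{−1}, c_4 = −2e^{−1}, i.e., the function equal to 2(t − 1)e^{t−1} for t > 1 and to 0 for t < 1. The sketch below evaluates the weak formulation ∫ u(φ″ + 2φ′ + φ) dt for this function and compares it with 2φ(1); the test function (a bump centered at 0.5, so that φ(1) ≠ 0), its hand-written derivatives, and the step count are illustrative assumptions.

```python
import math

def phi(t: float) -> float:
    """Test function supported in (-0.5, 1.5)."""
    y = t - 0.5
    s = 1.0 - y * y
    return math.exp(-1.0 / s) if s > 0.0 else 0.0

def dphi(t: float) -> float:
    y = t - 0.5
    s = 1.0 - y * y
    return phi(t) * (-2.0 * y / s**2) if s > 0.0 else 0.0

def d2phi(t: float) -> float:
    y = t - 0.5
    s = 1.0 - y * y
    if s <= 0.0:
        return 0.0
    return phi(t) * (4.0 * y * y / s**4 - 2.0 / s**2 - 8.0 * y * y / s**3)

def u_p(t: float) -> float:
    """Candidate solution 2 (t - 1) e^{t - 1} H(t - 1)."""
    return 2.0 * (t - 1.0) * math.exp(t - 1.0) if t > 1.0 else 0.0

def weak_lhs(steps: int = 20000) -> float:
    """Integral of u_p (phi'' + 2 phi' + phi) over (1, 1.5), where both are nonzero."""
    a, b = 1.0, 1.5
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += u_p(t) * (d2phi(t) + 2.0 * dphi(t) + phi(t)) * h
    return total

print(weak_lhs(), 2.0 * phi(1.0))
```

The two printed values agree up to quadrature error, consistent with the right-hand side 2δ(t − 1) of (5.3.28).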

Thus the general solution of the given equation is
\[ u(t) = (c_1 t + c_2)e^t + \begin{cases} 2(t-1)e^{t-1}, & t > 1, \\ 0, & t < 1, \end{cases} \]
i.e., u(t) = (c_1 t + c_2)e^t + 2(t − 1)e^{t−1} H(t − 1). It is worth pointing out that there is no classical solution of the given equation; more precisely, there is a jump at t = 1 in the first derivative of any solution (which is caused by the Dirac distribution in the right-hand side of the equation).

Remark 5.14. Note that in equation (E) above the coefficient of u^{(n)} is 1, i.e., we do not have any singularity in the coefficient of the leading term. Otherwise, some difficulties may occur. For example, consider the simple equation
\[ t u' = 0 \quad \text{in } \mathcal{D}'(\mathbb{R}). \tag{5.3.29} \]
If u is a distributional solution of (5.3.29), then it must be constant in (−∞, 0) and in (0, ∞) as well. So the general solution is
\[ u(t) = c_1 + c_2 H(t), \qquad t \in \mathbb{R}, \]
where c_1, c_2 are real constants. Note that in this case there are two independent solutions (e.g., u_1(t) = 1, u_2(t) = H(t)), even if the given equation is of order one.

Now, in order to illustrate the need for distributions in solving problems associated with partial differential equations, consider the following examples:

Example 1. Consider the equation of an infinite vibrating string with no external force acting on it
\[ u_{tt} - u_{xx} = 0, \qquad (t, x) \in \mathbb{R}^2, \tag{5.3.30} \]
with some conditions at t = 0, say,
\[ u(0, x) = \psi(x), \quad u_t(0, x) = 0, \qquad x \in \mathbb{R}, \tag{5.3.31} \]
where
\[ u_t := \frac{\partial u}{\partial t}, \qquad u_{tt} := \frac{\partial^2 u}{\partial t^2}, \qquad u_{xx} := \frac{\partial^2 u}{\partial x^2}. \]
First assume that ψ ∈ C^2(R). Recall that using the change of variables α = x + t, β = x − t, Eq. (5.3.30) can be reduced to the equation u_{αβ} = 0. So it is easily seen that any solution of Eq. (5.3.30) has the form u = g(x + t) + h(x − t), and so applying (5.3.31), we find the D'Alembert formula⁸
\[ u = \frac12 \big[ \psi(x+t) + \psi(x-t) \big] \qquad \text{(D'Alembert's formula)}. \tag{5.3.32} \]

⁸ Jean-Baptiste le Rond d'Alembert, French mathematician, mechanician, physicist, philosopher, and music theorist, 1717–1783.


Clearly, u is a C^2 function. It is the unique classical solution of problem (5.3.30), (5.3.31). On the other hand, assuming only that ψ ∈ C^1(R), u given by (5.3.32) is in general no longer a classical solution of Eq. (5.3.30). However, this u still satisfies conditions (5.3.31). Now, assume that ψ ∈ C(R). In this case, the function u given by (5.3.32) only satisfies classically the condition u(0, x) = ψ(x), x ∈ R. However, there should be some relation between this u and problem (5.3.30), (5.3.31). Indeed, we can show that this u satisfies (5.3.30) and the condition u_t(0, x) = 0 (x ∈ R) in a weak sense, that is, in the sense of distributions. If ψ′, ψ″ denote the first and second derivative of ψ in D′(R), then it is easily seen that
\[ D^{(1,0)} \psi(x+t) = D^{(0,1)} \psi(x+t) = \psi'(x+t) \quad \text{in } \mathcal{D}'(\mathbb{R}^2), \]
\[ D^{(2,0)} \psi(x+t) = D^{(0,2)} \psi(x+t) = \psi''(x+t) \quad \text{in } \mathcal{D}'(\mathbb{R}^2). \]
Similarly,
\[ D^{(0,1)} \psi(x-t) = \psi'(x-t) = -D^{(1,0)} \psi(x-t) \quad \text{in } \mathcal{D}'(\mathbb{R}^2), \]
\[ D^{(2,0)} \psi(x-t) = D^{(0,2)} \psi(x-t) = \psi''(x-t) \quad \text{in } \mathcal{D}'(\mathbb{R}^2). \]
Consequently, for all φ ∈ D(R^2), we have
\[ \big( D^{(2,0)} u(t,x), \varphi(t,x) \big) = \Big( \frac12 \big[ \psi''(x+t) + \psi''(x-t) \big], \varphi(t,x) \Big) = \big( D^{(0,2)} u(t,x), \varphi(t,x) \big), \]
which shows that u given by (5.3.32) satisfies Eq. (5.3.30) in the sense of distributions:
\[ D^{(2,0)} u - D^{(0,2)} u = 0 \quad \text{in } \mathcal{D}'(\mathbb{R}^2). \]
We also have
\[ D^{(1,0)} u(t,x) = \frac12 \big[ \psi'(x+t) - \psi'(x-t) \big] = 0, \quad \text{if } t = 0. \]
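The weak form of the wave equation can be probed numerically for a merely continuous ψ. The sketch below assumes ψ(x) = |x − 0.2| and the product test function φ(t, x) = b(t)b(x) with b a standard bump, and evaluates (u, φ_tt − φ_xx) by a two-dimensional midpoint rule. As a contrast it also evaluates the same pairing for the function obtained with the wrong propagation speed 2, which is not a solution of (5.3.30). All names, the grid size, and the choice of ψ and φ are illustrative assumptions.

```python
import math

def b(s: float) -> float:
    """Bump exp(-1/(1 - s^2)) on (-1, 1)."""
    q = 1.0 - s * s
    return math.exp(-1.0 / q) if q > 0.0 else 0.0

def b2(s: float) -> float:
    """Second derivative of the bump."""
    q = 1.0 - s * s
    if q <= 0.0:
        return 0.0
    return b(s) * (4.0 * s * s / q**4 - 2.0 / q**2 - 8.0 * s * s / q**3)

def psi(x: float) -> float:
    """Continuous initial profile, not differentiable at x = 0.2."""
    return abs(x - 0.2)

def u(t: float, x: float, speed: float = 1.0) -> float:
    """D'Alembert-type function; speed = 1 corresponds to (5.3.32)."""
    return 0.5 * (psi(x + speed * t) + psi(x - speed * t))

def weak_residual(speed: float, n: int = 400) -> float:
    """(u, phi_tt - phi_xx) with phi(t, x) = b(t) b(x), midpoint rule on (-1, 1)^2."""
    h = 2.0 / n
    pts = [-1.0 + (i + 0.5) * h for i in range(n)]
    bv = [b(p) for p in pts]
    b2v = [b2(p) for p in pts]
    total = 0.0
    for i, t in enumerate(pts):
        for j, x in enumerate(pts):
            total += u(t, x, speed) * (b2v[i] * bv[j] - bv[i] * b2v[j])
    return total * h * h

print("speed 1:", weak_residual(1.0))   # close to zero
print("speed 2:", weak_residual(2.0))   # clearly nonzero
```

Only the speed-1 function gives a residual at the level of quadrature error, although neither function has classical second derivatives along the characteristics through x = 0.2.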


Example 2. Here we discuss the boundary controllability of the 1-dimensional wave equation describing the vibrations of a finite string. Specifically, let us consider the following initial-boundary value problem:
\[ \begin{cases} u_{tt} - u_{xx} = 0, & 0 < x < 1,\ t > 0, \\ u(t, 0) = 0, \quad u(t, 1) = f(t), & t > 0, \\ u(0, x) = u^0(x), \quad u_t(0, x) = u^1(x), & 0 < x < 1, \end{cases} \tag{5.3.33} \]
where u^0, u^1 ∈ L^1(0, 1). We shall prove that
\[ \exists T > 0,\ \forall (u^0, u^1) \in L^1(0,1)^2,\ \exists f \in L^1_{loc}[0, \infty),\ \forall t > T,\ u = 0, \]
where u is the corresponding solution of problem (5.3.33). In fact, we shall see that there exists a lowest time instant T with this property, precisely T = 2. Obviously, any T > 2 satisfies the same property. This result is in accordance with similar results previously obtained by other authors by using different arguments (see, e.g., [34, p. 57]). Our direct approach is more advantageous since it provides the solution u (in a generalized sense, under weak assumptions on the data) as a function of u^0, u^1, and f and allows us to determine the minimal time interval (0, 2) and an explicit control function f (depending on u^0 and u^1) which steers the solution u to zero. It may happen that this direct approach is known, but we could not find anything about it in the literature. Nevertheless, we present it here as a nice application and do not claim originality.

Existence of Solutions to Problem (5.3.33)

Denote R = {(t, x); t ≥ 0, 0 ≤ x ≤ 1}. Consider in the first instance that u = u(t, x) is a classical solution of problem (5.3.33) corresponding to regular u^0, u^1, and f. Obviously, the solution of the above wave equation has the general form
\[ u(t, x) = g(x+t) + h(x-t). \tag{5.3.34} \]
From the initial and boundary conditions we can determine g(x + t) and h(x − t) (hence u = u(t, x)) within different subsets (triangles or squares) of R, as follows: From the initial conditions we get
\[ g(x) + h(x) = u^0(x), \qquad g(x) - h(x) = \int_0^x u^1 + c, \qquad 0 < x < 1, \]


where c is a real constant, hence
\[ \begin{cases} g(x) = \dfrac12 \Big( u^0(x) + \displaystyle\int_0^x u^1 + c \Big), & 0 < x < 1, \\[8pt] h(x) = \dfrac12 \Big( u^0(x) - \displaystyle\int_0^x u^1 - c \Big), & 0 < x < 1. \end{cases} \tag{5.3.35} \]
From the boundary conditions we obtain
\[ g(t) + h(-t) = 0, \qquad t > 0, \tag{5.3.36} \]
and
\[ g(1+t) + h(1-t) = f(t), \qquad t > 0. \tag{5.3.37} \]
Now (5.3.36) yields (see also (5.3.35))
\[ h(-t) = -g(t) = -\frac12 \Big( u^0(t) + \int_0^t u^1 + c \Big), \qquad 0 < t < 1. \tag{5.3.38} \]
From (5.3.37) we derive (see also (5.3.35))
\[ g(1+t) = f(t) - h(1-t) = f(t) - \frac12 \Big( u^0(1-t) - \int_0^{1-t} u^1 - c \Big), \qquad 0 < t < 1. \tag{5.3.39} \]
We also have for 0 < t < 1
\[ h(-t-1) = -g(1+t) = -f(t) + \frac12 \Big( u^0(1-t) - \int_0^{1-t} u^1 - c \Big), \tag{5.3.40} \]
\[ g(2+t) = f(1+t) - h(-t) = f(1+t) + \frac12 \Big( u^0(t) + \int_0^t u^1 + c \Big), \tag{5.3.41} \]
\[ h(-t-2) = -g(t+2) = -f(1+t) - \frac12 \Big( u^0(t) + \int_0^t u^1 + c \Big), \tag{5.3.42} \]


\[ g(3+t) = f(2+t) - h(-t-1) = f(2+t) + f(t) - \frac12 \Big( u^0(1-t) - \int_0^{1-t} u^1 - c \Big), \tag{5.3.43} \]
and so on. By using the above formulas we can determine u = u(t, x) in R. We decompose R into triangles and squares as in Fig. 5.1.

[Figure 5.1: Regions of the plane. The strip R is cut by the characteristic lines x + t = const and x − t = const into triangles and squares labeled A, B, ..., J.]

We first determine g(x + t) in the triangle A ∪ B (i.e., the intersection of R and the strip {0 ≤ x + t ≤ 1}) (see (5.3.35)):
\[ g(x+t) = \frac12 \Big( u^0(x+t) + \int_0^{x+t} u^1 + c \Big). \tag{5.3.44} \]
Now let us determine g(x + t) in the parallelogram C ∪ D ∪ E (i.e., the intersection of R and the strip {1 ≤ x + t ≤ 2}) (see (5.3.39)):
\[ g(x+t) = g\big( (x+t-1) + 1 \big) = f(x+t-1) - \frac12 \Big( u^0(2-x-t) - \int_0^{2-x-t} u^1 - c \Big). \tag{5.3.45} \]


Note that choosing f : (0, 1) → R,
\[ f(y) = \frac12 \Big( u^0(1-y) - \int_0^{1-y} u^1 - c \Big) \qquad \forall y \in (0, 1), \]
implies g(x + t) = 0 in C ∪ D ∪ E. For (t, x) ∈ F ∪ G ∪ H (i.e., 2 ≤ x + t ≤ 3, 0 ≤ x ≤ 1) we have (see (5.3.41))
\[ g(x+t) = g\big( (x+t-2) + 2 \big) = f(x+t-1) + \frac12 \Big( u^0(x+t-2) + \int_0^{x+t-2} u^1 + c \Big). \tag{5.3.46} \]
If we choose f : (1, 2) → R,
\[ f(z) = -\frac12 \Big( u^0(z-1) + \int_0^{z-1} u^1 + c \Big) \qquad \forall z \in (1, 2), \]
then g(x + t) = 0 in F ∪ G ∪ H. In what follows we determine h(x − t). First, in the triangle A ∪ C (see (5.3.35))
\[ h(x-t) = \frac12 \Big( u^0(x-t) - \int_0^{x-t} u^1 - c \Big). \tag{5.3.47} \]
Next, for (t, x) ∈ B ∪ D ∪ F we have
\[ h(x-t) = h\big( -(t-x) \big) = -g(t-x) = -\frac12 \Big( u^0(t-x) + \int_0^{t-x} u^1 + c \Big). \tag{5.3.48} \]
For (t, x) ∈ E ∪ G ∪ I we have
\[ h(x-t) = h\big( -(t-x-1) - 1 \big) = -f(t-x-1) + \frac12 \Big( u^0(2+x-t) - \int_0^{2+x-t} u^1 - c \Big). \tag{5.3.49} \]


Observe that if f : (0, 1) → R is the function defined above then h(x − t) = 0 in E ∪ G ∪ I. For (t, x) ∈ H ∪ J we have
\[ h(x-t) = h\big( -(t-x-2) - 2 \big) = -f(t-x-1) - \frac12 \Big( u^0(t-x-2) + \int_0^{t-x-2} u^1 + c \Big). \tag{5.3.50} \]
With f : (1, 2) → R as defined before, we have h(x − t) = 0 in H ∪ J. In fact all the above computations are valid for u^0, u^1 ∈ L^1(0, 1) and f ∈ L^1_loc[0, ∞). These calculations lead to the following theorem:

Theorem 5.15. For any u^0, u^1 ∈ L^1(0, 1) and f ∈ L^1_loc[0, ∞) (i.e., f is Lebesgue summable on (0, m) for all m > 0) problem (5.3.33) has a unique weak solution u. If u^0 ∈ C[0, 1], u^1 ∈ L^1(0, 1), f ∈ C[0, ∞), and the following compatibility conditions are satisfied:
\[ u^0(0) = 0, \qquad u^0(1) = f(0), \tag{5.3.51} \]
then u = u(t, x) ∈ C([0, ∞) × [0, 1]).

Proof. Using the above computations we construct u(t, x) = g(x + t) + h(x − t). Obviously, u satisfies the wave equation in the distribution sense on the interior of each of the sets A, B, C, D, E, F, G, and so on. In this sense, u is a weak solution of the wave equation. By construction, the initial and boundary conditions are also satisfied. It is easily seen that the constant c disappears when constructing u = g(x + t) + h(x − t) in A, B, C, ..., so the solution u is unique. If u^0 ∈ C[0, 1], u^1 ∈ L^1(0, 1), f ∈ C[0, ∞) and u^0, f satisfy (5.3.51), then u is continuous on [0, ∞) × [0, 1]. It suffices to observe that u is continuous on the characteristic lines {x − t = −i}, i = 0, 1, ..., and {x + t = j}, j = 1, 2, ..., restricted to the infinite strip R.

Remark 5.16. For higher regularity of u one needs to assume more regularity of the data and additional compatibility conditions.


Exact Boundary Controllability

A careful analysis of the above computations shows that there are pairs (u^0, u^1) for which there are no functions f : (0, T) → R, T < 2, making u = 0 in the trapezoid A ∪ B ∪ C ∪ D ∪ F. In other words, the waves cannot be controlled in [0, T] if T < 2. On the other hand, we have

Theorem 5.17. For any pair (u^0, u^1) ∈ L^1(0, 1) × L^1(0, 1) there exists a control function f : (0, +∞) → R defined by
\[ f(y) = \begin{cases} \dfrac12 \Big( u^0(1-y) - \displaystyle\int_0^{1-y} u^1 - c \Big), & y \in (0, 1), \\[8pt] -\dfrac12 \Big( u^0(y-1) + \displaystyle\int_0^{y-1} u^1 + c \Big), & y \in (1, 2), \\[8pt] 0, & y > 2, \end{cases} \tag{5.3.52} \]
with c = −u^0(1) − ∫_0^1 u^1, which makes u = 0 in the infinite trapezoid {x ≤ t − 1} ∩ {0 < x < 1}.

Proof. The proof follows easily from the computations performed above, including the remarks on the regions where g(x + t) = 0 or h(x − t) = 0 (based on the fact that f is that given in (5.3.52)). Of course, u vanishes in {x < t − 1} ∩ {0 < x < 1} since f(t) = 0 for t > 2.

Remark 5.18. Let us emphasize that, if f is chosen as in (5.3.52), then the corresponding (unique) solution u vanishes starting from the line segment defined as the intersection of R and the characteristic line {x − t = −1} and remains zero everywhere on the right side of that segment, which can be interpreted as a threshold. So the waves can be controlled in the minimal time interval (0, 2) and in fact in any interval (0, T) with T ≥ 2.

Remark 5.19. While the solution u is unique in Theorem 5.15, the control function f is not, since the constant c in (5.3.52) can be chosen arbitrarily. Indeed, the restriction of f to the interval (0, 2) is unique up to an additive constant, as follows from the computations above. We chose c = −u^0(1) − ∫_0^1 u^1 in Theorem 5.17 in order to obtain a continuous control function f.
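The construction u = g(x + t) + h(x − t) with the control from Theorem 5.17 can be reproduced in a few lines. The sketch below is a non-authoritative illustration under stated assumptions: the data u^0(x) = sin(πx) and u^1 ≡ 1 are chosen only for the example, the first branch of the control is taken with the sign that follows from (5.3.35) and (5.3.39), and g, h are extended recursively through the boundary relations (5.3.36) and (5.3.37).

```python
import math

def u0(x: float) -> float:
    """Illustrative initial displacement with u0(0) = u0(1) = 0 (up to rounding)."""
    return math.sin(math.pi * x)

def U1(x: float) -> float:
    """Integral of u1 from 0 to x, for the illustrative choice u1 = 1."""
    return x

c = -u0(1.0) - U1(1.0)   # constant chosen as in Theorem 5.17

def f(y: float) -> float:
    """Control function, with the sign convention derived from (5.3.39)."""
    if y <= 1.0:
        return 0.5 * (u0(1.0 - y) - U1(1.0 - y) - c)
    if y <= 2.0:
        return -0.5 * (u0(y - 1.0) + U1(y - 1.0) + c)
    return 0.0

def g(s: float) -> float:
    """g on [0, infinity): (5.3.35) on [0, 1], then g(1 + t) = f(t) - h(1 - t)."""
    if s <= 1.0:
        return 0.5 * (u0(s) + U1(s) + c)
    return f(s - 1.0) - h(2.0 - s)

def h(s: float) -> float:
    """h on (-infinity, 1]: (5.3.35) on [0, 1], then h(-t) = -g(t)."""
    if s >= 0.0:
        return 0.5 * (u0(s) - U1(s) - c)
    return -g(-s)

def u(t: float, x: float) -> float:
    return g(x + t) + h(x - t)

print(u(0.0, 0.3), u0(0.3))   # initial condition
print(u(0.4, 1.0), f(0.4))    # boundary condition at x = 1
print(u(0.5, 0.5))            # before the threshold: nonzero
print(u(2.0, 0.5), u(5.0, 0.25))   # beyond the line x - t = -1: zero up to rounding
```

With these data the solution is nonzero at (t, x) = (0.5, 0.5) but vanishes (up to rounding) at points with t ≥ x + 1, in agreement with Remark 5.18.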

5.4 Sobolev Spaces

Let ∅ ≠ Ω ⊂ R^k be an open set. For m ∈ N, 1 ≤ p ≤ ∞ define the Sobolev space of order m to be⁹
\[ W^{m,p}(\Omega) = \{ u \in L^p(\Omega);\ D^\alpha u \in L^p(\Omega)\ \ \forall \alpha \in \mathbb{N}_0^k,\ 0 < |\alpha| \le m \}, \]
where the derivatives D^α u are considered in the sense of distributions. Obviously, W^{m,p}(Ω) is a linear space with respect to the usual operations of addition and scalar multiplication. In particular,
\[ W^{1,p}(\Omega) = \Big\{ u \in L^p(\Omega);\ \frac{\partial u}{\partial x_i} \in L^p(\Omega)\ \ \forall i = 1, \dots, k \Big\}, \]
where ∂u/∂x_i is the partial derivative of u with respect to x_i in the sense of distributions.

Theorem 5.20. For all m ∈ N, 1 ≤ p ≤ ∞, W^{m,p}(Ω) is a real Banach space with respect to the norm
\[ \|u\|_{m,p} = \Big( \sum_{|\alpha| \le m} \|D^\alpha u\|^p_{L^p(\Omega)} \Big)^{1/p}, \qquad 1 \le p < \infty, \]
\[ \|u\|_{m,\infty} = \max_{|\alpha| \le m} \|D^\alpha u\|_{L^\infty(\Omega)}, \qquad p = \infty. \]

Proof. Obviously, ‖·‖_{m,p} is a norm for all 1 ≤ p ≤ ∞. Let (u_n)_{n∈N} be a Cauchy sequence in W^{m,p}(Ω), i.e., for all ε > 0 there exists N = N(ε) ∈ N such that ‖u_n − u_j‖_{m,p} < ε for all n, j > N. It follows that (D^α u_n) is Cauchy in L^p(Ω) for all α ∈ N_0^k, |α| ≤ m. Since L^p(Ω) is a Banach space (with respect to ‖·‖_{L^p(Ω)}), there exist u, u_α ∈ L^p(Ω) such that
\[ u_n \to u, \quad D^\alpha u_n \to u_\alpha \quad \text{in } L^p(\Omega) \qquad \forall \alpha \in \mathbb{N}_0^k,\ 0 < |\alpha| \le m. \tag{5.4.53} \]
On the other hand, we have the following.

Claim: In general, if v_n → v in L^p(Ω), 1 ≤ p ≤ ∞, then v_n → v in D′(Ω).

⁹ Sergei L. Sobolev, Russian mathematician, 1908–1989.


Indeed, for all φ ∈ D(Ω) we have
\[ |(v_n - v, \varphi)| = \Big| \int_\Omega (v_n - v)\varphi\,dx \Big|, \]
which for p = 1 is
\[ \le \|v_n - v\|_{L^1(\Omega)} \sup |\varphi|, \]
and for 1 < p < ∞, by Hölder's inequality with 1/p + 1/p′ = 1, is
\[ \le \|v_n - v\|_{L^p(\Omega)} \|\varphi\|_{L^{p'}(\Omega)}, \]
so v_n → v in D′(Ω), and similarly for p = ∞.

By the above claim, it follows that the convergences in (5.4.53) also hold in D′(Ω). Since D^α is a closed operation in D′(Ω), it follows that
\[ u_\alpha = D^\alpha u \qquad \forall \alpha \in \mathbb{N}_0^k,\ 0 < |\alpha| \le m. \]
Therefore, u ∈ W^{m,p}(Ω) and ‖u_n − u‖_{m,p} → 0 as n → ∞.

Now, for m ∈ N, 1 ≤ p ≤ ∞, denote as usual by W_0^{m,p}(Ω) the closure of C_0^∞(Ω) in (W^{m,p}(Ω), ‖·‖_{m,p}). Obviously, W_0^{m,p}(Ω) is a Banach space with respect to ‖·‖_{m,p} for all m ∈ N, 1 ≤ p ≤ ∞. For p = 2 there are specific notations H^m(Ω) := W^{m,2}(Ω), H_0^m(Ω) := W_0^{m,2}(Ω), and corresponding norm ‖·‖_m := ‖·‖_{m,2}. These are Hilbert spaces with the scalar product
\[ (u, v)_m := \sum_{|\alpha| \le m} \big( D^\alpha u, D^\alpha v \big)_{L^2(\Omega)}. \]
(A Banach space (X, ‖·‖) is called Hilbert if ‖·‖ is given by a scalar product (·, ·), i.e., ‖x‖ = √(x, x), x ∈ X; see Chap. 6 for more information on Hilbert spaces.) In particular, the scalar product of H^1(Ω) (and of H_0^1(Ω) as well) is
\[ (u, v)_1 = (u, v)_{L^2(\Omega)} + \sum_{j=1}^{k} \Big( \frac{\partial u}{\partial x_j}, \frac{\partial v}{\partial x_j} \Big)_{L^2(\Omega)} = \int_\Omega uv\,dx + \sum_{j=1}^{k} \int_\Omega \frac{\partial u}{\partial x_j} \frac{\partial v}{\partial x_j}\,dx, \]


where the derivatives are in the sense of distributions, so that
\[ \|u\|_1^2 = \|u\|^2_{L^2(\Omega)} + \sum_{j=1}^{k} \int_\Omega \Big( \frac{\partial u}{\partial x_j} \Big)^2 dx. \]
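As a small illustration of the H^1 norm with a derivative taken in the sense of distributions, consider (as an assumed example, not one from the text) the hat function u(x) = 1 − |x| on Ω = (−1, 1). It is not C^1, but its distributional derivative is the bounded function equal to 1 on (−1, 0) and −1 on (0, 1), so u ∈ H^1(−1, 1) and the norm squared is 2/3 + 2 = 8/3. The sketch below approximates this value with a midpoint rule.

```python
def u(x: float) -> float:
    """Hat function on (-1, 1)."""
    return 1.0 - abs(x)

def du(x: float) -> float:
    """Distributional derivative of the hat function (defined a.e.)."""
    return -1.0 if x > 0.0 else 1.0

def h1_norm_squared(steps: int = 2000) -> float:
    """Midpoint approximation of int u^2 dx + int (u')^2 dx over (-1, 1)."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        total += (u(x) ** 2 + du(x) ** 2) * h
    return total

print(h1_norm_squared(), 8.0 / 3.0)
```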

It is well known that $W^{m,p}(\Omega)$ is separable if $1 \le p < \infty$. See, e.g., [1, p. 47], where further results on Sobolev spaces can be found. See also [2, 6, 14].

Let us recall (without proof) the following approximation result (cf., e.g., [14, p. 252]).

Theorem 5.21. Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be an open bounded set of class $C^1$, and let $1 \le p < \infty$. Then for every $u \in W^{m,p}(\Omega)$ there exists a sequence $(u_n)$ in $C^\infty(\overline{\Omega})$ such that $u_n \to u$ in $W^{m,p}(\Omega)$.

For the definition of a $C^1$ open set see [6, p. 272]. Generally, in applications $\partial\Omega$ is smooth enough and consequently $\Omega$ is of class $C^1$. Notice also that $W_0^{m,p}(\mathbb{R}^k) = W^{m,p}(\mathbb{R}^k)$, i.e., $C_0^\infty(\mathbb{R}^k)$ is dense in $W^{m,p}(\mathbb{R}^k)$ (see, e.g., [1, p. 56]). But, in general, $W_0^{m,p}(\Omega)$ is a proper subspace of $W^{m,p}(\Omega)$.

Let us also state (without proof) a unified version of some results due to Sobolev, Rellich & Kondrashov$^{10}$ (see, e.g., [2, pp. 3–4]).

Theorem 5.22. If $\emptyset \neq \Omega \subset \mathbb{R}^k$ is an open set of class $C^1$ and $1 \le p < \infty$, then there are the continuous embeddings

(a) if $m < k/p$, then $W^{m,p}(\Omega) \to L^q(\Omega)$ $\forall q \in [p, p^*]$, where $p^* = \dfrac{kp}{k - mp}$;

(b) if $m = k/p$, then $W^{m,p}(\Omega) \to L^q(\Omega)$ $\forall q \in [p, \infty)$;

(c) if $m > k/p$, then $W^{m,p}(\Omega) \to C^{0,\alpha}(\overline{\Omega})$ (which is the space of Hölder continuous functions defined on $\overline{\Omega}$ with exponent $\alpha \in (0, 1)$, and with $\alpha = 1$ if $m - k/p > 1$).

If, in addition, $\Omega$ is bounded, then all the above embeddings are compact except for the case $q = p^*$ in (a), and furthermore, if we replace $W^{m,p}(\Omega)$ by $W_0^{m,p}(\Omega)$, then all these embeddings (including the compact ones) hold without any regularity condition on $\partial\Omega$.

10 Vladimir I. Kondrashov, Russian mathematician, 1909–1971; Franz Rellich, Austrian-German mathematician, 1906–1955.
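The case distinction in Theorem 5.22 depends only on how $m$ compares with $k/p$. The following is a minimal sketch of that bookkeeping; the function name `sobolev_case` and the returned labels are illustrative choices, not notation from the text, and exact rational arithmetic is used only so that the borderline case $m = k/p$ is detected reliably.

```python
from fractions import Fraction


def sobolev_case(m, p, k):
    """Classify (m, p, k) according to the three cases of Theorem 5.22.

    Returns (case, info). In case "a", info is the critical exponent
    p* = kp / (k - mp); in cases "b" and "c", info is None.
    """
    m, p, k = Fraction(m), Fraction(p), Fraction(k)
    if p < 1:
        raise ValueError("Theorem 5.22 assumes 1 <= p < infinity")
    if m * p < k:            # m < k/p
        return "a", k * p / (k - m * p)
    if m * p == k:           # m = k/p
        return "b", None
    return "c", None         # m > k/p


if __name__ == "__main__":
    # H^1 = W^{1,2} in dimensions 3, 2 and 1.
    for k in (3, 2, 1):
        print(k, sobolev_case(1, 2, k))
```

For instance, with $m = 1$, $p = 2$, $k = 3$ the sketch reports case (a) with $p^* = 6$, which is the familiar embedding of $H^1(\Omega)$ into $L^6(\Omega)$ in three dimensions.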

The above embeddings are the natural linear injective maps between the corresponding spaces. In particular, the embedding (c) above associates with every $u \in W^{m,p}(\Omega)$ (which is a class of functions with respect to the a.e. equality) its continuous representative. Continuity and compactness of the above embeddings are understood in the usual sense.

We continue with a few words on the trace of functions from $W^{m,p}(\Omega)$ on the boundary $\partial\Omega$ of $\Omega$. The concept of trace is important for applications to boundary value problems for partial differential equations. We restrict our attention to $W^{1,p}(\Omega)$, $1 \le p < \infty$, since this case is sufficient for the applications that will be discussed later. Clearly, for a function $u \in C(\overline{\Omega})$ its restriction to $\partial\Omega$, denoted $u|_{\partial\Omega}$, is well defined. But if $u \in W^{1,p}(\Omega)$ then $u$ is only defined a.e. on $\Omega$, so it does not make sense to speak about the restriction of $u$ to $\partial\Omega$ because the $k$-dimensional Lebesgue measure of $\partial\Omega$ is zero; however, there is a trace of $u$ on $\partial\Omega$ which plays the role of the restriction $u|_{\partial\Omega}$. More precisely, we have the following theorem (cf. [14, pp. 258–259]):

Theorem 5.23. Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be an open bounded set of class $C^1$, and let $1 \le p < \infty$. There exists a continuous linear operator $\gamma : W^{1,p}(\Omega) \to L^p(\partial\Omega)$ such that $\gamma(u) = u|_{\partial\Omega}$ for all $u \in W^{1,p}(\Omega) \cap C(\overline{\Omega})$. Moreover, $u \in W_0^{1,p}(\Omega)$ if and only if $u \in W^{1,p}(\Omega)$ and $\gamma(u) = 0$.

In fact, the operator $\gamma$ from the above statement is the extension by continuity of the classical restriction operator $u \mapsto u|_{\partial\Omega}$, which maps $W^{1,p}(\Omega) \cap C(\overline{\Omega})$ to $L^p(\partial\Omega)$. This extension is unique since $W^{1,p}(\Omega) \cap C(\overline{\Omega})$ is dense in $(W^{1,p}(\Omega), \|\cdot\|_{1,p})$ (see Theorem 5.21). If $u \in W_0^{1,p}(\Omega)$, hence $\gamma(u) = 0$, we say that $u = 0$ on $\partial\Omega$ in a generalized sense. For details on traces and $L^p(\partial\Omega)$, $1 \le p < \infty$, see [14].

The case k = 1. If $\Omega = (a, b) \subset \mathbb{R}$, $-\infty \le a < b \le +\infty$, we denote $L^p(a, b) := L^p((a, b))$, $W^{m,p}(a, b) := W^{m,p}((a, b))$, $W_0^{m,p}(a, b) := W_0^{m,p}((a, b))$, $H^m(a, b) := H^m((a, b))$, $H_0^m(a, b) := H_0^m((a, b))$.
The case $\Omega = (a, b)$ will be discussed later in Sect. 5.6 on vector distributions. In particular, we shall see that for $1 \le p < \infty$ and $-\infty < a < b < +\infty$ every $u \in W^{1,p}(a, b)$ has a representative which is an absolutely continuous function on $[a, b]$, so identifying $u$ with this representative, $u(a)$ and $u(b)$ make sense classically. According to Theorem 5.23, $u$ is in $W_0^{1,p}(a, b)$ if and only if $u \in W^{1,p}(a, b)$ and $u(a) = 0 = u(b)$. This shows in particular that $W_0^{1,p}(a, b)$ is a proper subspace of $W^{1,p}(a, b)$.

Green's Identity. Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be an open and bounded set of class $C^1$. Recall the classical divergence (Gauss–Ostrogradski$^{11}$) formula

$$\int_\Omega \nabla \cdot F \, dx = \int_{\partial\Omega} F \cdot n \, ds \quad \forall F = (f_1, \dots, f_k),\ f_i \in C^1(\overline{\Omega}),\ i = 1, \dots, k, \tag{5.4.54}$$

where $n$ is the outward pointing unit normal. Choosing in (5.4.54) $F = g \nabla f$, with $f \in C^2(\overline{\Omega})$ and $g \in C^1(\overline{\Omega})$, one obtains the classical Green identity

$$\int_\Omega g \Delta f \, dx + \int_\Omega \nabla f \cdot \nabla g \, dx = \int_{\partial\Omega} g \frac{\partial f}{\partial n} \, ds. \tag{5.4.55}$$

Taking into account Theorems 5.21 and 5.23, the identity (5.4.55) can be easily extended by density to

$$\int_\Omega g \Delta f \, dx + \int_\Omega \nabla f \cdot \nabla g \, dx = \int_{\partial\Omega} g \frac{\partial f}{\partial n} \, ds \quad \forall f \in W^{2,p}(\Omega),\ g \in W^{1,q}(\Omega), \tag{5.4.56}$$

where $1 < p < \infty$ and $q$ is the conjugate of $p$, i.e., $q = p/(p - 1)$. Here, the functions in the right-hand side under $\int_{\partial\Omega}$ actually represent their traces on $\partial\Omega$.

Poincaré's Inequality$^{12}$. Now we present an important inequality which holds in $W_0^{1,p}(\Omega)$ for $1 \le p < \infty$ and $\Omega$ open and bounded.

11 Mikhail V. Ostrogradski, Russian-Ukrainian mathematician, mechanician, and physicist, 1801–1862.
12 Henri Poincaré, French mathematician, theoretical physicist, engineer, and philosopher of science, 1854–1912.

Theorem 5.24 (Poincaré). Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be an open bounded set and let $1 \le p < \infty$. Then

$$\|u\|_{L^p(\Omega)} \le C \|\nabla u\|_{L^p(\Omega)} \quad \forall u \in W_0^{1,p}(\Omega), \tag{5.4.57}$$

where $C$ is a positive constant depending on $\Omega$ and

$$\|\nabla u\|_{L^p(\Omega)} := \Big( \sum_{i=1}^k \|\partial u / \partial x_i\|_{L^p(\Omega)}^p \Big)^{1/p}.$$

Proof. Taking into account the definition of $W_0^{1,p}(\Omega)$, it is enough to prove (5.4.57) for all $u \in C_0^\infty(\Omega)$. Consider first the case $k = 1$, i.e., $\Omega = (a, b)$, $-\infty < a < b < \infty$. If $u \in C_0^\infty(a, b) := C_0^\infty((a, b))$, then $u(a) = 0$, so

$$u(x) = \int_a^x u'(t) \, dt \implies |u(x)| \le \int_a^b |u'(t)| \, dt \quad \forall x \in [a, b].$$

If $p = 1$ we obtain (5.4.57) with $C = b - a$ by integrating the last inequality over $[a, b]$. If $1 < p < \infty$ then we can derive from the same inequality by using Hölder

$$|u(x)| \le (b - a)^{1/p'} \|u'\|_{L^p(a,b)} \quad \forall x \in [a, b],$$

where $p' = p/(p - 1)$. Raising this to the power $p$ and integrating over $[a, b]$, and noting that $1 + p/p' = p$, it follows that

$$\int_a^b |u(x)|^p \, dx \le (b - a)^p \|u'\|_{L^p(a,b)}^p,$$

so (5.4.57) holds again with $C = b - a$.

Now, consider the case $k = 2$. Let $D = [a, b] \times [c, d]$ be a rectangle in the $xy$-plane such that $\Omega \subset D$. Take $u \in C_0^\infty(\Omega)$ and extend it as zero in $D \setminus \Omega$. We have

$$u(x, y) = \int_a^x \frac{\partial u}{\partial s}(s, y) \, ds \implies |u(x, y)| \le \int_a^b \Big| \frac{\partial u}{\partial s}(s, y) \Big| \, ds \quad \forall (x, y) \in D.$$

If $p = 1$ we obtain by integrating the last inequality over $D$

$$\|u\|_{L^1(D)} \le (b - a) \Big\| \frac{\partial u}{\partial x} \Big\|_{L^1(D)} \implies \|u\|_{L^1(\Omega)} \le (b - a) \Big\| \frac{\partial u}{\partial x} \Big\|_{L^1(\Omega)}.$$

If $1 < p < \infty$ we derive by using Hölder

$$\|u\|_{L^p(\Omega)} \le (b - a) \Big\| \frac{\partial u}{\partial x} \Big\|_{L^p(\Omega)}, \tag{5.4.58}$$

so, in fact, (5.4.58) is valid for $p \in [1, \infty)$. Similarly,

$$\|u\|_{L^p(\Omega)} \le (d - c) \Big\| \frac{\partial u}{\partial y} \Big\|_{L^p(\Omega)}. \tag{5.4.59}$$

By (5.4.58) and (5.4.59) it follows that (5.4.57) holds with $C = 2 \max\{b - a, d - c\}$. The proof is similar for $k \ge 3$.

Remark 5.25. An inspection of the above proof shows that the Poincaré inequality still holds if the Lebesgue measure of $\Omega$ is finite, and also if the projection of $\Omega$ on some coordinate plane is bounded.

Remark 5.26. If $\Omega$ is bounded or satisfies one of the conditions in the previous remark then, according to the Poincaré inequality, $W_0^{1,p}(\Omega)$ can be equipped with a new norm $\|u\|_{1,p}^* = \|\nabla u\|_{L^p(\Omega)}$, which is equivalent to the usual norm $\|\cdot\|_{1,p}$.
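As a numerical sanity check of the one-dimensional case of (5.4.57) with $C = b - a$, the sketch below takes $\Omega = (0, 1)$ and $u(x) = \sin(\pi x)$, which is smooth and vanishes at both endpoints. The helper names `lp_norm` and `poincare_gap`, the choice of $u$, and the midpoint quadrature are illustrative assumptions, not part of the text.

```python
import math


def lp_norm(func, a, b, p, n=20000):
    """Approximate the L^p(a, b) norm of func with the midpoint rule."""
    h = (b - a) / n
    total = sum(abs(func(a + (i + 0.5) * h)) ** p for i in range(n))
    return (total * h) ** (1.0 / p)


a, b = 0.0, 1.0


def u(x):
    """Test function on (0, 1); it vanishes at x = 0 and x = 1."""
    return math.sin(math.pi * x)


def du(x):
    """Classical derivative of u."""
    return math.pi * math.cos(math.pi * x)


def poincare_gap(p):
    """Return (b - a) * ||u'||_p - ||u||_p; (5.4.57) predicts a value >= 0."""
    return (b - a) * lp_norm(du, a, b, p) - lp_norm(u, a, b, p)


if __name__ == "__main__":
    for p in (1, 2, 3):
        print(p, poincare_gap(p))
```

The gap is strictly positive for each tested $p$, which is consistent with the proof: the constant $C = b - a$ is convenient but not optimal.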

5.5

Bochner’s Integral

Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be a Lebesgue measurable set, and let $(X, \|\cdot\|)$ be a real Banach space. As in the case of $\mathbb{R}$-valued functions, a function $g : \Omega \to X$ is a simple function if it is of the form

$$g(s) = \sum_{i=1}^p \chi_{M_i}(s) \, y_i$$

for some $y_i \in X$, $M_i \subset \Omega$ measurable with finite measure (i.e., $m(M_i) < \infty$), and $M_i \cap M_j = \emptyset$ if $i \neq j$. Here, we prefer to use $s$ to denote a generic point in $\Omega$ (instead of $x$ which could be used to designate points of $X$).

A function $f : \Omega \to X$ is called strongly measurable (or simply measurable) if there exists a sequence of simple functions $g_n : \Omega \to X$ such that

$$\lim_{n \to \infty} \|g_n(s) - f(s)\| = 0 \quad \text{for a.a. } s \in \Omega.$$

If $g$ is a simple function as above, then it is clearly measurable. Define its integral over $\Omega$ to be

$$\int_\Omega g(s) \, ds := \sum_{i=1}^p m(M_i) \, y_i.$$

If $g$ is a simple function, then $\|g\|$ (i.e., the function $s \mapsto \|g(s)\|$) is a simple function as well (hence Lebesgue integrable over $\Omega$) and the following inequality holds:

$$\Big\| \int_\Omega g(s) \, ds \Big\| \le \int_\Omega \|g(s)\| \, ds.$$

Denote by $S$ the set of all simple functions $g : \Omega \to X$. Clearly $S$ is a real linear space with respect to the usual operations (addition of functions and scalar multiplication), and

$$\int_\Omega (\alpha_1 g_1 + \alpha_2 g_2) \, ds = \alpha_1 \int_\Omega g_1 \, ds + \alpha_2 \int_\Omega g_2 \, ds \quad \forall \alpha_1, \alpha_2 \in \mathbb{R},\ \forall g_1, g_2 \in S.$$
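To make the integral of a simple function concrete, here is a small sketch with $X = \mathbb{R}^2$ (Euclidean norm) and $\Omega = (0, 3)$ split into three unit intervals. The data in `pieces` and the helper names are hypothetical and chosen only for illustration; the code evaluates $\sum_i m(M_i) y_i$ and $\sum_i m(M_i) \|y_i\|$ and compares the two sides of the norm inequality above.

```python
import math

# A simple function g = sum_i chi_{M_i} y_i with values in X = R^2.
# Each entry is (left endpoint, right endpoint, value y_i); the intervals M_i are disjoint.
pieces = [
    (0.0, 1.0, (3.0, 0.0)),
    (1.0, 2.0, (0.0, 4.0)),
    (2.0, 3.0, (-3.0, -4.0)),
]


def integral(pieces):
    """Integral of g over Omega: the vector sum_i m(M_i) * y_i."""
    dim = len(pieces[0][2])
    return tuple(sum((hi - lo) * y[j] for lo, hi, y in pieces) for j in range(dim))


def integral_of_norm(pieces):
    """Integral of s -> ||g(s)||: the number sum_i m(M_i) * ||y_i||."""
    return sum((hi - lo) * math.hypot(*y) for lo, hi, y in pieces)


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(*v)


if __name__ == "__main__":
    print(integral(pieces), norm(integral(pieces)), integral_of_norm(pieces))
```

In this example the vector values cancel, so the integral is the zero vector, while the integral of the norm is $3 + 4 + 5 = 12$; the inequality can therefore be strict.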

Definition 5.27. $f : \Omega \to X$ is said to be Bochner integrable (over $\Omega$)$^{13}$ if there exists a sequence of simple functions $g_n : \Omega \to X$ converging strongly to $f$ a.e. in $\Omega$ (so $f$ is measurable) and

$$\lim_{n,m \to \infty} \int_\Omega \|g_n(s) - g_m(s)\| \, ds = 0, \tag{5.5.60}$$

and the Bochner integral of $f$ is defined as

$$\int_\Omega f(s) \, ds := \lim_{n \to \infty} \int_\Omega g_n(s) \, ds. \tag{5.5.61}$$

13 Salomon Bochner, American mathematician, 1899–1982.

Let us justify the above definition. We have

$$\Big\| \int_\Omega g_n \, ds - \int_\Omega g_m \, ds \Big\| = \Big\| \int_\Omega (g_n - g_m) \, ds \Big\| \le \int_\Omega \|g_n - g_m\| \, ds.$$

So (5.5.60) implies

$$\lim_{n,m \to \infty} \Big\| \int_\Omega g_n \, ds - \int_\Omega g_m \, ds \Big\| = 0,$$

i.e., the limit in (5.5.61) exists. To prove the limit does not depend on the choice of $(g_n)$, consider another sequence $(\tilde{g}_n)$ satisfying the same properties. Then, by (5.5.60), we have for all $\varepsilon > 0$

$$\int_\Omega \|g_n - \tilde{g}_n - g_m + \tilde{g}_m\| \, ds \le \int_\Omega \|g_n - g_m\| \, ds + \int_\Omega \|\tilde{g}_n - \tilde{g}_m\| \, ds \le \varepsilon \quad \forall n, m > N_\varepsilon.$$

Letting $m \to \infty$ it follows from Fatou's Lemma that

$$\int_\Omega \|g_n - \tilde{g}_n\| \, ds \le \varepsilon \quad \forall n > N_\varepsilon. \tag{5.5.62}$$

Now, since $g_n$, $\tilde{g}_n$ are simple functions, we have

$$\Big\| \int_\Omega g_n \, ds - \int_\Omega \tilde{g}_n \, ds \Big\| = \Big\| \int_\Omega (g_n - \tilde{g}_n) \, ds \Big\| \le \int_\Omega \|g_n - \tilde{g}_n\| \, ds. \tag{5.5.63}$$

From (5.5.62) and (5.5.63) we deduce

$$\lim_{n \to \infty} \int_\Omega \tilde{g}_n \, ds = \lim_{n \to \infty} \int_\Omega g_n \, ds = \int_\Omega f \, ds,$$

so the definition is correct.

Remark 5.28. Note that if $X = \mathbb{R}^N$, $N \in \mathbb{N}$, then $f = (f_1, \dots, f_N)$ is measurable in the sense above if and only if $f_i$ is Lebesgue measurable for all $i = 1, \dots, N$, and integrability of $f$ in the sense of Bochner means integrability of all $f_i$'s in the sense of Lebesgue.

If $(X, \|\cdot\|)$ is an infinite dimensional Banach space, then, in addition to the concept of strong measurability of a function from $\Omega$ to $X$ as defined before, there is also a concept of weak measurability, namely $f : \Omega \to X$ is said to be weakly measurable if $s \mapsto x^*(f(s))$ is Lebesgue measurable for every continuous linear functional $x^* : (X, \|\cdot\|) \to \mathbb{R}$. If $X$ is a separable Banach space, then the weak measurability of $f$ is equivalent to its strong measurability. In fact, this equivalence holds if $f$ is almost separably valued, that is $\{f(s);\ s \in \Omega \setminus M\}$ is a separable set, where $M \subset \Omega$ has zero Lebesgue measure. This result belongs to
Pettis,$^{14}$ see, e.g., [51, p. 131]. It is worth mentioning that, in all the applications discussed in this book, $X$ will always stand for separable Banach spaces.

14 Billy James Pettis, American mathematician, 1913–1979.

The next result says that Bochner integrability of any $X$-valued function $f$ reduces to Lebesgue integrability of $\|f\|$.

Theorem 5.29 (Bochner). Let $(X, \|\cdot\|)$ be a real Banach space and let $\Omega \subset \mathbb{R}^k$ be a measurable set. If $f : \Omega \to X$ is strongly measurable, then $f$ is Bochner integrable if and only if $\|f\|$ is Lebesgue integrable, where $\|f\|(s) := \|f(s)\|$ for almost all $s \in \Omega$.

Proof. Since $f$ is strongly measurable, $\|f\|$ is also (Lebesgue) measurable because a sequence of simple functions gives a sequence of simple functions upon taking the norm. To prove necessity, assume that $f$ is Bochner integrable. If $(g_n)$ is a sequence of simple functions as in Definition 5.27, we can write (see (5.5.60))

$$\int_\Omega \|g_n - g_m\| \, ds \le \varepsilon \quad \forall n, m > N_\varepsilon.$$

Applying Fatou's Lemma, we get

$$\int_\Omega \|g_n - f\| \, ds \le \varepsilon \quad \forall n > N_\varepsilon,$$

i.e., $\|g_n - f\|$ is Lebesgue integrable for all $n > N_\varepsilon$. So integrating the obvious inequality $\|f\| \le \|f - g_n\| + \|g_n\|$ we obtain

$$\int_\Omega \|f\| \, ds \le \int_\Omega \|f - g_n\| \, ds + \int_\Omega \|g_n\| \, ds < \infty \quad \forall n > N_\varepsilon,$$

hence $\|f\|$ is Lebesgue integrable.

In order to prove sufficiency, assume that $\|f\|$ is Lebesgue integrable and consider a sequence of simple functions $h_n : \Omega \to X$ such that

$$\lim_{n \to \infty} \|h_n(s) - f(s)\| = 0 \quad \text{for almost all } s \in \Omega.$$

Define

$$g_n(s) = \begin{cases} h_n(s) & \text{if } \|h_n(s)\| \le (1 + \delta) \|f(s)\|, \\ 0 & \text{otherwise,} \end{cases}$$

where $\delta$ is a positive constant. This is a sequence of simple functions and

$$\lim_{n \to \infty} \|g_n(s) - f(s)\| = 0 \quad \text{for a.a. } s \in \Omega. \tag{5.5.64}$$

We must show

$$\lim_{n,m \to \infty} \int_\Omega \|g_n - g_m\| \, ds = 0. \tag{5.5.65}$$

To do this, we shall apply the Lebesgue Dominated Convergence Theorem to the sequence $(\|g_n - f\|)$. The first condition of this theorem is satisfied (see (5.5.64)), and

$$\|g_n(s) - f(s)\| \le \|g_n(s)\| + \|f(s)\| \le (1 + \delta) \|f(s)\| + \|f(s)\| = (2 + \delta) \|f(s)\|,$$

so the second condition of the Lebesgue Dominated Convergence Theorem is also satisfied, hence

$$\lim_{n \to \infty} \int_\Omega \|g_n - f\| \, ds = 0.$$

This along with the obvious inequality

$$\int_\Omega \|g_n - g_m\| \, ds \le \int_\Omega \|g_n - f\| \, ds + \int_\Omega \|g_m - f\| \, ds$$

implies (5.5.65).

Remark 5.30. It is worth pointing out that for every $f : \Omega \to X$ which is Bochner integrable, we have

$$\Big\| \int_\Omega f \, ds \Big\| \le \int_\Omega \|f\| \, ds,$$

because this inequality holds for simple functions. In general, the usual properties of the Lebesgue integral are also satisfied by the Bochner integral.

Remark 5.31. Let $(X, \|\cdot\|)$ and $(Y, \|\cdot\|_*)$ be real Banach spaces. If $f : \Omega \to X$ is Bochner integrable over $\Omega$ and $A$ is a continuous linear operator from $(X, \|\cdot\|)$ to $(Y, \|\cdot\|_*)$, then $A \circ f$ is also Bochner integrable and

$$\int_\Omega A \circ f \, ds = A \int_\Omega f \, ds.$$

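Indeed, if $(g_n)$ is a sequence of simple functions converging to $f$, then $(A \circ g_n)$ is also a sequence of simple functions which converges to $A \circ f$. Moreover,

$$\int_\Omega \|A \circ g_n - A \circ g_m\|_* \, ds \le \|A\| \int_\Omega \|g_n - g_m\| \, ds \to 0 \quad \text{as } n, m \to \infty.$$

It follows that

$$\int_\Omega A \circ f \, ds = \lim_{n \to \infty} \int_\Omega A \circ g_n \, ds = \lim_{n \to \infty} A \int_\Omega g_n \, ds = A \int_\Omega f \, ds,$$

as claimed.

For $X$ a real Banach space, $\Omega \subset \mathbb{R}^k$ measurable, and $1 \le p < \infty$ define

$$\mathcal{L}^p(\Omega; X) = \Big\{ f : \Omega \to X;\ f \text{ is measurable and } \int_\Omega \|f\|^p \, ds < \infty \Big\}.$$

We also define

$$\mathcal{L}^\infty(\Omega; X) = \Big\{ f : \Omega \to X;\ f \text{ is measurable and } \operatorname*{ess\,sup}_{s \in \Omega} \|f(s)\| < \infty \Big\},$$

where

$$\operatorname*{ess\,sup}_{s \in \Omega} \|f(s)\| := \inf\{C;\ \|f(s)\| \le C \text{ a.e. on } \Omega\}.$$

Let $\sim$ denote equality a.e. and define the quotient space $L^p(\Omega; X) := \mathcal{L}^p(\Omega; X)/\!\sim$. This is a real Banach space for $1 \le p \le \infty$ with respect to the norm

$$\|f\|_{L^p(\Omega; X)} := \Big( \int_\Omega \|f\|^p \, ds \Big)^{1/p}, \quad 1 \le p < \infty,$$

$$\|f\|_{L^\infty(\Omega; X)} := \operatorname*{ess\,sup}_{s \in \Omega} \|f(s)\|.$$

The proof follows by arguments similar to those from the proof of the classical theorem corresponding to the case $X = \mathbb{R}$ (Theorem 3.25), so we leave it to the reader as an exercise. The key condition is the completeness of $X$. If $\Omega = (a, b)$ with $-\infty \le a < b \le \infty$ denote $L^p(a, b; X) := L^p((a, b); X)$.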

5.6

Vector Distributions, $W^{m,p}(a, b; X)$ Spaces

Let $X$ be a Banach space and let $-\infty \le a < b \le \infty$. Denote as before $\mathcal{D}(a, b) = C_0^\infty(a, b) := C_0^\infty((a, b))$ equipped with the inductive limit topology.

Definition 5.32. An $X$-valued distribution over $(a, b)$ is an operator $u : \mathcal{D}(a, b) \to X$ which is linear and continuous (in the sense that if $\varphi_n \to 0$ in $\mathcal{D}(a, b)$ then $u(\varphi_n) \to 0$). The set of all such vector distributions is denoted $\mathcal{D}'(a, b; X)$.

As in the scalar case, a regular distribution is one which is generated by a locally integrable function $u \in L^1_{loc}(a, b; X)$, i.e., $u : (a, b) \to X$ is strongly measurable and $u \in L^1(K; X)$ for all $K \subset (a, b)$ compact. Define $\tilde{u} : \mathcal{D}(a, b) \to X$ by

$$\tilde{u}(\varphi) := \int_a^b \varphi(t) u(t) \, dt \quad \forall \varphi \in \mathcal{D}(a, b).$$

The mapping $u \mapsto \tilde{u}$ is injective, as its null set is $\{0\}$. Indeed, if $v \in L^1_{loc}(a, b; X)$ satisfies

$$\int_a^b \varphi(t) v(t) \, dt = 0 \quad \forall \varphi \in \mathcal{D}(a, b),$$

we have (cf. Remark 5.31)

$$\int_a^b \varphi(t) \, x^*(v(t)) \, dt = 0 \quad \forall x^* \in X^*,\ \forall \varphi \in \mathcal{D}(a, b),$$

where $X^*$ is the dual of $X$. Since $t \mapsto x^*(v(t))$ is a real, locally summable function, it follows by Theorem 5.9 that $x^*(v(t)) = 0$ $\forall x^* \in X^*$ and a.a. $t \in (a, b)$, so $v(t) = 0$ for a.a. $t \in (a, b)$. Consequently, one can identify the (regular) distribution $\tilde{u}$ with the locally summable function $u$, and write

$$u(\varphi) := \int_a^b \varphi(t) u(t) \, dt \quad \forall \varphi \in \mathcal{D}(a, b).$$

Of course, as in the scalar case, not all vector distributions arise in this way, e.g., $u : \mathcal{D}(\mathbb{R}) \to X$ defined by $u(\varphi) = \varphi(0) x$ for all $\varphi \in \mathcal{D}(\mathbb{R})$ and a fixed $x \in X \setminus \{0\}$.

For $u \in \mathcal{D}'(a, b; X)$ define the derivative

$$u'(\varphi) := -u(\varphi') \quad \forall \varphi \in \mathcal{D}(a, b),$$

and inductively,

$$u^{(j)}(\varphi) = (-1)^j u(\varphi^{(j)}) \quad \forall \varphi \in \mathcal{D}(a, b),\ j \in \mathbb{N},$$

and by convention $u^{(0)} = u$. In applications, intervals $(a, b)$ are sufficient, though the theory extends to $\Omega \subset \mathbb{R}^k$.

For $m \in \mathbb{N}$, $1 \le p \le \infty$, we set

$$W^{m,p}(a, b; X) := \{u \in \mathcal{D}'(a, b; X);\ u^{(j)} \in L^p(a, b; X),\ j = 0, 1, \dots, m\},$$

so, in fact, $u$ is a regular distribution because $j = 0$ is included. Also, all (distributional) derivatives above are regular as well.

Theorem 5.33. If $X$ is a Banach space then, for all $m \in \mathbb{N}$ and $1 \le p \le \infty$, $W^{m,p}(a, b; X)$ is a Banach space with respect to the norm

$$\|u\|_{W^{m,p}(a,b;X)} := \Big( \sum_{j=0}^m \|u^{(j)}\|_{L^p(a,b;X)}^p \Big)^{1/p}, \quad 1 \le p < \infty,$$

$$\|u\|_{W^{m,\infty}(a,b;X)} := \max_{0 \le j \le m} \|u^{(j)}\|_{L^\infty(a,b;X)}, \quad p = \infty.$$

Proof. Similar to the proof of Theorem 5.20.

The notation $W^{m,p}_{loc}(a, b; X)$ indicates the set of all $u \in \mathcal{D}'(a, b; X)$ such that $u \in W^{m,p}(t_1, t_2; X)$ for every bounded interval $(t_1, t_2) \subset (a, b)$. For $p = 2$ denote $H^m(a, b; X) = W^{m,2}(a, b; X)$. If $X$ is a Hilbert space, then so is $H^m(a, b; X)$ with respect to the inner product

$$(u, v)_{H^m(a,b;X)} = \sum_{j=0}^m \int_a^b \big( u^{(j)}(t), v^{(j)}(t) \big)_X \, dt.$$

Now for $-\infty < a < b < +\infty$ denote by $A^{m,p}(a, b; X)$ the space of all functions $f : [a, b] \to X$ such that $f$ is absolutely continuous on $[a, b]$, the pointwise derivatives $d^j f/dt^j$ exist and are absolutely continuous on $[a, b]$ for $j = 1, 2, \dots, m - 1$, and $d^m f/dt^m \in L^p(a, b; X)$.

Remark 5.34. If $X$ is reflexive, it follows by a well-known theorem due to Kōmura$^{15}$ (see [25]; see also [45, p. 105]) that $A^{1,1}(a, b; X) = AC([a, b]; X)$, where $AC([a, b]; X)$ is the space of all $X$-valued absolutely continuous functions on $[a, b]$.

15 Yukio Kōmura, Japanese mathematician, born 1931.

Theorem 5.35. Let $m \in \mathbb{N}$, $1 \le p \le \infty$, $-\infty < a < b < \infty$, and $u \in L^p(a, b; X)$. Then the following are equivalent:

(j) $u \in W^{m,p}(a, b; X)$;

(jj) there exists $u_1 \in A^{m,p}(a, b; X)$ such that $u_1(t) = u(t)$ for almost all $t \in (a, b)$.

Proof. We shall prove the case $m = 1$, and then the result follows by induction. To prove the implication (j) $\Rightarrow$ (jj) fix $u \in W^{1,p}(a, b; X)$ and extend it as zero in $\mathbb{R} \setminus (a, b)$. For $\varepsilon > 0$ small define $u_\varepsilon$ as before, i.e.,

$$u_\varepsilon(t) = \int_{\mathbb{R}} \omega_\varepsilon(t - s) u(s) \, ds,$$

where

$$\omega_\varepsilon(t) = \frac{1}{\varepsilon} \omega(t/\varepsilon), \quad \text{and} \quad \omega(t) = \begin{cases} C e^{-\frac{1}{1 - t^2}}, & |t| < 1, \\ 0, & |t| \ge 1, \end{cases}$$

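with $C > 0$ such that $\int_{\mathbb{R}} \omega(t) \, dt = 1$. We have

$$\dot{u}_\varepsilon(t) = \frac{d}{dt} u_\varepsilon(t) = \int_{\mathbb{R}} \omega_\varepsilon'(t - s) u(s) \, ds \quad \forall t \in \mathbb{R},$$

which is a function, but we understand it as a distribution and apply it to a test function $\varphi \in C_0^\infty(\mathbb{R})$:

$$(\dot{u}_\varepsilon, \varphi) = \int_{\mathbb{R}} \varphi(t) \dot{u}_\varepsilon(t) \, dt = \int_{\mathbb{R}} \varphi(t) \Big( \int_{\mathbb{R}} \omega_\varepsilon'(t - s) u(s) \, ds \Big) dt$$

and interchanging the order of integration

$$= \int_{\mathbb{R}} \Big( \int_{\mathbb{R}} \omega_\varepsilon'(t - s) \varphi(t) \, dt \Big) u(s) \, ds = -\int_{\mathbb{R}} \varphi_\varepsilon'(s) u(s) \, ds = -(u, \varphi_\varepsilon') = u'(\varphi_\varepsilon)$$

$$= \int_{\mathbb{R}} \varphi_\varepsilon(t) u'(t) \, dt = \int_{\mathbb{R}} \Big( \int_{\mathbb{R}} \omega_\varepsilon(t - s) \varphi(s) \, ds \Big) u'(t) \, dt$$

and changing the order of integration again

$$= \int_{\mathbb{R}} \varphi(s) \Big( \int_{\mathbb{R}} \omega_\varepsilon(t - s) u'(t) \, dt \Big) ds = \int_{\mathbb{R}} \varphi(s) (u')_\varepsilon(s) \, ds,$$

so that

$$(\dot{u}_\varepsilon, \varphi) = ((u')_\varepsilon, \varphi) \quad \forall \varphi \in C_0^\infty(\mathbb{R}). \tag{5.6.66}$$

In other words, the pointwise derivative $\dot{u}_\varepsilon$ is equal to $(u')_\varepsilon$.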

Now, integrate to obtain

$$u_\varepsilon(t) - u_\varepsilon(s) = \int_s^t (u')_\varepsilon(\tau) \, d\tau. \tag{5.6.67}$$

Note that $u_\varepsilon \to u$ and $(u')_\varepsilon \to u'$ in $L^p(a, b; X)$ as $\varepsilon \to 0^+$ (the proof is the same as in the scalar case). Hence, there exists a function $u_1$ such that

$$u_1(t) - u_1(s) = \int_s^t u'(\tau) \, d\tau \quad \text{for a.a. } s, t \in (a, b).$$

Therefore, $u_1 \in AC([a, b]; X)$ and $\dot{u}_1 = u'$ for almost all $t \in (a, b)$, i.e., the pointwise derivative $\dot{u}_1$ is a representative of the distributional derivative $u' \in L^p(a, b; X)$. So $\dot{u}_1 \in L^p(a, b; X)$, which together with absolute continuity implies that $u_1 \in A^{1,p}(a, b; X)$.

For the implication (jj) $\Rightarrow$ (j), assume there exists $u_1 \in A^{1,p}(a, b; X)$ an element of the class $u$. We must show that $u \in W^{1,p}(a, b; X)$. Since $u_1 \in AC([a, b]; X)$, we have $u \in L^p(a, b; X)$, and we must show that $u' \in L^p(a, b; X)$. We start with $\dot{u}_1$ and interpret it as a distribution. For all $\varphi \in \mathcal{D}(a, b)$, we have

$$(\dot{u}_1, \varphi) = \int_a^b \varphi \, \dot{u}_1 \, dt$$

and, integrating by parts,

$$= -\int_a^b \dot{\varphi} \, u_1 \, dt$$

and, since changing $u_1$ to another element of its class won't affect the integral,

$$= -\int_a^b \dot{\varphi} \, u \, dt = -u(\dot{\varphi}) = u'(\varphi).$$

Therefore, $\dot{u}_1 = u'$ as distributions, but since $\dot{u}_1$ is a function, so is $u'$, and $\dot{u}_1 \in L^p(a, b; X)$, so $u' \in L^p(a, b; X)$.

Note that usually good representatives are preferred since their values at particular points make sense.

5.7

Exercises

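1. Let $\Omega = \mathbb{R} \times (-1, +1) \subset \mathbb{R}^2$ and let $u : \Omega \to \mathbb{R}$ be defined by $u(x) = |x_1| x_1^2 (1 + x_1 x_2 + |x_2| x_2^2)$. Show that $u \in C^2(\Omega)$ and find supp $u$.

2. Find a collection $F$ of seminorms on $C[0, 1] := C([0, 1]; \mathbb{R})$ such that the topology generated by $F$ coincides with the pointwise convergence topology.

3. Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set. For any compact set $K \subset \Omega$ and $m \in \mathbb{N} \cup \{0\}$ define the seminorm $p_{K,m} : C^\infty(\Omega) \to \mathbb{R}$,

$$p_{K,m}(f) = \sup_{x \in K,\ |\alpha| \le m} |D^\alpha f(x)|, \quad f \in C^\infty(\Omega),$$

where $\alpha = (\alpha_1, \dots, \alpha_k)$ are multi-indices, $|\alpha| = \alpha_1 + \cdots + \alpha_k$, and

$$D^\alpha f(x) = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_k^{\alpha_k}}(x_1, \dots, x_k).$$

Consider a sequence of compact sets $K_1 \subset K_2 \subset \cdots \subset K_n \subset \cdots \subset \Omega$, such that $\Omega = \cup_{n=1}^\infty K_n$. Define for each $j \in \mathbb{N}$

$$d_j(f, g) = \sum_{m=0}^j \frac{1}{2^m} \cdot \frac{p_{K_j,m}(f - g)}{1 + p_{K_j,m}(f - g)}, \quad f, g \in C^j(\Omega),$$

and

$$d(f, g) = \sum_{j=1}^\infty \frac{1}{2^j} \cdot \frac{d_j(f, g)}{1 + d_j(f, g)}, \quad f, g \in C^\infty(\Omega).$$

Show that $d$ is a metric on $C^\infty(\Omega)$.

4. Find a function $\varphi \in C^\infty(\mathbb{R})$ with supp $\varphi = [0, 4]$, $\varphi \ge 0$ and $\max_{\mathbb{R}} \varphi = 1$.

5. Let $\varphi \in C_0^\infty(\mathbb{R}^k)$. Prove that there exists $\psi \in C_0^\infty(\mathbb{R}^k)$ such that $\varphi = \dfrac{\partial^k \psi}{\partial x_1 \cdots \partial x_k}$ if and only if $\int_{\mathbb{R}^k} \varphi(x) \, dx = 0$.

6. Let $(a_n)_{n \in \mathbb{N}}$ be a sequence of real numbers. Prove that there exists a function $\varphi \in C_0^\infty(\mathbb{R})$ such that $\varphi(n) = a_n$ $\forall n \in \mathbb{N}$ if and only if there exists an $n_0 \in \mathbb{N}$ such that $a_n = 0$ $\forall n > n_0$.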

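7. Let $m \in \mathbb{N}$ and $\psi \in C_0^\infty(\mathbb{R}^k)$. Define the sequence $(\varphi_n)$ by $\varphi_n(x) = 2^{-n} n^m \psi(nx)$, $x \in \mathbb{R}^k$, $n \in \mathbb{N}$. Show that $\varphi_n \to 0$ in $\mathcal{D}(\mathbb{R}^k)$ as $n \to \infty$.

8. Let $h$ be a nonzero vector in $\mathbb{R}^k$ and let $\psi \in C_0^\infty(\mathbb{R}^k)$. Consider the sequence $(\varphi_n)_{n \in \mathbb{N}}$, where

$$\varphi_n(x) = n \Big[ \psi\Big(x + \frac{1}{n} h\Big) - \psi(x) \Big], \quad x \in \mathbb{R}^k,\ n \in \mathbb{N}.$$

Prove that

$$\varphi_n \to \sum_{j=1}^k h_j \frac{\partial \psi}{\partial x_j} \quad \text{in } \mathcal{D}(\mathbb{R}^k).$$

Deduce from this result the convergence in $\mathcal{D}(\mathbb{R}^k)$ to 0 of the sequence $(\gamma_n)_{n \in \mathbb{N}}$ defined by

$$\gamma_n(x) = n \Big[ \psi\Big(x + \frac{1}{n} h\Big) - \psi\Big(x - \frac{1}{n} h\Big) \Big], \quad x \in \mathbb{R}^k,\ n \in \mathbb{N}.$$

9. Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set. For $\varphi \in C_0^\infty(\Omega)$ consider

$$\varphi_n(x) = \int_\Omega \omega_{1/n}(x - y) \varphi(y) \, dy, \quad x \in \Omega,\ n \in \mathbb{N} \text{ sufficiently large},$$

where $\omega_{1/n}$ denotes the usual Friedrichs mollifier. Prove that $\varphi_n$ converges to $\varphi$ in $\mathcal{D}(\Omega)$.

10. Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set. For a given point $a \in \Omega$ and for a multi-index $\alpha \in \mathbb{N}_0^k$, define $u : \mathcal{D}(\Omega) \to \mathbb{R}$ by

$$u(\varphi) = D^\alpha \varphi(a) \quad \forall \varphi \in \mathcal{D}(\Omega).$$

Here $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. Prove that $u \in \mathcal{D}'(\Omega)$ and $u$ is not a regular distribution.

11. Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set. Show that, if $\varphi \in \mathcal{D}(\Omega)$ satisfies $u(\varphi) = 0$ $\forall u \in \mathcal{D}'(\Omega)$, then $\varphi = 0$.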

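12. Let $u : \mathcal{D}(\mathbb{R}) \to \mathbb{R}$,

$$u(\varphi) = \sum_{i=1}^\infty \big[ \varphi(1/i^2) - \varphi(0) \big], \quad \varphi \in \mathcal{D}(\mathbb{R}).$$

Prove that $u$ is well defined, $u \in \mathcal{D}'(\mathbb{R})$, and $u$ is not a regular distribution.

13. Show that mixed derivatives of distributions do not depend on the order of differentiation.

14. Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be an open set, $u \in \mathcal{D}'(\Omega)$, $a \in C^\infty(\Omega)$. Show that

$$\frac{\partial (au)}{\partial x_i} = \frac{\partial a}{\partial x_i} u + a \frac{\partial u}{\partial x_i}.$$

Extend this formula to $D^\alpha(au)$ for a general multi-index $\alpha$.

15. Find the $n$-th derivatives ($n = 1, 2, 3$) in the sense of distributions of $f, g : \mathbb{R} \to \mathbb{R}$,

$$f(x) = \tfrac{1}{2} x|x|, \quad x \in \mathbb{R}, \qquad g(x) = H(x) \cdot \cos x, \quad x \in \mathbb{R},$$

where $H$ denotes the usual Heaviside function.

16. Find a sequence $(H_n)_{n \in \mathbb{N}}$ in $C_0^\infty(\mathbb{R})$ such that $H_n \to H$ in $\mathcal{D}'(\mathbb{R})$, where $H$ is the Heaviside function.

17. Let $u : \mathcal{D}(\mathbb{R}^2) \to \mathbb{R}$,

$$u(\varphi) = \int_{-\infty}^{\infty} \varphi(x_1, 0) \, dx_1 \quad \forall \varphi \in \mathcal{D}(\mathbb{R}^2).$$

(i) Prove that $u \in \mathcal{D}'(\mathbb{R}^2)$;
(ii) Show that $u$ is not a regular distribution;
(iii) Check that $\dfrac{\partial u}{\partial x_1} = 0$.

18. Let $\Omega \subset \mathbb{R}^k$ be a nonempty open set and let $S \subset \Omega$ be a countably infinite set of isolated points, $S = \{x_1, x_2, \dots, x_n, \dots\}$. Show that for any sequence of real numbers $(a_n)_{n \in \mathbb{N}}$ the series $\sum_{n=1}^\infty a_n \delta_{x_n}$ converges in $\mathcal{D}'(\Omega)$.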

19. Let (xn )n∈N be a sequence in Rk . Prove the following implication: δxn → 0 in D (Rk ) =⇒ xn → ∞. 20. Solve the following equations in D (R): (a) u + tu = χ[0,1] (t) (where χ[0,1] is the characteristic function of [0, 1]); (b) u + u = H + δ (where H denotes the Heaviside function and δ is the Dirac distribution); (c) u − 2u + u = 2δ(t − 1) + δ(t − 2); (d) u − 4u = δ − δ − 8. 21. Solve the Cauchy problem u − u = δ(t − 1) + 2δ(t − 3) − 2t − 1 u(0) = 1, u (0) = 0.

in D (R),

22. Prove that the solution set of the equation (sin t) · u = 0 in D (R) is an inﬁnite dimensional linear subspace of D (R). 23. Find u1 , u2 , u3 ∈ D (R) satisfying the diﬀerential system ⎧ ⎪ ⎨u1 = 4u1 − u2 + H, u2 = 3u1 + u2 − u3 + δ, ⎪ ⎩ u3 = u1 + u3 + H. 24. Let a be a given real number. If u ∈ W01,1 (a, ∞) : = W01,1 (a, ∞; R), prove that there exists a function v ∈ C[a, ∞) which is a representative of the class u, and v(a) = 0. 25. Let p ∈ (1, ∞). Show that W 2,p (0, 1) is compactly embedded into C 1 [0, 1]. The Sobolev space W 2,p (0, 1) is equipped with the usual norm, and C 1 [0, 1] is equipped with the norm f C 1 = max |f (t)| + max |f (t)| ∀f ∈ C 1 [0, 1]. 0≤t≤1

0≤t≤1

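26. Let $\varphi \in C_0^\infty(\mathbb{R}) \setminus \{0\}$ and let $1 \le p \le +\infty$. Define $u_n : \mathbb{R} \to \mathbb{R}$ by $u_n(t) = \varphi(t + n)$, $t \in \mathbb{R}$, $n \in \mathbb{N}$. Prove that

(i) $(u_n)_{n \in \mathbb{N}}$ is bounded in $W^{m,p}(\mathbb{R})$ for every $m \in \mathbb{N}$;
(ii) there exists no subsequence of $(u_n)$ converging strongly in $L^q(\mathbb{R})$ for any $1 \le q \le \infty$.

27. Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be an open bounded set. If $u, v \in H^1(\Omega) = W^{1,2}(\Omega)$, show that $uv \in W^{1,1}(\Omega)$ and

$$\frac{\partial}{\partial x_i}(uv) = \frac{\partial u}{\partial x_i} \cdot v + u \cdot \frac{\partial v}{\partial x_i}, \quad i = 1, 2, \dots, k,$$

in $\mathcal{D}'(\Omega)$ and a.e. in $\Omega$.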

Chapter 6

Hilbert Spaces

Let $X$ be a linear space over $\mathbb{K}$ equipped with a scalar (inner) product $(\cdot, \cdot)$ (i.e., $X$ is an inner product space or a generalized Euclidean space, as defined in Chap. 1). As usual, throughout this chapter $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$. Define the norm $\|x\| = \sqrt{(x, x)}$, $x \in X$. If $(X, \|\cdot\|)$ is a Banach space (i.e., $(X, d)$ is a complete metric space, where $d(x, y) = \|x - y\|$, $x, y \in X$), then $X$ is said to be a Hilbert$^1$ space. In other words, a Hilbert space is a Banach space $(X, \|\cdot\|)$ whose norm is given by a scalar product.

6.1

Examples

We have already met some Hilbert spaces, such as the Euclidean space $\mathbb{R}^k$, $\mathbb{C}^k$, $L^2(\Omega)$, $H^m(\Omega)$, $m \in \mathbb{N}$, these spaces being equipped with their usual scalar products, i.e.,

$$(x, y) = \sum_{i=1}^k x_i y_i, \quad x = (x_1, \dots, x_k),\ y = (y_1, \dots, y_k) \in \mathbb{R}^k,$$

$$(x, y) = \sum_{i=1}^k x_i \overline{y_i}, \quad x = (x_1, \dots, x_k),\ y = (y_1, \dots, y_k) \in \mathbb{C}^k,$$

$$(u, v)_{L^2(\Omega)} = \int_\Omega uv \, dx, \quad u, v \in L^2(\Omega),$$

$$(u, v)_m = \sum_{|\alpha| \le m} (D^\alpha u, D^\alpha v)_{L^2(\Omega)}, \quad u, v \in H^m(\Omega),$$

and the corresponding induced norms

$$\|x\|^2 = \sum_{i=1}^k x_i^2, \quad x = (x_1, \dots, x_k) \in \mathbb{R}^k,$$

$$\|x\|^2 = \sum_{i=1}^k |x_i|^2, \quad x = (x_1, \dots, x_k) \in \mathbb{C}^k,$$

$$\|u\|_{L^2(\Omega)}^2 = \int_\Omega u^2 \, dx, \quad u \in L^2(\Omega),$$

$$\|u\|_m^2 = \sum_{|\alpha| \le m} \|D^\alpha u\|_{L^2(\Omega)}^2, \quad u \in H^m(\Omega),$$

where $\Omega$ is a measurable or open subset of $\mathbb{R}^k$ in the third and fourth cases, respectively.

1 David Hilbert, German mathematician, 1862–1943.

© Springer Nature Switzerland AG 2019 G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4 6

Obviously, every Cauchy sequence in $\mathbb{R}^k$ is convergent since the corresponding coordinate sequences are Cauchy in $(\mathbb{R}, |\cdot|)$, hence convergent in that space. So the Euclidean space $\mathbb{R}^k$ equipped with the above scalar product and norm is a Hilbert space over $\mathbb{R}$. Similarly, $\mathbb{C}^k$ equipped with the above scalar product and norm is a Hilbert space over $\mathbb{C}$.

Note also that $L^p(\Omega)$ equipped with the usual norm is a Banach space for all $1 \le p \le \infty$ (see Theorem 3.25). So $(L^2(\Omega), \|\cdot\|_{L^2(\Omega)})$ is a real Hilbert space. Also, $H^m(\Omega)$ equipped with the above scalar product and norm is a real Hilbert space, and so is its closed subspace $H_0^m(\Omega)$, $m \in \mathbb{N}$. It is worth pointing out that $H_0^1(\Omega)$ can be equipped with a different scalar product,

$$(u, v)_1^* = \int_\Omega \nabla u \cdot \nabla v \, dx, \quad u, v \in H_0^1(\Omega),$$

and the induced norm

$$\|u\|_1^* = \|\nabla u\|_{L^2(\Omega)}, \quad u \in H_0^1(\Omega),$$

whenever $\Omega$ is open and has finite measure, or its projection on a coordinate plane is bounded (see Theorem 5.24 and Remarks 5.25 and 5.26).

Note also that, for $-\infty \le a < b \le \infty$ and a Hilbert space $X$, $L^2(a, b; X)$ equipped with the scalar product

$$(u, v)_{L^2(a,b;X)} = \int_a^b \big( u(t), v(t) \big)_X \, dt, \quad u, v \in L^2(a, b; X),$$

and the induced norm

$$\|u\|_{L^2(a,b;X)}^2 = \int_a^b \|u(t)\|_X^2 \, dt,$$

is a Hilbert space, too. Also, $H^m(a, b; X)$ is a Hilbert space for any $m \in \mathbb{N}$ with respect to the scalar product

$$(u, v)_m = \sum_{j=0}^m \int_a^b \big( u^{(j)}(t), v^{(j)}(t) \big)_X \, dt, \quad u, v \in H^m(a, b; X),$$

and the induced norm

$$\|u\|_m^2 = \sum_{j=0}^m \int_a^b \|u^{(j)}(t)\|_X^2 \, dt, \quad u \in H^m(a, b; X).$$

Let us point out that any inner product space can be extended (uniquely up to isomorphism) to a Hilbert space, by a completion procedure similar to that used in the proof of Theorem 2.8. To illustrate this consider the space $C[0, 2]$ endowed with the scalar product

$$\langle u, v \rangle = \int_0^2 u(t) v(t) \, dt, \quad u, v \in C[0, 2],$$

and the induced norm

$$\|u\|_{L^2}^2 = \langle u, u \rangle = \int_0^2 u(t)^2 \, dt, \quad u \in C[0, 2].$$

The space $(C[0, 2], \|\cdot\|_{L^2})$ is not complete (i.e., it is not a Hilbert space), as can be seen by using the sequence $(u_n)_{n \ge 2}$ defined by

$$u_n(t) = \begin{cases} 0, & 0 \le t \le 1 - \frac{1}{n}, \\ nt - n + 1, & 1 - \frac{1}{n} < t < 1, \\ 1, & 1 \le t \le 2, \end{cases}$$

but it can be extended to the Hilbert space $(L^2(0, 2), \|\cdot\|_{L^2})$ (each element of $C[0, 2]$ being identified with its $L^2$ equivalence class). If $X$ is a finite dimensional inner product space, then it is a Hilbert space with respect to the norm induced by the corresponding inner product, so no extension is needed (in particular, $\mathbb{R}^k$ and $\mathbb{C}^k$ are Hilbert spaces).
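A short worked computation may help to see why the sequence $(u_n)$ witnesses the incompleteness of $(C[0, 2], \|\cdot\|_{L^2})$. The symbol $\chi$ is notation introduced here only for this sketch: it denotes the function equal to $0$ on $[0, 1)$ and to $1$ on $[1, 2]$. Since $u_n$ and $\chi$ differ only on $(1 - \frac{1}{n}, 1)$, the substitution $\tau = nt - n + 1$ gives

```latex
\|u_n - \chi\|_{L^2}^2
  = \int_{1-\frac{1}{n}}^{1} (nt - n + 1)^2 \, dt
  = \frac{1}{n} \int_0^1 \tau^2 \, d\tau
  = \frac{1}{3n} \longrightarrow 0 \quad (n \to \infty).
```

Thus $(u_n)$ converges in $L^2(0, 2)$ and is therefore Cauchy with respect to $\|\cdot\|_{L^2}$. If it converged in $C[0, 2]$ to some continuous $u$, then $u$ would coincide with $\chi$ almost everywhere, which is impossible because $\chi$ has a jump at $t = 1$.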

6.2

Jordan–von Neumann Characterization Theorem

Our aim in this chapter is to present the main properties of Hilbert spaces which are of course common to all the particular spaces mentioned above. First of all, we state the following characterization result due to Jordan and von Neumann.$^2$

Theorem 6.1 (Jordan–von Neumann). Let $(H, \|\cdot\|)$ be a normed space. Then the norm $\|\cdot\|$ is given by a scalar product (i.e., there exists a scalar product $(\cdot, \cdot) : H \times H \to \mathbb{K}$ such that $\|x\| = \sqrt{(x, x)}$, $x \in H$) if and only if $\|\cdot\|$ satisfies the parallelogram law. (Hence, a Banach space $(H, \|\cdot\|)$ is Hilbert $\iff$ its norm $\|\cdot\|$ satisfies the parallelogram law.)

Proof. Necessity has already been proved in Chap. 1, though we repeat here the proof which is immediate. Assuming that $\|\cdot\|$ is generated by a scalar product $(\cdot, \cdot)$, we have for all $x, y \in H$

$$\|x + y\|^2 + \|x - y\|^2 = (x + y, x + y) + (x - y, x - y) = 2(\|x\|^2 + \|y\|^2), \tag{6.2.1}$$

i.e., the norm satisfies the parallelogram law.

2 Pascual Jordan, German theoretical and mathematical physicist, 1902–1980; John von Neumann, Hungarian-American mathematician, physicist, and computer scientist, 1903–1957.

Now let us prove sufficiency. Assume that the norm $\|\cdot\|$ of $H$ satisfies the parallelogram law (see (6.2.1)). Consider first the case $\mathbb{K} = \mathbb{R}$. Define $f : H \times H \to \mathbb{R}$ by

$$f(x, y) = \frac{1}{4} \big( \|x + y\|^2 - \|x - y\|^2 \big), \quad x, y \in H,$$

which we will show is a scalar product on $H$. Clearly,

$$f(x, x) = \frac{1}{4} \|2x\|^2 = \|x\|^2 \quad \forall x \in H, \tag{6.2.2}$$

$$f(x, y) = f(y, x) \quad \forall x, y \in H, \tag{6.2.3}$$

$$f(x, 0) = 0 \quad \forall x \in H. \tag{6.2.4}$$

Obviously, for any $x_1, x_2, y \in H$, we have

$$f(x_1 + x_2, y) = \frac{1}{4} \big( \|x_1 + x_2 + y\|^2 - \|x_1 + x_2 - y\|^2 \big),$$

$$f(x_1 - x_2, y) = \frac{1}{4} \big( \|x_1 - x_2 + y\|^2 - \|x_1 - x_2 - y\|^2 \big).$$

Add the two equations and apply the parallelogram law to get

$$f(x_1 + x_2, y) + f(x_1 - x_2, y) = \frac{1}{2} \big( \|x_1 + y\|^2 + \|x_2\|^2 - \|x_1 - y\|^2 - \|x_2\|^2 \big) = \frac{1}{2} \big( \|x_1 + y\|^2 - \|x_1 - y\|^2 \big) = 2 f(x_1, y). \tag{6.2.5}$$

In the special case $x_1 = x_2 = x$ we have (see also (6.2.4) and (6.2.3))

$$f(2x, y) = 2 f(x, y) \quad \forall x, y \in H. \tag{6.2.6}$$

Now choose in (6.2.5) $x_1 + x_2 = x$ and $x_1 - x_2 = x'$ to obtain

$$f(x, y) + f(x', y) = 2 f\Big( \frac{x + x'}{2}, y \Big),$$

which by (6.2.6) gives

$$f(x + x', y) = f(x, y) + f(x', y) \quad \forall x, x', y \in H. \tag{6.2.7}$$

From (6.2.7) we obtain $f(nx, y) = n f(x, y)$ for all $n \in \mathbb{N}$ which can be extended to

$$f(nx, y) = n f(x, y) \quad \forall x, y \in H,\ \forall n \in \mathbb{Z}, \tag{6.2.8}$$

since f (−x, y) = −f (x, y) (by (6.2.7)). Now for a rational number r = m/n, m, n ∈ Z, n = 0, we have (by (6.2.8)) ! "

m m 1 x, y = mf x, y = f x, y , f n n n so f (rx, y) = rf (x, y)

∀x, y ∈ H, ∀r ∈ Q .

Since f is continuous on H × H, this extends to r ∈ R, i.e., f (rx, y) = rf (x, y)

∀x, y ∈ H, ∀r ∈ R .

(6.2.9)

Summarizing, we see that $f$ satisfies (6.2.2), (6.2.3), (6.2.7), and (6.2.9), so $f(\cdot,\cdot)$ is a scalar product and generates the given norm: $\|x\|^2 = f(x,x)$, $x \in H$.

Sufficiency in the complex case $\mathbb{K} = \mathbb{C}$ can be treated similarly, with $f : H \times H \to \mathbb{C}$ defined by

\[ f(x,y) = \frac{1}{4}\sum_{m=0}^{3} i^m \|x + i^m y\|^2, \quad x,y \in H, \]

where $i$ is the imaginary unit.

Remark 6.2. In fact, the scalar product generating a norm is unique. Indeed, if $(\cdot,\cdot)$ and $\langle\cdot,\cdot\rangle$ are two scalar products such that $(x,x) = \langle x,x\rangle = \|x\|^2$, $x \in H$, then we easily derive from

\[ (x+y,\, x+y) = \langle x+y,\, x+y\rangle \quad \forall x,y \in H \]

that

\[ \operatorname{Re}(x,y) = \operatorname{Re}\langle x,y\rangle \quad \forall x,y \in H, \tag{6.2.10} \]

and this completes the proof in the real case. If $\mathbb{K} = \mathbb{C}$, then by replacing $y$ by $iy$ in (6.2.10), we also get

\[ \operatorname{Im}(x,y) = \operatorname{Im}\langle x,y\rangle \quad \forall x,y \in H. \]

Remark 6.3. We have already noticed that $\mathbb{R}^k$ equipped with the usual Euclidean norm is a Hilbert space, but for $k \ge 2$ the space $\mathbb{R}^k$ is not a Hilbert space with respect to other norms, such as

\[ \|u\|_1 = \sum_{i=1}^{k} |u_i| \quad \text{or} \quad \|u\|_{\max} = \max_{1 \le i \le k} |u_i|, \quad u = (u_1,\dots,u_k) \in \mathbb{R}^k. \]

Indeed, one can easily find pairs of vectors that do not satisfy the parallelogram law expressed in terms of these norms. Similarly, $L^1(a,b)$, $-\infty \le a < b \le \infty$, equipped with its usual norm, is not a Hilbert space, as can be seen by finding a pair of functions $f, g \in L^1(a,b)$ that does not satisfy the parallelogram law (do it!).
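The failure of the parallelogram law mentioned in Remark 6.3 can be checked numerically. The following is a minimal sketch in Python; the helper names (`parallelogram_defect`, `polarization`) and the test vectors are illustrative choices, not taken from the text.

```python
import math

def norm_1(u):
    return sum(abs(t) for t in u)

def norm_2(u):
    return math.sqrt(sum(t * t for t in u))

def norm_max(u):
    return max(abs(t) for t in u)

def parallelogram_defect(norm, u, v):
    """Return ||u+v||^2 + ||u-v||^2 - 2(||u||^2 + ||v||^2).

    The parallelogram law holds for the pair (u, v) exactly when this is zero.
    """
    s = [a + b for a, b in zip(u, v)]
    d = [a - b for a, b in zip(u, v)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(u) ** 2 + norm(v) ** 2)

def polarization(norm, u, v):
    """Real polarization formula f(u, v) = (||u+v||^2 - ||u-v||^2) / 4."""
    s = [a + b for a, b in zip(u, v)]
    d = [a - b for a, b in zip(u, v)]
    return (norm(s) ** 2 - norm(d) ** 2) / 4

# Illustrative pair of vectors in R^2 (an arbitrary choice).
u, v = [1.0, 0.0], [0.0, 1.0]
defects = {
    "euclidean": parallelogram_defect(norm_2, u, v),
    "l1": parallelogram_defect(norm_1, u, v),
    "max": parallelogram_defect(norm_max, u, v),
}
print(defects)  # zero (up to rounding) only for the Euclidean norm

# For the Euclidean norm, f recovers the usual dot product.
p = polarization(norm_2, [1.0, 2.0], [3.0, -1.0])
print(p)
```

For the Euclidean norm the defect vanishes up to rounding, while for the other two norms this single pair already violates the law.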

6.3 Projections in Hilbert Spaces

A Hilbert space is similar in many respects to $k$-dimensional Euclidean space. That is why Hilbert spaces are more useful in applications than general Banach spaces.

Theorem 6.4. Let $H$ be a Hilbert space with scalar product $(\cdot,\cdot)$ and induced norm $\|\cdot\|$, and let $C$ be a nonempty, convex, closed subset of $H$. Then for all $x \in H$ there exists a unique $y \in C$ such that

\[ \|x - y\| = d(x,C) := \inf_{v \in C} \|x - v\|. \tag{6.3.11} \]

Proof. First we prove the existence of $y$. If $x \in C$ then $d(x,C) = 0$, so a good candidate is $y = x$. Assume $x \in H \setminus C$. Denote $\rho = d(x,C)$. By the definition of inf, for all $n \in \mathbb{N}$ there exists $y_n \in C$ such that

\[ \rho \le \|x - y_n\| < \rho + \frac{1}{n}, \]

which gives

\[ \lim_{n\to\infty} \|x - y_n\| = \rho. \tag{6.3.12} \]

We have $\rho > 0$. Indeed, if $\rho = 0$, then by (6.3.12) $y_n \to x$ and $C$ is closed, so $x \in C$, a contradiction. Apply the parallelogram law (see (6.2.1)) to $x - y_n$ and $x - y_m$ to get

\[ \|2x - (y_n + y_m)\|^2 + \|y_n - y_m\|^2 = 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) \tag{6.3.13} \]

for all $n, m$. Consider the first term of the left-hand side of (6.3.13) and factor out a 4:

\[ 4\left\|x - \tfrac{1}{2}(y_n + y_m)\right\|^2 \ge 4\rho^2. \tag{6.3.14} \]

Note that $\tfrac{1}{2}(y_n + y_m)$ is a convex combination of elements of $C$ and therefore is in $C$ by convexity. Hence (see (6.3.13) and (6.3.14)),

\[ \|y_n - y_m\|^2 \le 2\left(\|x - y_n\|^2 + \|x - y_m\|^2\right) - 4\rho^2. \tag{6.3.15} \]


Using (6.3.12) we get that $(y_n)$ is Cauchy because the right-hand side of (6.3.15) converges to 0 as $n, m \to \infty$. Therefore $(y_n)$ converges strongly to some $y$, and $y \in C$ because $C$ is closed. It follows from (6.3.12) that $\|x - y\| = \rho$.

We now prove uniqueness. Suppose $\|x - y\| = \rho = \|x - y'\|$ for some $y, y' \in C$. We use the parallelogram law for $x - y$, $x - y'$ to obtain

\[ \|2x - (y + y')\|^2 + \|y - y'\|^2 = 2\left(\|x - y\|^2 + \|x - y'\|^2\right), \]

which implies

\[ 4\left\|x - \tfrac{1}{2}(y + y')\right\|^2 + \|y - y'\|^2 = 4\rho^2. \tag{6.3.16} \]

Now $\tfrac{1}{2}(y + y') \in C$ since it is a convex combination, therefore $4\|x - \tfrac{1}{2}(y + y')\|^2 \ge 4\rho^2$, yielding (see (6.3.16))

\[ \|y - y'\|^2 \le 4\rho^2 - 4\rho^2 = 0, \]

and thus $y = y'$.

Remark 6.5. Both assumptions ($C$ closed and convex) are essential. For example, if $C$ is an open disc in $\mathbb{R}^2$, then there is no $y$ for $x \in \mathbb{R}^2 \setminus C$. On the other hand, if $C$ is not convex there may exist more (possibly infinitely many) $y$'s for the same $x$, as the reader can easily imagine.

Definition 6.6. Let $\emptyset \neq C \subset H$ be a closed and convex set. A point $y$ as above is called the projection of $x$ on $C$ and is denoted $y = P_C x$. Since a projection exists and is unique for any $x \in H$, we can define a projection operator $P_C : H \to C$, $x \mapsto y = P_C x$.

Theorem 6.7. Let $H$ be a Hilbert space and let $\emptyset \neq C \subset H$ be a closed and convex set. For $x \in H$, $y \in C$ the following are equivalent:

(a) $y = P_C x$;

(b) $\|x - y\| \le \|x - v\|$ for all $v \in C$;

(c) $\operatorname{Re}(x - y,\, y - v) \ge 0$ for all $v \in C$;

(d) $\operatorname{Re}(x - v,\, y - v) \ge 0$ for all $v \in C$.


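If $H$ is a real Hilbert space, then the "Re" from (c) and (d) can be removed.

Proof. (a) $\iff$ (b): Trivial.

(b) $\implies$ (c): We have $\|x - y\|^2 \le \|x - v\|^2$ for all $v \in C$. Let $v = (1-\lambda)y + \lambda w$ for $0 < \lambda < 1$ and $w \in C$. Since $v$ is a convex combination, $v$ is in $C$. We have

\[ \|x - y\|^2 \le \|x - y + \lambda(y - w)\|^2 \le \|x - y\|^2 + 2\lambda \operatorname{Re}(x - y,\, y - w) + \lambda^2 \|y - w\|^2, \]

so that

\[ 0 \le 2\operatorname{Re}(x - y,\, y - w) + \lambda \|y - w\|^2. \]

Let $\lambda \to 0^+$ to find $\operatorname{Re}(x - y,\, y - w) \ge 0$ for all $w \in C$.

(c) $\implies$ (b): Since $\operatorname{Re}(x - y,\, y - x + x - v) \ge 0$ we have

\[ \|x - y\|^2 \le \operatorname{Re}(x - y,\, x - v) \le |(x - y,\, x - v)| \le \|x - y\| \cdot \|x - v\| \quad \forall v \in C, \]

so if $\|x - y\| = 0$ then we are done; otherwise divide by it, and we get

\[ \|x - y\| \le \|x - v\| \quad \forall v \in C. \]

(c) $\implies$ (d): We have $\operatorname{Re}(x - v + v - y,\, y - v) \ge 0$ for all $v \in C$, so that

\[ \operatorname{Re}(x - v,\, y - v) \ge \|y - v\|^2 \ge 0 \quad \forall v \in C. \]

(d) $\implies$ (c): Replacing $v$ in (d) by $(1-\lambda)y + \lambda w$ for $\lambda \in (0,1)$, $w \in C$, we get

\[ \operatorname{Re}\big(x - y + \lambda(y - w),\, \lambda(y - w)\big) \ge 0, \]

and, as $\lambda$ is strictly positive, this implies

\[ \operatorname{Re}(x - y,\, y - w) + \lambda\|y - w\|^2 \ge 0. \]

Thus, letting $\lambda \to 0^+$ we obtain

\[ \operatorname{Re}(x - y,\, y - w) \ge 0 \quad \forall w \in C. \]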


Remark 6.8. The projection operator is Lipschitz.

Proof. Using condition (c) of Theorem 6.7 (first with $x = x_1$, $v = P_C x_2$, then with $x = x_2$, $v = P_C x_1$) we have

\[ \operatorname{Re}(x_1 - P_C x_1,\, P_C x_1 - P_C x_2) \ge 0, \qquad \operatorname{Re}(x_2 - P_C x_2,\, P_C x_2 - P_C x_1) \ge 0. \]

Add the two to obtain

\[ \operatorname{Re}(P_C x_1 - P_C x_2,\, x_1 - P_C x_1 - x_2 + P_C x_2) \ge 0, \]

which implies

\[ \operatorname{Re}(P_C x_1 - P_C x_2,\, x_1 - x_2) \ge \|P_C x_1 - P_C x_2\|^2. \]

By the Bunyakovsky–Cauchy–Schwarz inequality, this leads to

\[ \|P_C x_1 - P_C x_2\| \le \|x_1 - x_2\| \quad \forall x_1, x_2 \in H. \]

Thus $P_C$ is Lipschitz with constant $L = 1$. For this reason the operator $P_C$ is also called nonexpansive.

Remark 6.9. Let $C \subset H$ be a closed linear subspace and let $y = P_C x$ for some $x \in H$. By condition (c) of Theorem 6.7 we have $\operatorname{Re}(x - y,\, y - v) \ge 0$ for all $v \in C$, and in fact we can write it as $\operatorname{Re}(x - y,\, v) \ge 0$ for all $v \in C$, since $C$ is a linear subspace. Both $v, -v \in C$ because of linearity, and this gives the equality $\operatorname{Re}(x - y,\, v) = 0$ for all $v \in C$. We can also replace $v$ with $iv$, and so $\operatorname{Im}(x - y,\, v) = 0$, therefore

\[ (x - y,\, v) = 0 \quad \forall v \in C. \tag{6.3.17} \]

In general, when two vectors $w_1, w_2 \in H$ satisfy $(w_1, w_2) = 0$ they are said to be orthogonal, by analogy with orthogonality in Euclidean space, and we write $w_1 \perp w_2$. So (6.3.17) can be expressed as $(x - y) \perp C$. The reader is invited to imagine what the orthogonality relation (6.3.17) looks like in the Euclidean space $\mathbb{R}^3$ equipped with the usual scalar product and norm.
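Condition (c) of Theorem 6.7 and the nonexpansiveness from Remark 6.8 can be illustrated in $\mathbb{R}^3$. The sketch below assumes that $C$ is the closed unit ball, for which the projection has the explicit form $x/\|x\|$ outside the ball. The set, the sample sizes, and the helper names are illustrative choices, and random sampling only tests the inequalities on finitely many points.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def norm(u):
    return math.sqrt(dot(u, u))

def project_ball(x):
    """Projection onto the closed unit ball C = {v : ||v|| <= 1} (assumed example set)."""
    r = norm(x)
    return list(x) if r <= 1.0 else [t / r for t in x]

rng = random.Random(0)  # fixed seed so the run is reproducible

def random_vector(scale, dim=3):
    return [rng.uniform(-scale, scale) for _ in range(dim)]

x = [3.0, -4.0, 0.0]
y = project_ball(x)

# Condition (c) in the real case: (x - y, y - v) >= 0 for sampled v in C.
samples = [project_ball(random_vector(2.0)) for _ in range(1000)]
min_inner = min(dot(sub(x, y), sub(y, v)) for v in samples)

# Nonexpansiveness: ||P x1 - P x2|| - ||x1 - x2|| <= 0 for sampled pairs.
pairs = [(random_vector(3.0), random_vector(3.0)) for _ in range(1000)]
lipschitz_gap = max(
    norm(sub(project_ball(a), project_ball(b))) - norm(sub(a, b)) for a, b in pairs
)

print(y, min_inner, lipschitz_gap)
```

Both quantities respect the expected sign up to floating-point rounding: `min_inner` is nonnegative and `lipschitz_gap` is nonpositive.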

6.4 The Riesz Representation Theorem

Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space and let $M \subset H$ be a closed linear subspace. The orthogonal complement $M^\perp$ of $M$ is defined as

\[ M^\perp = \{u \in H;\ (u,v) = 0 \ \ \forall v \in M\} \]

and is a closed subset (subspace) because $(\cdot,\cdot) : H \times H \to \mathbb{K}$ is continuous.

Orthogonal Decomposition of $H$: We claim that any vector $u \in H$ can be written as $u = u_1 + u_2$ with $u_1 \in M$ and $u_2 \in M^\perp$, and this decomposition is unique. We write $H = M \oplus M^\perp$ and call it a direct sum.

Proof. Note that $u_1 = P_M u$ (which is unique) is the component in $M$, while $u_2 = u - u_1 = u - P_M u$ is in $M^\perp$ because $(u - P_M u,\, v) = 0$ for all $v \in M$ (see (6.3.17)). Let us now prove that this decomposition ($u = u_1 + u_2$) is unique. Suppose that $u = u_1 + u_2 = u_1' + u_2'$ with $u_1, u_1' \in M$ and $u_2, u_2' \in M^\perp$. Then

\[ 0 = (u_1 - u_1' + u_2 - u_2',\, u_1 - u_1') = \|u_1 - u_1'\|^2 + (u_2 - u_2',\, u_1 - u_1'), \]

where the second term is 0 because $u_1 - u_1' \in M$, $u_2 - u_2' \in M^\perp$. Thus $\|u_1 - u_1'\|^2 = 0$, so that $u_1 = u_1'$, which in turn implies $u_2 = u_2'$.

Theorem 6.10 (Riesz Representation Theorem). Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space. For all $f \in H^*$ (i.e., $f$ is a continuous linear functional from $H$ to $\mathbb{K}$) there exists a unique $v \in H$ such that

\[ f(u) = (u,v) \quad \forall u \in H \]

and $\|v\| = \|f\|$.

Proof. Step 1. We first show that such a $v$ is unique. Suppose that $(u,v) = (u,v')$ for all $u \in H$. Then $(u,\, v - v') = 0$ for all $u \in H$, and in particular $(v - v',\, v - v') = 0$, so $v = v'$.


Step 2. We now prove the existence of $v$. If $f = 0$ then clearly $v = 0$ works. If $f \neq 0$ consider the nullspace $N(f) = \{z \in H;\ f(z) = 0\}$. It is a closed linear subspace, so $H = N(f) \oplus N(f)^\perp$. In fact $N(f) \neq H$ because $f$ is not identically 0. Thus there exists $u_0 \in N(f)^\perp \setminus \{0\}$. Since $u_0 \notin N(f)$, we have $f(u_0) \neq 0$, so we may assume $f(u_0) = 1$ by scaling. Let $u \in H$ be arbitrary and define $w = u - f(u)u_0$. Now consider

\[ f(w) = f(u) - f(u)f(u_0) = f(u) - f(u) = 0, \]

showing that $w \in N(f)$. So $u = w + f(u)u_0$ with $w \in N(f)$, $f(u)u_0 \in N(f)^\perp$, and this decomposition is unique. Thus

\[ (u, u_0) = (w, u_0) + f(u)(u_0, u_0) = 0 + f(u)\|u_0\|^2, \]

and solving for $f(u)$,

\[ f(u) = \left(u,\ \frac{1}{\|u_0\|^2}\, u_0\right), \]

so $f(u)$ is of the given form with $v = \|u_0\|^{-2} u_0$, and $v$ is unique by the previous step.

Step 3. We finally prove that $\|v\| = \|f\|$. For $f = 0$ this is obvious, so assume that $f \neq 0$, which implies $v \neq 0$. By Bunyakovsky–Cauchy–Schwarz,

\[ |f(u)| \le \|v\| \cdot \|u\|, \]

and by considering those $u$ with $\|u\| \le 1$, we get

\[ \|f\| \le \|v\|. \tag{6.4.18} \]

Now $f(v) = (v,v) = \|v\|^2$, so that

\[ f\left(\frac{1}{\|v\|}\, v\right) = \|v\|, \]

which combined with (6.4.18) shows that $\|f\| = \|v\|$.
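A finite-dimensional instance may help to see the construction in the proof at work. The sketch below assumes $H = \mathbb{R}^3$ with a weighted scalar product $(u,v)_w = \sum_i w_i u_i v_i$; the weights and the functional are hypothetical values chosen only for illustration. It computes the representer $v$ both directly and through the vector $u_0$ of Step 2.

```python
import math

# Hypothetical weights defining the scalar product (u, v)_w = sum_i w_i u_i v_i on R^3.
WEIGHTS = [1.0, 2.0, 4.0]
# Hypothetical coefficients of the linear functional f(u) = sum_i c_i u_i.
COEFFS = [2.0, -2.0, 4.0]

def inner(u, v):
    return sum(w * a * b for w, a, b in zip(WEIGHTS, u, v))

def norm(u):
    return math.sqrt(inner(u, u))

def f(u):
    return sum(c * a for c, a in zip(COEFFS, u))

# Direct representer: f(u) = (u, v)_w for all u forces v_i = c_i / w_i.
v = [c / w for c, w in zip(COEFFS, WEIGHTS)]

# Step 2 of the proof: pick u0 orthogonal to N(f) with f(u0) = 1, then v = u0 / ||u0||^2.
u0 = [t / f(v) for t in v]
v_from_u0 = [t / norm(u0) ** 2 for t in u0]

u = [1.0, 1.0, 1.0]
print(f(u), inner(u, v))        # the two values agree
print(f(u0), v, v_from_u0)      # f(u0) is 1 and both constructions give the same v
print(norm(v), f(v) / norm(v))  # ||v|| is attained by f at the unit vector v / ||v||
```

The last line mirrors Step 3: the bound $|f(u)| \le \|v\| \cdot \|u\|$ is attained at $u = v/\|v\|$.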

Remark 6.11. Recall that in Sect. 4.4 of Chap. 4 we asked whether functionals $f$ from the dual of $(C[a,b], \|\cdot\|_{L^2(a,b)})$, $-\infty < a < b < \infty$, can be expressed as $f(u) = (u,v)_{L^2(a,b)}$, $u \in C[a,b]$, with $v \in C[a,b]$. The answer is, in general, no. First of all, any $f \in (C[a,b], \|\cdot\|_{L^2(a,b)})^*$ can be extended by continuity to $(L^2(a,b), \|\cdot\|_{L^2(a,b)})$, which is a Hilbert space. By the Riesz Representation Theorem, for each such $f$ (extended to $L^2(a,b)$) there exists a unique $v \in L^2(a,b)$ such that $f(u) = (u,v)_{L^2(a,b)}$ for all $u \in L^2(a,b)$, but this $v$ is not necessarily an element of $C[a,b]$ (i.e., $v$ has no representative in $C[a,b]$). In fact, we can consider $f(u) = (u,v)_{L^2(a,b)}$, $u \in L^2(a,b)$, with $v \in L^2(a,b) \setminus C[a,b]$; this $f$ is continuous on $(C[a,b], \|\cdot\|_{L^2(a,b)})$ and its representation as a scalar product, $f(u) = (u,v)_{L^2(a,b)}$, is unique (i.e., $v$ is unique); but this $v$ is not an element of $C[a,b]$, so the answer to the above question is negative.

Remark 6.12. In the proof of Theorem 6.10 we saw that for all $u \in H$ and $0 \neq f \in H^*$ we have the decomposition $u = w + f(u)u_0$ with $w \in N(f)$, $u_0 \in N(f)^\perp$, $f(u_0) = 1$, so that $\dim N(f)^\perp = 1$. Another way to say this is that the codimension of $N(f)$ is 1. For such a functional $f$ and for some $a \in \mathbb{K}$ we have an affine subspace of $H$,

\[ Y := \{u \in H;\ f(u) = a\} = a u_0 + N(f), \]

whose codimension is 1 (i.e., the codimension of $N(f)$ is 1), thus $Y$ is a usual hyperplane if $H$ is the Euclidean space. Conversely, given a closed affine subspace $Y$ of $H$ of codimension 1, i.e., $Y = u_1 + Z$ for some $u_1 \in H$ and $Z \subset H$ a closed linear subspace with codimension 1, there exists $u_0 \in H \setminus \{0\}$ which is orthogonal to $Z$, i.e., $(u, u_0) = 0$, $u \in Z$. Define $f : H \to \mathbb{K}$,

\[ f(u) = (u, u_0) \quad \forall u \in H, \]

so that $f \in H^*$, $N(f) = Z$, $f \neq 0$ (since $f(u_0) = \|u_0\|^2 \neq 0$) and $Y$ can be expressed by means of this $f$ as follows:

\[ Y = u_1 + N(f) = \{u \in H;\ f(u) = f(u_1)\}. \]

A simple example is $H = L^2(0,1)$, $Z = \{u \in H;\ \int_0^1 u(t)\,dt = 0\}$. Clearly, $Z$ is a closed linear subspace of $H$ with $\operatorname{codim} Z = 1$. Indeed, any $v \in H$ can be uniquely decomposed into

\[ v(t) = \int_0^1 v(s)\,ds + \underbrace{v(t) - \int_0^1 v(s)\,ds}_{u(t)} = C + u(t) \quad \text{for a.a. } t \in (0,1), \]

where $u \in Z$ and $C$ is a constant, i.e., $H = \operatorname{Span}\{1\} \oplus Z$. We can choose $u_0$ to be the constant function 1, so $f(u) = \int_0^1 u(t)\,dt$.

The Weak Topology of H

Taking into account the Riesz Representation Theorem, we see that the weak topology of $H$ is generated by the neighborhood system

\[ V_{v_1,v_2,\dots,v_p;\varepsilon} = \{x \in H;\ |(x, v_j)| < \varepsilon,\ j = 1,\dots,p\}, \quad \varepsilon > 0,\ v_1,\dots,v_p \in H,\ p \in \mathbb{N}. \]

So the fact that a sequence $(x_n)$ in $H$ converges weakly to some $x$ means $(x_n, v) \to (x, v)$ for all $v \in H$. If $\dim H = \infty$ then we can use the Gram–Schmidt method (see Chap. 1) to construct an infinite orthonormal sequence $(x_1, x_2, \dots, x_n, \dots)$. This sequence converges weakly to 0. Indeed, for $v \in H$ arbitrary, we have

\[ \begin{aligned} \left\| \sum_{n=1}^{N} (v, x_n)x_n - v \right\|^2 &= \sum_{n=1}^{N} |(x_n, v)|^2 - 2\sum_{n=1}^{N} |(x_n, v)|^2 + \|v\|^2 \\ &= \|v\|^2 - \sum_{n=1}^{N} |(x_n, v)|^2 \\ &\ge 0, \end{aligned} \]

so that

\[ \sum_{n=1}^{N} |(x_n, v)|^2 \le \|v\|^2 \quad \forall N \in \mathbb{N}, \]

which is known as Bessel's inequality.³ So the series $\sum_{n=1}^{\infty} |(x_n, v)|^2$ is convergent and consequently

\[ (x_n, v) \to 0 \quad \forall v \in H, \]

i.e., $(x_n)$ converges weakly to 0. But $(x_n)$ is not strongly convergent (to 0) since $\|x_n\| = 1$ for all $n \in \mathbb{N}$. Therefore, weak convergence in any infinite dimensional Hilbert space is different from strong convergence.

³ Friedrich Wilhelm Bessel, German astronomer, mathematician, physicist and geodesist, 1784–1846.

Based on the Riesz Representation Theorem, we can define the so-called Riesz operator $R : H \to H^*$ by $v \mapsto (\cdot, v)$, so that $(Rv)(u) = (u,v)$ for all $u, v \in H$ and $\|Rv\| = \|v\|$. As seen before, $R$ is also bijective.

Theorem 6.13. Every Hilbert space is reflexive.

Proof. Let $\varphi : H \to H^{**}$, $v \mapsto f_v \in H^{**}$, be such that $f_v(x^*) = x^*(v)$ for all $x^* \in H^*$. As we have already seen, $\varphi$ is injective. For the convenience of the reader, let us prove this again in the present context. If $f_v = 0$, then $x^*(v) = 0$ for all $x^* \in H^*$, which implies, by the Riesz Representation Theorem, that $(v, w) = 0$ for all $w \in H$, so that $v = 0$. Thus $\varphi$ is injective. We now prove that $\varphi$ is surjective. Let $x^{**} \in H^{**}$ and define $u^* \in H^*$ by $u^*(v) := \overline{x^{**}(Rv)}$ for all $v \in H$. Denote $u = R^{-1}u^*$ and calculate

\[ x^{**}(x^*) = x^{**}\big(R(R^{-1}x^*)\big) = \overline{u^*(R^{-1}x^*)} = \overline{(R^{-1}x^*,\, u)} = (u,\, R^{-1}x^*) = x^*(u) = f_u(x^*), \]

so that all functionals $x^{**}$ are of the form $f_u$, and $\varphi$ is onto, i.e., for all $x^{**} \in H^{**}$ there exists $u \in H$ such that $x^{**} = f_u$. (In the real case the complex conjugations can be dropped.)

Remark 6.14. The above proof is a direct one. In fact, Theorem 6.13 follows from the Milman–Pettis⁴ general result we state without proof: every uniformly convex Banach space is reflexive. Recall that a normed space $(H, \|\cdot\|)$ is said to be uniformly convex if for all $\varepsilon \in (0,2)$ there exists $\delta > 0$ such that for all $x, y \in H$ with $\|x\| \le 1$, $\|y\| \le 1$, $\|x - y\| > \varepsilon$ we have $\|\tfrac{1}{2}(x + y)\| < 1 - \delta$. If $H$ is a Hilbert space, it follows easily by using the parallelogram law that $H$ is uniformly convex, hence reflexive (by Milman–Pettis).

⁴ David P. Milman, Soviet and later Israeli mathematician, 1912–1982.
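The weak convergence to 0 of an orthonormal sequence, discussed above in connection with Bessel's inequality, can be observed numerically. The sketch below assumes the concrete setting $H = L^2(0,1)$ (real case) with $x_n(t) = \sqrt{2}\sin(n\pi t)$ and $v(t) = t$; these choices, the midpoint quadrature, and the helper names are illustrative assumptions, and the integrals are only approximated.

```python
import math

M = 2000  # number of midpoint quadrature nodes on (0, 1), an arbitrary choice
NODES = [(j + 0.5) / M for j in range(M)]

def inner(f, g):
    """Midpoint-rule approximation of the real L^2(0, 1) scalar product."""
    return sum(f(t) * g(t) for t in NODES) / M

def x(n):
    """Assumed orthonormal sequence x_n(t) = sqrt(2) sin(n pi t)."""
    return lambda t: math.sqrt(2.0) * math.sin(n * math.pi * t)

v = lambda t: t

# Coefficients (x_n, v) for n = 1, ..., 20: they decay to 0, although ||x_n|| = 1.
coeffs = [inner(x(n), v) for n in range(1, 21)]
bessel_sum = sum(c * c for c in coeffs)
norm_v_sq = inner(v, v)
norm_x5 = math.sqrt(inner(x(5), x(5)))

print(coeffs[0], coeffs[-1])   # first and last coefficient
print(bessel_sum, norm_v_sq)   # Bessel: the partial sum stays below ||v||^2
print(norm_x5)                 # the norms do not tend to 0
```

The coefficients shrink while each $x_n$ keeps norm 1, which is the finite-sample picture of weak but not strong convergence.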

6.5 Lax–Milgram Theorem

We begin this section with a preparatory lemma whose proof is based on the Banach Contraction Principle.

Lemma 6.15. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a real Hilbert space and let $A : H \to H$ be a not necessarily linear operator satisfying

(a) $(Au - Av,\, u - v) \ge c\|u - v\|^2$ for all $u, v \in H$ (strong monotonicity);

(b) $\|Au - Av\| \le L\|u - v\|$ for all $u, v \in H$ (Lipschitz condition),

where $c$ and $L$ are given positive constants. Then for all $w \in H$ there exists a unique $u^* \in H$ such that $Au^* = w$, i.e., $A$ is a bijection.

Proof. We first prove uniqueness: Suppose $u_1, u_2 \in H$ are such that $Au_1 = w = Au_2$. Then by (a),

\[ 0 = (Au_1 - Au_2,\, u_1 - u_2) \ge c\|u_1 - u_2\|^2, \]

which implies $u_1 = u_2$.

We now prove existence: First we note that $c \le L$ by using (a) and (b) together with Bunyakovsky–Cauchy–Schwarz. For a fixed $w \in H$, define $B : H \to H$ by

\[ Bu = u - t(Au - w), \quad t > 0,\ u \in H. \]

Note that if there is a fixed point of $B$ then it is $u^*$ as desired. We wish to apply the Banach Contraction Principle in $(H, d)$, where $d(u,v) = \|u - v\|$. We have for all $u, v \in H$

\[ \begin{aligned} d(Bu, Bv)^2 &= \|Bu - Bv\|^2 \\ &= \|u - v\|^2 - 2t(u - v,\, Au - Av) + t^2\|Au - Av\|^2 \\ &\le \|u - v\|^2 - 2tc\|u - v\|^2 + t^2L^2\|u - v\|^2 \quad \text{(from (a) and (b))} \\ &= (1 - 2tc + t^2L^2)\,\|u - v\|^2 \\ &= m\|u - v\|^2 = m\,d(u,v)^2, \end{aligned} \]

where we call $m := 1 - 2tc + t^2L^2$.


Obviously, $m \ge 0$. We choose $t$ to minimize $m = m(t)$ and find that $t = c/L^2$. Thus the minimum value of $m$ is

\[ m = 1 - 2\frac{c^2}{L^2} + \frac{c^2}{L^2} = 1 - \frac{c^2}{L^2} \ge 0, \]

since $c \le L$. If $c = L$, then $m = 0$, so $B$ is constant, i.e., $Bu = w_0$, so that $w_0 = u - (c/L^2)(Au - w)$. In this case $A$ is affine, namely

\[ Au = \frac{L^2}{c}(u - w_0) + w, \]

so that $u^* = w_0$. When $c < L$ then $0 < m < 1$, so that $B$ is a contraction and hence by the Banach Contraction Principle (see Sect. 2.5) $B$ has a unique fixed point $u^*$.

Theorem 6.16 (Nonlinear Lax–Milgram Theorem).⁵ Let $H$ be a real Hilbert space and consider two functionals $a : H \times H \to \mathbb{R}$ and $b : H \to \mathbb{R}$ satisfying

1. For all $u \in H$ the map $v \mapsto a(u,v)$ is linear and continuous on $H$ (i.e., it belongs to $H^*$);

2. $a(u,\, u - v) - a(v,\, u - v) \ge c\|u - v\|^2$ for all $u, v \in H$ and some $c > 0$;

3. $|a(u,w) - a(v,w)| \le L\|u - v\| \cdot \|w\|$ for all $u, v, w \in H$ and some $L > 0$;

4. $b$ is a continuous linear functional (i.e., $b \in H^*$).

Then there exists a unique $u \in H$ such that

\[ a(u,v) = b(v) \quad \forall v \in H. \tag{6.5.19} \]

Proof. By the first assumption and the Riesz Representation Theorem 6.10, for all $u \in H$ there exists a unique $z \in H$ such that $a(u,v) = (v,z)$ for all $v \in H$. So there exists an operator $A : H \to H$ defined by $Au := z$. We now rewrite the second condition:

\[ a(u,\, u - v) - a(v,\, u - v) = (u - v,\, Au) - (u - v,\, Av) = (u - v,\, Au - Av), \]

and since $\mathbb{K} = \mathbb{R}$,

\[ (u - v,\, Au - Av) = (Au - Av,\, u - v) \ge c\|u - v\|^2 \]

for all $u, v \in H$, so $A$ satisfies condition (a) of the previous lemma. From the third assumption we have for all $u, v, z \in H$

\[ |a(u,z) - a(v,z)| = |(z, Au) - (z, Av)| = |(z,\, Au - Av)| \le L\|u - v\| \cdot \|z\|. \]

Choosing $z = Au - Av$ we see that the operator $A$ also satisfies condition (b) of Lemma 6.15. On the other hand, by the fourth assumption and the Riesz Representation Theorem there exists a unique $w$ such that $b(v) = (v,w)$ for all $v \in H$. Now (6.5.19) can be written as

\[ (v, Au) = (v, w) \quad \forall v \in H \iff Au = w, \]

so the conclusion of the theorem follows by Lemma 6.15.

⁵ Peter D. Lax, Hungarian-born American mathematician, born 1926; Arthur N. Milgram, American mathematician, 1912–1961.

Theorem 6.17 (Classic Lax–Milgram Theorem). Let $H$ be a real Hilbert space and consider two functionals $a : H \times H \to \mathbb{R}$ and $b : H \to \mathbb{R}$ satisfying

1. $a$ is bilinear;

2. $a$ is bounded (continuous) on $H \times H$, namely $|a(u,v)| \le L\|u\| \cdot \|v\|$ for all $u, v \in H$ for some $L > 0$;

3. $a$ is strongly positive (or coercive), i.e., there exists $c > 0$ such that $a(v,v) \ge c\|v\|^2$ for all $v \in H$;

4. $b$ is linear and continuous (i.e., $b \in H^*$).

Then there exists a unique $u \in H$ satisfying

\[ a(u,v) = b(v) \quad \forall v \in H. \tag{6.5.19$'$} \]

If, in addition, $a$ is symmetric (i.e., $a(u,v) = a(v,u)$ for all $u, v \in H$) then $u$ is a solution of (6.5.19$'$) if and only if it is a solution (minimizer) of the quadratic minimization problem

\[ \min_{v \in H} \left\{ \frac{1}{2}a(v,v) - b(v) \right\}. \tag{6.5.20} \]


Proof. Observe that the conditions of Theorem 6.16 are satisfied, so all that remains is to prove the final statement. Define

\[ F(v) = \frac{1}{2}a(v,v) - b(v), \quad v \in H. \]

If $u$ is a solution of (6.5.20) then $F(u) \le F(v)$ for all $v \in H$. Define $\varphi(t) = F(u + tv)$ for $t \in \mathbb{R}$, $v \in H$. We have

\[ \begin{aligned} \varphi(t) &= \frac{1}{2}a(u + tv,\, u + tv) - b(u + tv) \\ &= \frac{1}{2}a(u,u) + t\,a(u,v) + \frac{1}{2}t^2 a(v,v) - b(u) - t\,b(v) \\ &= F(u) + t\big(a(u,v) - b(v)\big) + \frac{1}{2}t^2 a(v,v). \end{aligned} \]

Therefore,

\[ \varphi'(t) = a(u,v) - b(v) + t\,a(v,v), \]

hence

\[ a(u,v) - b(v) = \varphi'(0) = 0, \]

since $t = 0$ is a minimizer of $\varphi$, so that $u$ satisfies (6.5.19$'$) because $v$ is arbitrary.

Conversely, suppose that $u$ satisfies (6.5.19$'$). We must show $F(u) \le F(v)$ for all $v \in H$. It is enough to prove that $F(u + v) - F(u)$ is nonnegative:

\[ \begin{aligned} F(u + v) - F(u) &= \frac{1}{2}a(u + v,\, u + v) - b(u + v) - \frac{1}{2}a(u,u) + b(u) \\ &= \frac{1}{2}a(u,u) + a(u,v) + \frac{1}{2}a(v,v) - b(v) - \frac{1}{2}a(u,u) \quad \text{(since $a$ is symmetric)} \\ &= \underbrace{a(u,v) - b(v)}_{=0} + \frac{1}{2}a(v,v) \\ &= \frac{1}{2}a(v,v) \ge 0. \end{aligned} \]

So $F(u) \le F(u + v)$ for all $v \in H$, which implies $u$ is a solution to (6.5.20).
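The proof of Lemma 6.15 is constructive: iterating $B u = u - t(Au - w)$ with $t = c/L^2$ converges to the solution of $Au = w$. The following sketch runs this iteration in $H = \mathbb{R}^2$ for a hypothetical nonlinear operator $A$ chosen for illustration. For this particular $A$ one can check by hand that $c = 2$ and $L = \sqrt{5} + 1$ are admissible constants; the operator, the right-hand side $w$, and the iteration count are assumptions, not taken from the text.

```python
import math

def A(u):
    """Hypothetical strongly monotone, Lipschitz operator on R^2.

    A(u) = (2*u1 + atan(u1) - u2, 2*u2 + atan(u2) + u1).
    The skew coupling does not contribute to (Au - Av, u - v), and atan is
    nondecreasing and 1-Lipschitz, so c = 2 and L = sqrt(5) + 1 are admissible.
    """
    u1, u2 = u
    return (2.0 * u1 + math.atan(u1) - u2, 2.0 * u2 + math.atan(u2) + u1)

c = 2.0
L = math.sqrt(5.0) + 1.0
t = c / L ** 2                 # step that minimizes m(t) = 1 - 2tc + t^2 L^2
m = 1.0 - (c / L) ** 2         # squared contraction factor of B

w = (1.0, -2.0)                # arbitrary right-hand side
u = (0.0, 0.0)                 # arbitrary starting point
for _ in range(200):
    Au = A(u)
    # One application of B: u <- u - t (Au - w)
    u = (u[0] - t * (Au[0] - w[0]), u[1] - t * (Au[1] - w[1]))

Au = A(u)
residual = math.hypot(Au[0] - w[0], Au[1] - w[1])
print(u, residual, m)
```

Since $d(Bu, Bv)^2 \le m\, d(u,v)^2$, the error shrinks by at least the factor $\sqrt{m}$ per step, so the residual is negligible after a few hundred iterations.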

Next we illustrate the above results with some applications.


Dirichlet's Principle⁶: Let $\emptyset \neq \Omega \subset \mathbb{R}^k$ be a bounded domain. For all $f \in L^2(\Omega)$ there exists a unique $u \in H_0^1(\Omega)$ which is a solution to the following minimization problem:

\[ \min_{v \in H_0^1(\Omega)} \left\{ \frac{1}{2}\int_\Omega \nabla v \cdot \nabla v \, dx - \int_\Omega f v \, dx \right\}, \tag{6.5.21} \]

and equivalently $u$ is a solution to

\[ u \in H_0^1(\Omega), \quad \int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx \quad \forall v \in H_0^1(\Omega). \tag{6.5.22} \]

Remark 6.18. In the sense of distributions we can rewrite (6.5.22) as

\[ u \in H_0^1(\Omega), \quad -\Delta u = f \ \text{ in } \Omega, \tag{6.5.23} \]

which is known as the Euler–Lagrange equation⁷ associated with the minimization problem (6.5.21) (being a Poisson equation in this example), and $u$ being 0 on the boundary is interpreted as meaning that the trace of $u$ on the boundary $\partial\Omega$ is 0. Indeed, for every test function $\varphi \in C_0^\infty(\Omega)$ we have

\[ \int_\Omega \nabla u \cdot \nabla \varphi \, dx = \int_\Omega f \varphi \, dx \iff (-\Delta u, \varphi) = (f, \varphi), \]

i.e., $-\Delta u = f$ in $\mathcal{D}'(\Omega)$. Since $f$ is in $L^2(\Omega)$, $-\Delta u$ is as well, so $u$ satisfies the equation $-\Delta u = f$ for a.a. $x \in \Omega$. In fact, if $\partial\Omega$ is smooth enough, then $u \in H_0^1(\Omega) \cap H^2(\Omega)$ (see [39, Theorem 3.1, p. 212]). Moreover, if $f \in C^\infty(\Omega)$ then so is $u$. Actually, the following regularity result holds.

Lemma 6.19 (Weyl). If $\emptyset \neq \Omega \subset \mathbb{R}^k$ is open, $f \in C^\infty(\Omega)$, and $u \in \mathcal{D}'(\Omega)$ satisfies the equation $-\Delta u = f$ in the sense of distributions, then $u \in C^\infty(\Omega)$.

⁶ Johann Peter Gustav Lejeune Dirichlet, German mathematician, 1805–1859.

⁷ Joseph-Louis Lagrange, Italian mathematician and astronomer, 1736–1813.

Proof of Dirichlet's Principle. We wish to use the classical Lax–Milgram Theorem 6.17. Denote $H := H_0^1(\Omega)$. Recall that $H$ is a real Hilbert space as a closed subspace of $H^1(\Omega)$. According to Remark 5.26, $H$ can be equipped with the norm

\[ \|u\|_* = \left( \int_\Omega |\nabla u|^2 \, dx \right)^{1/2}, \quad u \in H = H_0^1(\Omega), \]

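which is equivalent to the usual $H^1(\Omega)$ norm. Define $a : H \times H \to \mathbb{R}$ and $b : H \to \mathbb{R}$ by

\[ a(u,v) := \int_\Omega \nabla u \cdot \nabla v \, dx, \qquad b(v) := \int_\Omega f v \, dx. \]

Clearly $a$ is bilinear and symmetric. Moreover, $a$ is also continuous (bounded),

\[ |a(u,v)| \le \|u\|_* \cdot \|v\|_* \quad \forall u, v \in H, \]

and coercive,

\[ a(v,v) = \int_\Omega \nabla v \cdot \nabla v \, dx = \|v\|_*^2 \quad \forall v \in H. \]

Obviously, $b$ is linear and also continuous because

\[ |b(v)| \le \|f\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} \le C\|f\|_{L^2(\Omega)} \|v\|_* \quad \forall v \in H, \]

where the second inequality follows from Poincaré's inequality. Thus all the conditions of Theorem 6.17 are fulfilled, so the proof of Dirichlet's Principle is complete.

Now let us consider the following nonlinear boundary value problem:

\[ \begin{cases} -\Delta u(x) + \beta(u(x)) = f(x), & x \in \Omega, \\ u = 0, & x \in \partial\Omega, \end{cases} \tag{6.5.24} \]

where $\emptyset \neq \Omega \subset \mathbb{R}^k$ is a bounded domain, $f \in L^2(\Omega)$, and $\beta : \mathbb{R} \to \mathbb{R}$ is a nonlinear Lipschitz continuous, nondecreasing function. We wish to prove that problem (6.5.24) has a unique solution $u \in H_0^1(\Omega)$. To this purpose we can apply Theorem 6.16 with $H = H_0^1(\Omega)$ equipped with the norm $\|\cdot\|_*$ as above, and with $a : H \times H \to \mathbb{R}$, $b : H \to \mathbb{R}$ defined by

\[ a(u,v) = \int_\Omega \nabla u \cdot \nabla v \, dx + \int_\Omega \beta(u) v \, dx, \qquad b(v) = \int_\Omega f v \, dx. \]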


It is a simple exercise to show that all the assumptions of Theorem 6.16 are fulfilled, so there is a unique $u \in H = H_0^1(\Omega)$ satisfying

\[ a(u,v) = b(v) \quad \forall v \in H, \]

i.e., $-\Delta u + \beta \circ u = f$ in $\mathcal{D}'(\Omega)$. Note that $\beta \circ u \in L^2(\Omega)$, so $-\Delta u = f - \beta(u)$ is in $L^2(\Omega)$ as well, i.e., $u$ satisfies the given equation for a.a. $x \in \Omega$. In fact, if $\partial\Omega$ is smooth enough, then $u \in H^2(\Omega)$ (cf. [39, Theorem 3.1, p. 212]).
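Dirichlet's Principle has a transparent discrete counterpart in one space dimension. The sketch below assumes $\Omega = (0,1)$ and a standard three-point finite-difference discretization (neither is prescribed by the text), solves the discrete analogue of (6.5.22) with a tridiagonal elimination, and compares a discrete version of the energy in (6.5.21) at the computed solution and at a perturbed grid function. The right-hand side $f(x) = \pi^2 \sin(\pi x)$ is chosen so that the exact solution $\sin(\pi x)$ is available for comparison.

```python
import math

def solve_dirichlet_1d(f, n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, on n interior grid points.

    Uses the three-point stencil (-u[i-1] + 2u[i] - u[i+1]) / h^2 = f(x_i)
    and the Thomas algorithm for the resulting tridiagonal system.
    """
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    rhs = [h * h * f(x) for x in xs]
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = -0.5
    dp[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + cp[i - 1]
        cp[i] = -1.0 / denom
        dp[i] = (rhs[i] + dp[i - 1]) / denom
    u = [0.0] * n
    u[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        u[i] = dp[i] - cp[i] * u[i + 1]
    return xs, u, h

def energy(v, f_vals, h):
    """Discrete analogue of (1/2) * int |v'|^2 dx - int f v dx with zero boundary values."""
    padded = [0.0] + list(v) + [0.0]
    dirichlet = sum((padded[i + 1] - padded[i]) ** 2 for i in range(len(padded) - 1)) / h
    return 0.5 * dirichlet - h * sum(fi * vi for fi, vi in zip(f_vals, v))

f = lambda x: math.pi ** 2 * math.sin(math.pi * x)
xs, u, h = solve_dirichlet_1d(f, 99)

max_error = max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs))
f_vals = [f(x) for x in xs]
perturbed = [ui + 0.01 * math.sin(2.0 * math.pi * x) for ui, x in zip(u, xs)]
energy_gap = energy(perturbed, f_vals, h) - energy(u, f_vals, h)

print(max_error, energy_gap)
```

The linear system is exactly the stationarity condition of the discrete energy, so the energy gap equals one half of the discrete Dirichlet form of the perturbation and is positive, in line with the last part of Theorem 6.17.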

6.6 Fourier Series Expansions

Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space with $m := \dim H \ge 1$. If $m < \infty$ then starting from a basis of $H$, say $B = \{e_1, \dots, e_m\}$, one can construct by the Gram–Schmidt procedure (see Chap. 1) an orthonormal basis $B' = \{u_1, \dots, u_m\}$, i.e., $(u_i, u_j) = \delta_{ij}$, $i, j = 1, \dots, m$. So every $u \in H$ can be written as

\[ u = \sum_{i=1}^{m} c_i u_i, \quad c_i \in \mathbb{K},\ i = 1, \dots, m. \]

This yields $c_i = (u, u_i)$, $i = 1, \dots, m$, hence

\[ u = \sum_{i=1}^{m} (u, u_i) u_i \quad \forall u \in H. \tag{6.6.25} \]

(6.6.25) is called the Fourier expansion of $u$ and $(u, u_i)$ are called Fourier coefficients.⁸ In what follows we are interested in Fourier series expansions in the case $m = \infty$.

A set $S \subset H$ is said to be an orthonormal set if for any pair $u, v \in S$, $u \neq v$, we have

\[ \|u\| = \|v\| = 1 \quad \text{and} \quad (u,v) = 0. \]

An orthonormal set $S \subset H$ is called a complete orthonormal system in $H$ if it is not properly included in any other orthonormal set in $H$.

⁸ Jean-Baptiste Joseph Fourier, French mathematician and physicist, 1768–1830.
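The finite-dimensional expansion (6.6.25) is easy to reproduce numerically. The following sketch assumes $H = \mathbb{R}^3$ with the usual dot product and uses a modified Gram–Schmidt loop; the starting basis, the vector $u$, and the function names are illustrative choices.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a list of vectors; (numerically) dependent vectors are skipped."""
    ortho = []
    for e in vectors:
        r = list(e)
        for q in ortho:
            coeff = dot(r, q)  # component of the running remainder along q
            r = [a - coeff * b for a, b in zip(r, q)]
        length = math.sqrt(dot(r, r))
        if length > tol:
            ortho.append([a / length for a in r])
    return ortho

# An arbitrary (non-orthogonal) basis of R^3.
basis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ortho = gram_schmidt(basis)

u = [2.0, -1.0, 3.0]
coefficients = [dot(u, q) for q in ortho]  # Fourier coefficients (u, u_i)
reconstructed = [
    sum(c * q[i] for c, q in zip(coefficients, ortho)) for i in range(3)
]

print(coefficients)
print(reconstructed)  # agrees with u up to rounding
```

The reconstruction agrees with $u$ up to rounding, and the sum of the squared coefficients equals $\|u\|^2$, the finite-dimensional form of Parseval's relation.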


Remark 6.20. Claim: any Hilbert space $H \neq \{0\}$ has a complete orthonormal system, and any orthonormal set can be extended to a complete orthonormal system. Indeed, choosing $x \in H \setminus \{0\}$ and denoting $u_1 = (1/\|x\|)x$ we see that $\{u_1\}$ is an orthonormal system in $H$. Consider the collection of all orthonormal systems in $H$ which contain $\{u_1\}$. This collection is partially ordered with respect to the usual inclusion relation. By Zorn's Lemma there exists a maximal element for the collection, which is a complete orthonormal system in $H$. If $m = \infty$ then this system is infinite, be it countable or not (this issue will be clarified later).

Theorem 6.21. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be an infinite dimensional Hilbert space and let $S = \{u_n\}_{n \in \mathbb{N}} \subset H$ be a countably infinite orthonormal system. Then the following are equivalent:

(a) $S$ is complete;

(b) $u = \sum_{n=1}^{\infty} (u, u_n) u_n$ for all $u \in H$;

(c) $\sum_{n=1}^{\infty} |(u, u_n)|^2 = \|u\|^2$ for all $u \in H$ (Parseval's relation)⁹;

(d) $\operatorname{Span} S$ is dense in $H$.

Proof. First of all, using the orthogonality of the system $S$, we have for all $u \in H$ and $N \in \mathbb{N}$

\[ 0 \le \left\| \sum_{n=1}^{N} (u, u_n) u_n - u \right\|^2 = \|u\|^2 - \sum_{n=1}^{N} |(u, u_n)|^2. \tag{6.6.26} \]

We deduce from (6.6.26) that (b) $\iff$ (c). Let us prove that (b) $\implies$ (a). Assume by contradiction that (b) holds, but $S$ is not complete, i.e., there exists a vector $\hat{u} \in H \setminus S$ such that $\|\hat{u}\| = 1$ and $(\hat{u}, u_n) = 0$ for all $n \in \mathbb{N}$. From (b) with $u = \hat{u}$ it then follows that $\hat{u} = 0$, which is a contradiction.

Now, we prove that (a) $\implies$ (b). Fix $u \in H$. By a standard computation we get

\[ \left\| \sum_{n=m}^{m+p} (u, u_n) u_n \right\|^2 = \sum_{n=m}^{m+p} |(u, u_n)|^2. \tag{6.6.27} \]

Since the numerical series $\sum_{n=1}^{\infty} |(u, u_n)|^2$ is convergent (see (6.6.26)), we deduce from (6.6.27) that the sequence of partial sums of the series in (b) is Cauchy in $H$, hence convergent to some $\tilde{u} \in H$, so we can write

\[ \tilde{u} = \sum_{n=1}^{\infty} (u, u_n) u_n. \tag{6.6.28} \]

We compute

\[ (\tilde{u}, u_j) = \lim_{N \to \infty} \left( \sum_{n=1}^{N} (u, u_n) u_n,\ u_j \right) = (u, u_j) \quad \forall j \in \mathbb{N}, \]

so

\[ (\tilde{u} - u,\, u_j) = 0 \quad \forall j \in \mathbb{N}, \]

which implies $\tilde{u} = u$ by the completeness of $S$. Therefore $\tilde{u}$ in (6.6.28) can be replaced by $u$.

It is clear that (b) $\implies$ (d). To complete the proof it suffices to show (d) $\implies$ (a). Assume by contradiction that (d) holds but $S$ is not complete, i.e., there exists a vector $v \in H \setminus S$ such that $\|v\| = 1$ and $(v, u_n) = 0$ for all $n \in \mathbb{N}$. According to (d), we obtain $(v, w) = 0$ for all $w \in H$, hence $v = 0$, another contradiction.

⁹ Marc-Antoine Parseval, French mathematician, 1755–1836.

Remark 6.22. According to Theorem 6.21, if $S = \{u_n\}_{n \in \mathbb{N}}$ is a complete orthonormal system in $H$, then every $u \in H$ is the sum of the Fourier series associated with it (see (b)), similar to the finite dimensional case $m < \infty$. That is why $S$ is also called a countable orthonormal basis of $H$.

The next result is a characterization of the Hilbert spaces possessing countable orthonormal bases.

Theorem 6.23. A Hilbert space has a countable orthonormal basis if and only if it is separable.

Proof. Let $H$ be a Hilbert space. Denote $m := \dim H$. If $m < \infty$, then the result is trivial, so let us assume $m = \infty$. Let $S = \{u_n\}_{n \in \mathbb{N}}$ be a (countable) orthonormal basis in $H$. Then $\operatorname{Span} S$ is dense in $H$ (cf. Theorem 6.21). On the other hand, using the fact that $\mathbb{Q}$ is dense in $\mathbb{R}$, we can show that there exists a countable subset of $\operatorname{Span} S$ which is dense in $\operatorname{Span} S$, hence in $H$. Indeed, for any $u \in \operatorname{Span} S$, say $u = \sum_{k=1}^{p} \alpha_k u_k$, and any $\varepsilon > 0$, there are numbers $r_k \in \mathbb{Q}$ if $H$ is a real Hilbert space, or $r_k \in \mathbb{Q} + i\mathbb{Q}$ if $H$ is a complex Hilbert space, such that

\[ |r_k - \alpha_k| < \frac{\varepsilon}{p}, \ k = 1, \dots, p \implies \left\| u - \sum_{k=1}^{p} r_k u_k \right\| < \varepsilon. \]


Thus $H$ is separable.

Conversely, assume $H$ is separable, i.e., there exists a countably infinite set, say $M = \{x_1, x_2, \dots, x_n, \dots\}$, such that $\overline{M} = H$. Using Gram–Schmidt (see Chap. 1) we can construct with vectors from $M$ an orthonormal system $S = \{u_1, u_2, \dots, u_n, \dots\}$, eliminating dependent vectors of $M$ if any. An inspection of the Gram–Schmidt method shows that in fact $M \subset \operatorname{Span} S$, so that

\[ H = \overline{M} \subset \overline{\operatorname{Span} S} \subset H \implies \overline{\operatorname{Span} S} = H, \]

so $S$ is an orthonormal basis (cf. Theorem 6.21).

Remark 6.24. If $H$ is not separable, then the existence of a complete orthonormal system $S = \{u_i\}_{i \in I}$ in $H$ is still valid (cf. Remark 6.20). Obviously, the index set $I$ is no longer countable. Surprisingly, in this case, for every $u \in H$ there is a sequence of indices $i_1, i_2, \dots$ such that

\[ u = \sum_{j=1}^{\infty} (u, u_{i_j}) u_{i_j}, \]

i.e., $u$ has a Fourier series expansion as in the separable case. For the proof of this result, see [51, pp. 86–87].

A Classical Fourier Series Expansion

Let $H = L^2(-\pi, \pi)$ with the usual scalar product

\[ (f, g) = \int_{-\pi}^{\pi} f(x) g(x) \, dx, \quad f, g \in H, \]

and $\|f\| = \sqrt{(f,f)}$ for all $f \in H$. Let $S = \{u_n\}_{n=0}^{\infty}$, where

\[ u_0 = \frac{1}{\sqrt{2\pi}}, \quad u_{2k-1}(x) = \frac{1}{\sqrt{\pi}} \cos kx, \quad u_{2k}(x) = \frac{1}{\sqrt{\pi}} \sin kx, \quad k = 1, 2, \dots. \]

By a straightforward computation we can see that $S$ is an orthonormal system in $H$. Moreover, $S$ is complete, as stated in the following result.

Theorem 6.25 (Fischer¹⁰–Riesz). The orthonormal system $S$ as above is a basis in $H = L^2(-\pi, \pi)$.

¹⁰ Ernst Sigismund Fischer, Austrian mathematician, 1875–1954.


Proof. According to Theorem 6.21 it suffices to prove that $\overline{\operatorname{Span} S} = H$. We know that $C_0^\infty(-\pi, \pi)$ is dense in $L^2(-\pi, \pi)$ (see Theorem 5.8). To conclude we can use Weierstrass' lemma below (cf. [52, p. 205]). This is an approximation result with respect to the sup-norm of $C[-\pi, \pi]$, which is obviously stronger than the norm of $H = L^2(-\pi, \pi)$.

Lemma 6.26 (Weierstrass). $\operatorname{Span} S$ is dense in the space

\[ X = \{f \in C[-\pi, \pi];\ f(-\pi) = f(\pi)\} \]

equipped with the sup-norm $\|\cdot\|_C$, where $S$ is the function system defined above.

Proof. Let $f \in X$ be even, i.e., $f(-x) = f(x)$, $x \in [-\pi, \pi]$. Since the function $y \mapsto f(\arccos y)$ is continuous on $[-1, 1]$, for all $\varepsilon > 0$ there exists a Bernstein¹¹ polynomial $p$ such that

\[ \sup_{y \in [-1,1]} |f(\arccos y) - p(y)| < \varepsilon \iff \sup_{x \in [0,\pi]} |f(x) - p(\cos x)| < \varepsilon. \tag{6.6.29} \]

In fact, since both $f$ and $x \mapsto p(\cos x)$ are even, we can extend (6.6.29) to $[-\pi, \pi]$,

\[ \sup_{x \in [-\pi,\pi]} |f(x) - p(\cos x)| < \varepsilon. \tag{6.6.30} \]

By elementary trigonometric formulas we see that $p(\cos x) \in \operatorname{Span} S$, so (6.6.30) concludes the proof in the case when $f$ is even.

¹¹ Sergei N. Bernstein, Russian mathematician, 1880–1968.

Now, consider an odd function $f \in X$, so $f(-\pi) = f(\pi) = f(0) = 0$. Then $x \mapsto \frac{f(x)}{\sin x}$ is an even function, but has singularities at $x = 0, \pm\pi$. So we consider for $\delta > 0$ small

\[ \tilde{f}(x) = \begin{cases} f\left(\dfrac{\pi(x - \delta)}{\pi - 2\delta}\right), & x \in (\delta, \pi - \delta), \\ 0, & x \in [0, \delta] \cup [\pi - \delta, \pi], \end{cases} \]

and $\tilde{f}(x) := -\tilde{f}(-x)$ for $x \in [-\pi, 0)$. Clearly, $\tilde{f}$ is a continuous odd function which approximates $f$ uniformly. Now define

\[ \psi(x) = \begin{cases} \dfrac{\tilde{f}(x)}{\sin x}, & x \in [-\pi, \pi] \setminus \{0, \pm\pi\}, \\ 0, & x \in \{0, \pm\pi\}. \end{cases} \]

Thus, by the first part of the proof, for all $\varepsilon > 0$ there exists $q \in \operatorname{Span} S$ such that

\[ \|\psi - q\|_C < \varepsilon \implies \|\tilde{f} - q \sin x\|_C \le \varepsilon. \]

Obviously $q \sin x \in \operatorname{Span} S$, and thus odd continuous functions can be approximated as well by elements in $\operatorname{Span} S$. To conclude the proof, it is enough to notice that any function $f$ can be decomposed into $f = f_e + f_o$, where

\[ f_e(x) = \frac{1}{2}[f(x) + f(-x)], \qquad f_o(x) = \frac{1}{2}[f(x) - f(-x)] \]

are even and odd, respectively.
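The orthonormality of the trigonometric system $S$ and Bessel's inequality for it can be checked numerically. The sketch below assumes the real scalar product of $L^2(-\pi, \pi)$ approximated by a midpoint rule, and uses the test function $u(x) = x$; the quadrature size, the truncation at 21 basis functions, and the helper names are illustrative assumptions.

```python
import math

M = 2000                         # number of midpoint nodes (arbitrary choice)
H = 2.0 * math.pi / M            # mesh width on (-pi, pi)
NODES = [-math.pi + (j + 0.5) * H for j in range(M)]

def inner(f, g):
    """Midpoint-rule approximation of the real L^2(-pi, pi) scalar product."""
    return H * sum(f(x) * g(x) for x in NODES)

def basis(n):
    """Return u_n from the system S: u_0, u_{2k-1} ~ cos kx, u_{2k} ~ sin kx."""
    if n == 0:
        return lambda x: 1.0 / math.sqrt(2.0 * math.pi)
    k = (n + 1) // 2
    if n % 2 == 1:
        return lambda x: math.cos(k * x) / math.sqrt(math.pi)
    return lambda x: math.sin(k * x) / math.sqrt(math.pi)

# Gram matrix of the first five basis functions: close to the identity.
gram = [[inner(basis(i), basis(j)) for j in range(5)] for i in range(5)]

# Fourier coefficients (u, u_n) of u(x) = x for n = 0, ..., 20.
u = lambda x: x
coeffs = [inner(u, basis(n)) for n in range(21)]
bessel = sum(c * c for c in coeffs)
norm_sq = inner(u, u)

print(coeffs[2])         # coefficient along sin x / sqrt(pi)
print(bessel, norm_sq)   # the partial sum of squares stays below ||u||^2
```

Increasing the number of basis functions moves the sum of squared coefficients closer to $\|u\|^2$, as Parseval's relation in Theorem 6.21 predicts for a complete system.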

Some Comments

1. Since $L^2(-\pi, \pi)$ has a countable orthonormal basis $S$, it follows that $L^2(-\pi, \pi)$ is separable (by Theorem 6.23). Obviously, $L^2(a, b)$ is separable for any $a < b$. In fact, for any measurable set $\Omega \subset \mathbb{R}^k$, $L^p(\Omega)$ is separable for all $p \in [1, \infty)$ (see, e.g., [6, p. 95]).

2. By Theorems 6.21 and 6.25 it follows that every $u \in L^2(-\pi, \pi)$ is the sum of the Fourier series associated with it, i.e.,

\[ u = \sum_{n=0}^{\infty} (u, u_n) u_n, \tag{6.6.31} \]

meaning that $s_n(u) = \sum_{k=0}^{n} (u, u_k) u_k$ converges strongly to $u$ in $L^2(-\pi, \pi)$. Taking into account the structure of the basis $S$, (6.6.31) can be written as

\[ u(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx), \tag{6.6.32} \]

where

\[ a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} u(t) \, dt, \quad a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} u(t) \cos(kt) \, dt, \quad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} u(t) \sin(kt) \, dt, \tag{6.6.33} \]


for all k ∈ N. Note that (6.6.32) is precisely the classical form of the Fourier series associated with u. For the moment, we know (by Fischer–Riesz) that for u ∈ L2 (−π, π) the series expansion (6.6.32) is valid in L2 (−π, π), i.e., a0 + sn (u)(x) = (ak cos kx + bk sin kx) 2 n

(6.6.34)

k=1

converges to $u$ in $L^2(-\pi,\pi)$. Then there is a subsequence of $(s_n(u))$ that converges to $u$ for a.a. $x \in (-\pi,\pi)$. The question is whether the sequence $(s_n(u))$ itself converges a.e., i.e., whether (6.6.32) holds for a.a. $x \in (-\pi,\pi)$. This question was posed in 1920 by Luzin.¹² In 1966, Carleson¹³ proved that this is indeed the case. The proof is not trivial and is omitted. Later, Hunt¹⁴ extended the result to $L^p$-functions, i.e., the series expansion (6.6.32) holds a.e. for every $L^p$-function $u$, for $1 < p < \infty$. On the other hand, in 1922 Kolmogorov¹⁵ gave a counterexample showing that it does not hold for $p = 1$. However, the Fourier expansion (6.6.32) holds for $L^1$-functions in the sense of distributions, as explained below.

Fourier Series Expansions of $L^1$ Functions

Recall that in general $L^1$ functions do not admit Fourier series expansions in the classical theory. However, the Fourier coefficients of $u$ (see (6.6.33)) are still well defined if $u \in L^1(-\pi,\pi)$. Fix such a function $u \in L^1(-\pi,\pi)$ and associate with it the series

$$u(x) \approx \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) .$$

We can prove that

$$u(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \quad \text{in } \mathcal{D}'(-\pi,\pi) , \tag{6.6.35}$$

where $a_k$, $b_k$ are the Fourier coefficients of $u$ defined in (6.6.33).

¹² Nikolai N. Luzin, Russian mathematician, 1883–1950.
¹³ Lennart Axel Edvard Carleson, Swedish mathematician, born 1928.
¹⁴ Richard Allen Hunt, American mathematician, 1937–2009.
¹⁵ Andrey N. Kolmogorov, Russian mathematician, 1903–1987.
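The $L^2$ convergence of the partial sums (6.6.34) can be observed numerically. The following is only a rough sketch, not part of the text: the test function $u(x) = x$, the midpoint quadrature rule, and all function names are assumptions chosen for illustration. It approximates the coefficients (6.6.33) and the $L^2(-\pi,\pi)$ distance between $u$ and $s_n(u)$.

```python
import math

def midpoints(m):
    """Midpoints of m equal cells covering (-pi, pi), together with the cell width."""
    h = 2 * math.pi / m
    return [-math.pi + (j + 0.5) * h for j in range(m)], h

def fourier_coefficients(u, n_max, m=2000):
    """Approximate a_0..a_{n_max} and b_1..b_{n_max} of (6.6.33) by the midpoint rule."""
    ts, h = midpoints(m)
    a = [sum(u(t) * math.cos(k * t) for t in ts) * h / math.pi
         for k in range(n_max + 1)]
    b = [0.0] + [sum(u(t) * math.sin(k * t) for t in ts) * h / math.pi
                 for k in range(1, n_max + 1)]
    return a, b

def partial_sum(a, b, n, x):
    """Evaluate s_n(u)(x) as in (6.6.34)."""
    return a[0] / 2 + sum(a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
                          for k in range(1, n + 1))

def l2_error(u, a, b, n, m=2000):
    """Approximate the L^2(-pi, pi) norm of u - s_n(u)."""
    ts, h = midpoints(m)
    return math.sqrt(sum((u(t) - partial_sum(a, b, n, t)) ** 2 for t in ts) * h)

def u(x):
    # Illustrative choice: u(x) = x, whose exact coefficients are
    # a_k = 0 and b_k = 2 * (-1)**(k + 1) / k.
    return x

a, b = fourier_coefficients(u, 32)
errors = [l2_error(u, a, b, n) for n in (4, 8, 16, 32)]
print(errors)
```

The printed errors decrease as $n$ grows, in line with the strong convergence $s_n(u) \to u$ in $L^2(-\pi,\pi)$; pointwise behaviour near $x = \pm\pi$ is a separate matter and is not measured here.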


Recall that distributions are not defined pointwise, and the appearance of $x$ in (6.6.35) is simply for convenience. In order to prove (6.6.35), consider the series

$$\frac{a_0}{4}x^2 + \sum_{n=1}^{\infty} \left( -\frac{a_n}{n^2}\cos nx - \frac{b_n}{n^2}\sin nx \right) ,$$

which is obtained by formally integrating twice in the right-hand side of (6.6.35). This series is uniformly and absolutely convergent since for all $n \ge 1$

$$\left| -\frac{a_n}{n^2}\cos nx - \frac{b_n}{n^2}\sin nx \right| \le \frac{1}{n^2}\,(|a_n| + |b_n|) \le \frac{4}{n^2\pi}\int_{-\pi}^{\pi} |u(t)|\,dt = C\,\frac{1}{n^2} .$$

Let

$$s(x) = \frac{a_0}{4}x^2 + \sum_{n=1}^{\infty} \left( -\frac{a_n}{n^2}\cos nx - \frac{b_n}{n^2}\sin nx \right) . \tag{6.6.36}$$

Of course, uniform convergence on $[-\pi,\pi]$ implies convergence in $\mathcal{D}'(-\pi,\pi)$, so (6.6.36) also holds in $\mathcal{D}'(-\pi,\pi)$. Differentiating (6.6.36) twice in the sense of distributions we get

$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) = s'' \quad \text{in } \mathcal{D}'(-\pi,\pi) . \tag{6.6.37}$$

Finally we must show that $s'' = u$, i.e., $s''$ is generated by the function $u$. We consider the partial sums

$$s_l(u)(x) = \frac{a_0}{2} + \sum_{n=1}^{l} (a_n \cos nx + b_n \sin nx) .$$


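For $\varphi \in \mathcal{D}(-\pi,\pi) = C_0^\infty(-\pi,\pi)$ we have

$$
\begin{aligned}
(s_l(u),\varphi) &= \int_{-\pi}^{\pi} s_l(u)(x)\,\varphi(x)\,dx \\
&= \int_{-\pi}^{\pi} \varphi(x)\left[ \frac{a_0}{2} + \sum_{n=1}^{l} (a_n\cos nx + b_n\sin nx) \right] dx \\
&= \int_{-\pi}^{\pi} \varphi(x)\left[ \frac{1}{2\pi}\int_{-\pi}^{\pi} u(t)\,dt + \sum_{n=1}^{l}\left( \cos nx\,\frac{1}{\pi}\int_{-\pi}^{\pi} u(t)\cos nt\,dt + \sin nx\,\frac{1}{\pi}\int_{-\pi}^{\pi} u(t)\sin nt\,dt \right) \right] dx ,
\end{aligned}
$$

and now we change the order of integration to get

$$(s_l(u),\varphi) = \int_{-\pi}^{\pi} u(t)\,s_l(\varphi)(t)\,dt . \tag{6.6.38}$$

On the other hand,

$$\lim_{l\to\infty} s_l(\varphi)(t) = \varphi(t) \quad \text{uniformly for } t \in [-\pi,\pi] . \tag{6.6.39}$$

Indeed, if we denote

$$A_k = \frac{1}{\pi}\int_{-\pi}^{\pi} \varphi(t)\cos kt\,dt \ (k \ge 0), \qquad B_k = \frac{1}{\pi}\int_{-\pi}^{\pi} \varphi(t)\sin kt\,dt \ (k \ge 1) ,$$

we may integrate by parts twice (since $\varphi$ is infinitely differentiable with compact support in $(-\pi,\pi)$, so the boundary terms vanish), so that for $k \ge 1$

$$A_k = -\frac{1}{k\pi}\int_{-\pi}^{\pi} \varphi'(t)\sin kt\,dt = -\frac{1}{k^2\pi}\int_{-\pi}^{\pi} \varphi''(t)\cos kt\,dt$$

and similarly

$$B_k = -\frac{1}{k^2\pi}\int_{-\pi}^{\pi} \varphi''(t)\sin kt\,dt .$$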


Therefore there exists a constant $C_1 > 0$ (depending on $\varphi$) such that

$$|A_k| \le \frac{C_1}{k^2}, \quad |B_k| \le \frac{C_1}{k^2} \qquad \forall k \ge 1 . \tag{6.6.40}$$

As $A_k$, $B_k$ are the Fourier coefficients of $\varphi$, we deduce from (6.6.40) that the Fourier series of $\varphi$ is uniformly convergent (see Weierstrass' M Test) and its sum is $\varphi$ (by the classical theory, or by Theorem 6.25), i.e., (6.6.39) holds. Finally, taking into account (6.6.37) and (6.6.39) and letting $l \to \infty$ in (6.6.38), we get

$$(s'',\varphi) = \int_{-\pi}^{\pi} u(t)\,\varphi(t)\,dt = (u,\varphi) .$$

As $\varphi$ was arbitrarily chosen this implies $s'' = u$, as claimed.
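The identity (6.6.38) and the resulting convergence $(s_l(u),\varphi) \to (u,\varphi)$ can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's construction: $u$ is taken to be a step function, $\varphi$ a standard bump function supported in $[-0.5, 2.5] \subset (-\pi,\pi)$, integrals are approximated by the midpoint rule, and all names are hypothetical. By orthogonality of the trigonometric system, $(s_l(u),\varphi) = \pi\left[\tfrac{a_0A_0}{2} + \sum_{n=1}^{l}(a_nA_n + b_nB_n)\right]$, which is symmetric in $u$ and $\varphi$, exactly as (6.6.38) states.

```python
import math

M = 2000                      # number of quadrature cells (an arbitrary choice)
H = 2 * math.pi / M
TS = [-math.pi + (j + 0.5) * H for j in range(M)]

def u(x):
    # Illustrative integrable function: a step at x = 0.
    return 1.0 if x > 0 else 0.0

def phi(x):
    # Smooth bump with compact support [-0.5, 2.5] inside (-pi, pi).
    y = (x - 1.0) / 1.5
    d = 1.0 - y * y
    return math.exp(-1.0 / d) if d > 0.0 else 0.0

def integral(f):
    """Midpoint-rule approximation of the integral of f over (-pi, pi)."""
    return sum(f(t) for t in TS) * H

def coefficients(f, n_max):
    """Approximate Fourier coefficients of f as in (6.6.33)."""
    a = [integral(lambda t: f(t) * math.cos(k * t)) / math.pi
         for k in range(n_max + 1)]
    b = [0.0] + [integral(lambda t: f(t) * math.sin(k * t)) / math.pi
                 for k in range(1, n_max + 1)]
    return a, b

def pairing(l, a, b, A, B):
    """(s_l(u), phi), expressed through the coefficients of u and of phi."""
    return math.pi * (a[0] * A[0] / 2
                      + sum(a[n] * A[n] + b[n] * B[n] for n in range(1, l + 1)))

L_MAX = 40
a, b = coefficients(u, L_MAX)
A, B = coefficients(phi, L_MAX)
target = integral(lambda t: u(t) * phi(t))   # (u, phi)
gaps = [abs(pairing(l, a, b, A, B) - target) for l in (1, 5, 40)]
print(gaps)
```

The gap at $l = 40$ is small because the coefficients $A_k$, $B_k$ of the smooth function $\varphi$ decay at least like $1/k^2$, as in (6.6.40).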

6.7 Exercises

1. Let $\emptyset \ne \Omega \subset \mathbb{R}^k$ be an open set and let $p \in (1,\infty)$. It is well known that $L^p(\Omega)$ is a Banach space with respect to the usual norm

$$\|u\|_{L^p(\Omega)} = \left( \int_\Omega |u(x)|^p\,dx \right)^{1/p}, \quad u \in L^p(\Omega).$$

Prove that $\left(L^p(\Omega), \|\cdot\|_{L^p(\Omega)}\right)$ is a Hilbert space if and only if $p = 2$.

2. Let $H$ be a pre-Hilbert space, i.e., a linear space equipped with a scalar product $(\cdot,\cdot)$ and the induced norm $\|\cdot\|$. Show that for $x, y \in H$ we have $|(x,y)| = \|x\| \cdot \|y\|$ if and only if $x$ and $y$ are linearly dependent.

3. Let $-\infty < a < b < \infty$. Show that $C[a,b]$ with the sup-norm is not a Hilbert space.

4. Let $n$ be a given natural number. Let $C$ be the set of all polynomials with real coefficients of degree $\le n$. Show that for any $u \in L^2(0,1)$ there exists a unique $p_u \in C$ such that

$$\|u - p_u\|_{L^2(0,1)} \le \|u - p\|_{L^2(0,1)} \quad \forall p \in C.$$


5. Let $(H, \|\cdot\|)$ be a Hilbert space. Define $P : H \to H$ by

$$Pu = \begin{cases} u & \text{if } \|u\| \le 1, \\ \|u\|^{-1}u & \text{if } \|u\| > 1. \end{cases}$$

(The operator $P$ is called the radial retraction.) Prove that
(i) $P$ is nonexpansive, i.e., Lipschitzian with Lipschitz constant $L = 1$;
(ii) if $H$ is a general Banach space, then $P$ is Lipschitzian with $L = 2$.

6. Let $\mathbb{R}^3$ be equipped with the usual scalar product and Euclidean norm. Set

$$M = \{x = (x_1,x_2,x_3)^T \in \mathbb{R}^3;\ 2x_1 - x_2 - 3x_3 = 0\}.$$

Show that $M$ is a closed linear subspace of $\mathbb{R}^3$. Determine $M^\perp$ and for $x = (1,2,-1)^T$ determine $P_M x$ and write $x$ as a direct sum of vectors in $M$ and $M^\perp$, i.e., $x = x^1 + x^2$, $x^1 \in M$, $x^2 \in M^\perp$.

7. Let $-\infty < a < b < +\infty$ and let $L^2(a,b) := L^2(a,b;\mathbb{R})$ be equipped with the usual scalar product and norm. Show that

$$M = \left\{ u \in L^2(a,b);\ \int_a^b u(t)\,dt = 0 \right\}$$

is a closed linear subspace of $L^2(a,b)$. Determine $M^\perp$ and write any $u \in L^2(a,b)$ as a direct sum of vectors in $M$ and $M^\perp$, i.e., $u = u_1 + u_2$, $u_1 \in M$, $u_2 \in M^\perp$.

8. Same exercise for $L^2(-1,1)$ and

$$M = \{u \in L^2(-1,1);\ u(t) = u(-t) \text{ for a.a. } t \in (-1,1)\}.$$

9. Show that for any linear subspace $Y$ of a Hilbert space $(H,(\cdot,\cdot))$ one has $\left(Y^\perp\right)^\perp = \mathrm{Cl}\,Y$.

10. Let $H = L^2(0,1)$ be the real Hilbert space equipped with the usual scalar product and norm. Is the subspace $Y = \left\{u \in H;\ \int_0^1 u(t)\,t\,dt = 0\right\}$ closed in $H$?


11. Prove that the dual of any Hilbert space is a Hilbert space, too.

12. Let $\{u_n\}_{n=1}^\infty$ be an orthonormal basis in a Hilbert space $H$ and let $(a_n)_{n\in\mathbb{N}}$ be a bounded sequence in $\mathbb{R}$. Prove that
(i) the sequence $(v_n)_{n\in\mathbb{N}}$ defined by

$$v_n = \frac{1}{n}\sum_{i=1}^{n} a_i u_i, \quad n \in \mathbb{N},$$

converges strongly to zero;
(ii) the sequence $(\sqrt{n}\,v_n)_{n\in\mathbb{N}}$ converges weakly to zero.

13. Let $(H, \|\cdot\|)$ be a Hilbert space and $A \in L(H)$. Show that the following two conditions are equivalent:
(i) there exists a constant $c > 0$ such that $c\|x\| \le \|Ax\|$ $\forall x \in H$;
(ii) there exists an operator $B \in L(H)$ such that $B \circ A = I$, where $I$ is the identity operator on $H$.

14. Let $(H, \|\cdot\|, (\cdot,\cdot))$ be a real Hilbert space. For any $A \in L(H)$ satisfying $(Ax,x) \ge 0$ $\forall x \in H$, we have
(i) $H = N(A) \oplus [\mathrm{Cl}\,R(A)]$;
(ii) for all $t > 0$, $I + tA$ is bijective and

$$\lim_{t\to\infty} (I + tA)^{-1}u = P_{N(A)}u \quad \forall u \in H,$$

where $I$ denotes the identity operator.

15. Let $(u_n)_{n\in\mathbb{N}}$ be a sequence in a Hilbert space $(H, \|\cdot\|)$ which is weakly convergent to a point $u \in H$. If, in addition, $\limsup \|u_n\| \le \|u\|$, then show that $\|u_n - u\| \to 0$.

16. Prove that for any $f \in L^1(0,1)$ there exists a unique $u \in H_0^1(0,1)$ satisfying

$$\int_0^1 u'(t)v'(t)\,dt + \int_0^1 u(t)v(t)\,dt = \int_0^1 f(t)v(t)\,dt \quad \forall v \in H_0^1(0,1),$$


and, furthermore, $u \in W^{2,1}(0,1)$ and

$$\begin{cases} -u'' + u = f & \text{a.e. in } (0,1), \\ u(0) = 0,\ u(1) = 0. \end{cases}$$

17. Let $f \in L^2(0,1)$ and $\alpha > 0$.
(i) Show that the following boundary value problem, denoted $(P)$,

$$\begin{cases} u \in H^2(0,1), \\ -u''(t) + \alpha u(t) = f(t) & \text{for a.a. } t \in (0,1), \\ u'(0) = 0,\ u'(1) = u(1), \end{cases}$$

is equivalent to the variational formulation, denoted $(\tilde{P})$,

$$u \in H^1(0,1), \quad -u(1)v(1) + \int_0^1 u'v' + \alpha\int_0^1 uv = \int_0^1 fv \quad \forall v \in H^1(0,1).$$

(ii) Using Lax–Milgram, prove that for $\alpha$ large enough there exists a unique solution $u$ of problem $(P)$.
(iii) Show that the solution $u$ can be expressed as the minimizer of a functional defined on $H^1(0,1)$.

18. Let $(H,(\cdot,\cdot))$ be a Hilbert space and let $Y \subset H$ be a closed subspace with an orthonormal basis $\{u_n\}_{n=1}^\infty$. Prove that for every $y \in H$ the closest point to $y$ in $Y$ is $\sum_{n=1}^{\infty} (y,u_n)\,u_n$.

19. Let $(H, \|\cdot\|)$ be an infinite dimensional, separable Hilbert space. Show that for any $x \in H$, $\|x\| \le 1$, there exists a sequence $(x_n)_{n\in\mathbb{N}}$ in $H$ such that $\|x_n\| = 1$ for all $n \in \mathbb{N}$ and $x_n \to x$ weakly.


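20. Find the Fourier expansions of the functions

$$f_1(x) = \cos x - |x|, \quad -\pi \le x \le \pi,$$

$$f_2(x) = -3x + \sin x, \quad -\pi \le x \le \pi,$$

$$f_3(x) = \begin{cases} -1 & -\pi \le x \le 0, \\ x + 1 & 0 \le x \le \pi, \end{cases}$$

$$f_4(x) = \begin{cases} x + 1 & -1 \le x \le 0, \\ x^2 - 1 & 0 \le x \le 1. \end{cases}$$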

Chapter 7

Adjoint, Symmetric, and Self-adjoint Linear Operators

Here we first recall the definition of the adjoint of a linear operator and discuss some related results. Then we shall address the case of compact operators $A : H \to H$, where $H$ is a Hilbert space, and present the Fredholm theorem as an application. The last section is devoted to symmetric operators and self-adjoint operators. Throughout this chapter we consider linear operators between linear spaces over $\mathbb{K}$, where $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, unless otherwise specified.

© Springer Nature Switzerland AG 2019. G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_7

7.1 The Adjoint of a Linear Operator

Let $X$, $Y$ be Banach spaces with duals $X^*$ and $Y^*$ and let $A : D(A) \subset X \to Y$ be a linear operator that is densely defined: $\overline{D(A)} = X$. The adjoint of $A$ is an operator $A^* : D(A^*) \subset Y^* \to X^*$ defined as follows. The domain of $A^*$ is the set

$$D(A^*) = \{y^* \in Y^*;\ \exists c > 0 \text{ such that } |y^*(Ax)| \le c\|x\| \ \forall x \in D(A)\},$$

which is a linear subspace of $Y^*$. Note that for $y^* \in D(A^*)$ the linear functional $f(x) = y^*(Ax)$ is continuous on $D(A)$ (equipped with the norm $\|\cdot\|$ of $X$), i.e., $|f(x)| \le c\|x\|$ for all $x \in D(A)$. According to the Hahn–Banach Theorem, $f$ can be extended to a functional $g \in X^*$ such that $|g(x)| \le c\|x\|$ for all $x \in X$. This extension is unique since $D(A)$ is dense in $X$. We now define $A^*y^* = g$ and we can write

$$y^*(Ax) = (A^*y^*)(x) \quad \forall x \in D(A),\ y^* \in D(A^*) . \tag{7.1.1}$$

Example. Let $X = Y = l^1$ (for the definition of $l^1$ see Chap. 4). Let $A : D(A) \subset l^1 \to l^1$ be defined by

$$D(A) = \{(u_n)_{n\ge1} \in l^1;\ (nu_n)_{n\ge1} \in l^1\}, \quad A(u_n) = (nu_n) .$$

Obviously, $D(A)$ is dense in $l^1$ (it contains all sequences with finitely many nonzero terms). It is also easily seen that

$$D(A^*) = \{(y_n)_{n\ge1} \in l^\infty;\ (ny_n)_{n\ge1} \in l^\infty\}, \quad A^*(y_n) = (ny_n) .$$

Note that both $A$ and $A^*$ are closed operators, i.e., their graphs are closed in $l^1 \times l^1$ and $l^\infty \times l^\infty$, respectively. In fact, we have the following general result:

Theorem 7.1. Let $X$ and $Y$ be Banach spaces and let $A : D(A) \subset X \to Y$ be a densely defined, linear operator. Then $A^*$ is closed.

Proof. Let $(y_n^*)$ be a sequence in $D(A^*)$ such that $y_n^* \to y^*$ in $Y^*$ and $A^*y_n^* \to x^*$ in $X^*$. We have

$$y_n^*(Ax) = (A^*y_n^*)(x) \quad \forall x \in D(A) ,$$

which yields (by letting $n \to \infty$)

$$y^*(Ax) = x^*(x) \quad \forall x \in D(A) .$$

Therefore $y^* \in D(A^*)$ and $A^*y^* = x^*$.

We also have the following results about continuity.


Theorem 7.2. Let $X$, $Y$ be Banach spaces with duals $X^*$ and $Y^*$. If $A \in L(X,Y)$, then $A^* \in L(Y^*,X^*)$, and $\|A\| = \|A^*\|$.

Proof. Obviously, $D(A^*) = Y^*$. From (7.1.1) we deduce (using the same symbol $\|\cdot\|$ for different norms)

$$|(A^*y^*)(x)| \le \|y^*\| \cdot \|A\| \cdot \|x\| \quad \forall x \in X,\ y^* \in Y^* .$$

Therefore,

$$\|A^*y^*\| \le \|A\| \cdot \|y^*\| \ \ \forall y^* \in Y^* \ \Rightarrow\ \|A^*\| \le \|A\| .$$

On the other hand, using (7.1.1) again, we obtain

$$|y^*(Ax)| \le \|A^*\| \cdot \|y^*\| \cdot \|x\| \quad \forall x \in X,\ y^* \in Y^* ,$$

hence, by Corollary 4.18 in Chap. 4,

$$\|Ax\| \le \|A^*\| \cdot \|x\| \ \ \forall x \in X \ \Rightarrow\ \|A\| \le \|A^*\| .$$

We continue with some simple properties of adjoint operators. Let $X$, $Y$, $Z$ be three Banach spaces over $\mathbb{K}$, where $\mathbb{K}$ is the same (either $\mathbb{R}$ or $\mathbb{C}$) for all three spaces. Then the following properties hold:

(a) If $A : D(A) \subset X \to Y$ is a densely defined, linear operator, and $B : D(B) \subset X \to Y$ is another linear operator, such that $A \subset B$ (i.e., $D(A) \subset D(B)$ and $Bx = Ax$ $\forall x \in D(A)$), then $B^* \subset A^*$;

(b) For all $\alpha, \beta \in \mathbb{K}$ and $A, B \in L(X,Y)$, $(\alpha A + \beta B)^* = \alpha A^* + \beta B^*$;

(c) If $A \in L(X,Y)$ and $B \in L(Y,Z)$, then $(B \circ A)^* = A^* \circ B^*$;

(d) If $A \in L(X,Y)$ is bijective, then $A^*$ is bijective too, and $(A^*)^{-1} = (A^{-1})^*$.

The proofs are left to the reader.

7.2 Adjoints of Operators on Hilbert Spaces

Let $(H,(\cdot,\cdot),\|\cdot\|)$ be a Hilbert space. Let $A : D(A) \subset H \to H$ be a densely defined, linear operator. Taking into account the Riesz Representation Theorem, the adjoint of $A$ can be redefined as an operator from $H$ into itself, as follows:

$$D(A^*) = \{y \in H;\ \exists c > 0 \text{ such that } |(Ax,y)| \le c\|x\| \ \forall x \in D(A)\} .$$

Now, for $y \in D(A^*)$ the linear functional $x \mapsto (Ax,y)$ (which is continuous on $(D(A),\|\cdot\|)$) can be extended uniquely to a functional belonging to $H^*$, so (by Riesz) there is a corresponding element in $H$, denoted $A^*y$. Thus we have a linear operator $A^* : D(A^*) \subset H \to H$, such that

$$(Ax,y) = (x,A^*y) \quad \forall x \in D(A),\ y \in D(A^*) . \tag{7.2.2}$$

In fact, if $R : H \to H^*$ denotes the Riesz isomorphism, this adjoint is nothing else but the operator $R^{-1} \circ A^* \circ R$, with $A^*$ being the adjoint defined in the previous section. Whenever we deal with a densely defined linear operator $A : D(A) \subset H \to H$, we shall associate with $A$ the $A^*$ defined in this section. It is easily seen that all the properties discussed in the previous section remain valid, except for (b), which now takes the form

(b') For all $\alpha, \beta \in \mathbb{K}$ and $A, B \in L(X,Y)$, $(\alpha A + \beta B)^* = \bar{\alpha}A^* + \bar{\beta}B^*$,

where $\bar{\alpha}$, $\bar{\beta}$ denote the complex conjugates of $\alpha$, $\beta$. In fact, for any $\alpha \in \mathbb{K}$ and any densely defined, linear operator $A : D(A) \subset X \to Y$, we have $(\alpha A)^* = \bar{\alpha}A^*$.

If $H$ is finite dimensional, then the matrix corresponding to $A^*$ is the conjugate transpose of the matrix corresponding to $A$, while the matrix associated with the adjoint of $A$ as defined in the previous section is just the transpose of the matrix corresponding to $A$. This shows the difference between the two notions of adjoint.
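The finite dimensional remark can be checked directly on $\mathbb{C}^2$. The snippet below is a small sketch with an arbitrarily chosen matrix and vectors (all names are illustrative): the Hilbert space adjoint, which satisfies (7.2.2) with the scalar product $(x,y) = \sum_i x_i\bar{y}_i$, is the conjugate transpose, whereas the Banach space adjoint of the previous section, which satisfies (7.1.1) with the bilinear pairing $y(x) = \sum_i y_ix_i$, is the plain transpose.

```python
def inner(x, y):
    """Scalar product on C^n: linear in x, conjugate-linear in y."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def pairing(y, x):
    """Bilinear duality pairing y(x) between C^n and its dual."""
    return sum(yi * xi for yi, xi in zip(y, x))

def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def transpose(m):
    return [[m[j][i] for j in range(len(m))] for i in range(len(m[0]))]

def conj_transpose(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

# Arbitrary example data (not taken from the text).
M = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
x = [1 - 1j, 2j]
y = [3 + 0j, 1 + 1j]

# Hilbert space adjoint, (7.2.2): (Mx, y) = (x, M^H y).
hilbert_gap = abs(inner(matvec(M, x), y) - inner(x, matvec(conj_transpose(M), y)))

# Banach space adjoint, (7.1.1): y(Mx) = (M^T y)(x).
banach_gap = abs(pairing(y, matvec(M, x)) - pairing(matvec(transpose(M), y), x))

# Using the plain transpose in the scalar product identity fails for this M.
wrong_gap = abs(inner(matvec(M, x), y) - inner(x, matvec(transpose(M), y)))

print(hilbert_gap, banach_gap, wrong_gap)
```

The first two gaps vanish up to rounding, while the third does not, which is exactly the difference between the two notions of adjoint when $\mathbb{K} = \mathbb{C}$.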


If $A \in L(H) := L(H,H)$, then $A^{**} := (A^*)^* = A$. Indeed, we have

$$(Ax,y) = (x,A^*y) = \overline{(A^*y,x)} = \overline{(y,A^{**}x)} = (A^{**}x,y) \quad \forall x, y \in H ,$$

which proves the assertion.

7.2.1 The Case of Compact Operators

Denote by $K(H) := K(H,H)$ the space of compact linear operators from $H$ into itself. This is a closed subspace of $L(H) := L(H,H)$ with respect to the operator norm, hence $K(H)$ is a Banach space with respect to this norm (see Theorem 4.11).

Theorem 7.3. If $(H,(\cdot,\cdot),\|\cdot\|)$ is a Hilbert space and $A \in K(H)$, then the nullspace of $I - A$, denoted $N = N(I - A)$, is a finite dimensional subspace of $H$, where $I$ denotes the identity operator of $H$.

Proof. Obviously $N$ is a (closed) linear subspace of $(H,\|\cdot\|)$. Let $Q$ be a bounded subset of $N$. Since $A$ is compact and $Q = AQ$, we deduce that $Q$ is relatively compact in $(N,\|\cdot\|)$. According to Theorem 2.24, $N$ is finite dimensional.

Theorem 7.4 (Schauder¹). If $(H,(\cdot,\cdot),\|\cdot\|)$ is a Hilbert space and $A \in K(H)$, then $A^* \in K(H)$, too.

Proof. Let $r > 0$ be arbitrary but fixed. Since $A^* \in L(H)$, the set $A^*B(0,r)$ is bounded: $\|x\| < r \implies \|A^*x\| \le r\|A^*\|$. As $A$ is compact, it follows that for any sequence $(x_n)_{n\ge1}$ in $B(0,r)$ the sequence $((A \circ A^*)x_n)_{n\ge1}$ has a convergent subsequence, say $((A \circ A^*)x_{n_k})_{k\ge1}$. We also have

$$
\begin{aligned}
\|A^*x_{n_k} - A^*x_{n_j}\|^2 &= \big(A^*(x_{n_k} - x_{n_j}),\ A^*(x_{n_k} - x_{n_j})\big) \\
&= \big(x_{n_k} - x_{n_j},\ A(A^*(x_{n_k} - x_{n_j}))\big) \\
&\le 2r\,\|(A \circ A^*)x_{n_k} - (A \circ A^*)x_{n_j}\| ,
\end{aligned}
$$

so $(A^*x_{n_k})_{k\ge1}$ is a Cauchy sequence, hence convergent.

¹ Juliusz Pawel Schauder, Polish mathematician, 1899–1943.


Remark 7.5. Let $A \in L(H)$. Then $A$ is compact if and only if $A^*$ is compact. This follows from Schauder's Theorem above combined with $(A^*)^* = A$.

Remark 7.6. If $A, B \in L(H)$ and at least one is compact, then $A \circ B$ is compact as well.

We continue with an important result, essentially due to Fredholm,² that provides a necessary and sufficient condition for an operator equation involving a compact linear operator to have a solution.

Theorem 7.7 (Fredholm). Let $(H,(\cdot,\cdot),\|\cdot\|)$ be a Hilbert space and let $A \in K(H)$. The equation $x - A^*x = f$ has a solution if and only if $f \in N^\perp$, where $N = N(I - A)$ (the nullspace of $I - A$).

Corollary 7.8. If $(H,(\cdot,\cdot),\|\cdot\|)$ is a Hilbert space and $A \in K(H)$, then the equation $x - Ax = f$ has a solution if and only if $f \in N(I - A^*)^\perp$.

Proof. Use Theorem 7.7 with $A^*$ instead of $A$.

In order to prove Fredholm's Theorem, we need the following lemma.

Lemma 7.9. Let $(H,(\cdot,\cdot),\|\cdot\|)$ be a Hilbert space and let $A \in K(H)$. Then there exists a constant $C > 0$ such that

$$C\|x\| \le \|(I - A)x\| \quad \forall x \in N^\perp , \tag{7.2.3}$$

where $N = N(I - A)$.

Proof. Assume by contradiction that (7.2.3) is not true, i.e., for all $n \in \mathbb{N}$ there exists an $x_n \in N^\perp$ such that $\|x_n\| = 1$ and

$$\|(I - A)x_n\| < \frac{1}{n} .$$

Therefore,

$$x_n - Ax_n \to 0 . \tag{7.2.4}$$

As $A$ is compact there is a subsequence of $(x_n)_{n\ge1}$, say $(x_{n_k})_{k\ge1}$, such that $(Ax_{n_k})_{k\ge1}$ is convergent. By (7.2.4) we deduce that $(x_{n_k})_{k\ge1}$ is also convergent, and its limit $x \in N^\perp$ (since $N^\perp$ is closed). Using again (7.2.4), we infer that $x - Ax = 0$, i.e., $x \in N$. Since $N \cap N^\perp = \{0\}$, we have $x = 0$, which contradicts $\|x_n\| = 1$ $\forall n \ge 1$ (which forces $\|x\| = 1$).

² Erik Ivar Fredholm, Swedish mathematician, 1866–1927.


Proof of Fredholm's Theorem. Necessity. Assume that the equation $x - A^*x = f$ has a solution $x \in H$. Then, for all $y \in N$, we have

$$(f,y) = (x,y) - (A^*x,y) = (x,y) - (x,Ay) = (x,\underbrace{(I - A)y}_{=0}) = 0 .$$

Therefore $f \in N^\perp$.

Sufficiency. Assume $f \in N^\perp$. Since $N^\perp$ is a closed subspace of $(H,\|\cdot\|)$, $N^\perp$ is a Hilbert space with the same scalar product and norm. According to Lemma 7.9, $\|\cdot\|$ is equivalent (on $N^\perp$) to the norm defined by the scalar product

$$\langle x,y\rangle = (Tx,Ty) \quad \forall x, y \in N^\perp ,$$

where $T = I - A$. Since the functional $x \mapsto (x,f)$ is linear and continuous on $N^\perp$, it follows by the Riesz Representation Theorem that there exists $x_f \in N^\perp$ such that

$$(x,f) = \underbrace{\langle x,x_f\rangle}_{=(Tx,Tx_f)} \quad \forall x \in N^\perp . \tag{7.2.5}$$

In fact, (7.2.5) holds for all $x \in H$, since $x = x' + x''$, with $x' \in N$, $x'' \in N^\perp$. Denoting $\tilde{x} = Tx_f$, we can write (see (7.2.5) extended to $H$)

$$\underbrace{(Tx,\tilde{x})}_{=(x,\,\tilde{x} - A^*\tilde{x})} = (x,f) \quad \forall x \in H ,$$

so

$$\tilde{x} - A^*\tilde{x} = f .$$

The following result provides some information that supplements Theorem 7.7.

Theorem 7.10. Let $(H,(\cdot,\cdot),\|\cdot\|)$ be a Hilbert space and let $A \in K(H)$. Then,

$$R(I - A) = H \iff N = \{0\} \iff N^* = \{0\} \iff R(I - A^*) = H ,$$

where $N = N(I - A)$, $N^* = N(I - A^*)$, and $R(I - A)$, $R(I - A^*)$ denote the ranges of $I - A$, $I - A^*$.


Proof. Keeping in mind Theorem 7.7 and Corollary 7.8, it suffices to prove that

$$R(I - A) = H \iff R(I - A^*) = H . \tag{7.2.6}$$

Assume $R(I - A) = H$. Let us prove that $N = \{0\}$. Assume by way of contradiction that $N \ne \{0\}$, i.e., there exists an $x_0 \in N$, $x_0 \ne 0$. As $R(I - A) = H$ we can construct a sequence $(x_n)_{n\ge1}$ in $D(A) = H$ such that

$$Tx_n = x_{n-1} \quad \forall n \ge 1 ,$$

where $T := I - A$. We have

$$T^nx_n = x_0 \ne 0, \quad \text{and} \quad T^{n+1}x_n = 0 ,$$

where $T^k := T \circ T \circ \cdots \circ T$ ($k$ factors). Hence, denoting $H_n = N(T^n)$, we have that $H_n$ is a proper linear subspace of $H_{n+1}$ for all $n \in \mathbb{N}$. According to Theorem 7.3, every $H_n$ is a finite dimensional space, hence closed, since

$$T^n = (I - A)^n = I - \underbrace{\sum_{k=1}^{n} (-1)^{k+1}\binom{n}{k}A^k}_{\text{compact operator}} .$$

By Lemma 2.25 there exists a sequence $(u_n)_{n\ge1}$ such that

$$u_n \in H_{n+1}, \quad \|u_n\| = 1, \quad \|u_n - u\| \ge \frac{1}{2} \quad \forall u \in H_n .$$

Since for $1 \le m < n$

$$T^n(Tu_n + Au_m) = T^{n+1}u_n + AT^nu_m = 0 ,$$

we have $Tu_n + Au_m \in H_n$, and

$$\|Au_n - Au_m\| = \|u_n - (Tu_n + Au_m)\| \ge \frac{1}{2} .$$

Thus the sequence $(Au_n)_{n\ge1}$ cannot have Cauchy (hence convergent) subsequences. This contradicts the fact that $A$ is compact combined with $\|u_n\| = 1$ for all $n \ge 1$. Therefore, $N = \{0\}$, which (by Theorem 7.7) implies that $R(I - A^*) = H$. Thus we have proved the implication

$$R(I - A) = H \ \Rightarrow\ R(I - A^*) = H .$$

The converse implication follows by replacing $A$ with $A^*$.


Remark 7.11. From Corollary 7.8 and Theorem 7.10 we deduce that if the equation $x - Ax = f$ has a solution $u_f$ for all $f \in H$, then $u_f$ is unique (since $N(I - A) = \{0\}$). So we can now state the so-called Fredholm alternative regarding the equation $x - Ax = f$ with $A \in K(H)$, namely one of the following must hold:

• for every $f \in H$ the equation $x - Ax = f$ has a unique solution (equivalently, $N(I - A) = \{0\}$);

• $N(I - A) \ne \{0\}$, in which case the equation $x - Ax = f$ is solvable if and only if $f \perp N(I - A^*)$ (i.e., $f$ satisfies $m$ orthogonality relations, where $m = \dim N(I - A^*) = \dim N(I - A)$).

We shall later apply Fredholm's alternative to a class of integral equations that are named after him.

Remark 7.12. In fact, the above theory is valid in a general Banach space $H$ (see, e.g., [6, Chapter 6] or [15, Chapter 5]).
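In finite dimensions every linear operator is compact, so the alternative can be seen on a $2 \times 2$ real matrix. The following toy sketch is only an illustration: the matrix $A$, the hand-computed nullspace vectors, and the helper names are assumptions made for this example. Here $N(I - A) \ne \{0\}$, so we are in the second case: $x - Ax = f$ is solvable exactly when $f$ is orthogonal to $N(I - A^*)$, and solutions are then not unique.

```python
def matvec(m, x):
    return [sum(m[i][j] * x[j] for j in range(len(x))) for i in range(len(m))]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# Illustrative matrix: I - A = [[0, -1], [0, 1]] is singular.
A = [[1.0, 1.0], [0.0, 0.0]]
I_MINUS_A = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(2)] for i in range(2)]
I_MINUS_A_STAR = [[I_MINUS_A[j][i] for j in range(2)] for i in range(2)]  # real case: transpose

# Nullspaces computed by hand for this particular A.
N_VEC = [1.0, 0.0]        # spans N(I - A)
N_STAR_VEC = [1.0, 1.0]   # spans N(I - A*)

def is_solvable(f, tol=1e-12):
    """Fredholm condition: f must be orthogonal to N(I - A*)."""
    return abs(dot(f, N_STAR_VEC)) < tol

def particular_solution(f):
    """One solution of x - Ax = f; valid only when is_solvable(f) holds."""
    return [0.0, f[1]]

f_good = [-3.0, 3.0]   # orthogonal to (1, 1): solvable
f_bad = [1.0, 0.0]     # not orthogonal to (1, 1): no solution

x0 = particular_solution(f_good)
x1 = [x0[0] + 5.0 * N_VEC[0], x0[1] + 5.0 * N_VEC[1]]   # another solution

print(is_solvable(f_good), matvec(I_MINUS_A, x0), matvec(I_MINUS_A, x1))
print(is_solvable(f_bad))
```

For this $A$ the range of $I - A$ consists of the vectors $(-c, c)^T$, which is precisely the orthogonal complement of $(1,1)^T$, and $m = 1$ in the notation of Remark 7.11.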

7.3 Symmetric Operators and Self-adjoint Operators

We begin this section with the following definition.

Definition 7.13. Let $(H,(\cdot,\cdot),\|\cdot\|)$ be a Hilbert space and let $A : D(A) \subset H \to H$ be a densely defined, linear operator.

(a) $A$ is called symmetric if $A \subset A^*$, i.e.,

$$(Ax,y) = (x,Ay) \quad \forall x, y \in D(A) ;$$

(b) $A$ is called self-adjoint if $A = A^*$, i.e., $A \subset A^*$ and $A^* \subset A$.

Obviously, if $D(A) = H$ then $A$ is symmetric if and only if it is self-adjoint, and in this case $A$ is closed (by Theorem 7.1), hence $A \in L(H)$ (by the Closed Graph Theorem).

Example 1. Let $X = L^2(a,b;\mathbb{K})$, where $-\infty < a < b < +\infty$, and let $A : X \to X$ be defined by

$$(Af)(t) = \int_a^b k(t,s)f(s)\,ds, \quad a \le t \le b ,$$

where $k \in C([a,b] \times [a,b];\mathbb{K})$. The space $X$ equipped with the usual scalar product and norm is a Hilbert space and $A \in L(X)$. Moreover,


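it is easy to see (by using the Arzelà–Ascoli Criterion) that $A \in K(X)$. Note that for all $f, g \in X$ we have

$$
\begin{aligned}
(Af,g)_{L^2(a,b;\mathbb{K})} &= \int_a^b (Af)(t) \cdot \overline{g(t)}\,dt \\
&= \int_a^b \left( \int_a^b k(t,s)f(s)\,ds \right) \cdot \overline{g(t)}\,dt \\
&= \int_a^b f(s) \cdot \left( \int_a^b k(t,s)\overline{g(t)}\,dt \right) ds \\
&= \int_a^b f(t) \cdot \overline{\left( \int_a^b \overline{k(s,t)}\,g(s)\,ds \right)}\,dt ,
\end{aligned}
$$

thus

$$(A^*g)(t) = \int_a^b \overline{k(s,t)} \cdot g(s)\,ds \quad \forall g \in X .$$

Obviously,

$$A = A^* \iff k(t,s) = \overline{k(s,t)} \quad \forall t, s \in [a,b] .$$

Example 2. Let $X = L^2(\mathbb{R};\mathbb{K})$ with its usual scalar product and Hilbertian norm, and let $A : D(A) \subset X \to X$ be given by

$$D(A) = \{f \in X;\ tf(t) \in X\} , \quad (Af)(t) := tf(t) \quad \forall t \in \mathbb{R},\ f \in D(A) .$$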

It is easily seen that $A$ is self-adjoint.

Example 3. Let $H = L^2(\Omega)$ be equipped with the usual scalar product and norm, where $\emptyset \ne \Omega \subset \mathbb{R}^N$, $N \ge 2$, is a bounded domain with smooth boundary. Let $A : D(A) \subset H \to H$, where

$$D(A) = C_0^\infty(\Omega), \quad Au = \Delta u \quad \forall u \in D(A) .$$

Obviously, $D(A)$ is dense in $H$. By Green's identity, we have

$$\int_\Omega v\,\Delta u\,dx = \int_\Omega u\,\Delta v\,dx \quad \forall u \in D(A) = C_0^\infty(\Omega),\ v \in H^2(\Omega) .$$

Thus $H^2(\Omega) \subset D(A^*)$ and $A^*v = \Delta v$ for all $v \in D(A)$. Therefore $A$ is symmetric but not self-adjoint because $D(A)$ is a proper subset of $D(A^*)$. If the domain of $A = \Delta$ is extended to $H_0^1(\Omega) \cap H^2(\Omega)$ then $A$ becomes self-adjoint. More precisely, we have the following proposition.


Proposition 7.14. Let $H = L^2(\Omega)$ be equipped with the usual scalar product $(\cdot,\cdot)$ and the induced norm $\|\cdot\|$, where $\emptyset \ne \Omega \subset \mathbb{R}^N$, $N \ge 2$, is a bounded domain with smooth boundary. Let $B : D(B) \subset H \to H$ be defined by $D(B) = H^2(\Omega) \cap H_0^1(\Omega)$, $Bu = \Delta u$ for all $u \in D(B)$. Then $B$ is self-adjoint.

Proof. Clearly, $D(B)$ is dense in $H$ and, by Green's formula, we have for all $u, v \in D(B)$

$$(Bu,v) = \int_\Omega \Delta u \cdot v\,dx = \int_\Omega u \cdot \Delta v\,dx = (u,Bv),$$

hence $D(B) \subset D(B^*)$ and $B^*v = Bv$ for all $v \in D(B)$ (i.e., $B$ is symmetric). Let us prove that $D(B^*) = D(B)$. Using the Lax–Milgram Theorem, we can see that $R(I - B) = H$. In addition, since $-B$ is positive, $I - B$ is invertible and $J := (I - B)^{-1} \in L(H)$. As $B$ is symmetric, so is $J$. Now, let $v$ be an arbitrary function in $D(B^*)$. Denoting $g = v - B^*v$, we have

$$(g,u) = (v,u - Bu) \quad \forall u \in D(B).$$

Therefore, for every $h \in H$ (taking $u = Jh$), we have

$$(g,Jh) = (v,h) \implies (Jg,h) = (v,h),$$

so $v = Jg \in R(J) = D(B)$.

We know that, for every bijective $A \in L(H)$, $A^*$ is also bijective and $(A^*)^{-1} = (A^{-1})^*$. In fact, the following more general result holds.

Theorem 7.15. Let $(H,(\cdot,\cdot),\|\cdot\|)$ be a Hilbert space and let $A : D(A) \subset H \to H$ be a symmetric linear operator, with $\overline{R(A)} = H$. Then $(A^{-1})^* = (A^*)^{-1}$, where all operations are permitted. If, in addition, $A$ is self-adjoint, then so is $A^{-1}$.

Proof. $A$ is injective. Indeed, if $u \in D(A)$ and $Au = 0$ then

$$0 = (Au,v) = (u,Av) \quad \forall v \in D(A) ,$$

which implies $u = 0$ since $R(A)$ is dense in $H$.


$A^*$ is also injective because if $v \in D(A^*)$ and $A^*v = 0$, then

$$(Au,v) = (u,A^*v) = 0 \quad \forall u \in D(A),$$

and thus $v = 0$ since $\overline{R(A)} = H$. Therefore, $A^{-1}$ and $(A^*)^{-1}$ exist with $D(A^{-1}) = R(A)$, $D\big((A^*)^{-1}\big) = R(A^*)$. Since $D(A^{-1})$ is dense in $H$, $(A^{-1})^*$ exists. Denote $B := (A^{-1})^*$. We have

$$(u,v) = (A^{-1}(Au),v) = (Au,Bv) \quad \forall u \in D(A),\ v \in D(B), \tag{7.3.7}$$

and

$$(z,w) = (A(A^{-1}z),w) = (A^{-1}z,A^*w) \quad \forall z \in D(A^{-1}) = R(A),\ w \in D(A^*) . \tag{7.3.8}$$

By (7.3.7),

$$Bv \in D(A^*) \ \text{ and } \ v = A^*(Bv) \quad \forall v \in D(B) . \tag{7.3.9}$$

On the other hand, by (7.3.8),

$$A^*w \in D\big((A^{-1})^*\big) = D(B) \ \text{ and } \ w = \underbrace{(A^{-1})^*}_{=B}(A^*w) \quad \forall w \in D(A^*) . \tag{7.3.10}$$

From (7.3.9) and (7.3.10) we derive

$$B = (A^*)^{-1} \iff (A^{-1})^* = (A^*)^{-1} .$$

If $A = A^*$, then $(A^{-1})^* = A^{-1}$.
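The identity $(A^{-1})^* = (A^*)^{-1}$ is easy to test in the simplest setting of an invertible matrix on $\mathbb{C}^2$, where the adjoint is the conjugate transpose. This is only a finite dimensional sanity check with arbitrarily chosen matrices and hypothetical helper names; it does not touch the unbounded case treated in Theorem 7.15.

```python
def conj_transpose(m):
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def inverse_2x2(m):
    """Inverse of a 2x2 complex matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def max_gap(p, q):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

# Arbitrary invertible matrix (determinant 6 + 2j).
M = [[2 + 1j, 1 + 0j], [1j, 3 + 0j]]
gap = max_gap(conj_transpose(inverse_2x2(M)), inverse_2x2(conj_transpose(M)))

# Self-adjoint (Hermitian) example with determinant 4: its inverse is again Hermitian.
S = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]
S_inv = inverse_2x2(S)
hermitian_gap = max_gap(S_inv, conj_transpose(S_inv))

print(gap, hermitian_gap)
```

Both gaps are zero up to rounding, matching the two statements of the theorem in this elementary case.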

7.4 Exercises

1. Let $X$, $Y$ be Banach spaces. Let $A : D(A) \subset X \to Y$ be a densely defined, closed linear operator and $B \in L(X,Y)$. Define $T : D(T) = D(A) \subset X \to Y$ by $Tx = Ax + Bx$ $\forall x \in D(A)$. Prove that
(i) $T$ is a closed operator;
(ii) $D(T^*) = D(A^*)$ and $T^* = A^* + B^*$.


2. Let $X$, $Y$ be Banach spaces and let $A : D(A) \subset X \to Y$ be a densely defined linear operator. Show that $A^*$ is injective if and only if $\mathrm{Cl}\,R(A) = Y$.

3. Let $H$ be a Hilbert space. If $A : D(A) \subset H \to H$ is a symmetric linear operator with $R(A) = H$, then $A$ is self-adjoint, i.e., $A = A^*$.

4. Let $H$ be a Hilbert space, with the scalar product denoted $(\cdot,\cdot)$, and let $A, B \in L(H)$. Show that

$$A^*A = B^*B \iff (Ax,Ay) = (Bx,By) \quad \forall x, y \in H.$$

5. Let $H$ be a Hilbert space. For any $A \in L(H)$, show that $\|A^*A\| = \|A\|^2$.

6. Let $(H,(\cdot,\cdot))$ be a Hilbert space over $\mathbb{C}$ and let $A \in L(H)$. Prove that $A$ is symmetric (hence self-adjoint) $\iff$ $(Ax,x) \in \mathbb{R}$ $\forall x \in H$.

7. Let $(H,(\cdot,\cdot))$ be a Hilbert space over $\mathbb{R}$. Prove that for any $a > 0$ and any $A \in L(H)$ the operator $T = I + aA^*A$ is invertible and $T^{-1} \in L(H)$, where $I$ denotes the identity operator on $H$.

8. Let $H$ be a Hilbert space over $\mathbb{C}$ and let $A \in L(H)$ be a symmetric (hence self-adjoint) operator. Denote $T = A + iI$, where $i^2 = -1$ and $I$ is the identity operator on $H$. Prove that
(a) $T$ is a normal operator (i.e., $T^*T = TT^*$);
(b) $T$ is invertible and $T^{-1} \in L(H)$.

9. Let $H$ be a Hilbert space over $\mathbb{C}$. For $A \in L(H)$ and $a_0, a_1, \dots, a_n \in \mathbb{C}$, denote by $P(A)$ the operator polynomial $a_0I + a_1A + \cdots + a_nA^n$, where $I$ stands for the identity operator.
(j) If $A$ is symmetric (hence self-adjoint) and $a_0, a_1, \dots, a_n \in \mathbb{R}$, then $P(A)$ is symmetric, too;
(jj) If $A$ is a normal operator (i.e., $A^*A = AA^*$), then so is $P(A)$.


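10. Let $H_1$, $H_2$ be Hilbert spaces. Define $H = H_1 \times H_2$ to be the Hilbert space consisting of all pairs $(x_1,x_2)^T$, $x_1 \in H_1$ and $x_2 \in H_2$, with

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \end{pmatrix}, \qquad \alpha\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \alpha x_1 \\ \alpha x_2 \end{pmatrix} \quad \forall \alpha \in \mathbb{K},$$

and a scalar product defined by

$$\left( \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \right) = (x_1,y_1)_{H_1} + (x_2,y_2)_{H_2} \quad \forall \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \in H.$$

Given $A_1 \in L(H_1)$ and $A_2 \in L(H_2)$, define the matrix operator

$$A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}.$$

Prove that $A \in L(H)$ and $\|A\| = \max\{\|A_1\|, \|A_2\|\}$. Find $A^*$.

11. Let $A \in L(H)$, where $H$ is a Hilbert space over $\mathbb{C}$. As in the previous exercise, define $Y = H \times H$ to be the Hilbert space consisting of all pairs $(x_1,x_2)^T$, $x_1 \in H$ and $x_2 \in H$, with the corresponding operations and scalar product. Define on $Y$ the matrix operator $B$ by

$$B = \begin{pmatrix} 0 & iA \\ -iA^* & 0 \end{pmatrix},$$

where $i = \sqrt{-1}$. Prove that $B \in L(Y)$, $\|B\| = \|A\|$, and that $B^* = B$.

Now, assume that $A : D(A) \subset H \to H$ is a linear, densely defined operator. Prove that $B : D(A^*) \times D(A) \subset Y \to Y$ is symmetric.

12. Let $H$ be a Hilbert space and let $A \in L(H)$ satisfy $\|A\| \le 1$. Prove that $Ax = x$ if and only if $A^*x = x$.

13. Let $H$ be the real Hilbert space $L^2(0,1)$ equipped with the usual scalar product and induced norm. Define $A : D(A) \subset H \to H$ by

$$D(A) = \{u \in H^1(0,1);\ u(0) = 0\}, \quad Au = u' .$$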


(a) Show that $D(A)$ is dense in $H$ and that $A$ is closed;
(b) Compute $N(A)$ and $R(A)$;
(c) Determine $A^*$ and show that $D(A^*)$ is dense in $H$.

14. Let $H$ be the real Hilbert space $L^2(0,1)$ equipped with the usual scalar product and induced norm. Let $A : D(A) \subset H \to H$ be the operator defined by $Au = u'$, where
(a) $D(A) = H_0^1(0,1)$;
(b) $D(A) = \{u \in H^1(0,1);\ u(0) = \alpha u(1)\}$ for some $\alpha \in \mathbb{R} \setminus \{0\}$.
Determine $N(A)$, $R(A)$, $A^*$, $N(A^*)$, $R(A^*)$ in each of these two cases.

15. Let $H$ be the real Hilbert space $L^2(0,1)$ equipped with the usual scalar product and induced norm. Let $A : D(A) \subset H \to H$, $Au = u''$, where $D(A)$ is specified below. Determine $A^*$ in each of the following cases:
(a) $D(A) = \{u \in H^2(0,1);\ u(0) = u(1) = 0\}$;
(b) $D(A) = \{u \in H^2(0,1);\ u(0) = u(1) = u'(0) = u'(1) = 0\}$;
(c) $D(A) = \{u \in H^2(0,1);\ u(0) = u'(1) = 0\}$;
(d) $D(A) = \{u \in H^2(0,1);\ u(0) = u(1)\}$.

16. Let $H = l^2(\mathbb{C})$ be the complex Hilbert space of all sequences of complex numbers $x = (x_n)_{n\in\mathbb{N}}$ satisfying $\sum_{n=1}^{\infty} |x_n|^2 < \infty$, with the usual scalar product

$$\langle x,y\rangle = \sum_{n=1}^{\infty} x_n\bar{y}_n \quad \forall x = (x_n),\ y = (y_n) \in H,$$

and the induced norm, denoted $\|\cdot\|$. Define the operators $A : H \to H$ and $B : D(B) \subset H \to H$ by

$$A(x_n) = (x_{p+1}, x_{p+2}, x_{p+3}, \dots), \ \text{for a given } p \in \mathbb{N},$$

$$B(x_n) = \left( \frac{n^\alpha i^n}{1+n}\,x_n \right), \ \text{for a given } \alpha \in \mathbb{R}.$$


(a) Show that $A \in L(H)$ and compute $\|A\|$ and $A^*$;
(b) Show that if $\alpha \le 1$ then $D(B) = H$ and $B \in L(H)$; compute $\|B\|$;
(c) For $\alpha > 1$ find (the maximal domain) $D(B)$ and prove that $D(B)$ is dense in $H$;
(d) Compute $B^*$ for all $\alpha \in \mathbb{R}$;
(e) Check whether $A$ and $B$ with $\alpha \le 1$ are normal operators.

Chapter 8

Eigenvalues and Eigenvectors

In this chapter we present the main results regarding eigenvalues and eigenvectors of compact and/or symmetric operators. This includes the Hilbert–Schmidt Theorem and its applications to the main eigenvalue problems for the Laplacian. Throughout this chapter we consider linear operators defined on linear spaces over $\mathbb{K}$, where $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, unless otherwise specified.

© Springer Nature Switzerland AG 2019. G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_8

8.1 Definition and Examples

We first introduce the concept of an eigenpair (i.e., an eigenvector together with the corresponding eigenvalue).

Definition 8.1. Let $X$ be a linear space. A vector $u \in X \setminus \{0\}$ is said to be an eigenvector of a linear operator $A : X \to X$ if there exists $\lambda \in \mathbb{K}$ such that $Au = \lambda u$. Such a $\lambda$ is called an eigenvalue corresponding to $u$, and the pair $(u,\lambda)$ is called an eigenpair.

Remark 8.2. For a given eigenvector $u$ of $A$ the corresponding eigenvalue $\lambda$ is unique. Indeed,

$$\lambda u = Au = \lambda_1u \implies (\lambda - \lambda_1)u = 0 \implies \lambda - \lambda_1 = 0,$$

since $u \ne 0$.

For a given eigenvalue $\lambda$ of $A$, the set of the corresponding eigenvectors is $N(\lambda I - A) \setminus \{0\}$, where $I$ is the identity operator of $X$.

Remark 8.3. Note also that a set of eigenvectors $u_1, u_2, \dots, u_m$ of $A$ corresponding to distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_m$ ($m \in \mathbb{N}$) is a linearly independent system. The proof is by induction.

Example 1. Let $X = \mathbb{C}^n$, $A : X \to X$, $Au = Mu$ $\forall u = (u_1,\dots,u_n)^T \in X$, where $M = (a_{ij})$ is an $n \times n$ matrix with entries $a_{ij} \in \mathbb{C}$. Then, $\lambda$ is an eigenvalue of $A$ if and only if $\det(\lambda I - M) = 0$, where $I$ is the $n \times n$ identity matrix.

Example 2. Let $H = l^2(\mathbb{C})$ be the complex Hilbert space of all sequences of complex numbers $x = (x_n)_{n\in\mathbb{N}}$ satisfying $\sum_{n=1}^{\infty} |x_n|^2 < \infty$, with the usual scalar product

$$\langle x,y\rangle = \sum_{n=1}^{\infty} x_n\bar{y}_n \quad \forall x = (x_n),\ y = (y_n) \in H,$$

and the induced norm, denoted $\|\cdot\|$. Define the linear operator $A$ by

$$A(x_n) = \left( \frac{2}{1}x_2,\ \frac{3}{2}x_3,\ \dots,\ \frac{n}{n-1}x_n,\ \dots \right).$$

We have for all $x = (x_n) \in H$

$$\|Ax\|^2 = \sum_{n=2}^{\infty} |n(n-1)^{-1}x_n|^2 \le 4\sum_{n=2}^{\infty} |x_n|^2 \le 4\|x\|^2 ,$$

so $A \in L(H)$ and $\|A\| \le 2$. In fact, $\|A\| = 2$, since for $\tilde{x} = (0,1,0,0,\dots)$ we have $\|\tilde{x}\| = 1$ and $\|A\tilde{x}\| = 2$. Consider the equation $Ax = \lambda x$, or, equivalently,

$$\frac{n+1}{n}\,x_{n+1} = \lambda x_n, \quad n = 1, 2, \dots \tag{8.1.1}$$

Observe that $\lambda = 0$ is an eigenvalue of $A$ with eigenvectors $(x_1,0,0,\dots)$, $x_1 \in \mathbb{C} \setminus \{0\}$. If $\lambda \ne 0$, then it follows easily from (8.1.1) that

$$x_n = \frac{1}{n}\,\lambda^{n-1}x_1, \quad n = 1, 2, \dots$$

In order for $(x_n)$ to be an eigenvector, we choose $x_1 \ne 0$. The condition $(x_n) \in H$ is equivalent to $|\lambda| \le 1$. So the set $\{\lambda \in \mathbb{C};\ |\lambda| \le 1\}$ is the set of all eigenvalues of $A$.
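The computation in Example 2 can be reproduced on truncated sequences. The sketch below is only illustrative: the sample values of $\lambda$, the truncation lengths, and the function names are assumptions. It builds $x_n = \lambda^{n-1}/n$ (taking $x_1 = 1$), checks the recurrence (8.1.1), and compares the partial sums of $\sum |x_n|^2$ for $|\lambda| \le 1$ and $|\lambda| > 1$.

```python
import math

def candidate(lam, n_terms):
    """First n_terms entries of x_n = lam**(n-1) / n, i.e. the choice x_1 = 1."""
    return [lam ** (n - 1) / n for n in range(1, n_terms + 1)]

def recurrence_residual(lam, x):
    """Largest violation of ((n+1)/n) * x_{n+1} = lam * x_n over the truncation."""
    return max(abs((n + 1) / n * x[n] - lam * x[n - 1]) for n in range(1, len(x)))

def partial_norm_sq(x):
    """Partial sum of |x_n|^2."""
    return sum(abs(v) ** 2 for v in x)

lam_inside = 0.5 + 0.5j   # |lambda| < 1
lam_circle = 1.0 + 0.0j   # |lambda| = 1
lam_outside = 1.1 + 0.0j  # |lambda| > 1

residual = recurrence_residual(lam_inside, candidate(lam_inside, 200))
norm_circle = partial_norm_sq(candidate(lam_circle, 1000))
norm_outside = partial_norm_sq(candidate(lam_outside, 200))

print(residual, norm_circle, math.pi ** 2 / 6, norm_outside)
```

For $|\lambda| = 1$ the partial sums stay below $\sum 1/n^2 = \pi^2/6$, so the sequence lies in $l^2(\mathbb{C})$, whereas for $|\lambda| > 1$ the terms $|\lambda|^{2(n-1)}/n^2$ grow without bound and no eigenvector exists.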

8.2

Main Results

We begin this section with a general result about the eigenvalues of a compact linear operator.

Theorem 8.4. Let $(X, \|\cdot\|)$ be a normed space and let $A \in K(X)$ (i.e., $A : X \to X$ is linear and sends bounded sets to relatively compact sets). Then $A$ has a countable set of eigenvalues, and the only possible accumulation point of the set of eigenvalues is $\lambda = 0$. Moreover, for any eigenvalue $\lambda \neq 0$, $\dim N(\lambda I - A) < \infty$ (one says that $\lambda$ has a finite rank or finite multiplicity).

Proof. The proof is trivial if $X$ is finite dimensional, so let us assume that $X$ is infinite dimensional. To prove the first statement of the theorem, it suffices to show that for all $r > 0$ the set $\{\lambda \in \mathbb{K};\ |\lambda| \ge r\}$ contains a finite number of eigenvalues. Suppose not, i.e., there exist $r_0 > 0$ and infinitely many distinct eigenvalues $\lambda_1, \lambda_2, \dots$ such that $|\lambda_n| \ge r_0$ $\forall n \ge 1$. Then there exists a sequence $u_n \in X \setminus \{0\}$ such that $Au_n = \lambda_n u_n$ $\forall n \ge 1$, and we may assume that $\|u_n\| = 1$ $\forall n \ge 1$. Because the $\lambda_n$'s are distinct, $B_n = \{u_1, u_2, \dots, u_n\}$ are linearly independent systems. Set $X_n = \operatorname{Span} B_n$, $n = 1, 2, \dots$ By Lemma 2.25, there exists $y_n \in X_n \setminus X_{n-1}$ such that $\|y_n\| = 1$ $\forall n \ge 2$ and

$$\|y_n - v\| \ge \frac{1}{2} \ \ \forall v \in X_{n-1},\ n \ge 2 \implies \|y_n - y_m\| \ge \frac{1}{2} \ \ \forall n \neq m.$$

Thus $(y_n)$ has no Cauchy (hence no convergent) subsequence. On the other hand, assuming that $1 \le m < n$, we have

$$Ay_n - Ay_m = \underbrace{\lambda_n y_n}_{\in X_n \setminus X_{n-1}} - \underbrace{\lambda_m y_m}_{\in X_m} + \underbrace{(Ay_n - \lambda_n y_n)}_{\in X_{n-1}} - \underbrace{(Ay_m - \lambda_m y_m)}_{\in X_m \subset X_{n-1}} = \lambda_n y_n - v_{mn}$$


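with $v_{mn} \in X_{n-1}$, because, writing $y_n = \sum_{i=1}^{n} \alpha_i^n u_i$,

$$\begin{aligned}
Ay_n - \lambda_n y_n &= A\Big(\sum_{i=1}^{n} \alpha_i^n u_i\Big) - \lambda_n \sum_{i=1}^{n} \alpha_i^n u_i \\
&= \sum_{i=1}^{n} \alpha_i^n \lambda_i u_i - \lambda_n \sum_{i=1}^{n} \alpha_i^n u_i \\
&= \sum_{i=1}^{n} \alpha_i^n (\lambda_i - \lambda_n) u_i \\
&= \sum_{i=1}^{n-1} \alpha_i^n (\lambda_i - \lambda_n) u_i ,
\end{aligned}$$

which is in $X_{n-1}$. Hence we have

$$\|Ay_n - Ay_m\| = \|\lambda_n y_n - v_{mn}\| = |\lambda_n| \cdot \|y_n - \lambda_n^{-1} v_{mn}\| \ge r_0 \|y_n - \lambda_n^{-1} v_{mn}\| \ge \frac{r_0}{2},$$

so $(Ay_n)$ has no Cauchy (hence no convergent) subsequence. But $A$ is compact and $\|y_n\| = 1$ $\forall n \ge 1$, so $(Ay_n)$ must have a convergent subsequence. This contradiction shows that $\{\lambda \in \mathbb{K};\ |\lambda| \ge r\}$ contains a finite number of eigenvalues of $A$ for all $r > 0$, as claimed. The proof of the latter statement of the theorem is similar to the proof of Theorem 7.3.

Proposition 8.5. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space and let $A : H \to H$ be a symmetric (hence self-adjoint) operator. Then,
(i) every eigenvalue of $A$ is real, even if $\mathbb{K} = \mathbb{C}$;
(ii) every two eigenvectors of $A$ corresponding to distinct eigenvalues are orthogonal.

Proof. To prove (i) suppose $\lambda$ is an eigenvalue of $A$. Let $u \in H \setminus \{0\}$ be a corresponding eigenvector, i.e., $Au = \lambda u$. Then

$$(Au, u) = (\lambda u, u) = \lambda \|u\|^2, \qquad (u, Au) = (u, \lambda u) = \bar{\lambda} \|u\|^2 .$$

As $A$ is symmetric and $u \neq 0$, we infer that $\lambda = \bar{\lambda}$.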


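To prove (ii), consider two eigenpairs of $A$, $(u_1, \lambda_1)$, $(u_2, \lambda_2)$, where $\lambda_1, \lambda_2 \in \mathbb{R}$ (from (i)) and $\lambda_1 \neq \lambda_2$. We have

$$\lambda_1 (u_1, u_2) = (Au_1, u_2) = (u_1, Au_2) = \lambda_2 (u_1, u_2),$$

thus

$$\underbrace{(\lambda_1 - \lambda_2)}_{\neq 0}\,(u_1, u_2) = 0,$$

so $(u_1, u_2) = 0$.

Proposition 8.6. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space, $H \neq \{0\}$, and let $A \in L(H)$ be a symmetric operator. Then,

$$\|A\| = \sup\{\, |(Ax, x)|;\ x \in H,\ \|x\| = 1 \,\}.$$

Proof. Trivial if $A = 0$ (equivalently $\|A\| = 0$). Assume $A \neq 0$ ($\|A\| > 0$) and set

$$a = \sup\{\, |(Ax, x)|;\ x \in H,\ \|x\| = 1 \,\}.$$

Since

$$|(Ax, x)| \le \|Ax\| \cdot \|x\| \le \|A\| \cdot \|x\|^2 \quad \forall x \in H,$$

we infer that

$$a \le \|A\|. \tag{8.2.2}$$

Now, for given $b > 0$ and $x \in H$ such that $\|x\| = 1$ and $\|Ax\| > 0$, we have

$$\|Ax\|^2 = \frac{1}{4}\Big[\Big(A\big(bx + \tfrac{1}{b}Ax\big),\ bx + \tfrac{1}{b}Ax\Big) - \Big(A\big(bx - \tfrac{1}{b}Ax\big),\ bx - \tfrac{1}{b}Ax\Big)\Big]. \tag{8.2.3}$$

We also have

$$|(Av, v)| \le a\|v\|^2 \quad \forall v \in H. \tag{8.2.4}$$

Combining (8.2.3) and (8.2.4) we obtain

$$\|Ax\|^2 \le \frac{a}{4}\Big[\big\|bx + \tfrac{1}{b}Ax\big\|^2 + \big\|bx - \tfrac{1}{b}Ax\big\|^2\Big] \le \frac{a}{2}\Big[b^2\|x\|^2 + \frac{1}{b^2}\|Ax\|^2\Big],$$

so for $\|x\| = 1$ and $b^2 = \|Ax\| > 0$ we have

$$\|Ax\|^2 \le a\|Ax\|.$$

Therefore,

$$\|Ax\| \le a \quad \forall x \in H,\ \|x\| = 1 \implies \|A\| \le a.$$

This together with (8.2.2) implies $\|A\| = a$.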
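A small numerical experiment in $\mathbb{R}^2$ can make Proposition 8.6 concrete. The sketch below is only an illustration under stated assumptions: the function name `quad_sup_and_norm` and the two test matrices are hypothetical choices, and the suprema are approximated by sampling unit vectors rather than computed exactly.

```python
import math

def quad_sup_and_norm(a, samples=20000):
    """Sample unit vectors x = (cos t, sin t) in R^2 and return
    (max |(Ax, x)|, max ||Ax||) for the real 2x2 matrix a."""
    best_quad = 0.0
    best_norm = 0.0
    for i in range(samples):
        # Half a circle suffices: x and -x give the same two values.
        t = math.pi * i / samples
        x = (math.cos(t), math.sin(t))
        ax = (a[0][0] * x[0] + a[0][1] * x[1],
              a[1][0] * x[0] + a[1][1] * x[1])
        best_quad = max(best_quad, abs(ax[0] * x[0] + ax[1] * x[1]))
        best_norm = max(best_norm, math.hypot(ax[0], ax[1]))
    return best_quad, best_norm

# Hypothetical examples: S is symmetric, N is not.
S = [[2.0, 1.0], [1.0, -3.0]]
N = [[0.0, 1.0], [0.0, 0.0]]

if __name__ == "__main__":
    print(quad_sup_and_norm(S))  # the two numbers agree (up to sampling error)
    print(quad_sup_and_norm(N))  # the two numbers differ
```

For the symmetric matrix the two sampled values coincide up to sampling error, whereas for the nilpotent matrix `N` one has $(Nx, x) = \sin t \cos t$ and $\|Nx\| = |\sin t|$, so the supremum of $|(Nx, x)|$ is $1/2$ while $\|N\| = 1$.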


Note that the assumption that $A$ is symmetric in the above proposition is essential. We have the following central theorem.

Theorem 8.7 (Hilbert–Schmidt). Let $(H, (\cdot,\cdot), \|\cdot\|)$ be an infinite dimensional, separable Hilbert space and let $A : H \to H$ be a symmetric (equivalently, self-adjoint), compact linear operator, with $N(A) = \{0\}$. Then there exist a sequence of eigenvalues of $A$, $(\lambda_1, \lambda_2, \dots, \lambda_n, \dots)$, such that $(|\lambda_n|)$ is a decreasing sequence of positive numbers converging to $0$ and a complete orthonormal system (basis) in $H$ of corresponding eigenvectors $\{u_n\}_{n=1}^{\infty}$ (i.e., $Au_n = \lambda_n u_n$ for $n = 1, 2, \dots$).

Proof. We first observe that $\|A\| > 0 \iff A \neq 0$ (and $A \neq 0$ since $N(A) = \{0\}$). Let us prove that either $\|A\|$ or $-\|A\|$ is an eigenvalue of $A$. By Proposition 8.6 there exists $(v_n)_{n \ge 1}$, with $\|v_n\| = 1$ $\forall n \ge 1$, such that $|(Av_n, v_n)| \to \|A\|$. In fact, one can extract from $(v_n)$ a subsequence, again denoted $(v_n)$, such that $(Av_n, v_n)$ converges to either $\|A\|$ or $-\|A\|$, say

$$(Av_n, v_n) \to \lambda_1 := \|A\|. \tag{8.2.5}$$

Since $A$ is compact we can now take another subsequence, also denoted $(v_n)$, such that

$$Av_n \to u_1, \tag{8.2.6}$$

and this is the subsequence we keep. Now, passing to the limit in

$$0 \le \|Av_n - \lambda_1 v_n\|^2 = \|Av_n\|^2 - 2\lambda_1 (Av_n, v_n) + \lambda_1^2, \tag{8.2.7}$$

we get (see (8.2.5) and (8.2.6))

$$0 \le \|u_1\|^2 - \lambda_1^2 \implies |\lambda_1| \le \|u_1\|.$$

Hence, in particular, $u_1 \neq 0$. The converse inequality also holds, since we have

$$\|Av_n\| \le \|A\| \cdot \|v_n\| = \|A\|,$$

so by (8.2.6) $\|u_1\| \le \|A\| = |\lambda_1|$. Therefore,

$$\|u_1\| = |\lambda_1| = \|A\|. \tag{8.2.8}$$

From (8.2.7) (see also (8.2.5), (8.2.6) and (8.2.8)) we derive

$$Av_n - \lambda_1 v_n \to 0. \tag{8.2.9}$$


So, in view of (8.2.6) and (8.2.9), $(\lambda_1 v_n)$ converges to $u_1$, and thus by the continuity of $A$ we get $Au_1 = \lambda_1 u_1$, i.e., $(u_1, \lambda_1)$ is an eigenpair of $A$. We normalize without changing notation, $u_1 := |\lambda_1|^{-1} u_1$, since we want an orthonormal system of eigenvectors.

It is worth pointing out that any other eigenvalue $\lambda$ satisfies $|\lambda| \le |\lambda_1|$. Indeed, if we assume by contradiction the existence of an eigenpair $(u, \lambda)$, with $|\lambda| > |\lambda_1|$ and $\|u\| = 1$, then $|(Au, u)| = |\lambda|$, which contradicts $|\lambda_1| = \|A\|$ being the supremum from Proposition 8.6.

We now use induction to prove the existence of eigenpairs $(u_n, \lambda_n)$ for $n = 2, 3, \dots$ Denote by $Y$ the orthogonal complement of $\operatorname{Span}\{u_1\}$, i.e.,

$$Y = \{\, u \in H;\ (u, u_1) = 0 \,\}.$$

Since $H$ is infinite dimensional, so is $Y$. Moreover $Y$ is a Hilbert space (with the scalar product and norm of $H$), and is invariant under $A$ in the sense that $AY \subset Y$, because for $y \in Y$, since $A$ is symmetric,

$$(Ay, u_1) = (y, Au_1) = (y, \lambda_1 u_1) = \lambda_1 (y, u_1) = 0.$$

The restriction $A|_Y$ is not $0$, since otherwise $Y \subset N(A)$, contradicting $N(A) = \{0\}$. In fact, all the properties are inherited (symmetric, compact, and $N(A|_Y) = \{0\}$) and by the previous step we have an eigenvalue

$$\lambda_2 = \pm \sup\{\, |(Av, v)|;\ v \in Y,\ \|v\| = 1 \,\},$$

and a corresponding eigenvector

$$u_2 \in Y, \quad \|u_2\| = 1, \quad Au_2 = \lambda_2 u_2.$$

Moreover, $|\lambda_2| \le |\lambda_1|$.


Next, take

$$Z = \{\, u \in Y;\ (u, u_2) = 0 \,\} = \big(\operatorname{Span}\{u_1, u_2\}\big)^{\perp},$$

which is an infinite dimensional (Hilbert) subspace of $H$, and obtain a new eigenpair $(u_3, \lambda_3)$, with $\|u_3\| = 1$, $|\lambda_3| \le |\lambda_2|$. We may continue doing this, each time obtaining an infinite dimensional subspace. We thus construct a sequence of eigenvalues $(\lambda_n)$ such that

$$|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n| \ge \cdots, \tag{8.2.10}$$

and the corresponding sequence of eigenvectors $(u_n)$,

$$Au_n = \lambda_n u_n, \quad \|u_n\| = 1, \quad n \ge 1,$$

forms an orthonormal system by construction. Next, we prove that

$$Au = \sum_{n=1}^{\infty} \lambda_n (u, u_n) u_n \quad \forall u \in H. \tag{8.2.11}$$

Define the space

$$V_m := \{\, u \in H;\ (u, u_j) = 0,\ j = 1, \dots, m \,\} = \big(\operatorname{Span}\{u_1, \dots, u_m\}\big)^{\perp},$$

which is an infinite dimensional Hilbert space (with respect to $(\cdot,\cdot)$, $\|\cdot\|$), invariant under $A$ (i.e., $Av \in V_m$ $\forall v \in V_m$). By the previous step of our proof, there is an eigenpair $(u_{m+1}, \lambda_{m+1})$ of $A$ such that

$$|\lambda_{m+1}| = \|A|_{V_m}\| = \sup\{\, |(Av, v)|;\ v \in V_m,\ \|v\| = 1 \,\}.$$

In particular,

$$\|Av\| \le |\lambda_{m+1}| \cdot \|v\| \quad \forall v \in V_m. \tag{8.2.12}$$

Now, choose a particular

$$w_m = u - \sum_{n=1}^{m} (u, u_n) u_n$$

and notice that $w_m \in V_m$ because $(w_m, u_j) = (u, u_j) - (u, u_j) = 0$ $\forall j = 1, \dots, m$. Calculate

$$\|w_m\|^2 = \|u\|^2 - \sum_{n=1}^{m} |(u, u_n)|^2 \le \|u\|^2. \tag{8.2.13}$$


We have

$$Aw_m = Au - \sum_{n=1}^{m} (u, u_n) Au_n = Au - \sum_{n=1}^{m} \lambda_n (u, u_n) u_n,$$

so that, combining (8.2.12) and (8.2.13),

$$\|Aw_m\| \le \|A|_{V_m}\| \cdot \|w_m\| = |\lambda_{m+1}| \cdot \|w_m\| \le |\lambda_{m+1}| \cdot \|u\|. \tag{8.2.14}$$

On the other hand, $\lambda_n \to 0$. Indeed, since $(|\lambda_n|)$ is decreasing (see (8.2.10)), there exists

$$\lim_{n \to \infty} |\lambda_n| = \alpha \ge 0.$$

Suppose by way of contradiction that $\alpha > 0$. Obviously, $|\lambda_n| \ge \alpha$ for all $n \ge 1$ and so

$$\|\lambda_n^{-1} u_n\| = \frac{1}{|\lambda_n|}\|u_n\| = \frac{1}{|\lambda_n|} \le \frac{1}{\alpha} \quad \forall n \ge 1.$$

Since $A$ is compact, $u_n = A(\lambda_n^{-1} u_n)$ has a convergent subsequence. But this is impossible because

$$\|u_n - u_m\|^2 = \|u_n\|^2 + \|u_m\|^2 = 2 \quad \forall n \neq m.$$

So $\alpha = 0$, i.e., $\lambda_n \to 0$, as claimed. Consequently, we have by (8.2.14) that $Aw_m \to 0$ as $m \to \infty$, i.e., (8.2.11) holds true.

Finally, let us prove that $\{u_n\}_{n=1}^{\infty}$ is a basis in $H$. We know from the proof of Theorem 6.21 that for all $u \in H$ the series $\sum_{n=1}^{\infty} (u, u_n) u_n$ converges (as $\{u_n\}_{n=1}^{\infty}$ is an orthonormal system), so we can write

$$v = \sum_{n=1}^{\infty} (u, u_n) u_n$$


and we simply need to check that $u = v$. Consider the sequence of partial sums $s_m = \sum_{n=1}^{m} (u, u_n) u_n$, which converges strongly to $v$ as $m \to \infty$, so $As_m \to Av$. On the other hand, by (8.2.11) we have that

$$As_m = \sum_{n=1}^{m} \lambda_n (u, u_n) u_n \to Au \quad \text{as } m \to \infty.$$

Hence,

$$Av = Au \implies A(v - u) = 0 \implies v = u,$$

since $\ker A = \{0\}$. Thus the system $\{u_n\}_{n=1}^{\infty}$ is complete, i.e., a basis in $H$ (cf. Theorem 6.21).

Remark 8.8. If we assume in addition that $A$ is positive (i.e., $(Av, v) > 0$ for all $v \in H \setminus \{0\}$), then it has eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge \cdots$, with $\lambda_n > 0$ $\forall n \ge 1$. This follows from

$$(Au_n, u_n) = \lambda_n \|u_n\|^2 = \lambda_n, \quad n \ge 1.$$

Note also that

$$\lambda_1 = \|A\| = \sup\{\, (Av, v);\ v \in H,\ \|v\| = 1 \,\}$$

and

$$\lambda_{n+1} = \|A|_{V_n}\| = \sup\{\, (Av, v);\ v \in V_n,\ \|v\| = 1 \,\} \quad \forall n \ge 1,$$

where $V_n = \big(\operatorname{Span}\{u_1, u_2, \dots, u_n\}\big)^{\perp}$, $n \ge 1$.
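The inductive construction in the proof of Theorem 8.7 has a simple finite dimensional analogue, sketched below for a symmetric $3 \times 3$ matrix. This is only an illustration under stated assumptions, not the method of the text: the dominant eigenpair is found by power iteration (a swapped-in numerical technique that assumes the dominant eigenvalue is unique in modulus), and restricting to the orthogonal complement is imitated by subtracting $\lambda_k u_k u_k^{T}$. The function names and the sample matrix are hypothetical.

```python
import math

def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

def dot(x, y):
    """Euclidean inner product."""
    return sum(s * t for s, t in zip(x, y))

def dominant_eigenpair(a, start, iters=500):
    """Power iteration: returns (lam, u) with ||u|| = 1 and |lam| maximal,
    assuming the dominant eigenvalue is unique in modulus."""
    u = start[:]
    for _ in range(iters):
        w = matvec(a, u)
        nrm = math.sqrt(dot(w, w))
        u = [t / nrm for t in w]
    return dot(matvec(a, u), u), u  # Rayleigh quotient (Au, u)

def eigen_system(a, start):
    """Repeat the step on the deflated matrix A - sum_k lam_k u_k u_k^T,
    which acts like A on the orthogonal complement of the u_k found so far."""
    n = len(a)
    b = [row[:] for row in a]
    pairs = []
    for _ in range(n):
        lam, u = dominant_eigenpair(b, start)
        pairs.append((lam, u))
        b = [[b[i][j] - lam * u[i] * u[j] for j in range(n)] for i in range(n)]
    return pairs

def expand(pairs, u):
    """Finite analogue of (8.2.11): sum_n lam_n (u, u_n) u_n."""
    out = [0.0] * len(u)
    for lam, un in pairs:
        c = lam * dot(u, un)
        out = [o + c * t for o, t in zip(out, un)]
    return out

# Hypothetical example: symmetric, eigenvalues 2 + sqrt(2), 2, 2 - sqrt(2).
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
PAIRS = eigen_system(A, [1.0, 0.5, 0.25])

if __name__ == "__main__":
    print([lam for lam, _ in PAIRS])
    print(expand(PAIRS, [1.0, -2.0, 3.0]), matvec(A, [1.0, -2.0, 3.0]))
```

The computed eigenvalues come out ordered by decreasing modulus, the eigenvectors are orthonormal, and `expand` reproduces $Au$, mirroring (8.2.10) and (8.2.11) in this toy setting.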

8.3 Eigenvalues of −Δ Under the Dirichlet Boundary Condition

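In what follows we apply the Hilbert–Schmidt Theorem to an eigenvalue problem for the Laplace operator. Specifically, let $\emptyset \neq \Omega \subset \mathbb{R}^N$, $N \ge 2$, be a bounded domain with smooth boundary $\partial\Omega$. Consider the Dirichlet eigenvalue problem

$$\begin{cases} -\Delta u = \lambda u & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \tag{8.3.15}$$

Definition 8.9. A real number $\lambda$ is said to be an eigenvalue of the Dirichlet problem (8.3.15) if there is a function $u \in H_0^1(\Omega) \setminus \{0\}$ such that the problem is satisfied in the sense that

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \lambda \int_\Omega uv \, dx \quad \forall v \in H_0^1(\Omega), \tag{8.3.16}$$

or, equivalently,

$$-\Delta u = \lambda u \quad \text{in } \mathcal{D}'(\Omega).$$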


Remark 8.10. As $\partial\Omega$ is assumed to be smooth, the eigenfunction $u$ is in fact more regular (see [6, Theorem 9.25, p. 298]).

Theorem 8.11. Let $\emptyset \neq \Omega \subset \mathbb{R}^N$ be a bounded domain with smooth boundary $\partial\Omega$. Then there exist an increasing sequence of positive eigenvalues $\lambda_n$ for (8.3.15) such that $\lambda_n \to +\infty$ and a complete orthonormal system (in $H = L^2(\Omega)$) of eigenfunctions $u_n$ satisfying problem (8.3.15) with $\lambda = \lambda_n$, $n = 1, 2, \dots$

Proof. Let $H = L^2(\Omega)$ equipped with the usual inner product and norm. $H$ is an infinite dimensional, separable Hilbert space (over $\mathbb{R}$). We know that for every $f \in H = L^2(\Omega)$ the problem

$$\begin{cases} -\Delta u = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases}$$

has a unique solution $u \in H_0^1(\Omega)$ (by Dirichlet's Principle, Chap. 6). Define an operator $A : H \to H$ by assigning $f \mapsto u$. Note that $A$ is linear and $N(A) = \{0\}$. Moreover, $A$ is symmetric (hence self-adjoint since $D(A) = H$). Indeed, if $v = Ag$ with $g \in H$, i.e.,

$$\begin{cases} -\Delta v = g & \text{in } \Omega, \\ v = 0 & \text{on } \partial\Omega, \end{cases}$$

then, by Green, we can write

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx = \int_\Omega f \, Ag \, dx, \qquad \int_\Omega \nabla v \cdot \nabla u \, dx = \int_\Omega g u \, dx = \int_\Omega g \, Af \, dx,$$

so $\int_\Omega f \, Ag \, dx = \int_\Omega g \, Af \, dx$ as desired. Let us show that operator $A$ is also compact, i.e., for every constant $M > 0$, the set

$$S_M := \{\, Af;\ f \in L^2(\Omega),\ \|f\|_{L^2(\Omega)} \le M \,\}$$

is relatively compact in $H = L^2(\Omega)$. Indeed, if $u = Af \in S_M$ it follows from the weak formulation of the problem (cf. (8.3.16)) with $v = u$ that

$$\|\nabla u\|_{L^2(\Omega)}^2 = \int_\Omega f u \, dx \le \|f\|_{L^2(\Omega)} \cdot \|u\|_{L^2(\Omega)}$$


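and, by the Poincaré inequality, $\|f\|_{L^2(\Omega)} \cdot \|u\|_{L^2(\Omega)} \le C \|f\|_{L^2(\Omega)} \cdot \|\nabla u\|_{L^2(\Omega)}$. Finally,

$$\|\nabla u\|_{L^2(\Omega)} \le C \|f\|_{L^2(\Omega)} \le CM,$$

so that $\|Af\|_{H_0^1(\Omega)}$ is less than or equal to some constant. We know that bounded sets in $H_0^1(\Omega)$ are relatively compact in $L^2(\Omega)$, so that $S_M$ is relatively compact in this space. We can apply the Hilbert–Schmidt Theorem, which guarantees the existence of a sequence of eigenpairs for $A$, $\{(u_n, \mu_n)\}_{n=1}^{\infty}$, such that $|\mu_n|$ decreases to zero and $\{u_n\}_{n=1}^{\infty}$ is a complete orthonormal system (basis) in $H = L^2(\Omega)$. Note that $Au_n = \mu_n u_n$ says that $u_n$ satisfies the problem

$$\begin{cases} -\Delta u_n = \lambda_n u_n & \text{in } \Omega, \\ u_n = 0 & \text{on } \partial\Omega, \end{cases}$$

where $\lambda_n = 1/\mu_n$, i.e., $(u_n, \lambda_n)$ is an eigenpair of problem (8.3.15). Note also that

$$\lambda_n = \lambda_n \int_\Omega u_n^2 \, dx = \int_\Omega |\nabla u_n|^2 \, dx > 0 \quad \forall n \ge 1,$$

so $(\lambda_n)_{n \ge 1}$ is an increasing sequence of positive numbers, and $\lambda_n \to +\infty$ (since $|\mu_n| = \mu_n$ decreases to $0$).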

8.4 Eigenvalues of −Δ Under the Robin Boundary Condition

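Let again $\emptyset \neq \Omega \subset \mathbb{R}^N$, $N \ge 2$, be a bounded domain with smooth boundary $\partial\Omega$. Consider the classical Robin eigenvalue problem

$$\begin{cases} -\Delta u = \lambda u & \text{in } \Omega, \\ \dfrac{\partial u}{\partial \nu} + \alpha u = 0 & \text{on } \partial\Omega, \end{cases} \tag{8.4.17}$$

where $\alpha$ is a positive constant and $\partial u/\partial \nu$ denotes the derivative of $u$ in the direction of the outward unit normal $\nu$ to $\partial\Omega$. In this case we have the following natural definition: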


Definition 8.12. A real number $\lambda$ is said to be an eigenvalue of the Robin problem (8.4.17) if there is a function $u \in H^1(\Omega) \setminus \{0\}$ such that

$$\int_\Omega \nabla u \cdot \nabla v \, dx + \alpha \int_{\partial\Omega} uv \, ds = \lambda \int_\Omega uv \, dx \quad \forall v \in H^1(\Omega). \tag{8.4.18}$$

Remark 8.13. Again, as $\partial\Omega$ was assumed to be smooth enough, the eigenfunction $u$ is, in fact, more regular.

Theorem 8.14. Assume $\emptyset \neq \Omega \subset \mathbb{R}^N$ is a bounded domain with smooth boundary $\partial\Omega$ and $\alpha$ is a positive constant. Then there exist an increasing sequence of positive eigenvalues $\lambda_n$ for (8.4.17) such that $\lambda_n \to +\infty$ and a complete orthonormal system (in $H = L^2(\Omega)$) of eigenfunctions $u_n$ satisfying problem (8.4.17) with $\lambda = \lambda_n$, $n = 1, 2, \dots$

Proof. Again, let $H = L^2(\Omega)$ equipped with the usual inner product and norm. By the Lax–Milgram Theorem (see Chap. 6) we easily infer that for every $f \in H = L^2(\Omega)$ the problem

$$\begin{cases} -\Delta u + u = f & \text{in } \Omega, \\ \dfrac{\partial u}{\partial \nu} + \alpha u = 0 & \text{on } \partial\Omega, \end{cases}$$

has a unique solution $u \in H^1(\Omega)$. Now define $A : H \to H$ by assigning $f \mapsto u$. It is an easy exercise to check that $A$ is positive and satisfies all the conditions of the Hilbert–Schmidt Theorem. In contrast with the previous Dirichlet case, we have replaced $-\Delta$ by $-\Delta + I$ in order to ensure the strong positivity (coercivity) of the corresponding bilinear form as well as the compactness of $A$ (based on Theorem 5.22). Therefore there exists a sequence of eigenpairs for $A$, $\{(u_n, \mu_n)\}_{n=1}^{\infty}$, such that $|\mu_n| = \mu_n$ decreases to $0$ and $\{u_n\}_{n=1}^{\infty}$ is an orthonormal basis in $H$. The fact that $Au_n = \mu_n u_n$ can be written as

$$\begin{cases} -\Delta u_n = \lambda_n u_n & \text{in } \Omega, \\ \dfrac{\partial u_n}{\partial \nu} + \alpha u_n = 0 & \text{on } \partial\Omega, \end{cases}$$

where $\lambda_n = -1 + 1/\mu_n$, i.e., $(u_n, \lambda_n)$ is an eigenpair of problem (8.4.17). Note that

$$\lambda_n = \lambda_n \int_\Omega u_n^2 \, dx = \int_\Omega |\nabla u_n|^2 \, dx + \alpha \int_{\partial\Omega} u_n^2 \, ds > 0 \quad \forall n \ge 1, \tag{8.4.19}$$

so $(\lambda_n)_{n \ge 1}$ is an increasing sequence of positive numbers converging to $\infty$ (since $|\mu_n| = \mu_n$ decreases to $0$).

8.5 Eigenvalues of −Δ Under the Neumann Boundary Condition

Under the same conditions on $\Omega$ we consider the Neumann eigenvalue problem

$$\begin{cases} -\Delta u = \lambda u & \text{in } \Omega, \\ \dfrac{\partial u}{\partial \nu} = 0 & \text{on } \partial\Omega, \end{cases} \tag{8.5.20}$$

i.e., $\alpha > 0$ in the Robin eigenvalue problem is replaced by $\alpha = 0$. The definition of an eigenvalue is the same as before (see Definition 8.12) with $\alpha = 0$ in (8.4.18). We have a result similar to Theorem 8.14 which we explain in what follows.

One can again consider $H = L^2(\Omega)$ with its usual scalar product and norm, and $A : H \to H$ the operator which associates with each $f \in H$ the unique solution $u \in H^1(\Omega)$ of the problem

$$\begin{cases} -\Delta u + u = f & \text{in } \Omega, \\ \dfrac{\partial u}{\partial \nu} = 0 & \text{on } \partial\Omega. \end{cases}$$

The Hilbert–Schmidt Theorem is again applicable (see also Remark 8.8), thus there exist a decreasing sequence of positive eigenvalues of operator $A$, say $(\mu_n)_{n \ge 0}$, $\mu_n \to 0$, and a corresponding complete orthonormal system $\{u_n\}_{n=0}^{\infty}$, i.e., $Au_n = \mu_n u_n$ for $n = 0, 1, 2, \dots$ So denoting $\lambda_n = -1 + 1/\mu_n$ we have

$$\begin{cases} -\Delta u_n = \lambda_n u_n & \text{in } \Omega, \\ \dfrac{\partial u_n}{\partial \nu} = 0 & \text{on } \partial\Omega, \end{cases}$$

for $n = 0, 1, 2, \dots$ and $(\lambda_n)$ is an increasing sequence converging to $\infty$. We also have (8.4.19) with $\alpha = 0$, hence $\lambda_n \ge 0$ for all $n \ge 0$. Note that $\lambda_0 = 0$ is the first eigenvalue of problem (8.5.20), the corresponding eigenfunctions being the nonzero constant functions. Thus $\lambda_0 = 0$ has multiplicity one (so $\lambda_0 = 0$ is said to be a simple eigenvalue) and the corresponding normalized eigenfunction is $u_0 = \pm 1/\sqrt{m(\Omega)}$, where $m(\Omega)$ denotes the Lebesgue measure of $\Omega$. Consequently, a result similar to Theorem 8.14 holds, with the only difference that the first eigenvalue is no longer a positive number (it is $\lambda_0 = 0$).

In fact, the proof can also be done as in the Dirichlet case, as explained below. Denote by $V_0$ the one-dimensional space generated by $u_0 = 1/\sqrt{m(\Omega)}$: $V_0 = \operatorname{Span}\{u_0\} = \operatorname{Span}\{1\}$. Obviously, the space $H = L^2(\Omega)$ can be written as a direct sum

$$H = V_0 \oplus V_1, \qquad V_1 = V_0^{\perp} = \Big\{\, v \in H;\ \int_\Omega v \, dx = 0 \,\Big\}.$$

The space $V_1$ is a closed linear subspace of $H$, so it is a real Hilbert space with respect to the same scalar product and norm. We can use $V_1$ as a basic space to show the existence of $(\lambda_n, u_n)$ for $n = 1, 2, \dots$ Note that $W = V_1 \cap H^1(\Omega)$ is a real Hilbert space with respect to the scalar product (see (8.5.21) below)

$$\langle v, w \rangle = \int_\Omega \nabla v \cdot \nabla w \, dx \quad \forall v, w \in W,$$

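and the corresponding induced norm. Indeed, we can show that

$$\beta = \inf\Big\{ \int_\Omega |\nabla v|^2 \, dx;\ v \in W,\ \int_\Omega v^2 \, dx = 1 \Big\} = \inf_{v \in W \setminus \{0\}} \underbrace{\frac{\int_\Omega |\nabla v|^2 \, dx}{\int_\Omega v^2 \, dx}}_{\text{Rayleigh quotient}}$$

is a positive number. If we assume by way of contradiction that $\beta = 0$, then there exists a minimizing sequence $(v_k)_{k \ge 1}$ in $W$, $\|v_k\|_{L^2(\Omega)} = 1$ $\forall k \ge 1$, such that $(v_k)$ converges to some $\hat{v}$ weakly in $H^1(\Omega)$ and strongly in $V_1$. From

$$\begin{aligned}
\|\nabla \hat{v}\|_{L^2(\Omega)}^2 &= \int_\Omega \nabla \hat{v} \cdot \nabla(\hat{v} - v_k) \, dx + \int_\Omega \nabla \hat{v} \cdot \nabla v_k \, dx \\
&\le \int_\Omega \nabla \hat{v} \cdot \nabla(\hat{v} - v_k) \, dx + \|\nabla \hat{v}\|_{L^2(\Omega)} \|\nabla v_k\|_{L^2(\Omega)}
\end{aligned}$$

we derive

$$\int_\Omega |\nabla \hat{v}|^2 \, dx \le \liminf \int_\Omega |\nabla v_k|^2 \, dx = 0,$$

which implies

$$\int_\Omega |\nabla \hat{v}|^2 \, dx = 0,$$

and so $\hat{v}$ is a constant function. Since $\hat{v} \in V_1$ it follows that $\hat{v} = 0$. On the other hand, one can derive from $\|v_k\|_{L^2(\Omega)} = 1$, $k \ge 1$, that $\|\hat{v}\|_{L^2(\Omega)} = 1$, a contradiction. Thus $\beta > 0$ as claimed. In particular, this implies the following Poincaré-type inequality:

$$\beta \|v\|_{L^2(\Omega)}^2 \le \|\nabla v\|_{L^2(\Omega)}^2 \quad \forall v \in W. \tag{8.5.21}$$

Now, according to the Lax–Milgram Theorem, for each $f \in V_1$ the problem

$$\begin{cases} -\Delta u = f & \text{in } \Omega, \\ \dfrac{\partial u}{\partial \nu} = 0 & \text{on } \partial\Omega, \end{cases}$$

has a unique solution $u \in W$. Moreover the operator $A : V_1 \to V_1$ defined by $Af = u$, $f \in V_1$ (i.e., $A = (-\Delta)^{-1}$), is positive and satisfies the conditions of the Hilbert–Schmidt Theorem. Therefore the existence of $\{(\lambda_n, u_n)\}_{n=1}^{\infty}$ is again guaranteed. Summarizing what we have done so far, we obtain the following result.

Theorem 8.15. Assume $\emptyset \neq \Omega \subset \mathbb{R}^N$ is a bounded domain with smooth boundary $\partial\Omega$. Then there exist a sequence of eigenvalues for (8.5.20), $0 = \lambda_0 < \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le \cdots$, such that $\lambda_n \to \infty$ and a complete orthonormal system (in $H = L^2(\Omega)$) of eigenfunctions $u_n$ verifying problem (8.5.20) with $\lambda = \lambda_n$, $n = 0, 1, 2, \dots$; in addition $\lambda_0 = 0$ is simple and $u_0 = \pm 1/\sqrt{m(\Omega)}$.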

8.6 Some Comments

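1. Let $f \in L^2(\Omega)$. The Neumann problem

$$\begin{cases} -\Delta u = f & \text{in } \Omega, \\ \dfrac{\partial u}{\partial \nu} = 0 & \text{on } \partial\Omega, \end{cases}$$

has a solution (that is unique up to an additive constant) if and only if $f \in V_1$ (i.e., $\int_\Omega f \, dx = 0$). Sufficiency follows by the Lax–Milgram Theorem, as noticed before, while the converse implication follows by Green's Identity.

2. Define

$$\lambda_1^D = \inf\Big\{ \int_\Omega |\nabla v|^2 \, dx;\ v \in H_0^1(\Omega),\ \int_\Omega v^2 \, dx = 1 \Big\} = \inf_{v \in H_0^1(\Omega) \setminus \{0\}} \underbrace{\frac{\int_\Omega |\nabla v|^2 \, dx}{\int_\Omega v^2 \, dx}}_{\text{Rayleigh quotient}}.$$

It is easily seen that $\lambda_1^D$ is positive and is attained for a function $u_1^D \in W_D = H_0^1(\Omega)$, $\|u_1^D\|_{L^2(\Omega)} = 1$, which is an eigenfunction corresponding to $\lambda_1^D$. Moreover, $\lambda_1^D$ is the first eigenvalue (or principal eigenvalue), i.e., $\lambda_1^D = \lambda_1$ given by Theorem 8.11, $\lambda_1^D$ is simple, and $u_1^D$ is positive within $\Omega$ (see [14, Theorem 2, p. 336]). If we define

$$W_1^D = \Big\{ v \in H_0^1(\Omega);\ \int_\Omega u_1^D v \, dx = 0 \Big\}, \qquad \lambda_2^D = \inf\Big\{ \int_\Omega |\nabla v|^2 \, dx;\ v \in W_1^D,\ \int_\Omega v^2 \, dx = 1 \Big\},$$

then $\lambda_2^D$ is attained at some $u_2^D \in W_1^D$, $\|u_2^D\|_{L^2(\Omega)} = 1$, which is an eigenfunction corresponding to $\lambda_2^D$, $u_2^D \perp u_1^D$. In general, setting

$$W_{n-1}^D = \Big\{ v \in H_0^1(\Omega);\ \int_\Omega u_j^D v \, dx = 0,\ j = 1, \dots, n-1 \Big\}, \quad n \ge 2,$$

$$\lambda_n^D = \inf\Big\{ \int_\Omega |\nabla v|^2 \, dx;\ v \in W_{n-1}^D,\ \int_\Omega v^2 \, dx = 1 \Big\},$$

we obtain a sequence of eigenpairs $(\lambda_n^D, u_n^D)$, such that

$$\lambda_1^D < \lambda_2^D \le \cdots \le \lambda_n^D \le \cdots, \qquad \lambda_n^D = \lambda_n \to \infty,$$

and $\{u_n^D\}_{n=1}^{\infty}$ is an orthonormal basis in $L^2(\Omega)$. This method is an alternative to that described in the proof of the Hilbert–Schmidt Theorem.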

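Similar arguments work for the Robin and Neumann eigenvalue problems. We just recall that the lowest positive eigenvalues are given by

$$\lambda_1^R = \inf\Big\{ \int_\Omega |\nabla v|^2 \, dx + \alpha \int_{\partial\Omega} v^2 \, ds;\ v \in H^1(\Omega),\ \int_\Omega v^2 \, dx = 1 \Big\} = \inf_{v \in H^1(\Omega) \setminus \{0\}} \frac{\int_\Omega |\nabla v|^2 \, dx + \alpha \int_{\partial\Omega} v^2 \, ds}{\int_\Omega v^2 \, dx}, \tag{8.6.22}$$

$$\lambda_1^N = \inf\Big\{ \int_\Omega |\nabla v|^2 \, dx;\ v \in W = V_1 \cap H^1(\Omega),\ \int_\Omega v^2 \, dx = 1 \Big\} = \inf_{v \in W \setminus \{0\}} \frac{\int_\Omega |\nabla v|^2 \, dx}{\int_\Omega v^2 \, dx}.$$

It is readily seen that both $\lambda_1^R$ and $\lambda_1^N$ (which is equal to $\beta$ defined before) are positive numbers and are attained for functions $u_1^R \in H^1(\Omega)$ and $u_1^N \in W$ which are the corresponding eigenfunctions. It is also well known that $\lambda_1^R$ is simple and $u_1^R$ does not change sign within $\Omega$. By contrast, every function in $W \setminus \{0\}$ has zero mean, so the Neumann eigenfunction $u_1^N$ necessarily changes sign within $\Omega$.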

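3. For all $f \in L^2(\Omega)$ the Robin problem

$$\begin{cases} -\Delta u = f & \text{in } \Omega, \\ \dfrac{\partial u}{\partial \nu} + \alpha u = 0 & \text{on } \partial\Omega, \end{cases}$$

where $\alpha$ is a given positive constant, has a unique solution $u \in H^1(\Omega)$. Indeed, by (8.6.22) we have the inequality

$$\lambda_1^R \int_\Omega v^2 \, dx \le \int_\Omega |\nabla v|^2 \, dx + \alpha \int_{\partial\Omega} v^2 \, ds \quad \forall v \in H^1(\Omega), \tag{8.6.23}$$

which (along with the continuity of the canonical injection of $H^1(\Omega)$ into $L^2(\partial\Omega)$) shows that its right-hand side defines a norm equivalent to the usual norm in $H^1(\Omega)$. So the claim follows from Lax–Milgram applied to the bilinear form

$$(u, v) \mapsto \int_\Omega \nabla u \cdot \nabla v \, dx + \alpha \int_{\partial\Omega} uv \, ds.$$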

4. For some particular sets $\Omega \subset \mathbb{R}^N$ the eigenpairs $(\lambda_n, u_n)$ can be calculated. In the one-dimensional case ($N = 1$), if $\Omega = (0, 1)$, the three eigenvalue problems look as follows:

$$\begin{cases} -u'' = \lambda u, & 0 < x < 1, \\ u(0) = u(1) = 0, \end{cases}$$

$$\begin{cases} -u'' = \lambda u, & 0 < x < 1, \\ u'(0) = u'(1) = 0, \end{cases}$$

$$\begin{cases} -u'' = \lambda u, & 0 < x < 1, \\ -u'(0) + \alpha u(0) = 0 = u'(1) + \alpha u(1), \end{cases}$$

where $\alpha$ is a given positive constant. In the first two cases (Dirichlet and Neumann) we obtain by easy computations

$$\lambda_n^D = \pi^2 n^2, \quad u_n^D(x) = \sqrt{2} \sin(n\pi x), \quad n = 1, 2, \dots;$$

$$\lambda_0^N = 0, \quad u_0^N(x) = 1; \qquad \lambda_n^N = \pi^2 n^2, \quad u_n^N(x) = \sqrt{2} \cos(n\pi x), \quad n = 1, 2, \dots$$

In the Robin case we cannot calculate by elementary methods the corresponding eigenpairs $(u_n, \lambda_n)$, $n \ge 1$.
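The one-dimensional formulas above lend themselves to a quick numerical check, and the Robin case can at least be approximated. The sketch below is an illustration under stated assumptions, not part of the text: all function names are hypothetical, integrals are approximated by Simpson's rule, and for the Robin problem we assume the ansatz $u(x) = k\cos(kx) + \alpha\sin(kx)$ with $\lambda = k^2$, which satisfies the condition at $x = 0$ and reduces the condition at $x = 1$ to $F(k) = (\alpha^2 - k^2)\sin k + 2\alpha k\cos k = 0$.

```python
import math

def u_dirichlet(n, x):
    """Dirichlet eigenfunction u_n^D(x) = sqrt(2) sin(n pi x) on (0, 1)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def l2_inner(f, g, m=2000):
    """Composite Simpson approximation of the L^2(0,1) inner product (m even)."""
    h = 1.0 / m
    total = f(0.0) * g(0.0) + f(1.0) * g(1.0)
    for i in range(1, m):
        x = i * h
        total += (4 if i % 2 else 2) * f(x) * g(x)
    return total * h / 3.0

def rayleigh_quotient(n):
    """int |u'|^2 / int u^2 for u = u_n^D, with u' written out analytically."""
    u = lambda x: u_dirichlet(n, x)
    du = lambda x: math.sqrt(2.0) * n * math.pi * math.cos(n * math.pi * x)
    return l2_inner(du, du) / l2_inner(u, u)

def robin_F(k, alpha):
    """Value of u'(1) + alpha u(1) for u(x) = k cos(kx) + alpha sin(kx)."""
    return (alpha ** 2 - k ** 2) * math.sin(k) + 2.0 * alpha * k * math.cos(k)

def robin_first_k(alpha, tol=1e-12):
    """Bisection for the root of robin_F in (0, pi); F > 0 near 0, F(pi) < 0."""
    lo, hi = 1e-6, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if robin_F(mid, alpha) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(l2_inner(lambda x: u_dirichlet(1, x), lambda x: u_dirichlet(2, x)))
    print(rayleigh_quotient(2), 4 * math.pi ** 2)
    k = robin_first_k(1.0)
    print(k, k * k)  # approximate first Robin eigenvalue for alpha = 1
```

The Rayleigh quotient of $u_n^D$ reproduces $\pi^2 n^2$, and the bisection returns an approximation of the first Robin eigenvalue $\lambda_1 = k^2$ for the chosen $\alpha$.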

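5. In the Dirichlet case above, the system $\{w_n = \lambda_n^{-1/2} u_n\}_{n=1}^{\infty}$ is an orthonormal basis in $W_D = H_0^1(\Omega)$. Indeed, we can deduce from

$$\begin{cases} -\Delta u_n = \lambda_n u_n & \text{in } \Omega, \\ u_n = 0 & \text{on } \partial\Omega, \end{cases} \tag{8.6.24}$$

that

$$\int_\Omega \nabla w_n \cdot \nabla w_k \, dx = \int_\Omega u_n u_k \, dx = \delta_{nk} \quad \forall n, k \ge 1,$$

which shows that $\{w_n\}_{n=1}^{\infty}$ is an orthonormal system in $W_D$. Now, since $\{u_n\}_{n=1}^{\infty}$ is complete in $H = L^2(\Omega)$, any $u \in H$ can be written as (see Theorem 6.21)

$$u = \sum_{n=1}^{\infty} (u, u_n)_{L^2(\Omega)} u_n = \sum_{n=1}^{\infty} \Big( \int_\Omega u u_n \, dx \Big) u_n,$$

so, according to (8.6.24),

$$u = \sum_{n=1}^{\infty} \Big( \int_\Omega \nabla u \cdot \nabla w_n \, dx \Big) w_n.$$

Thus $\{w_n\}_{n=1}^{\infty}$ is complete in $W_D$. Similar statements hold true for the other two cases (Neumann and Robin) within $W_N = V_1 \cap H^1(\Omega)$ and $W_R = H^1(\Omega)$ equipped with the scalar products

$$(w_1, w_2)_N = \int_\Omega \nabla w_1 \cdot \nabla w_2 \, dx, \qquad (w_1, w_2)_R = \int_\Omega \nabla w_1 \cdot \nabla w_2 \, dx + \alpha \int_{\partial\Omega} w_1 w_2 \, ds.$$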

In fact, these statements on the negative Laplacian with Dirichlet, Neumann or Robin boundary conditions can be derived from the abstract framework we describe below, related to the so-called energetic extension of a linear operator Q satisfying the following assumptions:


(a) $Q : D(Q) \subset H \to H$ is a linear, densely defined, self-adjoint, strongly positive operator, where $(H, (\cdot,\cdot), \|\cdot\|)$ is a real, infinite dimensional, separable Hilbert space.

Define on the vector subspace $D(Q)$ the so-called energetic scalar product

$$(u, v)_E = (Qu, v) \quad \forall u, v \in D(Q).$$

It induces the energetic norm on $D(Q)$: $\|u\|_E^2 = (u, u)_E$, $u \in D(Q)$. Denote by $H_E$ the completion of $(D(Q), \|\cdot\|_E)$. Then $H_E$ is a Hilbert space with respect to the scalar product

$$(u, v)_E := \lim_{k \to \infty} (u_k, v_k)_E,$$

where $(u_k)$ and $(v_k)$ are sequences in $D(Q)$ converging to $u$ and $v$, respectively. Since $Q$ is strongly positive, i.e., there exists a constant $c > 0$ such that

$$(Qu, u) \ge c\|u\|^2 \quad \forall u \in D(Q),$$

we have

$$\|u\| \le \frac{1}{\sqrt{c}} \|u\|_E \quad \forall u \in H_E, \tag{8.6.25}$$

so the identity map from $H_E$ to $H$ is continuous (i.e., $H_E$ is continuously embedded in $H$). Denote by $Q_E$ the Riesz isomorphism from $(H_E, \|\cdot\|_E)$ onto its dual $H_E^*$, namely,

$$(Q_E u)(v) = (u, v)_E \quad \forall u, v \in H_E.$$

Identifying $H$ with its dual, we have $D(Q) \subset H_E \subset H \subset H_E^*$. Since $D(Q)$ is dense in $H$, we see that

$$Q_E u = Qu \quad \forall u \in D(Q),$$

i.e., $Q_E$ is an extension of $Q$ which is called the energetic extension. The term energetic will become clear later when we discuss examples. We also assume that

(b) the identity map from $H_E$ into $H$ is compact (i.e., $H_E$ is compactly embedded into $H$).

Now we can state the following abstract spectral result.


Theorem 8.16. Assume (a) and (b) above are fulfilled. Then there exist an increasing sequence $(\lambda_n)_{n \ge 1}$ in $(0, \infty)$ converging to $\infty$, and an orthonormal basis $\{u_n\}_{n=1}^{\infty}$ in $H$ such that

$$u_n \in D(Q), \quad Qu_n = \lambda_n u_n \quad \forall n \ge 1. \tag{8.6.26}$$

In addition, $\{\lambda_n^{-1/2} u_n\}_{n=1}^{\infty}$ is an orthonormal basis in $H_E$ (the energetic space defined above).

Proof. We shall adapt the proof of Theorem 8.11 to the present abstract framework. First of all, note that $Q : D(Q) \subset H \to H$ is bijective since its extension $Q_E : H_E \to H_E^*$ is. Denote $A = Q^{-1}$. Obviously, $A \in L(H)$, $N(A) = \{0\}$, and $A$ is self-adjoint. Operator $A$ is also compact. Indeed, if for some $M > 0$ we take $f \in H$ such that $\|f\| \le M$, then we have for $u = Af$ (equivalently $Qu = f$),

$$\|u\|_E^2 = (Qu, u) \le \|f\| \cdot \|u\| \le M\|u\|. \tag{8.6.27}$$

Combining (8.6.27) with (8.6.25) yields

$$\|Af\|_E = \|u\|_E \le \frac{M}{\sqrt{c}},$$

i.e., $A$ sends bounded sets in $H$ to bounded sets in $H_E$, hence $A$ is compact (cf. (b)). According to the Hilbert–Schmidt Theorem there exists a sequence of eigenpairs for $A = Q^{-1}$, $\big((\mu_n, u_n)\big)_{n \ge 1}$, with the known properties, with $\mu_n > 0$, $n \ge 1$, since $Q$ is strongly positive. Thus, the first part of the theorem follows with $\big((\lambda_n, u_n)\big)_{n \ge 1}$, where $\lambda_n = 1/\mu_n$, $n = 1, 2, \dots$

In order to prove the second part, denote $w_n = \lambda_n^{-1/2} u_n$, $n \ge 1$. It follows from (8.6.26) that

$$(w_n, w_k)_E = \lambda_n^{-1/2} \lambda_k^{-1/2} (Qu_n, u_k) = \delta_{nk} \quad \forall n, k \ge 1,$$

i.e., the system $\{w_n\}_{n=1}^{\infty}$ is orthonormal in $H_E$. Now, since $\{u_n\}_{n=1}^{\infty}$ is complete in $H$, any $u \in H$ can be expressed as (cf. Theorem 6.21)

$$u = \sum_{n=1}^{\infty} (u, u_n) u_n.$$

So, by virtue of (8.6.26),

$$u = \sum_{n=1}^{\infty} (u, w_n)_E \, w_n,$$

and so we can conclude that $\{w_n\}_{n=1}^{\infty}$ is complete in $H_E$.


Remark 8.17. In addition to the conclusions of Theorem 8.16 one can show that $\{\lambda_n^{1/2} u_n\}_{n=1}^{\infty}$ is an orthonormal basis in $H_E^*$. For more details on energetic spaces and extensions we refer the reader to [52, Chapter 5]. See also [22, Chapter 1, p. 18].

Remark 8.18. One can reobtain from Theorem 8.16 the previous statements related to $Q = -\Delta$ with Dirichlet, Neumann or Robin boundary condition.

In the Dirichlet case we have $H = L^2(\Omega)$ with its usual scalar product and induced norm, $D(Q) = H_0^1(\Omega) \cap H^2(\Omega)$, and $H_E = H_0^1(\Omega)$ with the energetic scalar product $(u, v)_E = \int_\Omega \nabla u \cdot \nabla v \, dx$ and $\|u\|_E^2 = (u, u)_E$. Note that $H_E$ is equal to $W_D$ defined above.

In the Neumann case, $H = V_1 := \{v \in L^2(\Omega);\ \int_\Omega v \, dx = 0\}$ with the scalar product and norm inherited from $L^2(\Omega)$, $D(Q) = V_1 \cap H^2(\Omega)$, and $H_E = V_1 \cap H^1(\Omega)$ (denoted above by $W_N$) with $(u, v)_E = \int_\Omega \nabla u \cdot \nabla v \, dx$, $\|u\|_E^2 = (u, u)_E$. Of course, in this case we have an additional eigenvalue $\lambda_0 = 0$ as specified before.

Finally, in the case of the Robin boundary condition, $H = L^2(\Omega)$ with its usual scalar product and norm, $D(Q) = H^2(\Omega)$, and $H_E = H^1(\Omega)$ (denoted above by $W_R$) with $(u, v)_E = \int_\Omega \nabla u \cdot \nabla v \, dx + \alpha \int_{\partial\Omega} uv \, ds$ and $\|u\|_E^2 = (u, u)_E$.

There are also many other specific examples covered by Theorem 8.16, in particular the case $Q = -\Delta$ with different conditions on parts of the boundary of $\Omega$.

Remark 8.19. In order to develop the above theory on energetic extensions we can begin with an operator $Q$ which satisfies all the assumptions in (a), with one exception: $Q$ is only symmetric, not self-adjoint. Everything works similarly and $H_E$ and $Q_E$ can be constructed by using the same arguments. Now define an operator $\hat{Q} : D(\hat{Q}) \subset H \to H$ as follows:

$$D(\hat{Q}) = \{\, v \in H_E;\ Q_E v \in H \,\}, \qquad \hat{Q} v = Q_E v \quad \forall v \in D(\hat{Q}).$$

Obviously, $\hat{Q}$ is an extension of $Q$, so $D(\hat{Q})$ is dense in $H$. It is also easily seen that $\hat{Q}$ is strongly positive. As $Q_E$ is bijective so is $\hat{Q}$, since it is a restriction of $Q_E$. Note also that $\hat{Q}^{-1} \in L(H)$ and is symmetric, hence self-adjoint. Thus $\hat{Q}$ is self-adjoint as well. Operator $\hat{Q}$ is called the Friedrichs extension of $Q$. It is easily seen that the energetic space and the energetic extension defined by $\hat{Q}$ are exactly $H_E$ and $Q_E$. Summarizing, we see that $\hat{Q}$ satisfies all the conditions in (a) and plays the role of the former $Q$. So assuming in (a) that $Q$ is a self-adjoint operator (not a symmetric one) does not restrict the generality. In fact, in this case the Friedrichs extension of $Q$ is $Q$ itself.

For example, if we choose $H = L^2(\Omega)$ (where $\Omega \subset \mathbb{R}^N$ is an open bounded set with smooth boundary) and $D(Q) = C_0^{\infty}(\Omega)$, $Qu = -\Delta u$, then $Q$ is symmetric in $H$ (not self-adjoint), the corresponding energetic space is $H_E = H_0^1(\Omega)$, and $Q_E : H_E \to H_E^*$ is given by

$$Q_E(u)(v) = \int_\Omega \nabla u \cdot \nabla v \, dx \quad \forall u, v \in H_0^1(\Omega),$$

i.e., the same energetic extension we had before (see Remark 8.18). Obviously, the corresponding Friedrichs extension of $Q$ is given by

$$D(\hat{Q}) = H_0^1(\Omega) \cap H^2(\Omega), \qquad \hat{Q} u = -\Delta u \quad \forall u \in D(\hat{Q}).$$

8.7 Exercises

1. Let $X$ denote the real linear space of all polynomials with real coefficients of degree $\le 3$. Define $A : X \to X$ by

$$(Ap)(x) = x p'(x), \quad x \in \mathbb{R},\ p \in X,$$

where $p'$ denotes the derivative of $p$.
(a) Determine $N(A)$ and $R(A)$;
(b) Find all the eigenpairs of $A$.

2. Let $X = C[0, 1]$ be the usual real Banach space equipped with the sup-norm. Define on $X$ the operator $A$ by

$$(Au)(t) = (at + b)u(t), \quad t \in [0, 1],\ u \in X,$$

where $a, b$ are real constants.
(i) Show that $A \in L(X)$;
(ii) Find the eigenvalues and eigenvectors of $A$.

3. Let $X$ be a Banach space over $\mathbb{K}$. Let $A, B \in L(X)$ and $\lambda \in \mathbb{K}$, $\lambda \neq 0$. Prove that $\lambda$ is an eigenvalue of $AB := A \circ B$ if and only if $\lambda$ is an eigenvalue of $BA := B \circ A$.


4. Let $X$ denote the real Banach space $C[0, 1]$ with the usual sup-norm. Let $k = k(t, s) \in C([0, 1] \times [0, 1])$, with $\partial k/\partial t \in C([0, 1] \times [0, 1])$, $k(t, t) \neq 0$ $\forall t \in [0, 1]$. Define on $X$ the operator $A$ by

$$(Au)(t) = \int_0^t k(t, s) u(s) \, ds, \quad t \in [0, 1].$$

Show that
(a) $A \in L(X)$;
(b) $A$ has no eigenvalue.
Solve the same exercise for $X = L^2(0, 1)$ with the usual norm.

5. Let $H = l^2$ be the usual Hilbert space of sequences $x = (x_1, x_2, \dots)$ in $\mathbb{C}$ satisfying $\sum_{n=1}^{\infty} |x_n|^2 < \infty$ with the inner product

$$\langle x, y \rangle = \sum_{n=1}^{\infty} x_n \bar{y}_n, \quad x = (x_1, x_2, \dots),\ y = (y_1, y_2, \dots) \in H,$$

and the corresponding Hilbertian norm. Define the multiplication operator $A$ by

$$Ax = (\lambda_1 x_1, \lambda_2 x_2, \dots) \quad \forall x = (x_1, x_2, \dots) \in H,$$

where $(\lambda_n)_{n \in \mathbb{N}}$ is a given sequence in $\mathbb{C}$ with $\sup_{n \in \mathbb{N}} |\lambda_n| < \infty$.
(a) Show that $A \in L(H)$ and determine $\|A\|$;
(b) Show that $A$ is symmetric (hence self-adjoint) $\iff$ $\lambda_n \in \mathbb{R}$ for all $n \in \mathbb{N}$;
(c) Find all the eigenvalues of $A$.

6. Let $H = L^2(0, 1)$ be the real Hilbert space equipped with the usual scalar product and the induced norm, denoted $\|\cdot\|$. Define $A : H \to H$ by

$$(Au)(t) = t \int_0^t u(s) \, ds + \int_t^1 s\,u(s) \, ds, \quad 0 \le t \le 1,\ u \in H.$$

(a) Check that $A \in L(H)$;
(b) Prove that $A$ is a compact operator;
(c) Prove that $A$ is symmetric (hence self-adjoint);


(d) Find all the eigenvalues and eigenvectors (eigenfunctions) of $A$ and use this information to determine an orthonormal basis of $H$.

7. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space. Show that $x \in H \setminus \{0\}$ is an eigenvector of $A \in L(H)$ $\iff$ $|(Ax, x)| = \|Ax\| \cdot \|x\|$.

8. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space and let $u, v \in H \setminus \{0\}$ be two orthogonal vectors (i.e., $(u, v) = 0$). Define $A : H \to H$ by

$$Ax = (x, v)u + (x, u)v, \quad x \in H.$$

Obviously, $A \in L(H)$.
(a) Calculate $\|A\|$;
(b) Show that $A$ is symmetric (hence self-adjoint);
(c) Using (a) calculate $\|A\|$, where $A : L^2(-\pi, \pi) \to L^2(-\pi, \pi)$ is the linear operator defined by

$$(Af)(t) = \sin t \int_{-\pi}^{\pi} f(s) \cos s \, ds + \cos t \int_{-\pi}^{\pi} f(s) \sin s \, ds, \quad t \in [-\pi, \pi],$$

for all $f \in L^2(-\pi, \pi)$;
(d) Find all the eigenpairs of $A$.

9. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a Hilbert space and let $\{e_1, e_2, \dots, e_m\} \subset H$ be an orthonormal system, where $m$ is a given natural number. Define $A : H \to H$ by

$$Ax = \sum_{i=1}^{m} c_i (x, e_i) e_i, \quad x \in H,$$

where $c_i \in \mathbb{K} \setminus \{0\}$, $i = 1, \dots, m$.
(a) Show that $A \in L(H)$ and determine $\|A\|$, $R(A)$ and $N(A)$;
(b) Show that $A$ is symmetric $\iff$ $c_i \in \mathbb{R}$ $\forall i \in \{1, \dots, m\}$;
(c) Determine all the eigenvalues of $A$.


10. Let $H = L^2(0, 1)$ be the real Hilbert space equipped with the usual scalar product and norm. Define $A : H \to H$ by

$$(Au)(t) = \frac{t}{1+t} \int_0^1 \frac{s}{1+s}\, u(s) \, ds, \quad t \in [0, 1],\ u \in H.$$

(a) Show that $A \in L(H)$ and $A$ is symmetric (hence self-adjoint);
(b) Determine $R(A)$ and $N(A)$;
(c) Determine all the eigenpairs of $A$.

11. Let $H = L^2(0, 1)$ be the real Hilbert space equipped with the usual scalar product and norm. For $u \in H$ consider the problem

$$\begin{cases} v''(t) = u(t) & \text{a.e. in } (0, 1), \\ v'(0) = 0, \ v(1) = 0. \end{cases}$$

Define $A : H \to H$ by $Au = v$, $u \in H$, where $v$ is the solution of the above problem corresponding to $u$.
(a) Show that $A \in L(H)$ and $N(A) = \{0\}$;
(b) Prove that $A$ is symmetric and compact;
(c) Find all the eigenpairs of $A$ and use this information to determine an orthonormal basis of $H$.

12. Solve the Dirichlet eigenvalue problem

$$\begin{cases} -\Delta u = \lambda u & \text{in } \Omega \subset \mathbb{R}^2, \\ u = 0 & \text{on } \partial\Omega, \end{cases}$$

where $\Omega$ is the rectangle $(0, a) \times (0, b) \subset \mathbb{R}^2$, $a, b \in (0, \infty)$.

13. Consider in $\Omega = (0, a) \times (0, b) \subset \mathbb{R}^2$, $a, b \in (0, \infty)$, the eigenvalue problem for $-\Delta$ with Neumann conditions on all sides of the rectangle $\Omega$ or combinations of Dirichlet and Neumann conditions on different sides of $\Omega$. Solve all these eigenvalue problems.

Chapter 9

Semigroups of Linear Operators

Let A be an n × n matrix with entries a_ij ∈ C for all i, j = 1, 2, ..., n. Consider the Cauchy problem
\[ u'(t) = Au(t), \quad t \ge 0, \tag{E} \]
\[ u(0) = x, \tag{IC} \]
where x is a given (column) vector in C^n. It is well known that problem (E), (IC) has a unique solution given by
\[ u(t) = e^{tA}x, \quad t \ge 0, \tag{9.0.1} \]
where e^{tA} represents the fundamental matrix of the linear differential system (E) which equals I (the n × n identity matrix) for t = 0. We have
\[ e^{tA} = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k, \tag{9.0.2} \]

which is valid for all t ∈ R. Here A and e^{tA} can be interpreted as linear operators A, e^{tA} ∈ L(X), where X = C^n, equipped with one of its equivalent norms, and L(X) denotes, as usual, the space of bounded linear operators from X into itself. As we will see later, the family of matrices (operators) {T(t) = e^{tA}; t ≥ 0} is a uniformly continuous semigroup on X = C^n. What's more, the family {T(t); t ≥ 0} extends to a group of linear operators, {e^{tA}; t ∈ R}. The representation of the solution u(t) as
\[ u(t) = T(t)x, \quad t \ge 0, \tag{9.0.3} \]
allows the derivation of some properties of solutions from the properties of the family {T(t); t ≥ 0}.

This idea extends easily to the case when X is a general Banach space and A is a bounded (continuous) linear operator, A ∈ L(X). If A is not an element of L(X), then the operator exponential e^{tA} no longer makes sense. This case is far from trivial; it is in fact much more interesting and very useful in applications. If A : D(A) ⊂ X → X satisfies certain conditions, then one can associate with A a so-called C0-semigroup of linear operators {T(t); t ≥ 0} ⊂ L(X) (see Definition 9.1 below), so that the solution of the Cauchy problem (E), (IC) can again be represented by the above formula (9.0.3). Indeed, there is a central result in the linear semigroup theory, known as the Hille–Yosida theorem,¹ which establishes the necessary and sufficient conditions for a linear operator A to "generate" a C0-semigroup of linear operators {T(t); t ≥ 0} ⊂ L(X). In this way, one can solve linear partial differential equations of the form (E), where A represents unbounded linear differential operators with respect to the space variables, defined on convenient function spaces.

The linear semigroup theory received considerable attention in the 1930s as a new approach in the study of parabolic and hyperbolic linear partial differential equations. This theory has since developed as an independent theory with applications in some other fields, such as ergodic theory, the theory of Markov processes, etc. In this chapter we present some of the most important results of the linear semigroup theory and provide some related applications.

© Springer Nature Switzerland AG 2019. G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4_9

9.1 Definitions

Throughout this chapter X will be a Banach space over K with norm ‖·‖, where K is either R or C. Denote as usual by L(X) the space of all bounded (continuous) linear operators T : X → X, which is a Banach space with respect to the operator norm
\[ \|T\| = \sup\{\|Tx\| : x \in X,\ \|x\| \le 1\}. \]

¹ Carl Einar Hille, American mathematician, 1894–1980; Kosaku Yosida, Japanese mathematician, 1909–1990.

Definition 9.1. A one-parameter family {T(t); t ≥ 0} ⊂ L(X) is said to be a semigroup if
(i) T(0) = I (the identity operator on X);
(ii) T(t + s) = T(t)T(s) for all t, s ≥ 0 (the semigroup property).
If, in addition,
(iii) lim_{t→0+} ‖T(t)x − x‖ = 0 for all x ∈ X,
then {T(t); t ≥ 0} is called a C0-semigroup (or a semigroup of class C0, or a strongly continuous semigroup).

Condition (iii) says that the function t → T(t)x is continuous (from the right) at t = 0, which is why {T(t); t ≥ 0} is called a C0-semigroup.

Definition 9.2. A family {T(t); t ≥ 0} ⊂ L(X) is said to be a uniformly continuous semigroup if it satisfies conditions (i) and (ii) above, and
(iii)' lim_{t→0+} ‖T(t) − I‖ = 0.

Remark 9.3. Obviously, condition (iii)' is stronger than (iii). Indeed, for any x ∈ X, we have
\[ \|T(t)x - x\| \le \|T(t) - I\| \cdot \|x\|, \]
which proves the assertion.

Definition 9.4. Let {T(t); t ≥ 0} ⊂ L(X) be a C0-semigroup. Denote
\[ Ax := \lim_{h \to 0^+} \frac{1}{h}\,[T(h)x - x], \tag{9.1.4} \]
for all x ∈ X for which the above limit exists. If D(A) is the set of all such x's, then we have a linear operator A : D(A) ⊂ X → X, which is called the infinitesimal generator of the semigroup {T(t); t ≥ 0}.

Now let us state a first result on semigroups of linear operators:

Theorem 9.5. For any operator A ∈ L(X) the family {T(t) = e^{tA}; t ≥ 0} is a uniformly continuous semigroup whose infinitesimal generator is A.


Proof. Recall that
\[ e^{tA} = \sum_{k=0}^{\infty} \frac{t^k}{k!} A^k, \]
meaning that for any t ≥ 0 this series is convergent in L(X) and its sum is e^{tA}. It is easily seen that the family {T(t) = e^{tA}; t ≥ 0} satisfies (i) and (ii). Condition (iii)' is also satisfied since
\[ \|T(t) - I\| \le \sum_{k=1}^{\infty} \frac{t^k}{k!} \|A\|^k \le t\|A\| \cdot e^{t\|A\|} \]
for all t ≥ 0. Note also that for all h > 0
\[ \big\| h^{-1}[T(h) - I] - A \big\| \le h\|A\|^2 e^{h\|A\|}, \]
which shows that A is the infinitesimal generator of {T(t) = e^{tA}; t ≥ 0}.

Remark 9.6. We will see later that, in fact, every uniformly continuous semigroup is a family of operator exponentials {e^{tA}; t ≥ 0} with A ∈ L(X). Note that A can be obtained from the right derivative of T(t) = e^{tA} calculated at t = 0. This explains the above definition of the generator of a C0-semigroup {T(t); t ≥ 0}. In this case, we can expect only the existence of the right derivative at t = 0 of T(t)x for some points x ∈ X. Examples of C0-semigroups (that do not belong to the class of uniformly continuous semigroups) will be provided later.
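The two estimates in the proof of Theorem 9.5 can be checked numerically in the finite-dimensional case X = R². The following is only a minimal sketch: the matrix A, the helper names (expm_series, op_norm, and so on) and the choice of the max-row-sum operator norm are our own assumptions for illustration, and the exponential is approximated by a truncated power series.

```python
import math

def mat_mul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(P, Q, alpha=1.0):
    """Return P + alpha * Q."""
    return [[p + alpha * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def op_norm(P):
    """Operator norm induced by the max-norm on R^n (maximum absolute row sum)."""
    return max(sum(abs(v) for v in row) for row in P)

def expm_series(A, t, terms=60):
    """Partial sum of the series sum_k (t^k / k!) A^k from (9.0.2)."""
    total = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        # term becomes (t^k / k!) A^k
        term = [[v * t / k for v in row] for row in mat_mul(term, A)]
        total = mat_add(total, term)
    return total

# Hypothetical example matrix (eigenvalues -1 and -2).
A = [[0.0, 1.0], [-2.0, -3.0]]
I2 = identity(2)
a = op_norm(A)
t, s, h = 0.7, 0.4, 1e-3

# Semigroup property (ii): T(t + s) = T(t) T(s).
semigroup_gap = op_norm(mat_add(expm_series(A, t + s),
                                mat_mul(expm_series(A, t), expm_series(A, s)), -1.0))

# Estimate used for (iii)': ||T(t) - I|| <= t ||A|| e^{t ||A||}.
continuity_gap = op_norm(mat_add(expm_series(A, t), I2, -1.0))
continuity_bound = t * a * math.exp(t * a)

# Generator estimate: ||h^{-1}[T(h) - I] - A|| <= h ||A||^2 e^{h ||A||}.
quotient = [[v / h for v in row] for row in mat_add(expm_series(A, h), I2, -1.0)]
generator_gap = op_norm(mat_add(quotient, A, -1.0))
generator_bound = h * a ** 2 * math.exp(h * a)

print(semigroup_gap, continuity_gap <= continuity_bound, generator_gap <= generator_bound)
```

Any submultiplicative operator norm would serve here; the max-row-sum norm is chosen only because it is easy to compute without external libraries.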

9.2 Some Properties of C0-Semigroups

We start this section with a basic result in the linear semigroup theory:

Theorem 9.7. If {T(t); t ≥ 0} ⊂ L(X) is a C0-semigroup, then the following hold:
(a) there exist constants M ≥ 1 and ω ∈ R such that
\[ \|T(t)\| \le M e^{\omega t} \quad \forall t \ge 0; \tag{9.2.5} \]
(b) the function t → T(t)x is continuous on [0, ∞) for all x ∈ X.


Proof. Assertion (a): Let us first prove that there exists a constant δ > 0 such that ‖T(t)‖ is bounded on [0, δ], i.e.,
\[ \sup\{\|T(t)\| : 0 \le t \le \delta\} =: C < \infty. \tag{9.2.6} \]
Assume, by way of contradiction, that this is not the case, i.e., there exists a sequence of real numbers t_k → 0+ such that ‖T(t_k)‖ → ∞. On the other hand, condition (iii) of Definition 9.1 implies that for each x ∈ X there exists a natural number N = N(x) such that
\[ \|T(t_k)x\| \le \|T(t_k)x - x\| + \|x\| \le 1 + \|x\| \quad \forall k > N. \tag{9.2.7} \]
By the Uniform Boundedness Principle, we derive from (9.2.7) that (‖T(t_k)‖) is bounded, which contradicts the assumption above. Thus (9.2.6) holds true for some δ > 0. Since ‖T(0)‖ = ‖I‖ = 1, we have C ≥ 1. Now, for all t ≥ 0 we have the decomposition (division with remainder) t = nδ + r, n ∈ N, 0 ≤ r < δ. So, by using condition (ii) of Definition 9.1, we can derive the estimate
\[ \|T(t)\| \le \|T(\delta)\|^n \cdot \|T(r)\| \le C^{n+1}. \]
Therefore, ‖T(t)‖ ≤ C · C^{t/δ}, t ≥ 0, which shows that (9.2.5) holds true with M = C and ω = (ln C)/δ.

Assertion (b): Let t_0 > 0 and x ∈ X be arbitrary but fixed. For any h > 0 we have (cf. condition (ii) from Definition 9.1)
\[ \|T(t_0 + h)x - T(t_0)x\| = \|T(t_0)[T(h)x - x]\| \le \|T(t_0)\| \cdot \|T(h)x - x\|, \]
which shows that the function t → T(t)x is continuous from the right at t = t_0 (cf. condition (iii) of Definition 9.1). Now, for 0 < h < t_0, we can write (cf. (ii) and (9.2.5))
\[ \|T(t_0 - h)x - T(t_0)x\| = \|T(t_0 - h)[x - T(h)x]\| \le M e^{\omega(t_0 - h)} \|x - T(h)x\|, \]
which implies that t → T(t)x is continuous from the left at t = t_0.
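The construction of M and ω in the proof of assertion (a) can be traced on a concrete matrix semigroup. The sketch below is illustrative only: the matrix A = [[−1, 4], [0, −1]], the closed form of e^{tA}, the max-row-sum norm, the choice δ = 1, and the hand-picked sharper constants are all our assumptions, and the supremum in (9.2.6) is replaced by a maximum over a grid.

```python
import math

def T_norm(t):
    """Max-row-sum operator norm of e^{tA} for A = [[-1, 4], [0, -1]], t >= 0.

    Since A = -I + N with N^2 = 0, we have e^{tA} = e^{-t} [[1, 4t], [0, 1]].
    """
    return math.exp(-t) * (1.0 + 4.0 * t)

# Sampled version of (9.2.6) with delta = 1.
delta = 1.0
grid = [k * delta / 1000 for k in range(1001)]
C = max(T_norm(t) for t in grid)

# Constants produced by the proof: M = C, omega = ln(C) / delta.
M, omega = C, math.log(C) / delta

sample_times = [0.05 * k for k in range(401)]   # t in [0, 20]
proof_bound_ok = all(T_norm(t) <= M * math.exp(omega * t) + 1e-12
                     for t in sample_times)

# A sharper pair with negative omega, chosen by hand (not given by the proof).
M_sharp, omega_sharp = 3.4, -0.5
sharp_bound_ok = all(T_norm(t) <= M_sharp * math.exp(omega_sharp * t)
                     for t in sample_times)

print(C, omega, proof_bound_ok, sharp_bound_ok)
```

The proof always yields ω ≥ 0 because C ≥ 1, even though this particular semigroup decays; a negative exponent is only obtained by a separate argument, and here it requires a constant M strictly greater than 1.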


Remark 9.8. In fact, if {T(t); t ≥ 0} ⊂ L(X) is a C0-semigroup, one can easily derive the following property that is stronger than (b) above: the map (t, x) → T(t)x is continuous from [0, ∞) × X to X (see Exercise 9.4).

Remark 9.9. The constant ω in (9.2.5) determined in the proof above is nonnegative, but this is not the best constant. Indeed, sometimes ω can be negative (e.g., this is the case if T(t) = e^{tA}, where A is a real square matrix whose eigenvalues have negative real parts).

Theorem 9.10. Let {T(t); t ≥ 0} ⊂ L(X) be a C0-semigroup and let A be its infinitesimal generator. Then,
(c) A is densely defined: \overline{D(A)} = X;
(d) A is a closed operator;
(e) for all t ≥ 0, x ∈ D(A), we have T(t)x ∈ D(A) and
\[ \frac{d}{dt} T(t)x = AT(t)x = T(t)Ax. \tag{9.2.8} \]

Proof of (c): Obviously,
\[ x = \lim_{t \to 0^+} \frac{1}{t} \int_0^t T(s)x \, ds \quad \forall x \in X. \]
Since D(A) is a linear subspace of X, to prove (c) it suffices to show that
\[ \int_0^t T(s)x \, ds \in D(A) \quad \forall t > 0,\ x \in X. \tag{9.2.9} \]
Indeed, for some given t > 0, x ∈ X, and for all h > 0, we have
\[ h^{-1}[T(h) - I] \int_0^t T(s)x \, ds = h^{-1} \int_0^t [T(s + h)x - T(s)x] \, ds \]
\[ = h^{-1} \Big( \int_h^{t+h} T(s)x \, ds - \int_0^t T(s)x \, ds \Big) \]
\[ = h^{-1} \int_t^{t+h} T(s)x \, ds - h^{-1} \int_0^h T(s)x \, ds. \]


Therefore, there exists
\[ \lim_{h \to 0^+} h^{-1}[T(h) - I] \int_0^t T(s)x \, ds = T(t)x - x, \tag{9.2.10} \]
which implies (9.2.9).

Proof of (d): Let (x_n) be a sequence in D(A) such that x_n → x and Ax_n → y. Using (9.2.10), we can write
\[ T(t)x_n - x_n = \lim_{h \to 0^+} \int_0^t T(s)\, h^{-1}[T(h)x_n - x_n] \, ds = \int_0^t T(s)Ax_n \, ds \quad \forall t > 0. \]
It follows that
\[ T(t)x - x = \int_0^t T(s)y \, ds \quad \forall t > 0, \]
so
\[ \lim_{t \to 0^+} t^{-1}[T(t)x - x] = y. \]
Hence x ∈ D(A) and y = Ax.

Proof of (e): Let t ≥ 0 and x ∈ D(A). We have
\[ T(t)Ax = \lim_{h \to 0^+} T(t)\{h^{-1}[T(h)x - x]\} = \lim_{h \to 0^+} h^{-1}[T(h)T(t)x - T(t)x], \]
which shows that T(t)x ∈ D(A) and
\[ T(t)Ax = AT(t)x. \tag{9.2.11} \]
On the other hand,
\[ \lim_{h \to 0^+} h^{-1}[T(t + h)x - T(t)x] = \lim_{h \to 0^+} T(t)\{h^{-1}[T(h)x - x]\} = T(t)Ax. \tag{9.2.12} \]
From (9.2.11) and (9.2.12) we derive
\[ \frac{d^+}{dt} T(t)x = AT(t)x = T(t)Ax. \tag{9.2.13} \]

We have used d⁺/dt to denote the right derivative. To conclude, we need to show that the left derivative of T(t)x exists and equals its right derivative at any t > 0. For 0 < h < t, we have
\[ \big\| -h^{-1}[T(t - h)x - T(t)x] - T(t)Ax \big\| = \big\| T(t - h)\{h^{-1}[T(h)x - x] - T(h)Ax\} \big\| \]
\[ \le M e^{\omega(t - h)} \big\{ \|h^{-1}[T(h)x - x] - Ax\| + \|Ax - T(h)Ax\| \big\}. \]
It follows that for all t > 0 and x ∈ D(A),
\[ \frac{d^-}{dt} T(t)x = T(t)Ax. \tag{9.2.14} \]
Obviously, (e) follows from (9.2.13) and (9.2.14).

Remark 9.11. In fact, if A is the generator of a C0-semigroup {T(t); t ≥ 0} ⊂ L(X), then the subspace Y := ∩_{n=1}^∞ D(A^n) is dense in X, where the operators A^n : D(A^n) → X are inductively defined as follows:
\[ D(A^n) = \{x \in D(A^{n-1});\ A^{n-1}x \in D(A)\}, \qquad A^n x = A(A^{n-1}x) \quad \forall x \in D(A^n), \]
for all n ∈ N, n ≥ 2. Now, for any x ∈ X and φ ∈ C_0^∞(R), with supp φ ⊂ (0, +∞), define
\[ x(\varphi) = \int_0^\infty \varphi(t) T(t)x \, dt. \]
For h > 0 we have
\[ \frac{1}{h}[T(h) - I]\, x(\varphi) = \frac{1}{h} \int_0^\infty \varphi(t) T(t + h)x \, dt - \frac{1}{h} \int_0^\infty \varphi(t) T(t)x \, dt \]
\[ = \frac{1}{h} \int_h^\infty \varphi(t - h) T(t)x \, dt - \frac{1}{h} \int_0^\infty \varphi(t) T(t)x \, dt \]
\[ = \int_0^\infty \frac{\varphi(t - h) - \varphi(t)}{h}\, T(t)x \, dt, \]
which converges to −x(φ') as h → 0+. Hence x(φ) ∈ D(A) and Ax(φ) = −x(φ'). We infer by induction that x(φ) ∈ D(A^n) and A^n x(φ) = (−1)^n x(φ^{(n)}) for all n ∈ N, hence x(φ) ∈ Y.

Now, let us prove that any x ∈ X can be approximated by x(φ) for suitable φ's (see [49, p. 44]). If ω ∈ C_0^∞(R) is the usual test function with supp ω = [−1, +1] and ∫_{−1}^{+1} ω(t) dt = 1, define the mollifier
\[ \varphi_\varepsilon(t) = \frac{1}{\varepsilon}\, \omega\Big(\frac{t}{\varepsilon} - 2\Big) \quad \forall t \in \mathbb{R},\ \varepsilon > 0. \]
We have
\[ \|x(\varphi_\varepsilon) - x\| = \Big\| \int_\varepsilon^{3\varepsilon} \varphi_\varepsilon(t)[T(t)x - x] \, dt \Big\| \le \int_\varepsilon^{3\varepsilon} \varphi_\varepsilon(t) \|T(t)x - x\| \, dt \]
\[ \le \Big( \int_\varepsilon^{3\varepsilon} \varphi_\varepsilon(t) \, dt \Big) \sup_{t \in [\varepsilon, 3\varepsilon]} \|T(t)x - x\| = \sup_{t \in [\varepsilon, 3\varepsilon]} \|T(t)x - x\|. \]
Therefore
\[ \lim_{\varepsilon \to 0^+} \|x(\varphi_\varepsilon) - x\| = 0. \]

Theorem 9.12. If two C0-semigroups have the same infinitesimal generator, then they coincide.

Proof. Let A be the common generator of two C0-semigroups, say {T(t); t ≥ 0} and {S(t); t ≥ 0}. For any t > 0 and x ∈ D(A) we have (see Theorem 9.10, (e))
\[ \frac{d}{ds}[T(t - s)S(s)x] = -T(t - s)AS(s)x + T(t - s)AS(s)x = 0 \quad \forall\, 0 \le s < t. \]
Hence, for all t > 0 and x ∈ D(A), the function s → T(t − s)S(s)x is constant on the interval [0, t]. In particular, T(t)x = S(t)x on D(A) for all t ≥ 0. This concludes the proof since \overline{D(A)} = X.

Remark 9.13. Property (e) of Theorem 9.10 says that for every x ∈ D(A) the function u(t) = T(t)x is continuously differentiable on [0, ∞) and satisfies the Cauchy problem
\[ u'(t) = Au(t), \quad t \ge 0; \qquad u(0) = x. \tag{CP} \]
This u, which is a C¹-solution of problem (CP) (hence a classical solution on every bounded interval [0, r] in the sense of Definition 9.44 below) is unique. Indeed, if ũ is also a C¹-solution of problem (CP), then for any t > 0 we have
\[ \frac{d}{ds}\, T(t - s)\tilde{u}(s) = -T(t - s)A\tilde{u}(s) + T(t - s)\tilde{u}'(s) = 0 \quad \forall s \in (0, t), \]
hence s → T(t − s)ũ(s) is a constant function on [0, t]. In particular, its values at s = 0 and s = t coincide: ũ(t) = T(t)ũ(0) = T(t)x, which proves that the solution of (CP) is unique and is given by u(t) = T(t)x, t ≥ 0.

Now, if x ∈ X \ D(A), then the function u(t) = T(t)x satisfies the initial condition u(0) = x, but is no longer differentiable (see Sect. 9.5 below), so it cannot satisfy the Cauchy problem above in a classical sense. However, u can be regarded as a generalized solution (or mild solution, as it will be called later, see Sect. 9.11) since the initial condition is still satisfied, u(0) = x, and there exists a sequence (u_n) of C¹-solutions of equation (CP)_1, such that u_n → u in C([0, r]; X) for all r > 0. Indeed, one can choose a sequence (x_n) in D(A), such that x_n → x (cf. Theorem 9.10, (c)), and obviously u_n(t) = T(t)x_n are all C¹-solutions satisfying the required condition:
\[ \|T(t)x_n - T(t)x\| \le \|T(t)\| \cdot \|x_n - x\| \le M e^{\omega r} \|x_n - x\| \]
for all t ∈ [0, r]. Clearly, the definition of the generalized solution is independent of the choice of the sequence (u_n) (or (x_n = u_n(0))).

It is worth pointing out that in the discussion above A was assumed to be the infinitesimal generator of a C0-semigroup. Now, given a linear operator A we want to know the conditions on A ensuring the existence of a C0-semigroup whose generator is precisely A. This will allow us to solve Cauchy problems like (CP) above. From Theorem 9.10 we know that such an A necessarily has to be densely defined and closed. The complete answer will be provided later.

9.3 Uniformly Continuous Semigroups

Uniformly continuous semigroups have been defined before. We have also seen that for any A ∈ L(X), the family {T(t) = e^{tA}; t ≥ 0} is a uniformly continuous semigroup whose generator is A. According to Theorem 9.12, this is the unique C0-semigroup, hence the unique uniformly continuous semigroup, having A as its generator. The next result shows that, in fact, the class of uniformly continuous semigroups reduces to {{e^{tA}; t ≥ 0}; A ∈ L(X)}.

Theorem 9.14. Let {T(t); t ≥ 0} ⊂ L(X) be a uniformly continuous semigroup. If A is its infinitesimal generator, then A ∈ L(X).

Proof. Since
\[ \lim_{t \to 0^+} \Big\| I - \frac{1}{t} \int_0^t T(s) \, ds \Big\| = 0, \]
there exists a t_0 > 0 such that
\[ \|I - B\| < 1, \quad \text{where } B = \frac{1}{t_0} \int_0^{t_0} T(s) \, ds. \]
Therefore, B is invertible and B^{-1} = [I − (I − B)]^{-1} ∈ L(X). Now, for all h > 0, we have
\[ \frac{1}{h}[T(h) - I]B = \frac{1}{h t_0} \Big( \int_0^{t_0} T(s + h) \, ds - \int_0^{t_0} T(s) \, ds \Big) = \frac{1}{t_0} \Big( \frac{1}{h} \int_{t_0}^{t_0 + h} T(s) \, ds - \frac{1}{h} \int_0^h T(s) \, ds \Big). \]
Therefore, there exists
\[ \lim_{h \to 0^+} \frac{1}{h}[T(h) - I]B = \frac{1}{t_0}[T(t_0) - I], \tag{9.3.15} \]
with respect to the topology of L(X). Since the generator of {T(t); t ≥ 0} is A, it follows from (9.3.15) that
\[ AB = \frac{1}{t_0}[T(t_0) - I]. \tag{9.3.16} \]
Since B is invertible and B^{-1} ∈ L(X), we infer from (9.3.16)
\[ A = \frac{1}{t_0}[T(t_0) - I]B^{-1} \in L(X). \]

In fact, every uniformly continuous semigroup {e^{tA}; t ≥ 0}, A ∈ L(X), can naturally be extended to the group {e^{tA}; t ∈ R} (see the next section).
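The formula A = t_0^{-1}[T(t_0) − I]B^{-1} from the proof of Theorem 9.14 is constructive: it recovers the generator from the semigroup alone. A minimal numerical sketch follows, assuming (for illustration only) the rotation semigroup on R², the value t_0 = 0.5, and Simpson's rule for the integral defining B; the helper names are hypothetical.

```python
import math

def T(t):
    """Rotation semigroup e^{tA} with A = [[0, 1], [-1, 0]]."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]

def simpson_matrix(F, a, b, n=200):
    """Composite Simpson rule applied entrywise to a 2x2 matrix function (n even)."""
    h = (b - a) / n
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        w = 1.0 if k in (0, n) else (4.0 if k % 2 else 2.0)
        Fk = F(a + k * h)
        for i in range(2):
            for j in range(2):
                acc[i][j] += w * Fk[i][j]
    return [[v * h / 3.0 for v in row] for row in acc]

def inv2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

def mul2(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def delta(i, j):
    return 1.0 if i == j else 0.0

t0 = 0.5
# B = (1 / t0) * integral_0^{t0} T(s) ds
B = [[v / t0 for v in row] for row in simpson_matrix(T, 0.0, t0)]

# ||I - B|| < 1 (max-row-sum norm) guarantees that B is invertible.
I_minus_B_norm = max(sum(abs(delta(i, j) - B[i][j]) for j in range(2)) for i in range(2))

# A = (1 / t0) [T(t0) - I] B^{-1}, cf. (9.3.16).
Tt0 = T(t0)
D = [[(Tt0[i][j] - delta(i, j)) / t0 for j in range(2)] for i in range(2)]
A_recovered = mul2(D, inv2(B))

print(I_minus_B_norm, A_recovered)
```

Only values of T on [0, t_0] enter the computation, which mirrors the fact that the generator is determined by the behaviour of the semigroup near t = 0.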


Remark 9.15. Let {T(t); t ≥ 0} ⊂ L(X) be a C0-semigroup whose infinitesimal generator A : D(A) ⊂ X → X is bounded, i.e., there exists a constant c > 0 such that ‖Ax‖ ≤ c‖x‖ for all x ∈ D(A). Then, D(A) = X, A ∈ L(X) and so the semigroup is in fact uniformly continuous: T(t) = e^{tA}, t ≥ 0. Indeed, since \overline{D(A)} = X, A has an extension Ã ∈ L(X). Denote by {T̃(t) = e^{tÃ}; t ≥ 0} the (uniformly continuous) semigroup with generator Ã. For an arbitrary t > 0 and x ∈ D(A), we have
\[ \frac{d}{ds}[\tilde{T}(t - s)T(s)x] = -\tilde{T}(t - s)\tilde{A}T(s)x + \tilde{T}(t - s)AT(s)x = 0 \quad \forall s \in (0, t), \]
since T(s)x ∈ D(A) ⊂ D(Ã) = X and (d/ds)T(s)x = AT(s)x for all s ∈ (0, t). It follows that the function s → T̃(t − s)T(s)x is constant on [0, t], and hence T̃(t)x = T(t)x for all x ∈ D(A), which shows that T̃(t)x = T(t)x for all x ∈ X. Therefore, A coincides with Ã and the assertion follows.

9.4 Groups of Linear Operators. Definitions and Link to Operator Semigroups

Definition 9.16. A family {G(t); t ∈ R} ⊂ L(X) is called a group if
(j) G(0) = I (the identity operator on X);
(jj) G(t + s) = G(t)G(s) for all t, s ∈ R (the group property).
If, in addition,
(jjj) lim_{t→0} ‖G(t)x − x‖ = 0 for all x ∈ X,
then {G(t); t ∈ R} is called a C0-group (or a group of class C0). The infinitesimal generator A of a group {G(t); t ∈ R} is defined by
\[ Ax = \lim_{h \to 0} \frac{1}{h}[G(h)x - x] \quad \forall x \in D(A), \]
where D(A) is the set of all x ∈ X for which the limit above exists. If {G(t); t ∈ R} satisfies conditions (j), (jj) and, in addition,
(jjj)' lim_{t→0} ‖G(t) − I‖ = 0
(which is stronger than (jjj)), then {G(t); t ∈ R} is called a uniformly continuous group.

Remark 9.17. If {G(t); t ∈ R} is a C0-group, then the families {G(t); t ≥ 0} and {G(−t); t ≥ 0} are both C0-semigroups, with generators A and −A, respectively (prove it!). Conversely, if {T_+(t); t ≥ 0}, {T_−(t); t ≥ 0} are C0-semigroups with generators A and −A, respectively, then one can define a C0-group
\[ G(t) = \begin{cases} T_+(t) & \text{if } t \ge 0, \\ T_-(-t) & \text{if } t < 0, \end{cases} \]
having A as its generator. The proof of this assertion relies on the identity
\[ T_+(t)T_-(t) = T_-(t)T_+(t) = I \quad \forall t \ge 0. \tag{9.4.17} \]
Indeed, for any x ∈ D(A) = D(−A) and t ≥ 0, we have (cf. Theorem 9.10, (e))
\[ \frac{d}{dt}\, T_+(t)T_-(t)x = T_+(t)AT_-(t)x - T_+(t)AT_-(t)x = 0, \]
hence t → T_+(t)T_−(t)x is a constant function. Since it takes the value x for t = 0, it follows that
\[ T_+(t)T_-(t)x = x \quad \forall t \ge 0,\ x \in D(A). \tag{9.4.18} \]
We know that \overline{D(A)} = X, therefore (9.4.18) holds for all x ∈ X, i.e., T_+(t)T_−(t) = I ∀t ≥ 0. Similarly, T_−(t)T_+(t) = I ∀t ≥ 0, so (9.4.17) holds true. Identity (9.4.17) shows that T_+(t) and T_−(t) are invertible for all t ≥ 0, being inverse to each other. Thus {G(t); t ∈ R} satisfies the group property (jj). Since (j) and (jjj) are trivially satisfied, we conclude that {G(t); t ∈ R} constructed above is indeed a C0-group, and its generator is A, as claimed.

Note that all the members G(t) of any group are necessarily invertible operators, since G(t)G(−t) = I = G(−t)G(t). The next result shows that invertibility allows one to extend a C0-semigroup to a C0-group.


It is worth pointing out that if {T(t); t ≥ 0} ⊂ L(X) is a semigroup and T(t_0) is a bijection from X to itself (hence T(t_0) is invertible) for some t_0 > 0, then so is T(t) for all t ≥ 0. Indeed, for t ∈ (0, t_0), we have
\[ T(t_0) = T(t)T(t_0 - t) = T(t_0 - t)T(t), \]
which shows that T(t) is bijective. For t > t_0 we write t as t = nt_0 + s, where n ∈ N and 0 ≤ s < t_0 (division with remainder) and so T(t) = T(t_0)^n T(s), which clearly shows that T(t) is also bijective in this case.

Theorem 9.18. Let {T(t); t ≥ 0} be a C0-semigroup and let A denote its infinitesimal generator. If T(t) is a bijection from X to itself for all t > 0 (equivalently, T(t_0) is a bijection for some t_0 > 0), then {T(t)^{-1}; t ≥ 0} is a C0-semigroup with the generator −A, so {G(t); t ∈ R} defined by
\[ G(t) = \begin{cases} T(t) & \text{if } t \ge 0, \\ T(-t)^{-1} & \text{if } t < 0, \end{cases} \]
is a C0-group whose generator is A.

Proof. Denote S(t) = T(t)^{-1}, t ≥ 0. Obviously, S(0) = I and
\[ S(t + s) = [T(s)T(t)]^{-1} = T(t)^{-1}T(s)^{-1} = S(t)S(s) \]
for all t, s ≥ 0. Thus, the family {S(t) = T(t)^{-1}; t ≥ 0} is a semigroup, and {G(t); t ∈ R} defined in the statement above is a group. Now, let us prove that the semigroup {S(t) = G(−t); t ≥ 0} satisfies condition (iii) of Definition 9.1. Let x ∈ X and s > 1. Denote y := T(s)^{-1}x. For 0 < t < 1, we have
\[ \|S(t)x - x\| = \|G(-t)x - x\| = \|G(-t)G(s)y - T(s)y\| = \|T(s - t)y - T(s)y\| \to 0 \]
as t → 0+, since t → T(t)y is continuous on [0, ∞). Therefore {S(t); t ≥ 0} satisfies condition (iii) as claimed, i.e., it is a C0-semigroup. Let B be the infinitesimal generator of {S(t) = T(t)^{-1}; t ≥ 0}. For x ∈ D(A) we have
\[ \lim_{h \to 0^+} \Big\| \frac{1}{h}[x - T(h)x] + Ax \Big\| = 0. \]
This implies that
\[ \lim_{h \to 0^+} \Big\| S(h)\Big\{ \frac{1}{h}[x - T(h)x] + Ax \Big\} \Big\| = 0, \]
since ‖S(h)‖ ≤ M_1 e^{ω_1 h} for some M_1 ≥ 1 and ω_1 ∈ R (cf. Theorem 9.7, (a)). Therefore,
\[ \lim_{h \to 0^+} h^{-1}[T(h)^{-1}x - x] = -Ax, \]
i.e., D(A) ⊂ D(B) and Bx = −Ax ∀x ∈ D(A). Since T(t) = [T(t)^{-1}]^{-1}, t ≥ 0, we also have D(B) ⊂ D(A). Hence, D(A) = D(B) and Bx = −Ax ∀x ∈ D(A), i.e., B = −A.

Remark 9.19. Let {G(t); t ∈ R} ⊂ L(X) be a group. If for all x ∈ X the function t → G(t)x is continuous from the right (or from the left) at some point t = t_0 ∈ R, then there exist constants M ≥ 1 and ω ∈ R such that
\[ \|G(t)\| \le M e^{\omega |t|} \quad \forall t \in \mathbb{R}. \tag{9.4.19} \]
This follows by the Uniform Boundedness Principle (see the proof of Theorem 9.7). Moreover, using this estimate and the invertibility of every G(t), one can easily see that t → G(t)x is continuous on R; even more, the function (t, x) → G(t)x is continuous from R × X to X.

Remark 9.20. If A ∈ L(X), then {G(t) = e^{tA}; t ∈ R} is a uniformly continuous group. In fact, it follows from the discussion above that the class of uniformly continuous groups is precisely {{e^{tA}; t ∈ R}; A ∈ L(X)}.
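The extension in Theorem 9.18 can be tried out on a matrix semigroup, where every T(t) is automatically invertible. The sketch below is an illustration under our own assumptions: A = [[−1, 4], [0, −1]] with the closed form e^{tA} = e^{−t}[[1, 4t], [0, 1]], hypothetical helper names, and a finite-difference quotient in place of the limit defining the generator of {T(t)^{-1}}.

```python
import math

def T(t):
    """e^{tA} for A = [[-1, 4], [0, -1]], defined for t >= 0 only."""
    if t < 0:
        raise ValueError("the semigroup is only defined for t >= 0")
    e = math.exp(-t)
    return [[e, 4.0 * t * e], [0.0, e]]

def inv2(P):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

def mul2(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def G(t):
    """Group extension from Theorem 9.18: T(t) for t >= 0, T(-t)^{-1} for t < 0."""
    return T(t) if t >= 0 else inv2(T(-t))

def dist(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

# Group property (jj) for times of mixed sign.
pairs = [(0.8, -0.3), (-1.2, 0.5), (-0.4, -0.9), (0.6, 0.7)]
group_gap = max(dist(G(t + s), mul2(G(t), G(s))) for t, s in pairs)

# Generator of S(t) = T(t)^{-1} = G(-t): the difference quotient should be close to -A.
h = 1e-6
S_h = G(-h)
B_approx = [[(S_h[i][j] - (1.0 if i == j else 0.0)) / h for j in range(2)]
            for i in range(2)]
minus_A = [[1.0, -4.0], [0.0, 1.0]]
generator_gap = dist(B_approx, minus_A)

print(group_gap, generator_gap)
```

In this example G(t) for t < 0 coincides with the same closed formula evaluated at negative t, in agreement with Remark 9.20.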

9.5 Translation Semigroups

In this section we present the first examples of C0-semigroups which are not uniformly continuous. Let X be the space of all functions f : [0, ∞) → R which are uniformly continuous and bounded. The space X is a real Banach space with respect to the norm
\[ \|f\|_\infty = \sup_{t \ge 0} |f(t)|. \]
For each t ≥ 0 define T(t) : X → X by
\[ [T(t)f](s) = f(t + s), \quad s \in [0, \infty),\ f \in X. \]
It is easily seen that the family {T(t); t ≥ 0} is a C0-semigroup. Its infinitesimal generator is defined by
\[ D(A) = \{f \in X;\ f \text{ is differentiable on } [0, \infty) \text{ and } f' \in X\}, \tag{9.5.20} \]
\[ Af = f' \quad \forall f \in D(A). \tag{9.5.21} \]
Indeed, if f ∈ X, f is differentiable on [0, ∞), and f' ∈ X, then for all h > 0 and s ≥ 0
\[ \big(h^{-1}[T(h)f - f]\big)(s) = h^{-1}[f(s + h) - f(s)] = f'(\theta), \]
for some θ ∈ (s, s + h), so
\[ \big| \big(h^{-1}[T(h)f - f]\big)(s) - f'(s) \big| = |f'(\theta) - f'(s)| \to 0 \]
as h → 0+, uniformly in s (since f' is uniformly continuous). Therefore, f ∈ D(A) and Af = f'. To conclude the proof, we need to show that (9.5.20) holds true, i.e., the converse inclusion relation is valid. To this end, let f ∈ D(A), which means there exists
\[ \lim_{h \to 0^+} h^{-1}[T(h)f - f] = \lim_{h \to 0^+} h^{-1}[f(\cdot + h) - f(\cdot)] = f'_+ \in X, \]
where f'_+ denotes the right derivative of f. It remains to prove that f is differentiable on [0, ∞) so that f'_+ = f'. For an arbitrary ε > 0 define
\[ g(t) = f(t) - f(0) - \int_0^t f'_+(s) \, ds - \varepsilon t. \]
We have g(0) = 0 and
\[ g'_+(t) = -\varepsilon < 0 \quad \forall t \ge 0, \]
which implies g(t) ≤ 0, which in turn means
\[ f(t) \le f(0) + \int_0^t f'_+(s) \, ds \]
for all t ≥ 0 (since ε was arbitrarily chosen). Similarly, replacing −ε by +ε, we obtain the converse inequality, so
\[ f(t) = f(0) + \int_0^t f'_+(s) \, ds \quad \forall t \ge 0, \]

which shows that f is indeed differentiable on [0, ∞) and so f' = f'_+ ∈ X, as claimed.

The semigroup defined above is called a translation semigroup. Obviously,
\[ \|T(t)f\|_\infty \le \|f\|_\infty \quad \forall t \ge 0, \]
which shows that ‖T(t)‖ ≤ 1 for all t ≥ 0, i.e., the estimate in Theorem 9.7 holds with M = 1 and ω = 0. It is worth pointing out that A is not a member of L(X) in this case, so {T(t); t ≥ 0} is not a uniformly continuous semigroup (see Theorem 9.14). This confirms the fact that the unit sphere of X is not equicontinuous (equivalently, condition (iii)' is not valid).

Remark 9.21. If f ∈ D(A) (see (9.5.20)), then u(t) = u(t, ·) = T(t)f(·) = f(t + ·) satisfies the Cauchy problem in X
\[ u'(t) = Au(t) \quad \forall t \ge 0, \qquad u(0) = f, \]
i.e.,
\[ \frac{\partial u}{\partial t}(t, s) = \frac{\partial u}{\partial s}(t, s), \quad t, s \ge 0, \qquad u(0, s) = f(s), \quad s \ge 0. \]
If f ∈ X is not differentiable, then u(t, s) = f(t + s) does not satisfy the above partial differential equation in a classical sense; it has to be interpreted as a generalized solution of the Cauchy problem above.

If X is replaced by the space of all functions f : R → R which are uniformly continuous and bounded, with the norm
\[ \|f\|_\infty = \sup_{t \in \mathbb{R}} |f(t)|, \]
then one can define similarly a semigroup of translations, T(t) : X → X, t ≥ 0,
\[ [T(t)f](s) = f(t + s) \quad \forall s \in \mathbb{R},\ f \in X. \]
In this case, the family {T(t); t ≥ 0} is again a C0-semigroup, with ‖T(t)‖ = 1 for all t ≥ 0, and its infinitesimal generator A is given by
\[ D(A) = \{f \in X;\ f \text{ is differentiable on } \mathbb{R} \text{ and } f' \in X\}, \qquad Af = f' \quad \forall f \in D(A). \]
It is worth mentioning that this C0-semigroup can be extended to a C0-group {G(t); t ∈ R} defined by
\[ [G(t)f](s) = f(t + s) \quad \forall t, s \in \mathbb{R},\ f \in X. \]
This is not a uniformly continuous group, since its infinitesimal generator does not belong to L(X).
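The difference between strong continuity (iii) and uniform continuity (iii)' for the translation semigroup can be observed numerically. The sketch below is only an illustration under our own assumptions: suprema are approximated by maxima over a finite grid on [0, 5], and the test functions f(s) = sin s and f_n(s) = sin(ns) with n = π/h are our choice. For fixed f the quantity ‖T(h)f − f‖_∞ tends to 0, while for each h the unit-norm function f_n gives ‖T(h)f_n − f_n‖_∞ = 2, so ‖T(h) − I‖ does not tend to 0.

```python
import math

def sup_shift_distance(f, h, s_max=5.0, points=20001):
    """Grid approximation of ||T(h)f - f||_inf = sup_s |f(s + h) - f(s)| on [0, s_max]."""
    step = s_max / (points - 1)
    return max(abs(f(k * step + h) - f(k * step)) for k in range(points))

steps = (1e-1, 1e-2, 1e-3)

# Strong continuity: for the fixed function f = sin the distance shrinks with h.
strong = {h: sup_shift_distance(math.sin, h) for h in steps}

def worst_case(h):
    """Distance for f_n(s) = sin(n s) with n = pi / h, a function of sup-norm 1."""
    n = math.pi / h
    return sup_shift_distance(lambda s: math.sin(n * s), h)

# No uniform continuity: for every h some unit-norm f is moved by (about) 2.
uniform = {h: worst_case(h) for h in steps}

for h in steps:
    print(h, strong[h], uniform[h])
```

The functions f_n oscillate faster and faster as h decreases, which reflects the lack of equicontinuity of the unit sphere of X mentioned above.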

9.6 The Hille–Yosida Generation Theorem

Let X be a Banach space and let A : D(A) ⊂ X → X be a linear closed operator, not necessarily bounded. The set
\[ \rho(A) = \{\lambda \in K;\ \lambda I - A \text{ is a bijective operator from } D(A) \text{ to } X\} \tag{9.6.22} \]
is called the resolvent set of A. If ρ(A) is nonempty, then, for λ ∈ ρ(A), denote
\[ R(\lambda, A) = (\lambda I - A)^{-1}, \tag{9.6.23} \]
which is called the resolvent of A. Since A is a closed operator, so is R(λ, A) for all λ ∈ ρ(A). If we also take into account the fact that D(R(λ, A)) = X, we infer that R(λ, A) ∈ L(X) for all λ ∈ ρ(A) (cf. Theorem 4.10 (Closed Graph Theorem)).

Now, let us state a central result in the theory of semigroups of linear operators, which belongs to E. Hille and K. Yosida.

Theorem 9.22. A linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-semigroup of contractions {T(t); t ≥ 0} (i.e., ‖T(t)‖ ≤ 1 ∀t ≥ 0) if and only if
(k) \overline{D(A)} = X and A is closed;
(kk) (0, ∞) ⊂ ρ(A) and
\[ \|R(\lambda, A)\| \le \frac{1}{\lambda} \quad \forall \lambda > 0. \]


Proof. Necessity: If A is the generator of a C0-semigroup, then the two conditions of (k) are fulfilled (cf. Theorem 9.10). It remains to prove (kk), under the assumption that {T(t); t ≥ 0} is a C0-semigroup of contractions. To this purpose, define
\[ R_\lambda x = \int_0^\infty e^{-\lambda t} T(t)x \, dt \quad \forall \lambda > 0,\ x \in X. \tag{9.6.24} \]
Note that R_λ is well defined, since
\[ \|e^{-\lambda t} T(t)x\| \le e^{-\lambda t} \|x\| \quad \forall t \ge 0. \]
Furthermore, R_λ ∈ L(X) and
\[ \|R_\lambda x\| \le \int_0^\infty e^{-\lambda t} \|T(t)\| \cdot \|x\| \, dt \le \Big( \int_0^\infty e^{-\lambda t} \, dt \Big) \|x\| = \frac{1}{\lambda} \|x\| \quad \forall x \in X,\ \lambda > 0, \]
which implies that
\[ \|R_\lambda\| \le \frac{1}{\lambda} \quad \forall \lambda > 0. \tag{9.6.25} \]
Let us prove that for all λ > 0 and x ∈ X, R_λ x ∈ D(A). For all h > 0 we have
\[ h^{-1}[T(h) - I]R_\lambda x = h^{-1} \Big( \int_0^\infty e^{-\lambda t} T(t + h)x \, dt - \int_0^\infty e^{-\lambda t} T(t)x \, dt \Big) \]
\[ = \frac{e^{\lambda h} - 1}{h}\, R_\lambda x - \frac{e^{\lambda h}}{h} \int_0^h e^{-\lambda \tau} T(\tau)x \, d\tau. \]
Observe that the right-hand side of the last equality converges to λR_λ x − x as h → 0+. Therefore, R_λ x ∈ D(A) and AR_λ x = λR_λ x − x, i.e.,
\[ (\lambda I - A)R_\lambda = I \quad \forall \lambda > 0. \tag{9.6.26} \]
On the other hand, for all x ∈ D(A) and t ≥ 0, T(t)x ∈ D(A) (cf. Theorem 9.10, (e)) and
\[ AR_\lambda x = \lim_{h \to 0^+} \int_0^\infty e^{-\lambda t} h^{-1}[T(t + h)x - T(t)x] \, dt = \int_0^\infty e^{-\lambda t} T(t)Ax \, dt = R_\lambda Ax, \]
hence (see also (9.6.26))
\[ R_\lambda(\lambda I - A) = I_{D(A)} \quad \forall \lambda > 0, \tag{9.6.27} \]
where I_{D(A)} is the identity operator on D(A). From (9.6.26) and (9.6.27) we infer that λI − A is a bijective operator from D(A) to X and R_λ = (λI − A)^{-1} ∀λ > 0. Therefore, (0, ∞) ⊂ ρ(A) and R_λ = R(λ, A) ∀λ > 0, so (9.6.25) implies that
\[ \|R(\lambda, A)\| \le \frac{1}{\lambda} \quad \forall \lambda > 0. \]
Thus the proof of necessity is complete.
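Formula (9.6.24) says that the resolvent is the Laplace transform of the semigroup. This can be checked numerically in a finite-dimensional example. The sketch below rests on our own assumptions: the symmetric matrix A = [[−2, 1], [1, −2]] (so that e^{tA} is a contraction in the Euclidean norm), its closed-form exponential, a Simpson quadrature truncated at t = 60, and hypothetical helper names.

```python
import math

def T(t):
    """e^{tA} for the symmetric matrix A = [[-2, 1], [1, -2]] (eigenvalues -1 and -3)."""
    p, q = 0.5 * math.exp(-t), 0.5 * math.exp(-3.0 * t)
    return [[p + q, p - q], [p - q, p + q]]

def laplace_resolvent(lam, t_max=60.0, n=60000):
    """Simpson approximation of R_lam = int_0^inf e^{-lam t} T(t) dt, truncated at t_max."""
    h = t_max / n
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        w = 1.0 if k in (0, n) else (4.0 if k % 2 else 2.0)
        t = k * h
        Tt, decay = T(t), math.exp(-lam * t)
        for i in range(2):
            for j in range(2):
                acc[i][j] += w * decay * Tt[i][j]
    return [[v * h / 3.0 for v in row] for row in acc]

def exact_resolvent(lam):
    """(lam I - A)^{-1} from the 2x2 inverse formula."""
    d = (lam + 2.0) ** 2 - 1.0
    return [[(lam + 2.0) / d, 1.0 / d], [1.0 / d, (lam + 2.0) / d]]

def spectral_norm_sym(P):
    """Euclidean operator norm of a symmetric matrix of the form [[a, b], [b, a]]."""
    a, b = P[0][0], P[0][1]
    return max(abs(a + b), abs(a - b))

lams = (0.5, 1.0, 4.0)
gaps, norms = {}, {}
for lam in lams:
    R_num, R_ex = laplace_resolvent(lam), exact_resolvent(lam)
    gaps[lam] = max(abs(R_num[i][j] - R_ex[i][j]) for i in range(2) for j in range(2))
    norms[lam] = spectral_norm_sym(R_num)

print(gaps, norms)
```

For this matrix the norm of the resolvent equals 1/(λ + 1), which is consistent with the bound 1/λ in (9.6.25).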

Sufficiency: Assume that both (k) and (kk) hold. For the convenience of the reader, the proof will be divided into several steps.

Step 1: lim_{λ→∞} λR(λ, A)x = x ∀x ∈ X.

If x ∈ D(A), then, according to (kk), we have
\[ \|\lambda R(\lambda, A)x - x\| = \|R(\lambda, A)Ax\| \le \frac{1}{\lambda} \|Ax\|, \]
which shows that
\[ \lim_{\lambda \to \infty} \lambda R(\lambda, A)x = x \quad \forall x \in D(A). \tag{9.6.28} \]
Now, if x ∈ X, according to (k), there exists a sequence (x_n) in D(A) such that x_n → x. Since
\[ \|\lambda R(\lambda, A)x - x\| \le \|\lambda R(\lambda, A)(x - x_n)\| + \|\lambda R(\lambda, A)x_n - x_n\| + \|x_n - x\| \le \|\lambda R(\lambda, A)x_n - x_n\| + 2\|x_n - x\|, \]
we have (see (9.6.28))
\[ \limsup_{\lambda \to \infty} \|\lambda R(\lambda, A)x - x\| \le 2\|x_n - x\| \]
for every n, and letting n → ∞ concludes Step 1.

Step 2: Define A_λ := λAR(λ, A), λ > 0 (the Yosida approximation of A); then, for all λ > 0, A_λ ∈ L(X), and
\[ \lim_{\lambda \to \infty} A_\lambda x = Ax \quad \forall x \in D(A). \tag{9.6.29} \]
Indeed, since
\[ A_\lambda x = \lambda^2 R(\lambda, A)x - \lambda x \quad \forall x \in X,\ \lambda > 0, \]
we have A_λ ∈ L(X) and, if x ∈ D(A),
\[ A_\lambda x = \lambda R(\lambda, A)Ax \quad \forall \lambda > 0. \]
According to Step 1, this implies (9.6.29), thus the proof of Step 2 is complete.

Step 3: For all t ≥ 0, x ∈ X, and λ, ν > 0, we have
\[ \|e^{tA_\lambda}x - e^{tA_\nu}x\| \le t\|A_\lambda x - A_\nu x\|. \tag{9.6.30} \]
First of all, note that for all t ≥ 0 and λ > 0,
\[ \|e^{tA_\lambda}\| = e^{-\lambda t} \|e^{t\lambda^2 R(\lambda, A)}\| \le e^{-\lambda t} \cdot e^{t\lambda^2 \|R(\lambda, A)\|} \le 1. \]
It is also easily seen that e^{tA_λ}, e^{tA_ν}, A_λ, A_ν commute with each other. Using this information, we infer that
\[ \|e^{tA_\lambda}x - e^{tA_\nu}x\| \le \Big\| \int_0^1 \frac{d}{ds}\big[ e^{tsA_\lambda} e^{t(1-s)A_\nu} x \big] \, ds \Big\| \le \int_0^1 t\, \big\| e^{tsA_\lambda} e^{t(1-s)A_\nu} (A_\lambda x - A_\nu x) \big\| \, ds \le t\|A_\lambda x - A_\nu x\|, \tag{9.6.31} \]
as claimed.

Step 4: The limit lim_{λ→∞} e^{tA_λ}x =: T(t)x, t ≥ 0, x ∈ X exists, and {T(t); t ≥ 0} is a C0-semigroup of contractions having A as its generator.

First of all, according to Steps 2 and 3, the above limit exists for each x ∈ D(A), uniformly on compact subintervals of [0, ∞), thus t → T(t)x is a continuous function on [0, ∞). It is also easily seen that
\[ T(0)x = x, \quad \|T(t)x\| \le \|x\| \quad \forall x \in D(A),\ t \ge 0. \tag{9.6.32} \]
Obviously, T(t) extends to \overline{D(A)} = X as a bounded (continuous) operator and the relations (9.6.32) are satisfied for all x ∈ X. Moreover, t → T(t)x is continuous on [0, ∞) for all x ∈ X. Indeed, if (x_n) is a sequence in D(A) converging to x, then
\[ \|T(t)x_n - T(t)x_m\| = \|T(t)(x_n - x_m)\| \le \|x_n - x_m\|, \]
hence T(t)x_n → T(t)x uniformly and so the function t → T(t)x is indeed continuous on [0, ∞). On the other hand,
\[ \|T(t)x - e^{tA_\lambda}x\| \le \|T(t)x - T(t)x_n\| + \|T(t)x_n - e^{tA_\lambda}x_n\| + \|e^{tA_\lambda}(x_n - x)\| \le 2\|x - x_n\| + \|T(t)x_n - e^{tA_\lambda}x_n\|, \]
which implies that
\[ T(t)x = \lim_{\lambda \to \infty} e^{tA_\lambda}x \quad \forall x \in X, \]
uniformly on compact subintervals of [0, ∞). Since
\[ e^{(t+s)A_\lambda}x = e^{tA_\lambda} e^{sA_\lambda}x \quad \forall \lambda > 0,\ t, s \ge 0,\ x \in X, \]
we have
\[ T(t + s)x = T(t)T(s)x \quad \forall t, s \ge 0,\ x \in X. \]
Thus we have already proved that {T(t); t ≥ 0} is a C0-semigroup of contractions, and all we have to prove next is that its generator, say B, coincides with the given operator A. If x ∈ D(A), we have
\[ T(t)x - x = \lim_{\lambda \to \infty} \big( e^{tA_\lambda}x - x \big) = \lim_{\lambda \to \infty} \int_0^t e^{sA_\lambda} A_\lambda x \, ds = \int_0^t T(s)Ax \, ds, \tag{9.6.33} \]
since e^{sA_λ}A_λ x → T(s)Ax uniformly on bounded subintervals of [0, ∞), as λ → ∞. From (9.6.33) we easily see that D(A) ⊂ D(B) and Bx = Ax for all x ∈ D(A). Now, by assumption 1 ∈ ρ(A). On the other hand, according to the forward implication, we also have 1 ∈ ρ(B) (since B is the generator of a C0-semigroup of contractions). So I − A and I − B are bijections from D(A) and D(B), respectively, to X. Since Ax = Bx for all x ∈ D(A), it follows that D(A) = D(B) and A = B.
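The role of the Yosida approximation in Steps 2 to 4 can be illustrated in the simplest possible setting. The sketch below assumes X = R and the scalar "operator" a = −2, which generates the contraction semigroup T(t) = e^{at}; of course this A is bounded, so the example only illustrates the formulas, not the difficulty of the unbounded case. All names are our own.

```python
import math

a = -2.0   # generator of the contraction semigroup T(t) = e^{a t} on X = R

def resolvent(lam):
    """R(lam, a) = (lam - a)^{-1} for lam > 0."""
    return 1.0 / (lam - a)

def yosida(lam):
    """Yosida approximation A_lam = lam * a * R(lam, a)."""
    return lam * a * resolvent(lam)

t = 1.5
lams = [1.0, 10.0, 100.0, 1000.0, 10000.0]

# Step 2: A_lam -> a as lam -> infinity (here |A_lam - a| = a^2 / (lam - a)).
generator_errors = [abs(yosida(lam) - a) for lam in lams]

# Step 4: e^{t A_lam} -> T(t) = e^{t a}.
semigroup_errors = [abs(math.exp(t * yosida(lam)) - math.exp(t * a)) for lam in lams]

# Step 3, estimate (9.6.30) with x = 1.
step3_ok = all(
    abs(math.exp(t * yosida(l)) - math.exp(t * yosida(v)))
    <= t * abs(yosida(l) - yosida(v)) + 1e-15
    for l in lams for v in lams
)

# Alternative expression A_lam = lam^2 R(lam, a) - lam used in Step 2.
identity_gap = max(abs(lam ** 2 * resolvent(lam) - lam - yosida(lam)) for lam in lams)

print(generator_errors, semigroup_errors, step3_ok, identity_gap)
```

Each A_λ is bounded and nonpositive here, so e^{tA_λ} is a contraction for every λ, as in the first estimate of Step 3.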

9.7 The Lumer–Phillips Theorem

In this section we discuss another result which also provides necessary and sufficient conditions for a linear operator to generate a C0-semigroup of contractions. This result belongs to Lumer and Phillips² and is useful in applications. Before stating this result we need the following definition.

Definition 9.23. A linear operator A : D(A) ⊂ X → X is said to be dissipative if
\[ \lambda \|x\| \le \|\lambda x - Ax\| \quad \forall \lambda > 0,\ x \in D(A). \tag{9.7.34} \]
If in addition R(λI − A) = X for all λ > 0, then A is called m-dissipative.

Remark 9.24. If A is a dissipative linear operator, then it is m-dissipative if and only if there exists a λ_0 > 0 such that R(λ_0 I − A) = X. Indeed, by the dissipativity condition (9.7.34) it follows that λ_0 I − A is a bijection between D(A) and X, (λ_0 I − A)^{-1} ∈ L(X) and ‖(λ_0 I − A)^{-1}‖ ≤ 1/λ_0. Using this information and Banach's Fixed Point Theorem it follows easily that R(λI − A) = X for all λ ∈ (0, 2λ_0). Obviously, this interval can be extended indefinitely to the right and so R(λI − A) = X for all λ > 0.

Theorem 9.25 (Lumer–Phillips). A linear operator A : D(A) ⊂ X → X is the infinitesimal generator of a C0-semigroup of contractions if and only if the following conditions hold:
(a) \overline{D(A)} = X, and
(b) A is m-dissipative.

² Günter Lumer, German-born mathematician, 1929–2005; Ralph S. Phillips, American mathematician, 1913–1998.


9 Semigroups of Linear Operators

Proof. Sufficiency: Assume that both (a) and (b) hold. From (b) it follows that for every $\lambda > 0$ we have $\lambda \in \rho(A)$, $R(\lambda, A) \in L(X)$, and $\|R(\lambda, A)\| \le 1/\lambda$ (see the remark above). Also, $A$ is a closed operator since $(\lambda I - A)^{-1} \in L(X)$ for all $\lambda > 0$. It follows by the Hille–Yosida Theorem that $A$ generates a $C_0$-semigroup of contractions.

Necessity: Assume that $A$ is the generator of a $C_0$-semigroup of contractions $\{T(t);\ t \ge 0\}$. According to the Hille–Yosida Theorem, it suffices to show that $A$ is dissipative. Let $x \in D(A)$ and $x^* \in J(x)$, where $J$ is the duality mapping of $X$. We have

$$\operatorname{Re} x^*(Ax) = \lim_{h\to 0^+} \operatorname{Re} x^*\bigl(h^{-1}[T(h)x - x]\bigr) = \lim_{h\to 0^+} h^{-1}\bigl[\operatorname{Re} x^*(T(h)x) - \|x\|^2\bigr] \le 0,$$

since $\operatorname{Re} x^*(T(h)x) \le \|x^*\| \cdot \|T(h)\| \cdot \|x\| \le \|x\|^2$, where Re denotes the real part. Therefore,

$$\operatorname{Re} x^*(\lambda x - Ax) = \lambda\|x\|^2 - \operatorname{Re} x^*(Ax) \ge \lambda\|x\|^2 \quad \forall \lambda > 0,$$

which obviously implies (9.7.34).

Remark 9.26. A linear operator $A : D(A) \subset X \to X$ is dissipative if and only if

$$\forall x \in D(A)\ \ \exists x^* \in J(x) \text{ such that } \operatorname{Re} x^*(Ax) \le 0. \tag{9.7.35}$$

From the proof above we see that (9.7.35) implies (9.7.34). For the proof of the converse implication, see [13, p. 81] or [39, p. 14]. If X is a Hilbert space, then this implication follows easily. If X is a real Hilbert space, then (9.7.35) means that A is negative semideﬁnite: (Ax, x) ≤ 0 ∀x ∈ D(A) (equivalently, −A is positive semideﬁnite or monotone). Note that if X is assumed to be reﬂexive then condition (a) in the Lumer–Phillips Theorem becomes superﬂuous, so we have


Theorem 9.27. Assume $X$ is a reflexive Banach space. Then a linear operator $A : D(A) \subset X \to X$ is the infinitesimal generator of a $C_0$-semigroup of contractions if and only if $A$ is m-dissipative.

Proof. Bearing in mind the Lumer–Phillips Theorem, we need to prove that if $X$ is reflexive and $A$ is m-dissipative (equivalently, $A$ satisfies (9.7.35) and $R(\lambda_0 I - A) = X$ for some $\lambda_0 > 0$), then $\overline{D(A)} = X$. Obviously, $(0, \infty) \subset \rho(A)$, $R(\lambda, A) \in L(X)$, and $\|R(\lambda, A)\| \le 1/\lambda$ for all $\lambda > 0$. Now, for $x \in D(A)$ and $\lambda > 0$ denote $x_\lambda := \lambda R(\lambda, A)x$. As in the proof of the Hille–Yosida Theorem, we can see that

$$\|x_\lambda - x\| \le \frac{1}{\lambda}\|Ax\|,$$

hence $x_\lambda$ converges to $x$ as $\lambda \to +\infty$. (Note that this property cannot be extended, for the time being, to all $x \in X$, as in the proof of the Hille–Yosida Theorem, since $\overline{D(A)} = X$ is now a target, not a hypothesis.) It is also easily seen that for $x \in D(A)$ and $\lambda > 0$

$$\|A_\lambda x\| = \|\lambda R(\lambda, A)Ax\| \le \|Ax\|.$$

Now, by the reflexivity assumption on $X$ we derive the existence of a sequence $\lambda_n \to \infty$ such that $(Ax_{\lambda_n})$ converges weakly. Moreover, since $A$ is m-dissipative, its graph is closed in $X \times X$, hence weakly closed, so we have

$$\lim_{n\to\infty} x^*(Ax_{\lambda_n}) = x^*(Ax) \quad \forall x^* \in X^*, \tag{9.7.36}$$

where $X^*$ denotes the dual of $X$. Now, let $x^* \in X^*$ be such that $x^*(x) = 0$ for all $x \in D(A)$. Since $Ax_\lambda = \lambda R(\lambda, A)Ax \in D(A)$ for all $\lambda > 0$, we derive from (9.7.36) that

$$x^*(Ax) = 0 \quad \forall x \in D(A). \tag{9.7.37}$$

Taking into account (9.7.37) and $R(\lambda_0 I - A) = X$, we infer that $x^* = 0$. Therefore $\overline{D(A)} = X$, as claimed.

Remark 9.28. The reflexivity of $X$ is an essential assumption in Theorem 9.27, as the following counterexample shows: $X = C[0, 1]$ equipped with the usual sup-norm (which is a non-reflexive Banach space), $A : D(A) \subset X \to X$, $D(A) = \{u \in C^1[0, 1];\ u(0) = 0\}$, $Au = -u'$. It is easily seen that $A$ is m-dissipative, but not densely defined ($\overline{D(A)} = \{u \in C[0, 1];\ u(0) = 0\}$). Hence, according to the Lumer–Phillips Theorem, $A$ cannot be the generator of a $C_0$-semigroup in $X$. This counterexample clearly shows that Theorem 9.27 fails to hold in nonreflexive Banach spaces.

We close this section with the following result which is valid in a general Banach space $X$.

Theorem 9.29. If $A : D(A) \subset X \to X$ is a closed linear operator such that $\overline{D(A)} = X$ and both $A$ and $A^*$ are dissipative (where $A^*$ denotes the adjoint of $A$), then $A$ is m-dissipative (hence, according to the Lumer–Phillips Theorem, $A$ is the generator of a $C_0$-semigroup of contractions).

Proof. Let $x^* \in X^*$ be such that $x^*(x - Ax) = 0$ for all $x \in D(A)$. It follows that $x^* \in D(A^*)$ and $x^* - A^*x^* = 0$. Since $A^*$ is assumed to be dissipative, we infer that $x^* = 0$, so $\overline{R(I - A)} = X$. In fact, $R(I - A)$ is a closed subspace of $X$ (since $A$ is dissipative and closed), hence $R(I - A) = X$.
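The dissipativity inequality (9.7.34) and its Hilbert-space form $(Ax, x) \le 0$ can be checked numerically in a toy finite-dimensional setting. The following is only a sketch under stated assumptions: $X = \mathbb{R}^2$ with the Euclidean norm, and the matrix `A` is a hypothetical example chosen so that its symmetric part is negative definite; none of these names come from the text.

```python
import math
import random

# Hypothetical 2x2 example: (Ax, x) = -x1^2 - 3*x2^2 <= 0 for every x in R^2.
A = [[-1.0, 2.0],
     [-2.0, -3.0]]

def apply(M, x):
    """Matrix-vector product for a 2x2 matrix."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def norm(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])

def dissipativity_gap(lam, x):
    """Return ||lam*x - A x|| - lam*||x||; (9.7.34) requires this to be >= 0."""
    Ax = apply(A, x)
    return norm([lam * x[0] - Ax[0], lam * x[1] - Ax[1]]) - lam * norm(x)

random.seed(0)
samples = [([random.uniform(-1, 1), random.uniform(-1, 1)],
            random.uniform(0.01, 10.0)) for _ in range(1000)]

# Smallest gap in (9.7.34) and largest value of (Ax, x) over the random samples.
min_gap = min(dissipativity_gap(lam, x) for x, lam in samples)
max_form = max(sum(a * b for a, b in zip(apply(A, x), x)) for x, _ in samples)
print(min_gap, max_form)
```

Random sampling cannot prove dissipativity; it only illustrates that the two conditions of Remark 9.26 agree on this example. In finite dimensions every such $A$ is automatically m-dissipative, since $\lambda I - A$ is injective and hence surjective.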

9.8

The Feller–Miyadera–Phillips Theorem

The Hille–Yosida theorem has the following significant generalization that is due to Feller, Miyadera, and Phillips.³

³ William S. Feller, Croatian-American mathematician, 1906–1970; Isao Miyadera, Japanese mathematician, born 1926.

Theorem 9.30. A linear operator $A : D(A) \subset X \to X$ is the infinitesimal generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\}$ satisfying $\|T(t)\| \le Me^{\omega t}$, $t \ge 0$, with $M \ge 1$, $\omega \in \mathbb{R}$, if and only if

(k) $\overline{D(A)} = X$ and $A$ is closed;

(kk)′ $(\omega, \infty) \subset \rho(A)$ and $\|R(\lambda, A)^n\| \le \dfrac{M}{(\lambda - \omega)^n}$ for all $\lambda > \omega$, $n = 1, 2, \ldots$

Proof. Necessity: This is similar to the necessity part of the proof of the Hille–Yosida Theorem. Here $R_\lambda$ is well defined for $\lambda > \omega$ and one can similarly prove that $(\omega, \infty) \subset \rho(A)$ and $R(\lambda, A) = R_\lambda$, $\|R(\lambda, A)\| \le \frac{M}{\lambda - \omega}$ for all $\lambda > \omega$. Then, for all $\lambda > \omega$ and $x \in X$, we have

$$\begin{aligned}
R(\lambda, A)^2 x &= \int_0^\infty e^{-\lambda t}T(t)R_\lambda x\,dt \\
&= \int_0^\infty e^{-\lambda t}T(t)\int_0^\infty e^{-\lambda s}T(s)x\,ds\,dt \\
&= \int_0^\infty\!\!\int_0^\infty e^{-\lambda(t+s)}T(t+s)x\,ds\,dt \\
&= \int_0^\infty\!\!\int_t^\infty e^{-\lambda r}T(r)x\,dr\,dt \\
&= \int_0^\infty te^{-\lambda t}T(t)x\,dt.
\end{aligned}$$

It follows by induction that for all $\lambda > \omega$, $x \in X$ and $n = 1, 2, \ldots$

$$R(\lambda, A)^n x = \frac{1}{(n-1)!}\int_0^\infty t^{n-1}e^{-\lambda t}T(t)x\,dt. \tag{9.8.38}$$

We derive from (9.8.38) and the exponential estimate satisfied by the semigroup that for all $\lambda > \omega$, $x \in X$ and $n = 1, 2, \ldots$

$$\|R(\lambda, A)^n x\| \le \frac{M}{(n-1)!}\int_0^\infty t^{n-1}e^{(\omega-\lambda)t}\,dt \cdot \|x\| = \frac{M}{(\lambda - \omega)^n} \cdot \|x\|,$$

which completes the proof of necessity.

Sufficiency: To simplify the proof, we note that, in general, if $\{T(t);\ t \ge 0\}$ is a $C_0$-semigroup satisfying $\|T(t)\| \le Me^{\omega t}$, $t \ge 0$, for some $M \ge 1$ and $\omega \in \mathbb{R}$, with generator $A$, then the family $\{S(t) = e^{-\omega t}T(t);\ t \ge 0\}$ is also a $C_0$-semigroup with the generator $A - \omega I$. Thus, one can assume in the following that $\omega = 0$ (i.e., $(0, \infty) \subset \rho(A)$ and $\|\lambda^n R(\lambda, A)^n\| \le M$ for all $\lambda > 0$ and $n = 1, 2, \ldots$). The idea that can be used to complete the proof is to define a new norm on $X$, say $\|\cdot\|_*$, equivalent to the original, such that the corresponding operator norm of $R(\lambda, A)$ is less than or equal to $1/\lambda$ for all $\lambda > 0$. Then the conclusion will follow from the Hille–Yosida theorem. First, define for $\nu > 0$ the following norm on $X$:

$$\|x\|_\nu = \sup\{\|\nu^n R(\nu, A)^n x\|;\ n \in \mathbb{N} \cup \{0\}\}.$$

Obviously, the new norm is equivalent to the original one because

$$\|x\| \le \|x\|_\nu \le M\|x\| \quad \forall x \in X, \tag{9.8.39}$$


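and the operator norm of $R(\nu, A)$ with respect to the new norm satisfies

$$\|R(\nu, A)\|_\nu \le \frac{1}{\nu} \quad \forall \nu > 0. \tag{9.8.40}$$

In addition,

$$\|R(\lambda, A)\|_\nu \le \frac{1}{\lambda} \quad \text{for all } 0 < \lambda \le \nu. \tag{9.8.41}$$

This follows easily from (9.8.40) and the so-called resolvent identity:

$$R(\lambda, A) - R(\nu, A) = (\nu - \lambda)R(\nu, A)R(\lambda, A). \tag{9.8.42}$$

Now, define $\|x\|_* = \sup\{\|x\|_\nu;\ \nu > 0\}$, and observe that (see (9.8.39) and (9.8.41))

$$\|x\| \le \|x\|_* \le M\|x\| \quad \text{and} \quad \|R(\lambda, A)\|_* \le \frac{1}{\lambda} \quad \forall \lambda > 0.$$

So, according to the Hille–Yosida Theorem, $A$ generates a $C_0$-semigroup $\{T(t);\ t \ge 0\} \subset L(X)$ satisfying $\|T(t)\|_* \le 1$ for all $t \ge 0$, hence $\|T(t)\| \le M$ for all $t \ge 0$.

Remark 9.31. Obviously, if $\|\cdot\|_\nu$ is the norm defined in the proof of Theorem 9.30, then

$$\|\lambda^n R(\lambda, A)^n x\| \le \|\lambda^n R(\lambda, A)^n x\|_\nu \le \|x\|_\nu \quad \forall\, 0 < \lambda \le \nu,\ x \in X,\ n = 0, 1, 2, \ldots$$

which implies $\|x\|_\lambda \le \|x\|_\nu$ for all $x \in X$, $0 < \lambda \le \nu$. Therefore, the norm $\|\cdot\|_*$ can be obtained as a limit: $\|x\|_* = \lim_{\lambda\to\infty}\|x\|_\lambda$.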

Taking into account the above discussion on groups and their relationship with semigroups, one can easily derive the following extension to groups of the Feller–Miyadera–Phillips generation theorem.


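Theorem 9.32. A linear operator $A : D(A) \subset X \to X$ is the infinitesimal generator of a $C_0$-group $\{G(t);\ t \in \mathbb{R}\}$ satisfying $\|G(t)\| \le Me^{\omega|t|}$, $t \in \mathbb{R}$, with $M \ge 1$, $\omega \in \mathbb{R}$, if and only if

(k) $\overline{D(A)} = X$ and $A$ is closed;

(kk) for every $\lambda \in \mathbb{R}$ with $|\lambda| > \omega$ one has $\lambda \in \rho(A)$ and $\|R(\lambda, A)^n\| \le \dfrac{M}{(|\lambda| - \omega)^n}$ for all $n = 1, 2, \ldots$

Remark 9.33. Obviously, if $M = 1$ in the above theorem, then the inequality $\|R(\lambda, A)^n\| \le \frac{1}{(|\lambda| - \omega)^n}$ for all $n = 1, 2, \ldots$ is equivalent to $\|R(\lambda, A)\| \le \frac{1}{|\lambda| - \omega}$. If $M = 1$ and $\omega = 0$, then $\|G(t)\| = 1$ for all $t \in \mathbb{R}$, or equivalently $\|G(t)x\| = \|x\|$ for all $t \in \mathbb{R}$ and $x \in X$ (i.e., all $G(t)$'s are isometries). Summarizing, we have the following result.

Corollary 9.34. A linear operator $A : D(A) \subset X \to X$ is the infinitesimal generator of a $C_0$-group of isometries $\{G(t);\ t \in \mathbb{R}\}$ if and only if

(k) $\overline{D(A)} = X$ and $A$ is closed;

(kk)* for every $\lambda \in \mathbb{R} \setminus \{0\}$ one has $\lambda \in \rho(A)$ and $\|R(\lambda, A)\| \le \dfrac{1}{|\lambda|}$.

9.9

A Perturbation Result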

It is intuitive that perturbing the generator $A$ of a $C_0$-semigroup with any operator $B \in L(X)$ yields a generator. Indeed, the following result holds.

Theorem 9.35. Let $A : D(A) \subset X \to X$ be the generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\} \subset L(X)$ satisfying $\|T(t)\| \le Me^{\omega t}$ for all $t \ge 0$, with $M \ge 1$, $\omega \in \mathbb{R}$, and let $B \in L(X)$. Then the operator $C = A + B$ with $D(C) = D(A)$ is the generator of a $C_0$-semigroup $\{S(t);\ t \ge 0\} \subset L(X)$ satisfying $\|S(t)\| \le Me^{(\omega + M\|B\|)t}$ for all $t \ge 0$.

Proof. As in the proof of the Feller–Miyadera–Phillips Theorem (Theorem 9.30), one can assume that $\omega = 0$. Next, we also assume that $M = 1$. Then $(0, \infty) \subset \rho(A)$ and for all $\lambda > 0$ we can write

$$\lambda I - C = \bigl[I - BR(\lambda, A)\bigr](\lambda I - A). \tag{9.9.43}$$

For all $\lambda > \|B\|$ we have $\|BR(\lambda, A)\| \le \|B\| \cdot \|R(\lambda, A)\| < 1$, so $I - BR(\lambda, A)$ is invertible in $L(X)$. Thus, taking into account (9.9.43), we can see that $(\|B\|, \infty) \subset \rho(C)$ and for all $\lambda > \|B\|$

$$R(\lambda, C) = R(\lambda, A)\bigl[I - BR(\lambda, A)\bigr]^{-1} = R(\lambda, A)\sum_{n=0}^\infty \bigl[BR(\lambda, A)\bigr]^n,$$

which shows that

$$\|R(\lambda, C)\| \le \frac{1}{\lambda - \|B\|} \quad \forall \lambda > \|B\|.$$

This is enough to conclude that $C$ generates a $C_0$-semigroup $\{S(t);\ t \ge 0\}$ satisfying $\|S(t)\| \le e^{\|B\|t}$ for all $t \ge 0$.

Now, let us consider the general case $M \ge 1$ (and $\omega = 0$). Define the norm $\|x\|_* = \sup_{t \ge 0}\|T(t)x\|$, which is equivalent to the original norm of $X$: $\|x\| \le \|x\|_* \le M\|x\|$ for all $x \in X$. Obviously, $\|T(t)\|_* \le 1$ for all $t \ge 0$ and

$$\|Bx\|_* \le M\|B\| \cdot \|x\| \le M\|B\| \cdot \|x\|_* \quad \forall x \in X.$$

By the above proof for the case $M = 1$, $C = A + B$ generates a $C_0$-semigroup $\{S(t);\ t \ge 0\}$ satisfying $\|S(t)\|_* \le e^{\|B\|_* t}$, $t \ge 0$. Therefore,

$$\|S(t)x\| \le \|S(t)x\|_* \le e^{\|B\|_* t}\|x\|_* \le Me^{M\|B\|t}\|x\| \quad \forall t \ge 0,$$

which concludes the proof.


9.10


Approximation of Semigroups

An example of approximation has already been encountered in the proof of Theorem 9.22. Specifically, we saw that if $\{T(t);\ t \ge 0\} \subset L(X)$ is a $C_0$-semigroup of contractions with generator $A$, then $T(t)x$ can be approximated (uniformly for $t$ in compact intervals) by $e^{tA_\lambda}x$ as $\lambda \to \infty$, where $A_\lambda$ denotes the Yosida approximation of $A$. In fact, this approximation result extends to any $C_0$-semigroup. In what follows, we present another approximation result, known as the Trotter Theorem,⁴ which is relevant for applications.

⁴ Hale F. Trotter, Canadian mathematician, born 1931.

As in [39], for $M \ge 1$ and $\omega \in \mathbb{R}$ denote by $G(M, \omega)$ the class of operators which generate $C_0$-semigroups $\{T(t);\ t \ge 0\}$ satisfying $\|T(t)\| \le Me^{\omega t}$ for all $t \ge 0$. The Trotter Theorem [48] says that the convergence of a sequence $A_n \in G(M, \omega)$ to $A \in G(M, \omega)$ in some sense (see below) is equivalent to the convergence of the corresponding semigroups.

Theorem 9.36. If $A, A_n \in G(M, \omega)$ and $\{T(t);\ t \ge 0\}$, $\{T_n(t);\ t \ge 0\}$ are the $C_0$-semigroups generated by $A$, $A_n$ ($n = 1, 2, \ldots$), then the following conditions are equivalent:

(a) for some $\lambda > \omega$ and for all $x \in X$, $R(\lambda, A_n)x \to R(\lambda, A)x$ as $n \to \infty$;

(b) for all $x \in X$ and $t \ge 0$, $T_n(t)x \to T(t)x$ as $n \to \infty$, uniformly for $t$ in compact subintervals of $[0, \infty)$.

Proof. We first prove that (a) implies (b). For a given $t > 0$, every $s \in (0, t)$, and every $x \in X$, we have

$$\begin{aligned}
\frac{d}{ds}\bigl[T_n(t-s)R(\lambda, A_n)T(s)R(\lambda, A)x\bigr]
&= -T_n(t-s)A_nR(\lambda, A_n)T(s)R(\lambda, A)x + T_n(t-s)R(\lambda, A_n)AT(s)R(\lambda, A)x \\
&= T_n(t-s)\bigl[-A_nR(\lambda, A_n)R(\lambda, A) + R(\lambda, A_n)AR(\lambda, A)\bigr]T(s)x \\
&= T_n(t-s)\bigl[R(\lambda, A) - R(\lambda, A_n)\bigr]T(s)x.
\end{aligned}$$

Note that all the above operations are allowed. Integrating the above equality over $[0, t]$ and changing signs yields

$$R(\lambda, A_n)\bigl[T_n(t) - T(t)\bigr]R(\lambda, A)x = \int_0^t T_n(t-s)\bigl[R(\lambda, A_n) - R(\lambda, A)\bigr]T(s)x\,ds. \tag{9.10.44}$$

It follows from (9.10.44) that, for all $t$ in an arbitrary compact interval $[0, t_1]$, one has

$$\begin{aligned}
\bigl\|R(\lambda, A_n)[T_n(t) - T(t)]R(\lambda, A)x\bigr\|
&\le \int_0^t \|T_n(t-s)\| \cdot \bigl\|[R(\lambda, A_n) - R(\lambda, A)]T(s)x\bigr\|\,ds \\
&\le \int_0^{t_1} \|T_n(t-s)\| \cdot \bigl\|[R(\lambda, A_n) - R(\lambda, A)]T(s)x\bigr\|\,ds.
\end{aligned} \tag{9.10.45}$$

Note that the sequence of the integrands in (9.10.45) converges pointwise to zero in $[0, t_1]$ and it has in this interval the upper bound $2M^3e^{\omega t_1}\|x\|(\lambda - \omega)^{-1}$. Thus, according to the Lebesgue Dominated Convergence Theorem, one gets from (9.10.45)

$$\lim_{n\to\infty}\bigl\|R(\lambda, A_n)[T_n(t) - T(t)]R(\lambda, A)x\bigr\| = 0,$$

uniformly for $t$ in every compact subinterval of $[0, \infty)$. In fact, since the range of $R(\lambda, A)$ is $D(A)$, we have

$$\lim_{n\to\infty}\bigl\|R(\lambda, A_n)[T_n(t) - T(t)]x\bigr\| = 0 \quad \forall x \in D(A), \tag{9.10.46}$$

uniformly for $t$ in every compact subinterval of $[0, \infty)$. Now, let us estimate

$$\begin{aligned}
\bigl\|[T_n(t) - T(t)]R(\lambda, A)x\bigr\| \le{}& \bigl\|T_n(t)[R(\lambda, A) - R(\lambda, A_n)]x\bigr\| + \bigl\|R(\lambda, A_n)[T_n(t) - T(t)]x\bigr\| \\
&+ \bigl\|[R(\lambda, A_n) - R(\lambda, A)]T(t)x\bigr\|.
\end{aligned} \tag{9.10.47}$$

The right-hand side of (9.10.47) has three terms, say $S_i = S_i(t, n, x)$, $i = 1, 2, 3$. Using our assumption (a) and the estimate $\|T_n(t)\| \le Me^{\omega t}$, $t \ge 0$, we can see that, for each $x \in X$, $S_1(t, n, x)$ converges to zero as $n \to \infty$, uniformly for $t$ in every compact subinterval of $[0, \infty)$. A similar conclusion holds for $S_2(t, n, x)$, $x \in D(A)$ (see (9.10.46)). Taking again assumption (a) into account, it follows that $S_3(t, n, x)$, $x \in X$, also converges to zero as $n \to \infty$, uniformly for $t$ in every compact subinterval of $[0, \infty)$ (here we use the fact that $\{T(t)x;\ 0 \le t \le t_1\}$ is a compact set for each $t_1 > 0$). Summarizing, we derive from (9.10.47) that

$$\lim_{n\to\infty}\bigl\|[T_n(t) - T(t)]R(\lambda, A)x\bigr\| = 0, \quad x \in D(A).$$


Hence,

$$\lim_{n\to\infty}\bigl\|[T_n(t) - T(t)]z\bigr\| = 0 \quad \forall z \in D(A^2),$$

uniformly on every compact subinterval of $[0, \infty)$. Since $D(A^2)$ is dense in $X$ (see Remark 9.11), this conclusion extends to all $x \in X$, so (b) holds.

Conversely, assuming now that (b) is satisfied, we have for any $\lambda > \omega$ and $x \in X$

$$\|R(\lambda, A_n)x - R(\lambda, A)x\| = \Bigl\|\int_0^\infty e^{-\lambda t}[T_n(t)x - T(t)x]\,dt\Bigr\| \le \int_0^\infty e^{-\lambda t}\|T_n(t)x - T(t)x\|\,dt. \tag{9.10.48}$$

Using again Lebesgue's Dominated Convergence Theorem for the right-hand side of the above inequality, we conclude that indeed (b) implies (a).

Remark 9.37. It is obvious from the proof above that condition (a) is equivalent to

(a)′ for all $x \in X$ and all $\lambda > \omega$, $R(\lambda, A_n)x \to R(\lambda, A)x$ as $n \to \infty$.

If one assumes that, for some $\lambda > \omega$, $R(\lambda, A_n)x$ converges as $n \to \infty$ to some $R_\lambda x$ for all $x \in X$, and if in addition the range of $R_\lambda$ is assumed to be dense in $X$, then $R_\lambda$ is the resolvent $R(\lambda, A)$ of an operator $A \in G(M, \omega)$. For the proof of this implication, see [24] and [39, p. 86]. This implication can be used to replace Theorem 9.36 by an improved version, in which the existence of $A \in G(M, \omega)$ is no longer assumed. The reformulation of the Trotter Theorem in view of the above information is left to the reader.

Remark 9.38. It is worth pointing out that the Trotter Theorem or suitable versions of it can be used successfully in the numerical analysis of various initial-boundary value problems.

We continue this section with a result known as the Chernoff product formula.⁵

⁵ Paul R. Chernoff, American mathematician, born 1942.

Theorem 9.39. Let $A \in G(M, \omega)$ for some $M \ge 1$ and $\omega \in \mathbb{R}$ and let $F : [0, \infty) \to L(X)$ be a function satisfying $F(0) = I$ and $\|F(t)^k\| \le Me^{k\omega t}$ for all $t \ge 0$, $k \in \mathbb{N}$. Assume that

$$Ax = \lim_{s\to 0^+} s^{-1}[F(s)x - x] \quad \forall x \in D(A). \tag{9.10.49}$$


Then,

$$T(t)x = \lim_{n\to\infty} F(t/n)^n x, \tag{9.10.50}$$

for all $x \in X$, uniformly for $t$ in compact subintervals of $[0, \infty)$, where $\{T(t);\ t \ge 0\}$ is the $C_0$-semigroup generated by $A$.

In order to prove this theorem we need the following lemma.

Lemma 9.40. Let $Q \in L(X)$ be such that $\|Q^j\| \le M$ for some $M \ge 1$ and all $j \in \mathbb{N}$. Then we have

$$\|e^{n(Q-I)}x - Q^nx\| \le M\sqrt{n}\,\|Qx - x\| \quad \forall n \in \mathbb{N},\ x \in X.$$

Proof. Let $n \in \mathbb{N}$ be arbitrary but fixed. We have

$$e^{n(Q-I)} - Q^n = e^{-n}\bigl(e^{nQ} - e^nQ^n\bigr) = e^{-n}\sum_{k=0}^\infty \frac{n^k}{k!}\bigl(Q^k - Q^n\bigr). \tag{9.10.51}$$

Note that for $k > n$ we have

$$Q^k - Q^n = \sum_{j=n}^{k-1} Q^j(Q - I),$$

and a similar identity holds for $k < n$. So we obtain by using $\|Q^j\| \le M$

$$\|Q^kx - Q^nx\| \le M|n - k| \cdot \|Qx - x\|. \tag{9.10.52}$$

Now, using (9.10.51), (9.10.52), and the Bunyakovsky–Cauchy–Schwarz inequality, we derive

$$\begin{aligned}
\|e^{n(Q-I)}x - Q^nx\| &\le e^{-n}\sum_{k=0}^\infty \frac{n^k}{k!}M|n - k| \cdot \|Qx - x\| \\
&\le Me^{-n}\|Qx - x\| \Bigl(\sum_{k=0}^\infty \frac{n^k}{k!}\Bigr)^{1/2}\Bigl(\sum_{k=0}^\infty (n-k)^2\frac{n^k}{k!}\Bigr)^{1/2} \\
&= Me^{-n}\|Qx - x\|\,\bigl(e^n\bigr)^{1/2}\bigl(ne^n\bigr)^{1/2} \\
&= M\sqrt{n}\,\|Qx - x\|.
\end{aligned}$$
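The inequality of Lemma 9.40 is easy to test in the simplest possible setting. The sketch below assumes $X = \mathbb{C}$ (a one-dimensional Banach space) and lets $Q$ be multiplication by a complex number $q$ with $|q| \le 1$, so that $M = 1$; the sample values of `q` and `n` are arbitrary illustrations.

```python
import cmath
import math

def lemma_gap(q, n):
    """Return sqrt(n)*|q - 1| - |exp(n(q - 1)) - q**n|.

    With X = C, Q = multiplication by q, |q| <= 1 and M = 1,
    Lemma 9.40 (applied to x = 1) says this quantity is >= 0.
    """
    return math.sqrt(n) * abs(q - 1) - abs(cmath.exp(n * (q - 1)) - q ** n)

# Points strictly inside the unit disc and on the unit circle.
qs = [0.9 * cmath.exp(1j * theta) for theta in (0.0, 0.3, 1.0, 2.5, math.pi)]
qs += [cmath.exp(1j * theta) for theta in (0.05, 0.7, 3.0)]

gaps = [lemma_gap(q, n) for q in qs for n in (1, 2, 5, 10, 50, 200)]
print(min(gaps))
```

For $q$ close to $1$ both sides are small, which is the regime used in the proof of Theorem 9.39, where $Q = F(t/n)$ and $n \to \infty$.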


Proof of Theorem 9.39. We consider first the case $\omega = 0$. Define for $s > 0$

$$A_sx = s^{-1}[F(s) - I]x, \quad x \in X.$$

Obviously, $A_s \in L(X)$ for all $s > 0$ and (cf. (9.10.49))

$$\lim_{s\to 0^+} A_sx = Ax \quad \forall x \in D(A). \tag{9.10.53}$$

Note also that for each $s > 0$,

$$\|e^{tA_s}\| \le e^{-t/s}\sum_{k=0}^\infty \frac{t^k}{s^kk!}\|F(s)^k\| \le M \quad \forall t \ge 0, \tag{9.10.54}$$

i.e., $A_s \in G(M, 0)$. For $\lambda > \omega = 0$ and $y = (\lambda I - A)x$, $x \in D(A)$, we have

$$R(\lambda, A_s)y = R(\lambda, A_s)\bigl[(\lambda I - A_s)x - (\lambda I - A_s)x + (\lambda I - A)x\bigr] = x + R(\lambda, A_s)\bigl[A_sx - Ax\bigr].$$

Therefore, according to (9.10.53), we have

$$R(\lambda, A_s)y \to R(\lambda, A)y \quad \text{as } s \to 0^+,\ \forall y \in X, \tag{9.10.55}$$

since $\|R(\lambda, A_s)\| \le M/\lambda$. Now, using (9.10.53), (9.10.54), and (9.10.55), it follows by Theorem 9.36 (which also works with $s$ instead of $n$) that

$$T(t)x - e^{tA_s}x \to 0 \quad \text{as } s \to 0^+,\ \forall x \in X,$$

uniformly for $t$ in compact subintervals of $[0, \infty)$, and hence

$$T(t)x - e^{tA_{t/n}}x \to 0 \quad \text{as } n \to \infty,\ \forall x \in X, \tag{9.10.56}$$

uniformly for $t$ in compact subintervals of $[0, \infty)$. On the other hand, by Lemma 9.40, we have

$$\begin{aligned}
\|e^{tA_{t/n}}x - F(t/n)^nx\| &= \|e^{n[F(t/n) - I]}x - F(t/n)^nx\| \\
&\le M\sqrt{n}\,\|F(t/n)x - x\| \\
&= \frac{Mt}{\sqrt{n}}\|A_{t/n}x\| \to 0 \quad \text{as } n \to \infty,
\end{aligned} \tag{9.10.57}$$

for all $x \in D(A)$, uniformly for $t$ in compact subintervals of $[0, \infty)$. Combining (9.10.56) and (9.10.57), we derive (9.10.50) for all $x \in D(A)$. Since $D(A)$ is dense in $X$, (9.10.50) extends to the whole of $X$.


The case $\omega \ne 0$ can be reduced to the previous one. Indeed, the function $\tilde{F}$, defined by $\tilde{F}(t) = e^{-\omega t}F(t)$, satisfies $\tilde{F}(0) = I$ and $\|\tilde{F}(t)^k\| \le M$ for all $t \ge 0$ and $k \in \mathbb{N}$. Moreover, (9.10.49) is satisfied with $\tilde{F}$ instead of $F$, and $A - \omega I$ instead of $A$. So the conclusion of Theorem 9.39 follows easily.

Corollary 9.41. For every $A \in G(M, \omega)$, $M \ge 1$, $\omega \in \mathbb{R}$, we have

$$T(t)x = \lim_{n\to\infty}\Bigl(I - \frac{t}{n}A\Bigr)^{-n}x \quad \forall x \in X, \tag{9.10.58}$$

uniformly for $t$ in compact subintervals of $[0, \infty)$, where $\{T(t);\ t \ge 0\} \subset L(X)$ is the $C_0$-semigroup generated by $A$.

Proof. We can assume that $\omega > 0$. Define $F : [0, \infty) \to L(X)$ by

$$F(t) = \begin{cases} I, & t = 0, \\ (1/t)R(1/t, A), & t \in (0, \delta), \\ 0, & t \ge \delta, \end{cases}$$

for some $\delta \in (0, 1/\omega)$. We choose $\delta > 0$ small enough so that

$$\|F(t)^k\| \le (1/t^k)\|R(1/t, A)^k\| \le \frac{M}{t^k(t^{-1} - \omega)^k} = \frac{M}{(1 - \omega t)^k} \le Me^{k(\omega+1)t} \quad \forall t \in (0, \delta),\ k \in \mathbb{N}.$$

We also have

$$\lim_{t\to 0^+} t^{-1}[F(t)x - x] = \lim_{t\to 0^+} (1/t)R(1/t, A)Ax = Ax \quad \forall x \in D(A).$$

Thus, all the assumptions of Theorem 9.39 are fulfilled and so (9.10.58) holds.

Another consequence of the Chernoff product formula is the so-called Trotter product formula corresponding to perturbed semigroups:

Corollary 9.42. Let $A \in G(M, \omega)$, $M \ge 1$, $\omega \in \mathbb{R}$, and $B \in L(X)$. If $\{T(t);\ t \ge 0\}$ is the $C_0$-semigroup generated by $A$, $S(t) = e^{tB}$, $t \ge 0$, and $\{U(t);\ t \ge 0\}$ is the $C_0$-semigroup generated by $A + B$, then

$$U(t)x = \lim_{n\to\infty}\bigl[T(t/n)S(t/n)\bigr]^nx, \tag{9.10.59}$$

for all $x \in X$, uniformly for $t$ in compact subintervals of $[0, \infty)$.


Proof. By Theorem 9.35, $A + B \in G(M, \omega + M\|B\|)$. Making use of the previous renorming procedure (see the proof of Theorem 9.30), we can assume $M = 1$. So, defining $F(t) = T(t)S(t)$, $t \ge 0$, we have

$$\|F(t)^k\| \le \|T(t)\|^k\|S(t)\|^k \le e^{k\omega t}e^{k\|B\|t} = e^{k(\omega + \|B\|)t} \quad \forall t \ge 0,$$

and for all $x \in D(A + B) = D(A)$

$$\lim_{t\to 0^+} t^{-1}[F(t)x - x] = \lim_{t\to 0^+} T(t)\frac{S(t)x - x}{t} + \lim_{t\to 0^+}\frac{T(t)x - x}{t} = Bx + Ax.$$

Therefore, Theorem 9.39 is again applicable and (9.10.59) follows.

Remark 9.43. The Trotter product formula is valid, under appropriate conditions, for two general $C_0$-semigroups (see, e.g., [13, p. 154]).
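Both the exponential formula (9.10.58) and the Trotter product formula (9.10.59) can be observed numerically on a small matrix example. The following is a sketch under explicit assumptions: $X = \mathbb{R}^2$, and the hypothetical nilpotent matrices $A = \begin{pmatrix}0&1\\0&0\end{pmatrix}$, $B = \begin{pmatrix}0&0\\1&0\end{pmatrix}$ are chosen because $e^{sA}$, $e^{sB}$ and $e^{t(A+B)}$ are all known in closed form. The errors are measured entrywise.

```python
import math

def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(P, n):
    """P**n for a 2x2 matrix by repeated squaring."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    while n > 0:
        if n & 1:
            R = matmul(R, P)
        P = matmul(P, P)
        n >>= 1
    return R

def max_abs_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

def exact(t):
    """exp(t(A+B)) for A = [[0,1],[0,0]] and B = [[0,0],[1,0]]."""
    return [[math.cosh(t), math.sinh(t)], [math.sinh(t), math.cosh(t)]]

def backward_euler(t, n):
    """(I - (t/n)(A+B))^(-n), the approximation in (9.10.58); needs t/n < 1."""
    h = t / n
    d = 1.0 - h * h                       # det(I - h(A+B))
    resolvent_step = [[1.0 / d, h / d], [h / d, 1.0 / d]]
    return matpow(resolvent_step, n)

def trotter(t, n):
    """[T(t/n) S(t/n)]^n with T(s) = exp(sA), S(s) = exp(sB), as in (9.10.59)."""
    h = t / n
    T = [[1.0, h], [0.0, 1.0]]            # exp(hA) = I + hA since A^2 = 0
    S = [[1.0, 0.0], [h, 1.0]]            # exp(hB) = I + hB since B^2 = 0
    return matpow(matmul(T, S), n)

t = 1.0
for n in (10, 100, 1000):
    print(n,
          max_abs_diff(backward_euler(t, n), exact(t)),
          max_abs_diff(trotter(t, n), exact(t)))
```

In this example both errors shrink roughly like $1/n$, which is consistent with the first-order nature of the two formulas; the theorems themselves only assert convergence.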

9.11

The Inhomogeneous Cauchy Problem

Consider the Cauchy (initial value) problem

$$u'(t) = Au(t) + f(t), \quad t \in [0, r]; \qquad u(0) = x, \tag{CP}$$

where $A$ is the generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\} \subset L(X)$, $f$ is a given function from $[0, r]$ to $X$, $r \in (0, \infty)$. The case $f \equiv 0$ was discussed before.

Definition 9.44. A function $u : [0, r] \to X$ is a classical solution of problem (CP) if $u$ is continuous on $[0, r]$ and continuously differentiable on $(0, r]$, $u(t) \in D(A)$ for all $t \in (0, r]$, $u(0) = x$, and $u$ satisfies equation $(CP)_1$ for all $t \in (0, r]$.

Remark 9.45. If $f \in C([0, r]; X)$ and $u$ is a classical solution of problem (CP), then for $0 < s < t \le r$ we have

$$\begin{aligned}
\frac{d}{ds}[T(t-s)u(s)] &= -T(t-s)Au(s) + T(t-s)u'(s) \\
&= -T(t-s)Au(s) + T(t-s)Au(s) + T(t-s)f(s) \\
&= T(t-s)f(s).
\end{aligned}$$


Therefore, by integration over $[0, t]$ one obtains

$$u(t) = T(t)x + \int_0^t T(t-s)f(s)\,ds, \quad t \in [0, r], \tag{9.11.60}$$

showing that $u$ is unique (since $A$ generates a unique $C_0$-semigroup; see Theorem 9.12). Note also that the integral term in the right-hand side of Eq. (9.11.60) makes sense for $f \in L^1(0, r; X)$, since (see Theorem 9.7)

$$\|T(t-s)f(s)\| \le Me^{\omega(t-s)}\|f(s)\|, \quad 0 \le s \le t \le r.$$

This observation leads to the introduction of a new concept of solution for the Cauchy problem (CP).

Definition 9.46. Let $x \in X$, $f \in L^1(0, r; X)$, and let $A$ be the generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\} \subset L(X)$. The function $u \in C([0, r]; X)$ given by

$$u(t) = T(t)x + \int_0^t T(t-s)f(s)\,ds \quad \forall t \in [0, r] \tag{9.11.61}$$

is called a mild solution of problem (CP).

Obviously, if $A$ is the generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\}$, then for each $(x, f) \in X \times L^1(0, r; X)$ problem (CP) has a unique mild solution (since the $C_0$-semigroup generated by $A$ is unique). Formula (9.11.61) above is often called the variation of constants formula. Under certain conditions on $x$ and $f$ it gives a classical solution of problem (CP). The following theorem is one such example.

Theorem 9.47. Let $A : D(A) \subset X \to X$ be the infinitesimal generator of a $C_0$-semigroup, say $\{T(t);\ t \ge 0\}$, and let $x \in D(A)$ and $f \in C^1([0, r]; X)$. Then problem (CP) has a unique classical solution (given by (9.11.61)).

Proof. Uniqueness is already known (see the remark above). To prove existence it suffices to show that

$$v(t) = \int_0^t T(t-s)f(s)\,ds$$

satisfies equation $(CP)_1$ on $(0, r]$ (see also Theorem 9.10). Indeed,

$$v(t) = \int_0^t T(s)f(t-s)\,ds$$

and so there exists

$$v'(t) = T(t)f(0) + \int_0^t T(s)f'(t-s)\,ds = T(t)f(0) + \int_0^t T(t-s)f'(s)\,ds \quad \forall t \in (0, r].$$

On the other hand, for each $t \in (0, r)$ and $h > 0$ small enough, we have

$$\begin{aligned}
h^{-1}[T(h) - I]v(t) &= h^{-1}\int_0^t T(t+h-s)f(s)\,ds - h^{-1}v(t) \\
&= h^{-1}[v(t+h) - v(t)] - h^{-1}\int_t^{t+h} T(t+h-s)f(s)\,ds,
\end{aligned}$$

which converges to $v'(t) - f(t)$ as $h \to 0^+$. Therefore, $v(t) \in D(A)$ and $Av(t) = v'(t) - f(t)$ for all $t \in (0, r)$. In fact, $f$ can be extended to the right of $t = r$ as a continuously differentiable function, so $v(r) \in D(A)$ and there exists $v'(r) = Av(r) + f(r)$. Even more, there exists $v'(0) = f(0)$, so the function $u(t) = T(t)x + v(t)$ is continuously differentiable on $[0, r]$ and satisfies equation $(CP)_1$ for all $t \in [0, r]$.

Remark 9.48. From the proof above we see that (under the conditions of Theorem 9.47)

$$u'(t) = T(t)Ax + T(t)f(0) + \int_0^t T(t-s)f'(s)\,ds \quad \forall t \in [0, r]. \tag{9.11.62}$$
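The variation of constants formula (9.11.61) can be evaluated directly in a scalar toy problem and compared with a known classical solution. The sketch assumes $X = \mathbb{R}$, $A = a$ (multiplication by a real number, so $T(t) = e^{at}$), and the hypothetical data $a = -1$, $f(t) = \cos t$, $x = 2$; the integral is approximated with the composite trapezoidal rule.

```python
import math

def mild_solution(a, x, f, t, steps=2000):
    """u(t) = e^{at} x + int_0^t e^{a(t-s)} f(s) ds, by the composite trapezoidal rule."""
    if t == 0.0:
        return x
    h = t / steps
    g = lambda s: math.exp(a * (t - s)) * f(s)     # integrand T(t-s) f(s)
    integral = h * (0.5 * g(0.0)
                    + sum(g(k * h) for k in range(1, steps))
                    + 0.5 * g(t))
    return math.exp(a * t) * x + integral

def closed_form(x, t):
    """Classical solution of u' = -u + cos t, u(0) = x."""
    return math.exp(-t) * (x - 0.5) + 0.5 * (math.cos(t) + math.sin(t))

errors = [abs(mild_solution(-1.0, 2.0, math.cos, t) - closed_form(2.0, t))
          for t in (0.0, 0.5, 1.0, 2.0, 3.0)]
print(max(errors))
```

Here $x \in D(A) = \mathbb{R}$ and $f \in C^1$, so Theorem 9.47 applies and the mild solution is the classical one; the remaining discrepancy is quadrature error only.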

Remark 9.49. Let $A$ be the infinitesimal generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\}$ and let $(x, f) \in X \times L^1(0, r; X)$. If $u$ is the corresponding mild solution of problem (CP), then it is the uniform limit of a sequence of $C^1$-solutions (hence classical solutions) of (CP). Indeed, let $(x_n, f_n)$ be a sequence in $D(A) \times C^1([0, r]; X)$ which approximates $(x, f)$ in $X \times L^1(0, r; X)$. For each $(x_n, f_n)$ there exists a unique $C^1$-solution $u_n$ of problem (CP) with $x := x_n$ and $f := f_n$ given by the variation of constants formula:

$$u_n(t) = T(t)x_n + \int_0^t T(t-s)f_n(s)\,ds.$$

By standard arguments one gets for all $t \in [0, r]$

$$\begin{aligned}
\|u_n(t) - u(t)\| &\le \|T(t)(x_n - x)\| + \int_0^t \|T(t-s)\| \cdot \|f_n(s) - f(s)\|\,ds \\
&\le Me^{\omega t}\|x_n - x\| + \int_0^t Me^{\omega(t-s)}\|f_n(s) - f(s)\|\,ds \\
&\le Me^{\omega r}\Bigl(\|x_n - x\| + \int_0^r \|f_n(s) - f(s)\|\,ds\Bigr).
\end{aligned}$$

Therefore, $u_n \to u$ in $C([0, r]; X)$.

Remark 9.50. The semigroup approach can be used to solve Cauchy problems for semilinear evolution equations. Specifically, let us consider the following problem,

$$u'(t) = Au(t) + f(t, u(t)), \quad t \in [0, r]; \qquad u(0) = x \in X, \tag{NCP}$$

where $A$ is the infinitesimal generator of a $C_0$-semigroup $\{T(t);\ t \ge 0\} \subset L(X)$ and $f : [0, r] \times X \to X$ is continuous and satisfies the Lipschitz condition

$$\|f(t, x_1) - f(t, x_2)\| \le L\|x_1 - x_2\|, \quad (t, x_1), (t, x_2) \in [0, r] \times X.$$

Here $L$ is a positive constant. One can consider the following "mild" form for (NCP):

$$u(t) = T(t)x + \int_0^t T(t-s)f(s, u(s))\,ds, \quad t \in [0, r]. \tag{9.11.63}$$

If $u$ is a classical solution of problem (NCP), then it satisfies (9.11.63). One can prove the existence of a solution $u \in Y := C([0, r]; X)$ of (9.11.63) by using the Banach Contraction Principle. For this purpose, let us consider the Bielecki norm⁶ on $Y$:

$$\|g\|_B = \sup_{0 \le t \le r} e^{-\beta t}\|g(t)\|, \quad g \in Y,$$

where $\beta$ is a large positive constant. This Bielecki norm is equivalent to the usual sup-norm of $Y$, so $Y$ is a Banach space with respect to $\|\cdot\|_B$. Define on $Y$ an operator $Q$ by

$$(Qu)(t) = T(t)x + \int_0^t T(t-s)f(s, u(s))\,ds, \quad t \in [0, r],\ u \in Y.$$

⁶ Adam Bielecki, Polish mathematician, 1910–2003.


Obviously, $Q$ maps $Y$ into itself, and for all $t \in [0, r]$ and $u_1, u_2 \in Y$ we have

$$\begin{aligned}
\|(Qu_1)(t) - (Qu_2)(t)\| &\le LM\int_0^t e^{\omega(t-s)}\|u_1(s) - u_2(s)\|\,ds \\
&= LMe^{\omega t}\int_0^t e^{(\beta-\omega)s}e^{-\beta s}\|u_1(s) - u_2(s)\|\,ds \\
&\le LMe^{\omega t}\|u_1 - u_2\|_B\int_0^t e^{(\beta-\omega)s}\,ds \\
&= \frac{LM}{\beta - \omega}\|u_1 - u_2\|_B\bigl(e^{\beta t} - e^{\omega t}\bigr) \\
&\le \frac{LM}{\beta - \omega}\|u_1 - u_2\|_B\,e^{\beta t}.
\end{aligned}$$

Thus,

$$e^{-\beta t}\|(Qu_1)(t) - (Qu_2)(t)\| \le \frac{LM}{\beta - \omega}\|u_1 - u_2\|_B, \quad t \in [0, r],\ u_1, u_2 \in Y,$$

which implies

$$\|Qu_1 - Qu_2\|_B \le \frac{LM}{\beta - \omega}\|u_1 - u_2\|_B, \quad u_1, u_2 \in Y.$$

So, for $\beta > LM + \omega$, $Q$ is a contraction and the Banach Contraction Principle ensures the existence of a unique fixed point $u$ of $Q$. This $u$ is the unique solution in $Y$ of Eq. (9.11.63), which can be called a mild solution of the given semilinear Cauchy problem. In general, a mild solution is not a classical one. However, under appropriate conditions on $x$ and $f$ it is.
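The fixed-point argument above can be mimicked on a time grid. The following sketch makes several simplifying assumptions that are not in the text: $X = \mathbb{R}$, $A = -1$ (so $T(t) = e^{-t}$, $M = 1$, $\omega = -1$), $f(t, u) = \sin u$ (so $L = 1$), $x = 1$, $r = 2$, and the integral in $Q$ is replaced by the trapezoidal rule. With $\beta = 2 > LM + \omega = 0$ the predicted contraction factor is $LM/(\beta - \omega) = 1/3$.

```python
import math

r, N = 2.0, 100                  # time horizon and number of grid intervals
h = r / N
ts = [i * h for i in range(N + 1)]
x0, beta = 1.0, 2.0              # initial value; Bielecki weight with beta > L*M + omega

def Q(u):
    """Discrete version of (Qu)(t) = e^{-t} x0 + int_0^t e^{-(t-s)} sin(u(s)) ds."""
    out = []
    for i, t in enumerate(ts):
        vals = [math.exp(-(t - ts[j])) * math.sin(u[j]) for j in range(i + 1)]
        # Trapezoidal rule on [0, t_i]; the integral is empty for i = 0.
        integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1])) if i > 0 else 0.0
        out.append(math.exp(-t) * x0 + integral)
    return out

def bielecki(g):
    """Discrete Bielecki norm: max over the grid of e^{-beta t} |g(t)|."""
    return max(math.exp(-beta * t) * abs(v) for t, v in zip(ts, g))

u = [x0] * (N + 1)               # initial guess: the constant function x0
diffs = []                       # Bielecki norms of successive differences
for _ in range(25):
    u_next = Q(u)
    diffs.append(bielecki([a - b for a, b in zip(u_next, u)]))
    u = u_next
print(diffs[0], diffs[-1])
```

The successive differences decay geometrically, which reflects the contraction estimate $\|Qu_1 - Qu_2\|_B \le \frac{LM}{\beta - \omega}\|u_1 - u_2\|_B$; the limit is a grid approximation of the mild solution, not the mild solution itself.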

9.12

Applications

In this section we illustrate the above theory with some applications.

9.12.1

The Heat Equation

Consider the heat (diffusion) equation

$$u_t = u_{xx} + f(t, x), \quad t \in [0, r],\ x \in (0, 1), \tag{9.12.64}$$

with Dirichlet boundary conditions

$$u(t, 0) = 0 = u(t, 1), \quad t \in [0, r], \tag{9.12.65}$$

and initial condition

$$u(0, x) = u_0(x), \quad x \in (0, 1), \tag{9.12.66}$$

where $u_0 \in L^2(0, 1)$, $f \in L^1(0, r; L^2(0, 1))$, and $u = u(t, x)$ is the unknown function representing the temperature (or density in the case of a general diffusion process). We have denoted $u_t := \frac{\partial u}{\partial t}$ and $u_{xx} := \frac{\partial^2 u}{\partial x^2}$. In order to solve problem (9.12.64)–(9.12.66), we choose $X = L^2(0, 1)$ as the basic space equipped with the usual scalar product

$$\langle p, q \rangle = \int_0^1 p(x)q(x)\,dx,$$

and the corresponding (Hilbertian) norm. Define $A : D(A) \subset X \to X$ by

$$D(A) = H^2(0, 1) \cap H_0^1(0, 1), \qquad Av = v'' = \frac{d^2v}{dx^2}.$$

So, regarding $u = u(t, x)$ as an $X$-valued function of $t \in [0, r]$, problem (9.12.64)–(9.12.66) can be expressed as the Cauchy problem in $X$

$$\frac{d}{dt}u(t, \cdot) = Au(t, \cdot) + f(t, \cdot), \quad t \in [0, r]; \qquad u(0, \cdot) = u_0. \tag{9.12.67}$$

Note that the boundary conditions (9.12.65) are incorporated into the definition of $D(A)$. It turns out that $A$ is the generator of a $C_0$-semigroup of contractions, say $\{T(t) : X \to X;\ t \ge 0\}$, so there is a unique mild solution $u$ of problem (9.12.64)–(9.12.66) given by the variation of constants formula (see (9.11.61))

$$u(t, \cdot) = T(t)u_0(\cdot) + \int_0^t T(t-s)f(s, \cdot)\,ds, \quad t \in [0, r]. \tag{9.12.68}$$

In order to show that $A$ is a generator of a $C_0$-semigroup of contractions one could use the Hille–Yosida Theorem. A better option is to use the Lumer–Phillips Theorem. In fact, as $X$ is a Hilbert (hence reflexive) space, it suffices to prove that $A$ is an m-dissipative operator (cf. Theorem 9.27). This means that we do not need to check the density condition on $D(A)$ (that actually follows by the density of $C_0^\infty(0, 1)$ in $X$ and the obvious inclusion relation $C_0^\infty(0, 1) \subset D(A)$).


As the dissipativeness of $A$ follows trivially, let us prove that for any $\lambda > 0$ we have $R(\lambda I - A) = X$. In other words, for any $\lambda > 0$, $g \in X$, there exists a solution $v \in H^2(0, 1)$ of the following boundary value problem

$$\lambda v - v'' = g, \qquad v(0) = 0 = v(1).$$

But this follows easily by imposing the boundary conditions on the general solution of the above differential equation. One could also use Theorem 9.29 and the fact that $A$ is a self-adjoint operator.

According to Theorem 9.47 (see also its proof), if $u_0 \in D(A) = H_0^1(0, 1) \cap H^2(0, 1)$ and $f \in C^1([0, r]; X)$ then $u \in C^1([0, r]; X)$. Moreover, since $u$ satisfies the heat equation it follows that $u \in C([0, r]; H^2(0, 1))$. Note that the condition $u_0 \in D(A)$ incorporates the compatibility of $u_0$ with the boundary conditions: $u_0(0) = u_0(1) = 0$. It is also worth pointing out that higher regularity of $u$ can be obtained under additional conditions on $u_0$ and $f$.

The above discussion can be extended to more dimensions. Specifically, let $\Omega \subset \mathbb{R}^n$, $n \ge 2$, be a bounded domain with a sufficiently smooth boundary $\partial\Omega$. Consider the $n$-dimensional heat equation

$$u_t = \Delta u + f(t, x), \quad t \in [0, r],\ x \in \Omega,$$

and associate with it the homogeneous Dirichlet boundary condition $u = 0$ on $\partial\Omega$, and the initial condition $u(0, x) = u_0(x)$, $x \in \Omega$. We have denoted by $\Delta$ the classical Laplacian with respect to $x$. Let $X = L^2(\Omega)$ and let $A = \Delta$ with $D(A) = H_0^1(\Omega) \cap H^2(\Omega)$. So the above initial-boundary value problem can be viewed as a Cauchy problem in $X$. The fact that $A$ is a dissipative operator follows from Green's formula, and its m-dissipativity can be derived by using the Lax–Milgram Theorem. The reader is encouraged to continue the discussion and derive existence, uniqueness, and regularity of the solution to the above problem. The reader could also consider the case of the homogeneous Neumann or Robin boundary condition and investigate it along the same lines.
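For the one-dimensional problem with $f \equiv 0$, the contraction semigroup can be written down explicitly in the sine basis, which gives a quick way to see the contraction property. The sketch below relies on facts not spelled out in the text: the functions $e_k(x) = \sqrt{2}\sin(k\pi x)$ form an orthonormal basis of $L^2(0, 1)$ with $Ae_k = -k^2\pi^2e_k$, so that $T(t)u_0 = \sum_k e^{-k^2\pi^2t}\langle u_0, e_k\rangle e_k$. The initial datum $u_0(x) = x(1 - x)$ and the truncation level `K` are hypothetical choices.

```python
import math

K = 200  # number of sine modes kept (a truncation chosen for this sketch)

def coeff(k):
    """<u0, e_k> for u0(x) = x(1 - x) and e_k(x) = sqrt(2) sin(k pi x)."""
    return 4.0 * math.sqrt(2.0) / (k * math.pi) ** 3 if k % 2 == 1 else 0.0

def heat(t, x):
    """Truncated series for (T(t)u0)(x) = sum_k exp(-k^2 pi^2 t) <u0, e_k> e_k(x)."""
    return sum(math.exp(-(k * math.pi) ** 2 * t) * coeff(k)
               * math.sqrt(2.0) * math.sin(k * math.pi * x)
               for k in range(1, K + 1))

def l2_norm(t):
    """L^2(0,1) norm of T(t)u0, computed from the coefficients (Parseval)."""
    return math.sqrt(sum((math.exp(-(k * math.pi) ** 2 * t) * coeff(k)) ** 2
                         for k in range(1, K + 1)))

# The norms should be nonincreasing in t, since every T(t) is a contraction.
norms = [l2_norm(t) for t in (0.0, 0.01, 0.1, 1.0)]
print(norms, heat(0.0, 0.5))
```

At $t = 0$ the series reproduces $u_0$ up to truncation error, and the computed norms decrease with $t$, in line with $\|T(t)\| \le 1$.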


9.12.2

The Wave Equation

Consider in a first stage the one-dimensional wave equation

$$u_{tt} - u_{xx} = f(t, x), \quad t \ge 0,\ x \in (0, 1), \tag{9.12.69}$$

with the homogeneous Dirichlet boundary conditions,

$$u(t, 0) = 0 = u(t, 1), \quad t \ge 0, \tag{9.12.70}$$

and initial conditions

$$u(0, x) = u_0(x), \quad u_t(0, x) = v_0(x), \quad x \in (0, 1). \tag{9.12.71}$$

Recall that this problem describes the evolution of the displacement $u(t, x)$ of an elastic string fixed at both its ends ($x = 0$ and $x = 1$), where $f(t, x)$ represents an external force. Denoting $v = u_t$, problem (9.12.69)–(9.12.71) can be equivalently written as

$$\begin{cases}
\dfrac{\partial}{\partial t}[u, v] = [v, u_{xx} + f], & t \ge 0,\ x \in (0, 1), \\
u(t, 0) = u(t, 1) = 0, & t \ge 0, \\
[u, v](0, x) = [u_0(x), v_0(x)], & x \in (0, 1).
\end{cases}$$

Let $X = H_0^1(0, 1) \times L^2(0, 1)$ (the so-called phase space) which is a real Hilbert space with the scalar product

$$\langle [p_1, q_1], [p_2, q_2] \rangle = \int_0^1 p_1'p_2'\,dx + \int_0^1 q_1q_2\,dx,$$

and the induced norm. Define $A : D(A) \subset X \to X$ by

$$D(A) = [H_0^1(0, 1) \cap H^2(0, 1)] \times H_0^1(0, 1), \qquad A[p, q] = [q, p''].$$

Thus the above problem can be expressed as the following Cauchy problem in $X$

$$\begin{cases}
\dfrac{d}{dt}[u(t, \cdot), v(t, \cdot)] = A[u(t, \cdot), v(t, \cdot)] + [0, f(t, \cdot)], & t \ge 0, \\
[u(0, \cdot), v(0, \cdot)] = [u_0, v_0].
\end{cases}$$

Denote this Cauchy problem by (CP). In order to derive existence results for (CP), we are going to show in what follows that $A$ is the generator of a $C_0$-group of isometries. For this purpose, we can use Corollary 9.34.


First, noting that $C_0^\infty(0,1)$ is dense in $H_0^1(0,1)$ as well as in $L^2(0,1)$, and $C_0^\infty(0,1) \times C_0^\infty(0,1) \subset D(A)$, we infer that the closure of $D(A)$ in $X$ equals $X$. It is also easily seen that $A$ is a closed operator. So we need only show that condition (kk)$^*$ of Corollary 9.34 is fulfilled. Let $\lambda \in (-\infty,0) \cup (0,\infty)$ and let $[g,h]$ be an arbitrary pair in $X$. We claim that there exists a unique $[p,q] \in D(A)$ such that
\[ \lambda[p,q] - A[p,q] = [g,h], \tag{9.12.72} \]
or, equivalently, there exists a unique $p \in H_0^1(0,1) \cap H^2(0,1)$ satisfying the equation $\lambda^2 p = p'' + h + \lambda g$. We know from the preceding discussion on the heat equation that the last assertion is true. We also have $q = \lambda p - g \in H_0^1(0,1)$, which concludes the proof of our claim. Hence $\lambda I - A$ is invertible. Now, multiplying Eq. (9.12.72) by $[p,q]$ and taking into account the definition of $A$ we get
\[ \lambda \|[p,q]\|_X^2 - \underbrace{\Big( \int_0^1 p' q'\,dx + \int_0^1 p'' q\,dx \Big)}_{=0} = \langle [g,h],[p,q] \rangle \]
(the bracketed term vanishes by integration by parts, since $q(0) = q(1) = 0$), which implies
\[ |\lambda| \cdot \|[p,q]\|_X^2 = |\langle [g,h],[p,q] \rangle| \le \|[g,h]\|_X \cdot \|[p,q]\|_X. \]
Therefore,
\[ |\lambda| \cdot \|(\lambda I - A)^{-1}[g,h]\|_X \le \|[g,h]\|_X \quad \forall [g,h] \in X,\ \lambda \in (-\infty,0) \cup (0,\infty), \]
and so
\[ \|(\lambda I - A)^{-1}\| \le \frac{1}{|\lambda|} \quad \forall \lambda \in (-\infty,0) \cup (0,\infty). \]

Thus, according to Corollary 9.34, $A$ generates a group of isometries, say $\{G(t);\ t \in \mathbb{R}\} \subset L(X)$. Therefore, for all $[u_0,v_0] \in X$ and $f \in L^1_{\mathrm{loc}}([0,\infty); X)$ there exists a unique mild solution $[u,v]$ of (CP) given by the variation of constants formula
\[ [u(t,\cdot), v(t,\cdot)] = G(t)[u_0,v_0] + \int_0^t G(t-s)[0, f(s,\cdot)]\,ds, \quad t \ge 0, \]
hence $u \in C([0,\infty); H_0^1(0,1))$. This $u$ can be viewed as a generalized solution of problem (9.12.69)–(9.12.71). If $[u_0,v_0] \in D(A) = [H_0^1(0,1) \cap H^2(0,1)] \times H_0^1(0,1)$ and $f \in C^1([0,\infty); L^2(0,1))$, then $[u,v] \in C^1([0,\infty); X)$ (cf. Theorem 9.47). It follows that $u \in C^2([0,\infty); L^2(0,1)) \cap C^1([0,\infty); H_0^1(0,1)) \cap C([0,\infty); H^2(0,1))$ and $u$ is a classical solution of problem (9.12.69)–(9.12.71).

The above discussion can be extended to the $n$-dimensional case
\[
\begin{cases}
u_{tt} - \Delta u = f(t,x), & t \ge 0,\ x \in \Omega,\\
u(t,x) = 0, & t \ge 0,\ x \in \partial\Omega,\\
u(0,x) = u_0(x), & x \in \Omega,
\end{cases}
\]
where $\Omega \subset \mathbb{R}^n$, $n \ge 2$, is a bounded domain with sufficiently smooth boundary $\partial\Omega$, and $\Delta$ is the Laplacian with respect to $x$. In this case, using the substitution $v = u_t$ again, the above initial-boundary value problem can similarly be expressed as a Cauchy problem in the phase space $X = H_0^1(\Omega) \times L^2(\Omega)$, associated with the operator $A : D(A) \subset X \to X$ defined by
\[ D(A) = [H_0^1(\Omega) \cap H^2(\Omega)] \times H_0^1(\Omega), \qquad A[p,q] = [q, \Delta p]. \]
One can again use Corollary 9.34 to prove that $A$ generates a $C_0$-group of isometries on $X$. In particular, to show that Eq. (9.12.72) has a solution in $D(A)$ we need to use Green's formula (instead of integration by parts) and Lax–Milgram. The rest follows similarly. The case of the homogeneous Neumann or Robin boundary condition can be addressed in a similar manner.
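The isometry property of the group $G(t)$ can be observed numerically in the one-dimensional case with $f \equiv 0$. The following is only a sketch under stated assumptions: the data are finite sine sums $u_0 = \sum a_n \sin(n\pi x)$, $v_0 = \sum b_n \sin(n\pi x)$, and the function name `phase_space_norm_sq` and the sample coefficients are hypothetical choices made for illustration.

```python
import numpy as np

def phase_space_norm_sq(a, b, t):
    """Squared X-norm of [u(t), u_t(t)] for u(t, x) = sum_n c_n(t) sin(n pi x).

    a[n-1], b[n-1] are the sine coefficients of u0 and v0 (illustrative inputs);
    each amplitude solves c_n'' + (n pi)^2 c_n = 0, c_n(0) = a_n, c_n'(0) = b_n.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    omega = np.pi * np.arange(1, a.size + 1)
    c = a * np.cos(omega * t) + b / omega * np.sin(omega * t)
    c_t = -a * omega * np.sin(omega * t) + b * np.cos(omega * t)
    # int_0^1 sin^2(n pi x) dx = int_0^1 cos^2(n pi x) dx = 1/2
    return 0.5 * np.sum((omega * c) ** 2 + c_t ** 2)

a = [1.0, 0.5, -0.25]   # assumed coefficients of u0
b = [0.0, 2.0, 1.0]     # assumed coefficients of v0
norms = [phase_space_norm_sq(a, b, t) for t in np.linspace(0.0, 3.0, 7)]
```

Up to rounding, all entries of `norms` coincide with the value at $t = 0$, which reflects $\|G(t)[u_0,v_0]\|_X = \|[u_0,v_0]\|_X$.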

9.12.3 The Transport Equation

Let $a$ be a given vector in $\mathbb{R}^n$, $n \ge 1$. Consider the equation
\[ u_t + a \cdot \nabla u = f(t,x), \quad t \ge 0,\ x \in \mathbb{R}^n, \tag{9.12.73} \]
with the initial condition
\[ u(0,x) = u_0(x), \quad x \in \mathbb{R}^n, \tag{9.12.74} \]
where $\nabla u$ means the gradient of $u$ with respect to $x$, and $a \cdot b$ denotes the usual scalar product of $a, b \in \mathbb{R}^n$. Equation (9.12.73) is known as the transport equation. The case $a = 0$ is trivial, so in what follows we assume $a \ne 0$ (i.e., $a = (a_1, \dots, a_n)$ contains nonzero components).


Let us choose $X = L^p(\mathbb{R}^n)$, $p \in (1,\infty)$, equipped with the usual norm. If $f \equiv 0$ and $u_0$ is a smooth function, then the solution of problem (9.12.73) and (9.12.74) is given by
\[ u(t,x) = u_0(x - ta), \quad t \ge 0,\ x \in \mathbb{R}^n. \]
This formula leads us to the definition of the semigroup $\{T(t) : X \to X;\ t \ge 0\}$,
\[ (T(t)v)(x) = v(x - ta), \quad v \in X,\ x \in \mathbb{R}^n,\ t \ge 0. \]
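The translation formula for $T(t)$ can be explored with a small numerical sketch. The choices below are assumptions made for illustration only: dimension $n = 1$, speed $a = 2$, exponent $p = 3$, a Gaussian $v$, and a Riemann sum on a wide uniform grid as a stand-in for the $L^p(\mathbb{R})$ norm; the helper names `T` and `lp_norm` are hypothetical.

```python
import numpy as np

def T(t, v, a):
    """Return the function x -> v(x - t*a), i.e. T(t)v for n = 1."""
    return lambda x: v(x - t * a)

def lp_norm(v, p, grid):
    """Riemann-sum approximation of the L^p norm on a uniform grid (assumed wide enough)."""
    dx = grid[1] - grid[0]
    return (np.sum(np.abs(v(grid)) ** p) * dx) ** (1.0 / p)

a = 2.0                              # assumed transport speed
p = 3.0                              # assumed exponent, p in (1, infinity)
v = lambda x: np.exp(-x ** 2)        # assumed sample element of X
grid = np.linspace(-30.0, 30.0, 60001)

# Semigroup property T(t + s) = T(t) T(s), checked pointwise on the grid.
lhs = T(0.7 + 0.5, v, a)(grid)
rhs = T(0.7, T(0.5, v, a), a)(grid)
semigroup_gap = np.max(np.abs(lhs - rhs))

# Isometry: the L^p norm is unchanged by the shift.
norm_gap = abs(lp_norm(T(1.5, v, a), p, grid) - lp_norm(v, p, grid))
```

Both `semigroup_gap` and `norm_gap` are at the level of rounding error here, in line with the claim that the family consists of isometries satisfying the semigroup law.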

It is easily seen that this is a semigroup of isometries, of class $C_0$:
\[ \lim_{t \to 0^+} \|T(t)v - v\|_X^p = \lim_{t \to 0^+} \int_{\mathbb{R}^n} |v(x - ta) - v(x)|^p\,dx = 0 \quad \forall v \in X, \]
by virtue of the Lebesgue Dominated Convergence Theorem. In order to determine its infinitesimal generator $A : D(A) \subset X \to X$, consider Eq. (9.12.73) with $f \equiv 0$ and deduce $Av = -a \cdot \nabla v$ for all $v \in D(A)$. This follows from the fact that the right derivative of $t \mapsto T(t)v$ at $t = 0$ is equal to $Av$. Indeed, if $v \in C_0^\infty(\mathbb{R}^n)$ (which is dense in $X$), then $v \in D(A)$ and
\[ \lim_{h \to 0^+} \big\| h^{-1}[T(h)v - v] + a \cdot \nabla v \big\|_X^p = \lim_{h \to 0^+} \int_{\mathbb{R}^n} \big| h^{-1}[v(x - ha) - v(x)] + a \cdot \nabla v(x) \big|^p\,dx = 0, \]
by virtue of the Mean Value Theorem (which ensures uniform convergence as $h \to 0^+$ under the above integral). Since the range of $A$ must be a subset of $X$, the maximal domain of $A$ is
\[ D(A) = \Big\{ v \in X;\ \frac{\partial v}{\partial x_i} \in X \text{ for all } i \in \{1, \dots, n\} \text{ for which } a_i \ne 0 \Big\}, \]
where $\partial v / \partial x_i$ denotes the partial derivative of $v$ with respect to $x_i$ in the sense of distributions. Since $C_0^\infty(\mathbb{R}^n)$ is dense in $X$ and $C_0^\infty(\mathbb{R}^n) \subset D(A)$ it follows that $D(A)$ is dense in $X$. Obviously, $A$ is a closed operator. We can use Theorem 9.29 to prove that $A$ is a generator (the generator of $\{T(t) : X \to X;\ t \ge 0\}$). Indeed, for all $u \in D(A) \setminus \{0\}$

(here $J$ denotes the duality mapping of $X$ and $u^* = J(u) = \|u\|_X^{2-p} |u|^{p-2} u$), we have
\[
\begin{aligned}
u^*(Au) &= \|u\|_X^{2-p} \int_{\mathbb{R}^n} Au \cdot |u|^{p-2} u\,dx \\
&= -\|u\|_X^{2-p} \int_{\mathbb{R}^n} \Big( \sum_{i=1}^n a_i \frac{\partial u}{\partial x_i} \Big) |u|^{p-2} u\,dx \\
&= -\frac{1}{p} \|u\|_X^{2-p} \int_{\mathbb{R}^n} \sum_{i=1}^n a_i \frac{\partial}{\partial x_i} |u|^p\,dx \\
&= 0,
\end{aligned}
\]
so $A$ is dissipative. To derive the last equality, we have used the fact that the function $g(x_i) = \int_{\mathbb{R}^{n-1}} |u|^p\,dx_1 \dots dx_{i-1}\,dx_{i+1} \dots dx_n$ belongs to $W^{1,1}(\mathbb{R})$, so $g(x_i) \to 0$ as $|x_i| \to \infty$ (prove it, or see [6, Corollary 8.9, p. 214]). Let $X^* = L^q(\mathbb{R}^n)$ be the dual of $X$ (i.e., $\frac{1}{p} + \frac{1}{q} = 1$). The adjoint $A^*$ of $A$ is defined by
\[ D(A^*) = \Big\{ w \in X^*;\ \frac{\partial w}{\partial x_i} \in X^* \ \forall i \in \{1, \dots, n\} \text{ for which } a_i \ne 0 \Big\}, \qquad A^* w = a \cdot \nabla w. \]
By a computation similar to that performed above for operator $A$, we infer that operator $A^*$ is also dissipative. Thus, according to Theorem 9.29, $A$ is m-dissipative, hence it is indeed the generator of $\{T(t) : X \to X;\ t \ge 0\}$. In fact, this semigroup extends to a $C_0$-group of isometries,
\[ (T(t)v)(x) = v(x - ta), \quad x \in \mathbb{R}^n,\ t \in \mathbb{R}, \]
with generator $A$ (see Sect. 9.4). Therefore, for all $u_0 \in X = L^p(\mathbb{R}^n)$ and $f \in L^1(0,\infty; X)$, problem (9.12.73) and (9.12.74) has a unique mild solution $u$,
\[ u(t,x) = (T(t)u_0)(x) + \int_0^t \big( T(t-s) f(s,\cdot) \big)(x)\,ds = u_0(x - ta) + \int_0^t f(s, x - (t-s)a)\,ds \quad \forall t \ge 0. \]
If $u_0 \in D(A)$ and $f \in C^1([0,\infty); X)$, then $u \in C^1([0,\infty); X)$ and $u$ is a classical solution, with the additional property $a \cdot \nabla u \in C([0,\infty); X)$.
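The explicit formula for the mild solution can be checked against Eq. (9.12.73) numerically. The sketch below makes several assumptions that are not in the text: $n = 1$, $a = 2$, Gaussian data $u_0$ and $f$, a Gauss–Legendre rule for the time integral, and central differences for the derivatives.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

a = 2.0                                        # assumed transport speed (n = 1)
u0 = lambda x: np.exp(-x ** 2)                 # assumed smooth initial datum
f = lambda t, x: np.cos(t) * np.exp(-x ** 2)   # assumed smooth source term

# Gauss-Legendre nodes/weights on [-1, 1]; rescaled to [0, t] inside u().
nodes, weights = leggauss(40)

def u(t, x):
    """u(t, x) = u0(x - t a) + int_0^t f(s, x - (t - s) a) ds, by quadrature."""
    s = 0.5 * t * (nodes + 1.0)
    integral = 0.5 * t * np.sum(weights * f(s, x - (t - s) * a))
    return u0(x - t * a) + integral

def residual(t, x, h=1e-4):
    """Central-difference approximation of u_t + a u_x - f at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    return u_t + a * u_x - f(t, x)

worst = max(abs(residual(t, x)) for t in (0.3, 1.0, 2.5) for x in (-1.0, 0.4, 3.0))
```

For smooth data such as these, `worst` is limited only by the finite-difference error, which is consistent with $u$ being a classical solution in that case.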

Remark 9.51. If $n = 1$ and $a = -1$, then the above group $\{T(t);\ t \in \mathbb{R}\}$ is a group of translations defined on $X = L^p(\mathbb{R})$.

Remark 9.52. Since the above operator $A$ generates a $C_0$-group of isometries, it follows by Corollary 9.34 that $\mathbb{R} \setminus \{0\} \subset \rho(A)$. Therefore for all $\lambda \in \mathbb{R} \setminus \{0\}$ and $g \in X = L^p(\mathbb{R}^n)$ there exists a unique solution $u \in D(A)$ satisfying the equation
\[ \lambda u(x) + a \cdot \nabla u(x) = g(x), \quad x \in \mathbb{R}^n. \]

9.12.4 The Telegraph System

For an electrical long line we have the following PDE system, called the telegraph system (see, e.g., [36, p. 320])
\[
\begin{cases}
L i_t + v_x + R i = e(t,x),\\
C v_t + i_x + G v = 0,
\end{cases}
\qquad t \ge 0,\ x \in (0,1),
\]
with the boundary conditions (Ohm's law at both ends of the line)
\[ v(t,0) + R_0 i(t,0) = 0, \quad R_1 i(t,1) = v(t,1), \quad t \ge 0, \]
and initial conditions
\[ i(0,x) = i_0(x), \quad v(0,x) = v_0(x), \quad x \in (0,1), \]
where $i = i(t,x)$ is the current flowing in the line and $v = v(t,x)$ represents the voltage across the line; $R \ge 0$, $R_0 > 0$, $R_1 > 0$, $L > 0$, $C > 0$, $G \ge 0$ are constants representing resistances, inductance, capacitance, and conductance, respectively; $e = e(t,x)$ is the voltage per unit length impressed along the line in series with it. We regard the unknown pair $[i,v]$ as a function of $t \ge 0$ with values in $X = L^2(0,1) \times L^2(0,1)$. Consider in $X$ the scalar product
\[ \langle [f_1,g_1],[f_2,g_2] \rangle = L \int_0^1 f_1 f_2\,dx + C \int_0^1 g_1 g_2\,dx, \]
and the norm induced by this scalar product, so $X$ is a real Hilbert space. Define $A : D(A) \subset X \to X$ by
\[ D(A) = \{ [f,g] \in X;\ f', g' \in L^2(0,1),\ g(0) + R_0 f(0) = 0,\ R_1 f(1) = g(1) \}, \]
\[ A[f,g] = \Big[ -\frac{1}{L}(g' + Rf),\ -\frac{1}{C}(f' + Gg) \Big]. \]

Operator $A$ is densely defined, since $C_0^\infty(0,1) \times C_0^\infty(0,1) \subset D(A)$ and is dense in $X$. It is also easily seen that $A$ is a closed operator: it suffices to note that the derivative is a closed operator in $L^2(0,1)$ and that convergence in $H^1(0,1)$ implies convergence in $C[0,1]$ (cf. Arzelà–Ascoli). It turns out that $A$ is an m-dissipative operator (thus confirming the fact that $A$ is densely defined and closed, cf. Theorems 9.10 and 9.27). Indeed, for all $[f,g] \in D(A)$ we have
\[
\begin{aligned}
\langle A[f,g],[f,g] \rangle &= \Big\langle \Big[ -\frac{1}{L}(g' + Rf),\ -\frac{1}{C}(f' + Gg) \Big], [f,g] \Big\rangle \\
&= -\int_0^1 f(g' + Rf)\,dx - \int_0^1 g(f' + Gg)\,dx \\
&= -\int_0^1 (fg)'\,dx - R \int_0^1 f^2\,dx - G \int_0^1 g^2\,dx \\
&\le -\int_0^1 (fg)'\,dx \\
&= f(0)g(0) - f(1)g(1) \\
&= -R_0 f(0)^2 - R_1 f(1)^2 \le 0,
\end{aligned}
\]
that is to say, $A$ is dissipative (with respect to the scalar product $\langle \cdot,\cdot \rangle$). Let us now prove that $R(\lambda I - A) = X$ for all $\lambda > 0$, i.e., for all $\lambda > 0$ and $[h,k] \in X$ there exists a solution $[f,g] \in D(A)$ of the equation
\[ \lambda[f,g] - A[f,g] = [h,k]. \tag{9.12.75} \]
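Returning to the dissipativity computation, the identity $\langle A[f,g],[f,g] \rangle = -R_0 f(0)^2 - R_1 f(1)^2 - R\int_0^1 f^2\,dx - G\int_0^1 g^2\,dx$ can be confirmed exactly on polynomial test functions. This is only an illustrative sketch: the numerical values of $R$, $G$, $R_0$, $R_1$ and the particular polynomials are assumptions, and `integral01` is a hypothetical helper.

```python
from numpy.polynomial import Polynomial

# Assumed sample values of the constants (any admissible choice would do).
R, G, R0, R1 = 0.5, 0.2, 2.0, 3.0

def integral01(p):
    """Exact integral of a polynomial over (0, 1)."""
    P = p.integ()
    return P(1.0) - P(0.0)

x = Polynomial([0.0, 1.0])
f = Polynomial([1.0, -2.0, 0.5])          # arbitrary polynomial f
# g is built to satisfy g(0) = -R0 f(0) and g(1) = R1 f(1), so [f, g] is in D(A).
g = -R0 * f(0.0) * (1 - x) + R1 * f(1.0) * x + 4.0 * x * (1 - x)

# <A[f,g],[f,g]>: the weights L and C cancel against the factors 1/L and 1/C.
inner = -integral01(f * (g.deriv() + R * f)) - integral01(g * (f.deriv() + G * g))
expected = (-R0 * f(0.0) ** 2 - R1 * f(1.0) ** 2
            - R * integral01(f * f) - G * integral01(g * g))
```

Here `inner` agrees with `expected` up to rounding and is nonpositive, as the dissipativity estimate predicts.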

Equation (9.12.75) can be written as the following boundary value problem
\[
\begin{cases}
f' + (C\lambda + G) g = Ck,\\
g' + (L\lambda + R) f = Lh,\\
g(0) + R_0 f(0) = 0, \quad R_1 f(1) = g(1).
\end{cases}
\]
We compute the general solution $[f,g]$ of the above differential system (see the solution of Exercise 9.13) and then impose upon it the above boundary conditions to deduce that there exists a unique $[f,g] \in D(A)$ satisfying the problem. The details are left to the reader. Thus, $A$ is m-dissipative, so it generates a $C_0$-contraction semigroup on $X$, say $\{T(t) : X \to X;\ t \ge 0\}$ (cf. Theorem 9.27). Therefore, for all $[i_0,v_0] \in X$ and $e \in L^1_{\mathrm{loc}}([0,\infty); L^2(0,1))$ there exists a unique mild solution $[i,v] \in C([0,\infty); X)$ of the Cauchy problem
\[
\begin{cases}
\frac{d}{dt}[i(t,\cdot), v(t,\cdot)] = A[i(t,\cdot), v(t,\cdot)] + [e(t,\cdot), 0], & t \ge 0,\\
[i(0,\cdot), v(0,\cdot)] = [i_0, v_0],
\end{cases}
\]
which is the representation in $X$ of our initial-boundary value problem formulated above. This mild solution can be written explicitly in terms of $T(t)$, $i_0$, $v_0$, and $e$, by using the usual variation of constants formula. If $[i_0,v_0] \in D(A)$ and $e \in C^1([0,\infty); L^2(0,1))$, then $[i,v]$ is a classical solution, $[i,v] \in C^1([0,\infty); X) \cap C([0,\infty); H^1(0,1)^2)$. It is worth pointing out that the condition $[i_0,v_0] \in D(A)$ implies compatibility of the initial data with the boundary conditions and, as a by-product of this compatibility plus smoothness of the function $e$, we obtain a classical solution $[i,v]$ with the above properties. In particular $i$, $v$ are continuous on $[0,\infty) \times [0,1]$ and satisfy the boundary conditions for all $t \ge 0$.

Remark 9.53. All the above applications can be extended to the semilinear case, as pointed out in Remark 9.50.

Comment. This chapter represents a short introduction to the theory of semigroups of linear operators, including its implications for linear evolution equations and some applications. Some subjects in the field have not been addressed, e.g., semigroups of compact operators, differentiable semigroups, analytic semigroups, dual semigroups, etc. For more information about linear operator semigroups and their applications, the reader is referred to [7], [12], [19], [21], [39], [49], [51]. For more details on the regularity of solutions to linear evolution equations, including significant examples from the theory of linear partial differential equations, see [6], [19], [39], [49].

9.13 Exercises

1. Compute $T(t) = e^{tA}$, $t \in \mathbb{R}$, where
\[ \text{(i) } A = \begin{pmatrix} -1 & -1 \\ 2 & -4 \end{pmatrix}; \qquad \text{(ii) } A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}; \qquad \text{(iii) } A = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}. \]

2. If $A$ is an $n \times n$ complex matrix, then the following equivalences hold true:

(a) $\sup_{t \ge 0} \|e^{tA}\| < \infty \iff$ all eigenvalues $\lambda$ of $A$ satisfy $\operatorname{Re} \lambda \le 0$ and whenever $\operatorname{Re} \lambda = 0$, then $\lambda$ is a simple eigenvalue;

(b) $\lim_{t \to \infty} \|e^{tA}\| = 0 \iff$ all eigenvalues $\lambda$ of $A$ satisfy $\operatorname{Re} \lambda < 0$.

3. Let $(X, \|\cdot\|)$ be a Banach space and let $A \in L(X)$. Consider in $X$ the Cauchy problem
\[ u'(t) = Au(t),\ t \in \mathbb{R}, \qquad u(0) = u_0. \]
Show that if $u_0 \ne 0$ then $u(t) \ne 0$ for all $t \in \mathbb{R}$.

4. Let $(X, \|\cdot\|)$ be a Banach space. Show that for every $C_0$-semigroup $\{T(t) : X \to X;\ t \ge 0\}$ the $X$-valued function $(t,x) \mapsto T(t)x$ is continuous on $[0,\infty) \times X$.

5. Let $X$ denote the space of all functions $f : \mathbb{R} \to \mathbb{R}$ which are continuous and bounded, equipped with the sup-norm. For some $\lambda > 0$ and $\delta > 0$ define $G(t) : X \to X$ by
\[ (G(t)f)(x) = e^{-\lambda t} \sum_{k=0}^\infty \frac{(\lambda t)^k}{k!} f(x - k\delta), \quad t \ge 0,\ f \in X,\ x \in \mathbb{R}. \]

(a) Prove that $\{G(t) : X \to X;\ t \ge 0\}$ is a uniformly continuous group and determine its infinitesimal generator;

(b) Show that
\[ \|G(t)\| = \begin{cases} 1 & \text{if } t \ge 0,\\ e^{-2\lambda t} & \text{if } t < 0. \end{cases} \]

6. Let $X$ be the real Banach space of all functions $f : \mathbb{R} \to \mathbb{R}$ that are continuous on $\mathbb{R}$ and $p$-periodic with some period $p > 0$, equipped with the sup-norm
\[ \|f\| = \sup_{0 \le s \le p} |f(s)| \quad \forall f \in X. \]
Define $(T(t)f)(s) = f(t + s)$, $t, s \in \mathbb{R}$, $f \in X$. Show that $\{T(t) : X \to X;\ t \in \mathbb{R}\}$ is a $C_0$-group of isometries, i.e., $\|T(t)\| = 1$, $t \in \mathbb{R}$. Find its infinitesimal generator.

7. Let $M = (m_{ij})$ be a $k \times k$ matrix with real entries. Denote $X = L^p(\mathbb{R}^k)$, where $p \in [1,\infty)$. For $t \in \mathbb{R}$ define $G(t) : X \to X$ by
\[ (G(t)f)(x) = f(e^{-tM}x), \quad f \in X,\ \text{a.a. } x \in \mathbb{R}^k. \]

(a) Show that $\{G(t) : X \to X;\ t \ge 0\}$ is a $C_0$-group and determine its infinitesimal generator;

(b) If $\sum_{i=1}^k m_{ii} = 0$, then $\|G(t)\| = 1$ for all $t \in \mathbb{R}$.

8. Let $X$ be the real Banach space of all functions $f : [0,\infty) \to \mathbb{R}$ that are bounded and uniformly continuous on $[0,\infty)$, equipped with the usual sup-norm. Define
\[ (T(t)f)(s) = \begin{cases} f(s - t) & \text{for } s - t \ge 0,\\ f(0) & \text{for } s - t < 0. \end{cases} \]
Show that $\{T(t) : X \to X;\ t \ge 0\}$ is a $C_0$-semigroup and determine its infinitesimal generator.

9. For a given $1 \le p < \infty$, consider the real Banach space $X = l^p$ of all sequences $(x_n)_{n \in \mathbb{N}}$ in $\mathbb{R}$ satisfying $\sum_{n=1}^\infty |x_n|^p < \infty$, equipped with the usual norm
\[ \|(x_n)\|_p = \Big( \sum_{n=1}^\infty |x_n|^p \Big)^{1/p} \quad \forall (x_n)_{n \in \mathbb{N}} \in X. \]
Let $(c_n)_{n \in \mathbb{N}}$ be a sequence of positive numbers. Define $T(t) : X \to X$ by
\[ T(t)(x_n)_{n \in \mathbb{N}} = (e^{-c_n t} x_n)_{n \in \mathbb{N}} \quad \forall (x_n)_{n \in \mathbb{N}} \in X,\ t \ge 0. \]

(a) Show that $\{T(t) : X \to X;\ t \ge 0\}$ is a $C_0$-semigroup of contractions;

(b) Determine its infinitesimal generator;

(c) Prove that $\{T(t) : X \to X;\ t \ge 0\}$ is uniformly continuous if and only if $(c_n)$ is bounded.

10. Let $H = L^2(0,1)$ be equipped with the usual scalar product and the corresponding induced norm. Define $A : D(A) \subset H \to H$ by
\[ D(A) = \{ v \in H^1(0,1);\ v(0) = 0 \}, \qquad Av = -v' \quad \forall v \in D(A). \]
Show that $A$ generates a $C_0$-semigroup of contractions $\{T(t) : H \to H;\ t \ge 0\}$. Find the explicit form of this semigroup and show that, for $u_0 \in H$, $u(t,x) = (T(t)u_0)(x)$ satisfies the transport equation $u_t + u_x = 0$ in $\Omega = (0,\infty) \times (0,1)$ in the sense of distributions.

11. Consider the initial-boundary value problem
\[
\begin{cases}
u_t - u_{xx} + au = f(t,x), & t > 0,\ x \in (0,1),\\
u(t,0) = 0, \quad u_x(t,1) + \alpha u(t,1) = 0, & t > 0,\\
u(0,x) = u_0(x), & x \in (0,1),
\end{cases}
\]
where $a \in \mathbb{R}$, $\alpha > 0$, $u_0 \in L^2(0,1)$, $f \in L^1_{\mathrm{loc}}[0,\infty)$. Solve this problem using the semigroup approach. Solve the more general problem obtained by replacing the term $au$ in the above equation by $h(u)$, where $h : \mathbb{R} \to \mathbb{R}$ is a Lipschitz function.

12. Consider the initial-boundary value problem
\[
\begin{cases}
u_{tt} - u_{xx} = f(t,x), & t > 0,\ x \in (0,1),\\
u(t,0) = 0, \quad u_x(t,1) = 0, & t > 0,\\
u(0,x) = u_0(x),
\end{cases}
\]
where $u_0 \in H^1(0,1)$, $u_0(0) = 0$, $f \in L^1_{\mathrm{loc}}[0,\infty)$. Solve this problem using the semigroup approach.

13. Consider the telegraph differential system
\[
\begin{cases}
L i_t + v_x + R i = e(t,x),\\
C v_t + i_x + G v = 0,
\end{cases}
\qquad t \ge 0,\ x \in (0,1),
\]
with the following boundary conditions
\[ v(t,0) + R_0 i(t,0) = 0, \quad -i(t,1) + C_1 v_t(t,1) + D_1 v(t,1) = e_1(t), \quad t > 0, \]
and initial conditions
\[ i(0,x) = i_0(x), \quad v(0,x) = v_0(x), \quad x \in (0,1), \]
where $C > 0$, $C_1 > 0$, $L > 0$, $D_1 \ge 0$, $G \ge 0$, $R \ge 0$, $R_0 \ge 0$, and $e$, $e_1$ are given functions.

(a) Solve the above problem by using the semigroup approach;

(b) What can you say about existence in the case when $D_1$, $G$, $R$ are Lipschitz functions from $\mathbb{R}$ into itself?

Chapter 10

Solving Linear Evolution Equations by the Fourier Method

In Chap. 9 we used the linear semigroup approach to solve inhomogeneous linear evolution equations. For the same purpose, we use here the Fourier method. More precisely, under appropriate conditions on the linear operators governing such equations, we find the solutions in the form of Fourier series expansions. This approach is based in an essential way on the results discussed in Chap. 8.

10.1 First Order Linear Evolution Equations

Consider the Cauchy problem
\[ u'(t) + Qu(t) = f(t), \quad 0 < t < T, \tag{E} \]
\[ u(0) = u_0, \tag{IC} \]
where $Q$ satisfies the set of conditions (a) originally presented in Chap. 8:

© Springer Nature Switzerland AG 2019 G. Moro¸sanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4 10

(a) $Q : D(Q) \subset H \to H$ is a linear, densely defined, self-adjoint, strongly positive operator, where $(H, (\cdot,\cdot), \|\cdot\|)$ is a real, infinite dimensional, separable Hilbert space.

We also assume that the energetic space $H_E$ defined in Chap. 8 satisfies

(b) $H_E$ is compactly embedded into $H$,

so that Theorem 8.16 holds true. The notation in the statement of that theorem will be also used in what follows.

If $Q$ satisfies (a) then $-Q$ generates a $C_0$-semigroup of contractions (see Theorem 9.29) and so for any $u_0 \in H$ and $f \in L^1(0,T; H)$ there exists a unique mild solution $u = u(t)$ of problem (E), (IC) given by the variation of constants formula. If $u_0 \in D(Q)$ and $f \in C^1([0,T]; H)$, then $u$ is a classical solution (cf. Theorem 9.47). The Fourier method we are going to discuss next offers more possibilities to investigate the regularity of solutions and provides good approximations of solutions in terms of eigenfunctions of the operator $Q$. Let us start with a specific result.

Theorem 10.1. Assume that (a) and (b) above are fulfilled. Then for all $u_0 \in H$ and $f \in L^2(0,T; H)$ there exists a unique function $u \in C([0,T]; H) \cap C((0,T]; H_E) \cap L^2(0,T; H_E)$ with $\sqrt{t}\,u' \in L^2(0,T; H)$ which satisfies (IC) and Eq. (E) for a.a. $t \in (0,T)$. This function $u$ (called a strong solution of problem (E), (IC)) is expressed as the Fourier series expansion
\[ u(t) = \sum_{n=1}^\infty u_n(t) e_n, \tag{10.1.1} \]
where $\{e_n\}_{n=1}^\infty$ is the orthonormal basis in $H$ provided by Theorem 8.16, and $u_n(t) = (u(t), e_n)$, $n = 1, 2, \dots$ If $u_0 \in H_E$ and $f \in L^2(0,T; H)$, then $u \in H^1(0,T; H) \cap C([0,T]; H_E)$, $u(t) \in D(Q)$ for a.a. $t \in (0,T)$, and $Qu \in L^2(0,T; H)$.

Proof. Assume first that $u_0 \in H$ and $f \in L^2(0,T; H)$. As mentioned before, we already know that problem (E), (IC) has a unique mild solution $u$ given by the variation of constants formula. A strong solution is clearly a mild one, so the uniqueness part of the theorem is obvious.


In fact, uniqueness also follows by a simple direct proof. If $y = y(t)$ denotes the difference of two strong solutions of problem (E), (IC), then $y(0) = 0$ and
\[ y'(t) + Qy(t) = 0 \quad \text{for a.a. } t \in (0,T). \]
Multiplying this equation by $y(t)$ and taking into account the positivity of $Q$ we obtain
\[ \frac{1}{2} \frac{d}{dt} \|y(t)\|^2 = (y'(t), y(t)) \le 0 \quad \text{for a.a. } t \in (0,T), \]
which shows that the function $t \mapsto \|y(t)\|$ is nonincreasing on $[0,T]$. Since $y(0) = 0$ it follows that $y$ is the null function, i.e., the two strong solutions coincide.

We could show that, under our assumptions, the mild solution $u$ is in fact a strong solution by a limiting procedure applied to a sequence of strong solutions $u_n \in C^1([0,T]; H)$ (given by Theorem 9.47) corresponding to sequences $u_{0n} \in D(Q)$ and $f_n \in C^1([0,T]; H)$ which satisfy $\|u_{0n} - u_0\| \to 0$, $\|f_n - f\|_{L^2(0,T; H)} \to 0$. However, we shall provide here the existence proof using the Fourier method. Specifically, we seek a solution in the form (10.1.1) where the $u_n$'s are unknown real valued functions. For $u_0$ we have the Fourier expansion
\[ u_0 = \sum_{n=1}^\infty u_{0n} e_n \quad \text{with } u_{0n} = (u_0, e_n), \qquad \|u_0\|^2 = \sum_{n=1}^\infty u_{0n}^2. \]
Similarly, for a.a. $t \in (0,T)$,
\[ f(t) = \sum_{n=1}^\infty f_n(t) e_n \quad \text{with } f_n(t) = (f(t), e_n), \qquad \|f(t)\|^2 = \sum_{n=1}^\infty f_n(t)^2. \]
Denoting $s_k(t) = \sum_{n=1}^k f_n(t) e_n$, we can see that
\[ \|s_k(t)\|^2 = \sum_{n=1}^k f_n(t)^2 \le \|f(t)\|^2 \quad \forall k \in \mathbb{N},\ \text{a.a. } t \in (0,T), \]
so by the Lebesgue Dominated Convergence Theorem $s_k \to f$ strongly in $L^2(0,T; H)$. Now we impose conditions on $u$ (given by (10.1.1)) to formally satisfy Eq. (E),
\[ \sum_{n=1}^\infty u_n'(t) e_n + \sum_{n=1}^\infty u_n(t) \lambda_n e_n = \sum_{n=1}^\infty f_n(t) e_n, \]
and (IC),
\[ \sum_{n=1}^\infty u_n(0) e_n = \sum_{n=1}^\infty u_{0n} e_n, \]
so identifying the coefficients of the $e_n$'s yields
\[ u_n'(t) + \lambda_n u_n(t) = f_n(t) \quad \text{for all } n \in \mathbb{N} \text{ and a.a. } t \in (0,T), \tag{10.1.2} \]
\[ u_n(0) = u_{0n}, \quad n \in \mathbb{N}, \]
hence
\[ u_n(t) = e^{-\lambda_n t} u_{0n} + \int_0^t e^{-\lambda_n (t-s)} f_n(s)\,ds \quad \forall t \in [0,T],\ n \in \mathbb{N}. \tag{10.1.3} \]
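Formula (10.1.3) is straightforward to evaluate numerically for a single mode. The sketch below is an illustration under assumptions: the function name `fourier_coefficient`, the sample values of $\lambda_n$, $u_{0n}$, $t$, and the use of a 30-point Gauss–Legendre rule for the integral are all choices made here, not part of the text. The test case $f_n \equiv 1$ admits the closed form $e^{-\lambda_n t} u_{0n} + (1 - e^{-\lambda_n t})/\lambda_n$.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

_nodes, _weights = leggauss(30)   # Gauss-Legendre rule on [-1, 1]

def fourier_coefficient(t, lam, u0n, fn):
    """u_n(t) = exp(-lam t) u0n + int_0^t exp(-lam (t - s)) fn(s) ds, cf. (10.1.3)."""
    s = 0.5 * t * (_nodes + 1.0)          # quadrature nodes mapped to [0, t]
    duhamel = 0.5 * t * np.sum(_weights * np.exp(-lam * (t - s)) * fn(s))
    return np.exp(-lam * t) * u0n + duhamel

lam, u0n, t = 4.0, 1.5, 0.8               # assumed sample values
numeric = fourier_coefficient(t, lam, u0n, lambda s: np.ones_like(s))
closed_form = np.exp(-lam * t) * u0n + (1.0 - np.exp(-lam * t)) / lam
```

The two values `numeric` and `closed_form` agree to rounding error for this smooth forcing coefficient.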

Therefore, $u_n \in H^1(0,T)$ and, since $\lambda_n \ge \lambda_1 > 0$, we easily obtain by Hölder's inequality
\[ u_n(t)^2 \le 2 \Big( u_{0n}^2 + T \int_0^T f_n(s)^2\,ds \Big) \quad \forall t \in [0,T],\ n \in \mathbb{N}. \tag{10.1.4} \]
Since $u_{0n}^2$ and $\int_0^T f_n(s)^2\,ds$ are terms of convergent series, it follows from (10.1.4), by the Weierstrass M-test, that the series $\sum_{n=1}^\infty u_n(t)^2$ is uniformly convergent in $[0,T]$ and consequently so is the series (10.1.1), and its sum $u$ is in $C([0,T]; H)$. Next, we multiply Eq. (10.1.2) by $t u_n'(t)$ and then integrate the resulting equation over $[0,T]$ to obtain, for all $n \in \mathbb{N}$,
\[
\begin{aligned}
\int_0^T t u_n'(t)^2\,dt + \frac{\lambda_n T}{2} u_n(T)^2 &= \frac{\lambda_n}{2} \int_0^T u_n(t)^2\,dt + \int_0^T t f_n(t) u_n'(t)\,dt \\
&\le \frac{\lambda_n}{2} \int_0^T u_n(t)^2\,dt + \frac{1}{2} \int_0^T t u_n'(t)^2\,dt + \frac{1}{2} \int_0^T t f_n(t)^2\,dt.
\end{aligned}
\tag{10.1.5}
\]
On the other hand, multiplying (10.1.2) by $u_n(t)$ and then integrating over $[0,T]$ we obtain
\[
\frac{1}{2} \big( u_n(T)^2 - u_{0n}^2 \big) + \lambda_n \int_0^T u_n(t)^2\,dt = \int_0^T f_n(t) u_n(t)\,dt \le \frac{1}{2} \int_0^T f_n(t)^2\,dt + \frac{1}{2} \int_0^T u_n(t)^2\,dt,
\]
for all $n \in \mathbb{N}$, so
\[ \sum_{n=1}^\infty \lambda_n \int_0^T u_n(t)^2\,dt < \infty, \tag{10.1.6} \]
hence
\[ \sum_{n=1}^\infty \big( u(t), \lambda_n^{-1/2} e_n \big)_E^2 = \sum_{n=1}^\infty \big( u(t), \lambda_n^{-1/2} Q e_n \big)^2 = \sum_{n=1}^\infty \lambda_n u_n(t)^2 \]
is convergent for a.a. $t \in (0,T)$, and $t \mapsto \|u(t)\|_E^2 = \sum_{n=1}^\infty \lambda_n u_n(t)^2$ is summable on $(0,T)$, i.e., $u \in L^2(0,T; H_E)$. From (10.1.5) and (10.1.6) we infer that
\[ \sum_{n=1}^\infty \int_0^T t u_n'(t)^2\,dt < \infty, \]
so $\sqrt{t}\,u' \in L^2(0,T; H)$. We also have the inequality (similar to (10.1.5))
\[ \frac{\lambda_n}{2} t u_n(t)^2 \le -\frac{1}{2} \int_0^t s u_n'(s)^2\,ds + \frac{\lambda_n}{2} \int_0^T u_n(s)^2\,ds + \frac{1}{2} \int_0^T s f_n(s)^2\,ds, \]
for all $t \in [0,T]$, $n \in \mathbb{N}$, which combined with (10.1.6) implies (by the Weierstrass M-test) that $\sum_{n=1}^\infty \lambda_n t u_n(t)^2$ is uniformly convergent in $[0,T]$, so $\sqrt{t}\,u \in C([0,T]; H_E)$. This shows that $u \in C((0,T]; H_E)$. Now, passing to the limit in $L^2(0,T; H)$ as $k \to \infty$ in the equation
\[ \sum_{n=1}^k f_n(t) e_n = \sum_{n=1}^k u_n'(t) e_n + \sum_{n=1}^k u_n(t) \lambda_n e_n = \Big( \sum_{n=1}^k u_n(t) e_n \Big)' + Q \Big( \sum_{n=1}^k u_n(t) e_n \Big), \]
we conclude that $u$ satisfies Eq. (E) for a.a. $t \in (0,T)$. This uses the fact that $Q$ is a closed operator. It is also obvious that $u(0) = u_0$.

Now, let us assume that $u_0 \in H_E$ and $f \in L^2(0,T; H)$. Multiplying Eq. (10.1.2) by $u_n'(t)$ we obtain
\[ u_n'(t)^2 + \frac{\lambda_n}{2} \frac{d}{dt} u_n(t)^2 = f_n(t) \cdot u_n'(t) \quad \text{for a.a. } t \in (0,T),\ \forall n \in \mathbb{N}. \tag{10.1.7} \]

It follows, by integration over $[0,T]$, that
\[
\int_0^T u_n'(t)^2\,dt + \frac{\lambda_n}{2} \big( u_n(T)^2 - u_{0n}^2 \big) = \int_0^T f_n(t) \cdot u_n'(t)\,dt \le \frac{1}{2} \int_0^T f_n(t)^2\,dt + \frac{1}{2} \int_0^T u_n'(t)^2\,dt,
\tag{10.1.8}
\]
for all $n \in \mathbb{N}$. Since $u_0 \in H_E$ (i.e., $\sum_{n=1}^\infty \lambda_n u_{0n}^2 < \infty$), the last inequality implies
\[ \sum_{n=1}^\infty \int_0^T u_n'(t)^2\,dt < \infty, \]
hence $\sum_{n=1}^\infty u_n'(t) e_n$ is convergent in $L^2(0,T; H)$ and, obviously, its sum is $u' \in L^2(0,T; H)$. Integration over $[0,t]$ of (10.1.7) leads to an inequality similar to (10.1.8) which implies that $\sum_{n=1}^\infty \lambda_n u_n(t)^2$ is uniformly convergent in $[0,T]$ and so $u \in C([0,T]; H_E)$. As $u', f \in L^2(0,T; H)$ we derive from Eq. (E) that $Qu \in L^2(0,T; H)$.

Remark 10.2. For further regularity results see, e.g., [22, Chapter 7].

We continue with a result on the existence of a periodic solution of Eq. (E).

Theorem 10.3. Assume that (a) and (b) are fulfilled and $f \in L^2(0,T; H)$. Then, there exists a unique function $u \in H^1(0,T; H) \cap C([0,T]; H_E)$ satisfying Eq. (E) for a.a. $t \in (0,T)$ and $u(0) = u(T)$, and $u$ is given by Eq. (10.1.1), where
\[ u_n(t) = d_n e^{-\lambda_n t} + \int_0^t e^{-\lambda_n (t-s)} f_n(s)\,ds, \]
with
\[ d_n = \big( 1 - e^{-\lambda_n T} \big)^{-1} \int_0^T e^{-\lambda_n (T-s)} f_n(s)\,ds, \quad n = 1, 2, \dots \]

Proof. By Theorem 10.1, for all $u_0 \in H$ there is a unique strong solution $u = u(t, u_0)$ of problem (E), (IC) which belongs to $C([0,T]; H) \cap C((0,T]; H_E) \cap L^2(0,T; H_E)$ with $\sqrt{t}\,u'(t, u_0) \in L^2(0,T; H)$. For two vectors $u_0, v_0 \in H$ we have
\[ \frac{d}{dt} [u(t,u_0) - u(t,v_0)] + Q[u(t,u_0) - u(t,v_0)] = 0 \quad \text{for a.a. } t \in (0,T). \]
If we multiply this equation by $u(t,u_0) - u(t,v_0)$ and use the strong positivity of $Q$ (with some constant $c > 0$), we get
\[ \frac{1}{2} \frac{d}{dt} \|u(t,u_0) - u(t,v_0)\|^2 + c \|u(t,u_0) - u(t,v_0)\|^2 \le 0 \quad \text{for a.a. } t \in (0,T), \]
or, equivalently,
\[ \frac{d}{dt} \Big[ e^{2ct} \|u(t,u_0) - u(t,v_0)\|^2 \Big] \le 0 \quad \text{for a.a. } t \in (0,T), \]
which shows that the function $t \mapsto e^{ct} \|u(t,u_0) - u(t,v_0)\|$ is nonincreasing and hence
\[ \|u(t,u_0) - u(t,v_0)\| \le e^{-ct} \|u_0 - v_0\| \quad \forall t \in [0,T]. \tag{10.1.9} \]
Now let us consider the so-called Poincaré operator $P : H \to H$ defined by
\[ P u_0 = u(T; u_0) \quad \forall u_0 \in H. \]
From (10.1.9) we see that $P$ is a contraction:
\[ \|P u_0 - P v_0\| \le e^{-cT} \|u_0 - v_0\| \quad \forall u_0, v_0 \in H. \]
By the Banach Contraction Principle (see Chap. 2) it follows that $P$ has a unique fixed point $u_0^* \in H$, i.e., $P u_0^* = u_0^*$. In other words, $u(T, u_0^*) = u_0^*$, which is to say, $u(t, u_0^*)$ is the unique periodic solution of Eq. (E). Since $u_0^* = u(T, u_0^*)$ we deduce from the first part of Theorem 10.1 that $u_0^* \in H_E$. Therefore, by the second part of Theorem 10.1, it follows that $u(t, u_0^*) \in H^1(0,T; H) \cap C([0,T]; H_E)$. Clearly $u(t, u_0^*)$ is the sum of a Fourier series of the form (10.1.1) which is convergent in $C([0,T]; H_E)$ since $u_0^* \in H_E$. From the periodicity condition $u_0^* = u(T, u_0^*)$ we infer
\[ u_n(0) = u_n(T) \quad \forall n \in \mathbb{N}, \tag{10.1.10} \]
where the $u_n$'s are solutions of (10.1.2), i.e.,
\[ u_n(t) = d_n e^{-\lambda_n t} + \int_0^t e^{-\lambda_n (t-s)} f_n(s)\,ds \quad \forall t \in [0,T],\ n \in \mathbb{N}. \]
Taking into account (10.1.10) we can easily find
\[ d_n = \big( 1 - e^{-\lambda_n T} \big)^{-1} \int_0^T e^{-\lambda_n (T-s)} f_n(s)\,ds, \quad n \in \mathbb{N}. \]
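For a single mode, both the closed form of $d_n$ and the fixed-point argument behind the Poincaré operator can be illustrated numerically. The following sketch rests on assumptions made here for illustration: sample values $\lambda_n = 2$, $T = 1$, a forcing coefficient $f_n(s) = \sin(2\pi s) + 1$, Gauss–Legendre quadrature, and the hypothetical helper names `duhamel` and `periodic_initial_value`.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

_nodes, _weights = leggauss(30)   # Gauss-Legendre rule on [-1, 1]

def duhamel(t, lam, fn):
    """int_0^t exp(-lam (t - s)) fn(s) ds by Gauss-Legendre quadrature."""
    s = 0.5 * t * (_nodes + 1.0)
    return 0.5 * t * np.sum(_weights * np.exp(-lam * (t - s)) * fn(s))

def periodic_initial_value(lam, T, fn):
    """d_n = (1 - exp(-lam T))^{-1} int_0^T exp(-lam (T - s)) fn(s) ds."""
    return duhamel(T, lam, fn) / (1.0 - np.exp(-lam * T))

lam, T = 2.0, 1.0                               # assumed sample values
fn = lambda s: np.sin(2 * np.pi * s) + 1.0      # assumed forcing coefficient
d = periodic_initial_value(lam, T, fn)
u_T = np.exp(-lam * T) * d + duhamel(T, lam, fn)   # u_n(T) when u_n(0) = d

# Scalar analogue of the Poincare map, d0 -> u_n(T; d0); its iterates approach d.
d_iter = 0.0
for _ in range(60):
    d_iter = np.exp(-lam * T) * d_iter + duhamel(T, lam, fn)
```

Here `u_T` reproduces `d` (the periodicity condition (10.1.10)), and the iteration converges to the same value with contraction factor $e^{-\lambda_n T}$.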

10.2 Second Order Linear Evolution Equations

In this section we keep the notation and assumptions used in the previous section. Consider the Cauchy problem
\[ u''(t) + Qu(t) = f(t), \quad 0 < t < T, \tag{e} \]
\[ u(0) = u_0, \quad u'(0) = u_1. \tag{ic} \]

Theorem 10.4. Assume that conditions (a) and (b) are fulfilled. Then for all $u_0 \in D(Q)$ (i.e., $Qu_0 \in H$), $u_1 \in H_E$ and $f \in L^2(0,T; H_E)$ there exists a unique function $u \in C^1([0,T]; H_E) \cap H^2(0,T; H)$ which satisfies (ic) and (e) for a.a. $t \in (0,T)$, and $Qu \in C([0,T]; H)$. If, in addition, $f \in C([0,T]; H)$ then $u'' \in C([0,T]; H)$. Alternatively, if $u_0 \in D(Q)$, $u_1 \in H_E$ and $f \in H^1(0,T; H)$ then $u \in C^1([0,T]; H_E) \cap C^2([0,T]; H)$ (hence $Qu \in C([0,T]; H)$). In both cases the solution $u$ is given by a Fourier series expansion of the form (10.1.1).

Proof. Let us first prove uniqueness. Let $y \in H^1(0,T; H)$ be the difference of two solutions of problem (e), (ic). Then, $y(0) = 0$, $y'(0) = 0$, and
\[ y''(t) + Qy(t) = 0 \quad \text{for a.a. } t \in (0,T). \]
We multiply this equation by $y'(t)$ to obtain
\[ (y''(t), y'(t)) + (Qy(t), y'(t)) = 0, \]
so, as $Q$ is self-adjoint, we can write
\[ \frac{d}{dt} \Big[ \|y'(t)\|^2 + (Qy(t), y(t)) \Big] = 0 \]
for a.a. $t \in (0,T)$. This shows that $y$ is the null function (since $y(0) = 0$, $y'(0) = 0$ and $Q$ is strongly positive), so the solution is indeed unique (if it exists).

In order to prove existence, we seek a solution $u$ to problem (e), (ic) in the form (10.1.1). Requiring this series to formally satisfy (e) and (ic) we find
\[ u_n''(t) + \lambda_n u_n(t) = f_n(t) \quad \forall n \in \mathbb{N} \text{ and a.a. } t \in (0,T), \tag{10.2.11} \]
\[ u_n(0) = u_{0n}, \quad u_n'(0) = u_{1n} \quad \forall n \in \mathbb{N}, \tag{10.2.12} \]
where $f_n(t)$, $u_{0n}$ and $u_{1n}$ are the Fourier coefficients of $f(t)$, $u_0$ and $u_1$, respectively. For each $n \in \mathbb{N}$ problem (10.2.11) and (10.2.12) has the solution
\[ u_n(t) = u_{0n} \cos(\sqrt{\lambda_n}\,t) + \frac{u_{1n}}{\sqrt{\lambda_n}} \sin(\sqrt{\lambda_n}\,t) + \frac{1}{\sqrt{\lambda_n}} \int_0^t \sin\big(\sqrt{\lambda_n}\,(t-s)\big) f_n(s)\,ds, \tag{10.2.13} \]
for all $t \in [0,T]$. Therefore,
\[ u_n'(t) = -\sqrt{\lambda_n}\,u_{0n} \sin(\sqrt{\lambda_n}\,t) + u_{1n} \cos(\sqrt{\lambda_n}\,t) + \underbrace{\int_0^t \cos\big(\sqrt{\lambda_n}\,(t-s)\big) f_n(s)\,ds}_{= \int_0^t \cos(\sqrt{\lambda_n}\,s) f_n(t-s)\,ds}, \tag{10.2.14} \]
and
\[ u_n''(t) = -\lambda_n u_{0n} \cos(\sqrt{\lambda_n}\,t) - \sqrt{\lambda_n}\,u_{1n} \sin(\sqrt{\lambda_n}\,t) + f_n(t) - \sqrt{\lambda_n} \int_0^t \sin\big(\sqrt{\lambda_n}\,(t-s)\big) f_n(s)\,ds, \tag{10.2.15} \]
or, equivalently (when $f_n$ is differentiable),
\[ u_n''(t) = -\lambda_n u_{0n} \cos(\sqrt{\lambda_n}\,t) - \sqrt{\lambda_n}\,u_{1n} \sin(\sqrt{\lambda_n}\,t) + f_n(0) \cos(\sqrt{\lambda_n}\,t) + \underbrace{\int_0^t \cos(\sqrt{\lambda_n}\,s) f_n'(t-s)\,ds}_{= \int_0^t \cos(\sqrt{\lambda_n}\,(t-s)) f_n'(s)\,ds}. \tag{10.2.16} \]
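Formula (10.2.13) can be sanity-checked for one mode by verifying that it satisfies (10.2.11) and (10.2.12) numerically. The sketch below is an illustration under assumptions: the sample values $\lambda_n = 9$, $u_{0n} = 1$, $u_{1n} = -2$, the forcing coefficient $f_n(s) = e^{-s}$, Gauss–Legendre quadrature for the Duhamel integral, and a central second difference for $u_n''$; the name `wave_coefficient` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

_nodes, _weights = leggauss(40)   # Gauss-Legendre rule on [-1, 1]

def wave_coefficient(t, lam, u0n, u1n, fn):
    """u_n(t) from formula (10.2.13); the Duhamel integral is done by quadrature."""
    w = np.sqrt(lam)
    s = 0.5 * t * (_nodes + 1.0)          # quadrature nodes mapped to [0, t]
    duhamel = 0.5 * t * np.sum(_weights * np.sin(w * (t - s)) * fn(s))
    return u0n * np.cos(w * t) + u1n / w * np.sin(w * t) + duhamel / w

lam, u0n, u1n = 9.0, 1.0, -2.0            # assumed sample values
fn = lambda s: np.exp(-s)                 # assumed forcing coefficient

def residual(t, h=1e-3):
    """Central second difference for u_n'' + lam u_n - f_n at time t."""
    u_m, u_c, u_p = (wave_coefficient(t + k * h, lam, u0n, u1n, fn) for k in (-1, 0, 1))
    return (u_p - 2 * u_c + u_m) / h ** 2 + lam * u_c - fn(t)

worst = max(abs(residual(t)) for t in (0.5, 1.0, 2.0))
```

The residual `worst` is of the size of the finite-difference truncation error, and `wave_coefficient(0.0, ...)` returns $u_{0n}$, matching the first condition in (10.2.12).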

From (10.2.13)–(10.2.16) we deduce (where $C_1$, $C_2$, $C_3$, $C_4$ are constants)
\[ u_n(t)^2 \le C_1 \Big( u_{0n}^2 + \frac{1}{\lambda_n} u_{1n}^2 + \frac{1}{\lambda_n} \int_0^T f_n(s)^2\,ds \Big), \tag{10.2.17} \]
\[ u_n'(t)^2 \le C_2 \Big( \lambda_n u_{0n}^2 + u_{1n}^2 + \int_0^T f_n(s)^2\,ds \Big), \tag{10.2.18} \]
\[ u_n''(t)^2 \le C_3 \Big( \lambda_n^2 u_{0n}^2 + \lambda_n u_{1n}^2 + f_n(t)^2 + \lambda_n \int_0^T f_n(s)^2\,ds \Big), \tag{10.2.19} \]
and
\[ u_n''(t)^2 \le C_4 \Big( \lambda_n^2 u_{0n}^2 + \lambda_n u_{1n}^2 + f_n(0)^2 + \int_0^T f_n'(s)^2\,ds \Big). \tag{10.2.20} \]
Assume $u_0 \in D(Q)$, $u_1 \in H_E$ and $f \in L^2(0,T; H_E)$. Then
\[ \sum_{n=1}^\infty \lambda_n^2 u_{0n}^2 < \infty, \qquad \sum_{n=1}^\infty \lambda_n u_{1n}^2 < \infty, \qquad \sum_{n=1}^\infty \lambda_n \int_0^T f_n(t)^2\,dt < \infty. \tag{10.2.21} \]
It follows from (10.2.17)–(10.2.19) and (10.2.21) that the series (10.1.1) is convergent in different spaces and its sum $u$ satisfies
\[ u \in C^1([0,T]; H_E) \cap H^2(0,T; H), \qquad Qu \in C([0,T]; H). \]
If, in addition, $f \in C([0,T]; H)$ then, according to (10.2.20), $u'' \in C([0,T]; H)$. If $u_0 \in D(Q)$, $u_1 \in H_E$ and $f \in H^1(0,T; H)$ then, according to (10.2.17), (10.2.18), (10.2.20), and (10.2.21), $u \in C^1([0,T]; H_E) \cap C^2([0,T]; H)$ (hence $Qu \in C([0,T]; H)$). Finally, it is easily seen (as in the proof of Theorem 10.1) that in both cases $u$, expressed as the sum of the series (10.1.1), satisfies (e), (ic).

Remark 10.5. Obviously, further regularity results can be stated under different conditions on $u_0$, $u_1$ and $f$. On the other hand, using the semigroup approach, one can derive the existence of a solution to problem (e), (ic) which comes from the mild solution for the Cauchy problem associated with a first order differential equation in the product space $X = V \times H$ (with $V = H_E$) equipped with the scalar product
\[ \langle [v_1,h_1],[v_2,h_2] \rangle_X = (v_1,v_2)_E + (h_1,h_2) \quad \forall [v_1,h_1],[v_2,h_2] \in X. \]
Obviously, $X$ is a real Hilbert space. Define $A : D(A) \subset X \to X$ by
\[ D(A) = D(Q) \times H_E, \qquad A[v,h] = [h, -Qv] \quad \forall [v,h] \in D(A). \]

It is easily seen that $A$ is linear, densely defined, closed, and dissipative. In fact, for all $[v,h] \in D(A)$, we have
\[ \langle A[v,h],[v,h] \rangle_X = \langle [h,-Qv],[v,h] \rangle_X = (h, Qv) - (Qv, h) = 0. \]
Thus, according to Remark 9.26, $A$ is a dissipative operator. We also have $A^* = -A$, so $A^*$ is also dissipative. By Theorem 9.29 it follows that $A$ is m-dissipative, so (according to the Lumer–Phillips Theorem) it generates a $C_0$-semigroup of contractions, say $\{S(t) : X \to X;\ t \ge 0\}$. Problem (e), (ic) can be expressed as the following Cauchy problem in $X$:
\[ \frac{d}{dt}[u(t), w(t)] = A[u(t), w(t)] + [0, f(t)], \quad 0 < t < T; \qquad [u,w](0) = [u_0, u_1]. \tag{10.2.22} \]
According to Sect. 9.11, for $[u_0,u_1] \in X$ and $f \in L^1(0,T; H)$ this problem has a unique mild solution $[u,w] \in C([0,T]; X)$,
\[ [u(t), w(t)] = S(t)[u_0,u_1] + \int_0^t S(t-s)[0, f(s)]\,ds, \quad t \in [0,T]. \tag{10.2.23} \]
The first component $u = u(t)$ can be called a mild solution of problem (e), (ic). In fact, $w(t) = u'(t)$. In order to show this, we approximate $[u_0,u_1] \in X$ by $[u_0^k, u_1^k] \in D(Q) \times H_E$, and $f \in L^1(0,T; H)$ by $f^k \in H^1(0,T; H)$. Denote by $[u^k, w^k] = [u^k, (u^k)']$ the solution of problem (10.2.22) with $[u_0,u_1] := [u_0^k, u_1^k]$ and $f := f^k$, which is a strong solution belonging to $C^1([0,T]; H_E) \cap C^2([0,T]; H)$ (cf. Theorem 10.4). Obviously,
\[ [u^k(t), (u^k)'(t)] = S(t)[u_0^k, u_1^k] + \int_0^t S(t-s)[0, f^k(s)]\,ds, \quad t \in [0,T]. \tag{10.2.24} \]
As $\{S(t) : X \to X;\ t \ge 0\}$ is a semigroup of contractions, we have for all $t \in [0,T]$
\[ \big\| [u^k(t) - u^m(t),\ (u^k)'(t) - (u^m)'(t)] \big\|_X \le \big\| [u_0^k - u_0^m,\ u_1^k - u_1^m] \big\|_X + \int_0^T \|f^k(s) - f^m(s)\|\,ds, \]
hence $u^k$ converges in $C([0,T]; H_E)$ to some $u \in C([0,T]; H_E)$, and $(u^k)'$ converges in $C([0,T]; H)$ to $w = u' \in C([0,T]; H)$. Passing to the limit in (10.2.24) we reobtain (10.2.23) with $w = u'$. So the mild solution $u$ belongs to $C([0,T]; H_E) \cap C^1([0,T]; H)$. Since $u$ is a limit of strong solutions $u^k$ that admit Fourier series expansions (as stated in Theorem 10.4), we can easily show that $u$ is the sum of the Fourier series (10.1.1), where $u_n(t) = (u(t), e_n)$ for $n = 1, 2, \dots$
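The relations $\langle A[v,h],[v,h] \rangle_X = 0$ and $A^* = -A$ used in Remark 10.5 have a simple finite-dimensional analogue that can be checked directly. The following sketch is only an analogy under assumptions: $Q$ is replaced by a randomly generated symmetric positive definite matrix, the energy scalar product by the Gram matrix $M = \operatorname{diag}(Q, I)$, and the adjoint with respect to $\langle x, y \rangle_X = x^T M y$ is computed as $M^{-1} A^T M$.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4
B = rng.standard_normal((m, m))
Q = B @ B.T + m * np.eye(m)          # symmetric positive definite stand-in for Q

I, Z = np.eye(m), np.zeros((m, m))
A = np.block([[Z, I], [-Q, Z]])      # matrix of A[v, h] = [h, -Qv]
M = np.block([[Q, Z], [Z, I]])       # Gram matrix of (v1, v2)_E + (h1, h2)

# Adjoint of A with respect to the energy inner product <x, y>_X = x^T M y.
A_star = np.linalg.solve(M, A.T @ M)

z = rng.standard_normal(2 * m)       # arbitrary element [v, h]
quadratic_form = z @ M @ (A @ z)     # <A z, z>_X
```

In this analogue `A_star` equals `-A` and `quadratic_form` vanishes up to rounding, mirroring the skew-adjointness that makes both $A$ and $A^*$ dissipative.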

10.3 Examples

Let $\emptyset \ne \Omega \subset \mathbb{R}^N$, $N \ge 2$, be a bounded domain with smooth boundary $\partial\Omega$. Consider the following problem (associated with the heat equation)
\[
\begin{cases}
u_t - \Delta u = f(t,x), & t \ge 0,\ x \in \Omega,\\
u(t,x) = 0, & t \ge 0,\ x \in \partial\Omega,\\
u(0,x) = u_0(x), & x \in \Omega.
\end{cases}
\tag{10.3.25}
\]
This problem can be solved by the Fourier method using the results presented in Chap. 8 and in Sect. 10.1 above. Thus, the Fourier method provides an approach for solving the above initial-boundary value problem which is complementary to the semigroup approach. Specifically, consider $H = L^2(\Omega)$ equipped with the usual scalar product and Hilbertian norm, $Q = -\Delta$ with $D(Q) = H_0^1(\Omega) \cap H^2(\Omega)$, and $H_E = H_0^1(\Omega)$ (the corresponding energetic space) with
\[ (p,q)_E = \int_\Omega \nabla p \cdot \nabla q\,dx, \qquad \|p\|_E^2 = (p,p)_E. \]
By Theorem 8.16 there exist an increasing sequence $(\lambda_n)_{n \ge 1}$ in $(0,\infty)$ converging to $\infty$ and an orthonormal basis $\{e_n\}_{n=1}^\infty$ in $H = L^2(\Omega)$ such that
\[ -\Delta e_n = \lambda_n e_n \quad \text{in } \Omega,\ \forall n \ge 1. \]
Thus, Theorem 10.1 is applicable to problem (10.3.25), which is of the form (E), (IC) with the above choices. In particular, under suitable conditions, the solution of (10.3.25) is given by
\[ u(t,x) = \sum_{n=1}^\infty u_n(t) e_n(x), \tag{10.3.26} \]

10.4 Exercises

309

where the un ’s are solutions of un (t) + λn un (t) = fn (t), un (0) = u0n , with

t ≥ 0, n = 1, 2, . . .

fn (t) =

f (t, ξ)en (ξ) dξ, u0n = Ω

u0 (ξ)en (ξ) dξ, n = 1, 2 . . . Ω

Theorem 10.3 is also applicable to problem (10.3.25). Theorem 10.4 can be illustrated with the following problem (associated with the wave equation):

$$\begin{cases} u_{tt} - \Delta u = f(t,x), & t \ge 0,\ x \in \Omega, \\ u(t,x) = 0, & t \ge 0,\ x \in \partial\Omega, \\ u(0,x) = u_0(x), & x \in \Omega. \end{cases} \tag{10.3.27}$$

The cases of the boundary conditions of Neumann or Robin type can also be analyzed along the same lines.
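As an illustration of the expansion (10.3.26), the following Python sketch evaluates a partial Fourier sum in the simplest possible setting. It is only a sketch under assumptions that are not part of the text: the domain is the interval $(0,1)$ (rather than $\Omega \subset \mathbb{R}^N$, $N \ge 2$), the source term is $f = 0$, and the eigenpairs are the standard Dirichlet ones, $\lambda_n = (n\pi)^2$, $e_n(x) = \sqrt{2}\sin(n\pi x)$. The function name and the parameters `n_modes` and `n_quad` are hypothetical.

```python
import math


def heat_fourier_sum(u0, t, x, n_modes=20, n_quad=400):
    """Partial Fourier sum for u_t - u_xx = 0 on (0,1) with
    u(t,0) = u(t,1) = 0 and u(0,.) = u0 (illustrative 1D setting)."""
    h = 1.0 / n_quad
    nodes = [(i + 0.5) * h for i in range(n_quad)]  # midpoint quadrature nodes
    total = 0.0
    for n in range(1, n_modes + 1):
        lam_n = (n * math.pi) ** 2  # eigenvalue of -d^2/dx^2 (Dirichlet)

        def e_n(y, n=n):
            return math.sqrt(2.0) * math.sin(n * math.pi * y)

        # u0n = integral of u0 * e_n over (0,1), by the midpoint rule
        u0n = h * sum(u0(y) * e_n(y) for y in nodes)
        # with f = 0, the ODE u_n' + lam_n u_n = 0 gives u_n(t) = exp(-lam_n t) u0n
        total += math.exp(-lam_n * t) * u0n * e_n(x)
    return total


value = heat_fourier_sum(lambda y: math.sin(math.pi * y), 0.1, 0.5)
print(value, math.exp(-math.pi ** 2 * 0.1))
```

For $u_0(x) = \sin(\pi x)$ only the first mode survives, so the partial sum should agree with $e^{-\pi^2 t}\sin(\pi x)$ up to rounding.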

10.4

Exercises

1. Consider the following initial-boundary value problem:

$$\begin{cases} u_t - u_{xx} = f(t,x), & t \in (0,T),\ x \in (0,1), \\ u(t,0) = 0,\ u_x(t,1) = 0, & t \in [0,T], \\ u(0,x) = u_0(x), & x \in (0,1). \end{cases}$$

Denote $H = L^2(0,1)$. Assume that $H$ is equipped with the usual scalar product $(\cdot,\cdot)$ and the induced norm $\|\cdot\|$ (hence $H$ is a real Hilbert space which is infinite dimensional and separable). Define $Q : D(Q) \subset H \to H$ by

$$D(Q) = \{v \in H^2(0,1);\ v(0) = 0,\ v'(1) = 0\}, \qquad Qv = -v'' \quad \forall v \in D(Q).$$

The above problem can be expressed as a Cauchy problem in $H$:

$$\begin{cases} u'(t) + Qu(t) = f(t), & 0 < t < T, \\ u(0) = u_0, \end{cases} \tag{CP}$$

where $u(t) := u(t,\cdot) \in H$.


(i) Show that $Q$ satisfies condition (a) of Theorem 10.1 (i.e., $Q$ is densely defined, self-adjoint, and strongly positive);

(ii) Find all the eigenpairs of $Q$ and construct a corresponding orthonormal basis $\{e_n\}_{n=1}^\infty$ of $H$;

(iii) Determine the energetic space $H_E$, show that $H_E$ is compactly embedded in $H$, and determine an orthonormal basis of $H_E$;

(iv) Find the explicit Fourier series solution $u(t,x) = \sum_{n=1}^\infty u_n(t)e_n(x)$ for $u_0(x) = x(1-x)$, $f(t,x) = (t+1)x$.

2. Consider a homogeneous thin metal rod occupying an interval $[0,l]$, $l > 0$. The temperature at time $t = 0$ of the rod is constant: $u = u_0$ for $x \in [0,l]$. The temperatures at the ends of the rod are kept constant in time: $u(t,0) = u_1$, $u(t,l) = u_2$, $t \in [0,T]$, where $T > 0$ is a given time instant. Find the temperature distribution $u = u(t,x)$ on the rod, if there is no external heat source distributed along the rod.

3. Consider the following initial-boundary value problem:

$$\begin{cases} u_t - u_{xx} = f(t,x), & t \in (0,T),\ x \in (0,1), \\ -u_x(t,0) + \alpha u(t,0) = 0,\ u_x(t,1) = 0, & t \in [0,T], \\ u(0,x) = u_0(x), & x \in (0,1), \end{cases}$$

where $\alpha$ is a given positive number. Denote as before $H = L^2(0,1)$ and define $Q : D(Q) \subset H \to H$ by

$$D(Q) = \{v \in H^2(0,1);\ -v'(0) + \alpha v(0) = 0,\ v'(1) = 0\}, \qquad Qv = -v'' \quad \forall v \in D(Q).$$

Thus, the above problem can be expressed as a Cauchy problem in $H$:

$$\begin{cases} u'(t) + Qu(t) = f(t), & 0 < t < T, \\ u(0) = u_0, \end{cases} \tag{CP}$$

where $u(t) := u(t,\cdot) \in H$.


Show that $Q$ satisfies the conditions (a) and (b) of Theorem 10.1 (thus ensuring existence, uniqueness, and regularity of solutions to the given problem).

4. Repeat Exercise 10.1 above, replacing the boundary conditions by the following (Neumann) boundary conditions: $u_x(t,0) = 0$, $u_x(t,1) = 0$, $t \in [0,T]$.

5. Let $(H, (\cdot,\cdot), \|\cdot\|)$ be a real Hilbert space and let $A : D(A) \subset H \to H$ be a linear and positive operator, i.e., $(Ap,p) \ge 0$ $\forall p \in D(A)$. Assume that $Q = A + \alpha I$ satisfies both conditions (a) and (b) of Theorem 10.1, where $\alpha$ is a positive constant and $I$ is the identity operator on $H$.

(a) Solve the following Cauchy problem:

$$\begin{cases} u'(t) + Au(t) = f(t), & 0 < t < T, \\ u(0) = u_0, \end{cases} \tag{CP}$$

for some given $u_0 \in H$ and $f \in L^2(0,T;H)$.

(b) Show that, given $T$ and $f$, if $\alpha$ is small enough, then there exists $u_0 \in H$ such that $u(T)$ is close to $u_0$, i.e., $\|u(T) - u_0\|$ is small, where $u$ is the solution of (CP) corresponding to $u_0$ and $f$.

6. Let $\Omega = (0,a) \times (0,b) \subset \mathbb{R}^2$, $a, b \in (0,\infty)$. Consider the initial-boundary value problem

$$\begin{cases} u_t - \Delta u = f(t,x), & (t,x) \in (0,T) \times \Omega, \\ u(t,x) = 0, & (t,x) \in [0,T] \times \partial\Omega, \\ u(0,x) = u_0(x), & x \in \Omega. \end{cases}$$

Find the general Fourier series expansion of the solution $u = u(t,x)$ of the above problem for $u_0 \in H = L^2(\Omega)$ and $f \in L^2((0,T) \times \Omega)$, and determine an explicit expansion for $u_0(x) = c$ and $f(t,x) = t x_1 x_2$, where $c$ is a real constant.

7. Repeat the previous exercise with Neumann conditions on $\partial\Omega$ (instead of the preceding Dirichlet boundary conditions). Consider also combinations of Dirichlet and Neumann conditions on different sides of the rectangle $\Omega$.


8. Solve the following initial-boundary value problem:

$$\begin{cases} u_t - u_{xx} = \alpha\delta(x-1) + \beta\delta(x-2), & (t,x) \in (0,\infty) \times (0,3), \\ u(t,0) = 0,\ u(t,3) = 0, & t \ge 0, \\ u(0,x) = 0, & x \in [0,3], \end{cases}$$

where $\alpha, \beta$ are real constants, and $\delta(x-1)$, $\delta(x-2)$ are the usual Dirac distributions in $\mathcal{D}'(0,3)$, also denoted $\delta_1$, $\delta_2$.

9. Consider an elastic string of length $l > 0$, held fixed at both ends $x = 0$ and $x = l$. Find the displacement $u = u(t,x)$ in the string, which is set in motion from its straight equilibrium position, with the initial velocity $v_0$ defined by

$$v_0(x) = \begin{cases} Ax, & 0 \le x \le l/2, \\ A(l-x), & l/2 \le x \le l, \end{cases}$$

where $A$ is a positive constant.

10. Consider an elastic string of length $l > 0$, held fixed at the end $x = 0$, while the end $x = l$ is free. Find the displacement $u = u(t,x)$ in the string, if it is set in motion at $t = 0$ from the initial configuration described by a function $u_0(x)$, with zero initial velocity. Discuss the regularity of $u$ with respect to $u_0$.

11. Solve the initial-boundary value problem

$$\begin{cases} u_{tt} - u_{xx} + u = 0, & (t,x) \in (0,\infty) \times (0,1), \\ u_x(t,0) = 0,\ u_x(t,1) = 0, & t \ge 0, \\ u(0,x) = u_0(x),\ u_t(0,x) = 0, & x \in [0,1] \end{cases}$$

using the Fourier method.

12. Consider a guitar string of length $l > 0$, fixed at both ends $x = 0$ and $x = l$. Assume that the string is at rest at the time instant $t = 0$ and is set in motion by a force $f = c\,\delta(x - l/2)$ exerted on the midpoint of the string, where $c$ is a real constant and $\delta(x - l/2)$ is the Dirac distribution (also denoted $\delta_{l/2}$). Determine the displacement $u(t,x)$ of the string for $t > 0$ and $x \in [0,l]$ using the Fourier method.


13. Let $\Omega = (0,a) \times (0,b) \subset \mathbb{R}^2$, $a, b \in (0,\infty)$. Solve the initial-boundary value problem

$$\begin{cases} u_{tt} - \Delta u = f(t,x), & (t,x) \in (0,T) \times \Omega, \\ u(t,x) = 0, & (t,x) \in [0,T] \times \partial\Omega, \\ u(0,x) = u_0(x),\ u_t(0,x) = 0, & x \in \Omega, \end{cases}$$

where $u_0(x) = x_1(a - x_1)\sin\dfrac{3\pi x_2}{b}$

, f (t, x) = tex1 x2 .

Chapter 11

Integral Equations

This chapter is an introduction to the theory of linear Volterra and Fredholm equations. Some aspects related to certain nonlinear extensions are also addressed.

11.1

Volterra Equations

We begin with scalar, linear Volterra equations.¹ There are two kinds of such equations that are most relevant to applications, namely

$$f(t) = \int_a^t k(t,s)x(s)\,ds, \quad a \le t \le b, \tag{11.1.1}$$

and

$$x(t) = f(t) + \int_a^t k(t,s)x(s)\,ds, \quad a \le t \le b, \tag{11.1.2}$$

where $a, b \in \mathbb{R}$, $a < b$, $f \in C[a,b] := C([a,b];\mathbb{R})$, $k \in C(\Delta) := C(\Delta;\mathbb{R})$ (called the kernel), with $\Delta = \{(t,s) \in \mathbb{R}^2;\ a \le s \le t \le b\}$; and $x = x(t)$ denotes the unknown function, which is sought in the space $C[a,b]$. Equation (11.1.1) is known as the Volterra equation of the first kind, while (11.1.2) is known as the Volterra equation of the second kind. In the following we examine Eq. (11.1.2). We will show later that Eq. (11.1.1) reduces to (11.1.2) under suitable conditions.

¹ Vito Volterra, Italian mathematician and physicist, 1860–1940.

© Springer Nature Switzerland AG 2019 G. Moro¸sanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4 11


Theorem 11.1 (Existence and Uniqueness). Under the above conditions there exists a unique solution $x \in C[a,b]$ to Eq. (11.1.2).

We present below three different proofs.

Proof 1. Denote $K = \sup_{(t,s)\in\Delta}|k(t,s)|$, which is finite since $\Delta$ is a compact subset of $\mathbb{R}^2$. Assume in a first stage that

$$K(b-a) < 1. \tag{11.1.3}$$

Consider $X = C[a,b]$ equipped with the usual sup-norm, $\|g\| = \sup_{a\le t\le b}|g(t)|$, and the corresponding metric, $d(g_1,g_2) = \|g_1 - g_2\|$. Define $T : X \to X$ by

$$(Tg)(t) = f(t) + \int_a^t k(t,s)g(s)\,ds, \quad t \in [a,b],\ g \in X. \tag{11.1.4}$$

It is clear from (11.1.4) that $T$ maps $X$ into itself. We have

$$|(Tg_1)(t) - (Tg_2)(t)| = \Big|\int_a^t k(t,s)[g_1(s) - g_2(s)]\,ds\Big| \le \int_a^t |k(t,s)| \cdot |g_1(s) - g_2(s)|\,ds \le K(b-a)\|g_1 - g_2\|,$$

for all $g_1, g_2 \in X$, and all $t \in [a,b]$. Hence,

$$d(Tg_1, Tg_2) \le K(b-a)\,d(g_1,g_2),$$

i.e., $T$ is a contraction (cf. (11.1.3)). By the Banach Contraction Principle (see Chap. 2), $T$ has a unique fixed point $x \in X$, which is clearly the unique solution of Eq. (11.1.2).

If condition (11.1.3) is not fulfilled, then we consider a subdivision of $[a,b]$, say, $a = t_0 < t_1 < t_2 < \dots < t_{N-1} < t_N = b$, where $t_j = a + jh$ for $j = 1, 2, \dots, N$, $h = (b-a)/N$, with $N$ large enough such that $Kh < 1$. In particular, $K(t_1 - t_0) = Kh < 1$, so it follows from above that Eq. (11.1.2) has a unique solution $x_1 = x_1(t)$ on the interval $[t_0,t_1] = [a,t_1]$, i.e.,

$$x_1(t) = f(t) + \int_a^t k(t,s)x_1(s)\,ds, \quad t \in [a,t_1].$$


Now consider the equation

$$x(t) = \underbrace{f(t) + \int_a^{t_1} k(t,s)x_1(s)\,ds}_{=:f_1(t)\,\in\,C[t_1,t_2]} + \int_{t_1}^t k(t,s)x(s)\,ds, \quad t \in [t_1,t_2].$$

Since $K(t_2 - t_1) = Kh < 1$, it follows by the above argument that this equation has a unique solution $x_2 \in C[t_1,t_2]$, and obviously $x_2(t_1) = x_1(t_1)$. Similarly, there exists a unique function $x_3 \in C[t_2,t_3]$ satisfying for all $t \in [t_2,t_3]$ the equation

$$x_3(t) = f(t) + \int_a^{t_1} k(t,s)x_1(s)\,ds + \int_{t_1}^{t_2} k(t,s)x_2(s)\,ds + \int_{t_2}^t k(t,s)x_3(s)\,ds,$$

and $x_3(t_2) = x_2(t_2)$. Continuing this procedure we obtain a solution $x \in C[t_0,t_N] = C[a,b]$ of Eq. (11.1.2) defined by $x(t) = x_j(t)$ for $t \in [t_{j-1},t_j]$, $j = 1, 2, \dots, N$. The solution $x$ is obviously unique.

Proof 2. Again, consider the operator $T$ defined by (11.1.4), where $X$ is the same as above. It is easily seen that

$$|(Tg_1)(t) - (Tg_2)(t)| \le K\|g_1 - g_2\|(t-a) \quad \forall t \in [a,b],\ g_1, g_2 \in X.$$

Consequently for $T^2 = T \circ T$ we obtain the estimate

$$|(T^2g_1)(t) - (T^2g_2)(t)| \le \int_a^t |k(t,s)| \cdot |(Tg_1)(s) - (Tg_2)(s)|\,ds \le K^2\|g_1 - g_2\|\int_a^t (s-a)\,ds = \frac{K^2(t-a)^2}{2!}\|g_1 - g_2\|.$$

It can be shown by induction that

$$|(T^kg_1)(t) - (T^kg_2)(t)| \le \frac{K^k(t-a)^k}{k!}\|g_1 - g_2\| \le \frac{K^k(b-a)^k}{k!}\|g_1 - g_2\|,$$


for all $t \in [a,b]$, $g_1, g_2 \in X$, $k = 1, 2, \dots$. We now take the supremum to find that

$$d(T^kg_1, T^kg_2) \le \frac{K^k(b-a)^k}{k!}\|g_1 - g_2\| \quad \forall g_1, g_2 \in X,\ k = 1, 2, \dots \tag{11.1.5}$$

Since $K^k(b-a)^k/k! \to 0$ as $k \to \infty$, $T^k$ is a contraction for $k$ large enough (cf. (11.1.5)). According to Remark 2.38, $T$ has a unique fixed point $x \in X$, which is the unique solution of (11.1.2).
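To see how quickly the bound (11.1.5) takes effect, consider the purely illustrative values $K = 2$ and $b - a = 1$ (these numbers are an assumption made here, not part of the text). Then $K(b-a) = 2$, so condition (11.1.3) fails, but

$$\frac{K^k(b-a)^k}{k!} = 2,\ 2,\ \frac{4}{3},\ \frac{2}{3} \quad \text{for } k = 1, 2, 3, 4,$$

so $T^4$ is already a contraction with constant $2/3$, although the estimate does not show that $T$ itself is one.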

Proof 3. Let $T$ be the same operator as before, but consider another norm on $X = C[a,b]$, the Bielecki norm, which is defined by

$$\|g\|_B = \sup_{a\le t\le b} e^{-Lt}|g(t)|,$$

with $L$ a large positive constant such that $K/L < 1$. This is indeed a norm on $X$ which is equivalent to the usual sup-norm. Denote by $d_B$ the metric generated by $\|\cdot\|_B$. We have for all $t \in [a,b]$ and $g_1, g_2 \in X$

$$|(Tg_1)(t) - (Tg_2)(t)| \le \int_a^t |k(t,s)|\,e^{Ls}e^{-Ls}|g_1(s) - g_2(s)|\,ds \le K\|g_1 - g_2\|_B\int_a^t e^{Ls}\,ds = \frac{K\|g_1 - g_2\|_B}{L}\left(e^{Lt} - e^{La}\right),$$

so that

$$e^{-Lt}|(Tg_1)(t) - (Tg_2)(t)| \le \frac{K}{L}\|g_1 - g_2\|_B\left(1 - e^{-L(t-a)}\right) \le \frac{K}{L}\|g_1 - g_2\|_B.$$

Now take the supremum for $t \in [a,b]$ to find

$$d_B(Tg_1, Tg_2) \le \frac{K}{L}\,d_B(g_1,g_2) \quad \forall g_1, g_2 \in X.$$

As $K/L < 1$, $T$ is a contraction with respect to $d_B$, hence the conclusion of the theorem follows again by the Banach Contraction Principle.
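The successive approximations behind all three proofs are easy to try numerically. The following Python sketch iterates $g \mapsto Tg$, with $T$ as in (11.1.4), on a uniform grid, replacing the integral by the composite trapezoidal rule. The discretization, the function name `picard_volterra`, and the parameters `n_grid` and `n_iter` are assumptions made for illustration; they are not taken from the text.

```python
import math


def picard_volterra(f, k, a, b, n_grid=200, n_iter=30):
    """Successive approximations x_n = T x_{n-1}, x_0 = f, for
    x(t) = f(t) + int_a^t k(t,s) x(s) ds, on a uniform grid."""
    h = (b - a) / n_grid
    t = [a + i * h for i in range(n_grid + 1)]
    x = [f(ti) for ti in t]  # starting function x_0 = f
    for _ in range(n_iter):
        new_x = []
        for i, ti in enumerate(t):
            # composite trapezoidal rule for int_a^{t_i} k(t_i, s) x(s) ds
            integral = 0.0
            for j in range(i):
                integral += 0.5 * h * (k(ti, t[j]) * x[j] + k(ti, t[j + 1]) * x[j + 1])
            new_x.append(f(ti) + integral)
        x = new_x
    return t, x


# Illustrative data: k = 1, f = 1 on [0, 1]
t, x = picard_volterra(lambda s: 1.0, lambda u, s: 1.0, 0.0, 1.0)
print(x[-1], math.e)
```

With $k \equiv 1$ and $f \equiv 1$ on $[0,1]$, Eq. (11.1.2) is equivalent to $x' = x$, $x(0) = 1$, so the computed values should be close to $e^t$; the remaining discrepancy comes from the quadrature, not from the iteration.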


Resolvent Kernel

Assume that the conditions above on $f$ and $k$ are satisfied. For $n \in \mathbb{N}$, $t \in [a,b]$, define

$$x_n(t) = f(t) + \int_a^t k(t,s)x_{n-1}(s)\,ds, \qquad x_0(t) = f(t).$$

Clearly, $x_n \in X = C[a,b]$ for all $n$. In fact, the above sequence $(x_n)_{n\ge0}$ can be expressed as $x_n = Tx_{n-1}$, $n \in \mathbb{N}$; $x_0 = f$, where $T : X \to X$ is the operator defined by (11.1.4). So, $(x_n)$ is the sequence of successive approximations (associated with operator $T$) which was used in the proof of the Banach Contraction Principle (see Chap. 2). Here we consider a particular starting function, $x_0 = f$. From the proof of the Banach Contraction Principle we know that $(x_n)$ converges in $(C[a,b], \|\cdot\|_B)$ (see also Proof 3 above) to the unique fixed point of $T$, i.e., $(x_n)$ converges uniformly in $[a,b]$ to the unique solution $x$ of Eq. (11.1.2). On the other hand, we have for all $t \in [a,b]$

$$x_1(t) = f(t) + \int_a^t k(t,s)f(s)\,ds,$$

$$x_2(t) = f(t) + \int_a^t k(t,s)\Big[f(s) + \int_a^s k(s,\tau)f(\tau)\,d\tau\Big]ds = f(t) + \int_a^t k(t,s)f(s)\,ds + \int_a^t\!\!\int_a^s k(t,s)k(s,\tau)f(\tau)\,d\tau\,ds.$$

We can interchange the order of integration to find that the last integral is equal to

$$\int_a^t\Big[\int_\tau^t k(t,s)k(s,\tau)\,ds\Big]f(\tau)\,d\tau,$$

so by simply relabeling $\tau$ and $s$ we get

$$\int_a^t\underbrace{\Big[\int_s^t k(t,\tau)k(\tau,s)\,d\tau\Big]}_{=:k_2(t,s)}f(s)\,ds,$$


and have a new kernel, $k_2$. In general, if we denote for $n = 2, 3, \dots$

$$k_n(t,s) := \int_s^t k(t,\tau)k_{n-1}(\tau,s)\,d\tau, \qquad k_1(t,s) := k(t,s),$$

we have for $n = 1, 2, \dots$

$$x_n(t) = f(t) + \int_a^t\Big[\sum_{j=1}^n k_j(t,s)\Big]f(s)\,ds. \tag{11.1.6}$$

Since $k$ is continuous on the compact set $\Delta$, we have for all $(t,s) \in \Delta$,

$$|k_1(t,s)| \le K < \infty, \qquad |k_2(t,s)| \le K^2(t-s), \qquad |k_3(t,s)| \le K^3\int_s^t|\tau - s|\,d\tau = K^3\frac{(t-s)^2}{2!}, \qquad \dots,$$

$$|k_n(t,s)| \le K^n\frac{(t-s)^{n-1}}{(n-1)!} \le K^n\frac{(b-a)^{n-1}}{(n-1)!}.$$

By the Weierstrass M-test the series $\sum_{n=1}^\infty k_n(t,s)$ converges uniformly on $\Delta$ since

$$\sum_{n=1}^\infty\frac{K^n(b-a)^{n-1}}{(n-1)!} < \infty.$$

Denote

$$R(t,s) = \sum_{n=1}^\infty k_n(t,s),$$

which is in $C(\Delta)$. Letting $n \to \infty$ in (11.1.6) we deduce that

$$x(t) = f(t) + \int_a^t R(t,s)f(s)\,ds, \quad t \in [a,b]. \tag{11.1.7}$$
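A simple worked case may help here. Take $k(t,s) \equiv 1$ (an illustrative choice, not from the text). Then the iterated kernels are

$$k_1(t,s) = 1, \quad k_2(t,s) = \int_s^t d\tau = t - s, \quad k_3(t,s) = \int_s^t(\tau - s)\,d\tau = \frac{(t-s)^2}{2!}, \quad \dots, \quad k_n(t,s) = \frac{(t-s)^{n-1}}{(n-1)!},$$

so that

$$R(t,s) = \sum_{n=1}^\infty\frac{(t-s)^{n-1}}{(n-1)!} = e^{t-s}, \qquad x(t) = f(t) + \int_a^t e^{t-s}f(s)\,ds.$$

For $f \equiv 1$ and $a = 0$ this gives $x(t) = 1 + (e^t - 1) = e^t$, which indeed satisfies $x(t) = 1 + \int_0^t x(s)\,ds$.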


We call $R(t,s)$ the resolvent kernel. It depends on $k$ but is independent of $f$, so that once we find $R(t,s)$ we have the solution of (11.1.2) for any $f$ (cf. (11.1.7)). Notice that

$$\sum_{n=2}^{N+1}k_n(t,s) = \int_s^t k(t,\tau)\sum_{n=2}^{N+1}k_{n-1}(\tau,s)\,d\tau,$$

which implies

$$-k(t,s) + \sum_{n=1}^{N+1}k_n(t,s) = \int_s^t k(t,\tau)\sum_{n=1}^N k_n(\tau,s)\,d\tau.$$

Letting $N \to \infty$ we find that $R$ satisfies

$$R(t,s) = k(t,s) + \int_s^t k(t,\tau)R(\tau,s)\,d\tau \quad \forall (t,s) \in \Delta,$$

which is a Volterra equation similar to (11.1.2).

Now let us examine Eq. (11.1.1). Assume that

$$f \in C^1[a,b], \text{ and } k,\ \frac{\partial k}{\partial t} \in C(\Delta),\ k(t,t) \ne 0 \text{ for all } t \in [a,b]. \tag{H}$$

We also assume $f(a) = 0$, which is a necessary condition for Eq. (11.1.1) to have a solution. If Eq. (11.1.1) has a solution $x \in C[a,b]$, then differentiating (11.1.1) gives

$$f'(t) = \frac{d}{dt}\int_a^t k(t,s)x(s)\,ds = k(t,t)x(t) + \int_a^t k_t(t,s)x(s)\,ds, \quad t \in [a,b], \tag{11.1.8}$$

which is equivalent to the following integral equation of the second kind,

$$x(t) = \frac{f'(t)}{k(t,t)} + \int_a^t\Big[\frac{-k_t(t,s)}{k(t,t)}\Big]x(s)\,ds. \tag{11.1.9}$$

So $x$ is also a solution of Eq. (11.1.9). On the other hand, we know from the previous theorem that (11.1.9) has a unique solution $x \in C[a,b]$. This $x$ is also a solution of Eq. (11.1.1). This follows by integrating Eq. (11.1.8) over $[a,t]$ and using the condition $f(a) = 0$. Thus we have proved the following result.


Theorem 11.2. Under the conditions (H) above, plus $f(a) = 0$, Eq. (11.1.1) has a unique solution $x \in C[a,b]$.

We continue with the nonlinear Volterra equation

$$x(t) = f(t) + \int_a^t k(t,s,x(s))\,ds, \quad t \in [a,b], \tag{11.1.10}$$

and prove the following general result.

Theorem 11.3. Assume that $f \in C[a,b]$, $k \in C(D)$, where $D := \Delta \times \mathbb{R} = \{(t,s,v) \in \mathbb{R}^3;\ a \le s \le t \le b,\ v \in \mathbb{R}\}$, and there exists a $K > 0$ such that

$$|k(t,s,v) - k(t,s,w)| \le K|v - w| \quad \forall a \le s \le t \le b,\ v, w \in \mathbb{R}. \tag{11.1.11}$$

Then there exists a unique function $x \in C[a,b]$ which satisfies Eq. (11.1.10) in $[a,b]$.

Proof. Consider $X = C[a,b]$ equipped with the Bielecki norm and define $T : X \to X$ by

$$(Tg)(t) = f(t) + \int_a^t k(t,s,g(s))\,ds \quad \forall t \in [a,b],\ g \in X.$$

The conclusion follows by the Banach Contraction Principle similarly as in Proof 3 of Theorem 11.1.

Theorem 11.3 gives a global solution in the sense that the existence interval is the whole $[a,b]$. Obviously this is a generalization of Theorem 11.1. Indeed, to obtain Theorem 11.1 it is enough to assume that $k$ is linear in the third variable, i.e., $k(t,s,v) := k(t,s)v$, $a \le s \le t \le b$, $v \in \mathbb{R}$, with $k \in C(\Delta)$, so that the Lipschitz condition (11.1.11) is automatically satisfied. Now let us examine a case when the resulting solution is only a local one, i.e., its domain may not be the whole $[a,b]$.

Theorem 11.4. Assume that $f \in C[a,b]$, $k = k(t,s,v) \in C(D)$, where $D := \Delta \times [x_0 - c, x_0 + c] = \{(t,s,v) \in \mathbb{R}^3;\ a \le s \le t \le b,\ |v - x_0| \le c\}$, with $x_0 \in \mathbb{R}$ and $c \in (0,\infty)$. If in addition there exists a $K > 0$ such that

$$|k(t,s,v) - k(t,s,w)| \le K|v - w| \quad \forall (t,s,v), (t,s,w) \in D, \tag{11.1.12}$$


and for some $d \in [0,c)$

$$|f(t) - x_0| \le d \quad \forall t \in [a,b], \tag{11.1.13}$$

then there exists a unique function $x \in C[a,a+\delta]$ which satisfies Eq. (11.1.10) in $[a,a+\delta]$, where $\delta = \min\{b-a,\ (c-d)/M\}$, $M = \sup\{|k(t,s,v)|;\ (t,s,v) \in D\}$. ($M$ is assumed to be positive since the case $M = 0$ is trivial.)

Proof. Consider the space $C[a,a+\delta]$ with the usual sup-norm and the metric $d$ generated by it. Denote

$$Y = \{g \in C[a,a+\delta];\ |g(t) - x_0| \le c\ \ \forall t \in [a,a+\delta]\}.$$

Clearly $(Y,d)$ is a complete metric space (since $Y$ is a closed subset of $(C[a,a+\delta],d)$). As usual, define an operator $T$ by

$$(Tg)(t) = f(t) + \int_a^t k(t,s,g(s))\,ds, \quad t \in [a,a+\delta],\ g \in Y.$$

Let us show that $T$ takes $Y$ into itself. Indeed, for all $g \in Y$ and $t \in [a,a+\delta]$ we have (see (11.1.13))

$$|(Tg)(t) - x_0| \le |f(t) - x_0| + \int_a^t|k(t,s,g(s))|\,ds \le d + M(t-a) \le d + M\delta \le c,$$

which proves the assertion. By arguments similar to those used in Proof 2 of Theorem 11.1 we deduce that $T^k$ is a contraction on $(Y,d)$ for $k$ large enough. So $T$ has a unique fixed point $x \in Y$, which is the unique solution of Eq. (11.1.10) in $[a,a+\delta]$.

Another existence and uniqueness result is obtained if $k$ is defined on a different domain,

$$\tilde D = \{(t,s,v) \in \mathbb{R}^3;\ a \le s \le t \le b,\ |v - f(s)| \le c\}, \quad c \in (0,\infty),$$

which is a compact subset of $\mathbb{R}^3$. The following result makes that precise.


Theorem 11.5. Assume $f \in C[a,b]$ and $k = k(t,s,v) \in C(\tilde D)$, with $M = \sup_{\tilde D}|k| > 0$. If, in addition, there exists a $K > 0$ such that

$$|k(t,s,v) - k(t,s,w)| \le K|v - w| \quad \forall (t,s,v), (t,s,w) \in \tilde D, \tag{11.1.14}$$

then there exists a unique function $x \in C[a,a+\delta]$ which satisfies Eq. (11.1.10) in $[a,a+\delta]$, where $\delta = \min\{b-a,\ c/M\}$.

Proof. The proof is similar to that of Theorem 11.4 above. Here the domain of operator $T$ is conveniently chosen as

$$\tilde Y = \{g \in C[a,a+\delta];\ |g(t) - f(t)| \le c\ \ \forall t \in [a,a+\delta]\},$$

which is the closed ball in $(C[a,a+\delta],d)$ centered at $f$ (restricted to $[a,a+\delta]$) of radius $c$. Obviously, $T$ is well defined on $\tilde Y$ and takes $\tilde Y$ into itself. It is also easily seen that $T^k$ is a contraction for some sufficiently large $k \in \mathbb{N}$. This completes the proof (see Remark 2.38).

Comments

1. If in Theorem 11.4 we assume $d = 0$ (i.e., $f \equiv x_0$) and $k$ is independent of $t$, i.e., $k(t,s,v) = h(s,v)$, then we reobtain a well-known existence and uniqueness result for the Cauchy problem $x'(t) = h(t,x(t))$, $x(a) = x_0$. See the introductory part of Sect. 2.5. The same result can also be derived from Theorem 11.5.

2. If all the conditions of Theorem 11.4 are fulfilled, except for the Lipschitz condition (11.1.12), then local existence still holds, but without uniqueness. Indeed, $k = k(t,s,v)$ can be approximated uniformly on $D$ by a sequence of smooth functions (hence Lipschitzian, even in all variables), say $(k_n)_{n\in\mathbb{N}}$. To obtain such a sequence we can use, for instance, Friedrichs' mollification with $\varepsilon = 1/n$ (see Chap. 5). In fact, by a classical result, $k = k(t,s,v)$ can even be approximated by polynomials in $t, s, v$. According to Theorem 11.4, for each $n \in \mathbb{N}$ there exists a unique function $x_n$ which satisfies the equation

$$x_n(t) = f(t) + \int_a^t k_n(t,s,x_n(s))\,ds \quad \forall t \in [a,a+\hat\delta], \tag{11.1.15}$$


where $\hat\delta = \min\{b-a,\ (c-d)/\hat M\}$, with $\hat M$ being the least upper bound of $\{\sup_D|k_n|\}_{n\in\mathbb{N}}$, i.e., $\hat M = \sup_{(t,s,v)\in D,\ n\in\mathbb{N}}|k_n(t,s,v)|$ (which is finite since $k_n \to k$ uniformly in $D$). Of course, $\hat\delta$ is not greater than the $\delta$ given by Theorem 11.4. It is easily seen that $(x_n)$ satisfies the conditions of the Arzelà–Ascoli Criterion (see Chap. 2), so there exists a subsequence $(x_{n_j})_{j\in\mathbb{N}}$ which converges uniformly on $[a,a+\hat\delta]$ to a function $x \in C[a,a+\hat\delta]$. Letting $j \to \infty$ in (11.1.15) with $n := n_j$, we infer that this $x$ satisfies Eq. (11.1.10) in $[a,a+\hat\delta]$. Similar remarks are valid for Theorem 11.5.

3. Qualitative problems, such as continuability of local solutions, existence on the half-axis $[a,\infty)$, behavior of solutions at the end of their existence intervals, etc., are not addressed here. For details in this respect we refer the reader to [9], where Volterra equations in $L^2$-spaces and abstract Volterra equations are also addressed.

4. All the above remarks apply to linear and nonlinear Volterra equations in $\mathbb{R}^k$, $k \in \mathbb{N}$, $k \ge 2$, with obvious slight changes.
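The local character of Theorem 11.4 can be seen on a concrete equation. The Python sketch below applies successive approximations to $x(t) = 1 + \int_0^t x(s)^2\,ds$, i.e., $k(t,s,v) = v^2$, $f \equiv 1$, $a = 0$, whose exact solution $x(t) = 1/(1-t)$ blows up at $t = 1$. With $x_0 = 1$, $d = 0$ and the illustrative choice $c = 1$ we get $M = \sup_{|v-1|\le1}v^2 = 4$ and $\delta = \min\{b-a,\ 1/4\}$, so the iteration is run on $[0, 1/4]$ (assuming $b \ge 1/4$). The example, the trapezoidal discretization, and all names in the code are assumptions made for illustration.

```python
def picard_nonlinear(k, f, a, delta, n_grid=100, n_iter=25):
    """Successive approximations for x(t) = f(t) + int_a^t k(t, s, x(s)) ds
    on [a, a + delta], using the composite trapezoidal rule."""
    h = delta / n_grid
    t = [a + i * h for i in range(n_grid + 1)]
    x = [f(ti) for ti in t]  # start from g = f, which lies in the set Y
    for _ in range(n_iter):
        new_x = []
        for i, ti in enumerate(t):
            integral = 0.0
            for j in range(i):
                left = k(ti, t[j], x[j])
                right = k(ti, t[j + 1], x[j + 1])
                integral += 0.5 * h * (left + right)
            new_x.append(f(ti) + integral)
        x = new_x
    return t, x


# k(t, s, v) = v^2, f = 1, a = 0, delta = 1/4 (values chosen for illustration)
t, x = picard_nonlinear(lambda t, s, v: v * v, lambda t: 1.0, 0.0, 0.25)
print(x[-1], 1.0 / (1.0 - 0.25))
```

The computed values stay within $|x(t) - 1| \le 1$ on $[0, 1/4]$, as the invariance of $Y$ under $T$ predicts, and the value at $t = 1/4$ should be close to $4/3$.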

11.2

Fredholm Equations

In the following $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$. Consider in $\mathbb{K}$ the integral equation

$$x(t) = f(t) + \int_a^b k(t,s)x(s)\,ds, \quad t \in [a,b], \tag{11.2.16}$$

where $a, b \in \mathbb{R}$, $a < b$, $f \in C([a,b];\mathbb{K})$ and $k \in C([a,b] \times [a,b];\mathbb{K})$. Here we prefer $\mathbb{K}$ instead of $\mathbb{R}$ since some specific aspects are better described in this framework. Equation (11.2.16) is known as the Fredholm equation (it is sometimes called the Fredholm equation of the second kind). It involves a fixed interval of integration and is fundamentally different from Eq. (11.1.2) (the corresponding Volterra analogue). A first remark that confirms this assertion is that, while the corresponding Volterra equation (of the second kind) always has a (unique, continuous) solution in $[a,b]$, Eq. (11.2.16) may have no solution in some cases. For instance, assuming that there exists a solution $x \in C[0,1] := C([0,1];\mathbb{R})$ of the equation (see [9, p. 41])

$$x(t) = t + \int_0^1 k(t,s)x(s)\,ds, \quad t \in [0,1], \tag{11.2.17}$$


where

$$k(t,s) = \begin{cases}\pi^2s(1-t), & s \le t, \\ \pi^2t(1-s), & t \le s,\end{cases}$$

it follows by differentiating Eq. (11.2.17) twice that $x$ should satisfy the problem

$$x''(t) + \pi^2x(t) = 0,\ t \in [0,1], \qquad x(0) = 0,\ x(1) = 1.$$

On the other hand, this problem has no solution: every solution of the differential equation with $x(0) = 0$ has the form $x(t) = c\sin(\pi t)$, which vanishes at $t = 1$. Therefore Eq. (11.2.17) has no solution.

It is worth pointing out, however, that under the above assumptions, Eq. (11.2.16) has a unique solution in $C[a,b]$ whenever the sup-norm of $k$ is sufficiently small, more precisely if $(b-a)\sup_{[a,b]\times[a,b]}|k| < 1$. This result follows readily by the Banach Contraction Principle. In fact, the existence question can be discussed in the space $L^2(a,b;\mathbb{K})$, which is a larger framework. Specifically, let us assume $f \in L^2(a,b;\mathbb{K})$, $k \in L^2(Q;\mathbb{K})$, where $Q = (a,b) \times (a,b)$. The solution $x$ of Eq. (11.2.16) will be sought in $L^2(a,b;\mathbb{K})$, which is a Hilbert space with respect to the usual scalar product and norm,

$$\langle g_1, g_2\rangle_{L^2} = \int_a^b g_1(t) \cdot \overline{g_2(t)}\,dt, \qquad \|g\|_{L^2}^2 = \langle g, g\rangle_{L^2}.$$

Of course, if we find a solution $x \in L^2(a,b;\mathbb{K})$ of Eq. (11.2.16) with $f \in C([a,b];\mathbb{K})$, $k \in C([a,b] \times [a,b];\mathbb{K})$, then obviously $x \in C([a,b];\mathbb{K})$. We have the following result.

Theorem 11.6. If $f \in L^2(a,b;\mathbb{K})$, $-\infty < a < b < +\infty$, $k \in L^2(Q;\mathbb{K})$ and $\int_Q|k(t,s)|^2\,dt\,ds < 1$, where $Q = (a,b) \times (a,b)$, then there exists a unique function $x \in L^2(a,b;\mathbb{K})$ satisfying the equation

$$x(t) = f(t) + \int_a^b k(t,s)x(s)\,ds,$$

almost everywhere in $(a,b)$.

Proof. Let $T$ be the operator defined by

$$(Tg)(t) = f(t) + \int_a^b k(t,s)g(s)\,ds \quad \forall g \in L^2(a,b;\mathbb{K}) \text{ and for a.a. } t \in (a,b).$$


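It is easily seen that $T$ takes $L^2(a,b;\mathbb{K})$ into itself. Moreover, the Cauchy–Schwarz inequality gives $\|Tg_1 - Tg_2\|_{L^2} \le \|k\|_{L^2(Q;\mathbb{K})}\|g_1 - g_2\|_{L^2}$ with $\|k\|_{L^2(Q;\mathbb{K})} < 1$, and so $T$ is a contraction with respect to the metric generated by $\|\cdot\|_{L^2}$. Hence it has a unique fixed point $x \in L^2(a,b;\mathbb{K})$, which is the unique $L^2$-solution of the equation

$$x(t) = f(t) + \int_a^b k(t,s)x(s)\,ds.$$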
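The contraction argument above can be followed numerically. The sketch below runs the fixed-point iteration $x_{m+1} = Tx_m$ for the illustrative data $k(t,s) = ts/2$, $f(t) = t$ on $(0,1)$, for which $\|k\|_{L^2(Q)} = 1/6 < 1$ and the exact solution is $x(t) = 6t/5$. The data, the midpoint quadrature, and all names in the code are assumptions chosen for illustration, not taken from the text.

```python
def fredholm_iterate(k, f, a, b, n_nodes=100, n_iter=20):
    """Fixed-point iteration x <- f + int_a^b k(., s) x(s) ds,
    with the integral replaced by the midpoint rule."""
    h = (b - a) / n_nodes
    s = [a + (j + 0.5) * h for j in range(n_nodes)]  # midpoint nodes
    x = [f(sj) for sj in s]  # starting guess x_0 = f
    for _ in range(n_iter):
        x = [f(ti) + h * sum(k(ti, sj) * xj for sj, xj in zip(s, x)) for ti in s]
    return s, x


# k(t, s) = t*s/2 and f(t) = t on (0, 1) (illustrative data)
s, x = fredholm_iterate(lambda t, u: 0.5 * t * u, lambda t: t, 0.0, 1.0)
print(max(abs(xi - 1.2 * si) for si, xi in zip(s, x)))
```

The exact solution follows by hand: writing $x(t) = t(1 + c/2)$ with $c = \int_0^1 s\,x(s)\,ds$ gives $c = (1 + c/2)/3$, i.e., $c = 2/5$.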

Remark 11.7. Using a procedure similar to that used for the Volterra Eq. (11.1.2), we find that the solution given by Theorem 11.6 can be represented by the formula

$$x(t) = f(t) + \int_a^b R(t,s)f(s)\,ds \quad \text{for a.a. } t \in (a,b),$$

where the resolvent kernel $R$ is given by

$$R(t,s) = \sum_{i=1}^\infty k_i(t,s), \tag{11.2.18}$$

with

$$k_1(t,s) := k(t,s), \qquad k_m(t,s) = \int_a^b k(t,\tau)k_{m-1}(\tau,s)\,d\tau \quad \forall m \ge 2.$$

The series in (11.2.18) converges in $L^2(Q;\mathbb{K})$ and almost everywhere on $Q$. We encourage the reader to check the details.

Remark 11.8. Theorem 11.6 can be extended to the nonlinear Fredholm equation

$$x(t) = f(t) + \int_a^b k(t,s,x(s))\,ds, \quad t \in [a,b]. \tag{11.2.19}$$

Indeed, if $f \in L^2(a,b;\mathbb{K})$, $k : Q \times \mathbb{K} \to \mathbb{K}$ is Lebesgue measurable, $k(\cdot,\cdot,0) \in L^2(Q;\mathbb{K})$, and

$$|k(t,s,v) - k(t,s,w)| \le \alpha(t,s)|v - w| \quad \text{for all } v, w \in \mathbb{K} \text{ and a.a. } (t,s) \in Q,$$

for a given $\alpha \in L^2(Q)$ with $\|\alpha\|_{L^2(Q)} < 1$, then there exists a unique $x \in L^2(a,b;\mathbb{K})$ which satisfies Eq. (11.2.19) almost everywhere in $(a,b)$. As usual, the conclusion follows by the Banach Contraction Principle. Let us just notice that for every $g \in L^2(a,b;\mathbb{K})$ the function $(t,s) \mapsto k(t,s,g(s))$ belongs to $L^2(Q;\mathbb{K})$, since

$$|k(t,s,g(s))| \le |k(t,s,0)| + \alpha(t,s)|g(s)| \quad \text{for a.a. } (t,s) \in Q.$$


Remark 11.9. In the case of Fredholm equations, the concept of a local solution does not make sense since the integral term involves the values $x(t)$ for a.a. $t \in (a,b)$. This shows once more that the Fredholm equations are fundamentally different from the Volterra equations of the second kind. On the other hand, the reader may be wondering whether Eq. (11.2.16) still has solutions when the condition $\|k\|_{L^2(Q;\mathbb{K})} < 1$ is no longer satisfied. A complete answer is given by the Fredholm alternative (see Remark 7.11). In our specific case $H = L^2(a,b;\mathbb{K})$ and $A : H \to H$ is defined by

$$(Ag)(t) = \int_a^b k(t,s)g(s)\,ds \quad \forall g \in H \text{ and for a.a. } t \in (a,b). \tag{11.2.20}$$

Clearly, $A \in L(H)$. Moreover, we have the following lemma:

Lemma 11.10. If $k \in L^2(Q;\mathbb{K})$, then operator $A : H \to H$ defined by (11.2.20) is compact.

Proof. Assume first that $k \in C([a,b] \times [a,b];\mathbb{K})$. In order to show that $A$ is compact in this case, we shall make use of the Arzelà–Ascoli Criterion (see Chap. 2 and notice that the criterion is valid with $\mathbb{K}$ instead of $\mathbb{R}^k$). Let $B(0,r)$, $r \in (0,\infty)$, be a ball in $H$. Then the set $F = \{Ag;\ g \in B(0,r)\}$ is a bounded subset of $C([a,b];\mathbb{K})$:

$$|(Ag)(t)| \le \int_a^b|k(t,s)| \cdot |g(s)|\,ds \le \Big(\int_a^b|k(t,s)|^2\,ds\Big)^{1/2}\|g\|_{L^2} \le r(b-a)^{1/2}\sup_Q|k| < \infty,$$

for all $g \in B(0,r)$ and all $t \in [a,b]$. The set $F$ is also equicontinuous since $k$ is uniformly continuous on $[a,b] \times [a,b]$, so (by the Arzelà–Ascoli Criterion) $F$ is relatively compact in $C([a,b];\mathbb{K})$, hence also in $H = L^2(a,b;\mathbb{K})$. Therefore, $A$ is indeed a compact operator.

Now, assume $k \in L^2(Q;\mathbb{K})$. Then there is a sequence $(k_n)$ in $C([a,b] \times [a,b];\mathbb{K})$ such that $\|k_n - k\|_{L^2(Q;\mathbb{K})} \to 0$ as $n \to \infty$ (one can use, for instance, the density of $C_0^\infty(Q)$ in $L^2(Q)$, see Theorem 5.8). Let us associate with each $k_n$ the operator $A_n \in L(H)$ defined by

$$(A_ng)(t) = \int_a^b k_n(t,s)g(s)\,ds \quad \forall g \in H,\ t \in [a,b],$$


which is compact, by the above argument. A straightforward computation shows that $\|A_n - A\|_{L(H)} \le \|k_n - k\|_{L^2(Q;\mathbb{K})}$ for all $n$, hence $\|A_n - A\|_{L(H)} \to 0$ as $n \to \infty$. It follows by Theorem 4.11 that $A$ is compact.

Consider (in $\mathbb{K}$) the equation

$$x(t) = f(t) + \lambda\int_a^b k(t,s)x(s)\,ds, \quad t \in [a,b], \tag{11.2.21}$$

where $\lambda \in \mathbb{K}$, $f \in L^2(a,b;\mathbb{K})$, $k \in L^2(Q;\mathbb{K})$, $Q = (a,b) \times (a,b)$. According to Theorem 11.6, Eq. (11.2.21) has a unique solution in $L^2(a,b;\mathbb{K})$ provided that $|\lambda|$ is sufficiently small. More precisely, this happens if

$$|\lambda| \cdot \|k\|_{L^2(Q;\mathbb{K})} < 1. \tag{11.2.22}$$

We shall show in what follows that there are solutions for Eq. (11.2.21) even if $\lambda$ does not satisfy condition (11.2.22). Using the above notation we can write Eq. (11.2.21) as an abstract equation in $H = L^2(a,b;\mathbb{K})$, namely

$$x = f + \lambda Ax. \tag{11.2.23}$$

Note that $A^*$, the adjoint of $A$, is given by

$$(A^*h)(t) = \int_a^b\overline{k(s,t)} \cdot h(s)\,ds \quad \forall h \in H.$$

Note also that $(\lambda A)^* = \bar\lambda A^*$. According to Lemma 11.10 and Theorem 8.4, operator $A$ has a countable set of eigenvalues with $0$ being the only possible accumulation point; moreover, for any eigenvalue $\nu \ne 0$ of $A$, $\dim N(I - \lambda A) < \infty$, where $\lambda = 1/\nu$. Of course, similar assertions hold for $A^*$, in particular $\dim N(I - \bar\lambda A^*) < \infty$. In fact, we can prove that, if $\nu \ne 0$ is an eigenvalue of $A$, then

$$\dim N(I - \lambda A) = \dim N(I - \bar\lambda A^*), \quad \text{where } \lambda = 1/\nu. \tag{11.2.24}$$

First of all, note that $\bar\nu$ is an eigenvalue of $A^*$ (cf. Theorem 7.10), so $\dim N(I - \bar\lambda A^*) \ge 1$. Let $\{\varphi_1, \varphi_2, \dots, \varphi_m\}$ and $\{\psi_1, \psi_2, \dots, \psi_n\}$ be orthonormal bases in $N(I - \lambda A)$ and $N(I - \bar\lambda A^*)$, respectively. Assume


by way of contradiction that $m < n$. Let $B$ be the operator associated with the kernel

$$K(t,s) = k(t,s) - \sum_{j=1}^m\overline{\varphi_j(s)} \cdot \psi_j(t),$$

and let $\varphi, \psi \in H$ be solutions of the equations

$$\varphi(t) = \lambda(B\varphi)(t) = \lambda\int_a^b k(t,s)\varphi(s)\,ds - \lambda\sum_{j=1}^m\psi_j(t)\int_a^b\overline{\varphi_j(s)} \cdot \varphi(s)\,ds, \tag{11.2.25}$$

$$\psi(t) = \bar\lambda(B^*\psi)(t) = \bar\lambda\int_a^b\overline{k(s,t)}\,\psi(s)\,ds - \bar\lambda\sum_{j=1}^m\varphi_j(t)\int_a^b\overline{\psi_j(s)} \cdot \psi(s)\,ds. \tag{11.2.26}$$

Multiplying Eq. (11.2.25) by $\overline{\psi_k(t)}$ and then integrating over $[a,b]$ the resulting equation yields

$$(\varphi,\psi_k)_{L^2} = \int_a^b\underbrace{\Big[\lambda\int_a^b k(t,s) \cdot \overline{\psi_k(t)}\,dt\Big]}_{=\overline{\psi_k(s)}}\varphi(s)\,ds - \lambda(\varphi,\varphi_k)_{L^2} = (\varphi,\psi_k)_{L^2} - \lambda(\varphi,\varphi_k)_{L^2},$$

hence

$$(\varphi,\varphi_k)_{L^2} = 0, \quad k = 1, 2, \dots, m. \tag{11.2.27}$$

From (11.2.25) and (11.2.27) we deduce that $\varphi \in N(I - \lambda A)$. Thus $\varphi = \sum_{i=1}^m c_i\varphi_i$ with some $c_i \in \mathbb{K}$, $i = 1, 2, \dots, m$. This combined with (11.2.27) yields $\varphi = 0$, hence Eq. (11.2.25) has only the null solution. On the other hand, Eq. (11.2.26) is satisfied by $\psi_k$ for all $k \in \{m+1, \dots, n\}$. Indeed, since $(\psi_k,\psi_j)_{L^2} = 0$ for $j \in \{1, \dots, m\}$, $k \in \{m+1, \dots, n\}$, Eq. (11.2.26) with $\psi = \psi_k$, $k = m+1, \dots, n$, can be written as $\psi_k = \bar\lambda A^*\psi_k$, $k = m+1, \dots, n$. This means that $N(I - \bar\lambda B^*) = N(I - (\lambda B)^*) \ne \{0\}$, while $N(I - \lambda B) = \{0\}$, which contradicts Theorem 7.10. Therefore, $m \ge n$. The converse inequality


follows from the fact that $(\bar\lambda A^*)^* = \lambda A$, so the proof of (11.2.24) is complete. Notice that in the case of Eq. (11.2.21) above the Fredholm Alternative (see Remark 7.11) has the following specific form:

Theorem 11.11 (Fredholm Alternative). Assume $\lambda \in \mathbb{K}$, $f \in H = L^2(a,b;\mathbb{K})$, $k \in L^2(Q;\mathbb{K})$, where $Q = (a,b) \times (a,b)$, and let $A : H \to H$ be the operator defined by

$$(Ag)(t) = \int_a^b k(t,s)g(s)\,ds \quad \forall g \in H \text{ and for a.a. } t \in (a,b).$$

Then, one of the following holds:

• $N(I - \lambda A) = \{0\}$ (if and only if $N(I - \bar\lambda A^*) = \{0\}$) and in this case the equation

$$x(t) = f(t) + \lambda\int_a^b k(t,s)x(s)\,ds, \quad t \in [a,b] \tag{F}$$

has a unique solution for all $f \in H$,

• $\dim N(I - \lambda A) = \dim N(I - \bar\lambda A^*) = m$ with $1 \le m < \infty$ and in this case Eq. (F) is solvable if and only if

$$(f,\psi)_{L^2} = \int_a^b f(t) \cdot \overline{\psi(t)}\,dt = 0 \quad \forall\psi \in \ker(I - \bar\lambda A^*),$$

(equivalently, $(f,\psi_k)_{L^2} = 0$, $k \in \{1, 2, \dots, m\}$, where the $\psi_k$'s form an orthonormal basis in $N(I - \bar\lambda A^*)$).

Remark 11.12. Since the set $S = \{\lambda \in \mathbb{K};\ N(I - \lambda A) \ne \{0\}\}$ is countable it follows by Theorem 11.11 that there exist "many" $\lambda$'s which do not satisfy condition (11.2.22), but for which Eq. (F) has a (unique) solution for all $f \in H = L^2(a,b;\mathbb{K})$. Even for $\lambda \in S$ Eq. (F) is solvable if and only if $f \perp N(I - \bar\lambda A^*)$.

The Case of Hermitian Kernels: Schmidt's Formula

In addition to the conditions

$$f \in H = L^2(a,b;\mathbb{K}), \quad k \in L^2(Q;\mathbb{K}), \quad Q = (a,b) \times (a,b),$$


we have used before, let us assume that k is Hermitian, i.e., k(t, s) = k(s, t), for a.a. (t, s) ∈ Q . Then obviously A = A∗ . According to Proposition 8.5 every eigenvalue of A is real. Next, we try to use the Hilbert–Schmidt Theorem to investigate the Fredholm equation in its abstract form (11.2.23), i.e. x = f + λAx,

(11.2.23)

In fact, in the following, $A$ in (11.2.23) may be any linear, symmetric, compact operator from an infinite dimensional, separable Hilbert space $(H, (\cdot, \cdot), \|\cdot\|)$ into itself, and $f \in H$.

As a first step, let us assume that $N(A) = \{0\}$, i.e., zero is not an eigenvalue of $A$. Thus the Hilbert–Schmidt Theorem (Theorem 8.7) is applicable to $A$ (see also Lemma 11.10). Denote by $\lambda_1, \lambda_2, \dots, \lambda_n, \dots$ the eigenvalues of $A$ given by this theorem and by $u_1, u_2, \dots, u_n, \dots$ the corresponding eigenvectors, i.e., $A u_n = \lambda_n u_n$, $n = 1, 2, \dots$. According to the proof of the Hilbert–Schmidt Theorem, each eigenvalue is counted $k$ times, where $k$ is its multiplicity (the dimension of the corresponding eigenspace). The system $\{u_n\}_{n \ge 1}$ is an orthonormal basis in $H$. For $\lambda \in \mathbb{K} \setminus \{0\}$ we distinguish two cases:

(i) $N(I - \lambda A) = \{0\}$, i.e., $1/\lambda$ is not an eigenvalue of $A$;

(ii) $N(I - \lambda A) \neq \{0\}$, i.e., $1/\lambda$ is an eigenvalue of $A$.

Let us first discuss the case (i). By Remark 7.11, Eq. (11.2.23) has a unique solution $x$ for each $f \in H$. By formula (8.2.11) from the proof of Theorem 8.7 (the Hilbert–Schmidt Theorem) we have

\[ A x = \sum_{n=1}^{\infty} \lambda_n (x, u_n) u_n. \tag{11.2.28} \]

On the other hand, using Eq. (11.2.23) and the fact that $A$ is symmetric, we get $(x, u_n) = (f, u_n) + \lambda \lambda_n (x, u_n)$, $n = 1, 2, \dots$, hence

\[ (x, u_n) = \frac{1}{1 - \lambda \lambda_n} (f, u_n), \quad n = 1, 2, \dots \tag{11.2.29} \]
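The step leading to (11.2.29) uses only Eq. (11.2.23), the symmetry of $A$, and the fact that each $\lambda_n$ is real. Spelled out as a short sketch (nothing is assumed beyond the hypotheses of case (i)):

```latex
\begin{aligned}
(x, u_n) &= (f + \lambda A x,\, u_n)
          = (f, u_n) + \lambda (A x, u_n) \\
         &= (f, u_n) + \lambda (x, A u_n)
          = (f, u_n) + \lambda \lambda_n (x, u_n), \\
(1 - \lambda \lambda_n)(x, u_n) &= (f, u_n).
\end{aligned}
```

In case (i) the factor $1 - \lambda \lambda_n$ is nonzero for every $n$, because $1/\lambda$ is not an eigenvalue of $A$, so one may divide by it.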


Now, from (11.2.23), (11.2.28), and (11.2.29) we can derive the following formula for the solution $x$ of Eq. (11.2.23) (known as Schmidt's formula):

\[ x = f + \lambda \sum_{n=1}^{\infty} \frac{\lambda_n}{1 - \lambda \lambda_n} (f, u_n) u_n. \tag{11.2.30} \]
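Schmidt's formula can be checked numerically in a finite dimensional analogue. The following is only a sketch under stated assumptions: the space is $\mathbb{R}^2$ rather than an infinite dimensional $H$, the symmetric matrix `A`, the vector `f`, and the value `lam` are illustrative choices, and the helper names (`dot`, `matvec`, `schmidt_solution`) are hypothetical. The eigenpairs of `A` are entered by hand.

```python
import math

def dot(u, v):
    """Euclidean scalar product, standing in for (., .) on H."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Apply the matrix A to the vector x."""
    return [dot(row, x) for row in A]

def schmidt_solution(f, lam, eigenpairs):
    """Evaluate x = f + lam * sum_n lam_n / (1 - lam*lam_n) * (f, u_n) * u_n.

    eigenpairs is a list of (lam_n, u_n) with orthonormal u_n.
    Assumes 1/lam is not an eigenvalue, i.e. case (i).
    """
    x = list(f)
    for lam_n, u_n in eigenpairs:
        coeff = lam * lam_n / (1.0 - lam * lam_n) * dot(f, u_n)
        x = [xi + coeff * ui for xi, ui in zip(x, u_n)]
    return x

# Illustrative symmetric matrix with eigenvalues 3 and 1 (assumed data).
A = [[2.0, 1.0], [1.0, 2.0]]
r = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, [r, r]), (1.0, [r, -r])]

f = [1.0, 0.0]
lam = 0.5  # 1/lam = 2 is not an eigenvalue of A

x = schmidt_solution(f, lam, eigenpairs)

# Residual of x = f + lam * A x; it should vanish up to rounding.
Ax = matvec(A, x)
residual = max(abs(xi - fi - lam * axi) for xi, fi, axi in zip(x, f, Ax))
print(x, residual)  # x is approximately [0, -2]
```

The residual check is independent of the formula: it substitutes the computed `x` back into $x = f + \lambda A x$, so agreement confirms the coefficients $\lambda_n / (1 - \lambda \lambda_n)$ in this small case.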

Now, let us discuss the case (ii), i.e., when $1/\lambda$ is an eigenvalue of the operator $A$, say $1/\lambda = \lambda_k$ for some $k \in \mathbb{N}$. Obviously, formula (11.2.30) does not make sense in this case. Denote $H_0 := N(I - \lambda A) = N(\lambda_k I - A)$, $H_1 := H_0^{\perp}$, so that $H = H_0 \oplus H_1$. By Theorem 8.4, $H_0$ is finite dimensional. Denote $m := \dim H_0 \in \mathbb{N}$. Let $B_0 = \{v_1, v_2, \dots, v_m\}$ be a basis of $H_0$. As $H$ is a separable space, so is $H_1$. Taking into account the fact that $A$ is symmetric, it is easily seen that $A$ maps $H_1$ into itself. Clearly, the restriction $A_1 = A|_{H_1}$ is symmetric and $A_1 \in K(H_1)$, i.e., $A_1$ is compact in $H_1$, which is a Hilbert subspace of $H$ with the same $(\cdot, \cdot)$ and $\|\cdot\|$. Obviously, $N(A_1) = \{0\}$, so the Hilbert–Schmidt Theorem is applicable to $H_1$ and $A_1$ and shows the existence of a sequence of (real) eigenvalues of $A_1$ (hence of $A$), which does not include $\lambda_k$, and of a corresponding orthonormal basis in $H_1$, with $A_1 u_n = A u_n = \lambda_n u_n$, $n \in \mathbb{N}$, $\lambda_n \neq \lambda_k$. According to the previous analysis corresponding to the case (i), Eq. (11.2.23) has a (unique) solution $x = x_1$ in $H_1$ (i.e., $x_1 - \lambda A_1 x_1 = f$) if and only if $f \in H_1$, and (see (11.2.30))

\[ x_1 = f + \lambda \sum_{\lambda_n \neq \lambda_k} \frac{\lambda_n}{1 - \lambda \lambda_n} (f, u_n) u_n. \]

If we consider (11.2.23) in $H$, then for $f \in H_1$ and for all $y \in H_0$,

\[ x = f + \lambda \sum_{\lambda_n \neq \lambda_k} \frac{\lambda_n}{1 - \lambda \lambda_n} (f, u_n) u_n + y \]

is a solution of Eq. (11.2.23). Consequently, the formula

\[ x = f + \lambda \sum_{\lambda_n \neq \lambda_k} \frac{\lambda_n}{1 - \lambda \lambda_n} (f, u_n) u_n + \sum_{i=1}^{m} c_i v_i, \tag{11.2.31} \]


with $c_1, \dots, c_m \in \mathbb{K}$, gives all solutions of Eq. (11.2.23).

We now turn our attention to the case when $N(A) \neq \{0\}$. Denoting $Y_0 = N(A)$ and $Y_1 = Y_0^{\perp}$, we can write $H = Y_0 \oplus Y_1$. We can assume that $Y_0$ is a proper subspace of $H$, otherwise $A = 0$, which is a trivial case. It is easy to see that $A$ takes $Y_1$ to itself. Obviously, $Y_1$ is a Hilbert subspace of $H$ with respect to the same $(\cdot, \cdot)$ and $\|\cdot\|$, and the restriction $\tilde{A} = A|_{Y_1}$ is symmetric, compact, and $N(\tilde{A}) = \{0\}$. If $Y_1$ is infinite dimensional, then the Hilbert–Schmidt Theorem is applicable to $Y_1$ and $\tilde{A}$. In order to solve Eq. (11.2.23) we use the decompositions $x = x_0 + x_1$, $f = f_0 + f_1$, where $x_0, f_0 \in Y_0$ and $x_1, f_1 \in Y_1$. Thus (11.2.23) becomes $x_0 - f_0 = -x_1 + f_1 + \lambda A x_1$, hence both sides are equal to zero, so $x_0 = f_0$ and

\[ x_1 = f_1 + \lambda \tilde{A} x_1. \tag{11.2.32} \]

Clearly, for every $f \in H$, $f = f_0 + f_1$, $x$ is a solution of Eq. (11.2.23) if and only if $x = f_0 + x_1$, where $x_1 \in Y_1$ satisfies Eq. (11.2.32). It is worth pointing out that Eq. (11.2.32), with $\tilde{A} : Y_1 \to Y_1$, $N(\tilde{A}) = \{0\}$, is in the situation we had before, so one can similarly discuss the solvability of (11.2.32) in terms of the eigenvectors of $\tilde{A}$ (i.e., the eigenvectors of $A$ corresponding to nonzero eigenvalues). If it turns out that $Y_1$ is finite dimensional, then Eq. (11.2.32) reduces to a linear algebraic system which can be solved by using elementary algebraic computations.

Example. Let $H = L^2(-\pi, \pi)$ with the usual scalar product and norm. Consider the usual orthonormal basis in $H$, i.e. (see Chap. 6),

\[ u_0 = \frac{1}{\sqrt{2\pi}}, \quad u_{2k-1}(t) = \frac{1}{\sqrt{\pi}} \cos(kt), \quad u_{2k}(t) = \frac{1}{\sqrt{\pi}} \sin(kt), \quad k = 1, 2, \dots \]

For a given $m \in \mathbb{N}$, define

\[ k(t, s) = \sum_{n=m}^{\infty} \frac{1}{n^2} u_n(t) u_n(s), \quad (t, s) \in Q = (-\pi, \pi) \times (-\pi, \pi). \]


Clearly, $k \in C(\bar{Q}) \subset L^2(Q)$. If $A$ is the operator defined by (11.2.20), where $a = -\pi$, $b = \pi$, with this kernel (which is symmetric, hence Hermitian), then $Ag = 0$ for every $g$ which is a linear combination of $u_0, u_1, \dots, u_{m-1}$. Therefore

\[ \operatorname{Span}\{u_0, u_1, \dots, u_{m-1}\} \subset N(A). \]

On the other hand, if $Af = 0$, where $f$ is a member of $H$, i.e., $f = \sum_{k=0}^{\infty} (f, u_k)_{L^2} u_k$ (which is the Fourier expansion of $f$), then

\[ 0 = (Af, f)_{L^2} = \Big( \sum_{n=m}^{\infty} \frac{1}{n^2} (f, u_n)_{L^2} u_n,\ \sum_{k=0}^{\infty} (f, u_k)_{L^2} u_k \Big)_{L^2} = \sum_{n=m}^{\infty} \frac{1}{n^2} (f, u_n)_{L^2}^2, \]

hence $(f, u_n)_{L^2} = 0$ for all $n \ge m$ and so $f = \sum_{k=0}^{m-1} (f, u_k)_{L^2} u_k$, i.e., $f \in \operatorname{Span}\{u_0, u_1, \dots, u_{m-1}\}$. Therefore,

\[ N(A) = \operatorname{Span}\{u_0, u_1, \dots, u_{m-1}\}. \]

On the other hand, if we choose, for example,

\[ k(t, s) = 1 + \sum_{n=1}^{\infty} \frac{1}{n^2} u_n(t) u_n(s), \quad (t, s) \in Q, \]

then the corresponding operator $A$ satisfies the condition $N(A) = \{0\}$. The solvability of the Fredholm equation $x = f + \lambda A x$, with $A$ associated with the kernels $k$ defined above, is left to the reader.

Comments.

1. If in the equation

\[ x(t) = f(t) + \lambda \int_a^b k(t, s) x(s)\, ds, \quad t \in [a, b], \]

(which is (11.2.21) above) we assume $f \in C[a, b]$ and $k \in C([a, b] \times [a, b])$, then $x \in C[a, b]$. Moreover, if $f$ and $k$ are more regular, then so is $x$.


2. The above theory also works if $[a, b]$ is replaced by a bounded domain $D \subset \mathbb{R}^N$ or by the boundary of such a domain. It is well known that the main elliptic boundary value problems (Dirichlet, Neumann, Robin) can be reduced, by using potentials, to Fredholm equations that live on the boundary of the corresponding domains. Thus the above theory can be used to solve such problems.

3. The following nonlinear extension of the Fredholm equation, known as the Hammerstein equation,

\[ x(t) = f(t) + \int_D k(t, s)\, g(s, x(s))\, ds \quad \text{for a.a. } t \in D, \]

where g is a nonlinear function, is also heavily discussed in the literature (see [20], [9], [26]).

11.3 Exercises

1. Calculate the resolvent kernels of the following Volterra equations and then find the corresponding solutions:
(a) $x(t) = e^{t^2} + \int_0^t e^{t^2 - s^2} x(s)\, ds$, $t \ge 0$;
(b) $x(t) = e^t \sin t + \int_0^t \frac{2 + \cos t}{2 + \cos s}\, x(s)\, ds$, $t \ge 0$;
(c) $x(t) = t + \int_0^t (t - s) x(s)\, ds$, $t \ge 0$.

2. Solve the following integral equations by converting them into Cauchy problems for differential equations:
(a) $x(t) = t - \frac{t^3}{6} + \int_0^t (t - s + 1) x(s)\, ds$, $t \ge 0$;
(b) $x(t) = t^3 + 1 - \int_0^t (t - s) x(s)\, ds$, $t \ge 0$;
(c) $x(t) = 3t - \int_0^t e^{t - s} x(s)\, ds$, $t \ge 0$.

3. Solve the following Volterra equations of the first kind:
(a) $\int_0^t (1 - t^2 + s^2)\, x(s)\, ds = \frac{t^2}{2}$, $t \ge 0$;
(b) $\int_0^t \cos(t - s)\, x(s)\, ds = 2t(t + 1)$, $t \ge 0$;
(c) $\int_0^t e^{t + s}\, x(s)\, ds = t \cos t$, $t \ge 0$.


4. Let $h \in C[0, b]$, where $b \in (0, \infty)$. Define $k(t, s) = h(t - s)$, $0 \le s \le t \le b$. Show that the resolvent kernel $R(t, s)$ associated with $k(t, s)$ depends only on $t - s$.

5. Let $a, b \in \mathbb{R}$, $a < b$. Let $f, x \in C[a, b]$, $k \in C(\Delta)$ be nonnegative functions, where $\Delta = \{(t, s) \in \mathbb{R}^2;\ a \le s \le t \le b\}$. If

\[ x(t) \le f(t) + \int_a^t k(t, s) x(s)\, ds, \quad t \in [a, b], \]

then

\[ x(t) \le f(t) + \int_a^t R(t, s) f(s)\, ds, \quad t \in [a, b], \]

where $R(t, s)$ is the resolvent kernel associated with $k(t, s)$.

6. Consider in $C([0, \pi]; \mathbb{K})$ the equation

\[ x(t) = \lambda \int_0^{\pi} (\sin t \cdot \cos s)\, x(s)\, ds, \quad t \in [0, \pi]. \]

Show that for any $\lambda \in \mathbb{K}$ the equation has only the null solution.

7. Let $a, b \in (0, \infty)$. Define $D = \{(t, s);\ 0 \le t \le a,\ 0 \le s \le b\}$, $Q = \{(t, s, \xi, \eta);\ 0 \le \xi \le t \le a,\ 0 \le \eta \le s \le b\}$. Consider the integral equation

\[ x(t, s) = f(t, s) + \int_0^t \int_0^s k(t, s, \xi, \eta)\, d\xi\, d\eta, \quad (t, s) \in D. \tag{E} \]

Assume $k \in C(Q) := C(Q; \mathbb{R})$. Show that for each $f \in C(D) := C(D; \mathbb{R})$ there exists a unique function $x = x(t, s) \in C(D)$ satisfying Eq. (E) for all $(t, s) \in D$.

8. Consider the problem

\[ x'(t) = f(t) + \int_0^t k(t, s) x(s)\, ds, \quad t \in (0, T), \qquad x(0) = x_0, \]

where $x_0 \in \mathbb{R}$, $T \in (0, \infty)$, $f \in L^1(0, T)$, $k \in C(\Delta)$, and $\Delta = \{(t, s) \in \mathbb{R}^2;\ 0 \le s \le t \le T\}$. Show that there exists a unique function $x \in W^{1,1}(0, T)$ satisfying the above integro-differential equation for a.a. $t \in (0, T)$ and the initial condition $x(0) = x_0$.


9. Solve the following integral equations, where $\lambda$ is a real parameter:
(a) $x(t) = \cos t + \lambda \int_0^{\pi} \sin(t - s)\, x(s)\, ds$;
(b) $x(t) = t + \lambda \int_0^{2\pi} |\pi - s| \sin t \cdot x(s)\, ds$;
(c) $x(t) = f(t) + \lambda \int_0^1 (1 - 3ts)\, x(s)\, ds$, $f \in L^2(0, 1)$.

10. Consider, in $\mathbb{K}$, the following Fredholm equation with degenerate (separable) kernel:

\[ x(t) = f(t) + \lambda \int_a^b \underbrace{\Big( \sum_{i=1}^{n} a_i(t) b_i(s) \Big)}_{k(t, s)} x(s)\, ds, \tag{F} \]

where $\lambda \in \mathbb{K}$, $f, a_i, b_i \in L^2(a, b; \mathbb{K})$, $i = 1, 2, \dots, n$. One can assume without any loss of generality that the systems $\{a_1, \dots, a_n\}$, $\{b_1, \dots, b_n\}$ are linearly independent. Denoting

\[ c_i = \int_a^b b_i(s) x(s)\, ds, \quad i = 1, \dots, n, \tag{1} \]

we obtain from (F)

\[ x(t) = f(t) + \lambda \sum_{i=1}^{n} c_i a_i(t). \tag{2} \]

Plugging (2) into (1) we obtain the algebraic system

\[ c_i = f_i + \lambda \sum_{j=1}^{n} k_{ij} c_j, \quad i = 1, \dots, n, \tag{3} \]

where

\[ f_i = \int_a^b b_i(s) f(s)\, ds, \quad k_{ij} = \int_a^b b_i(s) a_j(s)\, ds, \quad i, j = 1, \dots, n. \]

Show that the Fredholm alternative for Eq. (F) can be expressed as an equivalent alternative for the algebraic system (3).


11. Let $(H, (\cdot, \cdot), \|\cdot\|)$ be a Hilbert space and let $\{e_1, \dots, e_m\} \subset H$ be an orthonormal system, where $m$ is a given natural number. Define $A : H \to H$ by

\[ A x = \sum_{k=1}^{m} k\, (x, e_k) e_k, \quad x \in H. \]

Solve the abstract Fredholm equation $x = f + \lambda A x$, where $f \in H$ and $\lambda \in \mathbb{K}$.

12. Consider the functions

\[ u_n(t) = \sqrt{2} \cos\big( (n + 1/2) \pi t \big), \quad t \in [0, 1], \ n = 0, 1, 2, \dots \]

It is well known that the system $\{u_n\}_{n=0}^{\infty}$ is an orthonormal basis in $H = L^2(0, 1)$ equipped with the usual scalar product and norm (see the solution to Exercise 8.11). Define the kernel $k(t, s)$ by

\[ k(t, s) = \sum_{n=m}^{\infty} \frac{1}{(n + 1)^2} u_n(t) u_n(s), \quad t, s \in [0, 1], \]

where $m \in \{0, 1, 2, \dots\}$, and the integral operator $A : H \to H$,

\[ (Ag)(t) = \int_0^1 k(t, s) g(s)\, ds, \quad g \in H. \]

Discuss the existence for the Fredholm equation $x = f + \lambda A x$, $f \in H$, $\lambda \in \mathbb{R}$, in two cases: $m = 0$ and $m \ge 1$.
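The reduction described in Exercise 10, from Eq. (F) to the algebraic system (3), can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a solution of any exercise above: the interval is taken to be $[0, 1]$, all functions are polynomials stored as coefficient lists, and the kernel $k(t, s) = 1 + ts$ with $f(t) = t$ and $\lambda = 1$ is a hypothetical example. The helper names are likewise illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate01(p):
    """Exact integral over [0, 1] of the polynomial with coefficients p."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def solve_linear(M, rhs):
    """Gauss-Jordan elimination with exact fractions.

    Raises StopIteration if no nonzero pivot exists, i.e. if the matrix
    is singular (the case where 1/lam is an eigenvalue).
    """
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def solve_degenerate(a, b, f, lam):
    """Solve x = f + lam * int_0^1 sum_i a_i(t) b_i(s) x(s) ds via system (3)."""
    n = len(a)
    K = [[integrate01(poly_mul(b[i], a[j])) for j in range(n)] for i in range(n)]
    F = [integrate01(poly_mul(b[i], f)) for i in range(n)]
    # System (3) rewritten as (I - lam*K) c = F.
    M = [[(1 if i == j else 0) - lam * K[i][j] for j in range(n)] for i in range(n)]
    c = solve_linear(M, F)
    # Formula (2): x = f + lam * sum_i c_i a_i, as polynomial coefficients.
    deg = max(len(f), *(len(ai) for ai in a))
    x = [Fraction(0)] * deg
    for k, coef in enumerate(f):
        x[k] += coef
    for ci, ai in zip(c, a):
        for k, coef in enumerate(ai):
            x[k] += lam * ci * coef
    return c, x

# Hypothetical data: k(t, s) = 1*1 + t*s, f(t) = t, lam = 1.
a = [[1], [0, 1]]  # a_1(t) = 1, a_2(t) = t
b = [[1], [0, 1]]  # b_1(s) = 1, b_2(s) = s
f = [0, 1]         # f(t) = t
c, x = solve_degenerate(a, b, f, Fraction(1))
print(c, x)  # c = [-2, -1]; x has coefficients [-2, 0], i.e. x(t) = -2
```

For this data one can confirm the result by hand: with $x \equiv -2$, the integral term equals $-2 - t$, so $f(t) + \lambda \int_0^1 k(t, s) x(s)\, ds = t - 2 - t = -2$.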

Chapter 12

Answers to Exercises

This chapter provides solutions to almost all exercises proposed at the end of each chapter. The solutions are labeled with the same numbers used for the corresponding exercises. For easy exercises we shall provide hints or just their final solutions. Answers to very easy exercises are left to the reader.

12.1 Answers to Exercises for Chap. 1

1. Left to the reader.

2. Answers: $X = (C \setminus A) \cup B$; $X = C \cup (A \setminus B)$.

3. Left to the reader.

4. It is easily seen that the statements (a) and (c) are true, while statement (b) is not true in general, as shown by the following counterexample: $A = \{1, 2\}$, $B = \{3\}$, $C = \{3, 4\}$, $D = \{4, 5\}$.

5. Let $b$ be a minimal element of $A$. Since $a = \min A$, we have $a \le x$ for all $x \in A$. In particular, $a \le b \Rightarrow b = a$.

6. Observe that

\[ a_n = \Big( 1 - \frac{1}{2} \Big) + \Big( \frac{1}{2} - \frac{1}{3} \Big) + \cdots + \Big( \frac{1}{n} - \frac{1}{n+1} \Big) = 1 - \frac{1}{n+1}. \]

© Springer Nature Switzerland AG 2019 G. Moro¸sanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4 12


It follows that $\inf A = \frac{1}{2}$ and $\sup A = 1$.

7. Parts (a) and (b) are left to the reader; in order to solve (c), we suggest the partial order $z_1 = x_1 + y_1 i \preceq z_2 = x_2 + y_2 i \iff x_1 \le x_2$ and $y_1 \ge y_2$.

8. It is easily seen by induction that $(a_n)$ is increasing and $a_n < 2$ for all $n \ge 1$. According to the Monotone Convergence Theorem, $(a_n)$ is convergent and its limit $a \ge a_1 = \sqrt{2}$. Letting $n \to \infty$ in $a_n = \sqrt{2 + a_{n-1}}$ we get $a = \sqrt{2 + a}$, so $a = 2$.

9. Use Zorn's Lemma.

10. Left to the reader.

11. Left to the reader.

12. For part (a), systems (i), (iii), (v), and (vi) are linearly independent, while (ii) and (iv) are linearly dependent. Part (b) is left to the reader.

13. It is readily seen that $B$ is a basis of $X$, and the coordinates of $p = p(t)$ with respect to this basis are

\[ p(1), \quad \frac{p'(1)}{1!}, \quad \frac{p''(1)}{2!}, \quad \frac{p'''(1)}{3!}. \]

14. Left to the reader.

15. Left to the reader.

16. Left to the reader.

17. Left to the reader.

18. We have $F(x) = (Bx, x) = a\|x\|^2 + \|Ax\|^2$, where $(\cdot, \cdot)$ and $\|\cdot\|$ denote the usual scalar product of $\mathbb{R}^n$ and the induced norm, respectively. Clearly, $F$ is positive definite if $a > 0$. If $a = 0$, then $F$ is positive definite if and only if $\det A \neq 0$, otherwise $F$ is only positive semidefinite.

19. Left to the reader.

20. Left to the reader.
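The recursion in the answer to Exercise 8 above can be explored numerically. The following is a small sketch, assuming the sequence is $a_1 = \sqrt{2}$, $a_n = \sqrt{2 + a_{n-1}}$ as used in that answer; the function name `sequence` and the number of terms are illustrative choices.

```python
import math

def sequence(n_terms):
    """Return the first n_terms of a_1 = sqrt(2), a_n = sqrt(2 + a_{n-1})."""
    a = math.sqrt(2.0)
    terms = [a]
    for _ in range(n_terms - 1):
        a = math.sqrt(2.0 + a)
        terms.append(a)
    return terms

terms = sequence(30)

# The terms stay below (or, after rounding, at) the bound 2 and approach it.
for n, value in enumerate(terms[:6], start=1):
    print(n, value, 2.0 - value)
```

The printed gaps $2 - a_n$ shrink by roughly a factor of four per step, which is consistent with the limit $a = 2$ obtained from $a = \sqrt{2 + a}$; in floating point the later terms become indistinguishable from 2.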

12.2 Answers to Exercises for Chap. 2

1. Consider first the case of finite unions. Since $A_i \subset \operatorname{Cl} A_i$ we have $\bigcup_{i=1}^{n} A_i \subset \bigcup_{i=1}^{n} \operatorname{Cl} A_i$. Here the right-hand side is a closed set, hence $\operatorname{Cl} \bigcup_{i=1}^{n} A_i \subset \bigcup_{i=1}^{n} \operatorname{Cl} A_i$. For the converse inclusion, let $x \in \bigcup_{i=1}^{n} \operatorname{Cl} A_i$. So there exists a $j \in \{1, 2, \dots, n\}$ such that $x \in \operatorname{Cl} A_j$. Therefore, $x \in \operatorname{Cl} \bigcup_{i=1}^{n} A_i$, which implies the converse inclusion. For the inclusion relation $\bigcup_{i=1}^{\infty} \operatorname{Cl} A_i \subset \operatorname{Cl} \bigcup_{i=1}^{\infty} A_i$, one can use a similar argument. This inclusion can be proper: for example, let $(\mathbb{R}, d)$ be our metric space, where $d(x, y) = |x - y|$ for all $x, y \in \mathbb{R}$, and consider also the set $\mathbb{Q}$ of rational numbers, which can be written as $\mathbb{Q} = \{r_1, r_2, \dots\}$. If $A_i = \{r_i\}$, $i \in \mathbb{N}$, then $\bigcup_{i=1}^{\infty} \operatorname{Cl} A_i = \mathbb{Q}$ and $\operatorname{Cl} \bigcup_{i=1}^{\infty} A_i = \mathbb{R}$.

2. Consider $(\mathbb{R}, d)$ with $d(x, y) = |x - y|$ for all $x, y \in \mathbb{R}$, and let $A = \mathbb{Q}$. Then $\operatorname{Cl} A = \mathbb{R}$, $\operatorname{Int} A = \emptyset$, $\operatorname{Int}(\operatorname{Cl} A) = \mathbb{R}$, and $\operatorname{Cl}(\operatorname{Int} A) = \emptyset$, so both answers are No.

3. Let $A$ be an arbitrary nonempty subset of $(X, d_0)$. For any $x \in A$ the ball $B(x, 1/2) = \{x\} \subset A$, so $A$ is indeed open in $(X, d_0)$.

4. Let $p \in \operatorname{Cl} A$. Assume by way of contradiction that $\inf\{d(p, x) : x \in A\} =: r > 0$. Obviously, $A \subset X \setminus B(p, r/2)$, the latter being a closed set. It follows that $\operatorname{Cl} A \subset X \setminus B(p, r/2)$, which contradicts the assumption $p \in \operatorname{Cl} A$. For the converse implication, assume $\inf\{d(p, x) : x \in A\} = 0$. Then for every $n \in \mathbb{N}$ there exists an $a_n \in A$ such that $d(a_n, p) < 1/n$. So $a_n \to p \Rightarrow p \in \operatorname{Cl} A$.

5. It is easy to see that $(BC(A; Y), \|\cdot\|_{\sup})$ is a normed space, hence a metric space with the metric generated by $\|\cdot\|_{\sup}$. Let $(f_n)_{n \in \mathbb{N}}$ be a Cauchy sequence in $BC(A; Y)$. It is a Cauchy sequence among the bounded functions, so it has a limit $f$ among the bounded functions (in fact, $B(A; Y)$ is similar to $B(A; \mathbb{R})$, which was discussed in Chap. 2, since $(Y, \|\cdot\|)$ is a Banach space). It remains to prove that $f$ is continuous. Let $x_0$ be an arbitrary point in $A$ and let $\varepsilon$ be a small positive number. We have

\[ \|f(x) - f(x_0)\| \le \|f(x) - f_N(x)\| + \|f_N(x) - f_N(x_0)\| + \|f_N(x_0) - f(x_0)\| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon \]

(for $N \in \mathbb{N}$ sufficiently large and) for $\|x - x_0\|$ sufficiently small, hence $f$ is continuous at $x_0$.

6. Answers: (a) $\emptyset$, since all the points of $\mathbb{Z} \times \mathbb{Z}$ are isolated; (b) $\mathbb{R}^2$; (c) $\mathbb{R} \times \{0\}$; (d) $\{(0, 0)\} \cup \{(1/m, 0);\ m \in \mathbb{Z},\ m \neq 0\}$. Proof of (c): In order to obtain accumulation points we have to consider as denominators terms of integer sequences that converge to $\pm\infty$. Define

\[ h_k := \frac{p n_k + z}{q n_k} = \frac{p}{q} + \frac{z}{q n_k}, \quad k = 1, 2, \dots, \]

where $p, q, z \in \mathbb{Z}$, $q \neq 0$, $n_k \in \mathbb{N}$, $n_k \to \infty$. Then $(h_k, 1/(q n_k))$ converges to $(p/q, 0)$. Since $\mathbb{Q}$ is dense in $\mathbb{R}$ the result follows.

7. Answers: $\partial A = [0, 1]$; $\partial B = B \cup \{0\}$; $\partial C = \{(x, y) \in \mathbb{R}^2;\ x^2 - y^2 = 1\}$.

8. We have $B(x, r) \subset \bar{B}(x, r) \Rightarrow \operatorname{Cl} B(x, r) \subset \bar{B}(x, r)$. For the converse inclusion, take an arbitrary $y \in \bar{B}(x, r) \setminus B(x, r)$, i.e., $\|y - x\| = r$. Define the sequence $(y_n)$ by

\[ y_n = \frac{1}{n} x + \Big( 1 - \frac{1}{n} \Big) y, \quad n = 1, 2, \dots \]

Since

\[ d(y_n, y) = \frac{1}{n} \|y - x\| = \frac{r}{n} \to 0, \]

we see that $y \in \operatorname{Cl} B(x, r)$, which concludes the proof. In $(X, d_0)$, $B(x, r) = \{x\} \Rightarrow \operatorname{Cl} B(x, r) = \{x\}$, while $\bar{B}(x, r) = X$ if $r \ge 1$.


9. Let $(x_n)$ be a Cauchy sequence in a metric space $(X, d)$, and let $y, z$ be two cluster points of $(x_n)$, i.e., $y, z$ are the limit points of two subsequences of $(x_n)$, say $(y_n)$, $(z_n)$. From $d(y, z) \le d(y, y_n) + d(y_n, z_n) + d(z_n, z)$ we derive $d(y, z) = 0 \Rightarrow y = z$.

10. Use the decompositions

\[ 2\pi \sqrt{n^2 + 3n} = 2\pi n + \alpha_n, \quad \pi \sqrt{n^2 + n} = \pi n + \beta_n, \]

where

\[ \alpha_n = \frac{6\pi n}{\sqrt{n^2 + 3n} + n} \to 3\pi, \quad \beta_n = \frac{\pi n}{\sqrt{n^2 + n} + n} \to \frac{\pi}{2}, \]

to infer that $(x_n)$ is convergent to $0$, while $(y_n)$ has two cluster points, $+1$ and $-1$.

11. Denote $C[0, 1] := C([0, 1]; \mathbb{R})$. Let $d_{\sup}$ be the metric generated by the sup-norm $\|\cdot\|_{\sup}$ on $C[0, 1]$. Let $g$ be an arbitrary element of $B$. By Weierstrass' Theorem $\inf_{x \in [0, 1]} g(x) = g(x_0) > 0$ for some $x_0 \in [0, 1]$. Denoting $c := g(x_0)$, we see that the ball centered at $g$ with radius $c/2$ of the metric space $(C[0, 1], d_{\sup})$ is included in $B$. Therefore $B$ is open in $(C[0, 1], d_{\sup})$. To answer the latter question, observe that

\[ \tilde{B} := \{ f \in C[0, 1] : f(x) \ge 0 \text{ for all } x \in [0, 1] \} \]

is a closed subset of $(C[0, 1], d_{\sup})$, and every element $f$ of $\tilde{B}$ is an accumulation point of $B$: $f = \lim f_n$, where $f_n(x) = f(x) + 1/n$, $x \in [0, 1]$, $n = 1, 2, \dots$ Therefore $\operatorname{Cl} B = \tilde{B}$.

12. Let $d_{\sup}$ be the metric on $BC(\mathbb{R}; \mathbb{R})$ generated by the sup-norm. Obviously, $D$ contains functions whose infimum is zero. For example, $g(x) = e^{-x^2}$, $x \in \mathbb{R}$, is such a function. Any ball $B(g, \varepsilon) \subset (BC(\mathbb{R}; \mathbb{R}), d_{\sup})$ contains functions with negative values, so $g$ is not an interior point of $D$, hence $D$ is not open in $(BC(\mathbb{R}; \mathbb{R}), d_{\sup})$. Obviously, $\operatorname{Int} D = \{ f \in BC(\mathbb{R}; \mathbb{R});\ \inf_{\mathbb{R}} f > 0 \}$. The closure of $D$ is $\{ f \in BC(\mathbb{R}; \mathbb{R});\ f(x) \ge 0\ \forall x \in \mathbb{R} \}$. The proof is the same as in the previous exercise.


13. Clearly, such an open cover does exist since $(0, 1]$ is not a closed set, hence not compact. Indeed, for example the collection $\{(1/n, 2)\}_{n \in \mathbb{N}}$ is an open cover with no finite subcover (easy to check).

14. Let $S \subset (X, d)$ be a discrete subset of a metric space $(X, d)$. If $S$ is finite, then it is clearly compact. Now, let us assume that $S$ is compact and show that it is finite. Assume by way of contradiction that $S$ is infinite and consider an open cover of $S$, $\{B(x, r_x)\}_{x \in S}$, where $B(x, r_x) \cap S = \{x\}$ for all $x \in S$ (such balls exist since all the points of $S$ are isolated). Any proper sub-collection of $\{B(x, r_x)\}_{x \in S}$ is no longer a cover of $S$. This contradicts the fact that $S$ is compact. Thus $S$ is compact if and only if it is finite. In fact, this result holds in any topological space.

15. The conclusion follows from the total boundedness of $A$.

16. Both assertions follow easily by using sequences. As a counterexample, consider the following two subsets of $(\mathbb{R}, |\cdot|)$: $A = [2, +\infty)$ and $B = \{1\}$.

17. The reader is encouraged to first draw the graphs of the $f_n$'s. Denoting by $\|\cdot\|_{\sup}$ the sup-norm of $C[0, 1]$, we have $\|f_n\|_{\sup} = 1$ for all $n \in \mathbb{N}$, so $F$ is bounded (with respect to the metric $d_{\sup}$, $d_{\sup}(f, g) = \|f - g\|_{\sup}$, $\forall f, g \in C[0, 1]$). On the other hand, $d_{\sup}(f_n, f_m) = 1$ for all $m, n \in \mathbb{N}$, $m \neq n$, so all elements of $F$ are isolated points (in other words, $F$ is a discrete set). Therefore, $F$ is a closed set. Being an infinite discrete set, $F$ is not compact (one can use the open cover $\{B(f_n, 1/2)\}_{n \in \mathbb{N}}$).

18. We need to analyze the case when $A$ is an infinite set, otherwise the conclusion is obvious. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence in $A$. Since $A$ is totally bounded, it is a subset of a finite union of open balls of radius $1$. One of these balls, say $B(a_1, 1)$, contains infinitely many terms of $(x_n)$. Denote $C_1 := B(a_1, 1) \cap A$ and pick a term $x_{n_1} \in C_1$. Obviously, $C_1$ is also totally bounded, so it is a subset of a finite union of open balls of radius $r = 1/2$. There exists a ball $B(a_2, 1/2)$ such that $C_2 := B(a_2, 1/2) \cap B(a_1, 1) \cap A$ contains infinitely many terms of $(x_n)$. Choose one of these terms, $x_{n_2}$, with $n_2 > n_1$. Continuing this process, we find $x_{n_j} \in C_j = A \cap \big\{ \bigcap_{i=1}^{j} B(a_i, 1/i) \big\}$, with $n_j > n_{j-1}$, $j = 2, 3, \dots$ Since $x_{n_j} \in B(a_k, 1/k)$ for all $j \ge k$, we have

\[ d(x_{n_j}, x_{n_i}) \le d(x_{n_j}, a_k) + d(x_{n_i}, a_k) < \frac{2}{k}, \quad \forall i, j \ge k, \]

so the subsequence $(x_{n_k})_{k \in \mathbb{N}}$ is a Cauchy sequence in $(X, d)$, hence convergent (to a point in $\operatorname{Cl} A$), since $(X, d)$ is complete.

19. (a) It is easily seen that $l^1$ is a vector space over $\mathbb{R}$ with respect to the usual operations, $\|\cdot\|$ is a norm on $l^1$, and $(l^1, \|\cdot\|)$ is a Banach space.

(b) For every $a = (a_n)_{n \in \mathbb{N}} \in A$ we have

\[ \sum_{k=N}^{\infty} |a_k| \le \frac{1}{N} \sum_{k=N}^{\infty} k |a_k| \le \frac{1}{N}, \]

so, for all $\varepsilon > 0$, there exists an $N \in \mathbb{N}$ large enough such that $\|(0, 0, \dots, 0, a_N, a_{N+1}, \dots)\| < \varepsilon$. This shows that $A$ is totally bounded, since $[-1, +1]^{N-1}$ is. As $(l^1, d)$ is a complete metric space, $A$ is relatively compact (see the previous exercise). It is easily seen that $A$ is closed in $(l^1, d)$, hence $A$ is, in fact, compact.

20. Let us first consider the case $D = [0, 1]$. Denote by $A$ the set of all sequences $a = (a_n)_{n \in \mathbb{N}}$ in $\mathbb{R}$ satisfying $\sum_{n=1}^{\infty} n |a_n| \le 1$. According to Weierstrass' M-test, for all $a \in A$, the function $f_a$,

\[ f_a(x) = \sum_{n=1}^{\infty} a_n \sin(n \pi x), \quad x \in [0, 1], \]

is well defined and belongs to $C[0, 1]$. Indeed, for all $x \in [0, 1]$, $|a_n \sin(n \pi x)| \le |a_n|$, $\forall n \in \mathbb{N}$, and

\[ \sum_{n=1}^{\infty} |a_n| \le \sum_{n=1}^{\infty} n |a_n| \le 1, \]

hence the series $\sum_{n=1}^{\infty} a_n \sin(n \pi x)$ is uniformly convergent in $[0, 1]$, so $f_a \in C[0, 1]$. In fact, $f_a$ is continuously differentiable in $[0, 1]$ and

\[ f_a'(x) = \pi \sum_{n=1}^{\infty} n a_n \cos(n \pi x), \quad x \in [0, 1]. \]
348

12 Answers to Exercises

It is easily seen that the function $a \mapsto f_a$ is continuous from $A \subset l^1$ to $C[0, 1]$. By the previous exercise, $A$ is compact in $l^1$, so $F = \{f_a;\ a \in A\}$ is compact in $C[0, 1]$ (cf. Theorem 2.19). Now, if $D = \mathbb{R}$, then $C(\mathbb{R}; \mathbb{R})$ cannot be equipped with the sup-norm. However, the result holds if $C(\mathbb{R}; \mathbb{R})$ is replaced by $BC(\mathbb{R}; \mathbb{R})$ (bounded and continuous functions $\mathbb{R} \to \mathbb{R}$).

21. We try to apply the Arzelà–Ascoli criterion. First, we have

\[ u_n(t) = u_n(s) + \int_s^t u_n'(\tau)\, d\tau \implies |u_n(t)| \le |u_n(s)| + \int_a^b |u_n'(\tau)|\, d\tau. \]

Integration over $[a, b]$ with respect to $s$ yields

\[ (b - a) |u_n(t)| \le \int_a^b |u_n(s)|\, ds + (b - a) \int_a^b |u_n'(\tau)|\, d\tau, \]

which shows (by Hölder's inequality) that $(u_n)$ is bounded in $C([a, b]; \mathbb{R})$. On the other hand, we obtain by using Hölder's inequality and the boundedness of $(u_n')$ in $L^p(a, b; \mathbb{R})$ that, for all $a \le s \le t \le b$ and for all $n \in \mathbb{N}$,

\[
\begin{aligned}
|u_n(t) - u_n(s)| &= \Big| \int_s^t u_n'(\tau)\, d\tau \Big| \le \int_s^t |u_n'(\tau)|\, d\tau \\
&\le \Big( \int_s^t |u_n'(\tau)|^p\, d\tau \Big)^{1/p} (t - s)^{1/q} \\
&\le \Big( \int_a^b |u_n'(\tau)|^p\, d\tau \Big)^{1/p} (t - s)^{1/q} \\
&\le C |t - s|^{1/q},
\end{aligned}
\]

where $q$ is the conjugate of $p$. This shows that $(u_n)$ is equicontinuous, so the result follows by the Arzelà–Ascoli criterion.

22. Proceed by way of contradiction. Assume that $F$ is not uniformly equicontinuous, i.e., there exists an $\varepsilon_0 > 0$ such that


\[ \forall n \in \mathbb{N},\ \exists x_n, y_n \in A,\ \exists f_n \in F: \quad d(x_n, y_n) < \frac{1}{n}, \quad \rho(f_n(x_n), f_n(y_n)) \ge \varepsilon_0. \]

As $A$ is compact, there exist convergent subsequences $(x_{n_k})_{k \in \mathbb{N}}$, $(y_{n_k})_{k \in \mathbb{N}}$, with the same limit, say $x_0 \in A$. So the above statement contradicts the equicontinuity of $F$ at $x = x_0$.

23. Apply the Arzelà–Ascoli criterion. Denote by $\|\cdot\|_{\sup}$ the norm of $C[0, 1]$ and by $d_{\sup}$ the induced metric. We have $\|f_a\|_{\sup} \le 1$ for all $a \in \mathbb{R}$, so $F$ is bounded in $C[0, 1]$. We also have $|f_a'(x)| \le 1$, $\forall x \in [0, 1]$, $a \in \mathbb{R}$, thus, according to the Mean Value Theorem, $F$ is (Lipschitz) equicontinuous. It follows by the Arzelà–Ascoli criterion that $F$ is relatively compact in $C[0, 1]$, but $F$ is not closed, hence not compact. Indeed, $f_a$ converges uniformly, as $a \to \infty$, to the null function, which does not belong to $F$ (there is no $a \in \mathbb{R}$ such that $f_a \equiv 0$).

24. (a) Denote

\[ y(t) = \int_{t_0}^{t} b(s) u(s)\, ds. \]

We have $y'(t) = b(t) u(t) \le a(t) b(t) + b(t) y(t)$, $t \in [t_0, T]$, which, after multiplication by $e^{-\int_{t_0}^{t} b(s)\, ds}$, becomes

\[ \frac{d}{dt} \Big( e^{-\int_{t_0}^{t} b(s)\, ds}\, y(t) \Big) \le a(t) b(t)\, e^{-\int_{t_0}^{t} b(s)\, ds}, \quad t \in [t_0, T]. \]

Integrating this inequality over $[t_0, t]$ gives

\[ y(t) \le \int_{t_0}^{t} a(s) b(s)\, e^{\int_s^t b(\tau)\, d\tau}\, ds, \quad t \in [t_0, T], \]

which leads to the desired conclusion. Bellman's lemma follows trivially from this conclusion.


(b) Let $y = y(t)$ be another solution on $[t_0 - \delta, t_0 + \delta]$. Then, for $t \in [t_0, t_0 + \delta]$,

\[ \|x(t) - y(t)\| \le \Big\| \int_{t_0}^{t} [f(s, x(s)) - f(s, y(s))]\, ds \Big\| \le L \int_{t_0}^{t} \|x(s) - y(s)\|\, ds. \]

According to Bellman's lemma (with $C = 0$), this implies $x(t) = y(t)$, $t \in [t_0, t_0 + \delta]$. The case $t \in [t_0 - \delta, t_0]$ can be reduced to a similar one if we use the change $t = t_0 - \tau$, $\tau \in [0, \delta]$.

25. First consider the case $c \neq 0$. Denote

\[ y(t) = |c|^2 + 2 \int_{t_0}^{t} f(s) \cdot |x(s)|\, ds \ge |c|^2. \]

Then

\[ |x(t)|^2 \le y(t),\ t \in I \implies |x(t)| \le \sqrt{y(t)},\ t \in I. \]

Thus, $y'(t) = 2 f(t) \cdot |x(t)| \le 2 f(t) \sqrt{y(t)}$, $t \in I$. Having in mind that $y(t) \ge |c|^2 > 0$, $\forall t \in I$, we can write

\[ \frac{d}{dt} \sqrt{y(t)} \le f(t), \quad t \in I, \]

which gives the desired inequality by integration over $[t_0, t]$. Now, if $c = 0$ we replace it by $\varepsilon > 0$, so by the above reasoning we obtain

\[ |x(t)| \le |\varepsilon| + \int_{t_0}^{t} f(s)\, ds \quad \forall t \in I, \]

and now let $\varepsilon \to 0$ to finish.

26. According to Peano's Theorem, for any $a, b \in (0, \infty)$ there exists a solution defined on $[-\delta, +\delta]$, where $\delta = \min\{a, b/M\}$, $M = 1 + a^2 + b^2/(1 + b^2)$. The solution is also unique, since the function


defined by the right-hand side of the equation is Lipschitzian with respect to the second variable. If we fix an arbitrary $a > 0$, we can choose a sufficiently large $b > 0$ such that

\[ \frac{b}{M} \ge a \implies \delta = a. \]

Thus, for any $a > 0$ there exists a unique solution of the given Cauchy problem, defined on the whole interval $[-a, a]$, so the solution can (uniquely) be extended to $\mathbb{R}$.

Remark. Note that the function defined by the right-hand side of the equation, $f(t, v) = 1 + t^2 + v^2/(1 + v^2)$, is Lipschitz continuous with respect to $v$ on $\mathbb{R}^2$. If we consider an arbitrary interval $[-a, a]$ and use Euler's method of polygonal lines (as in the proof of Peano's Theorem), we observe that the functions $\varphi_\varepsilon$ can be defined on $[-a, a]$, so we obtain a solution defined on the whole interval $[-a, a]$. By the Lipschitz continuity of $f$ with respect to $v$ this solution is also unique. Since $a$ was arbitrarily chosen, the solution exists on $\mathbb{R}$ and is unique.

Another approach towards solving this exercise is based on the Banach Contraction Principle. Indeed, for an arbitrary but fixed $a > 0$, the operator $T$ defined by

\[ (T v)(t) = \int_0^t f(s, v(s))\, ds, \quad t \in [-a, a], \]

maps $C[-a, a] := C([-a, a]; \mathbb{R})$ into itself and $T^k$ is a contraction (with respect to the usual sup-norm of $C[-a, a]$) for a sufficiently large $k$, hence $T$ has a unique fixed point, which is the unique solution of our Cauchy problem on $[-a, a]$. Hence there exists a unique solution on $\mathbb{R}$.

27. Apply arguments similar to those used for the previous exercise.

28. Choose $-\infty < t_1 \le 0 \le t_2 < +\infty$, and define

\[ x(t) = \begin{cases} -(t - t_1)^2, & t < t_1, \\ 0, & t_1 \le t \le t_2, \\ (t - t_2)^2, & t > t_2. \end{cases} \]

There are infinitely many pairs $(t_1, t_2)$ that can be chosen in this way, and all the corresponding $x$'s are solutions of the given


Cauchy problem. Moreover, if we replace one of the restrictions of $x$ to $(-\infty, t_1)$ and $(t_2, +\infty)$ by zero, we obtain further solutions. Of course, the null function is also a solution on $\mathbb{R}$. In fact, these functions are the only solutions of the problem.

29. As $f(t, v) = 1 + t(1 + v^2)$ is Lipschitzian on compact sets (locally Lipschitzian), it follows by Theorem 2.33 that there exists a unique (local) solution, say $x = \varphi(t)$, defined on an interval $[0, \delta]$. This solution can be extended (uniquely) to the right. Indeed, let us consider the Cauchy problem

\[ x'(t) = 1 + t \big( 1 + x(t)^2 \big),\ t \ge \delta; \qquad x(\delta) = \varphi(\delta). \]

This problem has a unique solution, say $x = \psi(t)$, on an interval $[\delta, \delta_1]$. Thus, the function defined by

\[ x(t) = \begin{cases} \varphi(t), & t \in [0, \delta], \\ \psi(t), & t \in (\delta, \delta_1] \end{cases} \]

is the unique solution of our problem on $[0, \delta_1]$. This interval can be further extended. We can prove the existence of a unique solution, $x = x(t)$, defined on a maximal interval $[0, T)$. Let us prove that $T < +\infty$. Assume by contradiction that $T = +\infty$. Therefore,

\[ x'(t) = 1 + t \big( 1 + x(t)^2 \big),\ \forall t \ge 0 \implies \frac{x'(t)}{1 + x(t)^2} \ge t,\ \forall t \ge 0. \]

Integrating over $[0, t]$ we get

\[ \arctan x(t) - \arctan x_0 \ge \frac{t^2}{2}, \quad \forall t \ge 0, \]

which is impossible.

30. Note that $f : \mathbb{R}^2 \to \mathbb{R}$, $f(t, v) = t^2 + v^2$, is continuous and locally Lipschitzian with respect to $v$. According to Theorem 2.33, there exists a unique solution $x = x(t)$ on an interval $[-\delta, \delta]$. This solution can be uniquely extended to a maximal interval $(-T_1, T)$. Let us prove that $T < +\infty$. Assume the contrary: $T = +\infty$. Then, for $t \ge 1$, $x'(t) \ge 1 + x(t)^2$, hence

\[ \int_1^t \frac{x'(s)}{1 + x(s)^2}\, ds \ge t - 1, \quad t \ge 1, \]


i.e., $\arctan x(t) \ge \arctan x(1) + t - 1$, $t \ge 1$, which is impossible. Thus, $T < +\infty$. On the other hand, we observe that $\tilde{x}(t) = -x(-t)$ is also a solution of the problem. By the uniqueness property, it follows that the solution is an odd function and hence its maximal interval is symmetric with respect to $t = 0$, i.e., $T_1 = T$. It remains to show that $T > \sqrt{2}/2$. Consider our Cauchy problem on a rectangle $[-a, a] \times [-b, b]$ with $a, b > 0$. From Theorem 2.33 we derive existence and uniqueness on $[-\delta, \delta]$, where $\delta = \min\{a, b/(a^2 + b^2)\}$. Note that, for a given $a > 0$, the maximal value of $b/(a^2 + b^2)$ is $1/(2a)$, being attained for $b = a$. Now, the maximal value of $\min\{a, 1/(2a)\}$ is reached for $a = \sqrt{2}/2$. Summarizing, for $a = b = \sqrt{2}/2$ the corresponding $\delta = \sqrt{2}/2$, so $T > \sqrt{2}/2$.

31. For any $(t_0, x_0) \in \Omega$ one can choose sufficiently small numbers $a, b > 0$ such that

\[ D_{a,b} = \{ (t, v) \in \Omega;\ |t - t_0| \le a,\ \|v - x_0\| \le b \} \subset \Omega. \]

Apply Peano's Theorem to get (local) existence, and for uniqueness just observe that $f = f(t, v)$ is Lipschitzian with respect to $v$ on the compact $D_{a,b}$. See also Theorem 2.33.

32. It is enough to prove existence and uniqueness on every compact subinterval of $I$ containing $t_0$. Let $[a, b]$ be such a subinterval. Obviously the function $f : [a, b] \times \mathbb{R}^k \to \mathbb{R}^k$, $f(t, v) = A(t) v + b(t)$, is continuous. It is easily seen that $f$ is Lipschitzian with respect to $v$ (actually, with respect to any norm of $\mathbb{R}^k$) since $a_{ij}|_{[a,b]} \in C[a, b]$, $i, j = 1, 2, \dots, k$. According to Theorem 2.33, there exists a unique solution on the whole interval $[a, b]$.

33. For $n \in \mathbb{N}$ consider the operator $T_n : B(0, 1) \to B(0, 1)$, defined by

\[ T_n x = \Big( 1 - \frac{1}{n} \Big) T x, \quad x \in B(0, 1). \]

Obviously, for each $n \in \mathbb{N}$, $T_n$ is a contraction on the metric space $(B(0, 1), d_2)$, so it has a unique fixed point $x_n \in B(0, 1)$ (cf. Banach's Contraction Principle). To conclude, we need merely to use the (sequential) compactness of $B(0, 1)$ and the continuity of $T$.
354

12 Answers to Exercises

34. Apply the Banach Contraction Principle in $C[0, 1]$, equipped with the usual sup-norm, to the operator $T : C[0, 1] \to C[0, 1]$ defined by the right-hand side of the equation. As the function $y \mapsto \cos(\alpha y)$ is Lipschitzian of constant $\alpha$ and $T$ is a contraction, it has a unique fixed point $x \in C[0, 1]$, which is the unique solution of the given equation.

35. Let $m \in (0, \infty)$ be arbitrary but fixed. Note that $C([0, m]; X)$ is a Banach space with respect to the sup-norm: $\|y\|_m = \sup_{t \in [0, m]} \|y(t)\|$, $y \in C([0, m]; X)$. Define

\[ (T y)(t) = x_0 + \int_0^t f(s, y(s))\, ds, \quad t \in [0, m],\ y \in C([0, m]; X). \]

It is easily seen that $T$ maps $C([0, m]; X)$ into itself and $T^k$ is a contraction on this space for a sufficiently large $k \in \mathbb{N}$ (since $f = f(t, v)$ is Lipschitz continuous with respect to $v$ with Lipschitz constant $L_m = \sup\{|a(t)|;\ t \in [0, m]\}$). Therefore $T$ has a unique fixed point $x \in C([0, m]; X)$ (see Remark 2.38), which is the unique solution of our Cauchy problem on $[0, m]$. As $m$ was chosen arbitrarily, $x$ can be uniquely extended to $[0, \infty)$. From the original equation we see that $x \in C^1([0, \infty); X)$.
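The fixed-point arguments used in the answers to Exercises 26 and 35 above can be illustrated by iterating the integral operator numerically. The sketch below makes several assumptions that are not in the text: it uses the right-hand side $f(t, v) = 1 + t^2 + v^2/(1 + v^2)$ from Exercise 26 with the initial value $x(0) = 0$ implied by the operator $T$ there, restricts to $[0, a]$ with $a = 1$, and replaces the integral by a cumulative trapezoid rule on a uniform grid. The function names are illustrative.

```python
def f(t, v):
    """Right-hand side f(t, v) = 1 + t^2 + v^2 / (1 + v^2)."""
    return 1.0 + t * t + v * v / (1.0 + v * v)

def picard_step(ts, v):
    """Approximate (T v)(t_i) = integral from 0 to t_i of f(s, v(s)) ds.

    Uses the cumulative trapezoid rule on the grid ts (an assumed
    discretization, not part of the analytic argument).
    """
    out = [0.0]
    for i in range(1, len(ts)):
        h = ts[i] - ts[i - 1]
        out.append(out[-1] + 0.5 * h * (f(ts[i - 1], v[i - 1]) + f(ts[i], v[i])))
    return out

def picard(a=1.0, n=200, iterations=60):
    """Iterate v <- T v starting from v = 0; return grid, iterate, last gap."""
    ts = [a * i / n for i in range(n + 1)]
    v = [0.0] * (n + 1)
    gap = None
    for _ in range(iterations):
        w = picard_step(ts, v)
        # Sup-norm distance between successive iterates on the grid.
        gap = max(abs(wi - vi) for wi, vi in zip(w, v))
        v = w
    return ts, v, gap

ts, x, gap = picard()
print(x[-1], gap)
```

The quantity `gap` is the discrete sup-norm distance between the last two iterates; it drops to rounding level after a modest number of steps, which is the behaviour one expects when some power of $T$ is a contraction.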

12.3 Answers to Exercises for Chap. 3

1. Assume that $\Omega \subset \mathbb{R}^k$ is measurable. Then $\mathbb{R}^k \setminus \Omega$ is measurable, so for every $\varepsilon > 0$ there exists an open set $D \supset \mathbb{R}^k \setminus \Omega$ such that $m(D \setminus (\mathbb{R}^k \setminus \Omega)) < \varepsilon$ (see Definition 3.1). It follows that $F := \mathbb{R}^k \setminus D$ is closed and $m(\Omega \setminus F) < \varepsilon$. The converse implication can be proved similarly.

2. Let $\varepsilon > 0$ be arbitrary, but fixed. Since $\Omega$ is measurable, it follows from the previous exercise that there exists a closed set $F \subset \Omega$ such that $m(\Omega \setminus F) < \varepsilon/2$. If it turns out that $F$ is bounded, hence compact, we are done. Assume now that $F$ is unbounded. Observe that $F_n := F \cap \bar{B}(0, n)$ is a compact set for each $n \in \mathbb{N}$, where $\bar{B}(0, n)$ is the closed ball centered at $0$ of radius $n$. Since $F_n \subset F_{n+1}$ and $F_n \subset F \subset \Omega$, it follows that the sequence $(m(F_n))$ is nondecreasing and $m(F_n) \le m(F) \le m(\Omega) < \infty$, $n = 1, 2, \dots$. Therefore, there exists


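$\lim_{n \to \infty} m(F_n) \le m(F) < \infty$. On the other hand, $F$ can be written as a countable union of measurable disjoint sets,

\[ F = F_1 \cup \Big( \bigcup_{n=2}^{\infty} (F_n \setminus F_{n-1}) \Big), \]

hence

\[ m(F) = m(F_1) + \sum_{n=2}^{\infty} [m(F_n) - m(F_{n-1})] = \lim_{n \to \infty} m(F_n) \le m(F). \]

Therefore $m(F_n) \to m(F)$. It follows that for a sufficiently large $N$, $m(F \setminus F_N) = m(F) - m(F_N) < \varepsilon/2$, where $F_N =: K$ is a compact set. Since $\Omega \setminus K = (\Omega \setminus F) \cup (F \setminus K)$, we conclude that

\[ m(\Omega \setminus K) = m(\Omega \setminus F) + m(F \setminus K) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

3. Since $B = (B \setminus A) \cup A$, we have $m(B) = m(B \setminus A) + m(A) \implies m(B \setminus A) = 0$. As $\Omega \setminus A$ is a subset of the null set $B \setminus A$, it follows that $\Omega \setminus A$ is measurable with $m(\Omega \setminus A) = 0$. From $\Omega = (\Omega \setminus A) \cup A$ we deduce that $\Omega$ is measurable, and $m(\Omega) = m(\Omega \setminus A) + m(A) = m(A)$.

4. (a) If $C \subset \mathbb{R}^k$ is a closed cube, then $C_\xi$ is so for any $\xi \in \mathbb{R}^k \setminus \{0\}$ and $v(C_\xi) = v(C)$. Therefore, $m_e(\Omega_h) = m_e(\Omega) = m(\Omega)$. So $\Omega_h$ is measurable and $m(\Omega_h) = m(\Omega)$.

(b) Employ similar arguments.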


5. Assume in addition that $f \ge 0$. The function $f_h : \mathbb{R}^k \to \mathbb{R}$ defined by $f_h(x) = f(x - h)$, $x \in \mathbb{R}^k$, is measurable. Indeed, for any $\lambda \in \mathbb{R}$, we have $\{x \in \mathbb{R}^k;\ f_h(x) > \lambda\} = \{y + h;\ f(y) > \lambda\}$, which is measurable (see the previous exercise). Obviously, the property holds for nonnegative simple functions and hence, by using the standard limiting process, one can obtain it for $f$. For a general $f \in L^1(\mathbb{R}^k)$, one can use the decomposition $f = f^+ - f^-$. Similar arguments work for the function $x \mapsto f(\alpha x)$.

6. Consider a sequence of partitions $P_n$ of $[a, b]$ with norms tending to $0$, as well as the corresponding sequences of the lower and upper Riemann sums, say $L_n$ and $U_n$, which can be interpreted as Lebesgue integrals of some simple functions $l_n$ and $u_n$, $l_n \le f \le u_n$. As $f$ is Riemann integrable, we have
\[ L_n \longrightarrow (R)\int_a^b f(x)\,dx \longleftarrow U_n \quad \text{as } n \to \infty, \]
which leads to the desired conclusion. For more details, see, for example, [50, Theorem 5.52, p. 83]. Now, obviously, $D$ is not Riemann integrable, but it is Lebesgue integrable, since $D = 0$ almost everywhere.

7. Use integration by parts (with $u = 1/(1 + x)$, $v = x^n$), then Lebesgue's Dominated Convergence Theorem.

8. Set $f_n(x) = \chi_{[1,n]}(x)\,x^{-2} \ln x$, $x \in \mathbb{R}$, $n = 2, 3, \dots$ Observe that the $f_n$'s are measurable, $f_n(x) \to \chi_{[1,\infty)}(x)\,x^{-2} \ln x$ and $0 \le f_n(x) \le f_{n+1}(x)$ for each $x \in \mathbb{R}$. By the Monotone Convergence Theorem,
\[ \lim_{n\to\infty} \int_{-\infty}^{+\infty} f_n(x)\,dx = \int_1^{\infty} f(x)\,dx. \]


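On the other hand,
\[ \int_{-\infty}^{+\infty} f_n(x)\,dx = \int_1^n x^{-2} \ln x\,dx = -\int_1^n \ln x\,d(x^{-1}) = -x^{-1} \ln x\,\Big|_1^n + \int_1^n x^{-2}\,dx = -\frac{\ln n}{n} + 1 - \frac{1}{n} \to 1 \quad \text{as } n \to \infty, \]
which, combined with the previous equality, implies the result.

9. Let $f_n : (0, \infty) \to \mathbb{R}$ be defined by
\[ f_n(x) = \chi_{(0,n]}(x) \Big(1 + \frac{x}{n}\Big)^n e^{-2x}, \quad x > 0,\ n = 1, 2, \dots \]
We have for each $x > 0$,
\[ n \ln\Big(1 + \frac{x}{n}\Big) \le x \implies \Big(1 + \frac{x}{n}\Big)^n \le e^x, \quad n = 1, 2, \dots, \]
\[ \lim_{n\to\infty} n \ln\Big(1 + \frac{x}{n}\Big) = x \implies \lim_{n\to\infty} \Big(1 + \frac{x}{n}\Big)^n = e^x. \]
So, by the Lebesgue Dominated Convergence Theorem,
\[ \lim_{n\to\infty} \int_0^n \Big(1 + \frac{x}{n}\Big)^n e^{-2x}\,dx = \lim_{n\to\infty} \int_0^{\infty} f_n(x)\,dx = \lim_{n\to\infty} \int_0^n e^{-x}\,dx = 1. \]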
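The limit in Exercise 9 can be sanity-checked numerically. The following Python sketch is only an illustration, not part of the proof: it approximates $\int_0^n (1 + x/n)^n e^{-2x}\,dx$ with a composite Simpson rule. The names `integrand`, `simpson`, `truncated_integral` and the step count are illustrative choices that do not come from the text.

```python
import math

def integrand(x, n):
    # (1 + x/n)^n * e^{-2x}, computed through logarithms to avoid overflow
    return math.exp(n * math.log1p(x / n) - 2.0 * x)

def simpson(f, a, b, steps):
    # composite Simpson rule; `steps` must be even
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def truncated_integral(n, steps_per_unit=200):
    # integral of f_n over (0, n]; f_n vanishes outside (0, n]
    steps = n * steps_per_unit  # even, because steps_per_unit is even
    return simpson(lambda x: integrand(x, n), 0.0, float(n), steps)

values = {n: truncated_integral(n) for n in (5, 50, 200)}
for n, v in values.items():
    print(n, v)
```

Since $f_n \le f_{n+1}$ pointwise and $f_n(x) \le e^{-x}$, the computed values should increase with $n$ and stay below $1$, in agreement with the dominated convergence argument.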

10. Apply Lebesgue's Dominated Convergence Theorem.

11. (a) Left to the reader;
(b) For each $m \in \mathbb{N}$,
\[ f_m(x) = \chi_{[m^{-1},1]}(x) f(x), \quad x \in [0, 1], \]
is a simple function. We also have $0 \le f_m \le f_{m+1}$ and $f_m \to f$ as $m \to \infty$. Hence $f$ is measurable and, for $1 \le p < \infty$,
\[ \int_0^1 f(x)^p\,dx = \sum_{n=1}^{\infty} (\sqrt{n})^p \Big(\frac{1}{n} - \frac{1}{n+1}\Big) = \sum_{n=1}^{\infty} \frac{n^{(p-2)/2}}{n+1}, \]


which is finite for $1 \le p < 2$ and infinite for $2 \le p < \infty$. It is obvious that $f \notin L^\infty(0, 1)$.

12. Left to the reader.

13. By assumption, there exists
\[ \lim_{x\to0^+} \frac{f(x) - f(0)}{x - 0} = \lim_{x\to0^+} x^{-1} f(x) = f'_+(0), \]
hence
\[ |g(x)| = x^{-1/2} |x^{-1} f(x)| \le C x^{-1/2} \quad \forall x \in (0, 1], \]
where $C$ is a positive constant. For each $n \in \mathbb{N}$, define
\[ h_n(x) = \chi_{[n^{-1},1]}(x) |g(x)|, \quad x \in (0, 1], \]
extended as zero on $\mathbb{R} \setminus (0, 1]$. Applying the Monotone Convergence Theorem, we find
\[ \int_0^1 |g(x)|\,dx = \lim_{n\to\infty} \int_0^1 h_n(x)\,dx = \lim_{n\to\infty} \int_{1/n}^1 |g(x)|\,dx \le C \int_0^1 x^{-1/2}\,dx = 2C < \infty. \]

14. Apply Lebesgue's Dominated Convergence Theorem.

15. The case $q = \infty$ is trivial. For all the other cases, use Hölder's inequality.

16. The case $\|f\|_{L^\infty(\Omega)} = 0 \iff f = 0$ a.e. in $\Omega$ is trivial, so let us assume that $\|f\|_{L^\infty(\Omega)} > 0$. Obviously (see also the previous exercise),
\[ \|f\|_{L^p(\Omega)} \le \|f\|_{L^\infty(\Omega)}\, m(\Omega)^{1/p} \quad \forall p \ge 1, \]
which implies
\[ \limsup_{p\to\infty} \|f\|_{L^p(\Omega)} \le \|f\|_{L^\infty(\Omega)}. \qquad (∗) \]


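Now, for $0 < \alpha < \|f\|_{L^\infty(\Omega)}$, set $E = \{x \in \Omega;\ |f(x)| > \alpha\}$. Clearly, $m(E) > 0$ and for $p \ge 1$ we have
\[ \int_\Omega |f|^p\,dx \ge \int_E |f|^p\,dx \ge \alpha^p m(E) \implies \liminf_{p\to\infty} \|f\|_{L^p(\Omega)} \ge \alpha. \]
Therefore,
\[ \liminf_{p\to\infty} \|f\|_{L^p(\Omega)} \ge \|f\|_{L^\infty(\Omega)}, \]
which, combined with (∗) above, concludes the proof.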
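As an illustration of Exercise 16, consider the concrete choice $\Omega = (0, 1)$ and $f(x) = x$ (an example that is not taken from the text), for which $\|f\|_{L^p(0,1)} = (p + 1)^{-1/p}$ and $\|f\|_{L^\infty(0,1)} = 1$. The short Python sketch below tabulates these norms; the function name `lp_norm_of_identity` is hypothetical.

```python
def lp_norm_of_identity(p):
    # ||f||_{L^p(0,1)} for f(x) = x: (integral of x^p over (0,1))^(1/p) = (1/(p+1))^(1/p)
    return (1.0 / (p + 1.0)) ** (1.0 / p)

sup_norm = 1.0  # essential supremum of |x| on (0, 1)
exponents = (1, 2, 10, 100, 1000)
norms = [lp_norm_of_identity(p) for p in exponents]

for p, value in zip(exponents, norms):
    print(p, value)
```

Because $m(\Omega) = 1$ here, the bound $\|f\|_{L^p} \le \|f\|_{L^\infty}\, m(\Omega)^{1/p}$ reads $\|f\|_{L^p} \le 1$, and the printed values should increase toward $1$ as $p$ grows.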

12.4 Answers to Exercises for Chap. 4

1. If $G$ is the graph of a linear operator $A : X \to Y$, i.e., $G = \{[x, Ax];\ x \in X\}$, then necessarily $P_X G$ (the projection of $G$ onto $X$) is the whole of $X$ and $G$ is a linear subspace of $X \times Y$ (or, equivalently, $P_Y G$ is a linear subspace of $Y$) whose pairs enjoy the property that the right component is uniquely associated with the first component. Conversely, let $G$ be a linear subspace of $X \times Y$ with $P_X G = X$ satisfying the property: $\forall x \in X$ $\exists$ a unique $y \in P_Y G$ such that $[x, y] \in G$. Define $A : X \to Y$ by $Ax = y$. It is easy to check that $A$ is a linear operator whose graph is precisely $G$.

2. It is readily seen that
\[ A(rx) = rAx \quad \forall r = \frac{m}{n} \in \mathbb{Q},\ x \in X. \]
Using the density of $\mathbb{Q}$ in $\mathbb{R}$ and the continuity of $A$ we derive
\[ A(\alpha x) = \alpha Ax \quad \forall \alpha \in \mathbb{R},\ x \in X, \]
hence $A$ is linear.

3. (i) Denote the norm of $X = C[a, b]$ by $\|\cdot\|_{\sup}$. Obviously,
\[ \|Af\|_{\sup} \le \max(|a|, |b|) \cdot \|f\|_{\sup} \quad \forall f \in X \implies \|A\| \le \max(|a|, |b|). \]
On the other hand, for the constant function $f(t) = 1$, $t \in [a, b]$, we have $\|f\|_{\sup} = 1$ and $\|Af\|_{\sup} = \max(|a|, |b|)$, hence $\|A\| = \max(|a|, |b|)$.


(ii) Denote the norm of $X = L^p(a, b)$ by $\|\cdot\|_p$. We have
\[ \|Af\|_p \le \max(|a|, |b|) \cdot \|f\|_p \quad \forall f \in X \implies \|A\| \le \max(|a|, |b|). \]
Let us prove that the converse inequality, $\|A\| \ge \max(|a|, |b|)$, is also satisfied and thus $\|A\| = \max(|a|, |b|)$. Consider first the case $\max(|a|, |b|) = |b| \implies b > 0$. Define the sequence of functions
\[ f_n(t) = \begin{cases} 0, & a < t < b - \frac{1}{n}, \\ n^{1/p}, & b - \frac{1}{n} < t < b, \end{cases} \]
for $n \in \mathbb{N}$, $n > 1/b$. Obviously, $\|f_n\|_p = 1$ and
\[ \|Af_n\|_p = \Big( \int_{b - \frac{1}{n}}^{b} |t|^p\, n\,dt \Big)^{1/p} \ge b - \frac{1}{n} \quad \forall n \in \mathbb{N},\ n > \frac{1}{b}, \]
which implies $\|A\| \ge b = \max(|a|, |b|)$. The case $\max(|a|, |b|) = |a| \implies a < 0$ is similar and is left to the reader.

4. Clearly, $A$ is a linear operator from $X$ into itself. Using Theorem 2.29 (the Arzelà–Ascoli Criterion) we find $A \in K(X)$. We also have
\[ \|Af\|_{\sup} \le \|f\|_{\sup} \int_a^b g(s)\,ds \quad \forall f \in X \implies \|A\| \le \int_a^b g(s)\,ds. \]
Testing with the constant function $f(t) = 1$, $t \in [a, b]$, we see that, in fact, $\|A\| = \int_a^b g(s)\,ds$.

5. Obviously, if $A$ is continuous, then (∗) holds true. For the converse implication, it suffices to prove that there exists an $r > 0$ such that
\[ A\,B_X(0, r) = \{Ax;\ x \in X,\ \|x\|_X < r\} \]
is bounded in $(Y, \|\cdot\|_Y)$. Assume the contrary, i.e., for all $n \in \mathbb{N}$, the set $A\,B_X(0, 1/n)$ is unbounded. This means there is a sequence $(x_n)$ in $X$ such that $\|x_n\|_X < 1/n$, $\|Ax_n\|_Y > n$, for all $n \in \mathbb{N}$, which contradicts (∗).


6. According to Theorem 4.6, $L(X, Y)$ is a Banach space. Denote $S_n = A_1 + A_2 + \cdots + A_n$. For every $\varepsilon > 0$ there exists an $N_\varepsilon \in \mathbb{N}$ such that
\[ \|S_{n+p} - S_n\| = \|A_{n+1} + A_{n+2} + \cdots + A_{n+p}\| \le \|A_{n+1}\| + \|A_{n+2}\| + \cdots + \|A_{n+p}\| \le a_{n+1} + a_{n+2} + \cdots + a_{n+p} < \varepsilon, \quad \forall n > N_\varepsilon,\ p \in \mathbb{N}, \]
since the series $\sum_{n=1}^{\infty} a_n$ is convergent. So $(S_n)_{n\in\mathbb{N}}$ is a Cauchy sequence (in the Banach space $L(X, Y)$), hence it is convergent. This means that $\sum_{n=1}^{\infty} A_n$ is convergent in $L(X, Y)$.

7. (i) Use the previous exercise with $Y = X$ and
\[ A_n = \frac{1}{n!} A^n, \quad a_n = \frac{1}{n!} \|A\|^n \quad \forall n \in \mathbb{N}. \]
Indeed, we have
\[ \|A_n\| = \frac{1}{n!} \|A^n\| \le \frac{1}{n!} \|A\|^n = a_n \quad \forall n \in \mathbb{N}. \]
The notation $e^A$ for the sum of this series arises naturally from the similar notation for the classical exponential $e^a$, $a \in \mathbb{R}$.

(ii) From classical analysis we know that $(1 - a)^{-1}$ is the sum of the geometric series $1 + a + a^2 + \cdots + a^n + \cdots$ if $|a| < 1$. So we are naturally led to the following geometric series in $L(X)$:
\[ I + A + A^2 + \cdots + A^n + \cdots, \qquad (\alpha) \]
where $I$ denotes the identity operator. Since $\|A^n\| \le \|A\|^n$ and $a := \|A\| < 1$, it follows that the series $(\alpha)$ above is convergent in $L(X)$ (see the solution of the previous exercise). Denote its sum by $S$, i.e., $\|S_n - S\| \to 0$, where $S_n = I + A + A^2 + \cdots + A^n$, $n \in \mathbb{N}$. Note that
\[ (I - A)S_n = S_n(I - A) = I - A^{n+1} \quad \forall n \in \mathbb{N}. \]
Since $\|A^{n+1}\| \le \|A\|^{n+1} \to 0$, letting $n \to \infty$ in this equality yields $(I - A)S = S(I - A) = I$, so $I - A$ is invertible and $(I - A)^{-1} = S$, which is an element of $L(X)$.
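The Neumann series argument of Exercise 7 (ii) can be illustrated in finite dimensions. The Python sketch below uses a hypothetical $2 \times 2$ matrix $A$ (not from the text) whose operator norm induced by the sup-norm is $0.5 < 1$, and checks that the partial sum $S_n$ is an approximate inverse of $I - A$. All helper names are illustrative.

```python
def matmul(P, Q):
    # product of two square matrices given as nested lists
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matadd(P, Q):
    # entrywise sum of two matrices of the same shape
    return [[p + q for p, q in zip(row_p, row_q)] for row_p, row_q in zip(P, Q)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(A, terms):
    # S_n = I + A + A^2 + ... + A^n with n = terms
    n = len(A)
    S, power = identity(n), identity(n)
    for _ in range(terms):
        power = matmul(power, A)
        S = matadd(S, power)
    return S

A = [[0.2, 0.3], [0.1, 0.4]]      # assumed example; largest absolute row sum is 0.5
S = neumann_partial_sum(A, 60)
I_minus_A = [[1.0 - A[0][0], -A[0][1]], [-A[1][0], 1.0 - A[1][1]]]
product = matmul(I_minus_A, S)    # equals I - A^{61} up to rounding
print(product)
```

The product should agree with the identity matrix up to rounding, reflecting $(I - A)S_n = I - A^{n+1}$ with $\|A^{n+1}\| \le \|A\|^{n+1}$.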


8. The answer is based on arguments similar to those used in classical analysis for the identity $e^a \cdot e^b = e^{a+b}$ ($a, b \in \mathbb{K}$).

9. (a) This follows directly from Theorem 4.7 (Uniform Boundedness Principle);
(b) It is clear that $T$ is a linear operator. From (a) we infer that there exists a constant $C > 0$ such that
\[ \|T_n x\|_Y \le C \|x\|_X \quad \forall x \in X. \]
Therefore,
\[ \|Tx\|_Y \le C \|x\|_X \quad \forall x \in X \implies T \in L(X, Y). \]
(c) From
\[ \|T_n x\|_Y \le \|T_n\| \quad \forall x \in X,\ \|x\| \le 1 \]
we find
\[ \|Tx\|_Y \le \liminf \|T_n\| \quad \forall x \in X,\ \|x\| \le 1 \implies \|T\| \le \liminf \|T_n\|. \]

10. Use Theorem 4.7 (Uniform Boundedness Principle) with $(X := X^*, \|\cdot\|_{X^*})$, $(Y := \mathbb{K}, |\cdot|)$, $I := S$. For $x \in S$ define $T_x : X^* \to \mathbb{K}$ by
\[ T_x(f) = f(x), \quad f \in X^*. \]
By the condition on $S$ from the statement of the problem, we have
\[ \sup_{x \in S} |T_x(f)| < \infty \quad \forall f \in X^*. \]
So, by Theorem 4.7, there exists a constant $c > 0$ such that
\[ |f(x)| \le c \|f\|_{X^*} \quad \forall f \in X^*,\ x \in S \implies \|x\| \le c \quad \forall x \in S, \]
cf. Corollary 4.18.


11. Apply Theorem 4.10 (Closed Graph Theorem). In order to do that, it is sufficient to show that $A$ is a closed operator (equivalently, the graph of $A$ is closed in $X \times Y$). Let $x_n \to x$ in $X$, $Ax_n \to f$ in $X^*$. Letting $n \to \infty$ in
\[ (Ax_n)(y) = (Ay)(x_n), \quad y \in X, \]
yields
\[ f(y) = (Ay)(x) = (Ax)(y) \quad \forall y \in X, \]
hence $f = Ax$.

12. Obviously, $(D(A), \|\cdot\|_X)$ is a Banach space. Using Theorem 4.10 we infer that the restriction of $A$ to $D(A)$ is a linear continuous operator from $(D(A), \|\cdot\|_X)$ into $(Y, \|\cdot\|_Y)$.

13. In order to apply Theorem 4.10, we show that $A$ is a closed operator. To this purpose, consider $x_n \to x$ in $X$ and $Ax_n =: f_n \to f$ in $X^*$. By the assumption we have
\[ (f_n - Ay)(x_n - y) \ge 0 \quad \forall y \in X \implies (f - Ay)(x - y) \ge 0 \quad \forall y \in X. \]
Now take $y = x - tz$, $t \in \mathbb{R}$, $z \in X$ to conclude that $f = Ax$.

14. The identity operator $I : (X, \|\cdot\|_1) \to (X, \|\cdot\|_2)$ is bijective and continuous (due to the inequality which was assumed to be satisfied by the two norms). According to Theorem 4.8 (Open Mapping Theorem), $I^{-1} = I \in L\big((X, \|\cdot\|_2), (X, \|\cdot\|_1)\big)$, hence there exists a constant $C_1 > 0$ such that
\[ \|x\|_1 \le C_1 \|x\|_2 \quad \forall x \in X. \]
This combined with the inequality from the statement of the problem shows the equivalence of the two norms.

15. We shall assume that $f \ne 0$, otherwise its norm is zero in all cases.
(i) For $u = \sum_{i=1}^n \alpha_i u_i \in X$, we have
\[ |f(u)| \le \sum_{i=1}^n |\alpha_i| \cdot |f_i| \le \|u\|_\infty \sum_{i=1}^n |f_i|, \]


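hence $\|f\|_{X^*} \le \sum_{i=1}^n |f_i|$. Now, choose a particular $u$, namely $\tilde u = \sum_{i=1}^n \tilde\alpha_i u_i \in X$, where
\[ \tilde\alpha_i = \begin{cases} 0, & f_i = 0, \\ |f_i|^{-1} \bar f_i, & f_i \ne 0, \end{cases} \]
for $i = 1, 2, \dots, n$. Here $\bar f_i$ denotes the complex conjugate of $f_i$. Since $\|\tilde u\|_\infty = 1$, $f(\tilde u) = \sum_{i=1}^n |f_i|$ and $\|f\|_{X^*} \le \sum_{i=1}^n |f_i|$ (see above), it follows that
\[ \|f\|_{X^*} = \sum_{i=1}^n |f_i|. \]

(ii) In this case, for $u = \sum_{i=1}^n \alpha_i u_i \in X$, we have
\[ |f(u)| \le \sum_{i=1}^n |\alpha_i| \cdot |f_i| \le \|u\|_1 \cdot \max_{1 \le i \le n} |f_i|, \]
hence $\|f\|_{X^*} \le \max_{1 \le i \le n} |f_i|$. Assume that $\max_{1 \le i \le n} |f_i|$ (which is a positive number) is achieved for some $i_0$ and choose the vector $\tilde u \in X$ whose coordinates are null, except for $\tilde\alpha_{i_0} = |f_{i_0}|^{-1} \bar f_{i_0}$. Since $\|\tilde u\|_1 = 1$ and $f(\tilde u) = |f_{i_0}|$, we infer that
\[ \|f\|_{X^*} = \max_{1 \le i \le n} |f_i|. \]

(iii) In this case, for $u = \sum_{i=1}^n \alpha_i u_i \in X$, we have
\[ |f(u)| \le \sum_{i=1}^n |\alpha_i| \cdot |f_i| \le \|u\|_p \Big( \sum_{i=1}^n |f_i|^q \Big)^{1/q}, \]
where $q$ is the conjugate of $p$ (i.e., $1/p + 1/q = 1$), so
\[ \|f\|_{X^*} \le \Big( \sum_{i=1}^n |f_i|^q \Big)^{1/q}. \]
On the other hand, for $u := \tilde u = \sum_{i=1}^n \tilde\alpha_i u_i$, where
\[ \tilde\alpha_i = \begin{cases} 0, & f_i = 0, \\ \Big( \sum_{j=1}^n |f_j|^q \Big)^{-1+1/q} |f_i|^{q-2} \bar f_i, & f_i \ne 0, \end{cases} \]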


we have $\|\tilde u\|_p = 1$ and $f(\tilde u) = \big( \sum_{i=1}^n |f_i|^q \big)^{1/q}$, hence
\[ \|f\|_{X^*} = \Big( \sum_{i=1}^n |f_i|^q \Big)^{1/q}. \]

16. It is easily seen that $\|f\|_{X^*} \le 1$. In fact, $\|f\|_{X^*} = 1$. Indeed, choosing $u_n : [0, 1] \to \mathbb{R}$, $n \in \mathbb{N}$, defined by
\[ u_n(t) = \begin{cases} nt, & 0 \le t \le 1/n, \\ 1, & 1/n < t \le 1, \end{cases} \]
we have
\[ u_n \in X, \quad \|u_n\|_{\sup} = 1, \quad f(u_n) = 1 - \frac{1}{2n} \quad \forall n \in \mathbb{N}, \]
which proves the assertion. Now, to answer the question from the statement of the problem, observe that $|f(u)| \le f(|u|)$, $u \in X$, so it is enough to consider only nonnegative functions in the definition of $\|f\|_{X^*}$. Assume by way of contradiction that there exists a function $u \ge 0$, $u \in X$, $\|u\|_{\sup} = 1$ such that $f(u) = \|f\|_{X^*} = 1$, i.e., $\int_0^1 (1 - u(t))\,dt = 0$. As $u \in C[0, 1]$ with values in $[0, 1]$, this implies $u(t) = 1$ for all $t \in [0, 1]$, which contradicts the fact that $u(0) = 0$. So the answer is “no.”

12.5 Answers to Exercises for Chap. 5

1. It is easy to see that u and all its partial derivatives of order k ≤ 2 are in C(Ω), so u ∈ C 2 (Ω). In order to ﬁnd supp u, notice that u = 0 on {0}×(−1, 1) as well as on the graph of the function x1 = −

1 − |x2 |x2 , x2 ∈ (−1, 0) ∪ (0, 1), x2

and u ≠ 0 otherwise. Therefore, supp u = Cl Ω = R × [−1, +1].

2. Set F = {p_t : C[0, 1] → R; t ∈ [0, 1]}, where p_t(f) = |f(t)|, f ∈ C[0, 1].


Obviously, $p_t$ is a seminorm for all $t \in [0, 1]$. In addition, $F$ satisfies the axiom of separation: for all $f \in C[0, 1]$, $f \ne 0$, there exists a $t \in [0, 1]$ such that $f(t) \ne 0 \iff p_t(f) \ne 0$. It is easily seen that convergence with respect to the topology generated by $F$ means pointwise convergence.

3. It is enough to prove the triangle inequality for $d$ (the other axioms are trivially satisfied). For each $j \in \mathbb{N}$, we have
\[ d_j(f, g) \le d_j(f, h) + d_j(h, g), \quad f, g, h \in C^j(\Omega). \]
This follows from the inequality
\[ \frac{|u - v|}{1 + |u - v|} \le \frac{|u - w|}{1 + |u - w|} + \frac{|w - v|}{1 + |w - v|}, \quad u, v, w \in \mathbb{R}, \]
which is a consequence of
\[ \frac{\alpha + \beta}{1 + \alpha + \beta} \le \frac{\alpha}{1 + \alpha} + \frac{\beta}{1 + \beta}, \quad \alpha, \beta \ge 0. \]
The triangle inequality for $d$ follows similarly.

4. Having in mind the typical example of a test function (see Sect. 5.1) we can provide the following example:
\[ \varphi(x) = \begin{cases} C \exp\Big( \dfrac{1}{(x - 2)^2 - 4} \Big), & 0 < x < 4, \\ 0, & \text{otherwise}, \end{cases} \]
where we choose $C = \exp(1/4)$ to obtain $\sup_{\mathbb{R}} \varphi = 1$.

5. Any function $\varphi = \varphi(t) \in C_0^\infty(\mathbb{R})$ with $\int_{\mathbb{R}} \varphi(t)\,dt = 0$ can be expressed as the derivative of the function $\varphi_1 = \varphi_1(t) := \int_{-\infty}^t \varphi(s)\,ds$, which belongs to $C_0^\infty(\mathbb{R})$. Conversely, if $\varphi$ is the derivative of a function $\varphi_1 \in C_0^\infty(\mathbb{R})$, then $\int_{\mathbb{R}} \varphi(t)\,dt = 0$. This result can easily be used for the case of $k$ variables to derive the conclusion.

6. If $\varphi \in C_0^\infty(\mathbb{R})$ with $\varphi(n) = a_n$, $n \in \mathbb{N}$, then $a_n = 0$ for all sufficiently large $n$ (located outside $\operatorname{supp} \varphi$). Conversely, if $a_n = 0$ $\forall n > n_0$, we can construct the test function $\varphi : \mathbb{R} \to \mathbb{R}$,
\[ \varphi(x) = \begin{cases} a_n \exp\Big( \dfrac{81 (x - n)^2}{9 (x - n)^2 - 1} \Big), & |x - n| < 1/3,\ n = 1, 2, \dots, n_0, \\ 0, & \text{otherwise}, \end{cases} \]
which satisfies the required properties.
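A small Python sketch can confirm that a function of the type constructed in Exercise 6 takes the prescribed values at the integers. It assumes that the exponent reads $81(x - n)^2 / (9(x - n)^2 - 1)$ (this part of the formula is partly garbled in the source), and the values in `a` are hypothetical; `bump_interpolant` is an illustrative name.

```python
import math

def bump_interpolant(a):
    # a = [a_1, ..., a_{n0}]; returns phi with phi(n) = a_n for n = 1..n0 and phi(n) = 0 for n > n0
    def phi(x):
        for n, a_n in enumerate(a, start=1):
            s = x - n
            if abs(s) < 1.0 / 3.0:
                denom = 9.0 * s * s - 1.0          # negative on |s| < 1/3
                return a_n * math.exp(81.0 * s * s / denom)
        return 0.0                                  # outside all the bumps
    return phi

a = [2.0, -1.0, 0.5]            # hypothetical values a_1, a_2, a_3
phi = bump_interpolant(a)
print([phi(n) for n in range(1, 6)])
print(phi(1.1), phi(1.5))
```

At $x = n$ the exponent vanishes, so $\varphi(n) = a_n$ exactly, while the bumps have disjoint supports $[n - 1/3, n + 1/3]$ and $\varphi$ is zero between them.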


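7. Let $r > 0$ be such that $\operatorname{supp} \psi \subset B(0, r)$. Then $\operatorname{supp} \varphi_n \subset B(0, r)$ for all $n \in \mathbb{N}$. We also have for all $n \in \mathbb{N}$ and $C$ a constant
\[ |\varphi_n(x)| \le C\, n^m 2^{-n}, \quad x \in B(0, r) \implies \varphi_n \to 0 \text{ uniformly}. \]
The same fact holds for
\[ D^\alpha \varphi_n(x) = n^{m + |\alpha|} 2^{-n} D^\alpha \psi(nx), \quad x \in \mathbb{R}^k,\ n \in \mathbb{N}, \]
for every multi-index $\alpha = (\alpha_1, \dots, \alpha_k)$.

8. Let $r > 0$ be such that $\operatorname{supp} \psi \subset B(0, r)$. Then
\[ \operatorname{supp} \varphi_n \subset B(0, r + \|h\|) \quad \forall n \in \mathbb{N}. \]
Using Taylor's formula we can write
\[ \Big| \varphi_n(x) - \sum_{j=1}^k h_j \frac{\partial \psi}{\partial x_j}(x) \Big| = O\Big( \frac{1}{n} \Big), \quad \forall x \in \mathbb{R}^k,\ n \in \mathbb{N}, \]
and similar formulas for the $D^\alpha \varphi_n$'s showing that
\[ \varphi_n \to \sum_{j=1}^k h_j \frac{\partial \psi}{\partial x_j} \quad \text{in } D(\mathbb{R}^k). \]
The last claim follows trivially from the previous one with $h$ and $-h$.

9. Obviously, for every $n \in \mathbb{N}$ large enough, $\varphi_n$ is well defined and $\operatorname{supp} \varphi_n \subset K$, where $K$ is a compact subset of $\Omega$. From Proposition 5.4 we know that $\varphi_n \to \varphi$ uniformly. Notice that
\[ \frac{\partial \varphi_n}{\partial x_j}(x) = \int_\Omega \varphi(y) \frac{\partial}{\partial x_j} \omega_{1/n}(x - y)\,dy = -\int_\Omega \varphi(y) \frac{\partial}{\partial y_j} \omega_{1/n}(x - y)\,dy = \int_\Omega \frac{\partial \varphi}{\partial y_j}(y)\, \omega_{1/n}(x - y)\,dy \to \frac{\partial \varphi}{\partial x_j}(x), \quad j = 1, \dots, k, \]
uniformly in $\Omega$ as $n \to \infty$. This result extends to $D^\alpha \varphi_n$ for every multi-index $\alpha$, $D^\alpha \varphi_n \to D^\alpha \varphi$ uniformly in $\Omega$, so $\varphi_n \to \varphi$ in $D(\Omega)$.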


10. Left to the reader.

11. Let $u$ be the regular distribution generated by $\varphi$. Then
\[ u(\varphi) = \int_\Omega \varphi^2\,dx = 0, \]
which implies $\varphi = 0$ as claimed.

12. For any $\varphi \in D(\mathbb{R})$ we have, for some constant $C$,
\[ |\varphi(1/i^2) - \varphi(0)| = |\varphi'(\theta_i)| \cdot \frac{1}{i^2} \le C \frac{1}{i^2}, \quad 0 < \theta_i < 1, \]
which implies that the series defining $u(\varphi)$ is absolutely convergent, i.e., $u$ is well defined. It is also easily seen that $u \in D'(\mathbb{R})$. Now assume by way of contradiction that $u$ is a regular distribution, i.e., there exists $f \in L^1_{loc}(\mathbb{R})$ such that
\[ u(\varphi) = \sum_{i=1}^{\infty} \big[ \varphi(1/i^2) - \varphi(0) \big] = \int_{-\infty}^{+\infty} f(t) \varphi(t)\,dt \quad \forall \varphi \in D(\mathbb{R}). \qquad (∗) \]
Choosing $\varphi$ with support in $\mathbb{R} \setminus \{0, 1, 1/2^2, 1/3^2, \dots\}$ we deduce that $f = 0$ almost everywhere in $\mathbb{R}$ (see Theorem 5.9). So, according to (∗), $u(\varphi) = 0$ for all $\varphi \in D(\mathbb{R})$, which is a contradiction (take, for instance, $\varphi = \omega$).

13. Left to the reader.

14. Left to the reader.

15. We have $f(x) = |x|$, $f' = 2H - 1$, $f'' = 2\delta$. The computation of $g'$, $g''$, $g'''$ is left to the reader.

16. We are intuitively led to consider the usual Friedrichs' approximations of $H$,
\[ F_n(x) = \int_{-\infty}^{\infty} H(y)\, \omega_{1/n}(x - y)\,dy = \int_0^{\infty} \omega_{1/n}(x - y)\,dy, \quad x \in \mathbb{R}. \]


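Obviously, for all $n \in \mathbb{N}$, $F_n$ is in $C^\infty(\mathbb{R})$, but $\operatorname{supp} F_n$ is not compact. So we consider (instead of $F_n$)
\[ H_n(x) = \int_0^n \omega_{1/n}(x - y)\,dy = \int_{x-n}^{x} \omega_{1/n}(t)\,dt, \quad x \in \mathbb{R},\ n \in \mathbb{N}, \]
with $\operatorname{supp} H_n = [-1/n, n + 1/n]$, i.e., $H_n \in C_0^\infty(\mathbb{R})$ for all $n \in \mathbb{N}$. We are going to prove that
\[ H_n(\varphi) \to H(\varphi) \quad \forall \varphi \in D(\mathbb{R}), \]
where $H_n$ and $H$ denote the regular distributions associated with $H_n$ and $H$. Take an arbitrary $\varphi \in D(\mathbb{R})$. Its support satisfies $\operatorname{supp} \varphi \subset [-a, a]$ for some $a > 0$. We have
\[ H_n(\varphi) = \int_{-\infty}^{\infty} \varphi(x) H_n(x)\,dx = \int_{-a}^{a} f_n(x)\,dx, \quad \text{where } f_n(x) := \varphi(x) \int_{x-n}^{x} \omega_{1/n}(t)\,dt. \]
Since
\[ |f_n(x)| \le |\varphi(x)| \int_{\mathbb{R}} \omega_{1/n}(t)\,dt = |\varphi(x)|, \quad x \in [-a, a],\ n \in \mathbb{N}, \]
and
\[ f_n(x) \to \varphi(x) H(x) \quad \forall x \in [-a, a] \setminus \{0\}, \]
we can apply the Lebesgue Dominated Convergence Theorem to derive
\[ \lim_{n\to\infty} H_n(\varphi) = \int_{-a}^{a} \varphi(x) H(x)\,dx = \int_{-\infty}^{\infty} H(x) \varphi(x)\,dx = H(\varphi). \]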


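17. (i) Left to the reader.
(ii) Proceed by way of contradiction. Assuming the existence of $f \in L^1_{loc}(\mathbb{R}^2)$ that generates $u$, we have
\[ \int_{\mathbb{R}^2} f \varphi\,dx = 0 \quad \forall \varphi \in D(\mathbb{R}^2) \]
with $\operatorname{supp} \varphi \subset \mathbb{R}^2 \setminus \{(x_1, 0);\ x_1 \in \mathbb{R}\}$. This implies
\[ f = 0 \text{ a.e. in } \mathbb{R}^2 \implies u = 0, \]
which is a contradiction.
(iii) Left to the reader.

18. For any $\varphi \in D(\Omega)$ the series $\sum_{n=1}^{\infty} a_n \varphi(x_n)$ has a finite number of nonzero terms since $\operatorname{supp} \varphi$ is a compact subset of $\Omega$, hence $\operatorname{supp} \varphi$ contains finitely many points in $S$.

19. Assume by contradiction that $\delta_{x_n} \to 0$ in $D'(\mathbb{R}^k)$ and $\liminf \|x_n\| < \infty$, i.e., there exists a bounded subsequence $(x_{n_m})_{m\in\mathbb{N}}$. Therefore, a subsequence of $(x_{n_m})_{m\in\mathbb{N}}$, again denoted $(x_{n_m})_{m\in\mathbb{N}}$, converges to some $x^* \in \mathbb{R}^k$ as $m \to \infty$. So
\[ \delta_{x_{n_m}}(\varphi) = \varphi(x_{n_m}) \to \varphi(x^*) \quad \text{as } m \to \infty, \]
which implies $\varphi(x^*) = 0$ for all $\varphi \in D(\mathbb{R}^k)$, which is a contradiction.

20. Left to the reader.

21. Left to the reader.

22. Denote $I_n = \big(n\pi, (n + 1)\pi\big)$, $n \in \mathbb{Z}$. If $u$ is a solution of the given equation, then $u$ is a solution of the equation in $D'(I_n)$ for all $n \in \mathbb{Z}$, i.e., for all $\varphi \in C_0^\infty(I_n)$ we have
\[ \langle (\sin t) u', \varphi \rangle = 0 \implies \langle u', (\sin t) \varphi \rangle = 0. \]
So
\[ \forall \psi \in D(I_n) \quad (u', \psi) = 0 \implies u' = 0 \implies u \text{ is constant on } I_n. \]
Hence,
\[ u = \sum_{n \in \mathbb{Z}} c_n \chi_{I_n}, \]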


where the $c_n$'s are real constants and $\chi_{I_n}$ denotes the characteristic function of $I_n$, $n \in \mathbb{Z}$. In fact, this is the general solution of the given equation. Clearly, $\{\chi_{I_n};\ n \in \mathbb{Z}\}$ is an infinite, linearly independent system, hence the claim is confirmed. Notice that an equivalent form for the general solution of the given equation is
\[ u = \sum_{n \in \mathbb{Z}} c_n H(t - n\pi), \]
and $\{H(t - n\pi);\ n \in \mathbb{Z}\}$ is a linearly independent system of solutions.

23. First of all, solve the third equation for $u_1$,
\[ u_1 = u_3' - u_3 - H. \]
Then solve the first equation for $u_2$ and use the above equation to find
\[ u_2 = -u_1' + 4u_1 + H = -(u_3'' - u_3' - \delta) + 4(u_3' - u_3 - H) + H = -u_3'' + 5u_3' - 4u_3 - 3H + \delta. \]
Finally, we obtain from the second equation a third order linear differential equation in $u_3$ which can be solved by the usual method, etc.

24. Recall that $W_0^{1,1}(a, \infty)$ is the closure in $W^{1,1}(a, \infty)$ of $C_0^\infty(a, \infty)$. So, as $u \in W_0^{1,1}(a, \infty)$, there exists a sequence $(u_n)_{n\in\mathbb{N}}$ in $C_0^\infty(a, \infty)$ which converges to $u$ in $W^{1,1}(a, \infty)$. Let $b \in (a, \infty)$ be arbitrary but fixed. We have for all $t \in [a, b]$ and $m, n \in \mathbb{N}$
\[ |u_n(t) - u_m(t)| \le \Big| \int_a^t \big( u_n'(s) - u_m'(s) \big)\,ds \Big| \le \int_a^b |u_n'(s) - u_m'(s)|\,ds \to 0 \quad \text{as } n, m \to \infty. \]
Therefore, $u_n$ converges in $C[a, b]$ to some $v \in C[a, b]$ and $v(a) = 0$. In fact, $v$ is an absolutely continuous representative of $u|_{[a,b]}$ (cf. Theorem 5.35). Since $b$ was arbitrary, $v$ can be extended as a function in $C[a, \infty)$.


25. The embedding of $W^{2,p}(0, 1)$ into $C^1[0, 1]$ is realized by the map (injection) which associates with each element $u \in W^{2,p}(0, 1)$ its representative from $A^{2,p}(0, 1) \subset C^1[0, 1]$, also denoted by $u$ (see Theorem 5.35). Let $(u_n)_{n\in\mathbb{N}}$ be a bounded sequence in $W^{2,p}(0, 1)$. Let us apply the Arzelà–Ascoli criterion to show that $(u_n)$ has a subsequence which is convergent in $C^1[0, 1]$. For $t, s \in [0, 1]$ and $n \in \mathbb{N}$ we have
\[ |u_n'(t)| = \Big| u_n'(s) + \int_s^t u_n''(\tau)\,d\tau \Big| \le |u_n'(s)| + \int_0^1 |u_n''(\tau)|\,d\tau = |u_n'(s)| + \|u_n''\|_{L^1(0,1)} \le |u_n'(s)| + \|u_n''\|_{L^p(0,1)} \quad \text{(by Hölder's inequality)}. \]
By integration over $[0, 1]$ with respect to $s$ we get
\[ |u_n'(t)| \le \|u_n'\|_{L^1(0,1)} + \|u_n''\|_{L^p(0,1)} \le \|u_n'\|_{L^p(0,1)} + \|u_n''\|_{L^p(0,1)} \le C \quad \text{(by assumption)}, \]
where $C$ is some constant. Hence, $(u_n')$ is bounded in $C[0, 1]$. We also have for $t, s \in [0, 1]$ and $n \in \mathbb{N}$,
\[ |u_n'(t) - u_n'(s)| = \Big| \int_s^t u_n''(\tau)\,d\tau \Big| \le \Big| \int_s^t |u_n''(\tau)|\,d\tau \Big| \le |t - s|^{1/q} \|u_n''\|_{L^p(0,1)} \le C |t - s|^{1/q} \quad \Big(\text{by Hölder with } q = \frac{p}{p-1}\Big), \]
where $C$ is a constant, which shows that $(u_n')$ is equicontinuous. According to the Arzelà–Ascoli criterion, $(u_n')$ has a subsequence $(u_{n_k}')_{k\in\mathbb{N}}$ which is convergent in $C[0, 1]$ to some $v \in C[0, 1]$. By repeating the above arguments for $(u_{n_k})_{k\in\mathbb{N}}$ we deduce the existence of a subsequence of $(u_{n_k})_{k\in\mathbb{N}}$ which converges in $C[0, 1]$ and its limit is $u \in C[0, 1]$. Consequently, the original sequence $(u_n)$ has a subsequence which converges in $C^1[0, 1]$.


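26. We have $\operatorname{supp} \varphi \subset [-a, a]$ for some $a > 0$, so $\operatorname{supp} \varphi^{(j)} \subset [-a, a]$ for all $j \in \mathbb{N}$.
(i) Let us first discuss the case $p = \infty$. Since $\varphi$ is not the null function, it is easily seen that for each $j \in \{0, 1, \dots\}$ there exists $t_j \in (-a, a)$ such that
\[ \sup_{[-a,a]} |\varphi^{(j)}| = |\varphi^{(j)}(t_j)| > 0. \]
This implies
\[ \sup_{\mathbb{R}} |u_n^{(j)}| \le |\varphi^{(j)}(t_j)| \quad \forall n \in \mathbb{N},\ j = 0, 1, \dots \]
Therefore $(u_n)$ is bounded in $W^{m,\infty}(\mathbb{R})$ for all $m \in \mathbb{N}$. In the case $1 \le p < \infty$ we have
\[ \int_{-\infty}^{+\infty} |u_n^{(j)}(t)|^p\,dt = \int_{-\infty}^{+\infty} |\varphi^{(j)}(t + n)|^p\,dt = \int_{-a}^{a} |\varphi^{(j)}(t)|^p\,dt \quad \forall n \in \mathbb{N},\ j = 0, 1, \dots, \]
which confirms the claim.

(ii) Clearly, for each $j \in \{0, 1, \dots\}$, $(u_n^{(j)})$ converges pointwise to zero. Let $q = \infty$. Assume by way of contradiction that there exists a subsequence $(u_{n_k})_{k\in\mathbb{N}}$ which converges uniformly to the null function. Let $t_0 \in (-a, a)$ be such that $\varphi(t_0) \ne 0$. Choose $t_k = t_0 - n_k$, $k \in \mathbb{N}$. We have
\[ u_{n_k}(t_k) = \varphi(t_k + n_k) = \varphi(t_0) \ne 0 \quad \forall k \in \mathbb{N}, \]
so $(u_{n_k})_{k\in\mathbb{N}}$ cannot converge uniformly. If $1 \le q < \infty$, we can write
\[ \int_{-\infty}^{+\infty} |u_n(t)|^q\,dt = \int_{-\infty}^{+\infty} |\varphi(t + n)|^q\,dt = \int_{-a}^{a} |\varphi(t)|^q\,dt \ne 0, \]
and thus $(u_{n_k})_{k\in\mathbb{N}}$ cannot converge in $L^q(\mathbb{R})$ (to the null function).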


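27. According to Theorem 5.21, there exist some sequences $(u_n)_{n\in\mathbb{N}}$, $(v_n)_{n\in\mathbb{N}}$ in $C^1(\bar\Omega)$ such that
\[ u_n \to u, \quad v_n \to v \quad \text{in } H^1(\Omega). \]
Obviously, for each $i \in \{1, 2, \dots, k\}$,
\[ \frac{\partial}{\partial x_i}(u_n v_n) = \frac{\partial u_n}{\partial x_i} \cdot v_n + u_n \cdot \frac{\partial v_n}{\partial x_i} \quad \forall n \in \mathbb{N}, \]
hence
\[ -\int_\Omega u_n v_n \frac{\partial \varphi}{\partial x_i} = \int_\Omega \frac{\partial u_n}{\partial x_i} v_n \varphi + \int_\Omega u_n \frac{\partial v_n}{\partial x_i} \varphi, \qquad (∗) \]
for all $\varphi \in D(\Omega)$. We intend to pass to the limit in (∗). Pick an arbitrary $\varphi \in D(\Omega)$. Denoting $C := \sup_\Omega |\partial \varphi / \partial x_i| < \infty$, we can write
\[ \Big| \int_\Omega u_n v_n \frac{\partial \varphi}{\partial x_i} - \int_\Omega u v \frac{\partial \varphi}{\partial x_i} \Big| \le C \int_\Omega |u_n v_n - uv| \le C \Big( \int_\Omega |u_n (v_n - v)| + \int_\Omega |v (u_n - u)| \Big) \]
\[ \le C \big( \|u_n\|_{L^2(\Omega)} \|v_n - v\|_{L^2(\Omega)} + \|v\|_{L^2(\Omega)} \|u_n - u\|_{L^2(\Omega)} \big) \le C^* \big( \|v_n - v\|_{L^2(\Omega)} + \|u_n - u\|_{L^2(\Omega)} \big) \quad \forall n \in \mathbb{N}, \]
where $C^*$ is a constant. So the left-hand side of (∗) converges to $-\int_\Omega uv\,(\partial \varphi / \partial x_i)$ as $n \to \infty$. Similar arguments can be used for the two terms in the right-hand side of (∗). Thus we obtain by passing to the limit in (∗)
\[ -\int_\Omega uv \frac{\partial \varphi}{\partial x_i} = \int_\Omega \frac{\partial u}{\partial x_i} v \varphi + \int_\Omega u \frac{\partial v}{\partial x_i} \varphi, \]
for all $\varphi \in D(\Omega)$ and $i = 1, 2, \dots, k$, i.e.,
\[ \frac{\partial}{\partial x_i}(uv) = \frac{\partial u}{\partial x_i} \cdot v + u \cdot \frac{\partial v}{\partial x_i} \quad \text{in } D'(\Omega),\ i = 1, 2, \dots, k. \]
Of course the above equalities are also satisfied in $L^1(\Omega)$, hence a.e. in $\Omega$.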

12.6 Answers to Exercises for Chap. 6

1. If $p = 2$ the corresponding norm, $\|\cdot\|_{L^2(\Omega)}$, is generated by the usual scalar product
\[ (u, v)_{L^2(\Omega)} = \int_\Omega uv\,dx, \quad u, v \in L^2(\Omega), \]
so $\big(L^2(\Omega), \|\cdot\|_{L^2(\Omega)}\big)$ is a Hilbert space. In order to conclude, it is sufficient to prove that, for $p \in (1, \infty) \setminus \{2\}$, $\|\cdot\|_{L^p(\Omega)}$ does not satisfy the parallelogram law (see Theorem 6.1 (Jordan–von Neumann)). To this end, we choose two disjoint open balls $B_1, B_2 \subset \Omega$ and two $C^\infty$ functions $\varphi_1, \varphi_2$ with $\operatorname{supp} \varphi_i \subset B_i$ and $\|\varphi_i\|_{L^p(B_i)} = 1$, $i = 1, 2$. Obviously, $\varphi_1$ and $\varphi_2$ do not satisfy the parallelogram law.

2. Recall that for all $x, y \in H$ and $\alpha \in \mathbb{K}$,
\[ \|x + \alpha y\|^2 = \|x\|^2 + 2 \operatorname{Re}\big[ \bar\alpha (x, y) \big] + |\alpha|^2 \|y\|^2. \]
Assume that $|(x, y)| = \|x\| \cdot \|y\|$, $y \ne 0$. Choosing in the above identity $\alpha = -(x, y)/\|y\|^2$ we obtain
\[ \|x + \alpha y\|^2 = \|x\|^2 - \frac{|(x, y)|^2}{\|y\|^2} = \|x\|^2 - \frac{\|x\|^2 \|y\|^2}{\|y\|^2} = 0, \]
so $x + \alpha y = 0$. Conversely, if $x, y$ are linearly dependent, it follows easily that $|(x, y)| = \|x\| \cdot \|y\|$.

3. According to the Jordan–von Neumann theorem, it is enough to show that there are functions $u, v \in C[a, b]$ which do not satisfy the parallelogram law. Choose, for example, $u, v \in C[a, b]$ such that $0 \le u \le 1$, $0 \le v \le 1$, $\operatorname{supp} u \subset (a, (a + b)/2)$, $\operatorname{supp} v \subset ((a + b)/2, b)$, $\max u = \max v = 1$.

4. The space $C$ is a finite dimensional subspace of $L^2(0, 1)$ (whose dimension is $n + 1$), hence $C$ is a closed linear subspace. According to Theorem 6.4, for any $u \in L^2(0, 1)$, there exists a unique $p_u \in C$ which minimizes $\|u - p\|_{L^2(0,1)}$ over $C$, namely, $p_u = P_C u$.


5. (i) Observe that $P$ is precisely the projection operator $P_C$, where $C$ is the closed unit ball, so $P$ is nonexpansive (see Sect. 6.3);
(ii) In this case we cannot use the previous argument (which is valid in Hilbert spaces). We distinguish three cases:
(a) $u, v \in C$ is a trivial case;
(b) if $u \in C$, $v \in H \setminus C$, then
\[ \|Pu - Pv\| = \big\| u - \|v\|^{-1} v \big\| \le \|u - v\| + \big\| v - \|v\|^{-1} v \big\| = \|u - v\| + \|v\| - 1 \le \|u - v\| + \|v\| - \|u\| \le 2\|u - v\|; \]
(c) if $u, v \in H \setminus C$, then
\[ \|Pu - Pv\| \le \big\| (1/\|u\|) u - (1/\|u\|) v \big\| + \big\| (1/\|u\|) v - (1/\|v\|) v \big\| \le \|u - v\| + (1/\|u\|) \cdot \big| \|v\| - \|u\| \big| \le 2\|u - v\|. \]
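The estimate in part (ii) can be probed experimentally. The Python sketch below is only an illustration under stated assumptions: it takes $\mathbb{R}^2$ with the $\ell^1$ norm as a concrete non-Hilbert normed space (a choice not made in the text), samples random pairs, and records the largest observed ratio $\|Pu - Pv\| / \|u - v\|$. The names `norm1`, `radial_retraction` and `worst_ratio` are hypothetical.

```python
import random

def norm1(u):
    # l^1 norm on R^2, a norm that is not generated by a scalar product
    return abs(u[0]) + abs(u[1])

def radial_retraction(u):
    # P u = u if ||u|| <= 1, and u / ||u|| otherwise
    r = norm1(u)
    return u if r <= 1.0 else (u[0] / r, u[1] / r)

def worst_ratio(samples, seed=0):
    # largest observed value of ||Pu - Pv|| / ||u - v|| over random pairs
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        u = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        v = (rng.uniform(-3, 3), rng.uniform(-3, 3))
        dist = norm1((u[0] - v[0], u[1] - v[1]))
        if dist == 0.0:
            continue
        Pu, Pv = radial_retraction(u), radial_retraction(v)
        worst = max(worst, norm1((Pu[0] - Pv[0], Pu[1] - Pv[1])) / dist)
    return worst

ratio = worst_ratio(20000)
print(ratio)
```

By the inequalities in cases (a) to (c), the printed ratio can never exceed $2$ (up to rounding); a random search of this kind only gives a lower estimate of the best Lipschitz constant.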

6. The space $M$ is two-dimensional (representing a plane in $\mathbb{R}^3$), so it is closed. Clearly, the vector $v = (2, -1, -3)^T$ is orthogonal to $M$ and $\operatorname{Span}\{v\}$ is the orthogonal complement of $M$, i.e., $M^\perp = \operatorname{Span}\{v\}$. The projection $P_M x$ of the given $x = (1, 2, -1)^T$ satisfies the conditions: $x - P_M x \in M^\perp$ (i.e., $x - P_M x = (2\alpha, -\alpha, -3\alpha)^T$) and $P_M x \in M$. Using these two conditions we can determine $\alpha = 3/14$, so $P_M x = (4/7, 31/14, -5/14)^T$, and $x = P_M x + (x - P_M x)$.

7. Obviously, $M$ is a linear subspace of the Hilbert space $L^2(a, b)$. In fact, $M$ is the nullspace of the linear continuous functional $\varphi : L^2(a, b) \to \mathbb{R}$,
\[ \varphi(u) = \int_a^b u(t)\,dt, \quad u \in L^2(a, b), \]


so $M$ is a closed linear space, with $\operatorname{codim} M = 1$. We have $M^\perp = \operatorname{Span}\{1\}$, i.e., $M^\perp$ is the subspace of all constant functions. It is easily seen that any $u \in L^2(a, b)$ can be written as
\[ u = \Big( u - \frac{1}{b - a} \int_a^b u(t)\,dt \Big) + \frac{1}{b - a} \int_a^b u(t)\,dt. \]
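Returning briefly to Exercise 6, the value $\alpha = 3/14$ and the projection $P_M x$ can be double-checked with exact rational arithmetic. The sketch below assumes, as stated in that solution, that $M$ is the plane orthogonal to $v = (2, -1, -3)^T$, so that $\alpha = (x, v)/(v, v)$; the variable names are illustrative.

```python
from fractions import Fraction

def dot(u, v):
    # Euclidean scalar product in R^3
    return sum(a * b for a, b in zip(u, v))

v = [Fraction(2), Fraction(-1), Fraction(-3)]   # normal vector spanning the orthogonal complement of M
x = [Fraction(1), Fraction(2), Fraction(-1)]    # the given point

alpha = dot(x, v) / dot(v, v)                   # x - P_M x = alpha * v
proj = [xi - alpha * vi for xi, vi in zip(x, v)]

print(alpha, proj, dot(proj, v))
```

The last printed number is $(P_M x, v)$, which must vanish because $P_M x \in M$.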

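8. $M^\perp$ is the subspace of odd functions, i.e.,
\[ M^\perp = \{u \in L^2(-1, 1);\ u(t) = -u(-t) \text{ for a.a. } t \in (-1, 1)\}, \]
and for any $u \in L^2(-1, 1)$ we have the decomposition
\[ u(t) = \frac{u(t) + u(-t)}{2} + \frac{u(t) - u(-t)}{2} \quad \text{for a.a. } t \in (-1, 1). \]

9. Let us first prove that
\[ Y^\perp = (\operatorname{Cl} Y)^\perp. \qquad (∗) \]
Indeed, on the one hand,
\[ Y \subset \operatorname{Cl} Y \implies (\operatorname{Cl} Y)^\perp \subset Y^\perp. \]
The converse inclusion is also true. Indeed, if $x \in Y^\perp$, then $(x, y) = 0$, $\forall y \in Y$, and this can be extended to all $y \in \operatorname{Cl} Y$, so $x \in (\operatorname{Cl} Y)^\perp$. Thus $Y^\perp \subset (\operatorname{Cl} Y)^\perp$, as claimed. Now, taking into account (∗), we can write
\[ (Y^\perp)^\perp = \big( (\operatorname{Cl} Y)^\perp \big)^\perp. \]
In order to conclude, it suffices to show that the right-hand side of the above equation equals $\operatorname{Cl} Y$. In fact, for any closed subspace $Z \subset H$, we have $(Z^\perp)^\perp = Z$. Indeed, $Z \subset (Z^\perp)^\perp$ and the converse inclusion follows easily: if $x \in (Z^\perp)^\perp$, then
\[ x = x_1 + x_2, \quad x_1 \in Z,\ x_2 \in Z^\perp, \]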


and since
\[ 0 = (x, x_2) = (x_1, x_2) + (x_2, x_2) = (x_2, x_2), \]
it follows that $x_2 = 0$, so $x = x_1 \in Z$.

10. The subspace $Y$ is not closed in $H = L^2(0, 1)$. In order to prove this, consider, e.g., the sequence $(u_n)$ in $H$, defined by
\[ u_n(t) = \begin{cases} 0, & 0 < t < \frac{1}{n}, \\ (nt)^{-1/4}, & \frac{1}{n} < t < \frac{1}{2}, \\ -2\beta_n t, & \frac{1}{2} < t < 1, \end{cases} \]
where $\beta_n$ are constants, $n = 3, 4, \dots$ We determine the $\beta_n$'s such that $u_n \in Y$, i.e.,
\[ 0 = \int_0^1 \frac{u_n(t)}{t}\,dt = n^{-1/4} \int_{1/n}^{1/2} t^{-1-1/4}\,dt - 2\beta_n \int_{1/2}^{1} dt. \]
Hence,
\[ \beta_n = 4 n^{-1/4} \big( n^{1/4} - 2^{1/4} \big) \to 4, \quad \text{as } n \to \infty. \]
It is easily seen that $u_n \to u$ in $H$, where
\[ u(t) = \begin{cases} 0, & 0 < t < \frac{1}{2}, \\ -8t, & \frac{1}{2} < t < 1. \end{cases} \]
Clearly,
\[ \int_0^1 \frac{u(t)}{t}\,dt = -4 \ne 0. \]
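The constants $\beta_n$ in Exercise 10 can be checked numerically. The Python sketch below rewrites the formula as $\beta_n = 4\big(1 - (2/n)^{1/4}\big)$ (an algebraically equivalent form) and verifies with a composite Simpson rule that $\int_0^1 u_n(t)/t\,dt$ vanishes for a few small $n$. The helper names and the step count are illustrative assumptions.

```python
def beta(n):
    # beta_n = 4 n^{-1/4} (n^{1/4} - 2^{1/4}) = 4 (1 - (2/n)^{1/4})
    return 4.0 * (1.0 - (2.0 / n) ** 0.25)

def simpson(f, a, b, steps=2000):
    # composite Simpson rule; `steps` must be even
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def constraint_value(n):
    # integral over (0, 1) of u_n(t)/t: u_n is 0 on (0, 1/n), (n t)^(-1/4) on (1/n, 1/2),
    # and -2 beta_n t on (1/2, 1)
    middle = simpson(lambda t: (n * t) ** (-0.25) / t, 1.0 / n, 0.5)
    right = simpson(lambda t: -2.0 * beta(n), 0.5, 1.0)
    return middle + right

for n in (3, 10, 50):
    print(n, beta(n), constraint_value(n))
print(beta(10**12))
```

The constraint values should be zero up to quadrature error, while $\beta_n$ approaches $4$ only slowly, at the rate $(2/n)^{1/4}$.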

11. We know that $H^*$ is a Banach space, so it remains to prove that its norm is generated by a scalar product. Let $x^*, y^*$ be two arbitrary elements of $H^*$. According to the Riesz Representation Theorem, there exist $x, y \in H$ such that
\[ x^*(u) = (u, x), \quad y^*(u) = (u, y) \quad \forall u \in H. \]
Define
\[ (x^*, y^*)_{H^*} = (y, x). \]
It is easy to check that $(\cdot, \cdot)_{H^*}$ is a scalar product in $H^*$ and $\|x^*\|_{H^*} = \sqrt{(x^*, x^*)_{H^*}} = \|x\|$ for all $x^* \in H^*$.


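12. (i) We have
\[ \|v_n\|^2 = \frac{1}{n^2} \Big( \sum_{i=1}^n a_i u_i, \sum_{j=1}^n a_j u_j \Big) = \frac{1}{n^2} \sum_{i=1}^n a_i^2 \|u_i\|^2 = \frac{1}{n^2} \sum_{i=1}^n a_i^2 \le \frac{C^2 n}{n^2} = \frac{C^2}{n} \to 0. \]

(ii) We know from the above computation that $\|\sqrt{n}\, v_n\| \le C$ for all $n \in \mathbb{N}$. Let $x \in H$ be arbitrary but fixed. Since $\{u_n\}_{n=1}^{\infty}$ is a basis in $H$, $x = \sum_{n=1}^{\infty} (x, u_n) u_n$. Denoting $x_N = \sum_{n=1}^{N} (x, u_n) u_n$, we have for $\varepsilon > 0$ small
\[ |(\sqrt{n}\, v_n, x)| \le |(\sqrt{n}\, v_n, x - x_N)| + |(\sqrt{n}\, v_n, x_N)| \le C \|x - x_N\| + \sqrt{n} \cdot |(v_n, x_N)| < \varepsilon + \sqrt{n} \cdot |(v_n, x_N)|, \quad N > N_\varepsilon. \]
This estimate along with
\[ \sqrt{n} \cdot |(v_n, x_N)| = \frac{\sqrt{n}}{n} \Big| \sum_{i=1}^{N} a_i (x, u_i) \Big| \le \frac{C N \|x\|}{\sqrt{n}}, \quad n \ge N, \]
implies
\[ \limsup_{n\to\infty} |(\sqrt{n}\, v_n, x)| < \varepsilon \quad \forall \varepsilon > 0 \implies \lim_{n\to\infty} (\sqrt{n}\, v_n, x) = 0. \]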

13. Assume (i) holds. It is easy to see that $R(A)$ is a closed subspace of $H$, so $H = R(A) \oplus R(A)^\perp$. We also infer from (i) that $A$ is injective, so there exists $A^{-1} : R(A) \to H$ which is continuous. Define $B : H \to H$ by
\[ By = A^{-1} P_{R(A)} y \quad \forall y \in H. \]
Clearly, $B \in L(H)$ and $B \circ A = I$. Conversely, assuming (ii), we have
\[ \|x\| = \|B(Ax)\| \le \|B\| \cdot \|Ax\| \quad \forall x \in H, \]
with $B \ne 0$, which is guaranteed by the relation $B \circ A = I$.


14. (i) It suffices to prove that $N(A) = R(A)^\perp$ (since $(\operatorname{Cl} R(A))^\perp = R(A)^\perp$, see the solution of Exercise 6.9 above). Indeed, if $x \in N(A)$, then
\[ (Av, v + x) \ge 0 \quad \forall v \in H, \]
so replacing $v$ by $tv$, $t \in \mathbb{R}$, we obtain
\[ (Av, x) = 0 \quad \forall v \in H, \]
i.e., $x \in R(A)^\perp$. Conversely, let $x \in R(A)^\perp$. We have
\[ (A(v + x), v + x) \ge 0 \quad \forall v \in H \implies (Av + Ax, v) \ge 0 \quad \forall v \in H. \]
Replacing $v$ by $tv$ we easily derive
\[ (Ax, v) = 0 \quad \forall v \in H \implies Ax = 0. \]

(ii) If $x + tAx = 0$, $t > 0$, then
\[ \|x\|^2 + t(Ax, x) = 0 \implies \|x\|^2 \le 0 \implies x = 0, \]
so $I + tA$ is injective. Let us prove that $I + tA$ is also surjective. For an arbitrary $y \in H$ consider the equation $x + tAx = y$. Apply the Lax–Milgram Theorem with
\[ a(u, v) = (u, v) + t(Au, v), \quad b(v) = (y, v) \]
to deduce the existence of a unique $x \in H$ satisfying
\[ a(x, v) = (y, v) \quad \forall v \in H \implies x + tAx = y. \]
Denote $J_t u = (I + tA)^{-1} u$, $u \in H$, $t > 0$. It is easily seen that $J_t$ is a nonexpansive operator for all $t > 0$: we have just to multiply $J_t u + tAJ_t u = u$ by $J_t u$ and use the positivity of $A$ and the Bunyakovsky–Cauchy–Schwarz inequality. Next, let $u \in H$ be arbitrary but fixed. According to (i), $u = u_1 + u_2$, $u_1 \in N(A)$, $u_2 \in \operatorname{Cl} R(A)$. We have
\[ J_t u_1 = u_1 = P_{N(A)} u \quad \forall t > 0. \qquad (∗) \]
Now, for $y \in R(A)$, i.e., $y = Av$ for some $v \in H$, and $t > 0$, we have $J_t y + A(tJ_t y - v) = 0$. By the positivity of $A$ we get
\[ (J_t y, tJ_t y - v) \le 0 \implies \|J_t y\| \le \frac{\|v\|}{t}. \]
Thus, $\lim_{t\to\infty} J_t y = 0$. This property extends by density to all $y \in \operatorname{Cl} R(A)$ since $J_t$ is a nonexpansive operator. So we can write
\[ \lim_{t\to\infty} J_t u_2 = 0. \qquad (∗∗) \]
From (∗) and (∗∗) we infer that $J_t u = J_t u_1 + J_t u_2 \to P_{N(A)} u$ as $t \to \infty$.

15. Since $(u_n)$ converges weakly to $u$ we have
\[ \|u\| \le \liminf_{n\to\infty} \|u_n\|. \qquad (∗) \]


In order to prove (∗) we can assume $u \ne 0$ (as the case $u = 0$ is trivial). We have
\[ \|u\|^2 = (u, u - u_n) + (u, u_n) \le (u, u - u_n) + \|u\| \cdot \|u_n\|, \]
which yields by passing to the limit
\[ \|u\|^2 \le \|u\| \cdot \liminf_{n\to\infty} \|u_n\|, \]
so (∗) holds true. Summarizing, we have
\[ \|u\| \le \liminf_{n\to\infty} \|u_n\| \le \limsup_{n\to\infty} \|u_n\| \le \|u\|, \]
hence $\|u_n\| \to \|u\|$. Now it is easy to conclude:
\[ \|u_n - u\|^2 = \|u_n\|^2 - 2 \operatorname{Re}(u_n, u) + \|u\|^2 \to 0. \]

16. Apply the Lax–Milgram Theorem (Theorem 6.17) with $H := H_0^1(0, 1)$ (endowed with the $H^1$ norm) and
\[ a(u, v) = \int_0^1 u' v'\,dt + \int_0^1 uv\,dt, \quad b(v) = \int_0^1 f v\,dt. \]
Since $u$ satisfies the equation $-u'' + u = f$ in $D'(0, 1)$, it follows that $u'' \in L^1(0, 1)$, hence $u \in W^{2,1}(0, 1)$ and
\[ -u'' + u = f \text{ a.e. in } (0, 1), \quad u(0) = 0,\ u(1) = 0. \]

17. (i) Assume $u$ is a solution to problem $(P)$. Let $v$ be arbitrary in $H^1(0, 1)$. Multiplication of the differential equation by $v(t)$ and integration over $(0, 1)$ shows that $u$ is a solution of $(\tilde P)$. Now, assume that $u$ is a solution of $(\tilde P)$. Let $v$ in $(\tilde P)$ range over $C_0^\infty(0, 1)$. It follows that $u \in H^1(0, 1)$ satisfies the equation
\[ -u'' + \alpha u = f \quad \text{in } D'(0, 1). \]
Since $\alpha u - f \in L^2(0, 1)$, it follows that in fact $u \in H^2(0, 1)$ and the above equation is satisfied for a.a. $t \in (0, 1)$. Now, testing in


$(\tilde P)$ with functions $v \in C^1[0, 1]$ we readily infer that $u$ satisfies the boundary conditions $u'(0) = 0$, $u'(1) = u(1)$.

(ii) In order to apply Lax–Milgram, consider $H = H^1(0, 1)$ and define $a : H \times H \to \mathbb{R}$ and $b : H \to \mathbb{R}$,
\[ a(u, v) = -u(1) v(1) + \int_0^1 u' v' + \alpha \int_0^1 uv, \quad b(v) = \int_0^1 f v. \]
Obviously, the functional $a$ is bilinear and symmetric. It is also continuous on $H \times H$ (note that $(u, v) \mapsto u(1) v(1)$ is continuous as $H^1(0, 1)$ is compactly embedded in $C[0, 1]$). We need to prove that $a$ is coercive for large $\alpha$. For $u \in H^1(0, 1)$ we deduce from the obvious relation
\[ u(1)^2 = u(t)^2 + 2 \int_t^1 u u' \implies u(1)^2 \le u(t)^2 + 2 \int_0^1 |u| \cdot |u'|, \]
by integration over $[0, 1]$,
\[ u(1)^2 \le \|u\|_{L^2(0,1)}^2 + 2 \|u\|_{L^2(0,1)} \|u'\|_{L^2(0,1)} \le \|u\|_{L^2(0,1)}^2 + \frac{1}{\varepsilon} \|u\|_{L^2(0,1)}^2 + \varepsilon \|u'\|_{L^2(0,1)}^2, \]
where $\varepsilon$ is a positive number. If $\varepsilon \in (0, 1)$, then for all $u \in H^1(0, 1)$,
\[ a(u, u) = -u(1)^2 + \int_0^1 (u')^2 + \alpha \int_0^1 u^2 \ge \Big( \alpha - 1 - \frac{1}{\varepsilon} \Big) \|u\|_{L^2(0,1)}^2 + (1 - \varepsilon) \|u'\|_{L^2(0,1)}^2, \]
which shows that $a$ is coercive for $\alpha$ large enough. It is also clear that $b$ is linear and continuous, so all the conditions required by the Lax–Milgram Theorem are fulfilled.

(iii) Since $a$ is symmetric, $u$ is a minimizer of the functional
\[ v \mapsto \frac{1}{2} a(v, v) - b(v) = \frac{1}{2} \Big[ -v(1)^2 + \int_0^1 (v')^2 + \alpha \int_0^1 v^2 \Big] - \int_0^1 f v, \quad v \in H. \]


18. We are looking for $z := P_Y y$, i.e., $z$ must satisfy two conditions: $z \in Y$ and $(y - z) \perp Y$. Note that $Y$ itself is a Hilbert space, with the same scalar product and norm, having an orthonormal basis $\{u_n\}_{n=1}^\infty$, so the vector $z \in Y$ has the Fourier expansion $z = \sum_{n=1}^\infty (z, u_n)u_n$ (cf. Theorem 6.21 and Remark 6.22). The second condition is equivalent to $(y - z, u_n) = 0\ \forall n \in \mathbb{N} \implies (z, u_n) = (y, u_n)\ \forall n \in \mathbb{N}$. Therefore, $z = \sum_{n=1}^\infty (z, u_n)u_n = \sum_{n=1}^\infty (y, u_n)u_n$.

19. According to Theorem 6.23, there exists a countable orthonormal basis of $H$, say $\{u_n\}_{n=1}^\infty$. We also know that $u_n \to 0$ weakly. On the other hand, if $\|x\| = 1$, then the constant sequence $x_n = x$, $n = 1, 2, \dots$, satisfies the required properties. So we can assume $0 < \|x\| < 1$. Intuitively, we consider the sequence $x_n = \alpha_n u_n + x$, $n = 1, 2, \dots$, where the $\alpha_n$'s are real numbers determined from the required condition

\[ \|x_n\|^2 = 1 \iff \|\alpha_n u_n + x\|^2 = \alpha_n^2 + 2\alpha_n\,\mathrm{Re}\,(u_n, x) + \|x\|^2 = 1, \quad n = 1, 2, \dots \]

Choose

\[ \alpha_n = -\mathrm{Re}\,(u_n, x) + \sqrt{|\mathrm{Re}\,(u_n, x)|^2 + 1 - \|x\|^2} \longrightarrow \sqrt{1 - \|x\|^2} \quad \text{as } n \to \infty. \]

It follows that $(x_n, v) = \alpha_n(u_n, v) + (x, v) \to (x, v)\ \forall v \in H$.

20. Left to the reader.
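The construction in the answer to Exercise 19 can be illustrated in a finite truncation. The sketch below is only an illustration under stated assumptions: the space is replaced by real vectors of length 200, the orthonormal system by the standard basis, and the helper name `unit_norm_perturbation` and the sample vectors are hypothetical. It checks that $x_n = \alpha_n u_n + x$ has norm 1 and prints how the pairing with a fixed vector drifts back to $(x, v)$ as $n$ grows.

```python
import math

def unit_norm_perturbation(x, n):
    """Return (alpha_n, x_n) with x_n = alpha_n * e_n + x, where e_n is the
    n-th standard basis vector (0-based) and x is a real vector of norm < 1."""
    norm_sq = sum(c * c for c in x)
    inner = x[n]  # (e_n, x) for the standard basis vector e_n
    alpha = -inner + math.sqrt(inner * inner + 1.0 - norm_sq)
    x_n = list(x)
    x_n[n] += alpha
    return alpha, x_n

# Hypothetical sample data: a vector of norm < 1 and a fixed test vector v.
x = [0.3 / (k + 1) for k in range(200)]
v = [1.0 / (k + 1) for k in range(200)]

for n in (0, 10, 100, 199):
    alpha, x_n = unit_norm_perturbation(x, n)
    norm = math.sqrt(sum(c * c for c in x_n))
    # (x_n, v) - (x, v) equals alpha_n * v_n, which shrinks as n grows.
    gap = sum(a * b for a, b in zip(x_n, v)) - sum(a * b for a, b in zip(x, v))
    print(n, norm, gap)
```

A finite truncation cannot exhibit weak convergence itself; it only shows the two ingredients of the argument, the unit norm and the vanishing of $\alpha_n (u_n, v)$ for a fixed $v$.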

12.7 Answers to Exercises for Chap. 7

1. Left to the reader.

2. It suffices to show that $N(A^*) = (\mathrm{Cl}\,R(A))^\perp$, where $(\mathrm{Cl}\,R(A))^\perp = \{y^* \in Y^*;\ y^*(y) = 0\ \forall y \in \mathrm{Cl}\,R(A)\}$.

3. We need to prove that $D(A^*) \subset D(A)$. Take an arbitrary $y \in D(A^*)$. Since $R(A) = H$, there exists an $x \in D(A)$ such that $A^*y = Ax$. It is sufficient to prove that $y = x$. So, for any $w \in D(A)$ we have

\[ (Aw, y) = (w, A^*y) = (w, Ax) = (Aw, x), \]

hence, as $R(A) = H$, $(u, y) = (u, x)\ \forall u \in H \implies y = x$.

4. Left to the reader.

5. We have

\[ \|A^*A\| \le \|A^*\| \cdot \|A\| = \|A\|^2. \qquad (1) \]

On the other hand, for all $x \in H$,

\[ \|Ax\|^2 = (Ax, Ax) = (x, A^*Ax) \le \|x\| \cdot \|A^*Ax\| \le \|A^*A\| \cdot \|x\|^2, \]

which implies

\[ \|A\|^2 \le \|A^*A\|. \qquad (2) \]

The claim follows from (1) and (2).

6. If $A$ is symmetric, then $(Ax, x) = (x, A^*x) = (x, Ax) = \overline{(Ax, x)} \implies (Ax, x) \in \mathbb{R}\ \forall x \in H$. Conversely, suppose that $(Ax, x) \in \mathbb{R}\ \forall x \in H$. We have

\[ (Ax, y) = \frac{1}{4}\Big[\big(A(x + y), x + y\big) - \big(A(x - y), x - y\big) + i\big(A(x + iy), x + iy\big) - i\big(A(x - iy), x - iy\big)\Big]. \]

Next,

\[
\begin{aligned}
(x, Ay) &= \overline{(Ay, x)} \\
&= \frac{1}{4}\Big[\big(A(y + x), y + x\big) - \big(A(y - x), y - x\big) - i\big(A(y + ix), y + ix\big) + i\big(A(y - ix), y - ix\big)\Big] \\
&= \frac{1}{4}\Big[\big(A(x + y), x + y\big) - \big(A(x - y), x - y\big) + i\big(A(x + iy), x + iy\big) - i\big(A(x - iy), x - iy\big)\Big] \\
&= (Ax, y),
\end{aligned}
\]

hence $A = A^*$.
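The polarization identity used in the answer to Exercise 6 can be spot-checked numerically. The following is a minimal sketch, assuming the scalar product is linear in the first argument and conjugate-linear in the second, with $H = \mathbb{C}^2$ and an arbitrary sample matrix; the helper names (`matvec`, `inner`, `quad`, `polarization`) and the data are hypothetical.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def inner(x, y):
    """Scalar product, linear in x and conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def quad(A, z):
    """Quadratic form (Az, z)."""
    return inner(matvec(A, z), z)

def polarization(A, x, y):
    """Right-hand side of the polarization identity for (Ax, y)."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    plus_i = [a + 1j * b for a, b in zip(x, y)]
    minus_i = [a - 1j * b for a, b in zip(x, y)]
    return 0.25 * (quad(A, plus) - quad(A, minus)
                   + 1j * quad(A, plus_i) - 1j * quad(A, minus_i))

# Hypothetical sample data (A need not be symmetric for the identity itself).
A = [[1 + 2j, 3 - 1j], [0.5j, -2 + 0j]]
x = [1 + 1j, 2 - 3j]
y = [-1j, 0.5 + 2j]
print(abs(polarization(A, x, y) - inner(matvec(A, x), y)))
```

The identity holds for every linear $A$; symmetry enters only in the second step of the argument, where the quadratic forms are assumed real.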


7. We have for all $u \in H$

\[ (Tu, u) = \|u\|^2 + a\|Au\|^2, \qquad (*) \]

where $\|\cdot\|$ is the norm induced by $(\cdot, \cdot)$. In particular, $(*)$ shows that $N(T) = \{0\}$, so $T$ is injective. In order to prove that $T$ is onto, one can apply the Lax–Milgram Theorem to $a(u, v) = b(v)$, $v \in H$, where $a(u, v) = (Tu, v) = (u, v) + a(Au, Av)$, $b(v) = (f, v)$, $f \in H$. So $T$ is bijective, hence invertible, and obviously $T^{-1}$ is linear. According to $(*)$, $\|Tu\| \ge \|u\|$ for all $u \in H$ and thus $T^{-1} \in L(H)$.

8. (a) Left to the reader. (b) Using the fact that $A$ is symmetric, we obtain

\[ \|Tx\|^2 = (Ax + ix, Ax + ix) = \|Ax\|^2 + \|x\|^2 \quad \forall x \in H. \qquad (*) \]

This shows that $N(T) = \{0\}$, hence $T$ is injective. Next, it follows from $(*)$ that

\[ \|Tx\| \ge \|x\| \quad \forall x \in H. \qquad (**) \]

This implies that $R(T)$ is closed in $H$. Indeed, if $y_n = Tx_n$ converges to some $y \in H$, then $\|x_n - x_m\| \le \|y_n - y_m\|\ \forall m, n$, which shows that $x_n$ converges to some $x \in H$. Hence, taking into account the continuity of $T$,

\[ y = \lim_{n \to \infty} Tx_n = Tx \in R(T). \]

Thus, $H = R(T) \oplus R(T)^\perp$. Let us show that $R(T)^\perp = \{0\}$. Indeed, if $z \in R(T)^\perp$, we have

\[ 0 = (Tx, z) = (x, T^*z) = (x, Az - iz) = (x, Tz - 2iz) \quad \forall x \in H. \]

It follows that $Tz = 2iz$, hence $z \in R(T) \cap R(T)^\perp \implies z = 0$. Therefore $R(T)^\perp = \{0\}$, i.e., $R(T) = H$, so $T$ is bijective. So $T$ is invertible, $T^{-1}$ is a linear operator, and, according to $(**)$, $T^{-1} \in L(H)$.

9. It is readily seen that for all $A \in L(H)$

\[ P(A)^* = \bar{a}_0 I + \bar{a}_1 A^* + \bar{a}_2 (A^*)^2 + \cdots + \bar{a}_n (A^*)^n, \]

where the coefficients are the complex conjugates of the coefficients of $P(A)$.

(j) follows immediately from this identity;

(jj) Assume that $A^*A = AA^*$. Then

\[ P(A)^*P(A) = \sum_{i,j=0}^n \bar{a}_i a_j (A^*)^i A^j = \sum_{i,j=0}^n a_j \bar{a}_i A^j (A^*)^i = P(A)P(A)^*. \]

10. Left to the reader.

11. Left to the reader.

12. One can assume $x \ne 0$. If $Ax = x$, then

\[ \|x\|^2 = (Ax, x) = (x, A^*x) \le \|x\| \cdot \|A^*x\| \le \|A^*\| \cdot \|x\|^2 = \|A\| \cdot \|x\|^2 \le \|x\|^2, \]

so we have equalities everywhere and in particular $(x, A^*x) = \|x\| \cdot \|A^*x\|$. From Exercise 6.2, we infer that $x$ and $A^*x$ are linearly dependent, i.e., there exists a scalar $\alpha \ne 0$ such that $A^*x = \alpha x$. Using the equality $\|x\|^2 = (x, A^*x)$ we see that $\alpha = 1$, hence $A^*x = x$. Conversely, let us assume that $A^*x = x$. Since $\|A^*\| = \|A\| \le 1$, we infer by the previous argument that $(A^*)^*x = x \implies Ax = x$.


13. (a) We know that $C_0^\infty(0, 1)$ is dense in $H$ (see Theorem 5.8). Since $C_0^\infty(0, 1) \subset D(A)$, we infer that $D(A)$ is dense in $H$. In order to prove that $A$ is closed, let $(u_n)$ be a sequence in $D(A)$ such that $u_n \to u$ and $Au_n = u_n' \to v$ in $H$. Applying the Arzelà–Ascoli criterion we infer that $u_n \to u$ in $C[0, 1]$ and, in particular, $u(0) = 0$. Then $u_n' \to u'$ in $\mathcal{D}'(0, 1)$ and in $H = L^2(0, 1)$. Therefore, $u \in D(A)$ and $v = u'$.

(b) It is easy to see that $N(A) = \{0\}$ (hence $A$ is injective), and $R(A) = H$.

(c) If $v \in D(A^*)$, then the linear functional $f(u) := (Au, v) = \int_0^1 vu'$ is continuous on $D(A)$ with respect to the norm $\|\cdot\|$ of $H = L^2(0, 1)$. Since $\mathrm{Cl}\,D(A) = H$, the functional $f$ can be extended (by the Hahn–Banach Theorem or by continuity) to the whole of $H$. This extension is again denoted by $f$. So, according to the Riesz Representation Theorem, there exists $w \in H$ such that

\[ f(u) = (u, w) = \int_0^1 wu \quad \forall u \in H. \]

Interpreting $v$ and $w$ as distributions from $\mathcal{D}'(0, 1)$, we have for all $\varphi \in C_0^\infty(0, 1)$

\[ v'(\varphi) = -v(\varphi') = -\int_0^1 v\varphi' = -f(\varphi) = -w(\varphi), \]

hence $v' = -w \in H \implies v \in H^1(0, 1)$. By the continuity of $f$ on $D(A)$, there exists a constant $k > 0$ such that

\[ \Big|\int_0^1 vu'\Big| = \Big|u(1)v(1) - \int_0^1 uv'\Big| \le k\|u\| \quad \forall u \in D(A). \]

If, in addition, $u(1) = 1$, then we obtain $|v(1)| \le k\|u\| + \|v'\| \cdot \|u\|$. If we choose in this inequality $u = u_n$, where $(u_n)$ is a sequence in $D(A)$, with $u_n(1) = 1$ for all $n$, and such that $u_n \to 0$ in $H$, we find $v(1) = 0$ by letting $n \to \infty$. It follows that $D(A^*) \subset \{v \in H^1(0, 1);\ v(1) = 0\}$. In fact, $D(A^*) = \{v \in H^1(0, 1);\ v(1) = 0\}$ and $A^*v = -v'$. Clearly, $\mathrm{Cl}\,D(A^*) = H$.

14. Obviously, $D(A)$ is dense in $H$ in both cases, hence it makes sense to define $A^*$. By arguments similar to those used for the previous exercise we obtain:

(a) $N(A) = \{0\}$, $R(A) = \{g \in H;\ \int_0^1 g = 0\}$, $D(A^*) = H^1(0, 1)$, $A^*v = -v'$, $N(A^*) = \mathrm{Span}\{1\}$ (constant functions), $R(A^*) = H$.

(b) We distinguish two cases. If $\alpha \ne 1$, then $N(A) = \{0\}$, $R(A) = H$, $D(A^*) = \{v \in H^1(0, 1);\ v(1) = \alpha v(0)\}$, $A^*v = -v'$, $N(A^*) = \{0\}$, $R(A^*) = H$. If $\alpha = 1$, then $N(A) = \mathrm{Span}\{1\}$, $R(A) = \{g \in H;\ \int_0^1 g = 0\}$, $D(A^*) = \{v \in H^1(0, 1);\ v(1) = \alpha v(0)\}$, $A^*v = -v'$, $N(A^*) = \mathrm{Span}\{1\}$, $R(A^*) = \{g \in H;\ \int_0^1 g = 0\}$.

15. Obviously, in each of the above cases, $C_0^\infty(0, 1) \subset D(A)$, so $D(A)$ is dense in $H$, hence $A^*$ can be defined. Also, in each of the four cases, if $v \in D(A^*)$ then $f(u) = (Au, v)$ satisfies (for some constant $C$) $|f(u)| \le C\|u\|\ \forall u \in D(A)$. Since $\mathrm{Cl}\,D(A) = H$, the functional $f$ can be extended (by the Hahn–Banach Theorem or by continuity) to a functional from $H^*$, which is again denoted $f$. According to the Riesz Representation Theorem, there exists $w \in H$ such that

\[ f(u) = (u, w) = \int_0^1 uw \quad \forall u \in H. \]

On the other hand, interpreting $v$ and $w$ as elements of $\mathcal{D}'(0, 1)$, we can write for all $\varphi \in C_0^\infty(0, 1)$

\[ v''(\varphi) = v(\varphi'') = f(\varphi) = (A\varphi, v) = w(\varphi). \]


Hence $v'' = w \in H$. Of course, $v'$ (as a primitive of $v''$) is absolutely continuous in $[0, 1]$, so $v \in H^2(0, 1)$ and $A^*v = v''$. Now, all we have to do is to determine $D(A^*)$. Using the same idea as above (see the solution to Exercise 7.13), we find (a) $D(A^*) = D(A)$, hence $A = A^*$; (b) $D(A^*) = H^2(0, 1)$, hence $A$ is symmetric; (c) $D(A^*) = D(A)$, hence $A = A^*$; (d) D(A∗ ) = {u ∈ H 2 (0, 1); v(0) = v(1) = 0, v (0) = v (0) = v (1)}.

16. (a) Obviously, $A$ is linear and

\[ \|Ax\|^2 = \sum_{j=1}^\infty |x_{p+j}|^2 \le \|x\|^2 \quad \forall x = (x_n) \in H, \]

which shows that $A$ is continuous and $\|A\| \le 1$. In fact, $\|A\| = 1$ since for $\tilde{x} = (0, 0, \dots, 0, 1, 0, 0, \dots)$, where 1 is placed in position $p + 1$, we have $\|\tilde{x}\| = 1$ and $\|A\tilde{x}\| = 1$. In order to find $A^*$ observe that

\[ \langle Ax, y \rangle = \sum_{m=1}^\infty x_{p+m}\bar{y}_m = \sum_{n=p+1}^\infty x_n\bar{y}_{n-p} = \langle x, A^*y \rangle \quad \forall x, y \in H, \]

hence $A^*y = (0, 0, \dots, 0, y_1, y_2, \dots)\ \forall y = (y_n) \in H$, where the zeroes occupy the first $p$ positions.

(b) Clearly, $D(B) = H$, $B$ is linear, and

\[ \|Bx\|^2 = \sum_{n=1}^\infty \frac{n^{2\alpha}}{(1 + n)^2}|x_n|^2 \le \|x\|^2 \quad \forall x = (x_n) \in H, \]

hence $B \in L(H)$ with $\|B\| \le 1$. It is easily seen that the sup in the definition of $\|B\|$ is reached, so $\|B\| = 1$;


(c) Obviously, $D(B) = \{x = (x_n) \in H;\ \sum_{n=1}^\infty \frac{n^{2\alpha}}{(1 + n)^2}|x_n|^2 < \infty\}$,

hence D(B) is a proper subset of H, which is dense in H: indeed, for any x = (xn ) ∈ H and ε > 0 there exists xk = (x1 , x2 , . . . , xk ,

(k + 1)α (k + 2)α xk+1 , xk+2 , . . . ), k+2 k+3

k large enough, such that xk − x < ε. (d) It is easy to see that

$B^*x = \Big(\frac{n^\alpha(-i)^n}{n + 1}x_n\Big)\ \forall x = (x_n) \in D(B^*) = D(B)$. (e) $A$ is not normal, but $B$ is normal (easy to check).
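The adjoint relation $\langle Ax, y \rangle = \langle x, A^*y \rangle$ from Exercise 16(a) can be illustrated on finitely supported sequences. The sketch below is only an illustration: sequences are stored as Python lists whose missing tail is understood to be zero, and the function names and sample data are hypothetical.

```python
def inner(x, y):
    """l^2 pairing of finitely supported sequences.
    zip stops at the shorter list, which is correct because the
    missing entries of the shorter sequence are zero."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def shift_left(x, p):
    """(Ax)_j = x_{p+j}: drop the first p entries."""
    return list(x[p:])

def shift_right(y, p):
    """A*y = (0, ..., 0, y_1, y_2, ...) with p leading zeros."""
    return [0.0] * p + list(y)

# Hypothetical sample data.
p = 3
x = [1 + 2j, -1j, 0.5, 2.0, -3 + 1j, 4j, 1.0]
y = [2 - 1j, 1j, -1.0, 0.25]

lhs = inner(shift_left(x, p), y)
rhs = inner(x, shift_right(y, p))
print(abs(lhs - rhs))
```

The same lists also show why $A$ is not normal: `shift_left(shift_right(y, p), p)` returns `y`, whereas `shift_right(shift_left(x, p), p)` zeroes the first $p$ entries of `x`.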

12.8 Answers to Exercises for Chap. 8

1. (a) $N(A) = \mathrm{Span}\{1\}$ and $R(A) = \mathrm{Span}\{x, x^2, x^3\}$; (b) By simple computations we find the following eigenvalues and corresponding sets of eigenvectors: $\lambda = 0$, $\mathrm{Span}\{1\} \setminus \{0\}$, and $\lambda = i$, $\mathrm{Span}\{x^i\} \setminus \{0\}$, $i = 1, 2, 3$.

2. (i) Left to the reader;

(ii) If $a = 0$, then there is one eigenvalue of $A$, $\lambda = b$, and any $u \in X \setminus \{0\}$ is an eigenfunction corresponding to $\lambda = b$. If $a \ne 0$, then it is easily seen that $A$ has no eigenvalue.

3. Assume $\lambda$ is an eigenvalue of $AB$, i.e., there exists an $x \in X$, $x \ne 0$, such that $A(Bx) = \lambda x$. Note that, of necessity, $Bx \ne 0$ (for $\lambda \ne 0$, since otherwise $\lambda x = A(Bx) = 0$). It follows that $B(A(Bx)) = \lambda Bx \Rightarrow (BA)(Bx) = \lambda Bx$, i.e., $\lambda$ is also an eigenvalue of $BA$ ($Bx$ being a corresponding eigenvector). The converse implication is similar.
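The argument in the answer to Exercise 3 can be traced on a small example. This is a minimal sketch with hypothetical $2 \times 2$ real matrices chosen for illustration: it computes an eigenpair $(\lambda, x)$ of $AB$ from the characteristic polynomial and then checks that $Bx$ is an eigenvector of $BA$ for the same $\lambda$.

```python
import math

def matmul(P, Q):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec(P, x):
    """Matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in P]

# Hypothetical sample matrices.
A = [[2.0, 1.0], [0.0, 1.0]]
B = [[1.0, 0.0], [1.0, 3.0]]
AB = matmul(A, B)
BA = matmul(B, A)

# Larger root of the characteristic polynomial of AB (real for this example).
tr = AB[0][0] + AB[1][1]
det = AB[0][0] * AB[1][1] - AB[0][1] * AB[1][0]
lam = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0

# Eigenvector of AB for lam; this formula is valid because AB[0][1] != 0 here.
x = [AB[0][1], lam - AB[0][0]]
Bx = matvec(B, x)

# Residual of (BA)(Bx) = lam * Bx, expected to be at rounding level.
residual = [u - lam * w for u, w in zip(matvec(BA, Bx), Bx)]
print(lam, Bx, residual)
```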


4. (a) Clearly, $A$ maps $X$ into itself and is a linear operator. Denote $K = \sup\{|k(t, s)|;\ (t, s) \in [0, 1] \times [0, 1]\} < \infty$ (since $k$ is continuous on the compact set $[0, 1] \times [0, 1]$). We have

\[ |(Au)(t)| \le K\int_0^1 |u(s)|\,ds, \quad t \in [0, 1], \]

which implies $\|Au\|_X \le K\|u\|_X\ \forall u \in X$. Thus $A \in L(X)$, as claimed.

(b) Consider in $X$ the equation $Au = \lambda u$, $\lambda \in \mathbb{R}$. If $\lambda = 0$, this equation becomes $Au = 0$, which implies (by differentiation)

\[ k(t, t)u(t) + \int_0^t \frac{\partial k}{\partial t}(t, s)u(s)\,ds = 0, \quad t \in [0, 1], \]

hence

\[ u(t) = -\frac{1}{k(t, t)}\int_0^t \frac{\partial k}{\partial t}(t, s)u(s)\,ds, \quad t \in [0, 1]. \]

This implies

\[ |u(t)| \le K_1\int_0^t |u(s)|\,ds, \quad t \in [0, 1] \implies u \equiv 0, \]

so $\lambda = 0$ is not an eigenvalue of $A$. If $\lambda \ne 0$, the equation $Au = \lambda u$ reads

\[ \int_0^t k(t, s)u(s)\,ds = \lambda u(t), \quad t \in [0, 1], \]

which leads to

\[ |u(t)| \le K_2\int_0^t |u(s)|\,ds, \quad t \in [0, 1], \]

and thus we have again $u \equiv 0$. The case $X = L^2(0, 1)$ is similar.


5. (a) Obviously, $A$ is linear, maps $H$ into itself, and $A \in L(H)$. It is easily seen that $\|A\| = \sup_{n \in \mathbb{N}} |\lambda_n|$; (b) Easy to prove; (c) The set of eigenvalues consists of all distinct $\lambda_n$'s.

6. (a) Left to the reader; (b) Apply the Arzelà–Ascoli criterion; (c) Let $u, v$ be arbitrary elements of $H$. Taking into account the obvious equalities

\[ tv(t) = \frac{d}{dt}\Big(\int_0^t sv(s)\,ds\Big), \qquad v(t) = -\frac{d}{dt}\Big(\int_t^1 v(s)\,ds\Big) \]

and integrating by parts, we obtain

\[ (Au, v) = \int_0^1 tv(t)\Big(\int_t^1 u(s)\,ds\Big)dt + \int_0^1 v(t)\Big(\int_0^t su(s)\,ds\Big)dt = (u, Av). \]

(d) Consider in $H$ the equation $Au = \lambda u$, $\lambda \in \mathbb{R}$. Let us first examine the case $\lambda = 0$, i.e.,

\[ t\int_t^1 u(s)\,ds + \int_0^t su(s)\,ds = 0, \quad t \in [0, 1]. \]

By differentiation we obtain

\[ \int_t^1 u(s)\,ds = 0, \quad t \in [0, 1] \implies u \equiv 0, \]

hence $\lambda = 0$ is not an eigenvalue of $A$, and $N(A) = \{0\}$. Now we are looking for nonzero eigenvalues of $A$. The equation $Au = \lambda u$ reads

\[ t\int_t^1 u(s)\,ds + \int_0^t su(s)\,ds = \lambda u(t), \quad t \in [0, 1]. \]

By differentiating this equation twice we find that $u$ satisfies the equivalent problem

\[ \lambda u''(t) + u(t) = 0, \quad 0 \le t \le 1; \qquad u(0) = 0,\ u'(1) = 0. \]

Multiplication by $u(t)$ of this equation, followed by integration over $[0, 1]$, shows that $\lambda > 0$. Solving the above problem we find

\[ \lambda_n = \frac{1}{(1/2 + n)^2\pi^2}, \qquad u_n(t) = c_n\sin\big((n + 1/2)\pi t\big), \quad n = 0, 1, 2, \dots \]

We determine the constants $c_n$ by imposing $\|u_n\| = 1$, so $u_n(t) = \sqrt{2}\sin[(1/2 + n)\pi t]$, $n = 0, 1, 2, \dots$ By Theorem 8.7 (Hilbert–Schmidt) we conclude that $B = \{\sqrt{2}\sin((1/2 + n)\pi t)\}_{n=0}^\infty$ is an orthonormal basis of $H$.

7. Assume that $Ax = \lambda x$ for some scalar $\lambda$. Then $\|Ax\| = |\lambda| \cdot \|x\|$ and so $|(Ax, x)| = |\lambda| \cdot \|x\|^2 = \|Ax\| \cdot \|x\|$. Conversely, let us assume that $|(Ax, x)| = \|Ax\| \cdot \|x\|$. For an arbitrary $\lambda$ we have

\[ \|Ax - \lambda x\|^2 = \|Ax\|^2 - 2\,\mathrm{Re}\big(\bar{\lambda} \cdot (Ax, x)\big) + |\lambda|^2\|x\|^2, \]

which (according to our assumption) equals zero for $\lambda = (Ax, x)/\|x\|^2$.

8. (a) Denote $e_1 = \|u\|^{-1}u$, $e_2 = \|v\|^{-1}v$, so $\{e_1, e_2\}$ is an orthonormal system. For all $x \in H$ we have

\[ 0 \le \|(x, e_1)e_1 + (x, e_2)e_2 - x\|^2 = -|(x, e_1)|^2 - |(x, e_2)|^2 + \|x\|^2, \]

which gives a particular case of the so-called Bessel inequality, i.e., $|(x, e_1)|^2 + |(x, e_2)|^2 \le \|x\|^2$. Now, for all $x \in H$

\[ \|Ax\|^2 = \|(x, v)u + (x, u)v\|^2 = \|u\|^2\|v\|^2\Big[|(x, \|u\|^{-1}u)|^2 + |(x, \|v\|^{-1}v)|^2\Big] \le \|u\|^2\|v\|^2\|x\|^2, \]

which follows from the above Bessel inequality. Hence $\|A\| \le \|u\| \cdot \|v\|$. In fact, $\|A\| = \|u\| \cdot \|v\|$, since $\|A(\|u\|^{-1}u)\| = \|(\|u\|^{-1}u, u)v\| = \|u\| \cdot \|v\|$; (b) Easy to check; (c) Apply (a) with $H = L^2(-\pi, \pi)$, $u = \cos t$, $v = \sin t$ to find $\|A\| = \pi$; (d) We first observe that any eigenvalue of $A$ is a real number (since $A$ is symmetric). Denote $Y = \mathrm{Span}\{u, v\}$. Let us determine the nullspace $N(A)$. The equation $Ax = 0$ reads

\[ (x, v)u + (x, u)v = 0 \implies |(x, v)|^2\|u\|^2 + |(x, u)|^2\|v\|^2 = 0, \]

which implies $x \perp Y$. Therefore, $N(A) = Y^\perp$. Note that $A(Y) = Y$. In what follows we distinguish two cases: (i) $\dim H > 2$. In this case, $N(A) = Y^\perp \ne \{0\}$, and $\lambda = 0$ is an eigenvalue of $A$, the corresponding eigenvectors being all nonzero vectors from $Y^\perp$. Next, consider the equation $Ax = \lambda x$, $\lambda \in \mathbb{R} \setminus \{0\}$, $x \in Y \setminus \{0\}$. By elementary computations we find two eigenvalues: $\lambda = \pm\|u\| \cdot \|v\|$, the corresponding eigenvectors being the nonzero multiples of $\|u\|^{-1}u \pm \|v\|^{-1}v$; (ii) $H = Y$. In this case, $N(A) = \{0\}$ so $\lambda = 0$ is no longer an eigenvalue of $A$. As before, we find $\lambda = \pm\|u\| \cdot \|v\|$, the corresponding eigenvectors being the nonzero multiples of $\|u\|^{-1}u \pm \|v\|^{-1}v$.

9. (a) Clearly $A$ is a linear operator. For all $x \in H$ we have

\[ \|Ax\|^2 = \sum_{i=1}^m |c_i|^2|(x, e_i)|^2 \le \Big(\max_{1 \le i \le m}|c_i|\Big)^2\sum_{i=1}^m |(x, e_i)|^2 \le \Big(\max_{1 \le i \le m}|c_i|\Big)^2\|x\|^2, \]

where we have used the Bessel inequality (see the solution of the previous exercise where the Bessel inequality is derived for $m = 2$). Hence $A \in L(H)$ and $\|A\| \le \beta := \max_{1 \le i \le m}|c_i|$. In fact, $\|A\| = \beta$, for if the maximum $\beta$ is achieved for $i = i_0$, i.e., $\beta = |c_{i_0}|$, then observe that $\|Ae_{i_0}\| = |c_{i_0}|$, which confirms our claim. It is readily seen that $R(A) = H_m := \mathrm{Span}\{e_1, \dots, e_m\}$ and $N(A) = H_m^\perp$; (b) Easy to check; (c) If $\dim H > m$, then $N(A) \ne \{0\}$, so $\lambda = 0$ is an eigenvalue of $A$, the corresponding eigenvectors being the nonzero vectors from $N(A) = H_m^\perp$. The other eigenvalues are determined from the equation $Ax = \lambda x$, $\lambda \in \mathbb{K} \setminus \{0\}$, $x = \sum_{i=1}^m \alpha_i e_i \in H_m \setminus \{0\}$, i.e., from the algebraic system $(\lambda - c_i)\alpha_i = 0$, $i = 1, \dots, m$. So, the eigenvalues we are looking for are the distinct $c_i$'s, and the corresponding eigenvectors are the nonzero vectors $x = \sum_{i=1}^m \alpha_i e_i \in H_m$ with the $\alpha_i$'s satisfying the above system. If $\dim H = m$, i.e., $H = H_m$, then we have only nonzero eigenvalues which can be determined as before.

10. (a) Obvious. (b) Denote $u_0(t) = t/(1 + t)$, $t \in [0, 1]$. We have $R(A) = \mathrm{Span}\{u_0\}$ and $N(A) = (\mathrm{Span}\{u_0\})^\perp$. (c) First, $\lambda = 0$ is an eigenvalue of $A$ and the corresponding eigenfunctions are all nonzero functions of $N(A) = (\mathrm{Span}\{u_0\})^\perp$. Next, consider the equation $Au = \lambda u$, $u \in R(A) \setminus \{0\}$, $\lambda \ne 0$. Since $u(t) = Cu_0(t) = Ct/(1 + t)$, $C \ne 0$, we obtain

\[ \lambda = \int_0^1 \frac{s^2}{(1 + s)^2}\,ds = \frac{3}{2} - 2\ln 2. \]

The corresponding eigenfunctions are $u(t) = Cu_0(t)$, $C \in \mathbb{R} \setminus \{0\}$.
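The value of the nonzero eigenvalue in Exercise 10 can be cross-checked by quadrature. The sketch below is only an illustration: it uses a hand-written composite Simpson rule (the helper `simpson` is hypothetical) and compares $\int_0^1 u_0(s)^2\,ds$ with the closed form $3/2 - 2\ln 2$.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def u0(t):
    """The function u_0(t) = t / (1 + t) from the answer."""
    return t / (1.0 + t)

lam_numeric = simpson(lambda s: u0(s) ** 2, 0.0, 1.0)
lam_exact = 1.5 - 2.0 * math.log(2.0)
print(lam_numeric, lam_exact)
```

The closed form follows from $s^2/(1+s)^2 = 1 - 2/(1+s) + 1/(1+s)^2$, integrated term by term over $[0, 1]$.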


11. (a) It is easily seen that for $u \in H$, we have

\[ (Au)(t) = v(t) = -\int_t^1\int_0^s u(\tau)\,d\tau\,ds, \quad t \in [0, 1]. \]

Obviously, $A \in L(H)$ and $N(A) = \{0\}$. (b) Integrating twice by parts shows that $A$ is symmetric (hence self-adjoint). Also, $A$ is compact (by the Arzelà–Ascoli criterion or the compact embedding of $H^2(0, 1)$ in $H = L^2(0, 1)$). (c) The equation $Au = \lambda u$ reads

\[ -\int_t^1\int_0^s u(\tau)\,d\tau\,ds = \lambda u(t), \quad t \in [0, 1]. \]

Clearly, $\lambda = 0$ is not an eigenvalue of $A$, so consider $\lambda \in \mathbb{R} \setminus \{0\}$. We see that $u \in C^\infty[0, 1]$ and satisfies the problem

\[ \lambda u''(t) = u(t), \quad t \in [0, 1]; \qquad u(1) = 0,\ u'(0) = 0. \]

If we multiply the above equation by $u(t)$ and integrate over $[0, 1]$, we obtain

\[ -\lambda\int_0^1 (u')^2\,dt = \int_0^1 u(t)^2\,dt, \]

which shows that any eigenvalue satisfies $\lambda < 0$. Denote for convenience $\lambda = -1/\nu^2$, $\nu > 0$. Solving the above problem we find $u(t) = c\cos(\nu t)$, $\cos\nu = 0$, $c \in \mathbb{R} \setminus \{0\}$. Thus $\nu = n\pi + \pi/2$, $n = 0, 1, \dots$ Therefore, $A$ has the eigenvalues

\[ \lambda_n = -\frac{1}{(n\pi + \pi/2)^2}, \quad n = 0, 1, \dots \]

and the corresponding eigenfunctions are the nonzero multiples of the following normalized functions: $u_n(t) = \sqrt{2}\cos\big((n\pi + \pi/2)t\big)$, $t \in [0, 1]$, $n = 0, 1, \dots$ According to Theorem 8.7 (Hilbert–Schmidt), the system $\{u_n\}_{n=0}^\infty$ is an orthonormal basis in $H$.
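The eigenpairs of Exercise 11 can be spot-checked numerically. The sketch below assumes the operator $(Au)(t) = -\int_t^1\int_0^s u(\tau)\,d\tau\,ds$ as written in part (a), evaluates it with a nested composite Simpson rule, and compares $(Au_n)(t)$ with $\lambda_n u_n(t)$ for $\lambda_n = -1/(n\pi + \pi/2)^2$; the helper names are hypothetical.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def apply_A(u, t):
    """(Au)(t) = -int_t^1 int_0^s u(tau) dtau ds, by nested quadrature."""
    inner_integral = lambda s: simpson(u, 0.0, s)
    return -simpson(inner_integral, t, 1.0)

def eigenpair(n):
    """Eigenvalue lambda_n and normalized eigenfunction u_n from the answer."""
    nu = n * math.pi + math.pi / 2
    return -1.0 / nu ** 2, (lambda t: math.sqrt(2) * math.cos(nu * t))

for n in (0, 1):
    lam, u = eigenpair(n)
    for t in (0.0, 0.3, 0.8):
        print(n, t, apply_A(u, t), lam * u(t))
```

The two printed columns should agree up to quadrature error, which confirms the negative sign of the eigenvalues.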


12. Search for $u$ in the form $u(x) = u_1(x_1) \cdot u_2(x_2)$, with $u_1 \ne 0$, $u_2 \ne 0$. Thus the equation $-\Delta u = \lambda u$ reads

\[ -\frac{u_1''}{u_1} = \frac{u_2''}{u_2} + \lambda. \]

Since the two sides of this equation depend on distinct variables, $x_1$ and $x_2$, they must be constant functions, so we obtain the following two eigenvalue problems:

\[ u_1'' + \nu u_1 = 0, \quad 0 < x_1 < a, \qquad u_1(0) = 0,\ u_1(a) = 0, \]

\[ u_2'' + \mu u_2 = 0, \quad 0 < x_2 < b, \qquad u_2(0) = 0,\ u_2(b) = 0, \]

with $\nu + \mu = \lambda$. If we multiply the equation $u_1'' + \nu u_1 = 0$ by $u_1$ and then integrate over $[0, a]$, we get

\[ \int_0^a (u_1')^2\,dx_1 = \nu\int_0^a u_1^2\,dx_1, \]

which shows that $\nu > 0$. Similarly, $\mu > 0$, hence $\lambda > 0$ as well. Solving the above eigenvalue problems we find

\[ \nu_n = \Big(\frac{n\pi}{a}\Big)^2, \quad u_{1,n}(x_1) = c_n\sin\Big(\frac{n\pi}{a}x_1\Big), \quad n = 1, 2, \dots, \]

and

\[ \mu_m = \Big(\frac{m\pi}{b}\Big)^2, \quad u_{2,m}(x_2) = \tilde{c}_m\sin\Big(\frac{m\pi}{b}x_2\Big), \quad m = 1, 2, \dots \]

Thus we have obtained the following eigenvalues of $-\Delta$:

\[ \lambda_{mn} = \Big(\frac{n\pi}{a}\Big)^2 + \Big(\frac{m\pi}{b}\Big)^2, \quad m, n \in \mathbb{N}, \]

the corresponding eigenfunctions being the nonzero multiples of

\[ u_{mn}(x) = \frac{2}{\sqrt{ab}}\sin\Big(\frac{n\pi}{a}x_1\Big) \cdot \sin\Big(\frac{m\pi}{b}x_2\Big), \quad m, n \in \mathbb{N}. \]

Note that the system $S = \{u_{mn}\}_{m,n=1}^\infty$ is an orthonormal basis of $H = L^2(\Omega)$, $\Omega = (0, a) \times (0, b)$. As $S$ is an orthonormal system, it is enough to show that $\mathrm{Span}\,S$ is dense in $H$ (see Theorem 6.21). Indeed, every function $u \in H$ can be approximated with respect to the $L^2$-norm by a function from $C_0^\infty(\Omega)$ (cf. Theorem 5.8), which in turn is close (even with respect to the uniform convergence topology) to a polynomial in $x_1, x_2$, i.e., a finite sum of product functions $u_1(x_1) \cdot u_2(x_2)$. Since the systems

\[ \Big\{\sqrt{2/a}\,\sin\Big(\frac{n\pi}{a}x_1\Big)\Big\}_{n=1}^\infty, \qquad \Big\{\sqrt{2/b}\,\sin\Big(\frac{m\pi}{b}x_2\Big)\Big\}_{m=1}^\infty \]

are bases in $L^2(0, a)$ and $L^2(0, b)$, respectively, it follows that every product function $u = u_1(x_1) \cdot u_2(x_2)$, with $u_1 \in L^2(0, a)$ and $u_2 \in L^2(0, b)$, is approximated in $H = L^2(\Omega)$ by functions from $\mathrm{Span}\,S$, hence $\mathrm{Span}\,S$ is dense in $H$, as claimed.

13. Proceed as for the previous exercise. Similarly, you can determine different orthonormal bases in $L^2((0, a) \times (0, b))$.
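The eigenpairs $(\lambda_{mn}, u_{mn})$ obtained in Exercise 12 can be spot-checked with a finite-difference Laplacian. This is a minimal sketch under stated assumptions: the rectangle sizes, the indices, the evaluation point and the step `h` are hypothetical sample values, and the five-point stencil only approximates $-\Delta$ up to an error of order $h^2$.

```python
import math

def u_mn(m, n, a, b):
    """Normalized eigenfunction u_mn on the rectangle (0, a) x (0, b)."""
    def u(x1, x2):
        return (2.0 / math.sqrt(a * b)
                * math.sin(n * math.pi * x1 / a)
                * math.sin(m * math.pi * x2 / b))
    return u

def lambda_mn(m, n, a, b):
    """Eigenvalue (n*pi/a)^2 + (m*pi/b)^2 of the negative Laplacian."""
    return (n * math.pi / a) ** 2 + (m * math.pi / b) ** 2

def neg_laplacian(f, x1, x2, h=1e-3):
    """Five-point finite-difference approximation of -Laplacian(f) at (x1, x2)."""
    return -(f(x1 + h, x2) + f(x1 - h, x2) + f(x1, x2 + h) + f(x1, x2 - h)
             - 4.0 * f(x1, x2)) / h ** 2

# Hypothetical sample values.
a, b, m, n = 2.0, 1.0, 2, 1
u = u_mn(m, n, a, b)
lhs = neg_laplacian(u, 0.7, 0.3)
rhs = lambda_mn(m, n, a, b) * u(0.7, 0.3)
print(lhs, rhs)
```

The boundary values can be checked directly as well: `u(0.0, x2)`, `u(a, x2)`, `u(x1, 0.0)` and `u(x1, b)` vanish up to rounding.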

12.9 Answers to Exercises for Chap. 9

1. (i) Apply the usual formula

\[ e^{tA} = \sum_{k=0}^\infty \frac{t^k}{k!}A^k \]

and the observation that $A^k$ is the null matrix for $k = 2, 3, \dots$ Thus we find

\[ e^{tA} = \begin{pmatrix} 1 + t & t \\ -t & 1 - t \end{pmatrix}; \]

(ii) One can use the formula $e^{tA} = \sum_{k=0}^\infty \frac{t^k}{k!}A^k$ again, but we suggest another method. Recall that $e^{tA}$ is the fundamental matrix of the linear differential system

\[ \frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = A\begin{pmatrix} x \\ y \end{pmatrix} \]

which equals the identity matrix for $t = 0$. We solve the above differential system with the initial conditions $x(0) = 1$, $y(0) = 0$ and $x(0) = 0$, $y(0) = 1$, and find

\[ e^{tA} = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}; \]

(iii)

\[ e^{tA} = \begin{pmatrix} 2e^{-2t} - e^{-3t} & -e^{-2t} + e^{-3t} \\ 2e^{-2t} - 2e^{-3t} & -e^{-2t} + 2e^{-3t} \end{pmatrix}. \]
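The three closed forms in Exercise 1 can be compared with a truncated power series. The sketch below rests on an assumption: the matrices `A1`, `A2`, `A3` are not quoted from the exercise but inferred here as the derivatives at $t = 0$ of the stated exponentials. The helper `expm_series` is a plain partial sum of $\sum_k t^k A^k / k!$, adequate only for small matrices and moderate $t$.

```python
import math

def mat_mul(P, Q):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(A, t, terms=40):
    """Partial sum of the exponential series sum_k (t^k / k!) A^k."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = mat_mul(term, A)
        term = [[t * entry / k for entry in row] for row in term]  # t^k A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

# Assumed matrices (inferred from the closed forms, not taken from the exercise).
A1 = [[1.0, 1.0], [-1.0, -1.0]]   # A1 squared is the null matrix
A2 = [[0.0, 1.0], [-1.0, 0.0]]    # rotation generator
A3 = [[-1.0, -1.0], [2.0, -4.0]]  # eigenvalues -2 and -3

def closed_forms(t):
    """The three exponentials as stated in the answer."""
    e2, e3 = math.exp(-2 * t), math.exp(-3 * t)
    return (
        [[1 + t, t], [-t, 1 - t]],
        [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]],
        [[2 * e2 - e3, -e2 + e3], [2 * e2 - 2 * e3, -e2 + 2 * e3]],
    )

def max_gap(P, Q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

t = 0.7
for A, E in zip((A1, A2, A3), closed_forms(t)):
    print(max_gap(expm_series(A, t), E))
```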

2. By the classic Jordan decomposition theorem we have $A = B^{-1}JB$, where $B$ is a nonsingular $n \times n$ matrix and $J$ has Jordan blocks $J_0, J_1, \dots, J_m$ on its diagonal and 0 elsewhere. Here, denoting the simple eigenvalues of $A$ by $\lambda_1, \lambda_2, \dots, \lambda_p$ and the other eigenvalues of $A$ by $\lambda_{p+1}, \dots, \lambda_{p+m}$, we have $J_0 = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_p)$, $J_i = \lambda_{p+i}I_{p_i} + B_{p_i}$, $i = 1, \dots, m$, where $I_{p_i}$ is the $p_i \times p_i$ identity matrix and $B_{p_i}$ is the $p_i \times p_i$ matrix having the entries on the first superdiagonal (immediately above the principal diagonal) equal to 1 and 0 otherwise. Note that $e^{tJ_0} = \mathrm{diag}(e^{t\lambda_1}, \dots, e^{t\lambda_p})$, $e^{tJ_i} = e^{t\lambda_{p+i}}e^{tB_{p_i}}$, where $e^{tB_{p_i}}$ has a special form involving $\{1, t, t^2, \dots, t^{p_i - 1}\}$, $i = 1, \dots, m$. Thus, since $e^{tA} = B^{-1}e^{tJ}B = B^{-1} \cdot \mathrm{diag}\{e^{tJ_0}, e^{tJ_1}, \dots, e^{tJ_m}\} \cdot B$, both (a) and (b) follow easily.

3. Left to the reader.

4. For a given pair $(t_0, x_0) \in [0, \infty) \times X$ and for all $(t, x) \in (0, \infty) \times X$, $t > t_0$, we have

\[
\begin{aligned}
\|T(t)x - T(t_0)x_0\| &\le \|T(t)x - T(t)x_0\| + \|T(t)x_0 - T(t_0)x_0\| \\
&= \|T(t)(x - x_0)\| + \|T(t_0)\big(T(t - t_0)x_0 - x_0\big)\| \\
&\le Me^{\omega t}\|x - x_0\| + Me^{\omega t_0} \cdot \|T(t - t_0)x_0 - x_0\|.
\end{aligned}
\]

On the other hand, if $t_0 > 0$ and $t \in (0, t_0)$, we have

\[
\begin{aligned}
\|T(t)x - T(t_0)x_0\| &\le \|T(t)x - T(t)x_0\| + \|T(t)x_0 - T(t_0)x_0\| \\
&= \|T(t)(x - x_0)\| + \|T(t)\big(x_0 - T(t_0 - t)x_0\big)\| \\
&\le Me^{\omega t}\big(\|x - x_0\| + \|T(t_0 - t)x_0 - x_0\|\big).
\end{aligned}
\]

The claim follows from the above estimates.

5. (a) It is easy to check that $\{G(t) : X \to X;\ t \in \mathbb{R}\}$ is a uniformly continuous group. Its generator is $A \in L(X)$ given by $(Af)(x) = \lambda[f(x - \delta) - f(x)]$, $f \in X$, $x \in \mathbb{R}$. (b) If $t \ge 0$, we have $\|G(t)f\|_X \le 1$ for all $f \in X$ satisfying $\|f\|_X \le 1$, and for $\hat{f} \equiv 1$ we have $\|G(t)\hat{f}\|_X = 1$. So $\|G(t)\| = 1\ \forall t \ge 0$. For $t < 0$ we easily deduce that $\|G(t)\| \le e^{-2\lambda t}$, and this upper bound is reached for $\tilde{f}(x) = \cos(\pi x/\delta)$ (indeed, $(G(t)\tilde{f})(0) = e^{-2\lambda t}$).

6. This is a translation group and its generator $A : D(A) \subset X \to X$ is defined by $D(A) = \{f \in X;\ f \text{ is differentiable on } \mathbb{R} \text{ and } f' \in X\}$, $Af = f'$. Use arguments similar to those in Sect. 9.5. Obviously, $\|T(t)\| = 1$ for all $t \in \mathbb{R}$.

7. (a) It is easy to see that $\{G(t) : X \to X;\ t \in \mathbb{R}\}$ is a $C_0$-group and its infinitesimal generator $A : D(A) \subset X \to X$ is given by $D(A) = W^{1,1}(\mathbb{R}^k)$, $(Af)(x) = -\nabla f(x) \cdot Mx$, for all $f \in D(A)$ and a.a. $x \in \mathbb{R}^k$. (b) Assume that $\sum_{i=1}^k m_{ii} = 0$. Denote by $W(t)$ the determinant of $e^{tM}$, whose columns are solutions of the linear differential system $u'(t) = Mu(t)$. Recall that $W(t)$ is known as the Wronski determinant of the system of solutions of $u'(t) = Mu(t)$ that are here given by the columns of $X(t) = e^{tM}$. Using the definition of a determinant, we can see that the derivative of $W(t)$ is the sum of $k$ determinants that are obtained by differentiating one by one the rows of $W(t)$. Noting that the derivative of each row contains linear combinations of the other rows, we derive

\[ W'(t) = \Big(\sum_{i=1}^k m_{ii}\Big) \cdot W(t) = 0 \quad \forall t \in \mathbb{R}. \]


So $W$ is a constant function, hence $W(t) = W(0) = 1\ \forall t \in \mathbb{R}$. Next, by using the change of variables $x = e^{tM}y$, we obtain

\[
\int_{\mathbb{R}^k} |(G(t)f)(x)|^p\,dx = \int_{\mathbb{R}^k} |f(e^{-tM}x)|^p\,dx = \int_{\mathbb{R}^k} |f(y)|^p\,W(t)\,dy = \int_{\mathbb{R}^k} |f(x)|^p\,dx,
\]

for all $f \in X$ and $t \in \mathbb{R}$, hence $\|G(t)\| = 1$ for all $t \in \mathbb{R}$.

8. It is easy to check that all the $C_0$-semigroup properties are fulfilled in this case. The infinitesimal generator is given by $D(A) = \{f \in X;\ f' \text{ exists and belongs to } X\}$, $Af = -f'$.

9. (a) Let us only check the continuity at $t = 0$, the other properties being trivially satisfied. Let $(x_n)_{n \in \mathbb{N}} \in X$ be arbitrary but fixed. For every $\varepsilon > 0$ there exists an $m \in \mathbb{N}$ such that $\sum_{j=m+1}^\infty |x_j|^p < \varepsilon$. So we have

\[
\Big(\sum_{j=m+1}^\infty |e^{-c_j t}x_j - x_j|^p\Big)^{1/p} \le \Big(\sum_{j=m+1}^\infty |e^{-c_j t}x_j|^p\Big)^{1/p} + \Big(\sum_{j=m+1}^\infty |x_j|^p\Big)^{1/p} \le 2\Big(\sum_{j=m+1}^\infty |x_j|^p\Big)^{1/p} \le 2\varepsilon^{1/p}.
\]

This implies

\[ \|T(t)(x_n) - (x_n)\|_p^p \le \sum_{j=1}^m |e^{-c_j t}x_j - x_j|^p + 2^p\varepsilon. \]

It follows that

\[ \limsup_{t \to 0+} \|T(t)(x_n) - (x_n)\|_p^p \le 2^p\varepsilon \quad \forall \varepsilon > 0, \]

which proves the claim.


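(b) If $A$ denotes the infinitesimal generator of the semigroup and $\lim_{h \to 0+} h^{-1}[T(h)(x_n) - (x_n)]$ exists, then $A(x_n) = -(c_n x_n)$ with $(c_n x_n) \in X$. In fact, $D(A) = \{(x_n) \in X;\ (c_n x_n) \in X\}$. Indeed, noting that (by the Mean Value Theorem)

\[ \frac{e^{-c_j h} - 1}{h} + c_j = -c_j e^{-c_j\theta_j} + c_j, \quad 0 < \theta_j < h, \]

we can write for $h, \varepsilon > 0$ and $(x_n) \in D(A)$ (defined above)

\[
\begin{aligned}
\Big\|h^{-1}\big[T(h)(x_n) - (x_n)\big] + (c_n x_n)\Big\|_p^p
&\le \sum_{j=1}^N \Big|\frac{e^{-c_j h}x_j - x_j}{h} + c_j x_j\Big|^p + \sum_{j=N+1}^\infty |1 - e^{-c_j\theta_j}|^p \cdot |c_j|^p \cdot |x_j|^p \\
&\le \sum_{j=1}^N \Big|\frac{e^{-c_j h}x_j - x_j}{h} + c_j x_j\Big|^p + \sum_{j=N+1}^\infty |c_j|^p \cdot |x_j|^p \\
&\le \sum_{j=1}^N \Big|\frac{e^{-c_j h}x_j - x_j}{h} + c_j x_j\Big|^p + \varepsilon,
\end{aligned}
\]

where $N = N_\varepsilon$ comes from $(c_n x_n) \in X$, i.e., $\sum_{n=1}^\infty |c_n|^p|x_n|^p < \infty$. Therefore,

\[
\limsup_{h \to 0+} \Big\|\frac{T(h)(x_n) - (x_n)}{h} + (c_n x_n)\Big\|_p^p \le \limsup_{h \to 0+} \sum_{j=1}^N \Big|\frac{e^{-c_j h}x_j - x_j}{h} + c_j x_j\Big|^p + \varepsilon,
\]

which implies

\[ \limsup_{h \to 0+} \Big\|\frac{T(h)(x_n) - (x_n)}{h} + (c_n x_n)\Big\|_p^p \le \varepsilon. \]

This concludes the proof.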


(c) The semigroup is uniformly continuous $\iff$ $D(A) = X$ and $A \in L(X)$ (see Theorems 9.5 and 9.14). In our case, $D(A) = X \iff (c_n)$ is bounded. If $(c_n)$ is bounded, then obviously $A \in L(X)$.

10. It is easy to see that $A$ satisfies all the conditions of the Hille–Yosida Generation Theorem (Theorem 9.22), so $A$ generates a $C_0$-semigroup of contractions $\{T(t) : X \to X;\ t \ge 0\}$. We know that, for $u_0 \in D(A)$, $u(t, \cdot) = T(t)u_0$ satisfies

\[ \frac{d}{dt}u(t, \cdot) = Au(t, \cdot), \quad t \ge 0, \]

in $X$, i.e., $u_t + u_x = 0$ in $\Omega = (0, \infty) \times (0, 1)$. Noting that this equation has the characteristic $x - t = C$, we can derive

\[ (T(t)u_0)(x) = \begin{cases} u_0(x - t), & 0 \le t \le x, \\ 0, & t > x. \end{cases} \]

This formula extends by density to all $u_0 \in X$, hence the semigroup is completely determined. We saw that, for $u_0 \in D(A)$, the function $u(t, x) = (T(t)u_0)(x)$ satisfies in the classical sense the transport equation $u_t + u_x = 0$ in $\Omega$. Obviously, this $u$ also satisfies the transport equation in the sense of distributions, i.e.,

\[ \iint_\Omega u(\varphi_t + \varphi_x)\,dt\,dx = 0 \quad \forall \varphi \in \mathcal{D}(\Omega). \]

This relation extends by density to all $u_0 \in X = L^2(0, 1)$, hence $u_t + u_x = 0$ in $\mathcal{D}'(\Omega)$.

11. First of all, observe that the substitution $v(t, x) = e^{at}u(t, x)$ leads to a similar initial-boundary value problem for the equation $v_t - v_{xx} = e^{at}f(t, x)$, so one can assume $a = 0$. As in Sect. 9.12.1, one can express this problem as a Cauchy problem for an evolution equation in $X = L^2(0, 1)$ associated with the operator $A : D(A) \subset X \to X$ defined by $D(A) = \{v \in H^2(0, 1);\ v(0) = 0,\ v'(1) + \alpha v(1) = 0\}$, $Av = v''$, $v \in D(A)$.


This operator is linear, densely defined (since $C_0^\infty(0, 1) \subset D(A)$ and $C_0^\infty(0, 1)$ is dense in $X$), closed, self-adjoint, and dissipative. The reader can easily check that all these properties of $A$ hold true. According to Theorem 9.29, $A$ is the generator of a $C_0$-semigroup of contractions. So, applying the theory developed in Sect. 9.11, we conclude that, for an arbitrary $r > 0$, the Cauchy problem in $X$

\[ u'(t) = Au(t) + f(t), \quad 0 \le t \le r, \qquad u(0) = u_0, \]

where $u(t) := u(t, \cdot)$, $f(t) := f(t, \cdot)$, has a unique mild solution $u$ on every interval $[0, r]$, $r > 0$, hence $u \in C([0, \infty); L^2(0, 1))$. If $u_0 \in D(A)$ and $f \in C^1([0, \infty); L^2(0, 1))$, then $u \in C([0, \infty); L^2(0, 1)) \cap C^1((0, \infty); L^2(0, 1))$ (cf. Theorem 9.47), i.e., $u$ is a classical solution. If the term $au$ is replaced by $h(u)$ with $h$ a Lipschitz function, then we can also prove the existence of a unique mild solution on every interval $[0, r]$, $r > 0$ (see Remark 9.50).

12. This problem is similar to the initial-boundary value problem discussed in Sect. 9.12.2. However, since the boundary conditions are different, separate analysis is needed. Denote $X = \{p \in H^1(0, 1);\ p(0) = 0\} \times L^2(0, 1)$ and endow $X$ with the scalar product

\[ \big\langle [p_1, q_1], [p_2, q_2] \big\rangle = \int_0^1 p_1'p_2'\,dx + \int_0^1 q_1q_2\,dx \]

and the corresponding induced norm. It is easily seen that $X$ is a real Hilbert space. Define $A : D(A) \subset X \to X$ by $D(A) = \{[p, q] \in H^2(0, 1) \times H^1(0, 1);\ p(0) = p'(1) = 0,\ q(0) = 0\}$, and $A[p, q] = [q, p'']$. The given problem can be expressed as the following Cauchy problem in $X$:

\[ \frac{d}{dt}[u(t, \cdot), v(t, \cdot)] = A[u(t, \cdot), v(t, \cdot)] + [0, f(t, \cdot)], \quad t \ge 0, \qquad [u(0, \cdot), v(0, \cdot)] = [u_0, v_0] \in X. \]


Denote by (CP) this Cauchy problem. In order to derive existence results for (CP), we are going to show that $A$ is the generator of a $C_0$-group of isometries. For this purpose, we can use Corollary 9.34. First of all, note that $D(A)$ is dense in $X = Y \times L^2(0, 1)$, where $Y = \{p \in H^1(0, 1);\ p(0) = 0\}$. Indeed, $Y$ is dense in $L^2(0, 1)$ (in fact $C_0^\infty(0, 1)$ is dense in $L^2(0, 1)$). Next, let $u$ be arbitrary in $Y$. Then, $y(x) = u(x) + x(x - 2)u(1)$ belongs to $H_0^1(0, 1)$, so there exists a sequence $(\varphi_n)$ in $C_0^\infty(0, 1)$ which converges in $H^1(0, 1)$ (hence in $C[0, 1]$) to $y$. Construct a sequence $(u_n)$ by $u_n(x) = \varphi_n(x) - x(x - 2)u(1)$. Clearly, $u_n(0) = 0$, $u_n'(1) = 0$ for all $n$, and $u_n \to u$ in $H^1(0, 1)$, which concludes the proof of the claim (that $D(A)$ is dense in $X$). The condition $(kk)^*$ of Corollary 9.34 is also satisfied, so $A$ generates a $C_0$-group of isometries, say $\{G(t) : X \to X;\ t \in \mathbb{R}\}$. In order to finish we can follow the discussion in Sect. 9.12.2 (where the case $u(t, 0) = 0 = u(t, 1)$ was addressed).

13. Note that the second boundary condition is a dynamic one (as it involves the derivative $v_t$) so this initial-boundary value problem needs special analysis. The main idea towards solving this problem is to consider an appropriate framework. Specifically, we shall consider the real Hilbert space $X = L^2(0, 1) \times L^2(0, 1) \times \mathbb{R}$ with the scalar product

\[ \big\langle [f_1, g_1, \xi_1], [f_2, g_2, \xi_2] \big\rangle = L\int_0^1 f_1f_2\,dx + C\int_0^1 g_1g_2\,dx + C_1\xi_1\xi_2, \]

and the induced norm.

(a) Define $A : D(A) \subset X \to X$ and $B : X \to X$ by

\[
\begin{aligned}
D(A) &= \{[f, g, \xi] \in H^1(0, 1)^2 \times \mathbb{R};\ \xi = g(1),\ -g(0) = R_0f(0)\}, \\
A[f, g, \xi] &= [-L^{-1}g', -C^{-1}f', C_1^{-1}f(1)] \quad \forall [f, g, \xi] \in D(A), \\
B[f, g, \xi] &= -[L^{-1}Rf, C^{-1}Gg, C_1^{-1}D_1\xi] \quad \forall [f, g, \xi] \in X.
\end{aligned}
\]

Note that the given initial-boundary value problem can be expressed as the following Cauchy problem in $X$, denoted (CP),

\[
\begin{cases}
\dfrac{d}{dt}[i(t, \cdot), v(t, \cdot), \xi(t)] = (A + B)[i(t, \cdot), v(t, \cdot), \xi(t)] + [L^{-1}e(t, \cdot), 0, C_1^{-1}e_1(t)], & t > 0, \\
[i(0, \cdot), v(0, \cdot), \xi(0)] = [i_0(\cdot), v_0(\cdot), \xi_0],
\end{cases}
\]

where $\xi(t) = v(t, 1)$ and $\xi_0 = v_0(1)$. Let us show that $A$ is the generator of a $C_0$-semigroup. First of all, observe that $D(A)$ is dense in $X$. Indeed, for any $[f, g, \xi] \in X$, there exist two sequences $(f_n)$, $(\varphi_n)$ in $C_0^\infty(0, 1)$ such that $f_n \to f$ and $\varphi_n \to h$, $h(x) := g(x) - \xi x$, in $L^2(0, 1)$. Hence, denoting $g_n(x) = \varphi_n(x) + \xi x$, we can see that $[f_n, g_n, \xi] \in D(A)$ for all $n$ and $[f_n, g_n, \xi] \to [f, g, \xi]$ in $X$. By a straightforward computation it follows that

\[ \big\langle A[f, g, \xi], [f, g, \xi] \big\rangle \le 0 \quad \forall [f, g, \xi] \in D(A), \]

so $A$ is dissipative. Let us prove that $R(\lambda I - A) = X$ for all $\lambda > 0$, i.e., $A$ is m-dissipative, hence $A$ is the infinitesimal generator of a $C_0$-semigroup of contractions (see Theorem 9.25 (Lumer–Phillips)). In fact, it is enough to show that $R(\lambda I - A) = X$ for some $\lambda > 0$, i.e., in other words, for every $[p, q, \eta] \in X$ there exists $[f, g, g(1)] \in D(A)$ such that

\[ \lambda Lf + g' = Lp, \qquad \lambda Cg + f' = Cq, \]

with

\[ g(0) + R_0f(0) = 0, \qquad \lambda C_1g(1) - f(1) = C_1\eta. \]

The above differential system can be solved for $f, g$ by using the substitutions

\[ \sqrt{\lambda C}\,g = z_1 - z_2, \qquad \sqrt{\lambda L}\,f = z_1 + z_2, \]

which give (by addition and subtraction of the two equations) an equivalent system of two differential equations in $z_1$ and $z_2$. Requiring the general solution $[f, g]$ of the above differential system to satisfy the boundary conditions we see that (for $\lambda > 0$ large enough) there exists a unique solution $[f, g]$ of the above boundary value problem, with $[f, g, g(1)] \in D(A)$, as claimed. Therefore, $A$ generates a $C_0$-semigroup of contractions, say $\{S(t) : X \to X;\ t \ge 0\}$. Note that the density of $D(A)$ in $X$ also follows from Theorem 9.27, since $X$ is a Hilbert space and therefore a reflexive Banach space. Since $B \in L(X)$, it follows by Theorem 9.35 that $A + B$ is the generator of a $C_0$-semigroup $\{T(t) : X \to X;\ t \ge 0\}$. In fact, this semigroup is also a semigroup of contractions since $B$ is m-dissipative. Having this semigroup, one can express the solution of the above Cauchy problem (CP) by using the variation of constants formula. For $[i_0, v_0, v_0(1)] \in D(A)$, $e \in C^1([0, \infty); L^2(0, 1))$, and $e_1 \in C^1[0, \infty)$, there exists a unique strong solution $[i(t, \cdot), v(t, \cdot), v(t, 1)]$, hence $[i, v]$ can be regarded as a classical solution of the original problem. On the other hand, since $D(A)$ is dense in $X$, we have that for $[i_0, v_0, \xi_0] \in X$, $e \in L^1_{loc}([0, \infty); L^2(0, 1))$ and $e_1 \in L^1_{loc}[0, \infty)$, the Cauchy problem (CP) only has a mild solution given by the variation of constants formula. In this case, the third component of the initial datum, $\xi_0$, can be chosen independently of $v_0$, and the third component of the solution, $\xi(t)$, may not satisfy the identity $\xi(t) = v(t, 1)$, $t > 0$, as in the classical case. This means that the evolution at the boundary point $x = 1$ is weakly dependent on the evolution in $(0, 1)$.

(b) In this case, we observe that $B$ is a Lipschitz operator from $X$ into itself, so one can prove existence of a mild solution under appropriate conditions (see Remark 9.50).

12.10 Answers to Exercises for Chap. 10

1. (i) We find that $D(Q)$ is dense in $H = L^2(0,1)$, since $C_0^\infty(0,1) \subset D(Q)$ and $C_0^\infty(0,1)$ is dense in $H$.

Now, we easily obtain by integration by parts

$$(Qv, w) = -\int_0^1 v'' w \, dx = -\int_0^1 v w'' \, dx = (v, Qw) \quad \forall v, w \in D(Q),$$

so $D(Q) \subset D(Q^*)$ and $Q^* w = Qw$ for all $w \in D(Q)$. Let us prove the converse inclusion: $D(Q^*) \subset D(Q)$. Choose a $w \in D(Q^*)$, i.e., $w \in H$ and $f(v) = (Qv, w)$ satisfies (for a constant $C$)

$$|f(v)| \le C\|v\| \quad \forall v \in D(Q).$$

By arguments already used before (see the solution to Exercise 7.15) we can deduce that $w \in H^2(0,1)$. On the other hand, from

$$|f(v)| = |(Qv, w)| = \Big| \int_0^1 v'' w \, dx \Big| = \Big| -v'(0)w(0) - v(1)w'(1) + \int_0^1 v w'' \, dx \Big| \le C\|v\| \quad \forall v \in D(Q),$$

we get

$$|v'(0)w(0) + v(1)w'(1)| \le C\|v\| \quad \forall v \in D(Q).$$

Choosing $v := v_n$ in the last inequality, where $(v_n)$ is a sequence in $D(Q)$ satisfying $v_n'(0) = 1$, $v_n(1) = 0$ for all $n$, and $v_n \to 0$ in $H$, we obtain by letting $n \to \infty$ that $w(0) = 0$. By a similar argument, we also get $w'(1) = 0$. Summarizing, $w \in D(Q)$, hence $D(Q^*) \subset D(Q)$, as claimed. The fact that $Q^* = Q$ also follows by the arguments used in the proof of Proposition 7.14.

Moreover, $Q$ is strongly positive, since the Poincaré inequality still holds for functions $v \in H^1(0,1)$ with $v(0) = 0$. Indeed, by using the Hölder inequality, we have

$$v(x) = \int_0^x v'(s) \, ds, \ x \in [0,1] \implies \int_0^1 [v(x)]^2 \, dx \le \int_0^1 [v'(x)]^2 \, dx.$$


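(ii) By easy computations we find

$$\lambda_n = \Big(n + \frac{1}{2}\Big)^2 \pi^2, \qquad e_n(x) = \sqrt{2}\, \sin\Big(\Big(n + \frac{1}{2}\Big)\pi x\Big), \quad n = 0, 1, 2, \ldots$$

(iii) $H_E = \{v \in H^1(0,1);\ v(0) = 0\}$, with the scalar product $(p, q)_E = \int_0^1 p' q' \, dx$. Obviously, this space is compactly embedded in $H$. An orthonormal basis of $H_E$ is

$$\Big\{ \frac{2\sqrt{2}}{(2n+1)\pi} \sin\Big(\Big(n + \frac{1}{2}\Big)\pi x\Big) \Big\}_{n=0}^{\infty}.$$

(iv) $u(t, x) = \sum_{n=0}^{\infty} u_n(t) e_n(x)$, where

$$u_n'(t) + \Big(n + \frac{1}{2}\Big)^2 \pi^2 u_n(t) = f_n(t), \ 0 < t < T, \qquad u_n(0) = u_{0n}, \quad n = 0, 1, 2, \ldots,$$

where

$$f_n(t) = (f(t,\cdot), e_n) = \sqrt{2} \int_0^1 f(t,x) \sin\Big(\Big(n + \frac{1}{2}\Big)\pi x\Big) dx,$$

$$u_{0n} = (u_0, e_n) = \sqrt{2} \int_0^1 u_0(x) \sin\Big(\Big(n + \frac{1}{2}\Big)\pi x\Big) dx, \quad n = 0, 1, 2, \ldots$$

The rest is left to the reader.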
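The eigenpairs in (ii) can be checked numerically. The following Python sketch is an illustration added here, not part of the original solution: it assumes (as parts (i) and (ii) indicate) that $Qv = -v''$ with $v(0) = 0$, $v'(1) = 0$, and the function names and the midpoint quadrature are choices made only for this example.

```python
import math

def e(n, x):
    """Eigenfunction e_n(x) = sqrt(2) * sin((n + 1/2) * pi * x) on (0, 1)."""
    return math.sqrt(2.0) * math.sin((n + 0.5) * math.pi * x)

def eigenvalue(n):
    """Eigenvalue lambda_n = (n + 1/2)^2 * pi^2."""
    return ((n + 0.5) * math.pi) ** 2

def inner(m, n, steps=20000):
    """Midpoint-rule approximation of the L^2(0,1) inner product (e_m, e_n)."""
    h = 1.0 / steps
    return h * sum(e(m, (i + 0.5) * h) * e(n, (i + 0.5) * h)
                   for i in range(steps))

def second_derivative(n, x, h=1e-4):
    """Central finite difference approximating e_n''(x)."""
    return (e(n, x + h) - 2.0 * e(n, x) + e(n, x - h)) / h ** 2
```

Here `inner(m, n)` should be close to 1 for `m == n` and close to 0 otherwise, and `-second_derivative(n, x)` should be close to `eigenvalue(n) * e(n, x)`.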

2. The temperature $u = u(t, x)$ satisfies the initial-boundary value problem

$$\begin{cases} u_t - \alpha u_{xx} = 0, & t \in (0,T),\ x \in (0,l),\\ u(t,0) = u_1,\ u(t,l) = u_2, & t \in [0,T],\\ u(0,x) = u_0, & x \in [0,l], \end{cases}$$

where $\alpha > 0$ is the diffusivity of the rod material. We want to convert this problem into a similar one with homogeneous boundary conditions. It is easy to see that $y$ defined by

$$y(t,x) = u(t,x) - \Big[\Big(1 - \frac{x}{l}\Big) u_1 + \frac{x}{l} u_2\Big], \quad (t,x) \in [0,T] \times [0,l],$$

satisfies

$$\begin{cases} y_t - \alpha y_{xx} = 0, & t \in (0,T),\ x \in (0,l),\\ y(t,0) = 0,\ y(t,l) = 0, & t \in [0,T],\\ y(0,x) = u_0 - \Big[\Big(1 - \dfrac{x}{l}\Big) u_1 + \dfrac{x}{l} u_2\Big], & x \in [0,l]. \end{cases}$$

This problem can be expressed as a Cauchy problem in $H = L^2(0,l)$ associated with the operator $Q : D(Q) \subset H \to H$, defined by

$$D(Q) = H^2(0,l) \cap H_0^1(0,l), \qquad Qv = -\alpha v'' \quad \forall v \in D(Q).$$

This $Q$ satisfies both conditions (a) and (b) of Theorem 10.1 (with $H_E = H_0^1(0,l)$). Its eigenvalues are

$$\lambda_n = \alpha \frac{n^2 \pi^2}{l^2}, \quad n = 1, 2, \ldots$$

The corresponding orthonormal basis of $L^2(0,l)$ consists of the functions

$$e_n(x) = \sqrt{\frac{2}{l}}\, \sin\frac{n \pi x}{l}, \quad n = 1, 2, \ldots$$

The solution $y$ is given by

$$y(t,x) = \sum_{n=1}^{\infty} y_n(t) e_n(x),$$

where the $y_n$'s satisfy

$$y_n'(t) + \alpha \frac{n^2 \pi^2}{l^2} y_n(t) = 0, \ 0 < t < T, \qquad y_n(0) = y_{0n}, \quad n = 1, 2, \ldots,$$

with $y_{0n} = (y(0,\cdot), e_n)_H$, $n = 1, 2, \ldots$ The rest is left to the reader.

3. In this case we define $Q : D(Q) \subset H \to H$ by

$$D(Q) = \{v \in H^2(0,1);\ -v'(0) + \alpha v(0) = 0,\ v'(1) = 0\}, \qquad Qv = -v'' \quad \forall v \in D(Q).$$

(a) Obviously, $D(Q)$ is dense in $H$. It is also easy to check that

$$(Qv, w) = (v, Qw) \quad \forall v, w \in D(Q),$$

so $D(Q) \subset D(Q^*)$ and $Q^* v = Qv$ for all $v \in D(Q)$. Let us prove that $D(Q^*) \subset D(Q)$ to conclude that $Q^* = Q$. Choose an arbitrary $w \in D(Q^*)$. By arguments similar to those used for Exercise 7.15, we infer that $w \in H^2(0,1)$. Next, since

$$f(v) := (Qv, w) = -\int_0^1 v'' w \, dx = v'(0)w(0) + v(1)w'(1) - v(0)w'(0) - \int_0^1 v w'' \, dx = v(0)[\alpha w(0) - w'(0)] + v(1)w'(1) - \int_0^1 v w'' \, dx$$

for all $v \in D(Q)$ and $|f(v)| \le C\|v\|$ for all $v \in D(Q)$ and a constant $C$, we deduce

$$|v(0)[\alpha w(0) - w'(0)] + v(1)w'(1)| \le C\|v\| \quad \forall v \in D(Q).$$

Choosing in this inequality $v := v_n \in D(Q)$ such that $v_n(0) = 1$, $v_n(1) = 0$, $v_n \to 0$ in $H$, we obtain by letting $n \to \infty$

$$-w'(0) + \alpha w(0) = 0.$$

Similarly, $w'(1) = 0$, hence $w \in D(Q)$, so $D(Q^*) \subset D(Q)$, as claimed, i.e., $Q^* = Q$.

Let us prove that $Q$ is also strongly positive. First of all, for all $v \in H^1(0,1)$ we have

$$v(x) = v(0) + \int_0^x v'(s) \, ds, \ x \in [0,1] \implies |v(x)| \le |v(0)| + \|v'\|, \ x \in [0,1],$$

so

$$\|v\|^2 \le 2\big(|v(0)|^2 + \|v'\|^2\big).$$

On the other hand, for all $v \in D(Q)$ we have

$$(Qv, v) = -\int_0^1 v'' v \, dx = v'(0)v(0) + \|v'\|^2 = \alpha |v(0)|^2 + \|v'\|^2.$$

Combining the two relations (for $\alpha > 0$) gives $(Qv, v) \ge \frac{1}{2}\min\{\alpha, 1\}\, \|v\|^2$. Therefore, $Q$ is indeed strongly positive.


(b) It is easily seen that $H_E = H^1(0,1)$ with the scalar product

$$(v_1, v_2)_E = \alpha v_1(0) v_2(0) + \int_0^1 v_1' v_2' \, dx \quad \forall v_1, v_2 \in H_E,$$

and the induced norm, which is equivalent to the usual norm of $H^1(0,1)$ (easy to check). Thus $H_E$ is compactly embedded in $H$.

4. Denoting again $H = L^2(0,1)$, one can define $Q : D(Q) \subset H \to H$ by

$$D(Q) = \{v \in H^2(0,1);\ v'(0) = v'(1) = 0\}, \qquad Q(v) = -v''.$$

But this operator is not strongly positive. In order to remedy this, let us consider a perturbation of $Q$, again denoted $Q$, defined on the same $D(Q)$ by

$$Qv = -v'' + v, \quad v \in D(Q).$$

With the substitutions $y(t,x) = e^{-t} u(t,x)$, $\tilde{f}(t,x) = e^{-t} f(t,x)$, the original problem becomes

$$\begin{cases} y_t - y_{xx} + y = \tilde{f}(t,x), & t \in (0,T),\ x \in (0,1),\\ y_x(t,0) = 0,\ y_x(t,1) = 0, & t \in [0,T],\\ y(0,x) = u_0(x), & x \in (0,1), \end{cases}$$

which can be expressed as the Cauchy problem in $H$

$$y'(t) + Qy(t) = \tilde{f}(t), \ 0 < t < T, \qquad y(0) = u_0,$$

where $y(t) := y(t,\cdot) \in H$.

(i) By arguments already used before we can see that $Q$ satisfies (a).

(ii) $\lambda_n = 1 + n^2 \pi^2$, $n = 0, 1, 2, \ldots$; $e_0(x) = 1$, $e_n(x) = \sqrt{2} \cos(n \pi x)$, $n = 1, 2, \ldots$ (Cosines, not sines, are the functions compatible with the Neumann conditions $v'(0) = v'(1) = 0$.)

(iii) $H_E = H^1(0,1)$ with the usual scalar product and norm, and the corresponding orthonormal basis in $H_E$ is

$$\hat{e}_0(x) = 1, \qquad \hat{e}_n(x) = \frac{\sqrt{2}}{\sqrt{1 + n^2 \pi^2}} \cos(n \pi x), \quad n = 1, 2, \ldots$$

(iv) Left to the reader.
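As a sanity check on (ii) and (iii), the sketch below verifies numerically that the cosine system $1, \sqrt{2}\cos(n\pi x)$ satisfies the Neumann conditions $v'(0) = v'(1) = 0$ and that, after division by $\sqrt{1 + n^2\pi^2}$, it is orthonormal for the usual $H^1(0,1)$ scalar product. This is an illustrative Python sketch with names chosen here (`e`, `de`, `h1_inner`); it is not taken from the book.

```python
import math

def e(n, x):
    """Neumann eigenfunction: e_0 = 1, e_n = sqrt(2) * cos(n * pi * x)."""
    return 1.0 if n == 0 else math.sqrt(2.0) * math.cos(n * math.pi * x)

def de(n, x):
    """Derivative e_n'(x), written out analytically."""
    if n == 0:
        return 0.0
    return -math.sqrt(2.0) * n * math.pi * math.sin(n * math.pi * x)

def h1_inner(m, n, steps=20000):
    """H^1(0,1) inner product of the normalised functions
    e_m / sqrt(1 + m^2 pi^2) and e_n / sqrt(1 + n^2 pi^2) (midpoint rule)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += e(m, x) * e(n, x) + de(m, x) * de(n, x)
    lam_m = 1.0 + (m * math.pi) ** 2
    lam_n = 1.0 + (n * math.pi) ** 2
    return h * total / math.sqrt(lam_m * lam_n)
```

The value `h1_inner(n, n)` is expected to be close to 1 because $\|e_n\|_{H^1}^2 = 1 + n^2\pi^2 = \lambda_n$.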


5. First of all, observe that this exercise is an abstract extension of the previous exercise.

(a) Since $A$ is not strongly positive, we cannot apply Theorem 10.1 directly to the given Cauchy problem (CP). Fortunately, by the changes $y(t) = e^{-\alpha t} u(t)$, $\tilde{f}(t) = e^{-\alpha t} f(t)$, the given (CP) becomes

$$y'(t) + Qy(t) = \tilde{f}(t), \ 0 < t < T, \qquad y(0) = u_0,$$

for which Theorem 10.1 is applicable. The reader is encouraged to discuss (using Theorem 10.1) the existence, uniqueness, and regularity of the solution $u$ with respect to the regularity of $u_0$ and $f$.

(b) According to Theorem 10.3, the Cauchy problem above governed by $Q$ has a unique periodic solution $y$: $y(0) = y(T)$. In terms of $u$ and $\alpha$ this is equivalent to $e^{\alpha T} u_0 = u(T)$, which proves the claim. It should be noted that, in fact, $u_0$ determined here belongs to $D(A)$, hence the corresponding solution $u$ is more regular and $\|u(T) - u_0\|_E$ is small. This follows from [36, Theorem 2.4, p. 56 and Theorem 2.1, p. 48], since $Q$ is a maximal monotone, self-adjoint operator. We encourage the reader to read Chapter I in [36] to understand why $u_0 \in D(A)$.

6. The eigenvalues of $Q = -\Delta$ with Dirichlet conditions on $\partial\Omega$ have been already found (see the solution to Exercise 8.12), namely,

$$\lambda_{mn} = \Big(\frac{n\pi}{a}\Big)^2 + \Big(\frac{m\pi}{b}\Big)^2, \quad m, n = 1, 2, \ldots$$

Correspondingly, we have the following orthonormal basis in $H = L^2(\Omega)$,

$$e_{mn}(x) = \frac{2}{\sqrt{ab}} \sin\Big(\frac{n\pi}{a} x_1\Big) \sin\Big(\frac{m\pi}{b} x_2\Big), \quad m, n = 1, 2, \ldots$$

Thus the Fourier expansion solution is $u(t,x) = \sum_{m,n=1}^{\infty} u_{mn}(t) e_{mn}(x)$, where the $u_{mn}$'s are determined from

$$\begin{cases} u_{mn}'(t) + \Big[\Big(\dfrac{n\pi}{a}\Big)^2 + \Big(\dfrac{m\pi}{b}\Big)^2\Big] u_{mn}(t) = f_{mn}(t),\\ u_{mn}(0) = u_{0mn}, \quad m, n = 1, 2, \ldots, \end{cases}$$

where

$$u_{0mn} = \frac{2}{\sqrt{ab}} \int_\Omega u_0(\xi) \sin\Big(\frac{n\pi}{a} \xi_1\Big) \sin\Big(\frac{m\pi}{b} \xi_2\Big) d\xi_1 d\xi_2, \quad m, n = 1, 2, \ldots,$$

and

$$f_{mn}(t) = \frac{2}{\sqrt{ab}} \int_\Omega f(t,\xi) \sin\Big(\frac{n\pi}{a} \xi_1\Big) \sin\Big(\frac{m\pi}{b} \xi_2\Big) d\xi_1 d\xi_2, \quad m, n = 1, 2, \ldots$$

The rest is left to the reader.

7. The solution is similar to that of the preceding exercise, being based on Exercise 8.13.

8. The given problem is governed by the operator $Q : D(Q) \subset H = L^2(0,3) \to H$ defined by

$$D(Q) = H^2(0,3) \cap H_0^1(0,3), \qquad Qv = -v'' \quad \forall v \in D(Q).$$

Its energetic extension maps $H_E = H_0^1(0,3)$ into $(H_E)^*$. So the given problem can be regarded as a Cauchy problem for an evolution equation in $(H_E)^*$ and solved by the Fourier method (see [22, Chapter 7]). On the other hand, in this case we can convert the given problem into a usual one. First of all, observe that the second order distributional derivative of the function

$$\tilde{u}(x) = \alpha (x - 1) H(x - 1) + \beta (x - 2) H(x - 2)$$

is precisely $\alpha \delta_1 + \beta \delta_2$, where $H$ denotes the usual Heaviside function. In fact, you just need to observe that the second derivative in the sense of distributions of $x \mapsto (x - x_0) H(x - x_0)$ is $\delta_{x_0}$. Thus, the substitution $y(t,x) = u(t,x) + \tilde{u}(x)$ leads us to the problem

$$\begin{cases} y_t - y_{xx} = 0, & (t,x) \in (0,\infty) \times (0,3),\\ y(t,0) = 0,\ y(t,3) = 2\alpha + \beta, & t \ge 0,\\ y(0,x) = \tilde{u}(x), & x \in [0,3]. \end{cases}$$


A new substitution, namely $z(t,x) = y(t,x) - (2\alpha + \beta)x/3$, leads us to the problem

$$\begin{cases} z_t - z_{xx} = 0, & (t,x) \in (0,\infty) \times (0,3),\\ z(t,0) = 0,\ z(t,3) = 0, & t \ge 0,\\ z(0,x) = \tilde{u}(x) - \dfrac{2\alpha + \beta}{3} x, & x \in [0,3], \end{cases}$$

which can easily be solved by the Fourier method (please do it!).

9. Here is the mathematical model (see, e.g., [5, p. 460]).

$$\begin{cases} u_{tt} - a^2 u_{xx} = 0, & t \ge 0,\ 0 \le x \le l,\\ u(t,0) = 0,\ u(t,l) = 0, & t \ge 0,\\ u(0,x) = 0,\ u_t(0,x) = v_0(x), & 0 \le x \le l. \end{cases}$$

Here $a^2 = H/\rho$, where $H$ is the tension in the string, and $\rho$ is the mass per unit length of the string material. The governing operator is $Q : D(Q) \subset H = L^2(0,l) \to H$, with $D(Q) = H^2(0,l) \cap H_0^1(0,l)$, $Qv = -a^2 v''$ for all $v \in D(Q)$. Its eigenvalues are $\lambda_n = (a n \pi / l)^2$, $n = 1, 2, \ldots$, and the corresponding orthonormal basis consists of $e_n(x) = \sqrt{2/l}\, \sin(n \pi x / l)$, $n = 1, 2, \ldots$ Using the Fourier method, one can find the solution $u$ of the initial-boundary value problem above as $u(t,x) = \sum_{n=1}^{\infty} u_n(t) e_n(x)$, where the $u_n$'s satisfy the problems

$$u_n''(t) + (a n \pi / l)^2 u_n(t) = 0, \ t \ge 0, \qquad u_n(0) = u_{0n}, \ u_n'(0) = v_{0n}, \quad n = 1, 2, \ldots,$$

with

$$u_{0n} = 0, \qquad v_{0n} = \sqrt{2/l} \int_0^l v_0(\xi) \sin(n \pi \xi / l) \, d\xi, \quad n = 1, 2, \ldots,$$

and so on.

10. The mathematical model is similar to that in the previous exercise, with some changes corresponding to the new situation, namely,

$$\begin{cases} u_{tt} - a^2 u_{xx} = 0, & t \ge 0,\ 0 \le x \le l,\\ u(t,0) = 0,\ u_x(t,l) = 0, & t \ge 0,\\ u(0,x) = u_0(x),\ u_t(0,x) = 0, & 0 \le x \le l. \end{cases}$$


The solution can be determined as a Fourier expansion. In order to find the eigenvalues and eigenfunctions of the governing operator $Q$, look at the solution to Exercise 10.1 above. For the regularity of the solution, see Theorem 10.4. You can also use the semigroup approach to deduce the existence of a mild solution for $u_0 \in H^1(0,l)$ with $u_0(0) = 0$.

11. Left to the reader.

12. The mathematical model is the following (with a point force of intensity $c$ at the midpoint and both ends fixed, which is the reading consistent with the function $\tilde{u}$ and the boundary values used below):

$$\begin{cases} u_{tt} - a^2 u_{xx} = c\,\delta(x - l/2), & (t,x) \in (0,\infty) \times (0,l),\\ u(t,0) = 0,\ u(t,l) = 0, & t \ge 0,\\ u(0,x) = 0,\ u_t(0,x) = 0, & 0 \le x \le l. \end{cases}$$

As in the case of Exercise 10.8 above, we use a change of the form $y(t,x) = u(t,x) + \tilde{u}(x)$, where

$$\tilde{u}(x) = \frac{c}{a^2} (x - l/2) H(x - l/2),$$

to convert the above initial-boundary value problem into

$$\begin{cases} y_{tt} - a^2 y_{xx} = 0, & (t,x) \in (0,\infty) \times (0,l),\\ y(t,0) = 0,\ y(t,l) = \dfrac{cl}{2a^2}, & t \ge 0,\\ y(0,x) = \tilde{u}(x),\ y_t(0,x) = 0, & x \in [0,l]. \end{cases}$$

In order to homogenize the boundary condition we use a new change, $z(t,x) = y(t,x) - cx/(2a^2)$, which leads to

$$\begin{cases} z_{tt} - a^2 z_{xx} = 0, & (t,x) \in (0,\infty) \times (0,l),\\ z(t,0) = 0,\ z(t,l) = 0, & t \ge 0,\\ z(0,x) = \tilde{u}(x) - cx/(2a^2),\ z_t(0,x) = 0, & x \in [0,l]. \end{cases}$$

This problem can be easily solved by using the Fourier method and this task is left to the reader. Of course, we could have used a single change, namely,

$$z(t,x) = u(t,x) + \tilde{u}(x) - \frac{cx}{2a^2},$$

but we wanted to follow a more transparent procedure.
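The Fourier step left to the reader in Exercise 12 can be sketched as follows. This Python illustration rests on assumptions stated here rather than in the book: the forcing is $c\,\delta(x - l/2)$, both ends of the string are fixed, and the wave operator is $\partial_{tt} - a^2 \partial_{xx}$. Under these assumptions the initial datum $\tilde{u}(x) - cx/(2a^2)$ equals $-\frac{c}{2a^2}\min(x, l - x)$, whose sine coefficients are classical. The names `tent`, `z`, `u` and the truncation level `terms` are hypothetical.

```python
import math

def tent(x, l):
    """Distance from x to the nearer end of [0, l]; equals min(x, l - x)."""
    return min(x, l - x)

def z(t, x, a, c, l, terms=2000):
    """Truncated sine series for z_tt - a^2 z_xx = 0 with z = 0 at both ends,
    z(0, x) = -(c / (2 a^2)) * tent(x, l) and z_t(0, x) = 0."""
    total = 0.0
    for n in range(1, terms + 1, 2):  # even modes vanish: sin(n*pi/2) = 0
        coeff = 4.0 * l / (n * n * math.pi ** 2) * math.sin(n * math.pi / 2.0)
        total += (coeff * math.cos(a * n * math.pi * t / l)
                  * math.sin(n * math.pi * x / l))
    return -c / (2.0 * a * a) * total

def u(t, x, a, c, l, terms=2000):
    """Displacement of the string: u = z - z(0, .) = z + (c / (2 a^2)) * tent."""
    return z(t, x, a, c, l, terms) + c / (2.0 * a * a) * tent(x, l)
```

With this reading, `u(0, x, ...)` is approximately zero (the string starts at rest), and half a period later, at `t = l / a`, the midpoint reaches about `c * l / (2 * a**2)`, twice its static deflection.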


13. Recall that in this case (see the solution to Exercise 8.12) the eigenvalues of $-\Delta$ with Dirichlet boundary conditions are

$$\lambda_{mn} = \Big(\frac{n\pi}{a}\Big)^2 + \Big(\frac{m\pi}{b}\Big)^2, \quad m, n = 1, 2, \ldots$$

and the corresponding orthonormal basis in $H = L^2(\Omega)$ consists of

$$e_{mn}(x) = \frac{2}{\sqrt{ab}} \sin\Big(\frac{n\pi}{a} x_1\Big) \sin\Big(\frac{m\pi}{b} x_2\Big), \quad m, n = 1, 2, \ldots$$

To complete the task see the solution to Exercise 10.6 above.

12.11 Answers to Exercises for Chap. 11

1. Recall that for a given kernel $k = k(t,s) \in C(\Delta)$, $\Delta = \{(t,s) \in \mathbb{R}^2;\ a \le s \le t \le b\}$, the resolvent kernel $R(t,s)$ is defined by

$$R(t,s) = \sum_{n=1}^{\infty} k_n(t,s), \quad (t,s) \in \Delta,$$

where

$$k_1(t,s) = k(t,s), \qquad k_n(t,s) = \int_s^t k(t,\tau) k_{n-1}(\tau,s) \, d\tau, \quad (t,s) \in \Delta,\ n = 2, 3, \ldots$$

Note that the interval $[a,b]$ could be replaced by $[a,+\infty)$ if the corresponding Volterra equation is considered on $[a,+\infty)$.

(a) By easy computations we find

$$R(t,s) = e^{t^2 - s^2 + t - s}, \qquad x(t) = e^{t(t+1)}, \ t \ge 0.$$

Alternatively, denoting $y(t) = e^{-t^2} x(t)$, the given equation can be written as

$$y(t) = 1 + \int_0^t y(s) \, ds, \quad t \ge 0,$$

which is equivalent to the problem

$$y'(t) = y(t), \ t \ge 0, \qquad y(0) = 1,$$

so we obtain again the solution $x$.
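The answer in (a) can be cross-checked by solving the Volterra equation numerically. The exercise statement is not reproduced in this section, so the sketch below assumes the equation is $x(t) = e^{t^2} + \int_0^t e^{t^2 - s^2} x(s)\,ds$, which is what the resolvent kernel and the substitution above suggest. The solver `solve_volterra` is a generic trapezoidal scheme written for this illustration, not a method taken from the book.

```python
import math

def solve_volterra(f, k, t_end, n):
    """Trapezoidal scheme for x(t) = f(t) + int_0^t k(t, s) x(s) ds on [0, t_end].

    Returns the grid points and the approximate values of x on them."""
    h = t_end / n
    ts = [i * h for i in range(n + 1)]
    xs = [f(0.0)]
    for i in range(1, n + 1):
        t = ts[i]
        acc = 0.5 * k(t, ts[0]) * xs[0]
        acc += sum(k(t, ts[j]) * xs[j] for j in range(1, i))
        # The unknown x_i enters through the last trapezoid weight; solve for it.
        xs.append((f(t) + h * acc) / (1.0 - 0.5 * h * k(t, t)))
    return ts, xs

def f(t):
    return math.exp(t * t)            # assumed free term

def k(t, s):
    return math.exp(t * t - s * s)    # assumed kernel

ts, xs = solve_volterra(f, k, 1.0, 400)
error_at_1 = abs(xs[-1] - math.exp(1.0 * (1.0 + 1.0)))
```

The quantity `error_at_1` compares the numerical value at $t = 1$ with $e^{t(t+1)} = e^2$; it shrinks like $h^2$ as `n` grows.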


(b)

$$R(t,s) = \frac{2 + \cos t}{2 + \cos s}\, e^{t - s}, \qquad x(t) = e^t \sin t + e^t (2 + \cos t) \ln \frac{3}{2 + \cos t}.$$

(c) $R(t,s) = \sinh(t - s)$, $x(t) = \sinh t$.

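2. (a) If $x$ is a solution of the given integral equation, then $x(0) = 0$ and

$$x'(t) = 1 - \frac{t^2}{2} + x(t) + \int_0^t x(s) \, ds, \quad t \ge 0.$$

Thus $x'(0) = 1$ and $x''(t) = x'(t) + x(t) - t$. Thus we have obtained the Cauchy problem

$$x''(t) - x'(t) - x(t) = -t, \ t \ge 0, \qquad x(0) = 0, \ x'(0) = 1.$$

Conversely, if $x$ is a solution to this problem, then $x$ satisfies the given integral equation. By easy computations we find

$$x(t) = c_1 e^{(1 + \sqrt{5})t/2} + c_2 e^{(1 - \sqrt{5})t/2} + t - 1,$$

where the initial conditions $x(0) = 0$, $x'(0) = 1$ give $c_1 + c_2 = 1$ and $c_1(1 + \sqrt{5}) + c_2(1 - \sqrt{5}) = 0$, that is,

$$c_1 = \frac{5 - \sqrt{5}}{10}, \qquad c_2 = \frac{5 + \sqrt{5}}{10}.$$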
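The constants in 2(a) are easy to get wrong, so a quick check is worthwhile. The sketch below evaluates the closed-form solution of $x'' - x' - x = -t$, $x(0) = 0$, $x'(0) = 1$ with $c_1 = (5 - \sqrt{5})/10$ and $c_2 = (5 + \sqrt{5})/10$, the values forced by the two initial conditions, and measures the residual of the differential equation. All names are chosen for this illustration only.

```python
import math

S5 = math.sqrt(5.0)
R1, R2 = (1.0 + S5) / 2.0, (1.0 - S5) / 2.0    # roots of r^2 - r - 1 = 0
C1, C2 = (5.0 - S5) / 10.0, (5.0 + S5) / 10.0  # fixed by x(0) = 0, x'(0) = 1

def x(t):
    return C1 * math.exp(R1 * t) + C2 * math.exp(R2 * t) + t - 1.0

def dx(t):
    return C1 * R1 * math.exp(R1 * t) + C2 * R2 * math.exp(R2 * t) + 1.0

def d2x(t):
    return C1 * R1 ** 2 * math.exp(R1 * t) + C2 * R2 ** 2 * math.exp(R2 * t)

def residual(t):
    """Left side minus right side of x'' - x' - x = -t."""
    return d2x(t) - dx(t) - x(t) + t
```

Both initial conditions and the residual should vanish up to rounding error.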

(b) The equivalent Cauchy problem is

$$x''(t) + x(t) = 6t, \ t \ge 0, \qquad x(0) = 1, \ x'(0) = 0.$$

By easy computations we find $x(t) = \cos t - 6 \sin t + 6t$, $t \ge 0$.

(c) If $x$ is a solution, then $x(0) = 0$. By differentiation we obtain from the given integral equation

$$x'(t) = 3 - x(t) - \underbrace{\int_0^t e^{t-s} x(s) \, ds}_{3t - x(t)}, \quad t \ge 0,$$

hence $x$ satisfies the problem

$$x'(t) = 3 - 3t, \ t \ge 0, \qquad x(0) = 0,$$

which is equivalent to the given integral equation and has the solution

$$x(t) = \frac{3}{2}\, t (2 - t), \quad t \ge 0.$$

3. (a) From the given integral equation we obtain by differentiation

$$x(t) - 2t \int_0^t x(s) \, ds = t, \quad t \ge 0.$$

Then $y(t) = \int_0^t x(s) \, ds$ satisfies the Cauchy problem

$$y'(t) - 2t y(t) = t, \ t \ge 0, \qquad y(0) = 0,$$

which has the solution

$$y(t) = \frac{1}{2}\big(e^{t^2} - 1\big), \ t \ge 0 \implies x(t) = t e^{t^2}, \ t \ge 0.$$

(b) From the given integral equation we obtain by differentiation

$$x(t) - \int_0^t \sin(t - s)\, x(s) \, ds = 2(2t + 1), \quad t \ge 0,$$

and so $x(0) = 2$. Another differentiation leads to

$$x'(t) - \underbrace{\int_0^t \cos(t - s)\, x(s) \, ds}_{2t(t+1)} = 4, \quad t \ge 0.$$

So we have obtained the problem

$$x'(t) = 2(t^2 + t + 2), \ t \ge 0, \qquad x(0) = 2,$$

which is equivalent to the given integral equation and gives the solution

$$x(t) = \frac{2}{3} t^3 + t^2 + 4t + 2, \quad t \ge 0.$$

(c) $x(t) = (\cos t - t \cos t - t \sin t) e^{-2t}$, $t \ge 0$.
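For 3(b), the underbraced term above indicates that the given first-kind equation is $\int_0^t \cos(t - s)\,x(s)\,ds = 2t(t+1)$; that reading is an inference, since the exercise statement is not reproduced here. Under this assumption, the following Python sketch checks the answer $x(t) = \frac{2}{3}t^3 + t^2 + 4t + 2$ with a composite Simpson rule. The helper names are illustrative.

```python
import math

def x(s):
    """Candidate solution of Exercise 3(b)."""
    return 2.0 * s ** 3 / 3.0 + s ** 2 + 4.0 * s + 2.0

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def left_side(t):
    """Numerical value of int_0^t cos(t - s) x(s) ds."""
    return simpson(lambda s: math.cos(t - s) * x(s), 0.0, t)
```

For any $t \ge 0$, `left_side(t)` should agree with `2 * t * (t + 1)` up to quadrature error.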


4. $R(t,s)$ is a continuous function on the triangle $\Delta_0 = \{(t,s);\ 0 \le s \le t \le b\}$, being defined by

$$R(t,s) = \sum_{n=1}^{\infty} k_n(t,s), \quad (t,s) \in \Delta_0,$$

where $k_1(t,s) = k(t,s) = h(t - s)$ and

$$k_n(t,s) = \int_s^t k(t,\tau) k_{n-1}(\tau,s) \, d\tau = \int_s^t h(t - \tau) k_{n-1}(\tau,s) \, d\tau, \quad (t,s) \in \Delta_0,\ n = 2, 3, \ldots$$

Since $k_1$ depends on $t - s$ only, we can easily observe (by a change of variable) that so does $k_2$. It follows by induction that all the $k_n$'s depend on $t - s$ only, and hence so does $R$.

5. We can easily show by induction that $R(t,s) \ge 0$ for $a \le s \le t \le b$. Next, denote

$$\varphi(t) = f(t) + \int_a^t k(t,s) x(s) \, ds - x(t) \ge 0, \quad t \in [a,b].$$

Hence,

$$x(t) = f(t) - \varphi(t) + \int_a^t k(t,s) x(s) \, ds, \quad t \in [a,b],$$

which implies

$$x(t) = f(t) - \varphi(t) + \int_a^t R(t,s)[f(s) - \varphi(s)] \, ds = f(t) + \int_a^t R(t,s) f(s) \, ds - \underbrace{\Big[\varphi(t) + \int_a^t R(t,s) \varphi(s) \, ds\Big]}_{\ge 0}, \quad t \in [a,b].$$

So the conclusion is obvious.


6. Left to the reader.

7. We prefer to use the following Bielecki-like norm in $X = C(D)$:

$$\|g\|_B = \sup_{(t,s) \in D} e^{-M(t+s)} |g(t,s)|, \quad g \in X,$$

where $M$ is a large positive constant. Define an operator $P$ on $X$ by

$$(Pg)(t,s) = f(t,s) + \int_0^t \int_0^s k(t,s,\xi,\eta) g(\xi,\eta) \, d\eta \, d\xi, \quad (t,s) \in D,\ g \in X.$$

Clearly, $P$ maps $X$ into itself, and for $g_1, g_2 \in X$ and $(t,s) \in D$ we have

$$\begin{aligned} |(Pg_1)(t,s) - (Pg_2)(t,s)| &\le C \int_0^t \int_0^s |g_1(\xi,\eta) - g_2(\xi,\eta)| \, d\eta \, d\xi\\ &= C \int_0^t \int_0^s e^{M(\eta+\xi)} e^{-M(\eta+\xi)} |g_1(\xi,\eta) - g_2(\xi,\eta)| \, d\eta \, d\xi\\ &\le C \|g_1 - g_2\|_B \int_0^t \int_0^s e^{M(\eta+\xi)} \, d\eta \, d\xi\\ &= \frac{C}{M^2} \|g_1 - g_2\|_B \,(e^{Mt} - 1)(e^{Ms} - 1), \end{aligned}$$

where $C = \sup_{(t,s,\xi,\eta) \in Q} |k(t,s,\xi,\eta)| < \infty$. It follows that

$$e^{-M(t+s)} |(Pg_1)(t,s) - (Pg_2)(t,s)| \le \frac{C}{M^2} \|g_1 - g_2\|_B \,(1 - e^{-Mt})(1 - e^{-Ms}) \le \frac{C}{M^2} \|g_1 - g_2\|_B$$

for all $(t,s) \in D$, $g_1, g_2 \in X$. Hence

$$\|Pg_1 - Pg_2\|_B \le \frac{C}{M^2} \|g_1 - g_2\|_B, \quad g_1, g_2 \in X,$$

so $P$ is a contraction on $(X, \|\cdot\|_B)$ for $M^2 > C$. Therefore, according to the Banach Contraction Principle, $P$ has a unique


fixed point $x = x(t,s) \in X$ which is the unique solution of equation (E). In order to prove the result the reader may also use other methods, similar to those discussed in Sect. 11.1.

8. The given problem is equivalent to the following integral equation in $X = C[0,T]$

$$x(t) = x_0 + \int_0^t f(s) \, ds + \int_0^t \int_0^s k(s,\tau) x(\tau) \, d\tau \, ds, \quad t \in [0,T]. \tag{$*$}$$

Define $P : X \to X$ by

$$(Pg)(t) = x_0 + \int_0^t f(s) \, ds + \int_0^t \int_0^s k(s,\tau) g(\tau) \, d\tau \, ds, \quad t \in [0,T],\ g \in X.$$

One can show by a fixed point approach that $P$ has a unique fixed point $x \in X$, which is the unique solution of equation $(*)$, and hence of the given problem.

9. (a) The equation can be written as

$$x(t) = \cos t + \lambda c_1 \sin t + \lambda c_2 \cos t, \tag{$*$}$$

with

$$c_1 = \int_0^\pi \cos s \cdot x(s) \, ds, \qquad c_2 = -\int_0^\pi \sin s \cdot x(s) \, ds. \tag{$**$}$$

If we substitute $(*)$ into $(**)$, we obtain the following algebraic system in $c_1, c_2$:

$$c_1 - \frac{\lambda \pi}{2} c_2 = \frac{\pi}{2}, \qquad \frac{\lambda \pi}{2} c_1 + c_2 = 0.$$

Note that the determinant of this system, $1 + \lambda^2 \pi^2 / 4$, is positive for all $\lambda \in \mathbb{R}$, so there exists a unique solution $(c_1, c_2)$ which gives the solution of the given integral equation (see $(*)$)

$$x(t) = \frac{2(2 \cos t + \lambda \pi \sin t)}{4 + \lambda^2 \pi^2}.$$
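The closed form in 9(a) can be tested against the integral equation itself. From the definitions of $c_1$ and $c_2$ the kernel is $\sin t \cos s - \cos t \sin s = \sin(t - s)$, so the sketch below assumes the equation $x(t) = \cos t + \lambda \int_0^\pi \sin(t - s)\,x(s)\,ds$ and evaluates its residual with a composite Simpson rule. This is an illustrative Python check with hypothetical helper names, not part of the book's solution.

```python
import math

def x(t, lam):
    """Closed-form solution from Exercise 9(a)."""
    return (2.0 * (2.0 * math.cos(t) + lam * math.pi * math.sin(t))
            / (4.0 + lam ** 2 * math.pi ** 2))

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def residual(t, lam):
    """x(t) - cos t - lam * int_0^pi sin(t - s) x(s) ds."""
    integral = simpson(lambda s: math.sin(t - s) * x(s, lam), 0.0, math.pi)
    return x(t, lam) - math.cos(t) - lam * integral
```

The residual should be negligible for every real `lam`, in line with the observation that the determinant never vanishes.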


(b) We have $x(t) = t + \lambda c \sin t$, where $c = \int_0^{2\pi} |\pi - s| \cdot x(s) \, ds$. Substituting $x(t)$ given by the first relation into the second yields

$$c = \int_0^{2\pi} |\pi - s| \cdot (s + \lambda c \sin s) \, ds \iff c(1 - 2\lambda\pi) = \pi^3.$$

Therefore, if $\lambda = 1/(2\pi)$ the given integral equation has no solution, otherwise (i.e., for $\lambda \in \mathbb{R} \setminus \{1/(2\pi)\}$) the equation has the solution

$$x(t) = t + \frac{\lambda \pi^3}{1 - 2\lambda\pi} \sin t.$$

(c) We have

$$x(t) = f(t) + \lambda c_1 - 3\lambda c_2 t, \tag{$*$}$$

where

$$c_1 = \int_0^1 x(s) \, ds, \qquad c_2 = \int_0^1 s\, x(s) \, ds.$$

Thus we get the system

$$c_1 = \int_0^1 [f(s) + \lambda c_1 - 3\lambda c_2 s] \, ds, \qquad c_2 = \int_0^1 s [f(s) + \lambda c_1 - 3\lambda c_2 s] \, ds,$$

or

$$(1 - \lambda) c_1 + \frac{3}{2} \lambda c_2 = \int_0^1 f(s) \, ds, \qquad -\frac{\lambda}{2} c_1 + (1 + \lambda) c_2 = \int_0^1 s f(s) \, ds.$$

The determinant of this algebraic system is $\Delta = (4 - \lambda^2)/4$. So, for each $\lambda \in \mathbb{R} \setminus \{-2, +2\}$, $c_1, c_2$ can be uniquely determined and the solution of the given integral equation can be explicitly expressed by using formula $(*)$. If $\lambda = -2$ the above algebraic system has solutions if and only if

$$\int_0^1 f(s) \, ds = 3 \int_0^1 s f(s) \, ds \tag{$**$}$$

and in this case there are infinitely many solutions of the given integral equation, namely (see $(*)$),

$$x(t) = f(t) + 2 c_1 (3t - 1) - 2t \int_0^1 f(s) \, ds, \quad c_1 \in \mathbb{R}.$$


An example of a function satisfying condition $(**)$ above is $f(t) = t - 1$. If condition $(**)$ is not satisfied, then the given integral equation has no solution. If $\lambda = +2$ the compatibility condition for the above algebraic system is

$$\int_0^1 f(s) \, ds = \int_0^1 s f(s) \, ds$$

and, if this condition is satisfied (e.g., $f(t) = 3t - 1$), we have again infinitely many solutions for the given integral equation,

$$x(t) = f(t) + 2 c_1 (1 - t) - 2t \int_0^1 f(s) \, ds, \quad c_1 \in \mathbb{R}.$$

Otherwise, the given integral equation has no solution.

10. Denote

$$c = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}, \qquad g = \begin{bmatrix} f_1 \\ \vdots \\ f_n \end{bmatrix}, \qquad K = (k_{ij})_{1 \le i,j \le n}.$$

So (3) can be written as follows:

$$(I - \lambda K) c = g. \tag{$3'$}$$

There is a bijective correspondence between the solution sets of $(F)$ and $(3')$. The following alternative for equation $(3')$ is well known:

(j) if $\det(I - \lambda K) \ne 0$, then there is a unique solution of $(3')$ given by $c = (I - \lambda K)^{-1} g$, which gives the solution of $(F)$ by means of (2);

(jj) otherwise, $\det(I - \lambda K) = 0$ and equation $(3')$ has solutions $\iff$ $g$ is orthogonal to $N(I - \bar{\lambda} K^*) = N(I - \bar{\lambda} \bar{K}^T)$, in which case equation $(F)$ has infinitely many solutions,

$$x(t) = x_p(t) + \sum_{i=1}^{m} \alpha_i x_i(t),$$


where $x_p$ is a particular solution of $(F)$, $\alpha_1, \ldots, \alpha_m \in \mathbb{K}$, and $x_1, \ldots, x_m$ are independent solutions of the homogeneous integral equation (which can be calculated explicitly).

11. If $\lambda = 0$, then there is a unique solution $x = f$. From now on we consider $\lambda \in \mathbb{K} \setminus \{0\}$. Denote $H_m = \mathrm{Span}(\{e_1, \ldots, e_m\})$. From the solution to Exercise 8.9, we know that $A$ is symmetric (hence its eigenvalues are real numbers), $R(A) = H_m$ and $N(A) = H_m^\perp$. In fact, it is easy to see that the eigenvalues of $A$ are $\mu_k = k$, $k = 1, \ldots, m$, with $e_1, \ldots, e_m$ as corresponding eigenvectors. We distinguish two cases:

Case 1. $\dim H = m$, i.e., $H = H_m$. Then the given Fredholm equation is a simple algebraic system,

$$(I - \lambda A) x = f. \tag{1}$$

If $\lambda \in \mathbb{K} \setminus \{1, 1/2, \ldots, 1/m\}$, then (1) has the unique solution

$$x = \sum_{k=1}^{m} \frac{(f, e_k)}{1 - \lambda k}\, e_k.$$

If $\lambda = 1/j$ for some $j \in \{1, \ldots, m\}$, then system (1) is solvable if and only if $(f, e_j) = 0$. In this situation, there are infinitely many solutions $x$ with coordinates $x_k = j (f, e_k)/(j - k)$, $k \ne j$, and $x_j \in \mathbb{K}$ being arbitrary.

Case 2. $\dim H > m$. Of course, $H = H_m \oplus H_m^\perp$ with $H_m^\perp \ne \{0\}$. We look for $x$ of the form $x = x_1 + x_2$, $x_1 \in H_m$, $x_2 \in H_m^\perp$. Using a similar decomposition for $f$, i.e., $f = f_1 + f_2$, $f_1 \in H_m$, $f_2 \in H_m^\perp$, we derive from (1) that $x_2 = f_2$, and $(I - \lambda A) x_1 = f_1$. Using the same discussion as before, we can find $x_1$, when it exists, so we conclude that $x = x_1 + f_2$.
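Case 1 of Exercise 11 is a finite-dimensional instance of the Fredholm alternative and can be sketched in a few lines. The Python function below is an illustration written for this purpose (its name and its convention of returning `None` are choices made here): it works in the coordinates $f_k = (f, e_k)$, uses exact rational arithmetic so that the test $\lambda = 1/j$ is reliable, and picks $x_j = 0$ among the infinitely many solutions in the degenerate case.

```python
from fractions import Fraction

def solve_fredholm(lam, f):
    """Coordinates x_k of one solution of (I - lam * A) x = f, where A e_k = k e_k.

    `lam` is a Fraction and `f` lists the coordinates (f, e_k), k = 1..m.
    Returns None if no solution exists. When lam = 1/j and (f, e_j) = 0,
    the free coordinate x_j is set to 0."""
    x = []
    for k, fk in enumerate(f, start=1):
        d = 1 - lam * k
        if d != 0:
            x.append(Fraction(fk) / d)   # x_k = f_k / (1 - lam * k)
        elif fk == 0:
            x.append(Fraction(0))        # compatible: x_j is arbitrary
        else:
            return None                  # lam = 1/k but (f, e_k) != 0
    return x
```

For example, with `lam = Fraction(1, 2)` (so $j = 2$) and `f = [1, 0, 3]`, the nonzero coordinates agree with $x_k = j(f, e_k)/(j - k)$.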

12. By the Weierstrass $M$-test, we have

$$k \in C(\bar{Q}) \subset L^2(Q), \qquad Q = (0,1) \times (0,1).$$

Obviously, $A$ is self-adjoint and compact.

Case $m = 0$. In this case, $N(A) = \{0\}$. Indeed, $Ag = 0$ implies

$$0 = (Ag, g)_{L^2} = \sum_{n=0}^{\infty} \frac{1}{(n+1)^2} (g, u_n)^2_{L^2},$$

hence

$$(g, u_n)_{L^2} = 0 \ \ \forall n \in \{0, 1, 2, \ldots\} \implies g = 0,$$

since the system $\{u_n\}_{n=0}^{\infty}$ is a basis in $H = L^2(0,1)$.

In order to determine the eigenpairs of $A$ consider the equation $Ag = \mu g$, which can be written as

$$\sum_{n=0}^{\infty} \frac{(g, u_n)_{L^2}}{(n+1)^2}\, u_n = \mu \sum_{n=0}^{\infty} (g, u_n)_{L^2}\, u_n,$$

where we have used the Fourier expansion of $g$. As $\{u_n\}_{n=0}^{\infty}$ is a basis in $H$, we have

$$\Big(\mu - \frac{1}{(n+1)^2}\Big) (g, u_n)_{L^2} = 0, \quad n = 0, 1, 2, \ldots \tag{$*$}$$

If $\mu \ne \frac{1}{(n+1)^2}$ for all $n \in \{0, 1, 2, \ldots\}$ then

$$(g, u_n)_{L^2} = 0 \ \ \forall n \in \{0, 1, 2, \ldots\} \implies g = 0,$$

hence such $\mu$'s are not eigenvalues of $A$. For $\mu = \mu_n = \frac{1}{(n+1)^2}$ we have from $(*)$

$$(g, u_k)_{L^2} = 0 \quad \forall k \in \mathbb{N},\ k \ne n,$$

so the eigenfunctions corresponding to $\mu_n = \frac{1}{(n+1)^2}$ are nonzero multiples of $u_n$.

According to the Schmidt formula we have for $\lambda \in \mathbb{R} \setminus \{1, 2^2, 3^2, \ldots\}$ and a.a. $t \in (0,1)$,

$$x(t) = f(t) + 2\lambda \sum_{k=0}^{\infty} \frac{\int_0^1 f(s) \cos\big((k + 1/2)\pi s\big) \, ds}{(k+1)^2 - \lambda}\, \cos\big((k + 1/2)\pi t\big).$$

If instead $\lambda = (n+1)^2$ for some $n$ and $f \perp u_n$, the term $k = n$ is dropped from the sum and the term $\alpha \cos\big((n + 1/2)\pi t\big)$, $\alpha \in \mathbb{R}$, is added (compare the second bullet of the case $m \ge 1$ below).

Case $m \ge 1$. In this case $Y_0 := N(A) = \mathrm{Span}(\{u_0, u_1, \ldots, u_{m-1}\})$ and $H = Y_0 \oplus Y_1$, where $Y_1 = N(A)^\perp = \mathrm{Span}(\{u_m, u_{m+1}, \ldots\})$. Denote by $A_1$ the restriction of $A$ to $Y_1$, which is a Hilbert space with respect to the scalar product and norm of $H = L^2(0,1)$. Obviously, $A_1$ maps $Y_1$ to itself, being compact, self-adjoint, with $N(A_1) = \{0\}$, and with eigenvalues $\mu_n = 1/(n+1)^2$ and eigenfunctions $u_n$, $n \ge m$. In fact, $Y_1$ and $A_1$ play the roles of $H$ and $A$ we had before. The equation $x = f + \lambda A x$ can be written as

$$x_0 + x_1 = f_0 + f_1 + \lambda A x_1,$$

where $x_0, f_0 \in Y_0$ and $x_1, f_1 \in Y_1$, so $x_0 = f_0$ and

$$x_1 = f_1 + \lambda A_1 x_1. \tag{$**$}$$

Based on the above arguments, we have

• if $\lambda \ne (n+1)^2$ for all $n \ge m$ then

$$x(t) = f_0(t) + x_1(t) = f(t) + 2\lambda \sum_{k=m}^{\infty} \frac{\int_0^1 f(s) \cos\big((k + 1/2)\pi s\big) \, ds}{(k+1)^2 - \lambda}\, \cos\big((k + 1/2)\pi t\big),$$

and

• if $\lambda = (n+1)^2$ for some $n \ge m$, then the Fredholm equation $(**)$ has solutions if and only if $f_1 \perp u_n \iff f \perp u_n$, and in this case

$$x(t) = f(t) + 2(n+1)^2 \sum_{k \ge m,\ k \ne n} \frac{\int_0^1 f(s) \cos\big((k + 1/2)\pi s\big) \, ds}{(k+1)^2 - \lambda}\, \cos\big((k + 1/2)\pi t\big) + \alpha \cos\big((n + 1/2)\pi t\big), \quad \alpha \in \mathbb{R}.$$

Bibliography

[1] Adams, R. A., Sobolev Spaces, Academic Press, New York–San Francisco–London, 1975.
[2] Ambrosetti, A. and Arcoya, D., An Introduction to Nonlinear Functional Analysis and Elliptic Problems, Springer, 2011.
[3] Banţă, V., Partial Differential Equations. Collection of Problems, University of Bucharest, Bucharest, 1989 (in Romanian).
[4] Barbu, V., Partial Differential Equations and Boundary Value Problems. Mathematics and its Applications, 44, Kluwer, Dordrecht, 1988.
[5] Boyce, W. E. and DiPrima, R. C., Elementary Differential Equations and Boundary Value Problems, Second Edition, John Wiley, New York–London–Sydney–Toronto, 1969.
[6] Brezis, H., Functional Analysis, Sobolev Spaces and Partial Differential Equations, Springer, 2011.
[7] Butzer, P. L. and Berens, H., Semi-Groups of Operators and Approximation, Springer, 1967.
[8] Corduneanu, C., Principles of Differential and Integral Equations, Second Edition, Chelsea Publishing Co., Bronx, N.Y., 1977.
[9] Corduneanu, C., Integral Equations and Applications, Cambridge University Press, Cambridge, 1991.
[10] Costara, C. and Popa, D., Exercises in Functional Analysis, Kluwer, 2003.

© Springer Nature Switzerland AG 2019 G. Moroşanu, Functional Analysis for the Applied Sciences, Universitext, https://doi.org/10.1007/978-3-030-27153-4


[11] Cronin, J., Differential Equations. Introduction and Qualitative Theory, Third edition, Chapman & Hall/CRC, 2008.
[12] Engel, K.-J. and Nagel, R., One-Parameter Semigroups for Linear Evolution Equations, Graduate Texts in Math., Vol. 194, Springer-Verlag, 2000.
[13] Engel, K.-J. and Nagel, R., A Short Course on Operator Semigroups, Springer-Verlag, 2010.
[14] Evans, L. C., Partial Differential Equations, Graduate Studies in Math. 19, Amer. Math. Soc., Providence, Rhode Island, 1998.
[15] Friedman, A., Foundation of Modern Analysis, Dover, New York, 1982.
[16] Gel'fand, I. M., Lectures on Linear Algebra, Dover, 1989.
[17] Gel'fand, I. M. and Shilov, G. E., Generalized Functions. Vol. 2. Spaces of Fundamental and Generalized Functions, Academic Press, New York–London, 1968.
[18] Godunov, A. N., The Peano theorem in Banach spaces, Funct. Anal. Appl. 9 (1975), no. 1, 53–55.
[19] Goldstein, J. A., Semigroups of Operators and Applications, Oxford University Press, 1985.
[20] Hammerstein, A., Nichtlineare Integralgleichungen nebst Anwendungen, Acta Math. 54 (1930), No. 1, 117–176.
[21] Hille, E. and Phillips, R. S., Functional Analysis and Semigroups, Amer. Math. Soc. Coll. Publ., Vol. 31, Amer. Math. Soc., 1957.
[22] Hokkanen, V.-M. and Moroşanu, G., Functional Methods in Differential Equations, Chapman & Hall/CRC, 2002.
[23] Iftimie, V., Partial Differential Equations, University of Bucharest, 1980 (in Romanian).
[24] Kato, T., Remarks on pseudo-resolvents and infinitesimal generators of semi-groups, Proc. Japan Acad. 35 (1959), 467–468.
[25] Kōmura, Y., Nonlinear semi-groups in Hilbert space, J. Math. Soc. Japan 19 (1967), No. 4, 493–507.


[26] Krasnosel'skii, M. A., Topological Methods in the Theory of Nonlinear Integral Equations, Pergamon, 1964.
[27] Krasnov, M., Kissélev, A. and Makarenko, G., Équations intégrales, Mir, Moscou, 1976.
[28] Kurosh, A., Cours d'algèbre supérieure, Mir, Moscou, 1973.
[29] Lang, S., Real and Functional Analysis, Third Edition, Springer, New York, 1993.
[30] Lebedev, N. N., Special Functions and Their Applications, Revised English edition, translated and edited by R. A. Silverman, Prentice-Hall, Inc., Englewood Cliffs, N.J., 1965.
[31] Lions, J. L. and Magenes, E., Problèmes aux limites non homogènes et applications, Vol. 1, Dunod, Paris, 1968.
[32] Mărcuş, A., Introduction to Mathematical Logic and Set Theory, Course Notes, 2017 (in Romanian).
[33] Marsden, J. E. and Hoffman, M. J., Elementary Classical Analysis, Second edition, W. H. Freeman & Co., New York, 1993.
[34] Micu, S. and Zuazua, E., An introduction to the controllability of partial differential equations, in "Quelques questions de théorie du contrôle", T. Sari, ed., Collection Travaux en Cours, Hermann, 2004, pp. 69–157.
[35] Milman, V., An Introduction to Functional Analysis, World, 1999.
[36] Moroşanu, G., Nonlinear Evolution Equations and Applications, D. Reidel, Dordrecht–Boston–Lancaster–Tokyo, 1988.
[37] Moroşanu, G., Elements of Linear Algebra and Analytic Geometry, Matrix Rom, Bucharest, 2000 (in Romanian).
[38] Natanson, I. P., Theory of Functions of a Real Variable, Editura Tehnică, Bucharest, 1957 (in Romanian).
[39] Pazy, A., Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer-Verlag, 1983.
[40] Popa, E., Collection of Functional Analysis Problems, Editura Didactică şi Pedagogică, Bucharest, 1981 (in Romanian).


[41] Rosenlicht, M., Introduction to Analysis, Dover, New York, 1968.
[42] Rudin, W., Principles of Mathematical Analysis, Third Edition, McGraw-Hill, 1976.
[43] Schechter, M., Principles of Functional Analysis, Academic Press, New York–London, 1971.
[44] Shilov, G. Ye., Mathematical Analysis. A Special Course, Pergamon Press, Oxford–New York–Paris, 1965.
[45] Showalter, R. E., Monotone Operators in Banach Space and Nonlinear Partial Differential Equations, Math. Surveys and Monographs, Vol. 49, Amer. Math. Soc., 1997.
[46] Stein, E. M. and Shakarchi, R., Real Analysis. Measure Theory, Integration, and Hilbert Spaces, Princeton University Press, Princeton and Oxford, 2005.
[47] Stroock, D. W., Weyl's lemma, one of many, Groups and Analysis, 164–173, London Math. Soc. Lecture Notes Ser., 354, Cambridge Univ. Press, Cambridge, 2008.
[48] Trotter, H. F., Approximation of semi-groups of operators, Pacific J. Math. 8 (1958), 887–919.
[49] Vrabie, I. I., Semigroups of Linear Operators and Applications, Editura Universităţii "Alexandru Ioan Cuza", Iaşi, 2001 (in Romanian).
[50] Wheeden, R. L. and Zygmund, A., Measure and Integral. An Introduction to Real Analysis, Marcel Dekker, Inc., 1977.
[51] Yosida, K., Functional Analysis, Third Edition, Springer, 1971.
[52] Zeidler, E., Applied Functional Analysis. Applications to Mathematical Physics, Appl. Math. Sci. 108, Springer-Verlag, 1995.
[53] Zeidler, E., Applied Functional Analysis. Main Principles and Their Applications, Appl. Math. Sci. 109, Springer-Verlag, 1995.