Math 6580
One must be sure that one has enabled science to make a great advance if one is to burden it with many new terms and require that readers follow research that offers them so much that is strange. A. L. Cauchy
Linear Algebra, Infinite Dimensional Spaces, and MAPLE

PREFACE

These notes evolved in the process of teaching a beginning graduate course in Hilbert Spaces. The first edition was simply personal, handwritten notes prepared for lecture. A copy was typed and given to the students, for it seemed appropriate that they should have a statement of theorems, examples, and assignments. With each teaching of the course, the notes grew. A text for the course has always been announced, but purchase of the text has been optional. A text provides additional reading, alternate perspectives, and a source of exercises. Whatever text was used, it was chosen with the intent of the course in mind.

This one quarter course was designed for science and engineering students. Often, as graduate students in the sciences and engineering mature, they discover that the literature they are reading makes references to function spaces, to notions of convergence, and to approximations in unexpected norms. The problem they face is how to learn about these ideas without investing years of work which might carry them far from their science and engineering studies. This course is an attempt to provide a way to understand the ideas without the students already having the mathematical maturity that a good undergraduate analysis course could provide. An advantage for the instructor of this course is that the students understand that they need to know this subject.

The course does not develop the integration theory and notion of a measure that one should properly understand in order to discuss L2[0,1]. Yet, these notes suggest examples in that space. While this causes some students to feel uneasy with their unsophisticated background, most have enough intuition about integration that they understand the nature of the examples.
The success of the course is indicated by the fact that science and engineering students often choose to come back the next term for a course in real variables and in functional analysis. The course has been provocative. Even those who do not continue with more graduate mathematics seem to feel it serves them well and provides an opening for future conversations in dynamics, control, and analysis.

In this most current revision of the notes, syntax for MAPLE has been added. Many sophomores leave the calculus thinking that the computer algebra systems are teaching tools because of the systems' abilities to graph, to take derivatives, and to solve text-book differential equations. We hope, before they graduate, students will find that these systems really are a "way for doing mathematics." They provide a tool for arithmetic, for solving equations, for numerical simulations, for drawing graphs, and more. It's all in one program! These computer algebra systems will move up the curriculum.
In choosing MAPLE, I asked for a computer algebra system which is inexpensive for the students, which runs on a small platform, and which has an intuitive syntax. Also, MATLAB will run Maple syntax. The syntax given in these notes is not always the most efficient one for writing the code. I believe that it has the advantage of being intuitive. One hopes the student will see the code and say, "I understand that. I can do it, too." Better yet, the student may say, "I can write better code!" It is most important to remember that these notes are about linear operators on Hilbert Spaces. The notes, the syntax, and the presentation should not interfere with that subject.

The idea in these notes that has served the students best is the presentation of a paradigm for a linear operator on a Hilbert Space. That paradigm is rich enough to include all compact operators. For the student who is in the process of studying for various exams, it provides a system for thinking of a linear function that might have a particular property. For the student who continues in the study of graduate mathematics, it is a proper place to step off into a study of the spectral representations for linear operators.

To my colleagues, Neil Calkin and Eric Bussian, I say: The greatest compliment you can give a writer is to read what he has written. To my students: These notes are better because you read them and gently suggested changes. I am grateful to the students through the years who have provided suggestions, corrections, examples, and who have demanded answers for the assignments. In time.... In time....
James V. Herod School of Mathematics Georgia Tech Atlanta 30332-0160 Summer, 1997
Table of Contents

Section 1: A Decomposition for Matrices
Section 2: Exp(tA)
Section 3: Self Adjoint Transformations in Inner-Product Spaces
Section 4: The Gerschgorin Circle Theorem
Section 5: Convergence
Section 6: Orthogonality and Closest Point Projections
Section 7: Orthogonal, Nonexpansive, & Self-Adjoint Projections
Section 8: Orthonormal Vectors
Section 9: The Finite Dimensional Paradigm
Section 10: Bounded, Linear Maps from E to C
Section 11: Applications to Differential and Integral Equations
Section 12: The Simple Paradigm for Linear Mappings from E to E
Section 13: Adjoint Operators
Section 14: Compact Sets
Section 15: Compact Operators
Section 16: The Space of Bounded Linear Transformations
Section 17: The Eigenvalue Problem
Section 18: Normal Operators and The More General Paradigm
Section 19: Compact Operators and Orthonormal Families
Section 20: The Most General Paradigm: A Characterization of Compact Operators
Section 21: The Fredholm Alternative Theorems
Section 22: Closed Operators
Section 23: Deficiency Index
Section 24: An Application: A Problem in Control
Section 25: An Application: Minimization with Constraints
Section 26: A Reproducing Kernel Hilbert Space
Linear Algebra, Infinite Dimensional Spaces, and MAPLE This course will be chiefly concerned with linear operators on Hilbert Spaces. We intend to present a model, a paradigm, for how a linear transformation on an inner-product space might be constructed. This paradigm will not model all such linear mappings. To model them all would require an understanding of measures and integration beyond what a beginning science or engineering student might know. To set the pattern for this paradigm, we first recall some linear algebra. We recall, review, and re-examine the finite dimensional setting. In that setting, there is the Jordan Canonical Form. We present this decomposition for matrices in an alternate view from the traditional one. The advantage to the representation presented here is conceptual. It sets a pattern in concepts, instead of in "form." We think of projections and eigenvalues. And, we must turn to nilpotent matrices when projections and eigenvalues are not enough. This is how we begin.
Section 1: A Decomposition for Matrices

Definition A projection is a transformation P from E → E such that P2 = P. Note that some texts also require that P should be non-expansive in the sense that |Px| ≤ |x|. An example of a projection that is not nonexpansive is P(x,y) = (x+y, 0).

Definition A nilpotent transformation is a linear transformation N from E → E for which there is an integer k such that Nk = 0.

Examples

    [ 3  0 -2 ]                                       [ -1  1  1 ]
    [-1  1  1 ]  is a projection from R3 to R3, and   [  0  0  0 ]
    [ 3  0 -2 ]                                       [ -1  1  1 ]

is a nilpotent function from R3 to R3.
Theorem 1 (Spectral Resolution for A) If A is a linear function from Rn to Rn then there are sequences {λi}i=1..k, {Pi}i=1..k, and {Ni}i=1..k such that each λi is a number and
(a) Pi is a projection,
(b) Ni is nilpotent,
(c) Pi Pj = 0 if i ≠ j,
(d) Ni Pj = 0 if i ≠ j,
(e) Ni Pi = Ni,
(f) I = ∑i=1..k Pi, and
(g) A = ∑i=1..k [λi Pi + Ni].
Outline of Proof: We assume the Cayley-Hamilton Theorem, which states that if A is an nxn matrix and D(λ) is the polynomial in λ defined by D(λ) = det(λI-A), then the polynomial in A obtained by substituting A for λ satisfies D(A) = 0. To construct the proof for the theorem, first factor D(λ) as

    D(λ) = ∏p=1..k (λ-λp)^mp,

where the λp's are the zeros of D with multiplicity mp. Now form a partial fraction decomposition of 1/D(λ): Let a1, a2, ..., ak be functions such that

    1/D(λ) = ∑p=1..k ap(λ)/(λ-λp)^mp.

Two Examples

    A1 = [ -1 -2 ]   and   A2 = [ 1 0 0 ]
         [  3  4 ]              [ 0 2 1 ]
                                [ 0 0 2 ]

    D1(λ) = (λ-1)(λ-2) and D2(λ) = (λ-1)(λ-2)2

    1/[(λ-1)(λ-2)] = -1/(λ-1) + 1/(λ-2)

    1/[(λ-1)(λ-2)2] = 1/(λ-1) + (-λ+3)/(λ-2)2
If qp(λ) = ∏i≠p (λ-λi)^mi, then

    1 = ∑p=1..k ap(λ) qp(λ),

so that

    I = ∑p=1..k ap(A) qp(A).
Two Examples Continued

    1 = -1⋅(λ-2) + (λ-1) = (2-λ) + (λ-1)
    1 = (λ-2)2 + (-λ+3)(λ-1)

    I = [  3  2 ] + [ -2 -2 ]   and   I = [ 1 0 0 ] + [ 0 0 0 ]
        [ -3 -2 ]   [  3  3 ]             [ 0 0 0 ]   [ 0 1 0 ]
                                          [ 0 0 0 ]   [ 0 0 1 ].

Claim 1: Using the Cayley-Hamilton Theorem, if Pj = aj(A) qj(A), then Pi Pj = 0 for i ≠ j.
Claim 2: Pj is a projection since

    Pj = Pj ⋅ I = ∑i=1..k Pj Pi = Pj2.

Claim 3: By the Cayley-Hamilton Theorem, if Ni = ai(A) qi(A)(A-λiI) = Pi(A-λiI), then Ni^mi = 0.
Claim 4: Ni Pi = Ni and Ni Pj = 0 if i ≠ j. To see this, note that Ni Pj = Pj Ni = Pj Pi (A-λiI) = 0 and Ni Pi = Pi Pi (A-λiI) = Ni.

Finally, since I = ∑i Pi, then

    A = ∑i Pi A = ∑i Pi (λiI + A - λiI) = ∑i [λi Pi + Ni].

Two Examples Finished

    [ -1 -2 ] = 1 [  3  2 ] + 2 [ -2 -2 ]
    [  3  4 ]     [ -3 -2 ]     [  3  3 ]

and

    [ 1 0 0 ]     [ 1 0 0 ]     [ 0 0 0 ]   [ 0 0 0 ]
    [ 0 2 1 ] = 1 [ 0 0 0 ] + 2 [ 0 1 0 ] + [ 0 0 1 ].
    [ 0 0 2 ]     [ 0 0 0 ]     [ 0 0 1 ]   [ 0 0 0 ]

Remarks
(1) The sequence {λi}i=1..k is the sequence of eigenvalues. If x is in Rn and

    vi = Pi Ni^(mi-1)(x),

then vi is an eigenvector for A (provided vi ≠ 0) in the sense that λi vi = A vi.

(2) For the nilpotent part, mi ≤ n. In fact, Ni^n must be zero for each i.

Assignment Get the spectral resolution (or Jordan Canonical Form) for the matrices:

    [ 0 1 ]   [ -3  2 ]   [ 0 -1 ]        [ -5  1  3 ]
    [ 1 0 ] , [  1 -2 ] , [ 1  2 ] , and  [  1 -2 -1 ] .
                                          [ -4  1  2 ]
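MAPLE Remark: MAPLE can check the computations of this section. The syntax below is only a sketch: it assumes the linalg package, whose matrix, eigenvals, and jordan commands build a matrix, list its eigenvalues, and return a Jordan Canonical Form. Applied to the first example matrix of this section, it should report the eigenvalues 1 and 2.

```maple
> with(linalg):
> A1 := matrix(2,2,[-1,-2,3,4]);
> eigenvals(A1);
> jordan(A1);
```

The same three commands apply to each matrix in the assignment above.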
Section 2: Exp(tA)

Often in differential equations -- both in ordinary and partial differential equations -- one can conceive of the problem as having this form

    Z′ = AZ , Z(0) = c.

If one can make sense of exp(tA), then surely this should be the solution to a differential equation of the form suggested. Finite dimensional linear systems beg for such an understanding. More importantly, this understanding gives direction for the analysis of stability questions in linear, and nonlinear, differential equations. Here is a review of the linear, finite dimensional case.

Theorem 2 If P is a projection, t is a number, and x is in {E, <⋅,⋅>}, then the sequence

    S(n) = ∑i=1..n (ti Pi/i!)(x) = ∑i=1..n (ti/i!) P(x)

converges in E.

Recall that whatever norm is used for Rn, if A is an nxn matrix, then there is a number B such that if x is in Rn, then |Ax| ≤ B|x|. Moreover, the least such B is denoted ||A||.

Definition exp(tA) = ∑i=0..∞ ti Ai/i!.

Corollary 3 If P is a projection, exp(tP)(x) = (1-P)(x) + et P(x).
Corollary 4 Suppose that P and Q are projections, PQ = QP = 0, and t is a number. Then exp(tP+tQ) = et P + et Q + (1-P-Q).

Suggestion of Proof. With the suppositions for P and Q, P+Q is also a projection. Thus, the previous result applies.

Observation If N is nilpotent of order m then exp(tN) = ∑i=0..m-1 ti Ni/i!.

Theorem 5 If A is a linear transformation from Rn → Rn and

    A = ∑i=1..k [λi Pi + Ni]

is as in Theorem 1, then

    exp(tA) = ∑i exp(λi t) Pi ( ∑j=0..mi-1 tj Nij/j! ).

Suggestion of Proof. Suppose that λ is a number. Then exp(tA) = exp(λt) exp(t[A-λI]). Suppose that A = ∑i [λi Pi + Ni]. Then

    Pi exp(tA) = exp(λi t) Pi ∑p=0..∞ (tp/p!) (A-λiI)p
               = exp(λi t) Pi ∑p=0..∞ (tp/p!) Pi⋅(A-λiI)p   since Pi = Pi2
               = exp(λi t) Pi ∑p=0..∞ (tp/p!) Nip.

Thus,

    exp(tA) = 1⋅exp(tA) = ∑i Pi exp(tA) = ∑i exp(λi t) Pi ∑p=0..mi-1 (tp/p!) Nip.
Assignment Solve Z′ = AZ, Z(0) = c, where A is any of the matrices in the previous assignments.
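MAPLE Remark: MAPLE's linalg package has a command for the matrix exponential of this section. The syntax below is a suggestion (exponential(A, t) computes exp(tA)); with its output, Z(t) = exp(tA) c solves Z′ = AZ, Z(0) = c.

```maple
> with(linalg):
> A := matrix(2,2,[0,1,1,0]);
> exponential(A, t);
```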
Section 3: Self Adjoint Transformations in Inner-Product Spaces

In these notes, we work in an inner-product space. Often, we consider the space E as a collection of vectors over the complex (or real) numbers. We take an inner product, or dot-product, to be defined on ExE and assume the inner product to have these properties:

    < x, y > = < y, x >*,
    < x+αy, z > = < x, z > + α < y, z >, and
    < x, x > > 0 if x ≠ 0.

Examples: Some inner-product spaces every young analyst should know are denoted Rn, R∞ = ∪n Rn, L2, and L2[0,1]. We take Rn+1 ⊃ Rn in a canonical way. Every inner-product space is a normed space, for take the norm to be defined by |x| = √<x,x>. One should note that not every normed space arises from an inner-product space. In fact, a characterization of those norms which arise from an inner product is that they satisfy the parallelogram identity. The difference between a normed space and an inner-product space is an important notion. It would be well to have some examples of inner-product spaces and normed spaces to build intuition. We present examples in class.

The best known inequality in an inner product space is the Cauchy-Schwarz inequality.

Theorem 6: If x and y are in E then |< x, y >| ≤ |x| |y|.

Hint for proof: Consider h(t) = x + t <x,y> y and |h(t)|2.

Linear transformations that are self adjoint have a simple structure. We begin with these.

Definition In an inner product space {E, <⋅,⋅>}, the linear transformation A on E is self adjoint if <Ax,y> = <x,Ay> for all x and y in the domain of A.

Examples: Here are three examples to remind you that the notion of "self-adjoint" is pervasive.

(a) If A = [ 0 1 ]
           [ 1 0 ],
then A is self-adjoint in R2 with the usual dot product.

(b) If K(f)(x) = ∫0..1 cos(x-y) f(y) dy, then K is self-adjoint on L2[0,1] with the usual dot product.

(c) If A(f) = (pf′)′ + qf, with p and q smooth, then A is self-adjoint on the collection of functions in L2[0,1] having two derivatives and that are zero at 0 and at 1. A useful identity to understand this example is

    (pf′)′g − f(pg′)′ = [ p(f′g − fg′) ]′.

Definition: Numbers λ and nonzero vectors v are eigenvalues and eigenvectors, respectively, for the linear transformation A if Av = λv.

Theorem 7: If A is self-adjoint then all eigenvalues are real and eigenvectors corresponding to different eigenvalues are orthogonal.

Suggestion of Proof. Suppose that x ≠ 0 and that Ax = λx. We hope to show that λ = λ*:

    λ<x,x> = <Ax,x> = <x,Ax> = λ*<x,x>.

Finally, if Ax = λx, Ay = µy, and λ ≠ µ, then

    (λ-µ)<x,y> = <Ax,y> - <x,Ay> = 0,

so that <x,y> = 0.

Corollary 8: In the decomposition of Theorem 1, if A is self adjoint then each Ni = 0.

Suggestion of a proof. Let m be the smallest integer such that Nm = 0 and suppose that m ≠ 1. Use the fact that each N is a polynomial in A to see that N is self adjoint if A is. Consequently, for each x in E,

    <Nm-1x, Nm-1x> = <Nm-2x, Nmx> = 0,

so that Nm-1 = 0, contradicting the minimality of m.

Assignment
(3.1) Give a point in L2 that is not in R∞.
(3.2) Construct a proof of the Cauchy-Schwarz inequality in a general inner product space.
(3.3) Show that
    [ -.5  .5 ]
    [  .5 -.5 ]
is self adjoint with the usual dot product. Find its Jordan Form. Repeat this for
    [ -.5  .5  0 ]
    [  .5 -.5  0 ] .
    [  0   0   0 ]
(3.4) Suppose that A is self-adjoint and has spectral resolution given by
    A = ∑i=1..k λi Pi.

Suppose also that λ1 = 0 and all other λ's are not zero. Recalling that if e is in E then

    e = ∑i=1..k Pi(e),

show that P1(e) is in the nullspace of A and that ∑i=2..k Pi(e) is in the range of A. In fact, find u such that

    A(u) = ∑i=2..k Pi(e).
(3.5) Find the eigenvalues and eigenvectors for the matrices in the Assignment of Section 1. Relate this to the Jordan Form for self adjoint matrices.

(3.6) Verify that

    [ 1 ]   [  1 ]       [ 0 ]
    [ 1 ] , [ -1 ] , and [ 0 ]
    [ 0 ]   [  0 ]       [ 1 ]

are orthogonal. Create A such that these three vectors are eigenvectors and 0, -1, and 2 are eigenvalues for A. (Hint: Ax = 0⋅<x,v1> v1/|v1|2 − 1⋅<x,v2> v2/|v2|2 + 2⋅<x,v3> v3/|v3|2.)

(3.7) Let

    K[f](x) = ∫0..1 cos(π(x-y)) f(y) dy

on L2[0,1]. Find the eigenvalues and eigenfunctions for K. Show that the eigenfunctions are orthogonal.
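MAPLE Remark: For problem (3.5), MAPLE will produce eigenvalues and eigenvectors directly, and for (3.7) it will do the integrals. The syntax below is a suggestion using the linalg package: eigenvects returns each eigenvalue together with its multiplicity and a basis of eigenvectors, and the integral, which should simplify to sin(πx)/2, exhibits sin(πy) as an eigenfunction candidate for the kernel of (3.7) with eigenvalue 1/2.

```maple
> with(linalg):
> A := matrix(2,2,[-1/2,1/2,1/2,-1/2]);
> eigenvects(A);
> int(cos(Pi*(x-y))*sin(Pi*y), y=0..1);
```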
Section 4: The Gerschgorin Circle Theorem

If all the eigenvalues of A are in the left-half plane -- have negative real part -- then solutions for Z′ = AZ collapse to zero. If at least one eigenvalue is in the right-half plane, then some solution grows without bound. Consequently, it is very important to know where the eigenvalues of A are, even if the matrix is such a mess that the eigenvalues are hard to find. You can guess that people have turned their attention to locating eigenvalues. The next theorem is a classical result. More recent results are included in this section as a cited reference.

Theorem 9 (Gerschgorin's Circle Theorem) If A is a matrix and

    S = ∪m { z : |Amm - z| ≤ ∑j≠m |Ajm| },

then every eigenvalue of A lies in S.

Suggestion for Proof.
Lemma: If A is a matrix and |Amm| > ∑i≠m |Aim| for all m, then A is invertible. Here's how to see this: Suppose A is not invertible. Thus 0 = det(A) = det(AT). Let x be such that ATx = 0 and x ≠ 0. Let m be such that |xm| = maxi |xi|. Then

    0 = (ATx)m = ∑i ATmi xi.

Or,

    Amm xm = − ∑i≠m ATmi xi = − ∑i≠m Aim xi.

Hence,

    |Amm||xm| = | ∑i≠m Aim xi | ≤ |xm| ∑i≠m |Aim|,

and, since xm ≠ 0, this contradicts the hypothesis.

Proof of Theorem 9: Suppose that λ is an eigenvalue of A not in S. Then, for each m,

    |λ-Amm| > ∑i≠m |Aim|.

Thus, by the Lemma applied to λI-A, det(λI-A) ≠ 0 and λ can not be an eigenvalue. This is a contradiction.
Remark: It is easy to see that the Gerschgorin Circle Theorem might be important for determining the behavior of systems of differential equations. For example, as noted in the beginning of this section, if all eigenvalues of the matrix A are in the left half of the complex plane, then solutions of the differential equation Z′ = AZ have asymptotic limit zero. As a result of the importance of these ideas, considerable work has been done to improve the bounds. One improvement can be found in the following:

David G. Feingold and Richard S. Varga, Block Diagonally Dominant Matrices and Generalizations of the Gerschgorin Circle Theorem, Pacific J. Math., 12 (1962) 1241-1249. (QA1.P3.)

In that paper, the authors consider block matrices down the diagonal and state the inequalities in terms of a matrix norm.

Definition: If A is a matrix on Rn with norm |x| = √(∑i xi2), then the 2-norm of A is defined by

    ||A||2 = sup { |Ax|/|x| : x ≠ 0 }.

Matrix norms: If the norm for the space is given by |x|1 = ∑i=1..n |xi|, then ||A||1 = maxp ∑i=1..n |Ai,p| -- the maximum column sum. If the norm for the space is given by |x|2 = √(∑i=1..n |xi|2), then ||A||2 = the square root of the maximum eigenvalue of the matrix A⋅(transpose A). If the norm for the space is given by |x|∞ = maxi |xi|, then ||A||∞ = maxp ∑i=1..n |Ap,i| -- the maximum row sum.

Example: Consider the matrix

    A = [  4 -2 -1  0 ]
        [ -2  4  0 -1 ]
        [ -1  0  4 -2 ]
        [  0 -1 -2  4 ].

The result presented in these notes establishes that all eigenvalues lie in the disk |λ - 4| ≤ 3. The improved result in the reference cited above shows that all eigenvalues lie in the disks |λ - 6| ≤ 1 and |λ - 2| ≤ 1. Eigenvalues are actually 1, 3, 5, and 7.

Assignment
(4.1) Prove Theorem 9 for row sums instead of column sums.
(4.2) Let

    A = [ 2 0 1 ]
        [ 0 1 0 ]
        [ 1 0 2 ].

Show that the eigenvalues λ of A satisfy |λ-2| ≤ 1 or λ = 1. Give the eigenvalues for A. Get the Jordan form for A.
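MAPLE Remark: The Gerschgorin radii are simple row (or column) sums, so MAPLE can locate the disks for the 4x4 example above. The sketch below (it assumes the linalg package) prints, for each m, the center A[m,m] and the off-diagonal row sum, and then the eigenvalues for comparison; every center here is 4 and every radius is 3, giving the disk |λ - 4| ≤ 3.

```maple
> with(linalg):
> A := matrix(4,4,[4,-2,-1,0, -2,4,0,-1, -1,0,4,-2, 0,-1,-2,4]);
> seq([A[m,m], add(abs(A[m,j]), j=1..4) - abs(A[m,m])], m=1..4);
> eigenvals(A);
```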
Section 5: Convergence

In many applications of mathematics, one can only make approximations. If you thought of making successive approximations, you would hope to be making a sequence which converges. The question is, in what sense would the approximation be good? In what sense would the sequence of approximations get close? This section sets a framework for understanding these questions by considering various forms of convergence.

Suppose that {E, <⋅,⋅>} is an inner product space of functions on an interval. We will discuss four methods of convergence: strong, weak, pointwise, and uniform.

Definition A sequence of points {xp} converges to y strongly if limp→∞ |xp - y| = 0. A sequence of points {xp} converges to y weakly if, for each h in E, limp→∞ < xp - y, h > = 0. A sequence of functions {fp} converges to g pointwise on C if for each x in C, limp→∞ fp(x) = g(x). A sequence of functions {fp} converges to g uniformly on C if limp→∞ max |fp(x) - g(x)| = 0.

Examples
(1) Strong convergence implies weak convergence, but not conversely.
Strong convergence ⇒ Weak convergence: Suppose h is in E and {xp} converges to y strongly. By the Cauchy-Schwarz inequality,

    |< xp - y, h >| ≤ |xp - y| |h|.

To see that weak convergence does not imply strong convergence, we need an example. One example comes from experiences with Fourier Series. Suppose h ∈ L2[0,1]. There is a sequence ap with limp→∞ ap = 0 so that

    h(x) = ∑p=1..∞ ap sin(pπx).

Thus, the claim is that φp defined by φp(x) = sin(pπx) converges weakly to zero but not strongly. We have that for each h,

    < φp - 0, h > = ap/2 → 0,

and

    |φp - 0|2 = ∫0..1 sin2(pπx) dx = ∫0..1 (1-cos(2pπx))/2 dx = 1/2.
(2) Uniform convergence implies pointwise convergence, but not conversely.
That uniform convergence implies pointwise convergence is not so hard. To be assured that the implication does not go the other way, consider fn(x) = nxe-nx and note fn(1/n) = 1/e. Figure 5.1 is an illustrative graph. It shows this sequence converging pointwise, but not uniformly.

    [Figure 5.1: graphs of fn(x) = nxe-nx for several n]
It's time to say what a Banach space is and what a Hilbert space is.

Definitions: A sequence {xp} is a Cauchy sequence if limm,n→∞ |xm - xn| = 0. A space is complete if every Cauchy sequence has a limit in the space. A Banach space is a complete, normed linear space. A Hilbert space is a complete, inner product space.

Another notion that is critical in the remainder of these notes and is likely in the vocabulary of the reader already is that of a closed set. A closed subset of a Hilbert space or Banach space is a set with the property that if {xp} is a sequence with values in the set and limp xp = y, then y is also in the set. There are alternate characterizations: closed sets are sets that contain all their boundary points, and are sets whose complements are open.

Assignment:
(5.1) Show that if |xn| → |L| and xn → L weakly, then xn → L strongly. (Hint: Show that |xn - L|2 → |L|2 - 2|L|2 + |L|2.)
(5.2) Show that uniform convergence on [0,1] implies strong convergence in L2[0,1], but not conversely.
(5.3) Show that pointwise convergence on [0,1] does not imply strong convergence in L2[0,1] and that strong convergence in L2[0,1] does not imply pointwise convergence on [0,1].

(5.4) Discuss the nature of the convergence of

    ∑n=1..∞ (-1)n+1 sin(nx)/n

on the interval [0,2π]. (Hint: here is the graph of the partial sum ∑n=1..4 (-1)n+1 sin(nx)/n.

    [Figure: graph of the partial sum on [0,2π], oscillating between about -1.5 and 1.5] )

(5.5) Suppose that limp xp = y and limp up = v (strongly). (a) Show that limp < xp, up > = < y, v >. (b) Show that limp |xp| = |y|. (c) Let {φp} be an infinite maximal orthonormal sequence, xp = φp, and up = φp/2. Show that the weak limit of xp and of up is zero, but limp < xp, up > ≠ 0.

MAPLE Remark: Graphs of the first four functions in a sequence that converges pointwise, but not uniformly, are made with MAPLE in a rather intuitive way.

> f1:=x->x*exp(-x); f2:=x->2*x*exp(-2*x);
  f3:=x->3*x*exp(-3*x); f4:=x->4*x*exp(-4*x);
> plot({f1(x),f2(x),f3(x),f4(x)},x=0..6);
You might prefer this syntax and graph. > f:=(n,x)->n*x*exp(-n*x); > plot3d(f(n,x),n=1..4,x=0..6,axes=NORMAL,orientation=[-10,50]);
The space C[-1,1] of continuous functions in L2[-1,1] is an example of an inner-product space that is not complete. When asked to show it is not complete, a novice might draw a sequence of functions in a manner suggested with this syntax:

> plot({[x,1-2*x,x=0..1/2],[x,1-3*x,x=0..1/3],[x,1-4*x,x=0..1/4],
  [x,1+2*x,x=-1/2..0],[x,1+3*x,x=-1/3..0],[x,1+4*x,x=-1/4..0]},
  x=-1..1);

This is not a convincing picture, for L2[-1,1] is more complicated than suggested by that picture. To see this, establish that the sequence suggested by that picture converges to zero by seeing that

    ∫-1..1 [fn(x) - 0]2 dx = 2 ∫0..1/n (1-nx)2 dx = 2/(3n).

> int((1-n*x)^2,x=0..1/n);

This integral goes to zero as n increases. How does one make an example of a sequence in C[-1,1] that converges but has a limit not in C[-1,1]? Here's a suggestion.

> plot( { [x,Pi*(1-x)/2,x=0..1], [x,-Pi*(1+x)/2,x=-1..0],
  [x,sum( sin(p*Pi*x)/p, p=1..5 ),x=-1..1]});

That graph suggests a good idea for an example. One must consider whether

    ∑p sin(pπx)/p

converges in C[-1,1]. This brings up integrating the sum of the squares of the terms. Note the calculus here:

> simplify(int( sin(p*Pi*x)*sin(q*Pi*x), x = -1..1));

This is zero if p and q are different integers. On the other hand,

> subs(sin(p*Pi)=0, int( (sin(p*Pi*x)/p)^2 , x=-1..1));

We are ready to see the series is Cauchy. We see that the limit of the series is the odd extension of π(1-x)/2.

> int((Pi*(1-x)/2)^2,x=0..1)
  - 2*sum(int(Pi*(1-x)*sin(p*Pi*x)/(2*p),x=0..1), p=1..n)
  + sum(int( (sin(p*Pi*x)/p )^2,x=0..1),p=1..n);
> limit(subs(sin(p*Pi)=0,"),n=infinity);

This limit is zero and we have that the sequence has limit not in C[-1,1].
Section 6: Orthogonality and Closest Point Projections

We now will study the notions of orthogonality and orthogonal projections. This will expand our notion of what a projection is. Critical in making up "closest point" projections is that the range of the projection should be convex. We examine this geometric notion in this section. The identity of part 1 of the next theorem is called the Pythagorean Identity and the identity of part 2 is called the Parallelogram Identity.

Theorem 10 (1) If <x, y> = 0, then |x+y|2 = |x|2 + |y|2. (2) If x and y are in E, then |x+y|2 + |x-y|2 = 2|x|2 + 2|y|2.

What follows is a review of the notion of convexity for use in the exploration of projections.

Definition A set C is convex if for all x and y in C and all numbers t in [0,1], tx + (1-t)y is in C.

REMARK: The next result gives a characterization of the closest point projection in terms of the norm and also in terms of the dot product.

Theorem 11 If C is a closed, convex subset of the Hilbert space {E, <⋅,⋅>}, x0 ∈ E, and δ = inf{ |x0 - c| : c ∈ C }, then there is one and only one y0 in C such that |x0 - y0| = δ. Moreover, these are equivalent: (a) z is in C and Re<x0-z, c-z> ≤ 0 for all c in C, and (b) z = y0.

Outline of Proof: Suppose that C is a closed convex set and that x0 is in E. Let δ = inf{ |x0-y| : y ∈ C }. First we show there is a point y0 in C such that |x0 - y0| = δ. To do this, define a sequence {up} in C by |up - x0| < δ + 1/p. Note that {up}p=1..∞ is a Cauchy sequence, for

    |un - um|2 = |(un-x0) - (um-x0)|2 = 2|un-x0|2 + 2|um-x0|2 - 4|(un+um)/2 - x0|2

by the parallelogram identity. Because C is convex, (un+um)/2 is in C, and this last line does not exceed

    2|un-x0|2 + 2|um-x0|2 - 4δ2,

which goes to zero as n and m go to infinity. Hence, the sequence {up} converges strongly and has limit some point y0 in E since E is complete. It is in C since C is closed.

Now we show there is only one closest point to x0 in C. Suppose that |x0 - z| = δ = |x0 - y0|.
Then

    |y0 - z|2 = |(y0-x0) - (z-x0)|2 = 2|y0-x0|2 + 2|z-x0|2 - 4|(y0+z)/2 - x0|2 ≤ 2δ2 + 2δ2 - 4δ2 = 0,

since (y0+z)/2 is in C, so that |(y0+z)/2 - x0| ≥ δ. Thus, y0 = z.
What follows is an understanding of the characterization of the closest point using the inner product. These are equivalent: (a) z is a point in C and Re<x0-z, c-z> ≤ 0 for all c in C, and (b) z is y0.

a ⇒ b: Suppose that z has property (a). We hope to show that z = y0. Because of the uniqueness of this closest point, we would be content to show that |x0-z| = |x0-y0|. We know that |x0-y0| ≤ |x0-z|. Also,

    0 ≤ |x0-z|2 - |x0-y0|2
      = |x0-z|2 - |(x0-z)+(z-y0)|2
      = |x0-z|2 - [ |x0-z|2 - 2 Re<x0-z, y0-z> + |z-y0|2 ]
      = 2 Re<x0-z, y0-z> - |z-y0|2 ≤ 0.

Therefore, |x0-z|2 = |x0-y0|2.

Now we show that b ⇒ a by showing that y0 has this dot product characterization. Recall that the set C is convex, so that if c is in C then tc+(1-t)y0 is also in C. The first inequality below holds because y0 is the closest point in C to x0:

    0 ≥ |x0-y0|2 - |x0-[tc+(1-t)y0]|2
      = |x0-y0|2 - |[x0-y0] + t(y0-c)|2
      = |x0-y0|2 − |x0-y0|2 + 2t Re<x0-y0, c-y0> - t2 |y0-c|2

for all 0 < t ≤ 1. Divide by t and let t → 0 to get Re<x0-y0, c-y0> ≤ 0.

REMARK. This result leads to some irresistible questions and observations:
(1) In a Banach Space that is not a Hilbert Space and given a point, is there a unique closest point on the unit disk? For example, consider L1 and L∞.
(2) We have repeatedly used the parallelogram law. This holds in a Hilbert space. Does the parallelogram law hold in spaces other than Hilbert spaces?
(3) Convex sets, not necessarily closed, do not have to have points of minimum norm.
(4) Closed sets, not necessarily convex, do not have to have points of minimum norm. For example,

    C = ∪p=1..∞ { ep x : 1 + 1/p ≤ x < ∞ }

is closed, not convex, and there is not a point of minimum norm.

Corollary 12 If M is a closed linear subspace of the Hilbert space {E, <⋅,⋅>}, x0 is in E, and δ = inf{ |x0 - m| : m ∈ M }, then there is one and only one y0 in M such that |x0 - y0| = δ. Moreover, <x0 - y0, m> = 0 for all m in M. If y1 ∈ M and y1 ≠ y0, then <x0 - y1, m> ≠ 0 for some m in M.

Suggestion of Proof: Let x0 be in E and y0 be the closest point in M to x0. There is such a point because M is closed and convex. We have from the above theorem that

    Re<x0-y0, m-y0> ≤ 0 for all m in M.    (*)

We hope to show that Re<x0-y0, m-y0> = 0 for all m in M. Let u = 2y0 - m. Note that u is in M. Hence,

    0 ≥ Re<x0-y0, u-y0> = Re<x0-y0, y0-m>.

This last inequality, together with inequality (*) above, gives that

    Re<x0-y0, y0-m> = 0 for all m in M.

We now want to show that this equality holds for the imaginary part, too. Choose v = y0 + i(m-y0), which is in M. We have that

    0 = Re<x0-y0, i(m-y0)> = Re[ -i<x0-y0, m-y0> ] = Im<x0-y0, m-y0>.

Therefore, <x0-y0, m-y0> = 0 for m in the linear space; since M is a subspace, this gives <x0-y0, m> = 0 for all m in M.

Definition If C is a closed, convex set in E then Pc denotes the (possibly nonlinear) function from E to E such that if x is in E then Pc(x) is the closest element in C to x.

Assignment
(6.1) Compute <x,y> and |x+y|2 - |x|2 - |y|2 for x = {0,i} and y = {0,1}.
(6.2) Suppose that {Cp}p=1..n is a collection of convex sets. Show that the intersection is also convex.
(6.3) Show that if {xp}p=1..n is a sequence of points in a convex set C and ∑p=1..n tp = 1, with each tp ≥ 0, then ∑p=1..n tp xp is in C.
(6.4) Show that the closure and the interior of a convex set are convex.
(6.5) If E ⊇ N ⊇ M and M and N are closed subspaces with M ≠ N, then there is z in N, not in M, with z perpendicular to all points of M. (Hint: Suppose that n is in N-M and m is in M. Let P(n) be the closest point to n that is in M and c = m + P(n). Then 0 = <n-P(n), c-P(n)> = <n-P(n), m>. Let z = n-P(n).)
(6.6) Consider the space R2 with the norm given by ||x||1 = |x1| + |x2|. (a) Show that every closed set in this space has a point of minimum norm. (b) Show, by example, that this may fail to be unique. (c) What happens if ||x||∞ = max(|x1|, |x2|)?
(6.7) Note that x = (x1, x2, ...) where xp = (-1)p/p is in the real Hilbert space (little) RL2 consisting of square summable real sequences, and C = {y : yp ≥ 0, y ∈ (little) RL2} is a convex set in (little) RL2. Find the closest element in C to x.

(6.8) Give an example of a convex set C such that if Pc(x) is the closest element in C to x, then Pc(x) is a function from E to E such that Pc2 = Pc but Pc is not necessarily linear. Can you characterize those convex sets in a Hilbert space for which the closest point projection is a linear function?

(6.9) Show that every closed, convex set in a Hilbert space has a point of minimum norm.

(6.10) Suppose that {φp}p=1..n is an orthonormal sequence and x is in E. Show that if {ap}p=1..n is a number sequence then

    |x - ∑p=1..n ap φp|2 = |x|2 + ∑p=1..n |<x, φp> - ap|2 - ∑p=1..n |<x, φp>|2.

(Hint for seeing the above equality: expand the right side.) How do you choose ap such that |x - ∑p=1..n ap φp| is minimum? Let C = span {φp}p=1..n. Give a formula for Pc(x). (Remark: This problem has several interesting parts. There is the notion of a closest point, that ∑p=1..n |<x, φp>|2 converges as n → ∞, that ∑p=1..n <x, φp> φp converges in E, and that x = ∑p <x, φp> φp if {φp} is maximal orthonormal.)
(6.11) Show that φn (x) = sin(nπx), n = 1, 2, ... is an orthogonal sequence in L2[0,1]. Let C be the span of φn (x), n = 1,2,..., 100. Give a formula for Pc (f) where f is given by f(t) = t for 0 ≤ t ≤ 1. (6.12) Show that φ0(x) = 1, φ1(x) = x, φ2(x) = (3x2- 1)/2, φ3(x) = (5x3- 3x)/2, φ4(x) = (35x4 - 30x2 + 3)/8 is an orthogonal sequence in L2[-1,1]. Let C be the span of φn (x), n = 0,1,...,4. Give a formula for Pc (f) where f(x) = x3 + x 2 + x + 1.
(6.13) Let
Ln(x) = ∑ (-1)^p (n choose p) x^p/p!, the sum from p = 0 to n.
Compute L0 through L3. Let Φn(x) = e^(-x) Ln(x). Show that {Φn(x)} is an orthogonal sequence in L2([0,∞), e^x). Let C be the span of Φn, n = 0, 1, 2, 3. Give a formula for Pc(f) where
f(x) = .25 if 0 ≤ x ≤ 1, = .75 if 2 ≤ x ≤ 3, = 0 otherwise.
MAPLE Remark: When one thinks of the graph of cos(πx/2) on the interval [-1,1], one is struck by the resemblance of this graph to that of a quadratic, turned down and translated up, over the same interval. It is curious to think about how closely one might approximate this transcendental function with a quadratic function. The first thing to think of is the Taylor polynomial of degree two.
> taylor(cos(Pi*x/2),x=0,3);
> plot({cos(Pi*x/2),1 - Pi^2*x^2/8},x=-1..1);
An alternate quadratic approximation that is more appropriate to this section is to use the polynomials of problem 6.12. These are called the Legendre polynomials. MAPLE knows these polynomials, and others. Here are the first three Legendre polynomials: > with(orthopoly): > P(0,x);P(1,x);P(2,x);
To get the best quadratic approximation for cos(πx/2) in L2[-1,1] we compute the coefficients as in problem 6.10.
> a0:=int(cos(Pi*x/2)*P(0,x),x=-1..1)/int(P(0,x)^2,x=-1..1); a1:=int(cos(Pi*x/2)*P(1,x),x=-1..1)/int(P(1,x)^2,x=-1..1); a2:=int(cos(Pi*x/2)*P(2,x),x=-1..1)/int(P(2,x)^2,x=-1..1);
> plot({cos(Pi*x/2),a0*P(0,x) + a1*P(1,x) + a2*P(2,x)},x=-1..1);
To further emphasize the importance of orthogonal polynomials, we make one more illustration. Take n+1 points chosen evenly spaced on the interval [-1,1] and choose the point pairs {x[i],y[i]} where y[i] = f(x[i]). Then take the interpolating polynomial of degree n that fits the resulting n+1 point pairs exactly. Do you think this polynomial sequence might converge uniformly to f? There is a classical example that illustrates how wrong this can be.
> f:=x->1/(1+25*x^2);
> for i from 1 to 9 do s[i]:= -1 + (i-1)*2/8; fs[i]:= f(s[i]); od;
> fintrp:=x->interp([s[1],s[2],s[3],s[4],s[5],s[6],s[7],s[8],s[9]], [fs[1],fs[2],fs[3],fs[4],fs[5],fs[6],fs[7],fs[8],fs[9]],x);
> plot({f(x),fintrp(x)},x=-1..1);
The situation -- lack of closeness to f -- does not improve with more points; just modify this syntax to see.
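For readers without MAPLE, the same experiment can be repeated in Python; the node counts below (5 and 9 equally spaced points) are chosen only for illustration. The maximum error over a sample grid grows as the degree of the equally spaced interpolant goes up.

```python
def f(x):
    return 1.0 / (1.0 + 25.0 * x * x)

def interpolant(n_nodes):
    # Lagrange form of the interpolating polynomial at n_nodes equally
    # spaced points on [-1,1].
    nodes = [-1.0 + 2.0 * i / (n_nodes - 1) for i in range(n_nodes)]
    vals = [f(s) for s in nodes]
    def p(x):
        total = 0.0
        for i in range(n_nodes):
            term = vals[i]
            for j in range(n_nodes):
                if j != i:
                    term *= (x - nodes[j]) / (nodes[i] - nodes[j])
            total += term
        return total
    return p

grid = [-1.0 + k / 100.0 for k in range(201)]
p4, p8 = interpolant(5), interpolant(9)
maxerr4 = max(abs(p4(x) - f(x)) for x in grid)
maxerr8 = max(abs(p8(x) - f(x)) for x in grid)
print(maxerr4, maxerr8)  # the error grows as the degree goes up
```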
Section 7: Orthogonal, Nonexpansive, & Self-Adjoint Projections
Some authors require that a projection be nonexpansive, orthogonal, self adjoint ... even linear. These notes have made a point that a function with P^2 = P need not have these properties. At least geometrically, the minimum requirement that P^2 = P defines a "projection." So, one must ask, what does one gain with these other conditions? Theorem 16 in this section provides an answer.

Lemma 13 (Polarization Identity) If T is a linear transformation then
<Tx, y> = (1/4)[<T(x+y), x+y> - <T(x-y), x-y> + i<T(x+iy), x+iy> - i<T(x-iy), x-iy>].
Suggestion of Proof: Just do it!

Lemma 14 If E is a Hilbert space over the complex field and T is a linear function then these are equivalent:
(a) <Tx, y> = <x, Ty> for each x and y in E (or, T is self adjoint), and
(b) <Tx, x> is real for each x in E.
Suggestion of Proof: Use the Polarization Identity.

Definition An Orthogonal Projection is a projection P for which the null space of P is perpendicular to the range of P.

Remark: Recall that if C of Theorem 11 is a linear subspace of E, then statement (a) of Theorem 11 can be replaced by
(a') z is in the subspace C and <x-z, c-z> = 0 for all c in C, or
(a'') z is in the subspace C and <x-z, m> = 0 for all m in C.

Definition If M is a linear space then M⊥ = {n: <m,n> = 0 for all m in M}, R(P) is the range of P, and N(P) is the null space of P.

Lemma 15 If P is a linear, closest point projection onto a closed subspace M, then (R(P))⊥ = N(P) -- that is, P is an orthogonal projection.
Suggestion of Proof: Suppose that n ∈ N(P). We want to show that <n, r> = 0 for all r in R(P). Since n ∈ N(P), P(n) = 0. By Corollary 12, since P is a closest point projection, we have that for all r in R(P),
0 = <n - P(n), r> = <n - 0, r> = <n, r>.
Hence, n ∈ R(P)⊥. Now, suppose that n ∈ R(P)⊥. We want to show that Pn = 0.
Since n ∈ R(P)⊥, we have <n, r> = 0 for all r ∈ R(P), or <n - 0, r - 0> = 0 for all r ∈ R(P). But this is the characterization of the closest point: 0 is the closest point in R(P) to n, and hence Pn = 0.

EXAMPLE: Here is an example of a projection that is not a closest point: P({x,y}) = {x-y, 0}. What is the range and null space of this projection?

Theorem 16 Suppose that P is a linear projection. These are equivalent:
(a) P is an orthogonal projection.
(b) if x is in E then <(1-P)x, Px> = 0.
(c) if x and y are in E, then <Px, y> = <x, Py>, i.e., P is self adjoint.
(d) if x and y are in E then |x-Px|^2 + |Px-Py|^2 = |x-Py|^2, i.e., P is a right triangle projection.
(e) if x and y are in E then |x-Px| ≤ |x-Py|, i.e., P is a closest point projection.
(f) <x-Px, z-Px> = 0 for all x and for all z in P(E), i.e., P is a right angle projection.
(g) |Px| ≤ |x| for each x in E, i.e., P is nonexpansive.
Suggestion of Proofs:
a⇒b Px is in the range of P and (1-P)x is in the null space of P.
b⇒c Show <Px, x> is real: <Px, x> = <Px, Px + (1-P)x> = |Px|^2. Then apply Lemma 14.
c⇒a If n is in N(P) and r = Pu is in R(P), then <r, n> = <Pu, n> = <u, Pn> = <u, 0> = 0.
a⇒d Add to the left side <x-Px, Px-Py> + <Px-Py, x-Px> = 0 + 0.
d⇒e Delete |Px-Py|^2 from the left side of (d).
e⇒f Use Corollary 12.
f⇒b <x-Px, 0-Px> = 0 since 0 is in P(E), so that <x-Px, Px> = 0.
c⇒g |Px|^2 = <Px, Px> = <x, P^2 x> = <x, Px> ≤ |Px| |x|.
g⇒b Let y = Px + λ(x-Px). Note that Py = Px and |y-Py| = |λ| |x-Px|. By (g),
0 ≤ |y|^2 - |Py|^2 = 2 Re λ*<Px, x-Px> + |λ|^2 |x-Px|^2.
Investigate this parabola with λ real or equal to -it, t real.

Remarks (1) Some texts take all projections to be linear. It is useful not to do this so as to get orthogonal projections onto closed subsets.
(2) Some texts take a "resolution" of the identity to be a sum of orthogonal projections. We did not do that because this did not happen in our resolution of matrices.
(3) One might think that if A = ∑ λp Pp, the sum for p = 1, 2, ..., then these are equivalent:
(a) A is self adjoint, and (b) each λp is real.
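A small Python sketch may help contrast condition (g) of Theorem 16 with the EXAMPLE above; P below is the projection P({x,y}) = {x-y, 0} of the example, and Q is the orthogonal projection onto the first coordinate axis. The test vector is chosen only for illustration.

```python
def P(v):
    # the projection of the EXAMPLE: P^2 = P, but not a closest point projection
    x, y = v
    return (x - y, 0.0)

def Q(v):
    # orthogonal projection onto the first coordinate axis
    x, y = v
    return (x, 0.0)

def norm(v):
    return (v[0] ** 2 + v[1] ** 2) ** 0.5

v = (1.0, -2.0)
# Both satisfy the minimum requirement P^2 = P:
print(P(P(v)) == P(v), Q(Q(v)) == Q(v))
# Q is nonexpansive, as Theorem 16 (g) demands of orthogonal projections,
# while P expands this particular vector:
print(norm(Q(v)) <= norm(v), norm(P(v)) > norm(v))
```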
This did not happen in our examples. Not to worry! There is an inner product with respect to which this is a theorem.

Assignment
(7.1) Give examples which contrast projections, linear projections, and orthogonal, linear projections.
(7.2) Let C be the closed and bounded set {x in R2: |x| ≤ 1}. Let x = {1,1} and Pc(1,1) be the closest point projection onto C. Find y1 in C such that <x - Pc(x), y1 - Pc(x)> < 0. Find y2 in C such that <x - Pc(x), y2 - Pc(x)> = 0.
(7.3) Let S = L2[-1,1]. Let E = {f: f ∈ S, f(-x) = f(x)} and O = {f: f ∈ S, f(-x) = -f(x)}. Show that E and O are orthogonal, linear subspaces. Find formulas for PE(f) and [1-PE](f). Show that PE is a projection, is linear, and is orthogonal.

MAPLE remark: In this MAPLE exercise, we construct a non-orthogonal projection of R3. To do this, choose u, v, and w linearly independent. We project onto the subspace spanned by multiples of u. Instead of being an orthogonal projection, however, it is a projection in the "direction of v and w." This will mean that P(u) = u, P(v) = 0, and P(w) = 0. Such a projection can be accomplished as follows: Choose x in R3. Write x = au + bv + cw. Define P(x) = au. In a similar manner, we get the projection onto the subspace spanned by v and in the direction of u and w, and the projection onto the subspace spanned by w in the direction of u and v. Here's the technique: choose, for example, u = {1,0,1}, v = {1,1,0}, and w = {0,0,1}. We seek the projection of R3 onto the space spanned by u along v and w, the projection of R3 onto the space spanned by v along u and w, and the projection of R3 onto the space spanned by w along u and v. Take {x, y, z} to be a point in R3. We find the first matrix, Pu, which is the projection onto u. Thus we seek a, b, and c so that au + bv + cw = {x,y,z}, and we wish to write a, b, and c as functions of x, y, and z. Make the matrix M such that the columns are u, v, and w.
> with(linalg): > u:=vector([1,0,1]); v:=vector([1,1,0]); w:=vector([0,0,1]); > M:=transpose(array([[u[1],u[2],u[3]], [v[1],v[2],v[3]],[w[1],w[2],w[3]]]));
We know that {a, b, c} = M^(-1){x, y, z} and that the projection onto u should be <M^(-1){x,y,z}, e1> u.
> preP:=evalm(inverse(M)&*vector([x,y,z]));
> coef1:=dotprod(preP,vector([1,0,0]));
> proj1:=evalm(coef1*u);
> col1:=vector([subs({x=1,y=0,z=0},proj1[1]), subs({x=1,y=0,z=0},proj1[2]), subs({x=1,y=0,z=0},proj1[3])]);
> col2:=vector([subs({x=0,y=1,z=0},proj1[1]), subs({x=0,y=1,z=0},proj1[2]), subs({x=0,y=1,z=0},proj1[3])]);
> col3:=vector([subs({x=0,y=0,z=1},proj1[1]), subs({x=0,y=0,z=1},proj1[2]), subs({x=0,y=0,z=1},proj1[3])]);
> Pu:=evalm(transpose(matrix([col1, col2, col3])));
Here is a different projection. > coef2:=dotprod(preP,vector([0,1,0])); > proj2:=evalm(coef2*v); > col1:=vector([subs({x=1,y=0,z=0},proj2[1]), subs({x=1,y=0,z=0},proj2[2]), subs({x=1,y=0,z=0},proj2[3])]); > col2:=vector([subs({x=0,y=1,z=0},proj2[1]), subs({x=0,y=1,z=0},proj2[2]), subs({x=0,y=1,z=0},proj2[3])]); > col3:=vector([subs({x=0,y=0,z=1},proj2[1]), subs({x=0,y=0,z=1},proj2[2]), subs({x=0,y=0,z=1},proj2[3])]); > Pv:=transpose(matrix([col1,col2,col3]));
And, here is the last projection.
> coef3:=dotprod(preP,vector([0,0,1]));
> proj3:=evalm(coef3*w);
> col1:=vector([subs({x=1,y=0,z=0},proj3[1]), subs({x=1,y=0,z=0},proj3[2]), subs({x=1,y=0,z=0},proj3[3])]);
> col2:=vector([subs({x=0,y=1,z=0},proj3[1]), subs({x=0,y=1,z=0},proj3[2]), subs({x=0,y=1,z=0},proj3[3])]);
> col3:=vector([subs({x=0,y=0,z=1},proj3[1]), subs({x=0,y=0,z=1},proj3[2]), subs({x=0,y=0,z=1},proj3[3])]);
> Pw:=transpose(matrix([col1,col2,col3]));
As a check, it should be true that the sum of the three projections is the identity matrix.
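For readers without MAPLE, here is a Python sketch of the same construction with the same u, v, and w; the inverse of M was computed by hand (det M = 1), and the point p is chosen only for illustration. The three oblique projections sum to the identity.

```python
u = (1.0, 0.0, 1.0)
v = (1.0, 1.0, 0.0)
w = (0.0, 0.0, 1.0)

# Inverse of the matrix M whose columns are u, v, w (computed by hand,
# det M = 1), so that (a, b, c) = Minv p are the coordinates of p.
Minv = ((1.0, -1.0, 0.0),
        (0.0, 1.0, 0.0),
        (-1.0, 1.0, 1.0))

def coords(p):
    return tuple(sum(Minv[i][j] * p[j] for j in range(3)) for i in range(3))

def scale(c, vec):
    return tuple(c * t for t in vec)

def Pu(p): return scale(coords(p)[0], u)   # projection onto u along v and w
def Pv(p): return scale(coords(p)[1], v)   # projection onto v along u and w
def Pw(p): return scale(coords(p)[2], w)   # projection onto w along u and v

p = (2.0, -1.0, 3.0)
total = tuple(Pu(p)[i] + Pv(p)[i] + Pw(p)[i] for i in range(3))
print(total, Pu(Pu(p)) == Pu(p), Pu(v), Pu(w))
```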
Section 8: Orthonormal Vectors
Surely, any view of a Hilbert space will have the notion of orthogonality and orthogonal vectors at its core. The very notion of orthogonality is embedded in the concept of the dot product. The classical applications of orthogonal functions pervade applied mathematics. Indeed, there are books on orthogonal functions and their application to applied mathematics. We have seen that orthogonal vectors arise naturally as eigenvectors of self adjoint transformations. This section provides a process for generating an orthogonal set of vectors. It also examines the implication of having a maximal family of orthogonal vectors.

Remark One should review the notion of linearly independent vectors and verify that any orthogonal collection is linearly independent.

Definition The collection {xp}, p = 1, 2, ..., is a maximal orthonormal family if the only vector y satisfying <xp, y> = 0, for all p, is y = 0.

Theorem 17 Let {xp}, p = 1, 2, ..., be an orthonormal set in {E, <⋅,⋅>}. Then
∑ |<y, xp>|^2 ≤ |y|^2,
with equality holding only in case y = ∑ <y, xp> xp.
Suggestion of Proof: 0 ≤ |y - ∑ <y, xp> xp|^2 = |y|^2 - ∑ |<y, xp>|^2.
Theorem 18 Suppose that {xp}, p = 1, 2, ..., is an orthonormal sequence in {E, <⋅,⋅>}. These are equivalent.
(a) {xp} is maximal - in the sense that if y is in E and <y, xp> = 0 for all p then y = 0.
(b) {xp} is an orthonormal basis - in the sense that if y is in E then y = ∑ <y, xp> xp.
(c) Parseval's equality holds - if u and v are in E then <u, v> = ∑ <u, xp> <xp, v>.
(d) Bessel's equality holds - if y is in E then |y|^2 = ∑ |<y, xp>|^2.
(e) The span S of {xp} is dense in E - in the sense that if y is in E then there is a sequence {up} in S such that limp up = y.
Suggestion of Proof.
a⇒b Suppose that {xp} is maximal. Let M be all y that can be written as ∑ <y, xp> xp. Suppose M is not E. Let z be in E - M. Consider z0 = z - ∑ <z, xp> xp. Then <z0, xp> = 0 for all p, so z0 = 0 and z is in M after all.
b⇒c <u, v> = Σp Σq <u, xp> <xq, v> <xp, xq> = Σp <u, xp> <xp, v>.
c⇒d |y|^2 = <y, y> = Σp <y, xp> <xp, y> = Σp |<y, xp>|^2.
d⇒a If <y, xp> = 0 for all p then |y|^2 = Σp |<y, xp>|^2 = 0.
b⇒e If un = ∑ <y, xp> xp, the sum from p = 1 to n, then limn un = y.
e⇒a Suppose that y ≠ 0 and <y, xp> = 0 for all p. Let {ap} be any number sequence. Then
|y - ∑ ap xp|^2 = |y|^2 + ∑ |ap|^2 ≥ |y|^2 > 0.
Hence there is no sequence {up} in S converging to this y.

Definition If {xp}, p = 1, 2, ..., is a linearly independent sequence of vectors then the Gram-Schmidt process generates an orthonormal sequence as follows:
u1 = x1, v1 = x1/|x1|,
un = xn - ∑ <xn, vp> vp (the sum from p = 1 to n-1), vn = un/|un|.
Remark This process generates an orthogonal sequence: if n > k,
<un, vk> = <xn - ∑ <xn, vp> vp, vk> = <xn, vk> - <xn, vk> = 0,
the sum running from p = 1 to n-1. If 1 ≤ J ≤ n then the span of {xp}, p = 1, ..., J, is the same as the span of {vp}, p = 1, ..., J.
Theorem 19: Suppose that {Pn}, n = 1, 2, ..., is a sequence of orthonormal polynomials, each Pn having degree n, and that the generating dot product has the property that <x f(x), g(x)> = <f(x), x g(x)>. It follows that for each n, there are numbers αn, βn, and γn such that
Pn+1(x) = (αn x + βn) Pn(x) + γn Pn-1(x).

Warning: In this discussion, Pn is a polynomial, not a projection.
Suggestion for proof. Here's why there is the recursion formula for orthonormal polynomials: We suppose that Pn is a polynomial of degree n, that <Pn, Pm> = 0 if n ≠ m and = 1 if n = m, and that the dot product has the property that <x f(x), g(x)> = <f(x), x g(x)>. Choose βn such that the function Pn+1(x) - βn x Pn(x) is a polynomial of degree n. Then there is a sequence {αp} of numbers such that
Pn+1(x) - βn x Pn(x) = ∑ αp Pp(x), the sum from p = 0 to n.
Suppose that k ≤ n-2. Then
0 = <Pn+1, Pk> = <βn x Pn(x), Pk(x)> + <∑ αp Pp(x), Pk(x)>
= βn <x Pn(x), Pk(x)> + αk
= βn <Pn(x), x Pk(x)> + αk.
Since x Pk(x) has degree k+1 ≤ n-1, it can be written as ∑ γi Pi(x), the sum from i = 0 to k+1, so that
βn <Pn(x), x Pk(x)> = βn ∑ γi <Pn, Pi> = 0.
Hence αk = 0, and
Pn+1 = βn x Pn + αn Pn(x) + αn-1 Pn-1(x).
Examples From the CRC:
Legendre Polynomials: (n+1) Pn+1(x) = (2n+1) x Pn(x) - n Pn-1(x).
Here, <f, g> = ∫ f(x) g(x) dx, the integral from -1 to 1.
Tschebysheff Polynomials: Tn+1(x) = 2 x Tn(x) - Tn-1(x).
Here, <f, g> = ∫ f(x) g(x)/√(1-x^2) dx, the integral from -1 to 1.
Laguerre Polynomials: (n+1) Ln+1(x) = [2n+1-x] Ln(x) - n Ln-1(x).
Here, <f, g> = ∫ e^(-x) f(x) g(x) dx, the integral from 0 to ∞.

Assignment
(8.1) Let {xp}, p = 1, 2, ..., be an orthonormal sequence and {αp}, p = 1, 2, ..., be a number sequence. These are equivalent:
(a) ∑ |αp|^2 < ∞, and
(b) ∑ αp xp converges in {E, <⋅,⋅>}.
(8.2) Let {xp}, p = 1, ..., N, be an orthonormal sequence and M be the span of {xp}, p = 1, ..., N. Give a formula for the closest point projection PM onto M.
(8.3) If y is in E and {xp}, p = 1, 2, ..., is an orthonormal sequence then the series ∑ <y, xp> xp converges.
(8.4) Gram-Schmidt {e1, e2, e1+e2+e3} in R3.
(8.5) Gram-Schmidt {1, x, x^2} in L2([-1,1]).
(8.6) Gram-Schmidt {e^(-x), xe^(-x), x^2 e^(-x)} in L2([0,∞), e^x).
MAPLE Remark. The procedure to perform the Gram-Schmidt process on a sequence of vectors is a part of the MAPLE linear algebra package. That procedure uses the standard dot product. When using this procedure, it should be noted that the vectors returned are not normalized.
> with(linalg):
> u:=vector([1,0,1]); v:=vector([1,1,0]); w:=vector([0,0,1]);
> GramSchmidt({u,v,w});
It is not so hard to write a Gram-Schmidt procedure for dot products other than the standard one, and even in a function space. MAPLE also contains the standard orthogonal functions. For example, here are the Legendre polynomials.
> with(orthopoly);
> P(0,x); P(1,x); P(2,x); P(3,x);
> int(P(2,x)*P(3,x),x=-1..1);
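As remarked above, it is not hard to write one's own Gram-Schmidt procedure for a nonstandard dot product. Here is a Python sketch; the weighted dot product on R3 is chosen only for illustration, and the returned vectors are normalized in that dot product.

```python
def gram_schmidt(vectors, dot):
    # Gram-Schmidt with an arbitrary inner product; returns normalized vectors.
    basis = []
    for x in vectors:
        u = list(x)
        for v in basis:
            c = dot(x, v)   # v is already normalized, so this is the projection coefficient
            u = [ui - c * vi for ui, vi in zip(u, v)]
        nrm = dot(u, u) ** 0.5
        basis.append([ui / nrm for ui in u])
    return basis

# A weighted dot product on R^3 (weights chosen only for illustration).
w = (1.0, 2.0, 3.0)
def wdot(x, y):
    return sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

vs = gram_schmidt([(1, 0, 1), (1, 1, 0), (0, 0, 1)], wdot)
offdiag = max(abs(wdot(vs[i], vs[j])) for i in range(3) for j in range(3) if i != j)
diag = [wdot(v, v) for v in vs]
print(round(offdiag, 9), [round(d, 9) for d in diag])
```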
Section 9: The Finite Dimension Paradigm
Since this collection of notes purports to be about linear operators on an inner product space, one would hope to develop an understanding of how to make examples. A model that is useful follows the form of the paradigm introduced in this section. Two useful applications of this paradigm are included. We return to finite dimensions.

Theorem 20 If A is a self adjoint, linear transformation from Rn to Rn then there is an orthonormal sequence {φp}, p = 1, ..., n, in Rn and a number sequence {λp}, p = 1, ..., n, such that if x is in Rn, then
Ax = ∑ λp <x, φp> φp, the sum from p = 1 to n.
Suggestion of Proof: Recall that corresponding to any A there is a sequence of projections such that Pi Pj = 0 if i ≠ j and such that x = Σ Pi x for all x in E. Let Mi = Pi(E). Note that if m ∈ Mi then Pi m = m, and Mi ∩ Mj = {0} if i ≠ j. Select in each Mi a maximal sequence of orthonormal vectors and name the mutually orthonormal vectors in the union of this collection for all i {φ1, φ2, ..., φk}. We argue that k ≤ n, for Rn has at most n linearly independent vectors. Even more, it must be that k = n, for otherwise, let y be perpendicular to their span. But y = Σi Pi y, each Pi y is in Mi, and the vectors in Mi cannot all be orthogonal to the vectors in {φp} -- otherwise, the sequence {φp} was not chosen as directed. So, it must be that {φp} forms an orthonormal basis for Rn. If we now assume that A is self adjoint, then each Pi is a closest point projection onto Mi. This means that Pi will have a representation in terms of the Fourier coefficients, and
u = ∑ <u, φp> φp and A(u) = ∑ λp <u, φp> φp.

Corollary 21 Moreover, if m is a positive integer, then
A^m(x) = ∑ λp^m <x, φp> φp, the sum from p = 1 to n.

Remarks (1) With A self adjoint, as in Theorem 20, we have that N(A)⊥ = R(A).
(2) With A as supposed, if λp ≠ 0 for 1 ≤ p ≤ k and λp = 0 for k < p ≤ n, then u is in the null space of A if
u = ∑ <u, φp> φp, the sum from p = k+1 to n,
and v is in the range of A if
v = ∑ <v, φp> φp, the sum from p = 1 to k.

Definition If A is as in Theorem 20 and m is an integer such that m ≤ n and λi ≠ 0 for 1 ≤ i ≤ m and λi = 0 if i > m, then the generalized inverse is the linear function such that
A†(z) = ∑ (1/λp) <z, φp> φp, the sum from p = 1 to m,
for each z in Rn.

Theorem 22 If A is as in Theorem 20, then A† has the following properties:
(1) AA†A = A
(2) A†AA† = A†
(3) AA† is the orthogonal projection onto R(A).
(4) A†A is the orthogonal projection onto N(A)⊥.

Remark: (1) One uses the generalized inverse to get a "best solution" x for the equation Ax = y where A and y are given and where there is no x such that Ax = y. Suppose that y ∈ Rn and y ∉ R(A). There is no x such that Ax = y. In order to find v in R(A) that is closest to y, choose v = AA†y. The above establishes AA† as the closest point projection onto the range of A. Now, among all u's such that Au = v, the one with smallest norm is A†y. To see this, first note that AA†y = v. It remains to show that A†y is the closest point to zero in the convex set of all points s that map to v; that is, we need to see that <0 - A†y, s - A†y> = 0 for all s such that As = v. To see this, note that s - A†y is in N(A), and statement (4) above shows that A†y = A†A(A†y) is perpendicular to everything in the null space of A.
(2) If z is in Rn, then AA†z is the closest point u to z in the range of A. Also, A†(z) is the element with the smallest norm among all x's such that Ax = u.

Assignment
(9.1) Compute A†, AA†, and A†A for
A =
0 2 0
2 0 0
0 0 0 .
(9.2) Let
A =
-3/2 1/2 0
1/2 -3/2 0
0 0 0 .
Verify that {1,1,1} is not in the range of A. Find v in the range of A so that ||v - {1,1,1}|| is as small as possible. Find u so that Au = v and u has the smallest norm among all other x such that Ax = v.
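A Python sketch can be used to check the properties of Theorem 22 on the matrix of problem (9.1); the formula for A† below was obtained by inverting the nonzero eigenvalues, as in the definition above.

```python
def matmul(A, B):
    # product of two 3x3 matrices stored as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# The matrix of Assignment (9.1).
A = [[0.0, 2.0, 0.0],
     [2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
# Its generalized inverse: the nonzero eigenvalues are inverted,
# the zero eigenvalue is kept.
Adag = [[0.0, 0.5, 0.0],
        [0.5, 0.0, 0.0],
        [0.0, 0.0, 0.0]]

print(matmul(matmul(A, Adag), A) == A,        # A A† A = A
      matmul(matmul(Adag, A), Adag) == Adag,  # A† A A† = A†
      matmul(A, Adag))                        # the projection onto R(A)
```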
(9.3) The following matrix is self-adjoint, but A⋅A† is not diagonal:
4/3 1/3 -2/3
1/3 1/3 1/3
-2/3 1/3 4/3 .
MAPLE Remark: The methods suggested in this section can be used with MAPLE to get the finite dimensional paradigm representation for a matrix.
> with(linalg):
> A:= array([[-3/2, 1/2, 0], [1/2, -3/2, 0],[0, 0, 0]]);
> eigenvects(A);
> v1:=vector([-1,1,0]); v2:=vector([1,1,0]); v3:=vector([0,0,1]);
> u:=vector([x,y,z]);
> rep:= (-2)*innerprod(u,v1)*v1/norm(v1,2)^2 + (-1)* innerprod(u,v2)*v2/norm(v2,2)^2 + (0)*innerprod(u,v3)*v3/norm(v3,2)^2;
To understand the next MAPLE command, note that if u and v are vectors and α and β are numbers, then add(u,v,α,β) computes α u + β v.
> add(v1,v2,x-y,-(x+y)/2);
As a check, it should be noted that the results of the last line are the same as if regular matrix multiplication had been performed. The value of this representation has been explained in this section.
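For readers without MAPLE, here is the same check in Python: the representation Ax = ∑ λp <x, φp> φp, built from the normalized eigenvectors of the matrix above, agrees with ordinary matrix multiplication. The test vector x is chosen only for illustration.

```python
import math

A = [[-1.5, 0.5, 0.0],
     [0.5, -1.5, 0.0],
     [0.0, 0.0, 0.0]]
# Eigenvalues and normalized eigenvectors, as found by eigenvects(A) above.
s = 1.0 / math.sqrt(2.0)
eig = [(-2.0, (-s, s, 0.0)), (-1.0, (s, s, 0.0)), (0.0, (0.0, 0.0, 1.0))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rep(x):
    # A x = sum_p lambda_p <x, phi_p> phi_p
    out = [0.0, 0.0, 0.0]
    for lam, phi in eig:
        c = lam * dot(x, phi)
        out = [o + c * p for o, p in zip(out, phi)]
    return out

x = (1.0, 2.0, 3.0)
Ax = [dot(row, x) for row in A]
err = max(abs(a - b) for a, b in zip(Ax, rep(x)))
print(Ax, err < 1e-9)
```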
Section 10: Bounded, Linear Maps from E to C
The notion of continuity is basic to any analysis of functions. In a study of Hilbert spaces, one should have in one's repertoire examples of linear functions that are continuous and examples that are not continuous. We look first at the simplest examples of continuous linear functions on an inner product space: functions from E to the complex plane. We provide a characterization of bounded, linear functions from a Hilbert space to the complex plane.

Theorem 23 Suppose that L is a linear function from {E, <⋅,⋅>} to C. These are equivalent:
(a) There is a number b such that if x is in E then |L(x)| ≤ b |x| -- L is bounded.
(b) if {xp}, p = 1, 2, ..., is a sequence in E with limit y then limp L(xp) = L(y) -- L is continuous.
Suggestion of Proof:
a⇒b |L(xp) - L(y)| = |L(xp - y)| ≤ b |xp - y|.
b⇒a Suppose L is continuous and there is no such b. For each p, there is zp with |L(zp)| ≥ p^2 |zp|. Let xp = zp/(p |zp|). Then xp → 0, but
|L(xp)| = |L(zp)|/(p |zp|) ≥ p,
so that |L(xp)| → ∞. This contradicts the assumption that L is continuous.
Remark: The smallest b is the norm of L. (Recall the earlier discussion on the norm of a matrix in Chapter 4.)

Theorem 24 Suppose that L ≠ 0 is a bounded linear function and M = N(L). Then M is a closed, linear subspace. Moreover, if L: E → C then dim M⊥ = 1.
Suggestion of Proof: First note that M is a closed linear subspace. Suppose that L maps E to C. To prove that dim(M⊥) = 1, suppose that it is not. Let x and y be linearly independent members of M⊥ and consider
z = x/L(x) - y/L(y).
Since M⊥ is a subspace, z is in M⊥, and L(z) = 0. Thus z ∈ M ∩ M⊥ and hence must be 0. Hence x/L(x) = y/L(y), and x and y are not linearly independent.
Theorem 25 (The Riesz Representation Theorem) Suppose that A is a linear function from {E, <⋅,⋅>} to C. These are equivalent:
(a) There is y in E such that A(x) = <x, y> for all x in E, and
(b) A is bounded.
Suggestion of Proof:
a⇒b |Ax| = |<x,y>| ≤ |x| |y|.
b⇒a By Theorem 24, N(A)⊥ has dimension 1. Let z be in N(A)⊥ with |z| = 1. Let y = A(z)* z. With P the closest point projection onto N(A), and writing (1-P)x = βz (recall N(A)⊥ is one dimensional), we have
<x, y> = <x, A(z)* z> = A(z) <x, z>
= A(z) {<Px, z> + <(1-P)x, z>}
= A(z) <(1-P)x, z>
= A(z) <βz, z>
= β A(z) = A(βz) = A((1-P)x) = A(x).
Assignment
(10.1) Let L be a function on L2 defined by
L(x) = ∑ |<x, ep>|^2, the sum from p = 1 to ∞.
Is L a linear function from L2 to C?
(10.2) Let L be defined on R3 by L(x) = x1 + 3x2 - 7x3.
(a) Find v such that L(x) = <x, v> for all x in R3.
(b) Find b such that |L(x)| ≤ b⋅|x|.
(10.3) Let A: L2[-1,1] → C by
A(f) = ∫ f(x) dx, the integral from 0 to 1/2.
Show that A is continuous and find g in L2[-1,1] such that A(f) = <f, g>.
(10.4) Let C[-1,1] be the continuous functions in L2[-1,1] and A: C[-1,1] → C by A(f) = f(1/2). Show that A is not continuous; that is, show there is a sequence fp in C[-1,1] with limp→∞ fp = 0, but limp A(fp) ≠ A(0). Show that there is no g in L2[-1,1] such that
A(f) = ∫ f(x) g(x) dx, the integral from -1 to 1.
(Hint: Recall the sequence of functions {fn} that converges strongly, but not pointwise.)
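A numerical experiment in Python may make problem (10.3) concrete. One natural candidate for the representer g is the indicator function of [0, 1/2]; the quadrature rule and the test function f(x) = x^2 are chosen only for illustration.

```python
# Midpoint-rule quadrature on [a, b].
def integrate(h, a, b, n=20000):
    dx = (b - a) / n
    return sum(h(a + (i + 0.5) * dx) for i in range(n)) * dx

def f(x):
    return x * x

def g(x):
    # candidate Riesz representer: the indicator of [0, 1/2]
    return 1.0 if 0.0 <= x <= 0.5 else 0.0

Af = integrate(f, 0.0, 0.5)                        # A(f) = integral of f over [0, 1/2]
fg = integrate(lambda x: f(x) * g(x), -1.0, 1.0)   # <f, g> in L2[-1,1]
print(Af, fg)
```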
(10.5) For each f in L2[0,1], let y(t) be the function such that y′ + 3y = f, with y(0) = 0. Define A: L2[0,1] → R by
A(f) = ∫ y(t) dt, the integral from 0 to 1.
Show that A is a bounded linear function on L2[0,1] and find g such that A(f) = <f, g>.

MAPLE Remark: This is an exercise in getting the Riesz representation of a continuous mapping from L2[0,1] to the complex plane. Pick x in [0,1]. We define a point -- a function -- G(x) in L2[0,1]. Since points in L2[0,1] are functions on [0,1], we write this G as a function of not only x, but also t: G(x,t). The linear mapping as defined by the Riesz theorem will be
<G(x,t), f(t)> = ∫ G(x,t) f(t) dt, the integral from 0 to 1.
It remains to say what is the linear mapping for which we will get this Riesz representation. Here is the definition of L: For each f, we define L(f) = y where y′′ + 3y′ + 2y = f, with y(0) = y(1) = 0. We use MAPLE to create this y. Then we make G(x,⋅). First we solve the differential equation with y(0) = 0. This will leave one constant to be evaluated.
> dsolve({diff(y(x),x,x)+3*diff(y(x),x)+2*y(x)=f(x),y(0)=0},y(x));
> _C2:=int( (exp(-1+u)-exp(-2*(1-u)))*f(u), u=0..1)/(exp(-1)-exp(-2));
> ("");
The above makes G(x,u). Input f, evaluate the integrals, and one has a number. The Riesz theorem says that we can find a representation in terms of the dot product, which is an integral in this case. With a study of those integrals, we can extract G:
> G:=proc(x,u) if u < x then (exp(-x+u)-exp(-2*(x-u))) -(exp(-1+u)-exp(-2+2*u))*exp(-x)/(exp(-1)-exp(-2)) +(exp(-1+u)-exp(-2+2*u))*exp(-2*x)/(exp(-1)-exp(-2)) else -(exp(-1+u)-exp(-2+2*u))*exp(-x)/(exp(-1)-exp(-2)) +(exp(-1+u)-exp(-2+2*u))*exp(-2*x)/(exp(-1)-exp(-2)) fi end;
There are some properties of Green's functions that may be familiar. G satisfies the homogeneous differential equation for x < u and for x > u. We verify this:
> z1:=x->(exp(-x+u)-exp(-2*(x-u))) -(exp(-1+u)-exp(-2+2*u))*exp(-x)/(exp(-1)-exp(-2)) +(exp(-1+u)-exp(-2+2*u))*exp(-2*x)/(exp(-1)-exp(-2));
> diff(z1(x),x,x)+3*diff(z1(x),x)+2*z1(x);
> z2:=x->-(exp(-1+u)-exp(-2+2*u))*exp(-x)/(exp(-1)-exp(-2)) +(exp(-1+u)-exp(-2+2*u))*exp(-2*x)/(exp(-1)-exp(-2));
> diff(z2(x),x,x)+3*diff(z2(x),x)+2*z2(x);
There is a symmetry for the graph of G: > plot3d(G,0..1,0..1);
The observation from the graph that the values of G are negative gives information about the solution for f's that have only non-negative values. We compute a solution for the boundary value problem with f the constant function 1. > w:=x->int(-(exp(-1+u)-exp(-2+2*u))*exp(-x)/(exp(-1)-exp(-2)) +(exp(-1+u)-exp(-2+2*u))*exp(-2*x)/(exp(-1)-exp(-2)),u=x..1) + int((exp(-x+u)-exp(-2*(x-u))) -(exp(-1+u)-exp(-2+2*u))*exp(-x)/(exp(-1)-exp(-2)) +(exp(-1+u)-exp(-2+2*u))*exp(-2*x)/(exp(-1)-exp(-2)), u=0..x); > w(0);w(1);diff(w(x),x,x)+3*diff(w(x),x)+2*w(x)-1; > simplify("); > simplify("); > plot(w(x),x=0..1);
Section 11: Applications to Differential and Integral Equations
This section will place some of the ideas that have come before into the context of integral equations and into the context of ordinary differential equations with boundary conditions. We have already seen that if A is a bounded, linear transformation on E then exp(tA)(c) provides the solution for the initial value problem Y′ = AY, Y(0) = c.

Example Let
K(x,y) = x(1-y) if 0 ≤ x ≤ y ≤ 1,
       = y(1-x) if 0 ≤ y ≤ x ≤ 1,
and let A(f)(x) = ∫ K(x,y) f(y) dy, the integral from 0 to 1.
(1) A is a bounded linear transformation from L2[0,1] to L2[0,1] that is self-adjoint.
(2) If f is in L2[0,1] then these are equivalent:
(a) g(x) = ∫ K(x,y) f(y) dy (the integral from 0 to 1), so that g = A(f), and
(b) g′′ = -f with g(0) = g(1) = 0.
Define B to have domain the functions g in L2[0,1] with g(0) = g(1) = 0 and with two derivatives, and B(g) = -g′′. Note that A(f) = g if and only if f = B(g).
(3) These are equivalent:
(a) λ is a number, φ is a function, and λφ = A(φ), and
(b) λ = 1/(n^2 π^2) and φ(x) = √2 sin(nπx), for some positive integer n.
(4) If f is in L2[0,1], then A(f) = ∑ (1/(n^2 π^2)) <f, φn> φn, the sum for n = 1, 2, ..., where φn is as above.
(5) (a) B(g) = ∑ p^2 π^2 <g, φp> φp, the sum for p = 1, 2, ... .
(b) The solution for Z′(t) + BZ(t) = 0, Z(0) = C is
Z(t) = ∑ exp(-n^2 π^2 t) <C, φn> φn, the sum for n = 1, 2, ... .
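The eigenvalue statement in (3) can be checked numerically. Here is a Python sketch that applies the kernel K to sin(nπx) by the midpoint rule and compares with sin(nπx)/(n^2 π^2); the mode n = 3 and the sample points are chosen only for illustration.

```python
import math

def K(x, y):
    # the Green's function kernel of the Example
    return x * (1 - y) if x <= y else y * (1 - x)

def A(f, x, n=4000):
    # midpoint-rule approximation of the integral of K(x,y) f(y) over [0,1]
    dy = 1.0 / n
    return sum(K(x, (i + 0.5) * dy) * f((i + 0.5) * dy) for i in range(n)) * dy

n_mode = 3
f = lambda y: math.sin(n_mode * math.pi * y)
lam = 1.0 / (n_mode * math.pi) ** 2
err = max(abs(A(f, x) - lam * f(x)) for x in (0.2, 0.5, 0.8))
print(err < 1e-5)
```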
(6) The system Y′ = AY + F, E0Y(0) + E1Y(1) = 0 can be rewritten as an initial value problem provided [E0 + E1 exp(A)] has an inverse. In any case, a Green's function G can be constructed whereby, for appropriate functions F, solutions for the system can be computed as
Y(x) = ∫ G(x,t) F(t) dt, the integral from 0 to 1.

Assignment Find matrices A, E0 and E1 such that the boundary value problem y'' + 3y' + 2y = f, y(0) + y(1) = 0, y'(0) - y'(1) = 0 is written as a system Y′ = AY + F, E0Y(0) + E1Y(1) = 0.

Example: Here are graphs of the solution y for y'' + 3y' + 2y = f, with y(0) = y′(0) = 0, another solution with y(0) = 0, y(1) = 0, and another solution with y(0) - y(1) = 0, y'(0) + y'(1) = 0. Is it easy to identify which graph goes with which solution?
Figure 11.1
MAPLE Remark: It is interesting to think of how the graphs of the solutions to y'' + 3y' + 2y = 1 on [0,1] change as the boundary conditions change. At each t, the relationship between y and its derivatives specified by the differential equation must hold; how y starts at t = 0 is modified in order to satisfy the boundary conditions. Here are three examples.
> deq:=(D@@2)(y)(t) + 3*D(y)(t) + 2*y(t) = 1;
> bdry1:=y(0) = 0, D(y)(0)=0; bdry2:=y(0) = 0, y(1) = 0; bdry3:=y(0) = y(1), D(y)(0) = -D(y)(1);
> y1:=dsolve({deq,bdry1},y(t)); y2:=dsolve({deq,bdry2},y(t)); y3:=dsolve({deq,bdry3},y(t));
It seems that MAPLE does not know how to solve this last boundary value problem with boundary conditions bdry3. Not to worry. Humans can teach MAPLE! > solve({ a+b = a*exp(-1)+b*exp(-2), -a-2*b = -(-a*exp(-1)-2*b*exp(-2))},{a,b}); > assign("); > y3:=t->1/2+a*exp(-t) +b*exp(-2*t);
Here's my check that y3 is a solution: > (D@@2)(y3)(t) + 3*D(y3)(t) + 2*y3(t); > evalf(y3(0)-y3(1)); > evalf(D(y3)(0)+D(y3)(1));
Finally, here is a plot of the three solutions. > plot({rhs(y1),rhs(y2),y3},0..1);
Section 12: The Simple Paradigm for Linear Mappings from E to E
It's time to put together two ideas: might one dare to guess the form for the paradigm in a general inner product space? And, what is special about the paradigm in case the linear function it represents is a bounded linear mapping? The answer to both questions is as could be expected.

Example Suppose that {φp}, p = 1, 2, ..., is an infinite maximal orthonormal sequence and that {λp}, p = 1, 2, ..., is a sequence of numbers. The linear transformation A given by
Ax = ∑ λp <x, φp> φp, the sum for p = 1, 2, ...,
is bounded if {λp} is bounded, and unbounded otherwise. To verify this, we note that |Ax| ≤ supp(|λp|) |x| for all x in E and that |A(φp)| = |λp| |φp|. The domain of A is all x for which
∑ |λp <x, φp>|^2 < ∞.
One can show that if {λp} is bounded then A has domain all of E, and that the domain of A is dense in E in any case. It should be noted that the domain of A^2 is a (perhaps proper) subset of the domain of A. (See the Remark below.) Moreover, these are equivalent:
(a) A has an inverse, and
(b) each λp ≠ 0.
Finally, if Bx = ∑ µp <x, φp> φp, then A and B commute.

Remark To see that the domain of A^2 is a subset of the domain of A, suppose that x is in D(A^2). Define a sequence {µp} by µp = λp if |λp| ≥ 1 and µp = 0 if |λp| < 1. Then
∞ > Σp |λp^2 <x,φp>|^2 ≥ Σp |µp^2 <x,φp>|^2 ≥ Σp |µp <x,φp>|^2.
Then,
||x||^2 + ||A^2 x||^2 = Σp |<x,φp>|^2 + Σp |λp^2 <x,φp>|^2
≥ Σp |<x,φp>|^2 + Σp |µp <x,φp>|^2
≥ Σp |λp <x,φp>|^2 = |A(x)|^2.

Assignment
(12.1) Give an example of an unbounded linear function whose inverse is bounded.
(12.2) Give an example of an unbounded linear function whose inverse is unbounded.
(12.3) Give the simple paradigm representation for A: D ⊂ L2[0,1] → L2[0,1] where D = {f: f′′ exists and f(0) = f(1) = 0} and A(f) = f′′.

MAPLE remark. We use MAPLE to verify that the sequence fn(x) = sin(nπx) forms a set of eigenfunctions for A as defined in Assignment 12.3. We see that fn(0) = 0 and fn(1) = 0.
> diff(sin(n*Pi*x),x,x);
The function A can be represented in this simple paradigm as we saw in Assignment 12.3. With MAPLE's help, we do an experiment concerning A. Consider the linear mapping
A(f) = ∑ -2 n^2 π^2 <f(t), sin(nπt)> sin(nπx), the sum for n = 1, 2, ... .
We examine A(t⋅(t-1)).
> AProx:=x->2*sum(-n^2*Pi^2*int(t*(t-1)*sin(n*Pi*t),t=0..1)*sin(n*Pi*x), n=1..10);
> plot(AProx(x),x=0..1);
This last was an approximation for taking two derivatives of t⋅(t-1). Surely, it is irresistible to take the inverse of A, or at least to approximate this inverse.
> APverse:=x->2*sum(-int(t*(t-1)*sin(n*Pi*t),t=0..1)/ (n^2*Pi^2)*sin(n*Pi*x), n=1..10);
> plot({x^4/12-x^3/6+x/12+.0001,APverse(x)},x=0..1);
The question is, why did APverse turn out to be g(x) ≈ x^4/12 - x^3/6 + x/12 instead of, say, x^4/12 - x^3/6? Two derivatives of either give the same result, since the second derivative of x/12 is zero....
Section 13: Adjoint Operators
Not every bounded linear mapping is self adjoint. You know this from your experience with matrices. Never mind. They all have adjoints! Suppose that A is a bounded linear transformation on E. Fix z in E and define L by L(x) = <Ax, z> for all x in E. L is a bounded linear function from E to C. By the Riesz Theorem, there is y in E such that L(x) = <x, y> for all x in E. Consider this pairing of z and y: for each z there is only one y. We define the adjoint of A by pairing this y with z.

Definition If A is a bounded linear function on E then the adjoint of A, A*, is the function on E such that <Ax, z> = <x, A*z> for all x in the domain of A.

Observations
(1) A* is a function from E to E that is linear and bounded.
(2) N(A*) = R(A)⊥.
(3) If Ax = ∑ λp <x, φp> φp for all x, then A*(y) = ∑ λp* <y, φp> φp.

Theorem 26 Suppose that K is a continuous function on [0,1]x[0,1] to R, such that K(x,y) = K(y,x)*. Suppose also that A is defined by
A(f)(x) = ∫ K(x,y) f(y) dy, the integral from 0 to 1.
Then A is self adjoint. Moreover, if M = lub{<Ax, x>: |x| = 1} and m = glb{<Ax, x>: |x| = 1} and λ is an eigenvalue for A, then m ≤ λ ≤ M.

Assignment
(13.1) Suppose that K is a continuous function from [0,1]x[0,1] to C. If
A(f)(x) = ∫ K(x,y) f(y) dy, the integral from 0 to 1,
then A is a bounded, linear function on all of L2[0,1]. Moreover, A* is given by
A*(g)(u) = ∫ H(u,v) g(v) dv, the integral from 0 to 1,
where H(u,v) = K(v,u)*, so that A is self-adjoint if K(x,y) = K(y,x)*.
(13.2) Let
K(x,y) = (1/2)(y−x) − 1/4 if x < y,
K(x,y) = (1/2)(x−y) − 1/4 if y < x.
Define A(f)(x) = ∫₀¹ K(x,y) f(y) dy.
(a) Show that A is a bounded, linear function from L²[0,1] to L²[0,1].
(b) Show that A* = A.
(c) Show that these are equivalent: (i) g = A(f), and (ii) g″ = f, g(0) + g(1) = 0, g′(0) + g′(1) = 0.
(d) Find the eigenvalues and eigenfunctions for A. Ans: λn = −1/[(2n+1)π]², fn(x) = cos((2n+1)πx) and sin((2n+1)πx).
(e) Show that A has the property that if {λp}_{p=1}^∞ is the sequence of eigenvalues then limp λp = 0.

MAPLE remark: Likely, by now, the reader is aware that what analysts call the adjoint of an operator is different from what one often sees called the adjoint of a matrix in texts on linear algebra. We persist. But we take note that the linear algebra package in MAPLE follows the notation of the linear algebra texts, just as we follow the precedent of Hilbert space texts. Perhaps this MAPLE exercise will give understanding to the two "adjoints".
> with(linalg):
> A:=array([[1,2],[3,4]]);
> Ajoint:=adjoint(A); Apose:=transpose(A);
> evalm(A &* Ajoint)/det(A);
> dotprod(evalm(A &* vector([a,b])),vector([x,y])) - dotprod(vector([a,b]),evalm(Apose &* vector([x,y])));
> simplify(");

Thus, for a real matrix, the transpose is the adjoint in the sense of these notes. Here is a complex example.
> A:=array([[1+I,2+3*I],[3-2*I,4]]);
> Ajoint:=adjoint(A); Apose:=transpose(A);
> Aconjpose:=transpose(map(evalc,map(conjugate,A)));
> evalm(A &* Ajoint)/det(A);
> dotprod(evalm(A &* vector([a,b])),vector([x,y])) - dotprod(vector([a,b]),evalm(Aconjpose &* vector([x,y])));
> simplify(");
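The same point can be checked numerically. Here is a Python sketch (our own names and test vectors) verifying that the conjugate transpose of the complex matrix above satisfies the defining identity <Ax, y> = <x, A*y>.

```python
import numpy as np

# The Hilbert-space adjoint of a complex matrix is its conjugate transpose.
A = np.array([[1 + 1j, 2 + 3j], [3 - 2j, 4]])
Astar = A.conj().T

# complex inner product <u, v> = sum_k u_k * conj(v_k)
inner = lambda u, v: np.sum(u * v.conj())

rng = np.random.default_rng(0)
for _ in range(5):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    y = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    assert abs(inner(A @ x, y) - inner(x, Astar @ y)) < 1e-12
```

For a real matrix the conjugation does nothing, which is why the transpose alone sufficed in the first experiment.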
Section 14: Compact Sets

During the next portion of the notes, we will begin an investigation of linear, compact operators. In order to do this study, there are several ideas about number sets that should be remembered: what is a bounded number set, what is a compact number set, what is a sequentially compact number set, and what is a totally bounded number set. These notions carry over to normed spaces, too.

Definitions
A set S is bounded if there is a number b such that if x is in S then |x| ≤ b.
A set C is compact if every open covering of C has a finite subcovering.
A set C is sequentially compact provided that if {sp} is a sequence with values in C then there is a subsequence of {sp} that converges and has limit in C.
A set S is totally bounded if, for each positive number c, there is a finite set of points {xp}_{p=1}^n such that S is contained in ∪p Dc(xp). (Here, and in the remainder of the notes, Dc(s) represents the open disk with center s and radius c.)

Examples
(1) Unbounded sets are not totally bounded.
(2) In L², cl(D₁(0)), the closed unit disk, is not totally bounded. Here's why: let c = √2/2. Since ||ei − ej|| = √2, any collection of disks of radius less than √2/2 which covers all of cl(D₁(0)) must be infinite.
(3) In ℝ³, cl(D₁(0)) is totally bounded. Here's why: suppose that k is a positive integer. Let F be the set of points {a,b,c} such that each of a, b, and c has the form m/k, where m is an integer and −k ≤ m ≤ k. For example, if k = 3, then {a,b,c} might be {2/3, 1/3, −2/3}. If {x,y,z} is any point in cl(D₁(0)), then there is a point {a,b,c} in F such that
||{x,y,z} − {a,b,c}||² < 3/k².
Thus, if c > 0, choose k such that √3/k < c. We cover cl(D₁(0)) by disks with radius c and centers at the points of F.
(4) (0,1) and [0,∞) are not sequentially compact.
Remark: In finite dimensions, if S is a set, then these are equivalent: (a) S is compact, (b) S is sequentially compact, and (c) S is closed and bounded. Our use of these ideas is illustrated in the next theorem, Theorem 27. To show the generality, we reference Introductory Real Analysis, by Kolmogorov and Fomin (translated by Richard Silverman) and published by Dover. In a section numbered 11.2, they prove a Theorem 2: a metric space is compact if and only if it is totally bounded and complete.

Theorem 27. If S is a subset of the Hilbert space {E, < , >}, then these are equivalent: (a) S is sequentially compact, and (b) S is closed and totally bounded.
Suggestion of Proof: (a)⇒(b). To prove S is closed: suppose lim up = v, with each up in S. Sequential compactness implies that v is in S. To prove that S is totally bounded: suppose ε > 0. Pick u1; if S is not contained in Dε(u1), pick u2 in S but not in Dε(u1). If there are an infinite number of such disks, then continue this process. This produces an infinite sequence {up} such that |up − uq| > ε. S being sequentially compact implies that this sequence has a subsequence that converges. This is a contradiction, so there must be only a finite number of such disks.
(b)⇒(a). Suppose that S is totally bounded and that {up} is an infinite sequence with values in S. A convergent subsequence will be extracted. Choose a finite number of disks with radius 1 that cover S. An infinite subsequence of u lies in one of these; call it u1(p). Cover the disk containing u1 with a finite number of disks with radius 1/2. An infinite subsequence of u1 lies in one of them; call it u2(p). Continue. Then |un(n) − um(m)| < 2/n for all m > n, so the diagonal is a Cauchy subsequence of u. Since the space is complete, the sequence has a limit. Since the set is closed, the limit is in S.
Remark: Insight into the geometric structure of a sequentially compact set in L² is gained by realizing that, while a sequentially compact set in L² may be infinite dimensional, it can contain no open set.

Assignment
(1) List four points {a,b} such that if 0 < x < 1 and 0 < y < 1, then
||{x,y} − {a,b}|| < 1/2
for at least one of the four {a,b}.
(2) Show that the Hilbert cube is totally bounded. The Hilbert cube consists of the points {xp} in L² such that |xn| ≤ 1/2ⁿ.
(Hint: if c > 0 and n is chosen with 1/2ⁿ < c/2, then for x in the cube there are xn and rn such that x = xn + rn, |rn| < c/2, and xn is in D₁(0) ∩ ℝⁿ. This last set is totally bounded.)

MAPLE remark: The compact sets in ℝⁿ are precisely the closed and bounded sets. That is, a set in ℝⁿ is compact if and only if it is closed and bounded. In a Hilbert space, every sequentially compact set is closed and bounded, but there are closed and bounded sets that are not sequentially compact. In fact, the closed unit disk is such an example. The following syntax verifies that √2 sin(nπx), n = 1, 2, 3, ..., is an infinite sequence in L²[0,1] in the unit disk, that any two of its members are orthogonal, and that the distance between any two is the square root of two.
> int(2*sin(n*Pi*x)^2,x=0..1);
> int(2*sin(n*Pi*x)*sin(m*Pi*x),x=0..1);
> int(2*(sin(n*Pi*x)-sin(m*Pi*x))^2,x=0..1);
> simplify(");
> sqrt(simplify((4*n^3*m-4*n*m^3)/(n*m*(n^2-m^2))));
Note the relationship between this example and Example 2 of this section.
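A numerical version of the same verification, in Python rather than Maple, with quadrature in place of exact integration (the grid size and tolerances are our choices):

```python
import numpy as np

# Check: sqrt(2) sin(n pi x), n = 1, 2, 3, ... are unit vectors in L^2[0,1],
# mutually orthogonal, and any two are sqrt(2) apart.
x = np.linspace(0, 1, 200001)

def ip(f, g):
    # L^2[0,1] inner product by the trapezoid rule
    w = np.ones_like(x)
    w[0] = w[-1] = 0.5
    return np.sum(w * f * g) * (x[1] - x[0])

e = {n: np.sqrt(2) * np.sin(n * np.pi * x) for n in (1, 2, 3)}
assert abs(ip(e[1], e[1]) - 1) < 1e-8            # unit norm
assert abs(ip(e[1], e[2])) < 1e-8                # orthogonal
d = np.sqrt(ip(e[1] - e[3], e[1] - e[3]))
assert abs(d - np.sqrt(2)) < 1e-8                # distance sqrt(2)
```

Since the members of this sequence stay √2 apart, no finite family of disks of radius less than √2/2 can cover them, which is the point of Example 2.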
Section 15: Compact Operators

If the simple paradigm for a linear operator in a Hilbert space is to be useful, we should be able to look for characteristics of the paradigm that determine characteristics of the operators. We now ask what conditions on the simple paradigm will imply that the image of the unit disk is contained in a sequentially compact set.

Theorem 28. Suppose that {φp}_{p=1}^∞ is a maximal orthonormal sequence in E, {λp}_{p=1}^∞ is a sequence of numbers, and
Ax = ∑_{p=1}^∞ λp <x, φp> φp.
These are equivalent:
(1) limp |λp| = 0, and
(2) cl(A(D₁(0))) is sequentially compact.
Suggestion of Proof: (1)⇒(2). We show the closure of A(D₁(0)) is totally bounded. Since |λp| → 0, there is a number B such that |λp| < B for all p. Then A(D₁(0)) is contained in the closed ball with radius B, for
|A(x)|² = ∑p |λp|² |<x, φp>|² ≤ B²|x|².
We now pick out the set of points required by the definition of totally bounded. For 1 > c > 0, choose N such that max_{N≤p} |λp| < c/2. Since the closed disk with radius B is a sequentially compact subset of ℝᴺ, choose a finite set of points {cp} in ℝᴺ such that if α is in ℝᴺ with |α| ≤ B then |α − cp|² < c/2 for some p. Now pick y in the closure of A(D₁(0)). There is x in D₁(0) such that Ax = y. There is p such that
∑_{q=1}^N [<y, φq> − cpq]² < c/2.
Also,
|y − cp|² = ∑_{q=1}^N [<y, φq> − cpq]² + ∑_{q=N+1}^∞ |<y, φq>|²
≤ c/2 + ∑_{q=N+1}^∞ |λq|² |<x, φq>|² ≤ c/2 + (c/2)|x|² ≤ c,
since |λq|² < (c/2)² < c/2 for q > N and c < 1.
(2)⇒(1). Suppose limp |λp| ≠ 0. Then there are c > 0 and an infinite subsequence, still written {φn}, with |φn| = 1 and |λn| ≥ c. For n ≠ m,
|A(φn) − A(φm)|² = |λn|² + |λm|² ≥ 2c².
Thus {A(φn)} has no convergent subsequence.

DEFINITION. The linear operator A is compact if A(D₁(0)) is contained in a closed and totally bounded set.

Remarks
(1) The statement that the linear operator A is compact is equivalent to the following: if {xp} is a bounded sequence in the domain of A, then there is a subsequence {x_{u(p)}} such that {A(x_{u(p)})} converges.
(2) If {xp} is bounded and converges weakly to y, and A is a compact linear operator, then {A(xp)} converges strongly to A(y).
(3) Every compact linear operator is continuous.

Assignment
(15.1) Let {φp}_{p=1}^∞ be a maximal orthonormal sequence and define A from E to E by
Ax = ∑_{p=1}^∞ (2 − 1/p) <x, φp> φp.
(a) Show that A is self-adjoint, bounded, linear, and not compact.
(b) Show that there is no nonzero z such that Az = 2z.
(c) Show that ||Ax|| ≤ 2||x|| for x in E, and that if b < 2 then there is y in E such that ||Ay|| > b⋅||y||.
(d) Solve the differential equation Z′ = AZ, Z(0) = C ∈ E.
(15.2) Define B from E to E by Bx = ∑_{p=1}^∞ p <x, φp> φp.
(a) Show that B is self-adjoint, unbounded, linear, and not compact.
(b) Show that there is z such that Bz = 2z.
(c) Give a point x not in the domain of B.
(d) Let Z be the solution of the differential equation Z′ + BZ = 0, Z(0) = C ∈ E. Show that the solution is contractive in the sense that |Z(t)| ≤ |C|.
(15.3) Solve these two equations:
(a) ∂z/∂t = ∂²z/∂x², z(t,0) = z(t,1) = 0, z(0,x) = f(x) ∈ L²[0,1]
(b) ∂z/∂t = ∂²z/∂x² + g(t,x), z(t,0) = z(t,1) = 0, z(0,x) = f(x), with f and g(t,⋅) in L²[0,1].
MAPLE remark: We provide a solution for the partial differential equation ∂z/∂t = ∂²z/∂x², z(t,0) = z(t,1) = 0 for all t > 0, z(0,x) = x⋅(x−1) for 0 < x < 1. The methods we use employ the techniques of this course. This is an appropriate place to embed an understanding of the methods of solving partial differential equations by separation of variables into the context of these notes. First, we write this partial differential equation as an ordinary differential equation in L²[0,1]: Z′ = AZ, where Af = f″ with f(0) = f(1) = 0. We have seen the representation for A in Assignment 12.3:
A[f](x) = ∑_{n=1}^∞ −2n²π² < f(s), sin(nπs) > sin(nπx).
MAPLE computes the solution and draws the surface of the solution.
> sum(2*int(s*(s-1)*sin(n*Pi*s),s=0..1)*sin(n*Pi*x),n=1..11);
> z:=(t,x)->sum(exp(-n^2*Pi^2*t)*(2*int(s*(s-1)*sin(n*Pi*s),s=0..1))*sin(n*Pi*x),n=1..11);
> plot3d(z(t,x),t=0..1/2,x=0..1,axes=NORMAL,orientation=[150,65]);
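For readers who wish to check the series solution without Maple, here is a Python sketch. The truncation at 11 terms matches the Maple syntax above; the sine coefficient 2∫₀¹ s(s−1)sin(nπs) ds is replaced by its closed form 4((−1)ⁿ − 1)/(n³π³), which we computed by hand.

```python
import numpy as np

# Separated-variables solution of z_t = z_xx, z(t,0) = z(t,1) = 0,
# z(0,x) = x(x-1), truncated at 11 terms.
def z(t, x, terms=11):
    n = np.arange(1, terms + 1)
    c = 4 * ((-1.0)**n - 1) / (n**3 * np.pi**3)   # Fourier sine coefficients
    return np.sum(np.exp(-n**2 * np.pi**2 * t) * c * np.sin(n * np.pi * x))

for x in (0.2, 0.5, 0.8):
    assert abs(z(0.0, x) - x * (x - 1)) < 1e-3    # initial condition
assert abs(z(0.3, 0.0)) < 1e-12                   # boundary condition at x = 0
assert abs(z(1.0, 0.5)) < abs(z(0.1, 0.5))        # the solution decays in t
```

Each mode decays like exp(−n²π²t), so after even a short time the n = 1 term dominates the surface that the plot3d command draws.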
Section 16: The Space of Bounded Linear Transformations

One goal has been, and continues to be, the investigation of eigenvalue problems. That is, we wish to establish whether we can find a vector v and a number λ such that Av = λv. We must do this if we are to generate a paradigm representation for the linear transformation A. We will see that if A is self-adjoint and compact, then there are solutions for an eigenvalue problem. En route, we need to consider the operator topology. The notion of a norm for a matrix has been suggested in these notes already.

Definition. BLT(E₁,E₂) is the space of bounded, linear transformations from E₁ to E₂. If L is in the space then ||L||₁,₂ is the smallest number b such that |L(x)|₂ ≤ b|x|₁ for all x in E₁.

Examples
(1) Suppose that {φp}_{p=1}^∞ is a maximal orthonormal sequence in E₁ and that {ψp}_{p=1}^∞ is a maximal orthonormal sequence in E₂. Define
Lx = ∑_{p=1}^∞ (2 − 1/p) <x, φp> ψp.
It follows that ||L||₁,₂ = 2.
(2) Show that if limp αp = 0 then limn ∑_{p=1}^n αp <⋅, φp> φp converges in the norm of BLT; otherwise (for a bounded sequence {αp}), limn ∑_{p=1}^n αp <x, φp> φp converges only strongly, in the sense of Section 5.
(3) Let {αp}_{p=1}^∞ be a point in L² and {φp}_{p=1}^∞ be a maximal orthonormal sequence in L²([0,1]). Define Kn(x,y) = ∑_{p=1}^n αp φp(x) φp(y) on [0,1]×[0,1]. Then, for m > n,
∫₀¹ ∫₀¹ |Kn(x,y) − Km(x,y)|² dx dy = ∑_{p=n+1}^m |αp|².
Hence {Kp}_{p=1}^∞ is Cauchy in L²([0,1]×[0,1]) and has limit
K(x,y) = ∑_{p=1}^∞ αp φp(x) φp(y).
Define Kn from L²[0,1] to L²[0,1] by
Kn(f)(x) = ∫₀¹ Kn(x,y) f(y) dy.
Then
∫₀¹ | ∫₀¹ [Km(x,y) − Kn(x,y)] f(y) dy |² dx
≤ ∫₀¹ ( ∫₀¹ [Km(x,y) − Kn(x,y)]² dy ⋅ ∫₀¹ |f(y)|² dy ) dx
≤ ||Km − Kn||² ||f||².
Therefore, {Kp}_{p=1}^∞ is Cauchy in BLT on L²[0,1].

Theorem 29. Suppose that {Lp}_{p=1}^∞ is a sequence of compact operators in BLT and converges to K in the norm of BLT. Then K is compact.
Suggestion of Proof: First, note that K is a bounded linear operator, because corresponding to ε = 1 there is N such that if n ≥ N then |Ln(x) − K(x)| ≤ 1⋅|x|, so that
|K(x)| ≤ |Ln(x)| + |K(x) − Ln(x)| ≤ |Ln(x)| + 1⋅|x|.
We now need to show that the closure of K(D₁(0)) is totally bounded. Let ε > 0. We seek a finite number of ε-disks that cover the closure of K(D₁(0)). Let N be such that if n > N then |Ln(x) − K(x)| < ε|x|/2. Since Ln is compact, there is a finite sequence {cp} such that ∪p D_{ε/2}(cp) ⊇ cl(Ln(D₁(0))). Let x ∈ D₁(0). Then, for some p,
|K(x) − cp| ≤ |K(x) − Ln(x)| + |Ln(x) − cp| < ε.
Hence, cl(K(D₁(0))) ⊆ ∪p D_ε(cp).

Remark: We showed that if lim_{p→∞} |λp| = 0 and {φp} is a maximal orthonormal sequence, then
Ln = ∑_{p=1}^n λp <⋅, φp> φp
converges in the BLT norm to
∑_{p=1}^∞ λp <⋅, φp> φp.
Moreover, each Ln is a compact operator. This provides another example of a sequence of compact operators that converges in the BLT norm to a compact limit.

Example. Let K: [0,1]×[0,1] → ℂ be continuous and let
K(f)(x) = ∫₀¹ K(x,y) f(y) dy.
Then K is compact.

Theorem 30. If A is a bounded, linear, self-adjoint operator, then these are equivalent:
(1) b > 0 and |Ax| ≤ b|x| for all x in E, and
(2) b ≥ sup{ |<Ax, x>| : |x| = 1 }.
Proof: (1)⇒(2) This follows from the Cauchy-Schwarz inequality.
(2)⇒(1) Suppose Ax ≠ 0. Let α = √(|Ax|/|x|) and z = Ax/α. Then
|Ax|² = <Ax, Ax> = <A(αx), z>
= 1/4 [ <A(αx+z), αx+z> − <A(αx−z), αx−z> ]
≤ 1/4 b ( |αx+z|² + |αx−z|² )
and, expanding these dot products,
= 1/2 b ( |αx|² + |z|² ) = 1/2 b [ α²|x|² + |Ax|²/α² ]
and, using the definition of α,
= b |x| |Ax|.
Thus, |Ax| ≤ b|x|.

Example: A part of the hypothesis for Theorem 30 is that A is self-adjoint. This part of the hypothesis was not used in getting that (1) implies (2). However, that part cannot be completely removed, as is illustrated by the following example. Suppose that φ₁ and φ₂ are orthonormal. Take Ax = <x, φ₁> φ₂. Then A is linear, and |Ax| = |<x, φ₁>| ≤ |x|. However,
|<Ax, x>| = |<x, φ₁>| |<x, φ₂>| ≤ 1/2 ( |<x, φ₁>|² + |<x, φ₂>|² ) ≤ 1/2 |x|².
For example, if A(x,y) = (y,0), then A is bounded, not self-adjoint, and <A(x,y), (x,y)> = xy.
The maximum value of xy subject to x² + y² = 1 is 1/2, not 1, which is ||A||.

Examples: Mindful that some authors take all linear projections to be bounded (and orthogonal), here are two examples of linear projections that are unbounded; the second one has as its range a finite-dimensional space.
Example (1) (Due to Mary Chamlee, Fall '92) E = L² and
P({x₁, x₂, x₃, ..., xn, ...}) = {x₁ − x₂, 0, x₃ − 2x₄, 0, x₅ − 3x₆, 0, ...}.
Example (2) (Due to Mahdi Zaidan, Fall '92) E = L² and
P({x₁, x₂, x₃, ..., xn, ...}) = {x₁ + 2x₃ + 3x₄ + ..., 0, 0, ...}.

Assignment: Suppose that K is defined by
K(f)(x) = ∫₀¹ cos(π(x−y)) f(y) dy.
Show that K has finite-dimensional range. Argue that it is a compact operator. Compute ||K||.

MAPLE remarks: We have already suggested that MAPLE can find matrix norms. In the next section, the concern will be getting eigenvalues for self-adjoint linear mappings. A hint for how this will go is suggested in these MAPLE experiments.
> with(linalg):
> A:=array([[2,1],[1,2]]);
> evalf(norm(A,1)); evalf(norm(A,2)); evalf(norm(A,infinity));
> eigenvals(A);
> A:=array([[2,1,0],[1,0,0],[0,0,5]]);
> evalf(norm(A,1)); evalf(norm(A,2)); evalf(norm(A,infinity));
> lambda:=eigenvals(A);
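The first of these experiments can be repeated in Python (our own tolerances): for a symmetric matrix, the operator 2-norm agrees with the largest eigenvalue in absolute value, which is the pattern the next section develops in the Hilbert space setting.

```python
import numpy as np

# For the symmetric matrix A = [[2,1],[1,2]], the spectral norm equals
# the largest |eigenvalue|.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
evals = np.linalg.eigvalsh(A)            # eigenvalues of a symmetric matrix
assert np.allclose(sorted(evals), [1.0, 3.0])
assert abs(np.linalg.norm(A, 2) - max(abs(evals))) < 1e-12
```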
Section 17: The Eigenvalue Problem

We now have the proper space in which to solve the eigenvalue problem: the space of the bounded linear transformations. We proceed to show that the eigenvalue problem can be solved.

Theorem 31. Suppose that A is a compact, self-adjoint operator. There is a real number λ and a non-zero vector z such that |λ| = ||A|| and Az = λz.
Proof: By Theorem 30, there is a sequence {xp} in E such that |xp| = 1 and limp <Axp, xp> = λ, where |λ| = ||A||. Since A is compact, there is z in E and a subsequence {x_{u(p)}} such that limp Ax_{u(p)} = z. Along that subsequence,
0 ≤ |Axn − λxn|² = |Axn|² + |λxn|² − λ<Axn, xn> − λ<xn, Axn>
≤ (||A||² + |λ|²) − 2λ<Axn, xn> → 2λ² − 2λ⋅λ = 0.
Thus λ limp x_{u(p)} = limp Ax_{u(p)} = z. Applying A to both sides, λ limp Ax_{u(p)} = Az, that is, λz = Az. (If λ = 0, then ||A|| = 0 and any nonzero z serves.)

Theorem 32. If A is a compact, self-adjoint operator and α > 0, then the number of eigenvalues λ with |λ| > α is finite. Moreover, if λ ≠ 0 is an eigenvalue for A and M = {x: Ax = λx}, then M is finite dimensional.
Proof: If there were infinitely many orthonormal eigenvectors xp with eigenvalues |λp| > α, then
|Axp − Axq|² = |λp xp − λq xq|² = |λp|²|xp|² + |λq|²|xq|² > 2α²,
so {Axp} could have no convergent subsequence, contradicting the compactness of A.

Corollary 33. If A is a compact, self-adjoint operator, then there is a sequence {φp} of orthonormal vectors and a number sequence {λp} such that
A(x) = ∑_{p=1}^∞ λp <x, φp> φp.
Proof: We have established that A has an eigenvalue λ with |λ| = ||A||. Let M₁ = {x: Ax = λx}. We also know that M₁ is finite dimensional. Let {φp}_{p=1}^N be an orthonormal basis for M₁. Then A(φp) = λφp for each p. Let {θp}_{p=N+1}^∞ be the additional orthonormal vectors that make a maximal orthonormal family. Let N be the set of (possibly infinite) combinations of {θp}_{p=N+1}^∞. Then A maps N into N. To see this, it suffices to see that A(θi) is in N for each i: the component of A(θi) in M₁ is
∑_{p=1}^N <Aθi, φp> φp = ∑_{p=1}^N <θi, Aφp> φp = ∑_{p=1}^N λ <θi, φp> φp = 0.
Thus, A restricts to a compact, self-adjoint operator on the Hilbert space N. From the above, it has an eigenvalue µ with |µ| ≤ ||A||. We now make M₂ and continue the process. The previous Theorem 32 establishes that all eigenvalues of A will be found this way.

Remark: The above Theorem 31 requires that A be compact and self-adjoint. One cannot guarantee non-trivial eigenvalues and eigenfunctions if A is not self-adjoint. An example follows. Let
K(x,y) = 0 if 0 ≤ x ≤ y ≤ 1,
K(x,y) = e^(y−x) − e^(2(y−x)) if 0 ≤ y ≤ x ≤ 1,
and H(x,y) = K(y,x).
[Graph of K as given above: zero above the diagonal, a low ridge below it.]
Define A(f)(x) = ∫₀¹ K(x,y) f(y) dy and B(g)(x) = ∫₀¹ H(x,y) g(y) dy. Each of A and B is a bounded, compact linear function from L²[0,1] to L²[0,1]. Also, A* = B. Moreover, these are equivalent: (a) g = A(f), and (b) g″ + 3g′ + 2g = f, g(0) = g′(0) = 0. Also, these are equivalent: (c) g = B(f), and (d) g″ − 3g′ + 2g = f, g(1) = g′(1) = 0.
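The equivalence of (a) and (b) can be spot-checked symbolically. Here is a sympy sketch with the trial function f = 1 (our own choice of test function):

```python
import sympy as sp

# g(x) = int_0^x (e^(y-x) - e^(2(y-x))) f(y) dy with f = 1 should satisfy
# g'' + 3g' + 2g = f, g(0) = 0, g'(0) = 0.
x, y = sp.symbols('x y')
f = sp.Integer(1)
g = sp.integrate((sp.exp(y - x) - sp.exp(2*(y - x))) * f, (y, 0, x))

residual = sp.diff(g, x, 2) + 3*sp.diff(g, x) + 2*g - f
assert sp.simplify(residual) == 0                      # the ODE holds
assert sp.simplify(g.subs(x, 0)) == 0                  # g(0) = 0
assert sp.simplify(sp.diff(g, x).subs(x, 0)) == 0      # g'(0) = 0
```

The kernel e^(y−x) − e^(2(y−x)) is exactly the variation-of-parameters kernel for the operator d²/dx² + 3 d/dx + 2 with zero initial data at 0.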
To see that A has no non-trivial eigenvalues and eigenfunctions, note that A(f)(x) = ∫₀ˣ K(x,y) f(y) dy, so that A is a Volterra integral operator: the integral is from 0 to x, not 0 to 1. If there were a number λ ≠ 0 and a function f such that
λf(x) = ∫₀ˣ K(x,y) f(y) dy,
then iterating this identity gives
|f(x)| ≤ (Bⁿ/|λ|ⁿ)(xⁿ/n!) M for each integer n,
where |K(x,y)| ≤ B and |f(x)| ≤ M. Hence ||f|| = 0.

Remark: Here is a pair of self-adjoint linear operators that are not compact: A(x) = x for all x in any infinite dimensional Hilbert space, and B(f)(t) = t⋅f(t) in L²([0,1]). A way to see that the latter is not compact is to note that it has no eigenvalues.

Assignment
(17.1) Go back and see where the proof of Theorem 31 uses A = A*.
(17.2) Suppose that E has a countable maximal orthonormal sequence. If P is a continuous projection then these are equivalent: (a) P is a compact operator, and (b) the range of P is finite dimensional.
(17.3) Definition: Suppose that A is a bounded, self-adjoint operator on E and that {φp}_{p=1}^∞ is a maximal orthonormal family. Then
trace A = ∑p <Aφp, φp>.
This sum may be "∞". If {λn} is a sequence of real numbers, E is L²[0,1],
K(s,t) = ∑n λn φn(s) φn(t), and A(f)(x) = ∫₀¹ K(x,y) f(y) dy,
then trace A = ∫₀¹ K(s,s) ds.
(17.4) Show that if K(s,t) = 4cos(s−t) then the operator given by
A(f)(x) = ∫_{−π}^{π} K(x,s) f(s) ds
is a self-adjoint operator on L²[−π,π] with finite-dimensional range. Find projections and numbers such that the operator is given by
λ₁P₁ + λ₂P₂ + ⋅⋅⋅ + λnPn.
(17.5) Let {φp}_{p=1}^∞ be a maximal orthonormal family and A be defined by
Ax = ∑_{p=1}^∞ (1/p) <x, φp> φp+1,
so that A is a "weighted shift". Show that
(a) ||A||² ≤ ∑_{p=1}^∞ 1/p².
(b) A = limn ∑_{p=1}^n (1/p) <⋅, φp> φp+1 in BLT.
(c) A is compact.
(d) A has no nonzero eigenvalues.
(e) Give a formula for A*.

MAPLE remarks: Here is the syntax for drawing the graph of a kernel. Seeing the shape of the graph gives a geometric understanding to the algebraic notion of symmetry: K(x,y) = K(y,x). The following kernel, when graphed, is not symmetric.
> K:=proc(x,y) if x <= y then 0 else exp(y-x) - exp(2*(y-x)) fi end;
> plot3d(K, 0..1, 0..1,axes=NORMAL);
Sometimes a finite-dimensional analogue of an infinite-dimensional result will show that the ideas are quite easy indeed. Consider this example in connection with Assignment 17.5.
> with(linalg):
> A:=array([[0,0,0,0],[1,0,0,0],[0,1/2,0,0],[0,0,1/3,0]]);
> evalf(norm(A,1)); evalf(norm(A,2)); evalf(norm(A,infinity));
> eigenvals(A);
> transpose(A);
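A Python version of the same check on the 4×4 weighted-shift analogue: the matrix is strictly lower triangular, so its only eigenvalue is 0, in keeping with part (d) of Assignment 17.5, and its adjoint is simply the transpose.

```python
import numpy as np

# Finite-dimensional analogue of the weighted shift in Assignment 17.5.
A = np.array([[0, 0, 0, 0],
              [1, 0, 0, 0],
              [0, 1/2, 0, 0],
              [0, 0, 1/3, 0]], dtype=float)

assert np.allclose(np.diag(A), 0)                    # strictly triangular
assert np.allclose(np.linalg.matrix_power(A, 4), 0)  # hence nilpotent: A^4 = 0
# The adjoint (transpose) shifts the other way: A^T maps e2 back to e1.
assert np.allclose(A.T @ np.array([0, 1, 0, 0.0]), [1, 0, 0, 0])
```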
Section 18: Normal Operators and The More General Paradigm

Self-adjoint operators play the role in BLT that real numbers do in ℂ, not only in the sense that A = A*, but also in this sense: if T is in BLT, then there are self-adjoint operators A and B such that T = A + iB. In fact, A = (T + T*)/2 and B = (T − T*)/(2i). We continue our investigation into the representation of operators as ∑p λp Pp. For reasons explained previously, we ask that the sequence of projections should form a resolution of the identity.

Definition. T is normal provided that TT* = T*T.

Theorem 34. Suppose that T is in BLT and each of A and B is self-adjoint with T = A + iB. These are equivalent: (a) T is normal, and (b) AB = BA.
Proof: To prove (b), simply compute AB and BA. To prove (a), compute TT* and T*T.

Theorem 35. With the supposition of Theorem 34, suppose also that T is normal. These are equivalent: (a) T is compact, (b) each of A and B is compact, and (c) T* is compact.
Proof: (a)⇒(b):
|Tx|² = |(A+iB)x|² = |Ax|² + i<Bx, Ax> − i<Ax, Bx> + |Bx|² = |Ax|² + |Bx|²,
for the cross terms cancel: since AB = BA, the operator AB is self-adjoint, so <Bx, Ax> = <ABx, x> is real. Hence |Ax| ≤ |Tx| and |Bx| ≤ |Tx|, and the compactness of T passes to A and B.
(b)⇒(a): |Txn − Txm|² = |Axn − Axm|² + |Bxn − Bxm|².

Theorem 36. Suppose that T is a compact, normal operator. Then T has an eigenvalue σ with max(||A||, ||B||) ≤ |σ|.
Proof: We know there is an eigenvalue λ for A with |λ| = ||A||. Consider N = {x: Ax = λx} = N(λI − A). This is a Hilbert space. To see that B maps N into N, let n ∈ N. Then A(Bn) = B(An) = B(λn) = λ(Bn). Also, B is compact and self-adjoint on N, so there are µ and a nonzero y in N such that By = µy. Let σ = λ + iµ and v = y. Then Tv = (A + iB)v = Av + iBv = λv + iµv = σv. Also, |σ| ≥ max(|λ|, |µ|).

Theorem 37. If T is bounded and normal, then (a) |Tx| = |T*x|, (b) if Tx = λx then T*x = λ*x, and (c) if λ ≠ µ, Tx = λx, and Ty = µy, then <x, y> = 0.
Remark: Actually, statement (a) is equivalent to the statement that T is normal.

Theorem 38. If T is a compact normal operator on E then there is a family {φp}_{p=1}^∞ of orthonormal vectors which is maximal in E and a sequence {λp} of complex numbers such that, if x is in E, then
Tx = ∑_{p=1}^∞ λp <x, φp> φp.
Moreover, if 0 is not an eigenvalue of T, then the eigenvectors form a complete orthonormal system.
The proof of this theorem is just like that of Theorem 36. The proof that the orthonormal family is maximal, or complete, can be argued as follows: suppose that v is a nonzero vector with <v, φp> = 0 for all p. Then v must be in the nullspace of T, so that Tv = 0v. This contradicts the supposition that 0 is not an eigenvalue of T.

Assignment
(18.1). Suppose that {λp}_{p=1}^∞ is a bounded sequence of complex numbers and {Pp}_{p=1}^∞ is a sequence of orthogonal projections. Show that ∑p λpPp is normal.
(18.2). Show that T(x) = ∑p <x, φp> φp+1 is not normal.
(18.3). Show that the matrix
[ 1 −1 ]
[ 0  0 ]
is a projection that is not normal.
(18.4). Give an example of a matrix that is normal but not self-adjoint.

MAPLE Remark: The challenge is to make up a normal operator that is not self-adjoint.
> with(linalg):
> T:=array([[0,0,0,0],[1,0,0,0],[0,1/2,0,0],[0,0,1/3,0]]);
> A:= evalm(T+transpose(T))/2;
> B:= evalm(T-transpose(T))/(2*I);
> evalm(A &* B); evalm(B &* A);
Since A and B do not commute, T is not normal. On the other hand, here is a matrix that is normal but not self-adjoint:
T = [ 1 −1 ]
    [ 1  1 ].
> T:=array([[1,-1],[1,1]]);
> A:=evalm((T+transpose(T))/2);
> B:=evalm((T-transpose(T))/(2*I));
Note that A and B are self-adjoint. This is clear for A. To see that it is true for B, compute:
> Bstar:=transpose(-B);
> evalm(A &* B); evalm(B &* A);
> evalm(A + I*B);
Is it clear from this that T is normal?
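A quick numerical confirmation in Python that this T commutes with its adjoint while failing to be self-adjoint:

```python
import numpy as np

# T = [[1,-1],[1,1]] is normal (T T* = T* T) but not self-adjoint (T != T*).
T = np.array([[1.0, -1.0], [1.0, 1.0]])
assert np.allclose(T @ T.T, T.T @ T)   # both products are 2*I
assert not np.allclose(T, T.T)         # not self-adjoint
```

Both products come out as twice the identity, which is why T is, up to a scalar, a rotation; rotations are the standard examples of normal, non-self-adjoint operators.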
Section 19: Compact Operators and Orthonormal Families

A question arises about how compact operators map orthonormal families. If T is compact and normal and {φp}_{p=1}^∞ is the sequence of orthonormal eigenvectors, then lim_{n→∞} Tφn = lim_{n→∞} λnφn = 0. What if T is compact, but not necessarily normal (or self-adjoint), and {φp}_{p=1}^∞ is an orthonormal family? Must lim_{n→∞} Tφn = 0?

Theorem 39. Suppose that T is compact and {φp}_{p=1}^∞ is an orthonormal family. Then lim_{n→∞} Tφn = 0.
Proof: Suppose not. There is a subsequence {φ_{u(p)}}_{p=1}^∞ such that |Tφ_{u(p)}| ≥ ε for some ε > 0. Since {φ_{u(p)}}_{p=1}^∞ is bounded and T is compact, there is a subsequence of {Tφ_{u(p)}}_{p=1}^∞ that converges and has limit, say, v ≠ 0. Then, along that subsequence,
lim_{n→∞} <Tφ_{u(n)}, v> = <v, v> ≠ 0.
But, also,
lim_{n→∞} <Tφ_{u(n)}, v> = lim_{n→∞} <φ_{u(n)}, T*v> = 0,
because these last are terms in the Fourier expansion of T*v = ∑p <T*v, φp> φp, and the terms of this sum must go to zero. This is a contradiction.

Remark: The ideas around Theorem 38 provide an easy characterization of when operators commute.

Theorem 40. Suppose that A and B are compact, normal operators. These are equivalent: (a) AB = BA, and (b) there is a maximal orthonormal family {φp}_{p=1}^∞ whose members are eigenvectors for A and for B.
Proof: If (b) holds, this is clear from the representation of Theorem 38. Suppose that (a) holds and λ is an eigenvalue of A. Let S be the subspace of vectors x such that Ax = λx. Because A and B commute, B maps S into S. Hence there is a sequence of eigenvectors for B that spans S. But each of these is an eigenvector for A corresponding to λ. This process is symmetric in A and B. The representation of Theorem 38 completes the result.

Remarks
(1) If λ is a nonzero eigenvalue for BA, then it is an eigenvalue for AB. To see this, suppose BAx = λx. Then (AB)(Ax) = A(λx) = λ(Ax), and Ax ≠ 0 since BAx ≠ 0. Thus, λ is a nonzero eigenvalue for AB.
(2) Some use this last result to characterize normal operators this way: an operator is normal if and only if it and its adjoint can be "simultaneously diagonalized."
(3) This course has investigated the representation of linear transformations as ∑_{p=1}^∞ λp <x, φp> φp. This representation gives insight as to the nature of linear transformations. We have found the representation is appropriate for compact, self-adjoint and normal operators. From examples, we have seen that it gives an understanding of bounded, even if not compact, operators, and even of unbounded operators on a Hilbert space. The representation should give insight and unification to some of the ideas that are encountered in a study of integral equations, Green's functions, partial differential equations, and Fourier series.

Assignment
(1) Find the eigenvalues for
R(x) = ∑_{p=1}^∞ <x, φp> φp+1 and for L(x) = ∑_{p=1}^∞ <x, φp+1> φp.
(One of these has no eigenvalue and the other has every number in the unit disk as an eigenvalue.)
(2) Do the weighted left shift and right shift have a representation in the simple paradigm?

MAPLE Remark: The finite-dimensional analogues of the right shift and the left shift might be explored as follows.
> with(linalg):
> R:=array([[0,0,0,0],[1,0,0,0],[0,1,0,0],[0,0,1,0]]);
> L:=array([[0,1,0,0],[0,0,1,0],[0,0,0,1],[0,0,0,0]]);
Looked at this way, we see that the right shift and the left shift are infinite-dimensional analogues of nilpotent operators. We should check to confirm the eigenvalues and eigenvectors of these two.
> eigenvects(R); eigenvects(L);
There is a new idea that should be brought up here. One could define "generalized eigenvectors" for A as vectors v for which there is a number λ such that (A − λI)v ≠ 0 but (A − λI)²v = 0. Such a v is a generalized eigenvector of rank 2. One could define a generalized eigenvector of rank k. Here's an example:
> A:=array([[1,1,2],[0,1,3],[0,0,2]]);
> charpoly(A,x);
> eigenvects(A);
Note that 1 is an eigenvalue of multiplicity 2, but has only one eigenvector. We go looking for one generalized eigenvector of rank 2. To that end, we find the nullspace of (A−I)²:
> AmIdnty:=evalm((A-diag(1,1,1))&*(A-diag(1,1,1)));
> nullspace(AmIdnty);
Thus, we have that (A − 2I){5,3,1} = {0,0,0}, (A − I){1,0,0} = {0,0,0}, and (A − I)²{0,1,0} = {0,0,0}. Several questions come to mind: (1) What are the generalized eigenvectors for R and L above? (2) How do these fit into the context and structure of the paradigm presented in these notes?
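These three identities are easy to confirm numerically; here is a Python sketch:

```python
import numpy as np

# Verify the eigenvector and generalized-eigenvector claims for
# A = [[1,1,2],[0,1,3],[0,0,2]].
A = np.array([[1, 1, 2], [0, 1, 3], [0, 0, 2]])
I = np.eye(3)

assert np.allclose((A - 2*I) @ [5, 3, 1], 0)           # eigenvector for 2
assert np.allclose((A - I) @ [1, 0, 0], 0)             # eigenvector for 1
assert not np.allclose((A - I) @ [0, 1, 0], 0)         # not an eigenvector...
assert np.allclose((A - I) @ (A - I) @ [0, 1, 0], 0)   # ...but rank-2 generalized
```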
Section 20: The Most General Paradigm: A Characterization of Compact Operators

The paradigm that has been suggested in these notes is applicable for compact and normal operators. This is a fairly satisfactory state of affairs. Yet, the simple matrices
[ 0 1 ]        [ 0 1 0 ]
[ 0 2 ]  and   [ 0 0 0 ]
               [ 0 0 2 ]
do not fit into that situation. We will push the representation one more time. In addition to the satisfaction of having a decomposition that is applicable to those two matrices, we will be able to obtain the Fredholm Alternative Theorems for mappings with fewer hypotheses.

Lemma 41. (1) If A is bounded and B is compact then AB is compact. (2) If A is compact and B is bounded then AB is compact. (3) If T is compact, then T* is compact. (Hint: since T is compact, then TT* is compact, and <TT*x, x> = |T*x|².)

Theorem 42. Suppose that T is a compact operator from E to E. There are maximal orthonormal families {Φp} and {Θp} and a non-increasing number sequence {λp}_{p=1}^∞ such that limp λp = 0 and, if x is in E, then
Tx = ∑_{p=1}^∞ λp <x, Φp> Θp.
Moreover, the convergence is in the norm of BLT.
Proof: Suppose that T is compact. First, T* is bounded since T is. To see this,
|T*x|² = <T*x, T*x> = <TT*x, x> ≤ ||T|| |T*x| |x|,
so that |T*x| ≤ ||T|| |x|. Now, knowing that T* is bounded and T is compact, we get that T*T is compact, and it is self-adjoint. Moreover, <T*Tx, x> = |Tx|² ≥ 0, so that all the eigenvalues of T*T are nonnegative. Arrange the eigenvalues in decreasing order:
T*Tx = ∑p µp <x, xp> xp.
For each n such that µn ≠ 0, let yn = T(xn)/√µn. Then, for n ≠ m,
<yn, ym> = <Txn, Txm>/√(µnµm) = <T*Txn, xm>/√(µnµm) = √(µn/µm) <xn, xm> = 0.
Thus, {yp} is orthogonal, even orthonormal. Extend it to be maximal. Then T(xp) = √µp yp, even if µp = 0. Suppose that x = ∑p <x, xp> xp. Then
Tx = T(∑p <x, xp> xp) = ∑p <x, xp> Txp = ∑p √µp <x, xp> yp.
To see that this sum converges in the BLT norm,
|∑_{p=n+1}^∞ √µp <x, xp> yp|² = ∑_{p=n+1}^∞ µp |<x, xp>|² ≤ µ_{n+1} |x|².
Assignment
(20.1) Perhaps you will agree that applying this decomposition to the matrices
[ 0 1 ]        [ 0 1 0 ]
[ 0 2 ]  and   [ 0 0 0 ]
               [ 0 0 2 ]
is irresistible. Note that this is different from the decomposition which we had in the first of these notes.
(20.2) With T as in Theorem 42, what is T*?

MAPLE Remark: We will get the generalized paradigm for a matrix T that is not normal.
> with(linalg):
> T:=array([[0,1,0],[0,0,0],[0,0,2]]);
We form the self-adjoint matrix T*T.
> A:= evalm(transpose(T) &* T);
> eigenvects(A);
> x[1]:=vector([0,1,0]); x[2]:= vector([0,0,1]); x[3]:= vector([1,0,0]);
> y[1]:=evalm(T &* x[1]/1); y[2]:=evalm(T &* x[2]/2); y[3]:=evalm(T &* x[3]);
> y[3]:=[0,1,0];
The proposal is that T(u) = 1⋅<u, x₁>y₁ + 2⋅<u, x₂>y₂ + 0⋅<u, x₃>y₃. We check this.
> u:=vector([a,b,c]);
> evalm(T &* u);
> evalm(1*innerprod(u,x[1])*y[1] + 2*innerprod(u,x[2])*y[2] + 0*innerprod(u,x[3])*y[3]);
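The generalized paradigm of Theorem 42 is what numerical libraries call the singular value decomposition. A Python check on the same T (the ordering of the singular values 2, 1, 0 matches the µ's arranged decreasingly):

```python
import numpy as np

# T = sum_p s_p <., v_p> u_p, with s_p the singular values of T.
T = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 2.0]])
U, s, Vt = np.linalg.svd(T)
assert np.allclose(s, [2.0, 1.0, 0.0])

# Reassemble T from its rank-one pieces s_p * u_p v_p^T.
assert np.allclose(sum(s[p] * np.outer(U[:, p], Vt[p]) for p in range(3)), T)
```

The columns of V play the role of the Φ's, the columns of U the role of the Θ's.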
Section 21: The Fredholm Alternative Theorems

The Fredholm Alternative theorems concern the equation (1-A)u = f. These ideas come up repeatedly in differential equations and in integral equations. The Alternative Theorems state necessary and sufficient conditions for the equation (1-A)u = f to have a solution u for a previously specified f. There are two alternatives: either the equation has exactly one solution for every f, or the equation has many solutions for some f's and none for others. Those f for which there are solutions are characterized.

Lemma 43 Suppose that A is compact. There are orthonormal sequences {φp} and {θp} such that

	[1-A](x) = Σ_{p=1}^∞ (1-λp) <x, φp> θp.

Suggestion of Proof: -A* - A + A*A is compact and self adjoint. Also, [1-A]*[1-A] = 1 - A* - A + A*A. Let -A* - A + A*A = Σ_p (-µp) <·, φp> φp. Also, I = Σ_p <·, φp> φp, so that [1-A]*[1-A] = Σ_p (1-µp) <·, φp> φp. Let λp = 1 - √(1-µp) and θp = [1-A](φp)/(1-λp). Note that <θi, θj> = 0 when i ≠ j. Then x = Σ_p <x, φp> φp and [1-A](x) = Σ_{p=1}^∞ (1-λp) <x, φp> θp.
Theorem 44 Suppose that A is compact. Exactly one of the following alternatives holds: (a) for each f in E, the equation (1-A)u = f has exactly one solution; (b) the equation (1-A)u = 0 has more than one solution.

Theorem 45 If A is compact and the first alternative holds for the equation (1-A)u = f, then it also holds for the equation (1-A*)u = f.

Theorem 46 Suppose A is compact and the second alternative holds for the equation. Then these are equivalent: (1) the equation (1-A)u = f has a solution, and (2) <f, z> = 0 for all solutions z of the equation (1-A*)z = 0. (Hint: to see that 2 implies 1, write out what the null space of (1-A*) is.)
Remark For general linear operators, the question might be: given f, does the equation Bu = f have a solution? For square matrices, we can completely answer the question. Exactly one of the following holds: (a) det(B) ≠ 0 and for each f in E the equation Bu = f has exactly one solution; (b) det(B) = 0 and Bu = 0 has more than one solution. For a general compact operator, the alternatives are not so definitive: let

	B = Σ_{p=1}^∞ (1/p) <·, φp> φp.

Then B(u) = 0 has only one solution, but there are f's for which B(u) = f has no solution.

Assignment
(21.1) In R², let A(x,y) = {x+y, 2y}. Show that (1-A)u = 0 and (1-A*)u = 0 have non-trivial solutions. What conditions must f satisfy in order that (1-A)u = f should have a solution?
(21.2) In L²[0,1], let A(f)(x) = ∫_0^1 [1 + cos(πx) sin(πt)] f(t) dt. Show that (1-A)u = 0 and (1-A*)u = 0 have non-trivial solutions. If (1-A)u = f has a solution, what conditions must f satisfy?
(21.3) Show that if A is compact and normal then the representation can be

	[1-A](x) = Σ_{p=1}^∞ (1-λp) <x, φp> φp.

MAPLE Remark: In the matrix situation, the Fredholm Alternative Theorems are no more interesting than evaluation of the determinant. If the determinant is not zero, the matrix is in the first alternative. If the determinant is zero, it is in the second alternative. Since det(1-A) = det(1-A*), it is no surprise that the first alternative holds for 1-A if and only if it holds for 1-A*. The last of the Fredholm Alternative theorems is a little more interesting. We examine it with a matrix for which det(1-A) = 0, so that the second alternative holds.
> with(linalg):
> A:=array([[0,-2,1],[-2,-5,-2],[-3,-4,8]]);
> B:=evalm(diag(1,1,1)-A);
> det(B);
Because det(1-A) = 0, this A is in the second alternative. To determine for which f's the equation (1-A)u = f has a solution u, we find the null space of 1-A*.
> nullspace(evalm(diag(1,1,1)-transpose(A)));
Ask what vectors are orthogonal to this one. There is a plane of them: a [1,5,0] + b [1,0,5]. We solve (1-A)u = [1,5,0] and (1-A)u = [1,0,5].
> solve({x+2*y-z=1,2*x+6*y+2*z=5,3*x+4*y-7*z=0},{x,y,z});
> solve({x+2*y-z=1,2*x+6*y+2*z=0,3*x+4*y-7*z=5},{x,y,z});
Finally, we predict that the equation (1-A)u = [1,5,1] has no solution.
> solve({x+2*y-z=1,2*x+6*y+2*z=5,3*x+4*y-7*z=1},{x,y,z});
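These predictions can be confirmed outside MAPLE as well. A small Python sketch (standard library only; helper names are ours) checks that z = {-5,1,1} lies in the null space of 1-A* and that the solvability test <f, z> = 0 sorts the three right-hand sides correctly:

```python
def matvec(M, v):
    return [sum(m * c for m, c in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

B = [[1, 2, -1], [2, 6, 2], [3, 4, -7]]     # B = 1 - A
Bt = [list(col) for col in zip(*B)]         # 1 - A* is the transpose
z = [-5, 1, 1]                              # candidate null vector of 1 - A*

print(matvec(Bt, z))                        # [0, 0, 0]: z is in the null space
for f in ([1, 5, 0], [1, 0, 5], [1, 5, 1]):
    # (1-A)u = f is solvable exactly when <f, z> = 0
    print(f, dot(f, z))
```

The first two right-hand sides give <f, z> = 0 (solvable); the third gives 1, so (1-A)u = [1,5,1] has no solution, matching the MAPLE output.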
Section 22: Closed Operators

We change subjects now. The subject is continuity. The first notion of continuity encountered in this course was approached from the perspective of sequences and was restricted to linear functions. Bounded linear functions and unbounded linear functions are the words which were used to characterize continuous and non-continuous linear functions. Later, a notion was investigated that is more restrictive than continuity: that of a compact operator. Thus, we had these three: compact operator, bounded operator, and unbounded operator. Now we introduce another that lies among these. It is that of a closed operator.

Definition A function L is closed provided its graph is closed -- in the sense that if {xp} has limit y and {L(xp)} has limit z, then y is in the domain of L and L(y) = z.

Examples (1) Let f be the function from R to R defined by f(x) = 1/x if x is not zero and f(0) = 0. This function is closed. (2) Let f be the function from R to R defined by f(x) = x/|x| if x is not zero and f(0) = 0. This function is not closed.

We have characterized maps which satisfy the paradigm and which are compact, bounded, or unbounded depending on the character of their eigenvalues. Here is the situation for closed operators:

Theorem 47 Suppose that A = Σ_p λp <·, φp> φp and that the domain is as large as possible in the sense that

	D(A) = {x: Σ_p |λp|² |<x, φp>|² < ∞}.

It follows that the operator A is closed.

Suggestion of proof: Suppose that lim_n xn = y and lim_n A(xn) = z. We want to show that Σ_p |λp|² |<y, φp>|² < ∞ and that Σ_p λp <y, φp> φp = z. For each N,

	0 = lim_n <Σ_p λp <xn, φp> φp, φN> - <z, φN> = lim_n λN <xn, φN> - <z, φN> = λN <y, φN> - <z, φN>.

Hence, Σ_p |λp|² |<y, φp>|² = Σ_p |<z, φp>|² ≤ |z|² < ∞, from which we conclude that y is in the domain of A and z = A(y).
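Example (2) can be probed numerically: the sequence xp = 1/p converges to 0 and f(xp) = 1 converges to 1, yet f(0) = 0, so the limit point (0, 1) of the graph is not on the graph. A minimal Python sketch (names are ours):

```python
def f(x):
    """Example (2): f(x) = x/|x| for x != 0, and f(0) = 0."""
    return x / abs(x) if x != 0 else 0.0

# x_p = 1/p -> 0 while f(x_p) = 1 for every p, so the graph limit point
# (0, 1) is not of the form (0, f(0)) = (0, 0): the graph is not closed.
xs = [1 / p for p in range(1, 6)]
print([f(x) for x in xs], f(0.0))      # [1.0, 1.0, 1.0, 1.0, 1.0] 0.0
```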
Remark (1) To see the generality of a closed operator, note that if A is a symmetric, densely defined operator in a Hilbert space, then there is a closed, symmetric operator B such that A is contained in B. (2) Because of the previous remark, we might as well always take a symmetric, densely defined operator to be closed. While this will not imply that the operator is continuous, there is this idea: it is always possible to redefine the inner-product on D(A) so that the domain of A becomes a Hilbert space and A becomes a bounded operator on the domain of A. In fact, for x and y in the domain of A, define the new inner-product by

	<x, y>n = <x, y> + <Ax, Ay>.
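A finite-dimensional slice of this remark is easy to compute. For the diagonal operator A(φp) = p φp restricted to the first N coordinates, |Ax| can greatly exceed |x|, yet |Ax| ≤ |x|n always, since |x|n² = |x|² + |Ax|². A Python sketch (the diagonal model and the names are our assumptions):

```python
import math

# Diagonal operator A(phi_p) = p * phi_p on the first N coordinates.
N = 50
x = [1 / p for p in range(1, N + 1)]               # coordinates <x, phi_p>
Ax = [p * c for p, c in zip(range(1, N + 1), x)]   # every coordinate becomes 1

def norm(v):
    return math.sqrt(sum(c * c for c in v))

norm_new = math.sqrt(norm(x) ** 2 + norm(Ax) ** 2)  # |x|_n from <x,x>_n

print(norm(Ax) > norm(x))       # True: A stretches x well beyond |x|
print(norm(Ax) <= norm_new)     # True: |Ax| <= |x|_n holds trivially
```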
Section 24: An Application: A Problem in Control

We will illustrate some of the ideas that we have encountered by considering a control problem. To move gently to this problem, we discuss four examples. The first comes from second year calculus. We work it with second year calculus techniques and then note the disadvantages of those techniques. The problem is re-considered and worked with more powerful methods which lead to a method to solve the control problem.

Example 1 Let M be the line of intersection of the planes P1: x+y+z = 0 and P2: x-y+z = 0. Also, consider the point {1,2,3}, which is not in either plane. The problem is to find the closest point in M to {1,2,3}. Here is a solution using the techniques of calculus. P1 is the plane consisting of points {x,y,z} such that <{x,y,z}, {1,1,1}> = 0. P2 is the plane consisting of points {x,y,z} such that <{x,y,z}, {1,-1,1}> = 0. The line M has the same direction as the vector {1,1,1}×{1,-1,1} = {2,0,-2} and contains {0,0,0}. Hence, an equation for this line is M(t) = {2t,0,-2t}. We want to choose t in order to minimize

	D(t) = |{2t,0,-2t} - {1,2,3}|² = (2t-1)² + 2² + (-2t-3)².

Well, D′(t) = 2(2t-1)·2 + 2(-2t-3)(-2) = 16t + 8, and D′(t) = 0 provided that t = -1/2. Hence, the closest point in M to {1,2,3} is {-1,0,1}.

Remark This method only works in 3-D because the cross product is unique to R³.

Example 2 Let L1 and L2 be continuous linear functions from E to C. Let M be the intersection of the null spaces of L1 and L2. The problem is: given u in E, find the closest point in M to u. By the Riesz theorem, there are y1 and y2 in E such that L1 = <·, y1> and L2 = <·, y2>. It follows that M = {x: <x, y1> = 0 = <x, y2>}. Then M = {x: <x, αy1 + βy2> = 0 for all α and β}. Also, M⊥ = span{y1, y2}. Now, suppose that u is a point of E and Pm is the orthogonal projection onto M, so that Pm(u) is the closest point in M to u. (Note u - Pm(u) ≠ <u, y1> y1 + <u, y2> y2 unless y1 and y2 are orthonormal.)
We have u - Pm(u) ∈ M⊥ and Pm(u) ∈ M. From the first of these, u - Pm(u) = αy1 + βy2 for some α and β, and from the second, <Pm(u), y1> = 0 = <Pm(u), y2>. Hence,

	<u, y1> = α <y1, y1> + β <y2, y1>  and  <u, y2> = α <y1, y2> + β <y2, y2>,

or

	( <y1,y1>  <y2,y1> ) ( α )   ( <u,y1> )
	( <y1,y2>  <y2,y2> ) ( β ) = ( <u,y2> ).

We can solve this system uniquely provided |y1|² |y2|² - |<y1,y2>|² ≠ 0. And this inequality holds if y1 and y2 are linearly independent.
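Example 1 fits directly into this framework: take y1 = {1,1,1}, y2 = {1,-1,1}, and u = {1,2,3}. Solving the 2×2 Gram system and forming Pm(u) = u - αy1 - βy2 should reproduce the closest point {-1,0,1} found by calculus. A Python sketch (standard library only; helper names are ours):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

y1, y2, u = [1, 1, 1], [1, -1, 1], [1, 2, 3]

# Gram system: [a b; b c] (alpha, beta)^T = (<u,y1>, <u,y2>)^T
a, b, c = dot(y1, y1), dot(y1, y2), dot(y2, y2)
r1, r2 = dot(u, y1), dot(u, y2)
det = a * c - b * b                  # nonzero since y1, y2 are independent
alpha = (c * r1 - b * r2) / det
beta = (a * r2 - b * r1) / det

Pm_u = [ui - alpha * v1 - beta * v2 for ui, v1, v2 in zip(u, y1, y2)]
print(Pm_u)                          # [-1.0, 0.0, 1.0], the closest point in M
```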
Assignment (1) Re-work Example 1 in this context. (2) Find the point in the intersection of the subspaces w+x+y+z = 0 and w-x+y-z = 0 that is closest to {0,1,2,3}.

Example 3 For many applications, we make one further variation in the problem. Let L1 and L2 be continuous linear functions from E to C. Let M = {x: L1(x) = A1 and L2(x) = A2}. Let L1 = <·, y1> and L2 = <·, y2>. Now, M is a closed, convex set. If u ∈ E and we seek the closest point in M to u, then Pm(u) again provides this point. While Pm is a projection, it is not necessarily linear. We want a formula for Pm(u). To get this formula, see that u - Pm(u) is perpendicular to N(L1)∩N(L2). We know that for all y in M, <u - Pm(u), y - Pm(u)> ≤ 0. In fact, if n ∈ N(L1)∩N(L2) and y = n + Pm(u), then y ∈ M and <u - Pm(u), n> ≤ 0. Also, if n ∈ N(L1)∩N(L2) then -n is also, so that <u - Pm(u), n> ≥ 0. Thus, u - Pm(u) is perpendicular to the intersection of these two null spaces. As in Example 2, this implies that u - Pm(u) = αy1 + βy2. We also have the equations A1 = L1(Pm(u)) = <Pm(u), y1> and A2 = L2(Pm(u)) = <Pm(u), y2>. Suppose we know u and seek Pm(u). From the above, we see that u - αy1 - βy2 = Pm(u). Hence,

	<u,y1> - α <y1,y1> - β <y2,y1> = A1,
	<u,y2> - α <y1,y2> - β <y2,y2> = A2,

or

	( <y1,y1>  <y2,y1> ) ( α )   ( <u,y1> - A1 )
	( <y1,y2>  <y2,y2> ) ( β ) = ( <u,y2> - A2 ).

Example 4 Suppose that Q0 ∈ R², A is a 2×2 matrix, b ∈ R², and Q1 ∈ R². We seek v with minimum norm such that if Z′ = AZ + bv and Z(0) = Q0, then also Z(1) = Q1. We know

	Z(t) = exp(tA)Q0 + ∫_0^t exp((t-s)A) b v(s) ds.

Since Q1 = Z(1), we have
	Q1 = exp(A)Q0 + ∫_0^1 exp((1-s)A) b v(s) ds,

or

	Q1 - exp(A)Q0 = ∫_0^1 exp((1-s)A) b v(s) ds.
Let {A1, A2} = Q1 - exp(A)Q0, and let L1 and L2 be defined so that

	{L1(v), L2(v)} = ∫_0^1 exp((1-s)A) b v(s) ds.

To ask for the point v with minimum norm which satisfies A1 = L1(v) and A2 = L2(v), we choose u in the previous problem to be 0. Let v be Pm(0), where Pm is the nonlinear projection onto {x: L1(x) = A1 and L2(x) = A2}. From the above example,

	v(s) = 0 - α <exp((1-s)A)b, {1,0}> - β <exp((1-s)A)b, {0,1}>,

where α and β satisfy an equation such as the one just prior to this example.

Assignment Find v: [0,π] → R such that y″ + y = v, y(0) = 0, y′(0) = 1, y(π) = 1, y′(π) = 0.

Example This example shows that not every point is accessible. Suppose that P1 and P2 are orthogonal projections with P1 P2 = 0 and P1 + P2 = 1, that A = αP1 + βP2, and that P1(b) = 0. If x is in E and u is in L²[0,1] then

	|P1(x) - ∫_0^1 exp((1-s)A) b u(s) ds| ≥ |P1(x)|.

Here's why: since exp((1-s)A)b = exp((1-s)α)P1(b) + exp((1-s)β)P2(b) and P1(b) = 0,

	|P1(x) - (∫_0^1 exp((1-s)α)P1(b)u(s) ds + ∫_0^1 exp((1-s)β)P2(b)u(s) ds)|²
	= |P1(x) - ∫_0^1 exp((1-s)β)u(s) ds P2(b)|²
	= |P1(x)|² + |∫_0^1 exp((1-s)β)u(s) ds|² |P2(b)|²
	≥ |P1(x)|².
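Example 4 can be carried out end to end for a concrete system. Assuming the hypothetical data A = diag(1, 2), b = {1,1}, Q0 = {0,0}, Q1 = {1,1} (our choice, not from the notes), exp((1-s)A)b has components y1(s) = e^(1-s) and y2(s) = e^(2(1-s)), the Gram integrals have closed forms, and the resulting control can be checked by quadrature. A Python sketch:

```python
import math

# Hypothetical data: A = diag(1, 2), b = (1, 1), Q0 = (0, 0), Q1 = (1, 1).
# Then exp((1-s)A)b = (e^(1-s), e^(2(1-s))), so y1(s) = e^(1-s),
# y2(s) = e^(2(1-s)), and {A1, A2} = Q1 - exp(A)Q0 = (1, 1).
def y1(s): return math.exp(1 - s)
def y2(s): return math.exp(2 * (1 - s))
A1, A2 = 1.0, 1.0

# Gram entries in closed form:
# <y1,y1> = (e^2 - 1)/2, <y1,y2> = (e^3 - 1)/3, <y2,y2> = (e^4 - 1)/4.
a = (math.e ** 2 - 1) / 2
b = (math.e ** 3 - 1) / 3
c = (math.e ** 4 - 1) / 4

# For u = 0 the system reads [a b; b c](alpha, beta)^T = (-A1, -A2)^T.
det = a * c - b * b
alpha = (c * (-A1) - b * (-A2)) / det
beta = (a * (-A2) - b * (-A1)) / det

def v(s):
    """The minimum-norm control v = Pm(0) = -alpha*y1 - beta*y2."""
    return -alpha * y1(s) - beta * y2(s)

def integrate(f, n=20000):
    """Composite trapezoid rule on [0, 1]."""
    h = 1.0 / n
    return h * (f(0) / 2 + sum(f(k * h) for k in range(1, n)) + f(1) / 2)

# The control should satisfy both endpoint constraints L1(v) = A1, L2(v) = A2.
print(integrate(lambda s: y1(s) * v(s)), integrate(lambda s: y2(s) * v(s)))
```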
Definition If A is a linear transformation, we denote by T the trajectory T = {exp(sA)b: 0 ≤ s ≤ 1} and by Ω the accessible points

	Ω = {z: z = ∫_0^1 exp((1-s)A) b u(s) ds, u ∈ L²[0,1]}.

Conjecture: Ω⊥ = T⊥.
Index: Hilbert Space Notes

Adjoint 45-47
Banach space 15
Bessel's inequality 28
BLT(E1,E2) 53
Cayley-Hamilton 2
Cauchy-Schwarz 7
closed, operator 72-73
closed, set 18-20, 35, 47-50
closest point 18-24
closure 20, 48
compact, operator 49, 50-71
compact, set 47-48
complete space 15
continuous 35
control 77
convergence of operators 53
convergence, pointwise 14
convergence, strong 14
convergence, uniform 14
convergence, weak 14
convex set 18
differential equations 5, 8, 40-42, 58
eigenvalue-eigenvector 8-12, 27, 32, 43-45, 53-67
Fourier 15, 31, 65
Fredholm alternative 69-71
generalized eigenvector 66
generalized inverse 33
Gerschgorin circle theorem 11-12
Gram-Schmidt 28-30
Green's functions 41, 65
Green, William F. 81
Hilbert space 15
inner product space 7
interior 20
Jordan form 1
Laguerre 29
Legendre 30
maximal orthonormal family 27
matrix norms 12
nilpotent 1, 5, 66
non-expansive 1, 23
normal operator 61-62, 67
operator topology 53
parallelogram 18
Parseval's inequality 27
polarization 23
polynomial 29-30
projection, definition 1
projection, closest point 18-22, 32-34, 77-78
projection, right triangle 24
projection, unbounded 56
orthogonal 23
Riesz Representation Theorem 36, 37
resolution of identity 59
self-adjoint 7, 23, 45
separable
shift operators 65
spectrum
totally bounded 47
Tschebysheff 29
unbounded operator 65, 72
A Compendium of Problems