Discrete and Continuous Boundary Problems
MATHEMATICS IN SCIENCE AND ENGINEERING
A Series of Monographs and Textbooks
Edited by RICHARD BELLMAN, The RAND Corporation, Santa Monica, California
Volume 1. TRACY Y. THOMAS. Concepts from Tensor Analysis and Differential Geometry. 1961
Volume 2. TRACY Y. THOMAS. Plastic Flow and Fracture in Solids. 1961
Volume 3. RUTHERFORD ARIS. The Optimal Design of Chemical Reactors: A Study in Dynamic Programming. 1961
Volume 4. JOSEPH LA SALLE and SOLOMON LEFSCHETZ. Stability by Liapunov's Direct Method with Applications. 1961
Volume 5. GEORGE LEITMANN (ed.). Optimization Techniques: with Applications to Aerospace Systems. 1962
Volume 6. RICHARD BELLMAN and KENNETH L. COOKE. Differential-Difference Equations. 1963
Volume 7. FRANK A. HAIGHT. Mathematical Theories of Traffic Flow. 1963
Volume 8. F. V. ATKINSON. Discrete and Continuous Boundary Problems. 1964
Volume 9. A. JEFFREY and T. TANIUTI. Non-Linear Wave Propagation: with Applications to Physics and Magnetohydrodynamics. 1964
Volume 10. JULIUS TOU. Optimum Design of Digital Control Systems. 1963
Volume 11. HARLEY FLANDERS. Differential Forms: with Applications to the Physical Sciences. 1963
Volume 12. SANFORD M. ROBERTS. Dynamic Programming in Chemical Engineering and Process Control. 1964
In preparation
D. N. CHORAFAS. Systems and Simulation
Discrete and Continuous Boundary Problems

F. V. ATKINSON
DEPARTMENT OF MATHEMATICS
UNIVERSITY OF TORONTO
TORONTO, CANADA

1964
ACADEMIC PRESS
New York and London
COPYRIGHT © 1964, BY ACADEMIC PRESS INC.
ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC., 111 Fifth Avenue, New York 3, New York
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD., Berkeley Square House, London, W.1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 63-16717
PRINTED IN THE UNITED STATES OF AMERICA
Preface
A major task of mathematics today is to harmonize the continuous and the discrete, to include them in one comprehensive mathematics, and to eliminate obscurity from both. [E. T. Bell, "Men of Mathematics," pp. 13-14. Dover, New York, 1937; Simon and Schuster, New York.]
To compare the discrete with the continuous, to search for analogies between them, and ultimately to effect their unification, are patterns of mathematical development that did not begin with Zeno, and certainly did not end with Leibnitz and Newton, nor even with Riemann and Stieltjes. Such a pattern of investigation is especially appropriate to the theory of boundary problems, in which the discrete and the continuous pervade both the physical origins and the mathematical methods. It is the aim of this book to present in this light the theory of boundary problems in one dimension, that is to say, for "ordinary" differential equations and their analogs and extensions. It would be highly desirable to develop a corresponding theory for partial differential equations and their analogs; here, however, the discrete theory seems ill-developed, and unification remote indeed.

The essential unity of our subject has not always been apparent; the wealth of its applications and interpretations is perhaps responsible for this. It has been natural to expound the topic of boundary problems for ordinary differential equations in the situation where the coefficients have any requisite degree of smoothness; this case combines practical value with mathematical convenience. At the other extreme, boundary problems for difference or recurrence relations have tended to be viewed primarily as numerical aids for problems concerning differential equations. This is not to say that the mathematical theory of recurrence relations has gone undeveloped; rather, the higher branches of the theory of boundary problems for recurrence relations tend to be found more or less effectively disguised in the contexts of linear operators, of the theory of moments, of continued fractions, or of orthogonal polynomials. Here again the dual role of the classical polynomials, as solutions both of differential and of recurrence relations, scarcely lessens the confusion.
Unified theories of differential and of difference equations are of too recent emergence to gain full recognition.
We shall pursue our task from three directions. First, we shall present the theory of certain recurrence relations in the spirit of the theory of boundary problems for differential equations. Second, we shall present the theory of boundary problems for certain ordinary differential equations, emphasizing cases in which the coefficients may be discontinuous, or may have singularities of delta-function type. Finally, we give some account of theories which unify the topics of differential and difference equations, relying mainly on the method of replacement by integral equations.

The introductory Chapter 0 provides a survey of the field to be investigated, and introduces the basic concept for classifying our boundary problems, whether discrete or continuous. This is the invariance of a quadratic form under linear or fractional-linear transformations, which may be continuously or discretely applied; the notion generalizes that of the constancy of the Wronskian in the case of Sturm-Liouville theory. Chapters 1 and 2 take up this notion in the simplest case, the invariance of the modulus of a complex number, to yield what is perhaps the most elementary of boundary problems. Here one may view in microcosm all aspects of the theory; one may also regard the material of these chapters as a primitive case of the as yet relatively undeveloped topic of boundary problems involving fractional-linear matrix factors.

After Chapter 3, devoted to general principles for recurrence relations, there are four chapters devoted to the spectral theory for special types of recurrence relation. Chapters 4 and 5 might have been entitled "discrete Sturm-Liouville theory," but many of the results are more familiar in the context of orthogonal polynomials. Chapter 6 presents some recent extensions in the matrix or multi-variate direction. Chapter 7, though its title also relates to orthogonal polynomials, continues to emphasize the recurrence-relation approach.
With Chapters 8 and 9 we turn to the second aspect of our task, the presentation of the theory of boundary problems for differential equations, without making unnecessary smoothness restrictions on the coefficients; as various authors have noted, it is in fact possible to arrange for difference equations to be included as a special case of differential equations. Chapter 8 deals in this manner with the main case of classical Sturm-Liouville theory. To conduct a similar investigation for higher-order equations seemed unnecessary here, and, accordingly, Chapter 9 has been confined to an account of the first-order matrix system; this is, of course, more general than the nth-order equation and does not seem to have been treated very often in book form.

Oscillatory properties for matrix systems have received renewed attention lately, and we have thus devoted Chapter 10 separately to them. Since a number of variational treatments of this topic are available, it
seemed appropriate to expound here the matrix approach. For simplicity, the exposition has been confined to the continuous case.

Chapters 11 and 12 are devoted to the unified theory, which includes both differential and recurrence relations. Chapter 11 is devoted to a sketch of the general theory, which has been the object of much recent research, while Chapter 12 deals with the extension of special Sturm-Liouville properties. Here it must be emphasized that the theory so generalized does not cover the case of fractional-linear relations considered in Chapters 1 and 2; on the other hand, the theory points the way to numerous generalizations of other investigations for differential equations, apart from boundary problems.

The level of mathematical argument is fairly elementary; a knowledge of Lebesgue integration is only rarely needed, while the Stieltjes integral and some of its less accessible properties have been treated in an Appendix. In certain chapters a familiarity with matrix manipulations is presumed; monotonic properties of eigenvalues have been developed in an Appendix, once more because certain of them seemed unavailable in the majority of texts. Complex variable theory is used mainly in respect of the elementary properties of the bilinear mapping of the plane. While the book has been written in what seemed the most logical order, and cross-references to analogs elsewhere are often made, it may also be read piecewise; however, Chapters 1-2, 4-6, and 11-12 form connected sequences. Problems have been given for each chapter. In most cases, these range from elementary exercises through straightforward generalizations to research suggestions.

Little reference has been made to the use of functional analysis in connection with the boundary problems discussed here. In part, this is justified by the special character of our problems, and by the aim of obtaining the results in the most expeditious and simple manner.
Apart from this, it may be questioned whether the suggestive value of the theory of boundary problems for functional analysis has been exhausted, having in mind here the theory of the symmetrizable operator and that of the fractional-linear recurrence relation or "J-contractive" matrix function. An exclusive reliance on the theory of the self-adjoint linear operator would at present have a limiting effect on the theory of our problems.

I am indebted to a number of colleagues for their critical comments. For comments on the material in lecture and manuscript form I must thank Dr. C. F. Schubert and Mr. C. E. Billigheimer. For their careful reading of the proofs, in whole or in part, my especial gratitude is due to Professor J. R. Vanstone and to Professor B. Abrahamson.
Finally, it is my particular pleasure to acknowledge the cooperation and patience of Academic Press, Inc., and to express my appreciation of the consideration given to this work by them and by Richard Bellman, the Editor of this series of monographs.

F. V. ATKINSON
Madison, Wisconsin
October, 1963
Contents

PREFACE  v

INTRODUCTION
0.1. Difference and Differential Equations  1
0.2. The Invariance Property  4
0.3. The Scalar Case  6
0.4. All-Pass Transfer Functions  8
0.5. Inverse Problems  11
0.6. The General Orthogonal Case  13
0.7. The Three-Term Recurrence Formula  15
0.8. The 2-by-2 Symplectic Case  21

1 - Boundary Problems for Rational Functions
1.1. Finite Fourier Series  25
1.2. The Boundary Problem  27
1.3. Oscillation Properties  29
1.4. Eigenfunctions and Orthogonality  31
1.5. The Spectral Function  35
1.6. The Characteristic Function  37
1.7. The First Inverse Problem  39
1.8. The Second Inverse Problem  42
1.9. Moment Characterization of the Spectral Function  46
1.10. Solution of a Moment Problem  51

2 - The Infinite Discrete Case
2.1. A Limiting Procedure  55
2.2. Convergence of the Fundamental Solution  57
2.3. Convergence of the Spectral Function  60
2.4. Convergence of the Characteristic Function  62
2.5. Eigenvalues and Orthogonality  63
2.6. Orthogonality and Expansion Theorem  67
2.7. A Continuous Spectrum  70
2.8. Moment and Interpolation Problem  71
2.9. A Mixed Boundary Problem  74
2.10. A Mixed Expansion Problem  76
2.11. Further Boundary Problems  81

3 - Discrete Linear Problems
3.1. Problems Linear in the Parameter  83
3.2. Reduction to Canonical Form  85
3.3. The Real Axis Case  87
3.4. The Unit Circle Case  89
3.5. The Real 2-by-2 Case  90
3.6. The 2-by-2 Unit Circle Case  92
3.7. The Boundary Problem on the Real Axis  94
3.8. The Boundary Problem on the Unit Circle  96

4 - Finite Orthogonal Polynomials
4.1. The Recurrence Relation  97
4.2. Lagrange-Type Identities  98
4.3. Oscillatory Properties  100
4.4. Orthogonality  104
4.5. Spectral and Characteristic Functions  106
4.6. The First Inverse Spectral Problem  107
4.7. The Second Inverse Spectral Problem  111
4.8. Spectral Functions in General  114
4.9. Some Continuous Spectral Functions  117

5 - Orthogonal Polynomials: The Infinite Case
5.1. Limiting Boundary Problems  119
5.2. Spectral Functions  120
5.3. Orthogonality and Expansion Theorem  123
5.4. Nesting Circle Analysis  125
5.5. Limiting Spectral Functions  129
5.6. Solutions of Summable Square  130
5.7. Eigenvalues in the Limit-Circle Case  132
5.8. Limit-Circle, Limit-Point Tests  134
5.9. Moment Problem  136
5.10. The Dual Expansion Theorem  138

6 - Matrix Methods for Polynomials
6.1. Orthogonal Polynomials as Jacobi Determinants  142
6.2. Expansion Theorems. Periodic Boundary Conditions  144
6.3. Another Method for Separation Theorems  145
6.4. The Green's Function  148
6.5. A Reactance Theorem  150
6.6. Polynomials with Matrix Coefficients  150
6.7. Oscillatory Properties  152
6.8. Orthogonality  157
6.9. Polynomials in Several Variables  160
6.10. The Multi-Parameter Oscillation Theorem  162
6.11. Multi-Dimensional Orthogonality  169

7 - Polynomials Orthogonal on the Unit Circle
7.1. The Recurrence Relation  170
7.2. The Boundary Problem  172
7.3. Orthogonality  173
7.4. The Recurrence Formulas Deduced from the Orthogonality  178
7.5. Uniqueness of the Spectral Function  182
7.6. The Characteristic Function  184
7.7. A Further Orthogonality Result  188
7.8. Asymptotic Behavior  190
7.9. Polynomials Orthogonal on a Real Segment  196
7.10. Continuous and Discrete Analogs  199

8 - Sturm-Liouville Theory
8.1. The Differential Equation  202
8.2. Existence, Uniqueness, and Bounds for Solutions  205
8.3. The Boundary Problem  207
8.4. Oscillatory Properties  209
8.5. An Interpolatory Property  217
8.6. The Eigenfunction Expansion  222
8.7. Second-Order Equation with Discontinuities  226
8.8. The Green's Function  229
8.9. Convergence of the Eigenfunction Expansion  232
8.10. Spectral Functions  238
8.11. Explicit Expansion Theorem  240
8.12. Expansions over a Half-Axis  243
8.13. Nesting Circles  247

9 - The General First-Order Differential System
9.1. Formalities  252
9.2. The Boundary Problem  255
9.3. Eigenfunctions and Orthogonality  258
9.4. The Inhomogeneous Problem  262
9.5. The Characteristic Function  268
9.6. The Eigenfunction Expansion  273
9.7. Convergence of the Eigenfunction Expansion  280
9.8. Nesting Circles  284
9.9. Expansion of the Basic Interval  289
9.10. Limit-Circle Theory  292
9.11. Solutions of Integrable Square  293
9.12. The Limiting Process a → -∞, b → +∞  298

10 - Matrix Oscillation Theory
10.1. Introduction  300
10.2. The Matrix Sturm-Liouville Equation  303
10.3. Separation Theorem for Conjugate Points  308
10.4. Estimates of Oscillation  312
10.5. Boundary Problems with a Parameter  317
10.6. A Fourth-Order Scalar Equation  323
10.7. The First-Order Equation  328
10.8. Conjugate Point Problems  332
10.9. First-Order Equation with Parameter  336

11 - From Differential to Integral Equations
11.1. The Sturm-Liouville Case  339
11.2. Uniqueness and Existence of Solutions  341
11.3. Wronskian Identities  348
11.4. Variation of Parameters  350
11.5. Analytic Dependence on a Parameter  355
11.6. Eigenvalues and Orthogonality  356
11.7. Remarks on the Expansion Theorem  358
11.8. The Generalized First-Order Matrix Differential Equation  359
11.9. A Special Case  363
11.10. The Boundary Problem  364

12 - Asymptotic Theory of Some Integral Equations
12.1. Asymptotically Trigonometric Behavior  366
12.2. The S-Function  371
12.3. A Non-Self-Adjoint Problem  375
12.4. The Sturm-Liouville Problem  381
12.5. Asymptotic Properties for the Generalization of y'' + [λa + g(x)]y = 0  384
12.6. Solutions of Integrable Square  391
12.7. Analytic Aspects of Asymptotic Theory  393
12.8. Approximations over a Finite Interval  398
12.9. Approximation to the Eigenfunctions  408
12.10. Completeness of the Eigenfunctions  411

Appendix I. Some Compactness Principles for Stieltjes Integrals
I.1. Functions of Bounded Variation  416
I.2. The Riemann-Stieltjes Integral  418
I.3. A Convergence Theorem  423
I.4. The Helly-Bray Theorem  425
I.5. Infinite Interval and Bounded Integrand  426
I.6. Infinite Interval with Polynomial Integrand  428
I.7. A Periodic Case  430
I.8. The Matrix Extension  431
I.9. The Multi-Dimensional Case  434

Appendix II. Functions of Negative Imaginary Type
II.1. Introduction  436
II.2. The Rational Case  437
II.3. Separation Property in the Meromorphic Case  439

Appendix III. Orthogonality of Vectors
III.1. The Finite-Dimensional Case  441
III.2. The Infinite-Dimensional Case  442

Appendix IV. Some Stability Results for Linear Systems
IV.1. A Discrete Case  447
IV.2. The Case of a Differential Equation  449
IV.3. A Second-Order Differential Equation  450
IV.4. The Mixed or Continuous-Discrete Case  452
IV.5. The Extended Gronwall Lemma  455

Appendix V. Eigenvalues of Varying Matrices
V.1. Variational Expressions for Eigenvalues  457
V.2. Continuity and Monotonicity of Eigenvalues  459
V.3. A Further Monotonicity Criterion  461
V.4. Varying Unitary Matrices  464
V.5. Continuation of the Eigenvalues  465
V.6. Monotonicity of the Unit Circle  468

Appendix VI. Perturbation of Bases in Hilbert Space
VI.1. The Basic Result  471
VI.2. Continuous Variation of a Basis  473
VI.3. Another Result  475

NOTATION AND TERMINOLOGY  476
LIST OF BOOKS AND MONOGRAPHS  478
NOTES (Section 0.1, Section 0.2, Section 0.4, Section 0.7, Section 0.8, Section 1.5, Section 1.6, Sections 1.7-8, Section 1.10, Section 2.2, Section 2.3, Section 2.5, Section 2.7, Section 2.10, Section 3.1, Section 3.2, Section 3.3, Section 3.5, Chapter 4, Section 4.2, Section 4.3, Section 4.4, Section 4.5, Section 4.7, Chapter 4, Problems, Section 5.1, Section 5.2, Section 5.4, Section 5.7, Section 5.8, Section 5.9, Section 5.10, Section 6.1, Section 6.4, Section 6.5, Section 6.6, Sections 6.9-10, Section 7.1, Section 7.5, Section 7.6, Section 7.7, Sections 7.8-9, Section 7.10, Section 8.1, Section 8.3, Section 8.4, Section 8.5, Section 8.6, Section 8.7, Section 8.10, Section 8.13, Section 9.1, Section 9.4, Section 9.11, Section 10.1, Section 10.2, Section 10.3, Section 10.4, Section 10.6, Sections 10.8-9, Section 11.1, Section 11.2, Section 11.8, Chapter 11, Problems, Section 12.1, Section 12.3, Section 12.4, Section 12.5, Section 12.6, Section 12.7, Section 12.10, Appendix I, Appendix II, Appendix IV, Appendix V, and Appendix VI)  481
PROBLEMS  536
INDEX  565
Introduction
0.1. Difference and Differential Equations
Our problems in the sequel belong to the field of linear analysis, in that the unknown functions or their derivatives appear only to the first power in the governing equations. They are also linear in a second, or perhaps sequential, sense, in that the "independent variable" ranges over some part of the real line, perhaps over a discrete series of values. When a physical interpretation is available, it will be in terms of entities ranged along a line, each of which interacts only with those in its immediate vicinity.

Alike on the physical and mathematical sides, it is convenient to devise separate treatments for the purely discrete case and for the purely continuous case. In the former case there is a discrete sequence of entities, each of which either gives an input to its successor, or perhaps interacts with both its neighbors; this is immediately describable by means of a difference or recurrence relation, often soluble by algebraic processes. In the case when the physical entities are not separate, but form some continuous distribution, limiting processes of a more or less satisfactory character lead us to differential equations. In both cases, the wealth of the mathematical theory amply justifies a concentration on these two cases separately.

When the topics of difference and differential equations are developed separately, many analogies are observed between the results. These analogies form our main concern here, in so far as they apply to boundary problems. It should be emphasized that these analogies result from specializing a single situation in two directions. While such specialization may well be necessary in order to accumulate useful results, neither the difference equation nor the differential equation provides a fully adequate framework for the topic of boundary problems.
In the case of differential equations, this inadequacy is to some extent apparent in the need to consider "weak solutions," or to use the machinery of "distributions."
Our treatment in what follows will be a compromise between the total separation of difference and differential equations, on the one hand, and their unification within a more general but necessarily more abstruse theory, on the other. The simplifying features of difference or recurrence relations are not to be wasted; on the other hand, we shall give occasional attention to what may be termed the mixed continuous-discrete case, which combines features of both difference and differential equations.

Though we do not essay here a general treatment of such mixed continuous-discrete cases, and have referred to such a treatment as necessarily abstruse, the underlying physical idea is quite simple. Returning to the notion of interacting entities ranged continuously or discretely, or both, along a line, the physical or other hypotheses concerning laws of motion will tell us that the change in some quantity from one point of the system to another depends on the state of the system between the two points. This situation is naturally expressed by an integral equation, involving a Stieltjes integral in the discrete or the mixed case. The formation of a differential equation by a limiting process is a step which may be dispensed with, so far as the general theory is concerned.

The close relation between discrete and continuous cases, and the possibility of subsuming the two within a larger theory, will be evident from the physical systems which realize certain of our boundary problems. An example of great suggestive value is given by the vibrating string, stretched between two fixed points, loaded in some manner, and executing small harmonic vibrations. If the string itself be weightless and bears a discrete sequence of particles, perhaps a finite number, then the amplitudes of oscillation of three consecutive particles are connected by a recurrence relation.
At the other extreme, we suppose that the string itself has a continuously varying density, and bears no particles, deriving in this case a second-order differential equation. It is plausible that this heavy string might be approximated, in dynamic behavior, by a finite number of suitably distributed particles; in effect we here approximate to a differential equation by a difference equation. For the general case, it is physically intelligible that the string should both have a density itself and bear particles in addition; this calls for an integral or integro-differential equation, including differential and difference equations as special cases.

Similar remarks apply to an electrical interpretation. If as the continuous problem we take the harmonic propagation of a disturbance in a transmission line formed of two parallel conductors, the discrete analog will be that of propagation through a series arrangement of four-terminal networks. As regards approximation between the two cases,
we might wish to simulate a transmission line by a dummy line formed of a finite number of circuits, or, by making the number of such circuits larger, we might derive the transmission line equations. For the general case, we clearly have a problem in the same area if we interpose networks at points along a long line. A further example of propagation by continuous or discrete steps is given by propagation through a stratified medium, where the properties of the medium may vary continuously, or abruptly, or both.

For mathematical purposes, the difference equation often appears as an aid to the study of a differential equation. An outstanding classical example is provided by the initial-value problem. To have a definite case in front of us, let us take the second-order differential equation

    y'' + λp(x)y = 0,    0 ≤ x ≤ 1,    (0.1.1)

where p(x) is, say, continuous; the initial-value problem calls for a solution such that

    y(0) = a,    y'(0) = b.    (0.1.2)

The approximation

    y''(x) ≈ h⁻²{y(x + h) − 2y(x) + y(x − h)}    (0.1.3)

suggests approximating to (0.1.1) by

    h⁻²{y†(x + h) − 2y†(x) + y†(x − h)} + λp(x)y†(x) = 0.    (0.1.4)

For convenience we suppose h = m⁻¹ for some integer m, and write yₙ = y†(nh); it then follows from (0.1.4) that

    yₙ₊₁ − 2yₙ + yₙ₋₁ + λh²p(nh)yₙ = 0,    n = 1, ..., m − 1.    (0.1.5)

For the initial conditions we approximate to (0.1.2) by

    y₀ = a,    (y₁ − y₀)/h = b.    (0.1.6)
Then yz,y3, ... are determined recursively from (O.lS), and we have replaced the initial-value problem (0.1.1-2) by the initial-value problem (0.1.5-6),the latter being immediately soluble. The conjecture to be tested is then that yn = yt(nh) approximates to y(nh), where y ( x ) is the solution of (0.1.1-2). This, or similar approaches, provides the basis of both a computational solution of the initial-value problem (0.1.1-2), and also of existence theorems for the existence of a solution of (0.1.1-2). The same ideas may be applied to boundary problems. I n the simplest
4
0.
INTRODUCTION
of boundary problems for (0.1.1), we ask for $\lambda$-values, the eigenvalues, such that (0.1.1) has a solution satisfying

$$ y(0) = y(1) = 0, \tag{0.1.7} $$

without, however, $y(x)$ vanishing identically, and so subject to $y'(0) \ne 0$. In the parallel problem for (0.1.5), we ask for $\lambda$-values such that (0.1.5) has a solution such that

$$ y_0 = y_m = 0, \tag{0.1.8} $$

and again not vanishing throughout, and so such that $y_1 \ne 0$. It is natural to conjecture that for small values of the tabular interval $h$, that is to say, for large $m$, at least the smaller eigenvalues of (0.1.5), (0.1.8) approximate to those of (0.1.1), (0.1.7), and that the eigenfunctions, the corresponding sequences satisfying (0.1.5), (0.1.8), can be used as a basis for approximation to the eigenfunctions of (0.1.1), (0.1.7). This latter argument has been used from time to time as a means of proving the eigenfunction expansion for (0.1.1), (0.1.7). Since the eigenfunction expansion for (0.1.5), (0.1.8) is purely elementary, being in effect the expression of an arbitrary vector as a linear combination of the eigenvectors of a symmetric matrix (see Chapters 4 and 6), this method provides perhaps the most elementary proof of the Sturm-Liouville eigenfunction expansion. It can also serve as a foundation for the eigenfunction expansion when (0.1.1) is generalized to allow certain discontinuities. In the reverse direction, it is possible to represent (0.1.5) in terms of a differential equation. We use this in Chapter 8 to give a unified treatment. Our attitude will be that boundary problems for difference and for differential equations are equally deserving of study, and that each can be of suggestive value for the other.
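This conjecture is easy to test numerically. The sketch below is ours, not from the text; it takes $p(x) = 1$, for which the eigenvalues of (0.1.1), (0.1.7) are $k^2\pi^2$, and realizes (0.1.5), (0.1.8) as a symmetric matrix eigenvalue problem.

```python
import numpy as np

def discrete_eigenvalues(m, p=lambda x: np.ones_like(x)):
    """Eigenvalues of (0.1.5), (0.1.8): h^{-2}(2y_n - y_{n+1} - y_{n-1}) = lam p(nh) y_n,
    with y_0 = y_m = 0 and h = 1/m, reduced to a symmetric matrix problem."""
    h = 1.0 / m
    x = np.arange(1, m) * h                      # interior mesh points
    A = (2.0 * np.eye(m - 1) - np.eye(m - 1, k=1) - np.eye(m - 1, k=-1)) / h**2
    S = np.diag(p(x) ** -0.5)                    # symmetrize A y = lam P y (p > 0)
    return np.sort(np.linalg.eigvalsh(S @ A @ S))

lam = discrete_eigenvalues(200)
print(lam[:3])   # close to pi^2, 4 pi^2, 9 pi^2
```

As the text asserts, the smaller eigenvalues converge fastest; the error for the $k$th eigenvalue grows with $k$ at fixed $h$.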
0.2. The Invariance Property

Our boundary problems will have the properties associated with "self-adjoint" problems, in that eigenvalues will be real, and there will be an orthogonality of eigenfunctions. Without laying much stress on the concept of self-adjointness, we adopt as the key property that the differential or difference equation should admit, for real parameter values, an integral of a certain form. In the case of (0.1.1) this will be the well-known fact that the Wronskian of two solutions is constant.
In general, there is to be a quadratic form in the solution which is constant in the independent variable. If we postpone for the moment a detailed exposition of this requirement, it may be remarked that such a property is to be expected on heuristic grounds in connection with vibrating physical systems which are nondissipative; for the vibrating string, for example, the particles are not subject to any friction, whereas for our cascade of circuits only LC-circuits are to be used. Supposing that the system is oscillating harmonically, we concentrate attention on the energy, potential and kinetic, located in some segment of the system. Since the time average of the energy in this segment is constant, the time average of the energy flow into the segment at one end must equal the average energy flow out at the other end. Thus the average energy flow is the same at all points along the system, and this is the invariant quadratic form in question.

We outline the latter argument in the case of (0.1.1). The property is that, if $\lambda$ is any real quantity, then a solution $y(x)$ of (0.1.1) satisfies

$$ y'\bar{y} - \bar{y}'y = \text{const}, \tag{0.2.1} $$

where $\bar{y}(x)$ is the complex-conjugate solution. That this is so may, of course, be verified by differentiation of (0.2.1) and use of (0.1.1). However, we may also derive it, for positive $\lambda$, by considerations of energy. Starting from the differential equation of a vibrating string in the form

$$ \partial^2 u/\partial x^2 = p(x)\, \partial^2 u/\partial t^2, \tag{0.2.2} $$

we seek solutions of the form

$$ u(x, t) = y(x) \exp(-i\sqrt{\lambda}\, t), \tag{0.2.3} $$

so that $y(x)$ must satisfy (0.1.1). We now allow $y$ to be complex, the physical disturbance being measured by the real part of $u$. This becomes a standard procedure if we shift the metaphor to electromagnetic theory; we interpret $E = (0, u, 0)$ as the electric vector of a linearly polarized wave moving in the $x$-direction in a stratified dielectric medium with parameters $\varepsilon = \varepsilon(x)$, $\mu = \text{constant}$, with in (0.2.2) $p(x) = \varepsilon(x)\mu$. The magnetic vector is then

$$ H = \bigl(0,\ 0,\ (i\mu\sqrt{\lambda})^{-1} u_x\bigr), \tag{0.2.4} $$

where $u_x = \partial u/\partial x = y' \exp(-i\sqrt{\lambda}\, t)$, the complex Poynting vector being

$$ \tfrac{1}{2}\, E \times \bar{H} = \tfrac{1}{2}\bigl(i\, y\bar{y}'/(\mu\sqrt{\lambda}),\ 0,\ 0\bigr). \tag{0.2.5} $$
The real part of this, namely

$$ \tfrac{1}{4}(\mu\sqrt{\lambda})^{-1}\bigl(i\, y\bar{y}' - i\, \bar{y}y',\ 0,\ 0\bigr), \tag{0.2.6} $$

is conventionally interpreted as the mean energy flow through unit area; in our present case it must be constant in $x$, since no energy is dissipated. In this way, the Wronskian property (0.2.1) becomes a case of Poynting's theorem.

In order to obtain a broader perspective, we put this latter result in matrix terms. To rewrite in such terms the original differential equation (0.1.1) we define the column matrix or vector

$$ z = \begin{pmatrix} y \\ y' \end{pmatrix}, \tag{0.2.7} $$

so that (0.1.1) yields

$$ z' = \begin{pmatrix} y' \\ -\lambda p y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -\lambda p & 0 \end{pmatrix} z. \tag{0.2.8} $$

The (*) is used to indicate the complex-conjugate transpose, so that $z^*$ is the row matrix $(\bar{y}\ \ \bar{y}')$. Then (0.2.1) may be written compactly as

$$ z^*Jz = \text{const}, \tag{0.2.9} $$

where

$$ J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{0.2.10} $$
Very similar identities to (0.2.1) are known in connection with self-adjoint differential operators of the $n$th order, associated with the topics of the Lagrange identity or of Green's theorem. When translated into matrix terms, these will involve $n$th-order matrices in place of (0.2.10). We consider some cases in Chapter 9. Such identities also occur in discrete cases such as (0.1.5) (see, for example, Section 4.2). In what follows we shall regard identities of the type of (0.2.9) as a means of classifying, and to some extent as a means of setting up boundary problems. We proceed to what is perhaps the simplest case of a boundary problem.
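The constancy of $z^*Jz$ can be watched along a numerically integrated solution of (0.2.8). In the sketch below (ours), the choices $p(x) = 1 + x$, $\lambda = 5$, and the initial vector are arbitrary; the integrator is a classical Runge-Kutta step.

```python
import numpy as np

# Track z*Jz along a solution of (0.2.8), z' = [[0, 1], [-lam*p(x), 0]] z.
# For real lam it should stay constant; p(x) = 1 + x is an arbitrary choice.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def invariant_along_solution(lam, z0, p=lambda x: 1.0 + x, steps=2000):
    def f(x, z):
        return np.array([z[1], -lam * p(x) * z[0]])
    h, x = 1.0 / steps, 0.0
    z = np.asarray(z0, dtype=complex)
    vals = [np.conj(z) @ J @ z]
    for _ in range(steps):
        k1 = f(x, z)
        k2 = f(x + h / 2, z + h / 2 * k1)
        k3 = f(x + h / 2, z + h / 2 * k2)
        k4 = f(x + h, z + h * k3)
        z = z + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
        vals.append(np.conj(z) @ J @ z)
    return np.array(vals)

v = invariant_along_solution(5.0, [1.0, 1.0j])
print(v[0], v[-1])   # both approximately 2i
```

For the initial vector $(1, i)$ the invariant is $z^*Jz = \bar{z}_1 z_2 - \bar{z}_2 z_1 = 2i$, and the integration preserves it to within the truncation error of the scheme.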
0.3. The Scalar Case

In keeping with our program of setting up boundary problems in association with invariance properties, we shall do this first for the case in which the invariance property merely says that a complex number has constant modulus. This will be the case of (0.2.9) in which all quantities
are scalars, $J$ being the 1-by-1 matrix unity. There will again be discrete, continuous, and mixed cases. Of these the continuous case is well known and is given by the differential equation

$$ y' = i\lambda y, \qquad 0 \le x \le 1 \quad [(')\ = d/dx], \tag{0.3.1} $$

where $\lambda$ is a parameter and $y$ a scalar dependent variable. It is immediately verifiable that if $\lambda$ is real, there holds the invariance property

$$ \bar{y}y = \text{const}. \tag{0.3.2} $$

The boundary problem associated with (0.3.1) is also well known. We choose a real $\alpha$, $0 \le \alpha < 2\pi$, and require that (0.3.1) have a solution, not identically zero, such that

$$ y(1) = y(0) \exp(i\alpha). \tag{0.3.3} $$

That the eigenvalues, the $\lambda$-values for which this is possible, are all real may be shown as follows. The boundary condition (0.3.3) requires that $|y(1)| = |y(0)|$. However, since $y(1) = y(0) \exp(i\lambda)$, we have that $|y(1)| < |y(0)|$ when $\operatorname{Im} \lambda > 0$, and $|y(1)| > |y(0)|$ when $\operatorname{Im} \lambda < 0$, which proves the result.

The discrete analog of this situation will deal with a sequence $y_0, y_1, \dots$ connected by a recurrence relation

$$ y_{n+1} = \phi_n(\lambda)\, y_n, \tag{0.3.4} $$
which is to have the property that

$$ |y_{n+1}| = |y_n| \tag{0.3.5} $$

if $\lambda$ is real. This cannot be achieved by making $\phi_n(\lambda)$ a linear function of $\lambda$, and the simplest function with this property will have the fractional linear or bilinear form

$$ \phi_n(\lambda) = (1 + i\lambda c_n)/(1 - i\lambda\bar{c}_n), \tag{0.3.6} $$

where $c_n$, $\bar{c}_n$ are complex conjugates, and we have arranged that $\phi_n(0) = 1$. For the corresponding boundary problem there will first be the finite discrete case, in which we have a finite recurrence formula

$$ y_{n+1} = y_n(1 + i\lambda c_n)/(1 - i\lambda\bar{c}_n), \qquad n = 0, \dots, m-1, \tag{0.3.7} $$

and again, in a similar manner to (0.3.3), require that

$$ y_m = y_0 \exp(i\alpha). \tag{0.3.8} $$
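The two modulus properties of the factor (0.3.6), unimodular for real $\lambda$ and contractive in the upper half-plane when $\operatorname{Re} c_n > 0$, can be confirmed directly; the value of $c$ below is an arbitrary illustration of ours.

```python
def phi(lam, c):
    """The elementary bilinear factor (0.3.6)."""
    return (1 + 1j * lam * c) / (1 - 1j * lam * c.conjugate())

c = 0.7 + 0.2j                   # any c with positive real part
print(abs(phi(3.0, c)))          # = 1 on the real axis
print(abs(phi(3.0 + 1.0j, c)))   # < 1 in the upper half-plane
```

A short computation shows $|1 - i\lambda\bar{c}|^2 - |1 + i\lambda c|^2 = 4\,(\operatorname{Im}\lambda)(\operatorname{Re} c)$ up to terms that cancel, which is why the sign of $\operatorname{Im}\lambda$ decides whether the factor contracts or expands.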
The proof that the eigenvalues are all real proceeds much as before, assuming that the $c_n$ all have positive real part; we take this up in Chapter 1. There will also be a limiting boundary problem of discrete type with $m = \infty$, which we consider in Chapter 2, and mixed continuous-discrete problems. Associated with the boundary problem (0.3.1), (0.3.3) there will be an orthogonality for the eigenfunctions and an expansion theorem. In the special case $\alpha = 0$, these become the orthogonality and expansion theorem associated with complex Fourier series; the case of general $\alpha$ will be similar. For the discrete recurrence formula (0.3.7) with the same boundary conditions, the eigenfunctions and orthogonality relations will concern certain rational functions, of which we give details in Chapter 1. There will again be a possibility of approximating to the continuous differential equation (0.3.1) by means of discrete relations (0.3.7), in other words of expressing $\exp(i\lambda)$ as the product of factors (0.3.7). Although it is fairly obvious that this can be done, the problem of making such an approximation in a best possible manner has some practical interest.

0.4. All-Pass Transfer Functions
To illustrate the boundary problem just discussed, we borrow notions from the topic of servomechanisms. Writing (0.3.7) in the form

$$ y_{n+1} - i\lambda\bar{c}_n y_{n+1} = y_n + i\lambda c_n y_n, \tag{0.4.1} $$

we are led to consider the differential recurrence relations

$$ u_{n+1} + \bar{c}_n \dot{u}_{n+1} = u_n - c_n \dot{u}_n, \qquad n = 0, \dots, m-1, \tag{0.4.2} $$

which we may imagine as a sequence of devices of which the $n$th feeds into the $(n+1)$th. We suppose that there is applied a sinusoidal driving force $u_0 = \gamma_0 \exp(-i\lambda t)$, where $\lambda$ is real and $\gamma_0$ is a non-zero constant. We find, for example, that

$$ u_1 + \bar{c}_0 \dot{u}_1 = (1 + i\lambda c_0)\, \gamma_0 \exp(-i\lambda t), \tag{0.4.3} $$

and so

$$ u_1 = (1 + i\lambda c_0)(1 - i\lambda\bar{c}_0)^{-1}\, \gamma_0 \exp(-i\lambda t), \tag{0.4.4} $$

where we have omitted as exponentially damped a term in $\exp(-t/\bar{c}_0)$. Proceeding in this way we find that

$$ u_m = \gamma_m(\lambda)\, \gamma_0 \exp(-i\lambda t), \tag{0.4.5} $$
where

$$ \gamma_m(\lambda) = \prod_{n=0}^{m-1} (1 + i\lambda c_n)/(1 - i\lambda\bar{c}_n). \tag{0.4.6} $$

The function $\gamma_m(\lambda)$ just defined expresses the ratio of the output $u_m$ to the input $u_0 = \gamma_0 \exp(-i\lambda t)$, and is the "transfer function" of the system. Since

$$ |\gamma_m(\lambda)| = 1 \tag{0.4.7} $$

for all real $\lambda$, as may be seen from the form (0.4.6), the driving function $\gamma_0 \exp(-i\lambda t)$ passes through the system with no change in absolute value; the term "all-pass" is used for such a transfer function. The effect of the system is thus to apply a phase shift

$$ \eta_m(\lambda) = \arg \gamma_m(\lambda), \tag{0.4.8} $$

so that

$$ \gamma_m(\lambda) = \exp\{i\eta_m(\lambda)\}. \tag{0.4.9} $$

Our boundary problem (0.3.8) requires the phase shift to have a specific value. More precisely, determining $\eta_m(\lambda)$ from (0.4.9) as a continuous function for real $\lambda$, fixed by $\eta_m(0) = 0$, the eigenvalues are the roots of

$$ \eta_m(\lambda) = \alpha \pmod{2\pi}. \tag{0.4.10} $$
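The phase condition can be solved numerically. In the sketch below (ours), the $c_n$ are taken real and small enough that the phase stays within $(0, \pi)$ on the bracketing interval, so that `np.angle` already returns the continuous phase; a root of $\eta_m(\lambda) = \alpha$ is then found by bisection.

```python
import numpy as np

def gamma_m(lam, cs):
    """The all-pass product (0.4.6)."""
    out = 1.0 + 0.0j
    for c in cs:
        out *= (1 + 1j * lam * c) / (1 - 1j * lam * np.conj(c))
    return out

cs = [0.3, 0.4, 0.5]      # illustrative c_n with positive real part
alpha = 1.0

def eta(lam):             # phase of gamma_m; valid while it stays in (-pi, pi)
    return np.angle(gamma_m(lam, cs))

lo, hi = 0.0, 1.0         # eta(0) = 0 and eta(1) > alpha for these c_n
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if eta(mid) < alpha else (lo, mid)
print(lo, abs(gamma_m(lo, cs) - np.exp(1j * alpha)))
```

For real $c_n$ each factor contributes a phase $2\arctan(\lambda c_n)$, so $\eta_m$ is increasing in $\lambda$ and the bisection is safe on the chosen bracket.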
We may interpret these eigenvalues as the singularities of a feedback problem with phase shift $\alpha$; the case $\alpha = \pi$ is commonly studied in applications. In place of (0.4.2) we set up the system

$$ u_{n+1} + \bar{c}_n \dot{u}_{n+1} = u_n - c_n \dot{u}_n, \qquad n = 1, \dots, m-1, \tag{0.4.11} $$

together with the feedback equation

$$ u_1 + \bar{c}_0 \dot{u}_1 = (1 - c_0\, d/dt)(u_0 + e^{-i\alpha} u_m). \tag{0.4.12} $$

Again we consider the result of applying a driving function $u_0 = \gamma_0 \exp(-i\lambda t)$, and select the solution, if any, of (0.4.12) of the form $\text{const.} \exp(-i\lambda t)$. Putting $\dot{u}_n = -i\lambda u_n$ to find such a solution, we may obtain from (0.4.11-12)

$$ u_m = \gamma_m(\lambda)(u_0 + e^{-i\alpha} u_m), \tag{0.4.13} $$

and so

$$ u_m = \gamma_0 \exp(-i\lambda t)\, \gamma_m(\lambda)/\{1 - e^{-i\alpha}\gamma_m(\lambda)\}. \tag{0.4.14} $$
Here $\gamma_m(\lambda)$ is as given by (0.4.6). The effect of the system is now to multiply the driving function $\gamma_0 \exp(-i\lambda t)$ by the "closed-loop transfer function"

$$ \gamma_m(\lambda)/\{1 - e^{-i\alpha}\gamma_m(\lambda)\}. \tag{0.4.15} $$

This function has poles at the eigenvalues, which are accordingly the values of $\lambda$ for which the system exhibits a singular response to the driving force $\exp(-i\lambda t)$. Functions of a very similar type to (0.4.15) will be termed later "characteristic functions," and will play a basic role in all our boundary problems. They are closely related to two important concepts, that of the spectral function and that of the Green's function. We may modify this feedback problem so as to obtain a more exact analog of the characteristic and the Green's function. Recalling that the latter is, in essence, the ratio of the response of the system at one point to a disturbance applied at the same or another point, roughly speaking, we set up the problem
$$ u_{n+1} + \bar{c}_n \dot{u}_{n+1} = u_n - c_n \dot{u}_n, \qquad n = 1, \dots, m-1, \tag{0.4.16} $$
$$ u_m = \exp(i\alpha)\,\{u_0 - \tfrac{1}{2} \exp(-i\lambda t)\}, \tag{0.4.17} $$
$$ u_1 + \bar{c}_0 \dot{u}_1 = (1 - c_0\, d/dt)\,\{u_0 + \tfrac{1}{2} \exp(-i\lambda t)\}. \tag{0.4.18} $$
Here we have introduced an inhomogeneous term $\exp(-i\lambda t)$, distributed half on either side of the "point of application" $u_0$, treating in (0.4.17) the boundary condition as a member of the family of recurrence relations. If we select again the solution, if any, of the form $\text{const.} \exp(-i\lambda t)$, and so put $u_n = v_n \exp(-i\lambda t)$, the above equations yield

$$ v_{n+1} = (1 + i\lambda c_n)(1 - i\lambda\bar{c}_n)^{-1} v_n, \qquad n = 1, \dots, m-1, $$
$$ v_m = e^{i\alpha}(v_0 - \tfrac{1}{2}), \qquad v_1 = (1 + i\lambda c_0)(1 - i\lambda\bar{c}_0)^{-1}(v_0 + \tfrac{1}{2}), $$

whence, with the notation (0.4.6), the solution of (0.4.16-18) is, in part,

$$ u_0 = \exp(-i\lambda t)\, \tfrac{1}{2}\{e^{i\alpha} + \gamma_m(\lambda)\}/\{e^{i\alpha} - \gamma_m(\lambda)\}. \tag{0.4.19} $$
The coefficient of $\exp(-i\lambda t)$ on the right may be considered, moving to network terminology, a driving-point admittance, as the ratio of response at the location $u_0$ to the force $\exp(-i\lambda t)$ administered there.
Modifying this function by a constant factor, we shall consider in Chapter 1 the function

$$ f_m(\lambda) = \tfrac{1}{2i}\{e^{i\alpha} + \gamma_m(\lambda)\}/\{e^{i\alpha} - \gamma_m(\lambda)\}, \tag{0.4.20} $$

terming this the characteristic function, in analogy with a partial usage for differential equations. It has the important properties that it is real when $\lambda$ is real, since then $|\gamma_m(\lambda)| = 1$, with poles at the eigenvalues, and has negative imaginary part when $\operatorname{Im} \lambda > 0$. It is also related, by way of the Stieltjes transform, to the spectral function, which we define later. In a similar way we may find $u_1, u_2, \dots$ from (0.4.16-18), extending (0.4.19) to

$$ u_n = \exp(-i\lambda t)\, e^{i\alpha}\gamma_n(\lambda)/\{e^{i\alpha} - \gamma_m(\lambda)\}, \tag{0.4.21} $$

the coefficient of $\exp(-i\lambda t)$ being a transfer admittance. We may next repeat the whole process with the "driving force" applied to the location $u_1$, according to
$$ u_2 + \bar{c}_1 \dot{u}_2 = (1 - c_1\, d/dt)\,\{u_1 + \tfrac{1}{2} \exp(-i\lambda t)\}, $$
$$ (1 + \bar{c}_0\, d/dt)\,\{u_1 - \tfrac{1}{2} \exp(-i\lambda t)\} = u_0 - c_0 \dot{u}_0, $$
the remainder of the recurrence relations and the boundary condition being unaltered. This will yield another driving-point admittance and transfer admittances. If generally $g_{rs}(\lambda)$ is the ratio of the function $u_s$ to a disturbance $\exp(-i\lambda t)$ applied in the case of $u_r$ in the above manner, the matrix $g_{rs}(\lambda)$ may be considered as a Green's function.
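The stated properties of the characteristic function can be checked numerically for the form (0.4.20); the constant factor $1/(2i)$, together with the values of $c_n$ and $\alpha$ below, are our illustrative choices, picked so that the reality and sign properties come out as stated.

```python
import numpy as np

def gamma_m(lam, cs):
    """The all-pass product (0.4.6)."""
    out = 1.0 + 0.0j
    for c in cs:
        out *= (1 + 1j * lam * c) / (1 - 1j * lam * np.conj(c))
    return out

def f_m(lam, cs, alpha):
    """Characteristic function of the form (0.4.20)."""
    g = gamma_m(lam, cs)
    e = np.exp(1j * alpha)
    return (e + g) / (e - g) / 2j

cs, alpha = [0.3, 0.4, 0.5], 1.0
print(f_m(0.2, cs, alpha).imag)          # ~0: real on the real axis
print(f_m(0.5 + 0.3j, cs, alpha).imag)   # < 0 in the upper half-plane
```

The mechanism is the classical one: for $|w| = 1$ the quotient $(1+w)/(1-w)$ is purely imaginary, while for $|w| < 1$ it has positive real part, and multiplication by $1/(2i)$ converts these into reality and a negative imaginary part respectively.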
0.5. Inverse Problems

In widely separated contexts, inverse problems present themselves. Here there is a boundary problem in which the differential or difference equation is unknown. The information given us may consist of the eigenvalues for a known boundary condition, or perhaps several sets of eigenvalues for several boundary conditions. A related problem is that in which the asymptotic behavior of the solutions is given, the differential or difference equation to be found. The problem may appear as one of design or synthesis of an apparatus with a view to some prescribed performance, or as one of diagnosis, as when the "potential" in the Schrödinger equation is to be found from scattering measurements. To give an illustration at a purely algebraic level, we might try to
find the constants $c_n$ implicit in (0.4.6), given the singularities of the closed-loop transfer function (0.4.15); as a matter of fact, we should need two sets of singularities corresponding to two values of $\alpha$, these singularities having the separation property. We deal with this in Section 1.8. A closely analogous problem is that of determining a set of orthogonal polynomials, and so a recurrence relation, given the zeros of two consecutive polynomials of the set.

From the theoretical point of view, when such inverse problems have been solved, sometimes no mean task, we see the theory in a finished form. Necessary and sufficient conditions are then known in order that the differential or difference equation should lead to eigenvalues of some specified character, or should have solutions with some specified asymptotic behavior.

Similar inverse considerations may be applied to intermediate stages in the theory of a boundary problem. For example, in the simplest case of the finite discrete scalar problem (0.3.7-8), the recurrence relation is essentially contained in the transfer function $\gamma_m(\lambda)$ given by (0.4.6), the eigenvalues being the roots of $\gamma_m(\lambda) = \exp(i\alpha)$. From the function-theoretic point of view, $\gamma_m(\lambda)$ is a rational function which maps the real $\lambda$-axis into the unit circle, and the upper and lower half-planes into the inside and outside of the unit circle, respectively, the $c_n$ being assumed to have positive real part. It may be shown that the form (0.4.6) gives the most general rational function with these properties, subject to $\gamma_m(0) = 1$.

This intrinsic characterization of $\gamma_m(\lambda)$ as a rational function with specified mapping properties gives our boundary problem, within its limits, an air of finality. It also suggests characterizing some related classes of boundary problems in this way. For example, we might ask for the most general entire function mapping the upper and lower half-planes into the inside and outside of the unit circle.
The answer is given by a complex exponential, which may be interpreted as the transfer function for the differential equation (0.3.1). Combining these two cases, we might next ask for the most general meromorphic function with these mapping properties, being led to a combination of (0.3.1), (0.3.7) into the mixed continuous-discrete case. Still further generality is obtained if we merely consider functions mapping the upper half-plane into the interior of the unit circle. In the problems just discussed, there is little difficulty in principle in proceeding from, in the discrete case, the transfer function (0.4.6) to the recurrence relation (0.3.7). The corresponding higher-dimensional problem, of the factorization of contractive matrix functions, is much more substantial.
0.6. The General Orthogonal Case

The further course of the theory of boundary problems might be baldly summarized as the extension of these ideas into vector and matrix terms. In each case we may start with a differential or a difference equation, leaving some quadratic form invariant, and involving a parameter. Here we outline the case in which it is to be the length of the vector which is invariant.

We start with the continuous case, and give a direct extension of the differential equation (0.3.1). This is given by the first-order system

$$ y' = i\lambda Hy, \qquad 0 \le x \le 1, \tag{0.6.1} $$
where now $y = y(x)$ is a $k$-vector, written as a column matrix with entries $y_1(x), \dots, y_k(x)$, and $H = H(x)$ is a $k$-by-$k$ matrix of functions of $x$, which for convenience we may suppose continuous; all quantities may be complex. The desired invariance property is that $y(x)$ should be of constant length, in the ordinary sense, so that

$$ \sum_{r=1}^{k} |y_r(x)|^2 = \text{const}, \qquad 0 \le x \le 1. \tag{0.6.2} $$
Denoting by $y^* = y^*(x)$ the complex conjugate transpose of $y(x)$, that is to say, a row matrix with entries $\bar{y}_1(x), \dots, \bar{y}_k(x)$, we may write (0.6.2) more compactly as

$$ y^*y = \text{const}, \qquad 0 \le x \le 1. \tag{0.6.3} $$

This forms an extension of (0.3.2), or again a particular case of (0.2.9). We require that the matrix $H$ should be such that (0.6.3) is true for any solution of (0.6.1) if $\lambda$ is real. This is ensured if $H(x)$ is Hermitean, or equal to its complex conjugate transpose,

$$ H = H^*, \qquad 0 \le x \le 1; \tag{0.6.4} $$

it may in particular be real and symmetric. To verify this we use the adjoint form of (0.6.1); taking complex conjugates and transposing, we have, in fact,

$$ y^{*\prime} = -i\bar{\lambda}\, y^*H. \tag{0.6.5} $$

Hence

$$ (y^*y)' = y^{*\prime}y + y^*y' = -i\bar{\lambda}\, y^*Hy + y^*(i\lambda Hy) = i(\lambda - \bar{\lambda})\, y^*Hy, \tag{0.6.6} $$

which is zero if $\lambda$ is real. This establishes (0.6.2-3), for real $\lambda$.
The analog of what was in Section 0.4 termed the transfer function concerns the "fundamental solution" of the matrix equation corresponding to (0.6.1); this is a $k$-by-$k$ matrix $Y = Y(x)$ defined by

$$ Y' = i\lambda HY, \qquad Y(0) = E, \tag{0.6.7} $$

where $E$ denotes the unit matrix. In a similar way to (0.6.6), it may be shown that

$$ (Y^*Y)' = i(\lambda - \bar{\lambda})\, Y^*HY, \tag{0.6.8} $$

so that, if $\lambda$ is real, $Y^*Y$ is constant in $x$ and, in fact,

$$ Y^*Y = E, \tag{0.6.9} $$
and so $Y$ is unitary. If, moreover,

$$ H(x) > 0, \qquad 0 \le x \le 1, \tag{0.6.10} $$

in the sense that $H(x)$ is positive-definite, it follows from (0.6.8) that $Y^*Y$ is a decreasing function of $x$ if $\operatorname{Im} \lambda > 0$, and increasing if $\operatorname{Im} \lambda < 0$. Thus

$$ Y^*(1)Y(1) < E, \qquad \operatorname{Im} \lambda > 0, \tag{0.6.11} $$
$$ Y^*(1)Y(1) > E, \qquad \operatorname{Im} \lambda < 0. \tag{0.6.12} $$
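Both assertions can be observed numerically: integrating (0.6.7) with an arbitrary Hermitean positive-definite $H(x)$ (the $H$ below is an illustrative choice of ours, not from the text), $Y(1)$ comes out unitary for real $\lambda$ and strictly contractive for $\operatorname{Im} \lambda > 0$.

```python
import numpy as np

def fundamental_solution(lam, H, steps=400):
    """RK4 integration of Y' = i*lam*H(x)*Y, Y(0) = E, over 0 <= x <= 1."""
    Y = np.eye(H(0.0).shape[0], dtype=complex)
    f = lambda x, Y: 1j * lam * H(x) @ Y
    h, x = 1.0 / steps, 0.0
    for _ in range(steps):
        k1 = f(x, Y)
        k2 = f(x + h / 2, Y + h / 2 * k1)
        k3 = f(x + h / 2, Y + h / 2 * k2)
        k4 = f(x + h, Y + h * k3)
        Y = Y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
        x += h
    return Y

# An arbitrary Hermitean, positive-definite H(x)
H = lambda x: np.array([[2.0 + x, 0.3 + 0.1j], [0.3 - 0.1j, 1.0]])

Y1 = fundamental_solution(3.0, H)          # real lam: Y*Y = E, as in (0.6.9)
Y2 = fundamental_solution(3.0 + 0.5j, H)   # Im lam > 0: Y*Y < E, as in (0.6.11)
print(np.linalg.norm(Y1.conj().T @ Y1 - np.eye(2)))
print(np.linalg.eigvalsh(np.eye(2) - Y2.conj().T @ Y2))
```

Since $(Y^*Y)' = -2(\operatorname{Im}\lambda)\, Y^*HY$, the decay in the second case is substantial, and $E - Y^*(1)Y(1)$ is positive-definite well beyond the integration error.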
We may interpret the matrix loci in (0.6.9), (0.6.11), and (0.6.12) as the (matrix) unit circle, and its interior and exterior, respectively, and again assert that the transfer function $Y(1)$ maps the upper and lower half-planes in the $\lambda$-plane into the interior and exterior of the (matrix) unit circle. For a boundary problem, similar to (0.3.3), we take some fixed unitary matrix $N$, say, with $N^*N = E$, and ask for what $\lambda$-values (0.6.1) has a nontrivial solution such that

$$ y(1) = Ny(0); \tag{0.6.13} $$

we could, of course, put $N$ into exponential form, as was done in (0.3.3), which might be preferable for the study of separation theorems. Here we confine ourselves to observing that the eigenvalues are necessarily real; by (0.6.13) and the fact that $N$ is unitary, $y(1)$ and $y(0)$ have the same length, and since $y(1) = Y(1)y(0)$, this is impossible by (0.6.11-12) if $\lambda$ is complex. For a discrete boundary problem associated with the invariance property (0.6.3), we seek elementary factors, matrix functions of $\lambda$,
which are unitary when $\lambda$ is real, and lie, so to speak, inside the unit circle when $\lambda$ is in the upper half-plane. Such factors are once more, as in (0.3.6-7), given by bilinear expressions. Denoting by $y_0, y_1, \dots$ a discrete sequence of $k$-vectors, we are led to set up the recurrence relations

$$ y_{n+1} = (E + i\lambda C_n)(E - i\lambda C_n^*)^{-1} y_n, \qquad n = 0, 1, \dots, \tag{0.6.14} $$

where the $C_n$ are "normal," $C_n^*C_n = C_nC_n^*$, and $C_n + C_n^* > 0$, or have positive "real part." For the boundary problem we consider again whether there is a solution such that $y_m = Ny_0$, where $N$ is a fixed unitary matrix.

There are a number of variants on the two boundary problems just formulated, of which a bare mention must suffice:

(i) the boundary problem (0.6.1), (0.6.13) may be studied, in the limit, over the interval $0 \le x < \infty$, or $-\infty < x < \infty$, yielding "limit-point" and "limit-circle" cases;

(ii) the recurrence relation (0.6.14) may be studied over $0 \le n < \infty$, or over $-\infty < n < \infty$, again with a discrimination between limit-point and limit-circle cases;

(iii) the variation of $y(x)$ may be determined in part by a differential equation, in part by a recurrence relation, according to a scheme of the form, for example,

$$ y'(x) = i\lambda H(x)y(x), \qquad x_n < x < x_{n+1}, \quad n = 0, 1, \dots, $$
$$ y(x_n + 0) = (E + i\lambda C_n)(E - i\lambda C_n^*)^{-1}\, y(x_n - 0), \qquad n = 1, 2, \dots; $$

(iv) the coefficient $\lambda H(x)$ on the right-hand side of (0.6.1) may be replaced by a bilinear expression $[\lambda A(x) + B(x)][\lambda C(x) + D(x)]^{-1}$ which is restricted to be Hermitean, when the denominator is nonsingular.
There is a certain parallel between (0.6.1) and the time-dependent Schrödinger equation, if the derivative on the left be interpreted as a time-derivative, and the matrix $H$ on the right be compared to a Hermitean operator.

0.7. The Three-Term Recurrence Formula
Returning to systems of low dimensionality, we come to the discrete case presented by the recurrence relation

$$ c_n y_{n+1} + (a_n\lambda + b_n)y_n + c_{n-1}y_{n-1} = 0, \tag{0.7.1} $$

which we may also write as a difference equation

$$ c_n(y_{n+1} - y_n) - c_{n-1}(y_n - y_{n-1}) + (a_n\lambda + b_n')y_n = 0, \tag{0.7.2} $$
the $y_n$ now being scalars. If we assume the $c_n$ all real and positive, the $y_n$ can be determined by recurrence given two consecutive $y_n$, $y_{n+1}$, and boundary problems can be set up; a special case we gave in (0.1.5), (0.1.8). Here we fit this relation into the scheme of difference and differential equations with an invariance property, and then consider some realizations of such recurrence relations. The invariance property here is that if $z_n$ is a second solution of the same recurrence formula, and $a_n$, $b_n$ or $b_n'$, $c_n$, and $\lambda$ are all real, then

$$ c_{n-1}\{z_n y_{n-1} - z_{n-1} y_n\} = \text{const}, \tag{0.7.3} $$
in the sense that it is independent of $n$. This forms an analog of (0.2.1), and, like it, can be put into a matrix form (0.2.9-10). We take this up in Chapter 3. The relation (0.7.1) or (0.7.2) has a substantial claim to mathematical attention as a basis for the theory of orthogonal polynomials; we treat this aspect in Chapters 4 and 5. A closely related topic is that of analytic continued fractions. In this section we shall outline three interpretations of (0.7.1-2) of a more or less physical character. These are much more than "applications," in that each has proved of great suggestive value, and has a respectable theory of its own, at which we can do no more than hint.

We take up first the case of the vibrating string, mentioned in Section 0.1, which gives a simple illustration of (0.7.2) with $b_n' = 0$. To take a finite-dimensional case, we suppose that the weightless string bears $m$ particles, of masses $a_0, \dots, a_{m-1}$; the distance between the particles $a_r$, $a_{r+1}$ is to be $1/c_r$, $r = 0, \dots, m-2$. In addition, the string extends to a length $1/c_{m-1}$ beyond $a_{m-1}$, and to a length $1/c_{-1}$ beyond $a_0$. The entire string is stretched to unit tension. If $u_n$ is the displacement of the particle $a_n$ at time $t$, the restoring forces on it due to the tension of the string are $c_{n-1}(u_n - u_{n-1})$, $-c_n(u_{n+1} - u_n)$, considering small oscillations only, whence the differential equation of the motion,

$$ -a_n\ddot{u}_n = c_{n-1}(u_n - u_{n-1}) - c_n(u_{n+1} - u_n). \tag{0.7.4} $$
If we seek solutions of the form $u_n = y_n \cos(\omega t)$, where $y_n$ is the amplitude of oscillation of the particle $a_n$, we derive

$$ c_n(y_{n+1} - y_n) - c_{n-1}(y_n - y_{n-1}) + \omega^2 a_n y_n = 0, \tag{0.7.5} $$

which is the form (0.7.2) with $\lambda = \omega^2$ and $b_n' = 0$.
If we define $y_n(\lambda)$ as the solution of

$$ c_n(y_{n+1} - y_n) - c_{n-1}(y_n - y_{n-1}) + a_n\lambda y_n = 0, \qquad n = 0, \dots, m-1, \tag{0.7.6} $$

with the initial conditions $y_{-1} = 0$, $c_{-1}y_0 = 1$, this may be taken as representing a hypothetical vibration in which the lower end is pinned down. If it so happens that $y_m(\lambda) = 0$, it will be possible for the string to vibrate freely with both ends pinned down. For a second solution we may take $z_n(\lambda)$, satisfying the same recurrence relation (0.7.6), with the initial conditions $z_{-1} = 1$, $z_0 = 0$, representing a vibration with the particle $a_0$ pinned down. If again $z_m(\lambda) = 0$, then $\sqrt{\lambda} = \omega$ will be a natural frequency of the system with the particle $a_0$ held down and also the upper end of the string. It may be shown that the zeros of $y_m(\lambda)$, $z_m(\lambda)$ have the separation property associated with Sturm-Liouville equations.

We pursue this example to the extent of setting up an inhomogeneous problem, leading to the characteristic function. We suppose a force $F \cos(\omega t)$ applied transversely to the particle $a_0$, the string being pinned down at both ends (Fig. 1).

FIG. 1. Inhomogeneous problem for vibrating string.

Assuming, if possible, the displacement of $a_n$ at time $t$ to have the form $v_n \cos(\omega t)$, we must have, writing $\omega^2 = \lambda$,
$$ c_n(v_{n+1} - v_n) - c_{n-1}(v_n - v_{n-1}) + a_n\lambda v_n = 0, \qquad n = 1, \dots, m-1, \tag{0.7.7} $$
$$ c_0(v_1 - v_0) - c_{-1}v_0 + a_0\lambda v_0 + F = 0, \tag{0.7.8} $$
and finally $v_m = 0$. We may solve these equations by taking $v_n$ to be a linear combination of solutions of (0.7.6). If we take $v_n = \alpha y_n + \beta z_n$, where $y_n$, $z_n$ are fixed by initial conditions as above, the equations (0.7.7) will hold, and also (0.7.8) if we arrange that

$$ F = c_{-1}v_{-1} = c_{-1}(\alpha y_{-1} + \beta z_{-1}) = \beta c_{-1}, $$

by the initial conditions. Since $v_m = 0$, and so $\alpha y_m + \beta z_m = 0$, we have $\alpha = -\beta z_m/y_m = -Fz_m/(c_{-1}y_m)$. We deduce that

$$ v_0 = \alpha y_0 = -F z_m(\lambda)/\{c_{-1}^2\, y_m(\lambda)\}. $$
The function $-z_m(\lambda)/y_m(\lambda)$ will be a special case of what we use in Chapters 4 and 5 as a characteristic function; it will be expedient to use a more general function, corresponding to a more general boundary condition at the upper end, with a view to limit-circle investigations in the complex plane. The function $-z_m(\lambda)/y_m(\lambda)$ measures the ratio of the amplitude of oscillation of $a_0$ to that of the applied force, and so may be considered again as a "driving-point admittance." The term "coefficient of dynamical pliability" has been used by M. G. Krein. It would be possible to compare the amplitude of oscillation of a general $a_s$ to the force applied at $a_r$, and so generally to construct transfer admittances or a Green's function. Analytically, the characteristic function just formed will have the properties that its poles and zeros alternate with each other, being all real, that it is monotonic on the real axis, and has negative imaginary part when $\lambda$ has positive imaginary part. An equivalent scheme in terms of network theory is noted in Fig. 2.
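The alternation of the poles and zeros of $-z_m(\lambda)/y_m(\lambda)$ can be seen by carrying the recurrence (0.7.6) in polynomial arithmetic; the masses and lengths below are arbitrary illustrative values of ours.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def pinned_solutions(a, c):
    """y_n(lam), z_n(lam) of (0.7.6) as polynomials in lam, with
    y_{-1} = 0, c_{-1} y_0 = 1 and z_{-1} = 1, z_0 = 0;
    a = (a_0, ..., a_{m-1}), c = (c_{-1}, ..., c_{m-1})."""
    lam = P([0.0, 1.0])
    y_prev, y = P([0.0]), P([1.0 / c[0]])
    z_prev, z = P([1.0]), P([0.0])
    for n in range(len(a)):
        cn, cn1 = c[n + 1], c[n]          # c_n and c_{n-1}
        y_prev, y = y, y + (cn1 * (y - y_prev) - a[n] * lam * y) / cn
        z_prev, z = z, z + (cn1 * (z - z_prev) - a[n] * lam * z) / cn
    return y, z

a = [1.0, 2.0, 1.5, 1.0]                  # illustrative masses
c = [1.0, 0.8, 1.2, 1.0, 0.9]             # c_{-1}, c_0, c_1, c_2, c_3
ym, zm = pinned_solutions(a, c)
ry = np.sort(ym.roots().real)             # 4 poles of -z_m/y_m
rz = np.sort(zm.roots().real)             # 3 zeros, interlacing the poles
print(ry)
print(rz)
```

The roots come out real, and between any two consecutive zeros of $y_m$ there lies exactly one zero of $z_m$, which is the separation property quoted above.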
FIG. 2. Inhomogeneous problem for LC-network.
The $a_n$ are now interpreted as inductances instead of masses, while the $1/c_n$ are capacitances instead of the distances between the masses. The $u_n$ are now the loop currents in the successive meshes, instead of the displacements of the masses; these cancel in part on branches common to two adjoining meshes, so that, for example, the current in the shunt branch containing $1/c_0$ is $u_0 - u_1$ in the sense of $u_0$. For the inhomogeneous problem we suppose a generator in the first mesh to supply a voltage $E \exp(i\omega t)$, and seek a solution of the form $u_n = v_n \exp(i\omega t)$, the $v_n$ being complex constants. Equating to zero the total voltage drop
around each of the meshes, and suppressing the factor $\exp(i\omega t)$, we get

$$ E = v_0 a_0 i\omega + c_{-1}v_0(i\omega)^{-1} + c_0(v_0 - v_1)(i\omega)^{-1}, $$
$$ 0 = v_n a_n i\omega + c_{n-1}(v_n - v_{n-1})(i\omega)^{-1} + c_n(v_n - v_{n+1})(i\omega)^{-1}, \qquad n = 1, \dots, m-1, $$
where for the last equation we interpret $v_m = 0$. On multiplying by $i\omega$ and rearranging, we obtain a set of equations of the form (0.7.7-8). The ratios $v_0/E, v_1/E, \dots$ may again be interpreted as driving-point and transfer admittances or as a Green's function; of these, $v_0/E$ is essentially a characteristic function, whose zeros and poles are certain frequencies of the network. It scarcely needs pointing out that the network analogy can interpret more general recurrence relations. In particular, it gives other realizations of the present recurrence relations, for example, by forming the network with series capacitances and shunt inductances.

For a third interpretation of the three-term recurrence formula, and one with a different mathematical flavor, we turn to probability theory, and the area of Markov processes, birth and death processes, and random walks. We take first the latter interpretation with a discrete time process. A particle starts at time $t = 0$ at one of $m$ places, labeled $0, 1, \dots, m-1$ in Fig. 3, and at successive instants $t = 1, 2, \dots$, can
FIG. 3. Random walk, with $\alpha_n = c_n/a_n$, $\beta_n = c_{n-1}/a_n$.
move one place to the right or to the left, or can remain fixed. If for some $t = t_0$ the particle is in position $n$, there is a probability $\alpha_n$ that at the next instant $t_0 + 1$ it will be at position $n + 1$, and likewise probability $\beta_n$ that it is at position $n - 1$ at time $t_0 + 1$, the probability of it being in the same position being therefore $1 - \alpha_n - \beta_n$. At the endpoints there are likewise probabilities of motion to the left and the right; if the particle moves to the left of position 0, or to the right of position $m - 1$, it is considered permanently lost. We now define $p_{rs}(n)$ as the probability of the particle being in position $s$ at time $n$ if it starts off in position $r$ when $t = 0$. The relation

$$ p_{rs}(n+1) = (1 - \alpha_s - \beta_s)\, p_{rs}(n) + \alpha_{s-1}\, p_{r,s-1}(n) + \beta_{s+1}\, p_{r,s+1}(n) \tag{0.7.9} $$

results from the consideration that if the particle, starting initially at $r$, is at $s$ when $t = n + 1$, then when $t = n$ it must have been at one of
+
1, s 1 ; it then has respective probabilities 1 - or, - rS,, , of moving from these positions to position s. If s = 0, m - 1, (0.7.9)remains in force if formally we set = 0, prn = 0 corresponding to the no return from the left of 0 and from the right of m - 1. It is possible to deduce from (0.7.9) and from the fact that p,(O) = 6, a second relation, which comes closer to our recurrence relation. T o see this we put (0.7.9) in matrix form. We write P(n) for the matrix p,(n), T , s = 0, ..., m - 1, and T for the matrix s, s -
p8+,
-0Lo
-80
81
T =
- 81
0
82
0
0
...
... ... - p2 ...
0
m0 -011
0 0 0 0
a1 -012
(0.7.10)
am-2
O
Pm-1
-&rn-i-Pm-i
a matrix of Jacobi form, in which all elements are zero except perhaps for those on the leading diagonal and on the diagonals immediately above and below it. If we write E for the mth order unit matrix, (0.7.9) assumes the form

P(n + 1) = P(n) (E + T);   (0.7.11)

since also P(0) = E, we have the solution

P(n) = (E + T)^n.   (0.7.12)
From this we have at once that, in addition to (0.7.11),

P(n + 1) = (E + T) P(n),   (0.7.13)

or, explicitly,

p_{rs}(n + 1) − p_{rs}(n) = α_r {p_{r+1,s}(n) − p_{rs}(n)} − β_r {p_{rs}(n) − p_{r−1,s}(n)}.   (0.7.14)

A formal resemblance to (0.7.2), (0.7.4) will now be apparent. This resemblance becomes more significant if we set the problem of finding expressions for P(n) as given by (0.7.12); the form (E + T)^n gives, for example, little information on the asymptotic form of P(n) for large n. Alternative expressions may be found in terms of the eigenvalues and eigenvectors of the matrix T, relying on the spectral theory of symmetric matrices; T is not, indeed, symmetric, but is “similar”
to a symmetric matrix. To find these eigenvalues λ and eigenvectors (y_0, y_1, ..., y_{m−1}) we have the equations

λy_0 = α_0(y_1 − y_0) − β_0y_0,
λy_n = α_n(y_{n+1} − y_n) − β_n(y_n − y_{n−1}),   n = 1, ..., m − 2,
λy_{m−1} = −α_{m−1}y_{m−1} − β_{m−1}(y_{m−1} − y_{m−2}).
If now we put α_n = c_n/a_n, β_n = c_{n−1}/a_n, these equations agree with those, (0.7.6), which define a polynomial y_n(λ) associated with the vibrating string, except as regards the sign of λ. Without reproducing the full details, we may say that the eigenvalues of T are the negatives of the squared natural frequencies of the string of Fig. 1, vibrating freely without applied force, the eigenvectors being given by the displacements of the respective particles in these natural modes. Very similar analysis applies to the process, continuous in time but not in space, in which the particle at position n has in a small time Δt a chance α_nΔt of moving to position n + 1, and a chance β_nΔt of moving to position n − 1. In this case we denote by p_{rs}(t) the probability of a particle starting in position r at t = 0 being at position s at time t, and by P(t) the matrix p_{rs}(t), r, s = 0, ..., m − 1. It is found that (0.7.12) must be replaced by

P(t) = exp (tT),   (0.7.15)

and (0.7.14) by
p′_{rs}(t) = α_r{p_{r+1,s}(t) − p_{rs}(t)} − β_r{p_{rs}(t) − p_{r−1,s}(t)},   (0.7.16)
which again is close to (0.7.4). Again, more informative expressions for P(t) than (0.7.15) may be found in terms of the eigenvalues and eigenvectors of T, effectively in terms of certain orthogonal polynomials and their zeros. In the above we have confined ourselves to problems which are finite and discrete. Extensions that suggest themselves include in the first instance infinite discrete cases. The vibrating light string might bear an infinity of particles, with one or more limit-points, finite or otherwise; this is, of course, included in the general case in which the string also has weight.

0.8. The 2-by-2 Symplectic Case
We may use this as a general term to cover cases exhibiting an invariance of the form (0.2.9-10), or again (0.7.3). The Sturm-Liouville
case of (0.1.1) or (0.2.8) is included in that of a two-dimensional system of the form

u_1′(x) = {λq_1(x) + r_1(x)} u_2(x),   (0.8.1)
u_2′(x) = −{λq_2(x) + r_2(x)} u_1(x),   (0.8.2)

for 0 ≤ x ≤ 1,
where at least one of q_1(x), q_2(x) is positive, and q_1, q_2, r_1, r_2 are, say, piecewise continuous, and r_1, r_2 are real. Again we observe that if λ is real, and v_1(x), v_2(x) is a second solution of the system (0.8.1-2), then

v̄_2u_1 − v̄_1u_2 = const.   (0.8.3)
To translate this into matrix terms, we define a fundamental solution, a 2-by-2 matrix Y(x), the solution of

Y′(x) = (       0            λq_1 + r_1 ) Y(x),    Y(0) = E,   (0.8.4)
        ( −(λq_2 + r_2)          0      )
where E is the unit matrix. It may then be shown that

Y*JY = J,   0 ≤ x ≤ 1;   (0.8.5)
since Y*JY = J is the defining property of a symplectic matrix, J being given by (0.2.10), we refer to this as the symplectic case, for the present 2-by-2 situation. So far in this section we have confined attention to continuous cases. In the mixed continuous-discrete case the solutions u_1, u_2 of (0.8.1-2) may be subjected to discontinuous changes at points in 0 ≤ x ≤ 1, which are at any rate denumerable. At such a point t, say, the values on either side will be related by

( u_1(t + 0) )   ( α   β ) ( u_1(t − 0) )
( u_2(t + 0) ) = ( γ   δ ) ( u_2(t − 0) ).   (0.8.6)

Here the matrix

( α   β )
( γ   δ )

is to be symplectic, if independent of λ;
if linearly dependent, or bilinearly dependent, on λ, it is to be symplectic for all real λ. The simplest example of such a situation is perhaps that of a parameter in the boundary conditions of a Sturm-Liouville problem. We might consider the case of a transmission line beginning or ending with a circuit, or a heavy string bearing a particle at one end, which is free to slide transversely. Modifying the latter problem, suppose that the string has density ρ(x), 0 ≤ x ≤ 1, is pinned down at x = 1, bears a particle of mass m at x = 0, which is joined by a weightless string to a fixed point
0.8.
THE 2 - B Y - 2 SYMPLECTIC CASE
23
x = −1, the whole string having unit tension. We seek harmonic vibrations in which the displacement has the form y(x) cos (t√λ). The dynamical equations give, for the particle and heavy string,

−λρ(x)y = y″,   0 < x < 1,   (0.8.7)
−λmy(0) = y′(0) − y(0),   (0.8.8)
where y′(0) denotes the right-hand derivative at x = 0. The left-hand derivative of y at x = 0 is in the present case y′(−1) = y(0). As is well known, the problem (0.8.7-8), together with y(1) = 0, has, broadly speaking, the properties associated with ordinary Sturm-Liouville problems. We may rephrase this latter problem as one with the boundary conditions of standard type

y(−1) = y(1) = 0,   (0.8.9)

together with discontinuities specified by a symplectic matrix. We have the geometrical condition

y(0) = y(−1) + y′(−1),

while (0.8.8) may be written, in more basic form, as

−λmy(0) = y′(0) − y′(−1),

where again y′(0) denotes y′(+0). Hence

( y(0)  )   (  1         1    ) ( y(−1)  )
( y′(0) ) = ( −λm     1 − λm  ) ( y′(−1) ),   (0.8.10)
and the square matrix on the right is symplectic for all real λ. A third method of dealing with this boundary problem will be to express it as a first-order system with piecewise continuous coefficients. We write u_1(x) = y(x), u_2(x) = y′(x), for 0 ≤ x ≤ 1, in accordance with the usual method of rewriting (0.8.7) as a system. At x = 0, y′ undergoes a jump of −λmy(0) = −λmu_1(0). We make u_2(x) have this increase over the interval (−1, 0), keeping u_1(x) constant, so that we set

u_1′ = 0,   u_2′ = −λmu_1,   −1 < x < 0.

Finally, corresponding to the linear behavior of y(x) in (−1, 0), we set

u_1′ = u_2,   u_2′ = 0,   −2 < x < −1.
In this way, the case of a discontinuity in y′ due to the presence of a point mass may be “smoothed out,” and the need for special machinery avoided. We mention briefly the various treatments possible for the general case of a vibrating string, loaded in any manner, and stretched over 0 ≤ x ≤ 1 with unit tension. We consider harmonic vibrations as before. The string is defined by a bounded nondecreasing function m(x), being the mass of the portion (0, x) of the string, including any particle there might be at the end-point x. Denoting by y(x) the amplitude of the displacement, and by y′(x) its right-hand derivative, y will be continuous, and y′ also, except at points of the string where a particle is located, at which it will have a discontinuity. Supposing that y(0) = 0, we have from dynamical considerations that

y′(x) − y′(0) = −λ ∫_0^x y(s) dm(s),   (0.8.11)
and this, together with y(1) = 0, specifies a boundary problem. By the method just described, this may, if we wish, be replaced by a first-order system with piecewise continuous coefficients. Another possibility is to write (0.8.11) as a system of integral equations, with u_1 = y, u_2 = y′, as

u_1(x) − u_1(0) = ∫_0^x u_2(s) ds,   (0.8.12)

u_2(x) − u_2(0) = −λ ∫_0^x u_1(s) dm(s).   (0.8.13)
Yet another possibility is to integrate (0.8.11) so as to derive an ordinary integral equation

y(x) = y(0) + xy′(0) − λ ∫_0^x (x − s) y(s) dm(s),

where y(0), y′(0) are considered as given. We develop the theory of this integral equation in Chapters 11 and 12. We refer, in the notes to Chapter 8, to another method, much developed by W. Feller, in which the machinery of the second-order differential equation is adapted to discontinuous cases, by means of a generalized second derivative.
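To make the discussion concrete, here is a small numerical sketch of our own (not from the text), for the string-with-particle problem (0.8.7-9) with the illustrative data ρ(x) = 1, particle mass 1, and unit tension: for a trial λ we integrate the string equation leftward from x = 1 and test the particle condition (0.8.8) at x = 0, the left-hand derivative there being y(0).

```python
import numpy as np

def residual(lam, mass=1.0, steps=2000):
    """Integrate y'' = -lam*y (rho = 1) leftward from x = 1 with y(1) = 0,
    and return the defect in the particle condition (0.8.8):
    -lam*mass*y(0) = y'(0) - y(0)."""
    h = 1.0 / steps
    s = np.array([0.0, -1.0])                  # (y, y') at x = 1
    f = lambda st: np.array([st[1], -lam * st[0]])
    for _ in range(steps):                     # classical RK4 with step -h
        k1 = f(s); k2 = f(s - 0.5 * h * k1)
        k3 = f(s - 0.5 * h * k2); k4 = f(s - h * k3)
        s = s - (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    y0, yp0 = s
    return yp0 - y0 + lam * mass * y0

# bisect for the lowest positive eigenvalue
lo, hi = 0.1, 4.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if residual(lo) * residual(mid) <= 0:
        hi = mid
    else:
        lo = mid
lam_star = 0.5 * (lo + hi)
print(lam_star)
```

The computed λ is the lowest squared natural frequency of this particular loaded string; the same shooting scheme works for any piecewise continuous ρ.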
CHAPTER 1
Boundary Problems for Rational Functions
1.1. Finite Fourier Series

In the simplest of the standard boundary problems for differential equations, we ask for solutions of the first-order equation

y′ = iλy,   0 < x < 1   [(′) = d/dx],   (1.1.1)

satisfying the boundary condition

y(0) = y(1) ≠ 0.   (1.1.2)
In this chapter we take up a “discretized” version of this problem. A well-known discrete analog of (1.1.1) is suggested by the associated eigenfunction expansion. The eigenvalues, the λ-values for which the problem (1.1.1-2) is soluble, are the roots of the equation exp (iλ) = 1, and so are given by λ = 0, ±2π, ±4π, ..., and the eigenfunctions, the corresponding solutions of (1.1.1), are given by y(x) = exp (iλx) for these λ-values. These eigenfunctions are orthonormal, in that
∫_0^1 exp (2πirx) exp (−2πisx) dx = δ_{rs},   (1.1.3)

for r, s = 0, ±1, ... . On the basis of this we form the expansion of an “arbitrary” function f(x) in a complex Fourier series

f(x) = Σ_n κ_n exp (2πinx),   (1.1.4)

whereby, if the expansion is to be valid in a suitable sense, the coefficients κ_n must be given by the Fourier process

κ_n = ∫_0^1 f(x) exp (−2πinx) dx.   (1.1.5)
In the topic of finite Fourier series we fix a positive integer m and in place of the eigenfunctions exp (2πinx) set the sequences, or vectors,

1, exp (2πin/m), ..., exp (2πin(m − 1)/m),   n = 0, ..., m − 1.   (1.1.6)

In place of the orthogonality (1.1.3) we have
Σ_{k=0}^{m−1} exp (2πikr/m) exp (−2πiks/m) = mδ_{rs},   r, s = 0, ..., m − 1.   (1.1.7)
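The relation (1.1.7) is immediate to verify numerically; the following lines (ours, not the book's) form the full m-by-m Gram matrix of the vectors (1.1.6):

```python
import numpy as np

m = 8
k = np.arange(m)
# G[r, s] = sum_k exp(2*pi*i*k*r/m) * conj(exp(2*pi*i*k*s/m))
G = np.array([[np.sum(np.exp(2j * np.pi * k * r / m) *
                      np.conj(np.exp(2j * np.pi * k * s / m)))
               for s in range(m)] for r in range(m)])
print(np.allclose(G, m * np.eye(m)))  # True: the vectors are orthogonal
```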
It is not hard to see that the m-vectors (1.1.6) are linearly independent, and that an arbitrary vector

f_0, ..., f_{m−1},   (1.1.8)
where the f_r are any complex constants, admits an expansion similar to (1.1.4). We have, in fact,

f_r = Σ_{s=0}^{m−1} κ_s exp (2πirs/m),   r = 0, ..., m − 1,   (1.1.9)

the coefficients κ_s being found by multiplying (1.1.9) by exp (−2πirs/m), summing over r, and using (1.1.7), so that

κ_s = m⁻¹ Σ_{r=0}^{m−1} f_r exp (−2πirs/m).   (1.1.10)
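As a check on (1.1.9-10) as printed here, the following sketch (our own illustration) computes the coefficients κ_s from an arbitrary vector and reconstructs it; the round trip is exact:

```python
import numpy as np

m = 8
rng = np.random.default_rng(1)
f = rng.normal(size=m) + 1j * rng.normal(size=m)   # arbitrary vector (1.1.8)

# coefficients (1.1.10): kappa_s = m^{-1} sum_r f_r exp(-2*pi*i*r*s/m)
r = np.arange(m)
kappa = np.array([np.mean(f * np.exp(-2j * np.pi * r * s / m)) for s in range(m)])

# expansion (1.1.9): f_r = sum_s kappa_s exp(2*pi*i*r*s/m)
s = np.arange(m)
f_back = np.array([np.sum(kappa * np.exp(2j * np.pi * rr * s / m)) for rr in range(m)])
print(np.allclose(f, f_back))  # True
```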
To complete the analogy, we raise the question of what boundary problem the sequences (1.1.6) are eigenfunctions of. A simple answer to this is given by the boundary problem in which we set up a recurrence relation

y_{n+1} = λy_n,   n = 0, ..., m − 1,   (1.1.11)

and impose the boundary condition

y_m = y_0 ≠ 0.   (1.1.12)
The eigenvalues of this problem are immediately seen to be the roots of λ^m = 1, and the eigenfunctions, in this case discrete sequences or vectors, corresponding to the eigenvalues, are indeed the sequences (1.1.6), apart from constant factors. It will be noted that the eigenvalues, certain roots of unity, lie in this case on the unit circle. To facilitate the analogy with standard
problems for differential equations, we shall transfer the eigenvalues to the real axis, which may be done by a bilinear transformation. If the problem (1.1.11) be replaced by

y_{n+1} = (1 + iλ)(1 − iλ)⁻¹ y_n,   n = 0, ..., m − 1,   (1.1.13)

the boundary condition (1.1.12) being unaffected, the eigenvalues are then λ = tan (πr/m), r = 0, ±1, ..., and are all real. The change from the unit circle to the real axis involves a slight loss of symmetry, and occasional complications in connection with infinite eigenvalues. We shall extend the problem posed in (1.1.12-13) in two ways, generalizing the boundary condition to y_m = y_0 exp (iα), where α is real, and introducing certain constants into the bilinear relation (1.1.13). Even so, we shall only be tackling a small fraction of the relevant class of boundary problems.
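The stated eigenvalues of (1.1.12-13) can be confirmed directly: with λ = tan (πr/m) the bilinear factor is exp (2πir/m), whose mth power is 1. A short check (ours; the value 2r = m, for even m, corresponds to the infinite eigenvalue and is skipped):

```python
import numpy as np

m = 6
for r in range(m):
    if 2 * r == m:                             # lambda = tan(pi/2) is infinite
        continue
    lam = np.tan(np.pi * r / m)
    factor = (1 + 1j * lam) / (1 - 1j * lam)   # bilinear factor in (1.1.13)
    assert np.isclose(factor ** m, 1.0)        # boundary condition y_m = y_0
print("ok")
```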
1.2. The Boundary Problem

This is the following: Constants c_0, ..., c_{m−1}, possibly complex, are given, and also given is a real α, 0 ≤ α < 2π. The c_n satisfy

Rl{c_n} > 0,   n = 0, ..., m − 1.   (1.2.1)
It is required to find λ such that the recurrence formula

y_{n+1} = (1 + iλc_n)(1 − iλc̄_n)⁻¹ y_n,   n = 0, ..., m − 1,   (1.2.2)

has a solution such that

y_m = y_0 exp (iα) ≠ 0.   (1.2.3)
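In computations it is convenient to realize (1.2.2) directly. The sketch below (with illustrative constants c_n of positive real part, chosen by us) generates y_0, ..., y_m and illustrates the behavior described in the next lemma: for real λ the moduli are constant, while for Im λ > 0 they decrease:

```python
import numpy as np

def y_seq(lam, c):
    """Solution of the recurrence (1.2.2) with y_0 = 1; returns y_0, ..., y_m."""
    ys = [1.0 + 0.0j]
    for cn in c:
        ys.append((1 + 1j * lam * cn) / (1 - 1j * lam * np.conj(cn)) * ys[-1])
    return np.array(ys)

c = [1 + 0.3j, 0.8 - 0.2j, 1.2 + 0.1j]          # illustrative, Rl{c_n} > 0

mods_real = np.abs(y_seq(2.7, c))               # real lambda
print(np.allclose(mods_real, 1.0))              # True: |y_n| = 1 throughout

mods_upper = np.abs(y_seq(1 + 0.5j, c))         # Im lambda > 0
print(bool(np.all(np.diff(mods_upper) < 0)))    # True: strictly decreasing
```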
In proving that the spectrum, that is to say the set of eigenvalues, is real, we rely on the fact that the bilinear factors appearing in (1.2.2) are contractive in the upper half-plane, expansive in the lower half-plane, and length-preserving on the real axis. This is expressed in:
Lemma 1.2.1. If Rl{c} > 0, then

| (1 + iλc)(1 − iλc̄)⁻¹ |  < 1,   Im λ > 0,
                           = 1,   Im λ = 0,
                           > 1,   Im λ < 0.   (1.2.4)

In fact, the left-hand side may be written

| λ − i/c | / | λ + i/c̄ |,
and so is the ratio of the distances from λ to the points i/c, −i/c̄, which lie in the upper and lower half-planes, respectively, since Rl{c} > 0. We deduce that if λ is real, then the solution of the recurrence relation (1.2.2) satisfies

|y_0| = |y_1| = ... = |y_m|,

while if λ is complex and y_0 ≠ 0, we have

|y_0| > |y_1| > ... > |y_m|   if Im λ > 0,

and

|y_0| < |y_1| < ... < |y_m|   if Im λ < 0.

Neither of the last two is reconcilable with the boundary condition (1.2.3), which implies that |y_0| = |y_m|, and so we have
Theorem 1.2.2. The boundary problem (1.2.2-3) has only real eigenvalues.

We shall admit ∞ as an eigenvalue when appropriate. Putting λ = ∞ in (1.2.2), we obtain

y_{n+1} = −(c_n/c̄_n)y_n,   (1.2.5)

and so (1.2.3) will be formally satisfied if

Π_{n=0}^{m−1} (−c_n/c̄_n) = exp (iα).   (1.2.6)
Here we do not distinguish between −∞ and +∞; the special character of ∞ may be brought down to size by transferring the eigenvalues to the unit circle from the real axis by a bilinear transformation. For the sake of the analogy with differential equations, we cite another property of the recurrence formula (1.2.2). Let there be a second solution, so that

z_{n+1} = (1 + iλc_n)(1 − iλc̄_n)⁻¹ z_n,   n = 0, ..., m − 1.   (1.2.7)

If then λ is real, we have

z̄_0y_0 = z̄_1y_1 = ... = z̄_my_m;   (1.2.8)

this corresponds in certain circumstances to the constancy of the Wronskian for solutions of second-order differential equations.
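The invariant (1.2.8) can be watched directly. In the sketch below (our own illustration, with made-up constants) z is a second solution with a different starting value, and z̄_n y_n is constant along the recurrence for real λ:

```python
import numpy as np

c = [1 + 0.3j, 0.8 - 0.2j, 1.2 + 0.1j]   # illustrative constants, Rl{c_n} > 0
lam = 1.7                                # a real value of lambda
y, z = 1.0 + 0.0j, 2.0 - 1.0j            # two solutions of (1.2.2)
vals = [np.conj(z) * y]
for cn in c:
    fac = (1 + 1j * lam * cn) / (1 - 1j * lam * np.conj(cn))
    y, z = fac * y, fac * z
    vals.append(np.conj(z) * y)          # the quantity in (1.2.8)
print(np.allclose(vals, vals[0]))        # True: constant along the recurrence
```

The reason is visible in the code: for real λ each factor has modulus 1, so conj(fac)·fac = 1 and the product z̄y is unchanged at every step.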
1.3. Oscillation Properties

In the real domain, “oscillation” commonly refers to the situation in which some function is repeatedly changing its sign. Here the appropriate concept is that certain functions in the complex domain are of absolute value unity, and are describing the unit circle with various velocities. As the standard circular representation of simple harmonic motion shows, oscillations of the real type can be transformed into those of the complex type. For definiteness we shall take y_0 = 1 in (1.2.2), so that

y_n = y_n(λ) = Π_{r=0}^{n−1} {(1 + iλc_r)(1 − iλc̄_r)⁻¹}.   (1.3.1)
We may term y_n(λ) the fundamental solution of the recurrence relation, in analogy with a standard terminology for systems of linear differential equations. In conformity with Lemma 1.2.1, we note the:
Lemma 1.3.1. If Rl{c} > 0, then as λ describes the real axis from −∞ to +∞, (1 + iλc)/(1 − iλc̄) describes the unit circle once in the positive sense and strictly monotonically.

The proof follows from the fact that as λ increases, arg (1 + iλc) increases and arg (1 − iλc̄) decreases. The connection with Lemma 1.2.1 is that (1 + iλc)/(1 − iλc̄) maps the upper half-plane into the interior of the unit circle, so that their boundaries are mapped in corresponding senses. Supposing now that λ goes from −∞ to +∞ along the real axis, we see that the factors (1 + iλc_r)/(1 − iλc̄_r) each make a positive and strictly monotonic circuit of the unit circle, so that y_n(λ) as given by (1.3.1) makes exactly n positive and strictly monotonic circuits of the unit circle. We can deduce the simplicity, in a certain sense, of the spectrum.
Theorem 1.3.2. The boundary problem (1.2.2-3) has precisely m eigenvalues, all of which are real, and of which at most one may be infinite.

The eigenvalues, the λ-values for which (1.2.2-3) is soluble, are the roots of

y_m(λ) = exp (iα).   (1.3.2)

Cleared of fractions, this becomes an algebraic equation of degree at most m, so that there cannot be more than m finite roots. On the other hand, as λ describes the real axis, y_m(λ) describes the unit circle m times,
and so there are exactly m roots, one of which may be infinite, and all of which are distinct. From algebraic considerations it may be shown that none of these is a multiple root of (1.3.2). As stated before, we do not distinguish between −∞ and +∞ as eigenvalues.
when A is real; the arguments are all to be taken as 0 when A = 0, and calculated thence by continuous’ variation. T h e defining property of the eigenvalues is then (1.3.4) w,(A) = a (mod 2v), and they may be classified precisely according to w,(A,.)
=a
+ 2rv.
(1.3.5)
Here r will run through some sequence A of consecutive integers; in the event that 03 is an eigenvalue, we may for definiteness take as representative, and incorporate the corresponding r-value in A . T h e eigenvalues thus defined are, of course, dependent on a and on m,
+-=
λ_r = λ_r(m, α),   (1.3.6)

and an important group of topics centers on the dependence of λ_r on m and α, and more generally on the dependence on m and α of the spectral function, to be defined shortly. The following results are immediate consequences of the fact that as λ describes the real axis, (1 + iλc_r)/(1 − iλc̄_r) describes the unit circle in a certain sense, the point 0 corresponding to the point 1 on the unit circle.
Theorem 1.3.3. For fixed m, α, λ_r is an increasing function of r.

Theorem 1.3.4. For fixed m, r, λ_r is an increasing function of α.

Theorem 1.3.5. For fixed r, α, as m increases the λ_r approach zero.

The above results correspond to known results for Sturm-Liouville equations; here increasing m corresponds to increasing the size of the basic interval in which the differential equation is defined. A further analogy with Sturm-Liouville theory is given by the following “separation theorem.” This is:
Theorem 1.3.6. Let 0 ≤ α < β < 2π. Then between two consecutive eigenvalues λ_r(m, α), λ_{r+1}(m, α) there lies an eigenvalue λ_r(m, β) of the boundary problem (1.2.2-3) with β replacing α.
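Theorems 1.3.4 and 1.3.6 lend themselves to numerical illustration. The sketch below (our own; the constants c_n are illustrative) computes eigenvalues by bisection on the angular parameter of (1.3.3), using the fact that it is continuous and strictly increasing in λ; the use of np.angle is harmless here, since for real λ each factor 1 + iλc_n stays off the negative real axis:

```python
import numpy as np

def omega(lam, c):
    # angular parameter (1.3.3), evaluated for real lam
    return float(sum(np.angle(1 + 1j * lam * cn) - np.angle(1 - 1j * lam * np.conj(cn))
                     for cn in c))

def eigenvalues(c, alpha, L=1e8):
    # solve omega(lam) = alpha + 2*pi*r for each admissible integer r (1.3.5)
    lo_v, hi_v = omega(-L, c), omega(L, c)
    lams, r = [], int(np.ceil((lo_v - alpha) / (2 * np.pi)))
    while alpha + 2 * np.pi * r < hi_v:
        target, a, b = alpha + 2 * np.pi * r, -L, L
        for _ in range(200):                   # bisection: omega is increasing
            mid = 0.5 * (a + b)
            if omega(mid, c) < target:
                a = mid
            else:
                b = mid
        lams.append(0.5 * (a + b))
        r += 1
    return np.array(lams)

c = [1 + 0.3j, 0.8 - 0.2j, 1.2 + 0.1j]         # illustrative constants
la = eigenvalues(c, 0.7)                        # eigenvalues for alpha
lb = eigenvalues(c, 2.0)                        # eigenvalues for beta > alpha
print(np.all(la < lb), np.all(lb[:-1] < la[1:]))  # True True
```

The two assertions printed are exactly the monotonicity in α and the separation property, for aligned values of the index r.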
1.4. Eigenfunctions and Orthogonality

We may use the term “eigenfunction” for the sequence

y_0(λ), ..., y_m(λ),   where λ = λ_r,   r ∈ Λ,

that is to say, for the nontrivial solutions of the recurrence relation (1.2.2) and boundary condition (1.2.3). The orthogonality applies, however, to slightly modified sequences; we have here a departure from the analogy with differential equations. To obtain these relations we set up two relations which are analogs of Green's theorem, or of the Lagrange identity for differential equations, or again of the Christoffel-Darboux identity for orthogonal polynomials. The first of these is:
Theorem 1.4.1. Let λ, μ be real. Then

ȳ_m(μ)y_m(λ) − 1 = i(λ − μ) Σ_{n=0}^{m−1} a_n (1 − iλc̄_n)⁻¹(1 + iμc_n)⁻¹ y_n(λ)ȳ_n(μ),   (1.4.1)

where

a_n = c_n + c̄_n.   (1.4.2)

The proof is by induction. By (1.2.2) we have

ȳ_{n+1}(μ)y_{n+1}(λ) − ȳ_n(μ)y_n(λ) = i(λ − μ) a_n (1 − iλc̄_n)⁻¹(1 + iμc_n)⁻¹ y_n(λ)ȳ_n(μ).

We apply this with n = 0, 1, ..., m − 1 and sum. Recalling that y_0(λ) = y_0(μ) = 1, we derive (1.4.1). A second result of this character, with a number of applications, is:
Theorem 1.4.2. Let λ, possibly complex, not be among the poles of y_n(λ). Then

|y_m(λ)|² − 1 = −2 Im λ Σ_{n=0}^{m−1} a_n |(1 − iλc̄_n)⁻¹y_n(λ)|²,   (1.4.3)

where a_n is given by (1.4.2).
The proof again is by induction. We have

|y_{n+1}(λ)|² − |y_n(λ)|² = −2 Im λ · a_n |(1 − iλc̄_n)⁻¹y_n(λ)|²,

and the result again follows on summing over n = 0, ..., m − 1. The orthogonality relations come from the observation that if λ, μ are both eigenvalues, then the left of (1.4.1) vanishes. We have, in fact, then

y_m(λ) = y_m(μ) = e^{iα},

whence ȳ_m(μ) = e^{−iα} and ȳ_m(μ)y_m(λ) = 1. Taking it that λ, μ are unequal eigenvalues, and turning to the right of (1.4.1), we deduce that

Σ_{n=0}^{m−1} a_n (1 − iλc̄_n)⁻¹(1 + iμc_n)⁻¹ y_n(λ)ȳ_n(μ) = 0.   (1.4.4)

To put this in compact form we define
η_n(λ) = (1 − iλc̄_n)⁻¹y_n(λ)   (1.4.5)

       = Π_{r=0}^{n−1} (1 + iλc_r) · Π_{r=0}^{n} (1 − iλc̄_r)⁻¹.   (1.4.6)

The orthogonality result for the (modified) eigenfunctions is then:
Theorem 1.4.3. Let λ_r, λ_s be two distinct and finite eigenvalues of the boundary problem (1.2.2-3). Then

Σ_{n=0}^{m−1} a_n η_n(λ_r)η̄_n(λ_s) = 0.   (1.4.7)

This is (1.4.4) with λ = λ_r, μ = λ_s. We continue the development for the case that the eigenvalues are all finite; modifications for the contrary case will be given at the end of this section. First we complete (1.4.7) with normalization relations for the case r = s. We write

ρ_r = Σ_{n=0}^{m−1} a_n |η_n(λ_r)|².   (1.4.8)
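The orthogonality (1.4.7) can be verified to machine accuracy. The sketch below is ours (illustrative constants c_n, eigenvalues obtained by bisection on the angular parameter (1.3.3)):

```python
import numpy as np

def omega(lam, c):
    return float(sum(np.angle(1 + 1j * lam * cn) - np.angle(1 - 1j * lam * np.conj(cn))
                     for cn in c))

def eigenvalues(c, alpha, L=1e8):
    lo_v, hi_v = omega(-L, c), omega(L, c)
    lams, r = [], int(np.ceil((lo_v - alpha) / (2 * np.pi)))
    while alpha + 2 * np.pi * r < hi_v:
        target, a, b = alpha + 2 * np.pi * r, -L, L
        for _ in range(200):
            mid = 0.5 * (a + b)
            if omega(mid, c) < target:
                a = mid
            else:
                b = mid
        lams.append(0.5 * (a + b))
        r += 1
    return np.array(lams)

def etas(lam, c):
    """Modified eigenfunction values eta_n(lam) of (1.4.5)."""
    y, out = 1.0 + 0.0j, []
    for cn in c:
        out.append(y / (1 - 1j * lam * np.conj(cn)))
        y *= (1 + 1j * lam * cn) / (1 - 1j * lam * np.conj(cn))
    return np.array(out)

c = [1 + 0.3j, 0.8 - 0.2j, 1.2 + 0.1j]
a = np.array([2 * cn.real for cn in c])        # a_n = c_n + conj(c_n)
lams = eigenvalues(c, 1.0)
H = np.array([etas(l, c) for l in lams])       # H[r, n] = eta_n(lambda_r)
G = (H * a) @ H.conj().T                       # G[r, s], the sums in (1.4.7)
print(np.allclose(G, np.diag(np.diag(G))))     # True: off-diagonal terms vanish
```

The diagonal entries of G are precisely the normalization constants ρ_r of (1.4.8).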
Theorem 1.4.4. We have also

ρ_r = ω_m′(λ_r) = −iy_m′(λ_r)/y_m(λ_r).   (1.4.9)
Before proving this we comment that the middle expression in (1.4.9) indicates a connection between the normalization constants, which we later interpret as weights attached to points of the spectrum, on the one hand, and oscillatory properties, on the other. This connection appears also in differential equations and other branches of the theory. To verify (1.4.9) we note that when λ is real,

|η_n(λ)| = |1 − iλc̄_n|⁻¹,   (1.4.10)

so that (1.4.8) is equivalent to

ρ_r = Σ_{n=0}^{m−1} a_n |1 − iλ_rc̄_n|⁻².   (1.4.11)

Since a_n = c_n + c̄_n, this may also be written

ρ_r = −iy_m′(λ_r)/y_m(λ_r),

by logarithmic differentiation of (1.3.1). The remaining expression in (1.4.9) then follows, since ω_m(λ) = arg y_m(λ) by the definition (1.3.3). From the orthogonality relations (1.4.7-8) we can proceed at once (see Appendix III) to the dual orthogonality relations.
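A quick numerical check of (1.4.9) as printed here (our sketch, with illustrative c_n): the three quantities, the weighted sum (1.4.11), the derivative ω_m′, and −iy_m′/y_m, in fact agree at every real λ, not only at the eigenvalues, so we may compare them at an arbitrary real point using numerical derivatives:

```python
import numpy as np

c = [1 + 0.3j, 0.8 - 0.2j, 1.2 + 0.1j]     # illustrative constants
a = np.array([2 * cn.real for cn in c])     # a_n = c_n + conj(c_n)

def ym(lam):
    return np.prod([(1 + 1j * lam * cn) / (1 - 1j * lam * np.conj(cn)) for cn in c])

def omega(lam):
    return float(sum(np.angle(1 + 1j * lam * cn) - np.angle(1 - 1j * lam * np.conj(cn))
                     for cn in c))

lam, h = 1.37, 1e-6                         # arbitrary real point, step for derivatives
rho = float(np.sum(a / np.abs(1 - 1j * lam * np.conj(np.array(c))) ** 2))  # (1.4.11)
d_omega = (omega(lam + h) - omega(lam - h)) / (2 * h)                      # omega'
log_deriv = -1j * (ym(lam + h) - ym(lam - h)) / (2 * h) / ym(lam)          # -i y'/y
print(abs(rho - d_omega) < 1e-6, abs(log_deriv - rho) < 1e-5)  # True True
```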
Theorem 1.4.5. If the boundary problem (1.2.2-3) has only finite eigenvalues, then for j, k = 0, ..., m − 1,

Σ_r ρ_r⁻¹ η_j(λ_r)η̄_k(λ_r) = a_j⁻¹δ_{jk},   (1.4.12)

where δ_{jk} denotes the Kronecker δ, and the sum on the left extends over all eigenvalues.

Corresponding to our two sets of orthogonality relations, there will be two expansion theorems. Each of these will assert that an arbitrary m-vector can be expressed as a linear combination of m known orthogonal vectors. Although such a statement does not lie very deep, results of this type provide one avenue to expansion theorems in infinite-dimensional cases by way of a limiting process. The first result, the “eigenfunction expansion” in this case, is:

Theorem 1.4.6. Let the eigenvalues of (1.2.2-3) be all finite. For arbitrary complex u_0, ..., u_{m−1}, define

v_r = Σ_{n=0}^{m−1} a_n u_n η̄_n(λ_r).   (1.4.13)

Then we have the expansion theorem

u_n = Σ_r ρ_r⁻¹ v_r η_n(λ_r),   (1.4.14)

and the Parseval equation

Σ_{n=0}^{m−1} a_n |u_n|² = Σ_r ρ_r⁻¹ |v_r|².   (1.4.15)

The deduction of this from (1.4.7-8) follows standard lines, and is given in Appendix III. Corresponding to (1.4.12), we have the dual expansion theorem, likewise purely elementary.

Theorem 1.4.7. Let the problem (1.2.2-3) have only finite eigenvalues. For any complex v_0, ..., v_{m−1}, define

u_n = Σ_r ρ_r⁻¹ v_r η_n(λ_r).   (1.4.16)
Then there holds the expansion (1.4.13), and the Parseval equality (1.4.15).

It is a simple matter to deduce one of these expansion theorems from the other; the situation naturally becomes much more difficult in infinite-dimensional cases, even when both expansions are discrete. Finally we indicate the modifications to be made when an eigenvalue is infinite, that is to say, when (1.2.6) holds. There will now be only m − 1 finite eigenvalues, and, corresponding to them, m − 1 eigenfunctions of the form η_0(λ_r), ..., η_{m−1}(λ_r), which will be orthogonal according to (1.4.7-8). We supplement them with a vector

(1.4.17)

where

(1.4.18)
This vector is orthogonal to the other m − 1, in that

(1.4.19)

where λ_r is any of the finite eigenvalues. This may be deduced from (1.4.1) by taking μ = λ_r and making λ → ∞. As a corresponding normalization constant, we define β_{m,α} by

(1.4.20)

(1.4.21)

The last expression may be found by expressing y_m′(λ)/y_m(λ) in partial fractions by logarithmic differentiation. The dual orthogonality relations to (1.4.7-8), when supplemented by (1.4.19-20) for the case of an infinite eigenvalue, will be quoted without detailed proof. They are

(1.4.23)

There will also be modified versions of the expansion theorems.
1.5. The Spectral Function

This will, in the first place, provide merely a means of rewriting sums over the eigenvalues as Stieltjes integrals. Such a step will be essential if we are to carry out limiting processes in their full generality. Over and above this, the spectral function will be a central concept in further developments. We define this function on the real axis, as having jumps of amount 1/ρ_r at the eigenvalues λ_r, and as being constant in between the eigenvalues. We set

τ_{m,α}(λ) = Σ_{0<λ_r≤λ} 1/ρ_r,   if λ ≥ 0,   (1.5.1)

and

τ_{m,α}(λ) = −Σ_{λ<λ_r≤0} 1/ρ_r,   if λ < 0.   (1.5.2)

Thus the spectral function τ_{m,α}(λ) is a real-valued step function, which is nondecreasing as a function of λ, and which is right-continuous, in that lim τ_{m,α}(λ′) = τ_{m,α}(λ) as λ′ → λ from above. The spectral function unifies the eigenvalues λ_r and the normalization constants ρ_r into a single concept. Expressing in terms of the spectral function certain of the results of Section 1.4, we take first the dual orthogonality relations (1.4.12), generalized in (1.4.23) for the case when there is an eigenvalue which is infinite. The general case is

(1.5.3)
Here the term in β_{m,α} is to be omitted if all the eigenvalues are finite. A particularly important case is when j = k, when (1.5.3) becomes

(1.5.4)

The interest of this is that it establishes a bound on τ_{m,α}(λ) which is independent of m and α. To rewrite the eigenfunction expansion in these terms, we start as before with an arbitrary sequence u_0, ..., u_{m−1}, and in place of (1.4.13) define, for all real λ,

v(λ) = Σ_{n=0}^{m−1} a_n u_n η̄_n(λ).   (1.5.5)

Then the expansion (1.4.14) takes the form

u_n = ∫ v(λ) η_n(λ) dτ_{m,α}(λ),   (1.5.6)

the Parseval equality being

Σ_{n=0}^{m−1} a_n |u_n|² = ∫ |v(λ)|² dτ_{m,α}(λ),   (1.5.7)

where we have excluded for simplicity the case of infinite eigenvalues.
In the dual expansion theorem, we start with an arbitrary function v(λ) and define u_n by (1.5.6); here the measure dτ_{m,α}(λ) only takes account of the values of v(λ) at the eigenvalues λ_r, sets not including any of the λ_r having measure zero. The expansion theorem (1.5.5) is then true “almost everywhere,” that is to say, when λ is an eigenvalue.
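Since the measure dτ_{m,α} is a finite sum of point masses 1/ρ_r, the integrals above are just weighted sums over the eigenvalues, and the whole expansion apparatus, in the forms (1.4.13-15) as printed here, can be exercised numerically. A self-contained sketch of ours (illustrative constants; all eigenvalues finite for this data):

```python
import numpy as np

def omega(lam, c):
    return float(sum(np.angle(1 + 1j * lam * cn) - np.angle(1 - 1j * lam * np.conj(cn))
                     for cn in c))

def eigenvalues(c, alpha, L=1e8):
    lo_v, hi_v = omega(-L, c), omega(L, c)
    lams, r = [], int(np.ceil((lo_v - alpha) / (2 * np.pi)))
    while alpha + 2 * np.pi * r < hi_v:
        target, a, b = alpha + 2 * np.pi * r, -L, L
        for _ in range(200):
            mid = 0.5 * (a + b)
            if omega(mid, c) < target:
                a = mid
            else:
                b = mid
        lams.append(0.5 * (a + b))
        r += 1
    return np.array(lams)

def etas(lam, c):
    y, out = 1.0 + 0.0j, []
    for cn in c:
        out.append(y / (1 - 1j * lam * np.conj(cn)))
        y *= (1 + 1j * lam * cn) / (1 - 1j * lam * np.conj(cn))
    return np.array(out)

c, alpha = [1 + 0.3j, 0.8 - 0.2j, 1.2 + 0.1j], 1.0
a = np.array([2 * cn.real for cn in c])
lams = eigenvalues(c, alpha)
assert len(lams) == len(c)                     # all eigenvalues finite here
H = np.array([etas(l, c) for l in lams])       # H[r, n] = eta_n(lambda_r)
rho = np.array([float(np.sum(a * np.abs(H[r]) ** 2)) for r in range(len(c))])

rng = np.random.default_rng(2)
u = rng.normal(size=len(c)) + 1j * rng.normal(size=len(c))
v = (H.conj() * a) @ u                         # v_r, the transform (1.4.13)
parseval = np.isclose(np.sum(a * np.abs(u) ** 2), np.sum(np.abs(v) ** 2 / rho))
u_back = (v / rho) @ H                         # the expansion (1.4.14)
print(bool(parseval), np.allclose(u, u_back))  # True True
```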
1.6. The Characteristic Function

Like the spectral function, with which it is closely connected, this function will play an important part in all our boundary problems. In the present case we may describe it as a rational function, whose poles are at the eigenvalues, the residues thereat being the jumps of the spectral function, that is to say, the reciprocals of the normalization constants ρ_r. If we specify in addition that it is to be finite at infinity, when the latter is not an eigenvalue, this will fix the function except for an additive constant. We shall actually define this function in terms of y_m(λ). The latter, we recall, is the terminal value for n = m of the fundamental solution y_n(λ); since y_m(λ) maps the initial values of solutions of (1.2.2) into their final values, we may term it the transfer function. In terms of this function we define the characteristic function of our boundary problem (1.2.2-3) as

f_{m,α}(λ) = (2i)⁻¹{exp (iα) + y_m(λ)}{exp (iα) − y_m(λ)}⁻¹.   (1.6.1)
The main properties of the function are given by:

Theorem 1.6.1. Let the boundary problem (1.2.2-3) have only finite eigenvalues. Then: (i) there is a real constant γ_{m,α} such that

f_{m,α}(λ) = γ_{m,α} + ∫ dτ_{m,α}(μ) (λ − μ)⁻¹,   (1.6.2)

(ii) if α is not a multiple of 2π,

γ_{m,α} = −½ cot ½α + ∫ μ⁻¹ dτ_{m,α}(μ),   (1.6.3)

(iii) f_{m,α}(λ) is real on the real λ-axis, except for poles, and maps the upper and lower λ-half-planes into each other, (iv) at the poles and zeros of y_m(λ) we have

f_{m,α}(i/c_p) = −½i,   f_{m,α}(−i/c̄_p) = +½i,   p = 0, ..., m − 1.   (1.6.4)
Of these, (1.6.2) is effectively the resolution of f_{m,α}(λ) as given by (1.6.1) into partial fractions. It follows from (1.6.1) that f_{m,α}(λ) is a rational function which is finite at λ = ∞, if the latter is not an eigenvalue. Furthermore, its poles are the zeros of exp (iα) − y_m(λ), and these are precisely the eigenvalues. If we use the fact that at an eigenvalue we have y_m(λ_r) = exp (iα), the standard partial fraction formula gives

f_{m,α}(λ) = γ_{m,α} + (2i)⁻¹ Σ_r 2y_m(λ_r){−y_m′(λ_r)}⁻¹(λ − λ_r)⁻¹,   (1.6.5)

where γ_{m,α} is some constant. Using (1.4.9), this may be written

f_{m,α}(λ) = γ_{m,α} + Σ_r ρ_r⁻¹(λ − λ_r)⁻¹,   (1.6.6)

which is equivalent to (1.6.2). That γ_{m,α} is real may be seen in various ways. When λ is real we have |y_m(λ)| = 1, by (1.2.4), whence it follows from (1.6.1) that f_{m,α}(λ) is real when λ is real, apart from poles. Now the integral in (1.6.2), or the sum in (1.6.6), is certainly real when λ is real, and so γ_{m,α} must be real also. The statement will also follow from (1.6.4). For (1.6.4) shows that f_{m,α}(λ) takes complex conjugate values at the pair of points i/c_0, −i/c̄_0, which are, of course, complex conjugates. Since the integral in (1.6.2) also takes complex conjugate values, it follows that γ_{m,α} must be real. For the proof of (1.6.3), we use the fact that if α is not a multiple of 2π, then λ = 0 is not an eigenvalue; we have, of course, y_m(0) = 1. Putting λ = 0 in (1.6.1), we then get f_{m,α}(0) = −½ cot ½α, and so (1.6.3) results from putting λ = 0 in (1.6.2). Part (iii) of the theorem results immediately from (1.6.2), while part (iv), (1.6.4), comes from substituting in (1.6.1). In fact, y_m(i/c_p) = 0, since y_m(λ) contains the factor (1 + iλc_p), and similarly y_m(−i/c̄_p) is infinite. We may express (1.6.2) by saying that the characteristic function is, apart from an additive constant, effectively the Stieltjes transform of the spectral function. The result (1.6.3) has the interesting feature that it fails to determine γ_{m,α} when α is a multiple of 2π; we return to this point in the next section. The set of functions described in part (iii) of the theorem will be of basic importance throughout our work, together with their matrix generalizations; to a partial extent (i) may be viewed as a consequence of (iii). The property (iv), (1.6.4), expresses an interpolation property of the characteristic function, which can be used to discuss the convergence of sequences of such functions.
The modification of Theorem 1.6.1 for the event that one of the eigenvalues is infinite is that (1.6.2) is to be replaced by

f_{m,α}(λ) = −β_{m,α}λ + γ_{m,α} + ∫ dτ_{m,α}(μ)(λ − μ)⁻¹,   (1.6.7)

where β_{m,α} is given by (1.4.20-22). In expressing f_{m,α}(λ) in partial fractions along the lines of (1.6.5), we must now allow for the principal part of f_{m,α}(λ) at ∞, since it is no longer bounded there. Since now y_m(λ) → exp (iα) as λ → ∞, we may write

y_m(λ) = exp (iα) + ψ_1λ⁻¹ + ...

for large λ, where ψ_1 is some constant. This gives the value of ψ_1 in terms of the c_n, and on comparison with (1.4.22) we have

ψ_1 = −i exp (iα)/β_{m,α}.

Hence, for large λ, we have

y_m(λ) = exp (iα){1 − i/(β_{m,α}λ) + ...},

and on substituting in (1.6.1) we get

f_{m,α}(λ) ≈ −β_{m,α}λ

as λ → ∞, which gives the extra term in (1.6.7). The remaining assertions of the theorem need no alteration.
1.7. The First Inverse Problem In the foregoing discussion we set up a boundary problem, and deduced from it eigenfunctions and eigenvalues, orthogonality relations, a spectral function and a characteristic function. We shall now reverse the course of argument, and consider whether the boundary problem can be reconstructed given some of the derivative entities. The most obvious question might be whether the boundary problem can be found when the eigenvalues are given. This may be rejected on very simple grounds. In specifying the eigenvalues of (1.2.2-3), we are being given m real quantities. T o specify the boundary problem, on the other hand, we need the m complex constants co , ..., cmPl and the one real number a appearing in the boundary condition, making a total of 2m 1 real quantities to be specified, which are clearly not
+
40
1.
BOUNDARY PROBLEMS FOR RATIONAL FUNCTIONS
fixed by the m eigenvalues. We get a problem which is prima facie soluble if we are given the boundary condition, that is to say, the number a, and not only the eigenvalues A,, , but also the normalization constants p, , and seek to find the c, , that is to say, the recurrence relation. This we shall term the first inverse spectral problem, and may be posed briefly as that of finding the recurrence relation, given the spectral function and the boundary condition. We show that this problem is, in fact, uniquely soluble, with one exception; this is the case a = 0, that of the periodic boundary condition yo = y l l l .In this latter case A = 0 is necessarily one of the eigenvalues, and so only the other m - 1 can be prescribed at will. We are in this case left with a deficiency of 1 in the data, and there is a one-parameter family of solutions. Confining attention to finite eigenvalues, we take first the nonexceptional case.
Theorem 1.7.1. Let 0 < α < 2π, let λ_0, ..., λ_{m−1} be real and distinct, and let ρ_0, ..., ρ_{m−1} be positive. Then there exists a set of complex constants c_0, ..., c_{m−1} with positive real part, unique except as to order, such that the boundary problem (1.2.2-3) admits the λ_r as its eigenvalues and the ρ_r as the corresponding normalization constants, according to (1.4.8), where a_n is given by (1.4.2).

In numbering the λ_r from r = 0 to r = m − 1, we have proceeded arbitrarily, and not according to the oscillatory characterization (1.3.5). The solution in brief is that the λ_r, ρ_r determine the spectral function τ_{m,α}(λ), whence we determine in turn the characteristic function f_{m,α}(λ), the transfer function y_m(λ), and finally the recurrence relation by factorization of the latter. In fact, having found τ_{m,α}(λ) according to (1.5.1-2) from the prescribed λ_r, ρ_r, we get γ_{m,α} from (1.6.3), and so f_{m,α}(λ) from (1.6.2). Then y_m(λ) is given by solving (1.6.1), in fact, by

    y_m(λ) = exp(iα) {2i f_{m,α}(λ) − 1} {2i f_{m,α}(λ) + 1}^{−1}.    (1.7.1)

We need to show that the y_m(λ) thus constructed admits a factorization of the form (1.3.1), in which the c_n have positive real part. In the first place, f_{m,α}(λ), as given by (1.6.2), that is, (1.6.6), together with (1.6.3), is a rational function, expressible as the ratio of two polynomials with real coefficients, the denominator being of degree m and the numerator of degree at most m. By (1.7.1) it follows that y_m(λ) is also a rational function, the ratio of polynomials of degree exactly m. It will therefore have m zeros and m poles. We need to show that the zeros lie
in the upper half-plane, at the points i/c_p, and the poles in the lower half-plane, at the points −i/c̄_p, where the c_p have positive real part. Also to be shown is that the zeros and poles lie at complex conjugate points, and that y_m(0) = 1. From (1.6.6) it follows that f_{m,α}(λ) has positive imaginary part when Im λ < 0, the ρ_r being all positive. Thus when Im λ < 0, 2i f_{m,α}(λ) − 1 has negative real part, and so does not vanish. On reference to (1.7.1), we see that y_m(λ) does not vanish when Im λ < 0. It cannot have any zeros on the real axis either; to see this we observe that f_{m,α}(λ) is real on the real λ-axis, possibly infinite, so that by (1.7.1) |y_m(λ)| = 1 when λ is real. We conclude that the zeros of y_m(λ) must lie in the upper λ-half-plane; they must be the points at which f_{m,α}(λ) = 1/(2i), and we take them to be the points i/c_p, where we must have Re{c_p} > 0, since otherwise the points i/c_p would not lie in the upper half-plane. From (1.6.6) we see that f_{m,α}(λ) takes complex conjugate values at complex conjugate λ-values, so that f_{m,α}(λ) = −1/(2i) when λ = −i/c̄_p. These latter points must be poles of y_m(λ), by (1.7.1). Thus

    y_m(λ) = c ∏_{p=0}^{m−1} (1 + iλc_p)(1 − iλc̄_p)^{−1},
where c is a constant. To show that c = 1, in accordance with (1.3.1), we verify that y_m(0) = 1. In fact, it follows from (1.6.3) that f_{m,α}(0) = −½ cot ½α, and substituting this in (1.7.1) we do indeed get y_m(0) = 1, as asserted. It remains to consider whether, if we set up the boundary problem (1.2.2-3) with the c_0, ..., c_{m−1} thus found, and find eigenvalues and normalization constants as in Section 1.4, we obtain those originally prescribed. We must arrive at the same characteristic function f_{m,α}(λ), since these are connected by the one-to-one relationship given in (1.6.1) or (1.7.1). Since the eigenvalues and normalization constants are uniquely fixed when the characteristic function is known, these must be the same as those prescribed. We pass to the exceptional indeterminate case.
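The reconstruction just described can be carried out numerically. The following sketch (Python with NumPy; the data λ_r, ρ_r, α are hypothetical) fixes γ from the value f_{m,α}(0) = −½ cot ½α noted above, finds the zeros of 2i f − 1 by clearing the fraction and root-finding, and checks the round trip.

```python
import numpy as np

# Hypothetical spectral data for the first inverse problem.
alpha = 1.0
lam = np.array([-2.0, -0.5, 1.0, 3.0])   # prescribed eigenvalues: real, distinct
rho = np.array([0.7, 1.2, 0.4, 0.9])     # prescribed normalization constants
m = len(lam)

# gamma chosen so that f(0) = -(1/2) cot(alpha/2), with
# f(z) = gamma + sum_r rho_r / (z - lam_r).
gamma = -0.5 / np.tan(alpha / 2) + np.sum(rho / lam)

# (2i f(z) - 1) * prod_r (z - lam_r), as a polynomial in z.
P = np.poly(lam)
num = (2j * gamma - 1) * P
for r in range(m):
    num = np.polyadd(num, 2j * rho[r] * np.poly(np.delete(lam, r)))

c = 1j / np.roots(num)        # the zeros lie at i/c_p, hence c_p = i / zero
assert np.all(c.real > 0)     # recurrence constants have positive real part

# Round trip: the transfer function built from the recovered c_p takes the
# value exp(i alpha) at each prescribed eigenvalue.
y_m = lambda z: np.prod((1 + 1j * z * c) / (1 - 1j * z * np.conj(c)))
assert all(abs(y_m(l) - np.exp(1j * alpha)) < 1e-8 for l in lam)
```

The factorization of y_m(λ) is thus obtained without forming y_m explicitly: its zeros are exactly the roots of the polynomial obtained by clearing 2i f(λ) = 1 of fractions.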
Theorem 1.7.2. In the notation of Theorem 1.7.1, let the λ_r be distinct real numbers of which exactly one is zero, and let the ρ_r be positive. Then there is a one-parameter family of sets of values of c_0, ..., c_{m−1} such that the boundary problem (1.2.2-3) with α = 0 has the λ_r as its eigenvalues and the ρ_r as its normalization constants.

In this case we define f_{m,α}(λ) by (1.6.2) or (1.6.6), leaving γ_{m,α} as an indeterminate. As previously, y_m(λ) has m zeros and m poles, lying at
complex conjugate pairs in the upper and lower half-planes, respectively. The only difference is that we prove that y_m(0) = 1 by making λ → 0; as λ → 0 in (1.6.6) we have f_{m,α}(λ) → ∞, and inserting this in (1.7.1) we see that y_m(λ) → 1, so that y_m(0) = 1. Since γ_{m,α} may now have any real value, there is, as stated, a one-parameter set of solutions to our problem.
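The one-parameter family can be exhibited numerically, as a sketch under hypothetical data: with α = 0 and one eigenvalue at zero, every choice of the real constant γ yields an admissible set of c_n.

```python
import numpy as np

# Hypothetical data for the exceptional periodic case of Theorem 1.7.2.
lam = np.array([-1.5, 0.0, 2.0])    # exactly one eigenvalue is zero
rho = np.array([0.6, 1.0, 0.8])
m = len(lam)

def recover_c(gamma):
    # zeros of 2i f - 1, with f = gamma + sum_r rho_r/(z - lam_r), lie at
    # the points i/c_p, so c_p = i / zero
    P = np.poly(lam)
    num = (2j * gamma - 1) * P
    for r in range(m):
        num = np.polyadd(num, 2j * rho[r] * np.poly(np.delete(lam, r)))
    return 1j / np.roots(num)

for gamma in (-1.0, 0.0, 2.5):      # sample members of the family
    c = recover_c(gamma)
    assert np.all(c.real > 0)
    # periodic boundary condition: the transfer function equals 1 at every
    # prescribed eigenvalue (alpha = 0)
    ym = lambda z: np.prod((1 + 1j * z * c) / (1 - 1j * z * np.conj(c)))
    assert all(abs(ym(l) - 1) < 1e-8 for l in lam)
```

Distinct values of γ give distinct sets of c_n, all solving the same data.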
1.8. The Second Inverse Problem

We have mentioned that the boundary problem is not determined by a knowledge of its eigenvalues; the eigenvalues give roughly half the information necessary. It is plausible that the boundary problem might be fixed when we are given two sets of eigenvalues, for two given boundary conditions and one and the same, unknown, recurrence relation. We shall term this the second inverse spectral problem, the first being that in which we are told the spectral function. We treat here the main case in which none of the eigenvalues is either infinite or zero. The oscillation theorems of Section 1.3 impose certain restrictions on the assumed eigenvalues. They must have the separation property, by Theorem 1.3.6. There is also a restriction on the eigenvalues nearest to zero.
Theorem 1.8.1. Let λ_r, μ_r be two sets of m real numbers, all 2m numbers being distinct, finite and non-zero. Let them have the separation property that between any two of the λ_r there lies at least one of the μ_r, and conversely. Let the closest to zero among the λ_r, μ_r be one of the λ_r on the positive side (if any of the 2m numbers are positive), and one of the μ_r on the negative side (if any of the 2m numbers are negative). Let α, α′ be given with 0 < α < α′ < 2π. Then there is a set of complex numbers c_0, ..., c_{m−1}, unique except as to order, such that the boundary problem (1.2.2-3) has the λ_r as its eigenvalues, and with α′ replacing α has the μ_r as its eigenvalues.

We have that the equations y_m(λ) = exp(iα), y_m(λ) = exp(iα′) have as their roots the λ_r, μ_r, respectively. With the notation
    Π_1 = ∏_{k=0}^{m−1} (1 + iλc_k),    (1.8.1)

    Π_2 = ∏_{k=0}^{m−1} (1 − iλc̄_k),    (1.8.2)

and with Π_3(λ) = ∏_r (1 − λ/λ_r), Π_4(λ) = ∏_r (1 − λ/μ_r), we have then the identities

    Π_1 − exp(iα) Π_2 = [1 − exp(iα)] Π_3,    (1.8.3)

    Π_1 − exp(iα′) Π_2 = [1 − exp(iα′)] Π_4.    (1.8.4)
To prove (1.8.3), we observe that the left-hand side is equal to

    {y_m(λ) − exp(iα)} Π_2,

and is therefore a polynomial of degree m with the zeros λ_r. It is therefore a multiple of Π_3, and the constant factor on the right of (1.8.3) is obtained by equating the constant terms on both sides. The proof of (1.8.4) is similar.

Here Π_3, Π_4 and α, α′ are all known, and so Π_1 and Π_2 are determined from (1.8.3-4) by solving the equations. Since y_m(λ) = Π_1/Π_2, the boundary problem is recovered. It requires greater trouble to verify that the solution so obtained yields a set of c_k with positive real part. These c_k are specified by the fact that the zeros of y_m(λ), that is to say, the zeros of Π_1, are the points i/c_k. What we have to prove is that the zeros of Π_1 lie in the upper half-plane. For this purpose we use the argument principle. We consider the variation in arg Π_1 as λ describes a large semicircle, with center the origin and diameter on the real axis, with curved portion in the upper half-plane, and described positively. If, as we say, the zeros of Π_1 all lie in the upper half-plane, then the variation in arg Π_1 around this contour will be 2mπ. So far as the curved portion of the semicircle is concerned, on which Π_1 is asymptotic to a constant multiple of λ^m, the variation in arg Π_1 will approximate to mπ, for large semicircles. Hence we have to prove that the variation in arg Π_1 as λ describes the real axis positively amounts to an increase of mπ. Solving (1.8.3-4) for Π_1, we have

    sin ½(α′ − α) · Π_1 = −exp(½iα′) sin ½α · Π_3 + exp(½iα) sin ½α′ · Π_4.    (1.8.5)
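The solution procedure, namely forming Π_3, Π_4 from the data, solving (1.8.3-4) for Π_1, Π_2, and reading off the c_k from the zeros of Π_1, can be sketched numerically as follows. The data are hypothetical, chosen to satisfy the hypotheses of Theorem 1.8.1 (interlacing; least positive member a λ_r, greatest negative member a μ_r).

```python
import numpy as np

lam = np.array([0.5, 3.0])     # eigenvalues for boundary parameter alpha
mu = np.array([-1.0, 1.5])     # eigenvalues for boundary parameter alpha'
alpha, alphap = 0.8, 2.0       # 0 < alpha < alpha' < 2*pi

def normalized_poly(roots):
    """Coefficients (highest degree first) of prod_r (1 - z/r)."""
    p = np.array([1.0 + 0j])
    for r in roots:
        p = np.convolve(p, np.array([-1.0 / r, 1.0]))
    return p

P3, P4 = normalized_poly(lam), normalized_poly(mu)   # Pi_3, Pi_4

s = np.sin((alphap - alpha) / 2)
# (1.8.5) gives Pi_1; the conjugate-coefficient formula gives Pi_2.
P1 = (-np.exp(0.5j * alphap) * np.sin(alpha / 2) * P3
      + np.exp(0.5j * alpha) * np.sin(alphap / 2) * P4) / s
P2 = (-np.exp(-0.5j * alphap) * np.sin(alpha / 2) * P3
      + np.exp(-0.5j * alpha) * np.sin(alphap / 2) * P4) / s

c = 1j / np.roots(P1)          # zeros of Pi_1 are the points i/c_k
assert np.all(c.real > 0)      # the recovered c_k have positive real part

# y_m = Pi_1/Pi_2 equals exp(i alpha) at the lam_r, exp(i alpha') at the mu_r.
ym = lambda z: np.polyval(P1, z) / np.polyval(P2, z)
assert all(abs(ym(l) - np.exp(1j * alpha)) < 1e-9 for l in lam)
assert all(abs(ym(u) - np.exp(1j * alphap)) < 1e-9 for u in mu)
```

That the zeros of Π_1 fall in the upper half-plane, as the assertion checks, is exactly what the argument-principle discussion below establishes in general.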
In this equation we put λ equal to the successive λ_r, μ_r to study the variation of arg Π_1. It is necessary to ascertain the signs of Π_3, Π_4 at the zeros of the other. To illustrate the latter we draw up a table; we assume the λ_r, μ_r numbered in the following manner:

    λ_{−p} < μ_{−p} < ... < λ_{−1} < μ_{−1} < 0 < λ_0 < μ_0 < ... < λ_{m−p−1} < μ_{m−p−1}.    (1.8.6)

This expresses our assumptions that the λ_r, μ_r separate one another, and that the least positive such number belongs to the λ_r, the greatest negative one to the μ_r. The possibility is not excluded that all the λ_r, μ_r are of one sign, in which case one of the two ends of this chain is absent.
    λ:        λ_{−p}    μ_{−p}      ...   λ_{−1}   μ_{−1}   0   λ_0   μ_0   ...   λ_{m−p−1}     μ_{m−p−1}
    sgn Π_3:    0       (−)^{p−1}   ...     0        +      +    0     −    ...       0          (−)^{m−p}
    sgn Π_4:  (−)^p        0        ...     −        0      +    +     0    ...  (−)^{m−p−1}         0
In the central column we record the fact that Π_3, Π_4 are both positive, in fact, unity, when λ = 0. In tabulating the signs of Π_3, for example, we use the fact that it vanishes at each of the λ_r, changing sign at each of them since it has only simple zeros. At two successive μ_r, Π_3 will have opposite signs, since between them must lie just one of the λ_r. We repeat that in the above table, of the first and last columns only one is actually present. Suppose for definiteness that λ_{−1}, λ_0 both occur, and consider the variation in arg Π_1 as λ increases from λ_{−1} to λ_0. We use (1.8.5), noting that sin ½(α′ − α), sin ½α, sin ½α′ are all positive, since 0 < α < α′ < 2π. Putting λ = λ_{−1} in (1.8.5), so that Π_3 = 0, Π_4 < 0, it follows from (1.8.5) that we may take arg Π_1(λ_{−1}) = ½α − π. Making now λ increase to μ_{−1}, we get Π_3 > 0, Π_4 = 0, so that arg Π_1(μ_{−1}) = ½α′ − π (mod 2π). We assert that in fact arg Π_1(μ_{−1}) = ½α′ − π. The alternative possibilities are that, as λ goes from λ_{−1} to μ_{−1}, arg Π_1(λ) might go from ½α − π to ½α′ + π, or to ½α′ − 3π, or to more distant values still. If this were so, arg Π_1(λ) would reach in between the value ½α′, or ½α′ − 2π, or some value congruent to these (mod 2π). This would mean that the left of (1.8.5) and the first term on the right would have argument congruent to ½α′ (mod π), the second term on the right having argument ½α (mod π). Since 0 < α < α′ < 2π, this is not possible unless Π_4 = 0, and this too is impossible between λ_{−1} and μ_{−1}. We conclude that as λ goes from λ_{−1} to μ_{−1}, arg Π_1(λ) goes from ½α − π to ½α′ − π. Similarly it may be shown that as λ increases from μ_{−1} to λ_0, arg Π_1(λ) increases from ½α′ − π to ½α, so that it has increased by π as λ goes from λ_{−1} to λ_0. Generally, as λ increases between two consecutive λ_r, or two consecutive μ_r, arg Π_1(λ) increases by π. We can now find the variation in arg Π_1(λ) as λ describes the whole real axis.
Suppose for definiteness that in the above table λ_{−p} exists, so that the least of the 2m numbers is λ_{−p} and not μ_{−p}. Then as λ increases from λ_{−p} to λ_{m−p−1}, arg Π_1(λ) will increase by (m − 1)π; in fact, with the previous determination of arg Π_1(λ), we have

    arg Π_1(λ_{m−p−1}) = ½α + (m − p − 1)π.    (1.8.7)
Also, as λ increases to μ_{m−p−1}, the greatest of the two sets of eigenvalues in this case, the previous reasoning shows that arg Π_1(λ) will increase further to

    arg Π_1(μ_{m−p−1}) = ½α′ + (m − p − 1)π.    (1.8.8)

It remains to consider the variation of arg Π_1(λ) when λ < λ_{−p} and when λ > μ_{m−p−1}. We write

    Δ_1 = arg Π_1(λ_{−p}) − arg Π_1(−∞),    Δ_2 = arg Π_1(+∞) − arg Π_1(μ_{m−p−1}).

Our aim is to show that Δ_1 + Δ_2 = π − ½(α′ − α). Now this is certainly true (mod 2π), since the variation of arg Π_1(λ) round our semicircle is a multiple of 2π, the variation round the curved part of the semicircle is asymptotically mπ, and that along the real axis from the lowest eigenvalue λ_{−p} to the greatest μ_{m−p−1} is (m − 1)π + ½(α′ − α). We therefore have to dispose of the eventualities

    Δ_1 + Δ_2 = π − ½(α′ − α) + 2πq    (1.8.9)
for some integral q ≠ 0. Considering Δ_1: as λ varies in −∞ < λ < λ_{−p}, arg Π_1(λ) cannot reach either of the values ½α′ − pπ, ½α′ − (p + 1)π; as was shown above in considering the variation from λ_{−1} to λ_0, such values correspond to zeros of Π_4, that is to say, to members of the set μ_r, of which there are none in the interval (−∞, λ_{−p}). In view of (1.8.7) we deduce that

    −½(α′ − α) < Δ_1 < π − ½(α′ − α).

Similarly, in the interval (μ_{m−p−1}, +∞), arg Π_1(λ) cannot reach the values ½α + (m − p − 1)π, ½α + (m − p)π, since these correspond to zeros of Π_3, that is to say, to members of the set λ_r. Hence Δ_2 satisfies the same bound, and so

    −(α′ − α) < Δ_1 + Δ_2 < 2π − (α′ − α).

Recalling that 0 < α < α′ < 2π, this implies that Δ_1 + Δ_2 < 2π, which excludes (1.8.9) with q = 1, 2, ... . Similarly we exclude (1.8.9) with q = −1, −2, ..., so that in (1.8.9) q = 0. This completes the proof that the variation of arg Π_1(λ) is mπ along the real axis, and so 2mπ around a large semicircle closed in the upper half-plane. Hence the zeros of Π_1 lie in the upper half-plane, and the c_k given by (1.8.1), (1.8.5) have positive real part. The proof is similar in the case that λ_{m−p−1} is the greatest of all the λ_r, μ_r and μ_{−p} the least.
A few simple points complete the proof. Since Π_3(0) = 1, Π_4(0) = 1, we deduce on putting λ = 0 in (1.8.3-5) that Π_1(0) = Π_2(0) = 1. Thus Π_1 admits the expression (1.8.1), in which the c_k have positive real part. Also, on solving (1.8.3-4) for Π_2, we get

    sin ½(α′ − α) · Π_2 = −exp(−½iα′) sin ½α · Π_3 + exp(−½iα) sin ½α′ · Π_4.

Since Π_3, Π_4 are polynomials with real coefficients, comparison with (1.8.5) shows that Π_2 is the polynomial which is the complex conjugate to Π_1, in the sense of having complex conjugate coefficients. Hence Π_2 is given by (1.8.2). Forming now y_m(λ) = Π_1/Π_2, we have the transfer function of the boundary problem, so that the recurrence relation (1.2.2) is determined, except as regards permutations among the c_n. We need, of course, to show that y_m(λ) = exp(iα) when λ = λ_r, and that y_m(λ) = exp(iα′) when λ = μ_r. These follow immediately from (1.8.3-4).
1.9. Moment Characterization of the Spectral Function

This topic also belongs to some degree in the category of inverse problems. We focus attention on the dual orthogonality relations (1.4.12), and ask what other orthogonality relations there may be concerning the same rational functions η_n(λ), with possibly different λ_r, ρ_r. The problem may be more compactly handled in the Stieltjes integral formulation (1.5.3). If we assume for simplicity that the eigenvalues are all finite, and write τ(λ) instead of τ_{m,α}(λ), this becomes

    ∫_{−∞}^{∞} η̄_j(λ) η_k(λ) dτ(λ) = δ_{jk}/a_j.    (1.9.1)
Extending the definition of a spectral function given in Section 1.5 by actual construction, we may term τ(λ) a spectral function, associated with the recurrence relation (1.2.2), if it satisfies (1.9.1), together with some general restrictions. Here we shall require τ(λ) to be nondecreasing, right-continuous, and such as to ensure absolute convergence in (1.9.1). The latter requirement is equivalent to

    ∫_{−∞}^{∞} (1 + λ²)^{−1} dτ(λ) < ∞.    (1.9.2)
An alternative formulation proceeds in terms of the eigenfunction expansion, Theorem 1.4.6, in Stieltjes integral formulation (1.5.5-6). For arbitrary u_n we define, as before,

    v(λ) = Σ_{n=0}^{m−1} u_n η_n(λ),

and term τ(λ) a spectral function if, whatever the choice of the u_n,

    ∫_{−∞}^{∞} |v(λ)|² dτ(λ) = Σ_{n=0}^{m−1} |u_n|²/a_n.    (1.9.3)

It is not hard to show that (1.9.1) implies (1.9.3), and conversely. The formulation in terms of the eigenfunction expansion has advantages for the corresponding problem for differential equations. We mention in passing that (1.9.1) does not ensure the validity of what has been termed the dual expansion theorem, in which we start with any, suitably integrable, function v(λ), define the u_n by (1.9.3), and then consider the expansion of v(λ) in terms of η_n(λ), with the u_n as Fourier coefficients. A spectral function with this additional property may be termed an orthogonal spectral function; however, we shall not consider this here.

The conditions (1.9.1), supposed to hold for 0 ≤ j, k ≤ m − 1, constitute a moment problem, in that τ(λ) is to be found, as far as possible, from a knowledge of the “moments” with respect to it of the functions η̄_j(λ)η_k(λ). Just as moments of polynomials may be expressed in terms of moments of separate powers, so also the moments of the rational functions η̄_jη_k may be expressed in terms of moments of simpler functions of which they are linear combinations, namely, by resolving them into partial fractions. With this in mind, we define the function

    f(λ) = γ + ∫_{−∞}^{∞} (λ − μ)^{−1} dτ(μ),    (1.9.4)

where γ is some real quantity to be chosen later. This function stands in the same relation to τ(μ) as the characteristic function (1.6.2) does to the specific spectral function τ_{m,α}(λ). We first carry out the reduction of (1.9.1) to a simpler moment problem. This part of the argument we give for a fairly general class of τ(λ); later we specialize to the case in which τ(λ) is a step function with a finite number of jumps. We have:

Theorem 1.9.1. Let τ(λ) be real, right-continuous, and of bounded variation in every finite interval. Let the integral in (1.9.4) converge absolutely for all nonreal λ. Let the c_0, ..., c_{m−1} have positive real part,
and be all distinct. In order that (1.9.1) should hold for 0 ≤ j, k ≤ m − 1, the η_j(λ) being given by (1.4.6) and a_j by (1.4.2), it is necessary and sufficient that, for some real γ in (1.9.4),

    f(i/c_p) = −½ i,    p = 0, ..., m − 1.    (1.9.5)

If the c_0, ..., c_{m−1} are not all distinct, and s of them coincide in a certain value c, then (1.9.5) is to be supplemented by the s − 1 equations

    f′(i/c) = f″(i/c) = ... = f^{(s−1)}(i/c) = 0.    (1.9.6)
Let us first prove the conditions sufficient. We assume (1.9.5), and also (1.9.6) if appropriate, and establish (1.9.1). In the case j = k, the result to be proved becomes, by (1.4.10),

    ∫_{−∞}^{∞} (1 + iλc_j)^{−1} (1 − iλc̄_j)^{−1} dτ(λ) = 1/(c_j + c̄_j).    (1.9.7)

If we put the left-hand side into partial fractions, this becomes

    {i(c_j + c̄_j)}^{−1} ∫_{−∞}^{∞} {(λ − i/c_j)^{−1} − (λ + i/c̄_j)^{−1}} dτ(λ) = 1/(c_j + c̄_j),    (1.9.8)

and by (1.9.4) the result to be proved is now

    −f(i/c_j) + f(−i/c̄_j) = i.    (1.9.9)
This follows from (1.9.5), bearing in mind that f(λ) as given by (1.9.4), with real γ and real τ(μ), must take complex conjugate values at complex conjugate points. To complete the proof of the sufficiency, we show that (1.9.1) must hold also when j ≠ k. It will be sufficient to take the case j < k, other cases being obtainable by taking complex conjugates. Using (1.4.6), we have

    η̄_j(λ) η_k(λ) = ∏_{r=j+1}^{k−1} (1 − iλc̄_r) ∏_{r=j}^{k} (1 + iλc_r)^{−1},    (1.9.10)

or, in the case k = j + 1,

    η̄_j(λ) η_{j+1}(λ) = (1 + iλc_j)^{−1} (1 + iλc_{j+1})^{−1}.    (1.9.11)
In either event, we may express η̄_j η_k, for real λ, in partial fractions as

    η̄_j(λ) η_k(λ) = Σ_{p=j}^{k} d_p (λ − i/c_p)^{−1} + ...,    (1.9.12)

where the dots indicate terms of the form (λ − i/c)^{−2}, (λ − i/c)^{−3}, ..., in the event of two or more of the c_p coinciding. Since, in the notation (1.9.4),

    ∫_{−∞}^{∞} dτ(λ) (λ − i/c_p)^{−1} = γ − f(i/c_p),    (1.9.13)

and since, if s of the c_p coincide in a value c,

    ∫_{−∞}^{∞} dτ(λ) (λ − i/c)^{−t} = 0,    t = 2, ..., s,    (1.9.14)

by (1.9.6), we have from (1.9.12) that

    ∫_{−∞}^{∞} η̄_j(λ) η_k(λ) dτ(λ) = Σ_{p=j}^{k} d_p {γ − f(i/c_p)},    (1.9.15)

and by (1.9.5), the right-hand side is

    (γ + ½i) Σ_{p=j}^{k} d_p.    (1.9.16)

In view of (1.9.10-11), η̄_j(λ) η_k(λ) is of order O(λ^{−2}) for large λ, and hence it follows from (1.9.12) that

    Σ_{p=j}^{k} d_p = 0.    (1.9.17)
Hence (1.9.16) vanishes, and so also (1.9.15). This completes the proof that (1.9.5-6) are sufficient for (1.9.1).

Next we prove that the conditions (1.9.1), for 0 ≤ j, k ≤ m − 1, imply (1.9.5), together with (1.9.6) in the case of coincidences among the c_p. Let us take first the case in which the c_p are all different. First we use cases of (1.9.1) in which k = j + 1. In view of (1.9.11), the result may be written

    {i(c_j − c_{j+1})}^{−1} ∫_{−∞}^{∞} dτ(λ) {(λ − i/c_j)^{−1} − (λ − i/c_{j+1})^{−1}} = 0.    (1.9.18)

By (1.9.13) we deduce that

    f(i/c_j) − f(i/c_{j+1}) = 0,    j = 0, ..., m − 2,    (1.9.19)
whatever the choice of γ. To complete the proof we use (1.9.1) with j = k. We have (1.9.7), and so (1.9.9). Since f(λ) takes complex conjugate values at complex conjugate points, it follows from (1.9.9) that Im f(i/c_j) = −½. Since the f(i/c_j) are all equal, by (1.9.19), their common real part may be removed by choice of γ, and this completes the proof of (1.9.5).

Suppose next that the c_j are not all distinct. Here (1.9.18) holds if c_j ≠ c_{j+1}, and so (1.9.19) holds in any case. Also (1.9.9) holds, and so the proof of (1.9.5) still holds good. To derive (1.9.6), that is to say, (1.9.14), we take first pairs c_j, c_{j+1}, if any, such that c_j = c_{j+1}. In this case, (1.9.1) with k = j + 1 gives, by (1.9.11),

    ∫_{−∞}^{∞} (1 + iλc_j)^{−2} dτ(λ) = 0,

which is equivalent to a case of (1.9.14). Thus (1.9.14) holds for all cases of two consecutive equal c_j. Next we take all triples c_j, c_{j+1}, c_{j+2} yielding fresh coincidences. The first possibility of this kind is that c_j = c_{j+1} = c_{j+2}. In this case, (1.9.1) with k = j + 2 gives, using (1.9.10),

    ∫_{−∞}^{∞} (1 − iλc̄_{j+1}) (1 + iλc_j)^{−3} dτ(λ) = 0.    (1.9.20)

Putting this integrand into partial fractions, we get

    d_{j2} ∫_{−∞}^{∞} (λ − i/c_j)^{−2} dτ(λ) + d_{j3} ∫_{−∞}^{∞} (λ − i/c_j)^{−3} dτ(λ) = 0,    (1.9.21)

there being no term in (λ − i/c_j)^{−1} since the integrand is O(λ^{−2}) for large λ. Here the first integral on the left is zero, since it relates to two consecutive and equal c_j. Also d_{j3} ≠ 0, since the integrand in (1.9.20) has a pole of order 3 at λ = i/c_j, c_j having positive real part. Hence the last integral on the left-hand side of (1.9.21) also vanishes, and we get another case of (1.9.14). If again c_j = c_{j+2} ≠ c_{j+1}, we get in place of (1.9.21) an equation of the form

    d_j ∫_{−∞}^{∞} (λ − i/c_j)^{−1} dτ(λ) + d_{j+1} ∫_{−∞}^{∞} (λ − i/c_{j+1})^{−1} dτ(λ) + d_{j2} ∫_{−∞}^{∞} (λ − i/c_j)^{−2} dτ(λ) = 0.    (1.9.22)

From the order of magnitude of the integrand in (1.9.1), as given by (1.9.10) with k = j + 2, we have d_j + d_{j+1} = 0, so that the sum of the first two terms on the left vanishes, by (1.9.5). From the singularity of the integrand, given by (1.9.10) with k = j + 2 and c_j = c_{j+2}, we have also that d_{j2} ≠ 0. Hence (1.9.22) gives another case of (1.9.14). We continue in this way, taking cases of (1.9.1) with k = j + 3, k = j + 4, ..., and expanding the integrand, as given by (1.9.10), in partial fractions whenever c_j = c_k. If the number of times that the value c_j appears in the sequence c_j, c_{j+1}, ..., c_{k−1} is greater than the number of times it appears in any shorter consecutive sequence, we obtain a fresh case of (1.9.14). Proceeding in this way, we obtain all cases of (1.9.14). This completes the proof that the conditions (1.9.5-6) are necessary for (1.9.1).
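The sufficiency half of Theorem 1.9.1 can be illustrated numerically. The sketch below builds a discrete τ (jumps ρ_r at λ_r, hypothetical data) by the construction of Section 1.7, checks the interpolation property (1.9.5), and then verifies the orthogonality relations (1.9.1); the explicit form taken for η_n is an assumption, chosen to be consistent with (1.9.7) and (1.9.11).

```python
import numpy as np

alpha = 1.0
lam = np.array([-2.0, -0.5, 1.0, 3.0])   # hypothetical jump points of tau
rho = np.array([0.7, 1.2, 0.4, 0.9])     # hypothetical (positive) jumps
m = len(lam)

gamma = -0.5 / np.tan(alpha / 2) + np.sum(rho / lam)
f = lambda z: gamma + np.sum(rho / (z - lam))

# Recover the c_p as in Section 1.7: zeros of 2i f - 1 are the points i/c_p.
P = np.poly(lam)
num = (2j * gamma - 1) * P
for r in range(m):
    num = np.polyadd(num, 2j * rho[r] * np.poly(np.delete(lam, r)))
c = 1j / np.roots(num)
a = 2 * c.real                     # a_n = c_n + conj(c_n)

# Interpolation property (1.9.5): f = -i/2 at the points i/c_p.
assert all(abs(f(1j / cp) + 0.5j) < 1e-8 for cp in c)

def eta(n, z):                     # assumed form of eta_n(lambda)
    v = 1 / (1 + 1j * z * c[n])
    for r in range(n):
        v *= (1 - 1j * z * np.conj(c[r])) / (1 + 1j * z * c[r])
    return v

# Orthogonality (1.9.1): sum_r rho_r conj(eta_j) eta_k = delta_jk / a_j.
G = np.array([[np.sum(rho * np.conj(eta(j, lam)) * eta(k, lam))
               for k in range(m)] for j in range(m)])
assert np.allclose(G, np.diag(1 / a), atol=1e-8)
```

The final assertion is exactly (1.9.1) written for a step function: the Stieltjes integral reduces to a weighted sum over the jump points.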
1.10. Solution of a Moment Problem

The topic of this section is the determination of all spectral functions, within a rather restricted set. Here by a spectral function we understand a function τ(λ) which satisfies the orthogonality conditions (1.9.1) with 0 ≤ j, k ≤ m − 1. We propose to determine all such functions which are step functions, nondecreasing, and with only a finite number of jumps. That there are infinitely many such functions may be seen from the boundary problem (1.2.2-3). Corresponding to any α, 0 ≤ α < 2π, we constructed a particular spectral function τ_α(λ); these are, of course, all distinct, since their points of discontinuity are the roots of y_m(λ) = exp(iα). From these spectral functions others may be constructed. Since the defining property (1.9.1) is linear inhomogeneous in τ(λ), such an expression as ½{τ_α(λ) + τ_{α′}(λ)} will also be a spectral function, and will, in general, have 2m jumps. More generally, the spectral functions form a convex set, and the arithmetic mean of any number of them will also be a spectral function. A second means of finding additional spectral functions is to extend the boundary problem by adding extra stages to the recurrence relation. For some m′ > m we suppose that the c_n, n = 0, ..., m′ − 1, have positive real part, and set up the boundary problem given by

    y_{n+1} = (1 + iλc_n)(1 − iλc̄_n)^{−1} y_n,    n = 0, ..., m′ − 1,    (1.10.1)

together with the boundary condition

    y_{m′} = y_0 exp(iα),    y_0 ≠ 0.    (1.10.2)
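The convexity remark above admits a direct numerical check: the mean of the spectral functions for two boundary parameters again satisfies (1.9.1). In the sketch below the constants c_n are hypothetical, the jumps ρ_r are obtained as residues of the characteristic function of Section 1.6, and the form of η_n is an assumption consistent with (1.9.7) and (1.9.11).

```python
import numpy as np

c = np.array([0.8 + 0.3j, 1.1 - 0.4j, 0.6 + 0.1j])   # hypothetical constants
m = len(c)
a = 2 * c.real

def prod_poly(factors):
    p = np.array([1.0 + 0j])
    for f in factors:
        p = np.convolve(p, f)
    return p

P1 = prod_poly([np.array([1j * cn, 1.0]) for cn in c])             # Pi_1
P2 = prod_poly([np.array([-1j * np.conj(cn), 1.0]) for cn in c])   # Pi_2

def eta(n, z):                 # assumed form of eta_n(lambda)
    v = 1 / (1 + 1j * z * c[n])
    for r in range(n):
        v *= (1 - 1j * z * np.conj(c[r])) / (1 + 1j * z * c[r])
    return v

def spectral_jumps(alpha):
    """Eigenvalues (roots of y_m = exp(i alpha)) and jumps of tau_alpha."""
    num = np.polysub(P1, np.exp(1j * alpha) * P2)
    lam = np.roots(num).real
    dnum = np.polyder(num)
    # jumps = residues of f = (2i)^{-1}(e^{ia} + y_m)/(e^{ia} - y_m)
    rho = ((np.exp(1j * alpha) + np.polyval(P1, lam) / np.polyval(P2, lam))
           / (2j * (-np.polyval(dnum, lam) / np.polyval(P2, lam)))).real
    return lam, rho

def gram(lam, rho):
    return np.array([[np.sum(rho * np.conj(eta(j, lam)) * eta(k, lam))
                      for k in range(m)] for j in range(m)])

l1, r1 = spectral_jumps(0.7)
l2, r2 = spectral_jumps(2.1)
assert np.allclose(gram(l1, r1), np.diag(1 / a), atol=1e-8)
# Mean of the two spectral functions: 2m jumps of half the size.
G = gram(np.concatenate([l1, l2]), np.concatenate([r1, r2]) / 2)
assert np.allclose(G, np.diag(1 / a), atol=1e-8)
```

The same Gram computation with any convex combination of the jump weights would pass, reflecting the linearity of (1.9.1) in τ.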
If for this boundary problem we form the spectral function τ_{m′,α}(λ) according to (1.5.1-2), this function will satisfy (1.9.1) with 0 ≤ j, k ≤ m′ − 1, and so a fortiori with 0 ≤ j, k ≤ m − 1. It turns out that this second method gives all possible spectral functions, that is to say, functions satisfying (1.9.1) with 0 ≤ j, k ≤ m − 1, if we restrict attention to step functions which are nondecreasing and with only a finite number of jumps.

As shown in Section 1.9, the conditions (1.9.1) on τ(λ), which have the form of a moment problem, are equivalent to a simpler moment problem. We may sum up these two equivalent versions of our problem as follows:

(i) Constants c_0, ..., c_{m−1} with positive real part are given, and η_n(λ) is defined by (1.4.6). It is required to find all nondecreasing step functions τ(λ) with only a finite number of jumps satisfying (1.9.1), where a_n = c_n + c̄_n.

(ii) Constants c_0, ..., c_{m−1} with positive real part are given. It is required to find all nondecreasing step functions τ(λ) with only a finite number of jumps, such that for some real γ the function

    f(λ) = γ + ∫_{−∞}^{∞} (λ − μ)^{−1} dτ(μ)    (1.10.3)

has the interpolatory property

    f(i/c_n) = −½ i,    n = 0, ..., m − 1,    (1.10.4)

supplemented in the case of s coincident c_n by the s − 1 equations of the form

    f′(i/c_n) = ... = f^{(s−1)}(i/c_n) = 0.    (1.10.5)
A more intrinsic formulation of (ii) is suggested by the observation that f(λ) is a rational function that maps the upper and lower half-planes into each other; it is, in fact, the general such function which is finite at infinity. We may therefore pose (ii) as the problem of finding rational functions with these properties which take the value −½i at m specified points in the upper half-plane. We approach here the topic of the Pick-Nevanlinna problem, though in restrictive fashion, as we confine attention to rational functions and require the interpolatory values all to be −½i. The complete solution of these problems is given by:

Theorem 1.10.1. Let τ(λ) satisfy either of the problems (i), (ii) above; let it also be fixed so that τ(0) = 0, and defined at points of discontinuity so that it is right-continuous. Let m′ be the number of points of discontinuity of τ(λ). Then m′ ≥ m. If m′ = m, there is an α, 0 ≤ α < 2π,
so that τ(λ) coincides with the spectral function τ_{m,α}(λ) of the problem (1.2.2-3). If m′ > m, there are additional constants c_m, ..., c_{m′−1} with positive real part, and an α, 0 ≤ α < 2π, such that τ(λ) coincides with the spectral function τ_{m′,α}(λ) of the extended problem (1.10.1-2).

As already remarked, this necessary form for τ(λ) is sufficient for (i); by Theorem 1.9.1 it is also sufficient for (ii). In proving the theorem, we work from the form (ii), that is to say, from (1.10.3-4). From (1.10.3) we see that f(λ) is a rational function, with denominator of degree m′ and numerator of degree at most m′. Now (1.10.4) expresses the fact that the equation f(λ) = −½i has the m roots i/c_0, ..., i/c_{m−1}. If some of these are not distinct, the supplementary conditions (1.10.5) ensure that they are multiple roots of f(λ) = −½i, with a corresponding degree of multiplicity. Thus this equation has at least m roots, taking into account multiplicities of roots. Since f(λ) has denominator of degree m′, and numerator of degree at most m′, the equation f(λ) = −½i is of degree m′, at most and in fact exactly, when cleared of fractions. Hence m′ ≥ m, as asserted. Suppose first that m′ = m. We set

    y_m(λ) = exp(iα){2if(λ) − 1}{2if(λ) + 1}^{−1},    (1.10.6)
where α, 0 ≤ α < 2π, is chosen so that y_m(0) = 1; if τ(μ) has a jump at μ = 0, so that f(0) = ∞, this is to mean that α = 0. Since f(λ) is a rational function, with real coefficients, with denominator of degree m and numerator of degree at most m, (1.10.6) defines y_m(λ) as a rational function with numerator and denominator of degree exactly m, with m zeros and m poles. Since f(λ) is real for real λ, possibly infinite, we have |y_m(λ)| = 1 when λ is real; the zeros and poles of y_m(λ) will all be complex. Moreover, f(λ) takes complex conjugate values at complex conjugate points, so that the poles of y_m(λ), that is, the zeros of 2if(λ) + 1, will be complex conjugates to the zeros of 2if(λ) − 1, which are the zeros of y_m(λ). The latter are the points i/c_n, n = 0, ..., m − 1, repeats among these values being counted according to multiplicity, so that the poles will be the points −i/c̄_n. Taking into account the fact that y_m(0) = 1, we see that y_m(λ) must admit the factorization (1.3.1), with n = m, so that y_m(λ) is what we have termed the transfer function of the boundary problem (1.2.2-3). The relation (1.10.6), solved for f(λ), gives

    f(λ) = (2i)^{−1}{exp(iα) + y_m(λ)}{exp(iα) − y_m(λ)}^{−1}.    (1.10.7)

Comparing this with (1.6.1), we see that f(λ) coincides with the characteristic function f_{m,α}(λ) of the boundary problem (1.2.2-3). Comparing
(1.6.2), (1.10.3), we see that τ(μ) must coincide with the spectral function τ_{m,α}(μ) of the problem (1.2.2-3), since in (1.10.3) the rational function f(λ) fixes uniquely the step function τ(μ), subject to normalization by τ(0) = 0 and right-continuity. This completes the proof in the case m′ = m.

Suppose finally that m′ > m. We write now

    y_{m′}(λ) = exp(iα){2if(λ) − 1}{2if(λ) + 1}^{−1},    (1.10.8)

where again α is fixed so that y_{m′}(0) = 1, and set

    y_m(λ) = ∏_{n=0}^{m−1} (1 + iλc_n)(1 − iλc̄_n)^{−1}.    (1.10.9)

By the above reasoning, y_{m′}(λ) has the zeros i/c_n and the poles −i/c̄_n, n = 0, ..., m − 1, taking into account multiplicities where necessary. Hence we may write

    y_{m′}(λ) = y_m(λ) ψ(λ),    (1.10.10)

where ψ(0) = 1, and ψ(λ) is the ratio of two polynomials of degree m′ − m. The zeros and poles of ψ(λ) will be those of y_{m′}(λ) over and above those of y_m(λ). Now the zeros and poles of y_{m′}(λ) are all complex, and the zeros lie in the upper half-plane, since f(λ) has positive imaginary part in the lower half-plane, the poles of y_{m′}(λ) lying in the lower half-plane for a similar reason; in addition, the poles and zeros will be complex conjugates of one another. Since the same is true of y_m(λ), it will also apply to ψ(λ) = y_{m′}(λ)/y_m(λ), the poles and zeros of y_m(λ) being a subset of those of y_{m′}(λ). Denote the zeros of ψ(λ) by i/c_n, n = m, ..., m′ − 1, which, of course, need not be distinct from each other or from the zeros of y_m(λ). The poles of ψ(λ) must be the points −i/c̄_n, n = m, ..., m′ − 1; hence, recalling that ψ(0) = 1,

    ψ(λ) = ∏_{n=m}^{m′−1} (1 + iλc_n)(1 − iλc̄_n)^{−1},

and so

    y_{m′}(λ) = ∏_{n=0}^{m′−1} (1 + iλc_n)(1 − iλc̄_n)^{−1},

showing that y_{m′}(λ) has a similar form to y_m(λ), and is the transfer function of the extended recurrence relation (1.10.1). The proof is now completed as before.
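The extension device can be checked numerically: the spectral function of an extended problem with m′ > m stages (all constants hypothetical) still satisfies the first m orthogonality conditions (1.9.1). Jumps are again computed as residues of the characteristic function, and the form of η_n is an assumption consistent with (1.9.7) and (1.9.11).

```python
import numpy as np

c_full = np.array([0.8 + 0.3j, 1.1 - 0.4j, 0.6 + 0.1j, 0.9 - 0.2j, 1.3 + 0.5j])
m, alpha = 2, 1.4                     # the original problem uses c_0, c_1 only

def prod_poly(factors):
    p = np.array([1.0 + 0j])
    for f in factors:
        p = np.convolve(p, f)
    return p

P1 = prod_poly([np.array([1j * cn, 1.0]) for cn in c_full])
P2 = prod_poly([np.array([-1j * np.conj(cn), 1.0]) for cn in c_full])

# Eigenvalues and jumps of tau_{m',alpha} for the extended problem.
num = np.polysub(P1, np.exp(1j * alpha) * P2)
lam = np.roots(num).real              # the m' eigenvalues (all real)
rho = ((np.exp(1j * alpha) + np.polyval(P1, lam) / np.polyval(P2, lam))
       / (2j * (-np.polyval(np.polyder(num), lam) / np.polyval(P2, lam)))).real
assert np.all(rho > 0)                # the jumps are positive

def eta(n, z):                        # eta_n of the original m-stage problem
    v = 1 / (1 + 1j * z * c_full[n])
    for r in range(n):
        v *= (1 - 1j * z * np.conj(c_full[r])) / (1 + 1j * z * c_full[r])
    return v

a = 2 * c_full.real
G = np.array([[np.sum(rho * np.conj(eta(j, lam)) * eta(k, lam))
               for k in range(m)] for j in range(m)])
assert np.allclose(G, np.diag(1 / a[:m]), atol=1e-8)
```

This exhibits a spectral function of the m-stage problem with m′ = 5 > m jumps, of the form asserted by Theorem 1.10.1.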
CHAPTER 2
The Infinite Discrete Case
2.1. A Limiting Procedure
An important method in the theory of boundary problems consists in considering a sequence of such problems, formed in such a way that we may proceed to the limit in some result of the theory, usually the expansion theorem. The convergence of the various auxiliary functions has also an independent interest. In this way, results for particular simple cases serve as a foundation for those for more difficult situations. Here we consider one type of limiting process, in which the number of stages in the recurrence relation (1.2.2) tends to infinity, over some fixed sequence of constants c_n, n = 0, 1, ..., with positive real part. The process is the discrete analog of that for differential equations in which we start with a finite interval (a, b), and then make b → ∞, or a → −∞, or both. By way of illustration we list the corresponding results for the first-order differential equation, of which (1.2.2) is the discrete analog. We write this in the form
    y′ = iλc(x) y,    0 ≤ x ≤ b,    (2.1.1)

where c(x) is to be positive and continuous. Defining

    c_1(x) = ∫_0^x c(t) dt,    (2.1.2)

the fundamental solution is

    y(x, λ) = exp[iλc_1(x)].    (2.1.3)
<
Corresponding to some fixed a, 0 LY < 2r, we have the boundary problem given by y(b, A) = exp (ia),whose roots are the eigenvalues A, , so that A,c,(b) = cy. 2nr. T h e orthogonality relations being
+
1;4-4
Y(X,
A,) A x , A,) fix = b C l ( b ) , 55
(2.1.4)
the spectral function is a step function with jumps of amount 1/c₁(b) at the points (α + 2nπ)/c₁(b), n = 0, ±1, ... . Finally, the characteristic function defined in Section 1.6 is here

    f_{b,α}(λ) = (2i)⁻¹ {exp(iα) + y(b, λ)} {exp(iα) − y(b, λ)}⁻¹;    (2.1.5)
here the partial fraction expansion (1.6.2), essentially that of the cotangent, needs slight modification to ensure absolute convergence.

We now raise the question of the behavior of these entities as b → ∞. If we start with the convergence, or lack of convergence, of y(b, λ) as b → ∞, it is evident that two cases are to be distinguished, namely,

    c₁(∞) < ∞    (2.1.6)

and

    c₁(∞) = ∞.    (2.1.7)

In the first of these cases, y(b, λ) tends to a finite limit as b → ∞ for all λ. The boundary problem may be expressed directly as y(∞, λ) = exp(iα). The eigenvalues λ_n, functions of b and α, tend to finite and distinct limits as b → ∞, and the spectral function tends to a limiting spectral function, which is also a step function and depends on α. The situation is quite different when (2.1.7) holds. In this case the boundary condition at infinity, y(∞, λ) = exp(iα), has no sense, and eigenvalues cannot be defined. Nevertheless, the spectral function tends to a limit τ(λ) as b → ∞. Since the distance between consecutive λ_n is 2π/c₁(b) and the jump at each is {c₁(b)}⁻¹, on making b → ∞ and therewith c₁(b) → ∞, the limiting form of the spectral function is found to be τ(λ) = λ/(2π). The salient features of this situation are that the limit of the spectral function is no longer a step function, and is independent of the boundary parameter α. The associated eigenfunction expansion is, of course, the Fourier integral as applied to a half-axis. We also study convergence as b → ∞ in the complex λ-plane. In the case (2.1.6), y(b, λ) tends as b → ∞, for a fixed λ-value, to a limit which is an entire function of λ. In the case (2.1.7), it tends to zero for λ in the upper half-plane, tends to infinity in modulus for λ in the lower half-plane, and for real λ moves indefinitely often round the unit circle. The convergence for complex λ can be exhibited more graphically in the case of the characteristic function (2.1.5). We consider, for fixed complex λ in the upper half-plane, the set of values assumed by f_{b,α}(λ) for varying α; let us denote this locus by C(b). Writing (2.1.5) in the form
and writing f for f_{b,α}(λ), we see that this locus is the f-locus given by

    |f − 1/(2i)| = |y(b, λ)| · |f + 1/(2i)|.    (2.1.8)
It is clear that C(b) is a circle (of Apollonius), given by points whose distance from 1/(2i) is in a fixed ratio |y(b, λ)|, less than 1, to its distance from −1/(2i). Now as b increases, for fixed λ with Im λ > 0, |y(b, λ)| decreases monotonically, since c(x) > 0. Hence the circles C(b) given by (2.1.8) shrink progressively, or "nest," in the sense that if b′ > b, then C(b′) lies in the interior of C(b). In the case (2.1.6), |y(b, λ)| diminishes to a positive limit, and, accordingly, C(b) shrinks towards a circle C(∞), given by (2.1.8) with this limiting value of |y(b, λ)|; this may be termed the limit-circle case. If again (2.1.7) holds, so that |y(b, λ)| → 0, then by (2.1.8) the circles C(b) converge on the point 1/(2i), so that we term this the limit-point case. The limit-point 1/(2i) is, of course, independent of the boundary parameter α, as was the limiting spectral function τ(λ) = λ/(2π); the two are connected by the formal relation

    1/(2i) = ∫_{−∞}^{∞} (λ − μ)⁻¹ d{μ/(2π)},
though for a rigorous connection we have to use a more complicated integrand. Another aspect that discriminates between (2.1.6–7) is the existence of solutions of integrable square. The latter term is construed in conformity with the inner product appearing in the orthogonality relations (2.1.4). Since the equation (2.1.1) has essentially only one solution, y(x, λ), the question is whether or for what λ this solution satisfies

    ∫₀^∞ c(x) |y(x, λ)|² dx < ∞.    (2.1.9)

It may be proved that this is certainly true if Im λ > 0. For Im λ = 0, it is true trivially if and only if (2.1.6) holds; the same holds if Im λ < 0. In what follows, we prove that very similar statements hold for the recurrence relation (1.2.2) with m = ∞.
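The continuous model above admits a direct numerical check. The following minimal sketch (all numerical values are illustrative, not from the text) takes c(x) ≡ 1 on (0, b), so that c₁(x) = x, verifies that the numbers λ_n = (α + 2nπ)/c₁(b) satisfy the boundary condition y(b, λ) = exp(iα), and confirms the orthogonality relation (2.1.4) by quadrature:

```python
import cmath

# Continuous model of Section 2.1 with c(x) = 1, so c1(x) = x and
# y(x, lam) = exp(i*lam*x).  The endpoint b and parameter alpha are
# illustrative choices.
b = 5.0          # right endpoint of (0, b)
alpha = 1.3      # boundary parameter, 0 <= alpha < 2*pi
PI = 3.141592653589793

def y(x, lam):
    return cmath.exp(1j * lam * x)

# Eigenvalues: y(b, lam) = exp(i*alpha)  <=>  lam_n * c1(b) = alpha + 2*n*pi.
eigen = [(alpha + 2 * PI * n) / b for n in range(-3, 4)]
for lam in eigen:
    assert abs(y(b, lam) - cmath.exp(1j * alpha)) < 1e-12

# Orthogonality (2.1.4): integral_0^b y(x, lam_m) * conj(y(x, lam_n)) dx
# equals c1(b) = b when m = n and vanishes otherwise (midpoint rule).
def inner(lm, ln, steps=20000):
    h = b / steps
    return sum(y((k + 0.5) * h, lm) * y((k + 0.5) * h, ln).conjugate()
               for k in range(steps)) * h

print(abs(inner(eigen[0], eigen[0])), abs(inner(eigen[0], eigen[1])))
```

The spacing 2π/c₁(b) between consecutive λ_n and the jump 1/c₁(b) of the spectral function are visible directly in the list `eigen`.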
2.2. Convergence of the Fundamental Solution

We suppose defined an infinite sequence c₀, c₁, ... of possibly complex constants with positive real parts. The recurrence relation is, as before,

    y_{n+1} = (1 + iλc_n)(1 − iλc̄_n)⁻¹ y_n,    n = 0, 1, ...,    (2.2.1)
but with an infinite number of stages. For the fundamental solution we take y₀ = 1, defining, as before,

    y_n(λ) = ∏_{p=0}^{n−1} {(1 + iλc_p)(1 − iλc̄_p)⁻¹}.    (2.2.2)
An immediate question is whether y_n(λ) tends to a limit as n → ∞. We do not settle here this question in its full generality, but only in the case that the limit is to be meromorphic; this is equivalent to demanding that the zeros and poles should have no finite limit, which implies that c_n → 0. Transferred to the unit circle, such questions are handled in greater generality in the theory of Blaschke products. In the following simple results we consider analogs of the criteria (2.1.6–7).
Theorem 2.2.1. Let the constants c₀, c₁, ..., have positive real parts. Then for y_n(λ) to tend as n → ∞ to a limit which is a meromorphic function of λ, it is necessary and sufficient that

    c_n → 0    as n → ∞,    (2.2.3)

and

    ∑₀^∞ a_n < ∞,    (2.2.4)

where

    a_n = c_n + c̄_n = 2 Rl{c_n}.    (2.2.5)

Let us first prove the conditions sufficient. We rewrite (2.2.2) in the form

    y_n(λ) = ∏_{p=0}^{n−1} {1 + iλa_p(1 − iλc̄_p)⁻¹},
provided that λ ≠ −i/c̄_p, p = 0, 1, ...; these points have no finite limit, by (2.2.3). Provided again that λ is not one of these points,

    1/(1 − iλc̄_p) → 1    as p → ∞,    (2.2.6)

and hence by (2.2.4),

    ∑₀^∞ |iλa_p(1 − iλc̄_p)⁻¹| < ∞.    (2.2.7)

Thus the infinite product converges to a meromorphic function by standard tests.
Next we assume that y_n(λ) tends to a meromorphic function. By (2.2.2), such a function would have to vanish at the points i/c_p. Since y_n(0) = 1, the function could not vanish identically, and so its zeros cannot have a finite limit. Hence (2.2.3) holds. To prove (2.2.4) we use (1.4.3), where λ is complex and not one of the points −i/c̄_p. We assume also that λ is chosen so that |y_m(λ)| tends to a positive limit as m → ∞; this must be possible since y_m(0) = 1 for all m and since the limit is to be meromorphic. Choosing also λ in the upper half-plane, we write (1.4.3) in the form

    ∑_{n=0}^{m−1} a_n |y_n(λ)|² |1 − iλc̄_n|⁻² = {1 − |y_m(λ)|²}/(2 Im λ) < 1/(2 Im λ).    (2.2.8)

Making m → ∞, we get

    ∑₀^∞ a_n |y_n(λ)|² |1 − iλc̄_n|⁻² ≤ 1/(2 Im λ).    (2.2.9)

Since |1 − iλc̄_n| → 1 as n → ∞, (2.2.3) having been proved already, and since by hypothesis |y_n(λ)| tends to a positive limit, the convergence of the series on the left of (2.2.9) implies that of ∑ a_n, so that (2.2.4) holds. This completes the proof. We pass to the analog of the situation (2.1.7).
Theorem 2.2.2. Let the c_n have positive real part, tend to zero as n → ∞, and let, with the notation (2.2.5),

    ∑₀^∞ a_n = ∞.    (2.2.10)

Then if Im λ > 0, y_n(λ) → 0 as n → ∞, and if Im λ < 0, |y_n(λ)| → ∞.

If Im λ > 0, the result (2.2.9) is available, and since c_n → 0 we have

    ∑₀^∞ a_n |y_n(λ)|² < ∞.    (2.2.11)

Since Im λ > 0, Rl c_n > 0, we have by Lemma 1.2.1 that

    1 = |y₀(λ)| > |y₁(λ)| > |y₂(λ)| > ...,

so that as n → ∞, we have that |y_n(λ)| either tends to a positive limit or to zero. The former case is excluded as it would give a contradiction
between (2.2.10–11). Hence y_n(λ) → 0 if Im λ > 0. If Im λ < 0, the fact that |y_n(λ)| → ∞ follows from the case just proved, since

    y_n(λ̄) = 1/ȳ_n(λ),

apart from singularities. Another aspect of the analogy between (2.2.1) and (2.1.1) which may be disposed of simply is that of "solutions of integrable square." This phrase has here the interpretation that

    ∑₀^∞ a_n |η_n(λ)|² < ∞.    (2.2.12)

As compared with (2.1.9) the solution has been replaced by the modified function defined in (1.4.5–6), and the integral has been replaced by a sum; if, however, c_n → 0 as n → ∞, this is equivalent to (2.2.11). In any event, (2.2.11) is ensured by (2.2.9) if Im λ > 0, and so there necessarily exists a solution of integrable square if Im λ > 0, whether (2.2.4) holds or not. If (2.2.4) does hold, y_n(λ) tends to a finite limit, and so (2.2.9) holds as a consequence of (2.2.7), apart from the poles λ = −i/c̄_p. In connection with second-order difference and differential equations, a different pattern of results is encountered in connection with the existence of solutions of integrable square. In the present case the invariant form as given by (1.2.8) is definite.
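The dichotomy of Theorems 2.2.1–2 can be illustrated numerically. Taking λ = i and real c_n (the particular sequences below are illustrative choices, not from the text), each factor of (2.2.2) reduces to (1 − c_n)/(1 + c_n), so |y_n(i)| is a decreasing product that stays bounded away from zero when ∑ a_n < ∞ and collapses to zero when ∑ a_n = ∞:

```python
# |y_n(i)| for real c_n > 0: each factor of (2.2.2) at lam = i is
# (1 - c_n)/(1 + c_n), a number in (0, 1).  Summable a_n = 2 c_n gives a
# positive limit (limit-circle side of the dichotomy); non-summable a_n
# drives the product to zero (limit-point side, Theorem 2.2.2).
def mod_y(cs):
    p = 1.0
    for c in cs:
        p *= (1.0 - c) / (1.0 + c)
    return p

N = 4000
summable = [2.0 ** (-n - 1) for n in range(N)]    # sum a_n < infinity
divergent = [1.0 / (n + 2) for n in range(N)]     # sum a_n = infinity

print(mod_y(summable), mod_y(divergent))
```

For the divergent sequence the product telescopes to 2/{(N + 1)(N + 2)}, in line with y_n(λ) → 0 for Im λ > 0.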
2.3. Convergence of the Spectral Function

We recall the definition (1.5.1–2) of τ_{m,α}(λ) as a right-continuous step function, having jumps 1/ρ_r at the eigenvalues λ_r, and fixed by τ_{m,α}(0) = 0. Here λ_r, ρ_r vary with m, as also with α. We consider here the convergence of τ_{m,α}(λ) as m → ∞, keeping for definiteness α fixed. This provides one of the main approaches to the proof of the eigenfunction expansion. A very simple, though somewhat vague, method of dealing with this question relies simply on the boundedness, uniformly in m and α, of the spectral function τ_{m,α}(λ). The boundedness of the spectral function may in turn be established in various ways, in particular by means of the dual orthogonality (1.5.3–4). Taking j = k = 0 in the latter, we have

    ∫_{−∞}^{∞} |1 + iλc₀|⁻² dτ_{m,α}(λ) + |c₀|⁻² β_{m,α} = 1/a₀;    (2.3.1)
here the term β_{m,α} occurs only if ∞ is an eigenvalue of (1.2.2), and in any event is nonnegative. Since Rl{c₀} > 0, the function (1 + λ²)/|1 + iλc₀|² is continuous on the real axis; it also tends to a positive limit as λ → ±∞, and so has a positive lower bound c, say. Hence, for real λ,

    |1 + iλc₀|⁻² ≥ c/(1 + λ²),

and so from (2.3.1) we deduce the bound

    ∫_{−∞}^{∞} dτ_{m,α}(λ)/(1 + λ²) ≤ 1/(ca₀),    (2.3.2)

independently of m and α. The left-hand side is not less than (1 + λ²)⁻¹ τ_{m,α}(λ) for any λ > 0, and so, for λ > 0,

    0 ≤ τ_{m,α}(λ) ≤ (1 + λ²)/(ca₀).    (2.3.3)

Similarly, for any λ < 0,

    −(1 + λ²)/(ca₀) ≤ τ_{m,α}(λ) ≤ 0.    (2.3.4)
Since τ_{m,α}(λ) is thus a uniformly bounded nondecreasing function, an application of the Helly-Bray theorem shows that we may choose an m-sequence such that τ_{m,α}(λ) converges to a limit τ(λ), also nondecreasing and satisfying the bounds (2.3.2–4). In putting this result formally we incorporate the result concerning the passage to the limit for integrals involving the spectral function.

Theorem 2.3.1. Let the c₀, c₁, ..., have positive real part. Then there exists at least one limiting spectral function τ(λ), such that for some m-sequence and any finite λ we have

    τ_{m,α}(λ) → τ(λ).    (2.3.5)

This function is nondecreasing and such that

    ∫_{−∞}^{∞} dτ(λ)/(1 + λ²) ≤ 1/(ca₀).    (2.3.6)

There is a constant β ≥ 0 such that for an arbitrary continuous function g(λ), such that (1 + λ²)g(λ) is uniformly bounded for all λ, and such
that (1 + λ²)g(λ) tends to the same finite limit g₀ as λ → ±∞, we have, as m → ∞ through the same sequence,

    ∫_{−∞}^{∞} g(λ) dτ_{m,α}(λ) → ∫_{−∞}^{∞} g(λ) dτ(λ) + βg₀.

For the proof we refer to Appendix I.
2.4. Convergence of the Characteristic Function

Closely linked with the convergence of the spectral function is that of, effectively, its Stieltjes transform, defined in (1.6.1). A curious feature is that in order to ascertain the boundedness and convergence of f_{m,α}(λ) for increasing m, it is advantageous to imbed this problem in the wider problem of the convergence of the set of values of f_{m,α}(λ), when α takes all real values, or indeed all values in the lower half-plane as well. This leads to the topic of nesting circles, which finds application in many of our boundary problems. Taking α real, it follows from (1.6.1) [cf. (2.1.8)] that

    |f_{m,α}(λ) − 1/(2i)| = |y_m(λ)| · |f_{m,α}(λ) + 1/(2i)|.    (2.4.1)

As mentioned in connection with (2.1.8), for the analogous continuous case, this means that for fixed λ, and varying real α, f_{m,α}(λ) lies on a certain circle, C(m, λ), say. Supposing that Im λ > 0, so that |y_m(λ)| < 1, we can say that f_{m,α}(λ) lies on the boundary of the finite disk D(m, λ) of f-values characterized by

    |f − 1/(2i)| ≤ |y_m(λ)| · |f + 1/(2i)|.    (2.4.2)

As m increases, |y_m(λ)| steadily decreases, if Im λ > 0, and so these regions shrink, so that D(m, λ) contains in its interior the disk D(m + 1, λ) and its boundary C(m + 1, λ). The conclusion may be drawn that either the circles contract to a point, in this case to the point 1/(2i), or else to a limit-circle. A useful conclusion from the nesting-circle argument is that, whether the limit-circle or limit-point case holds, f_{m,α}(λ) is at any rate bounded, for fixed λ with Im λ > 0, independently of m and α. This gives an alternative proof of the boundedness of the spectral function. Putting λ = i in (1.6.7), and comparing imaginary parts of both sides, all of which are negative, we deduce that

    β_{m,α} + ∫_{−∞}^{∞} dτ_{m,α}(μ)/(1 + μ²) ≤ |Im{f_{m,α}(i)}|.    (2.4.3)
Since the right-hand side is bounded, this yields bounds of the form (2.3.3–4). In passing to the limit as m → ∞ in the partial fraction formula (1.6.2), or (1.6.6–7), it is necessary to modify the integrand so as to ensure absolute convergence. We rewrite (1.6.7) as

    f_{m,α}(λ) = γ_{m,α} − β_{m,α}λ + ∫_{−∞}^{∞} {(λ − μ)⁻¹ + μ(1 + μ²)⁻¹} dτ_{m,α}(μ).

Here the real constant γ_{m,α} is bounded uniformly in m, α, by a similar argument to (2.4.3). Making m → ∞, we may assume that for some subsequence of m-values, convergence holds in β_{m,α}, γ_{m,α}, and τ_{m,α}(μ). By Theorem 2.3.1 we deduce that there holds a representation for the limit f(λ) of the sequence of characteristic functions, of the form

    f(λ) = γ − βλ + ∫_{−∞}^{∞} {(λ − μ)⁻¹ + μ(1 + μ²)⁻¹} dτ(μ).
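The nesting-circle picture of (2.4.1–2) can be made concrete: for the two points ±1/(2i) and ratio r = |y_m(λ)| < 1, the circle of Apollonius has an explicit center and radius, and as r decreases (i.e., as m increases) the circles nest and close down on 1/(2i). The decreasing sequence of ratios below is illustrative, standing in for |y_m(λ)| at some fixed λ in the upper half-plane:

```python
# The locus |f - 1/(2i)| = r * |f + 1/(2i)|, 0 < r < 1, is the circle of
# Apollonius for the points A = -i/2 and B = +i/2 with ratio r:
#   center = (A - r*r*B) / (1 - r*r),   radius = r * |A - B| / (1 - r*r).
A, B = -0.5j, 0.5j

def circle(r):
    center = (A - r * r * B) / (1 - r * r)
    radius = r * abs(A - B) / (1 - r * r)
    return center, radius

# Illustrative decreasing ratios, standing in for |y_m(lam)| as m grows.
ratios = [0.9, 0.7, 0.5, 0.3, 0.1, 0.01]
circles = [circle(r) for r in ratios]

# Nesting: each circle lies strictly inside its predecessor, and the
# centers converge on the limit-point 1/(2i) = -i/2 as r -> 0.
for (c1, r1), (c2, r2) in zip(circles, circles[1:]):
    assert abs(c2 - c1) + r2 < r1
print(circles[-1])
```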
2.5. Eigenvalues and Orthogonality

In this section we assume the constants c_n to obey (2.2.3–5). The effect of this is that, as n → ∞,

    y_n(λ) → y_∞(λ),    (2.5.1)

where

    y_∞(λ) = ∏₀^∞ {(1 + iλc_p)(1 − iλc̄_p)⁻¹}    (2.5.2)

is meromorphic, with poles at the points −i/c̄_p. One aspect of this is that the "limit-circle case" holds; making m → ∞ in (2.4.2), we see that if λ is in the upper half-plane and is not one of the points i/c_p, the characteristic function f is in the limit confined to a circle of positive radius. Another aspect is that eigenvalues and eigenfunctions can be defined, with very little difference from the finite-dimensional case discussed in the last chapter. For fixed real α, we define the eigenvalues λ_r as the roots of the equation

    y_∞(λ) = exp(iα).    (2.5.3)
As in Section 1.2, the reality of the eigenvalues follows from the fact that |y_∞(λ)| < 1, > 1 for λ in the upper or lower half-planes. Defining, analogously to (1.3.3),

    θ_∞(λ) = ∑₀^∞ arg{(1 + iλc_n)(1 − iλc̄_n)⁻¹},    (2.5.4)

it is seen that each term in the latter sum increases by 2π as λ describes the real axis, so that θ_∞(λ) increases over an infinite range as λ increases over the real axis. Hence (2.5.3) has an infinity of real roots. In the finite-dimensional case, ∞ was admitted as an eigenvalue if it satisfied the determining equation (1.3.2). This does not in general apply to (2.5.3). The limit y_∞(±∞) for λ tending to ±∞ along the real axis will generally fail to exist, since y_∞(λ) describes an arbitrarily large number of circuits of the unit circle as λ describes the real axis. Instead, we consider whether (2.5.3) holds as λ → i∞ along the positive imaginary axis. Actually a further condition is required; we admit ∞ as an eigenvalue if, first,

    y_∞(λ) → exp(iα)    (2.5.5)

as λ → i∞ along the positive imaginary axis, and if at the same time,

    λ²y_∞′(λ)/{iy_∞(λ)} → β⁻¹    (2.5.6)
for some β > 0. This last condition, a modification of (1.4.9) for the point i∞, ensures that ∞ as an eigenvalue corresponds to a positive weight in the spectral function. We may link up this eventuality of an infinite eigenvalue with the constants c_n, on the one hand, and with the behavior of the characteristic function on the other. First we note:

Theorem 2.5.1. In order that, as λ → i∞ along the positive imaginary axis, we have (2.5.6) and also

    y_∞(λ) → exp(iα′)    (2.5.7)

for some real α′, it is necessary and sufficient that

    ∑₀^∞ a_n/|c_n|² < ∞.    (2.5.8)
We first assume (2.5.6–7) and deduce (2.5.8). From (2.5.6) we have that

    (d/dλ) log y_∞(λ) = O(λ⁻²),
and so

    [log y_∞(λ)]_λ^{i∞} = O(λ⁻¹).

Using (2.5.7) we have that for large λ on the positive imaginary axis,

    y_∞(λ) = exp(iα′) + O(λ⁻¹).    (2.5.9)

Next we use (2.2.7). Making m → ∞ we have, with a slight rearrangement,

    ∑₀^∞ a_n |y_n(λ)|² |λ/(1 − iλc̄_n)|² = |λ|² {1 − |y_∞(λ)|²}/(2 Im λ).    (2.5.10)

If we make λ → ∞ along the imaginary axis in the positive sense, the right-hand side remains bounded, by (2.5.9). Hence the left-hand side of (2.5.10) is also uniformly bounded. Making λ → i∞ in the individual terms on the left, we have |y_n(λ)| → 1, and

    |λ/(1 − iλc̄_n)| → 1/|c_n|.

Applying this limiting process to the sum over 0 ≤ n < m and then making m → ∞, we derive

    ∑₀^∞ a_n/|c_n|² < ∞,    (2.5.11)

in partial verification of (2.5.8). We write provisionally

    1/β′ = ∑₀^∞ a_n/|c_n|².    (2.5.12)

By logarithmic differentiation of (2.5.2), we have

    y_∞′(λ)/y_∞(λ) = i ∑₀^∞ a_n (1 + iλc_n)⁻¹ (1 − iλc̄_n)⁻¹,

the series converging absolutely at least for purely imaginary λ, and so the left of (2.5.6) admits the expression

    λ²y_∞′(λ)/{iy_∞(λ)} = ∑₀^∞ a_n |c_n|⁻² {1 + (iλc_n)⁻¹}⁻¹ {1 − (iλc̄_n)⁻¹}⁻¹.    (2.5.13)
We wish to make λ → i∞ in the factors on the right. To justify this we note that (2.5.11) may be written in the form

    2 ∑₀^∞ Rl{c_n}/|c_n|² < ∞,

so that

    ∑₀^∞ Rl{c_n}/|c_n| < ∞,

where we use the fact that c_n → 0. It follows that for n > n₀, say, either (3/4)π > arg c_n > (1/4)π, or else −(1/4)π > arg c_n > −(3/4)π. Hence if λ is a pure imaginary, either (1/4)π < arg(iλc_n) < (3/4)π, or else −(3/4)π < arg(iλc_n) < −(1/4)π; the same bounds apply to arg(iλc̄_n), if λ is purely imaginary and n > n₀. It follows that, under the same circumstances,

    |1 + 1/(iλc_n)| > 2^{−1/2},    |1 − 1/(iλc̄_n)| > 2^{−1/2}.    (2.5.14)

We deduce that the series on the right of (2.5.13) is uniformly convergent. Making λ → i∞ in the individual terms on the right, and using (2.5.12), we get

    λ²y_∞′(λ)/{iy_∞(λ)} → 1/β′,

and on comparison with (2.5.6) we have β′ = β, completing the proof of the necessity.

Next assume that (2.5.8) holds. By the argument just given we deduce (2.5.6). It remains to prove (2.5.7). It follows from (2.5.6) that (d/dλ) log y_∞(λ) = O(λ⁻²), so that y_∞(λ) tends to a limit as λ → i∞. In order to establish (2.5.7) for some real α′, it will be sufficient to show that |y_∞(λ)| → 1 as λ → i∞, still along the positive imaginary axis. To do this we use (2.5.10) again, writing the left-hand side in the form

    ∑₀^∞ a_n |c_n|⁻² |y_n(λ)|² |1 − (iλc̄_n)⁻¹|⁻².

By (2.5.14), the last factors are uniformly bounded, and also we have |y_n(λ)| < 1. Hence this series is uniformly bounded. Hence the right of (2.5.10) is uniformly bounded, whence it follows that

    1 − |y_∞(λ)|² = O(λ⁻¹),
which completes the proof.

To set up a connection between this case of an infinite eigenvalue and the behavior of the characteristic function, we have:

Theorem 2.5.2. Let (2.5.5–6) hold. Then, as λ → i∞ along the positive imaginary axis,

    f_{∞,α}(λ) ~ −βλ.    (2.5.15)
As in the proof of (2.5.9), we deduce from (2.5.5–6) that

    [log y_∞(λ)]_λ^{i∞} ~ i/(βλ),

whence

    exp(iα) − y_∞(λ) ~ i(βλ)⁻¹ exp(iα).

Substituting this in the expression

    f_{∞,α}(λ) = (2i)⁻¹ {exp(iα) + y_∞(λ)} {exp(iα) − y_∞(λ)}⁻¹,

we deduce (2.5.15). This justifies the use of the notation β for the various quantities in Sections 2.3, 2.4, and 2.5.
2.6. Orthogonality and Expansion Theorem

If we confine attention to the meromorphic case in which there exist eigenvalues λ_r, the roots of (2.5.3), with no finite limit-point, we obtain orthogonality relations very similar to those of Section 1.4. The eigenfunctions are now infinite sequences

    y₀(λ_r), y₁(λ_r), ...,

which for the purposes of orthogonality have to be modified to

    η₀(λ_r), η₁(λ_r), ...,    (2.6.1)

where η_n(λ) is given as before by (1.4.5–6). The latter are orthogonal in a similar manner to (1.4.7) in that

    ∑₀^∞ a_n η_n(λ_r) η̄_n(λ_s) = 0,    r ≠ s.    (2.6.2)

For the proof, we take λ = λ_r, μ = λ_s in (1.4.1) and make m → ∞. Subject to the assumptions (2.2.3–4) of the meromorphic case, the left-hand side of (1.4.1) tends to a limit as m → ∞, which is zero by the boundary condition (2.5.3). The conditions (2.2.3–4) also ensure absolute convergence in (2.6.2), in view of (1.4.10). Parallel to (1.4.8–9), we have the normalization relations

    ∑₀^∞ a_n |η_n(λ_r)|² = ρ_r,    (2.6.3)

where

    ρ_r = y_∞′(λ_r)/{iy_∞(λ_r)}    (2.6.4)
        = ∑₀^∞ a_n (1 + iλ_r c_n)⁻¹ (1 − iλ_r c̄_n)⁻¹,    (2.6.5)

as in (2.5.13).
If ∞ is an eigenvalue, according to (2.5.5–6), we have in addition a corresponding eigenfunction, the sequence

    η₀†, η₁†, ...,    (2.6.6)

where η_n† is as defined in (1.4.18). These are orthogonal to the sequences (2.6.1) in that

    ∑_{n=0}^{∞} a_n η_n(λ_r) η̄_n† = 0.    (2.6.7)

The proof consists in taking μ = λ_r in (1.4.1), with λ on the positive imaginary axis, making first m → ∞ and then λ → i∞. The first process gives

    (λ − λ_r) ∑₀^∞ a_n η_n(λ) η̄_n(λ_r) = i{1 − y_∞(λ) exp(−iα)}.    (2.6.8)

We get (2.6.7) formally on making λ → i∞, in view of (1.4.18). This limiting process may be justified by uniform convergence. Since c_n → 0, by (2.2.3), we have from (1.4.10) that

    η_n(λ) = O(1).

From (1.4.5) we have, since |y_n(λ)| < 1 when Im λ > 0,

    |η_n(λ)| < |1 − iλc̄_n|⁻¹ < |iλc_n|⁻¹,

since −iλ is real and positive and Rl c_n > 0. Hence

    (λ − λ_r) η_n(λ) = O(1/|c_n|).

Hence the series (2.6.8) will be uniformly convergent, for λ on the positive imaginary axis, if

    ∑₀^∞ a_n/|c_n|

is absolutely convergent. This follows, using the Cauchy inequality, from (2.2.4) and (2.5.8); the latter was found to be necessary for ∞ to be an eigenvalue. Finally, there is the normalization relation

    ∑₀^∞ a_n |η_n†|² = 1/β.    (2.6.9)

This is the same as (2.5.8).
Next there will be a second set of orthogonality relations, dual to (2.6.2), (2.6.3), and parallel to (1.4.12). In the finite-dimensional case of Section 1.4, these dual relations were deduced as a direct consequence of the orthogonality of the eigenfunctions. In the present infinite-dimensional case such a deduction is no longer possible. Instead, we use the method of limiting transition as m → ∞ from the finite-dimensional case. We make m → ∞ in (1.5.3). Since y_m(λ) tends to a meromorphic limit as m → ∞, the eigenvalues of the finite-dimensional problem and the normalization constants will tend as m → ∞ to the corresponding quantities for the infinite problem, the λ_r as given by (2.5.3) and the ρ_r as given by (2.6.5). Hence the spectral function τ_{m,α}(λ) will tend to a limit τ_{∞,α}(λ) defined by

    τ_{∞,α}(λ) = ∑_{0<λ_r≤λ} 1/ρ_r,    λ > 0,    (2.6.10)

    τ_{∞,α}(λ) = −∑_{λ<λ_r≤0} 1/ρ_r,    λ ≤ 0.    (2.6.11)

Making m → ∞ in (1.5.3), and using Theorem 2.3.1, we get

    ∫_{−∞}^{∞} η_j(λ) η̄_k(λ) dτ_{∞,α}(λ) + β η_j† η̄_k† = δ_{jk}/a_j,    (2.6.12)

for j, k = 0, 1, 2, ... . The term in β is to be omitted unless ∞ is an eigenvalue, that is to say, unless (2.5.5–6) hold, and so also (2.5.8). The eigenfunction expansion is an immediate consequence of these dual orthogonality relations (see Appendix III). In conformity with the orthogonality (2.6.2), it relates to arbitrary infinite sequences

    u₀, u₁, u₂, ...,    (2.6.13)

subject to

    ∑₀^∞ a_n |u_n|² < ∞.    (2.6.14)

We write

    ω(λ) = ∑₀^∞ a_n u_n η̄_n(λ),    (2.6.15)

so that ω(λ_r) is the Fourier coefficient of (2.6.13) with respect to the eigenfunction (2.6.1); the series will be absolutely convergent for real λ
by (2.6.14), (2.2.4), and (1.4.10), in view of (2.2.3). For the event of an infinite eigenvalue we define likewise

    ω† = ∑₀^∞ a_n u_n η̄_n†.    (2.6.16)

The eigenfunction expansion is then

    u_n = ∫_{−∞}^{∞} ω(λ) η_n(λ) dτ_{∞,α}(λ) + β ω† η_n†,    (2.6.17)

    ∑₀^∞ a_n |u_n|² = ∫_{−∞}^{∞} |ω(λ)|² dτ_{∞,α}(λ) + β |ω†|²,    (2.6.18)

where the term in β is omitted if ∞ is not an eigenvalue.
2.7. A Continuous Spectrum

In the case in which (2.2.3) holds but not (2.2.4), we have the limit-point case in which y_n(λ) → 0 as n → ∞ for λ in the upper half-plane. In this case the direct orthogonality (2.6.2) fails. The eigenvalues cannot be defined and the series (2.6.2) will not be absolutely convergent, nor will the series in (2.6.3). Nevertheless, the dual orthogonality of the type of (2.6.12) remains in force; we remark in passing that the opposite situation prevails for (2.1.1), in that the eigenfunctions are orthogonal by (2.1.4), but the dual orthogonality has no sense. To adapt (2.6.12) to the present case, we take τ(λ) = λ/(2π), β = 0, and the formulas in question are

    ∫_{−∞}^{∞} η_j(λ) η̄_k(λ) d{λ/(2π)} = δ_{jk}/a_j.    (2.7.1)

These formulas are true with the sole assumption that the c_n have positive real part; they may be verified directly, for example, by contour integration, using the expressions (1.4.10), (1.9.10–11). An expansion theorem may be deduced. For example, if in the sequence (2.6.13) all members beyond some point are zero, and ω(λ) is defined by (2.6.15), then

    u_n = ∫_{−∞}^{∞} ω(λ) η_n(λ) d{λ/(2π)}.    (2.7.2)
For the proof we need only substitute for ω(λ) and integrate term by
term, using (2.7.1), the integrals converging absolutely. The restriction that the sequence (2.6.13) vanish beyond some point is clearly too severe, but we shall not investigate this further here.
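The dual orthogonality (2.7.1) lends itself to direct numerical verification. The sketch below takes two illustrative real constants c₀ = 1, c₁ = 1/2 (so a_n = 2c_n), forms η_n(λ) = y_n(λ)(1 − iλc̄_n)⁻¹ as in (1.4.5–6), and evaluates the integrals by Simpson's rule after the substitution λ = tan t:

```python
import math

# Dual orthogonality (2.7.1) for tau(lam) = lam/(2*pi), beta = 0, with
# illustrative real constants; with real c_n the conjugates drop out.
c = [1.0, 0.5]                       # a_n = 2 * c_n, so expect 1/2 and 1

def eta(n, lam):
    y = 1.0 + 0j
    for p in range(n):
        y *= (1 + 1j * lam * c[p]) / (1 - 1j * lam * c[p])
    return y / (1 - 1j * lam * c[n])

def dual(j, k, steps=20000):
    # integral over the real axis via lam = tan(t), d lam = sec^2(t) dt,
    # Simpson's rule on t in (-pi/2, pi/2); steps must be even
    eps = 1e-9
    a, b = -math.pi / 2 + eps, math.pi / 2 - eps
    h = (b - a) / steps
    total = 0j
    for m in range(steps + 1):
        t = a + m * h
        lam = math.tan(t)
        w = 1.0 if m in (0, steps) else (4.0 if m % 2 else 2.0)
        total += w * eta(j, lam) * eta(k, lam).conjugate() / math.cos(t) ** 2
    return total * h / 3 / (2 * math.pi)

print(dual(0, 0), dual(0, 1), dual(1, 1))   # expect 1/a_0, 0, 1/a_1
```

The diagonal values come out as 1/a₀ = 1/2 and 1/a₁ = 1, and the cross term vanishes, as the contour-integration argument predicts.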
2.8. Moment and Interpolation Problem

Of the various questions of inverse type raised in Sections 1.7–1.10, we take up here the determination of the spectral function by the orthogonality. The property postulated is that for some nondecreasing right-continuous function τ(λ), of bounded variation in any finite interval and such that, in addition to τ(0) = 0,

    ∫_{−∞}^{∞} dτ(λ)/(1 + λ²) < ∞,    (2.8.1)

and for some constant β ≥ 0, we are to have

    ∫_{−∞}^{∞} η_j(λ) η̄_k(λ) dτ(λ) + β η_j† η̄_k† = δ_{jk}/a_j,    (2.8.2)

for all j, k = 0, 1, ... . Here η_j(λ), η_j† are given as before by (1.4.6), (1.4.18), for a given set of constants c_n with positive real part, while β does not necessarily have the value given by (2.5.8). As we have seen, the problem of finding such τ(λ), β has at any rate one solution, given by τ(λ) = λ/(2π), β = 0.

The problem may be considered as a moment problem, "determinate" if the solution is unique, namely, that just mentioned; and "indeterminate" if there is more than one, and so an infinity of solutions. As in Section 1.9, a first step is to replace this by a simpler moment problem. In modification of (1.9.4), we define now

    f(λ) = γ − βλ + ∫_{−∞}^{∞} {(λ − μ)⁻¹ + μ(1 + μ²)⁻¹} dτ(μ),    (2.8.3–4)

this modification being necessary to ensure absolute convergence in (2.8.3–4) on the basis of (2.8.1). Here γ is an indeterminate real quantity. We can then assert that it is sufficient for (2.8.2) that

    f(i/c_p) = −½i,    p = 0, 1, 2, ...,    (2.8.5)

if the c_p are all distinct; if they are not all distinct, and s of them coincide
in some value c, we are to have s − 1 additional differentiated equations, namely,

    f′(i/c) = f″(i/c) = ... = f^(s−1)(i/c) = 0.    (2.8.6)

Let us for example deduce (2.8.2) with j = k. On taking imaginary parts in (2.8.3), λ being complex, we get

    Im{f(λ)} = −Im{λ} ∫_{−∞}^{∞} dτ(μ)/|λ − μ|² − β Im{λ}.    (2.8.7)
Putting λ = i/c_j we have, in view of (2.8.5),

    ½ = Im{i/c_j} {∫_{−∞}^{∞} dτ(μ) |i/c_j − μ|⁻² + β},

or

    1 = (c_j + c̄_j) {∫_{−∞}^{∞} dτ(μ) |1 + iμc_j|⁻² + β |c_j|⁻²},
which is equivalent to (2.8.2) with j = k, by (1.4.10), (1.4.18), and (2.2.5). We rewrite the result to be proved, (2.8.2), to deal with the case j ≠ k, in the form

    ∫_{−∞}^{∞} η_j(μ) η̄_k(μ) dτ(μ) + β η_j† η̄_k† = 0.    (2.8.8)
Supposing for definiteness that j < k, and for simplicity that the c_n are all distinct, we have by (1.9.7) the partial fraction representation, for real μ,

    η_j(μ) η̄_k(μ) = ∑_{p=j}^{k} d_p (μ − i/c_p)⁻¹,

where

    ∑_{p=j}^{k} d_p = 0,    (2.8.9)

and so, by an easy calculation,

    η_j† η̄_k† = i ∑_{p=j}^{k} d_p/c_p.

The result (2.8.8) to be proved then assumes the form

    ∑_{p=j}^{k} d_p f(i/c_p) = 0.
This follows on taking linear combinations of the results (2.8.5) and using (2.8.9). We omit the details of the calculations for the event that the c_n are not all distinct, as also the proof that the conditions (2.8.5–6) are necessary for (2.8.2); these are closely similar to arguments given in Section 1.9.

The conclusion is that (2.8.2) may be replaced by (2.8.5–6), which is a moment problem also, in which we consider the moments of the elementary functions (λ − μ)⁻¹, λ = i/c_p, p = 0, 1, ... . A still simpler formulation is obtained if we observe that (2.8.4) is the general expression of a function f(λ) which is regular in Im λ > 0, and satisfies there Im f(λ) ≤ 0. We are now asking for a function of this class with the interpolatory properties (2.8.5–6). As previously mentioned, this falls within the topic of the Pick-Nevanlinna problem.

For its solution, we impose to begin with m of the conditions (2.8.5), with such of (2.8.6) as may be relevant in the event of c₀, ..., c_{m−1} not being all distinct. The function

    χ(λ) = {2if(λ) − 1}/{2if(λ) + 1}    (2.8.10)

is regular in the upper half-plane and satisfies |χ(λ)| ≤ 1, since Im f(λ) ≤ 0. It also vanishes when λ = i/c_p, p = 0, ..., m − 1, having multiple zeros in the case of coincident c_p. Furthermore, since f(λ) takes complex conjugate values at complex conjugate points, χ(λ) has poles at the points −i/c̄_p. Hence it must have the form

    χ(λ) = y_m(λ) ψ(λ),    (2.8.11)

where y_m(λ) is as previously, and ψ(λ) is regular in Im λ > 0. We have |ψ(λ)| = 1 when λ is real, since y_m(λ), χ(λ) also have this property. Also, |y_m(λ)| → 1 as |λ| → ∞, so that for suitably large |λ|, with Im λ ≥ 0, we shall have |ψ(λ)| < 1 + ε for any chosen ε > 0; it now follows from the maximum-modulus principle that |ψ(λ)| ≤ 1 for Im λ ≥ 0. Thus

    f(λ) = (2i)⁻¹ {1 + y_m(λ)ψ(λ)}/{1 − y_m(λ)ψ(λ)}.    (2.8.12)

Conversely, it may be verified that such a function has the properties of being regular in Im λ > 0, with Im f(λ) ≤ 0, and satisfying (2.8.5) for p = 0, ..., m − 1.

Having solved the finite moment problem, we consider the effect of making m → ∞. This may be viewed geometrically in terms of nesting circles. If in (2.8.11) we take λ as fixed with Im λ > 0 and so y_m(λ) as fixed with |y_m(λ)| < 1, we may treat ψ(λ) as disposable subject to |ψ(λ)| ≤ 1. In the form (2.8.10) this means that

    |{2if(λ) − 1}/{2if(λ) + 1}| ≤ |y_m(λ)|.
This clearly restricts f(λ), for fixed λ, to a circle. Since |y_m(λ)| decreases as m increases, these circles shrink or nest. We obtain as a result a limit-point if y_m(λ) → 0 as m → ∞, and otherwise a limit-circle. These correspond to the determinate and the indeterminate cases, respectively, in that if y_m(λ) → 0 for all λ in Im λ > 0, f(λ) must tend to a unique limit, in fact to −½i.
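The Cayley-type map (2.8.10) and its inversion to the form (2.8.12) can be sanity-checked numerically: points with Im f < 0 land inside the unit disk, the map round-trips, and the limit-point value f = −½i goes to the center χ = 0. A small sketch (the sample points are illustrative):

```python
# chi = (2i f - 1)/(2i f + 1) maps {Im f <= 0} into the closed unit
# disk; its inverse is f = (1/(2i)) * (1 + chi)/(1 - chi), the relation
# underlying (2.8.12) with y_m * psi playing the role of chi.
def to_disk(f):
    return (2j * f - 1) / (2j * f + 1)

def from_disk(chi):
    return (1 + chi) / (1 - chi) / 2j

samples = [-0.5j, 1 - 2j, -3 - 0.25j, 0.1 - 5j]   # all with Im f < 0
for f in samples:
    chi = to_disk(f)
    assert abs(chi) < 1                        # interior of the disk
    assert abs(from_disk(chi) - f) < 1e-12     # round trip

assert to_disk(-0.5j) == 0   # the limit point f = -i/2 maps to the center
```

This makes the determinate case geometrically plain: as the disk available to y_mψ shrinks to the origin, f is pinned to −½i.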
2.9. A Mixed Boundary Problem

As a check on the generality of the recurrence relation (2.2.1) as a source of boundary problems, we apply the procedure of the "first inverse spectral problem" of Section 1.7, in which we start with the spectral function and boundary condition, and attempt to recover the recurrence relation. In the situation of Sections 2.5–6, when the infinite recurrence relation yielded a meromorphic limit, the spectral function was a nondecreasing step function τ(λ), whose points of increase have no finite limit-point, satisfying the boundedness condition (2.3.6); there was also a constant β ≥ 0, which could be considered as a jump of τ(λ) at infinity. The question is then whether, given such a τ(λ) and a β ≥ 0, and a real α prescribing the boundary condition, we can find a recurrence relation yielding this τ(λ) according to (2.6.10–11), and also β corresponding to an infinite eigenvalue, if β > 0.

Solving this problem according to the method of Section 1.7, we form the characteristic function f(λ) according to (2.8.3–4), and y(λ) according to

    y(λ) = exp(iα) {2if(λ) − 1}/{2if(λ) + 1}.    (2.9.1)
Here γ in (2.8.3–4) is to be fixed so that y(0) = 1; as in Section 1.7, γ is fixed uniquely if exp(iα) ≠ 1 and λ = 0 is not a point of increase of τ(λ), and in the contrary case is arbitrary, subject to being real. The problem is then to factorize y(λ) in the form (2.5.2). Considering the general nature of these functions, f(λ) as given by (2.8.3–4) is meromorphic, with its only poles on the real axis, and mapping the upper and lower half-planes into each other; it is indeed the general such function. By (2.9.1), y(λ) will be meromorphic, and satisfy |y(λ)| < 1 if Im λ > 0, |y(λ)| > 1 if Im λ < 0, and |y(λ)| = 1 if Im λ = 0. In addition, we are to arrange that y(0) = 1. However, if we adopt these as the requirements for y(λ), to be the transfer function of some recurrence relation, we find that y(λ) need not be representable in the form (2.5.2), or (1.3.1), since the function exp(icλ), c > 0,
also fulfills these conditions. The general function with these requirements is, in fact, given by a combination of the two types, namely

    y(λ) = exp(icλ) ∏₀^{m−1} {(1 + iλc_n)(1 − iλc̄_n)⁻¹},    (2.9.2)

subject to the conditions

    c real,    c ≥ 0,    Rl{c_n} > 0,    (2.9.3)

    ∑₀^{m−1} Rl{c_n} < ∞,    c_n → 0.    (2.9.4)
The above statement is to include the cases m = 0, c > 0, when (2.9.2) reduces to an exponential only; m finite, in which case the latter conditions in (2.9.4) are unnecessary; and m infinite. Corresponding to (2.9.2), we can form the mixed differential recurrence relation for a function y(x, λ), defined continuously in −c ≤ x ≤ 0, and discretely for x = 0, 1, 2, ..., m, by
y(−c, λ) = 1,   (2.9.5)

y′(x, λ) = iλy(x, λ),  −c ≤ x ≤ 0,  [(′) = d/dx],   (2.9.6)

y(n + 1, λ) = (1 + iλc_n)(1 − iλc̄_n)^{−1} y(n, λ),  n = 0, ..., m − 1,   (2.9.7)

y(m, λ) = γ(λ),   (2.9.8)
where γ(λ) is given by (2.9.2), and y(m, λ) is interpreted in a limiting sense if m = ∞. For the boundary problem we fix a real α, 0 ≤ α < 2π, and require that

γ(λ) = exp(iα).   (2.9.9)
Since γ(λ) has absolute value unity only when λ is real, the eigenvalues λ_r, the roots of (2.9.9), are all real; since γ(λ) is meromorphic, they have no finite point of accumulation. If we restrict ourselves to the case

c > 0,   (2.9.10)

the case c = 0 having been dealt with, then γ(λ) → 0 as λ → i∞ along the positive imaginary axis, so that i∞ will not be a solution, in a limiting sense, of (2.9.9). We are thus concerned only with finite eigenvalues; in the preceding formulas we shall have β = 0.
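Since the statements above are abstract, a small numerical illustration may help. The sketch below uses hypothetical data c = 1, c_n ∈ {0.5, 0.3, 0.2} and α = 1 (none of these values come from the text); it evaluates the transfer function (2.9.2) for real c_n on the real axis, checks that it has modulus one there, and locates the first positive root of (2.9.9) by bisecting the continuous phase ω(λ) = cλ + 2 Σ arctan(λc_n), which is increasing.

```python
import cmath
import math

# Hypothetical data (not from the text): exponential part c, stage
# constants c_n (real, positive), boundary angle alpha in (2.9.9).
c = 1.0
cs = [0.5, 0.3, 0.2]
alpha = 1.0

def gamma(lam):
    """Transfer function (2.9.2) for real c_n."""
    val = cmath.exp(1j * c * lam)
    for cn in cs:
        val *= (1 + 1j * lam * cn) / (1 - 1j * lam * cn)
    return val

def omega(lam):
    """Continuous phase of gamma on the real axis; increasing in lam."""
    return c * lam + 2 * sum(math.atan(lam * cn) for cn in cs)

# |gamma| = 1 on the real axis, so the roots of (2.9.9) are real.
mods = [abs(gamma(lam)) for lam in (-2.0, 0.0, 3.7)]

# First positive eigenvalue: solve omega(lam) = alpha by bisection.
lo, hi = 0.0, 5.0
while hi - lo > 1e-12:
    mid = 0.5 * (lo + hi)
    if omega(mid) < alpha:
        lo = mid
    else:
        hi = mid
lam0 = 0.5 * (lo + hi)
```

With these numbers lam0 comes out near 0.33, and gamma(lam0) agrees with exp(iα) to roughly the bisection tolerance.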
2. THE INFINITE DISCRETE CASE
For eigenfunctions we define the modified function

φ(x, λ) = y(x, λ),  −c ≤ x ≤ 0,   (2.9.11)

the discrete values φ(n, λ) being fixed by (2.9.12). From the identity (2.9.13-14) for real λ, μ [cf. (1.4.1)] there follow the orthogonality of eigenfunctions belonging to distinct eigenvalues and, where as before a_n = 2 Re{c_n}, the evaluation (2.9.15) of the normalization constants in the form

ρ_r = c + ½ Σ_{n=0}^{m−1} a_n |1 + iλ_r c_n|^{−2}   (2.9.16)

= γ′(λ_r)/{iγ(λ_r)}.   (2.9.17)
We shall not give the detailed proofs of these formulas, very similar calculations having been given in Section 1.4.
2.10. A Mixed Expansion Theorem
It is natural to conjecture that associated with the partly continuous, partly discrete boundary problem (2.9.5-9) there should be a corresponding expansion of an arbitrary function in terms of eigenfunctions, both defined continuously in −c ≤ x ≤ 0 and discretely in x = 0, 1, 2, ..., m − 1, where possibly m = ∞. In this section we prove such a result, confining attention to the case when all the c_n are real. We need two subsidiary results, in both of which we take it that the c_n are real and positive.
Lemma 2.10.1. Let λ_r, ρ_r be as defined in Sections 1.3-4 for the boundary problem (1.2.2-3) with finite m. Then for r′ ≥ 1,

Σ_{r>r′} λ_r^{−2} ρ_r^{−1} ≤ (2πλ_{r′})^{−1},   (2.10.1)

Σ_{r<−r′} λ_r^{−2} ρ_r^{−1} ≤ (2π|λ_{−r′}|)^{−1}.   (2.10.2)

From (1.4.9) we have

ρ_r = ω′(λ_r).   (2.10.3)

In the last equation we may replace λ_r by λ, and deduce that λ²ω′(λ) is an increasing function of λ when λ is positive. Thus, when λ is positive, its reciprocal, λ^{−2} dλ/dω, is a decreasing function of λ. Now as λ increases from λ_{r−1} to λ_r, ω will increase monotonically by an amount 2π, and λ will be positive if r > 1. Hence

λ_r^{−2} ρ_r^{−1} ≤ (2π)^{−1} ∫_{λ_{r−1}}^{λ_r} λ^{−2} dλ,

and the result (2.10.1) follows on summing over r = r′ + 1, ... . The proof of (2.10.2) is entirely similar.

The next result plays a similar role to those, such as the Riemann-Lebesgue lemma, asserting that if a function is sufficiently smooth, then its Fourier coefficients tend to zero with a certain rapidity.
Lemma 2.10.2. Writing

η_n(λ) = ∏_{r=0}^{n−1} (1 + iλc_r) ∏_{r=0}^{n} (1 − iλc_r)^{−1},  n = 0, ..., m − 1,   (2.10.5)

we have, for real λ and arbitrary u_0, ..., u_{m−1},

|Σ_{n=0}^{m−1} a_n u_n η_n(λ)| ≤ |λ|^{−1} {|u_0| + |u_{m−1}| + Σ_{n=0}^{m−2} |u_{n+1} − u_n|}.   (2.10.6)

For the proof, we have with the notation

ζ_n(λ) = ∏_{r=0}^{n−1} (1 + iλc_r)(1 − iλc_r)^{−1},   (2.10.7)
the identity

ζ_n(λ) − ζ_{n+1}(λ) = −2iλc_n η_n(λ).

Hence

Σ_{n=0}^{m−1} a_n u_n η_n(λ) = (iλ)^{−1} Σ_{n=0}^{m−1} u_n {ζ_{n+1}(λ) − ζ_n(λ)},

and the result follows on partial summation and taking absolute values, noting that |ζ_n(λ)| = 1 for real λ.

The expansion theorem to be established is, formally, that for given u(x), −c ≤ x ≤ 0, x = 0, 1, 2, ..., we define v(λ) by (2.10.8), where φ(x, λ) is given by (2.9.11-12), and then have firstly the eigenfunction expansion (2.10.9), where a_n = 2c_n, and secondly, to some extent equivalently, the Parseval equality

∫_{−c}^{0} |u(x)|² dx + Σ_{n=0}^{m−1} a_n |u(n)|² = Σ_r |v(λ_r)|² ρ_r^{−1}.   (2.10.10)
We prove the result in this latter form.
Theorem 2.10.3. Let u(x) be of bounded variation over (−c, 0), and also over x = 0, 1, 2, ..., if m = ∞, in the sense that

Σ_{n=0}^{∞} |u(n + 1) − u(n)| < ∞.   (2.10.11)

Then (2.10.10) holds.

For simplicity we shall assume that m = ∞; this will include the case of finite m if formally we set a_m = a_{m+1} = ... = 0. We continue to
assume c > 0. For any positive integer s we form an approximating recurrence relation with 2s stages based on real constants c_0^(s), ..., c_{2s−1}^(s), where

c_0^(s) = c_1^(s) = ... = c_{s−1}^(s) = c/(2s),   (2.10.12)

c_s^(s) = c_0,  ...,  c_{2s−1}^(s) = c_{s−1}.   (2.10.13)

Here c_0, c_1, ... are the constants appearing in the boundary problem (2.9.5-9), assumed real and to satisfy (2.9.4), save that if m is finite, c_m = c_{m+1} = ... = 0. Writing

γ^(s)(λ) = ∏_{n=0}^{2s−1} (1 + iλc_n^(s))(1 − iλc_n^(s))^{−1},   (2.10.14)
we have

γ^(s)(λ) = [1 + iλc/(2s)]^s [1 − iλc/(2s)]^{−s} ∏_{n=0}^{s−1} (1 + iλc_n)(1 − iλc_n)^{−1}.   (2.10.15)

Since [1 + iλc/(2s)]^s [1 − iλc/(2s)]^{−s} → exp(iλc) as s → ∞, we have, on comparison with (2.9.2), for real c_n, that

γ^(s)(λ) → γ(λ)  as  s → ∞,   (2.10.16)

uniformly in any λ-region of the form

|λ| ≤ Λ,  |λ + i/c_n^(s)| ≥ δ,  n = 0, ..., 2s − 1,   (2.10.17)
where Λ is arbitrary, and δ is such that this region does not include any of the poles −i/c_n^(s). Next we set up eigenvalues and normalization constants for the approximating problem, observing that these tend to the corresponding quantities for the problem (2.9.5-9). We denote by λ_r^(s) the roots of

γ^(s)(λ) = exp(iα),   (2.10.18)

identifying them as in Lemma 2.10.1. Since (2.10.16) holds uniformly in any region (2.10.17), we have

λ_r^(s) → λ_r  as  s → ∞,   (2.10.19)

λ_r being the root of (2.9.9), which is identified in the same manner; the latter are, of course, simple roots of (2.9.9) in view of (2.9.16-17).
Similarly, the normalization constants ρ_r^(s) associated with λ_r^(s) will, by (1.4.9), be given by

ρ_r^(s) = γ^(s)′(λ_r^(s))/{iγ^(s)(λ_r^(s))},   (2.10.20)

and on comparison with (2.9.17), we see that

ρ_r^(s) → ρ_r  as  s → ∞.   (2.10.21)
Here we rely on the convergence (2.10.16) in the sense of the uniform convergence of analytic functions. We may now set up the Parseval equality for the approximating problem and carry out the limiting transition. We write

η_n^(s)(λ) = ∏_{r=0}^{n−1} (1 + iλc_r^(s)) ∏_{r=0}^{n} (1 − iλc_r^(s))^{−1},   (2.10.22)

so that, if n ≤ s,

η_n^(s)(λ) = [1 + iλc/(2s)]^n [1 − iλc/(2s)]^{−n−1} → exp(iλcn/s)   (2.10.23)

as s → ∞, where n may vary with s, uniformly in the λ-region (2.10.17). If n ≥ s, say n = s + t, then

η_n^(s)(λ) → exp(iλc) ∏_{r=0}^{t−1} (1 + iλc_r) ∏_{r=0}^{t} (1 − iλc_r)^{−1}.   (2.10.24)

We define, for the approximating problem,

v^(s)(λ) = Σ_{r=0}^{s−1} (c/s) u(−c + rc/s) η_r^(s)(λ) + Σ_{t=0}^{s−1} a_t u(t) η_{s+t}^(s)(λ).   (2.10.25)

If we use (2.10.23-24) and make s → ∞, the first sum becomes an integral, the second, in general, an infinite series, and so we get, on comparison with (2.10.8),

v^(s)(λ) → v(λ),  as  s → ∞.   (2.10.26)

In considering the limiting transition as applied to the second sum in (2.10.25), we use the fact that Σ a_t < ∞, that |u(t)| is bounded, by (2.10.11), and that, in the present case, |η_n^(s)(λ)| ≤ 1 if λ is real. The finite-dimensional Parseval equality then assures us that, by (1.4.15),

Σ_{r=0}^{s−1} (c/s) |u(−c + rc/s)|² + Σ_{t=0}^{s−1} a_t |u(t)|² = Σ_r |v^(s)(λ_r^(s))|² {ρ_r^(s)}^{−1}.   (2.10.27)
We now make s → ∞. On the left, the first sum becomes an integral and the second sum becomes a sum which is in general infinite, and we get the left of (2.10.10). If we take s → ∞ on the right-hand side of (2.10.27), the individual terms, for fixed r, tend to the corresponding terms on the right of (2.10.10). To justify the process, we observe that the series on the right of (2.10.27) is uniformly convergent; strictly speaking, the right of (2.10.27) is a finite sum, but over a range of r-values which increases indefinitely with s, and the theory of uniform convergence can be applied. By Lemma 2.10.2, applied to the sum (2.10.25), we have

|v^(s)(λ)| ≤ c′/|λ|,

where the constant c′ depends on the magnitude and variation of u(x), but not on λ or on s. Hence the right of (2.10.27) is uniformly convergent if this is true of

Σ_r {λ_r^(s)}^{−2} {ρ_r^(s)}^{−1},

and this is ensured by Lemma 2.10.1. Alternatively, we may carry out the limiting process s → ∞ in a finite number of terms of the series on the right of (2.10.27), estimating the remainder by means of Lemmas 2.10.1-2, then making the number of terms tend to infinity. This completes the proof.

In taking the continuous part of the recurrence relation first, and the discrete part subsequently, we have relied on the fact that in the scalar case the order of the factors in (2.9.2) is immaterial. For matrix cases such a simplification is not possible.
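The one analytic ingredient of the approximation that is easy to isolate numerically is the convergence of the continuous-part factors to the exponential, used in (2.10.15-16). A quick sketch, with hypothetical values of λ and c not taken from the text:

```python
import cmath

# With c_r^(s) = c/(2s) for r < s, the continuous part of (2.10.14) is
# [1 + i*lam*c/(2s)]^s [1 - i*lam*c/(2s)]^(-s), which tends to exp(i*lam*c).
lam, c = 1.3, 2.0

def approx(s):
    z = (1 + 1j * lam * c / (2 * s)) / (1 - 1j * lam * c / (2 * s))
    return z ** s

exact = cmath.exp(1j * lam * c)
errs = [abs(approx(s) - exact) for s in (10, 100, 1000)]
```

The error behaves like O(s^{-2}), consistent with the uniform convergence asserted in (2.10.16).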
2.11. Further Boundary Problems

The previous boundary problems by no means exhaust the category of those for which orthogonality and expansion theorems hold. To illustrate by a simple example, we consider solutions of

y′ = −iλ^{−1}y  (−1 ≤ x ≤ 0),  y′ = iλy  (0 ≤ x ≤ 1),   (2.11.1)

subject to, to take the simplest boundary condition,

y(−1) = y(1) ≠ 0.   (2.11.2)

Here y is continuous, y′ having a discontinuity at x = 0. The basic property that ȳy is constant in x holds, if λ is real, as also the property
that (ȳy)′ < 0 if Im λ > 0, and vice versa. From this we may deduce that the eigenvalues are all real. In fact, in this case, they are easily calculated. Since from (2.11.1) it follows that

y(1) = exp(iλ) y(0) = exp(iλ − iλ^{−1}) y(−1),

the eigenvalues are given by

λ − λ^{−1} = 2nπ,  n = 0, ±1, ±2, ... .   (2.11.3)
Denoting by λ_r, λ_s two distinct eigenvalues, and by y(x, λ_r), y(x, λ_s) two corresponding solutions of (2.11.1), it may be verified that there holds the orthogonality

∫_0^1 y(x, λ_r) ȳ(x, λ_s) dx + (λ_r λ_s)^{−1} ∫_{−1}^0 y(x, λ_r) ȳ(x, λ_s) dx = 0.   (2.11.4)
We shall not take up these problems here. The one just mentioned may serve as representative of those for which the transfer function, in this case exp(iλ − i/λ), maps the upper and lower half-planes into the interior and exterior of the unit circle, being analytic almost everywhere on the real axis. In a more general class of problem, the transfer function will merely map the upper half-plane into the interior of the unit circle.
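The orthogonality (2.11.4) can also be checked numerically. The sketch below builds explicit solutions of (2.11.1), normalized by y(−1) = 1 (an assumption of convenience): y(x, λ) = exp(−i(x+1)/λ) for −1 ≤ x ≤ 0 and y(x, λ) = exp(−i/λ) exp(iλx) for 0 ≤ x ≤ 1, and evaluates (2.11.4) by Simpson's rule for two distinct eigenvalues.

```python
import cmath
import math

def eig(n):
    """Positive root of lam - 1/lam = 2*n*pi."""
    return n * math.pi + math.sqrt((n * math.pi) ** 2 + 1)

def y(x, lam):
    """Solution of (2.11.1) with y(-1) = 1, continuous at x = 0."""
    if x <= 0:
        return cmath.exp(-1j * (x + 1) / lam)
    return cmath.exp(-1j / lam) * cmath.exp(1j * lam * x)

def simpson(f, a, b, n=2000):
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

lr, ls = eig(1), eig(2)
inner = simpson(lambda x: y(x, lr) * y(x, ls).conjugate(), 0.0, 1.0) \
      + (1.0 / (lr * ls)) * simpson(lambda x: y(x, lr) * y(x, ls).conjugate(), -1.0, 0.0)
```

The weight (λ_rλ_s)^{−1} on (−1, 0) is exactly what makes the two contributions cancel.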
CHAPTER 3
Discrete Linear Problems
3.1. Problems Linear in the Parameter

In the previous two chapters we studied a recurrence relation (1.2.2) connecting successive complex numbers y_n, y_{n+1} with the property of preserving length, |y_n| = |y_{n+1}|, for real parameter values. This relation was in general, of necessity, bilinear in the parameter λ. If we move on to higher-dimensional cases, y_n being a vector, a simpler possibility presents itself, namely that the recurrence relation is linear in the parameter. Although the more general bilinear relation can be studied also in the matrix case, the linear one is of special interest, and is the subject of the next four chapters, with reference to orthogonal polynomials. The recurrence relation in its general form will be

y_{n+1} = (λA_n + B_n) y_n,  n = 0, 1, ...,   (3.1.1)

where y_n is a vector, or rather a k-by-1 column matrix, and the A_n, B_n are k-by-k matrices. In generalization of the length-preserving property of (1.2.2), we postulate that for some fixed k-by-k matrix J and for some suitable λ-set we are to have

y_{n+1}* J y_{n+1} = y_n* J y_n,   (3.1.2)

for all n, where the (*) indicates the complex conjugate transpose. We take J to be nonsingular and symmetric or else skew symmetric, in the real or Hermitean sense, but do not restrict it to be positive-definite. As a suitable λ-set we shall admit either the real axis or the unit circle; we do not wish that (3.1.2) should be a consequence of (3.1.1) for all λ. That circles or straight lines are the appropriate curves may be seen by setting up sufficient conditions for (3.1.2) to hold. Substituting from (3.1.1), we get

y_n*(λA_n + B_n)* J (λA_n + B_n) y_n = y_n* J y_n,
which will be so for any y_n if

(λA_n + B_n)* J (λA_n + B_n) = J.   (3.1.3)

If this equation be written out in terms of the entries of the matrices, we should get k² equations of the form

aλλ̄ + bλ + b̄λ̄ + c = 0,

which, if representing any curves, represent circles or straight lines.
A linear transformation in λ can transform the curve in question into the unit circle or the real axis, as the case may be.

To complete the boundary problem in the finite discrete case we suppose that the recurrence relation (3.1.1) is defined for n = 0, ..., m − 1, yielding a sequence of k-vectors y_0, ..., y_m, which is determinate when the first is known. We impose boundary conditions on y_0, y_m of the following form. There are prescribed boundary matrices M, N, square and of the k-th order, and subject to

M*JM = N*JN;   (3.1.4)

they are also to have no common null-vectors, i.e., Mv = Nv = 0, v a column vector, must imply v = 0. We ask for solutions of (3.1.1) such that there exists a column matrix v ≠ 0 with the property that

y_0 = Mv,  y_m = Nv.   (3.1.5)

The requirement that M, N have no common null-vector ensures that at least one of y_0, y_m does not vanish, in the sense of the vanishing of all their entries. By the recurrence relation, y_m ≠ 0 implies y_0 ≠ 0, so that in any event y_0 ≠ 0. The requirement (3.1.4), together with (3.1.5), may be thought of as requiring that y_0, y_m should be of the same "length," in the sense that

y_0* J y_0 = y_m* J y_m.   (3.1.6)

For by (3.1.5) this is equivalent to

v*M*JMv = v*N*JNv,

which is, of course, implied by (3.1.4). We may express this by saying that the sequence of mappings (3.1.1) form, when applied successively and starting with the particular vector y_0, an "isometry," in that

(λA_{m−1} + B_{m−1}) ⋯ (λA_0 + B_0) y_0
has the same length as y_0. With certain additional restrictions, the conclusion can be drawn that each of the separate mappings (3.1.1) is an isometry, and that λ, an eigenvalue, must lie on the real axis, or the unit circle, as the case may be.
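As a concrete instance of the condition (3.1.3), consider the hypothetical 2-by-2 pair A = [0, 0; 1, 0], B = E with J = [0, 1; −1, 0] (a shear, anticipating Section 3.5); λA + B is then J-unitary precisely for real λ, so the invariance (3.1.2) picks out the real axis as the λ-set. A minimal sketch:

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj_transpose(X):
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

J = [[0, 1], [-1, 0]]

def defect(lam):
    """Largest entry of (lam*A + B)* J (lam*A + B) - J for A = [[0,0],[1,0]], B = E."""
    M = [[1, 0], [lam, 1]]  # lam*A + B
    P = matmul(conj_transpose(M), matmul(J, M))
    return max(abs(P[i][j] - J[i][j]) for i in range(2) for j in range(2))
```

The defect vanishes for real λ but not, for example, at λ = i.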
3.2. Reduction to Canonical Form

Without loss of generality, the matrix J characterizing the invariant quadratic form may be supposed to have one of certain special forms, in particular, a diagonal form made up of 1's and −1's. To make such a reduction, suppose that J is related to a J_0 of some special form by

J = K*J_0K,   (3.2.1)

where K is nonsingular. If we define

Ky_n = y_n†,   (3.2.2)

the invariance relation (3.1.2) becomes another of the same form,

y_{n+1}†* J_0 y_{n+1}† = y_n†* J_0 y_n†.   (3.2.3)

The recurrence relation becomes, in terms of the y_n†,

y_{n+1}† = K(λA_n + B_n)K^{−1} y_n†,   (3.2.4)

and the boundary conditions are now that, for some v ≠ 0,

y_0† = KMv,  y_m† = KNv.   (3.2.5)

The new boundary matrices KM, KN satisfy, by (3.1.4) and (3.2.1),

(KM)* J_0 (KM) = (KN)* J_0 (KN),   (3.2.6)

which correspond in form to (3.1.4). Thus the boundary problem has been transformed to one of the same type, with J_0 for J.

Suppose that J is Hermitean, and as always nonsingular. In this case we may connect J with a J_0 according to (3.2.1), where J_0 is diagonal, with diagonal entries 1, −1 corresponding to the positive and negative
eigenvalues of the matrix J. If J has p positive and q negative eigenvalues, p + q = k, we may take J_0 in the form

J_0 = [E_p, 0; 0, −E_q],   (3.2.7)

where E_p, E_q are the pth order and qth order unit matrices, the matrix J_0 being completed with zeros. Another important case is that in which J is real and skew symmetric; since J is to be nonsingular, k must be even. In this case we may take

J_0 = [0, E_{k/2}; −E_{k/2}, 0].   (3.2.8)

Having arranged that J should be in a suitable form, we may next standardize the form of the recurrence relation. If now we write, for some nonsingular H_n,

H_n y_n = z_n,   (3.2.9)

the recurrence relation becomes

z_{n+1} = H_{n+1}(λA_n + B_n)H_n^{−1} z_n,   (3.2.10)

and the boundary conditions are

z_0 = H_0Mv,  z_m = H_mNv.   (3.2.11)

Provided that the H_n satisfy

H_n*JH_n = J,   (3.2.12)

it may be verified that the new problem (3.2.10-11) has the same properties as the old one, in that z_n*Jz_n is independent of n, for λ real or on the unit circle as the case may be, while (3.1.4) holds with H_0M, H_mN for M, N. We can then choose the H_n successively so as to make the right-hand side of (3.2.10) have some special form; the details will, of course, vary from case to case.

We shall have much occasion to consider the set of matrices Z such that, for some fixed J,

Z*JZ = J.   (3.2.13)

In the special case when J = E, the unit matrix, these form the set of unitary matrices. In the general case we may term them J-unitary. It will always be the case that J is nonsingular, in which case the J-unitary matrices form a group, the group U(J) say. In the case (3.2.8) this group is sometimes termed the symplectic group.
3.3. The Real Axis Case

We now give the form of some boundary problems satisfying the restrictions of Section 3.1. We take in this section the case in which the quadratic J-form is invariant for λ on the real axis. We first find some recurrence relations which have this property. Taking a general recurrence relation as typified by the transformation

y† = (λA + B)y,   (3.3.1)

the property required is that if a second column matrix z is transformed likewise,

z† = (λA + B)z,   (3.3.2)

then z*Jy is unchanged, provided that λ is real. This is equivalent to

(λA* + B*) J (λA + B) = J   (3.3.3)

for all real λ, and indeed for all λ; in other words, λA + B is to be J-unitary for real λ. Comparing powers of λ in (3.3.3), we wish to find matrices A, B satisfying

B*JB = J,   (3.3.4)

A*JB + B*JA = 0,   (3.3.5)

A*JA = 0.   (3.3.6)

If we write

λA + B = (λA_0 + E)B,   (3.3.7)

where B is J-unitary according to (3.3.4), it will be sufficient to ensure that λA_0 + E is J-unitary for real λ. For this case (3.3.5-6) become

A_0*J + JA_0 = 0,  A_0*JA_0 = 0.   (3.3.8)

We now take J to be skew-Hermitean, so that

J* = −J.   (3.3.9)

Then the first of (3.3.8) may be written

(JA_0) = (JA_0)*,

so that if we define JA_0 = C, we may replace (3.3.8) by

C = C*,  C*J^{−1}C = 0.   (3.3.10)
Thus suitable matrices A, B are given by

λA + B = (λJ^{−1}C + E)B,   (3.3.11)

where B is J-unitary, C is symmetric in the Hermitean sense, and satisfies the second of the equations (3.3.10). Our problem is thus reduced to finding Hermitean matrices C such that C*J^{−1}C = 0. If J has all its eigenvalues of the same sign, in the imaginary sense, that is to say, if J/i, which has its eigenvalues real, is positive-definite or negative-definite, this problem has only the trivial solution C = 0. For if say J/i > 0, then iJ^{−1} > 0, and so C*(iJ^{−1})C ≥ 0 if C ≠ 0, in the sense that C*(iJ^{−1})C = 0 is excluded. Suppose then that J has (imaginary) eigenvalues of both signs; this is assured if k is even and J is real and skew symmetric. In this case (3.3.10) has nontrivial solutions. Let p ≥ 1 be an integer such that J has at least p eigenvalues of both signs. It may be shown that there exists an "isotropic" set of p vectors ξ_1, ..., ξ_p, that is to say, column matrices, such that

ξ_r*J^{−1}ξ_s = 0,  r, s = 1, ..., p.   (3.3.12)

If then we put

C = Σ_{r,s=1}^{p} γ_{rs} ξ_r ξ_s*,   (3.3.13)

we have

C*J^{−1}C = Σ γ̄_{rs} γ_{tu} ξ_s (ξ_r* J^{−1} ξ_t) ξ_u*,

which vanishes by (3.3.12). If, in addition, we impose on the numerical coefficients γ_{rs} the symmetrical conditions

γ_{rs} = γ̄_{sr},  r, s = 1, ..., p,   (3.3.14)

then C will be Hermitean. Finally, for later purposes it will be necessary to impose a definiteness condition on C, restricting its sign; if we require that C ≥ 0, this will be ensured by imposing a similar condition on the γ_{rs}, namely,

(γ_{rs}) ≥ 0.   (3.3.15)

So far as the restrictions M*JM = N*JN are concerned, special interest attaches to the case in which

M*JM = 0,  N*JN = 0.   (3.3.16)
This again is impossible if J, or J/i, is positive or negative definite. Suppose however that k is even, to take a simple case, and that the eigenvalues are ½k of each sign, so that there exists an isotropic set of p = ½k vectors satisfying (3.3.12). Here we need only take M, N to have a form built on such an isotropic set, where u_1, ..., u_k are linearly independent column matrices.
3.4. The Unit Circle Case

We pass to the other case in which z*Jy is to be invariant under (3.3.1-2) provided that |λ| = 1. We must now have

(λ̄A* + B*) J (λA + B) = J

if |λ| = 1, and so, since then λ̄ = λ^{−1},

(λ^{−1}A* + B*) J (λA + B) = J,   (3.4.1)

for λ on the unit circle, and indeed for all λ. Comparing coefficients we see that it is necessary and sufficient for the required invariance that

B*JA = A*JB = 0,   (3.4.2)

A*JA + B*JB = J.   (3.4.3)

We may now conveniently take J to be Hermitean instead of skew-Hermitean; it is permissible that J be positive definite, though this is not the most interesting case, since it is more restrictive on the boundary conditions. To construct solutions of (3.4.2-3), we suppose for definiteness that J has some positive eigenvalues, and select some number p of them. Let γ_1, ..., γ_p be these eigenvalues, ζ_1, ..., ζ_p the corresponding eigenvectors; we number the remaining eigenvectors from p + 1 to k. If then we take

A = Σ_{r,s=1}^{p} α_{rs} ζ_r ζ_s*,  B = Σ_{r,s=p+1}^{k} β_{rs} ζ_r ζ_s*,   (3.4.4)

then (3.4.2) will be satisfied whatever the values of the numerical coefficients α_{rs}, β_{rs}; here we suppose the ζ_1, ..., ζ_k orthonormalized.
Since

J = Σ_{r=1}^{k} γ_r ζ_r ζ_r*,

we may break up (3.4.3) into two separate equations as

A*JA = Σ_{r=1}^{p} γ_r ζ_r ζ_r*,   (3.4.5)

B*JB = Σ_{r=p+1}^{k} γ_r ζ_r ζ_r*.   (3.4.6)

To write the latter in terms of the coefficients α_{rs}, β_{rs}, let A_1 denote the p-by-p matrix of the α_{rs}, B_1 the (k − p)th order matrix of the β_{rs}, Γ_1 the diagonal matrix with entries γ_1, ..., γ_p, Γ_2 the diagonal matrix of the γ_{p+1}, ..., γ_k. Then (3.4.5-6) are equivalent to

A_1*Γ_1A_1 = Γ_1,   (3.4.7)

B_1*Γ_2B_1 = Γ_2.   (3.4.8)

Having found particular solutions of (3.4.2), (3.4.3), that is to say, a λA + B which is J-unitary for λ on the unit circle, further particular cases may be found by multiplying, on either side, by an arbitrary constant J-unitary matrix.
3.5. The Real 2-by-2 Case

In this, which is substantially the case of ordinary orthogonal polynomials, we take

J = [0, 1; −1, 0],   (3.5.1)

and seek real 2-by-2 matrices A, B such that for arbitrary real 2-vectors y, z, if

y† = (λA + B)y,  z† = (λA + B)z,   (3.5.2)

then

z†*Jy† = z*Jy,   (3.5.3)

whenever λ is real. We find here a heuristic solution, which is substantially the general solution, though we shall not prove this.
With a suitable convention as to sign, z*Jy represents the area of the parallelogram with y, z as two of its sides. Hence our requirement is that the matrix λA + B, interpreted as a mapping of the plane into itself, should leave area unchanged. This may also be seen in another way. For a real 2-by-2 matrix to be symplectic, it is necessary and sufficient that it be unimodular, that is, have determinant unity. This again implies the invariance of area.

Our quest is therefore for mappings, linearly dependent on a real parameter λ, which preserve area. In geometrical language, such a transformation is given by a "shear," or a "symplectic transvection." To form such a transformation, we take a fixed line l in the plane, for any point P in the plane drop a perpendicular PQ to l, and form the transformed point P′ by moving P a distance (λa + β)PQ parallel to l; regard is to be had to the sense of the motion and to the sense of PQ. It is easily seen that this transformation leaves area unchanged. Another transformation leaving area unchanged is rotation about a point, and we may form a boundary problem by imposing on P a succession of transformations of the above form, relative to a set of concurrent lines l_n, interspersed with rotations. The boundary conditions may require that P start and finish on some line, for example.

For the standard form of such transformations, we take the shear

x_1 → x_1 + (λa + β)x_2,  x_2 → x_2,   (3.5.4)

where x_1, x_2 may be thought of as coordinates, succeeded by the "rotation"

x_1 → −x_2,  x_2 → x_1.   (3.5.5)

The combined transformation may be written

x_1 → −x_2,  x_2 → x_1 + (λa + β)x_2.   (3.5.6)

For the recurrence relation we take a series of such transformations, obtaining

x_{1,n+1} = −x_{2,n},  x_{2,n+1} = x_{1,n} + (λa_n + β_n)x_{2,n}.   (3.5.7)

On substitution from the first equation into the second, we have

x_{2,n+1} = −x_{2,n−1} + (λa_n + β_n)x_{2,n},   (3.5.8)

which forms a three-term recurrence relation, which we treat as the origin of orthogonal polynomials.
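A short sketch, with hypothetical coefficients a_n, β_n and starting point, confirming that iterating the map (3.5.7) reproduces the three-term relation (3.5.8), and that each step matrix is unimodular (area-preserving):

```python
lam = 0.8
a = [1.0, 2.0, 1.5, 0.5]    # hypothetical a_n
b = [0.3, -0.2, 0.1, 0.4]   # hypothetical beta_n
x1, x2 = [1.0], [0.0]       # hypothetical starting point (x_{1,0}, x_{2,0})

for n in range(4):
    x1.append(-x2[n])                               # (3.5.7), first equation
    x2.append(x1[n] + (lam * a[n] + b[n]) * x2[n])  # (3.5.7), second equation

# (3.5.8): x2[n+1] = -x2[n-1] + (lam*a[n] + b[n]) * x2[n], for n >= 1
three_term = all(
    abs(x2[n + 1] - (-x2[n - 1] + (lam * a[n] + b[n]) * x2[n])) < 1e-12
    for n in range(1, 4))

# each step matrix [[0, -1], [1, lam*a_n + b_n]] has determinant one
dets = [0 * (lam * a[n] + b[n]) - (-1) * 1 for n in range(4)]
```

The determinant computation is of course trivially 1 here; it is included only to make the area-preservation explicit.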
3.6. The 2-by-2 Unit Circle Case

This leads to the topic of orthogonal polynomials on the unit circle, and to another case in addition. There are here two distinct possibilities for the matrix J characterizing the quadratic form. In the first of these we take

J = [1, 0; 0, −1],   (3.6.1)

which is similar to (3.5.1) and not essentially distinct from it. Again we ask that the 2-by-2 matrices A, B should have the invariance property (3.5.2-3), with J as given by (3.6.1) and for arbitrary λ on the unit circle; in general the property is not to hold for λ off the unit circle. Special cases of such λA + B are given by

[λ, 0; 0, 1],  [1, 0; 0, λ],  [λ, 0; 0, λ].   (3.6.2)

From these further cases may be constructed by multiplying by a general matrix A′ which is J-unitary, that is, A′*JA′ = J. Particular cases of such A′ are given by

A′ = [a, b; b̄, a],  (a = ā, a² − |b|² = 1),   (3.6.3)

and, in fact, the general such A′ may be factorized into matrices of this form. It may be shown that the general solution to the problem of finding such λA + B can be built up from the elements we have given. For a boundary problem concerning orthogonal polynomials we combine the first matrices in (3.6.2-3), a typical recurrence formula being represented by the transformation

y_{n+1} = [a_n, b_n; b̄_n, a_n][λ, 0; 0, 1] y_n.   (3.6.4)

If we set up a sequence of recurrence formulas of this type and write u_n, v_n for the entries in the vector y_n, this recurrence formula may be written explicitly as

u_{n+1} = λa_nu_n + b_nv_n,  v_{n+1} = λb̄_nu_n + a_nv_n.   (3.6.5)

Here we assume a_n real and positive, and a_n² − |b_n|² = 1; the latter may be relaxed to a_n² − |b_n|² > 0 at the cost of slight modifications in the formulas.
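For |λ| = 1 the recurrence (3.6.5) leaves |u_n|² − |v_n|² unchanged, which is just the invariance of the J-form for J as in (3.6.1). A sketch with hypothetical coefficients b_n (the a_n being fixed by a_n² − |b_n|² = 1):

```python
import cmath
import math

lam = cmath.exp(0.7j)            # a point on the unit circle
bs = [0.5j, 0.2 - 0.1j, -0.3]    # hypothetical b_n

u, v = 1.0 + 0j, 1.0 + 0j        # initial conditions (3.6.6)
q = [abs(u) ** 2 - abs(v) ** 2]  # the invariant J-form
for b in bs:
    a = math.sqrt(1 + abs(b) ** 2)   # a_n real, positive, a_n^2 - |b_n|^2 = 1
    u, v = lam * a * u + b * v, lam * complex(b).conjugate() * u + a * v
    q.append(abs(u) ** 2 - abs(v) ** 2)
```

With the initial conditions (3.6.6) the invariant is zero throughout, i.e. |u_n| = |v_n| for every n when |λ| = 1.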
We take the initial conditions

u_0 = 1,  v_0 = 1,   (3.6.6)

whence (3.6.5) defines u_n, v_n as polynomials in λ, of which the former have orthogonality properties, as we show in Chapter 7. We impose a terminal boundary condition

u_m = e^{iα}v_m,   (3.6.7)

and the eigenvalues will be the zeros of a polynomial. In terms of the general formalism (3.1.4-5) for boundary conditions, this corresponds to taking

M = [1, 0; 1, 0],  N = [0, e^{iα}; 0, 1].   (3.6.8)

We have here M*JM = N*JN = 0, and again M, N have no common null-vectors. Naturally, there are many other possible choices of boundary conditions; for example, the periodic boundary conditions given by M = N = E.

For the second unit circle possibility we take J = E, the unit matrix. Here we seek 2-by-2 matrices A, B such that λA + B is unitary, in the ordinary sense, whenever λ is on the unit circle. Matrices of this form are again given by (3.6.2), and these solutions may again be extended by multiplying by a general 2-by-2 unitary matrix A′. We are now confined to "two-point" boundary conditions. Denoting by y_0, y_m the initial and final vectors of the recurrence sequence, we may impose the condition y_m = Ny_0, where N is any fixed unitary matrix.

Some degree of unification is possible if we extend our consideration from matrices λA + B to the bilinear form (λA + B)(λC + D)^{−1}. For example, if in the latter expression we make the fractional linear transformation λ = (aλ′ + b)/(cλ′ + d), we derive again a bilinear matrix expression, and any invariance property the original expression had on some λ-curve will be translated into an invariance property of the new expression on a λ′-curve. There thus ceases to be any basic distinction between invariance on the real axis and on the unit circle. In particular, the investigations of Chapters 1 and 2 may then be included in the cases just discussed of two-dimensional invariance on the unit circle. Naturally, bilinear transformations are not the only ones effecting a mapping between the unit circle and the real axis. Another such mapping is λ′ = ½(λ + 1/λ), which has important applications to the connection between polynomials orthogonal on the unit circle and on the real axis, and many others can be constructed.
3.7. The Boundary Problem on the Real Axis

Here we summarize briefly the constructions associated with the boundary problem (3.1.1), (3.1.5) for the case that the invariance (3.1.2) holds for all real λ. Since we discuss some special cases in detail in Chapters 4-6, and since similar, indeed more general, investigations are given in Chapter 9, we do little more than give the definitions. There will, of course, be parallels with the situation of Chapter 1 as well. We assume that J is nonsingular and skew-Hermitean, and that A_n, B_n, n = 0, ..., m − 1, satisfy the conditions noted in Section 3.3 as equivalent to the invariance (3.1.2), namely

B_n*JB_n = J,  A_n*JB_n + B_n*JA_n = 0,  A_n*JA_n = 0.   (3.7.1)

The boundary matrices M, N are again assumed to satisfy (3.1.4), and to have no common null-vectors. The fundamental solution of (3.1.1) will be a square matrix Y_n(λ), defined by

Y_{n+1}(λ) = (λA_n + B_n)Y_n(λ),  Y_0(λ) = E,   (3.7.2)

so that

Y_n(λ) = (λA_{n−1} + B_{n−1}) ⋯ (λA_0 + B_0).   (3.7.3)

For real λ, μ we have

Y_{n+1}*(μ)JY_{n+1}(λ) − Y_n*(μ)JY_n(λ) = Y_n*(μ){(μA_n* + B_n*)J(λA_n + B_n) − J}Y_n(λ) = (λ − μ)Y_n*(μ)B_n*JA_nY_n(λ),   (3.7.4)

using (3.7.1). Introducing the notation

B_n*JA_n = C_n,   (3.7.5)

and summing (3.7.4) over n, we get

Y_m*(μ)JY_m(λ) − J = (λ − μ) Σ_{n=0}^{m−1} Y_n*(μ)C_nY_n(λ).   (3.7.6)

This forms an analogue of (1.4.1), and again of the Christoffel-Darboux identity for orthogonal polynomials, and of the Lagrange identity for linear differential equations. In particular, if λ, μ are real and equal, the right of (3.7.6) vanishes, expressing the fact that Y_m(λ) is J-unitary for real λ.
The recurrence relation (3.1.1) having the general solution y_n = Y_n(λ)y_0, the boundary problem is equivalent to that of finding λ such that

Nv = y_m = Y_m(λ)y_0 = Y_m(λ)Mv,

with some v ≠ 0. The eigenvalues are thus the roots of

det(N − Y_m(λ)M) = 0.   (3.7.7)

With each root λ_r of this equation there will be a sequence of column matrices y_{0r}, ..., y_{mr}, such that

y_{n+1,r} = (λ_rA_n + B_n)y_{nr},  y_{0r} = Mv_r,  y_{mr} = Nv_r,

where v_r ≠ 0. If λ_r, λ_s are two real and distinct eigenvalues, then on multiplying (3.7.6) on the left by y_{0s}*, and on the right by y_{0r}, the left-hand side vanishes by the boundary conditions, and we obtain the orthogonality of the eigenfunctions in the form

Σ_{n=0}^{m−1} y_{ns}* C_n y_{nr} = 0,  r ≠ s.   (3.7.8)

We now make the further assumption that the C_n have constant sign, say,

C_n ≥ 0,  n = 0, ..., m − 1,   (3.7.9)

in the sense that the C_n are positive semidefinite; they cannot be definite in view of the last of (3.7.1). Furthermore, we make the definiteness assumption concerning the recurrence relation that for any nontrivial solution of (3.1.1) we must have

Σ_{n=0}^{m−1} y_n* C_n y_n > 0.   (3.7.10)

This enables us firstly to ensure that the eigenvalues are in fact all real; by (3.7.10) no eigenfunction can be orthogonal to itself, and it may be shown, in a similar manner to the proof of (3.7.8), that an eigenfunction corresponding to a complex eigenvalue would have just this property. In the second place, the eigenfunctions can be normalized by multiplication by suitable scalar factors, so as to ensure that

Σ_{n=0}^{m−1} y_{nr}* C_n y_{nr} = 1.   (3.7.11)
If this be done, and we write u_r for the initial value Mv_r = y_{0r} of the rth eigenfunction, the spectral function τ(λ) may be defined as a step-function, whose value for any real λ is a square matrix of order k, whose jumps occur at the eigenvalues λ_r, and are of amount u_ru_r*. In a similar manner to Section 1.6, we may define a characteristic function, whose poles are at the eigenvalues and whose residues there are the jumps of the spectral function. A suitable form turns out to be

F(λ) = ½{(Y_m^{−1}(λ)N + M)(Y_m^{−1}(λ)N − M)^{−1}}J^{*−1}.   (3.7.12)

A similar function is investigated in connection with the general first-order system of differential equations in Chapter 9. To identify the function with that of Section 1.6, we take J = −i, M = 1, N = exp(iα).
3.8. The Boundary Problem on the Unit Circle

Here too we confine ourselves to a brief discussion. We take J to be Hermitean and nonsingular, but not necessarily definite. The A_n, B_n in (3.1.1) are now to satisfy (3.4.2-3), so that λA_n + B_n is J-unitary when |λ| = 1. Defining the fundamental solution Y_n(λ) as previously, it may be shown, for example by induction, that

Y_m*(μ)JY_m(λ) − J = (λμ̄ − 1) Σ_{n=0}^{m−1} Y_n*(μ)A_n*JA_nY_n(λ).   (3.8.1)

In particular, if |λ| = 1 we have that Y_m(λ) is J-unitary. Writing C_n = A_n*JA_n, we assume that C_n ≥ 0; it is again not possible that C_n > 0, except in the comparatively trivial case in which B_n = 0. Again we make the definiteness assumption that (3.7.10) holds for any nontrivial solution of (3.1.1). The eigenvalues are again the roots of (3.7.7), and by means of (3.8.1) we may prove first that all eigenvalues lie on the unit circle, and secondly that the eigenfunctions are orthogonal according to (3.7.8). We may suppose them normalized according to (3.7.11). For the spectral function we consider a weight distribution on the unit circle. With the eigenvalue λ_r on the unit circle, we associate the (matrix) weight u_ru_r*, where, as before, u_r is the initial value y_{0r} of the corresponding normalized eigenfunction.
CHAPTER 4
Finite Orthogonal Polynomials
4.1. The Recurrence Relation

We take up here boundary problems of Sturm-Liouville type associated with the recurrence formula
\[ c_n y_{n+1} = (a_n\lambda + b_n)y_n - c_{n-1}y_{n-1}, \qquad n = 0, \ldots, m-1, \tag{4.1.1} \]
where the a_n, b_n and c_n are real scalars, subject to
\[ a_n > 0, \qquad c_n > 0. \tag{4.1.2} \]
A boundary problem is given if we ask for sequences y_{-1}, ..., y_m connected by this relation, not all zero, and satisfying the boundary conditions
\[ y_{-1} = 0, \qquad y_m + h y_{m-1} = 0, \tag{4.1.3} \]
where h is some fixed real number. That this is a problem of eigenvalue type, soluble only for isolated values of λ, is easily seen if we construct a typical solution, that is to say, sequence, satisfying (4.1.1) and the first of the boundary conditions (4.1.3), and not vanishing throughout. We must, of course, take y_0 ≠ 0, since otherwise by (4.1.1) y_1 = 0, y_2 = 0, ..., and the sequence vanishes identically. It will be convenient to define a standard solution
\[ y_{-1}(\lambda), \; y_0(\lambda), \; y_1(\lambda), \; \ldots, \; y_m(\lambda) \tag{4.1.4} \]
of (4.1.1) with the fixed initial conditions
\[ y_{-1}(\lambda) = 0, \qquad y_0(\lambda) = 1/c_{-1} > 0. \tag{4.1.5} \]
Now that we have fixed y_{-1}(λ), y_0(λ), the values of y_1(λ), y_2(λ), ..., are to be found successively from (4.1.1). For n ≥ 0, it is evident that y_n(λ) is a polynomial of degree precisely n. We can now say that the remaining boundary condition in (4.1.3) will be satisfied if
\[ y_m(\lambda) + h y_{m-1}(\lambda) = 0. \tag{4.1.6} \]
The roots of this equation, the eigenvalues, are thus the zeros of a polynomial of degree m. For if (4.1.6) holds, the sequence (4.1.4) certainly satisfies the conditions (4.1.1), (4.1.3) of the boundary problem, without vanishing identically; conversely, it is easy to prove that any solution of (4.1.1) and (4.1.3), not vanishing identically, must be a sequence proportional to (4.1.4) for such a λ-value. In showing that the eigenvalues of our boundary problem are the zeros of certain polynomials we begin to approach the theory of orthogonal polynomials. It is not immediately apparent that the polynomials (4.1.4) defined by (4.1.1) and (4.1.5) have any orthogonality properties. This may be deduced from the orthogonality of the eigenfunctions by arguments similar to those of Section 1.4 (cf. Theorem 1.4.5). As in Chapter 1, we are here considering only the orthogonality of finite sets (4.1.4), leaving the infinite discrete case to the next chapter. The converse step, of showing that polynomials known to be orthogonal satisfy a recurrence relation of the type of (4.1.1), will be considered later in this chapter.
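The construction of the standard solution is easily carried out numerically. The sketch below (a minimal illustration, not part of the text's apparatus; the sample coefficients a_n = c_n = 1, b_n = 0 and c_{-1} = 1 are our own choice) builds y_0(λ), ..., y_m(λ) from (4.1.1), (4.1.5) and exhibits the eigenvalues for h = 0 as the zeros of y_m(λ):

```python
import numpy as np

def standard_solution(a, b, c, m, c_minus1=1.0):
    """Return [y_0, ..., y_m] as numpy coefficient arrays (highest power first),
    built from c_n y_{n+1} = (a_n*lam + b_n) y_n - c_{n-1} y_{n-1},
    with y_{-1} = 0 and y_0 = 1/c_{-1}  (equations (4.1.1), (4.1.5))."""
    y_prev = np.array([0.0])                  # y_{-1}
    y_cur = np.array([1.0 / c_minus1])        # y_0
    cs = [c_minus1] + list(c)                 # c_{-1}, c_0, ..., c_{m-1}
    ys = [y_cur]
    for n in range(m):
        # (a_n*lam + b_n) * y_n : multiplying by lam appends a zero coefficient
        t = np.polyadd(a[n] * np.append(y_cur, 0.0), b[n] * y_cur)
        y_next = np.polysub(t, cs[n] * y_prev) / cs[n + 1]
        ys.append(y_next)
        y_prev, y_cur = y_cur, y_next
    return ys

m = 5
a = [1.0] * m; b = [0.0] * m; c = [1.0] * m   # toy data, not from the text
ys = standard_solution(a, b, c, m)
# y_n has degree exactly n, with positive leading coefficient:
assert all(len(ys[n]) == n + 1 and ys[n][0] > 0 for n in range(m + 1))
# Eigenvalues for h = 0 are the zeros of y_m; they come out real,
# as Theorem 4.3.1 below asserts.
eigs = np.roots(ys[m])
assert np.allclose(eigs.imag, 0.0, atol=1e-7)
```

For this particular choice of coefficients the y_n are (rescaled) Chebyshev polynomials of the second kind, so the computed zeros are 0, ±1, ±√3.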
4.2. Lagrange-Type Identities

We collect here for later use some results of the type of Green's theorem or the Lagrange identity for differential equations, here associated with the names of Christoffel and Darboux. The results are analogous to Theorems 1.4.1-2; a more general result was indicated in (3.7.6). They may be used to establish the reality of the spectrum, the orthogonality of the eigenfunctions, and for oscillatory investigations. We have first:

Theorem 4.2.1. For 0 ≤ n < m,
\[ (\lambda - \mu)\sum_{r=0}^{n} a_r\, y_r(\lambda)\, y_r(\mu) = c_n\{y_{n+1}(\lambda)y_n(\mu) - y_n(\lambda)y_{n+1}(\mu)\}. \tag{4.2.1} \]

Putting n = 0 and recalling that y_{-1}(λ) = y_{-1}(μ) = 0, we derive (4.2.1) with n = 0. Induction over n then yields (4.2.1) from (4.2.2) in the general case. We deduce two important special cases. Dividing (4.2.1) by (λ - μ) and making μ → λ for fixed λ we get, using l'Hôpital's rule:

Theorem 4.2.2. For 0 ≤ n < m,
\[ \sum_{r=0}^{n} a_r\{y_r(\lambda)\}^2 = c_n\{y_{n+1}'(\lambda)y_n(\lambda) - y_{n+1}(\lambda)y_n'(\lambda)\}. \tag{4.2.3} \]
In particular, for real λ,
\[ y_{n+1}'(\lambda)y_n(\lambda) - y_{n+1}(\lambda)y_n'(\lambda) > 0. \tag{4.2.4} \]

The other special case of Theorem 4.2.1 is

Theorem 4.2.3. For 0 ≤ n < m, and complex λ,
\[ (\lambda - \bar\lambda)\sum_{r=0}^{n} a_r\, |y_r(\lambda)|^2 = c_n\{y_{n+1}(\lambda)\overline{y_n(\lambda)} - y_n(\lambda)\overline{y_{n+1}(\lambda)}\}. \tag{4.2.5} \]
This results immediately on putting μ = λ̄ in (4.2.1). Further results of this type relate to two distinct solutions of (4.1.1). As a second standard solution we take a sequence
\[ z_{-1}(\lambda), \; z_0(\lambda), \; \ldots, \; z_m(\lambda) \tag{4.2.6} \]
such that
\[ c_n z_{n+1}(\lambda) = (a_n\lambda + b_n)z_n(\lambda) - c_{n-1}z_{n-1}(\lambda), \tag{4.2.7} \]
and with the fixed initial conditions
\[ z_{-1}(\lambda) = 1, \qquad z_0(\lambda) = 0. \tag{4.2.8} \]
For n ≥ 1, z_n(λ) will be determined recursively from (4.2.7) as a polynomial of degree n - 1. In analogy to (4.2.1) we have then:

Theorem 4.2.4. For 0 ≤ n < m,
\[ 1 + (\lambda - \mu)\sum_{r=0}^{n} a_r\, y_r(\lambda)\, z_r(\mu) = c_n\{y_{n+1}(\lambda)z_n(\mu) - z_{n+1}(\mu)y_n(\lambda)\}. \tag{4.2.9} \]
In particular, for λ = μ,
\[ c_n\{y_{n+1}(\lambda)z_n(\lambda) - z_{n+1}(\lambda)y_n(\lambda)\} = 1. \tag{4.2.10} \]
For the proof of (4.2.9) we take the results
\[ c_n y_{n+1}(\lambda) = (a_n\lambda + b_n)y_n(\lambda) - c_{n-1}y_{n-1}(\lambda), \]
\[ c_n z_{n+1}(\mu) = (a_n\mu + b_n)z_n(\mu) - c_{n-1}z_{n-1}(\mu), \]
multiply, respectively, by z_n(μ), y_n(λ) and subtract, getting
\[ c_n\{y_{n+1}(\lambda)z_n(\mu) - z_{n+1}(\mu)y_n(\lambda)\} = a_n(\lambda - \mu)y_n(\lambda)z_n(\mu) + c_{n-1}\{y_n(\lambda)z_{n-1}(\mu) - z_n(\mu)y_{n-1}(\lambda)\}. \tag{4.2.11} \]
Putting n = 0 and recalling that y_{-1}(λ) = 0, z_{-1}(μ) = 1, c_{-1}y_0(λ) = 1, we get (4.2.9) with n = 0. The general case then follows as before by induction over n, using (4.2.11). The case λ = μ, (4.2.10), constitutes an analog of the constancy of the Wronskian determinant for two solutions of a differential equation of the form y'' + u(x)y = 0.
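The constancy (4.2.10) is easily observed in computation. The following sketch (our own illustration; the sample coefficients a_n = c_n = 1, b_n = 0 and c_{-1} = 1 are a toy choice, not taken from the text) iterates both standard solutions at a fixed λ and checks that the Wronskian-type expression equals 1 for every n:

```python
# Numerical check of (4.2.10):
#   c_n { y_{n+1}(lam) z_n(lam) - z_{n+1}(lam) y_n(lam) } = 1.
def solve(a, b, c, m, lam, init):
    """Iterate c_n u_{n+1} = (a_n*lam + b_n) u_n - c_{n-1} u_{n-1};
    init = (u_{-1}, u_0); we take c_{-1} = 1.  Returns [u_{-1}, ..., u_m]."""
    cs = [1.0] + list(c)
    u_prev, u_cur = init
    us = [u_prev, u_cur]
    for n in range(m):
        u_next = ((a[n] * lam + b[n]) * u_cur - cs[n] * u_prev) / cs[n + 1]
        us.append(u_next)
        u_prev, u_cur = u_cur, u_next
    return us

m, lam = 6, 0.73                       # arbitrary sample values
a = [1.0] * m; b = [0.0] * m; c = [1.0] * m
y = solve(a, b, c, m, lam, (0.0, 1.0))  # initial conditions (4.1.5)
z = solve(a, b, c, m, lam, (1.0, 0.0))  # initial conditions (4.2.8)
for n in range(m):
    w = c[n] * (y[n + 2] * z[n + 1] - z[n + 2] * y[n + 1])
    assert abs(w - 1.0) < 1e-12         # the "Wronskian" is constant
```

The same loop with two different λ-values would verify the full identity (4.2.9).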
4.3. Oscillatory Properties

We prove here results concerning the reality and separation properties of zeros of the y_n(λ), and more generally of polynomials of the form y_n(λ) + h y_{n-1}(λ). These results will of course also give information on the spectra of boundary problems of the form (4.1.1), (4.1.3). For the classical polynomials, such as those of Legendre, numerous methods are available for proving such results; for general orthogonal polynomials, still other methods are available, based on the orthogonality. Here we confine ourselves to methods based on the recurrence relation, and its immediate consequences as found in Section 4.2. A basic result is:

Theorem 4.3.1. For real h, the polynomial
\[ y_n(\lambda) + h\, y_{n-1}(\lambda) \tag{4.3.1} \]
has precisely n real and simple zeros.

Suppose if possible that λ is a complex zero of (4.3.1). Using this fact and taking also complex conjugates we have
\[ y_n(\lambda) + h\, y_{n-1}(\lambda) = 0, \qquad y_n(\bar\lambda) + h\, y_{n-1}(\bar\lambda) = 0. \tag{4.3.2} \]
We then have that the right of (4.2.5) vanishes, and this is impossible since the terms on the left of (4.2.5) are nonnegative, that for r = 0 being positive, by (4.1.2), (4.1.5). Hence the zeros of (4.3.1) are all real.
That they are all simple follows from the fact that at a hypothetical multiple zero, necessarily real, we should have simultaneously
\[ y_n(\lambda) + h\, y_{n-1}(\lambda) = 0, \qquad y_n'(\lambda) + h\, y_{n-1}'(\lambda) = 0, \]
and so y_n(λ)y'_{n-1}(λ) - y_n'(λ)y_{n-1}(λ) = 0, in contradiction to (4.2.4). Since (4.3.1), as a polynomial of degree exactly n, must have n zeros altogether, this completes the proof. In other words, the boundary problem (4.1.1), (4.1.3) has a purely real spectrum, consisting of m real eigenvalues. Next we give some separation theorems. A simple case is
Theorem 4.3.2. Two consecutive polynomials y_n(λ), y_{n-1}(λ) have no common zeros. Between any two zeros of one of them lies a zero of the other.

Since all zeros are necessarily real, the first statement follows from (4.2.4). The rest of the theorem also follows from (4.2.4). Suppose that λ_1, λ_2 are two zeros of y_n(λ), which we take to be consecutive; since y_n(λ) has only simple zeros, this implies that y_n'(λ_1), y_n'(λ_2) have opposite signs. By (4.2.4), with n - 1 replacing n, we have
\[ y_n'(\lambda_1)\,y_{n-1}(\lambda_1) > 0, \qquad y_n'(\lambda_2)\,y_{n-1}(\lambda_2) > 0, \]
and so y_{n-1}(λ_1), y_{n-1}(λ_2) must also have opposite signs, which proves the result. The proof that between two zeros of y_{n-1}(λ) lies a zero of y_n(λ) is similar. Generalizing this type of argument, we have the following analog of Theorem 1.3.6.
Theorem 4.3.3. For real distinct h_1, h_2, between any two zeros of y_n(λ) + h_1 y_{n-1}(λ) lies a zero of y_n(λ) + h_2 y_{n-1}(λ).

This may be proved in a similar manner, using (4.2.4). To put the argument differently, we note that (4.2.4) implies that y_{n-1}(λ)/y_n(λ) is a strictly decreasing function of λ when it is finite. As λ increases from -∞ to +∞, y_{n-1}(λ)/y_n(λ) will start at 0, and tend to -∞ as λ approaches the lowest zero of y_n(λ), then go from +∞ to -∞ as λ goes from the lowest to the next lowest zero of y_n(λ), and so on, finally tending to zero as λ → +∞. Hence between any two λ-values at which y_{n-1}(λ)/y_n(λ) = -1/h_1, taking it that h_1 ≠ 0, this function will have a discontinuity, tending in between to -∞ and again going from +∞ to -1/h_1, and hence taking all other values, including -1/h_2. The proof is similar if h_1 = 0. The consideration of infinities may be avoided by considering in place of y_{n-1}(λ)/y_n(λ) the variation of the function
\[ \{y_n(\lambda) + i\,y_{n-1}(\lambda)\}/\{y_n(\lambda) - i\,y_{n-1}(\lambda)\}. \tag{4.3.3} \]
It follows from (4.2.4) that as λ varies on the real axis, this function moves monotonically on the unit circle. We shall use this device in connection with matrix systems. We turn to a different type of oscillation theorem, in which λ is fixed, or is treated as a parameter, and y_n(λ) is viewed as a function of n. So far y_n(λ) has only been defined for integral values of n, n = -1, 0, ..., m. We complete it to a continuous function y_x(λ), -1 ≤ x ≤ m, by specifying that between two integers, n ≤ x ≤ n + 1, y_x(λ) is to be a linear function of x. This definition may seem artificial, particularly when applied to a classical polynomial such as that of Legendre, for which another and more analytic definition is available for nonintegral orders. The definition is however entirely natural from the point of view of the mechanical problem which gives rise to (4.1.1), namely, the problem of the vibrations of a stretched string bearing particles. Here the segments of the string between consecutive particles are, of course, straight, and are appropriately represented by linear functions. We start by observing that the zeros of y_x(λ), -1 ≤ x ≤ m, for fixed real λ, are simple and well defined. Suppose first that y_{x'}(λ) = 0 for some nonintegral x'. Then (∂/∂x)y_x(λ) exists at x = x', and is not zero, for if it were, then y_x(λ), being linear, would vanish throughout the interval of the form n ≤ x ≤ n + 1 containing x', and this is impossible since y_n(λ), y_{n+1}(λ) cannot both vanish, by Theorem 4.3.2, or by the recurrence relation. Suppose again that y_n(λ) = 0 for some integer n. It then follows from (4.1.1) that y_{n-1}(λ), y_{n+1}(λ) have opposite signs. Hence y_n(λ) - y_{n-1}(λ), y_{n+1}(λ) - y_n(λ) have the same sign, and n is a simple zero, in the sense that y_x(λ) changes sign as x increases through n, the derivative (∂/∂x)y_x(λ) having the same sign and not being zero in (n - 1, n), (n, n + 1). For any fixed λ, y_x(λ) will have a certain number of discrete zeros in -1 ≤ x ≤ m, including a fixed one at x = -1.
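The separation property of Theorem 4.3.2, used just above, can be checked numerically. The sketch below (our own toy illustration, for the sample recurrence y_{n+1} = λy_n - y_{n-1} with y_0 = 1, i.e. a_n = c_n = 1, b_n = 0, not data from the text) verifies that the zeros of y_5 strictly interlace those of y_6:

```python
import numpy as np

def poly_seq(m):
    """Coefficient arrays (highest power first) for y_0, ..., y_m of the
    sample recurrence y_{n+1} = lam*y_n - y_{n-1}, y_{-1} = 0, y_0 = 1."""
    ys = [np.array([0.0]), np.array([1.0])]
    for _ in range(m):
        ys.append(np.polysub(np.append(ys[-1], 0.0), ys[-2]))
    return ys[1:]                      # drop y_{-1}

ys = poly_seq(6)
zn = np.sort(np.roots(ys[6]).real)     # the 6 zeros of y_6
zn1 = np.sort(np.roots(ys[5]).real)    # the 5 zeros of y_5
# Between consecutive zeros of y_6 lies exactly one zero of y_5:
assert all(zn[k] < zn1[k] < zn[k + 1] for k in range(5))
```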
We can now consider the behavior of these zeros as λ varies. It is easily seen that y_x(λ) is a continuous function of x and λ. We have seen that the zeros of y_x(λ) are simple, corresponding to non-zero values of the derivative (∂/∂x)y_x(λ) if x is nonintegral, and to non-zero left and right derivatives, y_n(λ) - y_{n-1}(λ), y_{n+1}(λ) - y_n(λ), respectively, if x is an integer n. From this we deduce that as λ varies, the zeros of y_x(λ) in -1 < x < m vary continuously, as functions of λ. To take this up in detail, suppose that y_{x'}(λ') = 0, -1 < x' < m, x' being not an integer. Then for some ε > 0 the interval x' - ε < x < x' + ε will contain no integer, so that y_x(λ') will be linear in x in this interval, and y_{x'-ε}(λ'), y_{x'+ε}(λ') will have opposite signs. By continuity, there will be a δ > 0 such that these statements are also true of y_x(λ)
for any λ with |λ - λ'| < δ. Hence y_x(λ) will have a unique zero in x' - ε < x < x' + ε. The proof is similar if x' is an integer n; in this case y_x(λ) is linear in x in (n - ε, n), (n, n + ε), the derivative being possibly discontinuous at x = n, but not changing sign there. This argument shows that for |λ - λ'| < δ, y_x(λ) has a unique zero in a neighborhood of each zero x' of y_x(λ') in -1 < x < m. To complete the discussion by including the end-points, y_x(λ) will in no case have a zero in -1 < x ≤ 0, being there a fixed linear function prescribed by (4.1.5). As to the end x = m, if y_m(λ') ≠ 0, then clearly for some δ > 0 and |λ - λ'| < δ, y_x(λ) will have no zero in an interval of the form m - ε ≤ x ≤ m. The situation when y_m(λ') = 0 will turn out to depend on the sign of λ - λ'. Summing up these preliminaries, for some ε > 0, we can specify δ > 0 such that if |λ - λ'| < δ, then y_x(λ) has a unique zero within a distance ε of every zero x' of y_x(λ') in -1 < x' < m, and possibly also a zero in m - ε ≤ x ≤ m, if y_m(λ') = 0. In addition, if ε is small enough, y_x(λ) will have no other zeros in -1 < x ≤ m; this follows from the fact that y_x(λ) is continuous in both variables, its zeros, for fixed λ and varying x, being points of change of sign. We can now show that these zeros are monotonic functions of λ.
Theorem 4.3.4. As λ decreases, the zeros of y_x(λ), -1 < x ≤ m, move to the left.

Let the zero x(λ) occur in n ≤ x(λ) < n + 1. Since y_x(λ) is linear in n ≤ x ≤ n + 1, the location of this zero is given by
\[ x(\lambda) = n + y_n(\lambda)/\{y_n(\lambda) - y_{n+1}(\lambda)\}; \tag{4.3.4} \]
conversely, x(λ) as given by this equation will actually be a zero of y_x(λ) if y_n(λ) ≠ y_{n+1}(λ) and if x(λ) so given falls in the interval n ≤ x < n + 1. If we differentiate the right of (4.3.4) with respect to λ, the result is found to be
\[ \{y_{n+1}'(\lambda)y_n(\lambda) - y_{n+1}(\lambda)y_n'(\lambda)\}/\{y_n(\lambda) - y_{n+1}(\lambda)\}^2, \]
and this is positive, by (4.2.4). If then x(λ) is a zero in (n, n + 1), the condition y_n(λ) ≠ y_{n+1}(λ) is necessarily satisfied, and x(λ) as given by (4.3.4) decreases as λ decreases, remaining initially at least in (n, n + 1], and so remaining a zero of y_x(λ). This proves the statement. In particular, if y_m(λ') = 0, then as λ decreases from λ', the zero at x = m moves into the interval (-1, m). Thus, as λ decreases through the value λ', the number of zeros of y_x(λ) in (-1, m) increases by one. We can now set up the oscillatory characterization of the eigenvalues
of the problem (4.1.1), (4.1.3), in the special case h = 0. The eigenvalues are then the zeros of y_m(λ), which we write, in decreasing order, λ_0, λ_1, ..., λ_{m-1}. We have then:

Theorem 4.3.5. The sequence
\[ y_0(\lambda), \; \ldots, \; y_m(\lambda) \tag{4.3.5} \]
exhibits for λ > λ_0 no changes of sign, for λ_r > λ > λ_{r+1} exactly r + 1 changes of sign, and for λ < λ_{m-1} exactly m changes of sign.

It follows from (4.1.1), (4.1.5) that y_n(λ) is a polynomial of degree n with positive coefficient of λ^n. Thus for large positive λ, (4.3.5) will show no changes of sign, and for large negative λ will show m changes of sign. Thus y_x(λ) will have no zeros in -1 < x ≤ m for large positive λ, and m zeros for large negative λ. As λ decreases from some large positive value, a zero of y_x(λ) will appear at x = m when λ = λ_0, and will move to the left as λ decreases from λ_0; additional zeros will enter the interval on each occasion that λ decreases to one of the λ_r, and these will all move to the left, remaining in (-1, m). Since as λ decreases indefinitely, there are ultimately m zeros of y_x(λ), there must be precisely r zeros when λ_{r-1} > λ > λ_r, and likewise no zeros for λ > λ_0, m zeros for λ < λ_{m-1}. Moreover, zeros of y_x(λ) in -1 < x < m are in unique correspondence with changes of sign in the sequence (4.3.5). This completes the proof. It will be observed that the order of the λ_r as real numbers is in this case opposite to their order in the oscillatory characterization. This results from the choice of sign of a_n in (4.1.1), which ensures that the y_n(λ) have positive highest coefficient, in keeping with the standard practice in the theory of orthogonal polynomials.
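The sign-change count of Theorem 4.3.5 can be observed directly. The sketch below (again for the sample recurrence y_{n+1} = λy_n - y_{n-1}, y_0 = 1, a toy choice not taken from the text; here y_5 has the zeros 0, ±1, ±√3) counts sign changes at points between consecutive eigenvalues:

```python
import numpy as np

m = 5

def seq_at(lam, m):
    """The sequence y_0(lam), ..., y_m(lam) of (4.3.5) for the toy recurrence."""
    vals = [1.0]; prev = 0.0
    for _ in range(m):
        vals.append(lam * vals[-1] - prev)
        prev = vals[-2]
    return vals

def sign_changes(vals):
    signs = [v for v in vals if v != 0.0]
    return sum(1 for u, v in zip(signs, signs[1:]) if u * v < 0)

# zeros of y_5 = lam^5 - 4 lam^3 + 3 lam, in decreasing order:
eigs = np.sort(np.roots([1, 0, -4, 0, 3, 0]).real)[::-1]
# Strictly between lam_r and lam_{r+1} there are exactly r + 1 sign changes:
for r in range(m - 1):
    lam = 0.5 * (eigs[r] + eigs[r + 1])
    assert sign_changes(seq_at(lam, m)) == r + 1
assert sign_changes(seq_at(eigs[0] + 1.0, m)) == 0   # above the spectrum
assert sign_changes(seq_at(eigs[-1] - 1.0, m)) == m  # below the spectrum
```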
4.4. Orthogonality

We derive two types of orthogonality, first the orthogonality of eigenfunctions, that is to say, of certain sequences of the form (4.1.4), and secondly a dual orthogonality, which is of rather greater importance in that it establishes that the polynomials y_n(λ) are orthogonal in the usual sense. We shall use the notation λ_0, ..., λ_{m-1} for the roots of (4.1.6), for fixed real h. By Theorem 4.3.1, we know that these λ_r are all real and distinct. The first type of orthogonality is given by:

Theorem 4.4.1. The sequences
\[ y_0(\lambda_r), \; \ldots, \; y_{m-1}(\lambda_r), \qquad r = 0, \ldots, m-1, \tag{4.4.1} \]
are orthogonal according to
\[ \sum_{n=0}^{m-1} a_n\, y_n(\lambda_r)\, y_n(\lambda_s) = \delta_{rs}\,\rho_r, \tag{4.4.2} \]
where
\[ \rho_r = \sum_{n=0}^{m-1} a_n\{y_n(\lambda_r)\}^2. \tag{4.4.3} \]
For the case r ≠ s we take λ = λ_r, μ = λ_s, n = m - 1 in (4.2.1), getting
\[ (\lambda_r - \lambda_s)\sum_{n=0}^{m-1} a_n\, y_n(\lambda_r)\, y_n(\lambda_s) = c_{m-1}\{y_m(\lambda_r)y_{m-1}(\lambda_s) - y_{m-1}(\lambda_r)y_m(\lambda_s)\}. \]
The determinant on the right vanishes since y_m(λ) + h y_{m-1}(λ) = 0 if λ = λ_r, λ_s, which proves the result. For the case r = s we have to justify the last expression in (4.4.4),
\[ \rho_r = c_{m-1}\,y_{m-1}(\lambda_r)\{y_m'(\lambda_r) + h\,y_{m-1}'(\lambda_r)\}. \tag{4.4.4} \]
This follows in a very similar way from (4.2.3). Since the sequences (4.4.1) constitute m orthogonal and nontrivial m-vectors, there will as in Section 1.4 be an eigenfunction expansion. In this case this will merely say that if u_0, ..., u_{m-1} is any sequence, and we define
\[ v_r = \sum_{n=0}^{m-1} a_n\, u_n\, y_n(\lambda_r), \tag{4.4.5} \]
then
\[ u_n = \sum_{r=0}^{m-1} \rho_r^{-1}\, v_r\, y_n(\lambda_r). \tag{4.4.6} \]
In addition, we have the Parseval equality
\[ \sum_{n=0}^{m-1} a_n u_n^2 = \sum_{r=0}^{m-1} \rho_r^{-1} v_r^2. \tag{4.4.7} \]
Though of a trivial character, this expansion theorem can serve as a foundation of the expansion theorem for differential equations of the second order, by way of a limiting process, though we shall not carry this out here. The dual orthogonality, a consequence of these last results or of (4.4.2) (see Appendix III), is:
Theorem 4.4.2. For 0 ≤ p, q ≤ m - 1,
\[ \sum_{r=0}^{m-1} \rho_r^{-1}\, y_p(\lambda_r)\, y_q(\lambda_r) = a_p^{-1}\delta_{pq}. \tag{4.4.8} \]

This establishes that the polynomials y_n(λ) are indeed orthogonal with respect to the distribution of weights ρ_r^{-1} at the points λ_r, and justifies the heading given to this chapter. There will also be a dual expansion theorem according to which, starting with an arbitrary function v(λ), and defining the u_n by (4.4.6), the expansion (4.4.5) is true when λ = λ_r, that is to say, it is true with respect to the distribution of weights ρ_r^{-1} at the points λ_r. Both these expansion theorems naturally acquire greater interest when extended to the case m = ∞.
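Both orthogonalities are finite matrix identities and can be verified directly. In the sketch below (our own toy data: the sample recurrence y_{n+1} = λy_n - y_{n-1}, y_0 = 1, with a_n = 1, h = 0, m = 5), the matrix Y of values y_n(λ_r) satisfies YY^T = diag(ρ_r), and (4.4.8) is its inverse relation:

```python
import numpy as np

m = 5

def y_values(lam):
    """y_0(lam), ..., y_{m-1}(lam) for the toy recurrence."""
    vals = [1.0]; prev = 0.0
    for _ in range(m - 1):
        vals.append(lam * vals[-1] - prev)
        prev = vals[-2]
    return np.array(vals)

lam_r = np.sort(np.roots([1, 0, -4, 0, 3, 0]).real)   # zeros of y_5 (h = 0)
Y = np.array([y_values(l) for l in lam_r])            # Y[r, n] = y_n(lam_r)
rho = (Y ** 2).sum(axis=1)                            # (4.4.3), with a_n = 1
# Eigenfunction orthogonality (4.4.2):
assert np.allclose(Y @ Y.T, np.diag(rho))
# Dual orthogonality (4.4.8): sum_r rho_r^{-1} y_p(lam_r) y_q(lam_r) = delta_pq
assert np.allclose(Y.T @ np.diag(1.0 / rho) @ Y, np.eye(m))
```

The second assertion is exactly the statement that the y_n are orthonormal with respect to the weights ρ_r^{-1} at the points λ_r.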
4.5. Spectral and Characteristic Functions

In analogy to Sections 1.5-6, we define first a spectral function τ_{m,h}(λ), as a step function, constant except at points of the spectrum, and whose jump at an eigenvalue λ_r is of amount 1/ρ_r, where ρ_r is the normalization constant defined by (4.4.3-4). Specifically we take
\[ \tau_{m,h}(\lambda) = \sum_{0 < \lambda_r \le \lambda} \rho_r^{-1}, \qquad \lambda \ge 0, \tag{4.5.1} \]
\[ \tau_{m,h}(\lambda) = -\sum_{\lambda < \lambda_r \le 0} \rho_r^{-1}, \qquad \lambda < 0. \tag{4.5.2} \]
The dual orthogonality, (4.4.8), assumes the form
\[ \int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau_{m,h}(\lambda) = a_p^{-1}\delta_{pq}. \tag{4.5.3} \]
As in Chapter 2, we may use this result to establish that τ_{m,h}(λ) is bounded, uniformly in h and in m, in the event that the recurrence relation (4.1.1) is defined for an infinite sequence of n. The characteristic function will again have poles at the λ_r, that is to say, at the zeros of y_m(λ) + h y_{m-1}(λ), with residues 1/ρ_r, and will again admit explicit expression in terms of solutions of the recurrence relation. We define
\[ f_{m,h}(\lambda) = -\{z_m(\lambda) + h\,z_{m-1}(\lambda)\}/\{y_m(\lambda) + h\,y_{m-1}(\lambda)\}. \tag{4.5.4} \]
Here z_n(λ) is the second solution of the recurrence relation defined by (4.2.7-8). We have then:

Theorem 4.5.1. For λ not an eigenvalue, the characteristic function (4.5.4) has the representation
\[ f_{m,h}(\lambda) = \int_{-\infty}^{\infty} (\lambda - \mu)^{-1}\, d\tau_{m,h}(\mu). \tag{4.5.5} \]
In particular, it maps the upper and lower half-planes into each other.

Since y_m(λ) + h y_{m-1}(λ) is of degree m, and has only simple zeros, and since z_m(λ) + h z_{m-1}(λ) is of degree m - 1, the partial fraction expression of f_{m,h}(λ) is
\[ f_{m,h}(\lambda) = -\sum_{r=0}^{m-1} \frac{z_m(\lambda_r) + h\,z_{m-1}(\lambda_r)}{y_m'(\lambda_r) + h\,y_{m-1}'(\lambda_r)}\,(\lambda - \lambda_r)^{-1}. \tag{4.5.6} \]
Taking n = m - 1 in (4.2.10), and also λ = λ_r, so that y_m(λ_r) = -h y_{m-1}(λ_r), we have
\[ -c_{m-1}\,y_{m-1}(\lambda_r)\{z_m(\lambda_r) + h\,z_{m-1}(\lambda_r)\} = 1. \tag{4.5.7} \]
Comparing this with (4.4.4), we deduce that
\[ \{z_m(\lambda_r) + h\,z_{m-1}(\lambda_r)\}/\{y_m'(\lambda_r) + h\,y_{m-1}'(\lambda_r)\} = -1/\rho_r, \tag{4.5.8} \]
and so (4.5.6) is equivalent to
\[ f_{m,h}(\lambda) = \sum_{r=0}^{m-1} \rho_r^{-1}(\lambda - \lambda_r)^{-1}, \tag{4.5.9} \]
which is the same as (4.5.5). It is immediate from (4.5.9) that if Im λ > 0, then Im f(λ) < 0, and conversely. In Section 1.6 we noted some interpolation properties of the characteristic function for that case. The place of those is here taken by an asymptotic expansion of f_{m,h}(λ) for large λ, again connected with moment problems. We turn to this aspect later.
4.6. The First Inverse Spectral Problem

As in Section 1.7, we consider the reconstruction of the boundary problem given the spectral function; that is to say, we are given the real quantity h, the eigenvalues or zeros of y_m(λ) + h y_{m-1}(λ), and the normalization constants ρ_r defined in (4.4.3), or (4.4.4). It is easy to
show that the polynomials y_0(λ), ..., y_m(λ) are fixed, apart from constant factors, so that the recurrence relation (4.1.1) is essentially determinate apart from certain trivial transformations. As in Section 1.7, a knowledge of the eigenvalues only is insufficient. The solution of the problem may be divided into two stages, each of which is well known in the theory of orthogonal polynomials. First we construct the y_n(λ).

Theorem 4.6.1. Let τ_{m,h}(λ) be a nondecreasing step function with precisely m points of increase, and let a_0, ..., a_{m-1} be given positive quantities. Then there exist unique polynomials y_0(λ), ..., y_{m-1}(λ) satisfying (4.5.3), and such that y_n(λ) is of degree n, the coefficient of λ^n being positive.

The proof is by "orthogonalization." To sketch the process, we seek y_p(λ) in the form
\[ y_p(\lambda) = \sum_{s=0}^{p} \alpha_{p,s}\lambda^s, \qquad \alpha_{p,p} = k_p > 0, \tag{4.6.1} \]
and consider first the solution of (4.5.3) with p ≠ q. For these it is necessary and sufficient that
\[ \int_{-\infty}^{\infty} y_p(\lambda)\,\lambda^q\, d\tau_{m,h}(\lambda) = 0, \qquad q = 0, \ldots, p-1. \tag{4.6.2} \]
Substituting (4.6.1) on the left, and introducing the "moments"
\[ \mu_j = \int_{-\infty}^{\infty} \lambda^j\, d\tau_{m,h}(\lambda), \qquad j = 0, 1, \ldots, \tag{4.6.3} \]
we may replace (4.6.2) by the system of linear equations
\[ \sum_{s=0}^{p} \alpha_{p,s}\,\mu_{s+q} = 0, \qquad q = 0, \ldots, p-1. \tag{4.6.4} \]
For any particular p, 0 < p ≤ m - 1, we treat these as p inhomogeneous equations to fix the α_{p,0}, ..., α_{p,p-1}. Either they are uniquely soluble, or else there is a set α_0, ..., α_{p-1} of numbers not all zero such that the corresponding homogeneous equations
\[ \sum_{s=0}^{p-1} \alpha_s\,\mu_{s+q} = 0, \qquad q = 0, \ldots, p-1, \tag{4.6.5} \]
are satisfied. The μ_j being real, we may suppose the α_s all real. Multiplying (4.6.5) by α_q and summing, we derive
\[ \sum_{q=0}^{p-1}\sum_{s=0}^{p-1} \alpha_q\,\alpha_s\,\mu_{q+s} = 0, \tag{4.6.6} \]
which by (4.6.3) is equivalent to
\[ \int_{-\infty}^{\infty} \Bigl\{\sum_{s=0}^{p-1} \alpha_s\lambda^s\Bigr\}^2\, d\tau_{m,h}(\lambda) = 0. \tag{4.6.7} \]
This is impossible, since Σ_{s=0}^{p-1} α_s λ^s is a polynomial, with coefficients not all zero, of degree at most m - 1, and so with at most m - 1 zeros; it cannot therefore vanish at all the m points of increase of τ_{m,h}(λ). Thus the polynomials y_n(λ) are determinate, except for the constant factors k_n. We complete the determination by means of (4.5.3) with p = q = n, requiring k_n to be real and positive. In the remaining step we show that the polynomials y_n(λ), constructed by orthogonalization, satisfy a recurrence formula of the form (4.1.1).

Theorem 4.6.2. Under the conditions of Theorem 4.6.1, there are constants c_{-1}, ..., c_{m-1} which are positive, and constants b_0, ..., b_{m-1}, such that the boundary problem (4.1.1), (4.1.3) has τ_{m,h}(λ) as its spectral function.

We take formally y_{-1}(λ) = 0, and fix c_{-1} by (4.1.5), y_0(λ) having been found from (4.5.3) with p = q = 0. For n = 0, the recurrence relation (4.1.1) must reduce to c_0 y_1(λ) = (a_0λ + b_0)y_0(λ); here y_0(λ), y_1(λ) are known, and a_0, and so c_0, b_0 are determined. For 1 ≤ n ≤ m - 2 we set up an expansion
\[ a_n\lambda\, y_n(\lambda) = \sum_{r=0}^{n+1} \beta_{n,r}\, y_r(\lambda). \tag{4.6.8} \]
It must be possible to determine such β_{n,r}, since y_r(λ) has positive coefficient of λ^r. To identify this with (4.1.1) we wish to show first that β_{n,n+1}, β_{n,n-1} are positive, and have the form c_n, c_{n-1}, respectively, and secondly that β_{n,r} = 0 for r < n - 1. We may determine the β_{n,r} by the Fourier process. Since
\[ \int_{-\infty}^{\infty} a_n\lambda\, y_n(\lambda)\, y_r(\lambda)\, d\tau_{m,h}(\lambda) = \sum_{s=0}^{n+1} \beta_{n,s}\int_{-\infty}^{\infty} y_s(\lambda)\, y_r(\lambda)\, d\tau_{m,h}(\lambda), \tag{4.6.9} \]
we have from the orthogonality that [see (4.5.3)]
\[ \beta_{n,r} = a_n a_r \int_{-\infty}^{\infty} \lambda\, y_n(\lambda)\, y_r(\lambda)\, d\tau_{m,h}(\lambda). \tag{4.6.10} \]
We must verify that c_n so given is positive. From (4.6.1) we see that λy_n(λ) = (k_n/k_{n+1})y_{n+1}(λ) + lower powers of λ, and by the orthogonality (4.6.2) we deduce that
\[ \int_{-\infty}^{\infty} \lambda\, y_n(\lambda)\, y_{n+1}(\lambda)\, d\tau_{m,h}(\lambda) = (k_n/k_{n+1})\int_{-\infty}^{\infty} \{y_{n+1}(\lambda)\}^2\, d\tau_{m,h}(\lambda), \]
which is, of course, positive. It remains to prove that β_{n,r} = 0 if r < n - 1. This follows from (4.6.9), since λy_r(λ) is then of lower degree than y_n(λ), and so orthogonal to it by (4.6.2). For n = 0, (4.6.10) is still in force. If n = m - 1, the argument is to be modified. We have to determine y_m(λ), and constants c_{m-1}, b_{m-1}, such that y_m(λ) + h y_{m-1}(λ) has the points of increase of τ_{m,h}(λ), λ_0, ..., λ_{m-1}, say, as its zeros, and such that
\[ c_{m-1}\,y_m(\lambda) = (a_{m-1}\lambda + b_{m-1})\,y_{m-1}(\lambda) - c_{m-2}\,y_{m-2}(\lambda), \tag{4.6.11} \]
subject to c_{m-1} > 0. We define in this case
\[ y_m(\lambda) = \prod_{r=0}^{m-1}(\lambda - \lambda_r) - h\,y_{m-1}(\lambda), \tag{4.6.12} \]
so that y_m(λ) + h y_{m-1}(λ) will have the correct zeros, and y_m(λ) will have positive highest coefficient. There will still hold an identity of the form (4.6.8), where β_{m-1,m} is now to be determined by comparing the coefficients of λ^m. This shows that β_{m-1,m} = c_{m-1} is positive. The remaining β_{m-1,r} are determined as previously. To complete the proof, we need to show that the ρ_r^{-1} as given by (4.4.3) are the same as the jumps at the λ_r of the prescribed step function
τ_{m,h}(λ). Let the jumps of the latter function at the points λ_r be denoted (ρ_r')^{-1}. Then (4.5.3) may be written
\[ \sum_{r=0}^{m-1} (\rho_r')^{-1}\, y_p(\lambda_r)\, y_q(\lambda_r) = a_p^{-1}\delta_{pq}. \]
The dual orthogonality relations may be deduced, and read
\[ \sum_{n=0}^{m-1} a_n\, y_n(\lambda_r)\, y_n(\lambda_s) = \delta_{rs}\,\rho_r'. \]
From the case r = s we get
\[ \rho_r' = \sum_{n=0}^{m-1} a_n\{y_n(\lambda_r)\}^2 = \rho_r, \]
so that τ_{m,h}(λ) is indeed the spectral function of the boundary problem we have constructed.

We indicate briefly another solution of the problem which proceeds along quite different lines. Given τ_{m,h}(λ), we form f_{m,h}(λ) according to (4.5.5), and express it as a rational function -φ(λ)/χ(λ), say. In view of (4.5.4) we make the identification
\[ y_m(\lambda) + h\,y_{m-1}(\lambda) = \chi(\lambda), \qquad z_m(\lambda) + h\,z_{m-1}(\lambda) = \varphi(\lambda). \]
By (4.2.10) we must have
\[ \chi(\lambda)\,z_{m-1}(\lambda) - \varphi(\lambda)\,y_{m-1}(\lambda) = 1/c_{m-1}. \tag{4.6.15} \]
If for definiteness we take c_{m-1} = 1, there will be a pair of unique polynomials y_{m-1}(λ), z_{m-1}(λ), of degrees m - 1, m - 2, satisfying (4.6.15). Having found y_{m-1}(λ), z_{m-1}(λ) by the highest common factor process, we repeat the process, using (4.2.10) with n = m - 2 to obtain y_{m-2}(λ), z_{m-2}(λ), and so on. A related procedure, into which we do not enter here, will be to express f_{m,h}(λ) as a continued fraction.
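The orthogonalization step of Theorem 4.6.1 amounts, in the step-function case, to Gram-Schmidt with respect to a discrete inner product. The sketch below (our own toy data: nodes and weights are the spectral data of the sample recurrence y_{n+1} = λy_n - y_{n-1} with a_n = 1, used in the earlier sketches) recovers the monic orthogonal polynomials, which should coincide with the y_n of that recurrence since all its c_n equal 1:

```python
import numpy as np

lam_r = np.sort(np.roots([1, 0, -4, 0, 3, 0]).real)   # points of increase

def y_vals(l, k):
    """y_0(l), ..., y_{k-1}(l) for the toy recurrence, to form the weights."""
    v = [1.0]; p = 0.0
    for _ in range(k - 1):
        v.append(l * v[-1] - p); p = v[-2]
    return v

w = 1.0 / np.array([sum(u * u for u in y_vals(l, 5)) for l in lam_r])  # 1/rho_r

def ip(p, q):
    """Discrete inner product: integral of p*q against the step function."""
    return np.sum(w * np.polyval(p, lam_r) * np.polyval(q, lam_r))

polys = []
for n in range(5):
    p = np.zeros(n + 1); p[0] = 1.0           # start from monic lam^n
    for q in polys:                            # subtract projections (4.6.2)
        p = np.polysub(p, ip(p, q) / ip(q, q) * q)
    polys.append(p)

# The monic orthogonal polynomials reproduce y_0, ..., y_4 of the recurrence:
assert np.allclose(polys[4], [1, 0, -3, 0, 1], atol=1e-8)
```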
4.7. The Second Inverse Spectral Problem As in Section 1.8, although the boundary problem is not fixed by one set of eigenvalues, it is fixed by two sets, together with the corresponding boundary conditions. The problem is determinate to the extent that the
polynomials y_n(λ) are known apart from constant factors; if these are prescribed in some way, the recurrence relation is completely determinate. In one general form of the problem, real and distinct constants h_1, h_2 are given, and we are told the eigenvalues of (4.1.1), (4.1.3) with h = h_1, and again with h = h_2. Thus, for certain prescribed λ_{r,1}, λ_{r,2}, we have
\[ y_m(\lambda_{r,1}) + h_1\,y_{m-1}(\lambda_{r,1}) = 0, \qquad r = 0, \ldots, m-1, \tag{4.7.1} \]
and again,
\[ y_m(\lambda_{r,2}) + h_2\,y_{m-1}(\lambda_{r,2}) = 0, \qquad r = 0, \ldots, m-1. \tag{4.7.2} \]
The prescribed two sets of eigenvalues must have the interlacing property, by Theorem 4.3.3. More precisely, supposing that h_1 < h_2, we must have
\[ \lambda_{0,2} < \lambda_{0,1} < \lambda_{1,2} < \lambda_{1,1} < \cdots < \lambda_{m-1,2} < \lambda_{m-1,1}. \tag{4.7.3} \]
In justification of this ordering, we note that y_m(λ)/y_{m-1}(λ) tends to +∞ as λ → +∞, and so, since -h_1 > -h_2, the greatest of all the eigenvalues will be that for which y_m(λ)/y_{m-1}(λ) = -h_1. The solution of the problem is in some degree immediate. Without essential loss of generality, we may assume that y_m(λ) has unit highest coefficient, and so (4.7.1-2) give
\[ y_m(\lambda) + h_1\,y_{m-1}(\lambda) = \prod_{r=0}^{m-1}(\lambda - \lambda_{r,1}), \qquad y_m(\lambda) + h_2\,y_{m-1}(\lambda) = \prod_{r=0}^{m-1}(\lambda - \lambda_{r,2}). \tag{4.7.4} \]
These two equations may be solved for y_m(λ), y_{m-1}(λ). The assumed recurrence formula (4.1.1) then becomes a means for determining y_{m-2}(λ), y_{m-3}(λ), ... in turn, by successive division according to the Euclidean algorithm. For definiteness we take all the c_n = 1. The equation, for instance,
\[ y_m(\lambda) = (a_{m-1}\lambda + b_{m-1})\,y_{m-1}(\lambda) - y_{m-2}(\lambda), \tag{4.7.5} \]
in which y_m(λ), y_{m-1}(λ) have been found from (4.7.4), clearly determines a_{m-1}, b_{m-1} and y_{m-2}(λ). It is, however, not clear that we must have a_{m-1} > 0, and that y_{m-2}(λ) so found will have positive coefficient of λ^{m-2}. This extra information is, of course, to be deduced with the aid of the interlacing property (4.7.3). As a first step we show, as a consequence of (4.7.1-3), that the zeros
of y_m(λ), y_{m-1}(λ) also have the interlacing property required by Theorem 4.3.2. For this purpose we consider the variation of
\[ \omega(\lambda) = \arg\{y_{m-1}(\lambda) + i\,y_m(\lambda)\}. \tag{4.7.6} \]
This exists as a continuous function for real λ; the requirement (4.7.3) excludes the possibility of y_m(λ), y_{m-1}(λ) having a common zero. We write also -h_1 = tan α, -h_2 = tan β, where -½π < β < α < ½π. We start at λ = +∞ with ω(λ) = ½π, and let ω(λ) vary continuously as λ decreases. When λ reaches any of the values (4.7.3) we shall have y_m(λ)/y_{m-1}(λ) = -h_1 or -h_2, and so ω(λ) ≡ α, β (mod π). Now as λ decreases from +∞, the first such value it reaches is λ_{m-1,1}, at which ω(λ) ≡ α (mod π). We deduce that ω(λ_{m-1,1}) = α; it could not equal α ± π, since then ω(λ) would have to pass through the value β + π or β in between, corresponding to a λ-value of the series λ_{r,2}. Proceeding in this way, as λ decreases from +∞ to -∞, ω(λ) will start from ½π and pass successively through the values α, β, α - π, β - π, ..., tending as λ → -∞ to ½π - mπ. It will therefore be equal to a multiple of π at least m times, and in fact exactly m times, for finite λ, since when ω(λ) is a multiple of π, y_m(λ) = 0. There are thus m values of λ at which ω(λ) takes successively the values 0, -π, -2π, ..., -(m-1)π, and between these ω(λ) must assume the values -½π, -½π - π, ..., -½π - (m-2)π. The latter correspond to zeros of y_{m-1}(λ), and we have therefore shown that these zeros interlace with those of y_m(λ). The original inverse problem (4.7.1-3) may therefore be replaced by a simpler one, in which we assume that y_m(λ), of degree m, has the m real zeros λ_0, ..., λ_{m-1}, and that y_{m-1}(λ), of degree m - 1, has the zeros λ_0', ..., λ'_{m-2}, also real and interlacing with the previous set, so that
\[ \lambda_0 < \lambda_0' < \lambda_1 < \lambda_1' < \cdots < \lambda'_{m-2} < \lambda_{m-1}. \tag{4.7.7} \]
The question is whether, restricting y_m(λ), y_{m-1}(λ) to have positive highest coefficient, they form two members of a sequence defined by (4.1.1), (4.1.5), or, what comes to the same thing, whether they are members of an orthogonal set. Taking
\[ y_m(\lambda) = \prod_{r=0}^{m-1}(\lambda - \lambda_r), \qquad y_{m-1}(\lambda) = \prod_{r=0}^{m-2}(\lambda - \lambda_r'), \]
and finding y_{m-2}(λ) according to (4.7.5), we see that a_{m-1} > 0 by comparison of the coefficients of λ^m on the two sides of this equation. If we put in (4.7.5) λ = λ_r', so that y_{m-1}(λ_r') = 0, we see that y_m(λ_r'), y_{m-2}(λ_r') have opposite signs. Now it follows from (4.7.7) that y_m(λ) takes alternating
signs at the points λ_0', λ_1', ..., λ'_{m-2}, and hence y_{m-2}(λ) does the same. Hence y_{m-2}(λ) is of degree exactly m - 2, having m - 2 zeros which interlace with those of y_{m-1}(λ). Finally, we show that y_{m-2}(λ) has positive highest coefficient. From (4.7.7) we have that y_m(λ'_{m-2}) < 0, from which it follows that y_{m-2}(λ'_{m-2}) > 0. Since λ'_{m-2} is greater than all the zeros of y_{m-2}(λ), the result is proved. We have thus that y_{m-1}(λ), y_{m-2}(λ) are of degrees m - 1, m - 2, with positive highest coefficients and interlacing real zeros, in a similar manner to y_m(λ), y_{m-1}(λ). We may therefore repeat the algorithm (4.7.5) with m - 1 for m, continuing the process until all the y_n(λ) have been found.
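The successive-division process (4.7.5) is exactly polynomial division with remainder. The sketch below (a toy illustration with all c_n = 1; the starting pair is y_5, y_4 of the sample recurrence y_{n+1} = λy_n - y_{n-1}, our own choice of interlacing data) recovers the coefficients a_n, b_n by repeated division:

```python
import numpy as np

def reconstruct(ym, ym1):
    """Given monic y_m and y_{m-1} with interlacing zeros (coefficient arrays,
    highest power first), apply (4.7.5) with all c_n = 1:
    divide to get quotient a_n*lam + b_n and remainder -y_{n-1}; repeat."""
    coeffs = []
    while len(ym) > 1:
        q, r = np.polydiv(ym, ym1)        # ym = q*ym1 + r
        coeffs.append((q[0], q[1]))       # (a_n, b_n) from q = a_n*lam + b_n
        ym, ym1 = ym1, -r                 # next pair: y_{n-1}, y_{n-2}
    return coeffs[::-1]                   # (a_0, b_0), ..., (a_{m-1}, b_{m-1})

y5 = np.array([1, 0, -4, 0, 3, 0], float)   # lam^5 - 4 lam^3 + 3 lam
y4 = np.array([1, 0, -3, 0, 1], float)      # lam^4 - 3 lam^2 + 1
ab = reconstruct(y5, y4)
# For this toy pair the recurrence has a_n = 1, b_n = 0 throughout:
assert all(abs(a - 1.0) < 1e-10 and abs(b) < 1e-10 for a, b in ab)
```

The positivity of the recovered a_n is exactly what the interlacing argument above guarantees in general.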
4.8. Spectral Functions in General

Having constructed polynomials recursively from (4.1.1), (4.1.5), the spectral function formed in (4.5.1-2) fulfilled the role of ensuring that the polynomials were orthogonal. As in Section 1.9, we raise the question of what other orthogonalities these same polynomials may possess, that is to say, we ask for a general characterization of functions τ(λ) such that
\[ \int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau(\lambda) = a_p^{-1}\delta_{pq}, \qquad 0 \le p, q \le m-1. \tag{4.8.1} \]
We shall here impose in any case the restrictions that τ(λ) be nondecreasing, right-continuous, and such that the integrals (4.8.1) are absolutely convergent, which is ensured by
\[ \int_{-\infty}^{\infty} \lambda^{2m-2}\, d\tau(\lambda) < \infty. \tag{4.8.2} \]
We consider here particularly the case when τ(λ) is a step function with a finite number of jumps. A nondecreasing right-continuous function satisfying (4.8.1-2) may be termed a spectral function, for the functions y_0(λ), ..., y_{m-1}(λ), or for the recurrence relation (4.1.1) with initial conditions (4.1.5). In place of postulating (4.8.1), we may equivalently postulate the "expansion theorem," according to which for any sequence u_0, ..., u_{m-1} we define
\[ v(\lambda) = \sum_{n=0}^{m-1} a_n\, u_n\, y_n(\lambda), \tag{4.8.3} \]
and derive the expansion
\[ u_n = \int_{-\infty}^{\infty} v(\lambda)\, y_n(\lambda)\, d\tau(\lambda). \tag{4.8.4} \]
There is no suggestion that the expansion holds in its inverse form, in which we start with v(λ), define the u_n by (4.8.4), and deduce (4.8.3). The formulation in terms of the expansion theorem rather than in terms of the dual orthogonality (4.8.1) is more suitable in the continuous case of differential equations. We may regard (4.8.1) as a moment problem, specifying the moments of the y_p(λ)y_q(λ), 0 ≤ p, q ≤ m - 1. Since these functions are linear combinations of the functions 1, λ, λ², ..., λ^{2m-2}, we may replace (4.8.1) by a set of moment conditions on these powers.
Theorem 4.8.1. Defining $\mu_j$, $j = 0, \dots, 2m-2$, by (4.6.3), for any real $h$, $\tau_{m,h}(\lambda)$ being given by (4.5.1-2), it is necessary and sufficient for (4.8.1) that

$$\int_{-\infty}^{\infty} \lambda^j\, d\tau(\lambda) = \mu_j, \qquad j = 0, \dots, 2m-2. \tag{4.8.5}$$
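The moment conditions (4.8.5) can be checked numerically. In the Legendre case the continuous spectral function is $d\tau(\lambda) = d\lambda/2$ on $[-1, 1]$, and the $m$-point Gauss rule supplies a discrete spectral function with the same moments up to order $2m-2$; a sketch under those assumptions (names ours):

```python
import numpy as np

m = 4
nodes, weights = np.polynomial.legendre.leggauss(m)
weights = weights / 2.0   # jumps of the discrete spectral function

# Moments mu_j of d(lambda)/2 on [-1, 1]: 1/(j+1) for even j, 0 for odd j.
mu = np.array([1.0/(j + 1) if j % 2 == 0 else 0.0 for j in range(2*m - 1)])

# Moments of the discrete spectral function, j = 0, ..., 2m-2:
disc = np.array([np.sum(weights * nodes**j) for j in range(2*m - 1)])
print(np.max(np.abs(disc - mu)))  # agrees to rounding error
```

By Theorem 4.8.1, the agreement of these $2m-1$ moments is exactly what forces the two measures to induce the same orthogonality (4.8.1).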
It follows immediately from (4.8.5), (4.6.3) that

$$\int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau(\lambda) = \int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau_{m,h}(\lambda), \tag{4.8.6}$$

for $p, q = 0, \dots, m-1$, and so (4.8.1) follows from (4.5.3). Conversely, if (4.8.1) holds, then so does (4.8.6), and so also
$$\int_{-\infty}^{\infty} \lambda^j\, d\tau(\lambda) = \int_{-\infty}^{\infty} \lambda^j\, d\tau_{m,h}(\lambda), \qquad j = 0, \dots, 2m-2, \tag{4.8.7}$$
since $\lambda^j$ can be expressed as a linear combination of the $y_p(\lambda)\, y_q(\lambda)$. The equivalent conditions (4.8.6-7) are also equivalent to the property that
$$\int_{-\infty}^{\infty} \pi(\lambda)\, d\tau(\lambda) = \int_{-\infty}^{\infty} \pi(\lambda)\, d\tau_{m,h}(\lambda), \tag{4.8.8}$$
where $\pi(\lambda)$ is an arbitrary polynomial of degree not more than $2m-2$. This is the property of "mechanical quadrature," replacing the possibly continuous integral on the left by the finite sum on the right. Parallel to Theorem 1.10.1, we observe that if the spectral function $\tau(\lambda)$, in the sense (4.8.1-2), is a nondecreasing step function with a finite number of jumps, then it must coincide with the spectral function $\tau_{m',h}(\lambda)$ for some $m' \ge m$ and some real $h$. In the first place, to see that $m' \ge m$, we observe that (4.8.1) is impossible if $\tau(\lambda)$ has fewer than $m$ jumps. If it did, we could construct a polynomial of degree $m-1$, or less, to vanish at all the jumps of $\tau(\lambda)$, which could be put in the form
$\pi(\lambda) = \sum_{n=0}^{m-1} \alpha_n y_n(\lambda)$, with the $\alpha_n$ not all zero. Since $\pi(\lambda)$ is to vanish at all the jumps of $\tau(\lambda)$,

$$\int_{-\infty}^{\infty} \{\pi(\lambda)\}^2\, d\tau(\lambda) = 0. \tag{4.8.9}$$
However, the integral on the left of (4.8.9) is, by (4.8.1), equal to $\sum_{n=0}^{m-1} a_n^{-1} \alpha_n^2$, which cannot vanish, giving a contradiction. We now apply the construction of Theorem 4.6.2, with $m'$, the number of jumps of $\tau(\lambda)$, in place of $m$; if $m' > m$, the additional positive constants $a_m, \dots, a_{m'-1}$ are to be taken arbitrarily. Orthogonalizing with respect to $\tau(\lambda)$, we form a set of polynomials $y_0(\lambda), \dots, y_{m'-1}(\lambda)$ of which the first $m$ must coincide with those used in (4.8.1). It is not hard to show that they satisfy a recurrence relation of which the first $m$ stages coincide with (4.1.1). An alternative method of characterizing a spectral function $\tau(\lambda)$ proceeds by setting up the equivalent restrictions for its Stieltjes transform

$$f(\lambda) = \int_{-\infty}^{\infty} (\lambda - \mu)^{-1}\, d\tau(\mu), \tag{4.8.10}$$
corresponding to the characteristic function. In the case of the recurrence relation studied in Chapter 1, we found an interpolatory property (1.9.5-6) for $f(\lambda)$. Here the corresponding property is the asymptotic behavior of $f(\lambda)$ for large $\lambda$. If for simplicity we take the case when $\tau(\lambda)$ is a step function with a finite number of jumps, then for large $\lambda$, $f(\lambda)$ will admit an expansion

$$f(\lambda) = \sum_{j=0}^{\infty} \mu_j\, \lambda^{-j-1}. \tag{4.8.11}$$

Since the first $2m-1$ coefficients are the power moments (4.8.5), we see by Theorem 4.8.1 that we must have

$$f(\lambda) = \sum_{j=0}^{2m-2} \mu_j\, \lambda^{-j-1} + o(\lambda^{-2m+1}), \tag{4.8.12}$$

if the orthogonality (4.8.1) is to hold. Since this expansion certainly holds in the case of $f_{m,h}(\lambda)$, by (4.5.5), (4.6.3), our requirement is that

$$f(\lambda) = f_{m,h}(\lambda) + o(\lambda^{-2m+1}) \qquad \text{for large } \lambda. \tag{4.8.13}$$
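The asymptotic condition (4.8.12) can be seen numerically: for a discrete spectral function, $f(\lambda) = \sum_r \rho_r/(\lambda - \lambda_r)$, and the truncated moment series reproduces $f$ to well within $o(\lambda^{-2m+1})$. A sketch with Gauss-Legendre data (our construction, not the book's):

```python
import numpy as np

m = 3
nodes, w = np.polynomial.legendre.leggauss(m)
rho = w / 2.0                       # jumps rho_r at the nodes lambda_r

lam = 50.0                          # a large real point
f = np.sum(rho / (lam - nodes))     # Stieltjes transform of the discrete measure

mu = np.array([np.sum(rho * nodes**j) for j in range(2*m - 1)])
series = sum(mu[j] * lam**(-j - 1) for j in range(2*m - 1))
print(abs(f - series), lam**(-2*m + 1))  # remainder far below lam**(-2m+1)
```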
Since $f(\lambda)$ as given by (4.8.10) has negative imaginary part when $\lambda$ is in the upper half-plane, we have reached the problem of finding all functions of this type which have a given asymptotic expansion (4.8.12) or (4.8.13) for large $\lambda$. This takes the place in this case of the Pick-Nevanlinna problem mentioned in Section 1.10, of finding such functions taking specified values at specified points. We shall not pursue this further, referring to works on the theory of moments and on continued fractions.
4.9. Some Continuous Spectral Functions

As remarked in Section 1.10, from particular spectral functions others may be formed by taking arithmetic means. In particular, from the spectral functions $\tau_{m,h}(\lambda)$, for fixed $m$ and all real $h$, we may average over $h$ to form continuous distributions, with respect to which the $y_n(\lambda)$ are orthogonal. Here we give an example of this process. First we return to the topic of the dependence of the eigenvalues on the boundary parameter $h$. If the determining equation for the eigenvalues

$$y_m(\lambda_r) + h\, y_{m-1}(\lambda_r) = 0 \tag{4.9.1}$$

be differentiated with respect to $h$, we derive

$$\{y_m'(\lambda_r) + h\, y_{m-1}'(\lambda_r)\}\, (d\lambda_r/dh) + y_{m-1}(\lambda_r) = 0.$$

Using (4.4.4) this gives

$$\rho_r^{-1}\, (d\lambda_r/dh) = -c_{m-1}\, \{y_{m-1}(\lambda_r)\}^2. \tag{4.9.2}$$
By means of (4.9.1) we put this in the form

$$\rho_r^{-1}\, (d\lambda_r/dh) = -c_{m-1}\, \{(y_m(\lambda_r))^2 + (y_{m-1}(\lambda_r))^2\}/(1 + h^2), \tag{4.9.3}$$

and now write the dual orthogonality (4.4.8) as

$$\sum_{r=0}^{m-1} \frac{\rho_r\, y_p(\lambda_r)\, y_q(\lambda_r)}{1+h^2} = \frac{a_p^{-1}\,\delta_{pq}}{1+h^2}. \tag{4.9.4}$$

We now integrate with respect to $h$ over $(-\infty, \infty)$. The $\lambda_r$ are, by (4.9.2), decreasing functions of $h$, and will between them cover the whole real $\lambda$-axis, except for the points at infinity and the zeros of $y_{m-1}(\lambda)$. Each of the intervals into which the real axis is divided by these zeros will be described by one of the $\lambda_r$ as $h$ varies over the real
axis. Thus on integrating the left of (4.9.4) with respect to $h$, we derive an integral over the $\lambda$-axis, with reversed sign. Hence

$$\int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, \{(y_m(\lambda))^2 + (y_{m-1}(\lambda))^2\}^{-1}\, d\lambda = \pi\, c_{m-1}\, \delta_{pq}\, a_p^{-1}. \tag{4.9.5}$$
To formulate the result, we make a slight generalization based on the observation that the recurrence formula (4.1.1) is still in force if we replace $y_m$ by $y_m/c$, and $c_{m-1}$ by $c_{m-1}c$, for real positive $c$. With this modification, our result becomes:

Theorem 4.9.1. Let the $y_n(\lambda)$ be defined by (4.1.1), (4.1.5), subject to (4.1.2). Then, for any $c > 0$, the $y_n(\lambda)$ are orthogonal according to

$$\int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, |\, y_m(\lambda) + i c\, y_{m-1}(\lambda)\, |^{-2}\, d\lambda = \pi\, c^{-1}\, c_{m-1}\, \delta_{pq}\, a_p^{-1}, \tag{4.9.6}$$

for $0 \le p, q \le m-1$. To illustrate this in the case of, for example, the Legendre polynomials, we get the formula

$$\int_{-\infty}^{\infty} P_p(\lambda)\, P_q(\lambda)\, |\, P_m(\lambda) + i c\, P_{m-1}(\lambda)\, |^{-2}\, d\lambda = \pi\, m\, c^{-1}\, (2p+1)^{-1}\, \delta_{pq}. \tag{4.9.7}$$
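As a numerical check on the Legendre formula, assuming the normalization reconstructed here, namely $\int_{-\infty}^{\infty} P_p P_q\, |P_m + i c P_{m-1}|^{-2}\, d\lambda = \pi m c^{-1}(2p+1)^{-1}\delta_{pq}$, one may integrate over the whole line through the substitution $\lambda = \tan\theta$ (a midpoint rule; all names and tolerances are ours):

```python
import numpy as np
from numpy.polynomial import legendre as L

def lhs(p, q, m, c, N=400000):
    h = np.pi / N
    th = -np.pi/2 + (np.arange(N) + 0.5) * h   # midpoints, endpoints avoided
    lam = np.tan(th)
    jac = 1.0 / np.cos(th)**2                  # d(lambda) = jac * d(theta)
    Pp = L.legval(lam, [0]*p + [1])
    Pq = L.legval(lam, [0]*q + [1])
    Pm = L.legval(lam, [0]*m + [1])
    Pm1 = L.legval(lam, [0]*(m - 1) + [1])
    return h * np.sum(Pp * Pq / (Pm**2 + (c*Pm1)**2) * jac)

for (p, q, m, c) in [(0, 0, 1, 1.0), (0, 0, 1, 2.0), (0, 0, 2, 1.0), (1, 1, 2, 1.0), (0, 1, 2, 1.0)]:
    rhs = np.pi * m / (c * (2*p + 1)) if p == q else 0.0
    print((p, q, m, c), lhs(p, q, m, c), rhs)
```

For $m = 1$ the integral reduces to $\int d\lambda/(\lambda^2 + c^2) = \pi/c$, which the formula reproduces exactly.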
CHAPTER 5

Orthogonal Polynomials: The Infinite Case

5.1. Limiting Boundary Problems
In a similar manner to Chapter 2, we may observe at least two ways of applying limiting processes to obtain further results from those of Chapter 4. We may visualize these compactly in terms of the vibrating string. The first process may be thought of as the approximation to a continuous heavy string by means of a light string bearing a large number of particles, or again as the approximation to a differential equation by means of a difference equation with a large number of stages. In the second process, we have the light string bearing $n$ particles and consider this a part of a string bearing an infinity of particles, possibly infinite in length. Keeping one point on the string permanently fixed, and increasing the length of the vibrating portion, and therewith the number of vibrating particles, we obtain a sequence of boundary problems, which may be expected to display some limiting properties. It is this latter type of problem that we consider in this chapter. Mathematically expressed, we suppose given an infinite sequence of recurrence formulas

$$c_n y_{n+1} = (a_n \lambda + b_n)\, y_n - c_{n-1} y_{n-1} \qquad (n = 0, 1, \dots), \tag{5.1.1}$$

where

$$c_n > 0, \qquad a_n > 0, \tag{5.1.2}$$

and impose at least the initial conditions

$$y_{-1} = 0, \qquad y_0 \ne 0. \tag{5.1.3}$$
The question then immediately presents itself of whether a boundary problem with eigenvalues can be formed by imposing a boundary condition at $n = \infty$, say, by

$$\lim_{n\to\infty} y_n(\lambda) = 0. \tag{5.1.4}$$
However, it is not at all clear that this limit need always exist. Heuristically, we expect the limit to exist in the case of a finite string, bearing an infinite succession of particles of finite total mass, that is to say, in the case

$$b_n = c_n + c_{n-1}, \qquad \sum_{0}^{\infty} 1/c_n < \infty, \qquad \sum_{0}^{\infty} a_n < \infty; \tag{5.1.5}$$

here the first equation expresses a special form for the vibrating string, while $1/c_n$ represents the distance between successive particles, and $a_n$ represents the mass of a typical particle. It is not hard to show that these conditions, and, indeed, more general conditions, ensure the convergence of $y_n(\lambda)$. Still another possibility of forming a boundary problem is that we proceed to the limit as $m \to \infty$ in (4.4.2), taking some $\lambda_0$ as a given eigenvalue. That is to say, we select some $\lambda_0$, and define the remaining eigenvalues as solutions of

$$\lim_{m\to\infty} c_{m-1}\{y_m(\lambda)\, y_{m-1}(\lambda_0) - y_{m-1}(\lambda)\, y_m(\lambda_0)\} = 0. \tag{5.1.6}$$

We may avoid, for the time being, considering whether such limits exist, as those in (5.1.4), (5.1.6), by an altogether simpler procedure, in which we give up the attempt to form a boundary problem, and instead proceed to the limit in the case of the spectral function. This approach yields important results, and we proceed to it now.

5.2. Spectral Functions
For the finite orthogonal polynomials of Chapter 4 we defined the spectral function $\tau_{m,h}(\lambda)$ in the first place explicitly, in terms of the eigenvalues and the associated normalization constants, in Section 4.5, it then being known that the polynomials $y_n(\lambda)$ were orthogonal with respect to it, according to (4.5.3); subsequently, in Section 4.8, we considered the inverse procedure in which the orthogonality is taken as basic, a spectral function being a nondecreasing function with respect to which the $y_n(\lambda)$ are orthogonal. This latter procedure is still available to us here, and involves no prior knowledge of, or even definition of, the eigenvalues. We define the $y_n(\lambda)$, as before, as solutions of

$$c_n y_{n+1}(\lambda) = (a_n \lambda + b_n)\, y_n(\lambda) - c_{n-1} y_{n-1}(\lambda), \tag{5.2.1}$$

$$y_{-1}(\lambda) = 0, \qquad c_{-1}\, y_0(\lambda) = 1. \tag{5.2.2}$$
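The solutions of (5.2.1-2) are generated directly by the recurrence; with the Legendre data $a_n = 2n+1$, $b_n = 0$, $c_n = n+1$ (and $c_{-1} = 1$), the values $P_n(\lambda)$ appear. A sketch (function names ours):

```python
def y_values(lam, N, a, b, c, c_init=1.0):
    # y_{-1} = 0, c_{-1} y_0 = 1, then
    # c_n y_{n+1} = (a_n lam + b_n) y_n - c_{n-1} y_{n-1}.
    ys = [1.0 / c_init]
    y_prev, c_prev = 0.0, c_init
    for n in range(N):
        y_next = ((a(n)*lam + b(n)) * ys[n] - c_prev * y_prev) / c(n)
        y_prev, c_prev = ys[n], c(n)
        ys.append(y_next)
    return ys

# Legendre case: P_0(0.5), ..., P_5(0.5)
ys = y_values(0.5, 5, lambda n: 2*n + 1, lambda n: 0.0, lambda n: n + 1)
print(ys)
```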
A spectral function $\tau(\lambda)$ is to be nondecreasing, right-continuous, satisfying the boundedness requirement

$$\int_{-\infty}^{\infty} \lambda^{2n}\, d\tau(\lambda) < \infty, \qquad n = 0, 1, \dots, \tag{5.2.3}$$

for all $n$, and the orthogonality

$$\int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau(\lambda) = a_p^{-1}\,\delta_{pq}, \qquad p, q = 0, 1, \dots. \tag{5.2.4}$$

In addition we may impose for definiteness the requirement that $\tau(0) = 0$. Such a function may be termed a spectral function, associated with the recurrence formula (5.2.1) and initial conditions (5.2.2). The existence of spectral functions is settled by transition to the limit in the finite-dimensional case of Chapter 4.
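For the Legendre coefficients this limiting transition can be watched directly: taking $h = 0$, the jumps of $\tau_{m,h}(\lambda)$ sit at the zeros of $P_m$ with the (halved) Gauss weights, and the resulting step functions approach the continuous spectral function $\tau(\lambda) = (\lambda+1)/2$ on $[-1, 1]$. This is our illustration, not a construction from the text:

```python
import numpy as np

def tau_m(m, lam):
    # discrete spectral function: total jump mass at eigenvalues <= lam
    nodes, w = np.polynomial.legendre.leggauss(m)
    return np.sum((w / 2.0)[nodes <= lam])

grid = np.linspace(-0.95, 0.95, 381)
errs = []
for m in (10, 40, 160):
    err = max(abs(tau_m(m, x) - (x + 1)/2) for x in grid)
    errs.append(err)
    print(m, err)   # decreases roughly like 1/m
```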
Theorem 5.2.1. In the recurrence relation (5.2.1) let the $a_n$, $c_n$ be positive and the $b_n$ real. Then there is at least one nondecreasing function satisfying the requirements (5.2.3-4).

For the proof we take a sequence of finite-dimensional spectral functions $\tau_{m,h}(\lambda)$, $m = 1, 2, \dots$, formed according to (4.5.1-2). Purely for definiteness, we assume that $h$ does not vary with $m$, being some arbitrary fixed real number. By (4.5.3), with $p = q = 0$,

$$\int_{-\infty}^{\infty} d\tau_{m,h}(\lambda) = c_{-1}^2\, a_0^{-1}, \tag{5.2.5}$$

for all $m$, from which we draw the conclusion that the $\tau_{m,h}(\lambda)$ are uniformly bounded. We can therefore select a sequence $m_1, m_2, \dots$, such that the $\tau_{m_k,h}(\lambda)$ converge; denoting the limit by $\tau(\lambda)$, we shall have

$$\tau_{m_k,h}(\lambda) \to \tau(\lambda), \qquad k \to \infty, \tag{5.2.6}$$

for all finite real $\lambda$, with the possible exception of a denumerable set, the points of discontinuity of $\tau(\lambda)$, at which we standardize $\tau(\lambda)$ by right-continuity. Clearly $\tau(\lambda)$ will also be bounded, for all real $\lambda$, and also nondecreasing. Next we exploit the property of mechanical quadrature. Since

$$\int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau_{m,h}(\lambda) \tag{5.2.7}$$

is, by (4.5.3), independent of $m$, $h$ so long as $m > p$, $m > q$, and since
any power, $\lambda^{2j}$, say, can be expressed as a linear combination of the $y_p(\lambda)\, y_q(\lambda)$ for $0 \le p, q \le j$, it follows that the integrals

$$\int_{-\infty}^{\infty} \lambda^{2j}\, d\tau_{m,h}(\lambda) \tag{5.2.8}$$

are independent of $m$, $h$ for $m > j$ and fixed $j$. Thus for fixed $j$, the integrals (5.2.8) have a fixed upper bound. We deduce (see Appendix I) that we may make here the limiting transition (5.2.6) in integrals with polynomial integrands, namely,

$$\int_{-\infty}^{\infty} \pi(\lambda)\, d\tau_{m_k,h}(\lambda) \to \int_{-\infty}^{\infty} \pi(\lambda)\, d\tau(\lambda), \qquad k \to \infty, \tag{5.2.9}$$

where $\pi(\lambda)$ is any polynomial. In particular,

$$\int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau_{m_k,h}(\lambda) \to \int_{-\infty}^{\infty} y_p(\lambda)\, y_q(\lambda)\, d\tau(\lambda), \tag{5.2.10}$$

and since the left-hand side has the value given by (4.5.3) if $m_k$ is large enough, we have proved (5.2.4), and incidentally (5.2.3). This proves the theorem, and so the existence of at least one spectral function. The spectral function will have the property of mechanical quadrature.

Theorem 5.2.2. Let $\tau(\lambda)$ be as in Theorem 5.2.1. Then if $\pi(\lambda)$ is any polynomial of degree less than $2m-1$,
$$\int_{-\infty}^{\infty} \pi(\lambda)\, d\tau_{m,h}(\lambda) = \int_{-\infty}^{\infty} \pi(\lambda)\, d\tau(\lambda). \tag{5.2.11}$$

For we may express $\pi(\lambda)$, in an infinity of ways, in the form

$$\pi(\lambda) = \sum_{p,q} \alpha_{pq}\, y_p(\lambda)\, y_q(\lambda), \qquad 0 \le p, q \le m-1,$$

and on inserting this on the left and right of (5.2.11) we obtain the same result in view of (5.2.4). In proving merely the existence of at least one spectral function by a limiting transition from the finite-dimensional case, we have left untouched two more delicate questions: first, whether there can be more than one spectral function, and second, if there exists more than one spectral function, and so an infinity of such functions, whether they can all be obtained by such limiting transitions. These questions are more
conveniently discussed in terms of the convergence of the characteristic function $f_{m,h}(\lambda)$ of Section 4.5. Before taking this up we discuss the orthogonality.
5.3. Orthogonality and Expansion Theorem

In the existence of at any rate one spectral function we have the important result that the recurrence relation (5.2.1-2) and the orthogonality (5.2.3-4) form equivalent starting points for the theory of orthogonal polynomials. From the orthogonality it is almost immediate that there holds a recurrence relation of the form (5.2.1); the proof given in Section 4.6 applies without modification for the case of an arbitrary nondecreasing, suitably bounded function $\tau(\lambda)$. The existence of such a $\tau(\lambda)$ for a given recurrence relation is what we proved in the last section. Associated with the orthogonality (5.2.4) there will be an expansion theorem, which will formally run as follows: We take an arbitrary sequence

$$u_0, u_1, \dots, \tag{5.3.1}$$

which is to be expressed as a linear combination, as series or integral, in terms of the sequences

$$y_0(\lambda), y_1(\lambda), \dots, \tag{5.3.2}$$

where the $y_n(\lambda)$ are the orthogonal polynomials given by (5.2.1-2); in certain cases, but not in all, the $\lambda$-values appearing in (5.3.2) may admit interpretation as eigenvalues of a boundary problem. Defining the Fourier coefficient

$$v(\lambda) = \sum_{n=0}^{\infty} a_n u_n y_n(\lambda), \tag{5.3.3}$$

the expansion theorem will assert that, in some sense and under suitable restrictions,

$$u_n = \int_{-\infty}^{\infty} v(\lambda)\, y_n(\lambda)\, d\tau(\lambda), \tag{5.3.4}$$

with the Parseval equality

$$\sum_{n=0}^{\infty} a_n u_n^2 = \int_{-\infty}^{\infty} \{v(\lambda)\}^2\, d\tau(\lambda). \tag{5.3.5}$$

At a trivial level, we observe that the expansion theorem certainly holds for finite sequences.
Theorem 5.3.1. In the sequence (5.3.1) let there be only a finite number of non-zero terms, and let $\tau(\lambda)$ be a spectral function in the sense of Section 5.2. Then the expansion theorem (5.3.4) and Parseval equality (5.3.5) hold.

For the proof we take $v(\lambda)$ as given by (5.3.3), which is now a finite sum, and substitute on the right of (5.3.4-5), when we get the left of (5.3.4-5) by means of (5.2.4). For other cases we impose the restriction on the sequence (5.3.1) that it should be of "integrable square" in the sense that

$$\sum_{n=0}^{\infty} a_n u_n^2 < \infty. \tag{5.3.6}$$

The series (5.3.3) defining the Fourier coefficient will then converge absolutely, by the Cauchy inequality, if

$$\sum_{n=0}^{\infty} a_n \{y_n(\lambda)\}^2 < \infty. \tag{5.3.7}$$

Whether this is so or not may be settled in the case that $\tau(\lambda)$ is a step function.

Theorem 5.3.2. In order that (5.3.7) hold for some real $\lambda'$, it is necessary and sufficient that there exist a spectral function with a positive jump at $\lambda'$.

Supposing that $\tau(\lambda)$ is a spectral function with a positive jump at $\lambda'$, the conclusion (5.3.7) with $\lambda = \lambda'$ follows by orthogonality arguments (see Appendix III). Suppose again that (5.3.7) holds with $\lambda = \lambda'$. We construct a sequence of finite-dimensional spectral functions $\tau_{m,h}(\lambda)$, where $h = h_m$ is chosen so that $\lambda'$ is an eigenvalue. This means that

$$y_m(\lambda') + h_m\, y_{m-1}(\lambda') = 0.$$

We may choose $h_m$ accordingly except when $y_{m-1}(\lambda') = 0$; the latter cannot be so for two consecutive values of $m$ by the separation property, and so we can always choose an infinite sequence of $m$, $h_m$, such that $\lambda'$ is in the spectrum. The corresponding jump in the spectral function is, by (4.4.3) and (4.5.1-2),

$$\left\{ \sum_{n=0}^{m-1} a_n\, \{y_n(\lambda')\}^2 \right\}^{-1}, \tag{5.3.8}$$

and this will have a positive lower bound, if (5.3.7) is applicable. We now choose an $m$-sequence such that $\tau_{m,h_m}(\lambda)$ converges as $m \to \infty$, and the limiting spectral function $\tau(\lambda)$ must likewise have a jump at
$\lambda'$, of amount not less than the lower bound of (5.3.8) as $m \to \infty$. This completes the proof. We may now dispose of the case in which $\tau(\lambda)$ is a step function.
Theorem 5.3.3. Let $\tau(\lambda)$ be a spectral function and also a step function, with jumps forming a denumerable set. Let the sequence $u_n$ satisfy (5.3.6). Then there holds the expansion theorem (5.3.3-4) and, what is more, the Parseval equality (5.3.5).

The proof depends only on orthogonality arguments, and is given in Appendix III. The assumption that $\tau(\lambda)$ is a given step function ensures that the definition (5.3.3) has sense at jumps of $\tau(\lambda)$, but not necessarily at other $\lambda$-values. Hence the integrals (5.3.4-5) are to have the interpretation of sums, of products of the jumps of $\tau(\lambda)$ and the value thereat of the integrand. Such an interpretation is given by use of the Lebesgue-Stieltjes integral. In the case when $\tau(\lambda)$ need not be a step function, the condition (5.3.6) on the sequence to be expanded does not ensure the convergence of the series (5.3.3) for the Fourier coefficient for any particular $\lambda$-value. The series does, however, converge in mean, with respect to $\tau(\lambda)$. Writing

$$v_n(\lambda) = \sum_{k=0}^{n} a_k u_k y_k(\lambda), \tag{5.3.9}$$

we have from Theorem 5.3.1 that, for $0 \le n < n'$,

$$\int_{-\infty}^{\infty} \{v_{n'}(\lambda) - v_n(\lambda)\}^2\, d\tau(\lambda) = \sum_{k=n+1}^{n'} a_k u_k^2,$$

so that the left-hand side tends to zero as $n \to \infty$. From this we deduce that $v_n(\lambda)$ converges as $n \to \infty$ for almost all real $\lambda$, with respect to the measure $d\tau(\lambda)$. The limit $v(\lambda)$ is defined for almost all $\lambda$. The proof of (5.3.4-5) may be accomplished by replacing $v(\lambda)$ by $v_n(\lambda)$ and making $n \to \infty$, the integrals being taken in the Lebesgue-Stieltjes sense.
5.4. Nesting Circle Analysis

We now consider the limiting behavior of the characteristic function $f_{m,h}(\lambda)$ defined in (4.5.4-5). From (4.5.5) it is evident that if $\tau_{m,h}(\lambda)$ tends to a unique limit as $m \to \infty$, with $h$ possibly varying with $m$, then $f_{m,h}(\lambda)$ also tends to a unique limit for fixed complex $\lambda$. If therefore we can ascertain by another method that $f_{m,h}(\lambda)$ does not tend to a
unique limit, then we shall know that there is an infinity of spectral functions $\tau(\lambda)$. The alternative expression (4.5.4) does give an alternative approach to the limiting behavior of the characteristic function, and leads to a nesting circle description similar to that of Section 2.4. We denote by $C(m, \lambda)$ the locus of $f_{m,h}(\lambda)$ as $h$ describes the real axis, taking $\lambda$ to be fixed and in the upper half-plane. We denote also by $D(m, \lambda)$ the region described by $f_{m,h}(\lambda)$ when $h$ takes all values in the upper half-plane. For example, since $y_{-1} = 0$, $y_0 = 1/c_{-1}$, $z_{-1} = 1$, $z_0 = 0$, we find that

$$f_{0,h}(\lambda) = -c_{-1}\, h, \tag{5.4.1}$$

so that $C(0, \lambda)$ is the real axis and $D(0, \lambda)$ is the lower half-plane. Since further $y_1(\lambda) = (a_0 \lambda + b_0)/(c_0 c_{-1})$, and $z_1 = -c_{-1}/c_0$, we have

$$f_{1,h}(\lambda) = c_{-1}^2/(a_0 \lambda + b_0 + c_0 h). \tag{5.4.2}$$

For fixed $\lambda$ in the upper half-plane and varying real $h$, $f_{1,h}(\lambda)$ describes a finite curve which must be a circle, by the elementary theory of conformal mapping. Thus $C(1, \lambda)$ is a circle; since $\operatorname{Im} \lambda > 0$, it follows from (5.4.2) that $\operatorname{Im} f_{1,h}(\lambda) \le 0$ when $h$ is real, with equality only when $h$ is infinite. Thus the circle $C(1, \lambda)$ lies in the lower half-plane, touching the real axis at the origin. Since, again by (5.4.2), $f_{1,h}(\lambda)$ is finite when $\operatorname{Im} h > 0$, the region $D(1, \lambda)$ is the inside of the circle $C(1, \lambda)$. We have here the beginnings of the nesting property, in that $C(1, \lambda)$ lies inside the region $D(0, \lambda)$, and contains $D(1, \lambda)$ as its interior. For the general result we proceed inductively, showing that
$$C(m+1, \lambda) \subset D(m, \lambda). \tag{5.4.3}$$

By the recurrence relation we have

$$f_{m+1,h}(\lambda) = -\frac{(a_m\lambda + b_m + h c_m)\, z_m(\lambda) - c_{m-1}\, z_{m-1}(\lambda)}{(a_m\lambda + b_m + h c_m)\, y_m(\lambda) - c_{m-1}\, y_{m-1}(\lambda)},$$

which is equivalent to

$$f_{m+1,h}(\lambda) = f_{m,h'}(\lambda), \tag{5.4.4}$$

where

$$h' = -c_{m-1}/(a_m \lambda + b_m + h c_m). \tag{5.4.5}$$

Now if $h$ is real, and as always $\operatorname{Im} \lambda > 0$, we shall have $\operatorname{Im} h' > 0$, and so the points of $f_{m+1,h}(\lambda)$ when $h$ is real are points of $f_{m,h'}(\lambda)$ when $h'$ is in the upper half-plane. This proves that $C(m+1, \lambda) \subset D(m, \lambda)$. The same argument proves that $D(m+1, \lambda) \subset D(m, \lambda)$, since if
$\operatorname{Im} h > 0$, then (5.4.5) shows that $\operatorname{Im} h' > 0$. Thus $C(2, \lambda)$ lies inside $D(1, \lambda)$, and must in particular be a circle rather than a straight line, and $D(2, \lambda)$, lying inside $D(1, \lambda)$, must be the finite region bounded by $C(2, \lambda)$, and so a disk, and so on. As in Section 2.4, we recognize two possibilities, according to whether the nesting circles contract to a point, or to a limiting circle, these two cases being the limit-point and limit-circle cases, respectively. We obtain an analytic discrimination between the cases by calculating the radius of the circle $C(m, \lambda)$. For this purpose we note that one point of $C(m, \lambda)$, given by $h = \infty$, is $-z_{m-1}(\lambda)/y_{m-1}(\lambda)$, so that the radius will be half the distance from this point to the furthest point of the circle, namely

$$\tfrac{1}{2} \max_h \left| \frac{z_{m-1}(\lambda)}{y_{m-1}(\lambda)} - \frac{z_m(\lambda) + h z_{m-1}(\lambda)}{y_m(\lambda) + h y_{m-1}(\lambda)} \right|.$$

Simplifying and using (4.2.10) this may be written

$$\tfrac{1}{2} \max_h \left| c_{m-1}\, y_{m-1}(\lambda)\, \{y_m(\lambda) + h y_{m-1}(\lambda)\} \right|^{-1}.$$

The maximum is reached when $|y_m(\lambda) + h y_{m-1}(\lambda)|$ has a minimum, for real $h$, and straightforward calculations show that this occurs when $h = -\operatorname{Re}\{y_m(\lambda)\, \bar y_{m-1}(\lambda)\}/|y_{m-1}(\lambda)|^2$, the radius being then

$$\left| c_{m-1}\{y_m(\lambda)\, \bar y_{m-1}(\lambda) - \bar y_m(\lambda)\, y_{m-1}(\lambda)\} \right|^{-1}.$$

Using (4.2.5), this may be written

$$\left\{ 2 \operatorname{Im} \lambda \sum_{n=0}^{m-1} a_n\, |y_n(\lambda)|^2 \right\}^{-1}. \tag{5.4.6}$$

It is obvious from (5.4.6) that the radius of the circle $C(m, \lambda)$ tends to zero as $m \to \infty$ if and only if the series in (5.4.6) diverges as $m \to \infty$. Extending this result slightly, we prove:
Theorem 5.4.1. In order that as $m \to \infty$ the radius of the circle $C(m, \lambda)$, for fixed complex $\lambda$ with $\operatorname{Im} \lambda > 0$, should tend to a positive limit, it is necessary and sufficient that, for the same $\lambda$,

$$\sum_{n=0}^{\infty} a_n\, |y_n(\lambda)|^2 < \infty, \tag{5.4.7}$$

$$\sum_{n=0}^{\infty} a_n\, |z_n(\lambda)|^2 < \infty, \tag{5.4.8}$$

these series converging or diverging together.
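Assuming the radius formula reconstructed in (5.4.6), namely $r_m = \{2 \operatorname{Im}\lambda \sum_{n=0}^{m-1} a_n |y_n(\lambda)|^2\}^{-1}$, the contraction of the circles is easily observed for the Legendre coefficients at $\lambda = i$, where the series diverges rapidly (the limit-point case); a sketch:

```python
lam = 1j
y_curr, y_prev = 1.0, 0.0      # y_0 (with c_{-1} = 1) and y_{-1}
c_prev = 1.0
S = 0.0
radii = []
for n in range(60):
    a_n, c_n = 2*n + 1, n + 1
    S += a_n * abs(y_curr)**2
    radii.append(1.0 / (2.0 * lam.imag * S))   # radius of C(n+1, lam)
    y_curr, y_prev, c_prev = (a_n*lam*y_curr - c_prev*y_prev) / c_n, y_curr, c_n
print(radii[0], radii[-1])      # 0.5, then a microscopic value
```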
We only need to prove that the conditions (5.4.7-8) go together. In fact, for $n \ge 2$, $-z_n(\lambda)/y_n(\lambda)$ lies inside or on the circle $C(2, \lambda)$, which is a finite circle lying strictly inside $C(1, \lambda)$, and so in the lower half-plane and away from the origin. Thus the ratio $|z_n(\lambda)/y_n(\lambda)|$ is bounded from above and also from zero, which proves the result. Some further information is to be had from an alternative interpretation of the nesting circles $C(m, \lambda)$. Writing $f$ for a typical point inside or on this circle, where

$$f = -(z_m + h z_{m-1})/(y_m + h y_{m-1}),$$

and solving for $h$, we get

$$h(f y_{m-1} + z_{m-1}) + (f y_m + z_m) = 0.$$

Multiplying by $\overline{f y_{m-1} + z_{m-1}}$ and taking imaginary parts, we have, since $\operatorname{Im} h \ge 0$,

$$\operatorname{Im}\{(f y_m + z_m)\, \overline{(f y_{m-1} + z_{m-1})}\} \le 0. \tag{5.4.9}$$

Writing

$$w_n = f y_n + z_n, \tag{5.4.10}$$

we may write (5.4.9) as

$$\operatorname{Im}\{w_m\, \bar w_{m-1}\} \le 0. \tag{5.4.11}$$

We now form an identity similar to (4.2.5). As a linear combination of $y_n$, $z_n$, the expression $w_n$ will be a solution of the recurrence relation; since $y_{-1} = 0$, $y_0 = 1/c_{-1}$, $z_{-1} = 1$, $z_0 = 0$, we have the initial values

$$w_{-1} = 1, \qquad w_0 = f/c_{-1}.$$

From the recurrence relations we have, multiplying by $\bar w_n$ and subtracting,

$$c_n(w_{n+1}\bar w_n - \bar w_{n+1} w_n) - c_{n-1}(w_n \bar w_{n-1} - \bar w_n w_{n-1}) = (\lambda - \bar\lambda)\, a_n\, |w_n|^2, \tag{5.4.12}$$

and hence, summing for $n = 0, \dots, m-1$,

$$c_{m-1}(w_m \bar w_{m-1} - \bar w_m w_{m-1}) = (f - \bar f) + (\lambda - \bar\lambda) \sum_{n=0}^{m-1} a_n\, |w_n|^2. \tag{5.4.13}$$
We can now formulate the alternative characterization of points $f$ inside or on the circle $C(m, \lambda)$. Since for such $f$ (5.4.11) is to hold, the right of (5.4.13) has zero or negative imaginary part. Since $\operatorname{Im} \lambda > 0$, we have that for $f$ inside or on $C(m, \lambda)$ there holds the bound

$$\sum_{n=0}^{m-1} a_n\, |f y_n + z_n|^2 \le -\operatorname{Im} f / \operatorname{Im} \lambda. \tag{5.4.14}$$

Whether the circles contract to a point or not as $m \to \infty$, we can in any case say that there is at least one $f$ which lies inside all of them, so that for this $f$ (5.4.14) holds for all $m$, and so, in particular,

$$\sum_{n=0}^{\infty} a_n\, |f y_n(\lambda) + z_n(\lambda)|^2 \le -\operatorname{Im} f / \operatorname{Im} \lambda. \tag{5.4.15}$$

Hence:
Theorem 5.4.2. If $\lambda$ is not real, the recurrence relation

$$c_n w_{n+1} = (a_n \lambda + b_n)\, w_n - c_{n-1} w_{n-1}$$

has at least one nontrivial solution of summable square, in the sense that

$$\sum_{n=0}^{\infty} a_n\, |w_n|^2 < \infty. \tag{5.4.16}$$

The above proof was for the event that $\operatorname{Im} \lambda > 0$. The result for the lower half-plane follows on taking complex conjugates.
5.5. Limiting Spectral Functions

Let us suppose that for some complex $\lambda$, $\operatorname{Im} \lambda > 0$, the limit-circle case holds; we show presently that whether the limit-circle or limit-point case holds is independent of the choice of complex $\lambda$, but do not assume this at the moment. The circles $C(m, \lambda)$ will then contract towards a limit-circle $C(\infty, \lambda)$, and the characteristic functions $f_{m,h}(\lambda)$ of (4.5.4-5), as $m \to \infty$ and the real $h$ varies in any manner, will also approach $C(\infty, \lambda)$. We now conduct the same limiting process in terms of the spectral functions $\tau_{m,h_m}(\lambda)$, allowing $h_m$, real, to vary with $m$. From any particular sequence of spectral functions of this kind, we may pick out a convergent subsequence, its limit $\tau(\lambda)$ being, as explained in Section 5.2, a spectral function for the full recurrence relation. By (4.5.5), the characteristic
functions $f_{m,h_m}(\lambda)$ will also tend to a limit. Since $f_{m,h_m}(\lambda)$ lies on the circle $C(m, \lambda)$, the limit of $f_{m,h_m}(\lambda)$, taken over the convergent subsequence, will tend to a limit on the limit-circle $C(\infty, \lambda)$. Thus with each spectral function $\tau(\lambda)$, the limit of finite-dimensional spectral functions $\tau_{m,h_m}(\lambda)$, we may associate a point on the limit-circle. Conversely, given a point on the limit-circle, we may represent it as the limit of a sequence of points on the approximating circles $C(m, \lambda)$, and so as a limit of points $f_{m,h_m}(\lambda)$. Picking out a convergent subsequence, if necessary, from the corresponding finite-dimensional spectral functions $\tau_{m,h_m}(\lambda)$, we obtain a limiting spectral function $\tau(\lambda)$, associated with the point on the limit-circle. If we denote by $f$ a typical point on $C(\infty, \lambda)$, the relation between it and the associated spectral function $\tau(\lambda)$ is given by

$$f = \int_{-\infty}^{\infty} (\lambda - \mu)^{-1}\, d\tau(\mu), \tag{5.5.1}$$

obtained by proceeding to the limit in (4.5.5). The argument given has not, of course, established a (1, 1) relationship between points of the limit-circle and the associated spectral functions. For the present limit-circle case we define a class of limiting spectral functions, namely, spectral functions according to Section 5.2 which are representable as the limit of a convergent sequence of finite-dimensional spectral functions $\tau_{m,h_m}(\lambda)$, where $m \to \infty$ through a subsequence if necessary. We know that limiting spectral functions correspond, via (5.5.1), to points of the limit-circle. It is easily seen that there will be additional spectral functions, corresponding to points inside the limit-circle. Let $\tau^{(1)}(\lambda)$, $\tau^{(2)}(\lambda)$ be two limiting spectral functions, corresponding to distinct points $f_\infty^{(1)}$, $f_\infty^{(2)}$ on the limit-circle. Then $\tau(\lambda) = \tfrac{1}{2}\tau^{(1)}(\lambda) + \tfrac{1}{2}\tau^{(2)}(\lambda)$ will also be a spectral function in the sense of Section 5.2, the spectral functions forming a convex set. It will correspond via (5.5.1) to the point $\tfrac{1}{2}f_\infty^{(1)} + \tfrac{1}{2}f_\infty^{(2)}$, that is to say, to the midpoint of the chord joining $f_\infty^{(1)}$, $f_\infty^{(2)}$, and so to a point inside the circle. Since this $\tau(\lambda)$ corresponds to a point inside the limit-circle, it cannot be a limiting spectral function, and so the latter form a proper subclass of the class of spectral functions. Obviously, to each point inside the limit-circle corresponds at least one spectral function.
5.6. Solutions of Summable Square

We shall now justify an earlier statement that the discrimination between limit-circle and limit-point cases is independent of the complex $\lambda$ involved. By Theorem 5.4.1 it will be enough to consider whether
solutions of the recurrence relation are of summable square in the sense (5.4.7-8). We prove that whether the latter is so or not is independent of $\lambda$.

Theorem 5.6.1. If (5.4.7-8) hold for any one $\lambda$, real or complex, then they hold for all $\lambda$.

The proof proceeds by applying the method of variation of parameters to the recurrence relation. Supposing that (5.4.7-8) hold for some $\lambda$, we define, for any other value $\mu$,

$$p_n(\mu) = c_{n-1}\{y_n(\mu)\, y_{n-1}(\lambda) - y_n(\lambda)\, y_{n-1}(\mu)\}, \tag{5.6.1}$$

$$q_n(\mu) = c_{n-1}\{y_n(\mu)\, z_{n-1}(\lambda) - z_n(\lambda)\, y_{n-1}(\mu)\}. \tag{5.6.2}$$

By (4.2.2), one has

$$p_{n+1}(\mu) - p_n(\mu) = a_n(\mu - \lambda)\, y_n(\mu)\, y_n(\lambda), \tag{5.6.3}$$

and, by (4.2.11),

$$q_{n+1}(\mu) - q_n(\mu) = a_n(\mu - \lambda)\, y_n(\mu)\, z_n(\lambda). \tag{5.6.4}$$

We may eliminate $y_n(\mu)$ from the right of (5.6.3-4) by means of (5.6.1-2). Since, by (4.2.10),

$$c_{n-1}\{y_n(\lambda)\, z_{n-1}(\lambda) - z_n(\lambda)\, y_{n-1}(\lambda)\} = 1, \tag{5.6.5}$$

we have

$$y_n(\mu) = q_n(\mu)\, y_n(\lambda) - p_n(\mu)\, z_n(\lambda). \tag{5.6.6}$$

Substituting for $y_n(\mu)$ on the right of (5.6.3-4), the result may be written

$$p_{n+1}(\mu) - p_n(\mu) = (\mu - \lambda)\{\alpha_{n11}\, p_n(\mu) + \alpha_{n12}\, q_n(\mu)\}, \tag{5.6.7}$$

$$q_{n+1}(\mu) - q_n(\mu) = (\mu - \lambda)\{\alpha_{n21}\, p_n(\mu) + \alpha_{n22}\, q_n(\mu)\}, \tag{5.6.8}$$

where

$$\alpha_{n11} = -a_n\, y_n(\lambda)\, z_n(\lambda), \qquad \alpha_{n12} = a_n\{y_n(\lambda)\}^2,$$
$$\alpha_{n21} = -a_n\{z_n(\lambda)\}^2, \qquad \alpha_{n22} = a_n\, y_n(\lambda)\, z_n(\lambda). \tag{5.6.9}$$

The key fact is now that

$$\sum_{n=0}^{\infty} |\alpha_{nrs}| < \infty, \qquad r, s = 1, 2. \tag{5.6.10}$$

In the case of $\alpha_{n12}$, $\alpha_{n21}$ this follows from the assumed (5.4.7-8), while for $\alpha_{n11}$, $\alpha_{n22}$ it follows by Cauchy's inequality. It follows from a general
stability theorem that the solutions of (5.6.7-8) are uniformly bounded as $n \to \infty$ (see Appendix IV). From (5.6.6) we deduce that for some constant $c$, possibly dependent on $\mu$ but not on $n$,

$$|y_n(\mu)| \le c\, \{|y_n(\lambda)| + |z_n(\lambda)|\}.$$

The desired conclusion that

$$\sum_{n=0}^{\infty} a_n\, |y_n(\mu)|^2 < \infty$$

then follows from (5.4.7-8). The proof in the case of $z_n(\mu)$ is exactly the same. Summing up, the recurrence relation either has two independent solutions of summable square, in the sense (5.4.16), for all complex $\lambda$, and in fact all $\lambda$, when the limit-circle case holds for all complex $\lambda$, or else for all complex $\lambda$ there is just one such solution, apart from constant multiples, and for real $\lambda$ at most one such solution, when the limit-point case holds for all complex $\lambda$.
5.7. Eigenvalues in the Limit-Circle Case

We show here that eigenvalues can be defined in the limit-circle case, by means of a boundary condition. This must, of course, have a limiting form, and in preference to such a form as (5.1.4), which may or may not have sense, we choose the form (5.1.6). Equivalently, starting with one eigenvalue, we may define the others on the basis of the orthogonality of the eigenfunctions. Assuming the limit-circle case to hold, we take it that all solutions are of summable square, and uniformly so in any finite $\lambda$-region. This conclusion also follows from (5.6.6-10), by minor refinements in the argument. We therefore take it that

$$\sum_{n=0}^{\infty} a_n\, |y_n(\lambda)|^2 \le c(\lambda), \tag{5.7.1}$$

where $c(\lambda)$ is some function which is bounded for bounded $\lambda$. Adopting some fixed real $\lambda'$ as an eigenvalue, we define the eigenvalues as the zeros of

$$(\lambda - \lambda') \sum_{n=0}^{\infty} a_n\, y_n(\lambda)\, y_n(\lambda'). \tag{5.7.2}$$
By (5.7.1), the function (5.7.2) is an entire function of $\lambda$. It does not vanish identically, since its derivative is not zero when $\lambda = \lambda'$. Hence its zeros will have no finite limit. Moreover, these zeros will all be real, being the limits as $m \to \infty$ of the zeros of

$$(\lambda - \lambda') \sum_{n=0}^{m-1} a_n\, y_n(\lambda)\, y_n(\lambda'); \tag{5.7.3}$$

the zeros of the latter are real, by Theorem 4.3.1, and the zeros of (5.7.2) are the limits of zeros of (5.7.3), by Rouché's theorem. Having defined the eigenvalues $\lambda_r$, say, by (5.7.2), or else, what is the same thing, as the roots of

$$\sum_{n=0}^{\infty} a_n\, y_n(\lambda)\, y_n(\lambda') = 0, \tag{5.7.4}$$

together with $\lambda'$ itself, we may define as a spectral function a step function whose jumps are at the $\lambda_r$ and are of amount

$$\rho_r = \left\{ \sum_{n=0}^{\infty} a_n\, \{y_n(\lambda_r)\}^2 \right\}^{-1}. \tag{5.7.5}$$

These are positive in view of (5.7.1). We then have the machinery of Sections 4.4-5 with $m = \infty$. The verification of the two types of orthogonality may proceed by making $m \to \infty$ in the finite-dimensional problem. We set up a sequence of spectral functions $\tau_{m,h_m}(\lambda)$, $m = 1, 2, \dots$, where $h_m$ is chosen so that $\lambda'$ is in the spectrum. Denote the eigenvalues of the corresponding problem by $\lambda_{r,m}$, numbered so that $\lambda_{r,m} \to \lambda_r$ as $m \to \infty$, where $\lambda_r$ is a typical root of (5.7.4). By (5.7.1), (4.4.2), and (4.2.3), no two of the $\lambda_{r,m}$ can tend to one and the same limit. Hence the points of discontinuity of $\tau_{m,h_m}(\lambda)$ tend to those of the spectral function defined by (5.7.4-5), and it is easily seen that the amounts of its jumps do likewise. Thus the spectral function so defined is what was in Section 5.5 termed a limiting spectral function, and the orthogonality (5.2.4) is in force. The orthogonality of the eigenfunctions, namely that

$$\sum_{n=0}^{\infty} a_n\, y_n(\lambda_r)\, y_n(\lambda_s) = 0, \qquad r \ne s,$$

may also be proved by a limiting argument.
5.8. Limit-Circle, Limit-Point Tests I n seeking tests to discriminate between the limit-point and limitcircle cases, we may devise criteria to be applied to the coefficients in the recurrence relation (5.1.1), or again to be applied to the spectral function or moments. It is the former case which we consider here. We rewrite (5.1.1) in the form
where b,’ is defined in terms of the coefficients in (5.1.1) by
In this form the recurrence relation assumes more the form of a second-order difference equation, with a natural parallel with the differential equation
$$[c(x)\,u']' = [a(x)\lambda + b(x)]\,u, \qquad ({}' = d/dx). \tag{5.8.3}$$
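The passage from (5.1.1) to this divided-difference form can be checked numerically. The sketch below is illustrative only: the coefficients are arbitrary positive values, and it assumes b_n' = b_n − c_n − c_{n−1}, consistent with the Legendre example later in this section.

```python
import numpy as np

rng = np.random.default_rng(0)
m, lam = 8, 0.7
a = rng.uniform(0.5, 2.0, m)           # a_n > 0
b = rng.uniform(-1.0, 1.0, m)
c = rng.uniform(0.5, 2.0, m + 1)       # c_{-1}, c_0, ..., c_{m-1}

# b_n' = b_n - c_n - c_{n-1}
bp = b - c[1:] - c[:-1]

# generate a solution of (5.1.1): c_n u_{n+1} = (a_n lam + b_n) u_n - c_{n-1} u_{n-1}
u = [0.0, 1.0]                         # u_{-1}, u_0
for n in range(m - 1):
    u.append(((a[n] * lam + b[n]) * u[n + 1] - c[n] * u[n]) / c[n + 1])
u = np.array(u)                        # u_{-1}, ..., u_{m-1}

# check the second-order difference form:
# c_n (u_{n+1} - u_n) - c_{n-1}(u_n - u_{n-1}) = (a_n lam + b_n') u_n
for n in range(m - 2):
    lhs = c[n + 1] * (u[n + 2] - u[n + 1]) - c[n] * (u[n + 1] - u[n])
    rhs = (a[n] * lam + bp[n]) * u[n + 1]
    assert abs(lhs - rhs) < 1e-10
```

The two forms agree identically in n, since the divided differences merely regroup the three-term recurrence.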
The criteria we seek for (5.8.1) may be expected to have parallels in the case of (5.8.3). It will be convenient to write (5.8.1) in matrix form. Defining
$$v_n = c_n(u_{n+1} - u_n), \tag{5.8.4}$$
we may write (5.8.1) as
$$v_n - v_{n-1} = (a_n\lambda + b_n')\,u_n,$$
and (5.8.1), (5.8.4) assume the matrix form
$$\begin{pmatrix} u_{n+1} \\ v_n \end{pmatrix} = \begin{pmatrix} 1 + (a_n\lambda + b_n')/c_n & c_n^{-1} \\ a_n\lambda + b_n' & 1 \end{pmatrix}\begin{pmatrix} u_n \\ v_{n-1} \end{pmatrix}. \tag{5.8.5}$$
From this we get simple sufficiency criteria for the two cases.

Theorem 5.8.1. Let
$$\sum_0^\infty a_n < \infty, \qquad \sum_0^\infty c_n^{-1} < \infty, \qquad \sum_0^\infty |b_n'| < \infty, \qquad \sum_0^\infty |b_n'|\,c_n^{-1} < \infty. \tag{5.8.6}$$
Then the limit-circle case holds.
For the limit-circle case it is sufficient to prove that for some λ, say λ = 0, we have
$$\sum_0^\infty a_n\,|u_n|^2 < \infty, \tag{5.8.7}$$
for all solutions of (5.8.5). In view of the first of (5.8.6), it is sufficient to prove that the sequence u_n is bounded. This, in turn, is assured if the matrix on the right of (5.8.5), less the unit matrix, that is to say,
$$\begin{pmatrix} (a_n\lambda + b_n')/c_n & c_n^{-1} \\ a_n\lambda + b_n' & 0 \end{pmatrix}, \tag{5.8.8}$$
has entries which form an absolutely convergent series (see Appendix IV). Taking λ = 0, we see from (5.8.6) that this is indeed the case. The fourth of (5.8.6) is, of course, implied by the second and third requirements in (5.8.6). For a criterion in the opposite sense we give:
Theorem 5.8.2. Let
$$\sum_0^\infty a_n = \infty, \tag{5.8.9}$$
and for some real λ let
$$a_n\lambda + b_n' \geq 0, \qquad n = 0, 1, \ldots. \tag{5.8.10}$$
Then the limit-point case holds.

For the limit-point case we have to show that for some λ and for some solution (5.8.7) is false. In view of (5.8.9) it will be sufficient to show that for at least one λ and at least one solution, the sequence u_n is positive and increasing. If λ satisfies (5.8.10), then it follows from (5.8.5) that if u_{n-1}, v_{n-1} are positive, then so are u_n, v_n, with, moreover, u_n > u_{n-1}. If, therefore, we start (5.8.5) with positive u_0, v_{-1}, then u_n, v_n will be positive for all n, with u_n > u_{n-1}. This completes the proof.

The condition (5.8.10) is equivalent to demanding that the sequence b_n'/a_n should be bounded from below. For example, for the Legendre polynomials one has
$$(n+1)\,P_{n+1}(\lambda) = (2n+1)\,\lambda\,P_n(\lambda) - n\,P_{n-1}(\lambda),$$
so that in this case, a_n = 2n + 1, c_n = n + 1, b_n = 0, b_n' = -(2n + 1). Here plainly the limit-point case holds. This is also evident from the fact that P_n(1) = 1.
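Both tests are easy to illustrate numerically. The following sketch uses arbitrarily chosen summable coefficients for the limit-circle side (they are assumptions for the demonstration, not data from the text), and the Legendre recurrence for the limit-point side:

```python
import numpy as np

# Limit-circle side: a_n, 1/c_n, |b_n'| all summable, as in (5.8.6);
# iterating (5.8.5) at lam = 0 then leaves the solution bounded.
N = 2000
n = np.arange(N)
c = (n + 1) ** 2.0
bp = 1.0 / (n + 1) ** 2                 # b_n'
u, v, sup = 1.0, 1.0, 0.0               # u_0, v_{-1}
for k in range(N):
    v = v + bp[k] * u                   # v_k
    u = u + v / c[k]                    # u_{k+1}
    sup = max(sup, abs(u), abs(v))
assert sup < 10.0                       # bounded, so (5.8.7) holds with a_n summable

# Limit-point side: Legendre, (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1};
# P_n(1) = 1 for all n, so sum a_n P_n(1)^2 = sum (2n+1) diverges.
P = [1.0, 1.0]                          # P_0(1), P_1(1)
for j in range(1, 20):
    P.append(((2 * j + 1) * 1.0 * P[j] - j * P[j - 1]) / (j + 1))
assert all(abs(p - 1.0) < 1e-12 for p in P)
```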
5.9. Moment Problem
If we start from the recurrence relation (5.1.1) and initial conditions (5.1.3), a natural problem is the determination of all spectral functions, that is to say, nondecreasing right-continuous functions τ(λ) satisfying the orthogonality (5.2.4) and the boundedness (5.2.3). We know from the method of limiting transition from the finite-dimensional case that there exists at least one spectral function, and an infinity of essentially distinct spectral functions in the limit-circle case. This information may be transferred to the moment problem equivalent to (5.2.4). To set up this moment problem, given (5.1.1-3), we form finite-dimensional spectral functions τ_{m,h}(λ) as in Chapter 4, and define
$$\mu_j = \int_{-\infty}^{\infty} \lambda^j\, d\tau_{m,h}(\lambda), \qquad j \leq 2m - 2, \tag{5.9.1}$$
the right-hand side being independent of h and m if j ≤ 2m − 2. It is not indeed necessary to calculate the spectral function τ_{m,h}(λ) in order to find the moments μ_j. For a simpler method we calculate the characteristic function f_{m,h}(λ) according to (4.5.4), for example, f_{m,0}(λ) = −z_m(λ)/y_m(λ). By (4.5.5), such a function admits the asymptotic expansion, for large values of λ,
$$f_{m,h}(\lambda) \sim -\sum_{n=0}^{\infty} \mu_n\,\lambda^{-n-1}. \tag{5.9.2}$$
The μ_n may be obtained by purely rational processes by forming this expansion. As we have found the μ_j from the recurrence relation, the moment problem, equivalent to the definition of a spectral function in Section 5.2, calls for a nondecreasing right-continuous function τ(λ) such that
$$\int_{-\infty}^{\infty} \lambda^j\, d\tau(\lambda) = \mu_j, \qquad j = 0, 1, 2, \ldots. \tag{5.9.3}$$
In the form that the μ_j are given, without knowledge of any recurrence relation, this constitutes the Hamburger moment problem. We shall not give here a full discussion of this problem, but will show that the solution is in part given by the theory of the recurrence relation. In the first place, it is a necessary condition for the problem to be soluble at all that
$$\sum_{p=0}^{m-1}\sum_{q=0}^{m-1} \mu_{p+q}\,\alpha_p\,\alpha_q \geq 0 \tag{5.9.4}$$
for any set of α_0, ..., α_{m-1} and any m; we may confine attention to real α_p. This requirement may be expressed by saying that the sequence {μ_n} is positive, or positive semidefinite. To see that the condition is necessary, we note that if (5.9.3) is true, then the left of (5.9.4) may be written
$$\int_{-\infty}^{\infty} \Big(\sum_{p=0}^{m-1} \alpha_p\,\lambda^p\Big)^2\, d\tau(\lambda), \tag{5.9.5}$$
which is, of course, nonnegative, since τ(λ) is nondecreasing. If, moreover, the α_p are not all zero, and τ(λ) is not merely a step function with a finite number of jumps, the integral (5.9.5) must be positive, so that equality in (5.9.4) cannot hold.

Supposing that the moments μ_j satisfy (5.9.4), with strict inequality if the α_p are not all zero, we can construct a recurrence relation (5.1.1-3) such that the μ_j are also given by (5.9.1). Once this recurrence relation has been found, any spectral function in the sense of Section 5.2 will be a solution of the moment problem. If the recurrence formula, when found, belongs to the limit-circle class, there will be an infinity of solutions of the moment problem, and the moment problem is commonly said to be indeterminate. If the recurrence formula belongs to the limit-point type, we obtain only one spectral function and so only one solution of the moment problem; to show that the moment problem has no essentially distinct solutions, and so is determinate, requires an additional argument.

The construction of the recurrence formula, given the moments, follows the lines of Section 4.6. We may first form the polynomials y_n(λ) by what is essentially the process of orthogonalization; the latter term cannot strictly be used at this stage, since we do not know whether any τ(λ) or τ_{m,h}(λ) exist. Nevertheless, the equations (4.6.4) for the coefficients (4.6.1) in y_n(λ) involve only the μ_j, and the calculations can be carried through. Likewise, in (4.6.9-10) the integrals can be interpreted in the sense that the integral of any power is to be replaced by the known moment (5.9.1). By these means we deduce that the y_n(λ), found by a process of mock orthogonalization, satisfy a recurrence relation. Again, there will be an alternative approach in which we start with (5.9.2), looking for a rational function f_{m,h}(λ) admitting the stated asymptotic form for large λ. We refer to treatments of the moment problem for the details.
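The mock orthogonalization can be sketched in a few lines: the moment sequence alone defines an inner product ⟨λ^i, λ^j⟩ = μ_{i+j}, and Gram-Schmidt then runs with no reference to any measure. In this illustrative sketch the moments happen to come from an arbitrarily chosen 5-point measure, so the forms up to degree 3 are positive definite:

```python
import numpy as np

t = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])     # points of increase (arbitrary)
w = np.array([0.2, 0.3, 0.4, 0.3, 0.2])       # jumps (arbitrary)
mu = np.array([np.sum(w * t ** j) for j in range(10)])

# the Hankel form (mu_{p+q}) is positive definite here, in line with (5.9.4)
H = np.array([[mu[p + q] for q in range(4)] for p in range(4)])
assert np.linalg.eigvalsh(H).min() > 0

def inner(p, q):
    # <lam^i, lam^j> = mu_{i+j}; only the moments are used
    return sum(p[i] * q[j] * mu[i + j]
               for i in range(len(p)) for j in range(len(q)))

ys = []                                        # "mock"-orthonormalized polynomials
for deg in range(4):
    p = np.zeros(deg + 1)
    p[deg] = 1.0                               # start from lam^deg
    for y in ys:
        p[:len(y)] -= inner(p, y) * y          # subtract projections
    p /= np.sqrt(inner(p, p))
    ys.append(p)

Gram = np.array([[inner(p, q) for q in ys] for p in ys])
assert np.allclose(Gram, np.eye(4), atol=1e-8)
```

The resulting polynomials are orthonormal under the moment functional, and their three-term recurrence coefficients could now be read off in the manner of Section 4.6.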
5.10. The Dual Expansion Theorem

Inverting the expansion theorem of Section 5.3, we obtain an expansion which proceeds formally as follows. For an arbitrary function v(λ), -∞ < λ < ∞, we form its Fourier coefficients with respect to the y_p(λ) by
$$v_p = \int_{-\infty}^{\infty} v(\lambda)\,y_p(\lambda)\,d\tau(\lambda), \qquad p = 0, 1, \ldots, \tag{5.10.1}$$
where the polynomial y_p(λ) is derived from a recurrence formula as in (5.2.1-2), and τ(λ) is a spectral function in the sense of Section 5.2. The expansion theorem will then state that, in some sense,
$$v(\lambda) = \sum_{p=0}^{\infty} v_p\,y_p(\lambda), \tag{5.10.2}$$
together with the Parseval equality,
$$\int_{-\infty}^{\infty} |v(\lambda)|^2\, d\tau(\lambda) = \sum_{p=0}^{\infty} |v_p|^2. \tag{5.10.3}$$
This result has the important form of the expansion of an arbitrary function in a series of orthogonal polynomials. A number of qualifications are needed to make the result valid. The arbitrary function v(λ) must belong to L²_τ, being measurable with respect to dτ(λ) and such that
$$\int_{-\infty}^{\infty} |v(\lambda)|^2\, d\tau(\lambda) < \infty. \tag{5.10.4}$$
In the second place, the expansion (5.10.2) will have no claim to validity in an interval where τ(λ) is constant, since by (5.10.1) the Fourier coefficients take no account of the values of v(λ) in such an interval. Third, the Parseval equality (5.10.3) turns out to be true only if τ(λ) is a limiting spectral function, as defined in Section 5.5. This latter point is one of substance only in the limit-circle case. As mentioned, (5.10.2) is an expansion of v(λ) in terms of polynomials; correspondingly, (5.10.3) is equivalent to the statement that v(λ) can be approximated in mean square arbitrarily closely by polynomials, in the sense that
$$\int_{-\infty}^{\infty} \Big|\,v(\lambda) - \sum_{p=0}^{m-1} v_p\,y_p(\lambda)\Big|^2\, d\tau(\lambda) \to 0 \tag{5.10.5}$$
as m → ∞. We are thus led to the question of whether such approximation is possible for an arbitrary function of L²_τ, in other words, of whether the polynomials are dense in this space. Known results on this topic deal with cases in which τ(λ) is constant outside a finite interval; by what is essentially the Weierstrass theorem on the approximation to continuous functions by polynomials in the uniform sense over a finite interval, we can deduce that (5.10.3), (5.10.5) are valid when dτ(λ) is positive only in a finite interval, as, for example, in the case of the Legendre polynomials. Conversely, if the limit-circle case holds and there exist spectral functions for which (5.10.3), (5.10.5) are untrue, then deductions can be made about the nonconstancy of τ(λ) outside a finite interval. For a definite result we prove the following:
Theorem 5.10.1. Let τ(λ) be a limiting spectral function of the recurrence relation (5.1.1-3). Then (5.10.3), (5.10.5) hold if v(λ) has the form 1/(λ − μ), for any complex μ.

The significance of this result is that an arbitrary v(λ) ∈ L²_τ can be approximated in mean square by functions of the form 1/(λ − μ). For the proof we turn once more to the method of limiting transition from the finite-dimensional case. For any m, h, where m is a positive integer and h is real, we can form a polynomial ψ_{m,h}(λ) which agrees with 1/(λ − μ) at the zeros of y_m(λ) + h y_{m-1}(λ), that is to say, at the points of increase of τ_{m,h}(λ). Such a polynomial is
$$\psi_{m,h}(\lambda) = (\lambda - \mu)^{-1}\Big\{1 - \frac{y_m(\lambda) + h\,y_{m-1}(\lambda)}{y_m(\mu) + h\,y_{m-1}(\mu)}\Big\}. \tag{5.10.6}$$
This is certainly a polynomial; the denominator y_m(μ) + h y_{m-1}(μ) is not zero if h is real and μ is complex, by Theorem 4.3.1. Also clearly ψ_{m,h}(λ) = 1/(λ − μ) if y_m(λ) + h y_{m-1}(λ) = 0. Hence
$$\int_{-\infty}^{\infty} \big|\,(\lambda - \mu)^{-1} - \psi_{m,h}(\lambda)\,\big|^2\, d\tau_{m,h}(\lambda) = 0. \tag{5.10.7}$$
Having shown that the polynomials ψ_{m,h}(λ) approximate to (λ − μ)^{-1} exactly with respect to the measure dτ_{m,h}(λ), we show that as m → ∞ they approximate to (λ − μ)^{-1} asymptotically with respect to τ(λ), a limiting spectral function. We suppose that τ_{m,h}(λ) → τ(λ) as m → ∞, possibly with m running through a subsequence and with h varying with m. We wish to prove that
$$\int_{-\infty}^{\infty} \big|\,(\lambda - \mu)^{-1} - \psi_{m,h}(\lambda)\,\big|^2\, d\tau(\lambda) \to 0 \tag{5.10.8}$$
as m → ∞.
In view of (5.10.7) this is equivalent to
$$\int_{-\infty}^{\infty} \big|\,(\lambda - \mu)^{-1} - \psi_{m,h}(\lambda)\,\big|^2\, \{d\tau(\lambda) - d\tau_{m,h}(\lambda)\} \to 0. \tag{5.10.9}$$
Expanding the squared term, we see that it will be sufficient to prove that
$$\int_{-\infty}^{\infty} |\lambda - \mu|^{-2}\, \{d\tau(\lambda) - d\tau_{m,h}(\lambda)\} \to 0, \tag{5.10.10}$$
$$\int_{-\infty}^{\infty} (\lambda - \bar\mu)^{-1}\,\psi_{m,h}(\lambda)\, \{d\tau(\lambda) - d\tau_{m,h}(\lambda)\} \to 0, \tag{5.10.11}$$
$$\int_{-\infty}^{\infty} |\psi_{m,h}(\lambda)|^2\, \{d\tau(\lambda) - d\tau_{m,h}(\lambda)\} \to 0. \tag{5.10.12}$$
Of these, (5.10.10) follows from the fact that τ_{m,h}(λ) → τ(λ), according to the Helly-Bray theorems, modified for an infinite interval. The result (5.10.12) is immediate; the left-hand side vanishes identically, since ψ_{m,h}(λ) is a polynomial of degree m − 1, and the mechanical quadrature (Theorem 5.2.2) is available. It remains to prove (5.10.11). We write the integral in the form
$$\psi_{m,h}(\bar\mu)\int_{-\infty}^{\infty} (\lambda - \bar\mu)^{-1}\, \{d\tau(\lambda) - d\tau_{m,h}(\lambda)\} + \int_{-\infty}^{\infty} (\lambda - \bar\mu)^{-1}\, \{\psi_{m,h}(\lambda) - \psi_{m,h}(\bar\mu)\}\, \{d\tau(\lambda) - d\tau_{m,h}(\lambda)\}. \tag{5.10.13}$$
The last integral vanishes by the mechanical quadrature, the integrand being a polynomial of degree m − 2. In the first of the two terms in (5.10.13), the integral tends to zero, once more by the Helly-Bray theorem, and to complete the proof it is only necessary to show that ψ_{m,h}(μ̄) is bounded. Putting λ = μ̄ in (5.10.6), the expression in the braces does not exceed 2 in absolute value, and hence
$$|\psi_{m,h}(\bar\mu)| \leq 2\,|\bar\mu - \mu|^{-1} = |\operatorname{Im}\mu|^{-1}.$$
Hence ψ_{m,h}(μ̄) is bounded, independently of m, h, as required. Having shown that any expression 1/(λ − μ) may be approximated arbitrarily closely by polynomials, in the sense of approximation in mean square with respect to τ(λ), the further step can be made of approximating to any function of L²_τ in mean square by a linear combination of expressions 1/(λ − μ_r), and so by polynomials. The approximation by means of functions 1/(λ − μ_r) is a comparatively simple matter
to arrange directly, since we are dealing now with functions which are small at infinity. We shall not give the details here, however, since they have nothing to do with boundary problems. We repeat the conclusion that spectral functions derived by a limiting process from finite-dimensional spectral functions τ_{m,h}(λ) have the distinguishing property that the polynomials are dense in the corresponding L² space.
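The proof of Theorem 5.10.1 can be traced numerically. The sketch below uses arbitrary recurrence coefficients, and takes the interpolating polynomial in the form ψ_{m,h}(λ) = (λ − μ)^{-1}{1 − Y(λ)/Y(μ)} with Y = y_m + h y_{m-1}, an assumption consistent with the properties used in the proof. It checks the exact agreement at the zeros of Y and the bound |ψ_{m,h}(μ̄)| ≤ 1/|Im μ|:

```python
import numpy as np

rng = np.random.default_rng(7)
m, h, mu = 6, 0.4, 1.0 + 0.5j
a = rng.uniform(0.5, 2.0, m)
b = rng.uniform(-1.0, 1.0, m)
c = rng.uniform(0.5, 2.0, m)

def Y(lam):
    # Y = y_m + h y_{m-1}, built from the recurrence with y_{-1} = 0, y_0 = 1
    ys = [0.0, 1.0]
    for n in range(m):
        prev = c[n - 1] * ys[-2] if n >= 1 else 0.0
        ys.append(((a[n] * lam + b[n]) * ys[-1] - prev) / c[n])
    return ys[-1] + h * ys[-2]

def psi(lam):
    # agrees with 1/(lam - mu) wherever Y vanishes
    return (1.0 - Y(lam) / Y(mu)) / (lam - mu)

# zeros of Y: eigenvalues of the Jacobi pencil with b_{m-1} -> b_{m-1} + h c_{m-1}
bmod = b.copy()
bmod[m - 1] += h * c[m - 1]
J = np.diag(bmod) - np.diag(c[:m - 1], 1) - np.diag(c[:m - 1], -1)
S = np.diag(a ** -0.5)
roots = np.linalg.eigvalsh(-S @ J @ S)      # all real, for real h

assert max(abs(Y(r)) for r in roots) < 1e-6
assert max(abs(psi(r) - 1 / (r - mu)) for r in roots) < 1e-5
assert abs(psi(np.conj(mu))) <= 1 / abs(mu.imag) + 1e-9
```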
CHAPTER 6
Matrix Methods for Polynomials
6.1. Orthogonal Polynomials as Jacobi Determinants

In this chapter we give some extensions to the topic of orthogonal polynomials, starting with an alternative and, in some ways, more powerful approach to the finite orthogonal polynomials of Chapter 4. The polynomial y_m(λ) of Section 4.1 had as its zeros the eigenvalues of the boundary problem
$$c_n y_{n+1} = (a_n\lambda + b_n)\,y_n - c_{n-1}y_{n-1}, \qquad n = 0, \ldots, m-1, \tag{6.1.1}$$
$$y_{-1} = y_m = 0. \tag{6.1.2}$$
If in (6.1.1) we suppress the terms involving y_{-1}, y_m, as being zero, we find that the result can be written as the matrix equation
$$\begin{pmatrix}
\lambda a_0 + b_0 & -c_0 & 0 & \cdots & 0 \\
-c_0 & \lambda a_1 + b_1 & -c_1 & \cdots & 0 \\
0 & -c_1 & \lambda a_2 + b_2 & \cdots & 0 \\
\vdots & & & \ddots & -c_{m-2} \\
0 & 0 & \cdots & -c_{m-2} & \lambda a_{m-1} + b_{m-1}
\end{pmatrix}
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{m-1} \end{pmatrix} = 0. \tag{6.1.3}$$
If this be abbreviated to
$$(\lambda A + B)\,\eta = 0, \tag{6.1.4}$$
where A, B are m-by-m matrices and η is a column matrix, we observe that A, B are symmetric matrices, A being diagonal and B of Jacobi form, with zeros everywhere except in the main diagonal and the diagonals immediately next to it. Since moreover A is positive-definite, the a_n being always assumed positive, we conclude at once that the eigenvalues of the problem (6.1.3-4) are real. Since these are the zeros of y_m(λ), as defined in Section 4.1, we have another proof of the reality of the zeros of our (orthogonal) polynomials, though not of their distinctness. It follows that y_m(λ) differs by at most a constant factor from det(λA + B), and this factor is easily found. It will be convenient to define the determinant
$$\Delta_{r,s}(\lambda) = \begin{vmatrix}
\lambda a_r + b_r & -c_r & \cdots & 0 \\
-c_r & \lambda a_{r+1} + b_{r+1} & \cdots & \vdots \\
\vdots & & \ddots & -c_{s-1} \\
0 & \cdots & -c_{s-1} & \lambda a_s + b_s
\end{vmatrix}, \tag{6.1.5}$$
this minor being formed with as leading diagonal some part of the leading diagonal of λA + B, so that in particular Δ_{0,m-1}(λ) = det(λA + B). Since
$$\Delta_{0,m-1}(\lambda) = \lambda^m \prod_0^{m-1} a_n + O(\lambda^{m-1}), \tag{6.1.6}$$
while from (4.1.1), (4.1.5),
$$y_m(\lambda) = \lambda^m \prod_0^{m-1} (a_n/c_n) + O(\lambda^{m-1}), \tag{6.1.7}$$
we have
$$y_m(\lambda) = \Delta_{0,m-1}(\lambda)\Big/\prod_0^{m-1} c_n. \tag{6.1.8}$$
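Both the reality statement and the determinant representation can be checked numerically. In this sketch the coefficients are arbitrary (positive a_n, c_n), and the pencil (λA + B)η = 0 is reduced to an ordinary symmetric eigenproblem:

```python
import numpy as np

rng = np.random.default_rng(1)
m, lam = 6, 1.3
a = rng.uniform(0.5, 2.0, m)           # a_n > 0
b = rng.uniform(-1.0, 1.0, m)
c = rng.uniform(0.5, 2.0, m)           # c_0, ..., c_{m-1}

def y_values(lam):
    # y_{-1} = 0, y_0 = 1, then c_n y_{n+1} = (a_n lam + b_n) y_n - c_{n-1} y_{n-1}
    ys = [0.0, 1.0]
    for n in range(m):
        prev = c[n - 1] * ys[-2] if n >= 1 else 0.0
        ys.append(((a[n] * lam + b[n]) * ys[-1] - prev) / c[n])
    return ys

A = np.diag(a)
B = np.diag(b) - np.diag(c[:m - 1], 1) - np.diag(c[:m - 1], -1)

# y_m(lam) equals det(lam A + B) divided by c_0 ... c_{m-1}
ym = y_values(lam)[-1]
assert abs(ym - np.linalg.det(lam * A + B) / np.prod(c)) < 1e-8

# (lam A + B) eta = 0 is equivalent to lam being an eigenvalue of the
# symmetric matrix -A^{-1/2} B A^{-1/2}, so the zeros of y_m are real
S = np.diag(a ** -0.5)
zeros = np.linalg.eigvalsh(-S @ B @ S)
assert max(abs(y_values(z)[-1]) for z in zeros) < 1e-8
```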
The second solution z_m(λ) of the recurrence relation, defined in (4.2.6-8), may also be expressed in these terms. The zeros of z_m(λ) are the eigenvalues of the problem
$$c_n y_{n+1} = (a_n\lambda + b_n)\,y_n - c_{n-1}y_{n-1}, \qquad n = 1, \ldots, m-1, \tag{6.1.9}$$
$$y_0 = y_m = 0, \tag{6.1.10}$$
and so are the zeros of Δ_{1,m-1}(λ). Comparison of the coefficients of λ^{m-1} in the two cases yields
$$z_m(\lambda) = \Delta_{1,m-1}(\lambda)\Big/\prod_0^{m-1} c_n. \tag{6.1.11}$$
In connection with oscillatory properties and the Green's function it is convenient to define a third solution of the recurrence relation, chosen to vanish at the upper boundary. We define w_n(λ) by
$$c_n w_{n+1}(\lambda) = (a_n\lambda + b_n)\,w_n(\lambda) - c_{n-1}w_{n-1}(\lambda), \tag{6.1.12}$$
and
$$w_m(\lambda) = 0, \qquad w_{m-1}(\lambda) = 1. \tag{6.1.13}$$
The zeros of w_n(λ) are then the eigenvalues of the problem with the boundary conditions w_n = w_m = 0, and hence by the same arguments
$$w_n(\lambda) = \Delta_{n+1,m-1}(\lambda)\Big/\prod_{j=n}^{m-2} c_j. \tag{6.1.14}$$
6.2. Expansion Theorems, Periodic Boundary Conditions

With the notation (6.1.3-4), there will be an eigenvector expansion associated with the nontrivial solutions of (6.1.4). This is none other than the eigenfunction expansion of Section 4.4. Subject to an additional argument to show that the eigenvalues are distinct, the orthogonality of the eigenvectors of λA + B coincides with (4.4.2). The case of the boundary condition y_m(λ) + h y_{m-1}(λ) = 0, in place of y_m(λ) = 0, needs only a slight modification, the replacement of b_{m-1} by b_{m-1} + h c_{m-1}.

Going beyond the problems discussed in Chapter 4, the method just discussed deals also with two-point boundary conditions. Supposing that we impose the conditions
$$y_{-1} = \alpha\,y_{m-1}, \qquad y_m = \beta\,y_0, \tag{6.2.1}$$
where α, β are real, and subject to
$$\alpha\,c_{-1} = \beta\,c_{m-1}, \tag{6.2.2}$$
we may eliminate y_{-1}, y_m from the first and last of the equations (6.1.1),
obtaining m homogeneous equations in y_0, ..., y_{m-1}. The eigenvalues of this problem are those of the λ-matrix
$$\begin{pmatrix}
\lambda a_0 + b_0 & -c_0 & 0 & \cdots & 0 & -\alpha c_{-1} \\
-c_0 & \lambda a_1 + b_1 & -c_1 & \cdots & 0 & 0 \\
0 & -c_1 & \lambda a_2 + b_2 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & \cdots & & & -c_{m-2} \\
-\beta c_{m-1} & 0 & \cdots & 0 & -c_{m-2} & \lambda a_{m-1} + b_{m-1}
\end{pmatrix} \tag{6.2.3}$$
which differs from λA + B in (6.1.3) only in the entries at the top right and lower left corners. Subject to (6.2.2), the matrix will still be symmetric for real λ, the coefficient of λ being positive-definite, so that the eigenvalues will again be all real, and an expansion in terms of eigenvectors with orthogonality relations will still be available. However, it can no longer be asserted that the eigenvalues are all distinct; eigenvectors corresponding to the same eigenvalue will have to be orthogonalized. The orthogonality will no longer lead to a dual orthogonality in terms of polynomials in so simple a form.

If c_{-1} = c_{m-1}, we may take α = β = 1, so that (6.2.1) becomes
$$y_{-1} = y_{m-1}, \qquad y_0 = y_m, \tag{6.2.4}$$
which may be viewed as periodic boundary conditions. This might be interpreted as the situation in which m particles are fixed to a circular weightless string, which is stretched round a smooth cylinder, the particles executing small transverse oscillations.
6.3. Another Method for Separation Theorems

The method concerned is alternative to that of Section 4.3, where we proved separation theorems for the zeros of consecutive polynomials y_m(λ), y_{m+1}(λ), and indeed for nonconsecutive polynomials of the same set. The present method is based essentially on a sign-definite property of the Green's function in the complex plane, including a similar property for the characteristic function (cf. Theorem 4.5.1). It is applicable to more general boundary conditions, and to more general problems altogether.

In the present polynomial case we rely on a well-known connection (see Appendix II) between the separation property for a pair of polynomials p(λ), q(λ) and the behavior of their ratio in the complex plane. Let p(λ), q(λ) have real coefficients, real and simple zeros, and no common zeros. If in addition they have the separation property, that between two zeros of one lies a zero of the other, then (see Appendix II) their ratio satisfies
$$\operatorname{Im}\{p(\lambda)/q(\lambda)\} \neq 0, \qquad \text{if } \operatorname{Im}\lambda \neq 0. \tag{6.3.1}$$
Conversely, if p(λ), q(λ) have real coefficients and no common factors and (6.3.1) holds, then their zeros are real and simple and have the separation property; the same statements will also apply to a pair of polynomials of the form ap(λ) + bq(λ), cp(λ) + dq(λ) for any real a, b, c, and d with ad − bc ≠ 0, if they apply to p(λ), q(λ).

The requirement (6.3.1) involves that the mapping λ → p(λ)/q(λ) maps the upper and lower half-planes into themselves, or else into each other, so that the mapping of the real axis must be monotonic, except for poles. This behavior provides another standard approach to separation theorems. If p(λ)/q(λ) is monotonic increasing or monotonic decreasing on the whole real axis, except for poles, then obviously the zeros separate the poles, and so the zeros of p(λ), q(λ) separate each other, if they are distinct. Using the former method we prove
Theorem 6.3.1. Let y_n(λ) be defined by (4.1.1), (4.1.5), subject to (4.1.2), and w_n(λ) by (6.1.12-13). Then for 0 ≤ n ≤ m − 1, between two zeros of y_n(λ)w_n(λ) lies a zero of y_m(λ); between two zeros of y_m(λ), which are not zeros of y_n(λ)w_n(λ), there lies a zero of y_n(λ)w_n(λ).

Since the matrices A, B appearing in (6.1.3-4) are Hermitean, in fact real and symmetric, we have, if λ is complex, Im(λA + B) = A Im λ; since A > 0, by (4.1.2), it follows that Im(λA + B) > 0 if Im λ > 0, and hence (see the notes on Notation and Terminology)
$$\operatorname{Im}\{(\lambda A + B)^{-1}\} < 0, \qquad \text{for } \operatorname{Im}\lambda > 0. \tag{6.3.2}$$
This is, of course, a matrix inequality, but implies the same inequality in the numerical sense for its diagonal entries. If in the matrix in (6.1.3) we form the minor corresponding to the entry λa_n + b_n, the only non-zero entries are those above and to the left of λa_n + b_n, or below and to the right of it, the determinant breaking up into the product of two determinants formed by these two sets of entries. Hence the diagonal entries in (λA + B)^{-1} are, by Cramer's rule,
$$\Delta_{0,n-1}(\lambda)\,\Delta_{n+1,m-1}(\lambda)\big/\Delta_{0,m-1}(\lambda), \qquad n = 0, \ldots, m-1, \tag{6.3.3}$$
where Δ_{r,s}(λ) is in general given by (6.1.5), Δ_{0,-1}(λ) and Δ_{m,m-1}(λ) being interpreted as unity. By (6.1.8), (6.1.14) we may write (6.3.3) as
$$y_n(\lambda)\,w_n(\lambda)\big/\{c_{m-1}\,y_m(\lambda)\}, \qquad n = 0, \ldots, m-1. \tag{6.3.4}$$
Hence the real polynomials y_n(λ)w_n(λ) and y_m(λ) have the property that
$$\operatorname{Im}\{y_n(\lambda)\,w_n(\lambda)/y_m(\lambda)\} < 0, \qquad \text{for } \operatorname{Im}\lambda > 0. \tag{6.3.5}$$
We can conclude at once that when y_nw_n/y_m is reduced to its lowest terms, the numerator and denominator will have the property of the mutual separation of zeros. This proves the last statement of the theorem, and also the first statement in that between two zeros of y_nw_n, which are not zeros of y_m, there must lie a zero of y_m. To complete the proof we show that a zero of y_m, which is also a zero of y_nw_n, must be a double zero of the latter. Suppose for example that for some λ we have y_m(λ) = y_n(λ) = 0. Since w_r(λ), y_r(λ), r = 1, ..., m, are both nontrivial solutions of the recurrence relation vanishing when r = m, they differ only by a constant factor, and hence also w_n(λ) = 0. Similarly, if we assume y_m(λ) = w_n(λ) = 0, it follows that y_n(λ) = 0. Hence a common zero of y_nw_n and y_m must be a double zero of y_nw_n. Recalling from Chapter 4 that y_m has only simple zeros, we have that such zeros of y_nw_n are also zeros of y_nw_n/y_m in its lowest terms. Hence between two such zeros there must be a pole of y_nw_n/y_m, which completes the proof.

In the case n = 0, y_0(λ) is a constant, and w_0(λ) differs only by a constant factor from z_m(λ), by (6.1.11) and (6.1.14); the result is then that the zeros of y_m(λ), z_m(λ) separate one another. Again, if n = m − 1, w_{m-1}(λ) is a constant, and we have, as noted in Chapter 4, that the zeros of y_{m-1}(λ), y_m(λ) separate one another. For other cases we have the following weakened form of the result.
Theorem 6.3.2. For 1 ≤ n < m − 1, between two zeros of y_n(λ) lies at least one zero of y_m(λ).

This follows immediately from Theorem 6.3.1. A number of proofs are known. The corresponding result in Sturm-Liouville theory deals with eigenvalues of boundary problems over intervals of the form (a, b), (a, b′), where b′ > b.

Let us apply similar arguments to the boundary problem with periodic boundary conditions. We assume that c_{-1} = c_{m-1}, and seek λ-values for which (6.1.1) has a nontrivial solution satisfying (6.2.4). Denote by λA + B_1 the matrix (6.2.3) with α = β = 1, and again c_{-1} = c_{m-1}.
The eigenvalues of the periodic problem will be the zeros of det(λA + B_1). As before, we have
$$\operatorname{Im}\{(\lambda A + B_1)^{-1}\} < 0, \qquad \text{if } \operatorname{Im}\lambda > 0, \tag{6.3.6}$$
and this holds numerically for the diagonal elements of (λA + B_1)^{-1}. Now the minors of λa_0 + b_0, λa_{m-1} + b_{m-1} in (λA + B_1) are the same as in (λA + B), being given by the numerator in (6.3.3) for n = 0, m − 1 or, except for constant factors, by the numerators in (6.3.4). Taking the case n = m − 1, we have
$$\operatorname{Im}\{y_{m-1}(\lambda)/\det(\lambda A + B_1)\} < 0, \qquad \text{if } \operatorname{Im}\lambda > 0. \tag{6.3.7}$$
We deduce a weak separation property.
Theorem 6.3.3. If λ′, λ″ are zeros of y_{m-1}(λ), the closed interval [λ′, λ″] contains an eigenvalue of the periodic problem.

The result is trivial if either of λ′, λ″ is a zero of det(λA + B_1), and otherwise follows from (6.3.7).
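The n = m − 1 separation noted above, that the zeros of y_{m-1} and y_m interlace, is easy to confirm numerically via the Jacobi-matrix form of Section 6.1. A sketch with arbitrary coefficients:

```python
import numpy as np

rng = np.random.default_rng(3)
m = 7
a = rng.uniform(0.5, 2.0, m)
b = rng.uniform(-1.0, 1.0, m)
c = rng.uniform(0.5, 2.0, m - 1)

def zeros_of_y(k):
    # zeros of y_k = eigenvalues of the k-th order section of the pencil
    S = np.diag(a[:k] ** -0.5)
    J = np.diag(b[:k]) - np.diag(c[:k - 1], 1) - np.diag(c[:k - 1], -1)
    return np.sort(np.linalg.eigvalsh(-S @ J @ S))

zm1, zm = zeros_of_y(m - 1), zeros_of_y(m)
# strict interlacing: zm[0] < zm1[0] < zm[1] < ... < zm1[m-2] < zm[m-1]
assert all(zm[i] < zm1[i] < zm[i + 1] for i in range(m - 1))
```

This is the pencil version of Cauchy interlacing for a principal submatrix, which is strict because the off-diagonal couplings c_n do not vanish.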
For the slightly specialized case of the weightless string stretched and bearing particles, the results may be visualized as follows. For the situation of Theorem 6.3.1, we suppose the string bearing particles of masses a_0, ..., a_{m-1} and pinned down at both ends, in which situation it will have certain natural frequencies. If now we pin down also the particle a_n, the natural frequencies will be those of the two parts into which the string is thus divided. The result asserts that between any two frequencies of the separate parts, belonging either to the same or to different parts, there lies a frequency of the whole string. Theorem 6.3.3 admits a similar interpretation in which the string is closed, the particles sliding on a smooth cylinder or on parallel wires; we compare the situations in which the string vibrates freely, and in which it vibrates with the particle a_{m-1} fixed.

6.4. The Green's Function
Reverting to the boundary problem (6.1.1-2), the Green's function arises from the inhomogeneous boundary problem
$$c_n y_{n+1} - (a_n\lambda + b_n)\,y_n + c_{n-1}y_{n-1} = \alpha_n, \qquad n = 0, \ldots, m-1, \tag{6.4.1}$$
where the α_n are prescribed, still subject to y_{-1} = y_m = 0. If the homogeneous equations, with all α_n = 0, have only the trivial solution
y_n = 0, that is to say, if y_m(λ) ≠ 0, then (6.4.1) will be uniquely soluble in the form
$$y_r = \sum_{s=0}^{m-1} g_{rs}(\lambda)\,\alpha_s, \qquad r = 0, \ldots, m-1. \tag{6.4.2}$$
The function g_{rs}(λ) is the Green's function; we may visualize it as measuring the disturbance produced at one point of the system by a unit pulsating force applied at another. We may define it formally as zero when r = −1, m, or when s = −1, m, or both. The Green's function has the characteristic properties that it satisfies the boundary conditions, by the definition just made, that it is symmetric for real λ, in the sense that g_{rs}(λ) = g_{sr}(λ), and that it satisfies the recurrence relation in either variable, with an inhomogeneous term when r = s. In fact the g_{rs}(λ), r, s = 0, ..., m − 1, form a matrix which, by definition, is the inverse of the matrix −(λA + B), where A, B are given by (6.1.3-4). Since λA + B is symmetric, when λ is real, its inverse, G(λ), say, must also be symmetric, and so g_{rs}(λ) = g_{sr}(λ). Finally the matrix relationship
$$(\lambda A + B)\,G(\lambda) = -E \tag{6.4.3}$$
may be written explicitly as
$$c_s\,g_{r,s+1}(\lambda) - (a_s\lambda + b_s)\,g_{rs}(\lambda) + c_{s-1}\,g_{r,s-1}(\lambda) = \delta_{rs}, \tag{6.4.4}$$
for s = 0, ..., m − 1, where again g_{r,-1}(λ) = g_{rm}(λ) = 0, and r runs through 0, ..., m − 1, so that the g_{rs}(λ) satisfy the recurrence relation in s, except when s = r; the corresponding statement for fixed s and varying r follows by the symmetry, if λ is real, or again from (6.4.3). We have the explicit formulas
$$g_{rs}(\lambda) = -\,y_r(\lambda)\,w_s(\lambda)\big/\{c_{m-1}\,y_m(\lambda)\}, \qquad r \leq s, \tag{6.4.5}$$
$$g_{rs}(\lambda) = -\,y_s(\lambda)\,w_r(\lambda)\big/\{c_{m-1}\,y_m(\lambda)\}, \qquad r \geq s, \tag{6.4.6}$$
provided that y_m(λ) ≠ 0. In the case r = s these were found, apart from sign, in (6.3.4), where we obtained the elements on the leading diagonal of (λA + B)^{-1}. To justify the general formulas (6.4.5-6) we may either use Cramer's rule again, or observe that (6.4.5-6) give solutions of (6.4.4) when r ≠ s, and also satisfy the boundary conditions. By (6.3.2), the matrix G(λ) comprising the Green's function will have positive imaginary part when Im λ > 0.
6.5. A Reactance Theorem

The argument employed in Section 6.3 to prove separation theorems for zeros does not entirely depend on the Jacobi character of the matrices. Let A, B be Hermitean, of the same order m, say, and let A be positive-definite. As before, this implies that (λA + B)^{-1} has negative imaginary part when λ is in the upper half-plane, the same being true of the diagonal elements of (λA + B)^{-1}. Considering the last diagonal entry, let A′, B′ denote the (m − 1)th order matrices obtained by deleting the last row and column of A, B. We have then that the scalar function
$$\det(\lambda A' + B')\big/\det(\lambda A + B) \tag{6.5.1}$$
has negative imaginary part when λ is in the upper half-plane. It follows that if we remove from det(λA′ + B′), det(λA + B) any common zeros they may have, the remaining zeros will be simple and will have the mutual separation property. We see that (6.5.1) may be expressed as the ratio of two polynomials with mutually separating zeros. Hence, by Section 4.7, we may express it in the form of a constant multiple of y_{m-1}(λ)/y_m(λ), where the y_n(λ) are a sequence of orthogonal polynomials, or polynomials connected by a recurrence relation. We approach here a theorem in network theory, according to which the driving-point impedance of an LC-network at any point is the same as that of a suitably chosen LC-ladder of the form given in Section 0.7, Fig. 2.
6.6. Polynomials with Matrix Coefficients

In the remainder of this chapter we indicate two generalizations of the theory of orthogonal polynomials. The first of these arises by taking the coefficients in the recurrence relation (4.1.1) to be square matrices of some fixed order k. Virtually the whole of Chapters 4 and 5 can be extended in this way. A very similar situation obtains in Sturm-Liouville theory, where large sections of the ordinary theory carry over to the case in which the coefficients in the differential equation are square matrices. We define the basic polynomials by the matrix recurrence relation
$$Y_{n+1}(\lambda) = (\lambda A_n + B_n)\,Y_n(\lambda) - Y_{n-1}(\lambda), \tag{6.6.1}$$
with the initial conditions
$$Y_{-1}(\lambda) = 0, \qquad Y_0(\lambda) = E. \tag{6.6.2}$$
This defines Y_n(λ) as a polynomial of precise degree n, if the A_n are nonsingular; we have, in fact,
$$Y_n(\lambda) = (A_{n-1}\lambda + B_{n-1})\cdots(A_0\lambda + B_0) + O(\lambda^{n-2}). \tag{6.6.3}$$
We shall assume in what follows that the A_n, B_n are Hermitean, and the A_n positive-definite, so that
$$A_n^* = A_n, \qquad B_n^* = B_n, \qquad A_n > 0. \tag{6.6.4}$$
A fuller extension of (4.1.1) would be the relation
$$C_n Y_{n+1} = (\lambda A_n + B_n)\,Y_n - C_{n-1}Y_{n-1}.$$
This, however, can be reduced to the previous form by the substitution
$$Y_n = C_{n-1}^{-1}\,C_{n-2}\,C_{n-3}^{-1}\cdots Y_n'$$
(with different A_n, B_n).
(with different A,, Bn). For a boundary problem, extending that of Chapter 4, we set up the vector recurrence relation Y , + ~ = (AA,
+ B,) y, - yn-l ,
n
= 0,
+ Hym-1
= 0.
..., m - 1,
(6.6.5)
and ask for nontrivial solutions such that y-1
= 0,
Ym
(6.6.6)
Here y_n denotes a k-by-1 column matrix, and H is a square Hermitean matrix of order k. The reality of the spectrum is most simply proved by writing (6.6.5-6) as a single matrix equation of the form (6.1.4). We introduce (6.6.6) into (6.6.5), writing the first and last equations in (6.6.5) as
$$y_1 = (\lambda A_0 + B_0)\,y_0, \qquad 0 = (\lambda A_{m-1} + B_{m-1} + H)\,y_{m-1} - y_{m-2}.$$
The equations may now be written
$$\begin{pmatrix}
\lambda A_0 + B_0 & -E & 0 & \cdots & 0 \\
-E & \lambda A_1 + B_1 & -E & \cdots & 0 \\
0 & -E & \lambda A_2 + B_2 & \cdots & 0 \\
\vdots & & & \ddots & -E \\
0 & \cdots & & -E & \lambda A_{m-1} + B_{m-1} + H
\end{pmatrix}
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{m-1} \end{pmatrix} = 0. \tag{6.6.7}$$
The eigenvalues of our problem are those of a km-by-km λ-matrix, which is Hermitean. Since also the coefficient of λ is positive-definite, these eigenvalues are all real, though we cannot assert that they are all distinct. There will be an orthogonality of eigenfunctions, to which we return later. Of the various results of Christoffel-Darboux type, similar to those of Section 4.2, we need the following, namely,
$$Y_{m-1}^*(\lambda)\,Y_m(\mu) - Y_m^*(\lambda)\,Y_{m-1}(\mu) = (\mu - \bar\lambda)\sum_{n=0}^{m-1} Y_n^*(\lambda)\,A_n\,Y_n(\mu). \tag{6.6.8}$$
For the proof we take the complex conjugate transpose of (6.6.1), getting
$$Y_{n+1}^*(\lambda) = Y_n^*(\lambda)\,(\bar\lambda A_n + B_n) - Y_{n-1}^*(\lambda),$$
since A_n, B_n are Hermitean, and write (6.6.1) with μ for λ,
$$Y_{n+1}(\mu) = (\mu A_n + B_n)\,Y_n(\mu) - Y_{n-1}(\mu).$$
Multiplying the latter on the left by Y_n^*(λ) and the former on the right by Y_n(μ) and subtracting, we get the analog of (4.2.2). Summing over n and using (6.6.2) we derive (6.6.8).
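The identity (6.6.8) can be verified numerically. The sketch below uses random Hermitean A_n (shifted to be positive-definite, as in (6.6.4)) and B_n:

```python
import numpy as np

rng = np.random.default_rng(5)
k, m = 3, 5

def herm():
    M = rng.normal(size=(k, k)) + 1j * rng.normal(size=(k, k))
    return (M + M.conj().T) / 2

As = [herm() + 10 * np.eye(k) for _ in range(m)]   # Hermitean, A_n > 0
Bs = [herm() for _ in range(m)]

def Y_seq(lam):
    Ys = [np.zeros((k, k), complex), np.eye(k)]    # Y_{-1}, Y_0
    for n in range(m):
        Ys.append((lam * As[n] + Bs[n]) @ Ys[-1] - Ys[-2])
    return Ys[1:]                                   # Y_0, ..., Y_m

lam, mu = 0.3 + 0.2j, -0.7 + 0.1j
Yl, Ym = Y_seq(lam), Y_seq(mu)

lhs = Yl[m - 1].conj().T @ Ym[m] - Yl[m].conj().T @ Ym[m - 1]
rhs = (mu - np.conj(lam)) * sum(Yl[n].conj().T @ As[n] @ Ym[n] for n in range(m))
assert np.allclose(lhs, rhs, rtol=1e-9)
```

The two sides agree to rounding error, since the sum on the right telescopes exactly as in the proof.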
6.7. Oscillatory Properties

We pass to analogs of the results of Section 4.3. In place of the ordinary polynomial y_n(λ) of Chapter 4 we consider either the matrix polynomial Y_n(λ) defined by (6.6.1-2) or its determinant det Y_n(λ). We have already proved

Theorem 6.7.1. The zeros of det Y_n(λ) are all real.

For the zeros of det Y_n(λ) are the eigenvalues of a boundary problem for (6.6.5) with the boundary conditions y_{-1} = y_n = 0 in place of (6.6.6). Another proof will be noted later. This corresponds to Theorem 4.3.1; an inessential extension of Theorem 6.7.1 is possible to the determinant det{Y_n(λ) + HY_{n-1}(λ)}, where H is Hermitean. We cannot assert that the zeros of det Y_n(λ) are all simple.

The corresponding result deals with the poles of {Y_n(λ)}^{-1}. We take the opportunity to make a slight relaxation of our assumptions concerning
(6.6.1); we still assume A_n, B_n Hermitean, but not all the A_n need be definite. We have

Theorem 6.7.2. Let
$$A_0 > 0, \qquad A_1 \geq 0, \qquad A_2 \geq 0, \ldots. \tag{6.7.1}$$
Then {Y_n(λ)}^{-1} has only real and simple poles.

If in (6.6.8) we take μ = λ and replace m by n we get
$$Y_{n-1}^*(\lambda)\,Y_n(\lambda) - Y_n^*(\lambda)\,Y_{n-1}(\lambda) = (\lambda - \bar\lambda)\sum_{r=0}^{n-1} Y_r^*(\lambda)\,A_r\,Y_r(\lambda). \tag{6.7.2}$$
The terms in the last sum are all at least positive semidefinite by (6.7.1). Since Y_0(λ) = E we deduce that
$$\operatorname{Im}\{Y_{n-1}^*(\lambda)\,Y_n(\lambda)\} \geq \operatorname{Im}\{\lambda\}\,A_0, \qquad \text{if } \operatorname{Im}\lambda > 0. \tag{6.7.3}$$
Since A_0 > 0, we have that Y_n(λ) is nonsingular if Im λ > 0; a similar argument shows that it is nonsingular if Im λ < 0. Hence {Y_n(λ)}^{-1} has only real poles. To prove that the poles are all simple we multiply either (6.7.2) or (6.7.3) on the left by {Y_n*(λ)}^{-1} and on the right by {Y_n(λ)}^{-1}, taking λ complex, the result being

    Im {Y_{n-1}(λ) {Y_n(λ)}^{-1}} ≤ −Im {λ} {Y_n*(λ)}^{-1} A_0 {Y_n(λ)}^{-1},    (6.7.4)

if Im λ > 0. Suppose now that λ_0 is a real pole of order s, the Laurent expansion of {Y_n(λ)}^{-1} near λ_0 being

    {Y_n(λ)}^{-1} = P (λ − λ_0)^{−s} + ....

Putting λ = λ_0 + it in (6.7.4), where t is small and positive, substituting for {Y_n(λ)}^{-1} its Laurent expansion and making t → 0, the term of highest order on the left will be of order at most t^{−s}, while the right of (6.7.4) will have as leading term

    −t (−it)^{−s} P* A_0 P (it)^{−s},

and since A_0 > 0 this is of order exactly t^{1−2s}. We thus have a contradiction if s > 1. This completes the proof.

Next we extend the result of Theorem 4.2.2.
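The realness assertion of Theorems 6.7.1-2 can be illustrated numerically. The sketch below does not reproduce (6.6.1) itself; it assumes a representative block three-term recurrence, with all names and normalizations being illustrative assumptions rather than the text's equations.

```python
import numpy as np

# Hedged stand-in for (6.6.1):  Y_{n+1}(lam) = (lam*A_n - B_n) Y_n(lam) - Y_{n-1}(lam),
# with Y_{-1} = 0, Y_0 = E, the A_n Hermitean positive-definite, the B_n Hermitean.
# The zeros of det Y_m(lam) are then the eigenvalues of the Hermitean pencil
# J y = lam M y for the boundary problem y_{-1} = y_m = 0, hence real.

rng = np.random.default_rng(1)
k, m = 2, 3

def rand_herm(shift=0.0):
    X = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    return (X + X.conj().T) / 2 + shift * np.eye(k)

A = [rand_herm(shift=6.0) for _ in range(m)]   # positive-definite
B = [rand_herm() for _ in range(m)]

def Y(n, lam):
    Yprev, Ycur = np.zeros((k, k), complex), np.eye(k, dtype=complex)
    for p in range(n):
        Yprev, Ycur = Ycur, (lam * A[p] - B[p]) @ Ycur - Yprev
    return Ycur

# Block-tridiagonal pencil: row n reads y_{n-1} + B_n y_n + y_{n+1} = lam A_n y_n.
J = np.zeros((k * m, k * m), complex)
M = np.zeros((k * m, k * m), complex)
for n in range(m):
    J[n*k:(n+1)*k, n*k:(n+1)*k] = B[n]
    M[n*k:(n+1)*k, n*k:(n+1)*k] = A[n]
    if n + 1 < m:
        J[n*k:(n+1)*k, (n+1)*k:(n+2)*k] = np.eye(k)
        J[(n+1)*k:(n+2)*k, n*k:(n+1)*k] = np.eye(k)

w, V = np.linalg.eigh(M)                      # M > 0, so M^{-1/2} exists
Mih = V @ np.diag(w ** -0.5) @ V.conj().T
eigs = np.linalg.eigvalsh(Mih @ J @ Mih)      # the km real eigenvalues

det_at_eigs = max(abs(np.linalg.det(Y(m, lam))) for lam in eigs)
det_off_axis = abs(np.linalg.det(Y(m, eigs[0] + 0.5j)))
assert det_at_eigs < 1e-6 * det_off_axis      # det Y_m vanishes only at real points
print("km =", k * m, "real zeros of det Y_m confirmed")
```

The comparison against a deliberately complex evaluation point checks that det Y_m is small only at the real eigenvalues of the pencil, never off the real axis.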
6. MATRIX METHODS FOR POLYNOMIALS
Theorem 6.7.3. For real λ,

    Y*_{n-1}(λ) Y'_n(λ) − Y_n*(λ) Y'_{n-1}(λ) = Σ_{p=0}^{n-1} Y_p*(λ) A_p Y_p(λ),    (6.7.5)

and hence

    Y*_{n-1}(λ) Y'_n(λ) − Y_n*(λ) Y'_{n-1}(λ) > 0.    (6.7.6)

In (6.6.8), with n for m, we take λ, μ real. If in particular λ = μ, the right vanishes and we derive

    Y_n*(λ) Y_{n-1}(λ) − Y*_{n-1}(λ) Y_n(λ) = 0,  for λ real.    (6.7.7)

If now in (6.6.8) we make μ → λ and use l'Hôpital's rule we get (6.7.5), with (6.7.6) as an immediate consequence.

We derive an extension of the property that two consecutive orthogonal polynomials with scalar coefficients have no common zeros. It is not true that det Y_n(λ), det Y_{n-1}(λ) cannot have common zeros, but nevertheless we have

Theorem 6.7.4. For any column matrix g ≠ 0 the polynomials Y_n(λ)g, Y_{n-1}(λ)g have no common zeros.

They can have no common complex zeros, by Theorem 6.7.1. To show that they cannot have a common real zero we multiply (6.7.6) on the left and right by g*, g. By (6.7.6), the left-hand side cannot vanish, as it would if Y_n(λ)g = 0 and Y_{n-1}(λ)g = 0.

A property closely related to the separation property for ordinary polynomials of an orthogonal set is that the ratio of two consecutive polynomials is a monotonic function, when finite. This is still the case.
Theorem 6.7.5. For real λ for which {Y_{n-1}(λ)}^{-1} exists, Y_n(λ){Y_{n-1}(λ)}^{-1} is Hermitean, and is an increasing function of λ. Similarly Y_{n-1}(λ){Y_n(λ)}^{-1} is Hermitean and decreasing, if it exists.

Supposing that Y_{n-1}^{-1} exists, it follows from (6.7.7) that

    Y_n Y_{n-1}^{-1} = (Y*_{n-1})^{-1} Y_n*,    (6.7.8)

so that Y_n Y_{n-1}^{-1} is Hermitean. Differentiating with respect to λ we have

    (Y_n Y_{n-1}^{-1})' = Y'_n Y_{n-1}^{-1} − Y_n Y_{n-1}^{-1} Y'_{n-1} Y_{n-1}^{-1}
                        = Y'_n Y_{n-1}^{-1} − (Y*_{n-1})^{-1} Y_n* Y'_{n-1} Y_{n-1}^{-1},
using (6.7.8), and hence

    (Y_n Y_{n-1}^{-1})' = (Y*_{n-1})^{-1} {Y*_{n-1} Y'_n − Y_n* Y'_{n-1}} Y_{n-1}^{-1},    (6.7.9)

and the right-hand side is positive-definite by (6.7.6). The proof of the latter statement of the theorem is similar.

This result may be used to study the variation of eigenvalues with boundary conditions of the type (6.6.6). The method has the disadvantage that it may happen that neither Y_n^{-1} nor Y_{n-1}^{-1} exists, and so we study instead the Cayley transform of these matrices.
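The monotonicity in Theorem 6.7.5 can be checked numerically. The sketch again assumes a representative block recurrence as a stand-in for (6.6.1); the specific matrices are illustrative.

```python
import numpy as np

# Stand-in recurrence: Y_{n+1}(lam) = (lam*A_n - B_n) Y_n(lam) - Y_{n-1}(lam),
# Y_{-1} = 0, Y_0 = E, with real symmetric A_n > 0 and symmetric B_n.  Then
# F = Y_m Y_{m-1}^{-1} is symmetric, with derivative (analog of (6.7.5), (6.7.9))
#   F' = (Y_{m-1}^T)^{-1} { sum_p Y_p^T A_p Y_p } Y_{m-1}^{-1}  >  0.

rng = np.random.default_rng(2)
k, m = 2, 4
sym = lambda X: (X + X.T) / 2
A = [sym(rng.standard_normal((k, k))) + 5 * np.eye(k) for _ in range(m)]
B = [sym(rng.standard_normal((k, k))) for _ in range(m)]

def Y(n, lam):
    Yprev, Ycur = np.zeros((k, k)), np.eye(k)
    for p in range(n):
        Yprev, Ycur = Ycur, (lam * A[p] - B[p]) @ Ycur - Yprev
    return Ycur

F = lambda lam: Y(m, lam) @ np.linalg.inv(Y(m - 1, lam))

# choose a real lam at which Y_{m-1} is comfortably invertible
lam = max((0.0, 0.3, 0.7, 1.1), key=lambda t: abs(np.linalg.det(Y(m - 1, t))))

Yinv = np.linalg.inv(Y(m - 1, lam))
T = sum(Y(p, lam).T @ A[p] @ Y(p, lam) for p in range(m))   # analog of (6.7.5)
Fdot = Yinv.T @ T @ Yinv                                    # analog of (6.7.9)

h = 1e-5
Fdot_numeric = (F(lam + h) - F(lam - h)) / (2 * h)          # central difference

assert np.linalg.norm(F(lam) - F(lam).T) < 1e-8             # Hermitean (symmetric)
assert np.linalg.eigvalsh(Fdot).min() > 0                   # increasing
assert np.linalg.norm(Fdot_numeric - Fdot) < 1e-3 * np.linalg.norm(Fdot)
print("Y_m Y_{m-1}^{-1} is symmetric with positive-definite derivative")
```

The third assertion confirms that the closed-form derivative built from the Christoffel-Darboux-type sum agrees with a finite-difference derivative, which is the content of the identity used in the proof.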
Theorem 6.7.6. The matrix

    Q_n(λ) = (Y_{n-1}(λ) + iY_n(λ)) (Y_{n-1}(λ) − iY_n(λ))^{-1}    (6.7.10)

exists for all real λ, and is unitary. It satisfies a differential equation

    Q'_n(λ) = i Q_n(λ) Ω_n(λ),    (6.7.11)

in which

    Ω_n(λ) is Hermitean and positive-definite.    (6.7.12)

To prove that Q_n(λ) exists and is unitary for real λ we use the identities

    (Y*_{n-1} − iY_n*)(Y_{n-1} + iY_n) = (Y*_{n-1} + iY_n*)(Y_{n-1} − iY_n) = Y*_{n-1} Y_{n-1} + Y_n* Y_n.    (6.7.13)

These follow from (6.7.7). We now observe that the last expression is positive-definite. The terms Y*_{n-1}Y_{n-1}, Y_n*Y_n being at least positive semidefinite, their sum is positive-definite except possibly in the event of the corresponding quadratic forms vanishing simultaneously, that is to say, if there is a column matrix g ≠ 0 such that g*Y*_{n-1}Y_{n-1}g = 0 and g*Y_n*Y_ng = 0. These, however, imply that Y_{n-1}g = 0, Y_ng = 0, and this is excluded by Theorem 6.7.4, or indeed by the recurrence relation (6.6.5). We deduce that the factors in (6.7.13) are nonsingular for real λ, so that Q_n(λ) does indeed exist. From (6.7.13) we have that
    (Y*_{n-1} + iY_n*)^{-1} (Y*_{n-1} − iY_n*)(Y_{n-1} + iY_n) (Y_{n-1} − iY_n)^{-1} = E,    (6.7.14)

or

    Q_n*(λ) Q_n(λ) = E,    (6.7.15)

so that Q_n(λ) is unitary.
Finally we verify the differential equation (6.7.11). We have

    Q'_n = (Y'_{n-1} + iY'_n)(Y_{n-1} − iY_n)^{-1} − (Y_{n-1} + iY_n)(Y_{n-1} − iY_n)^{-1} (Y'_{n-1} − iY'_n)(Y_{n-1} − iY_n)^{-1}.    (6.7.16)

Multiplying on the left by Q_n* = Q_n^{-1} we obtain

    Q_n* Q'_n = (Y*_{n-1} + iY_n*)^{-1} (Y*_{n-1} − iY_n*)(Y'_{n-1} + iY'_n)(Y_{n-1} − iY_n)^{-1} − (Y'_{n-1} − iY'_n)(Y_{n-1} − iY_n)^{-1}    (6.7.17)

    = (Y*_{n-1} + iY_n*)^{-1} {(Y*_{n-1} − iY_n*)(Y'_{n-1} + iY'_n) − (Y*_{n-1} + iY_n*)(Y'_{n-1} − iY'_n)} (Y_{n-1} − iY_n)^{-1}    (6.7.18)

    = 2i (Y*_{n-1} + iY_n*)^{-1} {Y*_{n-1} Y'_n − Y_n* Y'_{n-1}} (Y_{n-1} − iY_n)^{-1}    (6.7.19)

    = i Ω_n(λ),    (6.7.20)

say. In view of (6.7.6), the matrix Ω_n(λ) so defined is positive-definite. This completes the proof.

In the scalar case in which Q_n, Ω_n are 1-by-1 matrices, (6.7.11-12) bear the interpretation that Q_n moves positively round the unit circle. In the general case the same statement applies to the eigenvalues of Q_n(λ). By this means we derive an extension of the separation property of Theorem 4.3.2.
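The unitarity of the Cayley transform and its positive rotation can also be checked numerically, again under the assumed stand-in recurrence for (6.6.1).

```python
import numpy as np

# Stand-in recurrence: Y_{n+1}(lam) = (lam*A_n - B_n) Y_n(lam) - Y_{n-1}(lam),
# Y_{-1} = 0, Y_0 = E.  Then Q(lam) = (Y_{m-1} + iY_m)(Y_{m-1} - iY_m)^{-1}
# should be unitary for every real lam, and arg det Q should increase with lam,
# reflecting Q' = iQ*Omega with Omega positive-definite (so d arg det Q = tr Omega dlam).

rng = np.random.default_rng(3)
k, m = 2, 4
sym = lambda X: (X + X.T) / 2
A = [sym(rng.standard_normal((k, k))) + 5 * np.eye(k) for _ in range(m)]
B = [sym(rng.standard_normal((k, k))) for _ in range(m)]

def Y(n, lam):
    Yprev, Ycur = np.zeros((k, k), complex), np.eye(k, dtype=complex)
    for p in range(n):
        Yprev, Ycur = Ycur, (lam * A[p] - B[p]) @ Ycur - Yprev
    return Ycur

def Q(lam):
    return (Y(m - 1, lam) + 1j * Y(m, lam)) @ np.linalg.inv(Y(m - 1, lam) - 1j * Y(m, lam))

for lam in np.linspace(-1.0, 1.0, 7):
    U = Q(lam)
    assert np.linalg.norm(U.conj().T @ U - np.eye(k)) < 1e-8   # unitary

h = 1e-3
dtheta = np.angle(np.linalg.det(Q(0.2 + h)) / np.linalg.det(Q(0.2)))
assert 0 < dtheta < np.pi        # positive rotation of the eigenvalue product
print("Q(lam) unitary; arg det Q advanced by", dtheta)
```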
Theorem 6.7.7. Let the interval [λ', λ''] contain k + 1 zeros of det Y_{n-1}(λ). Then the interval (λ', λ'') contains a zero of det Y_n(λ).

Let the eigenvalues of Q_n(λ) be

    ω_1(λ), ..., ω_k(λ),    (6.7.21)

k being the common order of all the square matrices. These, being the roots of an algebraic equation det {Q_n(λ) − ωE} = 0 whose coefficients are regular when λ is real, can be taken to be k continuous functions of λ. The ω_r(λ) will lie on the unit circle, for real λ, since Q_n(λ) is unitary, and by (6.7.11-12), and Appendix V, will move monotonically and positively around the unit circle as λ increases. If now det Y_{n-1}(λ) = 0, there will be a column matrix g ≠ 0 such that Y_{n-1}(λ)g = 0, and so also Y_n(λ)g ≠ 0 by Theorem 6.7.4. Hence, dropping the λ's,

    (Y_{n-1} − iY_n)g = −(Y_{n-1} + iY_n)g,

or, with h = (Y_{n-1} − iY_n)g,

    Q_n(λ) h = −h,

so that Q_n(λ) must have −1 among its eigenvalues. Conversely, if one of the ω_r(λ) is −1, so that for some h ≠ 0 we have h = −Q_nh, by retracing the above steps we find that Y_{n-1}g = 0 for some g ≠ 0. Similarly, if det Y_n(λ) = 0, Q_n(λ) must have +1 among its eigenvalues, and conversely.

Suppose now that the open interval (λ', λ'') contains no zero of det Y_n(λ). It follows that in this open interval the functions ω_r(λ) cannot take the value +1. Thus as λ increases in λ' < λ < λ'' the ω_r(λ) will describe monotonically some part of the unit circle from which the point +1 has been deleted. Hence in the open interval there can be at most k values of λ at which one of the ω_r(λ) equals −1. The same is true of the closed interval [λ', λ''], since if, say, ω_r(λ') = −1, then ω_r(λ) must lie in the lower half of the unit circle for λ' < λ < λ'', and cannot take the value −1 again in this interval, a similar argument applying to the eventuality that ω_r(λ'') = −1. Hence if the closed interval contains k + 1 zeros of det Y_{n-1}(λ), counted according to multiplicity, there must be at least one zero of det Y_n(λ) in the open interval, as was to be proved.
6.8. Orthogonality

Next we prove that the polynomials Y_n(λ) defined by the recurrence relation are indeed orthogonal. We prove this first in the finite-dimensional case, and assume for simplicity that the matrices A_n in (6.6.1) are all positive-definite. The eigenfunction expansion associated with the boundary problem (6.6.5-6) will be the same as the eigenvector expansion associated with the matrix equation (6.6.7). As already noted, the eigenvalues will be all real; they will be km in number, possibly not all distinct. We write them

    λ_r,    r = 1, ..., km.    (6.8.1)

There will correspondingly be km nontrivial solutions of (6.6.5-6); we write the corresponding values of y_0 in (6.6.5) as

    u_r,    r = 1, ..., km,    (6.8.2)
those corresponding to coincident λ_r being linearly independent. The solutions of (6.6.5-6) will then be

    Y_0(λ_r) u_r, ..., Y_{m-1}(λ_r) u_r,    r = 1, ..., km,    (6.8.3)

together with a first and last member satisfying (6.6.6), which we omit. We then assert that the u_r can be chosen so that

    Σ_{n=0}^{m-1} u_s* Y_n*(λ_s) A_n Y_n(λ_r) u_r = δ_{rs},    r, s = 1, ..., km.    (6.8.4)
In the case λ_r ≠ λ_s this follows from (6.6.8), multiplying on the left and right by u_s*, u_r and using the second of the boundary conditions (6.6.6). If λ_r is a simple eigenvalue, we may arrange that (6.8.4) holds for s = r by multiplying u_r if necessary by a scalar factor. If finally λ_r is a multiple eigenvalue, there will be a corresponding number of linearly independent u_r, and (6.8.4) can be arranged to hold for these u_r by a process of orthogonalization.

The eigenfunction expansion asserts that for an arbitrary sequence of k-vectors or column matrices

    ξ_0, ..., ξ_{m-1},    (6.8.5)

there will be a set of scalars a_r, r = 1, ..., km, such that

    ξ_n = Σ_{r=1}^{km} A_n Y_n(λ_r) u_r a_r,    n = 0, ..., m − 1.    (6.8.6)

The possibility of such an expansion follows from dimensional considerations, (6.8.3) forming a set of km vectors, each involving km constants, and orthonormal according to (6.8.4). The order of the factors on the right of (6.8.6) follows matrix conventions, a_r being a 1-by-1 matrix. Multiplying (6.8.6) by u_s* Y_n*(λ_s) and summing over n we have, in view of (6.8.4),

    Σ_{n=0}^{m-1} u_s* Y_n*(λ_s) ξ_n = a_s,    s = 1, ..., km,

and so, substituting in (6.8.6),

    ξ_n = Σ_{r=1}^{km} A_n Y_n(λ_r) u_r Σ_{p=0}^{m-1} u_r* Y_p*(λ_r) ξ_p.    (6.8.7)
Comparing coefficients of the arbitrary vector ξ_p, we deduce that

    E δ_{np} = A_n Σ_{r=1}^{km} Y_n(λ_r) u_r u_r* Y_p*(λ_r).    (6.8.8)

With a view to the transition m → ∞ we rewrite this in Stieltjes integral form, defining the spectral function

    τ_{m,H}(λ) = Σ_{λ_r ≤ λ} u_r u_r*.    (6.8.9)

Thus τ_{m,H}(λ) will be a nondecreasing Hermitean matrix function of the real variable λ. We sum up the result as

Theorem 6.8.1. Let A_n, B_n, n = 0, ..., m − 1, be k-by-k Hermitean matrices of which the A_n are positive-definite. Let H be Hermitean. Then the polynomials Y_n(λ) defined by (6.6.1-2) are orthogonal according to

    ∫_{−∞}^{∞} Y_j(λ) dτ_{m,H}(λ) Y_k*(λ) = δ_{jk} A_j^{-1},    j, k = 0, ..., m − 1,    (6.8.10)

where τ_{m,H}(λ) is a nondecreasing Hermitean step function, with jumps at the eigenvalues of (6.6.5-6).

Supposing the A_n, B_n Hermitean for all n ≥ 0 and the A_n all positive-definite, we may make the limiting transition m → ∞ in (6.8.10). Putting j = k = 0 in (6.8.10) we have, by (6.8.2),

    ∫_{−∞}^{∞} dτ_{m,H}(λ) = A_0^{-1},    (6.8.11)
from which we deduce that τ_{m,H}(λ) is uniformly bounded as m → ∞, with H possibly varying with m. The Helly-Bray theorem, modified for matrices, may now be applied to show the existence of a convergent subsequence, with limit τ(λ) say, likewise a nondecreasing Hermitean matrix function. We must again show that integrals of the form (5.2.8) are uniformly bounded in the matrix case, for which we use further cases of (6.8.10). Thus, by an argument similar to that of Section 5.2, we find that the full orthogonality holds, namely,

    ∫_{−∞}^{∞} Y_j(λ) dτ(λ) Y_k*(λ) = δ_{jk} A_j^{-1},    j, k = 0, 1, ....    (6.8.12)
Some further developments along the lines of Chapters 4 and 5 are indicated among the examples for this chapter.
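The finite orthogonality (6.8.4) and the dual relation (6.8.8) can be verified numerically. As before, the code assumes a representative block three-term recurrence and the boundary condition y_{-1} = y_m = 0, as a hedged stand-in for (6.6.5-6).

```python
import numpy as np

# Solving the Hermitean pencil J y = lam M y for the stand-in recurrence
#   y_{n+1} = (lam*A_n - B_n) y_n - y_{n-1},  y_{-1} = y_m = 0,
# the eigenvectors, normalized so that y* M y = 1, play the role of the
# sequences (Y_0(lam_r)u_r, ..., Y_{m-1}(lam_r)u_r): the normalization is
# (6.8.4), and sum_r y^(r) y^(r)* = M^{-1} = diag(A_0^{-1}, ..., A_{m-1}^{-1})
# is the dual relation (6.8.8).

rng = np.random.default_rng(4)
k, m = 2, 3

def rand_herm(shift=0.0):
    X = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
    return (X + X.conj().T) / 2 + shift * np.eye(k)

A = [rand_herm(6.0) for _ in range(m)]     # positive-definite
B = [rand_herm() for _ in range(m)]

J = np.zeros((k * m, k * m), complex)
M = np.zeros((k * m, k * m), complex)
for n in range(m):
    J[n*k:(n+1)*k, n*k:(n+1)*k] = B[n]
    M[n*k:(n+1)*k, n*k:(n+1)*k] = A[n]
    if n + 1 < m:
        J[n*k:(n+1)*k, (n+1)*k:(n+2)*k] = np.eye(k)
        J[(n+1)*k:(n+2)*k, n*k:(n+1)*k] = np.eye(k)

w, V = np.linalg.eigh(M)
Mih = V @ np.diag(w ** -0.5) @ V.conj().T       # M^{-1/2}
lams, Z = np.linalg.eigh(Mih @ J @ Mih)
Ys = Mih @ Z                                    # columns: y^(r), with y* M y = 1

# (6.8.4): y^(s)* M y^(r) = delta_rs
assert np.allclose(Ys.conj().T @ M @ Ys, np.eye(k * m), atol=1e-10)

# (6.8.8): sum_r y^(r) y^(r)* = M^{-1}, block-diagonal with blocks A_n^{-1}
assert np.allclose(Ys @ Ys.conj().T, np.linalg.inv(M), atol=1e-10)
print("orthogonality (6.8.4) and dual orthogonality (6.8.8) confirmed")
```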
6.9. Polynomials in Several Variables

Again as in Sturm-Liouville theory, parts of the theory of the recurrence relation (4.1.1) extend to the case where there are several parameters. Writing λ for (λ_1, ..., λ_k) we define k sequences of polynomials by

    c_{n,r} y_{n+1,r}(λ) = (Σ_{s=1}^{k} a_{nrs} λ_s − b_{n,r}) y_{n,r}(λ) − c_{n-1,r} y_{n-1,r}(λ),    n = 0, 1, ...,  r = 1, ..., k,    (6.9.1)

    y_{-1,r}(λ) = 0,    y_{0,r}(λ) = 1,    r = 1, ..., k,    (6.9.2)

    c_{-1,r} = 1,    r = 1, ..., k.    (6.9.3)

The relations (6.9.1-2) define y_{n,r}(λ) as a polynomial of the nth degree in λ_1, ..., λ_k, with real coefficients, assuming as we do that the a_{nrs}, b_{n,r} are all real. For the boundary problem we demand that

    y_{m,r}(λ) = 0,    r = 1, ..., k,    (6.9.4)

the eigenvalues λ = (λ_1, ..., λ_k) being thus the roots of k simultaneous polynomial equations.

Let us show that for suitably restricted a_{nrs} the eigenvalues are all real. Supposing λ to be a complex eigenvalue, so that λ̄ is also an eigenvalue, we multiply (6.9.1) by ȳ_{n,r}(λ), its complex conjugate by y_{n,r}(λ), and subtract, and sum over n = 0, ..., m − 1 [cf. (4.2.1), (4.2.5)]. This gives

    Σ_{n=0}^{m-1} Σ_{s=1}^{k} a_{nrs} (λ_s − λ̄_s) |y_{n,r}(λ)|² = 0,    r = 1, ..., k.    (6.9.5)
If λ is complex, so that not all the λ_s − λ̄_s vanish, the latter may be eliminated, giving the determinantal condition

    det { Σ_{n=0}^{m-1} a_{nrs} |y_{n,r}(λ)|² } = 0,    (6.9.6)
where the left-hand side is a k-by-k determinant with typical element given by fixing r, s. This determinant may be expanded into m^k terms of the form

    |y_{n_1,1}(λ)|² ··· |y_{n_k,k}(λ)|² Δ_{n_1...n_k},

where Δ_{n_1...n_k} is a determinant formed of the a_{nrs}, and each of the n_1, ..., n_k may take any of the values 0, ..., m − 1. We have explicitly

    Δ_{n_1...n_k} = det | a_{n_1 1 1}  a_{n_1 1 2}  ...  a_{n_1 1 k} |
                        | a_{n_2 2 1}  a_{n_2 2 2}  ...  a_{n_2 2 k} |
                        | ....................................... |
                        | a_{n_k k 1}  a_{n_k k 2}  ...  a_{n_k k k} |.    (6.9.7)
The requisite condition is that all these be positive, extending (4.1.2) for the case k = 1. Our result is then

Theorem 6.9.1. In the recurrence relation (6.9.1) let the c_{n,r} be positive, the b_{n,r} real, and the a_{nrs} real and subject to the determinants (6.9.7) being positive for all possible choices of n_1, ..., n_k as integers from 0 to m − 1. Then the boundary problem (6.9.4) has only real eigenvalues.

Our conditions make (6.9.6) impossible, since all the m^k terms in the expansion of the determinant (6.9.6) are now non-negative, and at least one, that for which n_1 = ... = n_k = 0, for example, is positive.

The argument leading to (6.9.5-6) also leads to a certain orthogonality of the eigenfunctions, similar to (4.4.2). Let λ^{(u)}, λ^{(v)} be two distinct eigenvalues, which we now know to be real; they are to be distinct in that at least one of their components differs in the two cases. It is then easily shown that
    det { Σ_{n=0}^{m-1} a_{nrs} y_{n,r}(λ^{(u)}) y_{n,r}(λ^{(v)}) } = 0,    (6.9.8)

this being a k-by-k determinant as in (6.9.6). We now write n for a multi-index or k-tuple (n_1, ..., n_k) of integers, 0 ≤ n_r ≤ m − 1, and

    y_n(λ) = Π_{r=1}^{k} y_{n_r,r}(λ).    (6.9.9)

Write also Δ_n for the determinant (6.9.7). Then (6.9.8) may be written in a form approximating to (4.4.2) as

    Σ_n Δ_n y_n(λ^{(u)}) y_n(λ^{(v)}) = ρ_u δ_{uv},    (6.9.10)
this being a definition of ρ_u when u = v. With the above range of n, this constitutes an orthogonality in a space of dimension m^k, which may be viewed as the tensor product of k spaces of dimension m. We have not so far shown that the eigenvalues exist, or that they are discrete. We proceed to show that the usual oscillatory characterization of the eigenvalues (cf. Theorem 4.3.5) does in fact extend to the present case.
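Theorem 6.9.1 can be probed numerically for k = 2. The sketch below uses a hedged reading of (6.9.1-3) — the exact display is reconstructed, so the coefficient layout is an assumption — with a_{nrs} chosen so that every determinant (6.9.7) is positive.

```python
import numpy as np

# Hedged reading of (6.9.1-3):
#   c_{n,r} y_{n+1,r} = (sum_s a_{nrs} lam_s - b_{n,r}) y_{n,r} - c_{n-1,r} y_{n-1,r},
#   y_{-1,r} = 0, y_{0,r} = 1,  all c = 1.
# With the determinants (6.9.7) positive, every joint root of
# y_{2,1}(lam) = y_{2,2}(lam) = 0 found by Newton iteration from complex
# starting points should come out real.

k, m = 2, 2
a = {(n, r): np.array([1.0, 0.3]) if r == 0 else np.array([0.3, 1.0])
     for n in range(m) for r in range(k)}            # every det (6.9.7) = 0.91 > 0
b = {(0, 0): 0.5, (0, 1): -0.4, (1, 0): -0.2, (1, 1): 0.7}
c = 1.0

def y_and_grad(r, lam):
    """y_{m,r}(lam) and its gradient, carried through the recurrence."""
    yprev, y = 0.0 + 0j, 1.0 + 0j
    gprev, g = np.zeros(k, complex), np.zeros(k, complex)
    for n in range(m):
        coef = a[(n, r)] @ lam - b[(n, r)]
        y, yprev = (coef * y - c * yprev) / c, y
        g, gprev = (a[(n, r)] * yprev + coef * g - c * gprev) / c, g
    return y, g

roots = []
for s1 in np.linspace(-2, 2, 5):
    for s2 in np.linspace(-2, 2, 5):
        lam = np.array([s1 + 0.7j, s2 - 0.4j])       # deliberately complex start
        for _ in range(60):
            F = np.empty(k, complex)
            Jac = np.empty((k, k), complex)
            for r in range(k):
                F[r], Jac[r] = y_and_grad(r, lam)
            if abs(F).max() < 1e-12:
                break
            lam = lam - np.linalg.solve(Jac, F)
        if abs(F).max() < 1e-12 and not any(abs(lam - q).max() < 1e-6 for q in roots):
            roots.append(lam)

assert roots and all(abs(q.imag).max() < 1e-8 for q in roots)
print(len(roots), "distinct eigenvalues found, all real")
```

Although the Newton iterations start well off the real "plane", every converged joint root has negligible imaginary part, as the theorem predicts.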
6.10. The Multi-Parameter Oscillation Theorem

In this section we assume that the a_{nrs} satisfy the condition that the determinants (6.9.7) are positive, in addition to the c_{n,r} being positive and the b_{n,r}, a_{nrs} real. Before proving the oscillation theorem we show that the spectrum is discrete.

Theorem 6.10.1. Let λ^{(0)} = (λ_1^{(0)}, ..., λ_k^{(0)}) be an eigenvalue of the boundary problem (6.9.4). Then there is a neighborhood of λ^{(0)}, of the form

    |λ_r − λ_r^{(0)}| < ε,    r = 1, ..., k,    (6.10.1)

for some ε > 0, containing no eigenvalue other than λ^{(0)}.

For the proof of this and of the oscillation theorem to follow we use the polar coordinate method, whereby the oscillation of some real function is translated into the variation of some angular quantity; a similar approach was used in the proof of Theorem 6.7.7. Corresponding to each of the polynomials y_{n,r}(λ) defined by (6.9.1-2) we define

    φ_r(λ) = arg {c_{m-1,r}(y_{m,r}(λ) − y_{m-1,r}(λ)) + i y_{m,r}(λ)}.    (6.10.2)
The indeterminate multiple of 2π in this definition is fixed by the following construction. We define the complex numbers

    u_{n,r}(λ) = c_{n-1,r}{y_{n,r}(λ) − y_{n-1,r}(λ)} + i y_{n-1,r}(λ),    (6.10.3)

    v_{n,r}(λ) = c_{n-1,r}{y_{n,r}(λ) − y_{n-1,r}(λ)} + i y_{n,r}(λ).    (6.10.4)

Taking these in the order

    u_{0,r}(λ), v_{0,r}(λ), u_{1,r}(λ), v_{1,r}(λ), ..., u_{m,r}(λ), v_{m,r}(λ),    (6.10.5)

we join them by a series of straight lines, obtaining a polygonal arc Γ, say, whose sides are parallel to the axes alternately; we have initially
u_{0,r}(λ) = 1 and v_{0,r}(λ) = 1 + i, by (6.9.2-3). An important point is that this polygon does not pass through the origin. Suppose first that a side of the form u_{n,r}(λ), v_{n,r}(λ) passed through the origin. On reference to (6.10.3-4) we see that this implies that y_{n,r}(λ) − y_{n-1,r}(λ) = 0, since points on this side have fixed real part, which must be zero. Hence y_{n,r}(λ) = y_{n-1,r}(λ), and so u_{n,r}(λ) = v_{n,r}(λ) = 0, and y_{n,r}(λ) = y_{n-1,r}(λ) = 0, contrary to (6.9.2). Suppose again that the side joining v_{n,r}(λ), u_{n+1,r}(λ) passes through the origin. Here

    u_{n+1,r}(λ) = c_{n,r}{y_{n+1,r}(λ) − y_{n,r}(λ)} + i y_{n,r}(λ),

and on the side in question the imaginary part is constant, namely y_{n,r}(λ), and this must vanish. By (6.9.1) this implies that c_{n,r} y_{n+1,r}(λ) = −c_{n-1,r} y_{n-1,r}(λ), so that v_{n,r}(λ) = u_{n+1,r}(λ) = −c_{n-1,r} y_{n-1,r}(λ); passage through the origin would thus require y_{n-1,r}(λ) = 0 as well as y_{n,r}(λ) = 0, in contradiction to (6.9.2).

Since the polygon Γ does not pass through the origin, the function arg z is defined as a continuous function on Γ, uniquely fixed if its value is fixed at some point of Γ. We choose arg z = 0 at z = 1 = u_{0,r}(λ), and define φ_r(λ) as the value of arg z when z has described the whole of Γ and arrived at v_{m,r}(λ). Since the polygon Γ varies continuously with λ, in an obvious sense, φ_r(λ) will be a continuous function of λ, with derivative obtainable by differentiating (6.10.2), for all finite real λ.

We wish to show that the Jacobian of the φ_1(λ), ..., φ_k(λ) with respect to λ_1, ..., λ_k does not vanish. Differentiating (6.10.2) with respect to λ_s, and writing
    σ_r(λ) = {| c_{m-1,r}(y_{m,r}(λ) − y_{m-1,r}(λ)) |² + | y_{m,r}(λ) |²}^{-1},    (6.10.6)

we get an expression for ∂φ_r/∂λ_s which, by what is essentially (4.2.3), may be written

    ∂φ_r(λ)/∂λ_s = σ_r(λ) c_{m-1,r} Σ_{n=0}^{m-1} a_{nrs} y_{n,r}²(λ).    (6.10.8)

Hence the Jacobian in question is

    ∂(φ_1, ..., φ_k)/∂(λ_1, ..., λ_k) = { Π_{r=1}^{k} σ_r(λ) c_{m-1,r} } det { Σ_{n=0}^{m-1} a_{nrs} y_{n,r}²(λ) },

where the last factor on the right is a k-by-k determinant. The argument
of (6.9.6-7) shows that this cannot vanish, so that the Jacobian is not zero.

The eigenvalues determined by (6.9.4) are such that y_{m,r}(λ) in (6.10.2) vanishes, and so are characterized by

    φ_r(λ) ≡ 0 (mod π),    r = 1, ..., k.    (6.10.10)
If therefore this is true for some λ^{(0)}, by the local implicit function theorem, or the property of the total differential, (6.10.10) will not be true in a certain neighborhood of λ^{(0)} other than at λ^{(0)} itself. This proves Theorem 6.10.1.

For the oscillation theorem we complete the sequence y_{-1,r}(λ), ..., y_{m,r}(λ) to a continuous function of its first suffix, obtaining a piecewise linear function y_{x,r}(λ) which coincides with y_{n,r}(λ) when x is an integer n, and in between integers is linear. The continuous function thus defined will have a certain number of zeros or nodes in −1 < x < m; as shown in Section 4.3, these zeros, if any, are simple and well defined, in that if y_{x,r}(λ) vanishes for some x', it will change sign as x passes through x', for fixed r and λ. These numbers of zeros are uniquely associated with the eigenvalues, in the case of eigenfunctions.

Theorem 6.10.2. Let q_1, ..., q_k be any integers such that

    0 ≤ q_r ≤ m − 1,    r = 1, ..., k.    (6.10.11)
Then there is precisely one eigenvalue λ of the problem (6.9.1-4) such that y_{x,r}(λ) has precisely q_r zeros in −1 < x < m, r = 1, ..., k.

We first express the latter property in terms of the angular variables defined by (6.10.2). Suppose first that y_{x,r}(λ) vanishes for some nonintegral x', n − 1 < x' < n. Then y_{n-1,r}(λ), y_{n,r}(λ) must have opposite signs, and the side joining u_{n,r}(λ), v_{n,r}(λ) as given by (6.10.3-4) will cross the real axis. It will, moreover, cross it in the positive sense of motion around the origin; if y_{n-1,r}(λ) < 0 < y_{n,r}(λ), the crossing from u_{n,r}(λ) to v_{n,r}(λ) will be from below to above in the right half-plane, and of course from above to below in the left half-plane if y_{n-1,r}(λ) > 0 > y_{n,r}(λ). Thus as x increases through x', the complex number

    c_{n-1,r}{y_{n,r}(λ) − y_{n-1,r}(λ)} + i y_{x,r}(λ)

will cross the real axis in the sense of positive motion around the origin, and its argument will increase through a multiple of π.
Suppose again that y_{n,r}(λ) = 0 for some integral n, 0 ≤ n < m. The relevant section of the polygon Γ joins the points

    u_{n,r}(λ) = −c_{n-1,r} y_{n-1,r}(λ) + i y_{n-1,r}(λ),
    v_{n,r}(λ) = −c_{n-1,r} y_{n-1,r}(λ),
    u_{n+1,r}(λ) = c_{n,r} y_{n+1,r}(λ) = v_{n,r}(λ),
    v_{n+1,r}(λ) = c_{n,r} y_{n+1,r}(λ) + i y_{n+1,r}(λ).
That the second and third of these four points are identical if y_{n,r}(λ) = 0 follows from the recurrence relation (6.9.1). Here y_{n-1,r}(λ), y_{n+1,r}(λ) have opposite signs. If say y_{n-1,r}(λ) < 0 < y_{n+1,r}(λ), it is easily seen that a point describing this section of Γ will cross the real axis from below to above in the right half-plane, a similar discussion applying to the opposite case. Hence as z describes this part of Γ, arg z will increase through a multiple of π.

To complete this part of the argument we observe that if, as z describes Γ, arg z reaches a multiple of π, then this may occur through the points (6.10.3-4) lying on opposite sides of the real axis, corresponding to a nonintegral zero of y_{x,r}(λ), or through one of these points lying on the real axis, corresponding to an integral zero of y_{x,r}(λ). It is not possible for arg z to be a multiple of π at a point of the join of v_{n,r}(λ), u_{n+1,r}(λ), other than the end points, since this would imply y_{n,r}(λ) = 0, and also y_{n,r}(λ) − y_{n-1,r}(λ) = 0, which is impossible. Hence as z describes the polygon in the order (6.10.5), arg z will be a multiple of π only at points corresponding to zeros of y_{x,r}(λ), and will pass through these multiples of π from below to above. Hence if y_{x,r}(λ) has q_r zeros for which −1 < x < m, and in addition vanishes at x = m, λ being an eigenvalue, we must have

    φ_r(λ) = (q_r + 1)π,    r = 1, ..., k.    (6.10.12)
For as z describes Γ, arg z must increase through q_r successive multiples of π, its initial value being zero and it being initially increasing. In addition it must reach a further multiple of π at the end of Γ, and the arguments just given show that it can only reach this further multiple of π from below.

The problem now assumes the following form. The φ_r(λ), being k functions of the k real variables λ_s, are to have the values (6.10.12), and we are to show that the equations (6.10.12) have precisely one solution for λ_1, ..., λ_k. We consider the transformation

    (λ_1, ..., λ_k) → (φ_1(λ), ..., φ_k(λ)),    (6.10.13)

or in abbreviated form,

    λ → φ(λ),    (6.10.14)
as a transformation φ of euclidean k-space, E_k say, into some part of itself. Our first task is to identify the range φE_k, the set of all possible sets of values of (φ_1, ..., φ_k) for all possible finite real λ. We observe first that φE_k is certainly a bounded set. As z describes any side of the polygon Γ, arg z cannot vary by more than π. We therefore have the crude bounds

    |φ_r(λ)| ≤ (2m + 1)π,    r = 1, ..., k.    (6.10.15)
Next we consider the boundary of φE_k. Since the mapping (6.10.13-14) has nonvanishing Jacobian, as was proved in connection with the previous theorem, φE_k contains, with any point φ†, the map of some λ†, also a neighborhood of φ†, the map of some neighborhood of λ†, by the implicit function theorem. Thus φE_k is an open set, and points of its boundary will be the limits of sequences of the form φ(λ^{(j)}), j = 1, 2, ..., where the sequence λ^{(j)} has no finite limit-point. We show that this implies that the boundary of φE_k is located in certain planes.

Assume then that λ^{(j)} = (λ_1^{(j)}, ..., λ_k^{(j)}), j = 1, 2, ..., is an infinite sequence such that

    ψ_j = |λ^{(j)}| → ∞,    (6.10.16)

and write

    μ^{(j)} = λ^{(j)} / ψ_j,    (6.10.17)

the unit vector μ^{(j)} prescribing the direction from the origin to λ^{(j)}. Since μ^{(j)} is bounded, we may by selection of a subsequence arrange that the sequence μ^{(j)} converges to a limit μ, say, where μ = (μ_1, ..., μ_k) is also a unit vector. Then

    λ^{(j)} = ψ_j μ + o(ψ_j),    (6.10.18)

the last term being a k-vector whose entries are small compared to ψ_j as j → ∞. Substituting for λ in (6.9.1-2) we obtain, selecting the leading terms,
    c_{n,r} y_{n+1,r}(λ^{(j)}) ≈ ψ_j (Σ_{s=1}^{k} a_{nrs} μ_s) y_{n,r}(λ^{(j)}).    (6.10.19)

With the notation

    ν_{n,r} = Σ_{s=1}^{k} a_{nrs} μ_s,    (6.10.20)

this simplifies to

    y_{n,r}(λ^{(j)}) ≈ ψ_j^n ν_{0,r} ··· ν_{n-1,r} (c_{0,r} ··· c_{n-1,r})^{-1}.    (6.10.21)

The significance of the latter as an asymptotic formula for y_{n,r}(λ^{(j)}) for large j will depend on the ν_{n,r} not vanishing, and on their sign. Regarding this we prove that there is at any rate one r, 1 ≤ r ≤ k, for which the sequence ν_{0,r}, ..., ν_{m-1,r} contains terms only of one sign, and no zeros. Assume if possible that this were not the case, that for each such sequence there was either one zero member or two members of opposite signs. Then for every r we can find a set of scalars p_{n,r}, none negative and at least one positive, such that

    Σ_{n=0}^{m-1} p_{n,r} ν_{n,r} = 0,    r = 1, ..., k,

and so

    Σ_{s=1}^{k} { Σ_{n=0}^{m-1} p_{n,r} a_{nrs} } μ_s = 0,    r = 1, ..., k,

on substitution from (6.10.20). Eliminating μ_1, ..., μ_k, we have

    det { Σ_{n=0}^{m-1} p_{n,r} a_{nrs} } = 0,
the left being a k-by-k determinant. Expanding the determinant as in the case of (6.9.6), we see that this is impossible. Hence there is an r for which the ν_{p,r} all have the same sign.

Suppose that for the r in question all the ν_{p,r} are positive. Then (6.10.21) shows that for some positive constants σ_{n,r} we have

    y_{n,r}(λ^{(j)}) = σ_{n,r} ψ_j^n + o(ψ_j^n),    (6.10.22)

for n = 1, ..., m − 1. The quantities u_{n,r}(λ^{(j)}), v_{n,r}(λ^{(j)}) defined by (6.10.3-4) all lie in the first quadrant, for all j beyond some point, and in particular we have
    v_{m,r}(λ^{(j)}) = {c_{m-1,r} + i + o(1)} σ_{m,r} ψ_j^m,

and so, by (6.10.2), (6.10.4),

    φ_r(λ^{(j)}) = cot^{-1} c_{m-1,r} + o(1),    (6.10.23)

the inverse cotangent having its value between 0 and ½π. If again for the r in question the ν_{p,r} are all negative, then the σ_{n,r} in (6.10.22) will be of alternating sign, starting with σ_{0,r} positive, by (6.9.2). From (6.10.3-4) we have

    u_{n,r}(λ^{(j)}) = c_{n-1,r} σ_{n,r} ψ_j^n + o(ψ_j^n),
    v_{n,r}(λ^{(j)}) = (c_{n-1,r} + i) σ_{n,r} ψ_j^n + o(ψ_j^n),
showing that the sequence u_{0,r}(λ^{(j)}), ..., u_{m,r}(λ^{(j)}) lies, for large j, approximately on the positive and negative real semi-axes alternately, starting with the former, the sequence v_{0,r}(λ^{(j)}), ..., v_{m,r}(λ^{(j)}) lying alternately in the first and third quadrants, again starting with the former. Hence in this case we replace (6.10.23) by

    φ_r(λ^{(j)}) = mπ + cot^{-1} c_{m-1,r} + o(1).    (6.10.24)

We conclude that the boundary of φE_k is contained in the sets

    φ_r = cot^{-1} c_{m-1,r},    (6.10.25)

    φ_r = cot^{-1} c_{m-1,r} + mπ,    (6.10.26)

which, putting r = 1, ..., k, form a set of 2k hyperplanes. Hence the range φE_k of the transformation (6.10.13) is a bounded open set whose boundary is entirely contained in the hyperplanes (6.10.25-26). The only bounded open set with such a boundary is the region

    cot^{-1} c_{m-1,r} < φ_r < cot^{-1} c_{m-1,r} + mπ,    r = 1, ..., k,    (6.10.27)

and this is therefore the range of the transformation φ. Hence the equations (6.10.12) are soluble, and so there is at any rate one eigenvalue λ such that y_{x,r}(λ) has exactly q_r zeros in −1 < x < m, which is part of the assertion of Theorem 6.10.2. It remains to verify that there is exactly one such λ, or, in other words, that φ effects a (1, 1) mapping of the euclidean k-space E_k onto (6.10.27). This follows from the fact that the Jacobian of the mapping is not zero, proved above, together with the obvious fact that the range (6.10.27) of the transformation is simply-connected, in the sense that its fundamental or first homotopy group reduces to the identity. This completes the proof.
6.11. Multi-Dimensional Orthogonality

Reverting to the notation λ^{(u)} for the eigenvalues, the orthogonality (6.9.10) now forms a full set of orthogonality relationships between m^k vectors in space of m^k dimensions. This incidentally provides a fresh proof that there are no more than m^k eigenvalues (see Theorem 6.10.1, or the last paragraph of the preceding section). What is more important, we can pass to the dual orthogonality, namely

    Σ_u ρ_u^{-1} y_p(λ^{(u)}) y_q(λ^{(u)}) = δ_{pq} Δ_p^{-1},

where p, q are multi-indices. Thus our polynomials are orthogonal in the usual sense, but in λ-space of k dimensions. A multidimensional spectral function may now be defined, and the limiting process m → ∞ considered.
CHAPTER 7

Polynomials Orthogonal on the Unit Circle
7.1. The Recurrence Relation

Returning to polynomials with scalar coefficients and in one independent variable, we take those defined by the recurrence formulas

    u_{n+1} = a_n λ u_n + b_n v_n,    (7.1.1)

    v_{n+1} = b̄_n λ u_n + a_n v_n,    n = 0, 1, 2, ...,    (7.1.2)
and the initial conditions

    u_0 = u_0(λ) = 1,    v_0 = v_0(λ) = 1.    (7.1.3)

Subject to the restrictions

    a_n = ā_n,    a_n² − |b_n|² > 0,    (7.1.4-5)

we may show that a very similar theory obtains for the polynomials u_n(λ) so defined to that developed in Chapters 4-5 for polynomials satisfying (4.1.1); the principal difference is that the role of the real axis in Chapters 4-5 is now taken over by the unit circle. The invariance property associated with this system is that if λ is on the unit circle, that is, λλ̄ = 1, then

    ū_{n+1} u_{n+1} − v̄_{n+1} v_{n+1} = (a_n² − |b_n|²)(ū_n u_n − v̄_n v_n),    (7.1.6)

which is readily verified as a consequence of (7.1.1-2); more general results will be used later. It is possible to arrange by a change of variable that a_n² − |b_n|² = 1, so that (7.1.6) becomes a strict invariance, though this will not be necessary. Together with the case of Chapters 4-5, this exhausts the possibilities of the two-dimensional matrix recurrence relation (3.1.1) insofar as orthogonal polynomials with scalar coefficients
are concerned; there will, however, be further analogous systems involving rational functions.

The polynomials u_n = u_n(λ), v_n = v_n(λ) are very simply related. We may derive one from the other by reversing the order of the coefficients and replacing the coefficients by their complex conjugates; in other words,

    u_n(λ) = λⁿ \overline{v_n(λ)},    v_n(λ) = λⁿ \overline{u_n(λ)},    for |λ| = 1.    (7.1.7)

For the polynomials on the right satisfy the same initial conditions (7.1.3) for n = 0, and in fact the same recurrence relations (7.1.1-2). The first statement is obvious. To verify the second we take |λ| = 1 in (7.1.1-2) and take complex conjugates, obtaining

    ū_{n+1} = a_n λ^{-1} ū_n + b̄_n v̄_n,    v̄_{n+1} = b_n λ^{-1} ū_n + a_n v̄_n,

and so

    λ^{n+1} v̄_{n+1} = a_n λ (λⁿ v̄_n) + b_n (λⁿ ū_n),    λ^{n+1} ū_{n+1} = b̄_n λ (λⁿ v̄_n) + a_n (λⁿ ū_n).

Hence (7.1.1-2) are satisfied by λⁿv̄_n, λⁿū_n in place of u_n, v_n respectively, when |λ| = 1. Thus (7.1.7) holds when |λ| = 1, and so generally, if λ̄ is replaced by 1/λ.

This leads to an alternative formulation of the recurrence relation and initial conditions. Let ū_n(λ), v̄_n(λ) be the polynomials obtained from u_n(λ), v_n(λ) respectively by replacing the coefficients in these polynomials by their complex conjugates. We may then write (7.1.7) for general λ ≠ 0 as

    u_n(λ) = λⁿ v̄_n(1/λ),    v_n(λ) = λⁿ ū_n(1/λ).    (7.1.8)

The recurrence relations (7.1.1-2) and initial conditions (7.1.3) may thus be replaced, so far as defining u_n(λ) is concerned, by

    u_{n+1}(λ) = a_n λ u_n(λ) + b_n λⁿ ū_n(1/λ),    n = 0, 1, ...,    (7.1.9)

together with

    u_0(λ) = 1.    (7.1.10)

It is therefore possible to conduct the investigation entirely in terms of the polynomials u_n(λ); we shall, however, prefer the more symmetrical version in terms of u_n(λ), v_n(λ), even though only the former appear in the orthogonality relations.

Just as in the case of ordinary orthogonal polynomials, an important part is played by a second independent solution of the recurrence relations. We shall denote by

    u_{n1}(λ),    v_{n1}(λ),    n = 0, 1, ...,    (7.1.11)

a sequence of polynomials which are solutions of (7.1.1-2), with u_{n1}, v_{n1} for u_n, v_n, but with the different initial conditions

    u_{01} = u_{01}(λ) = 1,    v_{01} = v_{01}(λ) = −1.    (7.1.12)
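The recurrence (7.1.1-3) and the reversal relation (7.1.7-8) are easy to verify numerically; the coefficient values below are arbitrary choices satisfying (7.1.4-5), and the final check anticipates Theorem 7.2.1 of the next section.

```python
import numpy as np

# Polynomials u_n, v_n of (7.1.1-3) built as coefficient arrays (index = power
# of lam), with a_n real and a_n^2 - |b_n|^2 > 0.  We check that the
# coefficients of v_n are the conjugated, reversed coefficients of u_n
# (relation (7.1.8)), and that all zeros of u_m lie inside the unit circle.

rng = np.random.default_rng(7)
m = 8
a = np.ones(m)                                        # a_n = 1
b = 0.9 * (rng.uniform(-1, 1, m) + 1j * rng.uniform(-1, 1, m)) / np.sqrt(2)
assert np.all(a**2 - np.abs(b)**2 > 0)                # restriction (7.1.4-5)

u = np.array([1.0 + 0j])                              # u_0 = 1
v = np.array([1.0 + 0j])                              # v_0 = 1
for n in range(m):
    shifted_u = np.concatenate(([0], u))              # multiplication by lam
    u_next = a[n] * shifted_u + np.concatenate((b[n] * v, [0]))
    v_next = np.conj(b[n]) * shifted_u + np.concatenate((a[n] * v, [0]))
    u, v = u_next, v_next

# (7.1.8): coefficient j of v_m equals conj(coefficient m-j of u_m)
assert np.allclose(v, np.conj(u[::-1]))

zeros = np.roots(u[::-1])                             # np.roots wants high power first
assert np.all(np.abs(zeros) < 1)                      # Theorem 7.2.1
print("reversal relation holds; max |zero of u_m| =", np.abs(zeros).max())
```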
7.2. The Boundary Problem

For fixed α, 0 ≤ α < 2π, we ask for λ-values for which (7.1.1-2) have a nontrivial solution such that u_0 = v_0, u_m = exp(iα) v_m. In terms of the polynomials defined by (7.1.1-2) together with (7.1.3), this is equivalent to demanding that

    u_m(λ) − exp(iα) v_m(λ) = 0.    (7.2.1)

It follows from (7.1.1-5) that u_m(λ) is of degree exactly m, and v_m(λ) of degree at most m. Hence the eigenvalues, the roots of (7.2.1), will be exactly m in number, subject to our verifying that they are all distinct. We proceed to show in addition that they all lie on the unit circle.

We first locate the zeros of the actual polynomials u_n(λ), v_n(λ). Unlike the case of ordinary orthogonal polynomials, these do not figure as eigenvalues. We assume throughout that (7.1.4-5) hold. Then we have

Theorem 7.2.1. The polynomials u_n(λ), v_n(λ) have their zeros respectively inside and outside the unit circle.

Here "inside" and "outside" have their strict senses. We start by showing that u_n(λ), v_n(λ) certainly do not vanish simultaneously. This being correct for n = 0, suppose if possible that u_{n+1}(λ) = v_{n+1}(λ) = 0 for some n ≥ 0. It then follows from (7.1.1-2) that u_n(λ) = v_n(λ) = 0, since the determinant of coefficients on the right, that is, λ(a_n² − |b_n|²), does not vanish, by (7.1.5), unless λ = 0; the conclusion is easily seen to be valid in the event that λ = 0 also. Since u_0(λ), v_0(λ) are not zero, we are led to a contradiction. We may thus define the rational function

    w_n(λ) = u_n(λ)/v_n(λ),    (7.2.2)

which will be in its lowest terms. Dividing (7.1.1-2) by a_n v_n, and writing

    c_n = b_n/a_n,    (7.2.3)
the recurrence relation for w_n(λ) is seen to be

    w_{n+1}(λ) = {λ w_n(λ) + c_n} / {c̄_n λ w_n(λ) + 1}.
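The circuit argument for w_m and the resulting eigenvalue picture of this section can be checked numerically; the coefficient values below are illustrative choices satisfying (7.1.4-5).

```python
import numpy as np

# With the recurrence (7.1.1-3), w_m(lam) = u_m(lam)/v_m(lam) should wind m
# times positively around the unit circle as lam does one positive circuit,
# and the roots of the boundary condition (7.2.1),
# u_m(lam) = exp(i*alpha) v_m(lam), should be m distinct points on the circle.

rng = np.random.default_rng(8)
m = 6
a = np.ones(m)
b = 0.8 * (rng.uniform(-1, 1, m) + 1j * rng.uniform(-1, 1, m)) / np.sqrt(2)

u = np.array([1.0 + 0j])
v = np.array([1.0 + 0j])
for n in range(m):
    lam_u = np.concatenate(([0], u))                  # multiplication by lam
    u, v = (a[n] * lam_u + np.concatenate((b[n] * v, [0])),
            np.conj(b[n]) * lam_u + np.concatenate((a[n] * v, [0])))

ev = lambda p, z: sum(cj * z**j for j, cj in enumerate(p))

# winding number of w_m over one circuit of the unit circle
ts = np.linspace(0, 2 * np.pi, 4001)
w = np.array([ev(u, np.exp(1j * t)) / ev(v, np.exp(1j * t)) for t in ts])
winding = np.angle(w[1:] / w[:-1]).sum() / (2 * np.pi)
assert abs(winding - m) < 1e-6

# eigenvalues: roots of u_m - exp(i*alpha) v_m lie on the circle, all distinct
alpha = 0.7
p = u - np.exp(1j * alpha) * v
eigs = np.roots(p[::-1])
assert np.allclose(np.abs(eigs), 1, atol=1e-8)
assert np.diff(np.sort(np.angle(eigs))).min() > 1e-6  # distinct
print("winding number", round(float(winding)), "; eigenvalues on |lam| = 1")
```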
We assert that as X describes the unit circle once and positively, w,(h) describes the unit circle n times in the positive sense. T h e assertion being trivial when n = 0, suppose it true for some non-negative n and proceed inductively. We recall that the mapping of the complex plane given by z’ = ( z c,)/(zE, l), I c, I < 1, has the property that as z makes a positive circuit of the unit circle, so also does z’. Hence if w,(h) makes n positive circuits of the unit circle as h makes one such circuit, so that Xw,(X) makes n 1 circuits, it follows that ~ , + ~ ( hwill ) also make n 1 circuits. Hence the assertion is true generally. I n just the same way we may prove inductively that w,(h) lies inside or outside the unit circle with A. We need again the property that I c, I < 1, which follows from (7.1.5). Thus the zeros and poles of wJX) lie inside and outside the unit circle, which was to be proved. I n the course of the proof we have derived that of
+
+
+
+
Theorem 7.2.2. The eigenvalues, or roots of (7.2.1), are m in number and lie on the unit circle, being all distinct.

For as λ describes the unit circle once in the positive sense, w_m(λ) makes m positive circuits of the unit circle, and thereby assumes m times the value exp(iα).

7.3. Orthogonality
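These location results lend themselves to a quick numerical experiment. The sketch below is not authoritative: it assumes the recurrence has the form u_{n+1} = a_n λ u_n + b_n v_n, v_{n+1} = b̄_n λ u_n + a_n v_n with u_0 = v_0 = 1, a_n > 0 and |b_n| < a_n, which is how (7.1.1-5) have been read here since those formulas lie outside this excerpt. It builds u_m, v_m for random admissible coefficients and checks Theorems 7.2.1-2:

```python
import numpy as np

# Random coefficients subject to the assumed conditions a_n > 0, |b_n| < a_n.
rng = np.random.default_rng(1)
m = 6
a = rng.uniform(0.5, 1.5, m)
b = 0.8 * a * rng.uniform(0.1, 1.0, m) * np.exp(2j * np.pi * rng.uniform(size=m))

# u, v as ascending coefficient arrays; u_0 = v_0 = 1 (assumed initial conditions).
u = np.array([1.0 + 0j])
v = np.array([1.0 + 0j])
for n in range(m):
    lam_u = np.concatenate(([0.0], u))                    # coefficients of lambda*u_n
    v_pad = np.concatenate((v, np.zeros(len(lam_u) - len(v))))
    u, v = a[n] * lam_u + b[n] * v_pad, np.conj(b[n]) * lam_u + a[n] * v_pad

zeros_u = np.roots(u[::-1])                               # zeros of u_m: inside |lambda| = 1
zeros_v = np.roots(v[::-1])                               # zeros of v_m: outside |lambda| = 1
alpha = 0.7                                               # boundary parameter
eigs = np.roots((u - np.exp(1j * alpha) * v)[::-1])       # roots of (7.2.1): on |lambda| = 1
```

For any admissible random draw the zeros of u_m land strictly inside the circle, those of v_m strictly outside, and the m eigenvalues sit on the circle and are distinct, as the theorems assert.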
The basis is, as usual, a result of the character of Green's, or Lagrange's, or the Christoffel-Darboux identity. For the present case this is

Theorem 7.3.1. Writing

k_0 = 1,  k_{n+1} = k_n (a_n² − |b_n|²)^{−1},  (7.3.1)

we have

k_m {v̄_m(μ) v_m(λ) − ū_m(μ) u_m(λ)} = (1 − λ μ̄) Σ_{n=0}^{m−1} k_n ū_n(μ) u_n(λ).  (7.3.2)

Here the bar denotes complex conjugation, so that ū_n(μ) is the complex conjugate of u_n(μ). This corresponds to (4.2.1), and, like it, follows directly from the recurrence relation. Writing this in full for λ, and in complex conjugate form for μ, we have, for n ≥ 0,

k_{n+1} {v̄_{n+1}(μ) v_{n+1}(λ) − ū_{n+1}(μ) u_{n+1}(λ)} − k_n {v̄_n(μ) v_n(λ) − ū_n(μ) u_n(λ)} = k_n (1 − λ μ̄) ū_n(μ) u_n(λ).

Summing over n = 0, ..., m − 1, and noting that

k_0 {ū_0(μ) u_0(λ) − v̄_0(μ) v_0(λ)} = 0,

we get (7.3.2). The case μ = λ of (7.3.2) gives

k_m {|v_m(λ)|² − |u_m(λ)|²} = (1 − |λ|²) Σ_{n=0}^{m−1} k_n |u_n(λ)|².  (7.3.3)
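The identity can be tested numerically by running the recurrences at two points λ, μ. The sketch below assumes the recurrence form u_{n+1} = a_n λ u_n + b_n v_n, v_{n+1} = b̄_n λ u_n + a_n v_n with u_0 = v_0 = 1 (a reading of (7.1.1-3), which are not visible in this excerpt), and the weights k_n as in (7.3.1):

```python
import numpy as np

rng = np.random.default_rng(2)
m = 8
a = rng.uniform(0.5, 1.5, m)
b = 0.9 * a * np.sqrt(rng.uniform(size=m)) * np.exp(2j * np.pi * rng.uniform(size=m))

def run(lam):
    """Return the sequences u_0..u_m and v_0..v_m at the point lam."""
    us, vs = [1.0 + 0j], [1.0 + 0j]
    for n in range(m):
        un, vn = us[-1], vs[-1]
        us.append(a[n] * lam * un + b[n] * vn)
        vs.append(np.conj(b[n]) * lam * un + a[n] * vn)
    return np.array(us), np.array(vs)

k = np.ones(m + 1)
for n in range(m):
    k[n + 1] = k[n] / (a[n] ** 2 - abs(b[n]) ** 2)        # (7.3.1)

lam, mu = 0.3 - 0.8j, 1.1 + 0.4j
ul, vl = run(lam)
um_, vm_ = run(mu)
lhs = k[m] * (np.conj(vm_[m]) * vl[m] - np.conj(um_[m]) * ul[m])
rhs = (1 - lam * np.conj(mu)) * np.sum(k[:m] * np.conj(um_[:m]) * ul[:m])
```

The two sides agree to rounding error, which is exactly the telescoping used in the proof above.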
This gives a fresh proof that the eigenvalues lie on the unit circle. For if λ satisfies (7.2.1), the left of (7.3.3) vanishes, while the sum on the right contains a non-zero term for n = 0. Hence |λ|² − 1 = 0, as asserted. Again, we see from (7.3.3) that w_m(λ) = u_m(λ)/v_m(λ) lies inside or outside the unit circle with λ, as proved in Section 7.2. We supplement (7.3.2) with its limiting form for the case that μ = λ, λ lying on the unit circle; this will be the analog of (4.2.3). For fixed μ on the unit circle, we make λ → μ in (7.3.2). By l'Hôpital's rule, and since μ̄ = 1/μ, we have

Σ_{n=0}^{m−1} k_n |u_n(λ)|² = k_m λ {ū_m(λ) u_m'(λ) − v̄_m(λ) v_m'(λ)},  |λ| = 1.  (7.3.4)
Let now λ_1, ..., λ_m be the eigenvalues, the solutions of (7.2.1), so that for λ = λ_r there will be a solution of (7.1.1-2) such that u_0 = v_0, u_m = exp(iα) v_m, and not vanishing identically. These solutions, the eigenfunctions, turn out to be orthogonal, in a sense involving only the u_n. As in Section 4.4, there are two orthogonalities, one following directly from the Green's-type identity, and the other or dual orthogonality following from the first orthogonality.

Theorem 7.3.2. The eigenfunctions of the problem (7.1.1-5), (7.2.1) are orthogonal according to

Σ_{n=0}^{m−1} k_n ū_n(λ_s) u_n(λ_r) = δ_rs ρ_r,  (7.3.5)

where

ρ_r = k_m λ_r {ū_m(λ_r) u_m'(λ_r) − v̄_m(λ_r) v_m'(λ_r)}.  (7.3.6)
To obtain (7.3.5) with r ≠ s we take, in (7.3.2), λ = λ_r and μ = λ_s, the left of (7.3.2) vanishing in view of (7.2.1). In the case r = s, (7.3.5-6) become the same as (7.3.4). The dual, and more important, orthogonality is

Theorem 7.3.3. The eigenfunctions of (7.1.1-5), (7.2.1) are orthogonal according to

Σ_{r=1}^{m} u_p(λ_r) ū_q(λ_r) ρ_r^{−1} = δ_pq k_p^{−1},  0 ≤ p, q ≤ m − 1.  (7.3.7)
This follows at once from (7.3.5-6) (see Appendix III). With a view to the limiting process m → ∞, we again write this in Stieltjes integral form, defining a spectral function, in effect a measure or weight distribution on the unit circle, but here specified as a nondecreasing function on the interval (0, 2π), in the first place a step function. We define

τ_{m,α}(θ) = Σ_{0 < arg λ_r ≤ θ} ρ_r^{−1},  (7.3.8)

where arg λ_r is taken subject to 0 < arg λ_r ≤ 2π; the function so defined will be right-continuous, any jump at the ends of the interval being located at 2π. The orthogonality (7.3.7) may be written

∫_0^{2π} ū_q(λ) u_p(λ) dτ_{m,α}(θ) = δ_pq k_p^{−1},  0 ≤ p, q ≤ m − 1,  (7.3.9)

where we have written

λ = exp(iθ).  (7.3.10)
This, like (7.3.7), expresses the fact that the polynomials u_0(λ), ..., u_{m−1}(λ) are orthogonal on the unit circle; we have reached the stage corresponding to (4.5.3), the orthogonality of a finite set of polynomials defined by the recurrence relation (4.1.1) and initial conditions (4.1.5). At this stage we may conveniently prove the analog of Theorem 5.2.1, asserting the orthogonality of an infinite set of polynomials satisfying the recurrence relation and given initial conditions. This is

Theorem 7.3.4. Let (7.1.1-5) hold for all n = 0, 1, 2, ... . Then there is a function τ(θ), which in the interval (0, 2π) is bounded, nondecreasing and right-continuous, such that
∫_0^{2π} ū_q(λ) u_p(λ) dτ(θ) = δ_pq k_p^{−1},  p, q = 0, 1, 2, ... .  (7.3.11)
Here we use the notations (7.3.1), (7.3.10); the function τ(θ) may have a saltus at 2π, but not at 0. For the proof we make m → ∞ in (7.3.9). By the case p = q = 0 of (7.3.9) we have

τ_{m,α}(2π) = k_0^{−1},  (7.3.12)

and since τ_{m,α}(θ) = 0 when θ = 0, by (7.3.8), we see that the τ_{m,α}(θ) form a family of uniformly bounded nondecreasing and right-continuous step functions. Keeping α fixed, we may select an infinite m-sequence such that τ_{m,α}(θ) converges to a limit σ(θ), say, which is bounded, nondecreasing, and such that

∫_0^{2π} ū_q(λ) u_p(λ) dτ_{m,α}(θ) → ∫_0^{2π} ū_q(λ) u_p(λ) dσ(θ)  (7.3.13)

as m → ∞. Since the left of (7.3.13) is fixed for m > max(p, q), by (7.3.9), it follows that

∫_0^{2π} ū_q(λ) u_p(λ) dσ(θ) = δ_pq k_p^{−1}.  (7.3.14)
To obtain (7.3.11), we replace σ(θ) by an equivalent weight distribution on the unit circle satisfying certain additional conditions. We may first replace it by a function σ†(θ) which is right-continuous, setting σ†(θ) = σ(θ + 0) for 0 ≤ θ < 2π, a saltus of σ(θ) at θ = 0 being transferred to θ = 2π, so that σ†(2π) = σ(2π) + σ(+0) − σ(0). Finally, we may for definiteness ensure that τ(0) = 0, by defining τ(θ) = σ†(θ) − σ†(0). These changes do not affect the integral in (7.3.14), and we deduce (7.3.11). The result just proved may be written

∫_0^{2π} ū_q(λ) u_p(λ) dτ_{m,α}(θ) = ∫_0^{2π} ū_q(λ) u_p(λ) dτ(θ),  0 ≤ p, q < m,  (7.3.15)
which is equivalent to the following "mechanical quadrature."

Theorem 7.3.5. Let χ(θ) be a trigonometric polynomial of degree n, where n < m. Then

∫_0^{2π} χ(θ) dτ_{m,α}(θ) = ∫_0^{2π} χ(θ) dτ(θ).  (7.3.16)

The above description of χ(θ) is to mean that it is a linear combination of 1, cos θ, ..., cos nθ, sin θ, ..., sin nθ, that is to say, of exp(irθ), −n ≤ r ≤ n. The property to be established is equivalent to

∫_0^{2π} exp(irθ) dτ_{m,α}(θ) = ∫_0^{2π} exp(irθ) dτ(θ),  −n ≤ r ≤ n,  (7.3.17)
and this is easily deduced from special cases of (7.3.15), namely,
∫_0^{2π} ū_q(λ) dτ_{m,α}(θ) = ∫_0^{2π} ū_q(λ) dτ(θ),  0 ≤ q ≤ n < m,  (7.3.18)

and

∫_0^{2π} u_p(λ) dτ_{m,α}(θ) = ∫_0^{2π} u_p(λ) dτ(θ),  0 ≤ p ≤ n < m.  (7.3.19)
We get (7.3.18) from (7.3.15) on taking p = 0, and the complex conjugate versions (7.3.19) on taking q = 0. If in (7.3.19) we put p = 0, we get at once (7.3.17) with r = 0. Further cases of (7.3.17) may be derived inductively. If we write

u_q(λ) = Σ_{r=0}^{q} a_{qr} λ^r,  (7.3.20)

where necessarily a_{qq} ≠ 0, we have from (7.3.18) that

Σ_{r=0}^{q} ā_{qr} ∫_0^{2π} exp(−irθ) dτ_{m,α}(θ) = Σ_{r=0}^{q} ā_{qr} ∫_0^{2π} exp(−irθ) dτ(θ).
If (7.3.17) holds for r = 0, ..., q − 1 it follows that it also holds for r = q. Hence it holds for r = 0, ..., n. The remaining cases of (7.3.17) follow on taking complex conjugates.
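Both orthogonalities of this section can be verified numerically. The sketch below uses the same assumed recurrence and initial conditions as before (an interpretation of (7.1.1-3), which are outside this excerpt), computes the eigenvalues for one value of α, evaluates ρ_r from the derivative formula (7.3.6), and checks (7.3.5) and (7.3.7):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 5
a = rng.uniform(0.8, 1.2, m)
b = 0.7 * a * rng.uniform(0.2, 1.0, m) * np.exp(2j * np.pi * rng.uniform(size=m))

U, V = [np.array([1.0 + 0j])], [np.array([1.0 + 0j])]    # ascending coefficient arrays
for n in range(m):
    u, v = U[-1], V[-1]
    lu = np.concatenate(([0.0], u))
    vp = np.concatenate((v, np.zeros(len(lu) - len(v))))
    U.append(a[n] * lu + b[n] * vp)
    V.append(np.conj(b[n]) * lu + a[n] * vp)

k = np.ones(m + 1)
for n in range(m):
    k[n + 1] = k[n] / (a[n] ** 2 - abs(b[n]) ** 2)       # (7.3.1)

def ev(c, x):                        # polynomial value, ascending coefficients
    return np.polyval(c[::-1], x)

def dev(c, x):                       # derivative value
    return np.polyval((c[1:] * np.arange(1, len(c)))[::-1], x)

alpha = 0.3
lam = np.roots((U[m] - np.exp(1j * alpha) * V[m])[::-1]) # the eigenvalues of (7.2.1)
rho = k[m] * lam * (np.conj(ev(U[m], lam)) * dev(U[m], lam)
                    - np.conj(ev(V[m], lam)) * dev(V[m], lam))   # (7.3.6)
P = np.array([ev(U[n], lam) for n in range(m)])          # P[n, r] = u_n(lambda_r)
first = np.einsum('n,nr,ns->rs', k[:m], P, P.conj())     # should be diag(rho), by (7.3.5)
dual = np.einsum('pr,qr,r->pq', P, P.conj(), 1.0 / rho)  # should be diag(1/k_p), by (7.3.7)
```

In matrix language, (7.3.5) says that the rows of the m × m matrix √k_n u_n(λ_r)/√ρ_r are orthonormal; (7.3.7) is the statement that its columns are then orthonormal too, which is the duality invoked above.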
7.4. The Recurrence Formulas Deduced from the Orthogonality

Having in Sections 7.1-3 shown that polynomials defined by (7.1.1-5) are necessarily orthogonal according to (7.3.11), we show that these properties are actually equivalent, in that the orthogonality implies recurrence formulas of the form in question. We confine ourselves to the infinite case, omitting analogs of Theorems 1.7.1 and 4.6.1. We have
Theorem 7.4.1. Let the real-valued function τ(θ), 0 ≤ θ ≤ 2π, be bounded, nondecreasing and right-continuous; let it also be such that τ(2π) > τ(0), and not be a step function with only a finite number of points of increase. Let k_0, k_1, ..., be positive. Then there is a set of polynomials u_n(λ) which are orthogonal according to (7.3.11), and which satisfy, together with further polynomials v_n(λ), relations of the form (7.1.1-2), (7.1.4-5).

We start by constructing polynomials of the form

u_n(λ) = λ^n + Σ_{r=1}^{n} a_{nr} λ^{n−r},  (7.4.1)

which satisfy (7.3.11) except possibly when p = q. We demand that

∫_0^{2π} u_n(λ) ū_p(λ) dτ(θ) = 0,  p = 0, ..., n − 1,  (7.4.2)

where as before λ = exp(iθ). For this it will be sufficient that

∫_0^{2π} u_n(λ) λ^{−p} dτ(θ) = 0,  p = 0, ..., n − 1,  (7.4.3)

and indeed (7.4.2-3) are equivalent, since an integral of either type
can be expressed as a linear combination of integrals of the other type. With the notation

μ_r = ∫_0^{2π} exp(irθ) dτ(θ)

for the moments of τ(θ), substitution of (7.4.1) in (7.4.3) gives

μ_{n−p} + Σ_{r=1}^{n} a_{nr} μ_{n−r−p} = 0,  p = 0, ..., n − 1,  (7.4.4)

which constitute n equations to determine the a_{n1}, ..., a_{nn}; we discuss them in a similar manner to (4.6.4). Either there is a unique solution, or else there is a set a_0', ..., a_{n−1}', not all zero, such that

Σ_{r=0}^{n−1} a_r' μ_{r−p} = 0,  p = 0, ..., n − 1.

This means that

∫_0^{2π} λ^{−p} Σ_{r=0}^{n−1} a_r' λ^r dτ(θ) = 0,  p = 0, ..., n − 1,

and so, multiplying by ā_p' and summing,

∫_0^{2π} | Σ_{r=0}^{n−1} a_r' λ^r |² dτ(θ) = 0.

This is impossible, since Σ_{r=0}^{n−1} a_r' λ^r has only a finite number of zeros, and τ(θ) is nondecreasing, not a constant, and not a step function with only a finite number of points of increase. Hence the u_n(λ), of the form (7.4.1), are uniquely fixed by (7.4.2) or, equivalently, (7.4.3). Next we define the v_n(λ), by the second of (7.1.7) if |λ| = 1, and more generally by the second of (7.1.8) if λ ≠ 0, so that v_n(λ) is a polynomial of degree at most n. Since u_n(λ) has, in (7.4.1), unit highest coefficient, v_n(λ) will have unit constant term. It will also have certain orthogonality properties. Substituting for u_n(λ) in the complex conjugate of (7.4.3) we derive

∫_0^{2π} v_n(λ) λ^{−p} dτ(θ) = 0,  p = 1, ..., n.  (7.4.5)
To obtain the recurrence relations, we consider the polynomial v_{n+1}(λ) − v_n(λ). As already noted, v_{n+1}(0) = v_n(0) = 1, and so this polynomial contains λ as a factor. We show that v(λ) = λ^{−1} {v_{n+1}(λ) − v_n(λ)} is a multiple of u_n(λ). In the first place v(λ) is a polynomial of degree at most n. Secondly, it satisfies the same orthogonality relations (7.4.3) as u_n(λ). For

∫_0^{2π} v(λ) λ^{−p} dτ(θ) = ∫_0^{2π} v_{n+1}(λ) λ^{−p−1} dτ(θ) − ∫_0^{2π} v_n(λ) λ^{−p−1} dτ(θ),

and both integrals on the right vanish for p = 0, ..., n − 1, by (7.4.5). Hence if we determine β so that v(λ) − β u_n(λ) is of degree at most n − 1, we shall have

∫_0^{2π} {v(λ) − β u_n(λ)} λ^{−p} dτ(θ) = 0,  p = 0, ..., n − 1,

and so, as in the discussion of (7.4.4), v(λ) − β u_n(λ) = 0, as asserted. Modifying the notation in the direction of (7.1.1-2), we write b_n for β, so that we have just shown that

v_{n+1}(λ) = b_n λ u_n(λ) + v_n(λ).  (7.4.6)

Taking λ on the unit circle, we may by (7.1.7) write this as

λ^{n+1} ū_{n+1}(λ) = b_n λ · λ^n v̄_n(λ) + λ^n ū_n(λ);

removing the factor λ^{n+1} and taking complex conjugates we have

u_{n+1}(λ) = λ u_n(λ) + b̄_n v_n(λ).  (7.4.7)

Together with (7.4.6), this gives a system of the form (7.1.1-2) with a_n = 1, valid also for λ off the unit circle. In conformity with (7.1.5), we now show that |b_n| < 1. We multiply (7.4.7) by v̄_n(λ) and integrate over the unit circle, with respect to τ(θ).
By (7.4.3), with n + 1 in place of n, and in view of (7.1.7),

∫_0^{2π} u_{n+1}(λ) v̄_n(λ) dτ(θ) = 0,

while

∫_0^{2π} λ u_n(λ) v̄_n(λ) dτ(θ) = ∫_0^{2π} λ^{1−n} {u_n(λ)}² dτ(θ),
∫_0^{2π} v_n(λ) v̄_n(λ) dτ(θ) = ∫_0^{2π} |u_n(λ)|² dτ(θ).

Hence

b̄_n = − ∫_0^{2π} λ^{1−n} {u_n(λ)}² dτ(θ) / ∫_0^{2π} |u_n(λ)|² dτ(θ).  (7.4.8)
From this it is immediate that |b_n| ≤ 1; if |b_n| = 1, so that b̄_n = 1/b_n, it will follow from (7.4.6-7) that u_{n+1}(λ) = b̄_n v_{n+1}(λ). It will then follow from (7.4.3) and (7.4.5) that

∫_0^{2π} u_{n+1}(λ) λ^{−p} dτ(θ) = 0,  p = 0, ..., n + 1,
so that u_{n+1}(λ) would be orthogonal to itself, which is impossible. Hence we have from (7.4.8) that |b_n| < 1. Finally, we modify the u_n(λ) by constant factors so as to ensure that (7.3.11) holds also when p = q, with a corresponding modification in the v_n(λ). We define

û_n(λ) = κ_n u_n(λ),  v̂_n(λ) = κ_n v_n(λ),

where u_n(λ), v_n(λ) are as previously found, and κ_n > 0 is given by

κ_n² ∫_0^{2π} |u_n(λ)|² dτ(θ) = k_n^{−1}.

The modified recurrence relations (7.4.6-7) may be written

û_{n+1}(λ) = (κ_{n+1}/κ_n) {λ û_n(λ) + b̄_n v̂_n(λ)},  v̂_{n+1}(λ) = (κ_{n+1}/κ_n) {b_n λ û_n(λ) + v̂_n(λ)},

and since |b_n| < 1, this system satisfies the restrictions laid down in (7.1.4-5). If, in addition, the initial conditions (7.1.3) are to hold, the constant k_0 is restricted by (7.3.11) in that

∫_0^{2π} dτ(θ) = τ(2π) − τ(0) = k_0^{−1}.
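The whole construction of this section can be exercised end-to-end on a concrete weight. The sketch below uses hypothetical example data: dτ = w dθ with w(θ) = (1 + ½ cos θ)/(2π), whose only nonzero moments are μ_0 = 1 and μ_{±1} = ¼. It solves (7.4.4) for the monic u_n, forms v_n by coefficient reversal and conjugation as in (7.1.7-8), and confirms the recurrences (7.4.6-7) together with |b_n| < 1:

```python
import numpy as np

def mu(r):
    # moments of the example weight w(theta) = (1 + cos(theta)/2) / (2*pi)
    return {0: 1.0, 1: 0.25, -1: 0.25}.get(r, 0.0)

def monic_u(n):
    """Solve (7.4.4) for the monic polynomial u_n (ascending coefficients)."""
    if n == 0:
        return np.array([1.0 + 0j])
    M = np.array([[mu(n - r - p) for r in range(1, n + 1)] for p in range(n)], complex)
    anr = np.linalg.solve(M, -np.array([mu(n - p) for p in range(n)], complex))
    return np.concatenate((anr[::-1], [1.0]))    # coefficient of lambda^(n-r) is a_nr

def reversed_conj(c):
    # v_n from u_n: reverse and conjugate the coefficients, as in (7.1.7-8)
    return np.conj(c)[::-1]

ok_b, ok_46, ok_47 = [], [], []
for n in range(5):
    u, u1 = monic_u(n), monic_u(n + 1)
    v, v1 = reversed_conj(u), reversed_conj(u1)
    bn = np.conj(u1[0])                          # b_n, read off from the top of v_{n+1}
    lam_u = np.concatenate(([0.0], u))           # lambda * u_n
    v_pad = np.concatenate((v, [0.0]))
    ok_b.append(abs(bn) < 1.0)
    ok_46.append(np.allclose(v1, bn * lam_u + v_pad))           # (7.4.6)
    ok_47.append(np.allclose(u1, lam_u + np.conj(bn) * v_pad))  # (7.4.7)
```

For instance, u_1(λ) = λ − ¼ and u_2(λ) = λ² − (4/15)λ + 1/15 here, with b_0 = −¼ and b_1 = 1/15, both inside the unit disc as the theorem requires.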
7.5. Uniqueness of the Spectral Function

As in Chapter 5, and earlier in Chapters 1 and 2, we may define a spectral function as one with respect to which the solutions of the recurrence relation are orthogonal, and enquire whether there exists more than one such function. Starting with the recurrence system (7.1.1-5), we term a suitably restricted function τ(θ), 0 ≤ θ ≤ 2π, a spectral function if we have

∫_0^{2π} ū_q(λ) u_p(λ) dτ(θ) = δ_pq k_p^{−1},  p, q = 0, 1, ...,  λ = exp(iθ).  (7.5.1)
In distinction to the case of ordinary orthogonal polynomials, uniqueness of the spectral function is here ensured, and the associated distinction between limit-circle and limit-point cases does not arise, the limit-circle case being absent.

Theorem 7.5.1. Defining the polynomials u_n(λ) by (7.1.1-5), there is precisely one real-valued function τ(θ), 0 ≤ θ ≤ 2π, which is nondecreasing, right-continuous, and bounded, and such that τ(0) = 0, which satisfies (7.5.1). Here the k_n are given by (7.3.1).

As we indicate presently, we have to deal here essentially with the proposition that the trigonometric moment problem is determinate. To sketch the proof, we know by Theorem 7.3.4 that there exists at any rate one such function, τ†(θ) say. Let τ(θ) be any other such function. For any θ', 0 < θ' < 2π, we construct a sequence of functions φ_n(θ), n = 2, 3, ..., continuous in [0, 2π], vanishing in θ' ≤ θ ≤ 2π, and approximating to 1 in 0 < θ < θ'. Specifically, let us take

φ_n(θ) = 1,  θ'/n ≤ θ ≤ θ' − θ'/n,

completing the definition of φ_n(θ) in the remaining intervals so as to lie between 0 and 1, and to be sufficiently smooth to have an absolutely convergent Fourier series in (0, 2π). We may, for example, take

φ_n(θ) = θ² (θ − 2θ'/n)² (θ'/n)^{−4},  0 ≤ θ ≤ θ'/n,
φ_n(θ) = φ_n(θ' − θ),  θ' − θ'/n ≤ θ ≤ θ'.
Since φ_n(θ) will have a bounded periodic second differential coefficient, save for four discontinuities, it may be expanded as a complex Fourier series

φ_n(θ) = Σ_{r=−∞}^{∞} ζ_{nr} exp(irθ),

where ζ_{nr} = O(r^{−2}) for large r.
If therefore we write

φ_{n,M}(θ) = Σ_{r=−M}^{M} ζ_{nr} exp(irθ),

we shall have φ_{n,M}(θ) → φ_n(θ) as M → ∞, uniformly in θ, and hence

∫_0^{2π} φ_{n,M}(θ) dτ(θ) → ∫_0^{2π} φ_n(θ) dτ(θ)  (7.5.2)

as M → ∞, the same result holding for τ†(θ). Since u_r(λ), ū_r(λ) are polynomials of precise degree r in exp(iθ), exp(−iθ), respectively, we may write

φ_{n,M}(θ) = Σ_{r=0}^{M} η_{nr} u_r(λ) + Σ_{r=1}^{M} η_{n,−r} ū_r(λ),

where the η_{nr} are constants, dependent only on n, r, M, θ' and the coefficients in (7.1.1-2). Now from (7.5.1) with p = 0 or q = 0 we have

∫_0^{2π} u_r(λ) dτ(θ) = 0,  ∫_0^{2π} ū_r(λ) dτ(θ) = 0,  r = 1, 2, ... .  (7.5.3)

Substituting in (7.5.2), only the terms with r = 0 survive, and we have that

∫_0^{2π} φ_n(θ) dτ(θ) = lim_{M→∞} η_{n0} ∫_0^{2π} u_0 dτ(θ);

the coefficients η_{nr} here may depend on M. The same statement holds with τ†(θ) in place of τ(θ), with the same η_{nr}; since, by the case p = q = 0 of (7.5.1), ∫_0^{2π} u_0 dτ(θ) = ∫_0^{2π} u_0 dτ†(θ), we deduce that

∫_0^{2π} φ_n(θ) dτ(θ) = ∫_0^{2π} φ_n(θ) dτ†(θ).

Making n → ∞, and recalling that τ(0) = τ†(0) = 0, we deduce that τ(θ' − 0) = τ†(θ' − 0). Hence τ(θ') = τ†(θ') at all points θ', 0 < θ' < 2π, at which τ(θ), τ†(θ) are continuous. It is easily deduced that they must, as right-continuous nondecreasing functions, coincide for all θ', 0 < θ' < 2π. The same result for θ' = 0 is among the postulates, while that for θ' = 2π is given by (7.5.1) with p = q = 0. This proves the uniqueness of τ(θ).
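The cut-off functions can be realized concretely. The quartic ramp used below matches the formula displayed above (that formula is itself a reconstruction of a garbled original, so treat this as illustrative only); the sketch checks the properties the proof actually uses: values between 0 and 1, the value 1 on the middle interval, vanishing beyond θ', and O(r^{−2}) decay of the Fourier coefficients:

```python
import numpy as np

tp = 2.0          # theta' (any value in (0, 2*pi))
n = 8
d = tp / n        # transition width theta'/n

def phi(t):
    t = np.asarray(t, float)
    out = np.zeros_like(t)
    rise = (0 <= t) & (t <= d)
    out[rise] = t[rise] ** 2 * (t[rise] - 2 * d) ** 2 / d ** 4   # C^1 ramp from 0 to 1
    flat = (d < t) & (t < tp - d)
    out[flat] = 1.0
    fall = (tp - d <= t) & (t <= tp)
    s = tp - t[fall]
    out[fall] = s ** 2 * (s - 2 * d) ** 2 / d ** 4               # mirrored ramp down
    return out

N = 4096
grid = 2 * np.pi * np.arange(N) / N
vals = phi(grid)
zeta = np.fft.fft(vals) / N                    # approximate Fourier coefficients zeta_r
decay = np.abs(zeta[50:500]) * np.arange(50, 500) ** 2
```

Since the ramp has a bounded second derivative with finitely many jumps, r²|ζ_r| stays bounded, which is exactly the O(r^{−2}) decay invoked in the proof.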
The proposition that the trigonometric moment problem is determinate asserts that there is at most one nondecreasing, bounded and right-continuous function τ(θ), 0 ≤ θ ≤ 2π, τ(0) = 0, such that

μ_r = ∫_0^{2π} exp(irθ) dτ(θ),  r = 0, ±1, ±2, ...,  (7.5.4)

for any given set of μ_r. Granted this, we have at once that there cannot be more than one τ(θ) satisfying the conditions of Theorem 7.5.1. For the special cases (7.5.3) of (7.5.1) fix μ_0 = k_0^{−1}, and thence in turn μ_1, μ_2, ..., and μ_{−1}, μ_{−2}, ..., in terms of the coefficients of the u_n(λ). Thus a τ(θ) which satisfies Theorem 7.5.1 will be the solution of a certain trigonometric moment problem, and so is unique.

A rearrangement of the argument of Sections 7.4-5 provides one of the approaches to the trigonometric moment problem. Supposing that the μ_r are given, we conduct the mock orthogonalization (7.4.1-4), in which (7.4.3) is replaced by (7.4.4), τ(θ) being unknown. The proof of the solubility of (7.4.4) will now rest on the assumption of the positivity of the "Toeplitz form"

Σ_{r=0}^{n} Σ_{s=0}^{n} μ_{r−s} a_r ā_s

when the a_r are not all zero. Having found the u_n(λ), and the v_n(λ) from (7.1.8), we may construct the recurrence relation (7.1.1-2), and thence τ(θ) according to Section 7.3. Slight modifications will be necessary for the event that the Toeplitz forms are positive up to some n, and nonnegative thereafter, corresponding to the case that τ(θ) is the spectral function of a finite sequence of polynomials, in the sense (7.3.9). In a parallel way, the Hamburger moment problem (4.6.3) may be solved by orthogonalizing polynomials in the sense (4.6.4), and thence constructing the recurrence relation and the spectral function.
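For a quick concrete look at this positivity, take again the hypothetical weight w(θ) = (1 + ½ cos θ)/(2π) with moments μ_0 = 1, μ_{±1} = ¼. The sketch below builds the Toeplitz matrix of moments, verifies that the quadratic form equals ∫ |Σ a_r λ^r|² w dθ, and that it is positive definite:

```python
import numpy as np

def mu(r):
    return {0: 1.0, 1: 0.25, -1: 0.25}.get(r, 0.0)

n = 6
T = np.array([[mu(r - s) for r in range(n + 1)] for s in range(n + 1)])  # T[s, r] = mu_{r-s}

rng = np.random.default_rng(4)
avec = rng.normal(size=n + 1) + 1j * rng.normal(size=n + 1)
quad = avec.conj() @ T @ avec                  # sum_{r,s} mu_{r-s} a_r conj(a_s)

th = np.linspace(0, 2 * np.pi, 4096, endpoint=False)
P = sum(avec[r] * np.exp(1j * r * th) for r in range(n + 1))
integral = np.mean(np.abs(P) ** 2 * (1 + np.cos(th) / 2))  # = integral of |P|^2 w dtheta
eigmin = np.linalg.eigvalsh(T).min()
```

The identity between the Toeplitz form and the integral is why positivity of the form is the natural hypothesis when τ is replaced by its moments.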
7.6. The Characteristic Function

We construct this in the first place for the finite boundary problem (7.1.1-5), together with (7.2.1), as a function with poles at the eigenvalues, the residues there being proportional to the associated normalization constants. If we define

f_{m,z}(λ) = {u_{m1}(λ) − z v_{m1}(λ)} / {u_m(λ) − z v_m(λ)},  (7.6.1)
where u_{m1}(λ), v_{m1}(λ) are the second solutions of the recurrence relations (7.1.1-2) with initial conditions (7.1.12), it is to be expected that f_{m,exp(iα)}(λ) will have poles at the roots of (7.2.1), that is to say, at the eigenvalues. We proceed to verify that this function has further properties, similar to those mentioned in analogous cases in Section 1.6 and Section 4.5.
Theorem 7.6.1. The function (7.6.1) has, for z = exp(iα), the representations

f_{m,exp(iα)}(λ) = Σ_{r=1}^{m} (λ + λ_r)(λ − λ_r)^{−1} ρ_r^{−1} = ∫_0^{2π} (λ + e^{iφ})(λ − e^{iφ})^{−1} dτ_{m,α}(φ).  (7.6.2)

It maps the unit circle into the imaginary axis, and the interior and exterior of the unit circle into the left and right half-planes, respectively.

For the proof we expand (7.6.1) in partial fractions in the usual manner, taking z = exp(iα), with α real. Both numerator and denominator in (7.6.1) are polynomials of degree m, and we need the fact that they have the same coefficient of λ^m. For it follows from (7.1.1-3) that

u_m(λ) = λ^m a_{m−1} ⋯ a_0 + ... ,  v_m(λ) = λ^m b̄_{m−1} a_{m−2} ⋯ a_0 + ... ,  (7.6.3)

the omitted terms being lower powers of λ. The different initial conditions (7.1.12) do not affect the leading terms in λ^m, and (7.6.3) is also valid with u_{m1}, v_{m1} in place of u_m, v_m. Hence f_{m,exp(iα)}(λ) → 1 as |λ| → ∞, and so

f_{m,exp(iα)}(λ) = 1 + Σ_{r=1}^{m} c_r (λ − λ_r)^{−1}

for certain constants c_r.
In order to calculate the coefficients on the right we first put (7.3.6) in the form

ρ_r = k_m λ_r ū_m(λ_r) {u_m'(λ_r) − exp(iα) v_m'(λ_r)},  (7.6.5)

where we have used the fact that v̄_m(λ_r) = exp(iα) ū_m(λ_r), in view of the boundary condition (7.2.1). In addition, we need the fact that, if |λ| = 1,

ū_m(λ) u_{m1}(λ) − v̄_m(λ) v_{m1}(λ)  (7.6.6)
  = 2 k_m^{−1}.  (7.6.7)
Here the equality (7.6.7) follows from the initial conditions (7.1.3), (7.1.12), with the definition k_0 = 1. The proof of the Wronskian-type identity (7.6.6) is very similar to that of (7.3.2). Suppressing the dependence on λ, the recurrence relations give

u_{n+1,1} = a_n λ u_{n1} + b_n v_{n1},  v_{n+1,1} = b̄_n λ u_{n1} + a_n v_{n1},

and, in complex conjugate form,

ū_{n+1} = a_n λ̄ ū_n + b̄_n v̄_n,  v̄_{n+1} = b_n λ̄ ū_n + a_n v̄_n.

Multiplying these respectively and subtracting, and using the fact that λ λ̄ = 1, we deduce after slight reduction that

ū_{n+1} u_{n+1,1} − v̄_{n+1} v_{n+1,1} = (a_n² − |b_n|²)(ū_n u_{n1} − v̄_n v_{n1}).

In view of (7.3.1), (7.6.6) follows by induction. Combining now (7.6.5-7), and the fact that v_m(λ_r) = exp(−iα) u_m(λ_r), we have

c_r = 2 λ_r ρ_r^{−1}.

Observing finally that

Σ_{r=1}^{m} ρ_r^{−1} = k_0^{−1} = 1,  (7.6.8)

by (7.3.7) with p = q = 0, we see that this reduces to the required result (7.6.2), in the series form; the Stieltjes integral version is of course identical, by the definition (7.3.8). For the additional remarks in the theorem, we note that |λ_r| = 1, so that (λ + λ_r)/(λ − λ_r) has a real part which is negative, positive, or zero according as λ lies inside, outside, or on the unit circle. The same applies to a linear combination, with positive coefficients, of such terms, as appearing in (7.6.2). It follows that as λ describes the unit circle in the positive sense, f_{m,exp(iα)}(λ) describes the imaginary axis, in the sense from −i∞ to +i∞, and in fact m times. An incidental consequence is the following separation theorem.
Theorem 7.6.2. Between two zeros of u_m(λ) − exp(iα) v_m(λ), on the unit circle, there lies a zero of u_{m1}(λ) − exp(iα) v_{m1}(λ).

For since f_{m,exp(iα)}(λ) describes the imaginary axis monotonically as λ describes the unit circle, its zeros must alternate with its poles. (See Appendix II.)
Next we derive certain results analogous to (4.8.12-13) or (5.9.2). The function (7.6.2) is analytic in the unit circle, and hence admits an expansion as a power series in λ. We show that up to a certain point the coefficients are independent of α and m.

Theorem 7.6.3. We have, for real α and |λ| < 1,

f_{m,exp(iα)}(λ) = −μ_0 − 2 Σ_{n=1}^{m−1} λ^n μ_{−n} + ... ,  (7.6.9)

where the μ_r are the moments (7.5.4) and are independent of α and m, and the omitted terms are of order λ^m. For if |λ| < 1 we have from (7.6.2) that

f_{m,exp(iα)}(λ) = − ∫_0^{2π} {1 + 2 Σ_{n=1}^{∞} λ^n exp(−inφ)} dτ_{m,α}(φ),

and the result follows from (7.3.16-17).

The expansion (7.6.9), though not its proof, applies more generally.

Theorem 7.6.4. If |z| ≥ 1, |λ| < 1,

f_{m,z}(λ) = −μ_0 − 2 Σ_{n=1}^{m−1} λ^n μ_{−n} + ... .  (7.6.10)
It was shown in Section 7.2 that |u_m(λ)/v_m(λ)| < 1 when |λ| < 1, and so (7.6.1) is regular in z for |λ| < 1 and |z| ≥ 1. The coefficients of the power series expansion of f_{m,z}(λ) in the unit circle, namely,

(2πi)^{−1} ∮ f_{m,z}(λ) λ^{−n−1} dλ,  n = 0, 1, ... ,

taken round some fixed circle inside the unit circle, are also regular in z for |z| ≥ 1, and since they are constant for n = 0, ..., m − 1 when |z| = 1, they are constant when |z| ≥ 1 also. Another way of proceeding from Theorem 7.6.3 to Theorem 7.6.4 is to use the identity, which we need later,

f_{m,z1}(λ) − f_{m,z2}(λ) = 2 (z1 − z2) λ^m k_m^{−1} {u_m(λ) − z1 v_m(λ)}^{−1} {u_m(λ) − z2 v_m(λ)}^{−1}.  (7.6.11)
We get (7.6.11) on using the Wronskian-type identity

u_{m1}(λ) v_m(λ) − v_{m1}(λ) u_m(λ) = 2 λ^m k_m^{−1}.

If |λ| = 1, this results from (7.6.6-7) together with (7.1.7), and hence it is true generally; alternatively it may be proved independently by induction. From (7.6.11) it follows that the Maclaurin expansions of f_{m,z1}(λ), f_{m,z2}(λ) coincide up to and including the term in λ^{m−1}, at least if |z1| ≥ 1, |z2| ≥ 1, so that (7.6.10) follows from (7.6.9), as we asserted. In a similar way, for expansions for large λ, we have

Theorem 7.6.5. If |z| ≤ 1, |λ| > 1,

f_{m,z}(λ) = μ_0 + 2 Σ_{n=1}^{m−1} λ^{−n} μ_n + ... .  (7.6.12)
The proof is omitted, as entirely analogous. An aspect we shall not develop is that of the limit f(λ) of the characteristic functions (7.6.2) as m → ∞, given by

f(λ) = ∫_0^{2π} (λ + e^{iφ})(λ − e^{iφ})^{−1} dτ(φ),  (7.6.13)

which defines analytic functions in |λ| < 1 and |λ| > 1. In particular f(λ) will have negative real part inside the unit circle, with the Maclaurin expansion

f(λ) = −μ_0 − 2 Σ_{n=1}^{∞} λ^n μ_{−n}.  (7.6.14)

The finite-dimensional characteristic functions (7.6.2) may be regarded as rational functions, also with negative real part inside the unit circle, and zero real part on it, which approximate to f(λ) in the sense of having in part the same Maclaurin expansion. This provides another route to our orthogonal polynomials, in which we start with f(λ) and consider approximations to it by means of rational functions.
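The stated properties of f(λ) are easy to test for a discrete weight distribution. The sketch below assumes nothing beyond (7.6.13-14): τ is taken as a hypothetical step function with positive jumps at points on the circle, and the code checks the negative real part inside the unit circle and the Maclaurin expansion:

```python
import numpy as np

rng = np.random.default_rng(5)
pts = np.exp(2j * np.pi * rng.uniform(size=7))    # jump locations on |lambda| = 1
mass = rng.uniform(0.1, 1.0, 7)                   # positive jumps of tau

def f(lam):                                       # (7.6.13) for this step function
    return np.sum((lam + pts) / (lam - pts) * mass)

def mu(r):                                        # moments (7.5.4) of the step function
    return np.sum(pts ** r * mass)

inside = 0.8 * np.exp(2j * np.pi * np.linspace(0, 1, 40, endpoint=False))
re_parts = np.array([f(z).real for z in inside])  # should all be negative

lam0 = 0.3 + 0.2j
series = -mu(0) - 2 * sum(lam0 ** n * mu(-n) for n in range(1, 80))
err = abs(f(lam0) - series)                       # truncation of (7.6.14)
```

Each term (λ + ζ)/(λ − ζ) has real part (|λ|² − 1)/|λ − ζ|², so negativity inside the disc is immediate, and the geometric expansion of each term gives (7.6.14).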
7.7. A Further Orthogonality Result

For a finite set of polynomials u_0(λ), ..., u_{m−1}(λ) we have an infinite number of orthogonality relationships (7.3.9), corresponding to the choice of α, 0 ≤ α < 2π. From these, other orthogonalities for the same
polynomials, discrete and continuous, may be formed by linear combination. We carried out a similar process in connection with orthogonal polynomials on the real line in Section 4.9. For the analogous result for polynomials orthogonal on the unit circle we use here a different method. The result is

Theorem 7.7.1. Let |z| < 1, 0 ≤ p, q < m. Then

(2π)^{−1} (1 − |z|²) k_m^{−1} ∫_0^{2π} |u_m(λ) − z v_m(λ)|^{−2} ū_q(λ) u_p(λ) dθ = δ_pq k_p^{−1}.  (7.7.1)
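Before turning to the proof, the theorem can be checked numerically in monic normalization. Translating via k_p^{−1} = κ_p² h_p with h_p = ∫ |u_p|² w dθ (the normalizing constants κ cancel for p = q and are immaterial for p ≠ q), the claim becomes (2π)^{−1}(1 − |z|²) h_m ∫ |u_m − z v_m|^{−2} ū_q u_p dθ = δ_pq h_p for monic u_n. The sketch reuses the hypothetical weight w(θ) = (1 + ½ cos θ)/(2π):

```python
import numpy as np

def mu(r):
    return {0: 1.0, 1: 0.25, -1: 0.25}.get(r, 0.0)

def monic_u(n):   # solve (7.4.4) for the monic u_n, ascending coefficients
    if n == 0:
        return np.array([1.0 + 0j])
    M = np.array([[mu(n - r - p) for r in range(1, n + 1)] for p in range(n)], complex)
    anr = np.linalg.solve(M, -np.array([mu(n - p) for p in range(n)], complex))
    return np.concatenate((anr[::-1], [1.0]))

m, z = 4, 0.3 + 0.25j
th = np.linspace(0, 2 * np.pi, 4096, endpoint=False)
lam = np.exp(1j * th)
w = (1 + np.cos(th) / 2) / (2 * np.pi)
uvals = [np.polyval(monic_u(n)[::-1], lam) for n in range(m + 1)]
h = [np.mean(np.abs(u) ** 2 * w) * 2 * np.pi for u in uvals]   # h_n = integral |u_n|^2 w
vm = lam ** m * np.conj(uvals[m])                              # v_m on the circle, by (7.1.7)
dens = (1 - abs(z) ** 2) * h[m] / (2 * np.pi * np.abs(uvals[m] - z * vm) ** 2)
I = np.array([[np.mean(dens * uvals[p] * np.conj(uvals[q])) * 2 * np.pi
               for q in range(m)] for p in range(m)])
```

The computed matrix I is diagonal with entries h_0, ..., h_{m−1}: the rational density reproduces the original orthogonality for all indices below m, which is the content of the theorem.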
Here we have written λ = exp(iθ). We take z1, z2 with |z1| < 1 < |z2|. Since, as shown in Section 7.2, u_m(λ)/v_m(λ) is inside, on, or outside the unit circle with λ, f_{m,z1}(λ) will be analytic outside and on the unit circle, and in fact in a region |λ| > 1 − ε, for some ε > 0; similarly f_{m,z2}(λ) will be analytic in a region |λ| < 1 + ε. Hence their difference will be analytic in an annulus 1 − ε < |λ| < 1 + ε and will have a Laurent expansion there, given by (7.6.10) and (7.6.12) as

f_{m,z1}(λ) − f_{m,z2}(λ) = 2 μ_0 + 2 Σ_{n=1}^{m−1} λ^{−n} μ_n + 2 Σ_{n=1}^{m−1} λ^n μ_{−n} + ... ,  (7.7.2)

the omitted terms being of order λ^m or λ^{−m}.
It follows that, for an integral taken positively around the unit circle,

(2πi)^{−1} ∮ {f_{m,z1}(λ) − f_{m,z2}(λ)} λ^n λ^{−1} dλ = 2 μ_n,  −m < n < m,

or

(2π)^{−1} ∫_0^{2π} {f_{m,z1}(λ) − f_{m,z2}(λ)} exp(inθ) dθ = 2 μ_n,  −m < n < m.

This implies a corresponding result in which λ^n is replaced by any linear combination of λ^n, −m < n < m, and so in particular by ū_q(λ) u_p(λ), 0 ≤ p, q < m. In view of (7.3.9) we deduce that

(2πi)^{−1} ∮ {f_{m,z1}(λ) − f_{m,z2}(λ)} ū_q(λ) u_p(λ) λ^{−1} dλ = 2 δ_pq k_p^{−1}.  (7.7.3)

To obtain the special case (7.7.1) we choose a z with 0 < |z| < 1, and take z1 = z, z2 = 1/z̄. Using (7.6.11) we have

f_{m,z}(λ) − f_{m,1/z̄}(λ) = 2 (z − 1/z̄) λ^m k_m^{−1} {u_m(λ) − z v_m(λ)}^{−1} {u_m(λ) − z̄^{−1} v_m(λ)}^{−1}.
Using (7.1.7), this simplifies, for |λ| = 1, to

f_{m,z}(λ) − f_{m,1/z̄}(λ) = 2 (1 − |z|²) {k_m |u_m(λ) − z v_m(λ)|²}^{−1},

and so, replacing also λ^{−1} dλ by i dθ, (7.7.3) yields the stated result (7.7.1). The case z = 0 may be proved by a slight modification in the argument, or deduced by making z → 0.

7.8. Asymptotic Behavior
We consider here the asymptotic character of u_n(λ) when λ is on the unit circle and n is large. This corresponds to asymptotic problems for linear differential equations in which the independent variable becomes large. We recall that there are three natural starting points for defining the polynomials u_n(λ), v_n(λ). Two of these are quite distinct, though actually equivalent, and are given by the recurrence relations (7.1.1-3), and the orthogonality (7.3.11); the third method is a formal generalization of the second, in which we start not with τ(θ) but instead with its moments (7.5.4), and orthogonalize in the sense (7.4.4). In each case, the problem may be posed of determining the asymptotic behavior of u_n(λ) for large n, given one of these definitions; inverse problems may also be posed, in which we start with a knowledge of the asymptotic behavior, and wish to make some deduction concerning the recurrence relation. In the most tractable version of this problem, we start with the orthogonality and make deductions concerning the asymptotic form for large n. Some rather vague information is to be obtained by comparing the assumed orthogonality (7.3.11), where τ(θ) is nondecreasing, right-continuous, and bounded, and not a step function with a finite number of jumps, with the result (7.7.1); taking for simplicity z = 0 in the latter we have, with λ for exp(iθ),

(2π k_m)^{−1} ∫_0^{2π} |u_m(λ)|^{−2} ū_q(λ) u_p(λ) dθ = δ_pq k_p^{−1},  (7.8.1)

for 0 ≤ p, q < m. Making m → ∞, it follows from the uniqueness of the trigonometric moment problem that the weight distribution (2π k_m)^{−1} |u_m(λ)|^{−2} dθ must approximate in some sense to dτ(θ). If in particular τ(θ) has a derivative w(θ), this suggests that for some constants κ_n we have |u_n(λ)|^{−1} ~ κ_n √w(θ). Actually in this case of orthogonality with respect to a weight function w(θ), as opposed to that of a weight distribution, much more exact information is available for weight
functions of various classes. In what follows we indicate some simple direct arguments, starting with a formal solution of the problem. We determine the polynomials u_n(λ) from the orthogonality

∫_0^{2π} ū_p(λ) u_q(λ) w(θ) dθ = 0  (p ≠ q),  ≠ 0  (p = q),  (7.8.2)

where λ = exp(iθ). We assume w(θ) continuous and to have positive real part; there is no particular need for it to be real-valued. Since (7.8.2) leaves the polynomials indeterminate to the extent of multiplicative factors, we are at liberty to impose the further requirement that u_n(λ) have unit highest coefficient, so that

u_n(λ) = λ^n + Σ_{r=1}^{n} a_{nr} λ^{n−r}.  (7.8.3)

Then (7.8.2) is equivalent to

∫_0^{2π} u_n(λ) λ^{−s} w(θ) dθ = 0,  s = 0, ..., n − 1,  (7.8.4)

or, defining the moments

μ_r = ∫_0^{2π} exp(irθ) w(θ) dθ,  (7.8.5)
to

μ_{n−s} + Σ_{r=1}^{n} a_{nr} μ_{n−r−s} = 0,  s = 0, ..., n − 1,

that is to say, to

μ_s + Σ_{r=1}^{n} a_{nr} μ_{s−r} = 0,  s = 1, ..., n.  (7.8.6)
We note in passing that these equations are uniquely soluble for the a_{n1}, ..., a_{nn}, so that the polynomials u_n(λ) are uniquely fixed by (7.8.2-3). Using an argument given previously, if they were not uniquely soluble, there would be a nontrivial solution a_1', ..., a_n' of the homogeneous equations

Σ_{r=1}^{n} a_r' μ_{s−r} = 0,  s = 1, ..., n,
whence

Σ_{r=1}^{n} Σ_{s=1}^{n} a_r' ā_s' μ_{s−r} = 0,

or

∫_0^{2π} | Σ_{r=1}^{n} a_r' exp(−irθ) |² w(θ) dθ = 0,

whereas the integral on the left has positive real part if w(θ) does so and is continuous. Hence the polynomials u_n(λ) exist and are unique. To derive in a formal way their asymptotic form for large n, we replace n by ∞ in the system of equations (7.8.6), getting the infinite system

μ_s + Σ_{r=1}^{∞} α_r μ_{s−r} = 0,  s = 1, 2, ... .  (7.8.7)

We have first to solve these equations for the α_r, and then to find conditions under which it is correct to assume that a_{nr} → α_r as n → ∞. The task of solving the equations (7.8.7) may be reduced to a certain factorization problem in terms of analytic functions. If (7.8.7) holds, then there holds the formal identity

{1 + Σ_{r=1}^{∞} α_r λ^{−r}} {Σ_{s=−∞}^{∞} μ_s λ^{−s}} = Σ_{s=0}^{∞} β_s λ^s,  (7.8.8)

as we see by comparing coefficients of λ^{−s}, s = 1, 2, ...; the values of the constants β_s do not concern us. Conversely, (7.8.7) follows from (7.8.8). Assuming the series to converge when |λ| = 1, and writing (7.8.8) in abbreviated form as

g(λ) w_1(λ) = h(λ),  (7.8.9)

we may interpret these three factors as follows. In the first place

w_1[exp(iθ)] = Σ_{s=−∞}^{∞} μ_s exp(−isθ),

and in view of (7.8.5) this means that, under suitable conditions,

w_1[exp(iθ)] = 2π w(θ).  (7.8.10)
As to the remaining factors in (7.8.9), g(λ) is regular outside the unit circle, with g(∞) = 1, while h(λ) is regular inside the unit circle. The problem, roughly stated, is then to find a function analytic outside the unit circle and defined on it, which when multiplied by the weight function, considered as a function on the unit circle, yields a function analytically continuable within the unit circle. In the present scalar case there is a formal solution to this factorization problem. We form the Fourier series of log w_1(exp(iθ)), say

log w_1[exp(iθ)] = Σ_{r=−∞}^{∞} γ_r exp(irθ),  (7.8.11)

giving the logarithm its principal value. We then take

g(λ) = exp{− Σ_{r=1}^{∞} γ_{−r} λ^{−r}},  (7.8.12)

h(λ) = exp{γ_0 + Σ_{r=1}^{∞} γ_r λ^r}.  (7.8.13)
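This factorization can be carried out numerically by FFT. The sketch below uses the hypothetical weight from earlier examples, w_1(λ) = 1 + ½ cos θ, which is positive so that the principal logarithm is unambiguous; it forms the γ_r of (7.8.11), builds g and h by (7.8.12-13), and confirms (7.8.9) on the circle:

```python
import numpy as np

N, R = 512, 80
th = 2 * np.pi * np.arange(N) / N
lam = np.exp(1j * th)
w1 = 1 + np.cos(th) / 2                       # w_1 = 2*pi*w on |lambda| = 1 (example)
gam = np.fft.fft(np.log(w1)) / N              # gamma_r, stored at index r mod N
g = np.exp(-sum(gam[-r] * lam ** (-r) for r in range(1, R)))      # (7.8.12)
h = np.exp(gam[0] + sum(gam[r] * lam ** r for r in range(1, R)))  # (7.8.13)
resid = np.max(np.abs(g * w1 - h))            # (7.8.9) residual
const = abs(np.mean(g) - 1)                   # constant term of g, i.e. g(infinity) = 1
```

Because log w_1 here is analytic on the circle, its Fourier coefficients decay geometrically and a modest truncation R already makes the residual negligible.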
If, for example, the series (7.8.11) is absolutely convergent, the exponentials in (7.8.12-13) may be expanded, yielding power series of the form given by the first and last factors in (7.8.8) which are absolutely convergent when |λ| = 1. It is easily seen also that (7.8.9) will hold. Summing up, the conjectural asymptotic formula for u_n(λ) when n is large and |λ| = 1 is constructed as follows. From the weight function w(θ) we form a similar function defined on the unit circle by w_1(λ) = 2π w(θ). We expand log w_1(λ) for |λ| = 1 in a series of positive and negative powers of λ, essentially the Fourier series of log w(θ), and retain only the negative powers. Changing the sign and taking the exponential we obtain the function g(λ), and the required asymptotic formula for u_n(λ) is λ^n g(λ). We proceed to show that these arguments have some validity in the event of absolute convergence.

Theorem 7.8.1. Let w(θ) be continuous and have positive real part for 0 ≤ θ ≤ 2π, and let log w(θ) have an absolutely convergent Fourier expansion. Then u_n(λ) approximates in mean square to λ^n g(λ) as n → ∞; more precisely we have

∫_0^{2π} | u_n(λ) − λ^n g(λ) |² dθ ≤ 2π (2 + 2Ω²/ω²) Σ_{r=n+1}^{∞} | α_r |²,  (7.8.14)

where ω and Ω are the constants defined in (7.8.19) below.
Here λ = exp(iθ), and the α_r are given by (7.8.12), together with

g(λ) = 1 + Σ_{r=1}^{∞} α_r λ^{−r};  (7.8.15)

this converges absolutely if |λ| ≥ 1, since we assume log w(θ), and so (7.8.11), to have an absolutely convergent Fourier series. It follows that the right-hand side of (7.8.14) tends to zero as n → ∞. In addition, the series expansion of h(λ) [cf. (7.8.13)] will converge absolutely for |λ| ≤ 1, while it follows from (7.8.11) that the series expansion for w_1(λ) converges absolutely for |λ| = 1. Hence for |λ| = 1 all three series occurring in (7.8.8) converge absolutely and (7.8.8) holds. Hence (7.8.7) are also correct, again with absolute convergence. For the proof of (7.8.14) we subtract from (7.8.6) the corresponding equations (7.8.7) with s = 1, ..., n, deriving

Σ_{r=1}^{n} (a_{nr} − α_r) μ_{s−r} = Σ_{r=n+1}^{∞} α_r μ_{s−r},  s = 1, ..., n.  (7.8.16)
Multiplying by ā_{ns} − ᾱ_s and summing, we have

Σ_{r=1}^{n} Σ_{s=1}^{n} (a_{nr} − α_r)(ā_{ns} − ᾱ_s) μ_{s−r} = Σ_{r=n+1}^{∞} Σ_{s=1}^{n} α_r (ā_{ns} − ᾱ_s) μ_{s−r}.

In order to estimate these quadratic forms we translate them into integrals, in view of (7.8.5). Introducing the notation

g_n(λ) = 1 + Σ_{r=1}^{n} α_r λ^{−r},  g_{n1}(λ) = Σ_{r=n+1}^{∞} α_r λ^{−r},  (7.8.17)

so that λ^{−n} u_n(λ) − g_n(λ) = Σ_{r=1}^{n} (a_{nr} − α_r) λ^{−r}, this may be written

∫_0^{2π} | λ^{−n} u_n(λ) − g_n(λ) |² w(θ) dθ = ∫_0^{2π} g_{n1}(λ) conj{λ^{−n} u_n(λ) − g_n(λ)} w(θ) dθ.  (7.8.18)

Defining also

ω = min Re w(θ),  Ω = max | w(θ) |,  (7.8.19)

where ω > 0 since w(θ) has positive real part and is continuous, and
comparing the real part of the left of (7.8.18) with the modulus of the right, we deduce that

ω ∫_0^{2π} | λ^{−n} u_n(λ) − g_n(λ) |² dθ ≤ Ω {∫_0^{2π} | g_{n1}(λ) |² dθ}^{1/2} {∫_0^{2π} | λ^{−n} u_n(λ) − g_n(λ) |² dθ}^{1/2},

by the Cauchy inequality. Hence

∫_0^{2π} | λ^{−n} u_n(λ) − g_n(λ) |² dθ ≤ (Ω/ω)² ∫_0^{2π} | g_{n1}(λ) |² dθ.  (7.8.20)
This is essentially the result. Since g(λ) − g_n(λ) = g_{n1}(λ), we may modify (7.8.20) to

∫_0^{2π} | λ^{−n} u_n(λ) − g(λ) |² dθ ≤ (2 + 2Ω²/ω²) ∫_0^{2π} | g_{n1}(λ) |² dθ.

Introducing a factor λ^n, of absolute value unity, on the left, and evaluating the integral on the right, we have

∫_0^{2π} | u_n(λ) − λ^n g(λ) |² dθ ≤ 2π (2 + 2Ω²/ω²) Σ_{r=n+1}^{∞} | α_r |²,  (7.8.21)
which gives the required result.

For an actual approximation to u_n(λ), and not one in the mean-square sense, we may work from (7.8.20), deriving

Theorem 7.8.2. Under the conditions of Theorem 7.8.1 there holds for |λ| = 1 the bound

    | u_n(λ) − λⁿ g_n(λ) |² ≤ n (Q/w)² Σ_{m=n+1}^∞ | α_m |².    (7.8.22)

For

    | u_n(λ) − λⁿ g_n(λ) |² = | Σ_{s=1}^{n} ( α_{ns} − α_s ) λ^{n−s} |² ≤ n Σ_{s=1}^{n} | α_{ns} − α_s |²

by the Cauchy inequality. The bound for the sum on the right needed to prove (7.8.22) is immediate from (7.8.20). In both Theorems 7.8.1-2, a connection is manifest between the accuracy of the asymptotic approximation and the smoothness of the weight function.
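The construction underlying these theorems — take the Fourier series of log w(θ), retain only the negative powers of λ, change the sign, and exponentiate — lends itself to a direct numerical sketch. The fragment below is our own illustration, not from the text; the function names, the truncation order kmax, and the Riemann-sum quadrature are assumptions of the sketch. (The factor 2π in w₁(λ) = 2πw(θ) only shifts the discarded constant term, so w is used directly.)

```python
import cmath
import math

def fourier_coeffs(f, kmax, N=512):
    # c_k = (1/2π) ∫₀^{2π} f(θ) e^{-ikθ} dθ, approximated by an N-point Riemann sum
    cs = {}
    for k in range(-kmax, kmax + 1):
        s = 0j
        for j in range(N):
            th = 2.0 * math.pi * j / N
            s += f(th) * cmath.exp(-1j * k * th)
        cs[k] = s / N
    return cs

def g_factor(w, lam, kmax=32):
    # expand log w(θ) in powers of λ = e^{iθ}, retain only the negative powers,
    # change the sign and take the exponential, as in the construction of g(λ)
    c = fourier_coeffs(lambda th: cmath.log(w(th)), kmax)
    s = sum(c[-m] * lam ** (-m) for m in range(1, kmax + 1))
    return cmath.exp(-s)

# a constant weight has log w ≡ const, which is entirely discarded, so g ≡ 1
g1 = g_factor(lambda th: 1.0, cmath.exp(0.7j))
```

For the weight w(θ) = |1 − a e^{iθ}|² with |a| < 1 the recipe gives g(λ) = (1 − a/λ)^{−1} in closed form, which the sketch reproduces to quadrature accuracy.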
7.9. Polynomials Orthogonal on a Real Segment

The factorization (7.8.8) also provides asymptotic expressions for a certain class of ordinary orthogonal polynomials on a finite interval of the real axis. We use the notation of the last section, with the additional assumption that w(θ), when considered as of period 2π and defined for all real θ, is an even function. In other words, we have

    w(2π − θ) = w(θ),    ρ_s = ρ_{−s},    w₁(λ) = w₁(1/λ),    |λ| = 1,    (7.9.1)

in view of (7.8.10). There is still no need to assume w(θ) to be real-valued. We write (7.8.8) or (7.8.9) in the form

    g_n(λ) w₁(λ) = −g_{n1}(λ) w₁(λ) + h(λ),    (7.9.2)
with the notation (7.8.17). We deduce that, with λ = exp(iθ), the integrals of both sides of (7.9.2) against λ^r vanish together for r = 0, 1, …, n − 1, in view of the fact that h(λ) contains no negative powers of λ. Combining the results for ±r it follows that the corresponding relations hold with λ^r replaced by cos rθ, r = 0, 1, …, n − 1. Selecting the even part of both sides, that is, making the change of variable θ → −θ and combining the results, we have, in view of (7.9.1) and (7.8.17), that this is equivalent to

    ∫₀^{2π} ( cos nθ + Σ_{s=1}^{n} α_s cos (n−s)θ ) w₁(θ) cos rθ dθ = −∫₀^{2π} ( Σ_{s=n+1}^{∞} α_s cos (n−s)θ ) w₁(θ) cos rθ dθ,    r = 0, …, n − 1.    (7.9.3)
The right-hand side being small for large n, we treat this as an approximation to the problem in which the right-hand side is to be zero, that is to say, in which constants α_{ns}, s = 1, …, n, are to be determined such that

    ∫₀^{2π} ( cos nθ + Σ_{s=1}^{n} α_{ns} cos (n−s)θ ) w₁(θ) cos rθ dθ = 0,    r = 0, …, n − 1.    (7.9.4)

That they can be so determined, and are unique, is proved as in the discussion of (7.8.6). Subtracting (7.9.3) from (7.9.4), and replacing cos rθ by cos (n − r)θ, we obtain a similar system of relations for r = 1, …, n. Multiplying by (ᾱ_{nr} − ᾱ_r) and summing over r, precisely as in the discussion of (7.8.18), we deduce a mean-square bound of the type (7.8.20).
We have proved the following result, which we formulate first in terms of cosine polynomials.
Theorem 7.9.1. Let w(θ), 0 ≤ θ ≤ 2π, be continuous, have positive real part, satisfy w(θ) = w(2π − θ), and let log w(θ) have an absolutely convergent Fourier expansion. Let the cosine polynomials q_n(θ) have the form

    q_n(θ) = cos nθ + Σ_{s=1}^{n} α_{ns} cos (n − s)θ,    (7.9.5)

and the orthogonality properties

    ∫₀^{2π} q_n(θ) w(θ) cos rθ dθ = 0,    r = 0, …, n − 1.    (7.9.6)

Then as n → ∞, q_n(θ) is approximated in mean square over (0, 2π) by ½{ λⁿ g_n(λ) + λ^{−n} g_n(λ̄) }, λ = e^{iθ}. In the above, the integral in (7.9.6) could be replaced by that over (0, π); in the asymptotic formula, g_n(λ) could be replaced by g(λ). As in Theorem 7.8.2, the statement may be improved to approximation in the uniform sense if n Σ_{m=n+1}^∞ | α_m |² → 0.
Finally we put the result in terms of ordinary polynomials. Since (7.9.5) gives a polynomial in cos θ, we may define a polynomial p_n(x) by p_n(cos θ) = q_n(θ), so that in particular

    p_n(x) = 2^{1−n} xⁿ + … .    (7.9.7)

The polynomials are to have this form and to be orthogonal according to

    ∫_{−1}^{1} p_n(x) w₀(x) x^r dx = 0,    r = 0, …, n − 1,    (7.9.8)

where

    w₀(cos θ) | sin θ | = w(θ).    (7.9.9)

If w(θ) so given satisfies the conditions of Theorem 7.9.1, we obtain the asymptotic approximation for p_n(cos θ) as ½{ e^{inθ} g_n(e^{iθ}) + e^{−inθ} g_n(e^{−iθ}) }, in the mean square or uniform sense as before.
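As a concrete instance of (7.9.7)-(7.9.9), offered as our own illustration: a constant weight w(θ) makes w₀(x) a constant multiple of (1 − x²)^{−1/2}, the Chebyshev weight, and then p_n(x) = 2^{1−n} T_n(x) with T_n the Chebyshev polynomial. The quadrature below (Gauss-Chebyshev, exact for the polynomial degrees involved) checks the orthogonality (7.9.8):

```python
import math

def T(n, x):
    # Chebyshev polynomial T_n via its three-term recurrence
    a, b = 1.0, x
    if n == 0:
        return a
    for _ in range(n - 1):
        a, b = b, 2.0 * x * b - a
    return b

def inner(f, g, N=400):
    # ∫_{-1}^{1} f(x) g(x) (1 - x²)^{-1/2} dx by the substitution x = cos θ
    # (Gauss-Chebyshev rule: exact for polynomial integrands of degree < 2N)
    s = 0.0
    for j in range(N):
        x = math.cos(math.pi * (j + 0.5) / N)
        s += f(x) * g(x)
    return s * math.pi / N

n = 4
pn = lambda x: 2.0 ** (1 - n) * T(n, x)
# the orthogonality (7.9.8): <p_n, x^r> = 0 for r = 0, ..., n-1
vals = [inner(pn, lambda x, r=r: x ** r) for r in range(n)]
```

The inner products vanish to machine precision, as the orthogonality requires.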
7.10. Continuous and Discrete Analogs

We recall that in the case of polynomials orthogonal on the real axis the three-term recurrence relation by which the polynomials may be defined, (4.1.1), has a continuous counterpart in the Sturm-Liouville differential equation, the two being included in certain more general equations. In large measure this situation obtains also for the recurrence relations (7.1.1-2) which determine polynomials orthogonal on the unit circle. In order to allow the transition to a continuous analog it is first necessary to reformulate the recurrence relations (7.1.1-2). In the first place we take it that a_n = 1; this simplifies matters and may be arranged by a substitution. Secondly, we transfer the spectrum to the real axis by a substitution of such a form as λ = (1 + iλ′)/(1 − iλ′). Making these substitutions in (7.1.1-2) we have, after slight rearrangement,

    u_{n+1} − u_n = 2iλ′(1 − iλ′)^{−1} u_n + b_n v_n,
    v_{n+1} − v_n = b̄_n (1 + iλ′)(1 − iλ′)^{−1} u_n.

Without attempting to go through any limiting process, we may formulate a continuous analog of these equations. Here u_n is replaced by u(x), where x is a continuous variable, u_{n+1} − u_n by u′(x) dx, 2λ′ by μ dx, b_n by b(x) dx, where b(x) is a function of the continuous variable x. We thus arrive at the system of differential equations

    du/dx = iμu + b(x) v,    dv/dx = b̄(x) u.    (7.10.1)
Attention to this system, as an analog of that defining polynomials orthogonal on the unit circle, was called by M. G. Krein, who outlined a rather complete theory of the topic, which we shall not reproduce here. It should be emphasized that the differential equations (7.10.1) provide merely one possible starting point; we recall that in the polynomial case other starting points are provided by the orthogonality (7.3.11), and by the moments (7.5.4), forming a Toeplitz matrix (ρ_{r−s}). Krein actually starts with a function H(r − s) which defines a (continuous) quadratic form, having certain positivity properties. Some remarks relating the differential system (7.10.1) to previous constructions will be in order. Assuming b(x) to be continuous, say, we have from (7.10.1) that

    (d/dx)( ūu − v̄v ) = ( −iμ̄ū + b̄v̄ ) u + ū ( iμu + bv ) − ( ūb? ) … = i( μ − μ̄ ) ūu,

that is, writing out the terms,

    (d/dx)( ūu − v̄v ) = ( −iμ̄ū + b̄v̄ ) u + ū ( iμu + bv ) − ( bū ) v − v̄ ( b̄u ) = i( μ − μ̄ ) ūu.
We deduce the invariance property that ūu − v̄v is constant if μ is real, and monotonic otherwise, the trivial solution excepted. Boundary problems such as those given by (7.1.3), (7.2.1) will therefore have a real spectrum. The solutions of (7.10.1), with fixed initial data, are of course not polynomials; nevertheless, the analogy can be developed on the basis that they are linear combinations, as integrals, of exponentials. If we define

    u†(x) = exp (−iμx) u(x),    (7.10.2)

corresponding to the expression λ^{−n} u_n(λ) in the polynomial case, we may rewrite the differential equations (7.10.1) as

    du†/dx = exp (−iμx) b v,    dv/dx = exp (iμx) b̄ u†.    (7.10.3)

Supposing the initial data to be u(0) = u†(0) = v(0) = 1, an integration gives

    u†(x) = 1 + ∫₀ˣ e^{−iμξ} b(ξ) v(ξ) dξ,    (7.10.4)

    v(x) = 1 + ∫₀ˣ e^{iμξ} b̄(ξ) u†(ξ) dξ.    (7.10.5)
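The invariance property noted above can be observed numerically: for real μ the quantity ūu − v̄v is constant along solutions of (7.10.1). The sketch below is our own; the choice of b(x), the interval, and the classical Runge-Kutta integrator are arbitrary illustrative choices.

```python
import cmath
import math

def flow(mu, b, x1, n=2000):
    # classical RK4 for the system (7.10.1):
    #   u' = iμu + b(x) v,   v' = conj(b(x)) u,   u(0) = v(0) = 1
    h = x1 / n
    def F(x, u, v):
        return 1j * mu * u + b(x) * v, b(x).conjugate() * u
    u, v = 1 + 0j, 1 + 0j
    for i in range(n):
        x = i * h
        k1u, k1v = F(x, u, v)
        k2u, k2v = F(x + h / 2, u + h / 2 * k1u, v + h / 2 * k1v)
        k3u, k3v = F(x + h / 2, u + h / 2 * k2u, v + h / 2 * k2v)
        k4u, k4v = F(x + h, u + h * k3u, v + h * k3v)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return u, v

# with u(0) = v(0) = 1 the invariant conj(u)u - conj(v)v starts at 0,
# and for real μ it stays there to integrator accuracy
u1, v1 = flow(2.0, lambda x: complex(math.sin(x), 0.3), 1.0)
```

For non-real μ the same quantity is strictly monotonic, in accordance with the derivative i(μ − μ̄)ūu computed above.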
These equations are soluble by infinite series, the Neumann series or method of successive substitution or approximation, yielding results of the type

    u†(x) = 1 + ∫₀ˣ a(x, ξ) e^{−iμξ} dξ,    (7.10.6)

    v(x) = 1 + ∫₀ˣ ā(x, ξ) e^{iμξ} dξ,    (7.10.7)

where the kernel a(x, ξ) is explicitly obtainable as a series of the form

    a(x, ξ) = b(ξ) + ∫₀^{x−ξ} b(ξ + η) b̄(η) dη + … .    (7.10.8)
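The pair (7.10.4-5) can be solved by the successive substitution just described, viewed as a fixed-point iteration on a grid. The sketch below is our own illustration (the trapezoidal quadrature and the iteration count are assumptions, not the book's); for μ = 0 and constant real b the system collapses to u = v = e^{bx}, which the iteration reproduces.

```python
import cmath

def cumtrap1(f, h):
    # 1 + cumulative trapezoidal integral of the sampled integrand f
    out = [1 + 0j]
    for k in range(1, len(f)):
        out.append(out[-1] + h * (f[k - 1] + f[k]) / 2)
    return out

def neumann_solve(mu, b, x1, n=400, iters=40):
    # successive substitution for the Volterra pair (7.10.4-5):
    #   u†(x) = 1 + ∫₀ˣ e^{-iμξ} b(ξ) v(ξ) dξ
    #   v(x)  = 1 + ∫₀ˣ e^{iμξ} conj(b(ξ)) u†(ξ) dξ
    h = x1 / n
    xs = [j * h for j in range(n + 1)]
    ut = [1 + 0j] * (n + 1)
    v = [1 + 0j] * (n + 1)
    for _ in range(iters):
        fu = [cmath.exp(-1j * mu * x) * b(x) * w for x, w in zip(xs, v)]
        fv = [cmath.exp(1j * mu * x) * b(x).conjugate() * w for x, w in zip(xs, ut)]
        ut, v = cumtrap1(fu, h), cumtrap1(fv, h)
    return ut[-1], v[-1]

# μ = 0 with constant real b = 1/2: expect u = v = e^{x/2}
u_end, v_end = neumann_solve(0.0, lambda x: complex(0.5), 1.0)
```

The iteration converges because the Volterra kernel is bounded on a finite interval, mirroring the convergence of the Neumann series.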
The kernel a(x, ξ) is analogous to the coefficients α_{nr} of the polynomial u_n(λ), and (7.10.6) is to be compared with (7.4.1). The analog of (7.4.4) will take the form of an integral equation. The difference equations (7.1.1-2) and the differential equations (7.10.1) are both linear in their parameters. If we sacrifice this feature
further generalizations may be made. In extension of (7.1.1-2) we may set up the system, with a_n = 1,

    u_{n+1} = (1 + iμc_n)(1 − iμc̄_n)^{−1} u_n + b_n v_n,    (7.10.9)

    v_{n+1} = b̄_n (1 + iμc_n)(1 − iμc̄_n)^{−1} u_n + v_n,    (7.10.10)

where the c_n have positive real part. If all the c_n are the same, we have essentially the case (7.1.1-2). If again all the b_n vanish, we have the situation considered in Chapters 1 and 2. As in the latter case, discrete variation of the type (7.10.9-10) may be interspersed in continuous variation of the type (7.10.1), yielding boundary problems of a similar type.
CHAPTER 8

Sturm-Liouville Theory

8.1. The Differential Equation

In a celebrated group of papers Sturm and Liouville treated boundary problems for a second-order ordinary differential equation, which we shall write

    (y′/r)′ + (λp + q) y = 0,    a ≤ x ≤ b,    ′ = d/dx,    (8.1.1)

where p, q and r are suitably smooth functions of x and λ is a scalar parameter; in addition, the functions p, r are commonly required to be positive, except possibly at the end-points x = a, b. The formal analogy between (8.1.1) and the three-term recurrence relation (4.1.1), which we may bring out better by writing the latter in the form of a second-order difference equation,
is reflected in a rather complete correspondence between the results for the two cases. As mentioned in Chapter 0, the analogy is also borne out by a common physical model, the vibrating string, either continuously loaded or else weightless and bearing discrete particles. The analogy is substantiated by various mathematical formalisms which unify the two cases. As such a formalism we shall treat in this chapter the first-order system

    u′ = r v,    v′ = −(λp + q) u.    (8.1.2-3)
In the first place, if we have a solution of (8.1.1) then a pair of solutions of (8.1.2-3) is given by y = u, y′/r = v; the reverse deduction may also be made if r > 0. Secondly, we may also consider (8.1.2-3) in the case that r vanishes over a subinterval of (a, b). Suppose in particular that (a, b) breaks up into a sequence of intervals (a, b₀), (b₀, a₁), (a₁, b₁), (b₁, a₂), …, in which alternately r = 0, or p = q = 0. The
relations (8.1.2-3) yield on integration recurrence relations of the form

    u(b₀) = u(a),
    v(b₀) − v(a) = −u(a) ( λ ∫_a^{b₀} p dx + ∫_a^{b₀} q dx ),
    u(a₁) − u(b₀) = v(b₀) ∫_{b₀}^{a₁} r dx,
    v(a₁) = v(b₀),

and so on. With the identifications suggested by these formulas it may be shown that the values of u at successive points satisfy a relation which is essentially a three-term recurrence formula of the above type.

The transition from (8.1.1) to (8.1.2-3) has also a bearing on the topic of quasidifferential equations, in which we minimize the differentiability requirements on the coefficients. Assuming that p, q ∈ C(a, b), r ∈ C′(a, b), and that r > 0, we may write (8.1.1) as

    y″/r − y′ r′/r² + (λp + q) y = 0,
a “solution” being sought in the class C″(a, b). This is an unnecessarily restrictive procedure; (8.1.1) retains sense without assuming r to be differentiable, or y to be twice differentiable, if we merely ask that y′/r be continuously differentiable, without attempting to differentiate it as a product. Such an interpretation is included in (8.1.2-3), with the assumptions that p, q, r ∈ C(a, b), solutions being sought in C′(a, b). We obtain a fairly wide framework if we consider the system (8.1.2-3) with the hypotheses that p, q, r be piecewise continuous, having at most a denumerable number of discontinuities which are simple jumps. This will include both the case of the three-term recurrence formula, and the case of (8.1.1) when the coefficients are continuous. However, the more general case in which the coefficients are integrable in the sense of Lebesgue provides an adequate foundation. In this chapter it will be assumed that the following hold:

(i) p, q, r ∈ L(a, b), where (a, b) is a finite real interval;

(ii) for a ≤ x ≤ b we have

    p ≥ 0,    r ≥ 0;    (8.1.4)

(iii) for any x, a < x < b,

    ∫_a^x p(t) dt > 0,    ∫_x^b p(t) dt > 0,    ∫_a^b r(t) dt > 0;    (8.1.5)

(iv) if for some x₁, x₂ we have

    ∫_{x₁}^{x₂} p(t) dt = 0,    (8.1.6)

then

    ∫_{x₁}^{x₂} | q(t) | dt = 0.    (8.1.7)

In the requirement (i), subject to p, q, r ∈ L(a, b), the additional restriction that (a, b) be finite is a matter of convenience. In (iii) the requirements ensure that the system (8.1.2-3) does actually involve the parameter λ at its end-points. Slightly greater generality still is to be obtained by replacing (8.1.2-3) by the Stieltjes integral equations

    u(x) = u(a) + ∫_a^x v dr₁,    v(x) = v(a) − ∫_a^x u d(λp₁ + q₁).    (8.1.8-9)

Basic requirements would be that p₁, q₁, and r₁ should be of bounded variation, p₁ and r₁ being nondecreasing; to simplify matters we could assume them all continuous. The system (8.1.2-3) is a particular case, where p₁, q₁, and r₁ are integrals of p, q, and r. Conversely, if p₁, q₁, and r₁ satisfy the above requirements, and are in addition absolutely continuous, we may take p, q, and r to be their derivatives and reason back to (8.1.2-3). This procedure is excluded in the event that p₁, q₁, or r₁ contains a singular, not absolutely continuous, component. In what follows we consider the system in differential form (8.1.2-3), subject to the assumptions (i)-(iv) above. A solution will be a pair of absolutely continuous functions u, v satisfying (8.1.2-3) almost everywhere. The reader who wishes may specialize this situation to that in which p, q, and r are piecewise continuous, (8.1.2-3) holding everywhere except at discontinuities of p, q, and r, or, more specially still, to that in which p, q, and r are continuous, and (8.1.2-3) hold everywhere.
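The alternating-interval reduction described in this section — u constant while v jumps where r = 0, v constant while u changes where p = q = 0 — is easily put in computational form. The sketch below is our own (q is taken to vanish); eliminating v from the two update rules gives u_{k+1} = u_k + (ℓ_k/ℓ_{k−1})(u_k − u_{k−1}) − λ ℓ_k m_k u_k, a three-term recurrence of the type discussed.

```python
def propagate(lam, masses, lengths, u0=0.0, v0=1.0):
    # masses[k]  = ∫ p dx over the k-th interval with r = 0   (q taken ≡ 0)
    # lengths[k] = ∫ r dx over the k-th interval with p = q = 0
    u, v = u0, v0
    for m, ell in zip(masses, lengths):
        v -= lam * m * u    # integrating (8.1.3): u is constant, v jumps
        u += ell * v        # integrating (8.1.2): v is constant, u changes
    return u, v

# many small equal beads approximate u'' = -λu with u(0) = 0, v(0) = 1
h, steps = 1.0e-3, 3142          # total length ≈ π
u_end, v_end = propagate(1.0, [h] * steps, [h] * steps)
```

With λ = 1 the limiting solution is u = sin x, so u and v at total length ≈ π should be near 0 and −1, up to the O(h) discretization error.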
8.2. Existence, Uniqueness, and Bounds for Solutions

We recall the standard fact that (8.1.2-3) has a unique solution for which u(a), v(a) have prescribed values. A solution such that u(a) = v(a) = 0, or such that u(x) = v(x) = 0 for any x in [a, b], must necessarily be the trivial solution given by u(x) = v(x) = 0. We need the observation that for a nontrivial solution there holds the inequality

    ∫_a^b p(x) | u(x) |² dx > 0.    (8.2.1)

In other words, the equality

    ∫_a^b p(x) | u(x) |² dx = 0    (8.2.2)

must imply that u ≡ v ≡ 0. If p(x) is positive and continuous, it is immediate that u ≡ 0, whence, if r(x) is also positive, we have v ≡ 0, from the first of (8.1.2-3). To derive the same result under the assumptions (i)-(iv) of Section 8.1, we may argue in the first place that v must be constant. For it follows from (8.1.3) that

    v(x) − v(a) = −λ ∫_a^x p(t) u(t) dt − ∫_a^x q(t) u(t) dt,    (8.2.3)
and both the integrals on the right must vanish. In the case of the first such integral we have, by the Cauchy inequality,

    | ∫_a^x p(t) u(t) dt |² ≤ ∫_a^x p(t) dt ∫_a^x p(t) | u(t) |² dt,

which vanishes by (8.2.2). Suppose, if possible, that the last integral in (8.2.3) does not vanish for some x. We then have

    ∫_a^x | q(t) u(t) | dt > 0.

Hence there must exist arbitrarily small intervals (x₁, x₂) such that

    ∫_{x₁}^{x₂} | q(t) u(t) | dt > 0,

and so, since u(t) is continuous and so bounded, such that

    ∫_{x₁}^{x₂} | q(t) | dt > 0,    (8.2.4)

while | u(t) | > K for some t, x₁ ≤ t ≤ x₂, and some K > 0. By taking the interval (x₁, x₂) small enough, we may ensure that | u(t) | > ½K in (x₁, x₂), again since u(t) is continuous in [a, b]. From (8.2.4) it follows, by (iv) of Section 8.1, that

    ∫_{x₁}^{x₂} p(t) dt > 0,

and since | u(t) | > ½K in this interval,

    ∫_{x₁}^{x₂} p(t) | u(t) |² dt ≥ ¼ K² ∫_{x₁}^{x₂} p(t) dt > 0,

in contradiction to (8.2.2). Hence the right of (8.2.3) vanishes and v(x) is constant.

To complete the proof, suppose first that this constant is zero. Supposing if possible that u ≢ 0, we have from (8.1.2) that u is constant, and so a non-zero constant. In this case (8.2.2) is impossible in view of (8.1.5). Supposing again that v is a non-zero constant, it follows from (8.1.2) that u is monotonic, and does not vanish at both x = a, x = b, in view of the last of (8.1.5). Hence, by the continuity of u, there is an ε > 0 such that u(t) has a positive lower bound in at least one of the intervals (a, a + ε), (b − ε, b), which again conflicts with (8.2.2), in view of (8.1.5). Hence both u, v must vanish identically, as was to be proved.

For fixed initial values u(a), v(a), let us now consider the dependence of the solution of (8.1.2-3) upon λ. It is a standard result that the dependence of the solution on λ is analytic for all complex λ. Writing the solution u(x, λ), v(x, λ), we have that these are entire functions of λ. Certain conclusions can be drawn from the fact that they are entire functions of order at most ½. More precisely, there hold bounds of the form

    u(x, λ), v(x, λ) = O{ exp ( const. √|λ| ) }.    (8.2.5)
To prove this we use the fact, a consequence of (8.1.2-3), that

    | (d/dx) { |λ| |u|² + |v|² } | ≤ 2 | uv | ( |λ| r + |λ| p + | q | ).

Since, if |λ| ≠ 0,

    2 | uv | ≤ { |λ| |u|² + |v|² } / √|λ|,

we deduce that

    | (d/dx) log { |λ| |u|² + |v|² } | ≤ √|λ| ( r + p ) + | q | / √|λ|,

and (8.2.5) follows on integrating with respect to x and taking exponentials.
8.3. The Boundary Problem

Preliminary conclusions can now be drawn concerning the eigenvalue problem in which we fix real numbers α, β and ask for nontrivial solutions of (8.1.2-3) such that

    u(a) cos α − v(a) sin α = 0,    (8.3.1)

    u(b) cos β − v(b) sin β = 0.    (8.3.2)

To treat this problem we choose a solution of (8.1.2-3) such that

    u(a) = sin α,    v(a) = cos α,    (8.3.3)

so that (8.3.1) is satisfied. Writing u(x, λ), v(x, λ) for this solution, we find that the eigenvalues are then the roots of

    u(b, λ) cos β − v(b, λ) sin β = 0.    (8.3.4)
We first verify that they are all real. Supposing that λ is a complex eigenvalue, we have that λ̄ will also be an eigenvalue. For since the coefficients p, q and r in (8.1.2-3) are all real-valued, u(x, λ̄) is the complex conjugate of u(x, λ), and similarly for v, so that if (8.3.2) holds for λ, it will also hold for λ̄. At this point we need the identity, of Lagrange type,

    u(x, λ) v(x, μ) − u(x, μ) v(x, λ) = (λ − μ) ∫_a^x p(t) u(t, λ) u(t, μ) dt,    (8.3.5)

the analog of (4.2.1), which it in fact includes.
For the proof we note that, by (8.1.2-3),

    (d/dx) { u(x, λ) v(x, μ) − u(x, μ) v(x, λ) }
      = r v(x, λ) v(x, μ) − (μp + q) u(x, λ) u(x, μ) − r v(x, μ) v(x, λ) + (λp + q) u(x, μ) u(x, λ)
      = (λ − μ) p(x) u(x, λ) u(x, μ).

The result (8.3.5) now follows on integration, using the fact that the left of (8.3.5) vanishes when x = a, in view of (8.3.3), which holds for all λ. Taking in particular μ = λ̄, we have

    u(x, λ) v(x, λ̄) − u(x, λ̄) v(x, λ) = (λ − λ̄) ∫_a^x p(t) | u(t, λ) |² dt.    (8.3.6)
Putting x = b, the left-hand side vanishes by the assumed boundary condition (8.3.2), and so

    (λ − λ̄) ∫_a^b p(t) | u(t, λ) |² dt = 0.

As proved in Section 8.2, the integral can only vanish in the case of the trivial solution, which is excluded by (8.3.1). Hence λ must be real. We have the essentials of the proof of the following
Theorem 8.3.1. Subject to the assumptions (i)-(iv) of Section 8.1, the boundary problem (8.1.2-3), (8.3.1-2) has at most a denumerable set of eigenvalues λ₀, λ₁, …, all of which are real, and which are such that

    Σ_n | λ_n |^{−1/2−ε} < ∞    (8.3.7)

for every ε > 0. The eigenfunctions u(x, λ_r) are orthogonal according to

    ∫_a^b p(x) u(x, λ_r) u(x, λ_s) dx = 0,    r ≠ s.    (8.3.8)

The eigenvalues are the zeros of the entire function on the left of (8.3.4). We have just shown that this function does not vanish when λ is complex. Hence it does not vanish identically, and hence its zeros form a denumerable set, at most, with no finite limit. They also satisfy (8.3.7), since this function is of order at most ½. The orthogonality (8.3.8) follows from (8.3.5) on taking x = b, λ = λ_r, μ = λ_s, and using (8.3.2).
When we come to the expansion theorem it will be convenient to use normalized versions of the eigenfunctions. If we normalize so that

    ∫_a^b p(x) u²(x, λ_n) dx = 1,    (8.3.9-10)

we may replace (8.3.8) by the orthonormal relations

    ∫_a^b p(x) u(x, λ_r) u(x, λ_s) dx = δ_{rs}.    (8.3.11)
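The characterization of the eigenvalues as the real roots of u(b, λ) cos β − v(b, λ) sin β invites a shooting computation. The sketch below is our own, for the special constant-coefficient case p = r = 1, q = 0, α = β = 0 on (0, π), where (8.1.2-3) reduce to u″ = −λu and the eigenvalues are λ_n = (n + 1)²:

```python
import math

def shoot(lam, b=math.pi, n=2000):
    # integrate u' = v, v' = -λu (the case p = r = 1, q = 0 of (8.1.2-3))
    # from the initial data (8.3.3) with α = 0: u(a) = 0, v(a) = 1; RK4 steps
    h = b / n
    u, v = 0.0, 1.0
    for _ in range(n):
        def F(y):
            return (y[1], -lam * y[0])
        k1 = F((u, v))
        k2 = F((u + h / 2 * k1[0], v + h / 2 * k1[1]))
        k3 = F((u + h / 2 * k2[0], v + h / 2 * k2[1]))
        k4 = F((u + h * k3[0], v + h * k3[1]))
        u += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return u  # the boundary function u(b, λ); eigenvalues are its real zeros

def bisect(f, lo, hi, it=60):
    # locate a sign change of f by bisection
    for _ in range(it):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam0 = bisect(shoot, 0.5, 2.0)   # lowest eigenvalue of u'' = -λu, u(0) = u(π) = 0
lam1 = bisect(shoot, 3.0, 5.0)   # next eigenvalue
```

The reality of the located roots, and the orthogonality (8.3.8) of the corresponding solutions, are the content of Theorem 8.3.1.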
8.4. Oscillatory Properties

Since the boundary problem (8.3.1-2) prescribes the values of the ratio u : v at x = a, b, an important role is played by the discussion of the functions u/v, v/u, and θ = tan⁻¹(u/v) in respect of their dependence on x and λ. The functions are also connected in an obvious manner with the zeros of u. The dependence of u/v on x is characterized by a Riccati-type differential equation. We assume λ real.

Theorem 8.4.1. For a nontrivial solution of (8.1.2-3), the functions u/v, v/u satisfy, when finite, the differential equations

    (u/v)′ = r + (λp + q)(u/v)²,    (8.4.1)

    (v/u)′ = −r(v/u)² − (λp + q).    (8.4.2)

In particular, as x increases, u/v cannot tend to zero from above; as x decreases, u/v cannot tend to zero from below.

On differentiating u/v, v/u and using (8.1.2-3) we get at once (8.4.1-2). We have only to verify the last statements. Suppose if possible that u/v → 0 as x increases, say, as x → x₂, u/v being positive in a left-neighborhood of x₂. Then v/u → +∞ as x → x₂ − 0. Suppose that v/u is finite for x₁ ≤ x < x₂. Noting that, by (8.4.2), (v/u)′ ≤ −(λp + q), since r ≥ 0, and integrating over (x₁, x), where x₁ < x < x₂, we deduce that

    v(x)/u(x) − v(x₁)/u(x₁) ≤ −∫_{x₁}^{x} (λp + q) dt.
Making x → x₂, the left tends to +∞, since v/u → +∞, while the right remains finite, since p, q ∈ L(a, b), giving a contradiction. The proof of the final statement in the theorem is analogous. The dependence of u/v, v/u for fixed x on varying real λ is monotonic. We assume u(a, λ) = u(a), v(a, λ) = v(a) fixed.
Theorem 8.4.2. If v(x, λ) ≠ 0,

    (∂/∂λ) { u(x, λ)/v(x, λ) } = v⁻²(x, λ) ∫_a^x p(t) u²(t, λ) dt,    (8.4.3)

while if u(x, λ) ≠ 0,

    (∂/∂λ) { v(x, λ)/u(x, λ) } = −u⁻²(x, λ) ∫_a^x p(t) u²(t, λ) dt.    (8.4.4)

In particular, for a nontrivial solution, u(b, λ)/v(b, λ) and v(b, λ)/u(b, λ) are respectively strictly increasing and strictly decreasing functions of λ when finite. We use the result [cf. (4.2.3)]

    (∂u/∂λ)(x, λ) v(x, λ) − u(x, λ) (∂v/∂λ)(x, λ) = ∫_a^x p(t) { u(t, λ) }² dt.    (8.4.5)

This follows from (8.3.5) on dividing by (λ − μ), making μ → λ, and using l'Hôpital's rule. From this, (8.4.3-4) follow immediately. As regards the last statement in the theorem, it was shown in Section 8.2 that the integral on the right of (8.4.3-4) is not zero if the solution u, v is nontrivial, taking here λ to be real and x = b.
This follows from (8.3.5) on dividing by (A - p), making p + A, and using 1’Hbpital’s rule. From this (8.4.3-4) follow immediately. As regards the last statement in the theorem, it was shown in Section 8.2 that the integral on the right of (8.4.3-4) is not zero, if the solution u, w is nontrivial, and taking here A to be real and x = b. T o avoid complications with the infinities of u/v, v / u , we introduce the angular variable 8 = tan-l(u/v), or more precisely, 8 = arg {w
+ iu}.
(8.4.6)
We assume in addition that u, v have fixed initial values for x = a, and all A, given by (8.3.3). Since u, v are functions of x, A, so also is 8, and we define initially
qa,A)
= a,
(8.4.7)
in view of (8.3.3). For other x and A, 8(x, A) is given by (8.4.6) except for an arbitrary multiple of 27r, since u and w cannot vanish simultaneously. This multiple of 27r is to be fixed so that 8(x, A) satisfies (8.4.7) and is continuous in x and A. Since the (x, A)-region, namely, a x b, -m < A < 00, is simply-connected, this defines e(x, A) uniquely. T h e following properties of 8(x, A) are contained in previous results.
< <
Theorem 8.4.3. (i) θ(x, λ) satisfies the differential equation, with respect to x,

    θ′ = r cos² θ + (λp + q) sin² θ.    (8.4.8)

(ii) As x increases, θ cannot tend to a multiple of π from above; as x decreases, θ cannot tend to a multiple of π from below. (iii) As λ increases, for fixed x, θ is nondecreasing; in particular, θ(b, λ) is a strictly increasing function of λ.

As regards the differential equation (8.4.8) we have from (8.4.6) that

    θ′ = (u′v − v′u)/(u² + v²),

which gives (8.4.8), since tan θ = u/v. The statement (ii) follows from the last part of Theorem 8.4.1 and likewise (iii) from the last part of Theorem 8.4.2. From (8.4.6) it is evident that the zeros of u(x, λ) are the same as the occasions on which θ(x, λ) is a multiple of π. Considering particularly the zeros of u(x, λ) for fixed λ as x increases from a to b, we see that zeros of u will occur as θ increases through, or increases to, a multiple of π; by (ii) of the theorem, it is not possible for θ to decrease to a multiple of π as x increases. Supposing that 0 ≤ α < π, as x increases from a to b, θ may reach in succession a finite number of the values π, 2π, … . Since it cannot decrease to a multiple of π, it reaches multiples of π in ascending order. It reaches θ = 0 only insofar as it starts there, and cannot reach negative values at all. It may exceptionally happen that θ(x, λ) is a multiple of π, nπ say, for some x in an interval in which r ≡ 0. In this case, by (8.4.8), θ(x, λ) = nπ throughout this interval; likewise u ≡ 0 throughout this interval. The term zero may occasionally bear the interpretation of an interval of zeros. With this qualification we have

Theorem 8.4.4. As λ increases, the zeros of u(x, λ) move to the left, except for a zero at x = a in the event that α = 0, or for an interval of zeros containing x = a, in the event that r ≡ 0 in a right-neighborhood of a. Suppose first that 0 < α < π, and that for some x′, λ′, we have θ(x′, λ′) = nπ for some positive integral n. By Theorem 8.4.3 (iii) we then have θ(x′, λ″) ≥ nπ for all λ″ > λ′. In fact, we must have
θ(x′, λ″) > nπ. To see this we refer back to (8.4.3); since 0 < α < π, we have u(a, λ) ≠ 0, and so, by the first of (8.1.5), the integral on the right of (8.4.3) is positive for x = x′. Hence u(x′, λ)/v(x′, λ) is a strictly increasing function of λ, when finite; from (8.4.4) we see similarly that its reciprocal is strictly decreasing when finite. Hence θ(x′, λ) is strictly increasing as a function of λ, so that θ(x′, λ″) > nπ if λ″ > λ′. Since θ(a, λ″) = α, where α < π, it follows that there is a root of the equation θ(x, λ″) = nπ such that a < x < x′, as was to be proved.

It remains to deal with the case that α = 0, or that u(a, λ′) = 0. It will again be sufficient to show that

    ∫_a^{x′} p(t) u²(t, λ′) dt > 0,

where as before x′ is such that θ(x′, λ′) = nπ, n > 0. Suppose, on the contrary, that the integral is zero. Then, as shown in Section 8.2, v(x) must be constant in a ≤ x ≤ x′, a non-zero constant since u(a, λ′) = 0 and so v(a, λ′) ≠ 0. However, when θ = ½π we have v = 0, and so θ cannot reach the value ½π in (a, x′); it therefore cannot reach a value nπ > 0 in (a, x′], contrary to hypothesis. This completes the proof of the theorem.

We now consider the boundary problem (8.3.1-2), taking it that

    0 ≤ α < π,    0 < β ≤ π,    (8.4.9)

and prove the “oscillation theorem,” according to which the eigenvalues may be uniquely associated with the numbers of zeros of the eigenfunctions.

Theorem 8.4.5. The eigenvalues λ_n of the problem (8.1.2-3), (8.3.1-2) form a sequence λ₀ < λ₁ < ⋯, possibly finite, such that

    θ(b, λ_n) = β + nπ.    (8.4.10)

The eigenfunctions u(x, λ_n) have, with a suitable interpretation, just n zeros in (a, b). The interpretation in question relates to possible intervals of zeros in the event that r = 0 throughout an interval. Two zeros x₁, x₂ of u(x, λ) such that ∫_{x₁}^{x₂} r(t) dt = 0 are not to be reckoned as distinct. If either x₁ = a or x₂ = b they are not to be counted at all; we confine attention to zeros in the interior of (a, b).
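The phase equation (8.4.8) also gives a practical way of counting zeros in the spirit of Theorem 8.4.5: integrate θ and count its passages through multiples of π interior to (a, b). The sketch below is our own, for the constant-coefficient case r = p = 1, q = 0, α = 0 on (0, π), whose n-th eigenfunction sin((n + 1)x) has n interior zeros:

```python
import math

def count_interior_zeros(lam, b=math.pi, n=20000):
    # integrate the phase equation (8.4.8) with r = p = 1, q = 0, θ(0) = α = 0:
    #   θ' = cos²θ + λ sin²θ
    # interior zeros of u correspond to θ passing a multiple of π inside (0, b)
    h = b / n
    th = 0.0
    crossings = 0
    for i in range(n):
        def F(t):
            return math.cos(t) ** 2 + lam * math.sin(t) ** 2
        k1 = F(th)
        k2 = F(th + h / 2 * k1)
        k3 = F(th + h / 2 * k2)
        k4 = F(th + h * k3)
        new = th + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        # exclude the final step, so a passage exactly at x = b is not counted
        if i < n - 1 and math.floor(new / math.pi) > math.floor(th / math.pi):
            crossings += 1
        th = new
    return crossings

counts = [count_interior_zeros((k + 1) ** 2) for k in range(3)]  # λ = 1, 4, 9
```

The counts 0, 1, 2 for the first three eigenvalues illustrate the indexing of Theorem 8.4.5.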
Since the case of finite orthogonal polynomials, whose zeros are eigenvalues of a certain boundary problem, is included in the assumptions of Section 8.1, we cannot assert the existence of an infinity of eigenvalues in Theorem 8.4.5; in degenerate cases there may be none at all. Supplementary conditions, ensuring the existence of an infinity of eigenvalues, will be given in Theorem 8.4.6. The result of Theorem 8.4.5 is that if there are any eigenvalues, they can be arranged in increasing order, the corresponding eigenfunctions having 0, 1, … zeros in the manner just described.

Since the function θ(b, λ) is continuous and strictly increasing in λ, the equation (8.4.10) will have exactly one real root λ_n for a certain sequence of n values. We have to show that the lowest member of this sequence, if nonempty, is n = 0. We prove this by showing that θ(b, λ) → 0 as λ → −∞. Noting that θ(a, λ) = α ≥ 0, and that as x increases θ(x, λ) cannot decrease to 0, or decrease from 0, by Theorem 8.4.3 (ii), we have that θ(x, λ) ≥ 0 for all real λ and a ≤ x ≤ b. Since also θ(x, λ) is nondecreasing as a function of λ, we have that there exists the limit θ(x, −∞) = lim θ(x, λ) as λ → −∞, and furthermore that θ(x, −∞) ≥ 0. We have to prove that θ(b, −∞) = 0. We have in particular that θ(b, λ) is bounded for λ ≤ 0, lying between θ(b, 0) and 0. Integrating (8.4.8) over (a, b) we have that

    θ(b, λ) − α = ∫_a^b { r cos² θ + λp sin² θ + q sin² θ } dx

is uniformly bounded for λ ≤ 0. Since q, r ∈ L(a, b), it follows that |λ| ∫_a^b p sin² θ dx is uniformly bounded for λ ≤ 0. We draw the following conclusion, to be used several times in the proof of Theorem 8.4.5. If the interval (x₁, x₂) is such that

    ∫_{x₁}^{x₂} p(t) dt > 0,    (8.4.11)

then there is an x₃, x₁ < x₃ < x₂, such that

    | sin θ | ≤ const. |λ|^{−1/2},    (8.4.12)

where the constant may depend on x₁, x₂ but not on λ, though x₃ may vary with λ. In other words, if it is known that sin θ is bounded from zero in (x₁, x₂), for λ ≤ 0, independently of λ, then we must have

    ∫_{x₁}^{x₂} p(t) dt = 0.    (8.4.13)
A second property needed is that θ(x, −∞) can only increase continuously. Taking λ < 0 in (8.4.8), we have

    θ′ ≤ r + | q |.

Integrating over (x₄, x₅), where x₄ < x₅, we have, if λ < 0,

    θ(x₅, λ) − θ(x₄, λ) ≤ ∫_{x₄}^{x₅} ( r + | q | ) dx.    (8.4.14)

Making λ → −∞, we deduce that

    θ(x₅, −∞) − θ(x₄, −∞) ≤ ∫_{x₄}^{x₅} ( r + | q | ) dx.    (8.4.15)

We proceed to the proof that θ(b, −∞) = 0. We first observe that θ(x†, −∞) < ½π for some x† with a ≤ x† < b. To see this we take an a′ such that θ(x, 0) < π − η for a ≤ x ≤ a′ and some η > 0,
which is possible since θ(a, 0) = α < π; note that ∫_a^{a′} p(t) dt > 0 by (8.1.5), and apply the conclusion (8.4.11-12). We see that for large λ < 0, [a, a′] contains an x for which sin θ is arbitrarily small. Since 0 ≤ θ(x, λ) ≤ θ(x, 0) < π − η, this means that θ is arbitrarily small, for large λ < 0 and some x in [a, a′]. Hence for large λ < 0 there is at any rate an x† for which θ(x†, λ) < ½π, and so θ(x†, −∞) < ½π, as was to be proved.

In the next step we prove that θ(x, −∞) ≤ ½π for x† ≤ x ≤ b. Let x₆ denote the upper bound of those x†† ≤ b with the property that θ(x, −∞) < ½π for x† ≤ x ≤ x††. Suppose first that x₆ < b. We assert that θ(x₆, −∞) = ½π. For if θ(x₆, −∞) < ½π, it would follow from (8.4.15) that θ(x, −∞) < ½π for x in some right-neighborhood of x₆; for this purpose we apply (8.4.15) with x₆, x in place of x₄, x₅, where x > x₆ and is suitably close to x₆. Similarly, if θ(x₆, −∞) > ½π, it would follow that θ(x, −∞) > ½π for x in some left-neighborhood of x₆, as we see by applying (8.4.15) with x, x₆ in place of x₄, x₅ and taking x < x₆ and suitably close to it. Both of these situations conflict with the definition of x₆, and we conclude that θ(x₆, −∞) = ½π. In the event that x₆ = b, we have θ(x₆, −∞) ≤ ½π, since otherwise θ(x, −∞) > ½π in a left-neighborhood of x₆. We now show that it is in fact impossible that θ(x₆, −∞) = ½π. Supposing the latter to hold, we choose an x₇ < x₆ such that

    ∫_{x₇}^{x₆} ( r + | q | ) dx < ⅛π.    (8.4.16)
It then follows from (8.4.15) that θ(x₆, −∞) − θ(x₇, −∞) < ⅛π, and so θ(x₇, −∞) > ⅜π. By the same argument, we have in fact θ(x, −∞) > ⅜π for x₇ ≤ x ≤ x₆, and indeed θ(x, λ) > ⅜π for the same x and all real λ, since θ(x, λ) is nondecreasing in λ. Since θ(x₇, −∞) < ½π, by the definition of x₆, we have θ(x₇, λ) < ½π for large negative λ, say, λ < λ′. By (8.4.14), with x₇, x in place of x₄, x₅, it then follows that θ(x, λ) < ⅝π for λ < λ′ and x₇ ≤ x ≤ x₆. Hence, for such x and λ, we have ⅜π < θ(x, λ) < ⅝π, and so sin² θ > ½. By the argument of (8.4.11-13) we deduce that ∫_{x₇}^{x₆} p(t) dt = 0, and so also ∫_{x₇}^{x₆} | q(t) | dt = 0, so that p, q vanish almost everywhere in (x₇, x₆). We may therefore replace (8.4.8) in this interval by θ′ = r cos² θ, or (tan θ)′ = r, whence

    tan θ(x, λ) − tan θ(x₇, λ) = ∫_{x₇}^{x} r(t) dt,

for x₇ ≤ x < x₆. For λ < λ′, tan θ(x₇, λ) will be finite, since θ(x₇, λ) < ½π, and so tan θ(x, λ) remains finite as x → x₆ from below. Hence θ(x₆, λ) < ½π for λ < λ′, giving a contradiction. We deduce that θ(x, −∞) ≤ ½π for x† ≤ x ≤ b, so that in particular θ(b, −∞) ≤ ½π. Suppose if possible that θ(b, −∞) = η′ > 0. Applying (8.4.14-15) as previously, we choose x₈ < b such that

    ∫_{x₈}^{b} ( r + | q | ) dx < ½ η′,

and deduce that for x₈ ≤ x ≤ b and large negative λ, say, λ < λ″, there holds ½η′ < θ(x, λ) < ½π + ½η′, so that sin² θ > sin² (½η′) > 0. By the argument of (8.4.11-13), this implies that ∫_{x₈}^{b} p(t) dt = 0, which conflicts with (8.1.5). Hence θ(b, −∞) = 0, and the proof of Theorem 8.4.5 is complete. Finally, we note conditions which exclude the event that u(b, λ), v(b, λ) are polynomials in λ and ensure the existence of an infinity of eigenvalues.
Theorem 8.4.6. In addition to the assumptions (i)-(iv) of Section 8.1, let there be an infinite sequence ξ₀ < ξ₁ < ⋯ of points of (a, b) such that

    ∫_{ξ_{2k}}^{ξ_{2k+1}} p(t) dt > 0,    k = 0, 1, … ,    (8.4.17)

and

    ∫_{ξ_{2k+1}}^{ξ_{2k+2}} r(t) dt > 0,    k = 0, 1, … .    (8.4.18)

Then the problem (8.1.2-3), (8.3.1-2) has an infinity of eigenvalues.
Then the problem (8.1.2-3), (8.3.1-2) has an infinity of eigenvalues.

For the proof it will be sufficient to show that θ(b, λ) becomes arbitrarily large as λ → +∞, or again that as x increases from a to b, θ(x, λ) increases through a number of multiples of π which increases indefinitely with λ. It will be convenient to prove this instead for a modified phase, the variable θ₁ = θ₁(x, λ) defined, for λ > 0, by tan θ₁ = λ^{1/2} u/v, the arbitrary additive multiple of π being fixed by |θ − θ₁| < ½π. Since tan θ₁ = λ^{1/2} tan θ, the two variables θ, θ₁ will equal a multiple of π together, and will increase and decrease together. It will thus be sufficient to show that, as x increases from a to b, θ₁(x, λ) increases through a number of multiples of π which tends to infinity with λ. For this purpose we set up the differential equation satisfied by θ₁. We have

θ₁′ sec² θ₁ = λ^{1/2} (u/v)′ = λ^{1/2} (u′v − uv′)/v² = λ^{1/2} r + λ^{−1/2} (λp + q) tan² θ₁.
Hence, for λ > 0,

θ₁′ = λ^{1/2} r cos² θ₁ + λ^{1/2} p sin² θ₁ + λ^{−1/2} q sin² θ₁.   (8.4.19)

It follows that, for λ > 0,

θ₁′ ≥ −λ^{−1/2} |q|,   (8.4.20)
and so, for λ ≥ 1,

θ₁(x, λ) − θ₁(a, λ) ≥ −∫_a^b |q(t)| dt.   (8.4.21)
Hence θ₁(b, λ) − θ₁(a, λ) is bounded from below, uniformly for λ ≥ 1; let us suppose if possible that it is bounded from above, uniformly for λ ≥ 1, say, by

θ₁(b, λ) − θ₁(a, λ) ≤ c₁.   (8.4.22)

In showing that this is impossible, we show first that this hypothesis would imply that θ₁(x, λ) is of bounded variation over (a, b), uniformly for λ ≥ 1. Since the left of (8.4.22) may be written

λ^{1/2} ∫_a^b (r cos² θ₁ + p sin² θ₁) dx + λ^{−1/2} ∫_a^b q sin² θ₁ dx,
we deduce that, for λ ≥ 1,

λ^{1/2} ∫_a^b (r cos² θ₁ + p sin² θ₁) dx ≤ c₁ + ∫_a^b |q| dx = c₂,   (8.4.23)

say. Hence

∫_a^b |θ₁′| dx ≤ c₂ + ∫_a^b |q| dx = c₃,   (8.4.24)
say, so that θ₁ is of bounded variation uniformly for λ ≥ 1. We now compare (8.4.17) with the fact that, by (8.4.23),

λ^{1/2} ∫_{ξ_{2k}}^{ξ_{2k+1}} p sin² θ₁ dx ≤ c₂.

Writing η_{2k} for the left of (8.4.17), we deduce that

sin² θ₁(x, λ) ≤ λ^{−1/2} c₂/η_{2k}   (8.4.25)

for at any rate one x ∈ [ξ_{2k}, ξ_{2k+1}]. With a similar notation for the left of (8.4.18), we have in the same way

cos² θ₁(x, λ) ≤ λ^{−1/2} c₂/η_{2k+1}   (8.4.26)

for at any rate one x ∈ [ξ_{2k+1}, ξ_{2k+2}]. Making λ large, we may ensure that [ξ₀, ξ₁] contains an x such that θ₁ is arbitrarily close to a multiple of π, and that [ξ₁, ξ₂] contains an x such that θ₁ is arbitrarily close to an odd multiple of ½π, and so on alternately. Hence by taking λ large, the variation of θ₁(x, λ) over (a, b) can be made as large as we please, and we have a contradiction. Hence θ₁(b, λ) − θ₁(a, λ) can be made arbitrarily large, and θ₁(x, λ) increases through an arbitrarily large number of multiples of π as x goes from a to b, which completes the proof of Theorem 8.4.6.
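The oscillation count used in this proof is also the basis of practical eigenvalue computation: θ(b, λ) passes through one further multiple of π for each eigenvalue passed. As a minimal numerical sketch (ours, not the book's), take the classical case r = p = 1, q = 0 on (0, 1) with Dirichlet conditions, so that θ(a, λ) = 0, the phase equation is θ′ = cos² θ + λ sin² θ, and the eigenvalues are λₙ = (n + 1)²π²; the function names and step counts below are illustrative choices.

```python
import math

def pruefer_phase(lam, a=0.0, b=1.0, n_steps=2000):
    """Integrate theta' = cos^2(theta) + lam*sin^2(theta) by classical RK4,
    starting from theta(a) = 0 (the Dirichlet condition alpha = 0)."""
    h = (b - a) / n_steps
    theta = 0.0
    f = lambda th: math.cos(th) ** 2 + lam * math.sin(th) ** 2
    for _ in range(n_steps):
        k1 = f(theta)
        k2 = f(theta + 0.5 * h * k1)
        k3 = f(theta + 0.5 * h * k2)
        k4 = f(theta + h * k3)
        theta += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return theta

def count_eigenvalues_below(lam):
    """Number of eigenvalues below lam: the number of complete multiples of
    pi that theta(b, lam) has passed through (theta is increasing in lam)."""
    return int(pruefer_phase(lam) / math.pi)
```

For example, at λ = π² the phase θ(1, λ) equals π exactly, and below λ = 50 there are two eigenvalues (π² and 4π²), which the counter recovers.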
8.5. An Interpolatory Property

In this and the next sections we consider the eigenfunction expansion, the expansion of a function from some general class in a series of the u(x, λₙ), n = 0, 1, ..., in extension of the Fourier sine or cosine series. Of the many possible proofs of this expansion, we select that due to Prüfer, which proceeds entirely in the real domain and makes no use of the theory of integral equations, or its equivalents. It rests on an interpolatory property of the eigenfunctions, a special case of a group of properties which have interest independently of the eigenfunction expansion. Defining uₙ(x) by (8.3.9-10), the property in question is

Theorem 8.5.1. Let the boundary problem (8.1.2-3), (8.3.1-2) admit the eigenvalues λ₀, λ₁, ..., λₘ, ..., for some m > 0. Then an expression of the form

w(x) = Σ_{n=0}^{m−1} aₙ uₙ(x),   (8.5.1)
where the aₙ are real and not all zero, cannot vanish at all the zeros of uₘ(x).

We assume that the eigenvalues are arranged in increasing order, and that additional conditions, such as those given by Theorem 8.4.6, have ensured the existence of at least m + 1 eigenvalues; it is not, however, necessary at the moment that there should be an infinity of eigenvalues. A more general result, the Čebyšev property, due in this case to Sturm, asserts that w(x) as given by (8.5.1) cannot have as many as m zeros; in our present case certain conventions must be set up as to when zeros are regarded as distinct. So far as the expansion theorem is concerned, however, the more restricted result will suffice, that the zeros of w(x) cannot include all the zeros of uₘ(x) in a < x < b. The proof depends on the following lemma, also needed for the proof of the eigenfunction expansion.
Lemma 8.5.2. Let the real-valued absolutely continuous functions g(x), h(x) satisfy

g′ = rh,   (8.5.2)

g(a) cos α − h(a) sin α = 0,   g(b) cos β − h(b) sin β = 0,   (8.5.3)

and let g vanish at all the zeros of uₘ in (a, b). Then

λₘ ∫_a^b p g² dx ≤ −∫_a^b (g h′ + q g²) dx.   (8.5.4)

Completing the notation (8.3.10) for the normalized eigenfunctions, we write vₘ(x) for the correspondingly normalized v(x, λₘ), so that

uₘ′ = r vₘ,   vₘ′ = −(λₘ p + q) uₘ.   (8.5.5-6)
Consider the identity

{(g/uₘ)(h uₘ − g vₘ)}′ = r (h − g vₘ/uₘ)² + (λₘ p + q) g² + g h′,   (8.5.7)

which may be verified by carrying out the differentiation and using (8.5.2) and (8.5.5-6). We obtain the required result in a formal way if we integrate (8.5.7) over (a, b), noting that the first term on the right is non-negative, and that integrating the term on the left gives zero, since

(g/uₘ)(h uₘ − g vₘ) → 0   (8.5.8)

as x → a and as x → b. To justify this in more detail we consider the integral over an interval of the form (ξ + ε, η − ε), where ξ, η are consecutive zeros of uₘ, so that uₘ(ξ) = uₘ(η) = 0, uₘ(x) ≠ 0 for ξ < x < η. Since the first term on the right of (8.5.7) is non-negative, we have

[(g/uₘ)(h uₘ − g vₘ)]_{ξ+ε}^{η−ε} ≥ ∫_{ξ+ε}^{η−ε} {(λₘ p + q) g² + g h′} dx   (8.5.9)

for small ε > 0. We wish to make ε → 0, and assert that (8.5.8) is also true when x → ξ + 0, x → η − 0, so that the left of (8.5.9) yields zero as ε → 0. By hypothesis, we have uₘ → 0, g → 0 as x → ξ + 0, and so in order to prove (8.5.8) for x → ξ + 0 it will be sufficient to show that g/uₘ is bounded. Since uₘ(ξ) = 0, g(ξ) = 0 we have

uₘ(ξ + ε) = ∫_ξ^{ξ+ε} r(t) vₘ(t) dt,   g(ξ + ε) = ∫_ξ^{ξ+ε} r(t) h(t) dt.

Since vₘ(ξ) cannot vanish with uₘ(ξ), and since vₘ, h are continuous, we have for small ε inequalities of the form

|g(ξ + ε)| ≤ c |uₘ(ξ + ε)|,

so that g/uₘ is bounded, and (8.5.8) holds as x → ξ + 0. In an entirely
similar way, the result may be proved for x → η − 0. Hence making ε → 0 in (8.5.9) we have

0 ≥ ∫_ξ^η {(λₘ p + q) g² + g h′} dx.   (8.5.10)

To complete the proof we observe that this is also true when ξ = a and η is the smallest zero of uₘ(x) which is greater than a. If the boundary condition at x = a is that uₘ(a) = 0, that is, if sin α = 0, this has already been proved. If sin α ≠ 0, then g/uₘ is finite at x = a, while huₘ − gvₘ = 0 at x = a by (8.3.1) and (8.5.3). Hence (8.5.8) is true for x → a + 0, so that (8.5.10) is available. Similarly, it is available if η = b and ξ is the nearest zero of uₘ to the left of b. We now note that the interval (a, b) comprises a finite number of intervals of the above forms, that is to say, intervals bounded by consecutive zeros of uₘ or by a zero of uₘ and an end-point of (a, b). In exceptional cases, there may in addition be intervals throughout which uₘ vanishes; in terms of the phase variable θ = θ(x, λₘ) defined in Section 8.4, there will be m + 1 intervals in which θ goes from α to π, from π to 2π, and finally from mπ to mπ + β, and possibly others in which θ remains a multiple of π. Intervals of this latter form, in which uₘ = 0 and so in which g ≡ 0, clearly do not contribute to the integrals in (8.5.4). Hence on summing the results (8.5.10) we have

0 ≥ ∫_a^b {(λₘ p + q) g² + g h′} dx,   (8.5.11)

which is equivalent to (8.5.4), completing the proof of the lemma.

Passing to the proof of Theorem 8.5.1, we suppose if possible that w(x) as given by (8.5.1) vanishes at all the zeros of uₘ(x). We apply the result of the lemma, with w in place of g, and w₁ in place of h, where

w₁(x) = Σ_{n=0}^{m−1} aₙ vₙ(x).

We have then w′ = r w₁ in view of (8.5.5), while the boundary conditions (8.5.3) hold since they are satisfied by the uₙ, vₙ. Evaluating for this case the right of (8.5.4), we have
w₁′ = −Σ_{n=0}^{m−1} aₙ (λₙ p + q) uₙ,

by (8.5.6). Hence the right of (8.5.4) gives

−∫_a^b (w w₁′ + q w²) dx = Σ_{n=0}^{m−1} λₙ aₙ²,

by the orthonormality (8.3.11). In a similar way the left of (8.5.4) becomes

λₘ ∫_a^b p w² dx = λₘ Σ_{n=0}^{m−1} aₙ².

Hence from (8.5.4) we have

λₘ Σ_{n=0}^{m−1} aₙ² ≤ Σ_{n=0}^{m−1} λₙ aₙ²,

or

Σ_{n=0}^{m−1} (λₘ − λₙ) aₙ² ≤ 0.

Since the λₙ are in increasing order, this implies that all the aₙ vanish. This proves Theorem 8.5.1. The following interpolatory property follows at once.
Theorem 8.5.3. Let b₁, ..., bₘ be any constants, and let x₁, ..., xₘ be zeros of uₘ(x) which are distinct from each other and from the end-points a, b, no two such points lying in an interval in which uₘ ≡ 0. Then there is a unique set of constants a₀, ..., a_{m−1} such that

Σ_{n=0}^{m−1} aₙ uₙ(x_s) = b_s,   s = 1, ..., m.   (8.5.12)

For if there were not always such a unique set, there would be a set of aₙ, not all zero, such that

Σ_{n=0}^{m−1} aₙ uₙ(x_s) = 0,   s = 1, ..., m.   (8.5.13)

Denoting this expression as before by w(x), we should have that w(x) vanished at all the zeros of uₘ(x). If x = a, or x = b, or both, were zeros of uₘ, according to the boundary conditions, then these points would also be zeros of w. Any further zeros of uₘ would not be essentially distinct from these, but would lie, together with one of the x_s, or one
of a or b, in an interval in which uₘ ≡ 0 and so in which r = 0 almost everywhere; however, in such an interval all the uₙ would be constant, and so also w, which accordingly would vanish throughout such an interval. Hence w would vanish at all the zeros of uₘ, contrary to Theorem 8.5.1.

The criterion for the zeros x_s and the end-points a, b to be distinct in the above sense may be put explicitly as

r₁(a) < r₁(x₁) < ··· < r₁(xₘ) < r₁(b),   (8.5.14)

where as previously r₁(x) = ∫_a^x r(t) dt.
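Theorem 8.5.3 amounts to the statement that the m × m matrix [uₙ(x_s)] is nonsingular. A small sketch (a hypothetical instance, not from the text) with the classical eigenfunctions uₙ(x) = √2 sin((n + 1)πx) on (0, 1), whose uₘ has interior zeros x_s = s/(m + 1), solves the interpolation system (8.5.12) by Gaussian elimination:

```python
import math

def interpolate_at_zeros(m, values):
    """Solve sum_n a_n u_n(x_s) = values[s-1] at the zeros x_s = s/(m+1),
    s = 1..m, of u_m, where u_n(x) = sqrt(2) sin((n+1) pi x); the matrix is
    nonsingular, so partial pivoting always finds a nonzero pivot."""
    u = lambda n, x: math.sqrt(2.0) * math.sin((n + 1) * math.pi * x)
    xs = [s / (m + 1) for s in range(1, m + 1)]
    A = [[u(n, x) for n in range(m)] for x in xs]
    rhs = list(values)
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    a = [0.0] * m
    for r in range(m - 1, -1, -1):
        a[r] = (rhs[r] - sum(A[r][c] * a[c] for c in range(r + 1, m))) / A[r][r]
    return xs, a
```

Interpolating, say, the values of x(1 − x) at the five zeros of u₅ reproduces those values exactly at the nodes, as the theorem guarantees.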
8.6. The Eigenfunction Expansion
The interpolation theorem just proved may be stated in the form that, given any function φ(x), a ≤ x ≤ b, and any m, we can find a linear combination of u₀(x), ..., u_{m−1}(x) which coincides with it at the zeros of uₘ(x) in a < x < b; strictly speaking, the zeros should be distinct from each other and from the end-points, and there must of course be at least m + 1 eigenvalues. This is already a form of expansion theorem. Furthermore, making m → ∞ and assuming that there are an infinity of eigenvalues, we obtain approximations which are correct at a larger and larger number of points in (a, b). It was shown by Prüfer that there exists a rigorous argument leading from the interpolatory property to the eigenfunction expansion. If the expansion

φ(x) = Σ_{n=0}^∞ cₙ uₙ(x)   (8.6.1)

holds, with say absolute and uniform convergence, the coefficients may be found by multiplying by p(x) uₙ(x) and integrating over (a, b). By the orthonormal property (8.3.11) this yields

cₙ = ∫_a^b p(x) φ(x) uₙ(x) dx.   (8.6.2)

The first step is to establish the validity of the expansion in mean-square, with respect to the measure p(x) dx, in the sense that

∫_a^b p(x) {φ(x) − Σ_{n=0}^{m−1} cₙ uₙ(x)}² dx → 0   (8.6.3)
as m → ∞. As follows from (8.6.2), (8.3.11), this may also be written

∫_a^b p φ² dx − Σ_{n=0}^{m−1} cₙ² → 0   (8.6.4)

as m → ∞, or, what is the same thing,

∫_a^b p φ² dx = Σ_{n=0}^∞ cₙ²,   (8.6.5)

the Parseval equality. Having established the expansion in the mean-square sense (8.6.3), improvements of two kinds may be undertaken. It may be possible to show, often by less delicate arguments, that the expansion actually converges in the uniform sense; its uniform limit must then also be φ(x), at least when p(x) is positive and continuous. In another direction, it may be possible to show that the class of φ(x) originally considered is dense, in the mean-square sense, in some larger space, and so to extend the validity of the mean-square result (8.6.3). The central result is
Theorem 8.6.1. Let functions φ, ψ, χ be defined in a ≤ x ≤ b, φ and ψ being absolutely continuous with derivatives satisfying

φ′ = rψ,   ψ′ + qφ = pχ,   (8.6.6-7)

where p^{1/2}χ is of integrable square over (a, b). Let also φ, ψ satisfy the boundary conditions

φ(a) cos α − ψ(a) sin α = 0,   (8.6.8)

φ(b) cos β − ψ(b) sin β = 0.   (8.6.9)
Then the expansion (8.6.1) is true in the mean-square sense (8.6.3-5).

We shall confine attention to the event that there are actually an infinity of eigenvalues. Sufficient conditions for this were noted in Theorem 8.4.6; it is, for example, sufficient that there be an interval in (a, b) in which p, r are both continuous and positive. The contrary situation was considered in Chapter 4. For the proof we take zeros x₁, ..., xₘ of uₘ(x) in (a, b), distinct from each other and from the end-points a, b in the sense (8.5.14); this means that in the event of a whole interval of zeros, in which r = 0, we take only one zero from this interval. We form the interpolatory approximation to φ(x) at these points, choosing the aₙ so that

Σ_{n=0}^{m−1} aₙ uₙ(x_s) = φ(x_s),   s = 1, ..., m,   (8.6.10)
which is possible by Theorem 8.5.3, and define the difference

g(x) = φ(x) − Σ_{n=0}^{m−1} aₙ uₙ(x).   (8.6.11)

The required result (8.6.4) is then obtained by applying Lemma 8.5.2 to the function g(x). To complete the formalism of Lemma 8.5.2, we note that (8.5.2) holds where h(x) is given by

h(x) = ψ(x) − Σ_{n=0}^{m−1} aₙ vₙ(x),   (8.6.12)

by (8.5.5), (8.6.6); the h(x) so defined is absolutely continuous, and together with g(x) satisfies the boundary conditions (8.5.3), in view of (8.6.8-9) and the boundary conditions (8.3.1-2) of the eigenvalue problem. Next we note that g(x) vanishes with uₘ(x). This is obvious in the case of isolated zeros x_s of uₘ(x), or again in the case of zeros at x = a, b of uₘ(x), prescribed by the boundary conditions. For the case when uₘ(x) has an interval of zeros, in which necessarily r = 0 almost everywhere or r₁ is constant, and containing one representative zero x_s, we note that g(x) will also be constant throughout this interval, by (8.5.2), and so will vanish throughout. It remains to substitute for g, h in (8.5.4) according to (8.6.11-12) and to evaluate the result. In the following calculations, sums will, unless otherwise indicated, be from 0 to m − 1, and integrals from x = a to x = b. On the left of (8.5.4) we get

λₘ ∫ p g² dx = λₘ {∫ p φ² dx − Σ cₙ² + Σ (cₙ − aₙ)²}.   (8.6.13)
Turning to the right of (8.5.4), we note first that

h′ = ψ′ − Σ aₙ vₙ′ = pχ − q g + p Σ λₙ aₙ uₙ.

Hence the right of (8.5.4) gives

−∫ (g h′ + q g²) dx = −∫ p χ g dx − Σ λₙ aₙ ∫ p g uₙ dx.   (8.6.14)

In order to evaluate this we have to calculate integrals of the form ∫ uₙ p χ dx. Using (8.5.5-6), (8.6.6-7) and integration by parts we have

∫ p χ uₙ dx = ∫ (ψ′ + qφ) uₙ dx = [ψ uₙ − φ vₙ]_a^b − λₙ ∫ p φ uₙ dx = −λₙ cₙ,   (8.6.15)

where in setting the integrated term equal to zero we have relied on the boundary conditions. Hence (8.6.14) may be written

= −∫ p χ φ dx − Σ λₙ cₙ² + Σ λₙ (cₙ − aₙ)².   (8.6.16)

The result (8.5.4) now states that the expression (8.6.13) does not exceed the expression (8.6.16). On slight rearrangement this gives

λₘ {∫ p φ² dx − Σ cₙ²} ≤ −∫ p χ φ dx − Σ λₙ cₙ² − Σ (λₘ − λₙ)(cₙ − aₙ)².

Since the λₙ are in increasing order, the last sum is non-negative and may be omitted, as also the first sum on the right; this yields the main result

λₘ {∫ p φ² dx − Σ cₙ²} ≤ −∫ p χ φ dx.   (8.6.17)
The desired conclusion, that the expression in the braces {} on the left tends to zero as m → ∞, now follows, provided that there is an infinity of eigenvalues. This proves (8.6.4), and its equivalents (8.6.3) and (8.6.5). For a later purpose we note that

Σ_{n=0}^∞ λₙ² cₙ² ≤ ∫_a^b p χ² dx.   (8.6.18)

This comes from applying the Bessel inequality to −χ, whose Fourier coefficients are λₙcₙ, by (8.6.15). We show later that the expansion is uniformly and absolutely convergent under the same assumptions; the proof of this will depend on the Green's function, considered in Sections 8.8-9.
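The mean-square statement (8.6.3-5) is easy to check numerically in the classical case p ≡ 1, uₙ(x) = √2 sin((n + 1)πx) on (0, 1). The sketch below (an illustration of ours, not part of the text) computes the coefficients (8.6.2) by Simpson's rule and watches the Parseval defect ∫φ² dx − Σ cₙ² shrink; the test function x(1 − x) is an arbitrary choice.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3.0

def fourier_coefficients(phi, m):
    """c_n = int_0^1 phi(x) u_n(x) dx, the coefficients (8.6.2) with p = 1,
    for the orthonormal eigenfunctions u_n(x) = sqrt(2) sin((n+1) pi x)."""
    u = lambda n: (lambda x: math.sqrt(2.0) * math.sin((n + 1) * math.pi * x))
    return [simpson(lambda x, un=u(n): phi(x) * un(x), 0.0, 1.0) for n in range(m)]
```

For φ(x) = x(1 − x) one has ∫₀¹ φ² dx = 1/30, and twenty terms already leave a Parseval defect of order 10⁻⁹, reflecting the rapid decay cₙ = O(n⁻³) for this smooth φ.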
8.7. Second-Order Equation with Discontinuities

By way of illustration we formulate the oscillation and expansion theorems for the special case of a second-order differential equation

d²y/dξ² + {λp(ξ) + q(ξ)} y = 0,   (8.7.1)

to hold in (0, 1) except at a finite number of points where discontinuities in y′ are prescribed, the change in y′ being proportional to y. Let the ξₙ be such that 0 = ξ₀ < ξ₁ < ... < ξₘ < ξ_{m+1} = 1, let p, q be continuous in each interval [ξₙ, ξ_{n+1}], and let p be positive. Let y satisfy (8.7.1) in each interval (ξₙ, ξ_{n+1}), be continuous at each ξₙ, the discontinuity in y′ at ξₙ being specified by

y′(ξₙ + 0) − y′(ξₙ − 0) = −(λp⁽ⁿ⁾ + q⁽ⁿ⁾) y(ξₙ),   1 ≤ n ≤ m,   (8.7.2)

the p⁽ⁿ⁾, q⁽ⁿ⁾ being constants, the p⁽ⁿ⁾ > 0. If for simplicity we take as boundary conditions

y(0) = y(1) = 0,   (8.7.3)
the oscillation theorem will assert that there is an infinity of eigenvalues, all real, and forming an increasing sequence with no finite limit, the corresponding eigenfunctions having 0, 1, 2, ... zeros in 0 < ξ < 1. We may derive this from Theorems 8.4.5-6 by considering the first-order system for u(x), v(x) given by

u′ = v,   v′ = −{λp(x) + q(x)} u,   0 < x < ξ₁,   (8.7.4)

u′ = 0,   v′ = −(λp⁽¹⁾ + q⁽¹⁾) u,   ξ₁ < x < ξ₁ + 1,   (8.7.5)

u′ = v,   v′ = −{λp(x − 1) + q(x − 1)} u,   ξ₁ + 1 < x < ξ₂ + 1,   (8.7.6)

and so on, terminating with

u′ = v,   v′ = −{λp(x − m) + q(x − m)} u,   ξₘ + m < x < 1 + m,   (8.7.7)

the boundary conditions being

u(0) = 0,   u(1 + m) = 0.   (8.7.8)
It is readily seen that this system has the same eigenvalues as (8.7.1-3), and that zeros of y correspond to zeros of u, with the qualification that u is constant in intervals of the form (ξₙ + n − 1, ξₙ + n), and so may exceptionally have an interval of zeros. Let now the eigenfunctions of (8.7.1-3) corresponding to eigenvalues λₙ be written yₙ(ξ). They will be orthogonal, and we may suppose them normalized by multiplication by constant factors, according to

∫₀¹ p(ξ) yⱼ(ξ) yₖ(ξ) dξ + Σ_{n=1}^{m} p⁽ⁿ⁾ yⱼ(ξₙ) yₖ(ξₙ) = δⱼₖ,   (8.7.9)

for j, k = 0, 1, ...; this is the same as (8.3.11) in our present case. The formalities of the expansion will be that for a function z(ξ) from some general class we define the Fourier coefficients

cₙ = ∫₀¹ p(ξ) z(ξ) yₙ(ξ) dξ + Σ_{k=1}^{m} p⁽ᵏ⁾ z(ξₖ) yₙ(ξₖ),   (8.7.10)

and consider the validity of the expansion

z(ξ) = Σ_{n=0}^∞ cₙ yₙ(ξ).   (8.7.11)
Specializing the result of Theorem 8.6.1 to the present case, let us assume that z(ξ) is continuous in [0, 1], and that it is continuously twice differentiable in each of the intervals [ξₙ, ξ_{n+1}], the derivatives at the end-points of these intervals being one-sided derivatives; it is not assumed that the left and right derivatives at ξ₁, ..., ξₘ coincide. Assume finally that the boundary conditions (8.7.3) hold, that is, that z(0) = z(1) = 0. Then (8.7.11) is true at any rate in the mean-square sense. In the present case (8.6.3) becomes the following. Write

z_N(ξ) = z(ξ) − Σ_{n=0}^{N−1} cₙ yₙ(ξ).   (8.7.12)

Then, as N → ∞,

∫₀¹ p(ξ) z_N²(ξ) dξ + Σ_{n=1}^{m} p⁽ⁿ⁾ z_N²(ξₙ) → 0.   (8.7.13)

In particular, if p⁽ⁿ⁾ > 0, we see that z_N(ξₙ) → 0 as N → ∞, so that (8.7.11) is true in the ordinary sense for ξ = ξₙ, mean-square convergence implying ordinary convergence. We may, of course, extend the boundary conditions to

y(0) cos α − y′(0) sin α = 0,   y(1) cos β − y′(1) sin β = 0.   (8.7.14-15)
More generally we may superimpose a discontinuity of the form (8.7.2) upon the boundary data. Taking (8.7.14) in the form y(0) cot α = y′(0) and applying (8.7.2) with n = 0, the effective boundary condition at ξ = 0 is

y′(0+) = y(0) {cot α − λp⁽⁰⁾ − q⁽⁰⁾},   (8.7.16)

where actually the q⁽⁰⁾ is redundant. Applying similar considerations to the other end of the interval, we are led to the boundary problem formed by (8.7.1-2) together with

y′(0) = y(0) (cot α − λp⁽⁰⁾),   y′(1) = y(1) (cot β + λp⁽ᵐ⁺¹⁾),   (8.7.17-18)

where the constants p⁽⁰⁾, p⁽ᵐ⁺¹⁾ are non-negative. The main effect of these boundary conditions will be that the sums in (8.7.9-10), (8.7.13) must now be taken over 0 ≤ n ≤ m + 1. In the case q(ξ) ≡ 0, q⁽ⁿ⁾ = 0, we may interpret the equation considered here as that of a string of density p(ξ), loaded additionally with particles of masses p⁽ⁿ⁾. As already explained in Section 0.8, the case in which the string has a particle at each end, a weightless portion of string being fixed to each end so that the system can vibrate, leads to a boundary problem with the eigenvalue parameter appearing in the boundary conditions.
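A concrete instance of (8.7.1-3) (our illustration, not the book's): a uniform string (p ≡ 1, q ≡ 0) carrying one particle of mass p⁽¹⁾ at ξ₁ = ½. On each mass-free subinterval the solution is a combination of cos and sin, and the jump condition (8.7.2) transfers y′ across the mass; shooting plus bisection then locates the lowest eigenvalue. All names and parameter values are illustrative.

```python
import math

def shoot(lam, mass, xi=0.5):
    """Value y(1) for y'' + lam*y = 0 on (0, xi) and (xi, 1) with y(0) = 0,
    y'(0) = 1, and the jump y'(xi+) - y'(xi-) = -lam*mass*y(xi) of (8.7.2)
    (taking q = 0 and q^(1) = 0)."""
    w = math.sqrt(lam)
    def free(y, yp, L):
        # exact propagation of (y, y') across a mass-free interval of length L
        return (y * math.cos(w * L) + yp * math.sin(w * L) / w,
                -y * w * math.sin(w * L) + yp * math.cos(w * L))
    y, yp = free(0.0, 1.0, xi)
    yp -= lam * mass * y            # jump in y' at the mass point
    y, yp = free(y, yp, 1.0 - xi)
    return y

def lowest_eigenvalue(mass, lo=1.0, hi=math.pi ** 2 + 1e-6, tol=1e-12):
    """Bisection for the smallest root of shoot(lam) = 0; for mass = 0 this
    recovers the plain string eigenvalue pi^2, and a positive mass lowers it."""
    assert shoot(lo, mass) * shoot(hi, mass) <= 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if shoot(lo, mass) * shoot(mid, mass) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

With mass 0.2 the lowest eigenvalue drops strictly below π², in accordance with the physical picture of a loaded string vibrating more slowly.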
8.8. The Green’s Function
Returning to the general theory of the boundary problem (8.1.2-3), (8.3.1-2), we give the analog, indeed extension, of the results of Section 6.4. We start with the inhomogeneous problem, in which we suppose given a function χ(x) ∈ L(a, b), and ask for absolutely continuous functions φ, ψ satisfying the differential equations

φ′ = rψ,   ψ′ + (λp + q)φ = χ,   (8.8.1-2)

and the boundary conditions (8.6.8-9). Provided that λ is not an eigenvalue, we show that (8.8.1-2) has a unique solution which is expressed by

φ(x) = ∫_a^b G(x, t, λ) χ(t) dt,   (8.8.3)
so far as φ is concerned; this corresponds to (6.4.2). The kernel G(x, t, λ) is the Green's function, which provides two of the main proofs of the eigenfunction expansion. Here we use it to establish the uniform and absolute convergence of this expansion, under the conditions of Section 8.6. In addition, it has sign-definite properties which may be used to prove separation theorems, as was done in Section 6.3.

Our first task is to establish the existence of the Green's function and to find it explicitly. This we do by solving (8.8.1-2), together with the boundary conditions, by the method of the variation of parameters. In addition to the solutions u(x) = u(x, λ) and v(x) = v(x, λ) of (8.1.2-3) fixed by the initial conditions (8.3.3), we define a second pair of solutions u†(x), v†(x) by means of the terminal conditions

u†(b) = sin β,   v†(b) = cos β,   (8.8.4)

corresponding to the solution wₙ(λ) of the recurrence relation, defined in (6.1.12-13). Provided that λ is not an eigenvalue, u† and v† will not be merely constant multiples of u and v. Their functional determinant or Wronskian will be written

ω = ω(λ) = u(x) v†(x) − u†(x) v(x)   (8.8.5)
         = u(b) cos β − v(b) sin β.   (8.8.6)

It follows from the differential equations (8.1.2-3) that the Wronskian appearing in (8.8.5) is independent of x for a ≤ x ≤ b; putting x = b, we get the left of (8.3.4), whose vanishing determines the eigenvalues. With these notations we can specify the Green's function.
Theorem 8.8.1. If χ ∈ L(a, b) and λ is not an eigenvalue, the unique solution of the inhomogeneous system (8.8.1-2) subject to the boundary conditions (8.6.8-9) has φ given by (8.8.3), where

G(x, t, λ) = u(t) u†(x)/ω,   a ≤ t ≤ x ≤ b,
G(x, t, λ) = u(x) u†(t)/ω,   a ≤ x ≤ t ≤ b.   (8.8.7)
The last formulas correspond, in the discrete case, to (6.4.5-6). We have omitted for brevity the dependence on λ in u, u†.

Following the method of the variation of parameters, we seek solutions of (8.8.1-2) in the form

φ = su + s†u†,   ψ = sv + s†v†,   (8.8.8-9)

where s = s(x), s† = s†(x) are to be found. The boundary conditions (8.6.8-9) will be satisfied if we choose

s(b) = 0,   s†(a) = 0,   (8.8.10-11)

in view of the initial values of u, v and the final values of u†, v†. It remains to ensure that (8.8.1-2) hold. We must first have

(su + s†u†)′ = r(sv + s†v†),

and since u′ = rv, u†′ = rv†, we must have

s′u + s†′u† = 0.   (8.8.12)

For (8.8.2) we must have

(sv + s†v†)′ + (λp + q)(su + s†u†) = χ,

and since v′ = −(λp + q)u, v†′ = −(λp + q)u†, this implies that

s′v + s†′v† = χ.   (8.8.13)

Using (8.8.5) we deduce that

s′ = −u†χ/ω,   s†′ = uχ/ω.

From (8.8.10-11) it now follows that

s(x) = ∫_x^b u†(t) χ(t) ω⁻¹ dt,   s†(x) = ∫_a^x u(t) χ(t) ω⁻¹ dt.   (8.8.14-15)

Substituting in (8.8.8) we get (8.8.3), with G given by (8.8.7).
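The recipe (8.8.7) is easy to exercise numerically. In the simplest instance (our illustration, not the book's), r = p = 1, q = 0, λ = 0, with Dirichlet conditions α = β = 0, one has u(x) = x, u†(x) = x − 1 and ω = 1, so G(x, t, 0) = t(x − 1) for t ≤ x and x(t − 1) for x ≤ t; the representation (8.8.3) can then be checked against a case solvable by hand (φ″ = χ with χ ≡ 2 gives φ(x) = x² − x).

```python
def green(x, t):
    """Green's function (8.8.7) for phi'' = chi, phi(0) = phi(1) = 0:
    u(x) = x, u_dag(x) = x - 1, Wronskian omega = u*v_dag - u_dag*v = 1."""
    return t * (x - 1.0) if t <= x else x * (t - 1.0)

def solve(chi, x, n=4000):
    """phi(x) = int_0^1 G(x, t) chi(t) dt by the midpoint rule (the kernel
    is continuous, piecewise linear in t, with a corner at t = x)."""
    h = 1.0 / n
    return sum(green(x, (k + 0.5) * h) * chi((k + 0.5) * h) for k in range(n)) * h
```

The symmetry G(x, t) = G(t, x) of Theorem 8.8.2(i) below the spectrum is visible directly in the two-sided formula.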
To verify the solution we set up the corresponding expression for ψ(x). Substituting for s, s† in (8.8.9), the full expression for ψ is

ψ(x) = ω⁻¹ { v(x) ∫_x^b u†(t) χ(t) dt + v†(x) ∫_a^x u(t) χ(t) dt }.   (8.8.17)

It is a routine matter to verify that these satisfy (8.8.1-2) and the boundary conditions (8.6.8-9). The uniqueness of the solution of (8.8.1-2), (8.6.8-9) follows from the fact that if there were two solutions, their difference would satisfy (8.1.2-3) and (8.3.1-2), so that either this difference would vanish or λ would be an eigenvalue.

The following formal properties of the Green's function are more or less immediate.

Theorem 8.8.2. If λ is not an eigenvalue, (i) the Green's function is symmetric, that is,

G(x, t, λ) = G(t, x, λ);   (8.8.18)
(ii) it is continuous in x, t jointly, and absolutely continuous in x for fixed t, or in t for fixed x; (iii) its partial derivatives have discontinuities when x = t, according to

[∂G/∂x (x, t, λ)]_{x=t+0} − [∂G/∂x (x, t, λ)]_{x=t−0} = r(t),   a < t < b,   (8.8.19)

[∂G/∂t (x, t, λ)]_{t=x+0} − [∂G/∂t (x, t, λ)]_{t=x−0} = r(x),   a < x < b;   (8.8.20)

writing, correspondingly to (8.8.17),

H(x, t, λ) = u(t) v†(x)/ω,  t ≤ x;   H(x, t, λ) = v(x) u†(t)/ω,  t ≥ x,   (8.8.21)

(iv) the pair G, H satisfy in either x or t the differential equations (8.1.2-3) when x ≠ t, and the boundary conditions (8.3.1-2); (v) for any eigenfunction uₙ(x) we have

uₙ(x) = (λ − λₙ) ∫_a^b G(x, t, λ) p(t) uₙ(t) dt;   (8.8.22)
(vi) if μ is also not an eigenvalue, there holds the resolvent equation

G(x, t, λ) − G(x, t, μ) = (μ − λ) ∫_a^b G(x, ξ, μ) G(ξ, t, λ) p(ξ) dξ.   (8.8.23)

Of these, (i)-(iv) need only a straightforward verification. For (v), we rewrite (8.5.5-6) as

uₙ′ = r vₙ,   vₙ′ + (λp + q) uₙ = (λ − λₙ) p uₙ.

Comparing this with (8.8.1-3) we deduce that

uₙ(x) = ∫_a^b G(x, t, λ) (λ − λₙ) p(t) uₙ(t) dt,

which is the required result. For (vi) we consider (8.8.1-2) for fixed χ and varying λ, writing φ_λ, ψ_λ for the solutions of (8.8.1-2) and (8.6.8-9). We have then

φ_λ′ = r ψ_λ,   ψ_λ′ + (λp + q) φ_λ = χ,

and the latter may be rewritten as

ψ_λ′ + (μp + q) φ_λ = (μ − λ) p φ_λ + χ.

Hence φ_λ may be expressed both in the form (8.8.3) and also in the form

∫_a^b G(x, t, μ) {(μ − λ) p(t) φ_λ(t) + χ(t)} dt
   = ∫_a^b G(x, t, μ) χ(t) dt + (μ − λ) ∫_a^b G(x, ξ, μ) p(ξ) dξ ∫_a^b G(ξ, t, λ) χ(t) dt,   (8.8.24)

on substituting for φ_λ(t) on the basis of (8.8.3). Since (8.8.3) and (8.8.24) are the same for all continuous functions χ, and since G is continuous, there must hold the identity (8.8.23).
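The resolvent equation (8.8.23) can be verified directly in the classical case r = p = 1, q = 0 on (0, 1) with Dirichlet conditions, where (8.8.7) gives the closed form coded below (our illustration; λ and μ must avoid the eigenvalues n²π², and the quadrature step is an arbitrary choice).

```python
import math

def green_lam(x, t, lam):
    """Green's function (8.8.7) for phi'' + lam*phi = chi, phi(0) = phi(1) = 0:
    u(s) = sin(w s)/w, u_dag(s) = sin(w (s - 1))/w, omega = sin(w)/w, w = sqrt(lam)."""
    w = math.sqrt(lam)
    u = lambda s: math.sin(w * s) / w
    udag = lambda s: math.sin(w * (s - 1.0)) / w
    omega = math.sin(w) / w
    return u(t) * udag(x) / omega if t <= x else u(x) * udag(t) / omega

def resolvent_residual(x, t, lam, mu, n=4000):
    """Difference of the two sides of the resolvent equation (8.8.23)
    (with p = 1), the integral evaluated by the midpoint rule."""
    h = 1.0 / n
    integral = sum(green_lam(x, (k + 0.5) * h, mu) * green_lam((k + 0.5) * h, t, lam)
                   for k in range(n)) * h
    return green_lam(x, t, lam) - green_lam(x, t, mu) - (mu - lam) * integral
```

The residual is zero up to quadrature error, consistent with the identity being exact.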
8.9. Convergence of the Eigenfunction Expansion

We now use the Green's function to establish the absolute convergence of the eigenfunction expansion under the conditions of Section 8.6; we shall also consider the uniformity of the convergence, in regard to x and in regard to varying boundary conditions. At the center of these investigations is the Fourier expansion of the Green's function G(x, t, λ) in a series of the uₙ(t), taking x fixed. Taking it that λ is not an eigenvalue, the Fourier coefficients are given by (8.8.22) as

∫_a^b G(x, t, λ) p(t) uₙ(t) dt = uₙ(x)/(λ − λₙ),   n = 0, 1, ...,   (8.9.1)

and so we have the formal Fourier expansion, the bilinear formula

G(x, t, λ) ~ Σ_{n=0}^∞ uₙ(x) uₙ(t)/(λ − λₙ).   (8.9.2)

Without considering, in the first place, whether this formula is true, in either the pointwise or the uniform or the mean-square sense, we base ourselves on the Bessel inequality

Σ_{n=0}^∞ uₙ²(x)/|λ − λₙ|² ≤ ∫_a^b p(t) |G(x, t, λ)|² dt.   (8.9.3)

In particular, taking λ = i,

Σ_{n=0}^∞ uₙ²(x)/(1 + λₙ²) ≤ ∫_a^b p(t) |G(x, t, i)|² dt,   (8.9.4)

and so

Σ_{n=0}^∞ uₙ²(x)/(1 + λₙ²) ≤ c   (8.9.5)
for some c independent of x; for it follows from (8.8.7) that G(x, t, i) is bounded, uniformly in x and t, so that the right of (8.9.4) is bounded, uniformly in x. We now strengthen the result of Section 8.6. We have

Theorem 8.9.1. Let φ(x) satisfy the assumptions of Theorem 8.6.1. Then the eigenfunction expansion (8.6.1) is true, the series on the right being absolutely and uniformly convergent for a ≤ x ≤ b.

We first prove that the series in (8.6.1) is absolutely and uniformly convergent, and then consider whether its sum is equal to φ(x). For any integers m, m′ with 0 < m < m′ we have, by the Cauchy inequality,

Σ_{n=m}^{m′−1} |cₙ uₙ(x)| ≤ {Σ_{n=m}^{m′−1} (1 + λₙ²) cₙ²}^{1/2} {Σ_{n=m}^{m′−1} uₙ²(x)/(1 + λₙ²)}^{1/2}
   ≤ c^{1/2} {Σ_{n=m}^{m′−1} (1 + λₙ²) cₙ²}^{1/2}   (8.9.6-8)
by (8.9.5). In view of (8.6.18-19), the last sum tends to zero as m → ∞, m′ → ∞, independently of x. This proves the absolute and uniform convergence. It follows that the sum

φ₁(x) = Σ_{n=0}^∞ cₙ uₙ(x)   (8.9.9)

is continuous; we have to show that it is the same as φ(x). Making m → ∞ in (8.6.3), we have that

∫_a^b p(t) {φ(t) − φ₁(t)}² dt = 0.   (8.9.10)

In the special case in which p(x) is positive and continuous in [a, b], it follows at once that

φ(x) = φ₁(x)   (8.9.11)

for all x in [a, b]. More generally, suppose first that x ∈ (a, b) is a point of increase of p₁(x) = ∫_a^x p(t) dt, that is to say, that
∫_{x−ε}^{x+ε} p(t) dt > 0   (8.9.12)

for arbitrarily small ε > 0. Since φ, φ₁ are continuous, we may choose ε > 0 so that for x − ε ≤ t ≤ x + ε

|φ(t) − φ₁(t)| ≥ ½ |φ(x) − φ₁(x)|,

and so that also (8.9.12) holds. This gives

∫_a^b p(t) {φ(t) − φ₁(t)}² dt ≥ ¼ {φ(x) − φ₁(x)}² ∫_{x−ε}^{x+ε} p(t) dt,

whence, if φ(x) ≠ φ₁(x),

∫_a^b p(t) {φ(t) − φ₁(t)}² dt > 0,

which is impossible by (8.9.10). Thus (8.9.11) holds for all x such that (8.9.12) holds for arbitrarily small ε > 0. In view of (8.1.5), an entirely similar argument shows that (8.9.11) holds at x = a and at x = b. Suppose now that (x₁, x₂) is an interval in which p₁(x) is constant, and that it is not contained in any larger such interval. We have therefore

∫_{x₁}^{x₂} p(t) dt = 0,   (8.9.13)
with a < x₁ < x₂ < b, by (8.1.5); furthermore,

φ(x₁) = φ₁(x₁),   φ(x₂) = φ₁(x₂),   (8.9.14)

since (8.9.12) holds when x = x₁ and when x = x₂. By (8.1.6-7) we have p = q = 0 almost everywhere in (x₁, x₂), and so the vₙ are constant in (x₁, x₂), by (8.5.6), and likewise ψ, by (8.6.7). By (8.6.6) we have then
φ(x) = φ(x₁) + ψ(x₁) ∫_{x₁}^x r(t) dt,   x₁ ≤ x ≤ x₂,   (8.9.15)

and from (8.5.5)

uₙ(x) = uₙ(x₁) + vₙ(x₁) ∫_{x₁}^x r(t) dt,   x₁ ≤ x ≤ x₂.   (8.9.16)

If ∫_{x₁}^{x₂} r(t) dt = 0, that is to say, if r(t) = 0 almost everywhere in (x₁, x₂), then φ is constant in (x₁, x₂), and likewise the uₙ and so also φ₁. Hence it follows from (8.9.14) that φ(x) = φ₁(x) in (x₁, x₂). If again ∫_{x₁}^{x₂} r(t) dt > 0, we have

φ₁(x₂) = Σ_{n=0}^∞ cₙ uₙ(x₁) + {Σ_{n=0}^∞ cₙ vₙ(x₁)} ∫_{x₁}^{x₂} r(t) dt,

the last series necessarily converging, since φ₁(x₂) is finite. Comparing this with (8.9.15) with x = x₂ and using (8.9.14), we deduce that

Σ_{n=0}^∞ cₙ vₙ(x₁) = ψ(x₁).

However, by the argument just used,

φ₁(x) = φ₁(x₁) + {Σ_{n=0}^∞ cₙ vₙ(x₁)} ∫_{x₁}^x r(t) dt

for x₁ ≤ x < x₂, using (8.9.14). Comparing this with (8.9.15) we have (8.9.11) for x₁ < x < x₂, and so, together with the previous results, it holds generally. Hence the eigenfunction expansion is valid in the sense of pointwise convergence, completing the proof of the theorem.
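The bilinear formula (8.9.2), whose justification occupies the rest of this section, can be observed numerically in the classical case. Below the spectrum, say at λ = −1 with uₖ(x) = √2 sin(kπx), λₖ = k²π², p ≡ 1 on (0, 1), the Green's function has the closed form sinh(min(x, t)) sinh(max(x, t) − 1)/sinh 1, and the partial sums of the bilinear series approach it at the rate of the tail Σ k⁻². A sketch (ours, not the text's; the truncation point is an arbitrary choice):

```python
import math

def green_neg(x, t):
    """Green's function (8.8.7) at lambda = -1 for phi'' - phi = chi,
    phi(0) = phi(1) = 0: u(s) = sinh(s), u_dag(s) = sinh(s - 1),
    Wronskian omega = sinh(1)."""
    lo, hi = (t, x) if t <= x else (x, t)
    return math.sinh(lo) * math.sinh(hi - 1.0) / math.sinh(1.0)

def bilinear_partial_sum(x, t, n_terms=5000, lam=-1.0):
    """Partial sum of (8.9.2) with u_k(x) = sqrt(2) sin(k pi x), lambda_k = k^2 pi^2."""
    return sum(2.0 * math.sin(k * math.pi * x) * math.sin(k * math.pi * t)
               / (lam - k ** 2 * math.pi ** 2) for k in range(1, n_terms + 1))
```

Both sides are symmetric in x and t, and both are negative for λ below the lowest eigenvalue, in keeping with the sign-definite properties mentioned in Section 8.8.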
We proceed to a partial justification of the bilinear formula (8.9.2), and to the uniformity of the convergence in regard to the boundary conditions. We commence with the expansion of the iterated Green's function appearing on the right of (8.8.23).

Theorem 8.9.2. Let λ, μ be not eigenvalues. Then

∫_a^b G(x, ξ, μ) G(ξ, t, λ) p(ξ) dξ = Σ_{n=0}^∞ uₙ(x) uₙ(t)/{(μ − λₙ)(λ − λₙ)},   (8.9.17)

the series on the right being uniformly and absolutely convergent in a ≤ x ≤ b, a ≤ t ≤ b.

The eigenfunction expansion may be expressed by saying that if

φ(x) = ∫_a^b G(x, t, λ) p(t) χ(t) dt,   (8.9.18)

where χ satisfies the assumptions of Theorem 8.6.1, then φ can be expanded in a uniformly and absolutely convergent series (8.6.1), since the assumptions (8.6.6-9) concerning φ(x) are equivalent to a representation in the form (8.9.18), this being the basic property of the Green's function. In particular, it is sufficient that χ(ξ) be continuous, and we may therefore apply the result with χ(ξ) = G(ξ, t, λ); we also replace λ by μ in (8.9.18). It follows that

∫_a^b G(x, ξ, μ) G(ξ, t, λ) p(ξ) dξ = Σ_{n=0}^∞ cₙ uₙ(x),   (8.9.19)

with absolute and uniform convergence for any fixed t. The coefficients cₙ are given by

cₙ = (μ − λₙ)⁻¹ ∫_a^b p(ξ) G(ξ, t, λ) uₙ(ξ) dξ = (μ − λₙ)⁻¹ (λ − λₙ)⁻¹ uₙ(t),

by two applications of (8.8.22), using the symmetry property (8.8.18). Hence we have (8.9.17) with pointwise convergence, and with absolute and uniform convergence for fixed t and varying x, a ≤ x ≤ b. In order to obtain the convergence uniformly in x and t we need the following theorem of Dini, which we cite as
Lemma 8.9.3. Let the series Σ gₙ(x) of non-negative functions gₙ(x), continuous on some closed compact set S, converge and have a sum s(x) which is also continuous on S. Then the series converges uniformly on S.

To indicate the proof, suppose that the result is untrue, so that there is an ε > 0, a sequence n₁ < n₂ < ... of positive integers, and an associated sequence xₖ ∈ S such that Σ_{n ≥ nₖ} gₙ(xₖ) > ε. Here, by the compactness of S, we may take it that the sequence xₖ has a limit x₀ ∈ S. Writing sₙ(x) = g₀(x) + ... + g_{n−1}(x), choose n′ such that

|s(x₀) − s_{n′}(x₀)| < ε/3,

and a δ > 0 such that

|s(xₖ) − s(x₀)| < ε/3,   |s_{n′}(xₖ) − s_{n′}(x₀)| < ε/3,

whenever xₖ is within distance δ of x₀. For k such that nₖ > n′ and xₖ is within distance δ of x₀, we then have

Σ_{n ≥ nₖ} gₙ(xₖ) ≤ s(xₖ) − s_{n′}(xₖ) ≤ |s(xₖ) − s(x₀)| + |s(x₀) − s_{n′}(x₀)| + |s_{n′}(x₀) − s_{n′}(xₖ)| < ε.

This gives a contradiction, proving Dini's theorem. While we have in mind first the case in which S consists of an interval on the real line, we use later the case in which it is a plane point set. If in (8.9.17) we take t = x and μ = λ̄, λ not being an eigenvalue, we have
Here the terms on the right are non-negative, while the sum on the left is continuous in x; the latter may be seen more clearly by transforming the left of (8.9.20) by use of (8.8.23), when it becomes, if A is complex,
(A
- X)-l {G(x, t , A)
-
Hence by Dini's theorem the series on the right of (8.9.20-21) is uniformly convergent for a ≤ x ≤ b. The statement that the series in (8.9.17) is uniformly convergent in x and t jointly now follows by means of the Cauchy inequality.
The result (8.9.21) may be put as
Theorem 8.9.4. If λ is complex, the bilinear formula (8.9.2) holds when we take imaginary parts of both sides.

An inessential modification of the above arguments gives

Theorem 8.9.5.
The series converges uniformly, for fixed complex λ, for all a ≤ x ≤ b and real α, β appearing in the boundary conditions.

For the left of (8.9.21) is easily seen to be continuous in x, α, and β, being periodic in α and β. Finally we have as a consequence
Theorem 8.9.6. Let φ(x) satisfy the assumptions of Theorem 8.6.1 for given α and all β, so that φ(b) = 0, ψ(b) = 0. Then the eigenfunction expansion (8.6.1) is convergent uniformly in x and β.

Taking λ = i in (8.9.21), the left is continuous in x and β, and so the conditions of Lemma 8.9.3 are satisfied, the set S now being a ≤ x ≤ b, 0 ≤ β ≤ 2π. Hence the series in (8.9.5) converges uniformly in x and β. We now employ the argument of (8.9.6-7) in the sense that the first factor in (8.9.7) is bounded, by (8.6.18), while the second tends to zero as m, m_1 → ∞, uniformly in x and β, by the uniformity of the convergence of the series (8.9.5).
8.10. Spectral Functions

For investigations in which the eigenvalue problem is varied at the end x = b of the basic interval, it is convenient to have the expansion theorem in terms of eigenfunctions with fixed initial values at x = a. We therefore express the eigenfunction expansion in terms of the functions u(x, λ_n), where as in Section 8.3 we have u(a, λ_n) = sin α, v(a, λ_n) = cos α. In terms of the normalized eigenfunctions u_n(x) the expansion theorem states that (8.10.1)
this series being absolutely and uniformly convergent under the conditions of Theorem 8.6.1, as was proved in Theorem 8.9.1. Since u_n(x) = ρ_n^{-1/2} u(x, λ_n), where ρ_n is given by (8.3.9), (8.10.1) is equivalent to (8.10.2)
As on previous occasions, we may put this into Stieltjes integral form by defining the spectral function

τ(λ) = Σ_{0 < λ_n ≤ λ} ρ_n^{-1}, λ > 0, (8.10.3)

τ(λ) = −Σ_{λ < λ_n ≤ 0} ρ_n^{-1}, λ < 0, (8.10.4)

with the understanding that τ(0) = 0; τ(λ) will of course be a nondecreasing right-continuous step function. The eigenfunction expansion (8.6.1-2) may then be put in the form of a pair of reciprocal integral transforms. Defining, as a sort of generalized Fourier coefficient,

γ(λ) = ∫_a^b p(x) φ(x) u(x, λ) dx, (8.10.5)

the expansion takes the form

φ(x) = ∫_{−∞}^{∞} u(x, λ) γ(λ) dτ(λ). (8.10.6)
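The definition (8.10.3-4) is mechanical once the eigenvalues and normalization constants are known, and can be sketched directly in code. In the following Python fragment the sample data, the problem −y″ = λy on (0, π) with y(0) = y(π) = 0, eigenvalues (n+1)² and ρ_n = ∫ sin²((n+1)x) dx = π/2, are an assumption made purely for illustration.

```python
import math

def make_spectral_function(eigenvalues, rhos):
    """tau(lam) in the sense of (8.10.3-4): a nondecreasing, right-continuous
    step function with tau(0) = 0 and a jump of 1/rho_n at each eigenvalue."""
    pairs = sorted(zip(eigenvalues, rhos))

    def tau(lam):
        if lam >= 0:
            # sum of rho_n^{-1} over the eigenvalues with 0 < lambda_n <= lam
            return sum(1.0 / r for l, r in pairs if 0 < l <= lam)
        # minus the sum over lam < lambda_n <= 0
        return -sum(1.0 / r for l, r in pairs if lam < l <= 0)

    return tau

# Hypothetical sample data: -y'' = lam*y on (0, pi), y(0) = y(pi) = 0,
# with u_n(x) = sin((n+1)x), so lambda_n = (n+1)^2 and rho_n = pi/2.
tau = make_spectral_function([(n + 1) ** 2 for n in range(10)],
                             [math.pi / 2] * 10)
```

Including the endpoint λ_n in the sum at λ = λ_n is what makes τ right-continuous, matching the normalization fixed in the text.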
With a view to manipulations concerning more than one spectral function, we collect here estimates relating to the convergence of the eigenfunction expansion (8.10.6). We write, for Λ > 0,

φ_Λ(x) = ∫_{−Λ}^{Λ} u(x, λ) γ(λ) dτ(λ). (8.10.7)
Assuming φ to satisfy the assumptions of Theorem 8.6.1, we have first that the expansion is valid in the mean-square sense given by (8.6.3) or (8.6.4). In the present notation these results may be written respectively as (8.10.8)
or
by (8.3.10) and (8.6.2). Thus, taking first (8.10.9), (8.10.10)
Since the λ_n are bounded from below, and λ_0 < λ_1 < ..., we may drop the restriction −Λ < λ_n if −Λ < λ_0, and the left of (8.10.9) may then be written (8.10.11)
which tends to zero as Λ → ∞, by (8.6.4). Thus (8.10.9) holds. In a similar way, if −Λ < λ_0, the left of (8.10.8) is the same as (8.10.12)
which tends to zero as Λ → ∞, by (8.6.3). This proves (8.10.8). Next we replace (8.10.9) by a bound for the left-hand side. By (8.10.9) and (8.10.10), or by the Parseval equality (8.6.5), the left of (8.10.9) is the same as
In view of (8.6.18) we have that

∫_a^b p(x) | φ(x) |² dx − ∫_{−Λ}^{Λ} | γ(λ) |² dτ(λ) ≤ Λ^{-2} ∫_a^b p(x) | χ(x) |² dx. (8.10.13)
We use this bound later for the limiting transition b → ∞.

8.11. Explicit Expansion Theorem
We shall now prove the analog of Theorems 4.9.1 and 7.7.1. In the latter results we proved that polynomials orthogonal on the real axis or on the unit circle were orthogonal with respect to a weight function given explicitly in terms of the polynomials themselves; the orthogonality applied to a finite number of the polynomials, and could have been expressed as an expansion theorem on the lines of (4.4.5-7). In the present Sturm-Liouville case, the place of a sequence of polynomials of degrees 0, 1, 2, ..., is taken by the functions u(x, λ), where in place
of the degree of the polynomial we have the continuous variable x. These are orthogonal with respect to integration over λ only in a rather questionable sense, and we use here instead the formulation as an expansion theorem. The ordinary expansion theorem involves the determination of eigenvalues, which are in general the roots of transcendental equations. We use the term explicit to describe the result of the present section, since it is expressible directly in terms of solutions of the differential equations.
Theorem 8.11.1. Let φ(x) satisfy the assumptions of Theorem 8.6.1 for all β, that is to say, we have φ(b) = ψ(b) = 0. Then

π φ(x) = ∫_{−∞}^{∞} u(x, λ) γ(λ) {u²(b, λ) + v²(b, λ)}^{-1} dλ, (8.11.1)
where γ(λ) is the extended Fourier coefficient defined in (8.10.5). The proof is similar to that of Theorem 4.9.1, and proceeds by averaging the ordinary eigenfunction expansion with respect to the angle β determining the boundary condition at x = b, the condition at x = a remaining fixed. We write
ρ(λ) = ∫_a^b p(t) u²(t, λ) dt, (8.11.2)

where u²(t, λ) denotes {u(t, λ)}², so that in the notation (8.3.9) we have ρ_n = ρ(λ_n). The expansion (8.10.2) is then
(8.11.3)
Considering λ_n as a function of β, namely, the root of θ(b, λ_n) = β, we propose to calculate dλ_n/dβ, that is to say, the value of {∂θ(b, λ)/∂λ}^{-1} when λ = λ_n.
Now by (8.4.6) we have
∂θ(b, λ)/∂λ = {u_λ(b, λ) v(b, λ) − u(b, λ) v_λ(b, λ)} {u²(b, λ) + v²(b, λ)}^{-1}.
By (8.4.5) this gives

∂θ(b, λ)/∂λ = ρ(λ) {u²(b, λ) + v²(b, λ)}^{-1}, (8.11.4)
whence

dλ_n/dβ = ρ(λ_n)^{-1} {u²(b, λ_n) + v²(b, λ_n)}. (8.11.5)
Hence the eigenfunction expansion (8.11.3) may be written (8.11.6)
To complete the proof of the theorem we integrate with respect to β over (0, π). The left of (8.11.6) gives, of course, the left of (8.11.1). The series on the right of (8.11.6) is uniformly convergent, by Theorem 8.9.6, and may therefore be integrated term by term, so that we get

Σ_n ∫_{λ_n(+0)}^{λ_n(π)} u(x, λ) γ(λ) {u²(b, λ) + v²(b, λ)}^{-1} dλ, (8.11.7)

where we have written λ_n = λ_n(β), and λ_n(+0) in place of λ_n(0) since β was restricted in (8.4.9) to 0 < β ≤ π. Since λ_n(β) is monotonic increasing in β, and since every finite real λ is an eigenvalue for some β, the sum on the right of (8.11.7) adds up to the integral over the real axis appearing in (8.11.1). To be precise, as β → +0, λ_0(β) → −∞, since it was proved in Section 8.4 that θ(b, λ) tends to zero from above as λ → −∞. Hence the first term in the series in (8.11.7) gives the integral in (8.11.1) over (−∞, λ_0(π)). The remaining terms on the right of (8.11.7) give the integrals over (λ_{n−1}(π), λ_n(π)), n = 1, 2, ..., since, as is easily seen, λ_n(+0) = λ_{n−1}(π). This completes the proof. The result remains in force if there are only a finite number of eigenvalues, but is then equivalent to Theorem 4.9.1. By applying the same process to the Parseval equality associated with the ordinary eigenfunction expansion we get

Theorem 8.11.2.
Under the assumptions of Theorem 8.11.1,
π ∫_a^b p(x) | φ(x) |² dx = ∫_{−∞}^{∞} | γ(λ) |² {u²(b, λ) + v²(b, λ)}^{-1} dλ. (8.11.8)
We use (8.10.13), of which the left-hand side may be written, with the notation (8.11.2),
or, in view of (8.11.5), as an integral with respect to λ. Integrating (8.10.13) with respect to β over (0, π) thus gives

π ∫_a^b p(x) | φ(x) |² dx − ∫_{−Λ}^{Λ} | γ(λ) |² {u²(b, λ) + v²(b, λ)}^{-1} dλ ≤ πΛ^{-2} ∫_a^b p(x) | χ(x) |² dx, (8.11.9)

and (8.11.8) follows on making Λ → ∞.
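The machinery of this section, the angle θ(b, λ), its monotonicity in λ, and the eigenvalues as roots of θ(b, λ_n) = β (mod π), lends itself to direct computation. The Python sketch below does this for the illustrative special case u′ = v, v′ = −λu on (0, π) with α = 0 and β = π; the choice of equation, interval, and step counts are assumptions made for the example, not anything fixed by the text.

```python
import math

def prufer_theta(lam, b=math.pi, alpha=0.0, steps=4000):
    """Integrate theta' = cos(theta)**2 + lam*sin(theta)**2, theta(0) = alpha,
    by classical RK4.  This is the angle of the pair (u, v) for the special
    system u' = v, v' = -lam*u (r = p = 1, q = 0), assumed for illustration."""
    h = b / steps
    th = alpha
    f = lambda t: math.cos(t) ** 2 + lam * math.sin(t) ** 2
    for _ in range(steps):
        k1 = f(th)
        k2 = f(th + 0.5 * h * k1)
        k3 = f(th + 0.5 * h * k2)
        k4 = f(th + h * k3)
        th += (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return th

def eigenvalue(n, beta=math.pi, lo=-1.0, hi=500.0, tol=1e-9):
    """lambda_n as the root of theta(b, lam) = beta + n*pi; since
    theta(b, lam) is increasing in lam, bisection suffices."""
    target = beta + n * math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prufer_theta(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For this model problem the exact eigenvalues are (n + 1)², which the shooting procedure reproduces; the positivity of dθ/dλ is what makes the bisection legitimate.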
8.12. Expansions over a Half-Axis

In this section we apply the limiting transition b → ∞ to the eigenfunction expansion, in Parseval equality form, keeping fixed a and the boundary condition at x = a. This situation is analogous to that in which we have an expansion theorem associated with a finite sequence of recurrence relations, and consider the effect on this theorem of increasing without limit the number of stages in the set of recurrence formulas; particular cases of this process were undertaken in Sections 2.3, 5.2, and 7.3. Once more, the simplest procedure is to show that the spectral function τ(λ) is bounded, for fixed λ, as b → ∞, and to use the Helly-Bray theorems. The argument is adapted only to the proof of the existence of at least one spectral function, in the limiting sense, and does not touch on the question of uniqueness. We assume in this section that the assumptions (i)-(iv) of Section 8.1 hold for a sequence of intervals (a, b), where a is fixed and b = b_1, b_2, ..., where b_m → ∞ as m → ∞. We now write τ_{b,β}(λ) for the step function defined by (8.10.3-4). Our first step is to prove its boundedness. We have

Theorem 8.12.1. There is a function c(λ), independent of b = b_m, m = 1, 2, ..., and of β, such that

| τ_{b,β}(λ) | ≤ c(λ). (8.12.1)
The proof proceeds by applying the Bessel inequality to a function which is initially unity in some small interval and thereafter is zero. In the above-mentioned discrete cases a similar argument was used,
relying on certain Parseval equalities. In the present case we take a function

φ_0(x) = 1 (a ≤ x ≤ a′),   φ_0(x) = 0 (x ≥ a′), (8.12.2)

where a′ = a′(λ) is to be chosen later. With respect to the orthonormal set {u_n(x)} associated with some finite b = b_m and some β, its Fourier coefficient in the sense (8.6.2) will be

∫_a^{a′} p(x) u_n(x) dx,

taking it that b_m > a′. Although this function does not satisfy our assumptions for the expansion theorem, we can nevertheless use the Bessel inequality, which tells us that (8.12.3)
or

∫_{−∞}^{∞} | ∫_a^{a′} p(x) u(x, μ) dx |² dτ(μ) ≤ ∫_a^{a′} p(x) dx. (8.12.4)

We now show that for given λ we can choose a′ > a and c > 0 so that

∫_a^{a′} p(x) u(x, μ) dx ≥ c > 0, | μ | ≤ | λ |, (8.12.5)
from which the result (8.12.1) will follow easily. For on taking on the left of (8.12.4) only the integral over (−| λ |, | λ |) and using the bound (8.12.5) it will follow that

c² {τ_{b,β}(| λ |) − τ_{b,β}(−| λ |)} ≤ ∫_a^{a′} p(x) dx;

recalling that τ_{b,β}(μ) ≤ 0 when μ < 0, we deduce (8.12.1). That (8.12.5) can be arranged to hold is easily seen if sin α = u(a, λ) ≠ 0. Since u(x, μ) is continuous in both variables we may choose a′ > a so that u(x, μ) ≥ ½ sin α > 0 if a ≤ x ≤ a′ and | μ | ≤ | λ |. We have then
which is positive by the first of (8.1.5). Suppose next that sin α = 0, that is, that α = 0, since we take 0 ≤ α < π.
To begin with, suppose in addition that

∫_a^x r(t) dt > 0 (8.12.6)

for all x > a. Since
v(x, μ) is continuous we may choose a′ so that v(x, μ) ≥ ½ v(a, μ) = ½ for a ≤ x ≤ a′ and | μ | ≤ | λ |. Writing r_1(x) for the integral in (8.12.6) we have from (8.1.2) that u(x, μ) ≥ ½ r_1(x), for a ≤ x ≤ a′ and | μ | ≤ | λ |, so that

∫_a^{a′} p(x) u(x, μ) dx ≥ ½ ∫_a^{a′} p(x) r_1(x) dx, (8.12.7)
and it will be sufficient to show that the last integral is positive. This follows from (8.1.5). Choose in fact an a″ > a such that

∫_{a″}^{a′} p(x) dx > 0,

and then we have

∫_a^{a′} p(x) r_1(x) dx ≥ r_1(a″) ∫_{a″}^{a′} p(x) dx > 0.
which again justifies (8.12.5). Suppose finally that u(a, A) = 0 and that (8.12.6) fails for some x In this case there will be an a, > a such that j l l P ( x ) I u(x, A )
dx = 0,
> a.
(8.12.8)
since r vanishes in a neighborhood of a, so that u is constant, and so zero, in such a neighborhood. As was shown in Section 8.2, it follows from (8.12.8) that v(x, λ) is constant in (a, a_1), and so equal to unity, whence u(x, λ) = r_1(x). Thus the solution is independent of λ for a ≤ x ≤ a_1. We take a_1 to be the greatest number with the property (8.12.8). Then

∫_{a_1}^x p(t) | u(t, λ) |² dt > 0 (8.12.9)

for all x > a_1, and so

∫_{a_1}^x p(t) dt > 0 (8.12.10)

for all x > a_1. We now apply the previous arguments with a_1 in place of a.
Since it follows from (8.12.8) that

∫_a^{a_1} p(x) u(x, λ) dx = 0,

we have to show that there is an a′ > a_1 such that

∫_{a_1}^{a′} p(x) u(x, μ) dx > c > 0 (8.12.11)

for | μ | ≤ | λ |. If u(a_1, μ) = r_1(a_1) > 0, the existence of a′ follows as before, using (8.12.10). Suppose again that u(a_1, μ) = 0. In this case, we assert,
∫_{a_1}^x r(t) dt > 0 (8.12.12)
for any x > a_1. For otherwise there would be an a_2 > a_1 such that r = 0 almost everywhere in (a_1, a_2), so that u would be constant there, and so zero, in contradiction to the assumption that a_1 is the greatest number with the property (8.12.8). This brings us back to the situation in which u(a, λ) = 0 and (8.12.6) holds, which has already been dealt with, so that (8.12.11) may be taken to hold in this case also. This completes the proof of Theorem 8.12.1. The existence of at any rate one limiting spectral function is now more or less immediate.
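The boundedness asserted in Theorem 8.12.1 can be watched happening in a case where everything is explicit. For −y″ = λy on (0, b) with y(0) = y(b) = 0, the solution u(x, λ) = sin(√λ x)/√λ has u(0) = 0, u′(0) = 1, so ρ_n = b/(2λ_n) at the eigenvalues λ_n = ((n + 1)π/b)². The Python sketch below (this model and its closed forms are assumptions made for illustration) shows τ_{b,β}(λ) staying bounded, and in fact converging, as b grows.

```python
import math

def tau_b(lam, b):
    """Spectral function of -y'' = lam*y on (0, b), y(0) = y(b) = 0, in the
    normalization of (8.10.3-4): a jump of 1/rho_n = 2*lam_n/b at each
    eigenvalue lam_n = ((n+1)*pi/b)**2, since rho_n = b/(2*lam_n)."""
    total, n = 0.0, 0
    while True:
        lam_n = ((n + 1) * math.pi / b) ** 2
        if lam_n > lam:
            return total
        total += 2.0 * lam_n / b
        n += 1

# As b increases, tau_b(lam) approaches a b-independent limit; for this
# model the jumps form a Riemann sum for (2/(3*pi)) * lam**1.5, so the
# family is bounded uniformly in b, as Theorem 8.12.1 asserts.
values = [tau_b(25.0, b) for b in (10.0, 40.0, 160.0, 640.0)]
limit = (2.0 / (3.0 * math.pi)) * 25.0 ** 1.5
```

The jumps get denser (spacing O(1/b)) but smaller (size O(1/b)) at the same rate, which is the mechanism behind the uniform bound.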
Theorem 8.12.2. There is a nondecreasing function τ(λ) which is right-continuous, with τ(0) = 0, such that the Parseval equality (8.12.13) holds for functions φ(x) satisfying the following conditions:
(i) φ(x) is defined and absolutely continuous for x ≥ a, vanishing outside some finite interval;

(ii) there are functions ψ(x), χ(x), vanishing outside some finite interval, such that φ′ = rψ, ψ′ + qφ = pχ for x ≥ a, ψ being absolutely continuous and p^{1/2}χ of integrable square over (a, ∞);

(iii) φ(a) cos α − ψ(a) sin α = 0.
Since the sequence of spectral functions τ_{b_m,β}(λ), where for definiteness we keep β fixed, is uniformly bounded in any finite λ-interval, it contains a convergent subsequence. We take τ(λ) as the limit of this subsequence, normalized to ensure right-continuity and that τ(0) = 0. To justify (8.12.13) we write (8.10.13) in the form
Here we have taken it that m is so large that φ, χ are zero in (b_m, ∞); for simplicity let us assume also that Λ, −Λ are not points of discontinuity of any of the τ_{b_m,β}(λ) or of τ(λ), these excluded values forming a denumerable set. Making m → ∞ through the subsequence which makes the spectral functions converge, we obtain the corresponding relation with τ(λ) in place of τ_{b_m,β}(λ), and the required result follows on making Λ → ∞.

8.13. Nesting Circles
The following alternative proof of the boundedness of the family of spectral functions τ_{b,β}(λ), for increasing b, is more elaborate than that given in the last section, but provides some information on the uniqueness of the limiting spectral function. The argument is similar to that of Sections 5.4-5, and is given in outline only. The first step is to construct a function, previously termed here a characteristic function, whose poles are at the eigenvalues λ_n, the residues being the reciprocals ρ_n^{-1} of the corresponding normalization constants. Such a function will be set up in the next chapter in terms of the resolvent kernel, an extension of the notion of the Green's function. Here we set it up directly in terms of the solution u(x, λ), v(x, λ) of (8.1.2-3) such that u(a, λ) = sin α, v(a, λ) = cos α, and a second solution of (8.1.2-3), which we denote by u_1(x, λ), v_1(x, λ), such that u_1(a, λ) = cos α, v_1(a, λ) = −sin α. We then define [cf. (4.5.4)]

f_{b,β}(λ) = {u_1(b, λ) cos β − v_1(b, λ) sin β} {u(b, λ) cos β − v(b, λ) sin β}^{-1}. (8.13.1)
The definition may be motivated as follows. We define a third solution of (8.1.2-3) by

u_2(x, λ) = u_1(x, λ) − f u(x, λ),   v_2(x, λ) = v_1(x, λ) − f v(x, λ), (8.13.2-3)
where f is to be determined so that u_2, v_2 should satisfy the boundary condition at x = b, namely,

u_2(b, λ) cos β − v_2(b, λ) sin β = 0. (8.13.4)

This leads to f as given by (8.13.1). The function (8.13.1) has the following analytic properties.
Theorem 8.13.1. For complex λ, Im λ and Im f_{b,β}(λ) have opposite signs. For real λ, f_{b,β}(λ) is real, and finite except at the λ_n, where its residue is ρ_n^{-1}.

We make here the assumptions of Section 8.1; the λ_n are the roots of (8.3.4), the ρ_n being given by (8.3.9). It is obvious from (8.13.1) that f_{b,β}(λ) is regular except at the zeros of the denominator, which are the λ_n, and that it is otherwise real for real λ. Its residue at λ = λ_n is

{u_1(b, λ_n) cos β − v_1(b, λ_n) sin β} {(∂/∂λ) [u(b, λ) cos β − v(b, λ) sin β]}^{-1}, evaluated at λ = λ_n.

Using (8.3.2) this is the same as a quotient with numerator u_1(b, λ_n) v(b, λ_n) − u(b, λ_n) v_1(b, λ_n). To evaluate the numerator we may replace b by a, giving the value 1, while the denominator is ρ_n, by (8.4.5). Thus Im λ and Im f_{b,β}(λ) certainly have the opposite sign when λ has the form λ_n ± iε for small ε > 0. To complete the proof it will be sufficient to show that Im f_{b,β}(λ) does not vanish when λ is complex. Supposing f_{b,β}(λ) to be real, then the u_2(x, λ), v_2(x, λ) given by (8.13.2-3) with this value of f would satisfy the differential equations (8.1.2-3); they would also satisfy the boundary condition (8.13.4), and the initial condition

(sin α + f cos α) u_2(a, λ) + (cos α − f sin α) v_2(a, λ) = 0,

since u_2(a, λ) = cos α − f sin α, v_2(a, λ) = −sin α − f cos α. With f real, this is a boundary problem of the same type as that of Section 8.3, with a different α, and since u_2, v_2 do not vanish identically, λ must be real. Hence f_{b,β}(λ) is complex along with λ, completing the proof. From a standard property of functions which map the upper and lower half-planes into each other, we have

ρ_n^{-1} | Im {(λ − λ_n)^{-1}} | ≤ | Im f_{b,β}(λ) |

for any complex λ.
Taking λ = i we have (8.13.5)
whence, for any real λ′,

τ_{b,β}(λ′) = O(1 + λ′²), (8.13.6)
where τ_{b,β}(λ) is the function defined in (8.10.3-4). For the final link in the chain bounding the spectral function we need to show that f_{b,β}(λ) is bounded as b → ∞ for fixed complex λ, such as λ = i; it will then follow that (8.13.6) holds uniformly in λ′ and b, and for that matter β. We assume that the conditions laid down in Section 8.1 hold for all b > a, or less restrictively that conditions (i) and (ii) hold for all b > a, and that (iii) and (iv) hold for some b > a. For this purpose we define the circle C(b, λ) which is the locus of (8.13.1) for real β; this is the same as the circle described by (8.13.7) as z describes the real axis, including ∞; here we assume λ complex. Denoting by D(b, λ) the disk bounded by C(b, λ), we have that D(b, λ) is the map under (8.13.7) of either the upper or the lower half-plane. Taking for definiteness Im λ > 0, we assert that D(b, λ) is in fact the map of Im z ≤ 0. For if Im λ > 0, we have from (8.3.6) that Im {u(b, λ)/v(b, λ)} > 0, so that (8.13.7) is finite if Im λ > 0, Im z ≤ 0. If we interpret D(b, λ) as the closed disk, we have the nesting property given by
Theorem 8.13.2. For b′ > b, D(b, λ) ⊇ D(b′, λ).

Writing f for the function (8.13.7), and solving (8.13.7) for z in terms of f, we obtain z = u_2(b, λ)/v_2(b, λ), where u_2 and v_2 are given by (8.13.2-3); we no longer impose (8.13.4), which applies to the special choice z = tan β. The set Im z ≤ 0 is thus given by Im {u_2(b, λ) v̄_2(b, λ)} ≤ 0. We now observe that
in a similar way to (8.3.6). Also,
u_2(a, λ) v̄_2(a, λ) − ū_2(a, λ) v_2(a, λ) = (cos α − f sin α)(−sin α − f̄ cos α) − (cos α − f̄ sin α)(−sin α − f cos α) = f − f̄.
Hence the set Im z ≤ 0 is also characterized by
As b increases, this inequality becomes more stringent, so that the f-locus which satisfies it shrinks, or at least does not expand. This proves the theorem, and therewith the existence of at least one limiting spectral function. Still assuming that Im λ > 0, we have that the circles C(b, λ) "nest," apart from intervals in which p = 0, where they will be constant. For fixed λ, we thus have the distinction between limit-circle and limit-point cases, which may be investigated by calculating the radius of C(b, λ). Writing u for u(b, λ), and so on, the disk D(b, λ) is the f-set given by Im (u_2 v̄_2) ≤ 0, or

Im {(u_1 − f u)(v̄_1 − f̄ v̄)} ≤ 0.
This may be brought to the form
the right-hand side being the squared radius. Since

u_1(b, λ) v(b, λ) − u(b, λ) v_1(b, λ) = u_1(a, λ) v(a, λ) − u(a, λ) v_1(a, λ) = 1

by (8.1.2-3), and the denominator is given by (8.3.6), we find that the radius of C(b, λ) is [cf. (5.4.6)]
The limit-circle and limit-point cases can now be identified as those in which, respectively,

∫_a^∞ p(x) | u(x, λ) |² dx < ∞,   ∫_a^∞ p(x) | u(x, λ) |² dx = ∞. (8.13.10-11)
We state without proof the following properties, which may be established similarly to their analogs in Chapter 5:

(i) in the limit-circle case, all solutions satisfy ∫ p | u |² dx < ∞, that is, are of integrable square;
(ii) if the limit-circle case holds for one complex λ, it holds for all complex λ, and likewise for the limit-point case;

(iii) for every complex λ, there is at least one nontrivial solution of integrable square;

(iv) if all solutions are of integrable square for one λ, then this is the case for all λ.

In the proof of (i) and (iii) we use (8.13.8); for (ii) and (iv) we use the variation of parameters, rather as in Section 5.6.
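The radius computation at the end of this section can be carried out numerically. In the Python sketch below the equation −y″ = λy with λ = i, the initial data u(0) = 0, u′(0) = 1, and the normalization radius = {2 |Im λ| ∫_0^b |u|² dx}^{-1} (p = 1) are all assumptions made for illustration; the shrinking radii exhibit the nesting of the circles C(b, λ) and, since ∫ |u|² dx diverges here, the limit-point alternative of (8.13.11).

```python
def weyl_radius(b, lam=1j, steps=4000):
    """Radius of C(b, lam) for -y'' = lam*y, u(0) = 0, u'(0) = 1, taken here
    as 1/(2*|Im lam|*Int_0^b |u|^2 dx); the pair (u, v) with v = u' is
    advanced by RK4 and the integral accumulated along the way."""
    h = b / steps
    u, v = 0.0 + 0.0j, 1.0 + 0.0j
    integral = 0.0
    for _ in range(steps):
        def f(uu, vv):
            return vv, -lam * uu        # u' = v, v' = -lam*u
        k1u, k1v = f(u, v)
        k2u, k2v = f(u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = f(u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = f(u + h * k3u, v + h * k3v)
        integral += abs(u + 0.5 * h * k1u) ** 2 * h   # midpoint quadrature
        u += (h / 6.0) * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += (h / 6.0) * (k1v + 2 * k2v + 2 * k3v + k4v)
    return 1.0 / (2.0 * abs(lam.imag) * integral)

# The radii decrease as b grows and tend to zero: the limit-point case.
radii = [weyl_radius(b) for b in (2.0, 4.0, 8.0)]
```

Because |u(x, i)|² grows exponentially, the denominator ∫ p |u|² dx blows up and the nested disks collapse to a point, which is the geometric content of the limit-point classification.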
CHAPTER 9

The General First-Order Differential System

9.1. Formalities
In discussing the Sturm-Liouville equation (8.1.1), or the more general system (8.1.2-3) with "one-point" boundary conditions of the form (8.3.1-2), we have merely scratched the surface of the topic of boundary problems for differential equations. For the Sturm-Liouville system itself we may take two-point boundary conditions, such as the periodic conditions u(a) = u(b), v(a) = v(b) for (8.1.2-3), or y(a) = y(b) and y′(a) = y′(b) for (8.1.1). Beyond these stretches a wide range of similar problems for higher-order analogs of (8.1.1), and for higher-order equations in vector terms. Since an nth-order linear differential equation, even a vector differential equation, can still be written as a first-order differential equation by use of vector notation, we shall cover a large number of cases by setting up a boundary problem for a general type of first-order differential equation in vector terms. An inclusive framework for a wide variety of boundary problems is provided by the system

J y′ = {λ A(x) + B(x)} y, a ≤ x ≤ b, (9.1.1)
where J, A, B are square matrices of fixed order k, y(x) is a k-by-1 column matrix of functions of x, λ is a scalar parameter, and (a, b) is a finite interval. We take it that A(x), B(x) are integrable over (a, b) and that J is constant and nonsingular, so that the usual existence and uniqueness properties are available for solutions with given initial values. For the boundary problem, to be set up in the next section, to have a real and discrete spectrum we need two further sets of assumptions. In the first place we make restrictions of the type of self-adjointness, namely, that J be skew-Hermitean and A(x), B(x) Hermitean, so that
J* = −J,   A*(x) = A(x),   B*(x) = B(x). (9.1.2-4)
Secondly, we make the "definiteness" assumptions that

A(x) ≥ 0, a ≤ x ≤ b, (9.1.5)

and that

∫_a^b y*(x) A(x) y(x) dx > 0 (9.1.6)

for a solution of (9.1.1) which does not vanish identically. The assumptions (9.1.2-4) have the effect that, if λ is real and y a solution of (9.1.1), then the quadratic form y*Jy is constant in a ≤ x ≤ b. In fact, for real or complex λ,
(y*Jy)′ = y*′Jy + y*Jy′ = −(Jy′)*y + y*(Jy′) = −{(λA + B)y}*y + y*(λA + B)y = (λ − λ̄) y*Ay. (9.1.7)
Hence y*Jy is constant when λ is real; in fact, by (9.1.5), Im y*Jy is nondecreasing or nonincreasing in x, according to the sign of Im λ. Furthermore, integrating (9.1.7) we have
y*(b) J y(b) − y*(a) J y(a) = (λ − λ̄) ∫_a^b y*(x) A(x) y(x) dx ≠ 0, (9.1.8)

if y ≢ 0, λ ≠ λ̄, by (9.1.6). This inequality performs a vital function in ensuring the reality of the eigenvalues. An important part will be played by the "fundamental solution" Y(x, λ), a k-by-k matrix function, defined by

J Y′ = {λ A(x) + B(x)} Y,   Y(a, λ) = E, (9.1.9-10)
E being the unit k-by-k matrix. In a similar way to (9.1.7) we may prove that

(Y*JY)′ = (λ − λ̄) Y*AY, (9.1.11)

so that Y*JY is constant when λ is real. Using (9.1.10) we thus have, for real λ,

Y*(x, λ) J Y(x, λ) = J, (9.1.12)

so that Y(x, λ) is "J-unitary" when λ is real. As an example, we may put in the form (9.1.1) the Sturm-Liouville system (8.1.2-3), which of course includes (8.1.1). The system (8.1.2-3) may, in fact, be written (9.1.13)
so that (9.1.1) includes Sturm-Liouville equations, and also, allowing piecewise continuous coefficients, the recurrence relation leading to orthogonal polynomials. The definiteness condition is here (8.2.1), which was shown to be a consequence of (8.1.4-7). Similar treatment applies to the matrix Sturm-Liouville system

U′ = RV,   V′ = −(λP + Q)U, (9.1.14-15)
where U, V, P, Q, and R are variable square matrices, P, Q, and R being Hermitean, and P positive definite or at least semidefinite, and J having the form (3.2.8); if we weaken the conditions to allow P, Q, and R to vanish over subintervals, there will be included the case of matrix orthogonal polynomials, the topic of Sections 6.6-8. For a distinct example we take the fourth-order analog of the Sturm-Liouville equation, namely, the scalar equation
+
( P l . ' ) '
+
(APO
+ 4)
(9.1.16)
U = 0,
the variable coefficients q, p_0, p_1, p_2 being real-valued. The components
y_1, ..., y_4 of the 4-vector or column matrix y are now taken to be

y_1 = u,   y_2 = u′,   y_3 = u″/p_2,   y_4 = (u″/p_2)′ + p_1 u′. (9.1.17)
Then (9.1.16) is equivalent to the system

y_1′ = y_2,   y_2′ = p_2 y_3,   y_3′ = y_4 − p_1 y_2,   y_4′ = −(λp_0 + q) y_1. (9.1.18)
Writing y for the column matrix with entries y_1, ..., y_4 we may write (9.1.18) in the form

( 0  0  0 −1 )        ( λp_0 + q   0     0    0 )
( 0  0  1  0 ) y′  =  (    0      −p_1   0    1 ) y. (9.1.19)
( 0 −1  0  0 )        (    0       0    −p_2  0 )
( 1  0  0  0 )        (    0       1     0    0 )
Here the matrix J on the left is skew-Hermitean, while that on the right is Hermitean when λ is real. The coefficient of λ is also positive semidefinite, if as usual we assume that p_0 > 0, though this is unnecessarily restrictive. Let us now examine (9.1.16) from the point of view of the exact restrictions to be placed on the coefficients. If it is to be possible to differentiate out the leading term (u″/p_2)″ as u⁽⁴⁾/p_2 + 2u‴(1/p_2)′ + u″(1/p_2)″, we must assume not only that p_2 ≠ 0 but also that it has a
second derivative belonging to some suitable class of functions; similar remarks apply to the middle term. However, it is also possible to consider (9.1.16) on the understanding that (u″/p_2) is twice differentiable, without necessarily either of u″, 1/p_2 being twice differentiable separately, and likewise for (p_1 u′)′; this is the interpretation of (9.1.16) as a quasi-differential equation. Both interpretations of (9.1.16) are included in the interpretation of (9.1.18) or (9.1.19) in which we assume that q, p_0, p_1, and p_2 are Lebesgue integrable, and look for solutions that are absolutely continuous and satisfy the equations almost everywhere. The fourth-order equation (9.1.16) is more restrictive than the first-order system (9.1.19) in another way also, in that (9.1.16) has no sense if p_2 vanishes over a subinterval of (a, b). This does not apply to (9.1.19). As in the case of (8.1.2-3), we may consider (9.1.19) with coefficients vanishing over subintervals in order to bring certain recurrence relations within the framework of differential equations. Returning to the general case of the system (9.1.1), we assume in this chapter that (9.1.2-6) hold, and that A, B ∈ L(a, b). A solution will be a (vector) function that is absolutely continuous, satisfying (9.1.1) almost everywhere in (a, b). As in the case of (8.1.2-3), this is not quite the most general system enjoying the type of property to be established. We may consider the integral equation (9.1.20)
on the assumption that A_1(x), B_1(x) are of bounded variation and continuous over (a, b), A_1(x) being nondecreasing, solutions being sought in the domain of continuous functions. If A_1(x), B_1(x) are absolutely continuous, their derivatives almost everywhere being A(x), B(x), we arrive back at the differential equation (9.1.1). In the interests of preserving the differential formalism, we are thus excluding the case of (9.1.20) in which A_1(x), B_1(x) contain a singular component. Except where otherwise indicated, we assume the basic interval (a, b) to be finite. As before, this is mainly a matter of convenience, the essential restriction being that A, B ∈ L(a, b).

9.2. The Boundary Problem
As for the discrete case mentioned in Section 3.1, we suppose the boundary conditions specified by two square matrices M, N such that

M*JM = N*JN (9.2.1)
and such that Mv = Nv = 0, v a column matrix, must imply v = 0. The boundary problem consists in asking that (9.1.1) have a solution such that

y(a) = Mv,   y(b) = Nv, (9.2.2)

for some column matrix v ≠ 0. Admissible boundary conditions will in all cases include the periodic conditions y(a) = y(b) ≠ 0, given by taking M = N = E; more generally we may take M = E, N = exp (iα)E, for any real α. More generally still, we may take M = E, so that M*JM = J, and take N to be any "J-unitary" matrix, such that N*JN = J. Another possibility is that (9.2.1) should hold by virtue of both sides vanishing. This occurs, for example, in the Sturm-Liouville case; representing (8.1.2-3) in the form (9.1.13), the boundary conditions (8.3.1-2) may be expressed as
( u(a) )   ( sin α  0 ) ( v_1 )        ( u(b) )   ( 0  sin β ) ( v_1 )
( v(a) ) = ( cos α  0 ) ( v_2 ),       ( v(b) ) = ( 0  cos β ) ( v_2 ),

where v_1, v_2 are unknown, but not both zero; as it happens, in this case neither may be zero. The matrices M, N are those on the right of these equations, and we verify easily that M*JM = 0,
and likewise for N. We verify also that M, N have no common null-vectors, in that Mv = Nv = 0 implies that v = 0. As in Section 8.3, some general information concerning the eigenvalues is immediately available.

Theorem 9.2.1. The eigenvalues of the problem (9.1.1), (9.2.2) are all real and have no finite limit-point. Denoting them by the series

λ_r, r = 0, 1, ..., (9.2.3)

the sum

Σ_r (1 + | λ_r |)^{-1-ε} (9.2.4)
is convergent for any ε > 0.

Suppose first that λ is a complex eigenvalue, so that (9.1.1), (9.2.2) hold, with v ≠ 0. We have then that Mv, Nv are not both zero, so that y(a), y(b) are not both zero, so that y(x) is a nontrivial solution. Considering (9.1.8), the left-hand side is, by (9.2.2),

(Nv)*J(Nv) − (Mv)*J(Mv) = v*(N*JN − M*JM) v = 0
by (9.2.1). However, the integral on the right of (9.1.8) cannot vanish, by (9.1.6), and so λ = λ̄, and λ is real. Thus the eigenvalues are all real. Next we exhibit the eigenvalues as the zeros of an entire function. We define a fundamental solution, a square matrix of functions of x, of the matrix analog of (9.1.1), by

J Y′ = (λA + B) Y,   a ≤ x ≤ b,   Y(a) = E, (9.2.5-6)
where E is the k-by-k unit matrix; we write Y(x) or Y(x, λ) for this solution. For fixed x, a ≤ x ≤ b, it will be an entire function of λ, in that all its entries will be entire functions. The relation connecting solutions of (9.1.1) with this fundamental solution is

y(x) = Y(x, λ) y(a). (9.2.7)
For the right-hand side is a solution of (9.1.1), as we see by multiplying (9.2.5) on the right by y(a), and it coincides with y(x) when x = a in view of (9.2.6). Applying (9.2.7) with x = b and using the boundary conditions (9.2.2), we must have

Nv = Y(b, λ) Mv, (9.2.8)

and if this is to be soluble with v ≠ 0 we must have

det {N − Y(b, λ) M} = 0. (9.2.9)
Conversely, if λ satisfies (9.2.9), there will be a nontrivial solution v of (9.2.8), and we have a solution of the boundary problem by taking y(a) = Mv. Since Y(b, λ) consists of entire functions of λ, the left of (9.2.9) is also an entire function. We have just shown that it has no complex zeros, and hence it does not vanish identically. Hence its zeros have no finite limit-point. It must therefore be possible to number the eigenvalues λ_r serially. For definiteness we may suppose this done so that
| λ_0 | ≤ | λ_1 | ≤ | λ_2 | ≤ ...; (9.2.10)
to number them as in the Sturm-Liouville case may not be possible, as they may tend to infinity in both directions. We suppose each λ_r written in the series (9.2.10) a number κ_r of times, where κ_r, 1 ≤ κ_r ≤ k, is the number of linearly independent solutions v of (9.2.8); with each multiple eigenvalue there will thus be associated a set of κ_r consecutive suffixes. Finally, the observation (9.2.4) follows from the fact that Y(b, λ) or,
rather, its entries, are entire functions of order at most 1, satisfying in fact bounds of the form O{exp (const. | λ |)}. This estimate, and so (9.2.4), can be improved if A(x) has certain special forms, being in particular of rank less than k [cf. (8.2.5), (8.3.7)]. Since the polynomial case of Chapter 4 is not excluded, we cannot assert that there is necessarily an infinity of eigenvalues, without making additional assumptions.
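The characterization (9.2.8-9) of the eigenvalues is directly computable. In the Python sketch below, the fundamental solution Y(b, λ) of the 2-by-2 system u′ = v, v′ = −λu (a Sturm-Liouville case of the form (9.1.13) with r = p = 1, q = 0, chosen purely for illustration) is integrated by RK4 over (0, 2π), and det{N − Y(b, λ)M} is evaluated for the periodic conditions M = N = E; its zeros fall at λ = n², where cos(nx) and sin(nx) have period 2π.

```python
import math

def fundamental_Y(lam, b=2.0 * math.pi, steps=4000):
    """Y(b, lam) with Y(0) = E for y' = [[0, 1], [-lam, 0]] y, by RK4."""
    h = b / steps
    Y = [[1.0, 0.0], [0.0, 1.0]]

    def deriv(M):
        # (u-row)' = v-row, (v-row)' = -lam * u-row
        return [M[1][:], [-lam * M[0][0], -lam * M[0][1]]]

    def step(M, K, s):
        return [[M[i][j] + s * K[i][j] for j in range(2)] for i in range(2)]

    for _ in range(steps):
        k1 = deriv(Y)
        k2 = deriv(step(Y, k1, 0.5 * h))
        k3 = deriv(step(Y, k2, 0.5 * h))
        k4 = deriv(step(Y, k3, h))
        Y = [[Y[i][j] + (h / 6.0) * (k1[i][j] + 2 * k2[i][j]
                                     + 2 * k3[i][j] + k4[i][j])
              for j in range(2)] for i in range(2)]
    return Y

def char_det(lam):
    """det{N - Y(b, lam) M} of (9.2.9) for the periodic choice M = N = E."""
    Y = fundamental_Y(lam)
    m = [[1.0 - Y[0][0], -Y[0][1]], [-Y[1][0], 1.0 - Y[1][1]]]
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

For this choice the determinant reduces analytically to 2 − 2 cos(2π√λ), so the double zeros at λ = n² reflect the double eigenvalues of the periodic problem, in keeping with the multiplicities κ_r ≤ k allowed above.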
9.3. Eigenfunctions and Orthogonality

We consider first only simple eigenvalues, for which (9.2.8) has only one linearly independent solution for λ = λ_n, that is to say, for which κ_n = 1; subject to a later normalization, we take this solution as v = v_n, so that

{N − Y(b,λ_n) M} v_n = 0,   v_n ≠ 0.

We assert that these eigenfunctions are orthogonal according to

∫_a^b y*(x,λ_m) A(x) y(x,λ_n) dx = 0,   (λ_m ≠ λ_n),   (9.3.1)

where the eigenfunction y(x,λ_n) is the solution of

Jy′(x,λ_n) = (λ_n A + B) y(x,λ_n),   (9.3.2)

y(a,λ_n) = Mv_n,   (9.3.3)

y(b,λ_n) = Nv_n.   (9.3.4)

As in the proof of (9.1.7) we have

{y*(x,λ_m) J y(x,λ_n)}′ = (λ_n − λ_m) y*(x,λ_m) A(x) y(x,λ_n),

using the fact that λ_m is real, and so

[y*(x,λ_m) J y(x,λ_n)]_a^b = (λ_n − λ_m) ∫_a^b y*(x,λ_m) A(x) y(x,λ_n) dx.   (9.3.5)

Using the boundary conditions, the left-hand side is

(Nv_m)* J (Nv_n) − (Mv_m)* J (Mv_n) = 0

by (9.2.1). Since λ_m ≠ λ_n, the integral on the right of (9.3.5) must vanish, as asserted. If all eigenvalues are simple, we normalize the eigenfunctions so that

∫_a^b y*(x,λ_n) A(x) y(x,λ_n) dx = 1.   (9.3.6)

This is possible since by (9.1.6) the integral in (9.3.6) is certainly positive, and its value may be made unity by multiplying v_n by a positive
scalar; this fixes v_n, and so y(x,λ_n), except for a scalar factor of modulus unity, which we leave indeterminate. Abbreviating y(x,λ_n) to y_n(x) we shall then have

∫_a^b y_m*(x) A(x) y_n(x) dx = δ_mn,   (9.3.7)

provided that λ_m ≠ λ_n. Suppose now that λ_n is a multiple eigenvalue, in that the set of column matrices v given by

{v | (N − Y(b,λ_n) M) v = 0}   (9.3.8)
has dimension κ_n > 1. We suppose the eigenvalue λ_n written in the sequence of eigenvalues as, say,

λ_r,   r = n′ + 1, …, n′ + κ_n,   (9.3.9)

that is to say, κ_n times. Our task is to choose a basis

v_r,   r = n′ + 1, …, n′ + κ_n,   (9.3.10)

of the set (9.3.8) such that the corresponding eigenfunctions

y_r(x) = Y(x,λ_n) Mv_r,   r = n′ + 1, …, n′ + κ_n,   (9.3.11)

are mutually orthonormal among themselves, in that

∫_a^b y_r*(x) A(x) y_s(x) dx = δ_rs,   r, s = n′ + 1, …, n′ + κ_n;   (9.3.12)
if this be done for every multiple eigenvalue, (9.3.7) will hold without restriction. We arrange (9.3.12) by a process of orthogonalization. Writing

u_r = Mv_r = y_r(a),

so that

y_r(x) = Y(x,λ_n) u_r,   r = n′ + 1, …, n′ + κ_n,   (9.3.13)

(9.3.12) is equivalent to

u_r* ∫_a^b Y*(x,λ_n) A(x) Y(x,λ_n) dx u_s = δ_rs,   r, s = n′ + 1, …, n′ + κ_n.   (9.3.14)
We have to choose a basis

u_r,   r = n′ + 1, …, n′ + κ_n,   (9.3.15)

from the set of column matrices u given by

{u | u = Mv, (N − Y(b,λ_n) M) v = 0},   (9.3.16)

which are orthonormal in the sense that

u_r* W_1(b,λ_n) u_s = δ_rs,   r, s = n′ + 1, …, n′ + κ_n.   (9.3.17)
We note in the first place that the set (9.3.16) is of dimension κ_n, with the set (9.3.8); for if there were a v in the set (9.3.8) such that Mv = 0, we should have also Nv = 0, and hence also v = 0, by a basic assumption regarding the boundary matrices. Hence M is nonsingular in its action on the set (9.3.8), so that (9.3.16) has the same dimension. Next we need the observation that

W_1(b,λ_n) > 0.   (9.3.18)

For if u is an arbitrary column matrix, and y(x) = Y(x,λ) u is therefore a solution of (9.1.1) which is nontrivial if u ≠ 0, we have

u* W_1(b,λ) u = u* ∫_a^b Y*(x,λ) A(x) Y(x,λ) dx u > 0

if u ≠ 0; this proves (9.3.18). Thus W_1(b,λ_n) will have a positive definite square root {W_1(b,λ_n)}^{1/2}, likewise Hermitean. If we write

u_r† = {W_1(b,λ_n)}^{1/2} u_r,   (9.3.19)
the relations (9.3.17) assume the form

u_r†* u_s† = δ_rs,   r, s = n′ + 1, …, n′ + κ_n.   (9.3.20)

The set of column matrices

u_r†,   r = n′ + 1, …, n′ + κ_n,   (9.3.21)

is thus to be an orthonormal basis, with the standard inner product, of the set

{u† | u† = {W_1(b,λ_n)}^{1/2} u,  u = Mv,  (N − Y(b,λ_n) M) v = 0}.   (9.3.22)
Since {W_1(b,λ_n)}^{1/2} is nonsingular, this set is, with (9.3.16), also of dimensionality κ_n, and possesses an orthonormal basis of κ_n column matrices, with the standard orthogonality as in (9.3.20). Hence we may arrange that (9.3.12) holds. Thus the orthonormal relations (9.3.7) may be taken to hold also in the case that λ_m = λ_n, whether or not m = n, and so unrestrictedly. We shall now set up the eigenfunction expansion associated with these orthonormal relations. We do this in a purely formal way, deferring the proof of the expansion till Section 9.6. The eigenfunctions being the column matrices y_n(x), the expansion will be of some class of column matrix functions φ(x), in the form

φ(x) = Σ_n y_n(x) c_n.   (9.3.23)
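The orthogonalization just described — choosing a basis orthonormal with respect to the positive definite Hermitean matrix W_1(b,λ_n), as in (9.3.17) — can be carried out by a Cholesky factorization of the Gram matrix. A sketch with made-up data (not from the text; W1 and U here are arbitrary stand-ins):

```python
import numpy as np

# Illustrative sketch: orthonormalize a basis u_r with respect to a positive
# definite matrix W1 standing in for W1(b, lambda_n), cf. (9.3.17)-(9.3.18).
rng = np.random.default_rng(1)
k, kappa = 4, 2
R = rng.standard_normal((k, k))
W1 = R.T @ R + np.eye(k)             # positive definite, as in (9.3.18)
U = rng.standard_normal((k, kappa))  # columns: an arbitrary basis of (9.3.16)

# Gram matrix G_rs = u_r* W1 u_s; with G = L L^T (Cholesky), the columns of
# U L^{-T} satisfy u_r* W1 u_s = delta_rs, which is (9.3.17).
G = U.T @ W1 @ U
L = np.linalg.cholesky(G)
U_on = U @ np.linalg.inv(L.T)

print(np.round(U_on.T @ W1 @ U_on, 10))  # identity matrix
```

This is only one convenient realization; the text's argument via the square root {W_1}^{1/2} in (9.3.19)-(9.3.22) produces an equivalent orthonormal basis.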
Here the coefficient c_n, a scalar, has been placed at the right, considering it as a 1-by-1 matrix admitting left-multiplication by a column matrix. To determine the c_n by the usual Fourier process, we multiply (9.3.23) on the left by y_n*(x) A(x) and integrate, obtaining, in view of (9.3.7),

c_n = ∫_a^b y_n*(x) A(x) φ(x) dx.   (9.3.24)

The expansion (9.3.23) then becomes

φ(x) = Σ_n y_n(x) ∫_a^b y_n*(t) A(t) φ(t) dt.   (9.3.25)
This shows incidentally a connection between the rank of A(t) and the nature of the expansion. It may happen that A(t)ψ = 0 for all ψ in some subspace of the k-dimensional vector space, and for a ≤ t ≤ b; in such a case (9.3.24) takes no account of the component of φ(x) lying in such a subspace, and so the expansion cannot be expected to hold in this subspace. The latter remark applies to the Sturm-Liouville case, where the matrix A is the first matrix on the right of (9.1.13), and has constant rank 1, with a constant null-space. In this case we get an expansion of an "arbitrary" function, in the scalar sense, in terms of eigenfunctions. Similar remarks apply to (9.1.19). The expansion may be put in more symmetrical form, and a form
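This rank-1 behavior is easy to see numerically. In the hypothetical string example −y″ = λy, y(0) = y(π) = 0 (not from the text), A = diag(1,0), the normalized eigenfunctions are y_n(x) = √(2/π)(sin nx, n cos nx)ᵀ, and the expansion (9.3.25) reproduces only the first component of φ:

```python
import numpy as np

# Expand phi_1(x) = x(pi - x) in the weighted eigenfunction series (9.3.25);
# only the first component is weighted, since A = diag(1, 0).
x = np.linspace(0.0, np.pi, 2001)
phi1 = x * (np.pi - x)

def trap(f):
    # simple trapezoidal rule on the fixed grid x
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x)))

partial = np.zeros_like(x)
for n in range(1, 41):
    yn1 = np.sqrt(2.0 / np.pi) * np.sin(n * x)  # first component of y_n(x)
    c_n = trap(yn1 * phi1)                      # (9.3.24) with A = diag(1,0)
    partial += yn1 * c_n                        # first component of (9.3.25)

print(np.max(np.abs(partial - phi1)))  # small: scalar expansion converges
```

The second component of φ is untouched by A and so is simply not represented, in agreement with the remark above.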
which is important for limiting procedures, if we define the spectral function

τ_{M,N}(λ) = Σ_{0<λ_n≤λ} u_n u_n*   (λ ≥ 0),   τ_{M,N}(λ) = − Σ_{λ<λ_n≤0} u_n u_n*   (λ < 0),   (9.3.26)

where we interpret τ_{M,N}(0) = 0. Thus τ_{M,N}(λ) is a matrix-valued step function which is Hermitean, nondecreasing, and right-continuous. Its jumps occur at the eigenvalues, the jump at λ_n being u_n u_n*, where u_n is the initial value, for x = a, of the associated normalized eigenfunction, provided that the eigenvalue λ_n is simple; in the event of a multiple eigenvalue λ_n, the jump is to be

Σ_r u_r u_r*,   r = n′ + 1, …, n′ + κ_n,   (9.3.27)

taken over a set of normalized and orthogonal eigenfunctions associated with λ_n. A similar function was constructed in Section 6.8, for the special case of orthogonal polynomials with matrix coefficients. To rephrase the eigenfunction expansion in terms of the spectral function we define a column matrix function ψ(λ) by

ψ(λ) = ∫_a^b Y*(t,λ) A(t) φ(t) dt,   (9.3.28)

being a modification of the Fourier coefficient. In view of the relations y_r(x) = Y(x,λ_n) u_r, y_r*(t) = u_r* Y*(t,λ_n) we may then write (9.3.25) as

φ(x) = ∫_{−∞}^{∞} Y(x,λ) dτ_{M,N}(λ) ψ(λ).   (9.3.29)

The eigenfunction expansion thus becomes a pair of reciprocal integral transforms.
9.4. The Inhomogeneous Problem

In this, the basis chosen here for the expansion theorem, we suppose given a column matrix χ(x), a ≤ x ≤ b, of functions of L(a,b), and ask for a solution y of

Jy′ = (λA + B) y − χ,   (9.4.1)
satisfying the boundary conditions, so that for some column matrix v, with v = 0 allowed, we have

y(a) = Mv,   y(b) = Nv.   (9.4.2)

We show that, provided that λ is not an eigenvalue, the unique solution is available in the form

y(x) = ∫_a^b K(x,t,λ) χ(t) dt,   (9.4.3)

where K(x,t,λ) is a square matrix of functions which for fixed x have at most one discontinuity in t. In particular, v in (9.4.2) is determinate, and will be found explicitly below. The problem and its solution have obvious affinities with the solution (8.8.3) of the inhomogeneous problem of Section 8.8 by means of the Green's function. The latter problem may be posed in the present terms and with a slight extension as the finding of a solution of
where in (8.8.1-2) we have χ for χ_1 and 0 for χ_2, the boundary conditions being those of (9.4.5), for some v_1, v_2, possibly both zero. According to (9.4.3), there will be a solution of the form (9.4.6), where K_rs = K_rs(x,t,λ) are the entries in K(x,t,λ). In particular, if χ_2 = 0, we have (9.4.7), and on comparing this with (8.8.3) we see that the Green's function G(x,t,λ) for the problem (8.8.1-2), (8.6.8-9) is the top left entry in the matrix K(x,t,λ) for the problem (9.4.4-5). To avoid confusion with the Green's function we shall term the matrix K(x,t,λ) for the problem (9.4.1-2) the "resolvent kernel." It may be constructed by means of routine calculations of the nature of the method of variation of parameters. We seek a solution of (9.4.1-2)
of the form y(x) = Y(x,λ) η(x), where Y(x,λ) is the fundamental matrix solution of (9.2.5-6) and η(x) is a column matrix to be found. Abbreviating Y(x,λ) to Y_x and differentiating, we have

Jy′ = JY_x′ η + JY_x η′ = (λA + B) Y_x η + JY_x η′ = (λA + B) y + JY_x η′.

This agrees with (9.4.1) if JY_x η′ = −χ, and so we take

η′ = −Y_x^{-1} J^{-1} χ.   (9.4.8)

In addition we have

y(x) = Y_x Mv − Y_x ∫_a^x Y_t^{-1} J^{-1} χ(t) dt,   (9.4.9)

and in particular

y(b) = Nv = Y_b Mv − ∫_a^b Y_b Y_t^{-1} J^{-1} χ(t) dt.   (9.4.10)
Provided that λ is not an eigenvalue, that is, provided that (9.2.9) does not hold, N − Y_b M will have an inverse, and (9.4.10) may be solved for v, giving

v = (Y_b M − N)^{-1} ∫_a^b Y_b Y_t^{-1} J^{-1} χ(t) dt.

Substituting in (9.4.9) we obtain the solution of (9.4.1-2) as

y(x) = Y_x M (Y_b M − N)^{-1} ∫_a^b Y_b Y_t^{-1} J^{-1} χ(t) dt − Y_x ∫_a^x Y_t^{-1} J^{-1} χ(t) dt.   (9.4.11)
Verifying this solution, it is easily checked that y as given by (9.4.9) satisfies (9.4.1) and the first of (9.4.2), and that it satisfies the last of (9.4.2) in the special form (9.4.11). If λ is not an eigenvalue, the solution is of course unique, since the difference of two solutions of the inhomogeneous problem would have to be an eigenfunction. Summing up we have

Theorem 9.4.1. If λ is not an eigenvalue of the problem (9.1.1), (9.2.2), and χ(x) ∈ L(a,b), then the inhomogeneous problem (9.4.1-2)
has a unique solution (9.4.3), where the resolvent kernel K(x,t,λ) has for x < t the form

K(x,t,λ) = Y_x M (Y_b M − N)^{-1} Y_b Y_t^{-1} J^{-1},   (9.4.12)

and for x > t the forms

K(x,t,λ) = Y_x M (Y_b M − N)^{-1} Y_b Y_t^{-1} J^{-1} − Y_x Y_t^{-1} J^{-1},   (9.4.13)

K(x,t,λ) = Y_x Y_b^{-1} N (Y_b M − N)^{-1} Y_b Y_t^{-1} J^{-1},   (9.4.14)
where Y_x denotes Y(x,λ). To check that (9.4.13-14) are the same we write (9.4.13) as

Y_x {M − Y_b^{-1}(Y_b M − N)} (Y_b M − N)^{-1} Y_b Y_t^{-1} J^{-1},

which clearly simplifies to (9.4.14). For a < x < b there exist the distinct limits

K(x, x+0, λ) = Y_x M (Y_b M − N)^{-1} Y_b Y_x^{-1} J^{-1},   (9.4.15)

K(x, x−0, λ) = Y_x M (Y_b M − N)^{-1} Y_b Y_x^{-1} J^{-1} − J^{-1},   (9.4.16)

or (9.4.14) with t = x. For definiteness, we may take K(x,x,λ) as the arithmetic mean of these two, which may be reduced to the form, taking (9.4.15) and (9.4.14) with t = x,

K(x,x,λ) = ½ Y_x {M + Y_b^{-1} N} (Y_b M − N)^{-1} Y_b Y_x^{-1} J^{-1} = ½ Y_x (M + Y_b^{-1} N)(M − Y_b^{-1} N)^{-1} Y_x^{-1} J^{-1}.   (9.4.17)
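The forms (9.4.12)-(9.4.16) can be checked numerically. The sketch below (the string example −y″ = λy with Dirichlet conditions on [0,π]; an illustration, not from the text) uses the closed-form fundamental matrix of that system to confirm that (9.4.13) and (9.4.14) agree for x > t, and that the jump of K in t is J^{-1}:

```python
import numpy as np

# Example system (not from the text): -y'' = lam*y, y(0) = y(pi) = 0,
# as J y' = (lam A + B) y with closed-form fundamental matrix Y(x, lam).
J = np.array([[0, -1], [1, 0]], dtype=complex)
M = np.array([[0, 0], [1, 0]], dtype=complex)
N = np.array([[0, 0], [0, 1]], dtype=complex)
b = np.pi
Jinv = np.linalg.inv(J)

def Y(x, lam):
    w = np.sqrt(lam + 0j)
    return np.array([[np.cos(w * x), np.sin(w * x) / w],
                     [-w * np.sin(w * x), np.cos(w * x)]])

def K(x, t, lam):
    """Resolvent kernel via (9.4.12) for x < t and (9.4.13) for x > t."""
    Yx, Yt, Yb = Y(x, lam), Y(t, lam), Y(b, lam)
    core = Yx @ M @ np.linalg.inv(Yb @ M - N) @ Yb @ np.linalg.inv(Yt) @ Jinv
    return core if x < t else core - Yx @ np.linalg.inv(Yt) @ Jinv

def K14(x, t, lam):
    """Alternative form (9.4.14), valid for x > t."""
    Yx, Yt, Yb = Y(x, lam), Y(t, lam), Y(b, lam)
    return (Yx @ np.linalg.inv(Yb) @ N @ np.linalg.inv(Yb @ M - N)
            @ Yb @ np.linalg.inv(Yt) @ Jinv)

lam = 2.0 + 0.7j     # any non-eigenvalue
x = 1.0

# (9.4.13) and (9.4.14) coincide for x > t:
assert np.allclose(K(2.0, 1.0, lam), K14(2.0, 1.0, lam))

# jump J^{-1} as t increases through x, cf. (9.4.15)-(9.4.16):
eps = 1e-7
assert np.allclose(K(x, x + eps, lam) - K(x, x - eps, lam), Jinv, atol=1e-5)
print("resolvent kernel checks passed")
```

The agreement of (9.4.13) with (9.4.14) is pure algebra, so the check is exact up to rounding; the jump check approximates the one-sided limits by small offsets in t.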
The resolvent kernel has the following formal properties:

Theorem 9.4.2. If λ is not an eigenvalue, K(x,t,λ) is continuous in both x and t, except for a jump J^{-1} as t increases through x for fixed x, a < x < b. It satisfies the resolvent equation

K(x,t,λ) − K*(t,x,μ) = (μ̄ − λ) ∫_a^b K*(s,x,μ) A(s) K(s,t,λ) ds,   (9.4.18)

if x ≠ t and μ is not an eigenvalue. In particular,

K(x,t,λ) = K*(t,x,λ̄).   (9.4.19)
The first statement follows from (9.4.12-13), Y_x, Y_t^{-1} being continuous
in x, t without restriction; the nature of the discontinuity as t increases through x is evident from (9.4.15-16). We remark incidentally that in the case (9.4.4) the jump of J^{-1} does not affect the top left entry in K, so that the Green's function in this case has no discontinuity. For (9.4.18) we give an indirect proof. Together with the inhomogeneous problem (9.4.1-2) we consider a second such problem, namely,

Jz′ = (μA + B) z − ψ,   z(a) = Mv_1,   z(b) = Nv_1.   (9.4.20)
In view of these boundary conditions and (9.4.2) we have [z*Jy]_a^b = 0. However

(z*Jy)′ = −(Jz′)* y + z*Jy′ = −{(μA + B)z − ψ}* y + z*{(λA + B)y − χ} = (λ − μ̄) z*Ay + ψ*y − z*χ.   (9.4.21)

In view of (9.4.21) we deduce that

∫_a^b ψ*(x) y(x) dx − ∫_a^b z*(t) χ(t) dt = (μ̄ − λ) ∫_a^b z*(s) A(s) y(s) ds.   (9.4.22)
In view of (9.4.3) the first integral on the left is

∫_a^b ∫_a^b ψ*(x) K(x,t,λ) χ(t) dt dx.

Since z(t) = ∫_a^b K(t,x,μ) ψ(x) dx, we have, on taking adjoints and substituting in the next integral in (9.4.22),

∫_a^b z*(t) χ(t) dt = ∫_a^b ∫_a^b ψ*(x) K*(t,x,μ) χ(t) dt dx.

Substituting in the last integral in (9.4.22) we obtain

(μ̄ − λ) ∫_a^b z*(s) A(s) y(s) ds = (μ̄ − λ) ∫_a^b ∫_a^b ψ*(x) { ∫_a^b K*(s,x,μ) A(s) K(s,t,λ) ds } χ(t) dt dx.

Hence from (9.4.22) we obtain

∫_a^b ∫_a^b ψ*(x) { K(x,t,λ) − K*(t,x,μ) − (μ̄ − λ) ∫_a^b K*(s,x,μ) A(s) K(s,t,λ) ds } χ(t) dx dt = 0.
This is true for arbitrary continuous ψ(x), χ(t). Also, the matrix function in the braces { } is continuous in x and t if x ≠ t; this also applies to the integral inside the braces. Hence the matrix function inside the braces must vanish identically, which proves (9.4.18). We need also the special case t = x, for which the result is still valid.
Theorem 9.4.3. If λ, μ are not eigenvalues,

K(x,x,λ) − K*(x,x,μ) = (μ̄ − λ) ∫_a^b K*(s,x,μ) A(s) K(s,x,λ) ds   (9.4.23)

= (μ̄ − λ) ∫_a^b K(x,s,μ̄) A(s) K*(x,s,λ̄) ds.   (9.4.24)
We deduce this by making t → x; we assume for definiteness that a < x < b and make t → x + 0. The integral on the right of (9.4.18) is continuous in t in spite of the fact that K(x,t,λ) has a jump J^{-1} at t = x; this may be seen, for example, by expressing the integral in question as the sum of integrals over (a,x), (x,t), and (t,b), in each of which one of the expressions (9.4.12-13) may be used. Thus we derive from (9.4.18) that

K(x, x+0, λ) − K*(x+0, x, μ) = (μ̄ − λ) ∫_a^b K*(s,x,μ) A(s) K(s,x,λ) ds.   (9.4.25)
We now note that

K(x, x+0, λ) − K(x, x−0, λ) = J^{-1},   K(x,x,λ) = ½ K(x, x+0, λ) + ½ K(x, x−0, λ),

so that

K(x, x+0, λ) = K(x,x,λ) + ½ J^{-1},   (9.4.26)

K(x, x−0, λ) = K(x,x,λ) − ½ J^{-1}.   (9.4.27)

Furthermore,

K(x+0, x, λ) = K(x, x−0, λ),   (9.4.28)

since K(x,t,λ) for x > t is given by the continuous expressions (9.4.13-14). Taking adjoints in (9.4.27) and (9.4.28) and recalling that J* = −J we deduce that

K*(x+0, x, λ) = K*(x,x,λ) + ½ J^{-1}.   (9.4.29)
On the left of (9.4.25) we now substitute for the first term on the basis of (9.4.26), and for the second by means of (9.4.29), with μ in
place of λ. We then obtain the required result (9.4.23). The proof for x = b is similar. A special case of this result, with x = a and μ = λ, is proved by direct calculation as (9.5.14).
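The resolvent equation (9.4.18) can also be verified by quadrature in a concrete case. The sketch below (string problem −y″ = λy, Dirichlet conditions on [0,π]; an example not taken from the text) evaluates both sides of (9.4.18), splitting the s-integral at the kinks s = x and s = t:

```python
import numpy as np

J = np.array([[0, -1], [1, 0]], dtype=complex)
A = np.array([[1, 0], [0, 0]], dtype=complex)
M = np.array([[0, 0], [1, 0]], dtype=complex)
N = np.array([[0, 0], [0, 1]], dtype=complex)
b = np.pi
Jinv = np.linalg.inv(J)

def Y(x, lam):
    w = np.sqrt(lam + 0j)
    return np.array([[np.cos(w * x), np.sin(w * x) / w],
                     [-w * np.sin(w * x), np.cos(w * x)]])

def K(x, t, lam):
    Yx, Yt, Yb = Y(x, lam), Y(t, lam), Y(b, lam)
    core = Yx @ M @ np.linalg.inv(Yb @ M - N) @ Yb @ np.linalg.inv(Yt) @ Jinv
    return core if x < t else core - Yx @ np.linalg.inv(Yt) @ Jinv

def adj(mat):
    return mat.conj().T

lam, mu = 0.5 + 0.3j, 2.0 + 0.5j   # two non-eigenvalues
x, t = 1.0, 2.0

# right-hand side of (9.4.18), piecewise over (0,x), (x,t), (t,pi)
rhs = np.zeros((2, 2), dtype=complex)
for lo, hi in [(0.0, x), (x, t), (t, b)]:
    s = np.linspace(lo + 1e-9, hi - 1e-9, 1501)  # stay off the kinks
    vals = np.array([adj(K(si, x, mu)) @ A @ K(si, t, lam) for si in s])
    rhs += np.sum(0.5 * (vals[1:] + vals[:-1])
                  * np.diff(s)[:, None, None], axis=0)
rhs *= (np.conj(mu) - lam)

lhs = K(x, t, lam) - adj(K(t, x, mu))
assert np.allclose(lhs, rhs, atol=1e-4)
print("resolvent equation (9.4.18) verified")
```

The factor (μ̄ − λ) and the placement of the adjoints follow the statement of Theorem 9.4.2; the agreement is limited only by the trapezoidal quadrature.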
9.5. The Characteristic Function

Taking in (9.4.17) x = a and noting that Y_a = E we obtain the function

F_{M,N}(λ) = ½ (M + Y_b^{-1} N)(M − Y_b^{-1} N)^{-1} J^{-1}.   (9.5.1)

We term this the "characteristic function," in full analogy to the function defined in (1.6.1). It is a square matrix of functions of λ, of the same order as the matrices occurring in the differential equation (9.1.1); in connection with the second-order difference equation of Chapter 4, it was convenient to consider a scalar characteristic function, the same situation obtaining in the Sturm-Liouville case of Chapter 8. The complete function has, however, the same dimensionality as the defining first-order equation. Key properties of this function are that it is Hermitean for real λ, when finite, and that its imaginary part has a fixed sign in each of the upper and lower λ-halfplanes. Its residues at its poles, which are located at the eigenvalues, are the jumps of the spectral function at those points. In addition, its singularities specify those of the resolvent kernel. To connect the characteristic function with the resolvent kernel, we note that

M(Y_b M − N)^{-1} Y_b = {F_{M,N}(λ) + ½ J^{-1}} J,   (9.5.2)

Y_b^{-1} N (Y_b M − N)^{-1} Y_b = {F_{M,N}(λ) − ½ J^{-1}} J.   (9.5.3)

Hence (9.4.12), (9.4.14) give

K(x,t,λ) = Y_x {F_{M,N}(λ) + ½ J^{-1}} J Y_t^{-1} J^{-1},   x < t,   (9.5.4)

K(x,t,λ) = Y_x {F_{M,N}(λ) − ½ J^{-1}} J Y_t^{-1} J^{-1},   x > t.   (9.5.5)

These results may be put in a simpler form for real λ, since for such λ we have

Y*(x,λ) J Y(x,λ) = J.   (9.5.6)

To see this we note that, as for (9.1.7),

(Y*JY)′ = (λ − λ̄) Y*AY.   (9.5.7)
Since Y(a,λ) = E it follows that

Y*(x,λ) J Y(x,λ) − J = (λ − λ̄) ∫_a^x Y*(t,λ) A(t) Y(t,λ) dt.   (9.5.8)

The right-hand side being zero if λ is real, we deduce (9.5.6) for real λ. Writing this in the form

J Y^{-1}(t,λ) J^{-1} = Y*(t,λ),   (9.5.9)
we deduce that, for real λ, (9.5.4-5) may be replaced by

K(x,t,λ) = Y_x {F_{M,N}(λ) + ½ J^{-1}} Y_t*,   x < t,   (9.5.10)

K(x,t,λ) = Y_x {F_{M,N}(λ) − ½ J^{-1}} Y_t*,   x > t,   (9.5.11)

while for x = t the mean of these expressions gives

K(x,x,λ) = Y_x F_{M,N}(λ) Y_x*.   (9.5.12)
From (9.5.4-5) and (9.5.10-11) we see that the singularities of K(x,t,λ) and F_{M,N}(λ) are closely connected. That the latter has only simple poles will follow from

Theorem 9.5.1. The function F_{M,N}(λ) is Hermitean for real λ, except for poles, and for complex λ satisfies

Im F_{M,N}(λ) ≤ 0   for   Im λ ≥ 0.   (9.5.13)

This will follow from the evaluation of Im F_{M,N}(λ), namely,

Im F_{M,N}(λ) = −(Im λ) V*^{-1} N* Y_b*^{-1} W_1(b,λ) Y_b^{-1} N V^{-1},   (9.5.14)

where we write for brevity

W_1(x,λ) = ∫_a^x Y*(t,λ) A(t) Y(t,λ) dt,   (9.5.15)

and

U = M + Y_b^{-1} N,   V = J(M − Y_b^{-1} N).   (9.5.16)
For if λ is real, (9.5.14) shows that F_{M,N}(λ) has zero imaginary part, and so is Hermitean, provided that V^{-1} exists, that is, λ is not an eigenvalue. We next observe that

W_1(b,λ) = ∫_a^b Y*(t,λ) A(t) Y(t,λ) dt > 0.   (9.5.17)
For in the definiteness postulate (9.1.6) we may replace y(x) by Y(x,λ) u, where u is any column matrix other than zero, getting

u* ∫_a^b Y*(x,λ) A(x) Y(x,λ) dx u > 0,

which is the same as (9.5.17). Hence, if V^{-1} exists,

(Y_b^{-1} N V^{-1})* W_1(b,λ) (Y_b^{-1} N V^{-1}) > 0,
and so the right of (9.5.14) has the opposite sign to Im λ. Turning to the calculation (9.5.14), with the notation (9.5.16) we may write

F_{M,N}(λ) = ½ U V^{-1},   (9.5.18)

so that

Im F_{M,N}(λ) = (2i)^{-1} (½ U V^{-1} − ½ V*^{-1} U*) = (4i)^{-1} V*^{-1} (V*U − U*V) V^{-1}.   (9.5.19)

Now

V*U − U*V = (M* − N* Y_b*^{-1}) J* (M + Y_b^{-1} N) − (M* + N* Y_b*^{-1}) J (M − Y_b^{-1} N).

Since J* = −J this reduces to

V*U − U*V = −2(M* J M − N* Y_b*^{-1} J Y_b^{-1} N),

and since M*JM = N*JN, to

V*U − U*V = 2 N* Y_b*^{-1} (J − Y_b* J Y_b) Y_b^{-1} N = −2(λ − λ̄) N* Y_b*^{-1} W_1(b,λ) Y_b^{-1} N,

by (9.5.8) and (9.5.15). Substituting in (9.5.19) we obtain (9.5.14), completing the proof of Theorem 9.5.1. Hence, as stated, F_{M,N}(λ) can have only simple poles (cf. Appendix II), which occur at the singularities of (M − Y_b^{-1} N)^{-1} or (Y_b M − N)^{-1} Y_b. These are clearly the zeros of det (Y_b M − N), or roots of (9.2.9), that is to say, the eigenvalues. We denote the residue of F_{M,N}(λ) at λ_n by P_n, so that near λ_n there holds the expansion as a Laurent series

F_{M,N}(λ) = P_n (λ − λ_n)^{-1} + ⋯,   (9.5.20)

the omitted terms being regular near λ = λ_n.
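Both properties of Theorem 9.5.1 are easy to check numerically. The sketch below (string example −y″ = λy on [0,π] with Dirichlet conditions; not from the text) evaluates F_{M,N}(λ) from (9.5.1), testing the Hermitean property at a real λ and the sign property (9.5.13) in the upper half-plane:

```python
import numpy as np

J = np.array([[0, -1], [1, 0]], dtype=complex)
M = np.array([[0, 0], [1, 0]], dtype=complex)
N = np.array([[0, 0], [0, 1]], dtype=complex)
b = np.pi

def Y(x, lam):
    w = np.sqrt(lam + 0j)
    return np.array([[np.cos(w * x), np.sin(w * x) / w],
                     [-w * np.sin(w * x), np.cos(w * x)]])

def F(lam):
    """Characteristic function (9.5.1)."""
    Ybinv = np.linalg.inv(Y(b, lam))
    return 0.5 * (M + Ybinv @ N) @ np.linalg.inv(M - Ybinv @ N) \
        @ np.linalg.inv(J)

# Hermitean at a real non-eigenvalue:
F_real = F(2.5)
assert np.allclose(F_real, F_real.conj().T)

# Im F negative semidefinite in the upper half-plane, cf. (9.5.13):
F_c = F(2.0 + 1.0j)
ImF = (F_c - F_c.conj().T) / 2j
assert np.max(np.linalg.eigvalsh(ImF)) <= 1e-9
print("characteristic function checks passed")
```

For this example F reduces to a 2×2 matrix whose only λ-dependent entry is √λ·cot(√λ π), so the Herglotz-type sign behavior can also be read off analytically.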
In particular, since

P_n = lim_{λ→λ_n} (λ − λ_n) F_{M,N}(λ),   (9.5.21)

we have that P_n is Hermitean, since the transition (9.5.21) may be made through real λ-values. We proceed to evaluate P_n as the jump in the spectral function (9.3.26) at λ_n.

Theorem 9.5.2. The residue of F_{M,N}(λ) at λ = λ_n is

P_n = Σ_r u_r u_r*,   (9.5.22)

that of K(x,t,λ) being

Σ_r y_r(x) y_r*(t).   (9.5.23)
We denote as before by κ_n ≥ 1 the dimension of the set (9.3.8), which is also the number of terms in the sums in (9.5.22-23); if κ_n > 1, the orthonormalization (9.3.12) or (9.3.17) is supposed to have been carried out. We first show that P_n has rank at most κ_n. For it follows from (9.5.1) that

F_{M,N}(λ) J Y_b^{-1} (Y_b M − N) = ½ (M + Y_b^{-1} N).

Since the right is regular for all λ, substitution of (9.5.20) shows that

P_n (λ − λ_n)^{-1} J Y_b^{-1} (Y_b M − N)

is bounded in a neighborhood of λ_n, so that

P_n J Y^{-1}(b,λ_n) [Y(b,λ_n) M − N] = 0.   (9.5.24)
Of the factors on the right, J and Y^{-1}(b,λ_n) are nonsingular, and [Y(b,λ_n) M − N] has rank k − κ_n, in view of our assumption concerning the set (9.3.8). Hence it follows from (9.5.24) that P_n has rank at most κ_n. We complete the determination of P_n by considering the singularities of K(x,t,λ). By Theorem 9.4.2 the resolvent kernel is regular except at the eigenvalues, namely the roots of (9.2.9). By Theorem 9.5.1 and (9.5.10-12) the singularities of K(x,t,λ) are in fact simple poles, and substituting (9.5.20) in these formulas we see that the residue is given by

K(x,t,λ) = Y(x,λ_n) P_n Y*(t,λ_n) (λ − λ_n)^{-1} + ⋯,   (9.5.25)
valid in a neighborhood of λ_n, excluding λ_n itself. We now use the fact that an eigenfunction y_r(x) associated with the eigenvalue λ_n, that is, for which λ_r = λ_n, satisfies the differential equation

Jy_r′ = (λA + B) y_r − (λ − λ_n) A y_r,

together with the boundary conditions. By (9.4.1-3) we have then

y_r(x) = (λ − λ_n) ∫_a^b K(x,t,λ) A(t) y_r(t) dt,   (9.5.26)

provided that λ is not an eigenvalue. Making λ → λ_n and using (9.5.25) we deduce that

y_r(x) = ∫_a^b Y(x,λ_n) P_n Y*(t,λ_n) A(t) y_r(t) dt.

By (9.3.13) this is equivalent to

u_r = P_n ∫_a^b Y*(t,λ_n) A(t) Y(t,λ_n) dt u_r,

or, with the notation (9.5.17),

u_r = P_n W_1(b,λ_n) u_r,   r = n′ + 1, …, n′ + κ_n.   (9.5.27)
Abbreviating W_1(b,λ_n) temporarily to W_n, we may write (9.5.27) as

W_n^{1/2} u_r = (W_n^{1/2} P_n W_n^{1/2}) W_n^{1/2} u_r,

where W_n^{1/2} is as before the positive definite square root of W_n, which is Hermitean. Hence the Hermitean matrix W_n^{1/2} P_n W_n^{1/2} acts as the identity operator on the orthonormal set (9.3.19) of κ_n column matrices. We proved above that P_n was of rank at most κ_n; since the same conclusion follows for W_n^{1/2} P_n W_n^{1/2}, we see that the latter is of rank exactly κ_n, having κ_n eigenvalues equal to unity, the remainder of its eigenvalues being accordingly zero. Hence W_n^{1/2} P_n W_n^{1/2} is the projector onto the manifold spanned by the set (9.3.19), that is to say,

W_n^{1/2} P_n W_n^{1/2} = Σ_r (W_n^{1/2} u_r)(W_n^{1/2} u_r)*,

the summation being over the same set of r as in (9.3.19). Removing the nonsingular factors W_n^{1/2} we deduce (9.5.22). We get (9.5.23) on substituting for P_n in (9.5.25) and using (9.3.13). This completes the proof of Theorem 9.5.2.
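For the string example used earlier (−y″ = λy, y(0) = y(π) = 0; an illustration, not from the text) the lowest eigenvalue is λ_1 = 1 with normalized eigenfunction y_1(x) = √(2/π)(sin x, cos x)ᵀ, so u_1 = y_1(0) = (0, √(2/π))ᵀ. The residue P_1 of F_{M,N} at λ_1 can then be compared with u_1 u_1* as in (9.5.22):

```python
import numpy as np

J = np.array([[0, -1], [1, 0]], dtype=complex)
M = np.array([[0, 0], [1, 0]], dtype=complex)
N = np.array([[0, 0], [0, 1]], dtype=complex)
b = np.pi

def F(lam):
    """Characteristic function (9.5.1) for the string example."""
    w = np.sqrt(lam + 0j)
    Yb = np.array([[np.cos(w * b), np.sin(w * b) / w],
                   [-w * np.sin(w * b), np.cos(w * b)]])
    Ybinv = np.linalg.inv(Yb)
    return 0.5 * (M + Ybinv @ N) @ np.linalg.inv(M - Ybinv @ N) \
        @ np.linalg.inv(J)

# residue via (9.5.21): (lambda - lambda_1) F(lambda) near lambda_1 = 1
P1 = 1e-6 * F(1.0 + 1e-6)

u1 = np.sqrt(2.0 / np.pi) * np.array([0.0, 1.0])  # y_1(0)
assert np.allclose(P1, np.outer(u1, u1), atol=1e-4)  # (9.5.22)
print(np.round(P1.real, 4))  # approximately diag(0, 2/pi)
```

Analytically, the only pole-bearing entry of F here is √λ·cot(√λ π), whose residue at λ = n² is 2n²/π, matching u_n u_n* for the normalized eigenfunctions.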
9.6. The Eigenfunction Expansion

We give here a proof, of a more general character than that used in Section 8.6, and one which depends to some extent on principles of complex variable theory. The main argument is contained in

Lemma 9.6.1. Let χ(t), a ≤ t ≤ b, be a column matrix of functions which are measurable over (a,b), and which are such that

∫_a^b χ*(t) A(t) χ(t) dt < ∞.   (9.6.1)

For some positive real Λ let

∫_a^b y_r*(t) A(t) χ(t) dt = 0,   |λ_r| ≤ Λ.   (9.6.2)

Let λ = 0 not be an eigenvalue, and let y(x) be the unique solution of the inhomogeneous problem

Jy′ = By + Aχ,   (9.6.3)

together with the boundary conditions y(a) = Mv, y(b) = Nv for some v; as in Section 9.4, v is determinate, when χ is given. Then

∫_a^b y*(t) A(t) y(t) dt ≤ Λ^{-2} ∫_a^b χ*(t) A(t) χ(t) dt.   (9.6.4)
For the proof we consider the inhomogeneous problem

Jw′ = (λA + B) w + Aχ,   (9.6.5)

together with the boundary conditions w(a) = Mv†, w(b) = Nv†, for some v†. Provided that λ is not an eigenvalue, the solution is given by the resolvent kernel, as

w(x,λ) = − ∫_a^b K(x,t,λ) A(t) χ(t) dt,   (9.6.6)

according to (9.4.3). The argument concerns the analytic behavior of the scalar function

γ(λ) = ∫_a^b y*(x) A(x) w(x,λ) dx,   (9.6.7)

where y(x) is given by (9.6.3), or explicitly as

y(x) = − ∫_a^b K(x,t,0) A(t) χ(t) dt.   (9.6.8)
In view of the analytic expressions (9.4.12-14) for K(x,t,λ), we see that γ(λ) is analytic except at the λ_n, where, by (9.5.25), it may have at most simple poles. By (9.5.23), the residue of γ(λ) at λ_n is

− Σ_r ∫_a^b y*(x) A(x) y_r(x) dx ∫_a^b y_r*(t) A(t) χ(t) dt;   (9.6.9)

here we use the fact that the omitted terms in the expansion (9.5.25) are uniformly bounded in a neighborhood of λ_n, which in turn follows from (9.5.4-5). We now remark that (9.6.9) vanishes by (9.6.2) if |λ_n| ≤ Λ. We deduce that γ(λ) is analytic in the circle |λ| ≤ Λ. The further course of the argument concerns the expansion of w(x,λ), and so of γ(λ), as a power series in λ. If formally we put

w(x,λ) = Σ_{n=0}^∞ λ^n w_n(x),   (9.6.10)
and substitute in (9.6.5) and the boundary conditions, we obtain a sequence of problems

Jw_0′ = Bw_0 + Aχ,   w_0(a) = Mv_0,   w_0(b) = Nv_0,   (9.6.11)

Jw_n′ = Bw_n + Aw_{n−1},   w_n(a) = Mv_n,   w_n(b) = Nv_n,   n = 1, 2, ….   (9.6.12)
Here w_0 coincides with y defined in the statement of the lemma. Solving these problems, we have w_0 = y given by (9.6.8), the remainder of the w_n being given recursively by

w_n(x) = − ∫_a^b K(x,t,0) A(t) w_{n−1}(t) dt,   n = 1, 2, ….   (9.6.13)

More strictly, writing (9.6.5) in the form Jw′ = Bw + A(λw + χ), it follows from (9.4.3) that its solution must satisfy

w(x,λ) = − ∫_a^b K(x,t,0) A(t) {λ w(t,λ) + χ(t)} dt.   (9.6.14)
This integral equation may be solved by iteration, the method of the Neumann series, for small λ, since the kernel K(x,t,0) is piecewise continuous and uniformly bounded. Thus a solution in the form (9.6.10) is certainly possible for small λ, where w_0 = y is given by (9.6.8) and the w_1, w_2, … by (9.6.13); from these latter we then deduce (9.6.11-12), on the basis of Theorem 9.4.1. Substituting for w(x,λ) in (9.6.7), and replacing y by w_0, we obtain

γ(λ) = Σ_{n=0}^∞ λ^n ∫_a^b w_0*(x) A(x) w_n(x) dx,   (9.6.15)

at least for small λ. However we showed previously that γ(λ) is analytic in |λ| ≤ Λ, and so the series in (9.6.15) is analytic in this closed circle. We deduce that, for some constant γ_0,

| ∫_a^b w_0* A w_n dx | ≤ γ_0 Λ^{-n}.   (9.6.16)
To complete the proof we need some results concerning expressions of the form ∫_a^b w_r* A w_s dx. Observing that

(w_r* J w_s)′ = −(w_r* B + w_{r−1}* A) w_s + w_r*(B w_s + A w_{s−1}) = w_r* A w_{s−1} − w_{r−1}* A w_s,

by (9.6.12) and (9.1.24), we have that

[w_r* J w_s]_a^b = ∫_a^b w_r* A w_{s−1} dx − ∫_a^b w_{r−1}* A w_s dx   (9.6.17)

for r ≥ 1, s ≥ 1; the result is also true for r ≥ 0, s ≥ 0 if we interpret w_{−1} as χ, in accordance with (9.6.11). However the left of (9.6.17) vanishes by the boundary conditions in (9.6.11-12), together with (9.2.1). Hence

∫_a^b w_r* A w_{s−1} dx = ∫_a^b w_{r−1}* A w_s dx,   r ≥ 0, s ≥ 0,   (9.6.18)

where we interpret w_{−1} = χ. Our first application of this is to modify (9.6.16). We have, for n ≥ 1,

∫_a^b w_n* A w_n dx = ∫_a^b w_{n−1}* A w_{n+1} dx = ⋯ = ∫_a^b w_0* A w_{2n} dx.

Hence from (9.6.16) we have that

∫_a^b w_n* A w_n dx ≤ γ_0 Λ^{-2n}.   (9.6.19)
Secondly, we use (9.6.18) in the form

∫_a^b w_r* A w_r dx = ∫_a^b w_{r−1}* A w_{r+1} dx,

whence, by the Cauchy inequality,

{ ∫_a^b w_r* A w_r dx }² ≤ ∫_a^b w_{r−1}* A w_{r−1} dx ∫_a^b w_{r+1}* A w_{r+1} dx,   (9.6.20)

for r = 1, 2, …, the case r = 0 being

{ ∫_a^b w_0* A w_0 dx }² ≤ ∫_a^b χ* A χ dx ∫_a^b w_1* A w_1 dx.   (9.6.21)
If ∫_a^b w_0* A w_0 dx = 0, the required result (9.6.4) is certainly true, since w_0 = y. We therefore take it that ∫_a^b w_0* A w_0 dx ≠ 0; it then follows from (9.6.20-21) that none of the integrals appearing there vanish. We may therefore consider the ratios

∫_a^b w_r* A w_r dx / ∫_a^b w_{r−1}* A w_{r−1} dx,   r = 0, 1, …,   (9.6.22)

with w_{−1} = χ as before. By (9.6.20-21) this sequence is nondecreasing. Writing, for the case r = 0,

ν = ∫_a^b w_0* A w_0 dx / ∫_a^b χ* A χ dx,   (9.6.23)

it follows that the ratios (9.6.22) are not less than ν, so that

∫_a^b w_n* A w_n dx ≥ γ_1 ν^n,   (9.6.24)

where γ_1 = ∫_a^b w_0* A w_0 dx > 0. Comparing (9.6.19) and (9.6.24) we have

γ_1 ν^n ≤ γ_0 Λ^{-2n}.

Taking nth roots and making n → ∞, we deduce that ν ≤ Λ^{-2}. Hence from (9.6.23) it follows that

∫_a^b w_0* A w_0 dx ≤ Λ^{-2} ∫_a^b χ* A χ dx,

and since w_0 = y, this is the required result (9.6.4).
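The heart of this argument — that the ratios (9.6.22) are nondecreasing, by the Cauchy inequality applied to a bilinear form depending only on r + s as in (9.6.18) — can be illustrated in a finite-dimensional model (an illustration, not from the text): with T symmetric positive semidefinite, the moments μ_n = vᵀTⁿv satisfy μ_n² ≤ μ_{n−1}μ_{n+1}, so μ_{n+1}/μ_n is nondecreasing.

```python
import numpy as np

# Finite-dimensional analog of (9.6.18)-(9.6.22): T plays the role of the
# nonnegative weight A, and mu_n that of the integrals with r + s = n.
rng = np.random.default_rng(2)
R = rng.standard_normal((6, 6))
T = R.T @ R                      # symmetric positive semidefinite
v = rng.standard_normal(6)

mu = [v @ np.linalg.matrix_power(T, n) @ v for n in range(8)]

# Cauchy inequality: mu_n^2 <= mu_{n-1} mu_{n+1} ...
for n in range(1, 7):
    assert mu[n] ** 2 <= mu[n - 1] * mu[n + 1] * (1 + 1e-12)

# ... hence the ratios are nondecreasing, as claimed for (9.6.22).
ratios = [mu[n + 1] / mu[n] for n in range(7)]
assert all(r1 <= r2 * (1 + 1e-12) for r1, r2 in zip(ratios, ratios[1:]))
print("moment ratios are nondecreasing")
```

The geometric lower bound (9.6.24) is exactly what one obtains from this monotonicity, and comparison with the geometric upper bound (9.6.19) forces ν ≤ Λ^{-2}.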
Next we remove the restriction that λ = 0 not be an eigenvalue. We have

Lemma 9.6.2. The result of Lemma 9.6.1 remains valid if λ = 0 is an eigenvalue, if y in (9.6.3) satisfies the additional restriction that

∫_a^b y_r*(x) A(x) y(x) dx = 0   (λ_r = 0)   (9.6.25)
for all eigenfunctions with zero as eigenvalue. For if h = 0 is an eigenvalue, we may modify the eigenvalue problem so as to increase all the eigenvalues by an arbitrarily small Q > 0, so that zero will no longer be an eigenvalue, and Lemma 9.6.1 can be applied. For the differential equation satisfied by the eigenfunctions may be written
k:= ( B - 4
Y ,
+ (4 + 4 4,
*
If therefore we take the boundary problem formed by Jy’
= (B -‘
4y
+ AAy,
(9.6.26)
with the same boundary conditions as previously, the eigenfunctions will be the same, y7(x) now corresponding to a revised eigenvalue A: = A, B. I n place of (9.6.3) we write
+
JY’
= (B - 6A)y
+ A h + EY),
(9.6.27)
and propose to apply Lemma 9.6.1. This is justified in part by the fact that h == 0 is not an eigenvalue of the revised problem. We also verify that, in modification of (9.6.2), /;Y:(t)
4){ X ( t ) + E N ) )
dt = 0,
I
A7
I
(9.6.28)
This will be ensured, in view of (9.6.2) and (9.6.25), if (9.6.29)
T o prove this we note that y , yt both satisfy the boundary conditions, so that [Y: Jyl:
in view of (9.2.1).
= 0,
(9.6.30)
278
9.
GENERAL FIRST-ORDER DIFFERENTIAL SYSTEM
However (Y: JY)’
=
-Yf(ArA
=
--hry:Ay
+ B ) Y +Y ~ ( B +Y A X ) AX-
Hence from (9.6.30) it follows that A,
j y:Aydx b
a
1
b
=
a
(9.6.31)
y:Axdx.
Since the right hand side is zero, we deduce (9.6.29), and so (9.6.28). Applying now Lemma 9.6.1, and taking it that 0 < z < A, we have that (9.6.28) holds for -A z AS A E, since AS = X, E. Hence (9.6.28) holds for I A: I A - E, and by (9.6.4) we have
+ < < + <
f Y * ( t ) A ( t ) y ( t )dt
< ( A - .F21 (x* + .y*) A ( t )(x + y b
a
+
) dt.
Making z -+0 we derive (9.6.4), the desired result. We can now deduce the eigenfunction expansion, with convergence in a certain mean-square sense, in which the weight-distribution is given by the matrix A(t).
< <
Theorem 9.6.3. Let the column matrix ~ ( t )a , t b, be measurable and of integrable square with respect to A(t) in the sense (9.6.1). Let the column matrix ~ ( t )a , t b, be absolutely continuous and satisfy almost everywhere the differential equation
< <
Jrp’ = Brp
+ AX,
(9.6.32)
and the boundary conditions ~ ( a = ) Mw,~ ( b = ) Nu for some column matrix w. Write, for any A > 0, (9.6.33)
where the Fourier coefficients c, are given by (9.3.24). Then
frp,*(x) A(x)p,,(x) dx 6 A-2 1 x*(x) A(x)x ( x ) dx. b
a
a
(9.6.34)
It is immediate from this result that the left of (9.6.34) tends to zero as A ---+ m. This represents a form of mean-square convergence, coinciding with ordinary mean-square convergence if, for example, A(t) is continuous and positive-definite. A more common situation is that in
9.6.
279
THE EIGENFUNCTION EXPANSION
which A is positive-definite only when restricted to a certain linear manifold, when (9.6.33-34) gives a mean-square convergence only in that manifold. This occurs in the Sturm-Liouville case of (9.1.13) and its higher-order analogs, such as (9.1.19). A degenerate case will be that in which there are only a finite number of eigenvalues, when (9.6.33) will contain only a bounded number of terms, and the left of (9.6.34) will be zero for large A . For the proof we define the analogous quantities
/ y*(x > A ( x ) x ( x ) d x , b
d,,
and
=
n
(9.6.35) (9.6.36)
From the fact that cp, yr both satisfy the boundary conditions we have [y,*J91]~ = 0 and hence, as in (9.6.30-31),
that is to say, Arcr
= dr
We deduce that
the summations being over I h, I
.
< A. By (9.6.36-37) +
= BVA
AXA
(9.6.37)
this gives (9.6.38)
In addition, cpA satisfies the boundary conditions, and
<
A ; these follow from the definitions (9.6.33), (9.6.36) and for I h, I the orthonormality (9.3.7). By Lemmas 9.6.1-2, we have
280
9.
GENERAL FIRST-ORDER DIFFERENTIAL SYSTEM
However
<
the summations being over I A, I A . By (9.6.35) and its adjoint, together with the orthonormality (9.3.71, we have (9.6.42)
Hence in particular
and the desired result (9.6.34) follows from (9.6.41).
9.7. Convergence of the Eigenfunction Expansion Extending the method of Section 8.9, we may consider the eigenfunctions as Fourier coefficients of the resolvent kernel, apart from certain constant factors. By means of the Bessel inequality we then have bounds for certain series involving the eigenfunctions, which in turn enable us to investigate the convergence of the eigenfunction expansion in the uniform sense. We prove first a bound in the matrix sense. Theorem 9.7.1. If h is not an eigenvalue, yn(x)y'(x)
<
s:
K(x, s, A) A(s)K*(x, s, A) ds.
(9.7.1)
Here the sum on the left may be any finite sum, or may be over the infinite series of eigenvaiues, if there be an infinity of them. Since the terms on the left are positive semidefinite, their sum either converges, or else diverges in that some of the diagonal elements in the partial sums tend to f m ; the latter is excluded by the bound on the right-hand side. T h e series on the left is in fact absolutely convergent, in that the k2 entries in the matrices form k2 absolutely convergent numerical series.
9.7.
CONVERGENCE OF EIGENFUNCTION EXPANSION
28 1
The proof follows the lines of that of the scalar Bessel inequality. We consider the expression
K*(x, s, A ) - r)yn(s)y,*(x) (A - An)-1 ds
(9.7.2)
n <no
for any finite no > 0. Since A($)2 0 and the two matrices in the braces { }are adjoints of one another, the integral (9.7.2) is non-negative definite. In evaluating it we use (9.5.26), that is to say,
and its adjoint
Using these, and the orthonormality (9.3.7),the integral (9.7.2)reduces to
This is accordingly non-negative definite, proving (9.7.1). We may deduce the following scalar variant.
Theorem 9.7.2. If $\lambda$ is not an eigenvalue,
$$\sum_n y_n^*(x)\, y_n(x)\, |\lambda - \lambda_n|^{-2} \leqslant \operatorname{tr} \int_a^b K(x, s, \lambda)\, A(s)\, K^*(x, s, \lambda)\, ds. \tag{9.7.3}$$
The inequality (9.7.1) remains in force if we take the trace of both sides. This gives (9.7.3), bearing in mind that
In fact, from the inequality (9.7.1) we may deduce the corresponding inequality for the diagonal elements of both sides.
In particular, we have

Theorem 9.7.3. The series (9.7.4)

converges; that is to say, the series on the left of (9.7.3) is absolutely convergent.

We now pass to the convergence of the eigenfunction expansion.

Theorem 9.7.4. Under the assumptions of Theorem 9.6.3, the eigenfunction expansion of $\varphi(x)$, the series on the right of (9.3.23) with $c_n$ given by (9.3.24), is absolutely and uniformly convergent.

The convergence asserted is understood as the absolute and uniform convergence of the $k$ series formed by the $k$ entries in each of the column matrices $y_n(x)\, c_n$. Since the entries in $y_n(x)$ are all bounded in modulus by $\{ y_n^*(x)\, y_n(x) \}^{1/2}$, it will be sufficient to show that (9.7.5)
for absolute convergence, and that

as $n_0 \to \infty$, uniformly in $x$, for uniform convergence. By the Cauchy inequality we have

Together with (9.7.3-4) we need the facts that (9.7.8) (9.7.9)

These are simply the Bessel inequalities appropriate to (9.3.24) and (9.6.35), noting (9.6.37). If therefore in (9.7.7) we keep $n_0$ fixed and make $n_1 \to \infty$, both sums on the right remain bounded, by (9.7.4) and (9.7.8-9). Hence the left of (9.7.7) remains bounded, proving (9.7.5) and the
absolute convergence of the eigenfunction expansion. Suppose next that in (9.7.7) we make $n_0 \to \infty$ with arbitrary $n_1 > n_0$, $n_1 \to \infty$. In this case the second sum on the right of (9.7.7) tends to zero, independently of $x$. The first sum on the right of (9.7.7) remains bounded, uniformly in $x$; this follows from (9.7.3), the right of (9.7.3) being bounded uniformly in $x$. Hence the left of (9.7.7) tends to zero as $n_0 \to \infty$, uniformly in $x$, proving the uniform convergence of the eigenfunction expansion. A second application of Theorems 9.7.1-2 is to provide bounds for the spectral function (9.3.26). Such bounds may be applied to the transition $b \to \infty$, or $a \to -\infty$, to establish the existence of a spectral function corresponding to an expansion theorem for an infinite interval; similar processes were used in Chapters 2 and 5. We have, however, to replace the integrals in (9.7.1), (9.7.3) by more convenient expressions.
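The Bessel inequalities invoked in this section, such as (9.7.8-9), all rest on the elementary fact that, for an orthonormal set, the sum of the squared moduli of the Fourier coefficients cannot exceed the squared norm. A small numerical sketch of that fact (the data here are random and purely illustrative, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal set of 5 vectors in C^8, obtained from a QR factorization.
M = rng.standard_normal((8, 5)) + 1j * rng.standard_normal((8, 5))
E, _ = np.linalg.qr(M)            # columns of E are orthonormal

phi = rng.standard_normal(8) + 1j * rng.standard_normal(8)

coeffs = E.conj().T @ phi          # "Fourier coefficients" <e_n, phi>
bessel_sum = np.sum(np.abs(coeffs) ** 2)
norm_sq = np.vdot(phi, phi).real

# Bessel: a partial orthonormal set never overshoots ||phi||^2.
assert bessel_sum <= norm_sq + 1e-12
```

Equality would hold only if the orthonormal set were complete in the ambient space.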
Theorem 9.7.5. If $\lambda$ is complex,
$$\sum_n y_n(x)\, y_n^*(x)\, |\lambda - \lambda_n|^{-2} \leqslant (\lambda - \bar{\lambda})^{-1} \{ K(x, x, \lambda) - K^*(x, x, \lambda) \}. \tag{9.7.10}$$
In particular, taking $\operatorname{Im} \lambda > 0$, (9.7.11)

For on taking $\mu = \bar{\lambda}$ in (9.4.24) we get

or, on replacing $\lambda$ by $\bar{\lambda}$,
$$\int_a^b K(x, s, \lambda)\, A(s)\, K^*(x, s, \lambda)\, ds = (\lambda - \bar{\lambda})^{-1} \{ K(x, x, \lambda) - K^*(x, x, \lambda) \}.$$
Hence we may replace (9.7.1) by

Replacing $\bar{\lambda}$ by $\lambda$, which does not affect the left-hand side, we get (9.7.10). If $\operatorname{Im} \lambda > 0$ and we take $x = a$, it follows that

which coincides with (9.7.11) in view of (9.3.26), (9.5.12). This result
is connected with the partial fraction expansion of the characteristic function. By taking the trace, we may, as before, derive numerical bounds concerning the left of (9.7.10-11).

9.8. Nesting Circles

We now show that the characteristic function defined in (9.5.1) lies on a certain locus, which is independent of the boundary matrices $M$, $N$. This locus may be considered as an analog, in matrix terms, of a circle. Furthermore, as $b$ increases, these circles have a certain nesting property, in that each contains in its interior the circles corresponding to greater values of $b$. One consequence of this is that the characteristic function is bounded, as $b \to \infty$, for fixed complex $\lambda$. In view of (9.7.11), this provides a bound for spectral functions corresponding to varying $b$, and a basis for considering the limiting process $b \to \infty$. We first find the equation satisfied by $F_{M,N}(\lambda)$, taking $\lambda$ fixed and complex. With the notation (9.5.16) we have
Substituting in the restriction (9.2.1) laid on the boundary matrices, we deduce that, canceling the factors $\psi$,
$$(U + J^{-1}V)^*\, J\, (U + J^{-1}V) = \{ Y_b (U - J^{-1}V) \}^*\, J\, Y_b (U - J^{-1}V).$$
Since $J^* = -J$ this is equivalent to
$$(U^* - V^* J^{-1})\, J\, (U + J^{-1}V) = (U^* + V^* J^{-1})\, Y_b^* J Y_b\, (U - J^{-1}V).$$
Dividing on the left and right by $V^*$ and $V$, and recalling that $F = \tfrac{1}{2} U V^{-1}$, this gives
$$(2F^* - J^{-1})\, J\, (2F + J^{-1}) = (2F^* + J^{-1})\, Y_b^* J Y_b\, (2F - J^{-1}).$$
We write the final result as

Theorem 9.8.1. For fixed complex $\lambda$, the characteristic function (9.5.1) lies on the locus
$$(F + \tfrac{1}{2} J^{-1})^*\, (J/i)\, (F + \tfrac{1}{2} J^{-1}) = (F - \tfrac{1}{2} J^{-1})^*\, (Y_b^* J Y_b / i)\, (F - \tfrac{1}{2} J^{-1}). \tag{9.8.3}$$
Here we have written $F$ for $F_{M,N}(\lambda)$, and $Y_b$ for $Y(b, \lambda)$. We have divided through by a factor $i$ so that the central factors on either side should be Hermitean, both sides being therefore Hermitean. As already mentioned, this locus is independent of the boundary matrices. The result is also true when $\lambda$ is real and not an eigenvalue. We now have to investigate the conditions under which a locus of the above form, that is to say, of the form
$$(F + G)^*\, Q\, (F + G) = (F - G)^*\, P\, (F - G) \tag{9.8.4}$$
may be reasonably described as a circle. Here $G$, $P$, and $Q$ are given square matrices of order $k$, and $F$ is a variable square matrix of the same order. In the scalar case $k = 1$, (9.8.4) represents a genuine locus, an ordinary circle, if $G \neq 0$, $P$, $Q$ are real numbers, neither of them zero, and of the same sign. In the general case we prove

Lemma 9.8.2. Let $P$, $Q$, $G$ be nonsingular, $P$ and $Q$ being Hermitean. Let also $P > Q$, and let the eigenvalues of $Q^{-1}P$ be all positive. Then (9.8.4) determines an $F$-set which is bounded and nonempty, being homeomorphic to the set of unitary matrices.

Rearranging (9.8.4) we have
$$F^*(P - Q)F - F^*(P + Q)G - G^*(P + Q)F + G^*(P - Q)G = 0,$$
or
$$\{ F^* - G^*(P + Q)(P - Q)^{-1} \}\, (P - Q)\, \{ F - (P - Q)^{-1}(P + Q)G \} = G^* \{ (P + Q)(P - Q)^{-1}(P + Q) - (P - Q) \}\, G. \tag{9.8.5}$$
Subject to it being proved that the matrix in the braces $\{\ \}$ on the right is positive-definite, this equation has the general solution
$$F = (P - Q)^{-1}(P + Q)G + (P - Q)^{-1/2}\, \Theta\, \{ (P + Q)(P - Q)^{-1}(P + Q) - (P - Q) \}^{1/2}\, G, \tag{9.8.6}$$
where $\Theta$ is any unitary matrix, and the square roots are to be positive-definite and Hermitean. As the conclusions of the lemma will follow from (9.8.6), all we have to do is to verify that
$$(P + Q)(P - Q)^{-1}(P + Q) - (P - Q) > 0. \tag{9.8.7}$$
Writing the left of (9.8.7) in the form
$$(P - Q)^{1/2}\, \big\{ [\, (P - Q)^{-1/2} (P + Q) (P - Q)^{-1/2} \,]^2 - E \big\}\, (P - Q)^{1/2}$$
we see that this is the same as
$$\{ (P - Q)^{-1/2}\, (P + Q)\, (P - Q)^{-1/2} \}^2 > E. \tag{9.8.8}$$
Thus we must show that the eigenvalues of the left-hand side are all greater than 1, that is to say, that the eigenvalues of $(P - Q)^{-1/2}(P + Q)(P - Q)^{-1/2}$ are all greater than 1 or less than $-1$. Suppose on the contrary that for some column matrix $\eta \neq 0$ we have
$$(P - Q)^{-1/2}\, (P + Q)\, (P - Q)^{-1/2}\, \eta = \nu \eta, \qquad -1 \leqslant \nu \leqslant 1,$$
the eigenvalue $\nu$ being necessarily real, since the matrix on the left is Hermitean. Writing $\zeta = (P - Q)^{-1/2} \eta$, this gives
$$(P + Q)\, \zeta = \nu (P - Q)\, \zeta,$$
or
$$(1 - \nu)\, P \zeta = -(1 + \nu)\, Q \zeta.$$
If $\nu = 1$ this gives $Q\zeta = 0$, which is impossible since $Q$ is nonsingular, and $\zeta \neq 0$ since $\eta \neq 0$. If $-1 \leqslant \nu < 1$ we derive
$$Q^{-1} P \zeta = (\nu + 1)(\nu - 1)^{-1}\, \zeta,$$
where $(\nu + 1)(\nu - 1)^{-1} \leqslant 0$, and this is excluded since $Q^{-1}P$ is to have only positive eigenvalues. Hence we have a contradiction and so (9.8.7) must hold, completing the proof.

Together with the "circle" (9.8.4), we may also consider the "disk" formed by its interior, with or without the circle itself; as the interior we understand a bounded set which has the circle as frontier. We continue to use the term "bounded" as applied to a set of matrices in the obvious sense that all the entries of all the matrices admit a bound; similarly, a neighborhood of a matrix consists of all matrices whose entries differ by not more than an assigned amount from the corresponding entries of the given matrix. As the disks, open or closed, determined by (9.8.4), we take those given in

Lemma 9.8.3. Subject to the assumptions of Lemma 9.8.2, the $F$-sets given by
$$(F + G)^*\, Q\, (F + G) > (F - G)^*\, P\, (F - G), \tag{9.8.9}$$
$$(F + G)^*\, Q\, (F + G) \geqslant (F - G)^*\, P\, (F - G) \tag{9.8.10}$$
are bounded nonempty sets, which are open and closed, respectively. The sets are evidently open and closed, respectively, so that we have
only to show that they are bounded and nonempty. In view of Lemma 9.8.2 we need only show this for the set (9.8.9). From (9.8.9) we may reason as before to (9.8.5) with the sign $<$ replacing equality. Writing
$$C = (P - Q)^{-1}\, (P + Q)\, G, \tag{9.8.11}$$
$$R_1 = (P - Q)^{1/2}, \tag{9.8.12}$$
$$R_2 = \{ (P + Q)(P - Q)^{-1}(P + Q) - (P - Q) \}^{1/2}\, G, \tag{9.8.13}$$
the modified version of (9.8.5) is
$$(F - C)^*\, R_1^2\, (F - C) < R_2^* R_2. \tag{9.8.14}$$
If therefore we write
$$\Phi = R_1 (F - C)\, R_2^{-1}, \tag{9.8.15}$$
it follows that
$$\Phi^* \Phi < E, \tag{9.8.16}$$
so that the matrix $\Phi$ is "contractive," in that it reduces the length of a non-zero column matrix to which it is applied. Conversely, if $\Phi$ satisfies (9.8.16), and we set
$$F = C + R_1^{-1}\, \Phi\, R_2, \tag{9.8.17}$$
so that
$$R_1 (F - C) = \Phi R_2,$$
then on multiplying this by its adjoint we have
$$(F - C)^*\, R_1^2\, (F - C) = R_2^* \Phi^* \Phi R_2 < R_2^* R_2.$$
Since the $\Phi$ satisfying (9.8.16), that is, the $\Phi$ in the interior of the "unit circle," form a bounded set, the set given by (9.8.17) also forms a bounded set; this set is nonempty since the "center" $C$ is obviously included. For the corresponding result for (9.8.10) we need only replace the sign $<$ in (9.8.16) by the sign $\leqslant$. We proceed to verify that the conditions of the last two lemmas are verified in the case of the circle (9.8.3).
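The parametrization (9.8.17) of the open disk can be tested numerically: if we take random Hermitean matrices with $P > Q > 0$ and a nonsingular $G$ — one convenient way of meeting the hypotheses of Lemma 9.8.2, since then $Q^{-1}P$ automatically has positive eigenvalues — every $F = C + R_1^{-1}\Phi R_2$ with contractive $\Phi$ should satisfy the strict inequality (9.8.9). A sketch under those assumptions (the matrices are arbitrary test data, not from the differential system):

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3

def rand_c(n):
    return rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def herm_sqrt(A):
    # Positive-definite Hermitean square root via the spectral decomposition.
    w, V = np.linalg.eigh((A + A.conj().T) / 2)
    return V @ np.diag(np.sqrt(w)) @ V.conj().T

B = rand_c(k)
Q = B @ B.conj().T + np.eye(k)          # Hermitean, positive-definite
D = rand_c(k)
P = Q + D @ D.conj().T + np.eye(k)      # P > Q, and Q^{-1}P has positive eigenvalues
G = rand_c(k)                            # almost surely nonsingular

Dm = P - Q
C  = np.linalg.inv(Dm) @ (P + Q) @ G                              # center (9.8.11)
R1 = herm_sqrt(Dm)                                                # (9.8.12)
R2 = herm_sqrt((P + Q) @ np.linalg.inv(Dm) @ (P + Q) - Dm) @ G    # (9.8.13)

U, _ = np.linalg.qr(rand_c(k))           # a unitary matrix
Phi = 0.5 * U                             # strictly contractive: Phi* Phi < E
F = C + np.linalg.inv(R1) @ Phi @ R2      # the point (9.8.17)

lhs = (F + G).conj().T @ Q @ (F + G)
rhs = (F - G).conj().T @ P @ (F - G)
gap = np.linalg.eigvalsh((lhs - rhs + (lhs - rhs).conj().T) / 2)
assert np.all(gap > 0)                    # F lies in the open disk (9.8.9)
```

Completing the square as in (9.8.5) shows the gap equals $R_2^*R_2 - R_2^*\Phi^*\Phi R_2$, here $0.75\,R_2^*R_2$, which is why it is positive-definite.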
Theorem 9.8.4. Identifying (9.8.3) with (9.8.4), that is to say, taking
$$G = \tfrac{1}{2} J^{-1}, \qquad Q = J/i, \qquad P = Y_b^* J Y_b / i, \tag{9.8.18-20}$$
and taking
$$\operatorname{Im} \lambda > 0, \tag{9.8.21}$$
the conditions of Lemmas 9.8.2-3 are satisfied. The locus (9.8.3) is homeomorphic to the matrix unit circle, or set of unitary matrices. The sets (9.8.9-10) are bounded and nonempty.

It is clear that with the choice (9.8.18-20) the matrices $P$, $Q$, and $G$ are nonsingular, and that $P$, $Q$ are Hermitean. To show that $P - Q$ is positive-definite we note that
$$P - Q = i^{-1} ( Y_b^* J Y_b - J ) = 2 \{ \operatorname{Im} \lambda \} \int_a^b Y^*(t, \lambda)\, A(t)\, Y(t, \lambda)\, dt,$$
by (9.5.8). This is positive-definite by (9.5.17), in fact, by (9.1.6). Finally we must verify that $Q^{-1}P$ has only positive eigenvalues. This we prove by continuous variation. Let us write, in extension of (9.8.20),
$$P_x = Y_x^* J Y_x / i, \tag{9.8.22}$$
so that $P_b = P$. By the argument just given, we have
$$P_x - Q = 2 \{ \operatorname{Im} \lambda \} \int_a^x Y^*(t, \lambda)\, A(t)\, Y(t, \lambda)\, dt \geqslant 0$$
if $\operatorname{Im} \lambda > 0$. Hence $P_x \geqslant Q$, while $P_x$ and $Q$ are also nonsingular and Hermitean. It follows from these last observations that the eigenvalues of $Q^{-1}P_x$ are all real. For, if for some column-matrix $\zeta \neq 0$ we have $Q^{-1}P_x \zeta = \rho \zeta$, for some scalar $\rho$, it follows that $\rho Q \zeta = P_x \zeta$, and so that $\rho\, \zeta^* Q \zeta = \zeta^* P_x \zeta$. Since $\zeta^* Q \zeta$, $\zeta^* P_x \zeta$ are both real, we can have $\rho$ complex only if $\zeta^* Q \zeta = \zeta^* P_x \zeta = 0$. This latter implies that $\zeta^* (P_x - Q) \zeta = 0$, and since $P_x \geqslant Q$, this means that $(P_x - Q)\zeta = 0$. Since $\rho Q \zeta = P_x \zeta$ and $Q$ is nonsingular, we deduce that $\rho = 1$, so that $\rho$ must be real, as asserted. Knowing now that the eigenvalues of $Q^{-1}P_x$ are all real, we add the observation that none of them is zero. We have already noted in fact that $P_x$, $Q$ are nonsingular since $J$ and $Y_x$ are nonsingular. Now when $x = a$, $Q^{-1}P_a = E$, and so in this case the eigenvalues are all $+1$. As $x$ increases from $a$ to $b$, the eigenvalues of $Q^{-1}P_x$ will vary continuously, remaining real and never vanishing. Hence they remain positive, as was to be proved. Finally we note the "nesting" property, which is of a fairly self-evident character.
Theorem 9.8.5. Let the assumptions of Section 9.1 hold for all $b \geqslant b_0$, for some fixed $b_0 > a$, and for fixed $b$, $\lambda$ with $b > b_0$, $\operatorname{Im} \lambda > 0$ denote by $\mathscr{D}(b, \lambda)$ the $F$-set characterized by
$$(F + \tfrac{1}{2} J^{-1})^*\, (J/i)\, (F + \tfrac{1}{2} J^{-1}) \geqslant (F - \tfrac{1}{2} J^{-1})^*\, (Y_b^* J Y_b / i)\, (F - \tfrac{1}{2} J^{-1}). \tag{9.8.23}$$
Then as $b$ increases, the region $\mathscr{D}(b, \lambda)$ shrinks, in that
$$\mathscr{D}(b_2, \lambda) \subseteq \mathscr{D}(b_1, \lambda), \qquad b_0 \leqslant b_1 \leqslant b_2. \tag{9.8.24}$$
For as $b$ increases, $Y_b^* J Y_b / i$ is nondecreasing, since

Hence the inequality (9.8.23) becomes more stringent as $b$ increases; in other words, if it is satisfied for some $b$ and some $F$, then it is satisfied for the same $F$ and all lesser $b$. This proves the result. In the case that $\operatorname{Im} \lambda < 0$ we must reverse the inequality in (9.8.23) in order to obtain a bounded region which contracts, or at any rate does not expand, as $b$ increases.
9.9. Expansion of the Basic Interval

We now consider the case of a semi-infinite interval $(a, \infty)$, supposing that the assumptions of Section 9.1 hold for all finite $b \geqslant b_0$, for some fixed $b_0 > a$; in particular, $A(x)$ and $B(x)$ are assumed integrable over any finite interval $(a, b)$, $b > a$, but not necessarily over $(a, \infty)$. A somewhat crude, but nevertheless important, consequence of the nesting circle analysis of the last section is the boundedness of the spectral function of (9.3.26), independently of $b \geqslant b_0$.
Theorem 9.9.1. There is a constant $c > 0$, independent of $M$ and $N$, and of $b$ for $b \geqslant b_0$, such that
$$| \operatorname{tr} T_{M,N}(\mu) | \leqslant c (1 + \mu^2). \tag{9.9.1}$$
The expression on the left serves as a norm for the matrix $T_{M,N}(\mu)$. Since the latter is non-negative definite for $\mu \geqslant 0$, nonpositive definite for $\mu \leqslant 0$, its diagonal entries will have the same sign. Furthermore, these diagonal entries will not exceed in absolute value their sum, the trace of $T_{M,N}(\mu)$; since $T_{M,N}(\mu)$ is Hermitean, the same bound will also apply to the off-diagonal entries.
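The assertion that the trace dominates every entry of a non-negative definite Hermitean matrix follows from $|t_{ij}| \leqslant (t_{ii}\, t_{jj})^{1/2} \leqslant \operatorname{tr} T$; a quick numerical sanity check on random data (illustrative only, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = B @ B.conj().T                      # non-negative definite, Hermitean

# Every entry is dominated in modulus by the trace,
# since each |T_ij| <= sqrt(T_ii T_jj) <= max diagonal <= trace.
assert np.max(np.abs(T)) <= np.trace(T).real + 1e-12
```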
By Theorems 9.8.1, 9.8.5 the characteristic function (9.5.1), which we now write
$$F_{M,N}(\lambda) = F_{M,N,b}(\lambda) \tag{9.9.2}$$
to indicate the dependence on $b$, will lie for all $b \geqslant b_0$ in the finite region $\mathscr{D}(b_0, \lambda)$ given by (9.8.23) with $b = b_0$; here we suppose $\lambda$ fixed with $\operatorname{Im} \lambda > 0$. Taking in particular $\lambda = i$, we deduce that $F_{M,N,b}(i)$ is uniformly bounded, for $b \geqslant b_0$ and all $M$, $N$ satisfying the standard restrictions of Section 9.2. In particular, $-\operatorname{Im} F_{M,N,b}(i)$ admits under these circumstances a bound from above by some fixed matrix $T_0$, say, for example, some multiple of $E$. Taking $\lambda = i$ in (9.7.11) we then have
$$\int_{-\infty}^{\infty} (1 + \mu^2)^{-1}\, dT_{M,N}(\mu) \leqslant T_0.$$
Hence, for any $\mu' > 0$,
$$(1 + \mu'^2)^{-1}\, T_{M,N}(\mu') = (1 + \mu'^2)^{-1} \int_0^{\mu'} dT_{M,N}(\mu) \leqslant \int_0^{\mu'} (1 + \mu^2)^{-1}\, dT_{M,N}(\mu) \leqslant T_0,$$
and so
$$T_{M,N}(\mu') \leqslant (1 + \mu'^2)\, T_0, \tag{9.9.3}$$
and we get (9.9.1) for $\mu' > 0$ on taking the trace of both sides, the left of (9.9.3) being non-negative definite for $\mu' > 0$. The proof for $\mu' < 0$ is similar.

Making $b \to \infty$, and keeping $M$, $N$ fixed for definiteness, and writing
$$T_{M,N,b}(\lambda) = T_{M,N}(\lambda) \tag{9.9.4}$$
for the function defined in (9.3.26), which of course depends on $b$, we may deduce the existence of a sequence $b_1, b_2, \ldots$, with $b_n \to \infty$ as $n \to \infty$, and a nondecreasing right-continuous Hermitean matrix function $T(\lambda)$, such that, as $n \to \infty$,
$$T_{M,N,b_n}(\mu) \to T(\mu), \tag{9.9.5}$$
for all $\mu$ at which $T(\mu)$ is continuous, that is, for all finite $\mu$ with the exception of at most a denumerable set. In addition, the limiting transition shows that
$$T(\mu) \geqslant 0 \quad \text{for} \quad \mu \geqslant 0, \tag{9.9.6}$$
$$| \operatorname{tr} T(\mu) | \leqslant c (1 + \mu^2). \tag{9.9.7}$$
This function $T(\mu)$ will be a spectral function in a certain sense. We prove here the Parseval equality for a restricted class of functions.
Theorem 9.9.2. Let the column matrix $\varphi(t)$, $a \leqslant t < \infty$, satisfy $\varphi(a) = 0$, $\varphi(t) = 0$ for $t \geqslant t_0$. Let $\varphi(t)$ be absolutely continuous and satisfy almost everywhere $J\varphi' = B\varphi + A\chi$, where $\chi$ is measurable and satisfies $\int_a^\infty \chi^* A \chi\, dt < \infty$. Defining
$$\psi(\lambda) = \int_a^\infty Y^*(t, \lambda)\, A(t)\, \varphi(t)\, dt, \tag{9.9.8}$$
we then have
$$\int_a^\infty \varphi^* A \varphi\, dx = \int_{-\infty}^{\infty} \psi^*(\lambda)\, dT(\lambda)\, \psi(\lambda). \tag{9.9.9}$$
For the proof we take the Parseval equality, deducible for a finite interval $(a, b)$ from (9.6.34), and proceed to the limit as $b \to \infty$; strictly speaking, we make first $b \to \infty$, and then $\Lambda \to \infty$. We take it that $b > b_0$, $b > t_0$, and that $\chi(t) = 0$ for $t > t_0$. Substituting (9.6.33) in (9.6.34) we derive (9.9.10)

Here we have used the orthonormality of the $y_n$, and have written the integrals as over $(a, \infty)$, the integrands vanishing over $(t_0, \infty)$. We have to express the sum in (9.9.10) as a Stieltjes integral, and to simplify this assume that $\Lambda$ is not a point of discontinuity of $T_{M,N,b_n}(\lambda)$ for any $n = 1, 2, \ldots$, that is to say, not one of the corresponding eigenvalues, and not a point of discontinuity of $T(\lambda)$; we assume the same concerning $-\Lambda$. This is legitimate since these excluded points form a denumerable set. Since $c_r$ as given by (9.3.24) may also be written
$$c_r = \int_a^\infty y_r^*(x)\, A(x)\, \varphi(x)\, dx = u_r^* \int_a^\infty Y^*(x, \lambda_r)\, A(x)\, \varphi(x)\, dx = u_r^* \psi(\lambda_r), \tag{9.9.11}$$
the sum in (9.9.10) is

assuming that $\pm\Lambda$ are not discontinuities of $T_{M,N,b}(\lambda)$. Substituting this in (9.9.10) we get
for $n = 1, 2, \ldots$ . Making $n \to \infty$, we may make the limiting transition (9.9.5) in the finite integral in (9.9.12), getting
$$\int_a^\infty \varphi^* A \varphi\, dx - \int_{-\Lambda}^{\Lambda} \psi^*(\lambda)\, dT(\lambda)\, \psi(\lambda) \leqslant k^2 \Lambda^{-2} \int_a^\infty \chi^* A \chi\, dx, \tag{9.9.13}$$
and the asserted result (9.9.9) clearly follows on making $\Lambda \to \infty$.
9.10. Limit-Circle Theory
We confine the discussion here to general remarks. Considering the disks (9.8.23) for fixed $\lambda$ with $\operatorname{Im} \lambda > 0$ and as $b \to \infty$, we know that they form for $b \geqslant b_0$ a family of bounded closed sets, each of which is nonempty and includes those for later members of the family, that is, those for greater values of $b$. We can therefore conclude that the intersection of all of these disks is nonempty; it includes, for example, the limit of the "center" (9.8.11) as $b \to \infty$, or at least a limit-point of the sequence of centers. The situation may be seen more clearly if we consider the limiting behavior of the quantities $C$, $R_1$, and $R_2$ given in (9.8.11-13), (9.8.18-20); as may be seen from (9.8.17), $C$ forms the center of the disk, while $R_1^{-1}$, $R_2$ form together a sort of radius. Using the formulas (9.8.18-20), and also the fact that
$$P - Q = i^{-1} \{ Y_b^* J Y_b - J \} = 2 \{ \operatorname{Im} \lambda \} \int_a^b Y^*(t, \lambda)\, A(t)\, Y(t, \lambda)\, dt = 2 \{ \operatorname{Im} \lambda \}\, W_1(b, \lambda) \tag{9.10.1}$$
as in (9.5.6-8), (9.5.15), and carrying out slight manipulations in (9.8.11), (9.8.13) we have, for the center,
$$C = \{ E + 2 (P - Q)^{-1} Q \}\, G = \tfrac{1}{2} J^{-1} + \{ (\lambda - \bar{\lambda})\, W_1(b, \lambda) \}^{-1}, \tag{9.10.2}$$
while the "radius" is given in terms of
$$R_1^{-1} = (P - Q)^{-1/2} = [\, 2 \{ \operatorname{Im} \lambda \}\, W_1(b, \lambda) \,]^{-1/2} \tag{9.10.3}$$
and
$$R_2 = \{ 4Q + 4Q (P - Q)^{-1} Q \}^{1/2}\, G. \tag{9.10.4}$$
The main point about these formulas is that $C$, $R_1^{-1}$, $R_2$ depend in a finite manner on the matrix
$$W_2(b, \lambda) = \{ W_1(b, \lambda) \}^{-1}. \tag{9.10.5}$$
For $b \geqslant b_0$, and fixed $\lambda$ with $\operatorname{Im} \lambda > 0$, $W_1(b, \lambda)$ is positive-definite and nondecreasing as a function of $b$. Hence $W_2(b, \lambda)$ will be positive-definite and nonincreasing as a function of $b$, and so will tend to a limit as $b \to \infty$. Hence $C$, $R_1^{-1}$, and $R_2$ will tend to limits as $b \to \infty$, and hence the locus given by (9.8.17), for all $\Phi$ with $\Phi^* \Phi \leqslant E$, tends to a limit which is also, in some sense, a disk.

The simplest case is that in which $W_1(b, \lambda)$ tends to a finite limit as $b \to \infty$, that is to say, in which
$$\operatorname{tr} \int_a^\infty Y^*(t, \lambda)\, A(t)\, Y(t, \lambda)\, dt < \infty. \tag{9.10.6}$$
In this case $W_2(b, \lambda)$ tends to a nonsingular limit, and $R_1^{-1}$, $R_2$ as given by (9.10.3-4) also tend to limits, that of $R_1^{-1}$ at any rate being nonsingular. Subject to its being proved that $R_2$ is, in the limit, nonsingular it will follow that the limit of this disk is homeomorphic to the unit disk $\Phi^* \Phi \leqslant E$. We shall, however, show that the case (9.10.6) is, in certain cases, equivalent to a boundary problem over a finite interval.
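The reduction of the center to the form (9.10.2) is pure matrix algebra once $Q = J/i$, $G = \tfrac{1}{2}J^{-1}$, and $P - Q = 2\{\operatorname{Im}\lambda\}\,W_1$ are granted; it can be confirmed numerically with random data standing in for the differential system (a sketch, not part of the text):

```python
import numpy as np

rng = np.random.default_rng(3)
k = 3
M = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
J = M - M.conj().T + 1j * np.eye(k)      # skew-Hermitean, almost surely nonsingular

Q = J / 1j                                # Hermitean, as in (9.8.19)
G = 0.5 * np.linalg.inv(J)                # as in (9.8.18)
W = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
Dm = W @ W.conj().T + np.eye(k)           # stands in for P - Q = 2{Im lambda} W_1 > 0
P = Q + Dm

C_direct = np.linalg.inv(Dm) @ (P + Q) @ G                       # center (9.8.11)
C_reduced = 0.5 * np.linalg.inv(J) + np.linalg.inv(1j * Dm)      # the form (9.10.2)
assert np.allclose(C_direct, C_reduced)
```

The key step is that $QG = \tfrac{1}{2i}E$, so $2(P-Q)^{-1}QG$ collapses to $\{i(P-Q)\}^{-1}$, i.e. to $\{(\lambda-\bar{\lambda})W_1\}^{-1}$.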
9.11. Solutions of Integrable Square

We shall say that a solution $y(x)$ of (9.1.1) over $(a, \infty)$ is "of integrable square" if
$$\int_a^\infty y^*(x)\, A(x)\, y(x)\, dx < \infty. \tag{9.11.1}$$
Such solutions form a linear manifold. It is obvious that if $y$ has the above property, then so has any multiple of $y$. Furthermore, if $y$ and $z$ have this property, then so has $y + z$; this may be deduced from the Cauchy inequality, more specifically from the fact that

together with the fact that the last integral is non-negative for any $b > a$. Hence these solutions form a linear manifold, and we may inquire as to the number of linearly independent solutions with this property. Writing $y(x) = Y(x, \lambda)\, u$, where $u = y(a) \neq 0$, we see that the condition (9.11.1) is equivalent to
$$\lim_{b \to \infty} u^*\, W_1(b, \lambda)\, u < \infty. \tag{9.11.2}$$
The simplest case is, of course, that in which the nondecreasing matrix function $W_1(b, \lambda)$ tends to a finite limit as $b \to \infty$; then (9.11.2) will hold for all $u$, and so all solutions will be of integrable square. In general, the situation will depend on the behavior of the eigenvalues of $W_1(b, \lambda)$. Denoting these by
$$\mu_1(b) \leqslant \mu_2(b) \leqslant \cdots \leqslant \mu_k(b), \tag{9.11.3}$$
each eigenvalue being written in this series according to multiplicity, there being $k$ eigenvalues altogether, we note that these are non-negative and nondecreasing functions of $b$, together with $W_1(b, \lambda)$. Making $b \to \infty$, a certain number of (9.11.3) may remain finite, the remainder tending to infinity, so that
$$\mu_r(\infty) < \infty, \qquad r = 1, \ldots, k_1, \tag{9.11.4}$$
say, while
$$\mu_r(\infty) = \infty, \qquad r = k_1 + 1, \ldots, k. \tag{9.11.5}$$
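That each $\mu_r(b)$ is nondecreasing in $b$ is an instance of the monotonicity principle for Hermitean matrices: adding a non-negative definite increment cannot decrease any ordered eigenvalue. A numerical sketch with random Hermitean data in place of the actual $W_1(b, \lambda)$ (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
k = 5
A = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
W_b = A @ A.conj().T                    # Hermitean, >= 0 : plays the role of W_1(b)
Bm = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
W_bp = W_b + Bm @ Bm.conj().T           # W_1(b') = W_1(b) + non-negative increment

mu_b = np.linalg.eigvalsh(W_b)          # ordered eigenvalues mu_1 <= ... <= mu_k
mu_bp = np.linalg.eigvalsh(W_bp)
assert np.all(mu_bp >= mu_b - 1e-10)    # each ordered eigenvalue is nondecreasing
```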
I t is easily seen that there then exist k, linearly independent solutions of (9.1.1) which are of integrable square in the sense (9.11.1). Let U(T)(b),
Y =
1, ..., k, ,
(9.1 1.6)
be an orthonormal set of eigenvectors of W,(b, A), that is to say, column matrices, corresponding to the eigenvalues (9.1 1.4). T h e entries in the column matrices (9.1 1.6) are uniformly bounded, and we may therefore assume that the set (9.1 1.6) converges as b -+ a,perhaps through some sequence of b-values. We denote the limit by u(r)(m),
T
=
1, ..., K, ,
which will likewise form an orthonormal set. Then, for b u(+)*(b’) W,(b,A)
U(T)(b’)
< U(‘)*(b‘)
(9.1 1.7)
< b’ <
W,(b’,A) U ( + ) ( b ‘ ) = p,(b’).
00,
Making $b' \to \infty$, through the above-mentioned subsequence of $b$-values, we have
$$u^{(r)*}(\infty)\, W_1(b, \lambda)\, u^{(r)}(\infty) \leqslant \mu_r(\infty), \qquad r = 1, \ldots, k_1,$$
and in view of (9.11.4) we have $k_1$ linearly independent solutions of (9.11.2), as asserted. We can now give a lower bound for the number of linearly independent solutions of integrable square.
Theorem 9.11.1. Let $J/i$ have $k'$ negative eigenvalues and $k''$ eigenvalues which are positive. Then (9.1.1), taken over $(a, \infty)$, has at least $k'$ linearly independent solutions satisfying (9.11.1) if $\operatorname{Im} \lambda > 0$, and at least $k''$ such solutions if $\operatorname{Im} \lambda < 0$.

We write
$$W_3(b, \lambda) = Y^*(b, \lambda)\, J\, Y(b, \lambda) / i. \tag{9.11.8}$$
When $b = a$, $W_3(a, \lambda) = J/i$ has $k'$ negative and $k''$ positive eigenvalues. For $b \geqslant a$, $W_3(b, \lambda)$ is Hermitean and nonsingular, and so has eigenvalues which are real and distinct from zero. Hence, by continuity, $W_3(b, \lambda)$ has $k'$ negative and $k''$ positive eigenvalues for general $b \geqslant a$. Taking $\operatorname{Im} \lambda > 0$ and writing (9.5.8) in the form
$$W_1(b, \lambda) = (2 \operatorname{Im} \lambda)^{-1} \{ W_3(b, \lambda) - J/i \}, \tag{9.11.9}$$
we deduce that if the column matrix $u$ is such that
$$u^*\, W_3(b, \lambda)\, u \leqslant 0, \tag{9.11.10}$$
then
$$u^*\, W_1(b, \lambda)\, u \leqslant -(2 \operatorname{Im} \lambda)^{-1}\, u^* (J/i)\, u. \tag{9.11.11}$$
As has been shown, $W_3(b, \lambda)$ has $k'$ negative eigenvalues, and so (9.11.10) holds for a linear manifold of column matrices $u$ of dimension $k'$, namely, linear combinations of the corresponding eigenvectors. Denoting by $\mu_0$ any bound from above for the eigenvalues of $-J/i$, it follows from (9.11.11) that there is a set of $u$ of dimensionality $k'$ for which
$$u^*\, W_1(b, \lambda)\, u \leqslant \mu_0\, u^* u / (2 \operatorname{Im} \lambda). \tag{9.11.12}$$
From this we deduce that $W_1(b, \lambda)$ has at least $k'$ eigenvalues less than $\mu_0 / (2 \operatorname{Im} \lambda)$; to indicate a proof of this, we observe that if $W_1(b, \lambda)$ had fewer than $k'$ such eigenvalues, we could find a $u \neq 0$ from the above set orthogonal to all the corresponding eigenvectors, and so not satisfying (9.11.12). Hence for any $b$, there are at least $k'$ of the eigenvalues (9.11.3) of $W_1(b, \lambda)$ which are less than $\mu_0 / (2 \operatorname{Im} \lambda)$, and so at least $k'$ of them which remain finite as $b \to \infty$. This proves that $k_1 \geqslant k'$, which is the required result. The proof of the result for $\operatorname{Im} \lambda < 0$ is similar.

As a simple illustration we take the scalar equation $iy' = \lambda y$. Here $J/i = 1$, $k' = 0$, $k'' = 1$. The solutions are not of integrable square, in the ordinary sense, if $\operatorname{Im} \lambda > 0$, since apart from the trivial solution they are exponentially large as $x \to \infty$; if $\operatorname{Im} \lambda < 0$, they are exponentially small as $x \to \infty$ and of integrable square, there being only $k'' = 1$ linearly independent solutions. In the case of the system (8.1.2-3), in matrix form (9.1.13), we have $J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, and so $k' = k'' = 1$; as is well known, there is, if $\operatorname{Im} \lambda \neq 0$ at any rate, one nontrivial solution, of integrable square in the sense that

assuming $p$, $q$, $r$ to satisfy the assumptions of Section 8.1 for all finite $b > a$. As a final example, consider the system (9.1.18-19), including the fourth-order equation (9.1.16). Here $k' = k'' = 2$, so that if $\operatorname{Im} \lambda \neq 0$, (9.1.16) has two linearly independent solutions satisfying

here we may assume $p_2 > 0$, $p_0 > 0$, and all coefficients continuous, though these conditions may be much weakened. It was shown in Chapter 5 that if all solutions of a certain recurrence relation were of integrable square for some $\lambda$, then this was the case for all $\lambda$. We shall now prove this in the more general case of the differential equation (9.1.1), with an additional assumption. The result is
Theorem 9.11.2. In addition to the previous assumptions, let $J^{-1}A(x)$ be real. If for some $\lambda$ all solutions are of integrable square, in the sense (9.11.1), then this is so for all $\lambda$.

We assume that all solutions are of integrable square when $\lambda = \mu$, and define, for other $\lambda$, $Z(x, \lambda)$ by
$$Y(x, \lambda) = Y(x, \mu)\, Z(x, \lambda). \tag{9.11.13}$$
Multiplying on the left by $J$, differentiating and using (9.1.1) we obtain, writing $A$, $B$ for $A(x)$ and $B(x)$,
$$(\lambda A + B)\, Y(x, \lambda) = (\mu A + B)\, Y(x, \mu)\, Z(x, \lambda) + J\, Y(x, \mu)\, Z'(x, \lambda).$$
Substituting from (9.11.13) on the left and simplifying we have
$$J\, Y(x, \mu)\, Z'(x, \lambda) = (\lambda - \mu)\, A\, Y(x, \mu)\, Z(x, \lambda), \tag{9.11.14}$$
or, with the notation (9.11.8),
$$Z'(x, \lambda) = -i \{ W_3(x, \mu) \}^{-1} (\lambda - \mu)\, Y^*(x, \mu)\, A\, Y(x, \mu)\, Z(x, \lambda). \tag{9.11.15}$$
Abbreviating the latter differential equation to
$$Z'(x, \lambda) = C(x)\, Z(x, \lambda), \tag{9.11.16}$$
we now assert that $C(x)$ is absolutely integrable over $(a, \infty)$, or, in other words,
$$\int_a^\infty \| C(x) \|\, dx < \infty, \tag{9.11.17}$$
where the norm $\| C(x) \|$ may, for example, be the sum of the absolute values of all the entries in $C(x)$. Since all solutions of (9.1.1) are of integrable square when $\lambda = \mu$, we have
$$\int_a^\infty Y^*(x, \mu)\, A(x)\, Y(x, \mu)\, dx < \infty, \tag{9.11.18}$$
in the sense that the diagonal entries of $Y^*AY$, which are non-negative, are absolutely integrable over $(a, \infty)$, and so also the nondiagonal entries, since $Y^*AY$ is Hermitean. Hence to ensure (9.11.17) it will be sufficient to show that $\{ W_3(x, \mu) \}^{-1}$ is bounded as $x \to \infty$.

Turning to the proof of the latter statement, we observe first that it is trivial if $\mu$ is real, since then $W_3(x, \mu) = J/i$ for all $x$. More generally, we have from (9.5.7-8) that

whence, by (9.11.18), $W_3(x, \mu)$ tends to a finite limit as $x \to \infty$. Thus for its inverse to be bounded as $x \to \infty$ it will be sufficient for its determinant to be bounded from zero as $x \to \infty$. We have, of course,
$$\det W_3(x, \mu) = \det (J/i)\, \det Y(x, \mu)\, \det Y^*(x, \mu)$$
and so for the required property, the boundedness of $\{ W_3(x, \mu) \}^{-1}$, it will be sufficient to prove that
$$| \det Y(x, \mu) | \geqslant \text{const.} > 0. \tag{9.11.20}$$
By a standard formula from the theory of linear differential equations, we have from the fact that $J Y'(x, \mu) = (\mu A + B)\, Y(x, \mu)$, $Y(a, \mu) = E$, the result
$$\det Y(x, \mu) = \exp \Big\{ \int_a^x \operatorname{tr} ( \mu J^{-1}A + J^{-1}B )\, dt \Big\}. \tag{9.11.21}$$
Here we note that $\operatorname{tr} J^{-1}B$ is purely imaginary, or zero, since
$$\overline{\operatorname{tr} J^{-1}B} = \operatorname{tr} (J^{-1}B)^* = \operatorname{tr} B^* J^{*-1} = -\operatorname{tr} B J^{-1} = -\operatorname{tr} J^{-1}B,$$
so that from (9.11.21) we have
$$| \det Y(x, \mu) | = \Big| \exp \Big\{ \mu \int_a^x \operatorname{tr} (J^{-1}A)\, dt \Big\} \Big|. \tag{9.11.22}$$
Since we have required $J^{-1}A$ to be real, its trace is real; but the argument just given for $J^{-1}B$ shows equally that $\operatorname{tr} (J^{-1}A)$ is purely imaginary, so that $\operatorname{tr} (J^{-1}A) = 0$, and the right-hand side is unity, proving (9.11.20). Returning now to (9.11.16), having justified (9.11.17), we can assert that the solution $Z(x, \lambda)$ tends to a finite limit as $x \to \infty$, and so will be bounded above by some multiple of $E$. From (9.11.13) we now see that there holds an inequality of the form
$$\int_a^x Y^*(t, \lambda)\, A(t)\, Y(t, \lambda)\, dt \leqslant \text{const.} \int_a^x Y^*(t, \mu)\, A(t)\, Y(t, \mu)\, dt,$$
so that the integral on the left converges as $x \to \infty$, in view of (9.11.18). This completes the proof.

9.12. The Limiting Process $a \to -\infty$, $b \to +\infty$
For this purpose, with a view to eigenfunction expansions over the whole real axis, we suppose that $a < 0 < b$ and revise the definitions of the preceding sections so as to replace the value $x = a$ as a base-point by $x = 0$. As a fundamental solution of the matrix equation $JY' = (\lambda A + B)Y$ we take the function $Y^{(0)}(x, \lambda)$ satisfying $Y^{(0)}(0, \lambda) = E$, so that in fact $Y^{(0)}(x, \lambda) = Y(x, \lambda)\, Y^{-1}(0, \lambda)$. We define a new spectral function $T^{(0)}_{M,N}(\lambda)$ by the properties that it is a nondecreasing Hermitean matrix function whose jumps occur at the $\lambda_n$, and are of amount $y_n(0)\, y_n^*(0)$, thus replacing $u_n = y_n(a)$ in (9.3.26) by $y_n(0)$. In (9.3.28-29), and in (9.9.8) we are to replace $Y(t, \lambda)$ by $Y^{(0)}(t, \lambda)$. The limiting transition to an infinite interval depends on the uniform boundedness of $T^{(0)}_{M,N}(\lambda)$, for fixed $\lambda$, as $a \to -\infty$, $b \to +\infty$. Following the limit-circle and limit-point method, we take in this case as the characteristic function $K(0, 0, \lambda)$, where the definition of the resolvent kernel remains unchanged from that given in Section 9.4. Taking $x = 0$, $\lambda = i$ in (9.7.10) we have

This gives a bound for the spectral function. To make it into a bound holding uniformly as $a \to -\infty$, $b \to +\infty$ we have of course to show that $K(0, 0, i)$ is similarly bounded. Writing $F^{(0)}$ for the new characteristic function $K(0, 0, \lambda)$, and dropping the suffixes $M$, $N$ from the old one, the relationship between them may be written
$$F^{(0)} = Y_0\, F\, J\, Y_0^{-1}\, J^{-1}.$$
We get this from (9.5.4-5), taking the arithmetic mean of the two according to our definition of $K(x, x, \lambda)$. Writing this in the form $F = Y_0^{-1} F^{(0)} J Y_0 J^{-1}$, substituting for $F$ in the equation of the disk (9.8.23), and removing on the left and right factors $(J Y_0 J^{-1})^*$, $J Y_0 J^{-1}$, respectively, we obtain, assuming as before that $\operatorname{Im} \lambda > 0$,

We now argue that as $b$ increases, or as $a$ decreases, this inequality becomes more stringent, so that the disks "nest," or at least do not expand, and in any case are uniformly bounded, for sufficiently large $b > 0$, $-a > 0$. This is so since as $b$ increases, the central factor on the right is nondecreasing. In fact, $Y_b Y_0^{-1} = Y^{(0)}(b, \lambda)$, so that on the right of (9.12.1) we have the factor $Y^{(0)*}(b, \lambda)\, (J/i)\, Y^{(0)}(b, \lambda)$, which is nondecreasing by (9.5.7). Similarly, $Y_a Y_0^{-1} = Y^{(0)}(a, \lambda)$, and a similar argument shows that $Y^{(0)*}(a, \lambda)\, (J/i)\, Y^{(0)}(a, \lambda)$ does not increase as $a$ decreases. Thus the disks (9.12.1) remain bounded, and so also $F^{(0)}$, as $a \to -\infty$, $b \to +\infty$, and the existence of a limiting spectral function follows from Helly's theorem.
CHAPTER 10
Matrix Oscillation Theory
10.1. Introduction
The term oscillation refers in the first place to the zeros of real scalar functions, particularly the solutions of second-order differential equations. These zeros have also the meaning that some self-adjoint boundary problem is satisfied, at least in part; this provides a natural basis for interpreting the notion of oscillation in a more general context. Taking the general formulation given by the first-order equation (9.1.1), a new aspect is opened up if we suppress the parameter $\lambda$, and with a free boundary $x_1$ enquire for what $x_1$ the boundary problem
$$Jy' = By, \qquad y(a) = Mv, \qquad y(x_1) = Nv, \tag{10.1.1}$$
admits a nontrivial solution; here as before $J$ is to be skew-Hermitean, $B = B(x)$ Hermitean, $M^* J M = N^* J N$, and $M$ and $N$ are to have no common null-vectors. Such points $x_1$ may be termed "right-conjugate" points of $a$, relative of course to the boundary conditions; as in the special case of zeros of a scalar solution, we may study separation properties, "disconjugacy" and "nonoscillation." Reintroducing the parameter, the problem being that of Chapter 9, we have the detailed study of eigenvalues, in particular, separation properties for varying boundary conditions and quantitative information on their distribution. The two types of investigation are not quite distinct and may be blended; for example, in Theorems 4.3.4 and 8.4.4 we considered the motion of zeros in $x$ for varying $\lambda$. Again, there will be other forms of oscillatory investigation, not directly related to boundary problems. In this chapter we discuss these topics first in the context of the vector or matrix Sturm-Liouville system. This will include, of course, the ordinary Sturm-Liouville case of a second-order scalar equation. Treating this as a first-order system as in Chapter 8, and allowing the coefficients to vanish over intervals, we may include also the recurrence relation cases of Chapters 4-5 and Section 6.7. We shall then give the extension to the more general first-order system (10.1.1).
To illustrate the type of question to be discussed we take the trivial case of a scalar first-order equation

−iy′ = [λq(x) + r(x)]y,  a ≤ x ≤ b,  (10.1.2)

with the boundary condition

y(b) = exp(iα) y(a) ≠ 0,  (10.1.3)

for some real α, 0 ≤ α < 2π. In addition, we consider also the nonparametric equation, with λ = 0,

−iy′ = r(x)y,  a ≤ x ≤ b.  (10.1.4)

For simplicity, let us assume q(x), r(x) to be continuous in any relevant interval, q(x) being positive. Relative to the nonparametric equation (10.1.4), the following definitions may be set up. If for a given interval (a, b₁) there is no subinterval (a, b), with a < b ≤ b₁, such that the boundary problem (10.1.3-4) is soluble, then (10.1.4) is "disconjugate" over (a, b₁); naturally, this disconjugacy is relative to the particular boundary condition (10.1.3), so that we might say in this case "α-disconjugate." For the given α, any b > a for which (10.1.3-4) is soluble may be termed a right-conjugate point of a, with respect to the same boundary conditions; the smallest such b > a may be termed the first right-conjugate point. If (10.1.4) is defined over (a, ∞), we may raise the question of whether or not there is an infinity of right-conjugate points of a, terming the equation nonoscillatory if there is only a finite number of such points, and oscillatory otherwise. In view of the explicit solution of (10.1.4), namely,

y(x) = y(a) exp{i ∫_a^x r(t) dt},  (10.1.5)
the following statements are self-evident:

(i) if r(x) > 0, and if the equation is disconjugate over (a, b₁) for some α, it is also disconjugate over (a, b₁) for α′ with α < α′ < 2π;

(ii) if r(x) > 0, all right-conjugate points move to the left as α decreases;

(iii) if r(x) > 0, then between any two right-conjugate points of a for α = α₁ lies a right-conjugate point for α = α₂, α₁ ≠ α₂, 0 ≤ α₁, α₂ < 2π;

(iv) if r(x) > 0, and the equation is nonoscillatory over (a, ∞) for some α, it is nonoscillatory for all α;
10. MATRIX OSCILLATION THEORY
(v) if r(x) > 0, the equation is disconjugate over (a, b) if and only if the least positive eigenvalue of

−iy′ = λr(x)y,  a ≤ x ≤ b,

with the boundary condition (10.1.3), is greater than 1;

(vi) between two eigenvalues of (10.1.2-3) for α = α₁ there lies an eigenvalue of (10.1.2-3) for any distinct α = α₂, α₁, α₂ ∈ [0, 2π);

(vii) indexing the eigenvalues of (10.1.2-3) in numerical order, there holds the asymptotic formula

λₙ ∼ 2πn / ∫_a^b q(t) dt

as n → +∞.

In what follows we wish to establish results of a similar character for the general case of the first-order matrix system (10.1.1), with special reference to Sturm-Liouville systems in matrix terms. An immediate obstacle to this program is that we do not possess an explicit formula extending (10.1.5) for the solutions of (10.1.1); the formula

y(x) = exp{∫_a^x J⁻¹[λA(t) + B(t)] dt} y(a)

will be valid only if A(t), B(t) are constant matrices, or if the system is one-dimensional as just discussed, or if certain commutativity relations hold. To see how to surmount this difficulty, we may consider an alternative line of argument for the one-dimensional or scalar case (10.1.2-4), not making explicit use of exponential or trigonometric functions. For a solution of (10.1.4) with y(a) ≠ 0 we define the function

f(x) = (i/2){y(x) + exp(iα) y(a)} / {y(x) − exp(iα) y(a)},  (10.1.7)

where exp(iα) appears simply as a constant derived from the boundary conditions; this function has affinities with the characteristic function defined in (1.6.1), and so with the notions of a Green's function or influence function or driving-point admittance. It turns out that f(x) satisfies, independently of α, the Riccati-type differential equation

f′ = −r{f² + ¼}.  (10.1.8)
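The decreasing character of f, noted next, can be seen numerically from (10.1.8) itself. The following is a minimal sketch, with a hypothetical positive coefficient r(x) = 1 + ½ sin x and starting value f(a) = 1 (both chosen for illustration only):

```python
import math

def riccati(r, f0, x0, x1, n=2000):
    """Integrate f' = -r(x)*(f**2 + 1/4), eq. (10.1.8), by classical RK4."""
    h = (x1 - x0) / n
    f, x, traj = f0, x0, [f0]
    g = lambda x, f: -r(x) * (f * f + 0.25)
    for _ in range(n):
        k1 = g(x, f)
        k2 = g(x + h/2, f + h/2*k1)
        k3 = g(x + h/2, f + h/2*k2)
        k4 = g(x + h, f + h*k3)
        f += h/6 * (k1 + 2*k2 + 2*k3 + k4)
        x += h
        traj.append(f)
    return traj

traj = riccati(lambda x: 1.0 + 0.5*math.sin(x), f0=1.0, x0=0.0, x1=1.0)
assert all(b < a for a, b in zip(traj, traj[1:]))   # f strictly decreasing
```

Since f² + ¼ ≥ ¼ > 0, the right side of (10.1.8) is negative wherever r > 0, which is what the computed trajectory confirms.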
Without solving this equation, we can draw the conclusion that if r(x) > 0 then f(x) is a decreasing function of x. Hence, for example,
its zeros alternate with its infinities, that is to say, the right-conjugate points for the boundary conditions y(b) = ±exp(iα) y(a) alternate with each other. To avoid infinities we may consider such expressions as

θ(x) = {2f(x) + i} / {2f(x) − i};  (10.1.9)

although similar transformations may in the present case lead us back to y(x), this will not normally be the case. From the fact that f decreases in x, we can deduce that θ moves positively round the unit circle as x increases. As θ moves from point to point on the unit circle, it will pass through intermediate points, and this observation is a source of separation theorems. By investigating the rate at which θ moves on the unit circle, estimates for eigenvalues may be obtained. A second obstacle met with in connection with matrix systems is that instead of a scalar quantity moving round the unit circle, we have a matrix moving on the matrix unit circle, that is to say, in the unitary group; we can no longer say that as it goes from point to point, it passes through all intermediate points. Without however looking into the connectivity properties of the unitary group, we may obtain much information by considering the variation of the eigenvalues of the unitary matrix in question.
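The scalar situation just described can be sketched numerically. Using the explicit solution (10.1.5), the right-conjugate points of a are exactly the x in (a, b] where ∫_a^x r dt reaches α modulo 2π; the coefficient r ≡ 1 below is a hypothetical choice for illustration:

```python
import math

def right_conjugate_points(r, a, b, alpha, n=100000):
    """Right-conjugate points of a for -iy' = r(x)y under condition (10.1.3):
    by (10.1.5), the x in (a, b] with  int_a^x r dt = alpha (mod 2*pi)."""
    pts, acc = [], 0.0
    h = (b - a) / n
    target = alpha if alpha > 0 else 2 * math.pi
    for i in range(n):
        acc += h * (r(a + i*h) + r(a + (i + 1)*h)) / 2   # trapezoid rule
        while acc >= target:
            pts.append(a + (i + 1) * h)
            target += 2 * math.pi
    return pts

pts = right_conjugate_points(lambda t: 1.0, a=0.0, b=20.0, alpha=0.0)
assert len(pts) == 3                    # 2*pi, 4*pi, 6*pi lie in (0, 20]
assert abs(pts[0] - 2*math.pi) < 1e-3
```

Increasing α shifts each conjugate point to the right, which is statement (ii) in the list above read in reverse.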
10.2. The Matrix Sturm-Liouville Equation

Suppressing for the moment the parameter λ, we consider here the two first-order systems

U′ = RV,  V′ = −QU,  a ≤ x ≤ b,  (10.2.1)

and

u′ = Rv,  v′ = −Qu,  a ≤ x ≤ b,  (10.2.2)

where U, V, R, and Q are k-by-k matrices of functions of x, and u, v are column matrices of functions. It is (10.2.2) which is appropriate for the purposes of boundary problems. As in the ordinary Sturm-Liouville case, when k = 1, it is natural to concentrate on one-point boundary problems, in particular on the existence of a nontrivial solution of (10.2.2) such that

u(a) = 0,  u(b) = 0.  (10.2.3-4)
The role of the matrix version (10.2.1) in this problem is as follows.
We take the solution of (10.2.1) specified by the initial conditions

U(a) = 0,  V(a) = E,  (10.2.5-6)

where E is the k-by-k unit matrix. The solution of (10.2.2) satisfying the initial condition (10.2.3) is then

u(x) = U(x) v(a),  v(x) = V(x) v(a),  (10.2.7)

since this satisfies (10.2.2) and reduces to u(a) = 0, v(a) when x = a. Hence the boundary problem (10.2.3-4) has a nontrivial solution if, and only if,

det U(b) = 0.  (10.2.8)

Similarly, if we ask for a solution of (10.2.2) satisfying the boundary conditions

u(a) = 0,  v(b) = 0,  (10.2.9)

we see that there will be a nontrivial solution if, and only if,

det V(b) = 0,  (10.2.10)

where again U, V are specified by (10.2.1), (10.2.5-6). This may suggest that for boundary problems of this and some similar special forms it will be useful to study the matrix

Z(x) = U(x) V⁻¹(x),  (10.2.11)

which we do in this section. We consider its dependence on x, and on a parameter λ to be reintroduced, and also consider its Cayley transform. We assume that the matrices Q(x), R(x) are Hermitean and, at any rate, Lebesgue integrable. The following results apply to Z as given by (10.2.11), where U, V satisfy (10.2.1) with possibly more general initial conditions than (10.2.5-6). For the case of the dependence of Z on x we derive a Riccati-type differential equation, and other properties.
Theorem 10.2.1. Let UV⁻¹, or VU⁻¹, exist and be Hermitean for at least one x. Then UV⁻¹ and VU⁻¹ are Hermitean whenever they exist. Furthermore, Z = UV⁻¹ satisfies, when V⁻¹ exists, the differential equation

Z′ = R + Z*QZ.  (10.2.12)

The first assertion will be deduced from the property that

U*(x) V(x) − V*(x) U(x) = const.  (10.2.13)
To see this we differentiate the left-hand side, getting

(U*V − V*U)′ = (RV)*V + U*(−QU) − (−QU)*U − V*(RV),  (10.2.14)

and since R* = R, Q* = Q this vanishes, proving (10.2.13). Thus if for some x, V⁻¹ exists and UV⁻¹ is Hermitean, so that UV⁻¹ = (UV⁻¹)* = V*⁻¹U*, then

U*(x) V(x) − V*(x) U(x) = 0  (10.2.15)

for this x and so for all x in (a, b). Hence for all x for which V⁻¹ exists, multiplying (10.2.15) on the left and right by V*⁻¹ and V⁻¹ shows that V*⁻¹U* − UV⁻¹ = 0, so that UV⁻¹ is Hermitean. The argument is similar if we postulate that VU⁻¹ is Hermitean for some x. To verify (10.2.12), assuming V⁻¹ to exist,

Z′ = (UV⁻¹)′ = U′V⁻¹ − UV⁻¹V′V⁻¹ = (RV)V⁻¹ − UV⁻¹(−QU)V⁻¹ = R + ZQZ = R + Z*QZ,  (10.2.16)
since Z is Hermitean. It may happen that neither of U⁻¹, V⁻¹ exists, and to avoid difficulties of this nature we may consider instead the matrix

θ(x) = {V(x) + iU(x)}{V(x) − iU(x)}⁻¹,  (10.2.17)

which is the same as

θ = (E + iZ)(E − iZ)⁻¹  (10.2.18)

if V⁻¹ exists. The results corresponding to those of the last theorem are given by

Theorem 10.2.2. For some x let U*V be Hermitean and V − iU have an inverse. Then θ exists for all x in (a, b) and is unitary. It satisfies the differential equation

θ′ = iΩθ,  (10.2.19)

where the Hermitean matrix Ω(x) is given by

Ω = 2(V* + iU*)⁻¹(V*RV + U*QU)(V − iU)⁻¹.  (10.2.20)

Supposing U*V Hermitean we have (10.2.15) for some x, and so for all x. Hence for all x we have

(V* − iU*)(V + iU) = (V* + iU*)(V − iU) = V*V + U*U.  (10.2.21)
Supposing V − iU nonsingular for some x, we then assert that

V*V + U*U > 0  (10.2.22)

for all x. For V*V + U*U is at least positive semidefinite, and if (10.2.22) did not hold there would be a column matrix z ≠ 0 such that z*(V*V + U*U)z = 0, whence z*V*Vz = 0, z*U*Uz = 0, and so Uz = Vz = 0 for some x. Since u(x) = U(x)z, v(x) = V(x)z are solutions of (10.2.2), it follows that for all x we have U(x)z = 0, V(x)z = 0, and so V(x)z − iU(x)z = 0, contrary to the assumption that V − iU is nonsingular for some x. Hence, under the assumptions of the theorem, θ exists for all x. Writing the first equation in (10.2.21) in the form

(V − iU)*⁻¹(V + iU)*(V + iU)(V − iU)⁻¹ = E,

we see that θ*θ = E, so that θ is unitary, as asserted. Finally we verify directly the differential equation (10.2.19). For

θ′ = (V′ + iU′)(V − iU)⁻¹ − (V + iU)(V − iU)⁻¹(V′ − iU′)(V − iU)⁻¹
  = (V′ + iU′)(V − iU)⁻¹ − θ(V′ − iU′)(V − iU)⁻¹.  (10.2.23)
Since θ*θ = E, multiplication by θ* gives

θ*θ′ = (V* + iU*)⁻¹(V* − iU*)(V′ + iU′)(V − iU)⁻¹ − (V′ − iU′)(V − iU)⁻¹
    = (V* + iU*)⁻¹{(V* − iU*)(V′ + iU′) − (V* + iU*)(V′ − iU′)}(V − iU)⁻¹
    = (V* + iU*)⁻¹{2iV*U′ − 2iU*V′}(V − iU)⁻¹.  (10.2.24)

Using (10.2.1) it follows that

θ*θ′ = 2i(V* + iU*)⁻¹(V*RV + U*QU)(V − iU)⁻¹ = iΩ,  (10.2.25)

from which we have (10.2.19) since θ is unitary. This completes the proof.

The differential equation (10.2.19) cannot, in general, be solved explicitly. We have, however, by a standard theorem for linear differential equations,

det θ(x) = exp{i ∫_a^x tr Ω(t) dt} det θ(a),  (10.2.26)

a result which may be used for the estimation of eigenvalues. To complete the analytical machinery we give the corresponding results for the dependence of Z and θ on a parameter λ to be introduced.
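The identities proved so far — the constancy (10.2.13), the unitarity of θ, and the determinant formula (10.2.26) — lend themselves to a numerical check. The following sketch uses hypothetical constant Hermitean coefficients Q, R (random, scaled small so that V stays invertible on the interval):

```python
import numpy as np

rng = np.random.default_rng(7)
k = 2
herm = lambda A: (A + A.conj().T) / 2
Q = 0.4 * herm(rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k)))
R = 0.4 * herm(rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k)))

def omega(U, V):
    # (10.2.20): Omega = 2 (V - iU)^{-*} (V*RV + U*QU) (V - iU)^{-1}
    W = np.linalg.inv(V - 1j * U)
    return 2 * W.conj().T @ (V.conj().T @ R @ V + U.conj().T @ Q @ U) @ W

U = np.zeros((k, k), dtype=complex)
V = np.eye(k, dtype=complex)
h, n = 1e-3, 1000
phase = 0.0
prev = np.trace(omega(U, V)).real
for _ in range(n):
    # one RK4 step for U' = RV, V' = -QU
    Y = np.stack([U, V])
    f = lambda Y: np.stack([R @ Y[1], -Q @ Y[0]])
    k1 = f(Y); k2 = f(Y + h/2*k1); k3 = f(Y + h/2*k2); k4 = f(Y + h*k3)
    Y = Y + h/6 * (k1 + 2*k2 + 2*k3 + k4)
    U, V = Y[0], Y[1]
    cur = np.trace(omega(U, V)).real
    phase += h * (prev + cur) / 2      # trapezoid rule for int tr(Omega) dt
    prev = cur

theta = (V + 1j*U) @ np.linalg.inv(V - 1j*U)
# (10.2.13): U*V - V*U keeps its initial value, here 0
assert np.linalg.norm(U.conj().T @ V - V.conj().T @ U) < 1e-8
# Theorem 10.2.2: theta is unitary
assert np.linalg.norm(theta.conj().T @ theta - np.eye(k)) < 1e-8
# (10.2.26) with theta(a) = E
assert abs(np.linalg.det(theta) - np.exp(1j * phase)) < 1e-4
```

The trapezoid accumulation of tr Ω is crude, but an O(h²) error suffices to confirm (10.2.26) to the stated tolerance.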
We extend (10.2.1) to

U′ = RV,  V′ = −(λP + Q)U,  (10.2.27)

where P = P(x) is also Hermitean and Lebesgue integrable. We take U = U(x, λ), V = V(x, λ) to be a solution with the fixed initial values

U(a, λ) = 0,  V(a, λ) = E.  (10.2.28)

Denoting partial differentiation with respect to λ by a suffix, we have U_λ = V_λ = 0 when x = a, while differentiation of (10.2.27) gives

U_λ′ = RV_λ,  V_λ′ = −(λP + Q)U_λ − PU.

From these latter results it is easily verified that, with λ real,

(V*U_λ − U*V_λ)′ = −U*(λP + Q)U_λ + V*RV_λ − V*RV_λ − U*{−(λP + Q)U_λ − PU} = U*PU,

whence, integrating over (a, x),

V*(x, λ) U_λ(x, λ) − U*(x, λ) V_λ(x, λ) = ∫_a^x U*(t, λ) P(t) U(t, λ) dt.  (10.2.29)
This forms an extension of (4.2.3), or again of (8.4.3-4). We can now deduce easily

Theorem 10.2.3. For real λ for which Z exists,

∂Z(x, λ)/∂λ = {V*(x, λ)}⁻¹ ∫_a^x U*(t, λ) P(t) U(t, λ) dt {V(x, λ)}⁻¹,  (10.2.30)

and

∂θ(x, λ)/∂λ = iΩ†(x, λ) θ(x, λ),  (10.2.31)

where the Hermitean matrix Ω†(x, λ) is given by

Ω†(x, λ) = 2{V*(x, λ) + iU*(x, λ)}⁻¹ ∫_a^x U*(t, λ) P(t) U(t, λ) dt {V(x, λ) − iU(x, λ)}⁻¹.  (10.2.32)
The significance to be attached to these two formulas is, roughly speaking, that if P(x) > 0, then Z(x, λ) is an increasing function of λ when λ is real, while the eigenvalues of θ(x, λ) move positively on the unit circle as λ increases. For the proof of (10.2.30) we have that

Z_λ = (UV⁻¹)_λ = U_λV⁻¹ − UV⁻¹V_λV⁻¹ = U_λV⁻¹ − V*⁻¹U*V_λV⁻¹,

since Z is Hermitean. Hence

Z_λ = V*⁻¹(V*U_λ − U*V_λ)V⁻¹,

which is equivalent to (10.2.30) in view of (10.2.29). In the case of (10.2.31-32) we use (10.2.24), replacing U′, V′ by U_λ, V_λ and using (10.2.29).
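The monotone dependence of Z on λ asserted after Theorem 10.2.3 can be illustrated numerically. The coefficients below are hypothetical: constant Hermitean Q, R and a positive-definite constant P, scaled so that V remains invertible on the interval of integration:

```python
import numpy as np

rng = np.random.default_rng(3)
k = 2
herm = lambda A: (A + A.conj().T) / 2
A = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
P = 0.3 * (A @ A.conj().T) + 0.1 * np.eye(k)          # P > 0
Q = 0.3 * herm(rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k)))
R = 0.3 * herm(rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k)))

def Z_at_b(lam, b=0.8, n=1000):
    """Integrate (10.2.27)-(10.2.28) by RK4 and return Z(b, lam) = U V^{-1}."""
    U = np.zeros((k, k), dtype=complex)
    V = np.eye(k, dtype=complex)
    h = b / n
    M = lam * P + Q
    for _ in range(n):
        Y = np.stack([U, V])
        f = lambda Y: np.stack([R @ Y[1], -M @ Y[0]])
        k1 = f(Y); k2 = f(Y + h/2*k1); k3 = f(Y + h/2*k2); k4 = f(Y + h*k3)
        Y = Y + h/6 * (k1 + 2*k2 + 2*k3 + k4)
        U, V = Y[0], Y[1]
    return U @ np.linalg.inv(V)

Z0, Z1 = Z_at_b(0.0), Z_at_b(0.5)
D = Z1 - Z0
# by (10.2.30) with P > 0, Z(b, lam) increases in lam: D should be >= 0
assert np.linalg.eigvalsh((D + D.conj().T) / 2).min() > -1e-8
assert np.linalg.norm(Z0 - Z0.conj().T) < 1e-8   # Z Hermitean (Theorem 10.2.1)
```

The check compares Z at two parameter values only; (10.2.30) of course gives the stronger statement that the derivative itself is positive semidefinite for every real λ.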
10.3. Separation Theorems for Conjugate Points

We compare here the boundary problems in which nontrivial solutions are to be found for

u′ = Rv,  v′ = −Qu,  u(a) = 0,  u(x) = 0,  (10.3.1)

or for

u′ = Rv,  v′ = −Qu,  u(a) = 0,  v(x) = 0,  (10.3.2)

where Q, R are as before k-by-k Hermitean matrices of functions of x, and u, v are column matrices. There being no disposable parameter λ, these problems will in general be insoluble. The x-values, a < x ≤ b, for which they are soluble will be denoted by

ξ₁ ≤ ξ₂ ≤ ⋯  (10.3.3)

in the case of (10.3.1) and by

η₁ ≤ η₂ ≤ ⋯  (10.3.4)

in the case of (10.3.2). There is, of course, no guarantee that these numbers exist; however, the existence of the ξᵣ and that of the ηᵣ are related. If for some x the problem in question has more than one linearly independent solution, it is to be written a corresponding number of times in (10.3.3) or (10.3.4). Parallel to the definitions given in Section 10.1, the ξᵣ, ηᵣ may be termed "right-conjugate" points of a, relative to the boundary conditions in question.
In the scalar case k = 1, the initial condition u(a) = 0 specifies a solution which is unique except for a scalar factor. The right-conjugate points (10.3.3-4) are then simply the zeros of u(x), v(x) for this solution. In the special case of a second-order scalar differential equation y″ + qy = 0, these become the zeros of y, y′ for a solution such that y(a) = 0; these zeros have certain separation properties, in that between two zeros of y there lies a zero of y′, and vice versa, if q(x) is continuous and not zero. Here we shall extend these properties.

Taking as before the k-by-k matrix solutions of U′ = RV, V′ = −QU, with U(a) = 0, V(a) = E, we form the matrix θ(x) = (V + iU)(V − iU)⁻¹ which, by Theorem 10.2.2, exists and is unitary. Initially we shall have θ(a) = E. Let us relate the eigenvalues of θ(x) to boundary problems of the type of (10.3.1-2). Since the eigenvalues must lie on the unit circle, suppose that, for some x > a, exp(iα) is among the eigenvalues. Then for some column matrix w ≠ 0 we have

θ(x) w = (V + iU)(V − iU)⁻¹ w = exp(iα) w.  (10.3.5)

Defining z = (V − iU)⁻¹w, this is the same as

(V + iU)z = exp(iα)(V − iU)z,

and so

Uz cos ½α = Vz sin ½α.  (10.3.6)

If now we consider the solution of u′ = Rv, v′ = −Qu given by u(x) = U(x)z, v(x) = V(x)z, we see that this solution satisfies the boundary conditions

u(a) = 0,  v(a) ≠ 0,  u(x) cos ½α = v(x) sin ½α.  (10.3.7)

Conversely, retracing the above argument, we see that if (10.3.7) has a nontrivial solution, then θ(x) has exp(iα) among its eigenvalues. Furthermore, if exp(iα) is a multiple eigenvalue, in that (10.3.5) has several linearly independent solutions w, then (10.3.7) will have a similar number of linearly independent solutions, and conversely. Taking in particular α = 0, we see that the right-conjugate points ξₙ from the problem (10.3.1) are the x-values for which θ(x) has +1 among its eigenvalues. Similarly, with α = π, the ηᵣ are the x for which θ(x) has −1 as an eigenvalue. Taking a simple case we have then
Theorem 10.3.1. Let Q(x), R(x) be continuous, Hermitean, and positive-definite in a ≤ x ≤ b. Then the existence of ξₙ implies that of ηₙ, and

a < ηₙ < ξₙ.  (10.3.8)

For n > k, the existence of ηₙ implies that of ξₙ₋ₖ, and

a < ξₙ₋ₖ < ηₙ.  (10.3.9)
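Before the proof, the scalar case k = 1 may be sketched numerically; with the hypothetical choice R = Q = 1, the solution with u(0) = 0, v(0) = 1 is u = sin x, v = cos x, and the ξₙ (zeros of u) and ηₙ (zeros of v) interlace as the theorem asserts:

```python
import math

u, v, x, h = 0.0, 1.0, 0.0, 1e-4
xi, eta = [], []            # detected zeros of u and of v
while x < 7.0:
    pu, pv = u, v
    # one RK4 step for u' = v, v' = -u
    k1 = (v, -u)
    k2 = (v + h/2*k1[1], -(u + h/2*k1[0]))
    k3 = (v + h/2*k2[1], -(u + h/2*k2[0]))
    k4 = (v + h*k3[1], -(u + h*k3[0]))
    u += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
    v += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    x += h
    if pu * u < 0:
        xi.append(x)
    if pv * v < 0:
        eta.append(x)

assert eta[0] < xi[0] < eta[1] < xi[1]      # (10.3.8)-(10.3.9) with k = 1
assert abs(eta[0] - math.pi/2) < 1e-3 and abs(xi[0] - math.pi) < 1e-3
```

The sign-change test deliberately ignores the initial zero u(0) = 0, in keeping with the convention that conjugate points lie in a < x ≤ b.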
The proof depends on a study of the motion of the eigenvalues

ω₁(x), …, ω_k(x)  (10.3.10)

of θ(x), and particularly of their arguments

arg ω₁(x), …, arg ω_k(x).  (10.3.11)

Since θ(a) = E, all the ωᵣ(a) are unity, and we take all the (10.3.11) to be zero when x = a. We show later (Appendix V) that they may be continued uniquely and continuously so that

arg ω₁(x) ≤ arg ω₂(x) ≤ ⋯ ≤ arg ω_k(x) < arg ω₁(x) + 2π,  (10.3.12)

this condition being automatically satisfied at x = a since

arg ωᵣ(a) = 0,  r = 1, …, k.  (10.3.13)

Having determined the eigenvalues of θ(x) as k continuous functions of x, the conclusions of the theorem will follow rapidly from the further fact that the ωᵣ(x) move positively on the unit circle as x increases, in other words, that the arg ωᵣ(x) are monotonic increasing functions. This in turn will follow from the differential equation (10.2.19) and the fact that Ω > 0 (see Appendix V). In view of (10.2.20), the fact that Ω > 0 will follow from the result V*RV + U*QU > 0. To see this we recall the assumption that R > 0, Q > 0, so that there is a positive scalar c such that R ≥ cE, Q ≥ cE. It then follows that V*RV + U*QU ≥ c(V*V + U*U) > 0, by (10.2.22). Hence the arg ωᵣ(x) are monotonic increasing.

Supposing now that ξₙ exists, we have that as x increases in a < x ≤ ξₙ, the ωᵣ(x) must pass through or reach, moving positively, the value +1 a total of n times; it is easily seen that they must pass through the value −1 at least n times in this process, from which follows the existence of ηₙ in a < x < ξₙ. For a detailed proof, suppose that ωᵣ(x) passes through or reaches +1 a total of nᵣ times in (a, ξₙ], so that arg ωᵣ(ξₙ) ≥ 2πnᵣ, Σ₁ᵏ nᵣ = n. Then in the same interval the equation arg ωᵣ(x) ≡ π (mod 2π), which determines the x for which ωᵣ(x) = −1, will have at least nᵣ solutions, making a total of at least n x-values in (a, ξₙ] for which −1 is an eigenvalue of θ(x), regard being had to multiplicity.

Passing to the proof of the second half of Theorem 10.3.1, and supposing that ηₙ exists, let mᵣ be the number of times that ωᵣ(x) passes through or reaches the value −1 in the interval (a, ηₙ]. We have then

arg ωᵣ(ηₙ) ≥ (2mᵣ − 1)π,  Σ₁ᵏ mᵣ = n.

Hence the number of solutions of

arg ωᵣ(x) ≡ 0 (mod 2π),  a < x ≤ ηₙ,

is not less than mᵣ − 1, and so in total not less than n − k. Hence ξₙ₋ₖ exists, if n > k, and satisfies (10.3.9). This completes the proof of Theorem 10.3.1.

If n > k, and ξₙ exists, the inequalities (10.3.8-9) may be combined in the form

ηₙ₋ₖ < ξₙ₋ₖ < ηₙ < ξₙ.  (10.3.14)
This may be expressed as a separation theorem.

Theorem 10.3.2. Under the assumptions of Theorem 10.3.1, if a closed interval contains k + 1 of the ξₙ, it contains in its interior at least one of the ηᵣ, and vice versa.

In the scalar case k = 1, these statements follow immediately from Rolle's theorem, since we have then to deal with a pair of equations u′ = rv, v′ = −qu, with r > 0, q > 0; and it is obvious that between two zeros of u there lies a zero of v, and vice versa. No such simple argument seems to apply for k > 1.

A refinement of these arguments deals with the situation, important in Sturm-Liouville theory, in which only one of the coefficients Q, R is positive-definite. Illustrating this in the scalar case u′ = rv, v′ = −qu, suppose only that r > 0. Then the trajectories in the (u, v)-plane, taken with the usual orientation, will have the property that when they cross the v-axis, they will do so in a clockwise sense relative to the origin. Thus between two x-values at which a nontrivial solution meets the v-axis, there will be at least one value at which it meets the u-axis; the statement need not be true with the axes interchanged. We prove

Theorem 10.3.3. Let Q(x), R(x) be continuous and Hermitean in a ≤ x ≤ b, and let R(x) be positive-definite. Then the existence of ξₙ implies that of ηₙ, and (10.3.8) holds.

We may, as before, define the eigenvalues (10.3.10) of θ(x) and take it that they are continuous. The main fact to be established is that the ωᵣ(x) move monotonically and positively on the unit circle as x increases when they are at the point +1. For this it will be sufficient to prove that the angular velocity matrix Ω(x) defined in (10.2.19-20) is positive-
definite when applied to eigenvectors of θ(x) associated with an eigenvalue +1. Supposing that for some x one or more of the ωᵣ(x) is +1, we consider any column matrix w such that

θ(x)w = w,  w*w > 0,  (10.3.15)

and propose to prove that

w*Ω(x)w > 0.  (10.3.16)

Writing z = (V − iU)⁻¹w, (10.3.15) gives (V + iU)z = (V − iU)z, or

Uz = 0,  w = (V − iU)z = Vz.

Now the left of (10.3.16) is, by (10.2.20), the same as

2{(V − iU)⁻¹w}*(V*RV + U*QU)(V − iU)⁻¹w = 2z*(V*RV + U*QU)z = 2z*V*RVz = 2w*Rw > 0,

since R > 0. This proves (10.3.16), so that the ωᵣ(x) can only pass through +1 in the positive sense, and will pass through it if they reach it in a < x < b. Thus the functions arg ωᵣ(x) are strictly increasing when they are multiples of 2π; in particular, they are increasing when x = a. Supposing that ξₙ exists, and that arg ωᵣ(x) is a multiple of 2π for nᵣ values of x in a < x ≤ ξₙ, we have

arg ωᵣ(ξₙ) ≥ 2πnᵣ,

while Σ₁ᵏ nᵣ = n. From this it is immediate that the equations

arg ωᵣ(x) ≡ π (mod 2π),  a < x < ξₙ,

have altogether at least n solutions, which proves the result.

The above theorems will evidently admit of certain extensions, concerning boundary conditions of the more general type (10.3.7).

10.4. Estimates of Oscillation

In the last section we proved results of a qualitative character concerning conjugate points of x = a for the nonparametric system u′ = Rv, v′ = −Qu. These were deduced from the continuity or monotonicity of the motion of the eigenvalues ωᵣ(x) of θ(x) = (V + iU)(V − iU)⁻¹,
where U′ = RV, V′ = −QU, U(a) = 0, V(a) = E; the nature of the motion of the eigenvalues ωᵣ(x) with x was in turn deduced from the differential equation (10.2.19). If now we employ this differential equation in a quantitative sense, we may obtain bounds for the motion of the eigenvalues. Hence we may obtain bounds for conjugate points, and in particular conditions relating to their existence.

The velocity of the ωᵣ(x), as functions of x on the unit circle, is bounded by the eigenvalues of Ω(x), which in turn are bounded by the eigenvalues of R(x) and Q(x). Assuming R(x) and Q(x) Hermitean and, for convenience, continuous, we denote by γ₁(x), γ₂(x), respectively, the lowest and the highest among the eigenvalues of either R(x) or Q(x). Then, for a ≤ x ≤ b,

γ₁(x)E ≤ Q(x) ≤ γ₂(x)E,  γ₁(x)E ≤ R(x) ≤ γ₂(x)E,  (10.4.1)

and so

γ₁(U*U + V*V) ≤ (U*QU + V*RV) ≤ γ₂(U*U + V*V).  (10.4.2)

Multiplying on the right by (V − iU)⁻¹ and on the left by its adjoint, we have in view of (10.2.21) that

2γ₁(x)E ≤ Ω(x) ≤ 2γ₂(x)E,  (10.4.3)

where Ω(x) is given by (10.2.20). We deduce from (10.2.19) that the eigenvalues of θ(x) vary on the unit circle at a rate which is similarly bounded, namely, that

2γ₁(x) ≤ (d/dx) arg ωᵣ(x) ≤ 2γ₂(x)  (10.4.4)

for r = 1, …, k and a ≤ x ≤ b. Since arg ωᵣ(a) = 0, it follows on integrating that

2 ∫_a^x γ₁(t) dt ≤ arg ωᵣ(x) ≤ 2 ∫_a^x γ₂(t) dt.  (10.4.5)
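The bounds (10.4.5) can be checked in a case solvable in closed form: with R = Q = H constant and Hermitean (a hypothetical choice), θ(x) = exp(2iHx), so that arg ωᵣ(x) = 2γᵣx for the eigenvalues γᵣ of H, and (10.4.5) holds with equality at the extremes. A numerical version:

```python
import numpy as np

H = np.array([[0.6, 0.2 - 0.1j],
              [0.2 + 0.1j, 0.4]])          # Hermitean, H > 0
g = np.linalg.eigvalsh(H)
g1, g2 = g[0], g[-1]                        # gamma_1, gamma_2

U = np.zeros((2, 2), dtype=complex)
V = np.eye(2, dtype=complex)
b, n = 1.0, 1000
h = b / n
for _ in range(n):
    Y = np.stack([U, V])
    f = lambda Y: np.stack([H @ Y[1], -H @ Y[0]])    # U' = HV, V' = -HU
    k1 = f(Y); k2 = f(Y + h/2*k1); k3 = f(Y + h/2*k2); k4 = f(Y + h*k3)
    Y = Y + h/6 * (k1 + 2*k2 + 2*k3 + k4)
    U, V = Y[0], Y[1]

theta = (V + 1j*U) @ np.linalg.inv(V - 1j*U)
# here 2*g2*b < pi, so the principal arguments need no unwinding
args = np.sort(np.angle(np.linalg.eigvals(theta)))
assert np.all(args >= 2*g1*b - 1e-6) and np.all(args <= 2*g2*b + 1e-6)
```

The interval length is kept short enough that no argument exceeds π; for longer intervals the continuous continuation (10.3.12) of the arguments would be needed.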
The following conclusions are almost immediate.
Theorem 10.4.1. Let Q(x), R(x) be Hermitean and continuous, and let R(x) be positive-definite in a ≤ x ≤ b. For the existence of ξ₁, the first right-conjugate point of a with respect to the boundary problem (10.3.1), it is necessary that

∫_a^b γ₂(t) dt ≥ π,  (10.4.6)

and sufficient that

∫_a^b γ₁(t) dt ≥ π.  (10.4.7)
It was shown in connection with Theorem 10.3.3 that under our present assumptions the ωᵣ(x) move positively on the unit circle when at the point +1. Since they all start at the point +1, and right-conjugate points for (10.3.1) are given by ωᵣ(x) = 1 for some r, the existence of ξ₁ is equivalent to one, or more, of the ωᵣ(x) making a complete circuit of the unit circle. If ξ₁ exists, we must therefore have, for some r,

arg ωᵣ(ξ₁) = 2π;  (10.4.8)

in fact, numbering the ωᵣ(x) as in Section 10.3, we have here r = k. It follows from (10.4.5) that

2 ∫_a^{ξ₁} γ₂(t) dt ≥ 2π.

Since R(x) > 0, so that γ₂(x) > 0, we deduce (10.4.6). Assuming (10.4.7), it follows from the first of (10.4.5) that

arg ωᵣ(b) ≥ 2π,  r = 1, …, k,

so that (10.4.8) is soluble for each r. Thus (10.4.7) is sufficient not only for the existence of ξ₁, but actually for the existence of ξₖ.

For further results, which are in some cases more precise than those obtainable by the method just used, we consider the variation of the determinant det θ(x), which is of course equal to the product of the ωᵣ(x), and use the exact expression (10.2.26). By means of this we prove
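In the scalar case with constant coefficients the criterion of Theorem 10.4.1 is sharp, since then γ₁ = γ₂. A minimal sketch (hypothetical data: R = Q = c constant, a = 0), locating ξ₁ by shooting:

```python
import math

def first_conjugate_point(c, b, h=1e-4):
    """First zero in (0, b] of u for u' = c*v, v' = -c*u, u(0) = 0, v(0) = 1
    (explicitly u = sin(c*x)); returns None if there is none."""
    u, v, x = 0.0, 1.0, 0.0
    while x < b:
        pu = u
        k1 = (c*v, -c*u)
        k2 = (c*(v + h/2*k1[1]), -c*(u + h/2*k1[0]))
        k3 = (c*(v + h/2*k2[1]), -c*(u + h/2*k2[0]))
        k4 = (c*(v + h*k3[1]), -c*(u + h*k3[0]))
        u += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        v += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
        x += h
        if pu > 0 and u <= 0:
            return x
    return None

# gamma_1 = gamma_2 = c, so xi_1 exists in (0, b] precisely when c*b >= pi
assert first_conjugate_point(1.0, 3.0) is None        # 3.0 < pi: no xi_1
p = first_conjugate_point(1.0, 3.2)                   # 3.2 > pi: xi_1 = pi
assert p is not None and abs(p - math.pi) < 1e-3
```

With c(b − a) just below π the conjugate point fails to exist, and just above π it appears at x = π/c, which is the boundary case of both (10.4.6) and (10.4.7).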
Theorem 10.4.2. Let the assumptions of Theorem 10.4.1 hold, and let ξₙ exist, a < ξₙ ≤ b. Defining the Hermitean positive-definite or semidefinite matrix |R − Q| by

|R − Q| = √{(R − Q)²},  (10.4.9)

we have the inequality

∫_a^b tr(R + Q + |R − Q|) dt ≥ 2πn,  (10.4.10)

and, if ξₙ₊₁ does not exist in (a, b],

∫_a^b tr(R + Q − |R − Q|) dt < 2π(n + k).  (10.4.11)
If ξₙ exists in (a, b], then as x increases in this interval the ωᵣ(x) must pass through or reach the point +1 between them at least n times; as shown for Theorem 10.3.3, the condition R > 0 ensures that they can only pass through or reach the point +1 in the positive sense. Hence there are non-negative integers nᵣ such that

arg ωᵣ(b) ≥ 2πnᵣ,  Σ₁ᵏ nᵣ = n,  (10.4.12)

and so

Σ₁ᵏ arg ωᵣ(b) ≥ 2πn.  (10.4.13)

Considering now the function arg det θ(x), interpreted as a continuous function of x starting at x = a with the value zero, we have of course

arg det θ(x) = Σ₁ᵏ arg ωᵣ(x),  (10.4.14)

and also, by (10.2.26),

arg det θ(x) = ∫_a^x tr Ω(t) dt,  (10.4.15)

where Ω(t) is given by (10.2.20). Combining this with (10.4.13-14) we have

∫_a^b tr Ω(x) dx ≥ 2πn,  (10.4.16)

and the remainder of the proof of (10.4.10) consists in estimating tr Ω(x) from above. For this purpose we express Ω in terms of R, Q, and θ. Since θ = (V + iU)(V − iU)⁻¹, we have

θ + E = 2V(V − iU)⁻¹,  θ − E = 2iU(V − iU)⁻¹,  (10.4.17-18)

whence

(V* + iU*)⁻¹V*RV(V − iU)⁻¹ = ¼(θ + E)*R(θ + E),
(V* + iU*)⁻¹U*QU(V − iU)⁻¹ = ¼(θ − E)*Q(θ − E).
On taking twice the sum of these, (10.2.20) becomes

Ω = ½{R + Q + (R − Q)θ + θ*(R − Q) + θ*(R + Q)θ}.  (10.4.19)

Taking the trace, we note first that

tr{θ*(R + Q)θ} = tr{(R + Q)θθ*} = tr(R + Q),

since θ is unitary. Secondly, we note that

|tr{(R − Q)θ}| ≤ tr |R − Q|,  (10.4.20)

where, in fact,

tr |R − Q| = sum of absolute values of eigenvalues of R − Q.  (10.4.21)

To verify (10.4.20), write θ = θ₁θ₂θ₁*, where θ₁ and θ₂ are unitary and θ₁ is chosen so that θ₁*(R − Q)θ₁ is in diagonal form; the maximum of tr{(R − Q)θ} remains unaffected if θ has this form and θ₂ ranges over all unitary matrices. Now

tr{(R − Q)θ₁θ₂θ₁*} = tr{θ₁*(R − Q)θ₁θ₂}.  (10.4.22)

Since θ₁*(R − Q)θ₁ is diagonal, with as diagonal entries the eigenvalues of R − Q, the right of (10.4.22) is the sum of products of the eigenvalues of R − Q and the corresponding diagonal entries of θ₂. Since the latter are of absolute value not exceeding 1, the right of (10.4.22) admits the (precise) bound (10.4.21), which is easily seen to be the same as tr |R − Q| as given by (10.4.9). Hence on taking the trace of (10.4.19) we get

tr Ω ≤ tr(R + Q + |R − Q|)  (10.4.23)

and

tr Ω ≥ tr(R + Q − |R − Q|).  (10.4.24)

Inserting the upper bound (10.4.23) on the left of (10.4.16) we obtain one of the desired results, (10.4.10). For the other, supposing that ξₙ exists in (a, b] but not ξₙ₊₁, we have

arg ωᵣ(b) < 2π(nᵣ + 1),

where as before Σ₁ᵏ nᵣ = n, whence

arg det θ(b) = Σ₁ᵏ arg ωᵣ(b) < 2π(n + k).

Using (10.4.15) and (10.4.24) as previously we get (10.4.11).
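The trace bound (10.4.20)-(10.4.21) is easily tested numerically: for any unitary θ the trace of (R − Q)θ is dominated by the sum of the absolute eigenvalues, and the bound is attained at the unitary built from the sign pattern of R − Q in its own eigenbasis. A sketch with hypothetical random data:

```python
import numpy as np

rng = np.random.default_rng(11)
k = 4
A = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
H = (A + A.conj().T) / 2                     # plays the role of R - Q
Uq, _ = np.linalg.qr(rng.standard_normal((k, k))
                     + 1j * rng.standard_normal((k, k)))   # a random unitary

bound = np.abs(np.linalg.eigvalsh(H)).sum()  # tr|R - Q|, by (10.4.21)
assert abs(np.trace(H @ Uq)) <= bound + 1e-9             # (10.4.20)

# equality: diagonalize H = S diag(w) S* and take theta = S diag(sign w) S*
w, S = np.linalg.eigh(H)
theta_best = S @ np.diag(np.sign(w)) @ S.conj().T
assert abs(abs(np.trace(H @ theta_best)) - bound) < 1e-9
```

The extremal unitary here is exactly the choice of θ₂ in the proof with diagonal entries of modulus one aligned with the signs of the eigenvalues of R − Q.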
10.5. Boundary Problems with a Parameter

In this section we consider the matrix Sturm-Liouville equation, reintroducing the scalar parameter,

u′ = Rv,  v′ = −(λP + Q)u,  a ≤ x ≤ b,  (10.5.1)

where P, Q, R are Hermitean k-by-k matrices which are continuous functions of x, and u, v are k-by-1 column matrices. The problems to be considered are in a way dual to those of Sections 10.3-4. We now impose boundary conditions in which the independent variable is fixed and λ is allowed to vary, the primary cases being those in which we seek nontrivial solutions such that

u(a) = u(b) = 0,  (10.5.2)

or such that

u(a) = v(b) = 0,  (10.5.3)

which are special cases of

u(a) = 0,  u(b) cos ½α = v(b) sin ½α.  (10.5.4)
Once more, we construct a unitary matrix θ(x, λ) and consider the behavior of its eigenvalues. Defining U(x, λ), V(x, λ) by (10.2.27-28), we take θ(x, λ) = (V + iU)(V − iU)⁻¹ and denote its eigenvalues by ωᵣ(x, λ), r = 1, …, k. Restricting λ to be real, the ωᵣ(x, λ) will lie on the unit circle, and their arguments may be taken to be continuous and uniquely defined functions of x and λ, subject to

arg ωᵣ(a, λ) = 0,  r = 1, …, k,  (10.5.5)

arg ω₁(x, λ) ≤ arg ω₂(x, λ) ≤ ⋯ ≤ arg ω_k(x, λ) < arg ω₁(x, λ) + 2π.  (10.5.6)

As shown in Section 10.3, the boundary problems (10.5.2-4) correspond to roots of the equations ωᵣ(b, λ) = 1, −1, exp(iα). For separation theorems we rely on a monotonic behavior of the functions arg ωᵣ(b, λ). Referring to (10.2.31-32), we see that if Ω†(b, λ) > 0, that is to say, if

∫_a^b U*(t, λ) P(t) U(t, λ) dt > 0,  (10.5.7)

then (see Appendix V) the arg ωᵣ(b, λ) are monotonic increasing in λ.
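The correspondence between eigenvalues and roots in λ can be sketched by shooting on the simplest scalar instance of (10.2.27)-(10.2.28). With the hypothetical choice k = 1, R = P = 1, Q = 0 on (0, π), the problem (10.5.2) is u″ = −λu, u(0) = u(π) = 0, whose eigenvalues are λ = n²; they appear as the roots of det U(b, λ) = 0:

```python
import math

def u_at_b(lam, b=math.pi, n=1000):
    """U(b, lam) for U' = V, V' = -lam*U, U(0) = 0, V(0) = 1 (RK4)."""
    u, v = 0.0, 1.0
    h = b / n
    for _ in range(n):
        k1 = (v, -lam*u)
        k2 = (v + h/2*k1[1], -lam*(u + h/2*k1[0]))
        k3 = (v + h/2*k2[1], -lam*(u + h/2*k2[0]))
        k4 = (v + h*k3[1], -lam*(u + h*k3[0]))
        u += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        v += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    return u

# locate roots of U(pi, lam) = 0 by scanning and bisecting
eigs = []
grid = [0.5 + 0.1*i for i in range(100)]        # lam from 0.5 to 10.4
for l0, l1 in zip(grid, grid[1:]):
    if u_at_b(l0) * u_at_b(l1) < 0:
        lo, hi = l0, l1
        for _ in range(40):
            mid = (lo + hi) / 2
            if u_at_b(lo) * u_at_b(mid) <= 0:
                hi = mid
            else:
                lo = mid
        eigs.append((lo + hi) / 2)
assert [round(e) for e in eigs] == [1, 4, 9]
```

Each root of det U(b, λ) is a value of λ at which some arg ωᵣ(b, λ) crosses a multiple of 2π; the monotonicity in λ just established is what makes the scan-and-bisect search reliable.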
The condition (10.5.7) is of the type of the definiteness conditions (8.2.1) or (9.1.6); it may be expressed by the requirement that

∫_a^b u*(t) P(t) u(t) dt > 0  (10.5.8)

for any nontrivial solution of (10.5.1) such that u(a) = 0, since u = U(x, λ)v(a), v = V(x, λ)v(a) gives the general such solution. To take a simple case, let us assume that

P(x) > 0,  a ≤ x ≤ b,  (10.5.9)

and that

R⁻¹(x) exists, for some x in a ≤ x ≤ b.  (10.5.10)

Supposing if possible that equality holds in (10.5.8), it follows from the continuity of the integrand that u*(t)P(t)u(t) vanishes identically, and so from (10.5.9) that u(t) vanishes identically. From (10.5.1) it then follows that v(t) is constant in [a, b]. Since u′ also vanishes identically, so does Rv, by (10.5.1); by (10.5.10) it follows that v vanishes for some x in [a, b], and so everywhere, since it is constant. Thus the conditions (10.5.9-10) ensure that the functions arg ωᵣ(b, λ) are monotonic increasing in λ. The following separation theorem is now immediate.

Theorem 10.5.1. Let P(x), Q(x), R(x) be continuous and Hermitean, and satisfy (10.5.9-10). Then a closed λ-interval containing k + 1 eigenvalues of the problem (10.5.1-2) contains an eigenvalue of the problem (10.5.1), (10.5.3) in its interior; the statement remains true with the problems interchanged.

Here eigenvalues of these problems are to be counted according to multiplicity, this being the number of linearly independent solutions of (10.5.1-2), or (10.5.1), (10.5.3). Suppose that the interval [λ′, λ″] contains k + 1 eigenvalues of (10.5.1-2). Then, as λ increases in this interval, at least one of the ωᵣ(b, λ) must attain the value +1 more than once, and so at least one of the arg ωᵣ(b, λ) must increase from one multiple of 2π to the next greater multiple of 2π, and must therefore in between equal an odd multiple of π. This proves one statement of the theorem, the other being proved similarly.

It is a standard circumstance in Sturm-Liouville theory that the
eigenvalues can only accumulate at +∞. We now give additional conditions which ensure this in the matrix case.
Theorem 10.5.2. Let P(x), Q(x), and R(x) be continuous and Hermitean in [a, b], P(x) and R(x) being also positive-definite. Then the boundary problem (10.5.1), (10.5.4) has at most a finite number of negative eigenvalues.

The eigenvalues will be the roots of

arg ω_r(b, λ) = α + 2nπ,  (10.5.11)

for r = 1, ..., k and any integer n. Taking it that 0 ≤ α < 2π, we show that only n ≥ 0 are admissible. It was shown in the proof of Theorem 10.3.3 that if R > 0, then the ω_r(x) move positively on the unit circle when at +1. In the notation of this section, we have that for any fixed λ, the arg ω_r(x, λ) are increasing functions of x when at a multiple of 2π [cf. Theorem 8.4.3 (ii)]. In view of (10.5.5) we have therefore, for any real λ,

arg ω_r(x, λ) > 0,  a < x ≤ b.  (10.5.12)
Hence in (10.5.11) there will be no solution for n = −1, −2, ... . The assertion of the theorem now follows from the fact that the arg ω_r(b, λ) are monotonic increasing in λ.

It follows that under the conditions of the last theorem the eigenvalues of the boundary problem (10.5.1), (10.5.4) may be indexed from 0 upwards in ascending order, so that

λ₀ ≤ λ₁ ≤ λ₂ ≤ ... .  (10.5.13)
Under the same conditions we may show that for large n, λ_n is bounded by multiples of n².
Theorem 10.5.3. Let P(x), Q(x), and R(x) be continuous and Hermitean in [a, b], with P(x) > 0, R(x) > 0. Denoting the eigenvalues of the problem (10.5.1), (10.5.4) as in (10.5.13), there is a constant n₀ such that, for n > n₀,

2πn < √λ_n ∫_a^b tr{R + P + |R − P|} dx + (3k − 1)π,  (10.5.14)

2πn > √λ_n ∫_a^b tr{R + P − |R − P|} dx − (3k + 3)π,  (10.5.15)

where |R − P| denotes √{(R − P)²}.
We take it that 0 ≤ α < 2π, and denote by n_r the number of solutions of the congruence

arg ω_r(b, λ) = α (mod 2π),  λ ≤ λ_n,

so that Σ_1^k n_r = n + 1, the number of eigenvalues not exceeding λ_n. Since the arg ω_r(b, λ) are positive, by (10.5.12), and monotonic increasing in λ, we have
arg ω_r(b, λ_n) ≥ α + 2(n_r − 1)π,

and so, summing over r,

arg det θ(b, λ_n) = Σ_1^k arg ω_r(b, λ_n) ≥ kα + 2(n + 1 − k)π.  (10.5.16)

Similarly, arg ω_r(b, λ_n) < α + 2n_rπ, and so

arg det θ(b, λ_n) < kα + 2(n + 1)π.  (10.5.17)

In order to estimate θ(b, λ) we take it that λ > 0, and set up an auxiliary matrix

θ† = θ†(x, λ) = (V + iU√λ)(V − iU√λ)⁻¹.  (10.5.18)
If we define U† = U†(x, λ) = √λ U(x, λ), then U† and V will satisfy the differential equations

U†′ = √λ RV,  V′ = −(√λ P + Q/√λ)U†.  (10.5.19)
The matrix θ† given by (10.5.18) stands in the same relation to the system (10.5.19) as does θ to the original system (10.2.27). We assert that

|arg det θ(x, λ) − arg det θ†(x, λ)| < kπ.  (10.5.20)

To see this we set up the relation between the eigenvalues of θ and θ†. Let ω be an eigenvalue of θ, so that for some column matrix w ≠ 0 we have θw = ωw, or, with z = (V − iU)⁻¹w, (V + iU)z = ω(V − iU)z, or (ω − 1)Vz = i(ω + 1)Uz. Hence

(ω − 1)(V ± i√λ U)z = i{(ω + 1) ± √λ(ω − 1)}Uz,
and so

(V + i√λ U)z = [{(ω + 1) + √λ(ω − 1)}/{(ω + 1) − √λ(ω − 1)}](V − i√λ U)z.

This is equivalent to θ†w† = ω†w†, with

ω† = {(ω + 1) + √λ(ω − 1)}/{(ω + 1) − √λ(ω − 1)},  (10.5.21)

where w† = (V − i√λ U)z. Defining as before the eigenvalues ω_r of θ, we may form the eigenvalues ω_r† of θ† according to the transformation (10.5.21). We now observe that for fixed positive λ the transformation (10.5.21) maps the unit circle in the ω-plane continuously into the unit circle in the ω†-plane, mapping the upper and lower halves of the unit circle and the points ±1 into the same entities. We define arg ω_r†(x, λ) initially by arg ω_r†(a, λ) = 0, as we may since θ†(a, λ) = E, and thence by continuous variation in x. We can then assert that

|arg ω_r(x, λ) − arg ω_r†(x, λ)| < π,

since ω_r(x, λ), ω_r†(x, λ) both start at +1 when x = a, and remain as x increases within an angular distance π of each other; as we noted in connection with (10.5.21), ω_r and ω_r† must lie either both in the upper half of the unit circle, or both in the lower half, or both at +1 or at −1. Hence

|Σ_1^k arg ω_r(x, λ) − Σ_1^k arg ω_r†(x, λ)| < kπ,

and this is the same as (10.5.20), since Σ_1^k arg ω_r(x, λ) = arg det θ(x, λ), and likewise Σ_1^k arg ω_r†(x, λ) = arg det θ†(x, λ), in each case by continuity and since the result is true when x = a.

The bounds (10.5.16-17) may then, if n is so large that λ_n > 0, be put in the form of bounds for arg det θ†(b, λ_n). By (10.5.20) and (10.5.16) we have

arg det θ†(b, λ_n) > kα + 2(n + 1)π − 3kπ.  (10.5.22)
Similarly, from (10.5.17),

arg det θ†(b, λ_n) < kα + 2(n + 1)π + kπ.  (10.5.23)
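The circle-mapping property used above is easy to verify numerically. The following sketch assumes (10.5.21) takes the Möbius form ω† = {(ω + 1) + √λ(ω − 1)}/{(ω + 1) − √λ(ω − 1)} (the form consistent with the relation between (V ± i√λU)z and Uz), and checks that for λ > 0 it carries the unit circle into itself, preserves the upper and lower halves, and fixes the points ±1:

```python
import cmath
import math

def omega_dagger(w, lam):
    # Moebius map taking an eigenvalue of theta to an eigenvalue of theta-dagger
    s = math.sqrt(lam)
    return ((w + 1) + s * (w - 1)) / ((w + 1) - s * (w - 1))

for lam in (0.5, 1.0, 7.3):
    for phi in (0.3, 1.2, 2.9, -0.7, -2.5):
        w = cmath.exp(1j * phi)
        wd = omega_dagger(w, lam)
        assert abs(abs(wd) - 1.0) < 1e-12        # unit circle -> unit circle
        assert (wd.imag > 0) == (w.imag > 0)     # same half of the circle
    assert abs(omega_dagger(1 + 0j, lam) - 1) < 1e-12    # +1 -> +1
    assert abs(omega_dagger(-1 + 0j, lam) + 1) < 1e-12   # -1 -> -1
```

These are exactly the facts needed for the pairing of arg ω_r with arg ω_r† underlying (10.5.20).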
We now use, with due alteration, the results (10.4.15) and (10.4.23-24). Since θ†(x, λ) is formed from the system (10.5.19) instead of (10.2.27),
we replace R and Q in (10.4.23-24) by √λ R and √λ P + Q/√λ, respectively. Using (10.4.15), adapted to the present case, we have therefore, for any λ > 0,

arg det θ†(b, λ) ≤ ∫_a^b tr{√λ(R + P) + Q/√λ + |√λ(R − P) − Q/√λ|} dt  (10.5.24)

and

arg det θ†(b, λ) ≥ ∫_a^b tr{√λ(R + P) + Q/√λ − |√λ(R − P) − Q/√λ|} dt.  (10.5.25)

Taking it that λ > 1, these imply that

arg det θ†(b, λ) ≤ √λ ∫_a^b tr{R + P + |R − P|} dt + O(1/√λ)  (10.5.26)

and

arg det θ†(b, λ) ≥ √λ ∫_a^b tr{R + P − |R − P|} dt + O(1/√λ).  (10.5.27)
Taking for example (10.5.24), it is obvious that the first term Q/√λ on the right contributes only O(1/√λ) to the integral; in the case of the second term, it will be sufficient to show that

tr |R − P − Q/λ| − tr |R − P| = O(1/λ).

This is so since the eigenvalues of R − P − Q/λ and those of R − P differ by O(1/λ), and these two traces are the sums of the absolute values of these eigenvalues, as noted in (10.4.21). In particular, we have from (10.5.27) that

arg det θ†(b, λ) → ∞ as λ → ∞.  (10.5.28)
(10.5.28)
This depends on the observation that the integrand in (10.5.27) is positive. T o see this let cl, ..., f k be a set of normalized eigenvectors of R - P,so that, by (10.4.21), tr I R
Since R
-
PI
> 0, P > 0, we have
2 I 5:(R - P ) 5, I. k
=
1'
ζ_r*(R + P)ζ_r > |ζ_r*(R − P)ζ_r|,  r = 1, ..., k,

and so

tr{R + P − |R − P|} = Σ_1^k ζ_r*(R + P)ζ_r − Σ_1^k |ζ_r*(R − P)ζ_r| > 0,

as was to be proved. Hence (10.5.28) holds and so, by (10.5.20), arg det θ(b, λ) → ∞ as λ → ∞. This shows incidentally that λ_n exists for arbitrarily large n > 0; that λ_n → ∞ as n → ∞ is evident from (10.5.16). Combining now the bounds (10.5.22-23) and (10.5.26-27) we have, for n so large that λ_n > 1,

kα + 2(n + 1)π − 3kπ < √λ_n ∫_a^b tr{R + P + |R − P|} dt + O(1/√λ_n),  (10.5.29)

kα + 2(n + 1)π + kπ > √λ_n ∫_a^b tr{R + P − |R − P|} dt + O(1/√λ_n).  (10.5.30)

For large n, these imply the bounds (10.5.14-15) which are the assertion of Theorem 10.5.3, since 0 ≤ α < 2π, and the terms O(1/√λ_n) will be less than π in absolute value if n is large.
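As a sanity check on the order of magnitude in (10.5.14-15), consider the scalar case k = 1 with R = P = 1, Q = 0 on [0, 1], where the system reduces to u″ + λu = 0. Assuming, purely for illustration, that the boundary conditions (10.5.4) are the Dirichlet conditions u(0) = u(1) = 0 (an assumption: (10.5.4) is not reproduced in this section), the eigenvalues are λ_n = ((n + 1)π)², indexed from 0, and the two-sided bound is immediate to verify:

```python
import math

a, b, k = 0.0, 1.0, 1
# tr{R + P + |R - P|} = tr{R + P - |R - P|} = 2 when R = P = 1, Q = 0
integral = 2.0 * (b - a)

for n in range(200):
    lam_n = ((n + 1) * math.pi / (b - a)) ** 2   # Dirichlet eigenvalues (assumed)
    s = math.sqrt(lam_n)
    assert 2 * math.pi * n < s * integral + (3 * k - 1) * math.pi   # (10.5.14)
    assert 2 * math.pi * n > s * integral - (3 * k + 3) * math.pi   # (10.5.15)
```

Here √λ_n ∫ tr{...} dx = 2(n + 1)π, so both inequalities hold with room to spare, consistent with λ_n being of order exactly n².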
10.6. A Fourth-Order Scalar Equation

The fourth-order equation

(y″/r)″ − (qy′)′ − py = 0,  a ≤ x ≤ b,  (10.6.1)
where y, p, q, and r denote real-valued scalar functions, exhibits a wealth of oscillatory phenomena which goes beyond any parallel in the ordinary Sturm-Liouville equation or its matrix analog, and may properly be handled either as it stands or within a more general framework than that provided by matrix Sturm-Liouville theory. Nevertheless, it is possible to write (10.6.1) in matrix Sturm-Liouville form, and to derive a certain group of results by arguments similar to those of Sections 10.2-5. To rewrite (10.6.1) in Sturm-Liouville form we define

u₁ = y,  u₂ = y″/r,  v₁ = −(y″/r)′ + qy′,  v₂ = −y′.  (10.6.2)

Then

u₁′ = −v₂,  v₂′ = −ru₂,  u₂′ = −v₁ − qv₂,  v₁′ = −pu₁,
by (10.6.1). This has the form u′ = Rv, v′ = −Qu with

R = |  0  −1 |      Q = | p  0 |      u = | u₁ |      v = | v₁ |
    | −1  −q |,         | 0  r |,         | u₂ |,         | v₂ |.   (10.6.3)
Here R, Q are Hermitean, indeed real and symmetric. We shall assume that

p > 0,  q ≥ 0,  r > 0,  (10.6.4)

and for simplicity that they are continuous in [a, b], though Lebesgue integrability would generally suffice; it would also be possible to allow p, or r in the matrix form, to vanish over intervals. The conditions (10.6.4) ensure that Q is positive-definite, though not R. The form u′ = Rv, v′ = −Qu for (10.6.1) lends itself most naturally to the discussion of such boundary problems as

u(a) = u(x) = 0,  (10.6.5)

u(a) = v(x) = 0;  (10.6.6)

there being no disposable parameter λ, there will be at most a discrete set of x-values for which these problems admit a nontrivial solution, the right-conjugate points of a relative to the boundary conditions in question. In terms of the original differential equation (10.6.1), these boundary problems are, respectively,

y(a) = y″(a) = 0,  y = y″ = 0,  (10.6.5′)

y(a) = y″(a) = 0,  (y″/r)′ = y′ = 0,  (10.6.6′)
the second pair of conditions to hold in each case at some x. As before, we denote the solutions, if any, of (10.6.5-6) in (a, b], in ascending order, by ξ₁, ξ₂, ..., and η₁, η₂, ..., respectively.

The method of Sections 10.2-3 calls in this case for 2-by-2 matrices U, V such that U′ = RV, V′ = −QU, U(a) = 0, V(a) = E; we define, when possible, the Hermitean or, in this case, real-symmetric matrix Z = UV⁻¹, and the unitary matrix θ = (V + iU)(V − iU)⁻¹. The eigenvalues ω₁(x), ω₂(x) of θ will be such that their arguments are continuous, with arg ω₁(a) = arg ω₂(a) = 0, and

arg ω₁(x) ≤ arg ω₂(x) ≤ arg ω₁(x) + 2π.

Since Q, R are not both positive-definite, we cannot assert that the ω_r move positively on the unit circle throughout. Since, however, Q > 0, we assert that they move positively when at the point −1. The proof
is similar to one given in connection with Theorem 10.3.3. Supposing −1 to be an eigenvalue, and so θw = −w for some w ≠ 0, we write z = (V − iU)⁻¹w, w = Vz − iUz, so that (V + iU)z = −(V − iU)z, and so Vz = 0, w = −iUz. Hence, with the notation (10.2.20),

w*Ωw = 2z*(V*RV + U*QU)z = 2z*U*QUz = 2w*Qw > 0.

Hence the ω_r(x) move positively as x increases when at −1. The following separation theorem is an immediate consequence.

Theorem 10.6.1. If for some n ≥ 3 there exists η_n, then ξ_{n−2} exists and a < ξ_{n−2} < η_n.
Theorem 10.6.2. Let p, q, and r be continuous in [a, b] and satisfy (10.6.4). If ξ₁ exists, a < ξ₁ ≤ b, then so does η₁, and a < η₁ < ξ₁.

For the proof we turn to the Riccati equation (10.2.16), where Z = UV⁻¹ is a real symmetric 2-by-2 matrix which will be defined in [a, η₁), if η₁ exists, and otherwise in [a, b]. Writing the entries of Z in the form

Z = | z₁  z₂ |
    | z₂  z₃ |,

we may write out in full the Riccati equation (10.2.16), Z′ = R + ZQZ, as

z₁′ = pz₁² + rz₂²,  z₂′ = (pz₁ + rz₃)z₂ − 1,  z₃′ = pz₂² + rz₃² − q.  (10.6.7-9)
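These component equations, and the determinant identity (10.6.10) which follows them, are routine but error-prone algebra; they can be checked numerically. A sketch with numpy, using random values for p, q, r and for the entries z₁, z₂, z₃ of Z:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, r = rng.uniform(0.5, 2.0, 3)
z1, z2, z3 = rng.uniform(-1.0, 1.0, 3)

R = np.array([[0.0, -1.0], [-1.0, -q]])
Q = np.array([[p, 0.0], [0.0, r]])
Z = np.array([[z1, z2], [z2, z3]])

Zp = R + Z @ Q @ Z                       # matrix Riccati right-hand side
# the component equations (10.6.7-9)
assert np.isclose(Zp[0, 0], p * z1**2 + r * z2**2)
assert np.isclose(Zp[0, 1], (p * z1 + r * z3) * z2 - 1.0)
assert np.isclose(Zp[1, 1], p * z2**2 + r * z3**2 - q)
# the identity (det Z)' = tr(ZQ) det Z + tr(R adj Z), i.e. (10.6.10)
detp = Zp[0, 0] * z3 + z1 * Zp[1, 1] - 2.0 * z2 * Zp[0, 1]
adjZ = np.array([[z3, -z2], [-z2, z1]])
assert np.isclose(detp, np.trace(Z @ Q) * np.linalg.det(Z) + np.trace(R @ adjZ))
```

Since tr(ZQ) = pz₁ + rz₃ and tr(R adj Z) = 2z₂ − qz₁ here, the last assertion is precisely (10.6.10).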
Together with these we need the differential equation satisfied by det Z = z₁z₃ − z₂², which is

(det Z)′ = (pz₁ + rz₃) det Z + 2z₂ − qz₁.  (10.6.10)

This may be deduced from (10.6.7-9) by straightforward calculations, or from the general result that, in matrix terms,

(det Z)′ = tr(ZQ) det Z + tr(R adj Z),

where adj Z, the adjugate of Z, is the matrix of cofactors in Z.

Suppose now that η₁ does not exist in (a, b]; we shall show that ξ₁ does not exist either. This assumption means that V is nonsingular in [a, b], and so Z exists and the differential equations (10.6.7-10) hold there. We have initially Z(a) = 0, and so z₁(a) = z₂(a) = z₃(a) = 0. It follows that if x − a is small and positive, z₁, z₂, and z₃ are all of order x − a. Applying these estimates on the right of (10.6.7-8) and integrating, we get

z₁(x) = O{(x − a)³},  z₂(x) = −(x − a) + O{(x − a)³},  (10.6.11-12)

and so

det Z(x) = −(x − a)² + O{(x − a)⁴},  (10.6.13)
all for small positive x − a. We now consider the sign of some of these functions. From (10.6.7) we see that z₁ ≥ 0 in [a, b]; in fact, since z₂ ≠ 0, by (10.6.12), when x is near to a, this can be sharpened to

z₁ > 0,  a < x ≤ b.  (10.6.14)

Turning to z₂, we see that z₂ < 0 for x near to a, by (10.6.12), and that, by (10.6.8), z₂′ = −1 whenever z₂ = 0; hence

z₂ < 0,  a < x ≤ b.  (10.6.15)

Taking now the case of det Z, we see from (10.6.13) that det Z < 0 for x near to a, while from (10.6.10), whenever det Z = 0, we have (det Z)′ = 2z₂ − qz₁, which is negative for a < x ≤ b, by (10.6.14-15). Hence also

det Z < 0,  a < x ≤ b.  (10.6.16)
It now follows that U is nonsingular in a < x ≤ b, and hence ξ₁ does not exist. This proves the theorem in a slightly weakened form, namely, that if ξ₁ exists, then so does η₁, and a < η₁ ≤ ξ₁. To complete the proof we have to exclude the latter equality sign.

For this purpose we assume that η₁ exists, so that Z exists in a ≤ x < η₁, and consider the eigenvalues of Z in this interval; we denote these eigenvalues by ν₁(x), ν₂(x), with ν₁(x) ≤ ν₂(x). Since θ = (E + iZ)(E − iZ)⁻¹, these are related to the eigenvalues of θ by ω_r = (1 + iν_r)(1 − iν_r)⁻¹. As x tends to η₁ from below, one of the ω_r must tend to −1 along the unit circle, and so one of the ν_r must tend to infinity on the real axis. Similarly, as x tends to ξ₁, one of the ω_r must tend to 1 and one of the ν_r to zero. We shall prove the theorem by showing that as x increases from a towards η₁, neither of the ν_r can tend to zero.

The inequalities (10.6.14-16) are now available for a < x < η₁. In particular, from (10.6.16) and the fact that ν₁ν₂ = det Z it follows that

ν₁(x) < 0 < ν₂(x),  a < x < η₁.  (10.6.17)
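The relation ω_r = (1 + iν_r)(1 − iν_r)⁻¹ is the Cayley transform carrying the real eigenvalues of the symmetric matrix Z to the unit-circle eigenvalues of θ; a quick numerical illustration for a random real symmetric 2-by-2 matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))
Z = (A + A.T) / 2.0                      # random real symmetric Z
E = np.eye(2)

theta = (E + 1j * Z) @ np.linalg.inv(E - 1j * Z)   # theta = (E + iZ)(E - iZ)^{-1}
assert np.allclose(theta.conj().T @ theta, E)      # theta is unitary

nu = np.linalg.eigvalsh(Z)               # real eigenvalues nu_1 <= nu_2 of Z
w = (1 + 1j * nu) / (1 - 1j * nu)        # their Cayley images
assert np.allclose(sorted(np.angle(np.linalg.eigvals(theta))),
                   sorted(np.angle(w)))
```

In particular ν < 0 gives arg ω < 0 (lower half of the circle), ν = 0 gives ω = 1, and ν → ±∞ gives ω → −1, as used in the argument above.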
For further information we resort to the maximum-minimum characterization of the eigenvalues of the symmetric matrix Z, which gives

ν₁, ν₂ = min, max ζ*Zζ,

respectively, taken over real vectors ζ = (ζ₁, ζ₂) such that ζ₁² + ζ₂² = 1; that is to say,

ν₁, ν₂ = min, max {z₁ζ₁² + 2z₂ζ₁ζ₂ + z₃ζ₂²}.  (10.6.18)

In particular, taking ζ₁ = 1, ζ₂ = 0, we see that ν₂ ≥ z₁, and since z₁ is nondecreasing, by (10.6.7), we see that ν₂ cannot tend to zero as x → η₁ − 0. It therefore remains to examine the possibility that ν₁ tends to zero from below as x → η₁ − 0.

Let ζ† = (ζ₁†, ζ₂†) be the normalized vector minimizing (10.6.18), that is to say, the normalized eigenvector of Z associated with ν₁, so that Zζ† = ν₁ζ†. Since z₂ < 0, by (10.6.15), the minimum in (10.6.18) will not be reached when ζ₁, ζ₂ have opposite signs; we may therefore take it that ζ₁†, ζ₂† are non-negative. We now appeal to the formula for the rate of change of the eigenvalues of a varying symmetric matrix (see Appendix V). We have in fact ν₁′ = ζ†*Z′ζ†, and so

ν₁′ = ζ†*Rζ† + ζ†*ZQZζ†.
Since the entries in ζ† are non-negative, and those in R nonpositive, the first term on the right is nonpositive, while since Zζ† = ν₁ζ†, the second term is ν₁²ζ†*Qζ†. Hence

ν₁′ ≤ ν₁²(p + r),

since ζ† is normalized. Hence

(1/ν₁)′ ≥ −(p + r),

and on integrating we see that 1/ν₁ cannot tend to −∞ as x → η₁ − 0, so that ν₁ cannot tend to zero from below. This completes the proof of Theorem 10.6.2.

If we translate these results for ν₁, ν₂ into results on the behavior of ω₁, ω₂, a further separation theorem can be deduced, namely,

Theorem 10.6.3. If η₂ exists, then so does ξ₁, and

η₁ < ξ₁ < η₂.  (10.6.19)
Since as x increases from a the ν₁, ν₂ become negative and positive, respectively, we have that ω₁, ω₂ lie in the lower and upper halves of the unit circle, respectively, excluding the end-points 1, −1. This situation persists until x reaches η₁ from below, when ω₂ reaches the point −1 moving in a positive sense; it was shown in the proof of Theorem 10.6.1 that the ω_r move positively at −1, and this excludes ω₁ from tending to −1 as x → η₁ − 0. When x = η₁, ω₁ is still in the lower half of the unit circle, since we proved that ν₁ does not tend to zero as x → η₁ − 0. Thus, for x in a right-neighborhood of η₁, both ω₁ and ω₂ are in the lower half of the unit circle, and before either of them reaches −1 again, one of them must pass through +1, which proves the result.

10.7. The First-Order Equation
We now set up similar machinery to that of Section 10.2, in order to investigate oscillatory properties of the first-order system (9.1.1). To begin with we omit the scalar parameter λ, and consider the system

Jy′ = B(x)y,  a ≤ x ≤ b,  (10.7.1)
with a view to problems concerning “conjugate points.” For this purpose we set up a boundary problem, or “free boundary problem,” in which the boundary conditions are the same as in Chapter 9, to be
satisfied with the upper end indeterminate. We are to find x such that (10.7.1) has a solution such that

y(a) = Mv,  y(x) = Nv,  (10.7.2)

for some column matrix v ≠ 0. The x-values for which this problem is soluble may be termed "right-conjugate" points of a, with respect to the chosen boundary conditions.
for some column matrix w # 0. T h e x-values for which this problem is soluble may be termed “right-conjugate” points of a, with respect to the chosen boundary conditions. We maintain here the general assumptions of Section 9.1. T h e matrices J , B , M , N are square and of order k, while y is a k-by-1 column matrix; if the system (10.2.2) were put in the form (10.7.1) we should have to replace k by 2k. As before, J is constant, skewHermitean, and nonsingular, while B(x) is Hermitean ; for simplicity, we assume here that B(x) is continuous. T h e boundary matrices M , N are to satisfy M* J M = N* J N , and may have no common null-vectors. We define the ‘‘fundamental solution” Y(x)of (10.7.1) by JY’
= BY,
(10.7.3)
Y ( u )= E ,
and then, in a similar though not identical manner to that of Section 10.2, make the definitions

U = J(YM − N),  V = YM + N,  (10.7.4-5)

Z = UV⁻¹,  θ = (V + iU)(V − iU)⁻¹,  (10.7.6-7)

where Y, U, V, Z, and θ are all functions of x. The function Z bears a certain resemblance to the characteristic function defined in (9.5.1), and a closer one to the function defined in (9.4.17) with b = x; we have arranged that Z should have a zero eigenvalue, rather than an infinite one, when the boundary problem is soluble and V⁻¹ exists, and so have taken the inverse of the function given in (9.4.17). Parallel to Theorem 10.2.1 we have
Theorem 10.7.1. With the above assumptions, Z is Hermitean when it exists, and satisfies the Riccati differential equation

Z′ = ½(Z + J)* J*⁻¹ B J⁻¹ (Z + J).  (10.7.8)

To show that Z is Hermitean it will be sufficient to prove that

V*U − U*V = 0.  (10.7.9)
Calculating the left-hand side, we get from (10.7.4-5)

V*U − U*V = (M*Y* + N*)J(YM − N) − (M*Y* − N*)J*(YM + N).

Since J* = −J this gives

V*U − U*V = 2(M*Y*JYM − N*JN) = 2(M*JM − N*JN),
using (9.1.12), which justifies (10.7.9) since we assume that M*JM = N*JN. Hence Z is Hermitean, if V⁻¹ exists.

Differentiating (10.7.6) we have

Z′ = U′V⁻¹ − UV⁻¹V′V⁻¹ = U′V⁻¹ − ZV′V⁻¹.

Differentiating (10.7.4-5) we have also

U′ = JY′M = BYM,  V′ = Y′M = J⁻¹BYM,

and so

Z′ = BYMV⁻¹ − ZJ⁻¹BYMV⁻¹ = (J − Z)J⁻¹BYMV⁻¹.

However,

Z + J = J(YM − N)(YM + N)⁻¹ + J = 2JYM(YM + N)⁻¹ = 2JYMV⁻¹,

and so the above reduces to

Z′ = ½(J − Z)J⁻¹BJ⁻¹(Z + J) = ½(−J* − Z*)J⁻¹BJ⁻¹(Z + J),

since J* = −J, and this is the same as (10.7.8). The corresponding results for the matrix θ are given by
Theorem 10.7.2. For all x, a ≤ x ≤ b, θ(x) is defined and satisfies the differential equation

θ′ = iθΩ,  (10.7.10)

where

Ω = 4(V* + iU*)⁻¹ M*Y*BYM (V − iU)⁻¹.  (10.7.11)

We have first to show that (V − iU)⁻¹ exists. For (10.2.21) holds, by (10.7.9), and so we have to prove that V*V + U*U > 0. This is true provided that Vz = 0, Uz = 0 have no common solution z ≠ 0; if they had such a solution, then, by (10.7.4-5), Mz = Nz = 0, which is excluded unless z = 0 by our assumptions concerning the boundary conditions. Thus θ exists; that it is unitary follows from (10.2.21) as previously. For (10.7.10) we use (10.2.24), which gives (10.7.10) with

Ω = 2(V* + iU*)⁻¹(V*U′ − U*V′)(V − iU)⁻¹.  (10.7.12)
Here

V*U′ − U*V′ = (M*Y* + N*)JY′M − (M*Y* − N*)J*Y′M
            = (M*Y* + N*)BYM + (M*Y* − N*)BYM = 2M*Y*BYM,

and substituting in (10.7.12) we get (10.7.11), which completes the proof.

Finally, as in Section 10.2, we extend the argument to dependence on λ, in the case of the form

Jy′ = (λA + B)y,

where A is Hermitean, non-negative, and Lebesgue integrable, or perhaps continuous. Defining Y = Y(x, λ) by

JY′ = (λA + B)Y,  Y(a, λ) = E,

the functions U, V, Z, and θ defined in (10.7.4-7) become functions of λ as well as x. In particular we have
θ_λ = iθΩ†,  (10.7.13)

where

Ω† = 2(V* + iU*)⁻¹(V*U_λ − U*V_λ)(V − iU)⁻¹,  (10.7.14)

the suffix λ indicating partial differentiation with respect to λ. As before, we have

V*U_λ − U*V_λ = (M*Y* + N*)JY_λM − (M*Y* − N*)J*Y_λM = 2M*Y*JY_λM.  (10.7.15)
To evaluate Y*JY_λ, we differentiate the differential equation for Y with respect to λ, getting

JY_λ′ = AY + (λA + B)Y_λ,  Y_λ(a, λ) = 0.

Hence, for real λ,

(Y*JY_λ)′ = −(JY′)*Y_λ + Y*(JY_λ′)
          = −{(λA + B)Y}*Y_λ + Y*{AY + (λA + B)Y_λ} = Y*AY.

Since Y_λ = 0 when x = a, we deduce that

Y*(x, λ) J Y_λ(x, λ) = ∫_a^x Y*(t, λ) A(t) Y(t, λ) dt.
Substituting in (10.7.14-15) we obtain

Ω† = 4(V* + iU*)⁻¹ M* {∫_a^x Y*(t, λ) A(t) Y(t, λ) dt} M (V − iU)⁻¹.  (10.7.16)

Since A ≥ 0, we deduce that

Ω† ≥ 0.  (10.7.17)
10.8. Conjugate Point Problems

Our approach to the boundary problem (10.7.1-2), where x is to be found, is based on a study of the eigenvalues of the unitary matrix θ(x) defined in (10.7.7). As previously, these eigenvalues may be taken to be k continuous functions ω₁(x), ..., ω_k(x), their arguments being also continuous and subject to

arg ω₁(x) ≤ ... ≤ arg ω_k(x) ≤ arg ω₁(x) + 2π;  (10.8.1)

they will be fixed uniquely if we fix their initial values at x = a subject to (10.8.1). We have initially

θ(a) = {(M + N) + iJ(M − N)}{(M + N) − iJ(M − N)}⁻¹,  (10.8.2)

and the initial values of the arg ω_r(x) will of course depend on M and N; they will, for example, all start at zero in the case of the periodic boundary conditions M = N = E.
and the initial values of the arg wr(x) will of course depend on M and N ; they will, for example, all start at zero in the case of the periodic boundary conditions M = N = E. We first take up the case B > 0, which is similar to that of (10.3.1) when Q > 0, R > 0, as in Theorem 10.3.1. We show again that the eigenvalues of O(x) move positively, this being the source of separation theorems. Theorem 10.8.1. Let B(x) be positive-definite, Hermitean, and continuous in a x b. Then the functions arg u , ( x ) are strictly increasing in x. It follows from (10.7.11) that 52 3 0 if B > 0, and so under the assumptions of the theorem the arg w,(x) are at any rate nondecreasing. T o prove that they are actually increasing functions, we need the property that if w is an eigenvector of 8, then w*Qw > 0. For this we shall express 52 in terms of 8, B, and J . From (10.7.4-5) we have
< <
YM
+V) = $ { ( i J ) - l+ ( ~iu) - ( i j ) - l ( ~ iU))+ * ( v+ iU + V - iU).
= -&(J-'V
10.8. Hence
333
CONJUGATE POINT PROBLEMS
YM(V - iu)-l =
(ij)-1(0 - E )
+ 4 (e + E ) ,
and so (10.7.11) becomes
n = t{(ij)-ye - E ) + e + E)* ~ { ( i j ) - y e- E ) + e + E ) .
(10.8.3)
Multiplying on the left and right by w*, w, where the column matrix w ≠ 0, and recalling that B > 0, we obtain a positive result provided that

{(iJ)⁻¹(θ − E) + θ + E}w ≠ 0,  (10.8.4)

or, what is the same thing,

{J(θ + E) − i(θ − E)}w ≠ 0.  (10.8.5)

Supposing that w is an eigenvector of θ, so that θw = e^{iα}w, the left of (10.8.5) may be written

{(e^{iα} + 1)J − i(e^{iα} − 1)E}w,

and so the required result (10.8.4) is equivalent to

{cos ½α J + sin ½α E}w ≠ 0.  (10.8.6)

We have however, since J* = −J,

(cos ½α J + sin ½α E)*(cos ½α J + sin ½α E) = cos² ½α J*J + sin² ½α E,  (10.8.7)

and since J is nonsingular, the right-hand side is positive-definite for all real α. Hence the matrix on the left of (10.8.6) is nonsingular, which proves the result.
and since J is nonsingular, the right-hand side is positive-definite for all real a. Hence the matrix on the left of (10.8.6) is nonsingular, which proves the result. Next we relate the eigenvalues of 8 to the solubility of the boundary problem (10.7.2), or a related problem. Suppose that for some x and some column matrix w # 0 we have O(x) w = exp (ia)w. Writing as before z = ( V - iU)-'w, we have then (V
+ iU)z = eaa(V
-
iU)z
and so Vzsin+a = Uzcos+a,
where, since w = V z - iUz, Vx and Ux are not both zero. Substituting from (10.7.4-5) we get (YM+N)zsin+a = j ( Y M - ~ ~ ) z c o s + a ,
334
10. 'MATRIX
OSCILLATION THEORY
or ( J cos * a - E sin * a ) Y M z
= ( J cos * a
+ E sin *a) Nz.
Here we note that (Jcos + a - E sin *a)-' exists; this follows from (10.8.7) with a replaced by -a. Hence finally
9 + E sin 8a)Nz.
Y ( x )M z = ( J cos * a - E sin Q a)-l ( J cos a
(10.8.8)
We write this as

Y(x)Mz = N(α)z,  (10.8.9)

where

N(α) = (J cos ½α − E sin ½α)⁻¹(J cos ½α + E sin ½α)N.  (10.8.10)

If now we consider the solution of Jy′ = By such that y(a) = Mz, we shall have y(x) = Y(x)Mz = N(α)z for the x in question. Furthermore, we must have Mz ≠ 0, for otherwise N(α)z = 0, whence Nz = 0 and so Uz = Vz = 0. We therefore have a nontrivial solution of the boundary problem

y(a) = Mv,  y(x) = N(α)v,  Jy′ = By.  (10.8.11)
Conversely, given such a solution, the reasoning may be reversed, leading to the conclusion that exp(iα) is an eigenvalue of θ(x). We are thus led to a family of boundary problems (10.8.11), where α may range over [0, 2π). In particular, α = 0 yields the original problem y(a) = Mv, y(x) = Nv, while α = π yields y(a) = Mv, y(x) = −Nv; these two problems may, or may not, be distinct.

The formulas become simpler if we assume that

J*J = JJ* = E.  (10.8.12)

This occurs in particular if J is the canonical symplectic matrix

J = | 0  −E |
    | E   0 |,

k being even and the blocks of order ½k, or if J is diagonal with diagonal entries ±i; the general case may be reduced to this by transformations similar to those of Section 3.2. Then (10.8.7) may be generalized to

(E cos α₁ + J⁻¹ sin α₁)(E cos α₂ + J⁻¹ sin α₂) = E cos(α₁ + α₂) + J⁻¹ sin(α₁ + α₂).  (10.8.13)
Hence

N(α) = (E cos ½α − J⁻¹ sin ½α)⁻¹(E cos ½α + J⁻¹ sin ½α)N = (E cos α + J⁻¹ sin α)N.  (10.8.14)
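The reduction from (10.8.10) to (10.8.14) can be confirmed numerically. A sketch with the canonical J of order 2 (for which J*J = E and J² = −E) and a randomly chosen, purely illustrative boundary matrix N:

```python
import numpy as np

J = np.array([[0.0, -1.0], [1.0, 0.0]])   # canonical case: J*J = E, J^2 = -E
E = np.eye(2)
Jinv = np.linalg.inv(J)
rng = np.random.default_rng(2)
N = rng.standard_normal((2, 2))           # hypothetical boundary matrix

for alpha in (0.4, 1.9, 3.5, 5.8):
    c, s = np.cos(alpha / 2), np.sin(alpha / 2)
    lhs = np.linalg.inv(J * c - E * s) @ (J * c + E * s) @ N    # (10.8.10)
    rhs = (E * np.cos(alpha) + Jinv * np.sin(alpha)) @ N        # (10.8.14)
    assert np.allclose(lhs, rhs)
    half = E * c + Jinv * s
    assert np.allclose(half.T @ half, E)  # the factor in (10.8.16) is orthogonal
```

The last assertion is the unitarity of E cos ½α + J⁻¹ sin ½α noted below in connection with (10.8.16), which in the real canonical case reduces to orthogonality.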
In a similar way to Theorem 10.3.2 we have then

Theorem 10.8.2. Let B(x) be positive-definite, Hermitean, and continuous in a ≤ x ≤ b. Then a closed x-interval containing k + 1 solutions for x of the boundary problem (10.7.1-2) will contain at least one solution in its interior of the boundary problem (10.8.11).

For in the closed interval in question there must be at least k + 1 values of x for which one of the ω_r(x) = 1. Since they move positively for all x, at least one of the ω_r(x) must make a complete circuit of the unit circle, and so pass through all values exp(iα), which proves the result. More generally, it is easily shown that if a closed interval contains n > k solutions, that is, conjugate points, of the problem (10.7.1-2), then it contains at least n − k conjugate points of a according to (10.8.11).

For further deductions let us make the simplifying assumption that J*J = E. The rate of change of a simple eigenvalue ω_r(x) of θ(x) is given by

(d/dx) arg ω_r(x) = w*Ωw,

where w is a normalized eigenvector of θ corresponding to the eigenvalue ω_r. Taking ω_r = exp(iα), and writing z = (V − iU)⁻¹w, the form (10.7.11) for Ω gives here

(d/dx) arg ω_r(x) = 4z*M*Y*BYMz.

Using (10.8.8), and the simplification available from (10.8.13) if J*J = E, we derive

(d/dx) arg ω_r(x) = 4z*N*(E cos α + J⁻¹ sin α)*B(E cos α + J⁻¹ sin α)Nz.  (10.8.15)

If there is an α such that the right-hand side is positive for all z with Nz ≠ 0, for example, if z*N*BNz > 0 when Nz ≠ 0, it can be asserted that the ω_r(x) move positively on the unit circle when at exp(iα); the argument is still valid when ω_r is a multiple eigenvalue, and (10.8.15) will still be true as regards sign. By this means, separation theorems can be set up for the case when B is not positive-definite but satisfies some weaker condition; essentially this situation was considered in a special case in Theorem 10.3.3.
As in Section 10.4, the phase differential equation (10.7.10) may be employed in a quantitative sense. Using the form (10.8.3) for Ω, we obtain in place of (10.8.15) the result, after slight simplification,

(d/dx) arg ω_r(x) = w*(E cos ½α + J⁻¹ sin ½α)* B (E cos ½α + J⁻¹ sin ½α)w.  (10.8.16)

With the assumption that J*J = E, the factor

E cos ½α + J⁻¹ sin ½α

will be unitary, and the right of (10.8.16) will lie between the greatest and the least of the eigenvalues of B. By this means we obtain bounds for the rate of change of arg ω_r(x), which are valid also when ω_r(x) is a multiple eigenvalue, and hence bounds for the intervals between conjugate points.
10.9. First-Order Equation with Parameter

We indicate here some reasoning parallel to that of Section 10.5, and relating to the eigenvalues of the boundary problem of Chapter 9, namely,

Jy′ = (λA + B)y,  y(a) = Mv,  y(b) = Nv.  (10.9.1)
Once more we consider separation theorems, for eigenvalues for varying boundary conditions, and bounds for eigenvalues or their order of magnitude. As to varying boundary conditions, we may consider (10.9.1) as a particular case of a family of boundary problems with N replaced by N(α), as given by (10.8.10). With the simplifying assumption J*J = E, the problems are given by

Jy′ = (λA + B)y,  y(a) = Mv,  y(b) = (E cos α + J⁻¹ sin α)Nv,  (10.9.2)

for any real α. For example, taking M = N = E, and α = 0, π, a pair of boundary conditions which are comparable for our present purpose are the periodic conditions y(a) = y(b) and the antiperiodic conditions y(a) = −y(b), and a separation theorem concerning the associated sets of eigenvalues λ will hold under certain conditions. We have
Theorem 10.9.1. Let the assumptions of Sections 9.1-2 hold, and let also J*J = E. Then in a closed λ-interval containing k + 1 eigenvalues of the problem (10.9.1) there lies at least one eigenvalue of a problem of the form (10.9.2).

We define the eigenvalues ω_r(x, λ) of θ(x, λ), to be fixed at x = a subject to (10.8.1), and to be continued thence by continuity and so as to satisfy (10.8.1). Considering the ω_r(b, λ) as functions of λ, we have from (10.7.13) and (10.7.16) that the ω_r(b, λ) move positively on the unit circle with increasing real λ. Here we rely on the definiteness condition (9.1.6), showing that the right of (10.7.16) is positive-definite, and not merely semidefinite. By the familiar argument, if in some closed λ-interval the ω_r(b, λ) assume the value +1 altogether at least k + 1 times, then one of them, at least, must make a complete circuit of the unit circle, and so take all other values on the unit circle, yielding a solution of (10.9.2). More generally, if in this closed λ-interval there are n > k eigenvalues of (10.9.1), there are in the interior at least n − k eigenvalues of any other problem (10.9.2).

Turning to bounds for the eigenvalues, we may obtain some information in a simple manner from (10.8.16). Replacing B by λA + B, where λ is real, and assuming that J*J = E, we have
(d/dx) arg ω_r(x, λ) = w*(E cos ½α + J⁻¹ sin ½α)*(λA + B)(E cos ½α + J⁻¹ sin ½α)w;

if ω_r is a multiple eigenvalue, this holds in the sense that the left lies between the greatest and least possible values of the right for all w with w*w = 1. If we write min(λA + B), max(λA + B) for the least and the greatest eigenvalues of λA + B, it follows that

min(λA + B) ≤ (d/dx) arg ω_r(x, λ) ≤ max(λA + B),

and so, on integration over (a, b), that

∫_a^b min(λA + B) dx ≤ arg ω_r(b, λ) − arg ω_r(a, λ) ≤ ∫_a^b max(λA + B) dx.  (10.9.3)

Here the arg ω_r(a, λ) are independent of λ. Let the eigenvalues of the problem (10.9.1) be now numbered in increasing order, not necessarily in order of absolute value, and so such that

... ≤ λ₋₁ ≤ λ₀ ≤ λ₁ ≤ λ₂ ≤ ... .
Here the series may tend to infinity in both directions, as, for example, for (10.1.2-3), though the Sturm-Liouville case shows that this need not be so. Whether the spectrum contains arbitrarily large values of both signs is partly clarified by (10.9.3). Denoting by min(A), max(A) the least and greatest eigenvalues of A, it follows from (10.9.3) that if

∫_a^b min(A) dx > 0,  (10.9.5)

then arg ω_r(b, λ) → ±∞ with λ, and so passes through multiples of 2π for arbitrarily large λ of both signs; these λ-values give eigenvalues of (10.9.1). Assuming (10.9.5) to hold, and noting that the arg ω_r(b, λ) are monotonic and stay within 2π of each other, we have that arg ω_r(b, λ_n) will differ by at most a bounded quantity from 2nπ/k. Using the first of (10.9.3) and taking n > 0, we get a bound of the form

λ_n ∫_a^b min(A) dx ≤ 2nπ/k + const.,

with a similar bound in the opposite sense if n < 0. Thus if (10.9.5) holds, the spectrum extends to infinity in both directions, and λ_n has for large n the same sign as n and is of order of magnitude at most n. Without assuming (10.9.5), but retaining the assumption that A is integrable over (a, b), we may assert that λ_n is of order at least n; in the Sturm-Liouville case, for example, it is of order n² [cf. (10.5.14-15)]. Taking n > 0, it follows from the second of (10.9.3) that

2nπ/k ≤ λ_n ∫_a^b max(A) dx + const.,

with a similar inequality for n < 0; our present assumptions do not ensure the existence of an infinity of eigenvalues of either sign. In the case (10.9.5) we may now assert that λ_n is of order precisely n. These results are of course sharper than the statement (9.2.4). They become fairly precise in the trivial case in which A is a multiple of E.
CHAPTER 11
From Differential to Integral Equations
11.1. The Sturm-Liouville Case

In the classical investigation of boundary problems for the scalar second-order differential equation

y″ + (λp + q)y = 0,  a ≤ x ≤ b,  (11.1.1)

we commonly assume that the coefficients p, q are continuous, or at least Lebesgue integrable. Since this form does not cover the case of a second-order difference equation, the topic of Chapters 4-5, we adopted in Chapter 8 the device of extending (11.1.1) to a system y′ = rv, v′ = −(λp + q)y, the coefficients p, q, and r being piecewise continuous, or at any rate integrable. This procedure still leaves a slight area uncovered, and we outline here another approach, in which we abandon the formalism of the differential equation. Taking one-point boundary conditions

y(a) cos α − y′(a) sin α = 0,  y(b) cos β − y′(b) sin β = 0,  (11.1.2-3)
we concentrate attention on the solution y(x, A) of (1 1.1.1) such that y(a, A)
= sin a,
y'(a, A) = cos a,
(11.1.4-5)
so that (11.1.2) is automatically satisfied. For this solution we derive an integral equation, of Volterra type, by integrating (1 1.1.1) twice over (a,x) and using (1 1.1.4-5). The first integration gives, using (1 1.1.5), y y x , 4 = cos a -
{Ap(t) 339
+ q ( t ) } y ( tA), dt.
(1 1.1.6)
Integrating once more and using (11.1.4) we derive

    y(x, λ) = sin α + (x − a) cos α − ∫_a^x ds ∫_a^s {λp(t) + q(t)} y(t, λ) dt
            = sin α + (x − a) cos α − ∫_a^x (x − t) {λp(t) + q(t)} y(t, λ) dt.    (11.1.7)

Let us now write

    σ₀(x) = ∫_a^x p(t) dt,    σ₁(x) = ∫_a^x q(t) dt.    (11.1.8-9)

The integral equations (11.1.6), (11.1.7) may then be written

    y'(x, λ) = cos α − ∫_a^x y(t, λ) d{λσ₀(t) + σ₁(t)},    (11.1.10)

and

    y(x, λ) = sin α + (x − a) cos α − ∫_a^x (x − t) y(t, λ) d{λσ₀(t) + σ₁(t)}.    (11.1.11)

The differential equation has thus been replaced by an integro-differential equation, or an integral equation of Volterra type, in which the coefficients of the original differential equation appear only by way of their integrals. We now remark that (11.1.11) remains intelligible on the basis that y is to be continuous in x, and that σ₀(x), σ₁(x) are of bounded variation over a ≤ x ≤ b. In some ways this forms the most natural and general framework for problems of Sturm-Liouville type; we mentioned in Section 0.8 the case of the vibrating string, in which σ₁(x) ≡ 0 and σ₀(x) is the mass of the segment (a, x] of the string. Assuming that σ₀, σ₁ are also right-continuous, we may derive (11.1.10) from (11.1.11) with the interpretation that y'(x, λ) is a right-derivative for a ≤ x < b, and in fact a full derivative, left and right, if σ₀ and σ₁ are continuous at x. To verify that formal differentiation of (11.1.11) does in fact yield (11.1.10), with due restriction, we use (11.1.11) for x = x₁, x₂, subtracting the results and getting, after slight reduction,

    y(x₂, λ) − y(x₁, λ) = (x₂ − x₁) cos α − (x₂ − x₁) ∫_a^{x₁} y(t, λ) d{λσ₀(t) + σ₁(t)} − ∫_{x₁}^{x₂} (x₂ − t) y(t, λ) d{λσ₀(t) + σ₁(t)}.
Dividing by (x₂ − x₁) and making x₂ → x₁, with x₁ fixed, we obtain (11.1.10) with x₁ for x provided that

    (x₂ − x₁)⁻¹ ∫_{x₁}^{x₂} (x₂ − t) y(t, λ) d{λσ₀(t) + σ₁(t)} → 0.

This is easily seen to be the case if σ₀, σ₁ are continuous at x = x₁. More generally, it is true if x₂ → x₁ from above, since we assume σ₀, σ₁ right-continuous. Hence (11.1.10) follows from (11.1.11) if y' is interpreted as a right-derivative for a ≤ x < b, and in a < x < b with y' as an ordinary derivative in the full sense if σ₀, σ₁ are continuous at x. Provided that σ₀, σ₁ are continuous at x = b the boundary condition (11.1.3) will have a unique sense, and the eigenvalue problem will be specified by

    y(b, λ) cos β − y'(b, λ) sin β = 0.    (11.1.12)

If σ₀ or σ₁ has a jump at x = b, a convention must be set up as to whether y'(b, λ) is the left-derivative at x = b, or the virtual right-derivative as given by (11.1.10). This does not arise if the boundary condition at x = b is given by y(b, λ) = 0. Without rewriting in these terms the whole of Chapter 8, we note in the next section some basic results from the theory of integral equations of the form (11.1.11), analogous to the theory of ordinary differential equations, which enable such an extension to be carried out.
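As a modern numerical illustration of (11.1.7) (a sketch added here, not part of the original text; the routine name and grid size are illustrative), take p ≡ 1, q ≡ 0 and α = 0 on (0, 1), so that the exact solution with the initial data (11.1.4-5) is y(x, λ) = sin(x√λ)/√λ. Since the kernel is of Volterra type, a left-endpoint discretization of (11.1.7) can simply be marched forward in x:

```python
import math

def solve_volterra(lam, a=0.0, b=1.0, n=2000, alpha=0.0):
    # March y(x) = sin(alpha) + (x - a) cos(alpha)
    #              - int_a^x (x - t) {lam p(t) + q(t)} y(t) dt
    # with p = 1, q = 0, using left-endpoint quadrature; the running
    # sums s1 = sum y_j and s2 = sum t_j y_j keep the sweep O(n).
    h = (b - a) / n
    s1 = s2 = 0.0
    y, t = math.sin(alpha), a
    for i in range(1, n + 1):
        s1 += y
        s2 += t * y
        t = a + i * h
        y = math.sin(alpha) + (t - a) * math.cos(alpha) - h * lam * (t * s1 - s2)
    return y  # approximation to y(b, lam)

print(solve_volterra(4.0), math.sin(2.0) / 2.0)  # both close to 0.4546
```

Refining the grid reproduces the Volterra solution to any required accuracy; the point of the section, however, is that the same equation continues to make sense when the integrator λσ₀ + σ₁ merely has bounded variation.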
11.2. Uniqueness and Existence of Solutions

We must, in the first place, be able to say that (11.1.11) has a unique solution; this extends the familiar property that the differential equation (11.1.1) has a unique solution with given initial values of y and y'. Suppressing for the moment the parameter λ and simplifying the notation, we have

Theorem 11.2.1. Let σ(x) be right-continuous and of bounded variation over the finite interval a ≤ x ≤ b. Then the integral equation

    y(x) = c₁ + c₂(x − a) + ∫_a^x (x − t) y(t) dσ(t)    (11.2.1)

has a unique solution which is continuous in [a, b], for prescribed c₁, c₂.
Supposing there to be two continuous solutions, their difference, z(x), say, would satisfy

    z(x) = ∫_a^x (x − t) z(t) dσ(t).    (11.2.2)

We first prove that z(x) ≡ 0 in some right-neighborhood of a. Since σ(x) is of bounded variation we may choose an x₁ > a such that

    (x₁ − a) ∫_a^{x₁} |dσ(t)| < ½;    (11.2.3)

here

    ω(x) = ∫_a^x |dσ(t)|    (11.2.4)

denotes the total variation of σ(t) over (a, x), and tends to zero as x → a + 0. Suppose that the maximum of |z(x)| in [a, x₁] is reached at x₂. Applying then (11.2.2) with x₂ for x and taking absolute values, we have

    |z(x₂)| ≤ |z(x₂)| (x₁ − a) ∫_a^{x₁} |dσ(t)| < ½ |z(x₂)|,

by (11.2.3). Hence z(x₂) = 0, and so z(x) = 0 in [a, x₁]. Let now a' be the upper bound of x₃ in (a, b) such that z(x) = 0 in [a, x₃]. We may then replace (11.2.2) by

    z(x) = ∫_{a'}^x (x − t) z(t) dσ(t),

for a' ≤ x ≤ b. If a' < b, a repetition of the argument shows that z(x) vanishes identically in a right-neighborhood of a', giving a contradiction. Hence z(x) vanishes identically in [a, b], proving the uniqueness of the solution of (11.2.1).

So far as existence is concerned, two methods are available, which are virtually the same as the methods for the special case of the initial-value problem for differential equations. In the Liouville method of successive approximation we solve, in effect, the integral equation (11.2.1) by its Neumann series, setting up the iterative scheme

    y_{n+1}(x) = c₁ + c₂(x − a) + ∫_a^x (x − t) y_n(t) dσ(t),    (11.2.6)

and starting with y₀(x) = 0.
Subject to it being proved that the process converges suitably as n → ∞, this establishes both the existence and the uniqueness of the solution. We shall prefer here, however, to take the finite-difference or polygon approach, in which the solution appears as the limit of a sequence of piecewise linear functions. Suppose first that σ(x) is a step function with a finite number n of jumps at points a_r, where a < a₁ < ... < a_n ≤ b. In this case (11.2.1) can be solved by recurrence. Writing (11.2.1) with the Stieltjes integral as a sum, we get

    y(x) = c₁ + c₂(x − a) + Σ_{a_r < x} (x − a_r) {σ(a_r) − σ(a_r − 0)} y(a_r).    (11.2.7)

By means of (11.2.8-10) we may find the y(a_r) in turn, and thence y(x) by means of (11.2.7). We approach the general case by a limiting process, and for this need a bound for y(x), constructed according to (11.2.7-10), in terms of the variation of σ(x). Taking absolute values in (11.2.7) we deduce a bound of the form

    |y(x)| ≤ c₃ + c₄ Σ_{a_r < x} |σ(a_r) − σ(a_r − 0)| |y(a_r)|    (11.2.11)

for all x, where we may take

    c₃ = |c₁| + |c₂| (b − a),    c₄ = b − a.    (11.2.12-13)

Writing, as in (11.2.4),

    ω(x) = ∫_a^x |dσ(t)|    (11.2.14)

for the total variation of σ over (a, x), we shall prove that

    |y(a_s)| ≤ c₃ exp {c₄ ω(a_{s−1})},    s = 1, 2, ..., n.    (11.2.15)

For |y(a₁)| this holds, with s = 1, since |y(a₁)| ≤ c₃ by (11.2.11) and (11.2.12), and so, by (11.2.11) again,

    |y(a₂)| ≤ c₃ + c₄c₃ |σ(a₁) − σ(a₁ − 0)| ≤ c₃ exp {c₄ ω(a₁)},
in verification of (11.2.15) with s = 2. To complete the proof of (11.2.15) we use induction. Supposing that (11.2.15) is valid for the y(a_r) appearing on the right of (11.2.11), we deduce that

    |y(a_s)| ≤ c₃ + c₄ Σ_{r=1}^{s−1} |σ(a_r) − σ(a_r − 0)| c₃ exp {c₄ ω(a_{r−1})},

where we interpret a₀ = a, ω(a₀) = 0. Hence, since c₄ {ω(a_r) − ω(a_{r−1})} exp {c₄ ω(a_{r−1})} ≤ exp {c₄ ω(a_r)} − exp {c₄ ω(a_{r−1})},

    |y(a_s)| ≤ c₃ + c₃ [exp {c₄ ω(a_{s−1})} − 1] = c₃ exp {c₄ ω(a_{s−1})},

proving (11.2.15). Inserting this bound on the right of (11.2.7), in the weaker form |y(a_r)| ≤ c₃ exp {c₄ ω(b)}, we obtain

    |y(x)| ≤ c₃ + c₃ exp {c₄ ω(b)} Σ_{a_r < x} (x − a_r) |σ(a_r) − σ(a_r − 0)|
          ≤ c₃ + c₃ exp {c₄ ω(b)} c₄ ω(b) ≤ c₃ exp {2c₄ ω(b)}.    (11.2.16)

The conclusion is that for given c₁, c₂ and a, b with b − a finite, the solution y(x) is uniformly bounded in terms of ω(b), the variation of σ(x) over [a, b], provided that σ(x) is a step function with a finite number of jumps.

Suppose now merely that σ(x) is of bounded variation and right-continuous. In this case we approximate to σ(x) by a sequence of step functions

    σ^(n)(x),    n = 1, 2, ...,    (11.2.17)

and construct the corresponding solutions y^(n)(x) of

    y^(n)(x) = c₁ + c₂(x − a) + ∫_a^x (x − t) y^(n)(t) dσ^(n)(t).    (11.2.18)

As in the corresponding argument in the theory of differential equations, it is not necessary to establish that the sequence y^(n)(x) converges, even though this may be the case; it is sufficient to show that this sequence forms a family of bounded and equicontinuous functions, and so a compact sequence, from which a uniformly convergent subsequence may be extracted.
We choose these step functions σ^(n)(x) so as to have at most n jumps, coinciding with σ(x) at the points obtained by dividing (a, b) into n equal parts, and being constant between these points. Thus

    σ^(n)(x) = σ(a + j(b − a)/n),    a + j(b − a)/n ≤ x < a + (j + 1)(b − a)/n,

for j = 0, ..., n − 1. These step functions are of bounded variation uniformly in n, their total variation not exceeding that of σ(x); in other words, modifying the notation (11.2.14),

    ω^(n)(b) ≤ ω(b).

The solutions of (11.2.18) will then satisfy a bound

    |y^(n)(x)| ≤ c₃ + c₃ exp {c₄ ω^(n)(b)} (b − a) ω^(n)(b),

by (11.2.16) with ω^(n)(b) in place of ω(b). Hence, if ω(b) is the total variation of σ(x),

    |y^(n)(x)| ≤ c₃ + c₃ exp {c₄ ω(b)} (b − a) ω(b).

Thus the family of functions y^(n)(x), n = 1, 2, ..., is uniformly bounded. It is easily deduced from (11.2.18) that the same family of functions is also equicontinuous. We have, in fact,

    y^(n)(x₂) − y^(n)(x₁) = c₂(x₂ − x₁) + (x₂ − x₁) ∫_a^{x₁} y^(n)(t) dσ^(n)(t) + ∫_{x₁}^{x₂} (x₂ − t) y^(n)(t) dσ^(n)(t),

whence

    |y^(n)(x₂) − y^(n)(x₁)| ≤ |x₂ − x₁| {|c₂| + 2 sup |y^(n)(t)| ω^(n)(b)}.

Here the coefficient of |x₂ − x₁| on the right is bounded, uniformly in n, so that the y^(n)(x) satisfy in fact a uniform Lipschitz condition and are certainly equicontinuous. Applying now the Arzelà compactness principle, we deduce that there is an infinite sequence of n-values such that the y^(n)(x) converge uniformly to a limit function y(x). The bound proved above for the y^(n)(x), independently of n, will then apply to y(x), so that

    |y(x)| ≤ c₃ exp {2c₄ ω(b)},    (11.2.19)

where c₃, c₄ are given by (11.2.12-13).
To complete the proof of Theorem 11.2.1 we must show that y(x) satisfies (11.2.1), and for this make n → ∞ in (11.2.18). Since the left sides of (11.2.1), (11.2.18) become identical as n → ∞, through the sequence in question, we must prove that

    ∫_a^x (x − t) y^(n)(t) dσ^(n)(t) → ∫_a^x (x − t) y(t) dσ(t).

Since

    ∫_a^x (x − t) y(t) dσ^(n)(t) → ∫_a^x (x − t) y(t) dσ(t),

the left being an approximating sum to the Riemann-Stieltjes integral on the right, it will be sufficient to show that

    ∫_a^x (x − t) {y(t) − y^(n)(t)} dσ^(n)(t) → 0,

and this is the case since y(t) − y^(n)(t) → 0 uniformly, and since the σ^(n)(t) are uniformly of bounded variation. This proves Theorem 11.2.1.

As mentioned in the last section, a solution of the integral equation (11.2.1) also satisfies a slightly simpler integro-differential equation.
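The recurrence (11.2.7) and the polygon limit just established can be made concrete in a few lines (an illustrative sketch added here, not from the original text; the choice σ(x) = −x and the grid sizes are our own). Approximating σ(x) = −x on (0, 1) by a right-continuous staircase with n equal jumps of −1/n, the solution of (11.2.1) with c₁ = 0, c₂ = 1 is piecewise linear and exactly computable by the recurrence, and as n grows it approaches sin x, the solution of y'' = −y with y(0) = 0, y'(0) = 1:

```python
import math

def solve_step(n, x=1.0):
    # Recurrence (11.2.7): y(x) = c1 + c2 x + sum_{a_r < x} (x - a_r) mu_r y(a_r)
    # with c1 = 0, c2 = 1 and jumps mu_r = -1/n at a_r = r/n (staircase for -x).
    a = [r / n for r in range(1, n + 1)]
    yv = []
    for r in range(n):
        yv.append(a[r] - sum((a[r] - a[k]) * yv[k] for k in range(r)) / n)
    return x - sum((x - ar) * y for ar, y in zip(a, yv) if ar < x) / n

print(solve_step(500), math.sin(1.0))  # staircase solution vs. sin 1
```

Each staircase problem is itself an exact, finitely computable solution of an equation of the form (11.2.1); the uniform convergence asserted in the proof is visible numerically as n increases.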
Theorem 11.2.2. The solution of (11.2.1) has, for a ≤ x < b, a right-hand derivative given by

    y'(x) = c₂ + ∫_a^x y(t) dσ(t),    (11.2.20)

with

    y'(a) = c₂.    (11.2.21)

This is a two-sided derivative if σ(x) is continuous, or if y(x) = 0. For

    y(x₂) − y(x₁) = c₂(x₂ − x₁) + ∫_a^{x₂} (x₂ − t) y(t) dσ(t) − ∫_a^{x₁} (x₁ − t) y(t) dσ(t),

and so

    {y(x₂) − y(x₁)}/(x₂ − x₁) = c₂ + ∫_a^{x₁} y(t) dσ(t) + (x₂ − x₁)⁻¹ ∫_{x₁}^{x₂} (x₂ − t) y(t) dσ(t).

Here the last term on the right tends to zero as x₂ → x₁, for fixed x₁: in any case as x₂ → x₁ + 0, since σ(x) is right-continuous, and as x₂ tends to x₁ from either side if either σ(x) is continuous at x₁ or if y(x₁) = 0.
An obvious consequence of (11.2.20) is

Theorem 11.2.3. The right-derivative y'(x) is uniformly bounded in a ≤ x < b, and is indeed of bounded variation there.

A second-order differential equation has the property that a solution is determined by its value and that of its derivative at any point of the relevant interval. For the integral equation (11.2.1) it is likewise the case that the initial point x = a may be replaced by any other.

Theorem 11.2.4. A solution of (11.2.1) satisfies, for any a' with a ≤ a' < b, the equation

    y(x) = y(a') + y'(a')(x − a') + ∫_{a'}^x (x − t) y(t) dσ(t),    (11.2.22)

where y'(a') is the right-derivative at x = a'.

For if on the right of (11.2.22) we substitute for y(a') according to (11.2.1), and for y'(a') according to (11.2.20), we obtain

    c₁ + c₂(a' − a) + ∫_a^{a'} (a' − t) y(t) dσ(t) + (x − a') {c₂ + ∫_a^{a'} y(t) dσ(t)} + ∫_{a'}^x (x − t) y(t) dσ(t),

and it is easily checked that this reduces to the right of (11.2.1), which proves the result. Finally we note that the solution of (11.2.22) is unique. We put this result in another important form.

Theorem 11.2.5. If the solution of (11.2.1) does not vanish identically, then y(x) and y'(x) do not vanish together.

For if y(a') and y'(a') both vanished, we should have from (11.2.22) that y(x) = 0 in a' ≤ x ≤ b, by Theorem 11.2.1, or more particularly from the discussion of (11.2.2). The proof that y(x) = 0 in a ≤ x ≤ a' is similar. It is mainly a matter of proving that y(x) ≡ 0 in a left-neighborhood of a'. We have, for such x, taking y(a') = y'(a') = 0 in (11.2.22), the bound

    |y(x)| ≤ (a' − x) ∫_x^{a'} |y(t)| |dσ(t)|,

from which the conclusion follows as before.
11.3. Wronskian Identities

The solutions of a given integral equation (11.2.1) for fixed σ(x) and varying c₁, c₂, that is to say, with varying y(a), y'(a), form a two-parameter family of functions which are entirely analogous to the family of solutions of a second-order differential equation. In particular, any two of them have a constant Wronskian.

Theorem 11.3.1. Let σ(x) be of bounded variation and right-continuous in a ≤ x ≤ b, let y(x) be the unique solution of (11.2.1), and z(x) the solution of

    z(x) = c₃ + c₄(x − a) + ∫_a^x (x − t) z(t) dσ(t).    (11.3.1)

Then, for a ≤ x < b,

    y(x) z'(x) − y'(x) z(x) = c₁c₄ − c₂c₃.    (11.3.2)

Here, as usual, y' and z' denote right-derivatives. At a point where σ(x) is continuous, they may, of course, be taken to be ordinary two-sided derivatives. The end x = b needs special consideration. If σ(x) is continuous there, the left-derivatives will exist there, being given by

    y'(b) = c₂ + ∫_a^b y(t) dσ(t),    (11.3.3)

and likewise for z'(b); this is the same as lim y'(x) as x → b, where y'(x) is the right-derivative, and (11.3.2) holds for x = b by making x → b. Suppose again that σ(x) has a jump at x = b. Then (11.3.2) is still true, with y'(b) given by (11.3.3). For according to (11.3.3), the discontinuity in y'(x) when x reaches b will be y(b) [σ(b) − σ(b − 0)]. The effect of this and the similar discontinuity in z'(x) on the Wronskian in (11.3.2) is therefore a discontinuity of amount

    y(b) {z(b) [σ(b) − σ(b − 0)]} − {y(b) [σ(b) − σ(b − 0)]} z(b) = 0,

so that the Wronskian is continuous at x = b, even if σ(x) has a jump there. Thus (11.3.2) will also be true at x = b with the interpretation (11.3.3). Turning now to the proof of (11.3.2), we prove the following more general result, along the lines of the Green or Lagrange identity.
Theorem 11.3.2. Let σ(x), σ†(x) be of bounded variation and right-continuous in a ≤ x ≤ b. Let y(x) be the solution of (11.2.1) and z(x) that of

    z(x) = c₃ + c₄(x − a) + ∫_a^x (x − t) z(t) dσ†(t).    (11.3.4)

Then

    [yz' − y'z]_a^x = ∫_a^x y(t) z(t) {dσ†(t) − dσ(t)}.    (11.3.5)

Noting that in the products yz', y'z one factor is continuous and the other, in fact both, of bounded variation, we have

    [yz' − y'z]_a^x = ∫_a^x {(dy) z' + y (dz') − (dy') z − y' (dz)}.    (11.3.6)

To derive the required result (11.3.5) we have in the first place that

    ∫_a^x (dy) z' = ∫_a^x y'z' dx = ∫_a^x y' (dz),    (11.3.7-8)

and secondly that

    ∫_a^x (dy') z = ∫_a^x yz dσ,    ∫_a^x y (dz') = ∫_a^x yz dσ†.    (11.3.9-10)

The latter are obtained formally by setting dy' = y dσ, dz' = z dσ†, which come from (11.2.20) and its analog for z. These operations are readily justified from first principles. Taking (11.3.7), we have to show that

    ∫_a^x (dy − y' dx) z' = 0,    (11.3.11)

in the sense that, for a sequence of subdivisions a = ξ₀ < ξ₁ < ... < ξ_n = x,

    Σ_{r=0}^{n−1} {y(ξ_{r+1}) − y(ξ_r) − y'(ξ_r)(ξ_{r+1} − ξ_r)} z'(ξ_r) → 0

as n → ∞, max (ξ_{r+1} − ξ_r) → 0. In view of (11.2.20-21) this is the same as

    Σ_{r=0}^{n−1} z'(ξ_r) ∫_{ξ_r}^{ξ_{r+1}} dt ∫_{ξ_r}^t y(s) dσ(s) → 0.
The left is less than or equal to, in absolute value,

    sup |z'| sup |y| max (ξ_{r+1} − ξ_r) ∫_a^x |dσ(t)|,

which tends to zero, since max (ξ_{r+1} − ξ_r) → 0 and since σ(x) is of bounded variation. This justifies (11.3.7-8). In the case of (11.3.9) the corresponding result to be proved is that

    ∫_a^x (dy' − y dσ) z = 0,

or that

    Σ_{r=0}^{n−1} {y'(ξ_{r+1}) − y'(ξ_r) − y(ξ_r)[σ(ξ_{r+1}) − σ(ξ_r)]} z(ξ_r) → 0,

for a sequence of subdivisions as before. By (11.2.20) this is the same as

    Σ_{r=0}^{n−1} z(ξ_r) ∫_{ξ_r}^{ξ_{r+1}} {y(t) − y(ξ_r)} dσ(t) → 0.

This follows from the facts that z(t) is bounded, that σ(t) is of bounded variation, and that y(t) is continuous, so that

    y(t) − y(ξ_r) → 0 as n → ∞,    ξ_r ≤ t ≤ ξ_{r+1},

subject to max (ξ_{r+1} − ξ_r) → 0. This justifies (11.3.9), and likewise (11.3.10), so that the proof of (11.3.5) is complete, and therewith that of (11.3.2).
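Theorem 11.3.1 can be checked exactly when σ is a step function, since then (11.2.7) gives y, z in closed form and (11.2.20) gives their right-derivatives (an illustrative sketch added here, not from the original text; the jump data are arbitrary). The Wronskian y z' − y' z should equal c₁c₄ − c₂c₃ at every point, including across the jumps:

```python
def step_solution(c1, c2, jumps, x):
    # Solve (11.2.1), with a = 0, for a right-continuous step function sigma
    # given as (location, jump) pairs, via the recurrence (11.2.7); return
    # y(x) and the right-derivative y'(x) from (11.2.20).
    yv = {}
    for ar, mu in jumps:
        yv[ar] = c1 + c2 * ar + sum((ar - s) * m * yv[s] for s, m in jumps if s < ar)
    y = c1 + c2 * x + sum((x - s) * m * yv[s] for s, m in jumps if s < x)
    yp = c2 + sum(m * yv[s] for s, m in jumps if s <= x)
    return y, yp

jumps = [(0.2, -0.7), (0.5, 1.3), (0.8, -0.4)]   # arbitrary jumps of sigma
for x in (0.1, 0.35, 0.65, 0.95):
    y, yp = step_solution(0.0, 1.0, jumps, x)    # y(0) = 0, y'(0) = 1
    z, zp = step_solution(1.0, 0.0, jumps, x)    # z(0) = 1, z'(0) = 0
    print(x, y * zp - yp * z)                    # c1*c4 - c2*c3 = -1, up to rounding
```

Between jumps both solutions are linear with constant derivatives, and at a jump a_r the jumps μ_r y(a_r), μ_r z(a_r) in y', z' cancel in the Wronskian, exactly as in the discussion preceding Theorem 11.3.2.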
11.4. Variation of Parameters

The inhomogeneous equation corresponding to (11.2.1) will be taken to be

    ψ(x) = c₁ + c₂(x − a) + ∫_a^x (x − t) ψ(t) dσ(t) + ∫_a^x χ(t) dt,    (11.4.1)

where χ(t) is given and ψ(x) is to be found; there will again be a simpler formulation as an integro-differential equation, namely,

    Δψ' = ∫ ψ(t) dσ(t) + Δχ,    (11.4.2)
where the variations and integrals are over (a, x), or any other subinterval of (a, b). We shall, however, take (11.4.1) as basic, since a solution may be sought in the class of continuous functions, without any a priori reference to differentiability. In what follows we construct a solution of (11.4.1) for some, unknown, values of c₁, c₂. As in the case of differential equations, having some "particular integral" of an inhomogeneous equation, a linear combination of solutions of the homogeneous equation may be added so as to satisfy certain initial conditions or boundary conditions. A solution of (11.4.1) will be constructed in terms of two solutions of the "homogeneous equation," which must be linearly independent, and so have nonvanishing Wronskian. For definiteness let us take y(x), z(x) as the solutions of

    y(x) = (x − a) + ∫_a^x (x − t) y(t) dσ(t),    (11.4.3)

    z(x) = 1 + ∫_a^x (x − t) z(t) dσ(t),    (11.4.4)

so that y(a) = 0, y'(a) = 1, z(a) = 1, z'(a) = 0. With the assumptions made previously concerning σ(x), we may then assert, by Theorem 11.3.1, that

    y'(x) z(x) − y(x) z'(x) = 1.    (11.4.5)

An expression for a solution of (11.4.1) may be set up by analogy with the standard case in which σ(x), χ(x) are differentiable. Our concern here will be to verify this solution.

Theorem 11.4.1. Let σ(x), χ(x) be right-continuous and of bounded variation over a ≤ x ≤ b. Then the function

    ψ(x) = y(x) ∫_a^x z(t) dχ(t) − z(x) ∫_a^x y(t) dχ(t)    (11.4.6)

satisfies (11.4.1) for some c₁, c₂, with y(x), z(x) given by (11.4.3-4).

Let us first verify that ψ is continuous. We have

    ψ(x₂) − ψ(x₁) = {y(x₂) − y(x₁)} ∫_a^{x₁} z(t) dχ(t) − {z(x₂) − z(x₁)} ∫_a^{x₁} y(t) dχ(t)
                  + ∫_{x₁}^{x₂} {y(x₂) z(t) − z(x₂) y(t)} dχ(t).    (11.4.7)

Making x₂ → x₁ for fixed x₁, all three terms on the right tend to zero,
so that ψ(x) is continuous. Its differentiability properties may also be read off from (11.4.7). Noting that, by (11.4.3-4),

    y'(x) = 1 + ∫_a^x y(t) dσ(t),    (11.4.8)

and

    y(x₂) − y(x₁) = (x₂ − x₁) y'(x₁) + ∫_{x₁}^{x₂} (x₂ − t) y(t) dσ(t),    (11.4.9)

and likewise

    z'(x) = ∫_a^x z(t) dσ(t),    z(x₂) − z(x₁) = (x₂ − x₁) z'(x₁) + ∫_{x₁}^{x₂} (x₂ − t) z(t) dσ(t),    (11.4.10-11)

we have, for fixed x₁ and x₁ ≤ t ≤ x₂,

    y(x₂) − y(t) = O(x₂ − x₁),    z(x₂) − z(t) = O(x₂ − x₁),    (11.4.12)

and so

    y(x₂) z(t) − z(x₂) y(t) = O(x₂ − x₁).    (11.4.13)

If therefore we divide (11.4.7) by (x₂ − x₁) and make x₂ tend to x₁ from above, the last integral in (11.4.7) will contribute nothing, bearing in mind that χ(t) is right-continuous, and we obtain, writing x for x₁,

    ψ'(x) = y'(x) ∫_a^x z(t) dχ(t) − z'(x) ∫_a^x y(t) dχ(t),    (11.4.14)

all derivatives being right-handed. By similar methods, we may verify directly that

    ψ(x) = ∫_a^x ψ'(t) dt,    (11.4.15)

the integral being a Riemann integral and the derivative a right-derivative. Taking a subdivision a = ξ₀ < ... < ξ_n = x, we have of course

    ψ(x) = Σ_{r=0}^{n−1} {ψ(ξ_{r+1}) − ψ(ξ_r)},    (11.4.16)

since ψ(a) = 0. The terms on the right are now expressed in the form (11.4.7), with ξ_r, ξ_{r+1} for x₁ and x₂. By (11.4.9) we have

    y(ξ_{r+1}) − y(ξ_r) = (ξ_{r+1} − ξ_r) y'(ξ_r) + O{(ξ_{r+1} − ξ_r) ∫_{ξ_r}^{ξ_{r+1}} |dσ(t)|},
with a similar result for z. Hence from (11.4.7) we have

    ψ(ξ_{r+1}) − ψ(ξ_r) = (ξ_{r+1} − ξ_r) ψ'(ξ_r) + O{(ξ_{r+1} − ξ_r) ∫_{ξ_r}^{ξ_{r+1}} (|dσ(t)| + |dχ(t)|)},

the last O-term coming from the last integral in (11.4.7), estimated by means of (11.4.13), and all bounds holding uniformly. Summing over r, and making n → ∞, and therewith max (ξ_{r+1} − ξ_r) → 0, we derive (11.4.15).

We have to take the calculations a stage further, and consider the variation of ψ'(x), with a view to verifying (11.4.2). The two terms on the right of (11.4.14) are each the product of two factors, each factor being of bounded variation; however, the factors may all be discontinuous at the same x-values, and we shall accordingly proceed once more from first principles. We have

    ψ'(x) − ψ'(a) = Σ_{r=0}^{n−1} {ψ'(ξ_{r+1}) − ψ'(ξ_r)},    (11.4.18)

where, by (11.4.14),

    ψ'(ξ_{r+1}) − ψ'(ξ_r) = {y'(ξ_{r+1}) − y'(ξ_r)} ∫_a^{ξ_r} z dχ + y'(ξ_{r+1}) ∫_{ξ_r}^{ξ_{r+1}} z dχ
        − {z'(ξ_{r+1}) − z'(ξ_r)} ∫_a^{ξ_r} y dχ − z'(ξ_{r+1}) ∫_{ξ_r}^{ξ_{r+1}} y dχ.    (11.4.19)

We now note that, by (11.4.8),

    y'(ξ_{r+1}) − y'(ξ_r) = ∫_{ξ_r}^{ξ_{r+1}} y(t) dσ(t),    (11.4.20)

and in view of (11.4.12)

    ∫_{ξ_r}^{ξ_{r+1}} y(t) dσ(t) = y(ξ_r) {σ(ξ_{r+1}) − σ(ξ_r)} + O{(ξ_{r+1} − ξ_r) ∫_{ξ_r}^{ξ_{r+1}} |dσ(t)|}.
A similar result, of course, holds for z'. Also, by (11.4.12) and the Wronskian relation (11.4.5),

    y'(ξ_{r+1}) ∫_{ξ_r}^{ξ_{r+1}} z(t) dχ(t) − z'(ξ_{r+1}) ∫_{ξ_r}^{ξ_{r+1}} y(t) dχ(t)
        = {χ(ξ_{r+1}) − χ(ξ_r)} + O{(ξ_{r+1} − ξ_r) ∫_{ξ_r}^{ξ_{r+1}} |dχ(t)|}.

The first term on the right of (11.4.19) is, by (11.4.20),

    y(ξ_r) {σ(ξ_{r+1}) − σ(ξ_r)} ∫_a^{ξ_r} z(t) dχ(t) + O{(ξ_{r+1} − ξ_r) ∫_{ξ_r}^{ξ_{r+1}} |dσ(t)|},

the third term on the right of (11.4.19) being similarly

    −z(ξ_r) {σ(ξ_{r+1}) − σ(ξ_r)} ∫_a^{ξ_r} y(t) dχ(t) + O{(ξ_{r+1} − ξ_r) ∫_{ξ_r}^{ξ_{r+1}} |dσ(t)|}.

Taking these terms together, the result may by (11.4.6) be written as

    ψ'(ξ_{r+1}) − ψ'(ξ_r) = {σ(ξ_{r+1}) − σ(ξ_r)} ψ(ξ_r) + {χ(ξ_{r+1}) − χ(ξ_r)} + O{(ξ_{r+1} − ξ_r) ∫_{ξ_r}^{ξ_{r+1}} (|dσ(t)| + |dχ(t)|)}.

Summing over r = 0, ..., n − 1, and making n → ∞ subject to max (ξ_{r+1} − ξ_r) → 0, we get

    ψ'(x) − ψ'(a) = ∫_a^x ψ(t) dσ(t) + χ(x) − χ(a),    (11.4.21)

which justifies (11.4.2). To complete the proof we integrate (11.4.21). In view of (11.4.15) we derive

    ψ(x) = ψ'(a)(x − a) + ∫_a^x dt ∫_a^t ψ(s) dσ(s) + ∫_a^x {χ(t) − χ(a)} dt.
Since ψ(a) = ψ'(a) = 0, by (11.4.6) and (11.4.14), and since

    ∫_a^x {χ(t) − χ(a)} dt = ∫_a^x (x − t) dχ(t),

by an integration by parts, we deduce

    ψ(x) = ∫_a^x (x − t) ψ(t) dσ(t) + ∫_a^x (x − t) dχ(t),    (11.4.22)

which is equivalent to (11.4.1), with c₁ = 0, c₂ = −χ(a). This completes the proof.

An incidental consequence, needed for results on continuous dependence, is that the solution of (11.4.22) is bounded in terms of the function χ(t) appearing in the inhomogeneous term.

Theorem 11.4.2. For a fixed function σ(x), right-continuous and of bounded variation in a ≤ x ≤ b, the solution of (11.4.22) admits a bound

    |ψ(x)| ≤ const ∫_a^b |dχ(t)|.    (11.4.23)

This follows from (11.4.6), y(x) and z(x) being dependent only on σ(x) and being continuous, and so bounded. The solution of (11.4.22) is, of course, unique, in view of Theorem 11.2.1. Furthermore, the dependence of ψ(x) on χ(x) will be linear and continuous, in that if we have two such solutions ψ₁(x), ψ₂(x), corresponding to χ₁(x), χ₂(x), then

    |ψ₁(x) − ψ₂(x)| ≤ const ∫_a^b |d{χ₁(t) − χ₂(t)}|.    (11.4.24)
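As a check on the variation-of-parameters formula (11.4.6) (an illustrative sketch added here, not from the original text; the particular data are our own), take σ(t) = −t, so that (11.4.3-4) give y = sin x, z = cos x, and take χ(t) = t. Then (11.4.6) evaluates to ψ(x) = 1 − cos x, which indeed satisfies ψ'' = −ψ + 1 with ψ(0) = ψ'(0) = 0:

```python
import math

def psi(x, n=2000):
    # psi(x) = y(x) int_0^x z dchi - z(x) int_0^x y dchi, formula (11.4.6),
    # with y = sin, z = cos, dchi = dt; midpoint rule for the two integrals.
    h = x / n
    iz = h * sum(math.cos((j + 0.5) * h) for j in range(n))
    iy = h * sum(math.sin((j + 0.5) * h) for j in range(n))
    return math.sin(x) * iz - math.cos(x) * iy

print(psi(1.2), 1.0 - math.cos(1.2))  # both close to 0.6376
```

The same formula continues to hold when dχ carries jumps, in which case the integrals become Stieltjes sums; the smooth data here merely make the closed form easy to verify.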
11.5. Analytic Dependence on a Parameter

To complete, for present purposes, our stock of classical results for differential equations as extended to integral equations of Sturm-Liouville type, we prove that the function defined by (11.1.11) is analytic in λ, in fact, an entire function of λ. Assuming that σ₀(x), σ₁(x) are right-continuous and of bounded variation over the finite interval (a, b), we now know from Section 11.2 that y(x, λ) is uniquely defined as the continuous solution of (11.1.11). We first remark that y(x, λ) is bounded, uniformly in any bounded
λ-region and for a ≤ x ≤ b. This follows from the explicit bound (11.2.19), which gives here

    |y(x, λ)| ≤ c₅ exp {2c₄ (|λ| ω₀(b) + ω₁(b))},    (11.5.1)

where c₅ = |sin α| + (b − a) |cos α|, c₄ = b − a, and ω₀(b), ω₁(b) are the total variations of σ₀(x), σ₁(x) over (a, b). Next we show that y(x, λ) is continuous in λ. For this we set up (11.1.11) with μ in place of λ, subtract (11.1.11) from this, and divide by (μ − λ). Writing, for μ ≠ λ,

    w(x, λ, μ) = {y(x, μ) − y(x, λ)}/(μ − λ),    (11.5.2)

the result may be arranged in the form

    w(x, λ, μ) = −∫_a^x (x − t) w(t, λ, μ) d{λσ₀(t) + σ₁(t)} − ∫_a^x (x − t) y(t, μ) dσ₀(t).    (11.5.3)

This has the form (11.4.22) with w(x, λ, μ) for ψ(x), and with

    χ(x) = −∫_a^x y(t, μ) dσ₀(t).    (11.5.4)

Here χ(x) is of bounded variation and right-continuous, with σ₀(x). By what has just been proved, it will be of bounded variation uniformly in any bounded μ-region. It follows by Theorem 11.4.2 that w(x, λ, μ) will be uniformly bounded in any finite μ-region for fixed λ, and hence y(x, μ) is continuous as a function of μ at μ = λ. To show that y(x, λ) is analytic we show that it has a well-defined derivative for all complex λ, or that w(x, λ, μ) tends to a unique limit as μ → λ. Defining w(x, λ, λ) as the solution of (11.5.3) with μ = λ, we see by (11.4.24) that w(x, λ, μ) will be a continuous function of μ, for fixed x and λ, in view of the continuity we have established for y(x, μ). Hence w(x, λ, μ) → w(x, λ, λ) as μ → λ in any manner, and y(x, λ) is therefore analytic. It is therefore an entire function, by (11.5.1) of order at most 1, though this figure is not precise.
11.6. Eigenvalues and Orthogonality

The solution of (11.1.11) necessarily satisfies the initial boundary condition (11.1.2), and a boundary problem is then given by imposing a condition at x = b, namely,

    y(b, λ) cos β − y'(b, λ) sin β = 0.    (11.6.1)
Supposing for definiteness that σ₀(x), σ₁(x) are continuous at x = b, we may take it that y'(b, λ) is the left-derivative of y(x, λ) at x = b, which is also given by (11.1.10). The roots of (11.6.1), the eigenvalues, are thus exhibited as the zeros of an entire function. It follows that they have no finite limit, provided that this function does not vanish identically, that is to say, provided that not every λ-value is an eigenvalue. To exclude the latter possibility we turn to the orthogonality relations for the eigenfunctions. Denoting by λ_n a typical eigenvalue, and by y_n(x) = y(x, λ_n) the corresponding eigenfunction with y_n(a) = sin α, y_n'(a) = cos α, we have from (11.3.5) that for any pair of eigenfunctions

    [y_m y_n' − y_m' y_n]_a^b = (λ_m − λ_n) ∫_a^b y_m(t) y_n(t) dσ₀(t).

Here the left vanishes by the boundary conditions, and so, if λ_m ≠ λ_n,

    ∫_a^b y_m(t) y_n(t) dσ₀(t) = 0.    (11.6.2)

Let us now assume, in addition to the assumptions that σ₀(x) and σ₁(x) are of bounded variation and right-continuous over the finite interval (a, b), that they are furthermore real-valued and that σ₀(x) is nondecreasing and satisfies the "definiteness" requirement

    ∫_a^b |y(x, λ)|² dσ₀(x) > 0    (11.6.3)

for all λ, real or complex. This corresponds to (8.2.1) or to (9.1.6); the minimal requirement is that (11.6.3) should hold for all λ such that y(x, λ) satisfies both boundary conditions. Under these assumptions, we may assert that all eigenvalues are real. For otherwise we could choose λ_m to be a complex eigenvalue, with λ_n as its complex conjugate, also necessarily an eigenvalue. Since y_m(x), y_n(x) would be complex conjugates, (11.6.2) would then contradict (11.6.3). Since complex eigenvalues are now excluded, the entire function on the left of (11.6.1) does not vanish identically, and the eigenvalues therefore have no finite limit. Further information concerning their distribution may be derived from the order of the entire function on the left of (11.6.1). As in Section 8.4 we may investigate the oscillatory characterization of the eigenvalues by means of an angular variable

    θ(x, λ) = arg {y'(x, λ) + i y(x, λ)}.
In case of a jump in the right-derivative y'(x, λ), we join up the two values of y' + iy by a straight line, and determine the value of θ(x, λ) by continuous variation along this line, as in Section 6.10.
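The characterization of eigenvalues as zeros of an entire function invites a shooting computation (an illustrative modern sketch added here, not from the original text; the grid and bracketing values are our own). With σ₀(x) = x, σ₁ ≡ 0 and α = β = 0 on (0, 1), condition (11.6.1) reads y(1, λ) = 0, where y(x, λ) = sin(x√λ)/√λ, so the eigenvalues are λ_n = (nπ)². Marching the integral equation (11.1.11) and bisecting on λ recovers the first of them:

```python
import math

def y_at_b(lam, n=4000, b=1.0):
    # y(x) = x - lam * int_0^x (x - t) y(t) dt: the case sigma0(t) = t,
    # sigma1 = 0, alpha = 0 of (11.1.11), marched with left-endpoint sums.
    h = b / n
    s1 = s2 = 0.0
    y = t = 0.0
    for i in range(1, n + 1):
        s1 += y
        s2 += t * y
        t = i * h
        y = t - lam * h * (t * s1 - s2)
    return y

lo, hi = 8.0, 12.0            # y(1, 8) > 0 > y(1, 12): a root is bracketed
for _ in range(50):
    mid = 0.5 * (lo + hi)
    if y_at_b(lo) * y_at_b(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
print(0.5 * (lo + hi), math.pi ** 2)  # first eigenvalue, near 9.8696
```

The same bisection applies unchanged when σ₀ carries point masses, since λ ↦ y(b, λ) remains entire and real on the real axis under the assumptions of this section.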
11.7. Remarks on the Expansion Theorem

This theorem will deal with the formal expansion

    φ(x) ~ Σ_n c_n y_n(x),    (11.7.1)

where

    c_n = ∫_a^b φ(x) y_n(x) dσ₀(x) / ∫_a^b [y_n(x)]² dσ₀(x),    (11.7.2)

and we suppose the additional assumptions of Section 11.6 to hold, in particular (11.6.3). For a certain class of φ(x), (11.7.1) will hold with uniform and absolute convergence, and for a wider class in the sense that

    ∫_a^b |φ(x) − Σ_{n=0}^m c_n y_n(x)|² dσ₀(x) → 0    (11.7.3)

as m → ∞. The proof may follow the lines of those used in Chapter 8 or 9, and will in either case involve the inhomogeneous boundary problem. We write this most simply in integro-differential form, as

    Δφ' = −∫ φ(t) dσ₁(t) − ∫ χ(t) dσ₀(t),    (11.7.4)

to hold with the variation and integrals taken over any subinterval of (a, b), with the requirement that φ satisfy the boundary conditions (11.1.2-3). Assuming that φ satisfies such a relation with χ of integrable square with respect to σ₀, we then have the situation, with a slight generalization, of Theorem 8.6.1, and the proof may be developed along the lines of Prüfer's investigation. For the further theory, and for the method of Chapter 9, we solve (11.7.4) explicitly, assuming that λ = 0 is not an eigenvalue, by means of a Green's function, in the form

    φ(x) = ∫_a^b G(x, t, 0) χ(t) dσ₀(t).    (11.7.5)

The variation of constants formula (11.4.6) must here be modified along the lines of (8.8.17) in order to derive an expression satisfying
the boundary conditions, and the Green's function may be expressed in a form similar to (8.8.7). As in Section 8.9, by applying the Bessel inequality to the Green's function, information may be obtained on the absolute and uniform convergence of the eigenfunction expansion. The more general problem posed by

    Δφ' = −∫ φ(t) d{λσ₀(t) + σ₁(t)} − ∫ χ(t) dσ₀(t),    (11.7.6)

where φ must also satisfy the boundary conditions, may be solved as in (11.7.5) by means of a Green's function G(x, t, λ) if λ is not an eigenvalue, or alternatively as a power series in λ, if 0 is not an eigenvalue and if λ is small. This was the method adopted in Section 9.6 for the proof of the eigenfunction expansion. A third method of solving (11.7.6) will be to apply the formalism of (11.7.4-5). Taking it that 0 is not an eigenvalue, we have

    φ(x) = λ ∫_a^b G(x, t, 0) φ(t) dσ₀(t) + ∫_a^b G(x, t, 0) χ(t) dσ₀(t),    (11.7.7)

a Fredholm-type integral equation with Stieltjes integrals; this provides another avenue to the expansion theorem, a variant of this being provided by the theory of the completely continuous operator. Yet another method of proving the expansion theorem deserves a brief mention, namely, the proof by means of finite-dimensional approximation. As for the solution of the initial-value problem considered in Section 11.2, we replace the functions σ₀(x), σ₁(x) by a sequence of step functions σ₀^(n)(x), σ₁^(n)(x), having only n jumps, and agreeing with σ₀(x), σ₁(x) at x = a, x = b and at points of a subdivision of (a, b). For the resulting finite-dimensional problem, the eigenfunction expansion is of a trivial kind, being that considered in Chapter 4. The limiting transition n → ∞ presents rather similar problems to the transition b → ∞ in the case of the Sturm-Liouville expansion over (a, b). As in that case, we may most conveniently take a fixed section of the Parseval equality, in the form (8.10.13), make n → ∞, and then make Λ → ∞.
11.8. The Generalized First-Order Matrix Differential Equation

We now turn to the general system of first-order linear differential equations, with a view to possible extensions of the boundary problems discussed in Chapter 9. We must start by checking that the usual existence
and uniqueness theorems are still available, for which purpose the scalar parameter may be omitted. From the differential equation

    y' = B(x) y,    a ≤ x ≤ b,    (11.8.1)

where y is a k-by-1 column matrix and B a k-by-k square matrix, we have on integration that

    y(x) = y(a) + ∫_a^x B(t) y(t) dt.    (11.8.2)

Writing

    C(x) = ∫_a^x B(t) dt,    (11.8.3)

this takes the form

    y(x) = y(a) + ∫_a^x dC(t) y(t).    (11.8.4)

An interpretation of (11.8.4) which includes (11.8.1), with integrable B(x), will be that C(x) is continuous and of bounded variation; here a solution will be a continuous vector function y(x), satisfying (11.8.4) for a ≤ x ≤ b, the integral in (11.8.4) being the ordinary Riemann-Stieltjes integral. To interpret (11.8.4) when C(x) is of bounded variation and right-continuous, possibly not continuous in the full sense, some attention must be given to the Riemann-Stieltjes integral, since a discontinuity in C(x) must, in general, imply a discontinuity in the solution of (11.8.4). Denoting by {ξ_r}, a = ξ₀ < ξ₁ < ... < ξ_n = x, a typical finite subdivision of the interval (a, x), we prescribe that the integral on the right of (11.8.4) is to mean

    lim Σ_{r=0}^{n−1} {C(ξ_{r+1}) − C(ξ_r)} y(ξ_r),    (11.8.5)

the limiting process being subject, as usual, to

    max (ξ_{r+1} − ξ_r) → 0.    (11.8.6)

A "solution" of (11.8.4) is then to be a right-continuous vector function y(x) of bounded variation, satisfying the equation for a ≤ x ≤ b. We pass over here the proof that the integral defined by (11.8.5) exists and has a unique value, independent of the choice of the sequence of subdivisions, provided that C and y are right-continuous and of bounded variation.
The proof that (11.8.4) has at most one solution, or that the homogeneous equation z(x) =
r
a
d C ( t )z ( t )
has only the solution z(x) = 0, will be very similar to the discussion of (1 1.2.2), relying now on the right-continuity of C(x). The existence of a solution may be most expeditiously proved by the method of successive approximations. We set up the recurrence scheme
for a sequence of column matrices un(x), starting with uo(x) = 0. T o carry out the estimations we adopt norms, written I I, of a vector or column matrix, or of a square matrix, given by the sum of the absolute values of all the entries. T o investigate the convergence of the scheme (1 1.8.7) we subtract successive equations of the sequence, getting
whence i(n(x)
I. (11.8.8)
sb
Here I dC(t) I is in fact the sum of the total variations over (a, b) of all the Oentries in C(x). If b is such that (1 1.8.9)
the process clearly converges, by (11.8.8). If (11.8.9) does not hold, we may replace b by some lower b′ such that

  ∫_a^{b′} |dC(t)| < 1,

since C(x) is right-continuous; we may therefore continue the solution over (a, b′) for some b′ > a. The process may then be repeated, continuing the solutions of (11.8.4) over (b′, b″) for some b″ > b′ as solutions of

  y(x) = y(b′) + ∫_{b′}^x dC(t) y(t).
If C(x) is continuous over (a, b), we may extend the solution of (11.8.4) by this means to the whole of (a, b) by a finite number of steps of this type. Even if C(x) has discontinuities, this may still be done if they are limited in size; writing

  Δ_x C = C(x) − C(x − 0),    (11.8.10)

it will be sufficient that

  |Δ_x C| < ½.    (11.8.11)
If finally C(x) has discontinuities not satisfying (11.8.11), they will be finite in number, since C(x) is of bounded variation, and they may be treated individually; (11.8.4) specifies that at a discontinuity of C(x) there is to be a discontinuity in y(x) specified by

  Δ_x y = Δ_x C y(x − 0).    (11.8.12)
We mention briefly the alternative finite-difference approach to (11.8.4). Denoting by {ξ_r}, as before, a subdivision of (a, x), it is plausible to replace (11.8.4) by the sequence of recurrence relations

  y†(ξ_{r+1}) = y†(ξ_r) + {C(ξ_{r+1}) − C(ξ_r)} y†(ξ_r),    (11.8.13)

starting with y†(ξ_0) = y†(a) = y(a). These relations may be solved explicitly in the form, with ξ_n = x,

  y†(x) = Π_{r=0}^{n−1} {E + C(ξ_{r+1}) − C(ξ_r)} y(a),

where E is the unit matrix and the factors are written from left to right in descending order of r. Making n → ∞, subject of course to (11.8.6), we reach a solution in the form

  y(x) = Y(x, a) y(a),    (11.8.14)

where Y(x, a), the analog of the fundamental matrix solution of the differential equation, is given by

  Y(x, a) = lim Π_{r=0}^{n−1} {E + C(ξ_{r+1}) − C(ξ_r)},    (11.8.15)

where the factors are written from left to right in the order n − 1, ..., 0. The expression (11.8.15) forms a multiplicative or product integral; we refer elsewhere for the proof that it has a unique sense, independently of the choice of the sequence of subdivisions.
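The limiting process just described is easy to exhibit numerically. The following sketch uses hypothetical data: a 2×2 matrix C(x) on (0, 1) with an absolutely continuous part A·x and a single jump D at x = ½, the jump being assigned right-continuously. It forms the finite product of (11.8.13) on two uniform subdivisions; the refinements agree closely, illustrating that the product integral (11.8.15) has a unique sense.

```python
import numpy as np

# A hypothetical 2x2 C(x) on (0, 1): an absolutely continuous part A*x
# plus a single jump D at x = 0.5, so dC = A dx + D d(step at 0.5).
A = np.array([[0.0, 0.3], [-0.2, 0.1]])
D = np.array([[0.1, 0.0], [0.05, -0.1]])

def increments(n):
    # the increments C(xi_{r+1}) - C(xi_r) on a uniform subdivision of (0, 1)
    xs = np.linspace(0.0, 1.0, n + 1)
    for r in range(n):
        dC = A * (xs[r+1] - xs[r])
        if xs[r] < 0.5 <= xs[r+1]:        # the jump, assigned right-continuously
            dC = dC + D
        yield dC

def product_integral(n):
    # the finite product of (11.8.13); each new factor multiplies on the left,
    # so the factors stand in descending order of r as in (11.8.15)
    E = np.eye(2)
    Y = E.copy()
    for dC in increments(n):
        Y = (E + dC) @ Y
    return Y

Y1, Y2 = product_integral(2000), product_integral(4000)
print(np.abs(Y1 - Y2).max())              # successive refinements agree closely
```

The same limit is obtained however the subdivision points are chosen, provided only that the mesh condition (11.8.6) holds.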
11.9. A Special Case

For the integral equation (11.8.4) we adopted, for the event of C(x) being discontinuous, a special interpretation (11.8.5) of the Riemann-Stieltjes integral; in the approximating finite sums, the integrand y(x) was to be reckoned at the lower end of the relevant interval (ξ_r, ξ_{r+1}) of the subdivision {ξ_r}. In the standard Riemann-Stieltjes integral the integrand is reckoned at an arbitrary point of the closed interval of the subdivision. If, however, we define, in place of (11.8.5),

  ∫_a^x dC(t) y(t) = lim Σ_{r=0}^{n−1} {C(ξ_{r+1}) − C(ξ_r)} y(ξ_{r+1}),    (11.9.1)
the integral equation (11.8.4) need not have the same solution as previously, or indeed any solution at all. The two versions of the integral equation differ only in their treatment of discontinuities in C(x). Denoting as in (11.8.10) a jump in C(x) by Δ_x C, the interpretations (11.8.5), (11.9.1) imply that the jump in y(x) at the same point is given, respectively, by

  y(x) − y(x − 0) = Δ_x C y(x − 0),

and

  y(x) − y(x − 0) = Δ_x C y(x).
These give, respectively,

  y(x) = (E + Δ_x C) y(x − 0),    y(x) = (E − Δ_x C)^{−1} y(x − 0),    (11.9.2-3)

which are, of course, not always the same. We make here the remark that (11.9.2-3) are the same if

  (Δ_x C)² = 0;    (11.9.4)
this happens to be the case for self-adjoint boundary problems. A consequence of this is that the discontinuity represented by (11.9.2-3) may be achieved by means of a differential equation. Writing ΔC for the discontinuity in C(x), where (ΔC)² = 0, we consider the differential equation

  du/dt = (ΔC) u,    0 ≤ t ≤ 1,    u(0) = y(x − 0).

The solution of this equation is

  u(t) = (E + t ΔC) u(0) = (E + t ΔC) y(x − 0),
so that

  u(1) = y(x),

by (11.9.2). We have of course, on integration,

  u(t) − u(0) = ∫_0^t (ΔC) u(s) ds,    0 ≤ t ≤ 1.

It follows that we may, subject to all jumps of C(x) satisfying (11.9.4), replace (11.8.4) by a similar equation in which C(t) is replaced by a continuous function.
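The coincidence of (11.9.2) and (11.9.3) under (11.9.4) can be checked directly: when (Δ_x C)² = 0 we have (E − Δ_x C)^{−1} = E + Δ_x C, since (E − Δ_x C)(E + Δ_x C) = E − (Δ_x C)². A minimal sketch, with a hypothetical nilpotent 2×2 jump:

```python
import numpy as np

E = np.eye(2)
dC = np.array([[2.0, -4.0], [1.0, -2.0]])    # hypothetical jump with (dC)^2 = 0

assert np.allclose(dC @ dC, 0)               # nilpotent, as in (11.9.4)

y_minus = np.array([1.0, 2.0])               # y(x - 0)
y_lower = (E + dC) @ y_minus                 # the lower-end rule (11.9.2)
y_upper = np.linalg.solve(E - dC, y_minus)   # the upper-end rule (11.9.3)
print(np.allclose(y_lower, y_upper))         # True: the two rules agree
```

For a jump with (Δ_x C)² ≠ 0 the two rules give genuinely different solutions, which is why the special interpretation (11.8.5) was adopted.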
11.10. The Boundary Problem

Omitting for the present the parameter λ from the differential equation of Section 9.1 we have

  J y′(x) = B(x) y(x),    a ≤ x ≤ b,    (11.10.1)

where J is skew-Hermitean and B Hermitean. Integrating over (a, x), and writing

  B₁(x) = ∫_a^x B(t) dt,    (11.10.2)
we obtain an integral equation

  J{y(x) − y(a)} = ∫_a^x dB₁(t) y(t).    (11.10.3)

In this generalized setting we may now consider the equation on the basis that y, B₁ are of bounded variation and right-continuous, B₁ being Hermitean, and the integral in (11.10.3) being interpreted in a sense similar to (11.8.5). We cannot, however, allow B₁(x) to have arbitrary discontinuities. A basic ingredient in the boundary problem is that there should hold a Lagrange identity, according to which for any other solution of an equation of the form of (11.10.3), say z(x), where

  J{z(x) − z(a)} = ∫_a^x dB₁(t) z(t),    (11.10.4)

it should be the case that

  z*(x) J y(x) = const.    (11.10.5)
At points at which B₁(x) is continuous, this is ensured by the requirement that B₁ be Hermitean, and J skew-Hermitean. Suppose now that B₁(x) has a jump ΔB₁ at some point x. Then by (11.10.3)

  J{y(x) − y(x − 0)} = ΔB₁ y(x − 0),

and similarly for z. Hence

  z*(x) J y(x) − z*(x − 0) J y(x − 0)
    = z*(x − 0) (E − ΔB₁ J^{−1}) J (E + J^{−1} ΔB₁) y(x − 0) − z*(x − 0) J y(x − 0)
    = z*(x − 0) (−ΔB₁ J^{−1} ΔB₁) y(x − 0).

Hence the discontinuity ΔB₁ must, in addition to being Hermitean, be such that ΔB₁ J^{−1} ΔB₁ = 0, or such that (J^{−1} ΔB₁)² = 0. This is the situation (11.9.4), so that (11.10.3) will also hold with the same y(x) with the integral as a Riemann-Stieltjes integral in the ordinary sense. Alternatively, the discontinuity may be replaced by a differential equation over an interval, which was essentially the method adopted in Chapter 8. For a boundary problem with parameter we introduce the matrix function A₁(x), being nondecreasing, Hermitean, right-continuous, and bounded, and so of bounded variation. The problem then takes the form of (9.1.20), the boundary conditions being as in Section 9.2. The discontinuities of A₁, B₁ at any point must satisfy

  (λ ΔA₁ + ΔB₁) J^{−1} (λ ΔA₁ + ΔB₁) = 0

for any real λ, and we derive conditions similar to those of Section 3.3. As mentioned at the end of Section 11.9, we may replace these discontinuities by continuous sections, so that there is no loss of generality in assuming that A₁(x), B₁(x) are continuous. Naturally, this has no bearing on the interposition of discontinuities of the form y(x) = (λA + B)(λC + D)^{−1} y(x − 0). Subject to these assumptions, the work of Chapter 9 extends for the most part to the integral equation (9.1.20), the principal change being the replacement of the differential A(x) dx by the Stieltjes differential dA₁(x).
CHAPTER 12
Asymptotic Theory of Some Integral Equations
12.1. Asymptotically Trigonometric Behavior

In this chapter, as in the last, we have to deal with results which are classically in the field of second-order linear differential equations, but which find a very natural and perhaps maximal extension in the context of integral equations. Whereas in Chapter 11 we dealt with existence, uniqueness, and analytic dependence, we extend here some more special investigations, relating to the asymptotic behavior for large x or for large λ of the solutions of the differential equations y″ + [1 + λg(x)] y = 0 and y″ + [λ + g(x)] y = 0. The connection with the theory of boundary problems, though more profound than we can indicate here, will be apparent from the possibility of proving the eigenfunction expansion by these means; we give some cases of this procedure in Section 12.10. One of the oldest problems of asymptotic behavior relates to the comparison of the solutions of

  y″ + {1 + g(x)} y = 0,    (12.1.1)
  y″ + y = 0,    (12.1.2)

the problem being to give conditions on g(x) which ensure that as x → ∞ each solution of one is approximated by some solution of the other. Thus for any solution of (12.1.1) there must be a pair of constants a₁, b₁, not both zero for a nontrivial solution, such that

  y(x) − a₁ cos x − b₁ sin x → 0,    (12.1.3)

or constants a₂, b₂ such that

  y(x) − a₂ exp(ix) − b₂ exp(−ix) → 0,    (12.1.4)

or, in polar coordinate form, constants r, φ with

  y(x) − r sin(x + φ) → 0.    (12.1.5)
Sufficient conditions for this state of affairs are that g(x) should be absolutely integrable over (0, ∞), that is,

  ∫_0^∞ |g(x)| dx < ∞,    (12.1.6)
together with the continuity, or piecewise continuity, of g(x), or at least Lebesgue integrability over (0, b) for any finite b. The choice of lower limit in the integral in (12.1.6) is, of course, arbitrary. In particular, it is sufficient that g(x) = O(x^{−2}) as x → ∞. To illustrate the extension of theorems on differential equations to integral equations of the type considered in Chapter 11 we shall establish this result in a more general setting. The type of extension primarily envisaged is that y should satisfy the differential equation (12.1.1) save for certain points ξ_n, at which y′ makes a jump proportional to y, according to

  y′(ξ_n + 0) − y′(ξ_n − 0) + γ_n y(ξ_n) = 0,    (12.1.7)

the constants γ_n satisfying

  Σ_n |γ_n| < ∞.    (12.1.8)
One way of expressing this latter situation will be to incorporate in the function g(x) of (12.1.1) a series of delta functions Σ_n γ_n δ(x − ξ_n). A more general approach will, however, be to replace the differential equation (12.1.1) by an integro-differential or integral equation with Stieltjes integrals. Writing

  σ(x) = ∫_0^x g(t) dt,    (12.1.9)

and integrating (12.1.1) over an arbitrary finite interval (x₁, x₂) on the positive real axis we obtain

  y′(x₂) − y′(x₁) + ∫_{x₁}^{x₂} y(t) {dt + dσ(t)} = 0.    (12.1.10)

As in Chapter 11, we now observe that this is intelligible on the basis that σ(x) is of bounded variation and also right-continuous. Taking x₁ = 0 and integrating with respect to x₂ over (0, x) we obtain, as in Section 11.1,

  y(x) = y(0) + x y′(0) − ∫_0^x (x − t) y(t) {dt + dσ(t)}.    (12.1.11)
This is to be interpreted on the basis that σ(t) is of bounded variation and right-continuous, a solution being sought in the class of continuous functions, y(0) and y′(0) being arbitrary constants. If σ(x) is absolutely continuous we have essentially the situation of (12.1.1); if it is the sum of an absolutely continuous function and a jump function, with jumps γ_n at the points ξ_n, we have the case of (12.1.1) punctuated by discontinuities (12.1.7). Using (12.1.11) in preference to (12.1.10), we suppose given σ(x), of bounded variation over every finite interval of the real axis, and right-continuous; in case σ(x) is complex-valued, we interpret this as meaning that its real and imaginary parts are of bounded variation separately. We seek additional conditions on σ(x) which ensure that for arbitrary constants c₁, c₂, the unique continuous solution of

  y(x) = c₁ + c₂ x − ∫_0^x (x − t) y(t) {dt + dσ(t)}    (12.1.12)
should admit asymptotic approximation as x → ∞ in one of the forms (12.1.3-5). Regarding these three forms of asymptotic expression for y(x), it is clear that if (12.1.3) is possible, then so is (12.1.4) and, conversely, with the relations

  a₂ = ½(a₁ − ib₁),    b₂ = ½(a₁ + ib₁).
If σ(x) is real-valued, so that without loss of generality we may confine ourselves to real-valued y(x), we may pass from (12.1.3) to (12.1.5). The case in which σ(x) is complex-valued is, however, of some interest, for example, in connection with scattering theory and inverse problems, and here it may happen that (12.1.3-4) are available but not (12.1.5). For if (12.1.4-5) hold, then r, φ and a₂, b₂ are related by r exp(iφ) = 2i a₂ and r exp(−iφ) = −2i b₂; if one of a₂, b₂ vanishes, but not the other, these relations are impossible. For the general result we use the form (12.1.4), proving
Theorem 12.1.1. Let σ(x), 0 ≤ x < ∞, be of bounded variation over the whole semiaxis 0 ≤ x < ∞, and be right-continuous. Then for any c₁, c₂, not both zero, there are constants a₂, b₂, not both zero, such that the approximation (12.1.4) holds as x → ∞.

For the proof we use the method of variation of parameters, defining functions p(x), q(x), for given c₁ and c₂ and solution y(x) of (12.1.12), by

  y = (2i)^{−1}(p e^{ix} − q e^{−ix}),    y′ = 2^{−1}(p e^{ix} + q e^{−ix}),    (12.1.13-14)
that is to say, by

  p = e^{−ix}(y′ + iy),    q = e^{ix}(y′ − iy).    (12.1.15-16)
Here y′ denotes, as usual, the right-derivative of y. Since y, y′ are of bounded variation, so also are p and q; p and q will have a discontinuity when y′ does, that is to say, when σ(x) has a discontinuity and y is not zero, and otherwise are continuous. We first set up integral equations for p and q. For any a, b, 0 ≤ a < b, we have

  [p]_a^b = ∫_a^b dp = ∫_a^b e^{−ix}{dy′ + i dy − i(y′ + iy) dx}.
We have, however, dy = y′ dx, and, by (12.1.12), or its differentiated form (12.1.10),

  dy′ = −y {dx + dσ(x)};

similar manipulations were justified in detail in Section 11.3, the present case being in fact the special case α(x) = const. Hence

  [p]_a^b = −∫_a^b e^{−ix} y dσ = −(2i)^{−1} ∫_a^b {p(x) − q(x) e^{−2ix}} dσ(x),    (12.1.17)

and similarly,

  [q]_a^b = −∫_a^b e^{ix} y dσ = −(2i)^{−1} ∫_a^b {p(x) e^{2ix} − q(x)} dσ(x).    (12.1.18)
We first prove that p, q are bounded. We take a so large that

  ∫_a^∞ |dσ(x)| < ¼,    (12.1.19)

and write

  c = max {|p(a)|, |q(a)|}.    (12.1.20)

Here we note that c > 0, since by Theorem 11.2.5, and since c₁, c₂ do not both vanish, y and y′ do not vanish together, so that p and q do not vanish together. Assuming if possible that p, q are not both bounded as x → ∞, we define b as the greatest number > a such that

  |p(x)|, |q(x)| < 2c    for    a ≤ x < b.    (12.1.21)
Here we note that, by (12.1.17-18), p and q are right-continuous, so that such a b > a necessarily exists. We note also that

  max {|p(b)|, |q(b)|} ≥ 2c;    (12.1.22)

for if |p(b)|, |q(b)| < 2c it would follow from the right-continuity of p and q that b would not be the largest number with the property (12.1.21). We deduce that at least one of the inequalities

  |p(b) − p(a)| ≥ c,    |q(b) − q(a)| ≥ c    (12.1.23)

must be true. In view of (12.1.13) we have from (12.1.21) that |y(x)| ≤ 2c for a ≤ x < b, and by continuity also for x = b. Hence from (12.1.17) we have that

  |p(b) − p(a)| ≤ 2c ∫_a^b |dσ(x)| < ½ c,    (12.1.24)

by (12.1.19), a similar bound applying to q. These conflict with (12.1.23), so that p, q are bounded as x → ∞.
by (12.1.19), a similar bound applying to q. These conflict with (12.1.23), so that p , q are bounded as x + m. The argument jdst given shows that (12.1.21) holds for all b > a. Hence (12.1.24) holds for all b > a, and similarly for q. In view of (12.1.20), we deduce that p(x), q(x) do not both tend to zero as x + 03, at least one of them exceeding in absolute value 8 c . Knowing that p , q, and so also y, are bounded as x -+ m, we deduce from (12.1.17-18) that p , q tend to limits as x -+00, being in fact of bounded variation over (0, w). As just shown, p(m) and q(m) cannot both be zero. Inserting these limiting values of p , q in (12.1.13-14) we obtain an asymptotic formula for y ( x ) as x + of the form (12.1.4), and a similar one for y', to be obtained by formal differentiation. If (12.1.17) be taken over (0, w) we get p(m) -~ ( 0 = ) -
J
m
e - i z y ( x ) du(x).
0
Evaluating p(0) from (12.1.15) we have p ( m ) = y'(0)
+ iy(0) - 1: e-"y(x)
du(x).
(12.1.25)
Similarly, (12.1.26)
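Theorem 12.1.1 can be illustrated numerically in the punctuated case (12.1.7): between the jump points ξ_n the solution of y″ + y = 0 is trigonometric and may be propagated in closed form, while y′ jumps by −γ_n y(ξ_n) at each ξ_n. The sketch below, with hypothetical jump data satisfying (12.1.8), tracks p = e^{−ix}(y′ + iy); p is constant on the smooth stretches, so sampling it just after each jump captures its whole variation, and the values settle toward the limit p(∞).

```python
import numpy as np

# Hypothetical jump points and absolutely summable jump constants (12.1.8).
xi = np.arange(1.0, 200.0, 1.0)          # xi_n = 1, 2, ..., 199
gamma = 0.5 / xi**2                      # sum |gamma_n| < infinity

y, yp, x = 0.0, 1.0, 0.0                 # initial data y(0) = 0, y'(0) = 1
ps = []
for xn, g in zip(xi, gamma):
    dt = xn - x
    # exact propagation through the smooth stretch, where y'' + y = 0
    y, yp = y*np.cos(dt) + yp*np.sin(dt), -y*np.sin(dt) + yp*np.cos(dt)
    yp -= g * y                          # the jump (12.1.7)
    x = xn
    ps.append(np.exp(-1j*x) * (yp + 1j*y))   # p from (12.1.15)

# p tends to a limit: late values differ very little
print(abs(ps[-1] - ps[-20]))
```

The tail of the variation of p is controlled by the tail of Σ|γ_n|, exactly as in the boundedness argument above.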
12.2. The S-Function
With a view to boundary problems of Sturm-Liouville type we introduce a parameter into the equation (12.1.1), or more generally (12.1.10-12), and impose fixed initial conditions at x = 0. Taking the most commonly studied case, we take the initial conditions to be

  y(0, λ) = 0,    y′(0, λ) = 1,    (12.2.1-2)

where y(t, λ) satisfies the integro-differential equation

  y′(x₂, λ) − y′(x₁, λ) + ∫_{x₁}^{x₂} y(t, λ) {dt + λ dσ(t)} = 0,    (12.2.3)

for arbitrary x₁, x₂ with 0 ≤ x₁ < x₂ < ∞. As previously, (12.2.1-3) are incorporated in the integral equation

  y(x, λ) = x − ∫_0^x (x − t) y(t, λ) {dt + λ dσ(t)},    (12.2.4)

and we may define y(t, λ) to be the unique continuous solution of (12.2.4). We shall now assume σ(t) to be real-valued, nondecreasing, right-continuous, and bounded over the whole semiaxis (0, ∞), so that σ(∞) exists and is finite. Asymptotic formulas for y(x, λ), in which either or both of x and λ is large and positive, are obviously of importance in determining the asymptotic character of the eigenvalues, or the nature of the spectrum. However, it turns out to be important to consider also asymptotic behavior when λ varies generally in the complex plane, which we do in this section. Since in Theorem 12.1.1 σ(x) was allowed to be complex-valued, the results and definitions of Section 12.1 may be taken over with λσ(x) in place of σ(x). For the solution y(x, λ) of (12.2.4) we define as before functions p(x, λ), q(x, λ) by p = e^{−ix}(y′ + iy), q = e^{ix}(y′ − iy). Here y′ is the right-derivative of y, and p, q are right-continuous. From (12.2.1-2) we have now that p(0, λ) = q(0, λ) = 1, whence, by (12.1.17-18),

  p(x, λ) = 1 − (2i)^{−1} λ ∫_0^x {p(t, λ) − q(t, λ) e^{−2it}} dσ(t),    (12.2.5)

and

  q(x, λ) = 1 − (2i)^{−1} λ ∫_0^x {p(t, λ) e^{2it} − q(t, λ)} dσ(t).    (12.2.6)
Since, by Theorem 12.1.1, y(t, λ) is bounded as t → ∞, these formulas hold also in a limiting sense with x = ∞. Furthermore, p(x, λ) and q(x, λ) will for any fixed x be entire functions of λ, since y(t, λ) is analytic in λ, as shown in Section 11.5. This too is true also when x = ∞; a slight modification of the argument of Section 12.1 shows that p(x, λ), q(x, λ), and so also y(x, λ), are bounded uniformly for all real x and in any bounded λ-region, so that the integrals in (12.2.5-6) converge uniformly, and so converge to analytic functions. We give later some specific bounds for p, q, and y. The entire functions p(∞, λ) and q(∞, λ) therefore admit the representations (12.2.5-6) with x = ∞. In terms of the asymptotic form of y(x, λ) for large x and fixed λ we have

  y(x, λ) = (2i)^{−1} {p(∞, λ) e^{ix} − q(∞, λ) e^{−ix}} + o(1),    (12.2.7)

with a similar asymptotic formula for y′(x, λ), to be obtained by formal differentiation with respect to x. At least one of p(∞, λ), q(∞, λ) is not zero, by Theorem 12.1.1. Since p(x, λ), q(x, λ), for fixed x with 0 ≤ x ≤ ∞, have no common zeros, and have each at most a denumerable number of zeros, the “S-function” S(x, λ) will be well defined by the formula

  S(x, λ) = p(x, λ)/q(x, λ),    (12.2.8)

except for zeros of q(x, λ), if any, which will be its poles. The zeros of p(x, λ) will coincide with the zeros of S(x, λ). Since p(x, λ) and q(x, λ) are entire, S(x, λ) will be meromorphic. In terms of y(x, λ) and y′(x, λ), the definition (12.2.8) is equivalent to

  S(x, λ) = e^{−2ix} (y′ + iy)/(y′ − iy);    (12.2.9)

in this form an affinity is apparent with the functions defined in (8.4.6) and (10.2.17). In particular, S(0, λ) = 1, and |S(x, λ)| = 1 when λ is real. We clarify further the analytic character of S(x, λ) in
Theorem 12.2.1. Let σ(x), 0 ≤ x ≤ ∞, be bounded, nondecreasing, and right-continuous. Then for S(x, λ), for fixed x and varying λ, not to be a constant in λ it is necessary and sufficient that

  ∫_0^x sin² t dσ(t) > 0.    (12.2.10)
If (12.2.10) holds, then

  |S(x, λ)| < 1  for  Im λ > 0,    |S(x, λ)| > 1  for  Im λ < 0.    (12.2.11-12)
We start with the identity, of Lagrange type,

  y′(x, λ) y(x, μ) − y(x, λ) y′(x, μ) = (μ − λ) ∫_0^x y(t, λ) y(t, μ) dσ(t);    (12.2.13)

this is a case of (11.3.5). Taking in particular μ = λ̄, and writing y for y(x, λ), we have

  y′ ȳ − y ȳ′ = −2i {Im λ} ∫_0^x |y(t, λ)|² dσ(t).    (12.2.14)

We deduce that

  |p|² − |q|² = −4 {Im λ} ∫_0^x |y(t, λ)|² dσ(t),    (12.2.15)

where p = p(x, λ), q = q(x, λ). For the left of (12.2.15) is

  (y′ + iy)(ȳ′ − iȳ) − (y′ − iy)(ȳ′ + iȳ) = −2i (y′ ȳ − y ȳ′),

yielding (12.2.15) from (12.2.14). It follows from (12.2.15) that if Im λ > 0, then |p|² ≤ |q|²; from this it follows that q ≠ 0 for Im λ > 0, for otherwise we should also have p = 0, whereas p and q have no common zeros. Hence, for Im λ > 0, S(x, λ) is regular and satisfies |S(x, λ)| ≤ 1. Similarly, if Im λ < 0, S(x, λ) is regular except for poles, has no zeros, and satisfies |S(x, λ)| ≥ 1. To investigate whether or not S(x, λ) is constant as λ varies we consider its derivative at λ = 0. Indicating ∂/∂λ by a suffix λ we have, of course,

  S_λ(x, λ) = (p_λ q − p q_λ)/q²,

where p_λ = ∂p/∂λ, q_λ = ∂q/∂λ. To evaluate the latter, we divide (12.2.13) by (μ − λ) and make μ → λ, getting

  y′(x, λ) y_λ(x, λ) − y(x, λ) y′_λ(x, λ) = ∫_0^x y²(t, λ) dσ(t).    (12.2.16)
Hence, if q ≠ 0,

  S_λ(x, λ) = 2i q^{−2} ∫_0^x y²(t, λ) dσ(t),    (12.2.17)

a result allied to (10.2.30-32). In particular, with λ = 0, q(x, 0) = 1, S(x, 0) = 1 and y(t, 0) = sin t, we have

  S_λ(x, 0) = 2i ∫_0^x sin² t dσ(t).

Hence near λ = 0 we have a Taylor development

  S(x, λ) = 1 + 2iλ ∫_0^x sin² t dσ(t) + Σ_{n=2}^∞ c_n λⁿ.    (12.2.18)
From this we see at once that the condition (12.2.10) is sufficient for S(x, λ) not to be a constant. As to the necessity of the condition, suppose it not to hold, that is, that

  ∫_0^x sin² t dσ(t) = 0;    (12.2.19)

we propose to deduce that S(x, λ) is constant. If it is not constant, then let c_m be the first nonvanishing coefficient in the Taylor development (12.2.18). Then, near λ = 0, S(x, λ) has the form 1 + c_m λ^m + O(λ^{m+1}) for some m ≥ 2, c_m ≠ 0. However, it is easily seen then that we cannot have |S(x, λ)| ≤ 1 for all λ with Im λ > 0, contrary to what was proved earlier. Hence all the c_n must vanish, and S(x, λ) is constant. To complete the proof of Theorem 12.2.1 we suppose that the “definiteness” condition (12.2.10) holds, and deduce (12.2.11-12), in other words, that λ → S(x, λ) maps the upper and lower λ-halfplanes into the interior and exterior, in the strict sense, of the unit circle. We proved above that |S(x, λ)| ≤ 1 for all λ with Im λ > 0. If now for some such λ we had |S(x, λ)| = 1, it would follow from the maximum modulus principle that S(x, λ) is constant, contrary to (12.2.10). A similar argument shows that if Im λ < 0, then |S(x, λ)| > 1. In the course of the proof of the last theorem we located the zeros and poles of S(x, λ), in relation to the sign of Im λ.
Theorem 12.2.2. Under the assumptions of Theorem 12.2.1, with (12.2.10), the zeros of S(x, λ) lie in Im λ > 0, its poles at the complex conjugate points in Im λ < 0.
We have included (12.2.10) as necessary for the existence of any zeros or poles, for otherwise S(x, λ) = 1; that (12.2.10) is sufficient for the existence of at least one pole and one zero will be noted in the next section. For the proof of Theorem 12.2.2, we recall that p(x, λ) can vanish only if Im λ > 0, and q(x, λ) only if Im λ < 0, in view of (12.2.15) and the fact that p and q cannot vanish together. That the zeros and poles of S(x, λ) are located at complex conjugate points follows from the fact that |S(x, λ)| = 1 when λ is real, together with the Schwarz reflection principle. More directly, we see from (12.2.5-6) that

  p(x, λ̄) = q̄(x, λ),    q(x, λ̄) = p̄(x, λ),    (12.2.20)

the bar denoting the complex conjugate, so that the zeros of p and q are located at complex conjugate points, proving the result. As already indicated, the condition (12.2.10) for the S-function not to be a constant has a parallel in such conditions as (8.2.1) or (9.1.6), that a nontrivial solution of the differential equation should not be of zero mean-square. In the present case we prove
Theorem 12.2.3. If the assumptions of Theorem 12.2.1 hold, regarding σ(x), and if, for some x > 0,

  ∫_0^x y²(t, λ) dσ(t) = 0    (12.2.21)

holds for one real λ, then it holds for all real λ.

For (12.2.21) implies that ∂S(x, λ)/∂λ = 0, by (12.2.17), for the real λ in question. As we showed in the special case λ = 0, this is incompatible with the property that |S(x, λ)| ≤ 1 for Im λ > 0, unless S(x, λ) is a constant. In the latter event ∂S(x, λ)/∂λ = 0 for all real λ, so that (12.2.21) follows from (12.2.17).
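Theorem 12.2.1 may be illustrated for the simplest nonconstant case: σ(x) a single step of height γ at ξ, a hypothetical choice for which (12.2.10) holds whenever sin ξ ≠ 0. The equation is then y″ + y = 0 except that y′ jumps by −λγ y(ξ) at ξ, so the solution can be propagated in closed form and S(x, λ) formed from (12.1.15-16) and (12.2.8); the sketch exhibits |S| < 1 for Im λ > 0, |S| = 1 for real λ, and |S| > 1 for Im λ < 0.

```python
import cmath, math

def S(lam, xi=1.0, gamma=1.0, x=2.0):
    # sigma is a single step of height gamma at t = xi (hypothetical data), so
    # (12.2.3) reads y'' + y = 0 apart from the jump in y' at xi.
    y, yp = math.sin(xi), math.cos(xi)        # y(0)=0, y'(0)=1 propagated to xi
    yp = yp - lam * gamma * y                 # the jump in y'
    dt = x - xi                               # smooth stretch from xi to x
    y, yp = y*math.cos(dt) + yp*math.sin(dt), -y*math.sin(dt) + yp*math.cos(dt)
    p = cmath.exp(-1j*x) * (yp + 1j*y)        # (12.1.15)
    q = cmath.exp( 1j*x) * (yp - 1j*y)        # (12.1.16)
    return p / q                              # (12.2.8)

print(abs(S(1j)))      # < 1, since Im lambda > 0   (12.2.11)
print(abs(S(0.7)))     # = 1 for real lambda
print(abs(S(-1j)))     # > 1, since Im lambda < 0   (12.2.12)
```

For real λ the solution is real, so q is the conjugate of p and |S| = 1 exactly; the strict inequalities off the real axis reflect (12.2.15).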
12.3. A Non-Self-Adjoint Problem

The zeros and poles of S(x, λ) are obviously of importance in any discussion of its functional character. Here we observe that these zeros and poles are eigenvalues of certain boundary problems which fall outside the usual Sturm-Liouville conditions. There are several equivalent formulations. Since S(x, λ) = exp(−2ix)(y′ + iy)(y′ − iy)^{−1}, it is clear that the
zeros of S(x, λ) are the λ-values for which (12.2.3) has a nontrivial solution such that

  y(0, λ) = 0,    y′(x, λ) + i y(x, λ) = 0.    (12.3.1)

Its poles, of course, will be the eigenvalues of the problem

  y(0, λ) = 0,    y′(x, λ) − i y(x, λ) = 0.    (12.3.2)

If x is infinite, the zeros and poles of S(∞, λ), which is still meromorphic in λ, will be the eigenvalues of

  y(0, λ) = 0,    y′(x, λ) + i y(x, λ) → 0  as  x → ∞,    (12.3.3)

  y(0, λ) = 0,    y′(x, λ) − i y(x, λ) → 0  as  x → ∞.    (12.3.4)
The arguments which prove that the eigenvalues of Sturm-Liouville problems are all real serve in this case to show that the eigenvalues of (12.3.1) or (12.3.3) lie in the upper half-plane, and those of (12.3.2) or (12.3.4) in the lower half-plane. These statements follow from Theorem 12.2.2, which was proved by reasoning of the same character. For a second interpretation of these boundary problems we consider the expression of y and y′ in polar form. If λ is real, we may define functions r(x, λ), φ(x, λ) by

  y = r sin(x + φ),    y′ = r cos(x + φ).    (12.3.5-6)
In view of (12.1.13-16) we have also

  p = r exp(iφ),    q = r exp(−iφ),    (12.3.7-8)

or again

  r = (pq)^{1/2},    φ = ½ arg(p/q),    (12.3.9-10)

so that

  φ = (2i)^{−1} log S.    (12.3.11)
The definitions of r and φ for fixed x and varying real λ may be extended into the complex λ-plane, with the reservation that the zeros of p and q will, in general, be branch points of r and φ, according to (12.3.9-10). In the case x = ∞ we may replace (12.3.5-6) by the single asymptotic relation

  y(x, λ) − r(∞, λ) sin {x + φ(∞, λ)} → 0    (12.3.12)

as x → ∞; to complete the definition of r(∞, λ) and φ(∞, λ) we must specify that r(∞, λ) > 0 and that φ(∞, λ) is continuous for real λ, with φ(∞, 0) = 0. The question of the location of the branch points of
r(∞, λ) and φ(∞, λ), for the standard case of a differential equation, was raised by Bellman and discussed by Fort and by Levinson and Kemp, in the latter case by analysis similar to that of Sections 12.1-2. The third interpretation of the boundary problems associated with the zeros and poles of S is most simply expressed in the case x = ∞, and corresponds more closely with the notion of “scattering.” It is an easy consequence of Theorem 12.1.1 that the integro-differential equation
  z′(x₂) − z′(x₁) + ∫_{x₁}^{x₂} z(t) {dt + λ dσ(t)} = 0,    (12.3.13)

or the integral equation

  z(x) = z(0) + x z′(0) − ∫_0^x (x − t) z(t) {dt + λ dσ(t)},    (12.3.14)

has a pair of solutions of the asymptotic form, for fixed λ, real or complex, and large positive x,

  z₁(x) = e^{ix} + o(1),    z₂(x) = e^{−ix} + o(1).    (12.3.15-16)
The problem is now posed of finding S such that the solution of (12.3.14) given by

  z = S z₁ − z₂,    (12.3.17)

where S is independent of x, satisfies the initial condition z(0) = 0. On comparison with (12.2.7), and recalling that y(x, λ) is a solution of (12.3.14) with y(0, λ) = 0, we have S = S(∞, λ), provided that the latter is finite. Thus the poles of S(∞, λ) are the λ-values for which the determination of S as above is impossible. We now show that S(x, λ) does indeed have at least one zero and pole, provided that it is not a constant.
Theorem 12.3.1. Let σ(x), 0 ≤ x ≤ ∞, be nondecreasing, bounded, and right-continuous. Then, for x satisfying (12.2.10), S(x, λ) has at least one zero and at least one pole.

It will be sufficient to show that p(x, λ) has at least one zero. If it had no zeros, then as an entire function it would have a logarithm which would be a polynomial, since p(x, λ) is of finite order. If therefore we show that it is of lower than exponential order, more precisely, that

  |p(x, λ)| ≤ exp {ω(|λ|)},    (12.3.18)

for some function ω(η) with the property that

  ω(η)/η → 0    as    η → ∞,    (12.3.19)
we can say that if p(x, λ) has no zeros, then it is a constant. It will be sufficient to show that a similar bound applies to y(x, λ) and y′(x, λ), by (12.1.15). We first assume x < ∞. Taking it that |λ| ≥ 1, we write (12.2.3-4) as a pair of simultaneous integral equations. Writing u(t) = |λ|^{1/2} y(t, λ), v(t) = y′(t, λ), we have, for 0 ≤ a < b < ∞,

  u(b) − u(a) = |λ|^{1/2} ∫_a^b v(t) dt,    (12.3.20)

  v(b) − v(a) = −∫_a^b u(t) d{|λ|^{−1/2} t + λ |λ|^{−1/2} σ(t)}.

We deduce that

  |u(b)| + |v(b)| ≤ |u(a)| + |v(a)| + ∫_a^b {|u(t)| + |v(t)|} d{(|λ|^{1/2} + |λ|^{−1/2}) t + |λ|^{1/2} σ(t)},

by (12.3.20). This type of functional inequality is discussed in Appendix IV. Using Theorem IV.4.1, we deduce that

  |u(x)| + |v(x)| ≤ exp {(|λ|^{1/2} + |λ|^{−1/2}) x + |λ|^{1/2} [σ(x) − σ(0)]},

noting that |u(0)| + |v(0)| = 1. Since |λ| ≥ 1, we may simplify the result to

  |y(x, λ)| + |y′(x, λ)| ≤ exp {|λ|^{1/2} [2x + σ(x) − σ(0)]}.    (12.3.21)
Hence y(x, λ), y′(x, λ) for fixed finite x are entire functions of λ of order at most ½, the same being true of p(x, λ). By (12.1.15) we have, in fact,

  |p(x, λ)| ≤ 2 exp {|λ|^{1/2} [2x + σ(x) − σ(0)]},    (12.3.22)

the same bound applying to q(x, λ). If therefore p(x, λ) has no zeros, it is a constant; by (12.2.20), q(x, λ) will also be a constant, and so also S(x, λ), in contradiction to (12.2.10). This completes the proof of Theorem 12.3.1 if x is finite. Since the bound (12.3.22) becomes ineffective when x is large, we produce a modified version of the previous argument to deal with large x, combining the two to yield a uniform bound for all x; the process is carried out for a differential equation in Section IV.3 of Appendix IV.
We employ the integral equations (12.1.17-18) which, with λσ(t) for σ(t), yield in this case

  p(b, λ) − p(a, λ) = −(2i)^{−1} λ ∫_a^b {p(t, λ) − q(t, λ) e^{−2it}} dσ(t),    (12.3.23)

  q(b, λ) − q(a, λ) = −(2i)^{−1} λ ∫_a^b {p(t, λ) e^{2it} − q(t, λ)} dσ(t),    (12.3.24)

the expressions in the braces being continuous. We deduce that

  |p(b, λ)| + |q(b, λ)| ≤ |p(a, λ)| + |q(a, λ)| + |λ| ∫_a^b {|p(t, λ)| + |q(t, λ)|} dσ(t),

and so, by Theorem IV.4.1, that

  |p(b, λ)| + |q(b, λ)| ≤ {|p(a, λ)| + |q(a, λ)|} exp {|λ| [σ(b) − σ(a)]}.
If on the right we use the bound (12.3.22), which applies also to q, it follows that, for 0 ≤ a ≤ b and |λ| ≥ 1, we have

  |p(b, λ)| ≤ 4 exp {|λ|^{1/2} [2a + σ(a) − σ(0)] + |λ| [σ(b) − σ(a)]}.

Taking a = |λ|^{1/4} we have then, for b ≥ |λ|^{1/4},

  |p(b, λ)| ≤ 4 exp {2 |λ|^{3/4} + |λ|^{1/2} [σ(|λ|^{1/4}) − σ(0)] + |λ| [σ(b) − σ(|λ|^{1/4})]},

and this is true also for 0 ≤ b ≤ |λ|^{1/4}, by (12.3.22). Simplifying this bound slightly we have, for |λ| ≥ 1 and any x ≥ 0,

  |p(x, λ)| ≤ 4 exp {|λ|^{3/4} [2 + σ(∞) − σ(0)] + |λ| [σ(∞) − σ(|λ|^{1/4})]},    (12.3.25)
and since σ(∞) − σ(|λ|^{1/4}) → 0 as |λ| → ∞, this proves that p(∞, λ) is of lower than exponential order in the sense (12.3.18-19). We may then conclude as before that if it has no zeros it must be a constant, and so also q(∞, λ), S(∞, λ). This completes the proof of Theorem 12.3.1. Putting the result another way, we see that subject to σ(x) being right-continuous, nondecreasing, and bounded over 0 ≤ x ≤ ∞, the condition (12.2.10) is necessary and sufficient for there to be an eigenvalue of the problem (12.3.1) for the x in question, or (12.3.3) if x = ∞. Let now the zeros of S(x, λ), for fixed x, be denoted

  ν_n,    n = 1, 2, ...,    (12.3.26)

for example, in nondecreasing order of absolute value; they may be finite or infinite in number. The poles of S(x, λ) will of course be the
complex conjugate points. By means of factorization theorems we may express S(x, λ) in terms of its zeros and poles, at least for finite x.
Theorem 12.3.2. Under the assumptions of Theorem 12.3.1 there hold for finite x the representations

  p(x, λ) = Π_n (1 − λ/ν_n),    q(x, λ) = Π_n (1 − λ/ν̄_n),    (12.3.27)

and

  S(x, λ) = Π_n (1 − λ/ν_n)(1 − λ/ν̄_n)^{−1}.    (12.3.28)

In addition,

  Σ_n ν_n^{−1} = ∫_0^x e^{−it} sin t dσ(t).    (12.3.29)

For the zeros of S(x, λ) are also those of p(x, λ), and p(x, λ) is an entire function of order at most ½, by (12.3.22). Since also p(x, 0) = 1, we have (12.3.27) as the expression of p(x, λ) as an Hadamard product. The case of q(x, λ) is similar, and follows from (12.2.20); the product representation of S(x, λ) is an immediate consequence. Since p(x, λ) is of order at most ½, we have

  Σ_n |ν_n|^{−1} < ∞,

and so, by logarithmic differentiation of (12.3.27),

  p_λ(x, λ)/p(x, λ) = −Σ_n (ν_n − λ)^{−1},

with absolute convergence, apart from poles. Hence

  p_λ(x, 0) = −Σ_n ν_n^{−1}.

However, we have from (12.2.5) that

  p_λ(x, 0) = −∫_0^x e^{−it} y(t, 0) dσ(t),

where in fact y(t, 0) = sin t. Comparing the two evaluations of ∂p/∂λ for λ = 0 we get (12.3.29).
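For a single-step σ(x), a jump γ at ξ (a hypothetical choice), the function p(x, λ) works out in closed form to 1 − λγ e^{−iξ} sin ξ, so there is exactly one zero ν₁, and both sides of (12.3.29) can be evaluated explicitly. The sketch below confirms that the propagated p vanishes at ν₁, that Im ν₁ > 0 as Theorem 12.2.2 requires, and that (12.3.29) balances.

```python
import cmath, math

xi, gamma, x = 1.0, 0.8, 2.5     # hypothetical single-step sigma: jump gamma at xi

def p(lam):
    # propagate y(0)=0, y'(0)=1 through y'' + y = 0 with the jump in y' at xi,
    # then form p = e^{-ix}(y' + iy) as in (12.1.15)
    y, yp = math.sin(xi), math.cos(xi)
    yp = yp - lam * gamma * y
    dt = x - xi
    y, yp = y*math.cos(dt) + yp*math.sin(dt), -y*math.sin(dt) + yp*math.cos(dt)
    return cmath.exp(-1j*x) * (yp + 1j*y)

# For this sigma, p(x, lam) = 1 - lam*gamma*e^{-i xi} sin(xi): a single zero nu_1.
nu1 = cmath.exp(1j*xi) / (gamma * math.sin(xi))
print(abs(p(nu1)))               # ~ 0: nu_1 is the zero, and Im nu_1 > 0

# (12.3.29): the sum of 1/nu_n equals the integral of e^{-it} sin t d sigma(t)
lhs = 1.0 / nu1
rhs = gamma * cmath.exp(-1j*xi) * math.sin(xi)
print(abs(lhs - rhs))            # ~ 0
```

The imaginary part of the right side of (12.3.29) is −γ sin² ξ ≤ 0, consistent with every 1/ν_n lying in the lower half-plane.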
12.4. The Sturm-Liouville Problem

We shall now study the relation of the S-function to a self-adjoint boundary problem, that given by the integro-differential equation (12.2.3) with the initial condition y(0, λ) = 0 and the terminal condition

y(b, λ) cos β − y'(b, λ) sin β = 0,    (12.4.1)

for some real β and some finite positive b. In other words, defining y(x, λ) as the continuous solution of the integral equation (12.2.4), we impose the condition (12.4.1). We continue to assume σ(x) bounded over 0 ≤ x < ∞, right-continuous, and nondecreasing. We first interpret the eigenvalues λ_n, the roots of (12.4.1), in terms of S(x, λ). It follows from (12.4.1) that (y' + iy)/(y' − iy) = exp (2iβ), where y denotes y(b, λ) and y' denotes y'(b, λ). Since then S(b, λ) = exp (−2ib)(y' + iy)/(y' − iy), it follows from (12.4.1) that

S(b, λ) = exp [2i(β − b)],    (12.4.2)
and conversely (12.4.1) follows from (12.4.2). Hence the λ_n are also the roots of (12.4.2). Subject to the condition (12.4.3), S(b, λ) will be non-constant, of absolute value unity only when λ is real; the roots of (12.4.2) will then be real and form at most a discrete set. One way of looking at this situation is that if the S-function S(b, λ) is known, for the given b, then the eigenvalues are known for any given β. Having defined the eigenvalues, we next define the associated normalization factors, and enquire whether these also can be found from a knowledge of the S-function. Writing

ρ(λ_n) = ∫_0^b y²(t, λ_n) dt,    (12.4.4)

the normalized eigenfunction associated with the eigenvalue λ_n will be y(x, λ_n){ρ(λ_n)}^{-1/2}. The spectral function τ(λ) will be a right-continuous step function with jumps at the λ_n of amount 1/ρ(λ_n), fixed by τ(0) = 0. The question just mentioned is thus whether ρ(λ) can be found if the function S(b, λ) is known. To see that this can be done we prove first the formula

S_λ/S = 2iρ(λ)/r²(b, λ),    (12.4.5)
where S = S(b, λ), S_λ = ∂S/∂λ, r = r(b, λ) is the amplitude of a solution as given in (12.3.5-9), and λ is real, so that r is defined. For (12.2.17) gives

S_λ = 2iS ∫_0^b y²(t, λ) dt / {p(b, λ) q(b, λ)},

which is equivalent to (12.4.5), by (12.2.8) and (12.3.9). Actually (12.4.5) is valid for all λ, except for zeros and poles of S, since r² is a single-valued analytic function of λ. We see from (12.4.5) that if S(b, λ) is known then ρ(λ) will be known if, in addition, we are able to determine r(b, λ). We are thus led to the problem of finding the amplitude function r, given the S-function, essentially a phase function. This is a simple matter. From the factorization (12.3.28) we may construct the factors (12.3.27), with Im ν_n > 0. The functions p(b, λ), q(b, λ) appear then as the solution to the factorization problem

S(b, λ) q(b, λ) = p(b, λ),    (12.4.6)

where S is a given meromorphic function, S(b, 0) = 1, |S(b, λ)| = 1 for real λ, and p, q are to be entire functions, of order in this case at most ½, p being free of zeros in the lower half-plane, while q does not vanish in the upper half-plane. It may then be shown that this problem has a unique solution, subject to q(b, 0) = 1. In our case the solution is given explicitly by (12.3.27-28). Thus if S(b, λ) is known, we can in principle determine p(b, λ), q(b, λ), r(b, λ), and ρ(λ). Thus for any β, the spectral function can be constructed. In a variant of this problem we suppose given the amplitude r(b, λ), for fixed b and all real λ. We seek to recover the functions p(b, λ), q(b, λ), S(b, λ), and ρ(λ); if then any boundary condition at x = b is prescribed in terms of an angle β, the eigenvalues and the spectral function will be known, even though we have not constructed the integral equation. Since {r(b, λ)}² is entire and of order at most ½, it will be determinate if it is fixed for all real λ; indeed, it is not really necessary to know its values for all real λ. Noting also that necessarily r(b, 0) = 1, we may factorize {r(b, λ)}² in the form p(b, λ) q(b, λ), where p has its zeros in the upper half-plane and q has its zeros in the lower half-plane, and in addition p(b, 0) = 1.
Recalling that {r(b, λ)}^{-2} may also appear as a weight function for orthogonality purposes (Section 8.11), an affinity begins to appear with the factorization problem of Section 7.8. Coming more specifically to inverse boundary problems, we consider briefly the analog of what was, in Sections 1.8 and 4.7, termed the second inverse problem. In this we are told the eigenvalues of the
boundary problem for, in this case, two terminal boundary conditions of the form (12.4.1), the initial conditions and integral equation being the same. It is to be anticipated that, in general at least, the knowledge of the eigenvalues for one boundary condition at x = b will not suffice for the determination of the integral equation. Here we confine ourselves to the observation that if two sets of eigenvalues are given, then the functions p(b, λ), q(b, λ) are known, in general, and so for any boundary condition of the same form we may construct the eigenvalues, and indeed the spectral function. The solution is similar to that of Section 1.8. We suppose given two boundary conditions of the form (12.4.1) with β = β_1, β = β_2, the corresponding eigenvalues being

λ_{m,1}, λ_{m,2}, λ_{m,3}, ...,    m = 1, 2,

possibly finite in number. These eigenvalues are the roots of, by (12.4.2),

S(b, λ) = exp {2i(β_m − b)},    m = 1, 2,

that is to say,

exp {−i(β_m − b)} p(b, λ) − exp {i(β_m − b)} q(b, λ) = 0.

Since we know the roots of these equations, we have the identities, for m = 1, 2,

exp {−i(β_m − b)} p(b, λ) − exp {i(β_m − b)} q(b, λ) = −2i sin (β_m − b) ∏_n (1 − λ/λ_{m,n}),    (12.4.7)

provided that sin (β_m − b) ≠ 0; the constant factor on the right was found by comparing constant terms, while in factorizing the right we rely on the assumption that the λ_{m,n} are the eigenvalues of a boundary problem, and so the zeros of an entire function of order at most ½, taking b to be finite. Provided that sin (β_1 − β_2) ≠ 0, the equations (12.4.7) may be solved for p(b, λ), q(b, λ), yielding thereby the functions S(b, λ), r(b, λ), and ρ(λ). Thus, though we have taken no steps towards finding the integral equation, we can at least say that the spectral function of any boundary problem of the form (12.4.1) is determined. Naturally, it remains for investigation how far the eigenvalues of the two boundary problems can be prescribed at will, for example, whether they need only have the separation property and the distribution to be expected of the zeros of entire functions of order ½ or less.
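The determination of the constant factor in (12.4.7) by comparing constant terms can be verified directly: at λ = 0, where p(b, 0) = q(b, 0) = 1, the left side reduces to −2i sin (β_m − b). A small check; the sample values of β and b are arbitrary:

```python
# Check of the constant term in (12.4.7): at lambda = 0, where
# p(b, 0) = q(b, 0) = 1, the left-hand side is
# exp{-i(beta - b)} - exp{i(beta - b)} = -2i sin(beta - b).
# The sample values of beta and b are arbitrary.
import cmath, math

for beta, b in ((0.3, 1.0), (2.0, 0.5), (-1.2, 3.0)):
    lhs = cmath.exp(-1j * (beta - b)) - cmath.exp(1j * (beta - b))
    rhs = -2j * math.sin(beta - b)
    assert abs(lhs - rhs) < 1e-12
print("constant factor -2i sin(beta - b) confirmed")
```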
12.5. Asymptotic Properties for the Generalization of y″ + [k² + g(x)] y = 0

In Sections 12.1-4 we considered in essence the perturbation of the equation y″ + y = 0, the perturbing term λg(x) y depending on a parameter; assuming g(x) absolutely integrable over (0, ∞), the behavior of the solutions was then asymptotically trigonometric as x → ∞, whatever the value of λ. The situation is quite different in the case of y″ + [λ + g(x)] y = 0, where g(x) is again integrable at ∞. The comparison equation is now y″ + λy = 0, for which the behavior of the solutions as x → ∞ varies radically according to whether λ is positive or negative, real or complex. There are obvious advantages in replacing λ by k². Generalizing this equation as before to allow g(x) to have singularities of delta function type, we write σ(x) for its integral, and arrive first at the integro-differential equation

y'(x) − y'(0) = −∫_0^x y(t) d{k²t + σ(t)},    (12.5.1)

with variations and integrals over any non-negative interval, or the integral equation

y(x) = c_1 + c_2 x − ∫_0^x (x − t) y(t) d{k²t + σ(t)},    (12.5.2)

where in fact c_1 = y(0), c_2 = y'(0). We assume once more that σ(x) is right-continuous and of bounded variation over (0, ∞), while y(x) is a continuous function satisfying (12.5.2). In this section we consider the asymptotic character of the solutions as x → ∞; we shall not assume σ(x) to be nondecreasing, and at the moment it need not be real-valued. The basic result, justifying the comparison of (12.5.1-2) with y″ + k²y = 0, is
Theorem 12.5.1. Let σ(x) be right-continuous and of bounded variation over (0, ∞). Then for fixed k ≠ 0, real or complex, the constants c_1, c_2 in (12.5.2) may be chosen so as to yield a solution y_1(x) of the asymptotic form, as x → ∞,

y_1(x) ~ exp (ikx),    y_1'(x) ~ ik exp (ikx),    (12.5.3-4)

or a solution y_2(x) of the asymptotic form

y_2(x) ~ exp (−ikx),    y_2'(x) ~ −ik exp (−ikx).    (12.5.5-6)
Here the symbol ~ has its usual meaning that, for example, exp (−ikx) y_1(x) → 1 as x → ∞. The case when k ≠ 0 is real is effectively included in Theorem 12.1.1, by means of the change of variable x → |k|x. Extending the definitions (12.1.15-16) we set

p = e^{−ikx}(y' + iky),    q = e^{ikx}(y' − iky),    (12.5.7-8)

so that, if k ≠ 0,

y = (2ki)^{-1}(pe^{ikx} − qe^{−ikx}),    y' = 2^{-1}(pe^{ikx} + qe^{−ikx}).    (12.5.9-10)
In extension of (12.1.17-18) it is easily shown that p and q satisfy the integral equations, for 0 ≤ a < x,

[p]_a^x = −∫_a^x e^{−ikt} y(t) dσ(t) = −(2ki)^{-1} ∫_a^x {p − qe^{−2kit}} dσ(t),    (12.5.11)

and

[q]_a^x = −∫_a^x e^{ikt} y(t) dσ(t) = −(2ki)^{-1} ∫_a^x {pe^{2kit} − q} dσ(t).    (12.5.12)
To dispose first of the case in which k is real, and not zero, we deduce in this case from (12.5.11) that

|p(x) − p(a)| ≤ (2|k|)^{-1} ∫_a^x {|p(t)| + |q(t)|} |dσ(t)|,

the same bound applying to |q(x) − q(a)| by (12.5.12). Taking a so large that

∫_a^∞ |dσ(t)| < ¼|k|,    (12.5.13)

we deduce that

|p(x) − p(a)| + |q(x) − q(a)| < ¼ sup_{a≤t≤x} {|p(t)| + |q(t)|}.    (12.5.14)
From this it follows that, for x ≥ a,

½{|p(a)| + |q(a)|} < |p(x)| + |q(x)| < 2{|p(a)| + |q(a)|},    (12.5.15)

provided that |p(a)| + |q(a)| ≠ 0, that is to say, that the solution y(x) does not vanish identically. For let b > a be the greatest number, supposed finite, such that (12.5.15) holds for a ≤ x < b; some such b necessarily exists since p, q are right-continuous. Applying (12.5.14) with x = b we then deduce that (12.5.15) holds also when x = b, and so in a right-neighborhood of b, giving a contradiction. Thus p, q are bounded and do not both tend to zero. By (12.5.11-12) we see now that they both tend to limits, being in fact of bounded variation over (0, ∞), and that the limits p(∞), q(∞) are not both zero, if y(x) ≢ 0. Since p(∞), q(∞) are homogeneous linear expressions in y(0), y'(0), vanishing together only when y(0) = 0 and y'(0) = 0, we see that the linear relationship connecting p(∞) and q(∞) with y(0) and y'(0) is nonsingular, and that therefore y(0) and y'(0) can be chosen so that p(∞) = 2ki, q(∞) = 0. Inserting these values in (12.5.9-10) we obtain

y(x) = e^{ikx} + o(1),    y'(x) = ike^{ikx} + o(1),
as required by (12.5.3-4). A solution satisfying (12.5.5-6) is, of course, given by p(∞) = 0, q(∞) = −2ki. Passing to possibly complex k, we may suppose without loss of generality that Im k ≥ 0, since (12.5.1-2) are even in k; we write

k = ξ + iη,    η ≥ 0,    k ≠ 0.    (12.5.16)
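Before passing on, the conclusion just established for real k — that p(x) and q(x) of (12.5.7-8) settle down to limits — can be checked numerically. A sketch under illustrative assumptions: the perturbation g(x) = (1 + x)^{-3}, the value k = 1, the step size, and the tolerance are all invented for the test, and the classical Runge-Kutta scheme stands in for an exact solution.

```python
# Sketch: for real k and absolutely integrable g, the quantities
# p = e^{-ikx}(y' + iky), q = e^{ikx}(y' - iky) of (12.5.7-8) tend to
# limits as x grows.  g(x) = (1+x)^{-3}, k = 1, and h are illustrative.
import cmath

k = 1.0
g = lambda x: (1.0 + x) ** -3

def rhs(x, y, yp):
    # y'' = -(k^2 + g(x)) y, written as a first-order system
    return yp, -(k * k + g(x)) * y

def solve(x_end, h=2e-3):
    x, y, yp = 0.0, 0.0, 1.0          # initial data y(0) = 0, y'(0) = 1
    while x < x_end - 1e-12:
        a1, b1 = rhs(x, y, yp)
        a2, b2 = rhs(x + h/2, y + h/2*a1, yp + h/2*b1)
        a3, b3 = rhs(x + h/2, y + h/2*a2, yp + h/2*b2)
        a4, b4 = rhs(x + h, y + h*a3, yp + h*b3)
        y += h/6 * (a1 + 2*a2 + 2*a3 + a4)
        yp += h/6 * (b1 + 2*b2 + 2*b3 + b4)
        x += h
    return y, yp

def p_of(x):
    y, yp = solve(x)
    return cmath.exp(-1j * k * x) * (yp + 1j * k * y)

# p(x) changes very little between two large values of x
assert abs(p_of(40.0) - p_of(80.0)) < 1e-2
print("p(x) settles down for real k")
```

The drift of p between the two sample points is controlled by the tail integral of g, in keeping with (12.5.11).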
We rely in this case on a transformation of the integral equation

y(x) = y(0) cos kx + y'(0)k^{-1} sin kx − k^{-1} ∫_0^x sin k(x − t) y(t) dσ(t).    (12.5.17)
It may be shown by means of Theorem 11.4.1 that this is equivalent to (12.5.2); the equivalence is well-known in the case when σ(x) is differentiable and in which we have to deal with differential equations. Here we shall establish a modified form of (12.5.17) by a different argument. Defining

z(x) = e^{ikx} y(x),    (12.5.18)

and substituting in (12.5.17) we get, after slight reduction,

z(x) = ½y(0)(e^{2kix} + 1) + y'(0)(2ki)^{-1}(e^{2kix} − 1) − (2ki)^{-1} ∫_0^x {exp [2ki(x − t)] − 1} z(t) dσ(t).    (12.5.19)
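The elementary reduction that produces the free terms of (12.5.19) from those of (12.5.17) can be spot-checked in floating point; the sample values of y(0), y'(0), k, and x below are arbitrary:

```python
# Check of the reduction behind (12.5.19): multiplying the free terms of
# (12.5.17) by e^{ikx} gives (1/2) y(0)(e^{2ikx} + 1)
# + y'(0)(2ki)^{-1}(e^{2ikx} - 1).  Sample values are arbitrary.
import cmath

def free_terms(y0, yp0, k, x):
    return cmath.exp(1j*k*x) * (y0 * cmath.cos(k*x) + yp0 * cmath.sin(k*x) / k)

def reduced(y0, yp0, k, x):
    e = cmath.exp(2j*k*x)
    return 0.5 * y0 * (e + 1) + yp0 * (e - 1) / (2j * k)

for k in (1.0, 0.5 + 0.8j):
    for x in (0.0, 0.7, 2.5):
        assert abs(free_terms(1.3, -0.4, k, x) - reduced(1.3, -0.4, k, x)) < 1e-12
print("free terms of (12.5.19) agree with (12.5.17)")
```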
To derive this directly from (12.5.11-12) we have, on taking a = 0 there,

p(x) = p(0) − (2ki)^{-1} ∫_0^x {p(t) − q(t)e^{−2kit}} dσ(t),

q(x) = q(0) − (2ki)^{-1} ∫_0^x {p(t)e^{2kit} − q(t)} dσ(t).

Multiplying the first by exp (2kix) and subtracting the second we get

p(x)e^{2kix} − q(x) = p(0)e^{2kix} − q(0) − (2ki)^{-1} ∫_0^x {exp [2ki(x − t)] − 1}{p(t)e^{2kit} − q(t)} dσ(t).

On dividing by 2ki we obtain (12.5.19), noting that

z(x) = (2ki)^{-1}[p(x)e^{2kix} − q(x)],    (12.5.20)

and also that p(0) = y'(0) + iky(0), q(0) = y'(0) − iky(0). Since |e^{2kix}| ≤ 1, by (12.5.16), if x ≥ 0, we deduce on taking absolute values in (12.5.19) that

|z(x)| ≤ |y(0)| + |y'(0)/k| + |k|^{-1} ∫_0^x |z(t)| |dσ(t)|.    (12.5.21)
By Theorem IV.5.1 of Appendix IV we obtain the bound

|z(x)| ≤ {|y(0)| + |y'(0)/k|} exp {|k|^{-1} ∫_0^x |dσ(t)|},    (12.5.22)

or, making x → ∞ on the right and using (12.5.18),

|e^{ikx} y(x)| ≤ {|y(0)| + |y'(0)/k|} exp {|k|^{-1} ∫_0^∞ |dσ(t)|}.    (12.5.23)
From (12.5.12) we may now deduce that q(x) tends to a limit as x → ∞, being of bounded variation. Thus q(∞) exists if Im k ≥ 0. Concerning p(x) we assert that, if Im k > 0,

p(x) = o(e^{−2kix})    (12.5.24)

as x → ∞. For since |e^{−2kit}| = e^{2ηt}, we have from (12.5.11) that

|p(x)| ≤ |p(0)| + (2|k|)^{-1} ∫_0^x {|p(t)| + |q(t)|e^{2ηt}} |dσ(t)|,

so that, q(t) being bounded, to establish (12.5.24) it will be sufficient to show that

∫_0^x e^{2ηt} |dσ(t)| = o(e^{2ηx}).
This is evident on writing the left-hand side in the form

∫_0^{x/2} e^{2ηt} |dσ(t)| + ∫_{x/2}^x e^{2ηt} |dσ(t)| ≤ e^{ηx} ∫_0^∞ |dσ(t)| + e^{2ηx} ∫_{x/2}^∞ |dσ(t)|,

both terms on the right being of order o(e^{2ηx}). We deduce a general, though slightly incomplete, asymptotic expression for solutions of (12.5.1-2) when Im k > 0. Since p(x) exp (2kix) → 0 as x → ∞, we have from (12.5.9-10) that

y(x) = −(2ki)^{-1} q(∞) e^{−ikx} + o(e^{−ikx})    (12.5.25)

and

y'(x) = ½ q(∞) e^{−ikx} + o(e^{−ikx}).    (12.5.26)
Provided that q(∞) ≠ 0, this solution is one of the type whose existence was asserted in (12.5.5-6). We proceed to verify that q(∞) ≠ 0 for at least one solution. If in (12.5.23) we take it that y(0) is zero we obtain

|e^{ikx} y(x)| ≤ |y'(0)/k| exp {|k|^{-1} ∫_0^∞ |dσ(t)|}.

If therefore it so happens that

|k|^{-1} exp {|k|^{-1} ∫_0^∞ |dσ(t)|} ∫_0^∞ |dσ(t)| < 1,    (12.5.27)

it will follow from (12.5.12) that

|q(x) − q(0)| ≤ ∫_0^∞ |e^{ikt} y(t)| |dσ(t)| < |y'(0)|.
In this case therefore, provided that q(0) = y'(0) # 0, q ( x ) cannot tend to zero as x --t a. If (12.5.27) does not hold, then there will be some a > 0 such that it does hold when the lower limit 0 is replaced by a. We then apply a change of origin, considering (12.5.1-2) over (a, a); all solutions of (12.5.1-2) over (a, a) will be restrictions to ( a , a) of solutions over (0, a). Writing at(.) = a ( x - a), the equation yt(x) = c t
+ c;x
-
J' (x - t ) y ' ( t ) d { P t + u t ( t ) ) 0
will have a solution, for suitable c_1†, c_2†, of the asymptotic form y†(x) ~ −(2ki)^{-1} q†(∞) exp (−ikx), where q†(∞) ≠ 0, and by defining y(x) = y†(x − a) we obtain a solution of (12.5.2), for some c_1, c_2, of the asymptotic form (12.5.25) where q(∞) ≠ 0. This completes the proof that (12.5.1-2) has a solution of the form (12.5.5-6) when Im k > 0, that is to say, an exponentially large solution. The existence of an exponentially small solution is then immediate, the solution being given by

y_1(x) = −2ik y_2(x) ∫_x^∞ {y_2(t)}^{-2} dt.

It follows at once from (12.5.5-6) that this solution has the required asymptotic behavior (12.5.3-4). That it is a solution, as in the ordinary case of y″ + {k² + g(x)} y = 0, may be shown as usual by means of the Wronskian identity (11.3.2). We supplement these results with a more partial result for the case k = 0. In the differential equation case we have to consider y″ + g(x) y = 0, where g(x) is small for large x. The comparison equation being y″ = 0, with typical solutions y = 1, y = x, it is natural to expect that y″ + g(x) y = 0 has solutions of these asymptotic forms. However, the absolute integrability of g(x) does not suffice. For the general case we prove
Theorem 12.5.2. Let σ(x), 0 ≤ x < ∞, be right-continuous and of bounded variation over (0, ∞), and such that

∫_0^∞ t |dσ(t)| < ∞.    (12.5.28)

Then the integral equation

y(x) = c_1 + c_2 x − ∫_0^x (x − t) y(t) dσ(t)    (12.5.29)

has a pair of solutions y_1(x), y_2(x) of the forms, as x → ∞,

y_1 → 1,    y_1' → 0,    y_2 ~ x,    y_2' → 1.    (12.5.30-33)

Differentiating (12.5.29) we have, with right-derivatives,

y'(x) = c_2 − ∫_0^x y(t) dσ(t),    (12.5.34)
and so, for any x ≥ a ≥ 0,

|y'(x) − y'(a)| ≤ ∫_a^x |y(t)| |dσ(t)|.    (12.5.35)

Consider the solution such that y'(a) = 1, y(a) = a, where a ≥ 0 is to be chosen later, and denote by b > a a number with the property that

½ < |y'(t)| < 2,    a ≤ t ≤ b;    (12.5.36)

such a b > a exists since y' is right-continuous. For a ≤ t ≤ b we therefore have

|y(t)| ≤ a + 2(t − a) ≤ 2t,

also valid for t = b by continuity. From (12.5.35) with x = b we deduce that

|y'(b) − 1| ≤ 2 ∫_a^b t |dσ(t)|.

Hence, if a is chosen so large that

2 ∫_a^∞ t |dσ(t)| < ½,

it will follow that (12.5.36) is true for all b > a; for if there were a greatest finite such b with the property (12.5.36), the latter would hold also at t = b and so in a right-neighborhood of b, giving a contradiction. We deduce that |y(t)| ≤ 2t for all t > a. From this it follows by means of (12.5.35) that y'(x) tends to a limit as x → ∞, which is not zero by (12.5.36). Except possibly for a constant factor, this yields a solution of the asymptotic form (12.5.32-33). The existence of a solution of the form (12.5.30-31) then follows as before, being given by

y_1(x) = y_2(x) ∫_x^∞ {y_2(t)}^{-2} dt.
This completes the proof. For some purposes we need bounds for solutions of (12.5.1-2) which hold uniformly in k for given initial data y(0), y'(0). If σ(x) is of bounded variation over (0, ∞), we have the explicit bound (12.5.23), a weakened form of (12.5.22). On account of the factor |k|^{-1}, however, this bound
becomes non-uniform near k = 0. Under the stronger assumption (12.5.28) we may provide a useful bound for this region.

Theorem 12.5.3. Let σ(x) be of bounded variation over 0 ≤ x < ∞ and be right-continuous. Then, for Im k ≥ 0, we have

|e^{ikx} y(x)| ≤ (x + 1){|y(0)| + |y'(0)|} exp {∫_0^x (1 + t) |dσ(t)|}.    (12.5.37)

Subject to (12.5.28), we may make x → ∞ in the braces on the right, obtaining a bound, for fixed y(0), y'(0),

|e^{ikx} y(x)| ≤ const. (x + 1).    (12.5.38)
To prove (12.5.37) we take absolute values in (12.5.19), noting that if Im k ≥ 0, k ≠ 0,

|(2ki)^{-1}(e^{2kix} − 1)| ≤ x,    |(2ki)^{-1}{exp [2ki(x − t)] − 1}| ≤ x − t.

Hence from (12.5.19) with k ≠ 0 we have

|z(x)| ≤ |y(0)| + x|y'(0)| + ∫_0^x (x − t)|z(t)| |dσ(t)|;    (12.5.39)

if k = 0, then z(x) = y(x) by (12.5.18) and (12.5.39) follows from (12.5.29). Dividing by (x + 1), we obtain

|z(x)|/(x + 1) ≤ |y(0)| + |y'(0)| + ∫_0^x {(x − t)/(x + 1)}(1 + t){|z(t)|/(1 + t)} |dσ(t)|.

Dropping the factor (x − t)/(x + 1) on the right as being less than unity, the result follows from Theorem IV.5.1, applied to z(x)/(x + 1).
12.6. Solutions of Integrable Square

In connection with expansion theorems for the equation of the last section it is of great importance to determine when solutions can be of L²(0, ∞), that is to say,

∫_0^∞ |y(x)|² dx < ∞.    (12.6.1)

Our results on the asymptotic behavior of solutions naturally give much information on this.
Theorem 12.6.1. Let σ(x) be right-continuous and of bounded variation over 0 ≤ x < ∞. Then for real k ≠ 0 the equation

y(x) = y(0) + xy'(0) − ∫_0^x (x − t) y(t) d{k²t + σ(t)}    (12.6.2)

has no solution satisfying (12.6.1). For complex k there is one such solution, unique apart from a constant factor, and characterized, if Im k > 0, by

lim_{x→∞} e^{ikx}(y' − iky) = 0.    (12.6.3)
In the above we exclude from consideration the trivial solution. Taking first k real and not zero, we have the asymptotic representation of a solution in the form, by (12.5.9),

y(x) = (2ik)^{-1}{p(∞)e^{ikx} − q(∞)e^{−ikx}} + o(1),

as x → ∞. Hence, for large a > 0, b > a, by direct calculation

∫_a^b |y(x)|² dx = (4k²)^{-1}{|p(∞)|² + |q(∞)|²}(b − a) + o(b − a).

This is clearly incompatible with (12.6.1). Suppose next that k is complex, with Im k > 0. A general solution of (12.6.2) will be a linear combination of the exponentially large solution y_2(x) ~ e^{−ikx} and the exponentially small solution y_1(x) ~ e^{ikx}. By (12.5.25) we have in fact

y(x) = −(2ki)^{-1} q(∞) e^{−ikx}[1 + o(1)] + c_0 e^{ikx}[1 + o(1)],

for some constant c_0, and q(∞) is, by (12.5.8), the same as the left of (12.6.3), it having been shown that this limit exists. Thus if q(∞) = 0, that is, if (12.6.3) holds, then y(x) is exponentially small for large x, of order O(e^{−ηx}) in the notation (12.5.16), and so is of integrable square. Conversely, if y(x) is of integrable square, in the sense (12.6.1), it cannot be exponentially large for large x, and so we must have q(∞) = 0, which completes the proof. In the above we did not assume σ(x) real-valued. In the case in which σ(x) is real-valued an important further property can be stated.

Theorem 12.6.2. Let σ(x) be real-valued, right-continuous, and of bounded variation over 0 ≤ x < ∞. Then the k-values, for which
y(x) = sin α + x cos α − ∫_0^x (x − t) y(t) d{k²t + σ(t)},    (12.6.4)
with α real, has a solution satisfying (12.6.1), lie on the imaginary axis, including possibly the origin. Applying the Lagrange identity (11.3.5) to (12.6.4) and its complex conjugate, we have

[y'ȳ − yȳ']_0^x = (k̄² − k²) ∫_0^x |y(t)|² dt.    (12.6.5)

Since y(0) = sin α, y'(0) = cos α, the same being true for ȳ, we have y'ȳ − yȳ' = 0 when t = 0. By Theorem 12.6.1, we know that k must be either complex or zero, if (12.6.1) is to hold, and so, if k ≠ 0, y(x) must be exponentially small as x → ∞. By Theorem 12.5.1, so also is y'(x) as x → ∞, and so y'ȳ − yȳ' → 0 as x → ∞. Hence on making x → ∞ in (12.6.5), we obtain k̄² − k² = 0. Since k cannot be real and not zero, it must be purely imaginary, possibly zero, as was to be proved. The case of a solution of integrable square when k = 0 is partly covered by Theorem 12.5.2.
Theorem 12.6.3. Let σ(x) be right-continuous and of bounded variation over (0, ∞), and let

∫_0^∞ t |dσ(t)| < ∞.

Then (12.6.2) with k = 0 has no nontrivial solution satisfying (12.6.1). For by Theorem 12.5.2 the solutions have the asymptotic forms 1, x, or a linear combination of these, which cannot satisfy (12.6.1).
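The asymptotic forms 1 and x invoked here can be illustrated numerically in the differential-equation case y″ + g(x)y = 0 of Theorem 12.5.2. A sketch under illustrative assumptions: the choice g(x) = (1 + x)^{-4}, which satisfies ∫ t|g(t)| dt < ∞, the step size, and the tolerances are all invented for the test.

```python
# Sketch of Theorem 12.5.2 in the differential case y'' + g(x) y = 0:
# with int t|g(t)| dt finite, y'(x) tends to a nonzero limit and
# y(x) ~ x y'(inf).  g(x) = (1+x)^{-4} and h are illustrative choices.
g = lambda x: (1.0 + x) ** -4

def solve(x_end, h=2e-3):
    x, y, yp = 0.0, 0.0, 1.0          # y(0) = 0, y'(0) = 1
    while x < x_end - 1e-12:
        a1, b1 = yp, -g(x) * y
        a2, b2 = yp + h/2*b1, -g(x + h/2) * (y + h/2*a1)
        a3, b3 = yp + h/2*b2, -g(x + h/2) * (y + h/2*a2)
        a4, b4 = yp + h*b3, -g(x + h) * (y + h*a3)
        y += h/6 * (a1 + 2*a2 + 2*a3 + a4)
        yp += h/6 * (b1 + 2*b2 + 2*b3 + b4)
        x += h
    return y, yp

y40, yp40 = solve(40.0)
y80, yp80 = solve(80.0)
assert abs(yp80 - yp40) < 1e-3        # y'(x) settles down ...
assert abs(yp80) > 0.1                # ... to a nonzero limit
assert abs(y80 / 80.0 - yp80) < 0.1   # y(x) is asymptotically linear
print("solutions behave like 1 and x")
```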
12.7. Analytic Aspects of Asymptotic Theory

We obtain an interesting blend of the asymptotic theory of differential equations, or in our case integral equations, and of complex variable theory, if we fix the initial conditions and consider the dependence of the asymptotic behavior on a parameter in the equation. Taking, as in Section 12.2, only the simple conditions

y(0, k) = 0,    y'(0, k) = 1,    (12.7.1)

which are incorporated in the integral equation

y(x, k) = x − ∫_0^x (x − t) y(t, k) d{k²t + σ(t)},    (12.7.2)
the asymptotic law, as x → ∞,

y(x, k) = (2ik)^{-1} p(∞, k) e^{ikx}[1 + o(1)] − (2ik)^{-1} q(∞, k) e^{−ikx}[1 + o(1)]    (12.7.3)

serves with certain limitations to define functions p(∞, k), q(∞, k); these are the same as the numbers p(∞), q(∞) defined in Section 12.5 for a general solution y(x). Taking first k real and not zero, and assuming that σ(x) is of bounded variation over (0, ∞), p(∞, k) and q(∞, k) are both defined, and characterize the known trigonometric behavior of y(x, k) for large x. If k is complex, with Im k > 0, the leading term in (12.7.3) is the exponentially large one exp (−ikx), and so (12.7.3) determines q(∞, k), possibly as zero, but does not determine p(∞, k); the latter is, however, determined if Im k < 0. Here we do not require σ(x) to be real-valued. Formally, functions p(x, k) and q(x, k) may be defined for any x > 0 and any k by
p = e^{−ikx}(y' + iky),    q = e^{ikx}(y' − iky),    (12.7.4-5)

and p(∞, k), q(∞, k) may be defined as their limits as x → ∞, when these limits exist; by (12.5.9), (12.5.25) this is equivalent to the definition (12.7.3) in terms of asymptotic behavior. For real k ≠ 0 and real σ(x) a third definition of these functions is available. The solution of (12.7.2) will admit an asymptotic formula which we shall write

y(x, k) = k^{-1} r(k) sin [kx + φ(k)] + o(1),    (12.7.6)

where r(k) > 0 is an asymptotic amplitude and φ(k) an asymptotic phase. Here r(k) is fixed by (12.7.6), while φ(k) is fixed apart from a multiple of 2π, to be fixed by continuity and possibly other restrictions also. On comparison with (12.7.3) we have

p(∞, k) = r(k) exp {iφ(k)},    (12.7.7)

q(∞, k) = r(k) exp {−iφ(k)},    (12.7.8)

S(k) = p(∞, k)/q(∞, k) = exp {2iφ(k)},    (12.7.9)

r(k) = |p(∞, k)| = |q(∞, k)|,    φ(k) = arg p(∞, k) = −arg q(∞, k).    (12.7.10-11)
For the general case, when σ(x) need not be real-valued, the functions p(∞, k) and q(∞, k) are analytic in certain half-planes, and approximate to 1 for large k. We prove

Theorem 12.7.1. Let σ(x) be right-continuous and of bounded variation over 0 ≤ x < ∞. The function q(∞, k) = lim_{x→∞} q(x, k) is defined and continuous in the region Im k ≥ 0, k ≠ 0, and is analytic in Im k > 0. For large k we have

q(∞, k) = 1 + O(k^{-1}),    (12.7.12)
uniformly for |k| ≥ 1, Im k ≥ 0. Analogously, we shall have that p(∞, k) is analytic for Im k < 0, and continuous in Im k ≤ 0, k ≠ 0, while

p(∞, k) = 1 + O(k^{-1})    (12.7.13)

uniformly for Im k ≤ 0, |k| ≥ 1. For the proof, it is a question of making x → ∞ in the representation

q(x, k) = 1 − ∫_0^x e^{ikt} y(t, k) dσ(t),    (12.7.14)

which follows from (12.5.12) with a = 0 and (12.7.1), and is essentially a case of the "variation of parameters." Taking it that Im k ≥ 0, we have from (12.5.23) the bound

|e^{ikt} y(t, k)| ≤ |k|^{-1} exp {|k|^{-1} ∫_0^∞ |dσ(t)|},    (12.7.15)
and from this it follows that the integral in (12.7.14) converges for Im k ≥ 0, k ≠ 0, uniformly in any subset of this region from which a neighborhood of the origin has been excluded. Hence q(∞, k) exists and is analytic and continuous as stated in the theorem. Since, for Im k ≥ 0, k ≠ 0, we then have

q(∞, k) = 1 − ∫_0^∞ e^{ikt} y(t, k) dσ(t),    (12.7.16)

the result (12.7.12) follows at once on inserting the bound (12.7.15) in the integral on the right of (12.7.16). This completes the proof. We have, of course, in a similar way

p(∞, k) = 1 − ∫_0^∞ e^{−ikt} y(t, k) dσ(t),    (12.7.17)

for Im k ≤ 0, k ≠ 0.
We round off this discussion of the analytic character of p(∞, k), q(∞, k) by treating the case of k = 0.

Theorem 12.7.2. Let σ(x), 0 ≤ x < ∞, be right-continuous and satisfy (12.5.28). Then p(∞, k), q(∞, k) are continuous in Im k ≤ 0 and Im k ≥ 0, respectively. In particular,

p(∞, 0) = q(∞, 0) = y'(∞, 0).    (12.7.18)
As shown in the proof of Theorem 12.5.2, the condition (12.5.28) ensures that y'(x, 0) tends to a limit as x → ∞, so that (12.7.18) is a matter of definition. For the proof of continuity, we have again to show that the integrals (12.7.16-17) converge uniformly in their respective regions. This is ensured by the bound (12.5.38), together with (12.5.28). For further discussion we restrict ourselves to the case when σ is real; here important information is available concerning the zeros of p(∞, k), q(∞, k).
Theorem 12.7.3. Let σ(x), 0 ≤ x < ∞, be real-valued, right-continuous, and of bounded variation over (0, ∞). Then the zeros of q(∞, k), if any, lie on the upper half of the imaginary axis, including possibly the origin. The zeros in Im k > 0 correspond to the k-values there for which

∫_0^∞ |y(t, k)|² dt < ∞.    (12.7.19)
This is included in Theorems 12.6.1-2. The case k = 0 is intelligible if (12.5.28) holds; here q(∞, 0) = 0 means that the solution of

y(x, 0) = x − ∫_0^x (x − t) y(t, 0) dσ(t)    (12.7.20)

satisfies as x → ∞ the relation y(x)/x → 0, or, what is by Theorem 12.5.2 the same thing, y(x, 0) → y(∞, 0) where y(∞, 0) ≠ 0. This solution is not, of course, of L²(0, ∞). In view of the asymptotic behavior (12.7.12), the number of zeros of q(∞, k) is linked with the variation in the argument of q(∞, k) as k varies on the real axis. In other language, the number of "bound states" is linked with the variation of the asymptotic phase.
Theorem 12.7.4. Let σ(x) be real-valued and right-continuous on 0 ≤ x < ∞, and let (12.5.28) hold. Let also the solution of (12.7.20)
not be bounded as x → ∞. Then the number of zeros of q(∞, k) in the upper half-plane Im k > 0 is equal to the limit as k → +∞ of

(2π)^{-1}{arg q(∞, k) − arg q(∞, −k)}    (12.7.21)

= (4π)^{-1}{arg S(−k) − arg S(k)}.    (12.7.22)
For the proof we apply the argument principle of complex variable theory to the function q(∞, k) and the closed contour in the k-plane formed by an interval (−R, R), say, of the real axis, closed by a semicircle in the upper half of the k-plane. We first remark that, if R is sufficiently large, q(∞, k) has no zeros on this contour. In the case of the curved portion this follows from (12.7.12). Passing to the consideration of real k, we observe first that q(∞, 0) ≠ 0, since we assume the solution of (12.7.20) unbounded. If again we have q(∞, k) = 0 for some real k ≠ 0, then its complex conjugate p(∞, k) will also vanish, implying by (12.7.3) that y(x, k) → 0 as x → ∞; the latter contradicts Theorem 12.5.1 (or Theorem 12.1.1). The argument principle may therefore be applied, and tells us that the variation in arg q(∞, k) as k describes this contour positively is 2π times the number of zeros of q(∞, k) inside the contour; we show below that these zeros are all simple. Making R → ∞, the variation of arg q(∞, k) as k describes the curved part of the semicircle may be neglected, in view of (12.7.12). Retaining only the change in arg q(∞, k) as k describes the real axis, we obtain the formula (12.7.21) for the number of zeros. As regards (12.7.22), we note that since p(∞, k) and q(∞, k) are complex conjugates, it follows from (12.7.9) that arg q(∞, k) = −½ arg S(k). We may furthermore replace (12.7.21-22) by similar variations over 0 ≤ k < ∞. Since y(t, k) is an even function of k, by (12.7.2), we have for real k,

q(∞, −k) = q̄(∞, k),    S(−k) = S̄(k).    (12.7.23-24)

Hence the number of zeros of q(∞, k) in Im k > 0 is also given, for example, by

lim_{k→∞} (2π)^{-1}{arg S(0) − arg S(k)}.    (12.7.25)
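The contour computation in this proof can be imitated numerically: the winding of the argument along the real axis, closed by a large semicircle, counts the zeros in the upper half-plane. In the sketch below the function f, with zeros at i and 2i, is an invented stand-in for q(∞, k); the radius and sampling density are likewise illustrative choices.

```python
# Illustration of the argument-principle count: the winding of arg f
# along (-R, R) plus a large upper semicircle equals the number of
# zeros of f in the upper half-plane.  f, with zeros at i and 2i,
# is an invented stand-in for q(inf, k).
import cmath, math

def f(k):
    return (k - 1j) * (k - 2j)

def winding(points):
    # accumulate the continuous change of arg f along the sampled contour
    total = 0.0
    prev = cmath.phase(f(points[0]))
    for k in points[1:]:
        cur = cmath.phase(f(k))
        d = cur - prev
        while d > math.pi: d -= 2 * math.pi
        while d < -math.pi: d += 2 * math.pi
        total += d
        prev = cur
    return total / (2 * math.pi)

R, N = 50.0, 4000
contour = [complex(-R + 2*R*j/N, 0.0) for j in range(N + 1)]               # real axis
contour += [R * cmath.exp(1j * math.pi * j / N) for j in range(1, N + 1)]  # semicircle
n_zeros = round(winding(contour))
assert n_zeros == 2
print("zeros counted in upper half-plane:", n_zeros)
```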
In expressing the result in terms of the asymptotic phase φ(k) we meet the difficulty that the asymptotic formula (12.7.6) has no sense when k = 0. Nevertheless φ(k) = −arg q(∞, k) tends to the limit −arg q(∞, 0) in a continuous manner as k → 0; here we assume as in the last theorem that (12.7.20) has a solution of the asymptotic form
const. x, x → ∞, so that q(∞, 0) = y'(∞, 0) ≠ 0. The number of zeros of q(∞, k) in the upper half-plane is then given by

π^{-1} lim_{k→+∞} {φ(0) − φ(k)},    (12.7.26)
where φ(k) is interpreted as a continuous function defined by (12.7.6). We complete the discussion by remarking that the zeros of q(∞, k) in Im k > 0 are all simple. It is a question of showing that q_k(∞, k) ≠ 0, or lim_{x→∞} q_k(x, k) ≠ 0, where the suffix k indicates ∂/∂k. Differentiation of (12.7.5) gives

q_k(x, k) = e^{ikx}(ixy' + kxy − iy + y_k' − iky_k).

Recalling that q(∞, k) = 0 we have

y(x, k) ~ ce^{ikx},    y'(x, k) ~ ikce^{ikx},    (12.7.27)

for some c ≠ 0, whence

lim_{x→∞} q_k(x, k) = lim_{x→∞} e^{ikx}(y_k' − iky_k).    (12.7.28)
We now use the fact that

y_k'y − y_k y' = −2k ∫_0^x y² dt,    (12.7.29)

to be proved in a similar way to (12.2.16). Taking x so large that y ≠ 0, dividing by y², and integrating with respect to x, we deduce that y_k/y = O{exp (−2kix)}, whence y_k = O{exp (−ikx)}, a similar bound holding for y_k' by (12.7.29). Substituting (12.7.27) in (12.7.28) we see that the right of (12.7.28) is lim_{x→∞} c^{-1}(y y_k' − y' y_k), which is not zero in view of (12.7.29); thus the zeros of q(∞, k) are simple. It follows that, under the assumptions of Theorem 12.7.4, the number of values of k² for which (12.7.2) has a solution of integrable square over (0, ∞) is given in terms of the asymptotic phase by (12.7.26). We reach here the fringe of a number of important inverse investigations, in which we start from one of S(k), r(k), or φ(k) and seek to recover the others, and, what is more ambitious, to recover σ(x). We give some notes and references on this later.
12.8. Approximations over a Finite Interval

We turn to a different type of approximation, dealing with perturbations of the Sturm-Liouville problem

y″ + k²y = 0,    0 ≤ x ≤ b,    (12.8.1)
subject to

y(0, k) = 0,    y'(0, k) = 1,    y(b, k) = 0.    (12.8.2-4)

Confining attention to k ≥ 0 this problem has of course the eigenvalues k_n and eigenfunctions y(x, k_n) given by

k_n = nπ/b,    y(x, k_n) = k_n^{-1} sin k_n x,    (12.8.5-6)

where n = 1, 2, 3, ..., and y(x, k_n) has just n − 1 zeros in 0 < x < b. We now ask whether these statements have some approximate validity for the integral equation over a finite interval

y(x, k) = x − ∫_0^x (x − t) y(t, k) d{k²t + σ(t)},    0 ≤ x ≤ b,    (12.8.7)
where (12.8.2-3) are already included, and (12.8.4) is imposed to make the boundary problem. It is to be expected that approximate information similar to (12.8.5-6) will be available for this problem if σ(t) is of small total variation, and again if k is large, that is to say, for the larger eigenvalues. In (12.8.2), (12.8.4) we have confined attention for clarity to the simplest of boundary conditions, but the reasoning to be given may be extended to more general cases. Since for the boundary problem (12.8.7), (12.8.4) we cannot usually find the eigenvalues explicitly, there is an obvious utility in approximate information concerning them. However, the importance of such approximations goes well beyond this. Since the convergence of the eigenfunction expansion depends mainly on the larger eigenvalues, asymptotic formulas for these eigenvalues and the associated eigenfunctions provide rather exact information on this convergence; the convergence of the eigenfunction expansion is often equivalent to that of a comparison Fourier series, according to theorems of "equiconvergence." At a more basic level, these asymptotic formulas provide a means of actually proving the eigenfunction expansion, and indeed one of the simplest proofs; here the idea is to proceed by continuous variation from the ordinary Fourier series expansion, the validity of which is assumed. The method is, of course, only available for differential, or integral, equations which are in some sense comparable with (12.8.1). The method of continuous variation may also be used to prove the oscillation theorem, roughly speaking that the nth eigenfunction has n − 1 zeros in the interior of the basic interval. Finally, one may note miscellaneous applications of approximate formulas for eigenvalues, such as to trace formulas and to the approximation to lower eigenvalues.
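The flavor of such approximate information can be conveyed by a small experiment: for the differential-equation case y″ + (k² + g(x))y = 0 with y(0) = 0, y'(0) = 1, the zeros k_n of y(b, k) stay close to the unperturbed values nπ/b of (12.8.5). The choices g(x) = 1/(1 + x), b = 1, the step size, and the tolerance are illustrative assumptions, not from the text.

```python
# Sketch: eigenvalues of y'' + (k^2 + g(x)) y = 0, y(0) = 0, y(b) = 0
# lie near n*pi/b.  g(x) = 1/(1+x), b = 1, h, and the tolerance 0.3
# are illustrative choices, not from the text.
import math

b = 1.0
g = lambda x: 1.0 / (1.0 + x)

def y_at_b(k, h=2e-3):
    x, y, yp = 0.0, 0.0, 1.0          # shooting from y(0) = 0, y'(0) = 1
    c = lambda t: k * k + g(t)
    while x < b - 1e-12:
        a1, b1 = yp, -c(x) * y
        a2, b2 = yp + h/2*b1, -c(x + h/2) * (y + h/2*a1)
        a3, b3 = yp + h/2*b2, -c(x + h/2) * (y + h/2*a2)
        a4, b4 = yp + h*b3, -c(x + h) * (y + h*a3)
        y += h/6 * (a1 + 2*a2 + 2*a3 + a4)
        yp += h/6 * (b1 + 2*b2 + 2*b3 + b4)
        x += h
    return y

def eigen_k(n):
    lo, hi = (n*math.pi - 1.0) / b, (n*math.pi + 1.0) / b
    flo = y_at_b(lo)
    for _ in range(40):               # bisection on y(b, k) = 0
        mid = (lo + hi) / 2
        fm = y_at_b(mid)
        if flo * fm <= 0:
            hi = mid
        else:
            lo, flo = mid, fm
    return (lo + hi) / 2

for n in range(4, 9):
    assert abs(b * eigen_k(n) - n * math.pi) < 0.3
print("zeros of y(b, k) lie near n*pi/b")
```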
12. ASYMPTOTIC THEORY OF INTEGRAL EQUATIONS
Since we wish to consider both the eventualities of k being large and of σ(t) being of small total variation, we give in what follows bounds involving specific numerical factors, without attempting any precision in these factors. We shall, unless otherwise stated, allow k and σ(x) to take complex values. We write

    ω(x) = ∫₀ˣ |dσ(t)|,   (12.8.8)

and

    ω = ω(b) = ∫₀ᵇ |dσ(t)|.   (12.8.9)

In this section it will be assumed without further statement that σ(x) is right-continuous and of bounded variation over 0 ≤ x ≤ b, where b is fixed and finite. By y(x, k) will be meant the unique continuous solution of (12.8.7). We give first the approximation to the eigenvalues.
Theorem 12.8.1. For integral n with

    n > 8ωb/π,   (12.8.10)

there is a unique k_n such that y(b, k_n) = 0 and such that |bk_n − nπ| < 1. This k_n satisfies

    |bk_n − nπ| < 6ωb/(nπ).   (12.8.11)

We need an approximation to y(x, k), which we derive from the integral equation, for k ≠ 0,

    y(x, k) = k⁻¹ sin kx − k⁻¹ ∫₀ˣ sin k(x − t) y(t, k) dσ(t),   (12.8.12)
a special case of (12.5.17). This yields in the first place a bound for y(x, k); supposing that Im k ≥ 0, we have from (12.5.23) that

    |y(x, k)| ≤ |k⁻¹ e^{−ikx}| exp{ω(x)/|k|}.   (12.8.13)

Here we have replaced the upper limit ∞ in the integral on the right of (12.5.23) by x, which is legitimate by (12.5.22). In the important case in which σ(x) and k are real we may deduce (12.8.13) directly from (12.8.12) by means of Theorem IV.5.1. With the notation (12.8.8) we have from (12.8.13) that, for Im k ≥ 0,

    |e^{ikx} y(x, k)| ≤ |k|⁻¹ exp{ω(x)/|k|}.   (12.8.14)
12.8. APPROXIMATIONS OVER A FINITE INTERVAL
In the next stage we substitute this bound for y(t, k) on the right of (12.8.12). We have first from (12.8.12) that

    |ky(x, k) − sin kx| ≤ ∫₀ˣ |sin k(x − t)| |y(t, k)| dω(t).   (12.8.15)

We rewrite the integral on the right in the form

    |e^{−ikx}| ∫₀ˣ |e^{ik(x−t)} sin k(x − t)| |e^{ikt} y(t, k)| dω(t),

and since, if Im k ≥ 0,

    |exp{ik(x − t)} sin k(x − t)| = ½ |exp{2ik(x − t)} − 1| ≤ 1,

we deduce that

    |ky(x, k) − sin kx| ≤ |e^{−ikx}| ∫₀ˣ |e^{ikt} y(t, k)| dω(t).   (12.8.16)
Suppose first that ω(x) is continuous, that is, that σ(x) is continuous. In this case the bound (12.8.14) gives

    ∫₀ˣ |e^{ikt} y(t, k)| dω(t) ≤ ∫₀ˣ |k|⁻¹ exp{ω(t)/|k|} dω(t) = exp{ω(x)/|k|} − 1.   (12.8.17)

In the general case this last step is invalid, and in fact the preceding integral may involve simultaneous discontinuities in the integrand and the weight distribution. We may, however, proceed by a limiting process and rescue the conclusion. Denoting a subdivision of (0, x) by 0 = ξ₀ < ξ₁ < ... < ξ_m = x, we have, for an approximating sum to the integral on the left of (12.8.17),

    Σ_{r=0}^{m−1} |e^{ikξ_r} y(ξ_r, k)| {ω(ξ_{r+1}) − ω(ξ_r)},

and by (12.8.14) this is less than or equal to

    Σ_{r=0}^{m−1} |k|⁻¹ exp{ω(ξ_r)/|k|} {ω(ξ_{r+1}) − ω(ξ_r)} ≤ Σ_{r=0}^{m−1} [exp{ω(ξ_{r+1})/|k|} − exp{ω(ξ_r)/|k|}] = exp{ω(x)/|k|} − 1,

the middle step holding since |k|⁻¹ exp{ω/|k|} is the derivative of exp{ω/|k|} with respect to ω, and is increasing in ω.
Hence the bound (12.8.17) is still available, and we deduce from (12.8.16) the bound, for Im k ≥ 0,

    |ky(x, k) − sin kx| ≤ |e^{−ikx}| {exp(ω(x)/|k|) − 1}.   (12.8.18)

Recalling that y(x, k) is even in k, we have, if Im k ≤ 0,

    |ky(x, k) − sin kx| = |(−k) y(x, −k) − sin(−kx)| ≤ |e^{ikx}| {exp(ω(x)/|k|) − 1},   (12.8.19)
thus providing a bound for all k ≠ 0. Armed with this inequality, we proceed to look for a solution k_n of y(b, k) = 0 in the vicinity of nπ/b. We write kb = nπ + μ and consider first the case in which σ(x) is real, so that k and μ may be taken to be real. Putting x = b in either of (12.8.18–19) we obtain

    |ky(b, k) − (−1)ⁿ sin μ| ≤ exp(ω/|k|) − 1.   (12.8.20)
Suppose now that μ varies over (−1, 1), so that sin μ increases from −sin 1 to +sin 1. If we arrange that k is so large that exp(ω/|k|) − 1 < sin 1 throughout this process, it will follow from (12.8.20) that y(b, k) must vanish somewhere in −1 < μ < 1. To put this argument in definite terms, we note first that, if |μ| ≤ 1,

    |sin μ − μ| ≤ |μ| (1/3! + 1/5! + ...) < |μ|/5,   (12.8.21)

so that in particular

    |sin μ| ≥ 4/5   for   |μ| = 1,   (12.8.22)

and sin 1 > 4/5. To arrange that exp(ω/|k|) − 1 < 4/5 we note first that bk = nπ + μ ≥ nπ − 1, so that the desired property may be replaced by

    exp{bω/(nπ − 1)} − 1 < 4/5.   (12.8.23)
Since n ≥ 1, we have

    bω/(nπ − 1) = {bω/(nπ)}{nπ/(nπ − 1)} ≤ {bω/(nπ)}{π/(π − 1)} < 3ωb/(2nπ).

Hence, if we wish, we may strengthen (12.8.23) to

    exp{3ωb/(2nπ)} − 1 < 4/5.   (12.8.24)

Noting now that, if 0 < v < 1/5,

    e^v − 1 < v(1 + 5⁻¹/2! + 5⁻²/3! + ...) < 10v/9,   (12.8.25)
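The elementary estimates (12.8.21), (12.8.22), and (12.8.25) are easily confirmed numerically; the following grid check is purely illustrative and is not part of the argument:

```python
import math

# Numerical confirmation (illustrative) of the elementary estimates:
#   (12.8.21)  |sin(mu) - mu| <= |mu|/5   for |mu| <= 1,
#   (12.8.22)  sin 1 > 4/5,
#   (12.8.25)  e**v - 1 < 10*v/9          for 0 < v < 1/5.
assert math.sin(1.0) > 0.8
for i in range(1, 2001):
    mu = -1.0 + 2.0 * i / 2000.0          # grid on (-1, 1]
    assert abs(math.sin(mu) - mu) <= abs(mu) / 5.0 + 1e-15
    v = 0.2 * i / 2001.0                  # grid in (0, 1/5)
    assert math.exp(v) - 1.0 < 10.0 * v / 9.0
```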
we see that (12.8.24) will certainly hold if

    3ωb/(2nπ) < 1/5,   (12.8.26)

and so if

    n > 15ωb/(2π),   (12.8.27)

of which (12.8.10) is a weakened form. We conclude from this that there is at any rate one k_n with bk_n = nπ + μ_n, −1 < μ_n < 1, such that y(b, k_n) = 0, provided that (12.8.10) holds. We proceed to find a better estimate of μ_n. Putting k = k_n in (12.8.20) we obtain
    |sin μ_n| ≤ exp(ω/k_n) − 1 ≤ exp{bω/(nπ − 1)} − 1 ≤ exp{3ωb/(2nπ)} − 1,   (12.8.28)

as for (12.8.24). By (12.8.25–26), we deduce that

    |sin μ_n| < (10/9){3ωb/(2nπ)} = 5ωb/(3nπ).   (12.8.29)

By (12.8.21) we have, however, |sin μ_n| ≥ 4|μ_n|/5, and so

    4|μ_n|/5 < 5ωb/(3nπ),   (12.8.30)

which proves (12.8.11), in sharper form. This completes the proof of Theorem 12.8.1 for the case that σ(x) is real-valued, except that we have not proved the uniqueness of k_n as described in the theorem. This we leave as a consequence of the modified argument dealing with the case of complex σ(x), to which we proceed now. We put once more bk = nπ + μ, where |μ| ≤ 1. We have then |exp(−ikb)| = |exp(−iμ)| ≤ e, and likewise |exp(ikb)| ≤ e. Hence from (12.8.18–19) we have

    |ky(b, k) − (−1)ⁿ sin μ| ≤ {exp(ω/|k|) − 1} e.   (12.8.31)
In place of continuity arguments we now use Rouché's theorem; we cite this in the form that if f(z), g(z) are analytic functions inside and on a simple closed contour C, and if on C we have |g(z) − f(z)| < |f(z)|, then f(z) and g(z) have the same number of zeros inside C. In our present case, the function sin μ has exactly one zero inside the unit circle, and on |μ| = 1 has modulus not less than 4/5, by (12.8.22). It will then follow that ky(b, k) also has exactly one zero in |μ| < 1, if

    [exp(ω/|k|) − 1] e < 4/5
for |μ| = 1, which by the above reasoning will be so if

    [exp{3ωb/(2nπ)} − 1] e < 4/5.
Assuming n to satisfy (12.8.26), and using (12.8.25), this will indeed be so, the left-hand side being less than or equal to

    (10/9) e · 3ωb/(2nπ) < 2e/9 < 4/5.

Hence there is a unique k_n as described. To estimate it we have once more (12.8.30), with the addition in this case of a factor e on the right, arising from the factor e on the right of (12.8.31). This gives

    |μ_n| < 25eωb/(12nπ) < 6ωb/(nπ),

in verification of (12.8.11). This completes the proof.

The previous theorem identified a class of eigenvalues k_n of the problem (12.8.7), (12.8.4) of the form

    k_n = nπ/b + O(ω/n),   (12.8.32)
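The localization just proved is easy to observe numerically. For a point-mass σ, that is, a single jump of size c at t = a (an illustrative choice of σ and of all parameters, not taken from the text; here ω = |c|), the integral equation (12.8.12) can be solved step by step, and y(b, k) = 0 reduces to a transcendental equation. A sketch:

```python
import math

# Illustration of Theorem 12.8.1 for a point-mass sigma: a single jump of
# size c at t = a, so that omega = |c|.  All parameter values are assumptions
# made for this example.  Here (12.8.12) gives in closed form
#     y(b, k) = k**-1 * (sin(k*b) - (c/k)*sin(k*(b - a))*sin(k*a)),
# so the eigenvalues are the roots of g below.
b, a, c = 1.0, 0.4, 0.3
omega = abs(c)

def g(k):
    return math.sin(k * b) - (c / k) * math.sin(k * (b - a)) * math.sin(k * a)

def bisect(lo, hi, steps=200):
    # g changes sign on [n*pi/b - 0.9, n*pi/b + 0.9], as in the continuity
    # argument around (12.8.20)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for n in range(1, 9):                      # (12.8.10) holds for all n >= 1 here
    kn = bisect(n * math.pi / b - 0.9, n * math.pi / b + 0.9)
    assert abs(g(kn)) < 1e-10
    # the localization (12.8.11): |b*k_n - n*pi| < 6*omega*b/(n*pi)
    assert abs(b * kn - n * math.pi) < 6.0 * omega * b / (n * math.pi)
```

The observed deviations are in fact much smaller than the bound (12.8.11), in keeping with the remark that no precision in the numerical factors is attempted.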
for positive integral n, at least for large n; these will be duplicated for negative n according to the identity k_{−n} = −k_n. It remains to be determined whether there might be an infinity of eigenvalues, possibly complex, distinct from these. We may exclude this possibility by limiting the eigenvalues to a region of the plane, of the form of an infinite strip including the real k-axis, narrowing towards the real axis at each end. To make clear the orders of magnitude involved, we use bounds with numerical factors, though only rough ones.

Theorem 12.8.2. The eigenvalues of the problem (12.8.7), (12.8.4) lie in the union of the regions in the k-plane given by

    |Im k| ≤ 2ω/(b|k|),   (12.8.33)

    |k| ≤ 5ω.   (12.8.34)
It will be sufficient to take the case Im k ≥ 0. Putting y(b, k) = 0, for an eigenvalue, in (12.8.18) with x = b, we get

    |sin kb| ≤ |e^{−ikb}| {exp(ω/|k|) − 1},   (12.8.35)

or

    ½ |1 − e^{2ikb}| ≤ exp(ω/|k|) − 1.   (12.8.36)
If (12.8.34) does not hold, that is, if ω/|k| < 1/5, then by (12.8.25) we have

    exp(ω/|k|) − 1 < 10ω/(9|k|),

and so (12.8.36) gives

    |1 − e^{2ikb}| ≤ 20ω/(9|k|).   (12.8.37)

Since

    |1 − e^{2ikb}| ≥ 1 − exp(−2b Im k),

we have from (12.8.37) that

    exp(−2b Im k) ≥ 1 − 20ω/(9|k|).   (12.8.38)

Since 20ω/(9|k|) < 4/9, and since, if 0 ≤ v ≤ 4/9, we have (1 − v)⁻¹ ≤ 1 + v(1 + 4/9 + ...) ≤ 1 + 9v/5, we deduce from (12.8.38) that

    exp(2b Im k) ≤ 1 + 4ω/|k|,

whence Im k ≤ 2ω/(b|k|), in verification of (12.8.33). This proves the theorem.

It is not hard to show that for large k lying in the strip (12.8.33) the only eigenvalues are those of the form (12.8.32). This will, however, follow from a more precise calculation of the number of eigenvalues lying in a large circle. We now replace k² in (12.8.7) by λ, the boundary problem being

    y(x) = x − ∫₀ˣ (x − t) y(t) d{λt + σ(t)},   y(b) = 0.   (12.8.39)
The eigenvalues λ of this problem are, of course, the squares of those of (12.8.7), (12.8.4). As an immediate consequence of Theorem 12.8.2 we have

Theorem 12.8.3. The eigenvalues of (12.8.39) lie in the union of the regions

    |Im λ| ≤ 4ω/b,   (12.8.40)

    |λ| ≤ 25ω².   (12.8.41)

For if k satisfies (12.8.34), then λ = k² obviously satisfies (12.8.41). If on the other hand k satisfies (12.8.33), then

    |Im λ| = 2 |Re k| |Im k| ≤ 2 |k| |Im k| ≤ 4ω/b,

in accordance with (12.8.40).
These eigenvalues will be the zeros of an entire function y(b, √λ); in the previous notation, we denote by y(x, √λ) the solution of the integral equation in (12.8.39), the sign of √λ being a matter of indifference. The eigenvalues have therefore no finite limit, and their real parts must tend to +∞. To number them precisely we need

Theorem 12.8.4. For n + ½ ≥ 5bω/π, the circle |√λ| = (n + ½)π/b contains in its interior exactly n eigenvalues of the problem (12.8.39). The eigenvalues may be denoted λ_n, n = 1, 2, ..., where

    λ_n = {nπ/b}² + O(1).   (12.8.42)
We apply Rouché's theorem once more, to the functions y(b, √λ) and (sin b√λ)/√λ on the circle |√λ| = (n + ½)π/b. Inside this circle the function (sin b√λ)/√λ has exactly n zeros, at the points (π/b)², ..., (nπ/b)², and so it is a question of proving that on this circle we have the inequality

    |y(b, √λ) − (sin b√λ)/√λ| < |(sin b√λ)/√λ|.   (12.8.43)

These functions are analytic in λ, and as λ describes the whole circle, √λ will describe a semicircle; we may take it that k = √λ describes a semicircle in the upper half-plane, as λ describes the circle, starting and ending at {(n + ½)π/b}². We may replace (12.8.43) by the inequality

    |ky(b, k) − sin bk| < |sin bk|   (12.8.44)

for |k| = (n + ½)π/b, Im k ≥ 0. By (12.8.18) with x = b this will be so if

    |e^{−ikb}| {exp(ω/|k|) − 1} < |sin bk|,

which is equivalent to

    exp(ω/|k|) − 1 < ½ |1 − e^{2ibk}|.

Putting

    k = (n + ½) π b⁻¹ exp(iθ),   0 ≤ θ ≤ π,   (12.8.45)

we have thus to prove that

    2(exp[bω/{(n + ½)π}] − 1) < |1 − exp{(2n + 1)πi e^{iθ}}|,   (12.8.46)

for integral n satisfying n + ½ ≥ 5bω/π.
We now note that the right-hand side of (12.8.46) has a positive lower bound. For example, we have

    |1 − exp{(2n + 1)πi e^{iθ}}| ≥ ½,   n = 0, 1, ...,   0 ≤ θ ≤ π.   (12.8.47)
To check this, we observe first that

    |exp{(2n + 1)πi e^{iθ}}| = exp{−(2n + 1)π sin θ},

which is less than or equal to e⁻¹ if (2n + 1)π sin θ ≥ 1, and in the latter event (12.8.47) is certainly true, since 1 − e⁻¹ > ½. Suppose then that 0 ≤ (2n + 1)π sin θ < 1, so that θ is close to 0 or to π, and cos θ is close to 1 or to −1. We have, in fact,

    1 − cos²θ < {(2n + 1)π}⁻²,

so that of the inequalities

    1 − cos θ < {(2n + 1)π}⁻²,   1 + cos θ < {(2n + 1)π}⁻²,   (12.8.48–49)

at least one, indeed just one, must hold. Considering now the real part of the expression in (12.8.47) we have

    Re{1 − exp[(2n + 1)πi e^{iθ}]} = 1 − exp{−(2n + 1)π sin θ} cos{(2n + 1)π cos θ},   (12.8.50)

which is greater than 1, implying (12.8.47), provided that

    cos{(2n + 1)π cos θ} < 0.   (12.8.51)
This, however, is the case if either of (12.8.48–49) holds. For example, we have from (12.8.48) that

    |(2n + 1)π cos θ − (2n + 1)π| ≤ {(2n + 1)π}⁻¹ < ½π,

from which (12.8.51) is immediate, and a similar argument applies to (12.8.49). Hence (12.8.47) holds in this case also. It follows that (12.8.46) will be ensured if

    2(exp[bω/{(n + ½)π}] − 1) < ½.   (12.8.52)

Imposing the condition (n + ½)π ≥ 5bω, and using (12.8.25), the left of (12.8.52) is not greater than

    2 (10/9) bω/{(n + ½)π} ≤ (20/9)(1/5) = 4/9 < ½,

so that (12.8.52) holds. Hence so does (12.8.43), and an application of Rouché's theorem gives the result.
Provided that n is sufficiently large, we may characterize λ_n by (12.8.42). More precisely, if n − ½ ≥ 5bω/π, we may define λ_n as the unique eigenvalue whose absolute value lies between {(n − ½)π/b}² and {(n + ½)π/b}². If σ(x) is real-valued, we may more simply define λ_n as the eigenvalue for which the corresponding eigenfunction has n − 1 zeros in the interior of (0, b).
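For the point-mass illustration already used above (a single jump c at t = a, with all parameter values assumed for the example rather than taken from the text), y(b, √λ) is available in closed form, and the count asserted in Theorem 12.8.4 can be confirmed by computing the winding number of y(b, √λ) around the circle:

```python
import cmath
import math

# Zero count of Theorem 12.8.4 checked by the argument principle, for a
# point-mass sigma (one jump c at t = a; illustrative parameters).  With
# k = sqrt(lam),
#     F(lam) = y(b, sqrt(lam)) = sin(b*k)/k - (c/lam)*sin(k*(b-a))*sin(k*a);
# both terms are even in k, so F is single-valued and entire in lam.
b, a, c = 1.0, 0.4, 0.3

def F(lam):
    k = cmath.sqrt(lam)
    return cmath.sin(b * k) / k - (c / lam) * cmath.sin(k * (b - a)) * cmath.sin(k * a)

def winding_number(radius, samples=20000):
    # accumulated change of arg F(lam) / (2*pi) along |lam| = radius
    total = 0.0
    prev = F(complex(radius, 0.0))
    for s in range(1, samples + 1):
        theta = 2.0 * math.pi * s / samples
        cur = F(radius * complex(math.cos(theta), math.sin(theta)))
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2.0 * math.pi))

for n in (1, 2, 3):
    radius = ((n + 0.5) * math.pi / b) ** 2    # the circle of Theorem 12.8.4
    assert winding_number(radius) == n         # exactly n eigenvalues inside
```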
12.9. Approximation to the Eigenfunctions
The eigenfunctions of the simple boundary problem (12.8.1–4) provide the first approximation to the eigenfunctions of (12.8.7), (12.8.4), for either large eigenvalues or for small ω, the total variation of σ(t). This approximation is easily established, and is adequate for the purpose of proving the eigenfunction expansion.

Theorem 12.9.1. Let σ(x) be right-continuous and of bounded variation over (0, b). Then, for integral n such that n > 8ωb/π, we have

    |k_n y(x, k_n) − sin(nπx/b)| ≤ 21ωb/(πn).   (12.9.1)
Here ω is as defined in (12.8.9), and k_n as in Theorem 12.8.1. We first prove that

    |k_n y(x, k_n) − sin k_n x| ≤ 5eωb/(3nπ).   (12.9.2)

In (12.8.18) or (12.8.19) we take k = k_n = (nπ + μ_n)/b, where |μ_n| ≤ 1. Here, for 0 ≤ x ≤ b,

    |exp(± ik_n x)| = |exp(± iμ_n x/b)| ≤ e,

and so (12.8.18–19) yield

    |k_n y(x, k_n) − sin k_n x| ≤ {exp(ω/|k_n|) − 1} e.

Using the estimations of (12.8.25–29) we have

    exp(ω/|k_n|) − 1 ≤ exp{3ωb/(2nπ)} − 1 ≤ 5ωb/(3nπ),

from which (12.9.2) follows at once.
from which (12.9.2) follows at once. T o complete the proof we show that For
I sin K,x
- sin (nzx/b) 1
k,x = (n.rrx/b)
+
< 6ewb/(mz). (p,+)
(12.9.3)
and so

    sin k_n x − sin(nπx/b) = ∫_{nπx/b}^{nπx/b + μ_n x/b} cos z dz,

where the integral is along a straight line. Since |μ_n| ≤ 1 we have |cos z| ≤ e along the path of integration, and hence

    |sin k_n x − sin(nπx/b)| ≤ e |μ_n x/b| ≤ e |μ_n| < 6eωb/(nπ),

by (12.8.11). This proves (12.9.3).
Combining (12.9.2–3) we get

    |k_n y(x, k_n) − sin(nπx/b)| ≤ 23eωb/(3nπ),

which is slightly sharper than (12.9.1), and proves the theorem.

In the corresponding result for the normalized eigenfunctions we confine attention to the case in which σ(x) is real, and n so large that k_n is real and positive; such a result will be needed for the eigenfunction expansion. As before, we give bounds involving numerical factors, though the actual values of these factors are not important. The normalized eigenfunctions of the problem (12.8.1–4) are, of course, the functions (2/b)^{1/2} sin(nπx/b), n = 1, 2, ..., while those for (12.8.7), (12.8.4) will be written

    y_n(x) = y(x, k_n) {∫₀ᵇ y²(t, k_n) dt}^{−1/2},   (12.9.4)

where n = 1, 2, ..., and the numbering may be assumed to be such that k₁², k₂², ... are in ascending order. Then we have
Theorem 12.9.2. Let σ(x) be real-valued, in addition to the assumptions of Theorem 12.9.1. Then, for n satisfying

    n > 20ωb,   (12.9.5)

the normalized eigenfunctions satisfy

    |y_n(x) − (2/b)^{1/2} sin(nπx/b)| ≤ 50ωb^{1/2}/n.   (12.9.6)

We have from (12.9.1) that

    |(2/b)^{1/2} k_n y(x, k_n) − (2/b)^{1/2} sin(nπx/b)| ≤ 21ω(2b)^{1/2}/(πn),   (12.9.7)
and this we write in abbreviated form as

    |v(x) − u(x)| ≤ δ,   (12.9.8)

with v(x) = (2/b)^{1/2} k_n y(x, k_n), u(x) = (2/b)^{1/2} sin(nπx/b), and δ = 21ω(2b)^{1/2}/(πn). We conduct the calculation in these terms, proving the

Lemma 12.9.3. Let u(x), v(x) be continuous real-valued functions in 0 ≤ x ≤ b satisfying

    |v(x) − u(x)| ≤ δ,   0 ≤ x ≤ b,   (12.9.9)

and let also

    δ√b ≤ ½,   ∫₀ᵇ {u(x)}² dx = 1.   (12.9.10–11)

Defining

    v†(x) = v(x) {∫₀ᵇ v²(t) dt}^{−1/2},   (12.9.12)

we then have

    |v†(x) − u(x)| ≤ 2{|u(x)| + δ} δ√b + δ.   (12.9.13)

Writing w(x) = v(x) − u(x), so that |w(x)| ≤ δ, we have

    ∫₀ᵇ v² dx = ∫₀ᵇ u² dx + 2 ∫₀ᵇ uw dx + ∫₀ᵇ w² dx ≤ 1 + 2√(δ²b) + δ²b,

using (12.9.9), (12.9.11), and applying the Cauchy inequality to the middle integral. Hence

    ∫₀ᵇ v² dx ≤ (1 + δ√b)²,

and in a similar way

    ∫₀ᵇ v² dx ≥ (1 − δ√b)².

By (12.9.12) we then have, if v(x) > 0,

    v(x) (1 + δ√b)⁻¹ ≤ v†(x) ≤ v(x) (1 − δ√b)⁻¹,

with the reversed inequalities if v(x) < 0. Hence in any case

    |v†(x) − v(x)| ≤ |v(x)| δ√b (1 − δ√b)⁻¹ ≤ 2 |v(x)| δ√b,
using (12.9.10). Hence

    |v†(x) − u(x)| ≤ |v†(x) − v(x)| + |v(x) − u(x)| ≤ 2 |v(x)| δ√b + δ ≤ 2{|u(x)| + δ} δ√b + δ,

and the result (12.9.13) follows.

The functions u(x), v(x) given by identifying (12.9.7) with (12.9.8) will satisfy (12.9.9), (12.9.11), and for δ to satisfy (12.9.10) we must have 21√2 ωb/(πn) ≤ ½; this is ensured by (12.9.5). The function v†(x) will be the same as that given by (12.9.4), k_n being real and positive if n satisfies (12.9.5), in view of Theorem 12.8.1 and the fact that the eigenvalues are all real if σ(x) is real-valued. Since |u(x)| ≤ (2/b)^{1/2} and δ√b ≤ ½, the conclusion (12.9.13) now yields

    |y_n(x) − (2/b)^{1/2} sin(nπx/b)| ≤ 105ω(2b)^{1/2}/(πn),

of which (12.9.6) is a weakened and simplified form. This completes the proof.
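Both approximations of this section can be observed numerically for the point-mass illustration used earlier: σ jumps by c at t = a, so that ω = |c| (the parameters are assumptions made for the example), and the solution of (12.8.12) is available in closed form.

```python
import math

# Checks of (12.9.1) and (12.9.6) for a point-mass sigma with jump c at t=a.
# Closed-form solution of (12.8.12) in this case:
#   y(x, k) = sin(k x)/k,                                      x <  a,
#   y(x, k) = sin(k x)/k - (c/k**2) sin(k(x - a)) sin(k a),    x >= a.
b, a, c = 1.0, 0.4, 0.3
omega = abs(c)

def y(x, k):
    val = math.sin(k * x) / k
    if x >= a:
        val -= (c / k ** 2) * math.sin(k * (x - a)) * math.sin(k * a)
    return val

def eigen_k(n):
    lo, hi = n * math.pi / b - 0.9, n * math.pi / b + 0.9
    for _ in range(200):                  # bisection on y(b, k) = 0
        mid = 0.5 * (lo + hi)
        if y(b, lo) * y(b, mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

m = 2000
xs = [j * b / m for j in range(m + 1)]
for n in range(7, 11):                    # n > 20*omega*b = 6, as in (12.9.5)
    kn = eigen_k(n)
    # the unnormalized approximation (12.9.1)
    assert max(abs(kn * y(x, kn) - math.sin(n * math.pi * x / b)) for x in xs) \
        <= 21.0 * omega * b / (math.pi * n)
    # the normalized approximation (12.9.6); trapezoidal rule for the norm
    h = b / m
    norm = math.sqrt(sum(0.5 * h * (y(xs[j], kn) ** 2 + y(xs[j + 1], kn) ** 2)
                         for j in range(m)))
    assert max(abs(y(x, kn) / norm - math.sqrt(2.0 / b) * math.sin(n * math.pi * x / b))
               for x in xs) <= 50.0 * omega * math.sqrt(b) / n
```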
12.10. Completeness of the Eigenfunctions

We use here the approximate formulas for the eigenfunctions as a means of proving the eigenfunction expansion for the boundary problem (12.8.7), (12.8.4). We begin by recalling the notion of "completeness" for a set of functions u_n(x), n = 1, 2, ..., over an interval (0, b). Relative to some general class of functions f(x) over (0, b), and relative to some concept of approximation, the set {u_n(x)} is said to be complete if an arbitrary f(x) can be approximated arbitrarily closely by a linear combination Σ_{r=1}ⁿ c_r u_r(x), for suitable choice of n and the c_r. Here it is in the first place most natural to take as the general class of functions the space L²(0, b) of functions of integrable square over (0, b), and as the concept of approximation that of approximation in mean square. With these interpretations the set u_n(x) is complete in L²(0, b) if first the u_n(x) ∈ L²(0, b) and if, for any f(x) ∈ L²(0, b) and any ε > 0, we can find an integer n and constants c₁, ..., c_n such that

    ∫₀ᵇ |f(x) − Σ_{r=1}ⁿ c_r u_r(x)|² dx < ε.   (12.10.1)
It may happen that (12.10.1) can be arranged by taking Σ_{r=1}ⁿ c_r u_r(x) to be a partial sum of a certain series, that is to say, we have an expansion

    f(x) = Σ_{r=1}^∞ c_r u_r(x),   (12.10.2)

in the sense that (12.10.1) holds if n is large enough, the c_r being independent of n and of ε. In this case, assuming such an expansion to exist for all f(x) ∈ L²(0, b), the u_n(x) are said to form a "basis" in L²(0, b). In particular, they form a basis if they are both complete and orthonormal, in that

    ∫₀ᵇ u_p(x) ū_q(x) dx = δ_pq,   (12.10.3)

for all p, q = 1, 2, ...; for in this case the best possible choice of the c_r in (12.10.1) is given by

    c_r = ∫₀ᵇ f(x) ū_r(x) dx,   (12.10.4)

independently of n. Thus, for example, we showed in Section 8.6 that the eigenfunctions u_n(x) formed an orthonormal basis in a certain subspace of L²(a, b), the set of functions described in Theorem 8.6.1.

The relevant question is now the stability of the property of completeness; that is to say, given a set of functions {u_n(x)}₁^∞ in L²(0, b) which are known to be complete in this space, and a second set {v_n(x)}₁^∞ in L²(0, b) which differ little from the u_n(x), respectively, under what conditions can it be deduced that the {v_n(x)}₁^∞ also form a complete set in L²(0, b)? A simple criterion for this, proved in Appendix VI, states that if the set {u_n(x)}₁^∞ is both complete and orthonormal, and if

    Σ_{n=1}^∞ ∫₀ᵇ |u_n(x) − v_n(x)|² dx < 1,   (12.10.5)

then the set {v_n(x)}₁^∞ is also complete; it is not here assumed that the v_n are also orthonormal. In our first application of this result we take it as known that the orthonormal system

    u_n(x) = (2/b)^{1/2} sin(nπx/b),   n = 1, 2, ...,   (12.10.6)

forms a complete set in L²(0, b), that is, that the expansion given by a Fourier sine series holds in the mean-square sense. On the basis of this we have
Theorem 12.10.1. Let σ(x), 0 ≤ x ≤ b, be right-continuous and of bounded variation over (0, b), with small total variation ω, a permissible limit being

    ωb < √3/21.   (12.10.7)

Then the set y(x, k_n) of eigenfunctions of (12.8.7), (12.8.4) is complete in L²(0, b).

Here k_n, n = 1, 2, ..., is the unique eigenvalue of (12.8.7), (12.8.4) satisfying |bk_n − nπ| < 1, according to Theorem 12.8.1; the condition (12.8.10) will be satisfied for all positive integral n if ω is small enough, in particular if (12.10.7) holds. As the system v_n(x) we shall take

    v_n(x) = (2/b)^{1/2} k_n y(x, k_n),   n = 1, 2, ... .   (12.10.8)
These will be orthogonal if σ(x) is real-valued, and approximately normal, but we do not use these facts. It follows from (12.9.7) that

    |u_n(x) − v_n(x)| ≤ 21ω(2b)^{1/2} (πn)⁻¹,

and so

    ∫₀ᵇ |u_n(x) − v_n(x)|² dx ≤ 882 ω²b²/(π²n²).

Summing over n = 1, 2, ..., it follows that

    Σ_{n=1}^∞ ∫₀ᵇ |u_n(x) − v_n(x)|² dx ≤ 147 ω²b²,

so that the criterion (12.10.5) is ensured if ω is small enough, in particular if (12.10.7) holds. Hence the functions (12.10.8) form a complete set, and so also the y(x, k_n). We note the orthogonality of the eigenfunctions, namely,

    ∫₀ᵇ y(x, k_m) y(x, k_n) dx = 0,   k_m² ≠ k_n²,   (12.10.9)
to be proved as usual by means of the Lagrange identity (11.3.5) together with the boundary conditions. This also ensures that the k_n² must be real if σ(x) is real-valued. In this latter case, the requirement imposed in the last theorem that σ(x) be of small total variation is entirely unnecessary.
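The numerical constants in the preceding completeness argument are easily checked; the following is purely an arithmetic verification:

```python
import math

# The arithmetic behind (12.10.7): since sum 1/n**2 = pi**2/6, summing the
# bounds 882*omega**2*b**2/(pi**2*n**2) over n gives (882/6) = 147 times
# (omega*b)**2, and 147*(omega*b)**2 < 1 exactly when
# omega*b < 1/sqrt(147) = sqrt(3)/21.
partial = sum(882.0 / (math.pi ** 2 * n ** 2) for n in range(1, 200001))
assert abs(partial - 147.0) < 1e-3            # partial sums approach 147
assert abs(1.0 / math.sqrt(147.0) - math.sqrt(3.0) / 21.0) < 1e-15
assert 147.0 * (math.sqrt(3.0) / 21.0) ** 2 <= 1.0 + 1e-12
```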
Theorem 12.10.2. Let σ(x) be real-valued, right-continuous, and of bounded variation over 0 ≤ x ≤ b, for b finite. Then the eigenfunctions y(x, k_n) of the problem (12.8.7), (12.8.4) form a complete set in L²(0, b).
Here we suppose the eigenvalues k_n numbered so that the λ_n = k_n², the eigenvalues of (12.8.39), form an increasing sequence on the real axis, for n = 1, 2, ... . For the proof we introduce a real parameter γ, say, before the function σ(t) in (12.8.39), to vary from 0 to 1. When γ = 0 we have the case of (12.8.1), for which the completeness of the eigenfunctions is assumed, while for γ = 1 we get the equation for which completeness is to be proved. The method is to pass from γ = 0 to γ = 1 by a finite number of steps, in each of which the eigenfunctions are close enough to satisfy (12.10.5). Replacing σ(t) by γσ(t) in (12.8.39), the eigenvalues, written in ascending order, will be denoted by λ_n(γ), n = 1, 2, ..., 0 ≤ γ ≤ 1, and the normalized eigenfunctions will be denoted by y_n(x, γ). Writing (12.8.39) in integro-differential form, the y_n will be solutions of

    y_n′(β, γ) − y_n′(α, γ) = −∫_α^β y_n(t, γ) d{λ_n(γ) t + γσ(t)},   (12.10.10)

the variation and integral being over any subinterval (α, β) of (0, b), together with the boundary conditions

    y_n(b, γ) = 0,   y_n(0, γ) = 0,   (12.10.11–12)

and the normalization conditions

    y_n′(0, γ) > 0,   ∫₀ᵇ {y_n(x, γ)}² dx = 1.   (12.10.13–14)
We have to show that there is a subdivision of (0, 1) by points {γ_r}, 0 = γ₀ < γ₁ < ... < γ_l = 1, such that

    Σ_{n=1}^∞ ∫₀ᵇ {y_n(x, γ_{r+1}) − y_n(x, γ_r)}² dx < 1,   (12.10.15)

for r = 0, ..., l − 1. From the assumed completeness of the system y_n(x, γ_r) when r = 0, that is, that of the system (12.10.6), there follows in turn its completeness for r = 1, ..., l, the latter giving the desired conclusion. To show that (12.10.15) is possible we first choose an integer n₀ such that

    Σ_{n=n₀}^∞ ∫₀ᵇ {y_n(x, γ) − (2/b)^{1/2} sin(nπx/b)}² dx < 1/8,   (12.10.16)
for 0 ≤ γ ≤ 1. To see that this is possible we use (12.9.6), in which γω may replace ω on the right, ω being given by (12.8.9) as the total variation of σ(x). The sum on the left of (12.10.16) is thus less than or equal to

    Σ_{n=n₀}^∞ 2500 ω²b²/n²,

which can be made arbitrarily small by taking n₀ large. From the inequality |a₁ + a₂|² ≤ 2|a₁|² + 2|a₂|² we then deduce that

    |y_n(x, γ_{r+1}) − y_n(x, γ_r)|² ≤ 2 |y_n(x, γ_{r+1}) − (2/b)^{1/2} sin(nπx/b)|² + 2 |y_n(x, γ_r) − (2/b)^{1/2} sin(nπx/b)|²,

and so, from (12.10.16) with γ = γ_r, γ_{r+1},

    Σ_{n=n₀}^∞ ∫₀ᵇ {y_n(x, γ_{r+1}) − y_n(x, γ_r)}² dx < ½.
To complete (12.10.15) we need, of course, that

    Σ_{n=1}^{n₀−1} ∫₀ᵇ {y_n(x, γ_{r+1}) − y_n(x, γ_r)}² dx < ½.

Since n₀ is fixed, this is merely a matter of choosing the subdivision {γ_r} of (0, 1) sufficiently fine. Here we rely on the continuity of y_n(x, γ) as a function of γ. To elaborate this point a little, we start by observing that the eigenvalues λ_n(γ) are continuous functions of γ. They are, in fact, the zeros of an entire function y(b, λ, γ), where y(x, λ, γ) is the solution of

    y(x, λ, γ) = x − ∫₀ˣ (x − t) y(t, λ, γ) d{λt + γσ(t)}.

When γ = 0, y(b, λ, 0) = (sin b√λ)/√λ, whose zeros are all simple. As γ varies from 0 to 1, the zeros of y(b, λ, γ) as a function of λ remain real and simple; the eventuality of two zeros tending to coincidence may be excluded by (12.10.9), and may also be excluded by the oscillatory characterization of the eigenfunctions, which we have not developed in the present context. From the continuity of the λ_n(γ), the continuity of the y_n(x, γ) follows from the relation

    y_n(x, γ) = y(x, λ_n(γ), γ) {∫₀ᵇ y²(t, λ_n(γ), γ) dt}^{−1/2},

which completes the proof.
APPENDIX I
Some Compactness Principles for Stieltjes Integrals
I.1. Functions of Bounded Variation

In this and the next section we review some basic definitions concerned with the Riemann–Stieltjes integral, commencing with the idea of a function of bounded variation. The function σ(x), possibly complex-valued, defined over a finite real interval (a, b), will be said to be of bounded variation over (a, b) if the expression

    Σ_{r=0}^{n−1} |σ(ξ_{r+1}) − σ(ξ_r)|   (I.1.1)

admits some fixed upper bound, for all integers n and all subdivisions {ξ_r} of (a, b), a = ξ₀ < ξ₁ < ... < ξ_n = b. The least such upper bound will be referred to as the total variation of σ(x) over (a, b), and will sometimes be denoted by var{σ(x); a, b}.

Next we note some simple continuity properties. It is a consequence of the property of bounded variation that for any x, a ≤ x < b, σ(t) tends to a definite limit as t → x from above, and likewise to a possibly different limit as t → x from below. In other words, the limits σ(x + 0), σ(x − 0) exist for a < x < b; as regards the end-points, the one-sided limits σ(a + 0), σ(b − 0) both exist. If for some x, a < x < b, it happens that

    σ(x − 0) = σ(x) = σ(x + 0),   (I.1.2)

then, as usual, we say that x is a point of continuity of σ(x). The points of discontinuity of σ(x), those for which one or both of (I.1.2) fails, form a finite or at most denumerable set. This depends on the observation that the jumps of σ(x) form an absolutely convergent series. Here by the jumps of σ(x) we mean all expressions σ(x) − σ(x − 0), σ(x + 0) − σ(x), and it is easily seen that the sum of their absolute
values does not exceed the total variation, assumed finite, of σ(x). These jumps may evidently be ordered according to nonincreasing absolute values, starting with the largest. A consequence of the fact that the points of discontinuity are denumerable is that the points of continuity are everywhere dense, in the sense that any interval of positive length contains at least one of them.

To a large extent we may standardize the behavior of σ(x) at its points of discontinuity, without affecting the values of integrals with respect to σ(x). A common procedure is to require that

    σ(x) = ½{σ(x + 0) + σ(x − 0)}.   (I.1.3)

Here we shall usually arrange that

    σ(x) = σ(x + 0),   (I.1.4)

described by saying that σ(x) is right-continuous. These procedures need qualification at the end-points; (I.1.3) has no sense at x = a or x = b, while (I.1.4) constitutes a restriction only at x = a. These requirements, as also left-continuity, are included as special cases in the situation in which

    σ(x) = θσ(x + 0) + (1 − θ)σ(x − 0),   0 ≤ θ ≤ 1.   (I.1.5)

We sometimes use the symbol

    ∫ₐᵇ |dσ(x)|   (I.1.6)

in place of var{σ(x); a, b} for the total variation. The integral (I.1.6) is naturally interpreted as the limit of (I.1.1) as n → ∞, the subdivisions becoming increasingly fine, in the sense that max_r (ξ_{r+1} − ξ_r) → 0. It may be shown that (I.1.6) is in fact the same as the total variation in such cases as (I.1.3–5).

In the case of a semi-infinite interval (a, ∞), where a is finite, in saying that σ(x) is of bounded variation over (a, ∞) we shall mean that σ(∞) is defined and that the requirement of the uniform boundedness of (I.1.1) is in force with b = ξ_n = ∞. Somewhat restrictively, we demand furthermore that σ(∞) should be the same as lim σ(x) as x → +∞. A jump of σ(x) at x = ∞ is therefore excluded; this will entail that certain infinite integrals must be supplemented by individual terms. Likewise, for the interval (−∞, ∞) we take it that σ(−∞), σ(∞) are both formally defined, and are equal to the limits of σ(x) as x → −∞, x → +∞, and that (I.1.1) is uniformly bounded with a = ξ₀ = −∞, b = ξ_n = +∞.
We note two important cases in which a function is of bounded variation: first, that in which σ(x) is nondecreasing, in that σ is real-valued and such that x′ > x implies σ(x′) ≥ σ(x). Here σ(x) is of bounded variation over (a, b) if and only if σ(a) and σ(b) are finite. The second case is that of a step function. In the case of a finite interval (a, b) we shall restrict this to mean a step function with a finite number of discontinuities. There will thus be a finite number of points in a ≤ x ≤ b, between (in the strict sense) any two consecutive members of which σ(x) is constant. In the case of a step function in an infinite interval, we permit an infinity of discontinuities, which are, however, to have no finite point of accumulation.

The above definitions extend with little difficulty to the case in which σ(x) is a square matrix, or rectangular matrix or vector, dependent on x. In saying that σ(x) is of bounded variation over (a, b) we mean simply that each entry in the matrix σ(x) is separately a function of bounded variation in the previous sense. This notion will arise mainly in the special case in which σ(x) is a square Hermitean nondecreasing matrix function; the term nondecreasing will mean that σ(x′) − σ(x) is positive-definite, or at any rate non-negative definite, if x′ > x. The entries on the principal diagonal will then be real-valued and nondecreasing functions in the ordinary scalar sense, and if they are of bounded variation, that is, bounded, then σ(x) will be of bounded variation; this follows from the observation that for a non-negative definite matrix, no entry exceeds in absolute value the greatest diagonal entry.
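The sums (I.1.1) are simple to compute; the following sketch (the choice of σ is an illustrative assumption) exhibits their increase under refinement toward the total variation:

```python
import math

# Approximating sums (I.1.1) for the total variation.  Illustrative choice:
# sigma(x) = sin(2*pi*x) plus a jump of 2 at x = 1/2 (right-continuous).
# The continuous part has variation 1 + 2 + 1 = 4 on [0, 1]; the jump adds 2,
# so var{sigma; 0, 1} = 6.
def sigma(x):
    return math.sin(2.0 * math.pi * x) + (2.0 if x >= 0.5 else 0.0)

def var_sum(f, a, b, n):
    pts = [a + (b - a) * r / n for r in range(n + 1)]
    return sum(abs(f(pts[r + 1]) - f(pts[r])) for r in range(n))

coarse = var_sum(sigma, 0.0, 1.0, 8)
fine = var_sum(sigma, 0.0, 1.0, 4096)      # 4096 refines the 8-point grid
assert coarse <= fine <= 6.0 + 1e-9        # sums increase under refinement
assert abs(fine - 6.0) < 0.01
```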
I.2. The Riemann–Stieltjes Integral

We recall here the definition and immediate properties of this integral. For a pair of functions f(x), σ(x) defined in the finite interval a ≤ x ≤ b, and for any subdivision {ξ_r} of (a, b),

    a = ξ₀ < ξ₁ < ... < ξ_n = b,   (I.2.1)

we form the sum

    S = Σ_{r=0}^{n−1} f(η_r) {σ(ξ_{r+1}) − σ(ξ_r)},   (I.2.2)

where the η_r are arbitrary points with ξ_r ≤ η_r ≤ ξ_{r+1}. If, as n → ∞ subject to

    max_r |ξ_{r+1} − ξ_r| → 0,   (I.2.3)

the sum (I.2.2) tends to a unique limit, for all sequences of subdivisions
satisfying (I.2.3) and for all choices of the η_r ∈ [ξ_r, ξ_{r+1}], then this limit is the Stieltjes integral of f(x) with respect to σ(x), written ∫ₐᵇ f(x) dσ(x), or ∫ₐᵇ dσ(x) f(x), the order being important only in matrix analogs. In the case in which σ(x) is a step function, the Stieltjes integral reduces to a sum.

Theorem I.2.1. In the finite interval [a, b] let f(x) be continuous and let σ(x) be a step function, with a finite number of jumps. Then

    ∫ₐᵇ f(x) dσ(x) = f(a){σ(a + 0) − σ(a)} + Σ f(x){σ(x + 0) − σ(x − 0)} + f(b){σ(b) − σ(b − 0)}.   (I.2.4)
Here the sum on the right is in fact a finite sum, being extended over points of discontinuity of u(x) in the interior of (a, b). The first and last terms are absent if u(x) is continuous at x = a or at x = b, respectively. - u(a)}, For the proof, the first term in (1.2.2), namely, f ( q 0 ) where a v0 El, clearly yields the first term on the right of (1.2.4) as n + m, since f(rl0) -+f(a) by continuity, and --+ a(a 0); here the limiting transition is as n ---t m, with qo and f1 functions of n. In a similar way, the last term in (1.2.4) arises from the last term in (1.2.2). If next x is a point of discontinuity of u(x) in a < x < b, suppose first that x is not a point of the subdivision (1.2.2), and denote by (5, 5') the interval of the subdivision containing x; suppose further that the subdivision is so fine that no other point of discontinuity of u(x) occurs in (5, f ' ) . I n this case we have in (1.2.2) a term of the form f(7) {u(S') - u(E)}, 7 E [ f , 5'1, which tends as n --+ 03 to the typical term in the sum in (1.2.4). If again x is a point of discontinuity, which is also a point of the subdivision, let 5, t',be the adjacent left and right points of the subdivision; we then get in (1.2.2) a pair of terms of the form f ( 7 ) {u(x) - ~ ( 5 ) ) f(7') (~(5') - a ( x ) ) , which yield in the limit as before f ( x ) {u(x 0) - u(x - 0)). We may complete the proof by showing that terms in (1.2.2) for which [t,,f,,,] does not contain a point of discontinuity of u(x) vanish. We pass to the existence of the Stieltjes integral in general. The main result is
{.(el)
< <
.(el)
+
+
+
< <
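The reduction asserted in Theorem 1.2.1 may be illustrated numerically. The following Python sketch is not part of the original text; the step function and its jump data are invented for the illustration.

```python
# A numerical sketch (not from the text; the step function and jump data are
# invented) of Theorem 1.2.1: for continuous f and a step function sigma, the
# Riemann-Stieltjes sums collapse onto the sum of f over the jumps of sigma.
import math

def stieltjes_sum(f, sigma, a, b, n):
    """Approximating sum (1.2.2) on n equal subintervals, left-hand points."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(f(xs[i]) * (sigma(xs[i + 1]) - sigma(xs[i])) for i in range(n))

def sigma(x):
    # right-continuous step function: jumps of size 1 at 0.3 and 2 at 0.7
    return (1.0 if x >= 0.3 else 0.0) + (2.0 if x >= 0.7 else 0.0)

f = math.cos
approx = stieltjes_sum(f, sigma, 0.0, 1.0, 100000)
exact = 1.0 * f(0.3) + 2.0 * f(0.7)   # the finite sum on the right of (1.2.4)
print(abs(approx - exact) < 1e-3)     # True
```

Refining the subdivision moves the approximating sum onto the weighted sum over the jump points, exactly as the proof describes.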
Theorem 1.2.2. If $f(x)$ is continuous in the finite interval $a \le x \le b$, and $\sigma(x)$ is of bounded variation over the same interval, then the integral $\int_a^b f(x)\,d\sigma(x)$ exists.
I. SOME COMPACTNESS PRINCIPLES
Together with the sum $S$ given by (1.2.2) we consider a second such sum

\[ S' = \sum_{r=0}^{n'-1} f(\eta_r')\{\sigma(\xi_{r+1}') - \sigma(\xi_r')\}, \]

formed in the same manner with a second subdivision $\{\xi_r'\}_0^{n'}$ of $(a, b)$. We compare $S$ with $S'$ by comparing both with a third sum $S''$ formed with the subdivision $\{\xi_r''\}_0^{n''}$, comprising both the points $\{\xi_r\}$ and the points $\{\xi_r'\}$. Write also $\delta$ for the largest of the lengths of any of the intervals $(\xi_r, \xi_{r+1})$ and $(\xi_r', \xi_{r+1}')$, and

\[ \omega(\delta) = \max |f(x') - f(x)|, \quad \text{subject to } |x' - x| \le \delta. \tag{1.2.5} \]
We now write the sum $S$ as a sum with respect to the joint subdivision $\{\xi_\rho''\}$. Noting that

\[ \sigma(\xi_{r+1}) - \sigma(\xi_r) = \sum \{\sigma(\xi_{\rho+1}'') - \sigma(\xi_\rho'')\}, \]

the sum on the right being extended over those $\rho$ for which $[\xi_\rho'', \xi_{\rho+1}''] \subseteq [\xi_r, \xi_{r+1}]$, and substituting in (1.2.2), we may write $S$ in the form

\[ S = \sum_{\rho=0}^{n''-1} f(\eta_\rho'')\{\sigma(\xi_{\rho+1}'') - \sigma(\xi_\rho'')\}, \tag{1.2.6} \]

where $\eta_\rho''$ lies somewhere in that interval $[\xi_r, \xi_{r+1}]$ of the first subdivision which contains the interval $[\xi_\rho'', \xi_{\rho+1}'']$ of the joint subdivision. If $S''$ is formed with points $\eta_\rho''' \in [\xi_\rho'', \xi_{\rho+1}'']$, we have then

\[ |S - S''| \le \sum_{\rho=0}^{n''-1} |f(\eta_\rho'') - f(\eta_\rho''')|\, |\sigma(\xi_{\rho+1}'') - \sigma(\xi_\rho'')|. \]

Since $\eta_\rho''$ and $\eta_\rho'''$ both lie in the same interval $[\xi_r, \xi_{r+1}]$, it follows from (1.2.5) that

\[ |S - S''| \le \omega(\delta)\, \mathrm{var}\,\{\sigma(x);\, a, b\}. \tag{1.2.7} \]
1.2. THE RIEMANN-STIELTJES INTEGRAL
Since the same argument applies to $S'$, we deduce that

\[ |S - S'| \le |S - S''| + |S' - S''| \le 2\omega(\delta)\, \mathrm{var}\,\{\sigma(x);\, a, b\}. \tag{1.2.8} \]
We can now prove the existence of the integral, that is, that the sum typified by $S$ tends to a unique limit, subject to (1.2.3). We first note that all such sums are bounded uniformly; taking absolute values in (1.2.2) we have at once that

\[ |S| \le \max |f(x)| \sum_{r=0}^{n-1} |\sigma(\xi_{r+1}) - \sigma(\xi_r)| \le \max |f(x)|\, \mathrm{var}\,\{\sigma(x);\, a, b\}. \tag{1.2.9} \]
Take now a sequence of such sums $S_q$, $q = 1, 2, \ldots$, and denote by $\delta_q$ the greatest length of any of the intervals into which $(a, b)$ is divided by the associated subdivision. It then follows from (1.2.8) that for any $q$, $q'$ we have

\[ |S_q - S_{q'}| \le 2 \max\{\omega(\delta_q), \omega(\delta_{q'})\}\, \mathrm{var}\,\{\sigma(x);\, a, b\}. \]
If the sequence $S_q$ satisfies $\delta_q \to 0$ as $q \to \infty$, in accordance with (1.2.3), we have $\omega(\delta_q) \to 0$, by the property that a continuous $f(x)$ on a closed bounded interval is also uniformly continuous. The sequence $S_q$ therefore satisfies the Cauchy convergence condition. Any other such sequence must tend to the same limit, since otherwise the two sequences could be combined into a nonconvergent sequence of the same type. This proves Theorem 1.2.2. We use the following trivial bound for the integral.

Theorem 1.2.3.
Under the conditions of Theorem 1.2.2,

\[ \Big| \int_a^b f(x)\,d\sigma(x) \Big| \le \max |f(x)|\, \mathrm{var}\,\{\sigma(x);\, a, b\}. \tag{1.2.10} \]

This follows from (1.2.9), on passing to the limit through a sequence of subdivisions satisfying (1.2.3). Integrals over infinite intervals will be understood in the improper sense, so that, for finite $a$,

\[ \int_a^\infty f(x)\,d\sigma(x) = \lim_{b \to \infty} \int_a^b f(x)\,d\sigma(x), \tag{1.2.11} \]

if the limit exists, and likewise

\[ \int_{-\infty}^\infty f(x)\,d\sigma(x) = \lim_{a \to -\infty,\, b \to \infty} \int_a^b f(x)\,d\sigma(x). \tag{1.2.12} \]
Two cases in which the latter limit exists are relevant.
Theorem 1.2.4. Let $f(x)$ be continuous for all finite real $x$, and uniformly bounded, and let $\sigma(x)$ be of bounded variation over $(-\infty, \infty)$. Then the integral (1.2.12) exists.

Using an evident additive property of the integral, we have for $a' < a < b < b'$,

\[ \int_{a'}^{b'} f(x)\,d\sigma(x) - \int_a^b f(x)\,d\sigma(x) = \int_{a'}^{a} f(x)\,d\sigma(x) + \int_b^{b'} f(x)\,d\sigma(x), \]

so that to prove the existence of the limit (1.2.12) it will be sufficient to prove that

\[ \int_b^{b'} f(x)\,d\sigma(x) \to 0 \tag{1.2.13} \]

as $b, b' \to \infty$, $b < b'$, together with a corresponding result for the integral over $(a', a)$ as $a \to -\infty$. By (1.2.10) and the assumed uniform boundedness of $f(x)$ it will be sufficient for (1.2.13) to show that $\mathrm{var}\,\{\sigma(x);\, b, b'\} \to 0$ as $b \to \infty$. This is so since $\mathrm{var}\,\{\sigma(x);\, 0, b\}$ tends monotonically to a finite limit as $b \to \infty$, and since

\[ \mathrm{var}\,\{\sigma(x);\, b, b'\} = \mathrm{var}\,\{\sigma(x);\, 0, b'\} - \mathrm{var}\,\{\sigma(x);\, 0, b\} \quad \text{for } 0 < b < b'. \]

The analogous result for the integral over $(a', a)$ is proved similarly. In connection with orthogonal polynomials we need
Theorem 1.2.5. Let $\sigma(x)$ be defined, real-valued, and nondecreasing on $-\infty < x < \infty$, and for all positive integral $n$ let, as $x \to \infty$,

\[ x^n\{\sigma(\infty) - \sigma(x)\} = O(1), \tag{1.2.14} \]

and, as $x \to -\infty$,

\[ |x|^n\{\sigma(x) - \sigma(-\infty)\} = O(1). \tag{1.2.15} \]

Then the integral

\[ \int_{-\infty}^\infty x^m\,d\sigma(x) \tag{1.2.16} \]

exists for all integral $m \ge 0$. As in the case of the last theorem it is sufficient to prove that

\[ \int_b^{b'} x^m\,d\sigma(x) \to 0, \qquad \int_{a'}^{a} x^m\,d\sigma(x) \to 0, \tag{1.2.17-18} \]

as $b \to \infty$, $a \to -\infty$, with $b' > b$, $a' < a$.
We suppose that $b > 1$ and break up the interval $(b, b')$ into intervals of the form $(b, 2b)$, $(2b, 4b)$, \ldots, with possibly a part of such an interval. Denote by $M$ an upper bound for $x^{m+1}\{\sigma(\infty) - \sigma(x)\}$ for $x \ge 1$. For $b \le x \le 2b$ we have $x^m \le (2^m/b)\, b^{m+1}$, and so

\[ \int_b^{2b} x^m\,d\sigma(x) \le 2^m b^m \{\sigma(\infty) - \sigma(b)\} \le 2^m M/b. \]

Treating the intervals $(2b, 4b)$, $(4b, 8b)$, \ldots, similarly we have

\[ \int_b^{b'} x^m\,d\sigma(x) \le 2^m M \{1/b + 1/(2b) + 1/(4b) + \cdots\} = 2^{m+1} M/b, \]

from which (1.2.17) is immediate; (1.2.18) is of course proved similarly, which completes the proof.
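The dyadic tail estimate above can be tried out on a concrete case. In the following Python sketch, which is my own and not from the text, $\sigma(x) = 1 - e^{-x}$, so that $\sigma(\infty) - \sigma(x) = e^{-x}$ and all the conditions (1.2.14-15) hold on the right half-axis.

```python
# A sketch (the choice sigma(x) = 1 - exp(-x) is my own) checking the dyadic
# tail estimate just derived: with M a bound for x^(m+1){sigma(inf) - sigma(x)},
# the tail integral over (b, b') is at most 2^(m+1) * M / b.
import math

m = 3
# sup of x^(m+1) * exp(-x) over a grid covering its maximum near x = m + 1
M = max((0.01 * j) ** (m + 1) * math.exp(-0.01 * j) for j in range(1, 3001))

def tail_integral(b, bp, n=20000):
    # int_b^bp x^m dsigma(x) with dsigma = exp(-x) dx, midpoint rule
    h = (bp - b) / n
    return sum((b + (i + 0.5) * h) ** m * math.exp(-(b + (i + 0.5) * h)) * h
               for i in range(n))

b, bp = 5.0, 60.0
print(tail_integral(b, bp) <= 2 ** (m + 1) * M / b)   # True
```

The actual tail is far below the dyadic bound, which only needs the crude estimate $x^m \le (2^m/b)\,b^{m+1}$ on each doubling interval.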
1.3. A Convergence Theorem

In applying limiting processes to boundary problems we frequently encounter sequences of spectral functions, and have to consider the convergence of integrals with respect to them. In this connection we need
Theorem 1.3.1. Let $\sigma_n(x)$, $n = 1, 2, \ldots$, be of uniformly bounded variation in the finite interval $(a, b)$. Let $\sigma(x)$ also be of bounded variation in $(a, b)$ and let, as $n \to \infty$,

\[ \sigma_n(x) \to \sigma(x) \tag{1.3.1} \]

for $x = a$, $x = b$, and for the $x$-values of a set $I$ of $(a, b)$ which is everywhere dense in $(a, b)$. Then, for any $f(x)$ which is continuous in $[a, b]$, we have, as $n \to \infty$,

\[ \int_a^b f(x)\,d\sigma_n(x) \to \int_a^b f(x)\,d\sigma(x). \tag{1.3.2} \]
In saying that $I$ is everywhere dense in $(a, b)$ we mean that any subinterval of $(a, b)$ which has positive length contains a member of $I$. By this assumption, we may choose an arbitrarily fine subdivision (1.2.1) of $(a, b)$, where $\xi_1, \ldots, \xi_{n-1}$ belong to $I$.
We now write

\[ \int_a^b f(x)\,d\sigma_n(x) - \int_a^b f(x)\,d\sigma(x) = J_1 + J_2 + J_3, \tag{1.3.3} \]

say, where

\[ J_1 = \sum_{r=0}^{n-1} \int_{\xi_r}^{\xi_{r+1}} f(\xi_r)\, d\{\sigma_n(x) - \sigma(x)\}, \qquad J_2 = \sum_{r=0}^{n-1} \int_{\xi_r}^{\xi_{r+1}} \{f(x) - f(\xi_r)\}\, d\sigma_n(x), \]

and $J_3$ is formed as $J_2$ with $-\sigma(x)$ in place of $\sigma_n(x)$. Writing $M$ for an upper bound for $|f(x)|$ and evaluating the integrals in $J_1$ we have

\[ |J_1| \le M \sum_{r=0}^{n-1} |\sigma_n(\xi_{r+1}) - \sigma_n(\xi_r) - \sigma(\xi_{r+1}) + \sigma(\xi_r)|. \]

Passing to $J_2$, $J_3$, we denote by $\delta$ the greatest of the $\xi_{r+1} - \xi_r$, $r = 0, \ldots, n - 1$, and denote by $\omega(\delta)$ the modulus of continuity defined in (1.2.5). Let, furthermore, $V$ denote an upper bound for $\mathrm{var}\,\{\sigma_n(x);\, a, b\}$ and $\mathrm{var}\,\{\sigma(x);\, a, b\}$. We have then

\[ |J_2| \le V\omega(\delta), \qquad |J_3| \le V\omega(\delta), \]

by Theorem 1.2.3. Choosing now any $\varepsilon > 0$, we choose $\delta > 0$ so small that $V\omega(\delta) < \varepsilon/3$, relying on the uniform continuity of $f(x)$. We then choose $n_0$ so that for $n > n_0$ we have $|J_1| < \varepsilon/3$, which is possible by (1.3.1) and (1.3.3). It then follows that

\[ \Big| \int_a^b f(x)\,d\sigma_n(x) - \int_a^b f(x)\,d\sigma(x) \Big| < \varepsilon \]

for $n > n_0$, which proves (1.3.2), as required. As a very special case we have the following result, according to which the Stieltjes integral of $f(x)$ with respect to $\sigma(x)$ is substantially independent of its definition at points of discontinuity.
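Theorem 1.3.1 admits a simple numerical illustration. In the Python sketch below, which is my own and not from the text, $\sigma_n(x) = x + \sin(nx)/n$ converges to $\sigma(x) = x$ at every point, with variation at most $2(b - a)$ uniformly in $n$.

```python
# A numerical sketch (the family sigma_n is my own) of Theorem 1.3.1:
# sigma_n(x) = x + sin(n x)/n converges to sigma(x) = x at every point and has
# var{sigma_n; a, b} <= 2(b - a) uniformly, so int f dsigma_n -> int f dsigma.
import math

f = lambda x: math.exp(-x * x)
a, b, n_pts = 0.0, 2.0, 20000

def integral(weight):                 # int_a^b f(x) * weight(x) dx, midpoint rule
    h = (b - a) / n_pts
    return sum(f(a + (i + 0.5) * h) * weight(a + (i + 0.5) * h) * h
               for i in range(n_pts))

limit = integral(lambda x: 1.0)                      # dsigma = dx
errors = [abs(integral(lambda x, k=k: 1.0 + math.cos(k * x)) - limit)
          for k in (10, 100, 1000)]                  # dsigma_k = (1 + cos kx) dx
print(all(e < 0.05 for e in errors))                 # True
```

Here $d\sigma_k = (1 + \cos kx)\,dx$, and the oscillatory part contributes less and less as $k$ grows, in agreement with (1.3.2).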
Theorem 1.3.2. Let $\sigma(x)$, $\sigma^\dagger(x)$ be of bounded variation over the finite interval $(a, b)$, and be equal for $x = a$, $x = b$, and at a set $I$ which is everywhere dense in $(a, b)$. Then

\[ \int_a^b f(x)\,d\sigma^\dagger(x) = \int_a^b f(x)\,d\sigma(x). \tag{1.3.4} \]

This is included in Theorem 1.3.1 by taking all the $\sigma_n(x)$, $n = 1, 2, \ldots$, as the same as $\sigma^\dagger(x)$. The result also follows directly from the definition of the Stieltjes integral, confining ourselves to subdivisions of $(a, b)$ of which the interior points of subdivision belong to $I$.
1.4. The Helly-Bray Theorem

The following important principle enables us to sidestep the question of the convergence of a sequence of functions of bounded variation, enabling us to rely on the simpler properties of uniform boundedness and uniformly bounded variation. We prove the result first in its basic form as applying to finite intervals, though for most applications modifications will be required.
Theorem 1.4.1. In the finite interval $a \le x \le b$ let the functions $\sigma_n(x)$ be uniformly bounded and of uniformly bounded variation, so that for $n = 1, 2, \ldots$, and some fixed $c$, $c'$ we have

\[ |\sigma_n(x)| \le c, \qquad \mathrm{var}\,\{\sigma_n(x);\, a, b\} \le c'. \tag{1.4.1-2} \]

Then there is an infinite sequence $n_1 < n_2 < \cdots$ of positive integers and a function $\sigma(x)$, of bounded variation over $(a, b)$, such that for any $f(x)$, continuous in $[a, b]$, we have, as $m \to \infty$,

\[ \int_a^b f(x)\,d\sigma_{n_m}(x) \to \int_a^b f(x)\,d\sigma(x). \tag{1.4.3} \]

We choose the sequence $n_m$ so that the sequence $\sigma_{n_m}(x)$ converges as $m \to \infty$ for all $x$ in a certain subset $I$ of $[a, b]$, to be denumerable and dense there. We shall here take $I$ to be the set of $x$ of the form $a + t(b - a)$, where $t$, $0 \le t \le 1$, is rational. Since the rationals may be arranged in order, we may take it that $I$ consists of a sequence $x_1, x_2, \ldots$. By (1.4.1) and the Bolzano-Weierstrass theorem we may select a sequence of positive integers $n_{11}, n_{12}, \ldots$, so that the sequence $\sigma_{n_{1m}}(x_1)$ converges as $m \to \infty$. We next choose a sequence $n_{21}, n_{22}, \ldots$, a subsequence of $n_{11}, n_{12}, \ldots$, so that the sequence $\sigma_{n_{2m}}(x_2)$ converges as $m \to \infty$, followed by a subsequence $n_{31}, n_{32}, \ldots$ of $n_{21}, n_{22}, \ldots$ such that the sequence $\sigma_{n_{3m}}(x_3)$ converges, and so on. The "diagonal" sequence
$n_{11}, n_{22}, \ldots$ then belongs to the sequence $n_{r1}, n_{r2}, \ldots$ after some point; to be precise, we have $n_{mm} \in \{n_{r1}, n_{r2}, \ldots\}$ for $m \ge r$. Hence the sequence $\sigma_{n_{mm}}(x_r)$ converges as $m \to \infty$, that is to say, $\sigma_{n_{mm}}(x)$ converges as $m \to \infty$ for all $x \in I$. Abbreviating $n_{mm}$ to $n_m$, we now define

\[ \sigma(x) = \lim_{m \to \infty} \sigma_{n_m}(x) \tag{1.4.4} \]

for all $x \in I$. This leaves $\sigma(x)$ undefined when $x = a + t(b - a)$ and $t$, $0 < t < 1$, is irrational; somewhat arbitrarily, we define $\sigma(x)$ for such $x$ as the right-hand limit $\sigma(x) = \sigma(x + 0)$. It is easily seen that $\sigma(x)$ is of bounded variation. We have in the first place

\[ \sum_{r=0}^{n-1} |\sigma(\xi_{r+1}) - \sigma(\xi_r)| \le c' \tag{1.4.5} \]

for any subdivision of $(a, b)$ for which all the $\xi_r \in I$; for this holds with $\sigma_{n_m}(x)$ in place of $\sigma(x)$, by (1.4.2), and hence holds for $\sigma(x)$ by making $m \to \infty$. Secondly, if some or all of $\xi_1, \ldots, \xi_{n-1}$ do not belong to $I$, we replace them by numbers $\xi_1', \ldots, \xi_{n-1}'$ in $I$ and in certain right-neighborhoods of $\xi_1, \ldots, \xi_{n-1}$, respectively. Then (1.4.5) holds for the subdivision $a = \xi_0 < \xi_1' < \cdots < \xi_{n-1}' < \xi_n = b$, and follows for the original subdivision on making $\xi_r'$ tend to $\xi_r$ on the right, $r = 1, \ldots, n - 1$, through points of $I$. Since $\sigma(x)$ is of bounded variation and since $\sigma_{n_m}(x) \to \sigma(x)$ for $x = a$, $x = b$, and all $x$ in the dense subset $I$ of $(a, b)$, we have the required conclusion (1.4.3) as a consequence of Theorem 1.3.1. It is possible to replace the limit $\sigma(x)$ constructed above by another, $\sigma^\dagger(x)$, such that $\sigma_n(x) \to \sigma^\dagger(x)$ for some $n$-sequence and all $x \in [a, b]$. However, the essential point for us here is merely that there exists at any rate one function $\sigma(x)$ and a sequence $\{n_m\}$ satisfying (1.4.3).
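The selection step of the proof can be imitated crudely in finite terms. The Python sketch below is my own toy version, not the book's construction: in place of a genuine Bolzano-Weierstrass subsequence it keeps, at each test point, a cluster of indices with nearby values, which is the finite analog of one refinement step.

```python
# A sketch (toy family and the crude clustering are my own) of the selection
# principle behind the proof: from the bounded family sigma_n(x) = sin(n x),
# which diverges along n itself, we keep at each test point only indices whose
# values cluster; iterating over a countable dense set and diagonalizing, as in
# the text, would yield a subsequence convergent at every point of the set.
import math

def sigma(n, x):
    return math.sin(n * x)

indices = list(range(1, 200001))
test_points = (1.0, math.sqrt(2.0))   # stand-ins for the points x_1, x_2, ...
for x in test_points:
    indices.sort(key=lambda n: sigma(n, x))
    mid, w = len(indices) // 2, len(indices) // 40
    indices = indices[mid - w : mid + w]   # keep a cluster of nearby values

spreads = [max(sigma(n, x) for n in indices) - min(sigma(n, x) for n in indices)
           for x in test_points]
print(all(s < 0.25 for s in spreads))     # True
```

Each pass shrinks the index set but narrows the oscillation of $\sigma_n$ at the point treated, while earlier points keep their small oscillation because only a subsequence is retained, mirroring the nested subsequences $n_{r1}, n_{r2}, \ldots$ of the text.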
1.5. Infinite Interval and Bounded Integrand

This heading describes roughly the first of the required modifications of the Helly-Bray theorem.
Theorem 1.5.1. Let $\sigma_n(x)$, $n = 1, 2, \ldots$, be uniformly bounded and uniformly of bounded variation on $-\infty < x < \infty$, that is, let (1.4.1) hold for all $x$, and (1.4.2) for all finite $a$, $b$ with $a < b$. Let also the limits

\[ \sigma_n(\infty) = \lim_{x \to \infty} \sigma_n(x), \qquad \sigma_n(-\infty) = \lim_{x \to -\infty} \sigma_n(x) \tag{1.5.1-2} \]

exist.
Then there exists a sequence $n_1, n_2, \ldots$, a function $\sigma(x)$ of bounded variation on $-\infty \le x \le \infty$, and constants $\alpha$, $\beta$, such that for any function $f(x)$, continuous on the whole real axis, with in particular the finite limits

\[ f(\infty) = \lim_{x \to \infty} f(x), \qquad f(-\infty) = \lim_{x \to -\infty} f(x), \tag{1.5.3-4} \]

we have, as $m \to \infty$,

\[ \int_{-\infty}^\infty f(x)\,d\sigma_{n_m}(x) \to \alpha f(-\infty) + \int_{-\infty}^\infty f(x)\,d\sigma(x) + \beta f(\infty). \tag{1.5.5} \]

We have here assumed that $\sigma(x)$ satisfies the analog of (1.5.1-2), this necessitating the supplementary discrete terms on the right of (1.5.5). We first construct $\sigma(x)$. We choose the sequence $n_1, n_2, \ldots$, so that $\sigma_{n_m}(x)$ converges as $m \to \infty$ for any rational $x$. This is to be achieved by the diagonal process used in the proof of Theorem 1.4.1; the set of all real rationals forms a denumerable set $I$, just as the set $I$ considered in Section 1.4 for a finite interval. For rational $x$ we then define $\sigma(x) = \lim_{m \to \infty} \sigma_{n_m}(x)$; for irrational $x$ we define $\sigma(x) = \sigma(x + 0)$, the right-hand limit being taken through rational values. As in Section 1.4, it may be shown that this $\sigma(x)$ is of bounded variation, uniformly in the sense that

\[ \mathrm{var}\,\{\sigma(x);\, a, b\} \le c' \tag{1.5.6} \]

for all $a$, $b$. Hence $\sigma(x)$ will tend to limits as $x \to \pm\infty$; these are defined as $\sigma(\pm\infty)$. In particular, we have by Theorem 1.2.4 that the integral on the right of (1.5.5) exists. Since the sequences $\sigma_{n_m}(\infty)$, $\sigma_{n_m}(-\infty)$, $m = 1, 2, \ldots$, are bounded, by (1.4.1) and (1.5.1-2), we may select a subsequence of the $n_m$ such that these two sequences both converge. To conserve notation, we assume that the sequence $n_1, n_2, \ldots$, has this property. Choosing any positive integer $N$ we now write
\[ \int_{-\infty}^\infty f(x)\,d\sigma_{n_m}(x) = J_1 + J_2 + J_3 + J_4 + J_5, \]

say, where

\[ J_1 = \int_{-N}^{N} f(x)\,d\sigma_{n_m}(x), \qquad J_2 = f(\infty)\{\sigma_{n_m}(\infty) - \sigma_{n_m}(N)\}, \qquad J_3 = \int_N^\infty \{f(x) - f(\infty)\}\,d\sigma_{n_m}(x), \]

and $J_4$, $J_5$ are formed as $J_2$, $J_3$ with $-\infty$, $-N$ in place of $\infty$, $N$. Making $m \to \infty$, for fixed $N$, we have by Theorem 1.4.1 that

\[ J_1 \to \int_{-N}^{N} f(x)\,d\sigma(x). \]

If therefore we define

\[ \beta = \lim_{m \to \infty} \sigma_{n_m}(\infty) - \sigma(\infty), \tag{1.5.7} \]

we have, as $m \to \infty$,

\[ J_2 \to \beta f(\infty) + f(\infty)\{\sigma(\infty) - \sigma(N)\}. \]

Passing to $J_3$, we have by (1.4.2) the bound

\[ |J_3| \le \Big\{ \max_{x \ge N} |f(x) - f(\infty)| \Big\}\, c'. \]

The treatment of $J_4$, $J_5$ is similar, $\alpha$ being defined by (1.5.7) with $-\infty$ replacing $\infty$. From these results it is easily seen that

\[ \int_{-\infty}^\infty f(x)\,d\sigma_{n_m}(x) - \alpha f(-\infty) - \int_{-\infty}^\infty f(x)\,d\sigma(x) - \beta f(\infty) \]

may be made less than any assigned $\varepsilon > 0$ in absolute value by first fixing $N = N(\varepsilon)$ sufficiently large, and then taking $m > m_0(\varepsilon)$ sufficiently large. This proves the theorem.
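The role of the constants $\alpha$, $\beta$ can be seen in a case where mass escapes to infinity. The Python sketch below is my own example, not the book's.

```python
# A sketch (example my own) of why the terms alpha*f(-inf), beta*f(+inf) in
# (1.5.5) are needed: if sigma_n has a single unit jump at x = n, its mass
# escapes to +infinity, so sigma = 0, beta = 1, and
# int f dsigma_n = f(n) -> f(+inf) = beta * f(+inf).
import math

f = math.atan                       # continuous on the real axis with limits at +-infinity
f_at_inf = math.pi / 2              # f(+inf); here alpha*f(-inf) contributes nothing

integrals = [f(n) for n in (1, 10, 100, 1000)]   # int f dsigma_n is just f(n)
print(abs(integrals[-1] - f_at_inf) < 0.002)     # True
```

Here $\sigma_n(x) \to 0$ for every fixed $x$, so the integral against the limit function vanishes, and the whole limit of $\int f\,d\sigma_n$ is carried by the discrete term $\beta f(\infty)$.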
1.6. Infinite Interval with Polynomial Integrand

For this variant of the Helly-Bray theorem we confine attention, with a view to simplicity of statement, to real-valued and nondecreasing $\sigma_n(x)$; this is the situation that is primarily relevant for the case of orthogonal polynomials on the real axis.
Theorem 1.6.1. Let the real-valued nondecreasing functions $\sigma_n(x)$, $-\infty < x < \infty$, $n = 1, 2, \ldots$, be uniformly bounded in the sense (1.4.1), and let there be constants $c_{2p}$, $p = 0, 1, \ldots$, such that

\[ \int_{-\infty}^\infty x^{2p}\,d\sigma_n(x) \le c_{2p} \tag{1.6.1} \]

for $n = 1, 2, \ldots$ and $p = 0, 1, \ldots$, the improper integral on the left having in particular a finite value. Then there is a sequence of positive integers $n_1 < n_2 < \cdots$ and a nondecreasing bounded function $\sigma(x)$ such that, as $m \to \infty$,

\[ \int_{-\infty}^\infty x^q\,d\sigma_{n_m}(x) \to \int_{-\infty}^\infty x^q\,d\sigma(x), \tag{1.6.2} \]

for any integral $q \ge 0$. It will follow of course that $x^q$ may be replaced in (1.6.2) by an arbitrary polynomial. Since the $\sigma_n(x)$ are nondecreasing and uniformly bounded, they are also of uniformly bounded variation. We may as before choose the sequence $n_m$ so that $\sigma_{n_m}(x)$ converges as $m \to \infty$ whenever $x$ is rational, defining $\sigma(x)$ when $x$ is irrational by $\sigma(x) = \sigma(x + 0)$. Obvious limiting processes show that $\sigma(x)$ is also nondecreasing and bounded, so that $\sigma(\pm\infty)$, understood as the limits of $\sigma(x)$ as $x \to \pm\infty$, also exist and are finite. Next we show that the integral on the right of (1.6.2) exists. For $0 < b < b'$ we have from (1.6.1) that

\[ b^{2p}\{\sigma_n(b') - \sigma_n(b)\} \le \int_b^{b'} x^{2p}\,d\sigma_n(x) \le c_{2p}. \]

If $b$, $b'$ are rational we may make $m \to \infty$ and deduce that

\[ b^{2p}\{\sigma(b') - \sigma(b)\} \le c_{2p}. \tag{1.6.3} \]

If $b$ or $b'$ is irrational the same result holds in view of the definition $\sigma(b) = \sigma(b + 0)$ or $\sigma(b') = \sigma(b' + 0)$. Making $b' \to \infty$ in (1.6.3) we deduce that, for $b > 0$,

\[ b^{2p}\{\sigma(\infty) - \sigma(b)\} \le c_{2p}. \tag{1.6.4} \]
Similarly we have, for any $a < 0$,

\[ a^{2p}\{\sigma(a) - \sigma(-\infty)\} \le c_{2p}. \tag{1.6.5} \]
These ensure the conditions (1.2.14-15), there shown to be sufficient for the existence of the integrals on the right of (1.6.2). For the proof of (1.6.2) we write the integral on the left as
\[ \int_{-\infty}^\infty x^q\,d\sigma_{n_m}(x) = \int_{-N}^{N} + \int_N^\infty + \int_{-\infty}^{-N} = J_1 + J_2 + J_3, \]

say, for some positive integral $N$. As $m \to \infty$ we have

\[ J_1 = \int_{-N}^{N} x^q\,d\sigma_{n_m}(x) \to \int_{-N}^{N} x^q\,d\sigma(x), \]

by Theorem 1.4.1. To estimate $J_2$, $J_3$ we have

\[ |J_2| \le N^{-q-2} \int_N^\infty x^{2q+2}\,d\sigma_{n_m}(x) \le N^{-q-2} c_{2q+2}, \]

and similarly $|J_3| \le N^{-q-2} c_{2q+2}$. It follows that for any $\varepsilon > 0$ we may ensure that

\[ \Big| \int_{-\infty}^\infty x^q\,d\sigma_{n_m}(x) - \int_{-N}^{N} x^q\,d\sigma(x) \Big| < \varepsilon, \tag{1.6.6} \]

by first taking $N$ sufficiently large and fixed, (1.6.6) then holding for all $m$ beyond some value. This proves the result. In the application to orthogonal polynomials, the $\sigma_n(x)$ are step functions with a finite number of jumps, and the integrals (1.6.1) have a constant value, for fixed $p$ and all sufficiently large $n$.
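A miniature instance of Theorem 1.6.1 can be computed directly. The Python sketch below uses toy measures of my own devising, not an example from the text.

```python
# A sketch (toy measures my own) of Theorem 1.6.1: sigma_n places mass 1/2 at
# each of +-(1 + 1/n); the even-moment bounds (1.6.1) hold uniformly with
# c_2p = 2^(2p), and every moment converges to the moment of the limit
# measure, which has mass 1/2 at each of +-1.
def moment(n, q):
    t = 1.0 + 1.0 / n
    return (t ** q + (-t) ** q) / 2.0

limit_moment = lambda q: (1.0 + (-1.0) ** q) / 2.0
ok = all(abs(moment(10 ** 6, q) - limit_moment(q)) < 1e-4 for q in range(8))
print(ok)   # True
```

The even moments stay below $2^{2p}$ for every $n$, so no mass can drift off to infinity, and every moment of $\sigma_n$ converges to the corresponding moment of the limit.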
1.7. A Periodic Case

In connection with orthogonal polynomials on the unit circle we need the following inessential variation of the basic theorem.

Theorem 1.7.1. Let $\sigma_n(\theta)$, $n = 1, 2, \ldots$, be a sequence of uniformly bounded nondecreasing functions over $0 \le \theta \le 2\pi$. Then there is a sequence $n_1, n_2, \ldots$ and a nondecreasing, bounded and right-continuous
function $\tau(\theta)$ such that, for any function $f(\theta)$, continuous in $[0, 2\pi]$ and such that $f(0) = f(2\pi)$, we have, as $m \to \infty$,

\[ \int_0^{2\pi} f(\theta)\,d\sigma_{n_m}(\theta) \to \int_0^{2\pi} f(\theta)\,d\tau(\theta). \tag{1.7.1} \]

By Section 1.4, there exists a bounded and nondecreasing function $\sigma(\theta)$ such that

\[ \int_0^{2\pi} f(\theta)\,d\sigma_{n_m}(\theta) \to \int_0^{2\pi} f(\theta)\,d\sigma(\theta). \tag{1.7.2} \]

If $\sigma(\theta)$ is right-continuous, there is nothing more to prove. If $\sigma(\theta)$ has discontinuities in the interior of $(0, 2\pi)$ at which it fails to be right-continuous, we may replace it by the function $\tau(\theta) = \sigma(\theta + 0)$ at such points; provided that $\sigma(\theta)$ is continuous at $\theta = 0$, the definition $\tau(\theta) = \sigma(\theta + 0)$ for $0 \le \theta < 2\pi$, $\tau(2\pi) = \sigma(2\pi)$, yields a right-continuous nondecreasing function, and the replacement of $\sigma(\theta)$ by $\tau(\theta)$ on the right of (1.7.2) will not affect the value of the integral. It remains to consider the eventuality of $\sigma(\theta)$ having a jump $\sigma(0 + 0) - \sigma(0)$ at $\theta = 0$. By the periodicity of $f(\theta)$, that is to say, the fact that $f(0) = f(2\pi)$, we may transfer this jump to $\theta = 2\pi$, the final and general definition being $\tau(\theta) = \sigma(\theta + 0)$ for $0 \le \theta < 2\pi$, $\tau(2\pi) = \sigma(2\pi) + \sigma(0 + 0) - \sigma(0)$.
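The jump-transfer step can be verified on a concrete measure. The Python sketch below is my own illustration, not the book's.

```python
# A sketch (the measure is my own) of the jump transfer in Theorem 1.7.1: a
# jump J of sigma at theta = 0 may instead be placed at theta = 2*pi without
# changing int f dsigma, because f(0) = f(2*pi) for the admissible f.
import math

J = 0.5
two_pi = 2.0 * math.pi
f = math.cos                          # periodic: f(0) = f(2*pi)

def integral(jump_at_zero):
    # continuous part d(theta)/(2*pi) on (0, 2*pi), plus the jump J at one end
    n = 100000
    h = two_pi / n
    cont = sum(f((i + 0.5) * h) * h / two_pi for i in range(n))
    return cont + J * (f(0.0) if jump_at_zero else f(two_pi))

print(abs(integral(True) - integral(False)) < 1e-12)   # True
```

The two integrals agree precisely because the jump is weighted by $f$ at the two endpoints, where the periodicity makes the values equal.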
1.8. The Matrix Extension

We now consider integrals of the form

\[ \int_a^b f(x)\,d\sigma(x), \qquad \int_a^b d\sigma(x)\,f(x), \qquad \int_a^b f(x)\,d\sigma(x)\,g(x), \tag{1.8.1-3} \]

where $f(x)$, $g(x)$ may be scalar functions, or row or column matrices or square matrices, and $\sigma(x)$ is a square matrix of fixed order $k$; regard must of course be had to the order and compatibility of matrix products. In the previous work, such integrals arose in connection with orthogonal polynomials on the real axis whose coefficients were square matrices, and the general first-order differential equations (Chapters 6, 9). For a finite interval $(a, b)$, (1.8.1) is again interpreted in the sense (1.2.1-3) as the limit of Riemann sums; (1.8.2-3) are defined similarly, the order of factors being preserved. In most cases it is a trivial matter to extend to this matrix case the theorems on integration given in the previous sections. We can again say, as in Theorem 1.2.2, that the integral exists for a finite interval $(a, b)$, if $f(x)$ (and $g(x)$ as the case may be) are continuous over $[a, b]$ (in the sense of the continuity of each of their entries) and if $\sigma(x)$ is of bounded variation over $(a, b)$ (in the sense
of the boundedness of the variation of each of its entries); the integrals (1.8.1-3) are in fact none other than arrays of Stieltjes integrals of the previous scalar type. Passing to the case of the Helly-Bray theorem for a finite interval (Theorem 1.4.1), let $\sigma_n(x)$, $n = 1, 2, \ldots$, be a sequence of $k$-by-$k$ matrix functions, each of whose entries $\sigma_n^{(rs)}(x)$, $r, s = 1, \ldots, k$, is uniformly bounded and of uniformly bounded variation. Choosing as before a dense and denumerable set $I$ in $[a, b]$, we wish to have an $n$-sequence such that $\sigma_n(x)$, $n = 1, 2, \ldots$, converges at each $x \in I$. For this we may either apply the Bolzano-Weierstrass theorem (in $k^2$ dimensions in the real case), or else proceed as follows. By the argument given in Section 1.4, we may find an $n$-sequence such that $\sigma_n^{(11)}(x)$ converges as $n \to \infty$ for all $x \in I$. Repeating the argument, we choose a subsequence of this $n$-sequence such that $\sigma_n^{(12)}(x)$ also converges for $x \in I$, a subsequence of this subsequence such that $\sigma_n^{(13)}(x)$ converges, and so on, till we arrive at a subsequence for which all the $\sigma_n^{(rs)}(x)$ converge. The proof of the analog of (1.4.3) is then completed as before. We quote formally the analog of Theorem 1.6.1.
Theorem 1.8.1. Let $\sigma_n(x)$, $n = 1, 2, \ldots$, $-\infty < x < \infty$, be a sequence of Hermitean nondecreasing matrix functions which are uniformly bounded, and satisfy (1.5.1-2). Let there be constant Hermitean matrices $C_{2p}$, $p = 0, 1, \ldots$, such that

\[ \int_{-\infty}^\infty x^{2p}\,d\sigma_n(x) \le C_{2p}. \tag{1.8.4} \]

Then there is a sequence of positive integers $n_1 < n_2 < \cdots$ and a nondecreasing bounded Hermitean matrix function $\sigma(x)$ such that, for any integral $q \ge 0$,

\[ \int_{-\infty}^\infty x^q\,d\sigma_{n_m}(x) \to \int_{-\infty}^\infty x^q\,d\sigma(x) \tag{1.8.5} \]
as $m \to \infty$. In saying that $\sigma(x)$ is nondecreasing, we mean that $v^*\{\sigma(x') - \sigma(x)\}v \ge 0$ if $x' > x$ and for all column matrices $v$. This has the consequence that

\[ |\sigma^{(rs)}(x') - \sigma^{(rs)}(x)|^2 \le \{\sigma^{(rr)}(x') - \sigma^{(rr)}(x)\} \cdot \{\sigma^{(ss)}(x') - \sigma^{(ss)}(x)\}, \tag{1.8.6} \]

the differences in the braces on the right being non-negative. This may be put in the simpler though weaker form

\[ |\sigma^{(rs)}(x') - \sigma^{(rs)}(x)| \le \mathrm{tr}\,\{\sigma(x') - \sigma(x)\}, \tag{1.8.7} \]

the trace being the sum of all the necessarily non-negative diagonal entries in $\sigma(x') - \sigma(x)$.
We choose the sequence $n_m$ so that $\sigma_{n_m}(x)$ converges as $m \to \infty$ for all rational $x$, the limit being $\sigma(x)$, the definition of $\sigma(x)$ being completed as before by right-continuity. Clearly $\sigma(x)$ is bounded, Hermitean, and nondecreasing. From the matrix inequalities (1.8.4) we deduce the corresponding scalar inequalities for diagonal entries, namely,

\[ \int_{-\infty}^\infty x^{2p}\,d\sigma_n^{(rr)}(x) \le C_{2p}^{(rr)}, \qquad r = 1, \ldots, k. \tag{1.8.8} \]

By Theorem 1.6.1 we have at once that

\[ \int_{-\infty}^\infty x^q\,d\sigma_{n_m}^{(rr)}(x) \to \int_{-\infty}^\infty x^q\,d\sigma^{(rr)}(x), \tag{1.8.9} \]

so that (1.8.5) is correct so far as diagonal entries are concerned. It remains to show that we have also

\[ \int_{-\infty}^\infty x^q\,d\sigma_{n_m}^{(rs)}(x) \to \int_{-\infty}^\infty x^q\,d\sigma^{(rs)}(x) \tag{1.8.10} \]

when $r \ne s$, and in particular that these integrals exist. This may be proved in a manner very similar to the proof of Theorem 1.6.1. If in (1.8.10) we replace the interval $(-\infty, \infty)$ by $(-N, N)$, for positive integral $N$, the result is true by Theorem 1.3.1. To complete the proof it will be sufficient to show that, for $0 < b < b'$, with rational $b'$, we have

\[ \int_b^{b'} x^q\,d\sigma_n^{(rs)}(x) \to 0 \tag{1.8.11} \]
as $b \to \infty$, uniformly in $n$, together with a similar result for the integral over $(-b', -b)$. However, it follows easily from (1.8.7), with $\sigma_n$ in place of $\sigma$, that

\[ \Big| \int_b^{b'} x^q\,d\sigma_n^{(rs)}(x) \Big| \le \int_b^{b'} x^q\,d\,\mathrm{tr}\,\{\sigma_n(x)\} \le b^{-q-2} \int_b^{b'} x^{2q+2}\,d\,\mathrm{tr}\,\{\sigma_n(x)\}, \]

and so, by (1.8.4),

\[ \Big| \int_b^{b'} x^q\,d\sigma_n^{(rs)}(x) \Big| \le b^{-q-2}\, \mathrm{tr}\, C_{2q+2}, \]

which establishes (1.8.11) and completes our proof.
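The entrywise inequalities (1.8.6)-(1.8.7), on which the off-diagonal estimate rests, can be checked on a random increment. The Python sketch below (real Hermitean case, my own construction) models an increment $\sigma(x') - \sigma(x)$ by a positive semi-definite Gram matrix.

```python
# A sketch (random increment, real Hermitean case, my own) checking the
# entrywise bounds (1.8.6)-(1.8.7): for a nondecreasing increment
# D = sigma(x') - sigma(x), i.e. a positive semi-definite matrix,
# |D_rs|^2 <= D_rr * D_ss and |D_rs| <= tr D.
import random

random.seed(1)
k = 4
A = [[random.gauss(0.0, 1.0) for _ in range(k)] for _ in range(k)]
# D = A^T A is positive semi-definite, the model of an increment of sigma
D = [[sum(A[t][r] * A[t][s] for t in range(k)) for s in range(k)] for r in range(k)]
trace = sum(D[r][r] for r in range(k))
ok = all(D[r][s] ** 2 <= D[r][r] * D[s][s] + 1e-9
         and abs(D[r][s]) <= trace + 1e-9
         for r in range(k) for s in range(k))
print(ok)   # True
```

Both bounds are instances of the Cauchy-Schwarz inequality for the semi-inner product $v^* D w$, which is why the nondecreasing character of $\sigma$ controls every entry through the diagonal alone.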
1.9. The Multi-Dimensional Case

We indicate finally an extension of a different kind in which the integration is over a set in euclidean space of $k$ dimensions. This arises in the theory of simultaneous boundary problems involving several parameters; for example, we might have $k$ ordinary differential equations, each involving $k$ parameters, and each subjected to boundary conditions. Under certain conditions, the eigenvalues will be sets of $k$ real numbers, and integrals or sums concerning them will involve integration in the Stieltjes sense over $k$-space. In Chapter 6 we took this matter up in the context of orthogonal polynomials on the real axis, or rather in real $k$-space. Confining attention now to scalar-valued integrands, we may relate the Riemann-Stieltjes integral as before to a function $\sigma$ defined at each point of the region of integration. In the ordinary case $k = 1$, we may describe the definition of the integral by saying that with every interval $(a, b)$ is associated a weight or measure $\sigma(b) - \sigma(a)$. In the case $k = 2$ we shall have a function $\sigma(x_1, x_2)$ of two variables, and with the "interval" $a_1 \le x_1 \le b_1$, $a_2 \le x_2 \le b_2$ associate the measure

\[ \sigma(b_1, b_2) - \sigma(a_1, b_2) - \sigma(b_1, a_2) + \sigma(a_1, a_2). \tag{1.9.1} \]

For the general case, using boldface letters for $k$-tuples of real numbers, $\mathbf{x} = (x_1, \ldots, x_k)$, the measure of the interval $\mathbf{a} \le \mathbf{x} \le \mathbf{b}$, that is, the parallelopiped $a_r \le x_r \le b_r$, $r = 1, \ldots, k$, will be

\[ \sum_{\mathbf{y}} (-1)^\nu\, \sigma(\mathbf{y}), \tag{1.9.2} \]

where $\mathbf{y}$ runs through all the $2^k$ vertices of the parallelopiped, the $r$th coordinate being either $a_r$ or $b_r$, and $\nu$ being the number of the $a_r$ present among the coordinates of $\mathbf{y}$. To define the Riemann-Stieltjes integral, let us take it that $\sigma(\mathbf{x})$ is defined for all finite $\mathbf{x}$, but that for sufficiently large $\mathbf{x}$ the measure (1.9.2) of a $k$-dimensional interval vanishes; to be precise, we assume that for some $M > 0$ the function $\sigma(\mathbf{x}) = \sigma(x_1, \ldots, x_k)$ is constant in $x_r$ if $x_r \ge M$, or if $x_r \le -M$, for any $r = 1, \ldots, k$. We now suppose the euclidean space $\mathscr{E}_k$ subdivided into an infinity of boxes by hyperplanes of the form $x_r = \text{const.}$, for an infinity of constants and $r = 1, \ldots, k$; we suppose the boxes numbered serially, and denoted $B_s$, $s = 1, 2, \ldots$. Choosing an arbitrary point $\boldsymbol{\eta}_s$ in each $B_s$, and denoting the measure (1.9.2) of such a box by $\mu(B_s)$, the integral over $\mathscr{E}_k$ of an arbitrary
function $f(\mathbf{x})$ will be the limit, if such exists and is unique, of the approximating sums

\[ \sum_s f(\boldsymbol{\eta}_s)\, \mu(B_s), \tag{1.9.3} \]

subject as usual to the subdivisions being in the limit indefinitely fine. As an example, we may put in the form of a Stieltjes integral a finite sum, of the type occurring in connection with orthogonal polynomials. Let $\mathbf{x}^{(u)} = (x_1^{(u)}, \ldots, x_k^{(u)})$, $u = 1, \ldots, m$, be a finite set of points in $\mathscr{E}_k$, and $\kappa_u$, $u = 1, \ldots, m$, some associated weights. Let $f(\mathbf{x})$ be any continuous function, for example, a polynomial in $x_1, \ldots, x_k$. We wish to represent as a Stieltjes integral the sum $\sum_1^m f(\mathbf{x}^{(u)})\, \kappa_u$. The associated weight distribution is given by

\[ \sigma(\mathbf{x}) = \sum_{\mathbf{x}^{(u)} \le \mathbf{x}} \kappa_u, \tag{1.9.4} \]

where the inequality $\mathbf{x}^{(u)} \le \mathbf{x}$ is to be understood in the sense $x_r^{(u)} \le x_r$, $r = 1, \ldots, k$. It is easily seen that this function allots zero measure, in the sense (1.9.2), to any box not containing in its interior or on its boundary one of the $\mathbf{x}^{(u)}$, and that

\[ \int_{\mathscr{E}_k} f(\mathbf{x})\,d\sigma(\mathbf{x}) = \sum_{u=1}^m f(\mathbf{x}^{(u)})\, \kappa_u. \]

Here the $k$-dimensional integral on the left is to be understood as the unique limit of the sums (1.9.3). The function $\sigma(\mathbf{x})$ may be said to be of bounded variation if the sum

\[ \sum_s |\mu(B_s)| \]

admits a fixed upper bound for any collection of nonoverlapping intervals or boxes, overlapping being only permitted in respect of boundaries. It may be said to be nondecreasing if the measure of any box, that is to say, the expression (1.9.2), is real and non-negative. With these definitions, the theory of Sections 1.2-8 may be extended to $k$ dimensions.
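The alternating vertex sum (1.9.2) and the weight distribution (1.9.4) can be exercised in two dimensions. The Python sketch below uses points and weights of my own invention.

```python
# A sketch (points and weights invented) of (1.9.2) and (1.9.4) in k = 2
# dimensions: the alternating vertex sum applied to the weight distribution
# sigma recovers exactly the total weight kappa_u of the atoms in a box.
from itertools import product

atoms = [((0.2, 0.3), 1.0), ((0.6, 0.8), 2.0), ((1.5, 0.1), 4.0)]  # (x^(u), kappa_u)

def sigma(x):
    # (1.9.4): total weight of the atoms x^(u) <= x, coordinatewise
    return sum(kap for (p, kap) in atoms if all(pi <= xi for pi, xi in zip(p, x)))

def box_measure(a, b):
    # (1.9.2): sum over the 2^k vertices, with sign (-1)^nu,
    # nu = number of a_r-coordinates of the vertex
    total = 0.0
    for choice in product((0, 1), repeat=len(a)):
        y = tuple(b[i] if c else a[i] for i, c in enumerate(choice))
        nu = sum(1 for c in choice if c == 0)
        total += (-1) ** nu * sigma(y)
    return total

print(box_measure((0.0, 0.0), (1.0, 1.0)))   # 3.0 -- the first two atoms
```

The box $(0,0) \le \mathbf{x} \le (1,1)$ contains the atoms of weight $1$ and $2$, and the four-vertex alternating sum returns exactly their total weight.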
APPENDIX II

Functions of Negative Imaginary Type

II.1. Introduction

In the heading of this appendix we wish to describe, in a general way, functions $f(\lambda)$ of the complex variable $\lambda$ which are analytic in the upper half-plane and which have there negative imaginary part, or more precisely are such that

\[ \mathrm{Im}\, f(\lambda) \le 0 \quad \text{for} \quad \mathrm{Im}\, \lambda > 0. \tag{II.1.1} \]

We are mainly concerned with the case in which $f(\lambda)$ is meromorphic, real for real $\lambda$, apart from its poles which are also real, and furthermore, indeed consequently,

\[ \mathrm{Im}\, f(\lambda) \ge 0 \quad \text{for} \quad \mathrm{Im}\, \lambda < 0. \tag{II.1.2} \]

This situation arises by applying fractional-linear transformations to "fundamental solutions" which have a length-preserving property for real $\lambda$ and which are contractive in one of the upper or lower half-planes [cf., for example, (1.6.1), (4.5.4), (8.13.1), or (9.5.1)]. Another, and equivalent, source of such functions is provided by inhomogeneous problems and Green's functions. Here we consider the case in which $f(\lambda)$ is scalar-valued. Functions of this type appear widely in the electrical engineering literature, in the modified form that $f(\lambda)$ is to have positive real part when $\lambda$ has positive real part, possibly with other restrictions as well; they are then known as "positive-real" or "p.r." functions. We prefer here the formulation (II.1.1-2) since our functions have poles at the eigenvalues of boundary problems, these eigenvalues being conventionally located on the real axis.
II.2. The Rational Case

The case when $f(\lambda)$ is a rational function occurs in finite-dimensional problems, concerning a finite set of recurrence relations. For this case the property (II.1.1-2) is related very simply to the zeros and poles of $f(\lambda)$.

Theorem II.2.1. In order that $f(\lambda)$ be a rational function satisfying (II.1.1-2), or satisfying these inequalities with the sign of $\mathrm{Im}\, f(\lambda)$ reversed, it is necessary and sufficient that it have the form $f(\lambda) = p(\lambda)/q(\lambda)$, where $p(\lambda)$ and $q(\lambda)$ are polynomials with real coefficients, with real and simple zeros and no common zeros, the zeros of $p(\lambda)$ separating those of $q(\lambda)$.

By the last statement we mean that between any two zeros of the one polynomial there lies exactly one zero of the other. Assuming that (II.1.1-2) hold, we have that $f(\lambda)$ must be real when $\lambda$ is real, apart from poles of $f(\lambda)$. We assert that these poles are real and simple, as also are the zeros of $f(\lambda)$. Suppose first that $\lambda_0$ were a complex zero of $f(\lambda)$, of order $r$, say. Near $\lambda_0$ there would then be an expansion of the form

\[ f(\lambda) = c(\lambda - \lambda_0)^r + \cdots, \tag{II.2.1} \]

or, putting $c = \gamma \exp(i\alpha)$, $\lambda - \lambda_0 = \eta \exp(i\theta)$,

\[ f(\lambda) = \gamma \eta^r \exp i(\alpha + r\theta) + O(\eta^{r+1}), \tag{II.2.2} \]

whence

\[ \mathrm{Im}\, f(\lambda) = \gamma \eta^r \sin(\alpha + r\theta) + O(\eta^{r+1}). \tag{II.2.3} \]

Taking it that $c \ne 0$, $\gamma$ being real and positive, we see that if $\theta$ is chosen so that $\alpha + r\theta = \pm\tfrac{1}{2}\pi$ then, for sufficiently small $\eta$, $\mathrm{Im}\, f(\lambda)$ may have either sign, which is impossible if $\lambda_0$ is complex. Hence $f(\lambda)$ cannot have a complex zero. Very similar arguments show that it cannot have a complex pole. Supposing $\lambda_0$ to be a pole of order $r$, we have in place of (II.2.1) an expansion of the form, for small $\lambda - \lambda_0$,
\[ f(\lambda) = c(\lambda - \lambda_0)^{-r} + \cdots, \tag{II.2.4} \]

leading to, in place of (II.2.3),

\[ \mathrm{Im}\, f(\lambda) = \gamma \eta^{-r} \sin(\alpha - r\theta) + O(\eta^{1-r}). \tag{II.2.5} \]

As before, this shows that $\mathrm{Im}\, f(\lambda)$ takes both signs in a neighborhood of $\lambda_0$, which is excluded by (II.1.1-2) unless $\lambda_0$ is real.
Passing to the consideration of real zeros and poles, we have to show that these must be simple. This follows in a very similar way from (II.2.3), (II.2.5). Suppose if possible that $\lambda_0$ is a real zero, of order $r \ge 2$. Then as $\theta$ increases from $0$ to $\pi$, $\alpha + r\theta$ increases by at least $2\pi$. Hence there will be two values of $\theta$, with $0 < \theta_1, \theta_2 < \pi$, such that $\sin(\alpha + r\theta_1) = -\tfrac{1}{2}$, $\sin(\alpha + r\theta_2) = +\tfrac{1}{2}$. With these values, and for sufficiently small $\eta$, $\mathrm{Im}\, f(\lambda)$ will by (II.2.3) take both signs, which is excluded. Hence the zero must be simple. The same argument, applied to (II.2.5), shows that a real pole must be simple. The same argument applies if (II.1.1-2) hold with reversed sign.

Since $f(\lambda)$ has only real and simple zeros and poles, it follows that we may write it in the form $f(\lambda) = p(\lambda)/q(\lambda)$, where $p(\lambda)$ and $q(\lambda)$ are polynomials with real and simple zeros, and no common zeros. Since, by (II.1.1-2), $f(\lambda)$ is real when $\lambda$ is real, apart from poles, we may suppose that $p(\lambda)$ and $q(\lambda)$ have real coefficients. It remains to prove that the zeros of $p(\lambda)$ and $q(\lambda)$ have the separation property. Suppose, for example, that $p(\lambda)$ had two consecutive zeros $\lambda_1$, $\lambda_2$ between which there was no zero of $q(\lambda)$. Then between $\lambda_1$ and $\lambda_2$ there would be an extremum of $f(\lambda)$, at which $f'(\lambda) = 0$. Denoting the $\lambda$-value in question by $\lambda_0$, we have near $\lambda_0$ a Taylor expansion of the form

\[ f(\lambda) = f(\lambda_0) + c(\lambda - \lambda_0)^r + \cdots, \]

where $r \ge 2$ and $f(\lambda_0)$ is real. We derive again the formula (II.2.3), where $\lambda = \lambda_0 + \eta \exp(i\theta)$. As in the discussion of multiple real zeros, we are led to a contradiction with (II.1.1-2). Similarly, if there were two zeros of $q(\lambda)$ without a zero of $p(\lambda)$ between them, then $f(\lambda)$ would have a finite extremum between them, which as before is impossible. This completes the proof of the necessity. Suppose next that $f(\lambda) = p(\lambda)/q(\lambda)$, where $p(\lambda)$ and $q(\lambda)$ are as described in Theorem II.2.1; we wish to deduce (II.1.1-2). In fact, we prove that there hold the strict inequalities
\[ \mathrm{Im}\, f(\lambda) < 0 \quad \text{for} \quad \mathrm{Im}\, \lambda > 0, \tag{II.2.6} \]
\[ \mathrm{Im}\, f(\lambda) > 0 \quad \text{for} \quad \mathrm{Im}\, \lambda < 0, \tag{II.2.7} \]

provided that $f(\lambda)$ is not merely a real constant. Since the zeros of $p(\lambda)$, $q(\lambda)$ have the separation property, the degrees of $p(\lambda)$ and $q(\lambda)$ will differ by at most unity. If $q(\lambda)$ is a constant, we shall have $f(\lambda) = (a\lambda + b)/c$ for real $a$, $b$, $c$, for which the assertion is trivial. Suppose then that $q(\lambda)$ has $n$ zeros, denoted $\mu_1 < \mu_2 < \cdots < \mu_n$, and
suppose first that $p(\lambda)$ is of degree $n - 1$, with zeros $\lambda_1, \ldots, \lambda_{n-1}$, where $\mu_r < \lambda_r < \mu_{r+1}$. The standard partial fraction formula gives here

\[ f(\lambda) = \sum_{r=1}^{n} \frac{p(\mu_r)}{q'(\mu_r)} (\lambda - \mu_r)^{-1}. \tag{II.2.8} \]

Here we must observe that the coefficients $p(\mu_r)/q'(\mu_r)$ all have the same sign, for $p(\mu_r)$ and $p(\mu_{r+1})$ will have opposite signs by the separation property, while $q'(\mu_r)$ and $q'(\mu_{r+1})$ have opposite signs since $q(\lambda)$ has only simple zeros. If, for example, the $p(\mu_r)/q'(\mu_r)$ are all positive, then (II.1.1-2) hold, with strict inequality, since $\mathrm{Im}\,(\lambda - \mu_r)^{-1}$ has the opposite sign to $\mathrm{Im}\, \lambda$; if, of course, the coefficients in (II.2.8) are all negative, we get (II.1.1-2) with reversed signs. Suppose next that $p(\lambda)$ has the same degree as $q(\lambda)$, so that (II.2.8) must be supplemented on the right by a term $p(\infty)/q(\infty)$, meaning $\lim_{\lambda \to \infty} p(\lambda)/q(\lambda)$. Since this term is a real constant, (II.1.1-2) are unaffected, and the same proof holds good. Finally, take the case in which $p(\lambda)$ is of degree one greater than $q(\lambda)$. This reduces to a previous case if we consider $1/f(\lambda) = q(\lambda)/p(\lambda)$. For this case we can say that $\mathrm{Im}\, 1/f(\lambda)$ has either always the same sign as $\mathrm{Im}\, \lambda$, or else always the opposite sign to $\mathrm{Im}\, \lambda$. On taking reciprocals these situations are interchanged, and we have that $\mathrm{Im}\, f(\lambda)$ has either always the opposite sign to $\mathrm{Im}\, \lambda$, or else always the same sign. This completes the proof of Theorem II.2.1.
11.3. Separation Property in the Meromorphic Case
In the case when f(λ) is no longer rational but is still meromorphic we can show that the "negative imaginary" property still implies a separation of its zeros and poles.
Theorem 11.3.1. Let f(λ) be analytic, except for possible poles on the real axis, these poles having no finite limit-point, and let (11.1.1-2) hold (and so indeed (11.2.6-7) if f(λ) is not a constant). Then the zeros of f(λ), lying necessarily on the real axis, separate and are separated by the poles.
The argument of (11.2.1-3) shows that f(λ) can have no complex zeros; the same argument shows in fact that Im f(λ) cannot vanish for complex λ, apart from the case when f(λ) is a real constant. As before, f(λ) cannot have a pair of zeros not separated by a pole,
for if it did it would have an extremum for some real λ, leading to a contradiction with (11.1.1-2). The hypothesis of two poles not separated by a zero likewise leads to a real extremum, and is therefore rejected. The above theorem may be applied to the proof of certain Sturm-Liouville separation theorems. We refer to other sources for the further development of the theory of "negative imaginary" functions, their general expression in such forms as (2.4.5), and the inversion of the latter, these being properties we have not appealed to.
APPENDIX III
Orthogonality of Vectors
111.1. The Finite-Dimensional Case We use frequently the simple observation that if the rows of a square matrix are mutually orthogonal, and not zero, then the columns are likewise orthogonal, with suitable weights. This is, with a slight transformation, a well-known property of an orthogonal matrix. However, we give a direct proof.
Theorem III.1.1. Let y_{rs}, r, s = 0, ..., m − 1, be orthogonal according to

    Σ_{r=0}^{m−1} a_r ȳ_{rs} y_{rt} = δ_{st} ρ_s ,    s, t = 0, ..., m − 1,    (III.1.1)

where the a_r, ρ_s are real and positive. Then

    Σ_{s=0}^{m−1} ρ_s⁻¹ ȳ_{rs} y_{ts} = δ_{rt} a_r⁻¹ ,    r, t = 0, ..., m − 1.    (III.1.2)
It follows from the orthogonality (III.1.1) that the m vectors y_{0s}, ..., y_{m−1,s}, (s = 0, ..., m − 1), are linearly independent. Thus an arbitrary vector u₀, ..., u_{m−1} may be expressed in the form

    u_n = Σ_{p=0}^{m−1} v_p y_{np} ρ_p⁻¹ ,    n = 0, ..., m − 1;    (III.1.3)

here the ρ_p⁻¹ are normalization factors, and the Fourier coefficients v_p are to be found. Multiplying by a_n ȳ_{nq} and summing over n we get

    v_q = Σ_{n=0}^{m−1} a_n u_n ȳ_{nq} ,
by (III.1.1). Hence, substituting for v_p in (III.1.3),

    u_n = Σ_{p=0}^{m−1} ρ_p⁻¹ y_{np} Σ_{q=0}^{m−1} a_q u_q ȳ_{qp} .

Here the u_n are arbitrary, and (III.1.2) may therefore be derived by comparing coefficients of the u_q. The Parseval equality

    Σ_{p=0}^{m−1} |v_p|² ρ_p⁻¹ = Σ_{n=0}^{m−1} a_n |u_n|²    (III.1.4)

may be verified on substituting for the u_n from (III.1.3), and using (III.1.1).
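The duality of Theorem III.1.1 is easily checked numerically. In the sketch below (illustrative only; the weights and the rotation angle are hypothetical data, not taken from the text) a 2-by-2 system with weighted-orthogonal rows, as in (III.1.1), is built from a rotation matrix, and the column relation (III.1.2) with the reciprocal weights is then verified.

```python
import math

# Hypothetical positive weights a_r and normalizers rho_s.
a   = [2.0, 5.0]
rho = [3.0, 0.5]
c, s = math.cos(0.7), math.sin(0.7)
Q = [[c, -s], [s, c]]     # rows and columns orthonormal (rotation matrix)
# Build y_rs so that sum_r a_r y_rs y_rt = delta_st * rho_s, as in (III.1.1).
y = [[Q[r][t] * math.sqrt(rho[t] / a[r]) for t in range(2)] for r in range(2)]

def row_form(s_, t_):  # left side of (III.1.1)
    return sum(a[r] * y[r][s_] * y[r][t_] for r in range(2))

def col_form(r_, t_):  # left side of (III.1.2)
    return sum(y[r_][s_] * y[t_][s_] / rho[s_] for s_ in range(2))

assert abs(row_form(0, 0) - rho[0]) < 1e-12 and abs(row_form(0, 1)) < 1e-12
assert abs(col_form(0, 0) - 1 / a[0]) < 1e-12 and abs(col_form(0, 1)) < 1e-12
```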
III.2. The Infinite-Dimensional Case
In the case m = ∞ it is not possible to deduce (III.1.2) from (III.1.1), but some deductions can nevertheless be made.
Theorem III.2.1. Let a_r, ρ_r, r = 0, 1, 2, ..., be real and positive, and let the y_{rs}, r, s = 0, 1, 2, ..., satisfy

    Σ_{r=0}^{∞} y_{sr} ȳ_{tr} ρ_r⁻¹ = δ_{st} a_s⁻¹ ,    s, t = 0, 1, ... .    (III.2.1)

Then

    Σ_{r=0}^{∞} a_r |y_{rs}|² ≤ ρ_s ,    s = 0, 1, ... .    (III.2.2)

For an arbitrary sequence u₀, u₁, ..., satisfying

    Σ_{r=0}^{∞} a_r |u_r|² < ∞,    (III.2.3)

define

    v_r = Σ_{p=0}^{∞} a_p u_p ȳ_{pr} .    (III.2.4)
Then there holds the Parseval equality

    Σ_{r=0}^{∞} |v_r|² ρ_r⁻¹ = Σ_{p=0}^{∞} a_p |u_p|² .    (III.2.5)
We prove first that the Parseval equality is true for finite sequences of the form u₀, u₁, ..., u_n, 0, 0, ... . Defining, in accordance with (III.2.4),

    v_{rn} = Σ_{p=0}^{n} a_p u_p ȳ_{pr} ,    (III.2.6)

we have

    Σ_{r=0}^{∞} |v_{rn}|² ρ_r⁻¹ = Σ_{r=0}^{∞} ρ_r⁻¹ Σ_{p=0}^{n} Σ_{q=0}^{n} a_p a_q u_p ū_q ȳ_{pr} y_{qr} ,
the last series being absolutely convergent; this last statement follows from (III.2.1) for s = t together with the Cauchy inequality. Using (III.2.1) we deduce that

    Σ_{r=0}^{∞} |v_{rn}|² ρ_r⁻¹ = Σ_{p=0}^{n} a_p |u_p|² ,    (III.2.7)

in confirmation of (III.2.5) for this case. Let us now prove (III.2.2). We take u_p = y_{ps} for 0 ≤ p ≤ n, u_p = 0 for p > n. From (III.2.6) we have then

    v_{sn} = Σ_{p=0}^{n} a_p |y_{ps}|² .

Taking on the left of (III.2.7) only the term for r = s we have

    |v_{sn}|² ρ_s⁻¹ ≤ Σ_{p=0}^{n} a_p |y_{ps}|² = v_{sn} .

Hence v_{sn} ≤ ρ_s, that is to say,

    Σ_{p=0}^{n} a_p |y_{ps}|² ≤ ρ_s .

Since n is arbitrary, we have (III.2.2) on making n → ∞.
Letting now u₀, u₁, ... be any sequence satisfying (III.2.3), we pass to the proof of (III.2.5) for the general case. With the notations (III.2.4), (III.2.6) we have

    v_{rn} → v_r   as   n → ∞,    (III.2.8)

in view of (III.2.2-3) and the Cauchy inequality. Furthermore, for 0 ≤ n < m,

    Σ_{r=0}^{∞} |v_{rm} − v_{rn}|² ρ_r⁻¹ = Σ_{p=n+1}^{m} a_p |u_p|² ;

this is proved in the same way as (III.2.7), and is indeed a variant of it. Hence, for any integral N > 0,

    Σ_{r=0}^{N} |v_{rm} − v_{rn}|² ρ_r⁻¹ ≤ Σ_{p=n+1}^{m} a_p |u_p|² ,

and so, making m → ∞ and using (III.2.8),

    Σ_{r=0}^{N} |v_r − v_{rn}|² ρ_r⁻¹ ≤ Σ_{p=n+1}^{∞} a_p |u_p|² .

Making now N → ∞, we have

    Σ_{r=0}^{∞} |v_r − v_{rn}|² ρ_r⁻¹ ≤ Σ_{p=n+1}^{∞} a_p |u_p|² ,

from which we draw the conclusion, a sharpening of (III.2.8), that

    Σ_{r=0}^{∞} |v_r − v_{rn}|² ρ_r⁻¹ → 0   as   n → ∞.    (III.2.9)
We may now prove (III.2.5) by making n → ∞ in (III.2.7); it is necessary to show that

    Σ_{r=0}^{∞} |v_{rn}|² ρ_r⁻¹ → Σ_{r=0}^{∞} |v_r|² ρ_r⁻¹ ,    (III.2.10)
and so (III.2.10) follows in view of (III.2.9), and in view of the fact that

    Σ_{r=0}^{∞} |v_{rn}| |v_r − v_{rn}| ρ_r⁻¹ → 0;

the latter follows from (III.2.9), (III.2.7) together with the Cauchy inequality. This completes the proof. Together with the Parseval equality (III.2.5) we have an associated expansion theorem.

Theorem III.2.2.
Under the assumptions of Theorem III.2.1,

    u_n = Σ_{p=0}^{∞} v_p y_{np} ρ_p⁻¹ ,    n = 0, 1, 2, ... .    (III.2.11)
With the notation (III.2.6) we have, if m ≥ n,

    u_n = Σ_{p=0}^{∞} v_{pm} y_{np} ρ_p⁻¹ ,

to be verified by substituting for v_{pm} by (III.2.6) and using (III.2.1). On making m → ∞, we obtain (III.2.11), subject to it being proved that

    Σ_{p=0}^{∞} (v_p − v_{pm}) y_{np} ρ_p⁻¹ → 0   as   m → ∞.
This follows from (III.2.9), (III.2.1) and the Cauchy inequality. In connection with polynomials orthogonal on the real axis we note also a partial extension of Theorem III.2.1 from discrete sums to Stieltjes integrals involving jumps.

Theorem III.2.3. Let τ(λ), −∞ < λ < ∞, be non-decreasing and have a jump 1/ρ′ > 0 at λ = λ′. Let the continuous functions y_n(λ), n = 0, 1, 2, ..., be such that

    ∫_{−∞}^{∞} y_s(λ) ȳ_t(λ) dτ(λ) = δ_{st} a_s⁻¹

for certain positive a_s, the integrals being absolutely convergent. Then

    Σ_{n=0}^{∞} a_n |y_n(λ′)|² ≤ ρ′.
We modify (III.2.6) to

    v_n(λ) = Σ_{p=0}^{n} a_p u_p ȳ_p(λ),

getting, in place of (III.2.7),

    ∫_{−∞}^{∞} |v_n(λ)|² dτ(λ) = Σ_{p=0}^{n} a_p |u_p|² .

From this we deduce that

    |v_n(λ′)|² / ρ′ ≤ Σ_{p=0}^{n} a_p |u_p|² ,

the left-hand side being the contribution of the jump at λ = λ′. The special choice u_p = y_p(λ′) together with the limiting transition n → ∞ then gives the result as before.
APPENDIX IV
Some Stability Results for Linear Systems
IV.1. A Discrete Case We prove here for solutions of difference or differential equations some conditions which ensure convergence at infinity, or which yield bounds for large values of the independent variable. We start with the discrete analog of a well-known theorem on the convergence of solutions of differential equations.
Theorem IV.1.1. Let the sequence of k-by-k matrices

    A_n = (a_{nrs}),    n = 1, 2, ...;   r, s = 1, ..., k,

satisfy

    Σ_{n=1}^{∞} |A_n| < ∞,    (IV.1.1)

where

    |A_n| = Σ_{r=1}^{k} Σ_{s=1}^{k} |a_{nrs}| .    (IV.1.2)

Then the solutions of the recurrence relations

    x_{n+1} − x_n = A_n x_n ,    n = 1, 2, ...,    (IV.1.3)

where x_n is a k-vector, converge as n → ∞. If in addition the matrices (E + A_n) are all nonsingular, then lim_{n→∞} x_n ≠ 0, unless all the x_n are zero.
Writing x_{nr}, r = 1, ..., k, for the entries in x_n and defining its norm by
    |x_n| = Σ_{r=1}^{k} |x_{nr}| ,    (IV.1.4)
we have from (IV.1.3) that

    |x_{n+1,r} − x_{nr}| ≤ Σ_{s=1}^{k} |a_{nrs}| |x_{ns}|,

and so, summing over r,

    |x_{n+1} − x_n| ≤ |A_n| |x_n| .    (IV.1.5)

From this we deduce that the sequence |x_n|, n = 1, 2, ..., is uniformly bounded as n → ∞. Using the property

    |x_{n+1}| ≤ |x_n| + |x_{n+1} − x_n| ,

it follows from (IV.1.5) that

    |x_{n+1}| ≤ (1 + |A_n|) |x_n| ,

and so, by induction,

    |x_{n+1}| ≤ Π_{m=1}^{n} (1 + |A_m|) · |x₁| .
Since the product converges, under the assumption (IV.1.1), the |x_n| have a fixed upper bound, as asserted. We now deduce that the sequence x_n converges, in the sense that the sequences x_{nr}, n = 1, 2, ..., converge for r = 1, ..., k. For, by (IV.1.5),

    Σ_{m=1}^{n} |x_{m+1} − x_m| ≤ Σ_{m=1}^{n} |A_m| |x_m| ,

and since the |x_m| are bounded, we have from (IV.1.1) that

    Σ_{m=1}^{∞} |x_{m+1} − x_m| < ∞,

and in particular

    Σ_{m=1}^{∞} |x_{m+1,r} − x_{mr}| < ∞   for   r = 1, ..., k.    (IV.1.6)

Hence the sequences x_{mr}, m = 1, 2, ..., are all convergent, as was to be proved.
For the last statement of the theorem, that the x_n do not tend to zero, we observe first that if the E + A_n are nonsingular, then the x_n will be all zero or all non-zero. From (IV.1.5) we now deduce that

    |x_{n+1}| ≥ |x_n| − |x_{n+1} − x_n| ≥ |x_n| (1 − |A_n|),

and so, if n is so large that |A_m| < 1 for m ≥ n,

    |x_{n+p}| ≥ |x_n| Π_{m=n}^{n+p−1} (1 − |A_m|).

Making p → ∞, the product on the right converges to a positive limit, by (IV.1.1), and so |x_{n+p}| has a positive lower bound. We remark that in (IV.1.6) we have a stronger conclusion than the convergence of the sequences x_n, namely, the fact that they are of bounded variation.
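Theorem IV.1.1 is easily illustrated by computation. The sketch below (illustrative only; the perturbation sequence is hypothetical data) treats the scalar case k = 1 with summable A_n = 1/n²: the solution of x_{n+1} − x_n = A_n x_n converges, and does not tend to zero since every factor 1 + A_n is nonzero.

```python
# Scalar instance of (IV.1.3): x_{n+1} = (1 + A_n) x_n with sum |A_n| finite.
A = [1.0 / n**2 for n in range(1, 2001)]   # summable perturbations
x = 1.0
trajectory = []
for a in A:
    x = (1 + a) * x
    trajectory.append(x)

# The tail differences are tiny (bounded variation) and the limit is not zero.
assert abs(trajectory[-1] - trajectory[-2]) < 1e-5
assert trajectory[-1] > 1.0
```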
IV.2. The Case of a Differential Equation
At the other extreme from the discrete case we have that of the system of differential equations, over 0 ≤ t < ∞,

    dx_r/dt = Σ_{s=1}^{k} a_{rs}(t) x_s ,    r = 1, ..., k,    (IV.2.1)

or in vector-matrix notation dx/dt = A(t)x. Assuming the a_{rs}(t) continuous, or at least integrable, not necessarily real, we have

Theorem IV.2.1. Let

    ∫₀^∞ |A(t)| dt < ∞.    (IV.2.2)
Then the solutions of (IV.2.1) are asymptotically constant as t → ∞; a solution which is not identically zero does not tend to zero as t → ∞. Here |A(t)| = Σ_{r,s} |a_{rs}(t)|. Varying the method of the last section, we use the result that, in matrix notation,

    (d/dt) x*x = x*Ax + x*A*x .

Hence it is easily verified that

    |(d/dt) x*x| ≤ 2 |A(t)| x*x .    (IV.2.3)
Hence

    x*(t) x(t) ≤ x*(0) x(0) exp{2 ∫₀^t |A(s)| ds},

and so, by (IV.2.2), x(t) remains bounded as t → ∞. From (IV.2.1-2) it follows that the dx_r/dt are absolutely integrable over (0, ∞), so that the x_r(t) are of bounded variation over (0, ∞), and tend to limits as t → ∞. For the last statement of the theorem we take it that x(0) ≠ 0, and use the result, a consequence of (IV.2.3), that

    x*(t) x(t) ≥ x*(0) x(0) exp{−2 ∫₀^t |A(s)| ds}.
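A numerical sketch of Theorem IV.2.1 in the scalar case (illustrative only; the coefficient a(t) = 1/(1+t)² is hypothetical data, chosen to be absolutely integrable over (0, ∞)). The exact solution of dx/dt = a(t)x is x(t) = x(0) exp(1 − 1/(1+t)), tending to the nonzero limit x(0)·e; a crude Euler scheme reproduces this asymptotically constant behavior.

```python
import math

def euler_limit(x0, T=200.0, h=0.001):
    # Forward Euler for dx/dt = x / (1 + t)^2; the coefficient is integrable,
    # so x(t) should approach a nonzero constant as t grows.
    x, t = x0, 0.0
    while t < T:
        x += h * x / (1.0 + t) ** 2
        t += h
    return x

x_inf = euler_limit(1.0)
assert abs(x_inf - math.e) < 0.05    # near e = exp(int_0^oo a dt), not zero
```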
IV.3. A Second-Order Differential Equation
An illustration of the last theorem is provided by the differential equation, in scalar terms,

    d²y/dt² + [1 + g(t)] y = 0,    (IV.3.1)

where g(t) is absolutely integrable over (0, ∞). We define new dependent variables p, q by

    p = y cos t − y′ sin t,    q = y sin t + y′ cos t.

It is then easily verified that

    dp/dt = g(t) y sin t,    dq/dt = −g(t) y cos t,

with y = p cos t + q sin t, a system of the form (IV.2.1) whose coefficient matrix is absolutely integrable over (0, ∞).
By Theorem IV.2.1 we may conclude that p, q tend to constant limits as t → ∞, which are not both zero, unless y vanishes identically. Hence y and y′ are bounded as t → ∞, and in fact are asymptotic to linear combinations of e^{it}, e^{−it}. The conclusion of the boundedness of y and y′ as t → ∞ may be deduced directly from the identity

    (d/dt)(|y′|² + |y|²) = −g y ȳ′ − ḡ ȳ y′.    (IV.3.2)

From this we have

    |(d/dt)(|y′|² + |y|²)| ≤ |g| (|y′|² + |y|²),    (IV.3.3)
whence, on integration,

    |y′(t)|² + |y(t)|² ≤ (|y′(0)|² + |y(0)|²) exp{∫₀^t |g(s)| ds}.    (IV.3.4)

Making t → ∞, it follows from the integrability of |g(s)| over (0, ∞) that y(t) and y′(t) remain bounded. The latter argument may be used to consider the nature as an analytic function of the solution y(t, λ) of
    d²y/dt² + [1 + λg(t)] y = 0,    (IV.3.5)

with the initial conditions y(0, λ) = 0, y′(0, λ) = 1. Replacing g(t) by λg(t) in (IV.3.4) we have

    |y′(t, λ)|² + |y(t, λ)|² ≤ exp{|λ| ∫₀^t |g(s)| ds}.    (IV.3.6)
Hence y(t, λ), y′(t, λ) are entire functions of λ of order at most 1. For some applications we need to improve this last result. For fixed finite t we may use, as in Section 8.2, the corresponding identity for the derivative of |y′|² + |λ| |y|². Integrating over (0, t) we have

    |y′(t, λ)|² + |λ| |y(t, λ)|² ≤ exp{t(|λ|^{1/2} + |λ|^{−1/2}) + |λ|^{1/2} ∫₀^t |g(s)| ds}.    (IV.3.7)

Hence y(t, λ), y′(t, λ) are, for fixed t, of order at most ½. Although (IV.3.7) is superior to (IV.3.6) for fixed t, it gives a weaker bound as t → ∞. The question may be raised as to the order of magnitude, for large λ, of sup |y(t, λ)| over 0 ≤ t < ∞; this upper bound is finite, in view of (IV.3.6) and the assumption that g(t) ∈ L(0, ∞). We have, in fact,

    sup_{0 ≤ t < ∞} |y(t, λ)| ≤ exp{½ |λ| ∫₀^∞ |g(s)| ds}.    (IV.3.8)
However, sup_t |y(t, λ)| is of slightly less than exponential order in λ. To see this we assume |λ| > 1, and use (IV.3.7) with t = |λ|^{1/4}.
Replacing |λ| on the left by 1, and making simplifications on the right, we have

    |y′(|λ|^{1/4}, λ)|² + |y(|λ|^{1/4}, λ)|² ≤ exp{|λ|^{3/4}(2 + ∫₀^∞ |g(s)| ds)}.    (IV.3.9)

The same bound applies if |λ|^{1/4} on the left is replaced by any t, 0 ≤ t ≤ |λ|^{1/4}. We supplement this with a bound over |λ|^{1/4} ≤ t < ∞. For this we use (IV.3.4) over the interval (|λ|^{1/4}, t) in place of (0, t) and with g(s) on the right replaced by λg(s). Combining this with (IV.3.9) we have
    |y′(t, λ)|² + |y(t, λ)|² ≤ exp{|λ|^{3/4}(2 + ∫₀^∞ |g(s)| ds) + |λ| ∫_{|λ|^{1/4}}^{∞} |g(s)| ds},    (IV.3.10)

valid for t ≥ |λ|^{1/4}, and hence, since ∫_{|λ|^{1/4}}^{∞} |g(s)| ds → 0 as |λ| → ∞,

    sup_{0 ≤ t < ∞} |y(t, λ)| ≤ exp{o(|λ|)}   as   |λ| → ∞.    (IV.3.11)
IV.4. The Mixed or Continuous-Discrete Case
The recurrence relation (IV.1.3) and the differential equation (IV.2.1) may be brought together in the integral equation

    y(x) = y(0) + ∫₀^x dC(t) y(t),    (IV.4.1)

discussed in Section 11.8. In general, some attention must be given to the interpretation of the Stieltjes integral, for example, in the sense (11.8.5), since discontinuities of y(t) and C(t) occur together; the application will however be confined to the generalized second-order differential equation, in which the integrals may be interpreted in the usual Riemann sense. Applying (IV.4.1) with x = a, x = b and subtracting we have, for 0 ≤ a ≤ b < ∞,

    y(b) − y(a) = ∫_a^b dC(t) y(t).    (IV.4.2)
Using the vector norm (IV.1.4) for y(t) and the matrix norm (IV.1.2) for C(t), and writing σ(t) for the total variation of C(s) over 0 ≤ s ≤ t, we have, on taking norms in (IV.4.2),

    |y(b) − y(a)| ≤ { sup_{a ≤ t ≤ b} |y(t)| } {σ(b) − σ(a)},    (IV.4.3)
where we assume that C(t) is of bounded variation over any finite interval. From (IV.4.3) it follows that p(x) = |y(x)| and σ(x) satisfy a functional inequality of the form (IV.4.5). From this functional inequality we may deduce a bound on the rate of growth of y(t). The principle is contained in
Theorem IV.4.1. In the finite interval a ≤ x ≤ b let p(x) be positive and bounded, and σ(x) right-continuous, bounded, and nondecreasing. For any a′, b′ with a ≤ a′ ≤ b′ ≤ b let

    |p(b′) − p(a′)| ≤ { sup_{a′ ≤ x ≤ b′} p(x) } {σ(b′) − σ(a′)}.    (IV.4.5)

Then, for a ≤ x ≤ b,

    p(x) ≤ p(a) exp{σ(x) − σ(a)}.    (IV.4.6)

Here p(x), σ(x) are to be real-valued. The result is close to one of Gronwall's, which Bellman, in connection with stability theory, has termed the "fundamental lemma." From (IV.4.5) and our assumption that p(x) is bounded, it follows that p(x) is right-continuous and of bounded variation, with σ(x). Thus p(x ± 0) exist, and p(x + 0) = p(x). We first prove that, for a ≤ x < b,

    p(x) ≤ p(a) {1 − σ(b − 0) + σ(a)}⁻¹,    (IV.4.7)

provided that σ(b − 0) − σ(a) < 1. To see this let us write M for sup p(x), a ≤ x < b, and suppose that p(xₙ) → M as n → ∞, a ≤ xₙ < b. Then, by (IV.4.5) with a for a′ and xₙ for b′,

    p(xₙ) − p(a) ≤ { sup_{a ≤ x ≤ xₙ} p(x) } {σ(xₙ) − σ(a)} ≤ M {σ(b − 0) − σ(a)},

and making n → ∞ on the left we deduce that

    M − p(a) ≤ M {σ(b − 0) − σ(a)},

whence, if σ(b − 0) − σ(a) < 1,

    M = sup_{a ≤ x < b} p(x) ≤ p(a) {1 − σ(b − 0) + σ(a)}⁻¹,
which is the same as (IV.4.7). Considering next jumps in p(x), which can occur only at jumps in σ(x), we have

    p(b) ≤ p(b − 0) {1 + σ(b) − σ(b − 0)}.    (IV.4.8)

For applying (IV.4.5) to the interval (xₙ, b), where xₙ → b from below, we have

    p(b) − p(xₙ) ≤ { sup_{xₙ ≤ x ≤ b} p(x) } {σ(b) − σ(xₙ)},

and making n → ∞ we obtain (IV.4.8). We now choose any ε > 0, and for this ε a subdivision (ξ_r) of (a, x), a = ξ₀ < ξ₁ < ... < ξₙ = x, with the property that

    σ(t) − σ(ξ_r) < ε   for   ξ_r ≤ t < ξ_{r+1} .    (IV.4.9)

To achieve this let us suppose σ(x) separated into a continuous component σ_c(x) and a jump component σ_j(x). We may then choose the subdivision so fine that

    σ_c(t) − σ_c(ξ_r) < ½ε,    ξ_r ≤ t < ξ_{r+1} ,

and it will then be sufficient to arrange that

    σ_j(t) − σ_j(ξ_r) < ½ε,    ξ_r ≤ t < ξ_{r+1} .

To achieve the latter, we assume that the subdivision includes all points of discontinuity of σ_j(x), that is to say, of σ(x), at which the jump exceeds some ε′ > 0. Choosing ε′ sufficiently small, the total of all the remaining jumps of σ_j(x) will be less than ½ε, and σ_j(x) will have the above property. We have now that (IV.4.9) holds.
Using the bounds (IV.4.7), (IV.4.8) on the successive subintervals we deduce that

    log p(x) − log p(a) ≤ Σ_{r=1}^{n} [ log {1 − σ(ξ_r − 0) + σ(ξ_{r−1})}⁻¹ + log {1 + σ(ξ_r) − σ(ξ_r − 0)} ],

taking it that ε < 1 so that the last logarithms are real. Using the results
that, for η ≥ 0, log(1 + η) ≤ η, and that log {(1 − η)⁻¹} ≤ η + η² if 0 ≤ η ≤ ½, and taking it that 0 ≤ ε ≤ ½, we deduce that

    log p(x) − log p(a) ≤ σ(x) − σ(a) + Σ_{r=1}^{n} {σ(ξ_r − 0) − σ(ξ_{r−1})}².    (IV.4.10)

The last term is less than or equal to

    ε Σ_{r=1}^{n} {σ(ξ_r − 0) − σ(ξ_{r−1})} ≤ ε {σ(x) − σ(a)},

and so tends to zero as ε → 0. We then derive (IV.4.6) from (IV.4.10), which completes the proof.
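The bound (IV.4.6) can be probed numerically. In the sketch below (illustrative only; the chosen σ, with a continuous part and one jump, is hypothetical data) a function p is grown multiplicatively by the σ-increments on a fine grid, which realizes (IV.4.5) approximately, and it never exceeds the exponential bound.

```python
import math

def sigma(x):                      # nondecreasing, right-continuous
    return 0.5 * x + (0.3 if x >= 1.0 else 0.0)

a, b, n = 0.0, 2.0, 4000
p, x_prev = 1.0, a                 # p(a) = 1
for i in range(1, n + 1):
    x = a + (b - a) * i / n
    p *= 1.0 + (sigma(x) - sigma(x_prev))   # growth by the sigma-increment
    x_prev = x
    # Since 1 + d <= exp(d) for d >= 0, p stays below p(a)*exp(sigma-increase).
    assert p <= math.exp(sigma(x) - sigma(a)) + 1e-9
```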
IV.5. The Extended Gronwall Lemma
The following result gives the direct extension to Stieltjes integrals of the fundamental lemma of Gronwall (cf. Bellman, "Stability Theory," p. 35).
Theorem IV.5.1. Let p(x), a ≤ x ≤ b, be non-negative and continuous, and in the same interval let σ(x) be nondecreasing and right-continuous. For a ≤ x ≤ b let

    p(x) ≤ c₀ + c₁ ∫_a^x p(u) dσ(u),    (IV.5.1)

where c₀ > 0, c₁ > 0 are constants. Then

    p(x) ≤ c₀ exp{c₁[σ(x) − σ(a)]}.    (IV.5.2)

The result will obviously be established if we prove that for arbitrary μ > 1 and a ≤ x ≤ b we have

    p(x) ≤ μc₀ exp{c₁[σ(x) − σ(a)]}.    (IV.5.3)
For any chosen μ > 1, this will be true for x = a by (IV.5.1), and in a right-neighborhood of a, by continuity. If (IV.5.3) does not hold, for the μ in question, for all x ∈ [a, b], let a′ be the greatest number in (a, b] such that (IV.5.3) holds for a ≤ x < a′. Substituting x = a′ in (IV.5.1), and replacing the integral by a sequence of approximating sums we have

    p(a′) ≤ c₀ + c₁ lim Σ_{r=0}^{n−1} p(ξ_r) {σ(ξ_{r+1}) − σ(ξ_r)},    (IV.5.4)
where {ξ_r}, a = ξ₀ < ξ₁ < ... < ξₙ = a′, is a subdivision of (a, a′), and the limit is over increasingly fine subdivisions as n → ∞, in the usual manner. Inserting the bound (IV.5.3) on the right of (IV.5.4) we have

    p(a′) ≤ c₀ + c₁ lim Σ_{r=0}^{n−1} μc₀ exp{c₁[σ(ξ_r) − σ(a)]} {σ(ξ_{r+1}) − σ(ξ_r)}    (IV.5.5)
         ≤ c₀ + μc₀c₁ ∫_{σ(a)}^{σ(a′)} exp{c₁[u − σ(a)]} du
         = c₀ + μc₀ {exp{c₁[σ(a′) − σ(a)]} − 1}
         < μc₀ exp{c₁[σ(a′) − σ(a)]}.

Hence (IV.5.3) holds also when x = a′, and so also in a right-neighborhood of a′ by continuity, if a′ < b. So a′ is not the largest number with the property that (IV.5.3) holds for a ≤ x < a′, and hence (IV.5.3) holds throughout [a, b]; the same conclusion is available if a′ = b. Since μ > 1 was arbitrary, we deduce (IV.5.2) by making μ → 1. In Theorem IV.5.1 we may relax the assumed continuity of p(x) to right-continuity, if the integral in (IV.5.1) is interpreted as in (11.8.5).
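A numerical illustration of Theorem IV.5.1 (illustrative only; σ(x) = x and the constants are hypothetical data): the linear function p(x) = c₀(1 + c₁(x − a)) satisfies the integral inequality (IV.5.1), and the exponential bound (IV.5.2) dominates it, as checked on a grid with the trapezoidal rule.

```python
import math

c0, c1, a, b, n = 2.0, 0.5, 0.0, 3.0, 3000
h = (b - a) / n
p = lambda x: c0 * (1.0 + c1 * (x - a))   # a function satisfying (IV.5.1)

integral = 0.0
for i in range(1, n + 1):
    x = a + i * h
    integral += 0.5 * h * (p(x - h) + p(x))   # trapezoid, exact for linear p
    assert p(x) <= c0 + c1 * integral + 1e-9   # (IV.5.1) holds
    assert p(x) <= c0 * math.exp(c1 * (x - a)) # (IV.5.2) dominates
```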
APPENDIX V
Eigenvalues of Varying Matrices
V.1. Variational Expressions for Eigenvalues
We collect here some well-known expressions for the eigenvalues of a Hermitean matrix A of order k, in terms of certain extremal problems, usually with side conditions; a discussion of these formulas will be found in Bellman's "Matrix Analysis," Chapter 7. Let the eigenvalues, possibly not all distinct, of A be written in descending order as

    λ₁ ≥ λ₂ ≥ ... ≥ λ_k ;    (V.1.1)

we assume the fact of the existence of a corresponding set of orthonormal vectors x₁, ..., x_k, considered as column matrices, such that

    x_r* x_s = δ_{rs} ,    A x_r = λ_r x_r ,    (V.1.2-3)

the (*) indicating the complex-conjugate transpose, so that x_r* is a row matrix. For the largest eigenvalue we have

Theorem V.1.1.
We have

    λ₁ = max{x*Ax : x*x = 1},    (V.1.4)

the maximum being over varying column matrices x of unit length, in the sense that x*x = 1. For the proof we note that the set of x with x*x = 1 is given by

    x = Σ_{r=1}^{k} c_r x_r ,    (V.1.5)

where the c_r are scalars subject to, by (V.1.2),

    Σ_{r=1}^{k} |c_r|² = 1.    (V.1.6)
By (V.1.2-3) we have then

    x*Ax = Σ_{r=1}^{k} λ_r |c_r|² .    (V.1.7)

By (V.1.1) and (V.1.6) it then follows that x*Ax ≤ λ₁, equality being reached when c₁ = 1, c₂ = c₃ = ... = 0. In a very similar way we may prove more generally the

Theorem V.1.2. For 1 ≤ r ≤ k,

    λ_r = max{x*Ax : x*x = 1, x_s*x = 0, 1 ≤ s < r}.    (V.1.8)
The condition that the unit vector x is to vary subject to the side conditions x_s*x = 0, 1 ≤ s < r, is of course operative only if r > 1; the case r = 1 has just been disposed of, and may be ignored. The set of admissible x is in fact given by (V.1.5-6) with c₁ = ... = c_{r−1} = 0, and the conclusion follows from (V.1.7). The expression (V.1.8) for λ_r has the disadvantage of involving a knowledge of x₁, ..., x_{r−1}. This is remedied in the following alternative, though more complicated, expression for λ_r.
Theorem V.1.3. For any set of column matrices y₁, ..., y_{r−1} write

    m(A, y₁, ..., y_{r−1}) = max{x*Ax : x*x = 1, y_s*x = 0, s = 1, ..., r − 1}.    (V.1.9)

Then, for varying y₁, ..., y_{r−1},

    λ_r = min m(A, y₁, ..., y_{r−1}).    (V.1.10)
We first prove that, for any y₁, ..., y_{r−1},

    λ_r ≤ m(A, y₁, ..., y_{r−1}).    (V.1.11)

For this we choose x = c₁x₁ + ... + c_r x_r subject to y_s*x = 0, s = 1, ..., r − 1, the c₁, ..., c_r not being all zero. This is possible since we have r of the c_s, subject to r − 1 homogeneous conditions. We may therefore suppose, in addition, that Σ₁^r |c_s|² = 1, or that x*x = 1. We have then

    x*Ax = Σ_{s=1}^{r} λ_s |c_s|² ≥ λ_r ,

so that the right of (V.1.9) is not less than λ_r. This proves (V.1.11).
To complete the proof of (V.1.10) it is sufficient to note that equality occurs in (V.1.11) if y_s = x_s, s = 1, ..., r − 1, this being the result of Theorem V.1.2. In just the same way we have

Theorem V.1.4. For column matrices y_{r+1}, ..., y_k define

    m₊(A, y_{r+1}, ..., y_k) = min{x*Ax : x*x = 1, y_s*x = 0, s = r + 1, ..., k}.    (V.1.12)

Then, for varying y_{r+1}, ..., y_k,

    λ_r = max m₊(A, y_{r+1}, ..., y_k).    (V.1.13)

This may also be proved by using Theorem V.1.3 with A replaced by −A.
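The variational characterization (V.1.4) admits a direct numerical check. In the sketch below (illustrative only; the 2-by-2 real symmetric matrix is hypothetical data) the largest eigenvalue is computed from the characteristic polynomial and compared with the maximum of x*Ax over a fine sweep of the unit circle.

```python
import math

A = [[2.0, 1.0], [1.0, 3.0]]
tr = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
lam1 = 0.5 * (tr + math.sqrt(tr * tr - 4 * det))   # largest eigenvalue

best = -float("inf")
N = 100000
for i in range(N):
    t = 2 * math.pi * i / N                 # unit vectors (cos t, sin t)
    x = (math.cos(t), math.sin(t))
    q = sum(A[r][s] * x[r] * x[s] for r in range(2) for s in range(2))
    best = max(best, q)

assert abs(best - lam1) < 1e-6              # max of x*Ax equals lambda_1
```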
V.2. Continuity and Monotonicity of Eigenvalues

We start by comparing the eigenvalues of two Hermitean matrices A and A + B, and denote them by λ_r(A), λ_r(A + B) respectively, ordered as in (V.1.1). We first estimate the difference between corresponding eigenvalues. Writing

    ‖B‖ = max{|x*Bx| : x*x = 1},    (V.2.1)

so that ‖B‖ is in fact the greatest of the absolute values of the eigenvalues of B, we have
Theorem V.2.1. For 1 ≤ r ≤ k,

    |λ_r(A + B) − λ_r(A)| ≤ ‖B‖.    (V.2.2)

For x such that x*x = 1 we have

    |x*(A + B)x − x*Ax| = |x*Bx| ≤ ‖B‖,

from which it follows that the maxima of x*(A + B)x, x*Ax, subject to x*x = 1 and any collection of side conditions, can differ by at most ‖B‖. Thus, with the notation (V.1.9),

    |m(A + B, y₁, ..., y_{r−1}) − m(A, y₁, ..., y_{r−1})| ≤ ‖B‖,

and (V.2.2) now follows on use of (V.1.10).
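The perturbation bound (V.2.2) can be verified numerically. In the sketch below (illustrative only; the matrices are hypothetical data) the eigenvalues of 2-by-2 real symmetric A and A + B are found from the characteristic polynomial, and each is seen to move by at most ‖B‖.

```python
import math

def eigs(M):  # eigenvalues of a symmetric 2x2 matrix, in descending order
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    d = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return (0.5 * (tr + d), 0.5 * (tr - d))

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[0.2, -0.1], [-0.1, 0.1]]
AB = [[A[r][s] + B[r][s] for s in range(2)] for r in range(2)]
normB = max(abs(e) for e in eigs(B))        # ||B|| as in (V.2.1)
for lam_ab, lam_a in zip(eigs(AB), eigs(A)):
    assert abs(lam_ab - lam_a) <= normB + 1e-12
```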
Suppose now that A = A(t) is a Hermitean matrix each of whose entries is a continuous function of t, a real variable. The eigenvalues, written now λ_r(t) and taken in order according to (V.1.1), will then be well-defined functions of t. In view of (V.2.2) we can also say that they will be continuous functions of t. Reverting to the comparison of A and (A + B), we now take the case in which B > 0, or B ≥ 0, in the sense that the associated quadratic form x*Bx is definite or semidefinite and non-negative. Let us write, if B ≥ 0,

    ‖B‖₊ = min{x*Bx : x*x = 1},    (V.2.3)

so that this is the least of the eigenvalues of B, assumed to be all non-negative. Using the phrase B ≥ 0 to include the case B > 0 we have
Theorem V.2.2. If A, B are Hermitean and B ≥ 0, then

    λ_r(A + B) − λ_r(A) ≥ ‖B‖₊ .    (V.2.4)

For x such that x*x = 1 we have now

    x*(A + B)x − x*Ax = x*Bx ≥ ‖B‖₊ ,

and so

    m(A + B, y₁, ..., y_{r−1}) − m(A, y₁, ..., y_{r−1}) ≥ ‖B‖₊ .
The same result now follows for λ_r by means of (V.1.10). If now A(t) is an increasing, or nondecreasing, Hermitean matrix function in the sense that A(t′) − A(t) is positive or non-negative definite when t′ > t, we may conclude that its eigenvalues λ_r(t) are increasing, or nondecreasing, functions of t in the ordinary sense. As a continuous variant of the last result we have

Theorem V.2.3. In a real t-interval let A(t) be a differentiable Hermitean matrix function, whose derivative A′(t) is positive-definite. Then the λ_r(t), the eigenvalues of A(t), are increasing functions of t.
For any fixed t′ and variable t″ → t′ we have

    A(t″) − A(t′) = (t″ − t′) A′(t′) + o(t″ − t′)

as t″ → t′ + 0. From this, and the fact that A′(t′) > 0, it is easily deduced that A(t″) > A(t′) for sufficiently small t″ − t′ > 0, so that A(t) is an increasing function of t. The result of the theorem now follows from the last theorem.
V.3. A Further Monotonicity Criterion
In connection with matrix Sturm-Liouville theory we need an extension of Theorem V.2.2, under which B need not be fully positive-definite, but only on the linear subspace formed by the eigenvectors of A for the eigenvalue in question. We first consider the comparison of two matrices, and then give a continuous version.

Theorem V.3.1. Let A, B be Hermitean and λ′ an eigenvalue of A, possibly multiple, so that

    λ_r(A) = λ′,    r = u, u + 1, ..., v.    (V.3.1)

For some ρ > 0, and all x with Ax = λ′x, let

    x*Bx ≥ ρ x*x.    (V.3.2)

If λ′ is the greatest of the eigenvalues of A, that is, if u = 1 in (V.3.1), then

    λ_r(A + B) > λ_r(A),    r = u, u + 1, ..., v.    (V.3.3)

If λ′ is not the greatest of the eigenvalues of A, let λ″ be the next greater, and let ρ′ denote ‖B‖ as defined in (V.2.1). If

    ρ′² ≤ ρ(λ″ − λ′ − ρ),    (V.3.4)

then (V.3.3) holds. Let

    P₁ = Σ_{λ_r(A) < λ′} x_r x_r* ,    P₂ = Σ_{λ_r(A) = λ′} x_r x_r* ,    P₃ = Σ_{λ_r(A) > λ′} x_r x_r* .
We assert that, for r = u, ..., v,

    λ_r(A + B) ≥ min{x*(A + B)x : x*x = 1, x = (P₂ + P₃)x}.    (V.3.5)

This follows from Theorem V.1.4, if in place of r we have v, and in place of y_{r+1}, ..., y_k we take x_{v+1}, ..., x_k. The requirements x_s*x = 0, s = v + 1, ..., k, imply that P₁x = 0; these requirements are to be omitted if v = k, that is to say, if λ′ is the lowest eigenvalue. Furthermore, we have P₁ + P₂ + P₃ = E, the unit matrix, by (V.1.2), the eigenvectors forming a complete orthonormal set, so that if P₁x = 0 then x = (P₂ + P₃)x. This justifies (V.3.5). We first dispose of the trivial case in which λ′ is the greatest eigenvalue, so that P₃ = 0, and in (V.3.5) we have x = P₂x, and so Ax = λ′x.
+
v. EIGENVALUES
462
OF VARYING MATRICES
+
+
It then follows from (V.3.2) that x*(A B ) x 3 x * A x px*x = (A' p ) x*x, and so from (V.3.5) that &(A B ) A' p, in verification of (V.3.3) for this case. For the more general case that P, # 0 we have, if x = (P2 P,) x,
+
+
+
+
x*(A
+ B ) x = x*(PZ+ P3)A(P, + P,) x + x*PzBP2x + x*P,BP, x + x*P,BPz x + x*P3BP3x ,
(V.3.6) where we have used the facts that P: = Pz , P$ = P3 . Here we note that APzx = A X?X?X = A'PZx,
so that

    x*P₂AP₂x = λ′x*P₂x,    x*P₂BP₂x ≥ ρx*P₂x,

using the fact that P₂² = P₂. In a similar way we have

    x*P₃AP₃x ≥ λ″x*P₃x,    x*P₂AP₃x = x*P₃AP₂x = 0,

and, using (V.2.1) with ρ′ = ‖B‖,

    |x*P₃BP₃x| ≤ ρ′x*P₃x.

Using these results in (V.3.6) we obtain

    x*(A + B)x ≥ (λ′ + ρ)x*P₂x + (λ″ − ρ′)x*P₃x + x*P₂BP₃x + x*P₃BP₂x.    (V.3.7)
We introduce a notation for the length of an arbitrary column matrix, writing |z| = √(z*z) ≥ 0. For any other column matrix y we have the Cauchy inequality |z*y| ≤ |z| · |y|. We have then

    x*P₂x = (P₂x)*(P₂x) = |P₂x|²,    x*P₃x = |P₃x|²,

and

    |x*P₂BP₃x| = |x*P₃BP₂x| ≤ |P₂x| · |BP₃x|.

Since B is Hermitean, we have |By| ≤ ρ′|y| for any column matrix y; as is well known, the maxima of z*Bz and |Bz| subject to z*z = 1 are reached simultaneously when z is an eigenvector corresponding to that eigenvalue which is greatest in absolute value. Hence |BP₃x| ≤ ρ′|P₃x|, and so from (V.3.7) we have

    x*(A + B)x ≥ (λ′ + ρ)|P₂x|² + (λ″ − ρ′)|P₃x|² − 2ρ′|P₂x| · |P₃x|,    (V.3.8)
where we still take it that x = (P₂ + P₃)x. Taking it also that x*x = 1, that is to say, that

    |P₂x|² + |P₃x|² = 1,

we have from (V.3.8) that

    x*(A + B)x − λ′ ≥ ρ|P₂x|² + (λ″ − λ′ − ρ′)|P₃x|² − 2ρ′|P₂x| · |P₃x|.
The left-hand side will be positive if the quadratic on the right is positive-definite, and this is ensured by (V.3.4). This completes the proof. As a continuous analog of this result we have

Theorem V.3.2. Let A(t) be a Hermitean matrix function of the real variable t, which is differentiable at t = t₀. For some eigenvalue λ₀ of A(t₀) and for all associated eigenvectors x, x*x = 1, A(t₀)x = λ₀x, let

    x*A′(t₀)x > 0.    (V.3.9)

Then the eigenvalues λ_r(t) of A(t) which coincide with λ₀ when t = t₀ are increasing functions of t at t = t₀.
The case in which λ₀ is a simple eigenvalue can be dealt with by a brief direct argument. To indicate this only, we differentiate the equation A(t)x(t) = λ(t)x(t), where x(t) is a varying normalized eigenvector associated with the varying eigenvalue λ(t), getting

    A′(t)x(t) + A(t)x′(t) = λ′(t)x(t) + λ(t)x′(t).

Multiplying on the left by x*(t), and using the fact that x*(t)A(t) = λ(t)x*(t), we deduce that

    x*(t)A′(t)x(t) = λ′(t)x*(t)x(t) = λ′(t),

which proves our result in this special case. For the general case we use the previous theorem. We take A = A(t₀), A + B = A(t₁), where t₁ − t₀ > 0 is to be made suitably small. In view of (V.3.9) there will be a ν > 0 such that

    x*A′(t₀)x ≥ νx*x,   if   A(t₀)x = λ₀x.

Since A(t) is differentiable at t₀, we have

    A(t₁) − A(t₀) = (t₁ − t₀)A′(t₀) + o(t₁ − t₀),

so that

    x*{A(t₁) − A(t₀)}x = (t₁ − t₀)x*A′(t₀)x + o{(t₁ − t₀)x*x}.
Hence, for some ε > 0 and 0 < t₁ − t₀ < ε, we have

    x*{A(t₁) − A(t₀)}x ≥ ½ν(t₁ − t₀)x*x   if   A(t₀)x = λ₀x.
Furthermore, for some ν′ > 0 and 0 < t₁ − t₀ < ε, we shall have

    ‖A(t₁) − A(t₀)‖ ≤ ν′(t₁ − t₀),

with the interpretation (V.2.1). This yields the situation of Theorem V.3.1, with

    ρ = ½ν(t₁ − t₀),    ρ′ = ν′(t₁ − t₀).
If λ₀ is the largest eigenvalue of A(t₀) we may conclude at once that λ_r(t₁) > λ_r(t₀) = λ₀, for 0 < t₁ − t₀ < ε and the r-values for which λ_r(t₀) = λ₀. If λ₀ is not the greatest eigenvalue, we denote by λ₀′ the next greater eigenvalue of A(t₀), and the same conclusion follows, provided that, according to (V.3.4),

    ν′²(t₁ − t₀)² ≤ ½ν(t₁ − t₀) [λ₀′ − λ₀ − ½ν(t₁ − t₀)].
V.4. Varying Unitary Matrices We now pass to the situation in which we are given a matrix O(t), of the kth order, which for to t t , is unitary, in that e(t) e*(t) = O*(t) d ( t ) = E, and which is also continuous in t , in that all its entries are continuous. We denote its eigenvalues by ~ , . ( t ) ,r = 1, ..., k, not necessarily all distinct and written a number of times according to multiplicity. T h e w,.(t) lie necessarily on the unit circle, and we need results giving conditions under which they move monotonically on the unit circle as t increases. Two differences emerge when we compare this situation with that of the eigenvalues of a varying Hermitean matrix A(t).Whereas a Hermitean matrix A(t) defined in connection with a differential equation may become infinite, and therewith also some of its eigenvalues, even though the differential equation exhibits no irregularity, a unitary matrix and its eigenvalues are necessarily finite; this was, in Chapter 10, a motive for introducing such matrix functions. On the other hand, a difficulty arises in connection with the identification of the eigenvalues. For a
< <
v.5.
CONTINUATION OF THE EIGENVALUES
465
Hermitean matrix, the eigenvalues may be uniquely numbered according to their order on the real line, as in (V.l.l). There being no lowest or highest point on the unit circle, the definition of the w,(t) needs special consideration. In what follows we determine the wT(t),and their arguments, in the following manner: (i) the w,(t), Y = 1, ..., k, are to appear in positive order on the unit circle with increasing I, that is to say, arg w l ( t )
< arg w z ( t ) < ... < arg w k ( t )< arg wl(t) + 2v;
(V.4.1)
(ii) the w,(t), and their arguments, are to be continuous functions of t ; (iii) the w,(t), and their arguments, satisfy (V.4.1) when t = t o . We shall write P&) = argw,(t), (V.4.2) so that the ~ , . ( t )are to be continuous functions of t, satisfying Pdt)
< P&) < < Pk(t) < Pdt) + 2v, **.
(V.4.3)
and assuming known values, subject to (V.4.3), when t = t₀. It is also necessary to consider the case of a unitary matrix θ(s, t) which is a continuous function in some rectangle in the real (s, t)-plane. Again, with some base-point (s₀, t₀), we suppose the eigenvalues ω_r(s, t) and their arguments φ_r(s, t) to vary continuously subject to (V.4.1) and (V.4.3).
V.5. Continuation of the Eigenvalues
In this section we justify the definition of the ω_r(t) and their arguments φ_r(t) by continuous variation subject to (V.4.1), (V.4.3), taking first the continuation along the real axis from t = t₀ in the case when θ(t) is a function of only one real variable. We start by showing that such continuation is possible at least locally. Suppose that for some t′ the ω_r(t′) have been fixed, and satisfy (V.4.1) there. We choose a number on the unit circle, exp(iα) say, which is distinct from all the ω_r(t′). Reading round the unit circle in the positive sense from exp(iα) and back to exp(iα), the ω_r(t′) will be encountered in a certain order, a cyclic permutation of the r = 1, ..., k, let us say in the order r₀, r₀ + 1, ..., k, 1, ..., r₀ − 1. We prescribe that for a certain ε > 0, chosen so that exp(iα) is not an eigenvalue for t′ ≤ t < t′ + ε, the ω_r(t) are to be numbered in the same order when
+
<
+
v.
466
EIGENVALUES OF VARYING MATRICES
< +
t' < t t' E and the unit circle is read in the positive sense, again starting and finishing at exp (ia). Such an E will exist; since O(t) is continuous, and since O(t) - exp (ia)E is nonsingular when t = t', it will be nonsingular also in some neighborhood oft'. Hence the definition of the w,(t), subject to (V.4. l), is ensured for a right-neighborhood of t'. T h e question now arises as to whether the w,(t) thus defined are continuous in t' t t' E. This may be deduced from the continuity mentioned in Section V.2 of the eigenvalues of a varying Hermitean matrix. In this t-interval we may define the Hermitean matrix
< < +
    A(t) = i{θ(t) + exp(iα)E}{θ(t) − exp(iα)E}⁻¹.   (V.5.1)

Corresponding to an eigenvalue ω_r(t) of θ(t) there will be an eigenvalue λ_r(t) of A(t) according to

    λ_r(t) = i{ω_r(t) + exp(iα)}{ω_r(t) − exp(iα)}⁻¹.   (V.5.2)
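The effect of this transformation on the unit circle may be checked directly; the following brief computation is an added verification, not part of the original text. For ω = exp(iβ),

```latex
\lambda \;=\; i\,\frac{e^{i\beta} + e^{i\alpha}}{e^{i\beta} - e^{i\alpha}}
        \;=\; i\,\frac{e^{i(\beta-\alpha)/2} + e^{-i(\beta-\alpha)/2}}
                     {e^{i(\beta-\alpha)/2} - e^{-i(\beta-\alpha)/2}}
        \;=\; \frac{\cos\tfrac{1}{2}(\beta-\alpha)}{\sin\tfrac{1}{2}(\beta-\alpha)}
        \;=\; \cot\tfrac{1}{2}(\beta-\alpha),
```

on multiplying numerator and denominator by exp(−i(α + β)/2). Thus λ is real, and as β increases from α to α + 2π the half-angle runs from 0 to π, so that the cotangent decreases from +∞ to −∞.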
The mapping λ = i(ω + exp(iα))/(ω − exp(iα)) transforms the unit circle, taken in the positive sense, in the ω-plane, into the real λ-axis, taken in the negative sense. Since ω = exp(iα) corresponds to λ = ∞, the above-mentioned succession of the ω_r(t), read from exp(iα) to exp(iα) in the positive sense, corresponds to the following ordering of the λ_r(t), namely,

    λ_{r_0}(t) ≥ λ_{r_0+1}(t) ≥ ... ≥ λ_k(t) ≥ λ_1(t) ≥ ... ≥ λ_{r_0−1}(t).

We showed, however, that the eigenvalues of A(t), a Hermitean matrix, form continuous functions when numbered in order on the real axis. Inverting the relationship (V.5.2) to get ω_r(t) in terms of λ_r(t), we deduce that the ω_r(t) are also continuous in t′ ≤ t ≤ t′ + ε.

To complete the definition of the ω_r(t), and the φ_r(t), we start with t′ = t_0 and fixed values for t = t_0 satisfying (V.4.1), the choice of ω_1(t_0) being arbitrary. Having extended the definition to a right-neighborhood of t_0, we continue the process with t′ as some point in this right-neighborhood. For any t′, we can find an exp(iα), not an eigenvalue of θ(t′), and not having any other eigenvalue within an angular distance of π/k of it. From this it is easily seen that the continuation may be extended to any finite point in t_0 ≤ t by a finite number of steps. The φ_r(t) are, of course, to vary continuously with ω_r(t).

It may be shown that this process of continuation gives unique values to the ω_r(t), φ_r(t), independent of the choice of the various t′ and α, granted the initial values at t = t_0. To see this we consider the equality det θ(t) = Π_{r=1}^k ω_r(t) and its logarithmic form

    arg det θ(t) = Σ_{r=1}^k φ_r(t).   (V.5.3)
We may determine arg det θ(t_0) so that (V.5.3) is true when t = t_0, and it remains true for t > t_0 under our process of continuation of the ω_r(t) as continuous functions with modulus unity. Since arg det θ(t) is determinate except for a multiple of 2π, its continuation as a continuous function is unique. From this it follows that the numbering and valuation of the φ_r(t) = arg ω_r(t) is also unique, subject to our restrictions (V.4.1), (V.4.3). If, for example, the determination of arg ω_1(t) were varied by 2mπ, the numbering of the ω_r(t) remaining unaltered, the right of (V.5.3) would be varied by 2kmπ, and the equality would be destroyed. Suppose next that the numbering of the ω_r(t) is altered by one place, so that ω_2(t) is renumbered as ω_1′(t), ω_3(t) as ω_2′(t), and so on. We shall then have, for some integer n, arg ω_1′(t) = arg ω_2(t) + 2nπ, arg ω_2′(t) = arg ω_3(t) + 2nπ, ..., and finally arg ω_k′(t) = arg ω_1(t) + 2nπ + 2π, so that, writing φ_r′(t) = arg ω_r′(t), we have

    Σ_{r=1}^k φ_r′(t) = Σ_{r=1}^k φ_r(t) + 2knπ + 2π.
Thus it is not permissible to replace the φ_r(t) on the right of (V.5.3) by the φ_r′(t); a similar calculation shows that the equality (V.5.3) fails if ω_1(t) is replaced as the first member of the sequence by any of the ω_3(t), ..., ω_k(t). This completes our proof that the ω_r(t) are uniquely continuable as continuous functions satisfying (V.4.1).

Consider next the situation in which θ(s, t) is a unitary matrix which is a continuous function of s and t in the rectangle s_0 ≤ s ≤ s_1, −∞ < t < ∞. Taking as starting point (s_0, 0), we suppose the eigenvalues ω_r(s_0, 0) of θ(s_0, 0) arranged in a similar manner to (V.4.1). Another point (s′, t′) of the rectangle may be joined to (s_0, 0) by a continuous path [s(τ), t(τ)], 0 ≤ τ ≤ 1, lying in the rectangle, and the functions ω_r(s, t) may be continued along this path as continuous functions of τ; this also applies to φ_r(s, t) = arg ω_r(s, t) and to arg det θ(s, t). We may therefore arrange that the analog of (V.5.3), that is,

    arg det θ(s, t) = Σ_{r=1}^k φ_r(s, t),   (V.5.4)

holds at the chosen point (s′, t′) and along the path leading from (s_0, 0) to it. We now appeal to the uniqueness of the continuation of arg det θ(s, t) as a continuous function of s and t. Here we rely on the fact that the rectangle of definition is simply-connected. Any path from (s_0, 0) to (s′, t′) lying in the rectangle can be continuously deformed, in an obvious manner, within the rectangle, into any other such path with the same end-points. From this it is easily shown that continuous variation of arg det θ(s, t), with a fixed value of arg det θ(s_0, 0), yields a unique value of arg det θ(s′, t′). This in turn implies, as previously, that the ω_r(s, t) and their arguments are uniquely fixed by our assumptions.
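The continuation process of this section may be sketched numerically. The following is an added illustrative sketch, not part of the original text: the matrix family theta(t) and its eigenvalue data are assumed examples, and the search below simply enumerates the renumberings allowed by the ordering (V.4.3), up to a common multiple of 2*pi, choosing at each step the one nearest the previous values.

```python
import numpy as np

# theta(t) = U diag(exp(i t lam_j)) U* for a fixed unitary U -- an assumed
# example of a continuously varying unitary matrix.
lam = np.array([3.0, 0.5, -1.0])
k = lam.size
rng = np.random.default_rng(1)
M = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
U = np.linalg.qr(M)[0]                        # a fixed unitary matrix

def theta(t):
    return U @ np.diag(np.exp(1j * t * lam)) @ U.conj().T

ts = np.linspace(0.0, 3.0, 601)
phis = []
for t in ts:
    phi = np.sort(np.angle(np.linalg.eigvals(theta(t))))
    if phis:
        prev = phis[-1]
        # renumberings keeping phi_1 < ... < phi_k < phi_1 + 2*pi, up to a
        # common 2*pi multiple; the continuous continuation is the nearest one
        cands = [np.concatenate([phi[s:], phi[:s] + 2 * np.pi]) + 2 * np.pi * m
                 for s in range(k) for m in (-1, 0, 1)]
        phi = min(cands, key=lambda c: np.max(np.abs(c - prev)))
    phis.append(phi)
phis = np.array(phis)

# (V.5.3): the sum of the continued arguments equals the continuous branch
# of arg det theta(t), which here is t * (lam_1 + ... + lam_k)
assert np.allclose(phis.sum(axis=1), ts * lam.sum(), atol=1e-8)
```

The final assertion checks (V.5.3) along the whole continuation, including the points where eigenvalues collide or pass the edge of the 2π window.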
V.6. Monotonicity on the Unit Circle

The differential equation z′ = iχz, where z is a complex scalar of modulus unity and χ is real and positive, has the implication that z is moving in the positive sense on the unit circle as the independent variable increases. We wish here to set up similar results in a matrix context, when it will be a question of the eigenvalues of a unitary matrix moving positively on the unit circle. The simplest case is one in which all the eigenvalues move positively without restriction.

Theorem V.6.1. Let θ(t), a k-by-k unitary matrix function of the real variable t in t_0 ≤ t ≤ t_1, satisfy a differential equation θ′ = iθΩ, where Ω(t) is Hermitean, continuous and positive-definite. Then the eigenvalues of θ(t) move positively on the unit circle as t increases.

We assume here that the eigenvalues ω_r(t) of θ(t) are continued as continuous functions subject to (V.4.1). We prove this by reduction to the corresponding result for Hermitean matrix functions. For some t′, let exp(iα) not be an eigenvalue of θ(t′), and define

    A(t) = i{exp(iα)E + θ(t)}{exp(iα)E − θ(t)}⁻¹,   (V.6.1)
this having sense in a neighborhood of t′. It is easily verified that it is Hermitean. Let us also show that A(t) is an increasing function, in the matrix sense. Since

    A(t) = 2i exp(iα){exp(iα)E − θ(t)}⁻¹ − iE,

and since θ(t) is assumed differentiable, we have

    A′(t) = 2i exp(iα){exp(iα)E − θ(t)}⁻¹ θ′(t) {exp(iα)E − θ(t)}⁻¹.

Using the facts that θ′ = iθΩ, θ* = θ⁻¹, this may be transformed to

    A′(t) = −2{E − exp(−iα)θ(t)}⁻¹ θ(t) Ω(t) {exp(iα)E − θ(t)}⁻¹
          = 2{exp(iα)E − θ(t)}*⁻¹ Ω(t) {exp(iα)E − θ(t)}⁻¹.   (V.6.2)
This is positive-definite, with Ω(t). By Theorem V.2.3, we deduce that the eigenvalues of A(t) are increasing functions of t; here these eigenvalues are identified, as continuous functions of t, with preservation of order on the real axis in a neighborhood of t′. To complete the proof we note that the eigenvalues ω_r(t) of θ(t) will be related to the eigenvalues λ_r(t) of A(t) by a relation similar to (V.6.1), namely, by

    λ_r(t) = i{exp(iα) + ω_r(t)}{exp(iα) − ω_r(t)}⁻¹.

Here the mapping λ = i(exp(iα) + ω)/(exp(iα) − ω) takes the positively described unit circle into the positively described real axis; the numbering of the λ_r(t) will generally not coincide with (V.1.1). Thus as the λ_r(t) increase with t, the ω_r(t) move positively on the unit circle, as was asserted.

Finally we note the situation in which Ω(t) is positive-definite only as applied to certain eigenvectors of θ(t), so that only the associated eigenvalues can be asserted to move positively.

Theorem V.6.2. In the assumptions of Theorem V.6.1 let the condition that Ω(t) be positive-definite be weakened to the following. For a certain t′ and a certain eigenvalue ω_r(t′) let w*Ωw > 0 for all w ≠ 0 such that θ(t′)w = ω_r(t′)w. Then ω_r(t) moves positively on the unit circle at t = t′.

If θ(t′)w = ω_r(t′)w, it follows that A(t′)w = λ_r(t′)w, where λ_r is related to ω_r as above; furthermore, all w such that A(t′)w = λ_r(t′)w will be obtained in this way, as solutions of θ(t′)w = ω_r(t′)w. Next we note that w*A′(t′)w > 0, if also w ≠ 0. Since

    {exp(iα)E − θ(t′)}⁻¹ w = {exp(iα) − ω_r(t′)}⁻¹ w,

it follows from (V.6.2) that

    w*A′(t′)w = 2 |exp(iα) − ω_r(t′)|⁻² w*Ω(t′)w,
which is therefore positive. The result now follows from Theorem V.3.2.

From the last results we may deduce bounds on the rate of change of the eigenvalues.

Theorem V.6.3. Let the k-by-k unitary matrix θ(t) satisfy θ′ = iθΩ, where Ω(t) is Hermitean and continuous. For some t′ and a certain eigenvalue ω_r(t′) let the scalars γ_3, γ_4 satisfy γ_3 w*w ≤ w*Ω(t′)w ≤ γ_4 w*w for all vectors w with θ(t′)w = ω_r(t′)w. Then arg ω_r(t) − γ_3 t, arg ω_r(t) − γ_4 t are, respectively, nondecreasing and nonincreasing functions at t = t′.

We need only apply the result of Theorem V.6.2 to the unitary matrices exp(−iγ_3 t)θ(t), exp(−iγ_4 t)θ(t). It will be convenient to write the conclusion of the last theorem in the form

    γ_3(t) ≤ (d/dt) arg ω_r(t) ≤ γ_4(t).   (V.6.3)
The result may be weakened by replacing γ_3, γ_4 by γ_1, γ_2, the lowest and highest of the eigenvalues of Ω. Strictly speaking, we have not demonstrated that the eigenvalues of θ(t) are differentiable, if θ(t) is differentiable. This is easily seen in the case of a simple eigenvalue, which is a simple root of an algebraic equation with differentiable coefficients. Without showing that multiple eigenvalues are differentiable functions, we leave (V.6.3) with the interpretation that it bounds the lower and upper derivatives of φ_r(t). What is actually needed in Chapter 10 is the result of integrating (V.6.3) over a finite interval. The result of this process is easily justified directly, in a similar manner to the proof of Theorem V.6.3.
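The monotonicity and the derivative bounds may likewise be illustrated numerically. The following is an added hedged sketch, not part of the original text: it takes the special case of a constant Hermitean coefficient, for which theta(t) = theta(0) exp(i t Omega) satisfies theta' = i theta Omega (a constant Omega commutes with its own exponential), and checks that the continued eigenvalue arguments increase at rates between the extreme eigenvalues of Omega, as in the weakened form of (V.6.3).

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Omega = B @ B.conj().T + 0.5 * np.eye(3)     # Hermitean, positive-definite
theta0 = np.linalg.qr(B)[0]                  # a unitary initial value

ew, ev = np.linalg.eigh(Omega)
g1, g2 = ew[0], ew[-1]                       # gamma_1, gamma_2: extreme eigenvalues

def theta(t):
    # theta(0) exp(i t Omega), via the spectral decomposition of Omega
    return theta0 @ (ev * np.exp(1j * t * ew)) @ ev.conj().T

# continue the eigenvalue arguments as in Section V.5
ts = np.linspace(0.0, 1.0, 2001)
phis = []
for t in ts:
    phi = np.sort(np.angle(np.linalg.eigvals(theta(t))))
    if phis:
        prev = phis[-1]
        cands = [np.concatenate([phi[s:], phi[:s] + 2 * np.pi]) + 2 * np.pi * m
                 for s in range(3) for m in (-1, 0, 1)]
        phi = min(cands, key=lambda c: np.max(np.abs(c - prev)))
    phis.append(phi)
phis = np.array(phis)

# the arguments move positively, at rates between gamma_1 and gamma_2
rates = np.diff(phis, axis=0) / (ts[1] - ts[0])
assert np.all(rates > 0)
assert np.all(rates >= g1 - 1e-6) and np.all(rates <= g2 + 1e-6)
```

The difference quotients respect the bounds even across eigenvalue crossings, since there the continued branches exchange values continuously.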
APPENDIX VI
Perturbation of Bases in Hilbert Space
VI.1. The Basic Result

In what follows we have in mind the comparison of two systems of functions u_n(x), n = 1, 2, ..., and v_n(x), n = 1, 2, ..., in regard to the property of completeness (or "closure") over a finite interval (a, b). Supposing that the system {u_n(x)} is complete in L²(a, b), in the sense that for any f(x) ∈ L²(a, b) and for any ε > 0 we can find a linear combination Σ_1^N c_n u_n(x) of the {u_n(x)} such that

    ∫_a^b | f(x) − Σ_1^N c_n u_n(x) |² dx < ε,   (VI.1.1)

and supposing also that the v_n(x) are in some sense close to the u_n(x), at least for large n, we ask for criteria which ensure that the v_n(x) also form a complete set in the above sense. In Chapter 12 we used an argument of this character to establish the validity of the eigenfunction expansion, that is to say, the completeness of the set of eigenfunctions, assuming the result in a special case. Of the many available results of this type we need only the simplest. In its basic form this assumes that the system u_n(x) is complete and orthonormal, that is to say,

    ∫_a^b u_m(x) ū_n(x) dx = δ_mn,   (VI.1.2)

and that the system {v_n(x)} is close to {u_n(x)} in the sense that

    Σ_{n=1}^∞ ∫_a^b | u_n(x) − v_n(x) |² dx < 1.   (VI.1.3)

We shall conduct the argument in Hilbert space terms for brevity; thus the u_n(x), v_n(x) are considered as elements of a Hilbert space H, and (VI.1.2-3) may be rewritten as

    (u_m, u_n) = δ_mn,   (VI.1.4)

    Σ_{n=1}^∞ (u_n − v_n, u_n − v_n) = Σ_{n=1}^∞ ‖u_n − v_n‖² < 1.   (VI.1.5)

The completeness property asserted in (VI.1.1) will mean that for any f ∈ H and any ε > 0 we can find N and c_1, ..., c_N such that

    ‖ f − Σ_1^N c_n u_n ‖ < ε.   (VI.1.6)

The arguments to be used do not depend on the deeper properties of Hilbert space, in particular its "completeness"; they could be rephrased in terms of continuous functions only. The formal result is then

Theorem VI.1.1. In the Hilbert space H let u_n, n = 1, 2, ..., be a complete orthonormal set, and let v_n, n = 1, 2, ..., be a set of elements of H satisfying (VI.1.5). Then the set {v_n} is complete.

It will be noted that we have not assumed the set {v_n} to be orthonormal. The result is sharp, in the sense that the conclusion can fail if equality holds in (VI.1.5). We write

    ρ = {Σ_{n=1}^∞ ‖u_n − v_n‖²}^{1/2},
where ρ ≥ 0, and choose some ρ′ with ρ < ρ′ < 1. We first prove that for any f ∈ H we can find f_1, a finite linear combination of the {v_n}, such that

    ‖f − f_1‖ ≤ ρ′ ‖f‖.   (VI.1.7)

This being trivial if f = 0, we take it that ‖f‖ > 0. Since the u_n are assumed complete, we may express f in the form

    f = Σ_1^N c_n u_n + g,   (VI.1.8)

where

    ‖g‖ ≤ (ρ′ − ρ) ‖f‖.   (VI.1.9)

As f_1, a linear combination of the v_n, we then take

    f_1 = Σ_1^N c_n v_n.   (VI.1.10)

To justify (VI.1.7) we note first that

    ‖f − f_1‖ ≤ Σ_1^N | c_n | ‖u_n − v_n‖ + ‖g‖.

Since, by the Bessel inequality, or Parseval equality, Σ_1^N | c_n |² ≤ ‖f‖², use of the Cauchy inequality gives

    Σ_1^N | c_n | ‖u_n − v_n‖ ≤ ‖f‖ ρ,

which together with (VI.1.9) yields (VI.1.7). The proof now follows by a repetition of the process. Applying the result (VI.1.7) with f − f_1 instead of f, we have that there exists f_2, a finite linear combination of the v_n, such that, with the same ρ′ < 1,

    ‖f − f_1 − f_2‖ ≤ ρ′ ‖f − f_1‖ ≤ ρ′² ‖f‖.

Continuing the process, there exist linear combinations f_1, ..., f_m of the v_n such that

    ‖f − f_1 − f_2 − ... − f_m‖ ≤ ρ′^m ‖f‖,

and the result follows on taking m sufficiently large.
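The iteration of this proof can be mimicked in a finite-dimensional space, where f is expanded exactly in the u_n and no remainder g is needed. The example below is an added assumed illustration, not part of the text: u_n is the standard basis of R^8, v_n a perturbation with ρ = 0.8 < 1, and the residuals decay geometrically as in (VI.1.7).

```python
import numpy as np

rng = np.random.default_rng(3)
N = 8
U = np.eye(N)                                # the orthonormal set u_n (columns)
P = rng.standard_normal((N, N))
P *= 0.8 / np.linalg.norm(P)                 # Frobenius norm gives rho = 0.8
V = U + P                                    # the perturbed set v_n (columns)
rho = np.linalg.norm(V - U)

f = rng.standard_normal(N)
resid = f.copy()
norms = [np.linalg.norm(resid)]
for m in range(5):
    c = U.T @ resid                          # Fourier coefficients (resid, u_n)
    resid = resid - V @ c                    # subtract f_{m+1} = sum c_n v_n
    norms.append(np.linalg.norm(resid))

# geometric decay ||f - f_1 - ... - f_m|| <= rho^m ||f||
for m, nm in enumerate(norms):
    assert nm <= rho ** m * norms[0] + 1e-12
```

Here each sweep replaces the residual r by −P Uᵀr, of norm at most ρ‖r‖; in the general Hilbert-space case of the text one works instead with ρ′ slightly larger than ρ, to absorb the remainder g of (VI.1.8).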
VI.2. Continuous Variation of a Basis

For the standard Sturm-Liouville expansion we need the following result, based on repeated application of the principle just established.

Theorem VI.2.1. For 0 ≤ τ ≤ 1, let u_{n,τ}, n = 1, 2, ..., be a set of elements of H such that (i) u_{n,τ} is continuous in τ, in the metric of H; (ii) for fixed τ, the u_{n,τ} are mutually orthogonal and not zero; (iii) for τ = 0, the u_{n,0} are orthonormal and form a complete set in H; (iv) for some fixed K > 0,

    ‖u_{n,τ} − u_{n,0}‖ ≤ K/n.   (VI.2.1)

Then, for τ = 1, the u_{n,1} are also complete in H.

We first normalize the u_{n,τ}, defining

    w_{n,τ} = u_{n,τ} / ‖u_{n,τ}‖,   (VI.2.2)

so that the w_{n,τ} are orthonormal, and w_{n,0} = u_{n,0}. Noting that ‖u_{n,τ}‖ does not vanish, and is continuous, since | ‖u_{n,τ′}‖ − ‖u_{n,τ}‖ | ≤ ‖u_{n,τ′} − u_{n,τ}‖ and u_{n,τ} is continuous in τ, we have that w_{n,τ} is continuous in τ. We shall also show that the w_{n,τ} satisfy a result of the form (VI.2.1). We have in fact
    ‖w_{n,τ} − w_{n,0}‖ = ‖w_{n,τ} − u_{n,0}‖ ≤ ‖w_{n,τ} − u_{n,τ}‖ + ‖u_{n,τ} − u_{n,0}‖.

Here the second term is bounded by (VI.2.1). Since u_{n,τ} = w_{n,τ} ‖u_{n,τ}‖ we have also

    ‖w_{n,τ} − u_{n,τ}‖ = ‖w_{n,τ}(1 − ‖u_{n,τ}‖)‖ = | 1 − ‖u_{n,τ}‖ |.

Here, using (VI.2.1), we employ the fact that

    | 1 − ‖u_{n,τ}‖ | = | ‖u_{n,0}‖ − ‖u_{n,τ}‖ | ≤ ‖u_{n,0} − u_{n,τ}‖ ≤ K/n,

where we have also used the normality of the u_{n,0}. Hence

    ‖w_{n,τ} − w_{n,0}‖ ≤ 2K/n.   (VI.2.3)
To complete the proof of the theorem we choose a subdivision of (0, 1), say 0 = τ_0 < τ_1 < ... < τ_m = 1, such that

    Σ_{n=1}^∞ ‖w_{n,τ_{r+1}} − w_{n,τ_r}‖² < 1,   (VI.2.4)

for r = 0, ..., m − 1. To verify that such a choice of a subdivision is possible, we first choose an integer N > 1 such that

    4K² Σ_{n=N}^∞ n⁻² < 1/8.   (VI.2.5)

In view of (VI.2.3), which holds for 0 ≤ τ ≤ 1, and since

    ‖w_{n,τ_{r+1}} − w_{n,τ_r}‖² = ‖(w_{n,τ_{r+1}} − w_{n,0}) − (w_{n,τ_r} − w_{n,0})‖²
        ≤ 2 ‖w_{n,τ_{r+1}} − w_{n,0}‖² + 2 ‖w_{n,τ_r} − w_{n,0}‖²,

it follows from (VI.2.5) that

    Σ_{n=N}^∞ ‖w_{n,τ_{r+1}} − w_{n,τ_r}‖² < 1/2.   (VI.2.6)

To complete (VI.2.4) we require that, for this N,

    Σ_{n=1}^{N−1} ‖w_{n,τ_{r+1}} − w_{n,τ_r}‖² < 1/2,

for r = 0, ..., m − 1. This may be arranged by choosing the subdivision {τ_r} sufficiently fine, relying on the continuity of the w_{n,τ} in τ. On the basis of Theorem VI.1.1, it follows from (VI.2.4) that if the w_{n,τ_r}, n = 1, 2, ..., form a complete set, then so do the w_{n,τ_{r+1}}. Since the w_{n,0} are assumed complete, it follows that the w_{n,1} are also complete. This is equivalent to the completeness of the u_{n,1}, which was to be proved.
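The normalization estimate (VI.2.3) used above is easy to test numerically; the following is an added sketch with arbitrary assumed vectors, not part of the text.

```python
import numpy as np

# Check of (VI.2.3): if ||u_{n,0}|| = 1 and ||u_{n,tau} - u_{n,0}|| <= K/n,
# then the normalized w = u/||u|| satisfies ||w_{n,tau} - w_{n,0}|| <= 2K/n.
rng = np.random.default_rng(5)
K = 0.9
for n in range(1, 51):
    u0 = rng.standard_normal(6)
    u0 /= np.linalg.norm(u0)                          # ||u_{n,0}|| = 1
    d = rng.standard_normal(6)
    d *= (K / n) * rng.random() / np.linalg.norm(d)   # ||d|| <= K/n
    ut = u0 + d                                       # u_{n,tau}
    wt = ut / np.linalg.norm(ut)                      # w_{n,tau}; w_{n,0} = u0
    assert np.linalg.norm(wt - u0) <= 2 * K / n + 1e-12
```

The bound is exactly the triangle-inequality argument of the proof: one K/n from moving u to its normalization, and one from (VI.2.1) itself.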
VI.3. Another Result

We cite without proof the following result on the perturbation of bases, of a slightly more delicate character than those of the last two sections.

Theorem VI.3.1. Let u_n, n = 1, 2, ..., and w_n, n = 1, 2, ..., be two orthonormal sets in H, of which the set u_n is complete. Let also

    Σ_{n=1}^∞ ‖u_n − w_n‖² < ∞.   (VI.3.1)

Then the set w_n is also complete.

In comparison with Theorem VI.1.1, we have a weaker restriction on the sum in (VI.3.1), but now require the set {w_n} to be orthonormal. We refer to a paper of Birkhoff and Rota for a discussion of this result and its application to the Sturm-Liouville expansion.
Notation and Terminology
We use throughout the conventions of matrix algebra (see for example Bellman's "Matrix Analysis," Chapter 2). Square matrices will be denoted by Latin capitals, or by Greek letters, lower case or capital; row or column matrices will be denoted by lower case letters. Unless otherwise stated, the entries of our matrices may be complex. We use E for the unit matrix, occasionally with a suffix to indicate the order of the matrices concerned; the entry in the rth row and sth column of E will thus be δ_rs, the Kronecker delta symbol, with δ_rs = 0 if r ≠ s, δ_rr = 1. The zero matrix, all of whose entries are zero, will be denoted by 0, alike for square, row, and column matrices. The symbol (*) will indicate the Hermitean adjoint of a matrix, obtained by transposing and taking complex conjugates; for two-by-two matrices, for example, the Hermitean adjoint of the matrix with rows (a, b) and (c, d) is the matrix with rows (ā, c̄) and (b̄, d̄), bars denoting complex conjugates.

If it so happens that A = A*, we say that A is Hermitean; if also the entries in A are real, A is said to be symmetric, but this additional specialization produces for us no advantage here. If A = −A* we say that A is skew-Hermitean; the diagonal entries in A are then pure imaginaries or zero, a special case of such a matrix being iE. We write tr A for the trace of a square matrix A, being the sum of its diagonal entries. If u is a column matrix, with entries u_1, ..., u_n written vertically, then u* will be a row matrix, with entries ū_1, ..., ū_n, written horizontally. If w = (w_1, ..., w_n) is a second column matrix, w*u will be the scalar Σ_1^n w̄_r u_r; this is the same as the "scalar product" of the vectors u, w, often written (u, w) or u · w, though these notations will not be used here. If A is a square matrix, with typical entry a_rs, u*Au will be the quadratic form Σ_r Σ_s ū_r a_rs u_s, also a scalar. On the other hand, uu* will be an n-by-n matrix.
The inequalities A > 0, A ≥ 0 are to be understood in their matrix senses. Thus, if for all column matrices u ≠ 0 we have u*Au > 0, then A is positive-definite and we write A > 0. If u*Au ≥ 0 for all u, then we write A ≥ 0, this including both of the eventualities A > 0, A = 0 as special cases. For two square matrices A, B the inequalities A > B, A ≥ B are to mean that A − B > 0, A − B ≥ 0. If A = A(x) is a square matrix of functions of the real variable x, the statement that A(x) is nondecreasing, as a function of x, will mean that if x_2 > x_1 then A(x_2) ≥ A(x_1) in the above matrix sense. A value x may be said to be a point of (strict) increase of A(x) if for all ε > 0 we have A(x + ε) > A(x − ε); we may term it a point of weak increase if these requirements are weakened to

    A(x + ε) ≥ A(x − ε),   A(x + ε) ≠ A(x − ε).
As usual, we term A nonsingular if its determinant does not vanish, this being necessary and sufficient for the existence of a second matrix B = A⁻¹, such that AB = BA = E. If the square matrix U is such that UU* = U*U = E, that is to say, if its inverse U⁻¹ coincides with its Hermitean adjoint U*, we say that U is unitary; in the special case when the entries in U are all real, U is said to be orthogonal, though this specialization is not needed. More generally, if for some nonsingular matrix J we have U*JU = J, then U is said to be J-unitary. For any J, the set of J-unitary matrices forms a group; apart from the unitary group, with J = E, the main case is that of the symplectic group, when J is of even order and is compounded in a certain manner (3.2.8) of zero and unit matrices of half that order. If U*U ≤ E, then U has the property of reducing the length of a vector to which it is applied, and is accordingly termed contractive. More generally, if U*JU ≤ J it is said to be J-contractive.

Any square matrix A may be written A = B + iC, where

    B = ½(A + A*),   C = −½ i(A − A*),

so that B and C are Hermitean; they may be termed the real and imaginary parts of A, without of course implying that the entries in B and C are necessarily real. The statement that A has positive imaginary part will then of course mean that C > 0, in the matrix sense of this inequality.
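These conventions may be illustrated in a few lines of numerical code. This is an added sketch, not part of the text; the matrices are arbitrary examples, and the J shown is the two-by-two instance of a J compounded of zero and unit blocks.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

Astar = A.conj().T                      # the Hermitean adjoint A*
B = (A + Astar) / 2                     # "real part" of A
C = (A - Astar) / 2j                    # "imaginary part" of A
assert np.allclose(B, B.conj().T) and np.allclose(C, C.conj().T)
assert np.allclose(A, B + 1j * C)       # A = B + iC

# a J-unitary example, U* J U = J, with the two-by-two symplectic-type J
J = np.array([[0.0, 1.0], [-1.0, 0.0]], dtype=complex)
t = 0.7
U = np.array([[np.cos(t), np.sin(t)],
              [-np.sin(t), np.cos(t)]], dtype=complex)
assert np.allclose(U.conj().T @ J @ U, J)
assert np.allclose(U.conj().T @ U, np.eye(2))   # this U is also unitary
```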
List of Books and Monographs
(References to these items in the Notes will be abbreviated, usually to the author's name only. Other items will be cited with full bibliographical details.)

AHIEZER, N. I. (ACHIEZER), "Lectures on the Theory of Approximation." Moscow-Leningrad, 1947; German ed., Akad.-Verlag, Berlin, 1953.

AHIEZER, N. I., and GLAZMAN, I. M. (GLASMANN), "Theorie der linearen Operatoren im Hilbertraum." Moscow, 1950; German ed., Akad.-Verlag, Berlin, 1960.

BECKENBACH, E. F., and BELLMAN, R. E., "Inequalities." Springer, Berlin, 1961.

BELLMAN, R. E., "Stability Theory of Differential Equations." McGraw-Hill, New York, 1953.

BELLMAN, R. E., "Introduction to Matrix Analysis." McGraw-Hill, New York, 1960.
BIEBERBACH, L., "Theorie der Differentialgleichungen." Berlin, 1926.

BIRKHOFF, GARRETT, and ROTA, G.-C., "Ordinary Differential Equations." Ginn, Boston, Massachusetts, 1962.

BIRKHOFF, GEORGE D., "Collected Works," Vol. I. New York, 1950.

CODDINGTON, E. A., and LEVINSON, N., "Theory of Ordinary Differential Equations." McGraw-Hill, New York, 1955.

COLLATZ, L., "The Numerical Treatment of Differential Equations." Springer, Berlin, 1960.

COURANT, R., and HILBERT, D., "Methods of Mathematical Physics," Vol. 1. 2nd German ed., Berlin, 1931; English ed., Wiley, New York, 1953.

DOLPH, C. L., Recent developments in some non-self-adjoint problems of mathematical physics, Bull. Amer. Math. Soc. 67 (1961), 1-69.

FORT, T., "Finite Differences and Difference Equations in the Real Domain." Oxford Univ. Press, London and New York, 1948.

GANTMAHER, F. R. (GANTMACHER), and KREĬN, M. G., "Oscillation Matrices, Oscillation Kernels, and Small Vibrations of Mechanical Systems." 2nd Russian ed., Moscow-Leningrad, 1950; German ed., Akad.-Verlag, Berlin, 1960; English ed., USAEC translation 4481, 1961.
GERONIMUS, YA. L., "Theory of Orthogonal Polynomials." Moscow, 1958; English ed., "Orthogonal Polynomials: Estimates, Asymptotic Formulas, and Series of Polynomials Orthogonal on the Unit Circle and on an Interval." Consultants' Bureau, New York, 1961; or "Polynomials Orthogonal on a Circle and Interval." Pergamon, New York, 1960.
GRENANDER, U., and SZEGŐ, G., "Toeplitz Forms and Their Applications." Univ. of California Press, Berkeley, California, 1958.

HANNAN, E. J., "Time Series Analysis." Wiley, New York, 1960.

INCE, E. L., "Ordinary Differential Equations," 4th ed. Dover, New York, 1953.

KAMKE, E., "Differentialgleichungen reeller Funktionen." Teubner, Leipzig, 1930.

KARLIN, S., and SZEGŐ, G., On certain determinants whose elements are orthogonal polynomials, J. Anal. Math. 8 (1960/61), 1-157.

KREĬN, M. G., and KRASNOSEL'SKIĬ, M. A., Fundamental theorems on the extension of hermitian operators and certain of their applications to the theory of orthogonal polynomials and the problem of moments, Uspekhi Mat. Nauk 2 (1947), 60-106.

KREĬN, M. G., The ideas of P. L. Čebyšev and A. A. Markov in the theory of the limiting values of integrals and their further development, Uspekhi Mat. Nauk 6 (1951), 3-120; Amer. Math. Soc. Transl. (2) 12 (1959), 3-120.

KREĬN, M. G., and REHTMAN, P. G., Development in a new direction of the Čebyšev-Markov theory of the limiting values of integrals, Uspekhi Mat. Nauk 10 (1955), 67-78; Amer. Math. Soc. Transl. (2) 12 (1959), 123-136.
LEVITAN, B. M., Appendices I-V to Russian ed. of Part I of "Eigen-Function Expansions" by E. C. Titchmarsh. Moscow-Leningrad, 1960.

MORSE, M., "Calculus of Variations in the Large." Amer. Math. Soc. Colloquium Publications, Vol. 18, New York, 1934.

NAĬMARK, M. A. (NEUMARK), "Linear Differential Operators." Russian ed., Moscow, 1954; German ed., Akad.-Verlag, Berlin, 1960.

NAĬMARK, M. A., Investigation of the spectrum and the expansion in eigenfunctions of a non-self-adjoint differential operator of the second order on a semi-axis, Trudy Moskov. Mat. Obšč. 3 (1954), 181-270; Amer. Math. Soc. Transl. (2) 16 (1960), 103-193.

POTAPOV, V. P., The multiplicative structure of J-contractive matrix-functions, Trudy Moskov. Mat. Obšč. 4 (1955), 125-236; Amer. Math. Soc. Transl. (2) 15 (1960), 131-243.

RIESZ, F., and SZŐKEFALVI-NAGY, B., "Leçons d'analyse fonctionnelle." Budapest, 1953.

SHOHAT, J., and TAMARKIN, J. D., "The Problem of Moments," Math. Surveys No. 1. Amer. Math. Soc., New York, 1943, 1950.

STONE, M. H., "Linear Transformations in Hilbert Space and Their Applications to Analysis." Amer. Math. Soc. Colloquium Publications, Vol. 15, New York, 1932.
SZEGŐ, G., "Orthogonal Polynomials." Amer. Math. Soc. Colloquium Publications, Vol. 23, New York, 1939; 2nd ed., 1959.

TITCHMARSH, E. C., "Eigenfunction Expansions Associated with Second-Order Differential Equations," Part I. Oxford Univ. Press, New York, 1946; 2nd ed., 1962; Part II, Oxford Univ. Press, New York, 1958.

TITCHMARSH, E. C., "The Theory of Functions." Oxford Univ. Press, New York, 1932; 2nd ed., 1939.

TITCHMARSH, E. C., "Theory of Fourier Integrals." Oxford Univ. Press, New York, 1937.

WALL, H. S., "Analytic Theory of Continued Fractions." Van Nostrand, Princeton, New Jersey, 1948.

WIDDER, D. V., "The Laplace Transform." Princeton Univ. Press, Princeton, New Jersey, 1946.
Notes
Section 0.1

Some discussion of the discrete boundary problem (0.1.5), (0.1.8), and of variational and other aspects, is given on pp. 142-146 of Bellman's "Matrix Analysis." Practical numerical aspects of the replacement of the boundary problem for a differential equation by that for a difference equation are treated in works such as that of L. Collatz, in Chapter III of the cited book. For the use of the discrete approximation to establish the eigenfunction expansion for the differential equation case see Levitan, Appendix I to the Russian edition of Titchmarsh's book, or

M. PLANCHEREL, Le passage à la limite des équations aux différences aux équations différentielles dans les problèmes aux limites, Bull. Sci. Math. 46 (1922), 153-160, 170-177;
the matter is also referred to in Fort’s book. Although we shall not reproduce this argument in this book, we use the process to establish the expansion theorem for a certain mixed discrete-continuous recurrence relation, generalizing that associated with complex Fourier series; see Section 2.10.
Section 0.2 Concerning this type of wave propagation, see for example J. A. STRATTON, “Electromagnetic Theory.” McGraw-Hill, New York, 1941, Chapter 5 and Problems.
Section 0.4 See for example J. G. TRUXAL, “Control Engineer’s Handbook.” McGraw-Hill, New York, 1958.
Section 0.7

For the further theory of the probabilistic model see the notes to Section 5.7 and the references given there. See also

I. J. GOOD, Random motion and analytic continued fractions, Proc. Cambridge Phil. Soc. 54 (1958), 43-47.
Section 0.8

Sturm-Liouville theory with a parameter in the boundary conditions has been treated by a number of writers; in particular see

G. W. MORGAN, Some remarks on a class of eigenvalue problems with special boundary conditions, Quart. Appl. Math. 11 (1953), 157-165,

W. F. BAUER, Modified Sturm-Liouville systems, ibid. 272-283,

R. L. PEEK, Jr., A problem in diffusion, Ann. of Math. (2) 30 (1929), 265-269,

E. HILLE, Note on the preceding paper by Mr. Peek, ibid. 270-271.
There are many more general investigations, relating to more general boundary conditions or side conditions, systems of higher order, and so on. For such work and further references see

J. D. TAMARKIN, Some general problems of the theory of ordinary linear differential equations and the expansion of an arbitrary function in series of fundamental functions, Math. Z. 27 (1927), 1-54,

R. E. LANGER, A theory for ordinary differential boundary problems of the second order and of the highly irregular type, Trans. Amer. Math. Soc. 53 (1943), 292-361,
L. A. DIKIĬ, On boundary conditions depending on an eigenvalue, Uspekhi Mat. Nauk 15 (1960), 195-198,

J. ADEM, Matrix differential systems with a parameter in the boundary conditions, Quart. Appl. Math. 17 (1959), 165-171,

H. J. ZIMMERBERG, Two-point boundary conditions linear in a parameter, Pacific J. Math. 12 (1962), 385-393.

In the Sturm-Liouville case, it is clear that the presence of the second derivative in the boundary conditions may be eliminated by means of the differential equation, at the cost of introducing the spectral parameter. For another approach to such problems see

R. V. CHURCHILL, Expansions in series of non-orthogonal functions, Bull. Amer. Math. Soc. 48 (1942), 143-149.
We discuss in Chapter 8 and the notes for Section 8.1 various extended forms of Sturm-Liouville theory in which the presence of the parameter
in the boundary conditions in linear fashion forms part of a wider generalization which allows for discontinuities within the basic interval. For the case of Sturm-Liouville theory with a finite number of interface conditions a thorough investigation is due to W. C. Sangren, cited under the notes for Section 11.8. The condition that the matrix in (0.8.6), if constant, be symplectic may be ensured by multiplication by a constant factor, if it has positive determinant; this does not of course apply for higher dimensions. In this case orthogonality relations can still be set up if the symplectic property fails.
Section 1.5

The term "spectral function" occurs in the literature in three senses. In that used in this book it gives a weight-distribution on the real axis with the property of inverting the definition of the Fourier coefficient to yield the function being expanded in eigenfunctions, as (1.5.5) is inverted by (1.5.6). One may also demand that (1.5.6) be inverted by (1.5.5), for a suitable class of v(λ). The definitions may be most simply illustrated in the case of the continuous analog of the recurrence relation of this chapter, that is in the case of the differential equation y′ = iλy, subject to y(0) = y(1). The expansion theorem, that of complex Fourier series, asserts that if for any well-behaved f(x) we define

    v(λ) = ∫_0^1 f(t) exp(iλt) dt,

then

    f(x) = ∫_{−∞}^∞ exp(−iλx) v(λ) dρ(λ),

where the spectral function ρ(λ) is in this case the greatest integer not exceeding λ/(2π). The "dual orthogonality," corresponding to (1.5.3), is now the formal result that

    ∫_{−∞}^∞ exp(iλx) exp(−iλt) dρ(λ) = δ(x − t),
the right-hand side being the Dirac delta function. The spectral function just defined has the orthogonal property that the relationship between f(x) and v(λ) is reciprocal, isometric, and onto as between f(x) ∈ L²(0, 1), on the one hand, and the set of v(λ) such that ∫_{−∞}^∞ | v(λ) |² dρ(λ) is finite, on the other, the latter being effectively the set of sequences of summable square, in view of the Riesz-Fischer theorem (Riesz and Sz.-Nagy, Chapter 2, or Titchmarsh, "Theory of Functions," Chapter 13).

For a distinct use of the term "spectral function" let us define rather the "spectral kernel" τ(x, y; λ) by

    τ(x, y; λ) = ∫_0^λ exp(iμx) exp(−iμy) dρ(μ),

where ρ(λ) is as previously. Then, under suitable restrictions,

    f(x) = lim_{λ→∞} ∫_0^1 τ(x, y; λ) f(y) dy.
This spectral kernel may be defined as a step function with jumps at the eigenvalues, the jump being the product of two eigenfunctions, associated with the particular eigenvalue, the eigenfunctions being normalized in the mean-square sense, and in our present case using the complex conjugate of one of them. The same construction is important in Sturm-Liouville cases (see for example Levitan, Appendices to Titchmarsh's book). For a similar construction in connection with partial differential equations, see for example

F. J. BUREAU, Asymptotic representation of the spectral function..., J. Math. Anal. Appl. 1 (1960), 423-483.
The third use of the term “spectral function” concerns the integral operator defined by the “spectral kernel” just introduced. We define a family of operators E_λ by

E_λ f(x) = ∫₀¹ τ(x, y; λ) f(y) dy,

so that, formally at least, −if′(x) = ∫_{−∞}^{∞} λ dE_λ f(x). For further information on such “resolutions of the identity” we refer to the books of Ahiezer and Glazman or of Stone, or to papers of Naimark, such as his Extremal spectral functions of a symmetric operator, Izv. Akad. Nauk SSSR, Ser. Mat. 11 (1947), 327-344,
or R. C. GILBERT, The denseness of the extreme points of the generalized resolvents of a symmetric operator, Duke Math. J. 26 (1959), 683-691.
It should be mentioned that in any of these senses the spectral functions (though not in general the orthogonal spectral functions) form a convex set, containing with any two such functions also their arithmetic means with any non-negative weights. We may therefore distinguish extremal spectral functions lying, so to speak, on the boundary of this set, not representable as the arithmetic mean of other spectral functions. This notion appears in particular in the moment problem, and the convexity is reflected geometrically here and in the case of differential equations by the association of spectral functions with points of circles in the complex plane.
Section 1.6 We emphasize that the (somewhat overworked) term “characteristic function” will not be used in this book in the sense of an eigenfunction, but in the sense of a certain meromorphic function of the spectral parameter, having poles at the eigenvalues. A rather similar use of the term is made by Naimark in his book, p. 240, and by K. Kodaira in two important papers on differential equations [Amer. J. Math. 71 (1949), 921-945; 72 (1950), 502-544]. A distinct though not unrelated usage is followed in the definition of characteristic functions or matrix functions in a functional-analytic context by Livšic. See M. S. BRODSKIĬ and M. S. LIVŠIC, Spectral analysis of non-self-adjoint operators and intermediate systems, Uspekhi Mat. Nauk 13 (1958), 3-85, A. V. STRAUS, Characteristic functions of linear operators, Doklady Akad. Nauk SSSR 126 (1959), 514-516,
or, for a brief account, the book of Ahiezer and Glazman.
Sections 1.7-8 These form analogs of problems of inverse Sturm-Liouville theory, in which a coefficient-function or “potential” in a second-order differential equation is to be recovered, given either the spectral function, or alternatively given two sets of eigenvalues corresponding to two given boundary conditions at one end, the boundary condition at the other end being fixed. See Sections 4.6-7, 5.2, 7.4, 12.4, and the Notes to Section 12.7.
Section 1.10 The term “moment-problem” most commonly refers to problems
concerning moments of powers on the real axis, that is to say, the determination of τ(λ) from the equations

∫ λⁿ dτ(λ) = μₙ , n = 0, 1, ...,

the μₙ being given; the integral may extend over the whole axis (−∞, ∞) (Hamburger problem), or over (0, ∞) (Stieltjes), or over (0, 1) (Hausdorff); here it is mainly the Hamburger problem which is of interest in this book, though this does not exclude τ(λ) being constant on the negative real axis. One way of viewing the moment problem is that the moments define a scalar product of any two polynomials f(λ), g(λ); the expression

∫_{−∞}^{∞} f(λ) g(λ) dτ(λ)

involves only the moments, and so may be evaluated without knowledge of τ(λ). Completing this set of polynomials to a Hilbert space, we study the symmetric operator defined on polynomials by the mapping f(λ) → λf(λ). For this approach, due to Livšic, Krein, and Krasnosel’skii, see the cited monograph of M. G. Krein and M. A. Krasnosel’skii. In a similar way, for the problem of this section, we may suppose known the values of ∫_{−∞}^{∞} (λ − aₙ)⁻¹ dτ(λ), where the aₙ are given but τ(λ) is unknown, but is to be nondecreasing; the aₙ are to lie in the upper half-plane and in the simplest case are all distinct. These moments again determine a scalar product and so a pre-Hilbert space of rational functions with poles at the aₙ. The operator given by multiplication by λ will be symmetric, with domain including those rational functions with at most simple poles at the aₙ which vanish to order O(λ⁻²) as λ → ∞.
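The remark that the scalar product of two polynomials involves only the moments can be made concrete in a few lines; the following sketch (ours, using the standard Gaussian moment sequence purely as a test case) evaluates the scalar product from the μₙ alone:

```python
# Sketch: the scalar product defined by a moment sequence mu_n,
#   <f, g> = integral of f(l) g(l) d tau(l) = sum_{j,k} f_j g_k mu_{j+k},
# computed without any knowledge of tau itself.
def moment_inner_product(f, g, mu):
    """f, g: polynomial coefficient lists in ascending degree."""
    return sum(f[j] * g[k] * mu[j + k]
               for j in range(len(f)) for k in range(len(g)))

# Moments of the standard Gaussian weight: mu_{2m} = (2m-1)!!, odd ones 0.
mu = [1, 0, 1, 0, 3, 0, 15, 0, 105]

print(moment_inner_product([0, 1], [0, 1], mu))    # <l, l> = mu_2 = 1
print(moment_inner_product([-1, 0, 1], [1], mu))   # <l^2 - 1, 1> = 0
```

The second line confirms that the Hermite polynomial λ² − 1 is orthogonal to constants in this scalar product, as it must be for the Gaussian weight.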
Similar ideas apply to the trigonometric moment problem and its continuous analogs. For multivariate extensions see A. DEVINATZ, On the extensions of positive definite functions, Acta Math. 102 (1959), 109-134,
where a connection is found with the work of N. ARONSZAJN, The theory of reproducing kernels, Trans. Amer. Math. Soc. 68 (1950), 337-404.
Additional references on the ordinary power moment problem are given in the notes to Section 5.10, and on the trigonometric problem in the notes to Section 7.5.
Hilbert spaces of analytic functions also occur in the work of L. DE BRANGES, Some Hilbert spaces of entire functions. IV, Trans. Amer. Math. Soc. 105 (1962), 43-83,

where other references are given. The Pick-Nevanlinna problem consists in finding a function f(λ), analytic in Im λ > 0, Im f(λ) having a fixed sign there, to take assigned values at an infinite sequence of points in Im λ > 0. Imposing the first m of these conditions, and making m increase, there results a recurrence relation, leading to a limit-point, limit-circle classification; this classification is the analog of that obtaining in Sturm-Liouville theory when the basic interval is extended to infinity, or in the three-term recurrence situation of Chapter 5. In the present case, these recurrence relations suggest analogs for differential equations which involve the spectral parameter in fractional-linear form. See H. WEYL, Über das Pick-Nevanlinna’sche Interpolationsproblem und sein infinitesimales Analogon, Ann. of Math. (2) 36 (1935), 230-254.
See also Krein’s monograph, “The ideas of P. L. Čebyšev ...,” and that of Beckenbach and Bellman.
Section 2.2 For recent results and bibliography on Blaschke products see G. T. CARGO, Angular and tangential limits of Blaschke products and their successive derivatives, Canad. J. Math. 14 (1962), 334-348, A. A. GOL’DBERG, Notes on Blaschke derivatives for a half-plane, Ukrain. Mat. Zh. 11 (1959), 210-213.
Section 2.3 Reasoning from the uniform boundedness of a family of spectral functions to the existence of a limiting spectral function is a device to be employed later in connection with orthogonal polynomials (Section 5.2), and is standard in the topic of Sturm-Liouville theory on a half-axis (Section 8.12).
Section 2.5 In Theorem 2.5.1, if the real axis be transformed to the unit circle, we have to deal with the derivative of a Blaschke product, and the radial limit of this derivative; see the reference just made to the paper of G. T. Cargo.
Section 2.7 In view of the orthogonality (2.7.1), the rational functions q_n(λ) result from applying the process of orthogonalization to the functions (1 − iλF_n)⁻¹, the orthogonality interval being the whole real axis with a constant weight-function. See O. SZÁSZ, “Collected Works.” Cincinnati, 1955.

The same orthogonality may be applied to what we might view as a dual expansion theorem, in which v(λ), defined on the real axis, is to be expanded in a series of the q_n(λ). Likewise, we may consider v(λ) as a given meromorphic function to be expanded in such a series. For similar investigations see E. LAMMEL, Über Approximation meromorpher Funktionen durch rationale Funktionen, Math. Ann. 118 (1941), 134-144.
Section 2.10

If we restrict the expansion to that of a function defined over the continuous range, here denoted by −c < x < 0, we make contact with, though without including, an investigation of

A. V. STRAUS, On the spectral function of the operation of differentiation, Uspekhi Mat. Nauk 13 (1958), 185-191,

where it is a question of finding all operator spectral functions associated with i d/dx (see notes for Section 1.5).
Section 3.1 In this chapter we consider the products of a finite number of factors of the form A_n λ + B_n , each of which is J-unitary for real λ, and J-contractive when Im λ > 0. This forms a very special case of a theory of products of matrix factors with these properties, allowing also fractional-linear factors (A_n λ + B_n)(C_n λ + D_n)⁻¹, and allowing infinite products, discrete, continuous, or mixed. The basic work in the field is the monograph of V. P. Potapov, listed in the general references, which underlies all our discussion.
Section 3.2 Concerning the symplectic group see the book of H. Schwerdtfeger, “Introduction to Linear Algebra and the Theory of Matrices” (Groningen, 1950), or that of C. Chevalley, “Theory of Lie Groups, I” (Princeton
Univ. Press, Princeton, New Jersey, 1946). Analytic aspects are taken up by C. L. SIEGEL, Symplectic geometry, Amer. J. Math. 65 (1943), 1-86.
See also the notes for Section 10.1.
Section 3.3 Isotropic subspaces with respect to an indefinite metric are considered by A. I. Mal’cev, “Foundations of Linear Algebra” (Moscow-Leningrad, 1948), Chapter 9. See also the references to the work of V. A. Yakubovič on the symplectic group given in the Notes to Section 10.1.
Section 3.5 Transformations of the plane leaving area invariant are discussed by H. S. M. COXETER, “Introduction to Geometry.” Wiley, New York, 1961,
such linear transformations being termed “equi-affine”; the special case of a shift parallel to a fixed line, of amount proportional to the distance from it, is a “shear.” The term “symplectic transvection” is used by E. Artin, in “Geometric Algebra” (Interscience, New York, 1957).
Chapter 4 The theory of orthogonal polynomials is usually developed starting from the orthogonality; the latter is usually taken with respect to a weight-distribution function τ(λ) with an infinity of points of increase, or more specially a weight-function which is continuous and positive in some interval. The principal reference is Szegő’s book; this takes the orthogonality as basic, as do a number of briefer presentations, for example F. G. TRICOMI, “Vorlesungen über Orthogonalreihen,” Berlin, 1955,
or D. JACKSON, “Fourier Series and Orthogonal Polynomials,” Carus Monograph Series No. 6. Ohio, 1941.
The recurrence relation point of view is systematically developed in Stone’s book, pp. 530-614.
Section 4.2 The inequality (4.2.4) appears in Sturm-Liouville theory as the monotonic dependence of a certain polar angle on the spectral parameter [Theorem 8.4.3(iii)]. Though this is not our approach here, we indicate the very simple proof of (4.2.4) which rests on the orthogonality to be proved in Section 4.4. It follows from this orthogonality (see Problems 1 and 9) that the polynomial y_m(λ) + h y_{m−1}(λ), for any real h, has at least m − 1 changes of sign as λ increases on the real axis. But if equality held in (4.2.4) for some real λ, this polynomial would, for suitable h, have there a multiple zero, and so would have at most m − 2 changes of sign; the constant sign of the left of (4.2.4) may now be ascertained by considering the highest power of λ. For an extension of this argument to higher-order Wronskians see the monograph of Karlin and Szegő (p. 6), where numerous other interesting investigations will be found. For a converse of the Wronskian property see

W. A. AL-SALAM, On a characterization of orthogonality, Math. Mag. 31 (1957/58), 41-44.
Section 4.3 Concerning the zeros of y_n(λ), as a function of λ, see Szegő’s book, Section 3.3, where a variety of arguments is given. The oscillatory properties of y_n(λ), as a function of n, were apparently known to Sturm, though not proved until much later, by M. B. PORTER, On the roots of functions connected by a linear recurrent relation of the second order, Ann. of Math. (2) 3 (1902), 55-70; see also O. DUNKEL, The alternation of nodes of linearly independent solutions of second-order difference equations, Bull. Amer. Math. Soc. 32 (1926), 333-334,
W. M. WHYBURN, On related difference and differential systems, Amer. J. Math. 51 (1929), 265-280.
For a detailed exposition we refer to Fort’s book. See also the book of Gantmaher-Krein, Chapter 2, Section 1, and the monograph of Karlin and Szegő. Anticipating the topic of Chapter 5 in some degree, consider the infinite recurrence sequence defined by c_n u_{n+1} = b_n u_n − c_{n−1} u_{n−1} , n = 0, 1, ..., with initial values u_{−1} , u_0 , not both zero. The recurrence relation may be said to be “nonoscillatory” if the sequence u_n is ultimately of one sign; as in the case of second-order differential equations, this classification is one of the recurrence relation, and does not depend on the choice of initial data. Again as in the case of differential equations, the question has applications to the nature of the spectrum. In addition to Fort’s book, see for example P. HARTMAN and A. WINTNER, Linear differential and difference equations with monotone solutions, Amer. J. Math. 75 (1953), 131-143,
P. J. MCCARTHY, Note on the oscillation of solutions of second order linear difference equations, Portugal. Math. 18 (1959), 203-205.
T. FORT, Limits of the characteristic values for certain boundary problems associated with difference equations, J. Math. Phys. 35 (1957), 401-407,
and the notes for Section 5.2.
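The alternation of nodes discussed above is easily exhibited numerically; the sketch below (ours) uses the Chebyshev polynomials of the second kind, whose zeros are known in closed form, as a concrete stand-in for the y_n:

```python
import math

# Zeros of the Chebyshev polynomials of the second kind U_n, which
# satisfy the three-term recurrence U_{n+1} = 2x U_n - U_{n-1}:
def zeros(n):
    return sorted(math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1))

z3, z4 = zeros(3), zeros(4)
# Each zero of U_3 lies strictly between consecutive zeros of U_4.
print(all(z4[i] < z3[i] < z4[i + 1] for i in range(3)))   # True
```

The same strict interlacing holds for any pair of consecutive solutions of such a recurrence, which is the content of the results of Porter and Dunkel cited above.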
Section 4.4 The normalization constants ρ_r given in (4.4.34) are essentially the reciprocals of the Christoffel numbers; see Szegő’s book, (3.4.7-8), for the case, in our notation, h = 0.
Section 4.5 Our two forms (4.5.4), (4.5.5) conceal a well-known identity in the theory of orthogonal polynomials. Comparing the two we have, taking h = 0, that the second solution may be written

−y_m(λ) ∫_{−∞}^{∞} (λ − μ)⁻¹ dτ_{m,0}(μ),

since y_m(μ) vanishes at the jumps of τ_{m,0}(μ). Anticipating the “mechanical quadrature” (5.2.11) [or (4.8.8)] it follows that this equals

−∫_{−∞}^{∞} {y_m(λ) − y_m(μ)} (λ − μ)⁻¹ dτ(μ).

This may be interpreted in the sense that we start with the weight-distribution τ(λ), construct polynomials y_m(λ), by orthogonalization, which necessarily satisfy a recurrence relation, and then derive by the last formula a second solution of the recurrence relation (Szegő, “Orthogonal Polynomials,” Section 3.5). For the continuous analog, relating to solutions of second-order differential equations, see B. M. LEVITAN, On a theorem of H. Weyl, Doklady Akad. Nauk SSSR 82 (1952), 246-249.
Section 4.7 A similar problem has been treated by B. WENDROFF, On orthogonal polynomials, Proc. Amer. Math. Soc. 12 (1961), 554-555.
Problems of this kind have interesting mechanical formulations. We refer to the book of Gantmaher-Krein, Appendix II, “On a remarkable problem for a string of pearls and on Stieltjes continued fractions,” where the problem is treated in the form that particles are to be fixed on a light string, with given length and tension and fixed at one end, so as to have one given set of frequencies when the other end is fixed, and another given set of frequencies when this end slides transversely. The dynamical interpretation leads to interesting extremal problems, such as minimizing the total mass to be fixed to the string so as to produce given frequencies. See M. G. KREIN, On some problems on the maximum and minimum for characteristic values and on Lyapunov stability zones, Priklad. Mat. i Mekh. 15 (1951), 323-348; or Amer. Math. Soc. Transl. (2) 1 (1955), 163-187, On some new problems of the theory of the oscillation of Sturmian systems, Priklad. Mat. i Mekh. 16 (1952), 555-568,
D. BANKS, Bounds for the eigenvalues of some vibrating systems, Pacific J. Math. 10 (1960), 439-474,
B. SCHWARZ, “On the extrema...,” J. Math. Mech. 10 (1961), 401-422, B. SCHWARZ, Some results on the frequencies of nonhomogeneous rods, J. Math. Anal. Appl. 5 (1962), 169-175,
where references are given to work of P. R. Beesack and S. H. Gould. On the relation to inverse spectral problems see also R. BELLMAN and J. M. RICHARDSON, A note on an inverse problem in mathematical physics, Quart. Appl. Math. 19 (1961), 269-271;
references to some analogous problems for differential equations are given in the notes to Section 12.7. In particular, in the Gel’fand-Levitan solution of the inverse Sturm-Liouville problem the parallel with the orthogonalization of the powers to form orthogonal polynomials appears to have been found suggestive in connection with the orthogonalization (in a continuous sense) of the functions cos Kx.
Chapter 4, Problems Problems 6-10. For the basic theory of Čebyšev systems, sometimes called Markov systems when there is an infinite sequence of functions, see Ahiezer’s book, Chapter 2, and the book of Gantmaher-Krein, Chapters 3, 4, where many examples of such systems are found, in association with boundary problems. It is possible to discuss multiple zeros of linear combinations of functions of such systems, without introducing differentiability. See for example D. R. DICKINSON, On Tschebysheff polynomials, Quart. J. Math. 10 (1939), 277-282; 12 (1941), 184-192; also J. London Math. Soc. 17 (1942), 211-217. S. LIPKA, Über die Anzahl der Nullstellen von T-Polynomen, Monatsh. Math. Phys. 51 (1944), 173-178.
Problem 15. These are the Čebyšev-Markov-Stieltjes inequalities. For an analogous property for a second-order differential equation see M. G. KREIN, Analog of the Čebyšev-Markov inequalities in a one-dimensional boundary problem, Doklady Akad. Nauk SSSR 89 (1953), 5-8.
Section 5.1 Among illustrations of the theory of this chapter are the classical polynomials of Legendre, Jacobi, Hermite, and Laguerre, discussed in Szegő’s book and elsewhere, and certain discrete analogs of the special functions. See R. J. DUFFIN and TH. W. SCHMIDT, An extrapolator and scrutator, J. Math. Anal. Appl. 1 (1960), 215-227, P. LESKY, Unendliche orthogonale Matrizen und Laguerresche Matrizen, Monatsh. Math. 63 (1959), 59-83, and the same author’s Die Übersetzung der klassischen orthogonalen Polynome in die Differenzenrechnung, ibid. 65 (1961), 1-26; 66 (1962), 431-435.
R. H. BOYER, Discrete Bessel functions, J. Math. Anal. Appl. 2 (1961), 509-524,
and the monograph of Karlin and Szegő. A case when the polynomials have a definite asymptotic form for large n is considered by D. J. DICKINSON, H. O. POLLAK, and G. H. WANNIER, On a class of polynomials orthogonal over a denumerable set, Pacific J. Math. 6 (1956), 239-247.
For other recent work see V. G. TARNOPOL’SKII, The dispersion problem for a difference equation, Doklady Akad. Nauk SSSR 136 (1961), 779-782, W. G. BICKLEY and J. MACNAMEE, Eigenvalues and eigenfunctions of finite difference operators, Proc. Cambridge Phil. Soc. 57 (1961), 532-546.
Many special polynomials, some of which have orthogonality properties, have been considered by Carlitz; see for example L. CARLITZ, On some polynomials of Tricomi, Boll. Un. Mat. Ital. (3) 13 (1958), 58-64.
Section 5.2 The observation that a three-term recurrence relation of a suitable form defines polynomials which are necessarily orthogonal on the real axis seems to have been first explicitly stated by J. FAVARD, Sur les polynômes de Tchebicheff, C. R. Acad. Sci. 200 (1935), 2052-2053,
who remarked that the result followed from one of Hamburger.
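Favard’s observation runs the usual construction backwards; the computation below (our sketch, using the Chebyshev recurrence U_{n+1}(x) = 2x U_n(x) − U_{n−1}(x) as a concrete case) verifies numerically that polynomials generated purely by a three-term recurrence do turn out orthogonal with respect to a suitable weight:

```python
import math

# Generate U_n by the recurrence alone, with U_{-1} = 0, U_0 = 1.
def U(n, x):
    prev, cur = 0.0, 1.0
    for _ in range(n):
        prev, cur = cur, 2 * x * cur - prev
    return cur

# Gauss quadrature for the weight sqrt(1 - x^2) on (-1, 1):
# nodes x_k = cos(k pi/(m+1)), weights (pi/(m+1)) (1 - x_k^2).
m = 8
nodes = [math.cos(k * math.pi / (m + 1)) for k in range(1, m + 1)]
weights = [math.pi / (m + 1) * (1 - x * x) for x in nodes]

def dot(i, j):
    return sum(w * U(i, x) * U(j, x) for x, w in zip(nodes, weights))

print(abs(dot(1, 2)) < 1e-12)                 # distinct degrees orthogonal
print(abs(dot(2, 2) - math.pi / 2) < 1e-12)   # normalization pi/2
```

The quadrature is exact for polynomials of degree at most 2m − 1, so these discrete sums reproduce the integrals against the weight exactly (up to rounding).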
It would appear that the result was already in the possession of J. SHOHAT, The relation of the classical orthogonal polynomials to the polynomials of Appell, Amer. J. Math. 58 (1936), 453-464.
For related earlier investigations see E. HELLINGER, Zur Stieltjesschen Kettenbruchtheorie, Math. Ann. 86 (1922), 18-29,
J. SHERMAN, On the numerators of the convergents of the Stieltjes continued fractions, Trans. Amer. Math. Soc. 35 (1933), 64-87,
and the later sections of Stone’s book. It is also possible to consider recurrence relations in which our restrictions of sign on the coefficients are relaxed in an essential way, and orthogonality with respect to a distribution of bounded variation, which need not be nondecreasing. See J. SHOHAT, Sur les polynômes orthogonaux généralisés, C. R. Acad. Sci. 207 (1938), 556-558,
D. DICKINSON, On certain polynomials associated with orthogonal polynomials, Boll. Un. Mat. Ital. (3) 13 (1958), 116-124.
In forming the sequence of spectral functions τ_{m,h}(λ) it is permissible to restrict ourselves to the case h = 0, so long as we merely wish to show that there is at least one limiting spectral function. In this case the spectral functions have their jumps at the zeros of y_m(λ), m = 1, 2, ..., and this leads to the conclusion that the interval of orthogonality may be taken to be the smallest interval containing all the zeros of all the y_m(λ). This interval is sometimes termed the “true” interval of orthogonality of the polynomials. A particularly important case is that in which the zeros of the y_m(λ) have one sign only. This occurs in the case of the vibrating string and in the case of recurrence relations associated with birth and death processes. Concerning the latter, see for instance S. KARLIN and J. MCGREGOR, Linear growth, birth and death processes, J. Math. Mech. 7 (1958), 643-662.
The situation in which all the polynomials have zeros of the same sign has recently been studied by T. S. CHIHARA, Chain sequences and orthogonal polynomials, Trans. Amer. Math. Soc. 104 (1962), 1-16.
The case of quasi-orthogonal polynomials, orthogonal when their degrees differ by at least two, has been considered in regard to necessary and sufficient conditions by D. DICKINSON, On quasi-orthogonal polynomials, Proc. Amer. Math. Soc. 12 (1961), 185-194.
Section 5.4 The nesting circle aspect of the second-order difference equation was brought out by E. Hellinger [Math. Ann. 86 (1922), 18-29], in analogy to the famous discovery of H. WEYL, Über gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen Entwicklungen willkürlicher Funktionen, Math. Ann. 68 (1910), 220-269.
The argument leading to the nesting property can be extended to allow the b_n to have suitably restricted complex values; furthermore, the λ appearing in the recurrence relation may depend on n, the complex λ with Im λ > 0 being replaced by a sequence of values in the upper half-plane. See Wall’s book, Chapter 4, and the work of Sims referred to in the notes for Section 8.13.
Section 5.7 In the limit-circle case, the existence of a plurality of spectral functions, and of orthogonality relations, both direct and dual, leads to a plurality of solutions of certain differential equations, similar to (0.7.16), or to a plurality of values of the exponential of a certain matrix, similar to (0.7.15). The situation is of interest in connection with birth and death processes, and for a fuller discussion we refer to the article of W. FELLER, The birth and death processes as diffusion processes, J. Math. Pures Appl. 38 (1959), 301-345,
where other references are given. We vary our notation, changing the sign of λ and replacing a_n by 1, so that the polynomials are to be defined by

c_n y_{n+1}(λ) + (λ + b_n) y_n(λ) + c_{n−1} y_{n−1}(λ) = 0,

with c_n > 0, y_{−1}(λ) = 0, c_{−1} y_0(λ) = 1. We shall assume that the spectrum is bounded from below, and that the limit-circle case holds. In other terms, for any solution of the recurrence relation with λ = 0, that is any sequence u_n satisfying

c_n u_{n+1} + b_n u_n + c_{n−1} u_{n−1} = 0,

the terms must be ultimately of one sign, the equation being “nonoscillatory,” and the sequence must be of summable square, or Σ |u_n|² < ∞. For any chosen real λ′ we may then determine a boundary problem by (5.7.4), with a direct orthogonality

∫_{−∞}^{∞} y_j(λ) y_k(λ) dτ(λ) = δ_{jk},
the dual orthogonality (5.2.4) holding with a_n = 1; here τ(λ) is a step function with jump 1/ρ_r at λ_r , and will also be a limiting spectral function. We now set up the expressions, for t ≥ 0,

p(j, k, t) = ∫_{−∞}^{∞} e^{−λt} y_j(λ) y_k(λ) dτ(λ),
which have many interesting properties. In the first place we have immediately that p(j, k, 0) = δ_{jk}. Furthermore, by the recurrence formulas for the y_n(λ),

(d/dt) p(j, k, t) = c_j p(j + 1, k, t) + b_j p(j, k, t) + c_{j−1} p(j − 1, k, t)
                 = c_k p(j, k + 1, t) + b_k p(j, k, t) + c_{k−1} p(j, k − 1, t).
From these, and the facts that p(−1, k, t) = p(j, −1, t) = 0, p(j, j, t) > 0, it may be verified that p(j, k, t) > 0 for t > 0. We mention in passing that any spectral function τ(λ) gives us a solution of the above system of differential equations with the same initial conditions. That these solutions are in fact different, for different τ(λ), may be seen by considering the asymptotic nature of p(0, 0, t) as t → ∞. In fact, we need not confine attention to limiting, or orthogonal, spectral functions. The breakdown in the uniqueness theorem for differential equations with given initial data is due, in part, to the fact that we have a differential equation with an infinity of unknowns. Confining ourselves to spectral functions arising from a boundary condition of the above type, so that the eigenfunctions are orthogonal, we assert the “semigroup” property, that there hold the “Chapman-Kolmogorov equations”

p(j, k, s + t) = Σ_{m=0}^{∞} p(j, m, s) p(m, k, t),

for s ≥ 0, t ≥ 0. This is immediately to be verified, on writing out the p’s as sums and using the orthogonality of the eigenfunctions. Again, we have an infinity of solutions of these relations, but with a more restricted class of spectral functions. With this semigroup property and the non-negativity of the p(j, k, t) for t ≥ 0 we approach the conditions defining a Markov process. Moving further in this direction, and without confining ourselves to the limit-circle case, let us assume that b_0 + c_0 = 0, b_1 + c_1 + c_0 = 0, ...; it may then be verified that Σ_k (d/dt) p(j, k, t) vanishes, so that Σ_k p(j, k, t) is constant, and so unity, the same conclusion holding for Σ_j p(j, k, t). The infinite matrix p(j, k, t) is thus “doubly stochastic” (Bellman, “Matrix Analysis,” pp. 267-268).
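For a finite truncation the semigroup property can be checked directly; in the sketch below (ours, with arbitrary illustrative coefficients) the role of the integral against dτ(λ) is played by the eigen-decomposition J = VΛVᵀ of a symmetric tridiagonal matrix, so that p(j, k, t) = Σ_r e^{−λ_r t} V_{jr} V_{kr}:

```python
import numpy as np

# Symmetric tridiagonal (Jacobi) matrix with illustrative entries.
d = np.array([2.0, 2.5, 3.0, 3.5])     # diagonal
c = np.array([1.0, 1.2, 0.9])          # off-diagonal
J = np.diag(d) + np.diag(c, 1) + np.diag(c, -1)
lam, V = np.linalg.eigh(J)

def p(t):
    # p(j, k, t) = sum_r exp(-lam_r t) V[j, r] V[k, r]
    return V @ np.diag(np.exp(-lam * t)) @ V.T

print(np.allclose(p(0.0), np.eye(4)))        # p(j, k, 0) = delta_jk
print(np.allclose(p(0.7) @ p(0.4), p(1.1)))  # Chapman-Kolmogorov
```

Both checks reduce, exactly as in the text, to the orthonormality of the eigenvectors: VVᵀ = I gives the initial condition, and inserting VᵀV = I between two factors gives the semigroup identity.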
A further possible development in the limit-circle case is to impose a boundary condition at infinity containing a parameter. This is readily interpreted in the case of a vibrating string of finite length, bearing an infinity of particles of finite total mass converging to one end, a finite particle being located at that end, and free to slide transversely. For such a development in the probabilistic context we refer once more to Feller’s paper. Concerning the fact that the integrals defining the p(j, k, t) are not independent of the choice of τ(λ), in the limit-circle case, we reach an apparent contradiction with this fact on expanding the exponential exp(−λt) in a power series and integrating term by term; the resulting integrals of polynomials in λ should be independent of the choice of τ(λ). The resolution of the difficulty is that in the limit-circle case τ(λ) does not tend to its limits as λ → ±∞ sufficiently rapidly to justify the term-by-term integration. See Problem 11 for this chapter, or Titchmarsh’s “Fourier Integrals,” p. 320. These investigations for polynomials with scalar coefficients would seem to admit extension to the case of polynomials with matrix coefficients. See Sections 6.6-8 of this book and, for the probabilistic aspect, R. BELLMAN, On a generalization of classical probability theory. I, Markoff chains, Proc. Nat. Acad. Sci. U.S.A. 39 (1953), 1075-1077.
Another treatment of the subject has been given by J. H. B. KEMPERMAN, An analytical approach to the differential equations of the birth and death processes, Michigan Math. J. 9 (1962), 321-361.
Regarding the convergence of the formal series expansion of the integrals defining p(j, k, t) and the moment problem, see R. FORTET, Calcul des moments d’une fonction de répartition à partir de sa caractéristique, Bull. Sci. Math. 68 (1944), 117-131.
Section 5.8 The distinction between limit-point and limit-circle cases may be carried out in a functional-analytic context. To simplify matters we suppose that a_n = 1 in (5.1.1); this may be achieved by a substitution y_k′ = a_k^{1/2} y_k , b_k′ = a_k^{−1/2} b_k , c_k′ = a_k^{−1/2} c_k a_{k+1}^{−1/2}. This done, we consider the Hilbert space l² of sequences ξ = (ξ_0 , ξ_1 , ...) of complex numbers such that Σ |ξ_n|² < ∞,
the scalar product being given by

(ξ, η) = Σ_{n=0}^{∞} ξ_n η̄_n .
The transformation ξ → ξ′, where the components of ξ′ are given in terms of those of ξ by

ξ′_n = c_n ξ_{n+1} − b_n ξ_n + c_{n−1} ξ_{n−1} ,

where formally we set ξ_{−1} = 0 or c_{−1} = 0 for the case n = 0, then defines a linear operator within l², which will be denoted differently according to the domain, that is to say, the subset of l² to which the transformation is applied. As a minimal domain of definition of this transformation let us take the set l_0 formed by sequences ξ such that only a finite number of the ξ_n are different from zero. We denote by A the linear operator given by ξ′ = Aξ for ξ ∈ l_0 , so that l_0 is the domain D_A of A. It is then easily verified that, if ξ, η ∈ D_A , then

(Aξ, η) = (ξ, Aη),

where the ξ_n , η_n are all zero beyond some point. This means that A is symmetric, or Hermitean. However, A is not self-adjoint. To define the adjoint A* of A we consider the set of η ∈ l² for which there is an η′ ∈ l² such that
(Aξ, η) = (ξ, η′) for all ξ ∈ D_A = l_0 ; the set of such η forms the domain of A* and on it we have η′ = A*η. As we have seen, if η ∈ l_0 we may take η′ = Aη. However, it is not hard to show that η′ exists also for some η ∈ l² not in l_0 ; it is sufficient to require that

Σ | c_n η_{n+1} − b_n η_n + c_{n−1} η_{n−1} |² < ∞.

This means that A* agrees with A on l_0 , but is also defined on a larger set, so that A* is an extension of A; since A* does not coincide with A, the latter is not self-adjoint.
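The symmetry of A on finitely supported sequences amounts to the symmetry of a tridiagonal matrix, which the following check (ours, with arbitrary real coefficients) illustrates:

```python
# (A xi)_n = c_n xi_{n+1} - b_n xi_n + c_{n-1} xi_{n-1}, with the
# convention xi_{-1} = 0 and terms beyond the support dropped.
def apply_A(xi, b, c):
    out = []
    for n in range(len(xi)):
        v = -b[n] * xi[n]
        if n + 1 < len(xi):
            v += c[n] * xi[n + 1]
        if n >= 1:
            v += c[n - 1] * xi[n - 1]
        out.append(v)
    return out

def dot(x, y):
    return sum(u * v for u, v in zip(x, y))

b = [1.0, 2.0, 3.0, 4.0, 5.0]
c = [0.5, 0.6, 0.7, 0.8]
xi  = [1.0, -2.0, 0.5, 0.0, 0.0]
eta = [0.0, 1.0, 3.0, -1.0, 0.0]
# (A xi, eta) = (xi, A eta) for real finitely supported sequences.
print(abs(dot(apply_A(xi, b, c), eta) - dot(xi, apply_A(eta, b, c))) < 1e-12)
```

For complex sequences the same computation goes through with the conjugate-linear scalar product, since the coefficients b_n , c_n are real.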
Consider next the operator B defined in the same way, that is, by ξ′ = Bξ with ξ′ as above, but with the maximal domain of definition as an operator on l² into l²; the domain D_B consists of the set of ξ ∈ l² such that

Σ | c_n ξ_{n+1} − b_n ξ_n + c_{n−1} ξ_{n−1} |² < ∞.
It may happen that B is self-adjoint; this is in fact the limit-point case. The simplest case is that in which the constants b_n , c_n are bounded, uniformly in n. Here the operator B is bounded, and its domain is the whole of l². The condition for self-adjointness coincides with that for symmetry, in this case that

(Bξ, η) = (ξ, Bη) for all ξ, η ∈ l².
On calculation we find that this is equivalent to

c_n (ξ_{n+1} η̄_n − ξ_n η̄_{n+1}) → 0 as n → ∞.

This is true since we assume c_n bounded and since ξ_n → 0, η_n → 0 for ξ, η ∈ l².
It is easily seen that the case in which B is bounded belongs to the limit-point case. Supposing the limit-circle case to hold, and writing y(λ) for the sequence formed by {y_0(λ), y_1(λ), ...}, we should have y(λ) ∈ l², by (5.4.7), and also By(λ) = λy(λ), by (5.1.1-3). This is impossible for large λ, and so the limit-point case holds. If we merely assume the c_n uniformly bounded, the domain D_B is characterized by Σ |b_n ξ_n|² < ∞, Σ |ξ_n|² < ∞, which may or may not be a proper subset of l². However, the self-adjointness condition still holds. More generally, without assuming b_n , c_n bounded, and with D_B as originally defined, let us assume that if ξ, η ∈ D_B , then c_n (ξ_{n+1} η̄_n − ξ_n η̄_{n+1}) → 0 as n → ∞. Then B is self-adjoint, and the limit-point case holds. For if the limit-circle case held, then we might take ξ = y(λ), η = y(μ) and so, by (5.7.3),

c_n { y_{n+1}(λ) ȳ_n(μ) − y_n(λ) ȳ_{n+1}(μ) } → 0 as n → ∞,

which is impossible if λ is complex and μ = λ̄.
Passing to the limit-circle case, we note that in this case the operator A has deficiency indices (1, 1). More precisely, if Aξ − λξ = η, where ξ ∈ l_0 , then (η, y(λ)) = 0. The operator B is no longer a self-adjoint extension of A; however, an infinity of self-adjoint extensions of A may be formed by imposing a boundary condition at infinity, as in (5.7.4). For some fixed real λ′, we define an operator C = C(λ′) in the same way as A and B, whose domain D_C includes D_A and is included in D_B . For ξ ∈ D_C we demand of course that ξ ∈ l², and Cξ ∈ l², and in addition that

c_n { ξ_{n+1} y_n(λ′) − ξ_n y_{n+1}(λ′) } → 0 as n → ∞.

By calculations similar to those of Section 5.6, which we do not reproduce here, it may be shown that this C is self-adjoint.
We take the opportunity to rephrase in these terms the expansion theorem considered in Section 5.3. Still with a_n = 1, (5.3.6) expresses that the sequence u is in l², and (5.3.3) may be written, for real λ, v(λ) = (u, y(λ)). If we define the operator E_α by

E_α u = ∫_{−∞}^{α} (u, y(ν)) y(ν) dτ(ν),

the result (5.3.4), when applicable, asserts that E_∞ u = u. This, together with other properties of E_α , characterizes it as a spectral function of the difference operator considered here, that is to say, a spectral function in the operator sense rather than that used in this book. For further information we refer to Stone’s book, or to such papers as that of B. V. BAZANOV, The construction of generalized resolvents of a symmetric finite difference operator with deficiency (1, 1), Izv. Vysshikh Uchebn. Zavedenii Mat. No. 4(5) (1958), 28-35, V. G. TARNOPOL’SKII, On the self-adjointness of difference operators with operator coefficients, Dopovidi Akad. Nauk Ukrain. RSR (1959), 1189-1192.
Section 5.9 The moment problem, under Hamburger and other conditions, is treated in the book of Shohat and Tamarkin. A correction has recently been noted by D. S. GREENSTEIN, On a criterion for determinate moment sequences, Proc. Amer. Math. Soc. 13 (1962), 120-121.
See also the cited monographs of Krein, Krein and Krasnosel’skii, Krein and Rutman, Wall’s “Continued Fractions,” and Widder’s “Laplace Transform.” The book of Beckenbach and Bellman may also be consulted.
There is work relative to the indeterminacy of the Stieltjes problem (over a semi-infinite interval) in Titchmarsh’s “Fourier Integrals” (pp. 320-322). The Hausdorff problem (over a finite interval) is treated in Bellman’s “Matrix Analysis.” For Hilbert space interpretations, see in addition to the monograph of Krein and Krasnosel’skii the paper of H. L. HAMBURGER, Hermitean transformations of deficiency index (1, 1), Jacobi matrices and undetermined moment problems, Amer. J. Math. 66 (1944), 489-522, and the appendix to the paper of Devinatz [Duke Math. J. 24 (1957), 490-498].

Section 5.10
It is a classical result in the theory of power-moments that the polynomials are dense in L²_τ if the moments given by τ(λ) either constitute a determinate moment-problem, or an indeterminate problem of which τ(λ) is an “extremal solution”; see the book of Shohat and Tamarkin. In either case τ(λ) will be what has been in this book termed a limiting or orthogonal spectral function. In the limit-circle or indeterminate case we may add extra stages to the recurrence relation at the infinite end; the interpretation in terms of a vibrating string was mentioned in the notes to Section 5.7. In a similar way, for an indeterminate moment problem, we may impose additional moments, of powers or of rational functions. For some results and references see D. S. GREENSTEIN, On redundancy of powers and the moment problem, Proc. Amer. Math. Soc. 13 (1962), 625-630.
Section 6.1 Concerning the expression of orthogonal polynomials as determinants see Szegő’s book, Problem 6 for the Jacobi form, and Section 2.2 of the same work for other forms of determinant. See also, regarding the associated polynomials considered here, V. E. SPENCER, Persymmetric and Jacobi determinant expressions for orthogonal polynomials, Duke Math. J. 5 (1939), 333-356.
Section 6.4 Green’s functions for difference equations have been treated by a number of writers. See in particular Fort’s book, and the papers of M. A. ABDEL-MESSIH, A Green’s function analog for ordinary linear difference equations, Proc. Math. Phys. Soc. Egypt No. 22 (1958), 43-51 (1959), A. L. TEPTIN, On the sign of the Green’s function of a linear difference boundary problem of the second order, Doklady Akad. Nauk SSSR 142 (1962), 1038-1039.
Section 6.5 Concerning the separation property of the eigenvalues of a matrix and of a truncated matrix see H. JEFFREYS and B. S. JEFFREYS, “Methods of Mathematical Physics,” 3rd ed., pp. 140-142. Cambridge Univ. Press, London and New York, 1962,
or Bellman’s “Matrix Analysis,” pp. 115-116. The result has been related to a monotonic property of the eigenvalues, discussed in Appendix V, by W. N. EVERITT, Two theorems in matrix theory, Amer. Math. Monthly 69 (1962), 856-859.
For the network interpretation, see for example E. A. GUILLEMIN, “Synthesis of Passive Networks.” Wiley, New York, 1957.
Section 6.6 Just as in the scalar case of Chapters 4 and 5, so with the theory of the matrix recurrence relation (6.6.1) will be associated (i) a moment problem, of Hamburger type, in which for a given sequence of Hermitean matrices μ_n, n = 0, 1, ..., it is required to find a nondecreasing Hermitean matrix function τ(λ) such that

∫_{−∞}^{∞} λⁿ dτ(λ) = μ_n,  n = 0, 1, 2, ...,
(ii) continued fractions, or rather fractional-linear matrix recurrence relations, satisfied by quotients (in the matrix sense) of solutions of the linear recurrence relations, (iii) a limit-circle theory (see Section 9.10), (iv) a theory of matrix-valued analytic functions of a complex variable, whose imaginary part is sign-definite in the upper half-plane. Concerning matrix cases we cite M. G. KREIN, Infinite J-matrices and a matrix moment-problem, Doklady Akad. Nauk SSSR 69 (1949), 125-128,
J. S. MACNERNEY, Investigation concerning positive definite continued fractions, Duke Math. J. 26 (1959), 663-678.
The setting may be extended from matrices to operators on a Hilbert space, concerning which see B. SZ.-NAGY, A moment problem for self-adjoint operators, Acta Math. Acad. Sci. Hungar. 3 (1952), 285-293,
C. DAVIS, A device for studying Hausdorff moments, Trans. Amer. Math. Soc. 87 (1958), 144-158,
J. S. MACNERNEY, Hermitean moment sequences, Trans. Amer. Math. Soc. 103 (1962), 45-81.
The matrix recurrence relation, as also the scalar recurrence relation, may be subsumed under the theory of infinite matrices of Jacobi or similar types. See J. S. MACNERNEY, Half-bounded matrices, J. Indian Math. Soc. 16 (1952), 151-176,
H. NAGEL, Über die aus quadrierbaren Matrizen entstehenden Operatoren, Math. Ann. 112 (1935), 247-285.
Sections 6.9-10 This investigation forms the discrete analog of the multi-parameter Sturm-Liouville problem considered by Ince in his book, pp. 248-251, under the heading of Klein’s oscillation theorem; Klein considered the Lamé equation, with a view to arranging for a solution to have certain numbers of zeros on certain disjoint intervals. The eigenfunction expansion for a pair of simultaneous differential equations was discussed by Hilbert in D. HILBERT, “Grundzüge einer allgemeinen Theorie der Integralgleichungen.” Berlin, 1912, pp. 262-267.
Some general discussion is given by R. D. CARMICHAEL, Boundary value and expansion problems: formulation of various transcendental problems, Amer. J. Math. 43 (1921), 232-270.
In our present case, the products

y_{n,1}(λ^{(1)}) ⋯ y_{n,k}(λ^{(k)})

form collectively a certain type of product of the k vectors

{y_{n,r}(λ^{(r)}), y_{n−1,r}(λ^{(r)}), ...},  r = 1, ..., k,

namely, their tensor or Kronecker product (Bellman, “Matrix Analysis,” Chapter 12). If the recurrence relations are all the same, we may consider also exterior products, more precisely determinants formed out of the y_{n,r}(λ^{(r)}). See the book of Gantmaher-Krein, Chapter 4, Section 3, part 4, where references to Schur are given. Similar topics have been investigated in a recent series of papers by Karlin and McGregor, in which probabilistic interpretations are to be found.
We cite S. KARLIN and J. MCGREGOR, Determinants of orthogonal polynomials, Bull. Amer. Math. Soc. 68 (1962), 204-209,
where numerous properties are announced, and references given to earlier work. Concerning continuous analogs see S. KARLIN, Determinants of eigenfunctions of Sturm-Liouville equations, J. Anal. Math. 9 (1961/62), 365-397.
For some similar developments in the case of the polynomials of these sections see F. V. ATKINSON, Boundary problems leading to orthogonal polynomials in several variables, Bull. Amer. Math. Soc. 69 (1963), 345-351.
Connected with these topics is the subject, not yet completely clarified, of the multi-dimensional moment problem, in which the moments of products of powers of k variables are prescribed. In addition to the text of Shohat and Tamarkin, see A. DEVINATZ, Two parameter moment problems, Duke Math. J. 24 (1957), 481-498,
R. B. ZARHINA, On the two-dimensional problem of moments, Doklady Akad. Nauk SSSR 124 (1959), 743-746.
Section 7.1 The theory of polynomials orthogonal on the unit circle was developed by Szegő, starting from the orthogonality, and is reproduced in his “Orthogonal Polynomials”; the more recent book of Grenander and Szegő contains a simpler treatment of some aspects. The recurrence relation, and its connection with asymptotic properties, receive detailed attention in the book of Geronimus. Here, except in Section 7.8, we have confined attention to the Hermitean case, in which the recurrence relations (7.1.1-2) have coefficients forming a Hermitean matrix, with the effect that the polynomials are orthogonal with respect to a real-valued weight-distribution. That much of the theory can be developed without this restriction has been shown in a recent series of papers by Glen Baxter: Polynomials defined by a difference system, J. Math. Anal. Appl. 2 (1961), 223-263, A convergence equivalence related to polynomials orthogonal on the unit circle, Trans. Amer. Math. Soc. 99 (1961), 471-487, A norm inequality for a “finite-section” Wiener-Hopf equation, Tech. Sci. Note No. 4, Contract No. AF AFOSR-61-4.
Much of the theory of these polynomials is illustrated in two distinct
topics in statistics and probability, namely the theory of time series and that of sums of random variables; the former of these is indicated in the notes to Section 7.5, and we proceed to the latter. For this we shall modify both the metaphor and the construction from Section 10 of the first-cited paper of Baxter. We shall work in terms of a model according to which organisms are distributed at equally spaced locations on an infinite line, the locations being numbered in order ..., -2, -1, 0, 1, 2, ... . Each generation of organisms produces a subsequent generation, an individual at location j producing a fraction α_{k−j} of individuals at location k in the next generation, k = 0, ±1, ±2, ... . The process is to start with some large number N of organisms at time zero. The organisms are considered immortal, reproduce once only, and we are interested in the first place in the total number, Np_{jn} say, which has accumulated at location j after n reproductions. Mathematically, we have here to deal with the recurrence relations

p_{j,n+1} = Σ_k α_{j−k} p_{kn},
with initial conditions p_{j0} = δ_{j0}. We may if we wish visualize the process as one of random walks, in which a particle moves among a set of equally spaced points of a line, with probability α_{k−j} of moving from location j to location k at each stage. In connection with probabilities we have the sometimes awkward implications that they should be non-negative and sum to unity, just as in speaking above of organisms we have glossed over any implication that there should be an integral number of them. So far as the mathematical argument is concerned, the α_r could well be complex. All that the model will do is to suggest ways of manipulating or subdividing multiple sums and products; this is, however, no mean service. For this model we propound three problems, in order of complexity and significance. Assuming that Σ_{−∞}^{∞} | α_r | < 1, it is easily seen that the grand total of individuals produced remains finite, so that we may consider the ultimate number, Nγ_j say, produced at location j by all generations; we shall have γ_j = Σ_{n=0}^{∞} p_{jn}. We ask for an evaluation of γ_j. This problem may be solved by the method of generating functions. Noting that the above recurrence relations are equivalent to

Σ_j p_{j,n+1} e^{jit} = (Σ_j α_j e^{jit})(Σ_j p_{jn} e^{jit}),
it is easily seen that

Σ_j γ_j e^{jit} = (1 − Σ_j α_j e^{jit})^{−1},

which determines the γ_j in terms of the α_j; in having to consider the Fourier series of the reciprocal of a function we begin to make contact with the famous Wiener theorem on this topic. Refining this problem, for any j ≥ 0 we denote by Nγ_j⁺ the total number of individuals accumulating at location j, all of whose ancestors enjoyed locations with non-negative serial numbers; in random walk terms, we are to enumerate all possible finite walks, with attached weights or probabilities, which avoid the locations -1, -2, ... . It turns out that this problem too can be solved by the method of generating functions, though this is much less evident than in the last case. We introduce an operator ( )⁺ on complex Fourier series which replaces by zero the negative powers of exp(it). Writing a for Σ_{−∞}^{∞} α_j e^{jit}, we have in particular that a⁺ = Σ_{0}^{∞} α_j e^{jit}. From the definition of γ_j⁺ it then follows that

Σ_{j=0}^{∞} γ_j⁺ e^{jit} = 1 + a⁺ + (a⁺a)⁺ + ((a⁺a)⁺a)⁺ + ⋯.
It is a most remarkable fact that this last series can be summed explicitly as

exp{−(log(1 − a))⁺},

where we are to expand a^k and log(1 − a) in complex Fourier series and to discard negative powers of exp(it). Concerning this identity see Baxter’s paper An analytic problem whose solution follows from a simple algebraic identity, Pacific J. Math. 10 (1960), 731-742,
where references are given to earlier work of E. S. Andersen, F. Spitzer, and J. G. Wendel. See also J. G. WENDEL, Brief proof of a theorem of Baxter, Math. Scand. 11 (1962), 107-108, F. V. ATKINSON, Some aspects of Baxter’s functional equation, J. Math. Anal. Appl. (to be published).
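The identity invites a numerical check. In the sketch below (the coefficients of a are hypothetical, chosen with Σ|α_j| < 1), Fourier series are held as truncated coefficient arrays, the operator ( )⁺ zeroes the coefficients of negative powers, the left side is generated by iterating f ← 1 + (fa)⁺, and the right side exp{−(log(1 − a))⁺} is built from the power series of log and exp:

```python
import numpy as np

K = 48                                    # Fourier indices -K..K (truncated)
N = 2 * K + 1

def mul(x, y):
    """Multiply two Fourier series given as coefficient arrays on -K..K."""
    return np.convolve(x, y)[K:3 * K + 1]   # truncate back to -K..K

def plus(x):
    """The (+) operator: discard coefficients of negative powers of exp(it)."""
    out = x.copy()
    out[:K] = 0.0
    return out

one = np.zeros(N); one[K] = 1.0           # the constant series 1

# Hypothetical coefficients alpha_j with sum(|alpha_j|) < 1.
a = np.zeros(N)
a[K - 1], a[K], a[K + 1] = 0.25, 0.1, 0.3

# Left side: 1 + a+ + (a+ a)+ + ((a+ a)+ a)+ + ...  via  f <- 1 + (f a)+.
f = one.copy()
for _ in range(200):
    f = one + plus(mul(f, a))

# Right side: exp(-(log(1 - a))+), with log(1 - a) = -sum_k a^k / k.
log1ma = np.zeros(N)
ak = a.copy()
for k in range(1, 200):
    log1ma -= ak / k
    ak = mul(ak, a)
g = -plus(log1ma)
expg = one.copy()                         # exp(g) = sum_k g^k / k!
gk = one.copy()
fact = 1.0
for k in range(1, 60):
    gk = mul(gk, g)
    fact *= k
    expg += gk / fact

print(np.max(np.abs(f - expg)))
```

The two coefficient arrays agree to within the truncation error, which is kept small by the assumption Σ|α_j| < 1.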
With a slight extension, the identity forms the discrete analog of the
Wiener-Hopf integral equation. For a treatment both of this integral equation and of its discrete analog we refer to M. G. KREIN, Integral equations on a half-axis with kernel depending on the difference of the arguments, Uspekhi Mat. Nauk 13, No. 3(83) (1958), 3-120.
Some treatments of the integral equation, in the continuous case, and further references are indicated on pp. 49-51 of Dolph’s survey. Our third problem for this model leads to a pair of recurrence relations equivalent to (7.1.1-2). For m ≥ 0, we denote by Nγ_j^{(m)}, 0 ≤ j ≤ m, the number accumulating at location j whose ancestors are confined to the locations 0, ..., m; similarly, for m < 0, m ≤ j ≤ 0, we denote in the same way the number of those with ancestors confined to m, m + 1, ..., -1, 0. Once more we define generating functions, writing now λ in place of exp(it). The first of these recurrence relations derives from the observation that family trees (or random walks) which contribute to g_{m+1}(λ), but not to g_m(λ), are those which reach location m + 1, do not move to the right of it, and do not move more than m + 1 places to the left of it. The second relation is proved similarly. We may reduce (7.1.1-2) to the above form on a suitable substitution.
This is the form used by Baxter in the first-cited paper. The relation between the second and third problems for our model is that of limiting approximation. It is intuitive that for any fixed j > 0, the probability of a random walk from 0 terminating at j and avoiding negative locations yet not going beyond location m will tend to a limit as m → ∞, the same probability as if the restriction about not passing to the right of m were discarded. This suggests the asymptotic formulas

γ_j^{(m)} → γ_j⁺,   g_m(λ) → exp{−(log(1 − a))⁺},

which may be justified by easy estimations.
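The convergence γ_j^{(m)} → γ_j⁺ is easy to observe numerically. The sketch below (hypothetical weights α_{−1} = 0.2, α_{+1} = 0.3, so Σ|α_r| < 1) accumulates the weighted walks confined to the locations 0, ..., m by iterating the reproduction step inside the barrier, and compares a small m against a large one standing in for the unrestricted half-line quantity γ_j⁺:

```python
import numpy as np

alpha = {-1: 0.2, +1: 0.3}        # hypothetical step weights, sum |alpha_r| < 1

def gamma_confined(m, generations=400):
    """gamma_j^{(m)}: total weight accumulating at locations j = 0..m over
    all generations, every ancestor being confined to 0..m."""
    p = np.zeros(m + 1)
    p[0] = 1.0                    # the initial individual sits at location 0
    gamma = p.copy()
    for _ in range(generations):
        q = np.zeros(m + 1)
        for step, w in alpha.items():
            for j in range(m + 1):
                k = j - step      # parent at k contributes weight w to j = k + step
                if 0 <= k <= m:
                    q[j] += w * p[k]
        p = q
        gamma += p
    return gamma

g_small = gamma_confined(10)
g_large = gamma_confined(60)      # stand-in for the limit gamma_j^+
print(g_small[:3], g_large[:3])
```

With all weights positive, enlarging m only admits more walks, so the entries increase monotonically toward the half-line values, in line with the asymptotic formulas above.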
With some further discussion, this argument yields an asymptotic expression for the polynomials in terms of the orthogonality; for the connection between the latter and the function here denoted by a we refer to Baxter’s first-cited paper, or to Section 7.7, where we must make m → ∞. An alternative model may be constructed in terms of the theory of queues. Passing directly to the second of the above three problems, it would be a question of the probability of the nth customer waiting j minutes to be served, time being discrete; the waiting time is of course non-negative. The waiting time of the (n + 1)th customer clearly exceeds that of the nth customer by the excess of the latter’s service time over the inter-arrival time; this excess constitutes for present purposes a random walk. To derive the last of the three problems, we would imagine a situation in which server and customers are alike impatient, the service being terminated if either has to wait more than m minutes. However, we shall omit the details.
Section 7.5 An important feature of the trigonometric moment problem (7.5.4) is that its solution τ(θ) is not only essentially unique, but can also be exhibited explicitly. We consider the limit, as m → ∞, of a sequence of approximations σ_m(θ); here the prime on the first sum indicates omission of the term r = 0. From the first expression for σ_m(θ) we see that σ_m(0) = 0, σ_m(2π) = μ_0. From the integral expression we see that σ_m(θ) is nondecreasing, by a basic assumption concerning the μ_r, so that σ_m(θ) tends to a limit as m → ∞, at least for some m-sequence. This formula was found by G. HERGLOTZ, Über Potenzreihen mit positivem reellem Teil im Einheitskreis, Ber. Verh. Sächs. Akad. Wiss. Leipzig 63 (1911), 501-511.
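One standard way of writing such a σ_m(θ), consistent with the properties just listed, uses Fejér means; this reconstruction is offered only as an illustration, and assumes the convention μ_r = ∫_0^{2π} e^{−irt} dτ(t):

```latex
\sigma_m(\theta)
  = \frac{1}{2\pi}\Bigl[\mu_0\,\theta
      + {\sum_{0<|r|\le m}}'\Bigl(1-\frac{|r|}{m+1}\Bigr)\mu_r\,
        \frac{e^{ir\theta}-1}{ir}\Bigr]
  = \frac{1}{2\pi}\int_0^{\theta}
      \sum_{|r|\le m}\Bigl(1-\frac{|r|}{m+1}\Bigr)\mu_r\,e^{irt}\,dt.
```

With this choice σ_m(0) = 0 and σ_m(2π) = μ_0, the primed terms vanishing at θ = 2π, while the integrand is a Fejér average of the nonnegative measure dτ, so that σ_m is nondecreasing.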
The connection between this moment problem and a certain power moment problem is explored by P. G. ROONEY, On the Hausdorff and trigonometric moment problems, Canad. J. Math. 13 (1961), 454-461,
in relation both to nondecreasing distribution functions and to certain other classes. The above formula for the spectral density is of considerable importance in the spectral analysis of time series; together with the random walk situation outlined in the notes to Section 7.1, this forms the second of two extensive realizations of the mathematical theory of orthogonal polynomials on the unit circle. For this model we suppose given a sequence x_t of elements of a Hilbert space H, where the index t runs through all integers, positive, negative, and zero, and is considered as indicating a sequence of discrete instants of time. It may be appropriate in special cases to consider the x_t as numerically valued functions defined on some set Ω; what is essential for present purposes is, however, only that there be defined a scalar product (x_s, x_t) of two members of our sequence, and likewise for any two linear combinations of the x_t. The x_t may form random variables, the sequence of them a stochastic process. The relevant situation for us now is the case of a stationary process, in which the scalar product (x_s, x_t), an expectation or covariance, is dependent only on s − t; we write (x_s, x_t) = ρ_{s−t}. The Toeplitz form Σ_j Σ_k ξ_j ξ̄_k ρ_{j−k} is certainly non-negative, representing as it does the squared norm of Σ ξ_j x_j in the Hilbert space, to be visualized as the mean-square expectation of this linear combination of random variables. We shall, to exclude degenerate cases, assume the Toeplitz form to be strictly positive, for ξ_j not all zero. These conditions on the ρ_r are precisely those which ensure the unique solubility of the trigonometric moment-problem, with τ(θ) having an infinity of points of increase. The two interpretations of the ρ_r tell us that
We may view this as an isometric isomorphism between two Hilbert spaces, in that, for example, with any finite linear combination of the random variables we may associate a trigonometric polynomial according to the diagram
these having the same squared norms, in respective spaces, namely,
We mention two detailed aspects of this correspondence between combinations of random variables and trigonometric polynomials. For the time series we may define a shift operator U by Ux_t = x_{t+1}, and generally by U Σ a_j x_j = Σ a_j x_{j+1}. The stationarity condition implies that this operator is unitary. The use of the spectral analysis of this unitary operator to obtain a spectral representation of the process is described in Chapter 1 of E. J. Hannan’s book. Correspondingly, for a trigonometric polynomial f(θ) we may define Uf(θ) = e^{iθ} f(θ), which again is unitary in that it is one-to-one and does not alter the squared norm ∫ |f(θ)|² dτ(θ), and consider its spectral resolution. The corresponding process for polynomials on the real axis was mentioned in the notes for Section 1.10. Our second parallel between combinations of random variables and trigonometric polynomials relates to the definition of the orthogonal polynomials considered in this chapter. The definition (7.4.1-3) of these polynomials by orthogonality may be replaced by an extremal definition. The coefficients a_{n,ν} are to be determined so that a certain integral over (0, 2π) attains its minimum value. This is the same as the problem of minimizing the corresponding mean-square expression in the random variables, and indeed, by the stationary property, of minimizing the same expression with the indices translated by any integral m, all of these expressions having the same expression in terms of the ρ_r. This we may interpret as a problem of prediction, in which one of the random variables is to be approximated to as closely as possible, in a mean-square average sense, by a linear combination of the n variables which precede it in the series. Of particular interest is the behavior of the above minimum as n → ∞. In addition to the book of Grenander and Szegő, and that of Ahiezer (Appendix B), we cite G. BAXTER, An asymptotic result for the finite predictor, Math. Scand. 10 (1962), 137-144,
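In concrete terms this prediction problem is linear: requiring the error to be orthogonal to each of the n preceding variables gives the Toeplitz system Σ_ν c_ν ρ_{μ−ν} = ρ_μ, μ = 1, ..., n, for the coefficients of the predictor Σ_ν c_ν x_{t−ν}. A minimal sketch, with a hypothetical covariance ρ_r = φ^{|r|} (for which only the immediately preceding variable should carry weight):

```python
import numpy as np

n = 6
phi = 0.7                              # hypothetical covariance rho_r = phi^|r|
# Toeplitz matrix of covariances rho_{mu - nu}, mu, nu = 1..n.
rho = phi ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
rhs = phi ** np.arange(1, n + 1)       # rho_1, ..., rho_n

c = np.linalg.solve(rho, rhs)          # predictor coefficients c_1, ..., c_n

# Mean-square prediction error: rho_0 - sum_mu c_mu rho_mu.
mse = 1.0 - c @ rhs
print(c, mse)
```

For this covariance the solution concentrates on c_1 = φ, and the minimum mean-square error is 1 − φ², independent of n; for general ρ_r the decay of this minimum as n → ∞ is exactly the question treated in the papers cited above.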
where the effect of the smoothness of the weight-function is considered, and the book of Hannan, where other types of prediction are mentioned. We may wish to predict the present from a knowledge of both past and future, or again to estimate in some way the independence of the future from the past, with or without the present. For an interesting investigation of such questions, in terms of the angle between two linear manifolds in a Hilbert space, see H. HELSON and G. SZEGŐ, A problem in prediction theory, Ann. Mat. Pura Appl. (4) 51 (1960), 107-138,
and, for other extremal problems, Ju. A. ROZANOV, Interpolation of stationary processes with discrete time, Doklady Akad. Nauk SSSR 130 (1960), 730-733.
In an important generalization we consider matrix-valued inner products, so that (x_s, x_t) = ρ_{s−t} is a square matrix of some fixed order. We are thus led to a matrix trigonometric moment problem and to other developments parallel to those of Sections 6.6-8. Further references: N. WIENER and P. MASANI, The prediction theory of multivariate stochastic processes. Part I. The regularity condition, Acta Math. 98 (1957), 111-150,
... . Part II. The linear predictor, Acta Math. 99 (1958), 93-137, P. MASANI, Shift invariant spaces and prediction theory, Acta Math. 107 (1962), 275-290,
T. BALOGH, The solution of a minimum problem, Publ. Math. (Debrecen) 8 (1961), 131-142, T. BALOGH, Matrix-valued stochastical processes, ibid. 368-378, T. BALOGH, The estimation of the mean-value of a matrix-valued discrete stationary stochastic process, ibid. 9 (1962), 65-74, H. HELSON and D. LOWDENSLAGER, Prediction theory and Fourier series in several variables, Acta Math. 99 (1958), 165-202, M. ROSENBLATT, The multi-dimensional prediction problem, Proc. Nat. Acad. Sci. U.S.A. 43 (1957), 989-992.
For a general account see E. PARZEN, An approach to time series analysis, Ann. of Math. Stat. 32 (1961), 951-989.
Additional references on the factorization problem in the matrix context are given in the notes to Section 7.8.
Section 7.6 What we are throughout terming the “characteristic function” has in the present context been termed the “Carathéodory function” by J. (= YA. L.) GERONIMUS, On polynomials orthogonal on the unit circle, on trigonometric moment problems and on allied Carathéodory and Schur functions, Mat. Sb. (N. S.) 15 (57) (1944), 99-130.
Section 7.7 For similar results in the non-Hermitean case see Baxter, J. Math. Anal. Appl. 2 (1961), 231; for the Hermitean case see the book of Geronimus.
Sections 7.8-9 See the references to Szegő, Grenander, Geronimus, and Baxter in the notes to Section 7.1. We have to deal here with an aspect of the “first inverse spectral problem,” in that from the spectral function we wish to determine not so much the recurrence relations, but rather the asymptotic character of their solutions. In Sturm-Liouville theory over (0, ∞) we also meet the phenomenon that the asymptotic amplitude of the solutions, for large independent variable, is linked to the limiting spectral function. See Chapter 8, Problem 9 or Chapter 12, Problem 11. A detailed study of asymptotic properties is also given in YA. L. GERONIMUS, On asymptotic properties of polynomials ..., Izv. Akad. Nauk SSSR, Ser. Mat. 14 (1950), 123-144, or Amer. Math. Soc. Transl. No. 95 (1953).
For recent work see T. FREI (FREY), On the asymptotic behavior of orthogonal sequences of polynomials, Mat. Sb. 49 (91) (1959), 133-180,
G. FREUD, Eine Bemerkung zur asymptotischen Darstellung von Orthogonalpolynomen, Math. Scand. 5 (1957), 285-290,
A. L. KUZ’MINA, On the asymptotic representation of polynomials orthogonal on the unit circle, Doklady Akad. Nauk SSSR 107 (1956), 793-795, G. SZEGŐ, Recent advances and open questions on the asymptotic expansions of orthogonal polynomials, J. Soc. Ind. Appl. Math. 7 (1959), 311-315.
In carrying out an extension of the arguments of these sections to polynomials with matrix coefficients, and argument either on the unit circle or on the real axis, the main difficulty lies in an extension of the factorization of complex Fourier series, or functions on the unit circle,
into functions regular inside and outside the circle; the method of taking logarithms breaks down. This has attracted much attention recently. In addition to the papers cited in the notes for Section 7.5, and in the context of absolutely convergent Fourier series with matrix coefficients, see I. C. GOHBERG and M. G. KREIN, Systems of integral equations on a half-axis with kernels depending on the difference of the arguments, Uspekhi Mat. Nauk 13 (1958), 3-72 (Section 14),
and, for the extension to operator-valued functions, with different restrictions, A. DEVINATZ, The factorization of operator-valued functions, Ann. of Math. 73 (1961), 458-495.
I. C. GOHBERG, A general theorem concerning the factorization of matrix functions in normed rings and its application, Doklady Akad. Nauk SSSR 146 (1962), 284-287.

Section 7.10

See M. G. KREIN, Continuous analogs of propositions on polynomials orthogonal on the unit circle, Doklady Akad. Nauk SSSR 105 (1955), 637-640, N. I. AHIEZER, A space analog of polynomials orthogonal on a circular arc, Doklady Akad. Nauk SSSR 141 (1961), 769-772.
Section 8.1 The original papers of Sturm and Liouville, which still repay study, are in Liouville’s J. Math. Pures Appl. 1 (1836), as follows: C. STURM, Sur les équations différentielles du second ordre, pp. 106-186, Sur une classe d’équations à différences partielles, pp. 373-444,
J. LIOUVILLE, Sur le développement des fonctions..., pp. 253-265, Démonstration d’un théorème dû à M. Sturm, pp. 269-277.
In ibid. 2 (1837), 16-35, Liouville takes up asymptotic formulas with a view to the convergence of the expansion. A contemporary worker in the field was Poisson. Accounts of Sturm-Liouville theory will be found in the cited books of Bieberbach, Birkhoff and Rota, Coddington and Levinson, Courant and Hilbert, Kamke, Naimark, and Titchmarsh. While in most presentations the coefficient functions are taken to be continuous, or piecewise continuous, there is little difficulty in allowing them to be Lebesgue
integrable, y′ being then absolutely continuous (see the book of Coddington and Levinson). In the extension of the notion of a second-order differential equation which is natural for Sturm-Liouville theory, y is everywhere continuous, indeed y′ exists almost everywhere, but y′ may have a denumerable number of discontinuities, proportional to y. Among workers in this direction should be mentioned Feller, Krein, and Sz.-Nagy. The feature of a series of papers by Feller is the preservation of the differential formalism, the second derivative being generalized to a mixed derivative. Given any nondecreasing function u(x), we may define a derivative D_u f of a general function f(x) with respect to u(x); this might be the limit, if it exists, as h → 0 of {f(x + h) − f(x)}/{u(x + h) − u(x)}. The ordinary derivative f′ is then of course D_x f. Retaining this derivative, we may form a second derivative D_m D_x f with respect to a nondecreasing function m(x) as the limit as h → 0 of {f′(x + h) − f′(x)}/{m(x + h) − m(x)}. This enables one to write the equation (0.8.11) of the vibrating string in differential form as D_m D_x y = −λy. For a detailed study of the resulting differential operators we refer to
W. FELLER, Generalized second-order differential operators and their lateral conditions, Illinois J. Math. 1 (1957), 459-504,
a briefer account being On generalized Sturm-Liouville operators, Proc. Conf. Differential Equations, Univ. of Maryland, 1955.
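The mixed derivative can be approximated directly from the difference quotients in its definition. A small numerical sketch (with hypothetical smooth choices u(x) = x and m(x) = 2x, for which D_m D_x f = f″/2):

```python
import math

def d_wrt(f, u, x, h=1e-3):
    """Derivative of f with respect to the nondecreasing function u at x,
    approximated by a symmetric difference quotient from the definition."""
    return (f(x + h) - f(x - h)) / (u(x + h) - u(x - h))

u = lambda x: x            # D_x: the ordinary derivative
m = lambda x: 2.0 * x      # a hypothetical mass distribution, m' = 2

f = math.sin
x0 = 0.3

fp = lambda x: d_wrt(f, u, x)          # D_x f, computed numerically
val = d_wrt(fp, m, x0)                 # D_m D_x f at x0
print(val, -0.5 * math.sin(x0))        # expect f''(x0)/m'(x0) = -sin(x0)/2
```

Replacing m by a step function reproduces, in the same way, a string whose mass is concentrated at isolated points, the case in which y′ acquires the jumps described above.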
The spectral resolution is considered by H. P. MCKEAN, Jr., Elementary solutions for certain parabolic equations, Trans. Amer. Math. Soc. 82 (1956), 519-548,
the treatment being oriented to Laplace transform techniques and the Chapman-Kolmogorov equations, and also applying to the case when the basic interval is singular at both ends. Another side of Feller’s work relates to the intrinsic characterization of the mixed derivative D_m D_x f(x) by two features, firstly its local character, in that the value of D_m D_x f at x depends only on the values of f in any neighborhood of x, and secondly certain strong or weak minimum properties, for example that D_m D_x f ≥ 0 at a local minimum of f. For exact statements see W. FELLER, On the intrinsic form for second order differential operators, Illinois J. Math. 2 (1958), 1-18, Differential operators with the positive maximum property, ibid. 3 (1959), 182-186.
A third aspect is the relation of these two general properties, enjoyed in particular by the mixed derivative, to general categories of diffusion processes. See W. FELLER, The general diffusion operator and positivity preserving semi-groups in one dimension, Ann. of Math. 60 (1954), 417-436.
Some differential equation cases are considered by P. MANDL, Spectral theory of semi-groups connected with diffusion processes and its application, Czech. J. Math. 4 (1961), 559-569.
Another school of investigation, using the formalism of integral equations (see Chapters 11 and 12), is associated with M. G. Krein, another contributor being I. S. Kac. Of Krein’s many papers we cite particularly On a generalization of investigations of Stieltjes, Doklady Akad. Nauk SSSR 87 (1952), 881-884.
Aspects relating to the spectrum are developed in I. S. KAC, On the existence of spectral functions of certain singular differential systems of the second order, Doklady Akad. Nauk SSSR 106 (1956), 15-18, On the behavior of spectral functions of differential systems of the second order, ibid. 183-186,
more references being given in the same author’s Growth of spectral functions of differential systems of the second order, Izv. Akad. Nauk SSSR, Ser. Mat. 23 (1959), 257-274.
In addition to Volterra integral equations, it is also possible to consider Fredholm integral equations with Stieltjes weight distributions as generalizing Sturm-Liouville theory. This approach is developed in the book of Gantmaher and Krein. See also M. G. KREIN, On the Sturm-Liouville problem in the interval (0, ∞) and on a class of integral equations, Doklady Akad. Nauk SSSR 73 (1950), 1125-1128,
or, for a simple illustration, Bellman’s “Matrix Analysis,” p. 144, Exercises 1 and 2. For the presentation of Sz.-Nagy we refer to B. SZŐKEFALVI-NAGY, Vibrations d’une corde non homogène, Bull. Soc. Math. France 75 (1947), 193-208,
or to the book of Riesz and Sz.-Nagy, where the spectral resolution is derived by methods of functional analysis. We refer under the notes for Sections 0.8 and 8.7 to investigations of the slightly more special situations of classical Sturm-Liouville theory
modified by the presence of the parameter in the boundary conditions, or by a finite number of interface conditions; under the notes for Section 11.8 we refer to work on more general systems of higher dimensionality. A number of new directions have been opened up in Sturm-Liouville theory under classical continuity conditions. A survey is given by B. M. LEVITAN and I. S. SARGSYAN, Some problems in the theory of the Sturm-Liouville equation, Uspekhi Mat. Nauk 15, No. 1(91) (1960), 3-98.
That recurrence relations may be imbedded in the theory of differential equations by taking the coefficients to be piecewise constant is explained in a matrix context by W. T. REID, Generalized linear differential systems, J. Math. Mech. 8 (1959), 705-726 (in particular pp. 721-722),
where reference is made to the dissertation of V. C. Harris.
Section 8.3 Concerning the exponent of convergence of the zeros of an entire function, relevant to the proof of (8.3.7), see for example Titchmarsh, "Theory of Functions," Section 8.22. For more special situations than those considered here, we may obtain more information on the distribution of the eigenvalues either by classical methods (as in the text of Ince) or from more incisive results from the theory of functions; see for example the result of Levinson discussed in P. KOOSIS, Nouvelle démonstration d'un théorème de Levinson ..., Bull. Soc. Math. France 86 (1958), 27-40.
For the asymptotic form of the eigenvalues of a vibrating string with arbitrary mass-distribution see M. G. KREĬN, Determination of the density of a symmetrical inhomogeneous string from its spectrum of frequencies, Doklady Akad. Nauk SSSR 76 (1951), 345-348, and On inverse problems for an inhomogeneous string, ibid. 82 (1952), 669-672.
For cases in which the eigenvalues increase more rapidly than the classical estimate O(n^2), see H. P. MCKEAN and D. B. RAY, Spectral distribution of a differential operator, Duke Math. J. 29 (1962), 281-292.
Since the eigenvalues are the zeros of the left-hand side of (8.3.4), an entire function of order less than 1, we may by factorizing this function
[cf. (12.3.27)] obtain explicit formulas for the sums of inverse powers of the eigenvalues. See R. BELLMAN, Characteristic values of Sturm-Liouville problems, Illinois J. Math. 2 (1958), 577-585.
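As a hedged sketch of the factorization argument (writing f(λ) for the left-hand side of (8.3.4), and assuming for simplicity f(0) ≠ 0 with nonzero zeros λ₁, λ₂, ...): since f is entire of order less than 1, its Hadamard factorization needs no exponential factors, and the sums of inverse powers of the eigenvalues appear as Taylor coefficients of the logarithmic derivative.

```latex
% f entire of order < 1, f(0) \neq 0, zeros \lambda_1, \lambda_2, \dots
f(\lambda) = f(0) \prod_{n=1}^{\infty} \Bigl( 1 - \frac{\lambda}{\lambda_n} \Bigr),
\qquad
\frac{f'(\lambda)}{f(\lambda)}
  = \sum_{n=1}^{\infty} \frac{-1/\lambda_n}{1 - \lambda/\lambda_n}
  = - \sum_{k=1}^{\infty} s_k \,\lambda^{k-1},
\qquad
s_k := \sum_{n=1}^{\infty} \lambda_n^{-k}.
```

In particular s₁ = −f′(0)/f(0), and the higher sums s_k follow by differentiating f′/f at λ = 0.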
For trace-formulas involving sums of eigenvalues see L. A. DIKIĬ, Trace formulas for Sturm-Liouville differential operators, Uspekhi Mat. Nauk 13, No. 3(81) (1958), 111-143.
Section 8.4 The use of the polar coordinate method to establish the Sturmian oscillatory properties seems due to H. Prüfer (see the notes for Section 8.6). The key fact that the polar angle, as defined in this particular version of the method, is a monotonic function of the spectral parameter was also noticed by W. M. WHYBURN, Existence and oscillation theorems for non-linear differential systems of the second order, Trans. Amer. Math. Soc. 30 (1928), 848-854 (p. 854), and A non-linear boundary value problem for second order differential systems, Pacific J. Math. 5 (1955), 147-160,
where nonlinear systems are also treated. A second and distinct version of the polar coordinate method belongs in the area of asymptotic theory, either for large parameter values or for large values of the independent variable; we have used this method, in a somewhat crude form, at the end of this section. A similar device is used at the end of Section 10.5. In its more precise form, this other version of the polar coordinate method applies to the second-order equation y'' + f(x)y = 0, where f(x) is smooth and positive, and may depend on a spectral parameter. The method depends on an investigation of the differential equation of the first order satisfied by θ(x) as defined by tan θ = -y'/(y f^{1/2}); the success of the method depends, roughly speaking, on the variation in log f being small compared to the integral of f^{1/2}. For applications of this method see
F. V. ATKINSON, On second-order linear oscillators, Rev. Univ. Tucumán, Ser. A, Mat. y Fís. Teór. 8 (1951), 71-87,
J. H. BARRETT, Behavior of solutions of second order self-adjoint differential equations, Proc. Amer. Math. Soc. 6 (1955), 247-251,
J. B. MCLEOD, On certain integral formulae, Proc. London Math. Soc. (3) 11 (1961), 134-138, and
The distribution of the eigenvalues for the hydrogen atom and similar cases, ibid. 139-158,
H. HOCHSTADT, Asymptotic estimates for the Sturm-Liouville spectrum, Comm. Pure Appl. Math. 14 (1961), 749-764,
N. WAX, On a phase method for treating Sturm-Liouville equations and problems, J. Soc. Ind. Appl. Math. 9 (1961), 215-232,
N. S. ROSENFELD, The eigenvalues of a class of singular differential operators, Comm. Pure Appl. Math. 13 (1960), 395-405.
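As a sketch (in notation of ours), under the smoothness and positivity assumptions on f just described, the first-order equation satisfied by the polar angle can be written down explicitly; the form below follows by differentiating the defining relation tan θ = -y'/(y f^{1/2}) and using y'' + f y = 0, and shows why the variation of log f is the relevant quantity.

```latex
% \tan\theta = -\,y'/(y f^{1/2}), \qquad y'' + f(x)\,y = 0, \quad f > 0 \text{ smooth}
\theta'(x) \;=\; f(x)^{1/2} \;-\; \tfrac{1}{4}\,\frac{f'(x)}{f(x)}\,\sin 2\theta(x)
         \;=\; f^{1/2} \;-\; \tfrac{1}{4}\,(\log f)'\,\sin 2\theta .
```

When the total variation of log f is small compared with ∫ f^{1/2} dx, the first term dominates, and θ increases essentially like ∫ f^{1/2} dx.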
For ramifications of the method reaching into optics, statistical mechanics and quantum theory, particularly the so-called WKB method, see P. FRANK and R. VON MISES, "Die Differential- und Integralgleichungen der Physik," Vol. II. Braunschweig, 1935, pp. 82, 119, 986.
For equations of the form y'' + (k^2 + g(x)) y = 0, where g(x) need not be smooth but is in some sense small, one may modify the substitution to tan θ = -y'/(ky); see Problems 5-7 for Chapter 12. Returning to separation, comparison and oscillation theorems, we refer to books such as that of Ince for a treatment of these topics not involving the polar coordinate method; in this work nonlinear dependence on the parameter is also considered, with a view to the multiparameter application. For results for more general types of side condition see W. M. WHYBURN, Second-order differential systems with integral and k-point boundary conditions, Trans. Amer. Math. Soc. 30 (1928), 630-640.
Section 8.5 Regarding the Čebyšev property in general see the Notes to Chapter 4, Problems 6-10, and the books of Ahiezer and Gantmaher-Krein. For the fact that the eigenfunctions of a Sturm-Liouville problem have this property, in its fullest statement concerning linear combinations of u_r(x), ..., u_s(x), 0 < r < s, see
S. BOCHNER, Sturm-Liouville and heat equations whose eigenfunctions are ultraspherical polynomials or associated Bessel functions, Proc. Conf. Differential Equations, Univ. of Maryland, 1956, pp. 23-48,
S. KARLIN, Total positivity, absorption probabilities and applications, Bull. Amer. Math. Soc. 67 (1961), 105-108,
S. KARLIN and J. MCGREGOR, Classical diffusion processes and total positivity, J. Math. Anal. Appl. 1 (1960), 163-183.
S. KARLIN and J. MCGREGOR, Total positivity of fundamental solutions of parabolic equations, Proc. Amer. Math. Soc. 13 (1962), 136-139.
The Čebyšev property for Sturm-Liouville eigenfunctions was rediscovered by O. D. Kellogg, who also observed that the property was associated with a class of integral equations; see the following of his papers: Orthogonal functions arising from integral equations, Amer. J. Math. 40 (1918), 145-154, and Interpolation properties of orthogonal sets of solutions of differential equations, ibid. 225-234.
A full theory of "Kellogg kernels" was later developed by Krein and Gantmaher in their book. For the particular case of nth order differential equations see M. G. KREĬN, Oscillation theorems for ordinary linear differential operators of arbitrary order, Doklady Akad. Nauk SSSR 25 (1939), 717-720.
The thought occurred to Kellogg, as to Liouville, that the Čebyšev property should yield a proof of the eigenfunction expansion, the following form of argument being due to Liouville. Let the remainder in the eigenfunction expansion, if not identically zero, change sign at a finite number of points. Choose a linear combination of eigenfunctions to change sign at just these points. This cannot be orthogonal to the remainder since it changes sign with it. On the other hand, the remainder must, by the Fourier construction, be orthogonal to all eigenfunctions, so that we have a contradiction, and the remainder must vanish identically. See, for arguments based on a more general property, the papers of H. Mollerup [Math. Ann. 66 (1909), 511-516; corrected in ibid. 71 (1911), 600]. For recent work in the general area of Čebyšev systems, Sturm-Liouville eigenfunctions and variation diminishing transformations see B. SCARPELLINI, Approximation von Funktionen durch Linearkombinationen von Eigenfunktionen Sturm-Liouvillescher Differentialgleichungen, Comment. Math. Helv. 36 (1962), 265-305, I. I. HIRSCHMAN, Jr., Variation diminishing transformations and Sturm-Liouville systems, ibid. 36 (1961), 214-233, I. HIRSCHMAN and D. V. WIDDER, "The Convolution Transform." Princeton Univ. Press, Princeton, New Jersey, 1956.
R. BELLMAN, A note on variation diminishing properties of Green's functions, Boll. Un. Mat. Ital. (3) 16 (1961), 164-166,
J. C. MAIRHUBER, I. J. SCHOENBERG, and R. E. WILLIAMSON, On variation diminishing transformations of the unit circle, Rend. Circ. Mat. Palermo (2) 8 (1959), 241-270,
and the monograph of Beckenbach and Bellman.
Section 8.6 The proof of the completeness of the eigenfunctions presented here was discovered by H. PRÜFER, Neue Herleitung der Sturm-Liouvilleschen Reihenentwicklung stetiger Funktionen, Math. Ann. 95 (1926), 499-518,
and may be consulted in the texts of Bieberbach and Kamke. Prüfer's argument proceeds via the closure of the eigenfunctions, proving that a function orthogonal to all the eigenfunctions and satisfying certain general conditions must vanish identically; this is supplemented by a proof, distinct from that used here, that the eigenfunction expansion converges. Prüfer also notes (pp. 514-518) that his argument extends to further cases, in particular that of a parameter in the boundary conditions, and certain "singular" cases in which the spectrum is nevertheless discrete. This proof, like that used in Chapter 9 and unlike that used in Chapter 12, has the advantage of not depending on the asymptotic behavior of the eigenfunctions, and so not necessitating much in the way of smoothness conditions on the coefficients in the equation.
Section 8.7 The formal character of the expansion considered here, and of similar higher order equations, is considered by L. C. BARRETT and C. R. WYLIE, Jr., Orthogonality conditions and lumped parameters, Rev. Univ. Tucumán, Ser. A, Mat. y Fís. Teór. 12, Nos. 1 and 2 (1959), 73-80.
See also R. COURANT and D. HILBERT, "Methoden der mathematischen Physik," Vol. I, Chapter VI, Section 1,
and, for a full-scale treatment, the work of W. C. Sangren cited in the notes for Section 11.8.
Section 8.10 For recent work on the spectral function see I. S. KAC, Two general theorems on the asymptotic behavior of spectral functions of differential systems of the second order, Izv. Akad. Nauk SSSR, Ser. Mat. 26 (1962), 53-78.
For work on the spectral kernel, which involves sums of products of eigenfunctions with respect to the spectral measure, see the survey of Levitan and Sargsyan cited in the notes for Section 8.1, or Levitan's appendices to Titchmarsh.
Section 8.13 The theory of the limit-circle, limit-point alternative in the case of the second-order differential equation goes back to H. WEYL, Über gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen Entwicklungen willkürlicher Funktionen, Math. Ann. 68 (1910), 220-269,
and will be found in the books of Coddington and Levinson and Titchmarsh. For the extension to higher-order equations see K. KODAIRA, On ordinary differential equations of any even order and the corresponding eigenfunction expansion, Amer. J. Math. 72 (1950), 502-544,
E. A. CODDINGTON and R. C. GILBERT, Generalized resolvents of ordinary differential operators, Trans. Amer. Math. Soc. 93 (1959), 216-241.
See also Sections 9.8 and 9.10. An extension to second-order differential equations with complex coefficients has been made by A. R. SIMS, Secondary conditions for linear differential operators of the second order, J. Math. Mech. 6 (1957), 247-285,
and is briefly described on pp. 22-23 of Dolph’s survey.
Section 9.1 Among pioneer papers on the boundary problem for the first-order vector-matrix differential system should be mentioned that of G. D. BIRKHOFF and R. E. LANGER, The boundary problem and developments associated with a system of ordinary linear differential equations of the first order (see Birkhoff's "Collected Works," Vol. I, pp. 347-424),
where asymptotic behavior in the complex λ-plane is utilized, and the work of G. A. BLISS, A boundary value problem for a system of ordinary differential equations of the first order, Trans. Amer. Math. Soc. 28 (1926), 561-584, and Definitely self-adjoint boundary value problems, ibid. 44 (1938), 413-428,
where an integral equation approach is used.
For recent work and further references see W. T. REID, A class of two-point boundary problems, Illinois J. Math. 2 (1958), 434-453,
F. BRAUER, Spectral theory of linear systems of differential equations, Pacific J. Math. 10 (1960), 17-34,
F. BRAUER, Spectral theory for the differential equation Lu = λMu, Canad. J. Math. 10 (1958), 431-446.
Apart from scalar equations of the second and higher orders, the special case of the first-order system receiving most detailed attention is that of two simultaneous scalar first-order equations, such as u' = (λ + q_1)v, v' = -(λ + q_2)u. For recent work and references see
E. C. TITCHMARSH, Some eigenfunction expansion formulae, Proc. London Math. Soc. (3) 11 (1961), 159-168,
S. D. CONTE and W. C. SANGREN, An asymptotic solution for a pair of first order equations, Proc. Amer. Math. Soc. 4 (1953), 696-702,
B. W. ROOS and W. C. SANGREN, Spectra for a pair of singular first order differential equations, ibid. 12 (1961), 468-476,
and, by the same authors, Spectral theory of a pair of first-order ordinary differential operators, General Atomic Rept. GA-1777 (1960), Expansions associated with a pair of singular first-order differential equations, General Atomic Rept. GA-2686 (Rev.) (1962), and Three spectral theorems for a pair of singular first order differential equations, Pacific J. Math. 12 (1962), 1047-1055.
For the theory of quasi-differential equations, which fall within the scope of the first-order system, see the book of Ahiezer and Glazman. Without making any attempt here to summarize the vast literature on boundary problems for nth order differential operators, we cite the books of Coddington and Levinson and of Naimark, and among recent papers those of N. LEVINSON, Transform and inverse transform expansions for singular self-adjoint differential operators, Illinois J. Math. 2 (1958), 224-235,
E. A. CODDINGTON, Generalized resolutions of the identity for symmetric ordinary differential operators, Ann. of Math. (2) 68 (1958), 378-392,
G.-C. ROTA, On the spectra of singular boundary value problems, J. Math. Mech. 10 (1961), 83-90,
M. G. KREĬN, On the one-dimensional singular boundary value problem of even order in the interval (0, ∞), Doklady Akad. Nauk SSSR 74 (1950), 9-12.
Section 9.4 The matrix K(x, t, λ) is also known as the Green's matrix, or in older works as the Green's tensor. It may be constructed for more general types of boundary or side condition. See for example R. H. COLE, The expansion problem with boundary conditions at a finite number of points, Canad. J. Math. 13 (1961), 462-479.
For a more general type of inhomogeneous problem see W. M. WHYBURN, On a class of linear differential systems, Rev. Cienc. (Lima) 60 (1958), 43-59.
Section 9.11 In addition to the texts of Ahiezer and Glazman and others, see W. N. EVERITT, Integrable-square solutions of ordinary differential equations, Quart. J. Math. (2) 10 (1959), 145-155.
Section 10.1 Extensions of oscillation theory, such as separation and comparison theorems, to an n-dimensional context are to be found mainly in the American literature, and are largely inspired by the calculus of variations. We cite in particular the book of Morse, and the papers of G. D. BIRKHOFF and M. R. HESTENES, Natural isoperimetric conditions in the calculus of variations, Duke Math. J. 1 (1935), 198-286, and W. T. REID, A system of ordinary linear differential equations with two-point boundary conditions, Trans. Amer. Math. Soc. 44 (1938), 508-521,
where other references are given. Matrix Riccati equations and matrix transformations have recently become prominent; see for example W. T. REID, Properties of solutions of a Riccati matrix differential equation, J. Math. Mech. 9 (1960), 749-770,
J. H. BARRETT, A Prüfer transformation for matrix differential equations, Proc. Amer. Math. Soc. 8 (1957), 510-518.
The variational approach has retained its vigor, and has been found useful also for nonlinear oscillation problems, as in R. A. MOORE and Z. NEHARI, Non-oscillation theorems for a class of non-linear differential equations, Trans. Amer. Math. Soc. 93 (1959), 30-52.
Topological aspects of boundary problems are brought out in an important paper by R. BOTT, On the iteration of closed geodesics and the Sturm intersection theory, Comm. Pure Appl. Math. 9 (1956), 171-206.
Russian literature is rich in the investigation of systems with periodic coefficients, with special reference to the stability or boundedness of solutions. This leads at once, by the Floquet theory, to the eigenvalues of a certain matrix, the fundamental solution of the first-order system taken over a period (see for example Bellman, "Stability Theory," pp. 28-31, or the book of Coddington and Levinson). Among investigators we mention Gel'fand, Lidskii, Krein, and Yakubovič. Selected works: I. M. GEL'FAND and V. B. LIDSKIĬ, On the structure of the regions of stability of linear canonical systems of differential equations with periodic coefficients, Uspekhi Mat. Nauk 10, No. 1 (1955), 3-40, M. G. KREĬN and G. YA. LYUBARSKIĬ, On the analytic properties of the multipliers of periodic canonical systems of differential equations of positive type, Izv. Akad. Nauk, Ser. Mat. 26 (1962), 549-572, V. A. YAKUBOVIČ, The form of the group of symplectic matrices and the structure of the set of unstable canonical systems with periodic coefficients, Mat. Sb. 44 (1958), 313-352, and Oscillation properties of solutions of linear canonical systems of differential equations,
Doklady Akad. Nauk SSSR 124 (1959), 533-536,
where oscillation questions are taken up in a general way, and Oscillation and non-oscillation conditions for canonical linear sets of simultaneous differential equations, ibid. 994-997.
V. A. YAKUBOVIČ, Arguments on the group of symplectic matrices, Mat. Sb. 55 (97) (1961), 255-280, and Oscillation properties of solutions of canonical equations, ibid. 56 (98) (1962), 3-42.
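The Floquet-theoretic reduction mentioned above may be sketched as follows (our notation; ω denotes the period):

```latex
% y' = A(x)\,y, \quad A(x+\omega) = A(x); \quad Y(x) fundamental matrix with Y(0) = I
Y(x+\omega) = Y(x)\,Y(\omega) \qquad (x \in \mathbb{R}),
```

so the behavior of solutions over many periods is governed by the powers of the monodromy matrix Y(ω); its eigenvalues ρ_1, ..., ρ_n are the multipliers, and, for instance, all solutions remain bounded on the whole line precisely when every |ρ_j| = 1 and Y(ω) has simple elementary divisors.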
Section 10.2 Concerning the monotonic properties of eigenvalues see V. B. LIDSKIĬ, Oscillation theorems for a canonical system of differential equations, Doklady Akad. Nauk SSSR 102 (1955), 877-880.
For an investigation adapted to integral equations of the type of Chapter 11 we refer once more to Reid [J. Math. Mech. 8 (1959), 705-726].
Section 10.3 The main discoveries regarding the extension of the Sturmian properties to a vectorial context are due to M. Morse. In addition to his book, see his paper A generalization of the Sturm separation and comparison theorems to n-space, Math. Ann. 103 (1930), 52-69.
See also the just-cited work of Lidskii and that of Reid. For a comparison test relating Sturmian vector systems to the ordinary scalar case see A. WINTNER, A comparison theorem for Sturmian oscillation numbers of linear systems of second order, Duke Math. J. 25 (1958), 515-518.
Further results, including consideration of eigenvalues, are given in an early paper of G. A. BLISS and I. J. SCHOENBERG, On separation, comparison and oscillation theorems for self-adjoint systems of linear second-order differential equations, Amer. J. Math. 53 (1931), 781-800.
Section 10.4 Theorem 10.4.1, or at least the first part, is given by W. T. REID, A Prüfer transformation for differential systems, Pacific J. Math. 8 (1958), 575-584 (especially p. 582),
in improvement of an earlier result of Barrett. The result of Theorem 10.4.2 becomes fairly exact in the case when R(x) ≡ Q(x), that is to say, in the case of the system u' = Qv, v' = -Qu or its matrix analog U' = QV, V' = -QU. This system has been studied by J. H. Barrett [Proc. Amer. Math. Soc. 8 (1957), 510-518], as a generalization of the sine and cosine functions, to yield an extension of the polar coordinate method so successful in the scalar second-order case. See also the above-cited paper of Reid.
Section 10.6 This type of boundary condition for a fourth-order equation, namely y = y'' = 0 and its associate y' = (y''/r)' - qy' = 0, is studied by J. H. BARRETT, Systems-disconjugacy of a fourth-order differential equation, Proc. Amer. Math. Soc. 12 (1961), 205-213,
where other references are given. The same author considers the alternative boundary condition y = y' = 0 and its associate y'' = (y''/r)' - qy' = 0 (actually with q = 0) in Disconjugacy of a self-adjoint differential equation of the fourth order, Pacific J. Math. 11 (1961), 25-37.
See also the problems for this chapter, and later citations.
For the fourth-order equation generally, one must study not merely boundary problems derived from eigenvalue problems but also the distribution of isolated zeros of solutions, though these aspects are not entirely separate. The main and pioneer work in this area is that of W. LEIGHTON and Z. NEHARI, On the oscillation of the solutions of self-adjoint differential equations of the fourth order, Trans. Amer. Math. Soc. 89 (1958), 325-377.
For equations of order 2n, one may, as in the case n = 1, describe as nonoscillatory or disconjugate on an interval an equation of which no solution, not identically zero, has more than one nth order zero in this interval. For the spectral implications of this situation we refer to I. M. GLAZMAN, Oscillation theorems for differential equations of higher orders and the spectra of the corresponding differential operators, Doklady Akad. Nauk SSSR 118 (1958), 423-426.
For further work on specific oscillation criteria, including eigenvalue interpretations, see L. D. NIKOLENKO, Some criteria of non-oscillation for a differential equation of the fourth order, Doklady Akad. Nauk SSSR 114 (1957), 483-485,
H. C. HOWARD, Oscillation criteria for fourth-order linear differential equations, Trans. Amer. Math. Soc. 96 (1960), 296-311,
W. T. REID, Oscillation criteria for self-adjoint differential systems, Trans. Amer. Math. Soc. 101 (1961), 91-106.
For a discussion of the fourth-order equation in the light of the Gantmaher-Krein theory of oscillation matrices and kernels, see their book, Chapter 3, Section 8. See also A. HALANAY and ŞT. SANDOR, Sturm-type theorems for self-conjugate systems of linear differential equations of higher order, Bull. Math. Soc. Sci. Math. Phys. R. P. Roumaine (N.S.) 1 (49) (1957), 401-431, or Doklady Akad. Nauk SSSR (N.S.) 114 (1957), 506-507,
J. H. BARRETT, Two-point boundary problems for linear self-adjoint differential equations of the fourth order with middle term, Duke Math. J. 29 (1962), 543-554,
R. W. HUNT, The behavior of solutions of ordinary self-adjoint differential equations of arbitrary even order, Pacific J. Math. 12 (1962), 954-961.
Sections 10.8-9 For the variational approach to oscillation and comparison theorems for first-order systems see Section 6 of Reid's paper in Trans. Amer. Math. Soc. 44 (1938), 508-521, where results of a somewhat more general character are given.
Section 11.1 Work of Krein and others using the integral equation formalism for boundary problems of Sturm-Liouville type, including singular problems and inverse problems, was cited in the notes for Section 8.1.
Section 11.2 For the method of successive approximations as applied to the existence and continuity properties of solutions of integral equations of the type considered here we refer to the papers cited in the notes for Section 8.1 of Feller [Illinois J. Math. 1 (1957), 459-504, particularly pp. 466-467] and Reid [J. Math. Mech. 8 (1959), 705-726, particularly pp. 706-710]; in the latter case vector-matrix systems are considered.
Section 11.8 For a detailed survey, with bibliography and some new results, of the theory of the Stieltjes integral equation (11.8.4) we refer to an address given by J. S. MACNERNEY, A linear initial value problem, Bull. Amer. Math. Soc. 69 (1963), 314-329.
Selected papers: M. FRÉCHET, Sur l'intégration d'un système canonique d'équations différentielles linéaires à coefficients discontinus, Proc. Benares Math. Soc. 1 (1939), 1-14,
T. H. HILDEBRANDT, On systems of linear differentio-Stieltjes-integral equations, Illinois J. Math. 3 (1959), 352-373,
J. S. MACNERNEY, Hellinger integrals in inner product spaces, J. Elisha Mitchell Sci. Soc. 76 (1960), 252-273,
J. S. MACNERNEY, Integral equations and semigroups, Illinois J. Math. 7 (1963), 148-173,
W. T. REID, Generalized linear differential systems, J. Math. Mech. 8 (1959), 705-726 (for vector-matrix Sturm-Liouville systems),
F. W. STALLARD, Functions of bounded variation as solutions of differential systems, Proc. Amer. Math. Soc. 13 (1962), 366-373,
where a procedure of factorization is described bearing some analogy to the additive decomposition of a function of bounded variation into absolutely continuous and other components. The theory of (11.8.4) overlaps with the topic of differential equations with interface conditions. In the simplest form of such a problem, we suppose the vector y(x) to satisfy a differential equation (11.8.1) except at a finite number of points x_r, say, at which a discontinuity takes place
according to y(x_r + 0) = A_r y(x_r - 0), the matrices A_r being prescribed, often being nonsingular. We considered particular cases in Sections 0.8, 8.7, and indeed the three-term recurrence relations of Chapters 4, 5, and 6 may be considered as special cases. For an account of such interface conditions and the associated boundary problems see F. W. STALLARD, Differential systems with interface conditions, Oak Ridge Nat. Lab. Publ. No. 1876 (Physics),
W. C. SANGREN, Differential equations with discontinuous coefficients, Oak Ridge Nat. Lab. Publ. No. 1566 (Physics) (1953),
where both nth order equations and Sturm-Liouville problems are considered. It is possible to consider also inhomogeneous interface conditions. For work of this kind, dealing also with more general side conditions than the standard boundary conditions, and a survey of the field, we refer to W. M. WHYBURN, Differential systems with boundary conditions at more than two points, Proc. Conf. Differential Equations, Univ. of Maryland, 1955, pp. 1-21,
T. J. PIGNANI and W. M. WHYBURN, Differential systems with interface and general boundary conditions, J. Elisha Mitchell Sci. Soc. 72 (1956), 1-14.
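To indicate, in notation of ours, how recurrence relations arise as special cases of interface conditions: between consecutive interface points the differential equation propagates y(x) by a transfer matrix, and the jump condition contributes a further matrix factor, so the boundary values at the interface points satisfy a matrix recurrence.

```latex
% T_r = transfer matrix of the smooth equation over (x_r, x_{r+1});
% jump condition y(x_r + 0) = A_r\, y(x_r - 0)
y_r := y(x_r - 0), \qquad y_{r+1} = T_r A_r\, y_r ;
```

in the two-dimensional Sturm-Liouville case such a matrix recurrence is equivalent to a three-term scalar recurrence of the kind studied in Chapters 4-6.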
Concerning modified Stieltjes integrals see also G. B. PRICE, Cauchy-Stieltjes and Riemann-Stieltjes integrals, Bull. Amer. Math. Soc. 49 (1943), 625-630,
H. SCHARF, Über links- und rechtsseitige Stieltjes-Integrale und deren Anwendungen, Portugal. Math. 4 (1944), 73-118,
W. H. INGRAM, The j-differential and its integral, Proc. Edinburgh Math. Soc. (2) 12 (1960/61), 85-93; 13 (1962), 85-86.
Relative to the product integral given by (11.8.15), see the references given in Bellman's "Matrix Analysis," pp. 178-179. A somewhat different approach is given in Potapov's monograph. It is evident that (11.8.15) defines a matrix-valued function of two scalar variables with the property that Y(c, a) = Y(c, b) Y(b, a). Imposing also continuity requirements, Wall has used the term "harmonic" to describe matrix solutions of such an identity. See H. S. WALL, Concerning harmonic matrices, Arch. Math. 5 (1954), 160-167.
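Assuming (11.8.15) is of the usual product-integral form, the multiplicative identity just quoted can be seen from the limit of ordered Riemann products (a sketch, in notation of ours):

```latex
% partition a = x_0 < x_1 < \dots < x_N = b, \quad \xi_k \in [x_{k-1}, x_k]
Y(b, a) = \lim_{\max \Delta x_k \to 0}
  \bigl( I + A(\xi_N)\,\Delta x_N \bigr) \cdots \bigl( I + A(\xi_1)\,\Delta x_1 \bigr),
```

and Y(c, a) = Y(c, b) Y(b, a) follows on splitting each partition of (a, c) at the intermediate point b.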
Relaxing slightly the continuity requirement (cf. Section 11.9), MacNerney has generalized this concept. See J. S. MACNERNEY, Concerning quasi-harmonic operators, J. Elisha Mitchell Sci. Soc. 73 (1957), 257-261,
where also matrices are extended to certain linear transformations.
NOTES
529
A theory of generalized differential equations which is not confined to linear equations, using the idea of a sequence of approximating equations, has been constructed by Kurzweil. See in particular his paper J. KURZWEIL, Generalized ordinary differential equations and continuous dependence on a parameter, Czech. Math. J. 7 (82) (1957), 418-449; also 9 (84) (1959), 564-573.
Chapter 11, Problems Problems 7 and 8. For the differential equation case see Bellman, "Stability Theory," pp. 123-125, and, for a matrix extension, W. T. REID, Oscillation criteria for linear differential systems with complex coefficients, Pacific J. Math. 6 (1956), 733-751, Theorem 4.2.
The original result is due to Lyapunov. The topic is connected with an investigation of M. G. Krein on the extremal frequencies of a string loaded with particles of given total mass, and referred to under Notes for Section 4.7. It provides one of a number of instances in which topics for Sturm-Liouville equations admit a slightly more complete treatment when the equation is generalized to one of the forms considered here. For similar problems for higher order equations see F. V. ATKINSON, Estimation of an eigenvalue occurring in a stability problem, Math. Z. 68 (1957), 82-99.
Section 12.1 We cite here some examples of investigations for linear differential and difference equations which admit extension to integral equations, in some cases with a certain rounding off and finality in the results; these are additional to the asymptotic theory and expansion theorems considered in this chapter. In Problems 7 and 8 for Chapter 11 are indicated extremal properties relating to minima of the lowest eigenvalue and to the boundedness of solutions of equations with periodic coefficients. In Problems 3 and 4 for this chapter is given an extremum relating to eigenvalues of a non-self-adjoint problem. A bound relating to the perturbation of eigenvalues is given in Problem 7 for this chapter. We cite next certain formal relations for differential equations, in which one sets up differential equations satisfied by Wronskians of solutions of another, as in G. MAMMANA, Decomposizione delle espressioni differenziali lineari ..., Math. Z. 33 (1931), 186-231, M. M. CRUM, Associated Sturm-Liouville systems, Quart. J. Math. 6 (1955), 121-127.
These relations were used by the last author to consider the removal of eigenvalues, that is to say, the "deflation" of the problem, or the addition of eigenvalues. It was pointed out by M. G. KREĬN, On a continual analog of a Christoffel formula from the theory of orthogonal polynomials, Doklady Akad. Nauk SSSR 113 (1957), 970-973,
that this was the continuous analog of a process from the theory of orthogonal polynomials (Szegő, pp. 28-30), in which from a given set of orthogonal polynomials one forms determinants which are orthogonal with respect to a new weight-function, modified by the addition of a polynomial factor. Krein made an application to the inverse Sturm-Liouville problem. An interesting application of the formalism of the integral equation which generalizes the Sturm-Liouville equation has been made by M. MIKOLÁS, Über gewisse Eigenschaften orthogonaler Systeme der Klasse L² und die Eigenfunktionen Sturm-Liouvillescher Differentialgleichungen, Acta Math. Acad. Sci. Hungar. 6 (1955), 147-190,
who considered a number of questions relative to the integration or differentiation of complete orthonormal sets to obtain further such sets.
Section 12.3 See the paper of R. R. D. KEMP and N. LEVINSON, On u'' + (1 + λg(x))u = 0 for ∫₀^∞ |g(x)| dx < ∞, Proc. Amer. Math. Soc. 10 (1959), 82-86,
and, for a discrete approach, T. FORT, A problem in linear differential equations, Proc. Amer. Math. Soc. 10 (1959), 300-303.
See also Problems 3 and 4 for this chapter. For another application of approximation by recurrence relations to second-order differential equations see T. FORT, A problem of Richard Bellman, Proc. Amer. Math. Soc. 9 (1958), 282-286.
Section 12.4 For references on inverse Sturm-Liouville problems see the notes to Section 12.7.
Section 12.5 In the case of second-order differential equations, the result of Theorem 12.5.1 forms a very special case of a result of Levinson for systems of differential equations with almost constant coefficients; see
Bellman’s “Stability Theory,” p. 50. For the analog of Theorem 12.5.2 see the same work, p. 114.
Section 12.6 The validity of results such as Theorem 12.6.1 may be extended by the use of more general methods than the reliance on the possibility of an asymptotic integration, as in Chapters 5 and 8.
Section 12.7 For similar asymptotic results for differential equations, see for example M. H. STONE, Certain integrals analogous to Fourier integrals, Math. Z. 28 (1928), 654-676,
and the monograph of Naimark. In this book we have taken up inverse problems only in certain discrete contexts; the theory for second-order differential equations may be found in the book of Naimark, or in the appendices of Levitan to Titchmarsh. In addition, the following brief list of references may be useful. Among the earliest works on the area is a short paper by V. AMBARZUMIAN, Über eine Frage der Eigenwerttheorie, Z. Physik 53 (1929), 690-695,
who considered the problem y'' + (λ + g(x))y = 0, y'(0) = 0, y'(π) = 0; he showed that if ∫₀^π g(x) dx = 0 and the eigenvalues are 1², 2², ..., then g(x) must vanish identically. A comprehensive theory of the matter appears for the first time in the fundamental work of G. BORG, Eine Umkehrung der Sturm-Liouvilleschen Eigenwertaufgabe: Bestimmung der Differentialgleichung durch die Eigenwerte, Acta Math. 78 (1946), 1-96.
In particular, it is shown that the prescription of one set of eigenvalues does not fix the differential equation; with certain qualifications, two sets of eigenvalues must be given, or additional requirements, such as symmetry, must be imposed on the coefficient in the differential equation (cf. Sections 1.7-8, 4.6-7). In Borg's work the uniqueness of the equation depends on some interesting completeness properties of products of eigenfunctions, the existence being treated by methods of successive approximation. For later advances relating to this particular formulation of the inverse problem we cite the papers of N. LEVINSON, The inverse Sturm-Liouville problem, Mat. Tidsskr. B (1949), 25-30,
M. G. KREIN, Solution of the inverse Sturm-Liouville problem, Doklady Akad. Nauk SSSR 76 (1951), 21-24,
L. A. ČUDOV, A new variant of an inverse Sturm-Liouville problem on a finite interval, Doklady Akad. Nauk SSSR 109 (1956), 40-43,
and papers of Krein cited in the notes for Section 8.3. More recent work has been devoted to formulations in which the differential equation is to be recovered not from sets of eigenvalues but from the spectral function or from the asymptotic phase or S-matrix. A basic method is due to I. M. GEL'FAND and B. M. LEVITAN, On the determination of a differential equation from its spectral function, Izv. Akad. Nauk SSSR, Ser. Mat. 15 (1951), 309-356; Amer. Math. Soc. Transl. (2) 1 (1955), 253-304.
The problem in which the differential equation is to be recovered from the spectral function provides a basis for solving the problem in which we start from the asymptotic phase, or scattering matrix, or that in which we start from sets of eigenvalues. It is of course necessary when starting from the asymptotic phase to recover first the spectral function; this depends on a problem of factorization into functions analytic in the upper and lower half-planes. For an exposition we refer to the survey of L. D. FADDEEV, The inverse problem of the quantum theory of scattering, Uspekhi Mat. Nauk 14, No. 4(88) (1959), 57-131.
Briefer early expositions were given by N. LEVINSON, Determination of the potential from the asymptotic phase, Phys. Rev. (2) 75 (1949), 1445, and Certain explicit relationships between phase shift and scattering potential, Phys. Rev. (2) 89 (1953), 755-757.
For further recent work we cite the papers of A. FRIEDMAN, On the properties of a singular Sturm-Liouville problem determined by its spectral function, Michigan Math. J. 4 (1957), 137-145,
I. KAY and H. E. MOSES, The determination of the scattering potential from the spectral measure function, Nuovo Cimento (10) 2 (1955), 917-961; 3 (1956), 66-84, 276-304; 5 (1957), Suppl., 230-242,
R. G. NEWTON and R. JOST, The construction of potentials from the S-matrix for systems of differential equations, Nuovo Cimento (10) 1 (1955), 590-622,
Z. S. AGRANOVICH and V. A. MARCHENKO, Reconstruction of the potential energy from the scattering matrix, Uspekhi Mat. Nauk 12, No. 1(73) (1957), 143-145; Amer. Math. Soc. Transl. (2) 16 (1960), 355-357,
M. G. KREIN, On the theory of accelerants and S-matrices of canonical differential systems, Doklady Akad. Nauk SSSR 111 (1956), 1167-1170,
M. S. BRODSKII, The inverse problem for systems of linear differential equations containing a parameter, Doklady Akad. Nauk SSSR 112 (1957), 800-803,
V. F. KOROP, The converse problem of dispersion for equations with a singularity, Doklady Akad. Nauk SSSR 132 (1960), 754-757.
A basic tool in the Gel'fand-Levitan method is formed by transformation operators, integral operators which transform solutions of one differential equation into those of another; the kernels in these operators satisfy partial differential equations (a discrete analog being indicated in Problem 16 for Chapter 4). An independent discussion of such kernels was given by M. M. CRUM, On certain Sturm-Liouville functions, J. London Math. Soc. 31 (1956), 426-432.
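A one-step instance of such a transformation, the Darboux transformation underlying Crum's construction, can be verified numerically. The seed cosh x and the resulting potential -2 sech²x form a standard textbook example assumed here for illustration; it is not a computation from the papers cited.

```python
import numpy as np

# Starting from q = 0 with seed phi = cosh(x), an eigenfunction for
# lambda0 = -1 of -y'' = lambda0*y, the transformed potential is
# q~ = -2 (d/dx)^2 log(phi) = -2 sech(x)^2, and y~ = y' - (phi'/phi) y
# maps solutions of -y'' = k^2 y to solutions of -y~'' + q~ y~ = k^2 y~.
k = 1.7
x = np.linspace(-3, 3, 4001)
h = x[1] - x[0]
ytil = k*np.cos(k*x) - np.tanh(x)*np.sin(k*x)   # transform of sin(kx)
qtil = -2.0 / np.cosh(x)**2
# second difference of ytil on interior points
d2 = (ytil[2:] - 2*ytil[1:-1] + ytil[:-2]) / h**2
residual = -d2 + qtil[1:-1]*ytil[1:-1] - k**2 * ytil[1:-1]
assert np.max(np.abs(residual)) < 1e-4
```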
Among other work see J. DELSARTE and J. L. LIONS, Transmutation d'opérateurs différentiels dans le domaine complexe, Comment. Math. Helv. 32 (1957), 113-128,
some discussion being also given in the book J. L. LIONS, "Équations différentielles opérationnelles et problèmes aux limites." Springer, Berlin, 1961.
For an interesting application of the Gel'fand-Levitan results to produce differential equations with unusual spectral behavior see N. ARONSZAJN, On a problem of Weyl in the theory of singular Sturm-Liouville equations, Amer. J. Math. 79 (1957), 597-610.
Section 12.10 Concerning the proof of the Sturm-Liouville expansion by approximation from the trigonometric case see GARRETT BIRKHOFF and G.-C. ROTA,On the completeness of the Sturm-Liouville eigenfunctions, Amer. Math. Monthly 67 (1960), 835-841,
and also the book of the same authors. The idea goes back to George D. Birkhoff [Proc. Nat. Acad. Sci. U.S.A. 3 (1917), 656-659], or "Collected Works," Vol. I, pp. 90-93. See also J. L. WALSH, On the convergence of the Sturm-Liouville series, Ann. of Math. (2) 24 (1923), 109-120.
For an application of the same principle to prove the completeness of the eigenfunctions of a nonlinear problem of Sturm-Liouville type see R. M. MORONEY, A class of characteristic value problems, Trans. Amer. Math. Soc. 102 (1962), 446-470.
Appendix I For accounts of the Stieltjes integral under standard conditions see Widder's "Laplace Transform," where the Helly-Bray theorems are also treated. The modification made in Section 1.5 is mentioned also in "The Convolution Transform" (Princeton Univ. Press, Princeton, New Jersey, 1955) by I. I. Hirschman and D. V. Widder. For the multi-dimensional Stieltjes integral see E. KAMKE, "Das Lebesgue-Stieltjes-Integral." Teubner, Leipzig, 1956; 2nd ed., 1960.
See also E. J. MCSHANE, “Integration.” Princeton Univ. Press, Princeton, New Jersey, 1944.
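As a small illustration of the Stieltjes integral in the purely discontinuous case, the integral against a nondecreasing step function reduces to a weighted sum over the jumps. The jump points and sizes below are arbitrary assumed data for the sketch.

```python
import numpy as np

# For a step function T with jumps s_k at points x_k, the Stieltjes
# integral of f dT equals sum_k f(x_k) * s_k; compare that with a
# partition-sum approximation sum f(t_i) * (T(t_{i+1}) - T(t_i)).
f = np.cos
xk = np.array([0.5, 1.2, 2.0])      # jump points (assumed data)
sk = np.array([1.0, 0.5, 2.0])      # jump sizes (assumed data)
T = lambda t: np.sum(sk[None, :] * (xk[None, :] <= np.atleast_1d(t)[:, None]), axis=1)

grid = np.linspace(0.0, 3.0, 100001)
approx = np.sum(f(grid[:-1]) * np.diff(T(grid)))
exact = np.sum(f(xk) * sk)
assert abs(approx - exact) < 1e-3
```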
Appendix II For further results of a more general character on the zeros of polynomials see M. MARDEN, "The Geometry of the Zeros of a Polynomial in a Complex Variable," Math. Surveys No. 3. Amer. Math. Soc., New York, 1949, or The zeros of certain real rational and meromorphic functions, Duke Math. J. 16 (1949), 91-97.
The general form of functions whose imaginary part has fixed sign in the upper half-plane is given in terms of a Stieltjes integral; for this and for the Stieltjes inversion formula we refer to the book of Shohat and Tamarkin, or to those of Stone or of Ahiezer and Glazman. Of interest in a functional-analytic connection is a study of the logarithm of such functions. See N. ARONSZAJN and W. F. DONOGHUE, Jr., On the exponential representation of analytic functions in the upper half-plane with positive imaginary part, J. Analyse Math. 5 (1956/57), 321-388.
For references in the "positive-real" setting see S. and L. SESHU, Bounds and Stieltjes transform representations for positive real functions, J. Math. Anal. Appl. 3 (1961), 592-604,
or p. 109 of Bellman’s “Matrix Analysis.”
Appendix IV Bounds for solutions of difference and differential equations utilizing one-sided bounds for coefficient matrices have recently been given by J. B. ROSEN, Stability and bounds for non-linear systems of difference and differential equations, J. Math. Anal. Appl. 2 (1961), 370-393.
Further references are given by Beckenbach and Bellman.
Appendix V For a study of the perturbation of matrices and the corresponding perturbations of eigenvalues and eigenvectors we refer to M. I. VISHIK and L. A. LYUSTERNIK, Solution of some perturbation problems in the case of matrices and self-adjoint and non-self-adjoint differential equations, I, Uspekhi Mat. Nauk 15, No. 3(93) (1960), 3-80; or Russian Math. Surveys 15, No. 3 (1960), 1-73 (Appendices I, II).
See also the paper of W. N. Everitt cited in the Notes for Section 6.5.
Appendix VI For these and similar theorems see, in addition to the work of Birkhoff and Rota cited in the Notes for Section 12.10, the book of Riesz and Sz.-Nagy, pp. 205-207. Additional references: R. E. A. C. PALEY and N. WIENER, "Fourier Transforms in the Complex Domain." Amer. Math. Soc. Colloquium Publications, Vol. 19, New York, 1934, p. 100,
Yu. A. KAZMIN, On bases and complete systems of functions in Hilbert space, Mat. Sb. 42 (84) (1957), 513-522,
N. K. BARI, Sur les systèmes complets de fonctions orthogonales, Mat. Sb. 14 (56) (1944), 51-108.
Problems
Chapter 1. 1. Defining y_r(λ) by (1.3.1), subject to (1.2.1), let the a_r be constants such that

w(λ) = Σ_{r=m'}^{m} a_r y_r(λ)

has no real zeros. Show that, as λ describes the real axis positively, w(λ) makes not less than m' and not more than m positive circuits of the origin.
2. Show that if the c_n > 0 are real and α is fixed, then as the c_n increase the λ_r, as fixed by (1.3.5), move towards zero.
3. Show that if the c_n > 0 are real, then the eigenvalues λ_r for which r > 0 satisfy a certain bound; show also that this bound can be attained, for suitable c_n.
4. Show that if the c_n > 0 are real and α = π, then a corresponding bound holds.
5. Show that if Re{c_n} > 0, and m < m', then between two eigenvalues λ_r(m, α), λ_{r+1}(m, α) lies an eigenvalue λ_s(m', β).
6. Consider the boundary problem

y_{n+1} = (E + iλC_n)(E − iλC_n*)^{−1} y_n,   y_0 ≠ 0,   y_m = N y_0,

where y_n is a k-by-1 column matrix, E the unit matrix, N a fixed unitary matrix, and the matrices C_n are such that C_n + C_n* > 0, C_n C_n* = C_n* C_n, showing in particular that the eigenvalues are real.
7. For the problem of the previous question, and supposing the C_n Hermitean, establish the analog of Problem 2, for the event that one of the C_n is increased in the matrix sense. (Remark: Suitable arguments are used in a different context in Chapter 10.)
8. For a fixed integer k > 1 and n = 0, ..., m − 1, r = 1, ..., k, let the c_{nr} = (c_{nr1}, ..., c_{nrk}) be real 1-by-k row matrices, and let λ = (λ_1, ..., λ_k) be a k-by-1 column matrix. Define the y_{nr}(λ) recursively by y_{0r}(λ) = 1 and

y_{n+1,r}(λ) = (1 + i c_{nr} λ)(1 − i c_{nr} λ)^{−1} y_{nr}(λ).

For some real α_1, ..., α_k form the boundary problem

y_{mr}(λ) = exp(iα_r),   r = 1, ..., k,

eigenvalues being column matrices λ which satisfy these k simultaneous equations. For a k-tuple n of integers (n_1, ..., n_k), 0 ≤ n_r < m, write

c_n = det(c_{n_r r})

for the determinant formed by the k row matrices c_{n_1 1}, ..., c_{n_k k}, in that order. Prove that if c_n > 0 for all n, then this boundary problem has only real eigenvalues. Prove also that it has exactly m^k eigenvalues, and set up orthogonality relations. (Remark: Compare Sections 6.9-10, Problem 16 for Chapter 8.)
Chapter 2. 1. Show that the eigenvalues of the problem (2.5.1-3) with α = ½π, c_n = n^{−2}, are 1²/2, 3²/2, ... and −2·1², −2·2², ... .
2. Show that if the c_n are real and positive and such that c_n = O(n^{−2}) for large n, then λ_r^{−1} = O(r^{−2}) for large r.
3. For the matrix analog of the problem (2.5.1-3), i.e., Problem 6 of Chapter 1 with m = ∞, prove that the spectrum is discrete provided that Σ_{0}^{∞} C_n converges.
4. Consider the multi-parameter boundary problem described in Problem 8 of Chapter 1 with m = ∞, the c_{nr} being real and all c_n > 0. Show that if the c_{nr} are suitably restricted, the limit of the y_{nr}(λ) exists for all real λ and for suitably restricted complex λ. Show also that the spectrum is discrete, i.e., that the eigenvalues (which are column matrices) have no finite limit.
Chapter 3. 1. Let A, B be 2-by-2 matrices and let A + λB be symplectic [i.e., J-unitary with J given by (3.5.1)] for all real λ. Show that A + λB = A'(J + λC), where A' is symplectic and independent of λ, and C has one of the forms

C = ± ( a²  ab )
      ( ab  b² ) ,

and a, b are real.
2. In addition to the assumptions of Problem 1, let

Im {(A + λB)* J (A + λB) − J} ≥ 0,

for all λ with Im λ > 0, with equality excluded in the matrix inequality. Show that in the above form for C the (+) sign is to be taken, with a and b not both zero.
3. Let A_n, B_n satisfy the assumptions of Problems 1 and 2. Show that the recurrence relation, for the column matrices y_n,

y_{n+1} = (A_n + λB_n) y_n,   n = 0, 1, ...,

may be transformed, by a substitution z_n = H_n y_n with H_n symplectic, to the form

z_{n+1} = (J + λC_n) z_n,   n = 0, 1, ...,

where C_n has the form attributed to C in Problems 1 and 2.
4. With the assumptions of the previous problems, show that a further substitution reduces the recurrence relation to a scalar three-term recurrence formula. (Hint: Two recurrence steps of Problem 3 for which C_{n+1} C_n = 0 may be combined into a single step, and the sign of a_n, b_n may be adjusted so that a_n a_{n−1} + b_n b_{n−1} > 0.)
5. Let A, B be 2-by-2 matrices and let A + λB be unitary, in the usual sense, for all λ such that |λ| = 1. If further A, B are neither equal to the zero matrix, show that A + λB admits a representation in which A', A'' are unitary and independent of λ.
6. Show that a recurrence relation y_{n+1} = (A_n + λB_n) y_n, where A_n, B_n are as in the previous problem, may be transformed by a substitution z_n = H_n y_n to a standard form involving cos θ_n, sin θ_n.
(Equivalently, the order of the matrices may here be reversed.)
7. Let A, B be 2-by-2 matrices, neither zero, such that for all λ on the unit circle A + λB is J-unitary, where J has the form (3.6.1). Show that A + λB has one or other of two forms, where A', A'' are J-unitary.
8. Find a parametric representation of the general J-unitary matrix, with J given by (3.6.1), and a standard form similar to that of Problem 6 for the associated recurrence relation.
9. Show that the set of 2-by-2 symplectic matrices

( a  b )
( c  d )

is connected when a, b, c, d may be complex, and also when they are restricted to be real.
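The symplectic set of Problem 9 can be probed numerically. The sketch below assumes the convention J = [[0, 1], [-1, 0]] (which may differ from (3.5.1)) and uses the 2-by-2 fact that AᵀJA = (det A)·J, so that symplectic is the same as det A = 1.

```python
import numpy as np

# For 2-by-2 matrices, A^T J A = (det A) J with J = [[0, 1], [-1, 0]],
# so "symplectic" in this convention is simply det A = 1.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
rng = np.random.default_rng(0)
M = rng.standard_normal((2, 2))
A = M / np.sqrt(abs(np.linalg.det(M)))   # scale so that |det A| = 1
if np.linalg.det(A) < 0:
    A[0] *= -1.0                         # flip a row to make det A = +1
assert np.allclose(A.T @ J @ A, J)
```

Since any nonsingular real matrix can be normalized this way, real 2-by-2 symplectic matrices form the group SL(2, R), whose connectedness is what Problem 9 asserts in the real case.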
Chapter 4. 1. Polynomials y_n(λ) are defined as in Section 4.1, and the constants a_r, m_1 ≤ r ≤ m_2, 0 ≤ m_1 ≤ m_2 ≤ m, are not all zero. Prove that Σ_{r=m_1}^{m_2} a_r y_r(λ) has at least m_1, and at most m_2, real zeros. (Hint: This may be deduced from the orthogonality.)
2. The piecewise linear function y_x(λ) is defined as in Section 4.3, and λ_1, ..., λ_m are the zeros of y_m(λ). Show that a linear combination

Σ_{r=1}^{m} a_r y_x(λ_r),

with the a_r not all zero, has at most m − 1 zeros in the x-interval −1 ≤ x ≤ m; in the event of this expression vanishing throughout an interval of the form n ≤ x ≤ n + 1, only the zeros at x = n, x = n + 1 are to be counted.
3. Polynomials y_n(λ) of degree n, 0 ≤ n ≤ m − 1, are orthogonal in that

∫ y_r(λ) y_s(λ) dτ(λ) = 0 (r ≠ s),   = a_r (r = s),

for certain positive a_r, where τ(λ) is nondecreasing and such that the integrals converge absolutely. Show that τ(λ) has at least m points of increase.
4. With the assumptions of the previous problem, let λ', λ'', with λ' < λ'', be zeros of y_r(λ), for some r, 1 < r < m. Show that τ(−∞) < τ(λ') < τ(λ'') < τ(∞). [Hint: Show that there would otherwise exist a polynomial of degree less than r and not orthogonal to y_r(λ).]
5. With the assumptions of Problem 3, assume in addition that the weight-distribution is confined to the right half of the real axis, i.e., that τ(−∞) = τ(0). Show that the coefficient of λ^{r−1} in y_r(λ), 0 < r < m, is not zero.
6. A set of real-valued functions p_r(λ), 0 ≤ r ≤ m − 1, continuous in a real interval α ≤ λ ≤ β, is said to form a "Čebyšev system" if for all r, 0 ≤ r < m, a nontrivial linear combination Σ_{s=0}^{r} a_s p_s(λ) with constant coefficients can have at most r distinct zeros in this interval. Show that the following systems form Čebyšev systems in the intervals indicated: (i) 1, λ, λ², ..., λ^{m−1}, any interval, (ii) 1, λ², λ³, ..., λ^m, any non-negative interval, (iii) 1, λ, λ², ..., λ^m, as for (ii), (iv) sin x, sin 2x, ..., sin mx, 0 < x < π, (v) 1, cos x, cos 2x, ..., cos (m − 1)x, 0 < x < π, (vi) (λ − ν_r)^{−1}, r = 0, ..., m − 1, the ν_r all distinct and negative, any non-negative interval.
7. If the p_r(λ) form a Čebyšev system in α ≤ λ ≤ β, 0 ≤ r < m, then given r distinct points in α < λ < β there exists a linear combination Σ_{s=0}^{r} a_s p_s(λ) which changes sign at these points, and which vanishes nowhere else in α ≤ λ ≤ β.
8. Let the p_r(λ) be as in Problem 6, and let τ(λ) be nondecreasing, with at least m points of increase, the integrals

∫ {p_r(λ)}² dτ(λ),   0 ≤ r ≤ m − 1,

existing, possibly as improper integrals. Show that there exist unique linear combinations

ψ_p(λ) = p_p(λ) + Σ_{s=0}^{p−1} γ_{ps} p_s(λ),   p = 0, ..., m − 1,

such that

∫ ψ_p(λ) ψ_q(λ) dτ(λ) = 0,   p ≠ q.

9. With the assumptions of the previous problem, show that ψ_p(λ) has exactly p points of change of sign in α < λ < β, and more generally that a linear combination

Σ_{r=q}^{p} a_r ψ_r(λ),

with the a_r not all zero, has at least q and at most p changes of sign.
10. In addition to the assumptions of Problem 8, let the p_r(λ) be continuously differentiable in α ≤ λ ≤ β, and term a common zero of χ(λ) = Σ_{s=0}^{r} a_s p_s(λ) and of χ'(λ) a multiple zero. Let the Čebyšev property, that χ(λ) may not have more than r distinct zeros, be strengthened by counting a multiple zero as at least two zeros, and a multiple zero with change of sign as at least three zeros. Defining the ψ_p(λ) as in the previous problem, show that the zeros of ψ_p(λ) and of ψ_{p−1}(λ) separate one another. [Hint: Show that ψ_p(λ)ψ'_{p−1}(λ) − ψ'_p(λ)ψ_{p−1}(λ) does not vanish.]
11. Let ν_r, r = 0, ..., m − 1, be negative and distinct, and let τ(λ), λ ≥ 0, be non-decreasing, with at least m points of increase in 0 ≤ λ < ∞, and such that ∫_0^∞ dτ(λ)/(1 + λ²) < ∞. Show that there exist unique rational functions

ψ_p(λ) = Σ_{s=0}^{p} γ_{ps} (λ − ν_s)^{−1},   γ_{pp} = 1,

such that

∫_0^∞ ψ_p(λ) ψ_q(λ) dτ(λ) = 0,   p ≠ q.

Show also that ψ_p(λ) has p zeros in 0 < λ < ∞, and that the zeros of ψ_p(λ), ψ_{p−1}(λ) have the separation property.
12. With the assumptions of the previous problem, show that the ψ_n(λ) satisfy a recurrence relation

(λ − ν_{n+1}) ψ_{n+1}(λ) − (a_n λ + b_n) ψ_n(λ) = c_{n−1} (λ − ν_{n−1}) ψ_{n−1}(λ).
(Hint: Show that a_n, b_n may be found so that the left-hand side has no poles at λ = ν_n, ν_{n−1}, and use reasoning similar to the proof of Theorem 4.6.2.)
13. Consider the boundary problem

y_{n+1}(λ) = y_n(λ)(a_n λ + b_n)/(a_n' λ + b_n') − y_{n−1}(λ),

n = 0, ..., m − 1, with boundary conditions y_{−1}(λ) = 0, y_m(λ) = 0, where a_n, a_n', b_n, and b_n' are all real, and a_n b_n' − a_n' b_n > 0, showing that the eigenvalues are all real.
14. For integral m, m', with 0 < m ≤ m', and real h, h', with h ≠ h' if m = m', let τ_{m,h}(λ), τ_{m',h'}(λ) be spectral functions as defined in Section 4.5. Show that for any polynomial ω(λ) of degree not exceeding 2m − 2 we have the finite mechanical quadrature

∫_{−∞}^{∞} ω(λ) dτ_{m,h}(λ) = ∫_{−∞}^{∞} ω(λ) dτ_{m',h'}(λ).
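The mechanical quadrature of Problem 14 can be illustrated in a classical special case, with Gauss-Legendre rules standing in for the spectral functions (an illustrative substitution, not the construction of Section 4.5): two rules of different order agree on any polynomial of sufficiently low degree.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

# Two Gauss-Legendre rules of different order agree on a polynomial of
# degree at most 2m - 2 (indeed 2m - 1): the discrete weight
# distributions differ, but the integrals coincide.
omega = np.poly1d([3.0, -1.0, 2.0, 0.5, 1.0])   # degree 4
m = 4                                            # 2m - 2 = 6 >= 4
x1, w1 = leggauss(m)
x2, w2 = leggauss(m + 3)
I1 = np.sum(w1 * omega(x1))
I2 = np.sum(w2 * omega(x2))
assert abs(I1 - I2) < 1e-12
```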
15. With the assumptions of the previous problem, if λ_r is a point of discontinuity of τ_{m,h}(λ), show that

τ_{m',h'}(λ_r) − τ_{m',h'}(−∞) < τ_{m,h}(λ_r) − τ_{m,h}(−∞),
τ_{m',h'}(+∞) − τ_{m',h'}(λ_r) < τ_{m,h}(+∞) − τ_{m,h}(λ_r).
[Hint: Apply the quadrature formula of the previous problem to a suitable ω(λ) (see Szegő, "Orthogonal Polynomials," Section 3.411).]
16. For positive constants c_{−1}, c_0, ... and real constants b_0, b_1, ..., d_0, d_1, ..., define polynomials y_n(λ), z_n(λ) by the recurrence relations

c_n y_{n+1}(λ) = (λ + b_n) y_n(λ) − c_{n−1} y_{n−1}(λ),
c_n z_{n+1}(λ) = (λ + d_n) z_n(λ) − c_{n−1} z_{n−1}(λ),

together with the initial conditions y_{−1} = z_{−1} = 0, y_0 = z_0 = 1/c_{−1}. Define the transformation matrix k_{nr}, 0 ≤ r ≤ n, by

y_n(λ) = Σ_{r=0}^{n} k_{nr} z_r(λ).

Show that k_{nn} = 1, n ≥ 0, and that if we set formally k_{n,−1} = 0, n ≥ −1, then the k_{nr} satisfy the partial difference equations

c_n k_{n+1,r} + c_{n−1} k_{n−1,r} − d_r k_{nr} = c_r k_{n,r+1} + c_{r−1} k_{n,r−1} − b_n k_{nr}.

Show also that these difference equations and boundary conditions determine the k_{nr}.
17. Let y_n(λ), z_n(λ) be as defined in Sections 4.1-2, and establish the corresponding identity for 0 ≤ r ≤ n.
18. Use the latter result to evaluate, for 0 ≤ s ≤ n, an integral involving −z_n(λ)/y_n(λ), the integral being taken positively round a circle enclosing all the zeros of y_n(λ). Deduce that this result remains true if −z_n(λ)/y_n(λ) be replaced by

−{z_n(λ) + h z_{n−1}(λ)}/{y_n(λ) + h y_{n−1}(λ)},

for any constant h, real or complex, the contour enclosing all zeros of the denominator.
19. Deduce from the result of the previous problem (i) the orthogonality (4.4.8), (ii) the orthogonality (4.9.6). [Hints: (i) Evaluate the integral by residues; (ii) use the result with h and its conjugate, using semicircular contours.]
20. Consider the argument of the previous three problems in the event that the coefficients a_n, b_n, c_n in the recurrence relations are not all real.

Chapter 5. In the following problems, a_n, b_n, c_n are all real, the a_n, c_n being positive; the polynomials y_n(λ) are as given by (4.1.1) and (4.1.5).
1. For some real λ'' and some integral k, let

a_n λ'' + b_n > c_n + c_{n−1}   for n > k.

Show that the sequence y_n(λ''), y_{n+1}(λ''), ..., exhibits at most one change of sign.
2. For some real λ' and all n ≥ 0 let

a_n λ' + b_n + c_n + c_{n−1} < 0.

Show that sign y_n(λ') = (−)^n.
3. Deduce from the recurrence relation for Legendre polynomials, where a_n = 2n + 1, b_n = 0, c_n = n + 1, n ≥ 0, that these polynomials have their zeros in −1 < λ < 1.
4. With the assumptions of Problem 1, show that there is a λ_0 such that y_n(λ) > 0 for all n and λ > λ_0. Deduce that a limiting spectral function arising from (5.2.6) with h = 0 will be constant in λ > λ_0, and that any limiting spectral function will have at most one point of discontinuity in this range.
5. With the assumptions of Problem 1, show that for n > k a polynomial y_n(λ) can have at most k + 1 zeros in λ ≥ λ''. (Hint: Use Theorem 4.3.5.)
6. Let (b_n − c_n − c_{n−1})/a_n → ∞ as n → ∞. Show that there is a discrete spectrum, any limiting spectral function being a pure step function.
7. If the assumptions of both Problems 1 and 2 hold, show that the spectrum is finite and the limit-point case holds.
8. Show that, for the recurrence relation for Hermite polynomials, the limit-point case holds. [We have here a_n = 1/(2^{n−1} n!), b_n = 0, c_n = 1/(2^n n!), and it is known that H_{2n}(0) = (−1)^n (2n)!/n!.]
9. Let polynomials p_n(λ) be defined by p_{−1}(λ) = 0, p_0(λ) = 1, and

p_{n+1}(λ) = (α_n + β_n − λ) p_n(λ) − α_n β_{n−1} p_{n−1}(λ)   (α_n, β_n > 0).

Show that the polynomials are orthogonal with respect to a weight distribution located in 0 ≤ λ < ∞.
10. Show that in the case α_n = n, β_n = n + ν − 1, that of the Laguerre polynomials, the limit-point case holds.
11. Let the moments μ_j as given by (5.9.3) arise from a spectral function for the limit-circle case. Show that

log μ_{2n} > n log n + Bn

for n ≥ 1 and some constant B. [Hint: Carleman's theorem (Titchmarsh, "Theory of Functions," Section 3.7) provides one method.]
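Problem 3 can be checked numerically by the standard Jacobi-matrix route (a different route from the one the problem intends): the zeros of the Legendre polynomial P_m are the eigenvalues of a symmetric tridiagonal matrix, and all lie in (−1, 1).

```python
import numpy as np

# Zeros of the Legendre polynomial P_m as eigenvalues of the symmetric
# Jacobi matrix with off-diagonal entries n / sqrt(4 n^2 - 1).
m = 12
n = np.arange(1, m)
off = n / np.sqrt(4.0 * n**2 - 1.0)
J = np.diag(off, 1) + np.diag(off, -1)
zeros = np.linalg.eigvalsh(J)
assert zeros.min() > -1.0 and zeros.max() < 1.0
```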
Chapter 6. 1. Show that the Green's function g_{rs}(λ) defined in Section 6.4 admits a representation

g_{rs}(λ) = Σ_p y_r(λ_p) y_s(λ_p) ρ_p^{−1} (λ_p − λ)^{−1},

where the λ_p are the zeros of y_m(λ) and ρ_p is given by (4.4.34), with h = 0.
2. Show also that g_{rs}(λ) may be specified as a rational function which tends to zero as λ → ∞, and which differs from z_m(λ) y_r(λ) y_s(λ)/y_m(λ) by a polynomial.
3. Let the matrix polynomials Y_n(λ), Y†_n(λ), Z_n(λ), Z†_n(λ) be defined by the recurrence relations

Y_{n+1} = (A_n λ + B_n) Y_n − Y_{n−1},   Y_{−1} = 0,   Y_0 = E,
Y†_{n+1} = Y†_n (A_n λ + B_n) − Y†_{n−1},   Y†_{−1} = 0,   Y†_0 = E,
Z_{n+1} = (A_n λ + B_n) Z_n − Z_{n−1},   Z_{−1} = E,   Z_0 = 0,
Z†_{n+1} = Z†_n (A_n λ + B_n) − Z†_{n−1},   Z†_{−1} = E,   Z†_0 = 0,

capital letters denoting k-by-k matrices and 0 and E the zero and unit matrices. Show that

( Y_{n+1}  Z_{n+1} )   ( A_n λ + B_n  −E ) ( Y_n      Z_n     )
( Y_n      Z_n     ) = ( E             0 ) ( Y_{n−1}  Z_{n−1} ) .

4. Show also that, with the same notation as in Problem 3, the corresponding identity holds for the adjoint polynomials.
5. Show further that, apart from poles,

Y_n^{−1}(λ) Z_n(λ) = − Σ_{r=0}^{n−1} Y_{r+1}^{−1}(λ) Y†_r^{−1}(λ).

6. Show that, for polynomials R(λ), S(λ), whose coefficients are square matrices, the integral

∫ R(λ) Y_n^{−1}(λ) Z_n(λ) S(λ) dλ,

taken round a closed contour in the positive sense, the contour enclosing all poles of Y_n^{−1}(λ), has the same value for all sufficiently large n, namely, such that 2n exceeds the total of the degrees of the fixed polynomials R(λ), S(λ).
7. Let the k-by-1 column matrices y_n, −1 ≤ n ≤ m, satisfy

y_{n+1} − (A_n λ + B_n) y_n + y_{n−1} = 0   (n ≠ 0),
                                       = u   (n = 0),

with y_{−1} = 0 and y_m + H y_{m−1} = 0, for a square matrix H and a column matrix u. Show that, apart from poles,

y_0 = {Y_m(λ) + H Y_{m−1}(λ)}^{−1} {Z_m(λ) + H Z_{m−1}(λ)} u.

8. Assume A_n, B_n, H Hermitean, A_n > 0. Show that

−{Y_m(λ) + H Y_{m−1}(λ)}^{−1} {Z_m(λ) + H Z_{m−1}(λ)} = ∫_{−∞}^{∞} dτ_{m,H}(μ) (λ − μ)^{−1},

where τ_{m,H}(μ) is the matrix-valued spectral function defined in (6.8.9).
9. Let R(λ), S(λ) be polynomials with matrix coefficients, of total degree less than 2m. With the assumptions of the previous problem, establish the "mechanical quadrature," according to which the integral

∫_{−∞}^{∞} R(μ) dτ_{m,H}(μ) S(μ)

has a value independent of m and of H.
10. With the same restrictions on A_n, B_n, and H, show that

−{Y_m(λ) + H Y_{m−1}(λ)}^{−1} = ∫_{−∞}^{∞} dτ_{m,H}(μ) Y*_{m−1}(μ) (λ − μ)^{−1}.

11. Let A_n, B_n be Hermitean, A_n > 0, and let H satisfy Im H = (2i)^{−1}(H − H*) > 0. Show that the eigenvalues of (6.6.5-6) lie in the lower half-plane.
12. With the assumptions of the previous problem, and with R(λ), S(λ) matrix polynomials of total degree not exceeding 2m − 2, show that (suppressing the λ's in the integral)

∫_{−∞}^{∞} R dτ_{m,H} S = ∫_{−∞}^{∞} R dτ_{m,0} S.

13. Writing F_{n,H} = −{Y_n + H Y_{n−1}}^{−1} {Z_n + H Z_{n−1}}, show that F_{n+1,H}(λ) = F_{n,H'}(λ), where H' = −(A_n λ + B_n + H)^{−1}.
14. Denote by D_m(λ) the set of F_{m,H}(λ) for fixed λ with Im λ > 0, and all possible H with Im H > 0, the A_n, B_n being Hermitean with A_n > 0. Show that, for m ≥ 1, D_m(λ) is a finite region, and that D_{m+1}(λ) ⊂ D_m(λ).
15. Let τ(λ), −∞ < λ < ∞, be a k-by-k Hermitean and nondecreasing matrix function, such that, for all n ≥ 0,

∫_{−∞}^{∞} tr {λ^{2n} dτ(λ)} < ∞,
+
and with an infinity of points of increase [i.e., points such that ~ ( h T(X - c) for all E > 0.1 Show that matrix polynomials of the form P,(h) = h"E
c)
>
+ 2 Cn$ n-1
r=o
satisfying the conditions
jrn P,(h) dT(h)hs -w
0,
S = 0,
..., 71 - 1,
exist and are unique. 16. Show that these polynomials minimize the expression
j-
P,,(X) d+) P,*(h), tr -m for varying C,,, . 17. Show that the PJh) satisfy a three-term recurrence relation. 18. Evaluate the polynomials of Section 6.9 explicitly in the case that c,,, = 1, b,,, = 2, and the anrsare independent of n; verify the oscillation theorem for this special case.
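A scalar sketch of Problems 15-17, with a discrete positive measure assumed for illustration: the Stieltjes procedure builds the monic orthogonal polynomials through the three-term recurrence whose existence Problem 17 asserts, and orthogonality can be checked at each step.

```python
import numpy as np

# Stieltjes procedure for a discrete positive measure: generate monic
# orthogonal polynomials via p_{n+1} = (x - a_n) p_n - b_n p_{n-1} and
# check orthogonality at each step.
rng = np.random.default_rng(1)
x = rng.standard_normal(40)          # support points of the measure
w = rng.random(40) + 0.1             # positive weights

def ip(u, v):
    return np.sum(w * u * v)         # inner product sum_i w_i u(x_i) v(x_i)

p_prev, p = np.zeros_like(x), np.ones_like(x)
b = 0.0
for n in range(5):
    a = ip(x * p, p) / ip(p, p)
    p_next = (x - a) * p - b * p_prev
    assert abs(ip(p_next, p)) < 1e-8 * ip(p, p)   # orthogonal to p_n
    b = ip(p_next, p_next) / ip(p, p)             # next recurrence coefficient
    p_prev, p = p, p_next
```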
Chapter 7.
1. With the assumptions of Section 7.1, if 0 < α < β < 2π, show that the solutions of

u_m(λ) − e^{iα} v_m(λ) = 0,   u_m(λ) − e^{iβ} v_m(λ) = 0

separate one another on the unit circle, in that as λ moves positively between two roots of one, it passes through a root of the other.
2. With the same assumptions, and 0 < n < m, show that the solutions of u_n(λ) − e^{iα} v_n(λ) = 0 are separated by those of u_m(λ) − e^{iα} v_m(λ) = 0.
3. With the same assumptions, and |ρ| < 1, prove that the zeros of u_n(λ) − ρ v_n(λ), for varying λ, lie outside the unit circle.
4. Prove that the zeros of u_n(λ) lie inside the unit circle, as a consequence of the orthogonality (7.3.11). [Hint: In the contrary event, a polynomial of degree less than n and not orthogonal to u_n(λ) could be constructed.]
5. Discuss whether the u_n(λ), v_n(λ) are determined, or determined apart from constant factors, by a knowledge of the eigenvalues of the two boundary problems of Problem 1, and whether the resulting recurrence relations satisfy the assumptions of Section 7.1.
6. Prove that the zeros of u_n(λ) − h u_{n−1}(λ), where h is any constant, lie in the circle |λ|² < 2, with at most one exception.
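The conclusion of Problem 4 can be observed numerically; the discrete measure on the circle below is an assumption of this sketch, and the monic orthogonal polynomial is obtained from the normal equations rather than from a recurrence.

```python
import numpy as np

# Monic polynomial phi_n(z) = z^n + sum_{j<n} c_j z^j orthogonal, for a
# positive discrete weight on |z| = 1, to 1, z, ..., z^{n-1}; its zeros
# lie strictly inside the unit circle.
rng = np.random.default_rng(2)
z = np.exp(1j * rng.uniform(0, 2*np.pi, 60))   # support points on the circle
w = rng.random(60) + 0.1                       # positive weights

n = 5
G = np.array([[np.sum(w * z**j * np.conj(z)**s) for j in range(n)] for s in range(n)])
rhs = -np.array([np.sum(w * z**n * np.conj(z)**s) for s in range(n)])
c = np.linalg.solve(G, rhs)                    # c = (c_0, ..., c_{n-1})
roots = np.roots(np.concatenate(([1.0], c[::-1])))
assert np.all(np.abs(roots) < 1.0)
```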
The following examples relate to the situation in which the orthogonality has been transferred to the real axis. It is to be understood that the weight distributions on the real axis satisfy such bounds as ensure the absolute convergence of the integrals which occur. Orthogonality is in the complex sense, in that u(λ) is orthogonal to v(λ) if the integral of u(λ) v̄(λ), with respect to the weight distribution in question, vanishes. Unless otherwise indicated, the weight-distribution functions σ(λ), τ(λ) will be real-valued and nondecreasing.
7. If σ(λ) is not a constant, and ω_1, ω_2 are complex, with imaginary parts of opposite signs, show that

∫_{−∞}^{∞} (λ − ω_1)^{−1} (λ − ω_2)^{−1} dσ(λ) ≠ 0,
∫_{−∞}^{∞} (λ − ω_1)(λ − ω_2) dσ(λ) ≠ 0.
8. If Im ω < 0 and σ(λ) has at least n points of increase, show that there is a unique polynomial p_n(λ), of precise degree n and with unit coefficient of λ^n, which is orthogonal, with respect to σ(λ), to all polynomials q(λ) of degree at most n which vanish when λ = ω.
9. Show that the polynomial p_n(λ) just specified has all its zeros in Im λ > 0.
10. The numbers ν_r, r = 0, ..., m, are complex, all distinct, and such that Im ν_r < 0. The nondecreasing function τ(λ) has at least m points of increase. Show that, for 0 ≤ p < m, a linear combination η(λ) of (λ − ν_s)^{−1}, 0 ≤ s ≤ p, which is orthogonal to these functions, in the sense that

∫_{−∞}^{∞} η(λ) (λ − ν_s)^{−1} dτ(λ) = 0,   s = 0, ..., p,

must vanish identically.
11. Show that, with the same assumptions and 0 ≤ p < m, there exist unique linear combinations

η_p(λ) = Σ_{s=0}^{p} γ_{ps} (λ − ν_s)^{−1}

such that η_p(λ) is orthogonal in the above sense to (λ − ν_s)^{−1}, s = 0, ..., p − 1.
12. Show that η_p(λ) as just defined has its zeros in Im λ > 0.
13. Show that the coefficients γ_{ps} in Problem 11 are not zero.
14. Defining η_n(λ) as in Problem 11, and u_n(λ), v_n(λ) accordingly, show that there hold recurrence relations of the corresponding form. [Hint: As in Section 7.4, consider the orthogonality properties of v_n(λ).]
15. Extend the results of the preceding problems to the case when the ν_r are not all distinct, orthogonalizing the functions

(λ − ν_0)^{−1},   {(λ − ν_0)(λ − ν_1)}^{−1},   ...,   Π_{s=0}^{p} (λ − ν_s)^{−1},   ... .

16. Deduce orthogonality relations from the recurrence relations of Problem 14 for suitably restricted a_n, b_n, considering both the case of a finite set of recurrence relations, and an infinite set.
17. Let the ν_r, r = 0, 1, ..., satisfy Im ν_r < 0, and let τ(λ) have the form given by τ'(λ) = |χ(λ)|², where χ(λ) is a rational function whose numerator and denominator are of precise degree N > 0, having N zeros in the upper half-plane, not necessarily all distinct, and having poles at the ν_r; in the event of the ν_r, r = 0, ..., N − 1, not being all distinct, χ(λ) is to have poles at these points of the corresponding multiplicity. Let the u_n(λ) be defined, apart from constant factors, by orthogonalizing with respect to τ(λ) the functions of Problem 15. Show that, for n ≥ N, u_n(λ) can be given explicitly, apart from a constant factor.
18. Let the ν_r, r = 0, 1, ..., be an infinite sequence of complex numbers, of which an infinity are distinct, and all of which lie in a bounded closed set in the upper half-plane, not meeting the real axis. Show that if h(λ) is defined and continuous on the real axis, is of order O(λ^{−1}) as λ → ±∞, and such that

∫_{−∞}^{∞} h(λ) (λ − ν_s)^{−1} dλ = 0

for s = 0, 1, ..., then h(λ) = 0.
19. Show that the functions of Problem 15 are mutually orthogonal on the real axis, with respect to the weight dλ (cf. Section 2.7).
20. Discuss the approximation by means of the latter functions to a function h(λ), analytic and continuous in the closed upper half-plane, including the real axis, and of order O(λ^{−1}) as |λ| → ∞ in the upper half-plane.
21. Discuss the factorization of a function w(λ), defined, not zero, and asymptotic to a constant for large λ on the real axis, into factors which are analytic in the upper and lower half-planes, represented, respectively, in these half-planes by absolutely convergent series of the functions of Problem 15, and their conjugates.
Chapter 8.

1. Let p(x), q(x) be continuous for a ≤ x ≤ b, with p(x) > 0 in a < x < b, and let y_n(x), λ_n, n = 0, 1, ..., be the eigenfunctions and eigenvalues, assumed discrete, of the boundary problem

y″ + (λp + q)y = 0,  y(a) = y(b) = 0,

with λ₀ < λ₁ < ... . Show that, for real a_n not all zero, an expression

Σ_n (λ_n − λ₀) a_n y_n(x)

has at least as many distinct zeros in a < x < b as does

Σ_n a_n y_n(x)

(Liouville). (Hint: Writing z for the last expression, apply Rolle's theorem first to z/y₀, and then to z′y₀ − z y₀′.)
2. By repeated application of the above argument show that Σ_{k=m}^{n} a_k y_k(x) has not more than n and not less than m distinct zeros in a < x < b, the y_n(x) forming in particular a Markov (or Čebyšev) system (Sturm, Liouville).
3. Extend the above reasoning (i) to the situation of Section 8.7, (ii) to the system (8.1.2-3) under the assumptions of Section 8.1. (Note: In the latter case conventions are necessary regarding the counting of intervals of zeros.)
4. Let p, q be continuous in [a, b] and p positive in (a, b), and let u(x, λ), v(x, λ) be solutions of y″ + (λp + q)y = 0 such that u(a, λ) = 0, u′(a, λ) = 1 and v(b, λ) = 0, v′(b, λ) = 1. Let λ₀, λ₁, with λ₀ < λ₁, be consecutive zeros of u(x, λ) v(x, λ), for some fixed x, a < x < b. Show that there is a zero of u(b, λ) in the interval λ₀ < λ < λ₁.
5. With p, q as in the last question, let y_n(x), λ_n, n = 0, 1, ... be the eigenfunctions and eigenvalues of

y″ + (λp + q)y = 0,  y(a) cos α = y′(a) sin α,  y(b) cos β = y′(b) sin β,

for fixed real α, β, the y_n being normalized so that ∫_a^b y_n² p dx = 1. Prove that

Σ_n (1 + λ_n²)⁻¹ |y_n′(x)|² ≤ c

for some finite c independent of x.
6. Use the latter result to consider the validity of the eigenfunction expansion when formally differentiated.
7. Let p(x), q(x) be continuous and real-valued for x ≥ 0, p(x) being also positive and of bounded variation on the semiaxis (0, ∞), with p(∞) > 0, and q(x) being absolutely integrable over (0, ∞). Let y(x, λ) be the solution of

y″ + (λp + q)y = 0,  y(0, λ) = 0,  y′(0, λ) = 1.

Define θ(x, λ), y₁(x, λ) by y = y₁ sin θ, y′ = (λp)^{1/2} y₁ cos θ, subject to y₁ > 0, θ(0, λ) = 0, both θ and y₁ being continuous. Show that, for fixed λ > 0, the functions

θ(x, λ) − ∫₀^x (λp)^{1/2} dt,  y₁(x, λ)

tend to constants (dependent on λ) as x → ∞.
8. With the above assumptions and notation, show that, for λ > 0,

∂θ/∂λ = (2λ)⁻¹ sin θ cos θ + y₁⁻² (λp)^{−1/2} ∫₀^x p(t){y(t, λ)}² dt.
9. With the assumptions of the previous two problems and any b > 0, let λ_n(b), n = 0, 1, ... be the eigenvalues of the boundary problem obtained by setting y(b, λ) = 0, and τ_b(λ) the corresponding spectral function. Show that, for fixed λ′, λ″, with 0 < λ′ < λ″, and as b → ∞, we have

τ_b(λ″) − τ_b(λ′) → π⁻¹ {p(∞)}^{−1/2} ∫_{λ′}^{λ″} λ^{1/2} {y₁(λ)}⁻² dλ,

where p(∞) = lim p(x), y₁(λ) = lim y₁(x, λ) as x → ∞.
10. Extend the last result to the case of a differential equation with a finite or denumerable set of discontinuities of the type considered in Section 8.7.
11. By a change of variable or otherwise, establish oscillation and expansion theorems for the system

u′ = s(x)u + r(x)v,  v′ = − {λp(x) + q(x)}u − s(x)v.

12. Show that a change of independent variable eliminates discontinuities of the form

y(ξ + 0) = γ y(ξ − 0),  y′(ξ + 0) = γ⁻¹ y′(ξ − 0),

for positive constants γ.
13. Discuss the effect on second-order boundary problems of discontinuities of the form

y(ξ + 0) = y′(ξ − 0),  y′(ξ + 0) = − y(ξ − 0).

14. Show that a real 2-by-2 symplectic matrix may be expressed as a product of a finite number of factors of the form exp(JA_r), where

J = ( 0  1 ; −1  0 )

and the A_r are symmetric. Determine the minimum number of such factors which is always sufficient (cf. Chapter 3, Problem 9).
15. Show that a discontinuity for first-order systems, of the form

u(ξ + 0) = α u(ξ − 0) + β v(ξ − 0),
v(ξ + 0) = γ u(ξ − 0) + δ v(ξ − 0),

where the matrix ( α  β ; γ  δ ) is symplectic, may be replaced by a finite sequence of constant-coefficient differential equations over finite intervals.
16. Let p_{jk}(x), q_j(x), j, k = 1, 2, ..., l, be continuous and real-valued for 0 ≤ x ≤ b. Writing p_j(x) for the row matrix formed by p_{j1}(x), ..., p_{jl}(x), λ for a column matrix formed by scalars λ₁, ..., λ_l, let y₁(x, λ), ..., y_l(x, λ) be solutions of

y_j″ + (p_j λ + q_j) y_j = 0,  j = 1, ..., l,

with initial data y_j(0, λ) = 0, y_j′(0, λ) = 1. Define continuous phase and amplitude variables θ_j(x, λ), r_j(x, λ) by

y_j = r_j sin θ_j,  y_j′ = r_j cos θ_j,  r_j > 0,  θ_j(0, λ) = 0.

Write x for a set x₁, ..., x_l of numbers in [0, b] and

y(x, λ) = y₁(x₁, λ) y₂(x₂, λ) ... y_l(x_l, λ),
r(x, λ) = r₁(x₁, λ) r₂(x₂, λ) ... r_l(x_l, λ),
ρ(x) = det (p_{jk}(x_j)),  j, k = 1, ..., l.

Writing θ(b, λ) for the vector θ₁(b, λ), ..., θ_l(b, λ), show that the Jacobian determinant

∂θ/∂λ = {r(b, λ)}⁻² ∫ {y(x, λ)}² ρ(x) dx,

where dx = dx₁ ... dx_l. Deduce the multi-parameter oscillation theorem, assuming ρ(x) positive for all x.
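Problem 14 above rests on the fact that each factor exp(JA) with A symmetric is itself symplectic, i.e. MᵀJM = J. The following sketch (not from the original text; the particular symmetric matrix A is an arbitrary test choice, and the factorization itself is not attempted) checks this numerically with a truncated power series for the 2-by-2 matrix exponential:

```python
# Check that M = exp(J*A) satisfies M^T J M = J when A is symmetric and
# J = [[0, 1], [-1, 0]].  2-by-2 matrices as nested tuples; pure Python.

def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))

def scale(c, X):
    return tuple(tuple(c * X[i][j] for j in range(2)) for i in range(2))

def expm(X, terms=40):
    # Truncated series sum of X^n / n!; ample accuracy for small 2-by-2 X.
    result = ((1.0, 0.0), (0.0, 1.0))
    term = ((1.0, 0.0), (0.0, 1.0))
    for n in range(1, terms):
        term = scale(1.0 / n, mul(term, X))
        result = add(result, term)
    return result

J = ((0.0, 1.0), (-1.0, 0.0))
A = ((0.7, -0.3), (-0.3, 1.2))   # arbitrary symmetric matrix
M = expm(mul(J, A))

T = tuple(zip(*M))               # transpose of M
S = mul(mul(T, J), M)            # should reproduce J
print(all(abs(S[i][j] - J[i][j]) < 1e-9 for i in range(2) for j in range(2)))
```

The underlying identity is (JA)ᵀJ + J(JA) = 0 for symmetric A and Jᵀ = −J, so the quadratic form wᵀJw is invariant under the flow w′ = JAw.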
Chapter 9.

1. Calculate the spectral function, as defined in (9.3.26), for the system with the boundary conditions (i) u(0) = v(b) = 0, or (ii) u(0) = u(b) = 0.
2. Show that, as b → ∞, the spectral functions of the last question tend, respectively, to the limits
3. Show that, with the system of Problem 1 and the initial condition v(0) = hu(0), and a homogeneous condition at x = b such as u(b) = 0, the limit as b → ∞ of the spectral function is
Deduce (i), (ii) of Problem 2 as particular cases.
4. For the same system, and the conditions u(0) = u(b), v(0) = v(b), show that the limit as b → ∞ of the spectral function is the mean of that given in (i), (ii) of Problem 2.
5. Calculate the limit, as b → ∞, of the spectral function of the system with the boundary conditions u(0) = u(b), v(0) = v(b).
6. Determine the limiting spectral function for the same system, with the boundary condition
where U is any fixed 2-by-2 symplectic matrix.
7. Set up the eigenfunction expansion for the first-order system

u′ = (λ + p)v,  v′ = − (λ + q)u,  0 ≤ x ≤ b,

where p(x), q(x) are real-valued and continuous in 0 ≤ x ≤ b, with boundary conditions such as u(0) = u(b) = 0.
8. For the system of the previous problem, show that the eigenvalues λ_n,
n = 0, ±1, ..., may be numbered in ascending order on the real axis so that the corresponding eigenfunctions u_n(x), v_n(x) are such that u_n(x) has at least |n| − 1 zeros in the interior of (0, b). (Hint: The polar coordinate transformation tan θ = u/v may be used.)
9. Suppose that the first-order differential equations of Problem 7 hold, except for isolated points ξ where u undergoes a saltus

u(ξ + 0) − u(ξ − 0) = (λα + β) v(ξ),

where α > 0 and v is continuous at ξ, or for points η at which

v(η + 0) − v(η − 0) = − (λγ + δ) u(η),

with γ > 0 and u continuous; u and v are not to be discontinuous at the same x-value. Show that an eigenfunction expansion holds. (Hint: As in Chapter 8, such a saltus may be replaced by a differential equation over a segment of the x-axis.)
10. For the system of Problem 7, considered over (0, ∞), show that for every complex λ there is a nontrivial solution such that

∫₀^∞ { |u(x)|² + |v(x)|² } dx < ∞.

11. Show also that for every λ there is a solution for which the last inequality fails, provided that
or in particular if p, q ∈ L(0, ∞).
12. Let u, v be solutions of the system of Problem 7 with u(0) = 0, v(0) = 1. Show that

(∂/∂λ) tan⁻¹ (u/v) = (u² + v²)⁻¹ ∫₀^x (u² + v²) dt.
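As a consistency check on the identity of Problem 12 (this check is supplied here, not part of the original text), take the special case p ≡ q ≡ 0, so that u = sin λx, v = cos λx; both sides then reduce to x:

```latex
\frac{\partial}{\partial\lambda}\tan^{-1}\frac{u}{v}
  = \frac{\partial}{\partial\lambda}(\lambda x) = x,
\qquad
(u^2+v^2)^{-1}\int_0^x (u^2+v^2)\,dt = \int_0^x dt = x .
```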
13. Use the latter result to connect the limiting form as b → ∞ of the spectral function for the problem u(0) = u(b) = 0, with the limiting value of the "amplitude" √(u² + v²), assuming that p, q ∈ L(0, ∞) (cf. Problem 9, Chapter 8).
14. Set up the eigenfunction expansion for the fourth-order equation of (9.1.16), taken over a finite interval (0, b), with p₂, p₀ positive, and with boundary conditions u = u′ = 0 at x = 0, b. [Hint: The first-order system form is given in (9.1.19), and suitable boundary matrices are reproduced in Chapter 10, Problem 9.]
15. Discuss the eventuality of p₀ in (9.1.14) exhibiting singularities of delta-function type. [Hint: One method is to allow all entries in the matrix on the right of (9.1.19) to vanish, except the leading one.]
16. For a piecewise continuous vector function φ(t), and with the notation of Sections 9.3-4, consider the validity in the pointwise sense of the expansion
provided that λ is not an eigenvalue.
17. By applying the latter formula with φ(t) = K*(x, t, λ)u, with any constant vector u, consider whether equality holds in (9.7.1).
18. Show that equality holds in (9.7.11), except possibly for a linear term, i.e., that

Im F_{M,N}(λ) = ∫_{−∞}^{∞} Im (λ − μ)⁻¹ dτ_{M,N}(μ) + t Im λ.

19. Deduce that, according to the Stieltjes inversion, or Titchmarsh-Kodaira formula, for the present discrete spectrum,

τ_{M,N}(β) − τ_{M,N}(α) = − π⁻¹ lim_{ε→+0} ∫_{α+iε}^{β+iε} Im F_{M,N}(λ) dλ,
except when α, β are eigenvalues.
20. Consider the effect on these formulas of the limiting transitions b → ∞, or a → −∞, b → ∞.

Chapter 10.

In the following problems, capital letters denote k-by-k matrices, in general functions of the real variable x, of which P, Q, R, A, B are Hermitean and, as necessary, continuous; J is to be skew-Hermitean and constant, and M, N such that M*JM = N*JN and that M, N have no common null vectors. The unit and zero matrices are denoted E, 0. Lower-case letters will denote k-by-1 column matrices or scalars, except that θ will be a square matrix.

1. Let U, V, W, X be solutions of the differential equations

U′ = RV,  V′ = − QU,  W′ = RX,  X′ = − QW,

with the initial conditions U(a) = 0, V(a) = E, W(a) = E, X(a) = 0. Show that U(x)W*(x) is Hermitean.
2. In the definitions of the previous problem let Q be replaced by (λP + Q), where P is positive-definite and λ is complex. Show that, for a < x ≤ b, Im VU⁻¹ and Im U⁻¹W have the opposite sign to Im λ.
3. Show that, with P > 0 for a ≤ x ≤ b, the matrix θ(x) defined by (10.2.17) has its eigenvalues inside the unit circle when Im λ > 0, a < x ≤ b.
4. Show that, for real λ, the matrix (W − iU)⁻¹(W + iU) is unitary. Determine the manner in which its eigenvalues vary with λ.
5. For the scalar equation (y″/r)″ − (qy′)′ − py = 0, a ≤ x ≤ b, with positive continuous p, q, r, let η₁ be the first right-conjugate point for the problem y(a) = y″(a) = 0, y′ = (y″/r)′ = 0. Show that
6. With reference to the previous problem, show that the effect of replacing p(x) by a greater function is to decrease η₁. (Hint: Introduce a parameter and use Theorem 10.2.3.)
7. For the situation of Section 10.8, with for simplicity JJ* = E, and N(α) given by (10.8.14), let there be an α such that N*(α)B(x)N(α) ≥ 0 for all x. Show that, as x increases, one of the ω_c(x) can neither decrease to a value of the form α + 2nπ, nor decrease from such a value. [Hint: The differential equation (10.8.15) implies, for ω_c near to α, a bound of the form dω_c/dx ≥ − const |ω_c − α|.]
8. Consider, for the scalar equation (y″/r)″ − (qy′)′ − py = 0, with positive and continuous p, q, r, the conjugate point problem y(a) = y′(a) = 0, y(x) = y′(x) = 0. Show that such conjugate points have no finite limit-point. [Hint: Take two independent solutions satisfying y(a) = y′(a) = 0 and consider their Wronskian.]
9. Show that the conjugate point problem of the last question may be put in the form (10.7.1-2) by setting

y₁ = y,  y₂ = y′,  y₃ = (y″/r)′ − qy′,  y₄ = y″/r,

with constant matrices J, M, N(α) and a Hermitean B(x) formed from p, q, 1/r and from cos α, sin α, and verify that

M*JM = N*JN = 0,
and that N*(α)BN(α) ≥ 0 if cos α = 0 and p ≥ 0, q ≥ 0, or again if sin α = 0 and r > 0.
10. Defining θ(x) by (10.7.4-7) for the case of the last question, and supposing exp(iα) to be an eigenvalue of θ(x), show that there is a nontrivial solution of the fourth-order scalar equation such that y(a) = y′(a) = 0, while at the point x we have

cos α · y = − sin α {(y″/r)′ − qy′},  cos α · y′ = sin α · y″/r.

Show conversely that if such a solution exists, then exp(±iα) are eigenvalues of θ(x).
11. For θ(x) as in the previous problem, show that θ(a) has as its eigenvalues 1, −1, i, −i.
12. For (y″/r)″ − (qy′)′ − py = 0, with r > 0 and continuous p, q, r, let η₁, η₂, ... be in increasing order the conjugate points defined by the problem
y(a) = y′(a) = 0,  y = y′ = 0 for x = η_n,

and let ξ₁, ξ₂, ... be the conjugate points for the problem

y(a) = y′(a) = 0,  y″ = (y″/r)′ − qy′ = 0 for x = ξ_n.

Show that an x-interval which contains m of the η_n also contains in its interior at least m − 2 of the ξ_n. If also q > 0, r > 0, show that a reciprocal property holds with the ξ_n, η_n interchanged.
13. Consider also similar properties concerning boundary conditions of the general form mentioned in Problem 10.
14. Show that η₁ satisfies, if it exists, the bound
15. Show that if the functions p, q, r are increased, the η_n, ξ_n are diminished.
16. Consider the eigenvalue problem

(y″/r)″ − (qy′)′ − (λp₀ + p)y = 0  (a ≤ x ≤ b),

where p, p₀, q, r are continuous functions, and p₀, r are positive, the boundary conditions being either (i) y(a) = y′(a) = 0, y(b) = y′(b) = 0, or (ii) y(a) = y′(a) = 0, y″ = (y″/r)′ − qy′ = 0 for x = b. Show that a λ-interval which contains m eigenvalues of one of these problems will contain in its interior at least m − 2 eigenvalues of the other.
17. Show that the boundary problems of the previous problem have at most a finite number of negative eigenvalues. [Hint: This well-known fact may
be proved by considering the motion of the eigenvalues of θ(x, λ) as x increases from a to b.]
18. Show that the boundary problems of Problem 16 have an infinity of real eigenvalues, the nth in ascending order admitting, for large n, a lower bound of the form const · n⁴. [Hint: In the differential equation take xλ^{1/4} as a new independent variable and use the result of Problem 14 with b for η₁. The existence of an infinity of eigenvalues may be deduced from the eigenfunction expansion or, more primitively, by applying comparison principles (cf. Problem 15) to the given equation and suitable constant-coefficient equations.]
19. Consider the problem of Problem 16 under the assumption that p₀ changes sign a finite number of times in (a, b). Show that there is an infinity of real eigenvalues which, if numbered in order on the real axis as λ_n for all positive and negative n, are of order at least n⁴ in absolute value.
20. If, in the situation of Sections 10.7-8, M*JM = N*JN = 0, show that θᵀJθ = 0, and deduce that the eigenvalues of θ fall into pairs of the form ω, −ω.
21. Extend the theory of Problems 9 to 12 to the sixth-order equation

(u‴/s)‴ + (u″r)″ − (u′q)′ + up = 0.
22. Consider the boundary problems of Problem 16 for the equations

(y″/r)″ − {(λq₀ + q₁)y′}′ − py = 0,
{y″/(λr₀ + r₁)}″ − (qy′)′ − py = 0,
in regard to the existence, sign and order of magnitude of the eigenvalues. Consider also the effect on the eigenvalues of increasing the coefficient functions q₀, r₀.
23. Consider the oscillatory properties of the system (7.10.1), written according to Section 10.7 in first-order form, with boundary matrices
or with M = N = E.
24. Consider the eigenvalue problem

(y″/r)″ − (λp₀ + p)y = 0,  a ≤ x ≤ b,

with boundary conditions y(a) = y′(a) = y(b) = y′(b) = 0, where r, p, p₀
are continuous and r, p₀ are positive. Show that there are no eigenvalues for which λp₀ + p is negative for a ≤ x ≤ b.
25. Show that the number of negative eigenvalues of the last problem is equal to the number of conjugate points η in a < η < b for the problem

(y″/r)″ − py = 0,  y(a) = y′(a) = y(η) = y′(η) = 0.

26. Consider the conjugate point problem u′ = Qv, v′ = − Qu, u(a) = u(η) = 0, where u is a k-by-1 column matrix and Q is a Hermitean continuous positive-definite matrix in (a, ∞). If also the entries in Q(x) are absolutely integrable over (a, ∞), show that the total number of such conjugate points lies between

π⁻¹ ∫_a^∞ tr Q(t) dt − k  and  π⁻¹ ∫_a^∞ tr Q(t) dt.

27. Consider the eigenvalue problem iy′ = λA(x)y, a ≤ x ≤ b, where A is a k-by-k Hermitean, positive-definite, and continuous matrix, with boundary condition y(b) = Ny(a), where N is fixed and unitary. Show that, if the eigenvalues are numbered in ascending order, then as n → ±∞,

λ_n ∫_a^b tr A(x) dx ~ 2nπ.
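In the scalar case k = 1 the assertion of Problem 27 can be seen in closed form: iy′ = λa(x)y gives y(x) = y(a) exp(−iλ ∫_a^x a dt), and the boundary condition y(b) = e^{iγ} y(a) fixes the eigenvalues up to a bounded shift, so that λ_n ∫_a^b a dx differs from 2nπ by a constant. A numerical sketch of this special case (the weight a(x) = 1 + x and the phase γ are arbitrary test choices, not from the text):

```python
# Scalar instance of the asymptotics lambda_n * integral(tr A) ~ 2*n*pi:
# i y' = lambda * a(x) * y on [0, 1], y(1) = exp(i*gamma) * y(0), a(x) = 1 + x.
import math

gamma = 0.7
A_int = 1.5                       # integral of a(x) = 1 + x over [0, 1]

def eigenvalue(n):
    # exp(-i * lambda * A_int) = exp(i * gamma) gives lambda explicitly
    return -(gamma + 2 * math.pi * n) / A_int

for n in (10, 100, 1000):
    lam = eigenvalue(-n)          # relabelled so the eigenvalues ascend
    print(n, lam * A_int / (2 * math.pi * n))   # ratio tends to 1
```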
Chapter 11.

1. Let σ(x) be of bounded variation, and let y₁, y₂ be two linearly independent solutions of
Show that between two zeros of y₁(x) there lies a zero of y₂(x).
2. Let σ₁(x), σ₂(x) be of bounded variation, and let σ₂(x) − σ₁(x) be strictly increasing. Let y_r(x), r = 1, 2, be a solution of
neither of y₁(x), y₂(x) vanishing identically. Show that between two zeros of y₁(x) there lies at least one zero of y₂(x).
3. Establish similar results for the equations

[κ_r y_r′] + ∫ y_r(t) dσ_r(t) = 0,  r = 1, 2,
559
PROBLEMS
4. Let σ(x) be right-continuous and of bounded variation over a ≤ x ≤ b. Show that the problem

[ψ′] = ∫ ψ(t) dσ(t) + ∫ φ(t) dt,

with continuous φ(t), to hold over any subinterval of [a, b], with boundary conditions ψ(a) = ψ(b) = 0, is soluble by means of a Green's function G(x, t) in the form

ψ(x) = ∫_a^b G(x, t) φ(t) dt,

provided that the same problem with φ(t) = 0 has only the trivial solution ψ(t) = 0.
5. Extend the latter result to an arbitrary pair of homogeneous boundary conditions, allowing σ(x) to have discontinuities at x = a and at x = b.
6. Show that the boundary problem of Problem 4 is soluble for ψ if and only if φ(t) is orthogonal to any solution of the same problem with φ(t) = 0.
7. Let y(x) ≢ 0 satisfy, for all real x₁, x₂, the relation

y′(x₂) − y′(x₁) + ∫_{x₁}^{x₂} y dσ(x) = 0,

where σ(x) is nondecreasing, and let a, b be consecutive zeros of y(x). Show that

(b − a){σ(b) − σ(a)} ≥ 4.
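The Lyapunov-type bound of Problem 7 can be checked directly on one-jump step functions: with y(a) = 0, unit slope up to a jump of height h at c, where the slope drops by h·y(c), the requirement y(b) = 0 forces h = (b − a)/{(c − a)(b − c)}, so that (b − a){σ(b) − σ(a)} = (b − a)²/{(c − a)(b − c)} ≥ 4, with equality only for c at the midpoint. A short numerical sketch of that computation (the interval [0, 1] is an arbitrary test choice, not from the text):

```python
# Test the bound (b - a) * {sigma(b) - sigma(a)} >= 4 on step functions with
# one jump at c: y is piecewise linear with y(0) = 0, slope 1 on [0, c], and
# slope 1 - h*y(c) afterwards; forcing y(1) = 0 determines the jump height h.
a, b = 0.0, 1.0
products = []
for i in range(1, 100):
    c = i / 100.0
    h = (b - a) / ((c - a) * (b - c))   # jump height forcing a zero at b
    products.append((b - a) * h)

print(min(products))   # minimum over c, attained at the midpoint c = 0.5
```

The minimum printed is exactly 4, in line with Problem 8 below.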
8. Show that the equality can be attained in the last result by taking σ(x) to be a step function with one jump in a < x < b.
9. The function σ(x) is nondecreasing for all real x, and σ(x + 1) − σ(x) is a positive constant. If the equation of the previous problem but one has a nontrivial solution of period 1, show that σ(x + 1) − σ(x) ≥ 4. Show also that equality may hold in this last result.

In the following problems all functions occurring are to be of bounded variation and right-continuous. The Stieltjes integral is to be understood in a mixed sense, the limit being over increasingly fine subdivisions a = τ₀ < τ₁ < ... < τ_n = b, max (τ_{r+1} − τ_r) → 0.
10. Show that the integral just defined exists, the limit being independent of the choice of subdivisions.
11. Show that, for the integral just defined, the integration-by-parts formula

∫_a^b y dz + ∫_a^b z dy = y(b)z(b) − y(a)z(a)

holds.
12. Show that the integral equation

y(x) − y(a) = ∫_a^x dG(t) y(t)

has a unique solution, for given y(a), G(t) being a given matrix function and y(x) a vector function to be found.
13. Prove the same for the integral equation

z(b) − z(x) = − ∫_x^b z(t) dG(t),  a ≤ x ≤ b,

where z(x) is a row matrix, and verify that z(x)y(x) is constant, where y(x) is given in the previous problem.
14. Define the square matrix Y(x, u) as the solution of

Y(x, u) = E + ∫_u^x dG(t) Y(t, u).

Show that if the discontinuities ΔG of G(t) are such that (E + ΔG) is nonsingular, then Y(x, u) is nonsingular. Denoting its inverse, if it exists, by Y(u, x), show that

Y(u, x) − E = − ∫_u^x Y(u, s) dG(s).
15. Show that the unique solution for y(x), with given y(a), of

y(x) − y(a) = ∫_a^x dG(t) y(t) + f(x) − f(a)

is given by

y(x) = Y(x, a) y(a) + ∫_a^x Y(x, t) df(t).

16. Show that the number of linearly independent solutions of the equation of Problem 12, with y(a) indeterminate, such that J₁y(a) + J₂y(b) = 0, is the same as the number of linearly independent solutions of the equation of Problem 13, such that for some row matrix v we have z(a) = − vJ₁, z(b) = vJ₂.
17. If in the equation of Problem 12 we have G(t) = λA(t) + B(t), and the boundary conditions of the previous problem are imposed, show that the eigenvalues either cover the whole plane, or else have no finite limit-point.
Establish the bi-orthogonality relations for eigenfunctions of the two integral equations, corresponding to distinct eigenvalues.
Chapter 12.

1. Let σ(x), 0 ≤ x < ∞, be right-continuous and of bounded variation over (0, ∞), and let y, for 0 ≤ x < ∞, be a solution of

[y′] + ∫ {y dx + y³ dσ} = 0

[to hold over any (x₁, x₂), 0 ≤ x₁ < x₂ < ∞]. Show that for some δ > 0 a solution with initial data satisfying

0 < |y(0)| + |y′(0)| < δ

admits an asymptotic approximation, as x → ∞,

y(x) − a_∞ cos x − b_∞ sin x → 0,

with constants a_∞, b_∞ not both zero. [Hint: The method of Section 12.1 may be used, it being necessary to show that p(x), q(x) [see (12.1.15-16)] remain bounded as x → ∞, if they start off small enough when x = 0.]
2. With σ(x) as in the previous problem, consider the solutions of

[y′] = ∫ {y dx + y³ dσ}.
Show that for some δ > 0, and solutions such that |y(0)| < δ, corresponding values of y′(0) can be selected so that y(x) → 0 as x → ∞.
3. Let g(t) be non-negative and integrable over (0, a), and let λ = ν be an eigenvalue of the problem

y″ + (1 + λg(x))y = 0,  y(0) = 0,  y′(a) + iy(a) = 0.

Prove that

|ν|⁻² Im ν ≤ ∫₀^a sin² t g(t) dt.

4. Prove that the result of the previous problem cannot be improved by inserting on the right a constant factor less than unity and independent of g(t).
5. Let y(x), 0 ≤ x ≤ π, be the solution of
562
PROBLEMS
where u(t) is real-valued, of bounded variation and right-continuous. For real k > 0, define 8(x) by
W ) = 0,
= arg
{Mx)
+ y'(x)},
where 8(x) is to be continuous with y ' ( x ) ; at a jump of y ' ( x ) , the change in 8(x) is to be determined by continuous variation along the straight line joining iky(x) y'(x f 0). Show that
+
I 8(x)
I
- kx
< k-l
jzI du(t) 1. 0
6. With the assumptions of the previous problems let k , > 0 be the value of k , if such exists, such that O(n)= nn, for positive integral n. Show that k,
< B.
+ 1/{P+
wh},
where w = J* I du(t) I. Show further that if n2 0 does exist, and satisfies the lower bound
k, 3
$ + d{@
> 4+,
then such a k,
- w/n}.
7. With the assumptions of the previous problem show that k , > 0, if it exists, satisfies the upper bound k, < n wl(nn).
+
Show also that there is no fl < 1, independent of u and n, such that there holds in all cases the bound k , n / ? w / ( n n ) .[Hint: Consider the case when u ( t ) is a step function with one jump.] 8. Let u(x) be real-valued, right-continuous, and of bounded variation over 0 x 00, and let y ( x ) = y(x, k ) be defined as in Problem 5. Writing y k for aylak, y; for a?/ax ak, show that
< +
< <
r;r
- YkY' =
-
2k
j: {rct,k))2
9. Let the assumptions of the previous problem hold, and let 8(x) = 8(x, k ) be defined as in Problem 5 . Define also yl(x, k ) , for real k > 0, by rl > 0, y = y1 sin 8, y' = k ~ ,cos 8. Writing 8 k for a8/ak, show that 8k =
2{r1(x, k)}-2
jz{y(t,k))2 dt + k-' 0
sin 6 cos 8.
10. With the assumptions of the previous two problems, for b > 0 and sufficiently large positive integral n let k,(b) be the value of k for which 8(b, k ) = n ~ so, that A, = k: is that eigenvaiue of
563
PROBLEMS
such that y has tl - 1 zeros in 0 < x < b. Show that as b -+ + the A,,@) become arbitrarily dense in any interval on the real and positive 00,
k-axis. If in particular b -+ 00, and n a positive limit k, show that
-+
00
in such a way that k,(b) tends to
I”
{k,+l(b) - k ? m } where r,(k)
=
lim
yl(x,
Z+W
0
dt
{r
k), or that b{kn+l(b) -
Ub)}
- b{w}2>
T*
-+
[Note: In the notation of (12.7.6), r,(k) = k-lr(k).] 11. For the boundary problem of the previous problem and finite 6 > 0, let T b ( h ) be the spectral function, so that for A > 0,
Show that, for fixed A‘, A”, with 0
< A‘ < A”,
as b - m . 12. Let a(x), 0 < x < T , be real-valued, right-continuous, and of bounded variation, and let U(X) = u(x, A), W(X) = w(x, k) be solutions of
[Y‘I
+ J”Y d{k2x + +)}
=0
with the initial or final conditions u(0) = 0, u’(0) = 1 and w’(T) = 1. If u ( x ) # 0, define the Green’s function
w(w) = 0,
Show that this function provides, in the usual manner, the solution of the inhomogeneous boundary problem [z’]
+ J” x d{k2x + +)}
=
J” ~ ( x dx, )
z(0) = Z ( T ) = 0, for arbitrary continuous ~ ( x ) . 13. For the problem [Y‘I
+ I Y d{”. + +)}
= 0,
Y(0) = Y ( 4 = 0, let A, , n = 1, 2, ..., be the eigenvalues and y,,(x) the normalized eigenfunctions [(12.9.4) with b = T ] . Show that the Green’s function G(x, t , A) defined in the previous problem has the following properties:
564
PROBLEMS
(i) at an eigenvalue An the Green’s function has a simple pole with residue Yn(tlYri(x> (ii) if h is not an eigenvalue, the Fourier coefficient of G(x, t , A) with respect to r,&)is m(.)/(h - hn), (iii) if h is not an eigenvalue, 3
(iv) if h is complex, G(x, t , A) - G(x, t , h) = (h - A)
I”
G(x, s, h)G(s, t , A) ds,
0
(v) G(x, x, A) is a “negative imaginary” function (see Appendix 11), (vi) if Im h > 0,
o < 2 I yn(x) 12 Im 1/(hn - A) < - I m ~ ( x x,, A). W
1
14. With the notation of the previous problem, show that
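Property (iv) is a resolvent identity, and it can be sanity-checked on a finite-dimensional analogue (a choice made here purely for illustration, not from the text): for a real symmetric matrix A with G(λ) = (A + λ)⁻¹ one has G(λ) − G(λ̄) = (λ̄ − λ) G(λ̄) G(λ):

```python
# Finite-dimensional analogue of property (iv): with G(z) = (A + z*I)^{-1},
# A real symmetric 2-by-2, check G(lam) - G(conj(lam)) equals
# (conj(lam) - lam) * G(conj(lam)) * G(lam).  Pure-Python complex algebra.

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def mul2(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

A = ((2.0, 1.0), (1.0, 3.0))      # symmetric, so eigenvalues are real
lam = 0.5 + 1.0j                  # complex, so A + lam*I is invertible

def G(z):
    return inv2(((A[0][0] + z, A[0][1]), (A[1][0], A[1][1] + z)))

lhs = [[G(lam)[i][j] - G(lam.conjugate())[i][j] for j in range(2)]
       for i in range(2)]
P = mul2(G(lam.conjugate()), G(lam))
rhs = [[(lam.conjugate() - lam) * P[i][j] for j in range(2)]
       for i in range(2)]
print(all(abs(lhs[i][j] - rhs[i][j]) < 1e-9 for i in range(2) for j in range(2)))
```

The identity follows from G(λ) − G(μ) = G(μ){(A + μ) − (A + λ)}G(λ) = (μ − λ)G(μ)G(λ) with μ = λ̄, exactly as for the integral kernel in (iv).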
I m ( x ) l2 Im I/&
- A),
- Im G(x, x, A)
tend to zero on a contour given by X = k2,where k describes the rectangle R1 k = 0, (m Im k = 0, (m &)r, and m +03. 15. Deduce, from the maximum principle for harmonic functions, that these two functions are equal. Deduce that the bilinear expansion of the Green’s function holds in mean-square, and deduce further the eigenfunction expansion.
+ a)~,
+
Index
Abdel-Messih, M. A. 501
Adem, J. 482
Agranovič, Z. S. 532
Ahiezer, N. I. (Achieser) 478, 484-5, 492, 510, 513, 522-3
Al-Salam, W. A. 490
Ambarzumian, V. 531
Analogies between difference and differential equations 1, 30, 487, 491-3, 529-530
Andersen, E. S. 506
Aronszajn, N. 486, 533-4
Artin, E. 489
Asymptotic behaviour, of eigenfunctions and eigenvalues for integral equations 398-411, also qus. 6-7 for Chapter 12; of eigenvalues for differential equations 208, 319-323, 338, 516; of phase 532, qu. 7 for Ch. 8, qu. 5 for Ch. 12; of polynomials orthogonal on real interval 196-8; of polynomials orthogonal on unit circle 190-6, 508, 510; of solutions of differential equations 206-7, 366, 449-452, 517; of solutions of integral equations 366-370, 384-398; of spectral function 56, 61, 130, qu. 9 for Ch. 8, qu. 11 for Ch. 12
Atkinson, F. V. 504, 506, 517, 529
Balogh, T. 511
Banks, D. 492
Bari, N. K. 535
Barrett, J. H. 517, 523, 525-6
Barrett, L. C. 520
Bauer, W. F. 482
Baxter, G. 504-8, 510, 512
Bazanov, B. V. 500
Beckenbach, E. F. 478, 487, 500, 520, 534
Beesack, P. R. 492
Bellman, R. 377, 453, 455, 451, 476, 478, 481, 492, 496, 501-3, 515, 520, 524, 528, 531, 534
Bickley, W. G. 493
Bieberbach, L. 478, 513, 520
Birkhoff, Garrett 475, 478, 513, 533-535
Birkhoff, George D. 478, 521, 523, 533
Birth and death process 19, 494-5, 497
Blaschke product 58, 487
Bliss, G. A. 521, 523, 525
Bochner, S. 518
Borg, G. 531
Bott, R. 523
Boundary conditions, at infinity 64, 120, 132-3, 497, 499, 501; general form of 14, 84, 88, 89, 255-6; involving parameter 22-4, 482-3, 516, 520; involving second derivative 482; periodic 93, 145, 256, 336
Boyer, R. H. 493
Brauer, F. 522
Brodskii, M. S. 485, 532
Bureau, F. J. 484
Cargo, G. T. 487
Carlitz, L. 493
Carmichael, R. D. 503
Čebyšev, P. L. (Tchebycheff, Tschebysheff, etc.) 493
Čebyšev system 218, 492, 518-9, qus. 6-10 for Ch. 4
Chapman-Kolmogorov equations 496, 514
Characteristic function 10-1, 17-8, 37-9, 56, 62-3, 184-8, 268-272, 302, 329, 485, 512; distinguished from eigenfunction 485
Chevalley, C. 488
Chihara, T. S. 494
Christoffel numbers 491
Christoffel-Darboux identity 31, 94, 98, 152, 173
Churchill, R. V. 482
Coddington, E. A. 478, 513-4, 521-2, 524
Cole, R. H. 523
Collatz, L. 478, 481
Comparison theorems 523, 525, qus. 3, 7 for Ch. 1, qu. 6 for Ch. 10, qu. 15 for Ch. 10, qus. 2-3 for Ch. 11
Conjugate points 300-1, 308-314, 328-9, 332-6
Completeness in Hilbert space 471-5, 535
Conte, S. D. 522
Continued fractions 111, 492, 502
Courant, R. 478, 513, 520
Coxeter, H. S. M. 489
Crum, M. M. 529, 533
Čudov, L. A. 532
De Branges, L. 487
Definiteness 205, 253, 357, 374
Delsarte, J. 533
Determinants, of Jacobi form 142-4, 501
Devinatz, A. 486, 501, 504, 513
Dickinson, D. J. 493, 494
Dickinson, D. R. 492
Difference equation (see also recurrence relation) 1, 2, 3
Differential equation, fourth-order 254-5, 323-8, 525-6, qus. 5, 6, 8-19 for Ch. 10; first-order scalar 7, 55-7, 301-2; first-order vector-matrix, see under "system"; second-order, see under "Sturm-Liouville"
Diffusion 495, 515, 518
Dikii, L. A. 482, 517
Disconjugacy 300-2, 525
Discrete approximation 3, 4, 8, 76-81, 359, 481, 530
Discrete-continuous problems 2, 8, 12, 76-81, 452-6, 481; in Sturm-Liouville theory 226-8
Dolph, C. L. 478, 507, 521
Donoghue, W. F. Jr. 534
Duffin, R. J. 493
Eigenfunction expansion, for first-order system of differential equations 261-2, 273-284; for integral equations 358-9, 411-5; for matrix recurrence relation 158; for scalar recurrence relations 34, 67-71, 105, 169, 175; for Sturm-Liouville equations 4, 222-6, 232-8, 243-7, 475, 481, 520
Eigenvalues, of varying matrices 303, 310, 317, 324-8, 335, 337, 457-470, 535
Eigenvalues, infinite 24-6, 27-8, 64; see also asymptotic behaviour
Everitt, W. N. 502, 523, 535
Explicit expansion theorem 117-8, 188-190, 240-3, qu. 12 for Ch. 6
Factorization 12, 192-3, 382, 511-3
Faddeev, L. D. 532
Favard, J. 493
Feller, W. 24, 495-6, 514, 527
Fort, T. 377, 478, 481, 490-1, 501, 530
Fortet, R. 497
Frank, P. 518
Freud, G. 512
Frey, T. (Frei) 512
Friedman, A. 532
Fundamental solution 14, 29, 58-9, 94, 96, 253, 257, 264, 298, 362; see also "transfer-function"
Gantmaher, F. R. (Gantmacher) 478, 490-2, 503, 518, 526
Gel'fand, I. M. 492, 524, 532-3
Geronimus, Ya. L. 479, 504, 512
Gilbert, R. C. 484, 521
Glazman, I. M. (Glasmann) 478, 484-5, 522-3, 526, 534
Gohberg, I. C. 513
Gol'dberg, A. A. 487
Good, I. J. 482
Gould, S. H. 492
Green's function 10-11, 18-19, 145, 148-9, 229-232, 263, 266, 302, 358-9, 436, 501
INDEX
Green’s matrix 523 Green’s theorem 31, 98, 173 Greenstein, D. S. 500-1 Grenander, U. 479,504, 510 Gronwall, T. H. 455 Guillemin, E. A. 502 Halanai, A. 526 Hamburger, H. L. 486, 493, 501 Hannan, E. J. 479, 510-1 Harris, V. C. 516 Hartman, P. 490 Hausdorff, F. 486 Hellinger, E. 494-5 Helly-Bray theorem 61, 159, 299, 425-434, 534 Helson, H. 51 1 Herglotz, G. 508 Hestenes, M. R. 523 Hilbert, D. 478, 503, 513, 520 Hildebrandt, T. H. 527 Hille, E. 482 Hirschman, I. I. Jr. 519, 534 Hochstadt, H. 518 Howard, H. C. 526 Hunt, R. W. 526 Ince, E. L. 479, 503, 518 Ingram, W. H. 528 Initial-value problem 3 for integral equations 341-7 Integrable square 57, 250-1, 293-8, 391-3 see also “summable square” Integral equations 2, 521 with Stieltjes integrals 2 of Volterra-Stieltjes type 24, 204, 255, 339-415 of Fredholm-Stieltjes type 359, 515 Interface conditions 483, 516, 527-8 Interpolation in complex plane 71-4 on real axis 21 7-222 Invariance property 4-6, 13, 16, 83, 87, 93 Inverse problems 11-12, 39-46, 74, 107114, 178-181, 492, 531-3 Jackson, D. 489 Jeffreys, H. 8z B.S. 502 Jost, R. 532
Kac, I. S. 515, 520
Kamke, E. 479, 520
Karlin, S. 479, 490, 493, 503-4, 519
Kay, I. 532
Kazmin, Yu. A. 535
Kellogg, O. D. 519
Kemp, R. R. D. 377, 530
Kemperman, J. H. B. 497
Klein, F. 503
Kodaira, K. 485, 521
Korop, V. F. 533
Krasnosel'skii, M. A. 479, 486, 500-1
Krein, M. G. 18, 199, 478-9, 486-7, 490-3, 500-1, 503, 507, 513-5, 518-9, 522, 524, 526-7, 529-532
Kurzweil, J. 529
Kuz'mina, A. L. 512
Lagrange identity 31, 94, 98, 173, 348, 373
Lammel, E. 488
Langer, R. E. 482, 521
Leighton, W. 526
Lesky, P. 493
Levinson, N. 377, 478, 513-4, 521-2, 524, 530-2
Levitan, B. M. 479, 481, 491, 516, 521, 531-3
Lidskii, V. B. 524-5
Limit-circle, limit-point 15, 57, 63, 70, 129-130, 132, 134-5, 292-3, 299, 487, 495-7, 499, 501-2, 521
Lions, J. L. 533
Liouville, J. 202, 342, 513, 518-9
Lipka, S. 492
Livšic, M. S. 485-6
Lowdenslager, D. 511
Lyapunov, M. A. 529
Lyubarskii, V. A. 524
Lyusternik, L. A. 535
MacNamee, J. 493
Mac Nerney, J. S. 502, 527-8
Mairhuber, J. C. 520
Mal'cev, A. I. 489
Mammana, G. 529
Mandl, P. 515
Marčenko, V. A. 532
Marden, M. 534
Markov, A. A. 493
Markov process 19, 496
Markov system 492
Masani, P. 511
Matrix, contractive, 477
  doubly-stochastic, 496
  eigenvalues of, 457-470, 502, 535
  harmonic, 528
  hermitian, 476
  imaginary part of, 477
  J-contractive, 477, 488
  J-unitary, 86, 253, 477, 488
  Jacobi, 20, 142, 503
  non-singular, 477
  orthogonal, 477
  positive-definite, 477
  skew-hermitian 476
  symmetric 4, 476
  symplectic, 22, 86, 334, 477, 483, 488
McCarthy, P. J. 491
McGregor, J. 494, 503-4, 519
McKean, H. P. Jr. 514, 516
McLeod, J. B. 517
McShane, E. J. 534
Mechanical quadrature 115, 121-2, 177, 491
Mikolás, M. 530
Mollerup, H. 519
Moment problem, 485-6, 500-1
  for rational functions, 46-54, 71-74
  Hamburger, 136-7, 184, 486, 500
  Hausdorff, 486, 501, 503
  matrix, 502-3, 511
  multi-dimensional 504
  Stieltjes 486, 501
  trigonometric, 184, 486, 508-9
Moore, R. A. 523
Morgan, G. W. 482
Moroney, R. M. 533
Morse, M. 479, 523-4
Moses, H. E. 532
Nagel, H. 503
Nagy, B. Sz. 479, 502, 514-5
Naimark, M. A. (Neumark) 479, 484-5, 513, 522, 531
Negative-imaginary function 436-440, 502
Nehari, Z. 523, 526
Nesting circles 125-7, 247-251, 284-9, 299, 495
Networks, 2, 3, 5, 10, 18, 19, 150, 502
Newton, R. G. 532
Nikolenko, L. D. 526
Non-oscillation, 300-2, 490, 495
Orthogonality, 441-446
  dual, 33, 106, 175, 483, 496
Orthogonalization, continuous, 492,
  of polynomials, 108, 179, 492
  of rational functions, 488
Oscillation theorem, multi-parameter, 164
  on unit circle, 30, 173
  Sturm-Liouville, 212
Oscillations, for differential equations, 300-338, 523-526
  for recurrence relations, 29, 100-4, 152-7
Parzen, E. 511
Peek, R. L. Jr. 482
Pick-Nevanlinna problem 73, 487
Pignani, T. J. 528
Plancherel, M. 481
Poisson, S. D. 513
Polar coordinate methods 101-2, 155-7, 210-7, 303, 305, 329, qus. 7, 8, 16 for Ch. 8, qu. 5 for Ch. 12
Pollak, H. O. 493
Polynomials, classical, Hermite, 493, qu. 8 for Ch. 5,
  Jacobi, 493
  Laguerre, 493, qu. 10 for Ch. 5,
  Legendre, 118, 135, 493, qu. 3 for Ch. 5
Polynomials, orthogonal on real axis, 16, 90-92, 97-141, 213, 487, 489-491, 493-4
  asymptotic behaviour, 196-8, 493, 512
  as determinants, 142-9, 501
  in several variables, 160-9, 503-4
  quasi-orthogonal, 494
  with matrix coefficients, 150-160, 254, 262, 497, 502
Polynomials orthogonal on the unit circle, 92-3, 96, 170-201
Porter, M. B. 490
Positive-real functions 436, 534
Potapov, V. P. 479, 488, 528
Prediction 510-1
Price, G. B. 528
Probability 19, 482, 497, 503, 505
Prüfer, H. 222, 358, 520, 523
Quasi-differential equation 203, 255, 522
Queues 508
Random variable 509, 510
Random walk 19, 505-9
Ray, D. B. 516
Recurrence relations 2-4, 7-11, 15-21, 25-203, 487, 489-491, 493-5, 501-2
  multivariate 160, qu. 8 for Ch. 1
  oscillatory properties 29-30, 100-104, 490-1
  with matrix coefficients 83-96, 150-160, 502-3
  within differential equations, 202-3, 254-5, 516
Rehtman, P. G. 479, 500
Reid, W. T. 516, 522-7
Resolvent equation 265
Resolvent kernel 263-8
Riccati equation 209, 302
  matrix, 304, 325, 329, 523
Richardson, J. M. 492
Riesz, F. 479, 484
Right-conjugate 300-1, 308, 324, 329
Rooney, P. G. 508
Roos, B. W. 522
Rosen, J. B. 534
Rosenblatt, M. 511
Rosenfeld, N. S. 518
Rota, G.-C. 475, 478, 513, 522, 533, 535
Rozanov, Yu. A. 511
S-function 371-383
S-matrix 532
Sandor, St. 526
Sangren, W. C. 483, 520, 522, 528
Sargsyan, I. S. 516, 521
Scarpellini, B. 519
Scattering 377, 532
Scharf, H. 528
Schmidt, T. W. 493
Schoenberg, I. J. 520, 525
Schur, I. 503
Schwarz, B. 492
Schwerdtfeger, H. 488
Self-adjoint 4, 498-500
Separation theorems 17, 300, 302
  for first-order system 332-7
  for matrix polynomials 156-7
  for matrix Sturm-Liouville systems 308-312, 318, 523-4
  for polynomials on real axis 101-4, 145-8
  for polynomials on unit circle 186
  for rational functions 30
Seshu, S. and L. 534
Shear 91, 489
Sherman, J. 494
Shohat, J. 479, 494, 500, 504, 534
Siegel, C. L. 489
Sims, A. R. 495, 521
Spectral function 10, 35-7, 46-51, 56, 60-2, 120-5, 182-4, 238-240, 262, 298, 382-3, 483-5, 487-8, 494-6, 500-1, 520-1
  limiting 129-130, 139, 496
  multi-dimensional 169
  orthogonal 485, 496, 501
Spectral kernel 484
Spectrum 27, 70, 371, 490
Spencer, V. E. 501
Spitzer, F. 506
Stability 447-456
  for differential equations 206, 449-452
  for integral equations 356, 368, 384, 452-6
  for recurrence relations 131-2, 447-9
Stallard, F. W. 527-8
Stieltjes integral 2, 35, 204, 416-435, 445, 455, 534
  modified, 360, 363-4, 528
Stieltjes, T. J. 486, 493
Stochastic process 509
Stone, M. H. 479, 484, 489, 494, 500, 531, 534
Stratton, J. A. 481
Straus, A. V. 485, 488
String, vibrating 2, 5, 16, 21, 24, 119, 148, 202, 492-4, 497, 501, 514, 516
Sturm, Ch. 202, 218, 490, 513, 518
Sturm-Liouville theory 17, 21-4, 30, 97, 147, 150, 160, 202-254, 261, 263, 338, 340, 355, 381-3, 398-415, 440, 473, 475, 482-3, 489, 513-521
  discrete, 4, 97-149
  eigenfunction expansion 4, 222-6, 358-9, 411-5
  inverse, 485, 492, 530-3
  matrix, 254, 300, 303-323
  on a half-axis, 243-251, 487, 512, 521
Summable square 60, 130-2, 484, 495
  see also “integrable square”
Symplectic transvection 91, 489
System of differential equations 13-14, 252-300, 332-8, 521-2
  bounds for eigenvalues 256, 338
  conjugate points, 300, 332-6
  eigenfunction expansion, 273-284
  generalised to integral equations, 359-365
  nesting circles, 284-9
  over infinite intervals, 289-299
Szász, O. 488
Szegő, G. 479, 480, 489-491, 493, 501, 504, 510-2, 530
Tamarkin, J. D. 479, 482, 500, 504, 534
Tarnopol’skii, J. G. 493, 500
Tchebychev, Tschebysheff, etc. see “Čebyšev”
Tensor product 503
Teptin, A. L. 501
Time series 509
Titchmarsh, E. C. 480, 484, 497, 501, 513, 516, 521-2
Toeplitz form 184, 199, 509
Trace, of matrix 476
Trace-formulae for eigenvalues 380, 399, 517
Tricomi, F. G. 489, 493
Truxal, J. G. 481
Višik, M. I. 535
von Mises, R. 518
Wall, H. S. 480, 495, 500, 528
Walsh, J. L. 533
Wannier, G. H. 493
Wave propagation 3, 5-6, 481
Wax, N. 518
Wendel, J. G. 506
Wendroff, B. 491
Weyl, H. 487, 495
Whyburn, W. M. 490, 517-8, 523, 528
Widder, D. V. 480, 500, 519, 534
Wiener, N. 506, 511
Wiener-Hopf equation 504, 507
Williamson, R. E. 520
Wintner, A. 490, 525
WKB-method, 518
Wronskian, 4, 6, 28, 100, 490, 529, 530
  for integral equation, 348-350
Wylie, C. R. Jr. 520
Yakubovič, V. A. 489, 524
Zarhina, R. B. 504
Zimmerberg, H. J. 482