ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transcribed, or used in any form or by any means (graphic, electronic, or mechanical, including photocopying, recording, taping, Web distribution, or information storage and retrieval systems) without the written permission of the publisher.
North America
Nelson
1120 Birchmount Road
Toronto, Ontario M1K 5G4
Canada
Printed and bound in Taiwan
1 2 3 4 07 06

For more information contact Nelson, 1120 Birchmount Road, Toronto, Ontario, Canada, M1K 5G4. Or you can visit our Internet site at http://www.nelson.com

Library of Congress Control Number: 2006900028
ISBN: 0-495-08237-6

If you purchased this book within the United States or Canada you should be aware that it has been wrongfully imported without the approval of the Publisher or the Author.
For permission to use material from this text or product, submit a request online at www.thomsonrights.com. Every effort has been made to trace ownership of all copyright material and to secure permission from copyright holders. In the event of any question arising as to the use of any material, we will be pleased to make the necessary corrections in future printings.
Asia
Thomson Learning
5 Shenton Way #01-01
UIC Building
Singapore 068808

Australia/New Zealand
Thomson Learning
102 Dodds Street
Southbank, Victoria
Australia 3006

Europe/Middle East/Africa
Thomson Learning
High Holborn House
50/51 Bedford Row
London WC1R 4LR
United Kingdom

Latin America
Thomson Learning
Seneca, 53
Colonia Polanco
11560 Mexico D.F.
Mexico

Spain
Paraninfo
Calle/Magallanes, 25
28015 Madrid, Spain
Contents

Chapter 1
First-Order Differential Equations 3
1.1 Preliminary Concepts 3
1.1.1 General and Particular Solutions 3
1.1.2 Implicitly Defined Solutions 4
1.1.3 Integral Curves 5
1.1.4 The Initial Value Problem 6
1.1.5 Direction Fields 7
1.2 Separable Equations 11
1.2.1 Some Applications of Separable Differential Equations 14
1.3 Linear Differential Equations 22
1.4 Exact Differential Equations 26
1.5 Integrating Factors 33
1.5.1 Separable Equations and Integrating Factors 37
1.5.2 Linear Equations and Integrating Factors 37
1.6 Homogeneous, Bernoulli, and Riccati Equations 38
1.6.1 Homogeneous Differential Equations 38
1.6.2 The Bernoulli Equation 42
1.6.3 The Riccati Equation 43
1.7 Applications to Mechanics, Electrical Circuits, and Orthogonal Trajectories 46
1.7.1 Mechanics 46
1.7.2 Electrical Circuits 51
1.7.3 Orthogonal Trajectories 53
1.8 Existence and Uniqueness for Solutions of Initial Value Problems 58
Chapter 2
Second-Order Differential Equations 61
2.1 Preliminary Concepts 61
2.2 Theory of Solutions of y" + p(x)y' + q(x)y = f(x) 62
2.2.1 The Homogeneous Equation y" + p(x)y' + q(x)y = 0 64
2.2.2 The Nonhomogeneous Equation y" + p(x)y' + q(x)y = f(x) 68
2.3 Reduction of Order 69
2.4 The Constant Coefficient Homogeneous Linear Equation 73
2.4.1 Case 1: A^2 - 4B > 0 73
2.4.2 Case 2: A^2 - 4B = 0 74
2.4.3 Case 3: A^2 - 4B < 0 74
2.4.4 An Alternative General Solution in the Complex Root Case 75
2.5 Euler's Equation 78
2.6 The Nonhomogeneous Equation y" + p(x)y' + q(x)y = f(x) 82
2.6.1 The Method of Variation of Parameters 82
2.6.2 The Method of Undetermined Coefficients 85
2.6.3 The Principle of Superposition 91
2.6.4 Higher-Order Differential Equations 91
2.7 Application of Second-Order Differential Equations to a Mechanical System 93
2.7.1 Unforced Motion 95
2.7.2 Forced Motion 98
2.7.3 Resonance 100
2.7.4 Beats 102
2.7.5 Analogy with an Electrical Circuit 103
Chapter 3
The Laplace Transform 107
3.1 Definition and Basic Properties 107
3.2 Solution of Initial Value Problems Using the Laplace Transform 116
3.3 Shifting Theorems and the Heaviside Function 120
3.3.1 The First Shifting Theorem 120
3.3.2 The Heaviside Function and Pulses 122
3.3.3 The Second Shifting Theorem 125
3.3.4 Analysis of Electrical Circuits 129
3.4 Convolution 134
3.5 Unit Impulses and the Dirac Delta Function 139
3.6 Laplace Transform Solution of Systems 144
3.7 Differential Equations with Polynomial Coefficients 150
Chapter 4
Series Solutions 155
4.1 Power Series Solutions of Initial Value Problems 156
4.2 Power Series Solutions Using Recurrence Relations 161
4.3 Singular Points and the Method of Frobenius 166
4.4 Second Solutions and Logarithm Factors 173
Chapter 5
Numerical Approximation of Solutions 181
5.1 Euler's Method 182
5.1.1 A Problem in Radioactive Waste Disposal 187
5.2 One-Step Methods 190
5.2.1 The Second-Order Taylor Method 190
5.2.2 The Modified Euler Method 193
5.2.3 Runge-Kutta Methods 195
5.3 Multistep Methods 197
5.3.1 Case 1: r = 0 198
5.3.2 Case 2: r = 1 198
5.3.3 Case 3: r = 3 199
5.3.4 Case 4: r = 4 199
PART 2
Vectors and Linear Algebra 201

Chapter 6
Vectors and Vector Spaces 203
6.1 The Algebra and Geometry of Vectors 203
6.2 The Dot Product 211
6.3 The Cross Product 217
6.4 The Vector Space R^n 223
6.5 Linear Independence, Spanning Sets, and Dimension in R^n 228
Chapter 7
Matrices and Systems of Linear Equations 237
7.1 Matrices 238
7.1.1 Matrix Algebra 239
7.1.2 Matrix Notation for Systems of Linear Equations 242
7.1.3 Some Special Matrices 243
7.1.4 Another Rationale for the Definition of Matrix Multiplication 246
7.1.5 Random Walks in Crystals 247
7.2 Elementary Row Operations and Elementary Matrices 251
7.3 The Row Echelon Form of a Matrix 258
7.4 The Row and Column Spaces of a Matrix and Rank of a Matrix 266
7.5 Solution of Homogeneous Systems of Linear Equations 272
7.6 The Solution Space of AX = 0 280
7.7 Nonhomogeneous Systems of Linear Equations 283
7.7.1 The Structure of Solutions of AX = B 284
7.7.2 Existence and Uniqueness of Solutions of AX = B 285
7.8 Matrix Inverses 293
7.8.1 A Method for Finding A^-1 295
Chapter 8
Determinants 299
8.1 Permutations 299
8.2 Definition of the Determinant 301
8.3 Properties of Determinants 303
8.4 Evaluation of Determinants by Elementary Row and Column Operations 307
8.5 Cofactor Expansions 311
8.6 Determinants of Triangular Matrices 314
8.7 A Determinant Formula for a Matrix Inverse 315
8.8 Cramer's Rule 318
8.9 The Matrix Tree Theorem 320
Chapter 9
Eigenvalues, Diagonalization, and Special Matrices 323
9.1 Eigenvalues and Eigenvectors 324
9.1.1 Gerschgorin's Theorem 328
9.2 Diagonalization of Matrices 330
9.3 Orthogonal and Symmetric Matrices 339
9.4 Quadratic Forms 347
9.5 Unitary, Hermitian, and Skew Hermitian Matrices 352
PART 3
Systems of Differential Equations and Qualitative Methods 359

Chapter 10
Systems of Linear Differential Equations 361
10.1 Theory of Systems of Linear First-Order Differential Equations 361
10.1.1 Theory of the Homogeneous System X' = AX 365
10.1.2 General Solution of the Nonhomogeneous System X' = AX + G 372
10.2 Solution of X' = AX when A is Constant 374
10.2.1 Solution of X' = AX when A has Complex Eigenvalues 377
10.2.2 Solution of X' = AX when A does not have n Linearly Independent Eigenvectors 379
10.2.3 Solution of X' = AX by Diagonalizing A 384
10.2.4 Exponential Matrix Solutions of X' = AX 386
10.3 Solution of X' = AX + G 394
10.3.1 Variation of Parameters 394
10.3.2 Solution of X' = AX + G by Diagonalizing A 398
Chapter 11
Qualitative Methods and Systems of Nonlinear Differential Equations 403
11.1 Nonlinear Systems and Existence of Solutions 403
11.2 The Phase Plane, Phase Portraits, and Direction Fields 406
11.3 Phase Portraits of Linear Systems 413
11.4 Critical Points and Stability 424
11.5 Almost Linear Systems 431
11.6 Lyapunov's Stability Criteria 451
11.7 Limit Cycles and Periodic Solutions 461
PART 4
Vector Analysis 473

Chapter 12
Vector Differential Calculus 475
12.1 Vector Functions of One Variable 475
12.2 Velocity, Acceleration, Curvature, and Torsion 481
12.2.1 Tangential and Normal Components of Acceleration 488
12.2.2 Curvature as a Function of t 491
12.2.3 The Frenet Formulas 492
12.3 Vector Fields and Streamlines 493
12.4 The Gradient Field and Directional Derivatives 499
12.4.1 Level Surfaces, Tangent Planes, and Normal Lines 503
12.5 Divergence and Curl 510
12.5.1 A Physical Interpretation of Divergence 512
12.5.2 A Physical Interpretation of Curl 513
Chapter 13
Vector Integral Calculus 517
13.1 Line Integrals 517
13.1.1 Line Integral with Respect to Arc Length 525
13.2 Green's Theorem 528
13.2.1 An Extension of Green's Theorem 532
13.3 Independence of Path and Potential Theory in the Plane 536
13.3.1 A More Critical Look at Theorem 13.5 539
13.4 Surfaces in 3-Space and Surface Integrals 545
13.4.1 Normal Vector to a Surface 548
13.4.2 The Tangent Plane to a Surface 551
13.4.3 Smooth and Piecewise Smooth Surfaces 552
13.4.4 Surface Integrals 553
13.5 Applications of Surface Integrals 557
13.5.1 Surface Area 557
13.5.2 Mass and Center of Mass of a Shell 557
13.5.3 Flux of a Vector Field Across a Surface 560
13.6 Preparation for the Integral Theorems of Gauss and Stokes 562
13.7 The Divergence Theorem of Gauss 564
13.7.1 Archimedes's Principle 567
13.7.2 The Heat Equation 568
13.7.3 The Divergence Theorem as a Conservation of Mass Principle 570
13.8 The Integral Theorem of Stokes 572
13.8.1 An Interpretation of Curl 576
13.8.2 Potential Theory in 3-Space 576
PART 5
Fourier Analysis, Orthogonal Expansions, and Wavelets 581

Chapter 14
Fourier Series 583
14.1 Why Fourier Series? 583
14.2 The Fourier Series of a Function 586
14.2.1 Even and Odd Functions 589
14.3 Convergence of Fourier Series 593
14.3.1 Convergence at the End Points 599
14.3.2 A Second Convergence Theorem 601
14.3.3 Partial Sums of Fourier Series 604
14.3.4 The Gibbs Phenomenon 606
14.4 Fourier Cosine and Sine Series 609
14.4.1 The Fourier Cosine Series of a Function 610
14.4.2 The Fourier Sine Series of a Function 612
14.5 Integration and Differentiation of Fourier Series 614
14.6 The Phase Angle Form of a Fourier Series 623
14.7 Complex Fourier Series and the Frequency Spectrum 630
14.7.1 Review of Complex Numbers 630
14.7.2 Complex Fourier Series 631
Chapter 15
The Fourier Integral and Fourier Transforms 637
15.1 The Fourier Integral 637
15.2 Fourier Cosine and Sine Integrals 640
15.3 The Complex Fourier Integral and the Fourier Transform 642
15.4 Additional Properties and Applications of the Fourier Transform 652
15.4.1 The Fourier Transform of a Derivative 652
15.4.2 Frequency Differentiation 655
15.4.3 The Fourier Transform of an Integral 656
15.4.4 Convolution 657
15.4.5 Filtering and the Dirac Delta Function 660
15.4.6 The Windowed Fourier Transform 661
15.4.7 The Shannon Sampling Theorem 665
15.4.8 Lowpass and Bandpass Filters 667
15.5 The Fourier Cosine and Sine Transforms 670
15.6 The Finite Fourier Cosine and Sine Transforms 673
15.7 The Discrete Fourier Transform 675
15.7.1 Linearity and Periodicity 678
15.7.2 The Inverse N-Point DFT 678
15.7.3 DFT Approximation of Fourier Coefficients 679
15.8 Sampled Fourier Series 681
15.8.1 Approximation of a Fourier Transform by an N-Point DFT 685
15.8.2 Filtering 689
15.9 The Fast Fourier Transform 694
15.9.1 Use of the FFT in Analyzing Power Spectral Densities of Signals 695
15.9.2 Filtering Noise From a Signal 696
15.9.3 Analysis of the Tides in Morro Bay 697
Chapter 16
Special Functions, Orthogonal Expansions, and Wavelets 701
16.1 Legendre Polynomials 701
16.1.1 A Generating Function for the Legendre Polynomials 704
16.1.2 A Recurrence Relation for the Legendre Polynomials 706
16.1.3 Orthogonality of the Legendre Polynomials 708
16.1.4 Fourier-Legendre Series 709
16.1.5 Computation of Fourier-Legendre Coefficients 711
16.1.6 Zeros of the Legendre Polynomials 713
16.1.7 Derivative and Integral Formulas for Pn(x) 715
16.2 Bessel Functions 719
16.2.1 The Gamma Function 719
16.2.2 Bessel Functions of the First Kind and Solutions of Bessel's Equation 721
16.2.3 Bessel Functions of the Second Kind 722
16.2.4 Modified Bessel Functions 725
16.2.5 Some Applications of Bessel Functions 727
16.2.6 A Generating Function for Jv(x) 732
16.2.7 An Integral Formula for Jv(x) 733
16.2.8 A Recurrence Relation for Jv(x) 735
16.2.9 Zeros of Jv(x) 737
16.2.10 Fourier-Bessel Expansions 739
16.2.11 Fourier-Bessel Coefficients 741
16.3 Sturm-Liouville Theory and Eigenfunction Expansions 745
16.3.1 The Sturm-Liouville Problem 745
16.3.2 The Sturm-Liouville Theorem 752
16.3.3 Eigenfunction Expansions 755
16.3.4 Approximation in the Mean and Bessel's Inequality 759
16.3.5 Convergence in the Mean and Parseval's Theorem 762
16.3.6 Completeness of the Eigenfunctions 763
16.4 Wavelets 765
16.4.1 The Idea Behind Wavelets 765
16.4.2 The Haar Wavelets 767
16.4.3 A Wavelet Expansion 774
16.4.4 Multiresolution Analysis with Haar Wavelets 774
16.4.5 General Construction of Wavelets and Multiresolution Analysis 775
16.4.6 Shannon Wavelets 776
PART 6
Partial Differential Equations 779

Chapter 17
The Wave Equation 781
17.1 The Wave Equation and Initial and Boundary Conditions 781
17.2 Fourier Series Solutions of the Wave Equation 786
17.2.1 Vibrating String with Zero Initial Velocity 786
17.2.2 Vibrating String with Given Initial Velocity and Zero Initial Displacement 791
17.2.3 Vibrating String with Initial Displacement and Velocity 793
17.2.4 Verification of Solutions 794
17.2.5 Transformation of Boundary Value Problems Involving the Wave Equation 796
17.2.6 Effects of Initial Conditions and Constants on the Motion 798
17.2.7 Numerical Solution of the Wave Equation 801
17.3 Wave Motion Along Infinite and Semi-Infinite Strings 808
17.3.1 Wave Motion Along an Infinite String 808
17.3.2 Wave Motion Along a Semi-Infinite String 813
17.3.3 Fourier Transform Solution of Problems on Unbounded Domains 815
17.4 Characteristics and d'Alembert's Solution 822
17.4.1 A Nonhomogeneous Wave Equation 825
17.4.2 Forward and Backward Waves 828
17.5 Normal Modes of Vibration of a Circular Elastic Membrane 831
17.6 Vibrations of a Circular Elastic Membrane, Revisited 834
17.7 Vibrations of a Rectangular Membrane 837
Chapter 18
The Heat Equation 841
18.1 The Heat Equation and Initial and Boundary Conditions 841
18.2 Fourier Series Solutions of the Heat Equation 844
18.2.1 Ends of the Bar Kept at Temperature Zero 844
18.2.2 Temperature in a Bar with Insulated Ends 847
18.2.3 Temperature Distribution in a Bar with Radiating End 848
18.2.4 Transformations of Boundary Value Problems Involving the Heat Equation 851
18.2.5 A Nonhomogeneous Heat Equation 854
18.2.6 Effects of Boundary Conditions and Constants on Heat Conduction 857
18.2.7 Numerical Approximation of Solutions 859
18.3 Heat Conduction in Infinite Media 865
18.3.1 Heat Conduction in an Infinite Bar 865
18.3.2 Heat Conduction in a Semi-Infinite Bar 868
18.3.3 Integral Transform Methods for the Heat Equation in an Infinite Medium 869
18.4 Heat Conduction in an Infinite Cylinder 873
18.5 Heat Conduction in a Rectangular Plate 877
Chapter 19
The Potential Equation 879
19.1 Harmonic Functions and the Dirichlet Problem 879
19.2 Dirichlet Problem for a Rectangle 881
19.3 Dirichlet Problem for a Disk 883
19.4 Poisson's Integral Formula for the Disk 886
19.5 Dirichlet Problems in Unbounded Regions 888
19.5.1 Dirichlet Problem for the Upper Half Plane 889
19.5.2 Dirichlet Problem for the Right Quarter Plane 891
19.5.3 An Electrostatic Potential Problem 893
19.6 A Dirichlet Problem for a Cube 896
19.7 The Steady-State Heat Equation for a Solid Sphere 898
19.8 The Neumann Problem 902
19.8.1 A Neumann Problem for a Rectangle 904
19.8.2 A Neumann Problem for a Disk 906
19.8.3 A Neumann Problem for the Upper Half Plane 908
PART 7
Complex Analysis 911

Chapter 20
Geometry and Arithmetic of Complex Numbers 913
20.1 Complex Numbers 913
20.1.1 The Complex Plane 914
20.1.2 Magnitude and Conjugate 915
20.1.3 Complex Division 916
20.1.4 Inequalities 917
20.1.5 Argument and Polar Form of a Complex Number 918
20.1.6 Ordering 920
20.2 Loci and Sets of Points in the Complex Plane 921
20.2.1 Distance 922
20.2.2 Circles and Disks 922
20.2.3 The Equation |z - a| = |z - b| 923
20.2.4 Other Loci 925
20.2.5 Interior Points, Boundary Points, and Open and Closed Sets 925
Chapter 21
Complex Functions 939
21.1 Limits, Continuity, and Derivatives 939
21.1.1 Limits 939
21.1.2 Continuity 941
21.1.3 The Derivative of a Complex Function 943
21.1.4 The Cauchy-Riemann Equations 945
21.2 Power Series 950
21.2.1 Series of Complex Numbers 951
21.2.2 Power Series 952
21.3 The Exponential and Trigonometric Functions 957
21.4 The Complex Logarithm 966
21.5 Powers 969
21.5.1 Integer Powers 969
21.5.2 z^(1/n) for Positive Integer n 969
21.5.3 Rational Powers 971
21.5.4 Powers z^w 972
Chapter 22
Complex Integration 975
22.1 Curves in the Plane 975
22.2 The Integral of a Complex Function 980
22.2.1 The Complex Integral in Terms of Real Integrals 983
22.2.2 Properties of Complex Integrals 985
22.2.3 Integrals of Series of Functions 988
22.3 Cauchy's Theorem 990
22.3.1 Proof of Cauchy's Theorem for a Special Case 993
22.4 Consequences of Cauchy's Theorem 994
22.4.1 Independence of Path 994
22.4.2 The Deformation Theorem 995
22.4.3 Cauchy's Integral Formula 997
22.4.4 Cauchy's Integral Formula for Higher Derivatives 1000
22.4.5 Bounds on Derivatives and Liouville's Theorem 1001
22.4.6 An Extended Deformation Theorem 1002
Chapter 23
Series Representations of Functions 1007
23.1 Power Series Representations 1007
23.1.1 Isolated Zeros and the Identity Theorem 1012
23.1.2 The Maximum Modulus Theorem 1016
23.2 The Laurent Expansion 1019
Chapter 24
Singularities and the Residue Theorem 1023
24.1 Singularities 1023
24.2 The Residue Theorem 1030
24.3 Some Applications of the Residue Theorem 1037
24.3.1 The Argument Principle 1037
24.3.2 An Inversion for the Laplace Transform 1039
24.3.3 Evaluation of Real Integrals 1040
Chapter 25
Conformal Mappings 1055
25.1 Functions as Mappings 1055
25.2 Conformal Mappings 1062
25.2.1 Linear Fractional Transformations 1064
25.3 Construction of Conformal Mappings Between Domains 1072
25.3.1 Schwarz-Christoffel Transformation 1077
25.4 Harmonic Functions and the Dirichlet Problem 1080
25.4.1 Solution of Dirichlet Problems by Conformal Mapping 1083
25.5 Complex Function Models of Plane Fluid Flow 1087
PART 8
Probability and Statistics 1097

Chapter 26
Counting and Probability 1099
26.1 The Multiplication Principle 1099
26.2 Permutations 1102
26.3 Choosing r Objects from n Objects 1104
26.3.1 r Objects from n Objects, with Order 1104
26.3.2 r Objects from n Objects, without Order 1106
26.3.3 Tree Diagrams 1107
26.4 Events and Sample Spaces 1112
26.5 The Probability of an Event 1116
26.6 Complementary Events 1121
26.7 Conditional Probability 1122
26.8 Independent Events 1126
26.8.1 The Product Rule 1128
26.9 Tree Diagrams in Computing Probabilities 1130
26.10 Bayes' Theorem 1134
26.11 Expected Value 1139
Chapter 27
Statistics 1143
27.1 Measures of Center and Variation 1143
27.1.1 Measures of Center 1143
27.1.2 Measures of Variation 1146
27.2 Random Variables and Probability Distributions 1150
27.3 The Binomial and Poisson Distributions 1154
27.3.1 The Binomial Distribution 1154
27.3.2 The Poisson Distribution 1157
27.4 A Coin Tossing Experiment, Normally Distributed Data, and the Bell Curve 1159
27.4.1 The Standard Bell Curve 1174
27.4.2 The 68, 95, 99.7 Rule 1176
27.5 Sampling Distributions and the Central Limit Theorem 1178
27.6 Confidence Intervals and Estimating Population Proportion 1185
27.7 Estimating Population Mean and the Student t Distribution 1190
27.8 Correlation and Regression 1194
Answers and Solutions to Selected Problems A1
Index I1
Preface
This Sixth Edition of Advanced Engineering Mathematics maintains the primary goal of previous editions: to engage much of the post-calculus mathematics needed and used by scientists, engineers, and applied mathematicians, in a setting that is helpful to both students and faculty. The format used throughout begins with the correct developments of concepts such as Fourier series and integrals, conformal mappings, and special functions. These ideas are then brought to bear on applications and models of important phenomena, such as wave and heat propagation and filtering of signals.

This edition differs from the previous one primarily in the inclusion of statistics and numerical methods. The statistics part treats random variables, normally distributed data, bell curves, the binomial, Poisson, and Student t distributions, the central limit theorem, confidence intervals, correlation, and regression. This is preceded by prerequisite topics from probability and techniques of enumeration.

The numerical methods are applied to initial value problems in ordinary differential equations, including a proposal for radioactive waste disposal, and to boundary value problems involving the heat and wave equations.

Finally, in order to include these topics without lengthening the book, some items from the fifth edition have been moved to a website, located at http://engineering.thomsonlearning.com. I hope that this provides convenient accessibility. Material selected for this move includes some biographies and historical notes, predator/prey and competing species models, the theory underlying the efficiency of the FFT, and some selected examples and problems.

The chart on the following page offers a complete organizational overview.

Acknowledgments

This book is the result of a team effort involving much more than an author.
Among those to whom I owe a debt of appreciation are Chris Carson, Joanne Woods, Hilda Gowans, and Kamilah Reid-Burrell of Thomson Engineering, and Rose Kernan and the professionals at RPK Editorial Services, Inc. I also want to thank Dr. Thomas O'Neil of the California Polytechnic State University for material he contributed, and Rich Jones, who had the vision for the first edition of this book many years ago.

Finally, I want to acknowledge the reviewers, whose suggestions for improvements and clarifications are much appreciated:

Preliminary Review
Panagiotis Dimitrakopoulos, University of Maryland
Mohamed M. Hafez, University of California, Davis
Jennifer Hopwood, University of Western Australia
Nun Kwan Yip, Purdue University
[Organizational chart omitted: it relates the parts of the book, including Systems of Algebraic Equations, Statistical Analysis (Probability and Statistics), Vector Analysis, Qualitative Methods, Stability, and Analysis of Critical Points, and Fourier Analysis (Fourier series, integrals, and the discrete Fourier transform).]
Draft Review

Sabri Abou-Ward, University of Toronto
Craig Hildebrand, California State University - Fresno
Seiichi Nomura, University of Texas, Arlington
David L. Russell, Virginia Polytechnic Institute and State University
Y.Q. Sheng, McMaster University
PETER V. O'NEIL
University of Alabama at Birmingham
PART 1
Ordinary Differential Equations

CHAPTER 1 First-Order Differential Equations
CHAPTER 2 Second-Order Differential Equations
CHAPTER 3 The Laplace Transform
CHAPTER 4 Series Solutions
CHAPTER 5 Numerical Approximation of Solutions
A differential equation is an equation that contains one or more derivatives. For example,

y''(x) + y(x) = 4 sin(3x)

and

d^4w/dt^4 - (w(t))^2 = e^{-t}

are differential equations. These are ordinary differential equations because they involve only total derivatives, rather than partial derivatives. Differential equations are interesting and important because they express relationships involving rates of change. Such relationships form the basis for developing ideas and studying phenomena in the sciences, engineering, economics, and increasingly in other areas, such as the business world and the stock market. We will see examples of applications as we learn more about differential equations.
The order of a differential equation is the order of its highest derivative. The first example given above is of second order, while the second is of fourth order. The equation

xy' - y^2 = e^x

is of first order.

A solution of a differential equation is any function that satisfies it. A solution may be defined on the entire real line, or on only part of it, often an interval. For example, y = sin(2x) is a solution of

y'' + 4y = 0,

because, by direct differentiation,

y'' + 4y = -4 sin(2x) + 4 sin(2x) = 0.

This solution is defined for all x (that is, on the whole real line). By contrast, y = x ln(x) - x is a solution of

y' = y/x + 1,

but this solution is defined only for x > 0. Indeed, the coefficient 1/x of y in this equation means that x = 0 is disallowed from the start.

We now begin a systematic development of ordinary differential equations, starting with the first-order case.
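Substitution checks of this kind are easy to automate with a computer algebra system. The sketch below assumes Python with the sympy library is available; it simply differentiates each candidate and confirms that the residual of the equation simplifies to zero.

```python
import sympy as sp

# positive=True restricts to x > 0, the domain needed for the second example.
x = sp.symbols('x', positive=True)

# First check: y = sin(2x) should satisfy y'' + 4y = 0.
y1 = sp.sin(2 * x)
residual1 = sp.diff(y1, x, 2) + 4 * y1
assert sp.simplify(residual1) == 0

# Second check: y = x ln(x) - x should satisfy y' = y/x + 1 for x > 0.
y2 = x * sp.log(x) - x
residual2 = sp.diff(y2, x) - (y2 / x + 1)
assert sp.simplify(residual2) == 0
```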
PRELIMINARY CONCEPTS
SEPARABLE EQUATIONS
LINEAR DIFFERENTIAL EQUATIONS
EXACT DIFFERENTIAL EQUATIONS
INTEGRATING FACTORS
HOMOGENEOUS, BERNOULLI, AND RICCATI EQUATIONS
APPLICATIONS TO MECHANICS, ELECTRICAL CIRCUITS, AND ORTHOGONAL TRAJECTORIES
EXISTENCE AND UNIQUENESS FOR SOLUTIONS OF INITIAL VALUE PROBLEMS
CHAPTER 1
First-Order Differential Equations
1.1 Preliminary Concepts

Before developing techniques for solving various kinds of differential equations, we will develop some terminology and geometric insight.
1.1.1 General and Particular Solutions

A first-order differential equation is any equation involving a first derivative, but no higher derivative. In its most general form, it has the appearance

F(x, y, y') = 0,   (1.1)

in which y(x) is the function of interest and x is the independent variable. Examples are

y' - y^2 e^y = 0,

y' - 2 = 0,

and

y' - cos(x) = 0.

Note that y' must be present for an equation to qualify as a first-order differential equation, but x and/or y need not occur explicitly.

A solution of equation (1.1) on an interval I is a function φ that satisfies the equation for all x in I. That is,

F(x, φ(x), φ'(x)) = 0 for all x in I.

For example,

φ(x) = 2 + ke^{-x}

is a solution of

y' + y = 2

for all real x, and for any number k. Here I can be chosen as the entire real line. And

φ(x) = x ln(x) + cx

is a solution of

y' = y/x + 1
for all x > 0, and for any number c.

In both of these examples, the solution contained an arbitrary constant. This is a symbol independent of x and y that can be assigned any numerical value. Such a solution is called the general solution of the differential equation. Thus

φ(x) = 2 + ke^{-x}

is the general solution of y' + y = 2. Each choice of the constant in the general solution yields a particular solution. For example,

f(x) = 2 + e^{-x},  g(x) = 2 - e^{-x},  and  h(x) = 2 - √3 e^{-x}

are all particular solutions of y' + y = 2, obtained by choosing, respectively, k = 1, -1, and -√3 in the general solution.

1.1.2 Implicitly Defined Solutions

Sometimes we can write a solution explicitly, giving y as a function of x. For example,

y = ke^{-x}

is the general solution of

y' = -y,

as can be verified by substitution. This general solution is explicit, with y isolated on one side of an equation and a function of x on the other.

By contrast, consider

y' = -(2xy^3 + 2)/(3x^2 y^2 + 8e^{4y}).

We claim that the general solution is the function y(x) implicitly defined by the equation

x^2 y^3 + 2x + 2e^{4y} = k,   (1.2)

in which k can be any number. To verify this, implicitly differentiate equation (1.2) with respect to x, remembering that y is a function of x. We obtain

2xy^3 + 3x^2 y^2 y' + 2 + 8e^{4y} y' = 0,

and solving for y' yields the differential equation. In this example we are unable to solve equation (1.2) explicitly for y as a function of x, isolating y on one side. Equation (1.2), implicitly defining the general solution, was obtained by a technique we will develop shortly, but this technique cannot guarantee an explicit solution.
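The implicit differentiation step can also be carried out by machine. A sketch assuming Python with sympy, whose `idiff` helper differentiates an equation F = 0 implicitly and solves for dy/dx:

```python
import sympy as sp

x, y, k = sp.symbols('x y k')

# Equation (1.2), written in the form F = 0; k is treated as a constant.
F = x**2 * y**3 + 2*x + 2*sp.exp(4*y) - k

# Implicit differentiation gives dy/dx along the curves F = 0.
dydx = sp.idiff(F, y, x)

# This should match the right side of y' = -(2xy^3 + 2)/(3x^2 y^2 + 8 e^{4y}).
rhs = -(2*x*y**3 + 2) / (3*x**2 * y**2 + 8*sp.exp(4*y))
assert sp.simplify(dydx - rhs) == 0
```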
1.1.3 Integral Curves

A graph of a solution of a first-order differential equation is called an integral curve of the equation. If we know the general solution, we obtain an infinite family of integral curves, one for each choice of the arbitrary constant.

EXAMPLE 1.1
We have seen that the general solution of

y' + y = 2

is

y = 2 + ke^{-x}

for all x. The integral curves of y' + y = 2 are graphs of y = 2 + ke^{-x} for different choices of k. Some of these are shown in Figure 1.1.

FIGURE 1.1 Integral curves of y' + y = 2 for k = 0, 3, -3, 6, and -6.
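The defining property of this family can be confirmed symbolically for every k at once, since k enters the computation as an ordinary symbol. A one-check sketch assuming sympy:

```python
import sympy as sp

x, k = sp.symbols('x k')

# The whole family y = 2 + k e^{-x} satisfies y' + y = 2; each value of k
# picks out one integral curve (k = 0 gives the horizontal line y = 2).
y = 2 + k * sp.exp(-x)
assert sp.simplify(sp.diff(y, x) + y - 2) == 0
```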
EXAMPLE 1.2

It is routine to verify that the general solution of

y' + y/x = e^x

is

y = (1/x)(xe^x - e^x + c)

for x ≠ 0. Graphs of some of these integral curves, obtained by making choices for c, are shown in Figure 1.2.

FIGURE 1.2 Integral curves of y' + y/x = e^x for c = 0, 5, 20, -6, and -10.

We will see shortly how these general solutions are obtained. For the moment, we simply want to illustrate integral curves. Although in simple cases integral curves can be sketched by hand, generally we need computer assistance. Computer packages such as MAPLE, MATHEMATICA, and MATLAB are widely available. Here is an example in which the need for computing assistance is clear.
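The "routine to verify" claim of Example 1.2 is also easy to delegate to a computer algebra system. A sketch assuming sympy, whose `checkodesol` substitutes a candidate solution into an ODE and reports whether the residual vanishes:

```python
import sympy as sp

x, c = sp.symbols('x c')
y = sp.Function('y')

# The ODE of Example 1.2: y' + y/x = e^x.
ode = sp.Eq(y(x).diff(x) + y(x)/x, sp.exp(x))

# The claimed general solution, with arbitrary constant c.
candidate = sp.Eq(y(x), (x*sp.exp(x) - sp.exp(x) + c)/x)

# checkodesol returns (True, 0) when the candidate satisfies the equation.
ok, residual = sp.checkodesol(ode, candidate)
assert ok
```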
EXAMPLE 1.3

The differential equation

y' + xy = 2

has general solution

y(x) = e^{-x^2/2} ∫_0^x 2e^{t^2/2} dt + ke^{-x^2/2}.

Figure 1.3 shows computer-generated integral curves corresponding to k = 0, 4, 13, -7, -15, and -11.
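Because this general solution involves an integral with no elementary antiderivative, a numerical check is natural. The sketch below assumes numpy and scipy are available; the value k = 4 and the test points are arbitrary choices for illustration.

```python
import numpy as np
from scipy.integrate import quad

def y(x, k=4.0):
    # y(x) = e^{-x^2/2} ( integral_0^x 2 e^{t^2/2} dt + k )
    integral, _ = quad(lambda t: 2.0 * np.exp(t**2 / 2.0), 0.0, x)
    return np.exp(-x**2 / 2.0) * (integral + k)

# Verify y' + x y = 2 at several points, using a centered finite difference
# to approximate the derivative.
h = 1e-6
for x0 in [-1.5, 0.0, 0.7, 2.0]:
    dydx = (y(x0 + h) - y(x0 - h)) / (2.0 * h)
    assert abs(dydx + x0 * y(x0) - 2.0) < 1e-5
```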
1.1.4 The Initial Value Problem

The general solution of a first-order differential equation F(x, y, y') = 0 contains an arbitrary constant, hence there is an infinite family of integral curves, one for each choice of the constant. If we specify that a solution is to pass through a particular point (x0, y0), then we must find that particular integral curve (or curves) passing through this point. This is called an initial value problem. Thus, a first-order initial value problem has the form

F(x, y, y') = 0;  y(x0) = y0,

in which x0 and y0 are given numbers. The condition y(x0) = y0 is called an initial condition.
FIGURE 1.3 Integral curves of y' + xy = 2 for k = 0, 4, 13, -7, -15, and -11.
EXAMPLE 1.4

Consider the initial value problem

y' + y = 2;  y(1) = -5.

From Example 1.1, the general solution of y' + y = 2 is

y = 2 + ke^{-x}.

Graphs of this equation are the integral curves. We want the one passing through (1, -5). Solve for k so that

y(1) = 2 + ke^{-1} = -5,

obtaining

k = -7e.

The solution of this initial value problem is

y = 2 - 7e e^{-x} = 2 - 7e^{-(x-1)}.

As a check, y(1) = 2 - 7 = -5.

The effect of the initial condition in this example was to pick out one special integral curve as the solution sought. This suggests that an initial value problem may be expected to have a unique solution. We will see later that this is the case, under mild conditions on the coefficients in the differential equation.

1.1.5 Direction Fields

Imagine a curve, as in Figure 1.4. If we choose some points on the curve and, at each point, draw a segment of the tangent to the curve there, then these segments give a rough outline of the shape of the curve. This simple observation is the key to a powerful device for envisioning integral curves of a differential equation.
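Computer algebra systems solve initial value problems of this kind directly. A sketch assuming sympy, using the `ics` argument of `dsolve` to impose y(1) = -5:

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# Solve y' + y = 2 subject to y(1) = -5.
sol = sp.dsolve(sp.Eq(y(x).diff(x) + y(x), 2), y(x), ics={y(1): -5})

# The result should agree with y = 2 - 7 e^{-(x-1)}.
expected = 2 - 7*sp.exp(-(x - 1))
assert sp.simplify(sol.rhs - expected) == 0
```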
FIGURE 1.4 Short tangent segments suggest the shape of the curve.
The general first-order differential equation has the form

F(x, y, y') = 0.

Suppose we can solve for y' and write the differential equation as

y' = f(x, y).

Here f is a known function. Suppose f(x, y) is defined for all points (x, y) in some region R of the plane. The slope of the integral curve through a given point (x0, y0) of R is y'(x0), which equals f(x0, y0). If we compute f(x, y) at selected points in R, and draw a small line segment having slope f(x, y) at each (x, y), we obtain a collection of segments which trace out the shapes of the integral curves. This enables us to obtain important insight into the behavior of the solutions (such as where solutions are increasing or decreasing, limits they might have at various points, or behavior as x increases).

A drawing of the plane, with short line segments of slope f(x, y) drawn at selected points (x, y), is called a direction field of the differential equation y' = f(x, y). The name derives from the fact that at each point the line segment gives the direction of the integral curve through that point. The line segments are called lineal elements.
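Computing a direction field amounts to evaluating f on a grid and scaling the vectors (1, f(x, y)) to a common length. The sketch below assumes numpy; the helper name `direction_field` and the choice f(x, y) = x - y are illustrative, not from the text.

```python
import numpy as np

def direction_field(f, xs, ys):
    """Return grid points and unit lineal-element vectors (U, V) of slope f(x, y)."""
    X, Y = np.meshgrid(xs, ys)
    S = f(X, Y)                  # slope of the integral curve at each grid point
    L = np.sqrt(1.0 + S**2)      # length of the vector (1, slope)
    return X, Y, 1.0 / L, S / L  # normalized so every segment has unit length

# Lineal elements for the illustrative equation y' = x - y on a 9 x 9 grid.
X, Y, U, V = direction_field(lambda x, y: x - y,
                             np.linspace(-2, 2, 9), np.linspace(-2, 2, 9))

# Each segment points along the integral curve: V/U recovers the slope f(x, y).
assert np.allclose(V / U, X - Y)
assert np.allclose(U**2 + V**2, 1.0)
```

These (U, V) arrays are exactly what a quiver-style plotting routine would draw as the direction field.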
EXAMPLE 1.5
Consider the equation
y' = y^2.

Here f(x, y) = y^2, so the slope of the integral curve through (x, y) is y^2. Select some points and, through each, draw a short line segment having slope y^2. A computer-generated direction field is shown in Figure 1.5(a). The lineal elements form a profile of some integral curves and give us some insight into the behavior of solutions, at least in this part of the plane. Figure 1.5(b) reproduces this direction field, with graphs of the integral curves through (0, 1), (0, 2), (0, 3), (0, -1), (0, -2), and (0, -3). By a method we will develop, the general solution of y' = y^2 is

y = 1/(c - x),

so the integral curves form a family of hyperbolas, as suggested by the curves sketched in Figure 1.5(b).
FIGURE 1.5(a) A direction field for y' = y^2.

FIGURE 1.5(b) Direction field for y' = y^2 and integral curves through (0, 1), (0, 2), (0, 3), (0, -1), (0, -2), and (0, -3).
EXAMPLE 1.6

Figure 1.6 shows a direction field for

y' = sin(xy),

together with the integral curves through (0, 1), (0, 2), (0, 3), (0, -1), (0, -2), and (0, -3). In this case, we cannot write a simple expression for the general solution, and the direction field provides information about the behavior of solutions that is not otherwise readily apparent.
FIGURE 1.6 Direction field for y' = sin(xy) and integral curves through (0, 1), (0, 2), (0, 3), (0, -1), (0, -2), and (0, -3).

With this as background, we will begin a program of identifying special classes of first-order differential equations for which there are techniques for writing the general solution. This will occupy the next five sections.
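The lineal-element construction described above is easy to carry out numerically: evaluate f(x, y) on a grid of points and attach a short segment of that slope at each one. The following sketch (our own illustration, not from the text; the grid spacing and window are arbitrary choices) computes the slope grid for y' = y^2 on the window of Figure 1.5.

```python
def lineal_element_slopes(f, xs, ys):
    """Slope f(x, y) of the lineal element at each grid point (x, y)."""
    return [[f(x, y) for x in xs] for y in ys]

# Direction field data for y' = y^2 on the window [-4, 4] x [-4, 4].
xs = [-4 + 0.5 * i for i in range(17)]
ys = [-4 + 0.5 * j for j in range(17)]
slopes = lineal_element_slopes(lambda x, y: y * y, xs, ys)

# Since f depends only on y here, each row of the grid is constant: every
# lineal element along a horizontal line has the same slope.
assert all(s == slopes[0][0] for s in slopes[0])
print(slopes[0][0])   # slope along y = -4 is (-4)^2 = 16.0
```

A plotting package (for instance, matplotlib's quiver) can then draw a short segment with these slopes at each grid point to reproduce a picture like Figure 1.5(a).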
PROBLEMS

In each of Problems 1 through 6, determine whether the given function is a solution of the differential equation.

1. 2yy' = 1; φ(x) = √(x - 1) for x > 1
2. y' + y = 0; φ(x) = Ce^(-x)
3. y' _ _ 2Y + e for x > 0; φ(x) _ 2x 2x
5. xy' = x - y; φ(x) = 2x3 for x
6. y' + y = 1; φ(x) = 1 + Ce^(-x)

In each of Problems 7 through 11, verify by implicit differentiation that the given equation implicitly defines a solution of the differential equation.

In each of Problems 12 through 16, solve the initial value problem and graph the solution. Hint: Each of these differential equations can be solved by direct integration. Use the initial condition to solve for the constant of integration.

12. y' = 2x; y(2) = 1
13. y' = e^x; y(0) = 2
14. y' = 2x + 2; y(-1) =
15. y' = 4cos(x)sin(x); y(π/2) = 0
16. y' = 8x + cos(2x); y(0) = -3

In each of Problems 17 through 20, draw some lineal elements of the differential equation for -4 ≤ x ≤ 4, -4 ≤ y ≤ 4. Use the resulting direction field to sketch a graph of the solution of the initial value problem. (These problems can be done by hand.)

17. y' = y sin(x) - 3x^2; y(0) = 1
18. y' = e^x - y; y(-2) = 1
19. y' - y cos(x) = 1 - x^2; y(2) = 2
20. y' = 2y + 3; y(0) = 1

In each of Problems 21 through 26, generate a direction field and some integral curves for the differential equation. Also draw the integral curve representing the solution of the initial value problem. These problems should be done by a software package.

Show that, for the differential equation y' + p(x)y = q(x), the lineal elements on any vertical line x = x0, with p(x0) ≠ 0, all pass through the single point (ξ, η), where

ξ = x0 + 1/p(x0)  and  η = q(x0)/p(x0).
1.2 Separable Equations
DEFINITION 1.1 Separable Differential Equation

A differential equation is called separable if it can be written

y' = A(x)B(y).

In this event, we can separate the variables and write, in differential form,

(1/B(y)) dy = A(x) dx

wherever B(y) ≠ 0. We attempt to integrate this equation, writing

∫ (1/B(y)) dy = ∫ A(x) dx.

This yields an equation in x, y, and a constant of integration. This equation implicitly defines the general solution y(x). It may or may not be possible to solve explicitly for y(x).
EXAMPLE 1.7

y' = y^2 e^(-x)

is separable. Write

dy/dx = y^2 e^(-x)

as

dy/y^2 = e^(-x) dx

for y ≠ 0. Integrate this equation to obtain

-1/y = -e^(-x) + k,

an equation that implicitly defines the general solution. In this example we can explicitly solve for y, obtaining the general solution

y = 1/(e^(-x) - k).

Now recall that we required that y ≠ 0 in order to separate the variables by dividing by y^2. In fact, the zero function y(x) = 0 is a solution of y' = y^2 e^(-x), although it cannot be obtained from the general solution by any choice of k. For this reason, y(x) = 0 is called a singular solution of this equation.

Figure 1.7 shows graphs of particular solutions obtained by choosing k as 0, 3, -3, 6, and -6.
FIGURE 1.7 Integral curves of y' = y^2 e^(-x) for k = 0, 3, -3, 6, and -6.
Whenever we use separation of variables, we must be alert to solutions potentially lost through conditions imposed by the algebra used to make the separation.
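The bookkeeping in Example 1.7 is easy to confirm numerically. The sketch below (our own check, not part of the text) verifies with a centered difference that y = 1/(e^(-x) - k) satisfies y' = y^2 e^(-x) for an arbitrary choice of k, and records the point about the singular solution in a comment.

```python
import math

def f(x, y):
    """Right-hand side of y' = y^2 e^(-x)."""
    return y * y * math.exp(-x)

def y_general(x, k):
    """General solution found by separating variables: y = 1/(e^(-x) - k)."""
    return 1.0 / (math.exp(-x) - k)

# Check the general solution against the ODE with a centered difference.
k, x, h = -3.0, 1.0, 1e-6
slope = (y_general(x + h, k) - y_general(x - h, k)) / (2 * h)
assert abs(slope - f(x, y_general(x, k))) < 1e-6

# The singular solution y(x) = 0 also satisfies the equation (0 = 0^2 e^(-x)),
# but no choice of k makes 1/(e^(-x) - k) identically zero.
```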
EXAMPLE 1.8

x^2 y' = 1 + y

is separable, and we can write

dy/(1 + y) = dx/x^2.

The algebra of separation has required that x ≠ 0 and y ≠ -1, even though we can put x = 0 and y = -1 into the differential equation to obtain the correct equation 0 = 0. Now integrate the separated equation to obtain

ln|1 + y| = -1/x + k.

This implicitly defines the general solution. In this case, we can solve for y(x) explicitly. Begin by taking the exponential of both sides to obtain

|1 + y| = e^k e^(-1/x) = A e^(-1/x),

in which we have written A = e^k. Since k could be any number, A can be any positive number. Then

1 + y = ±A e^(-1/x) = B e^(-1/x),

in which B = ±A can be any nonzero number. The general solution is

y = -1 + B e^(-1/x),

in which B is any nonzero number.

Now revisit the assumption that x ≠ 0 and y ≠ -1. In the general solution, we actually obtain y = -1 if we allow B = 0. Further, the constant function y(x) = -1 does satisfy x^2 y' = 1 + y. Thus, by allowing B to be any number, including 0, the general solution y(x) = -1 + B e^(-1/x) contains all the solutions we have found. In this example, y = -1 is a solution, but not a singular solution, since it occurs as a special case of the general solution.

Figure 1.8 shows graphs of solutions corresponding to B = -8, -5, 0, 4, and 7.
FIGURE 1.8 Integral curves of x^2 y' = 1 + y for B = 0, 4, 7, -5, and -8.
We often solve an initial value problem by finding the general solution of the differential equation, then solving for the appropriate choice of the constant.
EXAMPLE 1.9

Solve the initial value problem

y' = y^2 e^(-x);  y(1) = 4.

We know from Example 1.7 that the general solution of y' = y^2 e^(-x) is

y(x) = 1/(e^(-x) - k).

Now we need to choose k so that

y(1) = 1/(e^(-1) - k) = 4,

from which we get

k = e^(-1) - 1/4.

The solution of the initial value problem is

y(x) = 1/(e^(-x) - e^(-1) + 1/4).
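As a quick numerical sanity check (our own sketch, not part of the text's method; the step size and endpoint are arbitrary choices), one can integrate the initial value problem with a classical fourth-order Runge-Kutta step and compare against the closed form above.

```python
import math

def f(x, y):
    """Right-hand side of y' = y^2 e^(-x)."""
    return y * y * math.exp(-x)

def exact(x):
    """Closed-form solution with y(1) = 4 from Example 1.9."""
    return 1.0 / (math.exp(-x) - math.exp(-1) + 0.25)

# Classical fourth-order Runge-Kutta from y(1) = 4 out to x = 1.5.
x, y = 1.0, 4.0
h = 0.001
for _ in range(500):
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    y += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    x += h

assert abs(y - exact(1.5)) < 1e-6   # numerics agree with the closed form
```

Note that the closed-form solution blows up where e^(-x) = e^(-1) - 1/4 (near x ≈ 2.14), so the numerical comparison is made safely to the left of that point.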
EXAMPLE 1.10

The general solution of

y' = y(x - 1)^2/(y + 3)

is implicitly defined by

y + 3 ln|y| = (x - 1)^3/3 + k.    (1.3)

To obtain the solution satisfying y(3) = -1, put x = 3 and y = -1 into equation (1.3) to obtain

-1 = (1/3)(2)^3 + k,

hence

k = -11/3.

The solution of this initial value problem is implicitly defined by

y + 3 ln|y| = (x - 1)^3/3 - 11/3.
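Even when a solution is only implicitly defined, values of y(x) can be extracted numerically. The sketch below (an illustration of ours, with a hand-picked bracket) applies bisection to the implicit relation of Example 1.10 and recovers the initial value at x = 3.

```python
import math

def F(x, y):
    """Implicit relation y + 3 ln|y| - (x - 1)^3/3 + 11/3 = 0 from Example 1.10."""
    return y + 3 * math.log(abs(y)) - (x - 1) ** 3 / 3 + 11 / 3

def solve_y(x, lo, hi, tol=1e-12):
    """Bisect F(x, .) = 0 on a bracket [lo, hi] where F changes sign."""
    flo = F(x, lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = F(x, mid)
        if fmid == 0:
            return mid
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

# On the branch y < 0, the solution at x = 3 must be the initial value y = -1.
y3 = solve_y(3.0, -1.5, -0.5)
assert abs(y3 - (-1.0)) < 1e-9
```

The bracket [-1.5, -0.5] was chosen because F(3, y) changes sign there and the relation is monotone in y on that interval, so the bisection converges to the unique root.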
1.2.1 Some Applications of Separable Differential Equations

Separable equations arise in many contexts, of which we will discuss three.
EXAMPLE 1.11 (The Mathematical Policewoman)

A murder victim is discovered, and a lieutenant from the forensic science laboratory is summoned to estimate the time of death.

The body is located in a room that is kept at a constant 68 degrees Fahrenheit. For some time after the death, the body will radiate heat into the cooler room, causing the body's temperature to decrease. Assuming (for want of better information) that the victim's temperature was a "normal" 98.6 at the time of death, the lieutenant will try to estimate this time by observing the body's current temperature and calculating how long it would have had to lose heat to reach this point.

According to Newton's law of cooling, the body will radiate heat energy into the room at a rate proportional to the difference in temperature between the body and the room. If T(t) is the body temperature at time t, then for some constant of proportionality k,

T'(t) = k[T(t) - 68].

The lieutenant recognizes this as a separable differential equation and writes

1/(T - 68) dT = k dt.

Upon integrating, she gets

ln|T - 68| = kt + C.

Taking exponentials, she gets

|T - 68| = e^(kt+C) = A e^(kt),

in which A = e^C. Then

T - 68 = ±A e^(kt) = B e^(kt).

Then

T(t) = 68 + B e^(kt).

Now the constants k and B must be determined, and this requires information. The lieutenant arrived at 9:40 P.M. and immediately measured the body temperature, obtaining 94.4 degrees. Letting 9:40 be time zero for convenience, this means that

T(0) = 94.4 = 68 + B,

and so B = 26.4. Thus far,

T(t) = 68 + 26.4 e^(kt).

To determine k, the lieutenant makes another measurement. At 11:00 she finds that the body temperature is 89.2 degrees. Since 11:00 is 80 minutes past 9:40, this means that

T(80) = 89.2 = 68 + 26.4 e^(80k).

Then

e^(80k) = 21.2/26.4,

so

80k = ln(21.2/26.4)
and

k = (1/80) ln(21.2/26.4).

The lieutenant now has the temperature function:

T(t) = 68 + 26.4 e^(ln(21.2/26.4) t/80).

In order to find the last time when the body was 98.6 (presumably the time of death), solve for the time in

T(t) = 98.6 = 68 + 26.4 e^(ln(21.2/26.4) t/80).

To do this, the lieutenant writes

30.6/26.4 = e^(ln(21.2/26.4) t/80)

and takes the logarithm of both sides to obtain

ln(30.6/26.4) = (t/80) ln(21.2/26.4).

Therefore the time of death, according to this mathematical model, was

t = 80 ln(30.6/26.4)/ln(21.2/26.4),
which is approximately -53.8 minutes. Death occurred approximately 53.8 minutes before (because of the negative sign) the first measurement at 9:40, which was chosen as time zero. This puts the murder at about 8:46 P.M.

EXAMPLE 1.12 (Radioactive Decay and Carbon Dating)

In radioactive decay, mass is converted to energy by radiation. It has been observed that the rate of change of the mass of a radioactive substance is proportional to the mass itself. This means that, if m(t) is the mass at time t, then for some constant of proportionality k that depends on the substance,

dm/dt = km.
This is a separable differential equation. Write it as

(1/m) dm = k dt

and integrate to obtain

ln|m| = kt + c.

Since mass is positive, |m| = m and

ln(m) = kt + c.

Then

m(t) = e^(kt+c) = A e^(kt),

in which A = e^c can be any positive number.
Determination of A and k for a given element requires two measurements. Suppose at some time, designated as time zero, there are M grams present. This is called the initial mass. Then m(0) = A = M, so

m(t) = M e^(kt).

If at some later time T we find that there are M_T grams, then

m(T) = M_T = M e^(kT).

Then

ln(M_T/M) = kT,

hence

k = (1/T) ln(M_T/M).

This gives us k and determines the mass at any time:

m(t) = M e^(ln(M_T/M) t/T).

We obtain a more convenient formula for the mass if we choose the time of the second measurement more carefully. Suppose we make the second measurement at that time T = H at which exactly half of the mass has radiated away. At this time, half of the mass remains, so M_T = M/2 and M_T/M = 1/2. Now the expression for the mass becomes

m(t) = M e^(ln(1/2) t/H),

or

m(t) = M e^(-ln(2) t/H).    (1.4)

This number H is called the half-life of the element. Although we took it to be the time needed for half of the original amount M to decay, in fact, between any times t1 and t1 + H, exactly half of the mass of the element present at t1 will radiate away. To see this, write

m(t1 + H) = M e^(-ln(2)(t1+H)/H) = M e^(-ln(2) t1/H) e^(-ln(2)) = (1/2) m(t1).
Equation (1.4) is the basis for an important technique used to estimate the ages of certain ancient artifacts. The earth's upper atmosphere is constantly bombarded by high-energy cosmic rays, producing large numbers of neutrons, which collide with nitrogen in the air, changing some of it into radioactive carbon-14, or 14C. This element has a half-life of about 5,730 years. Over the relatively recent period of the history of this planet in which life has evolved, the fraction of 14C in the atmosphere, compared to regular carbon, has been essentially constant. This means that living matter (plant or animal) has ingested 14C at about the same rate over a long historical period, and objects living, say, two million years ago would have had the same ratio of carbon-14 to carbon in their bodies as objects alive today. When an organism dies, it ceases its intake of 14C, which then begins to decay. By measuring the ratio of 14C to carbon in an artifact, we can estimate the amount of the decay, and hence the time it took, giving an estimate of the time the organism was alive. This process of estimating the age of an artifact is called carbon dating. Of course, in reality the ratio of 14C in the atmosphere has only been approximately constant, and in addition a sample may have been contaminated by exposure to other living organisms, or even to the air, so carbon dating is a sensitive process that can lead to controversial results. Nevertheless, when applied rigorously and combined with other tests and information, it has proved a valuable tool in historical and archeological studies.

To apply equation (1.4) to carbon dating, use H = 5730 and compute

ln(2)/H = ln(2)/5730 ≈ 0.000120968,

in which ≈ means "approximately equal" (not all decimal places are listed). Equation (1.4) becomes

m(t) = M e^(-0.000120968 t).
Now suppose we have an artifact, say a piece of fossilized wood, and measurements show that the ratio of 14C to carbon in the sample is 37 percent of the current ratio. If we say that the wood died at time 0, then we want to compute the time T it would take for one gram of the radioactive carbon to decay this amount. Thus, solve for T in

0.37 = e^(-0.000120968 T).

We find that

T = -ln(0.37)/0.000120968 ≈ 8,219

years. This is a little less than one and one-half half-lives, a reasonable estimate if nearly 2/3 of the 14C has decayed.
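The carbon-dating arithmetic of Example 1.12 takes only a few lines to reproduce. The sketch below (our own, with hypothetical helper names) packages the age computation and rechecks the 8,219-year figure.

```python
import math

HALF_LIFE = 5730.0                      # half-life of carbon-14, in years

def age_from_ratio(ratio):
    """Years since death, given the remaining fraction of the original 14C."""
    k = math.log(2) / HALF_LIFE         # decay constant, about 0.000120968 per year
    return -math.log(ratio) / k

age = age_from_ratio(0.37)              # the fossilized-wood sample of Example 1.12
print(round(age))                       # about 8219 years
```

By construction, a ratio of exactly 1/2 returns one half-life, 5,730 years.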
EXAMPLE 1.13 (Torricelli's Law)

Suppose we want to estimate how long it will take for a container to empty by discharging fluid through a drain hole. This is a simple enough problem for, say, a soda can, but not quite so easy for a large oil storage tank or chemical facility.

We need two principles from physics. The first is that the rate of discharge of a fluid flowing through an opening at the bottom of a container is given by

dV/dt = -kAv,

in which V(t) is the volume of fluid in the container at time t, v(t) is the discharge velocity of fluid through the opening, A is the cross-sectional area of the opening (assumed constant), and k is a constant determined by the viscosity of the fluid, the shape of the opening, and the fact that the cross-sectional area of fluid pouring out of the opening is slightly less than that of the opening itself. In practice, k must be determined for the particular fluid, container, and opening, and is a number between 0 and 1.

We also need Torricelli's law, which states that v(t) is equal to the velocity of a free-falling particle released from a height equal to the depth of the fluid at time t. (Free-falling means that the particle is influenced by gravity only.) Now the work done by gravity in moving the particle from its initial point by a distance h(t) is mgh(t), and this must equal the change in the kinetic energy, (1/2)mv^2. Therefore,

v(t) = √(2gh(t)).
FIGURE 1.9
Putting these two equations together yields

dV/dt = -kA √(2gh(t)).    (1.5)

We will apply equation (1.5) to a specific case to illustrate its use. Suppose we have a hemispherical tank of water, as in Figure 1.9. The tank has radius 18 feet, and water drains through a circular hole of radius 3 inches at the bottom. How long will it take the tank to empty?

Equation (1.5) contains two unknown functions, V(t) and h(t), so one must be eliminated. Let r(t) be the radius of the surface of the fluid at time t and consider an interval of time from t0 to t1 = t0 + Δt. The volume ΔV of water draining from the tank in this time equals the volume of a disk of thickness Δh (the change in depth) and radius r(t*), for some t* between t0 and t1. Therefore

ΔV = π[r(t*)]^2 Δh,

so

ΔV/Δt = π[r(t*)]^2 Δh/Δt.

In the limit as Δt → 0,

dV/dt = πr^2 dh/dt.

Putting this into equation (1.5) yields

πr^2 dh/dt = -kA √(2gh).
Now V has been eliminated, but at the cost of introducing r(t). However, from Figure 1.9,

r^2 = 18^2 - (18 - h)^2 = 36h - h^2,

so

π(36h - h^2) dh/dt = -kA √(2gh).

This is a separable differential equation, which we write as

π (36h - h^2)/h^(1/2) dh = -kA √(2g) dt.

Take g to be 32 feet per second per second. The radius of the circular opening is 3 inches, or 1/4 foot, so its area is A = π/16 square feet. For water, and an opening of this shape and size, experiment gives k = 0.8. The last equation becomes

π(36h^(1/2) - h^(3/2)) dh = -(0.8)(π/16)√64 dt,
or

(36h^(1/2) - h^(3/2)) dh = -0.4 dt.

A routine integration yields

24h^(3/2) - (2/5)h^(5/2) = -(2/5)t + c,

or

60h^(3/2) - h^(5/2) = -t + k.

Now h(0) = 18, so

60(18)^(3/2) - (18)^(5/2) = k.

Thus k = 2268√2, and h(t) is implicitly determined by the equation

60h^(3/2) - h^(5/2) = 2268√2 - t.

The tank is empty when h = 0, and this occurs when t = 2268√2 seconds, or about 53 minutes, 27 seconds.

The last three examples contain an important message. Differential equations can be used to solve a variety of problems, but a problem usually does not present itself as a differential equation. Normally we have some event or process, and we must use whatever information we have about it to derive a differential equation and initial conditions. This process is called mathematical modeling. The model consists of the differential equation and other relevant information, such as initial conditions. We look for a function satisfying the differential equation and the other information, in the hope of being able to predict future behavior, or perhaps better understand the process being considered.
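The implicit depth equation from Example 1.13 can be evaluated numerically. The sketch below (our own illustration; the 20-minute sample time and the 200-step bisection are arbitrary choices) recomputes the emptying time and finds the depth at an intermediate time.

```python
import math

def depth_equation(h, t):
    """Implicit relation 60 h^(3/2) - h^(5/2) = 2268*sqrt(2) - t from Example 1.13."""
    return 60 * h ** 1.5 - h ** 2.5 - (2268 * math.sqrt(2) - t)

# The tank is empty when h = 0, i.e. at t = 2268*sqrt(2) seconds.
t_empty = 2268 * math.sqrt(2)
print(round(t_empty))                   # about 3207 seconds, roughly 53.5 minutes

# Depth after 20 minutes, found by bisecting the implicit relation on [0, 18];
# the left side is increasing in h there, so the sign change stays bracketed.
t = 1200.0
lo, hi = 0.0, 18.0
for _ in range(200):
    mid = (lo + hi) / 2
    if depth_equation(mid, t) < 0:
        lo = mid
    else:
        hi = mid
print(round((lo + hi) / 2, 2))          # depth in feet after 20 minutes
```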
PROBLEMS

In each of Problems 1 through 10, determine if the differential equation is separable. If it is, find the general solution (perhaps implicitly defined). If it is not separable, do not attempt a solution at this time.

1. 2. 3. 4.
5. xy' + y = y^2
6. y' = (x + 1)^2 - 2y
7. x sin(y) y' = cos(y)
8. y' = (xy - 2y^2 + 1)/(x + 1)
9. y + y' = e^x - sin(y)
10. [cos(x + y) + sin(x - y)]y' = cos(2x)

In each of Problems 11 through 15, solve the initial value problem.

11. xy^2 y' = y + 1; y(3e^2) = 2
12. y' = 3x^2(y + 2); y(2) = 8
13. ln(y^x) y' = 3x^2 y; y(2) = e^3
14. 2yy' = e^(x - y^2); y(4) = -2
15. yy' = 2x sec(3y); y(2/3) = π/3

16. An object having a temperature of 90 degrees Fahrenheit is placed into an environment kept at 60 degrees. Ten minutes later the object has cooled to 88 degrees. What will be the temperature of the object after it has been in this environment for 20 minutes? How long will it take for the object to cool to 65 degrees?

17. A thermometer is carried outside a house whose ambient temperature is 70 degrees Fahrenheit. After five minutes the thermometer reads 60 degrees, and fifteen minutes after this, 50.4 degrees. What is the outside temperature (which is assumed to be constant)?
18. Assume that the population of bacteria in a petri dish changes at a rate proportional to the population at that time. This means that, if P(t) is the population at time t, then

dP/dt = kP

for some constant k. A particular culture has a population density of 100,000 bacteria per square inch. A culture that covered an area of 1 square inch at 10:00 A.M. on Tuesday was found to have grown to cover 3 square inches by noon the following Thursday. How many bacteria will be present at 3:00 P.M. the following Sunday? How many will be present on Monday at 4:00 P.M.? When will the world be overrun by these bacteria, assuming that they can live anywhere on the earth's surface? (Here you need to look up the land area of the earth.)

19. Assume that a sphere of ice melts at a rate proportional to its surface area, retaining a spherical shape. Interpret melting as a reduction of volume with respect to time. Determine an expression for the volume of the ice at any time t.

20. A radioactive element has a half-life of ln(2) weeks. If e^3 tons are present at a given time, how much will be left 3 weeks later?

21. The half-life of uranium-238 is approximately 4.5 billion years. How much of a 10-kilogram block of U-238 will be present 1 billion years from now?

22. Given that 12 grams of a radioactive element decays to 9.1 grams in 4 minutes, what is the half-life of this element?

23. Evaluate

∫_0^∞ e^(-t^2 - 9/t^2) dt.

Hint: Let

I(x) = ∫_0^∞ e^(-t^2 - x^2/t^2) dt.

Calculate I'(x) by differentiating under the integral sign, then let u = x/t. Show that I'(x) = -2I(x) and solve for I(x). Evaluate the constant by using the standard result that ∫_0^∞ e^(-t^2) dt = √π/2. Finally, evaluate I(3).

24. Derive the fact used in Example 1.13 that v(t) = √(2gh(t)). Hint: Consider a free-falling particle having height h(t) at time t. The work done by gravity in moving the particle from its starting point to a given point is mgh(t), and this must equal the change in the kinetic energy, which is (1/2)mv^2.
25. Calculate the time required to empty the hemispherical tank of Example 1.13 if the tank is positioned with its flat side down.

26. (Draining a Hot Tub) Consider a cylindrical hot tub with a 5-foot radius and height of 4 feet, placed on one of its circular ends. Water is draining from the tub through a circular hole 5 inches in diameter located in the base of the tub.

(a) Assume a value k = 0.6 to determine the rate at which the depth of the water is changing. Here it is useful to write

dh/dt = (dh/dV)(dV/dt) = (dV/dt)/(dV/dh).

(b) Calculate the time T required to drain the hot tub if it is initially full. Hint: One way to do this is to write

T = ∫_0^H (dt/dh) dh.

(c) Determine how much longer it takes to drain the lower half than the upper half of the tub. Hint: Use the integral suggested in (b), with different limits for the two halves.

27. (Draining a Cone) A tank shaped like a right circular cone, with its vertex down, is 9 feet high and has a diameter of 8 feet. It is initially full of water.

(a) Determine the time required to drain the tank through a circular hole of diameter 2 inches at the vertex. Take k = 0.6.

(b) Determine the time it takes to drain the tank if it is inverted and the drain hole is of the same size and shape as in (a), but now located in the new base.

28. (Drain Hole at Unknown Depth) Determine the rate of change of the depth of water in the tank of Problem 27 (vertex at the bottom) if the drain hole is located in the side of the cone 2 feet above the bottom of the tank. What is the rate of change in the depth of the water when the drain hole is located in the bottom of the tank? Is it possible to determine the location of the drain hole if we are told the rate of change of the depth and the depth of the water in the tank? Can this be done without knowing the size of the drain opening?

29. Suppose the conical tank of Problem 27, vertex at the bottom, is initially empty and water is added at the constant rate of π/10 cubic feet per second. Does the tank ever overflow?

30. (Draining a Sphere) Determine the time it takes to completely drain a spherical tank of radius 18 feet if it is initially full of water and the water drains through a circular hole of radius 3 inches located in the bottom of the tank. Use k = 0.8.
1.3 Linear Differential Equations

DEFINITION 1.2 Linear Differential Equation

A first-order differential equation is linear if it has the form
y'(x) + p(x)y = q(x).

Assume that p and q are continuous on an interval I (possibly the whole real line). Because of the special form of the linear equation, we can obtain the general solution on I by a clever observation. Multiply the differential equation by e^(∫p(x)dx) to get

y' e^(∫p(x)dx) + p(x)y e^(∫p(x)dx) = q(x) e^(∫p(x)dx).

The left side of this equation is the derivative of the product y(x)e^(∫p(x)dx), enabling us to write

d/dx (y(x) e^(∫p(x)dx)) = q(x) e^(∫p(x)dx).

Now integrate to obtain

y(x) e^(∫p(x)dx) = ∫ q(x) e^(∫p(x)dx) dx + C.

Finally, solve for y(x):

y(x) = e^(-∫p(x)dx) ∫ q(x) e^(∫p(x)dx) dx + C e^(-∫p(x)dx).    (1.6)

The function e^(∫p(x)dx) is called an integrating factor for the differential equation, because multiplication of the differential equation by this factor results in an equation that can be integrated to obtain the general solution. We do not recommend memorizing equation (1.6). Instead, recognize the form of the linear equation and understand the technique of solving it by multiplying by e^(∫p(x)dx).

EXAMPLE 1.14
The equation y' + y = sin(x) is linear. Here p(x) = 1 and q(x) = sin(x), both continuous for all x. An integrating factor is

e^(∫dx) = e^x.

Multiply the differential equation by e^x to get

y'e^x + ye^x = e^x sin(x),

or

(ye^x)' = e^x sin(x).

Integrate to get

ye^x = ∫ e^x sin(x) dx = (1/2) e^x [sin(x) - cos(x)] + C.

The general solution is

y(x) = (1/2)[sin(x) - cos(x)] + C e^(-x).
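The integrating-factor computation can be spot-checked numerically. The following sketch (our own verification, not the text's) confirms that the general solution found in Example 1.14 satisfies y' + y = sin(x) for several choices of C, using a centered difference for y'.

```python
import math

def y(x, C):
    """General solution of y' + y = sin(x) found with the integrating factor e^x."""
    return 0.5 * (math.sin(x) - math.cos(x)) + C * math.exp(-x)

# Verify the ODE residual y' + y - sin(x) is zero at several points.
h = 1e-6
for C in (-2.0, 0.0, 3.0):
    for x in (-1.0, 0.5, 2.0):
        dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
        assert abs(dy + y(x, C) - math.sin(x)) < 1e-6
```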
EXAMPLE 1.15

Solve the initial value problem

y' = 3x^2 - y/x;  y(1) = 5.

First recognize that the differential equation can be written in linear form:

y' + (1/x)y = 3x^2.

An integrating factor is e^(∫(1/x)dx) = e^(ln(x)) = x, for x > 0. Multiply the differential equation by x to get

xy' + y = 3x^3,

or

(xy)' = 3x^3.

Integrate to get

xy = (3/4)x^4 + C.

Then

y(x) = (3/4)x^3 + C/x

for x > 0. For the initial condition, we need

y(1) = 5 = 3/4 + C,

so C = 17/4 and the solution of the initial value problem is

y(x) = (3/4)x^3 + 17/(4x)
for x > 0.

Depending on p and q, it may not be possible to evaluate all of the integrals in the general solution (1.6) in closed form (as a finite algebraic combination of elementary functions). This occurs with

y' + xy = 2,

whose general solution is

y(x) = 2e^(-x^2/2) ∫ e^(x^2/2) dx + C e^(-x^2/2).

We cannot write ∫ e^(x^2/2) dx in elementary terms. However, we could still use a software package to generate a direction field and integral curves, as is done in Figure 1.10. This provides some idea of the behavior of solutions, at least within the range of the diagram.
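When the integral in the general solution has no elementary form, it can still be evaluated numerically. The sketch below (our own, with arbitrary choices of quadrature rule, panel count, and constant C) writes the solution of y' + xy = 2 using Simpson's rule for ∫ e^(t^2/2) dt from 0 to x, then checks the ODE residual with a centered difference.

```python
import math

def integral_exp(x, n=1000):
    """Simpson's rule for the nonelementary integral of e^(t^2/2) from 0 to x."""
    h = x / n
    s = math.exp(0.0) + math.exp(x * x / 2)
    for i in range(1, n):
        t = i * h
        s += (4 if i % 2 else 2) * math.exp(t * t / 2)
    return s * h / 3

def y(x, C):
    """Solution of y' + xy = 2 with y(0) = C, built from the quadrature above."""
    return 2 * math.exp(-x * x / 2) * integral_exp(x) + C * math.exp(-x * x / 2)

# Check the residual y' + x*y - 2 at x = 1 with a centered difference.
C, h = 4.0, 1e-5
dy = (y(1 + h, C) - y(1 - h, C)) / (2 * h)
assert abs(dy + 1 * y(1, C) - 2) < 1e-5
```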
FIGURE 1.10 Integral curves of y' + xy = 2 passing through (0, 2), (0, 4), (0, -2), and (0, -5).
Linear differential equations arise in many contexts. Example 1.11, involving estimation of time of death, involved a separable differential equation which is also linear, and which could have been solved using an integrating factor.
EXAMPLE 1.16 (A Mixing Problem)

Sometimes we want to know how much of a given substance is present in a container in which various substances are being added, mixed, and removed. Such problems are called mixing problems, and they are frequently encountered in the chemical industry and in manufacturing processes.

As an example, suppose a tank contains 200 gallons of brine (salt mixed with water), in which 100 pounds of salt are dissolved. A mixture consisting of 1/8 pound of salt per gallon is flowing into the tank at a rate of 3 gallons per minute, and the mixture is continuously stirred. Meanwhile, brine is allowed to empty out of the tank at the same rate of 3 gallons per minute (Figure 1.11). How much salt is in the tank at any time?

FIGURE 1.11

Before constructing a mathematical model, notice that the initial ratio of salt to brine in the tank is 100 pounds per 200 gallons, or 1/2 pound per gallon. Since the mixture pumped in has a constant ratio of 1/8 pound per gallon, we expect the brine mixture to dilute toward the incoming ratio, with a "terminal" amount of salt in the tank of 1/8 pound per gallon, times 200 gallons. This leads to the expectation that in the long term (as t → ∞) the amount of salt in the tank should approach 25 pounds.
Now let Q(t) be the amount of salt in the tank at time t. The rate of change of Q(t) with time must equal the rate at which salt is pumped in, minus the rate at which it is pumped out. Thus

dQ/dt = (rate in) - (rate out)
      = (1/8 pound/gallon)(3 gallons/minute) - (Q(t)/200 pounds/gallon)(3 gallons/minute)
      = 3/8 - (3/200) Q(t).

This is the linear equation

Q'(t) + (3/200) Q = 3/8.

An integrating factor is e^(∫(3/200)dt) = e^(3t/200). Multiply the differential equation by this factor to obtain

Q' e^(3t/200) + (3/200) e^(3t/200) Q = (3/8) e^(3t/200),

or

(Q e^(3t/200))' = (3/8) e^(3t/200).

Then

Q e^(3t/200) = (3/8)(200/3) e^(3t/200) + C,

so

Q(t) = 25 + C e^(-3t/200).

Now Q(0) = 100 = 25 + C, so C = 75 and

Q(t) = 25 + 75 e^(-3t/200).

As we expected, as t increases, the amount of salt approaches the limiting value of 25 pounds. From the derivation of the differential equation for Q(t), it is apparent that this limiting value depends on the rate at which salt is poured into the tank, but not on the initial amount of salt in the tank. The term 25 in the solution is called the steady-state part of the solution because it is independent of time, and the term 75e^(-3t/200) is the transient part. As t increases, the transient part exerts less influence on the amount of salt in the tank, and in the limit the solution approaches its steady-state part.
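The steady-state behavior of Example 1.16 is easy to see in a simulation. The sketch below (our own check; the step size and 600-minute horizon are arbitrary) integrates dQ/dt = 3/8 - (3/200)Q by forward Euler and compares against the closed form Q(t) = 25 + 75e^(-3t/200).

```python
import math

def q_exact(t):
    """Closed-form amount of salt from Example 1.16: Q(t) = 25 + 75 e^(-3t/200)."""
    return 25 + 75 * math.exp(-3 * t / 200)

# Forward-Euler simulation of dQ/dt = 3/8 - (3/200) Q with Q(0) = 100.
q, dt = 100.0, 0.01
for _ in range(60000):                  # simulate 600 minutes
    q += dt * (3 / 8 - (3 / 200) * q)

assert abs(q - q_exact(600)) < 0.01     # simulation tracks the closed form
assert abs(q_exact(600) - 25) < 0.01    # both are near the steady state, 25 lb
```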
PROBLEMS

In each of Problems 1 through 8, find the general solution. Not all integrals can be done in closed form.

1. y' - (3/x)y = 2x^2
2. y' - y = sinh(x)
3. y' + 2y = x
4. sin(2x)y' + 2y sin^2(x) = 2 sin(x)

15. Find all functions with the property that the y-intercept of the tangent to the graph at (x, y) is 2x^2.

16. A 500-gallon tank initially contains 50 gallons of brine solution in which 28 pounds of salt have been dissolved. Beginning at time zero, brine containing 2 pounds of salt per gallon is added at the rate of 3 gallons per minute, and the mixture is poured out of the tank at the rate of 2 gallons per minute. How much salt is in the tank when it contains 100 gallons of brine? Hint: The amount of brine in the tank at time t is 50 + t.

17. Two tanks are cascaded as in Figure 1.12. Tank 1 initially contains 20 pounds of salt dissolved in 100 gallons of brine, while tank 2 contains 150 gallons of brine in which 90 pounds of salt are dissolved. At time zero a brine solution containing pound of salt per gallon is added to tank 1 at the rate of 5 gallons per minute. Tank 1 has an output that discharges brine into tank 2 at the rate of 5 gallons per minute, and tank 2 also has an output of 5 gallons per minute. Determine the amount of salt in each tank at any time t. Also determine when the concentration of salt in tank 2 is a minimum and how much salt is in the tank at that time. Hint: Solve for the amount of salt in tank 1 at time t first and then use this solution to determine the amount in tank 2.

FIGURE 1.12 Mixing between tanks in Problem 17.
1.4 Exact Differential Equations
We continue the theme of identifying certain kinds of first-order differential equations for which there is a method leading to a solution. We can write any first-order equation y' = f(x, y) in the form

M(x, y) + N(x, y)y' = 0.

For example, put M(x, y) = -f(x, y) and N(x, y) = 1. An interesting thing happens if there is a function φ such that

∂φ/∂x = M(x, y)  and  ∂φ/∂y = N(x, y).    (1.7)
In this event, the differential equation becomes

∂φ/∂x + (∂φ/∂y)(dy/dx) = 0,

which, by the chain rule, is the same as

d/dx φ(x, y(x)) = 0.

But this means that

φ(x, y(x)) = C,
with C constant. If we now read this argument from the last line back to the first, the conclusio n is that the equation co(x, y) = C implicitly defines a function y(x) that is the general solution of the differential equation . Thus , finding a function that satisfies equation (1 .7) is equivalent to solving the differential equation . Before taking this further, consider an example .
EXAMPLE 1.17

The differential equation

y' = -(2xy³ + 2)/(3x²y² + 8e^(4y))

is neither separable nor linear. Write it in the form

M + Ny' = 2xy³ + 2 + (3x²y² + 8e^(4y))y' = 0,    (1.8)

with M(x, y) = 2xy³ + 2 and N(x, y) = 3x²y² + 8e^(4y). Equation (1.8) can in turn be written

M dx + N dy = (2xy³ + 2) dx + (3x²y² + 8e^(4y)) dy = 0.    (1.9)

Now let

φ(x, y) = x²y³ + 2x + 2e^(4y).

Soon we will see where this came from, but for now, observe that

∂φ/∂x = 2xy³ + 2 = M and ∂φ/∂y = 3x²y² + 8e^(4y) = N.

With this choice of φ(x, y), equation (1.9) becomes

(∂φ/∂x) dx + (∂φ/∂y) dy = 0,

or

dφ(x, y) = 0.
CHAPTER 1 First-Order Differential Equations

The general solution of this equation is φ(x, y) = C, or, in this example,

x²y³ + 2x + 2e^(4y) = C.

This implicitly defines the general solution of the differential equation (1.8). To verify this, differentiate the last equation implicitly with respect to x:

2xy³ + 3x²y²y' + 2 + 8e^(4y)y' = 0,

or

2xy³ + 2 + (3x²y² + 8e^(4y))y' = 0.

This is equivalent to the original differential equation

y' = -(2xy³ + 2)/(3x²y² + 8e^(4y)).
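The implicit solution can also be checked numerically. The following Python sketch (not part of the text; the starting point (0, 0) and step size are arbitrary choices) integrates the differential equation with a classical Runge-Kutta step and confirms that φ(x, y) = x²y³ + 2x + 2e^(4y) stays essentially constant along the computed solution curve:

```python
import math

# Right-hand side of y' = -(2xy^3 + 2)/(3x^2 y^2 + 8 e^{4y});
# the denominator is always positive, so f is defined everywhere.
def f(x, y):
    return -(2 * x * y**3 + 2) / (3 * x**2 * y**2 + 8 * math.exp(4 * y))

def phi(x, y):
    return x**2 * y**3 + 2 * x + 2 * math.exp(4 * y)

x, y, h = 0.0, 0.0, 1e-3
phi0 = phi(x, y)               # phi(0, 0) = 2
drift = 0.0
for _ in range(1000):          # march x from 0 to 1
    k1 = f(x, y)
    k2 = f(x + h/2, y + h*k1/2)
    k3 = f(x + h/2, y + h*k2/2)
    k4 = f(x + h, y + h*k3)
    y += (h/6) * (k1 + 2*k2 + 2*k3 + k4)
    x += h
    drift = max(drift, abs(phi(x, y) - phi0))
```

The recorded drift in φ is at the level of the integrator's error, which is what the relation φ(x, y(x)) = C predicts.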
With this as background, we will make the following definitions.

DEFINITION 1.3 Potential Function

A function φ is a potential function for the differential equation M(x, y) + N(x, y)y' = 0 on a region R of the plane if, for each (x, y) in R,

∂φ/∂x = M(x, y) and ∂φ/∂y = N(x, y).

DEFINITION 1.4 Exact Differential Equation

When a potential function exists on a region R for the differential equation M + Ny' = 0, then this equation is said to be exact on R.
The differential equation of Example 1.17 is exact (over the entire plane), because we exhibited a potential function for it, defined for all (x, y). Once a potential function is found, we can write an equation implicitly defining the general solution. Sometimes we can explicitly solve for the general solution, and sometimes we cannot.

Now go back to Example 1.17. We want to explore how the potential function that materialized there was found. Recall we required that

∂φ/∂x = 2xy³ + 2 = M and ∂φ/∂y = 3x²y² + 8e^(4y) = N.
Pick either of these equations to begin and integrate it. Say we begin with the first. Then integrate with respect to x:

φ(x, y) = ∫ (∂φ/∂x) dx = ∫ (2xy³ + 2) dx = x²y³ + 2x + g(y).

In this integration with respect to x we held y fixed, hence we must allow that y appears in the "constant" of integration. If we calculate ∂φ/∂x, we get 2xy³ + 2 for any function g(y). Now we know φ to within this function g. Use the fact that we know ∂φ/∂y to write

∂φ/∂y = 3x²y² + 8e^(4y) = ∂/∂y (x²y³ + 2x + g(y)) = 3x²y² + g'(y).

This equation holds if g'(y) = 8e^(4y), hence we may choose g(y) = 2e^(4y). This gives the potential function

φ(x, y) = x²y³ + 2x + 2e^(4y).

If we had chosen to integrate ∂φ/∂y first, we would have gotten

φ(x, y) = ∫ (3x²y² + 8e^(4y)) dy = x²y³ + 2e^(4y) + h(x).

Here h can be any function of one variable, because no matter how h(x) is chosen,

∂/∂y (x²y³ + 2e^(4y) + h(x)) = 3x²y² + 8e^(4y),

as required. Now we have two expressions for ∂φ/∂x:

∂φ/∂x = 2xy³ + 2 = ∂/∂x (x²y³ + 2e^(4y) + h(x)) = 2xy³ + h'(x).

This equation forces us to choose h so that h'(x) = 2, and we may therefore set h(x) = 2x. This gives φ(x, y) = x²y³ + 2e^(4y) + 2x, as we got before.

Not every first-order differential equation is exact. For example, consider

y + y' = 0.
If there were a potential function φ, then we would have

∂φ/∂x = y and ∂φ/∂y = 1.

Integrate ∂φ/∂x = y with respect to x to get φ(x, y) = xy + g(y). Substitute this into ∂φ/∂y = 1 to get

∂/∂y (xy + g(y)) = x + g'(y) = 1.

But this can hold only if g'(y) = 1 - x, an impossibility if g is to be independent of x. Therefore, y + y' = 0 has no potential function. This differential equation is not exact (even though it is easily solved either as a separable or as a linear equation).

This example suggests the need for a convenient test for exactness. This is provided by the following theorem, in which a "rectangle in the plane" refers to the set of points on or inside any rectangle having sides parallel to the axes.
THEOREM 1.1 Test for Exactness

Suppose M(x, y), N(x, y), ∂M/∂y, and ∂N/∂x are continuous for all (x, y) within a rectangle R in the plane. Then, M(x, y) + N(x, y)y' = 0 is exact on R if and only if, for each (x, y) in R,

∂M/∂y = ∂N/∂x.

Proof If M + Ny' = 0 is exact, then there is a potential function φ and

∂φ/∂x = M(x, y) and ∂φ/∂y = N(x, y).

Then, for (x, y) in R,

∂M/∂y = ∂/∂y (∂φ/∂x) = ∂²φ/∂y∂x = ∂²φ/∂x∂y = ∂/∂x (∂φ/∂y) = ∂N/∂x.

Conversely, suppose ∂M/∂y and ∂N/∂x are continuous on R and that ∂M/∂y = ∂N/∂x. Choose any (x₀, y₀) in R and define, for (x, y) in R,

φ(x, y) = ∫[x₀, x] M(ξ, y₀) dξ + ∫[y₀, y] N(x, η) dη.    (1.10)

Immediately we have, from the fundamental theorem of calculus,

∂φ/∂y = N(x, y),

since the first integral in equation (1.10) is independent of y. Next, compute

∂φ/∂x = ∂/∂x ∫[x₀, x] M(ξ, y₀) dξ + ∂/∂x ∫[y₀, y] N(x, η) dη
  = M(x, y₀) + ∫[y₀, y] (∂N/∂x)(x, η) dη
  = M(x, y₀) + ∫[y₀, y] (∂M/∂y)(x, η) dη
  = M(x, y₀) + M(x, y) - M(x, y₀) = M(x, y),

and the proof is complete. ■

For example, consider again y + y' = 0. Here M(x, y) = y and N(x, y) = 1, so

∂N/∂x = 0 and ∂M/∂y = 1

throughout the entire plane. Thus, y + y' = 0 cannot be exact on any rectangle in the plane. We saw this previously by showing that this differential equation can have no potential function.
EXAMPLE 1.18

Consider

x² + 3xy + (4xy + 2x)y' = 0.

Here M(x, y) = x² + 3xy and N(x, y) = 4xy + 2x. Now

∂N/∂x = 4y + 2 and ∂M/∂y = 3x,

and 3x = 4y + 2 is satisfied by all (x, y) on a straight line. However, ∂N/∂x = ∂M/∂y cannot hold for all (x, y) in an entire rectangle in the plane. Hence this differential equation is not exact on any rectangle.
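The test of Theorem 1.1 is easy to probe numerically. The sketch below (not part of the text; `exactness_gap` is a hypothetical helper and the sample points are arbitrary) approximates ∂M/∂y - ∂N/∂x by central differences for the equation of Example 1.18. The gap is nonzero at a generic point but vanishes at a point on the line 3x = 4y + 2:

```python
# Approximate dM/dy - dN/dx by central differences; an exact equation
# would give (approximately) zero everywhere in a rectangle.
def exactness_gap(M, N, x, y, h=1e-6):
    dM_dy = (M(x, y + h) - M(x, y - h)) / (2 * h)
    dN_dx = (N(x + h, y) - N(x - h, y)) / (2 * h)
    return dM_dy - dN_dx

# Example 1.18: M = x^2 + 3xy, N = 4xy + 2x, so dM/dy = 3x, dN/dx = 4y + 2.
M = lambda x, y: x**2 + 3*x*y
N = lambda x, y: 4*x*y + 2*x

gap_generic = exactness_gap(M, N, 1.0, 2.0)   # 3x - (4y + 2) = 3 - 10 = -7
x0 = 2.0
y0 = (3*x0 - 2) / 4                           # a point on the line 3x = 4y + 2
gap_on_line = exactness_gap(M, N, x0, y0)     # ~0, but only on that line
```

A single zero of the gap is not enough; exactness requires the gap to vanish on an entire rectangle.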
EXAMPLE 1.19

Consider

e^x sin(y) - 2x + (e^x cos(y) + 1)y' = 0.

With M(x, y) = e^x sin(y) - 2x and N(x, y) = e^x cos(y) + 1, we have

∂N/∂x = e^x cos(y) = ∂M/∂y

for all (x, y). Therefore this differential equation is exact. To find a potential function, set

∂φ/∂x = e^x sin(y) - 2x and ∂φ/∂y = e^x cos(y) + 1.

Choose one of these equations and integrate it. Integrate the second equation with respect to y:

φ(x, y) = ∫ (e^x cos(y) + 1) dy = e^x sin(y) + y + h(x).

Then we must have

∂φ/∂x = e^x sin(y) - 2x = ∂/∂x (e^x sin(y) + y + h(x)) = e^x sin(y) + h'(x).

Then h'(x) = -2x and we may choose h(x) = -x². A potential function is

φ(x, y) = e^x sin(y) + y - x².

The general solution of the differential equation is defined implicitly by

e^x sin(y) + y - x² = C.

Note of Caution: If φ is a potential function for M + Ny' = 0, φ itself is not the solution. The general solution is defined implicitly by the equation φ(x, y) = C.
PROBLEMS

In each of Problems 1 through 8, determine where (if anywhere) in the plane the differential equation is exact. If it is exact, find a potential function and the general solution, perhaps implicitly defined. If the equation is not exact, do not attempt a solution at this time.

1. 2y² + ye^(xy) + (4xy + xe^(xy) + 2y)y' = 0

In each of Problems 9 through 14, determine if the differential equation is exact in some rectangle containing in its interior the point where the initial condition is given. If so, solve the initial value problem. This solution may be implicitly defined. If the differential equation is not exact, do not attempt a solution.

12. 1 + e^(y/x) - (y/x)e^(y/x) + e^(y/x) y' = 0; y(1) = -5
13. y sinh(y - x) - cosh(y - x) + y sinh(y - x)y' = 0

In Problems 15 and 16, choose a constant α so that the differential equation is exact, then produce a potential function and obtain the general solution.

15. 2xy³ - 3y - (3x + αx²y² - 2αy)y' = 0
16. 3x² + xy^α - x²y^(α-1)y' = 0

17. Let φ be a potential function for M + Ny' = 0 in some region R of the plane. Show that for any constant c, φ + c is also a potential function. How does the general solution of M + Ny' = 0 obtained by using φ differ from that obtained using φ + c?
1.5 Integrating Factors

"Most" differential equations are not exact on any rectangle. But sometimes we can multiply the differential equation by a nonzero function μ(x, y) to obtain an exact equation. Here is an example that suggests why this might be useful.

EXAMPLE 1.20

The equation

y² - 6xy + (3xy - 6x²)y' = 0    (1.11)

is not exact on any rectangle. Multiply it by μ(x, y) = y to get

y³ - 6xy² + (3xy² - 6x²y)y' = 0.    (1.12)

Wherever y ≠ 0, equations (1.11) and (1.12) have the same solutions. The reason for this is that equation (1.12) is just

y[y² - 6xy + (3xy - 6x²)y'] = 0,

and if y ≠ 0, then necessarily y² - 6xy + (3xy - 6x²)y' = 0. Now notice that equation (1.12) is exact (over the entire plane), having potential function

φ(x, y) = xy³ - 3x²y².

Thus the general solution of equation (1.12) is defined implicitly by

xy³ - 3x²y² = C,

and, wherever y ≠ 0, this defines the general solution of equation (1.11) as well.

To review what has just occurred, we began with a nonexact differential equation. We multiplied it by a function μ chosen so that the new equation was exact. We solved this exact equation, then found that this solution also worked for the original, nonexact equation. The function μ therefore enabled us to solve a nonexact equation by solving an exact one. This idea is worth pursuing, and we begin by giving a name to μ.
DEFINITION 1.5 Integrating Factor

Let M(x, y) and N(x, y) be defined on a region R of the plane. Then μ(x, y) is an integrating factor for M + Ny' = 0 if μ(x, y) ≠ 0 for all (x, y) in R, and μM + μNy' = 0 is exact on R.

How do we find an integrating factor for M + Ny' = 0? For μ to be an integrating factor, μM + μNy' = 0 must be exact (in some region of the plane), hence

∂/∂y (μM) = ∂/∂x (μN)    (1.13)

in this region. This is a starting point. Depending on M and N, we may be able to determine μ from this equation. Sometimes equation (1.13) becomes simple enough to solve if we try μ as a function of just x or just y.
EXAMPLE 1.21

The differential equation x - xy - y' = 0 is not exact. Here M = x - xy and N = -1, and equation (1.13) is

∂/∂x (-μ) = ∂/∂y (μ(x - xy)).

Write this as

-∂μ/∂x = (x - xy) ∂μ/∂y - xμ.

Now observe that this equation is simplified if we try to find μ as just a function of x, because in this event ∂μ/∂y = 0 and we are left with just

∂μ/∂x = xμ.

This is separable. Write

(1/μ) dμ = x dx

and integrate to obtain

ln|μ| = x²/2.

Here we let the constant of integration be zero because we need only one integrating factor. From the last equation, choose

μ(x) = e^(x²/2),

a nonzero function. Multiply the original differential equation by e^(x²/2) to obtain

(x - xy)e^(x²/2) - e^(x²/2) y' = 0.

This equation is exact over the entire plane, and we find the potential function

φ(x, y) = (1 - y)e^(x²/2).

The general solution of this exact equation is implicitly defined by

(1 - y)e^(x²/2) = C.

In this case, we can explicitly solve for y to get

y(x) = 1 - Ce^(-x²/2),

and this is also the general solution of the original equation x - xy - y' = 0.

If we cannot find an integrating factor that is a function of just x or just y, then we must try something else. There is no template to follow, and often we must start with equation (1.13) and be observant.
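A quick numerical spot-check of Example 1.21 (a sketch, not part of the text; the value of C and the sample points are arbitrary) confirms that y = 1 - Ce^(-x²/2) satisfies x - xy - y' = 0:

```python
import math

C = 2.5                                     # any constant should work
y = lambda x: 1 - C * math.exp(-x**2 / 2)

def residual(x, h=1e-6):
    # central-difference estimate of y'
    y_prime = (y(x + h) - y(x - h)) / (2 * h)
    return x - x * y(x) - y_prime

max_residual = max(abs(residual(0.1 * k)) for k in range(1, 30))
```

The residual is zero up to the error of the difference quotient, for every sample point.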
EXAMPLE 1.22

Consider

2y² - 9xy + (3xy - 6x²)y' = 0.

This is not exact. With M = 2y² - 9xy and N = 3xy - 6x², begin looking for an integrating factor by writing equation (1.13):

∂/∂x [μ(3xy - 6x²)] = ∂/∂y [μ(2y² - 9xy)].

This is

(3xy - 6x²) ∂μ/∂x + μ(3y - 12x) = (2y² - 9xy) ∂μ/∂y + μ(4y - 9x).    (1.14)

If we attempt μ = μ(x), then ∂μ/∂y = 0 and we obtain

(3xy - 6x²) ∂μ/∂x + μ(3y - 12x) = μ(4y - 9x),

which cannot be solved for μ as just a function of x. Similarly, if we try μ = μ(y), so ∂μ/∂x = 0, we obtain an equation we cannot solve. We must try something else. Notice that equation (1.14) involves only integer powers of x and y. This suggests that we try μ(x, y) = x^a y^b. Substitute this into equation (1.14) and attempt to choose a and b. The substitution gives us

3a x^a y^(b+1) - 6a x^(a+1) y^b + 3x^a y^(b+1) - 12x^(a+1) y^b = 2b x^a y^(b+1) - 9b x^(a+1) y^b + 4x^a y^(b+1) - 9x^(a+1) y^b.

Assume that x ≠ 0 and y ≠ 0. Then we can divide by x^a y^b to get

3ay - 6ax + 3y - 12x = 2by - 9bx + 4y - 9x.

Rearrange terms to write

(1 + 2b - 3a)y = (-3 + 9b - 6a)x.

Since x and y are independent, this equation can hold for all x and y only if

1 + 2b - 3a = 0 and -3 + 9b - 6a = 0.

Solve these equations to obtain a = b = 1. An integrating factor is μ(x, y) = xy. Multiply the differential equation by xy to get

2xy³ - 9x²y² + (3x²y² - 6x³y)y' = 0.

This is exact with potential function φ(x, y) = x²y³ - 3x³y². For x ≠ 0 and y ≠ 0, the solution of the original differential equation is given implicitly by

x²y³ - 3x³y² = C.

The manipulations used to find an integrating factor may fail to find some solutions, as we saw with singular solutions of separable equations. Here are two examples in which this occurs.
EXAMPLE 1.23

Consider

2xy/(y - 1) - y' = 0.    (1.15)

We can solve this as a separable equation, but here we want to make a point about integrating factors. Equation (1.15) is not exact, but μ(x, y) = (y - 1)/y is an integrating factor for y ≠ 0, a condition not required by the differential equation itself. Multiplying the differential equation by μ(x, y) yields the exact equation

2x - ((y - 1)/y) y' = 0,

with potential function φ(x, y) = x² - y + ln|y| and general solution defined by

x² - y + ln|y| = C for y ≠ 0.

This is also the general solution of equation (1.15), but the method used has required that y ≠ 0. However, we see immediately that y = 0 is also a solution of equation (1.15). This singular solution is not contained in the expression for the general solution for any choice of C.
EXAMPLE 1.24

The equation

y - 3 - xy' = 0    (1.16)

is not exact, but μ(x, y) = 1/(x(y - 3)) is an integrating factor for x ≠ 0 and y ≠ 3, conditions not required by the differential equation itself. Multiplying equation (1.16) by μ(x, y) yields the exact equation

1/x - (1/(y - 3)) y' = 0,

with general solution defined by

ln|x| + C = ln|y - 3|.

This is also the general solution of equation (1.16) in any region of the plane not containing the lines x = 0 or y = 3.

This general solution can be solved for y explicitly in terms of x. First, any real number is the natural logarithm of some positive number, so write the arbitrary constant as C = ln(k), in which k can be any positive number. The equation for the general solution becomes

ln|x| + ln(k) = ln|y - 3|,

or

ln|kx| = ln|y - 3|.

But then y - 3 = ±kx. Replacing ±k with K, which can now be any nonzero real number, we obtain

y = 3 + Kx

as the general solution of equation (1.16). Now observe that y = 3 is a solution of equation (1.16). This solution was "lost," or at least not found, in using the integrating factor as a method of solution. However, y = 3 is not a singular solution, because we can include it in the expression y = 3 + Kx by allowing K = 0. Thus the general solution of equation (1.16) is y = 3 + Kx, with K any real number.
1.5.1 Separable Equations and Integrating Factors

We will point out a connection between separable equations and integrating factors. The separable equation y' = A(x)B(y) is in general not exact. To see this, write it as

A(x)B(y) - y' = 0,

so in the present context we have M(x, y) = A(x)B(y) and N(x, y) = -1. Now

∂/∂x (-1) = 0 and ∂/∂y [A(x)B(y)] = A(x)B'(y),

and in general A(x)B'(y) ≠ 0.

However, μ(y) = 1/B(y) is an integrating factor for the separable equation. If we multiply the differential equation by 1/B(y), we get

A(x) - (1/B(y)) y' = 0,

an exact equation because

∂/∂x [-1/B(y)] = ∂/∂y [A(x)] = 0.

The act of separating the variables is the same as multiplying by the integrating factor 1/B(y).
1.5.2 Linear Equations and Integrating Factors

Consider the linear equation y' + p(x)y = q(x). We can write this as

[p(x)y - q(x)] + y' = 0,

so in the present context, M(x, y) = p(x)y - q(x) and N(x, y) = 1. Now

∂/∂x [1] = 0 and ∂/∂y [p(x)y - q(x)] = p(x),

so the linear equation is not exact unless p(x) is identically zero. However, μ(x) = e^(∫p(x) dx) is an integrating factor. Upon multiplying the linear equation by μ, we get

[p(x)y - q(x)]e^(∫p(x) dx) + e^(∫p(x) dx) y' = 0,

and this is exact because

∂/∂x [e^(∫p(x) dx)] = p(x)e^(∫p(x) dx) = ∂/∂y [[p(x)y - q(x)]e^(∫p(x) dx)].
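The integrating-factor recipe for linear equations can be turned directly into a small solver. The sketch below (not part of the text; the choices p(x) = 1, q(x) = x, y(0) = 0 are arbitrary test data) evaluates y(x) = e^(-P(x))(∫ e^(P(t))q(t) dt + y(0)), with P an antiderivative of p, using a trapezoidal quadrature, and compares it with the closed-form solution x - 1 + e^(-x):

```python
import math

# Test case: y' + y = x, y(0) = 0, so p(x) = 1 and we may take P(x) = x.
# The exact solution is y = x - 1 + e^{-x}.
q = lambda t: t
P = lambda t: t

def solve_linear(x, y0=0.0, n=2000):
    h = x / n
    g = lambda t: math.exp(P(t)) * q(t)
    # trapezoidal approximation of the integral of e^{P(t)} q(t) over [0, x]
    integral = h * ((g(0) + g(x)) / 2 + sum(g(k * h) for k in range(1, n)))
    return math.exp(-P(x)) * (integral + y0)

approx = solve_linear(2.0)
exact = 2.0 - 1.0 + math.exp(-2.0)
```

For smooth p and q the trapezoid rule can be replaced by any quadrature; the structure of the formula is what matters here.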
PROBLEMS

1. Determine a test involving M and N to tell when M + Ny' = 0 has an integrating factor that is a function of y only.

2. Determine a test to determine when M + Ny' = 0 has an integrating factor of the form μ(x, y) = x^a y^b for some constants a and b.

3. Consider y - xy' = 0.
(a) Show that this equation is not exact on any rectangle.
(b) Find an integrating factor μ(x) that is a function of x alone.
(c) Find an integrating factor ν(y) that is a function of y alone.
(d) Show that there is also an integrating factor η(x, y) = x^a y^b for some constants a and b. Find all such integrating factors.
In each of Problems 4 through 12, (a) show that the differential equation is not exact, (b) find an integrating factor, (c) find the general solution (perhaps implicitly defined), and (d) determine any singular solutions the differential equation might have.

11. y' + y = y⁴ (Hint: try μ(x, y) = e^(ax) y^b)
12. x²y' + xy = -y^(-3/2) (Hint: try μ(x, y) = x^a y^b)

In each of Problems 13 through 20, find an integrating factor, use it to find the general solution of the differential equation, and then obtain the solution of the initial value problem.

21. Show that any nonzero constant multiple of an integrating factor for M + Ny' = 0 is also an integrating factor.

22. Let μ(x, y) be an integrating factor for M + Ny' = 0 and suppose that the general solution is defined by φ(x, y) = C. Show that μ(x, y)G(φ(x, y)) is also an integrating factor, for any differentiable function G of one variable.

1.6 Homogeneous, Bernoulli, and Riccati Equations

In this section we will consider three additional kinds of first-order differential equations for which techniques for finding solutions are available.
1.6.1 Homogeneous Differential Equations

DEFINITION 1.6 Homogeneous Differential Equation

A first-order differential equation is homogeneous if it has the form

y' = f(y/x).

In a homogeneous equation, y' is isolated on one side, and the other side is some expression in which y and x must always appear in the combination y/x. For example,

y' = (y/x) sin(y/x)

is homogeneous, while y' = x²y is not. Sometimes algebraic manipulation will put a first-order equation into the form of the homogeneous equation. For example,

y' = y/(x + y)    (1.17)

is not homogeneous. However, if x ≠ 0, we can write this as

y' = (y/x)/(1 + y/x),    (1.18)

a homogeneous equation. Any technique we develop for homogeneous equations can therefore be used on equation (1.18). However, this solution assumes that x ≠ 0, which is not required in equation (1.17). Thus, as we have seen before, when we perform manipulations on a differential equation, we must be careful that solutions have not been overlooked. A solution of equation (1.18) will also satisfy (1.17), but equation (1.17) may have other solutions as well.

Now to the point. A homogeneous equation is always transformed into a separable one by the transformation y = ux. To see this, compute y' = u'x + x'u = u'x + u and write u = y/x. Then y' = f(y/x) becomes

u'x + u = f(u).

We can write this as

(1/(f(u) - u)) du/dx = 1/x,

or, in differential form,

(1/(f(u) - u)) du = (1/x) dx,

and the variables (now x and u) have been separated. Upon integrating this equation, we obtain the general solution of the transformed equation. Substituting u = y/x then gives the general solution of the original homogeneous equation.
EXAMPLE 1.25

Consider

xy' = (y²/x) + y.

Write this as

y' = (y/x)² + y/x.

Let y = ux. Then

u'x + u = u² + u,

or u'x = u². Write this as

(1/u²) du = (1/x) dx

and integrate to obtain

-1/u = ln|x| + C.

Then

u(x) = -1/(ln|x| + C),

the general solution of the transformed equation. The general solution of the original equation is

y(x) = -x/(ln|x| + C).
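A numerical spot-check of Example 1.25 (a sketch, not part of the text; the value of C and the sample points are arbitrary, chosen away from the singularity at ln|x| + C = 0):

```python
import math

C = 0.5
y = lambda x: -x / (math.log(abs(x)) + C)   # claimed general solution

def residual(x, h=1e-6):
    # residual of x y' = y^2/x + y, with y' from a central difference
    y_prime = (y(x + h) - y(x - h)) / (2 * h)
    return x * y_prime - (y(x)**2 / x + y(x))

max_residual = max(abs(residual(1.0 + 0.1 * k)) for k in range(20))
```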
EXAMPLE 1.26 A Pursuit Problem

A pursuit problem is one of determining a trajectory so that one object intercepts another. Examples involving pursuit problems are missiles fired at airplanes and a rendezvous of a shuttle with a space station. These are complex problems that require numerical approximation techniques. We will consider a simple pursuit problem that can be solved explicitly.

Suppose a person jumps into a canal of constant width w and swims toward a fixed point directly opposite the point of entry into the canal. The person's speed is v and the water current's speed is s. Assume that, as the swimmer makes his way across, he always orients to point toward the target. We want to determine the swimmer's trajectory.

Figure 1.13 shows a coordinate system drawn so that the swimmer's destination is the origin and the point of entry into the water is (w, 0). At time t the swimmer is at the point (x(t), y(t)). The horizontal and vertical components of his velocity are, respectively,

x'(t) = -v cos(α) and y'(t) = s - v sin(α),

with α the angle between the positive x axis and the line from the origin to (x(t), y(t)) at time t. From these equations,

dy/dx = y'(t)/x'(t) = (s - v sin(α))/(-v cos(α)) = tan(α) - (s/v) sec(α).

FIGURE 1.13 The swimmer's path.

From Figure 1.13,

tan(α) = y/x and sec(α) = √(x² + y²)/x.

Therefore

dy/dx = y/x - (s/(vx))√(x² + y²).

Write this as the homogeneous equation

dy/dx = y/x - (s/v)√(1 + (y/x)²)

and put y = ux to obtain

(1/√(1 + u²)) du = -(s/(vx)) dx.

Integrate to get

ln|u + √(1 + u²)| = -(s/v) ln|x| + C.

Take the exponential of both sides of this equation:

u + √(1 + u²) = e^C e^(-(s/v) ln|x|).

We can write this as

u + √(1 + u²) = K x^(-s/v).

This equation can be solved for u. First write

√(1 + u²) = K x^(-s/v) - u

and square both sides to get

1 + u² = K² x^(-2s/v) - 2Ku x^(-s/v) + u².

Now u² cancels and we can solve for u:

u = (1/2) K x^(-s/v) - (1/(2K)) x^(s/v).

Finally, put u = y/x to get

y(x) = (1/2) K x^(1-s/v) - (1/(2K)) x^(1+s/v).

To determine K, notice that y(w) = 0, since we put the origin at the point of destination. Thus,

(1/2) K w^(1-s/v) - (1/(2K)) w^(1+s/v) = 0,

and we obtain K = w^(s/v). Therefore,

y(x) = (w/2)[(x/w)^(1-s/v) - (x/w)^(1+s/v)].

As might be expected, the path the swimmer takes depends on the width of the canal, the speed of the swimmer, and the speed of the current. Figure 1.14 shows graphs of this trajectory for several values of s/v, with w chosen as 1.
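The closed-form trajectory can be compared against a direct numerical integration of the swimmer's differential equation. This sketch (not part of the text; the ratio s/v = 1/2 and the stopping point x = 0.1 are arbitrary choices made to avoid the division by zero at x = 0) marches the equation from the entry point (w, 0) toward the far bank with Runge-Kutta steps:

```python
import math

w, s_over_v = 1.0, 0.5

def f(x, y):
    # dy/dx = y/x - (s/v) * sqrt(1 + (y/x)^2)
    u = y / x
    return u - s_over_v * math.sqrt(1 + u**2)

def closed_form(x):
    return (w / 2) * ((x / w)**(1 - s_over_v) - (x / w)**(1 + s_over_v))

x, y = w, 0.0
h = -1e-4                    # march from x = w down toward x = 0
while x + h > 0.1:           # stop short of x = 0, where f is singular
    k1 = f(x, y)
    k2 = f(x + h/2, y + h*k1/2)
    k3 = f(x + h/2, y + h*k2/2)
    k4 = f(x + h, y + h*k3)
    y += (h/6) * (k1 + 2*k2 + 2*k3 + k4)
    x += h

error = abs(y - closed_form(x))
```

The integrated path and the formula agree to within the integrator's error along the whole crossing.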
1.6.2 The Bernoulli Equation

DEFINITION 1.7

A Bernoulli equation is a first-order equation

y' + P(x)y = R(x)y^α,

in which α is a real number. ■

A Bernoulli equation is linear if α = 0 and separable if α = 1. About 1696, Leibniz showed that a Bernoulli equation with α ≠ 1 transforms to a linear equation under the change of variables v = y^(1-α). This is routine to verify. Here is an example.
EXAMPLE 1.27

Consider the equation

y' + (1/x)y = 3x²y³,

which is Bernoulli with P(x) = 1/x, R(x) = 3x², and α = 3. Make the change of variables

v = y^(-2).

Then y = v^(-1/2) and

y'(x) = -(1/2)v^(-3/2) v'(x),

so the differential equation becomes

-(1/2)v^(-3/2) v'(x) + (1/x)v^(-1/2) = 3x²v^(-3/2),

or, upon multiplying by -2v^(3/2),

v' - (2/x)v = -6x²,

a linear equation. An integrating factor is e^(-∫(2/x) dx) = x^(-2). Multiply the last equation by this factor to get

x^(-2) v' - 2x^(-3) v = -6,

which is

(x^(-2) v)' = -6.

Integrate to get

x^(-2) v = -6x + C,

so

v = -6x³ + Cx².

The general solution of the Bernoulli equation is

y(x) = 1/√(v(x)) = 1/√(Cx² - 6x³).
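A numerical spot-check of Example 1.27 (a sketch, not part of the text; C and the sample points are arbitrary, kept inside the region where Cx² - 6x³ > 0):

```python
import math

C = 10.0
y = lambda x: 1.0 / math.sqrt(C * x**2 - 6 * x**3)

def residual(x, h=1e-6):
    # residual of y' + y/x = 3 x^2 y^3, with y' from a central difference
    y_prime = (y(x + h) - y(x - h)) / (2 * h)
    return y_prime + y(x) / x - 3 * x**2 * y(x)**3

max_residual = max(abs(residual(0.2 + 0.1 * k)) for k in range(10))
```

With C = 10 the solution exists for 0 < x < C/6; the sample points stay well inside that interval.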
1.6.3 The Riccati Equation

DEFINITION 1.8

A differential equation of the form

y' = P(x)y² + Q(x)y + R(x)

is called a Riccati equation. ■

A Riccati equation is linear exactly when P(x) is identically zero. If we can somehow obtain one solution S(x) of a Riccati equation, then the change of variables

y = S(x) + 1/z

transforms the Riccati equation to a linear equation. The strategy is to find the general solution of this linear equation and from it produce the general solution of the original Riccati equation.
EXAMPLE 1.28

Consider the Riccati equation

y' = (1/x)y² + (1/x)y - 2/x.

By inspection, y = S(x) = 1 is one solution. Define a new variable z by putting

y = 1 + 1/z.

Then

y' = -(1/z²) z'.

Substitute these into the Riccati equation to get

-(1/z²) z' = (1/x)(1 + 1/z)² + (1/x)(1 + 1/z) - 2/x,

or

z' + (3/x)z = -1/x.

This is linear. An integrating factor is e^(∫(3/x) dx) = x³. Multiply by x³ to get

x³ z' + 3x² z = (x³ z)' = -x².

Integrate to get

x³ z = -(1/3)x³ + C,

so

z(x) = -1/3 + C/x³.

The general solution of the Riccati equation is

y(x) = 1 + 1/z(x) = 1 + 1/(-1/3 + C/x³).

This solution can also be written

y(x) = (K + 2x³)/(K - x³),

in which K = 3C is an arbitrary constant.
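A numerical spot-check of Example 1.28 (a sketch, not part of the text; K and the sample points are arbitrary, kept away from the pole at x³ = K):

```python
K = 5.0
y = lambda x: (K + 2 * x**3) / (K - x**3)

def residual(x, h=1e-6):
    # residual of y' = (1/x) y^2 + (1/x) y - 2/x
    y_prime = (y(x + h) - y(x - h)) / (2 * h)
    return y_prime - (y(x)**2 / x + y(x) / x - 2 / x)

max_residual = max(abs(residual(0.3 + 0.1 * k)) for k in range(10))
```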
PROBLEMS

In each of Problems 1 through 14, find the general solution. These problems include all types considered in this section.

1. xy' = …
2. y' + … y = … y^(-4/3)
3. y' + xy = xy²
4. y' = x/y + y/x
5. y' = y/(x + y)
6. y' = … y - … y⁴
7. (x - 2y)y' = 2x - y
8. xy' = x cos(y/x) + y
9. y' + (1/x)y = (1/x)y^(-3/4)
10. x²y' = x² + y²
11. y' = -(1/x)y² + …
12. x²y' = x²y - y³
13. y' = -e^(-x)y² + y + e^x
14. y' + … y = … y²/x

15. Consider the differential equation

y' = F((ax + by + c)/(dx + ey + r)),

in which a, b, c, d, e, and r are constants and F is a differentiable function of one variable.
(a) Show that this equation is homogeneous if and only if c = r = 0.
(b) If c and/or r is not zero, this equation is called nearly homogeneous. Assuming that ae - bd ≠ 0, show that it is possible to choose constants h and k so that the transformation X = x + h, Y = y + k converts this nearly homogeneous equation into a homogeneous one. Hint: Put x = X - h, y = Y - k into the differential equation and obtain a differential equation in X and Y. Use the conclusion of (a) to choose h and k so that this equation is homogeneous.

In each of Problems 16 through 19, use the idea of Problem 15 to find the general solution.

16. y' = (y - 3)/(x + y - 1)
17. y' = (3x - y - 9)/(x + y + 1)
18. y' = (x + 2y + 7)/(-2x + y - 9)
19. y' = (2x - 5y - 9)/(-4x + y + 9)

20. Continuing from Problem 15, consider the case that ae - bd = 0. Now let u = (ax + by)/a, assuming that a ≠ 0. Show that this transforms the differential equation of Problem 15 into the separable equation

du/dx = 1 + (b/a)F((au + c)/(du + r)).

In each of Problems 21 through 24, use the method of Problem 20 to find the general solution.

21. y' = (x - y + 2)/(x - y + 3)
22. y' = (3x + y - 1)/(6x + 2y - 3)
23. y' = (x - 2y)/(3x - 6y + 4)
24. y' = (x - y + 6)/(3x - 3y + 4)

25. (The Pursuing Dog) A man stands at the junction of two perpendicular roads and his dog is watching him from one of the roads at a distance A feet away. At a given instant the man starts to walk with constant speed v along the other road, and at the same time the dog begins to run toward the man with speed 2v. Determine the path the dog will take, assuming that it always moves so that it is facing the man. Also determine when the dog will eventually catch the man. (This is American Mathematical Monthly problem 3942, 1941.)

26. (Pursuing Bugs) One bug is located at each corner of a square table of side length a. At a given time they begin moving at constant speed v, each pursuing its neighbor to the right.
(a) Determine the curve of pursuit of each bug. Hint: Use polar coordinates with the origin at the center of the table and the polar axis containing one of the corners. When a bug is at (f(θ), θ), its target is at (f(θ), θ + π/2). Use the chain rule to write

dy/dx = (dy/dθ)/(dx/dθ),

where y(θ) = f(θ) sin(θ) and x(θ) = f(θ) cos(θ).
(b) Determine the distance traveled by each bug.
(c) Does any bug actually catch its quarry?

27. (The Spinning Bug) A bug steps onto the edge of a disk of radius a that is spinning at a constant angular speed ω. The bug moves toward the center of the disk at constant speed v.
(a) Derive a differential equation for the path of the bug, using polar coordinates.
(b) How many revolutions will the disk make before the bug reaches the center? (The solution will be in terms of the angular speed and radius of the disk.)
(c) Referring to (b), what is the total distance the bug will travel, taking into account the motion of the disk?
1.7 Applications to Mechanics, Electrical Circuits, and Orthogonal Trajectories

1.7.1 Mechanics

Before applying first-order differential equations to problems in mechanics, we will review some background.

Newton's second law of motion states that the rate of change of momentum (mass times velocity) of a body is proportional to the resultant force acting on the body. This is a vector equation, but we will for now consider only motion along a straight line. In this case Newton's law is

F = k (d/dt)(mv).

We will take k = 1, consistent with certain units of measurement, such as the English, MKS, or cgs systems. The mass of a moving object need not be constant. For example, an airplane consumes fuel as it moves. If m is constant, then Newton's law is

F = m dv/dt = ma,

in which a is the acceleration of the object along the line of motion. If m is not constant, then

F = m dv/dt + v dm/dt.

Newton's law of gravitational attraction states that if two objects have masses m₁ and m₂, and they (or their centers of mass) are at distance r from each other, then each attracts the other with a gravitational force of magnitude

F = G m₁m₂/r².

This force is directed along the line between the centers of mass. G is the universal gravitational constant. If one of the objects is the earth, then

F = G mM/(R + x)²,

where M is the mass of the earth, R is its radius (about 3,960 miles), m is the mass of the second object, and x is its distance from the surface of the earth. This assumes that the earth is spherical and that its center of mass is at the center of this sphere, a good enough approximation for some purposes. If x is small compared to R, then R + x is approximately R, and the force on the object is approximately

(GM/R²) m,

which is often written as mg. Here g = GM/R² is approximately 32 feet per second per second, or 9.8 meters per second per second. We are now ready to analyze some problems in mechanics.

Terminal Velocity Consider an object that is falling under the influence of gravity, in a medium such as water, air, or oil. This medium retards the downward motion of the object. Think, for example, of a brick dropped in a swimming pool or a ball bearing dropped in a tank of oil. We want to analyze the object's motion.

Let v(t) be the velocity at time t. The force of gravity pulls the object down and has magnitude mg. The medium retards the motion. The magnitude of this retarding force is not obvious, but experiment has shown that its magnitude is proportional to the square of the velocity. If we choose downward as the positive direction and upward as negative, then Newton's law tells us that, for some constant α,
dv dt If we assume that the object begins its motion from rest (dropped, not thrown) and if we star t the clock at this instant, then v(0) = O . We now have an initial value problem for the velocity : a v'= g -- v2 ; v(0)=0 . in This differential equation is separable. In differential form, F=mg -av2=m
(1/(g - (a/m)v²)) dv = dt.

Integrate to get

√(m/(ag)) tanh⁻¹(√(a/(mg)) v) = t + C.
Solve for the velocity, obtaining

v(t) = √(mg/a) tanh(√(ag/m) (t + C)).
Now use the initial condition to solve for the integration constant:

v(0) = √(mg/a) tanh(√(ag/m) C) = 0.

Since tanh(θ) = 0 only if θ = 0, this requires that C = 0, and the solution for the velocity is
v(t) = √(mg/a) tanh(√(ag/m) t).
Even in this generality, we can draw an important conclusion about the motion. As t increases, tanh(√(ag/m) t) approaches 1. This means that

lim_{t→∞} v(t) = √(mg/a).
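This solution and its limiting behavior are easy to check numerically. The following sketch (the values m = 1 slug, g = 32 ft/s², and a = 0.5 are illustrative choices, not taken from the text) confirms that v(t) satisfies v' = g - (a/m)v² and approaches √(mg/a):

```python
import math

M, G, A = 1.0, 32.0, 0.5            # illustrative mass, gravity, drag constant

def v(t):
    # v(t) = sqrt(mg/a) * tanh(sqrt(ag/m) * t), the solution derived above
    return math.sqrt(M * G / A) * math.tanh(math.sqrt(A * G / M) * t)

# Central-difference check that v' = g - (a/m) v^2 at a sample time
t0, h = 0.7, 1e-6
dv = (v(t0 + h) - v(t0 - h)) / (2 * h)
residual = dv - (G - (A / M) * v(t0) ** 2)

terminal = math.sqrt(M * G / A)     # limiting velocity sqrt(mg/a) = 8 ft/s here
```

For these values the terminal velocity is √(32/0.5) = 8 ft/s, and v(t) is already indistinguishable from it well before t = 10.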
CHAPTER 1 First-Order Differential Equations

This means that an object falling under the influence of gravity, through a retarding medium (with force proportional to the square of the velocity), will not increase in velocity indefinitely. Instead, the object's velocity approaches the limiting value √(mg/a). If the medium is deep enough, the object will settle into a descent of approximately constant velocity. This number √(mg/a) is called the terminal velocity of the object. Skydivers experience this phenomenon.

Motion of a Chain on a Pulley  A 16-foot chain weighing p pounds per foot hangs over a small pulley, which is 20 feet above the floor. Initially, the chain is held at rest with 7 feet on one side and 9 feet on the other, as in Figure 1.15. How long after the chain is released, and with what velocity, will it leave the pulley?

When 8 feet of chain hang on each side of the pulley, the chain is in equilibrium. Call this position x = 0 and let x(t) be the distance the chain has fallen below this point at time t. The net force acting on the chain is 2xp, and the mass of the chain is 16p/32, or p/2 slugs. The ends of the chain have the same speed as its center of mass, so the acceleration of the chain at its center of mass is the same as it is at its ends. The equation of motion is

(p/2) dv/dt = 2xp,
FIGURE 1.15  Chain on a pulley.
from which p cancels to yield

dv/dt = 4x.

A chain rule differentiation enables us to write this equation in terms of v as a function of x. Write

dv/dt = (dv/dx)(dx/dt) = v dv/dx.

Then

v dv/dx = 4x.

This is a separable equation, which we solve to get

v² = 4x² + K.

Now x = 1 when v = 0, so K = -4 and

v² = 4x² - 4.
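A quick numerical sketch (not from the text) confirms that the function just found solves the position form of the equation of motion: with v(x) = √(4x² - 4) for x ≥ 1, we can check v dv/dx = 4x by a finite difference.

```python
import math

def v(x):
    # from v^2 = 4x^2 - 4, taking the positive (downward) root for x >= 1
    return 2.0 * math.sqrt(x * x - 1.0)

# Check v dv/dx = 4x and the initial condition v(1) = 0
x0, h = 3.0, 1e-6
dvdx = (v(x0 + h) - v(x0 - h)) / (2 * h)
residual = v(x0) * dvdx - 4.0 * x0

v_start = v(1.0)     # chain released from rest at x = 1
v_exit = v(8.0)      # speed when the chain leaves the pulley (x = 8)
```

Here v_exit is √252 = 6√7 ≈ 15.87 feet per second, agreeing with the exit speed worked out in the discussion.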
The chain leaves the pulley when x = 8. When this occurs, v² = 4(64) - 4 = 252, so v = √252 = 6√7 feet per second (about 15.87 feet per second). To calculate the time t required for the chain to leave the pulley, compute

t = ∫₀ᵗ dt = ∫₁⁸ (dt/dx) dx = ∫₁⁸ (1/v) dx.
Since v(x) = 2√(x² - 1),

t = (1/2) ∫₁⁸ dx/√(x² - 1) = (1/2) ln(x + √(x² - 1)) |₁⁸ = (1/2) ln(8 + √63),

about 1.38 seconds.

In this example the mass was constant, so dm/dt = 0 in Newton's law of motion. Next is an example in which the mass varies with time.

Chain Piling on the Floor  Suppose a 40-foot chain weighing p pounds per foot is supported in a pile several feet above the floor, and begins to unwind when released from rest with 10 feet already played out. Determine the velocity with which the chain leaves the support.

The amount of chain that is actually in motion changes with time. Let x(t) denote the length of that part of the chain that has left the support by time t and is currently in motion. The equation of motion is
m dv/dt + v dm/dt = F,    (1.19)

where F is the total external force acting on the chain. Now F = xp, and the weight of the moving part is mg = xp, so m = xp/g = xp/32. Then

dm/dt = (p/32) dx/dt = (p/32) v.

Further,

dv/dt = v dv/dx,

as in the preceding example. Put this information into equation (1.19) to get

(xp/32) v dv/dx + (p/32) v² = xp.

If we multiply this equation by 32/(xpv), we get

dv/dx + (1/x) v = 32/v,    (1.20)

which we recognize as a Bernoulli equation with α = -1. Make the transformation w = v^{1-α} = v². Then v = w^{1/2} and

dv/dx = (1/2) w^{-1/2} dw/dx.

Substitute these into equation (1.20) to get

(1/2) w^{-1/2} dw/dx + (1/x) w^{1/2} = 32 w^{-1/2}.
Upon multiplying this equation by 2w^{1/2}, we get the linear equation

w' + (2/x) w = 64.

Solve this to get

w(x) = v(x)² = (64/3) x + C/x².
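This general solution can be sanity-checked numerically. In this sketch (not from the text), the constant C is fixed by the initial condition v = 0 when 10 feet of chain have played out:

```python
# General solution of w' + (2/x) w = 64, where w = v^2: w(x) = (64/3) x + C / x^2
def w(x, C):
    return 64.0 * x / 3.0 + C / (x * x)

C = -64000.0 / 3.0       # chosen so that v = 0 when x = 10 (chain starts from rest)

# Finite-difference check of the linear equation at a sample point
x0, h = 25.0, 1e-6
dw = (w(x0 + h, C) - w(x0 - h, C)) / (2 * h)
residual = dw + (2.0 / x0) * w(x0, C) - 64.0

v_exit = w(40.0, C) ** 0.5   # speed when the last of the chain leaves the support
```

Here v_exit² = 840, so v_exit = 2√210 ≈ 29 feet per second, matching the value derived in the discussion.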
Since v = 0 when x = 10, 0 = (64/3)(10) + C/100, so C = -64,000/3. Therefore,

v² = (64/3) x - 64,000/(3x²).

The chain leaves the support when x = 40. At this time,

v² = (64/3)(40) - 64,000/(3(1600)) = 840 = 4(210),

so the velocity is v = 2√210, or about 29 feet per second.

In these models involving chains, air resistance was neglected as having no significant impact on the outcome. This was quite different from the analysis of terminal velocity, in which air resistance is a key factor. Without it, skydivers dive only once!

Motion of a Block Sliding on an Inclined Plane  A block weighing 96 pounds is released from rest at the top of an inclined plane of slope length 50 feet, making an angle of π/6 radians with the horizontal. Assume a coefficient of friction of µ = √3/4. Assume also that air resistance acts to retard the block's descent down the ramp, with a force of magnitude equal to one half the block's velocity. We want to determine the velocity v(t) of the block at any time t.

Figure 1.16 shows the forces acting on the block. Gravity acts down the ramp with magnitude mg sin(θ), which is 96 sin(π/6), or 48 pounds. Here mg = 96 is the weight of the block. The drag due to friction acts in the reverse direction and is, in pounds,

-µN = -µmg cos(θ) = -(√3/4)(96) cos(π/6) = -36.
The drag force due to air resistance is -v/2, the negative sign indicating that this is a retarding force. The total external force on the block is

F = 48 - 36 - (1/2)v = 12 - (1/2)v.

Since the block weighs 96 pounds, it has a mass of 96/32 slugs, or 3 slugs. From Newton's second law,

3 dv/dt = 12 - (1/2)v.

This is a linear equation, which we write as

v' + (1/6)v = 4.
FIGURE 1.16  Forces acting on a block on an inclined plane.
An integrating factor is e^{∫(1/6)dt} = e^{t/6}. Multiply the differential equation by this factor to obtain

v' e^{t/6} + (1/6) e^{t/6} v = (v e^{t/6})' = 4 e^{t/6}
and integrate to get

v e^{t/6} = 24 e^{t/6} + C.

The velocity is

v(t) = 24 + C e^{-t/6}.

Since the block starts from rest at time zero, v(0) = 0 = 24 + C, so C = -24 and

v(t) = 24(1 - e^{-t/6}).

Let x(t) be the position of the block at any time, measured from the top of the plane. Since v(t) = x'(t), we get

x(t) = ∫ v(t) dt = 24t + 144 e^{-t/6} + K.
If we let the top of the ramp be the origin along the inclined plane, then x(0) = 0 = 144 + K, so K = -144. The position function is

x(t) = 24t + 144(e^{-t/6} - 1).

We can now determine the block's position and velocity at any time. Suppose, for example, we want to know when the block reaches the bottom of the ramp. This happens when the block has gone 50 feet. If this occurs at time T, then

x(T) = 50 = 24T + 144(e^{-T/6} - 1).

This transcendental equation cannot be solved algebraically for T, but a computer approximation yields T ≈ 5.8 seconds. Notice that

lim_{t→∞} v(t) = 24,

which means that the block sliding down the ramp has a terminal velocity. If the ramp is long enough, the block will eventually settle into a slide of approximately constant velocity.

The mathematical model we have constructed for the sliding block can be used to analyze the motion of the block under a variety of conditions. For example, we can solve the equations leaving θ arbitrary, and determine the influence of the slope angle of the ramp on position and velocity. Or we could leave µ unspecified and study the influence of friction on the motion.

1.7.2 Electrical Circuits

Electrical engineers often use differential equations to model circuits. The mathematical model is used to analyze the behavior of circuits under various conditions, and aids in the design of circuits having specific characteristics.

We will look at simple circuits having only resistors, inductors, and capacitors. A capacitor is a storage device consisting of two plates of conducting material isolated from one another by an insulating material, or dielectric. Electrons can be transferred from one plate to another via external circuitry by applying an electromotive force to the circuit. The charge on a capacitor is essentially a count of the difference between the numbers of electrons on the two plates. This charge is proportional to the applied electromotive force, and the constant of proportionality
is the capacitance. Capacitance is usually a very small number, given in micro- (10⁻⁶) or pico- (10⁻¹²) farads. To simplify examples and problems, some of the capacitors in this book are assigned numerical values that would actually make them occupy large buildings.

An inductor is made by winding a conductor such as wire around a core of magnetic material. When a current is passed through the wire, a magnetic field is created in the core and around the inductor. The voltage drop across an inductor is proportional to the change in the current flow, and this constant of proportionality is the inductance of the inductor, measured in henrys.

Current is measured in amperes, with one amp equivalent to a rate of electron flow of one coulomb per second. Charge q(t) and current i(t) are related by i(t) = q'(t). The voltage drop across a resistor having resistance R is iR. The drop across a capacitor having capacitance C is q/C. And the voltage drop across an inductor having inductance L is Li'(t).

We construct equations for a circuit by using Kirchhoff's current and voltage laws. Kirchhoff's current law states that the algebraic sum of the currents at any junction of a circuit is zero. This means that the total current entering the junction must balance the current leaving (conservation of charge). Kirchhoff's voltage law states that the algebraic sum of the potential rises and drops around any closed loop in a circuit is zero.

As an example of modeling a circuit mathematically, consider the circuit of Figure 1.17. Starting at point A, move clockwise around the circuit, first crossing the battery, where there is an increase in potential of E volts. Next there is a decrease in potential of iR volts across the resistor. Finally, there is a decrease of Li'(t) across the inductor, after which we return to point A. By Kirchhoff's voltage law,

E - iR - Li' = 0,

which is the linear equation

i' + (R/L) i = E/L.
Solve this to obtain

i(t) = E/R + K e^{-Rt/L}.
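A numerical sketch of this solution (the values E = 12 volts, R = 4 ohms, L = 2 henrys are illustrative, not from the text), with K chosen to make i(0) = 0:

```python
import math

E, R, L = 12.0, 4.0, 2.0        # illustrative source, resistance, inductance
K = -E / R                      # makes i(0) = 0

def i(t):
    return E / R + K * math.exp(-R * t / L)

# Check the circuit equation L i' + R i = E at a sample time
t0, h = 0.3, 1e-6
di = (i(t0 + h) - i(t0 - h)) / (2 * h)
residual = L * di + R * i(t0) - E

steady_state = E / R            # the current approaches E/R = 3 amperes
```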
To determine the constant K, we need to be given the current at some time. Even without this, we can tell from this equation that as t → ∞, the current approaches the limiting value E/R. This is the steady-state value of the current in the circuit.

Another way to derive the differential equation of this circuit is to designate one of the components as a source, then set the voltage drop across that component equal to the sum of the voltage drops across the other components. To see this approach, consider the circuit of Figure 1.18. Suppose the switch is initially open so that no current flows, and that the charge
FIGURE 1.17  RL circuit.

FIGURE 1.18  RC circuit.
on the capacitor is zero. At time zero, close the switch. We want the charge on the capacitor. Notice that we have to close the switch before there is a loop. Using the battery as a source, write

iR + (1/C) q = E,

or

R q' + (1/C) q = E.

This leads to the linear equation

q' + (1/(RC)) q = E/R,

with solution

q(t) = EC(1 - e^{-t/RC})
satisfying q(0) = 0. This equation provides a good deal of information about the circuit. Since the voltage on the capacitor at time t is q(t)/C, or E(1 - e^{-t/RC}), we can see that the voltage approaches E as t → ∞. Since E is the battery potential, the difference between battery and capacitor voltages becomes negligible as time increases, indicating a very small voltage drop across the resistor. The current in this circuit can be computed as

i(t) = q'(t) = (E/R) e^{-t/RC}

after the circuit is switched on. Thus i(t) → 0 as t → ∞.

Often we encounter discontinuous currents and potential functions in dealing with circuits. These can be treated using Laplace transform techniques, which we will discuss in Chapter 3.
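These RC formulas can be verified numerically. A sketch (the values E = 80 volts, R = 250 ohms, C = 10⁻³ farad are illustrative, one of the deliberately oversized capacitances mentioned above):

```python
import math

E, R, C = 80.0, 250.0, 1e-3     # illustrative battery, resistance, capacitance

def q(t):
    # q(t) = EC (1 - e^{-t/RC}), the charge starting from q(0) = 0
    return E * C * (1.0 - math.exp(-t / (R * C)))

# Check R q' + q/C = E at a sample time
t0, h = 0.1, 1e-7
dq = (q(t0 + h) - q(t0 - h)) / (2 * h)
residual = R * dq + q(t0) / C - E

late = 10.0 * R * C
cap_voltage_late = q(late) / C                       # approaches the battery voltage E
current_late = (E / R) * math.exp(-late / (R * C))   # i(t) -> 0
```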
1.7.3 Orthogonal Trajectories
Two curves intersecting at a point P are said to be orthogonal if their tangents are perpendicular (orthogonal) at P. Two families of curves, or trajectories, are orthogonal if each curve of the first family is orthogonal to each curve of the second family wherever an intersection occurs. Orthogonal families occur in many contexts. Parallels and meridians on a globe are orthogonal, as are equipotential and electric lines of force.

A problem that occupied Newton and other early developers of the calculus was the determination of the family of orthogonal trajectories of a given family of curves. Suppose we are given a family ℱ of curves in the plane. We want to construct a second family 𝒢 of curves so that every curve in ℱ is orthogonal to every curve in 𝒢 wherever an intersection occurs. As a simple example, suppose ℱ consists of all circles about the origin. Then 𝒢 consists of all straight lines through the origin (Figure 1.19). It is clear that each straight line is orthogonal to each circle wherever the two intersect.
FIGURE 1.19  Orthogonal families: circles and lines.
In general, suppose we are given a family ℱ of curves. These must be described in some way, say by an equation

F(x, y, k) = 0,

giving a different curve for each choice of the constant k. Think of these curves as integral curves of a differential equation

y' = f(x, y),

which we determine from the equation F(x, y, k) = 0 by differentiation. At a point (x₀, y₀), the slope of the curve C in ℱ through this point is f(x₀, y₀). Assuming that this is nonzero, any curve through (x₀, y₀) and orthogonal to C at this point must have slope -1/f(x₀, y₀). (Here we use the fact that two lines are orthogonal if and only if their slopes are negative reciprocals.) The family of orthogonal trajectories of ℱ therefore consists of the integral curves of the differential equation

y' = -1/f(x, y).

Solve this differential equation for the curves in 𝒢.
EXAMPLE 1.29

Consider the family ℱ of curves that are graphs of

F(x, y, k) = y - kx² = 0.

This is a family of parabolas. We want the family of orthogonal trajectories. First obtain the differential equation of ℱ. Differentiate y - kx² = 0 to get

y' - 2kx = 0.

To eliminate k, use the equation y - kx² = 0 to write

k = y/x².

Then

y' - 2(y/x²)x = 0,

or

y' = 2y/x = f(x, y).
This is the differential equation of the family ℱ. Curves in ℱ are integral curves of this differential equation, which is of the form y' = f(x, y) with f(x, y) = 2y/x. The family 𝒢 of orthogonal trajectories therefore has differential equation

y' = -1/f(x, y) = -x/(2y).

This equation is separable, since

2y dy = -x dx.

Integrate to get

y² = -(1/2)x² + C.

This is a family of ellipses

x² + 2y² = K.

Some of the parabolas and ellipses from ℱ and 𝒢 are shown in Figure 1.20. Each parabola in ℱ is orthogonal to each ellipse in 𝒢 wherever these curves intersect.
FIGURE 1.20  Orthogonal families: parabolas and ellipses.
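The orthogonality can be confirmed at a specific intersection. A sketch (the particular parabola k = 1 and ellipse x² + 2y² = 3 are illustrative choices): y = x² meets this ellipse where x² + 2x⁴ = 3, that is, at (±1, 1).

```python
def parabola_slope(x, y):
    # slope along y = k x^2, with k eliminated: y' = 2y/x
    return 2.0 * y / x

def ellipse_slope(x, y):
    # implicit differentiation of x^2 + 2y^2 = K: 2x + 4y y' = 0
    return -x / (2.0 * y)

x0, y0 = 1.0, 1.0                      # intersection of y = x^2 with x^2 + 2y^2 = 3
on_parabola = (y0 == x0 ** 2)
on_ellipse = (x0 ** 2 + 2.0 * y0 ** 2 == 3.0)
product = parabola_slope(x0, y0) * ellipse_slope(x0, y0)   # perpendicular slopes multiply to -1
```

In fact (2y/x)(-x/(2y)) = -1 identically, so the two families are orthogonal at every intersection off the coordinate axes.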
PROBLEMS

Mechanical Systems

1. Suppose that the pulley described in this section is only 9 feet above the floor. Assuming the same initial conditions as in the discussion, determine the velocity with which the chain leaves the pulley. Hint: The mass of the part of the chain that is in motion is (16 - x)p/32.

2. Determine the time it takes for the chain in Problem 1 to leave the pulley.

3. Suppose the support is only 10 feet above the floor in the discussion of the chain piling on the floor. Calculate the velocity of the moving part of the chain as it leaves the support. (Note the hint to Problem 1.)

4. (Chain and Weight on a Pulley) An 8p-pound weight is attached to one end of a 40-foot chain that weighs p pounds per foot. The chain is supported by a small frictionless pulley located more than 40 feet above the floor. Initially, the chain is held at rest with 23 feet hanging on one side of the pulley with the remainder of the chain, along with the weight, on the other side. How long after the chain is released, and with what velocity, will it leave the pulley?

5. (Chain on a Table) A 24-foot chain weighing p pounds per foot is stretched out on a very tall, frictionless table with 6 feet hanging off the edge. If the chain is released from rest, determine the time it takes for the end of the chain to fall off the table, and also the velocity of the chain at this instant.

6. (Variable Mass Chain on a Low Table) Suppose the chain in Problem 5 is placed on a table that is only
4 feet high, so that the chain accumulates on the floor as it slides off the table. Two feet of chain are already piled up on the floor at the time that the rest of the chain is released. Determine the velocity of the moving end of the chain at the instant it leaves the table top. Hint: The mass of that part of the chain that is moving changes with time. Newton's law applies to the center of mass of the moving system.

7. Determine the time it takes for the chain to leave the support in the discussion of the chain piling on the floor.

8. Use the conservation of energy principle (potential energy plus kinetic energy of a conservative system is a constant of the motion) to obtain the velocity of the chain in the discussion involving the chain on the pulley.

9. Use the conservation of energy principle to give an alternate derivation of the conclusion of the discussion of the chain piling on the floor.

10. (Paraboloid of Revolution) Determine the shape assumed by the surface of a liquid being spun in a circular bowl at constant angular velocity ω. Hint: Consider a particle of liquid located at (x, y) on the surface of the liquid, as in Figure 1.21. The forces acting on the particle are the horizontal force having magnitude mω²x and a vertical force of magnitude mg. Since the particle is in radial equilibrium, the resultant vector is normal to the curve.

FIGURE 1.21  Particle on the surface of a spinning liquid.

Properties of spinning liquids have found application in astronomy. A Canadian astronomer has constructed a telescope by spinning a bowl of mercury, creating a reflective surface free of the defects obtained by the usual grinding of a solid lens. He claims that the idea was probably known to Newton, but that he is the first to carry it out in practice. Roger Angel, a University of Arizona astronomer, has developed this idea into a technique for producing telescope mirrors called spin casting. As reported in Time (April 27, 1992), "... a complex ceramic mold is assembled inside the furnace and filled with glittering chunks of Pyrex-type glass. Once the furnace lid is sealed, the temperature will slowly ratchet up over a period of several days, at times rising no more than 2 degrees Centigrade in an hour. At 750 degrees C (1382 degrees Fahrenheit), when the glass is a smooth, shiny lake, the furnace starts to whirl like a merry-go-round, an innovation that automatically spins the glass into the parabolic shape traditionally achieved by grinding." The result is a parabolic surface requiring little or no grinding before a reflective coat is applied. Professor Angel believes that the method will allow the construction of much larger mirrors than are possible by conventional techniques. Supporting this claim is his recent production of one of the world's largest telescope mirrors, a 6.5 meter (about 21 foot) mirror to be placed in an observatory atop Mount Hopkins in Arizona.

11. A 10-pound ballast bag is dropped from a hot air balloon which is at an altitude of 342 feet and ascending at a rate of 4 feet per second. Assuming that air resistance is not a factor, determine the maximum height attained by the bag, how long it remains aloft, and the speed with which it strikes the ground.

12. A 48-pound box is given an initial push of 16 feet per second down an inclined plane that has a gradient of a. If there is a coefficient of friction of 3 between the box and the plane, and an air resistance equal to the velocity of the box, determine how far the box will travel before coming to rest.

13. A skydiver and her equipment together weigh 192 pounds. Before the parachute is opened, there is an air drag equal to six times her velocity. Four seconds after stepping from the plane, the skydiver opens the parachute, producing a drag equal to three times the square of the velocity. Determine the velocity and how far the skydiver has fallen at time t. What is the terminal velocity?

14.
Archimedes' principle of buoyancy states that an object submerged in a fluid is buoyed up by a force equal to the weight of the fluid that is displaced by the object. A rectangular box, 1 by 2 by 3 feet, and weighing 384 pounds, is dropped into a 100-foot-deep freshwater lake. The box begins to sink with a drag due to the water having magnitude equal to 1/2 the velocity. Calculate the terminal velocity of the box. Will the box have achieved a velocity of 10 feet per second by the time it reaches bottom? Assume that the density of the water is 62.5 pounds per cubic foot.

15. Suppose the box in Problem 14 cracks open upon hitting the bottom of the lake, and 32 pounds of its contents fall out. Approximate the velocity with which the box surfaces.

16. The acceleration due to gravity inside the earth is proportional to the distance from the center of the
earth. An object is dropped from the surface of the earth into a hole extending through the earth's center. Calculate the speed the object achieves by the time it reaches the center.

17. A particle starts from rest at the highest point of a vertical circle and slides under only the influence of gravity along a chord to another point on the circle. Show that the time taken is independent of the choice of the terminal point. What is this common time?
FIGURE 1.24  Circuit with 10 Ω, 15 Ω, and 30 Ω resistors.
Circuits

18. Determine each of the currents in the circuit of Figure 1.22.
22. In a constant electromotive force RL circuit, we find that the current is given by

i(t) = (E/R)(1 - e^{-Rt/L}) + i(0) e^{-Rt/L}.

Let i(0) = 0.
(a) Show that the current increases with time.
(b) Find a time t₀ at which the current is 63% of E/R. This time is called the inductive time constant of the circuit.
(c) Does the inductive time constant depend on i(0)? If so, in what way?

23. Recall that the charge q(t) in an RC circuit satisfies the linear differential equation

q' + (1/(RC)) q = (1/R) E(t).

FIGURE 1.22  Circuit with a 10 Ω resistor.
19. In the circuit of Figure 1.23, the capacitor is initially discharged. How long after the switch is closed will the capacitor voltage be 76 volts? Determine the current in the resistor at that time. (Here kΩ denotes 1000 ohms and µF denotes 10⁻⁶ farads.)
(a) Solve for the charge in the case that E(t) = E, constant. Evaluate the constant of integration by using the condition q(0) = q₀.
(b) Determine lim_{t→∞} q(t) and show that this limit is independent of q₀.
(c) Graph q(t). Determine when the charge has its maximum and minimum values.
(d) Determine at what time q(t) is within 1% of its steady-state value (the limiting value requested in (b)).

FIGURE 1.23  Circuit with a 250 Ω resistor, an 80 V battery, and a capacitor.
20. Suppose, in Problem 19, the capacitor had a potential of 50 volts when the switch was closed. How long would it take for the capacitor voltage to reach 76 volts?

21. For the circuit of Figure 1.24, find all currents immediately after the switch is closed, assuming that all of these currents and the charges on the capacitors are zero just prior to closing the switch.
Orthogonal Trajectories

In each of Problems 24 through 29, find the family of orthogonal trajectories of the given family of curves. If software is available, graph some curves in the given family and some curves in the family of orthogonal trajectories.

24. x + 2y = K
25. 2x² - 3y = K
26. x² + 2y² = K
27. y = Kx² + 1
28. x² - Ky² = 1
29. y = e^{Kx}
1.8 Existence and Uniqueness for Solutions of Initial Value Problems

We have solved several initial value problems

y' = f(x, y);  y(x₀) = y₀

and have always found that there is just one solution. That is, the solution existed, and it was unique. Can either existence or uniqueness fail to occur? The answer is yes, as the following examples show.
EXAMPLE 1.30

Consider the initial value problem

y' = 2y^{1/2};  y(0) = -1.

The differential equation is separable and has general solution

y(x) = (x + C)².

To satisfy the initial condition, we must choose C so that

y(0) = C² = -1,

and this is impossible if C is to be a real number. This initial value problem has no real-valued solution.
EXAMPLE 1.31

Consider the problem

y' = 2y^{1/2};  y(2) = 0.

One solution is the trivial function y = φ(x) = 0 for all x. But there is another solution. Define

ψ(x) = 0 for x ≤ 2,  ψ(x) = (x - 2)² for x > 2.
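Both claimed solutions can be checked numerically. This sketch (not from the text) verifies that each satisfies y' = 2y^{1/2} and takes the value 0 at x = 2, even though they differ for x > 2:

```python
def phi(x):                     # the trivial solution
    return 0.0

def psi(x):                     # zero up to x = 2, then (x - 2)^2
    return (x - 2.0) ** 2 if x > 2.0 else 0.0

def ode_residual(y, x, h=1e-6):
    # central-difference check of y' - 2 sqrt(y) at a point
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return dy - 2.0 * y(x) ** 0.5

check_phi = abs(ode_residual(phi, 3.0))
check_psi = abs(ode_residual(psi, 3.0))
same_initial = (phi(2.0) == psi(2.0) == 0.0)
differ_later = (phi(4.0) != psi(4.0))
```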
Graphs of both solutions are shown in Figure 1.25. Uniqueness fails in this example.

Because of examples such as these, we look for conditions that ensure that an initial value problem has a unique solution. The following theorem provides a convenient set of conditions.
FIGURE 1.25  Graphs of solutions of y' = 2y^{1/2}, y(2) = 0.
THEOREM 1.2  Existence and Uniqueness

Let f and ∂f/∂y be continuous for all (x, y) in a closed rectangle R centered at (x₀, y₀). Then there exists a positive number h such that the initial value problem

y' = f(x, y);  y(x₀) = y₀

has a unique solution defined over the interval (x₀ - h, x₀ + h).

As with the test for exactness (Theorem 1.1), by a closed rectangle we mean all points on or inside a rectangle in the plane having sides parallel to the axes. Geometrically, existence of a solution of the initial value problem means that there is an integral curve of the differential equation passing through (x₀, y₀). Uniqueness means that there is only one such curve.

This is an example of a local theorem, in the following sense. The theorem guarantees existence of a unique solution that is defined on some interval of width 2h, but it says nothing about how large h is. Depending on f and x₀, h may be small, giving us existence and uniqueness "near" x₀. This is dramatically demonstrated by the initial value problem

y' = y²;  y(0) = n,

in which n is any positive integer. Here f(x, y) = y² and ∂f/∂y = 2y, both continuous over the entire plane, hence on any closed rectangle about (0, n). The theorem tells us that there is a unique solution of this initial value problem in some interval (-h, h). In this case we can solve the initial value problem explicitly, obtaining

y(x) = n/(1 - nx).

This solution is valid for -1/n < x < 1/n, so we can take h = 1/n in this example. This means that the size of n in the initial value controls the size of the interval for the solution. The larger n is, the smaller this interval must be. This fact is certainly not apparent from the initial value problem itself!

In the special case that the differential equation is linear, we can improve considerably on the existence/uniqueness theorem.
THEOREM 1.3

Let p and q be continuous on an open interval I and let x₀ be in I. Let y₀ be any number. Then the initial value problem

y' + p(x)y = q(x);  y(x₀) = y₀

has a unique solution defined for all x in I.
In particular, if p and q are continuous for all x, then there is a unique solution defined over the entire real line.

Proof  Equation (1.6) of Section 1.3 gives the general solution of the linear equation. Using this, we can write the solution of the initial value problem:

y(x) = e^{-∫_{x₀}^{x} p(ξ) dξ} [ ∫_{x₀}^{x} e^{∫_{x₀}^{ξ} p(ζ) dζ} q(ξ) dξ + y₀ ].

Because p and q are continuous on I, this solution is defined for all x in I. Therefore, in the case that the differential equation is linear, the initial value problem has a unique solution in the largest open interval containing x₀ in which both p and q are continuous.
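As a concrete sketch of the theorem (the choices p(x) = 1, q(x) = x, x₀ = 0 are illustrative, not from the text), the integrating-factor formula gives y(x) = x - 1 + (y₀ + 1)e^{-x}, defined for all real x:

```python
import math

y0 = 5.0

def y(x):
    # solution of y' + y = x, y(0) = y0, via the integrating-factor formula
    return x - 1.0 + (y0 + 1.0) * math.exp(-x)

# Check the equation and the initial condition
x0, h = 1.3, 1e-6
dy = (y(x0 + h) - y(x0 - h)) / (2 * h)
residual = dy + y(x0) - x0
```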
PROBLEMS

In each of Problems 1 through 5, show that the conditions of Theorem 1.2 are satisfied by the initial value problem. Assume familiar facts from the calculus about continuity of real functions of one and two variables.

1. y' = 2y² + 3xe^y sin(xy); y(2) = 4
2. y' = 4xy + cosh(x); y(1) = -1
3. y' = (xy)³ - sin(y); y(2) = 2
4. y' = x⁵ - y⁵ + 2xe^y; y(3) = π
5. y' = x²ye^{2x} + y²; y(3) = 8

6. Consider the initial value problem y' = 2y^{1/2}; y(x₀) = y₀.
(a) Find two solutions, assuming that y₀ > 0.
(b) Explain why part (a) does not violate Theorem 1.2.

Theorem 1.2 can be proved using Picard iterates, which we will discuss briefly. Suppose f and ∂f/∂y are continuous in a closed rectangle R having (x₀, y₀) in its interior and sides parallel to the axes. Consider the initial value problem

y' = f(x, y);  y(x₀) = y₀.

Let y₀(x) = y₀ and, for each nonnegative integer n, define

y_{n+1}(x) = y₀ + ∫_{x₀}^{x} f(t, y_n(t)) dt.

This is a recursive definition, giving y₁(x) in terms of y₀, then y₂(x) in terms of y₁(x), and so on. The functions y_n(x) for n = 1, 2, ... are called Picard iterates for the initial value problem. Under the assumptions made on f, the sequence {y_n(x)} converges for all x in some interval about x₀, and the limit of this sequence is the solution of the initial value problem on this interval.

In each of Problems 7 through 10, (a) use Theorem 1.2 to show that the problem has a solution in some interval about x₀, (b) find this solution, (c) compute Picard iterates y₁(x) through y₆(x), and from these guess y_n(x) in general, and (d) find the Taylor series of the solution from (b) about x₀. You should find that the iterates computed in (c) are partial sums of the series of (d). Conclude that in these examples the Picard iterates converge to the solution.

7. y' = 2 - y; y(0) = 1
8. y' = 4 + y; y(0) = 3
9. y' = 2x²; y(1) = 3
10. y' = cos(x);
y(π) = 1
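Picard iterates are easy to compute numerically. This sketch applies them to Problem 7, y' = 2 - y, y(0) = 1, whose exact solution is y = 2 - e^{-x}; the integrals are approximated by the trapezoidal rule on [0, 1] (the grid size and iteration count are arbitrary choices):

```python
import math

N = 2000
xs = [i / N for i in range(N + 1)]         # grid on [0, 1]

def picard_step(y_vals):
    # y_{n+1}(x) = 1 + integral from 0 to x of (2 - y_n(t)) dt
    f = [2.0 - y for y in y_vals]
    out = [1.0]
    acc = 0.0
    for k in range(1, len(xs)):
        acc += 0.5 * (f[k - 1] + f[k]) * (xs[k] - xs[k - 1])
        out.append(1.0 + acc)
    return out

y = [1.0] * (N + 1)                        # y_0(x) = y(0) = 1
for _ in range(8):
    y = picard_step(y)

exact = [2.0 - math.exp(-x) for x in xs]
max_err = max(abs(a - b) for a, b in zip(y, exact))
```

After eight iterations the iterate agrees with 2 - e^{-x} to within a few times 10⁻⁶ on [0, 1]; symbolically, the first iterates are exactly the Taylor partial sums 1 + x, 1 + x - x²/2, ... as Problems 7 through 10 predict.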
SECOND ORDER CONSTANT COEFFICIENT HOMOGENEOUS LINEAR EQUATION · EULER'S EQUATION · THE NONHOMOGENEOUS EQUATION y'' + p(x)y' + q(x)y = f(x) · APPLICATIONS OF SECOND ORDER DIFFERENTIAL EQUATIONS

CHAPTER 2

Second-Order Differential Equations
2.1 Preliminary Concepts

A second-order differential equation is an equation that contains a second derivative, but no higher derivative. Most generally, it has the form

F(x, y, y', y'') = 0,

although only a term involving y'' need appear explicitly. For example,

y'' = x²,

xy'' - cos(y) = e^x,
and

y'' - 4xy' + y = 2

are second-order differential equations.

A solution of F(x, y, y', y'') = 0 on an interval I (perhaps the whole real line) is a function φ that satisfies the differential equation at each point of I:

F(x, φ(x), φ'(x), φ''(x)) = 0 for x in I.

For example,

φ(x) = 6 cos(4x) - 17 sin(4x)

is a solution of

y'' + 16y = 0

for all real x. And

φ(x) = x³ cos(ln(x))

is a solution of

x²y'' - 5xy' + 10y = 0

for x > 0. These can be checked by substitution into the differential equation.

The linear second-order differential equation has the form

R(x)y'' + P(x)y' + Q(x)y = S(x)
in which R, P, Q, and S are continuous in some interval. On any interval where R(x) ≠ 0, we can divide this equation by R(x) and obtain the special linear equation

y'' + p(x)y' + q(x)y = f(x).    (2.1)
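The two example solutions from Section 2.1 can be verified numerically. A sketch (the sample points 0.9 and 2.0 are arbitrary choices):

```python
import math

def phi1(x):            # claimed solution of y'' + 16 y = 0
    return 6.0 * math.cos(4.0 * x) - 17.0 * math.sin(4.0 * x)

def phi2(x):            # claimed solution of x^2 y'' - 5x y' + 10 y = 0 (x > 0)
    return x ** 3 * math.cos(math.log(x))

def d1(f, x, h=1e-4):
    # central first difference
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d2(f, x, h=1e-4):
    # central second difference
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

x1, x2 = 0.9, 2.0
res1 = d2(phi1, x1) + 16.0 * phi1(x1)
res2 = x2 ** 2 * d2(phi2, x2) - 5.0 * x2 * d1(phi2, x2) + 10.0 * phi2(x2)
```

Both residuals vanish to within finite-difference accuracy, confirming the substitution check the text leaves to the reader.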
For the remainder of this chapter, we will concentrate on this equation. We want to know:

1. What can we expect in the way of existence and uniqueness of solutions of equation (2.1)?
2. How can we produce all solutions of equation (2.1), at least in some cases that occur frequently and have important applications?

We begin with the underlying theory that will guide us in developing techniques for explicitly producing solutions of equation (2.1).
2.2 Theory of Solutions of y" + p(x)y' + q(x)y = f(x)
To get some feeling for what we are dealing with, and what we should be looking for, consider the simple linear second-order equation

y" - 12x = 0.

We can write this as y" = 12x and integrate to obtain

y'(x) = ∫ y"(x) dx = ∫ 12x dx = 6x^2 + C.

Integrate again:

y(x) = ∫ y'(x) dx = ∫ (6x^2 + C) dx = 2x^3 + Cx + K.

This solution is defined for all x and contains two arbitrary constants. If we recall that the general solution of a first-order equation contained one arbitrary constant, it seems natural that the solution of a second-order equation, involving two integrations, should contain two arbitrary constants. For any choices of C and K, we can graph the integral curves y = 2x^3 + Cx + K as curves in the plane. Figure 2.1 shows some of these curves for different choices of these constants.

Unlike the first-order case, there are many integral curves through each point in the plane. For example, suppose we want a solution satisfying the initial condition

y(0) = 3.

Then we need y(0) = K = 3, but are still free to choose C as any number. All solutions y(x) = 2x^3 + Cx + 3 pass through (0, 3). Some of these curves are shown in Figure 2.2.
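The two integrations above are easy to spot-check numerically. The sketch below is our own illustration, not part of the text: it approximates y" by a central difference and confirms that y = 2x^3 + Cx + K satisfies y" - 12x = 0 for several choices of C and K.

```python
def second_derivative(f, x, h=1e-4):
    """Central-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

def residual(C, K, x):
    """Value of y'' - 12x for the candidate solution y = 2x^3 + Cx + K."""
    y = lambda t: 2.0 * t**3 + C * t + K
    return second_derivative(y, x) - 12.0 * x

# try a few members of the two-parameter family at a few points
checks = [abs(residual(C, K, x))
          for C in (-1.0, 0.0, 2.5)
          for K in (3.0, -7.0)
          for x in (-2.0, 0.5, 1.0)]
```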
FIGURE 2.1  Graphs of y = 2x^3 + Cx + K for various values of C and K.

FIGURE 2.2  Graphs of y = 2x^3 + Cx + 3 for various values of C.
We single out exactly one of these curves if we specify its slope at (0, 3). Suppose, for example, we also specify the initial condition

y'(0) = -1.

Since y'(x) = 6x^2 + C, this requires that C = -1. There is exactly one solution satisfying both initial conditions (going through a given point with given slope), and it is

y(x) = 2x^3 - x + 3.

A graph of this solution is given in Figure 2.3.

To sum up, at least in this example, the general solution of the differential equation involved two arbitrary constants. An initial condition y(0) = 3, specifying that the solution curve must pass through (0, 3), determined one of these constants. However, that left infinitely many
FIGURE 2.3  Graph of y = 2x^3 - x + 3.
solution curves passing through (0, 3). The other initial condition, y'(0) = -1, picked out the solution curve through (0, 3) having slope -1 and gave a unique solution of this problem.

This suggests that we define the initial value problem for equation (2.1) to be the differential equation, defined on some interval, together with two initial conditions, one specifying a point lying on the solution curve, and the other its slope at that point. This problem has the form

y" + p(x)y' + q(x)y = f(x);   y(x0) = A, y'(x0) = B,

in which A and B are given real numbers. The main theorem on existence and uniqueness of solutions for this problem is the second-order analogue of Theorem 1.3 in Chapter 1.

THEOREM 2.1

Let p, q, and f be continuous on an open interval I. Let x0 be in I and let A and B be any real numbers. Then the initial value problem

y" + p(x)y' + q(x)y = f(x);   y(x0) = A, y'(x0) = B
has a unique solution defined for all x in I. ■

This gives us an idea of the kind of information needed to specify a unique solution of equation (2.1). Now we need a framework in which to proceed in finding solutions. We will provide this in two steps, beginning with the case that f(x) is identically zero.

2.2.1 The Homogeneous Equation y" + p(x)y' + q(x)y = 0

When f(x) is identically zero in equation (2.1), the resulting equation

y" + p(x)y' + q(x)y = 0   (2.2)

is called homogeneous. This term was used in a different context with first-order equations, and its use here is unrelated to that. Here it simply means that the right side of equation (2.1) is zero.

A linear combination of solutions y1(x) and y2(x) of equation (2.2) is a sum of constant multiples of these functions:

c1 y1(x) + c2 y2(x),

with c1 and c2 real numbers. It is an important property of the homogeneous linear equation that linear combinations of solutions are again solutions.
THEOREM 2.2

Let y1 and y2 be solutions of y" + p(x)y' + q(x)y = 0 on an interval I. Then any linear combination of these solutions is also a solution.

Proof  Let c1 and c2 be real numbers. Substituting y(x) = c1 y1(x) + c2 y2(x) into the differential equation, we obtain

(c1 y1 + c2 y2)" + p(x)(c1 y1 + c2 y2)' + q(x)(c1 y1 + c2 y2)
  = c1 y1" + c2 y2" + c1 p(x)y1' + c2 p(x)y2' + c1 q(x)y1 + c2 q(x)y2
  = c1[y1" + p(x)y1' + q(x)y1] + c2[y2" + p(x)y2' + q(x)y2]
  = 0 + 0 = 0,

because of the assumption that y1 and y2 are both solutions. ■

Of course, as a special case (c2 = 0), this theorem tells us also that, for the homogeneous equation, a constant multiple of a solution is a solution. Even this special case of the theorem fails for a nonhomogeneous equation. For example, y1(x) = (4/5)e^(2x) is a solution of

y" + 2y' - 3y = 4e^(2x),

but 5y1(x) = 4e^(2x) is not.

The point to taking linear combinations c1 y1 + c2 y2 is to obtain more solutions from just two solutions of equation (2.2). However, if y2 is already a constant multiple of y1, say y2 = ky1, then

c1 y1 + c2 y2 = c1 y1 + c2 k y1 = (c1 + kc2)y1,

just another constant multiple of y1. In this event, y2 is superfluous, providing us nothing we did not know from just y1. This leads us to distinguish the case in which one solution is a constant multiple of another from the case in which the two solutions are not multiples of each other.
DEFINITION 2.1  Linear Dependence, Independence

Two functions f and g are linearly dependent on an open interval I if, for some constant c, either f(x) = cg(x) for all x in I, or g(x) = cf(x) for all x in I. If f and g are not linearly dependent on I, then they are said to be linearly independent on the interval.
EXAMPLE 2.1

y1(x) = cos(x) and y2(x) = sin(x) are solutions of y" + y = 0 over the real line. Neither of these functions is a constant multiple of the other. Indeed, if cos(x) = k sin(x) for all x, then in particular

cos(π/4) = √2/2 = k sin(π/4) = k(√2/2),

so k must be 1. But then cos(x) = sin(x) for all x, a clear absurdity (for example, let x = 0). These solutions are linearly independent.

Now we know from Theorem 2.2 that

a cos(x) + b sin(x)
is a solution for any numbers a and b. Because cos(x) and sin(x) are linearly independent, this linear combination provides an infinity of new solutions, instead of just constant multiples of one we already know. ■

There is a simple test to tell whether two solutions of equation (2.2) are linearly independent on an interval. Define the Wronskian of solutions y1 and y2 to be

W(x) = y1(x)y2'(x) - y1'(x)y2(x).

This is the 2 x 2 determinant

W(x) = | y1(x)   y2(x)  |
       | y1'(x)  y2'(x) |.
THEOREM 2.3  Wronskian Test

Let y1 and y2 be solutions of y" + p(x)y' + q(x)y = 0 on an open interval I. Then,

1. Either W(x) = 0 for all x in I, or W(x) ≠ 0 for all x in I.
2. y1 and y2 are linearly independent on I if and only if W(x) ≠ 0 on I.

Conclusion (1) means that the Wronskian of two solutions cannot be nonzero at some points of I and zero at others. Either the Wronskian vanishes over the entire interval, or it is nonzero at every point of the interval. Conclusion (2) states that nonvanishing of the Wronskian is equivalent to linear independence of the solutions. Putting both conclusions together, it is therefore enough to test W(x) at just one point of I to determine linear dependence or independence of these solutions. This gives us great latitude to choose a point at which the Wronskian is easy to evaluate.
EXAMPLE 2.2

In Example 2.1, we considered the solutions y1(x) = cos(x) and y2(x) = sin(x) of y" + y = 0, for all x. In this case, linear independence was obvious. The Wronskian of these solutions is

W(x) = | cos(x)   sin(x) |
       | -sin(x)  cos(x) |  = cos^2(x) + sin^2(x) = 1 ≠ 0.
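The computation in Example 2.2 is easy to replicate. The following sketch is our own illustration, not part of the text (the helper name wronskian is ours): it evaluates W(x) = y1(x)y2'(x) - y1'(x)y2(x) for y1 = cos(x), y2 = sin(x), using the known derivatives, and finds the constant value 1 at every sample point.

```python
import math

def wronskian(y1, dy1, y2, dy2, x):
    """W(x) = y1(x) y2'(x) - y1'(x) y2(x)."""
    return y1(x) * dy2(x) - dy1(x) * y2(x)

# y1 = cos(x) with y1' = -sin(x); y2 = sin(x) with y2' = cos(x)
values = [wronskian(math.cos, lambda t: -math.sin(t),
                    math.sin, math.cos, x)
          for x in (0.0, 1.0, -2.7, 10.0)]
```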
EXAMPLE 2.3

It is not always obvious whether two solutions are linearly independent or dependent on an interval. Consider the equation y" + xy = 0. This equation appears simple but is not easy to solve. By a power series method we will develop later, we can write two solutions

y1(x) = 1 - (1/6)x^3 + (1/180)x^6 - (1/12,960)x^9 + ...

and

y2(x) = x - (1/12)x^4 + (1/504)x^7 - (1/45,360)x^10 + ...,
with both series converging for all x. Here I is the entire real line. The Wronskian of these solutions at any nonzero x would be difficult to evaluate, but at x = 0 we easily obtain

W(0) = y1(0)y2'(0) - y1'(0)y2(0) = (1)(1) - (0)(0) = 1.

Nonvanishing of the Wronskian at this one point is enough to conclude linear independence of these solutions. ■

We are now ready to use the machinery we have built up to determine what is needed to find all solutions of y" + p(x)y' + q(x)y = 0.

THEOREM 2.4
Let y1 and y2 be linearly independent solutions of y" + p(x)y' + q(x)y = 0 on an open interval I. Then, every solution of this differential equation on I is a linear combination of y1 and y2.

This fundamental theorem provides a strategy for finding all solutions of y" + p(x)y' + q(x)y = 0 on I. Find two linearly independent solutions. Depending on p and q, this may be difficult, but at least we have a specific goal. If necessary, use the Wronskian to test for independence. The general linear combination c1 y1 + c2 y2, with c1 and c2 arbitrary constants, then contains all possible solutions. We will prove the theorem following introduction of some standard terminology.
DEFINITION 2.2

Let y1 and y2 be solutions of y" + p(x)y' + q(x)y = 0 on an open interval I.

1. y1 and y2 form a fundamental set of solutions on I if y1 and y2 are linearly independent on I.
2. When y1 and y2 form a fundamental set of solutions, we call c1 y1 + c2 y2, with c1 and c2 arbitrary constants, the general solution of the differential equation on I.
In these terms, we find the general solution by finding a fundamental set of solutions. Here is a proof of Theorem 2.4.

Proof  Let φ be any solution of y" + p(x)y' + q(x)y = 0 on I. We want to show that there must be numbers c1 and c2 such that

φ(x) = c1 y1(x) + c2 y2(x).

Choose any x0 in I. Let φ(x0) = A and φ'(x0) = B. By Theorem 2.1, φ is the unique solution on I of the initial value problem

y" + p(x)y' + q(x)y = 0;   y(x0) = A, y'(x0) = B.

Now consider the system of two algebraic equations in the two unknowns c1 and c2:

y1(x0)c1 + y2(x0)c2 = A,
y1'(x0)c1 + y2'(x0)c2 = B.
It is routine to solve these algebraic equations. Assuming that W(x0) ≠ 0, we find that

c1 = (A y2'(x0) - B y2(x0)) / W(x0),   c2 = (B y1(x0) - A y1'(x0)) / W(x0).

With this choice of c1 and c2, the function c1 y1 + c2 y2 is a solution of the initial value problem. By uniqueness of the solution of this problem, φ(x) = c1 y1(x) + c2 y2(x) on I, and the proof is complete. ■

The proof reinforces the importance of having a fundamental set of solutions, since the nonvanishing of the Wronskian plays a vital role in showing that an arbitrary solution must be a linear combination of the fundamental solutions.

2.2.2 The Nonhomogeneous Equation y" + p(x)y' + q(x)y = f(x)

The ideas just developed for the homogeneous equation (2.2) also provide the key to solving the nonhomogeneous equation

y" + p(x)y' + q(x)y = f(x).   (2.3)
THEOREM 2.5

Let y1 and y2 be a fundamental set of solutions of y" + p(x)y' + q(x)y = 0 on an open interval I. Let yp be any solution of equation (2.3) on I. Then, for any solution φ of equation (2.3), there exist numbers c1 and c2 such that

φ = c1 y1 + c2 y2 + yp.

This conclusion leads us to call c1 y1 + c2 y2 + yp the general solution of equation (2.3), and suggests the following strategy. To solve y" + p(x)y' + q(x)y = f(x):

1. find the general solution c1 y1 + c2 y2 of the associated homogeneous equation y" + p(x)y' + q(x)y = 0,
2. find any solution yp of y" + p(x)y' + q(x)y = f(x), and
3. write the general solution c1 y1 + c2 y2 + yp.

This expression contains all possible solutions of equation (2.3) on the interval. Again, depending on p, q, and f, the first two steps may be formidable. Nevertheless, the theorem tells us what to look for and provides a clear way to proceed. Here is a proof of the theorem.

Proof
Since φ and yp are both solutions of equation (2.3), then

(φ - yp)" + p(φ - yp)' + q(φ - yp) = φ" + pφ' + qφ - (yp" + p yp' + q yp) = f - f = 0.

Therefore, φ - yp is a solution of y" + py' + qy = 0. Since y1 and y2 form a fundamental set of solutions for this homogeneous equation, there are constants c1 and c2 such that

φ - yp = c1 y1 + c2 y2,

and this is what we wanted to show. ■
The remainder of this chapter is devoted to techniques for carrying out the strategies just developed. For the general solution of the homogeneous equation (2.2), we must produce a fundamental set of solutions. And for the nonhomogeneous equation (2.3), we need to find one particular solution, together with a fundamental set of solutions of the associated homogeneous equation (2.2).
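The three-step strategy can be checked numerically on a concrete equation. In the sketch below (our own illustration, not from the text) we use y" + 2y' - 3y = 4e^(2x): the associated homogeneous equation has fundamental solutions e^x and e^(-3x), and yp = (4/5)e^(2x) is a particular solution, so every function c1 e^x + c2 e^(-3x) + (4/5)e^(2x) should satisfy the equation.

```python
import math

def residual(c1, c2, x):
    """y'' + 2y' - 3y - 4e^(2x) for y = c1 e^x + c2 e^(-3x) + (4/5)e^(2x),
    using the analytic derivatives of each term."""
    y   = c1 * math.exp(x) + c2 * math.exp(-3 * x) + 0.8 * math.exp(2 * x)
    yp  = c1 * math.exp(x) - 3 * c2 * math.exp(-3 * x) + 1.6 * math.exp(2 * x)
    ypp = c1 * math.exp(x) + 9 * c2 * math.exp(-3 * x) + 3.2 * math.exp(2 * x)
    return ypp + 2 * yp - 3 * y - 4 * math.exp(2 * x)

checks = [abs(residual(c1, c2, x))
          for c1 in (0.0, 1.0, -2.0)
          for c2 in (0.0, 3.0)
          for x in (-1.0, 0.0, 0.5)]
```

The residual vanishes (up to roundoff) for every choice of c1 and c2, illustrating that the homogeneous part contributes nothing to the forcing term.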
PROBLEMS

In each of Problems 1 through 6, (a) verify that y1 and y2 are solutions of the differential equation, (b) show that their Wronskian is not zero, (c) write the general solution of the differential equation, and (d) find the solution of the initial value problem.

1. y" - 4y = 0; y(0) = 1, y'(0) = 0; y1(x) = cosh(2x), y2(x) = sinh(2x)
2. y" + 9y = 0; y(π/3) = 0, y'(π/3) = 1; y1(x) = cos(3x), y2(x) = sin(3x)
3. y" + 11y' + 24y = 0; y(0) = 1, y'(0) = 4; y1(x) = e^(-3x), y2(x) = e^(-8x)
4. y" + 2y' + 8y = 0; y(0) = 2, y'(0) = 0; y1(x) = e^(-x) cos(√7 x), y2(x) = e^(-x) sin(√7 x)
5. y" - (7/x)y' + (16/x^2)y = 0; y(1) = 2, y'(1) = 4; y1(x) = x^4, y2(x) = x^4 ln(x)
6. y" + (1/x)y' + (1 - 1/(4x^2))y = 0; y(π) = -5, y'(π) = 8; y1(x) = x^(-1/2) cos(x), y2(x) = x^(-1/2) sin(x)
7. Let y1(x) = x^2 and y2(x) = x^3. Show that W(x) = x^4 for all real x. Then W(0) = 0, but W(x) is not identically zero. Why does this not contradict Theorem 2.3, with the interval I chosen as the entire real line?
8. Show that y1(x) = x and y2(x) = x^2 are linearly independent solutions of x^2 y" - 2xy' + 2y = 0 on [-1, 1], but that W(0) = 0. Why does this not contradict Theorem 2.3 on this interval?
9. Give an example to show that the product of two solutions of y" + p(x)y' + q(x)y = 0 need not be a solution.
10. Show that y1(x) = 3e^(2x) - 1 and y2(x) = e^x - 2 are solutions of yy" + 2y' - (y')^2 = 0, but that neither 2y1 nor y1 + y2 is a solution. Why does this not contradict Theorem 2.2?
11. Suppose y1 and y2 are solutions of y" + p(x)y' + q(x)y = 0 on [a, b], and that p and q are continuous on this interval. Suppose y1 and y2 both have a relative extremum at x0 in (a, b). Prove that y1 and y2 are linearly dependent on [a, b].
12. Let φ be a solution of y" + p(x)y' + q(x)y = 0 on an open interval I, and suppose φ(x0) = 0 for some x0 in I. Suppose φ(x) is not identically zero. Prove that φ'(x0) ≠ 0.
13. Let y1 and y2 be distinct solutions of y" + p(x)y' + q(x)y = 0 on an open interval I. Let x0 be in I and suppose y1(x0) = y2(x0) = 0. Prove that y1 and y2 are linearly dependent on I. Thus linearly independent solutions cannot share a common zero.
2.3 Reduction of Order

Given y" + p(x)y' + q(x)y = 0, we want two independent solutions. Reduction of order is a technique for finding a second solution, if we can somehow produce a first solution.

Suppose we know a solution y1, which is not identically zero. We will look for a second solution of the form y2(x) = u(x)y1(x). Compute

y2' = u'y1 + uy1',   y2" = u"y1 + 2u'y1' + uy1".

In order for y2 to be a solution we need

u"y1 + 2u'y1' + uy1" + p[u'y1 + uy1'] + quy1 = 0.
Rearrange terms to write this equation as

u"y1 + u'[2y1' + py1] + u[y1" + py1' + qy1] = 0.

The coefficient of u is zero because y1 is a solution. Thus we need to choose u so that

u"y1 + u'[2y1' + py1] = 0.

On any interval in which y1(x) ≠ 0, we can write

u" + ((2y1' + py1)/y1) u' = 0.

To help focus on the problem of determining u, denote

g(x) = (2y1'(x) + p(x)y1(x)) / y1(x),

a known function because y1(x) and p(x) are known. Then

u" + g(x)u' = 0.

Let v = u' to get

v' + g(x)v = 0.

This is a linear first-order differential equation for v, with general solution

v(x) = C e^(-∫g(x)dx).

Since we need only one second solution y2, we will take C = 1, so

v(x) = e^(-∫g(x)dx).

Finally, since v = u',

u(x) = ∫ e^(-∫g(x)dx) dx.

If we can perform these integrations and obtain u(x), then y2 = uy1 is a second solution of y" + py' + qy = 0. Further,

W(x) = y1 y2' - y1' y2 = y1(u'y1 + uy1') - y1' u y1 = u'y1^2 = v y1^2.

Since v(x) is an exponential function, v(x) ≠ 0. And the preceding derivation was carried out on an interval in which y1(x) ≠ 0. Thus W(x) ≠ 0, and y1 and y2 form a fundamental set of solutions on this interval. The general solution of y" + py' + qy = 0 is c1 y1 + c2 y2.

We do not recommend memorizing formulas for g, v, and then u. Given one solution y1, substitute y2 = uy1 into the differential equation and, after the cancellations that occur because y1 is one solution, solve the resulting equation for u(x).
EXAMPLE 2.4

Suppose we are given that y1(x) = e^(-2x) is one solution of y" + 4y' + 4y = 0. To find a second solution, let y2(x) = u(x)e^(-2x). Then

y2' = u'e^(-2x) - 2e^(-2x)u   and   y2" = u"e^(-2x) - 4u'e^(-2x) + 4ue^(-2x).

Substitute these into the differential equation to get

u"e^(-2x) - 4u'e^(-2x) + 4ue^(-2x) + 4(u'e^(-2x) - 2ue^(-2x)) + 4ue^(-2x) = 0.

Some cancellations occur because e^(-2x) is one solution, leaving

u"e^(-2x) = 0,   or   u" = 0.

Two integrations yield u(x) = cx + d. Since we only need one second solution y2, we only need one u, so we will choose c = 1 and d = 0. This gives u(x) = x and

y2(x) = xe^(-2x).

Now

W(x) = | e^(-2x)    xe^(-2x)             |
       | -2e^(-2x)  e^(-2x) - 2xe^(-2x)  |  = e^(-4x) ≠ 0

for all x. Therefore, y1 and y2 form a fundamental set of solutions for all x, and the general solution of y" + 4y' + 4y = 0 is

y(x) = c1 e^(-2x) + c2 x e^(-2x). ■
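The second solution found in Example 2.4 can be verified directly. In this sketch (our own illustration, not part of the text) we use the analytic derivatives y2' = (1 - 2x)e^(-2x) and y2" = (4x - 4)e^(-2x) and check that the residual y2" + 4y2' + 4y2 vanishes at several points.

```python
import math

def residual(x):
    """y'' + 4y' + 4y for y = x e^(-2x), via the analytic derivatives."""
    e = math.exp(-2 * x)
    y, yp, ypp = x * e, (1 - 2 * x) * e, (4 * x - 4) * e
    return ypp + 4 * yp + 4 * y

checks = [abs(residual(x)) for x in (-3.0, -0.5, 0.0, 1.0, 4.0)]
```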
EXAMPLE 2.5

Suppose we want the general solution of y" - (3/x)y' + (4/x^2)y = 0 for x > 0, and somehow we find one solution y1(x) = x^2. Put y2(x) = x^2 u(x) and compute

y2' = 2xu + x^2 u'   and   y2" = 2u + 4xu' + x^2 u".

Substitute into the differential equation to get

2u + 4xu' + x^2 u" - (3/x)(2xu + x^2 u') + (4/x^2)(x^2 u) = 0.

Then x^2 u" + xu' = 0. Since the interval of interest is x > 0, we can write this as

xu" + u' = 0.

With v = u', this is

xv' + v = (xv)' = 0,

so xv = c. We will choose c = 1. Then

v = u' = 1/x,

so u = ln(x) + d, and we choose d = 0 because we need only one suitable u. Then y2(x) = x^2 ln(x) is a second solution. Further, for x > 0,

W(x) = | x^2   x^2 ln(x)      |
       | 2x    2x ln(x) + x   |  = x^3 ≠ 0.

Then x^2 and x^2 ln(x) form a fundamental set of solutions for x > 0. The general solution for x > 0 is

y(x) = c1 x^2 + c2 x^2 ln(x). ■
PROBLEMS

In each of Problems 1 through 10, verify that the given function is a solution of the differential equation, find a second solution by reduction of order, and finally write the general solution.

1. y" + 4y = 0; y1(x) = cos(2x)
2. y" - 9y = 0; y1(x) = e^(3x)
3. y" - 10y' + 25y = 0; y1(x) = e^(5x)
4. x^2 y" - 7xy' + 16y = 0; y1(x) = x^4 for x > 0
5. x^2 y" - 3xy' + 4y = 0; y1(x) = x^2 for x > 0
6. (2x^2 + 1)y" - 4xy' + 4y = 0; y1(x) = x for x > 0
7. [statement illegible in source]
8. y" - (2x/(1 + x^2))y' + (2/(1 + x^2))y = 0; y1(x) = x
9. y" + (1/x)y' + (1 - 1/(4x^2))y = 0; y1(x) = x^(-1/2) cos(x) for x > 0
10. (2x^2 + 3x + 1)y" + 2xy' - 2y = 0; y1(x) = x on any interval not containing -1 or -1/2
11. Verify that, for any nonzero constant a, y1(x) = e^(-ax) is a solution of y" + 2ay' + a^2 y = 0. Write the general solution.
12. A second-order equation F(x, y, y', y") = 0 in which y is not explicitly present can sometimes be solved by putting u = y'. This results in a first-order equation G(x, u, u') = 0. If this can be solved for u(x), then y1(x) = ∫ u(x) dx is a solution of the given second-order equation. Use this method to find one solution, then find a second solution, and finally the general solution of the following.
(a) xy" = 2 + y'
(b) xy" + 2y' = x
(c) 4y" = 1 - y'
(d) y" + (y')^2 = 0
(e) y" = 1 + (y')^2
13. A second-order equation in which x does not explicitly appear can sometimes be solved by putting u = y' and thinking of y as the independent variable and u as a function of y. Write

y" = (d/dx)[dy/dx] = du/dx = (du/dy)(dy/dx) = u(du/dy)

to convert F(y, y', y") = 0 into the first-order equation F(y, u, u(du/dy)) = 0. Solve this equation for u(y) and then set u = y' to solve for y as a function of x. Use this method to find a solution (perhaps implicitly defined) of each of the following.
(a) yy" + 3(y')^2 = 0
(b) yy" + (y + 1)(y')^2 = 0
(c) yy" = y^2 y' + (y')^2
(d) y" = 1 + (y')^2
(e) y" + (y')^2 = 0
14. Consider y" + Ay' + By = 0, in which A and B are constants and A^2 - 4B = 0. Show that y1(x) = e^(-Ax/2) is one solution, and use reduction of order to find the second solution y2(x) = x e^(-Ax/2).
15. Consider y" + (A/x)y' + (B/x^2)y = 0 for x > 0, with A and B constants such that (A - 1)^2 - 4B = 0. Verify that y1(x) = x^((1-A)/2) is one solution, and use reduction of order to derive the second solution y2(x) = x^((1-A)/2) ln(x).
2.4 The Constant Coefficient Homogeneous Linear Equation

The linear homogeneous equation

y" + Ay' + By = 0,   (2.4)

in which A and B are numbers, occurs frequently in important applications. There is a standard approach to solving this equation.

The form of equation (2.4) requires that constant multiples of derivatives of y(x) must sum to zero. Since the derivative of an exponential function e^(λx) is a constant multiple of e^(λx), we will look for solutions y(x) = e^(λx). To see how to choose λ, substitute e^(λx) into equation (2.4) to get

λ^2 e^(λx) + Aλ e^(λx) + B e^(λx) = 0.

This can only be true if

λ^2 + Aλ + B = 0.

This is called the characteristic equation of equation (2.4). Its roots are

λ = (-A ± √(A^2 - 4B)) / 2,

leading to three cases.

2.4.1 Case 1: A^2 - 4B > 0
In this case, the characteristic equation has two real, distinct roots,

a = (-A + √(A^2 - 4B)) / 2   and   b = (-A - √(A^2 - 4B)) / 2,

yielding solutions y1(x) = e^(ax) and y2(x) = e^(bx) of equation (2.4). These form a fundamental set of solutions on the real line, since

W(x) = e^(ax) · b e^(bx) - e^(bx) · a e^(ax) = (b - a) e^((a+b)x),

and this is nonzero because a ≠ b. The general solution in this case is

y(x) = c1 e^(ax) + c2 e^(bx).
EXAMPLE 2.6

The characteristic equation of y" - y' - 6y = 0 is

λ^2 - λ - 6 = 0,

with roots a = -2 and b = 3. The general solution is

y(x) = c1 e^(-2x) + c2 e^(3x). ■

2.4.2 Case 2: A^2 - 4B = 0

Now the characteristic equation has the repeated root λ = -A/2, so y1(x) = e^(-Ax/2) is one solution. This method does not provide a second solution, but we have reduction of order for just such a circumstance. Try y2(x) = u(x)e^(-Ax/2) and substitute into the differential equation to get

u"e^(-Ax/2) - Au'e^(-Ax/2) + (A^2/4)ue^(-Ax/2) + A(u'e^(-Ax/2) - (A/2)ue^(-Ax/2)) + Bue^(-Ax/2) = 0.

Divide by e^(-Ax/2) and rearrange terms to get

u" + (B - A^2/4)u = 0.

Because in the current case A^2 - 4B = 0, this differential equation reduces to just u" = 0, and we can choose u(x) = x. A second solution in this case is y2(x) = x e^(-Ax/2). Since y1 and y2 are linearly independent, they form a fundamental set and the general solution is

y(x) = c1 e^(-Ax/2) + c2 x e^(-Ax/2) = e^(-Ax/2)(c1 + c2 x).
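For a concrete instance of the repeated-root case, take A = 4 and B = 4, so that A^2 - 4B = 0 and the repeated root is -2. The sketch below (our own illustration, not part of the text) checks numerically, via central differences, that e^(-2x)(c1 + c2 x) satisfies y" + 4y' + 4y = 0 for several choices of the constants.

```python
import math

A, B = 4.0, 4.0                      # A^2 - 4B = 0: repeated root -A/2 = -2

def y(x, c1, c2):
    return math.exp(-A * x / 2) * (c1 + c2 * x)

def residual(x, c1, c2, h=1e-4):
    """Central-difference value of y'' + Ay' + By at x."""
    ypp = (y(x + h, c1, c2) - 2 * y(x, c1, c2) + y(x - h, c1, c2)) / h**2
    yp = (y(x + h, c1, c2) - y(x - h, c1, c2)) / (2 * h)
    return ypp + A * yp + B * y(x, c1, c2)

checks = [abs(residual(x, c1, c2))
          for x in (-1.0, 0.0, 2.0)
          for c1, c2 in ((1.0, 0.0), (0.0, 1.0), (2.0, -3.0))]
```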
EXAMPLE 2.7

The characteristic equation of y" - 6y' + 9y = 0 is λ^2 - 6λ + 9 = 0, with repeated root λ = 3. The general solution is

y(x) = e^(3x)(c1 + c2 x). ■

2.4.3 Case 3: A^2 - 4B < 0

Now the characteristic equation has complex roots

λ = (-A ± i√(4B - A^2)) / 2.

For convenience, write

p = -A/2,   q = (1/2)√(4B - A^2),

so the roots of the characteristic equation are p ± iq. This yields two solutions

y1(x) = e^((p+iq)x)   and   y2(x) = e^((p-iq)x).

These are linearly independent because their Wronskian is

W(x) = | e^((p+iq)x)          e^((p-iq)x)         |
       | (p+iq)e^((p+iq)x)    (p-iq)e^((p-iq)x)   |
     = e^(2px)(p - iq) - (p + iq)e^(2px) = -2iq e^(2px),

and this is nonzero in the current case, in which q ≠ 0. Therefore the general solution is

y(x) = c1 e^((p+iq)x) + c2 e^((p-iq)x).   (2.5)

EXAMPLE 2.8
The characteristic equation of y" + 2y' + 6y = 0 is λ^2 + 2λ + 6 = 0, with roots -1 ± √5 i. The general solution is

y(x) = c1 e^((-1+√5 i)x) + c2 e^((-1-√5 i)x). ■
2.4.4 An Alternative General Solution in the Complex Root Case

When the characteristic equation has complex roots, we can write a general solution in terms of complex exponential functions. This is sometimes inconvenient, for example, in graphing the solutions. But recall that any two linearly independent solutions form a fundamental set. We will therefore show how to use the general solution (2.5) to find a fundamental set of real-valued solutions.

Begin by recalling the Maclaurin expansions of e^x, cos(x), and sin(x):

e^x = Σ (n=0 to ∞) x^n/n! = 1 + x + (1/2!)x^2 + (1/3!)x^3 + ...,

cos(x) = Σ (n=0 to ∞) ((-1)^n/(2n)!) x^(2n) = 1 - (1/2!)x^2 + (1/4!)x^4 - (1/6!)x^6 + ...,

and

sin(x) = Σ (n=0 to ∞) ((-1)^n/(2n+1)!) x^(2n+1) = x - (1/3!)x^3 + (1/5!)x^5 - (1/7!)x^7 + ...,

with each series convergent for all real x. The eighteenth-century Swiss mathematician Leonhard Euler experimented with replacing x with ix in the exponential series and noticed an interesting relationship among the series for e^x, cos(x), and sin(x). First,

e^(ix) = Σ (n=0 to ∞) (ix)^n/n!.

Now, integer powers of i repeat the values i, -1, -i, 1 with a period of four:

i^2 = -1,   i^3 = -i,   i^4 = 1,   i^5 = i^4 · i = i,   i^6 = i^4 · i^2 = -1,   i^7 = i^4 · i^3 = -i,

and so on, continuing in cyclic fashion. Using this fact in the Maclaurin series for e^(ix), we obtain

e^(ix) = 1 + ix - (1/2!)x^2 - i(1/3!)x^3 + (1/4!)x^4 + i(1/5!)x^5 - (1/6!)x^6 - ...
       = (1 - (1/2!)x^2 + (1/4!)x^4 - (1/6!)x^6 + ...) + i(x - (1/3!)x^3 + (1/5!)x^5 - ...)
       = cos(x) + i sin(x).   (2.6)

This is Euler's formula. In a different form, it was discovered a few years earlier by Newton's contemporary Roger Cotes (1682-1716). Cotes is not of the stature of Euler, but Newton's high opinion of him is reflected in Newton's remark, "If Cotes had lived, we would have known something."

Since cos(-x) = cos(x) and sin(-x) = -sin(x), replacing x by -x in Euler's formula yields

e^(-ix) = cos(x) - i sin(x).

Now return to the problem of solving y" + Ay' + By = 0 when the characteristic equation has complex roots p ± iq. Since p and q are real numbers, we have

e^((p+iq)x) = e^(px) e^(iqx) = e^(px)(cos(qx) + i sin(qx)) = e^(px) cos(qx) + i e^(px) sin(qx)
and

e^((p-iq)x) = e^(px) e^(-iqx) = e^(px)(cos(qx) - i sin(qx)) = e^(px) cos(qx) - i e^(px) sin(qx).

The general solution (2.5) can therefore be written

y(x) = c1 e^(px) cos(qx) + i c1 e^(px) sin(qx) + c2 e^(px) cos(qx) - i c2 e^(px) sin(qx)
     = (c1 + c2) e^(px) cos(qx) + (c1 - c2) i e^(px) sin(qx).

We obtain solutions for any numerical choices of c1 and c2. In particular, if we choose c1 = c2 = 1/2, we obtain the solution

y3(x) = e^(px) cos(qx).

And if we put c1 = 1/(2i) and c2 = -1/(2i), we obtain still another solution

y4(x) = e^(px) sin(qx).

Further, these last two solutions are linearly independent on the real line, since

W(x) = q e^(2px) ≠ 0.

We can therefore, if we prefer, form a fundamental set of solutions using y3 and y4, writing the general solution of y" + Ay' + By = 0 in this case as

y(x) = e^(px)(c1 cos(qx) + c2 sin(qx)).

This is simply another way of writing the general solution of equation (2.4) in the complex root case.
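Both Euler's formula and the resulting real-valued solutions can be tested numerically. In the sketch below (our own illustration, not part of the text), we first check e^(ix) = cos(x) + i sin(x) with the cmath module, and then verify that y3(x) = e^(px) cos(qx) with p = -1, q = √5 satisfies y" + 2y' + 6y = 0, the equation of Example 2.8.

```python
import cmath
import math

# Euler's formula: e^(ix) = cos(x) + i sin(x)
euler_gap = max(abs(cmath.exp(1j * x) - complex(math.cos(x), math.sin(x)))
                for x in (-2.0, 0.0, 0.7, 3.1))

# y3 = e^(px) cos(qx) with p = -1, q = sqrt(5) should solve y'' + 2y' + 6y = 0
p, q = -1.0, math.sqrt(5.0)

def y3(x):
    return math.exp(p * x) * math.cos(q * x)

def residual(x, h=1e-4):
    """Central-difference value of y'' + 2y' + 6y at x."""
    ypp = (y3(x + h) - 2 * y3(x) + y3(x - h)) / h**2
    yp = (y3(x + h) - y3(x - h)) / (2 * h)
    return ypp + 2 * yp + 6 * y3(x)

res = max(abs(residual(x)) for x in (-1.0, 0.0, 1.5))
```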
EXAMPLE 2.9

Revisiting the equation y" + 2y' + 6y = 0 of Example 2.8, we can also write the general solution

y(x) = e^(-x)(c1 cos(√5 x) + c2 sin(√5 x)). ■

We now have the general solution of the constant coefficient linear homogeneous equation y" + Ay' + By = 0 in all cases. As usual, we can solve an initial value problem by first finding the general solution of the differential equation, then solving for the constants to satisfy the initial conditions.
EXAMPLE 2.10

Solve the initial value problem

y" - 4y' + 53y = 0;   y(π) = -3, y'(π) = 2.

First solve the differential equation. The characteristic equation is

λ^2 - 4λ + 53 = 0,

with complex roots 2 ± 7i. The general solution is

y(x) = c1 e^(2x) cos(7x) + c2 e^(2x) sin(7x).

Now

y(π) = c1 e^(2π) cos(7π) + c2 e^(2π) sin(7π) = -c1 e^(2π) = -3,

so c1 = 3e^(-2π). Thus far,

y(x) = 3e^(-2π) e^(2x) cos(7x) + c2 e^(2x) sin(7x).

Compute

y'(x) = 3e^(-2π)[2e^(2x) cos(7x) - 7e^(2x) sin(7x)] + 2c2 e^(2x) sin(7x) + 7c2 e^(2x) cos(7x).

Then

y'(π) = 3e^(-2π) · 2e^(2π)(-1) + 7c2 e^(2π)(-1) = 2,

so

c2 = -(8/7)e^(-2π).

The solution of the initial value problem is

y(x) = 3e^(-2π) e^(2x) cos(7x) - (8/7)e^(-2π) e^(2x) sin(7x) = e^(2(x-π))[3 cos(7x) - (8/7) sin(7x)]. ■
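The solution of Example 2.10 can be double-checked numerically. The sketch below (our own illustration, not part of the text) confirms the two initial conditions and evaluates the residual y" - 4y' + 53y at several points, approximating y" by a central difference.

```python
import math

def y(x):
    return math.exp(2 * (x - math.pi)) * (3 * math.cos(7 * x)
                                          - (8 / 7) * math.sin(7 * x))

def residual(x, h=1e-4):
    """Central-difference value of y'' - 4y' + 53y at x."""
    ypp = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    yp = (y(x + h) - y(x - h)) / (2 * h)
    return ypp - 4 * yp + 53 * y(x)

yprime_at_pi = (y(math.pi + 1e-6) - y(math.pi - 1e-6)) / 2e-6
res = max(abs(residual(x)) for x in (2.0, math.pi, 4.0))
```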
PROBLEMS

In each of Problems 1 through 12, find the general solution.

In each of Problems 13 through 21, solve the initial value problem.

13. y" + 3y' = 0; y(0) = 3, y'(0) = 6
14. y" + 2y' - 3y = 0; y(0) = 6, y'(0) = -2
15. y" - 2y' + y = 0; y(1) = y'(1) = 0
16. y" - 4y' + 4y = 0; y(0) = 3, y'(0) = 5
17. y" + y' - 12y = 0; y(2) = 2, y'(2) = 1
18. y" - 2y' - 5y = 0; y(0) = 0, y'(0) = 3
19. y" - 2y' + y = 0; y(1) = 12, y'(1) = -5
20. y" - 5y' + 12y = 0; y(2) = 0, y'(2) = -4
21. y" - y' + 4y = 0; y(-2) = 1, y'(-2) = 3
22. This problem illustrates how small changes in the coefficients of a differential equation may cause dramatic changes in the solutions.
(a) Find the general solution φ(x) of y" - 2ay' + a^2 y = 0, with a a nonzero constant.
(b) Find the general solution φ_ε(x) of y" - 2ay' + (a^2 - ε^2)y = 0, in which ε is a positive constant.
(c) Show that, as ε → 0, the differential equation in (b) approaches in a limit sense the differential equation in (a), but the solution φ_ε(x) for (b) does not in general approach the solution φ(x) for (a).
23. (a) Find the solution ψ of the initial value problem

y" - 2ay' + a^2 y = 0;   y(0) = c, y'(0) = d,

with a, c, and d constants and a ≠ 0.
(b) Find the solution ψ_ε of the initial value problem

y" - 2ay' + (a^2 - ε^2)y = 0;   y(0) = c, y'(0) = d.

Here ε is any positive number.
(c) Is it true that lim (ε → 0) ψ_ε(x) = ψ(x)? How does this answer differ, if at all, from the conclusion in Problem 22(c)?
24. Suppose φ is a solution of

y" + Ay' + By = 0;   y(x0) = a, y'(x0) = b.

Here A, B, a, and b are constants. Suppose A and B are positive. Prove that lim (x → ∞) φ(x) = 0.
2.5 Euler's Equation

In this section we will define another class of second-order differential equations for which there is an elementary technique for finding the general solution.

The second-order homogeneous equation

y" + (A/x)y' + (B/x^2)y = 0,   (2.7)

with A and B constant, is called Euler's equation. It is defined on the half-lines x > 0 and x < 0. We will assume for this section that x > 0.

We will solve Euler's equation by transforming it to a constant coefficient linear equation, which we can solve easily. Recall that any positive number x can be written as e^t for some t (namely for t = ln(x)). Make the change of variables

x = e^t,
or, equivalently, t = ln(x), and let

Y(t) = y(e^t).

That is, in the function y(x), replace x by e^t, obtaining a new function of t. For example, if y(x) = x^3, then Y(t) = (e^t)^3 = e^(3t).

Now compute chain-rule derivatives. First,

y'(x) = (dY/dt)(dt/dx) = (1/x)Y'(t),

so

Y'(t) = x y'(x).
Next,

y"(x) = (d/dx)y'(x) = (d/dx)((1/x)Y'(t))
      = -(1/x^2)Y'(t) + (1/x)(d/dx)Y'(t)
      = -(1/x^2)Y'(t) + (1/x)(dY'/dt)(dt/dx)
      = -(1/x^2)Y'(t) + (1/x^2)Y"(t)
      = (1/x^2)(Y"(t) - Y'(t)).

Therefore,

x^2 y"(x) = Y"(t) - Y'(t).

If we write Euler's equation as

x^2 y"(x) + A x y'(x) + B y(x) = 0,

then these substitutions yield

Y"(t) - Y'(t) + A Y'(t) + B Y(t) = 0,

or

Y" + (A - 1)Y' + BY = 0.   (2.8)

This is a constant coefficient homogeneous linear differential equation for Y(t). Solve this equation, then let t = ln(x) in the solution Y(t) to obtain y(x) satisfying the Euler equation. We need not repeat this derivation each time we want to solve an Euler equation, since the coefficients A - 1 and B for the transformed equation (2.8) are easily read from the Euler equation (2.7). In carrying out this strategy, it is useful to recall that, for x > 0 and any constant c,

e^(c ln(x)) = x^c.
EXAMPLE 2.11

Find the general solution of x^2 y'' + 2xy' - 6y = 0.
Upon letting x = e^t, this differential equation transforms to

Y'' + Y' - 6Y = 0.

The coefficient of Y' is A - 1 = 1, with A = 2 in Euler's equation. The general solution of this linear homogeneous differential equation is

Y(t) = c1 e^(-3t) + c2 e^(2t)

for all real t. Putting t = ln(x) with x > 0, we obtain

y(x) = c1 e^(-3 ln(x)) + c2 e^(2 ln(x)) = c1 x^(-3) + c2 x^2,

and this is the general solution of the Euler equation.
CHAPTER 2 Second-Order Differential Equations
EXAMPLE 2.12

Consider the Euler equation x^2 y'' - 5xy' + 9y = 0. The transformed equation is

Y'' - 6Y' + 9Y = 0,

with general solution Y(t) = c1 e^(3t) + c2 t e^(3t). Let t = ln(x) to obtain

y(x) = c1 x^3 + c2 x^3 ln(x)

for x > 0. This is the general solution of the Euler equation.
EXAMPLE 2.13

Solve x^2 y'' + 3xy' + 10y = 0.
This transforms to

Y'' + 2Y' + 10Y = 0,

with general solution

Y(t) = c1 e^(-t) cos(3t) + c2 e^(-t) sin(3t).

Then

y(x) = c1 x^(-1) cos(3 ln(x)) + c2 x^(-1) sin(3 ln(x))
     = (1/x)[c1 cos(3 ln(x)) + c2 sin(3 ln(x))]

for x > 0.
As usual, we can solve an initial value problem by finding the general solution of the differential equation, then solving for the constants to satisfy the initial conditions.
EXAMPLE 2.14

Solve the initial value problem

x^2 y'' - 5xy' + 10y = 0;  y(1) = 4, y'(1) = -6.

We will first find the general solution of the Euler equation, then determine the constants to satisfy the initial conditions. With t = ln(x), we obtain

Y'' - 6Y' + 10Y = 0,

having general solution Y(t) = c1 e^(3t) cos(t) + c2 e^(3t) sin(t). The general solution of the Euler equation is

y(x) = c1 x^3 cos(ln(x)) + c2 x^3 sin(ln(x)).
For the first initial condition, we need y(1) = 4 = c1. Thus far, y(x) = 4x^3 cos(ln(x)) + c2 x^3 sin(ln(x)). Then

y'(x) = 12x^2 cos(ln(x)) - 4x^2 sin(ln(x)) + 3c2 x^2 sin(ln(x)) + c2 x^2 cos(ln(x)),

so y'(1) = 12 + c2 = -6. Then c2 = -18 and the solution of the initial value problem is

y(x) = 4x^3 cos(ln(x)) - 18x^3 sin(ln(x)).
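This initial value problem also lends itself to a numerical cross-check. The Python sketch below (an illustration, not part of the text) integrates the equation with a classical Runge-Kutta method and compares the result with the closed-form solution:

```python
import math

# Integrate x^2 y'' - 5x y' + 10y = 0 from x = 1 with y(1) = 4, y'(1) = -6
# as a first-order system (y, v), v = y', using classical RK4, then compare
# with the closed form y(x) = 4x^3 cos(ln x) - 18x^3 sin(ln x).
def f(x, y, v):
    return v, (5 * x * v - 10 * y) / x**2   # y' = v, v' = (5xv - 10y)/x^2

def rk4(x, y, v, h, steps):
    for _ in range(steps):
        k1y, k1v = f(x, y, v)
        k2y, k2v = f(x + h/2, y + h/2 * k1y, v + h/2 * k1v)
        k3y, k3v = f(x + h/2, y + h/2 * k2y, v + h/2 * k2v)
        k4y, k4v = f(x + h, y + h * k3y, v + h * k3v)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
    return y

exact = lambda x: 4 * x**3 * math.cos(math.log(x)) - 18 * x**3 * math.sin(math.log(x))
assert abs(rk4(1.0, 4.0, -6.0, 1e-3, 1000) - exact(2.0)) < 1e-5
```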
Observe the structure of the solutions of different kinds of differential equations. Solutions of the constant coefficient linear equation y'' + Ay' + By = 0 must have the forms e^(ax), xe^(ax), e^(ax)cos(βx), or e^(ax)sin(βx), depending on the coefficients. And solutions of an Euler equation x^2 y'' + Axy' + By = 0 must have the forms x^p, x^p ln(x), x^p cos(q ln(x)), or x^p sin(q ln(x)). For example, x^3 could never be a solution of the constant coefficient equation, and e^(-6x) could never be a solution of an Euler equation.
In each of Problems 1 through 12, find the general solution.

19. x^2 y'' + 25xy' + 144y = 0;  y(1) = -4, y'(1) = 0
20. x^2 y'' - 9xy' + 24y = 0;  y(1) = 1, y'(1) = 10
21. x^2 y'' + xy' - 4y = 0;  y(1) = 7, y'(1) = -3
22. Here is another approach to solving an Euler equation. For x > 0, substitute y = x^r and obtain values of r to make this a solution. Show how this leads in all cases to the same general solution as obtained by the transformation method.
2.6 The Nonhomogeneous Equation y'' + p(x)y' + q(x)y = f(x)

In view of Theorem 2.5, if we are able to find the general solution yh of the linear homogeneous equation y'' + p(x)y' + q(x)y = 0, then the general solution of the linear nonhomogeneous equation

y'' + p(x)y' + q(x)y = f(x) (2.9)

is y = yh + yp, in which yp is any solution of equation (2.9). This section is devoted to two methods for finding such a particular solution yp.
2.6.1 The Method of Variation of Parameters

Suppose we can find a fundamental set of solutions y1 and y2 for the homogeneous equation. The general solution of this homogeneous equation has the form

yh(x) = c1 y1(x) + c2 y2(x).

The method of variation of parameters consists of attempting a particular solution of the nonhomogeneous equation by replacing the constants c1 and c2 with functions of x. Thus, attempt to find u(x) and v(x) so that

yp(x) = u(x)y1(x) + v(x)y2(x)

is a solution of equation (2.9). How should we choose u and v? First compute

yp' = uy1' + vy2' + u'y1 + v'y2.

In order to simplify this expression, the first condition we will impose on u and v is that

u'y1 + v'y2 = 0. (2.10)

Now yp' = uy1' + vy2'. Next compute

yp'' = uy1'' + vy2'' + u'y1' + v'y2'.

Substitute these expressions for yp' and yp'' into equation (2.9):

uy1'' + vy2'' + u'y1' + v'y2' + p(x)(uy1' + vy2') + q(x)(uy1 + vy2) = f(x).

Rearrange terms in this equation to get

u[y1'' + p(x)y1' + q(x)y1] + v[y2'' + p(x)y2' + q(x)y2] + u'y1' + v'y2' = f(x).

The two terms in square brackets vanish because y1 and y2 are solutions of the homogeneous equation. This leaves

u'y1' + v'y2' = f(x). (2.11)

Now solve equations (2.10) and (2.11) for u' and v' to get

u'(x) = -y2(x)f(x)/W(x)  and  v'(x) = y1(x)f(x)/W(x), (2.12)

in which W is the Wronskian of y1 and y2. If we can integrate these equations to determine u and v, then we have yp.
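Formulas (2.12) can also be exercised numerically when the integrals are inconvenient. The Python sketch below (an illustration only; the simple test equation y'' + y = x is not an example from the text) integrates u' and v' with the trapezoid rule and confirms that the resulting yp satisfies the equation:

```python
import math

# Variation of parameters for y'' + y = x: y1 = cos x, y2 = sin x, W = 1.
# u' = -y2 f / W and v' = y1 f / W are integrated with the trapezoid rule,
# and yp = u y1 + v y2 is checked against the equation by finite differences.
n, h = 2000, 0.001
xs = [i * h for i in range(n)]
f = lambda x: x
u_p = [-math.sin(x) * f(x) for x in xs]
v_p = [math.cos(x) * f(x) for x in xs]

u, v = [0.0], [0.0]
for i in range(1, n):                     # cumulative trapezoid rule
    u.append(u[-1] + h * (u_p[i - 1] + u_p[i]) / 2)
    v.append(v[-1] + h * (v_p[i - 1] + v_p[i]) / 2)

yp = [u[i] * math.cos(xs[i]) + v[i] * math.sin(xs[i]) for i in range(n)]

for i in range(1, n - 1):                 # yp'' + yp should reproduce f(x) = x
    d2 = (yp[i - 1] - 2 * yp[i] + yp[i + 1]) / h**2
    assert abs(d2 + yp[i] - xs[i]) < 1e-3
```

For this test equation the integrals can in fact be done exactly, giving yp = x - sin(x), and the numerical yp agrees with that closed form.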
EXAMPLE 2.15

We will find the general solution of y'' + 4y = sec(x) for -π/4 < x < π/4.
The characteristic equation of y'' + 4y = 0 is λ^2 + 4 = 0, with roots ±2i. We may therefore choose y1(x) = cos(2x) and y2(x) = sin(2x). The Wronskian of these solutions of the homogeneous equation is

W(x) = cos(2x)[2cos(2x)] - sin(2x)[-2sin(2x)] = 2.

With f(x) = sec(x), equations (2.12) give us

u'(x) = -(1/2)sin(2x)sec(x) = -sin(x)

and

v'(x) = (1/2)cos(2x)sec(x) = cos(x) - (1/2)sec(x).

Integrate to get

u(x) = cos(x)  and  v(x) = sin(x) - (1/2)ln|sec(x) + tan(x)|.

Here we have let the constants of integration be zero because we need only one u and one v. Now we have the particular solution

yp(x) = u(x)y1(x) + v(x)y2(x)
      = cos(x)cos(2x) + [sin(x) - (1/2)ln|sec(x) + tan(x)|]sin(2x).

The general solution of y'' + 4y = sec(x) is

y(x) = yh(x) + yp(x)
     = c1 cos(2x) + c2 sin(2x) + cos(x)cos(2x) + [sin(x) - (1/2)ln|sec(x) + tan(x)|]sin(2x).
EXAMPLE 2.16

Suppose we want the general solution of

y'' - (4/x)y' + (4/x^2)y = x^2 + 1

for x > 0. The associated homogeneous equation is

y'' - (4/x)y' + (4/x^2)y = 0,

which we recognize as an Euler equation, with fundamental solutions y1(x) = x and y2(x) = x^4 for x > 0. The Wronskian of these solutions is

W(x) = x(4x^3) - x^4(1) = 3x^4,

and this is nonzero for x > 0. From equations (2.12), with f(x) = x^2 + 1,

u'(x) = -x^4(x^2 + 1)/(3x^4) = -(1/3)(x^2 + 1)

and

v'(x) = x(x^2 + 1)/(3x^4) = (1/3)(1/x + 1/x^3).

Integrate to get

u(x) = -(1/9)x^3 - (1/3)x  and  v(x) = (1/3)ln(x) - 1/(6x^2).

A particular solution is

yp(x) = [-(1/9)x^3 - (1/3)x]x + [(1/3)ln(x) - 1/(6x^2)]x^4
      = -(1/9)x^4 - (1/3)x^2 + (1/3)x^4 ln(x) - (1/6)x^2.

The general solution is

y(x) = yh(x) + yp(x) = c1 x + c2 x^4 - (1/9)x^4 - (1/2)x^2 + (1/3)x^4 ln(x)

for x > 0.
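The particular solution just found can be spot-checked by central differences (a Python sketch for illustration; not part of the text):

```python
import math

# Check that yp(x) = -(1/9)x^4 - (1/2)x^2 + (1/3)x^4 ln x satisfies
# y'' - (4/x)y' + (4/x^2)y = x^2 + 1 at several points x > 0.
yp = lambda x: -x**4 / 9 - x**2 / 2 + x**4 * math.log(x) / 3

h = 1e-4
for x in [0.5, 1.0, 2.0, 3.7]:
    d1 = (yp(x + h) - yp(x - h)) / (2 * h)
    d2 = (yp(x + h) - 2 * yp(x) + yp(x - h)) / h**2
    assert abs(d2 - 4 / x * d1 + 4 / x**2 * yp(x) - (x**2 + 1)) < 1e-4
```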
2.6.2 The Method of Undetermined Coefficients

Here is a second method for finding a particular solution yp, but it only applies if p(x) and q(x) are constant. Thus consider

y'' + Ay' + By = f(x).

Sometimes we can guess the general form of a solution yp from the form of f(x). For example, suppose f(x) is a polynomial. Since derivatives of polynomials are polynomials, we might try a polynomial for yp(x). Substitute a polynomial with unknown coefficients into the differential equation, and then choose the coefficients to match y'' + Ay' + By with f(x). Or suppose f(x) is an exponential function, say f(x) = e^(-2x). Since derivatives of e^(-2x) are just constant multiples of e^(-2x), we would attempt a solution of the form yp = Ce^(-2x), substitute into the differential equation, and solve for C to match the left and right sides of the differential equation. Here are some examples of this method.
EXAMPLE 2.17

Solve y'' - 4y = 8x^2 - 2x.
Since f(x) = 8x^2 - 2x is a polynomial of degree 2, we will attempt a solution

yp(x) = ax^2 + bx + c.

We do not need to try a higher degree polynomial, since the degree of yp'' - 4yp must be 2. If, for example, we included an x^3 term in yp, then yp'' - 4yp would have an x^3 term, and we know that it does not. Compute

yp' = 2ax + b  and  yp'' = 2a

and substitute into the differential equation to get

2a - 4(ax^2 + bx + c) = 8x^2 - 2x.

Collect coefficients of like powers of x to write

(-4a - 8)x^2 + (-4b + 2)x + (2a - 4c) = 0.

For yp to be a solution for all x, the polynomial on the left must be zero for all x. But a second degree polynomial can have only two roots, unless it is the zero polynomial. Thus all the coefficients must vanish, and we have the equations

-4a - 8 = 0,  -4b + 2 = 0,  2a - 4c = 0.

Solve these to obtain a = -2, b = 1/2, c = -1. Thus a solution is

yp(x) = -2x^2 + (1/2)x - 1,

as can be verified by substitution into the differential equation.
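That substitution is short enough to carry out in code (a quick Python check for illustration); since yp'' = -4 exactly, no numerical differentiation is needed:

```python
# Check that yp(x) = -2x^2 + x/2 - 1 satisfies y'' - 4y = 8x^2 - 2x.
yp = lambda x: -2 * x**2 + x / 2 - 1

for x in [-3.0, 0.0, 1.5, 10.0]:
    lhs = -4 - 4 * yp(x)        # yp'' - 4 yp, with yp'' = -4
    assert abs(lhs - (8 * x**2 - 2 * x)) < 1e-9
```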
If we want the general solution of the differential equation, we need the general solution yh of y'' - 4y = 0. This is

yh(x) = c1 e^(2x) + c2 e^(-2x).

The general solution of y'' - 4y = 8x^2 - 2x is

y(x) = c1 e^(2x) + c2 e^(-2x) - 2x^2 + (1/2)x - 1.

The method we have just illustrated is called the method of undetermined coefficients, because the idea is to guess a general form for yp and then solve for the coefficients to make a solution. Here are two more examples, after which we will point out a circumstance in which we must supplement the method.
EXAMPLE 2.18

Solve y'' + 2y' - 3y = 4e^(2x).
Because f(x) is a constant times an exponential, and the derivative of such a function is always a constant times the same function, we attempt yp = ae^(2x). Then yp' = 2ae^(2x) and yp'' = 4ae^(2x). Substitute into the differential equation to get

4ae^(2x) + 4ae^(2x) - 3ae^(2x) = 4e^(2x).

Then 5ae^(2x) = 4e^(2x), so choose a = 4/5 to get the solution

yp(x) = (4/5)e^(2x).

Again, if we wish we can write the general solution

y(x) = c1 e^(-3x) + c2 e^x + (4/5)e^(2x).
EXAMPLE 2.19

Solve y'' - 5y' + 6y = -3sin(2x).
Here f(x) = -3sin(2x). Now we must be careful, because derivatives of sin(2x) can be multiples of sin(2x) or cos(2x), depending on how many times we differentiate. This leads us to include both possibilities in a proposed solution:

yp(x) = c cos(2x) + d sin(2x).

Compute

yp' = -2c sin(2x) + 2d cos(2x),  yp'' = -4c cos(2x) - 4d sin(2x).

Substitute into the differential equation to get

-4c cos(2x) - 4d sin(2x) - 5[-2c sin(2x) + 2d cos(2x)] + 6[c cos(2x) + d sin(2x)] = -3sin(2x).

Collecting the sine terms on one side and the cosine terms on the other:

[2d + 10c + 3]sin(2x) = [-2c + 10d]cos(2x).

For yp to be a solution for all real x, this equation must hold for all x. But sin(2x) and cos(2x) are linearly independent (they are solutions of y'' + 4y = 0, and their Wronskian is nonzero). Therefore neither can be a constant multiple of the other. The only way the last equation can hold for all x is for the coefficient to be zero on both sides:

2d + 10c = -3  and  10d - 2c = 0.

Then d = -3/52 and c = -15/52, and we have found a solution:

yp(x) = -(3/52)sin(2x) - (15/52)cos(2x).

The general solution of this differential equation is

y(x) = c1 e^(3x) + c2 e^(2x) - (3/52)sin(2x) - (15/52)cos(2x).
As effective as this method is, there is a difficulty that is intrinsic to the method. It can be successfully met, but one must be aware of it and know how to proceed. Consider the following example.
EXAMPLE 2.20

Solve y'' + 2y' - 3y = 8e^x.
The coefficients on the left side are constant, and f(x) = 8e^x seems simple enough, so we proceed with yp(x) = ce^x. Substitute into the differential equation to get

ce^x + 2ce^x - 3ce^x = 8e^x,  or  0 = 8e^x.

Something is wrong. What happened?
The problem in this example is that e^x is a solution of y'' + 2y' - 3y = 0, so if we substitute ce^x into y'' + 2y' - 3y = 8e^x, the left side will equal zero, not 8e^x. This difficulty will occur whenever the proposed yp contains a term that is a solution of the homogeneous equation y'' + Ay' + By = 0, because then this term (which may be all of the proposed yp) will vanish when substituted into y'' + Ay' + By.
There is a way out of this difficulty. If a term of the proposed yp is a solution of y'' + Ay' + By = 0, multiply the proposed solution by x and try the modified function as yp. If this also contains a term, or by itself, satisfies y'' + Ay' + By = 0, then multiply by x again to try x^2 times the original proposed solution. This is as far as we will have to go in the case of second-order differential equations. Now continue Example 2.20 with this strategy.
EXAMPLE 2.21

Consider again y'' + 2y' - 3y = 8e^x. We saw that yp = ce^x does not work, because e^x, and hence also ce^x, satisfies y'' + 2y' - 3y = 0. Try yp = cxe^x. Compute

yp' = ce^x + cxe^x,  yp'' = 2ce^x + cxe^x

and substitute into the differential equation to get

2ce^x + cxe^x + 2[ce^x + cxe^x] - 3cxe^x = 8e^x.

Some terms cancel and we are left with 4ce^x = 8e^x. Choose c = 2 to obtain the particular solution yp(x) = 2xe^x.
EXAMPLE 2.22

Solve y'' - 6y' + 9y = 5e^(3x).
Our first impulse is to try yp = ce^(3x). But this is a solution of y'' - 6y' + 9y = 0. If we try yp = cxe^(3x), we also obtain an equation that cannot be solved for c. The reason is that the characteristic equation of y'' - 6y' + 9y = 0 is (λ - 3)^2 = 0, with repeated root 3. This means that e^(3x) and xe^(3x) are both solutions of the homogeneous equation y'' - 6y' + 9y = 0. Thus try yp(x) = cx^2 e^(3x). Compute

yp' = 2cxe^(3x) + 3cx^2 e^(3x),  yp'' = 2ce^(3x) + 12cxe^(3x) + 9cx^2 e^(3x).

Substitute into the differential equation to get

2ce^(3x) + 12cxe^(3x) + 9cx^2 e^(3x) - 6[2cxe^(3x) + 3cx^2 e^(3x)] + 9cx^2 e^(3x) = 5e^(3x).

After cancellations we have 2ce^(3x) = 5e^(3x), so c = 5/2. We have found a particular solution yp(x) = (5/2)x^2 e^(3x).
The last two examples suggest that in applying undetermined coefficients to y'' + Ay' + By = f(x), we should first obtain the general solution of y'' + Ay' + By = 0. We need this anyway for a general solution of the nonhomogeneous equation, but it also tells us whether to multiply our first choice for yp by x or x^2 before proceeding.
Here is a summary of the method of undetermined coefficients.
1. From f(x), make a first conjecture for the form of yp.
2. Solve y'' + Ay' + By = 0. If a solution of this equation appears in any term of the conjectured form for yp, modify this form by multiplying it by x. If this modified function still occurs in a solution of y'' + Ay' + By = 0, multiply by x again (so the original yp is multiplied by x^2 in this case).
3. Substitute the final proposed yp into y'' + Ay' + By = f(x) and solve for its coefficients.
Here is a list of functions to try in the initial stage (1) of formulating yp. In this list P(x) indicates a given polynomial of degree n, and Q(x) and R(x) polynomials with undetermined coefficients, of degree n.
f(x)                                        Initial Guess for yp
P(x)                                        Q(x)
ce^(ax)                                     de^(ax)
a cos(bx) or a sin(bx)                      c cos(bx) + d sin(bx)
P(x)e^(ax)                                  Q(x)e^(ax)
P(x)cos(bx) or P(x)sin(bx)                  Q(x)cos(bx) + R(x)sin(bx)
P(x)e^(ax)cos(bx) or P(x)e^(ax)sin(bx)      Q(x)e^(ax)cos(bx) + R(x)e^(ax)sin(bx)
EXAMPLE 2.23

Solve y'' + 9y = -4x sin(3x).
With f(x) = -4x sin(3x), the preceding list suggests that we attempt a particular solution of the form

yp(x) = (ax + b)cos(3x) + (cx + d)sin(3x).

Now solve y'' + 9y = 0 to obtain the fundamental set of solutions cos(3x) and sin(3x). The proposed yp includes terms b cos(3x) and d sin(3x), which are also solutions of y'' + 9y = 0. Therefore, modify the proposed yp by multiplying it by x, trying instead

yp(x) = (ax^2 + bx)cos(3x) + (cx^2 + dx)sin(3x).

Compute

yp' = (2ax + b)cos(3x) - (3ax^2 + 3bx)sin(3x) + (2cx + d)sin(3x) + (3cx^2 + 3dx)cos(3x)

and

yp'' = 2a cos(3x) - (6ax + 3b)sin(3x) - (6ax + 3b)sin(3x) - (9ax^2 + 9bx)cos(3x)
       + 2c sin(3x) + (6cx + 3d)cos(3x) + (6cx + 3d)cos(3x) - (9cx^2 + 9dx)sin(3x).

Substitute these into the differential equation; the terms ±(9ax^2 + 9bx)cos(3x) and ±(9cx^2 + 9dx)sin(3x) coming from yp'' and 9yp cancel. Now collect coefficients of "like" terms (sin(3x), x sin(3x), and so on). We get

(2a + 6d)cos(3x) + (-6b + 2c)sin(3x) + 12cx cos(3x) + (-12a + 4)x sin(3x) = 0,

with all other terms canceling. For this linear combination of cos(3x), sin(3x), x cos(3x), and x sin(3x) to be zero for all x, each coefficient must be zero. Therefore,

2a + 6d = 0,  -6b + 2c = 0,  12c = 0,  -12a + 4 = 0.

Then a = 1/3, c = 0, b = 0, and d = -1/9. We have found the particular solution

yp(x) = (1/3)x^2 cos(3x) - (1/9)x sin(3x).

The general solution is

y(x) = c1 cos(3x) + c2 sin(3x) + (1/3)x^2 cos(3x) - (1/9)x sin(3x).
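This particular solution can also be confirmed by central differences (a Python sketch for illustration; not part of the text):

```python
import math

# Finite-difference check that yp(x) = (x^2/3) cos 3x - (x/9) sin 3x
# satisfies y'' + 9y = -4x sin 3x.
yp = lambda x: x**2 / 3 * math.cos(3 * x) - x / 9 * math.sin(3 * x)

h = 1e-4
for x in [-2.0, -0.3, 0.7, 1.9]:
    d2 = (yp(x + h) - 2 * yp(x) + yp(x - h)) / h**2
    assert abs(d2 + 9 * yp(x) + 4 * x * math.sin(3 * x)) < 1e-4
```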
Sometimes a differential equation has nonconstant coefficients but transforms to a constant coefficient equation. We may then be able to use the method of undetermined coefficients on the transformed equation and then use the results to obtain solutions of the original equation.
EXAMPLE 2.24

Solve x^2 y'' - 5xy' + 8y = 2ln(x).
The method of undetermined coefficients does not apply here, since the differential equation has nonconstant coefficients. However, from our experience with the Euler equation, apply the transformation t = ln(x) and let Y(t) = y(e^t). Using results from Section 2.5, the differential equation transforms to

Y''(t) - 6Y'(t) + 8Y(t) = 2t,

which has constant coefficients on the left side. The homogeneous equation Y'' - 6Y' + 8Y = 0 has general solution

Yh(t) = c1 e^(2t) + c2 e^(4t)

and, by the method of undetermined coefficients, we find one solution of Y'' - 6Y' + 8Y = 2t to be

Yp(t) = (1/4)t + 3/16.

The general solution for Y is

Y(t) = c1 e^(2t) + c2 e^(4t) + (1/4)t + 3/16.

Since t = ln(x), the original differential equation for y has general solution

y(x) = c1 e^(2ln(x)) + c2 e^(4ln(x)) + (1/4)ln(x) + 3/16 = c1 x^2 + c2 x^4 + (1/4)ln(x) + 3/16.
2.6.3 The Principle of Superposition

Consider the equation

y'' + p(x)y' + q(x)y = f1(x) + f2(x) + ... + fN(x). (2.13)

Suppose ypj is a solution of

y'' + p(x)y' + q(x)y = fj(x).

We claim that yp1 + yp2 + ... + ypN is a solution of equation (2.13). This is easy to check by direct substitution into the differential equation:

(yp1 + ... + ypN)'' + p(x)(yp1 + ... + ypN)' + q(x)(yp1 + ... + ypN)
= [yp1'' + p(x)yp1' + q(x)yp1] + ... + [ypN'' + p(x)ypN' + q(x)ypN]
= f1(x) + ... + fN(x).

This means that we can solve each equation y'' + p(x)y' + q(x)y = fj(x) individually, and the sum of these solutions is a solution of equation (2.13). This is called the principle of superposition, and it sometimes enables us to solve a problem by breaking it into a sum of "smaller" problems that are easier to handle individually.
EXAMPLE 2.25

Solve y'' + 4y = x + 2e^(-2x).
Consider two problems:
Problem 1: y'' + 4y = x, and
Problem 2: y'' + 4y = 2e^(-2x).
Using undetermined coefficients, we find that a solution of Problem 1 is yp1(x) = x/4, and that a solution of Problem 2 is yp2(x) = e^(-2x)/4. Therefore,

yp(x) = (1/4)(x + e^(-2x))

is a solution of y'' + 4y = x + 2e^(-2x). The general solution of this differential equation is

y(x) = c1 cos(2x) + c2 sin(2x) + (1/4)(x + e^(-2x)).
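The superposition step can be checked directly (a Python sketch for illustration; not part of the text):

```python
import math

# Superposition check for Example 2.25: yp1 = x/4 solves y'' + 4y = x,
# yp2 = e^(-2x)/4 solves y'' + 4y = 2e^(-2x), and their sum solves the
# combined equation. Second derivatives are written in closed form.
for x in [-1.0, 0.0, 0.8, 2.5]:
    yp1, d2yp1 = x / 4, 0.0
    yp2, d2yp2 = math.exp(-2 * x) / 4, math.exp(-2 * x)
    assert abs(d2yp1 + 4 * yp1 - x) < 1e-12
    assert abs(d2yp2 + 4 * yp2 - 2 * math.exp(-2 * x)) < 1e-12
    s, d2s = yp1 + yp2, d2yp1 + d2yp2
    assert abs(d2s + 4 * s - (x + 2 * math.exp(-2 * x))) < 1e-12
```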
2.6.4 Higher-Order Differential Equations

The methods we now have for solving y'' + p(x)y' + q(x)y = f(x) under certain conditions can also be applied to higher-order differential equations, at least in theory. However, there are practical difficulties to this approach. Consider the following example.

EXAMPLE 2.26

Solve

d^6 y/dx^6 - 4 d^4 y/dx^4 + 2 dy/dx + 15y = 0.

If we take a cue from the second-order case, we attempt solutions y = e^(λx). Upon substituting this into the differential equation, we obtain an equation for λ:

λ^6 - 4λ^4 + 2λ + 15 = 0.

In the second-order case, the characteristic polynomial is always of degree 2 and easily solved. Here we encounter a sixth-degree polynomial whose roots are not obvious. They are, approximately,

-1.685798616 ± 0.2107428331i,
-0.04747911354 ± 1.279046854i,
and 1.733277730 ± 0.4099384482i.
When the order of the differential equation is n > 2, having to find the roots of an nth degree polynomial is enough of a barrier to make this approach impractical except in special cases. A better approach is to convert this sixth-order equation to a system of first-order equations as follows. Define new variables

z1 = y,  z2 = y',  z3 = y'',  z4 = d^3y/dx^3,  z5 = d^4y/dx^4,  z6 = d^5y/dx^5.

Now we have a system of six first-order differential equations:

z1' = z2,
z2' = z3,
z3' = z4,
z4' = z5,
z5' = z6,
z6' = 4z5 - 2z2 - 15z1.

The last equation in this system is exactly the original differential equation, stated in terms of the new quantities zj. The point to reformulating the problem in this way is that powerful matrix techniques can be invoked to find solutions. We therefore put off discussion of differential equations of order higher than 2 until we have developed the matrix machinery needed to exploit this approach.
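The quoted roots, and the system formulation itself, can be checked in a few lines (a Python sketch for illustration; not part of the text):

```python
# Check the approximate roots quoted for λ^6 - 4λ^4 + 2λ + 15 = 0, and
# sketch the right-hand side of the equivalent first-order system.
p = lambda z: z**6 - 4 * z**4 + 2 * z + 15

roots = [complex(-1.685798616, 0.2107428331),
         complex(-0.04747911354, 1.279046854),
         complex(1.733277730, 0.4099384482)]
for r in roots:
    assert abs(p(r)) < 1e-3
    assert abs(p(r.conjugate())) < 1e-3     # conjugates are also roots

def F(z):
    # z = (z1, ..., z6) = (y, y', y'', y''', y'''', y''''')
    return (z[1], z[2], z[3], z[4], z[5],
            4 * z[4] - 2 * z[1] - 15 * z[0])

assert F((1, 0, 0, 0, 0, 0))[-1] == -15     # z6' = 4 z5 - 2 z2 - 15 z1
```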
In each of Problems 1 through 6, find the general solution using the method of variation of parameters.
2.7 Application of Second-Order Differential Equations to a Mechanical System

Envision a spring of natural (unstretched) length L and spring constant k. This constant quantifies the "stiffness" of the spring. The spring is suspended vertically. An object of mass m is attached at the lower end, stretching the spring d units past its rest length. The object comes to rest in its equilibrium position. It is then displaced vertically a distance y0 units (up or down), and released, possibly with an initial velocity (Figure 2.4). We want to construct a mathematical model allowing us to analyze the motion of the object.
Let y(t) be the displacement of the object from the equilibrium position at time t. As a convenience, take this equilibrium position to be y = 0. Choose down as the positive direction. Both of these choices are arbitrary.
FIGURE 2.4 Mass/spring system: (a) unstretched; (b) static equilibrium; (c) system in motion.
Now consider the forces acting on the object. Gravity pulls it downward with a force of magnitude mg. By Hooke's law, the force the spring exerts on the object has magnitude ky. At the equilibrium position, the force of the spring is -kd, negative because it acts upward. If the object is pulled downward a distance y from this position, an additional force -ky is exerted on it. Thus, the total force on the object due to the spring is -kd - ky. The total force due to gravity and the spring is mg - kd - ky. Since at the equilibrium point (y = 0) this force is zero, then mg = kd. The net force acting on the object due to gravity and the spring is therefore just -ky.
Finally, there are forces tending to retard or damp out the motion. These include air resistance or viscosity of the medium if the object is suspended in some fluid such as oil. A standard assumption, arising from experiment, is that the retarding forces have magnitude proportional to the velocity y'. Thus, for some constant c called the damping constant, the retarding forces have magnitude cy'. The total force acting on the object due to gravity, damping, and the spring itself therefore has magnitude -ky - cy'.
Finally, there may be a driving force of magnitude f(t) on the object. Now the total external force acting on the object has magnitude

F = -ky - cy' + f(t).

Assuming that the mass is constant, Newton's second law of motion enables us to write my'' = -ky - cy' + f(t), or

y'' + (c/m)y' + (k/m)y = (1/m)f(t). (2.14)
This is the spring equation. We will analyze the motion described by solutions of this equation, under various conditions.

2.7.1 Unforced Motion

Suppose first that f(t) = 0, so there is no driving force. Now the spring equation is

y'' + (c/m)y' + (k/m)y = 0,

with characteristic equation

λ^2 + (c/m)λ + k/m = 0.

This has roots

λ = -c/(2m) ± (1/(2m))√(c^2 - 4km).

As we might expect, the general solution, hence the motion of the object, will depend on its mass, the amount of damping, and the stiffness of the spring. Consider the following cases.

Case 1: c^2 - 4km > 0
In this event, the characteristic equation has two real, distinct roots:

λ1 = -c/(2m) + (1/(2m))√(c^2 - 4km)  and  λ2 = -c/(2m) - (1/(2m))√(c^2 - 4km).

The general solution of equation (2.14) in this case is

y(t) = c1 e^(λ1 t) + c2 e^(λ2 t).

Clearly λ2 < 0. Since m and k are positive, c^2 - 4km < c^2, so √(c^2 - 4km) < c and λ1 is negative also. Therefore,

lim (t→∞) y(t) = 0,

regardless of initial conditions. In the case c^2 - 4km > 0, the motion of the object decays to zero as time increases. This case is called overdamping, and it occurs when the square of the damping constant exceeds four times the product of the mass and spring constant.
EXAMPLE 2.27 Overdamping

Suppose c = 6, k = 5, and m = 1. Now the general solution is

y(t) = c1 e^(-t) + c2 e^(-5t).

Suppose, to be specific, the object was initially (at t = 0) drawn upward 4 units from the equilibrium position and released downward with a speed of 2 units per second. Then y(0) = -4 and y'(0) = 2, and we obtain

y(t) = (1/2)e^(-t)(-9 + e^(-4t)).

A graph of this solution is shown in Figure 2.5. What does the solution tell us about the motion? Since -9 + e^(-4t) < 0 for t > 0, then y(t) < 0 and the object always remains above the equilibrium point. Its velocity y'(t) = (1/2)e^(-t)(9 - 5e^(-4t)) decreases to zero as t increases, and y(t) → 0 as t increases, so the object moves downward
FIGURE 2.5 An example of overdamped motion, no driving force.
FIGURE 2.6 An example of critically damped motion, no driving force.
toward equilibrium with ever decreasing velocity, approaching closer to but never reaching the equilibrium point, and never coming to rest.

Case 2: c^2 - 4km = 0
Now the general solution of the spring equation (2.14) is

y(t) = (c1 + c2 t)e^(-ct/(2m)).

This case is called critical damping. While y(t) → 0 as t → ∞, as in the overdamping case, we will see an important difference between critical and overdamping.
EXAMPLE 2.28

Let c = 2 and k = m = 1. Now y(t) = (c1 + c2 t)e^(-t). Suppose the object is initially pulled up four units above the equilibrium position and then pushed downward with a speed of 5 units per second. Then y(0) = -4 and y'(0) = 5, so

y(t) = (-4 + t)e^(-t).

Observe that y(4) = 0, so, unlike what we saw with overdamping, the object actually reaches the equilibrium position, four seconds after it was released, and then passes through it. In fact, y(t) reaches its maximum when t = 5 seconds, and this maximum value is y(5) = e^(-5), about 0.007 unit below the equilibrium point. The velocity y'(t) = (5 - t)e^(-t) is negative for t > 5, so the object's velocity decreases after this 5-second point. Since y(t) → 0 as t → ∞, the object moves with decreasing velocity back toward the equilibrium point as time increases. Figure 2.6 shows a graph of the displacement function in this case.
In general, when critical damping occurs, the object either passes through the equilibrium point exactly once, as just seen, or never reaches it at all, depending on the initial conditions.

Case 3: c^2 - 4km < 0
Now the spring constant and mass together are sufficiently large that c^2 < 4km, and the damping is less dominant. This case is called underdamping. The general solution now is

y(t) = e^(-ct/(2m))[c1 cos(βt) + c2 sin(βt)],

in which β = (1/(2m))√(4km - c^2).
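The underdamped form can be spot-checked numerically for sample values (a Python sketch; the values m = c = k = 1 are assumptions chosen only for this illustration):

```python
import math

# For m = 1, c = 1, k = 1 we have c^2 - 4km = -3 < 0 and β = √3/2, so
# y(t) = e^(-t/2) cos(βt) should satisfy y'' + (c/m)y' + (k/m)y = 0.
m, c, k = 1.0, 1.0, 1.0
beta = math.sqrt(4 * k * m - c**2) / (2 * m)
y = lambda t: math.exp(-c * t / (2 * m)) * math.cos(beta * t)

h = 1e-4
for t in [0.3, 1.0, 2.6, 5.0]:
    d1 = (y(t + h) - y(t - h)) / (2 * h)
    d2 = (y(t + h) - 2 * y(t) + y(t - h)) / h**2
    assert abs(d2 + (c / m) * d1 + (k / m) * y(t)) < 1e-5
```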
Because c and m are positive, y(t) → 0 as t → ∞. However, now the motion is oscillatory because of the sine and cosine terms in the solution. The motion is not, however, periodic, because of the exponential factor, which causes the amplitude of the oscillations to decay to zero as time increases.
EXAMPLE 2.29

Suppose c = k = 2 and m = 1. Now the general solution is

y(t) = e^(-t)[c1 cos(t) + c2 sin(t)].

Suppose the object is driven downward from a point three units above equilibrium, with an initial speed of two units per second. Then y(0) = -3 and y'(0) = 2 and the solution is

y(t) = -e^(-t)(3cos(t) + sin(t)).

The behavior of this solution is more easily visualized if we write it in phase angle form. We want to choose C and δ so that

3cos(t) + sin(t) = C cos(t + δ).

Since

3cos(t) + sin(t) = C cos(t)cos(δ) - C sin(t)sin(δ),

we need C cos(δ) = 3 and C sin(δ) = -1. Then

C sin(δ)/(C cos(δ)) = tan(δ) = -1/3,

so

δ = tan^(-1)(-1/3) = -tan^(-1)(1/3).

To solve for C, write

C^2 cos^2(δ) + C^2 sin^2(δ) = C^2 = 3^2 + 1^2 = 10,

so C = √10. Now we can write the solution as

y(t) = -√10 e^(-t) cos(t - tan^(-1)(1/3)).

The graph is therefore a cosine curve with decaying amplitude, squashed between the graphs of y = √10 e^(-t) and y = -√10 e^(-t). The solution is shown in Figure 2.7, with these two exponential functions shown as reference curves. Because of the oscillatory cosine term, the object passes back and forth through the equilibrium point. In fact, it passes through equilibrium exactly when y(t) = 0, or

t = tan^(-1)(1/3) + (2n + 1)π/2  for n = 0, 1, 2, 3, ....

In theory, the object oscillates through the equilibrium infinitely often in this underdamping case, although the amplitudes of the oscillations decrease to zero as time increases.
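The phase-angle identity and the initial conditions can be confirmed numerically (a Python sketch for illustration; not part of the text):

```python
import math

# Check the phase-angle form of Example 2.29: y(t) = -√10 e^(-t) cos(t - δ),
# δ = arctan(1/3), should agree with -e^(-t)(3 cos t + sin t) and satisfy
# y(0) = -3, y'(0) = 2.
C, delta = math.sqrt(10), math.atan(1 / 3)
y = lambda t: -C * math.exp(-t) * math.cos(t - delta)

for t in [0.0, 0.5, 1.7, 4.0]:
    direct = -math.exp(-t) * (3 * math.cos(t) + math.sin(t))
    assert abs(y(t) - direct) < 1e-12

assert abs(y(0.0) + 3.0) < 1e-12
h = 1e-6
assert abs((y(h) - y(-h)) / (2 * h) - 2.0) < 1e-4
```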
FIGURE 2.7 An example of underdamped motion, no driving force.

2.7.2 Forced Motion

Now suppose an external driving force of magnitude f(t) acts on the object. Of course, different forces will cause different kinds of motion. As an illustration, we will analyze the motion under the influence of a periodic driving force f(t) = A cos(ωt), with A and ω positive constants. Now the spring equation is

y'' + (c/m)y' + (k/m)y = (A/m)cos(ωt). (2.15)

We know how to solve this nonhomogeneous linear equation. Begin by finding a particular solution, using the method of undetermined coefficients. Attempt a solution

yp(t) = a cos(ωt) + b sin(ωt).

Substitution of this into equation (2.15) and rearrangement of terms yields

[-aω^2 + bωc/m + ak/m - A/m]cos(ωt) = [bω^2 + aωc/m - bk/m]sin(ωt).

Since sin(ωt) and cos(ωt) are not constant multiples of each other, the only way this can be true for all t > 0 is for the coefficient on each side of the equation to be zero. Therefore

-aω^2 + bωc/m + ak/m - A/m = 0  and  bω^2 + aωc/m - bk/m = 0.

Solve these for a and b, keeping in mind that A, c, k, and m are given. We get

a = A(k - mω^2)/[(k - mω^2)^2 + ω^2 c^2]  and  b = Aωc/[(k - mω^2)^2 + ω^2 c^2].

Let ω0 = √(k/m). Then a particular solution of equation (2.15), for this forcing function, is given by

yp(t) = [mA(ω0^2 - ω^2)/(m^2(ω0^2 - ω^2)^2 + ω^2 c^2)]cos(ωt) + [Aωc/(m^2(ω0^2 - ω^2)^2 + ω^2 c^2)]sin(ωt), (2.16)

assuming that c ≠ 0 or ω ≠ ω0.
We will now examine some specific cases to get some insight into the motion with this forcing function.

Overdamped Forced Motion
Suppose c = 6, k = 5, and m = 1, as we had previously in the overdamping case. Suppose also that A = 6√5 and ω = √5. If the object is released from rest from the equilibrium position, then the displacement function satisfies the initial value problem

y'' + 6y' + 5y = 6√5 cos(√5 t);  y(0) = y'(0) = 0.

This problem has the unique solution

y(t) = (√5/4)(e^(-5t) - e^(-t)) + sin(√5 t),

a graph of which is shown in Figure 2.8. As time increases, the exponential terms decrease to zero, exerting less influence on the motion, while the sine term oscillates. Thus, as t increases, the solution tends to behave more like sin(√5 t) and the object moves up and down through the equilibrium point, with approximate period 2π/√5. Contrast this with the overdamped motion with no forcing function, in which the object began above the equilibrium point and moved with decreasing velocity down toward it, but never reached it.

Critically Damped Forced Motion
Let c = 2 and m = k = 1. Suppose ω = 1 and A = 2. Assume that the object is released from rest from the equilibrium position. Now the initial value problem for the position function is

y'' + 2y' + y = 2cos(t);  y(0) = y'(0) = 0,

with solution

y(t) = -te^(-t) + sin(t).
A graph of this solution is shown in Figure 2.9. The exponential term exerts a significant influence at first, but decreases to zero as time increases. The term -te^(-t) decreases to zero as t increases, but not as quickly as the corresponding exponential terms in the overdamping case. Nevertheless, after a while the motion settles into nearly (but not exactly, because -te^(-t) is never actually zero for positive t) a sinusoidal motion back and forth through the equilibrium point. This is an example of critically damped forced motion.
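The critically damped forced solution is easy to verify in closed form (a Python sketch for illustration; not part of the text):

```python
import math

# Verify y(t) = -t e^(-t) + sin t for y'' + 2y' + y = 2 cos t,
# y(0) = y'(0) = 0, with derivatives written in closed form.
y = lambda t: -t * math.exp(-t) + math.sin(t)
dy = lambda t: (t - 1) * math.exp(-t) + math.cos(t)
d2y = lambda t: (2 - t) * math.exp(-t) - math.sin(t)

assert abs(y(0.0)) < 1e-12 and abs(dy(0.0)) < 1e-12
for t in [0.5, 2.0, 6.3, 11.0]:
    assert abs(d2y(t) + 2 * dy(t) + y(t) - 2 * math.cos(t)) < 1e-9
```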
FIGURE 2.8 An example of overdamped motion driven by 6√5 cos(√5 t).
FIGURE 2.9 An example of critically damped motion driven by 2cos(t).
FIGURE 2.10  An example of underdamped motion driven by 2√2 cos(√2 t).
Underdamped Forced Motion  Suppose now that c = k = 2, m = 1, ω = √2, and A = 2√2. Now c² - 4km < 0, and we have underdamped motion, but this time with a forcing function. If the object is released from rest from the equilibrium position, then the initial value problem for the displacement function is

y'' + 2y' + 2y = 2√2 cos(√2 t);
y(0) = y'(0) = 0,
with solution

y(t) = -√2 e^{-t} sin(t) + sin(√2 t).

Unlike the other two cases, the exponential term in this solution carries a sin(t) factor. Figure 2.10 shows a graph of this function. As time increases, the term -√2 e^{-t} sin(t) becomes less influential and the motion settles nearly into an oscillation back and forth through the equilibrium point, with period nearly 2π/√2.

2.7.3 Resonance

In the absence of damping, an interesting phenomenon called resonance can occur. Suppose c = 0 but that there is still a periodic driving force f(t) = A cos(ωt). Now the spring equation is

y'' + (k/m) y = (A/m) cos(ωt).

From equation (2.16) with c = 0, this equation has general solution

y(t) = c₁ cos(ω₀t) + c₂ sin(ω₀t) + (A/(m(ω₀² - ω²))) cos(ωt),   (2.17)

in which ω₀ = √(k/m). This number is called the natural frequency of the spring system, and is a function of the stiffness of the spring and mass of the object, while ω is the input frequency and is contained in the driving force. This general solution assumes that the natural and input frequencies are different. Of course, the closer we choose the natural and input frequencies, the larger the amplitude of the cos(ωt) term in the solution.

Consider the case that the natural and input frequencies are the same. Now the differential equation is

y'' + (k/m) y = (A/m) cos(ω₀t)   (2.18)

and the function given by equation (2.17) is not a solution. To solve equation (2.18), first write the general solution y_h of y'' + (k/m)y = 0:

y_h(t) = c₁ cos(ω₀t) + c₂ sin(ω₀t).
2.7 Application of Second-Order Differential Equations to a Mechanical System
For a particular solution of equation (2.18), we will proceed by the method of undetermined coefficients. Since the forcing function contains a term found in y_h, we will attempt a particular solution of the form y_p(t) = a t cos(ω₀t) + b t sin(ω₀t). Substitute this into equation (2.18) to obtain

-2aω₀ sin(ω₀t) + 2bω₀ cos(ω₀t) = (A/m) cos(ω₀t).

Thus choose

a = 0 and 2bω₀ = A/m,

leading to the particular solution

y_p(t) = (A/(2mω₀)) t sin(ω₀t).

The general solution of equation (2.18) is therefore

y(t) = c₁ cos(ω₀t) + c₂ sin(ω₀t) + (A/(2mω₀)) t sin(ω₀t).
This solution differs from that in the case ω ≠ ω₀ in the factor of t in y_p(t). Because of this, solutions increase in amplitude as t increases. This phenomenon is called resonance. As a specific example, let c₁ = c₂ = ω₀ = 1 and A/2m = 1 to write the solution as

y(t) = cos(t) + sin(t) + t sin(t).

A graph of this function is shown in Figure 2.11, clearly revealing the increasing magnitude of the oscillations with time.

FIGURE 2.11  Resonance.

While there is always some damping in the real world, if the damping constant is close to zero compared to other factors, such as the mass, and if the natural and input frequencies are (nearly) equal, then oscillations can build up to a sufficiently large amplitude to cause resonance-like behavior and damage a system. This can occur with soldiers marching in step across a bridge. If the cadence of the march (input frequency) is near enough to the natural frequency of the material of the bridge, vibrations can build up to dangerous levels. This occurred near Manchester, England, in 1831 when a column of soldiers marching across the Broughton Bridge caused it to collapse. More recently, the Tacoma Narrows Bridge in Washington experienced increasing oscillations driven by energy from the wind, causing it to whip about in sensational fashion before its collapse into the river. Videos of the wild thrashing about of the bridge are available in some libraries and engineering and science departments.
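For readers who want to see the growth concretely, here is a small numerical check (our own sketch, not from the text) that y(t) = cos t + sin t + t sin t satisfies y'' + y = 2 cos t and that its oscillations keep growing.

```python
import numpy as np

# y(t) = cos t + sin t + t sin t solves y'' + y = 2 cos t
# (omega0 = 1 and A/2m = 1, as chosen in the text).
y = lambda t: np.cos(t) + np.sin(t) + t * np.sin(t)

h = 1e-4
ts = np.linspace(0.0, 40.0, 400)
ypp = (y(ts + h) - 2 * y(ts) + y(ts - h)) / h**2
res = np.max(np.abs(ypp + y(ts) - 2 * np.cos(ts)))

# the amplitude keeps growing: compare an early and a late window
early = np.max(np.abs(y(np.linspace(0.0, 10.0, 2000))))
late = np.max(np.abs(y(np.linspace(30.0, 40.0, 2000))))
print(res, early, late)
```

The residual is at roundoff level, while the late-window amplitude is several times the early one, reflecting the factor of t in the particular solution.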
2.7.4 Beats

In the absence of damping, an oscillatory driving force can also cause a phenomenon called beats. Suppose ω ≠ ω₀ and consider

y'' + ω₀² y = (A/m) cos(ωt).
The Tacoma Narrows Bridge was completed in 1940 and stood as a new standard of combined artistry and functionality. The bridge soon became known for its tendency to sway in high winds, but no one suspected what was about to occur. On November 7, 1940, energy provided by unusually strong winds, coupled with a resonating effect in the bridge's material and design, caused the oscillations in the bridge to be reinforced and build to dangerous levels. Soon, the twisting caused one side of the sidewalk to rise 28 feet above that of the other side. Concrete dropped out of the roadway, and a section of the suspension span completely rotated and fell away. Shortly thereafter, the entire center span collapsed into Puget Sound. This sensational construction failure motivated new mathematical treatments of vibration and wave phenomena in the design of bridges and other large structures. The forces that brought down this bridge are a more complicated version of the resonance phenomenon discussed in Section 2.7.3.
FIGURE 2.12  Beats.
Assuming that the object is released from rest at the equilibrium position, then y(0) = y'(0) = 0 and from equation (2.17) we have the solution

y(t) = (A/(m(ω₀² - ω²))) [cos(ωt) - cos(ω₀t)].
The behavior of this solution reveals itself more clearly if we write it as

y(t) = (2A/(m(ω₀² - ω²))) sin(((ω₀ + ω)/2) t) sin(((ω₀ - ω)/2) t).

This formulation reveals a periodic variation of amplitude in the solution, depending on the relative sizes of ω₀ + ω and ω₀ - ω. It is this periodic variation of amplitude that is called a beat. As a specific example, suppose ω₀ + ω = 5 and ω₀ - ω = 1/2, and the constants are chosen so that 2A/(m(ω₀² - ω²)) = 1. In this case, the displacement function is

y(t) = sin(5t/2) sin(t/4).

The beats are apparent in the graph of this solution in Figure 2.12.

2.7.5 Analogy with an Electrical Circuit

If a circuit contains a resistance R, inductance L, and capacitance C, and the electromotive force is E(t), then the impressed voltage is obtained as a sum of the voltage drops in the circuit:

E(t) = Li'(t) + Ri(t) + (1/C) q(t).
Here i(t) is the current at time t, and q(t) is the charge. Since i = q', we can write the second-order linear differential equation

q'' + (R/L) q' + (1/(LC)) q = (1/L) E.
If R, L, and C are constant, this is a linear equation of the type we have solved for various choices of E(t). It is interesting to observe that this equation is of exactly the same form as the equation for the displacement of an object attached to a spring, which is

y'' + (c/m) y' + (k/m) y = (1/m) f(t).
This means that solutions of one equation readily translate into solutions of the other and suggests the following equivalences between electrical and mechanical quantities:

displacement function y(t) ↔ charge q(t)
velocity y'(t) ↔ current i(t)
driving force f(t) ↔ electromotive force E(t)
mass m ↔ inductance L
damping constant c ↔ resistance R
spring modulus k ↔ reciprocal 1/C of the capacitance
EXAMPLE 2.30

Consider the circuit of Figure 2.13, driven by a potential of E(t) = 17 sin(2t) volts. At time zero the current is zero and the charge on the capacitor is 1/2000 coulomb. The charge q(t) on the capacitor for t > 0 is obtained by solving the initial value problem

10q'' + 120q' + 1000q = 17 sin(2t);  q(0) = 1/2000, q'(0) = 0.

The solution is

q(t) = (1/1500) e^{-6t}[7 cos(8t) - sin(8t)] + (1/240)[-cos(2t) + 4 sin(2t)].
FIGURE 2.13  The circuit of Example 2.30 (E(t) = 17 sin(2t) volts, C = 10⁻³ F).
FIGURE 2.14  Transient part of the current for the circuit of Figure 2.13.
FIGURE 2.15  Steady-state part of the current for the circuit of Figure 2.13.
FIGURE 2.16  Current function for the circuit of Figure 2.13.
The current can be calculated as

i(t) = q'(t) = -(1/30) e^{-6t}[cos(8t) + sin(8t)] + (1/120)[4 cos(2t) + sin(2t)].

The current is a sum of a transient part

-(1/30) e^{-6t}[cos(8t) + sin(8t)],

named for the fact that it decays to zero as t increases, and a steady-state part

(1/120)[4 cos(2t) + sin(2t)].

The transient and steady-state parts are shown in Figures 2.14 and 2.15, and their sum, the current, is shown in Figure 2.16.
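The charge function of Example 2.30 can be verified numerically; the sketch below (our own check, not part of the text) evaluates the residual of the circuit equation and the initial charge.

```python
import numpy as np

# Verify the charge function of Example 2.30:
# 10 q'' + 120 q' + 1000 q = 17 sin(2t), q(0) = 1/2000, q'(0) = 0.
def q(t):
    return (np.exp(-6 * t) * (7 * np.cos(8 * t) - np.sin(8 * t)) / 1500
            + (-np.cos(2 * t) + 4 * np.sin(2 * t)) / 240)

h = 1e-4
ts = np.linspace(0.1, 5.0, 500)
qp = (q(ts + h) - q(ts - h)) / (2 * h)
qpp = (q(ts + h) - 2 * q(ts) + q(ts - h)) / h**2
res = np.max(np.abs(10 * qpp + 120 * qp + 1000 * q(ts) - 17 * np.sin(2 * ts)))
print(res, q(0.0))  # residual is numerically zero; q(0) = 0.0005 coulomb
```

The residual is at roundoff level, and q(0) = 7/1500 - 1/240 = 1/2000, matching the stated initial charge.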
PROBLEMS
1. The object of this problem is to gauge the relative effects of initial position and velocity on the motion in the unforced, overdamped case. Solve the initial value problems

y'' + 4y' + 2y = 0;  y(0) = 5, y'(0) = 0

and

y'' + 4y' + 2y = 0;  y(0) = 0, y'(0) = 5.

Graph the solutions on the same set of axes. What conclusions can be drawn from these solutions about the influence of initial position and velocity?

2. Repeat the experiment of Problem 1, except now use the critically damped unforced equation y'' + 4y' + 4y = 0.

3. Repeat the experiment of Problem 1 for the underdamped unforced case y'' + 2y' + 5y = 0.

Problems 4 through 9 explore the effects of changing the initial position or initial velocity on the motion of the bob. In each, use the same set of axes to graph the solution of the initial value problem for the given values of A and observe the effect that these changes cause in the solution.

4. y'' + 4y' + 2y = 0; y(0) = A, y'(0) = 0; A has values 1, 3, 6, 10, -4, and -7.

5. y'' + 4y' + 2y = 0; y(0) = 0, y'(0) = A; A has values 1, 3, 6, 10, -4, and -7.

6. y'' + 4y' + 4y = 0; y(0) = A, y'(0) = 0; A has values 1, 3, 6, 10, -4, and -7.

7. y'' + 4y' + 4y = 0; y(0) = 0, y'(0) = A; A has values 1, 3, 6, 10, -4, and -7.

8. y'' + 2y' + 5y = 0; y(0) = A, y'(0) = 0; A has values 1, 3, 6, 10, -4, and -7.

9. y'' + 2y' + 5y = 0; y(0) = 0, y'(0) = A; A has values 1, 3, 6, 10, -4, and -7.

10. An object having mass 1 gram is attached to the lower end of a spring having spring modulus 29 dynes per centimeter. The bob is, in turn, adhered to a dashpot that imposes a damping force of 10v dynes, where v(t) is the velocity at time t in centimeters per second. Determine the motion of the bob if it is pulled down 3 centimeters from equilibrium and then struck upward with a blow sufficient to impart a velocity of 1 centimeter per second. Graph the solution. Solve the problem when the initial velocity is, in turn, 2, 4, 7, and 12 centimeters per second. Graph these solutions on the same set of axes to visualize the influence of the initial velocity on the motion.

11. An object having mass 1 kilogram is suspended from a spring having a spring constant of 24 newtons per meter. Attached to the object is a shock absorber, which induces a drag of 11v newtons (velocity is in meters per second). The system is set in motion by lowering the bob centimeters and then striking it hard enough to impart an upward velocity of 5 meters per second. Solve for and graph the displacement function. Obtain the solution for the cases that the bob is lowered, in turn, 12, 20, 30, and 45 centimeters, and graph the displacement functions for the five cases on the same set of axes to see the effect of the distance lowered.

12. When an 8-pound weight is suspended from a spring, it stretches the spring 2 inches. Determine the
equation of motion when an object with a mass of 7 kilograms is suspended from this spring, and the system is set in motion by striking the object an upward blow, imparting a velocity of 4 meters per second.

13. How many times can the bob pass through the equilibrium point in the case of overdamped motion? What condition can be placed on the initial displacement y(0) to guarantee that the bob never passes through equilibrium?

14. How many times can the bob pass through the equilibrium point in the case of critical damping? What condition can be placed on y(0) to ensure that the bob never passes through this position? How does the initial velocity influence whether the bob passes through the equilibrium position?

15. In underdamped motion, what effect does the damping constant c have on the frequency of the oscillations of the motion?

16. Suppose y(0) = y'(0) ≠ 0. Determine the maximum displacement of the bob in the critically damped case, and show that the time at which this maximum occurs is independent of the initial displacement.
17. Suppose the acceleration of the bob on the spring at distance d from the equilibrium position is a. Prove that the period of the motion is 2π√(d/a) in the case of undamped motion.

18. A mass m₁ is attached to a spring and allowed to vibrate with undamped motion having period p. At some later time a second mass m₂ is instantaneously fused with m₁. Prove that the new object, having mass m₁ + m₂, exhibits simple harmonic motion with period p√((m₁ + m₂)/m₁).

19. Let y(t) be the solution of y'' + ω₀²y = (A/m) cos(ωt), with y(0) = y'(0) = 0. Assuming that ω ≠ ω₀, find lim_{ω→ω₀} y(t). How does this limit compare with the solution of y'' + ω₀²y = (A/m) cos(ω₀t), with y(0) = y'(0) = 0?

20. A 16-pound weight is suspended from a spring, stretching it - feet. Then the weight is submerged in a fluid that imposes a drag of 2v pounds. The entire system is subjected to an external force 4 cos(ωt). Determine the value of ω that maximizes the amplitude of the steady-state oscillation. What is this maximum amplitude?

21. Consider overdamped forced motion governed by y'' + 6y' + 2y = 4 cos(3t). (a) Find the solution satisfying y(0) = 6, y'(0) = 0. (b) Find the solution satisfying y(0) = 0, y'(0) = 6. (c) Graph these solutions on the same set of axes to compare the effect of initial displacement with that of initial velocity.

22. Carry out the program of Problem 21 for the critically damped forced system governed by y'' + 4y' + 4y = 4 cos(3t).

23. Carry out the program of Problem 21 for the underdamped forced system governed by y'' + y' + 3y = 4 cos(3t).

In each of Problems 24 through 27, use the information to find the current in the RLC circuit of Figure 2.17. Assume zero initial current and capacitor charge.

FIGURE 2.17  RLC circuit.

24. R = 200 Ω, L = 0.1 H, C = 0.006 F, E(t) = te^{-t} volts

25. R = 400 Ω, L = 0.12 H, C = 0.04 F, E(t) = 120 sin(20t) volts

26. R = 150 Ω, L = 0.2 H, C = 0.05 F, E(t) = 1 - e^{-t} volts

27. R = 450 Ω, L = 0.95 H, C = 0.007 F, E(t) = e^{-t} sin²(3t) volts
CHAPTER 3

The Laplace Transform
3.1 Definition and Basic Properties

In mathematics, a transform is usually a device that converts one type of problem into another type, presumably easier to solve. The strategy is to solve the transformed problem, then transform back the other way to obtain the solution of the original problem. In the case of the Laplace transform, initial value problems are often converted to algebra problems, a process we can diagram as follows:

initial value problem → algebra problem → solution of the algebra problem → solution of the initial value problem.
DEFINITION 3.1  Laplace Transform

The Laplace transform ℒ[f] of f is a function defined by

ℒ[f](s) = ∫₀^∞ e^{-st} f(t) dt,

for all s such that this integral converges.
The Laplace transform converts a function f to a new function called ℒ[f]. Often we use t as the independent variable for f and s for the independent variable of ℒ[f]. Thus, f(t) is the function f evaluated at t, and ℒ[f](s) is the function ℒ[f] evaluated at s.

It is often convenient to use a lowercase letter for a function put into the Laplace transform and the corresponding uppercase letter for the function that comes out. In this notation,

F = ℒ[f],  G = ℒ[g],  H = ℒ[h],

and so on.
EXAMPLE 3.1

Let f(t) = e^{at}, with a any real number. Then

ℒ[f](s) = F(s) = ∫₀^∞ e^{-st} e^{at} dt = ∫₀^∞ e^{(a-s)t} dt
        = lim_{k→∞} ∫₀^k e^{(a-s)t} dt = lim_{k→∞} [ (1/(a-s)) e^{(a-s)t} ]₀^k
        = lim_{k→∞} ( (1/(a-s)) e^{(a-s)k} - 1/(a-s) ) = 1/(s-a),

provided that a - s < 0, or s > a. The Laplace transform of f(t) = e^{at} is F(s) = 1/(s-a), defined for s > a.
EXAMPLE 3.2

Let g(t) = sin(t). Then

ℒ[g](s) = G(s) = ∫₀^∞ e^{-st} sin(t) dt = lim_{k→∞} ∫₀^k e^{-st} sin(t) dt
        = lim_{k→∞} [ (1 - e^{-ks} cos(k) - s e^{-ks} sin(k)) / (s² + 1) ]
        = 1/(s² + 1).

G(s) is defined for all s > 0.

A Laplace transform is rarely computed by referring directly to the definition and integrating. Instead, we use tables of Laplace transforms of commonly used functions (such as Table 3.1) or computer software. We will also develop methods that are used to find the Laplace transform of a shifted or translated function, step functions, pulses, and various other functions that arise frequently in applications.

The Laplace transform is linear, which means that constants factor through the transform, and the transform of a sum of functions is the sum of the transforms of these functions.
TABLE 3.1  Laplace Transforms of Functions

f(t)  |  F(s) = ℒ[f(t)](s)

1. 1  |  1/s
2. t  |  1/s²
3. tⁿ (n = 1, 2, 3, …)  |  n!/s^{n+1}
4. 1/√(πt)  |  1/√s
5. e^{at}  |  1/(s - a)
6. t e^{at}  |  1/(s - a)²
7. tⁿ e^{at} (n = 1, 2, 3, …)  |  n!/(s - a)^{n+1}
8. (e^{at} - e^{bt})/(a - b)  |  1/((s - a)(s - b))
9. (a e^{at} - b e^{bt})/(a - b)  |  s/((s - a)(s - b))
10. [(c - b)e^{at} + (a - c)e^{bt} + (b - a)e^{ct}] / [(a - b)(b - c)(c - a)]  |  1/((s - a)(s - b)(s - c))
11. sin(at)  |  a/(s² + a²)
12. cos(at)  |  s/(s² + a²)
13. 1 - cos(at)  |  a²/(s(s² + a²))
14. at - sin(at)  |  a³/(s²(s² + a²))
15. sin(at) - at cos(at)  |  2a³/(s² + a²)²
16. sin(at) + at cos(at)  |  2as²/(s² + a²)²
17. t sin(at)  |  2as/(s² + a²)²
18. t cos(at)  |  (s² - a²)/(s² + a²)²
19. (cos(at) - cos(bt))/((b - a)(b + a))  |  s/((s² + a²)(s² + b²))
20. e^{at} sin(bt)  |  b/((s - a)² + b²)
21. e^{at} cos(bt)  |  (s - a)/((s - a)² + b²)
22. sinh(at)  |  a/(s² - a²)
23. cosh(at)  |  s/(s² - a²)
24. sin(at)cosh(at) - cos(at)sinh(at)  |  4a³/(s⁴ + 4a⁴)
25. sin(at)sinh(at)  |  2a²s/(s⁴ + 4a⁴)
26. sinh(at) - sin(at)  |  2a³/(s⁴ - a⁴)
27. cosh(at) - cos(at)  |  2a²s/(s⁴ - a⁴)
28. (1/√(πt)) e^{at}(1 + 2at)  |  s/(s - a)^{3/2}
29. J₀(at)  |  1/√(s² + a²)
30. Jₙ(at)  |  (√(s² + a²) - s)ⁿ/(aⁿ√(s² + a²))
31. J₀(2√(at))  |  (1/s) e^{-a/s}
32. (1/t) sin(at)  |  tan⁻¹(a/s)
33. (2/t)[1 - cos(at)]  |  ln((s² + a²)/s²)
34. (2/t)[1 - cosh(at)]  |  ln((s² - a²)/s²)
35. 1/√(πt) - a e^{a²t} erfc(a√t)  |  1/(√s + a)
36. e^{a²t} erf(a√t)  |  a/(√s(s - a²))
37. erfc(a/(2√t))  |  (1/s) e^{-a√s}
38. (1/√(πt)) e^{-a²/(4t)}  |  (1/√s) e^{-a√s}
39. 1/√(π(t + a))  |  (1/√s) e^{as} erfc(√(as))
48. (n!/(2n)!) (1/√(πt)) H₂ₙ(√t) (Hermite polynomial)  |  (1 - s)ⁿ/s^{n+1/2}
49. -(n!/(√π (2n+1)!)) H₂ₙ₊₁(√t) (Hermite polynomial)  |  (1 - s)ⁿ/s^{n+3/2}
50. triangular wave of period 2a  |  (1/(as²))(1 - e^{-as})/(1 + e^{-as}) = (1/(as²)) tanh(as/2)
51. square wave of period 2a  |  (1/s) tanh(as/2)
52. sawtooth wave of period a  |  1/(as²) - e^{-as}/(s(1 - e^{-as}))

Operational Formulas

f(t)  |  F(s)
a f(t) + b g(t)  |  a F(s) + b G(s)
f'(t)  |  s F(s) - f(0+)
f⁽ⁿ⁾(t)  |  sⁿ F(s) - s^{n-1} f(0) - ⋯ - f^{(n-1)}(0)
∫₀^t f(τ) dτ  |  (1/s) F(s)
t f(t)  |  -F'(s)
tⁿ f(t)  |  (-1)ⁿ F⁽ⁿ⁾(s)
f(t)/t  |  ∫_s^∞ F(σ) dσ
e^{at} f(t)  |  F(s - a)
f(t - a) H(t - a)  |  e^{-as} F(s)
f(t + T) = f(t) (periodic)  |  (1/(1 - e^{-Ts})) ∫₀^T e^{-st} f(t) dt
THEOREM 3.1  Linearity of the Laplace Transform

Suppose ℒ[f](s) and ℒ[g](s) are defined for s > a, and α and β are real numbers. Then

ℒ[αf + βg](s) = αF(s) + βG(s) for s > a.

Proof  By assumption, ∫₀^∞ e^{-st} f(t) dt and ∫₀^∞ e^{-st} g(t) dt converge for s > a. Then

ℒ[αf + βg](s) = ∫₀^∞ e^{-st}(αf(t) + βg(t)) dt = α ∫₀^∞ e^{-st} f(t) dt + β ∫₀^∞ e^{-st} g(t) dt = αF(s) + βG(s)

for s > a. ■

This conclusion extends to any finite sum:

ℒ[α₁f₁ + ⋯ + αₙfₙ](s) = α₁F₁(s) + ⋯ + αₙFₙ(s),

for all s such that each Fⱼ(s) is defined.

Not every function has a Laplace transform, because ∫₀^∞ e^{-st} f(t) dt may not converge for any real values of s. We will consider conditions that can be placed on f to ensure that f has a Laplace transform. An obvious necessary condition is that ∫₀^k e^{-st} f(t) dt must be defined for every k > 0, because ℒ[f](s) = lim_{k→∞} ∫₀^k e^{-st} f(t) dt. For this to occur, it is enough that f be piecewise continuous on [0, k] for every positive number k. We will define this concept in general terms because it occurs in other contexts as well.
DEFINITION 3.2  Piecewise Continuity

f is piecewise continuous on [a, b] if there are points

a < t₁ < t₂ < ⋯ < tₙ < b

such that f is continuous on each open interval (a, t₁), (t_{j-1}, t_j), and (tₙ, b), and all of the following one-sided limits are finite:

lim_{t→a+} f(t), lim_{t→t_j-} f(t), lim_{t→t_j+} f(t), and lim_{t→b-} f(t).

This means that f is continuous on [a, b] except perhaps at finitely many points, at each of which f has finite one-sided limits from within the interval. The only discontinuities a piecewise continuous function f can experience on [a, b] are finitely many jump discontinuities (gaps of finite width in the graph). Figure 3.1 shows typical jump discontinuities in a graph.

For example, let
f(t) = t² for 0 ≤ t < 2,  f(t) = 2 at t = 2,  f(t) = 1 for 2 < t < 3,  f(t) = -1 for 3 ≤ t ≤ 4.
FIGURE 3.1  A function having jump discontinuities at t₁ and t₂.
FIGURE 3.2  A graph of f.
Then f is continuous on [0, 4] except at 2 and 3, where f has jump discontinuities. A graph of this function is shown in Figure 3.2.

If f is piecewise continuous on [0, k], then so is e^{-st} f(t), and ∫₀^k e^{-st} f(t) dt exists. Existence of ∫₀^k e^{-st} f(t) dt for every positive k does not ensure existence of lim_{k→∞} ∫₀^k e^{-st} f(t) dt. For example, f(t) = e^{t²} is continuous on every interval [0, k], but ∫₀^k e^{-st} e^{t²} dt diverges as k → ∞ for every real value of s. Thus, for convergence of ∫₀^∞ e^{-st} f(t) dt, we need another condition on f.

The form of this integral suggests one condition that is sufficient. If, for some numbers M and b, we have |f(t)| ≤ M e^{bt}, then

e^{-st} |f(t)| ≤ M e^{(b-s)t}  for s > b.

But

∫₀^∞ M e^{(b-s)t} dt

converges (to M/(s - b)) if b - s < 0, or s > b. Then, by comparison, ∫₀^∞ e^{-st} |f(t)| dt also converges if s > b, hence ∫₀^∞ e^{-st} f(t) dt converges if s > b. This line of reasoning suggests a set of conditions which are sufficient for a function to have a Laplace transform.
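The divergence for f(t) = e^{t²} can be seen by computing the truncated integrals for growing k; the following sketch (ours, with s = 10 chosen arbitrarily) uses a simple Riemann sum.

```python
import numpy as np

# The truncated integrals of e^{-st} e^{t^2} over [0, k] grow without
# bound as k increases, for any fixed s (here s = 10).
s = 10.0
vals = []
for k in [5.0, 10.0, 15.0, 20.0]:
    t = np.linspace(0.0, k, 400001)
    dt = t[1] - t[0]
    vals.append(np.sum(np.exp(t * t - s * t)) * dt)
print(vals)  # rapidly increasing; no finite limit
```

Once t exceeds s, the integrand e^{t² - st} itself grows without bound, so no tail of the integral can be small.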
THEOREM 3.2  Existence of ℒ[f]

Suppose f is piecewise continuous on [0, k] for every positive k. Suppose also that there are numbers M and b such that |f(t)| ≤ M e^{bt} for t ≥ 0. Then ∫₀^∞ e^{-st} f(t) dt converges for s > b, hence ℒ[f](s) is defined for s > b. ■

Many functions satisfy these conditions, including polynomials, sin(at), cos(at), e^{at}, and others. The conditions of the theorem are sufficient, but not necessary, for a function to have a Laplace transform. Consider, for example, f(t) = t^{-1/2} for t > 0. This function is not piecewise
continuous on any [0, k] because lim_{t→0+} t^{-1/2} = ∞. Nevertheless, ∫₀^k e^{-st} t^{-1/2} dt exists for every positive k and s > 0. Further,

ℒ[f](s) = ∫₀^∞ e^{-st} t^{-1/2} dt = 2 ∫₀^∞ e^{-sx²} dx   (let x = t^{1/2})
        = (2/√s) ∫₀^∞ e^{-z²} dz   (let z = x√s)
        = √(π/s),

in which we have used the fact (found in some standard integral tables) that ∫₀^∞ e^{-z²} dz = √π/2.

Now revisit the flow chart at the start of this chapter. Taking the Laplace transform of a function is the first step in solving certain kinds of problems. The bottom of the flow chart suggests that at some point we must be able to go back the other way. After we find some function G(s), we will need to produce a function g whose Laplace transform is G. This is the process of taking an inverse Laplace transform.
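This transform can be confirmed numerically (our own sketch, using scipy's adaptive quadrature, which copes with the integrable singularity at 0).

```python
import numpy as np
from scipy.integrate import quad

# Despite the singularity of t^{-1/2} at 0, the defining integral
# converges and equals sqrt(pi/s).
errs = []
for s in [1.0, 2.0, 5.0]:
    value, _ = quad(lambda t: np.exp(-s * t) / np.sqrt(t), 0.0, np.inf)
    errs.append(abs(value - np.sqrt(np.pi / s)))
print(errs)  # all essentially zero
```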
DEFINITION 3.3  Inverse Laplace Transform

Given a function G, a function g such that ℒ[g] = G is called an inverse Laplace transform of G.

For example,

ℒ⁻¹[1/(s - a)](t) = e^{at}  and  ℒ⁻¹[1/(s² + 1)](t) = sin(t).
This inverse process is ambiguous because, given G, there will be many functions whose Laplace transform is G. For example, we know that the Laplace transform of e^{-t} is 1/(s + 1) for s > -1. However, if we change f(t) = e^{-t} at just one point, letting

h(t) = e^{-t} for t ≠ 3,  h(t) = 0 for t = 3,

then ∫₀^∞ e^{-st} f(t) dt = ∫₀^∞ e^{-st} h(t) dt, and h has the same Laplace transform as f. In such a case, which one do we call the inverse Laplace transform of 1/(s + 1)?

One answer is provided by Lerch's Theorem, which states that two continuous functions having the same Laplace transform must be equal.

THEOREM 3.3  Lerch

Let f and g be continuous on [0, ∞) and suppose that ℒ[f] = ℒ[g]. Then f = g. ■
In view of this, we will partially resolve the ambiguity in taking the inverse Laplace transform by agreeing that, given F(s), we seek a continuous f whose Laplace transform is F. If there is no continuous inverse transform function, then we simply have to make some agreement as to which of several possible candidates we will call ℒ⁻¹[F]. In applications, context will often make this choice obvious.

Because of the linearity of the Laplace transform, its inverse is also linear.

THEOREM 3.4

If ℒ⁻¹[F] = f and ℒ⁻¹[G] = g, and α and β are real numbers, then ℒ⁻¹[αF + βG] = αf + βg. ■

If Table 3.1 is used to find ℒ[f], look up f in the left column and read ℒ[f] from the right column. For ℒ⁻¹[F], look up F in the right column and match it with f in the left.
Suppose that f(t) is defined for all t > O . Then f is periodic with period T if f(t+T) = f(t) for all t 0 . Fo r example, sin(t) has period 27r . In Problems 19-22, assum e that f has period T . 19. Show that 2[.f]( s) =
In each of Problems 11 through 18, use the linearity of the inverse Laplace transform and Table 3 .1 to find the (continuous) inverse Laplace transform of the function .
(n+1) T
E fT
fnT
e-s1 f(t) dt = e -nsT f T e -S`f (t) dt. 0
21. From Problems 19 and 20, show that 2 [.f](s) _
Ec'' e-"l f0
T
e-s1 f(t) dt .
n=o
22. Use the geometric series EL o r" = 1/(1 - r) fo r < 1, together with the result of Problem 21, t o show that
1
2[f](s) = 1- e-' T JOI T e - s`f(t) dt. In each of Problems 23 through 30, a periodic function i s given, sometimes by a graph . Find 2[f], using the result of Problem 22 .
23. f has period 6 and f(t) = 5 for 0 < t ≤ 3, f(t) = 0 for 3 < t ≤ 6.

24. f(t) = |E sin(ωt)|, with E and ω positive constants. (Here f has period π/ω.)

25. f has the graph of Figure 3.3.

26. f has the graph of Figure 3.4.

27. f has the graph of Figure 3.5.

28. f has the graph of Figure 3.6.

29. f has the graph of Figure 3.7.

30. f has the graph of Figure 3.8.

FIGURES 3.3 through 3.8  Graphs of the periodic functions for Problems 25 through 30.
3.2 Solution of Initial Value Problems Using the Laplace Transform

The Laplace transform is a powerful tool for solving some kinds of initial value problems. The technique depends on the following fact about the Laplace transform of a derivative.

THEOREM 3.5  Laplace Transform of a Derivative

Let f be continuous on [0, ∞) and suppose f' is piecewise continuous on [0, k] for every positive k. Suppose also that lim_{k→∞} e^{-sk} f(k) = 0 if s > 0. Then

ℒ[f'](s) = sF(s) - f(0).   (3.1)

That is, the Laplace transform of the derivative of f is s times the Laplace transform of f at s, minus f at zero.
Proof  Begin with an integration by parts, with u = e^{-st} and dv = f'(t) dt. For k > 0,

∫₀^k e^{-st} f'(t) dt = [e^{-st} f(t)]₀^k + s ∫₀^k e^{-st} f(t) dt
                     = e^{-sk} f(k) - f(0) + s ∫₀^k e^{-st} f(t) dt.

Take the limit as k → ∞ and use the assumption that e^{-sk} f(k) → 0 to obtain

ℒ[f'](s) = lim_{k→∞} [ e^{-sk} f(k) - f(0) + s ∫₀^k e^{-st} f(t) dt ]
         = -f(0) + s ∫₀^∞ e^{-st} f(t) dt = -f(0) + sF(s). ■

If f has a jump discontinuity at 0 (as occurs, for example, if f is an electromotive force that is switched on at time zero), then this conclusion can be amended to read

ℒ[f'](s) = sF(s) - f(0+),

where f(0+) = lim_{t→0+} f(t) is the right limit of f(t) at 0.

For problems involving differential equations of order 2 or higher, we need a higher derivative version of the theorem. Let f^{(j)} denote the jth derivative of f. As a notational convenience, we let f^{(0)} = f.

THEOREM 3.6  Laplace Transform of a Higher Derivative
Suppose f, f', …, f^{(n-1)} are continuous on [0, ∞), and f^{(n)} is piecewise continuous on [0, k] for every positive k. Suppose also that lim_{k→∞} e^{-sk} f^{(j)}(k) = 0 for s > 0 and for j = 1, 2, …, n - 1. Then

ℒ[f^{(n)}](s) = sⁿF(s) - s^{n-1}f(0) - s^{n-2}f'(0) - ⋯ - s f^{(n-2)}(0) - f^{(n-1)}(0). ■   (3.2)

The second derivative case (n = 2) occurs sufficiently often that we will record it separately. Under the conditions of the theorem,

ℒ[f''](s) = s²F(s) - s f(0) - f'(0).   (3.3)
We are now ready to use the Laplace transform to solve certain initial value problems.
EXAMPLE 3.3

Solve y' - 4y = 1; y(0) = 1.

We know how to solve this problem, but we will use the Laplace transform to illustrate the technique. Write ℒ[y](s) = Y(s). Take the Laplace transform of the differential equation, using the linearity of ℒ and equation (3.1), with y(t) in place of f(t):

ℒ[y' - 4y](s) = ℒ[y'](s) - 4ℒ[y](s) = (sY(s) - y(0)) - 4Y(s) = ℒ[1](s) = 1/s.
Here we used the fact (from Table 3.1) that ℒ[1](s) = 1/s for s > 0. Since y(0) = 1, we now have

(s - 4)Y(s) = y(0) + 1/s = 1 + 1/s.

At this point we have an algebra problem to solve for Y(s), obtaining

Y(s) = 1/(s - 4) + 1/(s(s - 4))

(note the flow chart at the beginning of this chapter). The solution of the initial value problem is

y = ℒ⁻¹[Y] = ℒ⁻¹[1/(s - 4)] + ℒ⁻¹[1/(s(s - 4))].

From entry 5 of Table 3.1, with a = 4,

ℒ⁻¹[1/(s - 4)] = e^{4t}.

And from entry 8, with a = 0 and b = 4,

ℒ⁻¹[1/(s(s - 4))] = (1/(0 - 4))(e^{0t} - e^{4t}) = (1/4)(e^{4t} - 1).

The solution of the initial value problem is

y(t) = e^{4t} + (1/4)(e^{4t} - 1) = (5/4)e^{4t} - 1/4.

One feature of this Laplace transform technique is that the initial value given in the problem is naturally incorporated into the solution process through equation (3.1). We need not find the general solution first, then solve for the constant to satisfy the initial condition.
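The whole flow chart for this example can be run in software; the sketch below (ours, using sympy) forms the transformed equation, solves the algebra problem, inverts, and compares with y(t) = (5/4)e^{4t} - 1/4 at t = 1.

```python
import sympy as sp

# Example 3.3 end to end: (s Y - y(0)) - 4 Y = 1/s with y(0) = 1.
t, s = sp.symbols('t s', positive=True)
Y = sp.Symbol('Y')

Ysol = sp.solve(sp.Eq((s * Y - 1) - 4 * Y, 1 / s), Y)[0]
y = sp.inverse_laplace_transform(Ysol, s, t)
claimed = sp.Rational(5, 4) * sp.exp(4 * t) - sp.Rational(1, 4)
print(sp.simplify(y.subs(t, 1) - claimed.subs(t, 1)))  # 0
```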
EXAMPLE 3.4

Solve y'' + 4y' + 3y = e^t; y(0) = 0, y'(0) = 2.

Apply ℒ to the differential equation to get

ℒ[y''] + 4ℒ[y'] + 3ℒ[y] = ℒ[e^t].

Now

ℒ[y''] = s²Y - sy(0) - y'(0) = s²Y - 2 and ℒ[y'] = sY - y(0) = sY.

Therefore,

s²Y - 2 + 4sY + 3Y = 1/(s - 1).
Solve for Y to obtain

Y(s) = (2s - 1)/((s - 1)(s² + 4s + 3)).

The solution is the inverse Laplace transform of this function. Some software will produce this inverse. If we want to use Table 3.1, we must use a partial fractions decomposition to write Y(s) as a sum of simpler functions. Write

Y(s) = (2s - 1)/((s - 1)(s + 1)(s + 3)) = A/(s - 1) + B/(s + 1) + C/(s + 3).

This equation can hold only if, for all s,

A(s + 1)(s + 3) + B(s - 1)(s + 3) + C(s - 1)(s + 1) = 2s - 1.

Now choose values of s to simplify the task of determining A, B, and C. Let s = 1 to get 8A = 1, so A = 1/8. Let s = -1 to get -4B = -3, so B = 3/4. Choose s = -3 to get 8C = -7, so C = -7/8. Then

Y(s) = (1/8) 1/(s - 1) + (3/4) 1/(s + 1) - (7/8) 1/(s + 3).

Now read from Table 3.1 that

y(t) = (1/8)e^t + (3/4)e^{-t} - (7/8)e^{-3t}.
Again, the Laplace transform has converted an initial value problem to an algebra problem, incorporating the initial conditions into the algebraic manipulations. Once we obtain Y(s), the problem becomes one of inverting the transformed function to obtain y(t).

Equation (3.1) has an interesting consequence that will be useful later. Under the conditions of the theorem, we know that

ℒ[f'] = sℒ[f] - f(0).

Suppose f(t) is defined by an integral, say

f(t) = ∫₀^t g(τ) dτ.

Now f(0) = 0 and, assuming continuity of g, f'(t) = g(t). Then

ℒ[f'] = ℒ[g] = sℒ[∫₀^t g(τ) dτ].

This means that

ℒ[∫₀^t g(τ) dτ] = (1/s)ℒ[g],   (3.4)

enabling us to take the Laplace transform of a function defined by an integral. We will use this equation later in dealing with circuits having discontinuous electromotive forces.

Thus far we have illustrated a Laplace transform technique for solving initial value problems with constant coefficients. However, we could have solved the problems in these examples by other means. In the next three sections we will develop the machinery needed to apply the Laplace transform to problems that defy previous methods.
CHAPTER 3 The Laplace Transform
PROBLEMS

In each of Problems 1 through 10, use the Laplace transform to solve the initial value problem.
1. y' + 4y = 1; y(0) = -3
2. y' - 9y = t; y(0) = 5
3. y' + 4y = cos(t); y(0) = 0
4. y' + 2y = e^t; y(0) = 1
5. y' - 2y = 1 - t; y(0) = 4
6. y'' + y = 1; y(0) = 6, y'(0) = 0
7. y'' - 4y' + 4y = cos(t); y(0) = 1, y'(0) = -1
8. y'' + 9y = t²; y(0) = y'(0) = 0
9. y'' + 16y = 1 + t; y(0) = -2, y'(0) = 1
10. y'' - 5y' + 6y = e^t; y(0) = 0, y'(0) = 2
11. Suppose f satisfies the hypotheses of Theorem 3.5, except for a jump discontinuity at 0. Show that ℒ[f'](s) = sF(s) - f(0+), where f(0+) = lim_{t→0+} f(t).
12. Suppose f satisfies the hypotheses of Theorem 3.5, except for a jump discontinuity at a positive number c. Prove that
ℒ[f'](s) = sF(s) - f(0) - e^{-cs}[f(c+) - f(c-)],
where f(c-) = lim_{t→c-} f(t).
13. Suppose g is piecewise continuous on [0, k] for every k > 0, and that there are numbers M, b, and a such that |g(t)| ≤ Me^{bt} for t ≥ a. Let ℒ[g] = G. Show that
ℒ[∫_a^t g(w) dw](s) = (1/s)G(s) - (1/s)∫₀^a g(w) dw.

3.3 Shifting Theorems and the Heaviside Function

One point to developing the Laplace transform is to broaden the class of problems we are able to solve. Methods of Chapters 1 and 2 are primarily aimed at problems involving continuous functions. But many mathematical models deal with discontinuous processes (for example, switches thrown on and off in a circuit). For these, the Laplace transform is often effective, but we must learn more about representing discontinuous functions and applying both the transform and its inverse to them.
3.3.1 The First Shifting Theorem

We will show that the Laplace transform of e^{at}f(t) is nothing more than the Laplace transform of f(t), shifted a units to the right. This is achieved by replacing s by s - a in F(s) to obtain F(s - a).
THEOREM 3.7 First Shifting Theorem, or Shifting in the s Variable

Let ℒ[f](s) = F(s) for s > b ≥ 0. Let a be any number. Then
ℒ[e^{at}f(t)](s) = F(s - a) for s > a + b.
Proof: Compute
ℒ[e^{at}f(t)](s) = ∫₀^∞ e^{at}e^{-st}f(t) dt = ∫₀^∞ e^{-(s-a)t}f(t) dt = F(s - a)
for s - a > b, or s > a + b. ∎
EXAMPLE 3.5

We know from Table 3.1 that ℒ[cos(bt)] = s/(s² + b²). For the Laplace transform of e^{at}cos(bt), replace s with s - a to get
ℒ[e^{at}cos(bt)](s) = (s - a)/((s - a)² + b²).
EXAMPLE 3.6

Since ℒ[t³] = 6/s⁴, we have
ℒ[t³e^{7t}](s) = 6/(s - 7)⁴.
The first shifting theorem suggests a corresponding formula for the inverse Laplace transform: if ℒ[f] = F, then
ℒ⁻¹[F(s - a)] = e^{at}f(t).
Sometimes it is convenient to write this result as
ℒ⁻¹[F(s - a)] = e^{at}ℒ⁻¹[F(s)].    (3.5)
EXAMPLE 3.7

Suppose we want to compute
ℒ⁻¹[4/(s² + 4s + 20)].
We will manipulate the quotient into a form to which we can apply the shifting theorem. Complete the square in the denominator to write
4/(s² + 4s + 20) = 4/((s + 2)² + 16).
Think of the quotient on the right as a function of s + 2:
F(s + 2) = 4/((s + 2)² + 16).
This means we should choose
F(s) = 4/(s² + 16).
Now the shifting theorem tells us that
ℒ[e^{-2t}sin(4t)] = F(s - (-2)) = F(s + 2) = 4/((s + 2)² + 16),
and therefore
ℒ⁻¹[4/((s + 2)² + 16)] = e^{-2t}sin(4t).
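Computations like Example 3.7 can be confirmed with sympy's built-in inverse transform (an illustrative sketch; the library choice is an assumption):

```python
# Sketch: numerically confirming Example 3.7,
# L⁻¹[4/(s² + 4s + 20)] = e^(-2t) sin(4t), using sympy.
import math
import sympy as sp

t, s = sp.symbols('t s', positive=True)
y = sp.inverse_laplace_transform(4/(s**2 + 4*s + 20), s, t)

# Compare with e^(-2t) sin(4t) at the sample point t = 1
val = float(y.subs(t, 1))
expected = math.exp(-2)*math.sin(4)
print(abs(val - expected) < 1e-9)  # True
```

A single sample point does not prove the identity, but it catches sign and completing-the-square errors immediately.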
EXAMPLE 3.8

Compute
ℒ⁻¹[(3s - 1)/(s² - 6s + 2)].
Again, begin with some manipulation into the form of a function of s - a for some a:
(3s - 1)/(s² - 6s + 2) = (3s - 1)/((s - 3)² - 7) = 3(s - 3)/((s - 3)² - 7) + 8/((s - 3)² - 7) = G(s - 3) + K(s - 3)
if we choose
G(s) = 3s/(s² - 7) and K(s) = 8/(s² - 7).
Now apply equation (3.5) (in the second line) to write
ℒ⁻¹[(3s - 1)/(s² - 6s + 2)] = ℒ⁻¹[G(s - 3)] + ℒ⁻¹[K(s - 3)] = e^{3t}ℒ⁻¹[G(s)] + e^{3t}ℒ⁻¹[K(s)] = 3e^{3t}cosh(√7 t) + (8/√7)e^{3t}sinh(√7 t),
since, from Table 3.1, ℒ⁻¹[s/(s² - 7)] = cosh(√7 t) and ℒ⁻¹[1/(s² - 7)] = (1/√7)sinh(√7 t).
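The cleanest machine check of Example 3.8 runs the transform forward: apply ℒ to the claimed inverse and confirm the original quotient comes back. A sketch with sympy (an assumed tool choice):

```python
# Sketch: forward-transforming the claimed answer of Example 3.8,
# 3 e^{3t} cosh(√7 t) + (8/√7) e^{3t} sinh(√7 t),
# and checking that L[f] = (3s - 1)/(s² - 6s + 2).
import sympy as sp

t, s = sp.symbols('t s', positive=True)
f = 3*sp.exp(3*t)*sp.cosh(sp.sqrt(7)*t) \
    + (8/sp.sqrt(7))*sp.exp(3*t)*sp.sinh(sp.sqrt(7)*t)

F = sp.laplace_transform(f, t, s, noconds=True)
print(sp.simplify(F - (3*s - 1)/(s**2 - 6*s + 2)))  # 0
```

Forward transforms of exponential-hyperbolic products are just sums of exponentials, so this direction is less error-prone than inverting.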
3.3.2 The Heaviside Function and Pulses

We will now lay the foundations for solving certain initial value problems having discontinuous forcing functions. To do this, we will use the Heaviside function. Recall that f has a jump discontinuity at a if lim_{t→a-} f(t) and lim_{t→a+} f(t) both exist and are finite, but unequal. Figure 3.9 shows a typical jump discontinuity. The magnitude of the jump discontinuity is the "width of the gap" in the graph at a. This width is
lim_{t→a+} f(t) - lim_{t→a-} f(t).
Functions with jump discontinuities can be treated very efficiently using the unit step function, or Heaviside function.

FIGURE 3.9 A typical jump discontinuity at a.
DEFINITION 3.4 Heaviside Function

The Heaviside function H is defined by
H(t) = 0 if t < 0; 1 if t ≥ 0.

Oliver Heaviside (1850-1925) was an English electrical engineer who did much to introduce Laplace transform methods into engineering practice. A graph of H is shown in Figure 3.10. It has a jump discontinuity of magnitude 1 at 0.
The Heaviside function may be thought of as a flat switching function, "on" when t ≥ 0, where H(t) = 1, and "off" when t < 0, where H(t) = 0. We will use it to achieve a variety of effects, including switching functions on and off at different times, shifting functions along the axis, and combining functions with pulses. To begin this program, if a is any number, then H(t - a) is the Heaviside function shifted a units to the right, as shown in Figure 3.11, since
H(t - a) = 0 if t < a; 1 if t ≥ a.
H(t - a) models a flat signal of magnitude 1, turned off until time t = a and then switched on. We can use H(t - a) to achieve the effect of turning a given function g off until time t = a, at which time it is switched on. In particular,
H(t - a)g(t) = 0 if t < a; g(t) if t ≥ a
FIGURE 3.10 The Heaviside function H(t).
FIGURE 3.11 A shifted Heaviside function H(t - a).
FIGURE 3.12 Comparison of y = cos(t) and y = H(t - π)cos(t).
is zero until time t = a, at which time it switches on g(t). To see this in a specific case, let g(t) = cos(t) for all t. Then
H(t - π)g(t) = H(t - π)cos(t) = 0 if t < π; cos(t) if t ≥ π.
Graphs of cos(t) and H(t - π)cos(t) are shown in Figure 3.12 for comparison. We can also use the Heaviside function to describe a pulse.
DEFINITION 3.5 Pulse

A pulse is a function of the form
k[H(t - a) - H(t - b)],
in which a < b and k is a nonzero real number.
This pulse function is graphed in Figure 3.13. It has the value 0 if t < a (where H(t - a) = H(t - b) = 0), the value k if a ≤ t < b (where H(t - a) = 1 and H(t - b) = 0), and the value 0 if t ≥ b (where H(t - a) = H(t - b) = 1).
Multiplying a function g by the pulse H(t - a) - H(t - b) has the effect of leaving g(t) switched off until time a. The function is then turned on until time b, when it is switched off again. For example, let g(t) = e^t. Then
[H(t - 1) - H(t - 2)]e^t = 0 if t < 1; e^t if 1 ≤ t < 2; 0 if t ≥ 2.
Figure 3.14 shows a graph of this function.
Next consider shifted functions of the form H(t - a)g(t - a). If t < a, then H(t - a)g(t - a) = 0 because H(t - a) = 0. If t ≥ a, then H(t - a) = 1 and H(t - a)g(t - a) = g(t - a), which is g(t) shifted a units to the right. Thus the graph of H(t - a)g(t - a) is zero along the horizontal axis until t = a, and for t ≥ a is the graph of g(t) for t ≥ 0, shifted a units to the right to begin at a instead of 0.
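The distinction among g(t), H(t - a)g(t), and H(t - a)g(t - a) is easy to see numerically. A minimal sketch in plain Python, using g(t) = t² and a = 3 (the same comparison drawn in Figure 3.16):

```python
# Sketch: the three combinations g(t), H(t-a)g(t), H(t-a)g(t-a)
# for g(t) = t² and a = 3. No libraries needed.

def H(t):
    """Heaviside unit step: 0 for t < 0, 1 for t >= 0."""
    return 0.0 if t < 0 else 1.0

def g(t):
    return t**2

a = 3.0
for t in [1.0, 4.0]:
    print(t, g(t), H(t - a)*g(t), H(t - a)*g(t - a))
```

At t = 1 the last two are switched off entirely; at t = 4 the middle one shows the unshifted parabola (16) while the last shows the shifted one (1).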
EXAMPLE 3.9

Consider g(t) = t² and a = 2. Figure 3.15 compares the graph of g with the graph of H(t - 2)g(t - 2). The graph of g is a familiar parabola. The graph of H(t - 2)g(t - 2) is zero until time 2, then has the shape of the graph of t² for t ≥ 0, but shifted 2 units to the right to start at t = 2.
It is important to understand the difference between g(t), H(t - a)g(t), and H(t - a)g(t - a). Figure 3.16 shows graphs of these three functions for g(t) = t² and a = 3.

3.3.3 The Second Shifting Theorem

Sometimes H(t - a)g(t - a) is referred to as a shifted function, although it is more than that, because its graph is also zero for t < a. The second shifting theorem deals with the Laplace transform of such a function.
FIGURE 3.15 Comparison of y = t² and y = (t - 2)²H(t - 2).
FIGURE 3.16 Comparison of y = t², y = t²H(t - 3), and y = (t - 3)²H(t - 3).
THEOREM 3.8 Second Shifting Theorem, or Shifting in the t Variable

Let ℒ[f](s) = F(s) for s > b. Then
ℒ[H(t - a)f(t - a)](s) = e^{-as}F(s)
for s > b. That is, we obtain the Laplace transform of H(t - a)f(t - a) by multiplying the Laplace transform of f(t) by e^{-as}.
Proof: Proceeding from the definition,
ℒ[H(t - a)f(t - a)](s) = ∫₀^∞ e^{-st}H(t - a)f(t - a) dt = ∫_a^∞ e^{-st}f(t - a) dt,
because H(t - a) = 0 for t < a and H(t - a) = 1 for t ≥ a. Now let w = t - a in the last integral to obtain
ℒ[H(t - a)f(t - a)](s) = ∫₀^∞ e^{-s(a+w)}f(w) dw = e^{-as}∫₀^∞ e^{-sw}f(w) dw = e^{-as}F(s). ∎
EXAMPLE 3.10

Suppose we want the Laplace transform of H(t - a). Write this as H(t - a)f(t - a), with f(t) = 1 for all t. Since F(s) = 1/s (from Table 3.1 or by direct computation from the definition), then
ℒ[H(t - a)](s) = e^{-as}ℒ[1](s) = (1/s)e^{-as}.
EXAMPLE 3.11

Compute ℒ[g], where g(t) = 0 for 0 ≤ t < 2 and g(t) = t² + 1 for t ≥ 2.
Since g(t) is zero until time t = 2, and is then t² + 1, we may write
g(t) = H(t - 2)(t² + 1).
To apply the second shifting theorem, we must write g(t) as a function, or perhaps a sum of functions, of the form f(t - 2)H(t - 2). This necessitates writing t² + 1 as a sum of functions of t - 2. One way to do this is to expand t² + 1 in a Taylor series about 2. In this simple case we can achieve the same result by algebraic manipulation:
t² + 1 = (t - 2 + 2)² + 1 = (t - 2)² + 4(t - 2) + 5.
Then
g(t) = (t² + 1)H(t - 2) = (t - 2)²H(t - 2) + 4(t - 2)H(t - 2) + 5H(t - 2).
Now we can apply the second shifting theorem:
ℒ[g] = ℒ[(t - 2)²H(t - 2)] + 4ℒ[(t - 2)H(t - 2)] + 5ℒ[H(t - 2)]
= e^{-2s}ℒ[t²] + 4e^{-2s}ℒ[t] + 5e^{-2s}ℒ[1]
= e^{-2s}(2/s³ + 4/s² + 5/s).
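Example 3.11 can also be checked straight from the definition of the transform, since g vanishes below t = 2: ℒ[g] = ∫₂^∞ e^{-st}(t² + 1) dt. A sketch with sympy (an assumed tool choice):

```python
# Sketch: Example 3.11 from the definition,
# L[H(t-2)(t²+1)] = ∫_2^∞ e^{-st}(t²+1) dt = e^{-2s}(2/s³ + 4/s² + 5/s).
import sympy as sp

t = sp.symbols('t', positive=True)
s = sp.symbols('s', positive=True)

F = sp.integrate((t**2 + 1)*sp.exp(-s*t), (t, 2, sp.oo))
expected = sp.exp(-2*s)*(2/s**3 + 4/s**2 + 5/s)
print(sp.simplify(F - expected))  # 0
```

The e^{-2s} factor appears automatically from the lower limit of integration, which is exactly what the second shifting theorem packages.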
As usual, any formula for the Laplace transform of a class of functions can also be read as a formula for an inverse Laplace transform . The inverse version of the second shifting theorem is :
ℒ⁻¹[e^{-as}F(s)](t) = H(t - a)f(t - a).    (3.6)
This enables us to compute the inverse Laplace transform of a known transformed function multiplied by an exponential e^{-as}.
EXAMPLE 3.12

Compute
ℒ⁻¹[se^{-3s}/(s² + 4)].
The presence of the exponential factor suggests the use of equation (3.6). Concentrate on finding
ℒ⁻¹[s/(s² + 4)].
This inverse can be read directly from Table 3.1, and is f(t) = cos(2t). Therefore
ℒ⁻¹[se^{-3s}/(s² + 4)](t) = H(t - 3)cos(2(t - 3)).
We are now prepared to solve certain initial value problems involving discontinuous forcing functions.
EXAMPLE 3.13

Solve the initial value problem
y'' + 4y = f(t); y(0) = y'(0) = 0,
in which
f(t) = 0 for t < 3; t for t ≥ 3.
Because of the discontinuity in f, methods developed in Chapter 2 do not apply . First recognize that f(t) = H(t - 3) t.
Apply the Laplace transform to the differential equation to get
ℒ[y''] + 4ℒ[y] = s²Y(s) - sy(0) - y'(0) + 4Y(s) = (s² + 4)Y(s) = ℒ[H(t - 3)t],
in which we have inserted the initial conditions y(0) = y'(0) = 0. In order to use the second shifting theorem to compute ℒ[H(t - 3)t], write
ℒ[H(t - 3)t] = ℒ[H(t - 3)(t - 3 + 3)] = ℒ[H(t - 3)(t - 3)] + 3ℒ[H(t - 3)] = (1/s²)e^{-3s} + (3/s)e^{-3s}.
We now have
(s² + 4)Y = (3/s)e^{-3s} + (1/s²)e^{-3s}.
The transform of the solution is
Y(s) = (3s + 1)/(s²(s² + 4)) e^{-3s}.
The solution is within reach. We must take the inverse Laplace transform of Y(s). To do this, first use a partial fractions decomposition to write
(3s + 1)/(s²(s² + 4)) e^{-3s} = (3/4)(1/s)e^{-3s} + (1/4)(1/s²)e^{-3s} - (3/4)(s/(s² + 4))e^{-3s} - (1/4)(1/(s² + 4))e^{-3s}.
Each term is an exponential times a function whose Laplace transform we know, and we can apply equation (3.6) to write
y(t) = (3/4)H(t - 3) - (3/4)H(t - 3)cos(2(t - 3)) + (1/4)H(t - 3)(t - 3) - (1/8)H(t - 3)sin(2(t - 3)).
Because of the H(t - 3) factor in each term, this solution is zero until time t = 3, and we may write
y(t) = 0 for t < 3; 3/4 - (3/4)cos(2(t - 3)) + (1/4)(t - 3) - (1/8)sin(2(t - 3)) for t ≥ 3,
or, upon combining terms,
y(t) = 0 for t < 3; (1/4)t - (3/4)cos(2(t - 3)) - (1/8)sin(2(t - 3)) for t ≥ 3.
A graph of this solution is shown in Figure 3.17.
In this example, it is interesting to observe that the solution is differentiable everywhere, even though the function f occurring in the differential equation had a jump discontinuity at 3. This behavior is typical of initial value problems having a discontinuous forcing function. If the differential equation has order n and φ is a solution, then φ and its first n - 1 derivatives will be continuous, while the nth derivative will have a jump discontinuity wherever f does, and these jump discontinuities will agree in magnitude with the corresponding jump discontinuities of f.
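The combined formula for t ≥ 3 is quick to verify symbolically: it must satisfy y'' + 4y = t there and join the zero solution with y(3) = y'(3) = 0. A sketch with sympy (an assumed tool choice):

```python
# Sketch: checking the Example 3.13 solution on t > 3, where the forcing is t,
# and the continuity of y and y' at t = 3.
import sympy as sp

t = sp.symbols('t', positive=True)
y = t/4 - sp.Rational(3, 4)*sp.cos(2*(t - 3)) - sp.Rational(1, 8)*sp.sin(2*(t - 3))

# The ODE y'' + 4y = t holds on t > 3:
print(sp.simplify(sp.diff(y, t, 2) + 4*y - t))        # 0

# The pieces join smoothly at t = 3 (matching y ≡ 0 on t < 3):
print(y.subs(t, 3), sp.diff(y, t).subs(t, 3))         # 0 0
```

The smooth join at t = 3 is the differentiability observed above; only y'' jumps there, by the magnitude of the jump in f.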
FIGURE 3.17 Solution of y'' + 4y = f(t); y(0) = y'(0) = 0, with f(t) = 0 for t < 3 and f(t) = t for t ≥ 3.

Often we need to write a function having several jump discontinuities in terms of Heaviside functions in order to use the shifting theorems. Here is an example.
EXAMPLE 3.14

Let
f(t) = 0 if t < 2; t - 1 if 2 ≤ t < 3; -4 if t ≥ 3.
A graph of f is shown in Figure 3.18. There are jump discontinuities of magnitude 1 at t = 2 and magnitude 6 at t = 3.
Think of f(t) as consisting of two nonzero parts, the part that is t - 1 on [2, 3) and the part that is -4 on [3, ∞). We want to turn on t - 1 at time 2 and turn it off at time 3, then turn on -4 at time 3 and leave it on. The first effect is achieved by multiplying the pulse function H(t - 2) - H(t - 3) by t - 1. The second is achieved by multiplying H(t - 3) by -4. Therefore
f(t) = [H(t - 2) - H(t - 3)](t - 1) - 4H(t - 3).
As a check, this gives f(t) = 0 if t < 2 because all of the shifted Heaviside functions are zero for t < 2. For 2 ≤ t < 3, H(t - 2) = 1 but H(t - 3) = 0, so f(t) = t - 1. And for t ≥ 3, H(t - 2) = H(t - 3) = 1, so f(t) = -4.

3.3.4 Analysis of Electrical Circuits

The Heaviside function is important in many kinds of problems, including the analysis of electrical circuits, where we anticipate turning switches on and off. Here are two examples.
EXAMPLE 3.15

Suppose the capacitor in the circuit of Figure 3.19 initially has zero charge and that there is no initial current. At time t = 2 seconds, the switch is thrown from position B to A, held there for 1 second, then switched back to B. We want the output voltage E_out on the capacitor.
FIGURE 3.18 Graph of f(t) = 0 if t < 2; t - 1 if 2 ≤ t < 3; -4 if t ≥ 3.
FIGURE 3.19 The circuit: a 10 V source, a switch between positions A and B, a 250,000 Ω resistor, and a 10⁻⁶ F capacitor, with E_out taken across the capacitor.
From the circuit diagram, the forcing function is zero until t = 2, then has value 10 volts until t = 3, and then is zero again. Thus E is the pulse function
E(t) = 10[H(t - 2) - H(t - 3)].
By Kirchhoff's voltage law,
Ri(t) + (1/C)q(t) = E(t),
or
250,000 q'(t) + 10⁶ q(t) = E(t).
We want to solve for q subject to the initial condition q(0) = 0. Apply the Laplace transform to the differential equation, incorporating the initial condition, to write
250,000[sQ(s) - q(0)] + 10⁶Q(s) = 250,000 sQ + 10⁶Q = ℒ[E(t)].
Now
ℒ[E(t)](s) = 10ℒ[H(t - 2)](s) - 10ℒ[H(t - 3)](s) = (10/s)e^{-2s} - (10/s)e^{-3s}.
We now have the following equation for Q:
2.5(10⁵)sQ(s) + 10⁶Q(s) = (10/s)e^{-2s} - (10/s)e^{-3s},
or
Q(s) = 4(10⁻⁵) e^{-2s}/(s(s + 4)) - 4(10⁻⁵) e^{-3s}/(s(s + 4)).
Use a partial fractions decomposition to write
Q(s) = 10⁻⁵[(1/s)e^{-2s} - (1/(s + 4))e^{-2s}] - 10⁻⁵[(1/s)e^{-3s} - (1/(s + 4))e^{-3s}].
By the second shifting theorem,
ℒ⁻¹[(1/s)e^{-2s}](t) = H(t - 2)
and
ℒ⁻¹[(1/(s + 4))e^{-2s}](t) = H(t - 2)f(t - 2),
where f(t) = ℒ⁻¹[1/(s + 4)] = e^{-4t}. Thus
ℒ⁻¹[(1/(s + 4))e^{-2s}](t) = H(t - 2)e^{-4(t-2)}.
The other two terms in Q(s) are treated similarly, and we obtain
q(t) = 10⁻⁵[H(t - 2) - H(t - 2)e^{-4(t-2)}] - 10⁻⁵[H(t - 3) - H(t - 3)e^{-4(t-3)}]
= 10⁻⁵H(t - 2)[1 - e^{-4(t-2)}] - 10⁻⁵H(t - 3)[1 - e^{-4(t-3)}].
Finally, since the output voltage is E_out(t) = 10⁶ q(t),
E_out(t) = 10H(t - 2)[1 - e^{-4(t-2)}] - 10H(t - 3)[1 - e^{-4(t-3)}].
The input and output voltages are graphed in Figures 3.20 and 3.21.
FIGURE 3.20 Input voltage for the circuit of Figure 3.19.
FIGURE 3.21 Output voltage for the circuit of Figure 3.19.
EXAMPLE 3.16

The circuit of Figure 3.22 has the roles of resistor and capacitor interchanged from the circuit of the preceding example. We want to know the output voltage i(t)R at any time.
The differential equation of the preceding example applies to this circuit, but now we are interested in the current. Since i = q',
(2.5)(10⁵)i(t) + 10⁶q(t) = E(t); i(0) = q(0) = 0.
FIGURE 3.22 The circuit of Figure 3.19 with the roles of resistor and capacitor interchanged.
FIGURE 3.23 Input voltage for the circuit of Figure 3.22.
FIGURE 3.24 Output voltage for the circuit of Figure 3.22.
The strategy of eliminating q by differentiating and using i = q' does not apply here, because E(t) is not differentiable. To eliminate q(t) in the present case, write
q(t) = ∫₀ᵗ i(τ) dτ + q(0) = ∫₀ᵗ i(τ) dτ.
We now have the following problem to solve for the current:
(2.5)(10⁵)i(t) + 10⁶∫₀ᵗ i(τ) dτ = E(t); i(0) = 0.
This is not a differential equation. Nevertheless, we have the means to solve it. Take the Laplace transform of the equation, using equation (3.4), to obtain
(2.5)(10⁵)I(s) + (10⁶/s)I(s) = ℒ[E](s) = (10/s)e^{-2s} - (10/s)e^{-3s}.
Here I = ℒ[i]. Solve for I(s) to get
I(s) = 4(10⁻⁵) e^{-2s}/(s + 4) - 4(10⁻⁵) e^{-3s}/(s + 4).
Take the inverse Laplace transform to obtain
i(t) = 4(10⁻⁵)H(t - 2)e^{-4(t-2)} - 4(10⁻⁵)H(t - 3)e^{-4(t-3)}.
The input and output voltages are graphed in Figures 3.23 and 3.24.
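Because the governing relation here is an integral equation, a direct numeric check is reassuring: the current obtained should satisfy 2.5·10⁵ i(t) + 10⁶ ∫₀ᵗ i(τ) dτ = E(t). A minimal sketch, testing at t = 2.5 where E(t) = 10 (the integral is evaluated in closed form):

```python
# Sketch: sanity check of Example 3.16 at t = 2.5.
# Claimed current: i(t) = 4e-5[H(t-2)e^{-4(t-2)} - H(t-3)e^{-4(t-3)}].
import math

def i(t):
    out = 0.0
    if t > 2:
        out += 4e-5*math.exp(-4*(t - 2))
    if t > 3:
        out -= 4e-5*math.exp(-4*(t - 3))
    return out

t = 2.5
# ∫_0^2.5 i dτ = 4e-5 ∫_2^2.5 e^{-4(τ-2)} dτ = 1e-5 (1 - e^{-4(t-2)})
integral = 1e-5*(1 - math.exp(-4*(t - 2)))
lhs = 2.5e5*i(t) + 1e6*integral
print(lhs)  # 10.0, matching E(2.5)
```

The e^{-4(t-2)} terms cancel exactly, leaving the pulse value 10, so the check passes identically rather than approximately.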
PROBLEMS

In each of Problems 1 through 15, find the Laplace transform of the function.
1. (t³ - 3t + 2)e^{-2t}
2. e^{-3t}(t - 2)
3. f(t) = 1 for 0 ≤ t < 7; cos(t) for t ≥ 7
4. e^{4t}[t - cos(t)]
5. f(t) = t for 0 ≤ t < 3; 1 - 3t for t ≥ 3
6. f(t) = 2t - sin(t) for 0 ≤ t < π; 0 for t ≥ π
7. e^{-t}[1 - t² + sin(t)]
8. f(t) = t² for 0 ≤ t < 2; 1 - t - 3t² for t ≥ 2
9. f(t) = cos(t) for 0 ≤ t < 2π; 2 - sin(t) for t ≥ 2π
10. f(t) = -4 for 0 ≤ t < 1; 0 for 1 ≤ t < 3; e^t for t ≥ 3
11. te^{-2t}cos(3t)
12. e^t[1 - cosh(t)]
13. f(t) = t - 2 for 0 ≤ t < 16; -1 for t ≥ 16
14. f(t) = 1 - cos(2t) for 0 ≤ t < 3π; 0 for t ≥ 3π
15. e^{5t}(t⁴ + 2t² + t)

In each of Problems 16 through 25, find the inverse Laplace transform of the function.
16. 1/(s² + 4s + 12)
17. 1/(s² - 4s + 5)
18. (1/s)e^{-s}
19. se^{-2s}/(s² + 9)
20. 3e^{-4s}/(s + 2)
21. 1/(s² + 6s + 7)
22. (s - 4)/(s² - 8s + 10)
23. (s + 2)/(s² + 6s + 1)
24. e^{-s}/(s - 5)³
25. 1/(s(s² + 16))
26. Determine ℒ[e^{-2t} ∫₀ᵗ e^{2w} cos(3w) dw]. Hint: Use the first shifting theorem.

In each of Problems 27 through 32, solve the initial value problem by using the Laplace transform.
27. y'' + 4y = f(t); y(0) = 1, y'(0) = 0, with f(t) = 0 for 0 ≤ t < 4; … for t ≥ 4
28. y'' - 2y' - 3y = f(t); y(0) = 1, y'(0) = 0, with f(t) = … for 0 ≤ t < 4; … for t ≥ 4
29. y⁽³⁾ - 8y = g(t); y(0) = y'(0) = y''(0) = 0, with g(t) = 0 for 0 ≤ t < 6; … for t ≥ 6
30. y'' + 5y' + 6y = f(t); y(0) = y'(0) = 0, with f(t) = -2 for 0 ≤ t < 3; … for t ≥ 3
31. y⁽³⁾ - y'' + 4y' - 4y = f(t); y(0) = y'(0) = 0, with f(t) = 1 for 0 ≤ t < 5; … for t ≥ 5
32. y'' - 4y' + 4y = f(t); y(0) = -2, y'(0) = 1, with f(t) = t for 0 ≤ t < 3; … for t ≥ 3
33. Calculate and graph the output voltage in the circuit of Figure 3.19, assuming that at time zero the capacitor is charged to a potential of 5 volts and the switch is opened at 0 and closed 5 seconds later.
34. Calculate and graph the output voltage in the RL circuit of Figure 3.25 if the current is initially zero and E(t) = 10 for 0 ≤ t < 5; 2 for t ≥ 5.
FIGURE 3.25 The RL circuit for Problems 34 through 36.
35. Solve for the current in the RL circuit of Problem 34 if the current is initially zero and E(t) = k for 0 ≤ t < 5; … for t ≥ 5.
36. Solve for the current in the RL circuit of Problem 34 if the initial current is zero and E(t) = 0 for 0 ≤ t < 4; … for t ≥ 4.
37. Write the function graphed in Figure 3.26 in terms of the Heaviside function and find its Laplace transform.
FIGURE 3.26 Graph for Problem 37.
38. Write the function graphed in Figure 3.27 in terms of the Heaviside function and find its Laplace transform.
FIGURE 3.27 Graph for Problem 38.
39. Write the function graphed in Figure 3.28 in terms of the Heaviside function and find its Laplace transform.
FIGURE 3.28 Graph for Problem 39.
40. Solve for the current in the RL circuit of Figure 3.29 if the initial current is zero, E(t) has period 4, and E(t) = … for 0 ≤ t < 2; … for 2 ≤ t < 4.
FIGURE 3.29 The RL circuit for Problem 40.
Hint: See Problem 22 of Section 3.1 for the Laplace transform of a periodic function. You should find that I(s) = F(s)/(1 + e^{-2s}) for some F(s). Use a geometric series to write
1/(1 + e^{-2s}) = Σ_{n=0}^∞ (-1)ⁿ e^{-2ns}
to write I(s) as an infinite series, then take the inverse transform term by term by using a shifting theorem. Graph the current for 0 < t < 8.

3.4 Convolution

In general the Laplace transform of the product of two functions is not the product of their transforms. There is, however, a special kind of product, denoted f*g, called the convolution of f with g. Convolution has the feature that the transform of f*g is the product of the transforms of f and g. This fact is called the convolution theorem.
DEFINITION 3.6 Convolution

If f and g are defined on [0, ∞), then the convolution f*g of f with g is the function defined by
(f*g)(t) = ∫₀ᵗ f(t - τ)g(τ) dτ
for t ≥ 0.
THEOREM 3.9 Convolution Theorem

If f*g is defined, then ℒ[f*g] = ℒ[f]ℒ[g].
Proof: Let F = ℒ[f] and G = ℒ[g]. Then
F(s)G(s) = F(s)∫₀^∞ e^{-st}g(t) dt = ∫₀^∞ F(s)e^{-sτ}g(τ) dτ,
in which we changed the variable of integration to τ and brought F(s) within the integral. Now recall that
e^{-sτ}F(s) = ℒ[H(t - τ)f(t - τ)](s).
Substitute this into the integral for F(s)G(s) to get
F(s)G(s) = ∫₀^∞ ℒ[H(t - τ)f(t - τ)](s)g(τ) dτ.    (3.7)
But, from the definition of the Laplace transform,
ℒ[H(t - τ)f(t - τ)] = ∫₀^∞ e^{-st}H(t - τ)f(t - τ) dt.
Substitute this into equation (3.7) to get
F(s)G(s) = ∫₀^∞ [∫₀^∞ e^{-st}H(t - τ)f(t - τ) dt] g(τ) dτ = ∫₀^∞∫₀^∞ e^{-st}g(τ)H(t - τ)f(t - τ) dt dτ.
Now recall that H(t - τ) = 0 if 0 ≤ t < τ, while H(t - τ) = 1 if t ≥ τ. Therefore,
F(s)G(s) = ∫₀^∞∫_τ^∞ e^{-st}g(τ)f(t - τ) dt dτ.
Figure 3.30 shows the tτ plane. The last integration is over the shaded region, consisting of points (t, τ) satisfying 0 ≤ τ ≤ t < ∞. Reverse the order of integration to write
F(s)G(s) = ∫₀^∞∫₀ᵗ e^{-st}g(τ)f(t - τ) dτ dt = ∫₀^∞ e^{-st}[∫₀ᵗ g(τ)f(t - τ) dτ] dt = ∫₀^∞ e^{-st}(f*g)(t) dt = ℒ[f*g](s).
FIGURE 3.30 The region of integration 0 ≤ τ ≤ t in the tτ plane.
Therefore
F(s)G(s) = ℒ[f*g](s),
as we wanted to show. ∎
The inverse version of the convolution theorem is useful when we want to find the inverse transform of a function that is a product, and we know the inverse transform of each factor.

THEOREM 3.10

Let ℒ⁻¹[F] = f and ℒ⁻¹[G] = g. Then ℒ⁻¹[FG] = f*g.
EXAMPLE 3.17

Compute
ℒ⁻¹[1/(s(s - 4)²)].
We can do this several ways (a table, a program, a partial fractions decomposition). But we can also write
ℒ⁻¹[1/(s(s - 4)²)] = ℒ⁻¹[(1/s)(1/(s - 4)²)] = ℒ⁻¹[F(s)G(s)].
Now
ℒ⁻¹[1/s] = 1 = f(t) and ℒ⁻¹[1/(s - 4)²] = te^{4t} = g(t).
Therefore,
ℒ⁻¹[1/(s(s - 4)²)] = f(t)*g(t) = 1*te^{4t} = ∫₀ᵗ τe^{4τ} dτ = (1/4)te^{4t} - (1/16)e^{4t} + 1/16.
THEOREM 3.11

If f*g is defined, so is g*f, and f*g = g*f.
Proof: Let z = t - τ in the integral defining the convolution to get
(f*g)(t) = ∫₀ᵗ f(t - τ)g(τ) dτ = ∫_t^0 f(z)g(t - z)(-1) dz = ∫₀ᵗ f(z)g(t - z) dz = (g*f)(t). ∎
Commutativity can have practical importance, since the integral defining g*f may be easier to evaluate than the integral defining f*g in specific cases.
Convolution can sometimes enable us to write solutions of problems that are stated in very general terms.
EXAMPLE 3.18

We will solve the problem
y'' - 2y' - 8y = f(t); y(0) = 1, y'(0) = 0.
Apply the Laplace transform, inserting the initial values, to obtain
ℒ[y'' - 2y' - 8y](s) = (s²Y(s) - s) - 2(sY(s) - 1) - 8Y(s) = ℒ[f](s) = F(s).
Then
(s² - 2s - 8)Y(s) - s + 2 = F(s),
so
Y(s) = F(s)/(s² - 2s - 8) + (s - 2)/(s² - 2s - 8).
Use a partial fractions decomposition to write
Y(s) = (1/6)(1/(s - 4))F(s) - (1/6)(1/(s + 2))F(s) + (1/3)(1/(s - 4)) + (2/3)(1/(s + 2)).
Then
y(t) = (1/6)e^{4t}*f(t) - (1/6)e^{-2t}*f(t) + (1/3)e^{4t} + (2/3)e^{-2t}.
This is the solution, for any function f having a convolution with e^{4t} and e^{-2t}.
Convolution is also used to solve certain kinds of integral equations, in which the function to be determined occurs in an integral. We saw an example of this in solving for the current in Example 3.16.
EXAMPLE 3.19

Determine f such that
f(t) = 2t² + ∫₀ᵗ f(t - τ)e^{-τ} dτ.
Recognize the integral on the right as the convolution of f with e^{-t}. Thus the equation has the form
f(t) = 2t² + (f*e^{-t})(t).
Taking the Laplace transform of this equation yields
F(s) = 4/s³ + F(s)·1/(s + 1).
Then
F(s) = 4/s³ + 4/s⁴,
and from this we easily invert to obtain
f(t) = 2t² + (2/3)t³.
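A solution of an integral equation can be verified by substituting it back into the equation. A sketch with sympy (an assumed tool choice), evaluating the convolution integral directly:

```python
# Sketch: checking that f(t) = 2t² + (2/3)t³ satisfies the integral
# equation of Example 3.19, f(t) = 2t² + ∫_0^t f(t-τ) e^{-τ} dτ.
import sympy as sp

t, tau = sp.symbols('t tau', positive=True)

def f(x):
    return 2*x**2 + sp.Rational(2, 3)*x**3

rhs = 2*t**2 + sp.integrate(f(t - tau)*sp.exp(-tau), (tau, 0, t))
print(sp.simplify(rhs - f(t)))  # 0
```

Because the integrand is a polynomial times e^{-τ}, the integral evaluates in closed form and the identity holds exactly.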
PROBLEMS

In each of Problems 1 through 8, use the convolution theorem to compute the inverse Laplace transform of the function (even if another method would work). Wherever they occur, a and b are positive constants.
1. 1/((s² + 4)(s² - 4))
2. e^{-2s}/(s² + 16)
3. s/((s² + a²)(s² + b²))
4. s²/((s - 3)(s² + 5))
5. s/(s² + a²)²
6. 1/(s⁴(s - 5))
7. 2/(s(s + 2))
8. e^{-4s}/(s³(s² + 5))

In each of Problems 9 through 16, use the convolution theorem to write a formula for the solution of the initial value problem in terms of f(t).
9. y'' - 5y' + 6y = f(t); y(0) = y'(0) = 0
10. y'' + 10y' + 24y = f(t); y(0) = 1, y'(0) = 0
11. y'' - 8y' + 12y = f(t); y(0) = -3, y'(0) = 2
12. y'' - y' - 5y = f(t); y(0) = 2, y'(0) = 1
13. y'' + 9y = f(t); y(0) = -1, y'(0) = 1
14. y'' - k²y = f(t); y(0) = 2, y'(0) = -4
15. y⁽³⁾ - y'' - 4y' + 4y = f(t); y(0) = y'(0) = 1, y''(0) = 0
16. y⁽⁴⁾ - 11y'' + 18y = f(t); y(0) = y'(0) = y''(0) = y⁽³⁾(0) = 0

In each of Problems 17 through 23, solve the integral equation.
17. f(t) = -1 + ∫₀ᵗ f(t - α)e^{-3α} dα
18. f(t) = -t + ∫₀ᵗ f(t - α)sin(α) dα
19. f(t) = e^{-t} + ∫₀ᵗ f(t - α) dα
20. f(t) = -1 + t - 2∫₀ᵗ f(t - α)sin(α) dα
21. f(t) = 3 + ∫₀ᵗ f(α)cos[2(t - α)] dα
22. f(t) = cos(t) + e^{-2t} ∫₀ᵗ f(α)e^{2α} dα
23. f(t) = e^{3t}[e^t - 3∫₀ᵗ f(α)e^{3α} dα]
24. Use the convolution theorem to derive the formula ℒ[∫₀ᵗ f(w) dw](s) = (1/s)F(s). What assumptions are needed about f(t)?
25. Show by example that in general f*1 ≠ f, where 1 denotes the function that is identically 1 for all t. Hint: Consider f(t) = cos(t).
26. Use the convolution theorem to determine the Laplace transform of e^{-t} ∫₀ᵗ e^{2w} cos(3w) dw.
27. Use the convolution theorem to show that
ℒ⁻¹[(1/s²)F(s)](t) = ∫₀ᵗ ∫₀^w f(α) dα dw.
3.5 Unit Impulses and the Dirac Delta Function

Sometimes we encounter the concept of an impulse, which may be intuitively understood as a force of large magnitude applied over an instant of time. We can model an impulse as follows. For any positive number ε, consider the pulse δ_ε defined by
δ_ε(t) = (1/ε)[H(t) - H(t - ε)].
As shown in Figure 3.31, this is a pulse of magnitude 1/ε and duration ε. By letting ε approach zero, we obtain pulses of increasing magnitude over shorter time intervals.
Dirac's delta function is thought of as a pulse of "infinite magnitude" over an "infinitely short" duration, and is defined to be
δ(t) = lim_{ε→0+} δ_ε(t).
This is not really a function in the conventional sense, but a more general object called a distribution. Nevertheless, for historical reasons it continues to be referred to as the delta function. It is also named for the Nobel laureate physicist P.A.M. Dirac. The shifted delta function δ(t - a) is zero except for t = a, where it has its infinite spike.
We can define the Laplace transform of the delta function as follows. Begin with
δ_ε(t - a) = (1/ε)[H(t - a) - H(t - a - ε)].
FIGURE 3.31 Graph of δ_ε(t - a).
Then
ℒ[δ_ε(t - a)] = (1/(εs))[e^{-as} - e^{-(a+ε)s}] = e^{-as}(1 - e^{-εs})/(εs).
This suggests that we define
ℒ[δ(t - a)] = lim_{ε→0+} e^{-as}(1 - e^{-εs})/(εs) = e^{-as}.
In particular, upon choosing a = 0 we have
ℒ[δ(t)] = 1.
Thus we think of the delta function as having constant Laplace transform equal to 1.
The following result is called the filtering property of the delta function. If at time a a signal (function) is hit with an impulse, by multiplying it by δ(t - a), and the resulting signal is summed over all positive time by integrating from zero to infinity, then we obtain exactly the signal value f(a).

THEOREM 3.12
Filtering Property
Let a > 0 and let f be integrable on [0, ∞) and continuous at a. Then
∫₀^∞ f(t)δ(t - a) dt = f(a).
Proof: First calculate
∫₀^∞ f(t)δ_ε(t - a) dt = ∫₀^∞ (1/ε)[H(t - a) - H(t - a - ε)]f(t) dt = (1/ε)∫_a^{a+ε} f(t) dt.
By the mean value theorem for integrals, there is some t_ε between a and a + ε such that
∫_a^{a+ε} f(t) dt = εf(t_ε).
Then
∫₀^∞ f(t)δ_ε(t - a) dt = f(t_ε).
As ε → 0+, a + ε → a, so t_ε → a and, by continuity, f(t_ε) → f(a). Then
∫₀^∞ f(t)δ(t - a) dt = lim_{ε→0+} ∫₀^∞ f(t)δ_ε(t - a) dt = lim_{ε→0+} f(t_ε) = f(a),
as we wanted to show. ∎
If we apply the filtering property to f(t) = e^{-st}, we get
∫₀^∞ e^{-st}δ(t - a) dt = e^{-as},
consistent with the definition of the Laplace transform of the delta function. Further, if we change notation in the filtering property and write it as
∫₀^∞ f(t - τ)δ(τ) dτ = f(t),
then we can recognize the convolution of f with δ and read the last equation as
f*δ = f.
The delta function therefore acts as an identity for the "product" defined by the convolution of two functions.
Here is an example of an initial value problem involving the delta function.
EXAMPLE 3.20

Solve
y'' + 2y' + 2y = δ(t - 3); y(0) = y'(0) = 0.
Apply the Laplace transform to the differential equation to get
s²Y(s) + 2sY(s) + 2Y(s) = e^{-3s},
hence
Y(s) = e^{-3s}/(s² + 2s + 2).
To find the inverse transform of the function on the right, first write
Y(s) = e^{-3s}·1/((s + 1)² + 1).
Now use both shifting theorems. Because ℒ⁻¹[1/(s² + 1)] = sin(t), a shift in the s-variable gives us
ℒ⁻¹[1/((s + 1)² + 1)] = e^{-t}sin(t).
Now shift in the t-variable to obtain
y(t) = H(t - 3)e^{-(t-3)}sin(t - 3).
A graph of this solution is shown in Figure 3.32. The solution is differentiable for t > 0, except that y'(t) has a jump discontinuity of magnitude 1 at t = 3. The magnitude of the jump is the coefficient of δ(t - 3) in the differential equation.
FIGURE 3.32 Solution of y'' + 2y' + 2y = δ(t - 3); y(0) = y'(0) = 0.
The delta function may be used to study the behavior of a circuit that has been subjected to transients. These are generated during switching, and the high input voltages associated with them can create excessive current in the components, damaging the circuit. Transients can also be harmful because they contain a broad spectrum of frequencies. Introducing a transient into a circuit can therefore have the effect of forcing the circuit with a range of frequencies. If one of these is near the natural frequency of the system, resonance may occur, resulting in oscillations large enough to damage the system.
For this reason, before a circuit is built, engineers sometimes use a delta function to model a transient and study its effect on the circuit.
EXAMPLE 3.21

Suppose, in the circuit of Figure 3.33, the current and charge on the capacitor are zero at time zero. We want to determine the output voltage response to a transient modeled by δ(t).
The output voltage is q(t)/C, so we will determine q(t). By Kirchhoff's voltage law,
Li' + Ri + (1/C)q = i' + 10i + 100q = δ(t).
Since i = q',
q'' + 10q' + 100q = δ(t).
We assume initial conditions q(0) = q'(0) = 0.
Apply the Laplace transform to the differential equation and use the initial conditions to obtain
s²Q(s) + 10sQ(s) + 100Q(s) = 1.
FIGURE 3.33 The circuit, with E_in(t) = δ(t), a 1 H inductor, and a 10 Ω resistor.
Then
Q(s) = 1/(s² + 10s + 100).
In order to invert this by using a shifting theorem, complete the square to write
Q(s) = 1/((s + 5)² + 75).
Since
ℒ⁻¹[1/(s² + 75)] = (1/(5√3))sin(5√3 t),
then
q(t) = ℒ⁻¹[1/((s + 5)² + 75)] = (1/(5√3))e^{-5t}sin(5√3 t).
The output voltage is
(1/C)q(t) = 100q(t) = (20/√3)e^{-5t}sin(5√3 t).
A graph of this output is shown in Figure 3.34. The circuit output displays damped oscillations at its natural frequency, even though it was not explicitly forced by oscillations of this frequency. If we wish, we can obtain the current by i(t) = q'(t).
FIGURE 3.34 Output of the circuit of Figure 3.33.
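For t > 0 the impulse has already acted, so q must satisfy the homogeneous equation q'' + 10q' + 100q = 0 with q(0+) = 0 and q'(0+) = 1 (the unit impulse is equivalent to a unit jump in q'). This is straightforward to confirm symbolically, as in the following sketch with sympy (an assumed tool choice):

```python
# Sketch: checking Example 3.21's charge q(t) = (1/(5√3)) e^{-5t} sin(5√3 t)
# against q'' + 10q' + 100q = 0 for t > 0, with q(0+) = 0 and q'(0+) = 1.
import sympy as sp

t = sp.symbols('t', positive=True)
q = sp.exp(-5*t)*sp.sin(5*sp.sqrt(3)*t)/(5*sp.sqrt(3))

print(sp.simplify(sp.diff(q, t, 2) + 10*sp.diff(q, t) + 100*q))  # 0
print(sp.limit(q, t, 0), sp.limit(sp.diff(q, t), t, 0))          # 0 1
```

The damped frequency 5√3 is the imaginary part of the characteristic roots -5 ± 5√3 i, which is the "natural frequency" the transient excites.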
PROBLEMS

In each of Problems 1 through 5, solve the initial value problem and graph the solution.

1. y'' + 5y' + 6y = 3δ(t - 2) - 4δ(t - 5);  y(0) = y'(0) = 0
2. y'' - 4y' + 13y = 4δ(t - 3);  y(0) = y'(0) = 0
3. y⁽³⁾ + 4y'' + 5y' + 2y = 6δ(t);  y(0) = y'(0) = y''(0) = 0
4. y'' + 16y' = 12δ(t - 5π/8);  y(0) = 3, y'(0) = 0
5. y'' + 5y' + 6y = Bδ(t);  y(0) = 3, y'(0) = 0. Call the solution φ. What are φ(0) and φ'(0)? Using this information, what physical phenomenon does the Dirac delta function model?
6. Suppose f is not continuous at a, but lim_{t→a+} f(t) = f(a+) is finite. Prove that ∫₀^∞ f(t)δ(t - a) dt = f(a+).
7. Evaluate ∫₀^∞ (sin(t)/t) δ(t - π/6) dt.
8. Evaluate ∫₀^∞ t² δ(t - 3) dt.
9. Evaluate ∫₀^∞ f(t)δ(t - 2) dt, where
   f(t) = t for 0 ≤ t < 2;  t² for t > 2;  5 for t = 2.
10. It is sometimes convenient to consider δ(t) as the derivative of the Heaviside function H(t). Use the definitions of the derivative, the Heaviside function, and the delta function (as a limit of δ_ε) to give a heuristic justification for this.
11. Use the idea that H'(t) = δ(t) from Problem 10 to determine the output voltage of the circuit of Example 3.16 by differentiating the relevant equation to obtain an equation in i rather than writing the charge as an integral.
12. If H'(t) = δ(t), then ℒ[H'(t)](s) = 1. Show that not all of the operational rules for the Laplace transform are compatible with this expression. Hint: Check to see whether ℒ[H'(t)](s) = sℒ[H(t)](s) - H(0+).
13. Evaluate δ(t - a) * f(t).
14. An object of mass m is attached to the lower end of a spring of modulus k. Assume that there is no damping. Derive and solve an equation of motion for the position of the object at time t > 0, assuming that, at time zero, the object is pushed down from the equilibrium position with an initial velocity v₀. With what momentum does the object leave the equilibrium position?
15. Suppose an object of mass m is attached to the lower end of a spring having modulus k. Assume that there is no damping. Solve the equation of motion for the position of the object for any time t > 0 if, at time zero, the weight is struck a downward blow of magnitude mv₀. How does the position of the object in Problem 14 compare with that of the object in this problem for any positive time?
16. A 2-pound weight is attached to the lower end of a spring, stretching it … inches. The weight is allowed to come to rest in the equilibrium position. At some later time, which is called time zero, the weight is struck a downward blow of magnitude … pound (an impulse). Assume that there is no damping in the system. Determine the velocity with which the weight leaves the equilibrium position as well as the frequency and magnitude of the resulting oscillations.

3.6 Laplace Transform Solution of Systems

The Laplace transform can be of use in solving systems of equations involving derivatives and integrals.
EXAMPLE 3.22

Consider the system of differential equations and initial conditions for the functions x and y:

x'' - 2x' + 3y' + 2y = 4,
2y' - x' + 3y = 0,
x(0) = x'(0) = y(0) = 0.

Begin by applying the Laplace transform to the differential equations, incorporating the initial conditions. We get

s²X - 2sX + 3sY + 2Y = 4/s,
2sY - sX + 3Y = 0.
Solve these equations for X(s) and Y(s) to get

X(s) = (4s + 6)/(s²(s + 2)(s - 1)) and Y(s) = 2/(s(s + 2)(s - 1)).

A partial fractions decomposition yields

X(s) = -(7/2)(1/s) - 3(1/s²) + (1/6)·1/(s + 2) + (10/3)·1/(s - 1)

and

Y(s) = -1/s + (1/3)·1/(s + 2) + (2/3)·1/(s - 1).

Upon applying the inverse Laplace transform, we obtain the solution

x(t) = -7/2 - 3t + (1/6)e^{-2t} + (10/3)e^t

and

y(t) = -1 + (1/3)e^{-2t} + (2/3)e^t. ■
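The pair (x, y) obtained above can be checked directly against the original system. The following Python sketch (illustrative, not from the text; derivatives are computed by hand) confirms both equations and the zero initial conditions:

```python
import math

def x(t):   return -3.5 - 3*t + math.exp(-2*t)/6 + (10/3)*math.exp(t)
def dx(t):  return -3.0 - math.exp(-2*t)/3 + (10/3)*math.exp(t)
def d2x(t): return (2/3)*math.exp(-2*t) + (10/3)*math.exp(t)
def y(t):   return -1.0 + math.exp(-2*t)/3 + (2/3)*math.exp(t)
def dy(t):  return -(2/3)*math.exp(-2*t) + (2/3)*math.exp(t)

# initial conditions x(0) = x'(0) = y(0) = 0
assert abs(x(0)) < 1e-12 and abs(dx(0)) < 1e-12 and abs(y(0)) < 1e-12

# both equations of the system, at several sample times
for t in [0.0, 0.25, 1.0, 2.0]:
    assert abs(d2x(t) - 2*dx(t) + 3*dy(t) + 2*y(t) - 4) < 1e-9
    assert abs(2*dy(t) - dx(t) + 3*y(t)) < 1e-9
print("Example 3.22 solution verified")
```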
The analysis of mechanical and electrical systems having several components can lead to systems of differential equations that can be solved using the Laplace transform.
EXAMPLE 3.23
Consider the spring/mass system of Figure 3.35. Let x₁ = x₂ = 0 at the equilibrium position, where the weights are at rest. Choose the direction to the right as positive, and suppose the weights are at positions x₁(t) and x₂(t) at time t. By two applications of Hooke's law, the restoring force on m₁ is

-k₁x₁ + k₂(x₂ - x₁)

and that on m₂ is

-k₂(x₂ - x₁) - k₃x₂.

By Newton's second law of motion,

m₁x₁'' = -(k₁ + k₂)x₁ + k₂x₂ + f₁(t)

and

m₂x₂'' = k₂x₁ - (k₂ + k₃)x₂ + f₂(t).

FIGURE 3.35  Two masses m₁ and m₂ coupled by springs k₁, k₂, k₃.

These equations assume that damping is negligible, but allow for forcing functions acting on each mass. As a specific example, suppose m₁ = m₂ = 1 and k₁ = k₃ = 4 while k₂ = 5/2. Suppose f₂(t) = 0, so no external driving force acts on the second mass, while a force of magnitude f₁(t) = 2[1 - H(t - 3)] acts on the first. This hits the first mass with a force of constant magnitude 2 for the first 3 seconds, then turns off. Now the system of equations for the displacement functions is

x₁'' = -(13/2)x₁ + (5/2)x₂ + 2[1 - H(t - 3)],
x₂'' = (5/2)x₁ - (13/2)x₂.
If the masses are initially at rest at the equilibrium position, then

x₁(0) = x₂(0) = x₁'(0) = x₂'(0) = 0.

Apply the Laplace transform to each equation of the system to get

s²X₁ = -(13/2)X₁ + (5/2)X₂ + 2(1 - e^{-3s})/s,
s²X₂ = (5/2)X₁ - (13/2)X₂.

Solve these to obtain

X₁(s) = ((2s² + 13)/((s² + 9)(s² + 4))) (1/s)(1 - e^{-3s})

and

X₂(s) = (5/((s² + 9)(s² + 4))) (1/s)(1 - e^{-3s}).

In preparation for applying the inverse Laplace transform, use a partial fractions decomposition to write

X₁(s) = 13/(36s) - (1/4) s/(s² + 4) - (1/9) s/(s² + 9)
        - [13/(36s) - (1/4) s/(s² + 4) - (1/9) s/(s² + 9)] e^{-3s}

and

X₂(s) = 5/(36s) - (1/4) s/(s² + 4) + (1/9) s/(s² + 9)
        - [5/(36s) - (1/4) s/(s² + 4) + (1/9) s/(s² + 9)] e^{-3s}.
Now it is routine to apply the inverse Laplace transform to obtain the solution

x₁(t) = 13/36 - (1/4)cos(2t) - (1/9)cos(3t)
        + [-13/36 + (1/4)cos(2(t - 3)) + (1/9)cos(3(t - 3))] H(t - 3),

x₂(t) = 5/36 - (1/4)cos(2t) + (1/9)cos(3t)
        + [-5/36 + (1/4)cos(2(t - 3)) - (1/9)cos(3(t - 3))] H(t - 3). ■
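These displacement functions can be substituted back into the system. The Python sketch below (illustrative; the second derivatives are written out by hand, term by term) checks both equations before and after the forcing switches off at t = 3:

```python
import math

def H(u):  # Heaviside function
    return 1.0 if u >= 0 else 0.0

def x1(t):
    return (13/36 - math.cos(2*t)/4 - math.cos(3*t)/9
            + H(t-3)*(-13/36 + math.cos(2*(t-3))/4 + math.cos(3*(t-3))/9))

def x2(t):
    return (5/36 - math.cos(2*t)/4 + math.cos(3*t)/9
            + H(t-3)*(-5/36 + math.cos(2*(t-3))/4 - math.cos(3*(t-3))/9))

def d2x1(t):  # second derivative of x1, term by term
    return (math.cos(2*t) + math.cos(3*t)
            + H(t-3)*(-math.cos(2*(t-3)) - math.cos(3*(t-3))))

def d2x2(t):  # second derivative of x2
    return (math.cos(2*t) - math.cos(3*t)
            + H(t-3)*(-math.cos(2*(t-3)) + math.cos(3*(t-3))))

def f1(t):  # forcing on the first mass
    return 2*(1 - H(t-3))

for t in [0.5, 1.5, 2.9, 3.5, 5.0]:
    assert abs(d2x1(t) + 6.5*x1(t) - 2.5*x2(t) - f1(t)) < 1e-9
    assert abs(d2x2(t) - 2.5*x1(t) + 6.5*x2(t)) < 1e-9
assert abs(x1(0)) < 1e-12 and abs(x2(0)) < 1e-12
print("Example 3.23 displacements verified")
```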
EXAMPLE 3.24
In the circuit of Figure 3.36, suppose the switch is closed at time zero. We want to know the current in each loop. Assume that both loop currents and the charges on the capacitors are initially zero. Apply Kirchhoff's laws to each loop to get

40i₁ + 120(q₁ - q₂) = 10,
60i₂ + 120q₂ = 120(q₁ - q₂).

Since i = q', we can write q(t) = ∫₀ᵗ i(τ) dτ + q(0). Put into the two circuit equations, we get

40i₁ + 120∫₀ᵗ [i₁(τ) - i₂(τ)] dτ + 120[q₁(0) - q₂(0)] = 10,
60i₂ + 120∫₀ᵗ i₂(τ) dτ + 120q₂(0) = 120∫₀ᵗ [i₁(τ) - i₂(τ)] dτ + 120[q₁(0) - q₂(0)].

Put q₁(0) = q₂(0) = 0 in this system to get

40i₁ + 120∫₀ᵗ [i₁(τ) - i₂(τ)] dτ = 10,
60i₂ + 120∫₀ᵗ i₂(τ) dτ = 120∫₀ᵗ [i₁(τ) - i₂(τ)] dτ.

FIGURE 3.36  Two-loop circuit: 10 V source, 40 Ω and 60 Ω resistors, and two capacitors.
Apply the Laplace transform to each equation to get

40I₁ + (120/s)I₁ - (120/s)I₂ = 10/s,
60I₂ + (120/s)I₂ = (120/s)I₁ - (120/s)I₂.

After some rearrangement, we have

(s + 3)I₁ - 3I₂ = 1/4,
2I₁ - (s + 4)I₂ = 0.

Solve these to get

I₁(s) = (s + 4)/(4(s + 1)(s + 6)) = (3/20)·1/(s + 1) + (1/10)·1/(s + 6)

and

I₂(s) = 1/(2(s + 1)(s + 6)) = (1/10)·1/(s + 1) - (1/10)·1/(s + 6).
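The algebra here can be confirmed exactly with rational arithmetic. This Python sketch (illustrative) checks that I₁ and I₂ satisfy the rearranged transformed system and the stated partial fractions decompositions at several rational values of s:

```python
from fractions import Fraction as F

def I1(s):
    # I1(s) = (s + 4) / (4 (s + 1)(s + 6))
    return (s + 4) / (4*(s + 1)*(s + 6))

def I2(s):
    # I2(s) = 1 / (2 (s + 1)(s + 6))
    return F(1) / (2*(s + 1)*(s + 6))

for s in [F(1), F(2), F(7, 3), F(10)]:
    # the rearranged transformed system
    assert (s + 3)*I1(s) - 3*I2(s) == F(1, 4)
    assert 2*I1(s) - (s + 4)*I2(s) == 0
    # the partial fractions decompositions
    assert I1(s) == F(3, 20)/(s + 1) + F(1, 10)/(s + 6)
    assert I2(s) == F(1, 10)/(s + 1) - F(1, 10)/(s + 6)
print("Example 3.24 transform algebra verified")
```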
Now use the inverse Laplace transform to find the solution

i₁(t) = (3/20)e^{-t} + (1/10)e^{-6t},
i₂(t) = (1/10)e^{-t} - (1/10)e^{-6t}. ■

PROBLEMS

In each of Problems 1 through 10, use the Laplace transform to solve the initial value problem for the system.

5. 3x' - y = 2t, x' + y' - y = 0;  x(0) = y(0) = 0
6. x' + 4y' - y = 0, x' + 2y = e^{-t};  x(0) = y(0) = 0
7. x' + 2x - y' = 0, x' + y + x = t²;  x(0) = y(0) = 0
8. x' + 4x - y = 0, x' + y' = t;  x(0) = y(0) = 0
11. Use the Laplace transform to solve the system
    y₁' - 2y₂' + 3y₁ = 0,
    y₁' - 2y₂' + 3y₃' = -1;
    y₁(0) = y₂(0) = y₃(0) = 0.
12. Solve for the currents in the circuit of Figure 3.37, assuming that the currents and charges are initially zero and that E(t) = 2H(t - 4) - H(t - 5).
13. Solve for the currents in the circuit of Figure 3.37 if the currents and charges are initially zero and E(t) = 1 - H(t - 4) sin(2(t - 4)).
14. Solve for the displacement functions of the masses in the system of Figure 3.38. Neglect damping and assume zero initial displacements and velocities, and external forces f₁(t) = 2 and f₂(t) = 0.
15. Solve for the displacement functions in the system of Figure 3.38 if f₁(t) = 1 - H(t - 2) and f₂(t) = 0. Assume zero initial displacements and velocities.
FIGURE 3.38  Spring/mass system for Problems 14 and 15.

16. Consider the system of Figure 3.39. Let M be subjected to a periodic driving force f(t) = A sin(ωt). The masses are initially at rest in the equilibrium position. (a) Derive and solve the initial value problem for the displacement functions. (b) Show that, if m and k₂ are chosen so that ω = √(k₂/m), then the mass m cancels the forced vibrations of M. In this case we call m a vibration absorber.

FIGURE 3.39  Mass M on springs k₁ and k₂ with absorber mass m; displacements y₁ and y₂.

17. Two objects of masses m₁ and m₂ are attached to opposite ends of a spring having spring constant k (Figure 3.40). The entire apparatus is placed on a highly varnished table. Show that, if stretched and released from rest, the masses oscillate with respect to each other with period

2π √(m₁m₂ / (k(m₁ + m₂))).

FIGURE 3.40  Masses m₁ and m₂ joined by a spring of constant k.

18. Solve for the currents in the circuit of Figure 3.41 if E(t) = 5H(t - 2) and the initial currents are zero.

FIGURE 3.41  Circuit with source E(t), 20 H and 30 H inductors, and a 10 Ω resistor.

19. Solve for the currents in the circuit of Figure 3.41 if E(t) = 5δ(t - 1).
20. Two tanks are connected by a series of pipes as shown in Figure 3.42. Tank 1 initially contains 60 gallons of brine in which 11 pounds of salt are dissolved. Tank 2 initially contains 7 pounds of salt dissolved in 18 gallons of brine. Beginning at time zero a mixture containing a pound of salt for each gallon of water is pumped into tank 1 at the rate of 2 gallons per minute, while salt water solutions are interchanged between the two tanks and also flow out of tank 2 at the rates shown in the diagram. Four minutes after time zero, salt is poured into tank 2 at the rate of 11 pounds per minute for a period of 2 minutes. Determine the amount of salt in each tank for any time t > 0.

FIGURE 3.42  Two connected tanks; flow rates include 2 gal/min in and 5 gal/min out.
21. Two tanks are connected by a series of pipes as shown in Figure 3.43. Tank 1 initially contains 200 gallons of brine in which 10 pounds of salt are dissolved. Tank 2 initially contains 5 pounds of salt dissolved in 100 gallons of water. Beginning at time zero, pure water is pumped into tank 1 at the rate of 3 gallons per minute, while brine solutions are interchanged between the tanks at the rates shown in the diagram. Three minutes after time zero, 5 pounds of salt are dumped into tank 2. Determine the amount of salt in each tank for any time t > 0.

FIGURE 3.43  Two connected tanks; flow rates 3 gal/min in, with interchanges of 2 gal/min, 4 gal/min, and 1 gal/min.

3.7
Differential Equations with Polynomial Coefficients

The Laplace transform can sometimes be used to solve linear differential equations having polynomials as coefficients. For this we need the fact that the Laplace transform of tf(t) is the negative of the derivative of the Laplace transform of f(t).
THEOREM 3.13

Let ℒ[f](s) = F(s) for s > b and suppose that F is differentiable. Then

ℒ[tf(t)](s) = -F'(s) for s > b.

Proof  Differentiate under the integral sign to calculate

F'(s) = (d/ds) ∫₀^∞ e^{-st}f(t) dt = ∫₀^∞ (∂/∂s)(e^{-st}f(t)) dt = ∫₀^∞ -t e^{-st}f(t) dt = ∫₀^∞ e^{-st}[-tf(t)] dt = ℒ[-tf(t)](s),

and this is equivalent to the conclusion of the theorem. ■

By applying this result n times, we reach the following.
COROLLARY 3.1

Let ℒ[f](s) = F(s) for s > b and let n be a positive integer. Suppose F is n times differentiable. Then, for s > b,

ℒ[tⁿf(t)](s) = (-1)ⁿ (dⁿ/dsⁿ)F(s). ■
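Theorem 3.13 can be illustrated numerically. With f(t) = e^{at} we have F(s) = 1/(s - a), so ℒ[t f(t)](s) should equal -F'(s) = 1/(s - a)². The Python sketch below (illustrative; the Simpson-rule quadrature, the cutoff T, and the step count are arbitrary choices) checks this:

```python
import math

def laplace(f, s, T=40.0, n=4000):
    """Approximate integral_0^T e^{-st} f(t) dt by the composite Simpson rule.
    The tail beyond T is negligible for the decaying integrands used here."""
    h = T / n
    total = 0.0
    for i in range(n + 1):
        t = i * h
        w = 1 if i in (0, n) else (4 if i % 2 == 1 else 2)
        total += w * math.exp(-s * t) * f(t)
    return total * h / 3

a = 1.0
f = lambda t: math.exp(a * t)        # F(s) = 1/(s - a)
tf = lambda t: t * math.exp(a * t)   # should transform to -F'(s) = 1/(s - a)^2

for s in [2.5, 3.0, 4.0]:
    assert abs(laplace(f, s) - 1/(s - a)) < 1e-6
    assert abs(laplace(tf, s) - 1/(s - a)**2) < 1e-6
print("L[t f(t)] = -F'(s) illustrated for f(t) = e^t")
```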
EXAMPLE 3.25
Consider the problem

ty'' + (4t - 2)y' - 4y = 0;  y(0) = 1.

If we write this differential equation in the form y'' + p(t)y' + q(t)y = 0, then we must choose p(t) = (4t - 2)/t, and this is not defined at t = 0, where the initial condition is given. This problem is not of the type for which we proved an existence/uniqueness theorem in Chapter 2. Further, we have only one initial condition. Nevertheless, we will look for functions satisfying the problem as stated. Apply the Laplace transform to the differential equation to get

ℒ[ty''] + 4ℒ[ty'] - 2ℒ[y'] - 4ℒ[y] = 0.

Calculate the first three terms as follows. First,

ℒ[ty''] = -(d/ds)[s²Y - sy(0) - y'(0)] = -2sY - s²Y' + 1,

because y(0) = 1 and y'(0), though unknown, is constant and has zero derivative. Next,

ℒ[ty'] = -(d/ds)ℒ[y'] = -(d/ds)[sY - y(0)] = -Y - sY'.

Finally,

ℒ[y'] = sY - y(0) = sY - 1.

The transform of the differential equation is therefore

(-2sY - s²Y' + 1) + 4(-Y - sY') - 2(sY - 1) - 4Y = 0.

Then

Y' + ((4s + 8)/(s(s + 4)))Y = 3/(s(s + 4)).
This is a linear first-order differential equation, and we will find an integrating factor. First compute

∫ (4s + 8)/(s(s + 4)) ds = ln[s²(s + 4)²].

Then

e^{ln[s²(s + 4)²]} = s²(s + 4)²
is an integrating factor. Multiply the differential equation by this factor to obtain

s²(s + 4)²Y' + (4s + 8)s(s + 4)Y = 3s(s + 4),

or
[s²(s + 4)²Y]' = 3s(s + 4).

Integrate to get

s²(s + 4)²Y = s³ + 6s² + C.

Then

Y(s) = 1/(s + 4) + 2/(s + 4)² + C/(s²(s + 4)²).

Upon applying the inverse Laplace transform, we obtain

y(t) = e^{-4t} + 2te^{-4t} + (C/32)[-1 + 2t + e^{-4t} + 2te^{-4t}].
This function satisfies the differential equation and the condition y(0) = 1 for any real number C. This problem does not have a unique solution.

When we applied the Laplace transform to a constant coefficient differential equation y'' + Ay' + By = f(t), we obtained an algebraic expression for Y. In this example, with polynomials occurring as coefficients, we obtained a differential equation for Y, because the process of computing the transform of tᵏy(t) involves differentiating Y(s).

In the next example, we will need the following fact.
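The one-parameter family just obtained can be checked directly. This Python sketch (illustrative; the derivatives are computed by hand) confirms that, for several values of C, y satisfies ty'' + (4t - 2)y' - 4y = 0 and y(0) = 1:

```python
import math

def y(t, C):
    """y(t) = e^{-4t} + 2t e^{-4t} + (C/32)(-1 + 2t + e^{-4t} + 2t e^{-4t})."""
    E = math.exp(-4*t)
    return (1 + 2*t)*E + (C/32)*(-1 + 2*t + (1 + 2*t)*E)

def dy(t, C):
    E = math.exp(-4*t)
    return (-2 - 8*t)*E + (C/32)*(2 + (-2 - 8*t)*E)

def d2y(t, C):
    E = math.exp(-4*t)
    return 32*t*E*(1 + C/32)

for C in [-3.0, 0.0, 5.0]:
    assert abs(y(0.0, C) - 1.0) < 1e-12          # y(0) = 1 for every C
    for t in [0.1, 0.5, 1.0, 2.0]:
        residual = t*d2y(t, C) + (4*t - 2)*dy(t, C) - 4*y(t, C)
        assert abs(residual) < 1e-9
print("Example 3.25: one-parameter family of solutions verified")
```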
THEOREM 3.14
Let f be piecewise continuous on [0, k] for every positive number k, and suppose there are numbers M and b such that |f(t)| ≤ Me^{bt} for t ≥ 0. Let ℒ[f] = F. Then

lim_{s→∞} F(s) = 0.

Proof  Write

|F(s)| = |∫₀^∞ e^{-st}f(t) dt| ≤ ∫₀^∞ e^{-st}Me^{bt} dt = [M e^{(b-s)t}/(b - s)]₀^∞ = M/(s - b) → 0

as s → ∞. ■

This result will enable us to solve the following initial value problem.
EXAMPLE 3.26
Suppose we want to solve

y'' + 2ty' - 4y = 1;  y(0) = y'(0) = 0.

Unlike the preceding example, this problem satisfies the hypotheses of the existence/uniqueness theorem in Chapter 2. Apply the Laplace transform to the differential equation to get

s²Y(s) - sy(0) - y'(0) + 2ℒ[ty'](s) - 4Y(s) = 1/s.

Now y(0) = y'(0) = 0 and

ℒ[ty'](s) = -(d/ds)[ℒ[y'](s)] = -(d/ds)[sY(s) - y(0)] = -Y(s) - sY'(s).

We therefore have

s²Y(s) - 2Y(s) - 2sY'(s) - 4Y(s) = 1/s,

or

Y' + (3/s - s/2)Y = -1/(2s²).

This is a linear first-order differential equation for Y. To find an integrating factor, first compute

∫ (3/s - s/2) ds = 3 ln(s) - s²/4.

The exponential of this function, or

s³e^{-s²/4},

is an integrating factor. Multiply the differential equation by this function to obtain

(s³e^{-s²/4}Y)' = -(s/2)e^{-s²/4}.

Then

s³e^{-s²/4}Y = e^{-s²/4} + C.
We do not have any further initial conditions to determine C. However, in order to have lim_{s→∞} Y(s) = 0, we must choose C = 0. Then Y(s) = 1/s³, so

y(t) = (1/2)t². ■
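Both the final answer and the transformed equation are easy to confirm with exact arithmetic; the following Python sketch (illustrative) does so:

```python
from fractions import Fraction as F

# y(t) = t^2/2 gives y'' + 2t y' - 4y = 1 + 2t*t - 2t^2 = 1, with y(0) = y'(0) = 0.
for t in [F(0), F(1, 2), F(3), F(10)]:
    assert 1 + 2*t*t - 4*(t*t/2) == 1

# Y(s) = 1/s^3 satisfies Y' + (3/s - s/2) Y = -1/(2 s^2), since Y' = -3/s^4.
for s in [F(1), F(2), F(5, 2)]:
    Y, dY = 1/s**3, -3/s**4
    assert dY + (3/s - s/2)*Y == -1/(2*s**2)
print("Example 3.26 checks out")
```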
PROBLEMS

Use the Laplace transform to solve each of Problems 1 through 10.

1. t²y' - 2y = 2
2. y'' + 4ty' - 4y = 0;  y(0) = 0, y'(0) = -7
3. y'' - 16ty' + 32y = 14;  y(0) = y'(0) = 0
4. y'' + 8ty' - 8y = 0;  y(0) = 0, y'(0) = -4
POWER SERIES SOLUTIONS OF INITIAL VALUE PROBLEMS · SINGULAR POINTS AND THE METHOD OF FROBENIUS · POWER SERIES SOLUTIONS USING RECURRENCE RELATIONS · SECOND SOLUTIONS AND LOGARITHM

4

Series Solutions
Sometimes we can find an explicit, closed form solution of a differential equation or initial value problem. This occurs with

y' + 2y = 1;  y(0) = 3,

which has the unique solution

y(x) = (1/2)(1 + 5e^{-2x}).
This solution is explicit, giving y(x) as a function of x, and is in closed form because it is a finite algebraic combination of elementary functions (which are functions such as polynomials, trigonometric functions, and exponential functions). Sometimes standard methods do not yield a solution in closed form. For example, the problem

y' + eˣy = x²;  y(0) = 4

has the unique solution

y(x) = e^{-eˣ} ∫₀ˣ ξ²e^{e^ξ} dξ + 4e^{1-eˣ}.

This solution is explicit, but it is not in closed form because of the integral. It is difficult to analyze this solution, or even to evaluate it at specific points.

Sometimes a series solution is a good strategy for solving an initial value problem. Such a solution is explicit, giving y(x) as an infinite series involving constants times powers of x. It may also reveal important information about the behavior of the solution: for example, whether it passes through the origin, whether it is an even or odd function, or whether the function is increasing or decreasing on a given interval. It may also be possible to make good approximations to function values from a series representation.

We will begin with power series solutions for differential equations admitting such solutions. Following this, we will develop another kind of series for problems whose solutions do not have power series expansions about a particular point. This chapter assumes familiarity with basic facts about power series.
4.1 Power Series Solutions of Initial Value Problems

Consider the linear first-order initial value problem

y' + p(x)y = q(x);  y(x₀) = y₀.
If p and q are continuous on an open interval I about xo, we are guaranteed by Theorem 1 .3 that this problem has a unique solution defined for all x in I. With a stronger condition on these coefficients, we can infer that the solution will have a stronger property, which we now define .
DEFINITION 4.1  Analytic Function

A function f is analytic at x₀ if f(x) has a power series representation in some open interval about x₀:

f(x) = Σ_{n=0}^∞ aₙ(x - x₀)ⁿ

in some interval (x₀ - h, x₀ + h).

For example, sin(x) is analytic at 0, having the power series representation

sin(x) = Σ_{n=0}^∞ (-1)ⁿ x^{2n+1}/(2n + 1)!.
This series converges for all real x. Analyticity requires at least that f be infinitely differentiable at x₀, although this by itself is not sufficient for f to be analytic at x₀.

We claim that, when the coefficients of an initial value problem are analytic, then the solution is as well.

THEOREM 4.1

Let p and q be analytic at x₀. Then the initial value problem

y' + p(x)y = q(x);  y(x₀) = y₀

has a solution that is analytic at x₀. ■

This means that an initial value problem whose coefficients are analytic at x₀ has an analytic solution at x₀. This justifies attempting to expand the solution in a power series about x₀, where the initial condition is specified. This expansion has the form

y(x) = Σ_{n=0}^∞ aₙ(x - x₀)ⁿ,  (4.1)

in which

aₙ = (1/n!) y⁽ⁿ⁾(x₀).
One strategy to solve the initial value problem of the theorem is to use the differential equation and the initial condition to calculate these derivatives, hence obtain the coefficients in the expansion (4.1) of the solution.
EXAMPLE 4.1
Consider again the problem

y' + eˣy = x²;  y(0) = 4.

The theorem guarantees an analytic solution at 0:

y(x) = Σ_{n=0}^∞ (1/n!) y⁽ⁿ⁾(0)xⁿ = y(0) + y'(0)x + (1/2!)y''(0)x² + (1/3!)y⁽³⁾(0)x³ + ….

We will know this series if we can determine the terms y(0), y'(0), y''(0), …. The initial condition gives us y(0) = 4. Put x = 0 into the differential equation to get

y'(0) + y(0) = 0,

or y'(0) + 4 = 0. Then y'(0) = -4. Next determine y''(0). Differentiate the differential equation to get

y'' + eˣy' + eˣy = 2x  (4.2)

and put x = 0 to get

y''(0) + y'(0) + y(0) = 0.

Then y''(0) = -y'(0) - y(0) = -(-4) - 4 = 0. Next we will find y⁽³⁾(0). Differentiate equation (4.2) to get

y⁽³⁾ + 2eˣy' + eˣy'' + eˣy = 2.  (4.3)

Then

y⁽³⁾(0) + 2y'(0) + y''(0) + y(0) = 2,

or

y⁽³⁾(0) + 2(-4) + 0 + 4 = 2.

Then y⁽³⁾(0) = 6. Next differentiate equation (4.3):

y⁽⁴⁾ + 3eˣy' + 3eˣy'' + eˣy⁽³⁾ + eˣy = 0.
Evaluate this at 0 to get

y⁽⁴⁾(0) + 3(-4) + 3(0) + 6 + 4 = 0,

so y⁽⁴⁾(0) = 2. At this point we have the first five terms of the Maclaurin expansion of the solution:

y(x) = y(0) + y'(0)x + (1/2)y''(0)x² + (1/6)y⁽³⁾(0)x³ + (1/24)y⁽⁴⁾(0)x⁴ + …
     = 4 - 4x + x³ + (1/12)x⁴ + ….

By differentiating more times, we can write as many terms of this series as we want. ■
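The same coefficients can be generated without repeated differentiation: writing y = Σ aₙxⁿ and eˣ = Σ xᵏ/k! in y' + eˣy = x² and matching the coefficient of xⁿ gives (n + 1)a_{n+1} + Σ_{k=0}^{n} a_k/(n - k)! = (1 if n = 2, else 0). A hedged Python sketch (not the method the book uses in this example, but equivalent):

```python
from fractions import Fraction as F
from math import factorial

N = 6
a = [F(4)]  # a0 = y(0) = 4
for n in range(N):
    # coefficient of x^n in y' + e^x y = x^2:
    # (n+1) a_{n+1} + sum_{k<=n} a_k / (n-k)! = (1 if n == 2 else 0)
    rhs = F(1 if n == 2 else 0)
    s = sum(a[k] / factorial(n - k) for k in range(n + 1))
    a.append((rhs - s) / (n + 1))

assert a[:5] == [F(4), F(-4), F(0), F(1), F(1, 12)]
print("Maclaurin coefficients:", a[:5])  # matches 4 - 4x + x^3 + x^4/12 + ...
```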
EXAMPLE 4.2
Consider the initial value problem

y' + sin(x)y = 1 - x;  y(π) = -3.

Since the initial condition is given at x = π, we will seek terms in the Taylor expansion of the solution about π. This series has the form

y(x) = y(π) + y'(π)(x - π) + (1/2!)y''(π)(x - π)² + (1/3!)y⁽³⁾(π)(x - π)³ + (1/4!)y⁽⁴⁾(π)(x - π)⁴ + ….

We know the first term, y(π) = -3. From the differential equation,

y'(π) = 1 - π + 3 sin(π) = 1 - π.

Now differentiate the differential equation:

y''(x) + cos(x)y + sin(x)y' = -1.  (4.4)

Substitute x = π to get

y''(π) - (-3) = -1,

so y''(π) = -4. Next differentiate equation (4.4):

y⁽³⁾(x) - sin(x)y + 2 cos(x)y' + sin(x)y'' = 0.

Substitute x = π to get

y⁽³⁾(π) - 2(1 - π) = 0,

so y⁽³⁾(π) = 2(1 - π).
Up to this point we have four terms of the expansion of the solution about π:

y(x) = -3 + (1 - π)(x - π) - (4/2!)(x - π)² + (2(1 - π)/3!)(x - π)³ + …
     = -3 + (1 - π)(x - π) - 2(x - π)² + (1/3)(1 - π)(x - π)³ + ….
Again, with more work we can compute more terms. ■

This method for generating a series solution of a first-order linear initial value problem extends readily to second-order problems, justified by the following theorem.

THEOREM 4.2

Let p, q, and f be analytic at x₀. Then the initial value problem

y'' + p(x)y' + q(x)y = f(x);  y(x₀) = A, y'(x₀) = B

has a unique solution that is also analytic at x₀. ■
EXAMPLE 4.3
Solve

y'' - xy' + eˣy = 4;  y(0) = 1, y'(0) = 4.

Methods from preceding chapters do not apply to this problem. Since -x, eˣ, and 4 are analytic at 0, the problem has a series solution expanded about 0. The solution has the form

y(x) = y(0) + y'(0)x + (1/2!)y''(0)x² + (1/3!)y⁽³⁾(0)x³ + ….

We already know the first two coefficients from the initial conditions. From the differential equation,

y''(0) = 4 - y(0) = 3.

Now differentiate the differential equation to get

y⁽³⁾ - y' - xy'' + eˣy + eˣy' = 0.

Then

y⁽³⁾(0) = y'(0) - y(0) - y'(0) = -1.

Thus far we have four terms of the series solution about 0:

y(x) = 1 + 4x + (3/2)x² - (1/6)x³ + ….

Although we have illustrated the series method for initial value problems, we can also use it to find general solutions.
EXAMPLE 4.4
We will find the general solution of

y'' + cos(x)y' + 4y = 2x - 1.

The idea is to think of this as an initial value problem,

y'' + cos(x)y' + 4y = 2x - 1;  y(0) = a, y'(0) = b,

with a and b arbitrary (these will be the two arbitrary constants in the general solution). Now proceed as we have been doing. We will determine terms of a solution expanded about 0. The first two coefficients are a and b. For the coefficient of x², we find from the differential equation

y''(0) = -y'(0) - 4y(0) - 1 = -b - 4a - 1.

Next, differentiate the differential equation:

y⁽³⁾ - sin(x)y' + cos(x)y'' + 4y' = 2,

so

y⁽³⁾(0) = 2 - y''(0) - 4y'(0) = 3 + 4a - 3b.

Continuing in this way, we obtain (with details omitted)

y(x) = a + bx + ((-1 - 4a - b)/2)x² + ((3 + 4a - 3b)/6)x³ + ((1 + 12a + 8b)/24)x⁴ + ((-16 - 40a + b)/120)x⁵ + ….

In the next section, we will revisit power series solutions, but from a different perspective.
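The general-solution coefficients above can be checked mechanically by matching powers of x, with cos(x) replaced by its Maclaurin series. The following Python sketch (illustrative, not from the text) does this with exact rational arithmetic for a sample choice of a and b:

```python
from fractions import Fraction as F
from math import factorial

def cos_coeff(j):
    # Maclaurin coefficients of cos(x)
    return F(0) if j % 2 else F((-1)**(j//2), factorial(j))

def coefficients(a0, b0, N=5):
    """Maclaurin coefficients of the solution of
    y'' + cos(x) y' + 4y = 2x - 1, y(0) = a0, y'(0) = b0."""
    a = [F(a0), F(b0)]
    for n in range(N - 1):
        rhs = F(-1) if n == 0 else (F(2) if n == 1 else F(0))
        conv = sum(cos_coeff(j) * (n - j + 1) * a[n - j + 1]
                   for j in range(n + 1))   # coeff of x^n in cos(x) y'
        a.append((rhs - conv - 4*a[n]) / ((n + 2)*(n + 1)))
    return a

aa, bb = 1, 2   # arbitrary sample initial data
a = coefficients(aa, bb)
assert a[2] == F(-1 - 4*aa - bb, 2)
assert a[3] == F(3 + 4*aa - 3*bb, 6)
assert a[4] == F(1 + 12*aa + 8*bb, 24)
assert a[5] == F(-16 - 40*aa + bb, 120)
print("series recurrence matches the general-solution coefficients")
```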
PROBLEMS

In each of Problems 1 through 10, find the first five nonzero terms of the power series solution of the initial value problem, about the point where the initial conditions are given.

10. y'' - y' + y = 1;  y(4) = 0, y'(4) = 2

In each of Problems 11 through 20, find the first five nonzero terms of the Maclaurin expansion of the general solution.

11. y' + sin(x)y = -x
12. y' - x²y = 1
13. y' + xy = 1 - x + x²
14. y' - y = ln(x + 1)
15. y'' + xy = 0
16. y'' - 2y' + xy = 0
17. y'' - x³y = 1
18. y'' + (1 - x)y' + 2xy = 0
19. y'' + y' - x²y = 0
20. y'' - 8xy = 1 + 2x⁹
21. Find the first five terms of the Maclaurin series solution of Airy's equation y'' + xy = 0, satisfying y(0) = a, y'(0) = b.

In each of Problems 22 through 25, the initial value problem can be solved in closed form using methods from Chapters 1 and 2. Find this solution and expand it in a Maclaurin series. Then find the Maclaurin series solution using methods of Section 4.1. The two series should agree.
4.2 Power Series Solutions Using Recurrence Relations

We have just seen one way to utilize the differential equation and initial conditions to generate terms of a series solution, expanded about the point where the initial conditions are specified. Another way to generate coefficients is to develop a recurrence relation, which allows us to produce coefficients once certain preceding ones are known. We will consider three examples of this method.
EXAMPLE 4.5
Consider y'' + x²y = 0. Suppose we want a solution expanded about 0. Instead of computing successive derivatives at 0, as we did before, now begin by substituting y(x) = Σ_{n=0}^∞ aₙxⁿ into the differential equation. To do this, we need

y' = Σ_{n=1}^∞ naₙx^{n-1} and y'' = Σ_{n=2}^∞ n(n - 1)aₙx^{n-2}.

Notice that the series for y' begins at n = 1, and that for y'' at n = 2. Put these series into the differential equation to get

y'' + x²y = Σ_{n=2}^∞ n(n - 1)aₙx^{n-2} + Σ_{n=0}^∞ aₙx^{n+2} = 0.  (4.5)

Shift indices in both summations so that the power of x occurring in each series is the same. One way to do this is to write

Σ_{n=2}^∞ n(n - 1)aₙx^{n-2} = Σ_{n=0}^∞ (n + 2)(n + 1)a_{n+2}xⁿ

and

Σ_{n=0}^∞ aₙx^{n+2} = Σ_{n=2}^∞ a_{n-2}xⁿ.

Using these series, we can write equation (4.5) as

Σ_{n=0}^∞ (n + 2)(n + 1)a_{n+2}xⁿ + Σ_{n=2}^∞ a_{n-2}xⁿ = 0.
We can combine the terms for n ≥ 2 under one summation and factor out the common xⁿ (this was the reason for rewriting the series). When we do this, we must list the n = 0 and n = 1 terms of the first summation separately, or else we lose terms. We get

2(1)a₂x⁰ + 3(2)a₃x + Σ_{n=2}^∞ [(n + 2)(n + 1)a_{n+2} + a_{n-2}]xⁿ = 0.

The only way for this series to be zero for all x in some open interval about 0 is for the coefficient of each power of x to be zero. Therefore,

a₂ = a₃ = 0

and, for n = 2, 3, …,

(n + 2)(n + 1)a_{n+2} + a_{n-2} = 0.

This implies that

a_{n+2} = -a_{n-2}/((n + 2)(n + 1)) for n = 2, 3, ….  (4.6)

This is a recurrence relation for this differential equation. In this example, it gives a_{n+2} in terms of a_{n-2} for n = 2, 3, …. Thus, we know a₄ in terms of a₀, a₅ in terms of a₁, a₆ in terms of a₂, and so on. The form of the recurrence relation will vary with the differential equation, but it always gives coefficients in terms of one or more previously indexed ones. Using equation (4.6), we proceed:

a₄ = -(1/(4·3))a₀ = -(1/12)a₀  (by putting n = 2);
a₅ = -(1/(5·4))a₁ = -(1/20)a₁  (by putting n = 3);
a₆ = -(1/(6·5))a₂ = 0  (because a₂ = 0);
a₇ = -(1/(7·6))a₃ = 0  (because a₃ = 0);
a₈ = -(1/(8·7))a₄ = (1/((56)(12)))a₀;
a₉ = -(1/(9·8))a₅ = (1/((72)(20)))a₁;

and so on. The first few terms of the series solution expanded about 0 are

y(x) = a₀ + a₁x + 0x² + 0x³ - (1/12)a₀x⁴ - (1/20)a₁x⁵ + 0x⁶ + 0x⁷ + (1/672)a₀x⁸ + (1/1440)a₁x⁹ + …
     = a₀(1 - (1/12)x⁴ + (1/672)x⁸ - …) + a₁(x - (1/20)x⁵ + (1/1440)x⁹ - …).

This is actually the general solution, since a₀ and a₁ are arbitrary constants. Note that a₀ = y(0) and a₁ = y'(0), so a solution is completely specified by giving y(0) and y'(0). ■
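As the text remarks below, a recurrence relation is well suited to computer generation of coefficients. A minimal Python sketch for this example (names are illustrative):

```python
from fractions import Fraction as F

def coefficients(a0, a1, N=10):
    """Generate a_0..a_N for y'' + x^2 y = 0 via
    a_{n+2} = -a_{n-2} / ((n+2)(n+1)), with a_2 = a_3 = 0."""
    a = [F(a0), F(a1), F(0), F(0)]
    for n in range(2, N - 1):
        a.append(-a[n - 2] / ((n + 2)*(n + 1)))
    return a[:N + 1]

a = coefficients(1, 1, N=9)
assert a[4] == F(-1, 12) and a[5] == F(-1, 20)
assert a[6] == 0 and a[7] == 0
assert a[8] == F(1, 672) and a[9] == F(1, 1440)
print("recurrence reproduces the series of Example 4.5")
```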
EXAMPLE 4.6
Consider the nonhomogeneous differential equation

y'' + x²y' + 4y = 1 - x².

Attempt a solution y(x) = Σ_{n=0}^∞ aₙxⁿ. Substitute this series into the differential equation to get

Σ_{n=2}^∞ n(n - 1)aₙx^{n-2} + x² Σ_{n=1}^∞ naₙx^{n-1} + 4 Σ_{n=0}^∞ aₙxⁿ = 1 - x².

Then

Σ_{n=2}^∞ n(n - 1)aₙx^{n-2} + Σ_{n=1}^∞ naₙx^{n+1} + Σ_{n=0}^∞ 4aₙxⁿ = 1 - x².  (4.7)

Shift indices in the first and second summations so that the power of x occurring in each is xⁿ:

Σ_{n=2}^∞ n(n - 1)aₙx^{n-2} = Σ_{n=0}^∞ (n + 2)(n + 1)a_{n+2}xⁿ

and

Σ_{n=1}^∞ naₙx^{n+1} = Σ_{n=2}^∞ (n - 1)a_{n-1}xⁿ.

Equation (4.7) becomes

Σ_{n=0}^∞ (n + 2)(n + 1)a_{n+2}xⁿ + Σ_{n=2}^∞ (n - 1)a_{n-1}xⁿ + Σ_{n=0}^∞ 4aₙxⁿ = 1 - x².

We can combine summations from n = 2 on, writing the n = 0 and n = 1 terms from the first and third summations separately. Then

2a₂x⁰ + 6a₃x + 4a₀x⁰ + 4a₁x + Σ_{n=2}^∞ [(n + 2)(n + 1)a_{n+2} + (n - 1)a_{n-1} + 4aₙ]xⁿ = 1 - x².

For this to hold for all x in some interval about 0, the coefficient of xⁿ on the left must match the coefficient of xⁿ on the right. By matching these coefficients, we get

2a₂ + 4a₀ = 1 (from x⁰),
6a₃ + 4a₁ = 0 (from x),
4(3)a₄ + a₁ + 4a₂ = -1 (from x²),

and, for n ≥ 3,

(n + 2)(n + 1)a_{n+2} + (n - 1)a_{n-1} + 4aₙ = 0.
From these equations we get, in turn,

a₂ = 1/2 - 2a₀,

a₃ = -(2/3)a₁,

a₄ = (1/12)(-1 - a₁ - 4a₂) = (1/12)(-1 - a₁ - 2 + 8a₀) = -1/4 + (2/3)a₀ - (1/12)a₁,

and, for n = 3, 4, …,

a_{n+2} = -(4aₙ + (n - 1)a_{n-1}) / ((n + 2)(n + 1)).

This is the recurrence relation for this differential equation, and it enables us to determine a_{n+2} if we know the two previous coefficients aₙ and a_{n-1}. With n = 3 we get

a₅ = -(1/20)(4a₃ + 2a₂) = -(1/20)(1 - 4a₀ - (8/3)a₁) = -1/20 + (1/5)a₀ + (2/15)a₁.

With n = 4 the recurrence relation gives us

a₆ = -(1/30)(4a₄ + 3a₃) = -(1/30)(-1 + (8/3)a₀ - (7/3)a₁) = 1/30 - (4/45)a₀ + (7/90)a₁.

Thus far we have six terms of the solution:

y(x) = a₀ + a₁x + (1/2 - 2a₀)x² - (2/3)a₁x³ + (-1/4 + (2/3)a₀ - (1/12)a₁)x⁴
       + (-1/20 + (1/5)a₀ + (2/15)a₁)x⁵ + (1/30 - (4/45)a₀ + (7/90)a₁)x⁶ + ….

Using the recurrence relation, we can produce as many terms of this series as we wish. A recurrence relation is particularly suited to computer generation of coefficients. Because this recurrence relation specifies each a_{n+2} (for n ≥ 3) in terms of two preceding coefficients, it is called a two-term recurrence relation. It will give each such coefficient in terms of a₀ and a₁, which are arbitrary constants. Indeed, y(0) = a₀ and y'(0) = a₁, so assigning values to these constants uniquely determines the solution. ■

Sometimes we must represent one or more coefficients as power series to apply the current method. This does not alter the basic idea of collecting coefficients of like powers of x and solving for the coefficients.
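The two-term recurrence is equally easy to mechanize. A Python sketch (illustrative), checked here for the particular choice a₀ = a₁ = 0:

```python
from fractions import Fraction as F

def coefficients(a0, a1, N=6):
    """Coefficients for y'' + x^2 y' + 4y = 1 - x^2:
    a2, a3, a4 come from the matched low-order terms, then the two-term
    recurrence a_{n+2} = -(4 a_n + (n-1) a_{n-1}) / ((n+2)(n+1)) for n >= 3."""
    a0, a1 = F(a0), F(a1)
    a = [a0, a1,
         F(1, 2) - 2*a0,                    # 2 a2 + 4 a0 = 1
         -F(2, 3)*a1]                       # 6 a3 + 4 a1 = 0
    a.append(F(1, 12)*(-1 - a1 - 4*a[2]))   # 12 a4 + a1 + 4 a2 = -1
    for n in range(3, N - 1):
        a.append(-(4*a[n] + (n - 1)*a[n - 1]) / ((n + 2)*(n + 1)))
    return a[:N + 1]

a = coefficients(0, 0)  # sample check with a0 = a1 = 0
assert a[2] == F(1, 2) and a[3] == 0
assert a[4] == -F(1, 4) and a[5] == -F(1, 20) and a[6] == F(1, 30)
print("two-term recurrence reproduces Example 4.6")
```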
EXAMPLE 4.7
Solve

y'' + xy' - y = e^{3x}.

Each coefficient is analytic at 0, so we will look for a power series solution expanded about 0. Substitute y = Σ_{n=0}^∞ aₙxⁿ and also e^{3x} = Σ_{n=0}^∞ (3ⁿ/n!)xⁿ into the differential equation to get

Σ_{n=2}^∞ n(n - 1)aₙx^{n-2} + Σ_{n=1}^∞ naₙxⁿ - Σ_{n=0}^∞ aₙxⁿ = Σ_{n=0}^∞ (3ⁿ/n!)xⁿ.

Shift indices in the first summation to write this equation as

Σ_{n=0}^∞ (n + 2)(n + 1)a_{n+2}xⁿ + Σ_{n=1}^∞ naₙxⁿ - Σ_{n=0}^∞ aₙxⁿ = Σ_{n=0}^∞ (3ⁿ/n!)xⁿ.

We can collect terms from n = 1 on under one summation, obtaining

Σ_{n=1}^∞ [(n + 2)(n + 1)a_{n+2} + (n - 1)aₙ]xⁿ + 2a₂ - a₀ = 1 + Σ_{n=1}^∞ (3ⁿ/n!)xⁿ.

Equate coefficients of like powers of x on both sides of the equation to obtain

2a₂ - a₀ = 1

and, for n = 1, 2, …,

(n + 2)(n + 1)a_{n+2} + (n - 1)aₙ = 3ⁿ/n!.

This gives

a₂ = (1/2)(1 + a₀)

and, for n = 1, 2, …, we have the one-term recurrence relation (in terms of one preceding coefficient)

a_{n+2} = ((3ⁿ/n!) + (1 - n)aₙ) / ((n + 2)(n + 1)).

Using this relationship we can generate as many coefficients as we want in the solution series, in terms of the arbitrary constants a₀ and a₁. The first few terms are

y(x) = a₀ + a₁x + (1/2)(1 + a₀)x² + (1/2)x³ + (1/3 - (1/24)a₀)x⁴ + (7/40)x⁵ + (1/30)(57/24 + (1/8)a₀)x⁶ + ….
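A sketch of the one-term recurrence in Python (illustrative), verified against the coefficients listed above for a₀ = 0:

```python
from fractions import Fraction as F
from math import factorial

def coefficients(a0, a1, N=6):
    """Coefficients for y'' + x y' - y = e^{3x}:
    a2 = (1 + a0)/2, then a_{n+2} = (3^n/n! + (1-n) a_n) / ((n+2)(n+1))."""
    a = [F(a0), F(a1), (1 + F(a0)) / 2]
    for n in range(1, N - 1):
        a.append((F(3**n, factorial(n)) + (1 - n)*a[n]) / ((n + 2)*(n + 1)))
    return a[:N + 1]

a = coefficients(0, 0)
assert a[2] == F(1, 2) and a[3] == F(1, 2)
assert a[4] == F(1, 3)        # 1/3 - a0/24 with a0 = 0
assert a[5] == F(7, 40)
assert a[6] == F(19, 240)     # (1/30)(57/24 + a0/8) with a0 = 0
print("one-term recurrence reproduces Example 4.7")
```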
PROBLEMS

In each of Problems 1 through 12, find the recurrence relation and use it to generate the first five terms of the Maclaurin series of the general solution.

1. y' - xy = 1 - x
2. y' - x³y = 4
3. y' + (1 - x²)y = x
4. y'' + 2y' + xy = 0
5. y'' - xy' + y = 3
6. y'' + xy' + xy = 0
7. y'' - x²y' + 2y = x
8. y'' + x²y' + 2y = 0
9. y'' + (1 - x)y' + 2y = 1 - x²
10. y'' + y' - (1 - x + x²)y = -5
11. y' + xy = cos(x)
12. y'' + xy' = 1 - eˣ
4.3
Singular Points and the Method of Frobenius

In this section we will consider the second-order linear differential equation

P(x)y'' + Q(x)y' + R(x)y = F(x).  (4.8)

If we can divide this equation by P(x) and obtain an equation of the form

y'' + p(x)y' + q(x)y = f(x),  (4.9)

with analytic coefficients in some open interval about x₀, then we can proceed to a power series solution of equation (4.9) by methods already developed, and thereby solve equation (4.8). In this case we call x₀ an ordinary point of the differential equation. If, however, x₀ is not an ordinary point, then this strategy fails and we must develop some new machinery.
DEFINITION 4.2  Ordinary and Singular Points

x₀ is an ordinary point of equation (4.8) if P(x₀) ≠ 0 and Q(x)/P(x), R(x)/P(x), and F(x)/P(x) are analytic at x₀. x₀ is a singular point of equation (4.8) if x₀ is not an ordinary point.
Thus, $x_0$ is a singular point if $P(x_0) = 0$, or if any one of $Q(x)/P(x)$, $R(x)/P(x)$, or $F(x)/P(x)$ fails to be analytic at $x_0$.
EXAMPLE 4.8

The differential equation
$$x^3(x-2)^2 y'' + 5(x+2)(x-2)y' + 3x^2 y = 0$$
has singular points at 0 and 2, because $P(x) = x^3(x-2)^2$ and $P(0) = P(2) = 0$. Every other real number is an ordinary point of this equation.

In an interval about a singular point, solutions can exhibit behavior that is quite different from what we have seen in an interval about an ordinary point. In particular, the general solution of equation (4.8) may contain a logarithm term, which will tend toward $\infty$ in magnitude as $x$ approaches $x_0$.
In order to seek some understanding of the behavior of solutions near a singular point, we will concentrate on the homogeneous equation

$$P(x)y'' + Q(x)y' + R(x)y = 0. \tag{4.10}$$
Once this case is understood, it does not add substantial further difficulty to consider the nonhomogeneous equation (4.8). Experience and research have shown that some singular points are "worse" than others, in the sense that they bring deeper subtleties to attempts at solution. We therefore distinguish two kinds of singular points.
DEFINITION 4.3 Regular and Irregular Singular Points

$x_0$ is a regular singular point of equation (4.10) if $x_0$ is a singular point, and the functions
$$(x - x_0)\frac{Q(x)}{P(x)} \quad\text{and}\quad (x - x_0)^2\frac{R(x)}{P(x)}$$
are analytic at $x_0$. A singular point that is not regular is said to be an irregular singular point.
EXAMPLE 4.9
We have already noted that $x^3(x-2)^2 y'' + 5(x+2)(x-2)y' + 3x^2 y = 0$ has singular points at 0 and 2. We will classify these singular points. In this example, $P(x) = x^3(x-2)^2$, $Q(x) = 5(x+2)(x-2)$, and $R(x) = 3x^2$. First consider $x_0 = 0$. Now
$$(x - 0)\frac{Q(x)}{P(x)} = \frac{5(x+2)}{x^2(x-2)}$$
is not defined at 0, hence is not analytic there. This is enough to conclude that 0 is an irregular singular point of this differential equation. Next let $x_0 = 2$ and consider
$$(x-2)\frac{Q(x)}{P(x)} = 5\frac{x+2}{x^3} \quad\text{and}\quad (x-2)^2\frac{R(x)}{P(x)} = \frac{3}{x}.$$
Both of these functions are analytic at 2. Therefore, 2 is a regular singular point of the differential equation.

Suppose now that equation (4.10) has a regular singular point at $x_0$. Then there may be no solution as a power series about $x_0$. In this case we attempt to choose numbers $c_n$ and a number $r$ so that
$$y(x) = \sum_{n=0}^{\infty} c_n (x - x_0)^{n+r} \tag{4.11}$$
is a solution. This series is called a Frobenius series, and the strategy of attempting a solution of this form is called the method of Frobenius. A Frobenius series need not be a power series, since $r$ may be negative or may be a noninteger. A Frobenius series "begins" with $c_0 x^r$, which is constant only if $r = 0$. Thus, in computing the derivative of the Frobenius series (4.11), we get
$$y'(x) = \sum_{n=0}^{\infty}(n+r)c_n(x - x_0)^{n+r-1},$$
and this summation begins at zero again because the derivative of the $n = 0$ term need not be zero. Similarly,
$$y''(x) = \sum_{n=0}^{\infty}(n+r)(n+r-1)c_n(x - x_0)^{n+r-2}.$$
We will now illustrate the method of Frobenius.
EXAMPLE 4.10

We want to solve
$$x^2 y'' + x\left(\frac{1}{2} + 2x\right)y' + \left(x - \frac{1}{2}\right)y = 0.$$
It is routine to show that 0 is a regular singular point. Substitute a Frobenius series $y(x) = \sum_{n=0}^{\infty} c_n x^{n+r}$ into the differential equation to get
$$\sum_{n=0}^{\infty}(n+r)(n+r-1)c_n x^{n+r} + \sum_{n=0}^{\infty}\frac{1}{2}(n+r)c_n x^{n+r} + \sum_{n=0}^{\infty}2(n+r)c_n x^{n+r+1} + \sum_{n=0}^{\infty}c_n x^{n+r+1} - \frac{1}{2}\sum_{n=0}^{\infty}c_n x^{n+r} = 0.$$
Shift indices in the third and fourth summations to write this equation as
$$\left[r(r-1)c_0 + \frac{1}{2}c_0 r - \frac{1}{2}c_0\right]x^r + \sum_{n=1}^{\infty}\left[(n+r)(n+r-1)c_n + \frac{1}{2}(n+r)c_n + 2(n+r-1)c_{n-1} + c_{n-1} - \frac{1}{2}c_n\right]x^{n+r} = 0.$$
This equation will hold if the coefficient of each $x^{n+r}$ is zero. This gives us the equations
$$r(r-1)c_0 + \frac{1}{2}c_0 r - \frac{1}{2}c_0 = 0 \tag{4.12}$$
and
$$(n+r)(n+r-1)c_n + \frac{1}{2}(n+r)c_n + 2(n+r-1)c_{n-1} + c_{n-1} - \frac{1}{2}c_n = 0 \tag{4.13}$$
for $n = 1, 2, \ldots$. Assuming that $c_0 \neq 0$, an essential requirement in the method, equation (4.12) implies that
$$r(r-1) + \frac{1}{2}r - \frac{1}{2} = 0. \tag{4.14}$$
This is the indicial equation for this differential equation, and it determines the values of $r$ we can use. Solve it to obtain $r_1 = 1$ and $r_2 = -\frac{1}{2}$. Equation (4.13) enables us to solve for $c_n$ in terms of $c_{n-1}$ to get the recurrence relation
$$c_n = -\frac{1 + 2(n+r-1)}{(n+r)(n+r-1) + \frac{1}{2}(n+r) - \frac{1}{2}}\,c_{n-1}$$
for $n = 1, 2, \ldots$. First put $r = r_1 = 1$ into the recurrence relation to obtain
$$c_n = -\frac{1+2n}{n\left(n + \frac{3}{2}\right)}\,c_{n-1} \quad\text{for } n = 1, 2, \ldots.$$
Some of these coefficients are
$$c_1 = -\frac{6}{5}c_0, \quad c_2 = -\frac{5}{7}c_1 = -\frac{5}{7}\left(-\frac{6}{5}c_0\right) = \frac{6}{7}c_0, \quad c_3 = -\frac{14}{27}c_2 = -\frac{14}{27}\left(\frac{6}{7}c_0\right) = -\frac{4}{9}c_0,$$
and so on. One Frobenius solution is
$$y_1(x) = c_0\left(x - \frac{6}{5}x^2 + \frac{6}{7}x^3 - \frac{4}{9}x^4 + \cdots\right).$$
Because $r_1$ is a nonnegative integer, this first Frobenius solution is actually a power series about 0.

For a second Frobenius solution, substitute $r = r_2 = -\frac{1}{2}$ into the recurrence relation. To avoid confusion we will replace $c_n$ with $c_n^*$ in this relation. We get
$$c_n^* = -\frac{1 + 2\left(n - \frac{3}{2}\right)}{\left(n - \frac{1}{2}\right)\left(n - \frac{3}{2}\right) + \frac{1}{2}\left(n - \frac{1}{2}\right) - \frac{1}{2}}\,c_{n-1}^*$$
for $n = 1, 2, \ldots$. This simplifies to
$$c_n^* = -\frac{2n-2}{n\left(n - \frac{3}{2}\right)}\,c_{n-1}^*.$$
It happens in this example that $c_1^* = 0$, so $c_n^* = 0$ for $n = 1, 2, \ldots$, and the second Frobenius solution is
$$y_2(x) = \sum_{n=0}^{\infty}c_n^* x^{n-1/2} = c_0^* x^{-1/2} \quad\text{for } x > 0.$$
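Both recurrences in this example can be verified by machine. A small sketch (Python, exact rational arithmetic; the function name is ours) runs the general recurrence for either root:

```python
from fractions import Fraction

def frobenius_coeffs(r, count):
    """Coefficients c_n for Example 4.10 from the recurrence
    c_n = -(1 + 2(n+r-1)) c_{n-1} / ((n+r)(n+r-1) + (n+r)/2 - 1/2),
    normalized so that c_0 = 1."""
    r = Fraction(r)
    c = [Fraction(1)]
    for n in range(1, count):
        num = -(1 + 2 * (n + r - 1))
        den = (n + r) * (n + r - 1) + (n + r) / 2 - Fraction(1, 2)
        c.append(num / den * c[-1])
    return c

c_plus = frobenius_coeffs(1, 4)                 # root r1 = 1
c_minus = frobenius_coeffs(Fraction(-1, 2), 4)  # root r2 = -1/2
```

With $r = 1$ this reproduces $-\frac{6}{5}$, $\frac{6}{7}$, $-\frac{4}{9}$; with $r = -\frac{1}{2}$ every coefficient after $c_0^*$ vanishes, as found above.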
The method of Frobenius is justified by the following theorem .
THEOREM 4.3 Method of Frobenius

Suppose $x_0$ is a regular singular point of $P(x)y'' + Q(x)y' + R(x)y = 0$. Then there exists at least one Frobenius solution
$$y(x) = \sum_{n=0}^{\infty}c_n(x - x_0)^{n+r}$$
with $c_0 \neq 0$. Further, if the Taylor expansions of $(x-x_0)Q(x)/P(x)$ and $(x-x_0)^2 R(x)/P(x)$ about $x_0$ converge in an open interval $(x_0 - h, x_0 + h)$, then this Frobenius series also converges in this interval, except perhaps at $x_0$ itself.

It is significant that the theorem only guarantees the existence of one Frobenius solution. Although we obtained two such solutions in the preceding example, the next example shows that there may be only one.
EXAMPLE 4.11

Suppose we want to solve
$$x^2 y'' + 5xy' + (x+4)y = 0.$$
Zero is a regular singular point, so attempt a Frobenius solution $y(x) = \sum_{n=0}^{\infty}c_n x^{n+r}$. Substitute the series into the differential equation to get
$$\sum_{n=0}^{\infty}(n+r)(n+r-1)c_n x^{n+r} + \sum_{n=0}^{\infty}5(n+r)c_n x^{n+r} + \sum_{n=0}^{\infty}c_n x^{n+r+1} + \sum_{n=0}^{\infty}4c_n x^{n+r} = 0.$$
Shift indices in the third summation to write this equation as
$$\sum_{n=0}^{\infty}(n+r)(n+r-1)c_n x^{n+r} + \sum_{n=0}^{\infty}5(n+r)c_n x^{n+r} + \sum_{n=1}^{\infty}c_{n-1}x^{n+r} + \sum_{n=0}^{\infty}4c_n x^{n+r} = 0.$$
Now combine terms to write
$$[r(r-1) + 5r + 4]c_0 x^r + \sum_{n=1}^{\infty}\left[(n+r)(n+r-1)c_n + 5(n+r)c_n + c_{n-1} + 4c_n\right]x^{n+r} = 0.$$
Setting the coefficient of $x^r$ equal to zero (since $c_0 \neq 0$ as part of the method), we get the indicial equation
$$r(r-1) + 5r + 4 = 0,$$
with the repeated root $r = -2$. The coefficient of $x^{n+r}$ in the series, with $r = -2$ inserted, gives us the recurrence relation
$$(n-2)(n-3)c_n + 5(n-2)c_n + c_{n-1} + 4c_n = 0,$$
or
$$c_n = -\frac{1}{(n-2)(n-3) + 5(n-2) + 4}\,c_{n-1}$$
for $n = 1, 2, \ldots$. This simplifies to
$$c_n = -\frac{1}{n^2}\,c_{n-1} \quad\text{for } n = 1, 2, 3, \ldots.$$
4.3 Singular Points and the Method of Frobenius
171
Some of the coefficients ar e c l = -co 1 1 c2 =- 4 c1 = -co = 4
1
co
c3 = -
1 1 1 9 cz = -4 co = - (2 3) 2 c0
c4
1 c3 16
1 c° 4 .9 .16
1 c° (2 .3 .4)2
and so on . In general, cn
for
n=
= (-1) n 1 c (ni) z 0
1, 2, 3, . . . . The Frobenius solution we have found i s 1 y(x)=c° [x -z - x-i +4 = Co
00 E(_l)n n=0
1 36x+s+6 xz +••• ] xn-2 •
1 (n!) 2
In this example, xQ(x)/P(x) = x(5x/x 2) = 5 and x2R(x)/P(x) = x2(x+4)/x 2 = x+4. Thes e polynomials are their own Maclaurin series about 0, and these series, being finite, converge fo r all x . By Theorem 4 .3, the Frobenius series solution converges for all x, except x = O . In this example the method of Frobenius produces only one solution . E In the last example the recurrence relation produced a simple formula for c„ in terms of co . Depending on the coefficients in the differential equation, a formula for c,, in terms of co may be quite complicated, or it may even not be possible to write a formula in terms of elementar y algebraic expressions . We will give another example, having some importance for later work , in which the Frobenius method may produce only one solution.
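The recurrence $c_n = -c_{n-1}/n^2$ from Example 4.11 and the closed form read off from it can be cross-checked with a few lines of Python (exact rational arithmetic; this is our check, not part of the book's derivation):

```python
from fractions import Fraction
from math import factorial

# Recurrence from Example 4.11, taking c_0 = 1: c_n = -c_{n-1} / n^2.
c = [Fraction(1)]
for n in range(1, 8):
    c.append(-c[-1] / n**2)

# The closed form inferred from the first few terms: c_n = (-1)^n / (n!)^2.
closed = [Fraction((-1) ** n, factorial(n) ** 2) for n in range(8)]
```

The two lists agree term by term, confirming the general formula.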
EXAMPLE 4.12 Bessel Functions of the First Kind

The differential equation
$$x^2 y'' + xy' + (x^2 - \nu^2)y = 0$$
is called Bessel's equation of order $\nu$, for $\nu \geq 0$. Although it is a second-order differential equation, this description of it as being of order $\nu$ refers to the parameter $\nu$ appearing in it, and is traditional. Solutions of Bessel's equation are called Bessel functions, and we will encounter them in Chapter 16 when we treat special functions, and again in Chapter 18 when we analyze heat conduction in an infinite cylinder.

Zero is a regular singular point of Bessel's equation, so attempt a solution
$$y(x) = \sum_{n=0}^{\infty}c_n x^{n+r}.$$
Upon substituting this series into Bessel's equation, we obtain
$$[r(r-1) + r - \nu^2]c_0 x^r + [r(r+1) + (r+1) - \nu^2]c_1 x^{r+1} + \sum_{n=2}^{\infty}\left[[(n+r)(n+r-1) + (n+r) - \nu^2]c_n + c_{n-2}\right]x^{n+r} = 0. \tag{4.15}$$
Set the coefficient of each power of $x$ equal to zero. Assuming that $c_0 \neq 0$, we obtain the indicial equation
$$r^2 - \nu^2 = 0,$$
with roots $\pm\nu$. Let $r = \nu$ in the coefficient of $x^{r+1}$ in equation (4.15) to get
$$(2\nu + 1)c_1 = 0.$$
Since $2\nu + 1 \neq 0$, we conclude that $c_1 = 0$. From the coefficient of $x^{n+r}$ in equation (4.15), we get
$$[(n+r)(n+r-1) + (n+r) - \nu^2]c_n + c_{n-2} = 0$$
for $n = 2, 3, \ldots$. Set $r = \nu$ in this equation and solve for $c_n$ to get
$$c_n = -\frac{1}{n(n+2\nu)}\,c_{n-2}$$
for $n = 2, 3, \ldots$. Since $c_1 = 0$, this equation yields
$$c_3 = c_5 = \cdots = c_{\text{odd}} = 0.$$
For the even-indexed coefficients, write
$$c_{2n} = -\frac{1}{2n(2n+2\nu)}c_{2n-2} = -\frac{1}{2^2 n(n+\nu)}c_{2n-2} = \frac{1}{2^4 n(n-1)(n+\nu)(n+\nu-1)}c_{2n-4} = \cdots = \frac{(-1)^n}{2^{2n}n!\,(1+\nu)(2+\nu)\cdots(n+\nu)}c_0.$$
One Frobenius solution of Bessel's equation of order $\nu$ is therefore
$$y_1(x) = c_0\sum_{n=0}^{\infty}\frac{(-1)^n}{2^{2n}n!\,(1+\nu)(2+\nu)\cdots(n+\nu)}x^{2n+\nu}. \tag{4.16}$$
These functions are called Bessel functions of the first kind of order $\nu$. The roots of the indicial equation for Bessel's equation are $\pm\nu$. Depending on $\nu$, we may or may not obtain two linearly independent solutions by using $\nu$ and $-\nu$ in the series solution (4.16). We will discuss this in more detail when we treat Bessel functions in Chapter 16, where we will see that, when $\nu$ is a positive integer, the functions obtained by using $\nu$ and then $-\nu$ in the recurrence relation are linearly dependent.
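With $c_0 = 1$ and $\nu = 0$, the series (4.16) is the familiar Bessel function $J_0$. A quick partial-sum check in Python (the comparison value $J_0(1) \approx 0.7651976866$ is a standard tabulated value, not derived in the text):

```python
from fractions import Fraction

def bessel_series(x, v, terms):
    """Partial sum of the Frobenius solution (4.16) with c_0 = 1:
    sum of (-1)^n x^(2n+v) / (2^(2n) n! (1+v)(2+v)...(n+v))."""
    total = 0.0
    for n in range(terms):
        denom = Fraction(1)
        for k in range(1, n + 1):
            denom *= 4 * k * (k + v)  # builds 2^(2n) n! (1+v)(2+v)...(n+v)
        total += (-1) ** n * x ** (2 * n + v) / float(denom)
    return total

j0_at_1 = bessel_series(1.0, 0, 12)
```

Twelve terms already agree with the tabulated value of $J_0(1)$ to well beyond nine decimal places, reflecting the rapid convergence guaranteed for all $x$.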
SECTION 4.3 PROBLEMS

In each of Problems 1 through 6, find all of the singular points and classify each singular point as regular or irregular.

1. $x^2(x-3)^2 y'' + 4x(x^2 - x - 6)y' + (x^2 - x - 2)y = 0$
2. $(x^3 - 2x^2 - 7x - 4)y'' - 2(x^2 + 1)y' + (5x^2 - 2x)y = 0$
3. $x^2(x-2)y'' + (5x - 7)y' + 2(3 + 5x^2)y = 0$
4. $[(9 - x^2)y']' + (2 + x^2)y = 0$
5. $[(x-2)^{-1}y']' + x^{-5/2}y = 0$
6. $x^2\sin^2(x - \pi)y'' + \tan(x - \pi)\tan(x)y' + (7x - 2)\cos(x)y = 0$

In each of Problems 7 through 15, (a) show that zero is a regular singular point of the differential equation, (b) find and solve the indicial equation, (c) determine the recurrence relation, and (d) use the results of (b) and (c) to find the first five nonzero terms of two linearly independent Frobenius solutions.

7. $4x^2 y'' + 2xy' - xy = 0$
8. $16x^2 y'' - 4x^2 y' + 3y = 0$
4.4 Second Solutions and Logarithm Factors

In the preceding section we saw that under certain conditions we can always produce a Frobenius series solution of equation (4.10), but possibly not a second, linearly independent solution. This may occur if the indicial equation has a repeated root, or even if it has distinct roots that differ by a positive integer.

In the case that the method of Frobenius produces only one solution, there is a method for finding a second, linearly independent solution. The key is to know what form to expect this solution to have, so that this template can be substituted into the differential equation to determine the coefficients. This template is provided by the following theorem. We will state the theorem with $x_0 = 0$ to simplify the notation. To apply it to a differential equation having a singular point $x_0 \neq 0$, use the change of variables $z = x - x_0$.
THEOREM 4.4 A Second Solution in the Method of Frobenius

Suppose 0 is a regular singular point of $P(x)y'' + Q(x)y' + R(x)y = 0$. Let $r_1$ and $r_2$ be roots of the indicial equation. If these are real, suppose $r_1 \geq r_2$. Then

1. If $r_1 - r_2$ is not an integer, there are two linearly independent Frobenius solutions
$$y_1(x) = \sum_{n=0}^{\infty}c_n x^{n+r_1} \quad\text{and}\quad y_2(x) = \sum_{n=0}^{\infty}c_n^* x^{n+r_2}$$
with $c_0 \neq 0$ and $c_0^* \neq 0$. These solutions are valid in some interval $(0, h)$ or $(-h, 0)$.
2. If $r_1 - r_2 = 0$, there is a Frobenius solution $y_1(x) = \sum_{n=0}^{\infty}c_n x^{n+r_1}$ with $c_0 \neq 0$, as well as a second solution
$$y_2(x) = y_1(x)\ln(x) + \sum_{n=1}^{\infty}c_n^* x^{n+r_1}.$$
Further, $y_1$ and $y_2$ form a fundamental set of solutions on some interval $(0, h)$.

3. If $r_1 - r_2$ is a positive integer, then there is a Frobenius series solution
$$y_1(x) = \sum_{n=0}^{\infty}c_n x^{n+r_1}.$$
In this case there is a second solution of the form
$$y_2(x) = ky_1(x)\ln(x) + \sum_{n=0}^{\infty}c_n^* x^{n+r_2}.$$
If $k = 0$ this is a second Frobenius series solution; if not, the solution contains a logarithm term. In either event, $y_1$ and $y_2$ form a fundamental set on some interval $(0, h)$.

We may now summarize the method of Frobenius as follows, for the equation $P(x)y'' + Q(x)y' + R(x)y = 0$. Suppose 0 is a regular singular point. Substitute $y(x) = \sum_{n=0}^{\infty}c_n x^{n+r}$ into the differential equation. From the indicial equation, determine the values of $r$. If these are distinct and do not differ by an integer, we are guaranteed two linearly independent Frobenius solutions.

If the indicial equation has repeated roots, then there is just one Frobenius solution $y_1$. But there is a second solution
$$y_2(x) = y_1(x)\ln(x) + \sum_{n=1}^{\infty}c_n^* x^{n+r}.$$
The series on the right starts its summation at $n = 1$, not $n = 0$. Substitute $y_2(x)$ into the differential equation and obtain a recurrence relation for the coefficients $c_n^*$. Because this solution has a logarithm term, $y_1$ and $y_2$ are linearly independent.

If $r_1 - r_2$ is a positive integer, there may or may not be a second Frobenius solution. In this case there is a second solution of the form
$$y_2(x) = ky_1(x)\ln(x) + \sum_{n=0}^{\infty}c_n^* x^{n+r_2}.$$
Substitute $y_2$ into the differential equation and obtain an equation for $k$ and a recurrence relation for the coefficients $c_n^*$. If $k = 0$, we obtain a second Frobenius solution; if not, then this second solution has a logarithm term. In either case $y_1$ and $y_2$ are linearly independent.

In the preceding section we saw in Example 4.10 a differential equation in which $r_1 - r_2$ was not an integer. There we found two linearly independent Frobenius solutions. We will illustrate cases (2) and (3) of the theorem.
EXAMPLE 4.13 Conclusion (2), Equal Roots

Consider again $x^2 y'' + 5xy' + (x+4)y = 0$. In Example 4.11 we found one Frobenius solution
$$y_1(x) = c_0\sum_{n=0}^{\infty}(-1)^n\frac{1}{(n!)^2}x^{n-2}.$$
The indicial equation is $(r+2)^2 = 0$ with the repeated root $r = -2$. Conclusion (2) of the theorem suggests that we attempt a second solution of the form
$$y_2(x) = y_1(x)\ln(x) + \sum_{n=1}^{\infty}c_n^* x^{n-2}.$$
Note that the series on the right begins at $n = 1$, not $n = 0$.
Note that the series on the right begins at n = 1, not n = O . Substitute this series into th e differential equation to get, after some rearrangement of terms , 4y1 +2xyc +E(n-2)(n-3)cnx i-2 +E5(n-2)cnx"-2 +E clx „- 1 n=1
n=
n= I
0
+ E 4cnx i-2 + In (x) [x2 y ;' + 5xy + (x + 4)y 1 ] = O . n= 1
The bracketed coefficient of 1n(x) is zero because y l is a solution of the differential equation . In the last equation, choose co = 1 (we need only one second solution), shift indices to writ e i-2 and substitute the series obtained for (x) to get E*= „ 1 c*x1z-1 = E°°„=2 c* x y1 I +cix 1
+E
[4(1)0
1)" + 2-x (- (n-2 )
,i=2
+ (n - 2) (n - 3) cn + 5(n - 2) cn + cn _1
+ 4c,*,] xii-2 =
O.
Set the coefficient of each power of $x$ equal to zero. From the coefficient of $x^{-1}$ we get
$$c_1^* = 2.$$
From the coefficient of $x^{n-2}$ in the summation we get, after some routine algebra,
$$\frac{2(-1)^n n}{(n!)^2} + n^2 c_n^* + c_{n-1}^* = 0,$$
or
$$c_n^* = -\frac{1}{n^2}c_{n-1}^* - \frac{2(-1)^n}{n(n!)^2}$$
for $n = 2, 3, 4, \ldots$. This enables us to calculate as many coefficients as we wish. Some of the terms of the resulting solution are
$$y_2(x) = y_1(x)\ln(x) + \frac{2}{x} - \frac{3}{4} + \frac{11}{108}x - \frac{25}{3456}x^2 + \frac{137}{432{,}000}x^3 + \cdots.$$
Because of the logarithm term, it is obvious that this solution is not a constant multiple of $y_1$, so $y_1$ and $y_2$ form a fundamental set of solutions (on some interval $(0, h)$). The general solution is
$$y(x) = \left[c_1 + c_2\ln(x)\right]\sum_{n=0}^{\infty}(-1)^n\frac{1}{(n!)^2}x^{n-2} + c_2\left[\frac{2}{x} - \frac{3}{4} + \frac{11}{108}x - \frac{25}{3456}x^2 + \frac{137}{432{,}000}x^3 + \cdots\right].$$
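The recurrence for the $c_n^*$ in Example 4.13 is easy to run mechanically; the following sketch (Python, exact rational arithmetic; our check, not the book's) reproduces the coefficients quoted above:

```python
from fractions import Fraction
from math import factorial

# Example 4.13: c*_1 = 2, and for n >= 2
#   c*_n = -(1/n^2) c*_{n-1} - 2(-1)^n / (n (n!)^2).
cstar = {1: Fraction(2)}
for n in range(2, 6):
    cstar[n] = -Fraction(1, n**2) * cstar[n - 1] - Fraction(2 * (-1) ** n, n * factorial(n) ** 2)
```

The values $-\frac{3}{4}$, $\frac{11}{108}$, $-\frac{25}{3456}$, $\frac{137}{432{,}000}$ come out exactly as displayed in the solution series.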
EXAMPLE 4.14 Conclusion (3), with k = 0
The equation $x^2 y'' + x^2 y' - 2y = 0$ has a regular singular point at 0. Substitute $y(x) = \sum_{n=0}^{\infty}c_n x^{n+r}$ and shift indices to obtain
$$[r(r-1) - 2]c_0 x^r + \sum_{n=1}^{\infty}\left[(n+r)(n+r-1)c_n + (n+r-1)c_{n-1} - 2c_n\right]x^{n+r} = 0.$$
Assume that $c_0 \neq 0$. The indicial equation is $r^2 - r - 2 = 0$, with roots $r_1 = 2$ and $r_2 = -1$. Now $r_1 - r_2 = 3$ and case (3) of the theorem applies. For a first solution, set the coefficient of $x^{n+r}$ equal to zero to get
$$(n+r)(n+r-1)c_n + (n+r-1)c_{n-1} - 2c_n = 0. \tag{4.17}$$
Let $r = 2$ to get
$$(n+2)(n+1)c_n + (n+1)c_{n-1} - 2c_n = 0,$$
or
$$c_n = -\frac{n+1}{n(n+3)}\,c_{n-1} \quad\text{for } n = 1, 2, \ldots.$$
Using this recurrence relation to generate terms of the series, we obtain
$$y_1(x) = c_0 x^2\left[1 - \frac{1}{2}x + \frac{3}{20}x^2 - \frac{1}{30}x^3 + \frac{1}{168}x^4 - \frac{1}{1120}x^5 + \frac{1}{8640}x^6 + \cdots\right].$$
Now try the second root $r = -1$ in the recurrence relation (4.17). We get
$$(n-1)(n-2)c_n^* + (n-2)c_{n-1}^* - 2c_n^* = 0$$
for $n = 1, 2, \ldots$. When $n = 3$, this gives $c_2^* = 0$, which forces $c_n^* = 0$ for $n \geq 2$. But then
$$y_2(x) = c_0^*\frac{1}{x} + c_1^*.$$
Substitute this into the differential equation to get
$$x^2\left(2c_0^* x^{-3}\right) + x^2\left(-c_0^* x^{-2}\right) - 2\left(c_0^*\frac{1}{x} + c_1^*\right) = -c_0^* - 2c_1^* = 0.$$
Then $c_1^* = -\frac{1}{2}c_0^*$ and we obtain the second solution
$$y_2(x) = c_0^*\left(\frac{1}{x} - \frac{1}{2}\right),$$
with $c_0^*$ nonzero but otherwise arbitrary. The functions $y_1$ and $y_2$ form a fundamental set of solutions.
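Both halves of Example 4.14 can be checked numerically: the recurrence for $y_1$ with exact rationals, and the closed-form $y_2$ by evaluating the residual of the differential equation at sample points (our check, with $c_0 = c_0^* = 1$):

```python
from fractions import Fraction

# Coefficients of y1 in Example 4.14 (c_0 = 1): c_n = -(n+1) c_{n-1} / (n(n+3)).
c = [Fraction(1)]
for n in range(1, 5):
    c.append(-Fraction(n + 1, n * (n + 3)) * c[-1])

# The closed-form second solution y2 = 1/x - 1/2 should make
# x^2 y'' + x^2 y' - 2y vanish identically; derivatives done by hand:
# y2' = -1/x^2, y2'' = 2/x^3.
def residual(x):
    y, yp, ypp = 1 / x - 0.5, -1 / x**2, 2 / x**3
    return x**2 * ypp + x**2 * yp - 2 * y
```

The coefficient list reproduces $1, -\frac{1}{2}, \frac{3}{20}, -\frac{1}{30}, \frac{1}{168}$, and the residual is zero (to rounding) at every test point.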
EXAMPLE 4.15 Conclusion (3), with k ≠ 0
Consider the differential equation $xy'' - y = 0$, which has a regular singular point at 0. Substitute $y(x) = \sum_{n=0}^{\infty}c_n x^{n+r}$ to obtain
$$\sum_{n=0}^{\infty}(n+r)(n+r-1)c_n x^{n+r-1} - \sum_{n=0}^{\infty}c_n x^{n+r} = 0.$$
Shift indices in the second summation to write this equation as
$$(r^2 - r)c_0 x^{r-1} + \sum_{n=1}^{\infty}\left[(n+r)(n+r-1)c_n - c_{n-1}\right]x^{n+r-1} = 0.$$
The indicial equation is $r^2 - r = 0$, with roots $r_1 = 1$, $r_2 = 0$. Here $r_1 - r_2 = 1$, a positive integer, so we are in case (3) of the theorem. The recurrence relation is
$$(n+r)(n+r-1)c_n - c_{n-1} = 0$$
for $n = 1, 2, \ldots$. Let $r = 1$ and solve for $c_n$:
$$c_n = \frac{1}{n(n+1)}\,c_{n-1} \quad\text{for } n = 1, 2, 3, \ldots.$$
Some of the coefficients are
$$c_1 = \frac{1}{2}c_0, \quad c_2 = \frac{1}{2(3)}c_1 = \frac{1}{2(2)(3)}c_0, \quad c_3 = \frac{1}{3(4)}c_2 = \frac{1}{2(3)(2)(3)(4)}c_0.$$
In general, we find that
$$c_n = \frac{1}{n!(n+1)!}\,c_0$$
for $n = 1, 2, \ldots$. This gives us a Frobenius series solution
$$y_1(x) = c_0\sum_{n=0}^{\infty}\frac{1}{n!(n+1)!}x^{n+1} = c_0\left[x + \frac{1}{2}x^2 + \frac{1}{12}x^3 + \frac{1}{144}x^4 + \cdots\right].$$
In this example, if we put $r = 0$ into the recurrence relation, we get
$$n(n-1)c_n - c_{n-1} = 0$$
for $n = 1, 2, \ldots$. But if we put $n = 1$ into this equation, we get $c_0 = 0$, contrary to the assumption that $c_0 \neq 0$. Unlike the preceding example, we cannot find a second Frobenius solution by simply putting $r_2$ into the recurrence relation. Try a second solution
$$y_2(x) = ky_1(x)\ln(x) + \sum_{n=0}^{\infty}c_n^* x^n$$
(here $x^{n+r_2} = x^n$ because $r_2 = 0$). Substitute this into the differential equation to get
$$x\left[ky_1''\ln(x) + \frac{2k}{x}y_1' - \frac{k}{x^2}y_1 + \sum_{n=2}^{\infty}n(n-1)c_n^* x^{n-2}\right] - ky_1\ln(x) - \sum_{n=0}^{\infty}c_n^* x^n = 0. \tag{4.18}$$
CHAPTER 4 Series Solution s Now k1n(x)[xy'l -Y1] = 0 because y l is a solution of the differential equation . For the remaining terms in equation (4 .18), insert the series for yl (x) (with co = 1 for convenience) to get 1 n 1 x -kE i i xn f E c*nn(n-1)xn-1 - E cn* x„ = „=b (n!)2 „=o n (n+ l ) n=2 n=o Shift indices in the third summation to write this equation a s 2kE
for n = 1, 2, 3, . . . . Since co can be any nonzero real number, we may choose k = 1 . For a particular second solution, let ci = 0, obtaining : x3 Y2(x) = Y1(x)ln(x)+1- 34 x2- 7 36
-
35
172 8
x4 - . .
co
= 1 . The n
. .
To conclude this section, we will produce a second solution for Bessel's equation, in a case where the Frobenius method yields only one solution. This will be of use later when we study Bessel functions.
EXAMPLE 4.16 Bessel Function of the Second Kind

Consider Bessel's equation of order zero ($\nu = 0$). From Example 4.12, this is
$$x^2 y'' + xy' + x^2 y = 0.$$
We know from that example that the indicial equation has only one root, $r = 0$. From equation (4.16), with $c_0 = 1$, one Frobenius solution is
$$y_1(x) = \sum_{k=0}^{\infty}(-1)^k\frac{1}{2^{2k}(k!)^2}x^{2k}.$$
Attempt a second, linearly independent solution of the form
$$y_2(x) = y_1(x)\ln(x) + \sum_{k=1}^{\infty}c_k^* x^k.$$
Substitute $y_2(x)$ into the differential equation to get
$$xy_1''\ln(x) + 2y_1' - \frac{1}{x}y_1 + \sum_{k=2}^{\infty}k(k-1)c_k^* x^{k-1} + y_1'\ln(x) + \frac{1}{x}y_1 + \sum_{k=1}^{\infty}kc_k^* x^{k-1} + xy_1\ln(x) + \sum_{k=1}^{\infty}c_k^* x^{k+1} = 0.$$
Terms involving $\ln(x)$ and $y_1(x)$ cancel, and we are left with
$$2y_1' + \sum_{k=2}^{\infty}k(k-1)c_k^* x^{k-1} + \sum_{k=1}^{\infty}kc_k^* x^{k-1} + \sum_{k=1}^{\infty}c_k^* x^{k+1} = 0.$$
Since $k(k-1) + k = k^2$, the first summation combines with all terms except the $k = 1$ term of the second summation, and we have
$$2y_1' + \sum_{k=2}^{\infty}k^2 c_k^* x^{k-1} + c_1^* + \sum_{k=1}^{\infty}c_k^* x^{k+1} = 0.$$
Substitute the series for $y_1'$ into this equation to get
$$\sum_{k=1}^{\infty}(-1)^k\frac{1}{2^{2k-2}k!(k-1)!}x^{2k-1} + \sum_{k=2}^{\infty}k^2 c_k^* x^{k-1} + c_1^* + \sum_{k=1}^{\infty}c_k^* x^{k+1} = 0.$$
Shift indices in the last series to write this equation as
$$\sum_{k=1}^{\infty}(-1)^k\frac{1}{2^{2k-2}k!(k-1)!}x^{2k-1} + c_1^* + 4c_2^* x + \sum_{k=3}^{\infty}\left(k^2 c_k^* + c_{k-2}^*\right)x^{k-1} = 0. \tag{4.19}$$
The only constant term on the left side of this equation is $c_1^*$, which must therefore be zero. The only even powers of $x$ appearing in equation (4.19) are in the right-most series when $k$ is odd. The coefficients of these powers of $x$ must be zero, hence
$$k^2 c_k^* + c_{k-2}^* = 0 \quad\text{for } k = 3, 5, 7, \ldots.$$
But then all odd-indexed coefficients are multiples of $c_1^*$, which is zero, so
$$c_{2k+1}^* = 0 \quad\text{for } k = 0, 1, 2, \ldots.$$
To determine the even-indexed coefficients, replace $k$ by $2j$ in the second summation of equation (4.19) and $k$ with $j$ in the first summation to get
$$\sum_{j=1}^{\infty}(-1)^j\frac{1}{2^{2j-2}j!(j-1)!}x^{2j-1} + 4c_2^* x + \sum_{j=2}^{\infty}\left((2j)^2 c_{2j}^* + c_{2j-2}^*\right)x^{2j-1} = 0.$$
Equate the coefficient of each power of $x$ to zero. We get
$$c_2^* = \frac{1}{4}$$
and the recurrence relation
$$c_{2j}^* = -\frac{1}{(2j)^2}\left[c_{2j-2}^* + (-1)^j\frac{1}{2^{2j-2}j!(j-1)!}\right] \quad\text{for } j = 2, 3, 4, \ldots.$$
If we write some of these coefficients, a pattern emerges:
$$c_4^* = -\frac{1}{2^2 4^2}\left[1 + \frac{1}{2}\right], \quad c_6^* = \frac{1}{2^2 4^2 6^2}\left[1 + \frac{1}{2} + \frac{1}{3}\right],$$
and, in general,
$$c_{2j}^* = \frac{(-1)^{j+1}}{2^2 4^2\cdots(2j)^2}\left[1 + \frac{1}{2} + \cdots + \frac{1}{j}\right] = \frac{(-1)^{j+1}}{2^{2j}(j!)^2}\,\varphi(j),$$
where
$$\varphi(j) = 1 + \frac{1}{2} + \cdots + \frac{1}{j} \quad\text{for } j = 1, 2, \ldots.$$
We therefore have a second solution of Bessel's equation of order zero:
$$y_2(x) = y_1(x)\ln(x) + \sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{2^{2k}(k!)^2}\,\varphi(k)\,x^{2k}$$
for $x > 0$. This solution is linearly independent from $y_1(x)$ for $x > 0$.

When a differential equation with a regular singular point has only one Frobenius series solution expanded about that point, it is tempting to try reduction of order to find a second solution. This is a workable strategy if we can write $y_1(x)$ in closed form. But if $y_1(x)$ is an infinite series, it may be better to substitute the appropriate form of the second solution from Theorem 4.4 and solve for the coefficients.
SECTION 4.4 PROBLEMS

In each of Problems 1 through 10, (a) find the indicial equation, (b) determine the appropriate form of each of two linearly independent solutions, and (c) find the first five terms of each of two linearly independent solutions. In Problems 11 through 16, find only the form that two linearly independent solutions should take.
CHAPTER 5 Numerical Approximation of Solutions

Often we are unable to produce a solution of an initial value problem in a form suitable for drawing a graph or calculating numerical values. When this happens we may turn to a scheme for approximating numerical values of the solution.

Although the idea of a numerical approximation is not new, it is the development and ready accessibility of high-speed computers that have made it the success that it is today. Some problems thought to be intractable thirty years ago are now considered solved from a practical point of view. Using computers and numerical approximation techniques, we now have increasingly accurate models for weather patterns, national and international economies, global warming, ecological systems, fluid flow around airplane wings and ship hulls, and many other phenomena of interest and importance.

A good numerical approximation scheme usually includes the following features.

1. At least for first-order initial value problems, the scheme usually starts at a point $x_0$ where the initial value is prescribed, then builds approximate values of the solution at points specified to the left or right of $x_0$. The accuracy of the method will depend on the distance between successive points at which the approximations are made, their increasing distance from $x_0$, and of course the coefficients in the differential equation. Accuracy can also be influenced by the programming and by the architecture of the computer. For some complex models, such as the Navier-Stokes equations governing fluid flow, computers have been built with architecture dedicated to efficient approximation of solutions of that particular model.

2. A good numerical scheme includes an estimate or bound on the error in the approximation. This is used to understand the accuracy of the approximation, and often to guide the user in choosing certain parameters (such as the number of points at which approximations are made, and the distance between successive points). Often a compromise must be made between increasing accuracy (say, by choosing more points) and keeping the time or cost of the computation within reason. The type of problem under consideration may dictate what might be acceptable bounds on the error. If NASA is placing a satellite in a Jupiter orbit, a one-meter error might be acceptable, while an error of this magnitude would be catastrophic in performing eye surgery.
3. The method must be implemented on a computer. Only simple examples, devised for illustrative purposes, can be done by hand. Many commercially available software packages include routines for approximating and graphing solutions of differential equations. Among these are MAPLE, MATHEMATICA, and MATLAB.

We will now develop some specific methods.
5.1 Euler's Method

Euler's method is a scheme for approximating the solution of
$$y' = f(x, y); \quad y(x_0) = y_0,$$
in which $x_0$, $y_0$, and the function $f$ are given.

The method is a good introduction to numerical schemes because it is conceptually simple and geometrically appealing, although it is not the most accurate. Let $y(x)$ denote the solution (which we know exists, but do not know explicitly). The key to Euler's method is that if we know $y(x)$ at some $x$, then we can compute $f(x, y(x))$, and therefore know the slope $y'(x)$ of the tangent to the graph of the solution at that point. We will exploit this fact to approximate solution values at points $x_1 = x_0 + h$, $x_2 = x_0 + 2h$, $\ldots$, $x_n = x_0 + nh$.

First choose $h$ (the step size) and the number $n$ of iterations to be performed. Now form the first approximation. We know $y(x_0) = y_0$. Calculate $f(x_0, y_0)$ and draw the line having this slope through $(x_0, y_0)$. This line is tangent to the integral curve through $(x_0, y_0)$. Move along this tangent line to the point $(x_1, y_1)$. Use $y_1$ as an approximation to $y(x_1)$. This is illustrated in Figure 5.1. We have some hope that this is a "good" approximation, for $h$ "small", because the tangent line fits the curve closely "near" the point.

Next compute $f(x_1, y_1)$. This is the slope of the tangent to the graph of the solution of the differential equation passing through $(x_1, y_1)$. Draw the line through $(x_1, y_1)$ having this slope, and move along this line to $(x_2, y_2)$. This determines $y_2$, which we take as an approximation to $y(x_2)$ (see Figure 5.1 again). Continue in this way. Compute $f(x_2, y_2)$ and draw the line with this slope through $(x_2, y_2)$. Move along this line to $(x_3, y_3)$ and use $y_3$ as an approximation to $y(x_3)$.
FIGURE 5.1 Approximation points formed according to Euler's method.
In general, once we have reached $(x_k, y_k)$, draw the line through this point having slope $f(x_k, y_k)$ and move along this line to $(x_{k+1}, y_{k+1})$. Take $y_{k+1}$ as an approximation to $y(x_{k+1})$. This is the idea of the method. Obviously it is quite sensitive to how much $f(x, y)$ changes if $x$ and $y$ are varied by a small amount. The method also tends to accumulate error, since we use the approximation $y_k$ to make the approximation $y_{k+1}$.

In Figure 5.2, the successively drawn line segments used to determine the approximate values move away from the actual solution curve as $x$ increases, causing the approximations to be less accurate as more of them are made (that is, as $n$ is chosen larger). Following segments of lines is conceptually simple and appealing, but it is not sophisticated enough to be very accurate in general.

We will now derive an analytic expression for the approximate solution value $y_k$ at $x_k$. From Figure 5.1,
$$y_1 = y_0 + f(x_0, y_0)(x_1 - x_0).$$
At the next step,
$$y_2 = y_1 + f(x_1, y_1)(x_2 - x_1).$$
After we have obtained the approximate value $y_k$, the next step (Figure 5.3) gives
$$y_{k+1} = y_k + f(x_k, y_k)(x_{k+1} - x_k).$$
Since each $x_{k+1} - x_k = h$, we can summarize the discussion as follows.

DEFINITION 5.1 Euler's Method

Euler's method is to define $y_{k+1}$ in terms of $y_k$ by
$$y_{k+1} = y_k + f(x_k, y_k)(x_{k+1} - x_k) = y_k + hf(x_k, y_k)$$
for $k = 0, 1, 2, \ldots, n-1$. $y_k$ is the Euler approximation to $y(x_k)$.
FIGURE 5.2 Accumulating error in Euler's method.

FIGURE 5.3 $(x_k, y_k)$
EXAMPLE 5.1

Consider
$$y' = x\sqrt{y}; \quad y(2) = 4.$$
This separable differential equation is easily solved:
$$y(x) = \left(1 + \frac{x^2}{4}\right)^2.$$
This enables us to observe how the method works by direct comparison with the exact solution. First we must decide on $h$ and $n$. Since we do not have any error estimates, we have no rationale for making a particular choice. For illustration, choose $h = 0.2$ and $n = 20$. Then $x_0 = 2$ and $x_{20} = x_0 + nh = 2 + (20)(0.2) = 6$. Now
$$y_{k+1} = y_k + 0.2x_k\sqrt{y_k} \quad\text{for } k = 0, 1, 2, \ldots, 19.$$
Table 5.1 lists the Euler approximations, and Figure 5.4 shows a graph of this approximate solution (actually, a smooth curve drawn through the approximated points), together with a graph of the actual solution, for comparison. Notice that the approximation becomes less accurate as $x$ moves further from 2.
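The iteration above is only a few lines of code. The following sketch (Python; function name ours) implements Definition 5.1 and runs it on this example:

```python
from math import sqrt

def euler(f, x0, y0, h, n):
    """Euler's method: y_{k+1} = y_k + h f(x_k, y_k), returning all points."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

# Example 5.1: y' = x sqrt(y), y(2) = 4, h = 0.2, n = 20.
xs, ys = euler(lambda x, y: x * sqrt(y), 2.0, 4.0, 0.2, 20)
```

The computed values reproduce the entries of Table 5.1; comparing `ys` against the exact solution $(1 + x^2/4)^2$ shows the growing one-sided error visible in Figure 5.4.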
TABLE 5.1 Approximate Values of the Solution of $y' = x\sqrt{y}$; $y(2) = 4$; $h = 0.2$; $n = 20$

x      y_app(x)         x      y_app(x)
2.0    4                4.2    26.62097204
2.2    4.8              4.4    30.95499533
2.4    5.763991701      4.6    35.85107012
2.6    6.916390802      4.8    41.35964033
2.8    8.283940462      5.0    47.53354060
3.0    9.895723242      5.2    54.42799784
3.2    11.78317135      5.4    62.10063249
3.4    13.98007530      5.6    70.61145958
3.6    16.52259114      5.8    80.02288959
3.8    19.44924644      6.0    90.39972950
4.0    22.80094522
FIGURE 5.4 Exact and Euler approximate solutions of $y' = x\sqrt{y}$; $y(2) = 4$ with step size 0.2 and twenty iterations.
TABLE 5.2 Approximate Values of the Solution of $y' = x\sqrt{y}$; $y(2) = 4$; $h = 0.1$; $n = 40$

FIGURE 5.5 Exact and Euler approximate solutions of $y' = x\sqrt{y}$; $y(2) = 4$, first with $h = 0.2$ and twenty iterations, and then with $h = 0.1$ and forty iterations.
The accuracy of this method depends on $h$. If we choose $h = 0.1$ and $n = 40$ (so the approximation is still for $2 \leq x \leq 6$), we get the approximate values of Table 5.2. A graph of this approximation is shown in Figure 5.5, showing an improved approximation by choosing $h$ smaller. With today's computing power, we would have no difficulty using a much smaller $h$.
EXAMPLE 5.2

Consider $y' = \sin(xy)$; $y(2) = 1$. We cannot write a simple solution for this problem. Figure 5.6 shows a direction field for $y' = \sin(xy)$, and Figure 5.7 repeats this direction field, with some integral curves, including the one through $(2, 1)$. This is a graph of the solution (actually an approximation done by the software used for the direction field). For a numerical approximation of the solution, choose $h = 0.2$ and $n = 20$ to obtain an approximate solution for $2 \leq x \leq 6$. The generated values are given in Table 5.3, and a smooth curve is drawn through these points in Figure 5.8.
FIGURE 5.6  A direction field for y' = sin(xy).
FIGURE 5.7  A direction field and some integral curves for y' = sin(xy), including the integral curve through (2, 1).
TABLE 5.3  Approximate Values of the Solution of y' = sin(xy); y(2) = 1; h = 0.2; n = 20
From the examples, it appears that the error in an Euler approximation is proportional to h. It can be shown that this is indeed the case, and for this reason Euler's method is a first-order method. If a method has error proportional to h^p, it is called an order p method.
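The claim that the error is proportional to h can be tested numerically: halving h should roughly halve the error at a fixed point. A Python sketch (names ours), using the problem of Table 5.1, whose exact solution is y = (x²/4 + 1)², so y(6) = 100:

```python
import math

def euler(f, x0, y0, h, n):
    """Euler's method: y_{k+1} = y_k + h f(x_k, y_k).  Returns y after n steps."""
    for _ in range(n):
        y0 += h * f(x0, y0)
        x0 += h
    return y0

f = lambda x, y: x * math.sqrt(y)
exact = (6.0**2 / 4.0 + 1.0)**2          # y(6) = 100 for y' = x*sqrt(y), y(2) = 4

err1 = abs(euler(f, 2.0, 4.0, 0.1, 40) - exact)   # error with h = 0.1
err2 = abs(euler(f, 2.0, 4.0, 0.05, 80) - exact)  # error with h = 0.05
ratio = err1 / err2   # close to 2 for a first-order method
```

The observed ratio near 2 is numerical evidence, not proof; the text's remark that the method is first order is the precise statement.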
5.1.1  A Problem in Radioactive Waste Disposal

Disposal of radioactive waste materials generated by nuclear power plants, medical research, military testing, and other sources is a serious international problem. In view of the long half-lives of some of the materials involved, there is no real prospect for disposal, and the problem becomes one of safe storage. For example, Uranium-235 has a half-life of 7.13(10⁸) years, and Thorium-232 a half-life of 1.39(10¹⁰) years. If a ton of Thorium-232 is stored, more than half of it will still be here to see our sun consume all of its fuel and die.
Storage plans have been proposed and studied on both the national level (for example, by the U.S. Atomic Energy Commission) and the international level (by the International Atomic Energy Agency of the United Nations). Countries have developed a variety of policies and plans. Argentina has embarked on a program of storing containers of radioactive materials in granite vaults. Belgium is planning to bury containers in clay deposits. Canada plans to use crystalline rocks in the Canadian shield. The Netherlands is considering salt domes. France and Japan are planning undersea depositories. And the United States has a diversified approach
which has included sites at Yucca Mountain in Nevada and the Hanford Reservation in the state of Washington.
One idea which has been considered is to store the material in fifty-five-gallon containers and drop them into the ocean at a point about 300 feet deep, shallow enough to prevent rupture of the drums by water pressure. It was found that drums could be manufactured that would endure indefinitely at this depth. But then another point was raised. Would the drums withstand the impact of settling on the ocean floor after being dropped from a ship?
Testing showed that the drums could indeed rupture if they impacted the bottom at a speed in excess of 40 feet per second. The question now is: will a drum achieve this velocity in a 300-foot descent through seawater?
To answer this question, we must analyze what happens when a drum is dropped into the water and allowed to settle to the bottom. Each 55-gallon drum weighs about 535 pounds after being filled with the material and some insulation. When in the water, the drum is buoyed up by a force equal to the weight of the water displaced. Fifty-five gallons is about 7.35 cubic feet, and the density of seawater is about 64 pounds per cubic foot, so each barrel will be subject to a buoyant force of about 470 pounds.
In addition to this buoyant force, the water will impose a drag on the barrel as it sinks, impeding its descent. It is well known that objects sinking in a fluid are subject to a drag force which is proportional to a power of the velocity. Engineers had to determine the constant of proportionality and the exponent for a drum in seawater. After testing, they estimated that the drag force of the water was approximately equal to 0.5v^(√10/3) pounds, in which v is the velocity in feet per second.
Let y(t) be the depth of the drum in the water at time t, with downward chosen as the positive direction.
Let y = 0 at the (calm) surface of the water. Then v(t) = y'(t). The forces acting on the drum are the buoyant and drag forces (acting upward) and the force of gravity (acting downward). Since the force of gravity has magnitude mg, with m the mass of the drum, then by Newton's law,

m dv/dt = mg − 470 − 0.5v^(√10/3).
For this problem, mg = 535 pounds. Use g = 32 ft/sec² to determine that m = 16.7 slugs. Assume that the drum is released from rest at the surface of the water. The initial value problem for the velocity of the descending drum is

16.7 dv/dt = 535 − 470 − 0.5v^(√10/3),  v(0) = 0,

or

dv/dt = (1/16.7)[65 − 0.5v^(√10/3)],  v(0) = 0.
We want the velocity with which the drum hits bottom. One approach might give us a quick answer. It is not difficult to show that a drum sinking in seawater will have a terminal velocity. If the terminal velocity of the drum is less than 40 feet per second, then a drum released from rest will never reach a speed great enough to break it open upon impact with the ocean floor, regardless of the depth! Unfortunately, a quick calculation, letting dv/dt = 0, shows that the terminal velocity is about 100 feet per second, not even close to 40. This estimate is therefore inconclusive in determining whether the drums have a velocity of 40 feet per second upon impact at 300 feet.
We could try solving the differential equation for v(t) and integrating to get an equation for the depth at time t. Setting y(t) = 300 would then yield the time required for the drum to reach this depth, and we could put this time back into v(t) to see if the velocity exceeds
40 feet per second at this time. The differential equation is separable. However, solving it leads to the integral

∫ dv/(v^(√10/3) − 130),

which has no elementary evaluation.
Another approach would be to express the velocity as a function of the depth. A differential equation for v(y) can be obtained by writing

dv/dt = (dv/dy)(dy/dt) = v dv/dy.

This gives us the initial value problem

dv/dy = [65 − 0.5v^(√10/3)]/(16.7v),  v(0) = 0.   (5.1)
This equation is also separable, but we cannot perform the integrations needed to find v(y) explicitly.
We have reached a position that is common in modeling a real-world phenomenon. The model (5.1) does not admit a closed-form solution. At this point we will opt for a numerical approach to obtain approximate values for the velocity. But life is not this easy! If we attempt Euler's method on the problem with equation (5.1), we cannot even get started, because the initial condition is v(0) = 0, and v occurs in the denominator.
There is a way around this difficulty. Reverse perspective and look for depth as a function of velocity, y(v). We will then calculate y(40), the depth when the velocity reaches 40 feet per second. Since the velocity is an increasing function of the depth, if y(40) > 300 feet, we will know that the barrel could not have achieved a velocity of 40 feet per second when it reached the bottom. If y(40) < 300, then we will know that when the drum hits bottom it was moving at more than 40 feet per second, hence is likely to rupture. Since dy/dv = 1/(dv/dy), the initial value problem for y(v) is

dy/dv = 16.7v/[65 − 0.5v^(√10/3)],  y(0) = 0.

Write √10/3 ≈ 1.054 and apply Euler's method with h = 1 and n = 40. We get y(40) ≈ 268.2 feet. With h = 0.5 and n = 80 we get y(40) ≈ 272.3 feet. Further reductions in step size will provide better accuracy. With h = 0.1 and n = 400 we get y(40) ≈ 275.5 feet. Based on these numbers, it would appear that the drum will exceed 40 feet per second before it has fallen 300 feet, hence has a good chance of leaking dangerous material.
A more detailed analysis, using an error bound that we have not discussed, leads to the conclusion that the drum achieves a velocity of 40 feet per second somewhere between 272 and 279 feet, giving us confidence that it has reached this velocity by the time it lands on the ocean floor.
This led to the conclusion that the plan for storing radioactive waste materials in drums on the ocean floor is too dangerous to be feasible.
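The Euler computation for y(v) is easy to reproduce. A Python sketch (the function name is ours), using the rounded exponent √10/3 ≈ 1.054 as in the text:

```python
def drum_depth(v_target, h):
    """Euler's method for dy/dv = 16.7 v / (65 - 0.5 v**1.054), y(0) = 0.
    The exponent 1.054 approximates sqrt(10)/3.  Integrates from v = 0
    up to v = v_target in steps of size h and returns the depth reached."""
    n = round(v_target / h)
    v, y = 0.0, 0.0
    for _ in range(n):
        y += h * 16.7 * v / (65.0 - 0.5 * v**1.054)
        v += h
    return y

# Depth at which the drum reaches 40 ft/sec, for decreasing step sizes:
depths = [drum_depth(40.0, h) for h in (1.0, 0.5, 0.1)]
# roughly 268, 272, and 275 feet, each well short of the 300-foot depth,
# so the drum is moving faster than 40 ft/sec when it hits bottom.
```

Because the integrand increases with v, these Euler values climb toward the true depth as h shrinks, consistent with the 272-to-279-foot range quoted above.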
PROBLEMS
In each of Problems 1 through 6, generate approximate numerical solutions using h = 0.2 and twenty iterations, then h = 0.1 and forty iterations, and finally h = 0.05 and eighty iterations. Graph the approximate solutions on the same set of axes. Also obtain error bounds for each case. In each of Problems 1 through 5, obtain the exact solution and graph it with the approximate solutions.
1. y' = y sin(x); y(0) = 1
2. y' = x + y; y(1) = −3
3. y' = 3xy; y(0) = 5
4. y' = 2 − x; y(0) = 1
5. y' = y − cos(x); y(1) = −2
6. y' = x − y²; y(0) = 4
7. Approximate e as follows. Use Euler's method with h = 0.01 to approximate y(1), where y(x) is the solution of y' = y; y(0) = 1. Sketch a graph of the solution before applying Euler's method and determine whether the approximate value obtained is less than or greater than the actual value.
8. Approximate ln(2) by using Euler's method to approximate y(2), where y(x) is the solution of y' = 1/x; y(1) = 0. Use h = 0.01. Will this approximation be less than or greater than the actual value?
9. In the analysis of the radioactive waste disposal problem, how does the constant of proportionality for the drag on the drum affect the conclusion? Carry out the numerical analysis if the drag is 0.3v^(√10/3), and again for the case that the drag is 0.8v^(√10/3).
10. Try exponents other than √10/3 for the velocity in the disposal problem to gauge the effect of this number on the conclusion. In particular, perform the analysis if the drag equals 0.5v (1 is slightly less than √10/3) and again for a drag effect of 0.5v^(4/3) (4/3 is slightly greater than √10/3).
11. Suppose the drums are dropped over a part of the ocean having a depth of 340 feet. Will the drums be likely to rupture on impact with the ocean floor?
5.2  One-Step Methods
Euler's method is a one-step method, because the approximation at x_{k+1} depends only on the approximation at x_k, one step back. We will consider some other one-step methods for the initial value problem

y' = f(x, y);  y(x₀) = y₀.

As usual, let the step size be h, and denote x_k = x₀ + kh for k = 0, 1, 2, …, n.
5.2.1  The Second-Order Taylor Method

By Taylor's theorem with remainder (under certain conditions on f and h) we can write

y(x_{k+1}) = y(x_k) + hy'(x_k) + (1/2!)h²y''(x_k) + ⋯ + (1/m!)h^m y^(m)(x_k) + (1/(m+1)!)h^(m+1) y^(m+1)(ξ_k),

for some ξ_k in [x_k, x_{k+1}]. If y^(m+1)(x) is bounded, then the last term in this sum can be made as small as we like by choosing h small enough. We therefore form the approximation

y_{k+1} ≈ y(x_k) + hy'(x_k) + (1/2!)h²y''(x_k) + ⋯ + (1/m!)h^m y^(m)(x_k).

If m = 1, this is Euler's method, since y'(x_k) = f(x_k, y_k). Now let m = 2. Then

y_{k+1} ≈ y(x_k) + hy'(x_k) + (1/2)h²y''(x_k).   (5.2)
We know that y'(x) = f(x, y(x)). This suggests that in the approximation (5.2) we use f(x_k, y_k) as an approximation of y'(x_k) if y_k is an approximation of y(x_k). Thus consider

y'(x_k) ≈ f(x_k, y_k).

This leaves the term y''(x_k) in the approximation (5.2) to treat. First differentiate the expression y'(x) = f(x, y(x)) to get

y''(x) = ∂f/∂x (x, y) + ∂f/∂y (x, y) y'(x).

This suggests we consider

y''(x_k) ≈ ∂f/∂x (x_k, y_k) + ∂f/∂y (x_k, y_k) f(x_k, y_k).

Insert these approximations of y'(x_k) and y''(x_k) into the approximation (5.2) to get

y_{k+1} ≈ y_k + hf(x_k, y_k) + (1/2)h² [∂f/∂x (x_k, y_k) + ∂f/∂y (x_k, y_k) f(x_k, y_k)].

This is a one-step method, because y_{k+1} is obtained from information at x_k, one step back from x_{k+1}.
DEFINITION 5.2  Second-Order Taylor Method

The second-order Taylor method consists of approximating y(x_{k+1}) by the expression

y_{k+1} = y_k + hf(x_k, y_k) + (1/2)h² [∂f/∂x (x_k, y_k) + ∂f/∂y (x_k, y_k) f(x_k, y_k)].
This expression can be simplified by adopting the notation

f_k = f(x_k, y_k),  ∂f/∂x = f_x,  ∂f/∂y = f_y,

and

∂f/∂x (x_k, y_k) = (f_x)_k = f_xk,  ∂f/∂y (x_k, y_k) = (f_y)_k = f_yk.

Now the formula is

y_{k+1} ≈ y_k + hf_k + (1/2)h² (f_xk + f_k f_yk).
EXAMPLE 5.3

Consider y' = y² cos(x); y(0) = 1/5. With f(x, y) = y² cos(x), we have f_x = −y² sin(x) and f_y = 2y cos(x). Using h = 0.2 and n = 20, form

y_{k+1} = y_k + 0.2y_k² cos(x_k) + 0.02 [−y_k² sin(x_k) + 2y_k³ cos²(x_k)].

This problem can be solved exactly, and we obtain y(x) = 1/(5 − sin(x)). Figure 5.9 shows a graph of this solution, together with a smooth curve drawn through the approximated function values. The student should redo the approximation, using h = 0.1 and n = 40 for comparison.

FIGURE 5.9  Exact and second-order Taylor approximate solutions of y' = y² cos(x); y(0) = 1/5.

The Euler approximations for this example, with h = 0.2, are

y_{k+1} = y_k + (0.2)y_k² cos(x_k).
It is instructive to compute these approximations for n = 20, and compare the accuracy of the Euler method with that of the second-order Taylor method for this problem. ■

5.2.2  The Modified Euler Method
Near the end of the nineteenth century the German mathematician Carl Runge noticed a similarity between part of the formula for the second-order Taylor method and another Taylor polynomial approximation. Write the second-order Taylor formula as

y_{k+1} = y_k + h [f_k + (1/2)h (f_x(x_k, y_k) + f_k f_y(x_k, y_k))].   (5.3)

Runge observed that the term in square brackets on the right side of this equation resembles the Taylor approximation

f(x_k + αh, y_k + βhf_k) ≈ f_k + αhf_x(x_k, y_k) + βhf_k f_y(x_k, y_k).

In fact, the term in square brackets in equation (5.3) is exactly the right side of the last equation if we choose α = β = 1/2. This suggests the approximation

y_{k+1} ≈ y_k + hf(x_k + h/2, y_k + (h/2)f_k).

DEFINITION 5.3  Modified Euler Method

The modified Euler method consists of defining the approximation y_{k+1} by

y_{k+1} = y_k + hf(x_k + h/2, y_k + (h/2)f_k).

The method is in the spirit of Euler's method, except that f(x, y) is evaluated at (x_k + h/2, y_k + hf_k/2) instead of at (x_k, y_k). Notice that x_k + h/2 is midway between x_k and x_{k+1}.
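The modified Euler iteration can be sketched as follows (Python, names ours), applied to y' = y/x + 2x², y(1) = 4, whose exact solution y = x³ + 3x gives y(5) = 140:

```python
def modified_euler(f, x0, y0, h, n):
    """Modified Euler: y_{k+1} = y_k + h f(x_k + h/2, y_k + (h/2) f_k)."""
    x, y = x0, y0
    for _ in range(n):
        fk = f(x, y)
        y += h * f(x + h / 2.0, y + h * fk / 2.0)
        x += h
    return y

# y' = y/x + 2 x**2, y(1) = 4, with exact solution y = x**3 + 3x.
approx = modified_euler(lambda x, y: y / x + 2.0 * x**2, 1.0, 4.0, 0.2, 20)
# approx, the value at x = 5, is about 139.7 against the exact y(5) = 140.
```

The single midpoint evaluation of f per step is what buys the improvement over Euler's method.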
EXAMPLE 5.4

Consider

y' − (1/x)y = 2x²;  y(1) = 4.

Write the differential equation as y' = y/x + 2x² = f(x, y). Using the modified Euler method with h = 0.2 and n = 20 iterations, generate the approximate solution values given in Table 5.5.

TABLE 5.5  Values of the Solution of y' = (y/x) + 2x²; y(1) = 4

The exact solution of this initial value problem is y(x) = x³ + 3x. The graph of this solution, together with a smooth curve drawn through the approximated values, are shown in Figure 5.10. The two curves coincide in the scale of the drawing. For example, y(5) = 140, while y₂₀, the approximated solution value at x₂₀ = 1 + 20(0.2) = 5, is 139.7. This small a difference does not show up on the graph.
FIGURE 5.10  Exact and modified Euler approximation of the solution of y' − (1/x)y = 2x²; y(1) = 4.
We leave it for the student to do this example using the other approximation schemes for comparison. For this initial value problem, the other methods with h = 0.2 give:

Euler:

y_{k+1} = y_k + 0.2 (y_k/x_k + 2x_k²);

Modified Euler:

y_{k+1} = y_k + 0.2 [(y_k + 0.1(y_k/x_k + 2x_k²))/(x_k + 0.1) + 2(x_k + 0.1)²];

Second-order Taylor:

y_{k+1} = y_k + 0.2 (y_k/x_k + 2x_k²) + 0.02 [−y_k/x_k² + 4x_k + (1/x_k)(y_k/x_k + 2x_k²)].
5.2.3  Runge-Kutta Methods

An entire class of one-step methods is generated by replacing the right side in the modified Euler method with the general form

af_k + bf(x_k + αh, y_k + βhf_k).

The idea is to choose the constants a, b, α, and β to obtain an approximation with as favorable an error bound as possible. The fourth-order Runge-Kutta method (known as RK4) has proved both computationally efficient and accurate, and is obtained by a clever choice of such constants in approximating slopes at various points. Without derivation, we will state the method.
DEFINITION 5.4  RK4

The RK4 method of approximation is to define y_{k+1} in terms of y_k by

y_{k+1} = y_k + (h/6)[W_k1 + 2W_k2 + 2W_k3 + W_k4],

where

W_k1 = f_k,
W_k2 = f(x_k + h/2, y_k + hW_k1/2),
W_k3 = f(x_k + h/2, y_k + hW_k2/2),
W_k4 = f(x_k + h, y_k + hW_k3).
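The four slope evaluations translate directly into code. A Python sketch (names ours), with a quick accuracy check on y' = y, y(0) = 1, whose solution is eˣ:

```python
import math

def rk4(f, x0, y0, h, n):
    """RK4: y_{k+1} = y_k + (h/6)(W1 + 2 W2 + 2 W3 + W4)."""
    x, y = x0, y0
    for _ in range(n):
        w1 = f(x, y)
        w2 = f(x + h / 2.0, y + h * w1 / 2.0)
        w3 = f(x + h / 2.0, y + h * w2 / 2.0)
        w4 = f(x + h, y + h * w3)
        y += h * (w1 + 2.0 * w2 + 2.0 * w3 + w4) / 6.0
        x += h
    return y

# Accuracy check on y' = y, y(0) = 1, whose exact value at x = 1 is e.
approx = rk4(lambda x, y: y, 0.0, 1.0, 0.1, 10)
error = abs(approx - math.e)   # on the order of 1e-6 with h = 0.1
```

Halving h here shrinks the error by roughly a factor of 16, the signature of a fourth-order method.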
FIGURE 5.11  Runge-Kutta and modified Euler approximations of the solution of y' = (1/y) cos(x + y); y(0) = 1.

Table 5.6 shows the computed values, and Figure 5.11 shows graphs drawn through the approximated points by both methods. The two graphs are in good agreement as x nears 2, but then they diverge from each other. This divergence can be seen in the table. In general, RK4 is more accurate than modified Euler, particularly as the distance increases from the point where the initial data is specified.
It can be shown that the Taylor and modified Euler methods are of order h², while RK4 is of order h⁴. Since usually 0 < h < 1, h⁴ < h² < h, so accuracy improves much faster by choosing smaller h with RK4 than with the other methods. There are higher-order Runge-Kutta
methods which are of order h^p for larger p. Such methods offer greater accuracy, but usually at a cost of more computing time.
All of the methods of this section are one-step methods, which have the general form y_{k+1} = y_k + φ(x_k, y_k). In the next section, we will discuss multistep methods.
PROBLEMS

In each of Problems 1 through 6, use the modified Euler, Taylor, and RK4 methods to approximate the solution, first using h = 0.2 with twenty iterations, then h = 0.1 with forty iterations, and finally h = 0.05 with eighty iterations. Graph the approximate solutions for each method, and for a given h, on the same set of axes.
1. y' = sin(x + y); y(0) = 2
2. y' = y − x²; y(1) = −4. Also solve this problem exactly and include a graph of the exact solution with graphs of the approximate solutions.
3. y' = cos(y) + e^x; y(0) = 1
4. y' = y³ − 2xy; y(3) = 2
5. y' = −y + e^(−x); y(0) = 4. Also solve this problem exactly and include a graph of the exact solution with graphs of the approximate solutions.
6. y' = sec(1/y) − xy²; y(π/4) = 1
7. Do Problem 3 of Section 5.1 with RK4 instead of Euler. Which method yields the better result?
8. Do Problem 5 of Section 5.1 with RK4 instead of Euler. Which method yields the better result?
9. Derive the improved Euler method, also known as the Heun method, as follows. Begin with Euler's method and replace f_k with (f_k + f_{k+1})/2. Next replace y_{k+1} in f_{k+1} with y_k + hf_k. The result should be the approximation scheme

y_{k+1} = y_k + (h/2)[f_k + f(x_{k+1}, y_k + hf_k)].

In each of Problems 10 through 12, use Euler, modified Euler, and improved Euler to approximate the solution. Use h = 0.2 with n = 20, then h = 0.1 with n = 40, and then h = 0.05 with n = 80. Graph the approximate solutions, for each h, on the same set of axes. Whenever the solution can be found in exact form, graph this solution with the approximate solutions for comparison.
10. y' = 1 − y; y(0) = 2
5.3  Multistep Methods

We continue with the problem y' = f(x, y); y(x₀) = y₀. The solution is y(x), and we presumably do not have a "good" expression for this function. We want to obtain approximate values y_k of y(x_k), where x_k = x₀ + kh for k = 0, 1, …, n.
The basis for some multistep methods is the informal belief that, if p_k(x) is a polynomial that approximates f(x, y(x)) on [x_k, x_{k+1}], then ∫[x_k, x_{k+1}] p_k(x) dx should approximate ∫[x_k, x_{k+1}] f(x, y(x)) dx. Now write

y(x_{k+1}) − y(x_k) = ∫[x_k, x_{k+1}] y'(x) dx = ∫[x_k, x_{k+1}] f(x, y(x)) dx ≈ ∫[x_k, x_{k+1}] p_k(x) dx.

Therefore

y(x_{k+1}) ≈ y(x_k) + ∫[x_k, x_{k+1}] p_k(x) dx.   (5.4)

So far this is deliberately vague. Nevertheless, this proposed approximation (5.4) contains the germ of an idea which we will now pursue.
First we must decide how to choose the polynomials p_k(x). Suppose we have somehow arrived at satisfactory approximations y_k, y_{k−1}, …, y_{k−r} to the solution at x_k, x_{k−1}, …, x_{k−r}, respectively. Then

f_k = f(x_k, y_k) ≈ f(x_k, y(x_k)),
f_{k−1} = f(x_{k−1}, y_{k−1}) ≈ f(x_{k−1}, y(x_{k−1})),
⋮
f_{k−r} = f(x_{k−r}, y_{k−r}) ≈ f(x_{k−r}, y(x_{k−r})).

Keep in mind here that y(x_k) is the exact solution evaluated at x_k (this is unknown to us), and y_k is an approximation of this solution value, obtained by some scheme. Now choose p_k(x) to be the polynomial of degree r passing through the points

(x_k, f_k), (x_{k−1}, f_{k−1}), …, (x_{k−r}, f_{k−r}).

These r + 1 points uniquely determine the degree-r polynomial p_k(x). When this polynomial is inserted into the approximation scheme (5.4), we obtain a multistep approximation method in which the approximation y_{k+1} of y(x_{k+1}) is defined by

y_{k+1} = y_k + ∫[x_k, x_{k+1}] p_k(x) dx.   (5.5)

We obtain different methods for different choices of r. Consider some cases of interest.
5.3.1  Case 1: r = 0

Now p_k(x) is a zero-degree polynomial, or constant. Specifically, p_k(x) = f_k for x_k ≤ x ≤ x_{k+1}. The approximation scheme defined by equation (5.5) becomes

y_{k+1} = y_k + ∫[x_k, x_{k+1}] f_k dx = y_k + f_k [x_{k+1} − x_k] = y_k + hf_k

for k = 0, 1, 2, …, n − 1. This is Euler's method, a one-step method.
5.3.2  Case 2: r = 1

Now p_k(x) is a first-degree polynomial whose graph is the straight line through (x_k, f_k) and (x_{k−1}, f_{k−1}). Therefore

p_k(x) = −(1/h)(x − x_k) f_{k−1} + (1/h)(x − x_{k−1}) f_k.

Upon inserting this into the scheme (5.5) we get

y_{k+1} = y_k + ∫[x_k, x_{k+1}] [−(1/h)(x − x_k) f_{k−1} + (1/h)(x − x_{k−1}) f_k] dx.
A routine integration, which we omit, yields

y_{k+1} = y_k + (h/2)[3f_k − f_{k−1}]

for k = 1, 2, …, n − 1. This is a two-step method, because computation of y_{k+1} requires prior computation of information at the two points x_k and x_{k−1}.
For larger r the idea is the same, but the details are more involved, because p_k(x) is of degree r and the integral in equation (5.5) is more involved, though still elementary. Here are the final results for two more cases.

5.3.3  Case 3: r = 2

y_{k+1} = y_k + (h/12)[23f_k − 16f_{k−1} + 5f_{k−2}]
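The "routine integration" omitted for the r = 1 case is easy to spot-check numerically: the integrand is linear, so Simpson's rule evaluates its integral exactly. A sketch (the sample values are chosen arbitrarily):

```python
def p_k(x, xk, h, fk_minus_1, fk):
    """The linear interpolant through (x_{k-1}, f_{k-1}) and (x_k, f_k)."""
    return -(x - xk) / h * fk_minus_1 + (x - (xk - h)) / h * fk

def simpson(g, a, b):
    """Simpson's rule on one interval; exact for polynomials of degree <= 3."""
    return (b - a) * (g(a) + 4.0 * g((a + b) / 2.0) + g(b)) / 6.0

# Arbitrary sample data for x_k, h, f_{k-1}, and f_k:
xk, h, fk1, fk = 1.3, 0.2, -0.7, 2.5

integral = simpson(lambda x: p_k(x, xk, h, fk1, fk), xk, xk + h)
two_step = h / 2.0 * (3.0 * fk - fk1)   # the stated result (h/2)(3 f_k - f_{k-1})
```

The two values agree to roundoff, confirming the two-step coefficients 3 and −1.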
for k = 2, 3, …, n − 1. This is a three-step method, requiring information at three points to compute the approximation y_{k+1}.

5.3.4  Case 4: r = 3

y_{k+1} = y_k + (h/24)[55f_k − 59f_{k−1} + 37f_{k−2} − 9f_{k−3}]   (5.6)
for k = 3, 4, …, n − 1. This is a four-step method.
We might expect multistep methods to improve in accuracy as the number of steps increases, since more information is packed into the computation of the approximation at the next point. This is in general true: an (r + 1)-step method built from an interpolating polynomial of degree r on each subinterval has error of order O(h^(r+1)). The cost of the improved accuracy is that the polynomials must be computed on each interval, and more data goes into the computation of each successive y_{k+1}. The schemes just given for r = 1, 2, and 3 are called Adams-Bashforth multistep methods.
One drawback of a multistep method is that some other method must be used to initiate it. For example, equation (5.6) involves f_{k−3}, and so is only valid for k = 3, 4, …, n − 1. Some other scheme must be used to start it by feeding in y₁, y₂, and y₃ (and, of course, y₀ is given as information). Often RK4 is used as an initiator in computing these first values.
Another class of multistep methods, called Adams-Moulton methods, is obtained by using different data points to determine the interpolating polynomial p_k(x) used in equation (5.4). For r = 3, p_k(x) is now chosen as the unique third-degree polynomial passing through (x_{k−2}, f_{k−2}), (x_{k−1}, f_{k−1}), (x_k, f_k), and (x_{k+1}, f_{k+1}). This leads to the approximating scheme

y_{k+1} = y_k + (h/24)[9f_{k+1} + 19f_k − 5f_{k−1} + f_{k−2}].   (5.7)

This Adams-Moulton method has error of order O(h⁴).
There is a significant difference between the Adams-Bashforth method (5.6) and the Adams-Moulton method (5.7). The former determines y_{k+1} in terms of previously computed quantities, and is said to be explicit. The latter contains y_{k+1} on both sides of equation (5.7), because f_{k+1} = f(x_{k+1}, y_{k+1}), and therefore defines y_{k+1} implicitly. Equation (5.7) only provides an equation containing y_{k+1}, from which y_{k+1} must then be extracted, a perhaps nontrivial task.
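The explicit four-step Adams-Bashforth scheme (5.6), with RK4 supplying the starting values y₁, y₂, y₃ as suggested above, can be sketched as follows (Python, names ours):

```python
import math

def adams_bashforth4(f, x0, y0, h, n):
    """Four-step Adams-Bashforth scheme (5.6), n >= 3 steps total:
    y_{k+1} = y_k + (h/24)(55 f_k - 59 f_{k-1} + 37 f_{k-2} - 9 f_{k-3}),
    with RK4 used to generate the starting values y1, y2, y3."""
    xs = [x0 + k * h for k in range(n + 1)]
    ys = [y0]
    for k in range(3):                     # RK4 starter
        x, y = xs[k], ys[k]
        w1 = f(x, y)
        w2 = f(x + h / 2.0, y + h * w1 / 2.0)
        w3 = f(x + h / 2.0, y + h * w2 / 2.0)
        w4 = f(x + h, y + h * w3)
        ys.append(y + h * (w1 + 2.0 * w2 + 2.0 * w3 + w4) / 6.0)
    fs = [f(xs[k], ys[k]) for k in range(4)]
    for k in range(3, n):                  # explicit multistep phase
        y_next = ys[k] + h / 24.0 * (55.0 * fs[k] - 59.0 * fs[k - 1]
                                     + 37.0 * fs[k - 2] - 9.0 * fs[k - 3])
        ys.append(y_next)
        fs.append(f(xs[k + 1], y_next))
    return ys

# Check against y' = y, y(0) = 1: the value at x = 1 should be close to e.
approx = adams_bashforth4(lambda x, y: y, 0.0, 1.0, 0.05, 20)[-1]
error = abs(approx - math.e)
```

Each multistep iteration costs only one new evaluation of f, against four for RK4; that economy is the usual reason for preferring a multistep scheme once it is started.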
PROBLEMS

In each of Problems 1 through 5, use the Taylor, modified Euler, and RK4 methods to approximate solution values. First use h = 0.2 with 20 iterations, then h = 0.1 with 40 iterations, then h = 0.05 with 80 iterations.
In each of Problems 6, 7, and 8, use the Adams-Bashforth-Moulton scheme, first with h = 0.2 and twenty iterations, then with h = 0.1 and forty iterations.
6. y' = y − x³; y(−2) = −4
7. y' = 2xy − y³; y(0) = 2
8. y' = ln(x) + x²y; y(2) = 1
9. Carry out the details for deriving the two-step scheme stated for the case r = 1.
10. Carry out the details for deriving the three-step scheme stated for the case r = 2.
11. Every one-step and multistep method we have considered is a special case of the general expression

y_{k+1} = Σ (j = 1 to m) a_j y_{k+1−j} + hφ(x_{k+1−m}, …, x_k, x_{k+1}, y_{k+1−m}, …, y_k, y_{k+1}).

By making appropriate choices of m, the a_j's, and φ, show how this formula gives Euler's method, the modified Euler method, the Taylor method, RK4, and the Adams-Bashforth method.
Vectors and Linear Algebra

CHAPTER 6  Vectors and Vector Spaces
CHAPTER 7  Matrices and Systems of Linear Equations
CHAPTER 8  Determinants
CHAPTER 9  Eigenvalues, Diagonalization, and Special Matrices
Some quantities are completely determined by their magnitude, or "size." This is true of temperature and mass, which are numbers referred to some scale or measurement system. Such quantities are called scalars. Length, volume, and distance are other scalars. By contrast, a vector carries with it a sense of both magnitude and direction. The effect of a push against an object will depend not only on the magnitude or strength of the push, but also on the direction in which it is exerted.
This part is concerned with the notation and algebra of vectors and objects called matrices. This algebra will be used to solve systems of linear algebraic equations and systems of linear differential equations. It will also give us the machinery needed for the qualitative study of systems of differential equations (Part 3), in which we attempt to determine the behavior and properties of solutions when we cannot write these solutions explicitly or in closed form. In Part 4, vector algebra will be used to develop vector calculus, which extends derivatives and integrals to higher dimensions, with applications to models of physical systems, partial differential equations, and the analysis of complex-valued functions.
CHAPTER 6

Vectors and Vector Spaces

THE ALGEBRA AND GEOMETRY OF VECTORS  THE DOT PRODUCT  THE CROSS PRODUCT  THE VECTOR SPACE Rⁿ  LINEAR INDEPENDENCE  SPANNING SETS AND DIMENSION IN Rⁿ  ABSTRACT VECTOR SPACES
6.1  The Algebra and Geometry of Vectors

When dealing with vectors, a real number is often called a scalar. The temperature of an object and the grade of a motor oil are scalars. We want to define the concept of a vector in such a way that the package contains information about both direction and magnitude. One way to do this is to define a vector (in 3-dimensional space) as an ordered triple of real numbers.
DEFINITION 6.1  Vector

A vector is an ordered triple (a, b, c), in which a, b, and c are real numbers.
We represent the vector (a, b, c) as an arrow from the origin (0, 0, 0) to the point (a, b, c) in 3-space, as in Figure 6.1. In this way, the direction indicated by the arrow, as viewed from the origin, gives the direction of the vector. The length of the arrow is the magnitude (or norm) of the vector: a longer arrow represents a vector of greater strength. Since the distance from the origin to the point (a, b, c) is √(a² + b² + c²), we will define this number to be the magnitude of the vector (a, b, c).
DEFINITION 6.2  Norm of a Vector

The norm, or magnitude, of a vector (a, b, c) is the number ‖(a, b, c)‖ defined by

‖(a, b, c)‖ = √(a² + b² + c²).
FIGURE 6.1  The vector (a, b, c) is represented by the arrow from (0, 0, 0) to the point (a, b, c).

FIGURE 6.2  ‖(−1, 4, 1)‖ = √18.

FIGURE 6.3  Parallel representations of the same vector.
For example, the norm of (−1, 4, 1) is ‖(−1, 4, 1)‖ = √(1 + 16 + 1) = √18. This is the length of the arrow from the origin to the point (−1, 4, 1) (Figure 6.2).
The only vector that is not represented by an arrow from the origin is the zero vector (0, 0, 0), which has zero magnitude and no direction. It is, however, useful to have a zero vector, because various forces in a physical process may cancel each other, resulting in a zero force or vector.
The number a is the first component of (a, b, c), b is the second component, and c the third component. Two vectors are equal if and only if their respective components are equal:

(a, b, c) = (u, v, w) if and only if a = u, b = v, and c = w.

We will usually denote scalars (real numbers) by letters in regular typeface (a, b, c, A, B, …), and vectors by letters in boldface (a, b, c, A, B, …). The zero vector is denoted O.
Although there is a difference between a vector (an ordered triple) and an arrow (a visual representation of a vector), we often speak of vectors and arrows interchangeably. This is useful in giving geometric interpretations to vector operations. However, any two arrows having the same length and the same direction are said to represent the same vector. In Figure 6.3, all the arrows represent the same vector.
We will now develop algebraic operations with vectors and relate them to the norm.
DEFINITION 6.3  Scalar Multiplication

The product of a real number α with a vector F = (a, b, c) is denoted αF, and is defined by

αF = (αa, αb, αc).

Thus a vector is multiplied by a scalar by multiplying each component by the scalar. For example, 3(2, −5, 1) = (6, −15, 3) and −5(−4, 2, 10) = (20, −10, −50). The following relationship between the norm and the product of a scalar with a vector leads to a simple geometric interpretation of this operation.
THEOREM 6.1

Let F be a vector and α a scalar. Then
1. ‖αF‖ = |α| ‖F‖.
2. ‖F‖ = 0 if and only if F = O.

Proof  If F = (a, b, c), then αF = (αa, αb, αc), so

‖αF‖ = √(α²a² + α²b² + α²c²) = |α| √(a² + b² + c²) = |α| ‖F‖.

This proves conclusion (1). For (2), first recall that O = (0, 0, 0), so

‖O‖ = √(0² + 0² + 0²) = 0.

Conversely, if ‖F‖ = 0, then a² + b² + c² = 0, hence a = b = c = 0 and F = O. ■
Consider this product of a scalar with a vector from a geometric point of view. By (1) of the theorem, the length of αF is |α| times the length of F. Multiplying by α lengthens the arrow representing F if |α| > 1, and shrinks it to a shorter arrow if 0 < |α| < 1. Of course, if α = 0 then αF is the zero vector, with zero length. But the algebraic sign of α has an effect as well. If α is positive, then αF has the same direction as F, while if α is negative, αF has the opposite direction.
EXAMPLE 6.1

Let F = (2, 4, 1), as shown in Figure 6.4. 3F = (6, 12, 3) is along the same direction as F, but is represented by an arrow three times as long. But −3F = (−6, −12, −3), while being three times longer than F, is in the direction opposite that of F through the origin. And (1/2)F = (1, 2, 1/2) is in the same direction as F, but half as long. ■

FIGURE 6.4  Scalar multiples of a vector.
In particular, the scalar product of −1 with F = (a, b, c) is the vector (−a, −b, −c), having the same length as F but the opposite direction. This vector is called "minus F," or the negative of F, and is denoted −F.
Consistent with the interpretation of multiplication of a vector by a scalar, we define two vectors F and G to be parallel if each is a nonzero scalar multiple of the other. Of course, if F = αG and α ≠ 0, then G = (1/α)F. Parallel vectors may differ in length, and even be in opposite directions, but the straight lines through arrows representing these vectors are parallel lines in 3-space.
The algebraic sum of two vectors is defined as follows.
DEFINITION 6.4
Vector Su m
The sum of F = (a l , b 1 , c l ) and G = (a2 , b2 , c 2) is the vecto r F+G=(a 1 +a2 ,b i +b2,c i +
That is, we add vectors by adding respective components. For example,

(-4, π, 2) + (16, 1, -5) = (12, π + 1, -3).

If F = (a1, b1, c1) and G = (a2, b2, c2), then the sum of F with -G is (a1 - a2, b1 - b2, c1 - c2). It is natural to denote this vector as F - G and refer to it as "F minus G." For example, (-4, π, 2) minus (16, 1, -5) is

(-4, π, 2) - (16, 1, -5) = (-20, π - 1, 7).

We therefore subtract two vectors by subtracting their respective components. Vector addition and multiplication of a vector by a scalar have the following computational properties.
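Componentwise addition and subtraction translate directly into code; `add` and `sub` below are our own names, and the assertions reproduce the text's worked examples.

```python
import math

def add(F, G):
    # Add vectors by adding respective components.
    return tuple(f + g for f, g in zip(F, G))

def sub(F, G):
    # Subtract vectors by subtracting respective components.
    return tuple(f - g for f, g in zip(F, G))

# The worked examples from the text:
assert add((-4, math.pi, 2), (16, 1, -5)) == (12, math.pi + 1, -3)
assert sub((-4, math.pi, 2), (16, 1, -5)) == (-20, math.pi - 1, 7)
```
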
THEOREM 6.2 Algebra of Vectors

Let F, G, and H be vectors and let a and β be scalars. Then
1. F + G = G + F.
2. F + (G + H) = (F + G) + H.
3. F + O = F.
4. a(F + G) = aF + aG.
5. (a + β)F = aF + βF.
6. (aβ)F = a(βF).
Conclusion (1) is the commutative law for vector addition, and (2) is the associative law. Conclusion (3) states that the zero vector behaves with vectors as the number zero does with real numbers, as far as addition is concerned. The theorem is proved by routine calculations, using the properties of real-number arithmetic. For example, to prove (4), write F = (a1, b1, c1) and G = (a2, b2, c2). Then

a(F + G) = a(a1 + a2, b1 + b2, c1 + c2)
= (a(a1 + a2), a(b1 + b2), a(c1 + c2))
= (aa1 + aa2, ab1 + ab2, ac1 + ac2)
= (aa1, ab1, ac1) + (aa2, ab2, ac2)
= a(a1, b1, c1) + a(a2, b2, c2) = aF + aG.
Vector addition has a simple geometric interpretation. If F and G are represented as arrows from the same point P, as in Figure 6.5, then F + G is represented as the arrow from P to the opposite vertex of the parallelogram having F and G as two incident sides. This is called the parallelogram law for vector addition.
6.1 The Algebra and Geometry of Vectors
FIGURE 6.5 Parallelogram law for vector addition.
FIGURE 6.6 Another way of visualizing the parallelogram law.
FIGURE 6.7 Unit vectors along the axes.
The parallelogram law suggests a strategy for visualizing addition that is sometimes useful. Since two arrows having the same direction and length represent the same vector, we can apply the parallelogram law to form F + G as in Figure 6.6, in which the arrow representing G is drawn from the tip of F rather than from a common initial point with F. We often do this in visualizing computations with vectors.
Any vector can be written as a sum of scalar multiples of "standard" vectors as follows. Define

i = (1, 0, 0),  j = (0, 1, 0),  k = (0, 0, 1).

These are unit vectors (length 1) aligned along the three coordinate axes in the positive direction (Figure 6.7). In terms of these vectors,

F = (a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = ai + bj + ck.

This is called the standard representation of F. When a component of F is zero, we usually just omit it in the standard representation. For example, (-3, 0, 1) = -3i + k.
Figure 6.8 shows two points P1(a1, b1, c1) and P2(a2, b2, c2). It will be useful to know the vector represented by the arrow from P1 to P2. Let H be this vector. Denote

G = a1i + b1j + c1k and F = a2i + b2j + c2k.

By the parallelogram law (Figure 6.9), G + H = F. Hence

H = F - G = (a2 - a1)i + (b2 - b1)j + (c2 - c1)k.

For example, the vector represented by the arrow from (-2, 4, 1) to (14, 5, -7) is 16i + j - 8k. The vector from (14, 5, -7) to (-2, 4, 1) is the negative of this, or -16i - j + 8k.
Vector notation and algebra are often useful in solving problems in geometry. This is not our goal here, but the reasoning involved is often useful in solving problems in the sciences and engineering. We will give three examples to demonstrate the efficiency of thinking in terms of vectors.
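The arrow from P1 to P2 is computed component by component; a minimal sketch (the function name is ours) reproduces the text's example.

```python
def vector_between(P1, P2):
    # H = F - G: subtract the coordinates of the initial point
    # from those of the terminal point.
    return tuple(q - p for p, q in zip(P1, P2))

# The arrow from (-2, 4, 1) to (14, 5, -7) is 16i + j - 8k.
assert vector_between((-2, 4, 1), (14, 5, -7)) == (16, 1, -8)
# The reverse arrow is the negative: -16i - j + 8k.
assert vector_between((14, 5, -7), (-2, 4, 1)) == (-16, -1, 8)
```
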
FIGURE 6.8
FIGURE 6.9 The arrow from (a1, b1, c1) to (a2, b2, c2) is (a2 - a1)i + (b2 - b1)j + (c2 - c1)k.
EXAMPLE 6.2
Suppose we want the equation of the line L through the points (1, -2, 4) and (6, 2, -3). This problem is more subtle in 3-space than in the plane, because in three dimensions there is no point-slope formula. Reason as follows. Let (x, y, z) be any point on L. Then (Figure 6.10) the vector represented by the arrow from (1, -2, 4) to (x, y, z) must be parallel to the vector from (1, -2, 4) to (6, 2, -3), because arrows representing these vectors are both along L. This means that (x - 1)i + (y + 2)j + (z - 4)k is parallel to 5i + 4j - 7k. Then, for some scalar t,

(x - 1)i + (y + 2)j + (z - 4)k = t[5i + 4j - 7k].

But then the respective components of these vectors must be equal:

x - 1 = 5t,  y + 2 = 4t,  z - 4 = -7t.

Then

x = 1 + 5t,  y = -2 + 4t,  z = 4 - 7t.  (6.1)
A point is on L if and only if its coordinates are (1 + 5t, -2 + 4t, 4 - 7t) for some real number t (Figure 6.11). Equations (6.1) are parametric equations of the line, with t, which can be assigned any real value, as parameter. When t = 0 we get (1, -2, 4), and when t = 1 we get (6, 2, -3).
We can also write the equation of this line in what is called normal form. By eliminating t, this form is

(x - 1)/5 = (y + 2)/4 = (z - 4)/(-7).

We may also envision the line as swept out by the arrow pivoted at the origin and extending to the point (1 + 5t, -2 + 4t, 4 - 7t) as t varies over the real numbers.
Some care must be taken in writing the normal form of a straight line. For example, the line through (2, -1, 6) and (-4, -1, 2) has parametric equations

x = 2 - 6t,  y = -1,  z = 6 - 4t.

If we eliminate t, we get

(x - 2)/(-6) = (z - 6)/(-4),  y = -1.
FIGURE 6.10
FIGURE 6.11
Every point on the line has second coordinate -1, and this is independent of t. This information must not be omitted from the equations of the line. If we omit y = -1 and write just

(x - 2)/(-6) = (z - 6)/(-4),

then we have the equation of a plane, not a line.
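The parametric form of Example 6.2 can be sketched as a function of the parameter t (the helper name is ours):

```python
def line_through(P, Q):
    # Direction vector from P to Q, then the point P + t*(Q - P).
    d = tuple(q - p for p, q in zip(P, Q))
    return lambda t: tuple(p + t * di for p, di in zip(P, d))

L = line_through((1, -2, 4), (6, 2, -3))
assert L(0) == (1, -2, 4)     # t = 0 gives the first point
assert L(1) == (6, 2, -3)     # t = 1 gives the second point
assert L(2) == (11, 6, -10)   # (1 + 5t, -2 + 4t, 4 - 7t) at t = 2
```
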
EXAMPLE 6.3
Suppose we want a vector F in the x, y-plane making an angle of π/7 with the positive x-axis and having magnitude 19. By "find a vector" we mean determine its components. Let F = ai + bj. From the right triangle in Figure 6.12,

cos(π/7) = a/19 and sin(π/7) = b/19.

Then F = 19 cos(π/7)i + 19 sin(π/7)j.
FIGURE 6.12
EXAMPLE 6.4
We will prove that the line segments formed by connecting successive midpoints of the sides of a quadrilateral form a parallelogram. Again, our overall objective is not to prove theorems of geometry, but this argument is good practice in the use of vectors. Figure 6.13 illustrates what we want to show. Draw the quadrilateral again, with arrows (vectors) A, B, C, and D as sides. The vectors x, y, u, and v drawn with dashed lines connect
FIGURE 6.13
FIGURE 6.14
the midpoints of successive sides (Figure 6.14). We want to show that x and u are parallel and of the same length, and that y and v are also parallel and of the same length.
From the parallelogram law for vector addition and the definitions of x and u,

x = (1/2)A + (1/2)B and u = (1/2)C + (1/2)D.

But also by the parallelogram law, A + B is the arrow from P0 to P1, while C + D is the arrow from P1 to P0. These arrows have the same length and opposite directions. This means that A + B = -(C + D). But then x = -u, so these vectors are parallel and of the same length (just opposite in direction). A similar argument shows that y and v are also parallel and of the same length, completing the proof.
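The midpoint argument can be confirmed numerically for a particular quadrilateral; the helper names and the choice of vertices below are ours.

```python
def midpoint(P, Q):
    return tuple((p + q) / 2 for p, q in zip(P, Q))

def vec(P, Q):
    return tuple(q - p for p, q in zip(P, Q))

# Vertices of a quadrilateral, listed in order around its boundary.
P = [(0.0, 0.0, 0.0), (4.0, 1.0, 0.0), (5.0, 5.0, 2.0), (-1.0, 3.0, 1.0)]
M = [midpoint(P[i], P[(i + 1) % 4]) for i in range(4)]

x = vec(M[0], M[1]); u = vec(M[2], M[3])
y = vec(M[1], M[2]); v = vec(M[3], M[0])

# Opposite sides of the midpoint figure are negatives of each other,
# hence parallel and of equal length: a parallelogram.
assert x == tuple(-c for c in u)
assert y == tuple(-c for c in v)
```
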
In each of Problems 1 through 5, compute F + G, F - G, ||G||, 2F, and 3G.

1. F = 2i - 3j + 5k, G = Ji + 6j - 5k
2. F = i - 3k, G = 4j
3. F = 2i - 5j, G = i + 5j - k
4. F = /i + j - 6k, G = 8i + 2k
5. F = i + j + k, G = 2i - 2j + 2k

In each of Problems 6 through 10, calculate F + G and F - G by representing the vectors as arrows and using the parallelogram law.

6. F = i, G = 6j
7. F = 2i - j, G = i - j
8. F = -3i + j, G = 4j
9. F = i - 2j, G = i - 3j
10. F = -i + 4j, G = -2i - 2j

In each of Problems 11 through 15, determine aF and represent F and aF as arrows.

11. F = i + j, a = -1/2
12. F = 6i - 2j, a = 2
13. F = -3j, a = -4
14. F = 6i - 6j, a = 1/2
15. F = -3i + 2j, a = 3

In each of Problems 16 through 21, find the parametric equations of the straight line containing the given points. Also find the normal form of this line.

16. (1, 0, 4), (2, 1, 1)
17. (3, 0, 0), (-3, 1, 0)
18. (2, 1, 1), (2, 1, -2)
19. (0, 1, 3), (0, 0, 1)
20. (1, 0, -4), (-2, -2, 5)
21. (2, -3, 6), (-1, 6, 4)

In each of Problems 22 through 26, find a vector F in the x, y-plane having the given length and making the given angle (in radians) with the positive x-axis. Represent the vector as an arrow in the plane.

22. π/4
23. 6, π/3
24. 5, 3π/5
25. 15, 7π/4
26. 25, 3π/2

27. Let P1, P2, . . . , Pn be distinct points in 3-space, with n ≥ 3. Let Fi be the vector represented by the arrow from Pi to Pi+1 for i = 1, 2, . . . , n - 1, and let Fn be the vector represented by the arrow from Pn to P1. Prove that F1 + F2 + · · · + Fn = O.
28. Let F be any nonzero vector. Determine a scalar t such that ||tF|| = 1.
29. Use vectors to prove that the altitudes of any triangle intersect in a single point. (Recall that an altitude is a line from a vertex, perpendicular to the opposite side of the triangle.)
6.2 The Dot Product
Throughout this section, let F = a1i + b1j + c1k and G = a2i + b2j + c2k.
DEFINITION 6.5 Dot Product

The dot product of F and G is the number F · G defined by

F · G = a1a2 + b1b2 + c1c2.
For example,

(√3i + 4j - πk) · (-2i + 6j + 3k) = -2√3 + 24 - 3π.

Sometimes the dot product is referred to as a scalar product, since the dot product of two vectors is a scalar (real number). This must not be confused with the product of a vector with a scalar. Here are some rules for operating with the dot product.

THEOREM 6.3 Properties of the Dot Product
Let F, G, and H be vectors, and let a be a scalar. Then
1. F · G = G · F.
2. (F + G) · H = F · H + G · H.
3. a(F · G) = (aF) · G = F · (aG).
4. F · F = ||F||².
5. F · F = 0 if and only if F = O.
Conclusion (1) is the commutativity of the dot product (we can perform the operation in either order), and (2) is a distributive law. Conclusion (3) states that a constant factors through a dot product. Conclusion (4) is very useful in some kinds of calculations, as we will see shortly. A proof of the theorem involves routine calculations, two of which we will illustrate.

Proof
For (3), write

a(F · G) = a(a1a2 + b1b2 + c1c2)
= (aa1)a2 + (ab1)b2 + (ac1)c2 = (aF) · G
= a1(aa2) + b1(ab2) + c1(ac2) = F · (aG).
For (4), we have

F · F = (a1i + b1j + c1k) · (a1i + b1j + c1k) = a1² + b1² + c1² = ||F||². ■
Using conclusion (4) of the theorem, we can derive a relationship that we will use frequently.
LEMMA 6.1

Let F and G be vectors, and let a and β be scalars. Then

||aF + βG||² = a²||F||² + 2aβ F · G + β²||G||².

Proof  By using Theorem 6.3, we have

||aF + βG||² = (aF + βG) · (aF + βG)
= a² F · F + aβ F · G + aβ G · F + β² G · G
= a² F · F + 2aβ F · G + β² G · G
= a²||F||² + 2aβ F · G + β²||G||². ■
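Lemma 6.1 can be spot-checked with concrete numbers; this is a sketch, and `dot` and `norm` are our names.

```python
import math

def dot(F, G):
    return sum(f * g for f, g in zip(F, G))

def norm(F):
    return math.sqrt(dot(F, F))

F, G = (1.0, -2.0, 3.0), (4.0, 0.0, -1.0)
a, b = 2.0, -3.0

combo = tuple(a * f + b * g for f, g in zip(F, G))  # aF + bG
lhs = norm(combo) ** 2
rhs = a**2 * norm(F)**2 + 2 * a * b * dot(F, G) + b**2 * norm(G)**2
assert math.isclose(lhs, rhs)  # both sides equal 197 here
```
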
The dot product can be used to determine the angle between vectors. Represent F and G as arrows from a common point, as in Figure 6.15. Let θ be the angle between F and G. The arrow from the tip of F to the tip of G represents G - F, and these three vectors form the sides of a triangle. Now recall the law of cosines, which states, for the triangle of Figure 6.16, that

a² + b² - 2ab cos(θ) = c².  (6.2)

Apply this to the vector triangle of Figure 6.15, with sides of length a = ||G||, b = ||F||, and c = ||G - F||. By using Lemma 6.1 with a = -1 and β = 1, equation (6.2) becomes

||G||² + ||F||² - 2||G|| ||F|| cos(θ) = ||G - F||² = ||G||² + ||F||² - 2 G · F.

Then

F · G = ||F|| ||G|| cos(θ).

Assuming that neither F nor G is the zero vector, then

cos(θ) = (F · G)/(||F|| ||G||).  (6.3)

This provides a simple way of computing the cosine of the angle between two arrows representing vectors. Since vectors can be drawn along straight lines, this also lets us calculate the angle between two intersecting lines.
FIGURE 6.15
FIGURE 6.16 Law of cosines: a² + b² - 2ab cos(θ) = c².
FIGURE 6.17
EXAMPLE 6.5
Let F = -i + 3j + k and G = 2j - 4k. The cosine of the angle between these vectors (Figure 6.17) is

cos(θ) = (F · G)/(||F|| ||G||) = 2/(√11 √20) = 2/√220.

Then θ = arccos(2/√220), which is that unique number in [0, π] whose cosine is 2/√220. θ is approximately 1.436 radians.
EXAMPLE 6.6
Lines L1 and L2 are given, respectively, by the parametric equations

x = 1 + 6t,  y = 2 - 4t,  z = -1 + 3t

and

x = 4 - 3p,  y = 2p,  z = -5 + 4p,

in which the parameters t and p take on all real values. We want the angle between these lines at their point of intersection, which is (1, 2, -1) (on L1 for t = 0 and on L2 for p = 1). Of course, two intersecting lines have two angles between them (Figure 6.18). However, the sum of these angles is π, so either angle determines the other.
The strategy for solving this problem is to find a vector along each line, then find the angle between these vectors. For a vector F along L1, choose any two points on this line, say (1, 2, -1) and, with t = 1, (7, -2, 2). The vector from the first to the second point is

F = (7 - 1)i + (-2 - 2)j + (2 - (-1))k = 6i - 4j + 3k.
FIGURE 6.18
FIGURE 6.19
Two points on L2 are (1, 2, -1) and, with p = 0, (4, 0, -5). The vector G from the first to the second of these points is

G = (4 - 1)i + (0 - 2)j + (-5 - (-1))k = 3i - 2j - 4k.

These vectors are shown in Figure 6.19. The cosine of the angle between F and G is

cos(θ) = (F · G)/(||F|| ||G||) = 14/(√61 √29) = 14/√1769.

One angle between the lines is θ = arccos(14/√1769), approximately 1.23 radians. If we had used -G in place of G in this calculation, we would have gotten θ = arccos(-14/√1769), or about 1.91 radians. This is the supplement of the angle found in the example.
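The same computation, using the direction vectors found in Example 6.6 (a sketch; the variable names are ours):

```python
import math

def dot(F, G):
    return sum(f * g for f, g in zip(F, G))

def norm(F):
    return math.sqrt(dot(F, F))

# Vectors along L1 and L2, as found in Example 6.6.
F = (6, -4, 3)
G = (3, -2, -4)

cos_theta = dot(F, G) / (norm(F) * norm(G))
assert math.isclose(cos_theta, 14 / math.sqrt(1769))
theta = math.acos(cos_theta)
assert round(theta, 2) == 1.23
# Using -G gives the supplementary angle.
assert math.isclose(theta + math.acos(-cos_theta), math.pi)
```
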
EXAMPLE 6.7
The points A(1, -2, 1), B(0, 1, 6), and C(-3, 4, -2) form the vertices of a triangle. Suppose we want the angle between the line AB and the line from A to the midpoint of BC. This line is a median of the triangle and is shown in Figure 6.20.
Visualize the sides of the triangle as vectors, as in Figure 6.21. If P is the midpoint of BC, then H1 = H2, because both vectors have the same direction and length. From the coordinates of the vertices, calculate F = -i + 3j + 5k and G = -4i + 6j - 3k. We want the angle between F and K, so we need K. By the parallelogram law,

F + H1 = K and K + H2 = G.

Since H1 = H2, these equations imply that

K = F + H1 = F + (G - K).

Therefore,

K = (1/2)(F + G) = -(5/2)i + (9/2)j + k.
FIGURE 6.20
FIGURE 6.21
Now the cosine of the angle we want is

cos(θ) = (F · K)/(||F|| ||K||) = 42/(√35 √110) = 42/√3850.

θ is approximately 0.83 radians.
The arrows representing two nonzero vectors F and G are perpendicular exactly when the cosine of the angle between them is zero, and by equation (6.3) this occurs when F · G = 0. This suggests that we use this condition to define orthogonality (perpendicularity) of vectors. If we agree to the convention that the zero vector is orthogonal to every vector, then this dot product condition allows a general definition without requiring that the vectors be nonzero.
DEFINITION 6.6 Orthogonal Vectors

Vectors F and G are orthogonal if and only if F · G = 0.
EXAMPLE 6.8
Let F = -4i + j + 2k, G = 2i + 4k, and H = 6i - j - 2k. Then F · G = 0, so F and G are orthogonal. But F · H and G · H are nonzero, so F and H are not orthogonal, and G and H are not orthogonal. Sometimes orthogonality of vectors is a useful device for dealing with lines and planes in three-dimensional space.
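The orthogonality test of Definition 6.6 is a one-liner, and Example 6.8's vectors check out as stated:

```python
def dot(F, G):
    return sum(f * g for f, g in zip(F, G))

F = (-4, 1, 2)   # -4i + j + 2k
G = (2, 0, 4)    # 2i + 4k
H = (6, -1, -2)  # 6i - j - 2k

assert dot(F, G) == 0    # F and G are orthogonal
assert dot(F, H) == -29  # nonzero: F and H are not orthogonal
assert dot(G, H) != 0    # nonzero: G and H are not orthogonal
```
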
EXAMPLE 6.9
Two lines are given parametrically by

L1: x = 2 - 4t,  y = 6 + t,  z = 3t

and

L2: x = -2 + p,  y = 7 + 2p,  z = 3 - 4p.

We want to know whether these lines are perpendicular. (It does not matter whether the lines intersect.)
The idea is to form a vector along each line and test these vectors for orthogonality. For a vector along L1, choose two points on this line, say (2, 6, 0) when t = 0 and (-2, 7, 3) when t = 1. Then F = -4i + j + 3k is along L1. Two points on L2 are (-2, 7, 3) when p = 0 and (-1, 9, -1) when p = 1. Then G = i + 2j - 4k is along L2. Since F · G = -14 ≠ 0, these vectors, hence these lines, are not orthogonal.
EXAMPLE 6.10
Suppose we want the equation of a plane Π containing the point (-6, 1, 1) and perpendicular to the vector N = -2i + 4j + k.
FIGURE 6.22
A strategy to find such an equation is suggested by Figure 6.22. A point (x, y, z) is on Π if and only if the vector from (-6, 1, 1) to (x, y, z) is in Π and therefore is orthogonal to N. This means that

((x + 6)i + (y - 1)j + (z - 1)k) · N = 0.

Carrying out this dot product, we get the equation

-2(x + 6) + 4(y - 1) + (z - 1) = 0, or -2x + 4y + z = 17.

This is the equation of the plane. Of course, the given point (-6, 1, 1) satisfies this equation.
We will conclude this section with the important Cauchy-Schwarz inequality, which states that the dot product of two vectors cannot be greater in absolute value than the product of the lengths of the vectors.

THEOREM 6.4 Cauchy-Schwarz Inequality

Let F and G be vectors. Then

|F · G| ≤ ||F|| ||G||.
Proof  If either vector is the zero vector, then both sides of the proposed inequality are zero. Thus suppose neither vector is the zero vector. In this event,

cos(θ) = (F · G)/(||F|| ||G||),

where θ is the angle between F and G. But then

-1 ≤ (F · G)/(||F|| ||G||) ≤ 1,

so

-||F|| ||G|| ≤ F · G ≤ ||F|| ||G||,

which is equivalent to the Cauchy-Schwarz inequality. ■
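The Cauchy-Schwarz inequality can be exercised on random vectors; this check is our own sketch, not part of the text.

```python
import math
import random

def dot(F, G):
    return sum(f * g for f, g in zip(F, G))

def norm(F):
    return math.sqrt(dot(F, F))

random.seed(1)
for _ in range(1000):
    F = [random.uniform(-10, 10) for _ in range(3)]
    G = [random.uniform(-10, 10) for _ in range(3)]
    # |F.G| <= ||F|| ||G||, with a small tolerance for rounding.
    assert abs(dot(F, G)) <= norm(F) * norm(G) + 1e-9
```
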
In each of Problems 1 through 6, compute the dot product of the vectors and the cosine of the angle between them. Also determine whether they are orthogonal, and verify the Cauchy-Schwarz inequality for these vectors.

1. i, 2i - 3j + k
2. 2i - 6j + k, i - j
3. -4i - 2j + 3k, 6i - 2j - k
4. 8i - 3j + 2k, -8i - 3j + k
5. i - 3k, 2j + 6k
6. i + j + 2k, i - j + 2k

In each of Problems 7 through 12, find the equation of the plane containing the given point and having the given vector as normal vector.

7. (-1, 1, 2), 3i - j + 4k
8. (-1, 0, 0), i - 2j
9. (2, -3, 4), 8i - 6j + 4k
10. (-1, -1, -5), -3i + 2j
11. (0, -1, 4), 7i + 6j - 5k
12. (-2, 1, -1), 4i + 3j + k

In each of Problems 13 through 16, find the cosine of the angle between AB and the line from A to the midpoint of BC.

13. A = (1, -2, 6), B = (3, 0, 1), C = (4, 2, -7)
14. A = (3, -2, -3), B = (-2, 0, 1), C = (1, 1, 7)
15. A = (1, -2, 6), B = (0, 4, -3), C = (-3, -2, 7)
16. A = (0, 5, -1), B = (1, -2, 5), C = (7, 0, -1)

17. Suppose F · X = 0 for every vector X. What can be concluded about F?
18. Suppose F · i = F · j = F · k = 0. What can be concluded about F?
19. Suppose F ≠ O. Prove that the unit vector u for which |F · u| is a maximum must be parallel to F.
20. Prove that for any vector F,

F = (F · i)i + (F · j)j + (F · k)k.

6.3 The Cross Product

The dot product produces a scalar from two vectors. We will now define the cross product, which produces a vector from two vectors. For this section, let F = a1i + b1j + c1k and G = a2i + b2j + c2k.
DEFINITION 6.7 Cross Product

The cross product of F with G is the vector F × G defined by

F × G = (b1c2 - b2c1)i + (a2c1 - a1c2)j + (a1b2 - a2b1)k.
This vector is read "F cross G." For example,

(i + 2j - 3k) × (-2i + j + 4k) = (8 + 3)i + (6 - 4)j + (1 + 4)k = 11i + 2j + 5k.

A cross product is often computed as a three-by-three "determinant," with the unit vectors in the first row, components of F in the second row, and components of G in the third. If expanded by the first row, this determinant gives F × G. For example,

| i   j   k |
| 1   2  -3 |
| -2  1   4 |

= (2·4 - 1·(-3))i - (1·4 - (-2)·(-3))j + (1·1 - (-2)·2)k = 11i + 2j + 5k = F × G.

The interchange of two rows in a determinant results in a change of sign. This means that interchanging F and G in the cross product results in a change of sign:

F × G = -G × F.

This is also apparent from the definition. Unlike addition and multiplication of real numbers and the dot product operation, the cross product is not commutative, and the order in which it is performed makes a difference. This is true of many physical processes; for example, the order in which chemicals are combined may make a significant difference.
Some of the rules we need to compute with cross products are given in the next theorem.
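The component formula of Definition 6.7 translates directly into code (the function name is ours); it reproduces the text's example and the sign change under interchange.

```python
def cross(F, G):
    a1, b1, c1 = F
    a2, b2, c2 = G
    # (b1c2 - b2c1)i + (a2c1 - a1c2)j + (a1b2 - a2b1)k
    return (b1 * c2 - b2 * c1, a2 * c1 - a1 * c2, a1 * b2 - a2 * b1)

assert cross((1, 2, -3), (-2, 1, 4)) == (11, 2, 5)     # F x G
assert cross((-2, 1, 4), (1, 2, -3)) == (-11, -2, -5)  # G x F = -(F x G)
```
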
THEOREM 6.5 Properties of the Cross Product

Let F, G, and H be vectors and let a be a scalar. Then

1. F × G = -G × F.
2. F × G is orthogonal to both F and G.
3. ||F × G|| = ||F|| ||G|| sin(θ), in which θ is the angle between F and G.
4. If F and G are not zero vectors, then F × G = O if and only if F and G are parallel.
5. F × (G + H) = F × G + F × H.
6. a(F × G) = (aF) × G = F × (aG).
Proofs of these statements are for the most part routine calculations. We will prove (2) and (3).

Proof  For (2), compute

F · (F × G) = a1(b1c2 - b2c1) + b1(a2c1 - a1c2) + c1(a1b2 - a2b1) = 0.

Therefore F and F × G are orthogonal. A similar argument holds for G.
For (3), compute

||F × G||² = (b1c2 - b2c1)² + (a2c1 - a1c2)² + (a1b2 - a2b1)²
= (a1² + b1² + c1²)(a2² + b2² + c2²) - (a1a2 + b1b2 + c1c2)²
= ||F||² ||G||² - (F · G)²
= ||F||² ||G||² - ||F||² ||G||² cos²(θ)
= ||F||² ||G||² (1 - cos²(θ))
= ||F||² ||G||² sin²(θ).

Because 0 ≤ θ ≤ π, all of the factors whose squares appear in this equation are nonnegative, and upon taking square roots we obtain conclusion (3). ■

If F and G are nonzero and not parallel, then arrows representing these vectors determine a plane in 3-dimensional space (Figure 6.23). F × G is orthogonal to this plane and oriented as in Figure 6.24: if a person's right hand is placed so that the fingers curl from F to G, then the thumb points up along F × G. This is referred to as the right-hand rule. G × F = -F × G points in the opposite direction. As a simple example, i × j = k, and these three vectors define a standard right-handed coordinate system in 3-space.
FIGURE 6.23 Plane determined by F and G.
FIGURE 6.24 Right-hand rule gives the direction of F × G.
The fact that F × G is orthogonal to both F and G is often useful. If they are not parallel, then vectors F and G determine a plane Π (Figure 6.23). This is consistent with the fact that three points, not on the same straight line, determine a plane. One point forms a base point for drawing the arrows representing F and G, and the other two points are the terminal points of these arrows. If we know a vector orthogonal to both F and G, then this vector is orthogonal to every vector in Π. Such a vector is said to be normal to Π. In Example 6.10 we showed how to find the equation of a plane, given a point in the plane and a normal vector. Now we can find the equation of a plane given three points in it (not all on a line), because we can use the cross product to produce a normal vector.
EXAMPLE 6.11
Suppose we want the equation of the plane Π containing the points (1, 2, 1), (-1, 1, 3), and (-2, -2, -2).
Begin by finding a vector normal to Π. We will do this by finding two vectors in Π and taking their cross product. The vectors from (1, 2, 1) to the other two given points are in Π (Figure 6.25). These vectors are F = -2i - j + 2k and G = -3i - 4j - 3k.
FIGURE 6.25
Form

N = F × G =
| i   j   k |
| -2  -1   2 |
| -3  -4  -3 |
= 11i - 12j + 5k.

This vector is normal to Π (orthogonal to every vector lying in Π). Now proceed as in Example 6.10. If (x, y, z) is any point in Π, then (x - 1)i + (y - 2)j + (z - 1)k is in Π and so is orthogonal to N. Therefore,

[(x - 1)i + (y - 2)j + (z - 1)k] · N = 11(x - 1) - 12(y - 2) + 5(z - 1) = 0.

This gives

11x - 12y + 5z = -8.

This is the equation of the plane, in the sense that a point (x, y, z) is in the plane if and only if its coordinates satisfy this equation.
If we had specified three points lying on a line (collinear) in this example, then we would have found that F and G are parallel, hence F × G = O. When we calculated this cross product and got a nonzero vector, we knew that the points were not collinear.
The cross product also has geometric interpretations as an area or volume.
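Example 6.11's construction, a normal vector by cross product followed by the plane's equation, can be sketched as follows (helper names ours):

```python
def cross(F, G):
    a1, b1, c1 = F
    a2, b2, c2 = G
    return (b1 * c2 - b2 * c1, a2 * c1 - a1 * c2, a1 * b2 - a2 * b1)

def plane_through(P0, P1, P2):
    # Vectors from P0 to the other two points lie in the plane.
    F = tuple(q - p for p, q in zip(P0, P1))
    G = tuple(q - p for p, q in zip(P0, P2))
    N = cross(F, G)
    if N == (0, 0, 0):
        return None  # collinear points: F x G = O, no unique plane
    A, B, C = N
    D = A * P0[0] + B * P0[1] + C * P0[2]
    return (A, B, C, D)  # plane Ax + By + Cz = D

# Example 6.11: the plane through (1,2,1), (-1,1,3), (-2,-2,-2) is 11x - 12y + 5z = -8.
assert plane_through((1, 2, 1), (-1, 1, 3), (-2, -2, -2)) == (11, -12, 5, -8)
```
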
THEOREM 6.6

Let F and G be represented by arrows lying along incident sides of a parallelogram (Figure 6.26). Then the area of this parallelogram is ||F × G||.

Proof  The area of a parallelogram is the product of the lengths of two incident sides and the sine of the angle between them. Draw vectors F and G along two incident sides. Then these sides have length ||F|| and ||G||. If θ is the angle between them, then the area of the parallelogram is ||F|| ||G|| sin(θ). But this is exactly ||F × G||. ■

FIGURE 6.26 Area = ||F × G||.

EXAMPLE 6.12
A parallelogram has two sides extending from (0, 1, -2) to (1, 2, 2) and from (0, 1, -2) to (1, 4, 1). We want to find the area of this parallelogram. Form vectors along these sides:

F = i + j + 4k,  G = i + 3j + 3k.

Calculate

F × G =
| i  j  k |
| 1  1  4 |
| 1  3  3 |
= -9i + j + 2k,

and the area of the parallelogram is ||F × G|| = √86 square units.

If a rectangular box is skewed, as in Figure 6.27, the resulting solid is called a rectangular parallelopiped. All of its faces are parallelograms. We can find the volume of such a solid by combining dot and cross products, as follows.
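Theorem 6.6 applied to Example 6.12, in code (helper names ours):

```python
import math

def cross(F, G):
    a1, b1, c1 = F
    a2, b2, c2 = G
    return (b1 * c2 - b2 * c1, a2 * c1 - a1 * c2, a1 * b2 - a2 * b1)

def norm(F):
    return math.sqrt(sum(x * x for x in F))

F = (1, 1, 4)  # side from (0,1,-2) to (1,2,2)
G = (1, 3, 3)  # side from (0,1,-2) to (1,4,1)
assert cross(F, G) == (-9, 1, 2)
assert math.isclose(norm(cross(F, G)), math.sqrt(86))  # the parallelogram's area
```
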
FIGURE 6.27 Parallelopiped.
FIGURE 6.28 Volume = |H · (F × G)|.
THEOREM 6.7

Let F, G, and H be vectors along incident sides of a rectangular parallelopiped. Then the volume of the parallelopiped is |H · (F × G)|.

This is the absolute value of the real number formed by taking the dot product of H with F × G.

Proof  Figure 6.28 shows the parallelopiped. F × G is normal to the plane of F and G and oriented as shown in the diagram, according to the right-hand rule. If H is along the third side of the parallelopiped, and ψ is the angle between H and F × G, then ||H|| cos(ψ) is the altitude of the parallelopiped. The area of the base parallelogram is ||F × G|| by Theorem 6.6. Thus the volume of the parallelopiped is

||H|| ||F × G|| cos(ψ).

But this is |H · (F × G)|. ■
EXAMPLE 6.13
One corner of a rectangular parallelopiped is at (-1, 2, 2), and three incident sides extend from this point to (0, 1, 1), (-4, 6, 8), and (-3, -2, 4). To find the volume of this solid, form the vectors

F = (0 - (-1))i + (1 - 2)j + (1 - 2)k = i - j - k,
G = (-4 - (-1))i + (6 - 2)j + (8 - 2)k = -3i + 4j + 6k,
and
H = (-3 - (-1))i + (-2 - 2)j + (4 - 2)k = -2i - 4j + 2k.

Calculate

F × G =
| i   j   k |
| 1  -1  -1 |
| -3  4   6 |
= -2i - 3j + k.

Then

H · (F × G) = (-2)(-2) + (-4)(-3) + (2)(1) = 18,

and the volume is 18 cubic units.
The quantity H · (F × G) is called a scalar triple product. We will outline one of its properties in the problems.
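Theorem 6.7 and Example 6.13 in code (helper names ours):

```python
def cross(F, G):
    a1, b1, c1 = F
    a2, b2, c2 = G
    return (b1 * c2 - b2 * c1, a2 * c1 - a1 * c2, a1 * b2 - a2 * b1)

def dot(F, G):
    return sum(f * g for f, g in zip(F, G))

def box_volume(F, G, H):
    # |H . (F x G)|: the volume of the parallelopiped with sides F, G, H.
    return abs(dot(H, cross(F, G)))

F = (1, -1, -1)
G = (-3, 4, 6)
H = (-2, -4, 2)
assert cross(F, G) == (-2, -3, 1)
assert box_volume(F, G, H) == 18  # cubic units, as in Example 6.13
```
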
PROBLEMS
In each of Problems 1 through 6, compute F × G and, independently, G × F, verifying that one is the negative of the other. Use the dot product to compute the cosine of the angle θ between F and G, and use this to determine sin(θ). Then calculate ||F|| ||G|| sin(θ) and verify that this gives ||F × G||.

1. F = -3i + 6j + k, G = -i - 2j + k
2. F = 6i - k, G = j + 2k
3. F = 2i - 3j + 4k, G = -3i + 2j
4. F = 8i + 6j, G = 14j
5. F = 5i + 3j + 4k, G = 20i + 6k
6. F = 2k, G = 8i - j

In each of Problems 7 through 11, determine whether the points are collinear. If they are not, find an equation of the plane containing all three points.

11. (-4, 2, -6), (1, 1, 3), (-2, 4, 5)

In each of Problems 12 through 16, find the area of the parallelogram having incident sides extending from the first point to each of the other two.

12. (1, -3, 7), (2, 1, 1), (6, -1, 2)
13. (6, 1, 1), (7, -2, 4), (8, -4, 3)
14. (-2, 1, 6), (2, 1, -7), (4, 1, 1)
15. (4, 2, -3), (6, 2, -1), (2, -6, 4)
16. (1, 1, -8), (9, -3, 0), (-2, 5, 2)

In each of Problems 17 through 21, find the volume of the parallelopiped whose incident sides extend from the first point to each of the other three.

In each of Problems 22 through 26, find a vector normal to the given plane. There are infinitely many such vectors.

22. 8x - y + z = 12
23. x - y + 2z = 0
24. x - 3y + 2z = 9
25. 7x + y - 7z = 7
26. 4x + 6y + 4z = -5

27. Prove that F × (G + H) = F × G + F × H.
28. Prove that (aF) × G = F × (aG) = a(F × G).
29. Prove that F × (G × H) = (F · H)G - (F · G)H.
30. Use vector operations to find a formula for the area of the triangle having vertices (a_i, c_i) for i = 1, 2, 3. What conditions must be placed on these coordinates to ensure that the points are not collinear (all on a line)?

The scalar triple product of F, G, and H is defined to be [F, G, H] = F · (G × H).

31. Let F = a1i + b1j + c1k, G = a2i + b2j + c2k, and H = a3i + b3j + c3k. Prove that [F, G, H] is the determinant

| a1  b1  c1 |
| a2  b2  c2 |
| a3  b3  c3 |

6.4 The Vector Space R^n

The world of everyday experience has three space dimensions. But often we encounter settings in which more dimensions occur. If we want to specify not only the location of a particle but also the time at which it occupies a particular point, we need four coordinates (x, y, z, t). And specifying the location of each particle in a system of particles may require any number of coordinates. The natural setting for such problems is R^n, the space of points having n coordinates.
If n is a positive integer, an n-vector is an n-tuple (x1, x2, . . . , xn), with each coordinate xi a real number. The set of all n-vectors is denoted R^n.
R^1 is the real line, consisting of all real numbers. We can think of real numbers as 1-vectors, but there is no advantage to doing this. R^2 consists of ordered pairs (x, y) of real numbers, and each such ordered pair (or 2-vector) can be identified with a point in the plane. R^3 consists of all 3-vectors, or points in 3-space. If n ≥ 4, we can no longer draw a set of mutually independent coordinate axes, one for each coordinate, but we can still work with vectors in R^n according to rules we will now describe.
DEFINITION 6.9 Algebra of R^n

1. Two n-vectors are added by adding their respective components:

(x1, x2, . . . , xn) + (y1, y2, . . . , yn) = (x1 + y1, x2 + y2, . . . , xn + yn).

2. An n-vector is multiplied by a scalar by multiplying each component by the scalar:

a(x1, x2, . . . , xn) = (ax1, ax2, . . . , axn).
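Definition 6.9 is dimension-independent, so the same two functions work for any n (the names are ours):

```python
def add(F, G):
    # Componentwise addition of two n-vectors of equal length.
    assert len(F) == len(G)
    return tuple(f + g for f, g in zip(F, G))

def scale(a, F):
    # Multiply each component by the scalar a.
    return tuple(a * x for x in F)

F = (1, 0, -2, 4, 3)  # a 5-vector
G = (0, 5, 1, -1, 2)
assert add(F, G) == (1, 5, -1, 3, 5)
assert scale(-2, F) == (-2, 0, 4, -8, -6)
```
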
The zero vector in R^n is the n-vector O = (0, 0, . . . , 0) having each coordinate equal to zero. The negative of F = (x1, x2, . . . , xn) is -F = (-x1, -x2, . . . , -xn). As we did with n = 3, we denote G + (-F) as G - F. The algebraic rules in R^n mirror those we saw for R^3.

THEOREM 6.8
Let F, G, and H be in R^n, and let a and β be real numbers. Then
1. F + G = G + F.
2. F + (G + H) = (F + G) + H.
3. F + O = F.
4. (a + β)F = aF + βF.
5. (aβ)F = a(βF).
6. a(F + G) = aF + aG.
7. aO = O. ■

Because of these properties of the operations of addition of n-vectors, and multiplication of an n-vector by a scalar, we call R^n a vector space. In the next section we will clarify the sense in which R^n can be said to have dimension n.
The length (norm, magnitude) of F = (x1, x2, . . . , xn) is defined by a direct generalization from the plane and 3-space:

||F|| = √(x1² + x2² + · · · + xn²).
There is no analogue of the cross product for vectors in R^n when n > 3. However, the dot product readily extends to n-vectors.
DEFINITION 6.10 Dot Product of n-Vectors

The dot product of (x1, x2, . . . , xn) and (y1, y2, . . . , yn) is the number

(x1, x2, . . . , xn) · (y1, y2, . . . , yn) = x1y1 + x2y2 + · · · + xnyn.
All of the conclusions of Theorem 6.3 remain true for n-vectors, as does Lemma 6.1. We will record these results for completeness.

THEOREM 6.9

Let F, G, and H be n-vectors, and let a and β be real numbers. Then
1. F · G = G · F.
2. (F + G) · H = F · H + G · H.
3. a(F · G) = (aF) · G = F · (aG).
4. F · F = ||F||².
5. F · F = 0 if and only if F = O.
6. ||aF + βG||² = a²||F||² + 2aβ F · G + β²||G||². ■
The Cauchy-Schwarz inequality holds for n-vectors, but the proof given previously for 3-vectors does not generalize to R^n. We will therefore give a proof that is valid for any n.

THEOREM 6.10 Cauchy-Schwarz Inequality in R^n

Let F and G be in R^n. Then

|F · G| ≤ ||F|| ||G||.
Proof: The inequality reduces to 0 ≤ 0 if either vector is the zero vector. Thus suppose F ≠ 0 and G ≠ 0. Choose α = ||G|| and β = -||F|| in Theorem 6.9(6). We get

0 ≤ ||αF + βG||² = ||G||² ||F||² - 2 ||G|| ||F|| F · G + ||F||² ||G||².

Upon dividing this inequality by 2 ||F|| ||G||, we obtain F · G ≤ ||F|| ||G||. Now go back to Theorem 6.9(6), but this time choose α = ||G|| and β = ||F|| to get

0 ≤ ||G||² ||F||² + 2 ||G|| ||F|| F · G + ||F||² ||G||²,

and upon dividing by 2 ||F|| ||G|| we get

-||F|| ||G|| ≤ F · G.

We have now shown that
-||F|| ||G|| ≤ F · G ≤ ||F|| ||G||,

and this is equivalent to the Cauchy-Schwarz inequality. ■

In view of the Cauchy-Schwarz inequality, we can define the cosine of the angle between vectors F and G in R^n by

cos(θ) = 0 if F or G equals the zero vector,
cos(θ) = (F · G)/(||F|| ||G||) if F ≠ 0 and G ≠ 0.

This is sometimes useful in bringing some geometric intuition to R^n. For example, it is natural to define F and G to be orthogonal if the angle between them is π/2, and by this definition of cos(θ), this is equivalent to requiring that F · G = 0, consistent with orthogonality in R^2 and R^3. We can define a standard representation of vectors in R^n by defining unit vectors along the n directions:
e1 = (1, 0, 0, ..., 0)
e2 = (0, 1, 0, ..., 0)
...
en = (0, 0, ..., 0, 1).
Now any n-vector can be written

(x1, x2, ..., xn) = x1 e1 + x2 e2 + ··· + xn en = Σ (j = 1 to n) xj ej.
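Both the cosine formula and the standard representation can be checked numerically. In the following illustrative Python sketch (the function names are ours, not the text's), cos_angle handles the zero-vector case exactly as in the definition above:

```python
import math

def dot(f, g):
    return sum(x * y for x, y in zip(f, g))

def norm(f):
    return math.sqrt(dot(f, f))

def cos_angle(f, g):
    """cos(theta): 0 if either vector is the zero vector."""
    if norm(f) == 0 or norm(g) == 0:
        return 0.0
    return dot(f, g) / (norm(f) * norm(g))

# Orthogonality in R^4: cos(theta) = 0 exactly when F . G = 0.
F = (1, 1, 0, 0)
G = (1, -1, 2, 0)
print(cos_angle(F, G))   # 0.0, so F and G are orthogonal

# Standard representation: (x1, ..., xn) = sum over j of xj * ej.
def standard_combination(x):
    n = len(x)
    e = [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]
    v = [0] * n
    for j in range(n):
        for i in range(n):
            v[i] += x[j] * e[j][i]
    return tuple(v)

print(standard_combination((2, -7, 3)))  # (2, -7, 3)
```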
A set of n-vectors containing the zero vector, as well as sums of vectors in the set and scalar multiples of vectors in the set, is called a subspace of R^n.
DEFINITION 6.11 Subspace

A set S of n-vectors is a subspace of R^n if:
1. 0 is in S.
2. The sum of any vectors in S is in S.
3. The product of any vector in S with any real number is also in S.
Conditions (2) and (3) of the definition can be combined by requiring that αF + βG be in S for any vectors F and G in S, and any real numbers α and β.

EXAMPLE 6.14

Let S consist of all vectors in R^n having norm 1. In R^2 (the plane) this is the set of points on the unit circle about the origin; in R^3 this is the set of points on the unit sphere about the origin. S is not a subspace of R^n for several reasons. First, 0 is not in S, because 0 does not have norm 1. Further, a sum of vectors in S need not be in S (a sum of vectors having norm 1 need not have norm 1). And, if |α| ≠ 1 and F has norm 1, then αF does not have norm 1, so αF is not in S. In this example S failed all three criteria for being a subspace. It is enough to fail one to disqualify a set of vectors from being a subspace.
EXAMPLE 6.15

Let K consist of all scalar multiples of (-1, 4, 2, 0) in R^4. We want to know if K is a subspace of R^4. First, 0 is in K, because 0 = 0(-1, 4, 2, 0) = (0, 0, 0, 0). Next, if F and G are in K, then F = α(-1, 4, 2, 0) for some α and G = β(-1, 4, 2, 0) for some β, so F + G = (α + β)(-1, 4, 2, 0) is a scalar multiple of (-1, 4, 2, 0) and therefore is in K. Finally, if F = α(-1, 4, 2, 0) is any vector in K, and β is any scalar, then βF = (βα)(-1, 4, 2, 0) is a scalar multiple of (-1, 4, 2, 0), and hence is in K. Thus K is a subspace of R^4.
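The reasoning of this example can be mirrored in code. The sketch below (illustrative; the helper name is ours) tests whether a 4-vector lies in K, that is, whether it is a scalar multiple of (-1, 4, 2, 0):

```python
def is_multiple_of(v, w):
    """True if v = a*w for some scalar a (w assumed nonzero)."""
    # Find a nonzero component of w and solve for the scalar a.
    a = None
    for vi, wi in zip(v, w):
        if wi != 0:
            a = vi / wi
            break
    # Then v is a multiple of w exactly when v - a*w is the zero vector.
    return all(abs(vi - a * wi) < 1e-12 for vi, wi in zip(v, w))

W = (-1, 4, 2, 0)
print(is_multiple_of((3, -12, -6, 0), W))  # True: a = -3
print(is_multiple_of((1, 4, 2, 0), W))     # False
print(is_multiple_of((0, 0, 0, 0), W))     # True: the zero vector is 0*W
```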
EXAMPLE 6.16
Let S consist of just the zero vector 0 in R^n. Then S is a subspace of R^n. This is called the trivial subspace. At the other extreme, R^n is also a subspace of R^n.

In R^2 and R^3 there are simple geometric characterizations of all possible subspaces. Begin with R^2. We claim that the only subspaces are the trivial subspace consisting of just the zero vector, or R^2 itself, or all vectors lying along a single line through the origin. To demonstrate this, we need the following fact.

LEMMA 6.2
Let F and G be nonzero vectors in R^2 that are not parallel. Then every vector in R^2 can be written in the form αF + βG for some scalars α and β.

Proof: Represent F and G as arrows from the origin (Figure 6.29). These determine nonparallel lines L1 and L2, respectively, through the origin, because F and G are assumed to be nonparallel. Let V be any 2-vector. If V = 0, then V = 0F + 0G. We therefore consider the case that V ≠ 0 and represent it as an arrow from the origin as well. We want to show that V must be the sum of scalar multiples of F and G. If V is along L1, then V = αF for some real number α, and then V = αF + 0G. Similarly, if V is along L2, then V = βG = 0F + βG. Thus, suppose that V is not a scalar multiple of either F or G. Then the arrow representing V is not along L1 or L2. Now carry out the construction shown in Figure 6.30. Draw lines parallel to L1 and L2 from the tip of V. Arrows from the origin to where these parallels intersect L1 and L2 determine, respectively, vectors A and B. By the parallelogram law, V = A + B. But A is along L1, so A = αF for some scalar α. And B is along L2, and so B = βG for some scalar β. Thus V = αF + βG, completing the proof. ■

We can now completely characterize the subspaces of R^2.
FIGURE 6.29
FIGURE 6.30
THEOREM 6.11 The Subspaces of R^2

Let S be a subspace of R^2. Then one of the following three possibilities must hold:
1. S = R^2, or
2. S consists of just the zero vector, or
3. S consists of all vectors parallel to some straight line through the origin.
Proof: Suppose cases (1) and (2) do not hold. Because S is not the trivial subspace, S must contain at least one nonzero vector F. We will show that every vector in S is a scalar multiple of F. Suppose instead that there is a nonzero vector G in S that is not a scalar multiple of F. If V is any vector in R^2, then by Lemma 6.2, V = αF + βG for some scalars α and β. But S is a subspace of R^2, so αF + βG is in S. This would imply that every vector in R^2 is in S, hence that S = R^2, a contradiction. Therefore there is no vector in S that is not a scalar multiple of F. We conclude that every vector in S is a scalar multiple of F, and the proof is complete. ■

By a similar argument involving more cases, we can prove that any subspace of R^3 must be either the trivial subspace, R^3 itself, all vectors parallel to some line through the origin, or all vectors parallel to some plane through the origin.
SECTION 6.4 PROBLEMS

In each of Problems 1 through 6, find the sum of the vectors and express this sum in standard form. Calculate the dot product of the vectors and the angle between them. The latter may be expressed as an inverse cosine of a number.

1. (-1, 6, 2, 4, 0), (6, -1, 4, 1, 1)

6. (-5, 2, 2, -7, -8), (1, 1, 1, -8, 7)

In each of Problems 7 through 13, determine whether the set of vectors is a subspace of R^n for the appropriate n.

7. S consists of all vectors (x, y, z, x, x) in R^5.
8. S consists of all vectors (x, 2x, 3x, y) in R^4.
9. S consists of all vectors (x, 0, 0, 1, 0, y) in R^6.
10. S consists of all vectors (0, x, y) in R^3.
11. S consists of all vectors (x, y, x + y, x - y) in R^4.
12. S consists of all vectors in R^7 having zero third and fifth components.
13. S consists of all vectors in R^4 whose first and second components are equal.
14. Let S consist of all vectors in R^3 on or parallel to the plane ax + by + cz = k, with a, b, and c real numbers, at least one of which is nonzero, and k ≠ 0. Is S a subspace of R^3?
15. Let F and G be vectors in R^n. Prove that
||F + G||² + ||F - G||² = 2(||F||² + ||G||²).
Hint: Use the fact that the square of the norm of a vector is the dot product of the vector with itself.
16. Let F and G be orthogonal vectors in R^n. Prove that ||F + G||² = ||F||² + ||G||². This is called Pythagoras's theorem.
17. Suppose F and G are vectors in R^n satisfying the relationship of Pythagoras's theorem (Problem 16). Does it follow that F and G are orthogonal?
6.5 Linear Independence, Spanning Sets, and Dimension in R^n

In solving systems of linear algebraic equations and systems of linear differential equations, as well as for later work in Fourier analysis, we will use terminology and ideas from linear algebra. We will define these terms in R^n, where we have some geometric intuition.
DEFINITION 6.12 Linear Combinations in R^n

A linear combination of k vectors F1, ..., Fk in R^n is a sum α1F1 + ··· + αkFk in which each αj is a real number.
For example,

-8(-2, 4, 1, 0) + 6(1, 1, -1, 7) - π(8, 0, 0, 0)

is a linear combination of (-2, 4, 1, 0), (1, 1, -1, 7), and (8, 0, 0, 0) in R^4. This linear combination is equal to the 4-vector (22 - 8π, -26, -14, 42).
The set of all linear combinations of any given (finite) number of vectors in R^n is always a subspace of R^n.

THEOREM 6.12
Let F1, ..., Fk be in R^n, and let V consist of all vectors α1F1 + ··· + αkFk, in which each αj can be any real number. Then V is a subspace of R^n.

Proof: First, 0 is in V (choose α1 = α2 = ··· = αk = 0). Next, suppose G and H are in V. Then

G = α1F1 + ··· + αkFk and H = β1F1 + ··· + βkFk

for some real numbers α1, ..., αk, β1, ..., βk. Then

G + H = (α1 + β1)F1 + ··· + (αk + βk)Fk

is again a linear combination of F1, ..., Fk, and so is in V. Finally, let G = α1F1 + ··· + αkFk be in V. If c is any real number, then

cG = (cα1)F1 + ··· + (cαk)Fk

is also a linear combination of F1, ..., Fk, and is therefore in V. Therefore V is a subspace of R^n. ■
Whenever we form a subspace by taking all linear combinations of given vectors, we say that these vectors span the subspace.
DEFINITION 6.13 Spanning Set

Let F1, ..., Fk be vectors in a subspace S of R^n. Then F1, ..., Fk form a spanning set for S if every vector in S is a linear combination of F1, ..., Fk. In this case we say that S is spanned by F1, ..., Fk, or that F1, ..., Fk span S.
For example, i, j, and k span R^3, because every vector in R^3 can be written ai + bj + ck. The vector i + j in R^2 spans the subspace consisting of all vectors α(i + j), with α any scalar. These vectors all lie along the straight line y = x through the origin in the plane. Different sets of vectors may span the same subspace of R^n. Consider the following example.
EXAMPLE 6.17
Let V be the subspace of R^4 consisting of all vectors (α, β, 0, 0). Every vector in V can be written

(α, β, 0, 0) = α(1, 0, 0, 0) + β(0, 1, 0, 0),

so (1, 0, 0, 0) and (0, 1, 0, 0) span V. But we can also write any vector in V as

(α, β, 0, 0) = (α/2)(2, 0, 0, 0) + (β/π)(0, π, 0, 0),

so the vectors (2, 0, 0, 0) and (0, π, 0, 0) also span V. Different numbers of vectors may also span the same subspace. The vectors

(4, 0, 0, 0), (0, 3, 0, 0), (1, 2, 0, 0)

also span V. To see this, write an arbitrary vector in V as

(α, β, 0, 0) = ((α - 2)/4)(4, 0, 0, 0) + ((β - 4)/3)(0, 3, 0, 0) + 2(1, 2, 0, 0).
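A quick numerical check confirms that each of these spanning sets reproduces an arbitrary vector (α, β, 0, 0) of V. The Python sketch below is illustrative only (the helper combine is ours):

```python
import math

def combine(coeffs, vectors):
    """Form the linear combination: sum of c*v over the coefficient/vector pairs."""
    n = len(vectors[0])
    out = [0.0] * n
    for c, v in zip(coeffs, vectors):
        for i in range(n):
            out[i] += c * v[i]
    return tuple(out)

alpha, beta = 3.0, 7.0

# First spanning set of V:
print(combine((alpha, beta), [(1, 0, 0, 0), (0, 1, 0, 0)]))  # (3.0, 7.0, 0.0, 0.0)

# Second spanning set (same vector, up to floating-point roundoff):
print(combine((alpha / 2, beta / math.pi), [(2, 0, 0, 0), (0, math.pi, 0, 0)]))

# Three-vector spanning set, including the redundant third vector:
print(combine(((alpha - 2) / 4, (beta - 4) / 3, 2),
              [(4, 0, 0, 0), (0, 3, 0, 0), (1, 2, 0, 0)]))   # (3.0, 7.0, 0.0, 0.0)
```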
The last example suggests that some spanning sets are more efficient than others. If two vectors will span a subspace V, why should we choose a spanning set with three vectors in it? Indeed, in the last example, the last vector in the spanning set (4, 0, 0, 0), (0, 3, 0, 0), (1, 2, 0, 0) is a linear combination of the first two:

(1, 2, 0, 0) = (1/4)(4, 0, 0, 0) + (2/3)(0, 3, 0, 0).

Thus any linear combination of these three vectors can always be written as a linear combination of just (4, 0, 0, 0) and (0, 3, 0, 0). The third vector, being a linear combination of the first two, is "extraneous information." These ideas suggest the following definition.
DEFINITION 6.14 Linear Dependence and Independence

Let F1, ..., Fk be vectors in R^n.
1. F1, ..., Fk are linearly dependent if and only if one of these vectors is a linear combination of the others.
2. F1, ..., Fk are linearly independent if and only if they are not linearly dependent.
Linear dependence of F1, ..., Fk means that, whatever information these vectors carry, not all of them are needed, because at least one of them can be written in terms of the others. For example, if

Fk = α1F1 + ··· + α(k-1)F(k-1),
then knowing just F1, ..., F(k-1) gives us Fk as well. In this sense, linearly dependent vectors are redundant. We can remove at least one Fj, and the remaining k - 1 vectors will span the same subspace of R^n that F1, ..., Fk do. Linear independence means that no one of the vectors F1, ..., Fk is a linear combination of the others. Whatever these vectors are telling us (for example, specifying a subspace), we need all of them or we lose information. If we omit Fk, we cannot retrieve it from F1, ..., F(k-1).
EXAMPLE 6.18
The vectors (1, 1, 0) and (-2, 0, 3) are linearly independent in R^3. To prove this, suppose instead that these vectors are linearly dependent. Then one is a linear combination of the other, say

(-2, 0, 3) = α(1, 1, 0).

But then, from the first components, α = -2, while from the second components, α = 0, an impossibility. These vectors span the subspace V of R^3 consisting of all vectors α(-2, 0, 3) + β(1, 1, 0). Both of the vectors (1, 1, 0) and (-2, 0, 3) are needed to describe V. If we omit one, say (1, 1, 0), then the subspace of R^3 spanned by the remaining vector, (-2, 0, 3), is different from V. For example, it does not have (1, 1, 0) in it.

The following is a useful characterization of linear dependence and independence.

THEOREM 6.13
Let F1, ..., Fk be vectors in R^n. Then

1. F1, ..., Fk are linearly dependent if and only if there are real numbers α1, ..., αk, not all zero, such that

α1F1 + α2F2 + ··· + αkFk = 0.

2. F1, ..., Fk are linearly independent if and only if an equation

α1F1 + α2F2 + ··· + αkFk = 0

can hold only if α1 = α2 = ··· = αk = 0.

Proof: To prove (1), suppose first that F1, ..., Fk are linearly dependent. Then at least one of these vectors is a linear combination of the others. Say, to be specific, that

F1 = α2F2 + ··· + αkFk.

Then

F1 - α2F2 - ··· - αkFk = 0.

But this is a linear combination of F1, ..., Fk adding up to the zero vector and having a nonzero coefficient (the coefficient of F1 is 1). Conversely, suppose

α1F1 + α2F2 + ··· + αkFk = 0
with at least some αj ≠ 0. We want to show that F1, ..., Fk are linearly dependent. By renaming the vectors if necessary, we may suppose for convenience that α1 ≠ 0. But then

F1 = -(α2/α1)F2 - ··· - (αk/α1)Fk,

so F1 is a linear combination of F2, ..., Fk, and hence F1, ..., Fk are linearly dependent. This completes the proof of (1). Conclusion (2) follows from (1) and the fact that F1, ..., Fk are linearly independent exactly when these vectors are not linearly dependent. ■

This theorem suggests a strategy for determining whether a given set of vectors is linearly dependent or independent. Given F1, ..., Fk, set

α1F1 + α2F2 + ··· + αkFk = 0
(6.4)
and attempt to solve for the coefficients α1, ..., αk. If equation (6.4) forces α1 = ··· = αk = 0, then F1, ..., Fk are linearly independent. If we can find at least one nonzero αj so that equation (6.4) is true, then F1, ..., Fk are linearly dependent.
EXAMPLE 6.19

Consider (1, 0, 3, 1), (0, 1, -6, -1), and (0, 2, 1, 0) in R^4. We want to know whether these vectors are linearly dependent or independent. Look at a linear combination

c1(1, 0, 3, 1) + c2(0, 1, -6, -1) + c3(0, 2, 1, 0) = (0, 0, 0, 0).

If this is to hold, then each component of the vector (c1, c2 + 2c3, 3c1 - 6c2 + c3, c1 - c2) must be zero:

c1 = 0
c2 + 2c3 = 0
3c1 - 6c2 + c3 = 0
c1 - c2 = 0.

The first equation gives c1 = 0, so the fourth equation tells us that c2 = 0. But then the second equation requires that c3 = 0. Therefore, the only linear combination of these three vectors that equals the zero vector is the trivial linear combination (all coefficients zero), and by (2) of Theorem 6.13, the vectors are linearly independent.

In the plane R^2, two vectors are linearly dependent if and only if they are parallel. In R^3, two vectors are linearly dependent if and only if they are parallel, and three vectors are linearly dependent if and only if they lie in the same plane.

Any set of vectors that includes the zero vector must be linearly dependent. Consider, for example, the vectors 0, F2, ..., Fk. Then the linear combination

1·0 + 0F2 + ··· + 0Fk = 0

is a linear combination of these vectors that adds up to the zero vector, but has a nonzero coefficient (the coefficient of 0 is 1). By Theorem 6.13(1), these vectors are linearly dependent.

There is a special circumstance in which it is particularly easy to tell that a set of vectors is linearly independent. This is given in the following lemma, which we will use later.
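The strategy just described — set a linear combination equal to the zero vector and solve for the coefficients — amounts to row reducing a homogeneous system. A small illustrative Python sketch using exact rational arithmetic (the helper names are ours, not the text's):

```python
from fractions import Fraction

def rank(rows):
    """Row reduce a list of vectors with exact arithmetic; count pivot rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        # Find a pivot in this column at or below row r.
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def independent(vectors):
    """k vectors are linearly independent exactly when their rank equals k."""
    return rank(list(vectors)) == len(vectors)

# The vectors of Example 6.19 are independent:
print(independent([(1, 0, 3, 1), (0, 1, -6, -1), (0, 2, 1, 0)]))  # True
# Any set containing the zero vector is dependent:
print(independent([(0, 0, 0, 0), (1, 0, 3, 1)]))                  # False
```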
LEMMA 6.3
Let F1, ..., Fk be vectors in R^n. Suppose each Fi has a nonzero element in some component where each of the other Fj's has a zero component. Then F1, ..., Fk are linearly independent.

An example will clarify why this is true.
EXAMPLE 6.20
Consider the vectors F1 = (0, 4, 0, 0, 2), F2 = (0, 0, 6, 0, -5), F3 = (0, 0, 0, -4, 12). To see why these are linearly independent, suppose

αF1 + βF2 + γF3 = (0, 0, 0, 0, 0).

Then

(0, 4α, 6β, -4γ, 2α - 5β + 12γ) = (0, 0, 0, 0, 0).

From the second components, 4α = 0, so α = 0. From the third components, 6β = 0, so β = 0. And from the fourth components, -4γ = 0, so γ = 0. Then the vectors are linearly independent by Theorem 6.13(2). The fact that each of the vectors has a nonzero element where all the others have only zero components makes it particularly easy to conclude that α = β = γ = 0, and that is what is needed to apply Theorem 6.13.

There is another important setting in which it is easy to tell that vectors are linearly independent. Nonzero vectors F1, ..., Fk in R^n are said to be mutually orthogonal if each is orthogonal to each of the other vectors in the set. That is, Fi · Fj = 0 if i ≠ j. Mutually orthogonal nonzero vectors are necessarily linearly independent.

THEOREM 6.14
Let F1, ..., Fk be mutually orthogonal nonzero vectors in R^n. Then F1, ..., Fk are linearly independent.

Proof: Suppose

α1F1 + α2F2 + ··· + αkFk = 0.

For any j = 1, ..., k,

(α1F1 + α2F2 + ··· + αkFk) · Fj = 0
= α1F1 · Fj + α2F2 · Fj + ··· + αjFj · Fj + ··· + αkFk · Fj
= αjFj · Fj = αj ||Fj||²,

because Fi · Fj = 0 if i ≠ j. But Fj is not the zero vector, so ||Fj||² ≠ 0, hence αj = 0. Therefore, each coefficient is zero, and F1, ..., Fk are linearly independent by Theorem 6.13(2). ■
EXAMPLE 6.21

The vectors (-4, 0, 0), (0, -2, 1), (0, 1, 2) are linearly independent in R^3, because each is orthogonal to the other two.

A "smallest" spanning set for a subspace of R^n is called a basis for that subspace.
DEFINITION 6.15 Basis

Let V be a subspace of R^n. A set of vectors F1, ..., Fk in V forms a basis for V if F1, ..., Fk are linearly independent and also span V.
Thus, for F1, ..., Fk to be a basis for V, every vector in V must be a linear combination of F1, ..., Fk, and if any Fj is omitted from the list F1, ..., Fk, the remaining vectors do not span V. In particular, if Fj is omitted, then the subspace spanned by F1, ..., F(j-1), F(j+1), ..., Fk cannot contain Fj, because by linear independence, Fj is not a linear combination of F1, ..., F(j-1), F(j+1), ..., Fk.
EXAMPLE 6.22

i, j, and k form a basis for R^3, and e1, e2, ..., en form a basis for R^n.
EXAMPLE 6.23

Let V be the subspace of R^n consisting of all n-vectors with zero first component. Then e2, ..., en form a basis for V.
EXAMPLE 6.24

In R^2, let V consist of all vectors parallel to the line y = 4x. Every vector in V is a multiple of (1, 4). This vector by itself forms a basis for V. In fact, any vector (a, 4a) with a ≠ 0 forms a basis for V.
EXAMPLE 6.25

In R^3, let M be the subspace of all vectors on or parallel to the plane x + y + z = 0. A vector (x, y, z) in R^3 is in M exactly when z = -x - y, so such a vector can be written

(x, y, z) = (x, y, -x - y) = x(1, 0, -1) + y(0, 1, -1).

The vectors (1, 0, -1) and (0, 1, -1) therefore span M. Since these two vectors are linearly independent, they form a basis for M.

We may think of a basis of V as a minimal linearly independent spanning set F1, ..., Fk for V. If we omit any of these vectors, the remaining vectors will not be enough to span V.
And if we use additional vectors, say the set F1, ..., Fk, H, then this set also spans V, but is not linearly independent (because H is a linear combination of F1, ..., Fk). There is nothing unique about a basis for a subspace of R^n. Any nontrivial subspace of R^n has infinitely many different bases. However, it is a theorem of linear algebra, which we will not prove, that for a given subspace V of R^n, every basis has the same number of vectors in it. This number is the dimension of the subspace.
DEFINITION 6.16 Dimension

The dimension of a subspace of R^n is the number of vectors in any basis for the subspace.
In particular, R^n (which is a subspace of itself) has dimension n, with a basis consisting of the n vectors e1, ..., en. The subspace in Example 6.25 has dimension 2.
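The basis of Example 6.25 can be exercised in code: membership in M is the condition x + y + z = 0, and the coefficients in the basis (1, 0, -1), (0, 1, -1) are simply x and y. An illustrative Python sketch (the names are ours, not the text's):

```python
def in_M(v, tol=1e-12):
    """Membership in M: the components satisfy x + y + z = 0."""
    return abs(sum(v)) < tol

def decompose(v):
    """Write a vector of M as x*(1, 0, -1) + y*(0, 1, -1)."""
    assert in_M(v), "vector is not in M"
    x, y, z = v
    return x, y   # coefficients on the two basis vectors

v = (2.0, -5.0, 3.0)
print(in_M(v))        # True: 2 - 5 + 3 = 0
x, y = decompose(v)
recombined = tuple(x * a + y * b for a, b in zip((1, 0, -1), (0, 1, -1)))
print(recombined)     # (2.0, -5.0, 3.0)
```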
SECTION 6.5 PROBLEMS

In each of Problems 1 through 10, determine whether the given vectors are linearly independent or dependent in R^n for appropriate n.

1. 3i + 2j, i - j in R^3

10. (3, 0, 0, 4), (2, 0, 0, 8) in R^4

11. Prove that three vectors in R^3 are linearly dependent if and only if their scalar triple product is zero. (See Problem 31 in Section 6.3.)

In each of Problems 12 through 16, use the result of Problem 11 to determine whether the three vectors in R^3 are linearly dependent or independent.

12. 3i + 6j - k, 8i + 2j - 4k, i - j + k
13. i + 6j - 2k, -i + 4j - 3k, i + 16j - 7k
14. 4i - 3j + k, 10i - 3j, 2i - 6j + 3k
15. 8i + 6j, 2i - 4j, i + k

19. S consists of all vectors in the plane 2x - y + z = 0.
20. S consists of all vectors (x, y, -y, x - y, z) in R^5.
21. S consists of all vectors in R^4 with zero second component.
22. S consists of all vectors (-x, x, y, 2y) in R^4.
23. S consists of all vectors parallel to the line y = 4x in R^2.
24. S consists of all vectors parallel to the plane 4x + 2y - z = 0 in R^3.
CHAPTER 7
Matrices and Systems of Linear Equations
This chapter is devoted to the notation and algebra of matrices, as well as their use in solving systems of linear algebraic equations. To illustrate the idea of a matrix, consider a system of linear equations:

x1 + 2x2 - x3 + 4x4 = 0
3x1 - 4x2 + 2x3 - 6x4 = 0
x1 - 3x2 - 2x3 + x4 = 0.

All of the information needed to solve this system lies in its coefficients. Whether the first unknown is called x1, or y1, or some other name, is unimportant. It is important, however, that the coefficient of the first unknown in the second equation is 3. If we change this number we may change the solutions of the system. We can therefore work with such a system by storing its coefficients in an array called a matrix:

( 1   2  -1   4
  3  -4   2  -6
  1  -3  -2   1 )

This matrix displays the coefficients in the pattern in which they appear in the system of equations. The coefficients of the i-th equation are in row i, and the coefficients of the j-th unknown xj are in column j. The number in row i, column j is the coefficient of xj in equation i. But matrices provide more than a visual aid or storage device. The algebra and calculus of matrices will form the basis for methods of solving systems of linear algebraic equations, and later for solving systems of linear differential equations and analyzing solutions of systems of nonlinear differential equations.
7.1 Matrices

DEFINITION 7.1 Matrix

An n by m matrix is an array of objects arranged in n rows and m columns.

We will denote matrices by boldface type, as was done with vectors. When A is an n by m matrix, we often write that A is n × m (read "n by m"). The first integer is the number of rows in the matrix, and the second integer is the number of columns. The objects in the matrix may be numbers, functions, or other quantities. For example,

( 1  π  -5
  2  1   … )

is a 2 × 3 matrix,

( cos(x)   x²
  e^(2x)   4x )

is a 2 × 2 matrix, and a single column of four entries is a 4 × 1 matrix.

A matrix having the same number of rows as columns is called a square matrix. The 2 × 2 matrix shown above is square. The object in row i and column j of a matrix is called the i, j element, or i, j entry, of the matrix. If a matrix is denoted by an upper case letter, say A, then its i, j element is often denoted a_ij, and we write A = [a_ij]. For example, if

H = [h_ij] = ( 0            x
               1 - sin(x)   1 - 2i
               x²           i )

then H is a 3 × 2 matrix, and h11 = 0, h12 = x, h21 = 1 - sin(x), h22 = 1 - 2i, h31 = x², and h32 = i. We will be dealing with matrices whose elements are real or complex numbers, or functions. Sometimes it is also convenient to denote the i, j element of A by (A)_ij. In the matrix H, (H)22 = 1 - 2i and (H)31 = x².
DEFINITION 7.2 Equality of Matrices

A = [a_ij] and B = [b_ij] are equal if and only if they have the same number of rows, the same number of columns, and, for each i and j, a_ij = b_ij.

If two matrices either have different numbers of rows or columns, or if the objects in a particular location in the matrices are different, then the matrices are unequal.
7.1.1 Matrix Algebra

We will develop the operations of addition and multiplication of matrices and multiplication of a matrix by a number.
DEFINITION 7.3 Matrix Addition

If A = [a_ij] and B = [b_ij] are n × m matrices, then their sum is the n × m matrix

A + B = [a_ij + b_ij].
We therefore add matrices by adding corresponding elements. If two matrices are of different dimensions (different numbers of rows or columns), then they cannot be added, just as we do not add 4-vectors and 7-vectors.
DEFINITION 7.4 Product of a Matrix and a Scalar

If A = [a_ij] and α is a scalar, then αA is the matrix defined by

αA = [αa_ij].

This means that we multiply a matrix by α by multiplying each element of the matrix by α. For example,

3 ( 2  0       (  6   0
    1  2    =     3   6
    0  0          0   0
    4  6 )       12  18 )

and

x (  1   cos(x)      (  x    x cos(x)
    -x   x        =    -x²   x² ).
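Matrix addition and multiplication by a scalar are both entrywise operations, which makes them one-line list comprehensions in code. The following Python sketch is illustrative only (the function names are ours):

```python
def mat_add(a, b):
    """Entrywise sum; defined only for matrices of the same dimensions."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("dimensions must match")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scalar_mul(c, a):
    """Multiply every element of the matrix by the scalar c."""
    return [[c * x for x in row] for row in a]

A = [[2, 0], [1, 2], [0, 0], [4, 6]]
print(scalar_mul(3, A))              # [[6, 0], [3, 6], [0, 0], [12, 18]]
print(mat_add([[1, 2]], [[3, -2]]))  # [[4, 0]]
```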
Some, but not all, pairs of matrices can be multiplied .
DEFINITION 7.5 Multiplication of Matrices

Let A = [a_ij] be an n × r matrix, and B = [b_ij] an r × m matrix. Then the matrix product AB is the n × m matrix whose i, j element is

a_i1 b_1j + a_i2 b_2j + ··· + a_ir b_rj.

That is,

(AB)_ij = Σ (k = 1 to r) a_ik b_kj.
If we think of each row of A as an r-vector, and each column of B as an r-vector, then the i, j element of AB is the dot product of row i of A with column j of B:

i, j element of AB = (row i of A) · (column j of B).

This is why the number of columns of A must equal the number of rows of B for AB to be defined. These rows of A and columns of B must be vectors of the same length in order to take this dot product. Thus not every pair of matrices can be multiplied. Further, even when AB is defined, BA need not be. We will give one rationale for defining matrix multiplication in this way shortly. First we will look at some examples of matrix products and then develop the rules of matrix algebra.
EXAMPLE 7.1

Let

A = ( 1  3          B = ( 1  1  3
      2  5 )  and         2  1  4 ).

Then A is 2 × 2 and B is 2 × 3, so AB is defined (the number of columns of A equals the number of rows of B). Further, AB is 2 × 3 (number of rows of A, number of columns of B). Now compute

AB = ( (1, 3) · (1, 2)   (1, 3) · (1, 1)   (1, 3) · (3, 4)
       (2, 5) · (1, 2)   (2, 5) · (1, 1)   (2, 5) · (3, 4) )

   = (  7   4  15
       12   7  26 ).

In this example, BA is not defined, because the number of columns of B, which is 3, does not equal the number of rows of A, which is 2.
EXAMPLE 7.2

Let

A = ( 1  1  2  1          B = ( -1   8
      4  1  6  2 )  and          2   1
                                 1   1
                                12   6 ).

Since A is 2 × 4 and B is 4 × 2, AB is defined and is a 2 × 2 matrix:

AB = ( (1, 1, 2, 1) · (-1, 2, 1, 12)   (1, 1, 2, 1) · (8, 1, 1, 6)
       (4, 1, 6, 2) · (-1, 2, 1, 12)   (4, 1, 6, 2) · (8, 1, 1, 6) )

   = ( 15  17
       28  51 ).
In this example, BA is also defined, and is 4 × 4:
BA = ( 31   7  46  15
        6   3  10   4
        5   2   8   3
       36  18  60  24 ).

As the last example shows, even when both AB and BA are defined, these may be matrices of different dimensions. Matrix multiplication is noncommutative, and it is the exception rather than the rule to have AB equal BA. If A is a square matrix, then AA is defined and is also square. Denote AA as A². Similarly, A(A²) = A³ and, for any positive integer k, A^k = AA···A, a product with k factors. Some of the rules for manipulating matrices are like those for real numbers.
THEOREM 7.1
Let A, B, and C be matrices. Then, whenever the indicated operations are defined, we have:
1. A + B = B + A.
2. A(B + C) = AB + AC.
3. (A + B)C = AC + BC.
4. A(BC) = (AB)C.

For (1), both matrices must have the same dimensions, say n × m. For (2), B and C must have the same dimensions, and the number of columns in A must equal the number of rows in B and in C. For (4), A must be n × r, B must be r × k, and C must be k × m. Then A(BC) and (AB)C are n × m.
Proof: The theorem is proved by direct appeal to the definitions. We will provide the details for (1) and (2). To prove (1), let A = [a_ij] and B = [b_ij]. Then

A + B = [a_ij + b_ij] = [b_ij + a_ij] = B + A,

because each a_ij and b_ij is a number or function, and the addition of these objects is commutative. For (2), let A = [a_ij], B = [b_ij], and C = [c_ij]. Suppose A is n × k and B and C are k × m. Then B + C is k × m, so A(B + C) is defined and is n × m. And AB and AC are both defined and n × m. There remains to show that the i, j element of AB + AC is the same as the i, j element of A(B + C). Row i of A, and columns j of B and C, are k-vectors, and from properties of the dot product,

i, j element of A(B + C) = (row i of A) · (column j of B + C)
= (row i of A) · (column j of B) + (row i of A) · (column j of C)
= (i, j element of AB) + (i, j element of AC)
= i, j element of AB + AC. ■
We have already noted that matrix multiplication does not behave in some ways like multiplication of numbers. Here is a summary of three significant differences.

Difference 1 For matrices, even when AB and BA are both defined, possibly AB ≠ BA.
EXAMPLE 7.3

(  1  3 ) ( -1  2 )   ( -1  5 )        ( -1  2 ) (  1  3 )   ( -5  5 )
( -2  4 ) (  0  1 ) = (  2  0 ),  but  (  0  1 ) ( -2  4 ) = ( -2  4 ).

Difference 2 There is no cancellation in products. If AB = AC, we cannot infer that B = C.
EXAMPLE 7.4

( 3  1 ) ( 2  4 )   ( 7  18 )       ( 3  1 ) ( 1  5 )   ( 7  18 )
( 3  1 ) ( 1  6 ) = ( 7  18 ),  and ( 3  1 ) ( 4  3 ) = ( 7  18 ),

so AB = AC even though B ≠ C.
Difference 3 The product of two nonzero matrices may be zero .
EXAMPLE 7.5

( 1  3 ) ( -3   6 )   ( 0  0 )
( 0  0 ) (  1  -2 ) = ( 0  0 ).
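All three differences are easy to witness numerically. The illustrative Python sketch below (matrices chosen for demonstration, not taken from the text) shows AB ≠ BA and a zero product of nonzero matrices:

```python
def mat_mul(a, b):
    """Standard matrix product: (AB)_ij = sum over k of a_ik * b_kj."""
    n, r, m = len(a), len(a[0]), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(r)) for j in range(m)]
            for i in range(n)]

# Difference 1: AB and BA can differ.
A = [[1, 3], [-2, 4]]
B = [[-1, 2], [0, 1]]
print(mat_mul(A, B))   # [[-1, 5], [2, 0]]
print(mat_mul(B, A))   # [[-5, 5], [-2, 4]]

# Difference 3: a product of nonzero matrices can be the zero matrix.
C = [[1, 3], [0, 0]]
D = [[-3, 6], [1, -2]]
print(mat_mul(C, D))   # [[0, 0], [0, 0]]
```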
7.1.2 Matrix Notation for Systems of Linear Equations
Matrix notation is very efficient for writing systems of linear algebraic equations. Consider, for example, the system

2x1 - x2 + 3x3 + x4 = 1
x1 + 3x2 - 2x4 = 0
-4x1 - x2 + 2x3 - 9x4 = -3.

The matrix of coefficients of this system is the 3 × 4 matrix

A = (  2  -1  3   1
       1   3  0  -2
      -4  -1  2  -9 ).
Row i contains the coefficients of the i-th equation, and column j contains the coefficients of the j-th unknown xj. Define

X = ( x1            (  1
      x2    and B =    0
      x3              -3 ).
      x4 )
Then

(  2  -1  3   1 ) ( x1 )   (  1 )
(  1   3  0  -2 ) ( x2 ) = (  0 ).
( -4  -1  2  -9 ) ( x3 )   ( -3 )
                  ( x4 )
We can therefore write the system of equations in matrix form as AX = B. This is more than just notation. Soon this matrix formulation will enable us to use matrix operations to solve the system. A similar approach can be taken toward systems of linear differential equations. Consider the system

x1' + t x2' - x3' = 2t - 1
t² x1' - cos(t) x2' - x3' = e^t.
Let

A = ( 1    t        -1         X = ( x1                ( 2t - 1
      t²  -cos(t)   -1 ),            x2 ),  and  F =     e^t ).
                                     x3 )

Then the system can be written

AX' = F,

in which X' is formed by differentiating each matrix element of X. As with systems of linear algebraic equations, this formulation will enable us to bring matrix methods to bear on solving the system of differential equations. In both of these formulations, the definition of matrix product played a key role. Matrix multiplication may seem unmotivated at first, but it is just right for converting a system of linear algebraic or differential equations to a matrix equation.
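The matrix form AX = B is convenient computationally as well: multiplying a candidate column X by A evaluates the left side of every equation at once. An illustrative Python sketch for the algebraic system above (the helper names are ours):

```python
def mat_vec(a, x):
    """Multiply an n x m matrix by an m-vector (the column X)."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]

# Coefficient matrix and right-hand side of the algebraic system above.
A = [[ 2, -1, 3,  1],
     [ 1,  3, 0, -2],
     [-4, -1, 2, -9]]
B = [1, 0, -3]

def solves(x):
    """AX = B holds exactly when x satisfies every equation of the system."""
    return mat_vec(A, x) == B

print(mat_vec(A, [1, 1, 1, 1]))  # [5, 2, -12]
print(solves([1, 1, 1, 1]))      # False
```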
7.1.3 Some Special Matrices
Some matrices occur often enough to warrant special names and notation .
DEFINITION 7.6
Zero Matrix
O_{n,m} denotes the n × m zero matrix, having each element equal to zero.
CHAPTER 7  Matrices and Systems of Linear Equations
For example,

$$O_{2,3} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

If A is n × m, then

A + O_{n,m} = O_{n,m} + A = A.

The negative of A is the matrix obtained by replacing each element of A with its negative. This matrix is denoted -A. If A = [a_{ij}], then -A = [-a_{ij}]. If A is n × m, then A + (-A) = O_{n,m}. Usually we write A + (-B) as A - B.
DEFINITION 7.7
Identity Matrix
The n × n identity matrix is the matrix I_n having each i, i element equal to 1 and each i, j element equal to zero if i ≠ j.

For example,

$$I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
THEOREM 7.2

If A is n × m, then AI_m = I_nA = A. We leave a proof of this to the student.
EXAMPLE 7.6

Let

$$A = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ -1 & 8 \end{pmatrix}.$$

Then

$$I_3A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 2 & 1 \\ -1 & 8 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ -1 & 8 \end{pmatrix} = A$$

and

$$AI_2 = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ -1 & 8 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ -1 & 8 \end{pmatrix} = A. \; ■$$
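Theorem 7.2 is easy to confirm numerically. A minimal sketch in Python (the helper functions are ours) checks I_3A = A and AI_2 = A for the matrix of Example 7.6:

```python
def matmul(P, Q):
    # (PQ)_ij = sum over s of P_is * Q_sj
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def identity(n):
    # I_n has i,i elements 1 and all other elements 0
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 0], [2, 1], [-1, 8]]       # the 3x2 matrix of Example 7.6
print(matmul(identity(3), A) == A)  # True: I3 A = A
print(matmul(A, identity(2)) == A)  # True: A I2 = A
```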
DEFINITION 7.8  Transpose
If A = [a_{ij}] is an n × m matrix, then the transpose of A is the m × n matrix A^t = [a_{ji}].

The transpose of A is formed by making row k of A, column k of A^t.
EXAMPLE 7.7

Let

$$A = \begin{pmatrix} 1 & 6 & 3 & 3 \\ 0 & \pi & 12 & -5 \end{pmatrix}.$$

This is a 2 × 4 matrix. The transpose is the 4 × 2 matrix

$$A^t = \begin{pmatrix} 1 & 0 \\ 6 & \pi \\ 3 & 12 \\ 3 & -5 \end{pmatrix}. \; ■$$
THEOREM 7.3
1. (I_n)^t = I_n.
2. For any matrix A, (A^t)^t = A.
3. If AB is defined, then (AB)^t = B^tA^t.

Conclusion (1) should not be surprising, since row i of I_n is the same as column i, so interchanging rows and columns has no effect.
Similarly, (2) is intuitively clear. If we interchange rows and columns of A to form A^t, and then interchange the rows and columns of A^t, we should put everything back where it was, resulting in A again.
We will prove conclusion (3).
Proof  Let A = [a_{ij}] be n × k and let B = [b_{ij}] be k × m. Then AB is defined and is n × m. Since B^t is m × k and A^t is k × n, then B^tA^t is defined and is m × n. Thus (AB)^t and B^tA^t have the same dimensions. Now we must show that the i, j element of (AB)^t equals the i, j element of B^tA^t. Falling back on the definition of matrix product, we have

$$i,j \text{ element of } B^tA^t = \sum_{s=1}^{k} (B^t)_{is}(A^t)_{sj} = \sum_{s=1}^{k} b_{si}a_{js} = \sum_{s=1}^{k} a_{js}b_{si} = j,i \text{ element of } AB = i,j \text{ element of } (AB)^t.$$

This completes the proof of (3). ■
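Conclusion (3) can also be checked on a concrete pair of matrices. The values below are illustrative choices of ours, not from the text:

```python
def matmul(P, Q):
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    # Row k of M becomes column k of the result
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

A = [[1, 2, 0], [3, -1, 4]]    # 2x3
B = [[2, 1], [0, 5], [-3, 2]]  # 3x2, so AB is defined and is 2x2
lhs = transpose(matmul(A, B))
rhs = matmul(transpose(B), transpose(A))
print(lhs == rhs)  # True
print(lhs)         # [[2, -6], [11, 6]]
```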
In some calculations it is convenient to write the dot product of two n-vectors as a matrix product, using the transpose. Write the n-vector (x_1, x_2, ..., x_n) as an n × 1 column matrix:

$$X = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$$

Then X^t = (x_1  x_2  ⋯  x_n), a 1 × n matrix. Let (y_1, y_2, ..., y_n) also be an n-vector, which we write as an n × 1 column matrix:

$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$

Then X^tY is the 1 × 1 matrix

$$X^tY = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = (x_1y_1 + x_2y_2 + \cdots + x_ny_n) = X \cdot Y.$$
Here we have written the resulting 1 × 1 matrix as just its single element, without the matrix brackets. This is common practice for 1 × 1 matrices. We now have the dot product of two n-vectors, written as n × 1 column vectors, as the matrix product X^tY. This will prove particularly useful when we treat eigenvalues of matrices.

7.1.4 Another Rationale for the Definition of Matrix Multiplication

We have seen that matrix multiplication allows us to write linear systems of algebraic and differential equations in compact matrix form as AX = B or AX' = F. Matrix products are also tailored to other purposes, such as changes of variables in linear equations. To illustrate, consider a 2 × 2 linear system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= c_1 \\ a_{21}x_1 + a_{22}x_2 &= c_2. \end{aligned} \tag{7.1}$$

Change variables by putting

$$\begin{aligned} x_1 &= h_{11}y_1 + h_{12}y_2 \\ x_2 &= h_{21}y_1 + h_{22}y_2. \end{aligned} \tag{7.2}$$

Then

$$a_{11}(h_{11}y_1 + h_{12}y_2) + a_{12}(h_{21}y_1 + h_{22}y_2) = c_1$$

and

$$a_{21}(h_{11}y_1 + h_{12}y_2) + a_{22}(h_{21}y_1 + h_{22}y_2) = c_2.$$

After rearranging terms, the transformed system is

$$\begin{aligned} (a_{11}h_{11} + a_{12}h_{21})y_1 + (a_{11}h_{12} + a_{12}h_{22})y_2 &= c_1 \\ (a_{21}h_{11} + a_{22}h_{21})y_1 + (a_{21}h_{12} + a_{22}h_{22})y_2 &= c_2. \end{aligned} \tag{7.3}$$

Now carry out the same transformation using matrices. Write the original system (7.1) as AX = C, where

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad X = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad\text{and}\quad C = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},$$

and the equations of the transformation (7.2) as X = HY, where

$$H = \begin{pmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{pmatrix} \quad\text{and}\quad Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

Then

AX = A(HY) = (AH)Y = C.

Now observe that

$$AH = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{pmatrix} = \begin{pmatrix} a_{11}h_{11} + a_{12}h_{21} & a_{11}h_{12} + a_{12}h_{22} \\ a_{21}h_{11} + a_{22}h_{21} & a_{21}h_{12} + a_{22}h_{22} \end{pmatrix},$$
exactly as we found in the system (7.3) by term-by-term substitution. The definition of matrix product is just what is needed to carry out a linear change of variables. This idea also applies to linear transformations in systems of differential equations.

7.1.5 Random Walks in Crystals

We will conclude this section with another application of matrix multiplication, this time to the problem of enumerating the paths atoms can take through a crystal lattice.

Crystals have sites arranged in a lattice pattern. An atom may jump from a site it occupies to any adjacent, vacant site. If more than one adjacent site is vacant, the atom "selects" its target site at random. The path such an atom makes through the crystal is called a random walk.

We can represent the lattice of locations and adjacencies by drawing a point for each location, with a line between two points only if an atom can move directly from one to the other in the crystal. Such a diagram is called a graph. Figure 7.1 shows a typical graph. In this graph an atom could move from point v_1 to v_2 or v_3, to which it is connected by lines, but not directly to v_6 because there is no line between v_1 and v_6.

Two points are called adjacent in G if there is a line between them in the graph. A point is not considered adjacent to itself; there are no lines starting and ending at the same point.

A walk of length n in such a graph is a sequence t_1, ..., t_{n+1} of points (not necessarily different) with each t_j adjacent to t_{j+1} in the graph. Such a walk represents a possible path an atom might take through various sites in the crystal. Points may repeat in a walk because an atom may return to the same site any number of times. A v_i - v_j walk is a walk that begins at v_i and ends at v_j.
FIGURE 7.1  A typical graph.
Physicists and materials engineers who study crystals are interested in the following question: given a crystal with n sites labeled v_1, ..., v_n, how many different walks of length k are there between any two sites (or from a site back to itself)?

Matrices enter into the solution of this problem as follows. Define the adjacency matrix A of the graph to be the n × n matrix having each i, i element zero, and for i ≠ j, the i, j element equal to 1 if there is a line in the graph between v_i and v_j, and 0 if there is no such line. For example, the graph of Figure 7.1 has adjacency matrix

$$A = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 1 & 0 \end{pmatrix}.$$

The 1, 2 element of A is 1 because there is a line between v_1 and v_2, while the 1, 5 element is zero because there is no line between v_1 and v_5. The following remarkable theorem uses the adjacency matrix to solve the walk-enumeration problem.
THEOREM 7.4

Let A = [a_{ij}] be the adjacency matrix of a graph G having points v_1, ..., v_n. Let k be any positive integer. Then the number of distinct v_i - v_j walks of length k in G is equal to the i, j element of A^k.

We can therefore calculate the number of random walks of length k between any two points (or from any point back to itself) by reading the elements of the kth power of the adjacency matrix.

Proof  Proceed by mathematical induction on k. First consider the case k = 1. If i ≠ j, there is a v_i - v_j walk of length 1 in G exactly when there is a line between v_i and v_j, and in this case a_{ij} = 1. There is no v_i - v_j walk of length 1 if v_i and v_j have no line between them, and in this case a_{ij} = 0. If i = j, there is no v_i - v_i walk of length 1, and a_{ii} = 0. Thus, in the case k = 1, the i, j element of A gives the number of walks of length 1 from v_i to v_j, and the conclusion of the theorem is true.
Now assume that the conclusion of the theorem is true for walks of length k. We will prove that the conclusion holds for walks of length k + 1. Thus, we are assuming that the i, j
FIGURE 7.3
element of A^k is the number of distinct v_i - v_j walks of length k in G, and we want to prove that the i, j element of A^{k+1} is the number of distinct v_i - v_j walks of length k + 1.
Consider how a v_i - v_j walk of length k + 1 is formed. First there must be a v_i - v_r walk of length 1 from v_i to some point v_r adjacent to v_i, followed by a v_r - v_j walk of length k (Figure 7.2). Therefore

number of distinct v_i - v_j walks of length k + 1 = sum of the number of distinct v_r - v_j walks of length k,

with the sum taken over all points v_r adjacent to v_i. Now a_{ir} = 1 if v_r is adjacent to v_i, and a_{ir} = 0 otherwise. Further, by the inductive hypothesis, the number of distinct v_r - v_j walks of length k is the r, j element of A^k. Denote A^k = B = [b_{ij}]. Then, for r = 1, ..., n,

a_{ir}b_{rj} = 0 if v_r is not adjacent to v_i

and

a_{ir}b_{rj} = the number of distinct v_i - v_j walks of length k + 1 passing through v_r, if v_r is adjacent to v_i.

Therefore the number of v_i - v_j walks of length k + 1 in G is

$$a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj},$$

because this counts the number of walks of length k from v_r to v_j for each point v_r adjacent to v_i. But this sum is exactly the i, j element of AB, which is A^{k+1}. This completes the proof by induction. ■

For example, the adjacency matrix of the graph of Figure 7.3 is

$$A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \end{pmatrix}.$$
Suppose we want the number of v_4 - v_7 walks of length 3 in G. Calculate

$$A^3 = \begin{pmatrix} 0 & 5 & 1 & 4 & 2 & 4 & 3 & 2 \\ 5 & 2 & 7 & 4 & 5 & 4 & 9 & 8 \\ 1 & 7 & 0 & 8 & 3 & 2 & 3 & 2 \\ 4 & 4 & 8 & 6 & 8 & 8 & 11 & 10 \\ 2 & 5 & 3 & 8 & 4 & 6 & 8 & 4 \\ 4 & 4 & 2 & 8 & 6 & 2 & 4 & 4 \\ 3 & 9 & 3 & 11 & 8 & 4 & 6 & 7 \\ 2 & 8 & 2 & 10 & 4 & 4 & 7 & 4 \end{pmatrix}.$$

We read from the 4, 7 element of A^3 that there are 11 walks of length 3 from v_4 to v_7. For this relatively simple graph, we can actually list all these walks:

v_4v_7v_4v_7; v_4v_3v_4v_7; v_4v_8v_4v_7; v_4v_5v_4v_7; v_4v_6v_4v_7;
v_4v_7v_8v_7; v_4v_7v_5v_7; v_4v_7v_2v_7;
v_4v_3v_2v_7; v_4v_8v_2v_7; v_4v_6v_5v_7.

Obviously it would not be practical to determine the number of v_i - v_j walks of length k by explicitly listing them if k or n is large. Software routines for matrix calculations make this theorem a practical solution to the random-walk counting problem.
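Such a routine takes only a few lines. The sketch below (Python, our choice of tool) cubes the adjacency matrix given above, reads the 4, 7 entry (indices 3 and 6 when counting from zero), and cross-checks it against a brute-force enumeration of walks:

```python
# Adjacency matrix of the graph of Figure 7.3
A = [[0, 1, 0, 0, 0, 1, 0, 0],
     [1, 0, 1, 0, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 1, 1, 1, 1],
     [0, 0, 0, 1, 0, 1, 1, 0],
     [1, 0, 0, 1, 1, 0, 0, 0],
     [0, 1, 0, 1, 1, 0, 0, 1],
     [0, 1, 0, 1, 0, 0, 1, 0]]

def matmul(P, Q):
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

A3 = matmul(matmul(A, A), A)

def walks(i, j, k):
    # Count vi-vj walks of length k by following lines step by step.
    if k == 0:
        return 1 if i == j else 0
    return sum(walks(r, j, k - 1) for r in range(len(A)) if A[i][r] == 1)

print(A3[3][6])        # 11: the 4,7 element of A^3
print(walks(3, 6, 3))  # 11: direct enumeration agrees
```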
In each of Problems 1 through 6, carry out the requeste d computation with the given matrices A and B. 1 -1 2 -4 1 ks-1 2A-3B -2 2
7.2 Elementary Row Operations and Elementary Matrices 3 14. A -
-*
,
B = (3
-2
24. Let G be the graph of Figure 7 .5. Determine the number of v1 - v4 walks of length 4 in G. Determine the number of v 2 - v 3 walks of length 2 .
7)
4 15. A
=I
1
6) ' B - (-4
= (-3 \ 3
7
0)
5
7
2
8 , B = (-5
16. A
2)
-3
In each of Problems 17 through 21, determine if AB is defined and if BA is defined. For those products that are defined, give the dimensions of the product matrix.

17. A is 14 × 21, B is 21 × 14.
18. A is 18 × 4, B is 18 × 4.
19. A is 6 × 2, B is 4 × 6.
20. A is 1 × 3, B is 3 × 3.
21. A is 7 × 6, B is 7 × 7.
22. Find nonzero 2 × 2 matrices A, B, and C such that BA = CA but B ≠ C.
23. Let G be the graph of Figure 7.4. Determine the number of v_1 - v_4 walks of length 3, the number of v_2 - v_3 walks of length 3, and the number of v_2 - v_4 walks of length 4 in G.
FIGURE 7.4
FIGURE 7.5
25. Let G be the graph of Figure 7.6. Determine the number of v_4 - v_5 walks of length 2, the number of v_2 - v_3 walks of length 3, and the number of v_1 - v_2 and v_4 - v_5 walks of length 4 in G.
FIGURE 7.6
26. Let A be the adjacency matrix of a graph G. (a) Prove that the i, i element of A^2 equals the number of points of G that are neighbors of v_i in G. This number is called the degree of v_i. (b) Prove that the i, i element of A^3 equals twice the number of triangles in G containing v_i as a vertex. A triangle in G consists of three points, each a neighbor of the other two.
7.2 Elementary Row Operations and Elementary Matrices

When we solve a system of linear algebraic equations by elimination of unknowns, we routinely perform three kinds of operations: interchange of equations, multiplication of an equation by a nonzero constant, and addition of a constant multiple of one equation to another equation.
When we write a homogeneous system in matrix form AX = O, row k of A lists the coefficients in equation k of the system. The three operations on equations correspond, respectively, to the interchange of two rows of A, multiplication of a row of A by a nonzero constant, and addition of a scalar multiple of one row of A to another row of A. We will focus on these row operations in anticipation of using them to solve the system.
DEFINITION 7.9  Elementary Row Operations

Let A be an n × m matrix. The three elementary row operations that can be performed on A are:
1. Type I operation: interchange two rows of A.
2. Type II operation: multiply a row of A by a nonzero constant.
3. Type III operation: add a scalar multiple of one row to another row.
The rows of A are m-vectors. In a Type II operation, we multiply a row by a nonzero constant by multiplying each element of that row by the constant. Similarly, in a Type III operation, we add a scalar multiple of one row vector to another row vector.
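The three operations are simple to express as functions. The sketch below is ours (the function names are hypothetical, not from the text); it applies one operation of each type to a small 4 × 3 matrix:

```python
def swap_rows(M, i, j):
    # Type I: interchange rows i and j (rows counted from zero)
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale_row(M, i, c):
    # Type II: multiply row i by the nonzero constant c
    M = [row[:] for row in M]
    M[i] = [c * x for x in M[i]]
    return M

def add_multiple(M, src, c, dest):
    # Type III: add c times row src to row dest
    M = [row[:] for row in M]
    M[dest] = [x + c * y for x, y in zip(M[dest], M[src])]
    return M

A = [[-2, 1, 6], [1, 1, 2], [0, 1, 3], [2, -3, 4]]
print(swap_rows(A, 1, 3))        # interchange rows 2 and 4
print(scale_row(A, 1, 7))        # multiply row 2 by 7
print(add_multiple(A, 0, 2, 2))  # add 2(row 1) to row 3
```

Each function copies the matrix before modifying it, so A itself is left unchanged.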
EXAMPLE 7.8

Let

$$A = \begin{pmatrix} -2 & 1 & 6 \\ 1 & 1 & 2 \\ 0 & 1 & 3 \\ 2 & -3 & 4 \end{pmatrix}.$$

Type I operation: if we interchange rows 2 and 4 of A, we obtain the new matrix

$$\begin{pmatrix} -2 & 1 & 6 \\ 2 & -3 & 4 \\ 0 & 1 & 3 \\ 1 & 1 & 2 \end{pmatrix}.$$

Type II operation: multiply row 2 of A by 7 to get

$$\begin{pmatrix} -2 & 1 & 6 \\ 7 & 7 & 14 \\ 0 & 1 & 3 \\ 2 & -3 & 4 \end{pmatrix}.$$

Type III operation: add 2 times row 1 to row 3 of A, obtaining

$$\begin{pmatrix} -2 & 1 & 6 \\ 1 & 1 & 2 \\ -4 & 3 & 15 \\ 2 & -3 & 4 \end{pmatrix}. \; ■$$

Elementary row operations can be performed on any matrix. When performed on an identity matrix, we obtain special matrices that will be particularly useful. We therefore give matrices formed in this way a name.
DEFINITION 7.10
Elementary Matrix
An elementary matrix is a matrix formed by performing an elementary row operation on I_n.
For example,

$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

is an elementary matrix, obtained from I_3 by interchanging rows 1 and 2. And

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{pmatrix}$$

is the elementary matrix formed by adding -4 times row 1 of I_3 to row 3.
The following theorem is the reason why elementary matrices are interesting. It says that each elementary row operation on A can be performed by multiplying A on the left by an elementary matrix.
THEOREM 7.5

Let A be an n × m matrix. Let B be formed from A by an elementary row operation. Let E be the elementary matrix formed by performing this elementary row operation on I_n. Then

B = EA.

We leave a proof to the exercises. It is instructive to see the theorem in practice.
EXAMPLE 7.9

Let

$$A = \begin{pmatrix} 1 & -5 \\ 9 & 4 \\ -3 & 2 \end{pmatrix}.$$

Suppose we form B from A by interchanging rows 2 and 3 of A. We can do this directly. But we can also form an elementary matrix by performing this operation on I_3 to form

$$E = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$

Then

$$EA = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & -5 \\ 9 & 4 \\ -3 & 2 \end{pmatrix} = \begin{pmatrix} 1 & -5 \\ -3 & 2 \\ 9 & 4 \end{pmatrix} = B. \; ■$$
EXAMPLE 7.10

Let

$$A = \begin{pmatrix} 0 & -7 & 3 & 6 \\ 5 & 1 & -11 & 3 \end{pmatrix}.$$

Form C from A by multiplying row 2 by -8. Again, we can do this directly. However, if we form E by performing this operation on I_2, then

$$E = \begin{pmatrix} 1 & 0 \\ 0 & -8 \end{pmatrix}$$

and

$$EA = \begin{pmatrix} 1 & 0 \\ 0 & -8 \end{pmatrix}\begin{pmatrix} 0 & -7 & 3 & 6 \\ 5 & 1 & -11 & 3 \end{pmatrix} = \begin{pmatrix} 0 & -7 & 3 & 6 \\ -40 & -8 & 88 & -24 \end{pmatrix} = C. \; ■$$
EXAMPLE 7.11

Let

$$A = \begin{pmatrix} -6 & 14 & 2 \\ 4 & 4 & -9 \\ -3 & 2 & 13 \end{pmatrix}.$$

Form D from A by adding 6 times row 1 to row 2. If we perform this operation on I_3 to form

$$E = \begin{pmatrix} 1 & 0 & 0 \\ 6 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

then

$$EA = \begin{pmatrix} 1 & 0 & 0 \\ 6 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} -6 & 14 & 2 \\ 4 & 4 & -9 \\ -3 & 2 & 13 \end{pmatrix} = \begin{pmatrix} -6 & 14 & 2 \\ -32 & 88 & 3 \\ -3 & 2 & 13 \end{pmatrix} = D. \; ■$$
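Theorem 7.5 can be confirmed numerically: performing an operation on I_n and multiplying gives the same result as performing it on A directly. A sketch, using the matrix of Example 7.9 (the helper functions are ours):

```python
def matmul(P, Q):
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def swap_rows(M, i, j):
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

A = [[1, -5], [9, 4], [-3, 2]]
B = swap_rows(A, 1, 2)            # interchange rows 2 and 3 directly
E = swap_rows(identity(3), 1, 2)  # the same operation performed on I3
print(matmul(E, A) == B)          # True: EA = B
```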
Later we will want to perform not just one elementary row operation, but a sequence of such operations. Suppose we perform operation O_1 on A to form A_1, then operation O_2 on A_1 to form A_2, and so on, until finally we perform O_r on A_{r-1} to get A_r. This process can be diagrammed:

$$A \xrightarrow{O_1} A_1 \xrightarrow{O_2} A_2 \xrightarrow{O_3} \cdots \xrightarrow{O_{r-1}} A_{r-1} \xrightarrow{O_r} A_r.$$

Let E_j be the elementary matrix obtained by performing operation O_j on I_n. Then

$$\begin{aligned} A_1 &= E_1A, \\ A_2 &= E_2A_1 = (E_2E_1)A, \\ A_3 &= E_3A_2 = (E_3E_2E_1)A, \\ &\;\;\vdots \\ A_r &= E_rA_{r-1} = (E_rE_{r-1}\cdots E_3E_2E_1)A. \end{aligned}$$

This forms a matrix Ω = E_rE_{r-1}⋯E_2E_1 such that

A_r = ΩA.

The significance of this equation is that we have produced a matrix Ω such that multiplying A on the left by Ω performs a given sequence of elementary row operations. Ω is formed as a product E_rE_{r-1}⋯E_2E_1 of elementary matrices, in the correct order, with each elementary matrix performing one of the prescribed elementary row operations in the sequence (E_1 performs the first operation, E_2 the second, and so on, until E_r performs the last). We will record this result as a theorem.

THEOREM 7.6

Let A be an n × m matrix. If B is produced from A by any finite sequence of elementary row operations, then there is an n × n matrix Ω such that

B = ΩA.

The proof of the theorem is contained in the line of reasoning outlined just prior to its statement.
EXAMPLE 7.12

Let

$$A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & 2 \\ -1 & 3 & 2 \end{pmatrix}.$$

We will form a new matrix B from A by performing, in order, the following operations:

O_1: interchange rows 1 and 2 of A to form A_1.
O_2: multiply row 3 of A_1 by 2 to form A_2.
O_3: add two times row 1 to row 3 of A_2 to get A_3 = B.

If we perform this sequence in order, starting with A, we get

$$A \xrightarrow{O_1} A_1 = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 1 & 0 \\ -1 & 3 & 2 \end{pmatrix} \xrightarrow{O_2} A_2 = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 1 & 0 \\ -2 & 6 & 4 \end{pmatrix} \xrightarrow{O_3} A_3 = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 1 & 0 \\ -2 & 8 & 8 \end{pmatrix} = B.$$

To produce Ω such that B = ΩA, perform this sequence of operations in turn, beginning with I_3:

$$I_3 \xrightarrow{O_1} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \xrightarrow{O_2} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix} \xrightarrow{O_3} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 2 & 2 \end{pmatrix} = \Omega.$$

Now check that

$$\Omega A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 2 & 2 \end{pmatrix}\begin{pmatrix} 2 & 1 & 0 \\ 0 & 1 & 2 \\ -1 & 3 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 1 & 0 \\ -2 & 8 & 8 \end{pmatrix} = B.$$

It is also easy to check that Ω = E_3E_2E_1, where E_j is the elementary matrix obtained by performing operation O_j on I_3. ■
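That check can be automated. The sketch below (our code; the helper names are hypothetical) builds E_1, E_2, E_3 for the three operations of Example 7.12, multiplies them in the order E_3E_2E_1, and confirms that the product Ω satisfies ΩA = B:

```python
def matmul(P, Q):
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def swap_rows(M, i, j):
    M = [row[:] for row in M]; M[i], M[j] = M[j], M[i]; return M

def scale_row(M, i, c):
    M = [row[:] for row in M]; M[i] = [c * x for x in M[i]]; return M

def add_multiple(M, src, c, dest):
    M = [row[:] for row in M]
    M[dest] = [x + c * y for x, y in zip(M[dest], M[src])]
    return M

A = [[2, 1, 0], [0, 1, 2], [-1, 3, 2]]
E1 = swap_rows(identity(3), 0, 1)        # O1: interchange rows 1 and 2
E2 = scale_row(identity(3), 2, 2)        # O2: multiply row 3 by 2
E3 = add_multiple(identity(3), 0, 2, 2)  # O3: add 2(row 1) to row 3
omega = matmul(E3, matmul(E2, E1))       # last operation leftmost
B = add_multiple(scale_row(swap_rows(A, 0, 1), 2, 2), 0, 2, 2)
print(omega)                  # [[0, 1, 0], [1, 0, 0], [0, 2, 2]]
print(matmul(omega, A) == B)  # True
```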
EXAMPLE 7.13

Let

$$A = \begin{pmatrix} 6 & -1 & 1 & 4 \\ 9 & 3 & 7 & -7 \\ 0 & 2 & 1 & 5 \end{pmatrix}.$$

We want to perform, in succession and in the given order, the following operations:

O_1: add (-3)(row 2) to row 3,
O_2: add 2(row 1) to row 2,
O_3: interchange rows 1 and 3,
O_4: multiply row 2 by -4.

Suppose the end result of these operations is the matrix B. We will produce a 3 × 3 matrix Ω such that B = ΩA. Perform the sequence of operations, starting with I_3:

$$I_3 \xrightarrow{O_1} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix} \xrightarrow{O_2} \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & -3 & 1 \end{pmatrix} \xrightarrow{O_3} \begin{pmatrix} 0 & -3 & 1 \\ 2 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \xrightarrow{O_4} \begin{pmatrix} 0 & -3 & 1 \\ -8 & -4 & 0 \\ 1 & 0 & 0 \end{pmatrix} = \Omega.$$

Then

$$\Omega A = \begin{pmatrix} 0 & -3 & 1 \\ -8 & -4 & 0 \\ 1 & 0 & 0 \end{pmatrix}\begin{pmatrix} 6 & -1 & 1 & 4 \\ 9 & 3 & 7 & -7 \\ 0 & 2 & 1 & 5 \end{pmatrix} = \begin{pmatrix} -27 & -7 & -20 & 26 \\ -84 & -4 & -36 & -4 \\ 6 & -1 & 1 & 4 \end{pmatrix} = B.$$

It is straightforward to check that Ω = E_4E_3E_2E_1, where E_j is the elementary matrix obtained from I_3 by applying operation O_j. If the operations O_j are performed in succession, starting with A, then B results. ■
DEFINITION 7.11
Row Equivalence
Two matrices are row equivalent if and only if one can be obtained from the other by a sequence of elementary row operations.
In each of the last two examples, B is row equivalent to A. The relationship of row equivalence has the following properties:

THEOREM 7.7
1. Every matrix is row equivalent to itself. (This is the reflexive property.)
2. If A is row equivalent to B, then B is row equivalent to A. (This is the symmetry property.)
3. If A is row equivalent to B, and B is row equivalent to C, then A is row equivalent to C. (This is transitivity.)

It is sometimes of interest to undo the effect of an elementary row operation. This can always be done by the same kind of elementary row operation. Consider each kind of operation in turn.
If we interchange rows i and j of A to form B, then interchanging rows i and j of B yields A again. Thus a Type I operation can reverse a Type I operation.
If we form C from A by multiplying row i by a nonzero constant α, then multiplying row i of C by 1/α brings us back to A. A Type II operation can reverse a Type II operation.
Finally, suppose we form D from A by adding α(row i) to row j. Then

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{im} \\ \vdots & & \vdots \\ a_{j1} & \cdots & a_{jm} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{pmatrix} \quad\text{and}\quad D = \begin{pmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{im} \\ \vdots & & \vdots \\ \alpha a_{i1} + a_{j1} & \cdots & \alpha a_{im} + a_{jm} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{pmatrix}.$$

Now we can get from D back to A by adding -α(row i) to row j of D. Thus a Type III operation can be used to reverse a Type III operation. This ability to reverse the effects of elementary row operations will be useful later, and we will record it as a theorem.
THEOREM 7.8
Let E_1 be an elementary matrix that performs an elementary row operation on a matrix A. Then there is an elementary matrix E_2 such that E_2(E_1A) = A. ■ In fact, E_2E_1 = I_n.
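A quick numerical confirmation of Theorem 7.8, with one operation of each type (the helper functions are ours):

```python
def matmul(P, Q):
    return [[sum(P[i][s] * Q[s][j] for s in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def swap_rows(M, i, j):
    M = [row[:] for row in M]; M[i], M[j] = M[j], M[i]; return M

def scale_row(M, i, c):
    M = [row[:] for row in M]; M[i] = [c * x for x in M[i]]; return M

def add_multiple(M, src, c, dest):
    M = [row[:] for row in M]
    M[dest] = [x + c * y for x, y in zip(M[dest], M[src])]
    return M

I3 = identity(3)
P = swap_rows(I3, 0, 2)        # Type I: interchange rows 1 and 3
print(matmul(P, P) == I3)      # True: a swap undoes itself
S = scale_row(I3, 1, 4)        # Type II: multiply row 2 by 4 ...
Si = scale_row(I3, 1, 0.25)    # ... undone by multiplying row 2 by 1/4
print(matmul(Si, S) == I3)     # True (1.0 == 1 in Python)
T = add_multiple(I3, 0, 5, 2)    # Type III: add 5(row 1) to row 3 ...
Ti = add_multiple(I3, 0, -5, 2)  # ... undone by adding -5(row 1) to row 3
print(matmul(Ti, T) == I3)       # True
```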
In each of Problems 1 through 8, perform the row operation, or sequence of row operations, directly on A, and then find a matrix Ω such that the final result is ΩA.
add times roW 1 to row 2, then multiply row 3 by row 4, then add row 2 to row 3 . (-1 0 3 0 13 2 9 ; multiply row 3 by 4, then -9 7 -5 7 add 14 times row 1 to row 2, then interchange rows 3 and 2 .
14 2 ; interchange rows 2 and 3, the n 5 9 15 0 add 3 times row 2 to row 3, then interchange rows 1 and 3, then multiply row 3 by 4 .
I) ;add6 ti me srow 2torow3 . 0
8. A=
5
6 1 -3 1 ; add 13 times row 3 to row 1 , 8 5 2 9 then interchange rows 2 and 1, then multiply row 1 7-2 14
3. A =
0 1
-9
In Problems 9, 10, and 11, A is an n x m matrix.
by 5 .
9. Let B be formed from A by interchanging rows s and t. Let E be formed from I_n by interchanging rows s and t. Prove that B = EA.
7-4 6 - 3 4. A = 12 4 -4 I ; interchange rows 2 and 3, the n 13 0 add negative row 1 to row 2. / 5. A = I 2 18 I ; add A times row 2 to row 1, then multiply row 2 by 15, then interchange rows 1 and 2 . -4 5 9 6. A = (3 2 1 3 -6 ; add row 1 to row 3, then 1 13 2 6
10. Let B be formed from A by multiplying row s by α, and let E be formed from I_n by multiplying row s by α. Prove that B = EA.
11. Let B be formed from A by adding α times row s to row t. Let E be formed from I_n by adding α times row s to row t. Prove that B = EA.
7.3
The Row Echelon Form of a Matrix

Sometimes a matrix has a special form that makes it convenient to work with in solving certain problems. For solving systems of linear algebraic equations, we want the reduced row echelon form, or reduced form, of a matrix.
Let A be an n × m matrix. A zero row of A is a row having each element equal to zero. If at least one element of a row is nonzero, that row is a nonzero row. The leading entry of a nonzero row is its first nonzero element, reading from left to right. For example, if
$$A = \begin{pmatrix} 0 & 2 & 7 \\ 1 & -2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 9 \end{pmatrix},$$
then row three is a zero row and rows one, two, and four are nonzero rows. The leading entry of row 1 is 2, the leading entry of row 2 is 1, and the leading entry of row 4 is 9. We do not speak of a leading entry of a zero row. We can now define a reduced row echelon matrix.
DEFINITION 7.12
Reduced Row Echelon Matrix
A matrix is in reduced row echelon form if it satisfies the following conditions:
1. The leading entry of any nonzero row is 1.
2. If any row has its leading entry in column j, then all other elements of column j are zero.
3. If row i is a nonzero row and row k is a zero row, then i < k.
4. If the leading entry of row r_1 is in column c_1, and the leading entry of row r_2 is in column c_2, and if r_1 < r_2, then c_1 < c_2.

A matrix in reduced row echelon form is said to be in reduced form, or to be a reduced matrix.
A reduced matrix has a very special structure. By condition (1), if we move from left to right along a nonzero row, the first nonzero number we see is 1. Condition (2) means that, if we stand at the leading entry 1 of any row and look straight up or down, we see only zeros in the rest of this column. A reduced matrix need not have any zero rows. But if there is a zero row, it must be below any nonzero row. That is, all the zero rows are at the bottom of the matrix. Condition (4) means that the leading entries move downward to the right as we look at the matrix.
EXAMPLE 7 .1 4
The following four matrices are all reduced : (1 0 0 0 0 0
1 0 0 0
2 0 0 0
-4 0 0 1 0 0
1 0
0 0 0' 0
01
1
an d
0 (0 ' 0
1 0 0 1 0 0 0
3 0 0 0 1 0 0
0 1 0 0 0 1 0
, 3 -2 0 0
1 4 1 0
EXAMPLE 7.15

To see one context in which reduced matrices are interesting, consider the last matrix of the preceding example and suppose it is the matrix of coefficients of a system of homogeneous linear equations. This system is AX = O, and the equations are:

$$\begin{aligned} x_1 + 3x_4 + x_5 &= 0 \\ x_2 - 2x_4 + 4x_5 &= 0 \\ x_3 + x_5 &= 0. \end{aligned}$$

The fourth row represents the equation 0x_1 + 0x_2 + 0x_3 + 0x_4 + 0x_5 = 0, which we do not write out (it is satisfied by any numbers x_1 through x_5, and so provides no information). Because the matrix of coefficients is in reduced form, this system is particularly easy to solve. From the third equation,

x_3 = -x_5.

From the second equation,

x_2 = 2x_4 - 4x_5.

And from the first equation,

x_1 = -3x_4 - x_5.

We can therefore choose x_4 = α, any number, and x_5 = β, any number, and obtain a solution by choosing the other unknowns as

x_1 = -3α - β,  x_2 = 2α - 4β,  x_3 = -β.

The form of the reduced matrix is selected just so that, as a matrix of coefficients of a system of linear equations, the solution of these equations can be read by inspection. ■
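The read-off can be verified by substituting the general solution back into the reduced coefficient matrix. A sketch (our construction, in Python):

```python
R = [[1, 0, 0, 3, 1],    # the reduced matrix of coefficients
     [0, 1, 0, -2, 4],
     [0, 0, 1, 0, 1],
     [0, 0, 0, 0, 0]]

def solution(alpha, beta):
    # x4 = alpha and x5 = beta are free; the rest are read by inspection.
    return [-3 * alpha - beta, 2 * alpha - 4 * beta, -beta, alpha, beta]

# Every choice of alpha, beta satisfies all four equations RX = 0.
for a, b in [(1, 0), (0, 1), (2, -1), (-5, 7)]:
    x = solution(a, b)
    print(all(sum(row[j] * x[j] for j in range(5)) == 0 for row in R))  # True
```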
EXAMPLE 7.16

The matrix

$$A = \begin{pmatrix} 0 & 1 & 5 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$

is not reduced. The leading entry of row 2 is 1, as it must be, but there is a nonzero element in the column containing this leading entry. However, A is row equivalent to a reduced matrix. If we add -5(row 2) to row 1, we obtain

$$B = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix},$$

and this is a reduced matrix. ■
EXAMPLE 7.17

The matrix

$$C = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}$$

is not reduced. The leading entry of the first row is not 1, and the first column, containing this leading entry of row 1, has another nonzero element. In addition, the leading entry of row 3 is to the left of the leading entry of row 2, and this violates condition (4). However, C is row equivalent to a reduced matrix. First form D by multiplying row 1 by 1/2:

$$D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}.$$

Now form F from D by adding -(row 1) to row 3:

$$F = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Then F is a reduced matrix that is row equivalent to C, since it was formed by a sequence of elementary row operations, starting with C. ■

In the last two examples we had matrices that were not in reduced form, but could in both cases proceed to a reduced matrix by elementary row operations. We claim that this is always possible (although in general more operations may be needed than in these two examples).

THEOREM 7.9
Every matrix is row equivalent to a reduced matrix.

Proof  The proof consists of exhibiting a sequence of elementary row operations that will produce a reduced matrix. Let A be any matrix. If A is a zero matrix, we are done. Thus suppose that A has at least one nonzero row.
Reading from left to right across the matrix, find the first column having a nonzero element. Suppose this is column c_1. Reading from top to bottom in this column, suppose α is the top nonzero element. Say α is in row r_1. Multiply this row by 1/α to obtain a matrix B in which column c_1 has its top nonzero element equal to 1, and this is in row r_1. If any row below r_1 in B has a nonzero element β in column c_1, add -β times row r_1 to this row. In this way we obtain a matrix C that is row equivalent to A, having 1 in the r_1, c_1 position, and all other elements of column c_1 equal to zero.
Now interchange, if necessary, rows 1 and r_1 of C to obtain a matrix D having leading entry 1 in row 1 and column c_1, and all other elements of this column equal to zero. Further, by choice of c_1, any column of D to the left of column c_1 has all zero elements (if there is such a column). D is row equivalent to A. If D is reduced, we are done.
If not, repeat this procedure, but now look for the first column, say column c_2, to the right of column c_1 having a nonzero element below row 1. Let γ be the top nonzero element of this column lying below row 1. Say this element occurs in row r_2. Multiply row r_2 by 1/γ to obtain a new matrix E having 1 in the r_2, c_2 position. If this column has a nonzero element δ above or below row r_2, add -δ(row r_2) to that row. In this way we obtain a matrix F that is row equivalent to A and has leading entry 1 in row r_2 and all other elements of column c_2 equal to zero. Finally, form G from F by interchanging rows r_2 and 2, if necessary. If G is reduced we are done. If not, locate the first column to the right of column c_2 having a nonzero element and repeat the procedure used to form the first two rows of G.
Since A has only finitely many columns, eventually this process terminates in a reduced matrix R. Since R was obtained from A by elementary row operations, R is row equivalent to A and the proof is complete. ■

The process of obtaining a reduced matrix row equivalent to a given matrix A is referred to as reducing A. It is possible to reduce a matrix in many different ways (that is, by different sequences of elementary row operations). We claim that this does not matter and that for a given A any reduction process will result in the same reduced matrix.
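The procedure in the proof translates directly into a short routine. This sketch is ours (it uses exact rational arithmetic to avoid roundoff); it reduces a 3 × 3 matrix all the way to the identity:

```python
from fractions import Fraction

def reduce(M):
    # Reduce M to reduced row echelon form, following the proof of Theorem 7.9.
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        # Find the top nonzero element in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                          # column has no new pivot; move right
        M[r], M[pivot] = M[pivot], M[r]       # Type I: bring the pivot row up
        M[r] = [x / M[r][c] for x in M[r]]    # Type II: make the leading entry 1
        for i in range(rows):                 # Type III: clear the rest of column c
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

A = [[-2, 1, 3], [0, 1, 1], [2, 0, 1]]
print(reduce(A) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True: A_R = I3
```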
THEOREM 7.10

Let A be a matrix. Then there is exactly one reduced matrix A_R that is row equivalent to A. ■

We leave a proof of this result to the student. In view of this theorem, we can speak of the reduced form of a given matrix A. We will denote this matrix A_R.
EXAMPLE 7.18

Let

$$A = \begin{pmatrix} -2 & 1 & 3 \\ 0 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix}.$$

We want to find A_R. Column 1 has a nonzero element in row 1. Begin with the operations: multiply row 1 by -1/2, then add -2(row 1) to row 3:

$$A \rightarrow \begin{pmatrix} 1 & -1/2 & -3/2 \\ 0 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix} \rightarrow \begin{pmatrix} 1 & -1/2 & -3/2 \\ 0 & 1 & 1 \\ 0 & 1 & 4 \end{pmatrix}.$$

In the last matrix, column 2 has a nonzero element below row 1, the highest being 1 in the 2, 2 position. Since we want a 1 here, we do not have to multiply this row by anything. However, we want zeros above and below this 1, in the 1, 2 and 3, 2 positions. Thus add 1/2 times row 2 to row 1, and -(row 2) to row 3, in the last matrix to obtain

$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 3 \end{pmatrix}.$$

In this matrix column 3 has a nonzero element below row 2, in the 3, 3 location. Multiply row 3 by 1/3 to obtain

$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$

Finally, we want zeros above the 3, 3 position in column 3. Add row 3 to row 1 and -(row 3) to row 2 to get

$$A_R = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

This is A_R because it is a reduced matrix and it is row equivalent to A. ■
To illustrate the last theorem, we will use a different sequence of elementary row operations to reduce A, arriving at the same final result. Proceed:

add row 3 to row 1 → $$\begin{pmatrix} 0 & 1 & 4 \\ 0 & 1 & 1 \\ 2 & 0 & 1 \end{pmatrix}$$ → interchange rows 1 and 3 → $$\begin{pmatrix} 2 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 1 & 4 \end{pmatrix}$$ → (1/2)(row 1) → $$\begin{pmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 1 \\ 0 & 1 & 4 \end{pmatrix}$$ → add (-1)(row 2) to row 3 → $$\begin{pmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 1 \\ 0 & 0 & 3 \end{pmatrix}$$ → (1/3)(row 3) → $$\begin{pmatrix} 1 & 0 & 1/2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$$ → add (-1/2)(row 3) to row 1 and (-1)(row 3) to row 2 → $$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = A_R.$$
EXAMPLE 7.19

Let

$$B = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 4 & 3 & 4 & 0 \end{pmatrix}.$$

Reduce B as follows:

add -4(row 3) to row 4 → $$\begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 3 & 0 & -4 \end{pmatrix}$$

interchange rows 3 and 1 → $$\begin{pmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & -4 \end{pmatrix}$$

(1/2)(row 2) → $$\begin{pmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & -4 \end{pmatrix}$$

add (-3)(row 2) to row 4 → $$\begin{pmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & -4 \end{pmatrix}$$

(-1/4)(row 4) → $$\begin{pmatrix} 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$

add (-1)(row 4) to row 1 → $$\begin{pmatrix} 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$

interchange rows 3 and 4 → $$\begin{pmatrix} 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

This is a reduced matrix, hence it is the reduced matrix B_R of B. ■

In view of Theorem 7.6 of the preceding section, we immediately have the following.
THEOREM 7.11

Let A be an n × m matrix. Then there is an n × n matrix Ω such that ΩA = A_R. ■

There is a convenient notational device that enables us to find both Ω and A_R together. We know what Ω is. If A is n × m, then Ω is an n × n matrix formed by starting with I_n and carrying out, in order, the same sequence of elementary row operations used to reduce A. A simple way to form Ω while reducing A is to form an n × (n + m) matrix [I_n : A] by placing I_n alongside A on its left. The first n columns of this matrix [I_n : A] are just I_n, and the last m columns are A. Now reduce A by elementary row operations, performing the same operations on the first n columns (I_n) as well. When A is reduced, the resulting n × (n + m) matrix will have the form [Ω : A_R], and we read Ω as the first n columns.
EXAMPLE 7.20

Let

A = ( -3  1  0
       4 -2  1 ).

We want to find a 2 x 2 matrix Ω such that ΩA = AR. Since A is 2 x 3, form the matrix

[I2 : A] = ( 1  0 : -3  1  0
             0  1 :  4 -2  1 ).

Now reduce the last three columns, performing the same operations on the first two. The column of dots is just a bookkeeping device to separate A from I2. Proceed:

[I2 : A] → -(1/3)(row 1) → ( -1/3  0 : 1  -1/3  0
                               0   1 : 4  -2    1 )

→ add (-4)(row 1) to row 2 → ( -1/3  0 : 1  -1/3   0
                                4/3  1 : 0  -2/3   1 )

→ -(3/2)(row 2) → ( -1/3   0   : 1  -1/3   0
                    -2    -3/2 : 0   1    -3/2 )

→ add (1/3)(row 2) to row 1 → ( -1  -1/2 : 1  0  -1/2
                                -2  -3/2 : 0  1  -3/2 ).

The last three columns are in reduced form, so they form AR. The first two columns form Ω:

Ω = ( -1  -1/2
      -2  -3/2 ).

As a check on this, form the product

ΩA = ( -1  -1/2 ) ( -3  1  0 ) = ( 1  0  -1/2 ) = AR. ■
     ( -2  -3/2 ) (  4 -2  1 )   ( 0  1  -3/2 )
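The bookkeeping of Example 7.20 is easy to mechanize. The sketch below is a minimal illustration (the function name is ours, not the text's); it row-reduces A while applying the same Type I, II, and III operations to an identity block, using exact rational arithmetic so that the returned pair satisfies ΩA = AR:

```python
from fractions import Fraction

def reduce_with_multiplier(A):
    """Row-reduce A, carrying the same elementary row operations along
    on an identity block, so the returned (omega, A_R) satisfy
    omega * A == A_R."""
    n, m = len(A), len(A[0])
    A = [[Fraction(x) for x in row] for row in A]
    omega = [[Fraction(1) if i == j else Fraction(0) for j in range(n)]
             for i in range(n)]
    pivot_row = 0
    for col in range(m):                  # pivot only within A's columns
        # find a row at or below pivot_row with a nonzero entry here
        pr = next((r for r in range(pivot_row, n) if A[r][col] != 0), None)
        if pr is None:
            continue
        A[pivot_row], A[pr] = A[pr], A[pivot_row]            # Type I
        omega[pivot_row], omega[pr] = omega[pr], omega[pivot_row]
        p = A[pivot_row][col]
        A[pivot_row] = [x / p for x in A[pivot_row]]         # Type II
        omega[pivot_row] = [x / p for x in omega[pivot_row]]
        for r in range(n):                                   # Type III
            if r != pivot_row and A[r][col] != 0:
                c = A[r][col]
                A[r] = [x - c * y for x, y in zip(A[r], A[pivot_row])]
                omega[r] = [x - c * y
                            for x, y in zip(omega[r], omega[pivot_row])]
        pivot_row += 1
        if pivot_row == n:
            break
    return omega, A

omega, A_R = reduce_with_multiplier([[-3, 1, 0], [4, -2, 1]])
```

For the matrix of Example 7.20 this reproduces Ω = (-1 -1/2 / -2 -3/2) and AR = (1 0 -1/2 / 0 1 -3/2).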
In each of Problems 1 through 12, find the reduced form of A, and produce a matrix Ω such that ΩA = AR.

7.4
The Row and Column Spaces of a Matrix and Rank of a Matrix

In this section we will develop three numbers associated with matrices that play a significant role in the solution of systems of linear equations.

Suppose A is an n x m matrix with real number elements. Each row of A has m elements and can be thought of as a vector in R^m. There are n such vectors. The set of all linear combinations of these row vectors is a subspace of R^m called the row space of A. This space is spanned by the row vectors. If these row vectors are linearly independent, they form a basis for this row space, and this space has dimension n. If they are not linearly independent, then some subset of them forms a basis for the row space, and this space has dimension less than n.

If we look down instead of across, we can think of each column of A as a vector in R^n. We often write these vectors as columns simply to keep in mind their origin, although they can be written in standard vector notation. The set of all linear combinations of these columns forms a subspace of R^n. This is the column space of A. If these columns are linearly independent, they form a basis for this column space, which then has dimension m; otherwise, this dimension is less than m.
EXAMPLE 7.21

Let

B = ( -2  6  1
       2  2 -4
      10 -8 12
       3  1 -2
       5 -5  7 ).

The row space is the subspace of R^3 spanned by the row vectors of B. This row space consists of all vectors

α(-2, 6, 1) + β(2, 2, -4) + γ(10, -8, 12) + δ(3, 1, -2) + ε(5, -5, 7).

The first three row vectors are linearly independent. The last two are linear combinations of the first three. Specifically,

(3, 1, -2) = (4/101)(-2, 6, 1) + (181/202)(2, 2, -4) + (13/101)(10, -8, 12)

and

(5, -5, 7) = -(7/101)(-2, 6, 1) - (39/202)(2, 2, -4) + (53/101)(10, -8, 12).

The first three row vectors form a basis for the row space, which therefore has dimension 3. The row space of B is all of R^3.
The column space of B is the subspace of R^5 consisting of all vectors

α(-2, 2, 10, 3, 5)^T + β(6, 2, -8, 1, -5)^T + γ(1, -4, 12, -2, 7)^T.

These three column vectors are linearly independent in R^5: none is a linear combination of the other two or, equivalently, the only way this linear combination can be the zero vector is for α = β = γ = 0. Therefore the column space of B has dimension 3 and is a subspace of the dimension 5 space R^5. ■
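The claims of Example 7.21 can be spot-checked numerically. The sketch below is an illustration only, assuming NumPy is available; it confirms that the first three rows span R^3 and recovers the coefficients expressing a later row in terms of them:

```python
import numpy as np

# rows of B from Example 7.21
rows = np.array([
    [-2,  6,  1],
    [ 2,  2, -4],
    [10, -8, 12],
    [ 3,  1, -2],
    [ 5, -5,  7],
], dtype=float)

first_three = rows[:3]
# the first three rows are independent, so they span all of R^3
print(np.linalg.matrix_rank(first_three))          # 3

# hence row 4 must be a combination of them; solve for the coefficients
coeffs = np.linalg.solve(first_three.T, rows[3])
print(np.allclose(coeffs @ first_three, rows[3]))  # True
```

The computed coefficients agree with the exact fractions 4/101, 181/202, 13/101 quoted in the example.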
In this example the row space of the matrix had the same dimension as the column space, even though the row vectors were in R^m and the column vectors in R^n, with n ≠ m. This is not a coincidence.
THEOREM 7.12

For any matrix A having real numbers as elements, the row and column spaces have the same dimension.

Proof  Suppose A is n x m, with i, j element a_ij. Denote the row vectors R1, ..., Rn, so Ri = (a_i1, a_i2, ..., a_im) in R^m.

Now suppose that the dimension of the row space of A is r. Then exactly r of these row vectors are linearly independent. As a notational convenience, suppose the first r rows R1, ..., Rr are linearly independent. Then each of R_{r+1}, ..., Rn is a linear combination of these r vectors. Write

R_{r+1} = β_{r+1,1} R1 + ... + β_{r+1,r} Rr,
R_{r+2} = β_{r+2,1} R1 + ... + β_{r+2,r} Rr,
...
Rn = β_{n,1} R1 + ... + β_{n,r} Rr.
Now observe that column j of A can be written

(a_1j, a_2j, ..., a_nj)^T = a_1j (1, 0, ..., 0, β_{r+1,1}, ..., β_{n,1})^T + a_2j (0, 1, ..., 0, β_{r+1,2}, ..., β_{n,2})^T + ... + a_rj (0, ..., 0, 1, β_{r+1,r}, ..., β_{n,r})^T,

where the kth vector on the right has a 1 in component k and components r + 1 through n equal to β_{r+1,k}, ..., β_{n,k}. (Indeed, for i ≤ r the ith component on each side is a_ij, and for i > r it is a_ij = β_{i,1} a_1j + ... + β_{i,r} a_rj, by the equations above.)

This means that each column vector of A is a linear combination of the r n-vectors on the right side of this equation. These r vectors therefore span the column space of A. If these vectors are linearly independent, then the dimension of the column space is r. If not, then remove from this list of vectors any that are linear combinations of the others and thus determine a basis for the column space having fewer than r vectors. In any event,

dimension of the column space of A ≤ dimension of the row space of A.

By essentially repeating this argument, with row and column vectors interchanged, we obtain

dimension of the row space of A ≤ dimension of the column space of A,

and these two inequalities together prove the theorem. ■
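Theorem 7.12 is easy to spot-check numerically. In NumPy terms (an illustration, not the text's notation), the rank of a matrix equals the rank of its transpose, since both equal the common dimension of the row and column spaces:

```python
import numpy as np

# any matrix works; here, a small 3 x 4 reduced matrix
R = np.array([[1, 0, 7, 0],
              [0, 1, 3, 2],
              [0, 0, 0, 0]], dtype=float)

# dimension of the row space and of the column space agree
print(np.linalg.matrix_rank(R))      # 2
print(np.linalg.matrix_rank(R.T))    # 2
```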
It is interesting to ask what effect elementary row operations have on the row space of a matrix. The answer is: none! We will need this fact shortly.
THEOREM 7.13

Let A be an n x m matrix, and let B be formed from A by an elementary row operation. Then the row space of A and the row space of B are the same.

Proof  If B is obtained by a Type I operation, we simply interchange two rows. Then A and B still have the same row vectors, just listed in a different order, so these row vectors span the same row space.

Suppose B is obtained by a Type II operation, multiplying row i by a nonzero constant c. Linear combinations of the rows of A have the form

α1 R1 + ... + αi Ri + ... + αn Rn,

while linear combinations of the rows of B are

α1 R1 + ... + αi (c Ri) + ... + αn Rn.

Since αi can be any number, so can c αi, so these linear combinations yield the same vectors when the coefficients are chosen arbitrarily. Thus the row spaces of A and B are again the same.

Finally, suppose B is obtained from A by adding c(row i) to row j. The row vectors of B are then R1, ..., R_{j-1}, c Ri + Rj, R_{j+1}, ..., Rn. But we can write an arbitrary linear combination of these rows of B as

α1 R1 + ... + α_{j-1} R_{j-1} + αj (c Ri + Rj) + α_{j+1} R_{j+1} + ... + αn Rn,

and this is

α1 R1 + ... + (αi + c αj) Ri + ... + αj Rj + ... + αn Rn,

which is again just a linear combination of the row vectors of A; conversely, any combination of the rows of A can be rewritten as a combination of the rows of B. Thus again the row spaces of A and B are the same, and the theorem is proved. ■
COROLLARY 7.1

For any matrix A, the row spaces of A and AR are the same.

This follows immediately from Theorem 7.13. Each time we perform an elementary row operation on a matrix, we leave the row space unchanged. Since we obtain AR from A by elementary row operations, A and AR must have the same row spaces. ■

The dimensions of the row and column spaces will be important when we consider solutions of systems of linear equations. There is another number that will play a significant role in this, the rank of a matrix.
DEFINITION 7.13  Rank

The rank of a matrix A is the number of nonzero rows in AR.
We denote the rank of A as rank(A). If B is a reduced matrix, then B = BR, so the rank of B is just the number of nonzero rows of B itself. Further, for any matrix A,

rank(A) = number of nonzero rows of AR = rank(AR).

We claim that the rank of a matrix is equal to the dimension of its row space (or column space). First we will show this for reduced matrices.
LEMMA 7.1

Let B be a reduced matrix. Then the rank of B equals the dimension of the row space of B.

Proof  Let R1, ..., Rr be the nonzero row vectors of B. The row space consists of all linear combinations

c1 R1 + ... + cr Rr.

If nonzero row j has its leading entry in column k, then the kth component of Rj is 1. Because B is reduced, all the other elements of column k are zero, hence each other Ri has kth component zero. By Lemma 5.3, R1, ..., Rr are linearly independent. Therefore these vectors form a basis for the row space of B, and the dimension of this space is r. But

r = number of nonzero rows of B = number of nonzero rows of BR = rank(B). ■
EXAMPLE 7.22
Let

B = ( 0  1  0  0   3  0   6
      0  0  1  0  -2  1   5
      0  0  0  1   2  0  -4
      0  0  0  0   0  0   0 ).

Then B is in reduced form, so B = BR. The rank of B is its number of nonzero rows, which is 3. Further, the nonzero row vectors are

(0, 1, 0, 0, 3, 0, 6), (0, 0, 1, 0, -2, 1, 5), (0, 0, 0, 1, 2, 0, -4),

and these are linearly independent. Indeed, if a linear combination of these vectors yielded the zero vector, we would have

α(0, 1, 0, 0, 3, 0, 6) + β(0, 0, 1, 0, -2, 1, 5) + γ(0, 0, 0, 1, 2, 0, -4) = (0, 0, 0, 0, 0, 0, 0).

But then

(0, α, β, γ, 3α - 2β + 2γ, β, 6α + 5β - 4γ) = (0, 0, 0, 0, 0, 0, 0),

and from the second, third, and fourth components we read that α = β = γ = 0. These three row vectors are therefore linearly independent and form a basis for the row space, which has dimension 3. ■

Using this as a stepping stone, we can prove the result for arbitrary matrices.
THEOREM 7.14

For any matrix A, the rank of A equals the dimension of the row space of A.

Proof  From the lemma, we know that

rank(A) = rank(AR) = dimension of the row space of AR = dimension of the row space of A,

since A and AR have the same row space. ■

Of course, we can also assert that rank(A) equals the dimension of the column space of A. If A is n x m, then so is AR. Now AR cannot have more than n nonzero rows (because it has only n rows). This means that

rank(A) ≤ number of rows of A.

There is a special circumstance in which the rank of a square matrix actually equals its number of rows.
THEOREM 7.15

Let A be an n x n matrix. Then rank(A) = n if and only if AR = In.

Proof  If AR = In, then the number of nonzero rows in AR is n, since In has no zero rows. Hence in this case rank(A) = n.

Conversely, suppose that rank(A) = n. Then AR has n nonzero rows, hence no zero rows. By the definition of a reduced matrix, each row of AR has leading entry 1. Since each row, being a nonzero row, has a leading entry, the i, i elements of AR are all equal to 1. But it is also required that, if column j contains a leading entry, then all other elements of that column are zero. Thus AR must have each i, j element equal to zero if i ≠ j, so AR = In. ■
EXAMPLE 7.23

For the 3 x 4 matrix A of this example, reduction gives

AR = ( 1  0  7  0
       0  1  3  2
       0  0  0  0 ).

Therefore rank(A) = 2. This is also the dimension of the row space of A and of the column space of A. ■
In the next section we will use the reduced form of a matrix to solve homogeneous systems of linear algebraic equations.
In each of Problems 1 through 14, (a) find the reduced form of the matrix, and from this the rank; (b) find a basis for the row space of the matrix, and the dimension of this space; and (c) find a basis for the column space and the dimension of this space.
15. Show that for any matrix A, rank(A) = rank(A^t).
7.5 Solution of Homogeneous Systems of Linear Equations

We will apply the matrix machinery we have developed to the solution of systems of n linear homogeneous equations in m unknowns:

a11 x1 + a12 x2 + ... + a1m xm = 0
a21 x1 + a22 x2 + ... + a2m xm = 0
...
an1 x1 + an2 x2 + ... + anm xm = 0.

The term homogeneous applies here because the right side of each equation is zero. As a prelude to a matrix approach to solving this system, consider the simple system

x1 - 3x2 + 2x3 = 0
-2x1 + x2 - 3x3 = 0.

We can solve this easily by "eliminating unknowns." Add 2(equation 1) to equation 2 to get -5x2 + x3 = 0, hence

x2 = (1/5)x3.

Now put this into the first equation of the system to get

x1 - (3/5)x3 + 2x3 = 0,

or

x1 + (7/5)x3 = 0.

Then x1 = -(7/5)x3. We now have the solution:

x1 = -(7/5)α,  x2 = (1/5)α,  x3 = α,
in which α can be any number. For this system, two of the unknowns can be written as constant multiples of the third, which can be assigned any value. The system therefore has infinitely many solutions.

For this simple system we do not need matrices. However, it is instructive to see how matrices could be used here. First, write this system in matrix form as AX = 0, where

X = (x1, x2, x3)^T  and  A = (  1  -3   2
                               -2   1  -3 ).

Now reduce A. We find that

AR = ( 1  0   7/5
       0  1  -1/5 ).

The system AR X = 0 is just

x1 + (7/5)x3 = 0
x2 - (1/5)x3 = 0.

This reduced system has the advantage of simplicity: we can solve it on sight, obtaining the same solutions that we got for the original system.

This is not a coincidence. AR is formed from A by elementary row operations. Since each row of A contains the coefficients of an equation of the system, these row operations correspond in the system to interchanging equations, multiplying an equation by a nonzero constant, and adding a constant multiple of one equation to another equation of the system. This is why these elementary row operations were selected. But these operations always result in new systems having the same solutions as the original system (a proof of this will be given shortly). The reduced system AR X = 0 therefore has the same solutions as AX = 0. But AR is defined in just such a way that we can simply read the solutions, giving some unknowns in terms of others, as we saw in this simple case.

We will look at two more examples and then say more about the method in general.
EXAMPLE 7.24

Solve the system

x1 - 3x2 + x3 - 7x4 + 4x5 = 0
x1 + 2x2 - 3x3 = 0
x2 - 4x3 + x5 = 0.

This is the system AX = 0, with

A = ( 1  -3   1  -7   4
      1   2  -3   0   0
      0   1  -4   0   1 ).

We find that

AR = ( 1  0  0  -35/16   13/16
       0  1  0   28/16  -20/16
       0  0  1    7/16   -9/16 ).

The systems AX = 0 and AR X = 0 have the same solutions. But the equations of the reduced system AR X = 0 are

x1 - (35/16)x4 + (13/16)x5 = 0
x2 + (28/16)x4 - (20/16)x5 = 0
x3 + (7/16)x4 - (9/16)x5 = 0.

From these we immediately read the solution. We can let x4 = α and x5 = β (any numbers), and then

x1 = (35/16)α - (13/16)β,  x2 = -(28/16)α + (20/16)β,  x3 = -(7/16)α + (9/16)β. ■

Not only did we essentially have the solution once we obtained AR, but we also knew the number of arbitrary constants that appear in the solution. In the last example this number was 2. This is the number of columns, minus the number of rows having leading entries (or m - rank(A)).

It is convenient to write solutions of AX = 0 as column vectors. In the last example, we could write

X = ( (35/16)α - (13/16)β, -(28/16)α + (20/16)β, -(7/16)α + (9/16)β, α, β )^T.

This formulation also makes it easy to display other information about solutions. In this example, we can also write

X = γ(35, -28, -7, 16, 0)^T + δ(-13, 20, 9, 0, 16)^T,

in which γ = α/16 can be any number (since α can be any number), and δ = β/16 is also any number. This displays the solution as a linear combination of two linearly independent vectors. We will say more about the significance of this in the next section.

EXAMPLE 7.25
Let

A = ( 0  -1   2   4
      0   0  -1   3
      2   1   3   7
      6   2  10  28 ).

We find that

AR = ( 1  0  0  13
       0  1  0 -10
       0  0  1  -3
       0  0  0   0 ).

From the first three rows of AR, read that

x1 + 13x4 = 0
x2 - 10x4 = 0
x3 - 3x4 = 0.

Thus the solution is given by

x1 = -13α,  x2 = 10α,  x3 = 3α,  x4 = α,

in which α can be any number. We can write the solution as

X = α(-13, 10, 3, 1)^T,
with α any number. In this example every solution is a constant multiple of one 4-vector. Note also that m - rank(A) = 4 - 3 = 1. ■

We will now firm up some of the ideas we have discussed informally, and then look at additional examples. First, everything we have done in this section has been based on the assertion that AX = 0 and AR X = 0 have the same solutions. We will prove this.

THEOREM 7.16

Let A be an n x m matrix. Then the linear homogeneous systems AX = 0 and AR X = 0 have the same solutions.

Proof  We know that there is an n x n matrix Ω such that ΩA = AR. Further, Ω can be written as a product of elementary matrices Er ... E1. Suppose first that X = C is a solution of AX = 0. Then AC = 0, so

Ω(AC) = (ΩA)C = AR C = Ω0 = 0.

Then C is also a solution of AR X = 0.

Conversely, suppose K is a solution of AR X = 0. Then AR K = 0. We want to show that AK = 0 also. Because AR K = 0, we have (ΩA)K = 0, or

(Er Er-1 ... E2 E1 A)K = 0.

By Theorem 7.8, for each Ej there is an elementary matrix Ej* that reverses the effect of Ej. Then, from the last equation, we have

E1* E2* ... Er-1* Er* (Er Er-1 ... E2 E1 A)K = 0.

But Er* Er = In, because Er* reverses the effect of Er. Similarly, Er-1* Er-1 = In, until finally the last equation becomes

E1* (E1 A)K = AK = 0.

Thus K is a solution of AX = 0 and the proof is complete. ■

The method for solving AX = 0 that we illustrated above is called the Gauss-Jordan method, or complete pivoting. Here is an outline of the method. Keep in mind that, in a system AX = 0, row k gives the coefficients of equation k, and column j contains the coefficients of xj as we look down the set of equations.

Gauss-Jordan Method for Solving AX = 0

1. Find AR.
2. Look down the columns of AR. If column j contains the leading entry of some row (so all other elements of this column are zero), then xj is said to be dependent. Determine all the dependent unknowns. The remaining unknowns (if any) are said to be independent.
3. Each nonzero row of AR represents an equation in the reduced system, having one dependent unknown (in the column having the leading entry 1) and all other unknowns in this equation (if any) independent. This enables us to write this dependent unknown in terms of the independent ones.
4. After step (3) is carried out for each nonzero row, we have each dependent unknown in terms of the independent ones. The independent unknowns can then be assigned any values, and these determine the dependent unknowns, solving the system. We can write the resulting solution as a linear combination of column solutions, one for each independent unknown. The resulting expression, containing an arbitrary constant for each independent unknown, is called the general solution of the system.
EXAMPLE 7.26

Solve the system

-x1 + x3 + x4 + 2x5 = 0
x2 + 3x3 + 4x5 = 0
x1 + 2x2 + x3 + x4 + x5 = 0
-3x1 + x2 + 4x5 = 0.

The matrix of coefficients is

A = ( -1  0  1  1  2
       0  1  3  0  4
       1  2  1  1  1
      -3  1  0  0  4 ).

We find that

AR = ( 1  0  0  0  -9/8
       0  1  0  0   5/8
       0  0  1  0   9/8
       0  0  0  1  -1/4 ).

Because columns 1 through 4 of AR contain the leading entries of rows, x1, x2, x3, and x4 are dependent, while the remaining unknown, x5, is independent. The equations of the reduced system (which has the same solutions as the original system) are:

x1 - (9/8)x5 = 0
x2 + (5/8)x5 = 0
x3 + (9/8)x5 = 0
x4 - (1/4)x5 = 0.

We wrote these out for illustration, but in fact the solution can be read immediately from AR. We can choose x5 = α, any number, and then

x1 = (9/8)α,  x2 = -(5/8)α,  x3 = -(9/8)α,  x4 = (1/4)α.

The dependent unknowns are given by AR in terms of the independent unknowns (only one in this case). We can write this solution more neatly as

X = γ(9, -5, -9, 2, 8)^T,

in which γ = α/8 can be any number. This is the general solution of AX = 0. In this example, m - rank(A) = 5 - 4 = 1. ■
EXAMPLE 7.27

Consider the system

3x1 - 11x2 + 5x3 = 0
4x1 + x2 - 10x3 = 0
4x1 + 9x2 - 6x3 = 0.

The matrix of coefficients is

A = ( 3  -11    5
      4    1  -10
      4    9   -6 ).

The reduced matrix is

AR = ( 1  0  0
       0  1  0
       0  0  1 ) = I3.

The reduced system is just x1 = 0, x2 = 0, x3 = 0. This system has only the trivial solution, with each xi = 0. Notice that in this example there are no independent unknowns. If there were, we could assign them any values and have infinitely many solutions. ■
From this matrix we see that x1, x2, x3, x4 are dependent, and x5, x6, x7 are independent. We read immediately from AR that

x1 = (7/3)x5 - (7/6)x6 + (29/6)x7
x2 = (233/82)x5 + (395/164)x6 + (375/164)x7
x3 = -(1379/738)x5 + (995/1476)x6 - (2161/1476)x7
x4 = (1895/738)x5 - (1043/1476)x6 + (8773/1476)x7,

while x5, x6, and x7 can (independently of each other) be assigned any numerical values. To make the solution look neater, write x5 = 1476α, x6 = 1476β, and x7 = 1476γ, where α, β, and γ are any numbers. Now the solution can be written

x1 = 3444α - 1722β + 7134γ,  x2 = 4194α + 3555β + 3375γ,
x3 = -2758α + 995β - 2161γ,  x4 = 3790α - 1043β + 8773γ,
x5 = 1476α,  x6 = 1476β,  x7 = 1476γ,

with α, β, and γ any numbers. In column notation,

X = α(3444, 4194, -2758, 3790, 1476, 0, 0)^T + β(-1722, 3555, 995, -1043, 0, 1476, 0)^T + γ(7134, 3375, -2161, 8773, 0, 0, 1476)^T.

This is the general solution, being a linear combination of three linearly independent 7-vectors. In this example, m - rank(A) = 7 - 4 = 3. ■

In each of these examples, after we found the general solution, we noted that the number m - rank(A), the number of columns minus the rank of A, coincided with the number of linearly independent column solutions in the general solution (the number of arbitrary constants in the general solution). We will see shortly that this is always true.

In the next section we will put the Gauss-Jordan method into a vector space context. This will result in an understanding of the algebraic structure of the solutions of a system AX = 0, as well as practical criteria for determining when such a system has a nonzero solution.
PROBLEMS

In each of Problems 1 through 12, find the general solution of the system and write it as a column matrix or sum of column matrices.
7.6 The Solution Space of AX = 0

Suppose A is an n x m matrix. We have been writing solutions of AX = 0 as column m-vectors. Now observe that the set of all solutions has the algebraic structure of a subspace of R^m.

THEOREM 7.17

Let A be an n x m matrix. Then the set of solutions of the system AX = 0 is a subspace of R^m.

Proof  Let S be the set of all solutions of AX = 0. Then S is a set of vectors in R^m. Since A0 = 0, 0 is in S. Now suppose X1 and X2 are solutions, and α and β are real numbers. Then

A(αX1 + βX2) = αAX1 + βAX2 = α0 + β0 = 0,

so αX1 + βX2 is also a solution, hence is in S. Therefore S is a subspace of R^m. ■

We would like to know a basis for this solution space, because then every solution is a linear combination of the basis vectors. This is similar to finding a fundamental set of solutions for a linear homogeneous differential equation, because then every solution is a linear combination of these fundamental solutions.

In examples in the preceding section, we were always able to write the general solution as a linear combination of m - rank(A) linearly independent solution vectors. This suggests that this number is the dimension of the solution space.

To see why this is true in general, notice that we obtain a dependent xj corresponding to each row having a leading entry. Since only nonzero rows have leading entries, the number of dependent unknowns is the number of nonzero rows of AR. But then the number of independent unknowns is the total number of unknowns, m, minus the number of dependent unknowns (the number of nonzero rows of AR). Since the number of nonzero rows of AR is the rank of A, the general solution can always be written as a linear combination of m - rank(A) independent solutions. Further, as a practical matter, solving the system AX = 0 by solving the system AR X = 0 automatically displays the general solution as a linear combination of this number of basis vectors for the solution space. We will summarize this discussion as a theorem.
THEOREM 7.18

Let A be n x m. Then the solution space of the system AX = 0 has dimension

m - rank(A)

or, equivalently, m - (number of nonzero rows in AR).
EXAMPLE 7.29

Consider the system

-4x1 + x2 + 3x3 - 10x4 + x5 = 0
2x1 + 8x2 - x3 - x4 + 3x5 = 0
-6x1 + x2 + x3 - 5x4 - 2x5 = 0.

The matrix of coefficients is

A = ( -4  1   3  -10   1
       2  8  -1   -1   3
      -6  1   1   -5  -2 ),

and we find that

AR = ( 1  0  0    33/118    65/118
       0  1  0   -32/59     21/59
       0  0  1  -164/59     56/59 ).

Now A has m = 5 columns, and AR has 3 nonzero rows, so rank(A) = 3 and the solution space of AX = 0 has dimension

m - rank(A) = 5 - 3 = 2.

From the reduced system we read the solutions

x1 = -(33/118)α - (65/118)β,  x2 = (32/59)α - (21/59)β,  x3 = (164/59)α - (56/59)β,  x4 = α,  x5 = β,

in which α and β are any numbers. It is neater to replace α with 118γ and β with 118δ (which still can be any numbers) and write the general solution as

X = γ(-33, 64, 328, 118, 0)^T + δ(-65, -42, -112, 0, 118)^T.

This displays the general solution (an arbitrary element of the solution space) as a linear combination of two linearly independent vectors, which form a basis for the dimension two solution space. ■
CHAPTER 7 Matrices and Systems of Linear Equation s We know that a system AX = 0 always has at least the zero (trivial) solution. This may be the only solution it has . Rank provides a useful criterion for determining when a syste m AX = 0 has a nontrivial solution . THEOREM7.1 9 Let A be n x m . Then the system AX = 0 has a nontrivial solution if and only i f in > rank(A) . This means that the system of homogeneous equations has a nontrivial solution exactl y when the number of unknowns exceeds the rank of the coefficient matrix (the number o f nonzero rows in the reduced matrix) . We have seen that the dimension of the solution space is m - rank(A) . There is a nontrivial solution if and only if this solution space has something in it besides the zer o solution, and this occurs exactly when the dimension of this solution space is positive . Bu t m - rank(A) > 0 is equivalent to m > rank(A) .
Proof
This theorem has important consequences . First, suppose the number of unknowns exceed s the number of equations . Then n < m . But rank(A) < n is always true, so in this cas e m- rank(A) > m -n > 0 and by Theorem 7 .19, the system has nontrivial solutions . 1„ COROLLARY 7.2 A homogeneous system AX = 0 with more unknowns than equations always has a nontrivial solution . For another consequence of Theorem 7 .19, suppose that A is square, so n = ln . Now the dimension of the solution space of AX = 0 is n - rank(A) . If this number is positive, th e system has nontrivial solutions . If n - rank(A) is not positive, then it must be zero, becaus e rank(A) < n is always true. But n - rank(A) = 0 corresponds to a solution space with onl y the zero solution. And it also corresponds, by Theorem 7 .15, to A having the identity matrix as its reduced matrix . This means that a square system AX = 0, having as the same numbe r of unknowns as equations, has only the trivial solution exactly when the reduced form of A i s the identity matrix . COROLLARY 7 . 3
COROLLARY 7.3

Let A be an n x n matrix of real numbers. Then the system AX = 0 has only the trivial solution exactly when AR = In.
EXAMPLE 7.30

Let

A = ( -4   1  -7
       2   9  -1
       3  11  10 ).

We find that

AR = ( 1  0  0
       0  1  0
       0  0  1 ) = I3.

This means that the system AX = 0 has only the solution x1 = x2 = x3 = 0. This makes sense in view of the fact that the system AX = 0 has the same solutions as the reduced system AR X = 0, and when AR = I3 this reduced system is just X = 0. ■
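Corollary 7.3 translates directly into a rank check. The sketch below (our illustration, assuming NumPy) tests whether a square matrix like the one in this example has full rank, which by Theorem 7.15 is equivalent to AR = In and hence to the system AX = 0 having only the trivial solution:

```python
import numpy as np

A = np.array([[-4,  1, -7],
              [ 2,  9, -1],
              [ 3, 11, 10]], dtype=float)

n = A.shape[0]
# rank(A) = n exactly when A_R = I_n, i.e. A X = 0 has only X = 0
print(np.linalg.matrix_rank(A) == n)   # True
```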
PROBLEMS

1.-12. For n = 1, ..., 12, use the solution of Problem n, Section 7.5, to determine the dimension of the solution space of the system of homogeneous equations.

13. Can a system AX = 0, in which there are at least as many equations as unknowns, have a nontrivial solution?

14. Prove Corollary 7.2.

15. Prove Corollary 7.3.
7.7 Nonhomogeneous Systems of Linear Equations

We will now consider nonhomogeneous linear systems of equations:

a11 x1 + a12 x2 + ... + a1m xm = b1
a21 x1 + a22 x2 + ... + a2m xm = b2
...
an1 x1 + an2 x2 + ... + anm xm = bn.

We can write this system in matrix form as AX = B, in which A = [aij] is the n x m matrix of coefficients,

X = (x1, x2, ..., xm)^T  and  B = (b1, b2, ..., bn)^T.

This system has n equations in m unknowns. Of course, if each bj = 0 then this is a homogeneous system AX = 0. A homogeneous system always has at least one solution, the zero solution. A nonhomogeneous system need not have any solution at all.
EXAMPLE 7.31

Consider the system

2x1 - 3x2 = 6
4x1 - 6x2 = 18.

If there were a solution x1 = α, x2 = β, then from the first equation we would have 2α - 3β = 6. But then the second equation would give us 4α - 6β = 2(2α - 3β) = 12 ≠ 18, a contradiction. ■
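One convenient numerical check of inconsistency (our illustration, not the text's method at this point) is to compare the rank of A with the rank of A augmented by B: the system of Example 7.31 has no solution precisely because adjoining B raises the rank.

```python
import numpy as np

A = np.array([[2, -3],
              [4, -6]], dtype=float)
B = np.array([6, 18], dtype=float)

aug = np.column_stack([A, B])
# rank jumps from 1 to 2 when B is adjoined: no solution exists
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(aug))   # 1 2
```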
7.7.1 The Structure of Solutions of AX = B We can take a cue from linear second order differential equations . There we saw that th e every solution of y" + py ' + qy = f(x) is a sum of a solution the homogeneous equatio n y" +py' + qy = 0, and a particular solution of y" +p y' + qy = f(x) . We will show that the sam e idea holds true for linear algebraic systems of equations as well . THE .
7 .2 0
Let Up be any solution of AX = B . Then every solution of AX = B is of the form U, +H, in which H is a solution of AX = O . Proof Let W be any solution of AX = B . Since Up is also a solution of this system, the n A(W-Up ) =AW-AUp =B-B=O . Then W -Up is a solution of AX = 0 . Letting H = W - Up , then W = Up +H . Conversely, if W = Up + H, where H is a solution of AX = 0, then AW =A(Up +H) =AU p -I-AH=B+0 = B , so W is a solution of AX = B . This means that, if U, is any solution of AX = B, and H is the general solution of AX = 0, then the expression U, +H contains all possible solutions of AX = B . For this reason we cal l such Ul, +H the general solution of AX = B, for any particular solution U p of AX = B .
EXAMPLE 7.32

Consider the system

-x1 + x2 + 3x3 = -2
x2 + 2x3 = 4.

Here

A = ( -1  1  3
       0  1  2 )  and  B = ( -2
                              4 ).

We find from methods of the preceding sections that the general solution of AX = 0 is

α(1, -2, 1)^T.

By a method we will describe shortly,

Up = (6, 4, 0)^T

is a particular solution of AX = B. Therefore every solution of AX = B is contained in the expression

X = α(1, -2, 1)^T + (6, 4, 0)^T,

in which α is any number. This is the general solution of the system AX = B. ■
7.7.2 Existence and Uniqueness of Solutions of AX = B

Now we know what to look for in solving AX = B. In this section we will develop criteria to determine when a solution Up exists, as well as a method that automatically produces the general solution in the form X = H + Up, where H is the general solution of AX = 0.

DEFINITION 7.14  Consistent System of Equations

A nonhomogeneous system AX = B is said to be consistent if there exists a solution. If there is no solution, the system is inconsistent.

The difference between a system AX = 0 and AX = B is B. For the homogeneous system, it is enough to specify the coefficient matrix A when working with the system. But for AX = B, we must incorporate B into our computations. For this reason, we introduce the augmented matrix [A : B]. If A is n x m, [A : B] is the n x (m + 1) matrix formed by adjoining B to A as a new last column. For example, if

A = ( -3  2  6   1
       0  3  3  -5
       2  4  4  -6 )  and  B = (  5
                                  2
                                 -8 ),

then

[A : B] = ( -3  2  6   1 :  5
             0  3  3  -5 :  2
             2  4  4  -6 : -8 ).

The column of dots does not count in the dimension of the matrix, and is simply a visual device to clarify that we are dealing with an augmented matrix giving both A and B for a system AX = B. If we just attached B as a last column without such an indicator, we might be dealing with a homogeneous system having 3 equations in 5 unknowns.
CHAPTER 7  Matrices and Systems of Linear Equations
Continuing with these matrices for the moment, reduce A to find AR:
AR = ( 1  0  0   1/3 )
     ( 0  1  0  -3   )
     ( 0  0  1   4/3 ).
Next, reduce [A:B] (ignore the dotted column in the row operations) to get
[A:B]R = ( 1  0  0   1/3 :  -16/3 )
         ( 0  1  0  -3   :   15/4 )
         ( 0  0  1   4/3 : -37/12 ).
Notice that [A:B]R = [AR:C]. If we reduce the augmented matrix [A:B], we obtain in the first m columns the reduced form of A, together with some new last column. The reason for this can be seen by reviewing how we reduce a matrix. Perform elementary row operations, beginning with the left-most column containing a leading entry, and work from left to right through the columns of the matrix. In finding the reduced form of the augmented matrix [A:B], we deal with columns 1, ..., m, which constitute A. The row operations used to reduce [A:B] will, of course, operate on the elements of the last column as well, eventually resulting in what is called C in the last equation. We will state this result as a theorem.
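The reduction just carried out is entirely mechanical, so it can be verified in a few lines of code. The following Python sketch (an illustration, not part of the text) performs the Gauss-Jordan reduction in exact rational arithmetic; applied to [A:B], the first four columns come out to AR and the last column is C:

```python
from fractions import Fraction

def rref(rows):
    """Gauss-Jordan reduction to reduced row echelon form, in exact arithmetic."""
    M = [[Fraction(v) for v in row] for row in rows]
    nrows, ncols = len(M), len(M[0])
    r = 0                                   # next pivot row
    for c in range(ncols):
        piv = next((i for i in range(r, nrows) if M[i][c] != 0), None)
        if piv is None:
            continue                        # no pivot available in this column
        M[r], M[piv] = M[piv], M[r]         # Type I: interchange rows
        M[r] = [v / M[r][c] for v in M[r]]  # Type II: scale the pivot to 1
        for i in range(nrows):              # Type III: clear the rest of the column
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

# [A:B] from the discussion above; the last column of the result is C
AB = rref([[-3, 2, 6, 1, 5],
           [0, 3, 3, -5, 2],
           [2, 4, 4, -6, -8]])
for row in AB:
    print(row)
```

The printed rows agree with the reduction in the text: the leading 3 x 4 block is AR and the final column is C = (-16/3, 15/4, -37/12).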
THEOREM 7.21

Let A be n x m and let B be n x 1. Then, for some n x 1 matrix C,
[A:B]R = [AR:C].
The reason this result is important is that the original system AX = B and the reduced system ARX = C have the same solutions (as in the homogeneous case, because the elementary row operations do not change the solutions of the system). But because of the special form of AR, the system ARX = C is either easy to solve by inspection, or it is easy to see that there is no solution.
EXAMPLE 7.33

Consider the system
( -3  2  2 )       (  8 )
(  1  4 -6 ) X =  (  1 )
(  0 -2  2 )       ( -2 ).
We will reduce the augmented matrix
[A:B] = ( -3  2  2 :  8 )
        (  1  4 -6 :  1 )
        (  0 -2  2 : -2 ).
One way to proceed is
[A:B] → interchange rows 1 and 2 →
(  1  4 -6 :  1 )
( -3  2  2 :  8 )
(  0 -2  2 : -2 )
→ add 3(row 1) to row 2 →
( 1   4  -6 :  1 )
( 0  14 -16 : 11 )
( 0  -2   2 : -2 )
→ (1/14)(row 2) →
( 1   4   -6  :   1   )
( 0   1  -8/7 : 11/14 )
( 0  -2    2  :  -2   )
→ add -4(row 2) to row 1, 2(row 2) to row 3 →
( 1   0 -10/7 : -15/7 )
( 0   1  -8/7 : 11/14 )
( 0   0  -2/7 :  -3/7 )
→ -(7/2)(row 3) →
( 1   0 -10/7 : -15/7 )
( 0   1  -8/7 : 11/14 )
( 0   0    1  :   3/2 )
→ add (10/7)(row 3) to row 1, (8/7)(row 3) to row 2 →
( 1  0  0 :  0  )
( 0  1  0 : 5/2 )
( 0  0  1 : 3/2 ).
As can be seen in this process, we actually arrived at AR in the first three rows and columns, and whatever ends up in the last column is what we call C:
[A:B]R = [AR:C].
Notice that the reduced augmented matrix is [I3:C] and represents the reduced system I3X = C. This is the system
( 1  0  0 )       (  0  )
( 0  1  0 ) X =  ( 5/2 )
( 0  0  1 )       ( 3/2 ),
which we solve by inspection to get x1 = 0, x2 = 5/2, x3 = 3/2. Thus reducing [A:B] immediately yields the solution of the original system AX = B. Because AR = I3, Corollary 7.3 tells us that the homogeneous system AX = 0 has only the trivial solution, and therefore H = 0 in Theorem 7.20 and Up is the unique solution of AX = B.
EXAMPLE 7.34

The system
2x1 - 3x2 = 6
4x1 - 6x2 = 18
is inconsistent, as we saw in Example 7.31. We will put the fact that this system has no solution into the context of the current discussion. Write the augmented matrix
[A:B] = ( 2 -3 :  6 )
        ( 4 -6 : 18 ).
Reduce this matrix. We find that
[A:B]R = ( 1 -3/2 : 0 )
         ( 0   0  : 1 ).
From this we immediately read the reduced system ARX = C:
ARX = ( 1 -3/2 ) X = ( 0 )
      ( 0   0  )     ( 1 ).
This system has the same solutions as the original system. But the second equation of the reduced system is
0x1 + 0x2 = 1,
which has no solution. Therefore AX = B has no solution either.
In this example, the reduced system has an impossible equation because AR has a zero second row, while the second row of [A:B]R has a nonzero element in the augmented column. Whenever this happens, we obtain an equation having all zero coefficients of the unknowns, but equal to a nonzero number. In such a case the reduced system ARX = C, hence the original system AX = B, can have no solution. The key to recognizing when this will occur is that it happens when the rank of AR (its number of nonzero rows) is less than the rank of [A:B].
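The rank comparison can be made mechanical. The following Python sketch (illustrative, not part of the text) counts pivot rows for the coefficient matrix and for the augmented matrix of this example; the ranks differ, so the system is inconsistent:

```python
from fractions import Fraction

def rank(rows):
    """Rank = number of pivot rows found during Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A  = [[2, -3], [4, -6]]
AB = [[2, -3, 6], [4, -6, 18]]
print(rank(A), rank(AB))   # ranks differ, so AX = B has no solution
```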
THEOREM 7.22

The nonhomogeneous system AX = B has a solution if and only if A and [A:B] have the same rank.
Proof  Let A be n x m. Suppose first that rank(A) = rank([A:B]) = r. By Theorems 7.12 and 7.14, the column space of [A:B] has dimension r. Certainly r cannot exceed the number of columns of A, so B, which is column m+1 of [A:B], must be a linear combination of the first m columns of [A:B], which form A. This means that, for some numbers α1, ..., αm,

        ( a11 )      ( a12 )            ( a1m )
B = α1  ( a21 ) + α2 ( a22 ) + ... + αm ( a2m )
        ( an1 )      ( an2 )            ( anm )

    ( α1 a11 + α2 a12 + ... + αm a1m )       ( α1 )
  = ( α1 a21 + α2 a22 + ... + αm a2m ) = A   ( α2 )
    ( α1 an1 + α2 an2 + ... + αm anm )       ( αm ).

But then (α1, α2, ..., αm)^t is a solution of AX = B.
Conversely, suppose AX = B has a solution (α1, α2, ..., αm)^t. Then

      ( α1 )   ( α1 a11 + α2 a12 + ... + αm a1m )      ( a11 )            ( a1m )
B = A ( α2 ) = ( α1 a21 + α2 a22 + ... + αm a2m ) = α1 ( a21 ) + ... + αm ( a2m )
      ( αm )   ( α1 an1 + α2 an2 + ... + αm anm )      ( an1 )            ( anm ).

Then B is a linear combination of the columns of A, thought of as vectors in R^n. But then the column space of A is the same as the column space of [A:B]. Then
rank(A) = dimension of the column space of A = dimension of the column space of [A:B] = rank([A:B]),
and the proof is complete. ■
EXAMPLE 7.35

Solve the system
x1 - x2 + 2x3 = 3
-4x1 + x2 + 7x3 = -5
-2x1 - x2 + 11x3 = 14.
The augmented matrix is
[A:B] = (  1 -1  2 :  3 )
        ( -4  1  7 : -5 )
        ( -2 -1 11 : 14 ).
When we reduce this matrix we obtain
[A:B]R = [AR:C] = ( 1  0 -3 : 0 )
                  ( 0  1 -5 : 0 )
                  ( 0  0  0 : 1 ).
The first three columns of this reduced matrix make up AR. But
rank(A) = 2 and rank([A:B]) = 3,
so this system has no solution. The last equation of the reduced system is
0x1 + 0x2 + 0x3 = 1,
which can have no solution.
EXAMPLE 7.36

Solve
x1 - x3 + 2x4 + x5 + 6x6 = -3
x2 + x3 + 3x4 + 2x5 + 4x6 = 1
x1 - 4x2 + 3x3 + x4 + 2x6 = 0.
The augmented matrix is
[A:B] = ( 1  0 -1  2  1  6 : -3 )
        ( 0  1  1  3  2  4 :  1 )
        ( 1 -4  3  1  0  2 :  0 ).
Reduce this to get
[A:B]R = ( 1  0  0  27/8  15/8  60/8 : -17/8 )
         ( 0  1  0  13/8   9/8  20/8 :   1/8 )
         ( 0  0  1  11/8   7/8  12/8 :   7/8 ).
The first six columns of this matrix form AR and we read that rank(A) = 3 = rank([A:B]). From [A:B]R, identify x1, x2, x3 as dependent and x4, x5, x6 as independent. The number of independent unknowns is m - rank(A) = 6 - 3 = 3, and this is the dimension of the solution space of AX = 0.
From the reduced augmented matrix, the first equation of the reduced system is
x1 + (27/8)x4 + (15/8)x5 + (60/8)x6 = -17/8,
so
x1 = -(27/8)x4 - (15/8)x5 - (60/8)x6 - 17/8.
We will not write out all of the equations of the reduced system. The point is that we can read directly from [A:B]R that
x2 = -(13/8)x4 - (9/8)x5 - (20/8)x6 + 1/8
and
x3 = -(11/8)x4 - (7/8)x5 - (12/8)x6 + 7/8,
while x4, x5, x6 can be assigned any numerical values. If we let x4 = 8α, x5 = 8β, and x6 = 8γ, with α, β and γ any numbers, then the general solution is

      ( -27 )     ( -15 )     ( -60 )   ( -17/8 )
      ( -13 )     (  -9 )     ( -20 )   (   1/8 )
X = α ( -11 ) + β (  -7 ) + γ ( -12 ) + (   7/8 )
      (   8 )     (   0 )     (   0 )   (    0  )
      (   0 )     (   8 )     (   0 )   (    0  )
      (   0 )     (   0 )     (   8 )   (    0  ).

This is in the form H + Up, with H the general solution of AX = 0 and Up a particular solution of AX = B.
Since the general solution is of the form X = H + Up, with H the general solution of AX = 0, the only way AX = B can have a unique solution is if H = 0; that is, the homogeneous system must have only the trivial solution. But, for a system with the same number of unknowns as equations, this can occur only if AR is the identity matrix.

THEOREM 7.23

Let A be n x n. Then the nonhomogeneous system AX = B has a unique solution if and only if AR = In. This, in turn, occurs exactly when rank(A) = n. ■
EXAMPLE 7.37

Consider the system
(  2  1 -11 )       ( -6 )
( -5  1   9 ) X =  ( 12 )
(  1  1  14 )       ( -5 ).
The augmented matrix is
[A:B] = (  2  1 -11 : -6 )
        ( -5  1   9 : 12 )
        (  1  1  14 : -5 )
and we find that
[A:B]R = ( 1  0  0 :  -86/31  )
         ( 0  1  0 : -191/155 )
         ( 0  0  1 :  -11/155 ).
The first three columns tell us that AR = I3. The homogeneous system AX = 0 has only the trivial solution. Then AX = B has a unique solution, which we read from [A:B]R:
X = (  -86/31  )
    ( -191/155 )
    (  -11/155 ).
Note that rank(A) = 3 and the dimension of the solution space of AX = 0 is n - rank(A) = 3 - 3 = 0, consistent with this solution space having no elements except the zero vector.
In each of Problems 1 through 14, find the general solution of the system or show that the system has no solution.
1. 4x1 - x2 + 4x3 = 1
   x1 + x2 - 5x3 = 0
   -2x1 + x2 + 7x3 = 4
14. -6x1 + 2x2 - x3 + x4 = 0
    x1 + 4x2 - x4 = -5
    x1 + x2 + x3 - 7x4 = 0
15. Let A be an n x m matrix with rank r. Prove that the reduced system ARX = B has a solution if and only if b_{r+1} = ... = b_n = 0.
7.8  Matrix Inverses

DEFINITION 7.15  Matrix Inverse

Let A be an n x n matrix. Then B is an inverse of A if
AB = BA = In.
In this definition B must also be n x n because both AB and BA must be defined. Further, if B is an inverse of A, then A is also an inverse of B.
It is easy to find nonzero square matrices that have no inverse. For example, let
A = ( 2 0 )
    ( 0 0 ).
If B is an inverse of A, say
B = ( a b )
    ( c d ),
then we must have
AB = ( 2 0 ) ( a b ) = ( 2a 2b ) = ( 1 0 )
     ( 0 0 ) ( c d )   (  0  0 )   ( 0 1 ).
But the 2,2 element of AB is 0, not 1, so these are impossible conditions, and A has no inverse.
On the other hand, some matrices do have inverses. For example,
( 2 1 ) (  4/7 -1/7 )   ( 1 0 )
( 1 4 ) ( -1/7  2/7 ) = ( 0 1 ).
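The last claim can be verified by carrying out both products, since an inverse must work on either side. A short Python check (illustrative, not part of the text):

```python
from fractions import Fraction

def matmul(A, B):
    # matrix product of two small matrices given as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A    = [[2, 1], [1, 4]]
Ainv = [[Fraction(4, 7), Fraction(-1, 7)],
        [Fraction(-1, 7), Fraction(2, 7)]]
I2 = [[1, 0], [0, 1]]

assert matmul(A, Ainv) == I2 and matmul(Ainv, A) == I2
print("verified")
```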
DEFINITION 7.16  Nonsingular and Singular Matrices
A square matrix is said to be nonsingular if it has an inverse . If it has no inverse, the matrix is called singular.
If a matrix has an inverse, then it can have only one.

THEOREM 7.24  Uniqueness of Inverses

Let B and C be inverses of A. Then B = C.
Proof  Write B = BIn = B(AC) = (BA)C = InC = C. ■
In view of this we will denote the inverse of A as A^-1. Here are properties of inverse matrices. In proving parts of the theorem, we repeatedly employ the strategy that, if AB = BA = In, then B must be the inverse of A.

THEOREM 7.25
1. In is nonsingular and In^-1 = In.
2. If A and B are nonsingular n x n matrices, then AB is nonsingular and (AB)^-1 = B^-1 A^-1.
3. If A is nonsingular, so is A^-1, and (A^-1)^-1 = A.
4. If A is nonsingular, so is A^t, and (A^t)^-1 = (A^-1)^t.
5. If A and B are n x n and either is singular, then AB and BA are both singular.
Proof  For (2), compute
(AB)(B^-1 A^-1) = A(BB^-1)A^-1 = AA^-1 = In.
Similarly, (B^-1 A^-1)(AB) = In. Therefore (AB)^-1 = B^-1 A^-1.
For (4), use Theorem 7.3(3) to write
(A^t)(A^-1)^t = (A^-1 A)^t = (In)^t = In.
Similarly, (A^-1)^t (A^t) = (A A^-1)^t = In. Therefore (A^t)^-1 = (A^-1)^t.
We will be able to give a very short proof of (5) when we have developed determinants.
We saw before that not every matrix has an inverse. How can we tell whether a matrix is singular or nonsingular? The following theorem gives a reasonable test.
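Property (2) is easy to test numerically. In the following Python sketch (not part of the text), the matrices and their inverses are small enough to verify by hand; note that the order of the factors in B^-1 A^-1 matters:

```python
def matmul(A, B):
    # matrix product of two small matrices given as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A, Ainv = [[2, 1], [1, 1]], [[1, -1], [-1, 2]]
B, Binv = [[1, 1], [0, 1]], [[1, -1], [0, 1]]
I2 = [[1, 0], [0, 1]]

AB = matmul(A, B)
assert matmul(AB, matmul(Binv, Ainv)) == I2   # (AB)^-1 = B^-1 A^-1
assert matmul(AB, matmul(Ainv, Binv)) != I2   # reversing the order fails
print("property (2) checked")
```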
THEOREM 7.26

An n x n matrix A is nonsingular if and only if AR = In. Alternatively, an n x n matrix is nonsingular if and only if its rank is n.
Proof  The proof consists of understanding a relationship between a matrix having an inverse, and its reduced form being the identity matrix. The key lies in noticing that we can form the columns of a matrix product AB by multiplying, in turn, A by each column of B:
column j of AB = A(column j of B) = A ( b1j )
                                      ( b2j )
                                      ( bnj ).
We will build an inverse for A a column at a time. To have AB = In, we must be able to choose the columns of B so that
A ( b1j )   ( 0 )
  ( b2j ) = ( 1 ) = column j of In,                                   (7.4)
  ( bnj )   ( 0 )
with a 1 in the j-th place and zeros elsewhere.
Suppose now that AR = In. Then, by Theorem 7.23, the system (7.4) has a unique solution for each j = 1, ..., n. These solutions form the columns of a matrix B such that AB = In, and then B = A^-1. (Actually we must show that BA = In also, but we leave this as an exercise.) Conversely, suppose A is nonsingular. Then system (7.4) has a unique solution for j = 1, ..., n, because these solutions are the columns of A^-1. Then, by Theorem 7.23, AR = In.

7.8.1  A Method for Finding A^-1

We know some computational rules for working with matrix inverses, as well as a criterion for a matrix to have an inverse. Now we want an efficient way of computing A^-1 from A. Theorem 7.26 suggests a strategy. We know that, in any event, there is an n x n matrix Ω such that ΩA = AR. Ω is a product of elementary matrices representing the elementary row operations used to reduce A. Previously we found Ω by adjoining In to the left of A to form an n x 2n matrix [In:A]. Reduce A, performing the elementary row operations on all of [In:A], to eventually arrive at [Ω:AR]. This produces Ω such that ΩA = AR. If AR = In, then Ω = A^-1. If AR ≠ In, then A has no inverse.
EXAMPLE 7.38

Let
A = ( 5 -1 )
    ( 6  8 ).
We want to know if A is nonsingular and, if it is, produce its inverse. Form
[I2:A] = ( 1 0 : 5 -1 )
         ( 0 1 : 6  8 ).
Reduce A (the last two columns), carrying out the same operations on the first two columns:
[I2:A] → (1/5)(row 1) →
( 1/5 0 : 1 -1/5 )
(  0  1 : 6   8  )
→ -6(row 1) + (row 2) →
(  1/5 0 : 1 -1/5 )
( -6/5 1 : 0 46/5 )
→ (5/46)(row 2) →
(  1/5    0   : 1 -1/5 )
( -6/46  5/46 : 0   1  )
→ (1/5)(row 2) + (row 1) →
(  8/46  1/46 : 1 0 )
( -6/46  5/46 : 0 1 ).
In the last two columns we read AR = I2. This means that A is nonsingular. From the first two columns,
A^-1 = (1/46) (  8 1 )
              ( -6 5 ).
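The reduction method of Section 7.8.1 translates directly into code. The following Python sketch (an illustration, not part of the text; it adjoins the identity on the right rather than the left, which makes no difference to the result) reproduces the inverse just computed and reports singularity when no pivot can be found:

```python
from fractions import Fraction

def inverse(A):
    """Invert A by reducing the augmented matrix [A : I_n]; None if singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return None                      # no pivot: A is singular
        M[c], M[piv] = M[piv], M[c]          # interchange rows
        M[c] = [v / M[c][c] for v in M[c]]   # scale the pivot to 1
        for r in range(n):                   # clear the rest of the column
            if r != c:
                M[r] = [a - M[r][c] * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

print(inverse([[5, -1], [6, 8]]))   # (1/46)(8 1; -6 5), as in Example 7.38
print(inverse([[1, 2], [2, 4]]))    # None: a singular matrix has no inverse
```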
EXAMPLE 7.39

Let
A = ( -3  21 )
    (  4 -28 ).
Perform a reduction:
[I2:A] = ( 1 0 : -3  21 )
         ( 0 1 :  4 -28 )
→ -(1/3)(row 1) →
( -1/3 0 : 1  -7 )
(   0  1 : 4 -28 )
→ -4(row 1) + (row 2) →
( -1/3 0 : 1 -7 )
(  4/3 1 : 0  0 ).
We read AR from the last two columns, which form a 2 x 2 reduced matrix. Since this is not I2, A is singular and has no inverse.
Here is how inverses relate to the solution of systems of linear equations in which the number of unknowns equals the number of equations.
THEOREM 7.27

Let A be an n x n matrix.
1. The homogeneous system AX = 0 has a nontrivial solution if and only if A is singular.
2. The nonhomogeneous system AX = B has a unique solution if and only if A is nonsingular. In this case the unique solution is X = A^-1 B.
Proof  For a homogeneous system AX = 0, if A were nonsingular then we could multiply the equation on the left by A^-1 to get X = A^-1 O = O. Thus in the nonsingular case, a homogeneous system can have only the trivial solution. In the singular case, we know that rank(A) < n, so the solution space has positive dimension n - rank(A) and therefore has nontrivial solutions in it.
In the nonsingular case, we can multiply a nonhomogeneous equation AX = B on the left by A^-1 to get the unique solution X = A^-1 B. However, if A is singular, then the homogeneous system has nontrivial solutions, so by Theorem 7.20 the system AX = B cannot have a unique solution: it has either no solution or infinitely many.
EXAMPLE 7.40

Consider the nonhomogeneous system
2x1 - x2 + 3x3 = 4
x1 + 9x2 - 2x3 = -8
4x1 - 8x2 + 11x3 = 15.
The matrix of coefficients is
A = ( 2 -1  3 )
    ( 1  9 -2 )
    ( 4 -8 11 )
and we find that
A^-1 = (1/53) (  83 -13 -25 )
              ( -19  10   7 )
              ( -44  12  19 ).
The unique solution of this system is
X = A^-1 B = (1/53) (  83 -13 -25 ) (  4 )
                    ( -19  10   7 ) ( -8 )
                    ( -44  12  19 ) ( 15 ).
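As a check (not part of the text), the product A^-1 B can be evaluated in exact rational arithmetic and substituted back into the system:

```python
from fractions import Fraction

A = [[2, -1, 3], [1, 9, -2], [4, -8, 11]]
Ainv = [[Fraction(v, 53) for v in row]
        for row in ([83, -13, -25], [-19, 10, 7], [-44, 12, 19])]
B = [4, -8, 15]

# X = A^-1 B; the unique solution is (61/53, -51/53, 13/53)
X = [sum(Ainv[i][j] * B[j] for j in range(3)) for i in range(3)]
print(X)

# substituting back recovers B exactly
assert [sum(A[i][j] * X[j] for j in range(3)) for i in range(3)] == B
```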
In each of Problems 1 through 10, find the inverse of the matrix or show that the matrix is singular.
In each of Problems 11 through 15, find the unique solution of the system, using Theorem 7.27(2).
11. x1 - x2 + 3x3 = -7
16. Let A be nonsingular. Prove that, for any positive integer k, A^k is nonsingular, and (A^k)^-1 = (A^-1)^k.
17. Let A, B and C be n x n real matrices. Suppose BA = AC = In. Prove that B = C.
CHAPTER 8

Determinants

If A is a square matrix, the determinant of A is a sum of products of elements of A, formed according to a procedure we will now describe. First we need some information about permutations.

8.1  Permutations

If n is a positive integer, a permutation of order n is an arrangement of the integers 1, ..., n in any order. For example, suppose p is a permutation that reorders the integers 1, ..., 6 as 3, 1, 4, 5, 2, 6. Then
=
3,
p (2)
= 1,
p (3)
= 4,
p(4)
=
5,
p (5)
=
2,
p(6)
=
6,
with p(j) the number the permutation has put in place j . For small n it is possible to list all permutations on 1, . . . , n. Here is a short list : For n = 2 there are two permutations on the integers 1, 2, one leaving them in place and the second interchanging them :
For n = 3 there are six permutations on 1, 2, 3, and they are
1, 2, 3;  1, 3, 2;  2, 1, 3;  2, 3, 1;  3, 1, 2;  3, 2, 1.
For n = 4 there are twenty-four permutations on 1, 2, 3, 4:
1,2,3,4; 1,2,4,3; 1,3,2,4; 1,3,4,2; 1,4,2,3; 1,4,3,2;
2,1,3,4; 2,1,4,3; 2,3,1,4; 2,3,4,1; 2,4,1,3; 2,4,3,1;
3,1,2,4; 3,1,4,2; 3,2,1,4; 3,2,4,1; 3,4,1,2; 3,4,2,1;
4,1,2,3; 4,1,3,2; 4,2,1,3; 4,2,3,1; 4,3,1,2; 4,3,2,1.
An examination of this list of permutations suggests a systematic approach by which they were all listed, and such an approach will work in theory for higher n. However, we can also observe that the number of permutations on 1, ..., n increases rapidly with n. There are n! = 1 · 2 ··· n permutations on 1, ..., n. This fact is not difficult to derive. Imagine a row of n boxes, and start putting the integers from 1 to n into the boxes, one to each box. There are n choices for a number to put into the first box, n-1 choices for the second, n-2 for the third, and so on until there is only one left to put in the last box. There is a total of n(n-1)(n-2)···1 = n! ways to do this, hence n! permutations on n objects.
A permutation is characterized as even or odd, according to a rule we will now illustrate. Consider the permutation
2, 5, 1, 4, 3
on the integers 1, ..., 5. For each number k in the list, count the number of integers to its right that are smaller than k. In this way form a list:

k    number of integers smaller than k to the right of k
2    1
5    3
1    0
4    1
3    0

Sum the integers in the right column to get 5, which is odd. We therefore call this permutation odd. As an example of an even permutation, consider
2, 1, 5, 4, 3.
Now the list is

k    number of integers smaller than k to the right of k
2    1
1    0
5    2
4    1
3    0

and the integers in the right column sum to 4, an even number. This permutation is even.
If p is a permutation, let
sgn(p) = 0 if p is even, 1 if p is odd.

PROBLEMS

1. The six permutations of 1, 2, 3 are given in the discussion. Which of these permutations are even and which are odd?
2. The 24 permutations of 1, 2, 3, 4 are given in the discussion. Which of these are even and which are odd?
3. Show that half of the permutations on 1, 2, ..., n are even, and the other half are odd.
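The counting rule illustrated above is an inversion count (the number of pairs that appear out of order), and it is easy to automate; Problem 3 can also be checked experimentally for small n. A Python sketch (illustrative, not part of the text):

```python
from itertools import permutations

def inversions(p):
    """For each entry, count smaller entries to its right; sum the counts."""
    return sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
               if p[i] > p[j])

assert inversions((2, 5, 1, 4, 3)) == 5   # odd permutation, as in the text
assert inversions((2, 1, 5, 4, 3)) == 4   # even permutation

# half of the 24 permutations of 1, 2, 3, 4 are even
evens = sum(1 for p in permutations(range(1, 5)) if inversions(p) % 2 == 0)
print(evens)
```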
8.2  Definition of the Determinant

Let A = [aij] be an n x n matrix, with numbers or functions as elements.

DEFINITION 8.1

The determinant of A, denoted det(A), is the sum of all products
(-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· a_{np(n)},
taken over all permutations p on 1, ..., n. This sum is denoted
Σ_p (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· a_{np(n)}.                     (8.1)

Each term in the defining sum (8.1) contains exactly one element from each row and from each column, chosen according to the indices j, p(j) determined by the permutation. Each product in the sum is multiplied by 1 if the permutation p is even, and by -1 if p is odd.
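Definition 8.1 can be implemented literally for small n. The following Python sketch (not from the text) sums over all permutations, using an inversion count for sgn(p):

```python
from itertools import permutations

def det(A):
    """det(A) per Definition 8.1: sum over all permutations p of
    (-1)^sgn(p) * a[0][p(0)] * ... * a[n-1][p(n-1)] (0-indexed)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        sgn = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** sgn
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

assert det([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3    # the 2 x 2 rule a11a22 - a12a21
print(det([[2, 0, 1], [1, 3, -1], [0, 5, 4]]))   # agrees with the 3 x 3 formula
```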
Since there are n! permutations on 1, ..., n, this sum involves n! terms and is therefore quite daunting for, say, n > 4. We will examine the small cases n = 2 and n = 3 and then look for ways of evaluating det(A) for larger n.
In the case n = 2,
A = ( a11 a12 )
    ( a21 a22 ).
We have seen that there are 2 permutations on 1, 2, namely p: 1, 2, which is an even permutation, and q: 2, 1, which is odd. Then
det(A) = (-1)^sgn(p) a_{1p(1)} a_{2p(2)} + (-1)^sgn(q) a_{1q(1)} a_{2q(2)}
       = (-1)^0 a11 a22 + (-1)^1 a12 a21 = a11 a22 - a12 a21.
This rule for evaluating det(A) holds for any 2 x 2 matrix.
In the case n = 3,
A = ( a11 a12 a13 )
    ( a21 a22 a23 )
    ( a31 a32 a33 ).
The permutations of 1, 2, 3 are
p1: 1, 2, 3;  p2: 1, 3, 2;  p3: 2, 1, 3;  p4: 2, 3, 1;  p5: 3, 1, 2;  p6: 3, 2, 1.
It is routine to check that p1, p4 and p5 are even, and p2, p3 and p6 are odd. Then
det(A) = (-1)^sgn(p1) a_{1p1(1)} a_{2p1(2)} a_{3p1(3)} + (-1)^sgn(p2) a_{1p2(1)} a_{2p2(2)} a_{3p2(3)}
       + (-1)^sgn(p3) a_{1p3(1)} a_{2p3(2)} a_{3p3(3)} + (-1)^sgn(p4) a_{1p4(1)} a_{2p4(2)} a_{3p4(3)}
       + (-1)^sgn(p5) a_{1p5(1)} a_{2p5(2)} a_{3p5(3)} + (-1)^sgn(p6) a_{1p6(1)} a_{2p6(2)} a_{3p6(3)}
       = a11 a22 a33 - a11 a23 a32 - a12 a21 a33 + a12 a23 a31 + a13 a21 a32 - a13 a22 a31.
If A is 4 x 4, then evaluation of det(A) by direct recourse to the definition will involve 24 terms, as well as explicitly listing all 24 permutations on 1, 2, 3, 4. This is not practical. We will therefore develop some properties of determinants which will make their evaluation more efficient.
PROBLEMS

In Problems 1 through 4, use the formula for det(A) in the 3 x 3 case to evaluate the determinant of the given matrix.
5. The permutations on 1, 2, 3, 4 were listed in Section 8.1. Use this list to write a formula for det(A) when A is 4 x 4.

8.3  Properties of Determinants
We will develop some of the properties of determinants that are used in evaluating them and in deriving some of their consequences. There are effective computer routines for evaluating quite large determinants, but these are also based on the properties we will display.
First, it is standard to use vertical lines to denote determinants, so we will often write
det(A) = |A|.
This should not be confused with absolute value. If A has numerical elements, then |A| is a number and can be positive, negative or zero.
Throughout the rest of this chapter, let A and B be n x n matrices. Our first result says that a matrix having a zero row has a zero determinant.

THEOREM 8.1

If A has a zero row, then |A| = 0.
Proof  This is easy to see from the defining sum (8.1). Suppose, for some i, each a_{ij} = 0. Each term of the sum (8.1) contains a factor a_{ip(i)} from row i, hence each term in the sum is zero.
Next, we claim that multiplying a row of a matrix by a scalar α has the effect of multiplying the determinant of the matrix by α.

THEOREM 8.2

Let B be formed from A by multiplying row k by a scalar α. Then |B| = α|A|.
Proof  The effect of multiplying row k of A by α is to replace each a_{kj} by αa_{kj}. Then b_{ij} = a_{ij} for i ≠ k, and b_{kj} = αa_{kj}, so
|B| = Σ_p (-1)^sgn(p) b_{1p(1)} b_{2p(2)} ··· b_{kp(k)} ··· b_{np(n)}
    = Σ_p (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· (αa_{kp(k)}) ··· a_{np(n)}
    = α Σ_p (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· a_{kp(k)} ··· a_{np(n)} = α|A|.
The next result states that the interchange of two rows in a matrix causes a sign change in the determinant.

THEOREM 8.3

Let B be formed from A by interchanging two rows. Then |A| = -|B|.
A proof of this involves a close examination of the effect of a row interchange on the terms of the sum (8.1), and we will not go through these details. The result is easy to see in the case of 2 x 2 determinants. Let
A = ( a11 a12 )    and    B = ( a21 a22 )
    ( a21 a22 )               ( a11 a12 ).
Then
|A| = a11 a22 - a12 a21    and    |B| = a21 a12 - a22 a11 = -|A|.
This result has two important consequences. The first is that the determinant of a matrix with two identical rows must be zero.

COROLLARY 8.1

If two rows of A are the same, then |A| = 0.
The reason for this is that, if we form B from A by interchanging the identical rows, then B = A, so |B| = |A|. But by Theorem 8.3, |B| = -|A|, so |A| = 0.

COROLLARY 8.2

If for some scalar α, row k of A is α times row i, then |A| = 0.
To see this, consider two cases. First, if α = 0, then row k of A is a zero row, so |A| = 0. If α ≠ 0, then we can multiply row k of A by 1/α to obtain a matrix B having rows i and k the same. Then |B| = 0. But |B| = (1/α)|A| by Theorem 8.2, so |A| = 0.
Next, we claim that the determinant of a product is the product of the determinants.
THEOREM 8.4

Let A and B be n x n matrices. Then
|AB| = |A||B|.
Obviously this extends to a product involving any finite number of n x n matrices. The theorem enables us to evaluate the determinant of such a product without carrying out the matrix multiplications of all the factors. We will illustrate the theorem when we have efficient ways of evaluating determinants.
The following theorem gives the determinant of a matrix that is written as a sum of matrices in a special way.
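Theorem 8.4 is easy to test numerically using the definition-based determinant. A Python sketch (illustrative only; the matrices are our own small examples):

```python
from itertools import permutations

def det(A):
    # determinant straight from the permutation definition (8.1)
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        sgn = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        term = (-1) ** sgn
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1, 0], [1, 3, -1], [4, 0, 2]]
B = [[1, -2, 0], [0, 1, 5], [3, 1, 1]]
assert det(matmul(A, B)) == det(A) * det(B)   # |AB| = |A||B|
print(det(A), det(B), det(matmul(A, B)))
```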
THEOREM 8.5

Suppose each element of row k of A is written as a sum α_{kj} + β_{kj}. Form two matrices from A. The first, A1, is identical to A except the elements of row k are α_{kj}. The second, A2, is identical to A except the elements of row k are β_{kj}. Then
|A| = |A1| + |A2|.
If we display the elements of these matrices, the conclusion states that

| a11      ···      a1n |   | a11 ··· a1n |   | a11 ··· a1n |
|  ·                  · |   |  ·        · |   |  ·        · |
| αk1+βk1  ···  αkn+βkn | = | αk1 ··· αkn | + | βk1 ··· βkn |
|  ·                  · |   |  ·        · |   |  ·        · |
| an1      ···      ann |   | an1 ··· ann |   | an1 ··· ann |.

This result can be seen by examining the terms of (8.1) for each of these determinants:
|A| = Σ_p (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· (α_{kp(k)} + β_{kp(k)}) ··· a_{np(n)}
    = Σ_p (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· α_{kp(k)} ··· a_{np(n)}
    + Σ_p (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· β_{kp(k)} ··· a_{np(n)} = |A1| + |A2|.

As a corollary to this, adding a scalar multiple of one row to another row of a matrix does not change the value of the determinant.
COROLLARY 8.3

Let B be formed from A by adding γ times row i to row k. Then |B| = |A|. ■
This result follows immediately from the preceding theorem by noting that row k of B is
γa_{i1} + a_{k1}, ..., γa_{in} + a_{kn}.
By Theorem 8.5, |B| is then the sum of two determinants: one whose row k is γa_{i1}, ..., γa_{in} (all other rows being those of A), and one whose row k is a_{k1}, ..., a_{kn}, which is just A. The first term is γ times a determinant with rows i and k identical, hence is zero. The second term is just |A|.
We now know the effect of elementary row operations on a determinant. In summary:
Type I operation (interchange of two rows): this changes the sign of the determinant.
Type II operation (multiplication of a row by a scalar α): this multiplies the determinant by α.
Type III operation (addition of a scalar multiple of one row to another row): this does not change the determinant.
Recall that the transpose A^t of a matrix A is obtained by writing the rows of A as the columns of A^t. We claim that a matrix and its transpose have the same determinant.

THEOREM 8.6

|A| = |A^t|. ■
For example, consider the 2 x 2 case:
A = ( a11 a12 ),    A^t = ( a11 a21 )
    ( a21 a22 )           ( a12 a22 ).
Then
|A| = a11 a22 - a12 a21    and    |A^t| = a11 a22 - a21 a12 = |A|.
A proof of this theorem consists of comparing terms of the determinants. If A = [a_{ij}] then A^t = [a_{ji}]. Now, from the defining sum (8.1),
|A| = Σ_p (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· a_{np(n)}
and
|A^t| = Σ_q (-1)^sgn(q) (A^t)_{1q(1)} (A^t)_{2q(2)} ··· (A^t)_{nq(n)} = Σ_q (-1)^sgn(q) a_{q(1)1} a_{q(2)2} ··· a_{q(n)n}.
One can show that each term (-1)^sgn(p) a_{1p(1)} a_{2p(2)} ··· a_{np(n)} in the sum for |A| is equal to a corresponding term (-1)^sgn(q) a_{q(1)1} a_{q(2)2} ··· a_{q(n)n} in the sum for |A^t|. The key is to realize that, because q is a permutation of 1, ..., n, we can rearrange the factors in the latter product to write them in increasing order of the first (row) index. This induces a permutation on the second (column) index, and we can match this term up with a corresponding term in the sum for |A|. We will not elaborate the details of this argument.
One consequence of this result is that we can perform not only elementary row operations on a matrix, but also the corresponding elementary column operations, and we know the effect of each operation on the determinant. In particular, from the column perspective:
If two columns of A are identical, or if one column is a zero column, then |A| = 0. Interchange of two columns of A changes the sign of the determinant. Multiplication of a column by a scalar α multiplies the determinant by α. And addition of a scalar multiple of one column to another column does not change the determinant.
These operations on rows and columns of a matrix, and their effect on the determinant of the newly formed matrix, form the basis for strategies to evaluate determinants.
PROBLEMS

1. Let A = [a_{ij}] be an n x n matrix and let α be any scalar. Let B = [αa_{ij}]. Thus B is formed by multiplying each element of A by α. Prove that |B| = α^n |A|.
2. Let A = [a_{ij}] be an n x n matrix. Let α be a nonzero number. Form a new matrix B = [α^{i-j} a_{ij}]. How are |A| and |B| related? Hint: It is useful to examine the 2 x 2 and 3 x 3 cases to get some idea of what B looks like.
3. An n x n matrix is skew-symmetric if A = -A^t. Prove that the determinant of a skew-symmetric matrix of odd order is zero.

8.4  Evaluation of Determinants by Elementary Row and Column Operations

The use of elementary row and column operations to evaluate a determinant is predicated upon the following observation. If a row or column of an n x n matrix A has all zero elements except possibly for a_{ij} in row i and column j, then the determinant of A is (-1)^{i+j} a_{ij} times the determinant of the (n-1) x (n-1) matrix obtained by deleting row i and column j from A.
This reduces the problem of evaluating an n x n determinant to one of evaluating a smaller determinant, having one less row and one less column. Here is a statement of this result, with (1) the row version and (2) the column version.

THEOREM 8.7

1. (Row version) If every element of row i of A is zero except possibly a_{ij}, then
|A| = (-1)^{i+j} a_{ij} times the determinant of the (n-1) x (n-1) matrix obtained by deleting row i and column j from A.
2. (Column version) The same conclusion holds if every element of column j of A is zero except possibly a_{ij}.

This result suggests one strategy for evaluating a determinant. Given an n x n matrix A, use the row and/or column operations to obtain a new matrix B having at most one nonzero element in some row or column. Then |A| is a scalar multiple of |B|, and |B| is a scalar multiple of the (n-1) x (n-1) determinant formed by deleting from B the row and column containing this nonzero element. We can then repeat this strategy on this (n-1) x (n-1) matrix, eventually reducing the problem to one of evaluating a "small" determinant. Here is an illustration of this process.
EXAMPLE 8.1

Let
A = ( -6  0  1   3  2 )
    ( -1  5  0   1  7 )
    (  8  3  2   1  7 )
    (  0  1  5  -3  2 )
    (  1 15 -3   9  4 ).
We want to evaluate |A|. There are many ways to proceed with the strategy we are illustrating. To begin, we can exploit the fact that a13 = 1 and use elementary row operations to get zeros in the rest of column 3. Of course a23 = 0 to begin with, so we need only worry about column 3 entries in rows 3, 4, 5. Add (-2)(row 1) to row 3, -5(row 1) to row 4 and 3(row 1) to row 5 to get
B = ( -6  0  1   3   2 )
    ( -1  5  0   1   7 )
    ( 20  3  0  -5   3 )
    ( 30  1  0 -18  -8 )
    (-17 15  0  18  10 ).
Because we have used Type III row operations,
|A| = |B|.
Further, by Theorem 8 .7, IBI = (_ 1)1+3b13
ICI = (1) IC I = I C I ,
where C is the 4 x 4 matrix obtained by deleting row 1 and column 3 of B : C-
-1 5 1 7 20 3 -5 3 30 1 -18 - 8 -17 15 18 1 0
This is a 4 x 4 matrix, "smaller" than A . We will now apply the strategy to ICI . We can, fo r example, exploit the -1 entry in the 1, 1 position of C, this time using column operations to ge t zeros in row 1, columns 2, 3, 4 of the new matrix . Specifically, add 5(column 1) to column 2 , add column 1 to column 3, and add 7(column 1) to column 4 of C to ge t
Again, because we used Type III operations (this time on columns) of C, |C| = |D|.
But by the theorem, because we are using the element d₁₁ = −1 as the single nonzero element of row 1, we have
$$|D| = (-1)^{1+1} d_{11} |E| = -|E|,$$
in which E is the 3 × 3 matrix obtained by deleting row 1 and column 1 from D:
$$E = \begin{pmatrix}
103 & 15 & 143\\
151 & 12 & 202\\
-70 & 1 & -109
\end{pmatrix}.$$
To evaluate |E|, we can exploit the 3,2 entry e₃₂ = 1. Add −15(row 3) to row 1 and −12(row 3) to row 2 to get
$$F = \begin{pmatrix}
1153 & 0 & 1778\\
991 & 0 & 1510\\
-70 & 1 & -109
\end{pmatrix}.$$
Then |E| = |F|. By the theorem, using the only nonzero element f₃₂ = 1 of column 2 of F, we have
$$|F| = (-1)^{3+2}(1)|G| = -|G|,$$
in which G is the 2 × 2 matrix obtained by deleting row 3 and column 2 of F:
$$G = \begin{pmatrix}1153 & 1778\\ 991 & 1510\end{pmatrix}.$$
At the 2 × 2 stage, we evaluate the determinant directly:
$$|G| = (1153)(1510) - (1778)(991) = -20{,}968.$$
Working back, we now have
$$|A| = |B| = |C| = |D| = -|E| = -|F| = |G| = -20{,}968.$$
The method is actually quicker to apply than might appear from this example, because we included comments as we proceeded with the calculations. ■
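The final value can be checked numerically. The 5 × 5 matrix below is Example 8.1's matrix as reconstructed from the row operations shown in the text (the printed original is partially garbled), so treat its entries as an assumption of this check.

```python
import numpy as np

# Example 8.1's matrix, reconstructed from the reduction chain in the text.
A = np.array([
    [-6,  0,  1,  3,  2],
    [-1,  5,  0,  1,  7],
    [ 8,  3,  2,  1,  7],
    [ 0,  1,  5, -3,  2],
    [ 1, 15, -3,  9,  4],
])
print(round(np.linalg.det(A)))   # -20968
```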
In each of Problems 1 through 10, use the strategy of this section to evaluate the determinant of the matrix.
1.
2.
3.
-2 1 7 2 14 -13 -4 -2 2
4 6 0
1 3 4 -3 1 -1
5 3 -2
4.
2 -5 8 4 3 8 13 0 -4
5.
17 -2 5 1 12 0 14 7 -7
6.
-3 3 9 1 -2 15 7 1 1 2 1 -1
7 1 5 6 5 6
6 6 5 3
1
7.
$$\begin{pmatrix}0 & 1 & 1 & -4\\ 6 & -3 & 2 & 2\\ 1 & -5 & 1 & -2\\ 4 & 8 & 2 & 2\end{pmatrix}$$
8.
$$\begin{pmatrix}2 & 7 & -1 & 0\\ 3 & 1 & 1 & 8\\ -2 & 0 & 3 & 1\\ 4 & 8 & -1 & 0\end{pmatrix}$$
9.
10.
10 1 -6 0 3 3 0 1 1 -2 6 8 -7 16 1 0 0 3 6 1
2 9 7 8 2
4
0
5
-4 1
4
-5
l
8.5 Cofactor Expansions

Theorem 8.5 suggests the following. If we select any row i of a square matrix A, we can write
$$|A| = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
\vdots & \vdots & & \vdots\\
a_{i1} & 0 & \cdots & 0\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
+ \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
\vdots & \vdots & & \vdots\\
0 & a_{i2} & \cdots & 0\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
+ \cdots +
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
\vdots & \vdots & & \vdots\\
0 & 0 & \cdots & a_{in}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix} \tag{8.2}$$
Each of the determinants on the right of equation (8.2) has a row in which every element but possibly one is zero, so Theorem 8.7 applies to each of these determinants. The first determinant on the right is (−1)^{i+1} a_{i1} times the determinant of the matrix obtained by deleting row i and column 1 from A. The second determinant on the right is (−1)^{i+2} a_{i2} times the determinant of the matrix obtained by deleting row i and column 2 from A. And so on, until the last determinant on the right is (−1)^{i+n} a_{in} times the determinant of the matrix obtained by deleting row i and column n from A. We can put all of this more succinctly by introducing the following standard terminology.
DEFINITION 8.2 Minor

If A is an n × n matrix, the minor of a_{ij} is denoted M_{ij}, and is the determinant of the (n − 1) × (n − 1) matrix obtained by deleting row i and column j of A.

Cofactor

The number (−1)^{i+j} M_{ij} is called the cofactor of a_{ij}.
We can now state the following formula for a determinant .
THEOREM 8.8 Cofactor Expansion by a Row

If A is n × n, then for any integer i with 1 ≤ i ≤ n,
$$|A| = \sum_{j=1}^{n}(-1)^{i+j} a_{ij} M_{ij}. \tag{8.3}$$
This is just equation (8.2) in the notation of cofactors. The sum (8.3) is called the cofactor expansion of |A| by row i because it is the sum, across this row, of each matrix element times its cofactor. This yields |A| no matter which row is used. Of course, if some a_{ik} = 0 then we need not calculate that term in equation (8.3), so it is to our advantage to expand by a row having as many zero elements as possible. The strategy of the preceding section was to create such a row using row and column operations, resulting in a cofactor expansion by a row having only one (possibly) nonzero element.
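Equation (8.3) translates directly into a short recursive routine. This is an illustrative sketch, not the text's algorithm; the expansion is exponential-time, so it is only sensible for small matrices.

```python
def minor(a, i, j):
    """The matrix with row i and column j deleted."""
    return [row[:j] + row[j+1:] for k, row in enumerate(a) if k != i]

def det_by_row(a, i=0):
    """Cofactor expansion of |a| by row i (0-indexed), per equation (8.3)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    # Sum across row i of each element times its cofactor.
    return sum((-1) ** (i + j) * a[i][j] * det_by_row(minor(a, i, j))
               for j in range(n))

print(det_by_row([[-6, 3, 7], [12, -5, -9], [2, 4, -6]]))   # 172
```

Skipping terms with a zero entry, as the text suggests, would speed this up but not change the result.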
EXAMPLE 8.2
Let
$$A = \begin{pmatrix}-6 & 3 & 7\\ 12 & -5 & -9\\ 2 & 4 & -6\end{pmatrix}.$$
If we expand by row 1, we get
$$|A| = \sum_{j=1}^{3}(-1)^{1+j} a_{1j} M_{1j} = (-1)^{1+1}(-6)\begin{vmatrix}-5 & -9\\ 4 & -6\end{vmatrix} + (-1)^{1+2}(3)\begin{vmatrix}12 & -9\\ 2 & -6\end{vmatrix} + (-1)^{1+3}(7)\begin{vmatrix}12 & -5\\ 2 & 4\end{vmatrix}$$
$$= (-6)(30+36) - 3(-72+18) + 7(48+10) = 172.$$
Just for illustration, expand by row 3:
$$|A| = (-1)^{3+1}(2)\begin{vmatrix}3 & 7\\ -5 & -9\end{vmatrix} + (-1)^{3+2}(4)\begin{vmatrix}-6 & 7\\ 12 & -9\end{vmatrix} + (-1)^{3+3}(-6)\begin{vmatrix}-6 & 3\\ 12 & -5\end{vmatrix}$$
$$= 2(-27+35) - 4(54-84) + (-6)(30-36) = 16 + 120 + 36 = 172.$$
Because, for purposes of evaluating determinants, row and column operations can both be used, we can also develop a cofactor expansion of |A| by column j. In this expansion, we move down a column of the matrix and sum each element of the column times its cofactor.
THEOREM 8.9 Cofactor Expansion by a Column

Let A be an n × n matrix. Then for any j with 1 ≤ j ≤ n,
$$|A| = \sum_{i=1}^{n}(-1)^{i+j} a_{ij} M_{ij}. \tag{8.4}$$
This differs from the expansion (8.3) in that the latter expands across a row, while the sum (8.4) expands down a column. All of these expansions, by any row or column of A, yield |A|.
EXAMPLE 8.3
Consider again
$$A = \begin{pmatrix}-6 & 3 & 7\\ 12 & -5 & -9\\ 2 & 4 & -6\end{pmatrix}.$$
Expanding by column 1 gives us
$$|A| = \sum_{i=1}^{3}(-1)^{i+1} a_{i1} M_{i1} = (-1)^{1+1}(-6)\begin{vmatrix}-5 & -9\\ 4 & -6\end{vmatrix} + (-1)^{2+1}(12)\begin{vmatrix}3 & 7\\ 4 & -6\end{vmatrix} + (-1)^{3+1}(2)\begin{vmatrix}3 & 7\\ -5 & -9\end{vmatrix}$$
$$= (-6)(30+36) - 12(-18-28) + 2(-27+35) = 172.$$
If we expand by column 2 we get
$$|A| = \sum_{i=1}^{3}(-1)^{i+2} a_{i2} M_{i2} = (-1)^{1+2}(3)\begin{vmatrix}12 & -9\\ 2 & -6\end{vmatrix} + (-1)^{2+2}(-5)\begin{vmatrix}-6 & 7\\ 2 & -6\end{vmatrix} + (-1)^{3+2}(4)\begin{vmatrix}-6 & 7\\ 12 & -9\end{vmatrix}$$
$$= (-3)(-72+18) - 5(36-14) - 4(54-84) = 172.$$
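That every row gives the same value can be checked numerically for the 3 × 3 matrix of Examples 8.2 and 8.3; each expansion returns 172. This is an illustrative sketch (numpy is used only to evaluate the 2 × 2 minors).

```python
import numpy as np

A = [[-6, 3, 7], [12, -5, -9], [2, 4, -6]]

def minor(a, i, j):
    """The matrix with row i and column j deleted."""
    return [row[:j] + row[j+1:] for k, row in enumerate(a) if k != i]

def expand_by_row(a, i):
    """Cofactor expansion of |a| by row i (0-indexed)."""
    return sum((-1) ** (i + j) * a[i][j]
               * np.linalg.det(np.array(minor(a, i, j), dtype=float))
               for j in range(len(a)))

print([round(expand_by_row(A, i)) for i in range(3)])   # [172, 172, 172]
```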
In Problems 1–10, use cofactor expansions, combined with elementary row and column operations when this is useful, to evaluate the determinant of the matrix.
4 1 1
1.
2.
1 2 3
2 1 -3 1 -2 -1
-8 0 0 ) 6 1 4
3.
7 1 -3
-3 -2 1
1 4 0
4.
5 -1 -2
-4 3 1 6 -2 4
5.
-5 2 4 1
0 -1 4 -1
1 3 -5 6
6 7 -8 2
7 -9
3 -5 -5 9 7.
8
9
-5 15 1 0
6 2 7 15
This is Vandermonde's determinant. This and the next problem are best done with a little thought in using facts about determinants, rather than a brute-force approach.
-3 1 0 1 2 -3
14 16 4
14 13 7 1 2 0 1 -6
-2 1 12 5
5 7 3 2
-5 4 3 -9 -2 0 1 14
1 2 -1 0
7 -5 1 3
-8 5 1 01 3 1 10. 22 3 04 1 1 -7
11. Show that
$$\begin{vmatrix}1 & \alpha & \alpha^{2}\\ 1 & \beta & \beta^{2}\\ 1 & \gamma & \gamma^{2}\end{vmatrix}$$
12. Show tha t y 6
y 6
a
8 a Q
S a a y
=(a+a+y+8)(a-a+6-y)
x
0 1 -1 1 y S 1 8 a 1 a ,0
1 a Q y
13. Let A be a square matrix such that A⁻¹ = Aᵗ. Prove that |A| = ±1.
14. Prove that three points (x₁, y₁), (x₂, y₂), and (x₃, y₃) are collinear (on the same straight line) if and only if
2 7 5 -6 5 3 7 2 -6 5
$$\begin{vmatrix}x_1 & y_1 & 1\\ x_2 & y_2 & 1\\ x_3 & y_3 & 1\end{vmatrix} = 0.$$
Hint: This determinant is zero exactly when one row
$$= (\beta-\alpha)(\gamma-\alpha)(\gamma-\beta).$$
8.6
or column is a linear combination of the other two .
Determinants of Triangular Matrices

The main diagonal of a square matrix A consists of the elements a₁₁, a₂₂, …, a_{nn}. We call A upper triangular if all the elements below the main diagonal are zero; that is, a_{ij} = 0 if i > j. Such a matrix has the appearance
$$A = \begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & & a_{1n}\\
0 & a_{22} & a_{23} & \cdots & & a_{2n}\\
0 & 0 & a_{33} & \cdots & & a_{3n}\\
\vdots & & & \ddots & & \vdots\\
0 & 0 & 0 & \cdots & a_{n-1,n-1} & a_{n-1,n}\\
0 & 0 & 0 & \cdots & 0 & a_{nn}
\end{pmatrix}.$$
If we expand |A| by cofactors down the first column, we have
$$|A| = a_{11}\begin{vmatrix}
a_{22} & a_{23} & \cdots & a_{2,n-1} & a_{2n}\\
0 & a_{33} & \cdots & a_{3,n-1} & a_{3n}\\
\vdots & & \ddots & & \vdots\\
0 & 0 & \cdots & 0 & a_{nn}
\end{vmatrix}$$
and the determinant on the right is again upper triangular, so expand by its first column to get
$$|A| = a_{11} a_{22}\begin{vmatrix}
a_{33} & a_{34} & \cdots & a_{3n}\\
0 & a_{44} & \cdots & a_{4n}\\
\vdots & & \ddots & \vdots\\
0 & 0 & \cdots & a_{nn}
\end{vmatrix},$$
with another upper triangular determinant on the right. Continuing in this way, we obtain
$$|A| = a_{11} a_{22} \cdots a_{nn}.$$
The determinant of an upper triangular matrix is the product of its main diagonal elements. The same conclusion holds for lower triangular matrices (all elements above the main diagonal are zero). For these we can expand the determinant along the top row, each time obtaining just one minor that is again lower triangular.
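The triangular case is easy to confirm numerically: for the upper triangular matrix below, the product of the main diagonal and the computed determinant agree. (Illustrative sketch with a made-up matrix, not the text's.)

```python
import numpy as np

# A made-up upper triangular matrix; its determinant is 2 * 3 * 4 = 24.
U = np.array([
    [2.0, 5.0, -1.0],
    [0.0, 3.0,  7.0],
    [0.0, 0.0,  4.0],
])
print(np.prod(np.diag(U)))          # 24.0, product of the main diagonal
print(round(np.linalg.det(U)))      # 24, the determinant agrees
```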
EXAMPLE 8.4

$$\begin{vmatrix}
15 & -7 & 4 & -6\\
0 & 12 & 7 & 3\\
0 & 0 & \pi & 9\\
0 & 0 & 0 & e
\end{vmatrix} = (15)(12)\pi e = 180\pi e.$$
PROBLEMS

Evaluate the following determinants.
0
12
0 7
3
-4
2
0
1
1
1 10
-4 -4
16 16
-4
1.
6 0 2.
0
1
-1
-4
0 0
0
0
2 -5 0
0 0
0 0
0 0
0 0 0 -2
2 2
0 0 0 0
0 0 0
1
5
1
17
0 4
3.
3
0
0
0
0
2
-6
17
14
0 2
22
-2
15
0 0 8
43
12
1
-1
0 0 0 5
0
2
1
-3
1
10
1
14
0
-7 0
0 0
13 0
-4 3
8.7 A Determinant Formula for a Matrix Inverse

Determinants can be used to tell whether a matrix is singular or nonsingular. In the latter case, there is a way of writing the inverse of a matrix by using determinants.
First, here is a simple test for nonsingularity. We will use the fact that we reduce a matrix by using elementary row operations, whose effects on determinants are known (Type I operations change the sign, Type II operations multiply the determinant by a nonzero constant, and Type III operations do not change the determinant at all). This means that, for any square matrix A, |A| = α|A_R| for some nonzero constant α, where A_R is the reduced form of A.
THEOREM 8.10

Let A be an n × n matrix. Then A is nonsingular if and only if |A| ≠ 0. ■
Proof  Suppose first that |A| ≠ 0. Since |A| = α|A_R| for some nonzero constant α, A_R can have no zero row, so A_R = I_n. Then rank(A) = n, so A is nonsingular by Theorems 7.26 and 7.15. Conversely, suppose A is nonsingular. Then A_R = I_n, so |A| = α|A_R| = α ≠ 0. ■
Using this result, we can give a short proof of Theorem 7.25(5). Suppose A and B are n × n matrices, and AB is singular. Then
$$|AB| = |A||B| = 0,$$
so |A| = 0 or |B| = 0, hence either A or B (or possibly both) must be singular.
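Both Theorem 8.10 and the product rule |AB| = |A||B| used in this argument can be illustrated numerically. This is a sketch with made-up matrices, not from the text.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # |A| = -2, so A is nonsingular
S = np.array([[1.0, 2.0], [2.0, 4.0]])   # proportional rows, so |S| = 0

print(round(np.linalg.det(A)))           # -2
print(round(np.linalg.det(S)))           # 0
# |AS| = |A||S| = 0: a product with a singular factor is singular.
print(round(np.linalg.det(A @ S)))       # 0
```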
We will now write a formula for the inverse of a square matrix, in terms of cofactors of the matrix.

THEOREM 8.11

Let A be an n × n nonsingular matrix. Define an n × n matrix B by putting
$$b_{ij} = \frac{1}{|A|}(-1)^{i+j} M_{ji}.$$
Then B = A⁻¹.
That is, the i, j element of A⁻¹ is the cofactor of a_{ji} (not a_{ij}), divided by the determinant of A.

Proof
By the way B is defined, the i, j element of AB is
$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = \frac{1}{|A|}\sum_{k=1}^{n}(-1)^{j+k} a_{ik} M_{jk}.$$
Now examine the sum on the right. If i = j, we get
$$(AB)_{ii} = \frac{1}{|A|}\sum_{k=1}^{n}(-1)^{i+k} a_{ik} M_{ik},$$
and the summation is exactly the cofactor expansion of |A| by row i. Therefore
$$(AB)_{ii} = \frac{|A|}{|A|} = 1.$$
If i ≠ j, then the summation in the expression for (AB)_{ij} is the cofactor expansion, by row j, of the determinant of the matrix formed from A by replacing row j by row i. But this matrix then has two identical rows, hence has determinant zero. Then (AB)_{ij} = 0 if i ≠ j, and we conclude that AB = I_n. A similar argument shows that BA = I_n, hence B = A⁻¹. ■
This method of computing a matrix inverse is not as efficient in general as the reduction method discussed previously. Nevertheless, it works well for small matrices, and in some discussions it is useful to have a formula for the elements of a matrix inverse.
EXAMPLE 8.5
Let
$$A = \begin{pmatrix}-2 & 4 & 1\\ 6 & 3 & -3\\ 2 & 9 & -5\end{pmatrix}.$$
Then
$$|A| = \begin{vmatrix}-2 & 4 & 1\\ 6 & 3 & -3\\ 2 & 9 & -5\end{vmatrix} = 120 \neq 0,$$
so A is nonsingular. Compute the nine elements of the inverse matrix B:
$$b_{11} = \frac{1}{120}M_{11} = \frac{1}{120}\begin{vmatrix}3 & -3\\ 9 & -5\end{vmatrix} = \frac{12}{120} = \frac{1}{10}, \qquad
b_{12} = -\frac{1}{120}M_{21} = -\frac{1}{120}\begin{vmatrix}4 & 1\\ 9 & -5\end{vmatrix} = \frac{29}{120},$$
$$b_{13} = \frac{1}{120}M_{31} = \frac{1}{120}\begin{vmatrix}4 & 1\\ 3 & -3\end{vmatrix} = -\frac{1}{8}, \qquad
b_{21} = -\frac{1}{120}M_{12} = -\frac{1}{120}\begin{vmatrix}6 & -3\\ 2 & -5\end{vmatrix} = \frac{1}{5},$$
$$b_{22} = \frac{1}{120}M_{22} = \frac{1}{120}\begin{vmatrix}-2 & 1\\ 2 & -5\end{vmatrix} = \frac{8}{120} = \frac{1}{15}, \qquad
b_{23} = -\frac{1}{120}M_{32} = -\frac{1}{120}\begin{vmatrix}-2 & 1\\ 6 & -3\end{vmatrix} = 0,$$
$$b_{31} = \frac{1}{120}M_{13} = \frac{1}{120}\begin{vmatrix}6 & 3\\ 2 & 9\end{vmatrix} = \frac{48}{120} = \frac{2}{5}, \qquad
b_{32} = -\frac{1}{120}M_{23} = -\frac{1}{120}\begin{vmatrix}-2 & 4\\ 2 & 9\end{vmatrix} = \frac{26}{120} = \frac{13}{60},$$
$$b_{33} = \frac{1}{120}M_{33} = \frac{1}{120}\begin{vmatrix}-2 & 4\\ 6 & 3\end{vmatrix} = -\frac{30}{120} = -\frac{1}{4}.$$
Then
$$B = A^{-1} = \begin{pmatrix}\tfrac{1}{10} & \tfrac{29}{120} & -\tfrac{1}{8}\\[2pt] \tfrac{1}{5} & \tfrac{1}{15} & 0\\[2pt] \tfrac{2}{5} & \tfrac{13}{60} & -\tfrac{1}{4}\end{pmatrix}.$$
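Theorem 8.11 can be transcribed directly: the i, j entry of A⁻¹ is the cofactor of a_{ji} divided by |A|. The sketch below (illustrative code, not from the text) uses exact fractions and, applied to the matrix of Example 8.5, reproduces the inverse computed above.

```python
from fractions import Fraction

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j+1:] for row in a[1:]])
               for j in range(len(a)))

def inverse_by_cofactors(a):
    n, d = len(a), Fraction(det(a))
    def minor(i, j):
        return [row[:j] + row[j+1:] for k, row in enumerate(a) if k != i]
    # b_ij = (-1)^(i+j) M_ji / |A|  -- note the transposed indices j, i.
    return [[(-1) ** (i + j) * det(minor(j, i)) / d for j in range(n)]
            for i in range(n)]

A = [[-2, 4, 1], [6, 3, -3], [2, 9, -5]]
B = inverse_by_cofactors(A)
print(B[0])   # [Fraction(1, 10), Fraction(29, 120), Fraction(-1, 8)]
```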
In each of Problems 1 through 10, use Theorem 8.10 to determine whether the matrix is nonsingular. If it is, use Theorem 8.11 to find its inverse.
1.
2 ( 1
2 ( 13 3' (
8.
11 0 -5 0 1 0 4 -7 9
9.
3 1 -2 4 6 -3 -2 1 7 13 0 1
1 9 4 5
10.
7 -3 -4 0 8 2 1 5 -1 3 -2 -5
1 0 7 9
0 4 1)-1 14 )
-7 -3 )
5.
6 -1 3 0 1 -4 2 2 -3
0 -4 2 -1 1 -1
-1 ) 6
4.
6.
7.
14 1 2 -1 1 1
-3 3 7 )
3 6 7
8.8 Cramer's Rule

Cramer's rule is a determinant formula for solving a system of equations AX = B when A is n × n and nonsingular. In this case, the system has the unique solution X = A⁻¹B. We can, therefore, find X by computing A⁻¹ and then A⁻¹B. Here is another way to find X.
THEOREM 8.12 Cramer's Rule
Let A be a nonsingular n × n matrix of numbers. Then the unique solution of AX = B is given by
$$x_k = \frac{1}{|A|}\,|A(k; B)| \quad\text{for } k = 1, \ldots, n,$$
where A(k; B) is the matrix obtained from A by replacing column k of A by B.

Here is a heuristic argument to suggest why this works. Let
$$B = \begin{pmatrix}b_1\\ b_2\\ \vdots\\ b_n\end{pmatrix}.$$
Multiply column k of A by x_k. The determinant of the resulting matrix is x_k|A|, so
$$x_k|A| = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1k}x_k & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2k}x_k & \cdots & a_{2n}\\
\vdots & & & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nk}x_k & \cdots & a_{nn}
\end{vmatrix}.$$
For each j ≠ k, add x_j times column j to column k. This Type III operation does not change the value of the determinant, and we get
$$x_k|A| = \begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n & \cdots & a_{2n}\\
\vdots & & & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n & \cdots & a_{nn}
\end{vmatrix}
= \begin{vmatrix}
a_{11} & a_{12} & \cdots & b_1 & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & b_2 & \cdots & a_{2n}\\
\vdots & & & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & b_n & \cdots & a_{nn}
\end{vmatrix} = |A(k; B)|.$$
Solving for x_k yields the conclusion of Cramer's rule.
EXAMPLE 8.6

Solve the system
$$\begin{aligned} x_1 - 3x_2 - 4x_3 &= 1,\\ -x_1 + x_2 - 3x_3 &= 14,\\ x_2 - 3x_3 &= 5. \end{aligned}$$
The matrix of coefficients is
$$A = \begin{pmatrix}1 & -3 & -4\\ -1 & 1 & -3\\ 0 & 1 & -3\end{pmatrix}.$$
We find that |A| = 13. By Cramer's rule,
$$x_1 = \frac{1}{13}\begin{vmatrix}1 & -3 & -4\\ 14 & 1 & -3\\ 5 & 1 & -3\end{vmatrix} = -\frac{117}{13} = -9,$$
$$x_2 = \frac{1}{13}\begin{vmatrix}1 & 1 & -4\\ -1 & 14 & -3\\ 0 & 5 & -3\end{vmatrix} = -\frac{10}{13},$$
$$x_3 = \frac{1}{13}\begin{vmatrix}1 & -3 & 1\\ -1 & 1 & 14\\ 0 & 1 & 5\end{vmatrix} = -\frac{25}{13}.$$
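Cramer's rule is equally direct to code: replace column k of A by B and divide determinants. Applied to Example 8.6 it reproduces x₁ = −9, x₂ = −10/13, x₃ = −25/13. (Illustrative sketch with exact fractions, not from the text.)

```python
from fractions import Fraction

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j+1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(A, b):
    """Solve AX = b by Cramer's rule (A must be nonsingular)."""
    d = det(A)
    xs = []
    for k in range(len(A)):
        # A(k; B): column k of A replaced by b.
        Ak = [row[:k] + [bk] + row[k+1:] for row, bk in zip(A, b)]
        xs.append(Fraction(det(Ak), d))
    return xs

A = [[1, -3, -4], [-1, 1, -3], [0, 1, -3]]
b = [1, 14, 5]
print(cramer(A, b))   # [Fraction(-9, 1), Fraction(-10, 13), Fraction(-25, 13)]
```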
Cramer's rule is not as efficient as the Gauss–Jordan reduction. Gauss–Jordan also applies to homogeneous systems, and to systems with different numbers of equations than unknowns. However, Cramer's rule does provide a formula for the solution, and this is useful in some contexts.
In each of Problems 1 through 10, either find the solution by Cramer's rule or show that the rule does not apply . 1.
8.9 The Matrix Tree Theorem

In 1847, G. R. Kirchhoff published a classic paper in which he derived many of the electrical circuit laws that bear his name. One of these is the matrix tree theorem, which we will discuss now. Figure 8.1 shows a typical electrical circuit. The underlying geometry of the circuit is shown in Figure 8.2. This diagram of points and connecting lines is called a graph, and was seen in the context of the movement of atoms in crystals in Section 7.1.5. A labeled graph has symbols attached to the points. Some of Kirchhoff's results depend on geometric properties of the circuit's underlying graph. One such property is the arrangement of the closed loops. Another is the number of spanning trees in the labeled graph. A spanning tree is a collection of lines in the graph forming no closed loops, but containing a path between any two points of the graph. Figure 8.3 shows a labeled graph and two spanning trees in this graph. Kirchhoff derived a relationship between determinants and the number of spanning trees in a graph.
FIGURE 8.1
FIGURE 8.2
FIGURE 8.3 A labeled graph and two of its spanning trees.
THEOREM 8.13 Matrix Tree Theorem

Let G be a graph with vertices labeled v₁, …, v_n. Form an n × n matrix T = [t_{ij}] as follows. If i = j, then t_{ii} is the number of lines to v_i in the graph. If i ≠ j, then t_{ij} = 0 if there is no line between v_i and v_j in G, and t_{ij} = −1 if there is such a line. Then all cofactors of T are equal, and their common value is the number of spanning trees in G.
EXAMPLE 8.7
For the labeled graph of Figure 8.4, T is the 7 × 7 matrix
$$T = \begin{pmatrix}
3 & -1 & 0 & 0 & 0 & -1 & -1\\
-1 & 3 & -1 & -1 & 0 & 0 & 0\\
0 & -1 & 3 & -1 & 0 & -1 & 0\\
0 & -1 & -1 & 4 & -1 & 0 & -1\\
0 & 0 & 0 & -1 & 3 & -1 & -1\\
-1 & 0 & -1 & 0 & -1 & 4 & -1\\
-1 & 0 & 0 & -1 & -1 & -1 & 4
\end{pmatrix}.$$

FIGURE 8.4 Graph G.

Evaluate any cofactor of T. For example, covering up row 1 and column 1, we have
$$(-1)^{1+1}M_{11} = \begin{vmatrix}
3 & -1 & -1 & 0 & 0 & 0\\
-1 & 3 & -1 & 0 & -1 & 0\\
-1 & -1 & 4 & -1 & 0 & -1\\
0 & 0 & -1 & 3 & -1 & -1\\
0 & -1 & 0 & -1 & 4 & -1\\
0 & 0 & -1 & -1 & -1 & 4
\end{vmatrix} = 386.$$
Evaluation of any other cofactor of T yields the same result. ■ Even with this small graph, it would clearly be impractical to enumerate the spanning trees by attempting to list them all.
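The cofactor count can be reproduced from an edge list. The edges below describe the graph of Figure 8.4 as reconstructed here (the printed matrix is partially garbled, so this edge list is an assumption, chosen to be consistent with the cofactor value 386).

```python
import numpy as np

# Assumed edge list for the graph of Figure 8.4 (vertices v1..v7).
edges = [(1, 2), (1, 6), (1, 7), (2, 3), (2, 4), (3, 4),
         (3, 6), (4, 5), (4, 7), (5, 6), (5, 7), (6, 7)]
n = 7
T = np.zeros((n, n))
for u, v in edges:
    T[u-1, u-1] += 1          # t_ii = number of lines at v_i
    T[v-1, v-1] += 1
    T[u-1, v-1] = T[v-1, u-1] = -1   # t_ij = -1 when v_i, v_j are joined

# Any cofactor works; delete row 1 and column 1 and take the determinant.
cofactor = np.linalg.det(np.delete(np.delete(T, 0, 0), 0, 1))
print(round(cofactor))   # 386 spanning trees
```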
PROBLEMS 1. Find the number of spanning trees in the graph of Figure 8 .5.
4. Find the number of spanning trees in the graph o f Figure 8 .8 .
FIGURE 8 . 5 FIGURE 8 . 8
2. Find the number of spanning trees in the graph of Figure 8 .6.
5. Find the number of spanning trees in the graph o f Figure 8.9.
FIGURE 8 . 6
3. Find the number of spanning trees in the graph of Figure 8 .7. FIGURE 8 . 9
FIGURE 8.7

6. A complete graph on n points consists of n points with a line between each pair of points. This graph is often denoted K_n. With the points labeled 1, 2, …, n, show that the number of spanning trees in K_n is n^{n−2} for n = 3, 4, ….
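Problem 6 can be checked experimentally with the matrix tree theorem: for the complete graph K_n every off-diagonal entry of T is −1, every diagonal entry is n − 1, and the cofactor equals n^{n−2} (Cayley's formula). Illustrative sketch, not from the text:

```python
import numpy as np

def spanning_trees_complete(n):
    """Number of spanning trees of K_n via the matrix tree theorem."""
    T = np.full((n, n), -1.0)
    np.fill_diagonal(T, n - 1)          # every vertex has degree n - 1
    minor = np.delete(np.delete(T, 0, 0), 0, 1)
    return round(np.linalg.det(minor))

print([spanning_trees_complete(n) for n in (3, 4, 5, 6)])   # [3, 16, 125, 1296]
```

The printed values are 3¹, 4², 5³, and 6⁴, as Cayley's formula predicts.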
CHAPTER 9

Eigenvalues, Diagonalization, and Special Matrices
Suppose A is an n × n matrix of real numbers. If we write an n-vector E as a column, then AE is an n × 1 matrix, which we may also think of as an n-vector. We may therefore consider A as an operator that moves vectors about in Rⁿ. Because A(aE₁ + bE₂) = aAE₁ + bAE₂, A is called a linear operator. Vectors have directions associated with them. Depending on A, the direction of AE will generally be different from that of E. It may happen, however, that for some vector E, AE and E are parallel. In this event there is a number λ such that AE = λE. Then λ is called an eigenvalue of A, with E an associated eigenvector. The idea of an operator moving a vector to a parallel position is simple and geometrically appealing. It also has powerful ramifications in a variety of contexts. Eigenvalues contain important information about the solutions of systems of differential equations, and in models of physical phenomena they may have physical significance as well (such as the modes of vibration of a mechanical system, or the energy states of an atom).
9.1 Eigenvalues and Eigenvectors

Let A be an n × n matrix of real or complex numbers.
DEFINITION 9.1 Eigenvalues and Eigenvectors

A real or complex number λ is an eigenvalue of A if there is a nonzero n × 1 matrix (vector) E such that
$$AE = \lambda E.$$
Any nonzero vector E satisfying this relationship is called an eigenvector associated with the eigenvalue λ.
Eigenvalues are also known as characteristic values of a matrix, and eigenvectors can be called characteristic vectors. We will typically write eigenvectors as column matrices and think of them as vectors in Rⁿ. If an eigenvector has complex components, we may think of it as a vector in Cⁿ, which consists of n-tuples of complex numbers. Since an eigenvector must be a nonzero vector, at least one component is nonzero. If α is a nonzero scalar and AE = λE, then
$$A(\alpha E) = \alpha(AE) = \alpha(\lambda E) = \lambda(\alpha E).$$
This means that nonzero scalar multiples of eigenvectors are again eigenvectors.
EXAMPLE 9.1

Since
$$\begin{pmatrix}1 & 0\\ 1 & 0\end{pmatrix}\begin{pmatrix}0\\ 4\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix} = 0\begin{pmatrix}0\\ 4\end{pmatrix},$$
0 is an eigenvalue of this matrix, with (0, 4)ᵀ an associated eigenvector. Although the zero vector cannot be an eigenvector, the number zero can be an eigenvalue of a matrix. For any scalar α ≠ 0,
$$\begin{pmatrix}0\\ 4\alpha\end{pmatrix}$$
is also an eigenvector associated with the eigenvalue 0.
EXAMPLE 9.2

Let
$$A = \begin{pmatrix}1 & -1 & 0\\ 0 & 1 & 1\\ 0 & 0 & -1\end{pmatrix}.$$
Then 1 is an eigenvalue with associated eigenvector (6, 0, 0)ᵀ, because
$$A\begin{pmatrix}6\\ 0\\ 0\end{pmatrix} = \begin{pmatrix}6\\ 0\\ 0\end{pmatrix} = 1\begin{pmatrix}6\\ 0\\ 0\end{pmatrix}.$$
Because any nonzero multiple of an eigenvector is an eigenvector, (α, 0, 0)ᵀ is also an eigenvector associated with eigenvalue 1, for any nonzero number α. Another eigenvalue of A is −1, with associated eigenvector (1, 2, −4)ᵀ, because
$$\begin{pmatrix}1 & -1 & 0\\ 0 & 1 & 1\\ 0 & 0 & -1\end{pmatrix}\begin{pmatrix}1\\ 2\\ -4\end{pmatrix} = \begin{pmatrix}-1\\ -2\\ 4\end{pmatrix} = -1\begin{pmatrix}1\\ 2\\ -4\end{pmatrix}.$$
Again, any vector (α, 2α, −4α)ᵀ, with α ≠ 0, is an eigenvector associated with −1. ■

We would like a way of finding all of the eigenvalues of a matrix A. The machinery to do this is at our disposal, and we reason as follows. For λ to be an eigenvalue of A, there must be an associated eigenvector E, and AE = λE. Then AE − λE = O, or
$$\lambda I_n E - AE = O.$$
The identity matrix was inserted so we could write the last equation as
$$(\lambda I_n - A)E = O.$$
This makes E a nontrivial solution of the n × n system of linear equations (λI_n − A)X = O. But this system can have a nontrivial solution if and only if the coefficient matrix has determinant zero, that is, |λI_n − A| = 0. Thus λ is an eigenvalue of A exactly when |λI_n − A| = 0. This is the equation
AI„E-AE=O . The identity matrix was inserted so we could write the last equation a s (AI„ - A)E = 0 . This makes E a nontrivial solution of the n x n system of linear equations (AI„ - A)X = 0 . But this system can have a nontrivial solution if and only if the coefficient matrix has determinan t zero, that is, I AI„ - A = 0 . Thus A is an eigenvalue of A exactly when - A * = 0 . This i s the equation A - ail - a21 -
-a l , A-a22
- a „2
..
-a2, ,
=O .
A - a ,,,,
When the determinant on the left is expanded, it is a polynomial of degree n in λ, called the characteristic polynomial of A. The roots of this polynomial are the eigenvalues of A. Corresponding to any root λ, any nontrivial solution E of (λI_n − A)X = O is an eigenvector associated with λ. We will summarize these conclusions.
THEOREM 9.1
Let A be an n × n matrix of real or complex numbers. Then
1. λ is an eigenvalue of A if and only if |λI_n − A| = 0.
2. If λ is an eigenvalue of A, then any nontrivial solution of (λI_n − A)X = O is an associated eigenvector.
DEFINITION 9.2 Characteristic Polynomial

The polynomial |λI_n − A| is the characteristic polynomial of A, and is denoted p_A(λ).
If A is n × n, then p_A(λ) is an nth degree polynomial with real or complex coefficients determined by the elements of A. This polynomial therefore has n roots, though some may be repeated. An n × n matrix A always has n eigenvalues λ₁, …, λ_n, in which each eigenvalue is listed according to its multiplicity as a root of the characteristic polynomial. For example, if
$$p_A(\lambda) = (\lambda - 1)(\lambda - 3)^2(\lambda - i)^4,$$
we list 7 eigenvalues: 1, 3, 3, i, i, i, i. The eigenvalue 3 has multiplicity 2 and i has multiplicity 4.
EXAMPLE 9.3

Let
$$A = \begin{pmatrix}1 & -1 & 0\\ 0 & 1 & 1\\ 0 & 0 & -1\end{pmatrix}$$
as in Example 9.2. The characteristic polynomial is
$$p_A(\lambda) = \begin{vmatrix}\lambda - 1 & 1 & 0\\ 0 & \lambda - 1 & -1\\ 0 & 0 & \lambda + 1\end{vmatrix} = (\lambda - 1)^2(\lambda + 1).$$
The eigenvalues of A are 1, 1, −1. To find eigenvectors associated with eigenvalue 1, solve
$$(1 \cdot I_3 - A)X = \begin{pmatrix}0 & 1 & 0\\ 0 & 0 & -1\\ 0 & 0 & 2\end{pmatrix}X = O.$$
This has general solution
$$X = \alpha\begin{pmatrix}1\\ 0\\ 0\end{pmatrix},$$
and these are the eigenvectors associated with eigenvalue 1, with α ≠ 0. For eigenvectors associated with −1, solve
$$(-1 \cdot I_3 - A)X = \begin{pmatrix}-2 & 1 & 0\\ 0 & -2 & -1\\ 0 & 0 & 0\end{pmatrix}X = O.$$
The general solution is
$$X = \beta\begin{pmatrix}1\\ 2\\ -4\end{pmatrix},$$
and these are the eigenvectors associated with eigenvalue −1, as long as β ≠ 0.
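The results of Examples 9.2 and 9.3 are easy to confirm numerically. This is an illustrative check, not from the text.

```python
import numpy as np

A = np.array([[1.0, -1.0,  0.0],
              [0.0,  1.0,  1.0],
              [0.0,  0.0, -1.0]])

# The eigenvalues are 1, 1, -1, as found from the characteristic polynomial.
print(sorted(round(x.real) for x in np.linalg.eigvals(A)))   # [-1, 1, 1]

# (1, 2, -4) is an eigenvector for eigenvalue -1: A v = -v.
v = np.array([1.0, 2.0, -4.0])
print(np.allclose(A @ v, -v))                                # True
```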
EXAMPLE 9.4

Let
$$A = \begin{pmatrix}1 & 2\\ -2 & 0\end{pmatrix}.$$
The characteristic polynomial is
$$p_A(\lambda) = \left|\lambda\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} - \begin{pmatrix}1 & 2\\ -2 & 0\end{pmatrix}\right| = \begin{vmatrix}\lambda - 1 & -2\\ 2 & \lambda\end{vmatrix} = \lambda(\lambda - 1) + 4 = \lambda^2 - \lambda + 4.$$
This has roots (1 + √15 i)/2 and (1 − √15 i)/2, and these are the eigenvalues of A. Even though A has real elements, the eigenvalues may be complex. To find eigenvectors associated with (1 + √15 i)/2, solve the system (λI₂ − A)X = O, which for this λ is
$$\begin{pmatrix}\dfrac{-1+\sqrt{15}\,i}{2} & -2\\[4pt] 2 & \dfrac{1+\sqrt{15}\,i}{2}\end{pmatrix}\begin{pmatrix}x_1\\ x_2\end{pmatrix} = \begin{pmatrix}0\\ 0\end{pmatrix},$$
or
$$\frac{-1+\sqrt{15}\,i}{2}x_1 - 2x_2 = 0, \qquad 2x_1 + \frac{1+\sqrt{15}\,i}{2}x_2 = 0.$$
We find the general solution of this system to be
$$X = \alpha\begin{pmatrix}1\\[2pt] \dfrac{-1+\sqrt{15}\,i}{4}\end{pmatrix},$$
and this is an eigenvector associated with the eigenvalue (1 + √15 i)/2 for any nonzero scalar α.
Corresponding to the eigenvalue (1 − √15 i)/2, solve the system
$$\left(\frac{1 - \sqrt{15}\,i}{2}I_2 - A\right)X = O,$$
obtaining the general solution
$$X = \beta\begin{pmatrix}1\\[2pt] \dfrac{-1-\sqrt{15}\,i}{4}\end{pmatrix}.$$
This is an eigenvector corresponding to the eigenvalue (1 − √15 i)/2 for any β ≠ 0. ■
Finding the eigenvalues of a matrix is equivalent to finding the roots of an nth degree polynomial, and if n > 3 this may be difficult. There are efficient computer routines, which are usually based on the idea of putting the matrix through a sequence of transformations whose effect on the eigenvalues is known. This strategy was used previously to evaluate determinants. There are also approximation techniques, but these are sensitive to error. A number that is very close to an eigenvalue may not behave like an eigenvalue. We will conclude this section with a theorem due to Gerschgorin. If real eigenvalues are plotted on the real line, and complex eigenvalues as points in the plane, Gerschgorin's theorem enables us to delineate regions of the plane containing the eigenvalues.

9.1.1 Gerschgorin's Theorem
THEOREM 9.2 Gerschgorin

Let A be an n × n matrix of real or complex numbers. For k = 1, …, n, let
$$r_k = \sum_{j=1,\; j \neq k}^{n} |a_{kj}|.$$
Let C_k be the circle of radius r_k centered at (α_k, β_k), where a_{kk} = α_k + iβ_k. Then each eigenvalue of A, when plotted as a point in the complex plane, lies on or within one of the circles C₁, …, C_n.

The circles C_k are called Gerschgorin circles. For the radius of C_k, read across row k and add the magnitudes of the row elements, omitting the diagonal element a_{kk}. The center of C_k is a_{kk}, plotted as a point in the complex plane. If the Gerschgorin circles are drawn and the disks they bound are shaded, then we have a picture of a region containing all of the eigenvalues of A.
EXAMPLE 9.5
Let
$$A = \begin{pmatrix}12i & 1 & 9 & -4\\ 1 & -6 & 2+i & -1\\ 4 & 1 & -1 & 4i\\ 1-3i & -9 & 1 & 4-7i\end{pmatrix}.$$
A has characteristic polynomial
$$p_A(\lambda) = \lambda^4 + (3 - 5i)\lambda^3 + (18 - 4i)\lambda^2 + (290 + 90i)\lambda + 1374 - 1120i.$$
It is not clear what the roots of this polynomial are. Form the Gerschgorin circles. Their radii are
$$r_1 = 1 + 9 + 4 = 14, \quad r_2 = 1 + \sqrt{5} + 1 = 2 + \sqrt{5}, \quad r_3 = 4 + 1 + 4 = 9, \quad r_4 = \sqrt{10} + 9 + 1 = 10 + \sqrt{10}.$$
C₁ has radius 14 and center (0, 12), C₂ has radius 2 + √5 and center (−6, 0), C₃ has radius 9 and center (−1, 0), and C₄ has radius 10 + √10 and center (4, −7). Figure 9.1 shows the Gerschgorin circles containing the eigenvalues of A.
FIGURE 9 .1
Gerschgorin circles.
Gerschgorin's theorem is not intended as an approximation scheme, since the Gerschgorin circles may have large radii. For some problems, however, just knowing some information about possible locations of eigenvalues can be important. For example, in studies of the stability of fluid flow, it is important to know whether there are eigenvalues in the right half-plane.
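Gerschgorin's theorem is also easy to test numerically: compute the row radii, then check each eigenvalue against the union of the disks. The matrix below is Example 9.5's as reconstructed here (an assumption of this sketch); the containment test itself works for any matrix.

```python
import numpy as np

# Example 9.5's matrix, as reconstructed above.
A = np.array([[12j, 1, 9, -4],
              [1, -6, 2 + 1j, -1],
              [4, 1, -1, 4j],
              [1 - 3j, -9, 1, 4 - 7j]])

# Radius r_k: sum of row magnitudes minus the diagonal entry's magnitude.
radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
centers = np.diag(A)

# Every eigenvalue must lie in at least one Gerschgorin disk.
for lam in np.linalg.eigvals(A):
    print(any(abs(lam - c) <= r for c, r in zip(centers, radii)))
```

The theorem guarantees each printed value is True.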
In each of Problems 1 through 16, (a) find the eigenvalue s of the matrix, (b) corresponding to each eigenvalue, find an eigenvector, and (c) sketch the Gerschgorin circles and (approximately) locate the eigenvalues as points in th e plane .
7 9.
10 .
7
-3 1 0 0 0 1
1 0 0
0 0 -1 0 0 1 2 0 0 -14 1 0 2 1 0
11 .
16 .
5 0 0 0
1 1 0 0
0 0 0 0
9 9 9 0
17. Show that the eigenvalues of
$$\begin{pmatrix}\alpha & \beta\\ \beta & \gamma\end{pmatrix},$$
in which α, β, and γ are real numbers, are real.
0 0 2
18. Show that the eigenvalues of the symmetric matrix
$$\begin{pmatrix}\alpha & \beta & \gamma\\ \beta & \delta & \varepsilon\\ \gamma & \varepsilon & \zeta\end{pmatrix}$$
are real
12 .
3 0 0 1 -2 - 8 0 -5 1
13 .
1 -2 0 0 - 5 0
14 .
-2 1 0 0
1 0 0 0
0 1 0 0
20. Let λ be an eigenvalue of A with eigenvector E, and μ an eigenvalue of A with eigenvector L. Suppose λ ≠ μ. Show that E and L are linearly independent as vectors in Rⁿ.
15 .
- 4 0 0 1
1 0 1 1 0 0 0 2 0 0 0 3
21. Let A be an n × n matrix. Prove that the constant term of p_A(λ) is (−1)ⁿ|A|. Use this to show that any singular matrix must have zero as one of its eigenvalues.
1
y
if all of the matrix elements are real.
19. Let λ be an eigenvalue of A with eigenvector E. Show that, for any positive integer k, λᵏ is an eigenvalue of Aᵏ, with eigenvector E.
0 0 7
0 0 0 0
E
9.2 Diagonalization of Matrices

We have referred to the elements a_{ii} of a square matrix as its main diagonal elements. All other elements are called off-diagonal elements.
A square matrix having all off-diagonal elements equal to zero is called a diagonal matrix .
We often write a diagonal matrix having main diagonal elements d₁, …, d_n as
$$\begin{pmatrix}d_1 & & 0\\ & \ddots & \\ 0 & & d_n\end{pmatrix},$$
with 0 in the upper right and lower left corners to indicate that all off-diagonal elements are zero. Here are some properties of diagonal matrices that make them pleasant to work with.
THEOREM 9.3

Let
$$D = \begin{pmatrix}d_1 & & 0\\ & \ddots & \\ 0 & & d_n\end{pmatrix} \quad\text{and}\quad W = \begin{pmatrix}w_1 & & 0\\ & \ddots & \\ 0 & & w_n\end{pmatrix}.$$
Then
1. DW = WD is the diagonal matrix with main diagonal elements d₁w₁, …, d_nw_n.
2. |D| = d₁d₂ ⋯ d_n.
3. D is nonsingular if and only if each main diagonal element is nonzero.
4. If each d_i ≠ 0, then
$$D^{-1} = \begin{pmatrix}1/d_1 & & 0\\ & \ddots & \\ 0 & & 1/d_n\end{pmatrix}.$$
5. The eigenvalues of D are its main diagonal elements.
6. An eigenvector associated with d_j is the column vector with 1 in row j and all other elements zero.
We leave a proof of these conclusions to the student. Notice that (2) follows from the fact that a diagonal matrix is upper (and lower) triangular . "Most" square matrices are not diagonal matrices . However, some matrices are related t o diagonal matrices in a way that enables us to utilize the nice features of diagonal matrices .
DEFINITION 9.4 Diagonalizable Matrix

An n × n matrix A is diagonalizable if there exists an n × n matrix P such that P⁻¹AP is a diagonal matrix. When such P exists, we say that P diagonalizes A.
The following theorem not only tells us when a matrix is diagonalizable, but also how to find a matrix P that diagonalizes it.

THEOREM 9.4 Diagonalizability

Let A be an n × n matrix. Then A is diagonalizable if it has n linearly independent eigenvectors. Further, if P is the n × n matrix having these eigenvectors as columns, then P⁻¹AP is the diagonal matrix having the corresponding eigenvalues down its main diagonal. ■

Here is what this means. Suppose λ₁, …, λ_n are the eigenvalues of A (some possibly repeated), and V₁, …, V_n are corresponding eigenvectors. If these eigenvectors are linearly independent, we can form a nonsingular matrix P using V_j as column j. It is the linear independence of the eigenvectors that makes P nonsingular. We claim that P⁻¹AP is the diagonal matrix having the eigenvalues of A down its main diagonal, in the order corresponding to the order the eigenvectors were listed as columns of P.
EXAMPLE 9.6
Let
$$A = \begin{pmatrix}-1 & 4\\ 0 & 3\end{pmatrix}.$$
A has eigenvalues −1, 3 and corresponding eigenvectors
$$\begin{pmatrix}1\\ 0\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}1\\ 1\end{pmatrix},$$
respectively. Form
$$P = \begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}.$$
Because the eigenvectors are linearly independent, this matrix is nonsingular (note that |P| ≠ 0). We find that
$$P^{-1} = \begin{pmatrix}1 & -1\\ 0 & 1\end{pmatrix}.$$
Now compute
$$P^{-1}AP = \begin{pmatrix}1 & -1\\ 0 & 1\end{pmatrix}\begin{pmatrix}-1 & 4\\ 0 & 3\end{pmatrix}\begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix} = \begin{pmatrix}-1 & 0\\ 0 & 3\end{pmatrix},$$
which has the eigenvalues down the main diagonal, corresponding to the order in which the eigenvectors were written as columns of P. If we use the other order in writing the eigenvectors as columns, and define
$$Q = \begin{pmatrix}1 & 1\\ 1 & 0\end{pmatrix},$$
then we get
$$Q^{-1}AQ = \begin{pmatrix}3 & 0\\ 0 & -1\end{pmatrix}. \;■$$
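The diagonalization in Example 9.6 can be verified numerically; the matrix is as reconstructed here from the text's computations, so treat it as an assumption of this check.

```python
import numpy as np

A = np.array([[-1.0, 4.0], [0.0, 3.0]])
P = np.array([[1.0, 1.0], [0.0, 1.0]])   # eigenvectors (1,0) and (1,1) as columns

D = np.linalg.inv(P) @ A @ P
print(np.round(D))   # diag(-1, 3), the eigenvalues in column order
```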
Any linearly independent eigenvectors can be used in this diagonalization process. For example, if we use
$$\begin{pmatrix}6\\ 0\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}-4\\ -4\end{pmatrix},$$
which are simply nonzero scalar multiples of the previously used eigenvectors, then we can define
$$S = \begin{pmatrix}6 & -4\\ 0 & -4\end{pmatrix}$$
and now
$$S^{-1}AS = \begin{pmatrix}-1 & 0\\ 0 & 3\end{pmatrix}.$$
EXAMPLE 9.7
Here is an example with more complicated arithmetic, but the idea remains the same. Let
$$A = \begin{pmatrix}-1 & 1 & 3\\ 2 & 1 & 4\\ 1 & 0 & -2\end{pmatrix}.$$
The eigenvalues are −1, (−1 + √29)/2, and (−1 − √29)/2, with corresponding eigenvectors, respectively,
$$\begin{pmatrix}1\\ -3\\ 1\end{pmatrix}, \quad \begin{pmatrix}3+\sqrt{29}\\ 10+2\sqrt{29}\\ 2\end{pmatrix}, \quad \begin{pmatrix}3-\sqrt{29}\\ 10-2\sqrt{29}\\ 2\end{pmatrix}.$$
These are linearly independent. Form the matrix
$$P = \begin{pmatrix}1 & 3+\sqrt{29} & 3-\sqrt{29}\\ -3 & 10+2\sqrt{29} & 10-2\sqrt{29}\\ 1 & 2 & 2\end{pmatrix}.$$
Then
$$P^{-1}AP = \begin{pmatrix}-1 & 0 & 0\\[2pt] 0 & \dfrac{-1+\sqrt{29}}{2} & 0\\[4pt] 0 & 0 & \dfrac{-1-\sqrt{29}}{2}\end{pmatrix}.$$
In this example we did not actually need P⁻¹ explicitly to diagonalize A. Theorem 9.4 assures us that P⁻¹AP is a diagonal matrix with the eigenvalues of A down its main diagonal. All we really needed was to know that A has three linearly independent eigenvectors. This is a useful fact to keep in mind, particularly if P and P⁻¹ are cumbersome to compute.
EXAMPLE 9.8
Let
$$A = \begin{pmatrix}-1 & -4\\ 3 & -2\end{pmatrix}.$$
The eigenvalues are (−3 + √47 i)/2 and (−3 − √47 i)/2. Corresponding eigenvectors are, respectively,
$$\begin{pmatrix}8\\ 1-\sqrt{47}\,i\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}8\\ 1+\sqrt{47}\,i\end{pmatrix}.$$
Since these eigenvectors are linearly independent, there is a nonsingular 2 × 2 matrix P that diagonalizes A:
$$P^{-1}AP = \begin{pmatrix}\dfrac{-3+\sqrt{47}\,i}{2} & 0\\[6pt] 0 & \dfrac{-3-\sqrt{47}\,i}{2}\end{pmatrix}.$$
Of course, if we need P for some other calculation, as will occur later, we can write it down:
$$P = \begin{pmatrix}8 & 8\\ 1-\sqrt{47}\,i & 1+\sqrt{47}\,i\end{pmatrix}.$$
And, if we wish, we can compute
$$P^{-1} = \frac{1}{16\sqrt{47}\,i}\begin{pmatrix}1+\sqrt{47}\,i & -8\\ -1+\sqrt{47}\,i & 8\end{pmatrix}.$$
However, even without explicitly writing P⁻¹, we know what P⁻¹AP is.

EXAMPLE 9.9
It is not necessary that A have n distinct eigenvalues in order to have n linearly independent eigenvectors. For example, let
$$A = \begin{pmatrix}5 & -4 & 4\\ 12 & -11 & 12\\ 4 & -4 & 5\end{pmatrix}.$$
The eigenvalues are 1, 1, −3, with 1 having multiplicity 2. Associated with −3 we find an eigenvector
$$\begin{pmatrix}1\\ 3\\ 1\end{pmatrix}.$$
To find eigenvectors associated with 1 we must solve the system
$$(I_3 - A)X = \begin{pmatrix}-4 & 4 & -4\\ -12 & 12 & -12\\ -4 & 4 & -4\end{pmatrix}X = \begin{pmatrix}0\\ 0\\ 0\end{pmatrix}.$$
This system has general solution
$$X = \alpha\begin{pmatrix}1\\ 0\\ -1\end{pmatrix} + \beta\begin{pmatrix}0\\ 1\\ 1\end{pmatrix}.$$
We can therefore find two linearly independent eigenvectors associated with eigenvalue 1, for example,
$$\begin{pmatrix}1\\ 0\\ -1\end{pmatrix} \quad\text{and}\quad \begin{pmatrix}0\\ 1\\ 1\end{pmatrix}.$$
We can now form the nonsingular matrix
$$P = \begin{pmatrix}1 & 1 & 0\\ 3 & 0 & 1\\ 1 & -1 & 1\end{pmatrix}$$
that diagonalizes A:
$$P^{-1}AP = \begin{pmatrix}-3 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix}.$$
Here is a proof of Theorem 9.4, explaining why a matrix P formed from linearly independent eigenvectors must diagonalize A.

Proof  The proof makes use of an observation we have made before: when multiplying two n x n matrices A and B, column j of AB = A(column j of B). Let the eigenvalues of A be λ1, ..., λn, with corresponding eigenvectors V1, ..., Vn. These form the columns of P. Since these eigenvectors are assumed to be linearly independent, the dimension of the column space of P is n. Therefore rank(P) = n and P is nonsingular by Theorems 7.15 and 7.26. Now compute P^{-1}AP as follows. First,

column j of AP = A(column j of P) = A Vj = λj Vj.

Thus the columns of AP are λ1 V1, ..., λn Vn and AP has the form

AP = [ λ1 V1  λ2 V2  ...  λn Vn ].

Then

column j of P^{-1}AP = P^{-1}(column j of AP) = P^{-1}[λj Vj] = λj P^{-1} Vj.

But Vj is column j of P, so

P^{-1} Vj = column j of P^{-1}P = column j of I_n,

the column matrix having all zero elements except 1 in row j. Combining the last two equations, column j of P^{-1}AP is λj times this column. We now know the columns of P^{-1}AP, and putting them together gives us

P^{-1}AP = [ λ1  0   0   ...  0  ]
           [ 0   λ2  0   ...  0  ]
           [ 0   0   λ3  ...  0  ]
           [ ...                 ]
           [ 0   0   0   ...  λn ]. ■
We can strengthen the conclusions of Theorem 9.4. So far, if A has n linearly independent eigenvectors, then we can diagonalize A. We will now show that this is the only circumstance in which A can be diagonalized. Further, if, for any Q, Q^{-1}AQ is a diagonal matrix, then Q must have linearly independent eigenvectors of A as its columns.
THEOREM 9.5
Let A be an n x n diagonalizable matrix. Then A has n linearly independent eigenvectors. Further, if Q^{-1}AQ is a diagonal matrix, then the diagonal elements of Q^{-1}AQ are the eigenvalues of A, and the columns of Q are corresponding eigenvectors.

Proof  Suppose that

Q^{-1}AQ = [ d1  0   ...  0  ]
           [ 0   d2  ...  0  ]
           [ ...              ]
           [ 0   0   ...  dn ] = D.

Denote column j of Q as Vj. Then V1, ..., Vn are linearly independent, because Q is nonsingular. We will show that dj is an eigenvalue of A, with corresponding eigenvector Vj.

Write AQ = QD and compute both sides of this product separately. First, since the columns of Q are V1, ..., Vn,

QD = [ d1 V1  d2 V2  ...  dn Vn ],

a matrix having dj Vj as column j. Now compute

AQ = A[ V1  V2  ...  Vn ] = [ A V1  A V2  ...  A Vn ],

a matrix having A Vj as column j. Since AQ = QD, then column j of AQ equals column j of QD, so

A Vj = dj Vj,

which proves that dj is an eigenvalue of A with associated eigenvector Vj. ■

As a consequence of this theorem, we see that not every matrix is diagonalizable.
EXAMPLE 9.10

Let

B = [ 1  1 ]
    [ 0  1 ].

B has eigenvalues 1, 1, and every eigenvector has the form

α [ 1 ]
  [ 0 ].

There are not two linearly independent eigenvectors, so B is not diagonalizable.

We could also proceed here by contradiction. If B were diagonalizable, then for some P,

P^{-1}BP = [ 1  0 ]
           [ 0  1 ].

From Theorem 9.5, the columns of P must be eigenvectors, so P must have the form

P = [ α  β ]
    [ 0  0 ].

But this matrix is singular (it has zero determinant, and its columns are multiples of each other, hence linearly dependent). Thus no matrix can diagonalize B.

The key to diagonalization of an n x n matrix A is therefore the existence of n linearly independent eigenvectors. We saw (Example 9.9) that this does not require that the eigenvalues be distinct. However, if A does have n distinct eigenvalues, we claim that it must have n linearly independent eigenvectors, hence must be diagonalizable.
THEOREM 9.6

Let the n x n matrix A have n distinct eigenvalues. Then corresponding eigenvectors are linearly independent.

Proof  We will show by induction that any k distinct eigenvalues have associated with them k linearly independent eigenvectors. For k = 1, an eigenvector associated with a single eigenvalue is linearly independent, being a nonzero vector. Now suppose that any k - 1 distinct eigenvalues have associated with them k - 1 linearly independent eigenvectors. Suppose we have distinct eigenvalues λ1, ..., λk. Let V1, ..., Vk be associated eigenvectors. We want to show that V1, ..., Vk are linearly independent.

If these eigenvectors were linearly dependent, there would be numbers c1, ..., ck not all zero such that

c1 V1 + ... + ck Vk = O.

By relabeling if necessary, we may suppose for convenience that c1 ≠ 0. Now apply λ1 I_n - A to both sides. The right side gives O, while

(λ1 I_n - A)(c1 V1 + ... + ck Vk)
  = c1(λ1 I_n - A)V1 + c2(λ1 I_n - A)V2 + ... + ck(λ1 I_n - A)Vk
  = c1(λ1 V1 - A V1) + c2(λ1 V2 - A V2) + ... + ck(λ1 Vk - A Vk)
  = c1(λ1 V1 - λ1 V1) + c2(λ1 V2 - λ2 V2) + ... + ck(λ1 Vk - λk Vk)
  = c2(λ1 - λ2)V2 + ... + ck(λ1 - λk)Vk.

But V2, ..., Vk are linearly independent by the inductive hypothesis, so each of these coefficients must be zero. Since λ1 - λj ≠ 0 for j = 2, ..., k by the assumption that the eigenvalues are distinct, then

c2 = ... = ck = 0.

But then c1 V1 = O. Since V1 is an eigenvector and cannot be O, then c1 = 0 also, a contradiction. Therefore V1, ..., Vk are linearly independent. ■
COROLLARY 9.1

If an n x n matrix A has n distinct eigenvalues, then A is diagonalizable. ■
EXAMPLE 9.11

Let A be a 4 x 4 matrix with eigenvalues 3, 4, -2 + (√41/2)i, and -2 - (√41/2)i. Because these are distinct, A is diagonalizable. For some P,

P^{-1}AP = [ 3  0       0              0       ]
           [ 0  4       0              0       ]
           [ 0  0  -2 + (√41/2)i       0       ]
           [ 0  0       0        -2 - (√41/2)i ].

We do not need to actually produce P explicitly to conclude this.
In each of Problems 1 through 10, produce a matrix that diagonalizes the given matrix, or show that the matrix is not diagonalizable.

11. Suppose A^2 is diagonalizable. Prove that A is diagonalizable.

12. Let A have eigenvalues λ1, ..., λn and suppose P diagonalizes A. Prove that, for any positive integer k,

A^k = P [ λ1^k  0     ...  0    ]
        [ 0     λ2^k  ...  0    ]
        [ ...                   ]
        [ 0     0     ...  λn^k ] P^{-1}.

In each of Problems 13 through 16, compute the indicated power of the matrix, using the idea of Problem 12.
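The formula of Problem 12 is easy to try out numerically; the following NumPy sketch uses a small symmetric matrix of our own choosing (not one of the problem matrices):

```python
import numpy as np

# Problem 12: if P diagonalizes A, then A^k = P diag(l1^k, ..., ln^k) P^{-1}.
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])          # eigenvalues 3 and -1

eigvals, P = np.linalg.eig(A)       # columns of P are eigenvectors
k = 7
A_to_k = P @ np.diag(eigvals ** k) @ np.linalg.inv(P)

# Compare against repeated matrix multiplication.
ok = bool(np.allclose(A_to_k, np.linalg.matrix_power(A, k)))
```

The point of the identity is efficiency: once P and P^{-1} are known, raising A to any power costs only n scalar powers and two matrix multiplications.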
9.3 Orthogonal and Symmetric Matrices

Recall that the transpose of a matrix is obtained by interchanging the rows with the columns. For example, the transpose of

A = [ a  b ]
    [ c  d ]

is

A^t = [ a  c ]
      [ b  d ].

Usually A^t is simply another matrix. However, in the special circumstance that the transpose of a matrix is its inverse, we call A an orthogonal matrix.
DEFINITION 9.5  Orthogonal Matrix

A square n x n matrix A is orthogonal if and only if A A^t = A^t A = I_n.

An orthogonal matrix is therefore nonsingular, and we find its inverse simply by taking its transpose.
EXAMPLE 9.12

Let

A = [ 0     1/√2   1/√2 ]
    [ 1     0      0    ]
    [ 0     1/√2  -1/√2 ].

Then

A A^t = [ 0     1/√2   1/√2 ] [ 0     1     0    ]   [ 1  0  0 ]
        [ 1     0      0    ] [ 1/√2  0     1/√2 ] = [ 0  1  0 ],
        [ 0     1/√2  -1/√2 ] [ 1/√2  0    -1/√2 ]   [ 0  0  1 ]

and a similar calculation gives A^t A = I_3. Therefore this matrix is orthogonal, and

A^{-1} = A^t = [ 0     1     0    ]
               [ 1/√2  0     1/√2 ]
               [ 1/√2  0    -1/√2 ].
Because the transpose of the transpose of a matrix is the original matrix, a matrix is orthogonal exactly when its transpose is orthogonal.

THEOREM 9.7

A is an orthogonal matrix if and only if A^t is an orthogonal matrix. ■

Orthogonal matrices have several interesting properties. We will show first that the determinant of an orthogonal matrix must be 1 or -1.

THEOREM 9.8

If A is an orthogonal matrix, then |A| = ±1.

Proof  Since A A^t = I_n, then

|A A^t| = 1 = |A| |A^t| = |A|^2. ■

The next property of orthogonal matrices is actually the rationale for the name orthogonal. A set of vectors in R^n is said to be orthogonal if any two distinct vectors in the set are orthogonal (that is, their dot product is zero). The set is orthonormal if, in addition, each vector has length 1. We claim that the rows of an orthogonal matrix form an orthonormal set of vectors, as do the columns.
This can be seen in the matrix of the last example. The row vectors are

(0, 1/√2, 1/√2),  (1, 0, 0),  (0, 1/√2, -1/√2).

These each have length 1, and each is orthogonal to each of the other two. Similarly, the columns of that matrix are

[ 0 ]   [ 1/√2 ]   [ 1/√2  ]
[ 1 ],  [ 0    ],  [ 0     ].
[ 0 ]   [ 1/√2 ]   [ -1/√2 ]

Each is orthogonal to the other two, and each has length 1. Not only do the row (column) vectors of an orthogonal matrix form an orthonormal set of vectors in R^n, but this property completely characterizes orthogonal matrices.

THEOREM 9.9

Let A be a real n x n matrix. Then

1. A is orthogonal if and only if the row vectors form an orthonormal set of vectors in R^n.
2. A is orthogonal if and only if the column vectors form an orthonormal set of vectors in R^n. ■

Proof  Recall that the i, j element of AB is the dot product of row i of A with column j of B. Further, the columns of A^t are the rows of A. Therefore,

i, j element of A A^t = (row i of A) · (column j of A^t) = (row i of A) · (row j of A).

Now suppose that A is an orthogonal matrix. Then A A^t = I_n, so the i, j element of A A^t is zero if i ≠ j. Therefore the dot product of two distinct rows of A is zero, and the rows form an orthogonal set of vectors. Further, the dot product of row i with itself is the i, i element of A A^t, and this is 1, so the rows form an orthonormal set of vectors.

Conversely, suppose the rows of A form an orthonormal set of vectors. Then the dot product of row i with row j is zero if i ≠ j, so the i, j element of A A^t is zero if i ≠ j. Further, the i, i element of A A^t is the dot product of row i with itself, and this is 1. Therefore A A^t = I_n. Similarly, A^t A = I_n, so A is an orthogonal matrix. This proves (1). A proof of (2) is similar. ■

We now have a great deal of information about orthogonal matrices. We will use this to completely determine all 2 x 2 orthogonal matrices. Let
Q = [ a  b ]
    [ c  d ].

What do we have to say about a, b, c and d to make this an orthogonal matrix? First, the two row vectors must be orthogonal (zero dot product), and each must have length 1, so

ac + bd = 0,    (9.1)

a^2 + b^2 = 1,    (9.2)

c^2 + d^2 = 1.    (9.3)

The two column vectors must also be orthogonal, so in addition,

ab + cd = 0.    (9.4)
Finally, |Q| = ±1, so

ad - bc = ±1.

This leads to two cases.

Case 1: ad - bc = 1. Multiply equation (9.1) by d to get

acd + bd^2 = 0.

Substitute ad = 1 + bc into this equation to get

c(1 + bc) + bd^2 = 0,

or

c + b(c^2 + d^2) = 0.

But c^2 + d^2 = 1 from equation (9.3), so c + b = 0, hence

c = -b.

Put this into equation (9.4) to get

ab - bd = 0.

Then b = 0 or a = d, leading to two subcases.

Case 1(a): b = 0. Then c = -b = 0 also, so

Q = [ a  0 ]
    [ 0  d ].

But each row vector has length 1, so a^2 = d^2 = 1. Further, |Q| = ad = 1 in the present case, so a = d = 1 or a = d = -1. In these cases,

Q = I_2  or  Q = -I_2.

Case 1(b): b ≠ 0. Then a = d, so

Q = [ a   b ]
    [ -b  a ].

Since a^2 + b^2 = 1, there is some θ in [0, 2π) such that a = cos(θ) and b = sin(θ). Then

Q = [ cos(θ)   sin(θ) ]
    [ -sin(θ)  cos(θ) ].

This includes the two results of case 1(a) by choosing θ = 0 or θ = π.

Case 2: ad - bc = -1. By an analysis similar to that just done, we find now that, for some θ,

Q = [ cos(θ)  sin(θ)  ]
    [ sin(θ)  -cos(θ) ].

These two cases give all the 2 x 2 orthogonal matrices. For example, with θ = π/4 we get the orthogonal matrices

[ 1/√2   1/√2 ]       [ 1/√2   1/√2 ]
[ -1/√2  1/√2 ]  and  [ 1/√2  -1/√2 ],
and with θ = π/6 we get

[ √3/2   1/2  ]       [ √3/2   1/2  ]
[ -1/2   √3/2 ]  and  [ 1/2   -√3/2 ].

We can recognize the orthogonal matrices

[ cos(θ)   sin(θ) ]
[ -sin(θ)  cos(θ) ]

as rotations in the plane. If the positive x, y system is rotated counterclockwise θ radians to form a new x', y' system, the coordinates in the two systems are related by

[ x' ]   [ cos(θ)   sin(θ) ] [ x ]
[ y' ] = [ -sin(θ)  cos(θ) ] [ y ].

We will now consider another kind of matrix that is related to the class of orthogonal matrices.
DEFINITION 9.6  Symmetric Matrix

A square matrix A is symmetric if A = A^t.

This means that each a_ij = a_ji, or that the matrix elements are the same if reflected across the main diagonal. For example,

[ -7  -2   1  14 ]
[ -2   2  -9   4 ]
[  1  -9   6   π ]
[ 14   4   π  22 ]

is symmetric. A symmetric matrix need not have real numbers as elements. However, when it does, it has the remarkable property of having only real eigenvalues.

THEOREM 9.10

The eigenvalues of a real, symmetric matrix are real numbers.

Before showing why this is true, we will review some facts about complex numbers. A complex number z = a + ib has magnitude |z| = √(a^2 + b^2). The conjugate of z is defined to be z̄ = a - ib. When z is represented as the point (a, b) in the plane, z̄ is the point (a, -b), which is the reflection of (a, b) across the x-axis. A number is real exactly when it equals its own conjugate. Further,

z z̄ = a^2 + b^2 = |z|^2,

and the conjugate of z̄ is z.
We take the conjugate Ā of a matrix A by taking the conjugate of each of its elements. The conjugate of a product is the conjugate of the factors: the conjugate of AB is Ā B̄. Further, the operation of taking the conjugate commutes with the operation of taking a transpose: (Ā)^t = the conjugate of A^t. For example, if

C = [ i       1 - 2i ]
    [ 3       0      ]
    [ -2 + i  4      ],

then

C̄ = [ -i      1 + 2i ]
    [ 3       0      ]
    [ -2 - i  4      ]

and

(C̄)^t = [ -i      3  -2 - i ]
        [ 1 + 2i  0   4     ].
We will now prove that the eigenvalues of a real symmetric matrix must be real.

Proof  Let A be an n x n symmetric matrix of real numbers. Let λ be an eigenvalue, and let

E = [ e1  e2  ...  en ]^t

be an associated eigenvector. Then AE = λE. Multiply this equation on the left by the 1 x n matrix

Ē^t = ( ē1  ē2  ...  ēn )

to get

Ē^t A E = Ē^t λE = λ Ē^t E = λ(ē1 e1 + ē2 e2 + ... + ēn en)
        = λ(|e1|^2 + |e2|^2 + ... + |en|^2),    (9.5)

in which the factor |e1|^2 + ... + |en|^2 is a positive real number. Here we are using the standard convention that a 1 x 1 matrix is identified with its single element. Now compute the conjugate of the left side:

conjugate of Ē^t A E = E^t Ā Ē = E^t A Ē,    (9.6)

in which we have used the fact that A has real elements to write Ā = A. Now E^t A Ē is a 1 x 1 matrix, and so is the same as its transpose. Recalling that the transpose of a product is the product of the transposes in the reverse order, and that A^t = A, take the transpose in equation (9.6) to get

E^t A Ē = (E^t A Ē)^t = Ē^t A^t (E^t)^t = Ē^t A E.    (9.7)

From equations (9.6) and (9.7) we have

conjugate of Ē^t A E = Ē^t A E.

Therefore the 1 x 1 matrix Ē^t A E, being equal to its conjugate, is a real number. Now return to equation (9.5). We have just shown that the left side of this equation is real, so the right side must be real. But |e1|^2 + |e2|^2 + ... + |en|^2 is certainly real and nonzero. Therefore λ is real, and the theorem is proved. ■

One ramification of this theorem is that a real, symmetric matrix also has real eigenvectors. We claim that, more than this, eigenvectors from distinct eigenvalues are orthogonal.

THEOREM 9.11
Let A be a real symmetric matrix. Then eigenvectors associated with distinct eigenvalues are orthogonal.

Proof  Let λ and μ be distinct eigenvalues with, respectively, eigenvectors

E = [ e1  ...  en ]^t  and  G = [ g1  ...  gn ]^t.

Identifying, as usual, a real number with the 1 x 1 matrix having this number as its only element, the dot product of these two n-vectors can be written as a matrix product:

E · G = e1 g1 + ... + en gn = E^t G.

Since AE = λE and AG = μG, we have

λ E^t G = (λE)^t G = (AE)^t G = (E^t A^t)G = (E^t A)G = E^t (AG) = E^t (μG) = μ E^t G.

Then

(λ - μ) E^t G = 0.

But λ ≠ μ, so E^t G = 0 and the dot product of these two eigenvectors is zero. These eigenvectors are therefore orthogonal. ■
EXAMPLE 9.13

Let

A = [ 3   0  -2 ]
    [ 0   2   0 ]
    [ -2  0   0 ],

a 3 x 3 real symmetric matrix. The eigenvalues are 2, -1, 4, with associated eigenvectors

[ 0 ]   [ 1 ]   [ -2 ]
[ 1 ],  [ 0 ],  [ 0  ].
[ 0 ]   [ 2 ]   [ 1  ]

These form an orthogonal set of vectors. ■

In this example, the eigenvectors, while orthogonal to each other, are not all of length 1. However, a scalar multiple of an eigenvector is an eigenvector, so we can also write the following eigenvectors of A:

[ 0 ]   [ 1/√5 ]   [ -2/√5 ]
[ 1 ],  [ 0    ],  [ 0     ].
[ 0 ]   [ 2/√5 ]   [ 1/√5  ]

These are still mutually orthogonal (multiplying by a positive scalar does not change orientation), but are now orthonormal. They can therefore be used as columns of an orthogonal matrix:

Q = [ 0  1/√5  -2/√5 ]
    [ 1  0      0    ]
    [ 0  2/√5   1/√5 ].

These column vectors, being orthogonal to each other, are linearly independent by Theorem 6.14. But whenever we form a matrix from linearly independent eigenvectors of A, this matrix diagonalizes A. Further, since Q is an orthogonal matrix, Q^{-1} = Q^t. Therefore, as we can easily verify in this example,

Q^{-1}AQ = [ 2  0   0 ]
           [ 0  -1  0 ]
           [ 0  0   4 ].

The idea we have just illustrated forms the basis for the following result.

THEOREM 9.12

Let A be a real, symmetric matrix. Then there is a real, orthogonal matrix that diagonalizes A. ■
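Theorem 9.12 can be illustrated with the matrix of Example 9.13; this NumPy sketch (our check, not part of the text) confirms that the Q built there is orthogonal and diagonalizes A:

```python
import numpy as np

# Orthogonal diagonalization of the symmetric matrix of Example 9.13.
A = np.array([[3.0, 0.0, -2.0],
              [0.0, 2.0, 0.0],
              [-2.0, 0.0, 0.0]])

s = 1 / np.sqrt(5)
# Columns: normalized eigenvectors for eigenvalues 2, -1, 4.
Q = np.array([[0.0, s, -2 * s],
              [1.0, 0.0, 0.0],
              [0.0, 2 * s, s]])

orthogonal = bool(np.allclose(Q.T @ Q, np.eye(3)))
diagonalizes = bool(np.allclose(Q.T @ A @ Q, np.diag([2.0, -1.0, 4.0])))
```

Note that Q^t is used in place of Q^{-1}, which is exactly the convenience the theorem promises.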
EXAMPLE 9.14

Let

A = [ 2  1   0 ]
    [ 1  -2  4 ]
    [ 0  4   2 ].

The eigenvalues are √21, -√21, and 2, with associated eigenvectors, respectively,

[ 1       ]   [ 1        ]   [ -4 ]
[ √21 - 2 ],  [ -√21 - 2 ],  [ 0  ].
[ 4       ]   [ 4        ]   [ 1  ]

These eigenvectors are mutually orthogonal, but not orthonormal. Divide each eigenvector by its length to get the three new eigenvectors

(1/α)[ 1  √21 - 2  4 ]^t,  (1/β)[ 1  -√21 - 2  4 ]^t,  (1/√17)[ -4  0  1 ]^t,

where α = √(42 - 4√21) and β = √(42 + 4√21). The orthogonal matrix Q having these normalized eigenvectors as columns diagonalizes A.
In each of Problems 1 through 12, find the eigenvalues of the matrix and, for each eigenvalue, a corresponding eigenvector. Check that eigenvectors associated with distinct eigenvalues are orthogonal. Find an orthogonal matrix that diagonalizes the matrix.

9.4 Quadratic Forms

DEFINITION 9.7

A (complex) quadratic form is an expression

Σ_{j=1}^{n} Σ_{k=1}^{n} a_jk z̄_j z_k,    (9.8)

in which each a_jk and z_j are complex numbers.
CHAPTER 9 Eigenvalues, Diagonalization, and Special Matrice s For n = 2 this quadratic form i s a11Z1z1 + a12Z1z2 + a21z1z2 + a22 z 2z 2 . The terms involving z j zk with j k, are the mixed product terms . The quadratic form is real if each ajk and zi is real . In this case we usually write z i as x1 . Since xj = xj when xj is real, the form (9 .8) in this case i s n n E E ajkxj xk . j=1k= 1
For n = 2, this is a ll xi + ( a 12 + a21)xtx2+ a22x2 . The terms involving xi and x2 are the squared terms in this real quadratic form, and x 12 is the mixed product term. It is often convenient to write a quadratic form (9 .8) in matrix form. If A = [au] an d
then
Z` AZ =
(z l
z2
zn
_( a ll zl + . . . +a l „zn
=allzlzl + . . + a ln z n z l it n =LEajk z j z k .
all a21
a12 a22
a l ,, a 2, ,
ant
a,,2
a m,
anl Z l+ . . + ann.
+. . . +
a al z l zn + . . + annznz n
j=1 k= 1
Similarly, any real quadratic form can be written in matrix form as X'AX . Given a quadratic form, we may choose different matrices A such that the form is Z`AZ .
EXAMPLE 9.15

Let

A = [ 1  3 ]
    [ 4  2 ].

Then

( x1  x2 ) [ 1  3 ] [ x1 ]   = x1(x1 + 3x2) + x2(4x1 + 2x2)
           [ 4  2 ] [ x2 ]
  = x1^2 + 3 x1 x2 + 4 x1 x2 + 2 x2^2 = x1^2 + 7 x1 x2 + 2 x2^2.

But we can also write this quadratic form as

x1^2 + (7/2) x1 x2 + (7/2) x2 x1 + 2 x2^2 = ( x1  x2 ) [ 1    7/2 ] [ x1 ]
                                                       [ 7/2  2   ] [ x2 ].

The advantage of the latter formulation is that the quadratic form is X^t A X with A a symmetric matrix.

There is an expression involving a quadratic form that gives the eigenvalues of a matrix in terms of an associated eigenvector. We will have use for this shortly.
LEMMA 9.1

Let A be an n x n matrix of real or complex numbers. Let λ be an eigenvalue with eigenvector Z. Then

λ = Z̄^t A Z / Z̄^t Z.

Proof  Since AZ = λZ, then

Z̄^t A Z = λ Z̄^t Z. ■

Using a calculation done in equation (9.5), we can write

λ = (1 / Σ_{j=1}^{n} |z_j|^2) Σ_{j=1}^{n} Σ_{k=1}^{n} a_jk z̄_j z_k.
Quadratic forms arise in a variety of contexts. In mechanics, the kinetic energy of a particle is a real quadratic form, and in analytic geometry a conic is the locus of points in the plane for which a quadratic form in the coordinates is equal to some constant. For example,

x1^2 + 4 x2^2 = 9

is the equation of an ellipse in the x1, x2 plane.

In some problems involving quadratic forms, calculations are simplified if we transform from the x1, x2, ..., xn coordinate system to a y1, y2, ..., yn system in which there are no mixed product terms. That is, we want to choose y1, ..., yn so that

Σ_{j=1}^{n} Σ_{k=1}^{n} a_jk x_j x_k = Σ_{j=1}^{n} β_j y_j^2.

The y1, ..., yn coordinates are called principal axes for the quadratic form.
This kind of transformation is commonly done in analytic geometry, where a rotation of axes is used to eliminate mixed product terms in the equation of a conic. For example, the change of variables

x1 = (1/√2)(y1 + y2),  x2 = (1/√2)(y1 - y2)

transforms the quadratic form x1^2 - 2 x1 x2 + x2^2 to 2 y2^2, with no mixed product term. Using this transformed form, we could analyze the graph of

x1^2 - 2 x1 x2 + x2^2 = 4

in the x1, x2 system, in terms of the graph of

y2^2 = 2

in the y1, y2 system. In the y1, y2 plane it is clear that the graph consists of two horizontal straight lines y2 = ±√2.

We will now show that a transformation that eliminates the mixed product terms of a real quadratic form always exists.

THEOREM 9.13  Principal Axis Theorem

Let A be a real symmetric matrix with eigenvalues λ1, ..., λn. Let Q be an orthogonal matrix that diagonalizes A. Then the change of variables X = QY transforms Σ_{j=1}^{n} Σ_{k=1}^{n} a_jk x_j x_k to λ1 y1^2 + ... + λn yn^2.

Proof  The proof is a straightforward calculation:

Σ_{j=1}^{n} Σ_{k=1}^{n} a_jk x_j x_k = X^t A X
  = (QY)^t A (QY) = (Y^t Q^t) A (QY) = Y^t (Q^t A Q) Y

  = ( y1  ...  yn ) [ λ1  0   ...  0  ] [ y1 ]
                    [ 0   λ2  ...  0  ] [ ...]
                    [ ...               ] [ yn ]
                    [ 0   0   ...  λn ]

  = λ1 y1^2 + ... + λn yn^2. ■

The expression λ1 y1^2 + ... + λn yn^2 is called the standard form of the quadratic form X^t A X.
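The principal axis theorem translates directly into a computation; this NumPy sketch (using `numpy.linalg.eigh`, which returns an orthonormal eigenvector matrix for a symmetric input) checks the identity X^t A X = λ1 y1^2 + λ2 y2^2 for the form x1^2 - 2 x1 x2 + x2^2:

```python
import numpy as np

# Principal axis theorem for x1^2 - 2 x1 x2 + x2^2.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

# eigh: eigenvalues in ascending order, columns of Q orthonormal.
eigvals, Q = np.linalg.eigh(A)          # eigenvalues 0 and 2

rng = np.random.default_rng(0)
Y = rng.standard_normal(2)              # arbitrary principal-axis coordinates
X = Q @ Y                               # change of variables X = QY

form_value = X @ A @ X                  # X^t A X
standard_form = eigvals @ (Y ** 2)      # lambda_1 y1^2 + lambda_2 y2^2
ok = bool(np.isclose(form_value, standard_form))
```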
EXAMPLE 9.16

Consider again x1^2 - 2 x1 x2 + x2^2. This is X^t A X with

A = [ 1   -1 ]
    [ -1  1  ].

The eigenvalues of A are 0 and 2, with corresponding eigenvectors

[ 1 ]       [ 1  ]
[ 1 ]  and  [ -1 ].

Dividing each eigenvector by its length, we obtain the eigenvectors

[ 1/√2 ]       [ 1/√2  ]
[ 1/√2 ]  and  [ -1/√2 ].

These form the columns of an orthogonal matrix Q that diagonalizes A:

Q = [ 1/√2  1/√2  ]
    [ 1/√2  -1/√2 ].

The transformation defined by X = QY is

[ x1 ]   [ 1/√2  1/√2  ] [ y1 ]
[ x2 ] = [ 1/√2  -1/√2 ] [ y2 ],

which gives exactly the transformation used above to reduce the quadratic form x1^2 - 2 x1 x2 + x2^2 to the standard form 2 y2^2.
EXAMPLE 9.17

Analyze the conic 4 x1^2 - 3 x1 x2 + 2 x2^2 = 8. First write the quadratic form as X^t A X = 8, where

A = [ 4     -3/2 ]
    [ -3/2  2    ].

The eigenvalues of A are (6 ± √13)/2. By the principal axis theorem there is an orthogonal matrix Q that transforms the equation of the conic to standard form:

((6 + √13)/2) y1^2 + ((6 - √13)/2) y2^2 = 8.

This is an ellipse in the y1, y2 plane. Figure 9.2 shows a graph of this ellipse.

FIGURE 9.2
In each of Problems 1 through 6, find a matrix A such that the quadratic form is X^t A X.

1. x1^2 + 2 x1 x2 + 6 x2^2
2. 3 x1^2 + 3 x2^2 - 4 x1 x2 - 3 x1 x3 + 2 x2 x3 + x3^2
3. x1^2 - 4 x1 x2 + x2^2
4. 2 x1^2 - x2^2 + 2 x1 x2
5. -x1^2 + x4^2 - 2 x1 x4 + 3 x2 x4 - x1 x3 + 4 x2 x3
6. x1^2 - x3^2 - x1 x3 + 4 x2 x3

In Problems 7 through 13, find the standard form of the quadratic form.

In each of Problems 14 through 18, use the principal axis theorem to analyze the conic.

In each of Problems 19 through 22, write the quadratic form defined by the matrix.

22. [ 7   1   -2 ]
    [ 1   0   -1 ]
    [ -2  -1   3 ]

23. Give an example of a real, 3 x 3 matrix that cannot be the coefficient matrix of a real quadratic form.
9.5 Unitary, Hermitian, and Skew-Hermitian Matrices

If U is a nonsingular complex matrix, then U^{-1} exists and is generally also a complex matrix. We claim that the operations of taking the complex conjugate and of taking a matrix inverse can be performed in either order.

LEMMA 9.2

(Ū)^{-1} = the conjugate of U^{-1}.

Proof  We know that the conjugate of a product is the product of the conjugates, so

I_n = Ī_n = conjugate of (U U^{-1}) = Ū (conjugate of U^{-1}).

This implies that the conjugate of U^{-1} is the inverse of Ū. ■

Now define a matrix to be unitary if the inverse of its conjugate (or conjugate of its inverse) is equal to its transpose.

DEFINITION 9.8  Unitary Matrix

An n x n complex matrix U is unitary if and only if U^{-1} = (Ū)^t.

This condition is equivalent to saying that

U (Ū)^t = I_n.
EXAMPLE 9.18

Let

U = [ i/√2   1/√2 ]
    [ -i/√2  1/√2 ].

Then U is unitary because

U (Ū)^t = [ i/√2   1/√2 ] [ -i/√2  i/√2 ]   [ 1  0 ]
          [ -i/√2  1/√2 ] [ 1/√2   1/√2 ] = [ 0  1 ].
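A numerical check of this example (a NumPy sketch; the eigenvalue observation anticipates Theorem 9.15 below):

```python
import numpy as np

# The matrix of Example 9.18: U times its conjugate transpose is I_2.
s = 1 / np.sqrt(2)
U = np.array([[1j * s, s],
              [-1j * s, s]])

unitary = bool(np.allclose(U @ U.conj().T, np.eye(2)))

# Eigenvalues of a unitary matrix lie on the unit circle.
eigvals = np.linalg.eigvals(U)
on_unit_circle = bool(np.allclose(np.abs(eigvals), 1.0))
```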
If U is a real matrix, then the unitary condition U(Ū)^t = I_n becomes U U^t = I_n, which makes U an orthogonal matrix. Unitary matrices are the complex analogues of orthogonal matrices. Since the rows (or columns) of an orthogonal matrix form an orthonormal set of vectors, we will develop the complex analogue of the concept of orthonormality.

Recall that, for two vectors (x1, ..., xn) and (y1, ..., yn) in R^n, we can define the column matrices

X = [ x1  x2  ...  xn ]^t  and  Y = [ y1  y2  ...  yn ]^t

and obtain the dot product X · Y as X^t Y. In particular, this gives the square of the length of X as

X^t X = x1^2 + x2^2 + ... + xn^2.

To generalize this to the complex case, suppose we have complex n-vectors (z1, z2, ..., zn) and (w1, w2, ..., wn). Form the column matrices

Z = [ z1  z2  ...  zn ]^t  and  W = [ w1  w2  ...  wn ]^t.

It is tempting to define the dot product of these complex vectors as Z^t W. The problem with this is that then we get

Z^t Z = z1^2 + z2^2 + ... + zn^2,

and this will in general be complex. We want to interpret the dot product of a vector with itself as the square of its length, and this should be a nonnegative real number. We get around this by defining the dot product of complex Z and W to be

Z · W = Z̄^t W = z̄1 w1 + z̄2 w2 + ... + z̄n wn.

In this way the dot product of Z with itself is

Z̄^t Z = z̄1 z1 + z̄2 z2 + ... + z̄n zn = |z1|^2 + |z2|^2 + ... + |zn|^2,

a nonnegative real number.

With this as background, we will define the complex analogue of an orthonormal set of vectors.

DEFINITION 9.9  Unitary System of Vectors

Complex n-vectors F1, ..., Fr form a unitary system if Fj · Fk = 0 for j ≠ k, and each Fj · Fj = 1.

If each Fj has all real components, then this corresponds exactly to an orthonormal set of vectors in R^n. We can now state the analogue of Theorem 9.9 for unitary matrices.

THEOREM 9.14
Let U be an n x n complex matrix. Then U is unitary if and only if its row vectors form a unitary system. ■

The proof is like that of Theorem 9.9, and is left to the student. It is not difficult to show that U is also unitary if and only if its column vectors form a unitary system.
EXAMPLE 9.19

Consider again

U = [ i/√2   1/√2 ]
    [ -i/√2  1/√2 ].

The row vectors are

F1 = ( i/√2  1/√2 )  and  F2 = ( -i/√2  1/√2 ).

Then

F1 · F2 = (-i/√2)(-i/√2) + (1/√2)(1/√2) = -1/2 + 1/2 = 0,

F1 · F1 = |i/√2|^2 + |1/√2|^2 = 1/2 + 1/2 = 1,

and

F2 · F2 = |-i/√2|^2 + |1/√2|^2 = 1.
We will show that the eigenvalues of a unitary matrix must lie on the unit circle in the complex plane.

THEOREM 9.15

Let λ be an eigenvalue of the unitary matrix U. Then |λ| = 1.

Proof  Let E be an eigenvector associated with λ. Then UE = λE, so, taking conjugates, Ū Ē = λ̄ Ē. Taking transposes,

Ē^t (Ū)^t = λ̄ Ē^t.

But U is unitary, so (Ū)^t = U^{-1}, and

Ē^t U^{-1} = λ̄ Ē^t.

Multiply both sides of this equation on the right by UE to get

Ē^t E = λ̄ Ē^t U E = λ̄ Ē^t λE = λ̄ λ Ē^t E.

Now Ē^t E is the dot product of the eigenvector with itself, and so is a positive number. Dividing the last equation by Ē^t E gives λ̄λ = 1. But then |λ|^2 = 1, so |λ| = 1. ■

We have defined a matrix to be unitary if its transpose is the conjugate of its inverse. A matrix is hermitian if its transpose is equal to its conjugate. If the transpose equals the negative of its conjugate, the matrix is called skew-hermitian.
DEFINITION 9.10

1. Hermitian Matrix  An n x n complex matrix H is hermitian if and only if H̄ = H^t.
2. Skew-Hermitian Matrix  An n x n complex matrix S is skew-hermitian if and only if S̄ = -S^t.

In the case that H has real elements, hermitian is the same as symmetric, because in this case H̄ = H.
EXAMPLE 9.20

Let

H = [ 15      -8i     6 - 2i ]
    [ 8i      0       -4 + i ]
    [ 6 + 2i  -4 - i  -3     ].

Then

H̄ = [ 15      8i      6 + 2i ]
    [ -8i     0       -4 - i ]
    [ 6 - 2i  -4 + i  -3     ] = H^t,

so H is hermitian. If

S = [ 0   8i  2i ]
    [ 8i  0   4i ]
    [ 2i  4i  0  ],

then S is skew-hermitian because

S̄ = [ 0    -8i  -2i ]
    [ -8i  0    -4i ]
    [ -2i  -4i  0   ] = -S^t.
The following theorem says something about quadratic forms with hermitian or skew-hermitian matrices.

THEOREM 9.16

Let Z be an n x 1 matrix of complex numbers. Then

1. If H is hermitian, then Z̄^t H Z is real.
2. If S is skew-hermitian, then Z̄^t S Z is zero or pure imaginary.

Proof  For (1), suppose H is hermitian. Then H̄ = H^t, so

conjugate of Z̄^t H Z = Z^t H̄ Z̄ = Z^t H^t Z̄.

But Z^t H^t Z̄ is a 1 x 1 matrix and so equals its own transpose. Continuing from the last equation, we have

conjugate of Z̄^t H Z = Z^t H^t Z̄ = (Z^t H^t Z̄)^t = Z̄^t H Z.

Since Z̄^t H Z equals its own conjugate, Z̄^t H Z is real.

To prove (2), suppose S is skew-hermitian. Then S̄ = -S^t. By an argument like that done in the proof of (1), we get

conjugate of Z̄^t S Z = -Z̄^t S Z.

Now write Z̄^t S Z = α + iβ. The last equation becomes

α - iβ = -α - iβ.

Then α = -α, so α = 0 and Z̄^t S Z is zero or pure imaginary. ■

Using these results on quadratic forms, we can say something about the eigenvalues of hermitian and skew-hermitian matrices.

THEOREM 9.17
1. The eigenvalues of a hermitian matrix are real.
2. The eigenvalues of a skew-hermitian matrix are zero or pure imaginary.

Proof  For (1), let λ be an eigenvalue of the hermitian matrix H, with associated eigenvector E. By Lemma 9.1,

λ = Ē^t H E / Ē^t E.

But by (1) of the preceding theorem, the numerator of this quotient is real. The denominator is the square of the length of E, and so is also real. Therefore λ is real.

For (2), let λ be an eigenvalue of the skew-hermitian matrix S, with associated eigenvector E. Again by Lemma 9.1,

λ = Ē^t S E / Ē^t E.

By (2) of the preceding theorem, the numerator of this quotient is either zero or pure imaginary. Since the denominator is a positive real number, λ is either zero or pure imaginary. ■

Figure 9.3 shows a graphical representation of the conclusions of Theorems 9.15 and 9.17. When plotted as points in the complex plane, eigenvalues of a unitary matrix lie on the unit circle about the origin, eigenvalues of a hermitian matrix lie on the horizontal (real) axis, and eigenvalues of a skew-hermitian matrix lie on the vertical (imaginary) axis.

FIGURE 9.3  Eigenvalue locations.
In each of Problems 1 through 9, determine whether the matrix is unitary, hermitian, skew-hermitian, or none of these. Find the eigenvalues of each matrix and an associated eigenvector for each eigenvalue. Determine which matrices are diagonalizable. If a matrix is diagonalizable, produce a matrix that diagonalizes it.

1. [ 0   2i ]
   [ 2i  4  ]

2. [ 3   4i ]
   [ 4i  -5 ]

10. Let A be unitary, hermitian, or skew-hermitian. Prove that A (Ā)^t = (Ā)^t A.

11. Prove that the main diagonal elements of a skew-hermitian matrix must be zero or pure imaginary.

12. Prove that the main diagonal elements of a hermitian matrix must be real.

13. Prove that the product of two unitary matrices is unitary.
PART

Systems of Differential Equations and Qualitative Methods

CHAPTER 10  Systems of Linear Differential Equations

CHAPTER 11  Qualitative Methods and Systems of Nonlinear Differential Equations

We will now use matrices to study systems of differential equations. These arise, for example, in modeling mechanical and electrical systems having more than one component. We will separate our study of systems of differential equations into two chapters. The first, Chapter 10, is devoted to systems of linear differential equations; for these, powerful matrix methods can be brought to bear to write solutions. For systems of nonlinear differential equations, for which we usually cannot write explicit solutions, we must develop a different set of tools designed to determine qualitative properties of solutions. This is done in Chapter 11.
CHAPTER 10

Systems of Linear Differential Equations

THEORY OF SYSTEMS OF LINEAR FIRST-ORDER DIFFERENTIAL EQUATIONS  SOLUTION OF X' = AX WHEN A IS CONSTANT  SOLUTION OF X' = AX + G
Before beginning to study linear systems, recall from Section 2.6.4 that a linear differential equation of order n always gives rise to a system of n first-order linear differential equations, in such a way that the solution of the system gives the solution of the original nth order equation. Systems can be treated using matrix techniques, which are now at our disposal. For this reason we did not spend time on differential equations of order higher than 2 in Part 1. We will assume familiarity with vectors in R^n, matrix algebra, determinants, and eigenvalues and eigenvectors. These can be reviewed as needed from Part 2. We begin by laying the foundations for the use of matrices to solve linear systems of differential equations.
10.1  Theory of Systems of Linear First-Order Differential Equations

In this chapter we will consider systems of n first-order linear differential equations in n unknown functions:

x1'(t) = a11(t)x1(t) + a12(t)x2(t) + ... + a1n(t)xn(t) + g1(t)
x2'(t) = a21(t)x1(t) + a22(t)x2(t) + ... + a2n(t)xn(t) + g2(t)
...
xn'(t) = an1(t)x1(t) + an2(t)x2(t) + ... + ann(t)xn(t) + gn(t)
Let

A(t) =
[ a11(t)  a12(t)  ...  a1n(t) ]
[ a21(t)  a22(t)  ...  a2n(t) ]
[  ...     ...          ...   ]
[ an1(t)  an2(t)  ...  ann(t) ]

X(t) =
[ x1(t) ]
[ x2(t) ]
[  ...  ]
[ xn(t) ]

and

G(t) =
[ g1(t) ]
[ g2(t) ]
[  ...  ]
[ gn(t) ]

Differentiate a matrix by differentiating each element, so

X'(t) =
[ x1'(t) ]
[ x2'(t) ]
[  ...   ]
[ xn'(t) ]

Matrix differentiation follows the "normal" rules we learn in calculus. For example,

(X(t)Y(t))' = X'(t)Y(t) + X(t)Y'(t),

in which the order of the factors must be maintained. Now the system of differential equations is

X'(t) = A(t)X(t) + G(t)    (10.1)
or X' = AX + G. This system is nonhomogeneous if G(t) ≠ 0 for at least some t, in which 0 denotes the n×1 zero matrix. If G(t) = 0 for all the relevant values of t, then the system is homogeneous, and we write just X' = AX. A solution of X' = AX + G is any n×1 matrix of functions that satisfies this matrix equation.
EXAMPLE 10.1

The 2×2 system

x1' = 3x1 + 3x2 + 8
x2' = x1 + 5x2 + 4e^{3t}

can be written

[ x1 ]'   [ 3  3 ] [ x1 ]   [ 8       ]
[ x2 ]  = [ 1  5 ] [ x2 ] + [ 4e^{3t} ].

One solution is

X(t) =
[ 3e^{2t} + e^{6t} - 4e^{3t} - 10/3 ]
[ -e^{2t} + e^{6t} + 2/3            ]

as can be verified by substitution into the system. In terms of individual components, this solution is

x1(t) = 3e^{2t} + e^{6t} - 4e^{3t} - 10/3,
x2(t) = -e^{2t} + e^{6t} + 2/3.
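A solution claimed "by substitution" can also be checked numerically. The sketch below (assuming the system of Example 10.1 is x1' = 3x1 + 3x2 + 8, x2' = x1 + 5x2 + 4e^{3t}) compares the analytic derivative of the proposed solution with AX + G at a few sample points:

```python
# Numerical check that the proposed solution satisfies X' = AX + G.
import numpy as np

A = np.array([[3.0, 3.0], [1.0, 5.0]])

def X(t):
    # the solution given in Example 10.1
    return np.array([3*np.exp(2*t) + np.exp(6*t) - 4*np.exp(3*t) - 10/3,
                     -np.exp(2*t) + np.exp(6*t) + 2/3])

def Xprime(t):
    # its derivative, computed term by term
    return np.array([6*np.exp(2*t) + 6*np.exp(6*t) - 12*np.exp(3*t),
                     -2*np.exp(2*t) + 6*np.exp(6*t)])

def G(t):
    return np.array([8.0, 4*np.exp(3*t)])

# X' should equal AX + G for every t; check some sample points
for t in [0.0, 0.5, 1.0]:
    residual = Xprime(t) - (A @ X(t) + G(t))
    assert np.allclose(residual, 0.0, atol=1e-9)
```

The residual is zero up to floating-point roundoff, which is exactly what substitution into the system predicts.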
On February 20, 1962, John Glenn became the first American to orbit the Earth. His flight lasted nearly five hours and included three complete circuits of the globe. This and subsequent Mercury orbitings paved the way for the space shuttle program, which now includes shuttles launched from the NASA Kennedy Space Center to carry out experiments under zero gravity, as well as delivery of personnel and equipment to the developing international space station. Ultimate goals of space shuttle missions include studying how humans function in a zero-gravity environment over extended periods of time, scientific observations of phenomena in space and on our own planet, and the commercial development of space. Computation of orbits and forces involved in shuttle missions involves the solution of large systems of differential equations.
Initial conditions for the system (10.1) have the form

X(t0) =
[ x1(t0) ]
[ x2(t0) ]
[  ...   ]
[ xn(t0) ]
= X0,

in which X0 is a given n×1 matrix of constants. The initial value problem we will consider for systems is the problem:

X' = AX + G;  X(t0) = X0.    (10.2)

This is analogous to the initial value problem

x' = ax + g;  x(t0) = x0

for single first-order equations. Theorem 1.3 gave criteria for existence and uniqueness of solutions of this initial value problem. The analogous result for the initial value problem (10.2) is given by the following.

THEOREM 10.1  Existence and Uniqueness

Let I be an open interval containing t0. Suppose each aij(t) and gj(t) are continuous on I. Let X0 be a given n×1 matrix of real numbers. Then the initial value problem

X' = AX + G;  X(t0) = X0

has a unique solution defined for all t in I. ■
EXAMPLE 10.2

Consider the initial value problem

x1' = x1 + tx2 + cos(t)
x2' = t^3 x1 - e^t x2 + 1 - t;
x1(0) = 2,  x2(0) = -5.

This is the system

X' =
[ 1    t    ]     [ cos(t) ]
[ t^3  -e^t ] X + [ 1 - t  ],

with

X0 =
[ 2  ]
[ -5 ].

This initial value problem has a unique solution defined for all real t, because each aij(t) and gj(t) are continuous for all real t.

We will now determine what we must look for to find all solutions of X' = AX + G. This will involve a program that closely parallels that for the single first-order equation x' = ax + g, beginning with the homogeneous case.
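The coefficients in Example 10.2 depend on t, so no closed-form solution is displayed; the unique solution guaranteed by Theorem 10.1 can still be approximated numerically. A minimal sketch with a hand-rolled classical Runge-Kutta (RK4) step, assuming the system reads x1' = x1 + t·x2 + cos(t), x2' = t^3·x1 - e^t·x2 + 1 - t with x1(0) = 2, x2(0) = -5 (the step count n is an arbitrary choice):

```python
# Approximate the unique solution of Example 10.2 on [0, 1] with classical RK4.
import numpy as np

def f(t, x):
    # right-hand side of the system
    x1, x2 = x
    return np.array([x1 + t*x2 + np.cos(t),
                     t**3 * x1 - np.exp(t)*x2 + 1 - t])

def rk4(f, t0, x0, t1, n=1000):
    # classical fourth-order Runge-Kutta with n fixed steps
    h = (t1 - t0) / n
    t, x = t0, np.asarray(x0, dtype=float)
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h/2, x + h/2*k1)
        k3 = f(t + h/2, x + h/2*k2)
        k4 = f(t + h, x + h*k3)
        x = x + (h/6)*(k1 + 2*k2 + 2*k3 + k4)
        t += h
    return x

x_at_1 = rk4(f, 0.0, [2.0, -5.0], 1.0)
```

Halving the step size and comparing results gives a simple convergence check on the approximation.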
10.1.1  Theory of the Homogeneous System X' = AX
We begin with the homogeneous system X' = AX. Because solutions of X' = AX are n×1 matrices of real functions, these solutions have an algebraic structure, and we can form linear combinations (finite sums of scalar multiples of solutions). In the homogeneous case, any linear combination of solutions is again a solution.
THEOREM 10.2

Let Φ1, ..., Φk be solutions of X' = AX, all defined on some open interval I. Let c1, ..., ck be any real numbers. Then the linear combination c1Φ1 + ... + ckΦk is also a solution of X' = AX, defined on I.

Proof  Compute

(c1Φ1 + ... + ckΦk)' = c1Φ1' + ... + ckΦk'
= c1AΦ1 + ... + ckAΦk
= A(c1Φ1 + ... + ckΦk). ■
Because of this, the set of all solutions of X' = AX has the structure of a vector space, called the solution space of this system. It is not necessary to have a background in vector spaces to follow the discussion of solutions of X' = AX that we are about to develop. However, for those who do have this background we will make occasional reference to show how ideas fit into this algebraic framework.

In a linear combination c1Φ1 + ... + ckΦk of solutions, any Φj that is already a linear combination of the other solutions is unnecessary. For example, suppose Φ1 = a2Φ2 + ... + akΦk. Then

c1Φ1 + ... + ckΦk = c1(a2Φ2 + ... + akΦk) + c2Φ2 + ... + ckΦk
= (c1a2 + c2)Φ2 + ... + (c1ak + ck)Φk.

In this case any linear combination of Φ1, Φ2, ..., Φk is actually a linear combination of just Φ2, ..., Φk, and Φ1 is not needed. Φ1 is redundant in the sense that, if we have Φ2, ..., Φk, then we have Φ1 also. We describe this situation by saying that the functions Φ1, Φ2, ..., Φk are linearly dependent. If no one of the functions is a linear combination of the others, then these functions are called linearly independent.
DEFINITION 10.1

Linear Dependence
Solutions Φ1, Φ2, ..., Φk of X' = AX, defined on an interval I, are linearly dependent on I if one solution is a linear combination of the others on this interval.

Linear Independence
Solutions Φ1, Φ2, ..., Φk of X' = AX, defined on an interval I, are linearly independent on I if no solution in this list is a linear combination of the others on this interval.
Thus a set of solutions is linearly independent if it is not linearly dependent. Linear dependence of functions is a stronger condition than linear dependence of vectors. For vectors in R^n, V1 is a linear combination of V2 and V3 if V1 = aV2 + bV3 for some real numbers a and b. In this case V1, V2, V3 are linearly dependent. But for solutions Φ1, Φ2, Φ3 of X' = AX, Φ1 is a linear combination of Φ2 and Φ3 if there are numbers a and b such that Φ1(t) = aΦ2(t) + bΦ3(t) for all t in the relevant interval, perhaps the entire real line. It is not enough to have this condition hold for just some values of t.
EXAMPLE 10.3

Consider the system

X' =
[ 1  -4 ]
[ 1   5 ] X.

It is routine to check that

Φ1(t) =
[ -2e^{3t} ]
[ e^{3t}   ]

and

Φ2(t) =
[ (1 - 2t)e^{3t} ]
[ te^{3t}        ]

are solutions, defined for all real values of t. These solutions are linearly independent on the entire real line, since neither is a constant multiple of the other (for all real t). The function

Φ3(t) =
[ (11 - 6t)e^{3t} ]
[ (-4 + 3t)e^{3t} ]

is also a solution. However, Φ1, Φ2, Φ3 are linearly dependent, because, for all real t,

Φ3(t) = -4Φ1(t) + 3Φ2(t).

This means that Φ3 is a linear combination of Φ1 and Φ2, and the list of solutions Φ1, Φ2, Φ3, although longer, carries no more information about the solution of X' = AX than the list of solutions Φ1, Φ2.

If Φ is a solution of X' = AX, then Φ is an n×1 column matrix of functions:

Φ(t) =
[ f1(t) ]
[ f2(t) ]
[  ...  ]
[ fn(t) ]

For any choice of t, say t = t0, this is an n×1 matrix of real numbers which can be thought of as a vector in R^n. This point of view, and some facts about determinants, provides us with a test for linear independence of solutions of X' = AX. The following theorem reduces the question of linear independence of n solutions of X' = AX to a question of whether an n×n determinant of real numbers is nonzero.
THEOREM 10.3  Test for Linear Independence of Solutions

Suppose that

Φ1(t) =
[ φ11(t) ]
[ φ21(t) ]
[  ...   ]
[ φn1(t) ]

, Φ2(t) =
[ φ12(t) ]
[ φ22(t) ]
[  ...   ]
[ φn2(t) ]

, ..., Φn(t) =
[ φ1n(t) ]
[ φ2n(t) ]
[  ...   ]
[ φnn(t) ]

are solutions of X' = AX on an open interval I. Let t0 be any number in I. Then

1. Φ1, Φ2, ..., Φn are linearly independent on I if and only if Φ1(t0), ..., Φn(t0) are linearly independent, when considered as vectors in R^n.
2. Φ1, Φ2, ..., Φn are linearly independent on I if and only if

| φ11(t0)  φ12(t0)  ...  φ1n(t0) |
| φ21(t0)  φ22(t0)  ...  φ2n(t0) |
|   ...      ...          ...    |
| φn1(t0)  φn2(t0)  ...  φnn(t0) |
≠ 0.

Conclusion (2) is an effective test for linear independence of n solutions of X' = AX on an open interval. Evaluate each solution at some point of the interval. Each Φj(t0) is an n×1 (constant) column matrix. Evaluate the determinant of the n×n matrix having these columns. If this determinant is nonzero, then the solutions are linearly independent; if it is zero, they are linearly dependent. Another way of looking at (2) of this theorem is that it reduces a question of linear independence of n solutions of X' = AX to a question of linear independence of n vectors in R^n. This is because the determinant in (2) is nonzero exactly when its row (or column) vectors are linearly independent.
EXAMPLE 10.4

From the preceding example,

Φ1(t) =
[ -2e^{3t} ]
[ e^{3t}   ]

and

Φ2(t) =
[ (1 - 2t)e^{3t} ]
[ te^{3t}        ]

are solutions of

X' =
[ 1  -4 ]
[ 1   5 ] X

on the entire real line, which is an open interval. Evaluate these solutions at some convenient point, say t = 0:

Φ1(0) =
[ -2 ]
[ 1  ]

and

Φ2(0) =
[ 1 ]
[ 0 ].

Use these as columns of a 2×2 matrix and evaluate its determinant:

| -2  1 |
|  1  0 | = -1 ≠ 0.

Therefore Φ1 and Φ2 are linearly independent solutions.
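The determinant test of Theorem 10.3(2) is mechanical enough to sketch in code. Below, the two solutions are the ones used in Example 10.4 (assuming they are Φ1(t) = (-2e^{3t}, e^{3t}) and Φ2(t) = ((1-2t)e^{3t}, te^{3t})):

```python
# Theorem 10.3(2) in action: evaluate the solutions at a point and test
# the determinant of the matrix having those values as columns.
import numpy as np

def phi1(t):
    return np.array([-2*np.exp(3*t), np.exp(3*t)])

def phi2(t):
    return np.array([(1 - 2*t)*np.exp(3*t), t*np.exp(3*t)])

M = np.column_stack([phi1(0.0), phi2(0.0)])
det = np.linalg.det(M)   # nonzero => the solutions are linearly independent
```

Any point of the interval works: evaluating at t = 1 instead also gives a nonzero determinant, as the theorem guarantees.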
A proof of Theorem 10.3 makes use of the uniqueness of solutions of the initial value problem (Theorem 10.1).

Proof  For (1), let t0 be any point in I. Suppose first that Φ1, ..., Φn are linearly dependent on I. Then one of the solutions is a linear combination of the others. By reordering if necessary, say Φ1 is a linear combination of Φ2, ..., Φn. Then there are numbers c2, ..., cn so that

Φ1(t) = c2Φ2(t) + ... + cnΦn(t)

for all t in I. In particular,

Φ1(t0) = c2Φ2(t0) + ... + cnΦn(t0).

This implies that the vectors Φ1(t0), ..., Φn(t0) are linearly dependent vectors in R^n.

Conversely, suppose that Φ1(t0), ..., Φn(t0) are linearly dependent in R^n. Then one of these vectors is a linear combination of the others. Again, as a convenience, suppose Φ1(t0) is a linear combination of Φ2(t0), ..., Φn(t0). Then there are numbers c2, ..., cn such that

Φ1(t0) = c2Φ2(t0) + ... + cnΦn(t0).

Define

Ψ(t) = Φ1(t) - c2Φ2(t) - ... - cnΦn(t)

for all t in I. Then Ψ is a linear combination of solutions of X' = AX, hence is a solution. Further,

Ψ(t0) =
[ 0 ]
[ 0 ]
[...]
[ 0 ]

Therefore, on I, Ψ is a solution of the initial value problem

X' = AX;  X(t0) = O.

But the zero function is also a solution of this initial value problem. Since this initial value problem has a unique solution, then Ψ(t) = 0 for all t in I. Therefore

Ψ(t) = Φ1(t) - c2Φ2(t) - ... - cnΦn(t) = 0

for all t in I, which means that

Φ1(t) = c2Φ2(t) + ... + cnΦn(t)

for all t in I. Therefore Φ1 is a linear combination of Φ2, ..., Φn, hence Φ1, Φ2, ..., Φn are linearly dependent on I.
Conclusion (2) follows from (1) and the fact that n vectors in R^n are linearly independent if and only if the determinant of the n×n matrix having these vectors as columns is nonzero. ■

Thus far we know how to test n solutions of X' = AX for linear independence, if A is n×n. We will now show that n linearly independent solutions are enough to determine all solutions of X' = AX on an open interval I. We saw a result like this previously when it was found that two linearly independent solutions of y'' + p(x)y' + q(x)y = 0 determine all solutions of this equation.

THEOREM 10.4

Let A = [aij(t)] be an n×n matrix of functions that are continuous on an open interval I. Then

1. The system X' = AX has n linearly independent solutions defined on I.
2. Given any n linearly independent solutions Φ1, ..., Φn defined on I, every solution on I is a linear combination of Φ1, ..., Φn.

By (2), every solution of X' = AX, defined on I, must be of the form c1Φ1 + c2Φ2 + ... + cnΦn. For this reason, this linear combination, with Φ1, ..., Φn any n linearly independent solutions, is called the general solution of X' = AX on I.
Proof  To prove that there are n linearly independent solutions, define the n×1 constant matrices

E(1) =
[ 1 ]
[ 0 ]
[...]
[ 0 ]

, E(2) =
[ 0 ]
[ 1 ]
[...]
[ 0 ]

, ..., E(n) =
[ 0 ]
[...]
[ 0 ]
[ 1 ]

Pick any t0 in I. We know from Theorem 10.1 that the initial value problem

X' = AX;  X(t0) = E(j)

has a unique solution Φj defined on I, for j = 1, 2, ..., n. These solutions are linearly independent by Theorem 10.3 because, by the way the initial conditions were chosen, the n×n matrix whose columns are these solutions evaluated at t0 is In, with determinant 1. This proves (1).

To prove (2), suppose now that Ψ1, ..., Ψn are any n linearly independent solutions of X' = AX, defined on I. Let Δ be any solution. We want to prove that Δ is a linear combination of Ψ1, ..., Ψn. Pick any t0 in I. We will first show that there are numbers c1, ..., cn such that

Δ(t0) = c1Ψ1(t0) + ... + cnΨn(t0).

Now Δ(t0), and each Ψj(t0), is an n×1 column matrix of constants. Form the n×n matrix S using Ψ1(t0), ..., Ψn(t0) as its columns, and consider the system of n linear algebraic equations in n unknowns

SC = Δ(t0).    (10.3)

The columns of S are linearly independent vectors in R^n, because Ψ1, ..., Ψn are linearly independent. Therefore S is nonsingular, and the system (10.3) has a unique solution. This solution gives constants c1, ..., cn such that

Δ(t0) = c1Ψ1(t0) + ... + cnΨn(t0).

We now claim that

Δ(t) = c1Ψ1(t) + ... + cnΨn(t)

for all t in I. But observe that Δ and c1Ψ1 + ... + cnΨn are both solutions of the initial value problem

X' = AX;  X(t0) = Δ(t0).

Since this problem has a unique solution, then Δ(t) = c1Ψ1(t) + ... + cnΨn(t) for all t in I, and the proof is complete. ■

In the language of linear algebra, the solution space of X' = AX has dimension n, the order of the coefficient matrix A. Any n linearly independent solutions form a basis for this vector space.
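The key linear-algebra step in the proof of Theorem 10.4(2), solving SC = Δ(t0), is easy to sketch numerically. The vectors below are illustrative values only (the t = 0 values of the solutions in Example 10.4, and an assumed value Δ(t0) = (-2, 3) for "any solution"):

```python
# The proof's algebraic step: S has the linearly independent solution
# values as columns, so S C = Delta(t0) has a unique solution C.
import numpy as np

S = np.column_stack([[-2.0, 1.0], [1.0, 0.0]])   # Psi_1(t0), Psi_2(t0)
delta_t0 = np.array([-2.0, 3.0])                 # an assumed solution value

C = np.linalg.solve(S, delta_t0)                 # unique: det(S) = -1 != 0
```

Uniqueness of C is exactly the nonsingularity of S, which in turn is the independence test of Theorem 10.3(2).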
EXAMPLE 10.5

Previously we saw that

Φ1(t) =
[ -2e^{3t} ]
[ e^{3t}   ]

and

Φ2(t) =
[ (1 - 2t)e^{3t} ]
[ te^{3t}        ]

are linearly independent solutions of

X' =
[ 1  -4 ]
[ 1   5 ] X.

Because A is 2×2 and we have 2 linearly independent solutions, the general solution of this system is

X(t) = c1
[ -2e^{3t} ]
[ e^{3t}   ]
+ c2
[ (1 - 2t)e^{3t} ]
[ te^{3t}        ].

The expression on the right contains every solution of this system. In terms of components,

x1(t) = -2c1e^{3t} + c2(1 - 2t)e^{3t},
x2(t) = c1e^{3t} + c2te^{3t}.

We will now make a useful observation. In the last example, form a 2×2 matrix Ω(t) having Φ1(t) and Φ2(t) as columns:

Ω(t) =
[ -2e^{3t}  (1 - 2t)e^{3t} ]
[ e^{3t}    te^{3t}        ]
10.1 Theory of Systems of Linear First-Order Differential Equations Now observe that, if C = (
ci I c2 /
371
, then eat
SL(t)C -
(1 tet)e"
/
_e
c2 )
)
c l [-2e"]+ c 2[( l - 2t)e3t cl [e3t] + c2 3t ] = Cl (
= cl
e
_ ear ) l (t)
(D
+
[tte c2 ( (1
at)e3t ) te
+ c2 ■:I)2 ( t) .
The point is that we can write the general solution c l 41:• l + c2 02 compactly as fl(t)C, with fl(t) a square matrix having the independent solutions as columns, and C a column matrix o f arbitrary constants . We call a matrix SI formed in this way a fundamental matrix for the system X' = AX . In terms of this fundamental matrix, the general solution is X(t) = SI(t)C . We can see that SL(t)C also satisfies the matrix differential equation X' = AX . Recall that we differentiate a matrix by differentiating each element of the matrix . Then, because C is a constant matrix, (SL(t)C) = SL(t) C
Therefore (SL(t)C)' = A(Sl(t)C) , as occurs if SL(t)C is a solution of X' = AX .
DEFINITION 10.2  Fundamental Matrix

Ω is a fundamental matrix for the n×n system X' = AX if the columns of Ω are linearly independent solutions of this system.

Writing the general solution of X' = AX as X(t) = ΩC is particularly convenient for solving initial value problems.
EXAMPLE 10.6

Solve the initial value problem

X' =
[ 1  -4 ]
[ 1   5 ] X;  X(0) =
[ -2 ]
[ 3  ].

We know from Example 10.5 that the general solution is X(t) = Ω(t)C, where

Ω(t) =
[ -2e^{3t}  (1 - 2t)e^{3t} ]
[ e^{3t}    te^{3t}        ].

We need to choose C so that

X(0) = Ω(0)C =
[ -2 ]
[ 3  ].

Putting t = 0 into Ω, we must solve the algebraic system

[ -2  1 ]       [ -2 ]
[  1  0 ] C  =  [ 3  ].

The solution is

C =
[ -2  1 ]^{-1} [ -2 ]   [ 3 ]
[  1  0 ]      [ 3  ] = [ 4 ].

The unique solution of the initial value problem is therefore

Φ(t) = Ω(t)
[ 3 ]
[ 4 ]
=
[ -2e^{3t} - 8te^{3t} ]
[ 3e^{3t} + 4te^{3t}  ].

In this example Ω(0)^{-1} could be found by linear algebra methods (Sections 7.8.1 and 8.7) or by using a software package.
10.1.2  General Solution of the Nonhomogeneous System X' = AX + G

Solutions of the nonhomogeneous system X' = AX + G do not have the algebraic structure of a vector space, because linear combinations of solutions are not solutions. However, we will show that the general solution of this system (an expression containing all possible solutions) is the sum of the general solution of the homogeneous system X' = AX and any particular solution of the nonhomogeneous system. This is completely analogous to Theorem 2.5 for the second order equation y'' + p(x)y' + q(x)y = f(x).

THEOREM 10.5

Let Ω be a fundamental matrix for X' = AX, and let Ψp be any solution of X' = AX + G. Then the general solution of X' = AX + G is X = ΩC + Ψp, in which C is an n×1 matrix of arbitrary constants.
Proof  First, ΩC + Ψp is a solution of the nonhomogeneous system, because

(ΩC + Ψp)' = (ΩC)' + Ψp' = A(ΩC) + AΨp + G = A(ΩC + Ψp) + G.

Now let Φ be any solution of X' = AX + G. We claim that Φ - Ψp is a solution of X' = AX. To see this, calculate

(Φ - Ψp)' = Φ' - Ψp' = AΦ + G - (AΨp + G) = A(Φ - Ψp).

Since ΩC is the general solution of X' = AX, there is a constant n×1 matrix K such that Φ - Ψp = ΩK. Then Φ = ΩK + Ψp, completing the proof. ■

We now know what to look for in solving a system of n linear, first-order differential equations in n unknown functions. For the homogeneous system X' = AX, we look for n linearly independent solutions to form a fundamental matrix Ω(t). For the nonhomogeneous system X' = AX + G, we first find the general solution ΩC of X' = AX, and any particular solution Ψp of X' = AX + G. The general solution of X' = AX + G is then ΩC + Ψp.

This is an overall strategy. Now we need ways of implementing it and actually producing fundamental matrices and particular solutions for given systems.
SECTION 10.1  PROBLEMS

In each of Problems 1 through 5, (a) verify that the given functions satisfy the system, (b) form a fundamental matrix Ω(t) for the system, (c) write the general solution in the form Ω(t)C, carry out this product, and verify that the rows of Ω(t)C are the components of the given solution, and (d) find the unique solution satisfying the initial conditions.

1. x1(t) = c1e^{4t}cos(t) + c2e^{4t}sin(t),
x2(t) = ...
... = (c2 - c1)e^{3t} ...
x2(t) = c2e^{2t} + c3e^{-3t},
... + c3e^{-3t},
x1(0) = 1, x2(0) = -3, x3(0) = 5
10.2  Solution of X' = AX when A is Constant

Consider the system X' = AX, with A an n×n matrix of real numbers. In the case y' = ay, with a constant, we get exponential solutions y = ce^{ax}. This suggests we try a similar solution for the system. Try X = ξe^{λt}, with ξ an n×1 matrix of constants to be determined, and λ a number to be determined. Substitute this proposed solution into the differential equation to get

ξλe^{λt} = A(ξe^{λt}).

This requires that

Aξ = λξ.

We should therefore choose λ as an eigenvalue of A, and ξ as an associated eigenvector. We will summarize this discussion.

THEOREM 10.6

Let A be an n×n matrix of real numbers. Then ξe^{λt} is a nontrivial solution of X' = AX if and only if λ is an eigenvalue of A, with associated eigenvector ξ. ■

We need n linearly independent solutions to form a fundamental matrix. We will have these if we can find n linearly independent eigenvectors, whether or not some eigenvalues may be repeated.

THEOREM 10.7

Let A be an n×n matrix of real numbers. Suppose A has eigenvalues λ1, ..., λn, and suppose there are associated eigenvectors ξ1, ..., ξn that are linearly independent. Then ξ1e^{λ1 t}, ..., ξne^{λn t} are linearly independent solutions of X' = AX, on the entire real line.

Proof  We know that each ξje^{λj t} is a nontrivial solution. The question is whether these solutions are linearly independent. Form the n×n matrix having these solutions, evaluated at t = 0, as its columns. This matrix has n linearly independent columns ξ1, ..., ξn, and therefore has a nonzero determinant. By Theorem 10.3(2), ξ1e^{λ1 t}, ..., ξne^{λn t} are linearly independent on the real line. ■
EXAMPLE 10.7

Consider the system

X' =
[ 4  3 ]
[ 2  3 ] X.

A has eigenvalues 1 and 6, with corresponding eigenvectors

[ 1  ]        [ 3 ]
[ -1 ]  and   [ 2 ].

These eigenvectors are linearly independent (originating from distinct eigenvalues), so we have two linearly independent solutions,

[ 1  ]             [ 3 ]
[ -1 ] e^t   and   [ 2 ] e^{6t}.

We can write the general solution as

X(t) = c1
[ 1  ]
[ -1 ] e^t + c2
[ 3 ]
[ 2 ] e^{6t}.

Equivalently, we can write the fundamental matrix

Ω(t) =
[ e^t   3e^{6t} ]
[ -e^t  2e^{6t} ].

In terms of Ω, the general solution is X(t) = Ω(t)C. In terms of components,

x1(t) = c1e^t + 3c2e^{6t},
x2(t) = -c1e^t + 2c2e^{6t}.
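This eigenvalue-eigenvector construction is exactly what a numerical linear algebra routine automates. A sketch, assuming the coefficient matrix of Example 10.7 is A = [[4,3],[2,3]] (NumPy returns the eigenvectors as the columns of the second output of `eig`, normalized to unit length, which only rescales the arbitrary constants):

```python
# General solution of X' = AX from eigenpairs of a constant matrix A.
import numpy as np

A = np.array([[4.0, 3.0], [2.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(A)   # eigenvectors are the columns

def X(t, C):
    # sum_j c_j xi_j e^{lambda_j t}, written as a matrix-vector product
    return eigvecs @ (C * np.exp(eigvals * t))

# central-difference check that X' ~= AX for an arbitrary C
C = np.array([2.0, -1.0])
t, h = 0.5, 1e-6
deriv = (X(t + h, C) - X(t - h, C)) / (2*h)
assert np.allclose(deriv, A @ X(t, C), rtol=1e-5)
```

For this matrix the computed eigenvalues come out as 1 and 6 (in some order), matching the hand calculation.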
EXAMPLE 10.8

Solve the system

X' =
[ 5   -4   4  ]
[ 12  -11  12 ]
[ 4   -4   5  ] X.

The eigenvalues of A are -3, 1, 1. Even though one eigenvalue is repeated, A has three linearly independent eigenvectors. They are:

[ 1 ]
[ 3 ]  associated with eigenvalue -3,
[ 1 ]

and

[ 1 ]        [ -1 ]
[ 1 ]  and   [ 0  ]  associated with 1.
[ 0 ]        [ 1  ]

This gives us three linearly independent solutions

[ 1 ]            [ 1 ]          [ -1 ]
[ 3 ] e^{-3t},   [ 1 ] e^t,     [ 0  ] e^t.
[ 1 ]            [ 0 ]          [ 1  ]

A fundamental matrix is

Ω(t) =
[ e^{-3t}   e^t  -e^t ]
[ 3e^{-3t}  e^t  0    ]
[ e^{-3t}   0    e^t  ].

The general solution is X(t) = Ω(t)C.
EXAMPLE 10.9  A Mixing Problem

Two tanks are connected by a series of pipes, as shown in Figure 10.1. Tank 1 initially contains 20 liters of water in which 150 grams of chlorine are dissolved. Tank 2 initially contains 50 grams of chlorine dissolved in 10 liters of water.

FIGURE 10.1  [Two connected tanks: pure water enters tank 1 at 3 liters/min; mixture flows from tank 1 to tank 2 at 4 liters/min and from tank 2 back to tank 1 at 3 liters/min; mixture is discharged from tank 1 at 2 liters/min and from tank 2 at 1 liter/min.]

Beginning at time t = 0, pure water is pumped into tank 1 at a rate of 3 liters per minute, while chlorine/water solutions are interchanged between the tanks and also flow out of both tanks at the rates shown. The problem is to determine the amount of chlorine in each tank at any time t > 0.

At the given rates of input and discharge of solutions, the amount of solution in each tank will remain constant. Therefore, the ratio of chlorine to chlorine/water solution in each tank should, in the long run, approach that of the input, which is pure water. We will use this observation as a check of the analysis we are about to do.

Let xj(t) be the number of grams of chlorine in tank j at time t. Reading from Figure 10.1,

x1'(t) = rate in minus rate out
= 3 (liter/min) · 0 (gram/liter) + 3 (liter/min) · (x2/10) (gram/liter)
  - 2 (liter/min) · (x1/20) (gram/liter) - 4 (liter/min) · (x1/20) (gram/liter)
= -(6/20)x1 + (3/10)x2.

Similarly, with the dimensions excluded,

x2'(t) = 4(x1/20) - 3(x2/10) - x2/10 = (4/20)x1 - (4/10)x2.

The system is X' = AX, with

A =
[ -3/10  3/10 ]
[ 1/5   -2/5  ].

The initial conditions are x1(0) = 150, x2(0) = 50, or

X(0) =
[ 150 ]
[ 50  ].

The eigenvalues of A are -1/10 and -3/5, and corresponding eigenvectors are, respectively,

[ 3/2 ]        [ -1 ]
[ 1   ]  and   [ 1  ].

These are linearly independent, and we can write the fundamental matrix

Ω(t) =
[ (3/2)e^{-t/10}  -e^{-3t/5} ]
[ e^{-t/10}       e^{-3t/5}  ].

The general solution is X(t) = Ω(t)C. To solve the initial value problem, we must find C so that

X(0) =
[ 150 ]
[ 50  ] = Ω(0)C =
[ 3/2  -1 ]
[ 1    1  ] C.

Then

C =
[ 3/2  -1 ]^{-1} [ 150 ]   [ 2/5   2/5 ] [ 150 ]   [ 80  ]
[ 1    1  ]      [ 50  ] = [ -2/5  3/5 ] [ 50  ] = [ -30 ].

The solution of the initial value problem is

X(t) =
[ (3/2)e^{-t/10}  -e^{-3t/5} ] [ 80  ]
[ e^{-t/10}       e^{-3t/5}  ] [ -30 ]
=
[ 120e^{-t/10} + 30e^{-3t/5} ]
[ 80e^{-t/10} - 30e^{-3t/5}  ].

Notice that x1(t) → 0 and x2(t) → 0 as t → ∞, as we expected.
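The mixing-problem answer can be checked the same way as before: differentiate the proposed solution by hand and compare with AX, then confirm the initial values and the long-time decay. This sketch assumes the system and solution derived above (A = [[-0.3, 0.3], [0.2, -0.4]], x1 = 120e^{-t/10} + 30e^{-3t/5}, x2 = 80e^{-t/10} - 30e^{-3t/5}):

```python
# Verify the mixing-problem solution satisfies X' = AX with X(0) = (150, 50).
import numpy as np

A = np.array([[-0.3, 0.3], [0.2, -0.4]])

def X(t):
    return np.array([120*np.exp(-t/10) + 30*np.exp(-3*t/5),
                     80*np.exp(-t/10) - 30*np.exp(-3*t/5)])

def Xprime(t):
    # derivative of each component, term by term
    return np.array([-12*np.exp(-t/10) - 18*np.exp(-3*t/5),
                     -8*np.exp(-t/10) + 18*np.exp(-3*t/5)])

assert np.allclose(X(0), [150.0, 50.0])       # initial chlorine amounts
for t in [0.0, 1.0, 10.0]:
    assert np.allclose(Xprime(t), A @ X(t))   # the ODE holds
```

Evaluating X at a large time shows both components near zero, matching the physical expectation that pure-water input flushes the chlorine out.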
10.2.1  Solution of X' = AX when A has Complex Eigenvalues

Consider a system X' = AX. If A is a real matrix, the characteristic polynomial of A has real coefficients. It may, however, have some complex roots. Suppose λ = α + iβ is a complex eigenvalue, with eigenvector ξ. Then Aξ = λξ. Taking complex conjugates of both sides, and using the fact that A equals its conjugate because A has real elements, we get

Aξ̄ = λ̄ξ̄.

This means that λ̄ = α - iβ is also an eigenvalue, with eigenvector ξ̄. Thus ξe^{λt} and ξ̄e^{λ̄t} can be used as two of the n linearly independent solutions needed to form a fundamental matrix. The resulting fundamental matrix will contain some complex entries. There is nothing wrong with this. However, sometimes it is convenient to have a fundamental matrix with only real entries. We will show how to replace these two columns, involving complex numbers, with two other linearly independent solutions involving only real quantities. This can be done for any pair of columns arising from a pair of complex conjugate eigenvalues.
THEOREM 10.8

Let A be an n×n real matrix. Let α + iβ be a complex eigenvalue with corresponding eigenvector U + iV, in which U and V are real n×1 matrices. Then

e^{αt}[U cos(βt) - V sin(βt)]

and

e^{αt}[U sin(βt) + V cos(βt)]

are real linearly independent solutions of X' = AX. ■
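Theorem 10.8 can be sanity-checked numerically on a small assumed example: A = [[0,-1],[1,0]] has eigenvalue i with eigenvector (1, -i), so α = 0, β = 1, U = (1, 0), V = (0, -1):

```python
# Check that the two real combinations in Theorem 10.8 solve X' = AX.
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])
U = np.array([1.0, 0.0])
V = np.array([0.0, -1.0])
alpha, beta = 0.0, 1.0

def sol1(t):
    return np.exp(alpha*t) * (U*np.cos(beta*t) - V*np.sin(beta*t))

def sol2(t):
    return np.exp(alpha*t) * (U*np.sin(beta*t) + V*np.cos(beta*t))

# central-difference check that both satisfy X' = AX
h = 1e-6
for s in (sol1, sol2):
    for t in (0.0, 0.8):
        deriv = (s(t + h) - s(t - h)) / (2*h)
        assert np.allclose(deriv, A @ s(t), atol=1e-5)
```

Evaluating both solutions at t = 0 gives the columns U and V (up to sign), whose determinant is nonzero, confirming independence via Theorem 10.3(2).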
EXAMPLE 10.10

Solve the system X' = AX, with

A =
[ 2  0   1  ]
[ 0  -2  -2 ]
[ 0  2   0  ].

The eigenvalues are 2, -1 + √3 i, and -1 - √3 i. Corresponding eigenvectors are, respectively,

[ 1 ]    [ 1         ]    [ 1         ]
[ 0 ]    [ -2√3 i    ]    [ 2√3 i     ]
[ 0 ],   [ -3 + √3 i ],   [ -3 - √3 i ].

One solution is

[ 1 ]
[ 0 ] e^{2t},
[ 0 ]

and two other solutions are

[ 1         ]                      [ 1         ]
[ -2√3 i    ] e^{(-1+√3 i)t}  and  [ 2√3 i     ] e^{(-1-√3 i)t}.
[ -3 + √3 i ]                      [ -3 - √3 i ]

These three solutions are linearly independent and can be used as columns of a fundamental matrix

Ω1(t) =
[ e^{2t}  e^{(-1+√3 i)t}             e^{(-1-√3 i)t}            ]
[ 0       -2√3 i e^{(-1+√3 i)t}      2√3 i e^{(-1-√3 i)t}      ]
[ 0       (-3 + √3 i)e^{(-1+√3 i)t}  (-3 - √3 i)e^{(-1-√3 i)t} ].

However, we can also produce a real fundamental matrix as follows. First write

[ 1         ]   [ 1  ]     [ 0    ]
[ -2√3 i    ] = [ 0  ] + i [ -2√3 ] = U + iV.
[ -3 + √3 i ]   [ -3 ]     [ √3   ]

Then

[ 1         ]
[ -2√3 i    ] e^{(-1+√3 i)t} = (U + iV)[e^{-t}cos(√3 t) + ie^{-t}sin(√3 t)]
[ -3 + √3 i ]
= Ue^{-t}cos(√3 t) - Ve^{-t}sin(√3 t) + i[Ve^{-t}cos(√3 t) + Ue^{-t}sin(√3 t)],    (10.4)

and

[ 1         ]
[ 2√3 i     ] e^{(-1-√3 i)t} = (U - iV)[e^{-t}cos(√3 t) - ie^{-t}sin(√3 t)]
[ -3 - √3 i ]
= Ue^{-t}cos(√3 t) - Ve^{-t}sin(√3 t) - i[Ve^{-t}cos(√3 t) + Ue^{-t}sin(√3 t)].    (10.5)

The functions (10.4) and (10.5) are solutions, so any linear combination of these is also a solution. Taking their sum and dividing by 2 yields the solution

Φ1(t) = Ue^{-t}cos(√3 t) - Ve^{-t}sin(√3 t),

and taking their difference and dividing by 2i yields the solution

Φ2(t) = Ve^{-t}cos(√3 t) + Ue^{-t}sin(√3 t).

Using these, together with the solution found from the eigenvalue 2, we can form the fundamental matrix

Ω2(t) =
[ e^{2t}  e^{-t}cos(√3 t)                      e^{-t}sin(√3 t)                     ]
[ 0       2√3 e^{-t}sin(√3 t)                  -2√3 e^{-t}cos(√3 t)                ]
[ 0       e^{-t}[-3cos(√3 t) - √3 sin(√3 t)]   e^{-t}[√3 cos(√3 t) - 3sin(√3 t)]  ].

Either fundamental matrix can be used to write the general solution, X(t) = Ω1(t)C or X(t) = Ω2(t)K. However, the latter involves only real numbers and real-valued functions.

A proof of the theorem follows the reasoning of the example, and is left to the student.

10.2.2  Solution of X' = AX when A does not have n Linearly Independent Eigenvectors

We know how to produce a fundamental matrix for X' = AX when A has n linearly independent eigenvectors. This certainly occurs if A has n distinct eigenvalues, and may even occur when A has repeated eigenvalues. However, we may encounter a matrix A having repeated eigenvalues, for which there are not n linearly independent eigenvectors. In this case we cannot yet write a fundamental matrix. This section is devoted to a procedure to follow in this case to find a fundamental matrix. We will begin with two examples and then make some general remarks.
EXAMPLE 10.11

We will solve the system X' = AX, with

A =
[ 1   3 ]
[ -3  7 ].

A has one eigenvalue 4 of multiplicity 2. Eigenvectors all have the form

α
[ 1 ]
[ 1 ],

with α ≠ 0. A does not have two linearly independent eigenvectors. We can immediately write one solution

Φ1(t) =
[ 1 ]
[ 1 ] e^{4t}.

We need another solution. Write E1 =
[ 1 ]
[ 1 ]
and attempt a second solution

Φ2(t) = E1te^{4t} + E2e^{4t},

in which E2 is a 2×1 constant matrix to be determined. For this to be a solution, we need to have Φ2'(t) = AΦ2(t):

E1[e^{4t} + 4te^{4t}] + 4E2e^{4t} = AE1te^{4t} + AE2e^{4t}.

Divide this equation by e^{4t} to get

E1 + 4E1t + 4E2 = AE1t + AE2.

But AE1 = 4E1, so the terms having t as a factor cancel and we are left with

AE2 - 4E2 = E1.

Write this equation as

(A - 4I2)E2 = E1.

If E2 =
[ a ]
[ b ],
this is the linear system of two equations in two unknowns:

(A - 4I2)
[ a ]   [ 1 ]
[ b ] = [ 1 ],

or

[ -3  3 ] [ a ]   [ 1 ]
[ -3  3 ] [ b ] = [ 1 ].

This system has general solution

E2 =
[ s            ]
[ (1 + 3s)/3   ],

in which s can be any number. Let s = 1 to get

E2 =
[ 1   ]
[ 4/3 ],

and hence the second solution

Φ2(t) = E1te^{4t} + E2e^{4t} =
[ (1 + t)e^{4t}   ]
[ (4/3 + t)e^{4t} ].

If we use Φ1(0) and Φ2(0) as columns to form the matrix

[ 1  1   ]
[ 1  4/3 ],

then this matrix has determinant 1/3, hence Φ1 and Φ2 are linearly independent by Theorem 10.3(2). Therefore Φ1(t) and Φ2(t) are linearly independent and can be used as columns of a fundamental matrix

Ω(t) =
[ e^{4t}  (1 + t)e^{4t}   ]
[ e^{4t}  (4/3 + t)e^{4t} ].

The general solution of X' = AX is X(t) = Ω(t)C.

The procedure followed in this example is similar in spirit to solving the differential equation y'' - 5y' + 6y = e^{3x} by undetermined coefficients. We are tempted to try yp(x) = ae^{3x}, but this will not work because e^{3x} is a solution of y'' - 5y' + 6y = 0. We therefore try yp(x) = axe^{3x}, multiplying the first attempt ae^{3x} by x. The analogous step for the system was to try the second solution Φ2(t) = E1te^{4t} + E2e^{4t}.

We will continue to explore the case of repeated eigenvalues with another example.
EXAMPLE 10 .1 2
Consider the system X' = AX, in which -2 -1 A = 25 -7 0 1
-5 0 3
A has eigenvalue -2 with multiplicity 3, and corresponding eigenvectors are all nonzero scala r -1 -1 multiples of -5 J . This gives us one solution of X' = AX. Denoting -5 = E l , we 1 1 have one solutio n (t) =
-1 -5 1
e- 21 .= El e -2t .
We need three linearly independent solutions . We will try a second solution of the for m (D2 (t) = El to-2t + E2 e -2t , in which E 2 is a 3 x 1 matrix to be determined . Substitute this proposed solution into X' = AX to get E l [e -2' - 2te-2t] +E2[-2e-2t] = AE I to -2t +AE2e -2 t Upon dividing by the common factor of e -2t , and recalling that AEI = -2E 1 , this equation becomes E I - 2tE 1 - 2E2 = -2tE1 +AE2. or AE 2 +2E 2 = E 1 .
382
CHAPTER 10 Systems of Linear Differential Equation s
L ..
We can write this equation as (A + 2I₃)E₂ = E₁, or

[[0, -1, -5], [25, -5, 0], [0, 1, 5]] E₂ = (-1, -5, 1)^T.

With E₂ = (α, β, γ)^T, this is a nonhomogeneous system with general solution

E₂ = (-s, 1 - 5s, s)^T,

in which s can be any number. For a specific solution, choose s = 1 and let

E₂ = (-1, -4, 1)^T.

This gives us the second solution

Φ₂(t) = E₁te^{-2t} + E₂e^{-2t} = (-1, -5, 1)^T te^{-2t} + (-1, -4, 1)^T e^{-2t} = (-1 - t, -4 - 5t, 1 + t)^T e^{-2t}.

We need one more solution. Try for a solution of the form

Φ₃(t) = ½E₁t²e^{-2t} + E₂te^{-2t} + E₃e^{-2t}.

We want to solve for E₃. Substitute this proposed solution into X' = AX to get

E₁[te^{-2t} - t²e^{-2t}] + E₂[e^{-2t} - 2te^{-2t}] + E₃[-2e^{-2t}] = ½AE₁t²e^{-2t} + AE₂te^{-2t} + AE₃e^{-2t}.

Divide this equation by e^{-2t} and use the facts that AE₁ = -2E₁ and AE₂ = E₁ - 2E₂ to get

E₁t - E₁t² + E₂ - 2E₂t - 2E₃ = -E₁t² + (E₁ - 2E₂)t + AE₃.   (10.6)
Now

E₁t - 2E₂t = [(-1, -5, 1)^T - 2(-1, -4, 1)^T] t = (1, 3, -1)^T t,

so three terms of equation (10.6) cancel, and it reduces to

E₂ - 2E₃ = AE₃.

Write this equation as (A + 2I₃)E₃ = E₂, or

[[0, -1, -5], [25, -5, 0], [0, 1, 5]] E₃ = (-1, -4, 1)^T,

with general solution

E₃ = ((1 - 25s)/25, 1 - 5s, s)^T,

in which s can be any number. Choosing s = 1, we can let

E₃ = (-24/25, -4, 1)^T.

A third solution is

Φ₃(t) = (-1, -5, 1)^T (t²/2)e^{-2t} + (-1, -4, 1)^T te^{-2t} + (-24/25, -4, 1)^T e^{-2t}
      = (-24/25 - t - ½t², -4 - 4t - (5/2)t², 1 + t + ½t²)^T e^{-2t}.

To show that Φ₁, Φ₂ and Φ₃ are linearly independent, Theorem 10.3(2) is convenient. Form the 3×3 matrix having these solutions, evaluated at t = 0, as columns:

[[-1, -1, -24/25], [-5, -4, -4], [1, 1, 1]].

The determinant of this matrix is -1/25, so this matrix is nonsingular and the solutions are linearly independent. We can use these solutions as columns of a fundamental matrix
Φ(t) = [[-e^{-2t}, (-1 - t)e^{-2t}, (-24/25 - t - ½t²)e^{-2t}],
        [-5e^{-2t}, (-4 - 5t)e^{-2t}, (-4 - 4t - (5/2)t²)e^{-2t}],
        [e^{-2t}, (1 + t)e^{-2t}, (1 + t + ½t²)e^{-2t}]].

The general solution of X' = AX is X(t) = Φ(t)C.
These examples suggest a procedure, which we will now outline in general. Begin with a system X' = AX, with A an n×n matrix of real numbers. We want the general solution, so we need n linearly independent solutions.

Case 1: A has n linearly independent eigenvectors. Use these eigenvectors to write n linearly independent solutions, and use these as columns of a fundamental matrix. (This case may occur even if A does not have n distinct eigenvalues.)

Case 2: A does not have n linearly independent eigenvectors. Let the eigenvalues of A be λ₁, ..., λₙ. At least one eigenvalue must be repeated, because if A has n distinct eigenvalues, the corresponding eigenvectors must be linearly independent, putting us back in Case 1. Suppose λ₁, ..., λ_r are the distinct eigenvalues, while λ_{r+1}, ..., λₙ repeat some of these first r eigenvalues. If V_j is an eigenvector corresponding to λ_j for j = 1, ..., r, we can immediately write r linearly independent solutions

Ψ₁(t) = V₁e^{λ₁t}, ..., Ψ_r(t) = V_re^{λ_rt}.

Now work with the repeated eigenvalues. Suppose μ is a repeated eigenvalue, say μ = λ₁ with multiplicity k. We already have one solution corresponding to μ, namely Ψ₁. To be consistent in notation with the examples just done, denote V₁ = E₁. Then Φ₁(t) = V₁e^{λ₁t} = E₁e^{μt} = Ψ₁ is one solution corresponding to μ. For a second solution corresponding to μ, let

Φ₂(t) = E₁te^{μt} + E₂e^{μt}.

Substitute this proposed solution into X' = AX and solve for E₂. If k = 2, this yields a second solution corresponding to μ, and we move on to another multiple eigenvalue. If k ≥ 3, we do not yet have all the solutions corresponding to μ, so we attempt

Φ₃(t) = (1/2)E₁t²e^{μt} + E₂te^{μt} + E₃e^{μt}.

Substitute Φ₃(t) into the differential equation and solve for E₃ to get a third solution corresponding to μ. If k ≥ 4, continue with

Φ₄(t) = (1/3!)E₁t³e^{μt} + (1/2!)E₂t²e^{μt} + E₃te^{μt} + E₄e^{μt},

substitute into the differential equation and solve for E₄, and so on. Eventually, we reach

Φ_k(t) = (1/(k-1)!)E₁t^{k-1}e^{μt} + (1/(k-2)!)E₂t^{k-2}e^{μt} + ... + E_{k-1}te^{μt} + E_ke^{μt};

substitute into the differential equation and solve for E_k. This procedure gives, for an eigenvalue μ of multiplicity k, k linearly independent solutions of X' = AX. Repeat the procedure for each eigenvalue until n linearly independent solutions have been found.

10.2.3 Solution of X' = AX by Diagonalizing A
We now take a different tack and attempt to exploit diagonalization. Consider the system

X' = [[-2, 0, 0], [0, 4, 0], [0, 0, -6]] X.

The constant coefficient matrix A is a diagonal matrix, and this system really consists of three independent differential equations, each involving just one of the variables:

x₁' = -2x₁, x₂' = 4x₂, x₃' = -6x₃.

Such a system is said to be uncoupled. Each equation is easily solved independently of the others, obtaining

x₁ = c₁e^{-2t}, x₂ = c₂e^{4t}, x₃ = c₃e^{-6t}.

The system is uncoupled because the coefficient matrix A is diagonal. Because of this, we can immediately write the eigenvalues -2, 4, -6 of A, and find the corresponding eigenvectors

(1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T.

Therefore X' = AX has fundamental matrix

Ω(t) = [[e^{-2t}, 0, 0], [0, e^{4t}, 0], [0, 0, e^{-6t}]]

and the general solution is X(t) = Ω(t)C. However we wish to approach this system, the point is that it is easy to solve because A is a diagonal matrix.

Now in general A need not be diagonal. However, A may be diagonalizable (Section 9.2). This will occur exactly when A has n linearly independent eigenvectors. In this event, we can form a matrix P, whose columns are eigenvectors of A, such that

P⁻¹AP = D = [[λ₁, 0, ..., 0], [0, λ₂, ..., 0], ..., [0, 0, ..., λₙ]].
D is the diagonal matrix having the eigenvalues λ₁, ..., λₙ down its main diagonal. This will hold even if some of the eigenvalues have multiplicities greater than 1, provided that A has n linearly independent eigenvectors.

Now make the change of variables X = PZ in the differential equation X' = AX. First compute

X' = (PZ)' = PZ' = AX = APZ,

so Z' = P⁻¹APZ = DZ. The uncoupled system Z' = DZ can be solved by inspection. A fundamental matrix for Z' = DZ is

Ω_D(t) = [[e^{λ₁t}, 0, ..., 0], [0, e^{λ₂t}, ..., 0], ..., [0, 0, ..., e^{λₙt}]]
and the general solution of Z' = DZ is Z(t) = Ω_D(t)C. Then

X(t) = PZ(t) = PΩ_D(t)C

is the general solution of the original system X' = AX. That is, Ω(t) = PΩ_D(t) is a fundamental matrix for X' = AX. In this process we need P, whose columns are eigenvectors of A, but we never actually need to calculate P⁻¹.
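This recipe is easy to carry out numerically. A minimal Python sketch (the matrix here is an illustrative choice, not one from the text):

```python
import numpy as np

# A hypothetical diagonalizable matrix (eigenvalues 3 and -1).
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

lam, P = np.linalg.eig(A)          # columns of P are eigenvectors
D = np.diag(lam)

# P^{-1} A P = D
assert np.allclose(np.linalg.inv(P) @ A @ P, D)

def Omega(t):
    """Fundamental matrix Omega(t) = P exp(Dt) for X' = AX."""
    return P @ np.diag(np.exp(lam * t))

# Columns of Omega satisfy X' = AX (centered-difference check)
t0, h = 0.4, 1e-6
deriv = (Omega(t0 + h) - Omega(t0 - h)) / (2*h)
ok = np.allclose(deriv, A @ Omega(t0), atol=1e-5)
```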
EXAMPLE 10.13
Solve

X' = [[3, 3], [1, 5]] X.

The eigenvalues and associated eigenvectors of A are

2, (-3, 1)^T and 6, (1, 1)^T.

Because A has distinct eigenvalues, A is diagonalizable. Make the change of variables X = PZ, where

P = [[-3, 1], [1, 1]].

This transforms X' = AX to Z' = DZ, where

D = [[2, 0], [0, 6]].

This uncoupled system has fundamental matrix

Ω_D(t) = [[e^{2t}, 0], [0, e^{6t}]].

Then X' = AX has fundamental matrix

Ω(t) = PΩ_D(t) = [[-3e^{2t}, e^{6t}], [e^{2t}, e^{6t}]].

The general solution of X' = AX is X(t) = Ω(t)C.

10.2.4 Exponential Matrix Solutions of X' = AX

A first-order differential equation y' = ay has general solution y(x) = ce^{ax}. At the risk of stretching the analogy too far, we might conjecture that there is a solution e^{At}C for a matrix differential equation X' = AX. We will now show how to define the exponential matrix e^{At} to make sense out of this conjecture. For this section, let A be an n×n matrix of real numbers.
The Taylor expansion of the real exponential function,

e^t = 1 + t + (1/2!)t² + (1/3!)t³ + ...,

suggests the following.
DEFINITION 10.3 Exponential Matrix

The exponential matrix e^{At} is the n×n matrix defined by

e^{At} = Iₙ + At + (1/2!)A²t² + (1/3!)A³t³ + ... .
It can be shown that this series converges for all real t, in the sense that the infinite series of elements in the i, j place converges.

Care must be taken in computing with exponential matrices, because matrix multiplication is not commutative. The analogue of the relationship e^{at}e^{bt} = e^{(a+b)t} is given by the following.

THEOREM 10.9

Let B be an n×n real matrix. Suppose AB = BA. Then e^{(A+B)t} = e^{At}e^{Bt}.

Because A is a constant matrix,

d/dt e^{At} = d/dt [Iₙ + At + (1/2!)A²t² + (1/3!)A³t³ + (1/4!)A⁴t⁴ + ...]
            = A + A²t + (1/2!)A³t² + (1/3!)A⁴t³ + ...
            = A[Iₙ + At + (1/2!)A²t² + (1/3!)A³t³ + ...]
            = Ae^{At}.

The derivative of e^{At}, obtained by differentiating each element, is the product Ae^{At} of two n×n matrices, and has the same form as the derivative of the scalar exponential function e^{at}. One ramification of this derivative formula is that, for any n×1 constant matrix K, e^{At}K is a solution of X' = AX.
LEMMA 10.1

For any real n×1 constant matrix K, Φ(t) = e^{At}K is a solution of X' = AX.

Proof: Compute Φ'(t) = (e^{At}K)' = Ae^{At}K = AΦ(t). ■

Even more, e^{At} is a fundamental matrix for X' = AX.
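The defining series and the derivative formula can be checked numerically by summing partial sums of the series. A small Python sketch (the matrix A here is an illustrative choice, not from the text):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [3.0, 3.0]])   # a hypothetical 2x2 example

def expm_series(A, t, terms=40):
    """Partial sum I + At + A^2 t^2/2! + ... of the exponential-matrix series."""
    n = A.shape[0]
    total = np.eye(n)
    term = np.eye(n)
    for k in range(1, terms):
        term = term @ (A * t) / k
        total = total + term
    return total

# d/dt e^{At} = A e^{At}, checked by a centered difference
t0, h = 0.3, 1e-6
deriv = (expm_series(A, t0 + h) - expm_series(A, t0 - h)) / (2*h)
ok_deriv = np.allclose(deriv, A @ expm_series(A, t0), atol=1e-5)

# e^{At}K solves X' = AX for any constant K
K = np.array([1.0, -2.0])
x = lambda t: expm_series(A, t) @ K
dx = (x(t0 + h) - x(t0 - h)) / (2*h)
ok_sol = np.allclose(dx, A @ x(t0), atol=1e-5)
```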
THEOREM 10.10

e^{At} is a fundamental matrix for X' = AX.

Proof: Let E_j be the n×1 matrix with 1 in the j, 1 place and all other entries zero. Then e^{At}E_j is the jth column of e^{At}. This column is a solution of X' = AX by the lemma. Further, the columns of e^{At} are linearly independent by Theorem 10.3(2), because if we put t = 0, we get e^{A·0} = Iₙ, which has a nonzero determinant. Thus e^{At} is a fundamental matrix for X' = AX. ■

In theory, then, we can find the general solution e^{At}C of X' = AX, if we can compute e^{At}. This, however, can be a daunting task. As an example, for an apparently simple matrix such as

A = [[1, 2], [-2, 4]],
we find using a software package that

e^{At} = e^{5t/2} [[cos(√7 t/2) - (3/√7)sin(√7 t/2), (4/√7)sin(√7 t/2)],
                   [-(4/√7)sin(√7 t/2), cos(√7 t/2) + (3/√7)sin(√7 t/2)]].

This is a fundamental matrix for X' = AX. It would be at least as easy, for this A, to find the eigenvalues of A, which are (5 ± i√7)/2, then find the corresponding eigenvectors, (4, 3 ± i√7)^T, and use these to obtain a fundamental matrix.

We will now pursue an interesting line of thought. We claim that, even though e^{At} may be tedious or even impractical to compute for a given A, it is often possible to compute the product e^{At}K, for carefully chosen K, as a finite sum, and hence generate solutions of X' = AX. To do this we need the following.

LEMMA 10.2
Let A be an n×n real matrix and K an n×1 real matrix. Let μ be any number. Then

1. e^{μIₙt}K = e^{μt}K.
2. e^{At}K = e^{μt}e^{(A-μIₙ)t}K.

Proof: For (1), since (Iₙ)^m = Iₙ for any positive integer m, we have

e^{μIₙt}K = [Iₙ + μIₙt + (1/2!)(μIₙ)²t² + (1/3!)(μIₙ)³t³ + ...]K
          = [1 + μt + (1/2!)μ²t² + (1/3!)μ³t³ + ...]IₙK = e^{μt}K.
For (2), first observe that μIₙ and A - μIₙ commute, since

μIₙ(A - μIₙ) = μA - μ²Iₙ = (A - μIₙ)μIₙ.

Then by Theorem 10.9,

e^{At}K = e^{(μIₙ + A - μIₙ)t}K = e^{μIₙt}e^{(A-μIₙ)t}K = e^{μt}e^{(A-μIₙ)t}K,

using part (1) in the last step. ■
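Lemma 10.2(2) is what makes e^{At}K computable as a finite sum when some power of A - μIₙ annihilates K. A Python sketch with a hypothetical 2×2 matrix having the single eigenvalue μ = 3:

```python
import numpy as np

# A hypothetical matrix with the single eigenvalue mu = 3 (multiplicity 2).
A = np.array([[3.0, 1.0],
              [0.0, 3.0]])
mu = 3.0
N = A - mu * np.eye(2)               # A - mu*I, here nilpotent

K = np.array([0.0, 1.0])
assert np.any(N @ K != 0) and np.allclose(N @ (N @ K), 0)

def expm_series(M, terms=40):
    """Partial sums of e^M for a 2x2 matrix M."""
    total, term = np.eye(2), np.eye(2)
    for k in range(1, terms):
        term = term @ M / k
        total = total + term
    return total

t = 0.8
# Full series e^{At} K ...
full = expm_series(A * t) @ K
# ... equals the two-term finite sum e^{mu t}[K + (A - mu I)K t]
finite = np.exp(mu * t) * (K + (N @ K) * t)
ok = np.allclose(full, finite)
```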
Now suppose we want to solve X' = AX. Let λ₁, ..., λ_r be the distinct eigenvalues of A, and let λ_j have multiplicity m_j. Then m₁ + ... + m_r = n. For each λ_j, find as many linearly independent eigenvectors as possible. This can be any number from 1 to m_j inclusive. If this yields n linearly independent eigenvectors, then we can write the general solution as a sum of eigenvectors times exponential functions e^{λ_jt}, and we do not need e^{At}.

Thus suppose some λ_j has multiplicity m_j ≥ 2, but there are fewer than m_j linearly independent eigenvectors. Find an n×1 constant matrix K₁ that is linearly independent from the eigenvectors found for λ_j and such that

(A - λ_jIₙ)K₁ ≠ 0, but (A - λ_jIₙ)²K₁ = 0.

Then e^{At}K₁ is a solution of X' = AX. Further, because of the way K₁ was chosen,

e^{At}K₁ = e^{λ_jt}e^{(A-λ_jIₙ)t}K₁ = e^{λ_jt}[K₁ + (A - λ_jIₙ)K₁t],

with all other terms of the series for e^{(A-λ_jIₙ)t}K₁ vanishing, because (A - λ_jIₙ)²K₁ = 0 forces (A - λ_jIₙ)^mK₁ = 0 for m ≥ 2. We can therefore compute e^{At}K₁ as a sum of just two terms.

If we now have m_j solutions corresponding to λ_j, then leave this eigenvalue and move on to any others that do not yet have as many linearly independent solutions as their multiplicity. If we do not yet have m_j solutions corresponding to λ_j, then find a constant n×1 matrix K₂ such that

(A - λ_jIₙ)K₂ ≠ 0 and (A - λ_jIₙ)²K₂ ≠ 0, but (A - λ_jIₙ)³K₂ = 0.

Then e^{At}K₂ is a solution of X' = AX, and we can compute this solution as a sum of just three terms:

e^{At}K₂ = e^{λ_jt}e^{(A-λ_jIₙ)t}K₂ = e^{λ_jt}[K₂ + (A - λ_jIₙ)K₂t + ½(A - λ_jIₙ)²K₂t²].

The other terms in the infinite series for e^{At}K₂ vanish because

(A - λ_jIₙ)³K₂ = (A - λ_jIₙ)⁴K₂ = ... = 0.

If this gives us m_j solutions associated with λ_j, move on to another eigenvalue for which we do not yet have as many solutions as the multiplicity of the eigenvalue. If not, produce an n×1 constant matrix K₃ such that

(A - λ_jIₙ)K₃ ≠ 0, (A - λ_jIₙ)²K₃ ≠ 0 and (A - λ_jIₙ)³K₃ ≠ 0, but (A - λ_jIₙ)⁴K₃ = 0.

Then e^{At}K₃ can be computed as a sum of four terms.
Keep repeating this process. For λ_j it must terminate after at most m_j - 1 steps, because we began with at least one eigenvector associated with λ_j and then produced more solutions to obtain a total of m_j linearly independent solutions associated with λ_j. Once these are obtained, we move on to another eigenvalue for which we have fewer solutions than its multiplicity, and repeat this process for that eigenvalue, and so on. Eventually we generate a total of n linearly independent solutions, thus obtaining the general solution of X' = AX.
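One way to carry out the search for K₁ numerically is to look in the null space of (A - λIₙ)² for a vector not annihilated by A - λIₙ itself. A sketch, assuming a hypothetical matrix with a double eigenvalue and only one independent eigenvector:

```python
import numpy as np

# Hypothetical matrix with double eigenvalue 2, one independent eigenvector.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
lam = 2.0
B = A - lam * np.eye(2)

def nullspace(M, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of M, via the SVD."""
    _, s, Vt = np.linalg.svd(M)
    return Vt[s <= tol].T

# Candidates lie in the null space of B^2; keep one not killed by B itself.
cands = nullspace(B @ B)
K1 = next(cands[:, j] for j in range(cands.shape[1])
          if np.linalg.norm(B @ cands[:, j]) > 1e-8)

# Then e^{At}K1 = e^{lam t}[K1 + B K1 t] is a second solution of X' = AX.
def phi2(t):
    return np.exp(lam * t) * (K1 + (B @ K1) * t)

t0, h = 0.5, 1e-6
deriv = (phi2(t0 + h) - phi2(t0 - h)) / (2*h)
ok = np.allclose(deriv, A @ phi2(t0), atol=1e-5)
```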
EXAMPLE 10.14

Consider X' = AX, with

A = [[2, 1, 0, 3], [0, 2, 1, 1], [0, 0, 2, 4], [0, 0, 0, 4]].

The eigenvalues are 4, 2, 2, 2. Associated with 4 we find the eigenvector (9, 6, 8, 4)^T, so one solution of X' = AX is

Φ₁(t) = (9, 6, 8, 4)^T e^{4t}.

Associated with 2 we find that every eigenvector has the form (α, 0, 0, 0)^T. A second solution is

Φ₂(t) = (1, 0, 0, 0)^T e^{2t}.

Now find a 4×1 constant matrix K₁ such that (A - 2I₄)K₁ ≠ 0, but (A - 2I₄)²K₁ = 0. First compute

(A - 2I₄)² = [[0, 0, 1, 7], [0, 0, 0, 6], [0, 0, 0, 8], [0, 0, 0, 4]].
Solve (A - 2I₄)²K₁ = 0 to find solutions of the form (α, β, 0, 0)^T. We will choose the solution

K₁ = (0, 1, 0, 0)^T

to avoid duplicating the eigenvector already found associated with 2. Then

(A - 2I₄)K₁ = (1, 0, 0, 0)^T ≠ 0,

as required. Thus form the third solution

Φ₃(t) = e^{At}K₁ = e^{2t}[K₁ + (A - 2I₄)K₁t] = e^{2t}[(0, 1, 0, 0)^T + (1, 0, 0, 0)^T t] = (t, 1, 0, 0)^T e^{2t}.

The three solutions found up to this point are linearly independent. Now we need a fourth solution. It must come from the eigenvalue 2, because 4 has multiplicity 1 and we have one solution corresponding to this eigenvalue. Look for K₂ such that

(A - 2I₄)K₂ ≠ 0 and (A - 2I₄)²K₂ ≠ 0,

but (A - 2I₄)³K₂ = 0. First compute

(A - 2I₄)³ = [[0, 0, 0, 18], [0, 0, 0, 12], [0, 0, 0, 16], [0, 0, 0, 8]].

Solutions of (A - 2I₄)³K₂ = 0 are of the form (α, β, γ, 0)^T.
We will choose

K₂ = (1, 1, 1, 0)^T

to avoid duplicating previous choices. Of course other choices are possible. It is routine to verify that (A - 2I₄)K₂ ≠ 0 and (A - 2I₄)²K₂ ≠ 0. Thus form the fourth solution

Φ₄(t) = e^{At}K₂ = e^{2t}[K₂ + (A - 2I₄)K₂t + ½(A - 2I₄)²K₂t²]
      = e^{2t}[(1, 1, 1, 0)^T + (1, 1, 0, 0)^T t + (1, 0, 0, 0)^T (t²/2)]
      = (1 + t + ½t², 1 + t, 1, 0)^T e^{2t}.
We now have four linearly independent solutions, hence the general solution. We can also write the fundamental matrix

Ω(t) = [[9e^{4t}, e^{2t}, te^{2t}, (1 + t + ½t²)e^{2t}],
        [6e^{4t}, 0, e^{2t}, (1 + t)e^{2t}],
        [8e^{4t}, 0, 0, e^{2t}],
        [4e^{4t}, 0, 0, 0]].
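A numerical check of this example (the matrix A is the one reconstructed above; the columns of Ω are the four solutions just found):

```python
import numpy as np

# 4x4 matrix of this example and the four solutions found in the text.
A = np.array([[2.0, 1.0, 0.0, 3.0],
              [0.0, 2.0, 1.0, 1.0],
              [0.0, 0.0, 2.0, 4.0],
              [0.0, 0.0, 0.0, 4.0]])

def Phi(t):
    e2, e4 = np.exp(2*t), np.exp(4*t)
    return np.column_stack([
        np.array([9.0, 6.0, 8.0, 4.0]) * e4,
        np.array([1.0, 0.0, 0.0, 0.0]) * e2,
        np.array([t, 1.0, 0.0, 0.0]) * e2,
        np.array([1 + t + 0.5*t**2, 1 + t, 1.0, 0.0]) * e2,
    ])

# Each column should satisfy X' = AX (centered-difference check)
t0, h = 0.6, 1e-6
deriv = (Phi(t0 + h) - Phi(t0 - h)) / (2*h)
ok = np.allclose(deriv, A @ Phi(t0), atol=1e-4)

# Phi(0) nonsingular, so the columns are linearly independent
nonsingular = abs(np.linalg.det(Phi(0.0))) > 1e-12
```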
In each of Problems 1 through 5, find a fundamental matrix for the system and use it to write the general solution. The coefficient matrices of these systems have real, distinct eigenvalues.

1. x₁' = 3x₁, x₂' = 5x₁ - 4x₂
2. x₁' = 4x₁ + 2x₂,
In each of Problems 6 through 11, find a fundamental matrix for the system and use it to solve the initial value problem. The matrices of these systems have real, distinct eigenvalues.
12. Show that the change of variables z = ln(t) for t > 0 transforms the system tx₁' = ax₁ + bx₂, tx₂' = cx₁ + dx₂ into a linear system X' = AX, assuming that a, b, c and d are real constants.

13. Use the idea of Problem 12 to solve the system
25
3 .( 0
27.
2 5 0 8 0 -1
28.
15 0 01 0 48 1
tx₁' = -x₁ - 3x₂, tx₂' = x₁ - 5x₂.

In each of Problems 15 through 19, find a real-valued fundamental matrix for the system X' = AX, with A the given matrix. 2 1
-4 ) 2
16.
0 l -1
-22
17.
3 1
-5 -1
18.
1 1 1
-1 -1 0
-2 -5 0
1 0 3
15.
20.
J
32 2 -5 1)'(8 ) - 3 )'( 10 )
23.
- 2 )'( 0 5 ) 3 -3 1 7 2 -1 0 ; 4 1 -1 1 3
24. Can a matrix with at least one complex, non-real element have only real eigenvalues? If not, give a proof. If it can, give an example.
0 0 1 0
In each of Problems 31 through 35, find the general solution of the system X' = AX, with A the given matrix, and use this general solution to solve the initial value problem, for the given n x 1 matrix X(0) . Use the method of Section 10.2.2 for these problems . -5 )'(3 ) 32. (
) ;(
3
33.
-4 1 1 0 2 -5 0 0 -4
34.
-5 2 0 -5 0 0
) ;
1 3 -5
0 4 12 2 -3 4
/ 1 -2 0 0 1 -1 0 0 35. 0 0 5 -3 \0 0 3 -1
2 -2 1 4
In each of Problems 36 through 40, find the general solution of the system by diagonalizing the coefficient matrix.

36. x₁' = -2x₁ + x₂, x₂' = -4x₁ + 3x₂
37. x₁' = 3x₁ + 3x₂, x₂' = x₁ + 5x₂
38. x₁' = x₁ + x₂, x₂' = x₁ + x₂
39. x₁' = 6x₁ + 5x₂, x₂' = x₁ + 2x₂
In each of Problems 25 through 30, find a fundamental matrix for the system X' = AX, using the method of Section 10.2.2, with A the given matrix.
6 4 4 1
0 10 0 01 0 00 - 1 -2 0
)
0 0 -2 ) In each of Problems 20 through 23, find a real-value d fundamental matrix for the system X' = AX, with A the given matrix . Use this to solve the initial value problem , with X(0) the given n x 1 matrix. 19.
6 9 2
1 5 -2 03 0 03 0 00 0
29
30.
1 0 -1
2 ) 3
26. (2 O ) \52 J
tx₁' = 6x₁ + 2x₂, tx₂' = 4x₁ + 4x₂.

14. Solve the system
40. x₁' = 3x₁ - 2x₂, x₂' = 9x₁ - 3x₂
In each of Problems 41 through 45, solve the system X' = AX, with A the matrix of the indicated problem, by finding e^{At}.

41. A as in Problem 25.
42. A as in Problem 26.
43. A as in Problem 27.
44. A as in Problem 28.
45. A as in Problem 29.

In each of Problems 46 through 50, solve the initial value problem of the referred problem, using the exponential matrix.

46. Problem 31.
47. Problem 32.
48. Problem 33.
49. Problem 34.
50. Problem 35.

10.3 Solution of X' = AX + G
We now turn to the nonhomogeneous system X'(t) = A(t)X(t) + G(t), assuming that the elements of the n×n matrix A(t) and the n×1 matrix G(t) are continuous on some interval I, which may be the entire real line. Recall that the general solution of X' = AX + G has the form X(t) = Ω(t)C + Ψₚ(t), where Ω(t) is an n×n fundamental matrix for the homogeneous system X' = AX, C is an n×1 matrix of arbitrary constants, and Ψₚ is a particular solution of X' = AX + G. At least when A is a real, constant matrix, we have a strategy for finding Ω. We will concentrate in this section on strategies for finding a particular solution Ψₚ.

10.3.1 Variation of Parameters

Recall the variation of parameters method for second-order differential equations. If y₁(x) and y₂(x) form a fundamental set of solutions for

y''(x) + p(x)y'(x) + q(x)y(x) = 0,

then the general solution of this homogeneous equation is

y_h(x) = c₁y₁(x) + c₂y₂(x).

To find a particular solution y_p(x) of the nonhomogeneous equation

y''(x) + p(x)y'(x) + q(x)y(x) = f(x),

replace the constants in y_h by functions and attempt to choose u(x) and v(x) so that

y_p(x) = u(x)y₁(x) + v(x)y₂(x)

is a solution.

The variation of parameters method for the matrix equation X' = AX + G follows the same idea. Suppose we can find a fundamental matrix Ω(t) for the homogeneous system X' = AX. The general solution of this homogeneous system is then X_h(t) = Ω(t)C, in which C is an n×1 matrix of arbitrary constants. Look for a particular solution of X' = AX + G of the form

Ψₚ(t) = Ω(t)U(t),
in which U(t) is an n×1 matrix of functions of t which is to be determined. Substitute this proposed solution into the differential equation to get

(ΩU)' = A(ΩU) + G, or Ω'U + ΩU' = (AΩ)U + G.

Now Ω is a fundamental matrix for X' = AX, so Ω' = AΩ. Therefore Ω'U = (AΩ)U, and the last equation reduces to

ΩU' = G.

Since Ω is a fundamental matrix, the columns of Ω are linearly independent. This means that Ω is nonsingular, so the last equation can be solved for U' to get

U' = Ω⁻¹G.

As in the case of second-order differential equations, we now have the derivative of the function we want. Then

U(t) = ∫ Ω⁻¹(t)G(t) dt,

in which we integrate a matrix by integrating each element of the matrix. Once we find a suitable U(t), we have a particular solution Ψₚ(t) = Ω(t)U(t) of X' = AX + G. The general solution of this nonhomogeneous equation is then

X(t) = Ω(t)C + Ω(t)U(t),

in which C is an n×1 matrix of constants.
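When the integral for U(t) cannot be done in closed form, it can be approximated numerically. A Python sketch of the method for a hypothetical constant-coefficient system (trapezoid-rule quadrature; the matrix and forcing term are illustrative):

```python
import numpy as np

# Variation of parameters, numerically, for a hypothetical system
# X' = AX + G with G(t) = (sin t, 1)^T.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])            # eigenvalues -1 and -2
G = lambda t: np.array([np.sin(t), 1.0])

lam, P = np.linalg.eig(A)
Omega = lambda t: P @ np.diag(np.exp(lam * t))   # a fundamental matrix

def psi_p(t, n=4001):
    """Psi_p(t) = Omega(t) * integral_0^t Omega^{-1}(s) G(s) ds (trapezoid rule)."""
    s = np.linspace(0.0, t, n)
    vals = np.array([np.linalg.solve(Omega(si), G(si)) for si in s])
    ds = s[1] - s[0]
    U = ds * (0.5*vals[0] + vals[1:-1].sum(axis=0) + 0.5*vals[-1])
    return Omega(t) @ U

# Psi_p should satisfy Psi_p' = A Psi_p + G (centered-difference check).
t0, h = 1.0, 1e-3
deriv = (psi_p(t0 + h) - psi_p(t0 - h)) / (2*h)
ok = np.allclose(deriv, A @ psi_p(t0) + G(t0), atol=1e-3)
```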
EXAMPLE 10.15
Solve the system

X' = [[1, -10], [-1, 4]] X + (e^t, sin(t))^T.

First we need a fundamental matrix for X' = AX. The eigenvalues of A are -1 and 6, with associated eigenvectors, respectively, (5, 1)^T and (-2, 1)^T. Therefore a fundamental matrix for X' = AX is

Ω(t) = [[5e^{-t}, -2e^{6t}], [e^{-t}, e^{6t}]].

We find (details provided at the end of the example) that

Ω⁻¹(t) = (1/7) [[e^t, 2e^t], [-e^{-6t}, 5e^{-6t}]].

Compute

U'(t) = Ω⁻¹(t)G(t) = (1/7) ( e^{2t} + 2e^t sin(t), -e^{-5t} + 5e^{-6t} sin(t) )^T.

Then

U(t) = ∫ Ω⁻¹(t)G(t) dt = (1/7) ( ½e^{2t} + e^t[sin(t) - cos(t)], (1/5)e^{-5t} + (5/37)e^{-6t}[-6 sin(t) - cos(t)] )^T.
The general solution of X' = AX + G is

X(t) = Ω(t)C + Ω(t)U(t)
     = [[5e^{-t}, -2e^{6t}], [e^{-t}, e^{6t}]] C
       + ( (3/10)e^t + (35/37)sin(t) - (25/37)cos(t),
           (1/10)e^t + (1/37)sin(t) - (6/37)cos(t) )^T.
If we want to write the solution in terms of the component functions, let C = (c₁, c₂)^T to obtain

x₁(t) = 5c₁e^{-t} - 2c₂e^{6t} + (3/10)e^t + (35/37)sin(t) - (25/37)cos(t),
x₂(t) = c₁e^{-t} + c₂e^{6t} + (1/10)e^t + (1/37)sin(t) - (6/37)cos(t).
Although the coefficient matrix A in this example was constant, this is not a requirement of the variation of parameters method.

In the example we needed Ω⁻¹(t). Standard software packages will produce this inverse. We could also proceed by reducing Ω(t) and recording the row operations, beginning with the identity matrix, as discussed in Section 7.8.1:

[[5e^{-t}, -2e^{6t} | 1, 0], [e^{-t}, e^{6t} | 0, 1]]

add -(1/5)(row 1) to row 2:

[[5e^{-t}, -2e^{6t} | 1, 0], [0, (7/5)e^{6t} | -1/5, 1]]

multiply row 1 by (1/5)e^t:

[[1, -(2/5)e^{7t} | (1/5)e^t, 0], [0, (7/5)e^{6t} | -1/5, 1]]

multiply row 2 by (5/7)e^{-6t}:

[[1, -(2/5)e^{7t} | (1/5)e^t, 0], [0, 1 | -(1/7)e^{-6t}, (5/7)e^{-6t}]]

add (2/5)e^{7t}(row 2) to row 1:

[[1, 0 | (1/7)e^t, (2/7)e^t], [0, 1 | -(1/7)e^{-6t}, (5/7)e^{-6t}]]

Since the first two columns have been reduced to I₂, the last two columns form Ω⁻¹(t).
Variation of Parameters and the Laplace Transform. There is a connection between the variation of parameters method and the Laplace transform. Suppose we want a particular solution Ψₚ of X' = AX + G, in which A is an n×n real matrix. The variation of parameters method is to find a particular solution Ψₚ(t) = Ω(t)U(t), where Ω(t) is a fundamental matrix for X' = AX. Explicitly,

U(t) = ∫ Ω⁻¹(t)G(t) dt.

We can choose a particular U(t) by carrying out this integration from 0 to t:

U(t) = ∫₀ᵗ Ω⁻¹(s)G(s) ds.

Then

Ψₚ(t) = Ω(t) ∫₀ᵗ Ω⁻¹(s)G(s) ds = ∫₀ᵗ Ω(t)Ω⁻¹(s)G(s) ds.

In this equation Ω can be any fundamental matrix for X' = AX. In particular, suppose we choose Ω(t) = e^{At}. This is sometimes called the transition matrix for X' = AX, since it is a fundamental matrix such that Ω(0) = Iₙ. Now Ω⁻¹(s) = e^{-As}, so

Ω(t)Ω⁻¹(s) = e^{At}e^{-As} = e^{A(t-s)} = Ω(t - s)

and

Ψₚ(t) = ∫₀ᵗ Ω(t - s)G(s) ds.

This equation has the same form as the Laplace transform convolution of Ω and G, except that in the current setting these are matrix functions. Now define the Laplace transform of a matrix to be the matrix obtained by taking the Laplace transform of each of its elements. This extended Laplace transform has many of the same computational properties as the Laplace transform for scalar functions. In particular, we can define the convolution integral

Ω(t) * G(t) = ∫₀ᵗ Ω(t - s)G(s) ds.

In terms of this convolution,

Ψₚ(t) = Ω(t) * G(t).

This is a general formula for a particular solution of X' = AX + G when Ω(t) = e^{At} is the transition matrix.
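The convolution formula is straightforward to evaluate numerically once the transition matrix is known. A sketch with a hypothetical A whose exponential has a simple closed form (the matrix, forcing term, and closed-form integral below are illustrative, not from the text):

```python
import numpy as np

# Psi_p(t) = integral_0^t Omega(t-s) G(s) ds with transition matrix
# Omega(t) = e^{At}, for a hypothetical A with double eigenvalue 2.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
expAt = lambda t: np.exp(2*t) * np.array([[1.0, t], [0.0, 1.0]])  # closed form
G = lambda t: np.array([0.0, 1.0])

# Transition-matrix properties: Omega(0) = I, Omega(t)Omega(s) = Omega(t+s)
assert np.allclose(expAt(0.0), np.eye(2))
assert np.allclose(expAt(0.3) @ expAt(0.4), expAt(0.7))

def psi_p(t, n=4001):
    """Convolution integral by the trapezoid rule."""
    s = np.linspace(0.0, t, n)
    vals = np.array([expAt(t - si) @ G(si) for si in s])
    ds = s[1] - s[0]
    return ds * (0.5*vals[0] + vals[1:-1].sum(axis=0) + 0.5*vals[-1])

# Compare against the exact integral for this A and G
t0 = 0.8
exact = np.array([np.exp(2*t0)*(t0/2 - 0.25) + 0.25,
                  (np.exp(2*t0) - 1) / 2])
ok = np.allclose(psi_p(t0), exact, atol=1e-6)
```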
EXAMPLE 10.16
Consider the system

X' = [[1, -4], [1, 5]] X + (e^{2t}, t)^T.

We find that

Ω(t) = e^{At} = [[(1 - 2t)e^{3t}, -4te^{3t}], [te^{3t}, (1 + 2t)e^{3t}]].

A particular solution of X' = AX + G is given by

Ψₚ(t) = ∫₀ᵗ Ω(t - s)G(s) ds
      = ∫₀ᵗ [[(1 - 2(t-s))e^{3(t-s)}, -4(t-s)e^{3(t-s)}], [(t-s)e^{3(t-s)}, (1 + 2(t-s))e^{3(t-s)}]] (e^{2s}, s)^T ds
      = ( ∫₀ᵗ [(1 - 2t + 2s)e^{3t}e^{-s} - 4(t - s)e^{3t}se^{-3s}] ds,
          ∫₀ᵗ [(t - s)e^{3t}e^{-s} + (1 + 2t - 2s)e^{3t}se^{-3s}] ds )^T
      = ( -3e^{2t} + (89/27)e^{3t} - (22/9)te^{3t} - (4/9)t - 8/27,
          e^{2t} - (28/27)e^{3t} + (11/9)te^{3t} - (1/9)t + 1/27 )^T.

The general solution of X' = AX + G is

X(t) = [[(1 - 2t)e^{3t}, -4te^{3t}], [te^{3t}, (1 + 2t)e^{3t}]] C
       + ( -3e^{2t} + (89/27)e^{3t} - (22/9)te^{3t} - (4/9)t - 8/27,
           e^{2t} - (28/27)e^{3t} + (11/9)te^{3t} - (1/9)t + 1/27 )^T.
10.3.2 Solution of X' = AX + G by Diagonalizing A

Consider the case that A is a constant, diagonalizable matrix. Then A has n linearly independent eigenvectors. These form columns of a nonsingular matrix P such that

P⁻¹AP = D = [[λ₁, 0, ..., 0], [0, λ₂, ..., 0], ..., [0, 0, ..., λₙ]],

with the eigenvalues down the main diagonal in the order corresponding to the eigenvector columns of P. As we did in the homogeneous case X' = AX, make the change of variables X = PZ. Then the system X' = AX + G becomes

X' = PZ' = A(PZ) + G, or PZ' = (AP)Z + G.

Multiply this equation on the left by P⁻¹ to get

Z' = (P⁻¹AP)Z + P⁻¹G = DZ + P⁻¹G.

This is an uncoupled system of the form

z₁' = λ₁z₁ + f₁(t),
z₂' = λ₂z₂ + f₂(t),
...
zₙ' = λₙzₙ + fₙ(t),

where

P⁻¹G(t) = (f₁(t), f₂(t), ..., fₙ(t))^T.

Solve these n first-order differential equations independently, form Z(t) = (z₁(t), z₂(t), ..., zₙ(t))^T, and then the solution of X' = AX + G is X(t) = PZ(t).

Unlike diagonalization in solving the homogeneous system X' = AX, in this nonhomogeneous case we must explicitly calculate P⁻¹ in order to determine P⁻¹G(t), the nonhomogeneous term in the transformed system.
EXAMPLE 10.17
Consider

X' = [[3, 3], [1, 5]] X + (8, 4e^{3t})^T.

The eigenvalues of A are 2 and 6, with eigenvectors, respectively, (-3, 1)^T and (1, 1)^T. Let

P = [[-3, 1], [1, 1]].

Then

P⁻¹AP = [[2, 0], [0, 6]].

Compute

P⁻¹ = [[-1/4, 1/4], [1/4, 3/4]].

The transformation X = PZ transforms the original system into

Z' = [[2, 0], [0, 6]] Z + P⁻¹ (8, 4e^{3t})^T = [[2, 0], [0, 6]] Z + (-2 + e^{3t}, 2 + 3e^{3t})^T.

This is the uncoupled system

z₁' = 2z₁ - 2 + e^{3t},
z₂' = 6z₂ + 2 + 3e^{3t}.

Solve these linear first-order differential equations independently:

z₁(t) = c₁e^{2t} + e^{3t} + 1,
z₂(t) = c₂e^{6t} - e^{3t} - 1/3.

Then

X(t) = PZ(t) = [[-3, 1], [1, 1]] (c₁e^{2t} + e^{3t} + 1, c₂e^{6t} - e^{3t} - 1/3)^T
     = [[-3e^{2t}, e^{6t}], [e^{2t}, e^{6t}]] C + (-4e^{3t} - 10/3, 2/3)^T.   (10.7)
This is the general solution of X' = AX + G. It is the general solution of X' = AX, plus a particular solution of X' = AX + G. Indeed,

[[-3e^{2t}, e^{6t}], [e^{2t}, e^{6t}]]

is a fundamental matrix for the homogeneous system X' = AX.

To illustrate the solution of an initial value problem, suppose we want the solution of

X' = [[3, 3], [1, 5]] X + (8, 4e^{3t})^T;   X(0) = (2, -7)^T.

Since we have the general solution (10.7) of this system, all we need to do is determine C so that this initial condition is satisfied. We need

X(0) = [[-3, 1], [1, 1]] C + (-4 - 10/3, 2/3)^T = (2, -7)^T.

This is the equation

PC = (2 + 22/3, -7 - 2/3)^T = (28/3, -23/3)^T.
We already have P⁻¹, so

C = P⁻¹ (28/3, -23/3)^T = [[-1/4, 1/4], [1/4, 3/4]] (28/3, -23/3)^T = (-17/4, -41/12)^T.

The initial value problem has the unique solution

X(t) = [[-3e^{2t}, e^{6t}], [e^{2t}, e^{6t}]] (-17/4, -41/12)^T + (-4e^{3t} - 10/3, 2/3)^T
     = ( (51/4)e^{2t} - (41/12)e^{6t} - 4e^{3t} - 10/3, -(17/4)e^{2t} - (41/12)e^{6t} + 2/3 )^T.
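A check of this initial value problem's solution (system, initial data, and closed form as above):

```python
import numpy as np

# X' = [[3,3],[1,5]]X + (8, 4e^{3t})^T,  X(0) = (2, -7)^T.
A = np.array([[3.0, 3.0],
              [1.0, 5.0]])
G = lambda t: np.array([8.0, 4.0*np.exp(3*t)])

def X(t):
    return np.array([
        51/4*np.exp(2*t) - 41/12*np.exp(6*t) - 4*np.exp(3*t) - 10/3,
        -17/4*np.exp(2*t) - 41/12*np.exp(6*t) + 2/3,
    ])

# Initial condition
ic_ok = np.allclose(X(0.0), [2.0, -7.0])

# Differential equation (centered-difference check)
t0, h = 0.4, 1e-6
deriv = (X(t0 + h) - X(t0 - h)) / (2*h)
ode_ok = np.allclose(deriv, A @ X(t0) + G(t0), atol=1e-4)
```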
SECTION 10.3 PROBLEMS
2.
( -2
1 )' 24
e3et /
(1
3.
7 1
-1 2e 6t 5 ) ' ( 6te6i )
4.
2 0 0
0 6 4
5.
1 4 0 -1
0 3 0 2
0 -4 -2
,
0 0 3 9
0 0 0 1
e2r cos(3t) -2 JI -2
8.
3
2 ) ' (
10t
-4 2e' -3 )'( 2 -3 1 0 2 4 1 1, 0 01
/
9. / 1 -3 0 to2` 6 3 -5 0 I , to 2` ) ; ( 2 4 7 -2 te e 2t 3

10. Recall that a transition matrix for X' = AX is a fundamental matrix Ω(t) such that Ω(0) = Iₙ. (a) Prove that, for a transition matrix Ω(t), Ω⁻¹(t) = Ω(-t) and Ω(t + s) = Ω(t)Ω(s) for real s and t. (b) Suppose Ω(t) is any fundamental matrix for X' = AX. Prove that Φ(t) = Ω(t)Ω⁻¹(0) is a transition matrix. That is, Φ(t) is a fundamental matrix and Φ(0) = Iₙ.
0 -2e` 0 et
6 ter -3e' :6t 5 / 2e-5t (1+5t)e-5' 12. xi = -10x2 , x2 = 2xi -10x2; I e_5t to -sr 2 13. x'1 = 5x 1 -4x2 +4x3 , x2 = 12x1 - 11x2 +12x3 ,
11. x* = 4x 1 +2x2, x2 = 3 x1 +3 x2 ;
2
2
( 5 5 7. ( 4 2
,+
In each of Problems 11, 12 and 13, verify that the given matrix is a fundamental matrix for the system and use it to find a transition matrix.
In each of Problems 6 through 9, use variation of parameters to solve the initial value problem X' = AX + G; X(0) = X⁰, with A, G and X⁰ as given (in that order).
e6t
12
In each of Problems 1 through 5, use variation of parameters to find the general solution of X' = AX + G, with A and G as given.

1.
41
x3
)' (33 ) ) e' )' (
10e2f 6e2t _e 2t
3 ) 5 11 -2
=4x 1 -4x2 +5x3;
e-'' 3e -3 ' e-3'
et 0 _e t
0 et et
In each of Problems 14 through 18, find the general solution of the system by diagonalization. The general solutions of the associated homogeneous systems X' = AX were requested in Problems 36 through 40 of Section 10.2.
14. x₁' = -2x₁ + x₂, x₂' = -4x₁ + 3x₂ + 10cos(t)
15. x₁' = 3x₁ + 3x₂ + 8, x₂' = x₁ + 5x₂ + 4e^{3t}
16. x₁' = x₁ + x₂ + 6e^{3t}, x₂' = x₁ + x₂ + 4
17. x₁' = 6x₁ + 5x₂ - 4cos(3t), x₂' = x₁ + 2x₂ + 8
18. x₁' = 3x₁ - 2x₂ + 3e^{2t}, x₂' = 9x₁ - 3x₂ + e^{2t}

In each of Problems 19 through 23, solve the initial value problem by diagonalization.

19. x₁' = x₁ + x₂ + 6e^{2t}, x₂' = x₁ + x₂ + 2e^{2t}; x₁(0) = 6, x₂(0) = 0
Qualitative Methods and Systems of Nonlinear Differential Equations
11.1 Nonlinear Systems and Existence of Solutions

The preceding chapter was devoted to matrix methods for solving systems of differential equations. Matrices are suited to linear problems. In algebra, the equations we solve by matrix methods are linear, and in differential equations the systems we solve by matrices are also linear. However, many interesting problems in mathematics, the sciences, engineering, economics, business and other areas involve systems of nonlinear differential equations, or nonlinear systems. We will consider such systems having the special form

x₁'(t) = F₁(t, x₁, x₂, ..., xₙ),
x₂'(t) = F₂(t, x₁, x₂, ..., xₙ),
...
xₙ'(t) = Fₙ(t, x₁, x₂, ..., xₙ).   (11.1)
This assumes that each equation of the system can be written with a first derivative isolated on one side, and a function of t and the unknown functions x₁(t), ..., xₙ(t) on the other. Initial conditions for this system have the form

x₁(t₀) = x₁⁰, x₂(t₀) = x₂⁰, ..., xₙ(t₀) = xₙ⁰,   (11.2)

in which t₀ is a given number and x₁⁰, ..., xₙ⁰ are given numbers. An initial value problem consists of finding a solution of the system (11.1) satisfying the initial conditions (11.2).

We will state an existence/uniqueness result for this initial value problem. In the statement, an open rectangular parallelepiped in (n+1)-dimensional t, x₁, ..., xₙ-space consists of all points (t, x₁, ..., xₙ) in R^{n+1} whose coordinates satisfy inequalities

a < t < b, a₁ < x₁ < b₁, ..., aₙ < xₙ < bₙ.
If n = 1 this is an open rectangle in the t, x plane, and if n = 2 these points form an open three-dimensional box in 3-space. "Open" means that only points in the interior of the parallelepiped, and no points on the bounding faces, are included.

THEOREM 11.1 Existence/Uniqueness for Nonlinear Systems
Let F₁, ..., Fₙ and their first partial derivatives be continuous at all points of an open rectangular parallelepiped K in R^{n+1}. Let (t₀, x₁⁰, ..., xₙ⁰) be a point of K. Then there exists a positive number h such that the initial value problem consisting of the system (11.1) and the initial conditions (11.2) has a unique solution

x₁ = φ₁(t), x₂ = φ₂(t), ..., xₙ = φₙ(t)

defined for t₀ - h < t < t₀ + h. ■
Many systems we encounter are nonlinear and cannot be solved in terms of elementary functions. This is why we need an existence theorem, and why we will shortly develop qualitative methods to determine properties of solutions without having them explicitly in hand. As we develop ideas and methods for analyzing nonlinear systems, it will be helpful to have some examples to fall back on and against which to measure new ideas. Here are two examples that are important and that come with some physical intuition about how solutions should behave.
EXAMPLE 11.1 The Simple Damped Pendulum

We will derive a system of differential equations describing the motion of a simple pendulum, as shown in Figure 11.1. Although we have some intuition about how a pendulum bob should move, nevertheless the system of differential equations describing this motion is nonlinear and cannot be solved in closed form.

Suppose the pendulum bob has mass m, and is at the end of a rod of length L. The rod is assumed to be so light that its weight does not figure into the motion of the bob. It serves only to constrain the bob to remain at fixed distance L from the point of suspension. The position of the bob at any time is described by its displacement angle θ(t) from the vertical. At some time we call t = 0 the bob is displaced by an angle θ0 and released from rest.

To describe the motion of the bob, we must analyze the forces acting on it. Gravity acts downward with a force of magnitude mg. The damping force (air resistance, friction of the bar at its pivot point) is assumed to have magnitude cθ'(t) for some positive constant c. By Newton's laws of motion, the rate of change of angular momentum about any point, with respect to time,
FIGURE 11.1 Simple, damped pendulum.
11.1 Nonlinear Systems and Existence of Solutions
equals the moment of the resultant force about that point. The angular momentum is mL²θ'(t). From the diagram, the horizontal distance between the bob and the vertical center position at time t is L sin(θ(t)). Then

mL²θ''(t) = -cLθ'(t) - mgL sin(θ(t)).

The negative signs on the right take into account the fact that, if the bob is displaced to the right, these forces tend to make the bob rotate clockwise, which is the negative orientation. It is customary to write this differential equation as

θ'' + γθ' + ω² sin(θ) = 0,   (11.3)

with γ = c/(mL) and ω² = g/L. Convert this second-order equation to a system as follows. Let x = θ, y = θ'. Then the pendulum equation (11.3) becomes

x' = y, y' + γy + ω² sin(x) = 0,

or

x' = y
y' = -ω² sin(x) - γy.

This is a nonlinear system because of the sin(x) term. We cannot write a solution of this system in closed form. However, we will soon have methods to analyze the behavior of solutions and hence the motion itself. In matrix form, the pendulum system is

X' = [0, 1; 0, -γ] X + [0; -ω² sin(x)],

in which X = [x; y].
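Although the pendulum system has no closed-form solution, it is easy to integrate numerically. The following Python sketch (a modern aid, not part of the text) uses a hand-rolled fourth-order Runge-Kutta step, with the values ω² = 10 and γ = 0.3 that are used later for Figure 11.5, and confirms the expected physical behavior: a bob released from rest near the bottom settles back toward the rest position.

```python
import math

def pendulum(state, gamma=0.3, omega2=10.0):
    # x = displacement angle theta, y = angular velocity theta'
    x, y = state
    return (y, -omega2 * math.sin(x) - gamma * y)

def rk4_step(f, state, h):
    # one classical fourth-order Runge-Kutta step for an autonomous system
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + (h / 6.0) * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

# release the bob from rest at 0.5 radians; damping should kill the swing
state = (0.5, 0.0)
for _ in range(3000):              # integrate from t = 0 to t = 30
    state = rk4_step(pendulum, state, 0.01)

print(state)                       # close to the rest position (0, 0)
assert abs(state[0]) < 0.05 and abs(state[1]) < 0.5
```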
EXAMPLE 11.2 Nonlinear Spring

Consider an object of mass m attached to a spring. If the object is displaced and released, its motion is governed by Hooke's law, which states that the force exerted on the mass by the spring is F(r) = -kr, with k a positive constant and r the distance displaced from the equilibrium position (the position at which the object is at rest). Figure 2.4 shows a typical such mass/spring system, with r used here for the displacement instead of the y used in Chapter 2. This is a linear model, since F is a linear function (a constant times r to the first power).

The spring model becomes nonlinear if F(r) is nonlinear. Simple nonlinear models are achieved by adding terms to -kr. What kind of terms should we add? Intuition tells us that the spring should not care whether we displace an object left or right before releasing it. Since displacements in opposite directions carry opposite signs, this means that we want F(-r) = -F(r), so F should be an odd function. This suggests adding multiples of odd powers of r. The simplest such model is

F(r) = -kr + ar³.
If we also allow a damping force which in magnitude is proportional to the velocity, then by Newton's law this spring motion is governed by the second-order differential equation

mr'' = -kr + ar³ - cr'.

To convert this to a system, let x = r and y = r'. The system is

x' = y,
y' = -(k/m)x + (a/m)x³ - (c/m)y.

In matrix form, this system is

X' = [0, 1; -k/m, -c/m] X + [0; ax³/m].
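The nonlinear spring system can also be explored numerically. The following Python sketch (not from the text) integrates the damped nonlinear spring with the constants a = 0.2, k/m = 4, and c/m = 2 that are used later for Figure 11.6, and shows the displaced mass returning toward equilibrium.

```python
def spring(state, km=4.0, cm=2.0, am=0.2):
    # x = displacement r, y = velocity r'; force -kr + a r^3, damping -c r'
    x, y = state
    return (y, -km * x + am * x ** 3 - cm * y)

def rk4_step(f, state, h):
    # one classical fourth-order Runge-Kutta step
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + (h / 6.0) * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

state = (1.0, 0.0)                 # stretch the spring one unit, release
for _ in range(1500):              # integrate from t = 0 to t = 15
    state = rk4_step(spring, state, 0.01)

print(state)                       # the mass returns toward equilibrium
assert abs(state[0]) < 0.05 and abs(state[1]) < 0.05
```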
PROBLEMS

1. Apply the existence/uniqueness theorem to the system for the simple damped pendulum, with initial conditions x(0) = a, y(0) = b. What are the physical interpretations of the initial conditions? Are there any restrictions on the numbers a and b in applying the theorem to assert the existence of a unique solution in some interval (-h, h)?

2. Apply the existence/uniqueness theorem to the system for the nonlinear spring system, with initial conditions x(0) = a, y(0) = b. What are the physical interpretations of the initial conditions? Are there any restrictions on the numbers a and b in applying the theorem to assert the existence of a unique solution in some interval (-h, h)?

3. Suppose the driving force for the nonlinear spring has additional terms, say F(r) = -kr + ar³ + βr⁵. Does this problem still have a unique solution in some interval (-h, h)?

11.2 The Phase Plane, Phase Portraits and Direction Fields

Throughout this chapter we will consider systems of two first-order differential equations in two unknowns. In this case it is convenient to denote the variables as x and y rather than x1 and x2. Thus consider the system

x'(t) = f(x(t), y(t)),
y'(t) = g(x(t), y(t)),
(11.4)
in which f and g are continuous, with continuous first partial derivatives, in some part of the plane. We often write this system as

X' = F(x(t), y(t)),

where

X = [x; y]  and  F(x, y) = [f(x, y); g(x, y)].

The system (11.4) is a special case of the system (11.1). We assume in (11.4) that neither f nor g has an explicit dependence on t. Rather, f and g depend only on x and y, and t appears only through the dependencies of these two variables on t. We refer to such a system as autonomous.
Working in the plane will allow us the considerable advantage of geometric intuition. If x = φ(t), y = ψ(t) is a solution of (11.4), the point (φ(t), ψ(t)) traces out a curve in the plane as t varies. Such a curve is called a trajectory, or orbit, of the system. A copy of the plane containing drawings of trajectories is called a phase portrait for the system (11.4). In this context, the x, y plane is called the phase plane.

We may consider trajectories as oriented, with (φ(t), ψ(t)) moving along the trajectory in a certain direction as t increases. If we think of t as time, then (φ(t), ψ(t)) traces out the path of motion of a particle, moving under the influence of the system (11.4), as time increases. In the case of orbits that are closed curves, we take counterclockwise orientation as the positive orientation, unless specific exception is made.

In some phase portraits, short arrows are also drawn. The arrow at any point is along the tangent to the trajectory through that point, and in the direction of motion along this trajectory. This type of drawing combines the phase portrait with a direction field, and gives an overall sense of the flow of the trajectories, as well as graphs of some specific trajectories.

One way to construct trajectories is to write

dy/dx = (dy/dt)/(dx/dt) = g(x, y)/f(x, y).

Because the system is autonomous, this is a differential equation in x and y, and we can attempt to solve it and graph solutions. If the system is nonautonomous, then f or g may depend explicitly on t, and we cannot use this strategy to generate trajectories.
EXAMPLE 11.3

Consider the autonomous system

x' = y = f(x, y)
y' = x²y² = g(x, y).

Then

dy/dx = x²y²/y = x²y,

a separable differential equation we write as

(1/y) dy = x² dx.

Integrate to get

ln |y| = x³/3 + C,

or y = Ae^(x³/3). Graphs of these curves for various values of A form trajectories of this system, some of which are shown in Figure 11.2.
FIGURE 11.2 Some trajectories of the system x' = y, y' = x²y².
EXAMPLE 11.4

For the autonomous system

x' = -2y - x sin(xy),
y' = 2x + y sin(xy)   (11.5)

we have

dy/dx = -(2x + y sin(xy))/(2y + x sin(xy)).

This is not separable, but we can write

(2x + y sin(xy)) dx + (2y + x sin(xy)) dy = 0,

which is exact. We find the potential function H(x, y) = x² + y² - cos(xy), and the general solution of this differential equation is defined implicitly by

H(x, y) = x² + y² - cos(xy) = C,

in which C is an arbitrary constant. Figure 11.3 shows a phase portrait for this system (11.5), consisting of graphs of these curves for various choices of C.

Usually we will not be so fortunate as to be able to solve dy/dx = g(x, y)/f(x, y) in closed form. In such a case we may still be able to use a software package to generate a phase portrait. Figure 11.4 is a phase portrait of the system

x' = x cos(y)
y' = x² - y³ + sin(x - y)
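Returning to the system (11.5): since its trajectories lie on level curves of H, the value of H(x(t), y(t)) should stay constant along any numerically computed orbit. A quick Python check (a sketch, not from the text; the initial point is arbitrary):

```python
import math

def f(state):
    # the system (11.5): x' = -2y - x sin(xy), y' = 2x + y sin(xy)
    x, y = state
    s = math.sin(x * y)
    return (-2 * y - x * s, 2 * x + y * s)

def H(x, y):
    # the potential function found from the exact equation
    return x * x + y * y - math.cos(x * y)

def rk4_step(state, h):
    k1 = f(state)
    k2 = f(tuple(v + 0.5 * h * k for v, k in zip(state, k1)))
    k3 = f(tuple(v + 0.5 * h * k for v, k in zip(state, k2)))
    k4 = f(tuple(v + h * k for v, k in zip(state, k3)))
    return tuple(v + (h / 6.0) * (a + 2 * b + 2 * c + d)
                 for v, a, b, c, d in zip(state, k1, k2, k3, k4))

state = (1.0, 0.5)
H0 = H(*state)
for _ in range(400):
    state = rk4_step(state, 0.005)   # follow the orbit to t = 2

print(H0, H(*state))                  # H is (numerically) constant
assert abs(H(*state) - H0) < 1e-6
```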
generated in this way. Figure 11.5 (p. 410) is a phase portrait for a damped pendulum with ω² = 10 and γ = 0.3, and Figure 11.6 (p. 410) is a phase portrait for a nonlinear spring system with a = 0.2, k/m = 4 and c/m = 2. We will consider phase portraits for the damped pendulum and nonlinear spring system in more detail when we treat almost linear systems.

If x = φ(t), y = ψ(t) is a solution of (11.4), and c is a constant, we call the pair φ(t+c), ψ(t+c) a translation of φ and ψ. We will use the following fact.
FIGURE 11.5 Phase portrait for a damped pendulum.

FIGURE 11.6 Phase portrait for a nonlinear spring.
LEMMA 11.1

A translation of a solution of the system (11.4) is also a solution of this system.

Proof  Suppose x = φ(t), y = ψ(t) is a solution. This means that

x'(t) = φ'(t) = f(φ(t), ψ(t)) and y'(t) = ψ'(t) = g(φ(t), ψ(t)).

Let x̃(t) = φ(t+c) and ỹ(t) = ψ(t+c) for some constant c. By the chain rule,

dx̃/dt = (dφ(t+c)/d(t+c))(d(t+c)/dt) = φ'(t+c) = f(φ(t+c), ψ(t+c)) = f(x̃(t), ỹ(t))

and, similarly,

dỹ/dt = ψ'(t+c) = g(φ(t+c), ψ(t+c)) = g(x̃(t), ỹ(t)).

Therefore x = x̃(t), y = ỹ(t) is also a solution. ■

We may think of a translation as a reparametrization of the trajectory, which of course does not alter the fact that it is a trajectory. If we think of the point (φ(t), ψ(t)) as moving along the orbit, a translation simply means rescheduling the point to change the times at which it passes through given points of the orbit. We will need the following facts about trajectories.

THEOREM 11.2
Let f and g be continuous, with continuous first partial derivatives, in the (x, y) plane. Then,

1. If (a, b) is a point in the plane, there is a trajectory through (a, b).
2. Two trajectories passing through the same point must be translations of each other.

Proof  Conclusion (1) follows immediately from Theorem 11.1, since the initial value problem

x' = f(x, y), y' = g(x, y); x(0) = a, y(0) = b

has a solution, and the graph of this solution is a trajectory through (a, b).

For (2), suppose x = φ1(t), y = ψ1(t) and x = φ2(t), y = ψ2(t) are trajectories of the system (11.4). Suppose both trajectories pass through (a, b). Then for some t0,

φ1(t0) = a and ψ1(t0) = b,

and for some t1,

φ2(t1) = a and ψ2(t1) = b.

Let c = t0 - t1 and define x̃(t) = φ1(t+c) and ỹ(t) = ψ1(t+c). Then x = x̃(t), y = ỹ(t) is a trajectory, by Lemma 11.1. Further,

x̃(t1) = φ1(t0) = a and ỹ(t1) = ψ1(t0) = b.

Therefore x = x̃(t), y = ỹ(t) is the unique solution of the initial value problem

x' = f(x, y), y' = g(x, y); x(t1) = a, y(t1) = b.

But x = φ2(t), y = ψ2(t) is also the solution of this problem. Therefore, for all t,

φ2(t) = x̃(t) = φ1(t+c) and ψ2(t) = ỹ(t) = ψ1(t+c).

This proves that the two trajectories x = φ1(t), y = ψ1(t) and x = φ2(t), y = ψ2(t) are translations of each other. ■
If we think of translations of trajectories as the same trajectory (just a change in the parameter), then conclusion (2) states that distinct trajectories cannot cross each other, since a crossing would violate uniqueness of the solution of the system that passes through the point of intersection. Conclusion (2) of Theorem 11.2 does not hold for systems that are not autonomous.
EXAMPLE 11.5

Consider the system

x'(t) = (1/t)x = f(t, x, y)
y'(t) = -(1/t)y + x = g(t, x, y).

This is nonautonomous, since f and g have explicit t-dependencies. We can solve this system. The first equation is separable. Write

(1/x) dx = (1/t) dt

to obtain x(t) = ct. Substitute this into the second equation to get

y' + (1/t)y = ct,

a linear first-order differential equation. This equation can be written

ty' + y = ct²,

or

(ty)' = ct².

Integrate to get

ty = (c/3)t³ + d.

Hence

y(t) = (c/3)t² + d/t.

Now observe that conclusion (2) of Theorem 11.2 fails for this system. For example, for any number t0, the trajectory

x(t) = t/t0, y(t) = t²/(3t0) - t0²/(3t)

passes through (1, 0) at time t0. Because t0 is arbitrary, this gives many trajectories passing through (1, 0) at different times, and these trajectories are not translations of each other.

We now have some of the vocabulary and tools needed to analyze 2 x 2 nonlinear autonomous systems of differential equations. First, however, we will reexamine linear systems, which we know how to solve explicitly. This will serve two purposes. It will give us some experience with phase portraits, as well as insight into significant features that solutions of a system might have. In addition, we will see shortly that some nonlinear systems can be thought of as perturbations of linear systems (that is, as linear systems with "small" nonlinear terms added). In such a case, knowledge of solutions of the linear system yields important information about solutions of the nonlinear system.
In each of Problems 1 through 6, find the general solution of the system and draw a phase portrait containing at least six trajectories of the system.

1. x' = 4x + y, y' = -17x - 4y
2. x' = 2x, y' = 8x + 2y
3. x' = 4x - 7y, y' = 2x - 5y
4. x' = 3x - 2y, y' = 10x - 5y
5. x' = 5x - 2y, y' = 4y
6. x' = -4x - 6y, y' = 2x - 11y

In each of Problems 7 through 12, use the method of Examples 11.3 and 11.4 to draw some integral curves (at least six) for the system.

7. x' = 9y, y' = -4x
8. x' = 2xy, y' = y² - x²
9. x' = y + 2, y' = x - 1
10. x' = csc(x), y' = y
11. x' = x, y' = x + y
12. x' = x², y' = y

13. How would phase portraits for the following systems compare with each other?
11.3 Phase Portraits of Linear Systems

In preparation for studying the nonlinear autonomous system (11.4), we will thoroughly analyze the linear system

X' = AX,   (11.6)

in which A is a 2 x 2 real matrix and X = [x; y]. We assume that A is nonsingular, so the equation AX = 0 has only the trivial solution.

For the linear system X' = AX, we actually have the solutions in hand. We will examine these solutions to prepare for the analysis of nonlinear systems, for which we are unlikely to have explicit solutions.

The origin (0, 0) stands apart from other points in the plane in the following respect. The trajectory through the origin is the solution of

X' = AX;  X(0) = O = [0; 0],

and this is the constant trajectory x(t) = 0, y(t) = 0 for all t. The graph of this trajectory is the single point (0, 0). For this reason, the origin is called an equilibrium point of the system, and the constant solution X = 0 is called an equilibrium solution. The origin is also called a critical point of X' = AX. By Theorem 11.1, no other trajectory can pass through this point. As we proceed, observe how the behavior of trajectories of X' = AX near this critical point is the key to understanding the behavior of trajectories throughout the entire plane. The critical point, then, will be the focal point in drawing a phase portrait of the system and analyzing the behavior of solutions.
We will draw the phase portrait for X' = AX in all cases that can occur. Because the general solution of (11.6) is completely determined by the eigenvalues of A, we will use these eigenvalues to distinguish cases.

Case 1: Real, distinct eigenvalues λ and μ of the same sign.

Let associated eigenvectors be, respectively, E1 and E2. Since λ and μ are distinct, E1 and E2 are linearly independent. The general solution is

X(t) = [x(t); y(t)] = c1E1e^(λt) + c2E2e^(μt).

Since E1 and E2 are vectors in the plane, we can represent them as arrows from the origin, as in Figure 11.7. Draw half-lines L1 and L2 from the origin along these eigenvectors, respectively, as shown. These half-lines are parts of trajectories, and so do not pass through the origin, which is itself a trajectory.

FIGURE 11.7 Eigenvectors E1 and E2 of X' = AX, for distinct eigenvalues of A.

Now consider subcases.

Case 1(a): The eigenvalues are negative, say λ < μ < 0.

Since e^(λt) → 0 and e^(μt) → 0 as t → ∞, then X(t) → (0, 0) and every trajectory approaches the origin as t → ∞. However, this can happen in three ways, depending on an initial point P0 : (x0, y0) we choose for a trajectory to pass through at time t = 0. Here are the three possibilities.

If P0 is on L1, then c2 = 0 and

X(t) = c1E1e^(λt),

which for any t is a scalar multiple of E1. The trajectory through P0 is the half-line from the origin along L1 through P0, and the arrows toward the origin indicate that points on this trajectory approach the origin along L1 as t increases. This is the trajectory T1 of Figure 11.8.

If P0 is on L2, then c1 = 0 and now X(t) = c2E2e^(μt). This trajectory is a half-line from the origin along L2 through P0. Again, the arrows indicate that points on this trajectory also approach the origin along L2 as t → ∞. This is the trajectory T2 of Figure 11.8.

If P0 is on neither L1 nor L2, then the trajectory is a curve through P0 having the parametric form

X(t) = c1E1e^(λt) + c2E2e^(μt).

Write this as

X(t) = e^(μt)[c1E1e^((λ-μ)t) + c2E2].
FIGURE 11.8 Trajectories along E1, along E2, or asymptotic to E2, in the case λ < μ < 0.

Because λ - μ < 0, e^((λ-μ)t) → 0 as t → ∞ and the term c1E1e^((λ-μ)t) exerts increasingly less influence on X(t). In this case, X(t) still approaches the origin, but also approaches the line L2, as t → ∞. A typical such trajectory is shown as the curve T3 of Figure 11.8.

A phase portrait of X' = AX in this case therefore has some trajectories approaching the origin along the lines through the eigenvectors of A and all others approaching the origin along curves that approach one of these lines asymptotically. In this case the origin is called a nodal sink of the system X' = AX. We can think of particles flowing along the trajectories and toward the origin. The following example and phase portrait are typical of nodal sinks.
EXAMPLE 11.6

Consider the system X' = AX, in which

A = [-6, -2; 5, 1].

A has eigenvalues and corresponding eigenvectors

-1, [2; -5]  and  -4, [-1; 1].

In the notation of the discussion, λ = -4 and μ = -1. The general solution is

X(t) = c1 [-1; 1] e^(-4t) + c2 [2; -5] e^(-t).

L1 is the line through (-1, 1), and L2 the line through (2, -5). Figure 11.9 shows a phase portrait for this system, with the origin a nodal sink.

Case 1(b): The eigenvalues are positive, say 0 < μ < λ.

The discussion of Case 1(a) can be replicated with one change. Now e^(λt) and e^(μt) approach ∞ instead of zero as t increases. The phase portrait is like that of the previous case, except all the arrows are reversed and trajectories flow away from the origin instead of into the origin as time increases. As we might expect, now the origin is called a nodal source. Particles are flowing away from the origin. Here is a typical example of a nodal source.
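The eigenvalue computations behind these classifications are easy to check by hand or in a few lines of code. The following Python sketch (not from the text) solves the characteristic polynomial of the Example 11.6 matrix directly and confirms that both eigenvalues are negative, so the origin is a nodal sink.

```python
import math

def eigen2(a, b, c, d):
    # real eigenvalues of [[a, b], [c, d]] from the characteristic
    # polynomial: lambda^2 - (trace)lambda + det = 0
    tr, det = a + d, a * d - b * c
    root = math.sqrt(tr * tr - 4 * det)
    return (tr - root) / 2, (tr + root) / 2

# A is the matrix of x' = -6x - 2y, y' = 5x + y from Example 11.6
lam, mu = eigen2(-6.0, -2.0, 5.0, 1.0)
print(lam, mu)                       # -4.0 and -1.0

assert abs(lam + 4.0) < 1e-12 and abs(mu + 1.0) < 1e-12
# both eigenvalues negative: every trajectory tends to the origin,
# so the origin is a nodal sink
assert lam < 0 and mu < 0
```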
FIGURE 11.9 Phase portrait showing a nodal sink of x' = -6x - 2y, y' = 5x + y.
EXAMPLE 11.7

Consider the system

X' = [3, 1; 3, 5] X.

This has eigenvalues and eigenvectors

2, [1; -1]  and  6, [1; 3].

Now λ = 6 and μ = 2, and the general solution is

X(t) = c1 [1; -1] e^(2t) + c2 [1; 3] e^(6t).

Figure 11.10 shows a phase portrait for this system, exhibiting the behavior expected for a nodal source at the origin.

Case 2: Real, distinct eigenvalues of opposite sign.

Suppose the eigenvalues are λ and μ with μ < 0 < λ. The general solution still has the appearance

X(t) = c1E1e^(λt) + c2E2e^(μt),
and we start to draw a phase portrait by again drawing half-lines L1 and L2 from the origin along the eigenvectors.

If P0 is on L1, then c2 = 0 and X(t) moves on this half-line away from the origin as t increases, because λ > 0 and e^(λt) → ∞ as t → ∞.

But if P0 is on L2, then c1 = 0 and X(t) moves along this half-line toward the origin, because μ < 0 and e^(μt) → 0 as t → ∞.
The arrows along the half-lines along the eigenvectors therefore have opposite directions, toward the origin along L2 and away from the origin along L1. This is in contrast to Case 1, in which solutions starting out on the half-lines through the eigenvectors either both approached the origin or both moved away from the origin as time increased.

If P0 is on neither L1 nor L2, then the trajectory through P0 does not come arbitrarily close to the origin for any times, but rather approaches the direction determined by the eigenvector E1 as t → ∞ (in which case e^(μt) → 0) or the direction determined by E2 as t → -∞ (in which case e^(λt) → 0). The phase portrait therefore has typical trajectories as shown in Figure 11.11.

FIGURE 11.11 Typical phase portrait for a saddle point at the origin.
The lines along the eigenvectors determine four trajectories that separate the plane into four regions. A trajectory starting in one of these regions must remain in it, because distinct trajectories cannot cross each other, and such a trajectory is asymptotic to both of the lines bounding its region. The origin in this case is called a saddle point.
EXAMPLE 11.8

Consider X' = AX with

A = [-1, 3; 2, -2].

Eigenvalues and eigenvectors of A are

-4, [1; -1]  and  1, [3; 2].

The general solution is

X(t) = c1 [1; -1] e^(-4t) + c2 [3; 2] e^t,

and a phase portrait is given in Figure 11.12. In this case of a saddle point at the origin, trajectories do not enter or leave the origin, but asymptotically approach the lines determined by the eigenvectors. ■

Case 3: Equal eigenvalues.

Suppose A has the real eigenvalue λ of multiplicity 2. There are two possibilities.
FIGURE 11.12 Saddle point of x' = -x + 3y, y' = 2x - 2y.
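The saddle classification can be verified numerically. The following Python sketch (not from the text) computes eigenvalue/eigenvector pairs for the Example 11.8 matrix using the characteristic polynomial and checks that the eigenvalues have opposite signs.

```python
import math

def eigenpairs(a, b, c, d):
    # eigenvalue/eigenvector pairs of [[a, b], [c, d]] in the case of
    # real distinct eigenvalues; for b != 0, (b, lam - a) solves
    # the first row of (A - lam I)v = 0, hence is an eigenvector
    tr, det = a + d, a * d - b * c
    root = math.sqrt(tr * tr - 4 * det)
    return [(lam, (b, lam - a)) for lam in ((tr - root) / 2, (tr + root) / 2)]

# matrix of the saddle system x' = -x + 3y, y' = 2x - 2y (Example 11.8)
pairs = eigenpairs(-1.0, 3.0, 2.0, -2.0)
print(pairs)
(l1, v1), (l2, v2) = pairs
assert abs(l1 + 4.0) < 1e-12 and abs(l2 - 1.0) < 1e-12
assert l1 * l2 < 0           # opposite signs characterize the saddle

# confirm A v = lam v for each computed pair
for lam, (vx, vy) in pairs:
    assert abs(-1.0 * vx + 3.0 * vy - lam * vx) < 1e-12
    assert abs(2.0 * vx - 2.0 * vy - lam * vy) < 1e-12
```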
Case 3(a): A has two linearly independent eigenvectors E1 and E2.

Now the general solution of X' = AX is

X(t) = (c1E1 + c2E2) e^(λt).

If

E1 = [a; b]  and  E2 = [h; k],

then, in terms of components,

x(t) = (c1 a + c2 h) e^(λt), y(t) = (c1 b + c2 k) e^(λt).

Now

y(t)/x(t) = constant.

This means that all trajectories in this case are half-lines from the origin. If λ > 0, arrows along these trajectories are away from the origin, as in Figure 11.13. If λ < 0, they move toward the origin, reversing the arrows in Figure 11.13.
FIGURE 11.13 Typical proper node with positive eigenvalue of A.

The origin in Case 3(a) is called a proper node.

Case 3(b): A does not have two linearly independent eigenvectors.

In this case there is an eigenvector E and the general solution has the form

X(t) = c1(Et + W)e^(λt) + c2Ee^(λt) = [(c1W + c2E) + c1Et] e^(λt).

To visualize the trajectories, begin with arrows from the origin representing the vectors W and E. Now, for selected constants c1 and c2, draw the vector c1W + c2E, which may have various orientations relative to W and E, depending on the signs and magnitudes of c1 and c2. Some possibilities are displayed in Figure 11.14. For given c1 and c2, the vector

c1W + c2E + c1Et,

drawn as an arrow from the origin, sweeps out a straight line L as t varies over all real values. For a given t, X(t) is the vector c1W + c2E + c1Et from the origin to a point on L, with length adjusted by a factor e^(λt). If λ is negative, then this length goes to zero as t → ∞ and the vector X(t) sweeps out a curve as shown in Figure 11.15, approaching the origin tangent to E. If
FIGURE 11.14 Vectors c1W + c2E in the case of an improper node.
FIGURE 11.15 Typical trajectory near an improper node.
λ > 0, we have the same curve (now e^(λt) → 0 as t → -∞), except that the arrow indicating direction of the flow on the trajectory is reversed. The origin in this case is called an improper node of the system X' = AX. The following example has a phase portrait that is typical of improper nodes.
EXAMPLE 11.9

Let

A = [-10, 6; -6, 2].

Then A has eigenvalue -4, and every eigenvector is a real constant multiple of E = [1; 1]. A routine calculation gives W = [0; 1/6], and the general solution is

X(t) = c1 ([1; 1] t + [0; 1/6]) e^(-4t) + c2 [1; 1] e^(-4t).

Figure 11.16 is a phase portrait for this system. We can see that the trajectories approach the origin tangent to E in this case of an improper node at the origin, with negative eigenvalue for A.
FIGURE 11.16 Phase portrait for the improper node of x' = -10x + 6y, y' = -6x + 2y.
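The defining feature of this case, a repeated eigenvalue with only one independent eigenvector, can be confirmed with a short computation. The following Python sketch (not from the text) checks the Example 11.9 matrix.

```python
# matrix of x' = -10x + 6y, y' = -6x + 2y (Example 11.9)
a, b, c, d = -10.0, 6.0, -6.0, 2.0
tr, det = a + d, a * d - b * c
disc = tr * tr - 4 * det

print(tr, det, disc)        # trace -8, determinant 16, discriminant 0
assert disc == 0.0          # repeated eigenvalue lam = tr/2 = -4
lam = tr / 2

# (A - lam I) = [[-6, 6], [-6, 6]] annihilates exactly the multiples
# of E = (1, 1), so there is only one independent eigenvector
assert (a - lam) * 1 + b * 1 == 0.0
assert c * 1 + (d - lam) * 1 == 0.0
# for instance (1, 0) is not an eigenvector, so the node is improper
assert (a - lam) * 1 + b * 0 != 0.0
```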
Case 4: Complex eigenvalues with nonzero real part.

We know that the complex eigenvalues must be complex conjugates, say λ = α + iβ and μ = α - iβ. The complex eigenvectors are also conjugates. Write these, respectively, as U + iV and U - iV. Then the general solution of X' = AX is

X(t) = c1e^(αt)[U cos(βt) - V sin(βt)] + c2e^(αt)[U sin(βt) + V cos(βt)].

Suppose first that α < 0. The trigonometric terms in this solution cause X(t) to rotate about the origin as t increases, while the factor e^(αt) causes X(t) to move closer to the origin (or, equivalently, the length of the vector X(t) to decrease to zero) as t → ∞. This suggests a trajectory that spirals inward toward the origin as t increases.

Since t varies over the entire real line, taking on both negative and positive values, the trajectories when α > 0 have the same spiral appearance, but now the arrows are reversed and X(t) moves outward, away from the origin, as t → ∞.

The origin in this case is called a spiral point. When α < 0 the origin is a spiral sink because the flow defined by the trajectories is spiralling into the origin. When α > 0 the origin is a spiral source because now the origin appears to be spewing material outward in a spiral pattern. The phase portrait in the following example is typical of a spiral source.
EXAMPLE 11.10

Let

A = [3, -4; 2, -1],

with eigenvalues 1 + 2i and 1 - 2i and eigenvectors, respectively,

[2; 1 - i]  and  [2; 1 + i].

Let

U = [2; 1]  and  V = [0; -1],

so that the eigenvectors are U + iV and U - iV. The general solution of X' = AX is

X(t) = c1e^t ([2; 1] cos(2t) - [0; -1] sin(2t)) + c2e^t ([2; 1] sin(2t) + [0; -1] cos(2t)).

Figure 11.17 gives a phase portrait for this system, showing trajectories spiraling away from the spiral source at the origin because the real part of the eigenvalues is positive.
Case 5: Pure imaginary eigenvalues.

Now trajectories have the form

X(t) = c1[U cos(βt) - V sin(βt)] + c2[U sin(βt) + V cos(βt)].

Because of the trigonometric terms, this trajectory moves about the origin. Unlike the preceding case, however, there is no exponential factor to decrease or increase distance from the origin as t increases. This trajectory is a closed curve about the origin, representing a periodic solution of the system. The origin in this case is called a center of X' = AX. In general, any closed trajectory of X' = AX represents a periodic solution of this system.
EXAMPLE 11.11

Let

A = [3, 18; -1, -3].

A has eigenvalues 3i and -3i, with respective eigenvectors

[-3 - 3i; 1]  and  [-3 + 3i; 1].

A phase portrait is given in Figure 11.18, showing closed trajectories about the center (origin). If we wish, we can write the general solution

X(t) = c1 ([-3; 1] cos(3t) + [3; 0] sin(3t)) + c2 ([-3; 1] sin(3t) + [-3; 0] cos(3t)).

FIGURE 11.18 Center of the system x' = 3x + 18y, y' = -x - 3y.

We now have a complete description of the behavior of trajectories for the 2 x 2 constant coefficient system X' = AX. The general appearance of the phase portrait is completely determined by the eigenvalues of A, and the critical point (0, 0) is the primary point of interest, with the following correspondences:

Real, distinct eigenvalues of the same sign: (0, 0) is a nodal source (Figure 11.10, p. 417) or sink (Figure 11.9, p. 416).

Real, distinct eigenvalues of opposite sign: (0, 0) is a saddle point (Figure 11.12, p. 418).
Equal eigenvalues, two linearly independent eigenvectors: (0, 0) is a proper node (Figure 11.13, p. 419).

Equal eigenvalues, all eigenvectors a multiple of a single eigenvector: (0, 0) is an improper node (Figure 11.16, p. 421).

Complex eigenvalues with nonzero real part: (0, 0) is a spiral point (Figure 11.17, p. 422).

Pure imaginary eigenvalues: (0, 0) is a center (Figure 11.18, p. 423).

When we speak of a classification of the origin of a linear system, we mean a determination of the origin as a nodal sink or source, saddle point, proper or improper node, spiral point, or center.
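This case analysis translates directly into a small classifier. The following Python sketch (not from the text; the function name and tolerance are invented for illustration) decides the type of the origin from the trace and determinant of A, and reproduces the classifications of the worked examples of this section.

```python
def classify_origin(a, b, c, d, tol=1e-12):
    # classify the origin of X' = AX, A = [[a, b], [c, d]] nonsingular,
    # following the case analysis of Section 11.3
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc > tol:                       # real, distinct eigenvalues
        l1 = (tr - disc ** 0.5) / 2
        l2 = (tr + disc ** 0.5) / 2
        if l1 * l2 < 0:
            return "saddle point"
        return "nodal sink" if tr < 0 else "nodal source"
    if abs(disc) <= tol:                 # repeated real eigenvalue
        # proper node only if A is a scalar multiple of the identity
        if abs(b) <= tol and abs(c) <= tol and abs(a - d) <= tol:
            return "proper node"
        return "improper node"
    # complex conjugate eigenvalues alpha +/- i beta
    if abs(tr) <= tol:
        return "center"
    return "spiral sink" if tr < 0 else "spiral source"

print(classify_origin(-6, -2, 5, 1))      # Example 11.6: nodal sink
print(classify_origin(-1, 3, 2, -2))      # Example 11.8: saddle point
print(classify_origin(-10, 6, -6, 2))     # Example 11.9: improper node
print(classify_origin(3, 18, -1, -3))     # Example 11.11: center
```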
In each of Problems 1 through 10, use the eigenvalues of the matrix of the system to classify the origin of the system. Draw a phase portrait for the system. It is assumed here that software is available to do this, and it is not necessary to solve the system to generate the phase portrait.

1. x' = 3x - 5y, y' = 5x - 7y
2. x' = x + 4y, y' = 3x
3. x' = x - 5y, y' = x - y
4. x' = 9x - 7y, y' = 6x - 4y
5. x' = 7x - 17y, y' = 2x + y
6. x' = 2x - 7y, y' = 5x - 10y
7. x' = 4x - y, y' = x + 2y
8. x' = 3x - 5y, y' = 8x - 3y
9. x' = -2x - y, y' = 3x - 2y
10. x' = -6x - 7y, y' = 7x - 20y
11.4 Critical Points and Stability

A complete knowledge of the possible phase portraits of linear 2 x 2 systems is good preparation for the analysis of nonlinear systems. In this section we will introduce the concept of critical point for a nonlinear system, define stability of critical points, and prepare for the qualitative analysis of nonlinear systems, in which we attempt to draw conclusions about how solutions will behave, without having explicit solutions in hand.

We will consider the 2 x 2 autonomous system

x'(t) = f(x(t), y(t)),
y'(t) = g(x(t), y(t)),

or, more compactly,

x' = f(x, y),
y' = g(x, y).

This is the system (11.4) discussed in Section 11.2. We will assume that f and g are continuous with continuous first partial derivatives in some region D of the x, y-plane. In specific cases D may be the entire plane. This system can be written in matrix form as

X' = F(X)
In August of 1999, the Petronas Towers was officially opened. Designed by the American firm of Cesar Pelli and Associates, in collaboration with Kuala Lumpur City Center architects, the graceful towers have an elegant slenderness (height to width) ratio of 9:4. This was made possible by modern materials and building techniques, featuring high-strength concrete that is twice as effective as steel in sway reduction. The towers are supported by 75-foot-by-75-foot concrete cores and an outer ring of super columns. The 88 floors stand 452 meters above street level, and include 65,000 square meters of stainless steel cladding and 77,000 square meters of vision glass. Computations of stability of structures involve the analysis of critical points of systems of nonlinear differential equations.
where

X(t) = ( x(t)
         y(t) )  and  F(X) = ( f(x, y)
                               g(x, y) ).

Taking the lead from the linear system X' = AX, we make the following definition.
DEFINITION 11.1  Critical Point

A point (x0, y0) in D is a critical point (or equilibrium point) of X' = F(X) if

f(x0, y0) = g(x0, y0) = 0.
We see immediately one significant difference between the linear and nonlinear cases. The linear system X' = AX, with A nonsingular, has exactly one critical point, the origin. A nonlinear system X' = F(X) can have any number of critical points. We will, however, only consider systems in which critical points are isolated. This means that, about any critical point, there is a circle that contains no other critical point of the system.
CHAPTER 11  Qualitative Methods and Systems of Nonlinear Differential Equations
EXAMPLE 11.12
Consider the damped pendulum (Example 11.1), whose motion is governed by the system

x' = y,
y' = -ω² sin(x) - γy.

Here f(x, y) = y and g(x, y) = -ω² sin(x) - γy. The critical points are solutions of

y = 0 and -ω² sin(x) - γy = 0.

These equations are satisfied by all points (nπ, 0), in which n = 0, ±1, ±2, …. These critical points are isolated. About any point (nπ, 0), we can draw a circle (for example, of radius 1/4) that does not contain any other critical point.

For this problem, the critical points split naturally into two classes. Recall that x = θ is the angle of displacement of the pendulum from the vertical downward position, with the bob at the bottom, and y = dθ/dt. When n is even, then x = θ = 2kπ for k any integer. Each critical point (2kπ, 0) corresponds to the bob pointing straight down, with zero velocity (because y = x' = θ' = 0). When n is odd, then x = θ = (2k+1)π for k any integer. The critical point ((2k+1)π, 0) corresponds to the bob in the vertical upright position, with zero velocity.

Without any mathematical analysis, there is an obvious and striking difference between these two kinds of critical points. At, for example, x = 0, the bob hangs straight down from the point of suspension. If we displace it slightly from this position and then release it, the bob will go through some oscillations of decreasing amplitude, after which it will return to its downward position and remain there. This critical point, and all critical points (2nπ, 0), are what we will call stable. Solutions of the pendulum equation for initial values near this critical point remain close to the constant equilibrium solution for all later times.

By contrast, consider the critical point (π, 0). This has the bob initially balanced vertically upward. If the bob is displaced, no matter how slightly, it will swing downward and oscillate back and forth some number of times, but never return to this vertical position. Solutions near this constant equilibrium solution (bob vertically up) do not remain near this position, but move away from it. This critical point, and any critical point ((2k+1)π, 0), is unstable. ■
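As a quick numerical check (a sketch, not part of the text; the values of ω and γ are arbitrary choices), the right-hand sides of the pendulum system do vanish at every point (nπ, 0):

```python
import math

# damped pendulum of Example 11.12: x' = y, y' = -w^2 sin(x) - g*y
w, g = 2.0, 0.5   # arbitrary positive constants (omega and gamma)

def f(x, y):
    return y

def gfun(x, y):
    return -w * w * math.sin(x) - g * y

# both components of the vector field vanish at (n*pi, 0)
for n in range(-3, 4):
    x0, y0 = n * math.pi, 0.0
    assert abs(f(x0, y0)) < 1e-12
    assert abs(gfun(x0, y0)) < 1e-9   # sin(n*pi) is only zero to roundoff
```

At any nearby point that is not a critical point, at least one component of the field is nonzero, so trajectories actually move there.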
EXAMPLE 11.13
Consider the damped nonlinear spring of Example 11.2. The system of differential equations governing the motion is

x' = y,
y' = -(k/m)x + (a/m)x³ - (c/m)y.
The critical points are (0, 0), (√(k/a), 0) and (-√(k/a), 0). Recall that x measures the position of the spring from the equilibrium (rest) position, and y = dx/dt is the velocity of the spring. We will do a mathematical analysis of this system shortly, but for now look at these critical points from the point of view of our experience with how springs behave. If we displace the spring very slightly from the equilibrium solution (0, 0), and then release it, we expect it to undergo some motion back and forth and then come to rest, approaching the equilibrium point. In this sense (0, 0) is a stable critical point. However, if we displace the spring slightly from a position very nearly at distance √(k/a) to the right or left of the equilibrium position and then release it, the spring may or may not return to this position, depending on the relative sizes of the damping constant c and the coefficients in the nonlinear spring force function, particularly a. In this sense these equilibrium points may be stable or may not be. In the next section we will develop the tools for a more definitive analysis of these critical points. ■

Taking a cue from these examples, we will define a concept of stability of critical points. Recall that

||V|| = √(v1² + v2²)

is the length (or norm) of a vector V = (v1, v2) in the plane. If W = (w1, w2) is also a vector in the plane, then ||V - W|| is the length of the vector from W to V:

||V - W|| = ((v1 - w1)² + (v2 - w2)²)^(1/2),

which is also the distance between the points (v1, v2) and (w1, w2). Finally, if X0 is a given vector, then the locus of points (vectors) X such that

||X - X0|| < r,

for any positive r, is the set of points X within the circle of radius r about X0. These are exactly the points at distance less than r from X0.
DEFINITION 11.2  Stable Critical Point

Let X0 = (x0, y0) be a critical point of X' = F(X). Then X0 is stable if and only if, given any positive number ε, there exists a positive number δ_ε such that, if X = Φ(t) is a solution of X' = F(X) and

||Φ(0) - X0|| < δ_ε,

then Φ(t) exists for all t > 0 and

||Φ(t) - X0|| < ε for all t ≥ 0.

We say that X0 is unstable if it is not stable.
Keep in mind that the constant solution X(t) = X0 is the unique solution through this critical point. That is, the trajectory through a critical point is just this point itself. A critical point X0 is stable if solutions that are initially (at t = 0) close (within δ_ε) to X0 remain close
FIGURE 11.19  Stable critical point of X' = F(X).
(within ε) for all later times. In terms of trajectories, this means that a trajectory that starts out sufficiently close to X0 at time zero must remain close to this equilibrium solution at all later times. Figure 11.19 illustrates this idea. This does not imply that solutions that start near X0 approach this point as a limit as t → ∞. They may simply remain within a small disk about X0, without approaching X0 in a limiting sense. If, however, solutions initially near X0 also approach X0 as a limit, then we call X0 an asymptotically stable critical point.
DEFINITION 11.3  Asymptotically Stable Critical Point

X0 is an asymptotically stable critical point of X' = F(X) if and only if X0 is a stable critical point, and there exists a positive number δ such that, if a solution X = Φ(t) satisfies ||Φ(0) - X0|| < δ, then lim_{t→∞} Φ(t) = X0.
This concept is illustrated in Figure 11.20. Stability does not imply asymptotic stability. It is less obvious that the limit condition alone does not imply stability (which is why stability is required separately in the definition). A solution might start "close enough" to the critical point and actually approach the critical point in the limit as t → ∞, but for some arbitrarily large positive times move arbitrarily far from X0 (before bending back to approach it in the limit).

FIGURE 11.20  Asymptotically stable critical point of X' = F(X).

In the case of the damped pendulum, critical points (2nπ, 0) are asymptotically stable. If the bob is displaced slightly from the vertical downward position and then released, it will eventually approach this vertical downward position in the limit as t → ∞.

To get some experience with stability and asymptotic stability, and also to prepare for nonlinear systems that are in some sense "nearly" linear, we will review the critical point (0, 0) of the linear system X' = AX in the context of stability.

Nodal Source or Sink  This occurs when the eigenvalues of A are real and distinct, but of the same sign: a nodal sink when they are negative, and a nodal source when they are positive. From the phase portrait in Figure 11.9, p. 416, (0, 0) is stable and asymptotically stable when the eigenvalues are negative (nodal sink), because then all trajectories tend toward the origin as time increases. However, (0, 0) is unstable when the eigenvalues are positive (nodal source), because in this case all trajectories move away from the origin with increasing time (Figure 11.10, p. 417).

Saddle Point  The origin is a saddle point when A has real eigenvalues of opposite sign. A saddle point is unstable. This is apparent in Figure 11.12, p. 418, in which we can see that trajectories do not remain near the origin as time increases, nor do they approach the origin as a limit.

Proper Node  The origin is a proper node when the eigenvalues of A are equal and A has two linearly independent eigenvectors. Figure 11.13, p. 419 shows a typical proper node. When the arrows are toward the origin (negative eigenvalues), this node is stable and asymptotically stable. When the trajectories are oriented away from the origin, this node is not stable.
Improper Node  The origin is an improper node when the eigenvalues of A are equal and A does not have two linearly independent eigenvectors. Now the origin is a stable and asymptotically stable critical point if the eigenvalue is negative, and unstable if the eigenvalue is positive. Figure 11.16 shows trajectories near a stable improper node (negative eigenvalue). If the eigenvalue is positive, the trajectories are oriented away from the origin, and this node is unstable.

Spiral Point  The origin is a spiral point when the eigenvalues are complex conjugates with nonzero real part. When this real part is positive, the origin is a spiral source (trajectories spiral away from the origin, as in Figure 11.17), and in this case the origin is unstable. When this real part is negative, the origin is a stable and asymptotically stable spiral sink (trajectories spiraling into the origin). The phase portrait of such a sink has the same appearance as that of a spiral source, with the arrows on the trajectories reversed.

Center  The origin is a center when the eigenvalues of A are pure imaginary. A center is stable, but not asymptotically stable (Figure 11.18).

There is a succinct graphical way of summarizing the classification and stability type of the critical point (0, 0) for the linear system X' = AX. Let

A = ( a  b
      c  d ).
The eigenvalues of A are solutions of

λ² - (a + d)λ + ad - bc = 0.

Let p = -(a + d) and q = ad - bc to write this equation as

λ² + pλ + q = 0.

The eigenvalues of A are

λ = (-p ± √(p² - 4q)) / 2.

These are real or complex depending on whether p² - 4q ≥ 0 or p² - 4q < 0. In the p, q plane of Figure 11.21, the boundary between these two cases is the parabola p² = 4q. Now the p, q plane gives a summary of conclusions as follows:
FIGURE 11.21  Classification of (0, 0) for X' = AX.
- Above the parabola (p² < 4q), the eigenvalues are complex conjugates with nonzero real parts (spiral point).
- On the parabola (p² = 4q), the eigenvalues are real and equal (proper or improper node).
- On the positive q-axis (p = 0, q > 0), the eigenvalues are pure imaginary (center).
- Between the p-axis and the parabola, the eigenvalues are real and distinct, with the same sign (nodal source or sink).
- Below the p-axis (q < 0), the eigenvalues are real and have opposite sign (saddle point).

It is interesting to observe how sensitive the classification and stability type of a critical point are to changes in the coefficients of the system. Suppose we begin with a linear system X' = AX, and then perturb one or more elements of A by "small" amounts to form a new system. How (if at all) will this change the classification and stability of the critical point? The classification and stability of (0, 0) are completely determined by the eigenvalues, so the issue is really how small changes in the matrix elements affect the eigenvalues. The eigenvalues of A are (-p ± √(p² - 4q))/2, which is a continuous function of p and q. Thus small changes in p and q (caused by small changes in a, b, c, and d) result in small changes in the eigenvalues. There are two cases in which arbitrarily small changes in A will change the nature of the critical point.
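This p, q summary is easy to mechanize. The routine below (a hypothetical helper, not from the text) classifies the origin directly from the entries of A:

```python
def classify_origin(a, b, c, d):
    """Classify (0, 0) for X' = AX, A = [[a, b], [c, d]], using
    p = -(a + d), q = ad - bc and the p,q-plane summary above.
    Note: eigenvalue real parts equal -p/2, so p > 0 means stability."""
    p = -(a + d)
    q = a * d - b * c
    disc = p * p - 4 * q
    if q < 0:
        return "saddle point (unstable)"
    if q == 0:
        return "degenerate (A singular)"
    if disc < 0:                       # complex conjugate eigenvalues
        if p == 0:                     # exact test; fine for a sketch
            return "center (stable, not asymptotically stable)"
        return ("spiral sink (asymptotically stable)" if p > 0
                else "spiral source (unstable)")
    if disc == 0:                      # repeated real eigenvalue
        return ("proper/improper node (asymptotically stable)" if p > 0
                else "proper/improper node (unstable)")
    return ("nodal sink (asymptotically stable)" if p > 0
            else "nodal source (unstable)")

# e.g. Problem 1 of Section 11.3 (x' = 3x - 5y, y' = 5x - 7y) has the
# repeated eigenvalue -2:
assert classify_origin(3, -5, 5, -7) == "proper/improper node (asymptotically stable)"
```

Distinguishing proper from improper nodes would additionally require counting independent eigenvectors, which this sketch does not do.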
(1) If the origin is a center (pure imaginary eigenvalues), then p = -a - d = 0. Arbitrarily small changes in a and d can change this, resulting in a new matrix whose eigenvalues have positive or negative real parts. For the new, perturbed system, (0, 0) is no longer a center. This means that centers are sensitive to arbitrarily small changes in A.

(2) The other sensitive case is that in which both eigenvalues are the same, which occurs when p² - 4q = 0. Again, arbitrarily small changes in A can result in this quantity becoming positive or negative, changing the classification of the critical point. However, the stability or instability of (0, 0) is determined by the sign of p, and sufficiently small changes in A will leave this sign unchanged. Thus in this case the classification of the kind of critical point the system has is more sensitive to change than its stability or instability.

These considerations should be kept in mind when we state Theorem 11.3 in the next section. With this background on linear systems and the various characteristics of their critical point, we are ready to analyze systems that are in some sense approximated by linear systems.
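The two sensitive cases can be seen numerically (a sketch; the perturbation size 10⁻³ is an arbitrary choice): a center loses its pure imaginary eigenvalues under an arbitrarily small perturbation, while a nodal sink keeps negative real parts.

```python
import cmath

def eigenvalues(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic equation."""
    p = -(a + d)                          # -(trace)
    q = a * d - b * c                     # determinant
    root = cmath.sqrt(complex(p * p - 4 * q))
    return (-p + root) / 2, (-p - root) / 2

# a center: A = [[0, 1], [-1, 0]] has eigenvalues +/- i
l1, _ = eigenvalues(0, 1, -1, 0)
assert abs(l1.real) < 1e-12

# perturb one diagonal entry slightly: the real parts become nonzero,
# so the perturbed system is a spiral point, not a center
m1, _ = eigenvalues(1e-3, 1, -1, 0)
assert m1.real > 0

# a nodal sink: a small perturbation leaves both real parts negative
s1, s2 = eigenvalues(-1 + 1e-3, 0, 0, -2)
assert s1.real < 0 and s2.real < 0
```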
PROBLEMS

1-10. For j = 1, …, 10, classify the critical point of the system of Problem j of Section 11.3 as stable and asymptotically stable, stable but not asymptotically stable, or unstable.

11. Consider the system X' = AX, where

A = ( 1    -3
      2  -1+ε )

with ε > 0.
(a) Show that, when ε = 0, the critical point is a center, stable but not asymptotically stable. Generate a phase portrait for this system.
(b) Show that, when ε ≠ 0, the critical point is not a center, no matter how small ε is chosen. Generate a phase portrait for this system with ε = 1/10.
This problem illustrates the sensitivity of trajectories of the system to small changes in the coefficients, in the case of pure imaginary eigenvalues.

12. Consider the system X' = AX, where

A = ( 2+ε   5
      -5   -8 )

with ε > 0.
(a) Show that, when ε = 0, A has equal eigenvalues and does not have two linearly independent eigenvectors. Classify the type of critical point at the origin and its stability characteristics. Generate a phase portrait for this system.
(b) Show that, if ε is not zero (but can be arbitrarily small in magnitude), then A has real and distinct eigenvalues. Classify the type of critical point at the origin in this case, as well as its stability characteristics. Generate a phase portrait for the case ε = 1/10.
This problem illustrates the sensitivity of trajectories to small changes in the coefficients, in the case of equal eigenvalues.
11.5 Almost Linear Systems

Suppose X' = F(X) is a nonlinear system. We want to define a sense in which this system may be thought of as "almost linear." Suppose the system has the special form

X' = AX + G(X).  (11.7)

This is a linear system X' = AX with another term,

G(X) = ( p(x, y)
         q(x, y) ),

added. Any nonlinearity of the system (11.7) is in G(X). We refer to the system X' = AX as the linear part of the system (11.7).
Assume that p(0, 0) = q(0, 0) = 0, so the system (11.7) has a critical point at the origin. The idea we want to pursue is that, if the nonlinear term is "small enough," then the behavior of solutions of the linear system X' = AX near the origin may give us information about the behavior of solutions of the original, nonlinear system near this critical point. The question is: how small is "small enough"?

We will assume in this discussion that A is a nonsingular 2 × 2 matrix of real numbers, and that p and q are continuous at least within some disk about the origin. In the following definition, we refer to partial derivatives of G, by which we mean

Gx = ∂G/∂x = ( px
               qx )  and  Gy = ∂G/∂y = ( py
                                         qy ).
DEFINITION 11.4  Almost Linear

The system (11.7) is almost linear in a neighborhood of (0, 0) if G and its first partial derivatives are continuous within some circle about the origin, and

lim_{X→0} ||G(X)|| / ||X|| = 0.  (11.8)
This condition (11.8) means that, as X is chosen closer to the origin, G(X) must become small in magnitude faster than X does. This gives a precise measure of "how small" the nonlinear term must be near the origin for the system (11.7) to qualify as almost linear. If we write

X = ( x
      y ),  A = ( a  b
                  c  d )  and  G(X) = G(x, y) = ( p(x, y)
                                                  q(x, y) ),

then the system (11.7) is

x' = ax + by + p(x, y),
y' = cx + dy + q(x, y).

Condition (11.8) now becomes

lim_{(x,y)→(0,0)} p(x, y)/√(x² + y²) = lim_{(x,y)→(0,0)} q(x, y)/√(x² + y²) = 0.

These limits, in terms of the components of G(X), are sometimes easier to deal with than the limit of ||G(X)||/||X|| as X approaches the origin, although the two formulations are equivalent.
EXAMPLE 11.14
The system

X' = AX + ( -4xy
            -8x²y )
is almost linear. To verify this, compute

lim_{(x,y)→(0,0)} -4xy/√(x² + y²)  and  lim_{(x,y)→(0,0)} -8x²y/√(x² + y²).

There are various ways of showing that these limits are zero, but here is a device worth remembering. Express (x, y) in polar coordinates by putting x = r cos(θ) and y = r sin(θ). Then

-4xy/√(x² + y²) = -4r² cos(θ) sin(θ)/r = -4r cos(θ) sin(θ) → 0

as r → 0, which must occur if (x, y) → (0, 0). Similarly,

-8x²y/√(x² + y²) = -8r³ cos²(θ) sin(θ)/r = -8r² cos²(θ) sin(θ) → 0

as r → 0.
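The same conclusion can be checked numerically (a sketch, not part of the text): sample ||G(X)||/||X|| along a path into the origin and watch the ratio shrink toward zero.

```python
import math

def ratio(x, y):
    """||G(X)|| / ||X|| for G(X) = (-4xy, -8x^2 y) of Example 11.14."""
    gnorm = math.hypot(-4 * x * y, -8 * x * x * y)
    return gnorm / math.hypot(x, y)

# approach the origin along the line y = x; the ratio behaves like a
# constant times the distance r, so it decreases at every step
vals = [ratio(10.0 ** -k, 10.0 ** -k) for k in range(1, 7)]
assert all(later < earlier for earlier, later in zip(vals, vals[1:]))
assert vals[-1] < 1e-5
```

A numerical check along one path is only suggestive, of course; the polar-coordinate argument above covers every direction of approach at once.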
Figure 11.22(a) shows a phase portrait of this system. For comparison, a phase portrait of the linear part X' = AX is given in Figure 11.22(b). Notice a qualitative similarity between the phase portraits near the origin. This is the rationale for the definition of almost linear systems. We will now display a correspondence between the type of critical point, and its stability properties, for the almost linear system X' = AX + G and its linear part X' = AX. The behavior is not always the same. Nevertheless, in some cases which we will identify, properties of the critical point for the linear system carry over to either the same properties for the almost linear system or, if not the same, at least to important information about the nonlinear system.
FIGURE 11.22(a)  Phase portrait for the system of Example 11.14.
FIGURE 11.22(b)  Phase portrait for the linear part of the system of Figure 11.22(a).

THEOREM 11.3
Let λ and μ be the eigenvalues of A. Assume that X' = AX + G is almost linear. Then the following conclusions hold for the system X' = AX + G.

1. If λ and μ are unequal and negative, then the origin is an asymptotically stable nodal sink of X' = AX + G. If these eigenvalues are unequal and positive, then the origin is an unstable nodal source of X' = AX + G.
2. If λ and μ are of opposite sign, then the origin is an unstable saddle point of X' = AX + G.
3. If λ and μ are complex with negative real part, then the origin is an asymptotically stable spiral point of X' = AX + G. If these eigenvalues have positive real part, then the origin is an unstable spiral point.
4. If λ and μ are equal and negative, then the linear system has an asymptotically stable proper or improper node, while the almost linear system has an asymptotically stable node or spiral point. If λ and μ are equal and positive, then the linear system has an unstable proper or improper node, while the almost linear system has an unstable node or spiral point.
5. If λ and μ are pure imaginary (conjugates of each other), then the origin is a center of X' = AX, but may be a center or spiral point of the almost linear system X' = AX + G. Further, in the case of a spiral point of the almost linear system, the critical point may be unstable or asymptotically stable.
The only case in which the linear system fails to provide definitive information of some kind about the almost linear system is that in which the eigenvalues of A are pure imaginary. In this event, the linear system has a stable center, while the almost linear system can have a stable center or a spiral point which may be stable or unstable. In light of this theorem, when we ask for an analysis of a critical point of an almost linear system, we mean a determination of whether the point is an asymptotically stable nodal sink, an unstable nodal source, an unstable saddle point, an asymptotically stable spiral point or unstable spiral point, or, from (5) of the theorem, either a center or spiral point. A proof of this theorem requires some delicate analysis that we will avoid. The rest of this section is devoted to examples and phase portraits.
EXAMPLE 11.15
The system

X' = ( -1  -1
       -1  -3 ) X + G(X)

(in which G consists of the nonlinear terms) is almost linear and has only one critical point, (0, 0). The eigenvalues of A are -2 + √2 and -2 - √2, which are distinct and negative. The origin is an asymptotically stable nodal sink of the linear system X' = AX, and hence is also a stable and asymptotically stable nodal sink of the almost linear system. Figure 11.23(a) and (b) shows a phase portrait of the almost linear system and its linear part, respectively. ■
FIGURE 11.23(b)  Phase portrait for the linear part of the system of Figure 11.23(a).
EXAMPLE 11.16
The system

X' = ( 3  -4
       6   2 ) X + ( x² cos(y)
                     y³ )

is almost linear. The only critical point is (0, 0). The eigenvalues of A are (5 + i√95)/2 and (5 - i√95)/2. The linear part has an unstable spiral point at the origin. The origin is therefore an unstable spiral point of the almost linear system. Phase portraits for the given nonlinear system and its linear part are shown in Figure 11.24(a) and (b), respectively. ■
FIGURE 11.24(a)  Phase portrait for x' = 3x - 4y + x² cos(y), y' = 6x + 2y + y³.

FIGURE 11.24(b)  Phase portrait for the linear part of the system of Figure 11.24(a).
EXAMPLE 11.17
The system

X' = ( -1  2
        2  3 ) X + ( x sin(y)
                     8 sin(x) )

is almost linear, and its only critical point is the origin. The eigenvalues of A are 1 + 2√2 and 1 - 2√2, which are real and of opposite sign. The origin is an unstable saddle point of the linear part, hence also of the given system. Phase portraits of both systems are shown in Figure 11.25(a) (nonlinear system) and (b) (linear part). ■
EXAMPLE 11.18
The system

X' = (  4  11
       -2  -4 ) X + ( x sin(y)
                      sin(y) )

is almost linear, and its only critical point is (0, 0). The eigenvalues of A are i√6 and -i√6. The origin is a stable, but not asymptotically stable, center for the linear part. The theorem does not allow us to draw a definitive conclusion about the almost linear system, which might have a center or spiral point at the origin. Figure 11.26(a) and (b) shows phase portraits for the almost linear system and its linear part, respectively. ■
EXAMPLE 11.19
Consider the system

X' = ( a  -1
       1  -a ) X + ( hx(x² + y²)
                     ky(x² + y²) ),
FIGURE 11.25(a)  Phase portrait for x' = -x + 2y + x sin(y), y' = 2x + 3y + 8 sin(x).
FIGURE 11.25(b)  Phase portrait for the linear part of the system of Figure 11.25(a).
in which a, h, and k are constants. The eigenvalues of the matrix of the linear part are √(a² - 1) and -√(a² - 1). Consider cases.

If 0 < |a| < 1, then these eigenvalues are pure imaginary. The origin is a center of the linear part but may be a center or spiral point of the almost linear system.

If |a| > 1, then the eigenvalues are real and of opposite sign, so the origin is an unstable saddle point of both the linear part and the original almost linear system.

If a = ±1, then A is singular and the system is not almost linear.

Figure 11.27(a) shows a phase portrait for this system with h = 0.4, k = 0.7, and a = 3; Figure 11.27(b) has a = 2. The next example demonstrates the sensitivity of case (5) of Theorem 11.3. ■
FIGURE 11.26(a)  Phase portrait for x' = 4x + 11y + x sin(y), y' = -2x - 4y + sin(y).
FIGURE 11.26(b)  Phase portrait for the linear part of the system of Figure 11.26(a).
EXAMPLE 11.20

Let ε be a real number and consider the system

X' = (  y + εx(x² + y²)
       -x + εy(x² + y²) ).

We can write this in the form X' = AX + G as

X' = (  0  1
       -1  0 ) X + ( εx(x² + y²)
                     εy(x² + y²) ).

FIGURE 11.27(a)  Phase portrait for x' = 3x - y + 0.4x(x² + y²), y' = x - 3y + 0.7y(x² + y²).
FIGURE 11.27(b)  Phase portrait for x' = 2x - y + 0.4x(x² + y²), y' = x - 2y + 0.7y(x² + y²).
The origin is a critical point of this almost linear system. The eigenvalues of A are i and -i, so the linear part of this system has a center at the origin. This is the case in which Theorem 11.3 does not give a definitive conclusion for the nonlinear system. To analyze the nature of this critical point for the nonlinear system, use polar coordinates r and θ. Since r² = x² + y², then

rr' = xx' + yy'
    = x[y + εx(x² + y²)] + y[-x + εy(x² + y²)]
    = ε(x² + y²)(x² + y²) = εr⁴.

Then

dr/dt = εr³.

This is a separable equation for r, which we solve to get

r(t) = 1/√(k - 2εt),
in which k is a constant determined by initial conditions (a point the trajectory is to pass through). Now consider cases. If ε < 0, then

r(t) = 1/√(k + 2|ε|t) → 0

as t → ∞. In this case the trajectory approaches the origin in the limit as t → ∞, and (0, 0) is asymptotically stable.
However, watch what happens if ε > 0. Say r(0) = ρ, so the trajectory starts at a point at a positive distance ρ from the origin. Then k = 1/ρ² and

r(t) = 1/√((1/ρ²) - 2εt).

In this case, as t increases from 0 and approaches 1/(2ερ²), r(t) → ∞. This means that, at finite times, the trajectory is arbitrarily far away from (0, 0); hence (0, 0) is unstable when ε is positive. A phase portrait for ε = -0.2 is given in Figure 11.28(a), and for ε = 0.2 in Figure 11.28(b). Figure 11.28(c) gives a phase portrait for the linear part of this system. ■
FIGURE 11.28(a)  Phase portrait for x' = y + εx(x² + y²), y' = -x + εy(x² + y²), with ε = -0.2.
FIGURE 11.28(b)  The same system with ε = 0.2.
FIGURE 11.28(c)  The linear part (ε = 0).
Example 11.20 shows how sensitive an almost linear system can be when the eigenvalues of the linear part are pure imaginary. In this example, ε can be chosen arbitrarily small. Still, when ε is negative, the origin is asymptotically stable, and when ε is positive, regardless of magnitude, the origin becomes unstable.

Thus far the discussion has been restricted to nonlinear systems in the special form X' = AX + G, with the origin as the critical point. However, in general a nonlinear system comes in the form X' = F(X), and there may be critical points other than the origin. We will now show how to translate a critical point (x0, y0) to the origin, so that X' = F(X) translates to a system X' = AX + G. This makes the linear part of the translated system transparent. Further, since Theorem 11.3 is set up to deal with critical points at the origin, we can apply it to X' = AX + G whenever this system is almost linear.

Thus suppose (x0, y0) is a critical point of X' = F(X), where F = (f, g). Assume that f and g are continuous with continuous first and second partial derivatives at least within some circle about (x0, y0). By Taylor's theorem for functions of two variables, we can write, for (x, y) within some circle about (x0, y0),

f(x, y) = f(x0, y0) + fx(x0, y0)(x - x0) + fy(x0, y0)(y - y0) + α(x, y)

and

g(x, y) = g(x0, y0) + gx(x0, y0)(x - x0) + gy(x0, y0)(y - y0) + β(x, y),

where

lim_{(x,y)→(x0,y0)} α(x, y)/√((x - x0)² + (y - y0)²) = lim_{(x,y)→(x0,y0)} β(x, y)/√((x - x0)² + (y - y0)²) = 0.  (11.9)

Now (x0, y0) is assumed to be a critical point of X' = F(X), so f(x0, y0) = g(x0, y0) = 0 and these expansions are

f(x, y) = fx(x0, y0)(x - x0) + fy(x0, y0)(y - y0) + α(x, y)

and

g(x, y) = gx(x0, y0)(x - x0) + gy(x0, y0)(y - y0) + β(x, y).
Let

X̃ = ( x - x0
      y - y0 ).

Then

X̃' = ( (x - x0)'
       (y - y0)' ) = ( x'
                       y' ) = X' = F(X) = ( f(x, y)
                                            g(x, y) )

    = ( fx(x0, y0)  fy(x0, y0)
        gx(x0, y0)  gy(x0, y0) ) ( x - x0
                                   y - y0 ) + ( α(x, y)
                                                β(x, y) )

    = A_(x0,y0) X̃ + G.

Because of the condition (11.9), this system is almost linear. Omitting the tilde notation for simplicity, this puts the translated system into the form X' = A_(x0,y0) X + G, with the critical point (x0, y0) of X' = F(X) translated to the origin as the critical point of the almost linear system X' = A_(x0,y0) X + G. Now we can apply the preceding discussion and Theorem 11.3 to the translated system at the origin, and hence draw conclusions about the behavior of solutions of X' = F(X) near (x0, y0). We use the notation A_(x0,y0) for the matrix of the linear part of the translated system for two reasons. First, it reminds us that this is the translated system (since we dropped the tilde notation). Second, when we are analyzing several critical points of the same system, this notation reminds us which critical point is under consideration, and clearly distinguishes the linear part associated with one critical point from that associated with another.

In carrying out this strategy, it is important to realize that we do not have to explicitly compute α(x, y) or β(x, y), which in some cases would be quite tedious, or not even practical. The point is that we know that the translated system X' = A_(x0,y0) X + G is almost linear if F has continuous first and second partial derivatives, a condition that is usually easy to verify.
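The matrix A_(x0,y0) of the linear part can also be estimated without any symbolic work (a sketch using centered finite differences; the step h is an arbitrary small choice, and the helper name is hypothetical):

```python
import math

def linear_part(f, g, x0, y0, h=1e-6):
    """Approximate A_(x0,y0) = [[fx, fy], [gx, gy]] at a critical point
    (x0, y0) by centered finite differences."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    gx = (g(x0 + h, y0) - g(x0 - h, y0)) / (2 * h)
    gy = (g(x0, y0 + h) - g(x0, y0 - h)) / (2 * h)
    return [[fx, fy], [gx, gy]]

# check against the system of the next example at its critical point (0, 0):
# f = sin(pi*x) - x^2 + y^2, g = cos((x + y + 1)*pi/2), where the exact
# values are fx = pi, fy = 0, gx = gy = -pi/2
A00 = linear_part(lambda x, y: math.sin(math.pi * x) - x * x + y * y,
                  lambda x, y: math.cos((x + y + 1) * math.pi / 2),
                  0.0, 0.0)
assert abs(A00[0][0] - math.pi) < 1e-4
assert abs(A00[1][0] + math.pi / 2) < 1e-4
```

The eigenvalues of the estimated matrix then classify the critical point via Theorem 11.3, with the usual caveat that finite differences carry a small truncation and roundoff error.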
EXAMPLE 11.21
Consider the system

X' = F(X) = ( sin(πx) - x² + y²
              cos((x + y + 1)π/2) ).

Here f(x, y) = sin(πx) - x² + y² and g(x, y) = cos((x + y + 1)π/2). This is an almost linear system because f and g are continuous with continuous first and second partial derivatives throughout the plane. For the critical points, solve

sin(πx) - x² + y² = 0,
cos((x + y + 1)π/2) = 0.

Certainly x = y = n is a solution for every integer n. Every point (n, n) in the plane is a critical point. There may be other critical points as well, but other solutions of f(x, y) = g(x, y) = 0
are not obvious. We will need the partial derivatives

fx = π cos(πx) - 2x,  fy = 2y,
gx = -(π/2) sin((x + y + 1)π/2),  gy = -(π/2) sin((x + y + 1)π/2).
Now consider a typical critical point (n, n) . We can translate this point to the origin and writ e the translated system as X' = A( ,,,n) X+G wit h A (,,,n)
a (x, Y) = ( a(x , y) J/ We need not actually compute a(x, y) or 13(x, y) . Because the system is almost linear, th e qualitative behavior of trajectories of the nonlinear system near (n, n) is (with exceptions noted in Theorem 11 .3) determined by the behavior of trajectories of the linear system X' = A (n,n) X . We are therefore led to consider the eigenvalues of Ao n, n) , which are
λ = (1/4)[ π(−1)^n − 4n ± √( 9π² − 40nπ(−1)^n + 16n² ) ].
We will consider several values for n. For n = 0, the eigenvalues are π and −π/2, so the origin is an unstable saddle point of the linear system and also of the nonlinear system. For n = 1, the eigenvalues of A(1,1) are

(1/4)( −π − 4 ± √(9π² + 40π + 16) ),

which are approximately 2.0101 and −5.5809. Therefore (1, 1) is also an unstable saddle point. For n = 2, the eigenvalues are

(1/4)( π − 8 ± √(9π² − 80π + 64) ),

which are approximately −1.2146 + 2.4812i and −1.2146 − 2.4812i. These are complex conjugates with negative real part, so (2, 2) is an asymptotically stable spiral point. For n = 3, the eigenvalues are

(1/4)( −π − 12 ± √(9π² + 120π + 144) ),

which are approximately −9.959 and 2.3882. Thus (3, 3) is an unstable saddle point. For n = 4, the eigenvalues are

(1/4)( π − 16 ± √(9π² − 160π + 256) ),

approximately −3.2146 + 3.1407i and −3.2146 − 3.1407i. We conclude that (4, 4) is an asymptotically stable spiral point.
11.5 Almost Linear Systems
For n = 5, the eigenvalues are

(1/4)( −π − 20 ± √(9π² + 200π + 400) ),

approximately 2.5705 and −14.141, so (5, 5) is an unstable saddle point. For n = 6 the eigenvalues are

(1/4)( π − 24 ± √(9π² − 240π + 576) ),

approximately −5.2146 ± 2.3606i, so (6, 6), like (2, 2) and (4, 4), is an asymptotically stable spiral point. With n = 7 we get eigenvalues

(1/4)( −π − 28 ± √(9π² + 280π + 784) ),

approximately 2.6802 and −18.251, so (7, 7) is an unstable saddle point. The pattern that seems to be forming is broken with the next case. If n = 8 the eigenvalues are

(1/4)( π − 32 ± √(9π² − 320π + 1024) ),

approximately −9.8069 and −4.6223, so (8, 8) is an asymptotically stable node, not a spiral point.

Figures 11.29, 11.30, and 11.31 show phase portraits of this system, focusing on trajectories near selected critical points. The student should experiment with phase portraits near some of the other critical points, for example, those with negative coordinates.
FIGURE 11.29 Trajectories of the system of Example 11.21 near the origin.
FIGURE 11.30 Trajectories of the system of Example 11.21 near (1, 1).
FIGURE 11.31 Trajectories of the system of Example 11.21 near (2, 2) and (4, 4).
EXAMPLE 11.22 Damped Pendulum
The system for the damped pendulum is

x' = y,
y' = −ω² sin(x) − γy.

In matrix form, this system is

X' = F(X) = ( y
              −ω² sin(x) − γy ).
Here f(x, y) = y and g(x, y) = −ω² sin(x) − γy. The partial derivatives are

fx = 0,   fy = 1,   gx = −ω² cos(x),   gy = −γ.
We saw in Example 11.12 that the critical points are (nπ, 0) with n any integer. When n is even, this corresponds to the pendulum bob hanging straight down, and when n is odd, to the bob initially pointing straight up. We will analyze these critical points. Consider first the critical point (0, 0). The linear part of the system has matrix
A(0,0) = ( fx(0, 0)   fy(0, 0)
           gx(0, 0)   gy(0, 0) ) = ( 0     1
                                     −ω²   −γ ),

with eigenvalues −γ/2 + (1/2)√(γ² − 4ω²) and −γ/2 − (1/2)√(γ² − 4ω²). Recall that γ = c/mL and ω² = g/L. As we might expect, the relative sizes of the damping force, the mass of the bob, and the length of the pendulum will determine the nature of the motion. The following cases occur.

(1) If γ² − 4ω² > 0, then the eigenvalues are real, unequal, and negative, so the origin is an asymptotically stable nodal sink. This happens when c > 2m√(gL). This gives a measure of how large the damping force must be, compared to the mass of the bob and length of the pendulum, to prevent trajectories from spiralling toward the equilibrium solution (0, 0). In this case, after release following a small displacement from the vertical downward position, the bob moves toward this position with decreasing velocity, approaching it without oscillating back and forth through it, and coming to rest in the limit as t → ∞. Figure 11.32 shows a phase portrait for the pendulum with γ² = 0.8 and ω = 0.44.
FIGURE 11.32 Phase portrait for the damped pendulum with γ² = 0.8 and ω = 0.44 (γ² − 4ω² > 0).
FIGURE 11.33 Damped pendulum with γ² = 0.8 and ω² = 0.2 (γ² − 4ω² = 0).
(2) If γ² − 4ω² = 0, then the eigenvalues are equal and negative, corresponding to an asymptotically stable proper or improper node of the linear system. This is the case in which Theorem 11.3 does not give a definitive conclusion, and the origin could be an asymptotically stable node or spiral point of the nonlinear pendulum. This case occurs when c = 2m√(gL), a delicate balance between the damping force, mass, and pendulum length. In the case of an asymptotically stable node, the bob, when released, moves with decreasing velocity toward the vertical equilibrium position, approaching it as t → ∞ but not oscillating through it. Figure 11.33 gives a phase portrait for this case, in which γ² = 0.8 and ω² = 0.2.

(3) If γ² − 4ω² < 0, then the eigenvalues are complex conjugates with negative real part. Hence the origin is an asymptotically stable spiral point of both the linear part and the nonlinear pendulum system. This happens when c < 2m√(gL). Figure 11.34 displays this case, with γ² = 0.6 and ω² = 0.3.

It is routine to check that each critical point (2nπ, 0), in which the first coordinate is an even integer multiple of π, has the same characteristics as the origin. Now consider critical points ((2n + 1)π, 0), with first coordinate an odd integer multiple of π. To be specific, consider (π, 0). Now the linear part of the system (with (π, 0) translated to the origin) is

A(π,0) = ( fx(π, 0)   fy(π, 0)
           gx(π, 0)   gy(π, 0) ) = ( 0            1
                                     −ω² cos(π)   −γ ) = ( 0    1
                                                           ω²   −γ ).
The eigenvalues are −γ/2 + (1/2)√(γ² + 4ω²) and −γ/2 − (1/2)√(γ² + 4ω²). These are real and of opposite sign, so (π, 0) is an unstable saddle point. The other critical points ((2n + 1)π, 0) exhibit the same behavior. This is what we would expect of a pendulum in which the bob is initially in the vertical upward position, since arbitrarily small displacements will result in the bob moving away from this position, and it will never return to it. The analysis is the same for each critical point ((2n + 1)π, 0). ■
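As a quick numeric sanity check (our own sketch, not from the text), the damping regimes can be read off from the roots of λ² + γλ + ω² = 0, the characteristic equation of A(0,0):

```python
import cmath

def pendulum_eigenvalues(gamma, omega_sq):
    # Roots of lambda^2 + gamma*lambda + omega^2 = 0, the characteristic
    # equation of A(0,0) = [[0, 1], [-omega^2, -gamma]].
    disc = cmath.sqrt(gamma * gamma - 4 * omega_sq)
    return (-gamma + disc) / 2, (-gamma - disc) / 2

# Overdamped case of Figure 11.32: gamma^2 = 0.8, omega = 0.44
over = pendulum_eigenvalues(0.8 ** 0.5, 0.44 ** 2)
# Underdamped case of Figure 11.34: gamma^2 = 0.6, omega^2 = 0.3
under = pendulum_eigenvalues(0.6 ** 0.5, 0.3)
```

In the first case both eigenvalues come out real and negative (a nodal sink); in the second they form a complex conjugate pair with negative real part (an asymptotically stable spiral point).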
Here

f(x, y) = y   and   g(x, y) = −(k/m)x + (α/m)x³ − (c/m)y.

There are three critical points: (0, 0), (√(k/α), 0), and (−√(k/α), 0). The partial derivatives are

fx = 0,   fy = 1,   gx = −k/m + 3(α/m)x²,   gy = −c/m.
First consider the behavior of trajectories near the origin. The linear part of the system has matrix

A(0,0) = ( 0      1
           −k/m   −c/m ),
with eigenvalues (1/2m)(−c + √(c² − 4mk)) and (1/2m)(−c − √(c² − 4mk)). This yields three cases, depending, as we might expect, on the relative magnitudes of the mass, damping constant, and spring constant.

1. If c² − 4mk > 0, then A(0,0) has real, distinct, negative eigenvalues, so the origin is an asymptotically stable nodal sink. Small disturbances from the equilibrium position result in a motion that dies out with time, with the mass approaching the equilibrium position.
2. If c² − 4mk = 0, then A(0,0) has equal real, negative eigenvalues, so the origin is an asymptotically stable proper or improper node of the linear system. Hence the origin is an asymptotically stable node or spiral point of the nonlinear system.
3. If c² − 4mk < 0, then A(0,0) has complex conjugate eigenvalues with negative real part, and the origin is an asymptotically stable spiral point.

Figure 11.35 shows a phase portrait for case (3), with c = 2, k = 5, α = 1, and m = 3. Next, consider the critical point (√(k/α), 0). The linear part of the system obtained by translating this point to the origin is
A(√(k/α),0) = ( 0      1
                2k/m   −c/m ),
FIGURE 11.35 Nonlinear spring system with c = 2, k = 5, α = 1, and m = 3 (c² − 4mk < 0).
with eigenvalues (1/2m)(−c + √(c² + 8mk)) and (1/2m)(−c − √(c² + 8mk)). The first eigenvalue is positive and the second negative, so (√(k/α), 0) is an unstable saddle point. A similar analysis holds for the critical point (−√(k/α), 0). ■
In each of Problems 1 through 10, (a) show that the system is almost linear, (b) determine the critical points, (c) use Theorem 11.3 to analyze the nature of each critical point, or state why no conclusion can be drawn, and (d) generate a phase portrait for the system.

4. x' = −2x − 3y − y², y' = x + 4y
5. x' = 3x + 12y, y' = −x − 3y + x³
6. x' = 2x − 4y + 3xy, y' = x + y + x²
7. x' = −3x − 4y + x² − y², y' = x + y
8. x' = −3x − 4y, y' = −x + y − x²y
9. x' = −2x − y + y², y' = −4x + y
10. x' = 2x − y − x³ sin(x), y' = −2x + y + xy²

11. Theorem 11.3 is inconclusive in the case that the critical point of an almost linear system is a center of the associated linear system. Verify that no conclusion is possible in this case in general by considering the following two systems:

x' = y − x√(x² + y²),   y' = −x − y√(x² + y²)

and

x' = y + x√(x² + y²),   y' = −x + y√(x² + y²).

(a) Show that the origin is a center for the associated linear system of both systems.
(b) Show that each system is almost linear.
(c) Introduce polar coordinates, with x = r cos(θ) and y = r sin(θ), and use the chain rule to obtain

x' = (dx/dr)(dr/dt) = cos(θ) r'(t)

and

y' = (dy/dr)(dr/dt) = sin(θ) r'(t).

Use these to evaluate xx' + yy' in terms of r and r', where r' = dr/dt. Thus convert each system to a system in terms of r(t) and θ(t).
(d) Use the polar coordinate version of the first system to obtain a separable differential equation for r(t). Conclude from this that r'(t) < 0 for all t. Solve for r(t) and show that r(t) → 0 as t → ∞. Thus conclude that for the first system the origin is asymptotically stable.
(e) Follow the procedure of (d), using the second system. However, now find that r'(t) > 0 for all t. Solve for r(t) with the initial condition r(t₀) = r₀. Show that r(t) → ∞ as t → t₀ + 1/r₀ from the left. Conclude that the origin is unstable for the second system.
11.6 Lyapunov's Stability Criteria

There is a subtle criterion for stability due to the Russian engineer and mathematician Alexander M. Lyapunov (1857-1918). Suppose X' = F(X) is a 2 × 2 autonomous system of first order differential equations (not necessarily almost linear), and that (0, 0) is an isolated critical point.

Lyapunov's insight was this. Suppose there is a function, commonly denoted V, such that closed curves V(x, y) = c enclose the origin. Further, if the constants are chosen smaller, say 0 < k < c, then the curve V(x, y) = k lies within the region enclosed by the curve V(x, y) = c (Figure 11.36(a)). So far this has nothing to do with the system of differential equations. However, suppose it also happens that, if a trajectory intersects the curve V(x, y) = c at some time, which we can take to be time zero, then it cannot escape from the region bounded by this curve, but must for all later times remain within this region (Figure 11.36(b)). This would force trajectories starting out near the origin (meaning within V(x, y) = c) to forever lie at least this
FIGURE 11.36(a) Closed curves contracting about the origin.

FIGURE 11.36(b) Trajectories entering shrinking regions about the origin.
close to the origin. But this would imply, by choosing c successively smaller, that the origin is a stable critical point!

If in addition trajectories starting at a point on V(x, y) = c point into the region bounded by this curve, then we can further conclude that the trajectories are approaching the origin, hence that the origin is asymptotically stable.

This is the intuition behind an approach to determining whether a critical point is stable or asymptotically stable. We will now develop the vocabulary which will allow us to give substance to this approach. First, we will distinguish certain functions that have been found to serve the role of V in this discussion.

If r > 0, let Nr consist of all (x, y) within distance r of the origin. Thus, (x, y) is in Nr exactly when

x² + y² < r².

This set is called the r-neighborhood of the origin, or, if we need no explicit reference to r, just a neighborhood of the origin.
DEFINITION 11.5   Positive Definite, Semidefinite

Let V(x, y) be defined for all (x, y) in some neighborhood Nr of the origin. Suppose V is continuous with continuous first partial derivatives. Then

1. V is positive definite on Nr if V(0, 0) = 0 and V(x, y) > 0 for all other points of Nr.
2. V is positive semidefinite on Nr if V(0, 0) = 0 and V(x, y) ≥ 0 for all points of Nr.
3. V is negative definite on Nr if V(0, 0) = 0 and V(x, y) < 0 for all other points of Nr.
4. V is negative semidefinite on Nr if V(0, 0) = 0 and V(x, y) ≤ 0 for all points of Nr.
For example, V(x, y) = x² + 3xy + 9y² is positive definite on Nr for any positive r, and −3x² + 4xy − 5y² is negative definite on any Nr. The function (x − y)⁴ is positive semidefinite, being nonnegative but vanishing on the line y = x.

The following lemma is useful in producing examples of positive definite and negative definite functions.
LEMMA 11.2

Let V(x, y) = ax² + bxy + cy². Then V is positive definite (on any Nr) if and only if a > 0 and 4ac − b² > 0. V is negative definite (on any Nr) if and only if a < 0 and 4ac − b² > 0.

Proof   Certainly V is continuous with continuous partial derivatives in the entire x, y plane. Further, V(0, 0) = 0, and this is the only point at which V(x, y) vanishes. Now recall the second derivative test for extrema of a function of two variables. First, Vx(0, 0) = Vy(0, 0) = 0, so the origin is a candidate for a maximum or minimum of V. For a maximum or minimum, we need

Vxx(0, 0)Vyy(0, 0) − Vxy(0, 0)² > 0.

But this condition is the same as (2a)(2c) − b² > 0, or

4ac − b² > 0.

Satisfaction of this inequality requires that a and c have the same sign.
When a > 0, then Vxx(0, 0) > 0 and the origin is a point where V has a minimum. In this event V(x, y) > V(0, 0) = 0 for all (x, y) other than (0, 0). Now V is positive definite. When a < 0, then Vxx(0, 0) < 0 and the origin is a point where V has a maximum. Now V(x, y) < 0 for all (x, y) other than (0, 0), and V is negative definite. ■

If x = x(t), y = y(t) defines a trajectory of X' = F(X), and V is a differentiable function of two variables, then V(x(t), y(t)) is a differentiable function of t along this trajectory. We will denote the derivative of V(x(t), y(t)) with respect to t as V̇(x, y), or just V̇. By the chain rule,

V̇(x, y) = Vx(x(t), y(t))x'(t) + Vy(x(t), y(t))y'(t),
or, more succinctly,

V̇ = Vx x' + Vy y'.

This is called the derivative of V along the trajectory, or the orbital derivative of V. The following two theorems show how these ideas about positive and negative definite functions relate to Lyapunov's approach to stable and asymptotically stable critical points. The criteria given in the first theorem constitute Lyapunov's direct method for determining the stability or asymptotic stability of a critical point.

THEOREM 11.4   Lyapunov's Direct Method for Stability
Let (0, 0) be an isolated critical point of the autonomous 2 × 2 system X' = F(X).

1. If a positive definite function V can be found for some neighborhood Nr of the origin, such that V̇ is negative semidefinite on Nr, then the origin is stable.
2. If a positive definite function V can be found for some neighborhood Nr of the origin, such that V̇ is negative definite on Nr, then the origin is asymptotically stable. ■

On the other side of the issue, Lyapunov's second theorem gives a test to determine that a critical point is unstable.

THEOREM 11.5   Lyapunov's Direct Method for Instability

Let (0, 0) be an isolated critical point of the autonomous 2 × 2 system X' = F(X). Let V be continuous with continuous first partial derivatives in some neighborhood of the origin, and let V(0, 0) = 0.

1. Suppose R > 0, and that in every neighborhood Nr, with 0 < r < R, there is a point at which V(x, y) is positive. Suppose V̇ is positive definite in NR. Then (0, 0) is unstable.
2. Suppose R > 0, and that in every neighborhood Nr, with 0 < r < R, there is a point at which V(x, y) is negative. Suppose V̇ is negative definite in NR. Then (0, 0) is unstable.

Any function V playing the role cited in these theorems is called a Lyapunov function. Theorems 11.4 and 11.5 give no suggestion at all as to how a Lyapunov function might be produced, and in attempting to apply them this is the difficult part. Lemma 11.2 is sometimes useful in providing candidates, but, as might be expected, if the differential equation is complicated the task of finding a Lyapunov function might be insurmountable. In spite of this potential difficulty, Lyapunov's theorems are useful because they do not require solving the system, nor do they require that the system be almost linear.
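To make the orbital derivative concrete, here is a small Python illustration of ours (not from the text) that evaluates V̇ = Vx x' + Vy y' for the candidate V(x, y) = x² + y² along the field x' = −x³, y' = −4x²y, the system examined in Example 11.24 below.

```python
def orbital_derivative(x, y):
    # Vdot = V_x * x' + V_y * y' with V(x, y) = x^2 + y^2
    # along the field x' = -x^3, y' = -4*x^2*y.
    Vx, Vy = 2 * x, 2 * y
    return Vx * (-x ** 3) + Vy * (-4 * x ** 2 * y)
```

Expanding gives V̇ = −2x⁴ − 8x²y², which is nonpositive everywhere and vanishes on the line x = 0; that is, V̇ is negative semidefinite.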
Adding to the mystique of the theorem is the nonobvious connection between V, V̇, and the stability characteristics of the critical point. We will give a plausibility argument intended to clarify this connection.

Consider Figure 11.37, which shows a typical curve V(x, y) = c about the origin. Call this curve Γ. Here is how V̇ enters the picture. At any point P : (a, b) on this curve, the vector

N = Vx i + Vy j

is normal (perpendicular) to Γ, by which we mean that it is normal to the tangent to Γ at this point. In addition, consider a trajectory x = φ(t), y = ψ(t) passing through P at time t = 0, also shown in Figure 11.37. Thus, φ(0) = a and ψ(0) = b. The vector

T = φ'(0)i + ψ'(0)j

is tangent to this trajectory (not to Γ) at (a, b). Now,

V̇(a, b) = Vx(a, b)φ'(0) + Vy(a, b)ψ'(0) = N · T,

the dot product of the normal to Γ and the tangent to the trajectory at (a, b). Since the dot product of two vectors is equal to the product of their lengths and the cosine of the angle between them, we obtain

V̇(a, b) = ||N|| ||T|| cos(θ),

with θ the angle between T and N.

Now look at conclusions (1) and (2) of the first Lyapunov theorem. If V̇ is negative semidefinite, then V̇(a, b) ≤ 0, so cos(θ) ≤ 0. Then π/2 ≤ θ ≤ 3π/2. This means that the trajectory at this point is moving either into the region enclosed by Γ, or perhaps in the same direction as the tangent to Γ. The effect of this is that the trajectory cannot move away from the region enclosed by Γ. The trajectory cannot escape from this region, and so the origin is stable. If V̇ is negative definite, then cos(θ) < 0, so π/2 < θ < 3π/2, and now the trajectory actually moves into the region enclosed by Γ, and cannot simply trace out a path around the origin. In this case the origin is asymptotically stable.

We leave it for the student to make a similar geometric argument in support of the second Lyapunov theorem.
FIGURE 11.37 Rationale for Lyapunov's direct method.
EXAMPLE 11.24

Consider the nonlinear system

X' = ( −x³
       −4x²y ).

The origin is an isolated critical point. We will attempt to construct a Lyapunov function of the form V(x, y) = ax² + bxy + cy² that will tell us whether the origin is stable or unstable. We may not succeed in this, but it is a good first attempt because at least we know conditions on the coefficients of V to make this function positive definite or negative definite. The key lies in V̇, so compute

V̇ = Vx x' + Vy y'
  = (2ax + by)(−x³) + (bx + 2cy)(−4x²y)
  = −2ax⁴ − bx³y − 4bx³y − 8cx²y².

Now observe that the −2ax⁴ and −8cx²y² terms will be nonpositive if a and c are positive. The x³y terms will vary in sign, but we can make them vanish by choosing b = 0. We can choose a and c as any positive numbers, say a = c = 1. Then

V(x, y) = x² + y²

is positive definite in any neighborhood of the origin, and

V̇ = −2x⁴ − 8x²y²

is negative semidefinite in any neighborhood of the origin. By Lyapunov's direct method (Theorem 11.4), the origin is stable. A phase portrait for this system is shown in Figure 11.38.

We can draw no conclusion about asymptotic stability of the origin in the last example. If we had been able to find a Lyapunov function V so that V̇ was negative definite (instead of
FIGURE 11.38 Phase portrait for x' = −x³, y' = −4x²y.
negative semidefinite), then we could have concluded that the origin was asymptotically stable. But we cannot be sure, from the work done, whether there is no such function, or whether we simply did not find one.

EXAMPLE 11.25
It is instructive to consider Lyapunov's theorems as they relate to a simple physical problem, the undamped pendulum (put c = 0). The system is

x' = y,
y' = −(g/L) sin(x).

Recall that x = θ, the displacement angle of the bob from the downward vertical rest position. We have already characterized the critical points of this problem (with damping). However, this example makes an important point. In problems that are drawn from a physical setting, the total energy of the system can often serve as a Lyapunov function. This is a useful observation, since the search for a Lyapunov function constitutes the primary issue in attempting to apply Lyapunov's theorems.

Thus compute the total energy V of the pendulum. The kinetic energy is

(1/2)mL²(θ')²,

which in the variables of the system is

(1/2)mL²y².

The potential energy is the work done in lifting the bob above the lowest position. From Figure 11.1, this is

mgL(1 − cos(θ)),

or

mgL(1 − cos(x)).

The total energy is therefore given by

V(x, y) = mgL(1 − cos(x)) + (1/2)mL²y².

Clearly V(0, 0) = 0. The rest position of the pendulum (pendulum arm vertical with bob at the low point) has zero energy. Next, compute

V̇(x(t), y(t)) = Vx x'(t) + Vy y'(t) = mgL sin(x)x'(t) + mL²yy'(t).

Along any trajectory of the system, x' = y and y' = −(g/L) sin(x), so the orbital derivative is

V̇(x(t), y(t)) = mgL sin(x)y + mL²y(−(g/L) sin(x)) = 0.

This corresponds to the fact that, in a conservative physical setting, the total energy is a constant of the motion.

Now V is positive definite. This is expected because the energy should be a minimum in the rest position, where V(0, 0) = 0. Further, V̇ is negative semidefinite. Therefore the origin is stable.
As expected, Lyapunov's theorem did not tell us anything new about the pendulum, which we had already analyzed by other means. However, this example does provide some insight into a line of thought that could have motivated Lyapunov. In any conservative physical system, the total energy must be a constant of the motion, and we expect that any position with the system at rest should be stable if the potential energy is a minimum, and unstable if it is not. This suggests looking at the total energy as a candidate for a Lyapunov function V. In particular, for many mechanical systems the kinetic energy is a quadratic form. One then checks the orbital derivative V̇ to see if it is negative definite or semidefinite in a neighborhood of the point of interest.
EXAMPLE 11.26

Consider the system

X' = ( x³ − xy²
       y³ + 6x²y ).

The origin is an isolated critical point. We do not know whether it is stable, asymptotically stable, or unstable, so we will begin by trying to construct a Lyapunov function that fits either of the Lyapunov theorems. Attempt a Lyapunov function

V(x, y) = ax² + bxy + cy².

We know how to choose the coefficients to make this positive definite. The question is what happens with the orbital derivative. Compute

V̇ = Vx x' + Vy y'
  = (2ax + by)(x³ − xy²) + (bx + 2cy)(y³ + 6x²y)
  = 2ax⁴ + 2cy⁴ + (12c − 2a)x²y² + 7bx³y.

This looks promising because the first three terms can be made strictly positive for (x, y) ≠ (0, 0). Thus choose b = 0 and a = c = 1 to get

V̇(x, y) = 2x⁴ + 2y⁴ + 10x²y² > 0 for (x, y) ≠ (0, 0).

With this choice of the coefficients,

V(x, y) = x² + y²

is positive definite on any neighborhood of the origin, and V̇ is also positive definite. By Lyapunov's second theorem (Theorem 11.5), the origin is unstable. Figure 11.39 shows a phase portrait for this system.
EXAMPLE 11.27

A nonlinear oscillator with linear damping can be modeled by the differential equation

z'' + αz' + z + βz² + γz³ = 0,

in which β and γ are positive and α > 0. To convert this to a system, let z = x and z' = y to obtain

x' = y,
y' = −αy − x − βx² − γx³.
FIGURE 11.39 Phase portrait for x' = x³ − xy², y' = y³ + 6x²y.
This is the system

X' = ( y
       −αy − x − βx² − γx³ ).

We can construct a Lyapunov function by a clever observation. Let

V(x, y) = (1/2)y² + (1/2)x² + (β/3)x³ + (γ/4)x⁴.

Then

V̇ = (x + βx² + γx³)y + (y)(−αy − x − βx² − γx³) = −αy².

Since α > 0, V̇ is certainly negative semidefinite in any neighborhood of the origin. It may not be obvious whether V is positive definite in any neighborhood of the origin. Certainly the term y²/2 in V is nonnegative. The other terms are

(1/2)x² + (β/3)x³ + (γ/4)x⁴,

which we can write as

x²( 1/2 + (β/3)x + (γ/4)x² ).

Since g(x) = 1/2 + (β/3)x + (γ/4)x² is continuous for all x, and g(0) = 1/2, there is an interval (−h, h) about the origin such that g(x) > 0 for −h < x < h.
Then, in Nh, V(x, y) ≥ 0, and V(x, y) > 0 if (x, y) ≠ (0, 0). Therefore V is positive definite in this neighborhood. We now have V positive definite and V̇ negative semidefinite in Nh, hence the origin is stable. Figure 11.40 shows a phase portrait for one such oscillator, with α = 1. It is instructive to try different values of β and γ to get some idea of the effect of the nonlinear terms on the trajectories. ■
FIGURE 11.40 Nonlinear oscillator with α = 1.
EXAMPLE 11.28

Consider the system

x' = f(t)y + g(t)x(x² + y²),
y' = −f(t)x + g(t)y(x² + y²).

Assume that the origin is an isolated critical point. If we attempt a Lyapunov function V(x, y) = ax² + bxy + cy², then

V̇ = (2ax + by)[f(t)y + g(t)x(x² + y²)] + (bx + 2cy)[−f(t)x + g(t)y(x² + y²)]
  = (2a − 2c)f(t)xy + 2ax²g(t)(x² + y²) + 2cy²g(t)(x² + y²) + bf(t)y² + 2bg(t)xy(x² + y²) − bf(t)x².

We can eliminate three terms in the orbital derivative by trying b = 0. The term (2a − 2c)f(t)xy vanishes if we choose a = c. To have V positive definite we need a and c positive, so let a = c = 1. Then V(x, y) = x² + y², which is positive definite in any neighborhood of the origin, and

V̇ = 2(x² + y²)²g(t).
This is negative definite if g(t) < 0 for all t ≥ 0. In this case the origin is asymptotically stable. If g(t) ≤ 0 for t ≥ 0, then V̇ is negative semidefinite and the origin is stable. If g(t) > 0 for t ≥ 0, then the origin is unstable.
SECTION 11.6 PROBLEMS

In each of Problems 1 through 8, use Lyapunov's theorems to determine whether the origin is stable, asymptotically stable, or unstable.

1. x' = −2xy², y' = −x²y
2. x' = −x cos²(y), y' = (6 − x)y²
3. x' = −2x, y' = −3y³
4. x' = −x²y², y' = x²y + y³
5. x' = xy², y' = y³
6. x' = x⁵(1 + y²), y' = …
7. x' = x³(1 + y), y' = y³(4 + x²)
8. x' = x³ cos²(y), y' = y³(2 + x⁴)

11.7 Limit Cycles and Periodic Solutions

Nonlinear systems of differential equations can give rise to curves called limit cycles which have particularly interesting properties. To see how a limit cycle occurs naturally in a physical setting, draw a circle C of radius R on pavement, with R exceeding the length L between the points where the front and rear wheels of a bicycle touch the ground. Now grab the handlebars and push the bicycle so that its front wheel moves around C. What path does the rear wheel follow?

If you tie a marker to the rear wheel so that it traces out the rear wheel's path as the front wheel moves along C, you find that this path does not approach a particular point. Instead, as the front wheel continues its path around C, the rear wheel asymptotically approaches a circle K concentric with C and having radius √(R² − L²). If the rear wheel begins outside C, it will spiral inward toward K, while if it begins inside C, it will work its way outward toward K. If the rear wheel begins on K, it will remain on K.

This inner circle K has two properties in common with a stable critical point. Trajectories beginning near K move toward it, and if a trajectory begins on K, it remains there. However, K is not a point, but is instead a closed curve. K is a limit cycle of this motion.
DEFINITION 11.6   Limit Cycle

A limit cycle of a 2 × 2 system X' = F(X) is a closed trajectory K having the property that there are trajectories x = φ(t), y = ψ(t) of the system such that (φ(t), ψ(t)) spirals toward K in the limit as t → ∞.
We have already pointed out the analogy between a limit cycle and a critical point. This analogy can be pushed further by defining a concept of stability and asymptotic stability for limit cycles that is modeled after stability and asymptotic stability for critical points.
DEFINITION 11.7

Let K be a limit cycle of X' = F(X). Then,

1. K is stable if trajectories starting within a certain distance of K must remain within a fixed distance of K.
2. K is asymptotically stable if every trajectory that starts sufficiently close to K spirals toward K as t → ∞.
3. K is semistable if every trajectory starting on one side of K spirals toward K as t → ∞, while there are trajectories starting on the other side of K that spiral away from K as t → ∞.
4. K is unstable if there are trajectories starting on both sides of K that spiral away from K as t → ∞.

Keep in mind that a closed trajectory of X' = F(X) represents a periodic solution. We have seen periodic solutions previously with centers, which are critical points about which trajectories form closed curves. A limit cycle is therefore a periodic solution, toward which other solutions approach spirally. Often we are interested in whether a system X' = F(X) has a periodic solution, and we will shortly develop some tests to tell whether a system has such a solution, or sometimes to tell that it does not.
EXAMPLE 11.29 Limit Cycle
Consider the almost linear system

X' = ( 1    1    X + ( −x√(x² + y²)
       −1   1 )        −y√(x² + y²) ).   (11.10)
are 1 + i, so the origin is an unstable spiral point of the system X' = AX + G and also of th e linear system X' = AX . Figure 11 .41 shows a phase portrait of the linear system . Up to this point, whenever we have seen trajectories spiraling outward, they have grow n without bound. We will now see that this does not happen with this current system . This will be transparent if we convert the system to polar coordinates . Sinc e r2 = x2 +y2, then rr ' = xx' + yy '
x[x+y-x,\/x2+y2]+y[-x+y-y Jx2+y2 = x2 + y2 - (x 2 + y2) V x2 + y 2 = r 2 - r 3 .
]
FIGURE 11.41 Unstable spiral point for x' = x + y, y' = −x + y.
Then

r' = r − r² = r(1 − r).

This tells us that, if 0 < r < 1, then r' > 0, so the distance between a trajectory and the origin is increasing. Trajectories inside the unit circle are moving outward. But if r > 1, then r' < 0, so the distance between a trajectory and the origin is decreasing. Trajectories outside the unit circle are moving inward.

This does not yet tell us in detail how the trajectories are moving outward or inward. To determine this, we need to bring the polar angle θ into consideration. Differentiate x = r cos(θ) and y = r sin(θ) with respect to t:

x' = r' cos(θ) − r sin(θ)θ',
y' = r' sin(θ) + r cos(θ)θ'.

Now observe that

x'y − xy' = r sin(θ)[r' cos(θ) − r sin(θ)θ'] − r cos(θ)[r' sin(θ) + r cos(θ)θ']
          = −r²[cos²(θ) + sin²(θ)]θ' = −r²θ'.

But from the system X' = AX + G we have

x'y − xy' = y[ x + y − x√(x² + y²) ] − x[ −x + y − y√(x² + y²) ]
          = x² + y² = r².

Therefore,

r² = −r²θ',
CHAPTER 11 Qualitative Methods and Systems of Nonlinear Differential Equations

from which we conclude that θ' = -1. We now have an uncoupled system of differential equations for r and θ:

    r' = r(1 - r),
    θ' = -1.
This equation for r is separable. For the trajectory through (r₀, θ₀), solve these equations subject to the initial conditions r(0) = r₀, θ(0) = θ₀. We get

    r(t) = 1 / (1 - ((r₀ - 1)/r₀)e⁻ᵗ),    θ(t) = θ₀ - t.

These explicit solutions enable us to conclude the following.
If r₀ = 1, then (r₀, θ₀) is on the unit circle. But then r(t) = 1 for all t, hence a trajectory that starts on the unit circle remains there for all time. Further, θ'(t) = -1, so the point (r(t), θ(t)) moves clockwise around this circle as t increases.
If 0 < r₀ < 1, then (r₀, θ₀) is within the disk bounded by the unit circle. Now

    r(t) = 1 / (1 + ((1 - r₀)/r₀)e⁻ᵗ) < 1

for all t > 0. Therefore a trajectory starting at a point inside the unit disk remains there forever. Further, r(t) → 1 as t → ∞, so this trajectory approaches the unit circle from within.
Finally, if r₀ > 1, then (r₀, θ₀) is outside the unit circle. But now r(t) > 1 for all t, so a trajectory starting outside the unit circle remains outside for all time. However, it is still true that r(t) → 1 as t → ∞, so this trajectory approaches the unit circle from without.
In sum, trajectories tend in the limit to wrap around the unit circle, either from within or from without, depending on where they start. The unit circle is an asymptotically stable limit cycle of X' = F(X). A phase portrait is shown in Figure 11.42.
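The convergence of trajectories to the unit circle can be checked numerically. The following Python sketch (an illustration, not part of the text; the step size and integration time are arbitrary choices) integrates system (11.10) with a fixed-step Runge-Kutta scheme from one starting point inside the unit circle and one outside, and reports the final distance from the origin.

```python
import math

# Sketch: integrate x' = x + y - x*sqrt(x^2+y^2), y' = -x + y - y*sqrt(x^2+y^2)
# with fixed-step RK4 and confirm that trajectories approach r = 1.

def f(state):
    x, y = state
    r = math.hypot(x, y)
    return (x + y - x * r, -x + y - y * r)

def rk4_step(state, h):
    k1 = f(state)
    k2 = f((state[0] + h/2*k1[0], state[1] + h/2*k1[1]))
    k3 = f((state[0] + h/2*k2[0], state[1] + h/2*k2[1]))
    k4 = f((state[0] + h*k3[0], state[1] + h*k3[1]))
    return (state[0] + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            state[1] + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

def final_radius(x0, y0, t_end=20.0, h=0.01):
    state = (x0, y0)
    for _ in range(int(t_end / h)):
        state = rk4_step(state, h)
    return math.hypot(*state)

print(final_radius(0.1, 0.0))   # close to 1, approached from within
print(final_radius(3.0, 0.0))   # close to 1, approached from without
```

Both runs end with r very close to 1, matching the limit-cycle analysis above.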
Here is an example of a system having infinitely many asymptotically stable limit cycles.

EXAMPLE 11.30
The system

    X' = (  0   1 ) X + ( x sin(√(x² + y²)) )
         ( -1   0 )     ( y sin(√(x² + y²)) )
has particularly interesting limit cycles, as can be seen in Figure 11.43. These occur as concentric circles about the origin. Trajectories originating within the innermost circle spiral toward this circle, as do trajectories beginning between this first circle and the second one. Trajectories beginning between the second and third circles spiral toward the third circle, as do trajectories originating between the third and fourth circles. This pattern continues throughout the plane.
We will now develop some facts about closed trajectories (periodic solutions) and limit cycles. For the remainder of this section, X' = F(X) is a 2 × 2 autonomous system

    F(X) = ( f(x, y) )
           ( g(x, y) ).
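The same polar-coordinate computation used in Example 11.29 reduces the system of Example 11.30 to r' = r sin(r), θ' = -1, so the circles r = nπ are its closed trajectories. A small Python sketch (an illustration, not part of the text; the integration time and starting radii are arbitrary choices) integrates the radial equation to confirm that the odd multiples of π attract nearby trajectories.

```python
import math

# Sketch: in polar form the system of Example 11.30 reduces to
# r' = r sin(r), theta' = -1. Integrating the scalar radial equation
# shows trajectories are drawn to the odd multiples of pi.

def limit_radius(r0, t_end=40.0, h=0.001):
    g = lambda r: r * math.sin(r)
    r = r0
    for _ in range(int(t_end / h)):
        # RK4 for the scalar equation r' = r sin(r)
        k1 = g(r); k2 = g(r + h/2*k1); k3 = g(r + h/2*k2); k4 = g(r + h*k3)
        r += h/6*(k1 + 2*k2 + 2*k3 + k4)
    return r

print(limit_radius(1.0))   # inside the first circle: tends to pi
print(limit_radius(4.0))   # between first and second: tends to pi
print(limit_radius(7.0))   # between second and third: tends to 3*pi
```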
FIGURE 11.42 Limit cycle of x' = x + y - x√(x² + y²), y' = -x + y - y√(x² + y²).

FIGURE 11.43 Asymptotically stable limit cycles of x' = y + x sin(√(x² + y²)), y' = -x + y sin(√(x² + y²)).
The first result states that, under commonly encountered conditions, a closed trajectory of X' = F(X) must always enclose a critical point.

THEOREM 11.6 Enclosure of Critical Points
Let f and g be continuous with continuous first partial derivatives in a region of the plane containing a closed trajectory K of X' = F(X). Then K must enclose at least one critical point of X' = F(X).
This kind of result can sometimes be used to tell that certain regions of the plane cannot contain closed trajectories of a system of differential equations. For example, suppose the origin is the only critical point of X' = F(X). Then we automatically know that there can be no closed trajectory in, for example, one of the quadrants, because such a closed trajectory could not enclose the origin, contradicting the fact that it must enclose a critical point.
Bendixson's theorem, which follows, gives conditions under which X' = F(X) has no closed trajectory in a part of the plane. A region of the plane is called simply connected if it contains all the points enclosed by any closed curve in the region. For example, the region bounded by the unit circle is simply connected. But the shaded region shown in Figure 11.44 between the curves C and K is not simply connected, because C encloses points not in the region. A simply connected region can have no "holes" in it, because then a closed curve wrapping around a hole encloses points not in the region.
FIGURE 11.44 Non-simply connected region.

THEOREM 11.7 Bendixson
Let f and g be continuous with continuous first partial derivatives in a simply connected region R of the plane. Suppose f_x + g_y has the same sign throughout R, either positive or negative. Then X' = F(X) has no closed trajectory in R.
Proof Suppose R contains a closed trajectory C representing the periodic solution x = φ(t), y = ψ(t). Suppose (φ(t), ψ(t)) traverses this curve exactly once as t varies from a to b, and let D be the region enclosed by C. Evaluate the line integral

    ∮_C f dy - g dx = ∫ₐᵇ [f(φ(t), ψ(t))ψ'(t) - g(φ(t), ψ(t))φ'(t)] dt
                    = ∫ₐᵇ [x'(t)y'(t) - y'(t)x'(t)] dt = 0,

since x' = f(x, y) and y' = g(x, y) along the trajectory. But by Green's theorem,

    ∮_C f dy - g dx = ∫∫_D (f_x + g_y) dA,

and this integral cannot be zero because the integrand is continuous and of the same sign throughout D. This contradiction implies that no such closed trajectory C can exist within the region R.
EXAMPLE 11.31
Consider the system

    X' = ( 3x + 4y + x³ )
         ( 5x - 2y + y³ ).

Here f and g are continuous, with continuous first partial derivatives, throughout the plane. Further,

    f_x + g_y = (3 + 3x²) + (-2 + 3y²) = 1 + 3x² + 3y² > 0

for all (x, y). This system has no closed trajectory, hence no periodic solution.
The last two theorems have been negative, in the sense of providing criteria for X' = F(X) to have no periodic solution in some part of the plane. The next theorem, a major result credited jointly to Henri Poincaré and Ivar Bendixson, gives a condition under which X' = F(X) has a periodic solution.

THEOREM 11.8 Poincaré-Bendixson
Let f and g be continuous with continuous first partial derivatives in a closed, bounded region R of the plane that contains no critical point of X' = F(X). Let C be a trajectory of X' = F(X) that is in R for t ≥ t₀. Then C must be a periodic solution (closed trajectory), or else C spirals toward a closed trajectory as t → ∞.
In either case, as long as a trajectory enters R at some time, and R contains no critical point, then X' = F(X) has a periodic solution, namely this trajectory itself or, if not, a closed trajectory approached spirally by this trajectory.
On the face of it, this result may appear to contradict the conclusion of Theorem 11.6, since any periodic trajectory should enclose a critical point. However, this critical point need not be in the region R of the theorem. To illustrate, consider again the system (11.10). Let R be the region between the concentric circles r = 1/2 and r = 3 shown in Figure 11.45. The only critical point of X' = F(X) is the origin, which is not in R. If we choose any trajectory
FIGURE 11.45 Region between the circles r = 1/2 and r = 3, with trajectories approaching the limit cycle r = 1.
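The annulus in Figure 11.45 can be confirmed to be a trapping region numerically. This Python sketch (an illustration, not part of the text; the 36 sample angles are an arbitrary choice) checks the sign of xx' + yy' for system (11.10) on both bounding circles: positive on the inner circle and negative on the outer one, so the flow crosses both circles into the annulus.

```python
import math

# Sketch: verify the trapping annulus 1/2 <= r <= 3 used with the
# Poincare-Bendixson theorem for system (11.10). The radial velocity
# r*r' = x*x' + y*y' should be positive on r = 1/2 and negative on r = 3.

def radial_velocity(x, y):
    r = math.hypot(x, y)
    xp = x + y - x*r
    yp = -x + y - y*r
    return x*xp + y*yp

angles = [2*math.pi*k/36 for k in range(36)]
inner = [radial_velocity(0.5*math.cos(a), 0.5*math.sin(a)) for a in angles]
outer = [radial_velocity(3*math.cos(a), 3*math.sin(a)) for a in angles]
assert all(v > 0 for v in inner) and all(v < 0 for v in outer)
print("Flow crosses both circles into the annulus 1/2 <= r <= 3.")
```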
beginning at a point inside the unit circle, then, as we have seen, this trajectory approaches the unit circle, hence eventually enters R. The Poincaré-Bendixson theorem allows us to assert, just from this, that R contains a periodic solution of the system.
We will conclude this section with Liénard's theorem, which gives conditions sufficient for a system to have a limit cycle.

THEOREM 11.9 Liénard
Let p and q be continuous and have continuous derivatives on the entire real line. Suppose:
1. q(-x) = -q(x) for all x.
2. q(x) > 0 for all x > 0.
3. p(-x) = p(x) for all x.
Suppose also that the equation F(x) = 0 has exactly one positive root, where

    F(x) = ∫₀ˣ p(ξ) dξ.

If this root is denoted γ, suppose F(x) < 0 for 0 < x < γ, and F(x) is positive and nondecreasing for x > γ. Then the system

    X' = (       y        )
         ( -p(x)y - q(x) )

has a unique limit cycle enclosing the origin. Further, this limit cycle is asymptotically stable. ■
Under the conditions of the theorem, the system has exactly one periodic solution, and every other trajectory spirals toward this closed curve as t → ∞. As an illustration, we will use Liénard's theorem to analyze the van der Pol equation.

EXAMPLE 11.32 van der Pol Equation
The second-order differential equation

    z'' + α(z² - 1)z' + z = 0,
in which α is a positive constant, is called van der Pol's equation. It was derived by the Dutch engineer Balthazar van der Pol in the 1920s in his studies of vacuum tubes. It was of great interest to know whether this equation has periodic solutions, a question to which the answer is not obvious.
First write van der Pol's equation as a system. Let x = z and y = z' to get

    x' = y,
    y' = -α(x² - 1)y - x.

This system has exactly one critical point, the origin. Further, this system matches the one in Liénard's theorem if we let

    p(x) = α(x² - 1) and q(x) = x.

Now q(-x) = -q(x), q(x) > 0 for x > 0, and p(-x) = p(x), as required. Next, let

    F(x) = ∫₀ˣ p(ξ) dξ = α(x³/3 - x) = (α/3)x(x² - 3).

F has exactly one positive zero, γ = √3. For 0 < x < √3, F(x) < 0. Further, for x > √3, F(x) is positive and increasing (hence nondecreasing). By Liénard's theorem, the van der Pol equation has a unique limit cycle (hence periodic solution) enclosing the origin. This limit cycle is asymptotically stable, so trajectories beginning at points not on this closed trajectory spiral toward this limit cycle. Figures 11.46 through 11.49 show phase portraits for van der Pol's equation, for various choices of α.
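The limit cycle guaranteed by Liénard's theorem can be observed numerically. The following Python sketch (an illustration, assuming α = 1; the step size, transient time, and starting points are arbitrary choices) integrates the van der Pol system with RK4, discards a transient, and measures the amplitude of the settled oscillation, which is known to be close to 2 for moderate α.

```python
# Sketch: integrate van der Pol's equation as x' = y, y' = -alpha*(x^2-1)*y - x
# with RK4, let the transient die out, then record the peak |x| of the
# settled oscillation (the limit cycle).

def step(x, y, h, alpha=1.0):
    def f(x, y):
        return y, -alpha*(x**2 - 1)*y - x
    k1 = f(x, y)
    k2 = f(x + h/2*k1[0], y + h/2*k1[1])
    k3 = f(x + h/2*k2[0], y + h/2*k2[1])
    k4 = f(x + h*k3[0], y + h*k3[1])
    return (x + h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
            y + h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))

def amplitude(x0=0.1, y0=0.0, h=0.01, t_trans=60.0, t_meas=30.0):
    x, y = x0, y0
    for _ in range(int(t_trans / h)):    # discard the transient
        x, y = step(x, y, h)
    peak = 0.0
    for _ in range(int(t_meas / h)):     # track the settled oscillation
        x, y = step(x, y, h)
        peak = max(peak, abs(x))
    return peak

print(amplitude())           # roughly 2, starting inside the cycle
print(amplitude(3.0, 0.0))   # roughly 2, starting outside the cycle
```

That both starting points yield the same amplitude reflects the uniqueness and asymptotic stability asserted by the theorem.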
FIGURE 11.46 Phase portrait for van der Pol's equation for α = 0.2.
FIGURE 11.47 Phase portrait for van der Pol's equation for α = 0.5.
FIGURE 11.48 Phase portrait for van der Pol's equation for α = 1.
FIGURE 11.49 Phase portrait for van der Pol's equation for α = 3.

In each of Problems 1 through 4, use Bendixson's theorem to show that the system has no closed trajectory. Generate a phase portrait for the system.
1. x' = -2x - y + x³, y' = 10x + 5y + x²y - 2y sin(x)
2. x' = -x - 3y + e²ˣ, y' = x + 2y + cos(y)
3. x' = 3x - 7y + sinh(x), y' = -4y + 5e³ʸ
4. x' = y, y' = -x + y(9 - x² - y²), for (x, y) in the elliptical region bounded by the graph of x² + 9y² = 9
5. Recall that with the transformation from rectangular to polar coordinates, we obtain

    x dx/dt + y dy/dt = r dr/dt

and

    y dx/dt - x dy/dt = -r² dθ/dt.
Use these equations to show that the system

    x' = y + x f(√(x² + y²)),
    y' = -x + y f(√(x² + y²))

has closed trajectories associated with zeros of the function f (a given continuous function of one variable). If a closed trajectory is a limit cycle, what is its direction of orientation?
In each of Problems 6 through 9, use a conversion to polar coordinates (see Problem 5) to find all of the closed trajectories of the system, and determine which of these are limit cycles. Also classify the stability of each limit cycle. Generate a phase portrait for each system and attempt to identify the limit cycles.
6. x' = y + x sin(√(x² + y²)), y' = -x + y sin(√(x² + y²))
7. x' = y(1 - x² - y²), y' = -x(1 - x² - y²)
8. x' = x(1 - x² - y²), y' = y(1 - x² - y²)
9. x' = y + x(1 - x² - y²)(4 - x² - y²)(9 - x² - y²), y' = -x + y(1 - x² - y²)(4 - x² - y²)(9 - x² - y²)
In each of Problems 10 through 13, use the Poincaré-Bendixson theorem to establish the existence of a closed trajectory of the system. In each problem, find an annular region R (a region between two concentric circles) about the origin such that solutions within R remain within R. To do this, check the sign of xx' + yy' on circles bounding the annulus. Generate a phase portrait of the system and attempt to identify closed trajectories.
10. x' = x - y - x√(x² + y²), y' = x + y - y√(x² + y²)
11. x' = 4x - 4y - x(x² + 9y²), y' = 4x + 4y - y(x² + 9y²)
12. x' = y, y' = -x + y - y(x² + 2y²)
13. x' = 4x - 2y - y(4x² + y²), y' = 2x + 4y - y(4x² + y²)
In each of Problems 14 through 22, determine whether the system has a closed trajectory. Generate a phase portrait for the system and attempt to find closed trajectories.
14. x' = 3x + 4xy + xy², y' = -2y² + x⁴y
15. x' = -y + x + x(x² + y²), y' = x + y + y(x² + y²)
16. x' = -y², y' = 3x + 2x³
17. x' = y, y' = x² + e^sin(x)
18. x' = y, y' = -x + y - x²y
19. x' = x - 5y + y³, y' = x - y + y³ + 7y⁵
20. x' = y, y' = -x + ye⁻ʸ
21. x' = y, y' = -x³
22. x' = 9x - 5y + x(x² + 9y²), y' = 5x + 9y - y(x² + 9y²)
A differential equation has a periodic solution if there is a solution x = μ(t) and a positive number T such that μ(t + T) = μ(t) for all t. In each of Problems 23 through 27, prove that the differential equation has a periodic solution by converting it to a system and using theorems from this section. Generate a phase portrait for this system and attempt to identify a closed trajectory, which represents a periodic solution.
23. x'' + (x² - 1)x' + 2 sin(x) = 0
24. x'' + (5x⁴ + 9x² - 4)x' + sinh(x) = 0
25. x'' + x³ = 0
26. x'' + 4x = 0
27. x'' + x/(1 + x²) = 0
28. Use Bendixson's theorem to show that the van der Pol equation does not have a closed trajectory whose graph is completely contained in any of the following regions: (a) the infinite strip -1 < x < 1, (b) the half-plane x ≥ 1, or (c) the half-plane x ≤ -1.
Vector Analysis

CHAPTER 12 Vector Differential Calculus
CHAPTER 13 Vector Integral Calculus

The next two chapters combine vector algebra and geometry with the processes of calculus to develop vector calculus, or vector analysis. We begin with vector differential calculus, and follow in the next chapter with vector integral calculus.
Much of science and engineering deals with the analysis of forces: the force of water on a dam, air turbulence on a wing, tension on bridge supports, wind and weight stresses on buildings, and so on. These forces do not occur in a static state, but vary with position, time, and usually a variety of conditions. This leads to the use of vectors that are functions of one or more variables. Our treatment of vectors is in two parts: vector differential calculus (Chapter 12) and vector integral calculus (Chapter 13).
Vector differential calculus extends our ability to analyze motion problems from the real line to curves and surfaces in 3-space. Tools such as the directional derivative, divergence and curl of a vector, and gradient play significant roles in many applications.
Vector integral calculus generalizes integration to curves and surfaces in 3-space. This will pay many dividends, including the computation of quantities such as mass, center of mass, work, and flux of a vector field, as well as physical interpretations of vector operations. The main results are the integral theorems of Green, Gauss, and Stokes, which have broad applications in such areas as potential theory and the derivation and solution of partial differential equations modeling physical processes.
CHAPTER 12
Vector Differential Calculus

VECTOR FUNCTIONS OF ONE VARIABLE · VELOCITY, ACCELERATION, CURVATURE, AND TORSION · VECTOR FIELDS AND STREAMLINES · THE GRADIENT FIELD AND DIRECTIONAL DERIVATIVES · DIVERGENCE AND CURL
12.1 Vector Functions of One Variable

In vector analysis we deal with functions involving vectors. We will begin with one such class of functions.
DEFINITION 12.1 Vector Function of One Variable

A vector function of one variable is a vector, each component of which is a function of the same single variable.
Such a function typically has the appearance

    F(t) = x(t)i + y(t)j + z(t)k,

in which x(t), y(t) and z(t) are the component functions of F. For each t such that the components are defined, F(t) is a vector. For example, if

    F(t) = cos(t)i + 2t²j + 3tk,        (12.1)

then F(0) = i, F(π) = -i + 2π²j + 3πk, and F(-3) = cos(-3)i + 18j - 9k.
A vector function is continuous at t₀ if each component function is continuous at t₀. A vector function is continuous if each component function is continuous (for those values of t for which they are all defined). For example,

    G(t) = (1/(t - 1))i + ln(t)k

is continuous for all t > 0 with t ≠ 1.
The derivative of a vector function is the vector function formed by differentiating each component. With the function F of (12.1),

    F'(t) = -sin(t)i + 4tj + 3k.

A vector function is differentiable if it has a derivative for all t for which it is defined. The vector function G defined above is differentiable for t positive and different from 1, and

    G'(t) = -(1/(t - 1)²)i + (1/t)k.

We may think of F(t) as an arrow extending from the origin to (x(t), y(t), z(t)). Since F(t) generally varies with t, we must think of this arrow as having adjustable length and pivoting at the origin to swing about as (x(t), y(t), z(t)) moves. In this way the arrow sweeps out a curve in 3-space as t varies. This curve has parametric equations x = x(t), y = y(t), z = z(t). F(t) is called a position vector for this curve. Figure 12.1 shows a typical such curve.
FIGURE 12.1 Position vector for a curve.

The derivative of the position vector is the tangent vector to this curve. To see why this is true, observe from Figure 12.2 and the parallelogram law that the vector

    F(t₀ + Δt) - F(t₀)

is represented by the arrow from (x(t₀), y(t₀), z(t₀)) to (x(t₀ + Δt), y(t₀ + Δt), z(t₀ + Δt)). Since Δt is a nonzero scalar, the vector

    (1/Δt)[F(t₀ + Δt) - F(t₀)]

is along the line between these points. In terms of components, this vector is

    [(x(t₀ + Δt) - x(t₀))/Δt]i + [(y(t₀ + Δt) - y(t₀))/Δt]j + [(z(t₀ + Δt) - z(t₀))/Δt]k.        (12.2)

FIGURE 12.2
In the limit as Δt → 0, this vector approaches x'(t₀)i + y'(t₀)j + z'(t₀)k, which is F'(t₀). In this limit, the vector (12.2) moves into a position tangent to the curve at (x(t₀), y(t₀), z(t₀)), as suggested by Figure 12.3. This leads us to interpret F'(t₀) as the tangent vector to the curve at (x(t₀), y(t₀), z(t₀)). This assumes that F'(t₀) is not the zero vector, which has no direction.

FIGURE 12.3 F'(t₀) = lim_{Δt→0} [F(t₀ + Δt) - F(t₀)]/Δt.

We usually represent the tangent vector F'(t₀) as an arrow from the point (x(t₀), y(t₀), z(t₀)) on the curve having F(t) as position vector.
EXAMPLE 12.1

Let H(t) = t²i + sin(t)j - t²k. H(t) is the position vector for the curve given parametrically by x(t) = t², y(t) = sin(t), z(t) = -t², part of whose graph is given in Figure 12.4. The tangent vector is

    H'(t) = 2ti + cos(t)j - 2tk.

The tangent vector at the origin is H'(0) = j. The tangent vector at (1, sin(1), -1) is H'(1) = 2i + cos(1)j - 2k.
From calculus, we know that the length of a curve given parametrically by x = x(t), y = y(t), z = z(t) for a ≤ t ≤ b is

    length = ∫ₐᵇ √((x'(t))² + (y'(t))² + (z'(t))²) dt,

in which it is assumed that x', y' and z' are continuous on [a, b]. Now

    ||F'(t)|| = √((x'(t))² + (y'(t))² + (z'(t))²)
is the length of the tangent vector. Thus, in terms of the position vector F(t) = x(t)i + y(t)j + z(t)k,

    length = ∫ₐᵇ ||F'(t)|| dt.

The length of a curve having a tangent at each point is the integral of the length of the tangent vector over the curve.

FIGURE 12.4 Part of the graph of x = t², y = sin(t), z = -t².
EXAMPLE 12.2

Consider the curve given by the parametric equations

    x = cos(t), y = sin(t), z = t/3

for -4π ≤ t ≤ 4π. The position vector for this curve is

    F(t) = cos(t)i + sin(t)j + (t/3)k.

The graph of the curve is part of a helix wrapping around the cylinder x² + y² = 1, centered about the z-axis. The tangent vector at any point is

    F'(t) = -sin(t)i + cos(t)j + (1/3)k.

Figure 12.5 shows part of the helix and tangent vectors at various points. The length of the tangent vector is

    ||F'(t)|| = √(sin²(t) + cos²(t) + 1/9) = (1/3)√10.

The length of this curve is

    length = ∫_{-4π}^{4π} ||F'(t)|| dt = ∫_{-4π}^{4π} (1/3)√10 dt = (8π/3)√10.
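The closed-form length of Example 12.2 is easy to confirm numerically. This Python sketch (an illustration, not part of the text; the number of subintervals is an arbitrary choice) applies the midpoint rule to ||F'(t)|| and compares the result with (8π/3)√10.

```python
import math

# Sketch: approximate the helix length by a midpoint-rule sum over
# ||F'(t)|| and compare with the exact value from Example 12.2.

def speed(t):
    # ||F'(t)|| for F(t) = cos(t)i + sin(t)j + (t/3)k
    return math.sqrt(math.sin(t)**2 + math.cos(t)**2 + 1/9)

def arc_length(a, b, n=100000):
    h = (b - a) / n
    return sum(speed(a + (i + 0.5)*h) for i in range(n)) * h

exact = 8*math.pi*math.sqrt(10)/3
approx = arc_length(-4*math.pi, 4*math.pi)
print(exact, approx)   # both about 26.49
```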
FIGURE 12.5 Part of a circular helix and some of its tangent vectors.
Sometimes it is convenient to write the position vector of a curve in such a way that the tangent vector at each point has length 1. Such a tangent is called a unit tangent. We will show how this can be done (at least in theory) if the coordinate functions of the curve have continuous derivatives.
Let F(t) = x(t)i + y(t)j + z(t)k for a ≤ t ≤ b, and suppose x', y' and z' are continuous. Define the real-valued function

    s(t) = ∫ₐᵗ ||F'(ξ)|| dξ.

As suggested by Figure 12.6, s(t) is the length of the part of the curve from its initial point (x(a), y(a), z(a)) to (x(t), y(t), z(t)). As t moves from a to b, s(t) increases from s(a) = 0 to s(b) = L, which is the total length of the curve. By the fundamental theorem of calculus, s(t) is differentiable wherever F'(t) is continuous, and

    ds/dt = ||F'(t)|| = √(x'(t)² + y'(t)² + z'(t)²).

Because s is strictly increasing as a function of t, we can, at least in theory, solve for t in terms of s, giving the inverse function t(s) (see Figure 12.7). Now define

    G(s) = F(t(s)) = x(t(s))i + y(t(s))j + z(t(s))k

for 0 ≤ s ≤ L. Then G is a position function for the same curve as F. As t varies from a to b, F(t) sweeps out the same curve as G(s), as s varies from 0 to L. However, G has the advantage that the tangent vector G' always has length 1. To see this, use the chain rule to compute

    G'(s) = (d/ds)F(t(s)) = (dt/ds)F'(t) = (1/(ds/dt))F'(t) = (1/||F'(t)||)F'(t),

and this vector has length 1.
FIGURE 12.6 Length function along a curve.

FIGURE 12.7 A length function has an inverse.
EXAMPLE 12.3

Consider again the helix having position function

    F(t) = cos(t)i + sin(t)j + (t/3)k

for -4π ≤ t ≤ 4π. We have already calculated

    ||F'(t)|| = (1/3)√10.

Therefore the length function along this curve is

    s(t) = ∫_{-4π}^{t} (1/3)√10 dξ = (1/3)√10 (t + 4π).

Solve for the inverse function to write

    t = t(s) = (3/√10)s - 4π.

Substitute this into the position vector to define

    G(s) = F(t(s)) = F((3/√10)s - 4π)
         = cos((3/√10)s - 4π)i + sin((3/√10)s - 4π)j + (1/3)((3/√10)s - 4π)k
         = cos((3/√10)s)i + sin((3/√10)s)j + ((1/√10)s - (4π/3))k.

Now compute

    G'(s) = -(3/√10)sin((3/√10)s)i + (3/√10)cos((3/√10)s)j + (1/√10)k.

This is a tangent vector to the helix, and it has length 1.
Assuming that the derivatives exist, then
1. [F(t) + G(t)]' = F'(t) + G'(t).
2. [f(t)F(t)]' = f'(t)F(t) + f(t)F'(t) if f is a differentiable real-valued function.
3. [F(t) · G(t)]' = F'(t) · G(t) + F(t) · G'(t).
4. [F(t) × G(t)]' = F'(t) × G(t) + F(t) × G'(t).
5. [F(f(t))]' = f'(t)F'(f(t)).
Items (2), (3) and (4) are all "product rules". Rule (2) is for the derivative of a product of a scalar function with a vector function; (3) is for the derivative of a dot product; and (4) is for the derivative of a cross product. In each case the rule has the same form as the familiar calculus formula for the derivative of a product of two functions. However, in (4), order is important, since F × G = -G × F. Rule (5) is a vector version of the chain rule.
In the next section we will use vector functions to develop the concepts of velocity and acceleration, which we will apply to the geometry of curves in 3-space.
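Rule (4) can be spot-checked numerically. The following Python sketch (an illustration; the pair F and G here are hypothetical choices, not taken from the text) compares a central-difference derivative of F × G with F' × G + F × G'.

```python
import math

# Sketch: verify [F x G]' = F' x G + F x G' numerically for the
# hypothetical functions F(t) = (t, t^2, 1) and G(t) = (cos t, 0, t).

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def F(t): return (t, t*t, 1.0)
def G(t): return (math.cos(t), 0.0, t)
def Fp(t): return (1.0, 2*t, 0.0)           # F'(t), by hand
def Gp(t): return (-math.sin(t), 0.0, 1.0)  # G'(t), by hand

t, h = 0.7, 1e-5
numeric = tuple((u - v) / (2*h)
                for u, v in zip(cross(F(t + h), G(t + h)),
                                cross(F(t - h), G(t - h))))
exact = tuple(u + v for u, v in zip(cross(Fp(t), G(t)), cross(F(t), Gp(t))))
print(numeric, exact)   # the two triples agree closely
```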
PROBLEMS

In each of Problems 1 through 8, compute the requested derivative (a) by carrying out the vector operation and differentiating the resulting vector or scalar, and (b) by using one of the differentiation rules (1) through (5) stated at the end of this section.
1. F(t) = i + 3t²j + 2tk, f(t) = 4 cos(3t); (d/dt)[f(t)F(t)]
2. F(t) = ti - 3t²k, G(t) = i + cos(t)k; (d/dt)[F(t) · G(t)]
3. (d/dt)[F(t) × G(t)]
4. F(t) = sinh(t)j - tk, G(t) = ti + t²j - t²k; (d/dt)[F(t) × G(t)]
5. F(t) = ti - cosh(t)j + eᵗk, f(t) = 1 - 2t³
7. F(t) = -9i + t²j + t²k, G(t) = eᵗi; (d/dt)[F(t) × G(t)]
In each of Problems 9, 10, and 11, (a) write the position vector and tangent vector for the curve whose parametric equations are given, (b) find a length function s(t) for the curve, (c) write the position vector as a function of s, and (d) verify that the resulting position vector has a derivative of length 1.
9. x = sin(t), y = cos(t), z = 45t; (0 ≤ t ≤ 2π)
10. x = y = z = t³; (-1 ≤ t ≤ 1)
11. x = 2t², y = 3t², z = 4t²; (1 ≤ t ≤ 3)
12. Let F(t) = x(t)i + y(t)j + z(t)k. Suppose x, y and z are differentiable functions of t. Think of F(t) as the position function of a particle moving along a curve in 3-space. Suppose F × F' = O. Prove that the particle always moves in the same direction.
12.2 Velocity, Acceleration, Curvature and Torsion

Imagine a particle moving along a path having position vector F(t) = x(t)i + y(t)j + z(t)k as t varies from a to b. We want to relate F to the dynamics of the particle. For calculations we will be doing, we assume that x, y and z are twice differentiable. We will also make use of the distance function along the curve,

    s(t) = ∫ₐᵗ ||F'(ξ)|| dξ.
DEFINITION 12.2 Velocity, Speed

1. The velocity v(t) of the particle at time t is defined to be v(t) = F'(t).
2. The speed v(t) of the particle at time t is the magnitude of the velocity.
Velocity is therefore a vector, having magnitude and direction. If v(t) is not the zero vector, then the velocity is tangent to the curve of motion of the particle. Thus, at any instant the particle may be thought of as moving in the direction of the tangent to the path of motion.
The speed at time t is a real-valued function, given by

    v(t) = ||v(t)|| = ||F'(t)|| = ds/dt.

This is consistent with the familiar idea of speed as the rate of change of distance (along the path of motion) with respect to time.
DEFINITION 12.3 Acceleration

The acceleration a(t) of the particle is the rate of change of the velocity with respect to time:

    a(t) = v'(t).

Alternatively, a(t) = F''(t). As with velocity, acceleration is a vector.
EXAMPLE 12.4

Let F(t) = sin(t)i + 2e⁻ᵗj + t²k. The path of the particle is the curve whose parametric equations are

    x = sin(t), y = 2e⁻ᵗ, z = t².

Part of the graph of this curve is shown in Figure 12.8. The velocity and acceleration are, respectively,

    v(t) = cos(t)i - 2e⁻ᵗj + 2tk

and

    a(t) = -sin(t)i + 2e⁻ᵗj + 2k.
FIGURE 12.8 Part of the graph of x = sin(t), y = 2e⁻ᵗ, z = t².
The speed of the particle is

    v(t) = √(cos²(t) + 4e⁻²ᵗ + 4t²).

If F'(t) is not the zero vector, then this vector is tangent to the curve at (x(t), y(t), z(t)). We may obtain a unit tangent by dividing this vector by its length:

    T(t) = (1/||F'(t)||)F'(t) = (1/(ds/dt))F'(t).

Equivalently,

    T(t) = v(t)/||v(t)|| = v(t)/v(t),

the velocity divided by the speed. If arc length s along the path of motion is used as the parameter in the position function, then we have seen that ||F'(s)|| = 1 automatically, so in this case the speed is identically 1 and this unit tangent is just the velocity vector. We will use this unit tangent to define a function that quantifies the "amount of bending" of a curve at a point.
DEFINITION 12.4 Curvature

The curvature κ of a curve is the magnitude of the rate of change of the unit tangent with respect to arc length along the curve:

    κ(s) = ||dT/ds||.

The definition is motivated by the intuition (Figure 12.9) that the more a curve bends at a point, the faster the tangent vector is changing there.
If the unit tangent vector has been written using s as parameter, then computing T'(s) is straightforward. More often, however, the unit tangent is parametrized by some other variable, and then the derivative defining the curvature must be computed by the chain rule:

    κ(t) = ||(dT/dt)(dt/ds)||.
FIGURE 12.9 Increasing curvature corresponds to an increasing rate of change of the tangent vector along the curve.
This gives the curvature as a function of the parameter used in the position vector. Since

    dt/ds = 1/(ds/dt) = 1/||F'(t)||,

we often write

    κ(t) = ||T'(t)|| / ||F'(t)||.        (12.3)
EXAMPLE 12.5

Consider a straight line having parametric equations

    x = a + bt, y = c + dt, z = e + ht,

in which a, b, c, d, e and h are constants. The position vector of this line is

    F(t) = (a + bt)i + (c + dt)j + (e + ht)k.

We will compute the curvature using equation (12.3). First,

    F'(t) = bi + dj + hk,

so

    ||F'(t)|| = √(b² + d² + h²).

The unit tangent vector is

    T(t) = (1/||F'(t)||)F'(t) = (1/√(b² + d² + h²))(bi + dj + hk).

This is a constant vector, so T'(t) = 0 and the curvature is

    κ(t) = ||T'(t)|| / ||F'(t)|| = 0.

This is consistent with our intuition that a straight line should have curvature zero.
EXAMPLE 12.6

Let C be the circle of radius 4 about the origin in the plane y = 3. Using polar coordinates, this curve has parametric equations

    x = 4cos(θ), y = 3, z = 4sin(θ)

for 0 ≤ θ ≤ 2π. The circle has position vector

    F(θ) = 4cos(θ)i + 3j + 4sin(θ)k.

Then

    F'(θ) = -4sin(θ)i + 4cos(θ)k,

so ||F'(θ)|| = 4. The unit tangent is

    T(θ) = (1/4)[-4sin(θ)i + 4cos(θ)k] = -sin(θ)i + cos(θ)k.

Then T'(θ) = -cos(θ)i - sin(θ)k. The curvature is

    κ(θ) = ||T'(θ)|| / ||F'(θ)|| = 1/4.

The curvature of this circle is constant, again consistent with intuition. One can show that a circle of radius r has curvature 1/r. Not only does a circle have constant curvature, but this curvature also decreases the larger the radius is chosen, as we should expect.
EXAMPLE 12.7

Let C have parametric representation

    x = cos(t) + t sin(t), y = sin(t) - t cos(t), z = t²

for t > 0. Figure 12.10 shows part of the graph of C. We will compute the curvature. We can write the position vector

    F(t) = [cos(t) + t sin(t)]i + [sin(t) - t cos(t)]j + t²k.

FIGURE 12.10 Part of the graph of x = cos(t) + t sin(t), y = sin(t) - t cos(t), z = t².

A tangent vector is given by

    F'(t) = t cos(t)i + t sin(t)j + 2tk

and

    ||F'(t)|| = √(t² + 4t²) = √5 t.

Next, the unit tangent vector is

    T(t) = (1/||F'(t)||)F'(t) = (1/√5)[cos(t)i + sin(t)j + 2k].

Compute

    T'(t) = (1/√5)[-sin(t)i + cos(t)j].

We can now use equation (12.3) to compute

    κ(t) = ||T'(t)|| / ||F'(t)|| = (1/(√5 t)) · (1/√5)√(sin²(t) + cos²(t)) = 1/(5t)

for t > 0. ■
We will now introduce another vector of interest in studying motion along a curve. Using arc length s as parameter on the curve, the unit normal vector N(s) is defined by

    N(s) = (1/κ(s))T'(s).
The name given to this vector is motivated by two observations.
First, N(s) is a unit vector. Since κ(s) = ||T'(s)||, then

    ||N(s)|| = (1/||T'(s)||)||T'(s)|| = 1.

Second, N(s) is orthogonal to the unit tangent vector. To see this, begin with the fact that T(s) is a unit tangent, hence ||T(s)|| = 1. Then

    ||T(s)||² = T(s) · T(s) = 1.

Differentiate this equation to get

    T'(s) · T(s) + T(s) · T'(s) = 2T(s) · T'(s) = 0,

hence

    T(s) · T'(s) = 0,

which means that T(s) is orthogonal to T'(s). But N(s) is a positive scalar multiple of T'(s), and so is in the same direction as T'(s). We conclude that N(s) is orthogonal to T(s).
At any point of a curve with differentiable coordinate functions (not all vanishing for the same parameter value), we may now place a tangent vector to the curve, and a normal vector that is perpendicular to the tangent vector (Figure 12.11).
FIGURE 12.11 Tangent and normal vector to a curve at a point.
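Both facts about the unit normal can be checked numerically for the curve of Example 12.7. This Python sketch (an illustration, not part of the text; the sample point t = 1.5 is an arbitrary choice) estimates κ(t) = ||T'(t)||/||F'(t)|| and the dot product T · T' by central differences.

```python
import math

# Sketch: for the curve of Example 12.7, estimate the curvature
# numerically and check that T' (hence the unit normal N) is
# orthogonal to the unit tangent T.

def F(t):
    return (math.cos(t) + t*math.sin(t), math.sin(t) - t*math.cos(t), t*t)

def diff(f, t, h=1e-5):
    # central-difference derivative of a vector-valued function
    return tuple((a - b) / (2*h) for a, b in zip(f(t + h), f(t - h)))

def norm(v):
    return math.sqrt(sum(c*c for c in v))

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

def T(t):
    v = diff(F, t)
    n = norm(v)
    return tuple(c / n for c in v)

t = 1.5
kappa = norm(diff(T, t)) / norm(diff(F, t))
print(kappa, 1/(5*t))         # both close to the predicted 1/(5t)
print(dot(T(t), diff(T, t)))  # close to 0: T is orthogonal to T'
```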
EXAMPLE 12.8

Consider again the curve with position function

    F(t) = [cos(t) + t sin(t)]i + [sin(t) - t cos(t)]j + t²k

for t > 0. In Example 12.7 we computed the unit tangent and the curvature as functions of t. We will write the position vector as a function of arc length, and compute the unit tangent T(s) and the unit normal N(s).
First, using ||F'(t)|| = √5 t from Example 12.7,

    s(t) = ∫₀ᵗ ||F'(ξ)|| dξ = ∫₀ᵗ √5 ξ dξ = (√5/2)t².

Solve for t as a function of s:

    t = √(2s/√5) = a√s,

in which a = √2/5^(1/4). In terms of s, the position vector is

    G(s) = F(t(s)) = [cos(a√s) + a√s sin(a√s)]i + [sin(a√s) - a√s cos(a√s)]j + a²sk.

The unit tangent is

    T(s) = G'(s) = (a²/2)cos(a√s)i + (a²/2)sin(a√s)j + a²k.

This vector does indeed have length 1:

    ||T(s)||² = (a⁴/4)cos²(a√s) + (a⁴/4)sin²(a√s) + a⁴ = (5/4)a⁴ = (5/4)(4/5) = 1.

Now

    T'(s) = -(a³/(4√s))sin(a√s)i + (a³/(4√s))cos(a√s)j,

so the curvature is

    κ(s) = ||T'(s)|| = a³/(4√s) = 1/(√2 · 5^(3/4) · √s)

for s > 0. Since s = √5 t²/2, in terms of t we have κ = 1/(5t), consistent with Example 12.7. Now compute the unit normal vector

    N(s) = (1/κ(s))T'(s) = (4√s/a³)[-(a³/(4√s))sin(a√s)i + (a³/(4√s))cos(a√s)j]
         = -sin(a√s)i + cos(a√s)j.

This is a unit vector orthogonal to T(s).

12.2.1 Tangential and Normal Components of Acceleration

At any point on the trajectory of a particle, the tangent and normal vectors are orthogonal. We will now show how to write the acceleration at a point as a linear combination of the tangent and normal vectors there:

    a = a_T T + a_N N.
This is illustrated in Figure 12.12.
THEOREM 12.1

a = (dv/dt)T + v²κN.

Thus a_T = dv/dt and a_N = v²κ. The tangential component of the acceleration is the derivative of the speed, while the normal component is the curvature at the point, times the square of the speed there.

Proof   First observe that

T(t) = F'(t)/||F'(t)|| = v/v.

Therefore v = vT. Then

a = dv/dt = (dv/dt)T + vT'(t)
  = (dv/dt)T + v (dT/ds)(ds/dt)
  = (dv/dt)T + v²T'(s)
  = (dv/dt)T + v²κN. ■

Here is one use of this decomposition of the acceleration. Since T and N are orthogonal unit vectors, then

||a||² = a • a = (a_T T + a_N N) • (a_T T + a_N N)
       = a_T² T • T + 2a_T a_N T • N + a_N² N • N = a_T² + a_N².

From this, whenever two of ||a||, a_T and a_N are known, we can compute the third quantity.
EXAMPLE 12.9

Return again to the curve C having position function

F(t) = [cos(t) + t sin(t)]i + [sin(t) - t cos(t)]j + t²k

for t > 0. We will compute the tangential and normal components of the acceleration. First,

v(t) = F'(t) = t cos(t)i + t sin(t)j + 2tk,

so the speed is

v(t) = ||F'(t)|| = √5 t.

The tangential component of the acceleration is therefore

a_T = dv/dt = √5,

a constant for this curve. The acceleration vector is

a = v' = [cos(t) - t sin(t)]i + [sin(t) + t cos(t)]j + 2k,

and a routine calculation gives

||a||² = 5 + t².

Then

a_N² = ||a||² - a_T² = 5 + t² - 5 = t².

Since t > 0, the normal component of acceleration is

a_N = t.

The acceleration may therefore be written as

a = √5 T + tN.

If we know the normal component a_N and the speed v, it is easy to compute the curvature, since

a_N = t = κv² = 5t²κ

implies that

κ = 1/(5t),

as we computed in Example 12.7 directly from the tangent vector. Now the unit tangent and normal vectors are easy to compute in terms of t. First,

T(t) = v/v = (1/√5)[cos(t)i + sin(t)j + 2k].

This is usually easy to compute, since v = F'(t) is a straightforward calculation. But in addition, we now have the unit normal vector (as a function of t)

N(t) = (1/κ)T'(s) = (1/κ)(dT/dt)(dt/ds) = (1/(κv))T'(t)
     = (5t/(√5 t)) · (1/√5)[-sin(t)i + cos(t)j] = -sin(t)i + cos(t)j.

This calculation does not require the explicit computation of s(t) and its inverse function. ■
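The components a_T = √5 and a_N = t computed in this example can be reproduced symbolically. A sketch with sympy (an illustrative check, not part of the text):

```python
import sympy as sp

t = sp.symbols('t', positive=True)
# Curve from Example 12.9
F = sp.Matrix([sp.cos(t) + t*sp.sin(t), sp.sin(t) - t*sp.cos(t), t**2])
v = F.diff(t)                                  # velocity
speed = sp.simplify(sp.sqrt(v.dot(v)))         # sqrt(5)*t
a = v.diff(t)                                  # acceleration
aT = sp.diff(speed, t)                         # tangential component dv/dt
aN = sp.simplify(sp.sqrt(sp.simplify(a.dot(a)) - aT**2))  # from ||a||^2 = aT^2 + aN^2
print(aT)   # sqrt(5)
print(aN)   # t
```

Note that a_N was obtained from the decomposition ||a||² = a_T² + a_N² rather than from the curvature, exactly as in the text.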
12.2.2 Curvature as a Function of t

Equation (12.3) gives the curvature in terms of the parameter t used for the position function:

κ(t) = ||T'(t)|| / ||F'(t)||.

This is a handy formula because it does not require introduction of the distance function s(t). We will now derive another expression for the curvature that is sometimes useful in calculating κ directly from the position function.

THEOREM 12.2

Let F be the position function of a curve, and suppose that the components of F are twice differentiable functions. Then

κ = ||F' × F''|| / ||F'||³. ■
This states that the curvature is the magnitude of the cross product of the first and second derivatives of F, divided by the cube of the length of F'.

Proof   First write

a = a_T T + κ(ds/dt)²N.

Take the cross product of this equation with the unit tangent vector:

T × a = a_T T × T + κ(ds/dt)²(T × N) = κ(ds/dt)²(T × N),

since the cross product of any vector with itself is the zero vector. Then

||T × a|| = κ(ds/dt)² ||T × N|| = κ(ds/dt)² ||T|| ||N|| sin(θ),

where θ is the angle between T and N. But these are orthogonal unit vectors, so θ = π/2 and ||T|| = ||N|| = 1. Therefore

||T × a|| = κ(ds/dt)².

Then

κ = ||T × a|| / (ds/dt)².

But T = F'/||F'||, a = F'', and ds/dt = ||F'||, so

κ = (1/||F'||²) ||(F'/||F'||) × F''|| = ||F' × F''|| / ||F'||³. ■
EXAMPLE 12.10

Let C have position function

F(t) = t²i - t³j + tk.

We want the curvature of C. Compute

F'(t) = 2ti - 3t²j + k,

F''(t) = 2i - 6tj,

and

F' × F'' = 6ti + 2j - 6t²k.

Then

κ(t) = ||F' × F''|| / ||F'||³ = √(36t² + 4 + 36t⁴) / (4t² + 9t⁴ + 1)^(3/2). ■
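Theorem 12.2 and equation (12.3) must give the same curvature. The following sketch (sympy; an illustrative check, not part of the text) compares the two formulas on the curve of Example 12.10:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
F = sp.Matrix([t**2, -t**3, t])        # curve of Example 12.10
Fp, Fpp = F.diff(t), F.diff(t, 2)
speed = sp.sqrt(Fp.dot(Fp))

# Curvature from Theorem 12.2: ||F' x F''|| / ||F'||^3
cross = Fp.cross(Fpp)
kappa_thm = sp.sqrt(cross.dot(cross)) / speed**3

# Curvature from equation (12.3): ||T'(t)|| / ||F'(t)||
T = Fp / speed
Tp = T.diff(t)
kappa_123 = sp.sqrt(sp.simplify(Tp.dot(Tp))) / speed

# The two formulas agree (checked numerically at a few parameter values)
for val in (sp.Rational(1, 2), 1, 2):
    assert abs(float((kappa_thm - kappa_123).subs(t, val))) < 1e-12
print("formulas agree")
```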
12.2.3 The Frenet Formulas

If T and N are the unit tangent and normal vectors, the vector B = T × N is also a unit vector, and is orthogonal to T and N. At any point on the curve where these three vectors are defined and nonzero, the triple T, N, B forms a right-handed rectangular coordinate system (as in Figure 12.13). We can in effect put an x, y, z coordinate system at any point P on C, with the positive x-axis along T, the positive y-axis along N, and the positive z-axis along B. Of course, this system twists and changes orientation in space as P moves along C (Figure 12.14).

FIGURE 12.13

FIGURE 12.14

Since N = (1/κ)T'(s), then

dT/ds = κN.

Further, it can be shown that there is a scalar-valued function τ such that

dN/ds = -κT + τB

and

dB/ds = -τN.

These three equations are called the Frenet formulas. The scalar quantity τ(s) is the torsion of C at (x(s), y(s), z(s)). If we look along C at the coordinate system formed at each point by T, N and B, the torsion measures how this system twists about the curve as the point moves along the curve.
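The Frenet formulas can be exercised on a concrete curve. The sketch below (sympy; the circular helix is an assumed illustrative curve, not one from the text) builds T, N, B and extracts the torsion from the third formula dB/ds = -τN:

```python
import sympy as sp

t = sp.symbols('t', real=True)
# A circular helix (assumed example): x = cos t, y = sin t, z = t
F = sp.Matrix([sp.cos(t), sp.sin(t), t])
Fp = F.diff(t)
speed = sp.sqrt(Fp.dot(Fp))               # ds/dt = sqrt(2)
T = Fp / speed                            # unit tangent

dT_ds = sp.simplify(T.diff(t) / speed)    # first Frenet formula: dT/ds = kappa*N
kappa = sp.simplify(sp.sqrt(dT_ds.dot(dT_ds)))
N = dT_ds / kappa                         # unit normal
B = T.cross(N)                            # binormal

# Third Frenet formula: dB/ds = -tau*N, so tau = -(dB/ds) . N
tau = sp.simplify(-(B.diff(t) / speed).dot(N))
print(kappa, tau)   # 1/2 1/2
```

For this helix both the curvature and the torsion are constant, reflecting the fact that the T, N, B frame twists about the curve at a steady rate.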
SECTION 12.2   PROBLEMS

In each of Problems 1 through 10, a position vector is given. Determine the velocity, speed, acceleration, tangential and normal components of the acceleration, the curvature, the unit tangent, unit normal and binormal vectors.

1. F = 3ti - 2j + t²k
2. F = t sin(t)i + t cos(t)j + k
3. F = 2ti - 2tj + tk
4. F = e^t sin(t)i - j + e^t cos(t)k
5. F = 3e^t(i + j - 2k)
6. F = α cos(t)i + βtj + α sin(t)k
7. F = 2 sinh(t)j - 2 cosh(t)k
8. F = ln(t)(i - j + 2k)
9. F = αt²i + βt²j + γt²k
10. F = 3t cos(t)j - 3t sin(t)k

11. Suppose we are given the position vector of a curve and find that the unit tangent vector is a constant vector. Prove that the curve is a straight line.

12. It is easy to verify that the curvature of any straight line is zero. Suppose C is a curve with twice differentiable coordinate functions, and curvature zero. Does it follow that C is a straight line?
12.3 Vector Fields and Streamlines

We now turn to the analysis of vector functions of more than one variable.

DEFINITION 12.6   Vector Field

A vector field in 3-space is a 3-vector whose components are functions of three variables. A vector field in the plane is a 2-vector whose components are functions of two variables.
A vector field in 3-space has the appearance

G(x, y, z) = f(x, y, z)i + g(x, y, z)j + h(x, y, z)k,

and, in the plane,

K(x, y) = f(x, y)i + g(x, y)j.
The term "vector field" is geometrically motivated. At each point P for which a vector field G is defined, we can represent the vector G(P) as an arrow from P. It is often useful in working with a vector field G to draw arrows G(P) at points through the region where G is defined. This drawing is also referred to as a vector field (think of arrows growing at points). The variations in length and orientation of these arrows give some sense of the flow of the vector field, and its variations in strength, just as a direction field helps us visualize trajectories of a system X' = F(X) of differential equations.

Figures 12.15 and 12.16 show the vector fields G(x, y) = xyi + (x - y)j and H(x, y) = y cos(x)i + (x² - y²)j, respectively, in the plane. Figures 12.17, 12.18 and 12.19 show the vector fields F(x, y, z) = cos(x + y)i - xj + (x - z)k, Q(x, y, z) = -yi + zj + (x + y + z)k and M(x, y, z) = cos(x)i + e^(-x) sin(y)j + (z - y)k, respectively, in 3-space.
FIGURE 12.15   Representation of the vector field G(x, y) = xyi + (x - y)j.

FIGURE 12.16   Representation of the vector field H(x, y) = y cos(x)i + (x² - y²)j.

FIGURE 12.17   Representation of the vector field F(x, y, z) = cos(x + y)i - xj + (x - z)k as arrows in planes z = constant.

FIGURE 12.18   Representation of the vector field Q(x, y, z) = -yi + zj + (x + y + z)k in planes z = constant.
FIGURE 12.19   The vector field M(x, y, z) = cos(x)i + e^(-x) sin(y)j + (z - y)k in planes z = constant.
A vector field is continuous if each of its component functions is continuous. A partial derivative of a vector field is the vector field obtained by taking the partial derivative of each component function. For example, if

F(x, y, z) = cos(x + y)i - xj + (x - z)k,

then

∂F/∂x = F_x = -sin(x + y)i - j + k,

∂F/∂y = F_y = -sin(x + y)i,

and

∂F/∂z = F_z = -k.

If G(x, y) = xyi + (x - y)j, then

∂G/∂x = G_x = yi + j   and   ∂G/∂y = G_y = xi - j.
Streamlines of a vector field F are curves with the property that, at each point (x, y, z), the vector F(x, y, z) is tangent to the curve through this point.

DEFINITION 12.7   Streamlines

Let F be a vector field, defined for all (x, y, z) in some region Ω of 3-space. Let R be a set of curves with the property that, through each point P of Ω, there passes exactly one curve from R. The curves in R are streamlines of F if, at each (x, y, z) in Ω, the vector F(x, y, z) is tangent to the curve in R passing through (x, y, z).
Streamlines are also called flow lines or lines of force, depending on context. If F is the velocity field for a fluid, the streamlines are often called flow lines (paths of particles in the fluid). If F is a magnetic field the streamlines are called lines of force. If you put iron filings on a piece of cardboard and then hold a magnet underneath, the filings will be aligned by the magnet along the lines of force of the field.

Given a vector field F, we would like to find the streamlines. This is the problem of constructing a curve through each point of a region, given the tangent to the curve at each point. Figure 12.20 shows typical streamlines of a vector field, together with some of the tangent vectors. We want to determine the curves from the tangents.
FIGURE 12.20   Streamlines of a vector field.
To solve this problem, suppose C is a streamline of F = fi + gj + hk. Let C have parametric equations

x = x(ξ), y = y(ξ), z = z(ξ).

The position vector for this curve is

R(ξ) = x(ξ)i + y(ξ)j + z(ξ)k.

Now R'(ξ) = x'(ξ)i + y'(ξ)j + z'(ξ)k is tangent to C at (x(ξ), y(ξ), z(ξ)). But for C to be a streamline of F, F(x(ξ), y(ξ), z(ξ)) is also tangent to C at this point, hence must be parallel to R'(ξ). These vectors must therefore be scalar multiples of each other. For some scalar t (which may depend on ξ),

R'(ξ) = tF(x(ξ), y(ξ), z(ξ)).

But then

(dx/dξ)i + (dy/dξ)j + (dz/dξ)k = tf(x(ξ), y(ξ), z(ξ))i + tg(x(ξ), y(ξ), z(ξ))j + th(x(ξ), y(ξ), z(ξ))k.

This implies that

dx/dξ = tf,   dy/dξ = tg,   dz/dξ = th.        (12.4)

Since f, g and h are given functions, these equations constitute a system of differential equations for the coordinate functions of the streamlines. If f, g and h are nonzero, then t can be eliminated to write the system in differential form as

dx/f = dy/g = dz/h.        (12.5)
EXAMPLE 12.11

We will find the streamlines of the vector field

F = x²i + 2yj - k.

The system (12.4) is

dx/dξ = tx²,   dy/dξ = 2ty,   dz/dξ = -t.

If x and y are not zero, this can be written in the form of equations (12.5):

dx/x² = dy/2y = dz/(-1).

These equations can be solved in pairs. To begin, integrate

dx/x² = -dz

to get

-1/x = -z + c,

in which c is an arbitrary constant. Next, integrate

dy/2y = -dz

to get

(1/2) ln|y| = -z + k.

It is convenient to express two of the variables in terms of the third. If we solve for x and y in terms of z, we get

x = 1/(z - c),   y = ae^(-2z),

in which a = e^(2k) is an arbitrary (positive) constant. This gives us parametric equations of the streamlines, with z as parameter:

x = 1/(z - c),   y = ae^(-2z),   z = z.

To find the streamline through a particular point, we must choose c and a appropriately. For example, suppose we want the streamline through (-1, 6, 2). Then z = 2 and we need to choose c and a so that

-1 = 1/(2 - c)   and   6 = ae^(-4).

Then c = 3 and a = 6e⁴, so the streamline through (-1, 6, 2) has parametric equations

x = 1/(z - 3),   y = 6e^(4-2z),   z = z.

A graph of this streamline is shown in Figure 12.21. ■
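It is easy to confirm that the curve found in this example really is a streamline: its tangent must be parallel to F at every point, and it must pass through (-1, 6, 2). A sketch with sympy (an illustrative check, not part of the text):

```python
import sympy as sp

z = sp.symbols('z')
# Streamline of F = x^2 i + 2y j - k through (-1, 6, 2), with z as parameter
x = 1 / (z - 3)
y = 6 * sp.exp(4 - 2*z)
R = sp.Matrix([x, y, z])
Rp = R.diff(z)                       # tangent vector to the streamline
Fvec = sp.Matrix([x**2, 2*y, -1])    # the field evaluated along the curve

# Tangent and field are parallel, so their cross product vanishes
print(sp.simplify(Rp.cross(Fvec)).T)   # Matrix([[0, 0, 0]])
# The curve passes through (-1, 6, 2)
print(R.subs(z, 2).T)                  # Matrix([[-1, 6, 2]])
```

In fact Rp here equals -Fvec, illustrating that the scalar t in (12.4) need not be positive.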
FIGURE 12.21   Part of the graph of x = 1/(z - 3), y = 6e^(4-2z), z = z.
EXAMPLE 12.12

Suppose we want the streamlines of F(x, y, z) = -yj + zk. Here the i component is zero, so we must begin with equations (12.4), not (12.5). We have

dx/dξ = 0,   dy/dξ = -ty,   dz/dξ = tz.

The first equation implies that x = constant. This simply means that all the streamlines are in planes parallel to the y, z plane. The other two equations yield

dy/(-y) = dz/z,

and an integration gives

-ln|y| + c = ln|z|.

Then

ln|zy| = c,

implying that

zy = k,

in which k is constant. The streamlines are given by the equations

x = c,   z = k/y,

in which c and k are arbitrary constants and y is the parameter. For example, to find the streamline through (-4, 1, 7), choose c = -4 and k so that z = k/y passes through y = 1, z = 7. We need

7 = k/1 = k,

and the streamline has equations

x = -4,   z = 7/y.

The streamline is a hyperbola in the plane x = -4 (Figure 12.22). ■
FIGURE 12.22   Part of the graph of x = -4, z = 7/y.
SECTION 12.3   PROBLEMS

In each of Problems 1 through 5, compute the two first partial derivatives of the vector field and make a diagram in which each indicated vector is drawn as an arrow from the point at which the vector is evaluated.

1. G(x, y) = 3xi - 4xyj; G(0, 1), G(1, 3), G(1, 4), G(-1, -2), G(-3, 2)
2. G(x, y) = -2x²yj; G(0, 0), G(0, 1), G(2, -3), G(-1, -3)
3. G(x, y) = 2xyi + cos(x)j; G(π/2, 0), G(0, 0), G(-1, 1), G(π, -3), G(-π/4, -2)
4. G(x, y) = sin(2xy)i + (x² + y)j; G(-π/2, 0), G(0, 2), G(π/4, 4), G(1, 1), G(-2, 1)
5. G(x, y) = 3x²i + (x - 2y)j; G(1, -1), G(0, 2), G(-3, 2), G(-2, -2), G(2, 5)

In each of Problems 6 through 10, compute the three first partial derivatives of the vector field.

6. F = … - 2x²yj + cosh(z + y)k
7. F = 4z² cos(x)i - x³yzj + x³yk
8. F = 3xy³i + ln(x + y + z)j + cosh(xyz)k
9. F = -z⁴ sin(xy)i + 3xy⁴zj + cosh(z - x)k
10. F = (14x - 2y)i + (x² - y² - z²)j + 5xyk

In each of Problems 11 through 16, find the streamlines of the vector field, then find the particular streamline through the given point.

11. F = i - y²j + zk; (2, 1, 1)
12. F = i - 2j + k; (0, 1, 1)
13. F = (1/x)i + e^x j - k; (2, 0, 4)
14. F = cos(y)i + sin(x)j; (π/2, 0, -4)
15. F = 2e^z j - cos(y)k; (3, π/4, 0)
16. F = 3x²i - yj + z³k; (2, 1, 6)

17. Construct a vector field whose streamlines are straight lines.

18. Construct a vector field in the x, y plane whose streamlines are circles about the origin.

12.4 The Gradient Field and Directional Derivatives

Let φ(x, y, z) be a real-valued function of three variables. In the context of vectors, such a function is called a scalar field. We will define an important vector field manufactured from φ.
DEFINITION 12.8   Gradient

The gradient of a scalar field φ is the vector field ∇φ given by

∇φ = (∂φ/∂x)i + (∂φ/∂y)j + (∂φ/∂z)k,

wherever these partial derivatives are defined.

The symbol ∇φ is read "del phi", and ∇ is called the del operator. It operates on a scalar field to produce a vector field. For example, if φ(x, y, z) = x²y cos(yz), then

∇φ = 2xy cos(yz)i + [x² cos(yz) - x²yz sin(yz)]j - x²y² sin(yz)k.

The gradient field evaluated at a point P is denoted ∇φ(P). For the gradient just computed,

∇φ(1, -1, 3) = -2 cos(3)i + [cos(3) - 3 sin(3)]j + sin(3)k.

If φ is a function of just x and y, then ∇φ is a vector in the x, y plane. For example, if φ(x, y) = (x - y) cos(y), then

∇φ(x, y) = cos(y)i + [-cos(y) - (x - y) sin(y)]j.

At (2, π) this gradient is

∇φ(2, π) = -i + j.
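Gradient computations like the one above are mechanical and easy to verify with a computer algebra system. A sketch with sympy (an illustrative check, not part of the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
phi = x**2 * y * sp.cos(y*z)
# Build the gradient componentwise
grad = sp.Matrix([phi.diff(v) for v in (x, y, z)])

# Evaluate at (1, -1, 3) and compare with the value computed above
expected = sp.Matrix([-2*sp.cos(3), sp.cos(3) - 3*sp.sin(3), sp.sin(3)])
print(sp.simplify(grad.subs({x: 1, y: -1, z: 3}) - expected).T)  # Matrix([[0, 0, 0]])
```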
The gradient has the obvious properties

∇(φ + ψ) = ∇φ + ∇ψ

and, if c is a number, then

∇(cφ) = c∇φ.
We will now define the directional derivative, and relate this to the gradient. Suppose φ(x, y, z) is a scalar field. Let u = ai + bj + ck be a unit vector (length 1). Let P₀ = (x₀, y₀, z₀). Represent u as an arrow from P₀, as in Figure 12.23. We want to define a quantity that measures the rate of change of φ(x, y, z) as (x, y, z) varies from P₀, in the direction of u.

To do this, notice that, if t > 0, then the point

P : (x₀ + at, y₀ + bt, z₀ + ct)

is on the line through P₀ in the direction of u. Further, the distance from P₀ to P along this direction is exactly t, because the vector from P₀ to P is

(x₀ + at - x₀)i + (y₀ + bt - y₀)j + (z₀ + ct - z₀)k,

and this is just tu.

FIGURE 12.23   (x, y, z) = (x₀ + at, y₀ + bt, z₀ + ct), t > 0.

The derivative

(d/dt) φ(x + at, y + bt, z + ct)

is the rate of change of φ(x + at, y + bt, z + ct) with respect to this distance t, and

(d/dt) φ(x + at, y + bt, z + ct) |_(t=0)

is this rate of change evaluated at P₀. This derivative gives the rate of change of φ(x, y, z) at P₀ in the direction of u. We will summarize this discussion in the following definition.

DEFINITION 12.9   Directional Derivative

The directional derivative of a scalar field φ at P₀ in the direction of the unit vector u is denoted D_u φ(P₀), and is given by

D_u φ(P₀) = (d/dt) φ(x₀ + at, y₀ + bt, z₀ + ct) |_(t=0).

We usually compute a directional derivative using the following.

THEOREM 12.3
If φ is a differentiable function of two or three variables, and u is a constant unit vector, then

D_u φ(P₀) = ∇φ(P₀) • u. ■

Proof   Let u = ai + bj + ck. By the chain rule,

(d/dt) φ(x + at, y + bt, z + ct) = (∂φ/∂x)a + (∂φ/∂y)b + (∂φ/∂z)c.

Since (x₀ + at, y₀ + bt, z₀ + ct) = (x₀, y₀, z₀) when t = 0, then

D_u φ(P₀) = (∂φ/∂x)(P₀)a + (∂φ/∂y)(P₀)b + (∂φ/∂z)(P₀)c = ∇φ(P₀) • u. ■
EXAMPLE 12.13

Let φ(x, y, z) = x²y - xe^z, P₀ = (2, -1, π) and u = (1/√6)(i - 2j + k). Then the rate of change of φ(x, y, z) at P₀ in the direction of u is

D_u φ(2, -1, π) = ∇φ(2, -1, π) • u
  = φ_x(2, -1, π)(1/√6) + φ_y(2, -1, π)(-2/√6) + φ_z(2, -1, π)(1/√6)
  = (1/√6)([2xy - e^z](2,-1,π) - 2[x²](2,-1,π) + [-xe^z](2,-1,π))
  = (1/√6)(-4 - e^π - 8 - 2e^π) = -(3/√6)(4 + e^π). ■
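Theorem 12.3 reduces the directional derivative to a dot product, which makes it straightforward to check numerically. A sketch with sympy reproducing Example 12.13 (an illustrative check, not part of the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
phi = x**2 * y - x * sp.exp(z)
grad = sp.Matrix([phi.diff(v) for v in (x, y, z)])
u = sp.Matrix([1, -2, 1]) / sp.sqrt(6)       # unit vector from Example 12.13

# D_u phi(P0) = grad phi(P0) . u
D = grad.subs({x: 2, y: -1, z: sp.pi}).dot(u)
expected = -(3 / sp.sqrt(6)) * (4 + sp.exp(sp.pi))
print(sp.simplify(D - expected))   # 0
```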
In working with directional derivatives, care must be taken that the direction is given by a unit vector. If a vector w of length other than 1 is used to specify the direction, then use the unit vector w/||w|| in computing the directional derivative. Of course, w and w/||w|| have the same direction. A unit vector is used with directional derivatives so that the vector specifies only direction, without contributing a factor of magnitude.

Suppose now that φ(x, y, z) is defined at least for all points within some sphere about P₀. Imagine standing at P₀ and looking in various directions. We may see φ(x, y, z) increasing in some, decreasing in others, perhaps remaining constant in some directions. In what direction does φ(x, y, z) have its greatest, or least, rate of increase from P₀? We will now show that the gradient vector ∇φ(P₀) points in the direction of maximum rate of increase at P₀, and -∇φ(P₀) in the direction of minimum rate of increase.
THEOREM 12.4

Let φ and its first partial derivatives be continuous in some sphere about P₀, and suppose that ∇φ(P₀) ≠ O. Then

1. At P₀, φ(x, y, z) has its maximum rate of change in the direction of ∇φ(P₀). This maximum rate of change is ||∇φ(P₀)||.
2. At P₀, φ(x, y, z) has its minimum rate of change in the direction of -∇φ(P₀). This minimum rate of change is -||∇φ(P₀)||.

Proof   Let u be any unit vector. Then

D_u φ(P₀) = ∇φ(P₀) • u = ||∇φ(P₀)|| ||u|| cos(θ) = ||∇φ(P₀)|| cos(θ),

because u has length 1. Here θ is the angle between u and ∇φ(P₀). The direction u in which φ has its greatest rate of increase from P₀ is the direction in which this directional derivative is a maximum. Clearly the maximum occurs when cos(θ) = 1, hence when θ = 0. But this occurs when u is along ∇φ(P₀). Therefore this gradient is the direction of maximum rate of change of φ(x, y, z) at P₀. This maximum rate of change is ||∇φ(P₀)||.

For (2), observe that the directional derivative is a minimum when cos(θ) = -1, hence when θ = π. This occurs when u is opposite ∇φ(P₀), and this minimum rate of change is -||∇φ(P₀)||. ■
EXAMPLE 12.14

Let φ(x, y, z) = 2xz + e^y z². We will find the maximum and minimum rates of change of φ(x, y, z) from (2, 1, 1). First,

∇φ(x, y, z) = 2zi + e^y z²j + (2x + 2ze^y)k,

so

∇φ(P₀) = 2i + ej + (4 + 2e)k.

The maximum rate of increase of φ(x, y, z) at (2, 1, 1) is in the direction of this gradient, and this maximum rate of change is

√(4 + e² + (4 + 2e)²).

The minimum rate of increase is in the direction of -2i - ej - (4 + 2e)k, and is

-√(4 + e² + (4 + 2e)²). ■
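By Theorem 12.4 the maximum rate of change is simply the length of the gradient, so Example 12.14 can be checked in a few lines. A sketch with sympy (an illustrative check, not part of the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
phi = 2*x*z + sp.exp(y) * z**2
grad = sp.Matrix([phi.diff(v) for v in (x, y, z)])
g0 = grad.subs({x: 2, y: 1, z: 1})    # gradient at (2, 1, 1)

# Maximum rate of change = ||grad phi(P0)||
max_rate = sp.sqrt(g0.dot(g0))
print(sp.simplify(max_rate - sp.sqrt(4 + sp.E**2 + (4 + 2*sp.E)**2)))  # 0
```

The minimum rate of change is the negative of this value, in the opposite direction.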
12.4.1 Level Surfaces, Tangent Planes and Normal Lines

Depending on the function φ and the constant k, the locus of points φ(x, y, z) = k may form a surface in 3-space. Any such surface is called a level surface of φ. For example, if φ(x, y, z) = x² + y² + z² and k > 0, then the level surface φ(x, y, z) = k is a sphere of radius √k. If k = 0 this locus is just a single point, the origin. If k < 0 this locus is empty: there are no points whose coordinates satisfy this equation. The level surface φ(x, y, z) = 0 of φ(x, y, z) = z - sin(xy) is shown from three perspectives in Figures 12.24 (a), (b) and (c).
FIGURE 12.24   Different perspectives of graphs of z = sin(xy).
Now consider a point P₀ : (x₀, y₀, z₀) on a level surface φ(x, y, z) = k. Assume that there are smooth (having continuous tangents) curves on the surface passing through P₀, such as C₁ and C₂ in Figure 12.25. Each such curve has a tangent vector at P₀. These tangent vectors determine a plane Π at P₀, called the tangent plane to the surface at P₀. A vector normal (perpendicular) to Π at P₀, in the sense of being normal to each of these tangent vectors, is called a normal vector, or just normal, to the surface at P₀. We would like to be able to determine the tangent plane and normal to a surface at a point. Recall that we can find the equation of a plane through a given point if we are given a normal vector to the plane. Thus the normal vector is the key to finding the tangent plane.

FIGURE 12.25   Tangents to curves on the surface through P₀ determine the tangent plane Π.
THEOREM 12.5   Gradient As a Normal Vector

Let φ and its first partial derivatives be continuous. Then ∇φ(P) is normal to the level surface φ(x, y, z) = k at any point P on this surface at which this gradient vector is nonzero. ■
We will outline an argument suggesting why this is true. Consider a point P₀ : (x₀, y₀, z₀) on the level surface φ(x, y, z) = k. Suppose a smooth curve C on this surface passes through P₀, as in Figure 12.26 (a). Suppose C has parametric equations

x = x(t), y = y(t), z = z(t).

FIGURE 12.26   (a) Tangent plane to the level surface at P₀. (b) ∇φ(P₀) is normal to the level surface at P₀.

Since P₀ is on this curve, for some t₀,

x(t₀) = x₀, y(t₀) = y₀, z(t₀) = z₀.

Further, since the curve lies on the level surface, then

φ(x(t), y(t), z(t)) = k

for all t. Then

(d/dt) φ(x(t), y(t), z(t)) = 0 = φ_x x'(t) + φ_y y'(t) + φ_z z'(t) = ∇φ • [x'(t)i + y'(t)j + z'(t)k].

Now x'(t)i + y'(t)j + z'(t)k = T(t) is a tangent vector to C. In particular, letting t = t₀, then T(t₀) is a tangent vector to C at P₀, and we have

∇φ(P₀) • T(t₀) = 0.

This means that ∇φ(P₀) is normal to the tangent to C at P₀. But this is true for any smooth curve on the surface and passing through P₀. Therefore ∇φ(P₀) is normal to the surface at P₀ (Figure 12.26 (b)).

Once we have this normal vector, finding the equation of the tangent plane is straightforward. If (x, y, z) is any other point on the tangent plane (Figure 12.27), then the vector (x - x₀)i + (y - y₀)j + (z - z₀)k is in this plane, hence is orthogonal to the normal vector. Then

∇φ(P₀) • [(x - x₀)i + (y - y₀)j + (z - z₀)k] = 0.        (12.6)
EXAMPLE 12 .1 5 Consider the level surface cp(x, y, z) = z - . /x 2 +y2 = O . This surface is the cone shown in Figure 12 .28 . We will find the normal vector and tangent plane to this surface at (1, 1, -if) . First compute the gradient vector: x *cp= -*x2+y2
1x2+y2 j+k >
provided that x and y are not both zero . Figure 12 .29 shows Vcp at a point on the cone determine d by the position vector R(x, y, z) = xi + yj + 1/x2 + y 2 k . Then Vcp(1, 1, ') = -
FIGURE 12 .28
=j+k .
FIGURE 12 .29 Con e z = ,/x2 + y 2 an d normal at Po.
This is the normal vector to the cone at (1, 1, √2). The tangent plane at this point has equation

-(1/√2)(x - 1) - (1/√2)(y - 1) + (z - √2) = 0,

or

x + y - √2 z = 0.

The cone has no tangent plane or normal vector at the origin, where the surface has a "sharp point". This is analogous to a graph in the plane having no tangent vector where it has a sharp point (for example, y = |x| at the origin). ■
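The normal vector and tangent plane of Example 12.15 follow directly from equation (12.6), and can be reproduced symbolically. A sketch with sympy (an illustrative check, not part of the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
phi = z - sp.sqrt(x**2 + y**2)        # the cone as a level surface phi = 0
grad = sp.Matrix([phi.diff(v) for v in (x, y, z)])
n = grad.subs({x: 1, y: 1, z: sp.sqrt(2)})   # normal at (1, 1, sqrt(2))

# Equation (12.6): n . ((x, y, z) - P0) = 0
plane = n.dot(sp.Matrix([x - 1, y - 1, z - sp.sqrt(2)]))

# Scaling by -sqrt(2) recovers the form x + y - sqrt(2) z = 0
print(sp.simplify(-sp.sqrt(2)*plane - (x + y - sp.sqrt(2)*z)))  # 0
```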
EXAMPLE 12.16

Consider the surface z = sin(xy). If we let φ(x, y, z) = sin(xy) - z, then this surface is the level surface φ(x, y, z) = 0. The gradient vector is

∇φ = y cos(xy)i + x cos(xy)j - k.

This vector field is shown in Figure 12.30, with the gradient vectors drawn as arrows from selected points on the surface. The tangent plane at any point (x₀, y₀, z₀) on this surface has equation

y₀ cos(x₀y₀)(x - x₀) + x₀ cos(x₀y₀)(y - y₀) - (z - z₀) = 0.
FIGURE 12.30   Gradient field ∇φ = y cos(xy)i + x cos(xy)j - k represented as a vector field on the surface z = sin(xy).
For example, the tangent plane at (2, 1, sin(2)) has equation

cos(2)(x - 2) + 2 cos(2)(y - 1) - z + sin(2) = 0,

or

cos(2)x + 2 cos(2)y - z = 4 cos(2) - sin(2).

A patch of this tangent plane is shown in Figure 12.31. Similarly, the tangent plane at (-1, -2, sin(2)) has equation

2 cos(2)x + cos(2)y + z = -4 cos(2) + sin(2).

Part of this tangent plane is shown in Figure 12.32. ■

FIGURE 12.31   Part of the tangent plane to z = sin(xy) at (2, 1, sin(2)).

FIGURE 12.32   Part of the tangent plane to z = sin(xy) at (-1, -2, sin(2)).

A straight line through P₀ and parallel to ∇φ(P₀) is called the normal line to the level surface φ(x, y, z) = k at P₀, assuming that this gradient vector is not zero. This idea is illustrated in Figure 12.33.

FIGURE 12.33   Normal line to the surface φ(x, y, z) = k at P₀.

To write the equation of the normal line, let (x, y, z) be any point on it. Then the vector

(x - x₀)i + (y - y₀)j + (z - z₀)k

is along this line, hence is parallel to ∇φ(P₀). This means that, for some scalar t, either of these vectors is t times the other, say

(x - x₀)i + (y - y₀)j + (z - z₀)k = t∇φ(P₀).
The components on the left must equal the respective components on the right:

x - x₀ = (∂φ/∂x)(P₀)t,   y - y₀ = (∂φ/∂y)(P₀)t,   z - z₀ = (∂φ/∂z)(P₀)t.

These are parametric equations of the normal line. As t varies over the real line, these equations give coordinates of points (x, y, z) on the normal line.
EXAMPLE 12.17

Consider again the cone φ(x, y, z) = z - √(x² + y²) = 0. In Example 12.15 we computed the gradient vector at (1, 1, √2), obtaining

∇φ(1, 1, √2) = -(1/√2)i - (1/√2)j + k.

The normal line through this point has parametric equations

x - 1 = -(1/√2)t,   y - 1 = -(1/√2)t,   z - √2 = t.

We can also write

x = 1 - (1/√2)t,   y = 1 - (1/√2)t,   z = √2 + t. ■
SECTION 12.4   PROBLEMS

In each of Problems 1 through 6, compute the gradient of the function and evaluate this gradient at the given point. Determine at this point the maximum and minimum rate of change of the function.

1. φ(x, y, z) = xyz; (1, 1, 1)
2. φ(x, y, z) = x²y - sin(xz); (1, -1, π/4)

In each of Problems 7 through 10, compute the directional derivative of the function in the direction of the given vector.

8. φ(x, y, z) = cos(x - y) + e^z; i - j + 2k
9. φ(x, y, z) = x²yz³; 2j + k
10. φ(x, y, z) = yz + xz + xy; i - 4k

In each of Problems 11 through 16, find the equations of the tangent plane and normal line to the surface at the point.

11. x² + y² + z² = …

In each of Problems 17 through 20, find the angle between the two surfaces at the given point of intersection. (Compute this angle as the angle between the normals to the surfaces at this point.)

17. z = 3x² + 2y², -2x + 7y² - z = 0; (1, 1, 5)
18. x² + y² + z² = 4, z² + x² = 2; (1, √2, 1)
19. z = √(x² + y²), x² + y² = 8; (2, 2, 2√2)
20. x² + y² + 2z² = 10, x + y + z = 5; (2, 2, 1)

21. Suppose ∇φ = i + k. What can be said about level surfaces of φ? Prove that the streamlines of ∇φ are orthogonal to the level surfaces of φ.
12.5 Divergence and Curl

The gradient operator ∇ produces a vector field from a scalar field. We will now discuss two other vector operations. One produces a scalar field from a vector field, and the other a vector field from a vector field.
DEFINITION 12.10   Divergence

The divergence of a vector field F(x, y, z) = f(x, y, z)i + g(x, y, z)j + h(x, y, z)k is the scalar field

div F = ∂f/∂x + ∂g/∂y + ∂h/∂z.

For example, if F = 2xyi + (xyz² - sin(yz))j + ze^(x+y)k, then

div F = 2y + xz² - z cos(yz) + e^(x+y).

We read div F as the divergence of F, or just "div F".

DEFINITION 12.11   Curl

The curl of a vector field F = fi + gj + hk is the vector field

curl F = (∂h/∂y - ∂g/∂z)i + (∂f/∂z - ∂h/∂x)j + (∂g/∂x - ∂f/∂y)k.

This vector is read "curl of F", or just "curl F". For example, if F = yi + 2xzj + ze^x k, then

curl F = -2xi - ze^x j + (2z - 1)k.
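Both of these examples can be reproduced with sympy's vector module (an illustrative check, not part of the text):

```python
import sympy as sp
from sympy.vector import CoordSys3D, divergence, curl

C = CoordSys3D('C')
x, y, z = C.x, C.y, C.z

# First example: div F
F1 = 2*x*y*C.i + (x*y*z**2 - sp.sin(y*z))*C.j + z*sp.exp(x + y)*C.k
assert sp.simplify(divergence(F1) - (2*y + x*z**2 - z*sp.cos(y*z) + sp.exp(x + y))) == 0

# Second example: curl F, compared component by component
F2 = y*C.i + 2*x*z*C.j + z*sp.exp(x)*C.k
c = curl(F2)
assert sp.simplify(c.dot(C.i) - (-2*x)) == 0
assert sp.simplify(c.dot(C.j) - (-z*sp.exp(x))) == 0
assert sp.simplify(c.dot(C.k) - (2*z - 1)) == 0
print("div and curl match the text")
```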
Divergence, curl and gradient can all be thought of in terms of the vector operations of multiplication of a vector by a scalar, dot product and cross product, using the del operator ∇. This is defined by

∇ = (∂/∂x)i + (∂/∂y)j + (∂/∂z)k.

The symbol ∇, which is read "del", is treated like a vector in carrying out calculations, and the "product" of ∂/∂x, ∂/∂y and ∂/∂z with a function φ(x, y, z) is interpreted to mean, respectively, ∂φ/∂x, ∂φ/∂y and ∂φ/∂z. In this way, the gradient of φ is the product of the vector ∇ with the scalar function φ:

∇φ = (∂φ/∂x)i + (∂φ/∂y)j + (∂φ/∂z)k = gradient of φ.

The divergence of a vector field is the dot product of del with the vector:

∇ • F = ((∂/∂x)i + (∂/∂y)j + (∂/∂z)k) • (fi + gj + hk) = ∂f/∂x + ∂g/∂y + ∂h/∂z = divergence of F.

And the curl of a vector field is the cross product of del with the vector:

∇ × F = | i      j      k     |
        | ∂/∂x   ∂/∂y   ∂/∂z  |
        | f      g      h     |

      = (∂h/∂y - ∂g/∂z)i + (∂f/∂z - ∂h/∂x)j + (∂g/∂x - ∂f/∂y)k = curl F.

Informally, del times = gradient, del dot = divergence, and del cross = curl. This provides a way of thinking of gradient, divergence and curl in terms of familiar vector operations involving del, and will prove to be an efficient tool in carrying out computations.

There are two fundamental relationships between gradient, divergence and curl. The first states that the curl of a gradient is the zero vector.
THEOREM 12.6   Curl of a Gradient

Let φ be continuous with continuous first and second partial derivatives. Then

∇ × (∇φ) = O. ■

This conclusion can also be written curl(∇φ) = O. The zero on the right is the zero vector, since the curl of a vector field is a vector.
CHAPTER 12 Vector Differential Calculus Proof
By direct computation, \ x(V(p)=Vx (**i+**j+az-k)
i j k a/ax a/ay a/a z acp/ax &p/ay acp/a z ( (92 , a2, i+ a2c
because the paired mixed partial derivatives in each set of parentheses are equal . ■ The second relationship states that the divergence of a curl is the number zero . -t 1 THEOREM 12 . 7
Let F be a continuous vector field whose components have continuous first and second partial derivatives. Then

∇ · (∇ × F) = 0. ■

We may also write div(curl F) = 0.

Proof
As with the preceding theorem, proceed by direct computation:

∇ · (∇ × F) = ∂/∂x (∂h/∂y − ∂g/∂z) + ∂/∂y (∂f/∂z − ∂h/∂x) + ∂/∂z (∂g/∂x − ∂f/∂y)

= ∂²h/∂x∂y − ∂²g/∂x∂z + ∂²f/∂y∂z − ∂²h/∂y∂x + ∂²g/∂z∂x − ∂²f/∂z∂y = 0,

because equal mixed partials appear in pairs with opposite signs. ■

Divergence and curl have physical interpretations, two of which we will now develop.

12.5.1 A Physical Interpretation of Divergence

Suppose F(x, y, z, t) is the velocity of a fluid at point (x, y, z) and time t. Time plays no role in computing divergence, but is included here because normally a velocity vector does depend on time. Imagine a small rectangular box within the fluid, as in Figure 12.34. We would like some measure of the rate per unit volume at which fluid flows out of this box across its faces, at any given time.

First look at the front face II and the back face I in the diagram. The normal vector pointing out of the box from face II is i. The flux of the flow out of the box across face II is the normal component of the velocity (the dot product of F with i), multiplied by the area of this face:

flux outward across face II = F(x + Δx, y, z, t) · i ΔyΔz = f(x + Δx, y, z, t) ΔyΔz.
FIGURE 12.34
On face I, the unit outer normal is −i, so

flux outward across face I = F(x, y, z, t) · (−i) ΔyΔz = −f(x, y, z, t) ΔyΔz.

The total outward flux across faces I and II is therefore

[f(x + Δx, y, z, t) − f(x, y, z, t)] ΔyΔz.

A similar calculation can be done for the other two pairs of sides. The total flux of fluid out of the box across its faces is

[f(x + Δx, y, z, t) − f(x, y, z, t)] ΔyΔz + [g(x, y + Δy, z, t) − g(x, y, z, t)] ΔxΔz + [h(x, y, z + Δz, t) − h(x, y, z, t)] ΔxΔy.

The flux per unit volume is obtained by dividing this flux by the volume ΔxΔyΔz of the box:

flux per unit volume out of the box = [f(x + Δx, y, z, t) − f(x, y, z, t)]/Δx + [g(x, y + Δy, z, t) − g(x, y, z, t)]/Δy + [h(x, y, z + Δz, t) − h(x, y, z, t)]/Δz.

Now take the limit as Δx → 0, Δy → 0, and Δz → 0. The box shrinks to the point (x, y, z), and the flux per unit volume approaches

∂f/∂x + ∂g/∂y + ∂h/∂z,
which is the divergence of F(x, y, z, t) at time t. We may therefore interpret the divergence of F as a measure of the outward flow or expansion of the fluid from this point.

12.5.2 A Physical Interpretation of Curl

Suppose an object rotates with uniform angular speed ω about a line L, as in Figure 12.35. The angular velocity vector Ω has magnitude ω and is directed along L as a right-handed screw would progress if given the same sense of rotation as the object. Put L through the origin as a convenience, and let R = xi + yj + zk for any point (x, y, z) on the rotating object. Let T(x, y, z) be the tangential linear velocity. Then

‖T‖ = ω ‖R‖ sin(θ) = ‖Ω × R‖,
FIGURE 12 .35 Angular velocity as the curl of the linear velocity.
where θ is the angle between R and Ω. Since T and Ω × R have the same direction and magnitude, we conclude that T = Ω × R. Now write Ω = ai + bj + ck to obtain

T = Ω × R = (bz − cy)i + (cx − az)j + (ay − bx)k.

Then

∇ × T = | i         j         k        |
        | ∂/∂x      ∂/∂y      ∂/∂z     |
        | bz − cy   cx − az   ay − bx  |

= 2ai + 2bj + 2ck = 2Ω.

Therefore

Ω = (1/2) ∇ × T.

The angular velocity of a uniformly rotating body is a constant times the curl of the linear velocity. Because of this interpretation, curl was once written rot (for rotation), particularly in British treatments of mechanics. This is also the motivation for the term irrotational for a vector field whose curl is zero. Other interpretations of divergence and curl follow from vector integral theorems we will see in the next chapter.
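This computation can be repeated symbolically. The sketch below (not from the text) builds T = Ω × R with sympy and confirms that its curl is 2Ω.

```python
# Sketch only: the curl of the linear velocity is twice the angular velocity.
from sympy import symbols, Matrix

x, y, z, a, b, c = symbols('x y z a b c')

Omega = Matrix([a, b, c])      # constant angular velocity vector
R = Matrix([x, y, z])          # position vector of a point on the object
T = Omega.cross(R)             # tangential linear velocity T = Omega x R

def curl(F):
    f, g, h = F
    return Matrix([h.diff(y) - g.diff(z),
                   f.diff(z) - h.diff(x),
                   g.diff(x) - f.diff(y)])

print(T)        # components (bz - cy, cx - az, ay - bx)
print(curl(T))  # equals 2*Omega
```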
In each of Problems 1 through 6, compute ∇ · F and ∇ × F and verify explicitly that ∇ · (∇ × F) = 0.

1. F = xi + yj + 2zk
2. F = sinh(xyz)j
3. F = 2xyi + xe^y j + 2zk
4. F = sinh(x)i + cosh(xyz)j − (x + y + z)k
5. F = x²i + y²j + z²k
6. F = sin(x − z)i + 2yj + (z − y²)k

In each of Problems 7 through 12, compute ∇φ and verify explicitly that ∇ × (∇φ) = O.

7. φ(x, y, z) = x − y + 2z²
8. φ(x, y, z) = 18xyz + e^x
9. φ(x, y, z) = −2x³yz²
10. φ(x, y, z) = sin(xz)
11. φ(x, y, z) = x cos(x + y + z)
12. φ(x, y, z) = e^(x+y+z)

13. Let φ(x, y, z) be a scalar field and F a vector field. Derive expressions for ∇ · (φF) and ∇ × (φF) in terms of operations applied to φ(x, y, z) and F.

14. Let F = fi + gj + hk be a vector field. Define

F · ∇ = f ∂/∂x + g ∂/∂y + h ∂/∂z.

Let G be a vector field. Show that

∇(F · G) = (F · ∇)G + (G · ∇)F + F × (∇ × G) + G × (∇ × F).

15. Let F and G be vector fields. Prove that

∇ · (F × G) = G · (∇ × F) − F · (∇ × G).

16. Let φ(x, y, z) and ψ(x, y, z) be scalar fields. Prove that ∇ · (∇φ × ∇ψ) = 0.
CHAPTER 13

LINE INTEGRALS, GREEN'S THEOREM, INDEPENDENCE OF PATH AND POTENTIAL THEORY IN THE PLANE, AND SURFACE INTEGRALS
Vector Integral Calculus
This chapter is devoted to integrals of vector fields over curves and surfaces, and relationships between such integrals. These have important uses in solving partial differential equations and in constructing models used in the sciences and engineering.
13.1  Line Integrals

We begin with the integral of a vector field over a curve. This requires some background on curves. Suppose a curve C in 3-space is given by parametric equations

x = x(t), y = y(t), z = z(t)   for a ≤ t ≤ b.

We call x(t), y(t), and z(t) the coordinate functions of C. We will think of C not only as a geometric locus of points (x(t), y(t), z(t)), but also as having an orientation or direction, given by the direction this point moves along C as t increases from a to b. Denote this orientation by putting arrows on the graph of the curve (Figure 13.1). Trajectories of a system of differential equations are also oriented curves, with the particle moving along the geometric locus in a direction dictated by the flow of the system. We call (x(a), y(a), z(a)) the initial point of C, and (x(b), y(b), z(b)) the terminal point. A curve is closed if the initial and terminal points are the same.
FIGURE 13.1  Orientation along a curve.

FIGURE 13.2

EXAMPLE 13.1
Let C be given by

x = 2cos(t), y = 2sin(t), z = 4;   0 ≤ t ≤ 2π.

A graph of C is shown in Figure 13.2. This graph is the circle of radius 2 about the origin in the plane z = 4. The arrow on the curve indicates its orientation (the direction of motion of (2cos(t), 2sin(t), 4) around the graph as t varies from 0 to 2π). The initial point is (2, 0, 4), obtained at t = 0, and the terminal point is also (2, 0, 4), obtained at t = 2π. This curve is closed. Contrast C with the curve K given by x = 2cos(t), y = 2sin(t), z = 4; 0 ≤ t ≤ …
DEFINITION 13.1  Line Integral

Suppose a smooth curve C has coordinate functions x = x(t), y = y(t), z = z(t) for a ≤ t ≤ b. Let f(x, y, z), g(x, y, z), and h(x, y, z) be continuous at least on the graph of C. Then the line integral is

∫_C f dx + g dy + h dz = ∫_a^b [f(x(t), y(t), z(t)) x′(t) + g(x(t), y(t), z(t)) y′(t) + h(x(t), y(t), z(t)) z′(t)] dt.   (13.1)

We can write this line integral more compactly as

∫_C f dx + g dy + h dz.

To evaluate ∫_C f dx + g dy + h dz, substitute the coordinate functions x = x(t), y = y(t), and z = z(t) into f(x, y, z), g(x, y, z), and h(x, y, z), obtaining functions of t. Further, substitute

dx = (dx/dt) dt,  dy = (dy/dt) dt,  and  dz = (dz/dt) dt.

This results in the Riemann integral on the right side of equation (13.1) of a function of t over the range of values of this parameter.
EXAMPLE 13.2

Evaluate the line integral ∫_C x dx − yz dy + e^z dz if C is given by

x = t³, y = −t, z = t²;   1 ≤ t ≤ 2.

Here f(x, y, z) = x, g(x, y, z) = −yz, and h(x, y, z) = e^z, and, on C,

dx = 3t² dt,  dy = −dt,  and  dz = 2t dt.

Then

∫_C x dx − yz dy + e^z dz = ∫_1^2 [t³(3t²) − (−t)(t²)(−1) + e^(t²)(2t)] dt = ∫_1^2 [3t⁵ − t³ + 2t e^(t²)] dt = 111/4 + e⁴ − e. ■
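The result can be confirmed with a computer algebra system. The sketch below (not from the text) carries out the same substitution in sympy.

```python
# Sketch only: Example 13.2 redone with sympy.
from sympy import symbols, integrate, exp, Rational, simplify

t = symbols('t')
x, y, z = t**3, -t, t**2                           # C, for 1 <= t <= 2
# f dx/dt + g dy/dt + h dz/dt with f = x, g = -y*z, h = exp(z)
integrand = x*x.diff(t) + (-y*z)*y.diff(t) + exp(z)*z.diff(t)
result = integrate(integrand, (t, 1, 2))
print(result)   # 111/4 + e**4 - e
```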
EXAMPLE 13.3

Evaluate ∫_C xyz dx − cos(yz) dy + xz dz over the straight line segment from (1, 1, 1) to (−2, 1, 3). Here we are left to find the coordinate functions of the curve. Parametric equations of the line through these points are

x(t) = 1 − 3t,  y(t) = 1,  z(t) = 1 + 2t.

We must let t vary from 0 to 1 for the initial point to be (1, 1, 1) and the terminal point to be (−2, 1, 3). Now

∫_C xyz dx − cos(yz) dy + xz dz = ∫_0^1 [(1 − 3t)(1)(1 + 2t)(−3) − cos(1 + 2t)(0) + (1 − 3t)(1 + 2t)(2)] dt = ∫_0^1 (−1 + t + 6t²) dt = 3/2. ■
If C is a smooth curve in the x, y plane (zero z-component), and f(x, y) and g(x, y) are continuous on C, then we can write a line integral

∫_C f(x, y) dx + g(x, y) dy,

which we refer to as a line integral in the plane. We evaluate this according to Definition 13.1, except now there is no z-component.
EXAMPLE 13.4

Evaluate ∫_C xy dx − y sin(x) dy if C is given by x(t) = t² and y(t) = t for −1 ≤ t ≤ 4. Proceed:

∫_C xy dx − y sin(x) dy = ∫_{−1}^{4} [t²·t(2t) − t sin(t²)(1)] dt = 410 + (1/2) cos(16) − (1/2) cos(1). ■
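A symbolic check of this plane line integral (a sketch using sympy, not part of the original text):

```python
# Sketch only: Example 13.4 redone with sympy.
from sympy import symbols, sin, cos, integrate, simplify

t = symbols('t')
x, y = t**2, t                                      # C, for -1 <= t <= 4
integrand = x*y*x.diff(t) - y*sin(x)*y.diff(t)      # xy dx - y sin(x) dy
result = integrate(integrand, (t, -1, 4))
print(result)   # 410 + cos(16)/2 - cos(1)/2
```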
In these examples we have included all the terms to follow equation (13.1) very literally. However, with some experience there are obvious shortcuts one can take. In Example 13.3, for example, y is constant on the curve, so dy = 0 and the term g(x(t), y(t), z(t)) dy could have been simply omitted.

Thus far we can integrate only over a smooth curve. This requirement can be relaxed as follows. A curve C is piecewise smooth if x′(t), y′(t), and z′(t) are continuous, and not all zero for the same value of t, at all but possibly finitely many values of t. Since x′(t)i + y′(t)j + z′(t)k is the tangent vector to C if this is not the zero vector, this condition means that a piecewise smooth curve has a continuous tangent at all but finitely many points. Such a curve typically has the appearance of Figure 13.3, with smooth pieces C₁, …, Cₙ connected at points where the curve may have no tangent. We will refer to a piecewise smooth curve as a path. In Figure 13.3 the terminal point of C_j is the initial point of C_{j+1} for j = 1, …, n − 1. The segments C₁, …, Cₙ are in order as one moves from the initial to the terminal point of C. This is indicated by the arrows showing orientation along the smooth pieces of C. If f, g, and h are continuous over each C_j, then we define

∫_C f dx + g dy + h dz = ∫_{C₁} f dx + g dy + h dz + ⋯ + ∫_{Cₙ} f dx + g dy + h dz.

This allows us to take line integrals over paths, rather than restricting the integral to smooth curves.
EXAMPLE 13.5

Let C be the curve consisting of the quarter circle x² + y² = 1 in the x, y-plane, from (1, 0) to (0, 1), followed by the horizontal line segment from (0, 1) to (2, 1). Compute ∫_C x²y dx + y² dy.

C is piecewise smooth and consists of two smooth pieces (Figure 13.4). Parametrize these as follows. The quarter circle part is

C₁: x = cos(t), y = sin(t);   0 ≤ t ≤ π/2.

The straight segment is

C₂: x = p, y = 1;   0 ≤ p ≤ 2.

First,

∫_{C₁} x²y dx + y² dy = ∫_0^{π/2} [cos²(t) sin(t)(−sin(t)) + sin²(t) cos(t)] dt = ∫_0^{π/2} (−cos²(t) sin²(t) + sin²(t) cos(t)) dt = −π/16 + 1/3.

Next evaluate the line integral over C₂. On C₂, x = p and y = 1, so dy = 0 and

∫_{C₂} x²y dx + y² dy = ∫_0^2 p² dp = 8/3.

Then

∫_C x²y dx + y² dy = −π/16 + 1/3 + 8/3 = 3 − π/16. ■
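Piecewise paths translate directly into a sum of sympy integrals, one per smooth piece. The sketch below (not from the text) checks the two pieces of this example.

```python
# Sketch only: Example 13.5, integrating over the two smooth pieces separately.
from sympy import symbols, sin, cos, integrate, pi, Rational, simplify

t, p = symbols('t p')

x1, y1 = cos(t), sin(t)        # C1: quarter circle, 0 <= t <= pi/2
I1 = integrate(x1**2*y1*x1.diff(t) + y1**2*y1.diff(t), (t, 0, pi/2))

# C2: x = p, y = 1 for 0 <= p <= 2, so dy = 0 and only x^2*y dx survives
I2 = integrate(p**2, (p, 0, 2))

print(I1, I2, I1 + I2)   # -pi/16 + 1/3, 8/3, 3 - pi/16
```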
It is sometimes useful to think of a line integral in terms of vector operations, particularly in the next section when we deal with potential functions. Consider ∫_C f dx + g dy + h dz. Form a vector field

F(x, y, z) = f(x, y, z)i + g(x, y, z)j + h(x, y, z)k.

If C has coordinate functions x = x(t), y = y(t), z = z(t), we can form the position vector for C:

R(t) = x(t)i + y(t)j + z(t)k.

At any time t, the vector R(t) can be represented by an arrow from the origin to the point (x(t), y(t), z(t)) on C. As t varies, this vector pivots about the origin and adjusts its length to sweep out the curve. If C is smooth, then the tangent vector R′(t) is continuous. Now

dR = dx i + dy j + dz k,

so

F · dR = f(x, y, z) dx + g(x, y, z) dy + h(x, y, z) dz

and

∫_C f(x, y, z) dx + g(x, y, z) dy + h(x, y, z) dz = ∫_C F · dR.

This is just another way of writing a line integral in terms of vector operations. Line integrals arise in many contexts. For example, consider a force F causing a particle to move along a smooth curve C having position function R(t), where t varies from a to b. At any point (x(t), y(t), z(t)) of C, the particle may be thought of as moving in the direction of the tangent to its trajectory, and this tangent is R′(t). Now F(x(t), y(t), z(t)) · R′(t) is the dot product of a force with a direction, and so has the dimensions of work. By integrating this function from a to b, we "sum" the work being done by F over the entire path of motion. This suggests that ∫_C F · dR, or ∫_C f dx + g dy + h dz, can be interpreted as the work done by F in moving the particle over the path.
EXAMPLE 13.6

Calculate the work done by F = i − yj + xyzk in moving a particle from (0, 0, 0) to (1, −1, 1) along the curve x = t, y = −t², z = t for 0 ≤ t ≤ 1. The work is

work = ∫_C F · dR = ∫_C dx − y dy + xyz dz = ∫_0^1 (1 + t²(−2t) − t⁴) dt = ∫_0^1 (1 − 2t³ − t⁴) dt = 3/10.

The correct units (such as foot-pounds) would have to be provided from context. Line integrals have some of the usual properties we associate with integrals.
THEOREM 13.1

Let C be a path having position vector R. Let F and G be vector fields that are continuous at points of C. Then

1. ∫_C (F + G) · dR = ∫_C F · dR + ∫_C G · dR.

2. For any number α, ∫_C (αF) · dR = α ∫_C F · dR.

This theorem illustrates the efficiency of the vector notation for line integrals. We could also write the conclusion (1) as

∫_C (f + f*) dx + (g + g*) dy + (h + h*) dz = ∫_C f dx + g dy + h dz + ∫_C f* dx + g* dy + h* dz.
For Riemann integrals, reversing the limits of integration changes the sign of the integral:

∫_b^a f(x) dx = −∫_a^b f(x) dx.

The analogous property for line integrals involves reversing orientation on C. Given C with an orientation from an initial point P to a terminal point Q, let −C denote the curve obtained from C by reversing the orientation to go from Q to P (Figure 13.5). Here is a more careful definition of this orientation reversal.

FIGURE 13.5  Reversing orientation of a curve.
DEFINITION 13.2

Let C be a smooth curve with coordinate functions x = x(t), y = y(t), z = z(t), for a ≤ t ≤ b. Then −C denotes the curve having coordinate functions

x̃(t) = x(a + b − t),  ỹ(t) = y(a + b − t),  z̃(t) = z(a + b − t)   for a ≤ t ≤ b.

The initial point of −C is

(x̃(a), ỹ(a), z̃(a)) = (x(b), y(b), z(b)),

the terminal point of C. And the terminal point of −C is

(x̃(b), ỹ(b), z̃(b)) = (x(a), y(a), z(a)),

the initial point of C. By the chain rule, −C is piecewise smooth if C is piecewise smooth. We will now show that the line integral of a vector field over −C is the negative of the line integral of the vector field over C.

THEOREM 13.2
Let C be a smooth curve with coordinate functions x = x(t), y = y(t), z = z(t) for a ≤ t ≤ b. Let f, g, and h be continuous on C. Then

∫_{−C} f dx + g dy + h dz = −∫_C f dx + g dy + h dz.

Proof   By Definition 13.1,

∫_{−C} f dx + g dy + h dz = ∫_a^b [f(x̃(t), ỹ(t), z̃(t)) x̃′(t) + g(x̃(t), ỹ(t), z̃(t)) ỹ′(t) + h(x̃(t), ỹ(t), z̃(t)) z̃′(t)] dt.

Change variables in this integral by putting s = a + b − t. When t = a, s = b, and when t = b, s = a. Further,

x̃′(t) = (d/dt) x(a + b − t) = −x′(a + b − t) = −x′(s),

and similarly for ỹ′(t) and z̃′(t). Since dt = −ds, the integral becomes

∫_b^a [f(x(s), y(s), z(s)) x′(s) + g(x(s), y(s), z(s)) y′(s) + h(x(s), y(s), z(s)) z′(s)] ds = −∫_a^b [f(x(s), y(s), z(s)) x′(s) + g(x(s), y(s), z(s)) y′(s) + h(x(s), y(s), z(s)) z′(s)] ds = −∫_C f dx + g dy + h dz. ■

In view of this theorem, the easiest way to evaluate ∫_{−C} f dx + g dy + h dz is usually to take the negative of ∫_C f dx + g dy + h dz. We need not actually write the coordinate functions of −C, as was done in the proof.
EXAMPLE 13.7

A force F(x, y, z) = x²i − zyj + x cos(z)k moves a particle along the path C given by x = t², y = t, z = πt for 0 ≤ t ≤ 3. The initial point is P: (0, 0, 0) and the terminal point of C is Q: (9, 3, 3π). Suppose we want the work done in moving the particle along this path from Q to P. Since we want to go from the terminal to the initial point of C, the work done is ∫_{−C} F · dR. However, we do not need to formally define −C in terms of new coordinate functions. We can simply calculate ∫_C F · dR and take the negative of this. Calculate

∫_C F · dR = ∫_C f dx + g dy + h dz = ∫_0^3 [t⁴(2t) − πt(t)(1) + t² cos(πt)(π)] dt = 243 − 9π − 6/π.

The work done in moving the particle along the path from Q to P is

6/π + 9π − 243. ■
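A symbolic check of this example (a sketch using sympy, not part of the original text); the sign reversal for −C is just a negation.

```python
# Sketch only: Example 13.7; reversing orientation negates the integral.
from sympy import symbols, cos, pi, integrate, simplify

t = symbols('t')
x, y, z = t**2, t, pi*t                             # C, for 0 <= t <= 3
integrand = x**2*x.diff(t) - z*y*y.diff(t) + x*cos(z)*z.diff(t)
work_P_to_Q = integrate(integrand, (t, 0, 3))
work_Q_to_P = -work_P_to_Q                          # integral over -C
print(work_P_to_Q)   # 243 - 9*pi - 6/pi
```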
13.1.1  Line Integral with Respect to Arc Length

Line integrals with respect to arc length occur in some uses of line integrals. Here is the definition of this kind of line integral.

DEFINITION 13.3  Line Integral With Respect to Arc Length

Let C be a smooth curve with coordinate functions x = x(t), y = y(t), z = z(t) for a ≤ t ≤ b. Let φ be a real-valued function that is continuous on the graph of C. Then the integral of φ over C with respect to arc length is

∫_C φ(x, y, z) ds = ∫_a^b φ(x(t), y(t), z(t)) √(x′(t)² + y′(t)² + z′(t)²) dt.

The rationale behind this definition is that the length function along C is

s(t) = ∫_a^t √(x′(ξ)² + y′(ξ)² + z′(ξ)²) dξ.

Then

ds = √(x′(t)² + y′(t)² + z′(t)²) dt,

suggesting the integral in the definition.
EXAMPLE 13.8

Evaluate ∫_C xy ds over the curve given by

x = 4cos(t), y = 4sin(t), z = −3   for 0 ≤ t ≤ π/2.

Compute

∫_C xy ds = ∫_0^{π/2} 4cos(t)[4sin(t)] √(16sin²(t) + 16cos²(t)) dt = ∫_0^{π/2} 64 cos(t) sin(t) dt = 32. ■
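The arc-length factor can be computed and simplified mechanically. The sketch below (not from the text) redoes this example in sympy.

```python
# Sketch only: Example 13.8 as an arc-length integral in sympy.
from sympy import symbols, sin, cos, sqrt, integrate, pi, simplify, Integer

t = symbols('t')
x, y, z = 4*cos(t), 4*sin(t), Integer(-3)           # C, for 0 <= t <= pi/2
# speed: sqrt(x'^2 + y'^2 + z'^2); z is constant, so z'(t) = 0
ds = simplify(sqrt(x.diff(t)**2 + y.diff(t)**2 + z.diff(t)**2))
result = integrate(x*y*ds, (t, 0, pi/2))
print(ds, result)   # 4, 32
```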
Line integrals with respect to arc length occur in calculations of mass, density, and various other quantities for one-dimensional objects. Suppose, for example, we want the mass of a thin wire bent into the shape of a piecewise smooth curve C having coordinate functions

x = x(t), y = y(t), z = z(t)   for a ≤ t ≤ b.

The wire is one-dimensional in the sense that (ideally) it has length but not area or volume. We will derive an expression for the mass of the wire as follows. Let δ(x, y, z) be the density of the wire at any point. Partition [a, b] into subintervals by inserting points

a = t₀ < t₁ < ⋯ < tₙ = b.

Choose these points Δt units apart, so t_j − t_{j−1} = Δt. These partition points of [a, b] determine points

P_j: (x(t_j), y(t_j), z(t_j))

along C, as shown in Figure 13.6. Assuming that the density function is continuous, we can choose Δt sufficiently small that on the piece of wire between P_{j−1} and P_j, the values of the density function are approximated to whatever accuracy we wish by δ(P_j). The length of the segment of wire between P_{j−1} and P_j is Δs = s(t_j) − s(t_{j−1}), which is approximated by

√(x′(t_j)² + y′(t_j)² + z′(t_j)²) Δt.

The mass of the piece of wire between P_{j−1} and P_j is therefore approximately the "nearly" constant value of the density on this piece, times the length of this piece of wire, this product being

δ(x(t_j), y(t_j), z(t_j)) √(x′(t_j)² + y′(t_j)² + z′(t_j)²) Δt.

FIGURE 13.6

The mass of the entire length of wire is approximately the sum of the masses of these pieces:

mass ≈ Σ_{j=1}^{n} δ(x(t_j), y(t_j), z(t_j)) √(x′(t_j)² + y′(t_j)² + z′(t_j)²) Δt,

in which ≈ means "approximately equal." Recognize this as the Riemann sum for a definite integral to obtain, in the limit as Δt → 0,

mass = ∫_a^b δ(x(t), y(t), z(t)) √(x′(t)² + y′(t)² + z′(t)²) dt = ∫_C δ(x, y, z) ds.
A similar argument leads to the coordinates (x̄, ȳ, z̄) of the center of mass of the wire:

x̄ = (1/m) ∫_C x δ(x, y, z) ds,  ȳ = (1/m) ∫_C y δ(x, y, z) ds,  z̄ = (1/m) ∫_C z δ(x, y, z) ds,

in which m is the mass of the wire.
EXAMPLE 13.9

A wire is bent into the shape of the quarter circle C given by

x = 2cos(t), y = 2sin(t), z = 3   for 0 ≤ t ≤ π/2.

The density function is δ(x, y, z) = xy² grams/centimeter. We want the mass and center of mass of the wire. The mass is

m = ∫_C xy² ds = ∫_0^{π/2} 2cos(t)[2sin(t)]² √(4sin²(t) + 4cos²(t)) dt = ∫_0^{π/2} 16 cos(t) sin²(t) dt = 16/3 grams.

Now compute the coordinates of the center of mass. First,

x̄ = (1/m) ∫_C x δ(x, y, z) ds = (3/16) ∫_0^{π/2} [2cos(t)]²[2sin(t)]² √(4sin²(t) + 4cos²(t)) dt = 6 ∫_0^{π/2} cos²(t) sin²(t) dt = 3π/8.

Next,

ȳ = (1/m) ∫_C y δ(x, y, z) ds = (3/16) ∫_0^{π/2} [2cos(t)][2sin(t)]³ √(4sin²(t) + 4cos²(t)) dt = 6 ∫_0^{π/2} cos(t) sin³(t) dt = 3/2.

Finally,

z̄ = (1/m) ∫_C z δ(x, y, z) ds = (3/16) ∫_0^{π/2} 3[2cos(t)][2sin(t)]² √(4sin²(t) + 4cos²(t)) dt = 9 ∫_0^{π/2} sin²(t) cos(t) dt = 3.

The last result could have been anticipated, since the z-component on the curve is constant. The center of mass is (3π/8, 3/2, 3). ■
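All four integrals of this example can be checked at once. The sketch below (not from the text) computes the mass and the center-of-mass coordinates with sympy.

```python
# Sketch only: mass and center of mass of the wire in Example 13.9.
from sympy import (symbols, sin, cos, sqrt, integrate, pi, simplify,
                   Rational, Integer)

t = symbols('t')
x, y, z = 2*cos(t), 2*sin(t), Integer(3)     # quarter circle, 0 <= t <= pi/2
ds = simplify(sqrt(x.diff(t)**2 + y.diff(t)**2 + z.diff(t)**2))  # speed = 2
delta = x*y**2                               # density in grams/centimeter

m = integrate(delta*ds, (t, 0, pi/2))
xbar = integrate(x*delta*ds, (t, 0, pi/2))/m
ybar = integrate(y*delta*ds, (t, 0, pi/2))/m
zbar = integrate(z*delta*ds, (t, 0, pi/2))/m
print(m, xbar, ybar, zbar)   # 16/3, 3*pi/8, 3/2, 3
```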
PROBLEMS
In each of Problems 1 through 15, evaluate the line integral.

1. ∫_C x dx − dy + z dz, with C given by x(t) = t, y(t) = t, z(t) = t³ for 0 ≤ t ≤ 1
2. ∫_C −4x dx + y² dy − yz dz, with C given by x(t) = −t², y(t) = 0, z(t) = −3t for 0 ≤ t ≤ 1
3. ∫_C (x + y) ds, where C is given by x = y = t, z = t² for 0 ≤ t ≤ …
4. ∫_C F · dR, where F = cos(x)i − yj + xzk and R = ti − t²j + k for 0 ≤ t ≤ …
5. ∫_C … from … to (1, 2, −1)
6. ∫_C 4xy ds, with C given by x = y = t, z = 2t for 0 ≤ t ≤ 1
7. ∫_C F · dR, with F = xi + yj − zk and C the circle x² + y² = 4, z = 0, going around once counterclockwise
8. ∫_C yz ds, with C the parabola z = y², x = 1 for 0 ≤ y ≤ 2
9. ∫_C −xyz dz, with C the curve x = 1, y = …, for 4 ≤ z ≤ 9
10. ∫_C xz dy, with C the curve x = y = t, z = −4t² for 1 ≤ t ≤ …
11. ∫_C …
12. ∫_C …
13. ∫_C 8x² dy, with C given by x = eᵗ, y = … for …
14. ∫_C x dy − y dx, C the curve x = y = 2t, z = e⁻ᵗ for …
15. ∫_C …

In each of Problems 16 through 18, …

13.2
Green's Theorem

Green's theorem was developed independently by the self-taught British amateur natural philosopher George Green and the Ukrainian mathematician Michel Ostrogradsky. They were studying potential theory (electric potentials, potential functions), and they obtained an important relationship between double integrals and line integrals in the plane.
FIGURE 13.7  Graph of a curve that is not simple.

FIGURE 13.8  The Jordan curve theorem.
Let C be a piecewise smooth curve in the plane, having coordinate functions x = x(t), y = y(t) for a ≤ t ≤ b. We will be interested in this section in C being a closed curve, so the initial and terminal points coincide. C is positively oriented if (x(t), y(t)) moves around C counterclockwise as t varies from a to b. If (x(t), y(t)) moves clockwise, then we say that C is negatively oriented. For example, let x(t) = cos(t) and y(t) = sin(t) for 0 ≤ t ≤ 2π. Then (x(t), y(t)) moves counterclockwise once around the unit circle as t varies from 0 to 2π, so C is positively oriented. If, however, K has coordinate functions x(t) = −cos(t) and y(t) = sin(t) for 0 ≤ t ≤ 2π, then K is negatively oriented, because now (x(t), y(t)) moves in a clockwise sense. However, C and K have the same graph. A closed curve in the plane is positively oriented if, as you walk around it, the region it encloses is over your left shoulder.

A curve is simple if the same point cannot be on the graph for different values of the parameter. This means that x(t₁) = x(t₂) and y(t₁) = y(t₂) can occur only if t₁ = t₂. If we envision the graph of a curve as a train track, this means that the train does not return to the same location at a later time. Figure 13.7 shows the graph of a curve that is not simple. This requirement would prevent a closed curve from being simple, but we make an exception of the initial and terminal points. If these are the only points obtained for different values of the parameter, then a closed curve is also called simple. For example, the equations x = cos(t) and y = sin(t) for 0 ≤ t ≤ 2π describe a simple closed curve. However, consider M given by x = cos(t), y = sin(t) for 0 ≤ t ≤ 4π. This is a closed curve, beginning and ending at (1, 0), but (x(t), y(t)) traverses the unit circle twice counterclockwise as t varies from 0 to 4π. M is a closed curve, but it is not simple.

It is a subtle theorem of topology, the Jordan curve theorem, that a simple closed curve C in the plane separates the plane into two regions having C as common boundary. One region contains points arbitrarily far from the origin, and is called the exterior of C. The other region is called the interior of C. These regions are displayed for a typical closed curve in Figure 13.8. The interior of C has finite area, while the exterior does not.

Finally, when a line integral is taken around a closed curve, we often use the symbol ∮_C in place of ∫_C. This notation is optional, and is simply a reminder that C is closed. It does not alter in any way the meaning of the integral. We are now ready to state the first fundamental theorem of vector integral calculus. Recall that a path is a piecewise smooth curve (having a continuous tangent at all but finitely many points).
THEOREM 13.3  Green's Theorem

Let C be a simple closed positively oriented path in the plane. Let D consist of all points on C and in its interior. Let f, g, ∂g/∂x, and ∂f/∂y be continuous on D. Then

∮_C f(x, y) dx + g(x, y) dy = ∬_D (∂g/∂x − ∂f/∂y) dA.
The significance of Green's theorem is that it relates an object that deals with a curve, which is one-dimensional, to an object related to a planar region, which is two-dimensional. This will have important implications when we discuss independence of path of line integrals in the next section, and later when we develop partial differential equations and complex analysis. Green's theorem will also lead shortly to Stokes's theorem and Gauss's theorem, which are its generalizations to 3-space. We will prove the theorem under restricted conditions at the end of this section. For now, here are two computational examples.
EXAMPLE 13.10

Sometimes we use Green's theorem as a computational aid to convert one kind of integral into another, possibly simpler, one. As an illustration, suppose we want to compute the work done by the force

F(x, y) = (y − x²eˣ)i + (cos(2y²) − x)j

in moving a particle about the rectangular path C of Figure 13.9. If you try to evaluate ∮_C F · dR as a sum of line integrals over the straight line sides of this rectangle, you will find that the integrations cannot be done in elementary form. However, apply Green's theorem, with D the region bounded by the rectangle. We obtain

work = ∮_C F · dR = ∬_D [∂/∂x (cos(2y²) − x) − ∂/∂y (y − x²eˣ)] dA = ∬_D −2 dA = (−2)(area of D) = −4.
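The key step, that the Green's theorem integrand is the constant −2, can be checked symbolically (a sketch using sympy, not part of the original text):

```python
# Sketch only: the Green's theorem integrand for Example 13.10 is -2.
from sympy import symbols, exp, cos, diff

x, y = symbols('x y')
f = y - x**2*exp(x)
g = cos(2*y**2) - x
print(diff(g, x) - diff(f, y))   # -2, so the double integral is -2*(area of D)
```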
EXAMPLE 13.11

Another typical use of Green's theorem is in deriving very general results. To illustrate, suppose we want to evaluate

∮_C 2x cos(2y) dx − 2x² sin(2y) dy

for every positively oriented simple closed path C in the plane. This may appear to be a daunting task, since there are infinitely many different such paths. However, observe the form of f(x, y) and g(x, y) in the line integral. In particular,

∂g/∂x = ∂f/∂y = −4x sin(2y)

for all x and y. Therefore, Green's theorem gives us

∮_C 2x cos(2y) dx − 2x² sin(2y) dy = ∬_D 0 dA = 0.

In the next section we will see how the vanishing of this line integral for every closed curve allows an important conclusion about line integrals ∫_K 2x cos(2y) dx − 2x² sin(2y) dy when K is not closed.

We will conclude this section with a proof of Green's theorem under special conditions on the region bounded by C. Assume that D can be described in two ways.
FIGURE 13.9

FIGURE 13.10

First, D consists of all points (x, y) with a ≤ x ≤ b and, for each such x, h(x) ≤ y ≤ k(x). Graphs of the curves y = h(x) and y = k(x) form, respectively, the lower and upper parts of the boundary of D (see Figure 13.10). Second, D consists of all points (x, y) with c ≤ y ≤ d and, for each such y, F(y) ≤ x ≤ G(y). In this description, the graphs of x = F(y) and x = G(y) form, respectively, the left and right parts of the boundary of D (see Figure 13.11).
f = fG
x dx =
f( , Y)
, c, f(x
y) dx +
f
cZ
f(x, h(x))dx+
f(x , Y) dx
f
n
f(x, k(x))dx
= f v -[f(x, k(x)) - f(x, h(x))]dx . n
The upper and lower limits of integration in the second line maintain a counterclockwis e orientation on C . Y
FIGURE 13 .11
532
CHAPTER 13 Vector Integral Calculu s
L
Now compute of
ff
D
G k(x)af f dydx ay dA - J*, !(x) ay
f G x y)]h(xj dx = f f(x, k(x)) - f(x, h(x))]dx . =
a
[f(
,
a
Therefore
fc f(x , Y) dx = - f fDDayY Using the other description of D, a similar computation shows tha t
f
dy = c g(x, y)
ag
ff ax dA . D
Upon adding the last two equations we obtain the conclusion of Green's theorem .
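The theorem can also be exercised end to end on a concrete region. The sketch below (not from the text) uses the hypothetical test fields f = x − y and g = xy on the region D bounded below by y = 0, on the right by x = 1, and above by y = x² (all of these choices are assumptions for illustration), and checks that the line integral around the positively oriented boundary equals the double integral.

```python
# Sketch only: checking Green's theorem on D = {0 <= x <= 1, 0 <= y <= x**2}
# with hypothetical test fields f = x - y, g = x*y.
from sympy import symbols, integrate, diff, Rational, S

x, y, t = symbols('x y t')
f = x - y
g = x*y

def seg(xp, yp, t0, t1):
    # line integral of f dx + g dy over one parametrized boundary piece
    fx = f.subs({x: xp, y: yp})
    gy = g.subs({x: xp, y: yp})
    return integrate(fx*xp.diff(t) + gy*yp.diff(t), (t, t0, t1))

# Positively oriented boundary: bottom edge, right edge, then back along y = x^2.
line_total = (seg(t, S(0), 0, 1)       # (t, 0),   t: 0 -> 1
              + seg(S(1), t, 0, 1)     # (1, t),   t: 0 -> 1
              + seg(t, t**2, 1, 0))    # (t, t^2), t: 1 -> 0

double = integrate(diff(g, x) - diff(f, y), (y, 0, x**2), (x, 0, 1))
print(line_total, double)   # both 13/30
```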
1. A particle moves once counterclockwise about the triangle with vertices (0, 0), (4, 0), and (1, 6), under the influence of the force F = xyi + xj. Calculate the work done by this force.

2. A particle moves once counterclockwise about the circle of radius 6 about the origin, under the influence of the force F = (eˣ − y + x cosh(x))i + (y^(3/2) + x)j. Calculate the work done.

3. A particle moves once counterclockwise about the rectangle with vertices (1, 1), (1, 7), (3, 1), and (3, 7), under the influence of the force F = (−cosh(4x⁴) + xy)i + (e^(−y) + x)j. Calculate the work done.

In each of Problems 4 through 11, use Green's theorem to evaluate ∮_C F · dR. All curves are oriented counterclockwise.

4. F = 2yi − xj, C the circle of radius 4 about (1, 3)
5. F = x²i − 2xyj, C the triangle with vertices (1, 1), (4, 1), (2, 6)
6. F = (x + y)i + (x − y)j, C the ellipse x² + 4y² = 1
7. F = 8xy²j, C the circle of radius 4 about the origin
8. F = (x² − y)i + (cos(2y) − e^(3y) + 4x)j, C any square with sides of length 5
9. F = eˣ cos(y)i − eˣ sin(y)j, C any simple closed piecewise smooth curve in the plane
10. F = x²yi + xy²j, C the boundary of the region x² + y² ≤ 4, x ≥ 0, y ≥ 0
11. F = xyi + (xy² − e^(cos(y)))j, C the triangle with vertices (0, 0), (3, 0), (0, 5)

12. Let C be a positively oriented simple closed path with interior D. (a) Show that the area of D equals ∮_C −y dx. (b) Show that the area of D equals ∮_C x dy. (c) Show that the area of D equals (1/2) ∮_C −y dx + x dy.

13. Let u(x, y) be continuous with continuous first and second partial derivatives on a simple closed path C and throughout the interior D of C. Show that

∮_C −(∂u/∂y) dx + (∂u/∂x) dy = ∬_D (∂²u/∂x² + ∂²u/∂y²) dA.
13.2.1  An Extension of Green's Theorem

There is an extension of Green's theorem to include the case that there are finitely many points enclosed by C at which f, g, ∂f/∂y, and/or ∂g/∂x are not continuous, or perhaps are not even defined. The idea is to excise the "bad points," as we will now describe.
Suppose C is a simple closed positively oriented path in the plane enclosing a region D. Suppose f, g, ∂f/∂y, and ∂g/∂x are continuous on C, and throughout D except at points P₁, …, Pₙ. Green's theorem does not apply to this region. But with a little imagination we can still draw an interesting conclusion.

Enclose each P_j with a circle K_j of sufficiently small radius that none of these circles intersects either C or each other (Figure 13.12). Next, cut a channel in D from C to K₁, then from K₁ to K₂, and so on, until finally a channel is cut from K_{n−1} to Kₙ. A typical case is shown in Figure 13.13. Form the closed path C* consisting of C (with a small segment cut out where the channel to K₁ was made), each of the K_j's (with small cuts removed where the channels entered and exited), and the segments forming the connections between C and the successive K_j's. Figure 13.14 shows C*, which encloses the region D*.

By the way C* was formed, the points P₁, …, Pₙ are external to C* (Figure 13.15). Further, f, g, ∂f/∂y, and ∂g/∂x are continuous on C* and throughout D*. We can therefore apply Green's theorem to C* and D* to conclude that

∮_{C*} f(x, y) dx + g(x, y) dy = ∬_{D*} (∂g/∂x − ∂f/∂y) dA.   (13.2)

Now imagine that the channels that were cut become narrower, merging to form segments between C and successive K_j's. Then C* approaches the curve of Figure 13.16, and D* approaches the region D̃ shown in Figure 13.17. D̃ consists of D with the disks bounded by K₁, …, Kₙ cut out. In this limit process, equation (13.2) approaches

∮_C f(x, y) dx + g(x, y) dy + Σ_{j=1}^{n} ∮_{K_j} f(x, y) dx + g(x, y) dy = ∬_{D̃} (∂g/∂x − ∂f/∂y) dA.

FIGURE 13.12

FIGURE 13.13

FIGURE 13.14

FIGURE 13.15
534
CHAPTER 13 Vector Integral Calculu s
FIGURE 13 .16
FIGURE 13 .1 7
On the left side of this equation, line integrals over the internal segments connecting C and the Kos cancel because the integration is carried out twice over each segment, once in eac h direction . Further, the orientation on C is counterclockwise, but the orientation on each K./ is clockwise because of the way the boundaries were traversed (Figure 13 .14) . If we reverse the orientations on these circles the line integrals over them change sign and we can write
$$\oint_{C} f(x,y)\,dx + g(x,y)\,dy = \sum_{j=1}^{n}\oint_{K_j} f(x,y)\,dx + g(x,y)\,dy + \iint_{\tilde D}\left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dA,$$

in which all the integrals are in the positive, counterclockwise sense about the curves C and K₁, …, Kₙ. This is the generalization of Green's theorem that we sought.
EXAMPLE 13.12
Suppose we are interested in

$$\oint_C \frac{-y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy,$$

with C any simple closed positively oriented path in the plane, but not passing through the origin. With

$$f(x,y) = \frac{-y}{x^2+y^2} \quad\text{and}\quad g(x,y) = \frac{x}{x^2+y^2},$$

we have

$$\frac{\partial g}{\partial x} = \frac{y^2-x^2}{(x^2+y^2)^2} = \frac{\partial f}{\partial y}.$$
f, g, ∂f/∂y and ∂g/∂x are continuous at every point of the plane except the origin. This leads us to consider two cases.

Case 1: C does not enclose the origin. Now Green's theorem applies and

$$\oint_C \frac{-y}{x^2+y^2}\,dx + \frac{x}{x^2+y^2}\,dy = \iint_D\left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dA = 0.$$
FIGURE 13.18
Case 2: C encloses the origin. Draw a circle K centered at the origin, with radius sufficiently small that K does not intersect C (Figure 13.18). By the extension of Green's theorem,

$$\oint_C f(x,y)\,dx + g(x,y)\,dy = \oint_K f(x,y)\,dx + g(x,y)\,dy + \iint_{\tilde D}\left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dA = \oint_K f(x,y)\,dx + g(x,y)\,dy,$$

where D̃ is the region between K and C. Both of these line integrals are in the counterclockwise sense about the respective curves. The last line integral can be evaluated explicitly because we know K. Parametrize K by x = r cos(θ), y = r sin(θ) for 0 ≤ θ ≤ 2π. Then
$$\oint_K f(x,y)\,dx + g(x,y)\,dy = \int_0^{2\pi}\left(\frac{-r\sin(\theta)}{r^2}\,[-r\sin(\theta)] + \frac{r\cos(\theta)}{r^2}\,[r\cos(\theta)]\right)d\theta = \int_0^{2\pi}d\theta = 2\pi.$$
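Both cases can be spot-checked numerically. The following sketch (plain Python; the helper name `line_integral` and the two sample circles are our own choices, not the book's) approximates the line integral by a midpoint Riemann sum:

```python
import math

def line_integral(f, g, path, n=20_000):
    # Midpoint Riemann sum for the line integral of f dx + g dy
    # along t -> path(t), 0 <= t <= 1.
    total = 0.0
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        x, y = path(0.5 * (t0 + t1))
        x0, y0 = path(t0)
        x1, y1 = path(t1)
        total += f(x, y) * (x1 - x0) + g(x, y) * (y1 - y0)
    return total

f = lambda x, y: -y / (x**2 + y**2)
g = lambda x, y:  x / (x**2 + y**2)

# Unit circle about the origin (encloses the origin), and the same
# circle shifted to center (3, 0) (does not enclose it).
around   = lambda t: (math.cos(2*math.pi*t), math.sin(2*math.pi*t))
off_axis = lambda t: (3 + math.cos(2*math.pi*t), math.sin(2*math.pi*t))

I1 = line_integral(f, g, around)    # expect 2*pi
I2 = line_integral(f, g, off_axis)  # expect 0
print(I1, I2)
```

The enclosing circle returns approximately 2π and the non-enclosing one approximately 0, matching the two cases above.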
We conclude that

$$\oint_C f(x,y)\,dx + g(x,y)\,dy = \begin{cases} 0 & \text{if } C \text{ does not enclose the origin,}\\ 2\pi & \text{if } C \text{ encloses the origin.}\end{cases}$$

In each of Problems 1 through 5, evaluate ∮_C F·dR over any simple closed path in the x, y plane that does not pass through the origin.

1. F = (x/(x² + y²))i + (y/(x² + y²))j
2. F = (1/(x² + y²))^{3/2}(xi + yj)
3. F = (x/(x² + y²)² − 2y)i + (y/(x² + y²)² + x)j
4. F = (x/(x² + y²) + 3x)i + (y/(x² + y²) − y)j
5. F = (x/√(x² + y²) + 2x)i + (y/√(x² + y²) − 3y²)j
13.3 Independence of Path and Potential Theory in the Plane

In physics, a conservative force field is one that is derivable from a potential. We will use the same terminology.
DEFINITION 13.4 Conservative Vector Field

Let D be a set of points in the plane. A vector field F(x, y) is conservative on D if, for some real-valued φ(x, y), F = ∇φ for all (x, y) in D. In this event, φ is a potential function for F on D.
If φ is a potential function for F, then so is φ + c for any constant c, because ∇(φ + c) = ∇φ. For this reason we often speak of a potential function for F, rather than the potential function.

Recall that, if F(x, y) = f(x, y)i + g(x, y)j and R(t) = x(t)i + y(t)j is a position function for C, then ∫_C F·dR is another way of writing ∫_C f(x, y) dx + g(x, y) dy. We will make frequent use of the notation ∫_C F·dR throughout this section because we want to examine the effect on this integral when F has a potential function, and for this we will use vector notation.

First, the line integral of a conservative vector field can be evaluated directly in terms of a potential function. For suppose C is smooth, with coordinate functions x = x(t), y = y(t) for a ≤ t ≤ b, and suppose that

$$F(x,y) = \nabla\varphi = \frac{\partial \varphi}{\partial x}\,\mathbf{i} + \frac{\partial \varphi}{\partial y}\,\mathbf{j}.$$

Then, by the chain rule,

$$\int_C F\cdot dR = \int_C \frac{\partial \varphi}{\partial x}\,dx + \frac{\partial \varphi}{\partial y}\,dy = \int_a^b\left(\frac{\partial \varphi}{\partial x}\frac{dx}{dt} + \frac{\partial \varphi}{\partial y}\frac{dy}{dt}\right)dt = \int_a^b \frac{d}{dt}\,\varphi(x(t),y(t))\,dt = \varphi(x(b),y(b)) - \varphi(x(a),y(a)).$$

Denoting P₁ = (x(b), y(b)) and P₀ = (x(a), y(a)), this result states that

$$\int_C F\cdot dR = \varphi(P_1) - \varphi(P_0) = \varphi(\text{terminal point of } C) - \varphi(\text{initial point of } C). \tag{13.3}$$
The line integral of a conservative vector field over a path is the difference in values of a potential function at the end points of the path. This is familiar from physics. If a particle moves along a path under the influence of a conservative force field, then the work done is equal to the difference in the potential energy at the ends of the path.

One ramification of equation (13.3) is that the actual path itself does not influence the outcome, only the end points of the path. If we chose a different path K between the same end points, we would obtain the same result for ∫_K F·dR. This suggests the concept of independence of path of a line integral.
DEFINITION 13.5 Independence of Path

∫_C F·dR is independent of path on a set D of points in the plane if, for any points P₀ and P₁ in D, the line integral has the same value over any paths in D having initial point P₀ and terminal point P₁.
The discussion preceding the definition may now be summarized.

THEOREM 13.4

Let φ and its first partial derivatives be continuous for all (x, y) in a set D of points in the plane. Let F = ∇φ. Then ∫_C F·dR is independent of path in D. Further, if C is a simple closed path in D, then ∮_C F·dR = 0.

The independence of path follows from equation (13.3), which states that, when F = ∇φ, the value of ∫_C F·dR depends only on the values of φ(x, y) at the end points of the path, and not where the path goes in between. For the last conclusion of the theorem, if C is a closed path in D, then the initial and terminal points coincide, hence the difference between the values of φ at the terminal and initial points is zero.
EXAMPLE 13.13

Let F(x, y) = 2x cos(2y)i − 2x² sin(2y)j. It is routine to check that φ(x, y) = x² cos(2y) is a potential function for F. Since φ is continuous with continuous partial derivatives over the entire plane, we can let D consist of all points in the plane in the definition of independence of path. For example, if C is any path in the plane from (0, 0) to (1, π/8), then

$$\int_C F\cdot dR = \varphi(1,\pi/8) - \varphi(0,0) = \cos(\pi/4) = \frac{\sqrt{2}}{2}.$$

Further, if K is any simple closed path in D, then ∮_K F·dR = 0.

It is clearly to our advantage to know whether a vector field is conservative, and, if it is, to be able to produce a potential function. Let F(x, y) = f(x, y)i + g(x, y)j. F is conservative exactly when, for some φ,

$$F = \nabla\varphi = \frac{\partial \varphi}{\partial x}\,\mathbf{i} + \frac{\partial \varphi}{\partial y}\,\mathbf{j},$$

and this requires that

$$\frac{\partial \varphi}{\partial x} = f(x,y) \quad\text{and}\quad \frac{\partial \varphi}{\partial y} = g(x,y).$$

To attempt to find such a φ, begin with either of these equations and integrate with respect to the variable of the derivative, keeping the other variable fixed. The constant of integration is then actually a function of the other (fixed) variable. Finally, use the second equation to attempt to find this function.
EXAMPLE 13.14

Consider the vector field F(x, y) = 2x cos(2y)i − [2x² sin(2y) + 4y²]j. We want a real-valued function φ such that

$$\frac{\partial \varphi}{\partial x} = 2x\cos(2y) \quad\text{and}\quad \frac{\partial \varphi}{\partial y} = -2x^2\sin(2y) - 4y^2.$$

Choose one of these equations. If we pick the first, then integrate with respect to x, holding y fixed:

$$\varphi(x,y) = \int 2x\cos(2y)\,dx = x^2\cos(2y) + g(y).$$

The "constant" of integration is allowed to involve y because we are reversing a partial derivative, and for any function g of y,

$$\frac{\partial}{\partial x}\left[x^2\cos(2y) + g(y)\right] = 2x\cos(2y),$$

as we require. We now have φ(x, y) to within some function g(y). From the second equation we need

$$\frac{\partial \varphi}{\partial y} = -2x^2\sin(2y) - 4y^2 = \frac{\partial}{\partial y}\left[x^2\cos(2y) + g(y)\right].$$

Then

$$-2x^2\sin(2y) - 4y^2 = -2x^2\sin(2y) + g'(y),$$

so g′(y) = −4y². Choose g(y) = −4y³/3 to obtain the potential function

$$\varphi(x,y) = x^2\cos(2y) - \frac{4}{3}y^3.$$
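A quick numerical sanity check of this potential (a plain-Python sketch of ours, comparing a central-difference gradient of φ against F at a few arbitrary sample points):

```python
import math

def F(x, y):
    # The vector field of Example 13.14.
    return (2*x*math.cos(2*y), -2*x**2*math.sin(2*y) - 4*y**2)

def phi(x, y):
    # The candidate potential found above.
    return x**2*math.cos(2*y) - (4.0/3.0)*y**3

def grad(fn, x, y, h=1e-6):
    # Central-difference approximation of the gradient of fn at (x, y).
    return ((fn(x + h, y) - fn(x - h, y)) / (2*h),
            (fn(x, y + h) - fn(x, y - h)) / (2*h))

max_err = 0.0
for x, y in [(0.3, -1.2), (1.0, 0.5), (-2.0, 2.0)]:
    gx, gy = grad(phi, x, y)
    fx, fy = F(x, y)
    max_err = max(max_err, abs(gx - fx), abs(gy - fy))
print(max_err)  # tiny: only finite-difference error remains
```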
It is easy to check that F = ∇φ.

Is every vector field in the plane conservative? As the following example shows, the answer is no.
EXAMPLE 13.15

Let F(x, y) = (2xy² + y)i + (2x²y + e^{xy})j. If this vector field were conservative, there would be a potential φ such that

$$\frac{\partial \varphi}{\partial x} = 2xy^2 + y \quad\text{and}\quad \frac{\partial \varphi}{\partial y} = 2x^2y + e^{xy}.$$

Integrate the first equation with respect to x to get

$$\varphi(x,y) = \int (2xy^2 + y)\,dx = x^2y^2 + xy + f(y).$$

From the second equation,

$$\frac{\partial \varphi}{\partial y} = 2x^2y + e^{xy} = \frac{\partial}{\partial y}\left(x^2y^2 + xy + f(y)\right) = 2x^2y + x + f'(y).$$

But this would imply that

$$f'(y) = e^{xy} - x,$$

and we cannot find a function of y alone satisfying this equation. Therefore F has no potential.

Because not every vector field is conservative, we need some test to determine whether or not a given vector field is conservative. The following theorem provides such a test.

THEOREM 13.5 Test for a Conservative Field
Let f and g be continuous in a region D bounded by a rectangle having its sides parallel to the axes. Then F(x, y) = f(x, y)i + g(x, y)j is conservative on D if and only if, for all (x, y) in D,

$$\frac{\partial g}{\partial x} = \frac{\partial f}{\partial y}. \tag{13.4}$$

Sometimes the conditions of the theorem hold throughout the plane, and in this event the vector field is conservative for all (x, y) when equation (13.4) is satisfied.
EXAMPLE 13.16

Consider again F(x, y) = (2xy² + y)i + (2x²y + e^{xy})j from Example 13.15. Compute

$$\frac{\partial f}{\partial y} = 4xy + 1 \quad\text{and}\quad \frac{\partial g}{\partial x} = 4xy + ye^{xy},$$

and these are unequal on any rectangular region of the plane. This vector field is not conservative. We showed in Example 13.15 that no potential function can exist for this field.
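The test of Theorem 13.5 is easy to automate with finite differences. A minimal sketch (plain Python; the helper name `curl_gap` is our own), applied to the conservative field of Example 13.14 and the non-conservative field above:

```python
import math

def curl_gap(f, g, x, y, h=1e-6):
    # dg/dx - df/dy at (x, y), by central differences; identically zero
    # (up to rounding) on a rectangle means the field passes the test.
    dg_dx = (g(x + h, y) - g(x - h, y)) / (2*h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2*h)
    return dg_dx - df_dy

# Conservative field of Example 13.14.
f1 = lambda x, y: 2*x*math.cos(2*y)
g1 = lambda x, y: -2*x**2*math.sin(2*y) - 4*y**2

# Non-conservative field of Examples 13.15 and 13.16.
f2 = lambda x, y: 2*x*y**2 + y
g2 = lambda x, y: 2*x**2*y + math.exp(x*y)

gap1 = max(abs(curl_gap(f1, g1, x, y)) for x, y in [(0.5, 0.5), (-1.0, 2.0)])
gap2 = abs(curl_gap(f2, g2, 1.0, 1.0))  # analytically y e^{xy} - 1 = e - 1 here
print(gap1, gap2)
```

The first gap is numerically zero; the second equals e − 1 at (1, 1), exactly the difference (4xy + ye^{xy}) − (4xy + 1) computed above.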
13.3.1 A More Critical Look at Theorem 13.5

The condition (13.4) derived in Theorem 13.5 can be written

$$\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} = 0.$$

But the combination ∂g/∂x − ∂f/∂y also occurs in Green's theorem. This must be more than coincidence. In this section we will explore connections between independence of path, Green's theorem, condition (13.4), and existence of a potential function.

The following example is instructive. Let D consist of all points in the plane except the origin. Thus, D is the plane with the origin punched out. Let

$$F(x,y) = \frac{-y}{x^2+y^2}\,\mathbf{i} + \frac{x}{x^2+y^2}\,\mathbf{j} = f(x,y)\,\mathbf{i} + g(x,y)\,\mathbf{j}.$$
Then F is defined on D, and f and g are continuous with continuous partial derivatives on D. Further, we saw in Example 13.12 that

$$\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x} \quad\text{for } (x,y) \text{ in } D.$$
Now evaluate ∫_C f(x, y) dx + g(x, y) dy over two paths from (1, 0) to (−1, 0).

First, let C be the top half of the unit circle, given by x = cos(θ), y = sin(θ) for 0 ≤ θ ≤ π. Then

$$\int_C f(x,y)\,dx + g(x,y)\,dy = \int_0^{\pi}\left[(-\sin(\theta))(-\sin(\theta)) + \cos(\theta)\cos(\theta)\right]d\theta = \int_0^{\pi}d\theta = \pi.$$

Next, let K be the path from (1, 0) to (−1, 0) along the bottom half of the unit circle, given by x = cos(θ), y = −sin(θ) for 0 ≤ θ ≤ π. Then

$$\int_K f(x,y)\,dx + g(x,y)\,dy = \int_0^{\pi}\left[\sin(\theta)(-\sin(\theta)) + \cos(\theta)(-\cos(\theta))\right]d\theta = -\int_0^{\pi}d\theta = -\pi.$$
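The two values can be reproduced numerically. This sketch (plain Python; the midpoint-sum helper is our own) integrates f dx + g dy along each half circle:

```python
import math

def line_integral(f, g, path, n=20_000):
    # Midpoint Riemann sum for the integral of f dx + g dy
    # along t -> path(t), 0 <= t <= 1.
    total = 0.0
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        xm, ym = path(0.5 * (t0 + t1))
        x0, y0 = path(t0)
        x1, y1 = path(t1)
        total += f(xm, ym) * (x1 - x0) + g(xm, ym) * (y1 - y0)
    return total

f = lambda x, y: -y / (x**2 + y**2)
g = lambda x, y:  x / (x**2 + y**2)

top    = lambda t: (math.cos(math.pi * t),  math.sin(math.pi * t))   # the path C
bottom = lambda t: (math.cos(math.pi * t), -math.sin(math.pi * t))   # the path K

I_C = line_integral(f, g, top)     # expect  pi
I_K = line_integral(f, g, bottom)  # expect -pi
print(I_C, I_K)
```

Both paths share the same end points, yet the integrals differ, which is exactly the failure of path independence discussed next.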
This means that ∫_C f(x, y) dx + g(x, y) dy is not independent of path in D. The path chosen between two given points makes a difference. This also means that F is not conservative over D. There is no potential function for F (by Theorem 13.4, if there were a potential function, then the line integral would have to be independent of path).

This example suggests that there is something about the conditions specified on the set D in Theorem 13.5 that makes a difference. The rectangular set in the theorem, where condition (13.4) is necessary and sufficient for existence of a potential, must have some property or properties that the set in this example lacks. We will explore this line of thought.

Let D be a set of points in the plane. We call D a domain if it satisfies two conditions:

1. If P₀ is any point of D, there is a circle about P₀ such that every point enclosed by this circle is also in D.
2. Between any two points of D, there is a path lying entirely in D.

For example, the right quarter plane S consisting of points (x, y) with x ≥ 0 and y ≥ 0 enjoys property (2), but not (1). There is no circle that can be drawn about a point (x, 0) with x > 0 that contains only points with nonnegative coordinates (Figure 13.19). Similarly, any circle drawn about a point (0, y) in S must contain points outside of S. The shaded set M of points in Figure 13.20 does not satisfy condition (2): any path C connecting the indicated points P and Q must at some time go outside of M. Figure 13.21 shows the set A of points between the circles of radius 1 and 3 about the origin. Thus, (x, y) is in A exactly when

1 < x² + y² < 9.
FIGURE 13.19 Right quarter plane x ≥ 0, y ≥ 0.
FIGURE 13.20
FIGURE 13.21 The region between two concentric circles.
This set satisfies conditions (1) and (2), and so is a domain. The boundary circles are drawn as dashed curves to emphasize that points on these curves are not in A. The conditions defining a domain are enough for the first theorem.
THEOREM 13.6

Let F be a vector field that is continuous on a domain D. Then ∫_C F·dR is independent of path on D if and only if F is conservative. ■
Proof  We know that, if F is conservative, then ∫_C F·dR is independent of path on D. It is the converse that uses the condition that D is a domain.

Conversely, suppose ∫_C F·dR is independent of path on D. We will produce a potential function. Choose any point P₀ : (x₀, y₀) in D. If P : (x, y) is any point of D, define

$$\varphi(x,y) = \int_C F\cdot dR,$$
in which C is any path in D from P₀ to P. There is such a path because D is a domain. Further, because this line integral is independent of path, φ(x, y) depends only on (x, y) and P₀, and not on the curve chosen between them. Thus φ is a function. Because F is continuous on D, φ is also continuous on D. Now let F(x, y) = f(x, y)i + g(x, y)j and select any point (a, b) in D. We will show that
$$\frac{\partial \varphi}{\partial x}(a,b) = f(a,b) \quad\text{and}\quad \frac{\partial \varphi}{\partial y}(a,b) = g(a,b).$$
For the first of these equations, recall that

$$\frac{\partial \varphi}{\partial x}(a,b) = \lim_{\Delta x\to 0}\frac{\varphi(a+\Delta x,\,b) - \varphi(a,b)}{\Delta x}.$$

Because D is a domain, there is a circle about (a, b) enclosing only points of D. Let r be the radius of such a circle and restrict Δx so that 0 < Δx < r. Let C₁ be any path in D from P₀ to (a, b) and C₂ the horizontal line segment from (a, b) to (a + Δx, b), as shown in Figure 13.22. Let C be the path from P₀ to (a + Δx, b) consisting of C₁ and then C₂. Now

$$\varphi(a+\Delta x,\,b) - \varphi(a,b) = \int_C F\cdot dR - \int_{C_1} F\cdot dR = \int_{C_2} F\cdot dR.$$
FIGURE 13.22
FIGURE 13.23
Parametrize C₂ by x = a + tΔx, y = b for 0 ≤ t ≤ 1. Then

$$\varphi(a+\Delta x,\,b) - \varphi(a,b) = \int_{C_2} F\cdot dR = \int_{C_2} f(x,y)\,dx + g(x,y)\,dy = \int_0^1 f(a+t\Delta x,\,b)(\Delta x)\,dt.$$

Then

$$\frac{\varphi(a+\Delta x,\,b) - \varphi(a,b)}{\Delta x} = \int_0^1 f(a+t\Delta x,\,b)\,dt.$$

By the mean value theorem for integrals, there is a number ε between 0 and 1, inclusive, such that
$$\int_0^1 f(a+t\Delta x,\,b)\,dt = f(a+\varepsilon\Delta x,\,b).$$

Therefore

$$\frac{\varphi(a+\Delta x,\,b) - \varphi(a,b)}{\Delta x} = f(a+\varepsilon\Delta x,\,b).$$

As Δx → 0+, f(a + εΔx, b) → f(a, b) by continuity of f, proving that

$$\lim_{\Delta x\to 0+}\frac{\varphi(a+\Delta x,\,b) - \varphi(a,b)}{\Delta x} = f(a,b).$$

By a similar argument, using the path of Figure 13.23, we can show that

$$\lim_{\Delta x\to 0-}\frac{\varphi(a+\Delta x,\,b) - \varphi(a,b)}{\Delta x} = f(a,b).$$

Therefore

$$\frac{\partial \varphi}{\partial x}(a,b) = f(a,b).$$

To prove that (∂φ/∂y)(a, b) = g(a, b), the reasoning is similar except now use the paths of Figures 13.24 and 13.25. This completes the proof of the theorem. ■
FIGURE 13.24
FIGURE 13.25
FIGURE 13.26 The set of points between two concentric circles is not simply connected.
We have seen that the condition (13.4) is necessary and sufficient for F to be conservative within a rectangular region (which is a domain). Although this result is strong enough for many purposes, it is possible to extend it to regions that are not rectangular in shape if another condition is added to the region.

A domain D is called simply connected if every simple closed path in D encloses only points of D. A simply connected domain is one that has no "holes" in it, because a simple closed path about the hole would enclose points not in the domain. If D is the plane with the origin removed, then D is not simply connected, because the unit circle about the origin encloses a point not in D. We have seen in Example 13.12 that condition (13.4) may be satisfied in this domain by a vector field having no potential function on D. Similarly, the region between two concentric circles is not simply connected, because a closed curve in this region may wrap around the inner circle, hence enclose points not in the region (Figure 13.26).

We will now show that simple connectivity is just what is needed to ensure that condition (13.4) is equivalent to existence of a potential function. The key is that simple connectivity allows the use of Green's theorem.

THEOREM 13.7
Let F(x, y) = f(x, y)i + g(x, y)j be a vector field and D a simply connected domain. Suppose f, g, ∂f/∂y and ∂g/∂x are continuous on D. Then F is conservative on D if and only if

$$\frac{\partial f}{\partial y} = \frac{\partial g}{\partial x}$$

for all (x, y) in D.

Proof  Suppose first that F is conservative, with potential φ. Then

$$f(x,y) = \frac{\partial \varphi}{\partial x} \quad\text{and}\quad g(x,y) = \frac{\partial \varphi}{\partial y}.$$

Then

$$\frac{\partial f}{\partial y} = \frac{\partial^2 \varphi}{\partial y\,\partial x} = \frac{\partial^2 \varphi}{\partial x\,\partial y} = \frac{\partial g}{\partial x}$$

for (x, y) in D.

For the converse, suppose that condition (13.4) holds throughout D. We will prove that ∫_C F·dR is independent of path in D. By the previous theorem, this will imply that F is conservative. To this end, let P₀ and P₁ be any points of D, and let C and K be paths in D from P₀ to P₁. Suppose first that these paths have only their end points in common (Figure 13.27(a)). We can then form a positively oriented simple closed path J from P₀ to P₀ by moving
FIGURE 13.27(a) Paths C and K from P₀ to P₁.
FIGURE 13.27(b) A closed path formed from C and −K.
from P₀ to P₁ along C, then back to P₀ along −K (Figure 13.27(b)). Let J enclose a region D*. Since D is simply connected, every point in D* is in D, over which f, g, ∂f/∂y and ∂g/∂x are continuous. Apply Green's theorem to write

$$\oint_J F\cdot dR = \iint_{D^*}\left(\frac{\partial g}{\partial x}-\frac{\partial f}{\partial y}\right)dA = 0.$$
But then

$$\oint_J F\cdot dR = \int_C F\cdot dR + \int_{-K} F\cdot dR = \int_C F\cdot dR - \int_K F\cdot dR = 0,$$

so

$$\int_C F\cdot dR = \int_K F\cdot dR.$$
If C and K intersect each other between P₀ and P₁, as in Figure 13.28, then this conclusion can still be drawn by considering closed paths between successive points of intersection. We will not pursue this technical argument. Once we have independence of path of ∫_C F·dR on D, then F is conservative, and the theorem is proved. ■

In sum:

conservative vector field ⟹ independence of path of ∫_C F·dR on a set D,

independence of path on a domain ⟺ conservative vector field,

and

conservative on a simply connected domain ⟺ ∂g/∂x = ∂f/∂y.

FIGURE 13.28
In each of Problems 1 through 8, determine whether F is conservative in the given region D. If it is, find a potential function. If D is not defined, it is understood to be the entire plane.

1. F = y³i + (3xy² − 4)j
2. F = (6y + ye^{xy})i + (6x + xe^{xy})j
3. F = 16xi + (2 − y²)j
4. F = 2xy cos(x²)i + sin(x²)j
5. F = (2x/(x² + y²))i + (2y/(x² + y²))j, D the plane with the origin removed
6. F = sinh(x + y)(i + j)
7. F = 2 cos(2x)e^y i + [e^y sin(2x) − y]j
8. F = (3x²y − sin(x) + 1)i + (x³ + e^y)j

In each of Problems 9 through 16, evaluate ∫_C F·dR for C any path from the first given point to the second.

9. F = 3x²(y² − 4y)i + (2x³y − 4x³)j; (−1, 1), (2, 3)
10. F = e^x cos(y)i − e^x sin(y)j; (0, 0), (2, π/4)
11. F = 2xyi + (x² − 1/y)j; (1, 3), (2, 2) (the path cannot cross the x axis)
12. F = i + (6y + sin(y))j; (0, 0), (1, 3)
13. F = (3x²y² − 6y³)i + (2x³y − 18xy²)j; (0, 0), (1, 1)
14. F = (y/x)i + ln(x)j; (1, 1), (2, 2) (the path must lie in the right half-plane x > 0)
15. F = (−8e^y + e^x)i − 8xe^y j; (−1, −1), (3, 1)
16. F = (4xy + 1/x²)i + 2x²j; (1, 2), (3, 3) (the path must lie in the half-plane x > 0)
17. Prove the law of conservation of energy: the sum of the kinetic and potential energies of an object acted on by a conservative force field is a constant. Hint: The kinetic energy is (m/2)‖R′(t)‖², where m is the mass and R(t) the position vector of the particle. The potential energy is −φ(x, y), where φ is a potential function for the force. Show that the derivative of the sum of the kinetic and potential energies is zero along any path of motion.
13.4 Surfaces in 3-Space and Surface Integrals

Analogous to the integral of a function over a curve, we would like to develop an integral of a function over a surface. This will require some background on surfaces.

A curve is often given by specifying coordinate functions, each of which is a function of a single variable or parameter. For this reason we think of a curve as a one-dimensional object, although the graph may be in 2-space or 3-space. A surface may be defined by giving coordinate functions which depend on two independent variables, say

x = x(u, v), y = y(u, v), z = z(u, v),

with (u, v) varying over some set in the u, v plane. The locus of such points may form a two-dimensional object in the plane or in R³.
EXAMPLE 13.17

Suppose a surface is given by the coordinate functions

x = au cos(v), y = bu sin(v), z = u,

with u and v any real numbers and a and b nonzero constants. In this case it is easy to write z in terms of x and y, a tactic that is sometimes useful in visualizing the surface. Notice that

$$\left(\frac{x}{au}\right)^2 + \left(\frac{y}{bu}\right)^2 = \cos^2(v) + \sin^2(v) = 1,$$

so

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = u^2 = z^2.$$

In the plane y = 0 (the x, z plane), z = ±x/a, which are straight lines of slope ±1/a through the origin. In the plane x = 0 (the y, z plane), z = ±y/b, and these are straight lines of slope ±1/b through the origin. The surface intersects a plane z = c = constant ≠ 0 in an ellipse

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = c^2, \quad z = c.$$

This surface is called an elliptical cone because it has elliptical cross sections parallel to the x, y plane.
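The identity x²/a² + y²/b² = z² can be spot-checked directly from the parametrization (plain Python; the sample values of a, b, u, v below are arbitrary choices of ours):

```python
import math

a, b = 2.0, 3.0   # arbitrary nonzero constants
resid = 0.0
for u in (-1.5, 0.25, 2.0):
    for v in (0.0, 1.1, 4.0):
        x, y, z = a*u*math.cos(v), b*u*math.sin(v), u
        # x^2/a^2 + y^2/b^2 should equal z^2 for every (u, v)
        resid = max(resid, abs(x**2/a**2 + y**2/b**2 - z**2))
print(resid)  # ≈ 0
```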
EXAMPLE 13.18

Consider the surface having coordinate functions

x = u cos(v), y = u sin(v), z = (1/2)u² sin(2v),

in which u and v can be any real numbers. Now

$$z = \frac{1}{2}u^2\sin(2v) = u^2\sin(v)\cos(v) = [u\cos(v)][u\sin(v)] = xy.$$

This surface intersects any plane z = c = constant ≠ 0 in the hyperbola xy = c, z = c. However, the surface intersects a plane y = ±x in a parabola z = ±x². For this reason this surface is called a hyperbolic paraboloid.

Sometimes x and y are used as parameters, and the surface is defined by giving z as a function of x and y, say z = S(x, y). Now the graph of the surface is the locus of points (x, y, S(x, y)), as (x, y) varies over some set of points in the x, y plane.
EXAMPLE 13.19

Consider z = √(4 − x² − y²) for x² + y² ≤ 4. By squaring both sides of this equation, we can write

x² + y² + z² = 4.

This appears to be the equation of a sphere of radius 2 about the origin. However, in the original formulation, with z given by the radical, we have z ≥ 0, so in fact we have not the sphere, but the hemisphere (upper half of the sphere) of radius 2 about the origin.
EXAMPLE 13.20

The equation z = √(x² + y²) for x² + y² ≤ 8 determines a cone having circular cross sections parallel to the x, y plane. The "top" of the cone is the circle x² + y² = 8 in the plane z = 2√2.
EXAMPLE 13.21

The equation z = x² + y² defines a parabolic bowl, extending to infinity in the positive z-direction because there is no restriction on x or y.

These surfaces are easy to visualize and sketch by hand. If the defining function is more complicated, then we usually depend on a software package to sketch all or part of the surface. Examples are given in Figures 13.29 through 13.33.
FIGURE 13.29 z = cos(x² − y²).
FIGURE 13.30
FIGURE 13.31 z = 4 cos(x² + y²)/(1 + x² + y²).
FIGURE 13.32 z = x² cos(x² − y²).
FIGURE 13.33 z = cos(xy) log(4 + y).
FIGURE 13.34 Tangents to curves on E at P₀ determine the tangent plane to the surface there.
Just as we can write a position vector to a curve, we can write a position vector

R(u, v) = x(u, v)i + y(u, v)j + z(u, v)k

for a surface. For any u and v in the parameter domain, R(u, v) can be thought of as an arrow from the origin to the point (x(u, v), y(u, v), z(u, v)) on the surface.

A surface is simple if R(u₁, v₁) = R(u₂, v₂) can occur only if u₁ = u₂ and v₁ = v₂. A simple surface is one that does not fold over and return to the same point for different values of the parameter pairs.

13.4.1 Normal Vector to a Surface

Let E be a surface with coordinate functions x = x(u, v), y = y(u, v), z = z(u, v). Assume that these functions are continuous with continuous first partial derivatives. Let P₀ : (x(u₀, v₀), y(u₀, v₀), z(u₀, v₀)) be a point on E. If we fix v = v₀, we can define the curve E_{v₀} on the surface, having coordinate functions

x = x(u, v₀), y = y(u, v₀), z = z(u, v₀).

(See Figure 13.34.) This is a curve because its coordinate functions are functions of the single variable u. The tangent vector to E_{v₀} at P₀ is

$$T_{v_0} = \frac{\partial x}{\partial u}(u_0,v_0)\,\mathbf{i} + \frac{\partial y}{\partial u}(u_0,v_0)\,\mathbf{j} + \frac{\partial z}{\partial u}(u_0,v_0)\,\mathbf{k}.$$

Similarly, if we fix u = u₀ and use v as parameter, we obtain the curve E_{u₀} on the surface (also shown in Figure 13.34). This curve has coordinate functions

x = x(u₀, v), y = y(u₀, v), z = z(u₀, v).
The tangent vector to E_{u₀} at P₀ is

$$T_{u_0} = \frac{\partial x}{\partial v}(u_0,v_0)\,\mathbf{i} + \frac{\partial y}{\partial v}(u_0,v_0)\,\mathbf{j} + \frac{\partial z}{\partial v}(u_0,v_0)\,\mathbf{k}.$$

Assuming that neither of these tangent vectors is the zero vector, they both lie in the tangent plane to E at P₀. Their cross product is therefore normal (orthogonal) to this tangent plane, and is the vector we define to be the normal to E at P₀:

$$N(P_0) = T_{v_0}\times T_{u_0} = \begin{vmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\ \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u}\\ \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v}\end{vmatrix} = \left(\frac{\partial y}{\partial u}\frac{\partial z}{\partial v}-\frac{\partial z}{\partial u}\frac{\partial y}{\partial v}\right)\mathbf{i} + \left(\frac{\partial z}{\partial u}\frac{\partial x}{\partial v}-\frac{\partial x}{\partial u}\frac{\partial z}{\partial v}\right)\mathbf{j} + \left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v}-\frac{\partial y}{\partial u}\frac{\partial x}{\partial v}\right)\mathbf{k}, \tag{13.5}$$

in which all the partial derivatives are evaluated at (u₀, v₀).

An expression that is easier to remember is obtained by introducing Jacobian notation. Define the Jacobian determinant (named for the German mathematician Carl Jacobi) of functions f and g to be the 2 × 2 determinant

$$\frac{\partial(f,g)}{\partial(u,v)} = \begin{vmatrix}\dfrac{\partial f}{\partial u} & \dfrac{\partial f}{\partial v}\\ \dfrac{\partial g}{\partial u} & \dfrac{\partial g}{\partial v}\end{vmatrix} = \frac{\partial f}{\partial u}\frac{\partial g}{\partial v} - \frac{\partial g}{\partial u}\frac{\partial f}{\partial v}.$$

In this notation, the normal vector to E at P₀ is

$$N(P_0) = \frac{\partial(y,z)}{\partial(u,v)}\,\mathbf{i} + \frac{\partial(z,x)}{\partial(u,v)}\,\mathbf{j} + \frac{\partial(x,y)}{\partial(u,v)}\,\mathbf{k},$$

with all the partial derivatives evaluated at (u₀, v₀).

This notation helps in remembering the normal vector because of the cyclic pattern in the Jacobian symbols. Write x, y, z in this order. For the first component of N(P₀), omit the first letter, x, to obtain ∂(y, z)/∂(u, v). For the second component, omit y, but maintain the same cyclic direction, moving left to right through x, y, z. This means we start with z, the next letter after y, then back to x, obtaining ∂(z, x)/∂(u, v). For the third component, omit z, leaving x, y and the Jacobian ∂(x, y)/∂(u, v). Of course, any nonzero real multiple of N(P₀) is also a normal to E at P₀.
EXAMPLE 13.22

Consider again the elliptical cone

x = au cos(v), y = bu sin(v), z = u.

Suppose we want the normal vector at P₀ : (√3a/4, b/4, 1/2), obtained when u = u₀ = 1/2, v = v₀ = π/6. Compute the Jacobians:

$$\left[\frac{\partial(y,z)}{\partial(u,v)}\right]_{(1/2,\pi/6)} = \left[\frac{\partial y}{\partial u}\frac{\partial z}{\partial v} - \frac{\partial z}{\partial u}\frac{\partial y}{\partial v}\right]_{(1/2,\pi/6)} = \left[b\sin(v)(0) - bu\cos(v)\right]_{(1/2,\pi/6)} = -\frac{\sqrt{3}\,b}{4},$$

$$\left[\frac{\partial(z,x)}{\partial(u,v)}\right]_{(1/2,\pi/6)} = \left[\frac{\partial z}{\partial u}\frac{\partial x}{\partial v} - \frac{\partial x}{\partial u}\frac{\partial z}{\partial v}\right]_{(1/2,\pi/6)} = \left[-au\sin(v) - a\cos(v)(0)\right]_{(1/2,\pi/6)} = -\frac{a}{4},$$

and

$$\left[\frac{\partial(x,y)}{\partial(u,v)}\right]_{(1/2,\pi/6)} = \left[\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial y}{\partial u}\frac{\partial x}{\partial v}\right]_{(1/2,\pi/6)} = \left[a\cos(v)\,bu\cos(v) - b\sin(v)(-au\sin(v))\right]_{(1/2,\pi/6)} = \frac{ab}{2}.$$

The normal vector at P₀ is

$$N(P_0) = -\frac{\sqrt{3}\,b}{4}\,\mathbf{i} - \frac{a}{4}\,\mathbf{j} + \frac{ab}{2}\,\mathbf{k}.$$
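The same normal falls out of a direct cross product of the two tangent vectors; a plain-Python sketch (the sample constants a and b are our own choice):

```python
import math

def cross(p, q):
    # Cross product p x q in R^3.
    return (p[1]*q[2] - p[2]*q[1],
            p[2]*q[0] - p[0]*q[2],
            p[0]*q[1] - p[1]*q[0])

a, b = 2.0, 5.0            # arbitrary nonzero constants
u, v = 0.5, math.pi / 6    # the parameter values of Example 13.22

# Tangent vectors of (a u cos v, b u sin v, u) with respect to u and v.
T_u = (a*math.cos(v), b*math.sin(v), 1.0)
T_v = (-a*u*math.sin(v), b*u*math.cos(v), 0.0)

N = cross(T_u, T_v)
expected = (-math.sqrt(3)*b/4, -a/4, a*b/2)
print(N)
print(expected)
```

The two printed triples agree, confirming the Jacobian computation component by component.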
Consider the special case that the surface is given explicitly as z = S(x, y). We may think of u = x and v = y as the parameters for E and write the coordinate functions as

x = x, y = y, z = S(x, y).

Since ∂x/∂x = 1 = ∂y/∂y and ∂x/∂y = ∂y/∂x = 0, we have

$$\frac{\partial(y,z)}{\partial(x,y)} = \frac{\partial y}{\partial x}\frac{\partial z}{\partial y} - \frac{\partial z}{\partial x}\frac{\partial y}{\partial y} = -\frac{\partial S}{\partial x}, \qquad \frac{\partial(z,x)}{\partial(x,y)} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial y} - \frac{\partial x}{\partial x}\frac{\partial z}{\partial y} = -\frac{\partial S}{\partial y},$$

and

$$\frac{\partial(x,y)}{\partial(x,y)} = \begin{vmatrix}1 & 0\\ 0 & 1\end{vmatrix} = 1.$$

The normal at a point P₀ : (x₀, y₀, S(x₀, y₀)) in this case is

$$N(P_0) = -\frac{\partial S}{\partial x}(x_0,y_0)\,\mathbf{i} - \frac{\partial S}{\partial y}(x_0,y_0)\,\mathbf{j} + \mathbf{k}. \tag{13.6}$$

We can also denote this vector as N(x₀, y₀).
EXAMPLE 13.23

Consider the cone given by z = S(x, y) = √(x² + y²). Then

$$\frac{\partial S}{\partial x} = \frac{x}{\sqrt{x^2+y^2}} \quad\text{and}\quad \frac{\partial S}{\partial y} = \frac{y}{\sqrt{x^2+y^2}},$$

except for x = y = 0. Take, for example, the point (3, 1, √10). The normal vector at this point is

$$N(3,1,\sqrt{10}) = -\frac{3}{\sqrt{10}}\,\mathbf{i} - \frac{1}{\sqrt{10}}\,\mathbf{j} + \mathbf{k}.$$

This normal vector is shown in Figure 13.35, and it points into the cone. In some contexts we want to know whether a normal vector is an inner normal (such as this one) or an outer normal (pointing out of the region bounded by the surface). If we wanted an outer normal at this point, we could take

$$-N(P_0) = \frac{3}{\sqrt{10}}\,\mathbf{i} + \frac{1}{\sqrt{10}}\,\mathbf{j} - \mathbf{k}.$$

This cone does not have a normal vector at the origin, which is a "sharp point" of the surface. There is no tangent plane at the origin.

FIGURE 13.35 Normal to the cone z = √(x² + y²) at (3, 1, √10).
The normal vector (13.6) could also have been derived using the gradient vector. If E is given by z = S(x, y), then E is a level surface of the function

φ(x, y, z) = z − S(x, y).

The gradient of this function is a normal vector, so compute

$$\nabla\varphi = \frac{\partial \varphi}{\partial x}\,\mathbf{i} + \frac{\partial \varphi}{\partial y}\,\mathbf{j} + \frac{\partial \varphi}{\partial z}\,\mathbf{k} = -\frac{\partial S}{\partial x}\,\mathbf{i} - \frac{\partial S}{\partial y}\,\mathbf{j} + \mathbf{k} = N(P_0).$$
13.4.2 The Tangent Plane to a Surface

If a surface E has a normal vector N at a point P₀, then we can use N to determine the equation of the tangent plane to E at P₀. Let (x, y, z) be any point on the tangent plane. Then the vector (x − x₀)i + (y − y₀)j + (z − z₀)k is in the tangent plane, hence is orthogonal to N. Then

N · [(x − x₀)i + (y − y₀)j + (z − z₀)k] = 0.
More explicitly,

$$\left[\frac{\partial(y,z)}{\partial(u,v)}\right]_{(u_0,v_0)}(x-x_0) + \left[\frac{\partial(z,x)}{\partial(u,v)}\right]_{(u_0,v_0)}(y-y_0) + \left[\frac{\partial(x,y)}{\partial(u,v)}\right]_{(u_0,v_0)}(z-z_0) = 0.$$

This is the equation of the tangent plane to E at P₀.
EXAMPLE 13.24

Consider again the elliptical cone given by

x = au cos(v), y = bu sin(v), z = u.

We found in Example 13.22 that the normal vector at P₀ : (√3a/4, b/4, 1/2) is N = −(√3b/4)i − (a/4)j + (ab/2)k. The tangent plane to E at this point has equation

$$-\frac{\sqrt{3}\,b}{4}\left(x-\frac{\sqrt{3}\,a}{4}\right) - \frac{a}{4}\left(y-\frac{b}{4}\right) + \frac{ab}{2}\left(z-\frac{1}{2}\right) = 0.$$

In the special case that E is given by z = S(x, y), the normal vector at P₀ is N = −(∂S/∂x)(x₀, y₀)i − (∂S/∂y)(x₀, y₀)j + k, so the equation of the tangent plane becomes

$$-\frac{\partial S}{\partial x}(x_0,y_0)(x-x_0) - \frac{\partial S}{\partial y}(x_0,y_0)(y-y_0) + (z-z_0) = 0.$$

This equation is usually written

$$z - z_0 = \frac{\partial S}{\partial x}(x_0,y_0)(x-x_0) + \frac{\partial S}{\partial y}(x_0,y_0)(y-y_0).$$
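For a surface z = S(x, y), the tangent plane should match the surface to first order near the point of tangency. A small sketch (plain Python, reusing the cone of Example 13.23 at the point (3, 1, √10); the step size h is an arbitrary choice of ours):

```python
import math

def S(x, y):
    # The cone of Example 13.23.
    return math.sqrt(x**2 + y**2)

x0, y0 = 3.0, 1.0
z0 = S(x0, y0)                       # sqrt(10)
Sx = x0 / math.sqrt(x0**2 + y0**2)   # dS/dx at (x0, y0) = 3/sqrt(10)
Sy = y0 / math.sqrt(x0**2 + y0**2)   # dS/dy at (x0, y0) = 1/sqrt(10)

def tangent_plane(x, y):
    # z = z0 + Sx (x - x0) + Sy (y - y0)
    return z0 + Sx*(x - x0) + Sy*(y - y0)

# First-order contact: the gap between surface and plane is O(h^2).
h = 1e-3
gap = abs(S(x0 + h, y0 + h) - tangent_plane(x0 + h, y0 + h))
print(gap)
```

Halving h should cut the printed gap by roughly a factor of four, the signature of second-order contact.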
13.4.3 Smooth and Piecewise Smooth Surfaces

Recall that a curve is smooth if it has a continuous tangent. Similarly, a surface is smooth if it has a continuous normal vector. A surface is piecewise smooth if it consists of a finite number of smooth surfaces. For example, a sphere is smooth, and the surface of a cube is piecewise smooth. A cube consists of six square pieces, which are smooth, but does not have a normal vector along any of its edges.

In calculus, it is shown that the area of a smooth surface E given by z = S(x, y) is the integral

$$\text{area of } E = \iint_D \sqrt{1+\left(\frac{\partial S}{\partial x}\right)^2+\left(\frac{\partial S}{\partial y}\right)^2}\,dA, \tag{13.7}$$

where D is the set of points in the x, y plane for which S is defined. This may also be written

$$\text{area of } E = \iint_D \sqrt{1+\left(\frac{\partial S}{\partial x}\right)^2+\left(\frac{\partial S}{\partial y}\right)^2}\,dx\,dy.$$

Equation (13.7) is the integral of the length of the normal vector (13.6):

$$\text{area of } E = \iint_D \|N(x,y)\|\,dx\,dy.$$

This is analogous to the formula for the length of a curve as the integral of the length of the tangent vector.
More generally, if E is given by coordinate functions x = x(u, v), y = y(u, v) and z = z(u, v), with (u, v) varying over some set D in the u, v plane, then

$$\text{area of } E = \iint_D \|N(u,v)\|\,du\,dv, \tag{13.8}$$

the integral of the length of the normal vector, which is given by equation (13.5).
EXAMPLE 13.25

We will illustrate these formulas for surface area in a simple case in which we know the area from elementary geometry. Let Σ be the upper hemisphere of radius 3 about the origin. We can write Σ as the graph of z = S(x, y) = √(9 − x² − y²), with x² + y² ≤ 9. D consists of all points on or inside the circle of radius 3 about the origin in the x, y-plane. We can use equation (13.7). Compute

∂z/∂x = −x/√(9 − x² − y²) = −x/z

and, by symmetry,

∂z/∂y = −y/z.

Then

area of Σ = ∬_D √((x/z)² + (y/z)² + 1) dx dy = ∬_D √((x² + y² + z²)/z²) dx dy = ∬_D 3/√(9 − x² − y²) dx dy.

This is an improper double integral which we can evaluate easily by converting it to polar coordinates. Let x = r cos(θ), y = r sin(θ). Since D is the disk of radius 3 about the origin,

area of Σ = ∫₀^{2π} ∫₀³ 3r/√(9 − r²) dr dθ = 6π[−(9 − r²)^{1/2}]₀³ = 6π[9^{1/2}] = 18π.

This is the area of a hemisphere of radius 3. We are now prepared to define the integral of a function over a surface.
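The area computation in Example 13.25 can be checked numerically. The following Python sketch (the function name `hemisphere_area` is ours, not the text's) applies the midpoint rule to the polar form of the integral, ∫₀^{2π} ∫₀³ 3r/√(9 − r²) dr dθ, and compares the result with the known hemisphere area 2πR² = 18π.

```python
import math

# Midpoint-rule evaluation of the improper integral from Example 13.25:
# area = 2π ∫0^R  R r / sqrt(R^2 - r^2) dr, with R = 3.
def hemisphere_area(radius=3.0, n=200000):
    total = 0.0
    dr = radius / n
    for i in range(n):
        r = (i + 0.5) * dr                 # midpoint of the i-th subinterval
        total += radius * r / math.sqrt(radius**2 - r**2) * dr
    return 2 * math.pi * total             # the θ-integral contributes 2π

print(hemisphere_area())                   # ≈ 18π ≈ 56.55
print(2 * math.pi * 3**2)                  # exact hemisphere area, 2πR²
```

The integrand is singular at r = 3, but the midpoint rule never samples the endpoint, so the sum still converges to 18π as n grows.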
13.4.4 Surface Integrals

The notion of the integral of a function over a surface is modeled after the line integral, with respect to arc length, of a function over a curve. Recall that, if a smooth curve C is given by x = x(t), y = y(t), z = z(t) for a ≤ t ≤ b, then the arc length along C is

s(t) = ∫_a^t √(x′(ξ)² + y′(ξ)² + z′(ξ)²) dξ.
CHAPTER 13 Vector Integral Calculus

Then

ds = √(x′(t)² + y′(t)² + z′(t)²) dt

and the line integral of a function f along C, with respect to arc length, is

∫_C f(x, y, z) ds = ∫_a^b f(x(t), y(t), z(t)) √(x′(t)² + y′(t)² + z′(t)²) dt.

We want to lift these ideas up one dimension to integrate over a surface instead of a curve. Now we have coordinate functions that are functions of two independent variables, say u and v, with (u, v) varying over some given set D in the u, v plane. This means that ∫_a^b will be replaced by ∬_D. The differential element of arc length, ds, which is used in the line integral, will be replaced by the differential element dσ of surface area on the surface. By equation (13.8), dσ = ‖N(u, v)‖ du dv, in which N(u, v) is the normal vector at the point (x(u, v), y(u, v), z(u, v)) on Σ.
DEFINITION 13.6 Surface Integral

Let Σ be a smooth surface having coordinate functions x = x(u, v), y = y(u, v), z = z(u, v) for (u, v) in some set D of the u, v plane. Let f be continuous on Σ. Then the surface integral of f over Σ is denoted ∬_Σ f(x, y, z) dσ, and is defined by

∬_Σ f(x, y, z) dσ = ∬_D f(x(u, v), y(u, v), z(u, v)) ‖N(u, v)‖ du dv.
If Σ is a piecewise smooth surface having smooth components Σ₁, …, Σₙ, with each component either disjoint from the others, or intersecting another component in a set of zero area (for example, along a curve), then

∬_Σ f(x, y, z) dσ = ∬_{Σ₁} f(x, y, z) dσ + ⋯ + ∬_{Σₙ} f(x, y, z) dσ.

For example, we would integrate over the surface of a cube by summing the surface integrals over the six faces. Two such faces either do not intersect, or intersect each other along a line segment having zero area. The intersection condition is to prevent the selection of surface components that overlap each other in significant ways. This is analogous to a piecewise smooth curve C formed as the join of smooth curves C₁, …, Cₙ. When we do this, we assume that two of these component curves either do not intersect, or intersect just at an end point, not along an arc of both curves.

If Σ is described by z = S(x, y), then

∬_Σ f(x, y, z) dσ = ∬_D f(x, y, S(x, y)) √(1 + (∂S/∂x)² + (∂S/∂y)²) dx dy.

We will look at some examples of evaluation of surface integrals, then consider uses of surface integrals.
FIGURE 13.36
EXAMPLE 13.26

Evaluate ∬_Σ z dσ if Σ is the part of the plane x + y + z = 4 lying above the rectangle 0 ≤ x ≤ 2, 0 ≤ y ≤ 1. The surface is shown in Figure 13.36. D consists of all (x, y) with 0 ≤ x ≤ 2 and 0 ≤ y ≤ 1. With z = S(x, y) = 4 − x − y we have

∬_Σ z dσ = ∬_D (4 − x − y) √(1 + (−1)² + (−1)²) dx dy = √3 ∫₀² ∫₀¹ (4 − x − y) dy dx.

First compute

∫₀¹ (4 − x − y) dy = [(4 − x)y − y²/2]₀¹ = 4 − x − 1/2 = 7/2 − x.

Then

∬_Σ z dσ = √3 ∫₀² (7/2 − x) dx = 5√3.
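A quick numerical check of Example 13.26 (a sketch; the function name is ours): on this plane dσ = √3 dx dy, so a midpoint-rule double sum over the rectangle should reproduce 5√3. The midpoint rule is exact for a linear integrand, so the agreement is essentially to machine precision.

```python
import math

# ∬_Σ z dσ for z = 4 - x - y above [0,2]×[0,1], with dσ = √3 dx dy.
def plane_integral(nx=400, ny=400):
    dx, dy = 2.0 / nx, 1.0 / ny
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * dx
        for j in range(ny):
            y = (j + 0.5) * dy
            total += (4 - x - y) * math.sqrt(3) * dx * dy
    return total

print(plane_integral())        # ≈ 5√3 ≈ 8.6603
```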
EXAMPLE 13.27

Recall the hyperbolic paraboloid of Example 13.18, given by

x = u cos(v), y = u sin(v), z = (1/2) u² sin(2v).

We will compute the surface integral ∬_Σ xyz dσ over the part of this surface corresponding to
As we expect of an integral,

∬_Σ (f(x, y, z) + g(x, y, z)) dσ = ∬_Σ f(x, y, z) dσ + ∬_Σ g(x, y, z) dσ

and, for any real number α,

∬_Σ α f(x, y, z) dσ = α ∬_Σ f(x, y, z) dσ.

The next section is devoted to some applications of surface integrals.
In each of Problems 1 through 10, evaluate ∬_Σ f(x, y, z) dσ.

1. f(x, y, z) = x, Σ is the part of the plane x + 4y + z = 10 in the first octant
2. f(x, y, z) = y², Σ is the part of the plane z = x for 0 ≤ x ≤ 2, 0 ≤ y ≤ 4
3. f(x, y, z) = 1, Σ is the part of the paraboloid z = x² + y² lying between the planes z = 2 and z = 7
4. f(x, y, z) = x + y, Σ is the part of the plane 4x + 8y + 10z = 25 lying above the triangle in the x, y-plane having vertices (0, 0), (1, 0) and (1, 1)
5. f(x, y, z) = z, Σ is the part of the cone z = √(x² + y²) lying in the first octant and between the planes z = 2 and z = 4
6. f(x, y, z) = xyz, Σ is the part of the plane z = x + y with (x, y) lying in the square with vertices (0, 0), (1, 0), (0, 1) and (1, 1)
7. f(x, y, z) = y, Σ is the part of the cylinder z = x² for 0 ≤ x ≤ 2, 0 ≤ y
8. f(x, y, z) = x², Σ is the part of the paraboloid z = 4 − x² − y² lying above the x, y plane
9. f(x, y, z) = z, Σ is the part of the plane z = x − y for 0 ≤ x ≤ 1, 0 ≤ y
10. f(x, y, z) = xyz, Σ is the part of the cylinder z = 1 + y² for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1

13.5 Applications of Surface Integrals

13.5.1 Surface Area

If Σ is a piecewise smooth surface, then
∬_Σ 1 dσ = ∬_D ‖N(u, v)‖ du dv = area of Σ.

(This assumes a bounded surface having finite area.) Clearly we do not need the notion of a surface integral to compute this integral and obtain the area of a surface. However, we mention this result because it is in the same spirit as other familiar mensuration formulas:

∫_C ds = length of C,

∬_D dA = area of D,

and, if M is a solid region in 3-space enclosing a volume, then

∭_M dV = volume of M.
13.5.2 Mass and Center of Mass of a Shell

Imagine a thin shell of negligible thickness taking the shape of a smooth surface Σ. Let δ(x, y, z) be the density of the material of the shell at (x, y, z). Assume that δ is continuous. We want to compute the mass of the shell.

Suppose Σ has coordinate functions x = x(u, v), y = y(u, v), z = z(u, v) for (u, v) in D. Form a grid over D in the u, v plane by drawing lines (dashed lines in Figure 13.37) parallel to the axes, and retain only those rectangles R₁, …, R_N intersecting D. Let the vertical lines be Δu units apart, and the horizontal lines Δv units apart. For (u, v) varying over R_j, we obtain a patch or surface element Σ_j on the surface (Figure 13.38). That is, Σ_j(u, v) = Σ(u, v) for (u, v) in R_j. Let (u_j, v_j) be a point in R_j, and approximate the density of the surface element Σ_j by the constant δ_j = δ(x(u_j, v_j), y(u_j, v_j), z(u_j, v_j)). Because δ is continuous and Σ_j has finite area, we can choose Δu and Δv sufficiently small that δ_j approximates δ(x, y, z) as closely as we like over Σ_j. Approximate the mass of Σ_j as δ_j times the area of Σ_j. Now this area is

area of Σ_j = ∬_{R_j} ‖N(u, v)‖ du dv ≈ ‖N(u_j, v_j)‖ Δu Δv,

so

mass of Σ_j ≈ δ_j ‖N(u_j, v_j)‖ Δu Δv.

The mass of Σ is approximately the sum of the masses of the surface elements:

mass of Σ ≈ ∑_{j=1}^{N} δ(x(u_j, v_j), y(u_j, v_j), z(u_j, v_j)) ‖N(u_j, v_j)‖ Δu Δv.
In 1927, Congress approved construction of a dam to control the Colorado River for the purpose of fostering agriculture in the American southwest and as a source of hydroelectric power, spurring the growth of Las Vegas and southern California. Construction began in 1931. One major problem of construction was the cooling of concrete as it was poured. Engineers estimated that the amount of concrete required would take 100 years to cool. The solution was to pour it in rows and columns of blocks, through which cooled water was pumped in pipes. Hoover Dam was completed in 1935 and is 727 feet high, 1244 feet long, 660 feet thick at its base, and 45 feet thick at the top. It weighs about 5.5 million tons and contains 3,250,000 cubic yards of concrete. On one side of the dam, Lake Mead is over 500 feet deep. Computation of forces and stresses on parts of the dam surface involves a combination of material science, fluid flow, and vector analysis.
FIGURE 13.37 Grid over D in the u, v plane.

FIGURE 13.38 Surface element Σ_j on Σ.
This is a Riemann sum for ∬_D δ(x(u, v), y(u, v), z(u, v)) ‖N(u, v)‖ du dv. Hence in the limit as Δu → 0 and Δv → 0 we obtain

mass of Σ = ∬_Σ δ(x, y, z) dσ.

The mass of the shell is the surface integral of the density function. This is analogous to the mass of a wire being the line integral of the density function over the wire.

The center of mass of the shell is (x̄, ȳ, z̄), where

x̄ = (1/m) ∬_Σ x δ(x, y, z) dσ,  ȳ = (1/m) ∬_Σ y δ(x, y, z) dσ,

and

z̄ = (1/m) ∬_Σ z δ(x, y, z) dσ,

in which m is the mass of the shell.
EXAMPLE 13.28

We will find the mass and center of mass of the cone z = √(x² + y²), where x² + y² ≤ 4 and the density function is δ(x, y, z) = x² + y². First calculate the mass m. We will need

∂z/∂x = x/z and ∂z/∂y = y/z.

Then

m = ∬_Σ (x² + y²) dσ = ∬_D (x² + y²) √(1 + x²/z² + y²/z²) dy dx = √2 ∫₀^{2π} ∫₀² r² r dr dθ = √2 (2π) [r⁴/4]₀² = 8√2 π.

By symmetry of the surface and of the density function, we expect the center of mass to lie on the z axis, so x̄ = ȳ = 0. Finally,

z̄ = (1/(8√2π)) ∬_Σ z(x² + y²) dσ = (1/(8√2π)) ∬_D √(x² + y²)(x² + y²) √2 dy dx
  = (1/(8π)) ∫₀^{2π} ∫₀² r(r²) r dr dθ = (1/(8π)) (2π) [r⁵/5]₀² = 8/5.

The center of mass is (0, 0, 8/5).
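Example 13.28 reduces to one-dimensional integrals in r once the θ-integration contributes its factor of 2π, so it is easy to verify numerically. The sketch below (the function name is our own) uses dσ = √2 r dr dθ on the cone.

```python
import math

# Mass and z-coordinate of the center of mass of the conical shell
# z = sqrt(x^2 + y^2), r ≤ 2, with density δ = x^2 + y^2 = r^2.
def shell_moments(n=100000):
    mass = moment_z = 0.0
    dr = 2.0 / n
    for i in range(n):
        r = (i + 0.5) * dr
        # δ · ‖N‖ element integrated over the full θ range:
        dm = (r**2) * math.sqrt(2) * r * dr * 2 * math.pi
        mass += dm
        moment_z += r * dm        # on the cone, z = r
    return mass, moment_z / mass

m, zbar = shell_moments()
print(m)        # ≈ 8√2 π ≈ 35.543
print(zbar)     # ≈ 8/5 = 1.6
```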
13.5.3 Flux of a Vector Field Across a Surface

Suppose a fluid moves in some region of 3-space (for example, through a pipeline), with velocity V(x, y, z, t). Consider an imaginary surface Σ within the fluid, with continuous unit normal vector N(u, v, t). The flux of V across Σ is the net volume of fluid, per unit time, flowing across Σ in the direction of N. We would like to calculate this flux.

In a time interval Δt the volume of fluid flowing across a small piece Σ_j of Σ equals the volume of the cylinder with base Σ_j and altitude V_N Δt, where V_N is the component of V in the direction of N, evaluated at some point of Σ_j. This volume (Figure 13.39) is (V_N Δt)A_j, where A_j is the area of Σ_j. Because ‖N‖ = 1, V_N = V · N. The volume of fluid flowing across Σ_j per unit time is

V_N (Δt) A_j / Δt = V · N A_j.

Sum these quantities over all the pieces of the surface and take the limit as the pieces are chosen smaller, as we did for the mass of the shell. We get

flux of V across Σ in the direction of N = ∬_Σ V · N dσ.

The flux of a vector field across a surface is therefore computed as the surface integral of the normal component of the vector field on the surface.

FIGURE 13.39
EXAMPLE 13.29

Find the flux of F = xi + yj + zk across the part of the sphere x² + y² + z² = 4 lying between the planes z = 1 and z = 2.

The surface Σ is shown in Figure 13.40, along with the normal vector (computed below) at a point. We may think of Σ as defined by z = S(x, y), where S is defined by the equation of the sphere and (x, y) varies over a set D in the x, y plane. To determine D, observe that the plane z = 2 hits Σ only at its "north pole" (0, 0, 2). The plane z = 1 intersects the sphere in the circle x² + y² = 3, z = 1. This circle projects onto the x, y plane to the circle of radius √3 about the origin. Thus D consists of points (x, y) satisfying x² + y² ≤ 3 (shaded in Figure 13.41).

FIGURE 13.40

FIGURE 13.41

To compute the partial derivatives ∂z/∂x and ∂z/∂y, we can implicitly differentiate the equation of the sphere to get

2x + 2z ∂z/∂x = 0,

so

∂z/∂x = −x/z

and, similarly,

∂z/∂y = −y/z.

A normal vector to the sphere is therefore

−(x/z)i − (y/z)j − k.

Since we need a unit normal in computing flux, we must divide this vector by its length, which is

√(x²/z² + y²/z² + 1) = √(x² + y² + z²)/z = 2/z.

A unit normal is therefore

N = −(1/2)(xi + yj + zk).

This points into the sphere. If we want the flux across Σ from the outside of the sphere toward the inside, this is the normal to use. If we want the flux across Σ from within the sphere, use −N instead. Now,

F · (−N) = (1/2)(x² + y² + z²).

Therefore

flux = ∬_Σ (1/2)(x² + y² + z²) dσ
     = (1/2) ∬_D (x² + y² + z²) √(1 + x²/z² + y²/z²) dA
     = (1/2) ∬_D (x² + y² + z²)^{3/2} / √(4 − x² − y²) dA
     = 4 ∬_D 1/√(4 − x² − y²) dA

because x² + y² + z² = 4 on Σ. Converting to polar coordinates, we have

flux = 4 ∫₀^{2π} ∫₀^{√3} r/√(4 − r²) dr dθ = 8π[−(4 − r²)^{1/2}]₀^{√3} = 8π.
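The final polar integral of Example 13.29 is easy to check numerically; the sketch below (function name ours) applies the midpoint rule to 4 ∫₀^{2π} ∫₀^{√3} r/√(4 − r²) dr dθ.

```python
import math

# Flux of F = (x, y, z) across the spherical zone of Example 13.29.
def zone_flux(n=200000):
    rmax = math.sqrt(3)
    dr = rmax / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += r / math.sqrt(4 - r**2) * dr
    return 4 * 2 * math.pi * total     # θ-integral contributes 2π

print(zone_flux())                     # ≈ 8π ≈ 25.1327
```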
We will see other applications of surface integrals when we discuss the integral theorems of Gauss and Stokes, and again when we study partial differential equations.
PROBLEMS
1. Find the mass and center of mass of the triangular shell having vertices (1, 0, 0), (0, 3, 0) and (0, 0, 2) if δ(x, y, z) = xz + 1.
2. Find the center of mass of the portion of the homogeneous sphere x² + y² + z² = 9 lying above the plane z = 1. (Homogeneous means that the density function is constant.)
3. Find the center of mass of the homogeneous cone z = √(x² + y²) for x² + y² ≤ 9.
4. Find the center of mass of the part of the paraboloid z = 16 − x² − y² lying in the first octant and between the cylinders x² + y² = 1 and x² + y² = 9, if δ(x, y, z) = xy/√(1 + 4x² + 4y²).
5. Find the mass and center of mass of the paraboloid z = 6 − x² − y² if δ(x, y, z) = √(1 + 4x² + 4y²).
6. Find the center of mass of the part of the homogeneous sphere x² + y² + z² = 1 lying in the first octant.
7. Find the flux of F = xi + yj − zk across the part of the plane x + 2y + z = 8 lying in the first octant.
8. Find the flux of F = xzi − yk across the part of the sphere x² + y² + z² = 4 lying above the plane z = 1.
13.6 Preparation for the Integral Theorems of Gauss and Stokes

The fundamental results of vector integral calculus are the theorems of Gauss and Stokes. In this section we will begin with Green's theorem and explore how natural generalizations lead to these results.

With appropriate conditions on the curve and the functions, the conclusion of Green's theorem is

∮_C f(x, y) dx + g(x, y) dy = ∬_D (∂g/∂x − ∂f/∂y) dA,

in which D is the region on and enclosed by the simple closed smooth curve C. Define the vector field

F(x, y) = g(x, y)i − f(x, y)j.

Then

∇ · F = ∂g/∂x − ∂f/∂y.

Now parametrize C by arc length, so the coordinate functions are x = x(s), y = y(s) for 0 ≤ s ≤ L. The unit tangent vector to C is T(s) = x′(s)i + y′(s)j, and the unit normal vector
is N(s) = y′(s)i − x′(s)j. These are shown in Figure 13.42. This normal points away from the interior D of C, and so is an outer normal. Now

F · N = g(x, y) dy/ds + f(x, y) dx/ds,

so

∮_C f(x, y) dx + g(x, y) dy = ∮_C [f(x, y) dx/ds + g(x, y) dy/ds] ds = ∮_C F · N ds.

We may therefore write the conclusion of Green's theorem in vector form as

∮_C F · N ds = ∬_D ∇ · F dA.    (13.9)

This is a conservation of energy equation. Recall from Section 12.4.1 that the divergence of a vector field at a point is a measure of the flow of the field from that point. Equation (13.9) states that the flux of the vector field outward from D across C (because N is an outer normal) exactly balances the flow of the field from each point in D.

The reason for writing Green's theorem in this form is that it suggests a generalization to three dimensions. Replace the closed curve C in the plane with a closed surface Σ in 3-space (closed meaning bounding a volume). Replace the line integral over C with a surface integral over Σ, and allow the vector field F to be a function of three variables. We conjecture that equation (13.9) generalizes to
∬_Σ F · N dσ = ∭_M ∇ · F dV,

in which N is a unit normal to Σ pointing away from the solid region M bounded by Σ. We will see that, under suitable conditions on Σ and F, this is the conclusion of Gauss's divergence theorem.

Now begin again with Green's theorem. We will pursue a different generalization to three dimensions. This time let

F(x, y, z) = f(x, y)i + g(x, y)j + 0k.

The reason for adding the third component is to be able to take the curl:

∇ × F = | i      j      k    |
        | ∂/∂x   ∂/∂y   ∂/∂z |
        | f      g      0    |
      = (∂g/∂x − ∂f/∂y)k,

since f and g do not depend on z. Then

(∇ × F) · k = ∂g/∂x − ∂f/∂y.

FIGURE 13.42

FIGURE 13.43
Further, with unit tangent T(s) = x′(s)i + y′(s)j to C, we can write

F · T ds = [f(x, y)i + g(x, y)j] · [dx/ds i + dy/ds j] ds = f(x, y) dx + g(x, y) dy,

so the conclusion of Green's theorem can also be written

∮_C F · T ds = ∬_D (∇ × F) · k dA.    (13.10)

Now think of D as a flat surface in the x, y plane, with unit normal vector k, and bounded by the closed curve C. To generalize this, allow C to be a curve in 3-space bounding a surface Σ having unit outer normal vector N, as shown in Figure 13.43. Now equation (13.10) suggests that

∮_C F · T ds = ∬_Σ (∇ × F) · N dσ.

We will see this equation shortly as the conclusion of Stokes's theorem.
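The vector form (13.9) of Green's theorem can be illustrated numerically. The sketch below (the field and the function names are our own choices, not the text's) uses F = x³i + y³j on the unit disk: the outward flux through the unit circle and the double integral of ∇ · F = 3(x² + y²) should agree.

```python
import math

# Left side of (13.9): ∮_C F·N ds on the unit circle, where N = (cos t, sin t).
def boundary_flux(n=100000):
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        total += (math.cos(t)**3 * math.cos(t) +
                  math.sin(t)**3 * math.sin(t)) * dt
    return total

# Right side of (13.9): ∬_D ∇·F dA with ∇·F = 3r² in polar coordinates.
def divergence_integral(n=2000):
    total = 0.0
    dr = 1.0 / n
    for i in range(n):
        r = (i + 0.5) * dr
        total += 3 * r**2 * r * dr
    return 2 * math.pi * total

print(boundary_flux())         # ≈ 3π/2 ≈ 4.7124
print(divergence_integral())   # ≈ 3π/2 ≈ 4.7124
```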
SECTION 13.6 PROBLEMS
1. Let C be a simple closed path in the x, y plane, with interior D. Let φ(x, y) and ψ(x, y) be continuous with continuous first and second partial derivatives on C and throughout D. Let

∇²ψ = ∂²ψ/∂x² + ∂²ψ/∂y².

Prove that

∬_D φ∇²ψ dA = ∮_C −φ (∂ψ/∂y) dx + φ (∂ψ/∂x) dy − ∬_D ∇φ · ∇ψ dA.

2. Under the conditions of Problem 1, show that

∬_D (φ∇²ψ − ψ∇²φ) dA = ∮_C −[φ (∂ψ/∂y) − ψ (∂φ/∂y)] dx + [φ (∂ψ/∂x) − ψ (∂φ/∂x)] dy.

3. Let C be a simple closed path in the x, y plane, with interior D. Let φ be continuous with continuous first and second partial derivatives on C and at all points of D. Let N(x, y) be the unit outer normal to C (outer meaning pointing away from D if drawn as an arrow at (x, y) on C). Prove that

∮_C φ_N(x, y) ds = ∬_D ∇²φ(x, y) dA.

(Recall that φ_N(x, y) is the directional derivative of φ in the direction of N.)

13.7 The Divergence Theorem of Gauss

We have seen that, under certain conditions, the conclusion of Green's theorem is

∮_C F · N ds = ∬_D ∇ · F dA.
Now make the following generalizations from the plane to 3-space:

a set D in the plane → a 3-dimensional solid M
a closed curve C bounding D → a surface Σ enclosing M
a unit outer normal N to C → a unit outer normal N to Σ
a vector field F in the plane → a vector field F in 3-space
a line integral ∮_C F · N ds → a surface integral ∬_Σ F · N dσ
a double integral ∬_D ∇ · F dA → a triple integral ∭_M ∇ · F dV.

With these correspondences and some terminology, Green's theorem suggests a theorem named for the great nineteenth-century German mathematician and scientist Carl Friedrich Gauss.

A surface Σ is closed if it encloses a volume. For example, a sphere is closed, as is a cube, while a hemisphere is not. A surface consisting of the top part of the sphere x² + y² + z² = a², together with the disk x² + y² ≤ a² in the x, y plane, is closed. If filled with water (through some opening that is then sealed off), it will hold the water.

A normal vector N to Σ is an outer normal if, when represented as an arrow from a point of the surface, it points away from the region enclosed by the surface (Figure 13.44). If N is also a unit vector, then it is a unit outer normal.

FIGURE 13.44
THEOREM 13.8 Gauss's Divergence Theorem

Let Σ be a piecewise smooth closed surface. Let M be the set of points on and enclosed by Σ. Let Σ have unit outer normal vector N. Let F be a vector field whose components are continuous with continuous first and second partial derivatives on Σ and throughout M. Then

∬_Σ F · N dσ = ∭_M ∇ · F dV.    (13.11)
∇ · F is the divergence of the vector field, hence the name "divergence theorem". In the spirit of Green's theorem, Gauss's theorem relates vector operations over objects of different dimensions. A surface is a two-dimensional object (it has area but no volume), while a solid region in 3-space is three-dimensional.

Gauss's theorem has several kinds of applications. One is to replace one of the integrals in equation (13.11) with the other, in the event that this simplifies an integral evaluation. A second is to suggest interpretations of vector operations. A third is to serve as a tool in deriving physical laws. Finally, we will use the theorem in developing relationships to be used in solving partial differential equations. Before looking at uses of the theorem, here are two purely computational examples to provide some feeling for equation (13.11).
EXAMPLE 13.30

Let Σ be the piecewise smooth closed surface consisting of the surface Σ₁ of the cone z = √(x² + y²) for x² + y² ≤ 1, together with the flat cap Σ₂ consisting of the disk x² + y² ≤ 1 in the plane z = 1. This surface is shown in Figure 13.45. Let F(x, y, z) = xi + yj + zk. We will calculate both sides of equation (13.11).

The unit outer normal to Σ₁ is

N₁ = (1/√2)((x/z)i + (y/z)j − k).

Then

F · N₁ = (xi + yj + zk) · (1/√2)((x/z)i + (y/z)j − k) = (1/√2)(x²/z + y²/z − z) = 0

because on Σ₁, z² = x² + y². (One can also see geometrically that F is orthogonal to N₁.) Then

∬_{Σ₁} F · N₁ dσ = 0.

The unit outer normal to Σ₂ is N₂ = k, so F · N₂ = z. Since z = 1 on Σ₂, then

∬_{Σ₂} F · N₂ dσ = ∬_{Σ₂} z dσ = ∬_{Σ₂} dσ = area of Σ₂ = π.

Therefore

∬_Σ F · N dσ = ∬_{Σ₁} F · N₁ dσ + ∬_{Σ₂} F · N₂ dσ = π.

FIGURE 13.45

Now compute the triple integral. The divergence of F is

∇ · F = ∂x/∂x + ∂y/∂y + ∂z/∂z = 3,

so

∭_M ∇ · F dV = ∭_M 3 dV = 3 [volume of the cone of height 1, radius 1] = 3 · (1/3)π = π.
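Both sides of (13.11) in Example 13.30 can be confirmed numerically; the sketch below (function names ours) integrates ∇ · F = 3 over the solid cone in cylindrical coordinates, and compares with the surface side, which reduces to the area of the cap since the cone side contributes zero.

```python
import math

# Divergence side: 3 ∫0^{2π} ∫0^1 (1 - r) r dr dθ over the solid cone r ≤ z ≤ 1.
def cone_triple_integral(n=20000):
    total = 0.0
    dr = 1.0 / n
    for i in range(n):
        r = (i + 0.5) * dr
        total += 3 * (1 - r) * r * dr
    return 2 * math.pi * total

# Surface side: F is tangent to the cone (F·N₁ = 0), and on the cap F·N₂ = z = 1,
# so the flux is just the area of the unit disk.
def cone_surface_flux():
    return math.pi * 1.0**2

print(cone_triple_integral())   # ≈ π
print(cone_surface_flux())      # = π
```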
EXAMPLE 13.31

Let Σ be the piecewise smooth surface of the cube having vertices

(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 0, 1), (1, 1, 1).

Let F(x, y, z) = x²i + y²j + z²k. We would like to compute the flux of this vector field across the faces of the cube.

The flux is ∬_Σ F · N dσ. This integral can certainly be calculated directly, but it requires performing the integration over each of the six smooth faces of Σ. It is easier to use the triple integral from Gauss's theorem. Compute the divergence

∇ · F = 2x + 2y + 2z

and then

flux = ∬_Σ F · N dσ = ∭_M ∇ · F dV = 2 ∭_M (x + y + z) dV
     = ∫₀¹ ∫₀¹ ∫₀¹ (2x + 2y + 2z) dz dy dx = ∫₀¹ ∫₀¹ (2x + 2y + 1) dy dx = ∫₀¹ (2x + 2) dx = 3.

Now we will move to more substantial uses of the theorem.
13.7.1 Archimedes's Principle

Archimedes's Principle states that the buoyant force a fluid exerts on a solid object immersed in it is equal to the weight of the fluid displaced. A bar of soap floats or sinks in a full bathtub for the same reason a battleship floats or sinks in the ocean. The issue rests with the weight of the fluid displaced by the object. We will derive this principle.

Consider a solid object M bounded by a piecewise smooth surface Σ. Let ρ be the constant density of the fluid. Draw a coordinate system as in Figure 13.46, with M below the surface of the fluid. Using the fact that pressure is the product of depth and density, the pressure p(x, y, z) at a point on Σ is given by p(x, y, z) = −ρz. The negative sign is used because z is negative in the downward direction and we want pressure to be positive.

Now consider a piece Σ_j of the surface, also shown in Figure 13.46. The force of the pressure on this surface element has magnitude approximately −ρz times the area A_j of Σ_j.
FIGURE 13.46
If N is the unit outer normal to Σ_j, then the force caused by the pressure on Σ_j is approximately ρzN A_j. The vertical component of this force is the magnitude of the buoyant force acting upward on Σ_j. This vertical component is ρzN · k A_j. Sum these vertical components over the entire surface to obtain approximately the net buoyant force on the object, then take the limit as the surface elements are chosen smaller (areas tending to zero). We obtain in this limit that

net buoyant force on Σ = ∬_Σ ρzN · k dσ.

Write this integral as ∬_Σ ρzk · N dσ and apply Gauss's theorem to convert the surface integral into a triple integral:

net buoyant force on Σ = ∭_M ∇ · (ρzk) dV.

But ∇ · (ρzk) = ρ, so

net buoyant force on Σ = ∭_M ρ dV = ρ [volume of M].
But this is exactly the weight of the fluid displaced, establishing Archimedes's Principle.

13.7.2 The Heat Equation

We will derive a partial differential equation that models heat conduction. Suppose some medium (for example, a metal bar, the air in a room, or water in a pool) has density ρ(x, y, z), specific heat μ(x, y, z), and coefficient of thermal conductivity K(x, y, z). Let u(x, y, z, t) be the temperature of the medium at time t and point (x, y, z). We want to derive an equation for u.

We will employ a device used frequently in deriving mathematical models. Consider an imaginary smooth closed surface Σ within the medium, bounding a solid region M. The amount of heat energy leaving M across Σ in a time interval Δt is

(∬_Σ (K∇u) · N dσ) Δt.

This is the flux of the vector K∇u (K times the gradient of u) across this surface, multiplied by the length of the time interval. But the change in temperature at (x, y, z) in M in this time interval is approximately (∂u/∂t)Δt, so the resulting heat change in M is

(∭_M μρ (∂u/∂t) dV) Δt.
Assuming that there are no heat sources or losses within M (for example, chemical reactions or radioactivity), the change in heat energy in M over this time interval must equal the heat exchange across Σ. Then

(∬_Σ (K∇u) · N dσ) Δt = (∭_M μρ (∂u/∂t) dV) Δt.

Therefore

∬_Σ (K∇u) · N dσ = ∭_M μρ (∂u/∂t) dV.

Apply Gauss's theorem to the surface integral to obtain

∭_M ∇ · (K∇u) dV = ∭_M μρ (∂u/∂t) dV.

The role of Gauss's theorem here is to convert the surface integral to a triple integral, thus obtaining an equation with the same kind of integral on both sides. This allows us to combine terms and write the last equation as

∭_M (μρ (∂u/∂t) − ∇ · (K∇u)) dV = 0.

Now keep in mind a crucial point: Σ is any smooth closed surface within the medium. Assume that the integrand in the last equation is continuous. If this integrand were nonzero at any point P₀ of the medium, then it would be positive or negative at P₀, say positive. By continuity of this integrand, there would be a sphere S, centered at P₀, of small enough radius that the integrand would be strictly positive on and within S. But then we would have

∭_M (μρ (∂u/∂t) − ∇ · (K∇u)) dV > 0,

in which M is the solid ball bounded by S. By choosing Σ = S, this is a contradiction. We conclude that

μρ (∂u/∂t) − ∇ · (K∇u) = 0

at all points in the medium, for all times. This gives us the partial differential equation

μρ (∂u/∂t) = ∇ · (K∇u)

for the temperature function at any point and time. This equation is called the heat equation. We can expand

∇ · (K∇u) = ∇ · (K (∂u/∂x) i + K (∂u/∂y) j + K (∂u/∂z) k)
          = ∂/∂x (K ∂u/∂x) + ∂/∂y (K ∂u/∂y) + ∂/∂z (K ∂u/∂z)
          = (∂K/∂x)(∂u/∂x) + (∂K/∂y)(∂u/∂y) + (∂K/∂z)(∂u/∂z) + K(∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z²)
          = ∇K · ∇u + K∇²u,
in which

∇²u = ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z²

is called the Laplacian of u. (∇² is read "del squared".) Now the heat equation can be written

μρ (∂u/∂t) = ∇K · ∇u + K∇²u.

If K is constant, then its gradient is the zero vector and this equation simplifies to

∂u/∂t = (K/μρ) ∇²u.

In the case of one space dimension (for example, if u(x, t) is the temperature distribution in a thin bar lying along a segment of the x axis), this is

∂u/∂t = k ∂²u/∂x²

with k = K/μρ. The steady-state case occurs when u does not change with time. In this case ∂u/∂t = 0, and the last equation becomes

∇²u = 0,

a partial differential equation called Laplace's equation. In Chapters 18 and 19 we will write solutions of the heat equation and Laplace's equation under various conditions.
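As a preview of how the one-dimensional heat equation is treated numerically (a sketch, not material from this text), the following explicit finite-difference scheme advances u_t = k u_xx on a bar with both ends held at temperature zero. The step sizes satisfy the standard stability condition k Δt/h² ≤ 1/2, and a single sine mode is used as initial data because it decays at the known rate e^{−kπ²t}, which makes the result easy to check.

```python
import math

# One explicit time step of u_t = k u_xx with u = 0 at both ends.
def heat_step(u, k, h, dt):
    r = k * dt / h**2
    return [0.0] + [u[i] + r * (u[i+1] - 2*u[i] + u[i-1])
                    for i in range(1, len(u) - 1)] + [0.0]

n, k = 50, 1.0
h = 1.0 / n
dt = 0.4 * h**2 / k                     # r = 0.4 ≤ 1/2: stable
u = [math.sin(math.pi * i * h) for i in range(n + 1)]
for _ in range(200):
    u = heat_step(u, k, h, dt)

t = 200 * dt                            # elapsed time, here 0.032
print(max(u))                           # ≈ exp(-k π² t) ≈ 0.729
```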
13.7.3 The Divergence Theorem as a Conservation of Mass Principle

We will derive a model providing a physical interpretation of the divergence of a vector field. Let F(x, y, z, t) be the velocity of a fluid moving in a region of 3-space at point (x, y, z) and time t. Let P₀ be a point in this region. Place an imaginary sphere Σ_r of radius r about P₀, as in Figure 13.47. Σ_r bounds a solid ball M_r. Let N be the unit outer normal to Σ_r. We know that ∬_{Σ_r} F · N dσ is the flux of F out of M_r across Σ_r.

If r is sufficiently small, then for a given time, ∇ · F(x, y, z, t) is approximated by ∇ · F(P₀, t) at all points (x, y, z) of M_r, to within any desired tolerance. Therefore

∭_{M_r} ∇ · F(x, y, z, t) dV ≈ ∭_{M_r} ∇ · F(P₀, t) dV = [∇ · F(P₀, t)] [volume of M_r] = (4/3)πr³ ∇ · F(P₀, t).

FIGURE 13.47
Then

∇ · F(P₀, t) ≈ (3/(4πr³)) ∭_{M_r} ∇ · F(x, y, z, t) dV = (3/(4πr³)) ∬_{Σ_r} F · N dσ

by Gauss's theorem. Let r → 0. Then Σ_r contracts to its center P₀ and this approximation becomes an equality:

∇ · F(P₀, t) = lim_{r→0} (3/(4πr³)) ∬_{Σ_r} F · N dσ.

On the right is the limit, as r → 0, of the flux of F across the sphere of radius r, divided by the volume of this sphere. This is the amount per unit volume of fluid flowing out of M_r across Σ_r. Since the sphere contracts to P₀ in this limit, we interpret the right side, hence also the divergence of F at P₀, as a measure of fluid flow away from P₀. This provides a physical sense of the divergence of a vector field.

In view of this interpretation, the equation

∬_Σ F · N dσ = ∭_M ∇ · F dV

states that the flux of F out of M across its bounding surface exactly balances the divergence of fluid away from the points of M. This is a conservation of mass statement, in the absence of fluid produced or destroyed within M, and provides a model for the divergence theorem.
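The limit defining the divergence can be illustrated numerically (a sketch; the field, the point, and the names are our own choices): approximate the flux of F through a small sphere about P₀ by quadrature over spherical angles, then divide by the sphere's volume.

```python
import math

# Estimate ∇·F at p0 as (flux through a small sphere) / (volume of the sphere).
def divergence_estimate(F, p0, r=1e-3, n=200):
    x0, y0, z0 = p0
    flux = 0.0
    dphi = math.pi / n
    dtheta = 2 * math.pi / n
    for i in range(n):
        phi = (i + 0.5) * dphi              # polar angle
        for j in range(n):
            theta = (j + 0.5) * dtheta      # azimuthal angle
            nx = math.sin(phi) * math.cos(theta)
            ny = math.sin(phi) * math.sin(theta)
            nz = math.cos(phi)
            fx, fy, fz = F(x0 + r*nx, y0 + r*ny, z0 + r*nz)
            dA = r**2 * math.sin(phi) * dphi * dtheta
            flux += (fx*nx + fy*ny + fz*nz) * dA
    return flux / ((4.0/3.0) * math.pi * r**3)

# For F = (x², y², z²) we have ∇·F = 2x + 2y + 2z, so at (1, 2, 3) the
# estimate should be close to 12.
F = lambda x, y, z: (x*x, y*y, z*z)
print(divergence_estimate(F, (1.0, 2.0, 3.0)))   # ≈ 12
```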
SECTION 13.7 PROBLEMS
In each of Problems 1 through 8, evaluate ∬_Σ F · N dσ or ∭_M div(F) dV, whichever is most convenient.

1. F = xi + yj − zk, Σ the sphere of radius 4 about (1, 1, 1)
2. F = 4xi − 6yj + k, Σ the surface of the solid cylinder x² + y² ≤ 4, 0 ≤ z ≤ 2 (the surface includes the end caps of the cylinder)
3. F = 2yzi − 4xzj + xyk, Σ the sphere of radius 5 about (−1, 3, 1)
4. F = x³i + y³j + z³k, Σ the sphere of radius 1 about the origin
5. F = 4xi − zj + xk, Σ the hemisphere x² + y² + z² = 1, z ≥ 0, including the base consisting of points (x, y) with x² + y² ≤ 1
6. F = (x − y)i + (y − 4xz)j + xzk, Σ the surface of the rectangular box bounded by the coordinate planes x = 0, y = 0 and z = 0 and by the planes x = 4, y = 2 and z = 3
7. F = x²i + y²j + z²k, Σ the cone z = √(x² + y²) for x² + y² ≤ 2, together with the top cap consisting of points (x, y, √2) with x² + y² ≤ 2
8. F = x²i − e^z j + zk, Σ the surface bounding the cylinder x² + y² ≤ 4, 0 ≤ z ≤ 2 (including the top and bottom caps of the cylinder)
9. Let Σ be a smooth closed surface and F a vector field with components that are continuous with continuous first and second partial derivatives on Σ and its interior. Evaluate ∬_Σ (∇ × F) · N dσ.
10. Let φ(x, y, z) and ψ(x, y, z) be continuous with continuous first and second partial derivatives on a smooth closed surface Σ and its interior M. Suppose ∇φ = O in M. Prove that ∭_M φ∇²ψ dV = 0.
11. Show that under the conditions of Problem 10, if ∇φ = ∇ψ = O, then ∭_M (φ∇²ψ − ψ∇²φ) dV = 0.
12. Let Σ be a smooth closed surface bounding an interior M. Show that

volume of M = (1/3) ∬_Σ R · N dσ,

where R = xi + yj + zk is a position vector for Σ.
13. Suppose f and g satisfy Laplace's equation in a region M bounded by a smooth closed surface Σ. Suppose ∂f/∂n = ∂g/∂n on Σ. Prove that for some constant k, f(x, y, z) = g(x, y, z) + k for all (x, y, z) in M.
13.8 The Integral Theorem of Stokes

We have seen that the conclusion of Green's theorem can be written

∮_C F · T ds = ∬_D (∇ × F) · k dA,

in which T is the unit tangent to C, a simple positively oriented closed curve enclosing a region D. Think of D as a flat surface with unit normal vector k and bounded by C. To generalize to three dimensions, allow C to be a closed curve in 3-space, bounding a smooth surface Σ as in Figure 13.48. Here Σ need not be a closed surface. Let N be a unit normal to Σ.

This raises a subtle point. At any point of Σ there are two normal vectors, as shown in Figure 13.49. Which should we choose? In addition to this decision, we must choose a direction on C. In the plane we chose counterclockwise as positive orientation, but this has no meaning in three dimensions.

First we will give a rule for choosing a particular unit normal to Σ at each point. If Σ has coordinate functions x = x(u, v), y = y(u, v), z = z(u, v), then the normal vector
divided by its length, yields a unit normal to E. The negative of this unit normal is also a uni t normal at the same point . Choose either this vector or its negative to use as the normal to E, and call it n . Whichever is chosen as n, use it at all points of E. That is, do not use n at on e point and -n at another . Now use the choice of n to determine an orientation or direction on C to be called th e positive orientation . Referring to Figure 13 .50, at any point on C, if you stand along n (that is, your head is at the tip of n), then the positive direction on C is the one in which you walk to have E over your left shoulder . The arrow shows the orientation on C obtained i n this way . Although this is not a rigorous definition, it is sufficient for our purpose, withou t becoming enmeshed in topological details . With this direction, we say that C has been oriente d coherently with n. If we had chosen the normal in the opposite direction, then we would hav e reached the opposite orientation on C . The choice of the normal determines the orientatio n on the curve. There is no intrinsic positive or negative orientation of the curve-simply a n orientation coherent with the choice of normal . With this understanding, we can state Stokes's theorem .
FIGURE 13.48
FIGURE 13.49 Normals to a surface at a point.
FIGURE 13.50
THEOREM 13.9 Stokes

Let Σ be a piecewise smooth surface bounded by a piecewise smooth curve C. Suppose a unit normal n has been chosen on Σ and that C is oriented coherently with n. Let F(x, y, z) be a vector field whose component functions are continuous with continuous first and second partial derivatives on Σ. Then

∮_C F · dR = ∬_Σ (∇ × F) · n dσ.
We will write this conclusion in terms of coordinates and component functions. Let the component functions of F be, respectively, f, g, and h. Then

∮_C F · dR = ∮_C f(x, y, z) dx + g(x, y, z) dy + h(x, y, z) dz.

Next,

∇ × F = | i     j     k    |
        | ∂/∂x  ∂/∂y  ∂/∂z |
        | f     g     h    |

      = (∂h/∂y − ∂g/∂z) i + (∂f/∂z − ∂h/∂x) j + (∂g/∂x − ∂f/∂y) k.

The normal to the surface is given by equation (13.5) as

N = (∂(y, z)/∂(u, v)) i + (∂(z, x)/∂(u, v)) j + (∂(x, y)/∂(u, v)) k.

Use this to define the unit normal

n(u, v) = N(u, v)/‖N(u, v)‖.

Then

(∇ × F) · n = (1/‖N(u, v)‖) [ (∂h/∂y − ∂g/∂z) ∂(y, z)/∂(u, v) + (∂f/∂z − ∂h/∂x) ∂(z, x)/∂(u, v) + (∂g/∂x − ∂f/∂y) ∂(x, y)/∂(u, v) ].

Then

∬_Σ (∇ × F) · n dσ
= ∬_D (1/‖N(u, v)‖) [ (∂h/∂y − ∂g/∂z) ∂(y, z)/∂(u, v) + (∂f/∂z − ∂h/∂x) ∂(z, x)/∂(u, v) + (∂g/∂x − ∂f/∂y) ∂(x, y)/∂(u, v) ] ‖N(u, v)‖ du dv
= ∬_D [ (∂h/∂y − ∂g/∂z) ∂(y, z)/∂(u, v) + (∂f/∂z − ∂h/∂x) ∂(z, x)/∂(u, v) + (∂g/∂x − ∂f/∂y) ∂(x, y)/∂(u, v) ] du dv,

in which the coordinate functions x(u, v), y(u, v), and z(u, v) from Σ are substituted into the integral, and D is the set of points (u, v) over which these coordinate functions are defined. Keep in mind that the function to be integrated in Stokes's theorem is (∇ × F) · n, in which n = N/‖N‖ is a unit normal. However, in converting the surface integral to a double integral over D, using the definition of surface integral, (∇ × F) · n must be multiplied by ‖N(u, v)‖, with N(u, v) determined by equation (13.5).
EXAMPLE 13.32

Let F(x, y, z) = −yi + xj − xyzk and let Σ consist of the part of the cone z = √(x² + y²) for x² + y² ≤ 9. We will compute both sides of the conclusion of Stokes's theorem to illustrate the various terms and integrals involved.

The cone is shown in Figure 13.51. Its boundary curve C is the circle around the top of the cone, the curve x² + y² = 9 in the plane z = 3. In this example the surface is described by z = S(x, y). Here x and y are the parameters, varying over the disk D given by x² + y² ≤ 9. We can compute a normal vector

N = −(∂z/∂x) i − (∂z/∂y) j + k = −(x/z) i − (y/z) j + k.

For (∇ × F) · n in Stokes's theorem, n is a unit normal, so compute the norm of N:

‖N‖ = ‖−(x/z) i − (y/z) j + k‖ = √(x²/z² + y²/z² + 1) = √( (x² + y² + (x² + y²)) / (x² + y²) ) = √2,

since z² = x² + y² on the cone. Use the unit normal

n = N/‖N‖ = (1/(√2 z)) (−xi − yj + zk).

This normal is defined at all points of the cone except the origin, where there is no normal. n is an inner normal, pointing from any point on the cone into the region bounded by the cone. If we stand along n at points of C and imagine walking along C in the direction of the arrow in Figure 13.52, then the surface is over our left shoulder. Therefore this arrow orients
FIGURE 13.51
FIGURE 13.52
C coherently with n. If we had used −n as normal vector, we would orient C in the other direction.

We can parametrize C by

x = 3 cos(t), y = 3 sin(t), z = 3 for 0 ≤ t ≤ 2π.

The point (3 cos(t), 3 sin(t), 3) traverses C in the positive direction (as determined by n) as t increases from 0 to 2π. This completes the preliminary work and we can evaluate the integrals. For the line integral,

∮_C F · dR = ∮_C −y dx + x dy − xyz dz
= ∫₀^{2π} [−3 sin(t)(−3 sin(t)) + 3 cos(t)(3 cos(t))] dt
= ∫₀^{2π} 9 dt = 18π.

For the surface integral, first compute the curl of F:

∇ × F = | i     j     k    |
        | ∂/∂x  ∂/∂y  ∂/∂z |
        | −y    x     −xyz |

      = −xzi + yzj + 2k.

Then

(∇ × F) · n = (1/(√2 z)) (x²z − y²z + 2z) = (1/√2) (x² − y² + 2).

Then

∬_Σ (∇ × F) · n dσ = ∬_D [(∇ × F) · n] ‖N‖ dx dy
= ∬_D (1/√2)(x² − y² + 2) √2 dx dy
= ∬_D (x² − y² + 2) dx dy.

Use polar coordinates on D to write this integral as

∫₀^{2π} ∫₀³ (r² cos²(θ) − r² sin²(θ) + 2) r dr dθ
= ∫₀^{2π} ∫₀³ r³ cos(2θ) dr dθ + ∫₀^{2π} ∫₀³ 2r dr dθ
= (1/4)[r⁴]₀³ (1/2)[sin(2θ)]₀^{2π} + 2π [r²]₀³ = 0 + 18π = 18π.
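Both sides of this example can be checked with a computer algebra system. The following SymPy sketch (SymPy is one possible choice of software; the text does not prescribe a package) recomputes the curl, the line integral over the parametrized boundary circle, and the polar double integral:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
f, g, h = -y, x, -x*y*z   # components of F = -y i + x j - xyz k

# curl F from the determinant formula
curl = (sp.diff(h, y) - sp.diff(g, z),
        sp.diff(f, z) - sp.diff(h, x),
        sp.diff(g, x) - sp.diff(f, y))

# line integral around C: x = 3 cos t, y = 3 sin t, z = 3 (so dz = 0)
on_C = {x: 3*sp.cos(t), y: 3*sp.sin(t), z: 3}
integrand = (f.subs(on_C) * sp.diff(3*sp.cos(t), t)
             + g.subs(on_C) * sp.diff(3*sp.sin(t), t))
line_side = sp.integrate(integrand, (t, 0, 2*sp.pi))

# surface side reduces to the double integral of (x^2 - y^2 + 2) over the
# disk x^2 + y^2 <= 9; evaluate it in polar coordinates
r, th = sp.symbols('r theta')
surf_side = sp.integrate((r**2*sp.cos(th)**2 - r**2*sp.sin(th)**2 + 2) * r,
                         (r, 0, 3), (th, 0, 2*sp.pi))
```

Both `line_side` and `surf_side` come out to 18π, and `curl` reproduces −xzi + yzj + 2k.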
The following are two applications of Stokes's theorem.
13.8.1 An Interpretation of Curl

We will use Stokes's theorem to argue for a physical interpretation of the curl operation. Think of F(x, y, z) as the velocity of a fluid and let P₀ be any point in the fluid. Consider a disk Σ_r of radius r about P₀, with unit normal vector n and boundary circle C_r coherently oriented, as in Figure 13.53. For the disk, the normal vector is constant. By Stokes's theorem,

∮_{C_r} F · dR = ∬_{Σ_r} (∇ × F) · n dσ.

Since R′(t) is a tangent vector to C_r, F · R′ is the tangential component of the velocity about C_r, and ∮_{C_r} F · dR measures the circulation of the fluid about C_r.

FIGURE 13.53

By choosing r sufficiently small, ∇ × F(x, y, z) is approximated by ∇ × F(P₀) as closely as we like on Σ_r. Further, since n is constant,

circulation of F about C_r ≈ ∬_{Σ_r} (∇ × F)(P₀) · n dσ
= (∇ × F)(P₀) · n (area of the disk Σ_r)
= πr² (∇ × F)(P₀) · n.

Therefore

(∇ × F)(P₀) · n ≈ (1/(πr²)) (circulation of F about C_r).

As r → 0, the disk contracts to its center P₀ and we obtain

(∇ × F)(P₀) · n = lim_{r→0} (1/(πr²)) (circulation of F about C_r).
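This limit can be illustrated numerically. The short NumPy sketch below (an illustration, not part of the text's development) uses the sample field F = −yi + xj, whose curl is the constant 2k, and checks that circulation divided by πr² is close to 2 for a small circle:

```python
import numpy as np

# Velocity field F = (-y, x, 0) has constant curl (0, 0, 2).  The circulation
# about a small circle C_r of radius r in the xy-plane, divided by pi r^2,
# should then be close to (curl F) . k = 2.
r = 0.01
t = np.linspace(0.0, 2*np.pi, 2001)
xdot = -r*np.sin(t)                   # derivative of x = r cos t
ydot = r*np.cos(t)                    # derivative of y = r sin t
Fx, Fy = -r*np.sin(t), r*np.cos(t)    # F evaluated on C_r

integrand = Fx*xdot + Fy*ydot         # tangential component F . R'(t)
# trapezoid rule for the circulation integral over [0, 2 pi]
circulation = np.sum((integrand[:-1] + integrand[1:]) / 2 * np.diff(t))
estimate = circulation / (np.pi * r**2)
```

Here `estimate` is essentially 2, the n-component of the curl, in agreement with the limit above.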
Since n is normal to the plane of C_r, this equation can be read:

(∇ × F)(P₀) · n = circulation of F per unit area in the plane normal to n.

Thus the curl of F is a measure of rotation of the fluid at a point. This is the reason why a fluid is called irrotational if the curl of the velocity vector is zero. For example, any conservative vector field is irrotational, because if F = ∇φ, then ∇ × F = ∇ × (∇φ) = O.

13.8.2 Potential Theory in 3-Space

As in the plane, a vector field F(x, y, z) in 3-space is conservative if F = ∇φ for some potential function φ. By exactly the same reasoning as applied in the two-dimensional case, we get

∫_C F · dR = φ(P₁) − φ(P₀),
if P₀ is the initial point of C and P₁ the terminal point. Therefore the line integral of a conservative vector field in 3-space is independent of path. If a potential function exists, we attempt to find it by integration, just as in the plane.
EXAMPLE 13.33

Let

F(x, y, z) = (yz e^{xyz} − 4x) i + (xz e^{xyz} + z + cos(y)) j + (xy e^{xyz} + y) k.

For this to be the gradient of a scalar field φ, we must have

∂φ/∂x = yz e^{xyz} − 4x,
∂φ/∂y = xz e^{xyz} + z + cos(y),

and

∂φ/∂z = xy e^{xyz} + y.

Begin with one of these equations, say the last, and integrate with respect to z to get

φ(x, y, z) = e^{xyz} + yz + k(x, y),

where the "constant" of integration may involve the other two variables. Now we need

∂φ/∂x = yz e^{xyz} − 4x = ∂/∂x [e^{xyz} + yz + k(x, y)] = yz e^{xyz} + ∂k/∂x.

This will be satisfied if

∂k/∂x = −4x,

so k(x, y) must have the form

k(x, y) = −2x² + c(y).

Thus far φ(x, y, z) must have the appearance

φ(x, y, z) = e^{xyz} + yz − 2x² + c(y).

Finally, we must satisfy

∂φ/∂y = xz e^{xyz} + z + cos(y) = ∂/∂y [e^{xyz} + yz − 2x² + c(y)] = xz e^{xyz} + z + c′(y).

Then c′(y) = cos(y) and we may choose c(y) = sin(y).
A potential function is given by

φ(x, y, z) = e^{xyz} + yz − 2x² + sin(y).

Of course, for any number a, φ(x, y, z) + a is also a potential function for F.

As in the plane, in 3-space there are vector fields that are not conservative. We would like to develop a test to determine when F has a potential function. The discussion follows that in Section 13.3.1 for the plane. As we saw in the plane, the test requires conditions not only on F, but on the set D of points on which we want to find a potential function.

We define a set D of points in 3-space to be a domain if

1. about every point of D there exists a sphere containing only points of D, and
2. between any two points of D there is a path lying entirely in D.

This definition is analogous to the definition made for sets in the plane. For example, the set of points bounded by two disjoint spheres is not a domain because it fails to satisfy (2), while the set of points on or inside the solid unit sphere about the origin fails to be a domain because of condition (1). The set of points (x, y, z) with x ≥ 0, y ≥ 0, and z ≥ 0 is not a domain because it fails condition (1); for example, there is no sphere about the origin containing only points with nonnegative coordinates. The set of points (x, y, z) with x > 0, y > 0, and z > 0 is a domain.

On a domain, existence of a potential function is equivalent to independence of path.
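The potential found in Example 13.33 is easy to verify by differentiation. As a quick check (using SymPy as one possible tool; the choice of software is not from the text), one can confirm that ∇φ reproduces each component of F:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
# the potential function found in Example 13.33
phi = sp.exp(x*y*z) + y*z - 2*x**2 + sp.sin(y)
# the given vector field F
F = (y*z*sp.exp(x*y*z) - 4*x,
     x*z*sp.exp(x*y*z) + z + sp.cos(y),
     x*y*sp.exp(x*y*z) + y)

grad_phi = tuple(sp.diff(phi, v) for v in (x, y, z))   # should equal F
```

Each component of `grad_phi` simplifies to the corresponding component of F, so φ is indeed a potential function.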
THEOREM 13.10

Let D be a domain in 3-space, and let F be continuous on D. Then ∫_C F · dR is independent of path in D if and only if F is conservative.

We already know that existence of a potential function implies independence of path. For the converse, suppose ∫_C F · dR is independent of path in D. Choose any P₀ in D. If P is any point of D, there is a path C in D from P₀ to P, and we can define

φ(P) = ∫_C F · dR.

Because this line integral depends only on the end points of any path in D, this defines a function for all (x, y, z) in D. Now the argument used in the proof of Theorem 13.6 can be essentially duplicated to show that F = ∇φ.

With one more condition on the domain D, we can derive a simple test for a vector field to be conservative. A set D of points in 3-space is simply connected if every simple closed path in D is the boundary of a piecewise smooth surface lying in D. This condition enables us to use Stokes's theorem to derive the condition we want.

THEOREM 13.11

Let D be a simply connected domain in 3-space. Let F and ∇ × F be continuous on D. Then F is conservative if and only if ∇ × F = O in D.

Thus, in simply connected domains, the conservative vector fields are the ones having curl zero, that is, the irrotational vector fields.

In one direction, the proof is simple. If F = ∇φ, then ∇ × F = ∇ × (∇φ) = O, without the requirement of simple connectivity. In the other direction, suppose ∇ × F = O. To prove that F is conservative, it is enough to prove that ∫_C F · dR is independent of path in D. Let C and K be two paths from P₀ to P₁ in D.
FIGURE 13.54
Form a closed path L in D consisting of C and −K, as in Figure 13.54. Since D is simply connected, there is a piecewise smooth surface Σ in D having boundary L. Then

∮_L F · dR = ∫_C F · dR − ∫_K F · dR = ∬_Σ (∇ × F) · n dσ = 0,

so

∫_C F · dR = ∫_K F · dR,

and the line integral is independent of path; hence F is conservative.

If G(x, y) = f(x, y)i + g(x, y)j, then we can think of G as a vector field in 3-space by writing G(x, y) = f(x, y)i + g(x, y)j + 0k. Then

∇ × G = | i        j        k    |
        | ∂/∂x     ∂/∂y     ∂/∂z |
        | f(x, y)  g(x, y)  0    |

      = (∂g/∂x − ∂f/∂y) k,

so the condition ∇ × G = O in this two-dimensional case is exactly the condition of Theorem 13.5.
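Theorem 13.11 turns the question "is F conservative on all of 3-space?" into a routine differentiation. A small SymPy sketch (an illustration under the assumption that the domain is all of 3-space, which is simply connected) implements the curl test:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def curl(F):
    """Curl of F = (f, g, h) from the determinant formula."""
    f, g, h = F
    return (sp.diff(h, y) - sp.diff(g, z),
            sp.diff(f, z) - sp.diff(h, x),
            sp.diff(g, x) - sp.diff(f, y))

def is_irrotational(F):
    # on a simply connected domain, this is equivalent to being conservative
    return all(sp.simplify(c) == 0 for c in curl(F))

F_conservative = (2*x, -2*y, 2*z)       # gradient of x^2 - y^2 + z^2
F_rotational = (-y, x, sp.Integer(0))   # curl = (0, 0, 2), not conservative
```

The first field passes the test, while the second fails it, as expected.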
PROBLEMS

In each of Problems 1 through 5, use Stokes's theorem to evaluate ∮_C F · dR or ∬_Σ (∇ × F) · N dσ, whichever appears easier.

1. F = yx²i − xy²j + z²k, Σ the hemisphere x² + y² + z² = 4, z ≥ 0
2. F = xyi + yzj + xzk, Σ the paraboloid z = x² + y² for x² + y² ≤ 9
3. F = zi + xj + yk, Σ the cone z = √(x² + y²) for 0 ≤ z ≤ …
5. F = xyi + yzj + xyk, Σ the part of the plane 2x + 4y + z = 8 in the first octant
6. Calculate the circulation of F = (x − y)i + x²yj + xzak counterclockwise about the circle x² + y² = 1. Here a is a positive constant. Hint: Use Stokes's theorem, with Σ any smooth surface having the circle as boundary.
7. Use Stokes's theorem to evaluate ∮_C F · T ds, where C is the boundary of the part of the plane x + 4y + z = 12 lying in the first octant, and F = (x − z)i + (y − x)j + (z − y)k.

In each of Problems 8 through 14, let Ω be all of 3-space (so Ω is a simply connected domain). Test to see if F is conservative. If it is, find a potential function.

8. F = 2xi − 2yj + 2zk
9. F = i − 2j + k
10. F = yz cos(x)i + (z sin(x) + 1)j + y sin(x)k

In each of Problems 15 through 20, evaluate the line integral of the vector field on any path from the first point to the second by finding a potential function.

15. F = i − 9y²zj − 3y³k; (1, 1, 1), (0, 3, 5)
The Fourier Integral and Fourier Transforms
CHAPTER 16 Special Functions, Orthogonal Expansions and Wavelets

Fourier Analysis, Orthogonal Expansions, and Wavelets
In 1807 the French mathematician Joseph Fourier (1768-1830) submitted a paper to the Academy of Sciences in Paris. In it he presented a mathematical treatment of problems involving heat conduction. Although the paper was rejected for lack of rigor, it contained ideas that were so rich and widely applicable that they would occupy mathematicians in research to the present day.

One surprising implication of Fourier's work was that many functions can be expanded in infinite series or integrals involving sines and cosines. This revolutionary idea sparked a heated debate among leading mathematicians of the day and led to important advances in mathematics (Cantor's work on cardinals and ordinals, orders of infinity, measure theory, real and complex analysis, differential equations), science and engineering (data compression, signal analysis), and applications undreamed of in Fourier's day (CAT scans, PET scans, nuclear magnetic resonance).

Today the term Fourier analysis refers to many extensions of Fourier's original insights, including various kinds of Fourier series and integrals, real and complex Fourier transforms, discrete and finite transforms, and, because of their wide applicability, a variety of computer programs for efficiently computing Fourier coefficients and transforms. The ideas behind Fourier series have also found important generalizations in a broad theory of eigenfunction expansions, in which functions are expanded in series of special functions (Bessel functions, orthogonal polynomials, and other functions generated by differential equations). More recently, wavelet expansions have been developed to provide additional tools in areas such as filtering and signal analysis. This part is devoted to some of these ideas and their applications.
CHAPTER 14
Fourier Series

THE FOURIER SERIES OF A FUNCTION · CONVERGENCE OF FOURIER SERIES · FOURIER COSINE AND SINE SERIES · INTEGRATION AND DIFFERENTIATION OF FOURIER SERIES · THE PHASE ANGLE FORM OF A …
14.1 Why Fourier Series?

A Fourier series is a representation of a function as a series of constants times sine and/or cosine functions of different frequencies. In order to see why such a series might be interesting, we will look at a problem of the type that led Fourier to consider them.

Consider a thin bar of length π, of constant density and uniform cross section. Let u(x, t) be the temperature at time t in the cross section of the bar at x, for 0 ≤ x ≤ π. In Section 13.7.2 we derived a partial differential equation for u:

∂u/∂t = k ∂²u/∂x² for 0 < x < π, t > 0,   (14.1)

in which k is a constant depending on the material of the bar. Suppose the left and right ends of the bar are kept at zero temperature,

u(0, t) = u(π, t) = 0 for t ≥ 0,   (14.2)

and that the temperature throughout the bar at time t = 0 is specified,

u(x, 0) = f(x) = x(π − x).   (14.3)

Intuitively, the heat equation, together with the initial temperature distribution throughout the bar, and the information that the ends are kept at zero degrees for all time, are enough to determine the temperature distribution u(x, t) throughout the bar at any time.

By a process that now bears his name, and which we will develop when we study partial differential equations, Fourier found functions satisfying the heat equation (14.1) and the conditions at the ends of the bar, equations (14.2), and having the form

u_n(x, t) = b_n sin(nx) e^{−kn²t},   (14.4)

in which n can be any positive integer, and b_n can be any real number. We will use these functions to find a function that also satisfies the condition (14.3).
Periodic phenomena have long fascinated mankind; our ancient ancestors were aware of the recurrence of phases of the moon and certain planets, the tides of lakes and oceans, and cycles in the weather. Isaac Newton's calculus and law of gravitation enabled him to explain the periodicities of the tides, but it was left to Joseph Fourier and his successors to develop Fourier analysis, which has had profound applications in the study of natural phenomena and the analysis of signals and data.
A single choice of positive integer n₀ and constant b_{n₀} will not do. If we let u(x, t) = b_{n₀} sin(n₀x) e^{−kn₀²t}, then we would need

u(x, 0) = x(π − x) = b_{n₀} sin(n₀x) for 0 ≤ x ≤ π,

an impossibility. A polynomial cannot equal a constant multiple of a sine function over [0, π] (or over any nontrivial interval).

The next thing to try is a finite sum of the functions (14.4), say

u(x, t) = Σ_{n=1}^{N} b_n sin(nx) e^{−kn²t}.   (14.5)

Such a function will still satisfy the heat equation and the conditions (14.2). To satisfy the condition (14.3), we must choose N and the b_n's so that

u(x, 0) = x(π − x) = Σ_{n=1}^{N} b_n sin(nx) for 0 ≤ x ≤ π.

But this is also impossible. A finite sum of constant multiples of sine functions cannot equal a polynomial over [0, π].
At this point Fourier had a brilliant insight. Since no finite sum of functions (14.4) can be a solution, attempt an infinite series:

u(x, t) = Σ_{n=1}^{∞} b_n sin(nx) e^{−kn²t}.   (14.6)

This function will satisfy the heat equation, as well as the conditions u(0, t) = u(π, t) = 0. To satisfy condition (14.3) we must choose the b_n's so that

u(x, 0) = x(π − x) = Σ_{n=1}^{∞} b_n sin(nx) for 0 ≤ x ≤ π.   (14.7)

This is quite different from attempting to represent the polynomial x(π − x) by the finite trigonometric sum (14.5). Fourier claimed that equation (14.7) is valid for 0 ≤ x ≤ π if the coefficients are chosen as

b_n = (2/π) ∫₀^{π} x(π − x) sin(nx) dx = (4/π) (1 − (−1)ⁿ)/n³.

By inserting these coefficients into the proposed solution (14.6), Fourier thus claimed that the solution of this heat conduction problem, with the given initial temperature, is

u(x, t) = (4/π) Σ_{n=1}^{∞} ((1 − (−1)ⁿ)/n³) sin(nx) e^{−kn²t}.

The claim that

(4/π) Σ_{n=1}^{∞} ((1 − (−1)ⁿ)/n³) sin(nx) = x(π − x) for 0 ≤ x ≤ π

was too much for mathematicians of Fourier's time to accept. The mathematics of this time was not adequate to proving this kind of assertion. This was the lack of rigor that led the Academy to reject publication of Fourier's paper. But the implications were not lost on Fourier's colleagues. There is nothing unique about x(π − x) as an initial temperature distribution, and many different functions could be used. What Fourier was actually claiming was that, for a broad class of functions f, coefficients b_n could be chosen so that f(x) = Σ_{n=1}^{∞} b_n sin(nx) on [0, π]. Eventually this and even more general claims for these series proposed by Fourier were proved. We will now begin an analysis of Fourier's ideas and some of their ramifications.
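Fourier's claim is easy to probe numerically. The following NumPy sketch (a numerical illustration, not a proof) compares a partial sum of the series against x(π − x) on [0, π]:

```python
import numpy as np

x = np.linspace(0.0, np.pi, 400)
target = x * (np.pi - x)

# partial sum of Fourier's series with coefficients (4/pi)(1 - (-1)^n)/n^3;
# only odd n contribute, since the coefficient vanishes for even n
N = 101
partial = sum((4/np.pi) * (1 - (-1)**n) / n**3 * np.sin(n*x)
              for n in range(1, N + 1))
max_err = np.max(np.abs(partial - target))
```

With roughly one hundred terms the maximum discrepancy on the whole interval is already well below 10⁻³, in line with Fourier's intuition.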
PROBLEMS

1. On the same set of axes, generate a graph of x(π − x) and Σ_{n=1}^{5} (4/π)([1 − (−1)ⁿ]/n³) sin(nx) for 0 ≤ x ≤ π. Repeat this for the partial sums Σ_{n=1}^{10} (4/π)([1 − (−1)ⁿ]/n³) sin(nx) and Σ_{n=1}^{20} (4/π)([1 − (−1)ⁿ]/n³) sin(nx). This will give a sense of the correctness of Fourier's intuition in asserting that x(π − x) can be accurately represented by Σ (4/π)([1 − (−1)ⁿ]/n³) sin(nx) on this interval.
2. Prove that a polynomial cannot be a constant multiple of sin(nx) over [0, π], for any positive integer n. Hint: One way is to proceed by induction on the degree of the polynomial.
3. Prove that a polynomial cannot be equal to a nonzero sum of the form Σ_j c_j sin(jx) for 0 ≤ x ≤ π, where the c_j's are real numbers.
14.2 The Fourier Series of a Function

Let f(x) be defined for −L ≤ x ≤ L. For the time being, we assume only that ∫_{−L}^{L} f(x) dx exists. We want to explore the possibility of choosing numbers a₀, a₁, . . . , b₁, b₂, . . . such that

f(x) = (1/2)a₀ + Σ_{n=1}^{∞} [a_n cos(nπx/L) + b_n sin(nπx/L)]   (14.8)

for −L ≤ x ≤ L. We will see that this is sometimes asking too much, but that under certain conditions on f it can be done. However, to get started, we will assume the best of all worlds and suppose for the moment that equation (14.8) holds. What does this tell us about how to choose the coefficients? There is a clever device used to answer this question, which was known to Fourier and others of his time. We will need the following elementary lemma.
LEMMA 14.1

Let n and m be nonnegative integers. Then

1. ∫_{−L}^{L} cos(nπx/L) sin(mπx/L) dx = 0.

2. If n ≠ m, then

∫_{−L}^{L} cos(nπx/L) cos(mπx/L) dx = ∫_{−L}^{L} sin(nπx/L) sin(mπx/L) dx = 0.

3. If n ≠ 0, then

∫_{−L}^{L} cos²(nπx/L) dx = ∫_{−L}^{L} sin²(nπx/L) dx = L.
The lemma is proved by straightforward integration. Now, to find a₀, integrate the series (14.8) term by term (supposing for now that we can do this):

∫_{−L}^{L} f(x) dx = (1/2)a₀ ∫_{−L}^{L} dx + Σ_{n=1}^{∞} [ a_n ∫_{−L}^{L} cos(nπx/L) dx + b_n ∫_{−L}^{L} sin(nπx/L) dx ].

All of the integrals on the right are zero and this equation reduces to

∫_{−L}^{L} f(x) dx = L a₀.

Therefore

a₀ = (1/L) ∫_{−L}^{L} f(x) dx.
Next, we will determine a_k for any positive integer k. Multiply equation (14.8) by cos(kπx/L) and integrate each term of the resulting series to get

∫_{−L}^{L} f(x) cos(kπx/L) dx = (1/2)a₀ ∫_{−L}^{L} cos(kπx/L) dx
+ Σ_{n=1}^{∞} [ a_n ∫_{−L}^{L} cos(nπx/L) cos(kπx/L) dx + b_n ∫_{−L}^{L} sin(nπx/L) cos(kπx/L) dx ].

By the lemma, all of the integrals on the right are zero except for ∫_{−L}^{L} cos(kπx/L) cos(kπx/L) dx, which occurs when n = k, and in this case this integral equals L. The right side of this equation therefore collapses to just one term, and the equation becomes

∫_{−L}^{L} f(x) cos(kπx/L) dx = a_k L,

whereupon

a_k = (1/L) ∫_{−L}^{L} f(x) cos(kπx/L) dx.

To determine b_k, return to equation (14.8). This time multiply the equation by sin(kπx/L) and integrate each term to get

∫_{−L}^{L} f(x) sin(kπx/L) dx = (1/2)a₀ ∫_{−L}^{L} sin(kπx/L) dx
+ Σ_{n=1}^{∞} [ a_n ∫_{−L}^{L} cos(nπx/L) sin(kπx/L) dx + b_n ∫_{−L}^{L} sin(nπx/L) sin(kπx/L) dx ].

Again, by the lemma, all terms on the right are zero except for ∫_{−L}^{L} sin(nπx/L) sin(kπx/L) dx when n = k, and this equation reduces to

∫_{−L}^{L} f(x) sin(kπx/L) dx = b_k L.

Therefore

b_k = (1/L) ∫_{−L}^{L} f(x) sin(kπx/L) dx.
We have now "solved" for the coefficients in the trigonometric series expansion (14.8). Of course, this analysis is flawed by the interchange of series and integrals, which is not always justified. However, the argument does tell us how the constants should be chosen, at least under certain conditions, and suggests the following definition.

DEFINITION 14.1 Fourier Coefficients and Series

Let f be a Riemann integrable function on [−L, L].

1. The numbers

a_n = (1/L) ∫_{−L}^{L} f(x) cos(nπx/L) dx, for n = 0, 1, 2, . . .

and

b_n = (1/L) ∫_{−L}^{L} f(x) sin(nπx/L) dx, for n = 1, 2, . . .

are the Fourier coefficients of f on [−L, L].

2. The series

(1/2)a₀ + Σ_{n=1}^{∞} [a_n cos(nπx/L) + b_n sin(nπx/L)]

is the Fourier series of f on [−L, L] when the constants are chosen to be the Fourier coefficients of f on [−L, L].
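Definition 14.1 translates directly into a short computer algebra routine. The SymPy sketch below (an illustrative implementation; the helper name `fourier_coefficients` is ours, not the text's) computes the first few coefficients, and applying it to f(x) = x on [−π, π] reproduces the values worked out by hand in the next example:

```python
import sympy as sp

x = sp.symbols('x')

def fourier_coefficients(f, L, nmax):
    """a0 and the lists [a1..a_nmax], [b1..b_nmax] of Definition 14.1."""
    a0 = sp.integrate(f, (x, -L, L)) / L
    a = [sp.integrate(f * sp.cos(n*sp.pi*x/L), (x, -L, L)) / L
         for n in range(1, nmax + 1)]
    b = [sp.integrate(f * sp.sin(n*sp.pi*x/L), (x, -L, L)) / L
         for n in range(1, nmax + 1)]
    return a0, a, b

# f(x) = x on [-pi, pi]: a0 = a_n = 0 and b_n = 2(-1)^{n+1}/n
a0, a, b = fourier_coefficients(x, sp.pi, 3)
```

This gives a₀ = 0, a₁ = a₂ = a₃ = 0, and b₁ = 2, b₂ = −1, b₃ = 2/3.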
EXAMPLE 14.1

Let f(x) = x for −π ≤ x ≤ π. We will write the Fourier series of f on [−π, π]. The coefficients are:

a₀ = (1/π) ∫_{−π}^{π} x dx = 0,

a_n = (1/π) ∫_{−π}^{π} x cos(nx) dx = (1/π) [ (1/n²) cos(nx) + (x/n) sin(nx) ]_{−π}^{π} = 0,

and

b_n = (1/π) ∫_{−π}^{π} x sin(nx) dx = (1/π) [ (1/n²) sin(nx) − (x/n) cos(nx) ]_{−π}^{π}
= −(2/n) cos(nπ) = (2/n)(−1)^{n+1},

since cos(nπ) = (−1)ⁿ if n is an integer. The Fourier series of x on [−π, π] is therefore

Σ_{n=1}^{∞} (2/n)(−1)^{n+1} sin(nx).

In this example the constant term and cosine coefficients are all zero, and the Fourier series contains only sine terms.
EXAMPLE 14.2

Let

f(x) = { 0 for −3 ≤ x < 0, x for 0 ≤ x ≤ 3. }
Here L = 3 and the Fourier coefficients are:

a₀ = (1/3) ∫_{−3}^{3} f(x) dx = (1/3) ∫₀^{3} x dx = 3/2,

a_n = (1/3) ∫_{−3}^{3} f(x) cos(nπx/3) dx = (1/3) ∫₀^{3} x cos(nπx/3) dx
= (1/3) [ (9/(n²π²)) cos(nπx/3) + (3x/(nπ)) sin(nπx/3) ]₀^{3}
= (3/(n²π²)) [(−1)ⁿ − 1],

and

b_n = (1/3) ∫_{−3}^{3} f(x) sin(nπx/3) dx = (1/3) ∫₀^{3} x sin(nπx/3) dx
= (1/3) [ (9/(n²π²)) sin(nπx/3) − (3x/(nπ)) cos(nπx/3) ]₀^{3}
= (3/(nπ)) (−1)^{n+1}.

The Fourier series of f on [−3, 3] is

3/4 + Σ_{n=1}^{∞} [ (3/(n²π²)) [(−1)ⁿ − 1] cos(nπx/3) + (3/(nπ)) (−1)^{n+1} sin(nπx/3) ].
Even when f(x) is fairly simple, ∫_{−L}^{L} f(x) cos(nπx/L) dx and ∫_{−L}^{L} f(x) sin(nπx/L) dx can involve considerable labor if done by hand. Use of a software package to evaluate definite integrals is highly recommended.

In these examples, we wrote the Fourier series of f, but did not claim that it equalled f(x). For most x it is not obvious what the sum of the Fourier series is. However, in some cases it is obvious that the series does not equal f(x). Consider again f(x) = x on [−π, π] in Example 14.1. At x = π and at x = −π, every term of the Fourier series is zero, even though f(π) = π and f(−π) = −π. Even for very simple functions, then, there may be points where the Fourier series does not converge to f(x). Shortly we will determine the sum of the Fourier series of a function. Until this is done, we do not know the relationship between the Fourier series and the function itself.
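Following the text's suggestion to let software do the integration, here is a SymPy sketch (SymPy being one possible package) that recomputes the coefficients of Example 14.2 from a piecewise definition of f:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Piecewise((0, x < 0), (x, True))   # the function of Example 14.2 on [-3, 3]

a0 = sp.integrate(f, (x, -3, 3)) / 3
a = [sp.integrate(f * sp.cos(n*sp.pi*x/3), (x, -3, 3)) / 3 for n in (1, 2, 3)]
b = [sp.integrate(f * sp.sin(n*sp.pi*x/3), (x, -3, 3)) / 3 for n in (1, 2, 3)]
```

The output agrees with the hand computation: a₀ = 3/2, a₁ = −6/π², a₂ = 0, b₁ = 3/π, b₂ = −3/(2π), and so on.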
14.2.1 Even and Odd Functions

Sometimes we can save some work in computing Fourier coefficients by observing special properties of f(x).
DEFINITION 14.2

Even Function: f is an even function on [−L, L] if f(−x) = f(x) for −L ≤ x ≤ L.

Odd Function: f is an odd function on [−L, L] if f(−x) = −f(x) for −L ≤ x ≤ L.

For example, x², x⁴, cos(nπx/L), and e^{−|x|} are even functions on any interval [−L, L]. Graphs of y = x² and y = cos(5πx/3) are given in Figure 14.1. The graph of such a function for −L ≤ x ≤ 0 is the reflection across the y-axis of the graph for 0 ≤ x ≤ L (Figure 14.2).

The functions x, x³, x⁵, and sin(nπx/L) are odd functions on any interval [−L, L]. Graphs of y = x, y = x³, and y = sin(5πx/2) are shown in Figure 14.3. The graph of an odd function for −L ≤ x ≤ 0 is the reflection across the vertical axis, and then across the horizontal axis, of the graph for 0 ≤ x ≤ L (Figure 14.4). If f is odd, then f(0) = 0, since

f(0) = f(−0) = −f(0).

Of course, "most" functions are neither even nor odd. For example, f(x) = eˣ is not even or odd on any interval [−L, L].
FIGURE 14.1 Graphs of even functions y = x² and y = cos(5πx/3).
FIGURE 14.2 Graph of a typical even function, symmetric about the y-axis.
FIGURE 14.3 Graphs of odd functions y = x, y = x³, and y = sin(5πx/2).
FIGURE 14.4 Graph of a typical odd function, symmetric through the origin.
Even and odd functions behave as follows under multiplication:

even · even = even, odd · odd = even, and even · odd = odd.

For example, x² cos(nπx/L) is an even function (product of two even functions); x² sin(nπx/L) is odd (product of an even function with an odd function); and x³ sin(nπx/L) is even (product of two odd functions).

Now recall from calculus that

∫_{−L}^{L} f(x) dx = 0 if f is odd on [−L, L]

and

∫_{−L}^{L} f(x) dx = 2 ∫₀^{L} f(x) dx if f is even on [−L, L].

These integrals are suggested by Figures 14.2 and 14.4. In Figure 14.4, f is odd on [−L, L], and the "area" bounded by the graph and the horizontal axis for −L ≤ x ≤ 0 is exactly the negative of that bounded by the graph and the horizontal axis for 0 ≤ x ≤ L. This makes ∫_{−L}^{L} f(x) dx = 0. In Figure 14.2, where f is even, the area to the left of the vertical axis, for −L ≤ x ≤ 0, equals that to the right, for 0 ≤ x ≤ L, so ∫_{−L}^{L} f(x) dx = 2 ∫₀^{L} f(x) dx.

One ramification of these ideas for Fourier coefficients is that, if f is an even or odd function, then some of the Fourier coefficients can be seen immediately to be zero, and we need not carry out the integrations explicitly. We saw this in Example 14.1 with f(x) = x, which is an odd function on [−π, π]. There we found that the cosine coefficients were all zero, since x cos(nx) is an odd function.
EXAMPLE 14.3

We will find the Fourier series of f(x) = x⁴ on [−1, 1]. Since f is an even function, x⁴ sin(nπx) is odd and we know immediately that all the sine coefficients b_n are zero. For the other coefficients, compute:

a₀ = ∫_{−1}^{1} x⁴ dx = 2 ∫₀^{1} x⁴ dx = 2/5,

and

a_n = ∫_{−1}^{1} x⁴ cos(nπx) dx = 2 ∫₀^{1} x⁴ cos(nπx) dx = 8 (−1)ⁿ (n²π² − 6)/(n⁴π⁴).

The Fourier series of x⁴ on [−1, 1] is

1/5 + Σ_{n=1}^{∞} 8 (−1)ⁿ (n²π² − 6)/(n⁴π⁴) cos(nπx).
To again make the point about convergence, notice that f(0) = 0 in this example, but the Fourier series at x = 0 is

1/5 + Σ_{n=1}^{∞} 8 (−1)ⁿ (n²π² − 6)/(n⁴π⁴).

It is not clear whether or not this series sums to the function value 0.
EXAMPLE 14.4

Let f(x) = x³ for −4 ≤ x ≤ 4. Because f is odd on [−4, 4], its Fourier cosine coefficients are all zero. Its Fourier sine coefficients are

b_n = (1/4) ∫_{−4}^{4} x³ sin(nπx/4) dx = (1/2) ∫₀^{4} x³ sin(nπx/4) dx
= (−1)^{n+1} [ 128/(nπ) − 768/(n³π³) ].

The Fourier series of x³ on [−4, 4] is

Σ_{n=1}^{∞} (−1)^{n+1} [ 128/(nπ) − 768/(n³π³) ] sin(nπx/4).
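The closed form for b_n above can be double-checked symbolically. A SymPy sketch (one way to carry out the check; the package choice is ours):

```python
import sympy as sp

x = sp.symbols('x')
# b_n = (1/2) * integral_0^4 of x^3 sin(n pi x / 4) dx, using the oddness of x^3
b = [sp.Rational(1, 2) * sp.integrate(x**3 * sp.sin(n*sp.pi*x/4), (x, 0, 4))
     for n in (1, 2, 3)]
```

For n = 1, 2, 3 these values match (−1)^{n+1}[128/(nπ) − 768/(n³π³)] exactly.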
We will make use of this discussion later, so here is a summary of its conclusions:

If f is even on [−L, L], then its Fourier series on this interval is

(1/2)a₀ + Σ_{n=1}^{∞} a_n cos(nπx/L),   (14.9)

in which

a_n = (2/L) ∫₀^{L} f(x) cos(nπx/L) dx for n = 0, 1, 2, . . . .   (14.10)

If f is odd on [−L, L], then its Fourier series on this interval is

Σ_{n=1}^{∞} b_n sin(nπx/L),   (14.11)

where

b_n = (2/L) ∫₀^{L} f(x) sin(nπx/L) dx for n = 1, 2, . . . .   (14.12)
PROBLEMS

In each of Problems 1 through 12, find the Fourier series of the function on the interval.

1. f(x) = 4, −3 ≤ x ≤ 3
2. f(x) = −x, −1 ≤ x ≤ 1
3. f(x) = cosh(πx), −1 ≤ x ≤ 1
4. f(x) = 1 − |x|, −2 ≤ x ≤ 2
5. f(x) = { … for −π < x < 0, … for 0 ≤ x ≤ π }
6. f(x) = sin(2x), −π ≤ x ≤ π
7. f(x) = x² − x + 3, −2 ≤ x ≤ 2
8. f(x) = { 1 − x for −5 ≤ x < 0, 1 + x for 0 ≤ x ≤ 5 }
9. f(x) = { … for −π < x < 0, … for 0 ≤ x ≤ π }
10. f(x) = cos(x/2) − sin(x), −π ≤ x ≤ π
11. f(x) = cos(x), −3 ≤ x ≤ 3
12. f(x) = { … for −1 < x < 0, … for 0 ≤ x ≤ 1 }
13. Suppose f and g are integrable on [−L, L] and that f(x) = g(x) except for x = x₀, a given point in the interval. How are the Fourier series of f and g related? What does this suggest about the relationship between a function and its Fourier series on an interval?
14. Prove that ∫_{−L}^{L} f(x) dx = 0 if f is odd on [−L, L].
15. Prove that ∫_{−L}^{L} f(x) dx = 2 ∫₀^{L} f(x) dx if f is even on [−L, L].
14.3 Convergence of Fourier Series

It is one thing to be able to write the Fourier coefficients of a function f on an interval [−L, L]. This requires only existence of ∫_{−L}^{L} f(x) cos(nπx/L) dx and ∫_{−L}^{L} f(x) sin(nπx/L) dx. It is another issue entirely to determine whether the resulting Fourier series converges to f(x), or even that it converges at all! The subtleties of this question were dramatized in 1873, when the French mathematician Paul Du Bois-Reymond gave an example of a function which is continuous on (−π, π), but whose Fourier series fails to converge at a point of this interval. However, the obvious utility of Fourier series in solving partial differential equations led in the nineteenth century to an intensive effort to determine their convergence properties. About 1829, Peter Gustav Lejeune-Dirichlet gave conditions on the function f which were sufficient for convergence of the Fourier series of f. Further, Dirichlet's theorem actually gave the sum of the Fourier series at each point, whether or not this sum is f(x).

This section is devoted to conditions on a function that enable us to determine the sum of its Fourier series on an interval. These conditions center about the concept of piecewise continuity.
DEFINITION 14.3 Piecewise Continuous Function

Let f(x) be defined on [a, b], except possibly at finitely many points. Then f is piecewise continuous on [a, b] if:
1. f is continuous on [a, b] except perhaps at finitely many points.
2. Both lim_{x→a+} f(x) and lim_{x→b-} f(x) exist and are finite.
3. If x₀ is in (a, b) and f is not continuous at x₀, then lim_{x→x₀+} f(x) and lim_{x→x₀-} f(x) exist and are finite.
Figures 14.5 and 14.6 show graphs of typical piecewise continuous functions. At a point of discontinuity (and we assume there are only finitely many), the function must have finite one-sided limits. This means that the graph experiences at worst a finite gap at a discontinuity. Points where these occur are called jump discontinuities of the function. As an example of a simple function that is not piecewise continuous, let

f(x) = { 0 for x = 0; 1/x for 0 < x ≤ 1 }
CHAPTER 14 Fourier Series
FIGURE 14.5 A piecewise continuous function.

FIGURE 14.6 Graph of a typical piecewise continuous function.
Then f is continuous on (0, 1] and discontinuous at 0. However, lim_{x→0+} f(x) = ∞, so the discontinuity is not a finite jump discontinuity, and f is not piecewise continuous on [0, 1].
EXAMPLE 14.5

Let

f(x) = { 5 for x = -π; x for -π < x < 1; 1 - x² for 1 ≤ x < 2; 4 for 2 ≤ x ≤ π }
A graph of f is shown in Figure 14.7. This function is discontinuous at -π, and

lim_{x→-π+} f(x) = -π.

f is also discontinuous at 1, interior to [-π, π], and

lim_{x→1-} f(x) = 1 and lim_{x→1+} f(x) = 0.

Finally, f is discontinuous at 2, and

lim_{x→2-} f(x) = -3 and lim_{x→2+} f(x) = 4.

At each point of discontinuity interior to the interval, the function has finite one-sided limits from both sides. At the point of discontinuity at the end point -π, the function has a finite limit
FIGURE 14.7 Graph of the function of Example 14.5.
from within the interval. In this example the other end point is not an issue, since f is continuous (from the left) there. Therefore f is piecewise continuous on [-π, π].

We will use the following notation for left and right limits of a function at a point:

f(x₀+) = lim_{x→x₀+} f(x) and f(x₀-) = lim_{x→x₀-} f(x).

In Example 14.5, f(1-) = 1, f(1+) = 0, f(2-) = -3, and f(2+) = 4. At the end points of an interval we can still use this notation, except that at the left end point we consider only the right limit (from inside the interval), and at the right end point we use only the left limit (again, so that the limit is taken from within the interval). Again referring to Example 14.5,

f(-π+) = -π and f(π-) = 4.
A piecewise smooth function is one that is continuous except possibly for finitely many jump discontinuities, and has a continuous derivative at all but finitely many points, where the derivative may not exist but must have finite one-sided limits.
EXAMPLE 14.6

Let

f(x) = { 1 for -4 ≤ x ≤ 1; -2x for 1 < x ≤ 2; 9e^{-x} for 2 < x ≤ 3 }

Figure 14.8 shows a graph of f. The function is continuous except for finite jump discontinuities at 1 and 2. Therefore f is piecewise continuous on [-4, 3]. The derivative of f is

f′(x) = { 0 for -4 < x < 1; -2 for 1 < x < 2; -9e^{-x} for 2 < x < 3 }

The derivative is continuous on [-4, 3] except at the points of discontinuity 1 and 2 of f, where f′(x) does not exist. However, at these points f′(x) has finite one-sided limits. Thus f′ is piecewise continuous on [-4, 3], so f is piecewise smooth.

As suggested by Figure 14.8, a piecewise smooth function is one that has a continuous tangent at all but finitely many points. We will now state our first convergence theorem.
FIGURE 14.8 Graph of f(x) = { 1 for -4 ≤ x ≤ 1; -2x for 1 < x ≤ 2; 9e^{-x} for 2 < x ≤ 3 }.
THEOREM 14.1 Convergence of Fourier Series

Let f be piecewise smooth on [-L, L]. Then for -L < x < L, the Fourier series of f on [-L, L] converges to

(1/2)(f(x+) + f(x-)). ∎

This means that, at each point between -L and L, the series converges to the average of the left and right limits of the function there. If f is continuous at x, then these left and right limits both equal f(x), so the Fourier series converges to the function value at x. If f has a jump discontinuity at x, then the Fourier series may not converge to f(x), but will converge to the point midway between the ends of the gap in the graph at x (Figure 14.9).

FIGURE 14.9 Convergence of a Fourier series at a jump discontinuity.
EXAMPLE 14.7

Let

f(x) = { 5 sin(x) for -2π < x < -π/2; 4 for x = -π/2; x² for -π/2 < x < 2; 8 cos(x) for 2 < x < π; 4x for π < x < 2π }
FIGURE 14.10 Graph of f(x) = { 5 sin(x) for -2π < x < -π/2; 4 for x = -π/2; x² for -π/2 < x < 2; 8 cos(x) for 2 < x < π; 4x for π < x < 2π }.
A graph of f is given in Figure 14.10. Since f is piecewise smooth on [-2π, 2π], we can determine the sum of its Fourier series on this interval. In applying the theorem, we do not actually have to compute the Fourier series; it is not necessary to do so to determine its sum.

For -2π < x < -π/2, f is continuous and the Fourier series converges to f(x) = 5 sin(x). At x = -π/2, f has a jump discontinuity, and the Fourier series will converge to the average of the left and right limits of f(x) at -π/2. Compute

f(-π/2-) = lim_{x→-π/2-} f(x) = lim_{x→-π/2-} 5 sin(x) = 5 sin(-π/2) = -5

and

f(-π/2+) = lim_{x→-π/2+} f(x) = lim_{x→-π/2+} x² = π²/4.

Therefore, at x = -π/2, the Fourier series of f converges to

(1/2)(π²/4 - 5).

On (-π/2, 2) the function is continuous, so the Fourier series converges to x² for -π/2 < x < 2. At x = 2 the function has another jump discontinuity. Compute

f(2-) = lim_{x→2-} x² = 4
and

f(2+) = lim_{x→2+} 8 cos(x) = 8 cos(2).

At x = 2 the Fourier series converges to

(1/2)(4 + 8 cos(2)).

On (2, π), f is continuous. At each x with 2 < x < π, the Fourier series converges to f(x) = 8 cos(x). At x = π, f has a jump discontinuity. Compute

f(π-) = lim_{x→π-} 8 cos(x) = 8 cos(π) = -8

and

f(π+) = lim_{x→π+} 4x = 4π.

At x = π the Fourier series of f converges to

(1/2)(4π - 8).
Finally, on (π, 2π), f is continuous and the Fourier series converges to f(x) = 4x. These conclusions can be summarized: the Fourier series converges to

5 sin(x) for -2π < x < -π/2,
(1/2)(π²/4 - 5) for x = -π/2,
x² for -π/2 < x < 2,
(1/2)(4 + 8 cos(2)) for x = 2,
8 cos(x) for 2 < x < π,
(1/2)(4π - 8) for x = π,
4x for π < x < 2π.
Figure 14.11 shows a graph of this sum of the Fourier series, which differs from the function itself on (-2π, 2π) only at the jump discontinuities, where the series converges to the average of the left and right limits. ∎

FIGURE 14.11 Graph of the Fourier series of the function of Figure 14.10.
If f is piecewise smooth on [-L, L] and actually continuous on [-L, L], then the Fourier series converges to f(x) for -L < x < L .
EXAMPLE 14.8

Let

f(x) = { x for -2 ≤ x ≤ 1; 2 - x for 1 < x ≤ 2 }

Then f is continuous on [-2, 2] (Figure 14.12). f is differentiable except at x = 1, where f′(x) has finite left and right limits, so f is piecewise smooth. For -2 < x < 2, the Fourier series of f converges to f(x). In this example the Fourier series is an exact representation of the function on (-2, 2).

FIGURE 14.12 Graph of f(x) = { x for -2 ≤ x ≤ 1; 2 - x for 1 < x ≤ 2 }.
14.3.1 Convergence at the End Points

Theorem 14.1 does not address convergence of a Fourier series at the end points of the interval. There is a subtlety here that we will now discuss. The problem is that, while the function f of interest may be defined only on [-L, L], its Fourier series

(1/2)a₀ + Σ_{n=1}^∞ [aₙ cos(nπx/L) + bₙ sin(nπx/L)]   (14.13)

is defined for all real x for which the series converges. Further, the Fourier series is periodic of period 2L: the value of the series is unchanged if x is replaced with x + 2L. How do we reconcile representing a function that is defined only on an interval by a function that is periodic and may be defined over the entire real line?

The reconciliation lies in a periodic extension of f over the real line. Take the graph of f(x) on [-L, L) and replicate it over successive intervals of length 2L. This defines a new function f_p that agrees with f(x) for -L ≤ x < L and has period 2L. This process is illustrated in Figure 14.13 for the function f(x) = x² for -2 ≤ x < 2. This graph is simply repeated for
FIGURE 14.13 Part of the periodic extension, of period 4, of f(x) = x² for -2 ≤ x < 2.
2 ≤ x < 6, 6 ≤ x < 10, . . . , -6 ≤ x < -2, -10 ≤ x < -6, and so on. The reason for using the half-open interval [-L, L) in this extension is that, if f_p is to have period 2L, then

f_p(x + 2L) = f_p(x)

for all x. But this requires that f_p(-L) = f_p(-L + 2L) = f_p(L), so once f_p(-L) is defined, f_p(L) must equal this value.

If we make this extension, then the convergence theorem applies to f_p(x) at all x. In particular, at -L, the series converges to

(1/2)(f_p(-L-) + f_p(-L+)),

which is

(1/2)(f(L-) + f(-L+)).

Similarly, at L, the Fourier series converges to

(1/2)(f_p(L-) + f_p(L+)),

which is

(1/2)(f(L-) + f(-L+)).
The Fourier series converges to the same value at both L and at -L. This can also be seen directly from the series (14.13). If x = L, all the sine terms are sin(nπ), which vanish, and the cosine terms are cos(nπ), so the series at x = L is

(1/2)a₀ + Σ_{n=1}^∞ aₙ cos(nπ).

At x = -L, again all the sine terms vanish, and because cos(-nπ) = cos(nπ), the series at x = -L is also

(1/2)a₀ + Σ_{n=1}^∞ aₙ cos(nπ).
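The equal values at the two end points can be observed numerically. A minimal sketch (standard-library Python; the choice of f(x) = x² on [-2, 2] and the cutoff N are ours), using the known Fourier series 4/3 + (16/π²) Σ (-1)ⁿ/n² cos(nπx/2) of x² on [-2, 2]:

```python
import math

# Fourier series of f(x) = x^2 on [-2, 2]:
#   4/3 + (16/pi^2) * sum_{n>=1} (-1)^n / n^2 * cos(n*pi*x/2)
def partial_sum(x, N):
    s = 4.0 / 3.0
    for n in range(1, N + 1):
        s += (16 / math.pi**2) * (-1) ** n / n**2 * math.cos(n * math.pi * x / 2)
    return s

at_L = partial_sum(2.0, 4000)
at_minus_L = partial_sum(-2.0, 4000)
# both end points give (1/2)(f(2-) + f(-2+)) = (1/2)(4 + 4) = 4
```

Since cosine is even, the partial sums at x = 2 and x = -2 are identical, and both approach 4, the common value predicted by the periodic-extension argument.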
14.3.2 A Second Convergence Theorem

A second convergence theorem can be framed in terms of one-sided derivatives.
DEFINITION 14.5 Right Derivative

Suppose f(x) is defined at least for c < x < c + r for some positive number r. Suppose f(c+) is finite. Then the right derivative of f at c is

f_r′(c) = lim_{h→0+} (f(c + h) - f(c+))/h,

if this limit exists and is finite.
DEFINITION 14.6 Left Derivative

Suppose f(x) is defined at least for c - r < x < c for some positive number r. Suppose f(c-) is finite. Then the left derivative of f at c is

f_l′(c) = lim_{h→0-} (f(c + h) - f(c-))/h,

if this limit exists and is finite.

If f′(c) exists, then f is continuous at c, so f(c-) = f(c+) = f(c), and in this case the left and right derivatives are both equal to f′(c). However, Figure 14.14 shows the significance of the left and right derivatives when f has a jump discontinuity at c. The left derivative is the slope of the graph at x = c if we cover up the part of the graph to the right of c and keep only the left side. The right derivative is the slope at c if we cover up the left part and keep just the right part.
FIGURE 14.14 One-sided derivatives as slopes from the right or left.
FIGURE 14.15 Graph of f(x) = { 1 + x for -π < x ≤ 1; x² for 1 < x < π }.
EXAMPLE 14.9

Let

f(x) = { 1 + x for -π < x ≤ 1; x² for 1 < x < π }

Then f is continuous on (-π, π) except at 1, where there is a jump discontinuity (Figure 14.15). Further, f is differentiable except at this point of discontinuity. Indeed,

f′(x) = { 1 for -π < x < 1; 2x for 1 < x < π }

From the graph and the slopes of the "left and right pieces" at x = 1, we would expect the left derivative at x = 1 to be 1, and the right derivative to be 2. Check this from the definitions. First,

f_l′(1) = lim_{h→0-} (f(1 + h) - f(1-))/h = lim_{h→0-} (1 + (1 + h) - 2)/h = lim_{h→0-} h/h = 1,

and

f_r′(1) = lim_{h→0+} (f(1 + h) - f(1+))/h = lim_{h→0+} ((1 + h)² - 1)/h = lim_{h→0+} (2 + h) = 2.
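The two difference quotients above can be evaluated numerically. A quick sketch (standard-library Python; the step size h is an arbitrary choice of ours), using the one-sided limits f(1-) = 2 and f(1+) = 1 from this example:

```python
# f from Example 14.9: 1 + x on (-pi, 1], x^2 on (1, pi).
def f(x):
    return 1 + x if x <= 1 else x * x

h = 1e-6
# right derivative quotient uses the right limit f(1+) = 1
right = (f(1 + h) - 1) / h
# left derivative quotient uses the left limit f(1-) = 2, with h -> 0-
left = (f(1 - h) - 2) / (-h)
```

The quotients come out close to the exact one-sided derivatives 2 and 1 computed in the example.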
Using the one-sided derivatives, we can state the following convergence theorem.

THEOREM 14.2

Let f be piecewise continuous on [-L, L]. Then,
1. If -L < x < L and f has a left and right derivative at x, then the Fourier series of f on [-L, L] converges at x to

(1/2)(f(x+) + f(x-)).

2. If f_r′(-L) and f_l′(L) exist, then at both L and -L, the Fourier series of f on [-L, L] converges to

(1/2)(f(-L+) + f(L-)).

As with the first convergence theorem, we need not compute the Fourier series to determine its sum.
EXAMPLE 14.10

Let

f(x) = { e^{-x} for -2 ≤ x < 1; -2x² for 1 ≤ x < 2; 4 for x = 2 }

We want to determine the sum of the Fourier series of f on [-2, 2]. A graph of f is shown in Figure 14.16. f is piecewise continuous, being continuous except for jump discontinuities at 1 and 2. For -2 < x < 1, f is continuous, and the Fourier series converges to f(x) = e^{-x}. For 1 < x < 2, f is also continuous and the Fourier series converges to f(x) = -2x².

At the jump discontinuity x = 1, the left and right derivatives exist (-e^{-1} and -4, respectively). We can determine these from the limits in the definitions, but these derivatives are also apparent from looking at the graph of f to the right and left of 1. Therefore the Fourier series converges at x = 1 to

(1/2)(f(1-) + f(1+)),

which is

(1/2)(e^{-1} - 2).

FIGURE 14.16 Graph of f(x) = { e^{-x} for -2 ≤ x < 1; -2x² for 1 ≤ x < 2; 4 for x = 2 }.
This takes care of each point in (-2, 2). Now consider the end points. The left derivative of f at 2 is -8, and the right derivative at -2 is -e². Therefore, at both 2 and at -2, the Fourier series converges to

(1/2)(f(2-) + f(-2+)) = (1/2)(-8 + e²).

Figure 14.17 shows a graph of the sum of the Fourier series on [-2, 2], which can be compared with the graph of f. The two graphs agree except at the end points and at the jump discontinuity. The fact that f(2) = 4 does not affect convergence of the Fourier series of f(x) at x = 2.

FIGURE 14.17 Graph of the Fourier series of the function of Figure 14.16.
A note of caution is warranted in applying the second convergence theorem. The left and right derivatives of a function at a point are relevant only to verify that the hypotheses of the theorem are satisfied at a jump discontinuity of the function. However, these derivatives play no role in the value to which the Fourier series converges at a point. That value involves the left and right limits of the function itself.

14.3.3 Partial Sums of Fourier Series

Fourier's claims for his series were counterintuitive in the sense that functions such as polynomials and exponentials do not seem to be likely candidates for representation by series of sines and cosines. It is instructive to watch graphs of partial sums of some Fourier series converge to the graph of the function.
EXAMPLE 14.11

Let f(x) = x for -π < x < π. We saw in Example 14.1 that the Fourier series is

Σ_{n=1}^∞ (-1)^{n+1} (2/n) sin(nx).
FIGURE 14.18(a) Fourth partial sum S₄(x) = Σ_{n=1}^4 (-1)^{n+1} (2/n) sin(nx) of the Fourier series of f(x) = x on [-π, π].

FIGURE 14.18(b) Tenth partial sum of the Fourier series of f(x) = x on [-π, π].

FIGURE 14.18(c) Twentieth partial sum of the Fourier series of f(x) = x on [-π, π].
We can apply either convergence theorem to show that this series converges to

x for -π < x < π, and 0 for x = π and for x = -π.

Figures 14.18(a), (b), and (c) show, respectively, the fourth, tenth, and twentieth partial sums of this series, suggesting how they approach nearer to f(x) = x on (-π, π) as more terms are included.
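The behavior shown in Figure 14.18 can be reproduced without plotting. A small sketch (standard-library Python; the evaluation points and cutoffs are our own choices) evaluates the partial sums of this series at an interior point and at the end point π:

```python
import math

# S_N(x) = sum_{n=1}^N 2*(-1)^(n+1)/n * sin(n*x),
# the Fourier series of f(x) = x on (-pi, pi)
def S(x, N):
    return sum(2 * (-1) ** (n + 1) / n * math.sin(n * x) for n in range(1, N + 1))

approx = S(1.0, 20000)   # converges (slowly) toward x = 1
at_pi = S(math.pi, 50)   # each term is sin(n*pi) = 0, up to rounding
```

At x = π every term vanishes, consistent with the sum 0 = (1/2)(π + (-π)) given by the convergence theorem; at x = 1 the partial sums creep toward 1 at a rate of roughly 1/N.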
EXAMPLE 14.12

Let f(x) = eˣ for -1 ≤ x ≤ 1. The Fourier series of f on [-1, 1] is

sinh(1) + Σ_{n=1}^∞ [2 sinh(1)(-1)ⁿ/(1 + n²π²)] [cos(nπx) - nπ sin(nπx)].

Figures 14.19(a) and (b) show the tenth and thirtieth partial sums of this series, compared with the graph of f.
FIGURE 14.19(a) Tenth partial sum of the Fourier series of f(x) = eˣ on [-1, 1].

FIGURE 14.19(b) Thirtieth partial sum of the Fourier series of f(x) = eˣ on [-1, 1].
EXAMPLE 14.13

Let f(x) = sin(x) for -1 ≤ x ≤ 1. The Fourier series of f on [-1, 1] is

Σ_{n=1}^∞ [2nπ sin(1)(-1)^{n+1}/(n²π² - 1)] sin(nπx).

This series converges to

sin(x) for -1 < x < 1, and 0 for x = 1 and for x = -1.

Figures 14.20(a) and (b) show two partial sums of this series compared with the graph of f.
14.3.4 The Gibbs Phenomenon

In 1887 the Michelson-Morley experiment revolutionized physics and helped pave the way for Einstein's theory of special relativity. In a brilliant experiment using their adaptation of the interferometer, Michelson and Morley showed by careful measurements that the postulated "ether," which physicists at that time believed permeated all of space, had no effect on the velocity of light as seen from different directions.
FIGURE 14.20(a) Fourth partial sum of the Fourier series of f(x) = sin(x) for -1 ≤ x ≤ 1.

FIGURE 14.20(b) Tenth partial sum of the Fourier series of f(x) = sin(x) for -1 ≤ x ≤ 1.
Some years later Michelson was testing a mechanical device he had invented for computing Fourier coefficients and for constructing a function from its Fourier coefficients. In one test he used eighty Fourier coefficients for the function f(x) = x for -π < x < π. The machine responded with a graph having unexpected jumps near the end points π and -π. At first Michelson assumed that there was some problem with his machine. Eventually, however, it was found that this behavior is characteristic of Fourier series at jump discontinuities of the function. This became known as the Gibbs phenomenon, after the Yale mathematician Josiah Willard Gibbs, who was the first to satisfactorily define and explain it. The phenomenon had also been noticed some sixty years before by the English mathematician Wilbraham, who was, however, unable to analyze it.

To illustrate the phenomenon, consider the function defined by

f(x) = { -π/4 for -π < x < 0; 0 for x = 0; π/4 for 0 < x ≤ π }

Figure 14.21 shows a graph of this function, whose Fourier series is

Σ_{n=1}^∞ [1/(2n - 1)] sin((2n - 1)x).

By either convergence theorem this series converges to f(x) for -π < x < π. There is a jump discontinuity at 0, but

(1/2)(f(0+) + f(0-)) = (1/2)(π/4 - π/4) = 0 = f(0).
FIGURE 14.21 Function illustrating the Gibbs phenomenon.
The Nth partial sum of this Fourier series is

S_N(x) = Σ_{n=1}^N [1/(2n - 1)] sin((2n - 1)x),
and Figure 14.22 shows graphs of S₅(x), S₁₄(x), and S₂₂(x). Each of these partial sums shows a peak near zero. Intuitively, since the partial sums approach f(x) as N → ∞, we might expect these peaks to flatten out and become smaller as N is chosen larger. But they don't. Instead, the peaks maintain roughly the same height, but move closer to the y axis as N increases. The partial sums do indeed have the function as a limit, but not in quite the way mathematicians expected.

As another example, consider

f(x) = { 0 for -2 < x < 0; 2 - x for 0 < x < 2 }

FIGURE 14.22 Partial sums (for 0 < x < π/4) showing the Gibbs phenomenon for the function of Figure 14.21.

FIGURE 14.23 Fourth, tenth, and twenty-fifth partial sums of the Fourier series of f(x) = { 0 for -2 < x < 0; 2 - x for 0 < x < 2 }.
This function has a jump discontinuity at 0, and Fourier series

1/2 + Σ_{n=1}^∞ [ (2/(n²π²))(1 - (-1)ⁿ) cos(nπx/2) + (2/(nπ)) sin(nπx/2) ].

Figure 14.23 shows the fourth, tenth, and twenty-fifth partial sums of this series. Again, the Gibbs phenomenon shows up at the jump discontinuity. Gibbs showed that this behavior occurs in the Fourier series of a function at every point where it has a jump discontinuity.
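The persistence of the overshoot is easy to see numerically. A rough sketch (standard-library Python; the grid density and thresholds are arbitrary choices of ours) computes the maximum of the partial sums S_N near the jump at 0 for two values of N:

```python
import math

# Partial sums S_N(x) = sum_{n=1}^N sin((2n-1)x)/(2n-1) of the square wave
# equal to -pi/4 on (-pi, 0) and pi/4 on (0, pi).
def S(x, N):
    return sum(math.sin((2 * n - 1) * x) / (2 * n - 1) for n in range(1, N + 1))

def overshoot(N, pts=1000):
    # maximum of S_N on a fine grid over (0, pi/2]
    return max(S(k * 0.5 * math.pi / pts, N) for k in range(1, pts + 1))

peak_50 = overshoot(50)
peak_200 = overshoot(200)
# the peak does not flatten as N grows; both exceed pi/4 ~ 0.785 by about 9%
```

Both peaks sit near 0.926 rather than shrinking toward the jump value π/4 ≈ 0.785; increasing N only moves the peak closer to the y axis, exactly as the text describes.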
In each of Problems 1 through 10, use a convergence theorem to determine the sum of the Fourier series of the function on the interval. Whichever theorem is used, verify that the hypotheses are satisfied, assuming familiar facts from calculus about continuous and differentiable functions. It is not necessary to write the series itself to do this. Next, find the Fourier series of the function and graph f and, for N = 5, 10, 15, 25, the Nth partial sum of the series, together with the function on the interval. Point out any places where the Gibbs phenomenon is apparent in these graphs.

1. f(x) = { 2x for -3 < x < -2; 0 for -2 < x < 1; x² for 1 < x < 3 }

2. f(x) = x² for -2 < x < 2

3. f(x) = x²eˣ for -3 < x < 3

4. f(x) = { 2x - 2 for -π < x < 1; 3 for 1 < x < π }

5. f(x) = { x² for -4 < x < 0; 2 for 0 < x < 4 }

6. f(x) = { cos(x) for -2 < x < 0; sin(x) for 0 < x < 2 }

7. f(x) = { -1 for -π < x < 0; 1 for 0 < x < π }

8. f(x) = { 0 for -1 < x < 1/3; 1 for 1/3 < x < 3/4; 2 for 3/4 < x < 1 }

9. f(x) = e^{|x|} for -π < x < π

10. f(x) = { -2 for -4 < x < -2; 1 + x² for -2 < x < 2; 0 for 2 < x < 4 }

11. Let f(x) = x²/2 for -π < x < π. Find the Fourier series of f(x) and evaluate it at an appropriately chosen value of x to sum the series Σ_{n=1}^∞ 1/n².

12. Use the Fourier series of Problem 11 to sum the series Σ_{n=1}^∞ (-1)ⁿ/n².
14.4 Fourier Cosine and Sine Series

If f(x) is defined on [-L, L], we may be able to write its Fourier series. The coefficients of this series are completely determined by the function and the interval. We will now show that, if f(x) is defined on the half-interval [0, L], then we have a choice, and can write a series containing just cosines or just sines in attempting to represent f(x) on this half-interval.
14.4.1 The Fourier Cosine Series of a Function

Let f be integrable on [0, L]. We want to expand f(x) in a series of cosine functions. We already have the means to do this. Figure 14.24 shows a graph of a typical f. Fold this graph across the y-axis to obtain a function f_e defined for -L ≤ x ≤ L:

f_e(x) = { f(x) for 0 ≤ x ≤ L; f(-x) for -L ≤ x < 0 }

Then f_e is an even function,

f_e(-x) = f_e(x),

and agrees with f on [0, L]:

f_e(x) = f(x) for 0 ≤ x ≤ L.

We call f_e the even extension of f to [-L, L].

EXAMPLE 14.14

Let f(x) = eˣ for 0 ≤ x ≤ 2. Then

f_e(x) = { eˣ for 0 ≤ x ≤ 2; e^{-x} for -2 ≤ x < 0 }.

Here we put f_e(-x) = f(x) = eˣ for 0 ≤ x ≤ 2, meaning that f_e(x) = e^{-x} for -2 ≤ x < 0. A graph of f_e is given in Figure 14.25.

Because f_e is an even function on [-L, L], its Fourier series on [-L, L] is

(1/2)a₀ + Σ_{n=1}^∞ aₙ cos(nπx/L),   (14.14)

in which

aₙ = (1/L) ∫_{-L}^{L} f_e(x) cos(nπx/L) dx = (2/L) ∫_0^L f(x) cos(nπx/L) dx,   (14.15)

since f_e(x) = f(x) for 0 ≤ x ≤ L. We call the series (14.14) the Fourier cosine series of f on [0, L]. The coefficients (14.15) are the Fourier cosine coefficients of f on [0, L].

The even extension f_e was introduced only to be able to make use of earlier work to derive a series containing just cosines. When we actually write a Fourier cosine series, we just use (14.15) to calculate the coefficients, without defining f_e.
FIGURE 14.24 Even extension of f to [-L, L].

FIGURE 14.25 Even extension of eˣ to [-2, 2] (Example 14.14).
The other point to having f_e in the background, however, is that we can use the Fourier convergence theorems to write a convergence theorem for cosine series.

THEOREM 14.3 Convergence of Fourier Cosine Series

Let f be piecewise continuous on [0, L]. Then,
1. If 0 < x < L and f has left and right derivatives at x, then the Fourier cosine series for f(x) on [0, L] converges at x to

(1/2)(f(x-) + f(x+)).

2. If f has a right derivative at 0, then the Fourier cosine series for f(x) on [0, L] converges at 0 to f(0+).
3. If f has a left derivative at L, then the Fourier cosine series for f(x) on [0, L] converges at L to f(L-).

Conclusions (2) and (3) follow from Theorem 14.2, applied to f_e. Consider first conclusion (2). The Fourier series of f_e converges at 0 to

(1/2)(f_e(0-) + f_e(0+)).

But f_e(0+) = f(0+) and

f_e(0-) = f(0+),

so at 0 the series converges to

(1/2)(f(0+) + f(0+)) = f(0+).

A similar argument proves conclusion (3).
EXAMPLE 14.15

Let f(x) = e^{2x} for 0 ≤ x ≤ 1. We will write the Fourier cosine series of f. Compute

a₀ = 2 ∫_0^1 e^{2x} dx = e² - 1

and

aₙ = 2 ∫_0^1 e^{2x} cos(nπx) dx = 4(e²(-1)ⁿ - 1)/(4 + n²π²).

The cosine expansion of f is

(1/2)(e² - 1) + Σ_{n=1}^∞ [4(e²(-1)ⁿ - 1)/(4 + n²π²)] cos(nπx).

This series converges to

e^{2x} for 0 < x < 1, 1 for x = 0, and e² for x = 1.

Thus this cosine series converges to e^{2x} for 0 ≤ x ≤ 1. Figures 14.26(a) and (b) show a graph of f compared with the fifth and tenth partial sums of this cosine expansion, respectively.
FIGURE 14.26(a) Fifth partial sum of the cosine expansion of e^{2x} on [0, 1].

FIGURE 14.26(b) Tenth partial sum of the cosine expansion of e^{2x} on [0, 1].
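The closed-form cosine coefficients from Example 14.15 can be cross-checked against direct numerical integration. A sketch (standard-library Python; the midpoint-rule resolution and tolerance are our own choices):

```python
import math

# Closed form from Example 14.15: a_n = 4*(e^2*(-1)^n - 1)/(4 + n^2 pi^2)
def a_closed(n):
    return 4 * (math.e**2 * (-1) ** n - 1) / (4 + n**2 * math.pi**2)

def a_numeric(n, m=20000):
    # midpoint rule for 2 * integral_0^1 e^{2x} cos(n pi x) dx
    h = 1.0 / m
    return 2 * h * sum(
        math.exp(2 * (k + 0.5) * h) * math.cos(n * math.pi * (k + 0.5) * h)
        for k in range(m)
    )

errs = [abs(a_closed(n) - a_numeric(n)) for n in range(1, 6)]
```

The discrepancies are at the level of the quadrature error, confirming the antiderivative computation behind (14.15) for this example.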
14.4.2 The Fourier Sine Series of a Function

By duplicating the strategy just used for writing a cosine series, except now extending f to an odd function f_o over [-L, L], we can write a Fourier sine series for f(x) on [0, L]. In particular, if f(x) is defined on [0, L], let

f_o(x) = { f(x) for 0 < x ≤ L; -f(-x) for -L ≤ x < 0 }

Then f_o is an odd function, and f_o(x) = f(x) for 0 < x ≤ L. This is the odd extension of f to [-L, L]. For example, if f(x) = e^{2x} for 0 ≤ x ≤ 1, let

f_o(x) = { e^{2x} for 0 < x ≤ 1; -e^{-2x} for -1 ≤ x < 0 }

This amounts to folding the graph of f over the vertical axis, then over the horizontal axis (Figure 14.27).

FIGURE 14.27 Odd extension of f to [-L, L].

Now write the Fourier series for f_o(x) on [-L, L]. By equations (14.11) and (14.12), the Fourier series of f_o is

Σ_{n=1}^∞ bₙ sin(nπx/L)   (14.16)

with coefficients

bₙ = (1/L) ∫_{-L}^{L} f_o(x) sin(nπx/L) dx = (2/L) ∫_0^L f(x) sin(nπx/L) dx.   (14.17)

We call the series (14.16) the Fourier sine series of f on [0, L]. The coefficients given by equation (14.17) are the Fourier sine coefficients of f on [0, L]. As with cosine series, we do not need to explicitly make the extension of f to write the Fourier sine series for f on [0, L].

Again, as with the cosine expansion, we can write a convergence theorem for sine series using the convergence theorem for Fourier series.
THEOREM 14.4 Convergence of Fourier Sine Series

Let f be piecewise continuous on [0, L]. Then,
1. If 0 < x < L and f has left and right derivatives at x, then the Fourier sine series for f(x) on [0, L] converges at x to

(1/2)(f(x-) + f(x+)).

2. At 0 and at L, the Fourier sine series for f(x) on [0, L] converges to 0.

Conclusion (2) is immediate, because each term of the sine series (14.16) is zero for x = 0 and for x = L.
EXAMPLE 14.16

Let f(x) = e^{2x} for 0 ≤ x ≤ 1. We will write the Fourier sine series of f on [0, 1]. The coefficients are

bₙ = 2 ∫_0^1 e^{2x} sin(nπx) dx = 2nπ(1 - (-1)ⁿe²)/(4 + n²π²).

The sine series is

Σ_{n=1}^∞ [2nπ(1 - (-1)ⁿe²)/(4 + n²π²)] sin(nπx).

This series converges to e^{2x} for 0 < x < 1, and to zero for x = 0 and for x = 1. Figures 14.28(a) and (b) show graphs of the tenth and fortieth partial sums of this series.
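The three convergence claims of Theorem 14.4 can be checked for this example. A sketch (standard-library Python; the cutoffs and tolerances are our own choices) evaluates partial sums of the sine series of e^{2x} on [0, 1]:

```python
import math

# Sine series of e^{2x} on [0, 1]:
#   b_n = 2 n pi (1 - (-1)^n e^2) / (4 + n^2 pi^2)   (Example 14.16)
def S(x, N):
    return sum(
        2 * n * math.pi * (1 - (-1) ** n * math.e**2) / (4 + n**2 * math.pi**2)
        * math.sin(n * math.pi * x)
        for n in range(1, N + 1)
    )

at_ends = (S(0.0, 500), S(1.0, 500))  # every term vanishes at x = 0 and x = 1
mid = S(0.5, 20000)                   # converges to e^{2*0.5} = e
```

Every partial sum is (essentially) zero at both end points, regardless of the values f(0) = 1 and f(1) = e², while at the interior point x = 1/2 the series creeps toward e.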
FIGURE 14.28(a) Tenth partial sum of the sine expansion of e^{2x} on [0, 1].

FIGURE 14.28(b) Fortieth partial sum of the sine expansion of e^{2x} on [0, 1].
In each of Problems 1 through 10, write the Fourier cosine series and the Fourier sine series of the function on the interval. Determine the sum of each series.

1. f(x) = 4, 0 < x < 3

2. f(x) = { 1 for 0 < x < 1; -1 for 1 < x < 2 }

3. f(x) = { 0 for 0 < x < π; cos(x) for π < x < 2π }

4. f(x) = 2x for 0 < x < 1

5. f(x) = x² for 0 < x < 2

6. f(x) = eˣ for 0 < x < 1

7. f(x) = { x² for 0 < x < 1; 1 for 1 < x < 4 }

8. f(x) = { -1 for 0 < x < 1; 0 for 1 < x < 3; 1 for 3 < x < 5 }

9. f(x) = { 1 for 0 < x < 2; 2 - x for 2 < x < 3 }

10. f(x) = 1 - x³ for 0 < x < 2

11. Let f(x) be defined on [-L, L]. Prove that f can be written as the sum of an even and an odd function on this interval.

12. Find all functions defined on [-L, L] that are both even and odd.

13. Find the sum of the series Σ_{n=1}^∞ (-1)ⁿ/(4n² - 1). Hint: Expand sin(x) in a cosine series on [0, π] and choose an appropriate value of x.
14.5 Integration and Differentiation of Fourier Series

In this section we will take a closer look at Fourier coefficients, and consider term-by-term differentiation and integration of Fourier series.

Differentiation of Fourier series term by term generally leads to absurd results, even for extremely well behaved functions. Consider, for example, f(x) = x for -π < x < π. The Fourier series is

Σ_{n=1}^∞ (2/n)(-1)^{n+1} sin(nx),
which converges to x for -π < x < π. Of course, f′(x) = 1 for -π < x < π, so f is piecewise smooth. However, if we differentiate the Fourier series term by term, we get

Σ_{n=1}^∞ 2(-1)^{n+1} cos(nx),

which does not even converge on (-π, π). The term-by-term derivative of this Fourier series is unrelated to the derivative of f(x).

Integration of Fourier series has better prospects.

THEOREM 14.5 Integration of Fourier Series
Let f be piecewise continuous on [-L, L], with Fourier series

(1/2)a₀ + Σ_{n=1}^∞ [aₙ cos(nπx/L) + bₙ sin(nπx/L)].

Then, for any x with -L ≤ x ≤ L,

∫_{-L}^{x} f(t) dt = (1/2)a₀(x + L) + (L/π) Σ_{n=1}^∞ (1/n) [aₙ sin(nπx/L) - bₙ(cos(nπx/L) - (-1)ⁿ)].

The expression on the right in this equation is exactly what we get by integrating the Fourier series term by term, from -L to x. This means that, for any piecewise continuous function, we can integrate f from -L to x by integrating its Fourier series term by term. This holds even if the Fourier series does not converge to f(x) at this particular x (for example, f might have a jump discontinuity at x).
Define F(x)
=
f
L
f(t)dt- 2ao x
for -L < x < L . Then F is continuous on [-L, L] and F(L) = F(-L) = Lao/2. Further , F'(x) = f(x) - Zao at every point of [-L, L] where f is continuous . Hence F' is piecewis e continuous on [-L, L] . Therefore the Fourier series of F(x) converges to F(x) on [-L, L] : F(x)
= 2A o + A,, cos (nix)J +B,, sin (-)J ,
(14 .18 )
1=*
in which we will use upper case letters for Fourier coefficients of F, and lower case letters for those of f . Now compute the A',,s and B,',s for n = 1, 2, . . . by integrating by parts . First , A,,
1
L
L
J_L
F(t) cos
(L n t)
dt
1 L 1a7rt L =-LF(t)-sin(-)J L n7r L _L n7r
LL (f(t) -
f
fL (**Lt ) 1
--1
2ao) sin
L L
L n7r t -Lsin(-)F' (t)d t n7r L
dt
(n7rt) 1 L f(t) sin dt+ 1 ao f L sin n7r L L 2nir L
L n7r b
11
,
(-n7rt ) dt L
in which bₙ is the sine coefficient in the Fourier series of f on [-L, L]. Similarly,

Bₙ = (1/L) ∫_{-L}^{L} F(t) sin(nπt/L) dt
   = (1/L) [F(t)(-(L/nπ) cos(nπt/L))]_{-L}^{L} - (1/L) ∫_{-L}^{L} (-(L/nπ)) cos(nπt/L) F′(t) dt
   = (1/nπ) ∫_{-L}^{L} (f(t) - (1/2)a₀) cos(nπt/L) dt   (the boundary term vanishes because F(L) = F(-L))
   = (1/nπ) ∫_{-L}^{L} f(t) cos(nπt/L) dt - (a₀/2nπ) ∫_{-L}^{L} cos(nπt/L) dt
   = (L/nπ) aₙ.

Therefore the Fourier series of F is

F(x) = (1/2)A₀ + Σ_{n=1}^∞ (L/nπ) [-bₙ cos(nπx/L) + aₙ sin(nπx/L)]

for -L ≤ x ≤ L. Now we must determine A₀. But

F(L) = (1/2)a₀L = (1/2)A₀ - Σ_{n=1}^∞ (L/nπ) bₙ cos(nπ) = (1/2)A₀ - (L/π) Σ_{n=1}^∞ (1/n) bₙ (-1)ⁿ.

This gives us

A₀ = La₀ + (2L/π) Σ_{n=1}^∞ (1/n) bₙ (-1)ⁿ.

Upon substituting these expressions for A₀, Aₙ, and Bₙ into the series (14.18), we obtain the conclusion of the theorem. ∎
EXAMPLE 14.17

Let f(x) = x for -π < x < π. This function is continuous on [-π, π], and its Fourier series is

Σ_{n=1}^∞ (2/n)(-1)^{n+1} sin(nx).

We have seen that we get nonsense if we differentiate this series term by term. However, we can integrate it term by term to obtain, for any x in [-π, π],

∫_{-π}^{x} t dt = (1/2)(x² - π²) = Σ_{n=1}^∞ (2/n)(-1)^{n+1} ∫_{-π}^{x} sin(nt) dt
= Σ_{n=1}^∞ (2/n)(-1)^{n+1} [-(1/n) cos(nx) + (1/n) cos(nπ)]
= Σ_{n=1}^∞ (2/n²)(-1)ⁿ [cos(nx) - (-1)ⁿ].
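The integrated identity converges much faster than the original series (the coefficients now decay like 1/n²), which makes it easy to spot-check. A sketch (standard-library Python; the evaluation point x = 1 and cutoff are our own choices):

```python
import math

# Check (1/2)(x^2 - pi^2) = sum_{n>=1} (2/n^2)(-1)^n (cos(n x) - (-1)^n)
x = 1.0
lhs = 0.5 * (x**2 - math.pi**2)   # the integral of t from -pi to x
rhs = sum(
    2 / n**2 * (-1) ** n * (math.cos(n * x) - (-1) ** n)
    for n in range(1, 10001)
)
```

The two sides agree to within the 1/n² tail of the truncated sum.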
With stronger conditions on f, we can derive a result on term-by-term differentiation of Fourier series.

THEOREM 14.6 Differentiation of Fourier Series

Let f be continuous on [-L, L] and suppose f(L) = f(-L). Let f′ be piecewise continuous on [-L, L]. Then f(x) equals its Fourier series for -L ≤ x ≤ L,

f(x) = (1/2)a₀ + Σ_{n=1}^∞ [aₙ cos(nπx/L) + bₙ sin(nπx/L)],

and, at each point in (-L, L) where f″(x) exists,

f′(x) = Σ_{n=1}^∞ (nπ/L) [-aₙ sin(nπx/L) + bₙ cos(nπx/L)].

We leave a proof of this to the student. The idea is to write the Fourier series of f′(x), noting that this Fourier series converges to f′(x) wherever f″(x) exists. Use integration by parts, as in the proof of Theorem 14.5, to relate the Fourier coefficients of f′(x) to those of f(x).
EXAMPLE 14.18

Let f(x) = x² for -2 ≤ x ≤ 2. The hypotheses of Theorem 14.6 are satisfied. The Fourier series of f on [-2, 2] is

$$f(x) = \frac{4}{3} + \frac{16}{\pi^2}\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2}\cos\left(\frac{n\pi x}{2}\right),$$

with equality between f(x) and its Fourier series. Because f'(x) = 2x is continuous, and f''(x) = 2 exists throughout the interval, then for -2 < x < 2,

$$f'(x) = 2x = \frac{8}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}\sin\left(\frac{n\pi x}{2}\right).$$

For example, putting x = 1, we get

$$\frac{8}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}\sin\left(\frac{n\pi}{2}\right) = 2,$$

or

$$\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}\sin\left(\frac{n\pi}{2}\right) = \frac{\pi}{4}.$$

Manipulations on Fourier series can sometimes be used to sum series such as this.
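The final sum can be checked numerically. The Python sketch below is our own addition, not part of the text; it accumulates partial sums, which approach π/4:

```python
import math

def partial_sum(terms):
    # Partial sum of sum_{n>=1} (-1)^(n+1)/n * sin(n*pi/2)
    total = 0.0
    for n in range(1, terms + 1):
        total += ((-1) ** (n + 1) / n) * math.sin(n * math.pi / 2)
    return total

print(partial_sum(200001), math.pi / 4)
```

Only odd n contribute, and the series reduces to the Leibniz series 1 − 1/3 + 1/5 − ⋯, so convergence is slow (error roughly the size of the first omitted term).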
CHAPTER 14 Fourier Series
We now have conditions under which we can differentiate or integrate a Fourier series term by term. We will next consider conditions sufficient for a Fourier series to converge uniformly. First we will derive a set of important inequalities for Fourier coefficients, called Bessel's inequalities.
THEOREM 14.7   Bessel's Inequalities

Let f be integrable on [0, L]. Then

1. The coefficients in the Fourier sine expansion of f on [0, L] satisfy
$$\sum_{n=1}^{\infty} b_n^2 \le \frac{2}{L}\int_0^L f(x)^2\,dx.$$

2. The coefficients in the Fourier cosine expansion of f on [0, L] satisfy
$$\frac{1}{2}a_0^2 + \sum_{n=1}^{\infty} a_n^2 \le \frac{2}{L}\int_0^L f(x)^2\,dx.$$

3. If f is integrable on [-L, L], then the Fourier coefficients of f on [-L, L] satisfy
$$\frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right) \le \frac{1}{L}\int_{-L}^{L} f(x)^2\,dx. \;\blacksquare$$
In particular, the sum of the squares of the (sine, cosine, or Fourier series) coefficients of f converges. We will prove (1), which is notationally simpler than the other two inequalities, but contains the idea of the argument.

Proof
Since $\int_0^L f(x)\,dx$ exists, we can compute the Fourier sine coefficients and write the sine series

$$\sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi x}{L}\right), \quad\text{where}\quad b_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{n\pi x}{L}\right)dx.$$

The Nth partial sum of this series is

$$S_N(x) = \sum_{n=1}^{N} b_n\sin\left(\frac{n\pi x}{L}\right).$$

Now consider

$$0 \le \int_0^L\left(f(x) - S_N(x)\right)^2 dx$$
$$= \int_0^L f(x)^2\,dx - 2\int_0^L f(x)\sum_{n=1}^{N} b_n\sin\left(\frac{n\pi x}{L}\right)dx + \int_0^L\left(\sum_{n=1}^{N} b_n\sin\left(\frac{n\pi x}{L}\right)\right)^2 dx$$
$$= \int_0^L f(x)^2\,dx - 2\sum_{n=1}^{N} b_n\int_0^L f(x)\sin\left(\frac{n\pi x}{L}\right)dx + \sum_{n=1}^{N}\sum_{m=1}^{N} b_n b_m\int_0^L\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{m\pi x}{L}\right)dx$$
$$= \int_0^L f(x)^2\,dx - 2\sum_{n=1}^{N} b_n\left(\frac{L}{2}b_n\right) + \sum_{n=1}^{N} b_n^2\,\frac{L}{2},$$

in which we have used the fact that

$$\int_0^L\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{m\pi x}{L}\right)dx = \begin{cases} 0 & \text{if } n \ne m \\ L/2 & \text{if } n = m \end{cases}.$$

We therefore have

$$0 \le \int_0^L f(x)^2\,dx - L\sum_{n=1}^{N} b_n^2 + \frac{L}{2}\sum_{n=1}^{N} b_n^2,$$

or

$$\frac{L}{2}\sum_{n=1}^{N} b_n^2 \le \int_0^L f(x)^2\,dx.$$

Since the right side is independent of N, we can let N → ∞ to get

$$\sum_{n=1}^{\infty} b_n^2 \le \frac{2}{L}\int_0^L f(x)^2\,dx,$$

proving conclusion (1). Conclusions (2) and (3) have similar proofs. ■
EXAMPLE 14.19

We will use Bessel's inequality to derive an upper bound for an infinite series. Let f(x) = x² for -π ≤ x ≤ π. The Fourier series of f converges to f(x) for all x in [-π, π]:

$$x^2 = \frac{\pi^2}{3} + \sum_{n=1}^{\infty}\frac{4}{n^2}(-1)^n\cos(nx).$$

Here a_0 = 2π²/3, a_n = 4(-1)ⁿ/n², and b_n = 0 (x² is an even function). By Bessel's inequality (3) of Theorem 14.7,

$$\frac{1}{2}\left(\frac{2\pi^2}{3}\right)^2 + \sum_{n=1}^{\infty}\left(\frac{4(-1)^n}{n^2}\right)^2 \le \frac{1}{\pi}\int_{-\pi}^{\pi}x^4\,dx = \frac{2\pi^4}{5}.$$

Then

$$16\sum_{n=1}^{\infty}\frac{1}{n^4} \le \frac{2\pi^4}{5} - \frac{2\pi^4}{9} = \frac{8\pi^4}{45},$$

so

$$\sum_{n=1}^{\infty}\frac{1}{n^4} \le \frac{\pi^4}{90},$$

which is approximately 1.0823232. ■
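A quick numerical check of this bound (our own sketch, not part of the text): partial sums of Σ 1/n⁴ stay below π⁴/90 and approach it.

```python
import math

bound = math.pi ** 4 / 90   # upper bound obtained from Bessel's inequality
partial = sum(1.0 / n ** 4 for n in range(1, 10001))
print(partial, bound)
```

That the partial sums creep all the way up to the bound suggests that equality actually holds here, which Parseval's theorem (Theorem 14.9 below) confirms.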
Using Bessel's inequality for coefficients in a Fourier expansion on [-L, L], we can prove a result on uniform convergence of Fourier series.
THEOREM 14.8   Uniform and Absolute Convergence of Fourier Series

Let f be continuous on [-L, L] and let f' be piecewise continuous. Suppose f(-L) = f(L). Then the Fourier series of f on [-L, L] converges absolutely and uniformly to f(x) on [-L, L]. ■

Proof   Denote the Fourier coefficients of f by lower case letters, and those of f' by upper case. Then

$$A_0 = \frac{1}{L}\int_{-L}^{L} f'(\xi)\,d\xi = \frac{1}{L}\left(f(L) - f(-L)\right) = 0.$$

For positive integer n, we find by integration by parts, as in the proof of Theorem 14.5, that

$$A_n = \frac{n\pi}{L}b_n \quad\text{and}\quad B_n = -\frac{n\pi}{L}a_n.$$

Now

$$0 \le \left(|A_n| - \frac{1}{n}\right)^2 = A_n^2 - \frac{2}{n}|A_n| + \frac{1}{n^2},$$

so

$$\frac{2}{n}|A_n| \le A_n^2 + \frac{1}{n^2},$$

and, similarly,

$$\frac{2}{n}|B_n| \le B_n^2 + \frac{1}{n^2}.$$

Therefore

$$\frac{1}{n}\left(|A_n| + |B_n|\right) \le \frac{1}{2}\left(A_n^2 + B_n^2\right) + \frac{1}{n^2},$$

hence

$$|a_n| + |b_n| = \frac{L}{n\pi}\left(|B_n| + |A_n|\right) \le \frac{L}{2\pi}\left(A_n^2 + B_n^2\right) + \frac{L}{\pi}\,\frac{1}{n^2}.$$

Now $\sum_{n=1}^{\infty}(1/n^2)$ converges, and $\sum_{n=1}^{\infty}(A_n^2 + B_n^2)$ converges by applying Bessel's inequality to the Fourier coefficients of f'. Therefore, by comparison, $\sum_{n=1}^{\infty}(|a_n| + |b_n|)$ converges also. But, for -L ≤ x ≤ L,

$$\left|a_n\cos\left(\frac{n\pi x}{L}\right) + b_n\sin\left(\frac{n\pi x}{L}\right)\right| \le |a_n| + |b_n|.$$

By a theorem of Weierstrass, this implies that the Fourier series of f converges uniformly on [-L, L]. Further, the convergence is absolute, since the series of absolute values of terms of the series converges. Finally, by the Fourier convergence theorem, the Fourier series of f converges to f(x) on [-L, L]. This completes the proof. ■
EXAMPLE 14.20

Let f(x) = e^{-|x|} for -1 ≤ x ≤ 1. Then

$$f(x) = \begin{cases} e^{x} & \text{for } -1 \le x < 0 \\ e^{-x} & \text{for } 0 \le x \le 1 \end{cases}.$$

f is continuous on [-1, 1], and

$$f'(x) = \begin{cases} e^{x} & \text{for } -1 < x < 0 \\ -e^{-x} & \text{for } 0 < x < 1 \end{cases}.$$

f has no derivative at x = 0, which is a cusp of the graph (Figure 14.29). Thus f' is piecewise continuous on [-1, 1]. Finally, f(1) = f(-1) = e^{-1}. Therefore the Fourier series of f converges uniformly and absolutely to f(x) on [-1, 1]:

$$f(x) = 1 - e^{-1} + 2\sum_{n=1}^{\infty}\frac{1 - (-1)^n e^{-1}}{1 + n^2\pi^2}\cos(n\pi x)$$

for -1 ≤ x ≤ 1. We can integrate this series term by term. For example,

$$\int_{-1}^{x} f(t)\,dt = \int_{-1}^{x}\left(1 - e^{-1}\right)dt + 2\sum_{n=1}^{\infty}\frac{1 - (-1)^n e^{-1}}{1 + n^2\pi^2}\int_{-1}^{x}\cos(n\pi t)\,dt$$
$$= \left(1 - e^{-1}\right)(x + 1) + 2\sum_{n=1}^{\infty}\frac{1 - (-1)^n e^{-1}}{1 + n^2\pi^2}\,\frac{1}{n\pi}\sin(n\pi x).$$

This is a correct equation, but it is not a Fourier series (the right side includes the polynomial term x). We may always integrate a Fourier series term by term, and the result may be a convergent series, but not necessarily a Fourier series.

FIGURE 14.29   Graph of f(x) = eˣ for -1 ≤ x < 0, e⁻ˣ for 0 ≤ x ≤ 1.
We can also differentiate the Fourier series for f(x) term by term at any point in (-1, 1) at which f''(x) exists. Thus we can differentiate term by term for -1 < x < 0 and for 0 < x < 1. For such x,

$$f'(x) = -2\sum_{n=1}^{\infty}\frac{n\pi\left(1 - (-1)^n e^{-1}\right)}{1 + n^2\pi^2}\sin(n\pi x). \;\blacksquare$$
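Both series in Example 14.20 can be checked numerically. The sketch below is our own illustration (function names are hypothetical); note that the differentiated series converges much more slowly, since its terms only decay like 1/n:

```python
import math

E1 = math.exp(-1.0)

def coeff(n):
    # (1 - (-1)^n e^{-1}) / (1 + n^2 pi^2), the nth cosine coefficient over 2
    return (1.0 - (-1) ** n * E1) / (1.0 + (n * math.pi) ** 2)

def f_series(x, terms=4000):
    # Partial sum of the Fourier series of e^{-|x|} on [-1, 1]
    total = 1.0 - E1
    for n in range(1, terms + 1):
        total += 2.0 * coeff(n) * math.cos(n * math.pi * x)
    return total

def fprime_series(x, terms=200000):
    # Term-by-term derivative; valid where f''(x) exists (x != 0)
    total = 0.0
    for n in range(1, terms + 1):
        total += -2.0 * n * math.pi * coeff(n) * math.sin(n * math.pi * x)
    return total

print(f_series(0.3), math.exp(-0.3))
print(fprime_series(0.3), -math.exp(-0.3))
```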
We will conclude this section with Parseval's theorem. Recall that Bessel's inequality for Fourier coefficients on [-L, L] requires only that we be able to compute these coefficients. If, however, we place continuity conditions on the function, as in Theorem 14.8, then we turn Bessel's inequality into an equality.

THEOREM 14.9   Parseval

Let f be continuous on [-L, L] and let f' be piecewise continuous. Suppose f(-L) = f(L). Then the Fourier coefficients of f on [-L, L] satisfy

$$\frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right) = \frac{1}{L}\int_{-L}^{L} f(x)^2\,dx. \;\blacksquare$$

Proof   The Fourier series of f on [-L, L] converges to f(x) at each point of this interval:

$$f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos\left(\frac{n\pi x}{L}\right) + b_n\sin\left(\frac{n\pi x}{L}\right)\right).$$

Then

$$f(x)^2 = \frac{1}{2}a_0 f(x) + \sum_{n=1}^{\infty}\left(a_n f(x)\cos\left(\frac{n\pi x}{L}\right) + b_n f(x)\sin\left(\frac{n\pi x}{L}\right)\right).$$

We can integrate this Fourier series term by term, and multiplication of the series by the continuous function f(x) does not change this. Therefore

$$\int_{-L}^{L} f(x)^2\,dx = \frac{1}{2}a_0\int_{-L}^{L} f(x)\,dx + \sum_{n=1}^{\infty}\left(a_n\int_{-L}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right)dx + b_n\int_{-L}^{L} f(x)\sin\left(\frac{n\pi x}{L}\right)dx\right).$$

Recalling the integral formulas for the Fourier coefficients, this equation can be written

$$\int_{-L}^{L} f(x)^2\,dx = \frac{1}{2}a_0\left(La_0\right) + \sum_{n=1}^{\infty}\left(a_n La_n + b_n Lb_n\right),$$

and this is equivalent to the conclusion of the theorem. ■
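Parseval's equality can be checked numerically for a concrete function satisfying the hypotheses. For f(x) = x² on [-π, π] the coefficients (from Example 14.19) are a₀ = 2π²/3, aₙ = 4(-1)ⁿ/n², bₙ = 0; the sketch below is our own, not from the text:

```python
import math

# f(x) = x^2 on [-pi, pi]: a_0 = 2*pi^2/3, a_n = 4(-1)^n/n^2, b_n = 0
a0 = 2.0 * math.pi ** 2 / 3.0
left = 0.5 * a0 ** 2 + 16.0 * sum(1.0 / n ** 4 for n in range(1, 10001))
right = (1.0 / math.pi) * (2.0 * math.pi ** 5 / 5.0)  # (1/L) * integral of x^4
print(left, right)
```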
EXAMPLE 14.21

Parseval's theorem has various applications in deriving other properties of Fourier series. We will encounter it later when we discuss completeness of sets of eigenfunctions. However, one immediate use is in summing certain infinite series. To illustrate, the Fourier coefficients of cos(x/2) on [-π, π] are

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi}\cos(x/2)\,dx = \frac{4}{\pi}$$

and

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi}\cos(x/2)\cos(nx)\,dx = -\frac{4}{\pi}\,\frac{(-1)^n}{4n^2 - 1},$$

with b_n = 0. By Parseval's theorem,

$$\frac{1}{2}\left(\frac{4}{\pi}\right)^2 + \sum_{n=1}^{\infty}\left(\frac{4(-1)^n}{\pi\left(4n^2 - 1\right)}\right)^2 = \frac{1}{\pi}\int_{-\pi}^{\pi}\cos^2(x/2)\,dx = 1.$$

Then

$$\sum_{n=1}^{\infty}\frac{1}{\left(4n^2 - 1\right)^2} = \frac{\pi^2 - 8}{16}.$$
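The resulting sum can be confirmed numerically; this one-liner sketch is our own addition:

```python
import math

target = (math.pi ** 2 - 8) / 16
series = sum(1.0 / (4 * n * n - 1) ** 2 for n in range(1, 50001))
print(series, target)
```

The terms decay like 1/(16n⁴), so even a modest number of terms reproduces (π² − 8)/16 ≈ 0.11685 to many digits.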
PROBLEMS

1. Prove Theorem 14.6. An argument can be formulated along the lines discussed following the statement of the theorem.

2. Let f(x) = |x| for -1 ≤ x ≤ 1.
(a) Write the Fourier series for f(x) on [-1, 1].
(b) Show that this series can be differentiated term by term to yield the Fourier expansion of f'(x) on [-1, 1].
(c) Determine f'(x) and write its Fourier series on [-1, 1]. Compare this series with that obtained in (b).

3. Let f(x) = 0 for -π ≤ x ≤ 0 and f(x) = x for 0 < x ≤ π.
(a) Write the Fourier series of f(x) on [-π, π] and show that this series converges to f(x) on (-π, π).
(b) Show that this series can be integrated term by term.
(c) Use the results of (a) and (b) to obtain a trigonometric series expansion for $\int_{-\pi}^{x} f(t)\,dt$ on [-π, π].

4. Let f(x) = x² for -3 ≤ x ≤ 3.
(a) Write the Fourier series for f(x) on [-3, 3].
(b) Show that this series can be differentiated term by term and use this fact to obtain the Fourier expansion of 2x on [-3, 3].
(c) Write the Fourier series of 2x on [-3, 3] by computation of the Fourier coefficients and compare the result with that of (b).

5. Let f(x) = x sin(x) for -π ≤ x ≤ π.
(a) Write the Fourier series for f(x) on [-π, π].
(b) Show that this series can be differentiated term by term and use this fact to obtain the Fourier expansion of sin(x) + x cos(x) on [-π, π].
(c) Write the Fourier series of sin(x) + x cos(x) on [-π, π] by computation of the Fourier coefficients and compare the result with that of (b).

14.6
The Phase Angle Form of a Fourier Series

A function f is periodic with period p if f(x + p) = f(x) for all real x. If a function has a period, it has many periods. For example, cos(x) has periods 2π, 4π, 6π, -2π, -4π, and, in fact, 2nπ for any integer n. The smallest positive period of a function is called its principal period. The principal period of sin(x) and cos(x) is 2π. If f has period p, then for any x, and any integer n, f(x + np) = f(x).
For example,

$$\cos\left(\frac{\pi}{6}\right) = \cos\left(\frac{\pi}{6} + 2\pi\right) = \cos\left(\frac{\pi}{6} + 4\pi\right) = \cos\left(\frac{\pi}{6} + 6\pi\right) = \cdots = \cos\left(\frac{\pi}{6} - 2\pi\right) = \cos\left(\frac{\pi}{6} - 4\pi\right) = \cdots.$$
The graph of a periodic f(x) repeats itself over every interval of length p (Figure 14.30). This means that we need only specify f(x) on an interval of length p, say on [-p/2, p/2), to determine f(x) for all x. This specification of function values can be made on any interval [a, a + p) of length p. Since f(a + p) = f(a), the function must have the same value at the end points of this interval. This is why we specify values on the half-open interval [a, a + p), since f(a + p) is determined once f(a) is defined.
EXAMPLE 14.22

Let g(x) = 2x for -1 ≤ x < 1, and suppose g has period 2. Then the graph of g on [-1, 1) is repeated to cover the entire real line, as in Figure 14.31. Knowing the period, and the function values on [-1, 1), is enough to determine the function for all x. As a specific example, suppose we want to know g(2). Because g has period 2, g(x + 2n) = g(x) for any x and any integer n, so g(2) = g(0) = 0.
If f has period p, and is integrable, then we can calculate its Fourier coefficients on [-p/2, p/2] and write the Fourier series

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos\left(\frac{2n\pi x}{p}\right) + b_n\sin\left(\frac{2n\pi x}{p}\right)\right).$$

Here L = p/2, so nπx/L = 2nπx/p in the previous discussion of Fourier series on [-L, L]. The Fourier coefficients are

$$a_n = \frac{2}{p}\int_{-p/2}^{p/2} f(x)\cos\left(\frac{2n\pi x}{p}\right)dx \quad\text{for } n = 0, 1, 2, \ldots$$

FIGURE 14.30   Graph of a periodic function of fundamental period p.
FIGURE 14.31   Graph of g(x) = 2x on [-1, 1), extended with period 2.
and

$$b_n = \frac{2}{p}\int_{-p/2}^{p/2} f(x)\sin\left(\frac{2n\pi x}{p}\right)dx \quad\text{for } n = 1, 2, \ldots.$$

Actually, because of the periodicity, we could choose any convenient number a and write

$$a_n = \frac{2}{p}\int_{a}^{a+p} f(x)\cos\left(\frac{2n\pi x}{p}\right)dx \quad\text{for } n = 0, 1, 2, \ldots \tag{14.19}$$

and

$$b_n = \frac{2}{p}\int_{a}^{a+p} f(x)\sin\left(\frac{2n\pi x}{p}\right)dx \quad\text{for } n = 1, 2, \ldots. \tag{14.20}$$

Once we compute the coefficients, we can use a convergence theorem to determine where this series represents f(x).
EXAMPLE 14.23

The function f shown in Figure 14.32 has fundamental period 6, and

$$f(x) = \begin{cases} 0 & \text{for } -3 \le x < 0 \\ 1 & \text{for } 0 \le x < 3 \end{cases}.$$

This function is called a square wave. Its Fourier series on [-3, 3] is

$$\frac{1}{2} + \sum_{n=1}^{\infty}\frac{1}{n\pi}\left(1 - (-1)^n\right)\sin\left(\frac{n\pi x}{3}\right).$$

This series converges to 0 for -3 < x < 0, to 1 for 0 < x < 3, and to 1/2 at x = 0 and x = ±3. Because of the periodicity, this series also converges to f(x) on (-6, -3) and on (3, 6), on (-9, -6) and on (6, 9), and so on. Sometimes we write

$$\omega_0 = \frac{2\pi}{p}.$$

Now the Fourier series of f on [-p/2, p/2] is

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left(a_n\cos(n\omega_0 x) + b_n\sin(n\omega_0 x)\right), \tag{14.21}$$

FIGURE 14.32   Square wave f(x) of period 6.
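The convergence claims for the square wave are easy to see numerically. The sketch below is our own illustration of partial sums of the series above:

```python
import math

def partial(x, terms=20001):
    # Partial sum of 1/2 + sum (1 - (-1)^n)/(n pi) * sin(n pi x / 3)
    total = 0.5
    for n in range(1, terms + 1):
        total += (1.0 - (-1) ** n) / (n * math.pi) * math.sin(n * math.pi * x / 3.0)
    return total

print(partial(1.5))   # interior of (0, 3): tends to 1
print(partial(-1.5))  # interior of (-3, 0): tends to 0
print(partial(0.0))   # jump point: exactly the average, 1/2
```

At the jump x = 0 every sine term vanishes, so each partial sum equals the average value 1/2 exactly.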
where

$$a_n = \frac{2}{p}\int_{-p/2}^{p/2} f(x)\cos(n\omega_0 x)\,dx \quad\text{for } n = 0, 1, 2, \ldots$$

and

$$b_n = \frac{2}{p}\int_{-p/2}^{p/2} f(x)\sin(n\omega_0 x)\,dx \quad\text{for } n = 1, 2, \ldots.$$
It is sometimes useful to write the Fourier series (14.21) in a different way. We will look for numbers c_n and δ_n so that

$$a_n\cos(n\omega_0 x) + b_n\sin(n\omega_0 x) = c_n\cos(n\omega_0 x + \delta_n).$$

To solve for these constants, write the last equation as

$$a_n\cos(n\omega_0 x) + b_n\sin(n\omega_0 x) = c_n\cos(n\omega_0 x)\cos(\delta_n) - c_n\sin(n\omega_0 x)\sin(\delta_n).$$

One way to satisfy this equation is to have

$$c_n\cos(\delta_n) = a_n \quad\text{and}\quad c_n\sin(\delta_n) = -b_n.$$

Solve these for c_n and δ_n. First square both equations and add to obtain

$$c_n^2 = a_n^2 + b_n^2, \quad\text{so}\quad c_n = \sqrt{a_n^2 + b_n^2}. \tag{14.22}$$

Next, write

$$\frac{c_n\sin(\delta_n)}{c_n\cos(\delta_n)} = \tan(\delta_n) = -\frac{b_n}{a_n},$$

so

$$\delta_n = \tan^{-1}\left(-\frac{b_n}{a_n}\right),$$

assuming that a_n ≠ 0. The numbers c_n and δ_n allow us to write the phase angle form of the Fourier series (14.21).
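The identity behind (14.22) can be verified numerically. The sketch below is our own; note that it uses atan2, which also resolves the quadrant and handles the a_n = 0 case that the arctangent formula above excludes:

```python
import math

a, b = 3.0, -4.0            # arbitrary sample coefficients
c = math.hypot(a, b)        # c = sqrt(a^2 + b^2)
delta = math.atan2(-b, a)   # phase angle with the correct quadrant

theta = 0.7
direct = a * math.cos(theta) + b * math.sin(theta)
phased = c * math.cos(theta + delta)
print(direct, phased)
```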
DEFINITION 14.7   Phase Angle Form

Let f have fundamental period p. Then the phase angle form of the Fourier series (14.21) of f is

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty} c_n\cos(n\omega_0 x + \delta_n),$$

in which ω₀ = 2π/p, $c_n = \sqrt{a_n^2 + b_n^2}$, and $\delta_n = \tan^{-1}(-b_n/a_n)$ for n = 1, 2, ….
The phase angle form of the Fourier series is also called its harmonic form. This expression displays the composition of a periodic function (satisfying certain continuity conditions) as a superposition of cosine waves. The term cos(nω₀x + δ_n) is the nth harmonic of f, c_n is the nth harmonic amplitude, and δ_n is the nth phase angle of f.
EXAMPLE 14.24

Suppose f has fundamental period p = 3, and f(x) = x² for 0 ≤ x < 3. Since f has fundamental period 3, defining f(x) on any interval [a, a + 3) of length 3 determines f(x) for all x. For example, f(-1) = f(-1 + 3) = f(2) = 4, f(5) = f(2 + 3) = f(2) = 4 (or observe that f(5) = f(-1 + 6) = f(-1 + 2·3) = f(-1) = 4), and f(7) = f(1 + 6) = f(1) = 1. A graph of f is shown in Figure 14.33.

Care must be taken if we want to write an algebraic expression for f(x) on a different interval. For example, on the symmetric interval [-3/2, 3/2) about the origin,

$$f(x) = \begin{cases} x^2 & \text{for } 0 \le x < \tfrac{3}{2} \\ (x + 3)^2 & \text{for } -\tfrac{3}{2} \le x < 0 \end{cases}.$$

To find the Fourier coefficients of f, it is convenient to use equations (14.19) and (14.20) with a = 0, since f is given explicitly on [0, 3). Compute

$$a_0 = \frac{2}{3}\int_0^3 x^2\,dx = 6,$$
$$a_n = \frac{2}{3}\int_0^3 x^2\cos\left(\frac{2n\pi x}{3}\right)dx = \frac{9}{n^2\pi^2},$$

and

$$b_n = \frac{2}{3}\int_0^3 x^2\sin\left(\frac{2n\pi x}{3}\right)dx = -\frac{9}{n\pi}.$$

FIGURE 14.33   Graph of f(x) = x² for 0 ≤ x < 3, with f(x + 3) = f(x) for all x.
The Fourier series of f is

$$3 + \sum_{n=1}^{\infty}\left(\frac{9}{n^2\pi^2}\cos\left(\frac{2n\pi x}{3}\right) - \frac{9}{n\pi}\sin\left(\frac{2n\pi x}{3}\right)\right). \tag{14.23}$$

We can think of this as the Fourier series of f on the symmetric interval [-3/2, 3/2] about the origin. By the Fourier convergence theorem, this series converges to

$$\begin{cases} \dfrac{1}{2}\left(\dfrac{9}{4} + \dfrac{9}{4}\right) = \dfrac{9}{4} & \text{for } x = \pm\dfrac{3}{2} \\[1ex] \dfrac{9}{2} & \text{for } x = 0 \\[1ex] (x + 3)^2 & \text{for } -\dfrac{3}{2} < x < 0 \\[1ex] x^2 & \text{for } 0 < x < \dfrac{3}{2} \end{cases}.$$

For the phase angle, or harmonic, form of this Fourier series, compute

$$c_n = \sqrt{a_n^2 + b_n^2} = \frac{9}{n^2\pi^2}\sqrt{1 + n^2\pi^2} \quad\text{for } n = 1, 2, \ldots$$

and

$$\delta_n = \tan^{-1}\left(-\frac{-9/n\pi}{9/n^2\pi^2}\right) = \tan^{-1}(n\pi).$$

Since ω₀ = 2π/3, the phase angle form of the series (14.23) is

$$3 + \sum_{n=1}^{\infty}\frac{9}{n^2\pi^2}\sqrt{1 + n^2\pi^2}\cos\left(\frac{2n\pi x}{3} + \tan^{-1}(n\pi)\right).$$
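As a spot check (our own sketch, not part of the text), the nth term of the phase angle form should agree with the corresponding cosine-plus-sine term of (14.23):

```python
import math

def term_pair(n, x):
    a_n = 9.0 / (n * math.pi) ** 2
    b_n = -9.0 / (n * math.pi)
    c_n = a_n * math.sqrt(1.0 + (n * math.pi) ** 2)
    delta_n = math.atan(n * math.pi)
    w0 = 2.0 * math.pi / 3.0
    direct = a_n * math.cos(n * w0 * x) + b_n * math.sin(n * w0 * x)
    phased = c_n * math.cos(n * w0 * x + delta_n)
    return direct, phased

for n in (1, 2, 5):
    print(n, term_pair(n, 1.2))
```

Here tan⁻¹(nπ) lies in the first quadrant, consistent with a_n > 0 and -b_n > 0, so the simple arctangent gives the correct phase.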
The amplitude spectrum of a periodic function f is a plot of values of nω₀ on the horizontal axis versus c_n/2 on the vertical axis, for n = 1, 2, …. Thus the amplitude spectrum consists of points (nω₀, c_n/2) for n = 1, 2, …. It is also common to include the point (0, |a₀|/2) on the vertical axis. Figure 14.34 shows the amplitude spectrum for the function of Example 14.24, consisting of the point (0, 3) and, for n = 1, 2, …, the points

$$\left(\frac{2n\pi}{3},\ \frac{9}{2n^2\pi^2}\sqrt{1 + n^2\pi^2}\right).$$

This graph allows us to envision the magnitude of the harmonics of which the periodic function is composed and clarifies which harmonics dominate in the function. This is useful in signal analysis, in which the function is the signal.

FIGURE 14.34   Amplitude spectrum of the function of Figure 14.33.
PROBLEMS

1. Let f and g have period p. Show that αf + βg has period p for any constants α and β.

2. Let f have period p and let α and β be positive constants. Show that g(t) = f(αt) has period p/α, and that h(t) = f(t/β) has period βp.

3. Let f(x) be differentiable and have period p. Show that f'(x) has period p.

4. Suppose f has period p. Show that, for any real number a,

$$\int_{a}^{a+p} f(x)\,dx = \int_{0}^{p} f(x)\,dx = \int_{-p/2}^{p/2} f(x)\,dx.$$

In each of Problems 5 through 9, find the phase angle form of the Fourier series of the function. Plot some points of the amplitude spectrum of the function.

5. f(x) = x for 0 ≤ x < 2 and f(x + 2) = f(x) for all x.

6. f(x) = 1 for 0 ≤ x < 1, f(x) = 0 for 1 ≤ x < 2, and f(x + 2) = f(x) for all x.

7. f(x) = 3x² for 0 ≤ x < 4 and f(x + 4) = f(x) for all x.

8. f(x) = 1 + x for 0 ≤ x < 3, f(x) = 2 for 3 ≤ x < 4, and f(x + 4) = f(x) for all x.

9. f(x) = cos(πx) for 0 ≤ x < 1 and f(x) = f(x + 1) for all x.

In each of Problems 10 through 14, find the phase angle form of the Fourier series of the function, part of whose graph is given in the indicated diagram. Plot some points of the amplitude spectrum of the function.

10. Figure 14.35

11. Figure 14.36

12. Figure 14.37

13. Figure 14.38

14. Figure 14.39
15. Determine the Fourier series representation of the steady-state current in the circuit of Figure 14.40 if

$$E(t) = 100\left(\pi^2 - t^2\right) \text{ for } -\pi < t \le \pi, \qquad E(t + 2\pi) = E(t) \text{ for all } t.$$

16. Determine the Fourier series representation of the steady-state current in the circuit shown in Figure 14.41 if E(t) = |110 sin(800πt)|. Hint: First show that

$$E(t) = \frac{220}{\pi}\left(1 - 2\sum_{n=1}^{\infty}\frac{\cos(1600n\pi t)}{4n^2 - 1}\right).$$

FIGURE 14.40   Circuit (100 Ω, 5 H, 10 H).   FIGURE 14.41   Circuit.

14.7
Complex Fourier Series and the Frequency Spectrum

It is often convenient to work in terms of complex numbers, even when the quantities of interest are real. For example, electrical engineers often use equations having complex quantities to compute currents, realizing that at the end the current is the real part of a certain complex expression. We will cast Fourier series in this setting. Later, complex Fourier series and their coefficients will provide a natural starting point for the development of discrete Fourier transforms.

14.7.1   Review of Complex Numbers

Given a complex number a + bi, its conjugate is $\overline{a + bi} = a - bi$. If we identify a + bi with the point (a, b) in the plane, then a - bi is (a, -b), the reflection of (a, b) across the horizontal (real) axis (Figure 14.42). The conjugate of a product is the product of the conjugates: $\overline{zw} = \bar{z}\,\bar{w}$ for any complex numbers z and w.

The magnitude, or modulus, of a + bi is $|a + bi| = \sqrt{a^2 + b^2}$, the distance from the origin to (a, b). It is useful to observe that

$$(a + bi)\overline{(a + bi)} = a^2 + b^2 = |a + bi|^2.$$

If we denote the complex number as z, this equation is $z\bar{z} = |z|^2$. Introduce polar coordinates x = r cos(θ), y = r sin(θ) to write

$$z = x + iy = r\left[\cos(\theta) + i\sin(\theta)\right] = re^{i\theta},$$
by Euler's formula . Then r = Izl and 0 is called an argument of z . It is the angle between the positive x axis and the point (x, y), or x+ iy, in the plane (Figure 14 .43) . The argument is
14.7 Complex Fourier Series and the Frequency Spectrum
. (a, b)
Y
631
Y
Y p a + ib = re i 0 r/ (a, b)
.
x (a, -b) FIGURE 14 .4 2
FIGURE 14 .43
FIGURE 14 .44
Complex conjugate as a reflection across th e horizontal axis .
Polar form of a complex number.
2+2i.
Polar form of
determined to within integer multiples of 2Tr . For example, 12 + 2i l = - A and the argument s of 2+2i are the angles 7r/4 + 2n7r, with n any integer (Figure 14 .44) . Thus we can write 2+ 2i
= Ae "r/4 .
This is the polar form of 2 + 2i . We can actually write 2 + 2i = -\/gei(ir/4+2"7''), but thi s doesn't contribute anything new to the polar form of 2+2i, since e i(w/4+2n7r) = e7ri/4 e 2"7ri
and e 2' 'i = cos(2nor) + i sin(2n7r) = 1 . If we use Euler's formula twice, we can write e ix = cos(x) + i sin(x) and e -ix = cos(x) - isin(x) . Solve these equations for sin(x) and cos(x) to writ e cos(x)
=2
(ei` + e-iz ) and sin(x) =
Zi
(e" - e -' ')
(14 .24 )
Finally, we will use the fact that, if x is a real number, then eix = e _L . This is true becaus e e ix = cos(x) + i sin(x) = cos(x) - i sin(x) = e-ix .
14.7.2   Complex Fourier Series

We will use these ideas to formulate the Fourier series of a function in complex terms. Let f be a real-valued, periodic function with fundamental period p. Assume that f is integrable on [-p/2, p/2]. As we did with the phase angle form of a Fourier series, write the Fourier series of f(x) on this interval as

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left[a_n\cos(n\omega_0 x) + b_n\sin(n\omega_0 x)\right],$$

with ω₀ = 2π/p. Use equations (14.24) to write this series as

$$\frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left[\frac{a_n}{2}\left(e^{in\omega_0 x} + e^{-in\omega_0 x}\right) + \frac{b_n}{2i}\left(e^{in\omega_0 x} - e^{-in\omega_0 x}\right)\right]$$
$$= \frac{1}{2}a_0 + \sum_{n=1}^{\infty}\left[\frac{1}{2}\left(a_n - ib_n\right)e^{in\omega_0 x} + \frac{1}{2}\left(a_n + ib_n\right)e^{-in\omega_0 x}\right]. \tag{14.25}$$

In the series (14.25), let

$$d_0 = \frac{1}{2}a_0$$

and, for each positive integer n,

$$d_n = \frac{1}{2}\left(a_n - ib_n\right).$$

Then the series (14.25) becomes

$$d_0 + \sum_{n=1}^{\infty}\left[d_n e^{in\omega_0 x} + \bar{d_n}e^{-in\omega_0 x}\right] = d_0 + \sum_{n=1}^{\infty} d_n e^{in\omega_0 x} + \sum_{n=1}^{\infty}\bar{d_n}e^{-in\omega_0 x}. \tag{14.26}$$

Now consider the coefficients. First,

$$d_0 = \frac{1}{2}a_0 = \frac{1}{p}\int_{-p/2}^{p/2} f(t)\,dt.$$

And, for n = 1, 2, …,

$$d_n = \frac{1}{2}\left(a_n - ib_n\right) = \frac{1}{p}\int_{-p/2}^{p/2} f(t)\cos(n\omega_0 t)\,dt - \frac{i}{p}\int_{-p/2}^{p/2} f(t)\sin(n\omega_0 t)\,dt$$
$$= \frac{1}{p}\int_{-p/2}^{p/2} f(t)\left[\cos(n\omega_0 t) - i\sin(n\omega_0 t)\right]dt = \frac{1}{p}\int_{-p/2}^{p/2} f(t)e^{-in\omega_0 t}\,dt.$$

Then

$$\bar{d_n} = \frac{1}{p}\int_{-p/2}^{p/2} f(t)e^{in\omega_0 t}\,dt = d_{-n}.$$

Put these results into the series (14.26) to get

$$d_0 + \sum_{n=1}^{\infty} d_n e^{in\omega_0 x} + \sum_{n=1}^{\infty} d_{-n}e^{-in\omega_0 x} = d_0 + \sum_{n=1}^{\infty} d_n e^{in\omega_0 x} + \sum_{n=-\infty}^{-1} d_n e^{in\omega_0 x} = \sum_{n=-\infty}^{\infty} d_n e^{in\omega_0 x}.$$

We have reached this expression by rearranging terms in the Fourier series of a periodic function f. This suggests the following definition.
DEFINITION 14.8   Complex Fourier Series

Let f have fundamental period p, and let ω₀ = 2π/p. Then the complex Fourier series of f is

$$\sum_{n=-\infty}^{\infty} d_n e^{in\omega_0 x}, \quad\text{where}\quad d_n = \frac{1}{p}\int_{-p/2}^{p/2} f(t)e^{-in\omega_0 t}\,dt.$$

The numbers d_n are the complex Fourier coefficients of f.

In the formula for d_n, the integration can actually be carried out over any interval of length p, because of the periodicity of f. Thus, for any real number a,

$$d_n = \frac{1}{p}\int_{a}^{a+p} f(t)e^{-in\omega_0 t}\,dt.$$

Since the complex Fourier series is just another way of writing the Fourier series, the convergence theorems (14.1) and (14.2) apply without any adjustments.

THEOREM 14.10

Let f be periodic with fundamental period p. Let f be piecewise smooth on [-p/2, p/2]. Then at each x the complex Fourier series converges to

$$\frac{1}{2}\left(f(x+) + f(x-)\right). \;\blacksquare$$

The amplitude spectrum of the complex Fourier series of a periodic function is a graph of the points (nω₀, |d_n|), in which |d_n| is the magnitude of the complex coefficient d_n. Sometimes this amplitude spectrum is referred to as a frequency spectrum.
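The relation d_n = (a_n − i b_n)/2 can be checked numerically for the square wave of Example 14.23 (p = 6, a_n = 0, b_n = (1 − (−1)ⁿ)/nπ). This sketch is our own, with names of our choosing; it approximates d_n by a midpoint Riemann sum:

```python
import cmath
import math

p = 6.0
w0 = 2.0 * math.pi / p

def d(n, steps=60000):
    # d_n = (1/p) * integral over [-p/2, p/2) of f(t) e^{-i n w0 t} dt
    h = p / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        t = -p / 2.0 + (k + 0.5) * h
        f_t = 1.0 if t > 0 else 0.0   # square wave: 0 on (-3, 0), 1 on (0, 3)
        total += f_t * cmath.exp(-1j * n * w0 * t) * h
    return total / p

for n in (1, 2, 3):
    b_n = (1.0 - (-1) ** n) / (n * math.pi)
    print(n, d(n), (0.0 - 1j * b_n) / 2.0)
```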
EXAMPLE 14.25

We will compute the complex Fourier series of the full-wave rectification of E sin(At), in which E and A are positive constants. This means that we want the complex Fourier series of |E sin(At)|, whose graph is shown in Figure 14.45. This function has fundamental period π/A (even though E sin(At) has period 2π/A). In this example, ω₀ = 2π/(π/A) = 2A. The complex Fourier coefficients are

$$d_n = \frac{A}{\pi}\int_0^{\pi/A}|E\sin(At)|\,e^{-2niAt}\,dt = \frac{EA}{\pi}\int_0^{\pi/A}\sin(At)e^{-2niAt}\,dt.$$

FIGURE 14.45   Graph of |E sin(At)|.

When n = 0 we get

$$d_0 = \frac{EA}{\pi}\int_0^{\pi/A}\sin(At)\,dt = \frac{2E}{\pi}.$$

When n ≠ 0, the integration is simplified by putting the sine term in exponential form:

$$d_n = \frac{EA}{\pi}\int_0^{\pi/A}\frac{1}{2i}\left(e^{iAt} - e^{-iAt}\right)e^{-2niAt}\,dt$$
$$= \frac{EA}{2i\pi}\int_0^{\pi/A} e^{(1-2n)iAt}\,dt - \frac{EA}{2i\pi}\int_0^{\pi/A} e^{-(1+2n)iAt}\,dt$$
$$= -\frac{E}{2\pi}\left[\frac{e^{(1-2n)\pi i} - 1}{1 - 2n} + \frac{e^{-(1+2n)\pi i} - 1}{1 + 2n}\right].$$

Now

$$e^{(1-2n)\pi i} = \cos((1-2n)\pi) + i\sin((1-2n)\pi) = -1$$

and

$$e^{-(1+2n)\pi i} = \cos((1+2n)\pi) - i\sin((1+2n)\pi) = -1.$$

Therefore

$$d_n = -\frac{E}{2\pi}\left[\frac{-2}{1 - 2n} + \frac{-2}{1 + 2n}\right] = \frac{E}{\pi}\,\frac{2}{1 - 4n^2} = -\frac{2E}{\pi}\,\frac{1}{4n^2 - 1}.$$

When n = 0 this yields the correct value for d₀ as well. The complex Fourier series of |E sin(At)| is

$$-\frac{2E}{\pi}\sum_{n=-\infty}^{\infty}\frac{1}{4n^2 - 1}\,e^{2nAit}.$$

The amplitude spectrum is a plot of the points

$$\left(2nA,\ \frac{2E}{\pi}\,\frac{1}{\left|4n^2 - 1\right|}\right).$$

Part of this plot is shown in Figure 14.46. ■
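These coefficients can be confirmed numerically; the sketch below is our own (E and A chosen arbitrarily), approximating d_n by a midpoint Riemann sum over one period:

```python
import cmath
import math

E, A = 1.0, 2.0
p = math.pi / A   # fundamental period of |E sin(At)|
w0 = 2.0 * A

def d_numeric(n, steps=20000):
    # Midpoint-rule approximation of (1/p) * integral_0^p |E sin(At)| e^{-i n w0 t} dt
    h = p / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += abs(E * math.sin(A * t)) * cmath.exp(-1j * n * w0 * t) * h
    return total / p

for n in (0, 1, 2):
    print(n, d_numeric(n), -2.0 * E / (math.pi * (4 * n * n - 1)))
```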
FIGURE 14.46   Amplitude spectrum of |E sin(At)|, with peaks 2E/3π, 2E/15π, … at nω₀ = ±2A, ±4A, ±6A, ….
PROBLEMS

In each of Problems 1 through 7, write the complex Fourier series of f, determine what this series converges to, and plot some points of the frequency spectrum. Keep in mind that, in specifying a function of period p, it is sufficient to define f(x) on any interval of length p.

1. f has period 3 and f(x) = 2x for 0 ≤ x < 3.

2. f has period 2 and f(x) = x² for 0 ≤ x < 2.

3. f has period 4 and f(x) = 0 for 0 ≤ x < 1, f(x) = 1 for 1 ≤ x < 4.

4. f has period 6 and f(x) = 1 - x for 0 ≤ x < 6.

5. f has period 4 and f(x) = -1 for 0 ≤ x < 2, f(x) = 2 for 2 ≤ x < 4.

6. f has period 5 and f(x) = e⁻ˣ for 0 ≤ x < 5.

7. f has period 2 and f(x) = x for 0 ≤ x < 1, f(x) = 2 - x for 1 ≤ x < 2.

8. Let f be the periodic function, part of whose graph is shown in Figure 14.47. Find the complex Fourier series of f and plot some points of its frequency spectrum.

The next problem involves the phase spectrum of f, which is a plot of the points (nω₀, φ_n) for n = 0, 1, 2, …. Here φ_n = tan⁻¹(-b_n/a_n) is the nth phase angle of f.

9. The graphs of Figures 14.48 and 14.49 define two periodic functions f and g, respectively. Calculate the complex Fourier series of each function. Determine a relationship between the frequency spectra of these functions and also between their phase spectra.

FIGURE 14.47   FIGURE 14.48   FIGURE 14.49
CHAPTER 15

The Fourier Integral and Fourier Transforms
15.1
The Fourier Integral

If f(x) is defined on an interval [-L, L], we may be able to represent it, at least at "most" points on this interval, by a Fourier series. If f is periodic, then we may be able to represent it by a Fourier series on intervals along the entire real line.

Now suppose f(x) is defined for all x, but is not periodic. Then we cannot represent f(x) by a Fourier series over the entire line. However, we may still be able to write a representation in terms of sines and cosines, using an integral instead of a summation.

To see how this might be done, suppose f is absolutely integrable, which means that $\int_{-\infty}^{\infty}|f(x)|\,dx$ converges, and that f is piecewise smooth on every interval [-L, L]. Write the Fourier series of f on an arbitrary interval [-L, L], with the integral formulas for the coefficients included:

$$\frac{1}{2L}\int_{-L}^{L} f(\xi)\,d\xi + \sum_{n=1}^{\infty}\left[\left(\frac{1}{L}\int_{-L}^{L} f(\xi)\cos(n\pi\xi/L)\,d\xi\right)\cos(n\pi x/L) + \left(\frac{1}{L}\int_{-L}^{L} f(\xi)\sin(n\pi\xi/L)\,d\xi\right)\sin(n\pi x/L)\right].$$

We want to let L → ∞ to obtain a representation of f(x) over the whole line. To see what limit, if any, this Fourier series approaches, let

$$\omega_n = \frac{n\pi}{L} \quad\text{and}\quad \Delta\omega = \omega_n - \omega_{n-1} = \frac{\pi}{L}.$$
Then the Fourier series on [-L, L] can be written

$$\sum_{n=1}^{\infty}\left[\frac{1}{\pi}\left(\int_{-L}^{L} f(\xi)\cos(\omega_n\xi)\,d\xi\right)\cos(\omega_n x) + \frac{1}{\pi}\left(\int_{-L}^{L} f(\xi)\sin(\omega_n\xi)\,d\xi\right)\sin(\omega_n x)\right]\Delta\omega + \frac{1}{2\pi}\left(\int_{-L}^{L} f(\xi)\,d\xi\right)\Delta\omega. \tag{15.1}$$

Now let L → ∞, causing Δω → 0. In the last expression,

$$\frac{1}{2\pi}\left(\int_{-L}^{L} f(\xi)\,d\xi\right)\Delta\omega \to 0,$$

because by assumption $\int_{-\infty}^{\infty}|f(\xi)|\,d\xi$ converges. The other terms in the expression (15.1) resemble a Riemann sum for a definite integral, and we assert that, in the limit as L → ∞ and Δω → 0, this expression approaches the limit

$$\int_0^{\infty}\left[\left(\frac{1}{\pi}\int_{-\infty}^{\infty} f(\xi)\cos(\omega\xi)\,d\xi\right)\cos(\omega x) + \left(\frac{1}{\pi}\int_{-\infty}^{\infty} f(\xi)\sin(\omega\xi)\,d\xi\right)\sin(\omega x)\right]d\omega.$$

This is the Fourier integral of f on the real line. Under the assumptions made about f, this integral converges to

$$\frac{1}{2}\left(f(x-) + f(x+)\right)$$

at each x. In particular, if f is continuous at x, then this integral converges to f(x). Often this Fourier integral is written

$$\int_0^{\infty}\left[A_\omega\cos(\omega x) + B_\omega\sin(\omega x)\right]d\omega, \tag{15.2}$$

in which the Fourier integral coefficients of f are

$$A_\omega = \frac{1}{\pi}\int_{-\infty}^{\infty} f(\xi)\cos(\omega\xi)\,d\xi \quad\text{and}\quad B_\omega = \frac{1}{\pi}\int_{-\infty}^{\infty} f(\xi)\sin(\omega\xi)\,d\xi.$$

This Fourier integral representation of f(x) is entirely analogous to a Fourier series on an interval, with $\int_0^{\infty}\cdots d\omega$ replacing $\sum_{n=1}^{\infty}$, and having integral formulas for the coefficients. These coefficients are functions of ω, which is the integration variable in the Fourier integral (15.2).
EXAMPLE 15.1

Let

$$f(x) = \begin{cases} 1 & \text{for } -1 \le x \le 1 \\ 0 & \text{for } |x| > 1 \end{cases}.$$

Figure 15.1 is a graph of f. Certainly f is piecewise smooth, and $\int_{-\infty}^{\infty}|f(x)|\,dx$ converges. The Fourier coefficients of f are

$$A_\omega = \frac{1}{\pi}\int_{-1}^{1}\cos(\omega\xi)\,d\xi = \frac{2\sin(\omega)}{\pi\omega}$$

and

$$B_\omega = \frac{1}{\pi}\int_{-\infty}^{\infty} f(\xi)\sin(\omega\xi)\,d\xi = 0.$$

The Fourier integral of f is

$$\int_0^{\infty}\frac{2\sin(\omega)}{\pi\omega}\cos(\omega x)\,d\omega.$$

Because f is piecewise smooth, this converges to ½(f(x+) + f(x-)) for all x. More explicitly,

$$\int_0^{\infty}\frac{2\sin(\omega)}{\pi\omega}\cos(\omega x)\,d\omega = \begin{cases} 1 & \text{for } -1 < x < 1 \\ \tfrac{1}{2} & \text{for } x = \pm 1 \\ 0 & \text{for } |x| > 1 \end{cases}. \;\blacksquare$$

FIGURE 15.1   Graph of f(x) = 1 for -1 ≤ x ≤ 1, 0 for |x| > 1.
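These convergence claims can be illustrated by truncating the ω-integral and evaluating it numerically. The sketch below is our own; the truncation point and step size are arbitrary choices, and the slow 1/ω decay of the tail limits the accuracy:

```python
import math

def fourier_integral(x, cutoff=200.0, h=0.002):
    # Midpoint-rule value of the truncated integral of (2 sin w / (pi w)) cos(w x)
    total = 0.0
    steps = int(cutoff / h)
    for k in range(steps):
        w = (k + 0.5) * h
        total += (2.0 * math.sin(w) / (math.pi * w)) * math.cos(w * x) * h
    return total

print(fourier_integral(0.5))  # near 1
print(fourier_integral(2.0))  # near 0
print(fourier_integral(1.0))  # near 1/2
```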
There is another expression for the Fourier integral of a function that we will sometimes find convenient. Write

$$\int_0^{\infty}\left[A_\omega\cos(\omega x) + B_\omega\sin(\omega x)\right]d\omega$$
$$= \int_0^{\infty}\left[\left(\frac{1}{\pi}\int_{-\infty}^{\infty} f(\xi)\cos(\omega\xi)\,d\xi\right)\cos(\omega x) + \left(\frac{1}{\pi}\int_{-\infty}^{\infty} f(\xi)\sin(\omega\xi)\,d\xi\right)\sin(\omega x)\right]d\omega$$
$$= \frac{1}{\pi}\int_0^{\infty}\int_{-\infty}^{\infty} f(\xi)\left[\cos(\omega\xi)\cos(\omega x) + \sin(\omega\xi)\sin(\omega x)\right]d\xi\,d\omega$$
$$= \frac{1}{\pi}\int_0^{\infty}\int_{-\infty}^{\infty} f(\xi)\cos\left(\omega(\xi - x)\right)d\xi\,d\omega. \tag{15.3}$$

Of course, this integral has the same convergence properties as the integral expression (15.2), since it is just a rearrangement of that integral.
PROBLEMS

In each of Problems 1 through 10, expand the function in a Fourier integral and determine what this integral converges to.

1. f(x) = x for -π ≤ x ≤ π, f(x) = 0 for |x| > π

2. f(x) = -1 for -π < x ≤ 0, f(x) = 1 for 0 < x ≤ π, f(x) = 0 for |x| > π

3. f(x) = … for -10 ≤ x ≤ 10, f(x) = 0 for |x| > 10

4. f(x) = sin(x) for -3π ≤ x ≤ π, f(x) = 0 for x < -3π and for x > π

5. f(x) = x² for -100 ≤ x ≤ 100, f(x) = 0 for |x| > 100

6. f(x) = k for -π ≤ x ≤ 2π, f(x) = 0 for x < -π and for x > 2π

7. f(x) = sin(x) for -4 ≤ x ≤ 0, f(x) = cos(x) for 0 < x < 4, f(x) = 0 for |x| > 4

8. f(x) = 1/2 for -5 ≤ x < 1, f(x) = 1 for 1 ≤ x ≤ 5, f(x) = 0 for |x| > 5

9. f(x) = e^{-|x|}

10. f(x) = x e^{-4|x|}

11. Show that the Fourier integral of f can be written

$$\lim_{\omega\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty} f(t)\,\frac{\sin\left(\omega(t - x)\right)}{t - x}\,dt.$$

15.2
Fourier Cosine and Sine Integrals

If f is piecewise smooth on the half-line [0, ∞), and $\int_0^{\infty}|f(\xi)|\,d\xi$ converges, then we can write a Fourier cosine or sine integral for f that is completely analogous to the sine and cosine expansions of a function on an interval [0, L].

To write a cosine integral, extend f to an even function f_e defined on the whole line by setting

$$f_e(x) = \begin{cases} f(x) & \text{for } x \ge 0 \\ f(-x) & \text{for } x < 0 \end{cases}.$$

This reflects the graph for x > 0 back across the vertical axis. Since f_e is an even function, its Fourier integral has only cosine terms. Since f_e(x) = f(x) for x ≥ 0, this cosine integral can be defined to be the Fourier cosine integral of f on [0, ∞). The coefficient of f_e in its Fourier integral expansion is

$$\frac{1}{\pi}\int_{-\infty}^{\infty} f_e(\xi)\cos(\omega\xi)\,d\xi,$$

and this is

$$\frac{2}{\pi}\int_0^{\infty} f(\xi)\cos(\omega\xi)\,d\xi.$$

This suggests the following definition.
DEFINITION 15.1   Fourier Cosine Integral

Let f be defined on [0, ∞) and let $\int_0^{\infty}|f(\xi)|\,d\xi$ converge. The Fourier cosine integral of f is

$$\int_0^{\infty} A_\omega\cos(\omega x)\,d\omega, \quad\text{in which}\quad A_\omega = \frac{2}{\pi}\int_0^{\infty} f(\xi)\cos(\omega\xi)\,d\xi.$$

By applying the convergence theorem to the integral expansion of f_e, we find that, if f is piecewise smooth on each interval [0, L], then its cosine integral expansion converges to ½(f(x+) + f(x-)) for each x > 0, and to f(0+) for x = 0. In particular, at any positive x at which f is continuous, the cosine integral converges to f(x).

By extending f to an odd function f_o, similar to what we did with series, we obtain a Fourier integral for f_o containing only sine terms. Since f_o(x) = f(x) for x > 0, this gives a sine integral for f on [0, ∞).
DEFINITION 15.2   Fourier Sine Integral

Let f be defined on [0, ∞) and let $\int_0^{\infty}|f(\xi)|\,d\xi$ converge. The Fourier sine integral of f is

$$\int_0^{\infty} B_\omega\sin(\omega x)\,d\omega, \quad\text{in which}\quad B_\omega = \frac{2}{\pi}\int_0^{\infty} f(\xi)\sin(\omega\xi)\,d\xi.$$

If f is piecewise smooth on every interval [0, L], then this integral converges to ½(f(x+) + f(x-)) on (0, ∞). As with Fourier sine series on a bounded interval, this Fourier sine integral converges to 0 at x = 0.
EXAMPLE 15.2  Laplace's Integrals

Let f(x) = e^{-kx} for x ≥ 0, with k a positive constant. Then f is continuously differentiable on any interval [0, L], and

∫₀^∞ e^{-kx} dx = 1/k.

For the Fourier cosine integral, compute the coefficients

A_ω = (2/π) ∫₀^∞ e^{-kξ} cos(ωξ) dξ = (2/π) · k/(k² + ω²).
CHAPTER 15  The Fourier Integral and Fourier Transforms

The Fourier cosine integral representation of f converges to e^{-kx} for x > 0:

e^{-kx} = (2k/π) ∫₀^∞ cos(ωx)/(k² + ω²) dω.
For the sine integral, compute

B_ω = (2/π) ∫₀^∞ e^{-kξ} sin(ωξ) dξ = (2/π) · ω/(k² + ω²).

The sine integral converges to e^{-kx} for x > 0 and to 0 for x = 0:

e^{-kx} = (2/π) ∫₀^∞ (ω/(k² + ω²)) sin(ωx) dω  for x > 0.

These integral representations are called Laplace's integrals because A_ω is 2/π times the Laplace transform of sin(kx), while B_ω is 2/π times the Laplace transform of cos(kx).
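Laplace's integrals are easy to spot-check numerically. The sketch below is illustrative only (the values of k and x are arbitrary choices, not from the text); it uses SciPy's `quad` with `weight='cos'`, which applies a QUADPACK rule for semi-infinite oscillatory integrands, to evaluate the cosine integral.

```python
import numpy as np
from scipy.integrate import quad

k, x = 2.0, 1.5   # arbitrary sample values for the spot-check

# quad with weight='cos' computes ∫_0^∞ g(w) cos(x·w) dw for the given wvar.
val, _ = quad(lambda w: 1.0 / (k**2 + w**2), 0, np.inf,
              weight='cos', wvar=x)
approx = (2 * k / np.pi) * val   # (2k/π) ∫ cos(wx)/(k²+w²) dw
exact = np.exp(-k * x)           # the function the integral represents
```

The two values agree to quadrature accuracy, confirming the cosine-integral representation at this point.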
In each of Problems 1 through 10, find the Fourier sine integral and Fourier cosine integral representations of the function. Determine to what each integral converges.

1. f(x) = x² for 0 ≤ x ≤ 10; 0 for x > 10

5. f(x) = 2x+1 for 0 ≤ x < π; 2 for π ≤ x ≤ 3π; 0 for x > 3π

6. f(x) = x for 0 ≤ x < 1; x+1 for 1 ≤ x ≤ 2; 0 for x > 2

7. f(x) = e^{-x} cos(x) for x ≥ 0

8. f(x) = x e^{-3x} for x ≥ 0

9. f(x) = k for 0 ≤ x ≤ c; 0 for x > c, in which k is constant and c is a positive constant

10. f(x) = e^{-2x} cos(x) for x ≥ 0

11. Use the Laplace integrals to compute the Fourier cosine integral of f(x) = 1/(1+x²) and the Fourier sine integral of g(x) = x/(1+x²).
15.3 The Complex Fourier Integral and the Fourier Transform

It is sometimes convenient to have a complex form of the Fourier integral. This complex setting will prove a natural platform from which to develop the Fourier transform. Suppose f is piecewise smooth on each interval [-L, L], and that ∫_{-∞}^∞ |f(x)| dx converges. Then, at any x,

½(f(x+) + f(x-)) = (1/π) ∫₀^∞ ∫_{-∞}^∞ f(ξ) cos(ω(ξ - x)) dξ dω,
by the expression (15.3). Insert the complex exponential form of the cosine function into this expression to write

½(f(x+) + f(x-)) = (1/π) ∫₀^∞ ∫_{-∞}^∞ f(ξ) · ½(e^{iω(ξ-x)} + e^{-iω(ξ-x)}) dξ dω
= (1/2π) ∫₀^∞ ∫_{-∞}^∞ f(ξ) e^{iω(ξ-x)} dξ dω + (1/2π) ∫₀^∞ ∫_{-∞}^∞ f(ξ) e^{-iω(ξ-x)} dξ dω.
In the first integral on the last line, put ω = -ω to get

½(f(x+) + f(x-)) = (1/2π) ∫_{-∞}^0 ∫_{-∞}^∞ f(ξ) e^{-iω(ξ-x)} dξ dω + (1/2π) ∫₀^∞ ∫_{-∞}^∞ f(ξ) e^{-iω(ξ-x)} dξ dω.

Now write the variable of integration in the next to last integral as ω again and combine these two integrals to write

½(f(x+) + f(x-)) = (1/2π) ∫_{-∞}^∞ ∫_{-∞}^∞ f(ξ) e^{-iω(ξ-x)} dξ dω.   (15.4)
This is the complex Fourier integral representation of f on the real line. If we let

C_ω = ∫_{-∞}^∞ f(t) e^{-iωt} dt,

then this integral is

½(f(x+) + f(x-)) = (1/2π) ∫_{-∞}^∞ C_ω e^{iωx} dω.

We call C_ω the complex Fourier integral coefficient of f.
EXAMPLE 15.3

Let f(x) = e^{-a|x|} for all real x, with a a positive constant. We will compute the complex Fourier integral representation of f. First, we have

f(x) = { e^{-ax} for x ≥ 0; e^{ax} for x < 0 }.
Further,

∫_{-∞}^∞ |f(x)| dx = ∫_{-∞}^0 e^{ax} dx + ∫₀^∞ e^{-ax} dx = 2/a.

Now compute

C_ω = ∫_{-∞}^∞ e^{-a|t|} e^{-iωt} dt
= ∫_{-∞}^0 e^{at} e^{-iωt} dt + ∫₀^∞ e^{-at} e^{-iωt} dt
= ∫_{-∞}^0 e^{(a-iω)t} dt + ∫₀^∞ e^{-(a+iω)t} dt
= [e^{(a-iω)t}/(a-iω)]_{-∞}^0 + [-e^{-(a+iω)t}/(a+iω)]₀^∞
= 1/(a-iω) + 1/(a+iω)
= 2a/(a² + ω²).
The complex Fourier integral representation of f is

e^{-a|x|} = (a/π) ∫_{-∞}^∞ (1/(a² + ω²)) e^{iωx} dω.
The expression on the right side of equation (15.4) leads naturally into the Fourier transform. To emphasize a certain term, write equation (15.4) as

½(f(x+) + f(x-)) = (1/2π) ∫_{-∞}^∞ ( ∫_{-∞}^∞ f(ξ) e^{-iωξ} dξ ) e^{iωx} dω.   (15.5)

The term in parentheses is what we will call the Fourier transform of f.
DEFINITION 15.3  Fourier Transform

Suppose ∫_{-∞}^∞ |f(x)| dx converges. Then the Fourier transform of f is defined to be the function

F[f](ω) = ∫_{-∞}^∞ f(t) e^{-iωt} dt.

Thus the Fourier transform of f is the coefficient C_ω in the complex Fourier integral representation of f. The transform turns a function f into a new function called F[f]. Because the transform is used in signal analysis, we will often use t (for time) as the variable with f, and ω as the variable of the transformed function F[f]. The value of the function F[f] at ω is F[f](ω), and this number is computed for a given ω by evaluating the integral ∫_{-∞}^∞ f(t)e^{-iωt} dt. If we want to keep the variable t before our attention, we sometimes write F[f] as F[f(t)].

Engineers refer to the variable ω in the transformed function as the frequency of the signal f. Later we will discuss how the Fourier transform, and a truncated version called the windowed Fourier transform, are used to determine information about the frequency content of a signal.

Because the symbol F[f(t)] may be clumsy to use in calculations, we sometimes write the Fourier transform of f as f̂. In this notation,

F[f](ω) = f̂(ω).
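The defining integral can be evaluated numerically for absolutely integrable f. The helper below is a minimal sketch (not from the text): it splits the integral at t = 0 and into real and imaginary parts so that QUADPACK's semi-infinite oscillatory rules apply.

```python
import numpy as np
from scipy.integrate import quad

def fourier_transform(f, w):
    """Numerically evaluate F[f](w) = ∫_{-∞}^{∞} f(t) e^{-iwt} dt.

    Assumes f is absolutely integrable and w != 0; each half-line piece
    is handled by quad's weight='cos'/'sin' (Fourier-integral) rules.
    """
    re = (quad(f, 0, np.inf, weight='cos', wvar=w)[0]
          + quad(lambda t: f(-t), 0, np.inf, weight='cos', wvar=w)[0])
    im = (quad(f, 0, np.inf, weight='sin', wvar=w)[0]
          - quad(lambda t: f(-t), 0, np.inf, weight='sin', wvar=w)[0])
    return re - 1j * im   # F[f](w) = ∫ f cos(wt) dt - i ∫ f sin(wt) dt

# Cross-check against the coefficient computed in Example 15.3:
# F[e^{-a|t|}](w) = 2a/(a² + w²)
a, w = 1.0, 1.0
F = fourier_transform(lambda t: np.exp(-a * abs(t)), w)
```

At a = w = 1 the closed form gives exactly 1, which the quadrature reproduces.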
EXAMPLE 15.4

Let a be a positive constant. Then

F[e^{-a|t|}](ω) = 2a/(a² + ω²).

This follows immediately from Example 15.3, where we calculated the Fourier integral coefficient C_ω of e^{-a|t|}. This coefficient is the Fourier transform of f.
EXAMPLE 15.5

Let a and k be positive numbers, and let

f(t) = { k for -a ≤ t < a; 0 for |t| ≥ a }.

This pulse function can be written in terms of the Heaviside function as

f(t) = k[H(t+a) - H(t-a)],

and is graphed in Figure 15.2. The Fourier transform of f is

f̂(ω) = ∫_{-∞}^∞ f(t) e^{-iωt} dt = ∫_{-a}^a k e^{-iωt} dt
= [-(k/iω) e^{-iωt}]_{-a}^a = -(k/iω)(e^{-iωa} - e^{iωa})
= (2k/ω) sin(aω).
FIGURE 15.2  The pulse function f(t) = k[H(t+a) - H(t-a)].
Again, we can also write

F[f](ω) = (2k/ω) sin(aω),

or

F[f(t)](ω) = (2k/ω) sin(aω).
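Because the pulse is even, its transform is real, and the formula above is easy to verify by direct quadrature over the finite support. A small spot-check (illustrative sample values, not from the text):

```python
import numpy as np
from scipy.integrate import quad

a, k, w = 1.5, 2.0, 0.8   # sample half-width, height, and frequency

re, _ = quad(lambda t: k * np.cos(w * t), -a, a)    # Re ∫ k e^{-iwt} dt
im, _ = quad(lambda t: -k * np.sin(w * t), -a, a)   # Im part: odd integrand

exact = 2 * k * np.sin(a * w) / w                   # 2k sin(aw)/w from the text
```

The real part matches 2k sin(aω)/ω and the imaginary part vanishes, as evenness predicts.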
In view of equation (15.5), the Fourier integral representation of f is

(1/2π) ∫_{-∞}^∞ f̂(ω) e^{iωt} dω.

If f is continuous, and f' piecewise continuous on every interval [-L, L], then the Fourier integral of f represents f:

f(t) = (1/2π) ∫_{-∞}^∞ f̂(ω) e^{iωt} dω.   (15.6)

We can therefore use equation (15.6) as an inverse Fourier transform, retrieving f from f̂. This is important because, in applications, we use the Fourier transform to change a problem involving f from one form to another, presumably easier one, which is solved for f̂(ω). We must then have some way of getting back to the f(t) that we want, and equation (15.6) is the vehicle that is often used. We write F⁻¹[f̂] = f if F[f] = f̂. As we expect of any integral transform, F⁻¹ is linear:

F⁻¹[α f̂ + β ĝ] = α F⁻¹[f̂] + β F⁻¹[ĝ].
The integral defining the transform, and the integral (15.6) giving its inverse, are said to constitute a transform pair for the Fourier transform. Under certain conditions on f,

f̂(ω) = ∫_{-∞}^∞ f(t) e^{-iωt} dt  and  f(t) = (1/2π) ∫_{-∞}^∞ f̂(ω) e^{iωt} dω.
EXAMPLE 15.6

Let

f(t) = { 1-|t| for -1 ≤ t ≤ 1; 0 for |t| > 1 }.

Then f is continuous and absolutely integrable, and f' is piecewise continuous. Compute

f̂(ω) = ∫_{-∞}^∞ f(t) e^{-iωt} dt = ∫_{-1}^1 (1-|t|) e^{-iωt} dt = 2(1-cos(ω))/ω².
This is the Fourier coefficient C_ω in the complex Fourier expansion of f(t). If we want to go the other way, then by equation (15.6),

f(t) = (1/2π) ∫_{-∞}^∞ f̂(ω) e^{iωt} dω = (1/π) ∫_{-∞}^∞ ((1-cos(ω))/ω²) e^{iωt} dω.

We can verify this by explicitly carrying out this integration. A software package yields

(1/π) ∫_{-∞}^∞ ((1-cos(ω))/ω²) e^{iωt} dω = ½(t+1) signum(t+1) + ½(t-1) signum(t-1) - t signum(t),

in which

signum(t) = { 1 for t > 0; 0 for t = 0; -1 for t < 0 }.

This expression is equal to 1-|t| for -1 ≤ t ≤ 1, and 0 for t > 1 and for t < -1, verifying the integral for the inverse.

In the context of the Fourier transform, the amplitude spectrum is often taken to be a graph of |f̂(ω)|. This is in the same spirit as the use of this term in connection with the Fourier integral.
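The coefficient 2(1-cos ω)/ω² itself can be confirmed numerically; since the triangle function is even, the forward transform reduces to a real cosine integral over [-1, 1]. A quick sketch with an arbitrarily chosen ω (not part of the text):

```python
import numpy as np
from scipy.integrate import quad

w = 1.7   # arbitrary sample frequency

# forward transform of the triangle 1-|t| on [-1, 1]; even, so cosine only
got, _ = quad(lambda t: (1 - abs(t)) * np.cos(w * t), -1, 1)
exact = 2 * (1 - np.cos(w)) / w**2
```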
EXAMPLE 15.7

If f(t) = H(t)e^{-at}, then f̂(ω) = 1/(a+iω), so

|f̂(ω)| = 1/√(a² + ω²).

Figure 15.3 shows a graph of |f̂(ω)|. This graph is the amplitude spectrum of f(t) = H(t)e^{-at}.

EXAMPLE 15.8

The amplitude spectrum of the function f of Example 15.5 is a graph of |f̂(ω)| = |2k sin(aω)/ω|, shown in Figure 15.4.
We will now develop some of the important properties and computational rules for the Fourier transform. With each rule, there is also an inverse transform version, which we will also state. Throughout, we assume that ∫_{-∞}^∞ |f(t)| dt converges and, for the inverse version, that f is continuous and f' piecewise continuous on each [-L, L].
THEOREM 15.1  Time Shifting

If t₀ is a real number, then

F[f(t-t₀)](ω) = e^{-iωt₀} f̂(ω).

That is, if we shift time back t₀ units and replace f(t) by f(t-t₀), then the Fourier transform of this shifted function is the Fourier transform of f, multiplied by the exponential factor e^{-iωt₀}.
Proof

F[f(t-t₀)](ω) = ∫_{-∞}^∞ f(t-t₀) e^{-iωt} dt = e^{-iωt₀} ∫_{-∞}^∞ f(t-t₀) e^{-iω(t-t₀)} dt.

Let u = t-t₀ to write

F[f(t-t₀)](ω) = e^{-iωt₀} ∫_{-∞}^∞ f(u) e^{-iωu} du = e^{-iωt₀} f̂(ω). ■
EXAMPLE 15.9

Suppose we want the Fourier transform of the pulse of amplitude 6 which turns on at time 3 and off at time 7. This is the function

g(t) = { 6 for 3 ≤ t < 7; 0 for t < 3 and for t ≥ 7 },

shown in Figure 15.5. We can certainly compute ĝ(ω) by integration. But we can also observe that the midpoint of the pulse (that is, of the nonzero part) occurs at t = 5. Shift the graph 5 units to the left to center the pulse at zero (Figure 15.6). Calling this shifted pulse f, then f(t) = g(t+5). Shifting f five units to the right again just gets us back to g: g(t) = f(t-5). The point to this is that we already know the Fourier transform of f from Example 15.5:

F[f(t)](ω) = (12/ω) sin(2ω).

By the time-shifting theorem,

F[g(t)](ω) = F[f(t-5)](ω) = 12 e^{-5iω} sin(2ω)/ω.

The inverse version of the time-shifting theorem is:

F⁻¹[e^{-iωt₀} f̂(ω)](t) = f(t-t₀).   (15.7)
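The shifted result of Example 15.9 can be confirmed by direct (complex) quadrature over [3, 7]. A quick numerical sketch with an arbitrary sample ω (illustrative, not part of the text):

```python
import numpy as np
from scipy.integrate import quad

w = 1.1   # arbitrary sample frequency

# direct transform of the pulse g(t) = 6 on [3, 7)
re, _ = quad(lambda t: 6 * np.cos(w * t), 3, 7)
im, _ = quad(lambda t: -6 * np.sin(w * t), 3, 7)
direct = re + 1j * im

# time-shifting theorem: ĝ(w) = 12 e^{-5iw} sin(2w)/w
shifted = 12 * np.exp(-5j * w) * np.sin(2 * w) / w
```

The two complex values agree, illustrating that shifting in time only multiplies the transform by e^{-5iω}.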
EXAMPLE 15.10

Suppose we want

F⁻¹[e^{2iω}/(5+iω)].

FIGURE 15.5  g(t) = 6 for 3 ≤ t < 7, 0 otherwise. FIGURE 15.6  The function of Figure 15.5 shifted five units to the left. FIGURE 15.7  Graph of H(t+2)e^{-5(t+2)}.

The presence of the exponential factor suggests the inverse version of the time-shifting theorem. Put t₀ = -2 in equation (15.7) to write

F⁻¹[e^{2iω}/(5+iω)](t) = f(t-(-2)) = f(t+2),

where

f(t) = F⁻¹[1/(5+iω)](t) = H(t)e^{-5t}.

Therefore

F⁻¹[e^{2iω}/(5+iω)](t) = f(t+2) = H(t+2)e^{-5(t+2)}.

A graph of this function is shown in Figure 15.7.

The next result is reminiscent of the first shifting theorem for the Laplace transform (Theorem 3.7).

THEOREM 15.2  Frequency Shifting

If ω₀ is any real number, then

F[e^{iω₀t} f(t)](ω) = f̂(ω-ω₀). ■
Proof

F[e^{iω₀t} f(t)](ω) = ∫_{-∞}^∞ e^{iω₀t} f(t) e^{-iωt} dt = ∫_{-∞}^∞ f(t) e^{-i(ω-ω₀)t} dt = f̂(ω-ω₀). ■

The inverse version of the frequency-shifting theorem is

F⁻¹[f̂(ω-ω₀)](t) = e^{iω₀t} f(t).
THEOREM 15.3  Scaling

If a is a nonzero real number, then

F[f(at)](ω) = (1/|a|) f̂(ω/a).

This can be proved by a straightforward calculation proceeding from the definition. The inverse transform version of this result is

F⁻¹[f̂(ω/a)](t) = |a| f(at).

This conclusion is called a scaling theorem because we want the transform not of f(t), but of f(at), in which a can be thought of as a scaling factor. The theorem says that we can compute the transform of the scaled function by replacing ω by ω/a in the transform of the original function, and dividing by the magnitude of the scaling factor.
EXAMPLE 15.11

We know from Example 15.6 that, if

f(t) = { 1-|t| for -1 ≤ t ≤ 1; 0 for |t| > 1 },

then

f̂(ω) = 2(1-cos(ω))/ω².

Let

g(t) = f(7t) = { 1-|7t| for -1/7 ≤ t ≤ 1/7; 0 for |t| > 1/7 }.

Then

ĝ(ω) = F[f(7t)](ω) = (1/7) f̂(ω/7) = (1/7) · 2(1-cos(ω/7))/(ω/7)² = 14(1-cos(ω/7))/ω².
THEOREM 15.4  Time Reversal

F[f(-t)](ω) = f̂(-ω). ■

This result is called time reversal because we replace t by -t in f(t) to get f(-t). The transform of this new function is obtained by simply replacing ω by -ω in the transform of f(t). This conclusion follows immediately from the scaling theorem by putting a = -1. The inverse version of time reversal is

F⁻¹[f̂(-ω)](t) = f(-t).

THEOREM 15.5  Symmetry

F[f̂(t)](ω) = 2π f(-ω). ■

To understand this conclusion, begin with f(t) and take its Fourier transform f̂(ω). Replace ω by t and take the transform of the function f̂(t). The symmetry property of the Fourier transform states that the transform of f̂(t) is just the original function f(t) with t replaced by -ω, and then this new function multiplied by 2π.
EXAMPLE 15.12

Let

f(t) = { 4-t² for -2 ≤ t ≤ 2; 0 for |t| > 2 }.

FIGURE 15.8  f(t) = 4-t² for -2 ≤ t ≤ 2, 0 for |t| > 2.

Figure 15.8 shows a graph of f. The Fourier transform of f is

f̂(ω) = ∫_{-∞}^∞ f(t) e^{-iωt} dt = ∫_{-2}^2 (4-t²) e^{-iωt} dt = 4(sin(2ω) - 2ω cos(2ω))/ω³.

In this example, f(-t) = f(t), so exchanging -ω for ω should not make any difference in f̂(ω), and we can see that this is indeed the case.
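Since 4-t² is even, the transform is again a real cosine integral over the support, which makes the closed form easy to verify numerically (sample frequency chosen arbitrarily; illustration only):

```python
import numpy as np
from scipy.integrate import quad

w = 1.3   # arbitrary sample frequency

# f(t) = 4 - t² on [-2, 2] is even, so F[f](w) is a real cosine integral
direct, _ = quad(lambda t: (4 - t**2) * np.cos(w * t), -2, 2)
exact = 4 * (np.sin(2 * w) - 2 * w * np.cos(2 * w)) / w**3
```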
THEOREM 15.6  Modulation

If ω₀ is a real number, then

F[f(t) cos(ω₀t)](ω) = ½[f̂(ω+ω₀) + f̂(ω-ω₀)]

and

F[f(t) sin(ω₀t)](ω) = (i/2)[f̂(ω+ω₀) - f̂(ω-ω₀)]. ■

Proof  Put cos(ω₀t) = ½(e^{iω₀t} + e^{-iω₀t}) and use the linearity of F and the frequency-shifting theorem to get

F[f(t) cos(ω₀t)](ω) = F[½ e^{iω₀t} f(t) + ½ e^{-iω₀t} f(t)](ω)
= ½ F[e^{iω₀t} f(t)](ω) + ½ F[e^{-iω₀t} f(t)](ω)
= ½ f̂(ω-ω₀) + ½ f̂(ω+ω₀).

The second conclusion is proved similarly, using sin(ω₀t) = (1/2i)(e^{iω₀t} - e^{-iω₀t}). ■
PROBLEMS
In each of Problems 1 through 8, find the complex Fourier integral of the function and determine what this integral converges to.

1. f(x) = x e^{-|x|}

2. f(x) = 1-x for -1 ≤ x ≤ 1; 0 for |x| > 1

3. f(x) = 1 for 0 ≤ x ≤ 1; 0 otherwise

4. f(x) = sin(πx) for -5 ≤ x ≤ 5; 0 for |x| > 5

5. f(x) = x for -2 < x < 2; 0 for |x| > 2

6. f(x) = 1 for 0 < x ≤ k; -1 for -k ≤ x < 0; 0 for |x| > k, in which k is a positive constant

7. f(x) = cos(x) for 0 ≤ x ≤ 2; sin(x) for -2 ≤ x < 0; 0 for |x| > 2

8. f(x) = x² e^{-3|x|}

In each of Problems 9 through 18, find the Fourier transform of the function and graph the amplitude spectrum. Wherever k appears, it is a positive constant. For some problems one or more theorems from this section can be used in conjunction with the following transforms, which can be assumed:

F[e^{-a|t|}](ω) = 2a/(a² + ω²),  F[e^{-at²}](ω) = √(π/a) e^{-ω²/4a}.

10. f(t) = sin(t) for -k ≤ t < k; 0 for |t| > k

11. f(t) = 5[H(t-3) - H(t-11)]

12. f(t) = 5e^{-3(t-5)²}

13. f(t) = H(t-k) e^{-t/4}

14. f(t) = H(t-k)t2

15. f(t) = 1/(1+t²)

16. f(t) = 3H(t-2) e^{-3t}

17. f(t) = 3e^{-4|t+2|}

18. f(t) = H(t-3) e^{-2t}

In each of Problems 19 through 24, find the inverse Fourier transform of the function.

19. 9e^{-(ω+4)²/32} e^{(20-4ω)i}

20. e^{(2ω-6)i}/(3-(5-ω)i)

21. 1/(5-(3-ω)i)

22. 10 sin(3ω)/(ω+π)

23. (1+iω)/(6-ω²+5iω)  Hint: Factor the denominator and use partial fractions.

24. 10(4+iω)/(9-ω²+8iω)
15.4  Additional Properties and Applications of the Fourier Transform

15.4.1  The Fourier Transform of a Derivative

In using the Fourier transform to solve differential equations, we need an expression relating the transform of f' to that of f. The following theorem provides such a relationship for derivatives of any order, and is called the operational rule for the Fourier transform. A similar issue arises for any integral transform when it is to be used in connection with differential equations (as in Theorems 3.5 and 3.6 for the Laplace transform).

Recall that the kth derivative of f is denoted f^(k). As a convenience we may let k = 0 in this symbol, with the understanding that f^(0) = f.
THEOREM 15.7  Differentiation in the Time Variable

Let n be a positive integer. Suppose f^(n-1) is continuous, and f^(n) is piecewise continuous on each interval [-L, L]. Suppose ∫_{-∞}^∞ |f^(n-1)(t)| dt converges. Suppose

lim_{t→∞} f^(k)(t) = lim_{t→-∞} f^(k)(t) = 0

for k = 0, 1, ..., n-1. Then

F[f^(n)(t)](ω) = (iω)ⁿ f̂(ω).

Proof  Begin with the first derivative. Integrating by parts, we have

F[f'](ω) = ∫_{-∞}^∞ f'(t) e^{-iωt} dt = [f(t) e^{-iωt}]_{-∞}^∞ - ∫_{-∞}^∞ f(t)(-iω) e^{-iωt} dt.

Now e^{-iωt} = cos(ωt) - i sin(ωt) has magnitude 1, and by assumption,

lim_{t→∞} f(t) = lim_{t→-∞} f(t) = 0.

Therefore

F[f'](ω) = iω ∫_{-∞}^∞ f(t) e^{-iωt} dt = iω f̂(ω).

The conclusion for higher derivatives follows by induction on n and the fact that f^(n)(t) = (d/dt) f^(n-1)(t). ■
The assumption that f is continuous in the operational rule can be relaxed to allow for a finite number of jump discontinuities, if we allow for these in the conclusion by adding appropriate terms. We will state this result for the transform of f'.

THEOREM 15.8

Suppose f is continuous on the real line, except for jump discontinuities at t₁, ..., t_M. Let f' be piecewise continuous on every [-L, L]. Assume that ∫_{-∞}^∞ |f(t)| dt converges, and that

lim_{t→∞} f(t) = lim_{t→-∞} f(t) = 0.

Then

F[f'](ω) = iω f̂(ω) - Σ_{j=1}^M [f(t_j+) - f(t_j-)] e^{-iωt_j}. ■
FIGURE 15.9  The function f has a jump discontinuity at t_j.

Each term f(t_j+) - f(t_j-) is the difference between the one-sided limits of f(t) at the jump discontinuity t_j. This shows up in Figure 15.9 as the size of the jump between the ends of the graph at this point.

Proof  Suppose first that f has a single jump discontinuity, at t₁. In the event of more jump discontinuities, the argument proceeds along the same lines, but includes more of the type of calculation we are about to do. Integrate by parts:

F[f'](ω) = ∫_{-∞}^∞ f'(t) e^{-iωt} dt = ∫_{-∞}^{t₁} f'(t) e^{-iωt} dt + ∫_{t₁}^∞ f'(t) e^{-iωt} dt
= [f(t)e^{-iωt}]_{-∞}^{t₁-} + ∫_{-∞}^{t₁} f(t)(iω) e^{-iωt} dt + [f(t)e^{-iωt}]_{t₁+}^∞ + ∫_{t₁}^∞ f(t)(iω) e^{-iωt} dt
= f(t₁-)e^{-iωt₁} - f(t₁+)e^{-iωt₁} + iω ∫_{-∞}^∞ f(t) e^{-iωt} dt
= iω f̂(ω) - [f(t₁+) - f(t₁-)] e^{-iωt₁}. ■
Here is an example of the use of the operational rule in solving a differential equation.

EXAMPLE 15.13

Solve

y' - 4y = H(t)e^{-4t},

in which H is the Heaviside function. Thus the differential equation is

y' - 4y = { e^{-4t} for t > 0; 0 for t < 0 }.

Apply the Fourier transform to the differential equation to get

F[y'](ω) - 4ŷ(ω) = F[H(t)e^{-4t}](ω).
FIGURE 15.10  Graph of y(t).

Using Theorem 15.7 and the fact that F[H(t)e^{-4t}](ω) = 1/(4+iω), write this equation as

iω ŷ(ω) - 4ŷ(ω) = 1/(4+iω).

Solve for ŷ(ω) to obtain

ŷ(ω) = -1/(16+ω²).

The solution is

y(t) = F⁻¹[-1/(16+ω²)](t) = -(1/8) e^{-4|t|},

which is graphed in Figure 15.10. The inverse transform just obtained can be derived in several ways. We can use a table of Fourier transforms, or a software package that contains this transform. We can also see from Example 15.4 that

F⁻¹[2a/(a²+ω²)](t) = e^{-a|t|}

and choose a = 4. There is no arbitrary constant in this solution because the Fourier transform has returned the only solution that is continuous and bounded for all real t. Boundedness is assumed when we use the transform because of the required convergence of ∫_{-∞}^∞ |y(t)| dt.
15.4.2  Frequency Differentiation

The variable ω used for the Fourier transform is the frequency of f(t), since it occurs in the complex exponential e^{iωt}, which is cos(ωt) + i sin(ωt). In this context, differentiation of f̂(ω) with respect to ω is called frequency differentiation. We will now relate derivatives of f̂(ω) and f(t).

THEOREM 15.9  Frequency Differentiation

Let n be a positive integer. Let f be piecewise continuous on [-L, L] for every positive number L, and assume that ∫_{-∞}^∞ |tⁿ f(t)| dt converges. Then

F[tⁿ f(t)](ω) = iⁿ (dⁿ/dωⁿ) f̂(ω). ■
In particular, under the conditions of the theorem,

F[t f(t)](ω) = i (d/dω) f̂(ω)  and  F[t² f(t)](ω) = -(d²/dω²) f̂(ω).

Proof  We will prove the theorem for n = 1. The argument for larger n is similar. Apply Leibniz's rule for differentiation under the integral to write

(d/dω) f̂(ω) = (d/dω) ∫_{-∞}^∞ f(t) e^{-iωt} dt = ∫_{-∞}^∞ (∂/∂ω)[f(t) e^{-iωt}] dt
= ∫_{-∞}^∞ f(t)(-it) e^{-iωt} dt = -i ∫_{-∞}^∞ [t f(t)] e^{-iωt} dt
= -i F[t f(t)](ω). ■
EXAMPLE 15.14

Suppose we want to compute F[t² e^{-5|t|}]. Recall from Example 15.4 that

F[e^{-5|t|}](ω) = 10/(25+ω²).

By the frequency differentiation theorem,

F[t² e^{-5|t|}](ω) = i² (d²/dω²)[10/(25+ω²)] = 20(25 - 3ω²)/(25+ω²)³.
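Because t² e^{-5|t|} is even, its transform equals 2∫₀^∞ t² e^{-5t} cos(ωt) dt, which QUADPACK's oscillatory rule evaluates directly. A numerical spot-check at an arbitrary ω (illustration only):

```python
import numpy as np
from scipy.integrate import quad

w = 1.0   # arbitrary sample frequency

# even integrand: transform = 2 ∫_0^∞ t² e^{-5t} cos(wt) dt
direct = 2 * quad(lambda t: t**2 * np.exp(-5 * t), 0, np.inf,
                  weight='cos', wvar=w)[0]
exact = 20 * (25 - 3 * w**2) / (25 + w**2)**3
```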
15.4.3  The Fourier Transform of an Integral

The following enables us to take the transform of a function defined by an integral.

THEOREM 15.10
Let f be piecewise continuous on every interval [-L, L]. Suppose ∫_{-∞}^∞ |f(t)| dt converges. Suppose f̂(0) = 0. Then

F[∫_{-∞}^t f(τ) dτ](ω) = (1/iω) f̂(ω).

Proof  Let g(t) = ∫_{-∞}^t f(τ) dτ. Then g'(t) = f(t) for any t at which f is continuous, and g(t) → 0 as t → -∞. Further,

lim_{t→∞} g(t) = ∫_{-∞}^∞ f(τ) dτ = f̂(0) = 0.

We can therefore apply Theorem 15.7 to g to obtain

F[f(t)](ω) = F[g'(t)](ω) = iω F[g(t)](ω) = iω F[∫_{-∞}^t f(τ) dτ](ω).

This is equivalent to the conclusion to be proved. ■
15.4.4  Convolution

There are many transforms defined by integrals, and it is common for such a transformation to have a convolution operation. We saw a convolution for the Laplace transform in Chapter 3. We will now discuss convolution for the Fourier transform.
DEFINITION 15.4  Convolution

Let f and g be functions defined on the real line. Then f has a convolution with g if

1. ∫_a^b f(t) dt and ∫_a^b g(t) dt exist for every interval [a, b].
2. For every real number t,

∫_{-∞}^∞ |f(t-τ) g(τ)| dτ

converges.

In this event, we define the convolution f*g of f with g to be the function given by

(f*g)(t) = ∫_{-∞}^∞ f(t-τ) g(τ) dτ.
In this definition, we wrote (f*g)(t) for emphasis. However, the convolution is a function denoted f*g, so we can write just f*g(t) to indicate f*g evaluated at t.

THEOREM 15.11

Suppose f has a convolution with g. Then

1. (Commutativity of Convolution) g has a convolution with f, and f*g = g*f.
2. (Linearity) If f and g both have convolutions with h, and α and β are real numbers, then αf + βg also has a convolution with h, and

(αf + βg)*h = α(f*h) + β(g*h).

Proof  For (1), let z = t-τ to write

f*g(t) = ∫_{-∞}^∞ f(t-τ) g(τ) dτ = ∫_∞^{-∞} f(z) g(t-z)(-1) dz = ∫_{-∞}^∞ g(t-z) f(z) dz = g*f(t).

Conclusion (2) follows from elementary properties of integrals, given that the integrals involved converge. ■

We are now ready for the main results on convolution.

THEOREM 15.12
Suppose f and g are bounded and continuous on the real line, and that ∫_{-∞}^∞ |f(t)| dt and ∫_{-∞}^∞ |g(t)| dt both converge. Then:

1. ∫_{-∞}^∞ f*g(t) dt = ∫_{-∞}^∞ f(t) dt · ∫_{-∞}^∞ g(t) dt.
2. (Time Convolution) F[f*g](ω) = f̂(ω) ĝ(ω).
3. (Frequency Convolution) F[f(t)g(t)](ω) = (1/2π)(f̂ * ĝ)(ω). ■

The first conclusion is that the integral, over the real line, of the convolution of f with g is equal to the product of the integrals of f and of g over the line.

Time convolution states that the Fourier transform of a convolution is the product of the transforms of the functions. This formula can be stated

F[f*g](ω) = f̂(ω) ĝ(ω).

That is, the Fourier transform of the convolution of f with g is equal to the product of the transform of f with the transform of g. This has the important inverse version

F⁻¹[f̂(ω) ĝ(ω)](t) = f*g(t).

The inverse Fourier transform of the product of two transformed functions is equal to the convolution of these functions. This is sometimes of use in evaluating an inverse Fourier transform. If we want F⁻¹[h(ω)], and are able to factor h(ω) into f̂(ω)ĝ(ω), a product of the transforms of two known functions, then the inverse transform of h is the convolution of these known functions.

Frequency convolution can be stated

F[f(t)g(t)](ω) = (1/2π)(f̂ * ĝ)(ω).

The Fourier transform of a product of two functions is equal to 1/(2π) times the convolution of the transforms of these functions. The inverse version of frequency convolution is

F⁻¹[f̂(ω) * ĝ(ω)](t) = 2π f(t) g(t).

Proof
For (1), write

∫_{-∞}^∞ f*g(t) dt = ∫_{-∞}^∞ ( ∫_{-∞}^∞ f(t-τ) g(τ) dτ ) dt = ∫_{-∞}^∞ ( ∫_{-∞}^∞ f(t-τ) dt ) g(τ) dτ,

assuming the validity of this interchange of the order of integration. Now,

∫_{-∞}^∞ f(t-τ) dt = ∫_{-∞}^∞ f(t) dt

for any real number τ. Therefore

∫_{-∞}^∞ f*g(t) dt = ∫_{-∞}^∞ ( ∫_{-∞}^∞ f(t) dt ) g(τ) dτ = ∫_{-∞}^∞ f(t) dt ∫_{-∞}^∞ g(t) dt.

For (2), begin by letting F(t) = e^{-iωt} f(t) and G(t) = e^{-iωt} g(t) for real t and ω. Then

F[f*g](ω) = ∫_{-∞}^∞ f*g(t) e^{-iωt} dt
= ∫_{-∞}^∞ ( ∫_{-∞}^∞ f(t-τ) g(τ) dτ ) e^{-iωt} dt
= ∫_{-∞}^∞ ( ∫_{-∞}^∞ f(t-τ) e^{-iω(t-τ)} g(τ) e^{-iωτ} dτ ) dt.

Now recognize that the integral in large parentheses in the last line is the convolution of F with G. Then, by (1) of this theorem applied to F and G,

F[f*g](ω) = ∫_{-∞}^∞ F*G(t) dt = ∫_{-∞}^∞ F(t) dt ∫_{-∞}^∞ G(t) dt
= ∫_{-∞}^∞ f(t) e^{-iωt} dt ∫_{-∞}^∞ g(t) e^{-iωt} dt = f̂(ω) ĝ(ω).

We leave conclusion (3) to the student. ■
EXAMPLE 15.15

Suppose we want to compute

F⁻¹[1/((4+ω²)(9+ω²))].

Recognize the problem as one of computing the inverse transform of a product of functions whose individual transforms we know:

1/(4+ω²) = F[(1/4)e^{-2|t|}] = f̂(ω)  with f(t) = (1/4)e^{-2|t|},

and

1/(9+ω²) = F[(1/6)e^{-3|t|}] = ĝ(ω)  with g(t) = (1/6)e^{-3|t|}.

The inverse version of conclusion (2) tells us that

F⁻¹[1/((4+ω²)(9+ω²))](t) = F⁻¹[f̂(ω)ĝ(ω)](t) = f*g(t) = (1/24) ∫_{-∞}^∞ e^{-2|t-τ|} e^{-3|τ|} dτ.

We must be careful in evaluating this integral because of the absolute values in the exponents. First, if t > 0, then

24 f*g(t) = ∫_{-∞}^0 e^{-2(t-τ)} e^{3τ} dτ + ∫_0^t e^{-2(t-τ)} e^{-3τ} dτ + ∫_t^∞ e^{2(t-τ)} e^{-3τ} dτ.
15.4.5  Filtering and the Dirac Delta Function

A Dirac delta function is a pulse of infinite magnitude having infinitely short duration. One way to describe such an object mathematically is to form a short pulse

(1/2a)[H(t+a) - H(t-a)],

as shown in Figure 15.11, and take the limit as the width of the pulse approaches zero:

δ(t) = lim_{a→0+} (1/2a)[H(t+a) - H(t-a)].

This is not a function in the standard sense, but is an object called a distribution. Distributions are generalizations of the function concept. For this reason many theorems do not apply to δ(t).

FIGURE 15.11  y = (1/2a)[H(t+a) - H(t-a)].
However, there are some formal manipulations that yield useful results. First, if we take the Fourier transform of the pulse, we get

F[(1/2a)(H(t+a) - H(t-a))](ω) = (1/2a) ∫_{-a}^a e^{-iωt} dt = (1/2a)[-(1/iω) e^{-iωt}]_{-a}^a
= (1/2aiω)(e^{iaω} - e^{-iaω}) = sin(aω)/(aω).

By interchanging the limit and the operation of taking the transform, we have

F[δ(t)](ω) = F[lim_{a→0+} (1/2a)(H(t+a) - H(t-a))](ω)
= lim_{a→0+} F[(1/2a)(H(t+a) - H(t-a))](ω)
= lim_{a→0+} sin(aω)/(aω) = 1.

This leads us to consider the Fourier transform of the delta function to be the function that is identically 1. Further, putting δ(t) formally through the convolution, we have

F[δ * f](ω) = F[δ](ω) f̂(ω) = f̂(ω),

suggesting that

δ * f = f.

The delta function behaves like the identity under convolution. The following filtering property enables us to recover a function value by "summing" its values when hit with a shifted delta function.
Filtering
If f has a Fourier transform and is continuous at to, then f f( t) 8 ( t - to) dt= f( to) . This result can be modified to allow for a jump discontinuity of f at to . In this event w e get f 03 f(t)8(t - to) dt
= 2 (f( to+) +f( to-) )
15 .4.6 The Windowed Fourier Transfor m Suppose f is a signal . This means that f is a function that is defined over the real line, and ha s finite energy f If(t) Iz dt .
In analyzing f(t), we sometimes want to localize its frequency content with respect to the time variable. We have mentioned that f̂(ω) carries information about the frequencies of the signal. However, f̂(ω) does not particularize information to specific time intervals, since

f̂(ω) = ∫_{-∞}^∞ f(t) e^{-iωt} dt

and this integration is over all time. Hence the picture we obtain does not contain information about specific times, but instead enables us only to compute the total amplitude spectrum |f̂(ω)|. If we think of f(t) as a piece of music being played over time, we would have to wait until the entire piece was done before even computing this amplitude spectrum. However, we can obtain a picture of the frequency content of f(t) within given time intervals by windowing the function before taking its Fourier transform.

To do this, we first need a window function g, which is a function taking on nonzero values only on some closed interval, often [0, T] or [-T, T]. Figures 15.12 and 15.13 show typical graphs of such functions, one on [0, T] and the other on [-T, T]. The interval is called the support of g, and in this case that we are dealing with closed intervals, we say that g has compact support. The function g has zero values outside of this support interval. We window a function f with g by forming the product g(t)f(t), which vanishes outside of [-T, T].
FIGURE 15.12  Typical window function with compact support [0, T]. FIGURE 15.13  Typical window function with compact support [-T, T].
EXAMPLE 15.16

Consider the window function

g(t) = { 1 for -4 ≤ t ≤ 4; 0 for |t| > 4 },

having compact support [-4, 4]. This function is graphed in Figure 15.14(a), with the vertical segments at t = ±4 included to emphasize this interval. Let f(t) = t sin(t), shown in Figure 15.14(b). To window f with g, form the product g(t)f(t), shown in Figure 15.14(c). This windowed function vanishes outside the support of g. For this choice of g, windowing has the effect of turning the signal f(t) on at time -4 and turning it off at t = 4.

The windowed Fourier transform (with respect to the choice of g) is

f̂_win(ω) = F_win[f](ω) = ∫_{-∞}^∞ f(t) g(t) e^{-iωt} dt = ∫_{-T}^T f(t) g(t) e^{-iωt} dt.
FIGURE 15.14(a)  The window function g(t) = 1 for |t| ≤ 4, 0 for |t| > 4. FIGURE 15.14(b)  f(t) = t sin(t). FIGURE 15.14(c)  f windowed with g.
EXAMPLE 15.17

Let f(t) = 6e^{-|t|}. Then

f̂(ω) = ∫_{-∞}^∞ 6e^{-|t|} e^{-iωt} dt = 12/(1+ω²).

Use the window function

g(t) = { 1 for -2 ≤ t ≤ 2; 0 for |t| > 2 }.

Figure 15.15 shows a graph of the windowed function g(t)f(t). The windowed Fourier transform of f is

f̂_win(ω) = ∫_{-∞}^∞ 6e^{-|t|} g(t) e^{-iωt} dt = ∫_{-2}^2 6e^{-|t|} e^{-iωt} dt
= 12(1 - e^{-2}cos(2ω) + e^{-2}ω sin(2ω))/(1+ω²).
FIGURE 15.15  f(t) = 6e^{-|t|} windowed with g(t) = 1 for |t| ≤ 2, 0 for |t| > 2.

This gives the frequency content of the signal f in the time interval -2 ≤ t ≤ 2.

Often we use a shifted window function. Suppose the support of g is [-T, T]. If t₀ > 0, then the graph of g(t-t₀) is the graph of g(t) shifted t₀ units to the right. Now

f(t)g(t-t₀) = { f(t)g(t-t₀) for t₀-T ≤ t ≤ t₀+T; 0 for t < t₀-T and for t > t₀+T }.

Figures 15.16(a) through (d) illustrate this process. In this case we take the Fourier transform of the shifted windowed signal to be

f̂_{win,t₀}(ω) = F[f(t)g(t-t₀)](ω) = ∫_{t₀-T}^{t₀+T} f(t)g(t-t₀) e^{-iωt} dt.

This gives the frequency content of the signal in the time interval [t₀-T, t₀+T]. Engineers sometimes refer to this windowing process as time frequency localization.

If g is the window function, the center of g is defined to be the point

t_c = ∫_{-∞}^∞ t|g(t)|² dt / ∫_{-∞}^∞ |g(t)|² dt.
-T
T
FIGURE 15 .16(a) A window function g on [-T, T .
-)- t
to - T FIGURE 15 .16(b)
function g(t - to) .
to+ T Shifted windo w
FIGURE 15.16(c)  Typical signal f(t). FIGURE 15.16(d)  g(t-t₀)f(t).

The number

t_R = ( ∫_{-∞}^∞ (t-t_c)² |g(t)|² dt / ∫_{-∞}^∞ |g(t)|² dt )^{1/2}
is the radius of the window function. The width of the window function is 2t_R, and is referred to as the RMS duration of the window. It is assumed in this terminology that the integrals involved all converge. When we deal with the Fourier transform of the window function, similar terminology applies:
When we deal with the Fourier transform of the window function, similar terminology applies. The center of $\hat g$ is

$$\omega_c = \frac{\int_{-\infty}^{\infty}\omega\,|\hat g(\omega)|^2\,d\omega}{\int_{-\infty}^{\infty}|\hat g(\omega)|^2\,d\omega}$$

and the radius of $\hat g$ is

$$\omega_R = \left(\frac{\int_{-\infty}^{\infty}(\omega-\omega_c)^2\,|\hat g(\omega)|^2\,d\omega}{\int_{-\infty}^{\infty}|\hat g(\omega)|^2\,d\omega}\right)^{1/2}.$$

The width of $\hat g$ is 2\omega_R, a number referred to as the RMS bandwidth of the window function.

15.4.7  The Shannon Sampling Theorem

We will derive the Shannon sampling theorem, which states that a band-limited signal can be reconstructed from certain sampled values. A signal f is band-limited if its Fourier transform has compact support (has nonzero values only on a closed interval of finite length). This means that, for some L,
$$\hat f(\omega) = 0 \quad\text{if } |\omega| > L.$$

Usually we choose L to be the smallest number for which this condition holds. In this event L is the bandwidth of the signal. The total frequency content of such a signal f lies in the band [-L, L].

We will now show that we can reconstruct a band-limited signal from samples taken at appropriately chosen times. Begin with the integral for the inverse Fourier transform, assuming that we can recover f(t) for all real t from its transform:

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\hat f(\omega)e^{i\omega t}\,d\omega.$$

Because f is band-limited,

$$f(t) = \frac{1}{2\pi}\int_{-L}^{L}\hat f(\omega)e^{i\omega t}\,d\omega. \tag{15.8}$$
Put this aside for the moment and write the complex Fourier series for $\hat f(\omega)$ on [-L, L]:

$$\hat f(\omega) = \sum_{n=-\infty}^{\infty} c_n e^{n\pi i\omega/L}, \tag{15.9}$$

where

$$c_n = \frac{1}{2L}\int_{-L}^{L}\hat f(\omega)e^{-n\pi i\omega/L}\,d\omega.$$

Comparing c_n with f(t) in equation (15.8), we conclude that

$$c_n = \frac{\pi}{L}\,f\!\left(-\frac{n\pi}{L}\right).$$

Substitute this into equation (15.9) to get

$$\hat f(\omega) = \frac{\pi}{L}\sum_{n=-\infty}^{\infty} f\!\left(-\frac{n\pi}{L}\right)e^{n\pi i\omega/L}.$$
Since n takes on all integer values in this summation, we can replace n with -n to write

$$\hat f(\omega) = \frac{\pi}{L}\sum_{n=-\infty}^{\infty} f\!\left(\frac{n\pi}{L}\right)e^{-n\pi i\omega/L}.$$

Now substitute this series for $\hat f(\omega)$ into equation (15.8) to get

$$f(t) = \frac{1}{2\pi}\int_{-L}^{L}\frac{\pi}{L}\sum_{n=-\infty}^{\infty} f\!\left(\frac{n\pi}{L}\right)e^{-n\pi i\omega/L}e^{i\omega t}\,d\omega.$$
Interchange the sum and the integral to get

$$\begin{aligned}
f(t) &= \frac{1}{2L}\sum_{n=-\infty}^{\infty} f(n\pi/L)\int_{-L}^{L} e^{i\omega(t-n\pi/L)}\,d\omega \\
&= \frac{1}{2L}\sum_{n=-\infty}^{\infty} f(n\pi/L)\left[\frac{1}{i(t-n\pi/L)}e^{i\omega(t-n\pi/L)}\right]_{-L}^{L} \\
&= \frac{1}{2L}\sum_{n=-\infty}^{\infty} f(n\pi/L)\,\frac{1}{i(t-n\pi/L)}\left(e^{i(Lt-n\pi)} - e^{-i(Lt-n\pi)}\right) \\
&= \sum_{n=-\infty}^{\infty} f(n\pi/L)\,\frac{1}{Lt-n\pi}\,\frac{1}{2i}\left(e^{i(Lt-n\pi)} - e^{-i(Lt-n\pi)}\right) \\
&= \sum_{n=-\infty}^{\infty} f(n\pi/L)\,\frac{\sin(Lt-n\pi)}{Lt-n\pi}.
\end{aligned} \tag{15.10}$$
This means that f(t) is known for all times t if just the function values f(n\pi/L) are determined for all integer values of n. An engineer would sample the signal f(t) at times 0, \pm\pi/L, \pm 2\pi/L, .... Once the values of f(t) are known at these times, equation (15.10) reconstructs the entire signal. This is actually the way engineers convert digital signals to analog signals, with application to technology such as that involved in making compact discs.
Equation (15.10) is known as the Shannon sampling theorem. We will encounter it again when we discuss wavelets. In the case L = \pi, the sampling theorem has the simple form

$$f(t) = \sum_{n=-\infty}^{\infty} f(n)\,\frac{\sin(\pi(t-n))}{\pi(t-n)}. \tag{15.11}$$
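The reconstruction formula (15.10) can be tried numerically. The sketch below is not from the text: the test signal sin(t)/t is band-limited with bandwidth 1 (its transform is \pi on [-1, 1]), so any L >= 1 is admissible; the choices L = 2 and the truncation of the infinite sum to |n| <= 2000 are arbitrary.

```python
import numpy as np

def f(t):
    # sin(t)/t, with value 1 at t = 0; np.sinc(x) is sin(pi x)/(pi x)
    return np.sinc(np.asarray(t) / np.pi)

L = 2.0                                     # assumed bandwidth bound
n = np.arange(-2000, 2001)                  # truncation of the sum (15.10)
samples = f(n * np.pi / L)                  # sampled values f(n pi / L)

def reconstruct(t):
    # sum over n of f(n pi/L) * sin(Lt - n pi)/(Lt - n pi)
    return float(np.sum(samples * np.sinc((L * t - n * np.pi) / np.pi)))

print(reconstruct(0.3), float(f(0.3)))      # nearly equal
print(reconstruct(1.7), float(f(1.7)))      # nearly equal
```

The truncated sum reproduces the signal at points between the sample times, which is the content of the theorem.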
15.4.8  Lowpass and Bandpass Filters

Consider a signal f, not necessarily band-limited. However, we assume that the signal has finite energy, so

$$\int_{-\infty}^{\infty}|f(t)|^2\,dt$$

is finite. Such functions are called square integrable, and we will also encounter them later with wavelet expansions. The spectrum of f is given by its Fourier transform

$$\hat f(\omega) = \int_{-\infty}^{\infty} f(t)e^{-i\omega t}\,dt.$$
If f is not band-limited, we can replace f with a band-limited signal $f_{\omega_0}$ with bandwidth not exceeding a positive number \omega_0 by applying a lowpass filter which cuts off $\hat f(\omega)$ at frequencies outside the range [-\omega_0, \omega_0]. That is, let

$$\hat f_{\omega_0}(\omega) = \begin{cases} \hat f(\omega) & \text{for } -\omega_0 \le \omega \le \omega_0 \\ 0 & \text{for } |\omega| > \omega_0. \end{cases}$$

This defines the transform of the function $f_{\omega_0}$, from which we recover $f_{\omega_0}$ by the inverse Fourier transform:

$$f_{\omega_0}(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\hat f_{\omega_0}(\omega)e^{i\omega t}\,d\omega = \frac{1}{2\pi}\int_{-\omega_0}^{\omega_0}\hat f(\omega)e^{i\omega t}\,d\omega.$$
The process of applying the lowpass filter is carried out mathematically by multiplying by an appropriate function (essentially windowing). Define the characteristic function $\chi_I$ of an interval I by

$$\chi_I(t) = \begin{cases} 1 & \text{if } t \text{ is in } I \\ 0 & \text{if } t \text{ is a real number not in } I. \end{cases}$$

Now observe that

$$\hat f_{\omega_0}(\omega) = \chi_{[-\omega_0,\omega_0]}(\omega)\hat f(\omega), \tag{15.12}$$

or, more succinctly, $\hat f_{\omega_0} = \chi_{[-\omega_0,\omega_0]}\hat f$.

In this context, $\chi_{[-\omega_0,\omega_0]}$ is called the transfer function. Its graph is shown in Figure 15.17. The inverse Fourier transform of the transfer function is

$$\mathfrak{F}^{-1}\!\left[\chi_{[-\omega_0,\omega_0]}\right](t) = \frac{1}{2\pi}\int_{-\omega_0}^{\omega_0} e^{i\omega t}\,d\omega = \frac{\sin(\omega_0 t)}{\pi t},$$

whose graph is given in Figure 15.18. In the case that \omega_0 = \pi, this is the function, evaluated at t - n instead of t, which occurs in the Shannon sampling formula (15.11) that reconstructs f(t) from sampled values f(n) on the integers. For this reason sin(\omega_0 t)/(\pi t) is called the Shannon sampling function.
FIGURE 15.17  Graph of $\chi_{[-\omega_0,\omega_0]}$.
FIGURE 15.18  Graph of sin(\omega_0 t)/(\pi t) for \omega_0 = 2.7.
Now recall Theorem 15.12(2) and (3) of Section 15.4.4. Analog filtering in the time variable is done by convolution. If \varphi(t) is the filter function, then the effect of filtering a function f by \varphi is a new function g defined by

$$g(t) = (\varphi * f)(t) = \int_{-\infty}^{\infty}\varphi(\xi)f(t-\xi)\,d\xi.$$

Taking the Fourier transform of this equation, we have

$$\hat g(\omega) = \hat\varphi(\omega)\hat f(\omega).$$

We therefore filter in the frequency variable by taking a product of the Fourier transform of the filter function with the transform of the function being filtered. We can now formulate equation (15.12) as

$$f_{\omega_0}(t) = \left(\frac{\sin(\omega_0 t)}{\pi t} * f(t)\right).$$
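Equation (15.12) suggests a direct numerical experiment: multiply a discrete spectrum by the transfer function and invert. The sketch below uses the FFT as a stand-in for the Fourier transform; the two-sinusoid signal, the cutoff \omega_0 = 10, and the grid size are illustrative choices, not from the text.

```python
import numpy as np

# Lowpass filtering by multiplying the spectrum by chi_[-w0, w0].
N = 4096
T = 2 * np.pi
t = np.arange(N) * T / N
signal = np.sin(3 * t) + 0.5 * np.sin(40 * t)   # low + high component

spectrum = np.fft.fft(signal)
w = np.fft.fftfreq(N, d=T / N) * 2 * np.pi      # angular frequencies
w0 = 10.0
filtered = np.fft.ifft(np.where(np.abs(w) <= w0, spectrum, 0)).real

# The sin(40 t) component lies outside [-w0, w0] and is removed.
print(np.max(np.abs(filtered - np.sin(3 * t))))
```

Zeroing the spectrum outside [-\omega_0, \omega_0] leaves exactly the low-frequency component, in agreement with the convolution formulation above.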
This gives the lowpass filtering of f as the convolution of the Shannon sampling function with f. In lowpass filtering, we produce from the signal f a new signal $f_{\omega_0}$ that is band-limited. That is, we filter out the frequencies of the signal outside of [-\omega_0, \omega_0].

In a similar kind of filtering, called bandpass filtering, we want to filter out the effects of the signal outside of given bandwidths. A band-limited signal f can be decomposed into a sum of signals, each of which carries the information content of f within a certain given frequency band. To see how to do this, let f be a band-limited signal of bandwidth \Omega. Consider a finite increasing sequence of frequencies

$$0 < \omega_0 < \omega_1 < \cdots < \omega_N = \Omega,$$

and for j = 1, 2, ..., N form the transfer function

$$\hat\beta_j = \chi_{[-\omega_j,\,-\omega_{j-1}]} + \chi_{[\omega_{j-1},\,\omega_j]}.$$

This transfer function, which is a sum of characteristic functions of frequency intervals, is graphed in Figure 15.19. The bandwidth filter function \beta_j(t), which filters the frequency content of f(t) outside of the frequency range [\omega_{j-1}, \omega_j], is obtained by taking the inverse Fourier transform of $\hat\beta_j(\omega)$. We get

$$\beta_j(t) = \frac{\sin(\omega_j t) - \sin(\omega_{j-1} t)}{\pi t},$$

whose graph is shown in Figure 15.20.

FIGURE 15.19  Graph of $\chi_{[-\omega_j,-\omega_{j-1}]} + \chi_{[\omega_{j-1},\omega_j]}$.
FIGURE 15.20  Graph of $\beta_j(t) = (\sin(\omega_j t) - \sin(\omega_{j-1} t))/(\pi t)$ with \omega_j = 2.2 and \omega_{j-1} = 1.7.

Now define the functions

$$f_0(t) = \left(\frac{\sin(\omega_0 t)}{\pi t} * f\right)(t)$$

and, for j = 1, 2, ..., N,

$$f_j(t) = (\beta_j * f)(t).$$

Then each f_j(t) carries the content of the signal f(t) in the frequency range \omega_{j-1} <= \omega <= \omega_j, while f_0(t) carries the content in [0, \omega_0], which is the low-frequency range of f(t). Further,

$$f(t) = f_0(t) + f_1(t) + f_2(t) + \cdots + f_N(t), \tag{15.13}$$

giving a decomposition of the signal into components carrying the information of the signal for specific frequency intervals.
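The decomposition (15.13) can be illustrated in the same discrete setting as before. This is a sketch with arbitrary band edges and an arbitrary test signal: each component is produced by masking the spectrum with the corresponding transfer function, and the components sum back to the signal.

```python
import numpy as np

# Band components obtained from chi_[-w_j,-w_{j-1}] + chi_[w_{j-1},w_j]
# sum back to f, as in equation (15.13).
N = 2048
T = 2 * np.pi
t = np.arange(N) * T / N
f = np.sin(2 * t) + np.sin(15 * t) + np.sin(60 * t)

F = np.fft.fft(f)
w = np.abs(np.fft.fftfreq(N, d=T / N) * 2 * np.pi)
edges = [0.0, 10.0, 30.0, np.max(w) + 1.0]      # 0 < w0 < w1 < ... = Omega

parts = []
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (w >= lo) & (w < hi)                 # one frequency band
    parts.append(np.fft.ifft(np.where(mask, F, 0)).real)

print(np.max(np.abs(sum(parts) - f)))           # the bands reassemble f
```

Because the bands partition the frequency axis, the masks sum to 1 and the reassembly is exact up to floating-point error.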
PROBLEMS

In each of Problems 1 through 8, determine the Fourier transform of the function.

1. t/(9 + t^2)
2. 3 to9'Z
3. 26H(t)te^{-2t}
4. H(t - 3)(t - 3)e^{-4t}
5. (d/dt)[H(t)e^{-3t}]
6. t[H(t + 1) - H(t - 1)]
7. 5e^{3it}/(t^2 - 4t + 13)
8. H(t - 3)e^{2t}

In each of Problems 9, 10 and 11, use convolution to find the inverse Fourier transform of the function.

9. 1/(1 + i\omega)^2
10. 1/((1 + i\omega)(2 + i\omega))
11. sin(3\omega)/(\omega(2 + i\omega))

In each of Problems 12, 13 and 14, find the inverse Fourier transform of the function.

12. 6e^{4i\omega} sin(2\omega)/(9 + \omega^2)
13. e^{-3|\omega+4|} cos(2\omega + 8)
14. e^{-\omega^2/9} sin(8\omega)

15. Prove the following form of Parseval's theorem:

$$\int_{-\infty}^{\infty}|f(t)|^2\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}\left|\hat f(\omega)\right|^2\,d\omega.$$

16. The power content of a signal f(t) is defined to be $\int_{-\infty}^{\infty}|f(t)|^2\,dt$, assuming that this integral converges. Determine the power content of H(t)e^{-2t}.

17. Determine the power content of (1/t) sin(3t). Hint: Use the result of Problem 15.

18. Use the Fourier transform to solve y'' + 6y' + 5y = \delta(t - 3).

In each of Problems 19 through 24, compute the windowed Fourier transform of the given function f, using the windowing function g. Also compute the center and RMS bandwidth of the window function.

19. f(t) = t^2, g(t) = 1 for -5 <= t <= 5, 0 for |t| > 5
20. f(t) = cos(at), g(t) = 1 for -4\pi <= t <= 4\pi, 0 for |t| > 4\pi
21. f(t) = e^t, g(t) = 1 for 0 <= t <= 4, 0 for t < 0 and for t > 4
22. f(t) = e^t sin(\pi t), g(t) = 1 for -1 <= t <= 1, 0 for |t| > 1
23. f(t) = (t + 2)^2, g(t) = 1 for -2 <= t <= 2, 0 for |t| > 2
24. f(t) = H(t - ...), g(t) = 1 for 3\pi <= t <= 5\pi, 0 otherwise
15.5  The Fourier Cosine and Sine Transforms

We saw in Section 15.3 how the Fourier integral representation of a function suggested its Fourier transform. We will now show how the Fourier cosine and sine integrals of a function suggest cosine and sine transforms.

Suppose f(t) is piecewise smooth on each interval [0, L] and $\int_0^\infty |f(t)|\,dt$ converges. Then for each t at which f is continuous,

$$f(t) = \int_0^\infty a_\omega \cos(\omega t)\,d\omega,$$

where

$$a_\omega = \frac{2}{\pi}\int_0^\infty f(t)\cos(\omega t)\,dt.$$

Based on these two equations, we make the following definition.
DEFINITION 15.5  Fourier Cosine Transform

The Fourier cosine transform of f is defined by

$$\mathfrak{F}_C[f](\omega) = \int_0^\infty f(t)\cos(\omega t)\,dt. \tag{15.14}$$

Often we will denote $\mathfrak{F}_C[f](\omega) = \hat f_C(\omega)$.
Notice that

$$\hat f_C(\omega) = \frac{\pi}{2}a_\omega$$

and that

$$f(t) = \frac{2}{\pi}\int_0^\infty \hat f_C(\omega)\cos(\omega t)\,d\omega. \tag{15.15}$$

The integrals in expressions (15.14) and (15.15) form the transform pair for the Fourier cosine transform. The latter enables us, under certain conditions, to recover f(t) from $\hat f_C(\omega)$.
EXAMPLE 15.18

Let K be a positive number, and let

$$f(t) = \begin{cases} 1 & \text{for } 0 \le t \le K \\ 0 & \text{for } t > K. \end{cases}$$

The Fourier cosine transform of f is

$$\hat f_C(\omega) = \int_0^\infty f(t)\cos(\omega t)\,dt = \int_0^K \cos(\omega t)\,dt = \frac{\sin(K\omega)}{\omega}.$$
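A quick numerical check of this example, with arbitrary sample values of K and \omega (assuming NumPy is available):

```python
import numpy as np

# The cosine transform of the pulse f(t) = 1 on [0, K] is sin(K w)/w.
K, w = 1.5, 2.0
t = np.linspace(0.0, K, 200001)
dt = t[1] - t[0]
integrand = np.cos(w * t)
# trapezoidal approximation of the integral over [0, K]
approx = (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])) * dt

print(approx, np.sin(K * w) / w)   # both about 0.0706
```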
The Fourier sine transform is defined in the same spirit .
DEFINITION 15.6  Fourier Sine Transform

The Fourier sine transform of f is defined by

$$\mathfrak{F}_S[f](\omega) = \int_0^\infty f(t)\sin(\omega t)\,dt.$$
We also denote this as $\hat f_S(\omega)$.

If f is continuous at t > 0, then the Fourier sine integral representation is

$$f(t) = \int_0^\infty b_\omega \sin(\omega t)\,d\omega,$$

where

$$b_\omega = \frac{2}{\pi}\int_0^\infty f(t)\sin(\omega t)\,dt.$$

Since

$$\hat f_S(\omega) = \frac{\pi}{2}b_\omega,$$

then

$$f(t) = \frac{2}{\pi}\int_0^\infty \hat f_S(\omega)\sin(\omega t)\,d\omega,$$

and this is the means by which we retrieve f(t) from $\hat f_S(\omega)$.
EXAMPLE 15.19

With f the function of Example 15.18,

$$\hat f_S(\omega) = \int_0^\infty f(t)\sin(\omega t)\,dt = \int_0^K \sin(\omega t)\,dt = \frac{1}{\omega}\left[1 - \cos(K\omega)\right].$$

Both of these transforms are linear:

$$\mathfrak{F}_C[\alpha f + \beta g] = \alpha\,\mathfrak{F}_C[f] + \beta\,\mathfrak{F}_C[g]$$

and

$$\mathfrak{F}_S[\alpha f + \beta g] = \alpha\,\mathfrak{F}_S[f] + \beta\,\mathfrak{F}_S[g],$$
whenever all of these transforms are defined.

When these transforms are used to solve differential equations, the following operational rules play a key role.

THEOREM 15.14  Operational Rules

Let f and f' be continuous on every interval [0, L], and let $\int_0^\infty |f(t)|\,dt$ converge. Suppose f(t) -> 0 and f'(t) -> 0 as t -> \infty. Suppose f'' is piecewise continuous on every interval [0, L]. Then,

1. $\mathfrak{F}_C[f''(t)](\omega) = -\omega^2\hat f_C(\omega) - f'(0)$

and

2. $\mathfrak{F}_S[f''(t)](\omega) = -\omega^2\hat f_S(\omega) + \omega f(0)$.

The theorem is proved by integrating by parts twice for each rule, and we leave the details to the student.

The operational formula dictates which transform is used to solve a given problem. If we seek a function f(t) for 0 < t < \infty and f(0) is specified, then we might consider a Fourier sine transform. If, however, information is given about f'(0), then the cosine transform might be appropriate. When we solve partial differential equations we will encounter examples where this strategy is invoked.
PROBLEMS

In each of Problems 1 through 6, determine the Fourier cosine transform and the Fourier sine transform of the function.

1. f(t) = e^{-t}
2. f(t) = te^{-at}, with a any positive number
3. f(t) = cos(t) for 0 <= t <= K, 0 for t > K, with K any positive number
4. f(t) = 1 for 0 <= t <= K, -1 for K < t <= 2K, 0 for t > 2K
5. f(t) = e^{-t} cos(t)
6. f(t) = sinh(t) for 0 <= t <= K, 0 for t > K

7. Show that, under appropriate conditions on f and its derivatives,

$$\mathfrak{F}_S\!\left[f^{(4)}(t)\right](\omega) = \omega^4\hat f_S(\omega) - \omega^3 f(0) + \omega f''(0).$$

8. Show that, under appropriate conditions on f and its derivatives,

$$\mathfrak{F}_C\!\left[f^{(4)}(t)\right](\omega) = \omega^4\hat f_C(\omega) + \omega^2 f'(0) - f'''(0).$$

Hint: Consider conditions that allow application of the operational formula to (f''(t))''.
15.6  The Finite Fourier Cosine and Sine Transforms

The Fourier transform, cosine transform, and sine transform are all motivated by the respective integral representations of a function. If we employ essentially the same line of reasoning, but using Fourier cosine and sine series instead of integrals, we obtain what are called finite transforms. Suppose f is piecewise smooth on [0, \pi].

DEFINITION 15.7  Finite Fourier Cosine Transform

The finite Fourier cosine transform of f is defined by

$$\tilde f_C(n) = \int_0^\pi f(x)\cos(nx)\,dx \quad\text{for } n = 0, 1, 2, \ldots.$$

If f is continuous at x in [0, \pi], then f(x) has the Fourier cosine series representation

$$f(x) = \frac{1}{2}a_0 + \sum_{n=1}^{\infty} a_n\cos(nx),$$

where

$$a_n = \frac{2}{\pi}\int_0^\pi f(x)\cos(nx)\,dx = \frac{2}{\pi}\tilde f_C(n).$$

Then

$$f(x) = \frac{1}{\pi}\tilde f_C(0) + \frac{2}{\pi}\sum_{n=1}^{\infty}\tilde f_C(n)\cos(nx),$$

an inversion-type expression from which we can recover f(x) from the finite Fourier cosine transform of f. By the same token, we can define a finite sine transform.
DEFINITION 15.8  Finite Fourier Sine Transform

The finite Fourier sine transform of f is defined by

$$\tilde f_S(n) = \int_0^\pi f(x)\sin(nx)\,dx \quad\text{for } n = 1, 2, \ldots.$$
For 0 < x < \pi, if f is continuous at x, then the sine series representation is

$$f(x) = \frac{2}{\pi}\sum_{n=1}^{\infty}\tilde f_S(n)\sin(nx),$$

an inversion formula for the finite sine transform.
EXAMPLE 15.20

Let f(x) = x^2 for 0 <= x <= \pi. For the finite cosine transform, compute

$$\tilde f_C(0) = \int_0^\pi x^2\,dx = \frac{\pi^3}{3}$$

and, for n = 1, 2, ...,

$$\tilde f_C(n) = \int_0^\pi x^2\cos(nx)\,dx = \frac{2\pi(-1)^n}{n^2}.$$

For the finite sine transform, compute

$$\tilde f_S(n) = \int_0^\pi x^2\sin(nx)\,dx = \frac{(-1)^n\left[2 - n^2\pi^2\right] - 2}{n^3}.$$
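The closed forms in this example can be verified by numerical integration. This sketch uses only the standard library; the grid size is an arbitrary choice.

```python
import math

# Finite Fourier sine transform of f(x) = x^2 on [0, pi], approximated
# by the trapezoidal rule, versus the closed form of Example 15.20.
def finite_sine_transform(n, m=20000):
    h = math.pi / m
    s = sum((j * h) ** 2 * math.sin(n * j * h) for j in range(1, m))
    return s * h   # endpoint terms vanish: x^2 sin(nx) is 0 at 0 and pi

for n in (1, 2, 3):
    exact = ((-1) ** n * (2 - n ** 2 * math.pi ** 2) - 2) / n ** 3
    print(n, finite_sine_transform(n), exact)
```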
Here are the fundamental operational rules for these transforms .
THEOREM 15.15  Operational Rules

Let f and f' be continuous on [0, \pi], and let f'' be piecewise continuous. Then

1. $\widetilde{(f'')}_C(n) = -n^2\tilde f_C(n) - f'(0) + (-1)^n f'(\pi)$ for n = 0, 1, 2, ...,

and

2. $\widetilde{(f'')}_S(n) = -n^2\tilde f_S(n) + nf(0) - n(-1)^n f(\pi)$ for n = 1, 2, .... ■

We will see applications of these finite transforms when we discuss partial differential equations.
PROBLEMS

In each of Problems 1 through 7, find the finite Fourier sine transform of the function.

1. K (any constant)
2. x
3. x^2
4. x^3
5. sin(ax)
6. cos(ax)
7. e^{-x}

In each of Problems 8 through 14, find the finite Fourier cosine transform of the function.

8. f(x) = 1 for 0 <= x < \pi/2, -1 for \pi/2 <= x <= \pi
9. x
10. x^2
11. x^3
12. cosh(ax)
13. sin(ax)
14. e^{-x}

15. Suppose f is continuous on [0, \pi] and f' is piecewise continuous. Prove that

$$\widetilde{(f')}_S(n) = -n\tilde f_C(n) \quad\text{for } n = 1, 2, \ldots.$$

16. Let f be continuous and f' piecewise continuous on [0, \pi]. Prove that

$$\widetilde{(f')}_C(n) = n\tilde f_S(n) - f(0) + (-1)^n f(\pi) \quad\text{for } n = 0, 1, 2, \ldots.$$
15.7  The Discrete Fourier Transform

If f has period p, its complex Fourier series is

$$\sum_{k=-\infty}^{\infty} d_k e^{ik\omega_0 t}.$$

Here \omega_0 = 2\pi/p and the complex Fourier coefficients are given by

$$d_k = \frac{1}{p}\int_a^{a+p} f(t)e^{-ik\omega_0 t}\,dt,$$

in which, because of the periodicity of f, a can be any number. If we substitute the value of \omega_0, the complex Fourier series of f is

$$\sum_{k=-\infty}^{\infty} d_k e^{2\pi ikt/p}.$$

Under certain conditions on f, this series converges to $\frac{1}{2}(f(t+) + f(t-))$ at any number t. We will choose a = 0 in the formula for the coefficients, so

$$d_k = \frac{1}{p}\int_0^p f(t)e^{-2\pi ikt/p}\,dt \quad\text{for } k = 0, \pm 1, \pm 2, \ldots.$$
To motivate the definition of the discrete Fourier transform, suppose we want to approximate d_k. One way is to subdivide [0, p] into N subintervals of equal length p/N, and choose a point t_j in each [jp/N, (j+1)p/N] for j = 0, 1, ..., N - 1. Now approximate d_k by a Riemann sum:

$$d_k \approx \frac{1}{p}\sum_{j=0}^{N-1} f(t_j)e^{-2\pi ikt_j/p}\,\frac{p}{N} = \frac{1}{N}\sum_{j=0}^{N-1} f(t_j)e^{-2\pi ikj/N}, \tag{15.16}$$

where in the last expression we have chosen t_j = jp/N, the left endpoint of each subinterval.
The N-point discrete Fourier transform is a rule that acts on a given sequence of N complex numbers and produces an infinite sequence of complex numbers, one for each integer k (although with periodic repetitions, as we will see later). We will define the transform in such a way that, except for the 1/N factor, the approximating sum (15.16) is exactly the N-point discrete Fourier transform of the numbers f(t_0), f(t_1), ..., f(t_{N-1}).
DEFINITION 15.9  N-Point Discrete Fourier Transform

Let N be a positive integer. Let $u = \{u_j\}_{j=0}^{N-1}$ be a sequence of N complex numbers. Then the N-point discrete Fourier transform of u is the sequence $\mathbb{D}[u]$ defined by

$$\mathbb{D}[u](k) = \sum_{j=0}^{N-1} u_j e^{-2\pi ijk/N} \quad\text{for } k = 0, \pm 1, \pm 2, \ldots.$$
To simplify the notation, we will use a convention used with the Laplace transform and denote the N-point discrete Fourier transform of a sequence u by U (lower case for the given sequence of N numbers, upper case of the same letter for its N-point discrete Fourier transform). In this notation, if $u = \{u_j\}_{j=0}^{N-1}$, then

$$U_k = \sum_{j=0}^{N-1} u_j e^{-2\pi ijk/N}.$$

We will also abbreviate the phrase "discrete Fourier transform" to DFT.
EXAMPLE 15.21

Consider the constant sequence $u = \{c\}_{j=0}^{N-1}$, in which c is a complex number. The N-point DFT is given by

$$U_k = \sum_{j=0}^{N-1} c\,e^{-2\pi ijk/N} = c\sum_{j=0}^{N-1}\left(e^{-2\pi ik/N}\right)^j.$$

Now recall that the sum of a finite geometric series is

$$\sum_{j=0}^{N-1} r^j = \frac{1-r^N}{1-r} \quad\text{for } r \ne 1. \tag{15.17}$$

If k is not an integer multiple of N, then $r = e^{-2\pi ik/N} \ne 1$, and applying (15.17) to U_k we have

$$U_k = c\,\frac{1-\left(e^{-2\pi ik/N}\right)^N}{1-e^{-2\pi ik/N}} = c\,\frac{1-e^{-2\pi ik}}{1-e^{-2\pi ik/N}} = 0,$$

because, for any integer k, $e^{-2\pi ik} = \cos(2\pi k) - i\sin(2\pi k) = 1$. If k is an integer multiple of N, each term of the sum equals c, so U_k = Nc. Thus the N-point DFT of a constant sequence of N numbers vanishes at every k that is not a multiple of N. ■
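Definition 15.9 translates directly into a short program. The sketch below implements the transform as written and confirms the behavior of the constant sequence just computed, along with the periodic repetition in k noted earlier; the sequence length and constant are arbitrary choices.

```python
import cmath

def dft(u, k):
    # N-point discrete Fourier transform of the sequence u at integer k
    N = len(u)
    return sum(u[j] * cmath.exp(-2j * cmath.pi * j * k / N)
               for j in range(N))

u = [3.0] * 8                        # constant sequence, N = 8, c = 3
print(abs(dft(u, 3)))                # ~ 0 for k not a multiple of N
print(dft(u, 0))                     # N*c when k is a multiple of N
print(abs(dft(u, 3) - dft(u, 11)))  # periodicity: U_{k+N} = U_k
```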
EXAMPLE 15.22

Let a be a complex number and N a positive integer. To avoid trivialities, suppose a is not an integer multiple of \pi. We will find the N-point DFT of the sequence $u = \{\sin(ja)\}_{j=0}^{N-1}$. Denoting this transform by the upper case letter, we have

$$U_k = \sum_{j=0}^{N-1}\sin(ja)e^{-2\pi ijk/N}.$$

Use the fact that

$$\sin(ja) = \frac{1}{2i}\left(e^{ija} - e^{-ija}\right)$$

to write

$$U_k = \frac{1}{2i}\sum_{j=0}^{N-1} e^{ija}e^{-2\pi ijk/N} - \frac{1}{2i}\sum_{j=0}^{N-1} e^{-ija}e^{-2\pi ijk/N} = \frac{1}{2i}\sum_{j=0}^{N-1}\left(e^{ia-2\pi ik/N}\right)^j - \frac{1}{2i}\sum_{j=0}^{N-1}\left(e^{-ia-2\pi ik/N}\right)^j.$$

Upon using equation (15.17) on each sum, we have

$$U_k = \frac{1}{2i}\,\frac{1-\left(e^{ia-2\pi ik/N}\right)^N}{1-e^{ia-2\pi ik/N}} - \frac{1}{2i}\,\frac{1-\left(e^{-ia-2\pi ik/N}\right)^N}{1-e^{-ia-2\pi ik/N}} = \frac{1}{2i}\,\frac{1-e^{iaN}}{1-e^{ia-2\pi ik/N}} - \frac{1}{2i}\,\frac{1-e^{-iaN}}{1-e^{-ia-2\pi ik/N}}, \tag{15.18}$$

since $e^{-2\pi ik} = 1$.

To make the example more explicit, suppose N = 5 and a = \sqrt 2. Then the given sequence u is

$$u_0 = 0,\quad u_1 = \sin(\sqrt 2),\quad u_2 = \sin(2\sqrt 2),\quad u_3 = \sin(3\sqrt 2),\quad u_4 = \sin(4\sqrt 2).$$

The 5-point DFT U has kth term

$$U_k = \frac{1}{2i}\,\frac{1-e^{5i\sqrt 2}}{1-e^{i\sqrt 2 - 2\pi ik/5}} - \frac{1}{2i}\,\frac{1-e^{-5i\sqrt 2}}{1-e^{-i\sqrt 2 - 2\pi ik/5}}.$$

For example,

$$U_0 = \frac{1}{2i}\,\frac{1-e^{5i\sqrt 2}}{1-e^{i\sqrt 2}} - \frac{1}{2i}\,\frac{1-e^{-5i\sqrt 2}}{1-e^{-i\sqrt 2}} = \frac{\sin(\sqrt 2) + \sin(4\sqrt 2) - \sin(5\sqrt 2)}{2 - 2\cos(\sqrt 2)},$$

$$U_1 = \frac{1}{2i}\,\frac{1-e^{5i\sqrt 2}}{1-e^{i\sqrt 2 - 2\pi i/5}} - \frac{1}{2i}\,\frac{1-e^{-5i\sqrt 2}}{1-e^{-i\sqrt 2 - 2\pi i/5}},$$

and

$$U_2 = \frac{1}{2i}\,\frac{1-e^{5i\sqrt 2}}{1-e^{i\sqrt 2 - 4\pi i/5}} - \frac{1}{2i}\,\frac{1-e^{-5i\sqrt 2}}{1-e^{-i\sqrt 2 - 4\pi i/5}}. \; \blacksquare$$
We will develop some properties of this transform .
15.7.1  Linearity and Periodicity

If $u = \{u_j\}_{j=0}^{N-1}$ and $v = \{v_j\}_{j=0}^{N-1}$ are sequences of complex numbers, and a and b are complex numbers, then

$$au + bv = \{au_j + bv_j\}_{j=0}^{N-1}.$$

Linearity of the N-point DFT is the property

$$\mathbb{D}[au+bv](k) = aU_k + bV_k.$$

This follows immediately from the definition of the transform, since

$$\mathbb{D}[au+bv](k) = \sum_{j=0}^{N-1}(au_j + bv_j)e^{-2\pi ijk/N} = a\sum_{j=0}^{N-1}u_je^{-2\pi ijk/N} + b\sum_{j=0}^{N-1}v_je^{-2\pi ijk/N} = aU_k + bV_k.$$

Next we will show that the N-point DFT is periodic of period N. This means that, if the given sequence is $u = \{u_j\}_{j=0}^{N-1}$, then for any integer k,

$$U_{k+N} = U_k.$$

This can be seen in the DFT calculated in Example 15.22. In equation (15.18), replace k by k + N. In this example, this change shows up only in the terms $e^{\pm ia - 2\pi ik/N}$ in the denominators. But if k is replaced by k + N in such an exponential, no change results, since

$$e^{ia - 2\pi i(k+N)/N} = e^{ia - 2\pi ik/N}e^{-2\pi i} = e^{ia - 2\pi ik/N}.$$

The argument in general proceeds as follows:

$$U_{k+N} = \sum_{j=0}^{N-1}u_je^{-2\pi ij(k+N)/N} = \sum_{j=0}^{N-1}u_je^{-2\pi ijk/N}e^{-2\pi ij} = \sum_{j=0}^{N-1}u_je^{-2\pi ijk/N} = U_k,$$

since $e^{-2\pi ij} = 1$.
E
Uk -
J.e
-27rzJk/N
j=0
of a sequence { u1 } Nol of N numbers . We claim tha t 1
uj
=
N- 1
E Uke27rijk/N N k=o
for j = 0, 1, . . . , N -1 .
(15 .19)
Because this expression retrieves the original N-point sequence from its discrete transform, equation (15.19) is called the inverse N-point discrete Fourier transform.

To verify equation (15.19), it is convenient to put $W = e^{-2\pi i/N}$. Then $W^N = 1$ and $W^{-1} = e^{2\pi i/N}$. Now write

$$\frac{1}{N}\sum_{k=0}^{N-1}U_ke^{2\pi ijk/N} = \frac{1}{N}\sum_{k=0}^{N-1}U_kW^{-jk} = \frac{1}{N}\sum_{k=0}^{N-1}\sum_{r=0}^{N-1}u_re^{-2\pi irk/N}W^{-jk} = \frac{1}{N}\sum_{r=0}^{N-1}u_r\sum_{k=0}^{N-1}W^{rk}W^{-jk}. \tag{15.20}$$

In the last summation, observe that

$$W^{rk}W^{-jk} = e^{-2\pi irk/N}e^{2\pi ijk/N} = e^{-2\pi i(r-j)k/N} = W^{(r-j)k}.$$

For given j, if $r \ne j$, then by equation (15.17) for the finite sum of a geometric series,

$$\sum_{k=0}^{N-1}W^{rk}W^{-jk} = \sum_{k=0}^{N-1}W^{(r-j)k} = \sum_{k=0}^{N-1}\left(W^{r-j}\right)^k = \frac{1-\left(W^{r-j}\right)^N}{1-W^{r-j}} = 0,$$

because $\left(W^{r-j}\right)^N = e^{-2\pi i(r-j)} = 1$ and $W^{r-j} = e^{-2\pi i(r-j)/N} \ne 1$. But if r = j, then

$$\sum_{k=0}^{N-1}W^{rk}W^{-jk} = \sum_{k=0}^{N-1}W^{jk}W^{-jk} = \sum_{k=0}^{N-1}1 = N.$$

Therefore, in the last double sum in equation (15.20), we need retain only the term with r = j in the summation with respect to r, yielding

$$\frac{1}{N}\sum_{r=0}^{N-1}u_r\sum_{k=0}^{N-1}W^{rk}W^{-jk} = \frac{1}{N}u_j\sum_{k=0}^{N-1}W^{jk}W^{-jk} = \frac{1}{N}u_jN = u_j,$$

verifying equation (15.19).
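The inversion formula (15.19) can be exercised directly. A minimal sketch using only the standard library; the test sequence is an arbitrary choice:

```python
import cmath

def dft(u):
    # forward N-point DFT, k = 0, ..., N-1
    N = len(u)
    return [sum(u[j] * cmath.exp(-2j * cmath.pi * j * k / N)
                for j in range(N)) for k in range(N)]

def inverse_dft(U):
    # inversion formula (15.19)
    N = len(U)
    return [sum(U[k] * cmath.exp(2j * cmath.pi * j * k / N)
                for k in range(N)) / N for j in range(N)]

u = [1.0, 2.5 - 1j, -0.5, 4j, 3.0]
v = inverse_dft(dft(u))
print(max(abs(a - b) for a, b in zip(u, v)))   # ~ 0: round trip succeeds
```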
15.7.3  DFT Approximation of Fourier Coefficients

We began this section by defining the N-point DFT so that Riemann sums approximating the Fourier coefficients of a periodic function were 1/N times the N-point DFT of the sequence of function values at partition points of the interval. We will now pursue more closely the idea of approximating Fourier coefficients by a discrete Fourier transform, with the idea of sampling partial sums of Fourier series. This approximation also allows the application of DFT software to the approximation of Fourier coefficients.

We will consider a specific example, f(t) = sin(t) for 0 <= t < 4, with the understanding that f is extended over the entire real line with period 4. A graph of part of f is shown in Figure 15.21. With p = 4, the Fourier coefficients are

$$d_k = \frac{1}{4}\int_0^4 \sin(t)e^{-\pi ikt/2}\,dt = \frac{\cos(4) - 1 + \frac{1}{2}\pi ik\sin(4)}{\pi^2k^2 - 4}. \tag{15.21}$$

FIGURE 15.21  f(t) = sin(t) for 0 <= t < 4, extended periodically over the real line.

Now let N be a positive integer and subdivide [0, 4] into N subintervals of equal length 4/N. These subintervals are [4j/N, 4(j+1)/N] for j = 0, 1, ..., N - 1. Form N numbers by evaluating f(t) at the left endpoint of each of these subintervals. These points are 4j/N, so we obtain the N-point sequence

$$u = \left\{\sin\!\left(\frac{4j}{N}\right)\right\}_{j=0}^{N-1}.$$

Form the N-point DFT of this sequence:

$$U_k = \sum_{j=0}^{N-1}\sin\!\left(\frac{4j}{N}\right)e^{-2\pi ijk/N}.$$

Then

$$\frac{1}{N}U_k = \frac{1}{N}\sum_{j=0}^{N-1}\sin\!\left(\frac{4j}{N}\right)e^{-2\pi ijk/N}$$

is a Riemann sum for the integral defining d_k. We ask: to what extent does (1/N)U_k approximate d_k?

In this example we have the explicit expression (15.21) for d_k. We can also explicitly evaluate (1/N)U_k, using a = 4/N in the DFT of $\{\sin(ja)\}_{j=0}^{N-1}$ determined in Example 15.22. This gives us

$$\frac{1}{N}U_k = \frac{1}{N}\left[\frac{1}{2i}\,\frac{1-e^{4i}}{1-e^{4i/N - 2\pi ik/N}} - \frac{1}{2i}\,\frac{1-e^{-4i}}{1-e^{-4i/N - 2\pi ik/N}}\right].$$

Now approximate the exponential terms in the denominators by using the approximation $e^x \approx 1 + x$ for small |x|. A computation then reduces the bracketed expression to

$$\frac{1}{N}U_k \approx \frac{\cos(4) - 1 + \frac{1}{2}\pi ik\sin(4)}{\pi^2k^2 - 4},$$

which is exactly d_k. This approximation cannot be valid for all k, however. First, the approximation used for e^x assumes that |x| << 1, and second, the N-point DFT is periodic of period N, so U_{k+N} = U_k, while there is no such periodicity in the d_k's. In general it would be very difficult to derive an estimate on relative sizes of |k| and N that would result in (1/N)U_k approximating d_k to within a given tolerance, and which would hold for a reasonable class of functions. However, for many science and engineering applications, the empirical rule |k| <= N/8 has proved effective.
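The comparison between (1/N)U_k and d_k is easy to carry out numerically; the closed form for d_k below is the evaluation of the integral in (15.21). N = 128 and k = 2 are sample choices.

```python
import cmath, math

# Compare (1/N) U_k with the exact Fourier coefficient d_k of
# f(t) = sin(t) on [0, 4), extended with period 4.
N, k = 128, 2

U_k = sum(math.sin(4 * j / N) * cmath.exp(-2j * cmath.pi * j * k / N)
          for j in range(N))

d_k = (math.cos(4) - 1 + 0.5j * math.pi * k * math.sin(4)) \
      / (math.pi ** 2 * k ** 2 - 4)

print(U_k / N)   # close to d_k
print(d_k)
```

For this small |k| relative to N, the Riemann-sum approximation is accurate to a few parts in a thousand, consistent with the |k| <= N/8 rule.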
PROBLEMS

In each of Problems 1 through 6, compute $\mathbb{D}[u](k)$ for k = 0, \pm 1, ..., \pm 4 for the given sequence u.

1. u_j = cos(j)
2. u_j = e^{ij}
3. u_j = 1/(j+1)
4. u_j = 1/(j+1)^2
5. u_j = j^2
6. u_j = cos(j) - sin(j)

In each of Problems 7 through 12, a sequence $\{U_k\}_{k=0}^{N-1}$ is given. Determine the N-point inverse discrete Fourier transform of this sequence.

7. U_k = (1+i)^k, N = 6
8. U_k = i^{-k}, N = 5
9. U_k = e^{-ik}, N = 7
10. U_k = k^2, N = 5
11. U_k = cos(k), N = 5
12. U_k = ln(k+1), N = 6

In each of Problems 13 through 16, compute the first seven complex Fourier coefficients d_0, d_{\pm 1}, d_{\pm 2} and d_{\pm 3} of f (see Section 14.7). Then use the DFT to approximate these coefficients, with N = 128.

13. f(t) = cos(t) for 0 <= t < 2, f has period 2
14. f(t) = e^{-t} for 0 <= t < 3, f has period 3
15. f(t) = t^2 for 0 <= t < 1, f has period 1
16. f(t) = te^{2t} for 0 <= t < 4, f has period 4
15.8  Sampled Fourier Series

In the preceding subsection we discussed approximation of Fourier coefficients of a periodic function f. This was done by approximating terms of an N-point discrete Fourier transform formed by sampling f(t) at N points of [0, p]. We will now discuss the use of an inverse DFT to approximate sampled partial sums of the Fourier series of a periodic function (that is, partial sums evaluated at chosen points).

Consider the partial sum

$$S_M(t) = \sum_{k=-M}^{M} d_k e^{2\pi ikt/p}.$$

Subdivide [0, p] into N subintervals, and choose sample points t_j = jp/N for j = 0, 1, ..., N - 1.
Form the N-point sequence $u = \{f(jp/N)\}_{j=0}^{N-1}$ and approximate

$$d_k \approx \frac{1}{N}U_k,$$

where

$$U_k = \sum_{j=0}^{N-1} f(jp/N)e^{-2\pi ijk/N}.$$

In order to have |k| <= N/8, as mentioned at the end of the preceding subsection, we will require that M <= N/8. Thus,

$$S_M(t) \approx \frac{1}{N}\sum_{k=-M}^{M} U_k e^{2\pi ikt/p}.$$

In particular, if we sample this partial sum at the partition points jp/N, then

$$S_M(jp/N) \approx \frac{1}{N}\sum_{k=-M}^{M} U_k e^{2\pi ijk/N}.$$

We will show that the sum on the right is actually an N-point inverse DFT for a particular N-point sequence, which we will now determine. We will exploit the periodicity of the N-point DFT, that is, U_{k+N} = U_k for all integers k. In the part of the sum with negative index, replace k by k - N; since $U_{k-N} = U_k$ and $e^{2\pi ij(k-N)/N} = e^{2\pi ijk/N}$, this gives

$$S_M(jp/N) \approx \frac{1}{N}\sum_{k=-M}^{-1} U_k e^{2\pi ijk/N} + \frac{1}{N}\sum_{k=0}^{M} U_k e^{2\pi ijk/N} = \frac{1}{N}\sum_{k=N-M}^{N-1} U_k e^{2\pi ijk/N} + \frac{1}{N}\sum_{k=0}^{M} U_k e^{2\pi ijk/N}. \tag{15.22}$$
In these summations, we use the 2M + 1 numbers U_0, U_1, ..., U_M and U_{N-M}, ..., U_{N-1}. Since M <= N/8, we must fill in other values to obtain an N-point sequence. One way to do this is to fill in the other places with zeros. Thus define

$$V_k = \begin{cases} U_k & \text{for } k = 0, 1, \ldots, M \\ 0 & \text{for } k = M+1, \ldots, N-M-1 \\ U_k & \text{for } k = N-M, \ldots, N-1. \end{cases}$$

Then the Mth partial sum of the Fourier series of f, sampled at jp/N, is approximated by

$$S_M(jp/N) \approx \frac{1}{N}\sum_{k=0}^{N-1} V_k e^{2\pi ijk/N}.$$
EXAMPLE 15.23

Let f(t) = t for 0 <= t < 2, and extend f over the entire real line with period 2. Part of the graph of f is shown in Figure 15.22. The Fourier coefficients of f are

$$d_k = \frac{1}{2}\int_0^2 te^{-2\pi ikt/2}\,dt = \begin{cases} \dfrac{i}{\pi k} & \text{for } k \ne 0 \\ 1 & \text{for } k = 0, \end{cases}$$

and the complex Fourier series is

$$1 + \sum_{k=-\infty,\,k\ne 0}^{\infty}\frac{i}{\pi k}e^{\pi ikt}.$$

This converges to t on 0 < t < 2 and on periodic extensions of this interval. The Mth partial sum is

$$S_M(t) = 1 + \sum_{k=-M,\,k\ne 0}^{M}\frac{i}{\pi k}e^{\pi ikt}.$$

To be specific, choose N = 2^7 = 128 and M = 10, so M <= N/8. Sample the partial sum at points jp/N = j/64 for j = 0, 1, ..., 127. Then

$$u = \left\{f\!\left(\frac{j}{64}\right)\right\}_{j=0}^{127} = \left\{\frac{j}{64}\right\}_{j=0}^{127}.$$

The 128-point DFT of u has kth term

$$U_k = \sum_{j=0}^{127}\frac{j}{64}e^{-\pi ijk/64}.$$

Define

$$V_k = \begin{cases} U_k & \text{for } k = 0, 1, \ldots, 10 \\ 0 & \text{for } k = 11, \ldots, 117 \\ U_k & \text{for } k = 118, \ldots, 127. \end{cases}$$

FIGURE 15.22  f(t) = t for 0 <= t < 2, periodically extended over the real line.

Then

$$S_{10}(jp/N) = S_{10}(j/64) = 1 + \sum_{k=-10,\,k\ne 0}^{10}\frac{i}{\pi k}e^{\pi ijk/64} \approx \frac{1}{128}\sum_{k=0}^{127}V_ke^{\pi ijk/64}. \tag{15.23}$$
In understanding this discussion of approximation of sampled partial sums of a Fourier series, it is worthwhile to see the numbers actually play out in an example. We will do the computation of S_{10}(1/2), and then of the approximation (15.23) with j = 32. First,

$$S_{10}(1/2) = 1 + \sum_{k=-10,\,k\ne 0}^{10}\frac{i}{\pi k}e^{\pi ik/2} = 0.45847.$$

Now we must compute the V_k's. For these, we need the numbers

$$U_0 = \sum_{j=0}^{127}\frac{j}{64} = 127, \quad U_1 = \sum_{j=0}^{127}\frac{j}{64}e^{-\pi ij/64} = -1.0 + 40.735i, \quad \ldots$$

This gives the 128-point DFT approximation 0.47694 to the sampled partial sum S_{10}(1/2), which we computed to be 0.45847. The difference is 0.0185. The actual sum of the complex Fourier series at t = 1/2 is f(1/2) = 0.50000.

In practice, we would obtain greater accuracy by using much larger N (allowing larger M), and a software routine to do the computations.
15.8.1  Approximation of a Fourier Transform by an N-Point DFT

We will show how the discrete Fourier transform can be used to approximate the Fourier transform of a function, under certain conditions. Suppose, to begin, that $\hat f(\omega)$ can be approximated to within some acceptable tolerance by an integral over a finite interval:

$$\hat f(\omega) = \int_{-\infty}^{\infty} f(\xi)e^{-i\omega\xi}\,d\xi \approx \int_0^{2\pi L} f(\xi)e^{-i\omega\xi}\,d\xi.$$

Here we have written the length of the interval as 2\pi L for a reason that will reveal itself shortly. Subdivide [0, 2\pi L] into N subintervals of length 2\pi L/N and choose partition points \xi_j = 2\pi jL/N for j = 0, 1, ..., N. We can then approximate the integral on the right by a Riemann sum, obtaining

$$\hat f(\omega) \approx \sum_{j=0}^{N-1} f(2\pi jL/N)e^{-2\pi ijL\omega/N}\,\frac{2\pi L}{N} = \frac{2\pi L}{N}\sum_{j=0}^{N-1} f(2\pi jL/N)e^{-2\pi ijL\omega/N}.$$

The sum on the right is nearly in the form of a DFT. If we put \omega = k/L, with k any integer, then we have

$$\hat f(k/L) \approx \frac{2\pi L}{N}\sum_{j=0}^{N-1} f(2\pi jL/N)e^{-2\pi ijk/N}. \tag{15.24}$$

This gives $\hat f(k/L)$, the Fourier transform of f sampled at points k/L, approximated by 2\pi L/N times the N-point DFT of the sequence

$$\left\{f\!\left(\frac{2\pi jL}{N}\right)\right\}_{j=0}^{N-1}.$$

As noted previously, the DFT is periodic of period N, while $\hat f(k/L)$ is not, so we again make the restriction that |k| <= N/8.
EXAMPLE 15.24
We will test the approximation (15.24) for a simple case. Let

$$f(t) = \begin{cases} e^{-t} & \text{for } t \ge 0 \\ 0 & \text{for } t < 0. \end{cases}$$

Then f has Fourier transform

$$\hat f(\omega) = \int_{-\infty}^{\infty} f(\xi)e^{-i\omega\xi}\,d\xi = \int_0^\infty e^{-\xi}e^{-i\omega\xi}\,d\xi = \frac{1-i\omega}{1+\omega^2}.$$

Choose L = 1, N = 2^7 = 128 and k = 3 (keep in mind that we want |k| <= N/8). Now k/L = 3 and

$$\hat f(k/L) = \hat f(3) \approx \frac{2\pi}{128}\sum_{j=0}^{127} e^{-\pi j/64}e^{-6\pi ij/128} = \frac{\pi}{64}\sum_{j=0}^{127} e^{-\pi j/64}e^{-3\pi ij/64} = 0.12451 - 0.29884i.$$

For comparison,

$$\hat f(3) = \frac{1-3i}{10} = 0.1 - 0.3i.$$

Suppose we try a larger N, say N = 2^9 = 512. Now

$$\hat f(3) \approx \frac{2\pi}{512}\sum_{j=0}^{511} e^{-2\pi j/512}e^{-6\pi ij/512} = \frac{\pi}{256}\sum_{j=0}^{511} e^{-\pi j/256}e^{-3\pi ij/256} = 0.10595 - 0.2994i,$$

a better approximation than that obtained with N = 128.
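The two computations in this example can be reproduced with a few lines. This is a sketch; dft_approx is a hypothetical helper name implementing the sum in (15.24) for this particular f.

```python
import cmath, math

# Approximation (15.24) for f(t) = e^{-t} (t >= 0), whose transform is
# (1 - i w)/(1 + w^2), with L = 1 and k = 3 as in the example.
def dft_approx(N, L=1.0, k=3):
    h = 2 * math.pi * L / N                 # subinterval length 2 pi L / N
    return h * sum(math.exp(-h * j) * cmath.exp(-2j * cmath.pi * j * k / N)
                   for j in range(N))

exact = (1 - 3j) / 10
print(dft_approx(128))   # about 0.12451 - 0.29884i
print(dft_approx(512))   # about 0.10595 - 0.2994i, closer to exact
print(exact)             # 0.1 - 0.3i
```

Doubling the refinement twice roughly quarters the error in the real part, as the text's two computations suggest.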
EXAMPLE 15.25

We will continue from the preceding example. There the emphasis was on detailing the idea of approximating a value of $\hat f(\omega)$. Now we will use the same function, but carry out the approximation at enough points to sketch approximate graphs of $\mathrm{Re}[\hat f(\omega)]$, $\mathrm{Im}[\hat f(\omega)]$ and $|\hat f(\omega)|$. Using L = 4 and N = 2^8 = 256, we obtain the approximation

$$\hat f(k/4) \approx \frac{\pi}{32}\sum_{j=0}^{255} e^{-\pi j/32}e^{-\pi ijk/128}.$$
15.8 Sampled Fourier Series
687
We should have $|k| \le N/8 = 32$, although we will only compute approximate values of $\hat{f}(k/4)$ for $k = 1, \ldots, 13$. Because in this example we can compute $\hat{f}(\omega)$ exactly, these values are included in the table to allow comparison.

k = 1:  $\hat{f}(1/4)$:  DFT approx. $0.99107 - 0.23509i$;  exact $0.94118 - 0.23529i$
k = 2:  $\hat{f}(1/2)$:  DFT approx. $0.84989 - 0.3996i$;  exact $0.8 - 0.4i$
k = 3:  $\hat{f}(3/4)$:  DFT approx. $0.68989 - 0.4794i$;  exact $0.64 - 0.48i$
k = 4:  $\hat{f}(1)$:  DFT approx. $0.54989 - 0.4992i$;  exact $0.5 - 0.5i$
k = 5:  $\hat{f}(5/4)$:  DFT approx. $0.44013 - 0.4868i$;  exact $0.39024 - 0.4878i$
k = 6:  $\hat{f}(3/2)$:  DFT approx. $0.35758 - 0.46033i$;  exact $0.3077 - 0.4615i$
k = 7:  $\hat{f}(7/4)$:  DFT approx. $0.29605 - 0.42936i$;  exact $0.24615 - 0.43077i$
k = 8:  $\hat{f}(2)$:  DFT approx. $0.24989 - 0.39839i$;  exact $0.2 - 0.4i$
k = 9:  $\hat{f}(9/4)$:  DFT approx. $0.21484 - 0.36933i$;  exact $0.16495 - 0.37113i$
k = 10:  $\hat{f}(5/2)$:  DFT approx. $0.18782 - 0.34282i$;  exact $0.13793 - 0.34483i$
k = 11:  $\hat{f}(11/4)$:  DFT approx. $0.16668 - 0.31896i$;  exact $0.11679 - 0.32117i$
k = 12:  $\hat{f}(3)$:  DFT approx. $0.14989 - 0.29759i$;  exact $0.1 - 0.3i$
k = 13:  $\hat{f}(13/4)$:  DFT approx. $0.13638 - 0.27847i$;  exact $0.086486 - 0.28108i$
The real part of $\hat{f}(\omega)$ is consistently approximated in this scheme with an error of about $0.05$, while the imaginary part is approximated in many cases with an error of about $0.002$. Improved accuracy can be achieved by choosing $N$ larger. In Figures 15.23, 15.24, and 15.25, the approximate values of $\mathrm{Re}[\hat{f}(\omega)]$, $\mathrm{Im}[\hat{f}(\omega)]$, and $|\hat{f}(\omega)|$, respectively, are compared with the values obtained from the exact expression for $\hat{f}(\omega)$. The squares represent approximate values, and the dots are actual values. In Figure 15.24 the approximation is sufficiently close that the points are nearly indistinguishable (within the resolution of the diagram).
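The table entries are reproducible in a few lines. In this sketch (Python with NumPy, again our choice of tool), the sum uses $L = 4$ and $N = 256$ and spot-checks two of the rows against the exact transform $(1 - i\omega)/(1 + \omega^2)$:

```python
import numpy as np

L, N = 4, 256
m = np.arange(N)
samples = np.exp(-2*np.pi*m*L/N)                    # f(2*pi*m*L/N) for f(t) = e^{-t}
for k in (1, 12):
    approx = (2*np.pi*L/N)*np.sum(samples*np.exp(-2j*np.pi*m*k/N))
    exact = (1 - 1j*k/L)/(1 + (k/L)**2)
    print(k, approx, exact)    # k=1: ~0.99107-0.23509i; k=12: ~0.14989-0.29759i
```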
FIGURE 15.23 Comparison of the DFT approximation of $\mathrm{Re}[\hat{f}(\omega)]$ with actual values for $f(t) = e^{-t}$ for $t \ge 0$, $f(t) = 0$ for $t < 0$.
FIGURE 15.24 Comparison of the DFT approximation of $\mathrm{Im}[\hat{f}(\omega)]$ with actual values for $f(t) = e^{-t}$ for $t \ge 0$, $f(t) = 0$ for $t < 0$.
Thus far the discussion has centered on functions $f$ for which $\hat{f}(\omega)$ can be approximated by an integral $\int_0^{2\pi L} f(\xi) e^{-i\omega\xi}\,d\xi$. We can extend this idea to the case that $\hat{f}(\omega)$ is approximated by an integral over a symmetric interval of length $2\pi L$:

$$\hat{f}(\omega) \approx \int_{-\pi L}^{\pi L} f(\xi) e^{-i\omega\xi}\,d\xi.$$

Then

$$\hat{f}(k/L) \approx \int_{-\pi L}^{\pi L} f(\xi) e^{-ik\xi/L}\,d\xi = \int_{-\pi L}^{0} f(\xi) e^{-ik\xi/L}\,d\xi + \int_{0}^{\pi L} f(\xi) e^{-ik\xi/L}\,d\xi.$$
FIGURE 15.25 Comparison of the DFT approximation of $|\hat{f}(\omega)|$ with actual values for $f(t) = e^{-t}$ for $t \ge 0$, $f(t) = 0$ for $t < 0$.
Upon letting $\xi' = \xi + 2\pi L$ in the first integral of the last line, we have

$$\hat{f}(k/L) \approx \int_{\pi L}^{2\pi L} f(\xi' - 2\pi L) e^{-ik(\xi' - 2\pi L)/L}\,d\xi' + \int_{0}^{\pi L} f(\xi) e^{-ik\xi/L}\,d\xi$$
$$= \int_{\pi L}^{2\pi L} f(\xi' - 2\pi L) e^{-ik\xi'/L}\,d\xi' + \int_{0}^{\pi L} f(\xi) e^{-ik\xi/L}\,d\xi,$$

since $e^{2\pi i k} = 1$ if $k$ is an integer. Write $\xi$ for $\xi'$ as variable of integration, obtaining

$$\hat{f}(k/L) \approx \int_{\pi L}^{2\pi L} f(\xi - 2\pi L) e^{-ik\xi/L}\,d\xi + \int_{0}^{\pi L} f(\xi) e^{-ik\xi/L}\,d\xi.$$

Now define

$$g(t) = \begin{cases} f(t) & \text{for } 0 \le t < \pi L \\ \frac{1}{2}\bigl(f(\pi L) + f(-\pi L)\bigr) & \text{for } t = \pi L \\ f(t - 2\pi L) & \text{for } \pi L < t \le 2\pi L. \end{cases} \tag{15.25}$$
Then

$$\hat{f}(k/L) \approx \int_{0}^{2\pi L} g(\xi) e^{-ik\xi/L}\,d\xi = \int_{0}^{L} g(2\pi t) e^{-2\pi i k t/L} (2\pi)\,dt \quad (\text{let } \xi = 2\pi t)$$
$$= 2\pi \int_{0}^{L} g(2\pi t) e^{-2\pi i k t/L}\,dt.$$

Finally, approximate the last integral by a Riemann sum, subdividing $[0, L]$ into $N$ subintervals of length $L/N$ and choosing $t = jL/N$ for $j = 0, 1, \ldots, N - 1$. Then

$$\hat{f}(k/L) \approx \frac{2\pi L}{N} \sum_{j=0}^{N-1} g\!\left(\frac{2\pi j L}{N}\right) e^{-2\pi i j k/N}.$$

As before, we assume in using this approximation that $|k| \le N/8$. This approximates $\hat{f}(k/L)$ by a constant multiple of the $N$-point DFT of the sequence

$$\left[\, g\!\left(\frac{2\pi j L}{N}\right) \right]_{j=0}^{N-1},$$
in which points of the sequence are obtained from the function $g$ manufactured from $f$ according to equation (15.25).

15.8.2 Filtering

A periodic signal $f(t)$, of period $2L$, is often filtered for the purpose of cancelling out, or diminishing, certain unwanted effects, or perhaps for emphasizing certain effects one wants to study. Suppose $f(t)$ has complex Fourier series

$$\sum_{n=-\infty}^{\infty} d_n e^{\pi i n t/L},$$

where

$$d_n = \frac{1}{2L} \int_{-L}^{L} f(t) e^{-\pi i n t/L}\,dt.$$
Consider the $N$th partial sum

$$S_N(t) = \sum_{j=-N}^{N} d_j e^{\pi i j t/L}.$$

A filtered partial sum of the Fourier series of $f$ is a sum of the form

$$\sum_{j=-N}^{N} Z\!\left(\frac{j}{N}\right) d_j e^{\pi i j t/L}, \tag{15.26}$$
in which the filter function $Z$ is a continuous, even function on $[-1, 1]$. In particular applications the object is to choose $Z$ to serve some specific purpose. By way of introduction, we will illustrate filtering for a filter that actually forms a basic approach to the entire issue of convergence of Fourier series.

In the nineteenth century, there was an intense effort to understand the subtleties of convergence of Fourier series. An example of Du Bois-Reymond showed that it is possible for the Fourier series of a continuous function to diverge at a point. In the course of delving into the convergence question, it was observed that in many cases the sequence of averages of partial sums of a Fourier series is better behaved than the sequence of partial sums itself. This led to a consideration of averages of partial sums:

$$\sigma_N(t) = \frac{1}{N} \sum_{k=0}^{N-1} S_k(t) = \frac{1}{N}\bigl[S_0(t) + S_1(t) + \cdots + S_{N-1}(t)\bigr].$$
The quantity $\sigma_N(t)$ is called the $N$th Cesàro sum of $f$, after the Italian mathematician who studied their properties. It was found that, if the partial sums of the Fourier series approach a particular limit at $t$, then $\sigma_N(t)$ must approach the same limit as $N \to \infty$, but not conversely. It is possible for the Cesàro sums to have a limit for some $t$, but for the Fourier series to diverge there. It was the 19-year-old prodigy Fejér who proved that, if $f$ is periodic of period $2\pi$, and $\int_0^{2\pi} f(t)\,dt$ exists, then $\sigma_N(t) \to f(t)$ wherever $f$ is continuous. This is a stronger result than holds for the partial sums of the Fourier series. With this as background, write

$$\sigma_N(t) = \frac{1}{N} \sum_{k=0}^{N-1} \sum_{j=-k}^{k} d_j e^{\pi i j t/L}.$$
We leave it as an exercise for the student to show that the terms in this double sum can be rearranged to write

$$\sigma_N(t) = \sum_{n=-N}^{N} \left(1 - \frac{|n|}{N}\right) d_n e^{\pi i n t/L}.$$
This is of the form of equation (15.26) with the Cesàro filter function

$$Z(t) = 1 - |t| \quad \text{for } -1 \le t \le 1.$$

The sequence

$$\left\{ Z\!\left(\frac{n}{N}\right) \right\}_{n=-N}^{N} = \left\{ 1 - \frac{|n|}{N} \right\}_{n=-N}^{N}$$

is called the sequence of filter factors for the Cesàro filter.
FIGURE 15.26 $f(t) = -1$ for $-\pi \le t < 0$, $f(t) = 1$ for $0 \le t < \pi$, with $f(t + 2\pi) = f(t)$ for all real $t$.
One effect of the Cesàro filter is to damp out the Gibbs phenomenon, which is seen in the convergence of the Fourier series of a function at a point of discontinuity. As an example that displays the Gibbs phenomenon very clearly, consider

$$f(t) = \begin{cases} -1 & \text{for } -\pi \le t < 0 \\ 1 & \text{for } 0 \le t < \pi, \end{cases}$$

with periodic extension to the real line. Figure 15.26 shows a graph of this periodic extension. Its complex Fourier coefficients are

$$d_0 = \frac{1}{2\pi} \int_{-\pi}^{0} (-1)\,dt + \frac{1}{2\pi} \int_{0}^{\pi} 1\,dt = 0$$

and, for $n \ne 0$,

$$d_n = \frac{1}{2\pi} \int_{-\pi}^{0} (-1) e^{-int}\,dt + \frac{1}{2\pi} \int_{0}^{\pi} e^{-int}\,dt = i\,\frac{-1 + (-1)^n}{n\pi}.$$
The $N$th partial sum of this series is

$$S_N(t) = \sum_{n=-N,\, n \ne 0}^{N} i\,\frac{-1 + (-1)^n}{n\pi}\, e^{int}.$$

If $N$ is odd, then

$$S_N(t) = \frac{4}{\pi}\left(\sin(t) + \frac{1}{3}\sin(3t) + \frac{1}{5}\sin(5t) + \cdots + \frac{1}{N}\sin(Nt)\right).$$
The $N$th Cesàro sum (with $L = \pi$) is

$$\sigma_N(t) = \sum_{n=-N,\, n \ne 0}^{N} \left(1 - \frac{|n|}{N}\right) i\,\frac{-1 + (-1)^n}{n\pi}\, e^{int}.$$

This can be written

$$\sigma_N(t) = \sum_{n=1}^{N} \left(1 - \frac{n}{N}\right) \frac{2}{n\pi}\bigl(1 - (-1)^n\bigr)\sin(nt).$$
Figure 15.27 shows graphs of $S_{10}(t)$ and $\sigma_{10}(t)$, and Figure 15.28 shows graphs of $S_{20}(t)$ and $\sigma_{20}(t)$. In the partial sums $S_N(t)$, the Gibbs phenomenon is readily apparent near $t = 0$, where $f$ has a jump discontinuity. Even though $S_N(t) \to f(t)$ for $0 < t < \pi$ and for $-\pi < t < 0$, the graphs of $S_N(t)$ have relatively high peaks near zero which remain at nearly constant height even as $N$ increases (although these peaks move toward the vertical axis as $N$ increases). However, this phenomenon is not seen in the graphs of $\sigma_N(t)$, which accelerates and "smooths out" the convergence of the Fourier series.
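The damping of the Gibbs overshoot is easy to observe numerically. The following sketch (Python with NumPy; the grid and the choice $N = 49, 50$ are ours) evaluates $S_N$ and $\sigma_N$ for the square wave above on $(0, \pi/2]$; the partial sum overshoots $1$ by roughly 18% near the jump, while the Cesàro sum never exceeds $1$:

```python
import numpy as np

def S(t, N):
    # Nth partial sum for odd N: (4/pi) * sum over odd n <= N of sin(n t)/n
    n = np.arange(1, N + 1, 2)
    return (4/np.pi)*np.sum(np.sin(np.outer(t, n))/n, axis=1)

def sigma(t, N):
    # Nth Cesàro sum: sum_{n=1}^{N} (1 - n/N)(2/(n pi))(1 - (-1)^n) sin(n t)
    n = np.arange(1, N + 1)
    w = (1 - n/N)*(2/(n*np.pi))*(1 - (-1.0)**n)
    return np.sum(w*np.sin(np.outer(t, n)), axis=1)

t = np.linspace(0.001, np.pi/2, 2000)
print(S(t, 49).max())       # about 1.18: the Gibbs overshoot
print(sigma(t, 50).max())   # below 1: no overshoot
```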
FIGURE 15.27 $S_{10}(t)$ and $\sigma_{10}(t)$ for the function of Figure 15.26.

FIGURE 15.28 $S_{20}(t)$ and $\sigma_{20}(t)$ for the function of Figure 15.26.
The Cesàro filter also damps the effects of the higher frequency terms in the Fourier series, because the Cesàro filter factor $1 - |n|/N$ tends to zero as $n$ increases toward $N$. This effect is also seen in the graphs of the Cesàro sums.

There are many filters that are used in signal analysis. Two of the more commonly encountered ones are the Hamming and Gauss filters. The Hamming filter is named for Richard Hamming, who was for many years a senior scientist and researcher at Bell Labs. It is given by

$$Z(t) = 0.54 + 0.46\cos(\pi t).$$

The filtered $N$th partial sum of the complex Fourier series of $f$, using the Hamming filter, is

$$\sum_{n=-N}^{N} \bigl(0.54 + 0.46\cos(\pi n/N)\bigr) d_n e^{\pi i n t/L}.$$

Another filter frequently used to filter out background noise in a signal is the Gauss filter, named for the nineteenth century mathematician and scientist Carl Friedrich Gauss. It is given by

$$Z(t) = e^{-\alpha \pi^2 t^2},$$

in which $\alpha$ is a positive constant. The Gauss filtered partial sum of the complex Fourier series of $f$ is

$$\sum_{n=-N}^{N} e^{-\alpha \pi^2 n^2/N^2} d_n e^{\pi i n t/L}.$$
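A small numerical experiment makes the three filters concrete. This sketch (Python with NumPy; the choices $N = 25$, evaluation point $t = \pi/2$, and $\alpha = 1$ are ours) applies the Cesàro, Hamming, and Gauss filter factors to the square-wave series of the previous example; each filtered sum should come out close to $f(\pi/2) = 1$:

```python
import numpy as np

def filtered_sum(t, N, Z):
    # Filtered partial sum (15.26) for the square wave of Figure 15.26,
    # whose paired terms reduce to (4/(n pi)) sin(n t) for odd n.
    n = np.arange(1, N + 1, 2)
    return np.sum(Z(n/N)*(4/(n*np.pi))*np.sin(n*t))

cesaro  = lambda u: 1 - np.abs(u)
hamming = lambda u: 0.54 + 0.46*np.cos(np.pi*u)
gauss   = lambda u: np.exp(-np.pi**2*u**2)       # alpha = 1
for name, Z in [("Cesaro", cesaro), ("Hamming", hamming), ("Gauss", gauss)]:
    print(name, filtered_sum(np.pi/2, 25, Z))
```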
Filtering is also applied to Fourier transforms. The filtered Fourier transform of $f$, using the filter function $Z(t)$, is

$$\int_{-\infty}^{\infty} Z(\xi) f(\xi) e^{-i\omega\xi}\,d\xi.$$

If this integral is approximated by an integral over a finite interval,

$$\int_{-\infty}^{\infty} Z(\xi) f(\xi) e^{-i\omega\xi}\,d\xi \approx \int_{-L}^{L} Z(\xi) f(\xi) e^{-i\omega\xi}\,d\xi,$$

then it is standard practice to approximate the integral on the right using a DFT. The Cesàro, Hamming, and Gauss filters for this integral are, respectively,

$$Z(t) = 1 - \frac{|t|}{L} \quad (\text{Cesàro}), \qquad Z(t) = 0.54 + 0.46\cos(\pi t/L) \quad (\text{Hamming}),$$

and

$$Z(t) = e^{-\alpha(\pi t/L)^2} \quad (\text{Gauss}).$$

PROBLEMS

In each of Problems 1 through 6, a function is given, having period $p$. Compute the complex Fourier series of the function, and then the 10th partial sum of this series at the indicated point $t_0$. Then, using $N = 128$, compute a DFT approximation to this partial sum at the point.

1. $f(t) = 1 + t$ for $0 \le t < 1$
2. $f(t) = t^2$ for $0 \le t < 1$
3. $f(t) = \cos(t)$ for $0 \le t < 1$
4. $f(t) = e^{-t}$ for $0 \le t < 2$
5. $f(t) = 1$ for $0 \le t < 2$, $f(t) = -1$ for $-2 \le t < 0$
6. $f(t) = t\sin(t)$ for $0 \le t \le 4$, $p = 4$

In each of Problems 7 through 10, make a DFT approximation to the Fourier transform of $f$ at the given point, using $N = 512$ and the given value of $L$.

7. $f(t) = e^{-4t}$ for $t \ge 0$, $f(t) = 0$ for $t < 0$; $L = 3$, $\hat{f}(4)$
8. $f(t) = \cos(2t)$ for $t \ge 0$, $f(t) = 0$ for $t < 0$; $L = 6$, $\hat{f}(2)$
9. $f(t) = 0$ for $t < 0$; $L = 3$, $\hat{f}(12)$
10. $f(t) = t^2\cos(t)$ for $t \ge 0$, $f(t) = 0$ for $t < 0$; $L = 4$, $\hat{f}(4)$

In each of Problems 11 through 14, use the DFT to approximate graphs of $\mathrm{Re}[\hat{f}(\omega)]$, $\mathrm{Im}[\hat{f}(\omega)]$, and $|\hat{f}(\omega)|$ for $0 \le \omega \le 3$, using $N = 256$. For these functions, $\hat{f}(\omega)$ can be computed exactly. Graph each of the approximations of $\mathrm{Re}[\hat{f}(\omega)]$, $\mathrm{Im}[\hat{f}(\omega)]$, and $|\hat{f}(\omega)|$ on the same set of axes with, respectively, the actual function itself.

11. $f(t) = t[H(t-1) - H(t-2)]$
12. $f(t) = 2e^{-4|t|}$
13. $f(t) = H(t) - H(t-1)$
14. $f(t) = e^{-t}[H(t) - H(t-2)]$

In each of Problems 15 through 19, graph the function, the fifth partial sum of its Fourier series on the interval, and the fifth Cesàro sum, using the same set of axes. Repeat this process for the tenth and twenty-fifth partial sums. Notice in particular the graphs at points of discontinuity of the function, where the Gibbs phenomenon shows up in the partial sums.

15. $f(t) = -1$ for $-2 \le t < -1$, $f(t) = 0$ for $-1 \le t < 1/2$, $f(t) = 1$ for $1/2 \le t < 2$
16. $f(t) = 0$ for $t < 0$
17. $f(t) = t^2$
18. $f(t) = e^{t}$ for $-3 \le t < 1$, $f(t) = \cos(t)$ for $1 \le t < 3$
19. $f(t) = 2 + t$ for $-1 \le t < 1$

20. Let $f(t) = 1$ for $0 \le t < 2$, $f(t) = -1$ for $-2 \le t < 0$. Plot the fifth partial sum of the Fourier series for $f(t)$ on $[-2, 2]$, together with the fifth Cesàro sum, the fifth Hamming filtered partial sum, and the fifth Gauss filtered partial sum on the same set of axes. Repeat this for the tenth sums, and the twenty-fifth sums.

21. Let $f(t) = 2 + t$ for $-2 \le t < 2$. Plot the fifth partial sum of the Fourier series for $f(t)$ on $[-\pi, \pi]$, together with the fifth Cesàro sum, the fifth Hamming filtered partial sum, and the fifth Gauss filtered partial sum on the same set of axes. Repeat this for the tenth sums, and the twenty-fifth sums.
15.9 The Fast Fourier Transform

The discrete Fourier transform is a powerful tool for approximating Fourier coefficients, partial sums of Fourier series, and Fourier transforms. However, such a tool is only useful if there are efficient computing techniques for carrying out the large numbers of calculations involved in typical applications. This is where the Fast Fourier Transform, or FFT, comes in. The FFT is not a transform at all, but rather an efficient procedure for computing discrete Fourier transforms. Its impact in engineering and science over the past 35 years has been profound, because it makes the DFT a practical tool in analyzing data.

The FFT first appeared formally in 1965 in a five-page paper, "An Algorithm for the Machine Calculation of Complex Fourier Series," by James W. Cooley of IBM and John W. Tukey of Princeton University. The catalyst behind preparation and publication of the paper was Richard Garwin, a physicist who has consulted for federal agencies on questions involving weapons and defense policies. Garwin became aware that Tukey had developed an algorithm for computing Fourier transforms, a tool that Garwin needed for his own work. When Garwin took Tukey's ideas to the computer center at IBM Research in Yorktown Heights for the purpose of having them programmed, James Cooley was assigned to assist him. Because of the importance of an efficient method for computing Fourier transforms, word of Cooley's program quickly spread, and it became so much in demand that the Cooley-Tukey paper resulted.

After the paper's publication it was found that some of the concepts underlying the method, or similar to it, had already appeared in other contexts. Tukey himself has related that Phillip Rudnick of the Scripps Oceanographic Institute had reported programming a special case of the algorithm, using ideas from a paper by G. C. Danielson and Cornelius Lanczos.
Lanczos, a Hungarian-born physicist and mathematician whose career spanned many areas, had developed the essential ideas around 1938 and in the years following, when he was working on problems in numerical methods and Fourier analysis. Much earlier, Gauss had essentially discovered discrete Fourier analysis in calculating the orbit of Pallas, but of course there were no computers in the Napoleonic era.

Today the FFT has become a standard part of certain instrumentation software. For example, FT-NMR, which stands for Fourier Transform-Nuclear Magnetic Resonance, uses the FFT as part of its data analysis system.

The reason for this widespread use is the FFT's efficiency, which can be illustrated by a simple example. It can be shown that, if $N$ is a positive integer power of 2, then $\hat{f}(k/L)$ as given by equation (15.24) can be computed using no more than $4N\log_2(N)$ arithmetic operations. If we simply compute all of the sums and products involved in computing $\hat{f}(k/L)$, we must perform $N - 1$ additions and $N + 1$ multiplications, each duplicated $N$ times to get the approximations at $N$ points. This is a total of $N(N-1) + N(N+1) = 2N^2$ operations.

Suppose, to be specific, $N = 2^{20} = 1{,}048{,}576$. Now $2N^2 \approx 2.1990 \times 10^{12}$. If the computer we are using performs one million operations per second, this calculation will require about 2,199,023 seconds, or about 25.45 days of computer time. Since a given project may require computation of the Fourier transform of many functions, this is intolerable in terms of both time and money. By contrast, if $N = 2^n$, then $4N\log_2(N) = 2^{n+2}\log_2(2^n) = n2^{n+2}$. With $n = 20$, this is 83,886,080 operations. At one million operations per second, this will take a little under 84 seconds, a very substantial improvement over 25.45 days.
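The $O(N^2)$ versus $O(N\log_2 N)$ contrast is easy to see in code. A minimal sketch (Python with NumPy; the function name `naive_dft` is ours) computes the DFT directly, term by term, and confirms that it agrees with the library FFT, which produces the same numbers with far fewer operations:

```python
import numpy as np

def naive_dft(x):
    # Direct O(N^2) evaluation: for each k, a sum of N products.
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j*np.pi*np.outer(n, n)/N)    # matrix of N-th roots of unity
    return W @ x

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
print(np.allclose(naive_dft(x), np.fft.fft(x)))   # True
```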
15.9.1 Use of the FFT in Analyzing Power Spectral Densities of Signals
The FFT is routinely used to display graphs of the power spectral densities of signals. For example, consider the relatively simple signal

$$f(t) = \sin(2\pi(50)t) + 2\sin(2\pi(120)t) + \sin(2\pi(175)t) + \sin(2\pi(210)t).$$

$f(t)$ is written in this way to make the frequencies of the components readily identifiable. By writing $\sin(100\pi t)$ as $\sin(2\pi(50)t)$, we immediately see that this function has frequency 50. Figure 15.29 shows a plot of the power spectral density versus frequency in Hz.

FIGURE 15.29 FFT display of the power spectral density graph of $y = \sin(100\pi t) + 2\sin(240\pi t) + \sin(350\pi t) + \sin(420\pi t)$.
Where is the FFT in this? It is in the software that produced the plot. For this example, the graph was drawn using MATLAB and an FFT with $N = 2^{10} = 1024$. Using the same program and choice of $N$, Figure 15.30 shows the power spectral density graph of

$$g(t) = \cos(2\pi(25)t) + \cos(2\pi(80)t) + \cos(2\pi(125)t) + \cos(2\pi(240)t) + \cos(2\pi(315)t).$$

In both graphs the peaks occur at the primary frequencies of the function.
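The same picture can be reproduced outside MATLAB. In the sketch below (Python with NumPy; the sampling rate of 1024 Hz and the one-second record length are our choices, made so that each FFT bin is exactly 1 Hz wide), the four largest bins of the power spectrum land at the four component frequencies of $f(t)$:

```python
import numpy as np

fs, N = 1024, 1024                       # 1 s sampled at 1024 Hz: 1 Hz bins
t = np.arange(N)/fs
f = (np.sin(2*np.pi*50*t) + 2*np.sin(2*np.pi*120*t)
     + np.sin(2*np.pi*175*t) + np.sin(2*np.pi*210*t))
psd = np.abs(np.fft.rfft(f))**2/N        # one-sided power spectrum
peaks = np.sort(np.argsort(psd)[-4:])    # bins of the four largest values
print(peaks)                             # frequencies 50, 120, 175, 210 Hz
```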
FIGURE 15.30 FFT display of the power spectral density graph of $y = \cos(50\pi t) + \cos(160\pi t) + \cos(250\pi t) + \cos(480\pi t) + \cos(630\pi t)$.
15.9.2 Filtering Noise From a Signal

The FFT is sometimes used to filter noise from a signal. We discussed filtering previously, but the FFT is the tool for actually carrying it out. To illustrate, consider the signal

$$f(t) = \sin(2\pi(25)t) + \sin(2\pi(80)t) + \sin(2\pi(125)t) + \sin(2\pi(240)t) + \sin(2\pi(315)t).$$
This is a simple signal. However, the signal shown in Figure 15.31 corresponds more closely to reality, and was obtained from the graph of $f(t)$ by introducing zero-mean random noise. If we did not know the original signal $f(t)$, it would be difficult to identify from Figure 15.31 the main frequency components of $f(t)$ because of the effect of the noise. However, the Fourier transform sorts out the frequencies. The power spectral density of the noisy signal of Figure 15.31 is shown in Figure 15.32, where the five main frequencies can be identified easily. This particular plot does not reliably give the amplitudes, but the frequencies stand out very well. Figure 15.32 was done using the FFT via MATLAB, with $N = 2^9 = 512$.
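The experiment is easy to repeat. In this sketch (Python with NumPy rather than MATLAB; the sampling rate, record length, noise level, and random seed are all our choices), the five component frequencies still dominate the power spectrum after the signal is corrupted with zero-mean random noise:

```python
import numpy as np

rng = np.random.default_rng(1)
fs, N = 1024, 1024
t = np.arange(N)/fs
freqs = [25, 80, 125, 240, 315]
clean = sum(np.sin(2*np.pi*f0*t) for f0 in freqs)
noisy = clean + rng.standard_normal(N)         # zero-mean random noise
psd = np.abs(np.fft.rfft(noisy))**2/N
top5 = np.sort(np.argsort(psd)[-5:])           # bins of the five largest values
print(top5)
```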
FIGURE 15.31 A portion of the signal $y = \sin(50\pi t) + \sin(160\pi t) + \sin(250\pi t) + \sin(480\pi t) + \sin(630\pi t)$ corrupted with zero-mean random noise.
FIGURE 15.32 FFT calculation of the power spectral density of the signal of Figure 15.31.
15.9.3 Analysis of the Tides in Morro Bay

We will use the DFT and FFT to analyze a set of tidal data, seeking correlations between high and low tides and the relative positions of the sun, earth, and moon.

The forces that cause the tides were of great interest to Isaac Newton as he struggled to understand the world around him, and he devoted considerable space in the Principia to this topic. At one point, Newton required new tables of lunar positions from then Royal Astronomer Flamsteed, who, because of a busy schedule coupled with a personal feud with Newton, was not forthcoming with the data. Newton responded by exerting both professional and political pressure on Flamsteed, through his connections at court, finally forcing Flamsteed to publish the data at his own expense. Years later Flamsteed came into possession of the remaining copies of this book, and is reported to have given vent to his anger with Newton by burning every copy.

It was a triumph of Newton's theory of gravitation, applied to the system consisting of the earth, moon, and sun, that enabled Newton to account for two of the primary tides that occur each day. He was also able to explain why the tides have a twice-monthly maximum and minimum and why the extremes are greatest when the moon is farthest from the earth's equatorial plane. The elliptical orbit of the moon about the earth also accounts for the monthly variation in tide heights resulting from the change in the distance between the earth and moon throughout the month.

Morro Bay is near San Luis Obispo in California. Extensive data have been collected as the Pacific Ocean rolls in and out of the bay and tides wash up on the shore. Figure 15.33 shows a curve drawn through data points giving hourly tide heights for May 1993. We will analyze this data to determine the primary forces causing these tidal variations.
As a curiosity, comparison with Figure 2.12 even suggests the presence of beats in periodic tide oscillation!

Before carrying out this analysis, we need some background information. The length of a solar day is 24 hours. This is the time it takes the earth to spin once relative to the sun. The lunar day is 50 minutes longer than this. It takes the earth about 24.8 hours to spin once relative to the moon because the moon is traveling in the direction of the earth's rotation (Figure 15.34).

The sun exerts its primary tidal forces at a point on the earth twice each day, and the moon, twice each 24-hour-and-50-minute period. It is fairly clear why the tide should have a local maximum at a particular location when either the sun or moon is nearly above that point. It is not as obvious, however, that the tide will also rise at a point when either of these bodies is on the opposite side of the earth, as is observed. Newton was able to show that, as the earth/moon system travels about its center of mass (which is always interior to the earth), the moon actually exerts an outward force on the opposite side of the earth. The same is true of the earth/sun system. Hence both the sun and moon cause two daily tides.
FIGURE 15.33 Tide profile in Morro Bay from hourly data collected May 1993.
FIGURE 15.34 The lunar day: at $t = 24$ hours the earth has spun once relative to the sun, but not until $t = 24$ hours 50 minutes has it spun once relative to the moon.
Tidal forces are proportional to the product of the masses of the bodies involved, and inversely proportional to the cube of the distance between them. This enables us to determine the relative tidal forces of the moon and sun on the earth and its waters. Since the sun has a mass of approximately $27 \times 10^6$ times that of the moon and is about 390 times as far from the earth as the earth is from the moon, the sun's influence on the earth's tides is only about 0.46 times that of the moon's influence.

The semidiurnal (twice daily) tides caused by the sun and moon do not just vary between the same highs and lows each day. Other forces change the amplitudes of these highs and lows. These forces are periodic and are responsible for the beats that seem to be present in Figure 15.33. Authorities on tides claim that there are actually about 390 measurably significant partial tides. Depending on the application of the data, usually only seven to twelve of these are used in computing tables of high and low tides. We will focus for the rest of this discussion on three major contributing forces.

First, as the moon orbits the earth, the distance between the two changes from about 222,000 miles at perigee to 253,000 miles at apogee. With the inverse cube law of tidal forces this difference is significant. The time from perigee to apogee is about 27.55 days.

Next, since the moon gains on the sun by about 50 minutes each day, if the three bodies are in conjunction at some time then they will be in quadrature about seven days later. The twice daily tides will have large amplitudes when everything is aligned and the smallest variations when the earth/moon/sun angle is 90 degrees. The change from these greatest to smallest tide variations and back again is periodic with a period of 14.76 days, half the time it takes the moon to circle the earth.
The last tidal force we will consider is that resulting from the moon's orbit being tilted about 5 degrees from the plane containing the earth's orbit about the sun. The result of this deviation can be seen by observing the moon's location in the sky over a 1-month period. As the moon traverses the earth in its orbit, it will be above the Northern Hemisphere for a while, helping create high tides in that region. Then it will move in a southerly direction, and while it is in the Southern Hemisphere there is little variation in the tides in the north. It takes 13.66 days for the moon to move from the most northerly point to that farthest south.

The principal periods resulting from these forces are the solar semidiurnal period of 12 hours; the lunar semidiurnal period of 12 hours, 50 minutes, 14 seconds; a lunar-solar diurnal period of 23 hours, 56 minutes, 4 seconds; and a lunar diurnal period of 25 hours, 49 minutes, 10 seconds.

Now consider the actual data used to generate the graph in Figure 15.33, and look for this information. Apply the FFT to calculate the DFT of this set of 720 data points, take absolute values, and plot the resulting points. This results in the amplitude spectrum of Figure 15.35. The units along the horizontal (frequency) axis are cycles per 720 hours.

Begin from the right side of the amplitude spectrum in Figure 15.35 and move left. The first place we see a high point is at about 60, which indicates a term in the data at a frequency of 60/720, or 1/12 cycles per hour. Equivalently, this point denotes the presence of a force that is felt about every 12 hours. This is the solar semidiurnal force.

The next high point in the amplitude spectrum occurs almost immediately to the left of the first, at 58. The height of this data point indicates that this is the largest contribution to the
FIGURE 15.35 Morro Bay tide spectrum.
tides. It occurs every 720/58, or 12.4 hours. This is the lunar semidiurnal tide. There is also some other small amplitude activity near this point, about which we will comment shortly.

Continuing to move left in Figure 15.35, there is a large contribution at about 30, indicating a force with a frequency of 30/720, or 1/24, hence a period of about 24 hours. This is the lunar-solar diurnal period.

The only other term of influence that stands out occurs at 28, indicating a frequency of 28/720. This translates into a period of 25.7 hours, and indicates the lunar diurnal period.

Thus all of the dominant periods are accounted for and no other significant information occurs in the amplitude spectrum, except for the small scattering noted previously in the region around 58. Since the lunar day is not an exact multiple of 1 hour and the data samples were taken hourly, some of the data associated with the moon's tidal forces have leaked onto adjacent points. This also skews the amplitudes, hindering our ability to accurately determine the sun/moon ratio of forces. The same rationale could account for some of the data near 28.

No other discernible information shows up in the amplitude spectrum because all of the remaining forces have periods longer than 1 month, and this is longer than the time over which the data was taken.

It is interesting to speculate on what Newton would have thought of this graphical verification of his theory. Given his personality, it is possible that he would not have been impressed, having worked it all out to his own satisfaction with his calculus.
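The shape of Figure 15.35 can be imitated with synthetic data. The sketch below (Python with NumPy; the four periods come from the discussion above, while the amplitudes are invented purely for illustration) builds 720 hourly samples from the principal periods; the FFT amplitude spectrum then peaks near bins 60, 58, 30, and 28 cycles per 720 hours, just as in the measured data:

```python
import numpy as np

hours = np.arange(720)                      # hourly samples over 30 days
periods = [12.0, 12.42, 23.93, 25.82]       # semidiurnal and diurnal periods (h)
amps    = [1.0, 2.2, 0.6, 0.5]              # illustrative relative strengths
tide = sum(a*np.cos(2*np.pi*hours/p) for a, p in zip(amps, periods))
spectrum = np.abs(np.fft.rfft(tide))
top4 = np.sort(np.argsort(spectrum)[-4:])   # bins of the four largest amplitudes
print(top4)                                 # near [28 30 58 60] cycles/720 h
```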
PROBLEMS

In each of Problems 1 through 4, use a software package with the FFT to produce a graph of the power spectrum of the function. Use $N = 2^{10}$.

1. $y(t) = 4\sin(80\pi t) - \sin(20\pi t)$
2. $y(t) = 2\cos(40\pi t) + \sin(90\pi t)$

In each of Problems 5 through 8, corrupt the signal with zero-mean random noise and use the FFT to plot the power density spectrum to identify the frequency components of the original signal.

5. $y(t) = \cos(30\pi t) + \cos(70\pi t) + \cos(140\pi t)$
6. $y(t) = \sin(60\pi t) + 4\sin(130\pi t) + \sin(240\pi t)$
7. $y(t) = \cos(20\pi t) + \sin(140\pi t) + \cos(240\pi t)$
8. $y(t) = \sin(30\pi t) + 3\sin(40\pi t) + \sin(130\pi t) + \sin(196\pi t) + \sin(220\pi t)$
CHAPTER 16 Special Functions, Orthogonal Expansions, and Wavelets

A function is designated as special when it has some distinctive characteristics that make it worthwhile determining and recording its properties and behavior. Perhaps the most familiar examples of special functions are $\sin(kx)$ and $\cos(kx)$, which are solutions of an important differential equation, $y'' + k^2 y = 0$, and arise in many other contexts as well.

For us, the primary motivation for studying certain special functions is that they arise in solving ordinary and partial differential equations that model many physical phenomena. Like Fourier series, they constitute necessary items in the toolkit of anyone who wishes to understand and work with such models.

We will begin with Legendre polynomials and Bessel functions. These are important in their own right, but also form a model of how to approach special functions and the kinds of properties we should look for. Following these, we will develop parts of Sturm-Liouville theory, which will provide a template for studying certain aspects of special functions in general, for example, eigenfunction expansions, of which Fourier series are a special case. The chapter concludes with a brief introduction to wavelets, in the setting of eigenfunction expansions.
16.1 Legendre Polynomials

There are many different approaches to Legendre polynomials. We will begin with Legendre's differential equation

$$(1 - x^2) y'' - 2x y' + \lambda y = 0 \tag{16.1}$$

in which $-1 < x < 1$ and $\lambda$ is a real number. This equation has the equivalent form

$$\bigl[(1 - x^2) y'\bigr]' + \lambda y = 0,$$

which we will encounter in Chapter 17 in solving for the steady-state temperature distribution over a solid sphere.
We seek values of $\lambda$ for which Legendre's equation has nontrivial solutions. Writing Legendre's equation as

$$y'' - \frac{2x}{1 - x^2} y' + \frac{\lambda}{1 - x^2} y = 0,$$

we conclude that 0 is an ordinary point. There are therefore power series solutions

$$y(x) = \sum_{n=0}^{\infty} a_n x^n.$$

Substitute this series into the differential equation to get

$$\sum_{n=2}^{\infty} n(n-1) a_n x^{n-2} - \sum_{n=2}^{\infty} n(n-1) a_n x^n - \sum_{n=1}^{\infty} 2n a_n x^n + \sum_{n=0}^{\infty} \lambda a_n x^n = 0.$$

Shift indices in the first summation to write the last equation as

$$\sum_{n=0}^{\infty} (n+2)(n+1) a_{n+2} x^n - \sum_{n=2}^{\infty} n(n-1) a_n x^n - \sum_{n=1}^{\infty} 2n a_n x^n + \sum_{n=0}^{\infty} \lambda a_n x^n = 0.$$

Now combine terms for $n \ge 2$ under one summation, writing the $n = 0$ and $n = 1$ terms separately:

$$2a_2 + 6a_3 x - 2a_1 x + \lambda a_0 + \lambda a_1 x + \sum_{n=2}^{\infty} \bigl[(n+2)(n+1) a_{n+2} - (n^2 + n - \lambda) a_n\bigr] x^n = 0.$$

The coefficient of each power of $x$ must be zero, hence

$$2a_2 + \lambda a_0 = 0, \tag{16.2}$$
$$6a_3 - 2a_1 + \lambda a_1 = 0, \tag{16.3}$$

and, for $n = 2, 3, \ldots$,

$$(n+1)(n+2) a_{n+2} - \bigl[n(n+1) - \lambda\bigr] a_n = 0,$$

from which we get the recurrence relation

$$a_{n+2} = \frac{n(n+1) - \lambda}{(n+1)(n+2)}\, a_n \quad \text{for } n = 2, 3, \ldots. \tag{16.4}$$

From equation (16.2) we have

$$a_2 = -\frac{\lambda}{2} a_0.$$

From equation (16.4),

$$a_4 = \frac{6 - \lambda}{3 \cdot 4} a_2 = -\frac{\lambda(6 - \lambda)}{4!} a_0,$$
$$a_6 = \frac{20 - \lambda}{5 \cdot 6} a_4 = -\frac{\lambda(6 - \lambda)(20 - \lambda)}{6!} a_0,$$

and so on. Every even-indexed coefficient $a_{2n}$ is a multiple, involving $n$ and $\lambda$, of $a_0$. Here we have used the factorial notation, in which $n!$ is the product of the integers from 1 through $n$, if $n$ is a positive integer. For example, $6! = 720$. By convention, $0! = 1$.
Similarly, from equation (16.3), $a_3 = \dfrac{2 - \lambda}{3!} a_1$, and then

$$a_5 = \frac{12 - \lambda}{4 \cdot 5} a_3 = \frac{(2 - \lambda)(12 - \lambda)}{5!} a_1,$$
$$a_7 = \frac{30 - \lambda}{6 \cdot 7} a_5 = \frac{(2 - \lambda)(12 - \lambda)(30 - \lambda)}{7!} a_1,$$

and so on. Every odd-indexed coefficient $a_{2n+1}$ is a multiple, also involving $n$ and $\lambda$, of $a_1$. In this way we can write the solution

$$y(x) = \sum_{n=0}^{\infty} a_n x^n = a_0 \left(1 - \frac{\lambda}{2} x^2 - \frac{\lambda(6 - \lambda)}{4!} x^4 - \frac{\lambda(6 - \lambda)(20 - \lambda)}{6!} x^6 + \cdots \right)$$
$$+ a_1 \left(x + \frac{2 - \lambda}{3!} x^3 + \frac{(2 - \lambda)(12 - \lambda)}{5!} x^5 + \frac{(2 - \lambda)(12 - \lambda)(30 - \lambda)}{7!} x^7 + \cdots \right).$$
The two series in large parentheses are linearly independent, one containing only even powers of $x$, the other only odd powers. Put

$$y_e(x) = 1 - \frac{\lambda}{2} x^2 - \frac{\lambda(6 - \lambda)}{4!} x^4 - \frac{\lambda(6 - \lambda)(20 - \lambda)}{6!} x^6 + \cdots$$

and

$$y_o(x) = x + \frac{2 - \lambda}{3!} x^3 + \frac{(2 - \lambda)(12 - \lambda)}{5!} x^5 + \frac{(2 - \lambda)(12 - \lambda)(30 - \lambda)}{7!} x^7 + \cdots.$$

The general solution of Legendre's differential equation is

$$y(x) = a_0 y_e(x) + a_1 y_o(x),$$
in which $a_0$ and $a_1$ are arbitrary constants. Some particular solutions are:

with $\lambda = 0$ and $a_1 = 0$, $y(x) = a_0$;
with $\lambda = 2$ and $a_0 = 0$, $y(x) = a_1 x$;
with $\lambda = 6$ and $a_1 = 0$, $y(x) = a_0(1 - 3x^2)$;
with $\lambda = 12$ and $a_0 = 0$, $y(x) = a_1\left(x - \frac{5}{3}x^3\right)$;
with $\lambda = 20$ and $a_1 = 0$, $y(x) = a_0\left(1 - 10x^2 + \frac{35}{3}x^4\right)$;

and so on.
FIGURE 16.1 The first five Legendre polynomials.
The values of $\lambda$ for which Legendre's equation has polynomial (finite series) solutions are $\lambda = n(n+1)$ for $n = 0, 1, 2, 3, \ldots$. This should not be surprising, since the recurrence relation (16.4) contains $n(n+1) - \lambda$ in its numerator. If for some nonnegative integer $N$ we choose $\lambda = N(N+1)$, then $a_{N+2} = 0$, hence also $a_{N+4} = a_{N+6} = \cdots = 0$, and one of $y_e(x)$ or $y_o(x)$ will contain only finitely many nonzero terms, hence is a polynomial.

These polynomial solutions of Legendre's differential equation have many applications, for example, in astronomy, analysis of heat conduction, and approximation of solutions of equations $f(x) = 0$. To standardize and tabulate these polynomial solutions, $a_0$ or $a_1$ is chosen for each $\lambda = n(n+1)$ so that the polynomial solution has the value 1 at $x = 1$. The resulting polynomials are called Legendre polynomials, and are usually denoted $P_n(x)$. The first six Legendre polynomials are

$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \frac{1}{2}(3x^2 - 1), \quad P_3(x) = \frac{1}{2}(5x^3 - 3x),$$
$$P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3), \quad P_5(x) = \frac{1}{8}(63x^5 - 70x^3 + 15x).$$
Graphs of these polynomials are given in Figure 16 .1. P,,(x) is of degree n, and contains only even powers of x if n is even, and only odd powers if n is odd . Although these polynomials are defined for all real x, the relevant interval for Legendre's differential equation is -1 < x < 1 . It will also be useful to keep in mind that, if q(x) is any polynomial solution of Legendre' s equation with A = n(n + 1), then q(x) must be a constant multiple of P 1,(x) .
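The six listed polynomials can be verified mechanically. The following sketch (our own illustration, not part of the text; helper names such as `legendre_residual` are invented) substitutes each P_n into Legendre's equation (1-x²)y″ - 2xy′ + n(n+1)y = 0 using exact rational arithmetic on coefficient lists:

```python
# Check that the six listed polynomials satisfy Legendre's equation
# (1 - x^2) y'' - 2x y' + n(n+1) y = 0, with exact rational arithmetic.
# A polynomial is a coefficient list [a0, a1, ...] for a0 + a1 x + a2 x^2 + ...
from fractions import Fraction as F

P = [
    [F(1)],                                              # P0
    [F(0), F(1)],                                        # P1
    [F(-1, 2), F(0), F(3, 2)],                           # P2
    [F(0), F(-3, 2), F(0), F(5, 2)],                     # P3
    [F(3, 8), F(0), F(-30, 8), F(0), F(35, 8)],          # P4
    [F(0), F(15, 8), F(0), F(-70, 8), F(0), F(63, 8)],   # P5
]

def deriv(p):
    # formal derivative of a coefficient list
    return [k * c for k, c in enumerate(p)][1:] or [F(0)]

def add(p, q):
    n = max(len(p), len(q))
    p, q = p + [F(0)] * (n - len(p)), q + [F(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale_shift(p, c, s):
    # multiply polynomial p by c * x^s
    return [F(0)] * s + [c * a for a in p]

def legendre_residual(p, n):
    d1, d2 = deriv(p), deriv(deriv(p))
    r = add(d2, scale_shift(d2, F(-1), 2))   # (1 - x^2) p''
    r = add(r, scale_shift(d1, F(-2), 1))    # - 2x p'
    r = add(r, [F(n * (n + 1)) * a for a in p])
    return r

residuals = [legendre_residual(P[n], n) for n in range(6)]
all_zero = all(all(c == 0 for c in r) for r in residuals)
```

Every residual comes out identically zero, and `sum(P[n])` evaluates P_n at x = 1, which should be 1 under the standardization described above.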
16.1.1 A Generating Function for the Legendre Polynomials

Many properties of Legendre polynomials can be derived by using a generating function, a concept we will now develop. Let

L(x, t) = 1/√(1 - 2xt + t²).

We claim that, if L(x, t) is expanded in a power series in powers of t, then the coefficient of tⁿ is exactly the nth Legendre polynomial.
16.1 Legendre Polynomials

THEOREM 16.1 Generating Function for the Legendre Polynomials

L(x, t) = Σ_{n=0}^∞ P_n(x) tⁿ. ■
We will give an argument suggesting why this is true. Write the Maclaurin series for (1 - w)^{-1/2}:

1/√(1-w) = 1 + (1/2)w + (3/8)w² + (5/16)w³ + (35/128)w⁴ + (63/256)w⁵ + ···.

Put w = 2xt - t². Now expand each of these powers of 2xt - t² and collect the coefficient of each power of t in the resulting expression:

1/√(1 - 2xt + t²)
= 1 + xt + (-(1/2) + (3/2)x²)t² + (-(3/2)x + (5/2)x³)t³
  + ((3/8) - (15/4)x² + (35/8)x⁴)t⁴ + ((15/8)x - (35/4)x³ + (63/8)x⁵)t⁵ + ···
= P0(x) + P1(x)t + P2(x)t² + P3(x)t³ + P4(x)t⁴ + P5(x)t⁵ + ···.
The generating function provides an efficient way of deriving many properties of Legendre polynomials. We will begin by using it to show that

P_n(1) = 1 and P_n(-1) = (-1)ⁿ for n = 0, 1, 2, ....

First, setting x = 1 we have

L(1, t) = 1/√(1 - 2t + t²) = 1/√((1-t)²) = 1/(1-t) = Σ_{n=0}^∞ P_n(1)tⁿ.

But, for -1 < t < 1,

1/(1-t) = Σ_{n=0}^∞ tⁿ.

Since 1/(1-t) has only one Maclaurin expansion, the coefficients in these two series must be the same, hence each P_n(1) = 1.
Similarly,

L(-1, t) = 1/√(1 + 2t + t²) = 1/√((1+t)²) = 1/(1+t) = Σ_{n=0}^∞ P_n(-1)tⁿ.

But, for -1 < t < 1,

1/(1+t) = Σ_{n=0}^∞ (-1)ⁿtⁿ,

so P_n(-1) = (-1)ⁿ.
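As an illustrative numerical check (our own sketch, not part of the text), partial sums of Σ P_n(x)tⁿ can be compared with the closed form 1/√(1 - 2xt + t²); the P_n here are generated by the three-term recurrence established below as Theorem 16.2:

```python
# Compare a partial sum of sum_n P_n(x) t^n with 1/sqrt(1 - 2xt + t^2)
# for a small |t|, where it should converge rapidly.
import math

def legendre(n, x):
    # Bonnet recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def gen_partial(x, t, terms=40):
    return sum(legendre(n, x) * t ** n for n in range(terms))

x, t = 0.3, 0.4
closed = 1.0 / math.sqrt(1.0 - 2 * x * t + t * t)
approx = gen_partial(x, t)
err = abs(closed - approx)
```

The same recurrence also reproduces P_n(1) = 1 and P_n(-1) = (-1)ⁿ exactly in floating point.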
16.1.2 A Recurrence Relation for the Legendre Polynomials

We will use the generating function to derive a recurrence relation for Legendre polynomials.

THEOREM 16.2 Recurrence Relation for Legendre Polynomials

For any positive integer n,

(n+1)P_{n+1}(x) - (2n+1)xP_n(x) + nP_{n-1}(x) = 0.  (16.5)

Proof
Begin by differentiating the generating function with respect to t:

∂L/∂t = (x - t)(1 - 2xt + t²)^{-3/2},

so

(1 - 2xt + t²) ∂L/∂t - (x - t)L(x, t) = 0.

Substitute L(x, t) = Σ_{n=0}^∞ P_n(x)tⁿ into the last equation to obtain

(1 - 2xt + t²) Σ_{n=1}^∞ nP_n(x)t^{n-1} - (x - t) Σ_{n=0}^∞ P_n(x)tⁿ = 0.

Carry out the indicated multiplications to write

Σ_{n=1}^∞ nP_n(x)t^{n-1} - Σ_{n=1}^∞ 2nxP_n(x)tⁿ + Σ_{n=1}^∞ nP_n(x)t^{n+1} - Σ_{n=0}^∞ xP_n(x)tⁿ + Σ_{n=0}^∞ P_n(x)t^{n+1} = 0.

Rearrange these series to have like powers of t in each summation:

Σ_{n=0}^∞ (n+1)P_{n+1}(x)tⁿ - Σ_{n=1}^∞ 2nxP_n(x)tⁿ + Σ_{n=2}^∞ (n-1)P_{n-1}(x)tⁿ - Σ_{n=0}^∞ xP_n(x)tⁿ + Σ_{n=1}^∞ P_{n-1}(x)tⁿ = 0.

Combine summations from n = 2 on, writing the terms for n = 0 and n = 1 separately:

P1(x) - xP0(x) + [2P2(x) - 2xP1(x) - xP1(x) + P0(x)]t
+ Σ_{n=2}^∞ [(n+1)P_{n+1}(x) - 2nxP_n(x) + (n-1)P_{n-1}(x) - xP_n(x) + P_{n-1}(x)]tⁿ = 0.
For this power series in t to be zero for all t in some interval about 0, the coefficient of tⁿ must be zero for n = 0, 1, 2, .... Then

P1(x) - xP0(x) = 0,
2P2(x) - 2xP1(x) - xP1(x) + P0(x) = 0,

and, for n = 2, 3, ...,

(n+1)P_{n+1}(x) - 2nxP_n(x) + (n-1)P_{n-1}(x) - xP_n(x) + P_{n-1}(x) = 0.

These give us

P1(x) = xP0(x),  P2(x) = (1/2)(3xP1(x) - P0(x)),

and, for n = 2, 3, ...,

(n+1)P_{n+1}(x) - (2n+1)xP_n(x) + nP_{n-1}(x) = 0.

Since this equation is also valid for n = 1, this establishes the recurrence relation for all positive integers. ■

Later we will need to know the coefficient of xⁿ in P_n(x). We will use the recurrence relation to derive a formula for this number.
THEOREM 16.3

For n = 1, 2, ..., let A_n be the coefficient of xⁿ in P_n(x). Then

A_n = (1·3·5···(2n-1))/n!. ■

For example,

A1 = 1,  A2 = (1·3)/2! = 3/2,  and  A3 = (1·3·5)/3! = 5/2,

as can be verified from the explicit expressions derived previously for P1(x), P2(x) and P3(x).

Proof: In the recurrence relation (16.5), the highest power of x that occurs is x^{n+1}, and this term appears in P_{n+1}(x) and in xP_n(x). Thus the coefficient of x^{n+1} in the recurrence relation is

(n+1)A_{n+1} - (2n+1)A_n.

This must equal zero (because the other side of this recurrence equation is zero). Therefore

A_{n+1} = ((2n+1)/(n+1)) A_n,
and this holds for n = 0, 1, 2, .... Now we can work back:

A_{n+1} = ((2n+1)/(n+1)) A_n
= ((2n+1)/(n+1)) ((2(n-1)+1)/((n-1)+1)) A_{n-1}
= ((2n+1)/(n+1)) ((2n-1)/n) ((2n-3)/(n-1)) A_{n-2}
= ··· = ((2n+1)/(n+1)) ((2n-1)/n) ((2n-3)/(n-1)) ··· (3/2)(1/1) A_0.

But A_0 = 1 because P_0(x) = 1, so

A_{n+1} = (1·3·5···(2n-1)(2n+1))/(n+1)!  (16.6)

for n = 0, 1, 2, .... The conclusion of the theorem simply states this conclusion in terms of A_n instead of A_{n+1}. ■
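The leading-coefficient formula can be cross-checked with exact arithmetic. This sketch is our own (the function names are invented, not from the book); it builds the coefficient lists of P_n from the recurrence (16.5) and compares each leading coefficient with 1·3·5···(2n-1)/n!:

```python
# Verify A_n = 1*3*5***(2n-1)/n! for the leading coefficient of P_n,
# generating the polynomials exactly from (k+1)P_{k+1} = (2k+1)x P_k - k P_{k-1}.
from fractions import Fraction as F
import math

def legendre_coeffs(n):
    polys = [[F(1)], [F(0), F(1)]]          # P0, P1 as coefficient lists
    for k in range(1, n):
        pk, pk1 = polys[k], polys[k - 1]
        term = [F(0)] + [F(2 * k + 1) * c for c in pk]      # (2k+1) x P_k
        term = [c - F(k) * (pk1[i] if i < len(pk1) else F(0))
                for i, c in enumerate(term)]                 # minus k P_{k-1}
        polys.append([c / F(k + 1) for c in term])
    return polys[n]

def odd_product_formula(n):
    num = 1
    for j in range(1, n + 1):
        num *= 2 * j - 1                     # 1 * 3 * 5 * ... * (2n-1)
    return F(num, math.factorial(n))

ok = all(legendre_coeffs(n)[-1] == odd_product_formula(n) for n in range(1, 9))
```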
16.1.3 Orthogonality of the Legendre Polynomials

We will prove the following.

THEOREM 16.4 Orthogonality of the Legendre Polynomials on [-1, 1]

If n and m are nonnegative integers, then

∫_{-1}^{1} P_n(x)P_m(x) dx = 0 if n ≠ m.  (16.7)
This integral relationship is called orthogonality of the Legendre polynomials on [-1, 1]. We have seen this kind of behavior before, with the functions

1, cos(x), cos(2x), ..., sin(x), sin(2x), ...

on the interval [-π, π]. The integral, from -π to π, of the product of any two of these (distinct) functions is zero. Because of this property, we were able to find the Fourier coefficients of a function (recall the argument given in Section 14.2). We will pursue a similar idea for Legendre polynomials after establishing equation (16.7).

Proof: Begin with the fact that P_n(x) is a solution of Legendre's equation (16.1) for λ = n(n+1). In particular, if n and m are distinct nonnegative integers, then

[(1-x²)P_n′(x)]′ + n(n+1)P_n(x) = 0

and

[(1-x²)P_m′(x)]′ + m(m+1)P_m(x) = 0.

Multiply the first equation by P_m(x) and the second by P_n(x) and subtract the resulting equations to get

[(1-x²)P_n′(x)]′P_m(x) - [(1-x²)P_m′(x)]′P_n(x) + [n(n+1) - m(m+1)]P_n(x)P_m(x) = 0.

Integrate this equation:

∫_{-1}^{1} [(1-x²)P_n′(x)]′P_m(x) dx - ∫_{-1}^{1} [(1-x²)P_m′(x)]′P_n(x) dx
= [m(m+1) - n(n+1)] ∫_{-1}^{1} P_n(x)P_m(x) dx.

Since n ≠ m, equation (16.7) will be proved if we can show that the left side of the last equation is zero. But, by integrating the left side by parts, we have

∫_{-1}^{1} [(1-x²)P_n′(x)]′P_m(x) dx - ∫_{-1}^{1} [(1-x²)P_m′(x)]′P_n(x) dx
= [(1-x²)P_n′(x)P_m(x)]_{-1}^{1} - ∫_{-1}^{1} (1-x²)P_n′(x)P_m′(x) dx
- [(1-x²)P_m′(x)P_n(x)]_{-1}^{1} + ∫_{-1}^{1} (1-x²)P_m′(x)P_n′(x) dx = 0,

and the orthogonality of the Legendre polynomials on [-1, 1] is proved. ■

16.1.4 Fourier-Legendre Series

Suppose f(x) is defined for -1 ≤ x ≤ 1. We want to explore the possibility of expanding f(x) in a series of Legendre polynomials:

f(x) = Σ_{n=0}^∞ c_n P_n(x).  (16.8)
We were in a similar situation in Section 14.2, except there we wanted to expand a function defined on [-π, π] in a series of sines and cosines. We will follow the same reasoning that led to success then. Pick a nonnegative integer m and multiply the proposed expansion by P_m(x), and then integrate the resulting equation, interchanging the series and the integral:

∫_{-1}^{1} f(x)P_m(x) dx = Σ_{n=0}^∞ c_n ∫_{-1}^{1} P_n(x)P_m(x) dx.

Because of equation (16.7), all terms in the summation on the right are zero except when n = m. The preceding equation reduces to

∫_{-1}^{1} f(x)P_m(x) dx = c_m ∫_{-1}^{1} P_m²(x) dx.

Then

c_m = (∫_{-1}^{1} f(x)P_m(x) dx) / (∫_{-1}^{1} P_m²(x) dx).  (16.9)

Taking the lead from Fourier series, we call the expansion Σ_{n=0}^∞ c_n P_n(x) the Fourier-Legendre series, or expansion, of f(x), when the coefficients are chosen according to equation (16.9). We call these c_n's the Fourier-Legendre coefficients of f.

As with Fourier series, we must address the question of convergence of the Fourier-Legendre series of a function. This is done in the following theorem, which is similar in form to the Fourier convergence theorem. As we will see later, this is not a coincidence.
THEOREM 16.5

Let f be piecewise smooth on [-1, 1]. Then, for -1 < x < 1,

Σ_{n=0}^∞ c_n P_n(x) = (1/2)(f(x+) + f(x-)),

if the c_n's are the Fourier-Legendre coefficients of f. ■
This means that, under the conditions on f, the Fourier-Legendre expansion of f(x) converges to the average of the left and right limits of f(x) at x, for -1 < x < 1. This is midway in the gap between the ends of the graph at x if f(x) has a jump discontinuity there (Figure 16.2). We saw this behavior previously with convergence of Fourier (trigonometric) series. If f is continuous at x, then f(x+) = f(x-) = f(x) and the Fourier-Legendre series converges to f(x).

As a special case of general Fourier-Legendre expansions, any polynomial q(x) is a linear combination of Legendre polynomials. In the case of a polynomial, this linear combination can be obtained by just solving for xⁿ in terms of P_n(x) and writing each power of x in q(x) in terms of Legendre polynomials. For example, let q(x) = -4 + 2x + 9x². We can begin with x = P1(x) and then solve for x² in P2(x):

P2(x) = (3/2)x² - 1/2,

so

x² = (2/3)P2(x) + 1/3 = (2/3)P2(x) + (1/3)P0(x).

Then

-4 + 2x + 9x² = -4P0(x) + 2P1(x) + 9((2/3)P2(x) + (1/3)P0(x)) = -P0(x) + 2P1(x) + 6P2(x).
We can now prove the perhaps surprising result that every Legendre polynomial is orthogonal to every polynomial of lower degree .
FIGURE 16.2 Convergence of a Fourier-Legendre expansion to (1/2)(f(x0+) + f(x0-)) at a jump discontinuity x0 of the function.
THEOREM 16.6

Let q(x) be a polynomial of degree m, and let n > m. Then

∫_{-1}^{1} q(x)P_n(x) dx = 0. ■

Proof: Write

q(x) = c0P0(x) + c1P1(x) + ··· + c_mP_m(x).

Then

∫_{-1}^{1} q(x)P_n(x) dx = Σ_{k=0}^{m} c_k ∫_{-1}^{1} P_k(x)P_n(x) dx = 0,

since, for 0 ≤ k ≤ m < n,

∫_{-1}^{1} P_k(x)P_n(x) dx = 0. ■
This result will be useful shortly in obtaining information about the zeros of the Legendre polynomials .
16.1.5 Computation of Fourier-Legendre Coefficients

The equation (16.9) for the Fourier-Legendre coefficients of f has ∫_{-1}^{1} P_n²(x) dx in the denominator. We will derive a simple expression for this integral.

THEOREM 16.7

If n is a nonnegative integer, then

∫_{-1}^{1} P_n²(x) dx = 2/(2n+1). ■
Proof: As before, denote the coefficient of xⁿ in P_n(x) as A_n. We will also write

p_n = ∫_{-1}^{1} P_n²(x) dx.

The highest power term in P_n(x) is A_nxⁿ, while the highest power term in xP_{n-1}(x) is A_{n-1}xⁿ. This means that all terms involving xⁿ cancel in the polynomial

q(x) = P_n(x) - (A_n/A_{n-1}) xP_{n-1}(x),

and so q(x) has degree at most n-1. Write

P_n(x) = q(x) + (A_n/A_{n-1}) xP_{n-1}(x).

Then

p_n = ∫_{-1}^{1} P_n(x)P_n(x) dx = ∫_{-1}^{1} P_n(x)[q(x) + (A_n/A_{n-1})xP_{n-1}(x)] dx
= ∫_{-1}^{1} q(x)P_n(x) dx + (A_n/A_{n-1}) ∫_{-1}^{1} xP_n(x)P_{n-1}(x) dx
= (A_n/A_{n-1}) ∫_{-1}^{1} xP_n(x)P_{n-1}(x) dx,

because ∫_{-1}^{1} q(x)P_n(x) dx = 0 by Theorem 16.6. Now invoke the recurrence relation (16.5) to write

xP_n(x) = ((n+1)/(2n+1))P_{n+1}(x) + (n/(2n+1))P_{n-1}(x).

Then

xP_n(x)P_{n-1}(x) = ((n+1)/(2n+1))P_{n+1}(x)P_{n-1}(x) + (n/(2n+1))P_{n-1}²(x),

so

p_n = (A_n/A_{n-1}) [((n+1)/(2n+1)) ∫_{-1}^{1} P_{n+1}(x)P_{n-1}(x) dx + (n/(2n+1)) ∫_{-1}^{1} P_{n-1}²(x) dx].

Since

∫_{-1}^{1} P_{n+1}(x)P_{n-1}(x) dx = 0,

we are left with

p_n = (A_n/A_{n-1}) (n/(2n+1)) ∫_{-1}^{1} P_{n-1}²(x) dx = (A_n/A_{n-1}) (n/(2n+1)) p_{n-1}.

Using the previously obtained value for A_n, we have

A_n/A_{n-1} = [1·3·5···(2n-1)/n!] / [1·3·5···(2n-3)/(n-1)!] = (2n-1)/n,

so

p_n = ((2n-1)/n)(n/(2n+1)) p_{n-1} = ((2n-1)/(2n+1)) p_{n-1}.

Now work forward:

p_0 = ∫_{-1}^{1} P_0²(x) dx = ∫_{-1}^{1} dx = 2,
p_1 = (1/3)p_0 = 2/3,  p_2 = (3/5)p_1 = 2/5,  p_3 = (5/7)p_2 = 2/7,  p_4 = (7/9)p_3 = 2/9,

and so on. By induction, we find that

p_n = 2/(2n+1),

proving the theorem. ■

This means that the nth Fourier-Legendre coefficient of f is

c_n = (∫_{-1}^{1} f(x)P_n(x) dx) / (∫_{-1}^{1} P_n²(x) dx) = ((2n+1)/2) ∫_{-1}^{1} f(x)P_n(x) dx.
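Both the orthogonality relation (16.7) and the norm 2/(2n+1) can be checked by numerical quadrature. The following sketch is illustrative only (Simpson's rule and the helper names are our choices, not the book's):

```python
# Numerically verify int_{-1}^{1} P_n P_m dx = 0 for n != m and
# int_{-1}^{1} P_n^2 dx = 2/(2n+1), using composite Simpson's rule.
def legendre(n, x):
    # Bonnet recurrence for P_n(x)
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def simpson(f, a, b, m=4000):
    # m must be even
    h = (b - a) / m
    s = f(a) + f(b)
    s += 4 * sum(f(a + h * i) for i in range(1, m, 2))
    s += 2 * sum(f(a + h * i) for i in range(2, m, 2))
    return s * h / 3

def inner(n, m):
    return simpson(lambda x: legendre(n, x) * legendre(m, x), -1.0, 1.0)

off_diag = max(abs(inner(n, m)) for n in range(6) for m in range(6) if n != m)
norms_ok = all(abs(inner(n, n) - 2.0 / (2 * n + 1)) < 1e-8 for n in range(6))
```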
EXAMPLE 16.1

Let f(x) = cos(πx/2) for -1 ≤ x ≤ 1. Then f and f′ are continuous on [-1, 1], so the Fourier-Legendre expansion of f converges to cos(πx/2) for -1 < x < 1. The coefficients are

c_n = ((2n+1)/2) ∫_{-1}^{1} cos(πx/2) P_n(x) dx.

Because cos(πx/2) is an even function, cos(πx/2)P_n(x) is an odd function for n odd. This means that c_n = 0 if n is odd. We need only compute even-indexed coefficients. Some of these are:

c_0 = (1/2) ∫_{-1}^{1} cos(πx/2) dx = 2/π

and

c_2 = (5/2) ∫_{-1}^{1} cos(πx/2) (1/2)(3x² - 1) dx = (10/π³)(π² - 12).

Although in this example f(x) is simple enough to compute some Fourier-Legendre coefficients exactly, in a typical application we would use a software package to compute coefficients. The terms we have computed yield the approximation

cos(πx/2) ≈ (2/π)P0(x) + (10/π³)(π² - 12)P2(x) + ···.

FIGURE 16.3 Comparison of cos(πx/2) with a partial sum of a series expansion in Legendre polynomials.
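The two exact coefficients in this example can be confirmed by quadrature. This check is our own sketch (Simpson's rule is our choice of method, not the text's):

```python
# Verify c0 = 2/pi and c2 = 10(pi^2 - 12)/pi^3 for f(x) = cos(pi x / 2),
# using c_n = (2n+1)/2 * int_{-1}^{1} f(x) P_n(x) dx.
import math

def simpson(f, a, b, m=2000):
    h = (b - a) / m
    s = f(a) + f(b)
    s += 4 * sum(f(a + h * i) for i in range(1, m, 2))
    s += 2 * sum(f(a + h * i) for i in range(2, m, 2))
    return s * h / 3

f = lambda x: math.cos(math.pi * x / 2)
P2 = lambda x: (3 * x * x - 1) / 2

c0 = 0.5 * simpson(f, -1.0, 1.0)
c2 = 2.5 * simpson(lambda x: f(x) * P2(x), -1.0, 1.0)

c0_exact = 2 / math.pi
c2_exact = 10 * (math.pi ** 2 - 12) / math.pi ** 3
```

Note that c2_exact is negative, since π² < 12.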
Figure 16.3 shows a graph of cos(πx/2) and the first three nonzero terms of its Fourier-Legendre expansion. This series agrees very well with cos(πx/2) for -1 ≤ x ≤ 1, but the two diverge from each other outside this interval. This emphasizes the fact that the Fourier-Legendre expansion is only for -1 ≤ x ≤ 1.

16.1.6 Zeros of the Legendre Polynomials

P0(x) = 1 and has no zeros, while P1(x) = x has exactly one zero, namely x = 0. P2(x) = (1/2)(3x² - 1) has two real zeros, ±1/√3. P3(x) has three real zeros, namely 0 and ±√(3/5). After
n = 3, finding zeros of Legendre polynomials quickly becomes complicated. For example, P4(x) has four real zeros, and they are

±(1/35)√(525 + 70√30) and ±(1/35)√(525 - 70√30).

These are approximately ±0.8611 and ±0.3400. Each P_n(x) just tested has n real roots, all lying in the interval (-1, 1). We will show that this is true for all the Legendre polynomials of positive degree (P0(x), of course, has no roots). The proof of this assertion is based on the orthogonality of the Legendre polynomials.
THEOREM 16.8 Zeros of P_n(x)

Let n be a positive integer. Then P_n(x) has n real, distinct roots, all lying in (-1, 1). ■

Proof: We first show that, if P_n(x) has a real root x0 in (-1, 1), then this root must be simple (that is, not repeated). For suppose x0 is a repeated root. Then P_n(x0) = P_n′(x0) = 0, and P_n(x) is a solution of the initial value problem

((1-x²)y′)′ + n(n+1)y = 0;  y(x0) = y′(x0) = 0.

But this problem has a unique solution, and the trivial function y(x) = 0 is a solution. This implies that P_n(x) is the zero function on an interval containing x0, and this is false. Hence P_n(x) cannot have a repeated root in (-1, 1).

Now suppose n is a positive integer. Then P_n(x) and P0(x) are orthogonal on [-1, 1], so

∫_{-1}^{1} P_n(x)P0(x) dx = ∫_{-1}^{1} P_n(x) dx = 0.

Therefore P_n(x) cannot be strictly positive or strictly negative on (-1, 1), hence must change sign in this interval. Since P_n(x) is continuous, there must exist some x1 in (-1, 1) with P_n(x1) = 0. So far, this gives us one real zero in this interval.

Let x1, ..., x_m be all the zeros of P_n(x) in (-1, 1), with -1 < x1 < ··· < x_m < 1. Then 1 ≤ m ≤ n. Suppose m < n. Then the polynomial

q(x) = (x - x1)(x - x2)···(x - x_m)

has degree less than n, and so is orthogonal to P_n(x):

∫_{-1}^{1} q(x)P_n(x) dx = 0.

But q(x) and P_n(x) change sign at exactly the same points in (-1, 1), namely at x1, ..., x_m. Therefore q(x) and P_n(x) are either of the same sign on each interval (-1, x1), (x1, x2), ..., (x_m, 1), or of opposite sign on each of these intervals. This means that q(x)P_n(x) is either strictly positive or strictly negative on (-1, 1) except at the finitely many points x1, ..., x_m where this product vanishes. But then ∫_{-1}^{1} q(x)P_n(x) dx must be either positive or negative, a contradiction. We conclude that m = n, hence P_n(x) has n simple zeros in (-1, 1). ■

Referring back to the graphs of P0(x) through P4(x) in Figure 16.1, we can see that each of these Legendre polynomials crosses the x-axis exactly n times between -1 and 1.
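The closed-form zeros of P4 quoted above can be reproduced numerically. This is an illustrative sketch of ours (bisection on a sign-change grid is one of many ways to do it):

```python
# Locate the four zeros of P4(x) = (35x^4 - 30x^2 + 3)/8 by bisection
# and compare with +-(1/35) sqrt(525 +- 70 sqrt(30)).
import math

def p4(x):
    return (35 * x ** 4 - 30 * x ** 2 + 3) / 8

def bisect(f, a, b, tol=1e-12):
    fa = f(a)
    while b - a > tol:
        m = (a + b) / 2
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return (a + b) / 2

grid = [i / 1000 for i in range(-999, 1000)]
roots = [bisect(p4, grid[i], grid[i + 1])
         for i in range(len(grid) - 1)
         if p4(grid[i]) * p4(grid[i + 1]) < 0]

closed = sorted(s * math.sqrt(525 + t * 70 * math.sqrt(30)) / 35
                for s in (-1, 1) for t in (-1, 1))
```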
16.1.7 Derivative and Integral Formulas for P_n(x)

We will derive two additional formulas for P_n(x) that are sometimes used to further analyze Legendre polynomials. The first gives the nth Legendre polynomial in terms of the nth derivative of (x² - 1)ⁿ.

THEOREM 16.9 Rodrigues's Formula

For n = 0, 1, 2, ...,

P_n(x) = (1/(2ⁿ n!)) (dⁿ/dxⁿ)[(x² - 1)ⁿ]. ■
In this statement, it is understood that the zero-order derivative of a function is the function itself. Thus, when n = 0, the proposed formula gives

(1/(2⁰ 0!)) (d⁰/dx⁰)[(x² - 1)⁰] = (x² - 1)⁰ = 1 = P0(x).

For n = 1 it gives

(1/(2·1!)) (d/dx)(x² - 1) = (1/2)(2x) = x = P1(x),

and for n = 2, it gives

(1/(2²·2!)) (d²/dx²)[(x² - 1)²] = (1/8)(12x² - 4) = (3/2)x² - 1/2 = P2(x).

Proof:
Let w = (x² - 1)ⁿ. Then

w′ = n(x² - 1)^{n-1}(2x),

so

(x² - 1)w′ - 2nxw = 0.

If this equation is differentiated k+1 times, it is a routine exercise to verify that we obtain

(x² - 1) d^{k+2}w/dx^{k+2} - (2n - 2k - 2)x d^{k+1}w/dx^{k+1}
- [2n + (2n-2) + ··· + (2n - 2(k-1)) + (2n - 2k)] d^{k}w/dx^{k} = 0.

Putting k = n, we have

(x² - 1) d^{n+2}w/dx^{n+2} + 2x d^{n+1}w/dx^{n+1}
- [2n + (2n-2) + ··· + (2n - 2(n-1)) + (2n - 2n)] dⁿw/dxⁿ = 0.

The quantity in square brackets in this equation is

2n + (2n-2) + ··· + 2,

which is the same as

2(1 + 2 + ··· + n).

But this quantity is equal to n(n+1). (Recall that Σ_{j=1}^{n} j = n(n+1)/2.) Therefore

(x² - 1) d^{n+2}w/dx^{n+2} + 2x d^{n+1}w/dx^{n+1} - n(n+1) dⁿw/dxⁿ = 0.

Upon multiplying this equation by -1, we have

(1 - x²) d^{n+2}w/dx^{n+2} - 2x d^{n+1}w/dx^{n+1} + n(n+1) dⁿw/dxⁿ = 0.

But this means that dⁿw/dxⁿ is a solution of Legendre's equation with λ = n(n+1). Further, repeated differentiation of the polynomial (x² - 1)ⁿ yields a polynomial. Therefore, the polynomial solution dⁿw/dxⁿ must be a constant multiple of P_n(x):

dⁿw/dxⁿ = cP_n(x).  (16.10)
Now, the highest order term in (x² - 1)ⁿ is x^{2n}, and the nth derivative of x^{2n} is

2n(2n-1)···(n+1)xⁿ.

Therefore the coefficient of the highest power of x in dⁿw/dxⁿ is 2n(2n-1)···(n+1). The highest order term in cP_n(x) is cA_nxⁿ, where A_n is the coefficient of xⁿ in P_n(x). We know A_n, so equation (16.10) gives us

2n(2n-1)···(n+1) = cA_n = c (1·3·5···(2n-1))/n!.

Then

c = n!(n+1)···(2n-1)(2n) / (1·3·5···(2n-1)) = (2n)!/(1·3·5···(2n-1)) = 2·4·6···(2n) = 2ⁿn!.

But now equation (16.10) becomes

(dⁿ/dxⁿ)[(x² - 1)ⁿ] = 2ⁿn! P_n(x),

which is equivalent to Rodrigues's formula. ■
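Rodrigues's formula lends itself to an exact symbolic check. The sketch below (our own scaffolding, not from the book) differentiates the coefficient list of (x² - 1)ⁿ n times and compares the result, scaled by 1/(2ⁿn!), with the known P_n coefficients:

```python
# Exact check of Rodrigues's formula: (1/(2^n n!)) d^n/dx^n (x^2 - 1)^n = P_n(x).
from fractions import Fraction as F
import math

def poly_mul(p, q):
    r = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_deriv(p):
    return [k * c for k, c in enumerate(p)][1:] or [F(0)]

def rodrigues(n):
    w = [F(1)]
    for _ in range(n):
        w = poly_mul(w, [F(-1), F(0), F(1)])   # multiply by (x^2 - 1)
    for _ in range(n):
        w = poly_deriv(w)                      # differentiate n times
    c = F(1, 2 ** n * math.factorial(n))
    return [c * a for a in w]

P3 = [F(0), F(-3, 2), F(0), F(5, 2)]           # (1/2)(5x^3 - 3x)
match3 = rodrigues(3) == P3
# leading coefficient should be A_n = 1*3*5***(2n-1)/n!
leading_ok = all(rodrigues(n)[-1] ==
                 F(math.prod(range(1, 2 * n, 2)), math.factorial(n))
                 for n in range(1, 7))
```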
Next we will derive an integral formula for P_n(x).

THEOREM 16.10

For n = 0, 1, 2, ...,

P_n(x) = (1/π) ∫_0^π (x + √(x²-1) cos(θ))ⁿ dθ. ■

For example, with n = 0 we get

(1/π) ∫_0^π dθ = 1 = P0(x).

With n = 1 we get

(1/π) ∫_0^π (x + √(x²-1) cos(θ)) dθ = x = P1(x),

and with n = 2 we get

(1/π) ∫_0^π (x + √(x²-1) cos(θ))² dθ
= (1/π) ∫_0^π (x² + 2x√(x²-1) cos(θ) + (x²-1)cos²(θ)) dθ
= x² + (1/2)(x² - 1) = (3/2)x² - 1/2 = P2(x).

Proof: Let

Q_n(x) = (1/π) ∫_0^π (x + √(x²-1) cos(θ))ⁿ dθ.

The strategy behind the proof is to show that Q_n satisfies the same recurrence relation as the Legendre polynomials. Since Q0 = P0 and Q1 = P1, this will imply that Q_n = P_n for all nonnegative integers n.

After a straightforward but lengthy computation, we find that

(n+1)Q_{n+1}(x) - (2n+1)xQ_n(x) + nQ_{n-1}(x)
= (n/π) ∫_0^π (x + √(x²-1) cos(θ))^{n-1} (1-x²) sin²(θ) dθ
+ (1/π) ∫_0^π (x + √(x²-1) cos(θ))ⁿ √(x²-1) cos(θ) dθ.

Integrate the second integral by parts, with u = (x + √(x²-1) cos(θ))ⁿ and dv = √(x²-1) cos(θ) dθ, to get

(n+1)Q_{n+1}(x) - (2n+1)xQ_n(x) + nQ_{n-1}(x)
= (n/π) ∫_0^π (x + √(x²-1) cos(θ))^{n-1} (1-x²) sin²(θ) dθ
+ (1/π) [(x + √(x²-1) cos(θ))ⁿ √(x²-1) sin(θ)]_{θ=0}^{θ=π}
+ (n/π) ∫_0^π (x + √(x²-1) cos(θ))^{n-1} (x²-1) sin²(θ) dθ
= 0,

since the boundary term vanishes and x² - 1 = -(1-x²), so the two remaining integrals cancel. This completes the proof. ■
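For x > 1 the integrand is real, and the Laplace integral can be evaluated by quadrature and compared against the recurrence values of P_n. This numerical check is our own sketch (the trapezoidal rule is our choice; it is very accurate here because the integrand extends to a smooth periodic function):

```python
# Numerically verify P_n(x) = (1/pi) int_0^pi (x + sqrt(x^2-1) cos t)^n dt for x > 1.
import math

def legendre(n, x):
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def laplace_integral(n, x, m=2000):
    h = math.pi / m
    f = lambda th: (x + math.sqrt(x * x - 1) * math.cos(th)) ** n
    total = 0.5 * (f(0.0) + f(math.pi)) + sum(f(h * i) for i in range(1, m))
    return total * h / math.pi

x = 1.7
worst = max(abs(laplace_integral(n, x) - legendre(n, x)) for n in range(6))
```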
1. For n = 0, 1, 3, 4, 5, verify by substitution that P_n(x) is a solution of Legendre's equation corresponding to λ = n(n+1).

2. Use the recurrence relation (Theorem 16.2), and the list of P0(x), ..., P5(x) given previously, to determine P6(x) through P10(x). Graph these functions and observe the location of their zeros in [-1, 1].

3. Use Rodrigues's formula to obtain P1(x) through P5(x).

4. Use Theorem 16.10 to obtain P3(x), P4(x) and P5(x).

5. It can be shown that

P_n(x) = Σ_{k=0}^{[n/2]} ((-1)ᵏ(2n-2k)!)/(2ⁿ k!(n-k)!(n-2k)!) x^{n-2k}.

Use this formula to generate P0(x) through P5(x). The symbol [n/2] denotes the largest integer not exceeding n/2.

6. Show that

P_n(x) = (1/(2ⁿn!)) Σ_{k=0}^{n} (n!/(k!(n-k)!)) (dᵏ/dxᵏ)[(x+1)ⁿ] (d^{n-k}/dx^{n-k})[(x-1)ⁿ].

Hint: Put x² - 1 = (x-1)(x+1) in Rodrigues's formula.

7. Let n be a nonnegative integer. Use reduction of order (Section 2.2) and the fact that P_n(x) is one solution of Legendre's equation with λ = n(n+1) to obtain a second, linearly independent solution

Q_n(x) = P_n(x) ∫ 1/([P_n(x)]²(1-x²)) dx.

8. Use the result of Problem 7 to show that

Q0(x) = (1/2) ln((1+x)/(1-x)),
Q1(x) = (x/2) ln((1+x)/(1-x)) - 1,

and

Q2(x) = (1/4)(3x² - 1) ln((1+x)/(1-x)) - (3/2)x

for -1 < x < 1.

9. The gravitational potential at a point P: (x, y, z) due to a unit mass at (x0, y0, z0) is

φ(x, y, z) = 1/√((x-x0)² + (y-y0)² + (z-z0)²).

For some purposes (such as in astronomy) it is convenient to expand φ(x, y, z) in powers of r or 1/r, where r = √(x² + y² + z²). To do this, introduce the angle θ shown in Figure 16.4. Let d = √(x0² + y0² + z0²) and R = √((x-x0)² + (y-y0)² + (z-z0)²).

FIGURE 16.4

(a) Use the law of cosines to write

φ = 1/R = (1/d) · 1/√(1 - 2(r/d)cos(θ) + (r/d)²).

(b) From our discussion of the generating function for Legendre polynomials, recall that, if 1/√(1 - 2at + t²) is expanded in a series about 0, convergent for |t| < 1, then the coefficient of tⁿ is P_n(a).

(c) If r < d, let a = cos(θ) and t = r/d to obtain

φ(r) = Σ_{n=0}^∞ (1/d^{n+1}) P_n(cos(θ)) rⁿ.

(d) If r > d, show that

φ(r) = Σ_{n=0}^∞ dⁿ P_n(cos(θ)) r^{-n-1}.

10. Show that

Σ_{n=0}^∞ (1/2ⁿ) P_n(1/2) = (2/3)√3.

11. Let n be a nonnegative integer. Prove that

P_{2n+1}(0) = 0 and P_{2n}(0) = (-1)ⁿ (2n)!/(2^{2n}(n!)²).

12. Expand each of the following in a series of Legendre polynomials.

(a) 1 + 2x - x²
(b) 2x + x² - 5x³
(c) 2 - x² + 4x⁴

In each of Problems 13 through 18, find the first five coefficients in the Fourier-Legendre expansion of the function. Graph the function and the sum of the first five terms of this expansion on the same set of axes, for -3 ≤ x ≤ 3. The expansion is only valid on [-1, 1], but it is instructive to see how the function and the partial sum of the Fourier-Legendre expansion are generally unrelated outside this interval.

13. f(x) = sin(πx/2)
14. f(x) = e^{-x}
15. f(x) = sin²(x)
16. f(x) = cos(x) - sin(x)
17. f(x) = -1 for -1 < x < 0, 1 for 0 < x < 1
18. f(x) = (x + 1)cos(x)
16.2 Bessel Functions

We will now develop the second kind of special function we will use to introduce the general topic of special functions. Recall from Chapter 4 that the second-order differential equation

x²y″ + xy′ + (x² - ν²)y = 0

is called Bessel's equation of order ν. Thus the term order is used in two senses here: the differential equation is of second order, but traditionally we say that the equation has order ν to refer to the parameter ν occurring in the coefficient of y. In Example 4.12 of Section 4.3, we used the method of Frobenius to find a series solution

y(x) = c0 Σ_{n=0}^∞ ((-1)ⁿ / (2^{2n} n! (1+ν)(2+ν)···(n+ν))) x^{2n+ν},

in which c0 is a nonzero constant and ν ≥ 0. This solution is valid in some interval (0, R), depending on ν.

It will be useful to write this solution in terms of the gamma function, which we will now develop.

16.2.1 The Gamma Function

For x > 0, the gamma function Γ is defined by

Γ(x) = ∫_0^∞ t^{x-1} e^{-t} dt.

This integral converges for all x > 0. The gamma function has a fascinating history and many interesting properties. For us, the most useful is the following.

THEOREM 16.11 Factorial Property of the Gamma Function
If x > 0, then

Γ(x+1) = xΓ(x). ■

Proof: If 0 < a < b, then we can integrate by parts, with u = tˣ and dv = e^{-t}dt, to get

∫_a^b tˣ e^{-t} dt = [tˣ(-e^{-t})]_a^b + x ∫_a^b t^{x-1} e^{-t} dt
= -bˣe^{-b} + aˣe^{-a} + x ∫_a^b t^{x-1} e^{-t} dt.

Take the limit of this equation as a → 0+ and b → ∞ to get

Γ(x+1) = ∫_0^∞ tˣ e^{-t} dt = x ∫_0^∞ t^{x-1} e^{-t} dt = xΓ(x). ■
The reason why this is called the factorial property can be seen by letting x = n, a positive integer. By repeated application of the theorem, we get

Γ(n+1) = nΓ(n) = n(n-1)Γ(n-1) = ··· = n(n-1)···(2)(1)Γ(1) = n!Γ(1).

Since Γ(1) = ∫_0^∞ e^{-t} dt = 1, this gives

Γ(n+1) = n!

for any positive integer n. This is the reason for the term factorial property of the gamma function.

It is possible to extend Γ(x) to negative (but noninteger) values of x by using the factorial property. For x > 0, write

Γ(x) = (1/x)Γ(x+1).  (16.11)

If -1 < x < 0, then x+1 > 0, so Γ(x+1) is defined and we use the right side of equation (16.11) to define Γ(x). Once we have extended Γ(x) to -1 < x < 0, we can let -2 < x < -1. Then -1 < x+1 < 0, so Γ(x+1) has been defined and we can again use equation (16.11) to define Γ(x). In this way we can walk to the left along the real line, defining Γ(x) on (-n-1, -n) as soon as it has been defined on the interval (-n, -n+1) immediately to the right. For example,

Γ(-1/2) = (1/(-1/2))Γ(1/2) = -2Γ(1/2),

and

Γ(-3/2) = (1/(-3/2))Γ(-1/2) = -(2/3)(-2Γ(1/2)) = (4/3)Γ(1/2).

Figure 16.5(a) shows a graph of y = Γ(x) for 0 < x < 5. Graphs for -1 < x < 0, -2 < x < -1 and -3 < x < -2, respectively, are given in Figures 16.5(b), (c) and (d).

FIGURE 16.5(a) Γ(x) for 0 < x < 5.
FIGURE 16.5(b) Γ(x) for -1 < x < 0.
FIGURE 16.5(c) Γ(x) for -2 < x < -1.
FIGURE 16.5(d) Γ(x) for -3 < x < -2.
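The factorial property and the leftward extension can both be exercised with the standard-library gamma function; this check is ours, not part of the text (Python's `math.gamma` accepts negative noninteger arguments, matching the extension just described):

```python
# Check Gamma(n+1) = n!, the step Gamma(x+1) = x Gamma(x), and the
# extension example Gamma(-3/2) = (4/3) Gamma(1/2).
import math

fact_ok = all(math.isclose(math.gamma(n + 1), math.factorial(n))
              for n in range(1, 10))
step_ok = math.isclose(math.gamma(4.3), 3.3 * math.gamma(3.3))
ext_ok = math.isclose(math.gamma(-1.5), (4.0 / 3.0) * math.gamma(0.5))
```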
16.2.2 Bessel Functions of the First Kind and Solutions of Bessel's Equation

Now return to the Frobenius solution y(x) of Bessel's equation, given previously. Part of the denominator in this solution is

(1+ν)(2+ν)···(n+ν),

in which we assume that ν ≥ 0. Now use the factorial property of the gamma function to write

Γ(n+ν+1) = (n+ν)Γ(n+ν) = (n+ν)(n+ν-1)Γ(n+ν-1) = ··· = (n+ν)(n+ν-1)···(1+ν)Γ(1+ν),

so that

(1+ν)(2+ν)···(n+ν) = Γ(n+ν+1)/Γ(ν+1).

Choose

c0 = 1/(2^ν Γ(ν+1))

to obtain the solution we will denote as J_ν(x):

J_ν(x) = Σ_{n=0}^∞ ((-1)ⁿ / (2^{2n+ν} n! Γ(n+ν+1))) x^{2n+ν}.

J_ν is called a Bessel function of the first kind of order ν. The series defining J_ν(x) converges for all x.

Because Bessel's equation is of second order (as a differential equation), we need a second solution, linearly independent from J_ν, to write the general solution. Theorem 4.4 in Section 4.4 tells us how to proceed to a second solution. In Example 4.12 we found that the indicial equation of Bessel's equation is r² - ν² = 0, with roots ±ν. The key lies in the difference 2ν between these roots. Omitting the details of the analysis, here are the conclusions.
1. If 2ν is not an integer, then J_ν and J_{-ν} are linearly independent (neither is a constant multiple of the other), and the general solution of Bessel's equation of order ν is

y(x) = aJ_ν(x) + bJ_{-ν}(x),

with a and b arbitrary constants.

2. If 2ν is an odd positive integer, say 2ν = 2n+1, then ν = n + 1/2 for some nonnegative integer n. In this case, J_ν and J_{-ν} are still linearly independent. It can be shown that in this case J_{n+1/2}(x) and J_{-n-1/2}(x) can be expressed in closed form as finite sums of terms involving square roots, sines and cosines. For example, by manipulating the series for J_ν(x), we find that

J_{1/2}(x) = √(2/(πx)) sin(x),  J_{-1/2}(x) = √(2/(πx)) cos(x),

J_{3/2}(x) = √(2/(πx)) [(1/x)sin(x) - cos(x)],

and

J_{-3/2}(x) = √(2/(πx)) [-sin(x) - (1/x)cos(x)].

In this case, the general solution of Bessel's equation of order ν is

y(x) = aJ_{n+1/2}(x) + bJ_{-n-1/2}(x),

with a and b arbitrary constants.

3. If 2ν is an integer, but ν is not of the form n + 1/2 for any nonnegative integer n, then ν itself is an integer, say ν = n. In this case J_n(x) and J_{-n}(x) are solutions of Bessel's equation, but they are linearly dependent. Indeed, one can check from the series that in this case,

J_{-n}(x) = (-1)ⁿJ_n(x).

In this case we must construct a second solution of Bessel's equation, linearly independent from J_n(x). This leads us to Bessel functions of the second kind.
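The half-integer closed forms can be checked directly against the series definition of J_ν. The helper below is our own sketch (names are invented; the series is truncated, which is harmless since it converges rapidly):

```python
# Compare the series for J_{1/2} and J_{-1/2} with their closed forms
# sqrt(2/(pi x)) sin x and sqrt(2/(pi x)) cos x.
import math

def bessel_j(v, x, terms=40):
    # J_v(x) = sum_n (-1)^n x^(2n+v) / (2^(2n+v) n! Gamma(n+v+1))
    s = 0.0
    for n in range(terms):
        s += (-1) ** n * x ** (2 * n + v) / (
            2 ** (2 * n + v) * math.factorial(n) * math.gamma(n + v + 1))
    return s

x = 2.4
lhs = bessel_j(0.5, x)
rhs = math.sqrt(2 / (math.pi * x)) * math.sin(x)
lhs_m = bessel_j(-0.5, x)
rhs_m = math.sqrt(2 / (math.pi * x)) * math.cos(x)
```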
16.2.3 Bessel Functions of the Second Kind

In Section 4.4 we derived a second solution for Bessel's equation for the case ν = 0. It was

y₂(x) = J0(x) ln(x) + Σ_{n=1}^∞ ((-1)^{n+1}/(2^{2n}(n!)²)) φ(n) x^{2n},

in which

φ(n) = 1 + 1/2 + 1/3 + ··· + 1/n.

Instead of using this solution as it is written, it is customary to use a linear combination of y₂(x) and J0(x), which will of course also be a solution. This combination is denoted Y0(x), and is defined for x > 0 by

Y0(x) = (2/π)[y₂(x) + (γ - ln(2))J0(x)],

where γ is Euler's constant, defined by

γ = lim_{n→∞} (1 + 1/2 + ··· + 1/n - ln(n)).

J0 and Y0 are linearly independent because of the ln(x) term in Y0(x), and the general solution of Bessel's equation of order zero is therefore

y(x) = aJ0(x) + bY0(x),

with a and b arbitrary constants. Y0 is called a Bessel function of the second kind of order zero. With the choice made for the constants in defining Y0, this function is also called Neumann's function of order zero.

If ν is a positive integer, say ν = n, a derivation similar to that of Y0(x), but with more computational details, yields the second solution

Y_n(x) = (2/π) { J_n(x)[ln(x/2) + γ]
+ Σ_{k=0}^∞ ((-1)^{k+1}[φ(k) + φ(k+n)]/(2^{2k+n+1} k!(k+n)!)) x^{2k+n}
- Σ_{k=0}^{n-1} ((n-k-1)!/(2^{2k-n+1} k!)) x^{2k-n} },

with φ(0) = 0. This agrees with Y0(x) if n = 0, with the understanding that in this case the last summation does not appear. The general solution of Bessel's equation of positive integer order n is therefore

y(x) = aJ_n(x) + bY_n(x).

Thus far we have Y_ν(x) only for ν a nonnegative integer. We did not need this Bessel function of the second kind for the general solution of Bessel's equation in other cases. However, it is possible to extend this definition of Y_ν(x) to include all real values of ν by letting

Y_ν(x) = (1/sin(νπ))[J_ν(x)cos(νπ) - J_{-ν}(x)].

For any nonnegative integer n, one can show that

Y_n(x) = lim_{ν→n} Y_ν(x).

Y_ν is Neumann's Bessel function of order ν. This function is linearly independent from J_ν(x) for x > 0, and it enables us to write the general solution of Bessel's equation of order ν in all cases as

y(x) = aJ_ν(x) + bY_ν(x).

Graphs of some Bessel functions of both kinds are shown in Figures 16.6 and 16.7.
FIGURE 16.6 Bessel functions of the first kind.
FIGURE 16.7 Bessel functions of the second kind.

It is interesting to notice that solutions of Bessel's equation illustrate all of the cases of the Frobenius theorem (Theorem 4.4). Case 1 occurs if 2ν is not an integer, case 2 if ν = 0, case 3 with no logarithm term if ν = n + 1/2 for some nonnegative integer n, and case 3 with a logarithm term if ν is a positive integer.

In applications and models of physical systems, Bessel's equation often occurs in disguised form, requiring a change of variables to write the solution in terms of Bessel functions.
EXAMPLE 16.2

Consider the differential equation

9x²y″ - 27xy′ + (9x² + 35)y = 0.

Let y = x²u and compute

y′ = 2xu + x²u′,  y″ = 2u + 4xu′ + x²u″.

Substitute these into the differential equation to get

18x²u + 36x³u′ + 9x⁴u″ - 54x²u - 27x³u′ + 9x⁴u + 35x²u = 0.

Collect terms to write

9x⁴u″ + 9x³u′ + (9x⁴ - x²)u = 0.

Divide by 9x² to get

x²u″ + xu′ + (x² - 1/9)u = 0,

which is Bessel's equation of order ν = 1/3. Since 2ν is not an integer, the general solution for u is

u(x) = aJ_{1/3}(x) + bJ_{-1/3}(x).

Therefore the original differential equation has general solution

y(x) = ax²J_{1/3}(x) + bx²J_{-1/3}(x)

for x > 0. ■

If a, b and c are constants with b > 0, and ν is any nonnegative number, then it is routine to show that x^a J_ν(bx^c) and x^a Y_ν(bx^c) are solutions of the general differential equation

y″ - ((2a-1)/x) y′ + (b²c²x^{2c-2} + (a² - ν²c²)/x²) y = 0.  (16.12)
EXAMPLE 16.3

Consider the differential equation

y″ - ((2√3 - 1)/x) y′ + (784x⁶ - 61/x²) y = 0.

To fit this into the template of equation (16.12), we must clearly choose a = √3. Because of the x⁶ term, try putting 2c - 2 = 6, hence c = 4. Now we must choose b and ν so that

784 = b²c² = 16b²,

so b = 7, and

a² - ν²c² = 3 - 16ν² = -61.

This equation is satisfied by ν = 2. The general solution of the differential equation is therefore

y(x) = c1 x^{√3} J2(7x⁴) + c2 x^{√3} Y2(7x⁴)

for x > 0. Here c1 and c2 are arbitrary constants. ■
16.2.4 Modified Bessel Functions

Sometimes a model of a physical phenomenon will require a modified Bessel function for its solution. We will show how these are obtained. Begin with the general solution

y(x) = c_1 J_0(kx) + c_2 Y_0(kx)

of the zero-order Bessel equation

y'' + (1/x)y' + k^2 y = 0.

Let k = i. Then

y(x) = c_1 J_0(ix) + c_2 Y_0(ix)

is the general solution of

y'' + (1/x)y' − y = 0

for x > 0. This is a modified Bessel equation of order zero, and J_0(ix) is a modified Bessel function of the first kind of order zero. Usually this is denoted

I_0(x) = J_0(ix) = 1 + x^2/2^2 + x^4/(2^2 4^2) + x^6/(2^2 4^2 6^2) + ··· .

Normally Y_0(ix) is not used; instead the second solution is chosen to be

K_0(x) = [ln(2) − γ] I_0(x) − I_0(x) ln(x) + x^2/4 + ···

for x > 0. Here γ is Euler's constant. K_0 is a modified Bessel function of the second kind of order zero. Figure 16.8 shows graphs of I_0(x) and K_0(x).
FIGURE 16.8   Modified Bessel functions.
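The series for I_0(x) quoted above is easy to check against SciPy's built-in modified Bessel function iv (the only assumption here is that SciPy is available):

```python
# Compare the series 1 + x^2/2^2 + x^4/(2^2 4^2) + ... , i.e. sum (x/2)^(2m)/(m!)^2,
# with scipy's modified Bessel function of the first kind, iv(0, x).
import math
from scipy.special import iv

def I0_series(x, terms=30):
    return sum((x / 2) ** (2 * m) / math.factorial(m) ** 2 for m in range(terms))

vals = [(x, I0_series(x), iv(0, x)) for x in (0.5, 1.0, 3.0)]
```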
The general solution of

y'' + (1/x)y' − y = 0

is therefore

y(x) = c_1 I_0(x) + c_2 K_0(x)

for x > 0. The general solution of

y'' + (1/x)y' − b^2 y = 0   (16.13)

is

y(x) = c_1 I_0(bx) + c_2 K_0(bx)   (16.14)

for x > 0. By a routine calculation using the series expansion, we find that
∫ x I_0(ax) dx = (x/a) I_1(ax) + c

for any nonzero constant a.

Often we are interested in the behavior of a function as the variable assumes increasingly large values. This is called asymptotic behavior, and we will treat it later in some detail for Bessel functions in general. However, with just a few lines of work we can get some idea of how I_0(x) behaves for large x. Begin with

y'' + (1/x)y' − y = 0,

of which cI_0(x) is a solution for any constant c. Under the change of variables y = ux^{−1/2}, this equation transforms to

u'' = (1 − 1/(4x^2)) u,

with solution u(x) = c√x I_0(x) for x > 0 and c any constant. Transform further by putting u = ve^x, obtaining

v'' + 2v' + (1/(4x^2)) v = 0,

with solution v(x) = c√x e^{−x} I_0(x). Since we are interested in the behavior of solutions for large x, attempt a series solution of this differential equation for v of the form

v(x) = 1 + c_1 (1/x) + c_2 (1/x^2) + c_3 (1/x^3) + ··· .

Substitute into the differential equation and arrange terms to obtain c_1 = 1/8, c_2 = 9/128, and so on, so that

√x e^{−x} I_0(x) ≈ c (1 + 1/(8x) + 9/(128x^2) + ···).

The series on the right actually diverges, but the sum of the first N terms approximates √x e^{−x} I_0(x) as closely as we want, for x sufficiently large. This is called an asymptotic expansion of I_0(x). By an analysis we will not carry out, it can be shown that c = 1/√(2π). Thus, for large x,

I_0(x) ≈ (e^x/√(2πx)) (1 + 1/(8x) + 9/(128x^2) + ···).

These results about modified Bessel functions will be applied shortly to a description of the skin effect in the flow of an alternating current through a wire of circular cross section.
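A rough numerical check of this asymptotic expansion, again assuming SciPy is available, truncating after the 9/(128x^2) term:

```python
# The asymptotic expansion I0(x) ~ e^x/sqrt(2 pi x) (1 + 1/(8x) + 9/(128x^2))
# should get relatively more accurate as x grows.
import math
from scipy.special import iv

def I0_asym(x):
    return math.exp(x) / math.sqrt(2 * math.pi * x) * (1 + 1 / (8 * x) + 9 / (128 * x**2))

rel_err_10 = abs(I0_asym(10.0) - iv(0, 10.0)) / iv(0, 10.0)
rel_err_30 = abs(I0_asym(30.0) - iv(0, 30.0)) / iv(0, 30.0)
```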
16.2.5 Some Applications of Bessel Functions

We will use Bessel functions in the next chapter to solve certain partial differential equations. However, Bessel functions arise in many different contexts. We will discuss two such settings here.

The Critical Length of a Vertical Rod. Consider a thin elastic rod of uniform density and circular cross section, clamped in a vertical position as in Figure 16.9. If the rod is long enough, and the upper end is given a displacement and held in that position until the rod is at rest, the rod will remain bent or displaced when released. Such a length is referred to as an unstable length. At some shorter lengths, however, the rod will return to the vertical position when released, after some small oscillations. These lengths are referred to as stable lengths for the rod. We would like to determine the critical length L_c, the transition point from stable to unstable.

Suppose the rod has length L and weight w per unit length. Let a be the radius of its circular cross section and E the Young's modulus for the material of the rod. (This is the ratio of stress
FIGURE 16.9   FIGURE 16.10
to the corresponding strain, for an elongation or linear compression.) The moment of inertia of the cross section about a diameter is I = πa^4/4. Assume that the rod is in equilibrium and is then displaced slightly from the vertical, as in Figure 16.10. The x axis is vertical along the original position of the rod, with downward as positive and the origin at the upper end of the rod at equilibrium. Let P(x, y) and Q(ξ, η) be points on the displaced rod, as shown. The moment about P of the weight of an element wΔξ at Q is wΔξ[y(ξ) − y(x)]. By integrating this expression we obtain the moment about P of the weight of the rod above P. Assume from the theory of elasticity that this moment about P is EIy''(x). Since the part of the rod above P is in equilibrium, then

EI y''(x) = ∫_0^x w[y(ξ) − y(x)] dξ.

Differentiate this equation with respect to x:

EI y^{(3)}(x) = w[y(x) − y(x)] − ∫_0^x w y'(x) dξ = −wx y'(x).
Then

y^{(3)}(x) + (w/(EI)) x y'(x) = 0.

Let u = y' to obtain the second-order differential equation

u'' + (w/(EI)) x u = 0.

Compare this equation with equation (16.12). We need

2a − 1 = 0,   2c − 2 = 1,   a^2 − ν^2 c^2 = 0,   and   b^2 c^2 = w/(EI).

This leads us to choose

a = 1/2,   c = 3/2,   ν = 1/3,   b = (2/3)√(w/(EI)).
The general solution for u(x) is

u(x) = y'(x) = c_1 √x J_{1/3}((2/3)√(w/(EI)) x^{3/2}) + c_2 √x J_{−1/3}((2/3)√(w/(EI)) x^{3/2}).

Since there is no bending moment at the top of the rod,

y''(0) = 0.
We leave it for the student to show that this condition requires c_1 = 0. Then

y'(x) = c_2 √x J_{−1/3}((2/3)√(w/(EI)) x^{3/2}).

Since the lower end of the rod is clamped vertically, y'(L) = 0, so

c_2 √L J_{−1/3}((2/3)√(w/(EI)) L^{3/2}) = 0.

Since c_2 must be nonzero to avoid a trivial solution, we need

J_{−1/3}((2/3)√(w/(EI)) L^{3/2}) = 0.

The critical length L_c is the smallest positive number which can be substituted for L in this equation. From a table of Bessel functions we find that the smallest positive number α such that J_{−1/3}(α) = 0 is approximately 1.8663. Therefore

(2/3)√(w/(EI)) L_c^{3/2} = 1.8663,

so

L_c ≈ 1.9863 (EI/w)^{1/3}.
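The computation of L_c can be reproduced numerically. SciPy's jn_zeros handles only integer orders, so the sketch below locates the first zero of J_{−1/3} by root bracketing; the bracket [1, 2.5] is an assumption based on the value 1.8663 quoted above:

```python
# Find the first positive zero alpha of J_{-1/3}, then compute the constant in
# L_c = coeff * (EI/w)^(1/3), obtained by solving (2/3) sqrt(w/EI) L^(3/2) = alpha.
from scipy.special import jv
from scipy.optimize import brentq

alpha = brentq(lambda x: jv(-1.0 / 3.0, x), 1.0, 2.5)  # first zero of J_{-1/3}
coeff = (3 * alpha / 2) ** (2.0 / 3.0)                  # should be about 1.9863
```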
Alternating Current in a Wire. We will analyze alternating current in a wire of circular cross section, culminating in a mathematical description of the skin effect (at high frequencies, "most" of the current flows through a thin layer at the surface of the wire).

Begin with general principles named for Ampère and Faraday. Ampère's law states that the line integral of the magnetic force around a closed curve (circuit) is equal to 4π times the integral of the electric current through the circuit. Faraday's law states that the line integral of the electric force around a closed circuit equals the negative of the time derivative of the magnetic induction through the circuit.

We want to use these laws to determine the current density at radius r in a wire of circular cross section and radius a. Let ρ be the specific resistance of the wire, μ its permeability, and x(r, t) and H(r, t) the current density and magnetic intensity, respectively, at radius r and time t.

To begin, apply Ampère's law to a circle of radius r having its axis along the axis of the wire. We get

2πrH = 4π ∫_0^r x(ξ, t) 2πξ dξ,

or

rH = 4π ∫_0^r x ξ dξ.   (16.15)

Then

(∂/∂r)(rH) = 4πxr,

so

(1/r)(∂/∂r)(rH) = 4πx(r, t).   (16.16)
FIGURE 16.11
Now apply Faraday's law to the rectangular circuit of Figure 16.11, having one side of length L along the axis of the cylinder. We get

ρLx(0, t) − ρLx(r, t) = −(∂/∂t) ∫_0^r μL H(ξ, t) dξ.

Differentiate this equation with respect to r to get

ρ ∂x/∂r = μ ∂H/∂t.   (16.17)

We want to use equations (16.16) and (16.17) to eliminate H. First multiply equation (16.17) by r to get

ρr ∂x/∂r = μr ∂H/∂t.

Differentiate with respect to r:

ρ (∂/∂r)(r ∂x/∂r) = μ (∂/∂r)(r ∂H/∂t) = μ (∂/∂t)((∂/∂r)(rH)) = μ (∂/∂t)(4πxr) = 4πμr ∂x/∂t,

in which we substituted from equation (16.16) at the next to last step. Then

ρ (∂/∂r)(r ∂x/∂r) = 4πμr ∂x/∂t.   (16.18)

The idea is to solve this partial differential equation for x(r, t), then obtain H(r, t) from equation (16.15). To do this, assume that the current through the wire is an alternating current given by C cos(ωt), with C constant. Thus the period of the current is 2π/ω. It is convenient to write z(r, t) = x(r, t) + iy(r, t), so x(r, t) = Re(z(r, t)), and to think of the current as the real part of the complex exponential Ce^{iωt}. The differential equation (16.18), with z in place of x, is

ρ (∂/∂r)(r ∂z/∂r) = 4πμr ∂z/∂t.   (16.19)
To solve this equation, we will attempt a solution of the form

z(r, t) = f(r)e^{iωt}.

Substitute this proposed solution into equation (16.19) to get

ρ (d/dr)(r f'(r)) e^{iωt} = 4πμr f(r) iω e^{iωt}.

Divide by e^{iωt} and carry out the differentiations to get

f''(r) + (1/r) f'(r) − b^2 f(r) = 0,   where   b^2 = 4πμωi/ρ.

Comparing this equation with equation (16.13), we can write the general solution for f(r) in terms of modified Bessel functions:

f(r) = c_1 I_0(br) + c_2 K_0(br),   where   b = (1 + i)√(2πμω/ρ).

Because of the logarithm term in K_0(br), which has infinite limit as r → 0 (the center of the wire), choose c_2 = 0. Thus f(r) has the form

f(r) = c_1 I_0(br)

and

z(r, t) = c_1 I_0(br)e^{iωt}.

To determine the constant, use the fact that (the real part of) Ce^{iωt} is the total current; hence, using the integral formula for ∫ x I_0(ax) dx given above,

C = 2πc_1 ∫_0^a r I_0(br) dr = (2πa c_1/b) I_1(ba).

Then

c_1 = bC/(2πa I_1(ba))

and

z(r, t) = (bC/(2πa)) (I_0(br)/I_1(ba)) e^{iωt}.

Then x(r, t) = Re(z(r, t)), and we leave it for the student to show that

H(r, t) = Re((2C/a) (I_1(br)/I_1(ba)) e^{iωt}).

We can use the solution for z(r, t) to model the skin effect. The entire current flowing through a cylinder of radius r within the wire (and having the same central axis as the wire) is the real part of

(b/(2πa I_1(ba))) Ce^{iωt} ∫_0^r I_0(bξ) 2πξ dξ,

and some computation shows that this is the real part of

(r I_1(br)/(a I_1(ba))) Ce^{iωt}.

Therefore

(current in the cylinder of radius r)/(total current in the wire) = (r I_1(br))/(a I_1(ba)).

When the frequency ω is large, then the magnitude of b is large, and we can use the asymptotic expansion of Section 16.2.4 (whose leading term holds for I_1 as well as for I_0) to write

(r I_1(br))/(a I_1(ba)) ≈ (r/a) √(ba/(br)) e^{br − ba} = √(r/a) e^{−b(a−r)}.

For any r, with 0 < r < a, we can make e^{−b(a−r)} as small as we like by taking the frequency ω sufficiently large. This means that for large frequencies, most of the current is flowing near the outer surface of the wire. This is the skin effect.
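A small illustration of the skin effect (the parameter values a = 1 and b = 10(1 + i) are made up for the illustration; SciPy's iv accepts complex arguments):

```python
# Fraction of the total current flowing inside radius r: |(r/a) I1(br) / I1(ba)|.
# A large |b| corresponds to a high frequency.
from scipy.special import iv

a = 1.0
b = 10.0 * (1 + 1j)

def fraction(r):
    return abs((r / a) * iv(1, b * r) / iv(1, b * a))

inner = fraction(0.5)   # current inside half the radius: almost none
outer = fraction(0.9)   # most of the remainder is concentrated near the surface
```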
16.2.6 A Generating Function for J_n(x)

We now return to a development of general properties of Bessel functions. For the Legendre polynomials, we produced a generating function L(x, t) with the property that

L(x, t) = Σ_{n=0}^∞ P_n(x) t^n.

In the same spirit, we will now produce a generating function for the integer-order Bessel functions of the first kind.
THEOREM 16.12   Generating Function for Bessel Functions

e^{x(t − 1/t)/2} = Σ_{n=−∞}^∞ J_n(x) t^n.   (16.20)

To understand why equation (16.20) is true, begin with the familiar Maclaurin expansion of the exponential function to write

e^{x(t−1/t)/2} = e^{xt/2} e^{−x/(2t)} = (Σ_{m=0}^∞ (1/m!)(xt/2)^m)(Σ_{k=0}^∞ (1/k!)(−x/(2t))^k)

= (1 + xt/2 + (1/2!)(x^2 t^2/2^2) + (1/3!)(x^3 t^3/2^3) + ···)(1 − x/(2t) + (1/2!)(x^2/(2^2 t^2)) − (1/3!)(x^3/(2^3 t^3)) + ···).

To illustrate the idea, look for the coefficient of t^4 in this product. We obtain t^4 when x^4 t^4/(2^4 4!) on the left is multiplied by 1 on the right, and when x^5 t^5/(2^5 5!) is multiplied by −x/(2t) on the right, and when x^6 t^6/(2^6 6!) is multiplied by x^2/(2^2 2! t^2) on the right, and so on. In this way we find that the coefficient of t^4 in this product is

x^4/(2^4 4!) − x^6/(2^6 1! 5!) + x^8/(2^8 2! 6!) − x^{10}/(2^{10} 3! 7!) + ··· = Σ_{n=0}^∞ (−1)^n x^{2n+4}/(2^{2n+4} n! (n+4)!).

Now compare this series with

J_4(x) = Σ_{n=0}^∞ ((−1)^n/(2^{2n+4} n! Γ(n+4+1))) x^{2n+4} = Σ_{n=0}^∞ ((−1)^n/(2^{2n+4} n! (n+4)!)) x^{2n+4}.

Similar reasoning establishes that the coefficient of t^n in equation (16.20) is J_n(x) for any nonnegative integer n. For negative integers, we can use the fact that

J_{−n}(x) = (−1)^n J_n(x).
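A numerical spot-check of equation (16.20), assuming SciPy is available; truncating the sum at |n| = 25 is more than enough for these values of x and t:

```python
# Compare exp(x(t - 1/t)/2) with the truncated sum of J_n(x) t^n over n = -25..25.
import numpy as np
from scipy.special import jv

x, t = 1.5, 0.7
lhs = np.exp(x * (t - 1 / t) / 2)
rhs = sum(jv(n, x) * t**n for n in range(-25, 26))
```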
16.2.7 An Integral Formula for J_n(x)

Using the generating function, we can derive an integral formula for J_n(x) when n is a nonnegative integer.

THEOREM 16.13   Bessel's Integral

If n is a nonnegative integer, then

J_n(x) = (1/π) ∫_0^π cos(nθ − x sin(θ)) dθ.

Proof   Put t = e^{iθ} in the generating function (16.20). Since (e^{iθ} − e^{−iθ})/2 = i sin(θ), we get

e^{x(t−1/t)/2} = e^{ix sin(θ)} = Σ_{n=−∞}^∞ J_n(x) e^{inθ}.   (16.21)

Using J_{−n}(x) = (−1)^n J_n(x) to pair the terms for n and −n, equation (16.21) becomes

e^{ix sin(θ)} = cos(x sin(θ)) + i sin(x sin(θ))

= J_0(x) + 2 Σ_{n=1}^∞ J_{2n}(x) cos(2nθ) + 2i Σ_{n=1}^∞ J_{2n−1}(x) sin((2n − 1)θ).

The real part of the left side of this equation must equal the real part of the right side, and similarly for the imaginary parts:

cos(x sin(θ)) = J_0(x) + 2 Σ_{n=1}^∞ J_{2n}(x) cos(2nθ)   (16.22)
and

sin(x sin(θ)) = 2 Σ_{n=1}^∞ J_{2n−1}(x) sin((2n − 1)θ).   (16.23)
Now recognize that the series on the right in equations (16.22) and (16.23) are Fourier series. Focusing on equation (16.22) for the moment, its Fourier series is therefore

cos(x sin(θ)) = a_0/2 + Σ_{k=1}^∞ (a_k cos(kθ) + b_k sin(kθ)) = J_0(x) + 2 Σ_{n=1}^∞ J_{2n}(x) cos(2nθ).

Since we know the coefficients in a Fourier expansion, we conclude that

a_k = (1/π) ∫_{−π}^π cos(x sin(θ)) cos(kθ) dθ = { 0 if k is odd;  2J_k(x) if k is even }   (16.24)

and

b_k = (1/π) ∫_{−π}^π cos(x sin(θ)) sin(kθ) dθ = 0 for k = 1, 2, 3, ….   (16.25)

Similarly, from equation (16.23),

sin(x sin(θ)) = A_0/2 + Σ_{k=1}^∞ (A_k cos(kθ) + B_k sin(kθ)) = 2 Σ_{n=1}^∞ J_{2n−1}(x) sin((2n − 1)θ),

so these Fourier coefficients are

A_k = (1/π) ∫_{−π}^π sin(x sin(θ)) cos(kθ) dθ = 0 for k = 0, 1, 2, …   (16.26)

and

B_k = (1/π) ∫_{−π}^π sin(x sin(θ)) sin(kθ) dθ = { 0 if k is even;  2J_k(x) if k is odd }.   (16.27)
Upon adding equations (16.24) and (16.27), we have

(1/π) ∫_{−π}^π cos(x sin(θ)) cos(kθ) dθ + (1/π) ∫_{−π}^π sin(x sin(θ)) sin(kθ) dθ

= (1/π) ∫_{−π}^π cos(kθ − x sin(θ)) dθ = 2J_k(x),

whether k is even or odd. Thus

J_k(x) = (1/(2π)) ∫_{−π}^π cos(kθ − x sin(θ)) dθ for k = 0, 1, 2, 3, ….

To complete the proof, we have only to observe that cos(kθ − x sin(θ)) is an even function of θ, hence ∫_{−π}^π = 2∫_0^π, so

J_k(x) = (1/π) ∫_0^π cos(kθ − x sin(θ)) dθ for k = 0, 1, 2, 3, …. ■
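Bessel's integral can be checked directly against SciPy's jv by numerical quadrature:

```python
# J_n(x) = (1/pi) * integral_0^pi cos(n*theta - x*sin(theta)) d(theta)
from math import cos, sin, pi
from scipy.integrate import quad
from scipy.special import jv

def bessel_integral(n, x):
    val, _ = quad(lambda theta: cos(n * theta - x * sin(theta)), 0, pi)
    return val / pi

checks = [abs(bessel_integral(n, x) - jv(n, x)) for n in (0, 1, 4) for x in (0.5, 2.0, 7.0)]
```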
16.2.8 A Recurrence Relation for J_ν(x)

We will derive three recurrence-type relationships involving Bessel functions of the first kind. These provide information about the function or its derivative in terms of functions of the same type, but lower index. We begin with two relationships involving derivatives.

THEOREM 16.14

If ν is a real number, then

(d/dx)(x^ν J_ν(x)) = x^ν J_{ν−1}(x).   (16.28)

Proof   Begin with the case that ν is not a negative integer. By direct computation,

(d/dx)(x^ν J_ν(x)) = (d/dx) Σ_{n=0}^∞ ((−1)^n/(2^{2n+ν} n! Γ(n+ν+1))) x^{2n+2ν}

= Σ_{n=0}^∞ ((−1)^n 2(n+ν)/(2^{2n+ν} n! Γ(n+ν+1))) x^{2n+2ν−1}

= x^ν Σ_{n=0}^∞ ((−1)^n/(2^{2n+ν−1} n! Γ(n+ν))) x^{2n+ν−1} = x^ν J_{ν−1}(x),

since Γ(n+ν+1) = (n+ν)Γ(n+ν). Now extend this result to the case that ν is a negative integer, say ν = −m with m a positive integer, by using the fact that

J_{−m}(x) = (−1)^m J_m(x).

We leave this detail to the student.

THEOREM 16.15

If ν is a real number, then

(d/dx)(x^{−ν} J_ν(x)) = −x^{−ν} J_{ν+1}(x).   (16.29)
Verification of this relationship is similar to that of equation (16.28). Using these two recurrence formulas involving derivatives, we can derive the following relationship between Bessel functions of the first kind of different orders.

THEOREM 16.16

Let ν be a real number. Then for x > 0,

(2ν/x) J_ν(x) = J_{ν+1}(x) + J_{ν−1}(x).   (16.30)

Proof   Carry out the differentiations in equations (16.28) and (16.29) to write

x^ν J_ν'(x) + ν x^{ν−1} J_ν(x) = x^ν J_{ν−1}(x)

and

x^{−ν} J_ν'(x) − ν x^{−ν−1} J_ν(x) = −x^{−ν} J_{ν+1}(x).

Multiply the first equation by x^{−ν} and the second by x^ν to obtain

J_ν'(x) + (ν/x) J_ν(x) = J_{ν−1}(x)

and

J_ν'(x) − (ν/x) J_ν(x) = −J_{ν+1}(x).

Upon subtracting the second of these equations from the first, we obtain the conclusion of the theorem. ■
EXAMPLE 16.4

Previously we stated that

J_{1/2}(x) = √(2/(πx)) sin(x),   J_{−1/2}(x) = √(2/(πx)) cos(x),

results obtained by direct reference to the infinite series for these Bessel functions. Putting ν = 1/2 into equation (16.30), we get

(1/x) J_{1/2}(x) = J_{3/2}(x) + J_{−1/2}(x).

Then

J_{3/2}(x) = (1/x) J_{1/2}(x) − J_{−1/2}(x) = √(2/(πx)) ((1/x) sin(x) − cos(x)).

Then, upon putting ν = 3/2 into equation (16.30), we get

(3/x) J_{3/2}(x) = J_{5/2}(x) + J_{1/2}(x).

Then

J_{5/2}(x) = −J_{1/2}(x) + (3/x) J_{3/2}(x)

= √(2/(πx)) (−sin(x) + (3/x)((1/x) sin(x) − cos(x)))

= √(2/(πx)) ((3/x^2 − 1) sin(x) − (3/x) cos(x)).

This process can be continued indefinitely. The point is that this is a better way to generate the Bessel functions J_{n+1/2}(x) than by referring to the infinite series each time. ■
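The closed form just derived for J_{5/2}(x) can be compared with SciPy's jv at fractional order:

```python
# J_{5/2}(x) = sqrt(2/(pi x)) ((3/x^2 - 1) sin x - (3/x) cos x), checked against jv(2.5, x).
import numpy as np
from scipy.special import jv

def j52(x):
    return np.sqrt(2 / (np.pi * x)) * ((3 / x**2 - 1) * np.sin(x) - (3 / x) * np.cos(x))

xs = np.array([0.5, 1.0, 2.0, 5.0, 10.0])
err = np.max(np.abs(j52(xs) - jv(2.5, xs)))
```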
16.2.9 Zeros of J_ν(x)

We have seen in some of the applications that we sometimes need to know where J_ν(x) = 0. Such points are the zeros of J_ν(x). We will show that J_ν(x) has infinitely many simple positive zeros, and also obtain estimates for their locations.

As a starting point, recall from equation (16.12) that y = J_ν(kx) is a solution of

x^2 y'' + x y' + (k^2 x^2 − ν^2) y = 0.

Let k > 1. Now put u(x) = √(kx) J_ν(kx). Substitute this into Bessel's equation to get

u''(x) + (k^2 − (ν^2 − 1/4)/x^2) u(x) = 0.   (16.31)

Our intuition is that, as x increases, the term (ν^2 − 1/4)/x^2 exerts less influence on this equation for u, which begins to look more like u'' + k^2 u = 0, with sine and cosine solutions. This suggests that, for large x, J_ν(kx) is approximated by a trigonometric function, divided by √(kx). Since such a function has infinitely many positive zeros, so must J_ν(kx).

In order to exploit this intuition, consider the equation

v''(x) + v(x) = 0.   (16.32)

This has solution v(x) = sin(x − α), with α any positive number. Multiply equation (16.31) by v and equation (16.32) by u and subtract to get

u v'' − v u'' = (k^2 − 1 − (ν^2 − 1/4)/x^2) u v.

Since u v'' − v u'' = (u v' − v u')', integrating this equation from α to α + π gives

−u(α + π) − u(α) = ∫_α^{α+π} (k^2 − 1 − (ν^2 − 1/4)/x^2) u(x) sin(x − α) dx.

Apply the mean value theorem for integrals to the last integral. There is some number τ between α and α + π such that

−u(α + π) − u(α) = u(τ) ∫_α^{α+π} (k^2 − 1 − (ν^2 − 1/4)/x^2) sin(x − α) dx.

Now sin(x − α) > 0 for α < x < α + π. Further, we can choose α large enough (depending on ν and k) that

k^2 − 1 − (ν^2 − 1/4)/x^2 > 0

for α ≤ x ≤ α + π. Therefore the integral on the right in the last equation is positive. Then u(α + π), u(α), and u(τ) cannot all be of the same sign. Since u is continuous, u must have a zero somewhere between α and α + π. Since u(x) = √(kx) J_ν(kx), this proves that J_ν(kx) has at least one zero between α and α + π.

In general, if a is any sufficiently large number and k > 1, then J_ν(x) has a zero between a and a + kπ. We can now state a general result on positive zeros of Bessel functions of the first kind.
THEOREM 16.17   Zeros of J_ν(x)

Let k > 1 and let ν be a real number. Then, for a sufficiently large, there is a zero of J_ν(x) between a + nkπ and a + (n + 1)kπ for n = 0, 1, 2, …. Further, each zero is simple.

Proof   The argument given prior to the theorem shows that, for any number sufficiently large (depending on ν and the selected k > 1), there is a zero of J_ν(x) in the interval from that number to that number plus kπ. Thus there is a zero between a and a + kπ, then between a + kπ and a + 2kπ, and so on.

Further, each zero is simple. For if a zero β had multiplicity greater than 1, then J_ν(β) = J_ν'(β) = 0. But then J_ν(x) would be a solution of the initial value problem

x^2 y'' + x y' + (x^2 − ν^2) y = 0;   y(β) = y'(β) = 0.

Since the solution of this problem is unique, and the zero function is a solution, this would imply that J_ν(x) = 0 for x > 0, a contradiction. Thus each zero is simple. ■

The theorem implies that we can order the positive zeros of J_ν(x) in an increasing sequence

j_1 < j_2 < j_3 < ···,

so that lim_{n→∞} j_n = ∞. It can be shown that for ν > −1, J_ν(x) has no complex zeros.

We will show that J_ν has no positive zero in common with J_{ν+1} or J_{ν−1}. However, we claim that both J_{ν−1} and J_{ν+1} have at least one zero between any pair of positive zeros of J_ν. This is the interlacing lemma stated as Theorem 16.18 below, and it means that the graphs of these three functions weave about each other, as can be seen in Figure 16.12 for J_7(x), J_8(x), and J_9(x). First we need the following.

FIGURE 16.12   Interlacing of J_7(x), J_8(x), and J_9(x).
LEMMA 16.1

Let ν be a real number. Then, except possibly for x = 0, J_ν has no zero in common with either J_{ν−1} or J_{ν+1}.

Proof   Recall from the proof of Theorem 16.16 that

J_ν'(x) + (ν/x) J_ν(x) = J_{ν−1}(x).

If β ≠ 0 and J_ν(β) = J_{ν−1}(β) = 0, then J_ν'(β) = 0 also. But then β would be a zero of multiplicity at least two for J_ν, a contradiction. A similar use of the relation

J_ν'(x) − (ν/x) J_ν(x) = −J_{ν+1}(x)

shows that J_ν also cannot share a nonzero zero with J_{ν+1}. ■

THEOREM 16.18   Interlacing Lemma
Let ν be any real number. Let a and b be distinct positive zeros of J_ν. Then J_{ν−1} and J_{ν+1} each have at least one zero between a and b.

Proof   Let f(x) = x^ν J_ν(x). Then f(a) = f(b) = 0. Because f is differentiable at all points between a and b, by Rolle's theorem there is some c between a and b at which f'(c) = 0. But

f'(x) = (d/dx)(x^ν J_ν(x)) = x^ν J_{ν−1}(x),

so f'(c) = 0 implies that J_{ν−1}(c) = 0. Similar reasoning, applied to g(x) = x^{−ν} J_ν(x) and using the recursion relation

(d/dx)(x^{−ν} J_ν(x)) = −x^{−ν} J_{ν+1}(x),

shows that J_{ν+1} has a zero between a and b. ■
The following table gives the first five positive zeros of J_ν(x) for ν = 0, 1, 2, 3, 4. The numbers here are rounded at the third decimal place. The interlacing property of successively indexed Bessel functions can be seen by reading across each row. For example, the second positive zero of J_2(x) falls between the second positive zeros of J_1(x) and J_3(x).

        J_0(x)   J_1(x)   J_2(x)   J_3(x)   J_4(x)
j_1      2.405    3.832    5.136    6.380    7.588
j_2      5.520    7.016    8.417    9.761   11.065
j_3      8.654   10.173   11.620   13.015   14.373
j_4     11.792   13.324   14.796   16.224   17.616
j_5     14.931   16.471   17.960   19.409   20.827
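The table and the interlacing property can be reproduced with SciPy's jn_zeros (assuming SciPy is available):

```python
# First five positive zeros of J_1, J_2, J_3, and a check that they interlace.
from scipy.special import jn_zeros

z1, z2, z3 = jn_zeros(1, 5), jn_zeros(2, 5), jn_zeros(3, 5)
interlaced = all(z1[k] < z2[k] < z3[k] for k in range(5))
```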
16.2.10 Fourier-Bessel Expansions

Taking a cue from the Legendre polynomials, we might suspect that the Bessel functions are orthogonal on some interval. They are not. However, let ν be any positive number. We know that J_ν has infinitely many positive zeros, which we can arrange in an ascending sequence

j_1 < j_2 < j_3 < ···.
For each such zero we can consider the function √x J_ν(j_n x) for 0 ≤ x ≤ 1 (so j_n x varies from 0 to j_n). We claim that these functions are orthogonal on [0, 1], in the sense that the integral of the product of any two of these functions over [0, 1] is zero.

THEOREM 16.19   Orthogonality

Let ν > 0. Then the functions √x J_ν(j_n x), for n = 1, 2, 3, …, are orthogonal on [0, 1] in the sense that

∫_0^1 x J_ν(j_n x) J_ν(j_m x) dx = 0   if n ≠ m. ■

This is in the same spirit as the orthogonality of the Legendre polynomials on [−1, 1], and the orthogonality of the functions

1, cos(x), cos(2x), …, sin(x), sin(2x), …

on [−π, π].

Proof   Again invoking equation (16.12), u(x) = J_ν(j_n x) satisfies

x^2 u'' + x u' + (j_n^2 x^2 − ν^2) u = 0,

and v(x) = J_ν(j_m x) satisfies

x^2 v'' + x v' + (j_m^2 x^2 − ν^2) v = 0.

Multiply the first equation by v and the second by u, and subtract the resulting equations to obtain

x^2 u'' v + x u' v + (j_n^2 x^2 − ν^2) uv − x^2 v'' u − x v' u − (j_m^2 x^2 − ν^2) uv = 0.

This equation can be written

x^2 (u'' v − u v'') + x (u' v − u v') = (j_m^2 − j_n^2) x^2 uv.

Divide by x:

x (u'' v − u v'') + (u' v − u v') = (j_m^2 − j_n^2) x uv.

Write this equation as

[x (u' v − u v')]' = (j_m^2 − j_n^2) x uv.

Then

(j_m^2 − j_n^2) ∫_0^1 x J_ν(j_n x) J_ν(j_m x) dx = [x (u' v − u v')]_0^1 = j_n J_ν'(j_n) J_ν(j_m) − j_m J_ν(j_n) J_ν'(j_m) = 0,

because J_ν(j_n) = J_ν(j_m) = 0. Since j_n ≠ j_m, this proves the orthogonality of these functions on [0, 1]. ■

As usual, whenever we have an orthogonality relationship, we are led to attempt Fourier-type expansions. Let f be defined on [0, 1]. How should we choose the coefficients to have an expansion

f(x) = Σ_{n=1}^∞ a_n J_ν(j_n x)?
Using a now familiar strategy, multiply this equation by x J_ν(j_k x) and integrate to get

∫_0^1 x f(x) J_ν(j_k x) dx = Σ_{n=1}^∞ a_n ∫_0^1 x J_ν(j_n x) J_ν(j_k x) dx = a_k ∫_0^1 x J_ν^2(j_k x) dx.

The infinite series of integrals has collapsed to a single term because of the orthogonality. Then

a_k = (∫_0^1 x f(x) J_ν(j_k x) dx) / (∫_0^1 x J_ν^2(j_k x) dx).

We call these numbers the Fourier-Bessel coefficients of f. When these numbers are used in the series, we call Σ a_n J_ν(j_n x) the Fourier-Bessel expansion, or Fourier-Bessel series, of f in terms of the functions √x J_ν(j_n x).

Sometimes a different point of view is adopted. It is common to say that the functions J_ν(j_n x) are orthogonal on [0, 1] with respect to the weight function p(x) = x. This simply means that the integral of the product of any two of these functions, multiplied by p(x), is zero over the interval [0, 1]:

∫_0^1 p(x) J_ν(j_n x) J_ν(j_m x) dx = 0   if n ≠ m.

This is the same integral we had before for orthogonality, but places the integral in the context of the weight function p(x), a viewpoint we will see shortly with Sturm-Liouville theory. Putting p(x) = x in this integral has the same effect as putting a factor √x with each J_ν(j_n x).

As with Fourier and Fourier-Legendre expansions, the fact that we can compute the coefficients and write the series does not mean that the series is related to the function in any particular way. The following convergence theorem deals with this issue.

THEOREM 16.20   Convergence of Fourier-Bessel Series

Let f be piecewise smooth on [0, 1]. Then, for 0 < x < 1,

Σ_{n=1}^∞ a_n J_ν(j_n x) = (1/2)(f(x+) + f(x−)),

where a_n is the nth Fourier-Bessel coefficient of f. ■

We will give an example of a Fourier-Bessel expansion after we learn more about the coefficients.

16.2.11 Fourier-Bessel Coefficients

The integral ∫_0^1 x J_ν^2(j_k x) dx occurs in the denominator of the expression for the Fourier-Bessel coefficients of any function, so it is useful to have an evaluation of this integral.

THEOREM 16.21

If ν ≥ 0, then

∫_0^1 x J_ν^2(j_k x) dx = (1/2) J_{ν+1}^2(j_k).

Notice the importance here of the fact that J_ν and J_{ν+1} cannot have a positive zero in common. Knowing that J_ν(j_k) = 0 implies that J_{ν+1}(j_k) ≠ 0.
Proof   From the preceding discussion, u(x) = J_ν(j_k x) satisfies

x^2 u'' + x u' + (j_k^2 x^2 − ν^2) u = 0.

Multiply this equation by 2u'(x) to get

2x^2 u' u'' + 2x (u')^2 + 2(j_k^2 x^2 − ν^2) u u' = 0.

We can write this equation as

[x^2 (u')^2 + (j_k^2 x^2 − ν^2) u^2]' − 2 j_k^2 x u^2 = 0.

Now integrate, keeping in mind that u(1) = J_ν(j_k) = 0:

0 = ∫_0^1 [x^2 (u')^2 + (j_k^2 x^2 − ν^2) u^2]' dx − 2 j_k^2 ∫_0^1 x u^2 dx

= [x^2 (u')^2 + (j_k^2 x^2 − ν^2) u^2]_0^1 − 2 j_k^2 ∫_0^1 x u^2 dx

= (u'(1))^2 − 2 j_k^2 ∫_0^1 x u^2 dx

= j_k^2 [J_ν'(j_k)]^2 − 2 j_k^2 ∫_0^1 x [J_ν(j_k x)]^2 dx.

Then

∫_0^1 x J_ν^2(j_k x) dx = (1/2) [J_ν'(j_k)]^2.

From the proof of Theorem 16.16,

J_ν'(x) − (ν/x) J_ν(x) = −J_{ν+1}(x),

so, since J_ν(j_k) = 0,

J_ν'(j_k) = −J_{ν+1}(j_k).

Therefore

∫_0^1 x J_ν^2(j_k x) dx = (1/2) [J_{ν+1}(j_k)]^2. ■

In view of this conclusion, the Fourier-Bessel coefficients of f are

a_n = (2/[J_{ν+1}(j_n)]^2) ∫_0^1 x f(x) J_ν(j_n x) dx.

Fourier-Bessel series will occur later when we solve the heat equation for certain kinds of regions. We will then be faced with the task of expanding the initial temperature function in a Fourier-Bessel series. We will also see a Fourier-Bessel expansion when we solve for the normal modes of vibration of a circular membrane.

Generally, Fourier-Bessel coefficients are difficult to compute because Bessel functions are difficult to evaluate at particular points, and even their zeros must be approximated. However, with modern computing power we can often make approximations to whatever degree of accuracy is needed.
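Theorem 16.21 can be verified numerically for a particular case, say ν = 1 and the first positive zero of J_1 (again assuming SciPy is available):

```python
# Check integral_0^1 x J_nu(j_k x)^2 dx = (1/2) J_{nu+1}(j_k)^2 for nu = 1, k = 1.
from scipy.integrate import quad
from scipy.special import jv, jn_zeros

nu = 1
jk = jn_zeros(nu, 1)[0]
lhs, _ = quad(lambda x: x * jv(nu, jk * x) ** 2, 0, 1)
rhs = 0.5 * jv(nu + 1, jk) ** 2
```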
EXAMPLE 16.5

Let f(x) = x(1 − x) for 0 ≤ x ≤ 1. Since f is continuous with a continuous derivative, its Fourier-Bessel series (here with ν = 1) will converge to f(x) on (0, 1):

x(1 − x) = Σ_{n=1}^∞ a_n J_1(j_n x)   for 0 < x < 1,

where

a_n = (2/[J_2(j_n)]^2) ∫_0^1 x^2 (1 − x) J_1(j_n x) dx.

We will compute a_1 through a_4, using eight decimal places in the first four zeros of J_1(x):

j_1 = 3.83170597,   j_2 = 7.01558667,   j_3 = 10.17346814,   j_4 = 13.32369194.

With the understanding that these integrations are approximations, compute

a_1 = (2/[J_2(3.83170597)]^2) ∫_0^1 x^2 (1 − x) J_1(3.83170597x) dx = 12.32930609 ∫_0^1 x^2 (1 − x) J_1(3.83170597x) dx = 0.45221702,

a_2 = (2/[J_2(7.01558667)]^2) ∫_0^1 x^2 (1 − x) J_1(7.01558667x) dx = 22.20508362 ∫_0^1 x^2 (1 − x) J_1(7.01558667x) dx = −0.03151859,

a_3 = (2/[J_2(10.17346814)]^2) ∫_0^1 x^2 (1 − x) J_1(10.17346814x) dx = 32.07568554 ∫_0^1 x^2 (1 − x) J_1(10.17346814x) dx = 0.03201789,

and

a_4 = (2/[J_2(13.32369194)]^2) ∫_0^1 x^2 (1 − x) J_1(13.32369194x) dx = 41.94557796 ∫_0^1 x^2 (1 − x) J_1(13.32369194x) dx = −0.00768864.

Then, for 0 < x < 1,

x(1 − x) ≈ 0.45221702 J_1(3.83170597x) − 0.03151859 J_1(7.01558667x) + 0.03201789 J_1(10.17346814x) − 0.00768864 J_1(13.32369194x).

Figure 16.13 shows a graph of x(1 − x) and a graph of this four-term sum of Bessel functions on [0, 1]. The graph is drawn on [−1, 3] to emphasize that, outside of [0, 1], there is no claim
FIGURE 16.13   Approximation of x(1 − x) on [0, 1] by a Fourier-Bessel series.
that x(1 − x) is approximated by the Fourier-Bessel series, and indeed the graphs diverge away from each other outside of [0, 1]. Accuracy on [0, 1] can be improved by computing more terms in the series. ■
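The coefficients of Example 16.5 can be reproduced with SciPy (the quoted zeros of J_1 are recomputed rather than typed in); the results should match the text to several decimal places:

```python
# Fourier-Bessel coefficients a_n of f(x) = x(1-x) with respect to J_1(j_n x).
from scipy.integrate import quad
from scipy.special import jv, jn_zeros

zeros = jn_zeros(1, 4)

def a(n):
    jn = zeros[n - 1]
    integral, _ = quad(lambda x: x**2 * (1 - x) * jv(1, jn * x), 0, 1)
    return 2 * integral / jv(2, jn) ** 2

a1, a2, a3, a4 = a(1), a(2), a(3), a(4)
```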
PROBLEMS

1. Show that x^a J_ν(bx^c) is a solution of

y'' − ((2a − 1)/x) y' + (b^2 c^2 x^{2c−2} + (a^2 − ν^2 c^2)/x^2) y = 0.

In each of Problems 2 through 9, write the general solution of the differential equation in terms of functions x^a J_ν(bx^c) and x^a J_{−ν}(bx^c).

2. y'' + (3/x)y' + (1 + 143/(144x^2))y = 0

3. y'' + (1/x)y' + (4x^2 − 9/x^2)y = 0

4. y'' − (5/x)y' + (64x^6 + 8/x^2)y = 0

5. y'' + (3/x)y' + (16x^2 − 5/(4x^2))y = 0

6. y'' − (1/x)y' + 9x^4 y = 0

7. y'' − (1/x)y' + (36x^4 + 7/(16x^2))y = 0

8. y'' + (1/x)y' + (1 − 175/(16x^2))y = 0

9. y'' + (1/x)y' + (81x^4 − 4/x^2)y = 0

10. Use the change of variables y = u'/(bu) to transform the differential equation

dy/dx + by^2 = cx^m

into the differential equation

d^2u/dx^2 − bcx^m u = 0.

Use the result of Problem 1 to find the general solution of this differential equation in terms of Bessel functions, and use this solution to solve the original differential equation. Assume that b is a positive constant.

In each of Problems 11 through 16, use the given change of variables to transform the differential equation into one whose general solution can be written in terms of Bessel functions. Use this to write the general solution of the original differential equation.

11. 4x^2 y'' + 4xy' + (x − 9)y = 0;   z = √x

12. 4x^2 y'' + 4xy' + (9x^3 − 36)y = 0;   z = x^{3/2}

13. 9x^2 y'' + 9xy' + (4x^{2/3} − 16)y = 0;   z = 2x^{1/3}

14. 9x^2 y'' − 27xy' + (9x^2 + 35)y = 0;   u = y/x^2

15. 36x^2 y'' − 12xy' + (36x^2 + 7)y = 0;   u = x^{−2/3} y

16. 4x^2 y'' + 8xy' + (4x^2 − 35)y = 0;   u = √x y

17. Show that y(x) = √x J_{1/3}((2k/3)x^{3/2}) is a solution of y'' + k^2 x y = 0.
In each of Problems 18 through 22, write the general solution of the differential equation in terms of functions x^a J_ν(bx^c) and x^a Y_ν(bx^c).

18. y'' − (1/x)y' + (4 − 3/x^2)y = 0

19. y'' − (1/x)y' + x^2 y = 0

20. y'' − (5/x)y' + (1/x)y = 0

21. y'' − (1/x)y' + (4x − 3/x^2)y = 0

22. y'' − (3/x)y' + (16x^2 − 5/x^2)y = 0

23. Show that

J_{5/2}(x) = √(2/(πx)) [(3/x^2 − 1) sin(x) − (3/x) cos(x)].

24. Show that

J_{−5/2}(x) = √(2/(πx)) [(3/x^2 − 1) cos(x) + (3/x) sin(x)].

25. Let α be a positive zero of J_0(x). Show that ∫_0^1 J_1(αx) dx = 1/α.

26. Let u(x) = J_0(αx) and v(x) = J_0(βx).

(a) Show that xu'' + u' + α^2 xu = 0. Derive a similar differential equation for v.

(b) Multiply the differential equation for u by v, and the differential equation for v by u, and subtract to show that

[x(u'v − v'u)]' = (β^2 − α^2)xuv.

(c) Show from part (b) that

(β^2 − α^2) ∫ x J_0(αx) J_0(βx) dx = x [α J_0'(αx) J_0(βx) − β J_0'(βx) J_0(αx)].

This is one of a set of formulas called Lommel's integrals.

27. Show that [x I_1(x)]' = x I_0(x).

28. In each of (a) through (d), find (approximately) the first five terms in the Fourier-Bessel expansion Σ_{n=1}^∞ a_n J_1(j_n x) of f(x), which is defined for 0 < x < 1. Compare the graph of this function with the graph of the sum of the first five terms in the series.

(a) f(x) = x   (b) f(x) = e^{−x}   (c) f(x) = xe^{−x}   (d) f(x) = x^2 e^{−x}

29. Carry out the program of Problem 28, except now use an expansion Σ_{n=1}^∞ a_n J_2(j_n x).
16.3 Sturm-Liouville Theory and Eigenfunction Expansions

16.3.1 The Sturm-Liouville Problem

We have now seen essentially the same scenario played out three times:

a differential equation,
solutions that are orthogonal on [a, b],
expansions of arbitrary functions in series of these solutions,
a convergence theorem for the expansion.

First we had Fourier (trigonometric) series, then Legendre polynomials and Fourier-Legendre series, and then Bessel functions and Fourier-Bessel expansions. It stretches the imagination to think that the similarities in the convergence theorems for these expansions are mere coincidence. We will now develop a general theory into which these convergence theorems fit naturally. This will also expand our arsenal of tools in preparation for solving partial differential equations. Consider the differential equation

y'' + R(x)y' + (Q(x) + λP(x))y = 0.   (16.33)

Given an interval (a, b) on which the coefficients are continuous, we seek values of λ for which this equation has nontrivial solutions. As we will see, in some cases there will be boundary conditions the solutions must satisfy (conditions specified at a and b), and sometimes not.
CHAPTER 16  Special Functions, Orthogonal Expansions, and Wavelets

First put the differential equation into a convenient standard form. Multiply equation (16.33) by r(x) = e^(∫R(x)dx) to get

y'' e^(∫R(x)dx) + R(x)y' e^(∫R(x)dx) + (Q(x) + λP(x)) e^(∫R(x)dx) y = 0.

Since r(x) ≠ 0, this equation has the same solutions as equation (16.33). Now recognize that the last equation can be written

(ry')' + (q + λp)y = 0,   (16.34)

in which q = rQ and p = rP. Equation (16.34) is called the Sturm-Liouville differential equation, or the Sturm-Liouville form of equation (16.33). We will assume that p, q, r and r' are continuous on [a, b], or at least on (a, b), and that p(x) > 0 and r(x) > 0 on (a, b).
EXAMPLE 16.6

Legendre's differential equation is (1 - x^2)y'' - 2xy' + λy = 0. We can immediately write this in Sturm-Liouville form as

((1 - x^2)y')' + λy = 0,

for -1 < x < 1. Corresponding to the values λ = n(n + 1), with n = 0, 1, 2, ..., the Legendre polynomials are solutions. As we saw in Section 16.1, there are also nonpolynomial solutions corresponding to other choices of λ. However, these nonpolynomial solutions are not bounded on [-1, 1].
EXAMPLE 16.7

Equation (16.12), with a = 0, c = 1 and b = √λ, can be written

(xy')' + (λx - ν^2/x)y = 0.

This is the Sturm-Liouville form of Bessel's equation. For λ > 0, this equation has solutions in terms of the Bessel functions of order ν of the first and second kinds, J_ν(√λ x) and Y_ν(√λ x).

We will now distinguish three kinds of Sturm-Liouville problems.

The Regular Sturm-Liouville Problem   We want numbers λ for which there are nontrivial solutions of (ry')' + (q + λp)y = 0 on an interval [a, b]. These solutions must satisfy regular boundary conditions, which have the form

A1 y(a) + A2 y'(a) = 0,   B1 y(b) + B2 y'(b) = 0.

A1 and A2 are given constants, not both zero, and similarly for B1 and B2.
The Periodic Sturm-Liouville Problem   Now suppose r(a) = r(b). We seek numbers λ and corresponding nontrivial solutions of the Sturm-Liouville equation on [a, b], satisfying the periodic boundary conditions

y(a) = y(b),   y'(a) = y'(b).

The Singular Sturm-Liouville Problem   We look for numbers λ and corresponding nontrivial solutions of the Sturm-Liouville equation on (a, b), subject to one of the following three kinds of boundary conditions:

Type 1. r(a) = 0 and there is no boundary condition at a, while at b the boundary condition is B1 y(b) + B2 y'(b) = 0, where B1 and B2 are not both zero.

Type 2. r(b) = 0 and there is no boundary condition at b, while at a the condition is A1 y(a) + A2 y'(a) = 0, with A1 and A2 not both zero.

Type 3. r(a) = r(b) = 0, and no boundary condition is specified at a or b. In this case we want solutions that are bounded functions on [a, b].

Each of these problems is a boundary value problem, specifying certain conditions at the endpoints of an interval, as contrasted with an initial value problem, which specifies information about the function and its derivative at a single point (in the second order case). Boundary value problems usually do not have unique solutions. Indeed, it is exactly this lack of uniqueness that can be exploited to solve many important problems.

In each of these problems, a number λ for which the Sturm-Liouville differential equation has a nontrivial solution is called an eigenvalue of the problem. A corresponding nontrivial solution is called an eigenfunction associated with this eigenvalue. The zero function cannot be an eigenfunction. However, any nonzero constant multiple of an eigenfunction associated with a particular eigenvalue is also an eigenfunction for this eigenvalue. In mathematical models of problems in physics and engineering, eigenvalues usually have some physical significance. For example, in studying wave motion the eigenvalues are fundamental frequencies of vibration of the system.

We will consider examples of these kinds of problems. The first will be important in analyzing problems involving heat conduction and wave propagation.
EXAMPLE 16.8  A Regular Problem

Consider the regular problem

y'' + λy = 0;   y(0) = y(L) = 0

on an interval [0, L]. We will find the eigenvalues and eigenfunctions by considering cases on λ. Since we will show later that a Sturm-Liouville problem cannot have a complex eigenvalue, there are three cases.

Case 1: λ = 0. Then y(x) = cx + d for some constants c and d. Now y(0) = d = 0, and y(L) = cL = 0 requires that c = 0. This means that y(x) = cx + d must be the trivial solution. In the absence of a nontrivial solution, λ = 0 is not an eigenvalue of this problem.

Case 2: λ is negative, say λ = -k^2 with k > 0. Now y'' - k^2 y = 0 has general solution

y(x) = c1 e^(kx) + c2 e^(-kx).

Since y(0) = c1 + c2 = 0, then c2 = -c1, so y = 2c1 sinh(kx). But then

y(L) = 2c1 sinh(kL) = 0.

Since kL > 0, sinh(kL) > 0, so c1 = 0. This case also leads to the trivial solution, so this Sturm-Liouville problem has no negative eigenvalue.

Case 3: λ is positive, say λ = k^2. The general solution of y'' + k^2 y = 0 is

y(x) = c1 cos(kx) + c2 sin(kx).

Now y(0) = c1 = 0, so y(x) = c2 sin(kx). Finally, we need

y(L) = c2 sin(kL) = 0.

To avoid the trivial solution, we need c2 ≠ 0. Then we must choose k so that sin(kL) = 0, which means that kL must be a positive integer multiple of π, say kL = nπ. Then

λ_n = n^2 π^2 / L^2   for n = 1, 2, 3, ....

Each of these numbers is an eigenvalue of this Sturm-Liouville problem. Corresponding to each n, the eigenfunctions are

y_n(x) = c sin(nπx/L),

in which c can be any nonzero real number.
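The eigenvalues found in this example can be checked numerically. The sketch below (an added illustration, not part of the text; L = 1 is an arbitrary choice) treats λ as a shooting parameter: it integrates y'' = -λy with y(0) = 0, y'(0) = 1 by RK4 and bisects on the boundary value y(L), which vanishes exactly at the eigenvalues n^2 π^2 / L^2.

```python
import math

def shoot(lam, L=1.0, steps=1000):
    """Integrate y'' = -lam*y with y(0) = 0, y'(0) = 1 by classical RK4
    and return the boundary value y(L)."""
    h = L / steps
    y, yp = 0.0, 1.0
    for _ in range(steps):
        k1y, k1p = yp, -lam * y
        k2y, k2p = yp + 0.5 * h * k1p, -lam * (y + 0.5 * h * k1y)
        k3y, k3p = yp + 0.5 * h * k2p, -lam * (y + 0.5 * h * k2y)
        k4y, k4p = yp + h * k3p, -lam * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        yp += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return y

def eigenvalue(n, L=1.0):
    """Bisect y(L) as a function of lam on a bracket containing only
    the n-th eigenvalue n^2 pi^2 / L^2."""
    lo = (n - 0.5) ** 2 * math.pi ** 2 / L ** 2
    hi = (n + 0.5) ** 2 * math.pi ** 2 / L ** 2
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if shoot(lo, L) * shoot(mid, L) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

print([round(eigenvalue(n), 4) for n in (1, 2, 3)])
# -> [9.8696, 39.4784, 88.8264], i.e. n^2 pi^2 for L = 1
```

The bracket works because y(L) = sin(√λ L)/√λ for λ > 0, which changes sign exactly once between consecutive eigenvalues.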
EXAMPLE 16.9  A Periodic Sturm-Liouville Problem

Consider the problem

y'' + λy = 0;   y(-L) = y(L), y'(-L) = y'(L)

on an interval [-L, L]. Comparing this differential equation to equation (16.34), we have r(x) = 1, so r(-L) = r(L), as required for a periodic Sturm-Liouville problem. Consider cases on λ.

Case 1: λ = 0. Then y = cx + d. Now y(-L) = -cL + d = y(L) = cL + d implies that c = 0. The constant function y = d satisfies both boundary conditions. Thus λ = 0 is an eigenvalue with nonzero constant eigenfunctions.

Case 2: λ < 0, say λ = -k^2 with k > 0. Now

y(x) = c1 e^(kx) + c2 e^(-kx).

Since y(-L) = y(L), then

c1 e^(-kL) + c2 e^(kL) = c1 e^(kL) + c2 e^(-kL).   (16.35)

And y'(-L) = y'(L) gives us (after dividing out the common factor of k)

c1 e^(-kL) - c2 e^(kL) = c1 e^(kL) - c2 e^(-kL).   (16.36)

Rewrite equation (16.35) as

c1 (e^(-kL) - e^(kL)) = c2 (e^(-kL) - e^(kL)).

This implies that c1 = c2. Then equation (16.36) becomes

c1 (e^(-kL) - e^(kL)) = c1 (e^(kL) - e^(-kL)).

But this implies that c1 = -c1, hence c1 = 0. The solution is therefore trivial, hence this problem has no negative eigenvalue.

Case 3: λ is positive, say λ = k^2. Now

y(x) = c1 cos(kx) + c2 sin(kx).

Now

y(-L) = c1 cos(kL) - c2 sin(kL) = y(L) = c1 cos(kL) + c2 sin(kL).

But this implies that 2c2 sin(kL) = 0. Next,

y'(-L) = kc1 sin(kL) + kc2 cos(kL) = y'(L) = -kc1 sin(kL) + kc2 cos(kL).

Then 2kc1 sin(kL) = 0. If sin(kL) ≠ 0, then c1 = c2 = 0, leaving the trivial solution. Thus suppose sin(kL) = 0. This requires that kL = nπ for some positive integer n. Therefore the numbers

λ_n = n^2 π^2 / L^2

are eigenvalues for n = 1, 2, ..., with corresponding eigenfunctions

y_n(x) = c1 cos(nπx/L) + c2 sin(nπx/L),

with c1 and c2 not both zero. We can combine Cases 1 and 3 by allowing n = 0, so the eigenvalue λ = 0 has corresponding nonzero constant eigenfunctions.

EXAMPLE 16.10  Bessel Functions as Eigenfunctions of a Singular Problem

Consider Bessel's equation of order ν,

(xy')' + (λx - ν^2/x)y = 0,
on the interval (0, R). Here ν is any given nonnegative real number, and R > 0. In the context of the Sturm-Liouville differential equation, r(x) = x, and r(0) = 0, so there is no boundary condition at 0. Let the boundary condition at R be y(R) = 0.

We know that, if λ > 0, then the general solution of Bessel's equation is

y(x) = c1 J_ν(√λ x) + c2 Y_ν(√λ x).

To have a solution that is bounded as x → 0+, we must choose c2 = 0. This leaves solutions of the form y = c1 J_ν(√λ x). To satisfy the boundary condition at x = R, we must have

y(R) = c1 J_ν(√λ R) = 0.

We need c1 ≠ 0 to avoid the trivial solution, so we must choose λ so that J_ν(√λ R) = 0. If j1, j2, ... are the positive zeros of J_ν(x), then √λ R can be chosen as any j_n. This yields an infinite sequence of eigenvalues

λ_n = j_n^2 / R^2,

with corresponding eigenfunctions

y_n(x) = c J_ν(j_n x / R),

with c constant but nonzero. This is an example of a type 1 singular Sturm-Liouville problem.
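The eigenvalues λ_n = j_n^2 / R^2 are easy to compute once the zeros of J_ν are in hand. A small sketch (an added illustration; the choices ν = 0 and R = 3 are arbitrary) evaluates J_0 through its integral representation J_0(x) = (1/π) ∫_0^π cos(x sin t) dt and locates its first positive zeros by bisection:

```python
import math

def J0(x, m=400):
    """Bessel J0 from the integral representation
    J0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, by composite Simpson."""
    h = math.pi / m
    s = math.cos(x * math.sin(0.0)) + math.cos(x * math.sin(math.pi))
    for k in range(1, m):
        s += (4 if k % 2 else 2) * math.cos(x * math.sin(k * h))
    return s * h / (3 * math.pi)

def j0_zeros(count):
    """First `count` positive zeros of J0: scan for sign changes in steps
    of 0.1, then refine each bracket by bisection."""
    zeros, x = [], 0.5
    while len(zeros) < count:
        a, b = x, x + 0.1
        if J0(a) * J0(b) < 0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                if J0(a) * J0(mid) <= 0:
                    b = mid
                else:
                    a = mid
            zeros.append(0.5 * (a + b))
        x += 0.1
    return zeros

R = 3.0
jn = j0_zeros(3)
print([round(z, 4) for z in jn])              # -> [2.4048, 5.5201, 8.6537]
print([round((z / R) ** 2, 4) for z in jn])   # eigenvalues j_n^2 / R^2
```

The printed zeros match the tabulated first zeros of J_0, and the corresponding eigenfunctions are J_0(j_n x / R), which vanish at x = R by construction.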
EXAMPLE 16.11  Legendre Polynomials as Eigenfunctions of a Singular Problem

Consider Legendre's differential equation

((1 - x^2)y')' + λy = 0.

In the setting of Sturm-Liouville theory, r(x) = 1 - x^2. On the interval [-1, 1], we have r(-1) = r(1) = 0, so there are no boundary conditions and this is a singular Sturm-Liouville problem of type 3. We want bounded solutions on this interval, so choose λ = n(n + 1), with n = 0, 1, 2, .... These are the eigenvalues of this problem. Corresponding eigenfunctions are nonzero constant multiples of the Legendre polynomials P_n(x).

Finally, here is an example with more complicated boundary conditions.
EXAMPLE 16.12

Consider the regular problem

y'' + λy = 0;   y(0) = 0, 3y(1) + y'(1) = 0.

This problem is defined on [0, 1]. To find the eigenvalues and eigenfunctions, consider cases on λ.

Case 1: λ = 0. Now y(x) = cx + d, and y(0) = d = 0. Then y = cx. But the second boundary condition,

3y(1) + y'(1) = 3c + c = 4c = 0,

forces c = 0, so this case has only the trivial solution. This means that 0 is not an eigenvalue of this problem.

Case 2: λ < 0. Write λ = -k^2 with k > 0, so y'' - k^2 y = 0, with general solution

y(x) = c1 e^(kx) + c2 e^(-kx).

Now y(0) = 0 = c1 + c2, so c2 = -c1 and y(x) = c1 sinh(kx). Next,

3y(1) + y'(1) = 0 = 3c1 sinh(k) + c1 k cosh(k).

But for k > 0, sinh(k) and k cosh(k) are positive, so this equation forces c1 = 0 and again we obtain only the trivial solution. This problem has no negative eigenvalue.

Case 3: λ > 0, say λ = k^2. Now y'' + k^2 y = 0, with general solution

y(x) = c1 cos(kx) + c2 sin(kx).

Then y(0) = c1 = 0, so y(x) = c2 sin(kx). The second boundary condition gives us

0 = 3c2 sin(k) + kc2 cos(k).

We need c2 ≠ 0 to avoid the trivial solution, so look for k so that

3 sin(k) + k cos(k) = 0.

This means that

tan(k) = -k/3.

This equation cannot be solved algebraically. However, Figure 16.14 shows graphs of y = tan(k) and y = -k/3 on the same set of axes. These graphs intersect infinitely often in the half plane k > 0. Let the k coordinates of these points of intersection be k1, k2, .... The numbers λ_n = k_n^2 are the eigenvalues of this problem, with corresponding eigenfunctions c sin(k_n x) for c ≠ 0.

FIGURE 16.14   Graphs of y = tan(k) and y = -k/3, intersecting at k = k1, k2, k3, ...
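The intersection points k_n are straightforward to compute numerically. In the sketch below (an added illustration), each root of 3 sin(k) + k cos(k) = 0 is isolated in the interval ((n - 1/2)π, nπ), where the function has opposite signs at the two endpoints, and is then refined by bisection.

```python
import math

def f(k):
    # tan(k) = -k/3 is equivalent to 3*sin(k) + k*cos(k) = 0 (cos(k) != 0)
    return 3 * math.sin(k) + k * math.cos(k)

def kth_root(n):
    """The n-th positive intersection point k_n. |f| equals 3 at the left
    endpoint of ((n - 1/2)*pi, n*pi) and n*pi at the right endpoint, with
    opposite signs, so bisection applies."""
    a, b = (n - 0.5) * math.pi, n * math.pi
    for _ in range(80):
        mid = 0.5 * (a + b)
        if f(a) * f(mid) <= 0:
            b = mid
        else:
            a = mid
    return 0.5 * (a + b)

ks = [kth_root(n) for n in (1, 2, 3)]
print([round(k, 4) for k in ks])       # k_1 is approximately 2.456
print([round(k * k, 4) for k in ks])   # eigenvalues lambda_n = k_n^2
```

The first eigenvalue works out to λ_1 = k_1^2 ≈ 6.03, and for large n the roots approach (n - 1/2)π from above, as Figure 16.14 suggests.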
16.3.2 The Sturm-Liouville Theorem

With these examples as background, here is the fundamental theorem of Sturm-Liouville theory.

THEOREM 16.22

1. Each regular and each periodic Sturm-Liouville problem has an infinite number of distinct real eigenvalues. If these are labeled λ1, λ2, ... so that λ_n < λ_(n+1), then lim_(n→∞) λ_n = ∞.
2. If λ_n and λ_m are distinct eigenvalues of any of the three kinds of Sturm-Liouville problems defined on an interval (a, b), and φ_n and φ_m are corresponding eigenfunctions, then

∫_a^b p(x) φ_n(x) φ_m(x) dx = 0.

3. All eigenvalues of a Sturm-Liouville problem are real numbers.
4. For a regular Sturm-Liouville problem, any two eigenfunctions corresponding to a single eigenvalue are constant multiples of each other. ■

Conclusion (1) assures us of the existence of eigenvalues, at least for regular and periodic problems. A singular problem may also have an infinite sequence of eigenvalues, as we saw in Example 16.10 with Bessel functions. Conclusion (1) also asserts that the eigenvalues "spread out," so that, if arranged in increasing order, they increase without bound. For example, the numbers 1 - 1/n could not be the eigenvalues of a Sturm-Liouville problem, since these numbers approach 1 as n → ∞.

In (2), denote

f·g = ∫_a^b p(x) f(x) g(x) dx.

This dot product for functions has many of the properties we have seen for the dot product of vectors. In particular, for functions f, g and h that are integrable on [a, b],

f·g = g·f,
f·(g + h) = f·g + f·h,
(αf)·g = α(f·g) for any real number α,

and f·f ≥ 0. The last property relies on the assumption made for the Sturm-Liouville equation that p(x) > 0 on (a, b). If f is also continuous on [a, b], then f·f = 0 only if f is the zero function, since in this case ∫_a^b p(x) f(x)^2 dx = 0 can be true only if f(x) = 0 for a ≤ x ≤ b.

This analogy between vectors and functions is useful in visualizing certain processes and concepts, and now is an appropriate time to formalize the terminology.
DEFINITION 16.1

Let p be continuous on [a, b] and p(x) > 0 for a < x < b.

1. If f and g are integrable on [a, b], then the dot product of f with g, with respect to the weight function p, is given by

f·g = ∫_a^b p(x) f(x) g(x) dx.

2. f and g are orthogonal on [a, b], with respect to the weight function p, if f·g = 0.

The definition of orthogonality is motivated by the fact that two vectors F and G in 3-space are orthogonal exactly when F·G = 0. Conclusion (2) may now be stated: eigenfunctions associated with distinct eigenvalues are orthogonal on [a, b], with weight function p(x). The weight function p is the coefficient of λ in the Sturm-Liouville equation.

As we have seen explicitly for Fourier (trigonometric) series, Fourier-Legendre series and Fourier-Bessel series, this orthogonality of eigenfunctions is the key to expansions of functions in series of eigenfunctions of a Sturm-Liouville problem. This will become a significant issue when we solve certain partial differential equations modeling wave and radiation phenomena. Conclusion (3) states that a Sturm-Liouville problem can have no complex eigenvalue. This is consistent with the fact that eigenvalues for certain problems have physical significance, such as measuring modes of vibration of a system. Finally, conclusion (4) applies only to regular Sturm-Liouville problems. For example, the periodic Sturm-Liouville problem of Example 16.9 has eigenfunctions cos(nπx/L) and sin(nπx/L) associated with the single eigenvalue n^2 π^2 / L^2, and these functions are certainly not constant multiples of each other.

We will prove parts of the Sturm-Liouville theorem.

Proof   A proof of (1) requires some delicate analysis that we will not pursue.

For (2), we will essentially reproduce arguments made previously for Legendre polynomials and Bessel functions. Begin with the fact that

(rφ_n')' + (q + λ_n p)φ_n = 0

and

(rφ_m')' + (q + λ_m p)φ_m = 0.

Multiply the first equation by φ_m and the second by φ_n and subtract to get

(rφ_n')' φ_m - (rφ_m')' φ_n + (λ_n - λ_m) p φ_n φ_m = 0.

Then

(λ_m - λ_n) ∫_a^b p(x) φ_n(x) φ_m(x) dx = ∫_a^b [(rφ_n')' φ_m - (rφ_m')' φ_n] dx.

Since λ_n ≠ λ_m, conclusion (2) will be proved if we can show that the right side of the last equation is zero. Integrate by parts:

∫_a^b (r(x)φ_n'(x))' φ_m(x) dx - ∫_a^b (r(x)φ_m'(x))' φ_n(x) dx
= [φ_m(x) r(x) φ_n'(x)]_a^b - ∫_a^b r(x) φ_n'(x) φ_m'(x) dx - [φ_n(x) r(x) φ_m'(x)]_a^b + ∫_a^b r(x) φ_n'(x) φ_m'(x) dx
= r(b)φ_m(b)φ_n'(b) - r(a)φ_m(a)φ_n'(a) - r(b)φ_n(b)φ_m'(b) + r(a)φ_n(a)φ_m'(a)
= r(b)[φ_m(b)φ_n'(b) - φ_n(b)φ_m'(b)] - r(a)[φ_m(a)φ_n'(a) - φ_n(a)φ_m'(a)].   (16.37)
To prove that this quantity is zero, use the boundary conditions that are in effect. Suppose first that we have a regular problem, with boundary conditions

A1 y(a) + A2 y'(a) = 0,   B1 y(b) + B2 y'(b) = 0.

Applying the boundary condition at a to φ_n and φ_m, we have

A1 φ_n(a) + A2 φ_n'(a) = 0

and

A1 φ_m(a) + A2 φ_m'(a) = 0.

Since A1 and A2 are assumed to be not both zero in the regular problem, the system of algebraic equations

φ_n(a)X + φ_n'(a)Y = 0,
φ_m(a)X + φ_m'(a)Y = 0

has a nontrivial solution (namely X = A1, Y = A2). This requires that the determinant of the coefficients vanish:

φ_n(a)φ_m'(a) - φ_m(a)φ_n'(a) = 0.

Using the boundary condition at b, we obtain

φ_n(b)φ_m'(b) - φ_m(b)φ_n'(b) = 0.

Therefore the right side of equation (16.37) is zero, proving the orthogonality relationship in the case of a regular Sturm-Liouville problem. The conclusion is proved similarly for the other kinds of Sturm-Liouville problems, by applying the relevant boundary conditions in equation (16.37).

To prove conclusion (3), suppose that a Sturm-Liouville problem has a complex eigenvalue λ = α + iβ. Let φ(x) = u(x) + iv(x) be a corresponding eigenfunction. Now

(rφ')' + (q + λp)φ = 0.

Take the complex conjugate of this equation, noting that φ'(x) = u'(x) + iv'(x) and (φ̄)'(x) = u'(x) - iv'(x) = conj(φ'(x)).
Since r(x), p(x) and q(x) are real-valued, these quantities are their own conjugates, and we get

(r φ̄')' + (q + λ̄ p) φ̄ = 0.

This means that λ̄ is also an eigenvalue, with eigenfunction φ̄. Now, if β ≠ 0, then λ and λ̄ are distinct eigenvalues, hence

∫_a^b p(x) φ(x) φ̄(x) dx = 0.

But then

∫_a^b p(x) [u(x)^2 + v(x)^2] dx = 0.

But, for a Sturm-Liouville problem, it is assumed that p(x) > 0 for a < x < b. Therefore

u(x)^2 + v(x)^2 = 0, so u(x) = v(x) = 0

on [a, b], and φ(x) is the trivial solution. This contradicts φ being an eigenfunction. We conclude that β = 0, so λ is real.

Finally, to prove (4), suppose λ is an eigenvalue of a regular Sturm-Liouville problem, and φ and ψ are both eigenfunctions associated with λ. Use the boundary condition at a, and reason as in part of the proof of (2), to show that

φ(a)ψ'(a) - ψ(a)φ'(a) = 0.

But then the Wronskian of φ and ψ vanishes at a, so φ and ψ are linearly dependent and one is a constant multiple of the other. ■
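Conclusion (2) of the theorem can also be observed numerically. The sketch below (an added illustration) checks it for the singular Legendre problem of Example 16.11, where the weight is p(x) = 1 on [-1, 1]: it builds the Legendre polynomials by the Bonnet recurrence and integrates their products by Simpson's rule.

```python
def legendre(n, x):
    """P_n(x) by the Bonnet recurrence
    (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)."""
    p0, p1 = 1.0, x
    if n == 0:
        return p0
    for k in range(1, n):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def dot(f, g, a=-1.0, b=1.0, m=400):
    """The weight-1 dot product f.g = integral_a^b f(x) g(x) dx,
    approximated by composite Simpson's rule."""
    h = (b - a) / m
    s = f(a) * g(a) + f(b) * g(b)
    for k in range(1, m):
        x = a + k * h
        s += (4 if k % 2 else 2) * f(x) * g(x)
    return s * h / 3

P = [lambda x, n=n: legendre(n, x) for n in range(5)]
print(dot(P[2], P[3]))   # distinct eigenvalues: essentially 0
print(dot(P[2], P[2]))   # P_n . P_n = 2/(2n+1) = 0.4 for n = 2
```

The product P_2 P_3 integrates to (numerically) zero, while P_2·P_2 comes out near 2/(2·2 + 1) = 0.4, the known normalization of the Legendre polynomials.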
We now have the machinery needed for general eigenfunction expansions.

16.3.3 Eigenfunction Expansions

In solving partial differential equations, we will often encounter the need to expand a function in a series of solutions of an associated ordinary differential equation: a Sturm-Liouville problem. Fourier series, Fourier-Legendre series, and Fourier-Bessel series are examples of such expansions. The function to be expanded will have some special significance in the problem. It might, for example, be an initial temperature function, or the initial displacement or velocity of a wave.

To create a unified setting in which such series expansions can be understood, consider an analogy with vectors in 3-space. Given a vector F, we can always find real numbers a, b and c so that

F = ai + bj + ck.

Although the constants are easy to find, we will pursue a formal process in order to identify a pattern. First,

F·i = ai·i + bj·i + ck·i = a,

because

i·i = 1 and j·i = k·i = 0.

Similarly,

b = F·j and c = F·k.

The orthogonality of i, j and k provides a convenient mechanism for determining the coefficients in the expansion by means of the dot product. More generally, suppose U, V and W are any three nonzero vectors in 3-space that are mutually orthogonal, so

U·V = U·W = V·W = 0.

These vectors need not be unit vectors, and do not have to be aligned along the axes. However, because of their orthogonality, we can also easily write F in terms of these three vectors. Indeed, if

F = αU + βV + γW,

then

F·U = αU·U + βV·U + γW·U = αU·U,

so

α = F·U / (U·U).

Similarly,

β = F·V / (V·V) and γ = F·W / (W·W).   (16.38)

Again, we have a simple dot product formula for the coefficients. The idea of expressing a vector as a sum of constants times mutually orthogonal vectors, with formulas for the coefficients, extends to writing functions in series of eigenfunctions of Sturm-Liouville problems, with a formula similar to equation (16.38) for the coefficients. We have seen three such instances already, which we will briefly review in the context of the Sturm-Liouville theorem.

Fourier Series
The Sturm-Liouville problem is

y'' + λy = 0;   y(-L) = y(L), y'(-L) = y'(L)

(a periodic problem) with eigenvalues n^2 π^2 / L^2 for n = 0, 1, 2, ... and eigenfunctions 1, cos(πx/L), cos(2πx/L), ..., sin(πx/L), sin(2πx/L), .... Here p(x) = 1 and the dot product to be used is

f·g = ∫_(-L)^L f(x) g(x) dx.

If f is piecewise smooth on [-L, L], then for -L < x < L,

(1/2)(f(x+) + f(x-)) = Σ_n c_n φ_n(x),

a Fourier series in which each coefficient is a quotient of dot products; for example, the coefficient of sin(nπx/L) is

(f·sin(nπx/L)) / (sin(nπx/L)·sin(nπx/L))   for n = 1, 2, ....
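The coefficient formulas of equation (16.38) can be mirrored directly in code. A minimal sketch (the particular vectors below are an arbitrary assumption for illustration; they are mutually orthogonal but neither unit length nor axis-aligned):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# U, V, W are mutually orthogonal but not unit vectors and not
# axis-aligned; this particular choice is only for illustration.
U, V, W = (1, 1, 0), (1, -1, 0), (0, 0, 2)
F = (3.0, -1.0, 4.0)

# alpha = F.U/(U.U), beta = F.V/(V.V), gamma = F.W/(W.W), as in (16.38)
coeffs = [dot(F, B) / dot(B, B) for B in (U, V, W)]
rebuilt = [sum(c * B[i] for c, B in zip(coeffs, (U, V, W))) for i in range(3)]
print(coeffs)    # -> [1.0, 2.0, 2.0]
print(rebuilt)   # -> [3.0, -1.0, 4.0], which is F again
```

The same quotient-of-dot-products pattern gives the Fourier, Fourier-Legendre and Fourier-Bessel coefficients reviewed in this section, with the integral dot product in place of the vector one.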
Fourier-Legendre Series   The Sturm-Liouville problem is

((1 - x^2)y')' + λy = 0,

with no boundary conditions on [-1, 1], because r(x) = 1 - x^2 vanishes at these endpoints. However, we seek bounded solutions. Eigenvalues are n(n + 1), with corresponding eigenfunctions the Legendre polynomials P0(x), P1(x), .... Since p(x) = 1, use the dot product

f·g = ∫_(-1)^1 f(x) g(x) dx.

If f is piecewise smooth on [-1, 1], then for -1 < x < 1,

(1/2)(f(x+) + f(x-)) = Σ_(n=0)^∞ c_n P_n(x),

where

c_n = (f·P_n) / (P_n·P_n) = ∫_(-1)^1 f(x) P_n(x) dx / ∫_(-1)^1 P_n(x)^2 dx.

Fourier-Bessel Series
Consider the Sturm-Liouville problem

(xy')' + (λx - ν^2/x)y = 0

with boundary condition y(1) = 0 on (0, 1). Eigenvalues are λ_n = j_n^2 for n = 1, 2, ..., where j1, j2, ... are the positive zeros of J_ν(x), and eigenfunctions are J_ν(j_n x). In this Sturm-Liouville problem, p(x) = x and the dot product is

f·g = ∫_0^1 x f(x) g(x) dx.

If f is piecewise smooth on [0, 1], then for 0 < x < 1 we can write the series

(1/2)(f(x+) + f(x-)) = Σ_(n=1)^∞ c_n J_ν(j_n x),

where

c_n = ∫_0^1 x f(x) J_ν(j_n x) dx / ∫_0^1 x J_ν(j_n x)^2 dx,

again fitting the template we have seen in the other kinds of expansions. These expansions are all special cases of a general theory of expansions in series of eigenfunctions of Sturm-Liouville problems.
THEOREM 16.23  Convergence of Eigenfunction Expansions

Let λ1, λ2, ... be the eigenvalues of a Sturm-Liouville differential equation

(ry')' + (q + λp)y = 0

on [a, b], with one of the sets of boundary conditions specified previously. Let φ1, φ2, ... be corresponding eigenfunctions, and define the dot product

f·g = ∫_a^b p(x) f(x) g(x) dx.

Let f be piecewise smooth on [a, b]. Then, for a < x < b,

(1/2)(f(x+) + f(x-)) = Σ_(n=1)^∞ c_n φ_n(x),

where

c_n = (f·φ_n) / (φ_n·φ_n).   (16.39)

We call the numbers (f·φ_n)/(φ_n·φ_n) the Fourier coefficients of f with respect to the eigenfunctions of this Sturm-Liouville problem. With this choice of coefficients, Σ_(n=1)^∞ c_n φ_n(x) is the eigenfunction expansion of f with respect to these eigenfunctions. If the differential equation generating the eigenvalues and eigenfunctions has a special name (such as Legendre's equation, or Bessel's equation), then the eigenfunction expansion usually carries the corresponding name, for example, Fourier-Legendre series and Fourier-Bessel series.
EXAMPLE 16.13

Consider the Sturm-Liouville problem

y'' + λy = 0;   y'(0) = y'(π/2) = 0.

We find in a routine way that the eigenvalues of this problem are λ = 4n^2 for n = 0, 1, 2, .... Corresponding to λ = 0, we can choose φ0(x) = 1 as an eigenfunction. Corresponding to λ = 4n^2, φ_n(x) = cos(2nx) is an eigenfunction. This gives us the set of eigenfunctions

φ0(x) = 1, φ1(x) = cos(2x), φ2(x) = cos(4x), ....

Because the coefficient of λ in the differential equation is p(x) = 1, and the interval is [0, π/2], the dot product for this problem is

f·g = ∫_0^(π/2) f(x) g(x) dx.

We will write the eigenfunction expansion of f(x) = x^2(1 - x) for 0 < x < π/2. Since f and f' are continuous, this expansion will converge to x^2(1 - x) for 0 < x < π/2. The coefficients in this expansion are

c0 = (f·φ0) / (φ0·φ0) = (2/π) ∫_0^(π/2) x^2(1 - x) dx

and, for n = 1, 2, ...,

c_n = (f·φ_n) / (φ_n·φ_n) = (4/π) ∫_0^(π/2) x^2(1 - x) cos(2nx) dx.

Figure 16.15(a) shows the fifth partial sum of this series, compared with f, and Figure 16.15(b) shows the fifteenth partial sum of this expansion. Clearly this eigenfunction expansion is converging quite rapidly to x^2(1 - x) on this interval.
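The coefficients in this example are easy to compute numerically. The sketch below (an added illustration; the sample point x = 0.7 and the 15-term cutoff are arbitrary choices) approximates each c_n by Simpson's rule, using φ0·φ0 = π/2 and φ_n·φ_n = π/4:

```python
import math

def f(x):
    return x * x * (1 - x)

def simpson(g, a, b, m=400):
    h = (b - a) / m
    s = g(a) + g(b) + sum((4 if k % 2 else 2) * g(a + k * h)
                          for k in range(1, m))
    return s * h / 3

L = math.pi / 2
N = 15
# c_0 = (f.1)/(1.1) with 1.1 = L;  c_n = (f.cos(2nx))/(L/2) for n >= 1
c = [simpson(f, 0.0, L) / L]
c += [simpson(lambda x, n=n: f(x) * math.cos(2 * n * x), 0.0, L) / (L / 2)
      for n in range(1, N + 1)]

def partial_sum(x):
    return c[0] + sum(c[n] * math.cos(2 * n * x) for n in range(1, N + 1))

print(round(c[0], 6))                       # -> -0.146479 = pi^2/12 - pi^3/32
print(round(f(0.7) - partial_sum(0.7), 6))  # small pointwise error at x = 0.7
```

The exact value of the first coefficient, c0 = π^2/12 - π^3/32, follows from integrating x^2 - x^3 over [0, π/2], and the residual at x = 0.7 shrinks as more cosine terms are kept.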
16.3.4 Approximation in the Mean and Bessel's Inequality

In this and the next two sections we will discuss some additional properties of Fourier coefficients, as well as some subtleties in the convergence of Fourier series. For this discussion, let φ1, φ2, ... be normalized eigenfunctions of a Sturm-Liouville problem on [a, b]. Normalized means that each eigenfunction φ_n has been multiplied by a positive constant so that φ_n·φ_n = 1. This can always be done because a nonzero constant multiple of an eigenfunction is again an eigenfunction. We now have

φ_n·φ_m = ∫_a^b p(x) φ_n(x) φ_m(x) dx = 1 if n = m, 0 if n ≠ m.

For these normalized eigenfunctions, the nth Fourier coefficient is

c_n = (f·φ_n) / (φ_n·φ_n) = f·φ_n.   (16.40)

We will now define one measure of how well a linear combination Σ_(n=1)^N k_n φ_n approximates a given function f.
DEFINITION 16.2  Best Mean Approximation

Let N be a positive integer and let f be a function that is integrable on [a, b]. A linear combination

Σ_(n=1)^N k_n φ_n(x)

of φ1, φ2, ..., φ_N is the best approximation in the mean to f on [a, b] if the coefficients k1, ..., k_N minimize the quantity

I_N(f) = ∫_a^b p(x) [f(x) - Σ_(n=1)^N k_n φ_n(x)]^2 dx.

I_N(f) is the dot product of f(x) - Σ_(n=1)^N k_n φ_n(x) with itself (with weight function p). For vectors in R^3, the dot product of a vector V = ai + bj + ck with itself is the square of its length:

V·V = a^2 + b^2 + c^2 = (length of V)^2.

This suggests that we define a length for functions by

g·g = ∫_a^b p(x) g(x)^2 dx = (length of g)^2.

Now I_N(f) has the geometric interpretation of being the square of the length of f(x) - Σ_(n=1)^N k_n φ_n(x). The smaller this length is, the better the linear combination Σ_(n=1)^N k_n φ_n(x) approximates f(x) on [a, b]. This approximation is an average over the entire interval, as opposed to looking at the approximation at a particular point, hence the term "approximation in the mean". We want to choose the k's to make Σ_(n=1)^N k_n φ_n the best possible mean approximation to f on [a, b], which means we want to make the length of f(x) - Σ_(n=1)^N k_n φ_n(x) as small as possible.
To determine how to choose the k_n's, write

0 ≤ I_N(f) = ∫_a^b p(x) [f(x) - Σ_(n=1)^N k_n φ_n(x)]^2 dx

= ∫_a^b p(x) f(x)^2 dx - 2 Σ_(n=1)^N k_n ∫_a^b p(x) f(x) φ_n(x) dx + Σ_(n=1)^N Σ_(m=1)^N k_n k_m ∫_a^b p(x) φ_n(x) φ_m(x) dx

= f·f - 2 Σ_(n=1)^N k_n (f·φ_n) + Σ_(n=1)^N Σ_(m=1)^N k_n k_m (φ_n·φ_m)

= f·f - 2 Σ_(n=1)^N k_n (f·φ_n) + Σ_(n=1)^N k_n^2,

since φ_n·φ_n = 1 and φ_n·φ_m = 0 for n ≠ m in this normalized set of eigenfunctions. Now let c_n = f·φ_n, the nth Fourier coefficient of f for this set of normalized eigenfunctions. Complete the square by writing the last inequality as

0 ≤ f·f - 2 Σ_(n=1)^N k_n c_n + Σ_(n=1)^N k_n^2 + Σ_(n=1)^N c_n^2 - Σ_(n=1)^N c_n^2

= f·f + Σ_(n=1)^N (c_n - k_n)^2 - Σ_(n=1)^N c_n^2.   (16.41)

In this formulation, it is obvious that the right side achieves its minimum when each k_n = c_n. We have proved the following.
THEOREM 16.24

Let f be integrable on [a, b], and N a positive integer. Then the linear combination Σ_(n=1)^N k_n φ_n that is the best approximation in the mean to f on [a, b] is obtained by putting

k_n = f·φ_n for n = 1, 2, ....

Thus, for any given N, the Nth partial sum Σ_(n=1)^N (f·φ_n)φ_n of the Fourier series Σ_(n=1)^∞ (f·φ_n)φ_n of f is the best approximation in the mean to f by a linear combination of φ1, φ2, ..., φ_N.

The argument leading to the theorem has another important consequence. Put k_n = c_n = f·φ_n in inequality (16.41) to obtain

0 ≤ f·f - Σ_(n=1)^N (f·φ_n)^2,

or

Σ_(n=1)^N (f·φ_n)^2 ≤ f·f.

Since N can be any positive integer, the series of squares of the Fourier coefficients of f converges, and the sum of this series cannot exceed the dot product of f with itself. This is Bessel's inequality, which was proved in Section 14.5 (Theorem 14.7) for Fourier trigonometric series.
THEOREM 16.25  Bessel's Inequality

Let f be integrable on [a, b]. Then the series of squares of the Fourier coefficients of f with respect to the normalized eigenfunctions φ1, φ2, ... converges. Further,

Σ_(n=1)^∞ (f·φ_n)^2 ≤ f·f.

Under some circumstances, the inequality can be replaced by an equality. This leads us to consider the concept of convergence in the mean.
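Bessel's inequality is easy to watch numerically. The sketch below is an added illustration using the regular problem y'' + λy = 0, y(0) = y(π) = 0, whose normalized eigenfunctions are φ_n(x) = √(2/π) sin(nx), and f(x) = x. Integration by parts gives f·φ_n = √(2/π)·π(-1)^(n+1)/n, while f·f = ∫_0^π x^2 dx = π^3/3; the partial sums of squares stay below f·f and, for this f, approach it (the case of equality is exactly the circumstance just mentioned).

```python
import math

# Normalized eigenfunctions of y'' + lambda*y = 0, y(0) = y(pi) = 0 on
# [0, pi] (weight p(x) = 1): phi_n(x) = sqrt(2/pi) * sin(nx).
# For f(x) = x, integration by parts gives
#   f.phi_n = sqrt(2/pi) * pi * (-1)^(n+1) / n,
# while f.f = integral_0^pi x^2 dx = pi^3 / 3.

ff = math.pi ** 3 / 3
partial, checkpoints = 0.0, {}
for n in range(1, 2001):
    c_n = math.sqrt(2 / math.pi) * math.pi * (-1) ** (n + 1) / n
    partial += c_n ** 2
    if n in (10, 100, 2000):
        checkpoints[n] = partial

for n, s in checkpoints.items():
    print(n, round(s, 4), "<=", round(ff, 4))  # Bessel's inequality at each N
```

Here (f·φ_n)^2 = 2π/n^2, so the series of squares is 2π Σ 1/n^2 = 2π·π^2/6 = π^3/3, and the inequality becomes an equality in the limit.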
16.3.5 Convergence in the Mean and Parseval's Theorem

Continuing from the preceding subsection, φ1, φ2, ... are assumed to be the normalized eigenfunctions of a Sturm-Liouville problem on [a, b]. If f is continuous on [a, b] with a piecewise continuous derivative, then for a < x < b,

f(x) = Σ_(n=1)^∞ (f·φ_n) φ_n(x).

This convergence is called pointwise convergence, because it deals with convergence of the Fourier series individually at each x in (a, b). Under some conditions, this series may also converge uniformly. In addition to these two kinds of convergence, convergence in the mean is often used in the context of eigenfunction expansions.

Let f be integrable on [a, b]. The Fourier series Σ_(n=1)^∞ (f·φ_n)φ_n of f, with respect to the normalized eigenfunctions φ1, φ2, ..., is said to converge to f in the mean on [a, b] if

lim_(N→∞) ∫_a^b p(x) [f(x) - Σ_(n=1)^N (f·φ_n) φ_n(x)]^2 dx = 0.

Convergence in the mean of a Fourier series of f, to f, occurs when the length of f(x) - Σ_(n=1)^N (f·φ_n)φ_n(x) approaches zero as N approaches infinity. This will certainly happen if the Fourier series converges to f, because then f(x) = Σ_(n=1)^∞ (f·φ_n)φ_n(x), and we know that this holds if f is continuous with a piecewise continuous derivative. For the remainder of this section, let C'[a, b] be the set of functions that are continuous on [a, b], with piecewise continuous derivatives on (a, b).
THEOREM 16.26

1. If f(x) = Σ_(n=1)^∞ (f·φ_n)φ_n(x) for a < x < b, then Σ_(n=1)^∞ (f·φ_n)φ_n also converges in the mean to f on [a, b].
2. If f is in C'[a, b], then Σ_(n=1)^∞ (f·φ_n)φ_n converges in the mean to f on [a, b].

The converse of (1) is false. It is possible for the length of f(x) - Σ_(n=1)^N (f·φ_n)φ_n(x) to have limit zero as N → ∞, but for the Fourier series not to converge to f(x) on the interval. This is because the integral in the definition of mean convergence is an averaging process and does not focus on the behavior of the Fourier series at any particular point.

We will show that convergence in the mean for functions in C'[a, b] is equivalent to being able to turn Bessel's inequality into an equality for all functions in this class.
THEOREM 16.27

Σ_{n=1}^∞ (f·φ_n)φ_n converges in the mean to f for every f in C'[a, b] if and only if

Σ_{n=1}^∞ (f·φ_n)² = f·f

for every f in C'[a, b].

Proof
From the calculation done in proving Theorem 16.24, with k_n = f·φ_n,

0 ≤ I_N = ∫_a^b p(x) ( f(x) − Σ_{n=1}^N (f·φ_n)φ_n(x) )² dx = f·f − Σ_{n=1}^N (f·φ_n)².

Therefore

lim_{N→∞} ∫_a^b p(x) ( f(x) − Σ_{n=1}^N (f·φ_n)φ_n(x) )² dx = 0

if and only if

f·f − Σ_{n=1}^∞ (f·φ_n)² = 0. ■
Replacing the inequality with an equality in Bessel's inequality yields the Parseval relationship. We can now state a condition under which this holds.

COROLLARY 16.1  Parseval's Theorem

If f is in C'[a, b], then

Σ_{n=1}^∞ (f·φ_n)² = f·f.

This follows immediately from the last two theorems. We know by Theorem 16.26(2) that, if f is in C'[a, b], then the Fourier series of f converges to f in the mean. Then, by Theorem 16.27, Σ_{n=1}^∞ (f·φ_n)² = f·f. With more effort, the Parseval equation can be proved under much weaker conditions on f.
16.3.6 Completeness of the Eigenfunctions
Completeness is a concept that is perhaps most easily understood in terms of vectors. In 3-space, the vector k cannot be written as a linear combination αi + βj, even though i and j are orthogonal. The reason for this is that there is another direction in 3-space that is orthogonal to the plane of i and j, and i and j carry no information about the component a vector may have in this third direction. The vectors i and j are incomplete in R³. By contrast, there is no nonzero vector that is orthogonal to each of i, j and k, so we say that these vectors are complete in R³. Any 3-vector can be written as a linear combination of i, j and k.

Now consider the normalized eigenfunctions φ_1, φ_2, .... Think of each φ_j as defining a different direction, or axis, in the space of functions under consideration, which we take to be C'[a, b]. We say that these eigenfunctions are complete in C'[a, b] if the only function in C'[a, b] that is orthogonal to every eigenfunction is the zero function. If, however, there is a nontrivial function f in C'[a, b] that is orthogonal to every eigenfunction, then we say that the eigenfunctions are incomplete. In this case there is another axis, or direction, in C'[a, b] that is not determined by all of the eigenfunctions. A function having a component in this other direction could not possibly be represented in a series of the incomplete eigenfunctions. We claim that the eigenfunctions are complete in the space of continuous functions with piecewise continuous derivatives on (a, b).
THEOREM 16.28

The normalized eigenfunctions φ_1, φ_2, ... are complete in C'[a, b]. ■

Proof  Suppose the eigenfunctions are not complete. Then there is some nontrivial function f in C'[a, b] that is orthogonal to each φ_n. But because f is orthogonal to each φ_n, each (f·φ_n) = 0, so

f(x) = Σ_{n=1}^∞ (f·φ_n)φ_n(x) = 0 for a < x < b.

This contradiction proves the theorem. ■
EXAMPLE 16.14

The normalized eigenfunctions of the Sturm-Liouville problem

y'' + λy = 0; y'(0) = y'(π/2) = 0

are

√(2/π), (2/√π) cos(2x), (2/√π) cos(4x), (2/√π) cos(6x), ....

The constants were chosen to normalize the eigenfunctions, since

φ_n·φ_n = ∫_0^{π/2} φ_n²(x) dx = ∫_0^{π/2} (4/π) cos²(2nx) dx = 1.
This set E of eigenfunctions is complete in C'[0, π/2]. This means that, except for f(x) = 0, there is no f in C'[0, π/2] that is orthogonal to each eigenfunction. Observe the effect if one eigenfunction is removed. For example, the set E_1 of eigenfunctions

√(2/π), (2/√π) cos(4x), (2/√π) cos(6x), ...,

is formed by removing f(x) = (2/√π) cos(2x) from E. Now cos(2x) has no expansion in terms of E_1, even though cos(2x) is continuous with a continuous derivative on (0, π/2). Indeed, if

cos(2x) = c_0 √(2/π) + Σ_{n=2}^∞ c_n (2/√π) cos(2nx),

then

c_0 = cos(2x)·√(2/π) = 0
and, for n = 2, 3, ...,

c_n = cos(2x)·(2/√π) cos(2nx) = 0,

implying that cos(2x) = 0 for 0 < x < π/2. This is an absurdity. The deleted set of eigenfunctions E_1, with one function removed from E, is not complete in C'[0, π/2].
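The orthogonality computations in Example 16.14 can be checked numerically. This sketch confirms that cos(2x) has zero dot product with every member of the deleted set E_1:

```python
import numpy as np

# Numerical check of Example 16.14: on [0, pi/2], cos(2x) is orthogonal
# to the constant eigenfunction sqrt(2/pi) and to (2/sqrt(pi)) cos(2nx)
# for n = 2, 3, ... -- the members of the deleted set E_1.
x = np.linspace(0, np.pi / 2, 20001)
dx = x[1] - x[0]

def integrate(g):
    # trapezoidal rule on the fixed grid
    return float(np.sum((g[:-1] + g[1:]) / 2) * dx)

target = np.cos(2 * x)
c0 = integrate(target * np.sqrt(2 / np.pi) * np.ones_like(x))
cs = [integrate(target * (2 / np.sqrt(np.pi)) * np.cos(2 * n * x))
      for n in range(2, 8)]
print(c0, cs)  # all coefficients are (numerically) zero
```

Every coefficient vanishes, so the expansion of cos(2x) in E_1 would have to be identically zero, which is the absurdity noted above.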
In each of Problems 1 through 12, classify the Sturm-Liouville problem as regular, periodic or singular; state the relevant interval; find the eigenvalues; and, corresponding to each eigenvalue, find an eigenfunction. In some cases eigenvalues may be implicitly defined by an equation.

1. y'' + λy = 0; y(0) = 0, y'(L) = 0
2. y'' + λy = 0; y'(0) = 0, y'(L) = 0
3. y'' + λy = 0; y'(0) = y(4) = 0
4. y'' + λy = 0; y(0) = y(π), y'(0) = y'(π)
5. y'' + λy = 0; y(−3π) = y(3π), y'(−3π) = y'(3π)
6. y'' + λy = 0; y(0) = 0, y(π) + 2y'(π) = 0
7. y'' + λy = 0; y(0) − 2y'(0) = 0, y'(1) = 0
8. y'' + 2y' + (1 + λ)y = 0; y(0) = y(1) = 0
9. (e^{2x} y')' + λ e^{2x} y = 0; y(0) = y(π) = 0
10. (e^{−6x} y')' + (1 + λ) e^{−6x} y = 0; y(0) = y(8) = 0
11. (x³ y')' + λ x y = 0; y(1) = y(e³) = 0
12. (x⁻¹ y')' + (4 + λ) x⁻³ y = 0; y(1) = y(e⁴) = 0

In each of Problems 13 through 18, find the eigenfunction expansion of the given function in the eigenfunctions of the Sturm-Liouville problem. In each case, determine what the eigenfunction expansion converges to on the interval, and graph the function and the sum of the first N terms
of the eigenfunction expansion on the same set of axes for the given interval. (In Problem 13, do the graph for L = 1.)

13. f(x) = 1 − x for 0 ≤ x ≤ L; y'' + λy = 0; y(0) = y(L) = 0; N = 40
14. f(x) = x for 0 ≤ x ≤ π; y'' + λy = 0; y(0) = y'(π) = 0; N = 30
15. f(x) = 1 for 0 ≤ x < 2, f(x) = −1 for 2 < x ≤ 4; y'' + λy = 0; y'(0) = y(4) = 0; N = 40
16. f(x) = sin(2x) for 0 ≤ x ≤ π; y'' + λy = 0; y'(0) = y'(π) = 0; N = 30
17. f(x) = x² for −3π ≤ x ≤ 3π; y'' + λy = 0; y(−3π) = y(3π), y'(−3π) = y'(3π); N = 10
18. f(x) = 0 for 0 ≤ x < 1/2, f(x) = 1 for 1/2 ≤ x ≤ 1; y'' + 2y' + (1 + λ)y = 0; y(0) = y(1) = 0; N = 30
19. Write Bessel's inequality for the function f(x) = x(4 − x) for the eigenfunctions of the Sturm-Liouville problem of Problem 3.
20. Write Bessel's inequality for the function f(x) = e^{−x} for the eigenfunctions of the Sturm-Liouville problem of Problem 6.
16.4 Wavelets

16.4.1 The Idea Behind Wavelets

Recent years have seen an explosion in both the mathematical development of wavelets and their applications, which include signal analysis, data compression, filtering, and electromagnetics. Our purpose here is to introduce enough of the ideas behind wavelets to enable the student to pursue more thorough treatments.

Think of a function defined on the real line as a signal. If the signal contains one fundamental frequency ω_0, then f is a periodic function with period 2π/ω_0 and the Fourier series of f(t) is one tool for analyzing the signal's frequency content. The amplitude spectrum of f consists of
a plot of points (nω_0, c_n/2), in which c_n = √(a_n² + b_n²), with a_n and b_n the Fourier coefficients of f. Under certain conditions on f, this enables us to represent the signal as a trigonometric series displaying the natural frequencies:

f(t) = (1/2)a_0 + Σ_{n=1}^∞ [a_n cos(nω_0 t) + b_n sin(nω_0 t)].

Often we model the signal by taking a partial sum of the Fourier series:
f(t) ≈ (1/2)a_0 + Σ_{n=1}^N [a_n cos(nω_0 t) + b_n sin(nω_0 t)].

Although this process has proved useful in many instances, the Fourier trigonometric representation is not always the best device for analyzing signals. First, we may be interested in a signal that is not periodic. More generally, we may have a signal that is defined over the entire real line with no periodicity, and we require only that its energy be finite. This means that ∫_{−∞}^∞ (f(t))² dt is finite, or, if f(t) is complex valued, that ∫_{−∞}^∞ |f(t)|² dt is finite. This integral is the energy content of the signal, and functions having finite energy are said to be square integrable. In general, Fourier expansions are not the best tool for the analysis of such functions.

There are other disadvantages to Fourier trigonometric series. For a given f, we may have to choose N very large to model f(t) by a partial sum of a Fourier series. Finally, if we are interested in focusing on the behavior of f(t) in some finite time interval, or near some particular time, we cannot isolate those terms in the Fourier expansion that describe this behavior, but instead have to take the entire Fourier series, or its entire partial sum if we are modeling the signal. To illustrate, consider the signal shown in Figure 16.16. Explicitly,
f(t) is piecewise constant: it takes constant values on a finite collection of subintervals of [0, 2] (the values shown in Figure 16.16), and f(t) = 0 for t < 0 and for t > 2.
FIGURE 16.16 The signal f(t).
The Fourier series of f on [−2, 2] is

(1/2)a_0 + Σ_{n=1}^∞ [a_n cos(nπt/2) + b_n sin(nπt/2)],

with the coefficients a_n and b_n computed by integrating the piecewise constant values of f against cos(nπt/2) and sin(nπt/2).
This series converges very slowly to the function. Indeed, Figure 16.17(a) shows the 80th partial sum of this series, and Figure 16.17(b) the 100th partial sum. Even with this number of terms, this partial sum does not model the signal very well. In addition, if we were interested in focusing on just part of the signal, there is no way of distinguishing certain terms of the Fourier series as carrying the most information about this part of the signal. Put another way, Fourier series do not localize information.

These considerations suggest that we seek other sets of complete orthogonal functions in which square integrable functions might be expanded, and which overcome some of the difficulties just cited for Fourier trigonometric series. This is a primary motivation for wavelets. We will begin our discussion of wavelets by developing one important wavelet in detail, then use this construction to suggest some of the ideas behind wavelets in general.

16.4.2 The Haar Wavelets

We will construct an example that is important both historically and for present-day applications. The Haar wavelets were the first to be found (about 1910), and serve as a model of one approach to the development of other wavelets.
FIGURE 16.17(a) Eightieth partial sum of the Fourier series of the signal.
FIGURE 16.17(b) One-hundredth partial sum of the Fourier series of the signal.
Let L²(R) denote the set of all real valued functions that are defined on the entire real line and are square integrable. L²(R) has the structure of a vector space, since linear combinations α_1 f_1 + α_2 f_2 + ⋯ + α_n f_n of square integrable functions are square integrable. The dot product we will use for functions in L²(R) is

f·g = ∫_{−∞}^∞ f(t)g(t) dt.
Now consider the characteristic function of an interval I (or of any set of numbers on the real line). This function is denoted χ_I, and has the value 1 for t in I, and zero for t not in I. That is,

χ_I(t) = 1 if t is in I, and χ_I(t) = 0 if t is not in I.

In particular, we will use the characteristic function of the half-open unit interval:

χ_{[0,1)}(t) = 1 for 0 ≤ t < 1, and χ_{[0,1)}(t) = 0 if t < 0 or if t ≥ 1.

A graph of χ_{[0,1)} is shown in Figure 16.18. We want to introduce new functions by both scaling and translation, with the objective of producing a complete orthonormal set of functions in L²(R). Recall that the graph of f(t − k)
FIGURE 16.18 χ_{[0,1)}.
is the graph of f(t) translated k units to the right if k is positive, and |k| units to the left if k is negative. For example, Figure 16.19(a) shows a graph of

f(t) = t sin(t) for 0 ≤ t ≤ 15, and f(t) = 0 for t < 0 and for t > 15.

Figure 16.19(b) is a graph of f(t + 5) (the graph of f(t) shifted five units to the left), and Figure 16.19(c) is a graph of f(t − 5) (the graph of f(t) shifted five units to the right).

In addition, f(kt) is a scaling of the graph of f: f(kt) compresses (if k > 1) or stretches (if 0 < k < 1) the graph of f(t) for a ≤ t ≤ b onto the interval [a/k, b/k]. For example, Figure 16.20(a) shows a graph of

f(t) = t sin(πt) for −2 ≤ t ≤ 3, and f(t) = 0 for t < −2 and for t > 3.

Figure 16.20(b) shows a graph of f(3t), compressing the graph of Figure 16.20(a) to the right and left, and Figure 16.20(c) shows a graph of f(t/3), stretching out the graph of Figure 16.20(a).

FIGURE 16.19(a) f(t). FIGURE 16.19(b) f(t + 5). FIGURE 16.19(c) f(t − 5).
FIGURE 16.20(a) f(t). FIGURE 16.20(b) f(3t). FIGURE 16.20(c) f(t/3).
Let φ(t) = χ_{[0,1)}(t), and define

ψ(t) = φ(2t) − φ(2t − 1) =
  1 for 0 ≤ t < 1/2,
  −1 for 1/2 ≤ t < 1,
  0 for t < 0 and for t ≥ 1.

A graph of ψ is shown in Figure 16.21.

Next, consider translations ψ(t − n), in which n is any integer. This is the function

ψ(t − n) =
  1 for n ≤ t < n + 1/2,
  −1 for n + 1/2 ≤ t < n + 1,
  0 for t < n and for t ≥ n + 1.

A graph of ψ(t − n) is shown in Figure 16.22. Now combine a translation with a scaling. Consider the function

ψ(2t − m) = φ(2(2t − m)) − φ(2(2t − m) − 1) = φ(4t − 2m) − φ(4t − 2m − 1) =
  1 for m/2 ≤ t < m/2 + 1/4,
  −1 for m/2 + 1/4 ≤ t < (m + 1)/2,
  0 for t < m/2 and for t ≥ (m + 1)/2,

in which m is any integer. A graph of this function is shown in Figure 16.23.

Before proceeding, we will observe that these translated and scaled functions are orthogonal in L²(R).
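These definitions transcribe directly into code. The small sketch below (the sample points are chosen only for illustration) evaluates φ, the mother wavelet ψ, and a scaled translate ψ(2t − 3):

```python
import numpy as np

# Direct transcription of the constructions above: phi is the characteristic
# function of [0, 1), psi(t) = phi(2t) - phi(2t - 1) is the mother wavelet,
# and psi(2t - m) is a scaled translate.
def phi(t):
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 1), 1.0, 0.0)

def psi(t):
    t = np.asarray(t, dtype=float)
    return phi(2 * t) - phi(2 * t - 1)

print(psi([0.25, 0.75, -0.5, 1.5]))       # 1 on [0,1/2), -1 on [1/2,1), else 0
print(psi(2 * np.array([1.6, 1.9]) - 3))  # psi(2t-3): 1 on [1.5,1.75), -1 on [1.75,2)
```

The evaluations match the piecewise formulas: ψ(2t − 3) takes the value 1 for 3/2 ≤ t < 7/4 and −1 for 7/4 ≤ t < 2.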
LEMMA 16.2

1. For distinct integers n and m,

ψ(t − n)·ψ(t − m) = 0 and ψ(2t − n)·ψ(2t − m) = 0.
2. For any integers n and m, ψ(t − n)·ψ(2t − m) = 0.

Proof  If n ≠ m, then the intervals [n, n + 1), on which ψ(t − n) takes on its nonzero values, and [m, m + 1), on which ψ(t − m) assumes its nonzero values, are disjoint. Then ψ(t − n)ψ(t − m) = 0 for all t and

ψ(t − n)·ψ(t − m) = ∫_{−∞}^∞ ψ(t − n)ψ(t − m) dt = 0.

Similarly, for n ≠ m, the intervals [n/2, (n + 1)/2) and [m/2, (m + 1)/2), on which ψ(2t − n) and ψ(2t − m), respectively, take on their nonzero values, are disjoint, so ψ(2t − n)·ψ(2t − m) = 0.

For (2), let n and m be any integers. If the intervals on which ψ(t − n) and ψ(2t − m) have nonzero values are disjoint, then these functions are orthogonal. There are two cases in which these intervals are not disjoint.

Case 1: n = m/2. In this case

ψ(t − n)ψ(2t − m) =
  1 for n ≤ t < n + 1/4,
  −1 for n + 1/4 ≤ t < n + 1/2,
  0 for t < n and for t ≥ n + 1/2.

Then

ψ(t − n)·ψ(2t − m) = ∫_n^{n+1/4} dt − ∫_{n+1/4}^{n+1/2} dt = 0.

Case 2: n + 1/2 = m/2. Now

ψ(t − n)ψ(2t − m) =
  −1 for n + 1/2 ≤ t < n + 3/4,
  1 for n + 3/4 ≤ t < n + 1,
  0 for t < n + 1/2 and for t ≥ n + 1,

so

ψ(t − n)·ψ(2t − m) = −∫_{n+1/2}^{n+3/4} dt + ∫_{n+3/4}^{n+1} dt = 0. ■
However, while the functions ψ(t − n) and ψ(2t − m) are orthogonal in L²(R), they do not form a complete set as n and m vary over the integers. We leave it for the student to produce nontrivial (that is, nonzero at least on some interval) square integrable functions that are orthogonal to all of these translated and scaled functions.

The idea now is to extend this set of functions by using scaling factors 2^m for integer m, to obtain functions that take on nonzero constant values on intervals that can be made shorter (positive m) or longer (negative m). Let

σ_{m,n}(t) = ψ(2^m t − n)
FIGURE 16.24 σ_{m,n}(t) for m = 0, 1, 2, 3.
for each integer m and each integer n. Then

σ_{m,n}(t) = φ(2^{m+1} t − 2n) − φ(2^{m+1} t − 2n − 1) =
  1 for n/2^m ≤ t < n/2^m + 1/2^{m+1},
  −1 for n/2^m + 1/2^{m+1} ≤ t < n/2^m + 1/2^m,
  0 for t < n/2^m and for t ≥ n/2^m + 1/2^m.

Figure 16.24 shows graphs of σ_{0,n}(t), σ_{1,n}(t), σ_{2,n}(t), and σ_{3,n}(t) on the same set of axes, for comparison. Note that n determines how far out the t axis the graph occurs, while m controls the size of the interval over which the function is nonzero (shorter for m increasing and positive, longer for m negative and increasing in magnitude). In the drawing n is a positive integer, but n can also be chosen negative, in which case the graphs are to the left of the vertical axis. We claim that these functions form an orthogonal set in L²(R).

THEOREM 16.29

If n, m, n′ and m′ are integers, and (m, n) ≠ (m′, n′), then σ_{m,n}·σ_{m′,n′} = 0. ■

A proof of this is left to the student.

One last detail before we get to the main point. The σ_{m,n}'s are orthogonal, but they are not orthonormal. This is easily fixed. Divide each of these functions by its length, as defined by the dot product in L²(R). Compute

(length of σ_{m,n})² = ∫_{−∞}^∞ σ_{m,n}²(t) dt = ∫_{n/2^m}^{n/2^m + 1/2^m} dt = 1/2^m.
This suggests that we define the functions

ψ_{m,n}(t) = 2^{m/2} σ_{m,n}(t) = 2^{m/2} [φ(2^{m+1} t − 2n) − φ(2^{m+1} t − 2n − 1)] =
  2^{m/2} for n/2^m ≤ t < n/2^m + 1/2^{m+1},
  −2^{m/2} for n/2^m + 1/2^{m+1} ≤ t < n/2^m + 1/2^m,
  0 for t < n/2^m and for t ≥ n/2^m + 1/2^m.

The functions ψ_{m,n} form an orthonormal set in L²(R). These functions are the Haar wavelets. In this construction, φ is called the scaling function, and ψ(t) = φ(2t) − φ(2t − 1) is the mother wavelet. Graphs of these wavelets are similar to the graphs of Figure 16.24, but the segment at
height 1 in Figure 16.24 is now at height 2^{m/2}, and the segment at height −1 in Figure 16.24 is now at height −2^{m/2}.

The Haar wavelets are complete in L²(R). The idea behind this can be envisioned as follows. If f is square integrable, then f(t) can be approximated as accurately as we like by a function g having compact support (g(t) = 0 outside some closed interval), and having constant values on half-open intervals of the form [n/2^m, (n + 1)/2^m), with n and m integers. Such intervals are of length 1/2^m, which can be made longer or shorter by choice of the integer m. In turn, g can be approximated as closely as we like by a sum of constants times Haar wavelets, which are defined on such intervals, with the error in the approximation tending to zero as the number of terms in the sum is taken larger.
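Orthonormality of the ψ_{m,n} can be spot-checked numerically; the grid computation below is an illustration, not a proof.

```python
import numpy as np

# Numerical check that the Haar wavelets psi_{m,n}(t) = 2^(m/2) psi(2^m t - n)
# have unit length and are mutually orthogonal (spot-check only).
def psi(t):
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

def haar(m, n, t):
    return 2.0 ** (m / 2) * psi(2.0 ** m * t - n)

t = np.linspace(-4, 4, 800_001)
dt = t[1] - t[0]

def dot(f, g):
    h = f * g
    return float(np.sum((h[:-1] + h[1:]) / 2) * dt)  # trapezoidal rule

print(dot(haar(0, 0, t), haar(0, 0, t)))  # ~ 1 (unit length)
print(dot(haar(2, 1, t), haar(2, 1, t)))  # ~ 1
print(dot(haar(2, 1, t), haar(1, 1, t)))  # ~ 0 (orthogonal)
```

The factor 2^{m/2} is exactly what compensates for the support of length 1/2^m, so every ψ_{m,n} has length 1.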
16.4.3 A Wavelet Expansion

Suppose f is a square integrable function. We can attempt an expansion of f in a series of the Haar wavelets, which form a complete orthonormal set in L²(R). Such an expansion has the appearance

f(t) = Σ_{m=−∞}^∞ Σ_{n=−∞}^∞ c_{m,n} ψ_{m,n}(t).

The equality in this expression is taken to mean that the series on the right converges in the mean to f(t). This means that

lim_{M→∞} ∫_{−∞}^∞ ( f(t) − Σ_{m=−M}^M Σ_{n=−M}^M c_{m,n} ψ_{m,n}(t) )² dt = 0.

The coefficients c_{m,n} can be found in the usual way by using the orthonormality of the Haar wavelets:

c_{m,n} = f·ψ_{m,n} = ∫_{−∞}^∞ f(t) ψ_{m,n}(t) dt.
We will complete the example begun in Section 16.4.1, in which f is the signal whose graph is shown in Figure 16.16. As we saw in Figures 16.17(a) and (b), we would have to use a very large number of terms to model this signal with the partial sum of its Fourier expansion on [−2, 2]. However, if we calculate the coefficients in the Haar expansion, we find that

f(t) = ψ_{0,0}(t) + ⋯ − 0.6 ψ_{2,1}(t) − 0.4 ψ_{1,2}(t) + ψ_{2,5}(t),

a finite combination of Haar wavelets. For some purposes we want Fourier trigonometric expansions, but for this signal the Haar wavelets provide a very efficient expansion.
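The same computation can be sketched numerically. The signal below is a simple hypothetical step function (not the signal of Figure 16.16, which would be treated the same way); its truncated Haar expansion already reproduces it closely in the mean.

```python
import numpy as np

# Sketch of computing Haar coefficients c_{m,n} = f . psi_{m,n} and summing
# the truncated expansion. The step signal here is hypothetical.
def psi(t):
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

def haar(m, n, t):
    return 2.0 ** (m / 2) * psi(2.0 ** m * t - n)

t = np.linspace(0, 1, 200_001)
dt = t[1] - t[0]

def integrate(g):
    return float(np.sum((g[:-1] + g[1:]) / 2) * dt)  # trapezoidal rule

f = np.where(t < 0.5, 2.0, 1.0)  # hypothetical step signal on [0, 1)

recon = np.zeros_like(t)
for m in range(-8, 4):
    for n in range(max(1, 2 ** m)):  # translates whose support meets [0, 1)
        w = haar(m, n, t)
        recon += integrate(f * w) * w

err = integrate((f - recon) ** 2)
print(err)  # small: the truncated series converges to f in the mean
```

Because the signal is constant on dyadic subintervals, only a few coefficients are nonzero at positive m, and the residual error comes entirely from truncating the coarse-scale (negative m) terms.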
16.4.4 Multiresolution Analysis with Haar Wavelets

The term multiresolution analysis refers to a sequence of closed subspaces of L²(R) that are related to the scaling used in defining a set of wavelets. We will discuss what this means in the context of the Haar wavelets. Because L²(R) has the structure of a vector space, the following three conditions hold.

1. Linear combinations Σ_{j=1}^N c_j f_j of functions in L²(R) are also in L²(R).
2. The zero function, θ(t) = 0 for all t, is in L²(R), and serves as the zero vector of L²(R). For any function f in L²(R), f + θ = f.
3. If f is in L²(R), then −f, defined by (−f)(t) = −f(t), is also in L²(R).
A set S of square integrable functions is said to be a subspace of L²(R) if S has at least one function in it and, whenever f and g are in S and α and β are real numbers, αf + βg is in S. For example, the set of all constant multiples of χ_{[0,1]} forms a subspace of L²(R).

A subspace S is closed if convergent sequences of functions in S have their limit functions in S. For example, the subspace of all continuous square integrable functions is not closed, because a limit (in the sense of mean convergence) of continuous functions need not be continuous.

If a subspace S is not closed, we can form the "smallest" subspace of L²(R) containing all the functions in S, together with all the limits of convergent sequences of functions in S. This subspace, which may be all of L²(R), is called the closure of S, and is denoted S̄. S̄ is closed, because by its formation it has all the limits of convergent sequences of functions that are in this space.

We will now show how the Haar wavelets generate a sequence of closed subspaces of L²(R), which can be indexed by the integers so that each is contained in the next one in the list. The spaces are generated by different scalings of the scaling function φ, and may be thought of as associated with different degrees of resolution of the signal.

To begin defining these spaces, let S_0 consist of all linear combinations of the translated scaling function. These translated scaling functions have the form φ(t − n) for integer n, and a typical function in S_0 has the form

Σ_{j=1}^N c_j φ(t − n_j),

where N is a positive integer, the c_j's are real numbers and each n_j is an integer. Now let V_0 be the closure of S_0:

V_0 = S̄_0.
Next, let S_m be the space of all linear combinations of the functions φ(2^m t − n), where n varies over the integers and m is a fixed integer in defining S_m. Let

V_m = S̄_m.

From the scaling property of the scaling function,

φ(t) = φ(2t) + φ(2t − 1),

we find that f(t) is in V_m exactly when f(2t) is in V_{m+1}, and each V_m is contained within V_{m+1} (written V_m ⊂ V_{m+1}). Thus the closed subspaces V_m, with integer m, form an ascending chain:

⋯ ⊂ V_{−2} ⊂ V_{−1} ⊂ V_0 ⊂ V_1 ⊂ V_2 ⊂ ⋯.

This chain has two additional properties of importance. First, there is no nontrivial function contained in every V_m. We say that the intersection of all the closed subspaces V_m consists of just the zero function. And, finally, the ascending chain ends in L²(R). This means that every function in L²(R) has a series expansion in terms of the Haar functions, a fact to which we have already alluded. The spaces V_m are said to form a multiresolution analysis of L²(R). This multiresolution analysis is generated by the scaling function φ.
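The scaling (refinement) relation φ(t) = φ(2t) + φ(2t − 1) can be verified pointwise in a few lines; this is an illustrative check on a grid.

```python
import numpy as np

# Pointwise check of the refinement relation for the Haar scaling
# function: phi(t) = phi(2t) + phi(2t - 1).
def phi(t):
    t = np.asarray(t, dtype=float)
    return np.where((t >= 0) & (t < 1), 1.0, 0.0)

t = np.linspace(-1, 2, 3001)
lhs = phi(t)
rhs = phi(2 * t) + phi(2 * t - 1)
print(np.max(np.abs(lhs - rhs)))  # 0.0: the identity holds at every grid point
```

This is the identity that places each V_m inside V_{m+1}: any translate φ(2^m t − n) is a sum of two translates at the next finer scale.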
16.4.5 General Construction of Wavelets and Multiresolution Analysis

The Haar wavelets have been known for nearly a century, together with the chain of subspaces that form a multiresolution analysis of L²(R). However, it remained unknown for some time whether this construction could be duplicated, resulting in multiresolution analyses starting from different scaling functions. To this end, we will use the hindsight of the Haar construction to make a definition of a scaling function and the associated multiresolution analysis.
DEFINITION 16.4  Scaling Function and Associated Multiresolution Analysis

Let φ be in L²(R). Then φ is a scaling function with multiresolution analysis {V_m} if

⋯ ⊂ V_{−2} ⊂ V_{−1} ⊂ V_0 ⊂ V_1 ⊂ V_2 ⊂ ⋯

is an ascending chain of closed subspaces of L²(R) satisfying the conditions:

1. The translated functions φ(t − n), for integer n, are orthonormal, and every function in V_0 is a linear combination of functions of this form.
2. There is no nontrivial function that belongs to every V_m. (That is, the V_m's have trivial intersection.)
3. f(t) is in V_m exactly when f(2t) is in V_{m+1}.
4. Every function in L²(R) can be expanded in a series of functions from the V_m's.
V_0 is a subspace of V_1, which contains functions orthogonal to every function in V_0. The subspace of V_1 containing all of these functions is called the orthogonal complement of V_0 in V_1. To draw an analogy from vectors in R³, the constant multiples of k form a subspace of R³ that is the orthogonal complement of the plane defined by i and j. Every vector in this orthogonal complement is orthogonal to each linear combination ai + bj.

Now use the scaling function to produce a mother wavelet ψ, having the property that every function in this orthogonal complement of V_0 in V_1 is a linear combination of translates ψ(t − n). If there is such a mother wavelet, then we can form the family of wavelets

ψ_{m,n}(t) = 2^{m/2} ψ(2^m t − n)

for integers m and n.
16.4.6 Shannon Wavelets

The Haar wavelets form a prototype for wavelets and multiresolution analysis, partly because they were the first, and partly because they are relatively easy to work with and visualize. The reason it took many years before other examples of scaling function/wavelet/multiresolution analysis were found is that this involves some fairly heavy analysis. However, there are other relatively simple examples. One consists of the Shannon wavelets.

For these, begin with the Fourier transform of a potential scaling function. Let

φ̂(ω) = χ_{[−π,π]}(ω).

Taking the inverse Fourier transform, we obtain

φ(t) = sin(πt)/(πt).

This function occurs in the Shannon reconstruction theorem, which was proved in Section 15.4.7 for functions of bandwidth ≤ L. In the case that L = π, the theorem states that a signal f whose
16.4
Wavelets
:7 7 7
Fourier transform f̂(ω) vanishes outside the interval [−π, π] (that is, f has bandwidth ≤ π) can be reconstructed by sampling its values at the integers. Specifically,

f(t) = Σ_{n=−∞}^∞ f(n) sin(π(t − n))/(π(t − n)) = Σ_{n=−∞}^∞ f(n) φ(t − n).
The space V_0 in this context consists of functions in L²(R) of bandwidth not exceeding π. By scaling (let g(t) = f(2t)) we can consider the space V_1 of functions of bandwidth not exceeding 2π, and so on, forming a multiresolution analysis. Thus φ is a scaling function. We now need a mother wavelet ψ that is orthogonal to each φ(t − n), for integer n. By an argument we will not carry out (but whose conclusions can be verified in straightforward fashion), we obtain a suitable ψ from φ in this case by setting

ψ(t) = φ(t − 1/2) − 2φ(2t − 1) = [sin(2πt) − cos(πt)] / (π(t − 1/2)).
(kw) = - e-,w/2XA(w) , where A consists of all w in [-27r, -7r), together with all w in (7r, 27r] . That is, on each o f these intervals, kw) = - e-` w12 , and for w outside of these intervals, tp(w) = O . Figure 16 .25 shows a graph of the mother wavelet tP, and Figure 16 .26 a graph of its amplitude spectrum . This gives the frequency content of cli . The Shannon wavelets are the functions Y'nt ( t)
ψ_{m,n}(t) = 2^{m/2} ψ(2^m t − n) = [2^{m/2} / (π(2^m t − n − 1/2))] (sin(2π(2^m t − n)) − cos(π(2^m t − n))).

We leave it for the student to explore properties of these wavelets. Graphs of ψ_{1,0}(t) and ψ_{2,1}(t) are given in Figures 16.27(a) and (b).

There are many other families of wavelets, including Meyer wavelets, Daubechies wavelets and Strömberg wavelets. These require a good deal more preliminary work for their definitions. Different wavelets are constructed for specific purposes, and they have applications in such areas as signal analysis, data compression and the solution of integral equations. For an application
FIGURE 16.25 Shannon mother wavelet ψ(t) = [sin(2πt) − cos(πt)]/(π(t − 1/2)).
FIGURE 16.26 Amplitude spectrum of the Shannon mother wavelet.
FIGURE 16.27(a) Shannon wavelet ψ_{1,0}(t). FIGURE 16.27(b) Shannon wavelet ψ_{2,1}(t).
to the problem of using color patterns in the iris of the eye as a means of identification, see the article "Iris Recognition," by John Daugman, appearing in American Scientist, July-August 2001, pages 326-333.
1. Show that σ_{m,n}(t)·σ_{m′,n′}(t) = 0 if (m, n) ≠ (m′, n′).
2. On the same set of axes, graph σ_{1,1}(t) and σ_{1,2}(t). Explain from the graph why these two functions are orthogonal.
3. On the same set of axes, graph σ_{1,3}(t) and σ_{−2,1}(t). Explain from the graph why these two functions are orthogonal.
4. On the same set of axes, graph σ_{2,1}(t) and σ_{1,1}(t). Explain from the graph why these two functions are orthogonal.
5. Graph ψ(2t − 3).
6. Graph ψ(2t + 6).
7. Let f(t) = 4σ_{−3,−2}(t) + 6σ_{−1,1}(t). Write the Fourier series of f(t) on [−5, 5]. Graph the fiftieth partial sum of this series on the same set of axes with a graph of f(t).
8. Let f(t) = −3σ_{2,−2}(t) + 4σ_{2,0}(t) + 7σ_{1,−1}(t). Write the Fourier series of f(t) on [−4, 4]. Graph the fiftieth partial sum of this series on the same set of axes with a graph of f(t).
9. Let f(t) = 3σ_{−4,−1}(t) + 8σ_{−2,1}(t). Write the Fourier series of f(t) on [−6, 6]. Graph the fiftieth partial sum of this series on the same set of axes with a graph of f(t).
10. Let f(t) = σ_{−2,−2}(t) + 4σ_{1,3}(t) + 2σ_{1,−2}(t). Write the Fourier series of f(t) on [−7, 7]. Graph the fiftieth partial sum of this series on the same set of axes with a graph of f(t).
PART

Partial Differential Equations

CHAPTER 17 The Wave Equation
CHAPTER 18 The Heat Equation
CHAPTER 19 The Potential Equation

A differential equation in which partial derivatives occur is called a partial differential equation. Mathematical models of physical phenomena involving more than one independent variable often include partial differential equations. They also arise in such diverse areas as epidemiology (for example, multivariable predator/prey models of AIDS), traffic flow studies and the analysis of economies.

We will be primarily concerned in this part with three broadly defined kinds of phenomena: wave motion, radiation or conduction of energy, and potential theory. Models of these phenomena involve partial differential equations called, respectively, the wave equation, the heat equation, and the potential equation, or Laplace's equation. We will consider each of these in turn, deriving solutions under a variety of boundary and initial conditions describing different settings.

The solution of partial differential equations requires a broad array of mathematical tools, including Fourier series, integrals and transforms, special functions and eigenfunction expansions. These were covered in Part 5, and can be referred to as needed.
CHAPTER 17

The Wave Equation

THE WAVE EQUATION AND INITIAL AND BOUNDARY CONDITIONS · FOURIER SERIES SOLUTIONS OF THE WAVE EQUATION · WAVE MOTION ALONG INFINITE AND SEMI-INFINITE STRINGS · CHARACTERISTICS
17.1 The Wave Equation and Initial and Boundary Conditions

Vibrations in a membrane or drum head, or oscillations induced in a guitar or violin string, are governed by a partial differential equation called the wave equation. We will derive this equation in a simple setting.

Consider an elastic string stretched between two pegs, as on a guitar. We want to describe the motion of the string if it is given a small displacement and released to vibrate in a plane. Place the string along the x axis from 0 to L and assume that it vibrates in the x, y plane. We want a function y(x, t) such that, at any time t > 0, the graph of the function y = y(x, t) of x is the shape of the string at that time. Thus y(x, t) allows us to take a snapshot of the string at any time, showing it as a curve in the plane. For this reason y(x, t) is called the position function for the string. Figure 17.1 shows a typical configuration.

To begin with a simple case, neglect damping forces such as air resistance and the weight of the string, and assume that the tension T(x, t) in the string always acts tangentially to the string and that individual particles of the string move only vertically. Also assume that the mass ρ per unit length is constant.

Now consider a typical segment of string between x and x + Δx and apply Newton's second law of motion to write

net force on this segment due to the tension = acceleration of the center of mass of the segment times its mass.

This is a vector equation. For Δx small, the vertical component of this equation (Figure 17.2) gives us approximately

T(x + Δx, t) sin(θ + Δθ) − T(x, t) sin(θ) = ρ Δx ∂²y/∂t² (x̄, t),
[FIGURE 17.1: String profile at time t.]

[FIGURE 17.2: Tension on a segment of string between x and x + Δx.]

where x̄ is the center of mass of the segment and T(x, t) = ||T(x, t)|| is the magnitude of the tension. Then

$$\frac{T(x+\Delta x, t)\sin(\theta+\Delta\theta) - T(x,t)\sin(\theta)}{\Delta x} = \rho\,\frac{\partial^2 y}{\partial t^2}(\bar{x}, t).$$

Now v(x, t) = T(x, t) sin(θ) is the vertical component of the tension, so the last equation becomes

$$\frac{v(x+\Delta x, t) - v(x,t)}{\Delta x} = \rho\,\frac{\partial^2 y}{\partial t^2}(\bar{x}, t).$$

In the limit as Δx → 0, we also have x̄ → x, and the last equation becomes

$$\frac{\partial v}{\partial x} = \rho\,\frac{\partial^2 y}{\partial t^2}. \qquad (17.1)$$

The horizontal component of the tension is h(x, t) = T(x, t) cos(θ), so

$$v(x,t) = h(x,t)\tan(\theta) = h(x,t)\,\frac{\partial y}{\partial x}.$$

Substitute this into equation (17.1) to get

$$\frac{\partial}{\partial x}\left(h\,\frac{\partial y}{\partial x}\right) = \rho\,\frac{\partial^2 y}{\partial t^2}. \qquad (17.2)$$

To compute the left side of this equation, use the fact that the net horizontal component of the tension on the segment is zero, so h(x + Δx, t) − h(x, t) = 0. Thus h is independent of x, and equation (17.2) can be written

$$h\,\frac{\partial^2 y}{\partial x^2} = \rho\,\frac{\partial^2 y}{\partial t^2}.$$

Letting c² = h/ρ, this equation is often written

$$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2}.$$

This is the one-dimensional (1-space dimension) wave equation. If we use subscript notation for partial derivatives, in which

$$y_x = \frac{\partial y}{\partial x} \quad\text{and}\quad y_t = \frac{\partial y}{\partial t},$$

then the wave equation is

$$y_{tt} = c^2 y_{xx}.$$
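As a sanity check on the derivation, a separated solution of the form sin(nπx/L) cos(nπct/L) (these appear in Section 17.2) can be tested against the wave equation numerically. A minimal sketch in Python; the values of n, L, c, and the test point are arbitrary illustrative choices, not from the text:

```python
import math

# Approximate y_tt and c^2 * y_xx for y(x, t) = sin(n*pi*x/L) * cos(n*pi*c*t/L)
# with centered second differences; the residual should be near zero.

def second_diff(F, s, h=1e-3):
    """Centered second-difference approximation of F''(s)."""
    return (F(s + h) - 2.0 * F(s) + F(s - h)) / h**2

def residual_of(n, L, c, x, t):
    y = lambda xx, tt: math.sin(n * math.pi * xx / L) * math.cos(n * math.pi * c * tt / L)
    ytt = second_diff(lambda s: y(x, s), t)   # approximate d2y/dt2
    yxx = second_diff(lambda s: y(s, t), x)   # approximate d2y/dx2
    return abs(ytt - c**2 * yxx)

residual = max(residual_of(n, 1.0, 2.0, 0.3, 0.7) for n in (1, 2, 3))
print(residual)  # small: the PDE holds up to discretization error
```

The residual is nonzero only because of the finite-difference truncation error, which shrinks as h is reduced.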
This spectacular photo, taken by Ensign John Gay from the U.S.S. Constellation, shows a shock wave cloud forming over the tail of a U.S. Navy F/A-18 Hornet as it breaks the sound barrier. Current theory is that sound density waves generated by the plane accumulate in a cone at the plane's tail, and a drop in air pressure causes moist air to condense into water droplets there. Shock waves are not yet fully understood, and their mathematical modeling uses advanced techniques from the theory of partial differential equations.
In order to model the string's motion, we need more than just the wave equation. We must also incorporate information about constraints on the ends of the string, and about the initial velocity and position of the string, which will obviously influence the motion. If the ends of the string are fixed, then

$$y(0, t) = y(L, t) = 0 \quad\text{for } t \ge 0.$$

These are the boundary conditions. The initial conditions specify the initial (at time zero) position

$$y(x, 0) = f(x) \quad\text{for } 0 \le x \le L$$

and the initial velocity

$$\frac{\partial y}{\partial t}(x, 0) = g(x) \quad\text{for } 0 \le x \le L,$$

in which f and g are given functions satisfying certain compatibility conditions. For example, if the string is fixed at its ends, then the initial position function must reflect this by satisfying f(0) = f(L) = 0. If the initial velocity is zero (the string is released from rest), then g(x) = 0.
The wave equation, together with the boundary and initial conditions, constitutes a boundary value problem for the position function y(x, t) of the string. These constitute enough information to uniquely determine the solution y(x, t).

If there is an external force of magnitude F units of force per unit length acting on the string in the vertical direction, then this derivation can be modified to obtain

$$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2} + \frac{1}{\rho}F.$$

Again, the boundary value problem consists of this wave equation and the boundary and initial conditions.

In 2-space dimensions the wave equation is

$$\frac{\partial^2 z}{\partial t^2} = c^2\left(\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}\right). \qquad (17.3)$$
This equation governs vertical displacements z(x, y, t) of a membrane covering a specified region of the plane (for example, vibrations of a drum surface). Again, boundary and initial conditions must be given to determine a unique solution. Typically, the frame is fixed on a boundary (the rim of the drum surface), so we would have no displacement of points on the boundary:

z(x, y, t) = 0 for (x, y) on the boundary of the region and t ≥ 0.

Further, the initial displacement and initial velocity must be given. These initial conditions have the form

$$z(x, y, 0) = f(x, y), \qquad \frac{\partial z}{\partial t}(x, y, 0) = g(x, y)$$

with f and g given.

We will have occasion to use the two-dimensional wave equation (17.3) expressed in polar coordinates, so we will derive this equation. Let x = r cos(θ), y = r sin(θ). Then

$$r = \sqrt{x^2 + y^2} \quad\text{and}\quad \theta = \tan^{-1}(y/x).$$

Let z(x, y) = z(r cos(θ), r sin(θ)) = u(r, θ). Compute

$$\frac{\partial z}{\partial x} = \frac{\partial u}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial u}{\partial \theta}\frac{\partial \theta}{\partial x} = \frac{x}{\sqrt{x^2+y^2}}\frac{\partial u}{\partial r} - \frac{y}{x^2+y^2}\frac{\partial u}{\partial \theta} = \frac{x}{r}\frac{\partial u}{\partial r} - \frac{y}{r^2}\frac{\partial u}{\partial \theta}.$$
Then

$$\frac{\partial^2 z}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{x}{r}\frac{\partial u}{\partial r} - \frac{y}{r^2}\frac{\partial u}{\partial \theta}\right) = \frac{y^2}{r^3}\frac{\partial u}{\partial r} + \frac{2xy}{r^4}\frac{\partial u}{\partial \theta} + \frac{x^2}{r^2}\frac{\partial^2 u}{\partial r^2} - \frac{2xy}{r^3}\frac{\partial^2 u}{\partial r\,\partial\theta} + \frac{y^2}{r^4}\frac{\partial^2 u}{\partial \theta^2}.$$

By a similar calculation, we get

$$\frac{\partial z}{\partial y} = \frac{y}{r}\frac{\partial u}{\partial r} + \frac{x}{r^2}\frac{\partial u}{\partial \theta}$$

and

$$\frac{\partial^2 z}{\partial y^2} = \frac{x^2}{r^3}\frac{\partial u}{\partial r} - \frac{2xy}{r^4}\frac{\partial u}{\partial \theta} + \frac{y^2}{r^2}\frac{\partial^2 u}{\partial r^2} + \frac{2xy}{r^3}\frac{\partial^2 u}{\partial r\,\partial\theta} + \frac{x^2}{r^4}\frac{\partial^2 u}{\partial \theta^2}.$$

Then

$$\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2}.$$

Therefore, in polar coordinates, the two-dimensional wave equation (17.3) is

$$\frac{\partial^2 u}{\partial t^2} = c^2\left(\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2}\right), \qquad (17.4)$$

in which u(r, θ, t) is the vertical displacement of the membrane from the x, y plane at point (r, θ) and time t.

For the rest of this chapter we will solve boundary value problems involving wave motion in a variety of settings, making use of several techniques.
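The polar form of the Laplacian just derived can be spot-checked numerically. For z(x, y) = x²y + y³ we have ∂²z/∂x² + ∂²z/∂y² = 8y exactly, and in polar coordinates z becomes u(r, θ) = r³ sin(θ). A minimal sketch; the test point is an arbitrary choice:

```python
import math

# For z(x, y) = x^2*y + y^3, the Laplacian is z_xx + z_yy = 8y exactly, and in
# polar coordinates z = u(r, theta) = r^3*sin(theta).  We evaluate the polar
# expression u_rr + u_r/r + u_thth/r^2 by centered differences and compare.

def u(r, th):
    return r**3 * math.sin(th)

def polar_laplacian(r, th, h=1e-4):
    u_rr = (u(r + h, th) - 2 * u(r, th) + u(r - h, th)) / h**2
    u_r = (u(r + h, th) - u(r - h, th)) / (2 * h)
    u_thth = (u(r, th + h) - 2 * u(r, th) + u(r, th - h)) / h**2
    return u_rr + u_r / r + u_thth / r**2

r, th = 1.3, 0.8
print(polar_laplacian(r, th), 8 * r * math.sin(th))  # the two values agree closely
```

Since 8y = 8r sin(θ), the agreement of the two printed numbers confirms the identity at this point.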
1. Let y(x, t) = sin(nπx/L) cos(nπct/L). Show that y satisfies the one-dimensional wave equation for any positive integer n.

2. Show that z(x, y, t) = sin(nx) cos(my) cos(√(n² + m²) ct) satisfies the two-dimensional wave equation for any positive integers n and m.

3. Let f be any twice-differentiable function of one variable. Show that

y(x, t) = ½ [f(x + ct) + f(x − ct)]

satisfies the one-dimensional wave equation.

4. Show that y(x, t) = sin(x) cos(ct) + (1/c) cos(x) sin(ct) satisfies the one-dimensional wave equation, together with the boundary conditions

y(0, t) = y(2π, t) = (1/c) sin(ct) for t > 0

and the initial conditions

y(x, 0) = sin(x), ∂y/∂t(x, 0) = cos(x) for 0 < x < 2π.

5. Formulate a boundary value problem (partial differential equation, boundary and initial conditions) for vibrations of a rectangular membrane occupying a region 0 ≤ x ≤ a, 0 ≤ y ≤ b if the initial position is the graph of z = f(x, y) and the initial velocity (at time zero) is g(x, y). The membrane is fastened to a stiff frame along the rectangular boundary of the region.

6. Formulate a boundary value problem for the motion of an elastic string of length L, fastened at both ends and released from rest with an initial position given by f(x). The string vibrates in the x, y plane. Its motion is opposed by air resistance, which has a force at each point of magnitude proportional to the square of the velocity at that point.
17.2 Fourier Series Solutions of the Wave Equation

We will begin with problems involving wave motion on a bounded interval. First we will consider the problem when there is an initial displacement, but no initial velocity (string released from rest). Following this we will allow an initial velocity but no initial displacement (string given an initial blow, but from its horizontal stretched position). Then we will show how to combine these to allow for both an initial velocity and initial displacement.
17.2.1 Vibrating String with Zero Initial Velocity

Consider an elastic string of length L, fastened at its ends on the x axis at x = 0 and x = L. The string is displaced, then released from rest to vibrate in the x, y plane. We want to find the displacement function y(x, t), whose graph is a curve in the x, y plane showing the shape of the string at time t. If we took a snapshot of the string at time t, we would see this curve. The boundary value problem for the displacement function is

$$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0,$$

$$y(0, t) = y(L, t) = 0 \quad\text{for } t \ge 0,$$

$$y(x, 0) = f(x) \quad\text{for } 0 \le x \le L,$$

$$\frac{\partial y}{\partial t}(x, 0) = 0 \quad\text{for } 0 \le x \le L.$$
The graph of f(x) is the position of the string before release.

The Fourier method, or separation of variables, consists of attempting a solution of the form y(x, t) = X(x)T(t). Substitute this into the wave equation to obtain

$$XT'' = c^2 X'' T,$$

where T' = dT/dt and X' = dX/dx. Then

$$\frac{X''}{X} = \frac{T''}{c^2 T}.$$

The left side of this equation depends only on x, and the right only on t. Because x and t are independent, we can choose any t₀ we like and fix the right side of this equation at the constant value T''(t₀)/c²T(t₀), while varying x on the left side. Therefore X''/X must be constant for all x in (0, L). But then T''/c²T must equal the same constant for all t ≥ 0. Denote this constant −λ. (The negative sign is customary and convenient, but we would arrive at the same final solution if we used just λ.) λ is called the separation constant, and we now have

$$\frac{X''}{X} = \frac{T''}{c^2 T} = -\lambda.$$
Then

$$X'' + \lambda X = 0 \quad\text{and}\quad T'' + \lambda c^2 T = 0.$$

The wave equation has separated into two ordinary differential equations. Now consider the boundary conditions. First,

y(0, t) = X(0)T(t) = 0 for t > 0.

If T(t) = 0 for all t > 0, then y(x, t) = 0 for 0 < x < L and t > 0. This is indeed the solution if f(x) = 0, since in the absence of initial velocity or a driving force, and with zero displacement, the string remains stationary for all time. However, if T(t) ≠ 0 for some time, then this boundary condition can be satisfied only if X(0) = 0. Similarly,

y(L, t) = X(L)T(t) = 0 for t > 0

requires that X(L) = 0. We now have a boundary value problem for X:

$$X'' + \lambda X = 0; \quad X(0) = X(L) = 0.$$

The values of λ for which this problem has nontrivial solutions are the eigenvalues of this problem, and the corresponding nontrivial solutions for X are the eigenfunctions. We solved this regular Sturm-Liouville problem in Example 16.8, obtaining the eigenvalues

$$\lambda_n = \frac{n^2\pi^2}{L^2}.$$

The eigenfunctions are nonzero constant multiples of

$$X_n(x) = \sin\left(\frac{n\pi x}{L}\right)$$

for n = 1, 2, .... At this point we therefore have infinitely many possibilities for the separation constant and for X(x).

Now turn to T(t). Since the string is released from rest,

$$\frac{\partial y}{\partial t}(x, 0) = X(x)T'(0) = 0.$$

This requires that T'(0) = 0. The problem to be solved for T is therefore

$$T'' + \lambda c^2 T = 0; \quad T'(0) = 0.$$

However, we now know that λ can take on only values of the form n²π²/L², so this problem is really

$$T'' + \frac{n^2\pi^2 c^2}{L^2}\,T = 0; \quad T'(0) = 0.$$
The differential equation for T has general solution

$$T(t) = a\cos\left(\frac{n\pi ct}{L}\right) + b\sin\left(\frac{n\pi ct}{L}\right).$$

Now

$$T'(0) = \frac{n\pi c}{L}\,b = 0,$$

so b = 0. We therefore have solutions for T(t) of the form

$$T_n(t) = c_n\cos\left(\frac{n\pi ct}{L}\right)$$

for each positive integer n, with the constants c_n as yet undetermined. We now have, for n = 1, 2, ..., functions

$$y_n(x, t) = c_n\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi ct}{L}\right). \qquad (17.5)$$
Each of these functions satisfies both boundary conditions and the zero initial velocity condition ∂y_n/∂t(x, 0) = 0. We need to satisfy the condition y(x, 0) = f(x). It may be possible to choose some n so that y_n(x, t) is the solution for some choice of c_n. For example, suppose the initial displacement is

$$f(x) = 14\sin\left(\frac{3\pi x}{L}\right).$$

Now choose n = 3 and c₃ = 14 to obtain the solution

$$y(x, t) = 14\sin\left(\frac{3\pi x}{L}\right)\cos\left(\frac{3\pi ct}{L}\right).$$

This function satisfies the wave equation, the conditions y(0, t) = y(L, t) = 0, the initial condition y(x, 0) = 14 sin(3πx/L), and the zero initial velocity condition ∂y/∂t(x, 0) = 0.

However, depending on the initial displacement function, we may not be able to get by simply by picking a particular n and c_n in equation (17.5). For example, if we initially pick the string up in the middle and have initial displacement function

$$f(x) = \begin{cases} x & \text{for } 0 \le x \le L/2 \\ L - x & \text{for } L/2 \le x \le L \end{cases} \qquad (17.6)$$
(as in Figure 17.3), then we can never satisfy y(x, 0) = f(x) with one of the y_n's. Even if we try a finite linear combination

$$y(x, t) = \sum_{n=1}^{N} y_n(x, t),$$

[FIGURE 17.3: Initial position of the string picked up at its middle.]
we cannot choose c₁, ..., c_N to satisfy y(x, 0) = f(x) for this function, since f(x) cannot be written as a finite sum of sine functions. We are therefore led to attempt an infinite superposition

$$y(x, t) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi ct}{L}\right).$$

We must choose the c_n's to satisfy

$$y(x, 0) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{L}\right) = f(x).$$

We can do this! The series on the right is the Fourier sine expansion of f(x) on [0, L]. Thus choose the Fourier sine coefficients

$$c_n = \frac{2}{L}\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi.$$

With this choice, we obtain the solution

$$y(x, t) = \sum_{n=1}^{\infty}\left[\frac{2}{L}\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi\right]\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi ct}{L}\right). \qquad (17.7)$$
This strategy will work for any initial displacement function f which is continuous with a piecewise continuous derivative on [0, L] and satisfies f(0) = f(L) = 0. These conditions ensure that the Fourier sine series of f(x) on [0, L] converges to f(x) for 0 ≤ x ≤ L.

In specific instances, where f(x) is given, we can of course explicitly compute the coefficients in this solution. For example, if L = π and the initial position function is f(x) = x cos(5x/2) on [0, π], then the nth coefficient in the solution (17.7) is

$$c_n = \frac{2}{\pi}\int_0^{\pi} \xi\cos(5\xi/2)\sin(n\xi)\,d\xi = \frac{160\,n(-1)^{n+1}}{\pi(5+2n)^2(5-2n)^2}.$$

The solution for this initial displacement function, and zero initial velocity, is

$$y(x, t) = \frac{160}{\pi}\sum_{n=1}^{\infty}\frac{n(-1)^{n+1}}{(5+2n)^2(5-2n)^2}\sin(nx)\cos(nct). \qquad (17.8)$$
Figure 17.4(a) shows graphs of this function (profiles of the string) at times t = 0, 0.2, 0.4, 0.7, 0.9, and 1.3 seconds. Figure 17.4(b) shows profiles at times t = 1.2, 1.9, 3, 3.5, 4.2, and 4.7. And Figure 17.4(c) shows the graphs at times t = 5.1, 5.6, 5.9, 6.4, 7, and 8.3. These snapshots are made in groupings on the same set of axes to convey some sense of the motion with time.

[FIGURE 17.4(a): Profiles of the solution at times t = 0, 0.2, 0.4, 0.7, 0.9, and 1.3.]

[FIGURE 17.4(b): String profiles at times t = 1.2, 1.9, 3, 3.5, 4.2, and 4.7.]

The solution we have derived by separation of variables can be put into the context of Sturm-Liouville theory (Section 16.3). The problem for X, namely

$$X'' + \lambda X = 0; \quad X(0) = X(L) = 0,$$

is a regular Sturm-Liouville problem, and we found its eigenvalues and corresponding eigenfunctions. The final step in the solution was to expand the initial position function in a series of the eigenfunctions. For this problem this series is the Fourier sine expansion of f(x) on [0, L].
[FIGURE 17.4(c): String profiles at times t = 5.1, 5.6, 5.9, 6.4, 7, and 8.3.]
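The coefficients for this example can be cross-checked by numerical quadrature. A hedged sketch; composite Simpson's rule with 2000 subintervals is an arbitrary choice, and `c_closed` encodes the closed form stated above for f(x) = x cos(5x/2) on [0, π]:

```python
import math

# Compare c_n = (2/pi) * integral_0^pi xi*cos(5*xi/2)*sin(n*xi) d(xi), computed
# by composite Simpson quadrature, against the closed-form expression.

def simpson(F, a, b, m=2000):
    """Composite Simpson's rule with m (even) subintervals."""
    h = (b - a) / m
    s = F(a) + F(b)
    s += 4 * sum(F(a + (2 * i - 1) * h) for i in range(1, m // 2 + 1))
    s += 2 * sum(F(a + 2 * i * h) for i in range(1, m // 2))
    return s * h / 3

def c_numeric(n):
    return (2 / math.pi) * simpson(
        lambda x: x * math.cos(2.5 * x) * math.sin(n * x), 0.0, math.pi)

def c_closed(n):
    return 160 * n * (-1) ** (n + 1) / (math.pi * (5 + 2 * n) ** 2 * (5 - 2 * n) ** 2)

for n in range(1, 6):
    print(n, c_numeric(n), c_closed(n))  # the two columns agree to many digits
```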
17.2.2 Vibrating String with Given Initial Velocity and Zero Initial Displacement

Now consider the case that the string is released from its horizontal position (zero initial displacement), but with an initial velocity given at x by g(x). The boundary value problem for the displacement function is

$$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0,$$

$$y(0, t) = y(L, t) = 0 \quad\text{for } t \ge 0,$$

$$y(x, 0) = 0 \quad\text{for } 0 \le x \le L,$$

$$\frac{\partial y}{\partial t}(x, 0) = g(x) \quad\text{for } 0 \le x \le L.$$

We begin as before with separation of variables. Put y(x, t) = X(x)T(t). Since the partial differential equation and boundary conditions are the same as before, we again obtain

$$X'' + \lambda X = 0; \quad X(0) = X(L) = 0,$$

with eigenvalues

$$\lambda_n = \frac{n^2\pi^2}{L^2}$$

and eigenfunctions constant multiples of

$$X_n(x) = \sin\left(\frac{n\pi x}{L}\right).$$

Now, however, the problem for T is different, and we have

y(x, 0) = X(x)T(0) = 0,

so T(0) = 0. The problem for T is

$$T'' + \frac{n^2\pi^2 c^2}{L^2}\,T = 0; \quad T(0) = 0.$$
(In the case of zero initial velocity we had T'(0) = 0.) The general solution of the differential equation for T is

$$T(t) = a\cos\left(\frac{n\pi ct}{L}\right) + b\sin\left(\frac{n\pi ct}{L}\right).$$

Since T(0) = a = 0, solutions for T(t) are constant multiples of sin(nπct/L). Thus, for n = 1, 2, ..., we have functions

$$y_n(x, t) = c_n\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi ct}{L}\right).$$

Each of these functions satisfies the wave equation, the boundary conditions, and the zero initial displacement condition. To satisfy the initial velocity condition y_t(x, 0) = g(x), we generally must attempt a superposition

$$y(x, t) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi ct}{L}\right).$$

Assuming that we can differentiate this series term by term, then

$$\frac{\partial y}{\partial t}(x, 0) = \sum_{n=1}^{\infty} c_n\,\frac{n\pi c}{L}\sin\left(\frac{n\pi x}{L}\right) = g(x).$$

This is the Fourier sine expansion of g(x) on [0, L]. Choose the entire coefficient of sin(nπx/L) to be the Fourier sine coefficient of g(x) on [0, L]:

$$c_n\,\frac{n\pi c}{L} = \frac{2}{L}\int_0^L g(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi,$$

or

$$c_n = \frac{2}{n\pi c}\int_0^L g(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi.$$

The solution is

$$y(x, t) = \frac{2}{\pi c}\sum_{n=1}^{\infty}\frac{1}{n}\left[\int_0^L g(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi\right]\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi ct}{L}\right). \qquad (17.9)$$
For example, suppose the string is released from its horizontal position with an initial velocity given by g(x) = x(1 + cos(πx/L)). Compute

$$\int_0^L g(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi = \int_0^L \xi\left(1 + \cos\left(\frac{\pi\xi}{L}\right)\right)\sin\left(\frac{n\pi\xi}{L}\right)d\xi = \begin{cases} \dfrac{3L^2}{4\pi} & \text{if } n = 1 \\[1.5ex] \dfrac{L^2(-1)^n}{\pi n(n^2-1)} & \text{if } n \ne 1. \end{cases}$$

The solution corresponding to this initial velocity function is

$$y(x, t) = \frac{3L^2}{2\pi^2 c}\sin\left(\frac{\pi x}{L}\right)\sin\left(\frac{\pi ct}{L}\right) + \frac{2L^2}{\pi^2 c}\sum_{n=2}^{\infty}\frac{(-1)^n}{n^2(n^2-1)}\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi ct}{L}\right). \qquad (17.10)$$

If we let c = 1 and L = π, the solution (17.10) becomes

$$y(x, t) = \frac{3}{2}\sin(x)\sin(t) + 2\sum_{n=2}^{\infty}\frac{(-1)^n}{n^2(n^2-1)}\sin(nx)\sin(nt).$$

Figure 17.5 shows graphs of this solution (positions of the string) at times t = 0.4, 1.2, 1.7, 2.6, 3.5, and 4.3.
[FIGURE 17.5: String profiles at times t = 0.4, 1.2, 1.7, 2.6, 3.5, and 4.3.]
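As a check on this solution with L = π and c = 1, differentiating the series term by term at t = 0 should reproduce the initial velocity g(x) = x(1 + cos x). A sketch; the truncation N = 400 and the sample points are arbitrary choices:

```python
import math

# Term-by-term time derivative of the series solution at t = 0, compared with
# the initial velocity g(x) = x*(1 + cos x) on (0, pi).

def g(x):
    return x * (1 + math.cos(x))

def y_t_at_zero(x, N=400):
    """d/dt of the series solution, evaluated at t = 0."""
    total = 1.5 * math.sin(x)  # the n = 1 term contributes (3/2) sin(x)
    for n in range(2, N + 1):
        total += 2 * (-1) ** n / (n * (n * n - 1)) * math.sin(n * x)
    return total

pts = [0.1 * k for k in range(1, 31)]  # sample points in (0, pi)
err = max(abs(y_t_at_zero(x) - g(x)) for x in pts)
print(err)  # small: the series matches the initial velocity
```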
17.2.3 Vibrating String with Initial Displacement and Velocity

Consider the motion of the string with both initial displacement f(x) and initial velocity g(x). Formulate two separate problems, the first with initial displacement f(x) and zero initial velocity, and the second with zero initial displacement and initial velocity g(x). We know how to solve both of these. Let y₁(x, t) be the solution of the first problem, and y₂(x, t) the solution of the second. Now let

$$y(x, t) = y_1(x, t) + y_2(x, t).$$

Then y satisfies the wave equation and the boundary conditions. Further,

$$y(x, 0) = y_1(x, 0) + y_2(x, 0) = f(x) + 0 = f(x)$$

and

$$\frac{\partial y}{\partial t}(x, 0) = \frac{\partial y_1}{\partial t}(x, 0) + \frac{\partial y_2}{\partial t}(x, 0) = 0 + g(x) = g(x).$$

Thus y(x, t) is the solution in this case of nonzero initial displacement and velocity functions.

For example, let the initial displacement function be

$$f(x) = \begin{cases} x & \text{for } 0 \le x \le L/2 \\ L - x & \text{for } L/2 \le x \le L \end{cases}$$

and the initial velocity

$$g(x) = x\left(1 + \cos\left(\frac{\pi x}{L}\right)\right).$$
The solution for the displacement function is the sum of the solution y₁(x, t) for just displacement f(x), with zero initial velocity, and the solution y₂(x, t) with zero initial displacement and initial velocity g(x). For y₁(x, t), use the solution (17.7). First evaluate

$$\frac{2}{L}\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi = \frac{2}{L}\left[\int_0^{L/2}\xi\sin\left(\frac{n\pi\xi}{L}\right)d\xi + \int_{L/2}^{L}(L-\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi\right] = \frac{4L}{n^2\pi^2}\sin\left(\frac{n\pi}{2}\right).$$
Therefore

$$y_1(x, t) = \sum_{n=1}^{\infty}\frac{4L}{n^2\pi^2}\sin\left(\frac{n\pi}{2}\right)\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi ct}{L}\right).$$

We have already solved for y₂(x, t), obtaining

$$y_2(x, t) = \frac{3L^2}{2\pi^2 c}\sin\left(\frac{\pi x}{L}\right)\sin\left(\frac{\pi ct}{L}\right) + \frac{2L^2}{\pi^2 c}\sum_{n=2}^{\infty}\frac{(-1)^n}{n^2(n^2-1)}\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi ct}{L}\right).$$

The solution with the given initial position and initial velocity is y(x, t) = y₁(x, t) + y₂(x, t). If we let L = π and c = 1, this solution is

$$y(x, t) = \sum_{n=1}^{\infty}\frac{4}{n^2\pi}\sin\left(\frac{n\pi}{2}\right)\sin(nx)\cos(nt) + \frac{3}{2}\sin(x)\sin(t) + 2\sum_{n=2}^{\infty}\frac{(-1)^n}{n^2(n^2-1)}\sin(nx)\sin(nt).$$

Graphs of this string profile are shown in Figure 17.6 for times t = 0.125, 0.46, 0.93, 1.9, 2.5, 3.4, and 5.2.
[FIGURE 17.6: Snapshots of the string at times t = 0.125, 0.46, 0.93, 1.9, 2.5, 3.4, and 5.2.]
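At t = 0 only the y₁ series contributes (since y₂(x, 0) = 0), so with L = π and c = 1 the series should reproduce the triangular initial displacement. A numerical sketch; the truncation N = 4000 and the sample points are arbitrary choices:

```python
import math

# Partial sum of the y1 series at t = 0, compared with the triangular initial
# displacement f(x) = x on [0, pi/2], pi - x on [pi/2, pi].

def f(x):
    return x if x <= math.pi / 2 else math.pi - x

def y_at_zero(x, N=4000):
    # only y1 contributes at t = 0, since every term of y2 vanishes there
    return sum(4 / (math.pi * n * n) * math.sin(n * math.pi / 2) * math.sin(n * x)
               for n in range(1, N + 1))

pts = [0.1 * k for k in range(1, 31)]
err = max(abs(y_at_zero(x) - f(x)) for x in pts)
print(err)  # small: the Fourier sine series recovers the initial shape
```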
17.2.4 Verification of Solutions

In the solutions we have obtained thus far we have had to use an infinite series

$$y(x, t) = \sum_{n=1}^{\infty} y_n(x, t)$$

and determine the coefficients in the y_n's by using a Fourier expansion. The question now is whether this infinite sum is indeed a solution of the boundary value problem. To be specific, consider the problem with initial position function f(x) and zero initial velocity. We derived the proposed solution

$$y(x, t) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi ct}{L}\right), \qquad (17.11)$$
in which

$$c_n = \frac{2}{L}\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi.$$
Certainly y(0, t) = y(L, t) = 0, because every term in the series for y(x, t) vanishes at x = 0 and at x = L. Further, under reasonable conditions on f, the Fourier sine series of f(x) converges to f(x) on [0, L], so y(x, 0) = f(x).

It is not obvious, however, that y(x, t) satisfies the wave equation, even though each term in the series certainly does. The reason for this uncertainty is that we cannot justify term-by-term differentiation of the proposed series solution.

We will now demonstrate a remarkable fact, which has other ramifications as well. We will show that the series in equation (17.11) can be summed in closed form. To do this, first write

$$\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi ct}{L}\right) = \frac{1}{2}\left[\sin\left(\frac{n\pi(x+ct)}{L}\right) + \sin\left(\frac{n\pi(x-ct)}{L}\right)\right].$$
Then equation (17.11) becomes

$$y(x, t) = \frac{1}{2}\sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi(x+ct)}{L}\right) + \frac{1}{2}\sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi(x-ct)}{L}\right).$$

If the Fourier sine series for f(x) converges to f(x) on [0, L], as might normally be expected of a function that can be a displacement function for a string, then

$$f(x) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{L}\right) \quad\text{for } 0 \le x \le L,$$

and the two sums above combine to give

$$y(x, t) = \frac{1}{2}\left[f(x+ct) + f(x-ct)\right].$$
If f is twice differentiable, we can use the chain rule to verify directly that y(x, t) given by this expression satisfies the wave equation, wherever f(x + ct) and f(x − ct) are defined. This raises a difficulty, however, since f(x) is defined only for 0 ≤ x ≤ L. But t can be any nonnegative number, so the numbers x + ct and x − ct can vary over the entire real line. How then can we evaluate f(x + ct) and f(x − ct)?

This difficulty can be overcome in two steps. First, extend f to an odd function f_o defined on [−L, L] by setting

$$f_o(x) = \begin{cases} f(x) & \text{for } 0 \le x \le L \\ -f(-x) & \text{for } -L \le x < 0. \end{cases}$$

Notice that f_o(0) = f_o(L) = f_o(−L) = 0 because the ends of the string are fixed. Now extend f_o to a periodic function F of period 2L by replicating the graph of f_o on successive intervals [L, 3L], [3L, 5L], ..., [−3L, −L], [−5L, −3L], .... Figure 17.7(a) displays the odd extension of f defined on [0, L] to f_o defined on [−L, L], and Figure 17.7(b) shows the periodic extension of f_o to the real line. We now have

$$y(x, t) = \frac{1}{2}\left[F(x+ct) + F(x-ct)\right] \qquad (17.13)$$

for 0 ≤ x ≤ L and t ≥ 0. Assuming that f is twice differentiable, and that the joins at the ends of intervals where f has been extended to produce F are sufficiently smooth, then F is also twice differentiable, and the chain rule can be used to directly verify that y(x, t) satisfies the
[FIGURE 17.7(a): Odd extension of f to [−L, L].]

[FIGURE 17.7(b): Periodic extension F of f_o to the real line.]
wave equation. This is an elegant expression for the solution in terms of the initial displacement function and the number c, which depends on the material from which the string is made. It is reasonable that the motion should be determined by these quantities.

In practice, there will often be finitely many points in [0, L] at which f is not differentiable. For example, f(x) as given by equation (17.6) is not differentiable at L/2. In such a case y(x, t) given by equation (17.13) is the solution in a restricted sense, as there are isolated points at which it does not satisfy all the conditions of the boundary value problem.

Equation (17.13) has an appealing physical interpretation. If we think of F(x) as a wave, then F(x + ct) is this wave translated ct units to the left, and F(x − ct) is the wave translated ct units to the right. The motion of the string (in this case with zero initial velocity) is a sum of two waves, one moving to the right with velocity c, the other to the left with velocity c, and both waves are determined by the initial displacement function. We will say more about this when we discuss d'Alembert's solution for the motion of an infinitely long string.
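The two-step extension is easy to implement directly. A hedged sketch of equation (17.13) for the triangular initial displacement of equation (17.6); L = 1 and c = 1 are sample values:

```python
import math

# d'Alembert-style closed form y(x, t) = (F(x+ct) + F(x-ct))/2, where F is the
# odd, 2L-periodic extension of the initial displacement f.

L, c = 1.0, 1.0

def f(x):
    return x if x <= L / 2 else L - x  # initial displacement on [0, L]

def F(s):
    """Odd extension of f to [-L, L], then 2L-periodic extension to the line."""
    s = (s + L) % (2 * L) - L          # reduce s into [-L, L)
    return f(s) if s >= 0 else -f(-s)

def y(x, t):
    return 0.5 * (F(x + c * t) + F(x - c * t))

# Sanity checks: the initial shape is recovered, and the fixed ends never move.
print(y(0.3, 0.0), f(0.3))        # equal
print(y(0.0, 2.37), y(L, 2.37))   # both near zero, up to rounding
```

Because F is odd and 2L-periodic, F(ct) + F(−ct) = 0 and F(L + ct) + F(L − ct) = 0 for every t, so the boundary conditions hold automatically.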
17.2.5 Transformation of Boundary Value Problems Involving the Wave Equation

There are boundary value problems involving the wave equation for which separation of variables does not lead to the solution. This can occur because of the form of the wave equation (for example, there may be an external forcing term), or because of the form of the boundary conditions. Here is an example of such a problem and a strategy for overcoming the difficulty. Consider the boundary value problem

$$\frac{\partial^2 y}{\partial t^2} = \frac{\partial^2 y}{\partial x^2} + Ax \quad\text{for } 0 < x < L,\ t > 0,$$

$$y(0, t) = y(L, t) = 0 \quad\text{for } t \ge 0,$$

$$y(x, 0) = 0, \quad \frac{\partial y}{\partial t}(x, 0) = 1 \quad\text{for } 0 \le x \le L.$$

A is a positive constant. The term Ax in the wave equation represents an external force which at x has magnitude Ax. We have let c = 1 in this problem. If we put y(x, t) = X(x)T(t) into the partial differential equation, we get

$$XT'' = X''T + Ax,$$

and there is no way to separate the t dependency on one side of the equation and the x-dependent terms on the other.
We will transform this problem into one for which separation of variables works. Let

$$y(x, t) = Y(x, t) + \psi(x).$$

The idea is to choose ψ to reduce the given problem to one we have already solved. Substitute y(x, t) into the partial differential equation to get

$$\frac{\partial^2 Y}{\partial t^2} = \frac{\partial^2 Y}{\partial x^2} + \psi''(x) + Ax.$$

This will be simplified if we choose ψ so that

$$\psi''(x) + Ax = 0.$$

There are many such choices. By integrating twice, we get

$$\psi(x) = -\frac{Ax^3}{6} + Cx + D,$$

with C and D constants we can still choose any way we like. Now look at the boundary conditions. First,

$$y(0, t) = Y(0, t) + \psi(0) = 0.$$

This will be just Y(0, t) = 0 if we choose ψ(0) = D = 0. Next,

$$y(L, t) = Y(L, t) + \psi(L) = Y(L, t) - \frac{AL^3}{6} + CL = 0.$$

This will reduce to Y(L, t) = 0 if we choose C so that

$$\psi(L) = -\frac{AL^3}{6} + CL = 0, \quad\text{or}\quad C = \frac{AL^2}{6}.$$

This means that

$$\psi(x) = -\frac{Ax^3}{6} + \frac{AL^2 x}{6} = \frac{A}{6}x(L^2 - x^2).$$

With this choice of ψ, Y(0, t) = Y(L, t) = 0. Now relate the initial conditions for y to initial conditions for Y. First,

$$Y(x, 0) = y(x, 0) - \psi(x) = -\psi(x) = \frac{A}{6}x(x^2 - L^2).$$

And

$$\frac{\partial Y}{\partial t}(x, 0) = \frac{\partial y}{\partial t}(x, 0) = 1.$$
We now have a boundary value problem for Y(x, t):

$$\frac{\partial^2 Y}{\partial t^2} = \frac{\partial^2 Y}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0,$$

$$Y(0, t) = Y(L, t) = 0 \quad\text{for } t \ge 0,$$

$$Y(x, 0) = \frac{A}{6}x(x^2 - L^2), \quad \frac{\partial Y}{\partial t}(x, 0) = 1 \quad\text{for } 0 \le x \le L.$$

Using equations (17.7) and (17.9), we immediately write the solution

$$Y(x, t) = \sum_{n=1}^{\infty}\left[\frac{2}{L}\int_0^L \frac{A}{6}\,\xi(\xi^2 - L^2)\sin\left(\frac{n\pi\xi}{L}\right)d\xi\right]\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi t}{L}\right)$$
$$\qquad + \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\left[\int_0^L \sin\left(\frac{n\pi\xi}{L}\right)d\xi\right]\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi t}{L}\right)$$

$$= \frac{2AL^3}{\pi^3}\sum_{n=1}^{\infty}\frac{(-1)^n}{n^3}\sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{n\pi t}{L}\right) + \frac{2L}{\pi^2}\sum_{n=1}^{\infty}\frac{1 - (-1)^n}{n^2}\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi t}{L}\right).$$

The solution of the original problem is

$$y(x, t) = Y(x, t) + \frac{A}{6}x(L^2 - x^2).$$

Figure 17.8(a) shows graphs of the string's position at times t = 0.03, 0.2, 0.5, 0.9, 1.4, and 2.2, with c = 1 and L = π. Figure 17.8(b) shows this string at times t = 2.8, 3.7, 4.4, 4.8, 5.3, 6.1, and 6.7. These use L = π and c = 1.
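Since the original problem has zero initial displacement, the series for Y(x, 0) must cancel ψ(x) exactly. A numerical sketch of this cancellation; A = 1, L = 1, and the truncation N = 500 are arbitrary choices:

```python
import math

# At t = 0 the series part Y(x, 0) should equal -psi(x), so that the full
# solution y(x, 0) = Y(x, 0) + psi(x) vanishes.

A, L = 1.0, 1.0

def psi(x):
    return A / 6 * x * (L**2 - x**2)

def Y0(x, N=500):
    """Y(x, 0): only the cosine-in-t part of the series survives at t = 0."""
    s = 2 * A * L**3 / math.pi**3
    return s * sum((-1) ** n / n**3 * math.sin(n * math.pi * x / L)
                   for n in range(1, N + 1))

err = max(abs(Y0(x) + psi(x)) for x in [0.1 * k for k in range(1, 10)])
print(err)  # near zero: the initial displacement of y is zero
```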
[FIGURE 17.8(a): Position of the string at times t = 0.03, 0.2, 0.5, 0.9, 1.4, and 2.2.]

[FIGURE 17.8(b): Position at times t = 2.8, 3.7, 4.4, 4.8, 5.3, 6.1, and 6.7.]

17.2.6 Effects of Initial Conditions and Constants on the Motion

Using separation of variables, we have obtained series solutions of problems involving the vibrating string on a bounded interval. It is interesting to examine the effects that constants occurring in the problem have on the solution. We begin with an example investigating the effect of the constant c on the motion of the string.
EXAMPLE 17.1

Consider again the problem of the wave equation with zero initial displacement and initial velocity given by g(x) = x(1 + cos(πx/L)). The solution previously obtained, with L = π, is

$$y(x, t) = \frac{3}{2c}\sin(x)\sin(ct) + \frac{2}{c}\sum_{n=2}^{\infty}\frac{(-1)^n}{n^2(n^2-1)}\sin(nx)\sin(nct).$$

Figure 17.5 shows graphs of the string's position at various times, with c = 1. Now we want to focus on how c influences the motion. Figure 17.9(a) shows the string profile at time t = 5.3, with c = 1.05. Figures 17.9(b) and (c) show the profile at the same time, with c = 1.1 and 1.2, respectively. These graphs are placed on the same set of axes for comparison in Figure 17.9(d). The student is invited to select other times and graph the solution for different values of c.

Next, consider a problem in which the initial data of the problem depends on a parameter.
[FIGURE 17.9(a): t = 5.3 and c = 1.05.]

[FIGURE 17.9(b): t = 5.3 and c = 1.1.]

[FIGURE 17.9(c): t = 5.3 and c = 1.2.]

[FIGURE 17.9(d): String profile at time t = 5.3 with c having values 1.05, 1.1, and 1.2.]
EXAMPLE 17.2

Consider the problem

$$\frac{\partial^2 y}{\partial t^2} = 1.44\,\frac{\partial^2 y}{\partial x^2} \quad\text{for } 0 < x < \pi,\ t > 0,$$

$$y(0, t) = y(\pi, t) = 0 \quad\text{for } t \ge 0,$$

$$y(x, 0) = 0, \quad \frac{\partial y}{\partial t}(x, 0) = \sin(\epsilon x) \quad\text{for } 0 \le x \le \pi,$$

in which ε is a positive number that is not an integer. It is routine to write the solution

$$y(x, t) = \frac{5\sin(\epsilon\pi)}{3\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2 - \epsilon^2}\sin(nx)\sin(1.2nt).$$

Now compare graphs of this solution at various times, with different choices of ε. Figure 17.10(a) shows the string profile at t = 0.5 for ε equal to 0.7, 0.9, 1.5, 4.7, and 9.3. Figure 17.10(b) shows the graphs for these values of ε at t = 1.1, and Figure 17.10(c) shows the graphs at t = 2.8.

We can also follow the motion of the string at different times for the same value of ε. Figure 17.11(a) shows the string profiles for ε = 0.7 at times t = 0.5, 1.1, and 2.8. Figures 17.11(b), (c), (d), and (e) each show the string profile for a given ε and for these three times.
[FIGURE 17.10(a): String profiles at t = 0.5 for ε equal to 0.7, 0.9, 1.5, 4.7, and 9.3.]

[FIGURE 17.10(b): String profiles at t = 1.1.]

[FIGURE 17.10(c): String profiles at t = 2.8.]
[FIGURE 17.11(a): Graphs of the string with ε = 0.7 for times t = 0.5, 1.1, and 2.8.]

[FIGURE 17.11(b): ε = 0.9.]

[FIGURE 17.11(c): ε = 1.5.]

[FIGURE 17.11(d): ε = 4.7.]

[FIGURE 17.11(e): ε = 9.3.]
In some of the exercises we will ask the student to employ a graphics package to exhibit string profiles at different times and under different conditions.
17.2.7 Numerical Solution of the Wave Equation

We will describe a numerical method for approximating solutions of the wave equation on an interval. The underlying idea is useful in approximating solutions of the heat equation as well,
and involves difference approximations of the derivative. To understand this idea, begin with a function f of a single variable which is differentiable at x₀. Approximate

$$f'(x_0) \approx \frac{f(x_0 + h) - f(x_0)}{h}$$

and also

$$f'(x_0) \approx \frac{f(x_0 - h) - f(x_0)}{-h},$$

with the approximation improving as h is chosen closer to zero. If h > 0, these are, respectively, the forward and backward difference approximations of f'(x₀). If we average these we get

$$f'(x_0) \approx \frac{f(x_0 + h) - f(x_0 - h)}{2h}.$$

This is the centered difference approximation of f'(x₀). If f is twice differentiable at x₀, then

$$f''(x_0) \approx \frac{f(x_0 + 2h) - 2f(x_0) + f(x_0 - 2h)}{4h^2}.$$

Replacing 2h by h, we can write

$$f''(x_0) \approx \frac{f(x_0 + h) - 2f(x_0) + f(x_0 - h)}{h^2}.$$

This is the centered difference approximation of the second derivative.

Applying these ideas to y(x, t), we can take increments Δx in x and Δt in t and write centered difference approximations of second partial derivatives:

$$\frac{\partial^2 y}{\partial x^2}(x, t) \approx \frac{y(x + \Delta x, t) - 2y(x, t) + y(x - \Delta x, t)}{(\Delta x)^2}$$

and

$$\frac{\partial^2 y}{\partial t^2}(x, t) \approx \frac{y(x, t + \Delta t) - 2y(x, t) + y(x, t - \Delta t)}{(\Delta t)^2}.$$

We will use these to write numerical approximations of the solution to the problem:
$$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0,$$

$$y(0, t) = y(L, t) = 0 \quad\text{for } t \ge 0,$$

$$y(x, 0) = f(x) \quad\text{for } 0 \le x \le L,$$

$$\frac{\partial y}{\partial t}(x, 0) = g(x) \quad\text{for } 0 \le x \le L.$$

The x, t-region of interest is the strip 0 ≤ x ≤ L, t ≥ 0. Choose a positive integer N and let Δx = L/N. Partition [0, L] by points x_j = jΔx, so

$$0 < \frac{L}{N} < \frac{2L}{N} < \cdots < \frac{(N-1)L}{N} < \frac{NL}{N} = L.$$
Also choose an increment Δt in time and let t_k = kΔt for k = 0, 1, 2, .... In this way form a grid of points (x_j, t_k), called lattice points, over the x, t-strip, as shown in Figure 17.12. It is convenient to write

$$y_{j,k} = y(x_j, t_k) = y(j\Delta x, k\Delta t).$$

Now replace the partial derivatives in the wave equation with centered difference approximations to get

$$\frac{y_{j,k+1} - 2y_{j,k} + y_{j,k-1}}{(\Delta t)^2} = c^2\,\frac{y_{j+1,k} - 2y_{j,k} + y_{j-1,k}}{(\Delta x)^2}.$$

Solve this for y_{j,k+1} to get

$$y_{j,k+1} = \left(\frac{c\,\Delta t}{\Delta x}\right)^2\left(y_{j+1,k} - 2y_{j,k} + y_{j-1,k}\right) + 2y_{j,k} - y_{j,k-1}. \qquad (17.14)$$
Figure 17.13 shows why this equation is useful. The horizontal lines t = t_k divide the x, t-strip into horizontal time layers Δt units apart. Compute approximate values y_{j,k} at the lattice points (x_j, t_k). The points (x_j, t_{k+1}), (x_{j−1}, t_k), (x_j, t_k), (x_{j+1}, t_k), and (x_j, t_{k−1}) appear as a diamond configuration, with the middle three points at the t_k level, the last point at the t_{k−1} level, and the first at the highest, t_{k+1}, level. If we know the (approximate) value of y(x, t) at each of the last four points (in levels t_k and t_{k−1}), then we know all the terms on the right of equation (17.14), hence we know the (approximate) value y_{j,k+1} at the t_{k+1} level. We can work our way up such five-point configurations, always solving for the value of y(x, t) at the highest level, from previously derived values at the two next lower levels.

This process fails at the edges of the x, t-region because we cannot form this five-point diamond configuration there. However, the initial and boundary information of the problem gives information about y(x, t) at the edges. In particular, y(x, 0) = f(x) at each point on the bottom side (t = 0) of the strip, and y(0, t) = y(L, t) = 0 on the left and right sides of the strip.
FIGURE 17.13 Approximation of y(x_j, t_{k+1}) from preceding approximations, three at level t_k and one at level t_{k−1}.
CHAPTER 17 The Wave Equation
FIGURE 17.14 A t_{−1} layer must be created to implement the scheme of Figure 17.13 at the t_1 layer.
We have not yet used the initial condition on the velocity. Use the backward difference approximation of the first derivative to write

∂y/∂t(x_j, 0) ≈ (y(x_j, 0) − y(x_j, −Δt))/Δt ≈ (y_{j,0} − y_{j,−1})/Δt = g(x_j) = g(jΔx), j = 1, ..., N − 1.   (17.15)

Notice that this equation contains a y_{j,−1} term. This is at the layer below the bottom edge (t = 0) of the x, t-strip. There is really no such layer in a natural sense, but we create it artificially using this backward difference approximation in order to use the initial information ∂y/∂t(x, 0) = g(x) for 0 ≤ x ≤ L. Solve equation (17.15) for y_{j,−1} to get

y_{j,−1} = y_{j,0} − g(jΔx)Δt,

enabling us to determine the appropriate values to fill in on this lowest layer, in terms of known values on level zero and the initial velocity function. This provides the diamond configuration of Figure 17.14 when k = 0.

The strategy now is to begin by filling in the y(x, t) values at the grid points at levels k = −1 and k = 0. Then work up the layers, using equation (17.14) to fill in approximate values of y(x, t) at successively higher layers. With today's computing power, this can be done for a very large number of grid points.

One fine point: the number (cΔt/Δx)² has a bearing on the stability of the method. If this number is less than 1/2, the method is stable and produces approximations that improve as Δx and Δt are chosen smaller (keeping (cΔt/Δx)² < 1/2). If (cΔt/Δx)² > 1/2, the numerical approximations can be unstable, yielding unreliable results.
EXAMPLE 17.3

Consider the problem

∂²y/∂t² = ∂²y/∂x² for 0 < x < 1, t > 0,
y(0, t) = y(1, t) = 0,
y(x, 0) = x cos(πx/2),
∂y/∂t(x, 0) = 1 for 0 ≤ x ≤ 1/2, 0 for 1/2 < x ≤ 1.
The exact solution is

y(x, t) = (4/π²) Σ_{n=1}^∞ (−1)ⁿ [1/(2n + 1)² − 1/(2n − 1)²] sin(nπx) cos(nπt)
        + (2/π) Σ_{n=1}^∞ (1/n) (1 − cos(nπ/2))/(nπ) sin(nπx) sin(nπt).
We will choose N = 10, so Δx = 0.1. Let Δt = 0.025. Then (cΔt/Δx)² = (0.025/0.1)² = 0.0625 < 1/2. The equations for the approximations are

y_{j,k+1} = 0.0625 (y_{j+1,k} − 2y_{j,k} + y_{j−1,k}) + 2y_{j,k} − y_{j,k−1}  for j = 1, ..., 9, k = 0, 1, 2, ...,   (17.16)

y_{j,0} = f(0.1j) for j = 1, ..., 9,

and

y_{j,−1} = y_{j,0} − g(jΔx)Δt = f(0.1j) − 0.025 g(0.1j)  for j = 1, ..., 9.
Note that we take j from 1 through N − 1 = 9 because j = 0 corresponds to the left side of the x, t-strip, and j = N = 10 refers to the right side of this strip, and information is given on these sides: y(0, t) = y(1, t) = 0.

First, compute the values y_{j,−1} on the lowest horizontal level:

y_{1,−1} = 0.07377, y_{2,−1} = 0.16521, y_{3,−1} = 0.24230,
y_{4,−1} = 0.29861, y_{5,−1} = 0.32855, y_{6,−1} = 0.35267,
y_{7,−1} = 0.31779, y_{8,−1} = 0.24721, y_{9,−1} = 0.14079.

Next, compute the approximate values y_{j,0}:

y_{1,0} = 0.09877, y_{2,0} = 0.19021, y_{3,0} = 0.26730,
y_{4,0} = 0.32361, y_{5,0} = 0.35355, y_{6,0} = 0.35267,
y_{7,0} = 0.31779, y_{8,0} = 0.24721, y_{9,0} = 0.14079.
Y8,o = 0.24721, Y9 , o = 0.14079 . Now systematically move up the t- axis, one level at a time. For t = 0 .025 , put equation (17 .16), we hav e yi,l = ( 0 .0625 )( yi+l,o -2yi,o
k
= 0 in
+yi-l,o) +2Yi,o -Yi, r for j=1, . . .,9.
The computed values are : Yl,l = 0.12331, Y2,1 = 0.21431, Y3,1 = 0.29 1 Y4,1
= 0 .34696, Y5,1 = 0 .37662, y6,1 = 0 .35055 ,
y7,1
= 0.31556, Y 8,1 = 0.24160,
y9,I
= 0.13864 .
Next get the approximate values on the k = 2 layer (t = 2(0 .025) = 0 .05) by putting k = 1 in equation (17 .16) and usin g = (0 .0625) ( yi+1,2
-2 Yi,2 + Yj _ 1,2) +2Yi,2 -Yi, l
for j = 1, . . . , 9 . In this way, we can form approximations at lattice points as high as we want in the x, t- strip .
PROBLEMS

In each of Problems 1 through 8, solve the boundary value problem using separation of variables. Graph some of the partial sums of the series solution, for selected values of the time.

1. ∂²y/∂t² = c² ∂²y/∂x² for 0 < x < 2, t > 0,
   y(0, t) = y(2, t) = 0 for t ≥ 0,
   y(x, 0) = 0, ∂y/∂t(x, 0) = g(x) for 0 ≤ x ≤ 2, where
   g(x) = 0 for 0 ≤ x < 1/2 and for 1 < x ≤ 2, 3 for 1/2 ≤ x ≤ 1

2. ∂²y/∂t² = 9 ∂²y/∂x² for 0 < x < 4, t > 0,
   y(0, t) = y(4, t) = 0 for t ≥ 0,
   y(x, 0) = 2 sin(πx), ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 4

3. ∂²y/∂t² = 4 ∂²y/∂x² for 0 < x < 3, t > 0,
   y(0, t) = y(3, t) = 0 for t ≥ 0,
   y(x, 0) = 0, ∂y/∂t(x, 0) = x(3 − x) for 0 ≤ x ≤ 3

4. ∂²y/∂t² = 9 ∂²y/∂x² for 0 < x < π, t > 0,
   y(0, t) = y(π, t) = 0 for t ≥ 0,
   y(x, 0) = sin(x), ∂y/∂t(x, 0) = 1 for 0 ≤ x ≤ π

5. ∂²y/∂t² = 9 ∂²y/∂x² for 0 < x < 2π, t > 0,
   y(0, t) = y(2π, t) = 0 for t ≥ 0,
   y(x, 0) = f(x), ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 2π, where
   f(x) = 3x for 0 ≤ x ≤ π, 6π − 3x for π < x ≤ 2π

6. ∂²y/∂t² = 4 ∂²y/∂x² for 0 < x < 5, t > 0,
   y(0, t) = y(5, t) = 0 for t ≥ 0,
   y(x, 0) = 0, ∂y/∂t(x, 0) = g(x) for 0 ≤ x ≤ 5, where
   g(x) = 0 for 0 ≤ x < 4, 5 − x for 4 ≤ x ≤ 5

7. ∂²y/∂t² = 9 ∂²y/∂x² for 0 < x < 2, t > 0,
   y(0, t) = y(2, t) = 0 for t ≥ 0,
   y(x, 0) = x(x − 2), ∂y/∂t(x, 0) = g(x) for 0 ≤ x ≤ 2, where
   g(x) = 2x for 0 ≤ x ≤ 1, 0 for 1 < x ≤ 2

8. ∂²y/∂t² = 25 ∂²y/∂x² for 0 < x < π, t > 0,
   y(0, t) = y(π, t) = 0 for t ≥ 0,
   y(x, 0) = sin(2x), ∂y/∂t(x, 0) = π − x for 0 ≤ x ≤ π

9. Solve the boundary value problem
   ∂²y/∂t² = ∂²y/∂x² + 2x for 0 < x < 2, t > 0,
   y(0, t) = y(2, t) = 0 for t > 0,
   y(x, 0) = 0, ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 2.
   Graph some partial sums of the series solution. Hint: Upon putting y(x, t) = X(x)T(t), we find that the variables do not separate. Put Y(x, t) = y(x, t) + h(x) and choose h to obtain a boundary value problem that can be solved by Fourier series.

10. Solve
    ∂²y/∂t² = 9 ∂²y/∂x² + x² for 0 < x < 4, t > 0,
    y(0, t) = y(4, t) = 0 for t > 0,
    y(x, 0) = 0, ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 4.
    Graph some partial sums of the solution for values of t.
11. Solve
    ∂²y/∂t² = ∂²y/∂x² − cos(x) for 0 < x < 2π, t > 0,
    y(0, t) = y(2π, t) = 0 for t > 0,
    y(x, 0) = 0, ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 2π.
    Graph some partial sums of the solution for selected values of the time.

12. Transverse vibrations in a homogeneous rod of length π are modeled by the partial differential equation
    a² ∂⁴u/∂x⁴ + ∂²u/∂t² = 0 for 0 < x < π, t > 0.
    Here u(x, t) is the displacement at time t of the cross section through x perpendicular to the x-axis, and a² = EI/ρA, where E is Young's modulus, I is the moment of inertia of a cross section perpendicular to the x-axis, ρ is the constant density, and A the cross-sectional area, assumed constant.
    (a) Let u(x, t) = X(x)T(t) to separate the variables.
    (b) Solve for values of the separation constant and for X and T in the case of free ends:
    ∂²u/∂x²(0, t) = ∂²u/∂x²(π, t) = ∂³u/∂x³(0, t) = ∂³u/∂x³(π, t) = 0 for t > 0.
    (c) Solve for values of the separation constant and for X and T in the case of supported ends:
    u(0, t) = u(π, t) = ∂²u/∂x²(0, t) = ∂²u/∂x²(π, t) = 0 for t > 0.

13. Solve the telegraph equation
    ∂²u/∂t² + A ∂u/∂t + Bu = c² ∂²u/∂x² for 0 < x < L, t > 0.
    Here A and B are positive constants. The boundary conditions are u(0, t) = u(L, t) = 0 for t > 0. The initial conditions are u(x, 0) = f(x), ∂u/∂t(x, 0) = 0 for 0 ≤ x ≤ L. Assume that A²L² < 4(BL² + c²π²).

14. Consider the boundary value problem
    ∂²y/∂t² = 9 ∂²y/∂x² + 5x³ for 0 < x < 4, t > 0,
    y(0, t) = y(4, t) = 0 for t > 0,
    y(x, 0) = cos(πx), ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 4.
    (a) Write a series solution.
    (b) Find a series solution when the term 5x³ is removed from the wave equation.
    (c) In order to gauge the effect of the forcing term on the motion, graph the 40th partial sum of the solution for (a) and (b) on the same set of axes at time t = 0.4 seconds. Repeat this procedure successively for times t = 0.8, 1.4, 2, 2.5, 3 and 4 seconds.

15. Consider the boundary value problem
    ∂²y/∂t² = 9 ∂²y/∂x² + cos(πx) for 0 < x < 4, t > 0,
    y(0, t) = y(4, t) = 0 for t > 0,
    y(x, 0) = x(4 − x), ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 4.
    (a) Write a series solution.
    (b) Find a series solution when the term cos(πx) is removed from the wave equation.
    (c) In order to gauge the effect of the forcing term on the motion, graph the 40th partial sum of the solution for (a) and (b) on the same set of axes at time t = 0.6 seconds. Repeat this procedure successively for times t = 1, 1.4, 2, 3, 5 and 7 seconds.

16. Consider the boundary value problem
    ∂²y/∂t² = 9 ∂²y/∂x² + e^(−t) for 0 < x < 4, t > 0,
    y(0, t) = y(4, t) = 0 for t ≥ 0,
    y(x, 0) = sin(πx), ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 4.
    (a) Write a series solution.
    (b) Find a series solution when the term e^(−t) is removed from the wave equation.
    (c) In order to gauge the effect of the forcing term on the motion, graph the 40th partial sum of the solution for (a) and (b) on the same set of axes at time t = 0.6 seconds. Repeat this procedure successively for times t = 1, 1.4, 2, 3, 5 and 7 seconds.
17. Consider the problem
    ∂²y/∂t² = ∂²y/∂x² for 0 < x < 1, t > 0,
    y(0, t) = y(1, t) = 0 for t > 0,
    y(x, 0) = f(x), ∂y/∂t(x, 0) = 0 for 0 ≤ x ≤ 1, where
    f(x) = x for 0 ≤ x ≤ 1/2, 1 − x for 1/2 < x ≤ 1.
    Use Δx = 0.1 and Δt = 0.025 to compute approximate values of y(x, t) at lattice points in the x, t-strip 0 ≤ x ≤ 1, t ≥ 0. Carry out the computations for five t-layers (that is, for t = 0 through t = 5(0.025) = 0.125).

18. Consider the problem
    ∂²y/∂t² = ∂²y/∂x² for 0 < x < 2, t > 0,
    y(0, t) = y(2, t) = 0 for t > 0,
    y(x, 0) = 0 for 0 ≤ x ≤ 2, ∂y/∂t(x, 0) = 1 for 0 ≤ x ≤ 2.
    Use Δx = 0.1 and Δt = 0.025 and compute approximate values of y(x, t), going up five layers from t = 0 through t = 0.125.

19. Consider the problem
    ∂²y/∂t² = ∂²y/∂x² for 0 < x < 2, t > 0,
    y(0, t) = y(2, t) = 0 for t > 0,
    y(x, 0) = sin(πx) for 0 ≤ x ≤ 2, ∂y/∂t(x, 0) = 1 for 0 ≤ x ≤ 2.
    Use Δx = 0.1 and Δt = 0.025. Compute approximate values of y(x, t), going up five layers from t = 0 through t = 0.125.

20. Consider the problem
    ∂²y/∂t² = ∂²y/∂x² for 0 < x < 1, t > 0,
    y(0, t) = y(1, t) = 0 for t > 0,
    y(x, 0) = x(1 − x)² for 0 ≤ x ≤ 1, ∂y/∂t(x, 0) = x² for 0 ≤ x ≤ 1.
    Use Δx = 0.2 and Δt = 0.025. Compute approximate values of y(x, t), going up five layers from t = 0 through t = 0.125.

21. Consider the problem
    ∂²y/∂t² = ∂²y/∂x² for 0 < x < 1, t > 0,
    y(0, t) = y(1, t) = 0 for t > 0,
    y(x, 0) = 0 for 0 ≤ x ≤ 1, ∂y/∂t(x, 0) = cos(πx) for 0 ≤ x ≤ 1.
    Use Δx = 0.1 and Δt = 0.025. Compute approximate values of y(x, t), going up five layers from t = 0 through t = 0.125.

17.3
Wave Motion Along Infinite and Semi-Infinite Strings

17.3.1 Wave Motion Along an Infinite String

If long distances are involved (such as with sound waves in the ocean used to monitor temperature changes), wave motion is sometimes modeled by an infinite string, in which case there are no boundary conditions. As with the finite string, we will consider separately the cases of zero initial velocity and zero initial displacement.

Zero Initial Velocity
Consider the initial value problem

∂²y/∂t² = c² ∂²y/∂x² for −∞ < x < ∞, t > 0,
y(x, 0) = f(x), ∂y/∂t(x, 0) = 0 for −∞ < x < ∞.
There are no boundary conditions, but we will impose the condition that the solution be a bounded function. To separate the variables, let y(x, t) = X(x)T(t) and obtain, as before,

X″ + λX = 0, T″ + λc²T = 0.

Consider cases on λ.

Case 1: λ = 0. Now X(x) = ax + b. This is a bounded solution if a = 0. Thus λ = 0 is an eigenvalue, with nonzero constant eigenfunctions.

Case 2: λ < 0. Write λ = −ω² with ω > 0. Then X″ − ω²X = 0, with general solution X(x) = ae^(ωx) + be^(−ωx). But e^(ωx) is unbounded on (0, ∞), so we must choose a = 0. And e^(−ωx) is unbounded on (−∞, 0), so we must choose b = 0, leaving only the zero solution. This problem has no negative eigenvalue.

Case 3: λ > 0, say λ = ω² with ω > 0. Now X″ + ω²X = 0, with general solution

X_ω(x) = a cos(ωx) + b sin(ωx).

These functions are bounded for all ω > 0. Thus every positive number λ = ω² is an eigenvalue, with corresponding eigenfunction a cos(ωx) + b sin(ωx) for a and b not both zero. We can include Case 1 in Case 3, since a cos(ωx) + b sin(ωx) = constant if ω = 0.

Now consider the equation for T, which we can now write as T″ + c²ω²T = 0 for ω ≥ 0. This has general solution

T(t) = a cos(ωct) + b sin(ωct).

Now

∂y/∂t(x, 0) = X(x)T′(0) = X(x)ωcb = 0,

so b = 0. Thus solutions for T are constant multiples of T_ω(t) = cos(ωct). For any ω ≥ 0, we now have a function

y_ω(x, t) = X_ω(x)T_ω(t) = [a_ω cos(ωx) + b_ω sin(ωx)] cos(ωct),

which satisfies the wave equation and the condition ∂y/∂t(x, 0) = 0. We need to satisfy the condition y(x, 0) = f(x). For the similar problem on [0, L], we had a function y_n(x, t) for each positive integer n, and we attempted a superposition Σ_{n=1}^∞ y_n(x, t). Now the eigenvalues fill out the entire nonnegative real line, so replace Σ_{n=1}^∞ with ∫₀^∞ ... dω in forming the superposition:
which satisfies the wave equation and the condition at (x, 0) = 0 . We need to satisfy the condition y(x, 0) = f(x) . For the similar problem on [0, L], we had a function y,,(x, t) for each positive integer n, and we attempted a superposition EL I y„(x, t). Now the eigenvalues fill out the entire nonnegative real line, so replace E°° 1 with fo • • dw in forming the superposition : y(x, t)
_ l c°
y w (x, Oda)
=f
[a w cos (cox) + U w sin(wx)] cos(wct) dw .
(17 .17 )
The initial displacement condition requires that

y(x, 0) = ∫₀^∞ [a_ω cos(ωx) + b_ω sin(ωx)] dω = f(x).

The integral on the left is the Fourier integral representation of f(x) for −∞ < x < ∞. Thus choose the constants as the Fourier integral coefficients:

a_ω = (1/π) ∫_{−∞}^∞ f(ξ) cos(ωξ) dξ and b_ω = (1/π) ∫_{−∞}^∞ f(ξ) sin(ωξ) dξ.

With this choice of the coefficients, and certain conditions on f (see the convergence theorem for Fourier integrals in Section 15.1), equation (17.17) is the solution of the problem.
EXAMPLE 17.4

Consider the problem

∂²y/∂t² = c² ∂²y/∂x² for −∞ < x < ∞, t > 0,
y(x, 0) = e^(−|x|), ∂y/∂t(x, 0) = 0 for −∞ < x < ∞.

A graph of the initial position of the string is given in Figure 17.15. To use equation (17.17), compute the Fourier integral coefficients:

a_ω = (1/π) ∫_{−∞}^∞ e^(−|ξ|) cos(ωξ) dξ = 2/(π(1 + ω²))

and

b_ω = (1/π) ∫_{−∞}^∞ e^(−|ξ|) sin(ωξ) dξ = 0.

(For b_ω we need not actually carry out the integration, because the integrand is an odd function.) The solution is

y(x, t) = (2/π) ∫₀^∞ (1/(1 + ω²)) cos(ωx) cos(ωct) dω.
FIGURE 17.15 Graph of y = e^(−|x|).
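As a sanity check on Example 17.4 (a numerical experiment of ours, not part of the text), the integral solution can be evaluated by truncated trapezoidal quadrature and compared with the value ½(e^(−|x−ct|) + e^(−|x+ct|)) that d'Alembert's formula of Section 17.4 predicts for this initial displacement:

```python
import math

def y_example_17_4(x, t, c, w_max=200.0, n=200_000):
    """Trapezoidal evaluation of (2/pi) * integral from 0 to w_max of
    cos(w x) cos(w c t) / (1 + w^2) dw, truncating the infinite range."""
    h = w_max / n
    total = 0.0
    for k in range(n + 1):
        w = k * h
        term = math.cos(w * x) * math.cos(w * c * t) / (1.0 + w * w)
        total += 0.5 * term if k in (0, n) else term
    return (2.0 / math.pi) * h * total

x, t, c = 0.5, 0.25, 2.0
approx = y_example_17_4(x, t, c)
# d'Alembert value (f(x - ct) + f(x + ct)) / 2 with f(x) = e^(-|x|)
exact = 0.5 * (math.exp(-abs(x - c * t)) + math.exp(-abs(x + c * t)))
```

The two values agree to within the truncation error of the quadrature (a few parts in a thousand for this cutoff).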
The solution (17.17) may be written in more compact form as follows. Insert the integral formulas for the coefficients:

y(x, t) = ∫₀^∞ [(1/π) ∫_{−∞}^∞ f(ξ)(cos(ωξ) cos(ωx) + sin(ωξ) sin(ωx)) dξ] cos(ωct) dω,

so that

y(x, t) = (1/π) ∫₀^∞ ∫_{−∞}^∞ f(ξ) cos(ω(ξ − x)) cos(ωct) dξ dω.   (17.18)
Zero Initial Displacement. Suppose now the string is released from the horizontal position (zero initial displacement), with initial velocity g(x). The initial value problem for the displacement function is

∂²y/∂t² = c² ∂²y/∂x² for −∞ < x < ∞, t > 0,
y(x, 0) = 0, ∂y/∂t(x, 0) = g(x) for −∞ < x < ∞.

Let y(x, t) = X(x)T(t) and proceed exactly as in the case of initial displacement f(x) and zero initial velocity, obtaining eigenvalues λ = ω² for ω ≥ 0 and eigenfunctions

X_ω(x) = a_ω cos(ωx) + b_ω sin(ωx).

Turning to T, we obtain, again as before,

T(t) = a cos(ωct) + b sin(ωct).

However, this problem differs from the preceding one in the initial condition on T(t). Now we have y(x, 0) = X(x)T(0) = 0, so T(0) = 0 and hence a = 0. Thus in this case, for each ω ≥ 0, T(t) is a constant multiple of sin(ωct). This gives us functions

y_ω(x, t) = [a_ω cos(ωx) + b_ω sin(ωx)] sin(ωct).
Now use the superposition

y(x, t) = ∫₀^∞ [a_ω cos(ωx) + b_ω sin(ωx)] sin(ωct) dω   (17.19)

in order to satisfy the initial condition. Compute

∂y/∂t = ∫₀^∞ [a_ω cos(ωx) + b_ω sin(ωx)] ωc cos(ωct) dω.
We need

∂y/∂t(x, 0) = ∫₀^∞ [ωc a_ω cos(ωx) + ωc b_ω sin(ωx)] dω = g(x).

This is a Fourier integral expansion of the initial velocity function. With conditions on g (such as are given in the convergence theorem for Fourier integrals), choose

ωc a_ω = (1/π) ∫_{−∞}^∞ g(ξ) cos(ωξ) dξ and ωc b_ω = (1/π) ∫_{−∞}^∞ g(ξ) sin(ωξ) dξ.

Then

a_ω = (1/(πcω)) ∫_{−∞}^∞ g(ξ) cos(ωξ) dξ and b_ω = (1/(πcω)) ∫_{−∞}^∞ g(ξ) sin(ωξ) dξ.

With these coefficients, equation (17.19) is the solution of the problem.
EXAMPLE 17.5

Suppose the initial displacement is zero and the initial velocity is given by

g(x) = e^x for 0 ≤ x ≤ 1, 0 for x < 0 and for x > 1.

A graph of this function is shown in Figure 17.16. To use equation (17.19) to write the displacement function, compute the coefficients:

a_ω = (1/(πcω)) ∫_{−∞}^∞ g(ξ) cos(ωξ) dξ = (1/(πcω)) ∫₀¹ e^ξ cos(ωξ) dξ
    = (1/(πcω)) (e cos(ω) + eω sin(ω) − 1)/(1 + ω²)

FIGURE 17.16 g(x) = e^x for 0 ≤ x ≤ 1; 0 for x < 0 and for x > 1.

and

b_ω = (1/(πcω)) ∫_{−∞}^∞ g(ξ) sin(ωξ) dξ = (1/(πcω)) ∫₀¹ e^ξ sin(ωξ) dξ
    = −(1/(πcω)) (eω cos(ω) − e sin(ω) − ω)/(1 + ω²).

The solution is

y(x, t) = ∫₀^∞ (1/(πcω)) (e cos(ω) + eω sin(ω) − 1)/(1 + ω²) cos(ωx) sin(ωct) dω
        − ∫₀^∞ (1/(πcω)) (eω cos(ω) − e sin(ω) − ω)/(1 + ω²) sin(ωx) sin(ωct) dω.
As in the case of wave motion on [0, L], the solution of a problem with nonzero initial velocity and displacement can be obtained as the sum of the solutions of two problems, in one of which there is no initial displacement, and in the other, zero initial velocity.
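The antiderivatives behind the coefficients in Example 17.5 are easy to misread, so here is a small check of ours (not part of the text): compare trapezoidal quadrature of the defining integrals with the closed forms, for sample values of c and ω.

```python
import math

def trap(fn, a, b, n=20_000):
    """Composite trapezoidal rule for fn on [a, b]."""
    h = (b - a) / n
    s = 0.5 * (fn(a) + fn(b))
    s += sum(fn(a + k * h) for k in range(1, n))
    return h * s

c, w = 2.0, 1.3      # sample values; any c > 0 and w > 0 would do
pi = math.pi
# quadrature of the defining integrals (g vanishes outside [0, 1])
a_quad = trap(lambda s_: math.exp(s_) * math.cos(w * s_), 0.0, 1.0) / (pi * c * w)
b_quad = trap(lambda s_: math.exp(s_) * math.sin(w * s_), 0.0, 1.0) / (pi * c * w)
# closed forms stated in Example 17.5
e = math.e
a_closed = (e * math.cos(w) + e * w * math.sin(w) - 1.0) / (pi * c * w * (1.0 + w * w))
b_closed = -(e * w * math.cos(w) - e * math.sin(w) - w) / (pi * c * w * (1.0 + w * w))
```

Both pairs agree to quadrature accuracy, confirming the signs in the coefficient formulas.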
17.3.2 Wave Motion Along a Semi-Infinite String

We will now consider the problem of wave motion along a string fastened at the origin and stretched along the nonnegative x-axis. Unlike the case of the string along the entire line, there is now one boundary condition, at x = 0. The problem is

∂²y/∂t² = c² ∂²y/∂x² for 0 < x < ∞, t > 0,
y(0, t) = 0 for t ≥ 0,
y(x, 0) = f(x), ∂y/∂t(x, 0) = g(x) for 0 ≤ x < ∞.

Again, we want a bounded solution. Let y(x, t) = X(x)T(t) and obtain

X″ + λX = 0, T″ + λc²T = 0.

In this problem we have a boundary condition: y(0, t) = X(0)T(t) = 0, implying that X(0) = 0. Begin by looking for the eigenvalues λ and corresponding eigenfunctions. Consider cases on λ.

Case 1: λ = 0. Now X(x) = ax + b. Since X(0) = b = 0, then X(x) = ax. This is unbounded on [0, ∞) unless a = 0, so λ = 0 yields no bounded nontrivial solution for X, and 0 is not an eigenvalue.

Case 2: λ is negative. Now write λ = −ω² to obtain X″ − ω²X = 0. This has general solution X(x) = ae^(ωx) + be^(−ωx). Now X(0) = a + b = 0 implies that b = −a, so X(x) = 2a sinh(ωx). This is unbounded for x > 0 unless a = 0, so this problem has no negative eigenvalue.
Case 3: λ is positive. Now write λ = ω² and obtain

X(x) = a cos(ωx) + b sin(ωx).

Since X(0) = a = 0, only the sine terms remain. Thus every positive number is an eigenvalue, with corresponding eigenfunctions nonzero constant multiples of sin(ωx). Now the problem for T is T″ + c²ω²T = 0, with general solution

T(t) = a cos(ωct) + b sin(ωct).

At this point we must isolate the problem into one with zero initial displacement or zero initial velocity. Suppose, to be specific, that g(x) = 0. Then T′(0) = 0, so b = 0 and T(t) must be a constant multiple of cos(ωct). We therefore have functions

y_ω(x, t) = c_ω sin(ωx) cos(ωct)

for each ω > 0. Define the superposition

y(x, t) = ∫₀^∞ c_ω sin(ωx) cos(ωct) dω.

Each such function satisfies the wave equation and the boundary condition, as well as ∂y_ω/∂t(x, 0) = 0 for x > 0. To satisfy the condition on initial displacement, we must choose the coefficients so that

y(x, 0) = ∫₀^∞ c_ω sin(ωx) dω = f(x).

This is the Fourier integral expansion of f(x) on [0, ∞), so choose

c_ω = (2/π) ∫₀^∞ f(ξ) sin(ωξ) dξ.

The solution of the problem is

y(x, t) = ∫₀^∞ ((2/π) ∫₀^∞ f(ξ) sin(ωξ) dξ) sin(ωx) cos(ωct) dω.

If the problem has zero initial displacement, but initial velocity g(x), then a similar analysis leads to the solution

y(x, t) = ∫₀^∞ c_ω sin(ωx) sin(ωct) dω,

where

c_ω = (2/(πcω)) ∫₀^∞ g(ξ) sin(ωξ) dξ.
EXAMPLE 17.6

Consider wave motion along the half-line governed by the problem

∂²y/∂t² = 16 ∂²y/∂x² for x > 0, t > 0,
y(0, t) = 0 for t ≥ 0,
∂y/∂t(x, 0) = 0, y(x, 0) = sin(πx) for 0 ≤ x ≤ 4, 0 for x > 4.

Here c = 4. To write the solution, we need only compute the coefficients

c_ω = (2/π) ∫₀^∞ f(ξ) sin(ωξ) dξ = (2/π) ∫₀⁴ sin(πξ) sin(ωξ) dξ
    = 8 sin(ω) cos(ω) (2 cos²(ω) − 1)/(ω² − π²).

The solution is

y(x, t) = ∫₀^∞ [8 sin(ω) cos(ω)(2 cos²(ω) − 1)/(ω² − π²)] sin(ωx) cos(4ωt) dω.

17.3.3 Fourier Transform Solution of Problems on Unbounded Domains
It is useful to have a variety of tools and methods available to solve boundary value problems. To this end, we will revisit problems of wave motion on the line and half-line and approach the solution through the use of a Fourier transform. First, here is a brief description of what is involved in using a transform.

1. The range of values for the variable in which the transform will be performed is one determining factor in choosing a transform. Another is how the information given in the boundary value problem fits into the operational formula for the transform. For example, the operational formula for the Fourier sine transform is

F_S[f″(x)](ω) = −ω² f̂_S(ω) + ω f(0),

so we must be given information about f(0) in the problem to make use of this transform.

2. If the transform is performed with respect to a variable of the boundary value problem, we obtain a differential equation involving the other variable(s). This differential equation must be solved subject to other information given in the problem. This solution gives the transform of the solution of the original boundary value problem.

3. Once we have the transform of the solution of the boundary value problem, we must invert it to obtain the solution of the boundary value problem itself.

Finally, the Fourier transform of a real-valued function is often complex-valued. If the solution is real-valued, then the real part of the expression obtained using the Fourier transform is the solution. However, because expressions such as e^(−iωx) are often easier to manipulate than cos(ωx) and sin(ωx), we often retain the entire complex expression as the "solution," extracting the real part when we need numerical values, graphs, or other information.

For reference, we will summarize (without conditions on the functions) some facts about the Fourier transform and the Fourier sine and cosine transforms.

Fourier Transform:
F[f](ω) = f̂(ω) = ∫_{−∞}^∞ f(x) e^(−iωx) dx,
f(x) = (1/2π) ∫_{−∞}^∞ f̂(ω) e^(iωx) dω,
F[f″](ω) = −ω² f̂(ω).

Fourier Cosine Transform:
F_C[f](ω) = f̂_C(ω) = ∫₀^∞ f(x) cos(ωx) dx,
f(x) = (2/π) ∫₀^∞ f̂_C(ω) cos(ωx) dω,
F_C[f″](ω) = −ω² f̂_C(ω) − f′(0).

Fourier Sine Transform:
F_S[f](ω) = f̂_S(ω) = ∫₀^∞ f(x) sin(ωx) dx,
f(x) = (2/π) ∫₀^∞ f̂_S(ω) sin(ωx) dω,
F_S[f″](ω) = −ω² f̂_S(ω) + ω f(0).

Fourier Transform Solution of the Wave Equation on the Line. Consider again the problem

∂²y/∂t² = c² ∂²y/∂x² for −∞ < x < ∞, t > 0,
y(x, 0) = f(x), ∂y/∂t(x, 0) = 0 for −∞ < x < ∞.
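Before applying a transform, it is worth seeing the sine-transform operational rule in action numerically (a check of ours, not part of the text). For f(x) = e^(−x) we have f″ = f and f(0) = 1, and the rule F_S[f″](ω) = −ω²f̂_S(ω) + ωf(0) says that both sides equal ω/(1 + ω²):

```python
import math

def sine_transform(fn, w, upper=40.0, n=80_000):
    """Trapezoidal approximation of integral from 0 to upper of
    fn(x) sin(w x) dx (fn must have decayed to ~0 by x = upper)."""
    h = upper / n
    total = 0.0
    for k in range(n + 1):
        x = k * h
        term = fn(x) * math.sin(w * x)
        total += 0.5 * term if k in (0, n) else term
    return h * total

w = 1.7
f = lambda x: math.exp(-x)              # f'' = f and f(0) = 1
lhs = sine_transform(f, w)              # F_S[f''](w), since f'' = f
rhs = -w**2 * sine_transform(f, w) + w  # operational formula with f(0) = 1
```

Both values match the closed form ω/(1 + ω²) to quadrature accuracy.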
Because x varies over the entire line, we can try the Fourier transform in the x variable. To do this, transform y(x, t) as a function of x, leaving t as a parameter. First apply F to the wave equation:

F[∂²y/∂t²](ω) = c² F[∂²y/∂x²](ω).

Because we are transforming in x, leaving t alone, we have

F[∂²y/∂t²](ω) = ∫_{−∞}^∞ (∂²y/∂t²)(x, t) e^(−iωx) dx = (∂²/∂t²) ∫_{−∞}^∞ y(x, t) e^(−iωx) dx = (∂²/∂t²) ŷ(ω, t),

where ŷ(ω, t) is the Fourier transform, with respect to x, of y(x, t). The partial derivative with respect to t passes through the integral with respect to x because x and t are independent. For the Fourier transform, in x, of ∂²y/∂x², use the operational formula:

F[∂²y/∂x²](ω) = −ω² ŷ(ω, t).

The transformed wave equation is therefore

(∂²/∂t²) ŷ(ω, t) = −c²ω² ŷ(ω, t),

or

(∂²/∂t²) ŷ(ω, t) + c²ω² ŷ(ω, t) = 0.

Think of this as an ordinary differential equation for ŷ(ω, t) in t, with ω carried along as a parameter. The general solution has the form

ŷ(ω, t) = a_ω cos(ωct) + b_ω sin(ωct).
We obtain the coefficients by transforming the initial data. First,

ŷ(ω, 0) = a_ω = F[y(x, 0)](ω) = F[f](ω) = f̂(ω),

the transform of the initial position function. Next,

ωc b_ω = (∂ŷ/∂t)(ω, 0) = F[(∂y/∂t)(x, 0)](ω) = F[0](ω) = 0,

because the initial velocity is zero. Therefore b_ω = 0 and

ŷ(ω, t) = f̂(ω) cos(ωct).

We now know the transform of the solution y(x, t). Invert this to find y(x, t):

y(x, t) = (1/2π) ∫_{−∞}^∞ f̂(ω) cos(ωct) e^(iωx) dω.   (17.20)

This is an integral formula for the solution, since f̂(ω) is presumably known to us because we were given f. Since e^(iωx) is complex-valued, we must actually take the real part of this integral to obtain y(x, t). However, the integral is often left in the form of equation (17.20) with the understanding that y(x, t) is the real part.

We will show that the solutions of this problem obtained by Fourier transform and Fourier integral are the same. Write the solution just obtained by transform as

y_tr(x, t) = (1/2π) ∫_{−∞}^∞ f̂(ω) cos(ωct) e^(iωx) dω
 = (1/2π) ∫_{−∞}^∞ (∫_{−∞}^∞ f(ξ) e^(−iωξ) dξ) cos(ωct) e^(iωx) dω
 = (1/2π) ∫_{−∞}^∞ ∫_{−∞}^∞ e^(−iω(ξ−x)) cos(ωct) f(ξ) dω dξ
 = (1/2π) ∫_{−∞}^∞ ∫_{−∞}^∞ [cos(ω(ξ − x)) − i sin(ω(ξ − x))] cos(ωct) f(ξ) dω dξ.

Since the displacement function is real-valued, we must take the real part of this integral, obtaining

y(x, t) = (1/2π) ∫_{−∞}^∞ ∫_{−∞}^∞ cos(ω(ξ − x)) cos(ωct) f(ξ) dω dξ.

Finally, this integrand is an even function of ω, so

(1/2π) ∫_{−∞}^∞ ... dω = 2(1/2π) ∫₀^∞ ... dω = (1/π) ∫₀^∞ ... dω,

yielding

y(x, t) = (1/π) ∫₀^∞ ∫_{−∞}^∞ f(ξ) cos(ω(ξ − x)) cos(ωct) dξ dω.

This agrees with the solution (17.18) obtained by Fourier integral.
EXAMPLE 17.7

Solve for the displacement function on the real line if the initial velocity is zero and the initial displacement function is given by

f(x) = cos(x) for −π/2 ≤ x ≤ π/2, 0 for |x| > π/2.

To use the solution (17.20) we must compute

f̂(ω) = ∫_{−∞}^∞ f(ξ) e^(−iωξ) dξ = ∫_{−π/2}^{π/2} cos(ξ) e^(−iωξ) dξ
     = 2 cos(πω/2)/(1 − ω²) for ω ≠ ±1, and π/2 for ω = ±1.

f̂(ω) is continuous, since

lim_{ω→±1} 2 cos(πω/2)/(1 − ω²) = π/2.

The solution can be written

y(x, t) = (1/π) ∫_{−∞}^∞ (cos(πω/2)/(1 − ω²)) cos(ωct) e^(iωx) dω,

with the understanding that y(x, t) is the real part of the integral on the right. If we explicitly take this real part, then

y(x, t) = (1/π) ∫_{−∞}^∞ (cos(πω/2)/(1 − ω²)) cos(ωx) cos(ωct) dω.
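The transform computed in Example 17.7 can be confirmed numerically (our check, not the text's): since f is even, f̂(ω) reduces to the real integral of cos(ξ) cos(ωξ) over [−π/2, π/2], which should match 2 cos(πω/2)/(1 − ω²) away from ω = ±1 and approach π/2 there:

```python
import math

def fhat_quad(w, n=4_000):
    """Trapezoidal value of integral over [-pi/2, pi/2] of cos(s) cos(w s) ds,
    the (real) Fourier transform of Example 17.7's initial displacement."""
    a, b = -math.pi / 2.0, math.pi / 2.0
    h = (b - a) / n
    total = 0.5 * (math.cos(a) * math.cos(w * a) + math.cos(b) * math.cos(w * b))
    total += sum(math.cos(a + k * h) * math.cos(w * (a + k * h)) for k in range(1, n))
    return h * total

closed = lambda w: 2.0 * math.cos(math.pi * w / 2.0) / (1.0 - w * w)

diff_generic = abs(fhat_quad(0.5) - closed(0.5))   # a generic value of w
diff_limit = abs(fhat_quad(1.0) - math.pi / 2.0)   # removable singularity at w = 1
```

Both differences vanish to quadrature accuracy, confirming the removable singularity at ω = ±1.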
EXAMPLE 17.8

In some instances a clever use of the Fourier transform can yield a closed form solution. Consider the problem

∂²y/∂t² = 9 ∂²y/∂x² for −∞ < x < ∞, t > 0,
y(x, 0) = 4e^(−5|x|) for −∞ < x < ∞,
∂y/∂t(x, 0) = 0.

Take the transform of the differential equation, obtaining as in the discussion

(∂²ŷ/∂t²)(ω, t) = −9ω² ŷ(ω, t),

with general solution

ŷ(ω, t) = a_ω cos(3ωt) + b_ω sin(3ωt).

Now use the initial conditions. Using the initial position function we have

ŷ(ω, 0) = a_ω = F[y(x, 0)](ω) = F[4e^(−5|x|)](ω) = 40/(25 + ω²).

Next, using the initial velocity, write

(∂ŷ/∂t)(ω, 0) = 3ω b_ω = F[(∂y/∂t)(x, 0)](ω) = 0,

so b_ω = 0. Then

ŷ(ω, t) = (40/(25 + ω²)) cos(3ωt).

We can now write the solution in integral form as

y(x, t) = F⁻¹[ŷ(ω, t)](x) = (1/2π) ∫_{−∞}^∞ (40/(25 + ω²)) cos(3ωt) e^(iωx) dω.

However, in this case we can explicitly invert ŷ(ω, t), using some facts about the Fourier transform. Begin by using the convolution theorem to write

y(x, t) = F⁻¹[(40/(25 + ω²)) cos(3ωt)]
        = F⁻¹[40/(25 + ω²)] * F⁻¹[cos(3ωt)]
        = 4e^(−5|x|) * F⁻¹[cos(3ωt)].   (17.21)

We need to compute the inverse Fourier transform of cos(3ωt). Here ω is the variable of the transformed function, with t carried along as a parameter. The variable of the inverse transform will be x. Combine the fact that F[δ(t)](ω) = 1 from Section 15.4.5 with the modulation theorem (Theorem 15.6 in Section 15.3) to get

F[cos(ω₀x)] = π[δ(ω + ω₀) + δ(ω − ω₀)],

in which δ is the Dirac delta function. By the symmetry theorem (Theorem 15.5 of Section 15.3),

F[π[δ(x + ω₀) + δ(x − ω₀)]] = 2π cos(ω₀ω).

Therefore, taking ω₀ = 3t,

F⁻¹[cos(3ωt)](x) = ½[δ(x + 3t) + δ(x − 3t)].

Then, from equation (17.21),

y(x, t) = 4e^(−5|x|) * ½[δ(x + 3t) + δ(x − 3t)] = 2e^(−5|x+3t|) + 2e^(−5|x−3t|),
in which the last line was obtained by using the filtering property of the delta function (Theorem 15.13 of Section 15.4.5). This closed form of the solution is easily verified directly.

Transform Solution of the Wave Equation on a Half-Line. We will use a transform to solve a wave problem on a half-line, with the left end fixed at x = 0. This time we will take the case of zero initial displacement, but a nonzero initial velocity:

∂²y/∂t² = c² ∂²y/∂x² for 0 < x < ∞, t > 0,
y(0, t) = 0 for t ≥ 0,
y(x, 0) = 0, ∂y/∂t(x, 0) = g(x) for 0 ≤ x < ∞.
Now the Fourier transform is inappropriate because both x and t range only over the nonnegative real numbers. We can try the Fourier sine or cosine transform in x. The operational formula for the sine transform requires the value of the solution at x = 0, while the formula for the cosine transform uses the value of the derivative at the origin. Since we are given the condition y(0, t) = 0 (fixed left end of the string), we are led to try the sine transform.

Let ŷ_S(ω, t) be the sine transform of y(x, t) in the x-variable. Take the sine transform of the wave equation. The partial derivatives with respect to t pass through the transform, and we use the operational formula for the transform of the second derivative with respect to x:

F_S[c² ∂²y/∂x²] = −c²ω² ŷ_S(ω, t) + ωc² y(0, t) = −c²ω² ŷ_S(ω, t).

Then

ŷ_S(ω, t) = a_ω cos(ωct) + b_ω sin(ωct).

Now

a_ω = ŷ_S(ω, 0) = F_S[y(x, 0)](ω) = F_S[0](ω) = 0,

and

(∂ŷ_S/∂t)(ω, 0) = ωc b_ω = ĝ_S(ω),

so

b_ω = (1/ωc) ĝ_S(ω).

Therefore

ŷ_S(ω, t) = (1/ωc) ĝ_S(ω) sin(ωct).

This is the sine transform of the solution. The solution is obtained by inverting:

y(x, t) = F_S⁻¹[(1/ωc) ĝ_S(ω) sin(ωct)](x) = (2/π) ∫₀^∞ (1/ωc) ĝ_S(ω) sin(ωx) sin(ωct) dω.
EXAMPLE 17.9

Consider the following problem on a half-line:

∂²y/∂t² = 25 ∂²y/∂x² for x > 0, t > 0,
y(0, t) = 0 for t ≥ 0,
y(x, 0) = 0, ∂y/∂t(x, 0) = g(x) for 0 ≤ x < ∞,

where

g(x) = 9 − x² for 0 ≤ x ≤ 3, 0 for x > 3.

If we use the Fourier sine transform, then the solution is

y(x, t) = (2/π) ∫₀^∞ (1/5ω) ĝ_S(ω) sin(ωx) sin(5ωt) dω.

All that is left to do is compute

ĝ_S(ω) = ∫₀³ (9 − ξ²) sin(ωξ) dξ,

yielding an integral expression for the solution.
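Integrating by parts twice gives a closed form for this transform (our computation, not stated in the text): ĝ_S(ω) = 9/ω + 2/ω³ − 6 sin(3ω)/ω² − 2 cos(3ω)/ω³. The following quick check compares it with direct quadrature of the defining integral:

```python
import math

def gs_quad(w, n=30_000):
    """Trapezoidal value of integral from 0 to 3 of (9 - s^2) sin(w s) ds;
    the integrand vanishes at both endpoints, so they contribute nothing."""
    h = 3.0 / n
    return h * sum((9.0 - (k * h) ** 2) * math.sin(w * k * h) for k in range(1, n))

def gs_closed(w):
    # closed form obtained by integrating by parts twice (our computation)
    return 9.0/w + 2.0/w**3 - 6.0*math.sin(3*w)/w**2 - 2.0*math.cos(3*w)/w**3

# compare at several sample frequencies
diff = max(abs(gs_quad(w) - gs_closed(w)) for w in (0.7, 1.0, 2.5))
```

The two values agree to quadrature accuracy at each sampled ω.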
SECTION 17.3 PROBLEMS
In each of Problems 1 through 6, consider the wave equation ∂²y/∂t² = c² ∂²y/∂x² on the line, for the given value of c and the given initial conditions y(x, 0) = f(x) and ∂y/∂t(x, 0) = g(x). Solve the problem using the Fourier integral and then again using the Fourier transform.

1. c = 12, f(x) = e^(−5|x|), g(x) = 0

2. c = 8, f(x) = 8 − x for 0 ≤ x ≤ 8, 0 for x < 0 and for x > 8; g(x) = 0

3. c = 4, f(x) = sin(x) for −π ≤ x ≤ π, 0 for |x| > π; g(x) = 0

4. c = 1, f(x) = 2 − |x| for −2 ≤ x ≤ 2, 0 for |x| > 2; g(x) = 0

5. c = 3, f(x) = 0, g(x) = e^(−2x) for x ≥ 1, 0 for x < 1

6. c = 2, f(x) = 0, g(x) = 1 for 0 < x < 2, −1 for −2 < x < 0, 0 for x > 2 and for x < −2

In each of Problems 7 through 11, consider the wave equation ∂²y/∂t² = c² ∂²y/∂x² on the half-line x > 0, with y(0, t) = 0 for t ≥ 0, and for the given value of c and the given initial conditions y(x, 0) = f(x) and ∂y/∂t(x, 0) = g(x) for x > 0. Solve the problem using separation of variables (the Fourier sine integral) and then again using the Fourier sine transform.

7. c = 3, f(x) = x(1 − x) for 0 ≤ x ≤ 1, 0 for x > 1; g(x) = 0

8. c = 3, f(x) = 0, g(x) = 0 for 0 ≤ x < 4, 2 for 4 ≤ x ≤ 11, 0 for x > 11

9. c = 2, f(x) = 0, g(x) = cos(x) for π/2 ≤ x ≤ 5π/2, 0 for 0 ≤ x < π/2 and for x > 5π/2

10. c = 6, f(x) = −2e^(−x), g(x) = 0

11. c = 14, f(x) = 0, g(x) = x²(3 − x) for 0 ≤ x ≤ 3, 0 for x > 3

Sometimes the Laplace transform is effective in solving boundary value problems involving the wave equation. Use the Laplace transform to solve the following.

12. ∂²y/∂t² = c² ∂²y/∂x² for x > 0, t > 0,
    y(0, t) = sin(2πt) for 0 ≤ t ≤ 1, 0 for t > 1,
    y(x, 0) = ∂y/∂t(x, 0) = 0 for x > 0.

13. Solve
    ∂²y/∂t² = c² ∂²y/∂x² for x > 0, t > 0,
    y(0, t) = t for t ≥ 0,
    y(x, 0) = 0, ∂y/∂t(x, 0) = A for x > 0.

17.4
Characteristics and d'Alembert's Solution

This section will involve repeated chain rule differentiations, which are efficiently written using subscript notation for partial derivatives. For example, ∂u/∂t = u_t, ∂u/∂x = u_x, ∂²u/∂t² = u_tt, and so on. Our objective is to examine a different perspective on the problem
u_tt = c² u_xx for −∞ < x < ∞, t > 0,
u(x, 0) = f(x), u_t(x, 0) = g(x) for −∞ < x < ∞.
Here we are using u(x, t) as the position function because we will be changing variables from the (x, t) plane to a (ξ, η) plane, and we do not want to confuse the solution function with coordinates of points. This boundary value problem, which we have solved using the Fourier integral and again using the Fourier transform, is referred to as the Cauchy problem for the wave equation. We will write a solution that dates to the eighteenth century.

The lines x − ct = k₁, x + ct = k₂, with k₁ and k₂ any real constants, are called characteristics of the wave equation. These form two families of lines, one consisting of parallel lines with slope 1/c, the other of parallel lines with slope −1/c. Figure 17.17 shows some of these characteristics. We will see that these lines are closely related to the wave motion. However, our first use of them will be to write an explicit solution of the wave equation in terms of the initial data.

Define a change of coordinates

ξ = x − ct, η = x + ct.

This transformation is invertible, since

x = ½(ξ + η), t = (1/2c)(η − ξ).

By the chain rule, u_x = u_ξ + u_η and u_t = c(u_η − u_ξ), so that u_tt − c²u_xx = −4c²u_ξη. In the new coordinates, the wave equation is therefore

u_ξη = 0.
This is called the canonical form of the wave equation, and it is an easy equation to solve. First write it as

(u_η)_ξ = 0.

This means that u_η is independent of ξ, say u_η = h(η). Integrate to get

u(ξ, η) = ∫ h(η) dη + F(ξ),

in which F(ξ) is the "constant" of integration of the partial derivative with respect to ξ. Now ∫ h(η) dη is just another function of η, which we will write as G(η). Thus

u(ξ, η) = F(ξ) + G(η),

824 CHAPTER 17 The Wave Equation

where F and G must be twice continuously differentiable functions of one variable, but are otherwise arbitrary. We have shown that the solution of u_tt = c²u_xx has the form

u(x, t) = F(x − ct) + G(x + ct). (17.22)
Equation (17.22) is called d'Alembert's solution of the wave equation, after the French mathematician Jean le Rond d'Alembert (1717–1783). Every solution of u_tt = c²u_xx must have this form. Now we will show how to choose F and G to satisfy the initial conditions. First,

u(x, 0) = F(x) + G(x) = f(x) (17.23)

and

u_t(x, 0) = −cF′(x) + cG′(x) = g(x). (17.24)

Integrate equation (17.24) and rearrange terms to obtain

−F(x) + G(x) = (1/c) ∫₀ˣ g(ξ) dξ − F(0) + G(0).

Add this equation to equation (17.23) to get

2G(x) = f(x) + (1/c) ∫₀ˣ g(ξ) dξ − F(0) + G(0).

Therefore

G(x) = (1/2) f(x) + (1/(2c)) ∫₀ˣ g(ξ) dξ − (1/2) F(0) + (1/2) G(0). (17.25)

But then, from equation (17.23),

F(x) = f(x) − G(x) = (1/2) f(x) − (1/(2c)) ∫₀ˣ g(ξ) dξ + (1/2) F(0) − (1/2) G(0). (17.26)

Finally, use equations (17.25) and (17.26) to write the solution as

u(x, t) = F(x − ct) + G(x + ct)
        = (1/2) f(x − ct) − (1/(2c)) ∫₀^{x−ct} g(ξ) dξ + (1/2) F(0) − (1/2) G(0)
          + (1/2) f(x + ct) + (1/(2c)) ∫₀^{x+ct} g(ξ) dξ − (1/2) F(0) + (1/2) G(0),

or, after cancellations,

u(x, t) = (1/2)[f(x − ct) + f(x + ct)] + (1/(2c)) ∫_{x−ct}^{x+ct} g(ξ) dξ. (17.27)

Equation (17.27) is d'Alembert's formula for the solution of the Cauchy problem for the wave equation on the entire line. It is an explicit formula for the solution of the Cauchy problem, in terms of the given initial position and velocity functions.
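Because (17.27) involves only the initial data and one integral, it is easy to evaluate numerically. The following sketch is our own illustration, not part of the text; the sample initial data f, g and the trapezoidal quadrature are assumptions made only for the demonstration.

```python
import math

def dalembert(f, g, c, x, t, n=20000):
    """Evaluate d'Alembert's formula (17.27):
    u(x, t) = (f(x - ct) + f(x + ct)) / 2 + (1/(2c)) * integral of g
    over [x - ct, x + ct], using the composite trapezoidal rule."""
    a, b = x - c * t, x + c * t
    if a == b:                      # t = 0: the integral is empty
        return f(a)
    h = (b - a) / n
    s = 0.5 * (g(a) + g(b))
    for k in range(1, n):
        s += g(a + k * h)
    return 0.5 * (f(a) + f(b)) + s * h / (2.0 * c)

# Sample initial data (our choice, for illustration only):
f = lambda x: math.exp(-x * x)
g = lambda x: math.cos(x)
c = 2.0

# u(x, 0) must reproduce the initial position f(x).
assert dalembert(f, g, c, 0.7, 0.0) == f(0.7)

# u_t(x, 0) must reproduce the initial velocity g(x); check by a
# central difference quotient in t.
dt = 1e-5
ut0 = (dalembert(f, g, c, 0.7, dt) - dalembert(f, g, c, 0.7, -dt)) / (2 * dt)
assert abs(ut0 - g(0.7)) < 1e-5

# For u_tt = 4 u_xx with f = exp(-|x|), g = cos(4x), formula (17.27)
# integrates in closed form to
#   u = (exp(-|x - 2t|) + exp(-|x + 2t|)) / 2 + cos(4x) sin(8t) / 8,
# and the quadrature should agree with it.
f2 = lambda x: math.exp(-abs(x))
g2 = lambda x: math.cos(4 * x)
x0, t0 = 0.3, 0.9
closed = (0.5 * (math.exp(-abs(x0 - 2 * t0)) + math.exp(-abs(x0 + 2 * t0)))
          + math.cos(4 * x0) * math.sin(8 * t0) / 8)
assert abs(dalembert(f2, g2, 2.0, x0, t0) - closed) < 1e-5
```

The trapezoidal rule is only one possible quadrature; any method accurate enough for the given g will do, since the formula itself is exact.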
EXAMPLE 17.10

We will solve the boundary value problem

u_tt = 4u_xx for −∞ < x < ∞, t > 0,

u(x, 0) = e^{−|x|}, u_t(x, 0) = cos(4x) for −∞ < x < ∞.

By d'Alembert's formula, we immediately have

u(x, t) = (1/2)(e^{−|x−2t|} + e^{−|x+2t|}) + (1/4) ∫_{x−2t}^{x+2t} cos(4ξ) dξ
        = (1/2)(e^{−|x−2t|} + e^{−|x+2t|}) + (1/16)[sin(4(x + 2t)) − sin(4(x − 2t))]
        = (1/2)(e^{−|x−2t|} + e^{−|x+2t|}) + (1/8) cos(4x) sin(8t).

17.4.1 A Nonhomogeneous Wave Equation
Using the characteristics, we will write an expression for the solution of the nonhomogeneous problem:

u_tt = c²u_xx + F(x, t) for −∞ < x < ∞, t > 0,

u(x, 0) = f(x), u_t(x, 0) = g(x) for −∞ < x < ∞.

This problem is called nonhomogeneous because of the term F(x, t), which we assume to be continuous for all real x and t ≥ 0. F(x, t) can be thought of as an external driving or damping force acting on the string.
Suppose we want the solution at (x₀, t₀). Recall that the characteristics of the wave equation are straight lines in the x, t plane. There are exactly two characteristics through this point, and these are the lines

x − ct = x₀ − ct₀ and x + ct = x₀ + ct₀.

Segments of these characteristics, together with the interval [x₀ − ct₀, x₀ + ct₀], form a characteristic triangle Δ, shown in Figure 17.18. Label the sides of Δ as L, M and I. Since Δ is a region in the x, t plane, we can compute the double integral of −F(x, t) over Δ:

−∬_Δ F(x, t) dA = ∬_Δ (c²u_xx − u_tt) dA = ∬_Δ [∂/∂x (c²u_x) − ∂/∂t (u_t)] dA.
Apply Green's theorem to the last integral, with x and t as the independent variables instead of x and y. This converts the double integral to a line integral around the boundary C of Δ. This piecewise smooth curve, which consists of three line segments, is oriented counterclockwise.

FIGURE 17.18 Characteristic triangle.

We obtain, by Green's theorem,

−∬_Δ F(x, t) dA = ∮_C u_t dx + c²u_x dt.

Now evaluate the line integral on the right by evaluating it on each segment of C in turn. On I, t = 0, so dt = 0, and x varies from x₀ − ct₀ to x₀ + ct₀, so

∫_I u_t dx + c²u_x dt = ∫_{x₀−ct₀}^{x₀+ct₀} u_t(x, 0) dx = ∫_{x₀−ct₀}^{x₀+ct₀} g(x) dx.

Similar evaluations on the two characteristic segments L and M, along which dx = ±c dt, produce terms involving the values of u at the vertices of the triangle. Solve the resulting equation for u(x₀, t₀) to obtain

u(x₀, t₀) = (1/2)[f(x₀ − ct₀) + f(x₀ + ct₀)] + (1/(2c)) ∫_{x₀−ct₀}^{x₀+ct₀} g(ξ) dξ + (1/(2c)) ∬_Δ F(x, t) dA.
We have used the subscript 0 on (x₀, t₀) to focus attention on the point at which we are evaluating the solution. However, this can be any point with x₀ real and t₀ > 0. Thus the solution at an arbitrary point (x, t) is

u(x, t) = (1/2)[f(x − ct) + f(x + ct)] + (1/(2c)) ∫_{x−ct}^{x+ct} g(ξ) dξ + (1/(2c)) ∬_Δ F(ξ, η) dξ dη.

The solution at (x, t) of the problem with the forcing term F(x, t) is therefore d'Alembert's solution for the homogeneous problem (no forcing term), plus (1/(2c)) times the double integral of the forcing term over the characteristic triangle having (x, t) as a vertex.
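One can sanity-check this formula on a case that can be done by hand. The sketch below is our own (the midpoint/trapezoid quadrature over the triangle is an assumption for the demonstration): with F(x, t) = t and f = g = 0, the exact solution is u(x, t) = t³/6, since then u_tt = t = c²u_xx + F and u, u_t vanish at t = 0.

```python
def forced_term(F, c, x, t, n=400, m=200):
    """(1/(2c)) * double integral of F over the characteristic triangle
    with apex (x, t): midpoint rule in time, trapezoidal rule in space."""
    total = 0.0
    dtau = t / n
    for i in range(n):
        tau = (i + 0.5) * dtau      # midpoint of the i-th time slice
        half = c * (t - tau)        # half-width of the triangle at height tau
        dxi = 2.0 * half / m
        s = 0.5 * (F(x - half, tau) + F(x + half, tau))
        for j in range(1, m):
            s += F(x - half + j * dxi, tau)
        total += s * dxi * dtau
    return total / (2.0 * c)

# F(x, t) = t with zero initial data: the exact solution is t**3 / 6.
c, x, t = 3.0, 0.8, 1.2
approx = forced_term(lambda xi, tau: tau, c, x, t)
assert abs(approx - t**3 / 6) < 1e-4
```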
EXAMPLE 17.11

Consider the problem

u_tt = 25u_xx + x²t² for −∞ < x < ∞, t > 0,

u(x, 0) = x cos(x), u_t(x, 0) = e^{−x} for −∞ < x < ∞.

The solution at any point x and time t has the form

u(x, t) = (1/2)[(x − 5t) cos(x − 5t) + (x + 5t) cos(x + 5t)] + (1/10) ∫_{x−5t}^{x+5t} e^{−ξ} dξ + (1/10) ∬_Δ ξ²η² dξ dη.

All we have to do is evaluate the integrals. First,

(1/10) ∫_{x−5t}^{x+5t} e^{−ξ} dξ = −(1/10) e^{−x−5t} + (1/10) e^{−x+5t}.

For the double integral of the forcing term, proceed from Figure 17.19.

In the last example, u(x, t) gives the position function of the string at any given time t. The graph of u(x, t) in the x, t plane is not a snapshot of the string at any time. Rather, a picture of the string at time t is the graph of points (x, u(x, t)), with t fixed at the time of interest. Figure 17.20(a) shows a segment of the string at time t = 0.3, both for the forced and unforced motion. Figure 17.20(b) shows a segment of the string for t = 0.6, again both for unforced and forced motion.
This method of characteristics can also be used to solve boundary value problems involving the wave equation on a bounded interval [0, L]. However, this is a good deal more involved than the solution on the entire line, so we will leave this to a more advanced treatment of partial differential equations.
FIGURE 17.19
FIGURE 17.20(a) Profile of the forced and unforced string at t = 0.3.

FIGURE 17.20(b) Profile of the forced and unforced string at t = 0.6.
17.4.2 Forward and Backward Waves

Continuing with the boundary value problem for the wave equation on the entire real line, we can write d'Alembert's formula (17.27) for the solution as

u(x, t) = (1/2)[f(x − ct) − (1/c) ∫₀^{x−ct} g(ξ) dξ] + (1/2)[f(x + ct) + (1/c) ∫₀^{x+ct} g(ξ) dξ]
        = φ(x − ct) + β(x + ct),

where

φ(x) = (1/2) f(x) − (1/(2c)) ∫₀ˣ g(ξ) dξ and β(x) = (1/2) f(x) + (1/(2c)) ∫₀ˣ g(ξ) dξ.

We call φ(x − ct) a forward (or right) wave, and β(x + ct) a backward (or left) wave. The graph of φ(x − ct) is the graph of φ(x) translated ct units to the right. We may therefore think of φ(x − ct) as the graph of φ(x) moving to the right with velocity c. The graph of β(x + ct) is the graph of β(x) translated ct units to the left. Thus β(x + ct) is the graph of β(x) moving to the left with velocity c. The string profile at time t, given by the graph of y = u(x, t) as a function of x, is the sum of these forward and backward waves at time t.
As an example of this process, consider the boundary value problem in which c = 1,

f(x) = 4 − x² for −2 ≤ x ≤ 2, f(x) = 0 for |x| > 2,

and g(x) = 0. This initial position function is shown in Figure 17.21(a). The solution is a sum of a forward and a backward wave:

u(x, t) = φ(x − ct) + β(x + ct) = (1/2) f(x − t) + (1/2) f(x + t).
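The separation of the two half-height pulses is easy to confirm numerically. A minimal sketch of our own, using the c = 1, g = 0 example above:

```python
def f(x):
    """Initial position: 4 - x^2 on (-2, 2), zero elsewhere."""
    return 4 - x * x if -2 < x < 2 else 0.0

def u(x, t):
    # superposition of a forward and a backward wave (c = 1, g = 0)
    return 0.5 * f(x - t) + 0.5 * f(x + t)

# At t = 0 the two waves coincide and reproduce the initial position.
assert u(1.0, 0.0) == f(1.0)

# By t = 7 the supports [5, 9] and [-9, -5] are disjoint: the middle of
# the string is back at rest, and each pulse has half the original height.
assert u(0.0, 7.0) == 0.0
assert u(7.0, 7.0) == 0.5 * f(0.0) == 2.0
```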
FIGURE 17.21(a) The initial position u(x, 0), with f(x) = 4 − x² for −2 < x < 2 and f(x) = 0 for |x| > 2.

FIGURE 17.21(b) Superposition of forward and backward waves at t = 1.2.

FIGURE 17.21(c) u(x, 1.2).

FIGURE 17.21(d) t = 1.6.

FIGURE 17.21(e) t = 1.8.

FIGURE 17.21(f) t = 2.1.
At any time t, the motion consists of the initial position function translated t units to the right, superimposed on the initial position function translated t units to the left. We see the motion as the initial position function (Figure 17.21(a)) moving simultaneously right and left. Because f(x) vanishes outside of [−2, 2], these forward and backward waves actually separate and become disjoint, one continuing to move to the right, and the other to the left on the real line. This process is shown in Figures 17.21(b) through (h).

FIGURE 17.21(g) t = 3.

FIGURE 17.21(h) t = 7.
PROBLEMS

In each of Problems 1 through 6, determine the characteristics of the wave equation for the problem

u_tt = c²u_xx for −∞ < x < ∞, t > 0,
u(x, 0) = f(x), u_t(x, 0) = g(x) for −∞ < x < ∞

for the given value of c, and write the d'Alembert solution.

1. c = 1, f(x) = x², g(x) = −x
2. c = 4, f(x) = x² − 2x, g(x) = cos(x)
3. c = 7, f(x) = cos(πx), g(x) = 1 − x²
4. c = 5, f(x) = sin(2x), g(x) = x³
5. c = 14, f(x) = eˣ, g(x) = x
6. c = 12, f(x) = −5x + x², g(x) = 3

In each of Problems 7 through 12, solve the problem

u_tt = c²u_xx + F(x, t) for −∞ < x < ∞, t > 0,
u(x, 0) = f(x), u_t(x, 0) = g(x) for −∞ < x < ∞

for the given c, f(x) and g(x).

7. c = 4, f(x) = x, g(x) = e^{−x}, F(x, t) = x + t
8. c = 2, f(x) = sin(x), g(x) = 2x, F(x, t) = 2xt
9. c = 8, f(x) = x² − x, g(x) = cos(2x), F(x, t) = xt²

In each of Problems 13 through 18, write the solution of the problem

u_tt = u_xx for −∞ < x < ∞, t > 0,
u(x, 0) = f(x), u_t(x, 0) = 0 for −∞ < x < ∞

as a sum of a forward and backward wave. Graph the initial position function and then graph the solution at selected times, showing the solution as a superposition of forward and backward waves moving in opposite directions along the real line.

13. f(x) = sin(2x) for −π ≤ x ≤ π, f(x) = 0 for |x| > π
14. f(x) = 1 − |x| for −1 ≤ x ≤ 1, f(x) = 0 for |x| > 1
15. f(x) = cos(x) for −π/2 ≤ x ≤ π/2, f(x) = 0 for |x| > π/2
16. f(x) = 1 − x² for |x| < 1, f(x) = 0 for |x| > 1
17.5 Normal Modes of Vibration of a Circular Elastic Membrane

We will analyze the motion of a membrane (such as a drumhead) fastened onto a circular frame and set in motion with given initial position and velocity. Let the rest position of the membrane be in the x, y plane with the origin at the center, and let the membrane have radius R. Using polar coordinates, the particle of membrane at (r, θ) is assumed to vibrate vertical to the x, y plane, and its displacement from the rest position at time t is z(r, θ, t). Equation (17.4) gives the wave equation for this displacement function:

∂²z/∂t² = c² (∂²z/∂r² + (1/r) ∂z/∂r + (1/r²) ∂²z/∂θ²).

We will assume for the moment that the motion of the membrane is symmetric about the origin, in which case z depends only on r and t. Now the wave equation is

∂²z/∂t² = c² (∂²z/∂r² + (1/r) ∂z/∂r).

Let the initial displacement be given by z(r, 0) = f(r), and let the initial velocity be ∂z/∂t(r, 0) = g(r). Attempt a solution z(r, t) = F(r)T(t). We obtain, after a routine calculation,

T″ + λT = 0 and F″ + (1/r) F′ + (λ/c²) F = 0.

If λ > 0, say λ = ω², the equation for F is a zero order Bessel equation, with general solution

F(r) = a J₀(ωr/c) + b Y₀(ωr/c).

Since Y₀(ωr/c) → −∞ as r → 0+ (the center of the membrane), choose b = 0. Now the equation for T is

T″ + ω²T = 0,

with general solution T(t) = d cos(ωt) + k sin(ωt). We have, for each ω > 0, a function

z_ω(r, t) = a_ω J₀(ωr/c) cos(ωt) + b_ω J₀(ωr/c) sin(ωt).

Since the membrane is fixed on a circular frame,

z_ω(R, t) = a_ω J₀(ωR/c) cos(ωt) + b_ω J₀(ωR/c) sin(ωt) = 0

for t > 0. This condition is satisfied if J₀(ωR/c) = 0. Let j₁, j₂, . . . be the positive zeros of J₀, with j₁ < j₂ < · · ·,
and choose

ωR/c = jₙ, or ωₙ = jₙc/R,

for n = 1, 2, . . . . This yields the eigenvalues of this problem:

λₙ = ωₙ² = c²jₙ²/R².

We now have

zₙ(r, t) = aₙ J₀(jₙr/R) cos(jₙct/R) + bₙ J₀(jₙr/R) sin(jₙct/R).

All of these functions satisfy the boundary condition z(R, t) = 0. To satisfy the initial conditions, attempt a superposition

z(r, t) = Σ_{n=1}^∞ [aₙ J₀(jₙr/R) cos(jₙct/R) + bₙ J₀(jₙr/R) sin(jₙct/R)]. (17.28)
Now

z(r, 0) = f(r) = Σ_{n=1}^∞ aₙ J₀(jₙr/R),

a Fourier-Bessel expansion of f(r). Let s = r/R to convert this series to

f(Rs) = Σ_{n=1}^∞ aₙ J₀(jₙs),

in which s varies from 0 to 1. We know from Section 16.3.3 that the coefficients in this expansion are given by

aₙ = (2/[J₁(jₙ)]²) ∫₀¹ s f(Rs) J₀(jₙs) ds

for n = 1, 2, . . . . Next we must solve for the bₙ's. Compute

∂z/∂t(r, 0) = g(r) = Σ_{n=1}^∞ bₙ (jₙc/R) J₀(jₙr/R).

This is a Fourier-Bessel expansion of g(r). Again referring to Section 16.3.3, we must choose

bₙ (jₙc/R) = (2/[J₁(jₙ)]²) ∫₀¹ s g(Rs) J₀(jₙs) ds,

or

bₙ = (2R/(jₙc [J₁(jₙ)]²)) ∫₀¹ s g(Rs) J₀(jₙs) ds

for n = 1, 2, . . . . With these coefficients, equation (17.28) is the solution for the position function of the membrane.
The numbers ωₙ = jₙc/R are the frequencies of normal modes of vibration, which have periods 2π/ωₙ = 2πR/(jₙc). The normal modes of vibration are the functions zₙ(r, t). Often these functions are written in phase angle form as

zₙ(r, t) = Aₙ J₀(jₙr/R) cos(ωₙt + δₙ),

in which Aₙ and δₙ are constants. The first normal mode is

z₁(r, t) = A₁ J₀(j₁r/R) cos(ω₁t + δ₁).

As r varies from 0 to R, j₁r/R varies from 0 to j₁. At any time t, a radial section through the membrane takes the shape of the graph of J₀(x) for 0 ≤ x ≤ j₁ (Figure 17.22(a)).
The second normal mode is

z₂(r, t) = A₂ J₀(j₂r/R) cos(ω₂t + δ₂).

Now as r varies from 0 to R, j₂r/R varies from 0 to j₂, passing through j₁ along the way. Since J₀(j₂r/R) = 0 when j₂r/R = j₁, this mode has a nodal circle (fixed in the motion) at radius

r = j₁R/j₂.

A section through the membrane takes the shape of the graph of J₀(x) for 0 ≤ x ≤ j₂ (Figure 17.22(b)). Similarly, the third normal mode is

z₃(r, t) = A₃ J₀(j₃r/R) cos(ω₃t + δ₃),

and this mode has two nodes, one at r = j₁R/j₃ and the second at r = j₂R/j₃. Now a radial section has the shape of the graph of J₀(x) for 0 ≤ x ≤ j₃ (Figure 17.22(c)).
In general, the nth normal mode has n − 1 nodes (fixed circles in the motion of the membrane), occurring at j₁R/jₙ, . . . , j_{n−1}R/jₙ. In the next section we will revisit this problem, this time retaining the θ dependence of the displacement function. This will lead us to a solution involving a double Fourier sine series.
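Everything here hinges on the zeros jₙ of J₀. These can be looked up, but they are also easy to compute. The sketch below is our own illustration (not from the text): it evaluates J₀ by its power series, which is adequate for small arguments, and locates j₁ ≈ 2.40483 by bisection, then forms the first-mode frequency ω₁ = j₁c/R for sample values c = 2, R = 1.

```python
import math

def J0(x):
    """Bessel function of the first kind of order zero, via its power
    series sum_k (-1)^k (x/2)^(2k) / (k!)^2 (fine for small x)."""
    term, total = 1.0, 1.0
    for k in range(1, 40):
        term *= -(x * x) / (4.0 * k * k)
        total += term
    return total

def bisect(fn, a, b, tol=1e-12):
    """Locate a sign change of fn on [a, b] by bisection."""
    fa = fn(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if fa * fn(mid) <= 0:
            b = mid
        else:
            a, fa = mid, fn(mid)
    return 0.5 * (a + b)

j1 = bisect(J0, 2.0, 3.0)          # first positive zero of J0
assert abs(j1 - 2.404825557695773) < 1e-9

# Frequency of the first normal mode for sample values c = 2, R = 1:
c, R = 2.0, 1.0
omega1 = j1 * c / R
assert abs(omega1 - 4.809651115391546) < 1e-8
```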
FIGURE 17.22(a) First normal mode.

FIGURE 17.22(b) Second normal mode.

FIGURE 17.22(c) Third normal mode.
SECTION 17.5 PROBLEMS

1. Let c = R = 1, f(r) = 1 − r and g(r) = 0. Using material from Section 16.2 (Bessel functions), approximate the coefficients a₁ through a₅ in the solution given by equation (17.28) and graph the fifth partial sum of the solution for a selection of different times. Write the (approximate) normal modes zₙ(r, t) = Aₙ J₀(jₙr) cos(ωₙt + δₙ) for n = 1, . . . , 5.

2. Repeat Problem 1, except now use f(r) = 1 − r² and g(r) = 0.

3. Repeat Problem 1, but now use f(r) = sin(πr) and g(r) = 0.
17.6 Vibrations of a Circular Elastic Membrane, Revisited

We will continue from the last section with vibrations of an elastic membrane fixed on a circular frame. Now, however, retain the θ-dependence of the displacement function and consider the entire wave equation

∂²z/∂t² = c² (∂²z/∂r² + (1/r) ∂z/∂r + (1/r²) ∂²z/∂θ²)

for 0 < r < R, −π ≤ θ ≤ π, t > 0. We will use the initial conditions

z(r, θ, 0) = f(r, θ), ∂z/∂t(r, θ, 0) = 0,

so the membrane is released from rest with the given initial displacement.
In cylindrical coordinates θ can be replaced by θ + 2nπ for any integer n, so we will also impose the periodicity conditions

z(r, −π, t) = z(r, π, t) and ∂z/∂θ(r, −π, t) = ∂z/∂θ(r, π, t)

for 0 ≤ r ≤ R and t > 0.
Put z(r, θ, t) = F(r)Θ(θ)T(t) in the wave equation to get

T″/(c²T) = (F″ + (1/r)F′)/F + (1/r²) Θ″/Θ = −λ

for some constant λ, since the left side depends only on t, and the right side only on r and θ. Then

T″ + λc²T = 0

and

(r²F″ + rF′ + λr²F)/F = −Θ″/Θ.

Because the left side depends only on r and the right side only on θ, and these are independent, for some constant μ,

(r²F″ + rF′ + λr²F)/F = −Θ″/Θ = μ.

Then

Θ″ + μΘ = 0
and

r²F″ + rF′ + (λr² − μ)F = 0.

In solving these differential equations for T(t), F(r) and Θ(θ), we have the following boundary conditions. First, by periodicity,

Θ(−π) = Θ(π) and Θ′(−π) = Θ′(π).

Next, because the membrane is fixed on the circular frame, F(R) = 0. Finally, because the initial velocity of the membrane is zero, T′(0) = 0.
The problem for Θ(θ) is a periodic Sturm-Liouville problem which was solved in Section 16.3.1 (Example 16.9). The eigenvalues are

μₙ = n² for n = 0, 1, 2, . . . ,

and eigenfunctions are

Θₙ(θ) = aₙ cos(nθ) + bₙ sin(nθ).

With μ = n², the problem for F is

r²F″(r) + rF′(r) + (λr² − n²)F(r) = 0; F(R) = 0.

We have seen (Section 15.2.2) that this differential equation has general solution

F(r) = a Jₙ(√λ r) + b Yₙ(√λ r)

in terms of Bessel functions of order n of the first and second kinds. Because Yₙ(√λ r) is unbounded as r → 0+, choose b = 0 to have a bounded solution. This leaves F(r) = a Jₙ(√λ r). To find admissible values of λ, we need

F(R) = a Jₙ(√λ R) = 0.

We want to satisfy this with a ≠ 0 to avoid a trivial solution. Thus √λ R must be one of the positive zeros of Jₙ. Let these positive zeros be

jₙ₁ < jₙ₂ < · · ·,

doubly indexed because this derivation depends on the choice of μ = n². Then

λₙₖ = jₙₖ²/R²,

with jₙₖ the kth positive zero of Jₙ(x). The λₙₖ's are the eigenvalues. Corresponding eigenfunctions are nonzero multiples of

Jₙ(jₙₖ r/R) for n = 0, 1, 2, . . . and k = 1, 2, . . . .

With these values of λ, the problem for T is

T″ + (c²jₙₖ²/R²) T = 0; T′(0) = 0,

with solutions constant multiples of Tₙₖ(t) = cos(jₙₖct/R).
We can now form functions

zₙₖ(r, θ, t) = [aₙₖ cos(nθ) + bₙₖ sin(nθ)] Jₙ(jₙₖ r/R) cos(jₙₖct/R)

for n = 0, 1, 2, . . . and k = 1, 2, . . . . Each of these functions satisfies the wave equation and the boundary conditions, together with the condition of zero initial velocity. To satisfy the condition that the initial position is given by f, write a superposition

z(r, θ, t) = Σ_{n=0}^∞ Σ_{k=1}^∞ [aₙₖ cos(nθ) + bₙₖ sin(nθ)] Jₙ(jₙₖ r/R) cos(jₙₖct/R).

To see how to choose these coefficients, first write this equation in the form

f(r, θ) = Σ_{k=1}^∞ a₀ₖ J₀(j₀ₖ r/R) + Σ_{n=1}^∞ (Σ_{k=1}^∞ aₙₖ Jₙ(jₙₖ r/R)) cos(nθ) + Σ_{n=1}^∞ (Σ_{k=1}^∞ bₙₖ Jₙ(jₙₖ r/R)) sin(nθ).
For a given r, think of f(r, θ) as a function of θ. The last equation is the Fourier series expansion, on [−π, π], of this function of θ. Since we know the coefficients in the Fourier expansion of a function of θ, we can immediately write

Σ_{k=1}^∞ a₀ₖ J₀(j₀ₖ r/R) = (1/(2π)) ∫_{−π}^{π} f(r, θ) dθ = α₀(r),

and, for n = 1, 2, . . . ,

Σ_{k=1}^∞ aₙₖ Jₙ(jₙₖ r/R) = (1/π) ∫_{−π}^{π} f(r, θ) cos(nθ) dθ = αₙ(r)

and

Σ_{k=1}^∞ bₙₖ Jₙ(jₙₖ r/R) = (1/π) ∫_{−π}^{π} f(r, θ) sin(nθ) dθ = βₙ(r).
Now recognize that, for each n = 0, 1, 2, . . . , the last three equations are expansions of functions of r in series of Bessel functions, with sets of coefficients, respectively, a₀ₖ, aₙₖ and bₙₖ. From Section 16.3.3, we know the coefficients in these expansions:

a₀ₖ = (2/[J₁(j₀ₖ)]²) ∫₀¹ s α₀(Rs) J₀(j₀ₖs) ds for k = 1, 2, . . . ,

and, for n = 1, 2, . . . ,

aₙₖ = (2/[Jₙ₊₁(jₙₖ)]²) ∫₀¹ s αₙ(Rs) Jₙ(jₙₖs) ds for k = 1, 2, . . . ,

and

bₙₖ = (2/[Jₙ₊₁(jₙₖ)]²) ∫₀¹ s βₙ(Rs) Jₙ(jₙₖs) ds for k = 1, 2, . . . .

The idea in calculating the coefficients is to first perform the integrations with respect to θ to obtain the functions α₀(r), αₙ(r) and βₙ(r), written as Fourier-Bessel series. We then obtain the coefficients in these series, which are the aₙₖ's and the bₙₖ's, by evaluating the integrals for the coefficients in this type of eigenfunction expansion. In practice, these integrals must be approximated because the zeros of the Bessel functions of order n can only be approximated.
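The first stage, integrating in θ, is straightforward to carry out numerically. The sketch below is our own illustration (the sample f is an assumption, chosen only so that the answers are known in closed form): it approximates α₀(r), αₙ(r) and βₙ(r) by the trapezoidal rule, which is very accurate for periodic integrands.

```python
import math

def angular_coeffs(f, r, N=2000):
    """Fourier coefficients of theta -> f(r, theta) on [-pi, pi],
    via the trapezoidal rule on a uniform periodic grid."""
    h = 2 * math.pi / N
    thetas = [-math.pi + j * h for j in range(N)]
    a0 = sum(f(r, th) for th in thetas) * h / (2 * math.pi)
    def a(n):
        return sum(f(r, th) * math.cos(n * th) for th in thetas) * h / math.pi
    def b(n):
        return sum(f(r, th) * math.sin(n * th) for th in thetas) * h / math.pi
    return a0, a, b

# Illustrative initial displacement: f(r, theta) = r^2 cos(2 theta) + (1 - r),
# for which alpha_0(r) = 1 - r, alpha_2(r) = r^2, and all other angular
# coefficients vanish.
f = lambda r, th: r * r * math.cos(2 * th) + (1 - r)

a0, a, b = angular_coeffs(f, 0.5)
assert abs(a0 - 0.5) < 1e-12       # alpha_0(0.5) = 1 - 0.5
assert abs(a(2) - 0.25) < 1e-12    # alpha_2(0.5) = 0.5^2
assert abs(a(1)) < 1e-12 and abs(b(2)) < 1e-12
```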
SECTION 17.6 PROBLEMS

1. Approximate the vertical deflection of the center of a circular membrane of radius 2 for any time t > 0 by computing the first three nonzero terms of the solution for the case c = 2 and the initial displacement f(r, θ) = (4 − r²) sin²(θ), with g(r, θ) = 0.

2. Use the solution given in the section to prove the plausible fact that the center of the membrane remains undeflected for all time if the initial displacement is an odd function of θ (that is, f(r, −θ) = −f(r, θ)). Hint: The only integer order Bessel function that is different from zero at r = 0 is J₀.
17.7 Vibrations of a Rectangular Membrane

Consider an elastic membrane stretched across a rectangular frame, to which it is fixed. Suppose the frame and the rectangle it encloses occupy the region of the x, y plane defined by 0 ≤ x ≤ L, 0 ≤ y ≤ K. The membrane is given an initial displacement and released with a given initial velocity. We want to determine the vertical displacement function z(x, y, t). At any time t, the graph of z = z(x, y, t) for 0 ≤ x ≤ L, 0 ≤ y ≤ K is a snapshot of the membrane's position at that time. If we had a film of this function evolving over time, we would have a motion picture of the membrane.
The boundary value problem for z is

∂²z/∂t² = a² (∂²z/∂x² + ∂²z/∂y²) for 0 < x < L, 0 < y < K, t > 0,

z(x, 0, t) = z(x, K, t) = 0 for 0 ≤ x ≤ L, t > 0,

z(0, y, t) = z(L, y, t) = 0 for 0 ≤ y ≤ K, t > 0,

z(x, y, 0) = f(x, y) for 0 ≤ x ≤ L, 0 ≤ y ≤ K,

∂z/∂t(x, y, 0) = g(x, y) for 0 ≤ x ≤ L, 0 ≤ y ≤ K.
We will solve this problem for the case of zero initial velocity, g(x, y) = 0. Attempt a separation of variables, z(x, y, t) = X(x)Y(y)T(t). We get

XYT″ = a²[X″YT + XY″T],

or

T″/(a²T) − Y″/Y = X″/X.

We are unable to isolate three variables on different sides of an equation. However, we can argue that the left side is a function of just y and t, and the right side just of x, and these three variables are independent. Therefore, for some constant λ,

T″/(a²T) − Y″/Y = X″/X = −λ.

Now we have

X″ + λX = 0 and T″/(a²T) + λ = Y″/Y.

In the last equation, the left side depends only on t and the right side only on y, so for some constant μ,

T″/(a²T) + λ = Y″/Y = −μ.

Then

Y″ + μY = 0 and T″ + a²(λ + μ)T = 0.

The variables have been separated, at the cost of introducing two separation constants. Now use the boundary conditions: z(0, y, t) = X(0)Y(y)T(t) = 0 implies that X(0) = 0. Similarly,

X(L) = 0, Y(0) = 0 and Y(K) = 0.

The two problems for X and Y are

X″ + λX = 0; X(0) = X(L) = 0

and

Y″ + μY = 0; Y(0) = Y(K) = 0.

These problems have eigenvalues and eigenfunctions

λₙ = n²π²/L², Xₙ(x) = sin(nπx/L)

and

μₘ = m²π²/K², Yₘ(y) = sin(mπy/K),

with n and m varying independently over the positive integers. The problem for T now becomes

T″ + a²(n²π²/L² + m²π²/K²)T = 0.

Further, because of the assumption of zero initial velocity, ∂z/∂t(x, y, 0) = X(x)Y(y)T′(0) = 0, so T′(0) = 0. Then T must be a constant multiple of

cos(√(n²/L² + m²/K²) πat).
For each positive integer n and m, we now have a function

zₙₘ(x, y, t) = aₙₘ sin(nπx/L) sin(mπy/K) cos(√(n²/L² + m²/K²) πat)

that satisfies all of the conditions of the problem, except possibly the initial condition z(x, y, 0) = f(x, y). For this, use a superposition

z(x, y, t) = Σ_{n=1}^∞ Σ_{m=1}^∞ aₙₘ sin(nπx/L) sin(mπy/K) cos(√(n²/L² + m²/K²) πat).

We must choose the constants to satisfy

z(x, y, 0) = f(x, y) = Σ_{n=1}^∞ Σ_{m=1}^∞ aₙₘ sin(nπx/L) sin(mπy/K).

We can do this by exploiting a trick we used when introducing Fourier series. Pick a positive integer m₀ and multiply both sides of this equation by sin(m₀πy/K) to get

f(x, y) sin(m₀πy/K) = Σ_{n=1}^∞ Σ_{m=1}^∞ aₙₘ sin(nπx/L) sin(mπy/K) sin(m₀πy/K).
Now integrate from 0 to K in the y-variable, leaving terms in x alone. We get

∫₀ᴷ f(x, y) sin(m₀πy/K) dy = Σ_{n=1}^∞ Σ_{m=1}^∞ aₙₘ sin(nπx/L) ∫₀ᴷ sin(mπy/K) sin(m₀πy/K) dy.

By orthogonality of these sine functions on [0, K], all of the integrals are zero except for the term m = m₀. The series in m therefore collapses to a single term, with

∫₀ᴷ sin²(m₀πy/K) dy = K/2

when m = m₀. So far we have

∫₀ᴷ f(x, y) sin(m₀πy/K) dy = Σ_{n=1}^∞ (K/2) aₙₘ₀ sin(nπx/L).

The left side of this equation is a function of x. Pick any positive integer n₀ and multiply this equation by sin(n₀πx/L):

∫₀ᴷ f(x, y) sin(n₀πx/L) sin(m₀πy/K) dy = Σ_{n=1}^∞ (K/2) aₙₘ₀ sin(nπx/L) sin(n₀πx/L).

Integrate, this time in the x-variable:

∫₀ᴸ ∫₀ᴷ f(x, y) sin(n₀πx/L) sin(m₀πy/K) dy dx = Σ_{n=1}^∞ (K/2) aₙₘ₀ ∫₀ᴸ sin(nπx/L) sin(n₀πx/L) dx.

All the integrals on the right are zero except when n = n₀, and this integral is L/2. The last equation becomes

∫₀ᴸ ∫₀ᴷ f(x, y) sin(n₀πx/L) sin(m₀πy/K) dy dx = (L/2)(K/2) aₙ₀ₘ₀.

Dropping the zero subscripts, which were just for ease in keeping track of which integers were fixed, we now have

aₙₘ = (4/(LK)) ∫₀ᴸ ∫₀ᴷ f(x, y) sin(nπx/L) sin(mπy/K) dy dx.

With this choice of the coefficients, we have the solution for the displacement function.
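The collapse of the double series rests entirely on the orthogonality relation ∫₀ᴷ sin(mπy/K) sin(m₀πy/K) dy, which is 0 for m ≠ m₀ and K/2 for m = m₀. A quick numerical confirmation (our own sketch, not from the text):

```python
import math

def overlap(m, m0, K, n=20000):
    """Trapezoidal approximation of the integral of
    sin(m pi y / K) sin(m0 pi y / K) over [0, K]."""
    h = K / n
    s = 0.0
    for k in range(1, n):
        y = k * h
        s += math.sin(m * math.pi * y / K) * math.sin(m0 * math.pi * y / K)
    return s * h  # endpoint terms vanish: sin(0) = sin(m pi) = 0

K = 3.0
assert abs(overlap(2, 5, K)) < 1e-9          # distinct modes: integral is 0
assert abs(overlap(4, 4, K) - K / 2) < 1e-6  # same mode: integral is K/2
```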
EXAMPLE 17.12

Suppose the initial displacement is given by

z(x, y, 0) = x(L − x)y(K − y),

and the initial velocity is zero. The coefficients in the double Fourier expansion are

aₙₘ = (4/(LK)) ∫₀ᴸ ∫₀ᴷ x(L − x)y(K − y) sin(nπx/L) sin(mπy/K) dy dx
    = (4/(LK)) (∫₀ᴸ x(L − x) sin(nπx/L) dx) (∫₀ᴷ y(K − y) sin(mπy/K) dy)
    = (16L²K²/(nmπ²)³) [(−1)ⁿ − 1][(−1)ᵐ − 1].

The solution for the displacement function in this case is

z(x, y, t) = Σ_{n=1}^∞ Σ_{m=1}^∞ (16L²K²/(nmπ²)³) [(−1)ⁿ − 1][(−1)ᵐ − 1] sin(nπx/L) sin(mπy/K) cos(√(n²/L² + m²/K²) πat).
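The closed form for aₙₘ in Example 17.12 can be checked against direct numerical integration. The sketch below is our own (Simpson's rule on each one-dimensional factor, with sample values L = K = 1 and n = m = 1, for which aₙₘ = 64/π⁶):

```python
import math

def simpson(fn, a, b, n=200):
    """Composite Simpson rule (n must be even)."""
    h = (b - a) / n
    s = fn(a) + fn(b)
    for k in range(1, n):
        s += fn(a + k * h) * (4 if k % 2 else 2)
    return s * h / 3

L = K = 1.0
n = m = 1

# The double integral factors into a product of two 1-D integrals.
Ix = simpson(lambda x: x * (L - x) * math.sin(n * math.pi * x / L), 0.0, L)
Iy = simpson(lambda y: y * (K - y) * math.sin(m * math.pi * y / K), 0.0, K)
numeric = 4.0 / (L * K) * Ix * Iy

closed_form = (16 * L**2 * K**2 / (n * m * math.pi**2)**3
               * ((-1)**n - 1) * ((-1)**m - 1))
assert abs(numeric - closed_form) < 1e-8
```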
SECTION 17.7 PROBLEMS

1. Solve

∂²z/∂t² = ∂²z/∂x² + ∂²z/∂y² for 0 < x < 2π, 0 < y < 2π, t > 0,
z(x, 0, t) = z(x, 2π, t) = 0 for 0 ≤ x ≤ 2π, t > 0,
z(0, y, t) = z(2π, y, t) = 0 for 0 ≤ y ≤ 2π, t > 0,
z(x, y, 0) = x² sin(y) for 0 ≤ x ≤ 2π, 0 ≤ y ≤ 2π,
∂z/∂t(x, y, 0) = 0 for 0 ≤ x ≤ 2π, 0 ≤ y ≤ 2π.

2. Solve

∂²z/∂t² = 9(∂²z/∂x² + ∂²z/∂y²) for 0 < x < π, 0 < y < π, t > 0,
z(x, 0, t) = z(x, π, t) = 0 for 0 ≤ x ≤ π, t > 0,
z(0, y, t) = z(π, y, t) = 0 for 0 ≤ y ≤ π, t > 0,
z(x, y, 0) = sin(x) cos(y) for 0 ≤ x ≤ π, 0 ≤ y ≤ π,
∂z/∂t(x, y, 0) = xy for 0 ≤ x ≤ π, 0 ≤ y ≤ π.

3. Solve

∂²z/∂t² = 4(∂²z/∂x² + ∂²z/∂y²) for 0 < x < 2π, 0 < y < 2π, t > 0,
z(x, 0, t) = z(x, 2π, t) = 0 for 0 ≤ x ≤ 2π, t > 0,
z(0, y, t) = z(2π, y, t) = 0 for 0 ≤ y ≤ 2π, t > 0,
z(x, y, 0) = 0 for 0 ≤ x ≤ 2π, 0 ≤ y ≤ 2π,
∂z/∂t(x, y, 0) = 1 for 0 ≤ x ≤ 2π, 0 ≤ y ≤ 2π.
CHAPTER 18

The Heat Equation

Heat and radiation phenomena are often modeled by a partial differential equation called the heat equation. We derived a three-dimensional version of the heat equation, using Gauss's divergence theorem. We will now examine the heat equation more closely and solve it under a variety of conditions, following a program that parallels the one just carried out for the wave equation.
18.1 The Heat Equation and Initial and Boundary Conditions

Let u(x, y, z, t) be the temperature at time t and location (x, y, z) in a region of space. In Section 13.7.2, we showed that u satisfies the partial differential equation

μρ ∂u/∂t = K (∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z²) + ∇K · ∇u,

in which K(x, y, z) is the thermal conductivity of the medium, μ(x, y, z) is the specific heat and ρ(x, y, z) is the density. The term ∇K · ∇u is the dot product of the gradients of K and u. This is the heat equation in three space variables and the time.
If the thermal conductivity of the medium is constant, then ∇K is the zero vector and the term ∇K · ∇u = 0. Now the three-dimensional heat equation is

μρ ∂u/∂t = K (∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z²).

The 1-dimensional heat equation is

∂u/∂t = (K/(μρ)) ∂²u/∂x².

This equation often applies, for example, to heat conduction in a thin bar whose length is much larger than its other dimensions. To get some feeling for what is involved in the one-dimensional heat equation, we will give a separate derivation of it from basic principles.
Consider a straight, thin bar of constant density ρ and constant cross-sectional area A, placed along the x axis from 0 to L. Assume that the sides of the bar are insulated and do not allow heat loss, and that the temperature on the cross section of the bar perpendicular to the x axis at x is a function u(x, t) of x and t only. Let the specific heat μ and the thermal conductivity K be constant.
Consider a typical segment of the bar between x = α and x = β, as in Figure 18.1. By the definition of specific heat, the rate at which heat energy accumulates in this segment is

∫_α^β μρA ∂u/∂t dx.

FIGURE 18.1 u(x, t) = temperature on cross section at x at time t.

By Newton's law of cooling, heat energy flows within this segment from the warmer to the cooler end at a rate equal to K times the negative of the temperature gradient (difference in temperature at the ends of the segment). Therefore, the net rate at which heat energy enters this segment of the bar at time t is

KA ∂u/∂x(β, t) − KA ∂u/∂x(α, t).

Assume that no energy is produced within the segment. Such production could occur, for example, if there is radiation or a heat source such as a chemical reaction. These would also change the mass of the segment with time. In the absence of these effects, the rate at which heat energy accumulates within the segment must balance the rate at which it enters the segment. Therefore

∫_α^β μρA ∂u/∂t dx = KA [∂u/∂x(β, t) − ∂u/∂x(α, t)] = KA ∫_α^β ∂²u/∂x² dx,

so

∫_α^β (μρ ∂u/∂t − K ∂²u/∂x²) dx = 0.

This equation must be true for every α and β with 0 ≤ α < β ≤ L. If the term in parentheses in this integral were nonzero at any x₀ and t₀, then by continuity we could choose an interval (α, β) about x₀ throughout which this term would be strictly positive or strictly negative. But then this integral of a positive or negative function over (α, β) would be, respectively, positive or negative, a contradiction. We conclude that

μρ ∂u/∂t − K ∂²u/∂x² = 0

for 0 < x < L and for t > 0. This is the 1-dimensional heat equation. Often this partial differential equation is written

∂u/∂t = k ∂²u/∂x²,

where k = K/(μρ) is a positive constant depending on the material of the bar. The number k is called the diffusivity of the bar.
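A quick numerical check of our own (not from the text): u(x, t) = e^{−kπ²t} sin(πx) is a classical solution of u_t = k u_xx, and centered difference quotients reproduce the equation to truncation-error accuracy.

```python
import math

k = 0.7  # diffusivity (arbitrary sample value)

def u(x, t):
    """Exact solution of u_t = k u_xx on the line."""
    return math.exp(-k * math.pi**2 * t) * math.sin(math.pi * x)

x0, t0, h = 0.3, 0.2, 1e-4
# centered first difference in t and second difference in x
ut = (u(x0, t0 + h) - u(x0, t0 - h)) / (2 * h)
uxx = (u(x0 + h, t0) - 2 * u(x0, t0) + u(x0 - h, t0)) / (h * h)
assert abs(ut - k * uxx) < 1e-6
```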
This equation certainly does not determine the temperature function u(x, t) uniquely. For example, if u(x, t) is one solution, so is u(x, t) + c for any real number c. For uniqueness of the solution, which we expect in models of physical phenomena, we need boundary conditions, specifying information at the ends of the bar at all times, and initial conditions, giving the temperature throughout the bar at some time usually designated as time zero. The heat equation, together with certain initial and boundary conditions, uniquely determines the temperature distribution throughout the bar at all later times.
For example, we might have the boundary value problem

∂u/∂t = k ∂²u/∂x² for 0 < x < L, t > 0,

u(0, t) = T₁, u(L, t) = T₂ for t > 0,

u(x, 0) = f(x) for 0 ≤ x ≤ L.

This problem models the temperature distribution in a bar of length L, whose left end is kept at constant temperature T₁ and right end at constant temperature T₂, and whose initial temperature in the cross section at x is f(x). The conditions at the ends of the bar are the boundary conditions, and the temperature at time zero is the initial condition.
As a second example, consider the boundary value problem
∂u/∂t = k ∂²u/∂x² for 0 < x < L, t > 0,

∂u/∂x(0, t) = ∂u/∂x(L, t) = 0 for t > 0,

u(x, 0) = f(x) for 0 ≤ x ≤ L.

This problem models the temperature distribution in a bar having no heat loss across its ends. The boundary conditions given in this problem are called insulation conditions.
Still other kinds of boundary conditions can be specified. For example, we might have a combination of fixed temperature and insulation conditions. If the left end is kept at constant temperature T and the right end is insulated, then

u(0, t) = T and ∂u/∂x(L, t) = 0.
Or we might have free radiation (convection), in which the bar loses heat by radiation from its ends into the surrounding medium, which is assumed to be maintained at constant temperature T. Now the model consists of the heat equation, the initial temperature function, and the boundary conditions

\[ \frac{\partial u}{\partial x}(0,t) = A[u(0,t) - T],\quad \frac{\partial u}{\partial x}(L,t) = -A[u(L,t) - T] \]

for t > 0. Here A is a positive constant. Notice that if the bar is kept hotter than the surrounding medium, then the heat flow, as measured by ∂u/∂x, must be positive at the left end and negative at the right end. Boundary conditions

\[ u(0,t) = T_1,\quad \frac{\partial u}{\partial x}(L,t) = -A[u(L,t) - T_2] \]

are used if the left end is kept at the constant temperature T₁ while the right end radiates heat energy into a medium of constant temperature T₂.
CHAPTER 18 The Heat Equation

In 2 space dimensions, with constant thermal conductivity, the heat equation is

\[ \frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right), \]

while in 3 space dimensions it is

\[ \frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right). \]

1. Formulate a boundary value problem modeling heat conduction in a thin bar of length L, if the left end is kept at temperature zero and the right end is insulated. The initial temperature in the cross section at x is f(x).

2. Formulate a boundary value problem modeling heat conduction in a thin bar of length L, if the left end is kept at temperature α(t) and the right end at temperature β(t). The initial temperature in the cross section at x is f(x).

3. Formulate a boundary value problem for the temperature function in a thin bar of length L if the left end is kept insulated and the right end at temperature β(t). The initial temperature in the cross section at x is f(x).
18.2 Fourier Series Solutions of the Heat Equation

In this section we will solve several boundary value problems modeling heat conduction on a bounded interval. For this setting we will use separation of variables and Fourier series.

18.2.1 Ends of the Bar Kept at Temperature Zero
Suppose we want the temperature distribution u(x, t) in a thin, homogeneous (constant density) bar of length L, given that the initial temperature in the bar at time zero in the cross section at x perpendicular to the x axis is f(x). The ends of the bar are maintained at temperature zero for all time. The boundary value problem modeling this temperature distribution is

\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0, \]
\[ u(0,t) = u(L,t) = 0 \quad\text{for } t > 0, \]
\[ u(x,0) = f(x) \quad\text{for } 0 \le x \le L. \]

We will use separation of variables. Substitute u(x, t) = X(x)T(t) into the heat equation to get

\[ XT' = kX''T, \]

or

\[ \frac{T'}{kT} = \frac{X''}{X}. \]
The left side depends only on time, and the right side only on position, and these variables are independent. Therefore, for some constant λ,

\[ \frac{T'}{kT} = \frac{X''}{X} = -\lambda. \]

Now u(0, t) = X(0)T(t) = 0. If T(t) = 0 for all t, then the temperature function has the constant value zero, which occurs if the initial temperature f(x) = 0 for 0 ≤ x ≤ L. Otherwise T(t) cannot be identically zero, so we must have X(0) = 0. Similarly, u(L, t) = X(L)T(t) = 0 implies that X(L) = 0. The problem for X is therefore

\[ X'' + \lambda X = 0;\quad X(0) = X(L) = 0. \]

We seek values of λ (the eigenvalues) for which this problem for X has nontrivial solutions (the eigenfunctions). This problem for X is exactly the same one encountered for the space-dependent function in separating variables in the wave equation. There we found that the eigenvalues are

\[ \lambda_n = \frac{n^2\pi^2}{L^2} \]

for n = 1, 2, ..., and corresponding eigenfunctions are nonzero constant multiples of

\[ X_n(x) = \sin\left(\frac{n\pi x}{L}\right). \]

The problem for T becomes

\[ T' + \frac{n^2\pi^2 k}{L^2}T = 0, \]

which has general solution

\[ T_n(t) = c_n e^{-n^2\pi^2 kt/L^2}. \]
For n = 1, 2, ..., we now have functions

\[ u_n(x,t) = c_n\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2} \]

which satisfy the heat equation on [0, L] and the boundary conditions u(0, t) = u(L, t) = 0. It remains to find a solution satisfying the initial condition. We can choose n and c_n so that

\[ u_n(x,0) = c_n\sin\left(\frac{n\pi x}{L}\right) = f(x) \]

only if the given initial temperature function is a multiple of this sine function. This need not be the case. In general, we must attempt to construct a solution using the superposition

\[ u(x,t) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}. \]

Now we need

\[ u(x,0) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{n\pi x}{L}\right) = f(x),
\]
which we recognize as the Fourier sine expansion of f(x) on [0, L]. Thus choose

\[ c_n = \frac{2}{L}\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi. \]

With this choice of the coefficients, we have the solution for the temperature distribution function:

\[ u(x,t) = \frac{2}{L}\sum_{n=1}^{\infty}\left(\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi\right)\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}. \tag{18.1} \]
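The series (18.1) is straightforward to evaluate numerically. The sketch below is a minimal illustration written for this discussion (the helper name `heat_series` is invented): it computes the coefficients c_n by a trapezoid-rule integration and sums a truncated series.

```python
import numpy as np

def heat_series(f, L, k, n_terms=50, n_quad=2001):
    # Coefficients c_n = (2/L) * integral_0^L f(xi) sin(n pi xi / L) d xi,
    # computed with a trapezoid rule on a fine grid.
    xi = np.linspace(0.0, L, n_quad)
    dx = xi[1] - xi[0]

    def coeff(n):
        y = f(xi) * np.sin(n * np.pi * xi / L)
        return 2.0 / L * float(np.sum((y[1:] + y[:-1]) * dx / 2.0))

    c = [coeff(n) for n in range(1, n_terms + 1)]

    def u(x, t):
        # Truncated series solution (18.1)
        return sum(cn * np.sin(n * np.pi * x / L)
                   * np.exp(-n**2 * np.pi**2 * k * t / L**2)
                   for n, cn in enumerate(c, start=1))

    return u

# Example: f(x) = x(1 - x) on [0, 1]; at t = 0 the partial sum reproduces f
# in the interior, and the temperature decays as t increases.
u = heat_series(lambda x: x * (1.0 - x), L=1.0, k=1.0)
print(u(0.5, 0.0), u(0.5, 0.1))
```

For smooth initial data such as x(1 − x) the coefficients decay like 1/n³, so a modest number of terms already gives an accurate partial sum.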
EXAMPLE 18.1

Suppose the initial temperature function is constant A for 0 ≤ x ≤ L, while the temperature at the ends is maintained at zero. To write the solution for the temperature distribution function, we need to compute

\[ c_n = \frac{2}{L}\int_0^L A\sin\left(\frac{n\pi\xi}{L}\right)d\xi = \frac{2A}{n\pi}\left[1-\cos(n\pi)\right] = \frac{2A}{n\pi}\left[1-(-1)^n\right]. \]

The solution (18.1) is

\[ u(x,t) = \frac{2A}{\pi}\sum_{n=1}^{\infty}\frac{1-(-1)^n}{n}\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}. \]

Since 1 − (−1)ⁿ is zero if n is even, and equals 2 if n is odd, we need only sum over the odd integers and can write

\[ u(x,t) = \frac{4A}{\pi}\sum_{n=1}^{\infty}\frac{1}{2n-1}\sin\left(\frac{(2n-1)\pi x}{L}\right)e^{-(2n-1)^2\pi^2 kt/L^2}. \]
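As a sanity check on this odd-term series, its partial sums can be evaluated directly. The snippet below (with an arbitrary illustrative choice A = 100, L = k = 1; the function name is invented) confirms that at t = 0 the series returns approximately the constant initial temperature at an interior point.

```python
import math

def u_const_ic(x, t, A=100.0, L=1.0, k=1.0, n_terms=2000):
    # Partial sum of (4A/pi) * sum_{n>=1} sin((2n-1) pi x / L) / (2n-1)
    #                          * exp(-(2n-1)^2 pi^2 k t / L^2)
    s = 0.0
    for n in range(1, n_terms + 1):
        m = 2 * n - 1
        s += math.sin(m * math.pi * x / L) / m \
             * math.exp(-(m**2) * math.pi**2 * k * t / L**2)
    return 4.0 * A / math.pi * s

print(u_const_ic(0.5, 0.0))   # close to A = 100
```

At t = 0 the convergence is slow (the coefficients decay only like 1/n, and Gibbs oscillations appear near the ends), but for any t > 0 the exponential factors make the series converge extremely fast.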
Verification of the Solution The function given by equation (18.1) clearly satisfies the boundary and initial conditions of the problem. Each term vanishes at x = 0 and at x = L, and the coefficients were chosen so that u(x, 0) = f(x). If we could differentiate this series term by term, it would also be easy to show that u(x, t) satisfies the heat equation, since each term does. When we were faced with this problem with the wave equation, we used a trigonometric identity to sum the series. Here, because of the rapidly decaying exponential function in u(x, t), we can easily prove that the series converges uniformly. Choose any t₀ > 0. Then, for t ≥ t₀,

\[ \left|\frac{1}{2n-1}\sin\left(\frac{(2n-1)\pi x}{L}\right)e^{-(2n-1)^2\pi^2 kt/L^2}\right| \le \frac{1}{2n-1}e^{-(2n-1)^2\pi^2 kt_0/L^2}. \]

Because the series

\[ \sum_{n=1}^{\infty}\frac{1}{2n-1}e^{-(2n-1)^2\pi^2 kt_0/L^2} \]

converges, the series for u(x, t) converges uniformly for 0 ≤ x ≤ L and t ≥ t₀, by a theorem of Weierstrass often referred to as the M-test. By a similar argument, we can show that the series obtained by differentiating u(x, t) term by term, once with respect to t, or twice with respect to x, also converge uniformly. We can therefore differentiate this series term by term, once with respect to t, and twice with respect to x. Since each term in the series satisfies the heat equation, so does u(x, t), verifying the solution (18.1). We will now consider the problem of heat conduction in a bar with insulated ends.
18.2.2 Temperature in a Bar with Insulated Ends

Consider heat conduction in a bar with insulated ends, hence no energy loss across the ends. If the initial temperature is f(x), the temperature function is modeled by the boundary value problem

\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0, \]
\[ \frac{\partial u}{\partial x}(0,t) = \frac{\partial u}{\partial x}(L,t) = 0 \quad\text{for } t > 0, \]
\[ u(x,0) = f(x) \quad\text{for } 0 \le x \le L. \]

Attempt a separation of variables by putting u(x, t) = X(x)T(t). We obtain, as in the preceding subsection,

\[ X'' + \lambda X = 0,\quad T' + \lambda kT = 0. \]

Now

\[ \frac{\partial u}{\partial x}(0,t) = X'(0)T(t) = 0 \]

implies (except in the trivial case of zero temperature) that X'(0) = 0. Similarly,

\[ \frac{\partial u}{\partial x}(L,t) = X'(L)T(t) = 0 \]

implies that X'(L) = 0. The problem for X(x) is therefore

\[ X'' + \lambda X = 0;\quad X'(0) = X'(L) = 0. \]

The eigenvalues are

\[ \lambda_n = \frac{n^2\pi^2}{L^2} \]

for n = 0, 1, 2, ..., with eigenfunctions nonzero constant multiples of

\[ X_n(x) = \cos\left(\frac{n\pi x}{L}\right). \]

The equation for T is now

\[ T' + \frac{n^2\pi^2 k}{L^2}T = 0. \]

When n = 0 we get T₀(t) = constant. For n = 1, 2, ...,

\[ T_n(t) = c_n e^{-n^2\pi^2 kt/L^2}. \]

We now have functions

\[ u_n(x,t) = c_n\cos\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2} \]

for n = 0, 1, 2, ..., each of which satisfies the heat equation and the insulation boundary conditions. To satisfy the initial condition we must generally use a superposition

\[ u(x,t) = \frac{1}{2}c_0 + \sum_{n=1}^{\infty} c_n\cos\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}. \]
Here we wrote the constant term (n = 0) as c₀/2 in anticipation of a Fourier cosine expansion. Indeed, we need

\[ u(x,0) = f(x) = \frac{1}{2}c_0 + \sum_{n=1}^{\infty} c_n\cos\left(\frac{n\pi x}{L}\right), \tag{18.2} \]

the Fourier cosine expansion of f(x) on [0, L]. (This is also the expansion of the initial temperature function in the eigenfunctions of this problem.) We therefore choose

\[ c_n = \frac{2}{L}\int_0^L f(\xi)\cos\left(\frac{n\pi\xi}{L}\right)d\xi. \]

With this choice of coefficients, we obtain the solution of this boundary value problem.
EXAMPLE 18.2

Suppose the left half of the bar is initially at temperature A, and the right half is kept at temperature zero. Thus

\[ f(x) = \begin{cases} A & \text{for } 0 \le x \le L/2, \\ 0 & \text{for } L/2 < x \le L. \end{cases} \]

Then

\[ c_0 = \frac{2}{L}\int_0^{L/2} A\,d\xi = A \]

and, for n = 1, 2, ...,

\[ c_n = \frac{2}{L}\int_0^{L/2} A\cos\left(\frac{n\pi\xi}{L}\right)d\xi = \frac{2A}{n\pi}\sin(n\pi/2). \]

The solution for this temperature function is

\[ u(x,t) = \frac{1}{2}A + \frac{2A}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{n\pi}{2}\right)\cos\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}. \]

Now sin(nπ/2) is zero if n is even. Further, if n = 2k − 1 is odd, then sin(nπ/2) = (−1)^{k+1}. The solution may therefore be written

\[ u(x,t) = \frac{1}{2}A + \frac{2A}{\pi}\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{2n-1}\cos\left(\frac{(2n-1)\pi x}{L}\right)e^{-(2n-1)^2\pi^2 kt/L^2}. \]
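The closed form for these cosine coefficients is easy to confirm numerically. The following sketch (invented helper names, with arbitrary illustrative values A = 1 and L = 2) compares a midpoint-rule integration against the formula (2A/nπ)sin(nπ/2):

```python
import math

def c_n_numeric(n, A=1.0, L=2.0, steps=20000):
    # c_n = (2/L) * integral_0^{L/2} A cos(n pi xi / L) d xi  (midpoint rule)
    h = (L / 2.0) / steps
    s = sum(math.cos(n * math.pi * (i + 0.5) * h / L) for i in range(steps))
    return 2.0 / L * A * s * h

def c_n_closed(n, A=1.0, L=2.0):
    return 2.0 * A / (n * math.pi) * math.sin(n * math.pi / 2.0)

for n in range(1, 6):
    print(n, c_n_numeric(n), c_n_closed(n))
```

The even-n coefficients come out (numerically) zero, matching the observation that sin(nπ/2) vanishes for even n.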
18.2.3 Temperature Distribution in a Bar with Radiating End

Consider a thin homogeneous bar of length L, with the left end maintained at zero temperature, while the right end radiates energy into the surrounding medium, which is kept at
temperature zero. If the initial temperature in the bar's cross section at x is f(x), then the temperature distribution is modeled by the boundary value problem

\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0, \]
\[ u(0,t) = 0,\quad \frac{\partial u}{\partial x}(L,t) = -Au(L,t) \quad\text{for } t > 0, \]
\[ u(x,0) = f(x) \quad\text{for } 0 \le x \le L. \]

The boundary condition at L assumes that heat energy radiates from this end at a rate proportional to the temperature at that end of the bar. A is a positive constant called the transfer coefficient. Let u(x, t) = X(x)T(t) and obtain

\[ X'' + \lambda X = 0,\quad T' + \lambda kT = 0. \]

Since u(0, t) = X(0)T(t) = 0, then X(0) = 0. The condition at the right end of the bar implies that

\[ X'(L)T(t) = -AX(L)T(t), \]

hence

\[ X'(L) + AX(L) = 0. \]

The problem for X is therefore

\[ X'' + \lambda X = 0;\quad X(0) = 0,\ X'(L) + AX(L) = 0. \]

This is a regular Sturm-Liouville problem, which we solved in Example 16.12 for the case A = 3 and L = 1, with y(x) in place of X(x). We will find the eigenvalues and eigenfunctions in this more general setting by following that analysis. Consider cases on λ.

Case 1: λ = 0. Then X(x) = cx + d. Since X(0) = d = 0, then X(x) = cx. But then X'(L) = c = −AX(L) = −AcL implies that c(1 + AL) = 0. But 1 + AL > 0, so c = 0 and this case has only the trivial solution. Hence 0 is not an eigenvalue of this problem.

Case 2: λ < 0. Write λ = −α² with α > 0. Then X'' − α²X = 0, with general solution

\[ X(x) = ce^{\alpha x} + de^{-\alpha x}. \]

Now X(0) = c + d = 0, so d = −c. Then X(x) = 2c sinh(αx). Next,

\[ X'(L) = 2c\alpha\cosh(\alpha L) = -AX(L) = -2Ac\sinh(\alpha L). \]

Now αL > 0, so cosh(αL) and sinh(αL) are positive, and for c ≠ 0 the two sides of this equation have opposite signs. This equation is therefore impossible unless c = 0. This case yields only the trivial solution for X, so this problem has no negative eigenvalue.
Case 3: λ > 0. Now write λ = α² with α > 0. Then X'' + α²X = 0, so

\[ X(x) = c\cos(\alpha x) + d\sin(\alpha x). \]

Then X(0) = c = 0, so X(x) = d sin(αx). Next,

\[ X'(L) = d\alpha\cos(\alpha L) = -AX(L) = -Ad\sin(\alpha L). \]

Then d = 0 or

\[ \alpha\tan(\alpha L) = -A. \]

We can therefore have a nontrivial solution for X if α is chosen to satisfy this equation. Let z = αL to write this condition as

\[ \tan(z) = -\frac{1}{AL}z. \]

Figure 18.2 shows graphs of y = tan(z) and y = −z/AL in the z, y plane (with z as the horizontal axis). These graphs have infinitely many points of intersection to the right of the vertical axis. Let the z coordinates of these points of intersection be z₁, z₂, ..., written in increasing order. Since α = z/L, then

\[ \lambda_n = \alpha_n^2 = \frac{z_n^2}{L^2} \]

are the eigenvalues of this problem, for n = 1, 2, .... Eigenfunctions are nonzero constant multiples of sin(αₙx), or sin(zₙx/L).
FIGURE 18.2 Eigenvalues of the problem for a bar with a radiating end.
The eigenvalues here are obtained as solutions of a transcendental equation which we cannot solve exactly. Nevertheless, from Figure 18.2 it is clear that there are infinitely many positive eigenvalues, and these can be approximated as closely as we like by numerical techniques. Now the equation for T is

\[ T' + \frac{z_n^2 k}{L^2}T = 0, \]

with general solution

\[ T_n(t) = c_n e^{-z_n^2 kt/L^2}. \]
For each positive integer n, let

\[ u_n(x,t) = X_n(x)T_n(t) = c_n\sin\left(\frac{z_n x}{L}\right)e^{-z_n^2 kt/L^2}. \]

Each of these functions satisfies the heat equation and the boundary conditions. To satisfy the initial condition, we must generally employ a superposition

\[ u(x,t) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{z_n x}{L}\right)e^{-z_n^2 kt/L^2} \]

and choose the cₙ's so that

\[ u(x,0) = \sum_{n=1}^{\infty} c_n\sin\left(\frac{z_n x}{L}\right) = f(x). \]

This is not a Fourier sine series. It is, however, an expansion of the initial temperature function in eigenfunctions of the Sturm-Liouville problem for X. From Section 16.3.3, choose

\[ c_n = \frac{\int_0^L f(\xi)\sin(z_n\xi/L)\,d\xi}{\int_0^L \sin^2(z_n\xi/L)\,d\xi}. \]

The solution is

\[ u(x,t) = \sum_{n=1}^{\infty}\left(\frac{\int_0^L f(\xi)\sin(z_n\xi/L)\,d\xi}{\int_0^L \sin^2(z_n\xi/L)\,d\xi}\right)\sin\left(\frac{z_n x}{L}\right)e^{-z_n^2 kt/L^2}. \]
If we want to compute numerical values of the temperature at different points and times, we must make approximations. As an example, suppose A = L = 1 and f(x) = 1 for 0 ≤ x ≤ 1. Use Newton's method to solve tan(z) = −z approximately to obtain

\[ z_1 \approx 2.0288,\quad z_2 \approx 4.9132,\quad z_3 \approx 7.9787,\quad z_4 \approx 11.0855. \]

Using these values, perform numerical integrations to obtain

\[ c_1 \approx 1.9207,\quad c_2 \approx 2.6593,\quad c_3 \approx 4.1457,\quad c_4 \approx 5.6329. \]

Using just the first four terms, we have the approximation

\[ u(x,t) \approx 1.9207\sin(2.0288x)e^{-4.1160kt} + 2.6593\sin(4.9132x)e^{-24.1395kt} \]
\[ \qquad + 4.1457\sin(7.9787x)e^{-63.6597kt} + 5.6329\sin(11.0855x)e^{-122.8883kt}. \]

Depending on the magnitude of k, these exponentials may be decaying so fast that these first few terms would suffice for some applications.
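The roots zₙ can be reproduced with a few lines of code. The sketch below (an illustrative helper written for this discussion, not taken from the text) uses bisection on the equivalent equation sin z + z cos z = 0, which has the same roots as tan z = −z since cos z ≠ 0 there; the n-th root lies between (2n − 1)π/2 and nπ.

```python
import math

def radiating_root(n, iters=80):
    # n-th positive root of tan(z) = -z (the case A = L = 1), found by
    # bisection on g(z) = sin(z) + z cos(z) over ((2n-1) pi/2, n pi).
    lo, hi = (2 * n - 1) * math.pi / 2, n * math.pi
    g = lambda z: math.sin(z) + z * math.cos(z)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

roots = [radiating_root(n) for n in range(1, 5)]
print([round(z, 4) for z in roots])
# [2.0288, 4.9132, 7.9787, 11.0855]
```

Bisection is slower than Newton's method but needs no derivative and cannot escape the bracketing interval, which is convenient near the poles of tan z.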
18.2.4 Transformations of Boundary Value Problems Involving the Heat Equation

Depending on the partial differential equation and the boundary conditions, it may be impossible to separate the variables in a boundary value problem involving the heat equation. Here is an example of a strategy that works for some problems.
Heat Conduction in a Bar with Ends at Different Temperatures Consider a thin, homogeneous bar extending from x = 0 to x = L. The left end is maintained at constant temperature T₁, and the right end at constant temperature T₂. The initial temperature throughout the bar in the cross section at x is f(x). The boundary value problem modeling this setting is

\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0, \]
\[ u(0,t) = T_1,\quad u(L,t) = T_2 \quad\text{for } t > 0, \]
\[ u(x,0) = f(x) \quad\text{for } 0 \le x \le L. \]
We assume that T₁ and T₂ are not both zero. Attempt a separation of variables by putting u(x, t) = X(x)T(t) into the heat equation to obtain

\[ X'' + \lambda X = 0,\quad T' + \lambda kT = 0. \]

The variables have been separated. However, we must satisfy u(0, t) = X(0)T(t) = T₁. If T₁ = 0, this equation is satisfied by making X(0) = 0. If, however, T₁ ≠ 0, then T(t) = T₁/X(0) = constant. Similarly, u(L, t) = X(L)T(t) = T₂, so T(t) = T₂/X(L) = constant. These conditions are impossible to satisfy except in trivial cases (such as f(x) = 0 and T₁ = T₂ = 0).

We will perturb the temperature distribution function with the idea of obtaining a more tractable problem for the perturbed function. Set

\[ u(x,t) = U(x,t) + \psi(x). \]

Substitute this into the heat equation to get

\[ \frac{\partial U}{\partial t} = k\frac{\partial^2 U}{\partial x^2} + k\psi''(x). \]

This is the standard heat equation for U if we choose ψ so that ψ''(x) = 0. This means that ψ must have the form ψ(x) = cx + d. Now

\[ u(0,t) = T_1 = U(0,t) + \psi(0) \]

becomes the more friendly condition U(0, t) = 0 if ψ(0) = T₁. Thus choose d = T₁. So far ψ(x) = cx + T₁. Next,

\[ u(L,t) = T_2 = U(L,t) + \psi(L) \]

becomes U(L, t) = 0 if

\[ \psi(L) = cL + T_1 = T_2, \]

so choose

\[ c = \frac{1}{L}(T_2 - T_1). \]
Thus let

\[ \psi(x) = \frac{1}{L}(T_2 - T_1)x + T_1. \]

Finally,

\[ u(x,0) = f(x) = U(x,0) + \psi(x) \]

becomes the following initial condition for U:

\[ U(x,0) = f(x) - \psi(x). \]

We now have a boundary value problem for U:

\[ \frac{\partial U}{\partial t} = k\frac{\partial^2 U}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0, \]
\[ U(0,t) = U(L,t) = 0 \quad\text{for } t > 0, \]
\[ U(x,0) = f(x) - \frac{1}{L}(T_2 - T_1)x - T_1 \quad\text{for } 0 \le x \le L. \]

We know the solution of this problem (equation (18.1)), and can immediately write

\[ U(x,t) = \frac{2}{L}\sum_{n=1}^{\infty}\left(\int_0^L\left[f(\xi) - \frac{1}{L}(T_2 - T_1)\xi - T_1\right]\sin\left(\frac{n\pi\xi}{L}\right)d\xi\right)\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}. \]

Once we obtain U(x, t), the solution of the original problem is

\[ u(x,t) = U(x,t) + \frac{1}{L}(T_2 - T_1)x + T_1. \]
Physically we may regard this solution as a decomposition of the temperature distribution into a transient part and a steady-state part. The transient part is U(x, t), which decays to zero as t increases. The other term, ψ(x), equals lim_{t→∞} u(x, t) and is the steady-state part. This part is independent of the time, representing the limiting value which the temperature approaches in the long term. Such decompositions are seen in many physical systems. For example, in a typical electrical circuit the current can be written as a transient part, which decays to zero as time increases, and a steady-state part, which is the limit of the current function as t → ∞.
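The transient/steady-state split carries over directly to a numerical sketch. The helper below is an illustration written for this discussion (the function name and the sample data are invented): it builds ψ, expands f − ψ in a sine series, and returns the assembled solution.

```python
import numpy as np

def solve_nonzero_ends(f, T1, T2, L, k, n_terms=50, n_quad=2001):
    # Steady-state part psi(x) = (T2 - T1) x / L + T1; the transient part U
    # solves the zero-boundary heat problem with initial data f - psi.
    psi = lambda x: (T2 - T1) * x / L + T1
    xi = np.linspace(0.0, L, n_quad)
    dx = xi[1] - xi[0]
    g = f(xi) - psi(xi)

    def coeff(n):
        # (2/L) * integral_0^L [f - psi] sin(n pi xi / L) d xi (trapezoid rule)
        y = g * np.sin(n * np.pi * xi / L)
        return 2.0 / L * float(np.sum((y[1:] + y[:-1]) * dx / 2.0))

    c = [coeff(n) for n in range(1, n_terms + 1)]

    def u(x, t):
        U = sum(cn * np.sin(n * np.pi * x / L)
                * np.exp(-n**2 * np.pi**2 * k * t / L**2)
                for n, cn in enumerate(c, start=1))
        return U + psi(x)

    return u, psi

# Hypothetical data: ends held at 1 and 2, initial temperature identically zero.
u, psi = solve_nonzero_ends(lambda x: np.zeros_like(x), T1=1.0, T2=2.0, L=1.0, k=1.0)
print(u(0.5, 5.0), psi(0.5))   # the solution approaches the steady state
```

For large t the transient terms are negligible and u(x, t) is indistinguishable from the linear steady state ψ(x).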
EXAMPLE 18.3

Suppose, in the above discussion, T₁ = 1 and T₂ = 2, so that ψ(x) = x/L + 1. For the given initial temperature f, compute

\[ b_n = \frac{2}{L}\int_0^L\left[f(\xi) - \frac{1}{L}\xi - 1\right]\sin\left(\frac{n\pi\xi}{L}\right)d\xi. \]

The solution in this case is

\[ u(x,t) = \sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2} + \frac{x}{L} + 1. \]
18.2.5 A Nonhomogeneous Heat Equation

In this section we will consider a nonhomogeneous heat conduction problem on a finite interval:

\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} + F(x,t) \quad\text{for } 0 < x < L,\ t > 0, \]
\[ u(0,t) = u(L,t) = 0 \quad\text{for } t > 0, \]
\[ u(x,0) = f(x) \quad\text{for } 0 \le x \le L. \]

The term F(x, t) could, for example, account for a heat source within the medium. It is easy to check that separation of variables does not work for this heat equation. To develop another approach, go back to the simple case that F(x, t) = 0. In this event we found a solution

\[ u(x,t) = \sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}, \]

in which bₙ is the nth coefficient in the Fourier sine expansion of f(x) on [0, L]. Taking a cue from this, we will attempt a solution of the current problem of the form

\[ u(x,t) = \sum_{n=1}^{\infty} T_n(t)\sin\left(\frac{n\pi x}{L}\right). \tag{18.3} \]

The problem is to determine each Tₙ(t). The strategy for doing this is to derive a differential equation for Tₙ(t). If t is fixed, then the left side of equation (18.3) is just a function of x, and the right side is its Fourier sine expansion on [0, L]. We know the coefficients in this expansion, so

\[ T_n(t) = \frac{2}{L}\int_0^L u(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi. \tag{18.4} \]
Now assume that, for any choice of t > 0, F(x, t), thought of as a function of x, can also be expanded in a Fourier sine series on [0, L]:

\[ F(x,t) = \sum_{n=1}^{\infty} B_n(t)\sin\left(\frac{n\pi x}{L}\right), \tag{18.5} \]

where

\[ B_n(t) = \frac{2}{L}\int_0^L F(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi. \tag{18.6} \]

This coefficient may of course depend on t. Differentiate equation (18.4) to get

\[ T_n'(t) = \frac{2}{L}\int_0^L \frac{\partial u}{\partial t}(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi. \tag{18.7} \]

Substitute for ∂u/∂t from the heat equation to get

\[ T_n'(t) = \frac{2k}{L}\int_0^L \frac{\partial^2 u}{\partial x^2}(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi + \frac{2}{L}\int_0^L F(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi. \]

In view of equation (18.6), this equation becomes

\[ T_n'(t) = \frac{2k}{L}\int_0^L \frac{\partial^2 u}{\partial x^2}(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi + B_n(t). \tag{18.8} \]
Now apply integration by parts twice to the integral on the right side of equation (18.8), at the end making use of the boundary conditions and of equation (18.4):

\[ \int_0^L \frac{\partial^2 u}{\partial x^2}(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi = \left[\frac{\partial u}{\partial x}(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)\right]_0^L - \frac{n\pi}{L}\int_0^L \frac{\partial u}{\partial x}(\xi,t)\cos\left(\frac{n\pi\xi}{L}\right)d\xi \]
\[ = -\frac{n\pi}{L}\left[u(\xi,t)\cos\left(\frac{n\pi\xi}{L}\right)\right]_0^L - \frac{n^2\pi^2}{L^2}\int_0^L u(\xi,t)\sin\left(\frac{n\pi\xi}{L}\right)d\xi \]
\[ = -\frac{n^2\pi^2}{L^2}\cdot\frac{L}{2}T_n(t) = -\frac{n^2\pi^2}{2L}T_n(t). \]

Substitute this into equation (18.8) to get

\[ T_n'(t) = -\frac{n^2\pi^2 k}{L^2}T_n(t) + B_n(t). \]

For n = 1, 2, ..., we now have a first order ordinary differential equation for Tₙ(t):

\[ T_n'(t) + \frac{n^2\pi^2 k}{L^2}T_n(t) = B_n(t). \]
Next, use equation (18.4) to get the condition

\[ T_n(0) = \frac{2}{L}\int_0^L u(\xi,0)\sin\left(\frac{n\pi\xi}{L}\right)d\xi = \frac{2}{L}\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi = b_n, \]

the nth coefficient in the Fourier sine expansion of f(x) on [0, L]. Solve the differential equation for Tₙ(t) subject to this condition to get

\[ T_n(t) = \int_0^t e^{-n^2\pi^2 k(t-\tau)/L^2}B_n(\tau)\,d\tau + b_n e^{-n^2\pi^2 kt/L^2}. \]
Finally, substitute this into equation (18.3) to obtain the solution

\[ u(x,t) = \sum_{n=1}^{\infty}\left(\int_0^t e^{-n^2\pi^2 k(t-\tau)/L^2}B_n(\tau)\,d\tau\right)\sin\left(\frac{n\pi x}{L}\right) \]
\[ \qquad + \frac{2}{L}\sum_{n=1}^{\infty}\left(\int_0^L f(\xi)\sin\left(\frac{n\pi\xi}{L}\right)d\xi\right)\sin\left(\frac{n\pi x}{L}\right)e^{-n^2\pi^2 kt/L^2}. \tag{18.9} \]

Notice that the last term is the solution of the problem if the term F(x, t) is missing, while the first term is the effect of the source term on the solution.
EXAMPLE 18.4

Solve the problem

\[ \frac{\partial u}{\partial t} = 4\frac{\partial^2 u}{\partial x^2} + xt \quad\text{for } 0 < x < \pi,\ t > 0, \]
\[ u(0,t) = u(\pi,t) = 0 \quad\text{for } t > 0, \]
\[ u(x,0) = f(x) = \begin{cases} 20 & \text{for } 0 \le x \le \pi/4, \\ 0 & \text{for } \pi/4 < x \le \pi. \end{cases} \]

Since we have a formula for the solution, we need only carry out the required integrations. First compute

\[ B_n(t) = \frac{2}{\pi}\int_0^{\pi} \xi t\sin(n\xi)\,d\xi = \frac{2(-1)^{n+1}}{n}t. \]

Now we can evaluate

\[ \int_0^t e^{-4n^2(t-\tau)}B_n(\tau)\,d\tau = \frac{2(-1)^{n+1}}{n}\int_0^t \tau e^{-4n^2(t-\tau)}\,d\tau = \frac{(-1)^{n+1}}{8n^5}\left(-1 + 4n^2 t + e^{-4n^2 t}\right). \]

Finally, we need

\[ b_n = \frac{2}{\pi}\int_0^{\pi} f(\xi)\sin(n\xi)\,d\xi = \frac{2}{\pi}\int_0^{\pi/4} 20\sin(n\xi)\,d\xi = \frac{40}{\pi}\,\frac{1-\cos(n\pi/4)}{n}. \]

We can now write the solution

\[ u(x,t) = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{8n^5}\left(-1 + 4n^2 t + e^{-4n^2 t}\right)\sin(nx) + \sum_{n=1}^{\infty}\frac{40}{\pi}\,\frac{1-\cos(n\pi/4)}{n}\sin(nx)e^{-4n^2 t}. \]

The second term on the right is the solution of the problem with the term xt deleted in the heat equation. Denote this "non-source" solution as

\[ u_0(x,t) = \sum_{n=1}^{\infty}\frac{40}{\pi}\,\frac{1-\cos(n\pi/4)}{n}\sin(nx)e^{-4n^2 t}. \]

The solution with the source term is

\[ u(x,t) = u_0(x,t) + \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{8n^5}\left(-1 + 4n^2 t + e^{-4n^2 t}\right)\sin(nx). \]
To gauge the effect on the solution of the term xt in the heat equation, Figures 18.3(a) through (d) show graphs of u(x, t) and u₀(x, t) at times t = 0.3, 0.8, 1.2 and 1.32. Both solutions decay to zero quite rapidly as time increases. This is shown in Figure 18.4, which shows the evolution of u₀(x, t) over these times, and Figure 18.5, which follows u(x, t). The effect of the xt term is to retard this decay. Other terms F(x, t) would of course have different effects.
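The time integral contributed by the source can be checked against the closed form computed in Example 18.4. The sketch below (helper names invented for this illustration) evaluates the Duhamel-type integral for a single mode with a trapezoid rule and compares it with the closed form.

```python
import math

def B_n(n, t):
    # Sine coefficient of the source F(x, t) = x t on (0, pi):
    # B_n(t) = (2/pi) * integral_0^pi xi t sin(n xi) d xi = 2 (-1)^(n+1) t / n
    return 2.0 * (-1) ** (n + 1) * t / n

def source_part(n, t, steps=4000):
    # int_0^t exp(-4 n^2 (t - tau)) B_n(tau) d tau  (trapezoid rule);
    # the decay rate 4 n^2 comes from k = 4 and L = pi.
    mu = 4.0 * n**2
    h = t / steps
    s = 0.5 * (math.exp(-mu * t) * B_n(n, 0.0) + B_n(n, t))
    for i in range(1, steps):
        tau = i * h
        s += math.exp(-mu * (t - tau)) * B_n(n, tau)
    return s * h

# Closed form: (-1)^(n+1) (-1 + 4 n^2 t + e^{-4 n^2 t}) / (8 n^5)
n, t = 1, 0.5
closed = (-1) ** (n + 1) * (-1 + 4 * n**2 * t + math.exp(-4 * n**2 * t)) / (8 * n**5)
print(source_part(n, t), closed)
```

Agreement to several decimal places gives a quick independent check on both the integration by parts and the sign conventions in formula (18.9).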
FIGURE 18.3(a)-(d) Comparison of solutions with and without a source term at t = 0.3, 0.8, 1.2, and 1.32.

FIGURE 18.4 u₀(x, t) at times t = 0.3, 0.8, 1.2, and 1.32.

FIGURE 18.5 u(x, t) at times t = 0.3, 0.8, 1.2, and 1.32.
18.2.6 Effects of Boundary Conditions and Constants on Heat Conduction

We have solved several problems involving heat conduction in a thin homogeneous bar of finite length. As we did with wave motion on an interval, computing power enables us to examine the effects of various constants or terms appearing in these problems on the behavior of the solutions.
EXAMPLE 18.5

Consider a thin bar of length π, whose initial temperature is given by f(x) = x²cos(x/2). The ends of the bar are assumed to be maintained at zero temperature. The temperature function satisfies

\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \quad\text{for } 0 < x < \pi,\ t > 0, \]
\[ u(0,t) = u(\pi,t) = 0 \quad\text{for } t > 0, \]
\[ u(x,0) = x^2\cos(x/2) \quad\text{for } 0 \le x \le \pi. \]

The solution is

\[ u(x,t) = \frac{2}{\pi}\sum_{n=1}^{\infty}\left(\int_0^{\pi}\xi^2\cos(\xi/2)\sin(n\xi)\,d\xi\right)\sin(nx)e^{-n^2 kt}. \]

We can examine the effects of the diffusivity constant k on this solution by drawing graphs of y = u(x, t) for various times, with different choices of this constant. Figure 18.6(a) shows the temperature distributions at time t = 0.2, for k taking the values 0.3, 0.6, 1.1 and 2.7. Figure 18.6(b) shows the temperature distributions at time t = 1.2 for these values of k.
FIGURE 18.6(a) Solution at time t = 0.2 with k = 0.3, 0.6, 1.1, and 2.7. FIGURE 18.6(b) Solution at time t = 1.2 with k = 0.3, 0.6, 1.1, and 2.7.
EXAMPLE 18.6

What difference does it make in the temperature distribution whether the ends are insulated or kept at temperature zero? Consider an initial temperature function f(x) = x²(π − x), with a bar of length π. Let the diffusivity be k = 1/4. The solution if the ends are kept at temperature zero is

\[ u_1(x,t) = \sum_{n=1}^{\infty}\frac{8(-1)^{n+1} - 4}{n^3}\sin(nx)e^{-n^2 t/4}. \]

The solution if the ends are insulated is

\[ u_2(x,t) = \frac{\pi^3}{12} + \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{n^2\pi^2(-1)^{n+1} + 6\left[(-1)^n - 1\right]}{n^4}\cos(nx)e^{-n^2 t/4}. \]
Figures 18.7(a) through (d) compare these two solutions for different values of the time. Figure 18.8(a) shows the evolution of the solution with zero end temperatures at different times, and Figure 18.8(b) shows this evolution for the solution with insulated ends.
18.2.7 Numerical Approximation of Solutions

Consider the standard heat conduction problem

\[ \frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2} \quad\text{for } 0 < x < L,\ t > 0, \]
\[ u(0,t) = u(L,t) = 0 \quad\text{for } t > 0, \]
\[ u(x,0) = f(x) \quad\text{for } 0 \le x \le L. \]

One strategy for computing a numerical approximation of the solution is to begin by forming a grid over the x, t strip 0 ≤ x ≤ L, t ≥ 0, as we did with the wave equation on a bounded interval.
FIGURE 18.7(a)-(d) Comparison of the solution with insulated ends with the solution having ends kept at zero temperature, at times t = 0.4, 0.9, 1.5, and 3.6.
FIGURE 18.8(a) Evolution with time of the solution with ends kept at zero temperature. FIGURE 18.8(b) Evolution of the solution with insulated ends.
Choose Δx = L/N, where N is a positive integer, and let x_j = jΔx for j = 0, 1, ..., N. Also choose Δt positive. This defines lattice points (x_j, t_k) = (jΔx, kΔt). Denote u(jΔx, kΔt) = u_{j,k}. Use difference approximations to the derivatives to replace the heat equation with

\[ \frac{u_{j,k+1} - u_{j,k}}{\Delta t} = k\,\frac{u_{j+1,k} - 2u_{j,k} + u_{j-1,k}}{(\Delta x)^2}. \]

In the heat equation, the partial derivative in t is first order, so this equation uses a forward difference approximation to ∂u/∂t on the left and a centered difference approximation to ∂²u/∂x² on the right. Solve this equation for u_{j,k+1}:

\[ u_{j,k+1} = \frac{k\,\Delta t}{(\Delta x)^2}\left(u_{j+1,k} - 2u_{j,k} + u_{j-1,k}\right) + u_{j,k}. \]
This enables us to approximate solution values at lattice points on the (k + 1)st horizontal level, from information at the next lower level, where approximations have already been made (Figure 18.9).

FIGURE 18.9 Approximation of u(x_j, t_{k+1}) is based on approximate values at three points in the t_k layer.
Since we are moving up the layers of lattice points, filling in approximations at each layer from the layer below, there must be a starting layer at which we already have information. Data for a starting layer is provided by the initial and boundary conditions:

\[ u_{0,k} = u_{N,k} = 0 \]
(values at lattice points on the left and right sides of the strip), and

\[ u_{j,0} = f(x_j) = f(j\Delta x). \]

These values are indicated in Figure 18.10.
FIGURE 18.10 Boundary data give exact values of u(x, t) at lattice points on the boundary of the strip.

The quantity k(Δt)/(Δx)² should be less than 1/2 to ensure stability of the method.
EXAMPLE 18.7

Consider the problem

\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \quad\text{for } 0 < x < 1,\ t > 0, \]
\[ u(0,t) = u(1,t) = 0 \quad\text{for } t > 0, \]
\[ u(x,0) = x(1-x) \quad\text{for } 0 \le x \le 1. \]

This has exact solution

\[ u(x,t) = \frac{8}{\pi^3}\sum_{n=1}^{\infty}\frac{1}{(2n-1)^3}\sin((2n-1)\pi x)e^{-(2n-1)^2\pi^2 t}. \]

To make numerical approximations, we will choose Δx = 0.1 (N = 10) and Δt = 0.0025. In this example, k = 1, so k(Δt)/(Δx)² = 1/4 < 1/2. We know that

\[ u_{0,k} = u_{10,k} = 0. \]

Further,

\[ u_{j,0} = f(j\Delta x) = j(0.1)\left(1 - j(0.1)\right). \]

This initiates the approximation. These values are filled in at the lowest (t = 0) level lattice points in Figure 18.11. To move from one horizontal layer to the next one up (according to the idea of Figure 18.9), use

\[ u_{j,k+1} = 0.25\left(u_{j+1,k} - 2u_{j,k} + u_{j-1,k}\right) + u_{j,k}. \]

From here go on to the k = 1 (t = 0.0025) level, obtaining the values shown in Figure 18.12. Figure 18.13 shows the next level, k = 2, or t = 0.005. And Figure 18.14 shows the k = 3, or t = 0.0075, level. Proceeding in this way, we can fill in approximate values at lattice points on any vertical level in the lattice.
FIGURE 18.11 Values of u_{j,0}: 0, 0.09, 0.16, 0.21, 0.24, 0.25, 0.24, 0.21, 0.16, 0.09, 0. The boundary values u_{0,k} and u_{N,k} are known at lattice boundary points.

FIGURE 18.12 Approximate values at the t₁ = 0.0025 level, computed from known values at the t₀ = 0 level: 0, 0.085, 0.155, 0.205, 0.235, 0.245, 0.235, 0.205, 0.155, 0.085, 0.

Approximate values of the solution u(x, t) at successive t-levels.
In each of Problems 1 through 7, write a solution of the boundary value problem. Graph the twentieth partial su m of the temperature distribution function on the same se t of axes for different values of the time . 1.
au _ az tt for0<x0 at k axz u(0, t) = u(L, t) = 0 for t > 0 u(x, 0) = x(L - x)
2.
au a2 u for 0 < x < L, t > 0 a t 4 axz u(0, t) = u(L, t) = 0 fort > 0 for 0
10.A thin, homogeneous bar of length L has initial temperature equal to a constant B, and the right en d (x = L) is insulated, while the left end is kept at a zero temperature . Find the temperature distribution in the bar. 11. A thin, homogeneous bar of thermal diffusivity 9 and length 2 cm and insulated sides has its left en d maintained at temperature zero, while its right en d is perfectly insulated . The bar has an initial temperature f(x) = x2 for 0 < x < 2 . Determine the temperature distribution in the bar . What is lim t , co u(x, t) ?
for 0 < x < L
u(x, 0) = xz (L - x)
x
11 . Show that the partial differential equatio n au -
a
3.
=
3azzz
for0<x 0
u(0, t)t = u(L, t) = 0
4.
au 82 u at ax z
at
fort > 0
u(x, 0) = L [1- cos(27rx/L)]
for 0
x
L
for 0<x<7r,t> 0
- (0, t) = - (7r, t) = 0 fort > 0 dx ax u(x,0)=sin(x) for 0x<7r z 5.
=4
u(x,0)=x(27r-x) at a2 u at = 4 axz
fort > 0
for 0<x<27r
for 0 < x <
3, t > 0
(3, t) = 0 fort > 0 dx 0x u(x, 0) = xz for 0 x < 3
- (0, t) = -
for 0 < x < 6,t > 0
ax
au ) +A- I Bu J x
12. Use the idea of Problem 11 to solve : z at -(axz-I-4ax+2u) u(0,t)=u('r,t)=0
for0<x<7r,t> 0
fort> 0 for 0 x < 7r.
at 2 axz (0, t) = au (6, t) = 0
u(x, 0) = e-x
13. Use the idea of Problem 11 to solve : 02 u at -
axz
\ +6 - I
az
u(0, t) = 11(4, t) = 0
for 0 < x < 4, t > 0
fort
0
u(x, 0) = 1 for 0 x 4 . Graph the twentieth partial sum of the solution for a selection of times . 14. Use the idea of Problem 11 to solve
z
8.
axz (a2u
can be transformed into a standard heat equation b y choosing a and IS appropriately and letting u(x, t) = e«x+Pt v(x, t) .
u(x, 0) =x(ir-x)
c
6.
k
for0<x<27r,t> 0
ax2 = az (0, t) ax (27r, t) = 0 a
86 3
fort > 0
for 0 < x < 6
9. A thin, homogeneous bar of length L has insulated end s and initial temperature B, a positive constant . Find the temperature distribution in the bar.
all ( az II = at u(0,t)=u(7r,t)=0 u(x, 0) = xz (7r - x)
for 0 < x < 7r, t > 0 fort> 0 for 0 < x < 7r.
Graph the twentieth partial sum of the solution fo r selected times .
CHAPTER 18 The Heat Equation

15. Solve
∂u/∂t = 16 ∂²u/∂x² for 0 < x < 1, t > 0,
u(0, t) = 2, u(1, t) = 5 for t > 0,
u(x, 0) = x² for 0 ≤ x ≤ 1.
Graph the twentieth partial sum of the solution for selected times.

16. Solve
∂u/∂t = k ∂²u/∂x² for 0 < x < L, t > 0,
u(0, t) = T, u(L, t) = 0 for t > 0,
u(x, 0) = x(L − x) for 0 ≤ x ≤ L.

17. Solve
∂u/∂t = 4 ∂²u/∂x² − Au for 0 < x < 9, t > 0,
u(0, t) = u(9, t) = 0 for t > 0,
u(x, 0) = 0 for 0 ≤ x ≤ 9.
Here A is a positive constant. Choose A = … and graph the twentieth partial sum of the solution for a selection of times, using the same set of axes. Repeat this for the values A = …, A = 1 and A = 3. This gives some sense of the effect of the −Au term in the heat equation on the behavior of the temperature distribution.

18. Solve
∂u/∂t = 9 ∂²u/∂x² for 0 < x < 2π, t > 0,
u(0, t) = T, u(2π, t) = 0 for t > 0,
u(x, 0) = 0 for 0 ≤ x ≤ 2π.

In each of Problems 19 through 23, solve the problem
∂u/∂t = k ∂²u/∂x² + F(x, t) for 0 < x < L, t > 0,
u(0, t) = u(L, t) = 0 for t > 0,
u(x, 0) = f(x) for 0 ≤ x ≤ L
for the given F, k, L and f. In each, choose a value of the time and, on the same set of axes, graph the twentieth partial sum of the solution of the given problem, together with the twentieth partial sum of the solution of the problem with the source term F(x, t) removed. Repeat this for other times. This yields some sense of the significance of F(x, t) on the behavior of the temperature distribution.

19. k = 4, F(x, t) = t, f(x) = x(π − x), L = π
20. k = 1, F(x, t) = x sin(t), f(x) = 1, L = 4
21. k = 1, F(x, t) = t cos(x), f(x) = x²(5 − x), L = 5
22. k = 4, F(x, t) = K for 0 ≤ x ≤ 1 and 0 for 1 < x ≤ 2, f(x) = sin(πx/2), L = 2
23. k = 16, F(x, t) = xt, f(x) = K, L = 3

24. Devise a definition of continuous dependence of the solution on the initial data for the problem
∂u/∂t = k ∂²u/∂x² for 0 < x < L, t > 0,
u(0, t) = u(L, t) = 0 for t > 0,
u(x, 0) = f(x) for 0 ≤ x ≤ L.
Prove that this problem depends continuously on the initial data.

25. Find approximate solution values for the problem
∂u/∂t = ∂²u/∂x² for 0 < x < 1, t > 0,
u(0, t) = u(1, t) = 0 for t > 0,
u(x, 0) = x²(1 − x) for 0 ≤ x ≤ 1.
Use Δx = 0.1 and Δt = 0.0025. Carry out calculations for the first four horizontal layers, including the t = 0 layer.

26. Find approximate solution values for the problem
∂u/∂t = ∂²u/∂x² for 0 < x < 2, t > 0,
u(0, t) = u(2, t) = 0 for t > 0,
u(x, 0) = sin(πx/2) for 0 ≤ x ≤ 2.
Use Δx = 0.2 and Δt = 0.0025. Carry out calculations for the first four horizontal layers, including the t = 0 layer.

27. Find approximate solution values for the problem
∂u/∂t = ∂²u/∂x² …
Use Δx = 0.1 and Δt = 0.0025. Carry out calculations for the first four horizontal layers, including the t = 0 layer.
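The layer-by-layer computation asked for in Problems 25 through 27 can be sketched with the standard explicit scheme (forward difference in t, centered difference in x) — an assumption about the intended method. Applied to Problem 25, the ratio r = Δt/Δx² = 0.25 satisfies the usual stability bound r ≤ 1/2.

```python
# Explicit finite-difference sketch for Problem 25:
# u_t = u_xx on 0 < x < 1, u(0,t) = u(1,t) = 0, u(x,0) = x^2 (1 - x).

dx, dt = 0.1, 0.0025
r = dt / dx**2                       # r = 0.25 <= 1/2, so the scheme is stable

xs = [j * dx for j in range(11)]     # grid points x_0 = 0, ..., x_10 = 1
u = [x * x * (1 - x) for x in xs]    # t = 0 layer from the initial condition

layers = [u[:]]                      # first four horizontal layers, including t = 0
for _ in range(3):
    new = u[:]
    for j in range(1, 10):           # interior points; boundary values stay 0
        new[j] = u[j] + r * (u[j + 1] - 2 * u[j] + u[j - 1])
    new[0] = new[10] = 0.0
    u = new
    layers.append(u[:])

for n, layer in enumerate(layers):
    print(f"t = {n * dt:.4f}:", [round(v, 5) for v in layer])
```

With r ≤ 1/2 each new value is a convex combination of old ones, so the computed temperatures never overshoot the initial data.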
18.3 Heat Conduction in Infinite Media

We will now consider problems involving the heat equation with the space variable extending over the entire real line or half-line.

18.3.1 Heat Conduction in an Infinite Bar

For a setting in which the length of the medium is very much greater than the other dimensions, it is sometimes suitable to model heat conduction by imagining the space variable free to vary over the entire real line. Consider the problem

∂u/∂t = k ∂²u/∂x² for −∞ < x < ∞, t > 0,
u(x, 0) = f(x) for −∞ < x < ∞.

There are no boundary conditions, so we impose the physically realistic condition that solutions should be bounded. Separate the variables by putting u(x, t) = X(x)T(t) to obtain

X″ + λX = 0, T′ + λkT = 0.

The problem for X is the same as that encountered with the wave equation on the line, and the same analysis yields eigenvalues λ = ω² for ω > 0 and eigenfunctions of the form a_ω cos(ωx) + b_ω sin(ωx). The problem for T is T′ + ω²kT = 0, with general solution ce^(−ω²kt). This is bounded for t > 0. We now have, for ω > 0, functions

u_ω(x, t) = [a_ω cos(ωx) + b_ω sin(ωx)] e^(−ω²kt)

that satisfy the heat equation and are bounded on the real line. To satisfy the initial condition, attempt a superposition of these functions over all ω > 0, which takes the form of an integral:
u(x, t) = ∫₀^∞ [a_ω cos(ωx) + b_ω sin(ωx)] e^(−ω²kt) dω.   (18.10)

We need

u(x, 0) = ∫₀^∞ [a_ω cos(ωx) + b_ω sin(ωx)] dω = f(x).

This is the Fourier integral of f(x) on the real line, leading us to choose the coefficients

a_ω = (1/π) ∫₋∞^∞ f(ξ) cos(ωξ) dξ and b_ω = (1/π) ∫₋∞^∞ f(ξ) sin(ωξ) dξ.
EXAMPLE 18.8

Suppose the initial temperature function is f(x) = e^(−|x|). Compute the coefficients

a_ω = (1/π) ∫₋∞^∞ e^(−|ξ|) cos(ωξ) dξ = (2/π) · 1/(1 + ω²)

and

b_ω = (1/π) ∫₋∞^∞ e^(−|ξ|) sin(ωξ) dξ = 0.

The solution for this initial temperature distribution is

u(x, t) = (2/π) ∫₀^∞ (1/(1 + ω²)) cos(ωx) e^(−ω²kt) dω.
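The coefficient computed in this example can be confirmed numerically (a sketch using a simple trapezoidal rule; the integrand decays like e^(−|ξ|), so truncating the real line at ±40 is more than enough).

```python
# Numerical check of the coefficient a_omega = 2 / (pi (1 + omega^2))
# for the initial temperature f(x) = e^{-|x|} of Example 18.8.
import math

def a_omega(w, half_width=40.0, n=40_000):
    h = 2 * half_width / n
    total = 0.0
    for i in range(n + 1):
        xi = -half_width + i * h
        weight = 0.5 if i in (0, n) else 1.0   # trapezoidal end weights
        total += weight * math.exp(-abs(xi)) * math.cos(w * xi)
    return total * h / math.pi

for w in (0.5, 1.0, 2.0):
    exact = 2.0 / (math.pi * (1.0 + w * w))
    print(w, a_omega(w), exact)
```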
The integral (18.10) for the solution is sometimes written in more compact form, reminiscent of the calculation in Section 17.3.1 for Fourier integral solutions of the wave equation on the entire line. Substitute the integrals for the coefficients into the integral for the solution to write

u(x, t) = ∫₀^∞ [ (1/π) ∫₋∞^∞ f(ξ) cos(ωξ) dξ cos(ωx) + (1/π) ∫₋∞^∞ f(ξ) sin(ωξ) dξ sin(ωx) ] e^(−ω²kt) dω
= (1/π) ∫₀^∞ ∫₋∞^∞ [cos(ωξ) cos(ωx) + sin(ωξ) sin(ωx)] f(ξ) dξ e^(−ω²kt) dω
= (1/π) ∫₀^∞ ∫₋∞^∞ cos(ω(ξ − x)) f(ξ) e^(−ω²kt) dξ dω.
A Single Integral Expression for the Solution on the Real Line

Consider again the problem

∂u/∂t = k ∂²u/∂x² for −∞ < x < ∞, t > 0,
u(x, 0) = f(x) for −∞ < x < ∞.

We have solved this problem to obtain the double integral expression

u(x, t) = (1/π) ∫₀^∞ ∫₋∞^∞ cos(ω(ξ − x)) f(ξ) e^(−ω²kt) dξ dω.

Since the integrand is an even function in ω, ∫₀^∞ … dω = ½ ∫₋∞^∞ … dω, and this solution can also be written

u(x, t) = (1/2π) ∫₋∞^∞ ∫₋∞^∞ cos(ω(ξ − x)) f(ξ) e^(−ω²kt) dξ dω.

We will show how this solution can be put in terms of a single integral. We need the following.

LEMMA 18.1

For real α and β, with β ≠ 0,

∫₋∞^∞ e^(−ζ²) cos(αζ/β) dζ = √π e^(−α²/4β²).
Proof  Let

F(x) = ∫₀^∞ e^(−ζ²) cos(xζ) dζ.

One can show that this integral converges for all x, as does the integral obtained by interchanging d/dx and ∫₀^∞ … dζ. We can therefore compute

F′(x) = −∫₀^∞ ζ e^(−ζ²) sin(xζ) dζ.

Integrate by parts to get

F′(x) = −(x/2) F(x).

Then

F′(x)/F(x) = −x/2,

and an integration yields ln |F(x)| = −¼x² + c. Then

F(x) = A e^(−x²/4).

To evaluate the constant A, use the fact that

F(0) = A = ∫₀^∞ e^(−ζ²) dζ = √π/2,

a result found in many integral tables. Therefore

∫₀^∞ e^(−ζ²) cos(xζ) dζ = (√π/2) e^(−x²/4).

Finally, let x = α/β and use the fact that the integrand is even with respect to ζ to obtain

∫₋∞^∞ e^(−ζ²) cos(αζ/β) dζ = 2 ∫₀^∞ e^(−ζ²) cos(αζ/β) dζ = √π e^(−α²/4β²). ■
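Lemma 18.1 lends itself to a direct numerical spot check (a sketch: since e^(−ζ²) is negligible beyond |ζ| = 8, the infinite integral is truncated there, and the trapezoidal rule is extremely accurate for such rapidly decaying smooth integrands).

```python
# Numerical spot check of Lemma 18.1:
# integral of e^{-zeta^2} cos(alpha zeta / beta) over the real line
# should equal sqrt(pi) e^{-alpha^2 / (4 beta^2)}.
import math

def lemma_lhs(alpha, beta, half_width=8.0, n=16_000):
    h = 2 * half_width / n
    s = 0.0
    for i in range(n + 1):
        z = -half_width + i * h
        w = 0.5 if i in (0, n) else 1.0
        s += w * math.exp(-z * z) * math.cos(alpha * z / beta)
    return s * h

for alpha, beta in ((1.0, 1.0), (2.0, 0.5), (0.7, 1.3)):
    rhs = math.sqrt(math.pi) * math.exp(-alpha**2 / (4 * beta**2))
    print(alpha, beta, lemma_lhs(alpha, beta), rhs)
```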
Now let ζ = √(kt) ω, α = x − ξ and β = √(kt). Then

∫₋∞^∞ e^(−ω²kt) cos(ω(x − ξ)) √(kt) dω = √π e^(−(x−ξ)²/4kt).

Then

∫₋∞^∞ e^(−ω²kt) cos(ω(x − ξ)) dω = √(π/kt) e^(−(x−ξ)²/4kt).

The solution of the heat conduction problem on the real line is therefore

u(x, t) = (1/2π) ∫₋∞^∞ ∫₋∞^∞ f(ξ) cos(ω(ξ − x)) e^(−ω²kt) dξ dω
= (1/2π) √(π/kt) ∫₋∞^∞ e^(−(x−ξ)²/4kt) f(ξ) dξ.
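A quick numerical sanity check of this kernel representation (a sketch with arbitrary sample values x = 0.3, t = 0.7 and k = 1): taking f(ξ) = 1 the integral must return u = 1, since a constant initial temperature stays constant.

```python
# With f = 1 the kernel form of the solution should give u(x, t) = 1 for
# all x and t. The xi-integral is truncated where the Gaussian is negligible.
import math

def u_const_f(x, t, k=1.0, n=20_000):
    spread = math.sqrt(4 * k * t)
    a, b = x - 10 * spread, x + 10 * spread    # +- 10 standard spreads
    h = (b - a) / n
    s = 0.0
    for i in range(n + 1):
        xi = a + i * h
        w = 0.5 if i in (0, n) else 1.0
        s += w * math.exp(-(x - xi) ** 2 / (4 * k * t))   # f(xi) = 1
    integral = s * h
    return (1.0 / (2 * math.pi)) * math.sqrt(math.pi / (k * t)) * integral

print(u_const_f(0.3, 0.7))
```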
After some manipulation, this solution is

u(x, t) = (1/(2√(πkt))) ∫₋∞^∞ e^(−(x−ξ)²/4kt) f(ξ) dξ.

This is simpler than the previously stated solution in the sense of containing only one integral.

18.3.2 Heat Conduction in a Semi-Infinite Bar

If we consider heat conduction in a bar extending from 0 to infinity, then there is a boundary condition at the left end. If the temperature is maintained at zero at this end, then the problem is

∂u/∂t = k ∂²u/∂x² for 0 < x < ∞, t > 0,
u(0, t) = 0 for t > 0,
u(x, 0) = f(x) for 0 ≤ x < ∞.
Letting u(x, t) = X(x)T(t), the problems for X and T are

X″ + λX = 0, T′ + λkT = 0.

If we proceed as we did for the real line, we obtain λ = ω² for ω > 0, and functions X_ω(x) = a_ω cos(ωx) + b_ω sin(ωx). Now, however, we also have the condition

u(0, t) = X(0)T(t) = 0,

implying that X(0) = 0. Thus we must choose each a_ω = 0, leaving X_ω(x) = b_ω sin(ωx). Solutions for T have the form of constants times e^(−ω²kt), so for each ω > 0 we have functions

u_ω(x, t) = b_ω sin(ωx) e^(−ω²kt).

Each of these functions satisfies the heat equation and the boundary condition u(0, t) = 0. To satisfy the initial condition, write a superposition

u(x, t) = ∫₀^∞ b_ω sin(ωx) e^(−ω²kt) dω.   (18.11)

Now the initial condition requires that

u(x, 0) = ∫₀^∞ b_ω sin(ωx) dω = f(x),

so choose the b_ω's as the coefficients in the Fourier sine integral of f(x) on [0, ∞):

b_ω = (2/π) ∫₀^∞ f(ξ) sin(ωξ) dξ.

With this choice of coefficients, the function given by equation (18.11) is the solution of the problem.
EXAMPLE 18.9

Suppose the initial temperature function is given by

f(x) = π − x for 0 ≤ x ≤ π, and f(x) = 0 for x > π.

The coefficients in the solution (18.11) are

b_ω = (2/π) ∫₀^π (π − ξ) sin(ωξ) dξ = (2/π) (πω − sin(πω))/ω².

The solution for this initial temperature function is

u(x, t) = (2/π) ∫₀^∞ ((πω − sin(πω))/ω²) sin(ωx) e^(−ω²kt) dω.
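The coefficient formula of this example can be verified numerically (a trapezoidal-rule sketch).

```python
# Check b_omega = (2/pi) (pi omega - sin(pi omega)) / omega^2 for the
# initial temperature of Example 18.9 by direct quadrature on [0, pi].
import math

def b_omega_numeric(w, n=20_000):
    h = math.pi / n
    s = 0.0
    for i in range(n + 1):
        xi = i * h
        wt = 0.5 if i in (0, n) else 1.0
        s += wt * (math.pi - xi) * math.sin(w * xi)
    return (2.0 / math.pi) * s * h

def b_omega_closed(w):
    return (2.0 / math.pi) * (math.pi * w - math.sin(math.pi * w)) / w**2

for w in (0.5, 1.0, 2.5):
    print(w, b_omega_numeric(w), b_omega_closed(w))
```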
18.3.3 Integral Transform Methods for the Heat Equation in an Infinite Medium

As we did with the wave equation on an unbounded domain, we will illustrate the use of Fourier transforms in problems involving the heat equation.

Heat Conduction on the Line  Consider again the problem

∂u/∂t = k ∂²u/∂x² for −∞ < x < ∞, t > 0,
u(x, 0) = f(x) for −∞ < x < ∞,

which we have solved by separation of variables. Since x varies over the real line, we can attempt to use the Fourier transform in the x variable. Take the transform of the heat equation to get

ℱ[∂u/∂t] = k ℱ[∂²u/∂x²].

Because x and t are independent, the transform passes through the partial derivative with respect to t:

ℱ[∂u/∂t](ω) = ∫₋∞^∞ (∂u/∂t)(ξ, t) e^(−iωξ) dξ = (∂/∂t) ∫₋∞^∞ u(ξ, t) e^(−iωξ) dξ = (∂/∂t) û(ω, t).

For the transform, in the x variable, of the second partial derivative of u with respect to x, use the operational formula:

ℱ[∂²u/∂x²](ω) = −ω² û(ω, t).

The transform of the heat equation is therefore

(∂/∂t) û(ω, t) + kω² û(ω, t) = 0,

with general solution

û(ω, t) = a_ω e^(−ω²kt).
To determine the coefficient a_ω, take the transform of the initial condition to get

û(ω, 0) = f̂(ω) = a_ω.

Therefore

û(ω, t) = f̂(ω) e^(−ω²kt).

This is the Fourier transform of the solution of the problem. To retrieve the solution, apply the inverse Fourier transform:

u(x, t) = ℱ⁻¹[f̂(ω) e^(−ω²kt)](x) = (1/2π) ∫₋∞^∞ f̂(ω) e^(−ω²kt) e^(iωx) dω.

Of course, the real part of this expression is u(x, t). To see that this solution agrees with that obtained by separation of variables, insert the integral for f̂(ω) to obtain

(1/2π) ∫₋∞^∞ f̂(ω) e^(−ω²kt) e^(iωx) dω = (1/2π) ∫₋∞^∞ ( ∫₋∞^∞ f(ξ) e^(−iωξ) dξ ) e^(iωx) e^(−ω²kt) dω
= (1/2π) ∫₋∞^∞ ∫₋∞^∞ f(ξ) e^(−iω(ξ−x)) e^(−ω²kt) dξ dω
= (1/2π) ∫₋∞^∞ ∫₋∞^∞ f(ξ) cos(ω(ξ − x)) e^(−ω²kt) dξ dω − (i/2π) ∫₋∞^∞ ∫₋∞^∞ f(ξ) sin(ω(ξ − x)) e^(−ω²kt) dξ dω.

Taking the real part of this expression, we have

u(x, t) = (1/2π) ∫₋∞^∞ ∫₋∞^∞ f(ξ) cos(ω(ξ − x)) e^(−ω²kt) dξ dω,

the solution obtained by separation of variables.
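The agreement between the two methods can also be checked numerically for the initial temperature f(x) = e^(−|x|) of Example 18.8, whose Fourier transform is f̂(ω) = 2/(1 + ω²). A sketch comparing the real Fourier-integral solution with the complex inversion integral, both truncated at |ω| = 30, where the Gaussian factor is negligible:

```python
# Compare the separation-of-variables solution with the Fourier-transform
# inversion integral for f(x) = e^{-|x|}, using f^(omega) = 2/(1 + omega^2).
import cmath
import math

def u_separation(x, t, k=1.0, W=30.0, n=30_000):
    h = W / n
    s = 0.0
    for i in range(n + 1):
        w = i * h
        wt = 0.5 if i in (0, n) else 1.0
        s += wt * math.cos(w * x) / (1 + w * w) * math.exp(-w * w * k * t)
    return (2.0 / math.pi) * s * h

def u_transform(x, t, k=1.0, W=30.0, n=60_000):
    h = 2 * W / n
    s = 0.0 + 0.0j
    for i in range(n + 1):
        w = -W + i * h
        wt = 0.5 if i in (0, n) else 1.0
        s += wt * (2.0 / (1 + w * w)) * cmath.exp(-w * w * k * t + 1j * w * x)
    return (s * h / (2 * math.pi)).real     # the real part is u(x, t)

print(u_separation(0.5, 0.2), u_transform(0.5, 0.2))
```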
Heat Conduction on the Half-Line  Consider again the problem

∂u/∂t = k ∂²u/∂x² for 0 < x < ∞, t > 0,
u(0, t) = 0 for t > 0,
u(x, 0) = f(x) for 0 ≤ x < ∞,

which we have solved by separation of variables. To illustrate the transform technique, we will solve this problem again using the Fourier sine transform. Take the sine transform of the heat equation with respect to x, using the operational formula for the transform of the ∂²u/∂x² term, to get

(∂/∂t) û_S(ω, t) = −ω²k û_S(ω, t) + ωk u(0, t).

Since u(0, t) = 0, this is

(∂/∂t) û_S(ω, t) = −ω²k û_S(ω, t),

with general solution

û_S(ω, t) = b_ω e^(−ω²kt).

Now u(x, 0) = f(x), so

û_S(ω, 0) = f̂_S(ω) = b_ω,

and therefore

û_S(ω, t) = f̂_S(ω) e^(−ω²kt).

This is the sine transform of the solution. For the solution, apply the inverse Fourier sine transform to obtain

u(x, t) = (2/π) ∫₀^∞ f̂_S(ω) e^(−ω²kt) sin(ωx) dω.

We leave it for the student to insert the integral expression for f̂_S(ω) and show that this solution agrees with that obtained by separation of variables.
Laplace Transform Solution of a Boundary Value Problem  We have illustrated the use of the Fourier transform and Fourier sine transform in solving heat conduction problems. Here is an example in which the Laplace transform is the natural transform to use. Consider the problem on a half-line:

∂u/∂t = k ∂²u/∂x² for x > 0, t > 0,
u(x, 0) = A for x > 0,
u(0, t) = B for 0 < t < t₀, and u(0, t) = 0 for t > t₀,

in which A, B and t₀ are positive constants. This specifies a problem with nonzero constant initial temperature and a discontinuous temperature distribution at the left end of the bar. We can write the boundary condition more neatly in terms of the Heaviside function H defined in Section 3.3.2:

u(0, t) = B[1 − H(t − t₀)].

Because of the discontinuity in u(0, t), we think of trying a Laplace transform in t. Denote

ℒ[u(x, t)](s) = U(x, s),

with s the variable of the transformed function, and x carried along as a parameter. Take the Laplace transform of the heat equation:

ℒ[∂u/∂t] = k ℒ[∂²u/∂x²].

For the transform of ∂u/∂t, the derivative with respect to the transformed variable, use the operational formula for the Laplace transform:

ℒ[∂u/∂t](s) = sU(x, s) − u(x, 0) = sU(x, s) − A.

The transform passes through ∂²u/∂x² because x and t are independent:

ℒ[∂²u/∂x²](s) = ∫₀^∞ e^(−st) (∂²u/∂x²)(x, t) dt = (∂²/∂x²) ∫₀^∞ e^(−st) u(x, t) dt = (∂²/∂x²) U(x, s).

Transforming the heat equation therefore yields

sU(x, s) − A = k (∂²/∂x²) U(x, s).

Write this equation as

(∂²/∂x²) U(x, s) − (s/k) U(x, s) = −A/k,

a differential equation in x, for each s > 0. The general solution of this equation is

U(x, s) = a_s e^(√(s/k) x) + b_s e^(−√(s/k) x) + A/s.

The notation reflects the fact that the coefficients will in general depend on s. Now, to have a bounded solution we need a_s = 0, since e^(√(s/k) x) → ∞ as x → ∞. Therefore

U(x, s) = b_s e^(−√(s/k) x) + A/s.   (18.12)

To obtain b_s, take the Laplace transform of u(0, t) = B[1 − H(t − t₀)] to get

U(0, s) = B ℒ[1](s) − B ℒ[H(t − t₀)](s) = B/s − (B/s) e^(−t₀s).

Then

U(0, s) = B/s − (B/s) e^(−t₀s) = b_s + A/s,

so

b_s = (B − A)/s − (B/s) e^(−t₀s).

Put this into equation (18.12) to get

U(x, s) = [ (B − A)/s − (B/s) e^(−t₀s) ] e^(−√(s/k) x) + A/s.

The solution is now obtained by using the inverse Laplace transform:

u(x, t) = ℒ⁻¹[U(x, s)].

This inverse can be calculated using standard tables and makes use of the error function and complementary error function, which see frequent use in statistics. These functions are defined by

erf(x) = (2/√π) ∫₀^x e^(−ζ²) dζ and erfc(x) = (2/√π) ∫ₓ^∞ e^(−ζ²) dζ = 1 − erf(x).
We obtain

u(x, t) = [A erf(x/(2√(kt))) + B erfc(x/(2√(kt)))] (1 − H(t − t₀))
+ [A erf(x/(2√(kt))) + B erfc(x/(2√(kt))) − B erfc(x/(2√(k(t − t₀))))] H(t − t₀).
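The closed-form solution can be evaluated directly with the standard library's error functions. A sketch with arbitrary sample values A = 1, B = 5, k = 1, t₀ = 2, checking two limiting behaviours: deep in the bar at small t the temperature is still near the initial value A, and near the heated end for t < t₀ it is near B. The formula coded below is the combined form u = A erf + B erfc − B H(t − t₀) erfc(x/(2√(k(t − t₀)))), one way of writing the inverse transform.

```python
# Evaluate the Laplace-transform solution of the half-line problem using
# math.erf / math.erfc, with arbitrary constants A, B, k, t0.
import math

A, B, k, t0 = 1.0, 5.0, 1.0, 2.0

def H(s):
    return 1.0 if s > 0 else 0.0

def u(x, t):
    base = A * math.erf(x / (2 * math.sqrt(k * t))) \
         + B * math.erfc(x / (2 * math.sqrt(k * t)))
    if t <= t0:
        return base                   # Heaviside factor is zero here
    shifted = x / (2 * math.sqrt(k * (t - t0)))
    return base - B * H(t - t0) * math.erfc(shifted)

print(u(10.0, 0.01))   # deep in the bar at small t: close to A
print(u(0.001, 1.0))   # near the end before t0: close to B
```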
In each of Problems 1 through 4, consider the problem

∂u/∂t = k ∂²u/∂x² for −∞ < x < ∞, t > 0,
u(x, 0) = f(x) for −∞ < x < ∞.

Obtain a solution first by separation of variables (Fourier integral), and then again by Fourier transform.

1. f(x) = e^(−4|x|)
2. f(x) = x for −1 ≤ x ≤ 1, and f(x) = 0 for |x| > 1
3. f(x) = x for 0 ≤ x ≤ 4, and f(x) = 0 for x > 4
4. f(x) = e^x

In each of Problems 5 through 8, solve the problem

∂u/∂t = k ∂²u/∂x² for 0 < x < ∞, t > 0,
u(0, t) = 0 for t > 0,
u(x, 0) = f(x) for 0 ≤ x < ∞.

5. f(x) = e^(−ax), with a any positive constant.
6. f(x) = x e^(−ax), with a > 0.
7. f(x) = 1 for 0 ≤ x ≤ h, and f(x) = 0 for x > h, with h any positive number.
8. f(x) = sin(x) for |x| ≤ π, and f(x) = 0 for |x| > π.

In each of Problems 9 and 10, use a Fourier transform on the half-line to obtain a solution.

9. ∂u/∂t = ∂²u/∂x² for x > 0, t > 0,
u(x, 0) = x e^(−x) for x > 0,
u(0, t) = 0 for t > 0.

10. ∂u/∂t = ∂²u/∂x² − u for x > 0, t > 0,
u(x, 0) = 0 for x > 0,
∂u/∂x(0, t) = f(t) for t > 0.

In each of Problems 11 and 12, use the Laplace transform to obtain a solution.

11. ∂u/∂t = k ∂²u/∂x² for x > 0, t > 0,
u(0, t) = t² for t > 0,
u(x, 0) = 0 for x > 0.

12. ∂u/∂t = k ∂²u/∂x² for x > 0, t > 0,
u(0, t) = 0 for t > 0,
u(x, 0) = e^(−x) for x > 0.
18.4 Heat Conduction in an Infinite Cylinder

We will consider the problem of determining the temperature distribution function in a solid, infinitely long, homogeneous cylinder of radius R. Let the axis of the cylinder be along the z axis in x, y, z space (Figure 18.15). If u(x, y, z, t) is the temperature function, then u satisfies the 3-dimensional heat equation

∂u/∂t = k (∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z²).

FIGURE 18.15

It is convenient to use cylindrical coordinates, which consist of polar coordinates in the plane together with the usual z coordinate, as in the diagram. With x = r cos(θ) and y = r sin(θ), let u(x, y, z, t) = U(r, θ, z, t). We saw in Section 17.1 that

∂²u/∂x² + ∂²u/∂y² = ∂²U/∂r² + (1/r) ∂U/∂r + (1/r²) ∂²U/∂θ².

Thus, in cylindrical coordinates, with U(r, θ, z, t) the temperature in the cylinder at point (r, θ, z) and time t, U satisfies

∂U/∂t = k (∂²U/∂r² + (1/r) ∂U/∂r + (1/r²) ∂²U/∂θ² + ∂²U/∂z²).

This is a formidable equation to engage at this point, so we will assume that the temperature at any point in the cylinder depends only on the time t and the horizontal distance r from the z axis. This symmetry assumption means that ∂U/∂θ = ∂U/∂z = 0, and the heat equation becomes

∂U/∂t = k (∂²U/∂r² + (1/r) ∂U/∂r) for 0 ≤ r < R, t > 0.

In this case we will write U(r, t) instead of U(r, θ, z, t). The boundary condition is

U(R, t) = 0 for t > 0.

This means that the outer surface of the cylinder is kept at zero temperature. The initial condition is

U(r, 0) = f(r) for 0 ≤ r ≤ R.

Separate the variables in the heat equation by putting U(r, t) = F(r)T(t). We obtain

F(r)T′(t) = k (F″(r)T(t) + (1/r) F′(r)T(t)).

Because r and t are independent variables, this yields

T′/kT = (F″ + (1/r)F′)/F = −λ,
in which λ is the separation constant. Then

T′ + λkT = 0 and F″ + (1/r)F′ + λF = 0.

Further, U(R, t) = F(R)T(t) = 0 for t > 0, so we have the boundary condition

F(R) = 0.

The problem for F is a singular Sturm-Liouville problem (see Section 16.3.1) on [0, R], with only one boundary condition. We impose the condition that the solution must be bounded. Consider cases on λ.

Case 1: λ = 0. Now

F″ + (1/r)F′ = 0.

To solve this, put w = F′(r) to get

w′(r) + (1/r)w(r) = 0,

or

rw′ + w = (rw)′ = 0.

This has general solution rw(r) = c, so

w(r) = c/r = F′(r).

Then F(r) = c ln(r) + d. Now ln(r) → −∞ as r → 0+ (center of the cylinder), so choose c = 0 to have a bounded solution. This means that F(r) = constant for λ = 0. The equation for T in this case is T′ = 0, with T = constant also. In this event U(r, t) = constant. Since U(R, t) = 0, this constant must be zero. In fact, U(r, t) = 0 is the solution in the case that f(r) = 0. If the temperature on the surface is maintained at zero, and the temperature throughout the cylinder is initially zero, then the temperature distribution remains zero at all later times, in the absence of heat sources.

Case 2: λ < 0. Write λ = −ω² with ω > 0. Now T′ − kω²T = 0 has general solution

T(t) = c e^(ω²kt),

which is unbounded unless c = 0, leading again to U(r, t) = 0. This case leads only to the trivial solution.

Case 3: λ > 0, say λ = ω². Now T′ + kω²T = 0 has solutions that are constant multiples of e^(−ω²kt), and these are bounded for t > 0. The equation for F is

F″(r) + (1/r)F′(r) + ω²F(r) = 0,
or

r²F″(r) + rF′(r) + ω²r²F(r) = 0.

In this form we recognize Bessel's equation of order zero, with general solution

F(r) = c J₀(ωr) + d Y₀(ωr).

J₀ is Bessel's function of the first kind of order zero, and Y₀ is Bessel's function of the second kind of order zero (see Section 16.2.3). Since Y₀(ωr) → −∞ as r → 0+, we must have d = 0. However, J₀(ωr) is bounded on [0, R], so F(r) is a constant multiple of J₀(ωr).

The condition F(R) = 0 now requires that this constant be zero (in which case we get the trivial solution), or that ω be chosen so that

J₀(ωR) = 0.

This can be done. Recall that J₀(x) has infinitely many positive zeros, which we arrange as

0 < j₁ < j₂ < ⋯ .

We can therefore have J₀(ωR) = 0 if ωR is any one of these numbers. Thus choose

ω = jₙ/R.

The numbers

λₙ = ωₙ² = jₙ²/R²

are the eigenvalues of this problem, and the eigenfunctions are nonzero constant multiples of J₀(jₙr/R).

We now have, for each positive integer n, a function

Uₙ(r, t) = aₙ J₀(jₙr/R) e^(−jₙ²kt/R²).

To satisfy the initial condition U(r, 0) = f(r) we must generally use a superposition

U(r, t) = Σ_{n=1}^∞ aₙ J₀(jₙr/R) e^(−jₙ²kt/R²).

We now must choose the coefficients so that

U(r, 0) = Σ_{n=1}^∞ aₙ J₀(jₙr/R) = f(r).

This is an eigenfunction expansion of f(r) in terms of the eigenfunctions of the singular Sturm-Liouville problem for F(r). We know from Section 16.3.3 how to find the coefficients. Let ξ = r/R. Then

f(Rξ) = Σ_{n=1}^∞ aₙ J₀(jₙξ),

and

aₙ = (2/[J₁(jₙ)]²) ∫₀¹ ξ f(Rξ) J₀(jₙξ) dξ.

The solution of the problem is

U(r, t) = Σ_{n=1}^∞ ( (2/[J₁(jₙ)]²) ∫₀¹ ξ f(Rξ) J₀(jₙξ) dξ ) J₀(jₙr/R) e^(−jₙ²kt/R²).
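The series solution hinges on the positive zeros jₙ of J₀. As a sketch, J₀ can be evaluated from its power series J₀(x) = Σ (−1)^m (x/2)^(2m) / (m!)², and the tabulated first zero j₁ ≈ 2.40483 confirmed numerically:

```python
# Evaluate J_0 from its power series and verify that the tabulated
# first positive zero j_1 = 2.40483 really makes J_0 vanish.

def J0(x, terms=40):
    s, term = 0.0, 1.0
    for m in range(terms):
        if m > 0:
            term *= -(x * x / 4.0) / (m * m)   # ratio of consecutive series terms
        s += term
    return s

print(J0(0.0))       # J_0(0) = 1
print(J0(2.40483))   # near the first zero, so very close to 0
```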
SECTION 18.4  PROBLEMS

1. Suppose the cylinder has radius R = 1, and, in polar coordinates, initial temperature U(r, 0) = f(r) = r for 0 ≤ r ≤ 1. Assume that U(1, t) = 0 for t > 0. Approximate the integral in the solution and write the first five terms in the series solution for U(r, t), with k = 1. (The first five zeros of J₀(x) are given in Section 16.2.) Graph this sum of five terms for different values of t.

2. Suppose the cylinder has radius R = 3, and, in polar coordinates, initial temperature U(r, 0) = f(r) = e^(−r) for 0 ≤ r ≤ 3. Assume that U(3, t) = 0 for t > 0. Approximate the integral in the solution and write the first five terms in the series solution for U(r, t), with k = 16. Graph this sum of five terms for different values of t.

3. Suppose the cylinder has radius R = 3, and, in polar coordinates, initial temperature U(r, 0) = f(r) = 9 − r² for 0 ≤ r ≤ 3. Assume that U(3, t) = 0 for t > 0. Approximate the integral in the solution and write the first five terms in the series solution for U(r, t), with k = ½. Graph this sum of five terms for different values of t.

4. Determine the temperature distribution in a homogeneous circular cylinder of radius R with insulated top and bottom caps under the assumption that the temperature is independent of both the radial angle and height. Assume that heat is radiating from the lateral surface into the surrounding medium, which has temperature zero, with transfer coefficient A. The initial temperature is U(r, 0) = f(r). Hint: It will be necessary to know that an equation of the form kJ₀′(x) + AJ₀(x) = 0 has infinitely many positive solutions. This can be proved, but assume it here. Solutions of this equation yield the eigenvalues for this problem.
18.5 Heat Conduction in a Rectangular Plate

Consider the temperature distribution u(x, y, t) in a flat, square homogeneous plate covering the region 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 in the plane. The sides are kept at temperature zero and the interior temperature at time zero at (x, y) is given by

f(x, y) = x(1 − x²)y(1 − y).

The problem for u is

∂u/∂t = k (∂²u/∂x² + ∂²u/∂y²) for 0 < x < 1, 0 < y < 1, t > 0,
u(x, 0, t) = u(x, 1, t) = 0 for 0 < x < 1, t > 0,
u(0, y, t) = u(1, y, t) = 0 for 0 < y < 1, t > 0,
u(x, y, 0) = x(1 − x²)y(1 − y).

Let u(x, y, t) = X(x)Y(y)T(t) and obtain

X″ + λX = 0, Y″ + μY = 0, T′ + (λ + μ)kT = 0,

where λ and μ are the separation constants. The boundary conditions imply in the usual way that

X(0) = X(1) = 0, Y(0) = Y(1) = 0.

The eigenvalues and eigenfunctions are

λₙ = n²π², Xₙ(x) = sin(nπx)

for n = 1, 2, … and

μₘ = m²π², Yₘ(y) = sin(mπy)

for m = 1, 2, … . The problem for T is now

T′ + (n² + m²)π²kT = 0,

with general solution

Tₙₘ(t) = cₙₘ e^(−(n²+m²)π²kt).

For each positive integer n and each positive integer m, we now have functions

uₙₘ(x, y, t) = cₙₘ sin(nπx) sin(mπy) e^(−(n²+m²)π²kt)

which satisfy the heat equation and the boundary conditions. To satisfy the initial condition, let

u(x, y, t) = Σ_{n=1}^∞ Σ_{m=1}^∞ cₙₘ sin(nπx) sin(mπy) e^(−(n²+m²)π²kt).

We must choose the coefficients so that

u(x, y, 0) = x(1 − x²)y(1 − y) = Σ_{n=1}^∞ Σ_{m=1}^∞ cₙₘ sin(nπx) sin(mπy).

We find (as in Section 17.7) that

cₙₘ = 4 ∫₀¹ ∫₀¹ x(1 − x²)y(1 − y) sin(nπx) sin(mπy) dx dy
= 4 ( ∫₀¹ x(1 − x²) sin(nπx) dx ) ( ∫₀¹ y(1 − y) sin(mπy) dy )
= 4 · (6(−1)^(n+1)/(n³π³)) · (2(1 − (−1)^m)/(m³π³))
= 48 (−1)^(n+1) (1 − (−1)^m) / (n³m³π⁶).

The solution is

u(x, y, t) = (48/π⁶) Σ_{n=1}^∞ Σ_{m=1}^∞ ((−1)^(n+1)(1 − (−1)^m)/(n³m³)) sin(nπx) sin(mπy) e^(−(n²+m²)π²kt).
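The coefficient computation can be spot-checked numerically; the double integral factors into two one-dimensional integrals. A sketch, where the closed form 48(−1)^(n+1)(1 − (−1)^m)/(n³m³π⁶) is the value the elementary integrations give:

```python
# Compare the double Fourier coefficients c_nm computed by quadrature
# against the closed form; the integral factors as (x-part) * (y-part).
import math

def trap(f, a, b, n=4000):
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

def c_nm_numeric(n, m):
    ix = trap(lambda x: x * (1 - x**2) * math.sin(n * math.pi * x), 0.0, 1.0)
    iy = trap(lambda y: y * (1 - y) * math.sin(m * math.pi * y), 0.0, 1.0)
    return 4.0 * ix * iy

def c_nm_closed(n, m):
    return 48.0 * (-1) ** (n + 1) * (1 - (-1) ** m) / (n**3 * m**3 * math.pi**6)

for n, m in ((1, 1), (2, 1), (1, 2), (3, 3)):
    print(n, m, c_nm_numeric(n, m), c_nm_closed(n, m))
```

Note that the coefficients vanish for every even m, so only odd-m terms survive in the series.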
PROBLEMS

1. Taking a cue from the problem just solved, write a double series solution for the following more general problem:

∂u/∂t = k (∂²u/∂x² + ∂²u/∂y²) for 0 < x < L, 0 < y < K, t > 0,
u(x, 0, t) = u(x, K, t) = 0 for 0 < x < L, t > 0,
u(0, y, t) = u(L, y, t) = 0 for 0 < y < K, t > 0,
u(x, y, 0) = f(x, y).

2. Write the solution for Problem 1 in the case that k = 4, L = 2, K = 3 and f(x, y) = x²(L − x) sin(y)(K − y).

3. Write the solution for Problem 1 in the case that k = 1, L = π, K = π, and f(x, y) = sin(x) y cos(y/2).
CHAPTER 19

The Potential Equation
19.1 Harmonic Functions and the Dirichlet Problem

The partial differential equation

∂²u/∂x² + ∂²u/∂y² = 0

is called Laplace's equation in two dimensions. In 3 dimensions this equation is

∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z² = 0.

The Laplacian ∇² (read "del squared") is defined in 2 dimensions by

∇²u = ∂²u/∂x² + ∂²u/∂y²

and in three dimensions by

∇²u = ∂²u/∂x² + ∂²u/∂y² + ∂²u/∂z².

In this notation, Laplace's equation is ∇²u = 0. A function satisfying Laplace's equation in a certain region is said to be harmonic on that region. For example, x² − y² and 2xy are both harmonic over the entire plane.
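That x² − y² and 2xy are harmonic can be confirmed with a finite-difference approximation to the Laplacian (a sketch; for these quadratics the five-point formula is exact up to rounding, since their fourth derivatives vanish):

```python
# Five-point finite-difference approximation to the 2-D Laplacian,
# applied to the harmonic functions x^2 - y^2 and 2xy.

def laplacian(f, x, y, h=1e-3):
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / (h * h)

for f in (lambda x, y: x * x - y * y, lambda x, y: 2 * x * y):
    print([round(laplacian(f, x, y), 6) for (x, y) in ((0.3, -1.2), (2.0, 0.5))])
```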
Laplace's equation is encountered in problems involving potentials, such as potentials for force fields in mechanics, or electromagnetic or gravitational fields. Laplace's equation is also known as the steady-state heat equation. The heat equation in 2 or 3 space dimensions is

∂u/∂t = k∇²u.

In the steady-state case (the limit as t → ∞) the solution becomes independent of t, so ∂u/∂t = 0 and the heat equation becomes Laplace's equation.

In problems involving Laplace's equation there are no initial conditions. However, we often encounter the problem of solving

∇²u(x, y) = 0

for (x, y) in some region D of the plane, subject to the condition that

u(x, y) = f(x, y)

for (x, y) on the boundary of D. This boundary is denoted ∂D. Here f is a function having given values on ∂D, which is often a curve, or made up of several curves (Figure 19.1). The problem of determining a harmonic function having given boundary values is called a Dirichlet problem, and f is called the boundary data of the problem. There are versions of this problem in higher dimensions, but we will be concerned primarily with dimension 2.

FIGURE 19.1  Typical boundary ∂D of a region D.

The difficulty of a Dirichlet problem usually depends on how complicated the region D is. In general, we have a better chance of solving a Dirichlet problem for a region that possesses some type of symmetry, such as a disk or rectangle. We will begin by solving the Dirichlet problem for some familiar regions in the plane.
PROBLEMS

1. Let f and g be harmonic on a set D of points in the plane. Show that f + g is harmonic, as is αf for any real number α.

2. Show that the following functions are harmonic on the entire plane:
(a) x³ − 3xy²
(b) 3x²y − y³

3. Show that ln(x² + y²) is harmonic on the plane with the origin removed.

4. Show that rⁿ cos(nθ) and rⁿ sin(nθ), in polar coordinates, are harmonic on the plane, for any positive integer n. Hint: Look up Laplace's equation in polar coordinates.

5. Show that, for any positive integer n, r⁻ⁿ cos(nθ) and r⁻ⁿ sin(nθ) are harmonic on the plane with the origin removed.
19.2 Dirichlet Problem for a Rectangle

Let R be a solid rectangle, consisting of points (x, y) with 0 ≤ x ≤ L, 0 ≤ y ≤ K. We want to find a function that is harmonic at points interior to R, and takes on prescribed values on the four sides of R, which form the boundary ∂R of R.

This kind of problem can be solved by separation of variables if the boundary data is nonzero on only one side of the rectangle. We will illustrate this kind of problem, and then outline a strategy to follow if the boundary data is nonzero on more than one side.

EXAMPLE 19.1

Consider the Dirichlet problem

∇²u(x, y) = 0 for 0 < x < L, 0 < y < K,
u(x, 0) = 0 for 0 ≤ x ≤ L,
u(0, y) = u(L, y) = 0 for 0 ≤ y ≤ K,
u(x, K) = (L − x) sin(x) for 0 ≤ x ≤ L.

Figure 19.2 shows the region and the boundary data. Let u(x, y) = X(x)Y(y) and substitute into Laplace's equation to obtain

X″/X = −Y″/Y = −λ.

Then

X″ + λX = 0 and Y″ − λY = 0.

From the boundary conditions,

u(x, 0) = X(x)Y(0) = 0,

FIGURE 19.2  Boundary data given on the sides of the rectangle.
so Y(0) = 0. Similarly, X(0) = X(L) = 0.

The problem for X(x) is a familiar one, with eigenvalues λₙ = n²π²/L² and eigenfunctions that are nonzero constant multiples of sin(nπx/L). The problem for Y is now

Y″ − (n²π²/L²)Y = 0; Y(0) = 0.

Solutions of this problem are constant multiples of sinh(nπy/L). For each positive integer n = 1, 2, …, we now have functions

uₙ(x, y) = bₙ sin(nπx/L) sinh(nπy/L)

which are harmonic on the rectangle, and satisfy the zero boundary conditions on the bottom, left and right sides of the rectangle. To satisfy the boundary condition on the side y = K, we must use a superposition

u(x, y) = Σ_{n=1}^∞ bₙ sin(nπx/L) sinh(nπy/L).

Choose the coefficients so that

u(x, K) = Σ_{n=1}^∞ bₙ sin(nπx/L) sinh(nπK/L) = (L − x) sin(x).

This is a Fourier sine expansion of (L − x) sin(x) on [0, L], so we must choose the entire coefficient to be the sine coefficient:

bₙ sinh(nπK/L) = (2/L) ∫₀^L (L − ξ) sin(ξ) sin(nπξ/L) dξ
= 4L²nπ [1 − (−1)ⁿ cos(L)] / (L² − n²π²)².

Then

bₙ = 4L²nπ [1 − (−1)ⁿ cos(L)] / ((L² − n²π²)² sinh(nπK/L)).
The solution is

u(x, y) = Σ_{n=1}^∞ ( 4L²nπ [1 − (−1)ⁿ cos(L)] / ((L² − n²π²)² sinh(nπK/L)) ) sin(nπx/L) sinh(nπy/L).

If nonzero boundary data is prescribed on all four sides of R, define four Dirichlet problems, in each of which the boundary data is nonzero on only one side. This process is outlined in Figure 19.3. Each of these problems can be solved by separation of variables. If uⱼ(x, y) is the solution of the jth problem, then

u(x, y) = Σ_{j=1}^4 uⱼ(x, y)

is the solution of the original problem. This sum will satisfy the original boundary data because each uⱼ(x, y) satisfies the nonzero data on one side and is zero on the other three.
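The sine-coefficient integral in Example 19.1 can be checked numerically for a concrete choice of L (here L = 2, an arbitrary value with L ≠ nπ, so the denominator never vanishes):

```python
# Compare the quadrature value of (2/L) * integral of (L - xi) sin(xi)
# sin(n pi xi / L) with the closed form 4 L^2 n pi [1 - (-1)^n cos L] / (L^2 - n^2 pi^2)^2.
import math

L = 2.0

def lhs(n, npts=20_000):
    h = L / npts
    s = 0.0
    for i in range(npts + 1):
        xi = i * h
        w = 0.5 if i in (0, npts) else 1.0
        s += w * (L - xi) * math.sin(xi) * math.sin(n * math.pi * xi / L)
    return (2.0 / L) * s * h

def rhs(n):
    return 4 * L**2 * n * math.pi * (1 - (-1) ** n * math.cos(L)) \
           / (L**2 - n**2 * math.pi**2) ** 2

for n in (1, 2, 3):
    print(n, lhs(n), rhs(n))
```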
FIGURE 19.3

In each of Problems 1 through 5, solve the Dirichlet problem for the rectangle, with the given boundary conditions.

1. u(0, y) = u(1, y) = 0 for 0 ≤ y ≤ π, u(x, π) = 0 and u(x, 0) = sin(πx) for 0 ≤ x ≤ 1

3. u(0, y) = u(1, y) = 0 for 0 ≤ y ≤ 4, u(x, 0) = 0, u(x, 4) = x cos(πx/2) for 0 ≤ x ≤ 1

4. u(0, y) = sin(y), u(π, y) = 0 for 0 ≤ y ≤ π, u(x, 0) = x(π − x), u(x, π) = 0 for 0 ≤ x ≤ π

5. u(0, y) = 0, u(2, y) = sin(y) for 0 ≤ y ≤ π, u(x, 0) = 0, u(x, π) = x sin(πx) for 0 ≤ x ≤ 2

6. Apply separation of variables to solve the following mixed boundary value problem (mixed means that some boundary conditions are given on the function, and others on its partial derivatives):

∇²u(x, y) = 0 for 0 < x < a, 0 < y < b,
u(0, y) = 0, u(a, y) = g(y) for 0 ≤ y ≤ b,
u(x, b) = 0, ∂u/∂y(x, 0) = 0 for 0 ≤ x ≤ a.

7. Apply separation of variables to solve the following mixed boundary value problem:

∇²u(x, y) = 0 for 0 < x < a, 0 < y < b,
u(x, 0) = 0, u(x, b) = f(x) for 0 ≤ x ≤ a,
u(0, y) = ∂u/∂x(a, y) = 0 for 0 ≤ y ≤ b.

8. Solve for the steady-state temperature distribution in a thin flat plate covering the rectangle 0 ≤ x ≤ a, 0 ≤ y ≤ b if the temperatures on the vertical sides and bottom side are kept at zero, and the temperature along the top side is f(x) = x(x − a)².

9. Solve for the steady-state temperature distribution in a thin flat plate covering the rectangle 0 ≤ x ≤ 4, 0 ≤ y ≤ 1 if the temperature on the horizontal sides is zero, while on the left side it is f(y) = sin(πy) and on the right side it is f(y) = y(1 − y).