This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
1 holds, then the series is divergent. From the previous statements it follows that if lim fi = p n i x
(7.28b)
6is greater than a number Q (7.28~)
exists. in the case p < 1 the series is convergent, in the case p > 1 it is divergent. For p = 1 with this test we cannot tell anything about the convergence behavior of the series.
ITheseries 5+(:)4+(:)g+...+(-&)n2t 1
(7.29a)
is convergent because (7.29b)
7.2.2.4 Integral Test of Cauchy 1. Convergence If a series has the general term a, = f(n), and f(z)is a monotone decreasing
function such that the improper integral
7.2 Number Series 407
M
J
(see 8.2.3.2. l., p. 452)
f(2)ds
(7.30)
c
exists (it is convergent). then the series is convergent. 2. Divergence If the above integral (7.30) is divergent, then the series with the general term a, = f(n)is divergent. too. The loFver limit c of the integral is almost arbitrary but it must be chosen so that for c < z < m the function f(z)should be monotone decreasing. IThe series (7.27a) is divergent because (7.31)
7.2.3 Absolute and Conditional Convergence 7.2.3.1 Definition Along with the series
al
+ a2 + . . . + a , + . . . =
oc
(7.32a)
ak. kzl
whose terms can have different signs, we consider also the series 00
(7.32b)
~al~+~a~~+”’+~anl+”’=~lak~. k=l
whose terms are the absolute values of the terms of the original sequence (7.32a). If the series (7.3213) is convergent, then the original one (7.32a) is convergent, too. (This statement is valid also for series with complex terms.) In this case, the series (7.32a) is called absolutely convergent. If the series (7.32b) is divergent, then the series (7.32a) can be either divergent or convergent. In the second case, the series (7.32a) IS called condztzonally convergent. s i n a sin20 sin no (7.33a) IA : Theseries 2 22 2n where a is an arbitrarily constant number, is absolutely convergent, because the series of absolute values sin no is convergent. This is obvious by comparing it with the geometric series (7.15): with terms 2” - + - + + . . + - + + . . . .
I I ~
(7.3313) 1 1 1 (7.34) W B : The series 1 - - + - - . . . t (-l)n-l- + ” . 2 3 is conditionally convergent, because it is convergent according to (7.36b), and the series made of the 1
absolute values of the terms is the divergent harmonic series (7.16) whose general term is - = la,l.
7.2.3.2 Properties of Absolutely Convergent Series 1. Exchange of Terms a) The terms of an absolutely convergent series can be exchanged with each other arbitrarily (even infinitely many of them) and the sum does not change. b) Exchanging an infinite number of terms of a conditionally convergent series can change the sum and even the convergence behavior. Theorem of Riemann: The terms of a conditionally convergent series can be rearranged so that the sum will be equal to any given value, even to !cx
408
7. Infinite Series
2. Addition and Subtraction Absolutely convergent series can be added and subtracted term-by-term; the result is absolutely convergent.
3. Multiplication Multiplying a sum by a sum, the result is a sum of the products where every term of the first factor is multiplied by every term of the second one. These two-term products can be arranged in different ways. The most common way for this arrangement is made as if the series were power series, Le.:
+
+
(a1 + a2 + ...+a, + ...)(b1 bz t . . . b, + ...) = albl+azbl + a 1 b ~ + a s b l + a z b z + a l b 3 + . . . + a , b l w--
+a,..lbz+...+alb,+...
.
(7.35a)
If two series are absolutely convergent, then their product is absolutely convergent, so it has the same sum in any arrangement. If a, = Sa and b, = Sb hold, then the sum of the product is
s = sash. If two series a1
(7.3513) m
+ a2 + ...+a, + ... = n=l a, and bl + bz + ... + b, + ... = n=l b, are convergent, and W
at least one of them is absolutely convergent, then their product is also convergent, but not necessarily absolutely convergent.
7.2.3.3 Alternating Series 1. Leibniz Alternating Series Test (Theorem of Leibniz) For an alternating series al-a2+a3-...!ra,F..., (7.36a) where a, are positive numbers, a sufficient condition of convergence is if the following two relations hold: 1. lima,=O and 2. a l > a z > a 3 > ~ ~ ~ > a , > . . . (7.36b) nim
IThe series (7.34) is convergent because of this criterion.
2. Estimation of the Remainder of an Alternating Series If we consider the first n terms of an alternating series, then the remainder R, has the same sign as the first omitted term a,+], and the absolute value of & is smaller than la,+ll: sign& = sign(a,+l)
with & = S - S,, (7.37a)
IS - Snl < la,+ll.
(7.37b)
IConsidering the series I - - 1+ - - 1- + .1. . 2 3 4
n Ff...
- In 2 (7.38a) -
the remainder is
I In 2 - S,, < 1 . (7.38b) n+l
7.2.4 Some Special Series 7.2.4.1 The Values of Some Important Number Series (7.39) 1 - - 1+ - - -1+ . , 1. l! 2! 3!
1
1
IF”’= e n.
(7.40)
(7.41) 1 . . . + - +1 . . . I + -1+ -1+ - + 2 4 8 2,
= 2,
(7.42)
7.2 Number Series 409
(7.43) (7.44) 1 - - $ -1+ . . . 1 -+ 1'2 2 ' 3 3 ' 4
+-+...1 n(n t 1)
t
--$-+--$...
1 1.3
1 3'5
1 5'7
1 -+ 1'3
-t
1 2'4
7
1 3.0
= 1,
(7.45)
1 $ . . . = 12' (2n - 1)(2nt 1)
1 + . , . = 3+...t (n - l)(n 4' t 1)
1 1 1 1 + . . . = -1 - t - -t . . . t 3 . 5 7 . 9 11.13 (472- 1)(4n t 1) 2
+
1 1 1 + . . . = -1 t -+ . . . t 4' 1'2.3 2.3.1 n(n l ) ( n + 2) 1 1 1 1 . . .+...t t n . . . ( n+ 1 - 1) (1 - 1)(1 - l)!' 1 . 2 . .. 1 2 . 3 . . . (1 + 1)
+
~
(7.46) (7.47) (7.48) (7.49)
+
(7.51)
(7.52) (7.53) A4 I + - 1+ - + 1 - + . . 1. + - + . . . =124 34 44 n4 90 ' 1 1 1 7A4 I--+ --... =24 34 n4 720 1 1 1 1 A4 -+-+$,+...+-++. =14 34 04 (2n t 96 ' 1 1 1 1 .2k22k-l I + -+-+-+...+-+... =Bk , * 22k 32k 42k n2k (2k)!
*-*...
(7.54) (7.55) (7.56) (7.57)
(7.58) (7.59) 1 1 1--+---+.., 32ktl 52ktl
1 72ktl
'Bk are the Bernoulli numbers tEfi are the Euler numbers
(7.60)
410
7. In.finite Series
7.2.4.2 Bernoulli and Euler Numbers 1. First Definition of the Bernoulli Numbers The Bernoulli numbers Bk occur in the power series expansion of some special functions, e.g., in the trigonometric functions tan 2 , cot x and csc 2, also in the hyperbolic functions tanh z, coth x,and cosech x. The Bernoulli numbers Bk can be defined as follows
(7.61) and they can be calculated by the coefficient comparison method with respect to the powers of z.Their values are given in Table 7.1. Table 7.1 The first Bernoulli numbers
Bk
k
k
-1 6 1 30 1 42
1
4 a
6
k
Bk
1 30 5 66 691 2 730
B k
-7 6 3 617 510 43 867 798
7 8 9
k 10 11
Bk
174611 330 854 513 138
2. Second Definition of Bernoulli Numbers Some authors define the Bernoulli numbers in the following way:
-x
-x2
-5 = 1 + B1- + Bzex - 1 l! 2!
- X2n + . . . +Bzn+ ... (2n)!
(1x1< 2n).
So we get the recursive formula %= (B+l)kt’ ( k = 1 , 2 , 3 , . ..),
(7.63)
ahere after the application of the binomial theorem (see 1.1.6.4, l., p. 12) we have to replace B,, Le.. the exponent becomes the index. The first few numbers are: 1 - 1 1 - 1 B1 = --. B 2 = 6 , B4 = -30, B6 = 42’ -
Bs =
B16
=
1
--,30 3617
5 691 B1o = 66’ B12 = --. 2730
-
-
--, 510 . . . ,
_ _ -
B3 = B5 = B7 =
The following relation is valid: Bk = (-l)k’l& (k = 1 , 2 , 3 , .. .).
(7.62)
B14
.
7
= 6’
B”
by
(7.64)
= 0.
(7.65)
3. First Definition of E u l e r N u m b e r s The Euler numbers Ek occur in the power series expansion of some special functions. e.g., in the functions sec zand sech x. The EULER numbers Ek can be defined as follows (7.66) and they can be calculated by coefficient comparison with respect to the powers of x. Their values are given in Table 7.2.
7.2 Numberseries 411
4. Second Definition of Euler Numbers Analogously to (7.63) the Eder numbers can be defined with the recursive formula ( E + l ) k ( F - l)k = 0
+
(k = 1 , 2 , 3 , . . .),
(7.67)
where after the application of the binomial theorem we have to replace we get : _ _ E2 = -1. E4 = 5, E6 = -61, Ea = 1385. = -50521, El2 = 2 702 765, E14 = -199 360 981, _ _ _ E16 = 19391 512 145,. . . , El = E3 = Ej = ' . . = 0. The following relation is valid: Ek = (-l)kG ( k = 1 , 2 , 3 , ,. .).
by E. For the first values
(7.68)
(7.69)
Table 7.2 First Euler numbers
1 2 3 4
1 5 61 1385
5 6 7
50 521 2 702 765 199360981
5 . Relation Between the Euler and Bernoulli Numbers The relation between the Euler and
Bernoulli numbers is: (7.70)
7.2.5 Estimation of the Remainder 7.2.5.1 Estimation with Majorant In order to determine how well the n-th partial sum approximates the sum of the series. the absolute value of the remainder (7.71) X
X
ak must be estimated. For this estimation we use a majorant for
of the series
1lakl, usually a k=ntl
k=l
geometric series or another series which is easy to sum or estimate. m -l For the ratio amtl of two subsequent terms of this n! ' am m! 1 1 amtl series with m 2 n 1 we have: - = -= -5 - = q < 1. So the remainder am (m+1)! m t l n t 2 1 1 1 t -t . ' . can be majorized by the geometric series (7.15) with the R, = ( n t l)! (n 2)! ( nt 3)! 1 quotient q = -and with the initial term a = -. and it yields: n+2 ( n I)! 1 n + 2 1 n + 2 1 R, < = -- <--=(7.72) I-q ( n + l ) ! n + l n!n2+2n n.n!'
IEstimate the remainder of the series e =
n=n
+
~
+- +
+
a
412
7. Infinite Series
7.2.5.2 Alternating Convergent Series For a convergent alternating series, whose terms tend to zero with monotone decreasing absolute values, the easy estimation for the remainder is (see 7.2.3.3, l., p. 408): (7.73) Ihl= IS - Snl < Iantll'
7.2.5.3 Special Series For some special series. e.g., Taylor series, there are special formulas to estimate the remainder (see 7.3.3.3, p. 415).
7.3 Function Series 7.3.1 Definitions 1. Function Series is a series whose terms are functions of the same variable x: ~1 (z)t
(z)t
''
t fn (2) +
M
'
=
f n (E),
(7.74)
n=l
2. Partial S u m
S,(x)is the sum of the first n terms of the series (7.74):
(7.75) + fdx) -t. . . + f n ( x ) . 3. Domain of Convergence of a function series (7.74) is the set of values of x = a for which all the functions fn(z)are defined and the series of constant terms Sn(Z)
= fib)
m
(7.76) is convergent, Le., for which the limit ofthe partial sums Sn(a)exists: (7.77)
4. T h e S u m of the Series (7.74) is the function S(Z), and we say that the series converges to the function S(z). 5. Remainder & ( E ) is the difference between the sum S ( x )of a convergent function series and its partial sum Sn(x): (7.78) &(z) = S(z) - sn(x) = f n + l ( x ) t f n + 2 ( 2 ) + . ' + f n + m ( x ) + ' ' '
7.3.2 Uniform Convergence 7.3.2.1 Definition, Weierstrass Theorem According to the definition of the limit of asequence of numbers (see 7.1.2, p. 403 and 7.2.1.1,2., p. 404) the series (7.74) converges at a point x to S(z) if for an arbitrary E > 0 there is an integer N such that IS(z) - Sn(z)I < E holds for every n > N . For function series we distinguish between two cases: 1. Uniformly Convergent Series If there is a number N such that for every z in the domain of convergence of the series (7.74), IS(x) - Sn(x)I < E holds for every n > N , then the series is called uniformly convergent on this domain. 2. Non-Uniform Convergence of Series If there is no such number N which is good for every value of x in the domain of convergence, Le., there are such values of E for which there is at least one x in the domain of convergence such that IS(x) - Sn(x) I > E holds for arbitrarily large values of n, then the series is non-unzformly convergent. x x2 xn IA : Theseries l t - + - - + . . . + - - $ . . . (7.79a) l! 2! n!
7.3 Function Series 413
with the sum ex (see Table 21.3, p. 1011) is convergent for every value of z.The convergence is uniform for every bounded domain of z,and for every 1x1 < a using the remainder of the Maclaurin formula (see 7.3.3.3, 2., p. 416) the inequality (7.7913) is valid. Because ( n t l)!increases more quick than an+'. the expression on the right-hand side of the inequality, which is independent of z,for sufficiently large n will be less than E . The series is not uniformly convergent on the whole numerical axis: For any large n there will be a value of z such that ,p+l
l m e Q ' l
is greater than a previously given E .
+
+ +
+
(7.80a) IB : The series z + z(1 - z) z(1 - z)' ... z(1 - z ) n ... is convergent for every zin [0,1],because corresponding to the d'Alembert ratio test (see 7.2.2.2. p. 405) p = lim ,+XI
a,tl ~
a,
1
= 11 - z/< 1holds for 0
< z 5 1 (for z = 0 S = 0 holds).
(7.80b)
The convergence is non-uniform, because + (1 - z)n+' = (1 - z)n+l S(Z) - Sn(2)= z[(l is valid and for every n there is an zsuch that (1- z)ntlis close enough to 1, Le., it is not smaller than (7.80c) E . In the interval a 5 z 5 1 with 0 < a < 1 the series is uniformly convergent. 3. Weierstrass Criterion for Uniform Convergence The series (7.81a) fi(.) s f d z ) + " ' + f n ( z ) +'.. is uniformly convergent in a given domain if there is a convergent series of constant positive terms (7.81b) e1 + C' t .' . + c, t ' ' ' such that for every z in this domain the inequality lfn(~)I 5 cn (7.81~) is valid. M'e call (7.81~)a majorant of the series (7.81a).
+...I
7.3.2.2 Properties of Uniformly Convergent Series 1. Continuity If the functions fl(z),f2(z),. . ' , f n ( z )., ' . are continuous in a domain and the series f l ( z ) + f 2 ( z ) + ..+fn(z)+. ' '.isuniformlyconvergent in thisdomain, then thesum S(z) iscontinuous in the same domain. If the series is not uniformly convergent in a domain, then the sum S(z) may have discontinuities in this domain. IA: The sum of the series (7.80a) is discontinuous: S(z) = 0 for z = 0 and S(z) = 1 for z > 0. IB: The sum of the series (7.79a) is a continuous function: The series is uniformly convergent for any finite domain; it cannot have any discontinuity at any finite IC. 2. Integration and Differentiation of Uniformly Convergent Series In the domain [a, b] of uniform convergence it is allowed to integrate the series term-by-term. It is also allowed to differentiate term-by-term if the result is a uniformly convergent series. That is: (7.82a)
I
414
7. In.finite Series
7.3.3 Power series 7.3.3.1 Definition, Convergence 1. Definition The most important function series are the power series of the form a.
+ alx + a2x' + + Q,X" +. I
'
cc '.
anxn or
=
(7.83a)
n=O
a0 t a1(x - 2 0 ) + a2(x - 2 0 ) '
+ + a,(x '.
-~
0
+
m )' . ~=
an(x - x O ) ~ ,
(7.83b)
n=O
where the coefficients a, and the centre of expansion ZO are constant numbers 2. Absolute Convergence a n d Radius of Convergence .4 power series is convergent either only for x = xo or for all values of x or there is a number r > 0, the radius of convergence, such that the series is absolutely convergent for Ix - 201 < r and divergent for Ix - ZOI > r (Fig. 7.1). The radzus of convergence can be calculated by the formulas (7.84)
*
domain of convergence
x,,-r
if the limits exist. Ifthese limits do not exist, then we have to take the limit superior (lim) instead of the usual limit (see [7.6],Vol. I). At the endpoints z = t r and x = --T for the series (7.83a) and x = zo + r and x = 10 - r for the series (7.83b) the series can he either convergent or divergent.
xo+r
XO
Figure 7.1
3. Uniform Convergence .4 power series is uniformly convergent on every subinterval /z- 201 5 r0 < r of the domain of convergence (theorem ojiibel).
x x2 Xn 1 n+l (7.85) I For the series 1 + - + - + . ' . + - + ' . . we get - = lim -= 1 , Le., r = 1. n r nix n 1 2 Consequently the series is absolutely convergent in -1 < z < +1, for x = -1 it is conditionally convergent (see series (7.34) on p. 407) and for x = 1 it is divergent (see the harmonic series (7.16) on p. 404). According to the theorem of Abel the series is uniformly convergent in every interval [ - T I , + T I ] , where rl is an arbitrary number between 0 and 1.
7.3.3.2 Calculations with Power Series 1. S u m a n d P r o d u c t Convergent power series can be added, multiplied, and multiplied by a constant factor term-by-term inside of their common domain of convergence. The product of two power series is
+(&
+ aibz + azbi + a 3 b o ) X 3 +. . .
2. First Terms of Some Powers of Power Series: S = a + bx ex' dx3 ex4 j x 5 . .
+
S 2 = a'
+
+
+ 2abx + (b2 t 2ac)x'
+2(af
+ + . + 2(ad + bc)x3+ (c' + 2ae + 2bd)x4
+ be t ed)xj + . '.
(7.86) (7.87)
(7.88)
7.3 Function Series 415
(7.89)
(7.90)
(7.91)
3. Quotient of Two Power Series X
n=O anzn x:
b"xn
+ +
a0 1 a12 t azx2 -bo 1 $12 ~ Z X '
+
+ +
'.
=
' '
a.
-[I t (a1 - B1)x bo
+ (a2
+
n=O
- alBl
+ PIz p2)zz -
+ . .I.
(7.93) +(a3- azi31- a1D2- P3 - 6'13 0131~ t 2P1p2)z3 We get this formula by considering the quotient as a series with unknown coefficients, and after multiplying by the numerator we get the unknown coefficients by coefficient comparison. 4. Inverse of a Power Series If the series (7.94a) y = f(x) = ax bzZ cz3 t dx4 t ex5 t fz6t ' . ( a # 0) is given. then its inyerse is the series (7.94b) z = p(y) = i l y ByZt cy3t Dy4 t Ey5 Fy6 ' ' ' . Taking powers of y and comparing the coefficients we get 1 b 1 1 B = -A= C = -(2b2 - ac), D = -(5abc - azd - 5b3), a a3 ' a5 a7 1 (7.94c) E = -(6a2bd t 3aZc2t 14b4 - a3e - 21ab'c).
+
+
+
+
+
-.
o9
1 F = -(7a3be all
t 7a3cd + 84ab3c - a4f
-
28azb2d- 28azbcz - 42b5).
The convergence of the inverse series must be checked in every case individually.
7.3.3.3 Taylor Series Expansion, Maclaurin Series There is a collection of power series expansions of the most important elementary functions in Table 21.3 (see p. 1009). Usually, we get them by Taylor expansion.
1. Taylor Series of Functions of One Variable If a function f ( z ) has all derivatives at x = a, then it can often be represented with the Taylor formula as a power series (see 6.1.43. p. 388).
416
7. Infinite Series
a) First Form of the Representation:
(x - a)* f"(a) + ' . .
+T (X - a)" f (a) + . . . (n)
,
(7.968)
This representation (7.95a) is correct only for the x values for which the remainder I?, = f(x) - S, tends to zero if n + m. This notion of the remainder is not identical to the notion of the remainder given in 7.3.1, p. 412 in general, only in the case if the expressions (7.95b) can be used. There are the following formulas for the remainder: (x - a)n+' %=f'"")([)(a < < x) or (x < < a) (Lagrange formula), (7.95b) (n l)!
<
+
l X
R, =
/(x - t)"f("t"(t) d t
(Integral formula).
(7.95c)
a
b) Second Form of the Representation: k h2 h" f(a+ h ) = f ( a ) -!'(a) zf"(a) t ' . .t p ( a )
+
+ l!
+ .. '.
(7.96a)
The expressions for the remainder are: hntl
%=-
(0 < 0
f("+')(a t Oh) ( n I)!
+
l h
R, = -J/ ( h - t)nf(ntl)(a
< I),
(7.96b)
+ t )dt.
(7.96~)
0
2. Maclaurin Series The power series expansion of a function f(x) is called a Mucluzlrin series if it is a special case of the Taylor series with a = 0. It has the form
f(x) = f ( 0 ) + I f ' ( 0 ) l!
+ Zf"(0) 22 +.
' '
+ X" - p
( O )
+ .' '
(7.97a)
with the remainder xntl
%=f("")(Qx) ( nt I)!
(0
<8 <
l X
%= ; /(x - t)"f("t')(t) d t .
(7.97b) (7.97c)
0
The convergence of the Taylor series and Maclaurin series can be proven either by examining the remainder & or determining the radius of convergence (see 7.3.3.1, p. 414). In the second case it can happen that although the series is convergent, the sum S ( x ) is not equal to f(x). For instance for the function f(x) = e - 3 for x # 0, and f(0) = 0 the Maclaurin series is the identically zero series.
7.3.4 Approximation Formulas Considering only a neighborhood small enough of the centre of the expansion, we can introduce rational approximation formulas for several functions with the help of the Taylor expansion. The first few terms of some functions are shown in Table 7.3. The data about accuracy are given by estimating the remainder. Further possibilities for approximate representation of functions, e.g., by interpolation and fitting polynomials or spline functions, can be found in 19.6, p. 914 and 19.7, p. 928.
7.3 Function Series 417 Table 7.3 Approximation formulas for some frequently used functions
I
7.3.5 Asymptotic Power Series Even divergent series can be useful for calculation of substitution values of functions. In the following 1 we consider some asymptotic power series with respect to - to calculate the values of functions for large X
values of 1x1.
7.3.5.1 Asymptotic Behavior Two functions f (E) and g(z), defined for x0
< x < co,are called asymptotically equal for z -+ co if
+
f (XI = 1
(7.98b) (7.98a) or f ( ~=)g(z) o(g(z)) for 5 + m S(X) hold. Here, o(g(z)) is the Landau symbol 'little 0'' (see 2.1.4.9, p. 56). If (7.98a) or (7.98b) is fulfilled, then we write also f ( ~ )g(z). lim
~
x+33
N
418
7. Infinite Series
I B : ~ ! ~ I . IC:
I A : @ T ~ ~ X .
32 t 2 4x3 + x + 2
N -
3
4x2’
7.3.5.2 Asymptotic Power Series 1. Notion of Asymptotic Series a, . A series E,”=,is called an asymptotic power series of the function f ( z ) defined for z > zo if X”
(7.99) f ( z ) = C -a”+ O (?!I)v=o xu holds for every n = 0 , 1 , 2 . .. . . Here, 0 - is the Landau symbol “big 0”.For (7.99) we also write (z”:l)
2. Properties of Asymptotic Power Series a) Uniqueness: If for a function f ( z )the asymptotic power series exists, then it is unique. but a function is not uniquely determined by an asymptotic power series. b) Convergence: Convergence is not required from an asymptotic power series.
I A: e! x
“ ”=o
1 -is an asymptotic series, which is convergent for every x with 1x1 > zo
v!x”
I B: The integral f ( z ) =
/
00
0
e-”t
-dt
l f t
R,(z) = 0
“
~
0
(A),
(1
+
t)n+1
0).
(x > 0) is convergent for z > 0 and repeated partial inte-
1 gration results in the representation f(z) = x
n! with&(x) = (-l),-/ zn
(20>
l! - 22
2!
3!
+- - . + (-l),-’23 24
dt. Because of l&(z)l
?C
5
’.
dt
(n - I)! Xn
=
n!
:..l
+ &(XI we get
and with this estimation (7.100)
The asymptotic power series (7.100) is divergent for every z,because the absolute value of the quotient n t l of the (n+ 1)-th and of the n-th terms has the value -. However, this divergent series can be used for a reasonably good approximation of f(z).For instance, for z = 10 with the partial sums Sd(10) and Sj(l0) we get the estimation 0.09152 <
dt
< 0.09164.
7.4 Fourier Series 7.4.1 Trigonometric Sum and Fourier Series 7.4.1.1 Basic Notions 1. Fourier Representation of Periodic Functions Often it is necessary or useful to represent a given periodic function f(z) with period T exactly or approximatirely by a sum of trigonometric functions of the form s,(x) =
2 + a1 cos wx + a2 cos 2wz + . , . + a, cos nux 2 +bl s i n u s + bz sin 2wx + . . + b, sin nwz.
(7.101)
7.4 Fourier Series 419
2T
This is called the Fourzer expanszon. Here the frequency is w = - In the case T = 2n we have Y, = 1 T We can get the best approximation of f ( z ) in the sense given on 7.4.1.2, p. 419 by an approximation function s,(z). where the coefficients a k and b k ( k = 0,1,2,. . . , n) are the Fourier coefficients of the given function. They are determined with the Eulerformulas +f(-x)]coskdzdz,
(7.102a)
-f(-z)]sinlcwzdz,
(7.102b)
20
and
20
or approximatively with the help of methods of harmonic analysis (see 19.6.4, p. 924). 2. Fourier Series If there are z values such that the sequence of functions sn(z) tends to a limit s(z) for n + m, then the given function has a convergent Fourier series for these zvalues. This can be written in the form
+
+
+, + +
+
a1 coswz a2 cos 2wz . , + a , cosnwx 2 +b, s i n d z bz sin2wz . , . b, sin nwx ' . . and also in the form 00
s(z) = -
+
a0
s(z) = -
+
+
2 '41sin(wz t PI) A2 sin(2wz where in the second case:
A,, =
\/,2+bk2,
+
+ p2)+ . + A, sin(nuz + yn)+ . '
ak
tanpk = - ,
,,
~
(7.103b)
(7.103~)
bk
3. Complex Representation of the Fourier Series In many cases the complex form is very useful: CkeikW',
s(z) =
(7.104a)
k=-m
(7.104b)
I
;(a_,
+ ib-k)
fork < 0 .
7.4.1.2 Most Important Properties of t h e Fourier Series 1. Least Mean Squares Error of a Function If a function f ( z ) is approximated by a trigonometric sum sn(x) = 3 +
5
ak cos kdz
+
k=l
bk
sin kwx,
(7.105a)
k=l
also called the Fourzersurn, then the mean square error (see 19.6.2.1, p. 916, and 19.6.4.1, 2., p. 925)
' i[I(.)
F =T o
- sn(x)]2dx
(7.105b)
420
7. In.finite Series
is smallest if we define ak and bk as the Fourier coefficients (7.102a,b) of the given function.
2. Convergence of a Function in the Mean, Parseval Equation The Fourier series converges in mean to the given function, Le.,
/pljjij
-
sn(z)12di + 0 for n
+ rn
(7.106a)
0
holds. if the function is bounded and in the interval0 < x < T i t is piecewise continuous. A consequence of the convergence in the mean is the P a r s e d equation: 2 T a' - / [ f ( x ) ] ' d x = 0 x ( a k 2 t bk'). T O 2 k=l
+
(7.106b)
3. Dirichlet Conditions If the function f ( x ) satisfies the Dzrzchlet condztzons, Le., a) the interval of definition can be decomposed into a finite number of intervals where the function f(x) is continuous and monotone, and b) at every point of discontinuity of f(x)the values f ( x t 0) and f(x - 0) are defined. then the Fourier series of this function is convergent. At the points where f(x) is continuous the sum f(. - 0) + f(z + 0 ) is equal to f ( x ) .at the points of discontinuity the sum is equal to 2
4. Asymptotic Behavior of the Fourier Coefficients If a periodic function f(r)and its derivatives up to k-th order are continuous, then for n expressions annkt1and b,nktl tend to zero.
Figure 7.2
Figure 7.3
+ cc both
Figure 7.4
7.4.2 Determination of Coefficients for Symmetric Functions 7.4.2.1 Different Kinds of Symmetries 1. Symmetry of the First Kind If f ( x ) is an even function, i t . , if f(x) = f(-x) (Fig. 7.2), then for the coefficients we have
(7.107) 2. Symmetry of the Second Kind If f(x) is an odd function, i.e., if f(x) = - j ( - z ) (Fig. 7.3), then for the coefficients we have
(7.108)
7.L Fourier Series 421 3. Symmetry of the Third Kind If f(z+ T/2) = -f(x) holds (Fig. 7.4), then the coefficients are (7.109a)
(7.109b) 4. Symmetry of the Fourth Kind If the function f(x) is odd and also the symmetry of the third kind is satisfied (Fig. 7.5a), then the coefficients are r i d
ak = b2h = 0.
b2k+l =
8j f ( x ) s i n ( 2 k + 1)-2nx dx T
( k = 0,1,2:. . .).
(7.110)
T O
If the function f ( x ) is even and also the symmetry of the third kind is satisfied (Fig.7.5b), then the coefficients are
8
bk
= a Z k = 0, a2ktl = -
l4
T O
2x2 dx ( k = 0, 1 , 2 , .. .). T
f(x)cos(2k + 1)-
Figure 7.5
Figure 7.6
Figure 7.7
7.4.2.2 Forms of the Expansion into a Fourier Series Every function f ( x ) , satisfying the Dirichlet conditions in an interval 0 5 x 5 I (see 7.4.1.2,3., p. 420), can be expanded in this interval into a convergent series of the following forms: a0 27rx 2nx 2nx 1. fl(x) = - a l c o s - + a ~ c o s 2 - + ~ ~ ~ + a , c o s n - + ~ ~ ~ 2 1 1 1 2nx 27rx 2TX + bl sin + b:! sin 2- 1 + ' . + b, sin R- 1 + . . . (7.112a) 1
+
422
7. Infinite Series
The period of the function h ( x ) is T = 1; in the interval 0 < z < l the function fi(z)coincides with the function f(z)(Fig. 7.6). At the points of discontinuity the substitution values are f(x) =
I [ f ( z- 0) + f(z+ O ) ] . The coefficients of the expansion are determined with the Euler formulas
2 2T (7.102a.b) for w = -. 1 a0 -
KX
+
FX
+
+ +
FX
+
a,cosn' (7.112b) a1 cos - a2 cos 2'.. 2 1 1 1 The period of the function f 2 ( x ) is T = 21; in the interval 0 5 x 5 1 the function fZ(x) has a symmetry of the first kind and it coincides with the function f(x) (Fig. 7.7). The coefficients of the expansion of f 2 ( x )are determined by the formulas for the case of symmetry of the first kind with T = 21. 2. f 2 ( x ) =
TX
7iX
572
(7.112c) 3. f3(x)= b l s i n - + + b : ! s i n 2 - + + . . + b b , s i n n - + + . . . . 1 1 1 The period of the function f s ( x ) is T = 21; in the interval 0 < x < 1 the function f 3 ( x ) has a symmetry of the second kind and it coincides with the function f(x)(Fig. 7.8). The coefficients of the expansion are determined by the formulas for the case of symmetry of the second kind with T = 21.
1
TT 4 2
Figure 7.8
TX
Figure 7.9
7.4.3 Determination of the Fourier Coefficients with Numerical Methods If the periodic function f(x)is a complicated one or in the interval 0 5 x < T i t s values are known only kT . for a discrete system x k = - with k = 0: 1 . 2 , . . . , N - 1, then we have to approximate the Fourier N coefficients. Furthermore, e.g.: also the number of measurements N can be a very large number. In these cases we use the methods of numerical harmonic analysis (see 19.6.4, p. 924).
7.4.4 Fourier Series and Fourier Integrals 1. Fourier integral If the function f(x) satisfies the Dirichlet conditions (see 7.4.1.2, 3., p. 420) in an arbitrarily finite interval and. moreover: the integral
'SX
If(x)l dx is convergent (see 8.2.3.2,l., p. 452), then the following
-c€
integral): formula holds (FOURIER +m
f ( ~ 1 ) +~m= e ' ~~' d ; ~ f ( t ) e - ' w ' r l 1t = - ~ d+m w~f(t)cosw(t-x)dt. --s.
-c€
.4t the points of discontinuity we substitute
0
-x
(7.113a)
7.4 Fourier Series 423
2. Limiting Case of a Non-Periodic Function The formula (7.113a) can be regarded as the expansion of a non-periodic function f ( x )into a trigonometric series in the interval (-1, +1) for 1 + m. IVith Fourier series expansion a periodic function with period T is represented as the sum of harmonic 2i7
vibrations with frequency 3, = n-
T
.
with n = 1,2,. . . and with amplitude A,. This representation is
based on a discrete frequency spectrum. L\’*iththe Fourier integral the non-periodic function f ( 5 )is represented as the sum of infinitely many harmonic vibrations with continuously varying frequency w . The Fourier integral gives an expansion of the function f (x)into a continuous frequency spectrum. Here the frequency w corresponds to the density g(d) of the spectrum: -35
(7.113~) -m
The Fourier integral has a simpler form if f ( x ) is either a) even or b) odd: a)
---cosuxdw
f ( x )= 2 x
(7.114a)
0 m
b) f ( x ) = ~2 Jms i n ~ x d w S f ( t ) s i n w t d t .
(7.114b)
0
0
IThe density of the spectrum of the even function f ( x ) = e-l”l and the representation of this function are
7.4.5 Remarks on the Table of Some Fourier Expansions In Table 21.4 there are given the Fourier expansions of some simple functions, which are defined in a certain interval and then they are periodically extended. The shapes of the curves of the expanded functions are graphically represented. 1. Application of Coordinate Transformations Many of the simplest periodic functions can be reduced to a function represented in Table 21.4 when we either change the scale (unit of measure) of the coordinate axis or we translate the origin. W A function f(r) = f (-x) defined by the relations (7.116a)
(Fig. 7.9), can be transformed into the form 5 given in Table 21.4,if we substitute a = 1 and we 2i7x i7 - + -. By the substitution of the variables in T 2 2i7x Ti 27rx + = (-l), cos(2n + 1)we get for the function (7.116a) the series 5 ) because sin(2n 1) T 2 T expression 27ix 1 27rx 1 2KX (7.116b) y = 1 + - cos - - - cos3+ ;cos;- ”‘ . i ( T 3 T o T
introduce the new variables Y = y - 1 and X =
+
(-
-)
)
I
424
7. Infinite Series
2. Using the Series Expansion of Complex Functions Many of the formulas given in Table 21.4 for the expansion of functions into trigonometric series can be derived from power series expansion of functions of a complex variable.
I The expansion of the function I
+
-= 1 z t 22 t . . .
< 1)
(7.117)
2 = ae'p after separating the real and imaginary parts
(7.118)
1-Z
(121
yields for
1 + acosp + a2cos2p
1 -acos(o 1 - 2a cosp a2 ' (o for la1 < 1. . , = 1 - 2aa sin cos (o az
+. . . + ancosllp + , .
a s i n y + a2sin2(o+ . , . + ansinn(o
+.
*
=
+ +
(7.119)
8 Integral Calculus 1. Integral Calculus and Indefinite Integrals Integration represents the inverse operation of differentiation in the following sense: While differentiation calculates the derivative function f‘(.) of a given function f(x),integration determines a function whose derivative f’(z)is previously given. This process does not have a unique result, so we get the notion of an indefinite integral. 2. Definite Integral If we start with the graphical problem of the integral calculus, to determine the area between the curve of y = f(z) and the z-axis, and for this purpose we approximate it with thin rectangles (Fig. 8.1). then we get the notion of the definite integral. 3. Connection Between Definite a n d Indefinite Integrals The relation between these two types of integral is the fundamental theorem of calculus (see 8.2.1.2, l . , p. 440).
8.1 Indefinite Integrals 8.1.1 Primitive Function or Antiderivative 1. Definition Consider a function y = f(x) given on an interval [a, b]. F ( z ) is called a primitive function or antiderivative of f(x)if F ( s )is differentiable everywhere on [a, b] and its derivative is !(.): F ’ ( 2 )= f(.). (8.1) Because under differentiation an additive constant disappears, a function has infinitely many primitive functions, if it has any. The difference of two primitive function is a constant. So, the graphs of all primitive functions Fl(z),Fz(z),. . . , F,(z) can be got by parallel translation of a particular primitive function in the direction of the ordinate axis (Fig. 8.2).
Figure 8.2
Figure 8.3
2. E x i s t e n c e Every function continuous on a connected interval has a primitive function on this interval. If there are some discontinuities, then we decompose the interval into subintervals in which the original function is continuous (Fig. 8.3). The given function = f(z) is in the upper part of the figure; the function
426
8. Inteqral Calculus
y = F ( x ) in the lower part is a primitive function of it on the considered intervals.
8.1.1.1 Indefinite Integrals The indefinite integral of a given function f(z)is the set of primitive functions
F ( z )t C =
J f ( z )dz.
(8.2)
The function f ( x ) under the integral sign
I
is called the integrand, zis the integration variable, and C
is the integration constant. It is also a usual notation, especially in physics, to put the differential dx right after the integral sign and so before the function f ( z ) . Table 8.1 Basic integrals Exponential functions
Powers p t l
/zn dz = n+l
( n # -1)
/ez dx = ex
/$ = In 1x1
/ozdz = Hyperbolic functions
Trigonometric functions
/ sinh zdz / cosh zdz
S s i n x d z = -cos%
/ cos x dz
= sin z
= cosh z = sinh z
/tanzdz=-lnlcoszl
/tanhzdx=ln~coshzl
J
/cothzdz =ln/sinhzI
cot x d z = In 1 sinxi
/&
= tanz
A /&
= -cotx
/A
= tanhz
/A
=
Irrational functions
Fractional rational functions dx
dx
Ia2-z
-
ii1 arctan $
= i.4rtanh
$ = & In (for I zI < a )
/A x -iArcoth$= -a
-cothz
=
& l n l z l
dz
= Arcosh
5 = In Ix + d-1
(for I zI > a)
8.1.1.2 Integrals of Elementary Functions 1. Basic Integrals The integration of elementary functions in analytic form is reduced to a sequence of basic integrals. These basic integrals can be got from the derivatives of well-known elementary functions, since indefinite integration means the determination of a primitive function F ( z ) of the function f(z). The collection of integrals given in Table 8.1 comes from reversing the differentiation formulas in Table 6.1 (Derivatives of elementary functions). The integration constant C is omitted.
8.1 Indefinite Inteorals 427
2. General Case For the solution of integration problems, we try to reduce the given integral by algebraic and trigonometric transformations. or by using the integration rules to basic integrals. The integration methods given in section 8.1.2 make it possible in many cases to integrate those functions which have an elementary primitive function. The results of some integrations are collected in Table 21.5 (Indefinite integrals). The following remarks are very useful in integration: a) The integration constant is mostly omitted. Exceptions are some integrals, which in different forms can be represented with different arbitrary constants. b) If in the primitive function there is an expression containing In f(x),then we have to consider always In If(x)l instead of it. c) If the primitive function is given by a power series, then the function cannot be integrated in an elementary fashion. .A wide collection of integrals and their solutions are given in [8.1] and [8.2].
8.1.2 Rules of Integration The integral of an integrand of arbitrary elementary functions is not usually an elementary function. In some special cases we can use some tricks, and by practice we can gain some knowledge of how to integrate. Today we leave the calculation of integrals mostly to computers. The most important rules of integration, which are finally discussed here, are collected in Table 8.2.
1. Integrand with a Constant Factor .A constant factor a in the integrand can be factored out in front of the integral sign (constant multiple rule): /af(x)dx=
oi
s
f(z)dx.
2. Integration of a Sum or Difference The integral of a sum or difference can be reduced to the integration of the separate terms if we can tell their integrals separately (sum rule): /(u
+ IJ - w) dx =
/
u dz
+ / u dz -
1
wdz.
(8.4)
The variables u , IJ, w are functions of z.
s
I (x + 3)'(x2
+ 1)dx =
(x4 + 6x3+ 102'
3 10 + 6z + 9) dx = y3 t -z4 + -z3 + 32' + 92 + C . 2 3 25
3. Transformation of the Integrand The integration of a complicated integrand can sometimes be reduced to a simpler integral by algebraic or trigonometric transformations.
s s
I sin 22 cos zdz =
s:
-(sin 32 + sin r ) dx.
4. Linear Transformation in the Argument If f(x)dx = F ( x ) is known, e.g., from an integral table, then we get: 1
/!(ax) dx = - F ( a x ) + C,
/ f(x t b) dx
(8.58)
1 + b) dx = -F(az + b) + C. 1 sin ax dx = - - cos ax + C . IB:
= F(z
1
(8.5b) (8.5~)
//(ax IA:
+ b) + C,
1
ea'+*
1
dx = -eaxtb
+ c.
I
8. Intesral Calculus
428
dx
= arctan(x + a )
+ C.
Table 8.2 Important rules of calculation of indefinite integrals
Rule
Formula for integration
Integration constant
/f(x) dx = F ( z ) + C
(C const)
Integration and differentiation
I f(x)
( a const)
Constant multiple rule
/ a f ( x )dx = a
Sum rule
/[u(x)rt ~ ( x )dx] = /u(x)dx & / u ( x ) dx
Partial integration
1
dx
u(x)d(x)dx = u(x)w(x)-
1
u'(x)v(x) dx
x = u ( t ) or t = ~ ( x ) ; u und u are inverse functions of each other :
/ f(x)dx = / f ( u ( t ) ) u ' ( dtt )
Substitution rule
'(')
Special form of the integrand
or
dx = In If(x)l + C (logarithmic integration)
lm 1f'(x)f(x)
1 dx = -f2(x) C 2 u und w are inverse functions of each other : /u(x)dx = xu(.) - F ( u ( x ) )+ C1 with l. 2.
Integration of the inverse function
+
F ( x ) = /u(x)dx + C2 (C, , C2 const)
5 . Power and Logarithmic Integration If the integrand has the form of a fraction such that in the numerator we have the derivative of the denominator. then the integral is the logarithm of the absolute value of the denominator:
"
IA:
2'
+ 3x - 5 dx = ln(x2+ 32 - 5 ) + C. +
If the integrand is a product of a power of a function multiplied by the derivative of the function. and the power is not equal to -1, then
1 1
f'(x)f"(x)dx =
B:
2x+3
(xZ+ 32 - 5)3
I
fa+%) f a ( z ) df(x) = -
dx =
+
c,
a+l
1 ( - 2 ) ( x 2 + 32 - 5)'
+ c.
8.1 IndefinzteInteorals 429
6. Substitution Method If x = u ( f )where t = ~ ( xis) the inverse function of x = u ( t ) )then according to the chain rule of differentiation we get
I e"S1 I e"+ldxx I t + I -. e"
IA:
-
1
dx. Substituting z = lnt. (t
dx
1
dt
t
> 0), - = - , then taking the decomposition into
Dartial fractions we get: ex - 1 t-1dt -- = (L dt = 21n(e" t 1) - x t C . = 1t t t 1 t dx dt IB: Substituting 1$x2 = t , - = 22, then we get dx
1
A)
7. Partial Integration Reversing the rule for the differentiation of a product we get
I
u(x)z"(x)dx = ~ ( x~) ( x-) / u ' ( z ) u(x) dx,
(8.8)
) continuous derivatives. where u(x)and ~ ( xhave IThe integral
S
x e' dx can be calculated by partial integration where we choose u = x and u' = e',
so we get u' = 1 and 2' = e':
S ze'
dz = ze"
-
I
e' dx = (x - 1)e' t C .
8. Non-Elementary Integrals Integrals of elementary functions are not always elementary functions. These integrals are calculated mostly in the following three ways, where the primitive function will be approximated by a given accuracy. 1. Table of Values The integrals which have a particular theoretical or practical importance but cannot be expressed by elementary functions can be given by a table of values. (Of course, the table lists values of one particular primitive function.) Such special functions usually have special names. Examples are: IA: Logarithmic integral (see 8.2.5. 3., p. 458):
IB: Elliptic integral of the first kind (see 8.1.4.3, p. 435): dx = F ( k 9). J(1 - $)(1 - k Z 9 )
fnp
(8.10)
IC: Error function (see 8.2.5, 5., p. 459):
I'
e-tz d t = erf(z).
(8.11)
2. Integration by Series Expansion We take the series expansion of the integrand, and if it is uniformly convergent, then it can be integrated term-by-term.
IA:
1%?
IB:
I$
dx, (see also Sine integral p. 458).
dz. (see also Exponential integral p. 459)
430
8. Inte.qra1 Calculus
3. Graphical integration is the third approximation method, which is discussed in 8.2.1.4. 5.. p. 444.
8.1.3 Integration ofRational Functions Integrals of rational functions can always be expressed by elementary functions.
8.1.3.1 Integrals of Integer Rational Functions (Polynomials) Integrals of integer rational functions are calculated directly by term-by-term integration:
/(a,zn + an_lxn-l +.
’.
+ a12 + ao) dx (8.12)
8.1.3.2 Integrals of Fractional Rational Functions The integrand of an integral of a fractional rational function
I Bp;:i
-dx, where P ( x ) and Q ( x ) are
polynomials with degree rn and n, respectively, can be transformed algebraically into a form which is easy to integrate. \Ye perform the following steps’ 1. Il’e simplify the fraction by the greatest common divisor. so P(2)and Q ( x )have no common factor. 2. !Ye separate the integer rational part of the expression. If m 2 n holds, then we divide P(2) by Q ( x ) Then s e have to integrate a polynomial and a proper fraction. 3. !Ye decompose the denominator Q ( x )into linear and quadratic factors (see 1.6.3.2,p. 44): Q(2) = a n ( x - ~ ) k ( z - 3 ) “ ~ ~ ( ~ Z + p 2 + q ) P ( 2 2 + p ’ ~ + q ’ ) s ~ ~ ~ with
p’4 - q < 0, f4 - q’ < 0.. . . .
(8.13a) (8.13b)
4. IVe factor out the constant coefficient a, in front of the integral sign.
5 . \$‘e decompose the fraction into a sum of partial fractions: The proper fraction we get after the division. which can nolonger be simplified and whose denominator is decomposed into a product of irreducible factors, can be decomposed into a sum of partial fractions (see 1.1.7.3, p. l5), which are easy t o integrate.
8.1.3.3 Four Cases of Partial Fraction Decomposition 1.Case: All Roots of the Denominator are Real and Single Q(2) = (Z - a)(. - 9) ’ . . (2 - A)
(8.14a)
a) PVe form the decomposition:
B P(2) ‘4 +-+... Q(2) 2 - a 2-3
+-x -Lx
(8.14b)
b) The numbers A. B , C, . . . . L can also be calculated by the method of undetermined coefficients (see 1.1.7.3. 1..p. 16). c) Il’e integrate by the formula (8.14d)
8.1 Indefinite Inteqrals 431
21
II = /
+3 1
q x- 11513 2. Case: All Roots of the Denominator are Real, Some of them with a Higher Multiplicity Q ( x ) = ( E - .)'(E
- S),.
".
(8.15a)
a) R'e form the decomposition: P ( x ) - A1 +-+...+A2 Q(z) - (x - a) (x - a)'
A1
(x - a)' (8.15b)
b) Tle calculate the constants A I , Az.. . . , A ' , B1, Bz,. . . , B,. . . . by the method ofundetermined coefficients (see 1.1.7.3, l . ,p. 16) c ) Tle integrate by the rule = ill h ( z - a). 53
1
dx = Ak ( k > 1). x - .)k ( k - 1)(x - a)k-1
+1
(8.15~)
The method of undetermined
coefficients yields A + B1 = 1, -3A - 2B1+ B2 = 0 , 3 A + B1 - Bz + B3 = 0. - A = 1; A = -1. B1 = 2, Bz = 1. B3 = 2. The result of the integration is 1 2 1 1 1 2 dz = - I n x + 2 l n ( x - 1) - -- -+e I = [-3. + 2-1 (x-1)2 (x - 1 ) 2 z =In---
+
/
x
(x - 1 ) 2
+
1-
+ e.
3. Case: Some Roots of the Denominator are Single Complex Suppose all coefficients of the denominator Q ( x ) are real. Then. with a single complex root of Q ( x )its conjugate complex number is a root too and we can compose them into a quadratic polynomial.
Q(z) = (x- a)'(. - +),. with
t .
(x2+ p x
+ q)(x2 +p'x + 4'). . .
(8.16a)
2 < q . 2 < q' , . . . , 4 4
(8.16b)
because the quadratic polynomials have no real zeros. a) w'e form the decomposition:
P(x) Q(x)
A1
2-0
+-+...+A2 A1 Bi +-..+.+, ( z -BzP ) (.-a)' (x-a)' x-8 Cz+D Ex+ F t... .
+ x z t px + q
+-
+
22
+ p'x + q'
Bm (x - 0 ) m (8.16~)
432
8. InteoraE Calculus
b) We calculate the constants by the method of undetermined coefficients (see 1.1.7.3, l.,p. 16). c) We integrate the expression
cx by the formula +px +q +
x2
( C x + D ) dx = ln(x2 + px t q ) 2 x2 t p x t q
C
4dx
,
-
A
+PI2
D - Cpf2 +-
+-cx
(8.16d)
. The method of undetermined coefficients yields the
+
equations A + C = 0, D = 0, 4A = 4, A = 1, C = -1, D = 0. 1 1 x I= dx = lnx - -ln(x2+4) +lnCl = ln- ' l X 2 term arctan is missing
1(;
-)
m , where in this particular case the
4. Case: Some Roots of the Denominator are Complex with a Higher Multiplicity (8.17a) & ( x )= ( 2 - a)k(x- 8)". (x' + p x + q)"(xZ +p'x + qy.. .. I
a) We form the decomposition:
A +2 A Ak + Bi BZ BI p(,) - -1 + . . . +y+ ...+ Q(x) x - a (x-0)' (x-a)k 2 - p (x-p) (x - PI1 C ~ XD1 C Z X Dz Cmx D, (XZ px q ) m 2 2 +pr q (x2 +pa: qy EZX Fz E,x F, E I X Fi t x2 P'. q' (22 t p'x q')* (x' p'x q')n . b) We calculate the constants by the method of undetermined coefficients. Cmx D, for m > 1 in the following steps: c) We integrate the expression (x' px q)m a)We transform the numerator into the form
+
+
+
+
+
+ +
+ +
+
+ + +...+ + + + +"'+ + + + +
(8.17b)
+ + +
C,x
c m + p ) + (Dm - -) CmP + Dm = -(22 2 2
(8.17~)
.
p) We decompose the integrand into the sum of two summands, where the first one can be integrated directly:
cm
(22+P)dZ
S- 2 (x' + p z t
-
q)m
Cm
1
2(" - 1) (2'
+ px + q)m-1 .
(8.17d)
7) The second one will be integrated by the following recursion formula, not considering its coefficient:
+P / 2 dx (8.17e) +pa: t q)m-l ' CiX+D1 C Z X + D Z 2x2+22+13 - A 22' + 22 + 13 I=/ dx : 22 +1 (x - 2)(x2+ 1 ) 2 - x - 2 (2' + 1)Z .( - 2)(x2 t 1 ) 2 The method of undetermined coefficients results in the following system of equations: A+C1 = 0, -2C1+D1= 0 , 2A+C1-2Dl+CZ = 2 , -2C1tD1-2CZSDz = 2, A-2D1-202 = 13; the coefficients are A = 1, C1 = -1, D1 = -2, C2 = -3, DZ = -4,
+ 2(m -21)m(-q3- p2/4)
I
(2'
+-+-
I=/(
2+2
2 - 2
x'+1
3s+4)dx,
(xZt1)Z
8.1 Indefinite Integrals 433
dx
According to (8.17e) we get finally the result is I =
3 - 42 1 t -1n2 ( 9 + 1) 2
(2 - 2)2
~
x2+
1
- 4arctanx
+ C.
Table 8.3 Substitutions for integration of irrational functions I
I
Substitution
Integral *
where r is the lowest common multiple of the numbers m, n, . . . . ~
/ R (x.Jax2
+ bx + c)
One of the three Euler substitutions:
dx:
+c =t - fix dax2 t bx + c = xt + f i
1. For a > 0 t
Jazz+ bx
2 . For c > 0
+ +
3. If the the polynomial ax2 bx c has different real roots: ax2 + bx + c = a ( x - a ) ( . - p)
dax2 + bx + c = t ( x - a )
* The symbol R denotes a rational function of the expressions in parentheses. The numbers n, m, . . . are integers. t If a < 0, and the polynomial ax2 t bx + c has complex roots, then the integrand is not defined for any value of 2, since dax2 bx c is imaginary for every real value of x. In this case the integral is meaningless.
+ +
8.1.4 Integration of Irrational Functions 8.1.4.1 Substitution to Reduce to Integration of Rational Functions Irrational functions cannot always be integrated in an elementary way. Table 21.5 contains a wide collection of integrals of irrational functions. In the simplest cases we can introduce substitutions, as in Table 8.3. such that the integral can be reduced to an integral of a rational function.
R (x,d a x 2 t bx + c) dx can be reduced to one of the following three forms
The integral
1
m) dx.
R (x.
(8.1sa)
/ R (x,m) dx,
(8.18b) (8.18~)
because the quadratic polynomial ax2 + bx t c can always be written as the sum or as the difference of two complete squares. Then, we can use the substitutions given in Table 8.4.
W A : 4x2+16s+17=4
434
8. Integral Calculus
IB: zz + 3x t 1 = x2 + 32
+ -9 - -5 = (z +
IC: -x2
+
i)' ($)' ($)'
with xl = z + -3 2 2s = 1 - xz t 2x - 1 = 1' - (z - 1)' = l2 - xy with x1 = x - 1. 4
4
-
= z;-
Tabelle 8.4 Substitutions for integration of irrational functions I1
Integral
lR
(x.
Substitution
m) dx
I R (x,d-)
dx
/ R (x, d m ) dx
x=asinht
or x = c r t a n t
x=acosht
or z = a s e c t
x=asint
or z = a c o s t
8.1.4.2 Integration of Binomial Integrands An expression of the form zm(a bzn)P (8.19) is called a binomial integrand, where a and b are arbitrary real numbers, and m, n. p are arbitrary positive or negative rational numbers. The theorem of Chebyshev tells that the integral
+
11
+
xm(a bzn)Pdz
(8.20)
can be expressed by elementary functions only in the following three cases: Case 1: Ifp is an integer, then the expression ( a t b P ) P can be expanded by the binomial theorem, so the integrand after eliminating the parentheses will be a sum of terms in the form exk, which are easy to integrate. m+1 Case 2: If -is an integer, then the integral (8.20) can be reduced to the integral of a rational n function by substituting t = t -, where r is the denominator of the fraction p m+l Case 3: If + p is an integer, then the integral (8.20) can be reduced to the integral of a rational
-
function by substituting t =
Substitution t =
$/T!
- where r is the denominator of the fraction p .
(? x/ = (t3 = ,
= St4(4t3 - 7 ) -tc.
7
Because none of the three conditions is fulfilled, the integral is not an elementary function.
8.1 Indefinite Inteorals 435
8.1.4.3 Elliptic Integrals 1. Indefinite Elliptic Integrals Elliptic integrals are integrals of the form
1R (x,dax3 +
1
R ( x , 4 a x 4 + bx3 + ex2 + ex + f ) dx.
bx2 t cx + e ) dx,
(8.21)
Csually they cannot be expressed by elementary functions; if it is still possible, the integral is called pseudoellzptzc. The name of this type of integral originates from the fact that the first application of them was to calculate the perimeter of the ellipse (see 8.2.2.2, 2.) p. 447). The inverses of elliptic integralsare the elliptic functions (see 14.6.1, p. 700). Integralsofthe types (8.21), which are not integrable in elementary terms. can be reduced by a sequence of transformations into elementary functions and integrals of the following three types (see [21.1], [21.2], [21.6]): dt
(1 - k2t2)dt
1 JjiTqcFq
(0< k < 1)) (8.22a) dt
(0< k < 1). (8.2213)
(0< k < 1).
(8.22~)
Concerning the parameter n in (8.22~)one has to distinguish certain cases (see [14.1]). By the substitution t = s i n p
the integrals (8.22a,b,c) can be transformed into the
Legendre form: Elliptic Integral of t h e First Kind:
Elliptic Integral of the Second Kind: Elliptic Integral of the T h i r d Kind:
(8.23a)
-4
1
+
(1 nsin2 p
(8.2313)
dlp.
(8.23~)
dlp
~ .
)
m
2. Definite Elliptic Integrals Definite integrals with zero as the lower bound corresponding to the indefinite elliptic integrals are denoted by (8.24a) ]J=dt,L 1 - k2 sin’
1;
dz:
= E(k,p),
(8.24b)
0
= n ( n ,k , 9) (for all three integrals 0
< k < 1 holds). (8.24~)
We call these integrals incomplete elliptic zntegrals of the first, second, and third kind for p = L. The 2 first two integrals are called complete elliptic integrals, and we denote them by (8.25a)
8. Intesral Calculus
436
(8.25b) Tables 21.7.1, 2, 3 contain the values for incomplete and complete elliptic integrals of the first and second kind F, E and also K and E . IThe calculation of the perimeter of the ellipse leads to a complete elliptic integral of the second kind as a function of the numerical eccentricity e (see 8.2.2.2, 2., p. 447). For a = 1.5, b = 1 it follows that e = 0.74. Since e = k = 0.74 holds, we get from Table 21.7.3: sincy = 0.74, Le., cy = 47" and E ( k , T ) = E(0.74) = 1.33. It follows that U = 4aE(0.74) = 4aE(cy = 47") = 4 * 1 . 3 3 ~= 7.98. 2 Calculation with the approximation formula (3.326~)yields 7.93.
8.1.5 Integration of Trigonometric Functions 8.1.5.1 Substitution With the substitution 2dt Le., dz = 1+t2' an integral of the form
t
= tan
z) 2
, 2t 1- t 2 cosz = slnz=-1 + t * ' 1+t2'
1
(8.26)
(8.27)
R (sin z,cos z)dx
can be transformed into an integral of a rational function, where R denotes a rational function of its arguments.
tan*:
-
2 + tan - + - In tan - + C. In some special cases we can apply simpler substitutions. If the
4 2 2 2 integrand in (8.27) contains only odd powers of the functions sinz and cosz, then by the substitution t = tan z a rational function can be obtained in a simpler way.
8.1.5.2 Simplified Methods Case 1: /R(sinz)coszdz. Case 2: Case 3:
1 1
R (cos z) sin zdz.
Substitution t = sinz, coszdx = d t .
(8.28)
Substitution t = cos z, sinz dz = -dt.
(8.29) (8.30a)
sin" zdx :
a) n = 2m t 1, odd:
I
- c o s Z z ) m s i n z d z = - (1-t2)mdt
/sin"zdz=/(l
b) n = 2m, even: /sinnzdx =
1
[:(l -
COS^^)]
with t = c o s z .
(8.30b)
m
dz =
(1 -
with t = 22.
(8.30~)
8.1 Indefinite Inteorals 437 We halve the power in this way. After removing the parentheses in (1 - cost)" we integrate term-byterm.
Case 4:
1
(8.3la)
x dx.
COS"
a) n = 2m t 1, odd:
1
x dx =
COS"
/(1 - sin' x ) cos~ x dx
=
I
(1 - tz)mdt
b) n = 2m, even: m
/cosnxdx=/[~(1+cos2x)]
d x = - +2:1
with t = sin x.
(8.31b)
1
( l + ~ o s t ) ~ d t with t = 2 x .
(8.3IC)
We halve the power in this way. After removing the parentheses we integrate term-by-term.
1 1
(8.32a)
sin" x cosm x dx.
Case 5:
a) One of the numbers m or n is odd: We reduce it to the cases 1 or 2.
A:
I
sin' xcos5 xdx =
sin'x (1 - sin' x)'cosxdx = /?(1 - t2)' dt with t = sinx.
b) The numbers m and n are both even: we reduce it to the cases 3 or 4 by halving the powers using the trigonometric formulas sin 22 1 - cos 22 1 + cos 232 sin zcos x = , cos'x = (8.32b) sin' x = 2 ' 2 ' ~
~
~
J z l(1
1 - cos4x) dx = - sin3 22 48
Case 6:
1
tann x dx =
1
+
1 1 . -x - - sin4x 16 64
'J
sin' 2z(1+ cos 2 2 ) dx = 8
sin2 cos4 x dx = (sin x cos x)2 cos' x dx =
1 16
+C.
tan"-' x(sec*x - 1) dx =
1
sin222 cos 2z dz t
x (tanx)'dx - tan"-' x dx
1
tann-' x - -- tan"-' x dx. (8.33a) n-1 By repeating this process we decrease the power and depending on whether n is even or odd we finally get the integral /dx=z
or
I t a n x d x = -1ncosx
(8.33b)
respectively.
Case 7:
1
cot" 1 dx.
(8.34)
The solution is similar to case 6. Remark: Table 21.5. p. 1017 contains several integrals with trigonometric functions
8.1.6 Integration of Further Transcendental Functions 8.1.6.1 Integrals with Exponential Functions Integrals with exponential functions can be reduced to integrals of rational functions if it is given in the form
/ R (emz,
ens, .
. . . eps) dx,
(8.35a)
I
438
8. Inteqral Calculus
where rn. n . . . . . p are rational numbers. We need two substitutions to calculate the integral: 1. Substitution o f t = e' results in an integral
f R ( t m , t n :. . t P )dt.
(8.3513)
2. Substitution of z = 8. where r is the loweest common multiple of the denominators of the fractions m. n,. . . > p ,results in an integral of a rational function.
8.1.6.2 Integrals with Hyperbolic Functions Integrals with hyperbolic functions, Le., containing the functions sinh x,cosh x,tanhx and coth x in the integrand, can be calculated as integrals with exponential functions, if the hyperbolic functions are replaced by the corresponding exponential functions. The most often occurring cases cosh" x dx
.I
I
sinh" x dx
sinh" x coshmx dx can be integrated in a similar way to the trigonometric functions
(see 8.1.5, , p. 436)
8.1.6.3 Application of Integration by Parts If the integrand is a logarithm, inverse trigonometric function, inverse hyperbolic function or a product of xm with lnx, ea=) sinax or cosax or their inverses, then the solution can be got by a single or repeated integration by parts. In some cases the repeated partial integration results in an integral of the same type as the original integral. In this case we have to solve an algebraic equation with respect to this expression. We can calculate in this way, e.g., the integrals eaz cos bzdx, e sin ' bx dx: where we need integration by
1
1
parts twice. Il'e choose the same type of function for the factor 2~ in both steps, either the exponential or the trigonometric function. u'e also use integration by parts if we have integrals in the forms
1
P (x)ea' dx: P (x)sin bx dx and
P (x)cos bx dx;where P (z)is a polynomial. (Choosing 2~ = P (x)the degree of the polynomial will
be decreased at every step.)
8.1.6.4 Integrals of Transcendental Functions The Table 21.5,p. 1017: contains many integrals of transcendental functions
8.2 Definite Integrals 8.2.1 Basic Notions, Rules and Theorems 8.2.1.1 Definition and Existence of the Definite Integral 1. Definition ofthe Definite Integral The definite integral of a bounded function y = f (x)defined on a finite closed interval [a.b] is a number, which is defined as a limit of a sum, where either a < b can hold (case A) or a > b can hold (case B). In a generalization of the notion of the definite integral (see 8.2.3, p. 451) we will consider functions defined on an arbitrary connected domain of the real line, e.g., on an open or half-open interval, on a half-axis or on the whole numerical axis, or on a domain which is only piecewise connected, Le., everywhere, except finitely many points. These types of integrals belong to zmproper zntegrals (see 8.2 3. l., p. 451).
2. Definite Integral as the Limit of a Sum iYe get the limit. leading to the notion of the definite integral, by the following procedure (see Fig. 8.1, 425).
8.2 Definite Intearals 439
1. Step: The interval [ a ,b] is decomposed into n sobzntervals by the choice of n - 1 arbitrary points . . , xn-l so that one of the following cases occurs:
1 1 . x2,
a = zo < x1 < x 2
< . . . < z, < . . < z,-1 < E ,
a = xo > x1 > x 2 > . ' ' > 1,>
'.
> xn-l > E ,
=b
(case A) or
(8.36a)
=b
(case B).
(8.36b)
Et is chosen in the inside or on the boundary of each subinterval as in Fig. 8.4: T , - ~5 Et 5 x , (in case 4) or z,-~ 2 (, 2 z, (in case B). (8.36~)
2. Step: .4 point
41
5,
53
-004
m ... xl
b=xnT x,,.~ *
5"
'
5n
51
AXn-1
T
51
*..
Ax2 Axl Axo
* ... x3 T x,T x1'- "? T xo=a 53
52
ffi
(8)
51
Figure 8.4 3. Step: The value f (t,)of the function f ( x ) at the chosen point is multiplied by the corresponding difference Ax,-1 = E , - xt-13Le., by the length of the subinterval taken with a positive sign in case A and taken with negative sign in case B. This step is represented in Fig. 8.1, p. 425 for the case A. 4. Step: Then all the n products f (Ez) A E , - ~ are added 5. Step: The limit of the obtained zntegral approxzmatzon sum or Rzemann sum
(8.37) is calculated if the length of each subinterval 4zi-l tends to zero and consequently their number n as an infinitesimal quantity. tends to x. Based on this, we can also denote If this limit exists independently of the choice of the numbers z, and &, then it is called the definite Riemann integral of the considered function on the given interval. We write (8.38) The endpoints of the interval are called limits of integration and the interval [a>b] is the integration interval: a is the lower limit, b is the upper limit of integration: z is called the integration variable and f(x) is called the integrand.
3. Existence of the Definite Integral The definite integral of a continuous function on [a. b] is always defined, Le., the limit (8.38) always exists and is independent of the choice of the numbers xi and E,. Also for a bounded function having only a finite number of discontinuities on the interval [a. b] the definite integral exists. The function whose definite integral exists on a given interval is called an integrablefunction on this interval.
8.2.1.2 Properties of Definite Integrals The most important properties of definite integrals explained in the following are enumerated in Table 8.5, p. 441.
I
440 8. Inte9ral Calculus
1. Fundamental Theorem of Integral Calculus If the integrand f(x) is continuous on the interval [a,b], and F b
(2) is
a primitive function, then
b
/ f (x)dx = / F'(x)dx
=F
(x)18 = F (b) - F ( a )
(8.39)
a
a
holds, Le., the calculation of a definite integral is reduced to the calculation of the corresponding indefinite integral, to the determination of the antiderivative:
F (z) =
/ f (x)dx + C.
(8.40)
Remark: There are integrable functions which do not have any primitive function, but we will see that, if a function is continuous, it has a primitive function.
2. Geometric Interpretation and Rule of Signs 1. Area under a Curve For all x in [a,b] let f (x)2 0. Then the sum (8.37)can be considered as the total area of the rectangles (Fig. 8.1), p. 425,which approximate the area under the curve y = f (2). Therefore the limit of this sum and together with it the definite integral is equal to the area of the region A, which is bounded by the curve y = f (x),the z-axis, and the parallel lines x = a and x = b:
A=
f (z)dz
( a < b and
f (z) 2 0 for a 5 x 5 b).
(8.41)
a
2. Sign Rule If a function y = f(z) is piecewise positive or negative in the integration interval (Fig. 8.5),then the integrals over the corresponding subintervals, that is, the area parts, have positive or negative values, so the integration over the total interval yields the sum of signed areas. In Fig. 8.5a-d four cases are represented with the different possibilities of the sign of the area. VI
VA
Y
ys
f(x)>O, a>b
f(x)
f(x)b
c) Figure 8.5 IA:
/'==
sin zdx (read: Integral from x = 0 to z = T ) = (- cos 21: = (- cos F
+ cos 0) = 2.
2=0
W B: L'=*'sin zdz
(read: Integral from x = 0 to x = 2 ~ =) (- cos
= (- cos 2x
+ cos0) = 0.
3. Variable Upper Limit 1. Particular Integral If we consider the area depending on the upper limit as its variable (Fig. 8.6, region ABCD), then we have an area function in the form
/ f ( t )d t 2
S(z) =
(a < b and f(z) 2 0 for x 2 a ) .
a
We call this integral a particular integral.
(8.42)
8.2 Definite Inteorals 441
x
O a
x
The geometrical meaning of this theorem is that the derivative of the variable area S(z) is equal to the length of the segment YM (Fig. 8.7). Here, the area, just as the length of the segment. is considered according to the sign rule (Fig. 8 . 5 ) .
4. Decomposition of the Integration Interval The interval of integration [a, b] can be decomposed into subintervals. The value of the definite integral over the complete interval is
p a
f(x)dr
=
]f(x)dx + p f(.) a
dz.
(8.44)
C
This is called the interval rule. If the integrand has finitely many jumps, then the interval can be decomposed into subintervals such that on these subintervals the integrand will already be continuous. Then the integral can be calculated according to the above formula as the sum of the integrals on the subintervals. .4t the endpoints of the subintervals the function must be defined by its corresponding left or right-sided limit. if it exists. If it does not, then the integral is an improper integral (see 8.2.3.3, l.,p. 454) b] if we suppose Remark: The formula above is valid also in the case if c is outside of the interval [a, that the integrals on the right-hand side exist.
8.2.1.3 Further Theorems about the Limits of Integration 1. Independence of the Notation of the Integration Variable The value of a definite integral is independent of the notation of the integration variable: (8.45)
442
8. Intearal Culczllzls Table 8.5 Important properties of definite integrals
Property
Formula
1f(z)
dz = F ( z ) =:I
Fundamental theorem of the integral calculus ( f ( z )is continuous)
a
p
Interchange rule
~ ( a=)
1
f ( z ) dz = -
1
F(b) - F ( a ) with
dz + c or ~ ' ( z = ) f(x)
f ( z ) dx
b
a
Equal integration limits
Interval rule
p
Independence of the notation of the integration variable Differentiation with respect to the upper limit
~
Mean value theorem of the integral calculus
I/
f(x)dz =
a
p
p
f ( u )du = f(t ) d t
a
a
f ( t ) dt = f(z) with f(z)continuous
~
f(.)
dx = ( b - a ) f ( E )
(a < E
1
< b)
a
2. Equal Integration Limits If the lower and upper limits are equal, then the value of the integral is equal to zero:
p
f ( z ) dz = 0.
(8.46)
a
3. Interchange of the Integration Limits After interchanging the limits: the integral changes the sign (interchange rule): (8.47) a
4. Mean Value Theorem and Mean Value 1. Mean Value Theorem If a function f(z)is continuous on the interval [aIb]. then there is at least one value in this interval such that in the case A with a < E < band in the case B with a > E > b (see 8.2.1.1, 2..p.439)
<
j - i i x ) dx = ( b - a ) f ( E )
(8.48)
a
is valid. The geometric meaning of this theorem is that between the points a and b there exists at least one point
8.2 Definite Inteorals 443
E such that the area of the figure ABCD is equal to the area of the rectangle AB'C'D in Fig. 8.8. The value 1
b
/
m=f(x)dx b-ao
(8.49)
is called the mean value or the arithmetic average ofthe function f(x) in the interval [a, b] . 2. Generalized Mean Value Theorem If the functions f(x) and p(z) are continuous on the closed interval [a. b]. and ~ ( 2does ) not change its sign in this interval, then there exists at least one point such that b
/ f(x)&)
b
ds = f(E)
a
(a < E
dx
< b)
(8.50)
a
is valid. 5 . Estimation ofthe Definite Integral The value of a definite integral lies between the values of the products of the infimum m and the supremum M of the function on the interval [a,b] multiplied by the length of the interval:
/ f(x)dx 5 M ( b b
m(b - a ) 5
- a).
(8.51)
a
Iff is continuous. then m is the minimum and M is the maximum of the function. It is easy to recognize the geometrical interpretation of this theorem in Fig. 8.9.
Figure 8.9
8.2.1.4 Evaluation of the Definite Integral 1. Principal Method The principal method of calculating a definite integral is based on the fundamental theorem of integral calculus. Le., the calculation of the indefinite integral (see 8.2.1.2, 1..p. 440), e.g., using Table 21.5. Before substituting the limits we have to check if we have an improper integral. Nowadays we have computer algebra systems to determine analytically indefinite and definite integrals (see Chapter 20).
2. Transformation of Definite Integrals In many cases, definite integrals can be calculated by appropriate transformations, with the help of the substitution method or partial integration. W A: Use the substitution method for I =
I"m
d
Jo
z
First we substitute: x = p(t) = asint, t = $(x) = arcsin ,: li(0) = 0, $ ( a ) = I .We get: 2 I = L a J m d x = / a r c s ' n 1 a 2 ~ ~ c o s t =d at 2 ~ 4 c o s 2 t d t = a425(1+cos2t)dt. 1~ arcsin 0
R'ith the further substitution t = p(z) = f,z = @ ( t )= 2t, @(O) = 0. @(:) a2
I = -ti! 2
*
a2 + -/ 4 0
7r
2
iTa2
a'
get:
.a2
coszdz = - + - sinzl' - 4 4 0 4 '
W B: Method of partial integration:
= i~ we
1'
x ex dx = [xe']; -
1'
e"
dx = e - ( e - 1) = 1.
444
8. Inteoral Calculus
3. Method for Calculation of More Difflcult Integrals If the determination of an indefinite integral is too difficult and complicated, or it is not possible to express it in terms of elementary functions, then there are still some further ideas to determine the value of the integral in several cases. Here, we mention integration of functions with complex variables (see the examples on p. 692-695) or the theorem about the differentiation of an integral with respect to a parameter (see 8.2.4, p. 457):
d dt
1
(8.52)
T d x .
a
a
/#
1 b
f ( x ,t )dx =
'x-1 x d x . Introducing the parameter t: F ( t ) =
1
lxt-l -dx;
F ( 0 ) = 0; F(l) = I . lnx ' d xt-l 1xtlnx 1 dx = -dx = [ x t d x = -xttll' = dt o a t lnx o lnx t+l 0 tS1' t d t Integration: F ( t ) - F ( 0 ) = -= In(t t 1)Ifi= ln(t + 1). Result: I = F(l)= ln2. 0 t + l
II =
1 [-] 1
1
dF Using (8.52) for F ( t ) : - =
o
1
4. Integration by Series Expansion If the integrand f(x) can be expanded into a uniformly convergent series (8.53)
f ( x )= ' P 1 ( x ) + p z ( x ) + . . . + ' P n ( x ) + . . . in the integration interval [a, b],then the integral can be written in the form
In this way the definite integral can be represented as a convergent numerical series: (8.55) When the functions pk(x) are easy to integrate, if, e.g.. f ( x ) can be expanded in a power series, which is uniformly convergent in the interval [a,a], then the integral accuracy.
ICalculate the integral I =
1'''
l
f ( x )dx can be calculated to arbitrary
dx with an accuracy of 0.0001. The series e-"' = 1 -
e-"'
$+
$ $ + $ - . . . is uniformly convergent in any finite interval according to the Abel theorem (see -
7.3.3.1, p. 414), so /e-"' follows that I = /#"* e-"'
i
1
1
holds. With this result it
dx = x
dx = 1
i (1 1
-
-.
,
.). To achieve the accuracy 0.0001 for the calculation of the
= (1 + - zFBB + integral it is enough to consider the first four terms, according to the theorem of Leibniz about alternating series (see 7.2.3.3, l., p. 408):
I x i ( 1 - 0.08333 + 0.00625 - 0.00037) =
i '0.92255 = 0.46127,
rz e
x 2 dx
= 0.4613.
-
5 . Graphical Integration Graphical integration is a graphical method to integrate a function y = f ( x ) which is given by a curve Ai? (Fig. 8.10),i.e., to calculate graphically the integral
f ( x )dx, the area of the region MoABN:
8.2 Definite Inteurals 445
1.
The
interval
is
divided
by
the
points
~ I ~ z , x ~ , x ~ /. z. ,,~,-1,z,-1/2 xz,. into 2n equal parts, where the
result is more accurate if there are more points of division. we draw vertical lines intersecting the curve. The ordinate v a w o f the segments are denoted on the y-axis by OA1, OAz, . . . , OA,. 3. A segment oP of arbitrary length is placed on the negative x-axis, and P is connected with the points AI, Az, . . . ,A,. 4. Through the point ,440 a line segment is drawn parallel to PA1 to the intersection point with the line z = z&s is the segment M&'1. Through the point Mi the segment MlMz is drawn parallel to PA2 to the intersection with the line z = 2 2 , etc., until the last point M,,is reached with the abscissa zn. Theintegral .is numerically equal to the product of the length of OP and the length of 2. .At the points of division x l p , x 3 p , . . .
1
~
m:
-j ( z ) dx = OP . NM,.
Figure 8.10
(8.56)
a
By a suitable choice of the arbitrary segment oP the shape of our result can be influenced; the smaller the graph we want. the longer we should choose the segment
m.If oP = 1, then
l
f(z)dx =
m,
and the broken line ,Wo, Ml, M Z , . . . , M , represents approximately the graph of a primitive function of
f(x).Le., one of the functions given by the indefinite integral
I
j ( z ) dx.
6. Planimeter and Integraph A planimeter is a tool to find the area bounded by a closed plane curve, thus also to compute a definite integral of a function y = j(x) given by the curve. Special types of planimeters can evaluate not only
/
y dx, but also
S
y2 dx and
I
y3 dx.
An integraph is a device which can be used to draw the graph of a primitive function Y =
lz
f ( t ) dt if
the graph of a function y = j(x) is given (see [19.29]).
7. Numerical Integration If the integrand of a definite integral is too complicated, or the corresponding indefinite integral cannot be expressed by elementary functions, or we have the values of the function only at discrete points, e.g., from a table of values, then we use the so-called quadrature formulas or other methods of numerical mathematics (see 19.3.1. p. 895).
8.2.2 Application of Definite Integrals 8.2.2.1 General Principles for Application of the Definite Integral 1. We decompose the quantity we want to calculate into a large number of very small quantities, Le., into infinitesimal quantities: (8.57) A = a1 a2 ' ' . +a,. 2. We replace every one of these infinitesimal quantities a, by a quantity iL,, which differs only very slightly in value from a,, but which can be integrated by known formulas. Here the error CY, = a, - 5, should be an infinitesimal quantity of higher order than ai and 6%. 3. We express 6%by a variable z and a function f(z)so that Sdi has the form j(x,)Ax,. 4. \Ye evaluate the desired quantity as the limit of the sum
+ +
446
8. Intesral Calculus
(8.58) where Ax, 2 0 holds for every i. The lower and upper limit for x is denoted by a and b. IWe evaluate the volume V of a pyramid with base area S And height H (Fig. 8.11a-c): a) We decompose the required volume V by plane sections into frustums (Fig. 8.11a): \' = v1+ 11'2 + ' . ' + un. b) We replace every frustum by a prism, whose volume is iji. with the same height and with a base area of the top base of the frustum (Fig. 8.11b). The difference of their volumes is an infinitesimal quantity of higher order than v,. c ) We represent the volume ijt in the form iji = Si Ahi , where h, (Fig. 8 . 1 1 ~ )is the distance of the top surface from the vertex of the pyramid. Since S,: S = h: : H 2 Sh2 we can write: iji = 3Ah, .
H d) We calculate the limit of the sum
V= Figure 8.11
l i m e 6 - lim~-A Sh2 h , = / - - d hH=S-h- Z. ' - n + m 2=1 H2 H2
n + m ,t=1
SH 3
8.2.2.2 Applications in Geometry 1. Area of Planar Figures 1. Area of a Curvilinear Trapezoid Between B and C (Fig. 8.12a) if the curve is given by an equation in explicit form (y = f(z)and a 5 z 5 b) or in parametric form (x = z ( t ) , y = y(t). t l 5
t 5 tz):
pr!.)
(8.59a)
S,AB = ~ ~ dx = a
tl
2. Area of a Curvilinear Trapezoid Between G and H (Fig. 8.12b) if the curve is given by an equation in explicit form (x = g(y) and a 5 y 5 j3) or in parametric form (z= z ( t ) . y = y(t). tl 5
t 5 t2): (8.59b)
3. Area of a Curvilinear Sector (Fig. 8.12c), bounded by a curve between K and L , which is given by an equation in polar coordinates ( p = p(cp), cp1 5 p 5 cpz): (8.59~) Areas of more complicated figures can be calculated by partition of the area into simple parts, or by line integrals (see 8 3. p. 460) or by double integrals (see 8.4.1, p. 469).
8.2 Definite Integrals 447
a)
b) Figure 8.12
2. A r c l e n g t h o f P l a n e Curves 1. Arclength of a Curve Between T w o Points (I) A and B. given in explicit form (y = f ( z )or x = g(y)) or in parametric form (z= x ( t ) .y = y(t)) (Fig. 8.13a) can be calculated by the integrals:
L- = 4B
p
=
4-dz
a
1
J&jTldy
=
a
J4-k
(8.60a)
tl
\\lth the differential of the arclength dl we get
L=
/ dl
d12 = dx2 + dy’.
with
(8.60b)
The perimeter of the ellipse with the help of (8.60a): With the substitutions z = z ( t ) = a sin t , y =
y ( t ) = b c o s t wegetL- = AB
l:
da2
-
(a2 - b2) sin2t dt = a
6’
d t : where e = -/a
is the numerical eccentricity of the ellipse. Since z = 0. y = b and x = a , y = 0: the limits of the integral in the first quadrant are L4a
l”*d x
d
t
= a E(k,
ble 21.7 (see example on p, 436).
r ) with k
=
AB
=
2
E.
The value of the integral E ( k , -) T is . given in Ta2
2. Arclength of a Curve Between T w o Points (11) C and D. given in polar coordinates ( p = p(;r”)) (Fig. 8.13b):
(8.60~) LVith the differential of the arclength dl we get
L=
J
+
dl with d/* = p2dp2 dp2.
(8.60d)
3. S u r f a c e Area of a Body o f R e v o l u t i o n (see also First Guldin Rule, p. 451) 1. The area of the surface of a body given by rotating the graph of the function y = f(z)around the z-axis (Fig. 8.14a) is:
(8.61a)
I
448
8. Inteqrul Calculus
VI
Figure 8.13
Figure 8.14 2. The area of the surface of a body given by rotating x = f(y) around the y-axis (Fig. 8.14b) is:
S = 2 ; 4/ z d i = 2 r / x ( y ) ~ ~ d y (8.61b) a
n
To calculate the area of more complicated surfaces see the application of double integrals in 8.4.1.3, p. 472 and the application of surface integrals of the first kind, 8.5.1.3, p. 480. General formulas for the calculation of surface areas with double integrals are given in Table 8.9 (Application of double integrals), p. 473. 4. Volume (see also Second Guldin Rule, p. 461) 1. The volume of a rotationally symmetric body given by a rotation around the x-axis (Fig. 8.14a) is:
V
= K
j
y'dx.
(8.62a)
2. The volume of a rotationally symmetric body given by a rotation around the y-axis (Fig. 8.14b) is:
j
V = K x'dy.
(8.6213)
3. The volume of a body, whose section perpendicular to the x-axis (Fig. 8.15) has an area given by the function S = f ( z ) , is:
V=
j
f(z)dz.
(8.63)
8.2 Definite Integrals 449 General formulas t o calculate volumes by multiple integrals are given in Table 8.9 (Applications of double integrals, see p. 473) and Table 8.11 (Applications of triple integrals, see p. 478).
Figure 8.15
Figure 8.16
8.2.2.3 Applications in Mechanics and Physics 1. Distance Traveled by a Point The distance traveled by a moving point during the time from to until T with a time-dependent velocity v = f ( t ) is T
s=Judt
to
2. Work To determine the work in moving a body in a force field we suppose that the direction of the field and the direction of the moveme$ are constant and coincident. We define the z-axis in this direction. If the magnitude of the force F is changing, Le., IF1 = f(z),then we get for the work W necessary to move the body from the point I = a to the point z = b along the z-axis:
(8.65) In the general case, when the direction of the force field and the direction of the movement are not coincident, we calculate the work as a line integral (see (8.130), p. 467) of the scalar product of the force and the variation of the position vector at every point of i along the given path.
3. Pressure In a fluid at rest with a density e we distinguish between yravztational pressure and lateral pressure. This second one is exerted by the fluid on one side of a vertical plate immersed in the fluid. Both depend on the depth. 1. Gravitational Pressure The gravitational pressure P h at depth h is: Ph = @ Y h , (8.66) where y is the gravitational acceleration. 2. Lateral Pressure The lateral pressure p s , e.g., on the cover of a lateral opening of a container of some fluid Kith the difference of depth hl - hz (Fig. 8.16) is: ha
Ps = eY JzIy?(z) - Y1(2)1 dz. hi
The left and the right boundary of the Cover is given by the functions yl(z) and y*(z).
(8.67)
8. Intearal Calculus
450
4. Moments of Inertia 1. Moment of Inertia of an Arc The moment of inertia of a homogeneous curve segment y = f(z) with constant density pin the interval [a,b] with respect to the y-axis (Fig. 8.17a) is: (8.68)
If the density is a function ~ ( z )then , its analytic expression is in the integrand
Y
Figure 8.17 2. Moment of Inertia of a Planar Figure The moment of inertia of a planar figure with a homogeneous density e with respect to the y-axis, where y is the length of the cut parallel to the y-axis (Fig. 8.17b), is: b
Iy = e / x 2 y d z .
(8.69)
a
(See also Table 8.4.2.3: (Applications of line integral), p. 477.) If the density is position dependent, then its analytic expression must be in the integrand.
b)
a)
c)
Figure 8.18
5 . Center of Gravity, Guldin Rules 1. Center of Gravity of an Arc Segment The center of gravity C of an arc segment of a homogeneous plane curve y = f(x) in the interval [a, b] with a length L (Fig. 8.18a) considering (8.60a), p. 447. has the coordinates:
(8.70) Yc = a L L 2. Center of Gravity of a Closed Curve The center of gravity C of a closed curve y = f(z) (Fig. 8.18b) with the equations y1 = f~(z) for the upper part and y2 = f * ( x )for the lower part, and
xc=
a
8.2 Definite Intearals 451
with a length L has the coordinates: j&m+J-idx
xc =
a
3
L
Yc=a
L
. (8.71)
3. First Guldin Rule Suppose a plane curve segment is rotated around an axis which lies in the plane of the curve and does not intersect the curve. We choose it as the x-axis. The surface area S ,, of the body generated by the rotated curve segment is the product of the perimeter of the circle drawn by the centre of gravity at a distance yc from the axis of rotation. Le., 27ryc, and the length of the curve segment L : s,,,= L , 2iTyc. (8.72) 4. Center of Gravity ofa Trapezoid The center of gravity C of a homogeneous trapezoid bounded above by a curve segment between the points of the curve A and B (Fig. 8.18c), with an area Sof the trapezoid. and with the equation y = f ( x ) of the curve segment AB, has the coordinates: b
b
b/y2dx
/ W d X
xc = 0 .
yc=-
S
’
(8.73)
5 . Center of Gravity of an Arbitrary Planar Figure The center of gravity C of an arbitrary planar figure (Fig. 8.18d) with area S.bounded above and below by the curve segments with the equations y1 = f l ( x )and y2 = f 2 ( x ) ,has the coordinates
(8.74)
Formulas to calculate the center of gravity with multiple integrals are given in Table 8.9 (.4pplication of double integrals, p. 473) and in Table 8.11 (Application of triple integrals, p. 478). 6. Second Guldin Rule Suppose a plane figure is rotated around an axis which is in the plane of the figure and does not intersect it. We choose it as the x-axis. The volume V of the body generated by the rotated figure is equal to the product of the perimeter of the circle drawn by the center of gravity under the rotation, Le., 27ryc, and the area of the figure S: (8.75) vrot= s ’ 27ryc.
8.2.3 Improper Integrals, Stieltjes and Lebesgue Integrals 8.2.3.1 Generalization of t h e Notion of t h e Integral Thenotionofthedefiniteintegral (see 8.2.1.1, p. 438), as aRiemannintegra1 (see 8.2.1.1,2., p. 439), was introduced under the assumptions that the function f(z)is bounded, and the interval [a, b] is closed and finite. Both assumptions can be relaxed in the generalizations of the Riemann integral. In the following we mention a few of them.
1. Improper Integrals These are the generalization of the integral to unbounded functions and to unbounded intervals. We discuss integrals with infinite integration limits and integrals with unbounded integrands in the next paragraph.
2. Stieltjes Integral for Functions of One Variable bye start from two finite functions f ( x ) and g(z)defined on the finite interval [a, b]. We make a partition of the interval into subintervals, just as with the Riemann integral, but instead of the Riemann sum (8.37) we compose the sum
452
8. Inteqral Calculus
(8.76) If the limit of (8.76) exists, when the length of the subintervals tends to zero, and it is independent of the choice of the points x, and El, then this limit is called a dejnite Stieltjes integral (see [8.8]). IFor g ( x ) = x the Stieltjes integral becomes the Riemann integral.
3. Lebesgue Integral Another generalization of the integral notion is connected with measure theory (see 12.9,p. 633), where the measure of a set, measure spaces, and measurable functions are introduced. In functional analysis the Lebesgue integral is defined (see 12.9.3.2, p. 635) based on these notions (see [8.6]). The generalization with comparison to the Riemann integral is, e.g., the domain of integration can be a rather general subset of Rnand it is partitioned into measurable subsets. There are different notations for the generalizations of the integrals (see [8.8]).
8.2.3.2 Integrals with Infinite Integration Limits 1. Definitions a) If the integration domain is the closed half-axis [a, +co),and if the integrand is defined there, then
the integral is by definition
1
1 B
+X
f(x)dx = B-+W lim
a
(8.77)
f(x)dx.
a
If the limit exists, then the integral is called a convergent improper integral. If the limit does not exist, then the improper integral (8.77) is divergent. b) If the domain of a function is the closed half-axis (-m, b] or the whole real axis (-cc,+m), then we define analogously the improper integrals
c) At the limits of (8.78b) the numbers A and B tend to infinity independently of each other. If the limit (8.78b) does not exist, but the limit +A
lim A-tm
1f
(8.78~)
(x)dx,
-A
exists, then this limit (8.78~) is called the principal value ofthe improper integral. or Cauchy’s principal value. Remark:An obviously necessary but not sufficient condition for the convergence of the integral (8.77) f(x) = 0. is rlim i m
2. Geometrical Meaning of Integrals with Infinite Limits The integrals (8.77)) (8.78a) and (8.78b) give the area of the figures represented in Fig. 8.19. 33 dx B dx IA: - = lim - = lim 1nB = co (divergent). B+m 1 x B-tm 1 x B dx 1 1 m dx IB: - = lim - = lim = (convergent). 2 x2 B+m 2 x2 B+m 2 B 2
1
1
J
J
(- -) 1 =
lirn [arctan B - arctan A] =
8.2 Definite Integrals 453
Figure 8.19
3. Sufficient Criteria for Convergence Ifthedirect calculationofthelimits (8.77), (8.78a) and (8.78b) iscomplicated, or ifonlytheconvergence or divergence of an improper integral is the question. then one of the following sufficient criteria can be used. Here, only the integral (8.77) is considered. The integral (8.78a) can be transformed into (8.77) by substitution of z by - x : (8.79) -a
-X
The integral (8.78b) can be decomposed into the sum of two integrals of type (8.77) and (8.78a): tx
/
tx
f(z)dz t
f ( z ) dz =
-X
-X
/ f(z)
(8.80)
dz.
C
where c is an arbitrary number Criterion 1: If f(x) is integrable on any finite subinterval of [a, co)and if the integral (8.81) a
exists. then there exists also the integral (8.77). The integral (8.77) is in this case said to be absolutely convergent. and the function f(z) is absolute zntegrable on the half-axis [a, +m). Criterion 2: If for the functions f(z)and p(z) (8.82a) f(r)> 0, ~ ( 2>) 0 and f(z)5 p(z) for a 5 z < +m hold, then from the convergence of the integral
/ g(s)dx a
/ f(x)dx
tm
tco
(8.82b)
the convergence of the integral
a
(8.82~)
8. Inteqrul Calculus
454
follows, and conversely. from the divergence of the integral (8.82~)the divergence of the integral (8.82b) follo~vs. Criterion 3: If we substitute 1 (8.83a) P(X) = 2. and we consider that for a > 0,
(a- 1 ) u O - 1
a
cy
> 1 the integral (8.83b)
(a>O.cu>l)
is convergent and has the value of the right-hand side, and the integral of the left-hand side is divergent for cy 5 1. then we can deduce a further convergence criterion from the second one: If f ( z ) in a 5 x < co is a positive function. and there exists a number cy > 1 such that for x large enough (8.83~) f ( x ) x' < k < co ( k > 0, const) holds. then the integral (8.77) is convergent: if f ( z ) is positive and there exists a number cy 5 1 such that (8.83d) f ( x ) x" > c > 0 (c > 0, const) holds from a certain point, then the integral (8.77) is divergent. f m xsf2 dx
If we substitute
cy
=
1 x3~2 , then we get -2''' 2 1+x2
-
22
= - 3 1. The integral is
1 +x2
divergent.
4. Relations Between Improper Integrals and Infinite Series If x i , x2.. . . ,I,. . . . is an arbitrary, unlimited increasing infinite sequence, Le., if lim z, = co, a < z1 < 2 2 < ... < z, < ... with n+tx
(8.84a)
and if the function f(x) is positive for a 5 x < x. then the problem of conveigence of the integral (8.77) can be reduced to the problem of convergence of the series (8.84b) If the series (8.84b) is convergent, then the integral (8.77) is also convergent, and it is equal to the sum of the series (8.84b). If the series (8.84b) is divergent, then the integral (8.77) is also divergent. So the convergence criteria for series can be used for improper integrals, and conversely, in the integral criterion for series (see 7.2.2.4. p. 406) we use the improper integrals to investigate the convergence of infinite series.
8.2.3.3 Integrals with Unbounded Integrand 1. Definitions 1. Right O p e n or Closed Interval For a function f ( z ) ,which has a domain open on the right [a. b) or a domain closed on the right [a, b]. but at the point b it has the limit Kl0f(x) = x.we have the
definition of the improper integral in both cases.
1 b
a
1 b-E
f ( x ) dx = lim E
4
f ( x ) dx.
(8.85)
Q
If this limit exists and is finite. then the improper integral (8.86) exists, and we call it a convergent zmproper zntegrul. If the limit does not exist or it is not finite, then we call the integral a divergent zmproper zntegrul.
8.2 Definite Intearals 455
2. Left Open or Closed Interval For a function j ( z ) ,which has a domain open on the left ( a ,b] or a domain closed on the left [a.b], but at the point a it has the limit lim f ( s ) = CIC) we define the Z+l+O
improper integral analogously to (8.85). That is: (8.86) a
a+E
3. Two Half-Open Contiguous Intervals For a function j ( z ) ,which is defined on the interval [a,b] except at an interior point c with a < c < b, i.e., for a function j ( z ) defined on the half-open intervals [ a ,c) and (c. b], or is defined on the interval [a,b], but at the interior point c it has an infinite limit at least from one side lim f ( z ) = co or lim f ( z ) = M, the definition of the improper integral x+c-o
ZC ’ +O
is:
1 b
1 C-E
f ( z ) dz = lim
€40
f ( z ) dz
a
0
+F z
1f(.) b
dx.
(8.87a)
C+6
Here the numbers E and 6 tend to zero independently of each other. If the limit (8.87a) does not exist, but the limit
does, then we call the limit (8.87b) the principal value of the improper integral or Cauchy’s principal
value. 2. Geometrical Meaning The geometrical meaning of the integral of discontinuous functions (8.85), (8.86), and (8.87a) is to find the area of the figures bounded. e.g., from one side by a vertical asymptote as represented in Fig.8.20.
a) s. (8.85)
IA:
lb* 0
IB:
b) s. (8.86)
c) s. (8.87a)
Figure 8.20 :
Case (8.86), singular point at z = 0.
6
i”* 1‘’’
t a n z dz : Case (8.85), singular point at z = T . 2
tan zdz = lim
E 4 0
IT’*-‘ Q
t a n z ds = lim In cos 0 - In cos E+O
(f - E ) ]
=x
(divergent)
8. Inteqral Calculus
456
IC:
f 2: Case (8.87a), singular point at x = 0. 3 3 9 - 1) + lim -(4 - J213) = (convergent). s_" $ + limJd @ = lim -(&'I3 fi = lim\-' 2 @ 2 2 1 f i
€ 4-1
1
ID:
\
2
6-0
6
-
6+0
E-02
2xdx
-: Case (8.87a), singular point at x = ikl -1
-2 x2
= limln(x2 - 1) E+O
+
'.
= lim[ln(l €+O
+ 2~ + E'
- 1) - In31
+ ... = m
(divergent).
3. The Application of the Fundamental Theorem of Integral Calculus 1. Warning The calculation of improper integrals of type (8.87a) with the mechanical use of the formula
\ j ( x )dx b
= F ( x ) ;1
with F'(x)= j ( x )
(8.88)
a
(see 8.2.1.1,p. 438) usually results in mistakes if the singular points in the interval [a, b] are not taken into consideration. IE: Using formally the fundamental theorem we get for the example D
L2 2
2xdx
ln3 - ln3 = 0,
= ln(x2 - 1)
though this integral is divergent. 2. General Rule The fundamental theorem of integral calculus can be used for (8.87a) only if the primitive function of f ( x ) can be defined to be continuous at the singular point. IF: In the example D the function ln(x2 - 1) is discontinuous at x = 41, so the conditions are not 3 1 fulfilled. Consider the example C. The function y = - x213is such a primitive function of - on the 2 @ intervals [a. 0) and (0, b] which can be defined continuously at IC = 0, so the fundamental theorem can be used in the example C:
4. Sufficient Conditions for the Convergence of an Improper Integral with
Unbounded Integrand
1 j ( x ) l dx i," absolutely convergent integral 1. If the integral
exists, then the integral
I"
f(x)dx also exists. In this case we call it an
and the function f(x)is an absolutely integrable function on the consid-
ered interval. 2. If the function j ( x ) is positive in the interval [a, b ) , and there is a number a < 1 such that for the values of z close enough to b (8.89a) j ( 2 ) ( b - z ) <~ x holds, then the integral (8.87a) is convergent. But, if the function f(x)is positive in the interval [a,b), and there is a number a > 1 such that for the values of x close enough to b (8.89b) j ( x )( b - x)" > c > 0 (c const) holds. then the integral (8.87a) is divergent.
8.2 Definite Inteorals 457
8.2.4 Parametric Integrals 8.2.4.1 Definition of Parametric Integrals The definite integral
p
(8.90)
f(x. dx = F(Y)
a
is a function of the variable y considered here as a parameter. In several cases the function F ( y ) is no longer elementary, even if f(x,y) is an elementary function of x and y. The integral (8.90) can be an ordinary integral, or an improper integral with infinite limits or unbounded integrand f(x,y). For theoretical discussions about the convergence of improper integrals depending on a parameter see, e.g.,[8.3]. I Gamma Function or Euler Integral of the Second Kind (see 8.2.5,6., p. 459): m
(convergent for y > 0).
T(y) = / x v - l e P d x
(8.91)
0
8.2.4.2 Differentiation Under the Symbol of Integration 1. Theorem If the function (8.90) is defined in the interval c 5 y 5 e, and the function f(x,y) is continuous on the rectangle a 5 2 5 b, c 5 y 5 e and it has here a partial derivative with respect to y, then for arbitrary y in the interval [cl e]:
(8.92) This is called diferentiation under the symbol of integration. d
IFor arbitrary y > 0: dy
/ arctan 1
0
dx =
1 aY la
- (arctan
0
z)
dx = -
1
1
xdx
1 yz = - In -
2
1+yz
For y = 0 the condition of continuity is not fulfilled, and there exists no derivative. 2. Generalization for Limits of Integration Depending on Parameters The formula (8.92) can be generalized, if with the same assumptions we made for (8.92) the functions a(y)and p(y) are defined in the interval [c, e], they are continuous and differentiable there, and the curves x = a ( y ) ,x = B(y) do not leave the rectangle a 5 x 5 b, c 5 y 5 e: (8.93)
8.2.4.3 Integration Under the Symbol of Integration If the function f(x. y) is continuous on the rectangle a 5 x 5 b. c 5 y 5 e, then the function (8.90) is defined in the interval [c, e ] ,and (8.94) is valid. This is called integration under the symbol ufzntegratiun. IA: Integration of the function f ( x l y) = xu on the rectangle 0 5 x 5 1, a 5 y 5 b. The function xu is discontinuous at x = 0, y = 0, for a > 0 it is continuous. So we can change the order of integration:
I
8. Inteqral Calculus
458
[
[Jdl
xY dz] dy =
1' [[
l+b
xY dy] dx. On the left-hand side we get
-
I xb
Ins:dx. The indefinite integral cannot be expressed by elementary functions. Anyway,
handsidel
the definite integral is known, so we get: Jd'-dx=lnl t b (O
IB: Integration of the function f(x, y) =
~
8.2.5 Integration by Series Expansion, Special Non-Elementary
Functions It is not always possible to express an integral by elementary functions, even if the integrand is an elementary function. In many cases we can express these non-elementmy integrals by series expansions. If the integrand can be expanded into a uniformly convergent series in the interval [a,b], then we get
f ( t ) d t if we integrate it term by term.
also a uniformly convergent series for the integral
1. Integral Sine (1x1 < co,see also 14.4.3.2. 2.. p. 694) sint /7 dt 5
Si (z)=
0
K
1
2
x
m
.
=--
sint -dt t (8.95)
2. Integral Cosine (0 < z < 03)
=C
C=-
J
x2 24 (- l)nx2n + lnz - +- t .. + t ... 2 . 2 ! 4.4! 2n ( 2 n ) ! ,
eCt In t dt = 0.577 215 665, . .
3. Integral Logarithm (0 < 5 Li(z) = 0
~
with
(Euler constant).
(8.96a)
(8.96b)
< 1, for 1 < 2 < 00 as Cauchy Principal Value)
(In x)' (lnz)n dt Int = c t l n / l n x l + lnlzl+ -+ . . . t -t " 2.2! n . n!
(8.97)
8.2 Definite Inteoruls 459
4. Exponential Integral (-cc
j
Ei (2) =
d t = C t In
< z < 0,for 0 < z < 00 X2
Xn
2 ' 2!
n,n!
as Cauchy Principal Value)
1x1 t z t -t . ' . + -t . . .
-X
(8.98a)
Ei (lnz) = Li (z).
(8.9813)
5 . Gauss Error Integral and Error Function The Gauss error integral is defined for the domain 1x1 < m and it is denoted by O~The following definitions and relations are valid:
(8.99b) -1 2
(8.99~)
The function O(x)is the distribution function of the standard normal distribution (see 16.2.4.2, p. 757) and its values are tabulated in Table 21.15,p. 1085. The error function erf (z).often used in statistics (see also 16.2.4.2, p. 757), has a strong relation with the Gauss error integral: 2 x erf (z)= -/ e - " d t
fi0
= 200(zd).
(8.100a)
J&erf
(8.1OOb)
(x)= 1,
(-l)nz2n+l n! . (2n + 1)
(8.100~)
(8.100e)
6. Gamma Function and Factorial 1. Definition The gamma function, the Euler integral of the second kind (8.91), is an extension of the notion of factorial for arbitrary numbers z,even complex numbers, except zero and the negative integers. The curve of the function r ( z ) is represented in Fig. 8.21. Its values are given in Table 21.8, p. 1057. It can be defined in two ways:
1
4 x
Y
r(x)=
e-tt"-' dt
(x > 0 ) ,
.-2
(8.101a)
-3 -4 -5
0
nx . n! r(s)= n+m lim z ( x t 1)(zt 2) (x # 0. -1, -2,. . .).
t . .
(z t n)
(8.101b)
Figure 8.21
2. Properties of the Gamma Function
r(zt 1) = zT(sj,
(8.102a)
r ( n t 1) = n! (n = 0 , 1 , 2 , . . .),
(8.102b)
I
460
8. Inteoral Calculus
( n = O , l , 2 , . .),
(8.102e)
I
(8.1029 The same relations are valid for complex arguments z , but only if Re ( z ) > 0 holds. 3. Generalization of t h e Notion of Factorial The notion of factomal, defined until now only for positive integers n (see 1.1.6.4, 3., p. 13), leads to the function (8.103a) x! = i-(z 1) as its extension for arbitrary real numbers. The following equalities are valid:
+
Forpositiveintegersz: z ! = 1 . 2 . 3 . . . x ,
(8.103b) forz=O:
O!=r(l)=l,
for negative integers z: z! = rtm,
(8.103d) f o r z = f :
(i)!=r(i) =$,(8.103e)
f o r z = - ; :2
(8.103~)
( - f ) ! = r ( i ) = f (8.1034 i,
An approximate determination of a factorial can be performed for large numbers (> lo), also for fractions n with the Stirling formula: n ! x (a)"dG(l+--+-+...), 1 1 (8.103h) 12n 288n2 ln(n!) x ( n +
:)
Inn - n + In JZ;;.
(8.103i)
7. Elliptic Integrals For the complete elliptic integrals (see 8.1.4.3. 2., p. 435) the following series expansions are valid: (8.104)
k2 < 1. (8.105) The numerical values of the elliptic integrals are given in Table 21.7,p. 1055.
8.3 Line Integrals The notion of the integral can be generalized in different ways. While the domain of an ordinary definite integral is an interval on the numerical axis, for a line integral, the domain of integration is a segment of a planar or space curve. The curve, Le., the path of integration can also be closed; it is called also circuit integral and it gives the circulation of the function along the curve. We distinguish line integrals of the first type. of the second type, or of general type.
8.3 Line Inteorals 461
8.3.1 Line Integrals ofthe First Type
Y
8.3.1.1 Definitions The lzne zntegral ofthefirst type or zntegral over an arc is the definite integral
1f
(8.106)
( X >Y) ds,
(CI
-
where u = f(x.y) is a function of two variables defined on a connetted domain and the integration is performed over an arc C -AB of a plane curve given by its equation. The considered arc is in the same domain, and we call it the path ofzntegratzon. The numerical value of the line integral of the first type can be determined in the following way (Fig. 8.22):
c
b
X
Figure 8.22
1. We decompose the rectifiable arc segment AB into n elementary parts by points AI, Az, . . . , A,-1 chosen arbitrarily. starting at the initial point A A0 and finishing at the endpoint B I A,. h
2. We choose arbitrary points P, inside or at the end of the elementary arcs A,-lA,, with coordinates t,and rl,.
-
3. We multiply the values of the function f(JEI 7%)at the chosen points with the arclength Al-lA,= As,-, which should be taken positive. (Since the arc is rectificable, As,-, is finite.) 4. We add t h e n products f(&. qt)Ast-l. 5 . We evaluate the limit of the sum
(8.107a) as the arclength of every elementary curve segment As,-l tends to zero, while n obviously tends to m. If the limit of (8.107a) exists and is independent of the choice of the points A, and P,, then this limit is called the line zntegral of the first type, and we write n
(8.107b) We can define analogously the line integral of the first type for a function u = f(x,y , z ) of three variables, whose path of integration is a curve segment of a space curve: (8.107~)
8.3.1.2 Existence Theorem The line integral of the first type (8.107b) or (8.107~)exists if the function 12, Y) or [x,y , z ) and also the curve along the arc segment C are continuous, and the curve has a tanger which vi ies continuously. In other words: The above limits exist and are independent of the choice oi l i and P In this case, the functions f(x,y ) or f(x>y , z ) are said to be integrable along the curve C .
8.3.1.3 Evaluation of the Line Integral ofthe First Type We calculate the line integral of the first type by reducing it to a definite integral.
462
8. Inte.qral Calculus
1. The Equation of the Path of Integration is Given in Parametric Form If the defining equations of the path are z = z ( t )and y = y(t), then f(X>Y) ds = Splixlt!. (C)
(8.108a)
Y(t)l&%FTFmdt
t0
holds. and in the case of a space curve z = x ( t ) ,y = y ( t ) , and z = z ( t ) T
J f b ,Y>
2 ) ds
=
(CI
1
f[z(t),Y(t)l z(t)lJlz’(t)12
+ [Y’(t)I2+ [Z’(t)I2dt,
(8.108b)
t0
where t o is the value of the parameter t at the point A and T is the parameter value at B . The points A and B are chosen so that to < T holds.
2. The Equation of the Path of Integration is Given in Explicit Form We substitute t = z and get from (8.108a) for the planar case
1
b
f ( s > Y ) d s= / f P : Y ( z ) l ~ m d ~ :
(8.109a)
a
(CI
and from (8.108b) for the three dimensional case
1
b
f ( ~ . : ~ . ~ ) d ~ = ~ f I ~ , ~ ( ~ ) , ~ ( ~ ) l ~ 1 + i u ‘ ( ~ ) l ~ + [ ~ ’ (8.109b) ( ~ ) 1 ~ ~ ~
(C)
a
Here a and b are the abscissae of the points A and B , where the relation a < b must be fulfilled. We can consider x as a parameter if every point corresponds to exactly one point on the projection of the curve segment C onto the z-axis, i.e., every point of the curve is uniquely determined by the value of its abscissa. If this condition does not hold, we have to partition the curve segment into subsegments having this property. The line integral along the whole segment is equal to the sum of the line integrals along the subsegments.
8.3.1.4 Application of t h e Line Integral of t h e First Type Some applications of the line integral of the first type are given in Table 8.6. The curve elements needed for the calculations of the line integrals are given for different coordinate systems in Table 8.7.
8.3.2 Line Integrals ofthe Second Type 8.3.2.1 Definitions A line integral ofthe second type or an integral ouer a projection (onto the x-,y- or z-axis) is the definite integral (8.11Oh)
(8.11Oa) (C)
(C)
-
where f(x.y) or f ( z ,y, 2 ) are two or three variable functions defined on a connected domain. and we integrate over a projection of a plane or space curve C =AB (given by its equation) onto the z-. y-, or z-axis. The path of integration is in the same domain. We get the line integral of the second type similarly to the line integral of the first type, but in the third step the values of the function f(&,172) or f ( & ,qt, Cz) are not multiplied by the arclength of the n
elementary curve segments A,-lA,, hut by its projections onto a coordinate axis (Fig. 8.23).
8.3 Line Intearals 463
Table 8.6 Line integrals of the first type ~
Length of a curve segment C
1
Vass of an inhomogeneous curve segment C
L=
/ ds
I
M =
eds
(CI
Center of gravity coordinates
xc =
1
-
L
(e = f(z,y, z ) density function)
1
1
zeds
y~ =
I, = I, =
Moments of inertia of a space curve with respect to the coordinate axes
1
/ x'eds,
I
(y'
Iy=
+ z'),ods,
Iy=
I(.'
1
zeds
(C)
/ y'eds
(CI
I, =
1 yeds , zc = L
(CI
(C)
Moments of inertia of a plane curve in the z.u plane
/
+ z')pds.
(CI
+
(z' y')eds
In the case of homogeneous curves we substitute
e = 1.
Table 8.7 Curve elements
1
Cartesian coordinates z,y = y(z) Plane curve in Polar coordinates 9,p = p(p) the x. y plane Parametric form in Cartesian ds = J[z'(t)]'
-
1. Projection onto the x-Axis It'ith Pr, A,-lA,= z,
- x,-1 = h - 1
we get
(8.113a)
+ [y'(t)]' + [z'(t)]'dt
464
8. Inteoral Calculus
(8.113b)
(8.114)
8.3.2.2 Existence Theorem The line integral of the second type in the form (8.112a), (8.113a), (8.112b), (8.113b) or (8.114) exists if the function f(z,y) or f(z,y, z ) and also the curve are continuous along the arc segment C, and the curve has a continuously varying tangent there.
8.3.2.3 Calculation of the Line Integral of the Second Type We reduce the calculation of the line integrals of the second type to the calculation of definite integrals.
1. The Path of Integration is Given in Parametric Form With the parametric equations of the path of integration z = z ( t ) , y = y(t) and (for a space curve) z = z ( t ) we get the following formulas:
(8.115)
For (8.112a)
(8.116a)
For (8.113a)
(8.116b) m
For (8.112b)
(8.116c)
For (8.113b)
(8.116d)
For (8.114)
(8.116e)
Here, to and T are the values of the parameter t for the initial point A and the endpoint B of the arc segment. In contrast to the line integral of the first type, here we do not require the inequality to < T . Remark: If we reverse the path of the integral, Le., interchange the points A and B , the sign of the integral changes.
2. The Path of Integration is Given in Explicit Form In the case of a plane or space curve with the equations y=y(z) or y = y ( z ) , z =z(z)
(8.117)
8.3 Line Inteorals 465 as the path of integration, with the abscissae u and b of the points A and B , where the condition a < b is no longer necessary, the abscissa ztakes the place of the parameter t in the formulas (8.112a) - (8.114).
8.3.3 Line Integrals of General Type 8.3.3.1 Definition A line rntegral ofgeneral type is the sum of the integrals of the second type along all the projections of a curve. If two functions P ( z ,y) and Q(z,y) of two variables, or three functions P ( z ,y, z ) , Q(z,y , ~ ) , and R(z.y >z ) of three variables, are given along the given curve segment C, and the corresponding line integrals of the second type exist, then the following formulas are valid for a planar or for a space curve. 1. Planar Curve (8.1 1sa)
(8.118b)
The vector representation of the line integral of general type and an application of it in mechanics will be discussed in the chapter about vector analysis (see 13.3.1.1,p. 658).
8.3.3.2 Properties of t h e Line Integral of General Type 1. The Decomposition of the Path of the Integral
-
/4? a) A
by a point AI, which is on the curve, and it can even be outside of AB (Fig. 8.24). results in the decomposition of the integral into two parts:
4B
AM
MB
Figure 8.24
2. The Reverse of the Sense of the Path of Integration changes the sign of the integral:
-
(8.120)
/ ( ~ d z + ~ d y= )- / ( ~ d z + ~ d y ) . * h
BA
4B
3. Dependence on the Path In general, the value of the line integral is dependent not only on the initial and endpoints but also on the path of integration (Fig. 8.25):
-1
1
(Pdz+Qd~)#
Figure 8.25
ADB
I
(zydz + yzdy
A
(8.121)
h
AM B
1 A: I =
(Pdz+Qdy).*
+ z z d z ) , where C is one turn of the helix z = acost, y = asint, z = bt
(C)
(see Helix on p. 242) from to to T = 2 ~ : d
I = /OZn(-a3 sin't cos t + a'bt sin t cos t + ab% cost) dt = -'Similar formulas are valid for the three-variable case.
b
2
.
466
8. Inteoral Calczllzls
B: I
=
I
[y' dx
+ (zy - z') dy], where C is the arc of the parabola y2 = 9z between the points
(C)
A(O:O)andB(1!3): I = / 3 [ 2s y 3 + ( c3 - l ) ]4 9 81
dy=6- 3 20
8.3.3.3 Integral Along a Closed Curve 1. Notion of t h e Integral Along a Closed Curve .4 circuit integral or the circulation along a curwe is a line integral along a closed path of integration C , Le., the initial point A and the end point B are identical. We use the notation:
f ( P ~ +Z Q ~ Y )
f p d x + Q ~ +Y ~ d z ) .
or
(8.122)
(C)
(C)
In general. this integral differs from zero. But it is equal to zero if the conditions (8.127) are satisfied, or if the integration is performed in a conservative field (see 13.3.1.6, p. 660). (See also zero-valued circulation, 13.3.1.6,p. 660.) 2. The Calculation of t h e Area of a Plane Figure is a typical example of the application of the integral along a closed curve in the form
s = -2l f ( x d y - ydz),
(8.123)
(C)
where C is the boundary curve of the plane figure. The integral is positive if the path is oriented counterclockwise.
8.3.4 Independence of the Line Integral of the Path of Integration The condition for independence of a line integral of the path of integration is also called zntegrabzlzty of the total dzferentzal.
8.3.4.1 Two-Dimensional Case If the line integral
/W,Y) d x + Q(x, Y) d ~ l
(8.124)
with continuous functions P and Q defined on a simple connected domain depends only on the initial point A and the endpoint B of the path of integration, and does not depend on the curve connecting these points. Le., for arbitrary A and B and arbitrary paths of integration ACB and ADB (Fig. 8.25) the equality
1
(Pdz+Qdy)=
h
ACB
1
(Pdx+Qdy)
,-.
(8.125)
ADB
holds. then it is a necessary and sufficient condition for the existence of a function U ( z ,y) of two variables, whose total differential is the integrand of the line integral:
P dz + Q dy = d V ,
(8.126a)
(8.126b)
The function C(z,y) is a primitive function of the total differential (8.126a). In physics. the primitive
8.3 Line Intearals 467
function Cr(x,y) means the potential in a vector field (see 13.3.1.6,4., p. 661).
8.3.4.2 Existence of a Primitive Function .A necessary and sufficient criterion for the existence of the primitive junctzon. the integrability condition for the expression P dx + Q dy, is the equality of the partial derivatives d P dQ dy ax ’ where also the continuity of the partial derivatives is required.
(8.127)
8.3.4.3 Three-Dimensional Case The condition of independence of the line integral /[P(x. y. z ) dx t Q(x, y. z ) dy + R(x.y, 2) dz]
(8.128)
of the path of integration analogously to the two-dimensional case is the existence of a primitive function C(z. y , z ) for which (8.129a) P d x + Q d y + Rdz = dC‘, holds, i.e ,
p = -ac ,
ar;
ar; Q=R=--. 82 aY ’ The integrability condition is now that the three equalities for the partial derivatives dR - dP dP dQ -dQ = - dR -a2 dy ’ dx dz dy ax should be simultaneously satisfied. provided that the partial derivatives are continuous. dx
(8.129b)
(8.129~)
H The work 11’ (see also 8.2.2.3, 2., p. 449) is defined as the scalar product of force F(fl and displacement ?. In a conservative field the work depends only on the place ?, but not on the velocity .u’ With @ = P& t QZYt RZ2 = gradV and & = dx& t dyZYt dzCz the relations (8.129a). (8.129b) are satisfied for the potential V ( f l .and (8.129~)is valid. Independently of the path between the points PI and P2 we get:
IT =
[e(7)& 8
=
Ll
9
[ P d x t Qdy t Rdz] = V ( P 2 )- V ( P l ) .
Figure 8.26
(8.130)
Figure 8.27
8.3.4.4 Determination of the Primitive Function 1. Two-Dimensional Case If the integrability condition (8.127) is satisfied, then along an arbitrary path of integration connecting an arbitrary fixed point A(x0.yo) with the variable point P ( x ,y) (Fig.8.26) and passing through the
I
8. Inteoral Calculus
468
domain where (8.127) is valid, the primitive function U ( X y) , is equal to the line integral c'=
I
(8.131)
(Pdz+Qdy)
h
AP
In practice. it is convenient to choose a path of integration parallel to the coordinate axes, Le., one of the segments A K P or A L P , if they are inside the domain where (8.127) is valid. With these we have two formulas for the calculation of the primitive function V ( Xy) , and the total differential P d z + Q dy: (8.132a)
(8.132b) Here C is an arbitrary constant.
2. Three-Dimensional Case (Fig. 8.27) If the condition (8.129~)is satisfied, the primitive function can be calculated for the path of integration A K L P with the formulas
U = W X O , yo, 20) +
J +J +J
m m D X
/ P(<,
+
Y
Z
1
+ 1R(x, y, E ) dJ + C
(C arbitrary constant). (8.133) Yo 20 For the other five possibilities of a path of integration with the segments being parallel to the coordinate axes we get five further formulas. =
yo>zo) dE
Q(zlq,zo) dq
20
Application of the formula (8.13213) and the substitution of xo = 0, yo = 1 (20 = 0, yo = 0 may not be chosen since the functions P and Q are discontinuous at the point (0,O)) results in b' = -+ -+ U ( O ,1) = - arctan + c = arctan Y +~ 1 . Y
1'
)
-$ + (
&)
~ ~ 12 +dx ~t z dy iL zz dz. The relations (ZlY (8.129~)are satisfied. Application of the formula (8.133) and substitution of xo = 1, yo = 1. zo = 1 1 z z result in U = d t J;g 0 . dq d( C = arctan - - - t C .
B: P d X + Q d y + R d t
10.
+
=z
+
(-
--)
+
2
XY
8.3.4.5 Zero-Valued Integral Along a Closed Curve
The integral along a closed curve, Le., the line integral P d x + Q dy is equal to zero, if the relation (8.127) is satisfied, and if there is no point inside the curve where even one of the functions P, Q, 8P or
2
37
is discontinuous or not defined.
Remark: The value of the integral can be equal to zero also without this conditions, but then we get this value only after performing the corresponding calculations.
8.1 Multiple Inteorals 469
8.4 Multiple Integrals The notion of the integral can be extended to higher dimensions. If the domain of integration is a region in the plane or on a surface in space, then the integral is called a surface integral, if the domain is a part of space, then it is called a volume integral. For the different special applications we use different notations.
8.4.1 Double Integrals 8.4.1.1 Notion of the Double Integral 1. Definition The double integral of a function of two variables u = f ( z , y) over a planar domain S is denoted by
1f(.. ?/I 11f(T dS =
I/)
(8.134)
dI/ dz.
S
S
It is a number. if it exists. and it is defined in the following way (Fig. 8.28): 1. We consider a partition of the domain S into n elementary domain. 2. \$'e chose an arbitrary point Pi(z,:yi) in the interior or on the boundary of every elementary domain. 3. We multiply the value ofthe function u = f ( z z ,yz) at this point by the area AS,ofthe corresponding
elementary domain. 4. We add these products f(s,, y,)AS,. 5. We calculate the limit of the sum
kfb,.
YJlS,
(8.135a)
2=1
as the diameter ofthe elementary domains tends to zero, consequently AS',tends to zero, and so n tends to m. (The diameter of a set of points is the supremum of the distances between the points of the set.) The requirement AS tends to zero is not enough, because, e.g., in the case of a rectangle the area can be close to zero also if only one side is small and the other is not, so the considered points could be far from each other. u=f(x,y) Y
)d; ,@;-
,'
I ,I
,11 I ,111; ,111; 1111: '1,
X
Figure 8.28
X
AS^
1
Pi(Xi rYi)
Figure 8.29
If this limit exists independently of the partition of the domain S into elementary domains and also of the choice of the points P,(s,. y,)> then we call it the double integral of the function u = f ( z , y) over the domain S.the domain of integration. and we write: (8.135b)
2. Existence Theorem If the function f ( x , y) is continuous on the domain of integration including the boundary, then the double integral (8.135b) exists. (This condition is sufficient but not necessary.)
470
8. Integral Calculus
3. Geometrical Meaning The geometrical meaning of the double integral is the volume of a solid whose base is the domain in the x,y plane. whose side IS a cylindrical surface with generators parallel to the z-axis. and it is bounded above by the surface defined by u = f(s,y) (Fig. 8.29). Every term f(x,,y,)AS, of the sum (8.135b) corresponds to an elementary cell of a prism with base AS, and with altitude f ( z , . yz). The sign of the volume is positive or negative. according to whether the considered part of the surface u = f ( z , y) is above or under the z,y plane. If the surface intersects the z.y plane, then the volume is the algebraic sum of the positive and negative parts. If the value of the function is identically 1 ( f ( z , y) I 1).then the volume has the numerical value of the area of the domain S in the x.y plane.
8.4.1.2 Evaluation of the Double Integral The evaluation of the double integral is reduced to the evaluation of a repeated integral, Le., to the evaluation of two consecutive integrals.
1. Evaluation in Cartesian Coordinates If the double integral exists, then we can consider any type of partition of the domain of integration, such as a partition into rectangles. We divide the domain of integration into infinitesimal rectangles by coordinate lines (Fig. 8.30a). Then we calculate the sum of all differentials f ( x ,y)dS starting with all the rectangles along every vertical stripe, then along every horizontal stripe. (The interior sum is an integral approximation sum with respect to the variable y, the exterior one with respect to x.) If the integrand is continuous, then this repeated integral is equal to the double integral on this domain. The analytic notation is. (8.136a) h
Here y = qz(z)and y =
+1(x) are
the equations of the upper and lower boundary curves (AB),,,,,,
and of the surface patch 5’.Here a and b are the abscissae of the points of the curves to the very left and to the very right. The elementary area in Cartesian coordinates is
d S = dx dy. (8.136b) (The area of the rectangle is AxAy independently of the value of x.) For the first integration x is
0 Figure 8.30
Figure 8.31
Figure 8.32
handled as a constant. The square brackets in (8.136a) can be omitted, since according to the notation the interior integral is referred to the interior integration variable, the exterior integral is referred to the second variable In (8.136a) the differential signs dx and dy are at the end of the integrand. It is also usual to put these signs right after the corresponding integral signs, in front of the integrand. Il’e can perform the summation in reversed order, too, (Fig. 8.30b). If the integrand is continuous,
8.4 Multiple Inteorals 471
then it results also in the double integral: 9
/ f(.
Y) dS =
S
IA =
MZ(Y)
/ / f(x.
(8.136~)
Y) dx dY.
W(Y1
[
xy2 dS. where S is the surface patch between the parabola y = xz and the line y = 2 1 in 2s
Fig, 8.31.A = r r x y ' d y d x = l'xdr '9
A=
xz
/ f l x y 2 dzdy = Jd2y2dy '9
Y/2
[
[{]x2
2 4 :]v/2 =
=
32 [(8x4 - x 7 )dx = 5 or
):
l y 2 (y -
32 dy = 5 .
2. Evaluation in Polar Coordinates The integration domain is divided by coordinate lines into elementary parts bounded by the arcs of two concentric circles and two segments of rays issuing from the pole (Fig. 8.32). The area of the elementary domazn zn polar coordznatea has the form
pdpdp = dS. (8.137a) (The area of an elementary part determined by the same Ap and Ap is obviously smaller being close to the origin. and larger far from it.) With an integrand given in polar coordinates tu = f ( p . p) we perform a summation first along each sector. then with respect to all sectors: (8.137b)
-
where p = p l ( p ) and p = p Z ( p ) are the equations of the interior and the exterior boundary curves A
(AmB) and (AnB)of the surface S and and ;sz are the infimum and supremum of the polar angles of the points of the domain. The reverse order of integration is seldom used. psin' p dS. where S is a half-circle p = 3 cos p (Fig. 8.33):
3. Evaluation with Arbitrary Curvilinear Coordinates u and v The coordinates are defined by the relations x = z(u. u ) , y = y(u, w) (8.138) (see 3.6.3.1. p. 244). The domain of integration is partitioned by coordinate lines u = const and w = const into infinitesimal surface elements (Fig. 8.34) and the integrand is expressed by the coordinates u and v. Lye perform the summation along one strip, e.g., along v = const, then over all strips: (8.139) S
u1 U l ( U ) h
h
Here u = q ( u ) and w = wz(u) are the equations of the boundary curves AmB and AnB of the surface S.% e\' denote by u1 and u2 the infimum and supremum of the values of u of the points belonging to the
472
8. Integral Calculus
surface S. ID1 denotes the absolute value of the Jacobian determinant (functional determinant)
ax ax av D(u.2j) -19 au 31' dv _ _
D - D ( x Y) - au I
(8.140a)
The area of the elementary domain in curvilinear coordinates can be easily expressed: ID1dudu = dS. 1
0
I
I
Figure 8.33
Figure 8.34
Figure 8.35
The formula (8.13713) is a special case of (8.139) for the polar coordinates x = pcos 'p, y = psin 'p. The functional determinant here is D = p. We choose curvilinear coordinates so that the limits of integration in the formula (8.139) are as simple as possible. and also the integrand is not very complicated.
I Calculate A =
s s
f ( x , y ) dS for the case when S is the interior of an asteroid (see 2.13.4, p. 102),
with z = a cos3 1, y = asin3 t (Fig. 8.35). First we introduce the curvilinear coordinates 2 = u cos3 u, y = u sin3u whose coordinate lines u = c1 represents a family of similar asteroids with equations z = c1 cos3t and y = c1 sin3t. The coordinate lines v = c2 are rays with the equations y = kx, where k = tan3 e2 holds. We get D = 1 cos3 . u -321 cos2 v sin v = 3usin2ucos2u, A = j(z(u,v),y(u,~))3usin'vcos~ududu. s1n3 u 311 sin' u cos u 0 0
rS2'
~
8.4.1.3 Applications of the Double Integral Some applications of the double integral are collected in Table 8.9. The required areas of elementary domains in Cartesian and polar coordinates are given in Table 8.8 Tabelle 8.8 Plane elements of area
1
1
Coordinates Cartesian coordinates x , y
dS = dydx
Polar coordinates p, yo
dS = p d p d ' p
I .Arbitrary curvilinear coordinates u, v 1 dS
=
/DIdudv ( D Jacobian determinant)
[
8.4 Multiple Inteqrals 473
Table 8.9 Applications of the double integral General formula
1
1
Cartesian coordinates
Polar coordinates
1. Area of a plane figure:
2. Surface:
3. Volume of a cylinder:
4. Moment of inertia of a plane figure, with respect t o the z-axis:
Iz=/y2dS
=//y2dydr
~
=
11
p3 sin2 'p d p dp
C
5. Moment of inertia of a plane figure, with respect t o the pole 0:
=[Iedydx
=
Js
e p d p dcp
7. Coordinates of the center of gravity of a homogeneous plane figure:
""
/YdS YC =
S
474
8. Integral Calculus
8.4.2 Triple Integrals The trzple zntegralis an extension of the notion of the integral into three-dimensional domains. We also call it volume zntegral.
8.4.2.1 Notion of the Triple Integral 1. Definition We define the triple integral of a function f(z, y, z ) of three variables over a three-dimensional domain 1; analogously to the definition of the double integral. We write:
The volume 1’ (Fig. 8.36) is partitioned into elementary volumes A V . Then we form the products !(xi,yzrzJAr/, where the point Pi(zi,yi, zi) is inside the elementary volume or it is on the boundary. The triple integral is the limit of the sum of these products with all the elementary volumes in which the volume !,’ is partitioned, then the diameter of every elementary volume tends to zero, Le., their number tends to co.The triple integral exists only if the limit is independent of the partition into elementary volumes and the choice of the points Pi(xi, yzsz i ) . Then we have:
\ f(z, ~1
I’
2) dV
=
);yo
f(zii ~
$
(8.142)
AK.
2%) 3
n-tm 1=1
2. Existence Theorem The existence theorem for the triple integral is a perfect analogue of the existence theorem for the double integral.
dY Figure 8.36
Figure 8.37
8.4.2.2 Evaluation of the Triple Integral The evaluation of triple integrals is reduced to repeated evaluation of three ordinary integrals. If the triple integral exists. then we can consider any partition of the domain of integration.
1. Evaluation in Cartesian Coordinates The domain of integration can be considered as a volume V here. We prepare a decomposition of the domain by coordinate surfaces, in this case by planes, into infinitesimal parallelepipeds, Le., their diameter is an infinitesimal quantity (Fig. 8.37). Then we perform the summation of all the products f ( z ly , z ) dl’. starting the summation along the vertical columns, Le.. summation with respect to z , then in all columnsof one slice, Le., summation with respect to y, and finally in all such slices, i.e., summation with respect to z.Every single sum for any column is an approximation sum of an integral, and if the
8.1 Multiple Intearals 475
diameter of the parallelepipeds tends to zero, then the sums tend to the corresponding integrals, and if the integrand is continuous, then this repeated integral is equal to the triple integral. -4nalytically:
(8.143a) Here z = w1(x.y) and z = $z(z,y) are the equations of the lower and upper part of the surface bounding the domain of integration Ti (see limiting curve r in Fig. 8.37);dxdydz is the elementary volume in the Cartesian coordinate system. y = PI(.) and y = p2(z)are the functions describing the lower and upper part of the curve C which is the boundary line of the projection of the volume onto the x,y plane. and z = a and x = b are the extreme values of the z coordinates of the points of the volume under consideration (and also the projection under consideration). We have the following postulates for the domain of integration: The functions p](z) and p2(1c)are defined and continuous in the interval a 5 z 5 b. and they satisfy the inequality p1(x) I p~(x). The functions &(x,y) and &(x,y) are defined and continuous on the domain a 5 x I b: pl(z) 5 y 5 ~ ( z )and , also $l(x> y) 5 &(x,y) holds. In this way. every point (z, y, z ) in V satisfies the relations a
5 5 b,
PI(^)
I Y I Q(.).
~ ( z , YI ) 5 li)z(z,~).
(8.143b)
Just as with double integrals, we can change the order of integration, then the limiting functions will change in the same sense. (Formally: the limits of the outermost integral must be constants, and any limit may contain variables only of exterior integrals.)
I Calculate the integral I
=
(y2
+ z') dl/ for a pyramid bounded by the coordinate planes and the
plane z + y + z = 1: (y2
+ 2')
dz dy dx =
1'{ 11-'
[ll-i-y(yz
+ z') dz
11
1
dy dx = 30 .
zt
Figure 8.38
Figure 8.39
2. Evaluation in Cylindrical Coordinates The domain of integration is decomposed into infinitesimal elementary cells by coordinate surfaces p = const. p = const, z = const (Fig. 8.38). The volume of an elementary domain an cylindrical coordinates is dl' = p d z d p d p .
(8.144a)
476
8. Integral Calculus
After defining the integrand by cylindrical coordinates f ( p , (o, z ) the integral is: 192 P 2 ( V ) ZZ(P,19)
J i ( P : P>2) dl’ = V
JJ J
(8.144b)
f(P>(o, Z I P dzdpdip.
191 P l ( 1 9 ) Z I ( P , 1 9 )
ICalculate the integral I = the cylindrical surface x 2
dV for a solid (Fig. 8.39) bounded by the 2, y plane, the 2,t plane,
+ y 2 = a 2 and the sphere x 2 + y 2 +
[” { iaCo8’ [im
dt] p d p }
t 2=
a’:
z1
= 0,
z2
-4
=
=
a3
dp = 18 (377 - 4) . Since f ( p , (o, z ) = 1, the integral is equal to
the volume of the solid.
I X
Figure 8.40
Figure 8.41
3. Evaluation in Spherical Coordinates The domain of integration is decomposed into infinitesimal elementary cells by coordinate surfaces T = const, p = const, 29 = const (Fig. 8.40). The volume of an elementary domain in spherical coordinates is
dl.’ = sin 29 dr d29 dy. For the integrand f ( r >p>19) in spherical coordinates, the integral is:
(8.145a)
(8.145b)
ICalculate the integral I =
/
dV for a cone whose vertex is at the origin, and its symmetry axis
V
is the z-axis. The angle at the vertex is 2a, the altitude of the cone ish (Fig. 8.41). Consequently we h have: = 0,rZ = -’cos *’ 291 = 0,292 = a; (01 = 0, $92 = 2a. h l cos 8 cos 29 I = -T2sin8drd29dp= Jd27{j10cosliSinfi h/cosir dr] d 3 ) d(o r2
Jdy7[i
I[
8.5 Surface Intearals 477
= 2a h (1 - cos a ) .
4. Evaluation in Arbitrary Curvilinear Coordinates u, w , w The coordinates are defined by the equations (8.146) x = x(u,v , w ) . y = y(u, 21, u). z = z(u,v ) w ) (see 3.6.3.1, p. 244). The domain of integration is decomposed into infinitesimal elementary cells by the coordinate surfaces u = const, 2: = const, w = const. The volume of an elementary domazn zn arbztray coordznates is:
(8.147a)
Le.. D is the Jacobian determinant. For the integrand f(u,v,w) in curvilinear coordinates u, v , w . the integral is: (8.147b)
Remark: The formulas (8.144b) and (8.145b) are special cases of (8.147b). For cylindrical coordinates D = p holds, for spherical coordinates D = r2 sin v7 is valid. If the integrand is continuous, then we can change the order of integration in any coordinate system. M'e always try to choose a curvilinear coordinate system such that the determination of the limits of the integral (8.147b))and also the calculation of the integral, should be as easy as possible.
8.4.2.3 Applications of t h e Triple Integral Some applications of the triple integral are collected in Table 8.11. The elementary areas corresponding to different coordinates are given in Table 8.8. The elementary volumes corresponding to different coordinates are given in Table 8.10. Table 8.10 Elementary volumes
I s -
1
Coordinates 2, y,
z
~
Elementary volume
I
dV = dxdydz
1 dV = p dp d p dz I 1 dV = r'sint9drdddp Spherical coordinates r ,t9,p 1 Arbitrary curvilinear coordinates u, w,w 1 dV = /Dldudvdw ( D Jacobian determinant) I I Cylindrical coordinates p , p , z
8.5 Surface Integrals We distinguish surface integrals of the first type, of the second type, and of general type, analogously to the three different line integrals (see 8.3, p. 460).
8.5.1 Surface Integral ofthe First Type The surface zntegralor zntegral ouer a surface an space is the generalization of the double integral, similarly as the line integral of the first type (see 8.3.1, p. 461) is a generalization of the ordinary integral.
I
478
8. Inteqral Calculus
Table 8.11 Applications of the triple integral
General formula
Cartesian coordinates
Cylindrical coordinates
Spherical coordinates
1. Volume of a solid
2. Axial moment of inertia of a solid with respect t o the z-axis
I
I
3. Mass of a solid with the density function Q
4. Coordinates of the center of a homogeneous solid
111
/[Ir3sin’ 29 cos
x dz dy dx
111 sin 111
pz sin p d p d p dz
/I/
p d r d29 d v
29 d r d29 d p
r3 sin2 8sin p dr dt9 d p
111
r’ sin 8 dr dt9 d p
/// r3sin
111
T’
cos 29 dr dt9 d p
sin 19 d r d29 d p
8.5.1.1 Notion of the Surface Integral of the First Type 1. Definition The surface integral of thefirst type of a function u = f (z, y , z ) of three variables defined in a connected domain is the integral
over a region S of a surface. The numerical value of the surface integral of the first kind is defined in the following way (Fig. 8.42). 1. We decompose the region S in an arbitrary way into n elementary regions. 2. Lye choose an arbitrary point P,(x,, y z ,2,) inside or on the boundary of each elementary region.
8.5 Surface Integrals 479 3. \+'e multiply the value f (q, yi, ti) of the function at this point by the area ASl of the corresponding elementary region. 4. We add the products f ( ~ytI , zz)AStso obtained. 5. We determine the limit of the sum n
f
(Xi,Yi,
4 AS%
(8.149a)
i=l
as the diameter of each elementary region tends to zero, so AS, tends to zero, Figure 8.42 hence, their number n tends to co (see 8.4.1.1, l., p. 469). If this limit exists and is independent of the particular decomposition of the region S into elementary yi2z i ) , then it is called the surface integral of theJrst regions and also of the choice of the points Pi(xi, type of the function 2~ = f(x,y, z ) over the region S,and we write:
1
f(L!/>2s dS =
;;z0k
(8.149b)
Yz,21) AS,.
f(z1,
n i m
S
1=1
2. Existence Theorem If the function f ( ~y,, z ) is continuous on the domain, and the functions defining the surface have continuous derivatives here, the surface integral of the first type exists.
8.5.1.2 Evaluation of the Surface Integral of the First Type The evaluation of the surface integral of the first type is reduced to the evaluation of a double integral over a planar domain (see 8.4.1, p. 469). 1. Explicit Representation of t h e Surface If the surface 2 = z(x.y)
S is given by the equation (8.150) (8.15la)
S'is the projection of S onto the I,y plane and p and q are the partial derivatives az q = - Here we assume that to every point of the surface S there corresponds a unique point ay'
is valid, where a2
p = -,
ax
in S'in the L. y plane, i.e., the points of the surface are defined uniquely by their coordinates. If it does not hold. n e decompose S into several parts each of which satisfies the condition. Then the integral on the total surface can be calculated as the algebraic sum of the integrals over these parts of S. The equation (8.151a) can be written in the form (8.151b)
x - x Y - y z - z (see Tasince the equation of the surface normal of (8.150) has the form -= -- P 4 -1 ble 3.29, p. 246),since for the angle between the direction of the normal and the z-axis, cos7 = 1 holds. In evaluating a surface integral of the first type, this angle y is always considered dm-7
as an acute angle. so cos? > 0 always holds.
480
8. Inteoral Calculus
2. Parametric Representation of the Surface
If the surface S is given in parametric form by the equations x = x ( u ,ff), y = y(u,ff), z = z ( u ,ff), (Fig. 8.43), then
J
f(Xl
(8.152a)
Y>2) dS
S
=JJf[x(u,v),y(u,u),z(u,a)]~dudv,
(8.152b)
A
where the functions E , F , and G are the quantities given in 3.6.3.3, l., p. 245. The elementary region in parametric form is (8.152~)
Figure 8.43
and A is the domain of the parameters u and u corresponding to the given surface region. The evaluation is performed by a repeated integration with respect to t~and u:
S
w(u)
1/ @(u,u)-dvdu, 211
/@(u.v)dS=
(8.152d) @ = f[x(u,w),y(u,~),z(u,v)].
u1 Vl(U)
-
Here u1 and u2 are coordinates of the extreme coordinate lines u = const enclosing the region S h
(Fig. 8.43),and u = ul(u) and v = uz(u) are the equations of the curves AmB and AnB of the boundary of S . Remark: The formula (8.151a) is a special case of (8.152b) for
u=x, v = y ,
E=l+p',
F=pq,
G=1+q2.
(8.153)
3. Elementary Regions of Curved Surfaces The elementary regions of curved surfaces are given in Table 8.12. Table 8.12 Elementary regions of curved surfaces
Coordinates
Elementary region a2
at
Cartesian coordinates x , y, t = z ( x ,y ) dS = 1 + (-)' + (-)' dxdy ax ay dS= Rdpdz Cylindrical lateral surface, R (const. radius), coordinates cp, z dS = R' sin 19 d19 d p Spherical surface R (const. radius), coordinates ~ 9 ~p, Arbitrary curvilinear coordinates u , TJ dS = d ( E ,F, G see differential of arc, p. 245)
S =JdS S
m
d
u dv
(8.154)
8.5 Surface Inteqrals 481
2. Mass of an Inhomogeneous Curved Surface S
With the coordinate-dependent density e = f(z,u, z ) we have: (8.155)
8.5.2 Surface Integral of the Second Type The surface integral of the second type, also called an integral over a projection, is a generalization of the notion of double integral similarly to the surface integral of the first type.
8.5.2.1 Notion of the Surface Integral of the Second Type 1. Notion of an Oriented Surface A surface usually has two sides, and one of them can be chosen arbitrarily as the exterior one. If the exterior side is fixed, we call it an oriented surface. We do not discuss surfaces for which we cannot define two sides (see [8.7]). 2. Projection of an Oriented Surface onto a Coordinate Plane If we project a bounded part S of an oriented surface onto a coordinate plane, e.g., onto the z,y plane, we can consider this projection PrZyS as positive or negative in the following way (Fig. 8.44):
Figure 8.44
a) If the z.y plane is looked at from the positive direction of the z-axis, and we see the positive side of the surface S, where the exterior part is considered to be positive, then the projection PrZyS has a positive sign, otherwise it has a negative sign (Fig. 8.44 a,b). b) If one part of the surface shows its positive side and the other part its negative side, then the projection PrZyS is regarded as the algebraic sum of the positive and negative projections (Fig. 8 . 4 4 ~ ) . The Fig. 8.44d shows the projections PT& and PT,,S of asurface S;one of them is positive the other one is negative. The projection of a closed oriented surface is equal to zero.
482
8. Inteoral Calculus
3. Definition of the Surface Integral of the Second Type over a Projection
onto a Coordinate Plane The surface integral ofthe second type of a function f(z, y, z ) of three variables defined in a connected domain is the integral
over the projection of an oriented surface Sonto the x,y plane, where S is in the same domain where the function is defined, and if there is a one-to-one correspondence between the points of the surface and its projection. The numerical value of the integral is obtained in the same way as the surface integral of the first type except that in the third step the function value . f ( x I yE, , zt) is not multiplied by the elementary region AS,, but by its projection Pr,,AS,, oriented according to 8.5.2.1, 2., p. 481 on the x.y plane. Then we get: (8.157a) hVe define analogously the surface integrals of the second type over the projections of the oriented surface S onto the y, z plane and onto the z , z plane:
4. Existence Theorem for the Surface Integral of the Second Type The surface integral of the second type (8.157a,b,c) exists if the function f(z,y, z ) is continuous and the equations defining the surface are continuous and have continuous derivatives.
8.5.2.2 Evaluation of Surface Integrals of the Second Type The principal method is to reduce it to the evaluation of double integrals.
1. Surface Given in Explicit Form If the surface S is given by the equation 2
=P (.>
(8.158)
Y)
in explicit form, then the integral (8.15ia) is calculated by the formula
/f(z,Y>z)dzdY
f[.>Y>'p(x,Y)ldS,Y,
=
S
(8.159a)
Prz,S
where S,, = Pr,,S. The surface integral of the function f(z, y, z ) over the projections of the surface S onto the other coordinate planes is calculated similarly: /f(r.y,i)dydz = S
J
f[lCI(Y,~),yl~ldSYzr
(8.159b)
Pr,,S
where we substitute I = y(y, z ) , the equation of the surface S solved for z, and S,,= Pry,S. /f(x:y,z)dzdz = S
s
f[x,x(z,x),z]d&,
(8.159~)
Przd
where we substitute y = x ( z . z ) ,the equation of the surface S solved for y, and S,, = Pv,,S. If the orientation of the surface is changed, Le., if we interchange the exterior and interior sides, then the integral over the projection changes its sign.
8.5 Surface Integrals 483
2. Surface Given in Parametric Form If the surface is given by the equations L = I(U,t ) . y = y(u. v). 2 = z(u, v) (8.160) in parametric form.we calculate the integral (8.157a,b,c) with help of the following formulas:
(8.161a)
(8.16 1b)
00
are the Jacobian determinants of pairs of functions D(u. v) ’ D ( u ,v) ’ D ( u ,u ) x. y1z with respect to the variables u and u: A is the domain of u and v corresponding to the surface S.
Here the expressions
3. Surface Integral in General Form If P ( x . y. z ) ~Q(r.y. 2). R ( x .y >z ) are three functions of three variables defined in a connected domain and S is an oriented surface contained in this domain. the sum of the integrals of the second type taken over the projections on the three coordinate planes is called the surface integral in general form:
/ ( P dy dz + Qdzdz +Rdzdy) =
1
1
1R d z d y .
S
S
S
S
P d y dz +
Qdzdz +
(8.162)
The formula reducing the surface integral to a double integral is:
/ ( P d y d z t Qdzdx t Rdxdy) = S
where the quantities
1[.W+ Qv + D ( u ,v)
D(u. L )
Do DO w) and A have the same meaning, as above. D ( u ,v ) ’ D ( u ,v) ’ D ( u ,u )
Remark: The surface integral of vector-valued functions is discussed in the chapter about the theory of vector fields (see 13.3.2. p. 661).
4. Properties of the Surface Integrals 1. If the domain of integration, Le.. the surface
S.is decomposed into two parts SI and Sz,then
/ ( P d y d z t Qdzdz + R d x d y ) = [ ( P d y dz S
+ Q dzdx + R d x d y )
91
+
I
(Pdydz+Qdzdx+Rdxdy).
(8.164)
S2
If the orientation of the surface is reversed, Le., if we interchange the exterior and interior sides, the integral changes its sign: 2.
I
/(Pdydz+Qdzdr+Rdxdy) = - (Pdydz+Qdzdx+Rdxdy), s+ S-
(8.165)
484 8. Inteqral Calculus where St and S-denote the same surface with different orientation. 3. A surface integral depends, in general, on the line bounding the surface region S as well as on the surface itself. Thus the integrals taken over two different non-closed surface regions S l and Sz spanned by the same closed curve C are, in general, not equal (Fig. 8.45):
/ ( P dy dz t Q dz dz + R dz d y ) SI
# / ( P dy dz t Q dz dz t R dz dy).
(8.166) Figure 8.45
s2
8.5.2.3 An Application of the Surface Integral The volume V of a solid bounded by a closed surface S can be expressed and calculated by a surface integral
'J
V= -
3s
(zdydz+ydzdz+zdzdy),
where S is oriented so that its exterior side is positive.
(8.167)
485
9 Differential Equations 1. A Differential Equation is an equation, in which one or more variables, one or more functions of these variables, and also the derivatives of these functions with respect to these variables occur. The order of a differential equation is equal to the order of the highest occurring derivative. 2. Ordinary a n d Partial Differential Equations differ from each other in the number of their independent variables; in the first case there is only one, in the second case there are several.
9.1 Ordinary Differential Equations 1. General Ordinary Differential Equation of Order n in implicztform has the equation If this equation is solved for y(")(x), then it is the explicit form of an ordinary differential equation of order n.
2. Solution or Integral of a differential equation is every function satisfying the equation in an interval a 5 x 5 b which can be also infinite. .4 solution, which contains n arbitrary constants c l , c2, . . . ,e,, is called the general solution or general integral. If the values of these constants are fixed, a particular integralor a particular solution is obtained. The value of these constants can be determined by n further conditions. If the substitution values of y and its derivatives up to order n - 1 are prescribed at one of the endpoints of the interval. then the problem is called an initial value problem. If there are given substitution values at both endpoints of the interval, then the problem is called a boundary value problem. IThe differential equation -y'sin x + y cos x = 1 has the general solution y = cos x + c sin x. For the condition c = 0 we get the particular solution y = cosx.
3. Initial Value Problem If the n values y(xO), y ' ( ~ ~, ,) y("-')(xO) ~. are given at zo for the solution y = y(x) of an n-th order ordinary differential equation, then an initial value problem is given. The numbers are called the initial values or initial conditions. They form a system of n equations for the unknown constants c1, c2, . . . c, of the general solution of the n-th order ordinary differential equation. IThe harmonic motion of a special elastic spring-mass system can be modeled by the initial value problem y" + y = 0 with y(0) = yo, y'(0) = 0. The solution is y = yo cos x. 4. Boundery Value Problem If the solution of an ordinary differential equation and/or its derivatives are given at several points of its domain, then these values are called the boundary conditions. We call a differential equation with boundary conditions a boundary value problem. IThe bending line of a bar with fixed endpoints and uniform load is described by the differential equation y" = x - x2 with the boundary conditions y(0) = 0, y(1) = 0 (0 5 x 5 1). The solution is 23 24 x ~
y=6-z-E.
I
486
9. Differential Equations
9.1.1 First-Order Differential Equations 9.1.1.1 Existence Theorems, Direction Field 1. Existence of a Solution In accordance with the Cauchy existence theorem the differential equation
yl = f(.. Y) (94 has at least one solution in a neighborhood of 20 such that it takes the value yo at z = zo if the function f ( z , y) is continuous in a neighborhood G of the point (zo, yo). For example, we may select G as the region given by l z - 501< a and ly - yo1 < b with some a and b. 2. Lipschitz Condition The Lipschitz condition with respect to y is satisfied by f ( x ,y) if If(x.y1) - f(x,yz)l 5 d%l - Y2I (9.3) holds for all (x.yl) and (z.y2) from G, where N is independent of z,y1, and y2. If this condition is satisfied! then the differential equation (9.2) has a unique solution through (20:yo). The Lipschitz condition is obviously satisfied if f ( z . y) has a bounded partial derivative af/ay in this neighborhood. In 9.1.1.4, p. 491 we will see examples in which the assumptions of the Cauchy existence theorem are not satisfied.
3. Direction Field If the graph of a solution y = p(z) of the differential equation y’ = f(z,y) goes through the point P ( z ,y); then the slope dy/dz of the tangent line of the graph at this point can be determined from the differential equation. So, at every point (x.y) the differential equation defines the slope of the tangent line of the solution passing through the considered point. The collection of these directions (Fig. 9.1) forms the directionfield. An element of the direction field is a point together with the direction associated to it. Integration of a first-order differential equation geometrically means to connect the elements of a direction field into an integral curue, whose tangents have the same slopes a t all points as the corresponding elements of the direction field.
Figure 9.1
Figure 9.2
4. Vertical Directions If in a direction field we have vertical directions! i.e., if the function f(x,y ) has a pole, we exchange the role of the independent and dependent variables and we consider the differential equation
as an equivalent equation to (9.2). In the region where the conditions of the existence theorems are fulfilled for the differential equations (9.2) or (9.4), there exists a unique integral curve (Fig.9.2) through every point P(z0,yo).
9.1 Ordinaru DifferentialEauations 487
5 . General Solution The set of all integral curves can be characterized by one parameter and it can be given by the equation (9.5a) F ( s ,y, C) = 0 of the corresponding one-parameter family of curves. The parameter C, an arbitrary constant, can be chosen freely and it is a necessary part of the general solution of every first-order differential equation. .A particular solution y = p(z)>which satisfies the condition yo = ~(zo), can be obtained from the general solution (9.5a) if C is expressed from the equation (9.5b) F(z0,Yo; C) = 0.
9.1.1.2 Important Solution Methods 1. Separation of Variables If a differential equation can be transformed into the form (9.6a) .lf(z).V(y)dx P(x)Q(y)dy = 0, then it can be rewritten as (9.6b) R(xjdx t S(y)dy = 0, where the variables z and y are separated into two terms. To get this form, we divide equation (9.6a) by P(z)N(y).For the general solution we have
+
If for some values x = or y = g, the functions P ( x ) or X ( y ) or both are equal to zero: then the constant functions x = Z or/and y = are also solutions of the differential equation. They are called singular solutions. Izdy t ydz = 0;
J
t
t
/ d”c
= C;
ln 1y1t ln 1x1= c = In / c ~ ; yz = c. If we allow also c = o
I 0 and x I 0. 2. Homogeneous Equations If M ( x . y) and X ( x . y) are homogeneous functions of the same order (see 2.18.2.6, l., p. 121), then in
in this final equation, then we have the singular solutions y
the equation M(z.yjdz + :V(x: y)dy = 0 (9.8) we separate the variables by substitution of u = y/x. Iz(z- y)y’ + yz = 0 with y = u(x)x,we get (1 - u)u’ + u/x = 0. By separation of the variables
1
du = -
1:
-dx. After integration: In 1x1 + In / u - u1 = C = In IcI,
uz = ce’,
y = ceY/.”.
As we have seen in the preceding paragraph, Separation of Variables, the line x = 0 is also an integral curve.
3. Exact Differential Equations An exact diflerential equation is an equation of the form (9.9a) M(5.y)dz t ,V(z,y)dy = 0 or ,V(Z, y)y’ t M ( x ,y) = 0, if there exists a function @(xq y) of two variables such that (9.9b) M ( x . y)dx t N(x, y)dy I d@(z:y). i.e.. if the left side of (9.9a) is the total differential of a function @ ( E , y) (see 6.2.2.1, p. 392). If functions M(z,y) and “V(x,y j and their first-order partial derivatives are continuous on a connected domain G, then the equality (9.9c)
488
9. DifferentialEauations
is a necessary and sufficient condition for equation (9.9a) to be exact. In this case the general solution of (9.9a) is the function @(x. y) = C (C = const), (9.9d) which can be calculated according to (8.3.4), 8.3.4.4, p. 468 as the integral Y
X
@(LY) =
J WE>Y) dE + J N x o , a) drl,
(Me)
YO
XO
where xg and yo can be chosen arbitrarily from G. IExamples will be given later.
4. Integrating Factor A function p ( x , y) is called an integratingfactoror a multiplier if the equation (9.10a) Mdx + N d y = 0 multiplied by p(x,y) becomes an exact differential equation. The integrating factor satisfies the differential equation (9.10b) Every particular solution of this equation is an integrating factor. To give a general solution of this partial differential equation is much more complicated than to solve the original equation, so usually we are looking for the solution p(x, y) in a special form, e.g., p(x),p(y), p(xy) or p(x2 + y'). IWe now solve the differential equation (x'+ y) dx - x dy = 0. The equation for the integrating factor dlnp alnp dlnp - (x' + y) -= 2. An integrating factor which is independent of y must satisfy -x = is -2-
aY
dx
OX
1 -2, so p = - Multiplication of the given differential equation by p yields 1 1 '
1 - dx - -dy = 0.
( + J:
'
x
The general solution according to (9.9e) with the selection of xo = 1,yo = 0 is then:
5 . First-Order Linear Differential Equations A first-order linear diferential equation has the form
(9.lla) Y' + P(z)Y = Qb), where the unknown function and its derivative occur only in first degree, and P ( x ) and Q ( x )are given functions. If P ( z ) and Q(x) are continuous functions on a finite, closed interval, then the differential equation here satisfies the conditions of the Picard-Lindeliiftheorem (see p. 608). The integratingfactor is here (9.1lb) the general solution is
Y = exp
(-
P dx)
[QI (/P d x ) dx + C] exp
(9.1 IC)
If we replace the indefinite integrals by definite ones with lower bound xo and upper bound x in this formula. then for the solution y(x0) = C (see 8.2.1.2, l., p. 440). If y1 is any particular solution of the differential equation, then the general solution of the differential equation is given by the formula 2/ = Yl
+ Cexp (- J P d x ) .
(9.1Id)
9.1 Ordinaru Differential Equations 489
If yl(x) and y ~ ( x are ) two linearly independent particular solutions (see 9.1.2.3, 2., p. 498), then we can get the general solution without any integration as (9.1le) Y = Y1 + C(Yz - YI). ISolve the differential equation y' - y t a n z = cosz with the initial condition xo = 0, yo = 0. We calculate exp
(- 1% tanx dx) = cos x and get the solution according to (9.11~): [sinxc;x+x] cos x
-sinx 2
z +-2cosx'
6 . Bernoulli Differential Equations The Bernoulli diferential equation is an equation of the form (9.12) Y' + P(x)Y= Q(x)Y" ( n # O,n # 1)) which can be reduced to a linear differential equation if it is divided by yn and the new variable z = y-n+l is introduced. 4Y = x f i . Since n = 1/2, dividing by fi and introducing the ISolve the differential equation y' - X
dz 22 x new variable z = fi leads to the equation - - - = - , By using the formulas for the solution of a dx x 2 1 linear differential equation we have exp(J P d x ) = - and z = x2 22
so, finally. y = x4
1
(5 In 121 + c)'.
7. Riccati Differential Equations The Riccati diferential equation is the equation (9.13a) Y' = P(x)Y'+ Q(x)Y+ R(x), which cannot usually be solved by elementary integration. However, if we know a particular solution y1 of the Riccati differential equation, then we can reduce the equation to a linear differential equation for z by substituting 1
y=y1+-. 2
(9.13b)
If we also know a second particular solution yz, then 1
21
=YZ - Y l
(9.134
is a particular solution of the linear differential equation for the function z . so its solution can be simplified. If we know three particular solutions y1, yz, and y3, then the general solution of the Riccati differential equation is Y-Yz, Y - Y1 Y3 - Y l By the substitution of
(9.13d)
(9.13e) the Riccati differential equation can be transformed into the normal form du
- = 112 + R(x) dx
(9.13f)
490
9. DifferentialEquations
With the substitution u’ y = -P(x)v we get a second-order linear differential equation (see 9.1.2.6, l.,p. 505) from (9.13a)
Pv”-(P’+PQ)d+P2Rw=0.
(9.13h) 1
4 Solve the differential equation y’+ y2 -y - - = 0. We substitute y = z
+x
22
+ 4(s)into the equation,
+
and for the coefficient of the first power of z we get the expression 2p l/x,which disappears if we 15 substitute 9(s) = -1/2x In this way we get zf - z 2 - - = 0. We are looking for particular solutions 4x2 5 3 in the form z1 = 2 and we get the solutions a1 = - - > a2 = - by substitution, i.e., the two particular X 2 2 1 1 3 5 3 solutions are z1 = z2 = - . The substitution of z = - z1 = - - - results in the equation 22 22 u 21 22 321 1 z u’ + - = 1. Substituting the particular solution u1 = -= - we obtain the general solution X z2 - 21 4 z c 54 +c1 1 3 1 2x4 - 2c1 u=-+-=4x3 and therefore. y = - - - - - = 4 23 21 22 22 25+c1x ’
+
~
9.1.1.3 Implicit Differential Equations 1. Solution in Parametric Form Suppose we have a differential equation in implicit form (9.14) F ( z ,y ; y’) = 0. There are n integral curves passing through a point P(z0,yo) if the following conditions hold: a) The equation F(r0,y0,p) = 0 (p = dy/dz) has n real rootspl!. . . , p , at the point P(z0,yo). b) The function F ( z ,y l p ) and its first partial derivatives are continuous at x = 20.y = yo, p = p i ; furthermore d F / a p # 0. If the original equation can be solved with respect to y’, then it yields n equations of the explicit forms discussed above. Solving these equations we get n families of integral curves. If the equation can be written in the form z = p(y, y’) or y = $(x, y f ) , then putting y’ = p and considering p as an auxiliary variable, after differentiation with respect to y or zwe obtain an equation for dpldy or d p / d x which is solved with respect to the derivative. A solution of this equation together with the original equation (9.14) determines the desired solution in parametric form. Solve the equation z = yy‘ + y’*. We substitute y’ = p and get x = py p2. Differentiation with dx = -1 results in -1 = p (y 2p)-dP or dy PY - 2P2 Solving respect to y and substituting -dy P P dy dp 1 - p 2 - 1 - p 2 ’ c + arcsinp Substituting into the initial equation we get the this equation for y we get y = -p
+
+ +
+
J r ? ’ .
I
solution for x in parametric form.
2. Lagrange Differential Equation The Lagrange diflerential equation is the equation a(y’)z b(y’)y c(y’) = 0. The solution can be determined by the method given above. If for p = PO,
+
ab)
+
+ b ( p b = 0.
(9.14b) then a(p0)x
is a singular solution of (9.14a).
+ b(po)y + c(po) = 0
(9.14a)
(9.14c)
9.1 Ordinaru Differential Equations 491
3. Clairaut Differential Equation The Clazraut dzferentzal equatzon is the special case of the Lagrange differential equation if
4P)+ b(P)P = 0. and so it can be transformed into the form
(9.15a)
Y = Y’Z+ f (YO.
(9.15b)
The general solution is
+
(9.15c) y = cz f ( C ) Besides the general solution, the Clairaut differential equation also has a singular solution, which can be obtained by eliminating the constant C from the equations y = cz
+f(C)
(9.15d)
and 0 = 2
+ f’(C),
(9.15e)
\Ye can obtain the second equation by differentiating the first one with respect to C.Geometrically, the singular solution is the envelope (see 3.6.1.7, p. 237) of the solution family of lines (Fig. 9.3). W Solve the differential equation y = xy’+ Y’~. The general solution is y = C x C*, and we get the singular solution with the help of the equation x + 2C = 0 to eliminate C,and hence x2 4y = 0. Fig. 9.3shows this case.
+
+
\
Figure 9.3
Figure 9.4
9.1.1.4 Singular Integrals and Singular Points 1. Singular element .in element (Q. yo. yh) is called a singular element of the differential equation, if in addition to the differential equation (9.16a) F ( Z , y. y’) = 0 it also satisfies the equation (9.16b)
2. Singular Integral An integral curve from singular elements is called a singular integral curve; the equation P(X, Y) = 0 (9.16~) of a singular integral curve is called a singular integral. The envelopes of the integral curves are singular integral curves (Fig. 9.3);they consist of the singular elements. The uniqueness of the solution (see 9.1.1.1, l., p. 486) fails at the points of a singular integral curve.
3. Determination of Singular Integrals Usually we cannot obtain singular integrals for any values of the arbitrary constants of the general solution. To determine the singular solution of a differential equation (9.16a) with p = y’ we have to
492
9. Di.fferentialEauations
introduce the equation
-8F= o
(9.16d) 8P and to eliminate p . If the obtained relation is a solution of the given differential equation, then it is a singular solution. The equation of this solution should be transformed into a form which does not contain multiple-valued functions, in particular no radicals where the complex values should also be considered. Radicals are expressions we get by nesting algebraic equations (see 2.2.1, p. 60). If the equation of the family of integral curves is known, i.e., the general solution of the given differential equation is known, then we can determine the envelope of the family of curves, the singular integral, with the methods of differential geometry (see 3.6.1.7, p. 237). 4 8 IA: Solve thedifferentialequationz-y--y'+-~'~ = 0. After wesubstitute y' = p , thecalculation 9 27 8 8 of the additional equation with (9.16d) yields - - p + -p2 = 0. Elimination of p results in equation a) 9 9 4 x - y = 0 and b) x - y = - where a) is not a solution, b) is a solution, a special case of the general 27 solution (y - C)' = (x - C)3.The integral curves of a) and b) are shown in Fig. 9.4. IB: Solve the differential equation y'-In 1x1 = 0. We transform the equation into the form eP - 1x1 = dF 0. - ep = 0. We get the singular solution x = 0 eliminatingp. 8P 4. Singular Points of a Differential Equation Singular points of a differential equation are the points where the right side of the differential equation (9.17a) Y' = f ( X >Y) is not defined. This is the case, e.g., in the differential equations of the following forms: 1. Differential Equation with a Fraction of Linear Functions ax by dy (9.17b) (ae - bc # 0) dx c x + e y has an isolated singularpoint at (0,0 ) ,since the assumptions of the existence theorem are fulfilled almost at every point arbitrarily close to (0,O) but not at this point itself. The conditions are not fulfilled at the points where cx + ey = 0. We can force the fulfillment of the conditions at these points if we exchange the role of the variables and we consider the equation dx - cx + ey (9.17~) dy ax+by ' The behavior of the integral curve in the neighborhood of a singular point depends on the roots of the characteristic equation (9.17d) Xz - ( b + c)X t bc - ae = 0. We can distinguish between the following cases: Case 1: If the roots are real and they have the same sign, then the singular point is a branch point. The integral curves in a neighborhood of the singular point pass through it and if the roots of the characteristic equation do not coincide, except for one, they have a common tangent. If the roots coincide, then either all integral curves have the same tangent, or there is a unique integral curve passing through the singular point in each direction. dY 2Y IA: For the differential equation - = - the characteristic equation is Xz - 3X + 2 = 0, XI = 2, dx x XZ = 1. The integral curves have the equation y = C x 2 (Fig. 9.5). The general solution also contains ~
+
9.1 Ordinaru DifferentialEauatzons 493
the line x = 0 if we consider the form x2 = C1 y . dY X + Y IB: The characteristic equation for - = -is X2 - 2X + 1 = 0,XI = X2 = 1. The integral curves
dx
x
+ C x (Fig. 9.6). The singular point is a so-called node. dY = Y IC: The characteristic equation for is X2 - 2X + 1 = 0,XI = Xz = 1. The integral curves dx x are y = x In 1x1
are y = C x (Fig. 9.7). The singular point is a so-called ray point.
Figure 9.5
Figure 9.6
Figure 9.7
Case 2: If the roots are real and they have different signs, the singular point is a saddle poznt, and two of the integral curves pass through it.
dY = -Y is X2 - 1 = 0. XI = +1, X2 = -1. The integral curves D: The characteristic equation for dx x are z y = C (Fig. 9.8). For C = 0 we get the particular solutions x = 0, y = 0. Case 3: If the roots are conjugate complex numbers with a non-zero real part (Re(X) # 0), then the singular point is a spzralpoznt which is also called afocal poznt, and the integral curves wind about this singular point.
dy x + y . E: The characteristic equation for - = IS X2 - 2X + 2 dx X-y integral curves in polar coordinates are r = C eP (Fig. 9.9). ~
=
0,XI = 1 + i,
X2
= 1 - i. The
Yf
Figure 9.8
Figure 9.9
Figure 9.10
Case 4: If the roots are pure imaginary numbers, then the singular point is a central p o d , or center,
which is surrounded by the closed integral curves.
I
494
9. Dzferentzal Eauatzons
dY = -x is X2 + 1 = 0, A1 = i, Az = -i. The integral curves are IF: The characteristic equation for dx Y x2 + y2 = C (Fig. 9.10). 2. Differential Equation with the Ratio of Two Arbitrary Functions (9.18a) has the singular points for the values of the variables where (9.18b) P ( x . y) = Q ( x .y) = 0. If P and Q are continuous functions and they have continuous partial derivatives. (9.18a) can be written in the form
Here xo and yo are the coordinates of the singular point and Pl(x,y) and Q1(z,y) are infinitesimals of a higher order than the distance of the point (x.y) from the singular point (zo, yo). LVith these assumptions the type of a singular point of the given differential equation is the same as that of the approximate equation obtained by omitting the terms Pl and Q1. with the following exceptions: a) If the singular point of the approximate equation is a center, the singular point of the original equation is either a center or a focal point. a b b) If a e - b c = 0, i.e.. - = - or a = c = 0 or a = b = 0, then the type of singular point should be c
e
determined by examining the terms of higher order.
9.1.1.5 Approximation Methods for Solution of First-Order Differential Equations 1. Successive Approximation Method of Picard The integration of the differential equation (9.19a)
Y t = f (2. Y)
with the initial condition y = yo for z = zo results in the fixed-point problem (9.19b) If we substitute another function yl(z) instead of y into the right-hand side of (9.19b), then the result will be a new function yz(x), which is different from yl(z): if yl(z) is not already a solution of (9.19a). After substituting yz(z) instead of y into the right-hand side of (9.19b) we get a function y ~ ( x ) .If the conditions of the existence theorem are fulfilled (see 9.1.1.1. l., p. 486), the sequence of functions y l , y2! y3, . . . converges to the desired solution in a certain interval containing the point 50. This Picard method of successive approximation is an iteration method (see 19.1.1. p. 881). ISolve the differential equation y' = e" - y2 with initial values zo = 0, yo = 0. Rewriting the equation in integral form and using the successive approximation method with an initial approximation 1 yo(x) I 0. x e get: y1 = 1 ' e " d x = e" - 1. yz = [e5 - (e" - l)'] dx = 3e' - -e2" - z etc. 2 2
6"
E,
2. Solution by Series Expansion The Taylor series expansion of the solution of a differential equation (see 7.3.3.3. l., p. 415) can be given in the form y = yo
+ (x - x0)yol+
my{ + . . . (x 2
+ -yo(n)- .oIn n!
+ ...
(9.20)
9.1 Ordinaru Differential Eouations 495
if the values yo‘. yof’) , . . , yo(n),. . . of all derivatives of the solution function are known at the initial value xo of the independent variable. The values of the derivatives can be determined by successively differentiating the original equation and substituting the initial conditions. If the differential equation can be differentiated infinitely many times, the obtained series will be convergent in a certain neighborhood of the initial value of the independent variable. We can use this method also for n-th order differential equations. Remark: The above result is the Taylor series of the function, which may not represent the function itself (see 7.3.3.3, l . ,p. 415). It is often useful to substitute the solution by an infinite series with unknown coefficients, and to determine them by comparing coefficients. A: To solve the differential equation y‘ = e“ - y2>xo = 0, yo = 0 we consider the series y = alx+a2z2+a3z3+...+a,zn t... . Substitutingthisinto the equation consideringthe formula (7.88). p. 414 for the square of the series we get x2 2 3 al + 2a2x + 3 a 3 2 t ’ . [a12x2t 2ala2z3 (a2’ 2ala3)x4 ’ .] = 1 z - - . ’ .. 2 6 1 1 Comparing coefficients we get: a1 = 1, 2a2 = 1, 3a3 a12 = - , 4a4 t 2alaz = -, etc. Solving 2 6 these equations successively and substituting the coefficient values into the series representation we 1 2 23 5 get y = z + - - - - -z4 + . . .. 6 24 2 IB: The same differential equation with the same initial conditions can also be solved in the following way: If we substitute x = 0 into the equation, we get yo’ = 1. By successive differentiation we get y” = e” - 2yy’, yo” = 1; y”’ = e“ - 2y” - 2yy”, yo’’’ = -1, y(4) = e” - 6y’y’‘- 2yy”’, yo(4)= - 5 , From 22 23 5x4 the Taylor theorem (see 7.3.3.3, l . ,p. 415) we get the solution y = x + - - - - - + . . .. 4! 2! 3!
+
+
+
+
+ + + +
+
3. Graphical Solution of Differential Equations
Figure 9.11
The graphical integration of a differential equation is a method, which is based on the direction field (see 9.1.1.1, 3., p. 486). The integral curve in Fig. 9.11 is represented by a broken line which starts at the given initial point and is composed of short line segments. The directions of the line segments are always the same as the direction of the direction field at the starting point of the line segment. This is also the endpoint of the previous line segment.
4. Numerical Solution of Differential Equations The numerical solutions of differential equations will be discussed in detail in 19.4, p. 901. We use numerical methods to determine a solution of a differential equation, if the equation y‘ = f ( x , y) does not belong to the special cases dicussed above whose analytic solutions are known, or if the function f(z.y) is too complicated. This can happen if f ( x . y) is non-linear in y.
9.1.2 Differential Equations of Higher Order and Systems of Differential Equations 9.1.2.1 Basic R e s u l t s 1. Existence of a Solution 1. Reduction to a System of Differential Equations Every explicit n-th order differential equation (9.2la) y(n) = f (x, y. y’; . . . , y(n-1))
I
496
9. Differential Eauations
by introducing the new variables y1
=d,
. . , yn-l = y(n-1)
yz = 2//'>.
(9.21b)
can be reduced to a system of n first-order differential equations (9.21~) 2. Existence of a System of Solutions The system of n differential equations
dYt
, , , syn) dx = fz(z,yl,yz,
(i = 152,. ,,In)i
(9.22a)
which is more general than system (9.21c), has a unique system of solutions (9.22b) yt = 2/i(Z) (i = 1 , 2 , . . . , n)3 which is defined in an interval zo - h 5 z 5 zo h and for z = $0 takes the previously given initial values yz(zo)= yzo (i = 1,2,. . . , n ) ,if the functions fi(z,ylryz, . . . , yn) are continuous with respect to all variables and satisfy the following Lipschitz condition. 3. Lipschitz condition For the values z, and yz + Ayi, which are in a certain neighborhood of the given initial values, the functions f~satisfy the following inequalities: I ~ ~ ( ~ , Y I + A ~ / I ~ Y z + A Y z , -. ..fi(Zr~~>Y~r...r~n)/ .,~~+AY~) I K ( l A ~ ~ l + l A ~ z l + ~ ~ ~ + l A ~ n l (9.23a) ) with a common constant K (see also 9.1.1.1, 2., p. 486). This fact implies that if the function f(z, y, y', . . . , Y ( ~ - ' ) ) is continuous and satisfies the Lipschitz condition (9.23a): then the equation
+
y(n) = f ( 2 , y: y l l . .
.
~
y(n-1))
(9.23b)
= yo@-'), and it has a unique solution with the initial values ~ ( z o=) yo, ~ ' ( z o=) yo', . . . , ~(~-')(zo) is (n - 1) times continuously differentiable.
2. General Solution 1. The general solution of the differential equation (9.23b) contains n independent arbitrary constants: (9.24a) y = y(z.C1,Cz,. . ..Cn). 2. In the geometrical interpretation the equation (9.24a) defines a family of curves depending on n parameters. Every single one of these integral curves, i.e., the graph of the corresponding particular solution, can be obtained by a suitable choice of the constants C1, Cz, . . . , Cn. If the solution has to satisfy the above initial conditions, then the values C1,Cz, . . . , Cn are determined from the following equations:
Y(~o~CI,...,C~) =YO, (9.24b)
.........................
If these equations are inconsistent for any initial values in a certain domain, then the solution is not general in this domain, Le., the arbitrary constants cannot be chosen independently. 3. The general solution of system (9.22a) also contains n arbitrary constants. This general solution can be represented in two different ways: Either it is given in a form which is solved for the unknown functions (9.25a) ?/I = FI(S,CI,. , , ,Cn), YZ = F z ( ~ , C I , ,,,Cn), , , . , > 2 / n = Fn(Z,Cl,.. , > Cn)
9.1 Ordinaru DifferentialEquations 497
or in the form which is solved for the constants P I ( Z , YI,, . ~ n = ) C1, ’PZ(Z,Y I ) , . , In the case of (9.25b) each relation , j
~ n= )
CZ,. . . , c~n(x,Y I ~ ,,
Yn) = Cn.
(9.25b)
(9.25~) P ~ ( x * Y,I ,SYn) , = Ct is a first zntegral of the system (9.22a). The first integral can be defined independently of the general solution as a relation (9.25~).That is, (9.25~)will be an identity if we replace y1, yz,. . . , yn by any particular solution of the given system and we replace the constant by the arbitrary constant C, determined by this particular solution. If any first integral is known in the form (9.25c), then the function cpt(x,y1,. . . , yn) satisfies the partial different equation (9.25d) Conversely, each solution cp,(e,yl,. . . . yn) of the partial differential equation (9.25d) defines a first integral of the system (9.22a) in the form (9.25~).The general solution of the system (9.22a) can be represented as a system of n first integrals of system (9.22a), if the corresponding functions cp,(xl yl,. . . , gn) (z = 1.2. . . , n) are linearly independent (see 9.1.2.3, 2., p. 498).
9.1.2.2 Lowering the Order One of the most important solution methods for n-th order differential equations
f (G Y>Y’> , Y(nl) ’ ’ ’
(9.26)
=0
is the substitution of variables in order to obtain a simpler differential equation, especially one of lower order. We can distinguish between different cases. 1. f = f(y, y’, ,y‘”)), Le., z does not appear explicitly:
. ..
f (y, yl>.. . ,
p)= 0.
(9.27a)
By substitution dP (9.27b) d y = p . d2Y -p-&.... dx we can reduce the order of the differential equation from n to ( n - 1). I\Ve reduce the order of the differential equation yy” - yJ2 = 0 to one. With the substitution y’ = p.pdp/dy = y” it becomes a first-order differential equation y pdpldy - p 2 = 0, and y dpldy - p = 0 results in p = Cy = dg/dx, y = Clecz. Canceling p does not result in a loss of a solution, since p = 0 gives the solution y = C1, which is included in the general solution with C = 0. 2. f = f(z,y’, ,y@)), Le., y does not appear explicitly: -
. ..
f (z,yt, ...,y‘”’) = 0.
(9.28a)
The order of the differential equation can be reduced from n to (n - 1) by the substitution
y' = p.   (9.28b)
If in addition the first k derivatives are missing from the equation, then a suitable substitution is
y^{(k+1)} = p.   (9.28c)
■ The order of the differential equation y'' - x y''' + (y''')^3 = 0 is reduced by the substitution y'' = p, so we get the Clairaut differential equation p - x dp/dx + (dp/dx)^3 = 0, whose general solution is p = C_1 x - C_1^3. Therefore y = C_1 x^3/6 - C_1^3 x^2/2 + C_2 x + C_3. From the singular solution of the Clairaut differential equation
p = ±(2x/3) √(x/3)
we get the singular solution of the original equation: y = ±(8x^3/315) √(3x) + C_1 x + C_2.
3. f(x, y, y', ..., y^{(n)}) is a homogeneous function (see 2.18.2.4, 4., p. 120) in y, y', y'', ..., y^{(n)}:
f(x, y, y', ..., y^{(n)}) = 0.   (9.29a)
We can reduce the order by the substitution
z = y'/y,  i.e.,  y = e^{∫ z dx}.   (9.29b)
■ We transform the differential equation y y'' - y'^2 = 0 by the substitution z = y'/y. Then dz/dx = (y y'' - y'^2)/y^2, so the order is reduced by one. We get z = C_1, therefore ln|y| = C_1 x + C_2, or y = C e^{C_1 x} with ln|C| = C_2.
4. f = f(x, y, y', ..., y^{(n)}) is a function of x only:
y^{(n)} = f(x).   (9.30a)
We get the general solution by n repeated integrations. It has the form
y = C_1 + C_2 x + C_3 x^2 + ... + C_n x^{n-1} + Ψ(x)   (9.30b)
with
Ψ(x) = ∫_{x_0}^{x} ∫_{x_0}^{x} ... ∫_{x_0}^{x} f(x) (dx)^n   (n-fold integral).   (9.30c)
We mention here that x_0 is not an additional arbitrary constant, since a change of x_0 only results in a change of the constants C_k because of the relation
C_k = y^{(k-1)}(x_0)/(k - 1)!.   (9.30d)
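The order-lowering substitution of case 1 can be checked symbolically. The following short Python sketch (SymPy is assumed to be available; it is an illustration, not part of the handbook) verifies the example y y'' - y'^2 = 0 and its two-parameter solution family y = C_1 e^{Cx}.

```python
# Illustrative sketch (SymPy assumed): the equation y*y'' - (y')^2 = 0 of case 1
# and its two-parameter solution family y = C1*exp(C*x).
import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')

ode = sp.Eq(y(x)*y(x).diff(x, 2) - y(x).diff(x)**2, 0)
print(sp.dsolve(ode))            # SymPy returns the family C1*exp(C2*x) (up to naming)

# Direct verification of the family obtained by the substitution y' = p(y):
C1, C = sp.symbols('C1 C')
sol = C1*sp.exp(C*x)
print(sp.simplify(sol*sol.diff(x, 2) - sol.diff(x)**2))   # 0
```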
9.1.2.3 Linear n-th Order Differential Equations
1. Classification A differential equation of the form
y^{(n)} + a_1 y^{(n-1)} + a_2 y^{(n-2)} + ... + a_{n-1} y' + a_n y = F   (9.31)
is called an n-th order linear differential equation. Here F and the coefficients a_i are functions of x, which are supposed to be continuous in a certain interval. If a_1, a_2, ..., a_n are constants, we call it a differential equation with constant coefficients. If F ≡ 0, then the linear differential equation is homogeneous, and if F ≢ 0, then it is inhomogeneous.
2. Fundamental System of Solutions A system of n solutions y_1, y_2, ..., y_n of a homogeneous linear differential equation is called a fundamental system if these functions are linearly independent on the considered interval, i.e., their linear combination C_1 y_1 + C_2 y_2 + ... + C_n y_n is not identically zero for any system of values C_1, C_2, ..., C_n except for C_1 = C_2 = ... = C_n = 0. The solutions y_1, y_2, ..., y_n of a linear homogeneous differential equation form a fundamental system on the considered interval if and only if their Wronskian determinant
W(x) = | y_1        y_2        ...  y_n        |
       | y_1'       y_2'       ...  y_n'       |
       | ...        ...        ...  ...        |
       | y_1^{(n-1)} y_2^{(n-1)} ... y_n^{(n-1)} |   (9.32)
is non-zero. For every solution system of a homogeneous linear differential equation the formula of Liouville is valid:
W(x) = W(x_0) exp( -∫_{x_0}^{x} a_1(t) dt ).   (9.33)
It follows from (9.33) that if the Wronskian determinant is zero somewhere in the solution interval, then it is identically zero. This means: the n solutions y_1, y_2, ..., y_n of the homogeneous linear differential equation are linearly dependent if W(x_0) = 0 even at a single point x_0 of the considered interval. If the solutions y_1, y_2, ..., y_n form a fundamental system of the differential equation, then the general solution of the homogeneous linear differential equation corresponding to (9.31) is given as
y = C_1 y_1 + C_2 y_2 + ... + C_n y_n.   (9.34)
A linear n-th order homogeneous differential equation has exactly n linearly independent solutions on an interval where the coefficient functions a_i(x) are continuous.
3. Lowering the Order If we know a particular solution y_1 of the homogeneous differential equation, then by the substitution
y = y_1(x) u(x)   (9.35)
we can determine further solutions from a homogeneous linear differential equation of order n - 1 for u'(x).
4. Superposition Principle If y_1 and y_2 are two solutions of the differential equation (9.31) for different right-hand sides F_1 and F_2, then their sum y = y_1 + y_2 is a solution of the same differential equation with the right-hand side F = F_1 + F_2. From this observation it follows that to get the general solution of an inhomogeneous differential equation it is sufficient to add any particular solution of the inhomogeneous differential equation to the general solution of the corresponding homogeneous differential equation.
5. Decomposition Theorem If an inhomogeneous differential equation (9.31) has real coefficients and its right-hand side is complex in the form F = F_1 + iF_2 with real functions F_1 and F_2, then the solution y = y_1 + iy_2 is also complex, where y_1 and y_2 are the solutions of the two inhomogeneous differential equations (9.31) with the corresponding right-hand sides F_1 and F_2.
6. Solution of Inhomogeneous Differential Equations (9.31) by Means of Quadratures If the fundamental system of the corresponding homogeneous differential equation is already known, we have the following two solution methods to continue our calculations:
1. Method of Variation of Constants We are looking for the solution in the form
y = C_1 y_1 + C_2 y_2 + ... + C_n y_n,   (9.36a)
where C_1, C_2, ..., C_n are functions of x. There are infinitely many such systems of functions, but if we require that they satisfy the equations
C_1' y_1 + C_2' y_2 + ... + C_n' y_n = 0,
C_1' y_1' + C_2' y_2' + ... + C_n' y_n' = 0,
...............
C_1' y_1^{(n-2)} + C_2' y_2^{(n-2)} + ... + C_n' y_n^{(n-2)} = 0,   (9.36b)
and substitute y into (9.31) using these equalities, we get
C_1' y_1^{(n-1)} + C_2' y_2^{(n-1)} + ... + C_n' y_n^{(n-1)} = F.   (9.36c)
Because the Wronskian determinant of the coefficients in the linear equation system (9.36b) and (9.36c) is different from zero, we get a unique solution for the unknown functions C_1', C_2', ..., C_n', and their integrals give the functions C_1, C_2, ..., C_n.
■ y'' + (x/(1-x)) y' - (1/(1-x)) y = x - 1.
In the interval x > 1 or x < 1 all assumptions on the coefficients are fulfilled. First we solve the homogeneous equation y'' + (x/(1-x)) y' - (1/(1-x)) y = 0. A particular solution is φ_1 = e^x. We look for a second one in the form φ_2 = e^x u(x), and with the notation u'(x) = w(x) we get the first-order differential equation w' + (1 + 1/(1-x)) w = 0. A solution of this equation is w(x) = (1 - x) e^{-x}, and therefore u(x) = ∫ w(x) dx = ∫ (1 - x) e^{-x} dx = x e^{-x}. With this result we get φ_2 = x as the second element of the fundamental system. The general solution of the homogeneous equation is y(x) = C_1 e^x + C_2 x. The variation of constants now reads:
y(x) = u_1(x) e^x + u_2(x) x,
y'(x) = u_1(x) e^x + u_2(x) + u_1'(x) e^x + u_2'(x) x,   with   u_1'(x) e^x + u_2'(x) x = 0,
y''(x) = u_1(x) e^x + u_1'(x) e^x + u_2'(x),              with   u_1'(x) e^x + u_2'(x) = x - 1,
so
u_1'(x) = x e^{-x},  u_2'(x) = -1,  i.e.,  u_1(x) = -(1 + x) e^{-x} + C_1,  u_2(x) = -x + C_2.
With this result the general solution of the inhomogeneous differential equation is:
y(x) = -(1 + x) - x^2 + C_1 e^x + (C_2 - 1) x = -(1 + x^2) + C_1* e^x + C_2* x.
2. Method of Cauchy In the general solution
y = C_1 y_1 + C_2 y_2 + ... + C_n y_n   (9.37a)
of the homogeneous differential equation associated with (9.31) we determine the constants such that for an arbitrary parameter α the equations y = 0, y' = 0, ..., y^{(n-2)} = 0, y^{(n-1)} = F(α) are satisfied at x = α. In this way we get a particular solution of the homogeneous equation, denoted by φ(x, α), and then
y = ∫_{x_0}^{x} φ(x, α) dα   (9.37b)
is a particular solution of the inhomogeneous differential equation (9.31). This solution and its derivatives up to order (n - 1) are equal to zero at the point x = x_0.
■ The general solution of the homogeneous equation associated with the differential equation solved above by the method of variation of constants is y = C_1 e^x + C_2 x. From this we obtain y(α) = C_1 e^α + C_2 α = 0, y'(α) = C_1 e^α + C_2 = α - 1 and φ(x, α) = α e^{-α} e^x - x, so the particular solution y(x) of the inhomogeneous differential equation with y(x_0) = y'(x_0) = 0 is:
y(x) = ∫_{x_0}^{x} (α e^{-α} e^x - x) dα = (x_0 + 1) e^{x-x_0} + (x_0 - 1) x - x^2 - 1.
Therefore, we again obtain the general solution y(x) = C_1* e^x + C_2* x - (x^2 + 1) of the inhomogeneous differential equation.
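The result of the example can be checked symbolically. The following Python sketch (SymPy assumed; an illustration, not the handbook's method) verifies that the reconstructed general solution satisfies the equation for arbitrary constants.

```python
# Illustrative check (SymPy assumed) of the example above:
# y'' + x/(1-x) y' - 1/(1-x) y = x - 1, general solution y = -(1+x^2) + C1 e^x + C2 x.
import sympy as sp

x, C1, C2 = sp.symbols('x C1 C2')
y = -(1 + x**2) + C1*sp.exp(x) + C2*x

lhs = y.diff(x, 2) + x/(1 - x)*y.diff(x) - y/(1 - x)
print(sp.simplify(lhs - (x - 1)))   # 0 for all C1, C2
```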
9.1.2.4 Solution of Linear Differential Equations with Constant Coefficients
1. Operational Notation The differential equation (9.31) can be written symbolically in the form
P_n(D) y = (D^n + a_1 D^{n-1} + a_2 D^{n-2} + ... + a_{n-1} D + a_n) y = F,   (9.38a)
where D is the differential operator:
D y = dy/dx,  D^k y = d^k y/dx^k.   (9.38b)
If the coefficients a_i are constants, then P_n(D) is an ordinary polynomial of degree n in the operator D.
2. Solution of the Homogeneous Differential Equation with Constant Coefficients To determine the general solution of the homogeneous differential equation (9.38a) with F = 0, i.e.,
P_n(D) y = 0,   (9.39a)
we have to find the roots r_1, r_2, ..., r_n of the characteristic equation
P_n(r) = r^n + a_1 r^{n-1} + a_2 r^{n-2} + ... + a_{n-1} r + a_n = 0.   (9.39b)
Every simple root r_i determines a solution e^{r_i x} of the equation P_n(D) y = 0. If a root r_i has a higher multiplicity k, then x e^{r_i x}, x^2 e^{r_i x}, ..., x^{k-1} e^{r_i x} are also solutions. The linear combination of all these solutions is the general solution of the homogeneous differential equation:
y = C_1 e^{r_1 x} + C_2 e^{r_2 x} + ... + e^{r_l x}(C_l + C_{l+1} x + ... + C_{l+k-1} x^{k-1}) + ... .   (9.39c)
If the coefficients a_i are all real, then the complex roots of the characteristic equation occur in conjugate pairs with the same multiplicity. In this case, for r_1 = α + iβ and r_2 = α - iβ we can replace the corresponding complex solution functions e^{r_1 x} and e^{r_2 x} by the real functions e^{αx} cos βx and e^{αx} sin βx. The resulting expression C_1 cos βx + C_2 sin βx can be written in the form A cos(βx + φ) with constants A and φ.
■ In the case of the differential equation y^{(6)} + y^{(4)} - y'' - y = 0 the characteristic equation is r^6 + r^4 - r^2 - 1 = 0 with roots r_1 = 1, r_2 = -1, r_{3,4} = i, r_{5,6} = -i. The general solution can be given in two forms:
y = C_1 e^x + C_2 e^{-x} + (C_3 + C_4 x) cos x + (C_5 + C_6 x) sin x,  or
y = C_1 e^x + C_2 e^{-x} + A_1 cos(x + φ_1) + x A_2 cos(x + φ_2).
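The roots of a characteristic polynomial can of course also be found numerically. The following short Python sketch (NumPy assumed; an illustration only) reproduces the roots of the example above.

```python
# Illustrative sketch (NumPy assumed): roots of r^6 + r^4 - r^2 - 1 = 0.
import numpy as np

coeffs = [1, 0, 1, 0, -1, 0, -1]     # highest power first
roots = np.roots(coeffs)
print(np.round(roots, 6))
# Expected: 1, -1 and the double pairs +i, -i, giving
# y = C1*e^x + C2*e^{-x} + (C3 + C4*x)*cos x + (C5 + C6*x)*sin x.
```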
3. Hurwitz Theorem In various applications, e.g., in vibration theory, it is important to know whether the solutions of a given homogeneous differential equation with constant coefficients tend to zero for x → +∞ or not. They obviously tend to zero if the real parts of all roots of the characteristic equation (9.39b) are negative. According to the Hurwitz theorem an equation
a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 = 0   (9.40a)
has only roots with negative real part if and only if all the determinants
D_1, D_2, ..., D_n   (with a_m = 0 for m > n)   (9.40b)
are positive. The determinants D_k have on their diagonal the coefficients a_1, a_2, ..., a_k (k = 1, 2, ..., n), and the coefficient indices decrease from left to right within each row. Coefficients with negative indices and also with indices larger than n are set equal to 0.
■ For a cubic polynomial the determinants have, in accordance with (9.40b), the form D_1 = a_1, D_2 = a_1 a_2 - a_0 a_3, D_3 = a_3 D_2.
4. Solution of Inhomogeneous Differential Equations with Constant Coefficients A particular solution can be determined by the method of variation of constants, by the method of Cauchy, or by the operator method (see 9.2.2.3, 5., p. 532). If the right-hand side of the inhomogeneous differential equation (9.31) has a special form, a particular solution can be determined very easily.
1. Form: F(x) = A e^{αx},  P_n(α) ≠ 0.   (9.41a)
A particular solution is
y = A e^{αx} / P_n(α).   (9.41b)
If α is a root of the characteristic equation of multiplicity m, i.e., if
P_n(α) = P_n'(α) = ... = P_n^{(m-1)}(α) = 0,   (9.41c)
then y = A x^m e^{αx} / P_n^{(m)}(α) is a particular solution. These formulas can also be used, by applying the decomposition theorem, if the right-hand side is
F(x) = A e^{αx} cos ωx   or   A e^{αx} sin ωx.   (9.41d)
The corresponding particular solutions are the real or the imaginary part of the solution of the same differential equation with
F(x) = A e^{αx}(cos ωx + i sin ωx) = A e^{(α+iω)x}   (9.41e)
on the right-hand side.
■ A: For the differential equation y'' - 6y' + 8y = e^{2x} the characteristic polynomial is P(D) = D^2 - 6D + 8 with P(2) = 0, and P'(D) = 2D - 6 with P'(2) = 2·2 - 6 = -2, so the particular solution is y = -x e^{2x}/2.
■ B: The differential equation y'' + y' + y = e^x sin x results in the equation (D^2 + D + 1) y = e^{(1+i)x}. From its solution
y = e^{(1+i)x} / ((1+i)^2 + (1+i) + 1) = e^x(cos x + i sin x)/(2 + 3i)
we get the particular solution y_1 = (e^x/13)(2 sin x - 3 cos x). Here y_1 is the imaginary part of y.
2. Form: F(x) = Q_n(x) e^{αx},  Q_n(x) a polynomial of degree n.   (9.42)
A particular solution can always be found in the same form, i.e., as an expression y = R(x) e^{αx}, where R(x) is a polynomial of degree n, multiplied by x^m if α is a root of the characteristic equation with multiplicity m. We consider the coefficients of the polynomial R(x) as unknowns and substitute the expression into the inhomogeneous differential equation. It must satisfy the equation, so we get a linear equation system for the coefficients, and this system always has a unique solution. This method is especially useful in the cases F(x) = Q_n(x) for α = 0 and F(x) = Q_n(x) e^{rx} cos ωx or F(x) = Q_n(x) e^{rx} sin ωx for α = r ± iω. Then there is a solution in the form y = x^m e^{rx}[M_n(x) cos ωx + N_n(x) sin ωx].
■ The roots of the characteristic equation associated with the differential equation y^{(4)} + 2y''' + y'' = 6x + 2x sin x are k_1 = k_2 = 0, k_3 = k_4 = -1. Because of the superposition principle (see 9.1.2.3, 4., p. 499) we can calculate particular solutions of the inhomogeneous differential equation for the summands of the right-hand side separately. For the first summand the substitution of the given form y_1 = x^2(ax + b) leads to 12a + 2b + 6ax = 6x, and so a = 1 and b = -6. For the second summand we substitute y_2 = (cx + d) sin x + (fx + g) cos x. We get the coefficients by coefficient comparison from (2g + 2f - 6c + 2fx) sin x - (2c + 2d + 6f + 2cx) cos x = 2x sin x, so c = 0, d = -3, f = 1, g = -1. Therefore, the general solution is y = C_1 + C_2 x - 6x^2 + x^3 + (C_3 x + C_4) e^{-x} - 3 sin x + (x - 1) cos x.
3. Euler Differential Equation The Euler differential equation
Σ_{k=0}^{n} a_k (cx + d)^k y^{(k)} = F(x)   (9.43a)
with the substitution
cx + d = e^t   (9.43b)
can be transformed into a linear differential equation with constant coefficients.
■ The differential equation x^2 y'' - 5x y' + 8y = x^2 is a special case of the Euler differential equation for n = 2. With the substitution x = e^t it becomes the differential equation discussed earlier in example A on p. 502:
d^2y/dt^2 - 6 dy/dt + 8y = e^{2t}.
The general solution is y = C_1 e^{2t} + C_2 e^{4t} - (t/2) e^{2t}, where t = ln x.
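The particular solution of example A (and hence of the Euler equation after the substitution) can be verified directly. The Python sketch below (SymPy assumed; illustration only) checks the residual of y = -x e^{2x}/2.

```python
# Illustrative check (SymPy assumed) of example A: y'' - 6y' + 8y = e^{2x}
# with the particular solution y = -x e^{2x}/2 given above.
import sympy as sp

x = sp.Symbol('x')
y_p = -x*sp.exp(2*x)/2
residual = y_p.diff(x, 2) - 6*y_p.diff(x) + 8*y_p - sp.exp(2*x)
print(sp.simplify(residual))   # 0
```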
9.1.2.5 Systems of Linear Differential Equations with Constant Coefficients
1. Normal Form The following simple case of a first-order linear differential equation system with constant coefficients is called a normal system or a normal form:
y_1' = a_{11} y_1 + a_{12} y_2 + ... + a_{1n} y_n,
y_2' = a_{21} y_1 + a_{22} y_2 + ... + a_{2n} y_n,
.................................
y_n' = a_{n1} y_1 + a_{n2} y_2 + ... + a_{nn} y_n.   (9.44a)
To find the general solution of such a system, we first have to find the roots of the characteristic equation
| a_{11} - r   a_{12}      ...  a_{1n}     |
| a_{21}       a_{22} - r  ...  a_{2n}     |
| .........................................|
| a_{n1}       a_{n2}      ...  a_{nn} - r | = 0.   (9.44b)
To every simple root r_i of this equation there corresponds a system of particular solutions
y_1 = A_1 e^{r_i x},  y_2 = A_2 e^{r_i x}, ...,  y_n = A_n e^{r_i x},   (9.44c)
whose coefficients A_k (k = 1, 2, ..., n) are determined from the homogeneous linear equation system
(a_{11} - r_i) A_1 + a_{12} A_2 + ... + a_{1n} A_n = 0,
.......................................
a_{n1} A_1 + a_{n2} A_2 + ... + (a_{nn} - r_i) A_n = 0.   (9.44d)
This system gives the ratios of the values A_k (see Trivial solution and fundamental system in 1.4.2.1, 2., p. 273). For every r_i the particular solutions obtained in this way contain one arbitrary constant. If all the roots of the characteristic equation are different, the sum of these particular solutions contains n independent arbitrary constants, so in this way we get the general solution. If a root r_i has multiplicity m in the characteristic equation, the system of particular solutions corresponding to this root has the form
y_1 = A_1(x) e^{r_i x},  y_2 = A_2(x) e^{r_i x}, ...,  y_n = A_n(x) e^{r_i x},   (9.44e)
where A_1(x), ..., A_n(x) are polynomials of degree at most m - 1. We substitute these expressions with unknown coefficients of the polynomials A_k(x) into the differential equation system. We can first cancel the factor e^{r_i x}, then compare the coefficients of the different powers of x to obtain linear equations for the unknown coefficients of the polynomials; among them m can be chosen freely. In this way we get a part of the solution with m arbitrary constants. The degree of the polynomials can be less than m - 1. In the special case when the system (9.44a) is symmetric, i.e., when a_{ik} = a_{ki}, it is sufficient to substitute A_i(x) = const. For complex roots of the characteristic equation, the general solution can be transformed into a real form in the same way as shown earlier for the case of a differential equation with constant coefficients (see 9.1.2.4, p. 500).
■ For the system y_1' = 2y_1 + 2y_2 - y_3, y_2' = -2y_1 + 4y_2 + y_3, y_3' = -3y_1 + 8y_2 + 2y_3 the characteristic
equation has the form
| 2 - r    2      -1    |
| -2       4 - r   1    |
| -3       8      2 - r | = -(r - 6)(r - 1)^2 = 0.
For the simple root r_1 = 6 we get -4A_1 + 2A_2 - A_3 = 0, -2A_1 - 2A_2 + A_3 = 0, -3A_1 + 8A_2 - 4A_3 = 0. From this system we have A_1 = 0, A_2 = (1/2)A_3 = C_1, so y_1 = 0, y_2 = C_1 e^{6x}, y_3 = 2C_1 e^{6x}. For the multiple root r_2 = 1 we set y_1 = (P_1 x + Q_1) e^x, y_2 = (P_2 x + Q_2) e^x, y_3 = (P_3 x + Q_3) e^x. Substitution into the equations yields
P_1 x + (P_1 + Q_1) = (2P_1 + 2P_2 - P_3) x + (2Q_1 + 2Q_2 - Q_3),
P_2 x + (P_2 + Q_2) = (-2P_1 + 4P_2 + P_3) x + (-2Q_1 + 4Q_2 + Q_3),
P_3 x + (P_3 + Q_3) = (-3P_1 + 8P_2 + 2P_3) x + (-3Q_1 + 8Q_2 + 2Q_3),
which implies P_1 = 5C_2, P_2 = C_2, P_3 = 7C_2, Q_1 = 5C_3 - 6C_2, Q_2 = C_3, Q_3 = 7C_3 - 11C_2. The general solution is y_1 = (5C_2 x + 5C_3 - 6C_2) e^x, y_2 = C_1 e^{6x} + (C_2 x + C_3) e^x, y_3 = 2C_1 e^{6x} + (7C_2 x + 7C_3 - 11C_2) e^x.
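The eigenvalue structure used in this example can be cross-checked numerically. The Python sketch below (NumPy assumed; illustration only) computes the eigenvalues of the coefficient matrix.

```python
# Numerical cross-check (NumPy assumed) of the eigenvalues in the example above.
import numpy as np

A = np.array([[ 2.0, 2.0, -1.0],
              [-2.0, 4.0,  1.0],
              [-3.0, 8.0,  2.0]])
vals, vecs = np.linalg.eig(A)
print(np.round(vals, 6))    # 6 and the double eigenvalue 1
# The double eigenvalue 1 has only a one-dimensional eigenspace, which is why the
# corresponding part of the solution contains the polynomial factors (P_i x + Q_i) e^x.
```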
2. Homogeneous Systems of First-Order Linear Differential Equations with Constant Coefficients have the general form
Σ_{k=1}^{n} a_{ik} y_k' + Σ_{k=1}^{n} b_{ik} y_k = 0   (i = 1, 2, ..., n).   (9.45a)
If the determinant det(a_{ik}) does not vanish, i.e.,
det(a_{ik}) ≠ 0,   (9.45b)
then the system (9.45a) can be transformed into the normal form (9.44a). In the case det(a_{ik}) = 0 further investigations are needed (see [9.15]). The solution can be determined from the general form in the same way as shown for the normal form. The characteristic equation has the form
det(a_{ik} r + b_{ik}) = 0.   (9.45c)
The coefficients A_k in the solution (9.44c) corresponding to a simple root r_j are determined from the equation system
Σ_{k=1}^{n} (a_{ik} r_j + b_{ik}) A_k = 0   (i = 1, 2, ..., n).   (9.45d)
Otherwise the solution method follows the same ideas as in the case of the normal form.
■ The characteristic equation of the two differential equations 5y_1' + 4y_1 - 2y_2' - y_2 = 0, y_1' + 8y_1 - 3y_2 = 0 is
| 5r + 4   -2r - 1 |
| r + 8      -3    | = 2r^2 + 2r - 4 = 0,   r_1 = 1,  r_2 = -2.
We get the coefficients A_1 and A_2 for r_1 = 1 from the equations 9A_1 - 3A_2 = 0, 9A_1 - 3A_2 = 0, so A_2 = 3A_1 = 3C_1. For r_2 = -2 we get analogously A_2 = 2A_1 = 2C_2. The general solution is y_1 = C_1 e^x + C_2 e^{-2x}, y_2 = 3C_1 e^x + 2C_2 e^{-2x}.
3. Inhomogeneous Systems of First-Order Linear Differential Equations have the general form
Σ_{k=1}^{n} a_{ik} y_k' + Σ_{k=1}^{n} b_{ik} y_k = F_i(x)   (i = 1, 2, ..., n).   (9.46)
1. Superposition Principle: If y_j^{(1)} and y_j^{(2)} (j = 1, 2, ..., n) are solutions of inhomogeneous systems which differ from each other only in their right-hand sides F_i^{(1)} and F_i^{(2)}, then the sum y_j = y_j^{(1)} + y_j^{(2)} (j = 1, 2, ..., n) is a solution of the system with the right-hand side F_i(x) = F_i^{(1)}(x) + F_i^{(2)}(x). Because of this, to get the general solution of an inhomogeneous system it is enough to add a particular solution to the general solution of the corresponding homogeneous system.
2. The Variation of Constants can be used to get a particular solution of the inhomogeneous system. To do this we use the general solution of the homogeneous system and consider the constants C_1, C_2, ..., C_n as unknown functions C_1(x), C_2(x), ..., C_n(x). We substitute this into the inhomogeneous system. The expressions for the derivatives y_k' contain the derivatives of the new unknown functions C_k(x). Because y_1, y_2, ..., y_n are solutions of the homogeneous system, the terms containing the new unknown functions themselves cancel; only their derivatives remain in the equations. We get for the functions C_k'(x) an inhomogeneous linear algebraic equation system which always has a unique solution. After n integrations we get the functions C_1(x), C_2(x), ..., C_n(x). Substituting them into the solution of the homogeneous system instead of the constants yields a particular solution of the inhomogeneous system.
■ For the system of two inhomogeneous differential equations 5y_1' + 4y_1 - 2y_2' - y_2 = e^{-x}, y_1' + 8y_1 - 3y_2 = 5e^{-x} the general solution of the homogeneous system is (see p. 504) y_1 = C_1 e^x + C_2 e^{-2x}, y_2 = 3C_1 e^x + 2C_2 e^{-2x}. Considering the constants C_1 and C_2 as functions of x and substituting into the original equations we get
5C_1' e^x + 5C_2' e^{-2x} - 6C_1' e^x - 4C_2' e^{-2x} = e^{-x},   C_1' e^x + C_2' e^{-2x} = 5e^{-x},
or
C_2' e^{-2x} - C_1' e^x = e^{-x},   C_1' e^x + C_2' e^{-2x} = 5e^{-x}.
Therefore 2C_1' e^x = 4e^{-x}, C_1 = -e^{-2x} + const, and 2C_2' e^{-2x} = 6e^{-x}, C_2 = 3e^x + const. Since we are looking for a particular solution, we can replace every constant by zero, and the result is y_1 = 2e^{-x}, y_2 = 3e^{-x}. The general solution is finally y_1 = 2e^{-x} + C_1 e^x + C_2 e^{-2x}, y_2 = 3e^{-x} + 3C_1 e^x + 2C_2 e^{-2x}.
3. The Method of Unknown Coefficients is especially useful if the right-hand side contains special functions of the form Q_n(x) e^{αx}. The application is similar to the one used for differential equations of n-th order (see 9.1.2.4, p. 503).
4. Second-Order Systems The methods introduced above can also be used for systems of differential equations of higher order. For the system
Σ_{k=1}^{n} a_{ik} y_k'' + Σ_{k=1}^{n} b_{ik} y_k' + Σ_{k=1}^{n} c_{ik} y_k = 0   (i = 1, 2, ..., n)   (9.47)
we can determine particular solutions in the form y_i = A_i e^{rx}. To do this, we get r from the characteristic equation det(a_{ik} r^2 + b_{ik} r + c_{ik}) = 0, and we get the A_i from the corresponding linear homogeneous algebraic equations.
9.1.2.6 Linear Second-Order Differential Equations Many special differential equations belong to this class, which often occur in practical applications. We discuss several of them in this paragraph. For more details of representation, properties and solution methods see [9.15].
1. General Methods
1. The Inhomogeneous Differential Equation is
y'' + p(x) y' + q(x) y = F(x).   (9.48a)
a) The general solution of the corresponding homogeneous differential equation, i.e., with F(x) = 0, is
y = C_1 y_1 + C_2 y_2.   (9.48b)
Here y_1 and y_2 are two linearly independent particular solutions of this equation (see 9.1.2.3, 2., p. 498). If a particular solution y_1 is already known, then a second one y_2 can be determined by the equation
y_2 = y_1 ∫ (A / y_1^2) e^{-∫ p(x) dx} dx,   (9.48c)
which follows from the Liouville formula (9.33), where A can be chosen arbitrarily.
b) A particular solution of the inhomogeneous equation can be determined by the formula
y = y_2 ∫ (y_1 F / W) dx - y_1 ∫ (y_2 F / W) dx   with   W = y_1 y_2' - y_2 y_1',   (9.48d)
where y_1 and y_2 are two particular solutions of the corresponding homogeneous differential equation.
c) A particular solution of the inhomogeneous differential equation can also be determined by variation of constants (see 9.1.2.3, 6., p. 499).
2. If in the Inhomogeneous Differential Equation
s(x) y'' + p(x) y' + q(x) y = F(x)   (9.49a)
the functions s(x), p(x), q(x) and F(x) are polynomials or functions which can be expanded into a convergent power series around x_0 in a certain domain, where s(x_0) ≠ 0, then the solutions of this differential equation can also be expanded into a similar series, and these series are convergent in the same domain. We determine them by the method of undetermined coefficients: The solution we are looking for has the series form
y = a_0 + a_1(x - x_0) + a_2(x - x_0)^2 + ...,   (9.49b)
and we substitute it into the differential equation (9.49a). Equating like coefficients (of the same powers of (x - x_0)) results in equations to determine the coefficients a_0, a_1, a_2, ... .
■ To solve the differential equation y'' + xy = 0 we substitute y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + ..., y' = a_1 + 2a_2 x + 3a_3 x^2 + ... and y'' = 2a_2 + 6a_3 x + ... . We get 2a_2 = 0, 6a_3 + a_0 = 0, ... . The solution of these equations is a_2 = 0, a_3 = -a_0/(2·3), a_4 = -a_1/(3·4), a_5 = 0, ..., so the solution is
y = a_0 (1 - x^3/(2·3) + x^6/(2·3·5·6) - ...) + a_1 (x - x^4/(3·4) + x^7/(3·4·6·7) - ...).
3. The Homogeneous Differential Equation
x^2 y'' + x p(x) y' + q(x) y = 0   (9.50a)
can be solved by the method of undetermined coefficients if the functions p(x) and q(x) can be expanded into convergent series in powers of x. The solutions have the form
y = x^r (a_0 + a_1 x + a_2 x^2 + ...),   (9.50b)
whose exponent r can be determined from the defining equation
r(r - 1) + p(0) r + q(0) = 0.   (9.50c)
If the roots of this equation are different and their difference is not an integer, then we get two linearly independent solutions of (9.50a). Otherwise the method of undetermined coefficients yields only one solution. Then with the help of (9.48c) we can get a second solution, or at least a form from which a second solution can be obtained by the method of undetermined coefficients.
■ For the Bessel differential equation (9.51a) we get only one solution by the method of undetermined coefficients, in the form y_1 = Σ_{k=0}^{∞} a_k x^{n+2k} (a_0 ≠ 0), which coincides with J_n(x) up to a constant factor. A second solution can be found by using formula (9.48c). The determination of the resulting unknown coefficients c_k and d_k directly from the a_k is difficult, but the resulting expression can be used as an ansatz for the method of undetermined coefficients. This form is a series expansion of the function Y_n(x) (9.52c).
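The recurrence used in the example y'' + xy = 0 is easy to evaluate mechanically. The Python sketch below (standard library only; an illustration, not part of the handbook) generates the first series coefficients for both basis solutions.

```python
# Illustrative sketch: coefficients of y'' + x*y = 0 from the recurrence
# m(m-1) a_m + a_{m-3} = 0, i.e. a_m = -a_{m-3} / ((m-1) m), with a_2 = 0.
from fractions import Fraction

def series_coefficients(a0, a1, n_terms=9):
    a = [Fraction(a0), Fraction(a1), Fraction(0)]
    for m in range(3, n_terms):
        a.append(Fraction(-1, (m - 1)*m) * a[m - 3])
    return [str(c) for c in a]

print(series_coefficients(1, 0))   # ['1', '0', '0', '-1/6', '0', '0', '1/180', '0', '0']
print(series_coefficients(0, 1))   # ['0', '1', '0', '0', '-1/12', '0', '0', '1/504', '0']
```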
2. Bessel Differential Equation
x^2 y'' + x y' + (x^2 - n^2) y = 0.   (9.51a)
1. The Defining Equation is in this case r(r - 1) + r - n^2 = r^2 - n^2 = 0, so r = ±n. Substituting y = x^n(a_0 + a_1 x + ...) into the equation and equating the coefficients of x^{n+k} to zero we get
k(2n + k) a_k + a_{k-2} = 0.   (9.51b)
For k = 1 we get (2n + 1) a_1 = 0, so a_1 = 0.   (9.51c)
For the values k = 2, 3, ... we obtain
a_2 = -a_0 / (2(2n + 2)),  a_4 = a_0 / (2·4·(2n + 2)(2n + 4)), ...,  a_0 is arbitrary.   (9.51d)
The series obtained above for
a_0 = 1 / (2^n Γ(n + 1)),   (9.51e)
where Γ is the gamma function (see 8.2.5, 6., p. 459), is a particular solution of the Bessel differential equation (9.51a) for integer values of n. It defines the Bessel or cylindrical function of the first kind of index n.
2. Bessel or Cylindrical Functions
J_n(x) = (x^n / (2^n Γ(n + 1))) (1 - x^2/(2(2n + 2)) + x^4/(2·4·(2n + 2)(2n + 4)) - ...).   (9.52a)
The graphs of the functions J_0 and J_1 are shown in Fig. 9.12. The general solution of the Bessel differential equation for non-integer n has the form
y = C_1 J_n(x) + C_2 J_{-n}(x),   (9.52b)
where J_{-n}(x) is defined by the infinite series obtained from the series representation of J_n(x) by replacing n with -n. For integer n we have J_{-n}(x) = (-1)^n J_n(x). In this case the term J_{-n}(x) in the general solution should be replaced by the Bessel function of the second kind
Y_n(x) = lim_{m→n} (J_m(x) cos mπ - J_{-m}(x)) / sin mπ,   (9.52c)
which is also called the Weber function. For the series expansion of Y_n(x) see, e.g., [9.15]. The graphs of the functions Y_0 and Y_1 are shown in Fig. 9.13.
3. Bessel Functions with Imaginary Variables In some applications Bessel functions of purely imaginary argument are needed. In this case we consider the product i^{-n} J_n(ix), which is denoted by I_n(x):
I_n(x) = i^{-n} J_n(ix).   (9.53a)
The functions I_n(x) are solutions of the differential equation
x^2 y'' + x y' - (x^2 + n^2) y = 0.   (9.53b)
A second solution of this differential equation is the MacDonald function
K_n(x) = (π/2) (I_{-n}(x) - I_n(x)) / sin nπ.   (9.53c)
[Figure 9.12: graphs of the Bessel functions J_0(x) and J_1(x). Figure 9.13: graphs of the Weber functions Y_0(x) and Y_1(x).]
If n converges to an integer, this expression also converges. The functions I_n(x) and K_n(x) are called modified Bessel functions. The graphs of the functions I_0 and I_1 are shown in Fig. 9.14; the graphs of the functions K_0 and K_1 are illustrated in Fig. 9.15. The values of the functions J_0(x), J_1(x), Y_0(x), Y_1(x), I_0(x), I_1(x), K_0(x), K_1(x) are given in Table 21.9, p. 1058.
[Figure 9.14: graphs of the modified Bessel functions I_0(x) and I_1(x). Figure 9.15: graphs of the MacDonald functions K_0(x) and K_1(x). Figure 9.16: graphs of the Legendre polynomials P_n(x) for n = 1, ..., 7.]
4. Important Formulas for the Bessel Functions
J_{n-1}(x) + J_{n+1}(x) = (2n/x) J_n(x),   dJ_n(x)/dx = J_{n-1}(x) - (n/x) J_n(x).   (9.54a)
The formulas (9.54a) are also valid for the Weber functions Y_n(x).
I_{n-1}(x) - I_{n+1}(x) = (2n/x) I_n(x),   dI_n(x)/dx = I_{n-1}(x) - (n/x) I_n(x),   (9.54b)
K_{n+1}(x) - K_{n-1}(x) = (2n/x) K_n(x),   dK_n(x)/dx = -K_{n-1}(x) - (n/x) K_n(x).   (9.54c)
For integer n the following integral representations are valid:
J_{2n}(x) = (2/π) ∫_0^{π/2} cos(x sin φ) cos(2nφ) dφ,   (9.54d)
J_{2n+1}(x) = (2/π) ∫_0^{π/2} sin(x sin φ) sin((2n + 1)φ) dφ,   (9.54e)
or, in complex form,
J_n(x) = (i^{-n}/π) ∫_0^{π} e^{ix cos φ} cos(nφ) dφ.   (9.54f)
The functions J_{n+1/2}(x) can be expressed by elementary functions. In particular,
J_{1/2}(x) = √(2/(πx)) sin x,   (9.55a)
J_{-1/2}(x) = √(2/(πx)) cos x.   (9.55b)
By applying the recursion formulas (9.54a)-(9.54f), the expression for J_{n+1/2}(x) can be given for arbitrary integer n. For large values of x the following asymptotic formulas hold:
J_n(x) = √(2/(πx)) [cos(x - nπ/2 - π/4) + O(1/x)],   (9.56a)
Y_n(x) = √(2/(πx)) [sin(x - nπ/2 - π/4) + O(1/x)],   (9.56b)
I_n(x) = (e^x/√(2πx)) [1 + O(1/x)],   (9.56c)
K_n(x) = √(π/(2x)) e^{-x} [1 + O(1/x)].   (9.56d)
The expression O(1/x) means an infinitesimal quantity of the same order as 1/x (see the Landau symbol, p. 56). For further properties of the Bessel functions see [21.1].
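The recursion (9.54a) is easily verified numerically. The Python sketch below (SciPy and NumPy assumed; illustration only) evaluates the residual of the relation for a few orders and arguments.

```python
# Numerical spot-check (SciPy/NumPy assumed) of the recursion (9.54a):
# J_{n-1}(x) + J_{n+1}(x) = (2n/x) J_n(x); the same relation holds for Y_n via yv.
from scipy.special import jv, yv
import numpy as np

x = np.linspace(0.5, 10.0, 5)
for n in (1, 2, 3):
    lhs = jv(n - 1, x) + jv(n + 1, x)
    rhs = 2*n/x * jv(n, x)
    print(n, np.max(np.abs(lhs - rhs)))   # of the order of machine precision
```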
3. Legendre Differential Equation
Restricting our investigations to the case of real variables and integer parameters n = 0, 1, 2, ..., the Legendre differential equation has the form
(1 - x^2) y'' - 2x y' + n(n + 1) y = 0   or   ((1 - x^2) y')' + n(n + 1) y = 0.   (9.57a)
1. Legendre Polynomials or Spherical Harmonics of the First Kind are the particular solutions of the Legendre differential equation for integer n which can be expanded into the power series y = Σ_{ν=0}^{∞} a_ν x^ν. By the method of undetermined coefficients we get the polynomials
P_n(x) = (1/(2^n n!)) d^n/dx^n (x^2 - 1)^n   (|x| < ∞; n = 0, 1, 2, ...)   (9.57b)
       = F(n + 1, -n, 1; (1 - x)/2),   (9.57c)
where F denotes the hypergeometric series (see 4., p. 511). The first eight polynomials have the following simple form (see 21.10, p. 1060):
P_0(x) = 1,   (9.57d)
P_1(x) = x,   (9.57e)
P_2(x) = (1/2)(3x^2 - 1),   (9.57f)
P_3(x) = (1/2)(5x^3 - 3x),   (9.57g)
P_4(x) = (1/8)(35x^4 - 30x^2 + 3),   (9.57h)
P_5(x) = (1/8)(63x^5 - 70x^3 + 15x),   (9.57i)
P_6(x) = (1/16)(231x^6 - 315x^4 + 105x^2 - 5),   (9.57j)
P_7(x) = (1/16)(429x^7 - 693x^5 + 315x^3 - 35x).   (9.57k)
The graphs of P_n(x) for the values from n = 1 to n = 7 are represented in Fig. 9.16. The numerical values can be calculated easily by pocket calculators or taken from function tables.
2. Properties of the Legendre Polynomials of the First Kind
a) Integral Representation:
P_n(x) = (1/π) ∫_0^{π} (x ± √(x^2 - 1) cos φ)^n dφ = (1/π) ∫_0^{π} dφ / (x ± √(x^2 - 1) cos φ)^{n+1}.   (9.58a)
The signs can be chosen arbitrarily in both expressions.
b) Recursion Formulas:
(n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)   (n ≥ 1; P_0(x) = 1, P_1(x) = x),   (9.58b)
(x^2 - 1) dP_n/dx = n [x P_n(x) - P_{n-1}(x)]   (n ≥ 1).   (9.58c)
c) Orthogonality Relation:
∫_{-1}^{1} P_m(x) P_n(x) dx = 0   for m ≠ n,   = 2/(2n + 1)   for m = n.   (9.58d)
d) Root Theorem: All the n roots of P_n(x) are real and simple and lie in the interval (-1, 1).
e) Generating Function: The Legendre polynomials of the first kind can be obtained as the coefficients of the power series expansion of the function
1/√(1 - 2rx + r^2) = Σ_{n=0}^{∞} P_n(x) r^n.   (9.58e)
For further properties of the Legendre polynomials of the first kind see [21.1].
3. Legendre Functions or Spherical Harmonics of the Second Kind We get a second particular solution Q_n(x), which is valid for |x| > 1 and linearly independent of P_n(x), by the power series expansion
y = Σ_{ν=-∞}^{-(n+1)} b_ν x^ν.   (9.59a)
The representation of Q_n(x) valid for |x| < 1 is
Q_n(x) = (1/2) P_n(x) ln((1 + x)/(1 - x)) - Σ_{k=1}^{n} (1/k) P_{k-1}(x) P_{n-k}(x).   (9.59b)
The spherical harmonics of the first and second kind are also called associated Legendre functions (see p. 541 and (9.118c)).
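The recursion (9.58b) and the orthogonality (9.58d) can be checked numerically. The Python sketch below (NumPy assumed; illustration only) uses NumPy's Legendre basis and Gauss-Legendre quadrature.

```python
# Spot-check (NumPy assumed) of the recursion (9.58b) and the orthogonality (9.58d).
import numpy as np
from numpy.polynomial import legendre as L

n, xs = 3, np.linspace(-1, 1, 7)
Pm1, Pn, Pp1 = (L.Legendre.basis(k)(xs) for k in (n - 1, n, n + 1))
print(np.max(np.abs((n + 1)*Pp1 - (2*n + 1)*xs*Pn + n*Pm1)))   # ~1e-15

# Orthogonality via Gauss-Legendre quadrature (exact for these low degrees):
xq, w = np.polynomial.legendre.leggauss(20)
P2, P3 = L.Legendre.basis(2)(xq), L.Legendre.basis(3)(xq)
print(np.dot(w, P2*P3), np.dot(w, P3*P3))   # ~0 and 2/7 ~ 0.2857
```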
4. Hypergeometric Differential Equation The hypergeometric differential equation is the equation
x(1 - x) d^2y/dx^2 + [γ - (α + β + 1)x] dy/dx - αβ y = 0,   (9.60a)
where α, β, γ are parameters. It contains several important special cases.
a) For α = n + 1, β = -n, γ = 1 and x replaced by (1 - x)/2 it is the Legendre differential equation.
b) If γ is not zero or a negative integer, it has the hypergeometric series or hypergeometric function as a particular solution:
F(α, β, γ; x) = 1 + (αβ/(1·γ)) x + (α(α+1)β(β+1)/(1·2·γ(γ+1))) x^2 + ... + (α(α+1)...(α+n) β(β+1)...(β+n) / ((n+1)! γ(γ+1)...(γ+n))) x^{n+1} + ...,   (9.60b)
which is absolutely convergent for |x| < 1. The convergence for x = ±1 depends on the value of δ = γ - α - β. For x = 1 the series is convergent if δ > 0 and divergent if δ ≤ 0. For x = -1 it is absolutely convergent if δ > 0, conditionally convergent for -1 < δ ≤ 0, and divergent for δ ≤ -1.
c) If 2 - γ is not zero or a negative integer, the equation has a particular solution
y = x^{1-γ} F(α + 1 - γ, β + 1 - γ, 2 - γ; x).   (9.60c)
d) In some special cases the hypergeometric series can be reduced to elementary functions, e.g.,
F(1, β, β; x) = F(α, 1, α; x) = 1/(1 - x),   (9.61a)
F(-n, β, β; -x) = (1 + x)^n,   (9.61b)
F(1, 1, 2; -x) = ln(1 + x)/x,   (9.61c)
F(1/2, 1/2, 3/2; x^2) = arcsin x / x.   (9.61d)
5. Laguerre Differential Equation If we restrict our investigation to integer parameters (n = 0, 1, 2, ...) and real variables, the Laguerre differential equation has the form
x y'' + (α + 1 - x) y' + n y = 0.   (9.62a)
As a particular solution we have the Laguerre polynomial
L_n^{(α)}(x) = (e^x x^{-α}/n!) d^n/dx^n (e^{-x} x^{n+α})   (9.62b)
            = Σ_{k=0}^{n} ( Γ(n + α + 1) / (Γ(k + α + 1) (n - k)! k!) ) (-x)^k.   (9.62c)
With Γ we denote the gamma function (see 8.2.5, 6., p. 459).
6. Hermite Differential Equation Two defining equations are used in the literature:
a) Defining Equation of Type 1: y'' - x y' + n y = 0   (n = 0, 1, 2, ...).   (9.63a)
b) Defining Equation of Type 2: y'' - 2x y' + 2n y = 0   (n = 0, 1, 2, ...).   (9.63b)
Particular solutions are the Hermite polynomials, He_n(x) for the defining equation of type 1 and H_n(x) for the defining equation of type 2.
a) Hermite Polynomials for the Defining Equation of Type 1:
He_n(x) = (-1)^n e^{x^2/2} d^n/dx^n (e^{-x^2/2})   (n = 0, 1, 2, ...).   (9.63c)
For n ≥ 1 the following recursion formula is valid:
He_{n+1}(x) = x He_n(x) - n He_{n-1}(x),   (9.63d)
He_0(x) = 1,  He_1(x) = x.   (9.63e)
The orthogonality relation is:
∫_{-∞}^{+∞} e^{-x^2/2} He_m(x) He_n(x) dx = 0   for m ≠ n,   = n! √(2π)   for m = n.   (9.63f)
b) Hermite Polynomials for the Defining Equation of Type 2:
H_n(x) = (-1)^n e^{x^2} d^n/dx^n (e^{-x^2})   (n ∈ ℕ).   (9.63g)
The relation to the Hermite polynomials for the defining equation of type 1 is
He_n(x) = 2^{-n/2} H_n(x/√2).   (9.63h)
9.1.3 Boundary Value Problems
9.1.3.1 Problem Formulation
1. Notion of the Boundary Value Problem In various applications, e.g., in mathematical physics, differential equations must be solved as so-called boundary value problems (see 9.2.3, p. 532), where the solution must satisfy previously given relations at the endpoints of an interval of the independent variable. A special case is the linear boundary value problem, where a solution of a linear differential equation should satisfy linear boundary conditions. In the following we restrict our discussion to second-order linear differential equations with linear boundary conditions.
2. Self-Adjoint Differential Equation Self-adjoint differential equations are important special second-order differential equations of the form
[p y']' - q y + λ ρ y = f.   (9.64a)
The linear boundary values are the homogeneous conditions
A_0 y(a) + B_0 y'(a) = 0,   A_1 y(b) + B_1 y'(b) = 0.   (9.64b)
The functions p(x), p'(x), q(x), ρ(x) and f(x) are supposed to be continuous in the finite interval a ≤ x ≤ b. In the case of an infinite interval the results change considerably (see [9.5]). Furthermore, we suppose that p(x) > p_0 > 0 and ρ(x) > 0. The quantity λ, a parameter of the differential equation, is a constant. For f ≡ 0 the problem is called the homogeneous boundary value problem associated with the inhomogeneous boundary value problem. Every second-order differential equation of the form
A y'' + B y' + C y + λ R y = F   (9.64c)
can be reduced to the self-adjoint form (9.64a) by multiplying it by p/A, where A ≠ 0 on [a, b], and performing the substitutions
p = exp(∫ (B/A) dx),   q = -C p/A,   ρ = R p/A,   f = F p/A.   (9.64d)
To find a solution satisfying the inhomogeneous conditions
A_0 y(a) + B_0 y'(a) = C_0,   A_1 y(b) + B_1 y'(b) = C_1,   (9.64e)
we return to a problem with homogeneous boundary conditions but with a changed right-hand side f(x): we substitute y = z + u, where u is an arbitrary twice differentiable function satisfying the inhomogeneous boundary conditions and z is a new unknown function satisfying the corresponding homogeneous conditions.
3. Sturm-Liouville Problem For a given value of the parameter λ there are two cases:
1. Either the inhomogeneous boundary value problem has a unique solution for arbitrary f(x), while the corresponding homogeneous problem has only the trivial, identically zero solution, or
2. the corresponding homogeneous problem also has non-trivial, i.e., not identically zero, solutions; in this case the inhomogeneous problem does not have a solution for an arbitrary right-hand side, and if a solution exists, it is not unique.
The values of the parameter λ for which the second case occurs, i.e., for which the homogeneous problem has a non-trivial solution, are called the eigenvalues of the boundary value problem; the corresponding non-trivial solutions are called the eigenfunctions. The problem of determining the eigenvalues and eigenfunctions of a differential equation (9.64a) is called the Sturm-Liouville problem.
9.1.3.2 Fundamental Properties of Eigenfunctions and Eigenvalues
1. The eigenvalues of a boundary value problem form a monotone increasing sequence of real numbers
λ_0 < λ_1 < λ_2 < ... < λ_n < ...,   (9.65a)
tending to infinity.
2. The eigenfunction associated with the eigenvalue λ_n has exactly n roots in the interval a < x < b.
3. If y(x) and z(x) are two eigenfunctions belonging to the same eigenvalue λ, they differ only by a constant multiplier c, i.e.,
z(x) = c y(x).   (9.65b)
4. Two eigenfunctions y_1(x) and y_2(x) associated with different eigenvalues λ_1 and λ_2 are orthogonal to each other with the weight function ρ(x):
∫_a^b y_1(x) y_2(x) ρ(x) dx = 0.   (9.65c)
5. If in (9.64a) the coefficients p(x) and q(x) are replaced by p̃(x) ≥ p(x) and q̃(x) ≥ q(x), then the eigenvalues do not decrease, i.e., λ̃_n ≥ λ_n, where λ̃_n and λ_n are the n-th eigenvalues of the modified and the original equations, respectively. But if the coefficient ρ(x) is replaced by ρ̃(x) ≥ ρ(x), then the eigenvalues do not increase, i.e., λ̃_n ≤ λ_n. The n-th eigenvalue depends continuously on the coefficients of the equation, i.e., small changes in the coefficients result in small changes of the n-th eigenvalue.
9. DifferentialEquations
514
6. Reduction of the interval [a,b] into a smaller one does not result in smaller eigenvalues.
9.1.3.3 Expansion in Eigenfunctions
1. Normalization of the Eigenfunctions For every λ_n an eigenfunction φ_n(x) is chosen such that
∫_a^b [φ_n(x)]^2 ρ(x) dx = 1.   (9.66a)
It is called a normalized eigenfunction.
2. Fourier Expansion To every function g(x) defined in the interval [a, b] we can assign its Fourier series with respect to the eigenfunctions of the corresponding boundary value problem,
g(x) ~ Σ_{n=0}^{∞} c_n φ_n(x)   with   c_n = ∫_a^b g(x) φ_n(x) ρ(x) dx,   (9.66b)
provided the integrals in (9.66b) exist.
3. Expansion Theorem If the function g(x) has a continuous derivative and satisfies the boundary conditions of the given problem, then the Fourier series of g(x) (in the eigenfunctions of this boundary value problem) converges absolutely and uniformly to g(x).
4. Parseval Equation If the integral on the left-hand side exists, then
∫_a^b [g(x)]^2 ρ(x) dx = Σ_{n=0}^{∞} c_n^2   (9.66c)
is always valid. The Fourier series of the function g(x) then converges to g(x) in mean, that is,
lim_{N→∞} ∫_a^b [ g(x) - Σ_{n=0}^{N} c_n φ_n(x) ]^2 ρ(x) dx = 0.   (9.66d)
9.1.3.4 Singular Cases Boundary value problems of the above type very often occur when solving problems of theoretical physics by the Fourier method; however, at the endpoints of the interval [a, b] some singularities of the differential equation may occur, e.g., p(x) may vanish. At such singular points we impose some restrictions on the solutions, e.g., continuity, boundedness, or unbounded growth of bounded order. These conditions play the role of homogeneous boundary conditions (see 9.2.3.3, p. 535). In addition, there are boundary value problems in which the homogeneous boundary conditions connect the values of the function or its derivative at different endpoints of the interval. Frequently the relations
y(a) = y(b),   p(a) y'(a) = p(b) y'(b)   (9.67)
occur, which represent periodicity in the case p(a) = p(b). For such boundary value problems everything introduced above remains valid, except statement (9.65b). For further discussion of this topic see [9.5].
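Sturm-Liouville eigenvalues can also be approximated numerically. The Python sketch below (NumPy assumed; the specific problem -y'' = λy on [0, π] with y(0) = y(π) = 0 is chosen only as the simplest illustration, with p = ρ = 1, q = 0) uses a central-difference discretization; the exact eigenvalues are 1, 4, 9, ... .

```python
# Illustrative sketch (NumPy assumed): finite-difference approximation of the
# eigenvalues of -y'' = lambda*y, y(0) = y(pi) = 0; exact values are n^2.
import numpy as np

N = 400
h = np.pi / N
main = 2.0/h**2 * np.ones(N - 1)
off  = -1.0/h**2 * np.ones(N - 2)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
lam = np.sort(np.linalg.eigvalsh(A))
print(np.round(lam[:4], 4))   # close to 1, 4, 9, 16
```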
9.2 Partial Differential Equations
9.2.1 First-Order Partial Differential Equations
9.2.1.1 Linear First-Order Partial Differential Equations
1. Linear and Quasilinear Partial Differential Equations The equation
X_1 ∂z/∂x_1 + X_2 ∂z/∂x_2 + ... + X_n ∂z/∂x_n = Y   (9.68a)
is called a linear first-order partial differential equation. Here z is an unknown function of the independent variables x_1, ..., x_n, and X_1, ..., X_n, Y are given functions of these variables. If the functions X_1, ..., X_n, Y depend also on z, the equation is called a quasilinear partial differential equation. In the case
Y ≡ 0   (9.68b)
the equation is called homogeneous.
2. Solution of a Homogeneous Linear Partial Differential Equation The solution of a homogeneous linear partial differential equation and the solution of the so-called characteristic system
dx_1/X_1 = dx_2/X_2 = ... = dx_n/X_n   (9.69a)
are equivalent. This system can be solved in two different ways:
1. Any x_k for which X_k ≠ 0 can be chosen as independent variable, so the system is transformed into the form
dx_i/dx_k = X_i/X_k   (i = 1, ..., n; i ≠ k).   (9.69b)
2. A more convenient way is to keep the symmetry and introduce a new variable t; then we get
dx_i/dt = X_i   (i = 1, ..., n).   (9.69c)
Every first integral of the system (9.69a) is a solution of the homogeneous linear partial differential equation (9.68b), and conversely, every solution of (9.68b) is a first integral of (9.69a) (see 9.1.2.1, 2., p. 496). If the n - 1 first integrals
φ_i(x_1, ..., x_n) = C_i   (i = 1, 2, ..., n - 1)   (9.69d)
are independent (see 9.1.2.3, 2., p. 498), then the general solution is
z = Φ(φ_1, ..., φ_{n-1}).   (9.69e)
Here Φ is an arbitrary function of the n - 1 arguments φ_i, and (9.69e) is the general solution of the homogeneous linear differential equation.
3. Solution of Inhomogeneous Linear and Quasilinear Partial Differential Equations To solve an inhomogeneous linear or quasilinear partial differential equation (9.68a) we look for the solution z in the implicit form V(x_1, ..., x_n, z) = C. The function V is a solution of the homogeneous linear differential equation with n + 1 independent variables
X_1 ∂V/∂x_1 + X_2 ∂V/∂x_2 + ... + X_n ∂V/∂x_n + Y ∂V/∂z = 0,   (9.70a)
whose characteristic system
dx_1/X_1 = dx_2/X_2 = ... = dx_n/X_n = dz/Y   (9.70b)
is called the characteristic system of the original equation (9.68a).
4. Geometrical Representation and Characteristics of the System In the case of the equation
P(x, y, z) ∂z/∂x + Q(x, y, z) ∂z/∂y = R(x, y, z)   (9.71a)
with two independent variables x_1 = x and x_2 = y, a solution z = f(x, y) is a surface in x, y, z space, called an integral surface of the differential equation. Equation (9.71a) means that at every point of the integral surface z = f(x, y) the normal vector (∂z/∂x, ∂z/∂y, -1) is orthogonal to the vector (P, Q, R) given at that point. Here the system (9.70b) has the form
dx/P(x, y, z) = dy/Q(x, y, z) = dz/R(x, y, z).   (9.71b)
It follows (see 13.1.3.5, p. 646) that the integral curves of this system, the so-called characteristics, are tangent to the vector (P, Q, R). Therefore, a characteristic having a common point with the integral surface z = f(x, y) lies completely on this surface. Since the conditions of the existence theorem (see p. 496) hold, there is an integral curve of the characteristic system passing through every point of space, so the integral surface consists of characteristics.
5. Cauchy Problem There are given n functions of n - 1 independent variables t_1, t_2, ..., t_{n-1}:
x_1 = x_1(t_1, t_2, ..., t_{n-1}),  x_2 = x_2(t_1, t_2, ..., t_{n-1}), ...,  x_n = x_n(t_1, t_2, ..., t_{n-1}).   (9.72a)
The Cauchy problem for the differential equation (9.68a) is to find a solution
z = φ(x_1, x_2, ..., x_n)   (9.72b)
such that if we substitute (9.72a), the result is a previously given function ψ(t_1, t_2, ..., t_{n-1}):
φ[x_1(t_1, ..., t_{n-1}), x_2(t_1, ..., t_{n-1}), ..., x_n(t_1, ..., t_{n-1})] = ψ(t_1, t_2, ..., t_{n-1}).   (9.72c)
In the case of two independent variables the problem reduces to finding an integral surface passing through a given curve. If this curve has a continuously varying tangent and is not tangent to the characteristics at any point, then the Cauchy problem has a unique solution in a certain neighborhood of this curve. Here the integral surface consists of the set of all characteristics intersecting the given curve. For further theorems about the existence of the solution of the Cauchy problem see [9.15].
■ A: For the linear first-order inhomogeneous partial differential equation (mz - ny) ∂z/∂x + (nx - lz) ∂z/∂y = ly - mx (l, m, n are constants), the equations of the characteristics are
dx/(mz - ny) = dy/(nx - lz) = dz/(ly - mx).
The integrals of this system are lx + my + nz = C_1, x^2 + y^2 + z^2 = C_2. The characteristics are circles whose centers lie on a line passing through the origin with direction cosines proportional to l, m, n. The integral surfaces are surfaces of revolution with this line as axis.
■ B: Determine the integral surface of the first-order linear inhomogeneous differential equation ∂z/∂x + ∂z/∂y = z which passes through the curve x = 0, z = φ(y). The equations of the characteristics are
dx/1 = dy/1 = dz/z.
The characteristics passing through the point (x_0, y_0, z_0) are y = x - x_0 + y_0, z = z_0 e^{x-x_0}. A parametric representation of the required integral surface is y = x + y_0, z = e^x φ(y_0), obtained by substituting x_0 = 0, z_0 = φ(y_0). The elimination of y_0 results in z = e^x φ(y - x).
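The result of example B can be verified symbolically for an arbitrary profile function. The Python sketch below (SymPy assumed; illustration only) checks that z = e^x φ(y - x) satisfies z_x + z_y = z.

```python
# Illustrative check (SymPy assumed) of example B: z = e^x * phi(y - x)
# satisfies z_x + z_y = z for any differentiable phi.
import sympy as sp

x, y = sp.symbols('x y')
phi = sp.Function('phi')
z = sp.exp(x)*phi(y - x)
print(sp.simplify(z.diff(x) + z.diff(y) - z))   # 0
```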
9.2.1.2 Non-Linear First-Order Partial Differential Equations
1. General Form of a First-Order Partial Differential Equation is the implicit equation
F(x_1, ..., x_n, z, p_1, ..., p_n) = 0   with   p_i = ∂z/∂x_i   (i = 1, ..., n).   (9.73a)
1. Complete Integral is a solution
z = φ(x_1, ..., x_n; a_1, ..., a_n)   (9.73b)
depending on n parameters a_1, ..., a_n, if at the considered values of x_1, ..., x_n, z the functional determinant (or Jacobian determinant, see 2.18.2.6, 3., p. 121) is non-zero:
det( ∂^2 φ / ∂x_i ∂a_k ) ≠ 0.   (9.73c)
2. Characteristic Strip The solution of (9.73a) is reduced to the solution of the characteristic system
dx_i/dt = ∂F/∂p_i,   dp_i/dt = -( ∂F/∂x_i + p_i ∂F/∂z )   (i = 1, ..., n),   (9.73d)
with
dz/dt = Σ_{i=1}^{n} p_i ∂F/∂p_i.   (9.73e)
The solutions of the characteristic system satisfying the additional condition
F(x_1, ..., x_n, z, p_1, ..., p_n) = 0   (9.73f)
are called the characteristic strips.
2. Canonical Systems of Differential Equations Sometimes it is more convenient to consider an equation not involving the unknown function z explicitly. Such an equation can be obtained by introducing an additional independent variable x_{n+1} = z and an unknown function V(x_1, ..., x_n, x_{n+1}), which defines the function z(x_1, x_2, ..., x_n) by the equation
V(x_1, ..., x_n, z) = C.   (9.74a)
At the same time, we substitute the quotients -(∂V/∂x_i)/(∂V/∂x_{n+1}) (i = 1, ..., n) for ∂z/∂x_i in (9.73a). Then we solve the differential equation (9.73a) for one of the partial derivatives of the function V; the corresponding independent variable will be denoted by x after a suitable renumbering of the other variables. Finally, we obtain the equation (9.73a) in the form
∂V/∂x + H(x_1, ..., x_n, x, p_1, ..., p_n) = 0   with   p_i = ∂V/∂x_i.   (9.74b)
The system of characteristic differential equations is transformed into the system
dx_i/dx = ∂H/∂p_i,   dp_i/dx = -∂H/∂x_i   (i = 1, ..., n),   (9.74c)
dV/dx = Σ_{i=1}^{n} p_i ∂H/∂p_i - H.   (9.74d)
Equations (9.74c) represent a system of 2n ordinary differential equations, which corresponds to an arbitrary function H(x_1, ..., x_n, x, p_1, ..., p_n) of 2n + 1 variables. It is called a canonical system or a normal system of differential equations.
Many problems of mechanics and theoretical physics lead to equations of this form. Knowing a complete integral
V = φ(x_1, ..., x_n, x, a_1, ..., a_n) + a   (9.74e)
of the equation (9.74b), we can find the general solution of the canonical system (9.74c), since the equations
∂φ/∂a_i = b_i,   ∂φ/∂x_i = p_i   (i = 1, 2, ..., n)
with the 2n arbitrary parameters a_i and b_i determine a 2n-parameter solution of the canonical system (9.74c).
3. Clairaut Differential Equation If the given differential equation can be transformed into the form
z = x_1 p_1 + x_2 p_2 + ... + x_n p_n + f(p_1, p_2, ..., p_n),   p_i = ∂z/∂x_i   (i = 1, ..., n),   (9.75a)
it is called a Clairaut differential equation. The determination of the complete integral is particularly simple, because a complete integral with the arbitrary parameters a_1, a_2, ..., a_n is
z = a_1 x_1 + a_2 x_2 + ... + a_n x_n + f(a_1, a_2, ..., a_n).   (9.75b)
■ Two-Body Problem with Hamilton Function: Consider two particles moving in a plane under their mutual gravitational attraction according to Newton's law (see also 13.4.3.2, p. 667). We choose the origin as the position of one of the particles, so the equations of motion have the form
d^2x/dt^2 = ∂V/∂x,   d^2y/dt^2 = ∂V/∂y,   V = k^2/√(x^2 + y^2).   (9.76a)
If we introduce the Hamiltonian function
H = (1/2)(p^2 + q^2) - k^2/√(x^2 + y^2),   (9.76b)
the system (9.76a) is transformed into the normal system (the system of canonical differential equations)
dx/dt = ∂H/∂p,   dy/dt = ∂H/∂q,   dp/dt = -∂H/∂x,   dq/dt = -∂H/∂y   (9.76c)
with the variables
x, y, p = dx/dt, q = dy/dt.   (9.76d)
Now the partial differential equation has the form
∂z/∂t + (1/2)[ (∂z/∂x)^2 + (∂z/∂y)^2 ] - k^2/√(x^2 + y^2) = 0.   (9.76e)
Introducing the polar coordinates ρ, φ in (9.76e) we obtain a new differential equation having the solution
z = -a t + b φ + c + ∫ √( 2a + 2k^2/ρ - b^2/ρ^2 ) dρ   (9.76f)
with the parameters a, b, c. We get the general solution of the system (9.76c) from the equations
∂z/∂a = -t_0,   ∂z/∂b = -φ_0.
4. First-Order Differential Equation in Two Independent Variables For x_1 = x, x_2 = y, p_1 = p, p_2 = q the characteristic strip (see 9.2.1.2, 1., p. 517) can be interpreted geometrically as a curve at every point (x, y, z) of which a tangent plane p(ξ - x) + q(η - y) = ζ - z is prescribed. So the problem of finding an integral surface of the equation
F(x, y, z, p, q) = 0   (9.77)
passing through a given curve, i.e., of solving the Cauchy problem (see 9.2.1.1, 5., p. 516), is transformed into another problem: to find the characteristic strips passing through the points of the initial curve such that the corresponding tangent plane of each strip is tangent to that curve. We get the values of p and q at the points of the initial curve from the equations F(x, y, z, p, q) = 0 and p dx + q dy = dz. In the case of non-linear differential equations there may be several solutions. Therefore, in order to obtain a unique solution of the Cauchy problem, we prescribe two continuous functions p and q satisfying the above relations along the initial curve. For the existence of solutions of the Cauchy problem see [9.15].
■ For the partial differential equation pq = 1 and the initial curve y = x^3, z = 2x^2 we can choose p = x and q = 1/x along the curve. The characteristic system has the form
dx/dt = q,  dy/dt = p,  dz/dt = 2pq,  dp/dt = 0,  dq/dt = 0.
The characteristic strip with initial values x_0, y_0, z_0, p_0 and q_0 for t = 0 satisfies the equations x = x_0 + q_0 t, y = y_0 + p_0 t, z = 2 p_0 q_0 t + z_0, p = p_0, q = q_0. In the case p_0 = x_0, q_0 = 1/x_0 the equation of the curve belonging to the characteristic strip that passes through the point (x_0, y_0, z_0) of the initial curve is
x = x_0 + t/x_0,   y = x_0^3 + x_0 t,   z = 2t + 2x_0^2.
Eliminating the parameters x_0 and t we get z^2 = 4xy. For other chosen values of p and q along the initial curve we obtain different solutions.
Remark: The envelope of a one-parameter family of integral surfaces is also an integral surface. Considering this fact we can solve the Cauchy problem with a complete integral: We find a one-parameter family of solutions tangent to the planes given at the points of the initial curve, and then determine the envelope of this family.
■ Determine the integral surface of the Clairaut differential equation z - px - qy + pq = 0 passing through the curve y = x, z = x^2. The complete integral of the differential equation is z = ax + by - ab. Since along the initial curve p = q = x, we determine the one-parameter family of integral surfaces by the condition a = b. Finding the envelope of this family we obtain z = (x + y)^2/4.
5. Linear First-Order Partial Differential Equations in Total Differentials Equations of this kind have the form
dz = f_1 dx_1 + f_2 dx_2 + ... + f_n dx_n,   (9.78a)
where f_1, f_2, ..., f_n are given functions of the variables x_1, x_2, ..., x_n, z. The equation is called completely integrable or an exact differential equation when there exists a unique relation between x_1, x_2, ..., x_n, z with one arbitrary constant which leads to equation (9.78a). Then there exists a unique solution z = z(x_1, x_2, ..., x_n) of (9.78a) which takes a given value z_0 for the initial values x_1^0, ..., x_n^0 of the independent variables. Therefore, for n = 2, x_1 = x, x_2 = y, a unique integral surface passes through every point of space.
The differential equation (9.78a) is completely integrable if and only if the n(n - 1)/2 equalities
∂f_i/∂x_k + f_k ∂f_i/∂z = ∂f_k/∂x_i + f_i ∂f_k/∂z   (i, k = 1, ..., n)   (9.78b)
are identically satisfied in all variables x_1, x_2, ..., x_n, z. If the differential equation is given in the symmetric form
f_1 dx_1 + ... + f_n dx_n = 0,   (9.78c)
then the condition for complete integrability is
f_i ( ∂f_j/∂x_k - ∂f_k/∂x_j ) + f_j ( ∂f_k/∂x_i - ∂f_i/∂x_k ) + f_k ( ∂f_i/∂x_j - ∂f_j/∂x_i ) = 0   (9.78d)
for all possible combinations of the indices i, j, k. If the equation is completely integrable, then the solution of the differential equation (9.78a) can be reduced to the solution of an ordinary differential equation with n - 1 parameters.
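The integrability test (9.78b) is a mechanical computation. The Python sketch below (SymPy assumed; the choice f_1 = y, f_2 = x is only a hypothetical test case, for which z = xy + C) evaluates the condition for two variables.

```python
# Illustrative sketch (SymPy assumed) of the integrability condition (9.78b)
# for dz = f1 dx + f2 dy with f1 = y, f2 = x (so that z = x*y + C).
import sympy as sp

x, y, z = sp.symbols('x y z')
f1, f2 = sp.sympify(y), sp.sympify(x)     # f1, f2 may also depend on z in general

cond = sp.simplify(f1.diff(y) + f2*f1.diff(z) - (f2.diff(x) + f1*f2.diff(z)))
print(cond)   # 0  -> dz = y dx + x dy is completely integrable
```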
9.2.2 Linear Second-Order Partial Differential Equations
9.2.2.1 Classification and Properties of Second-Order Differential Equations with Two Independent Variables
1. General Form of a linear second-order partial differential equation with two independent variables x, y and an unknown function u is an equation of the form
A ∂^2u/∂x^2 + 2B ∂^2u/∂x∂y + C ∂^2u/∂y^2 + a ∂u/∂x + b ∂u/∂y + c u = f,   (9.79a)
where the coefficients A, B, C, a, b, c and the right-hand side f are known functions of x and y. The behavior of the solutions of this differential equation depends on the sign of the discriminant
δ = AC - B^2   (9.79b)
in the considered domain. We distinguish the following cases:
1. δ < 0: hyperbolic type.  2. δ = 0: parabolic type.  3. δ > 0: elliptic type.  4. δ changes its sign: mixed type.
5. An important property of the discriminant δ is that its sign is invariant with respect to arbitrary transformations of the independent variables, e.g., the introduction of new coordinates in the x, y plane. Therefore, the type of the differential equation is invariant with respect to the choice of the independent variables.
2. Characteristics of linear second-order partial differential equations are the integral curves of the differential equation
A dy^2 - 2B dx dy + C dx^2 = 0   or   dy/dx = (B ± √(B^2 - AC)) / A.   (9.80)
For the characteristics of the above three types of differential equations the following statements are valid:
1. Hyperbolic type: there exist two families of real characteristics.
2. Parabolic type: there exists only one family of real characteristics.
3. Elliptic type: there exist no real characteristics.
4. A differential equation obtained from (9.79a) by a coordinate transformation has the same characteristics as (9.79a).
5. If a family of characteristics coincides with a family of coordinate lines, then the term with the second derivative of the unknown function with respect to the corresponding independent variable is missing in (9.79a). In the case of a parabolic differential equation the mixed derivative term is also missing.
3. Normal Form or Canonical Form We have the following possibilities to transform (9.79a) into the normal form of linear second-order partial differential equations.
1. Transformation into Normal Form: The differential equation (9.79a) can be transformed into normal form by introducing the new independent variables

\xi = \varphi(x, y)   and   \eta = \psi(x, y) ,   (9.81a)

which, according to the sign of the discriminant (9.79b), belongs to one of the three considered types:

\frac{\partial^2 u}{\partial \xi^2} - \frac{\partial^2 u}{\partial \eta^2} + \dots = 0 ,  \delta < 0, hyperbolic type;   (9.81b)

\frac{\partial^2 u}{\partial \eta^2} + \dots = 0 ,  \delta = 0, parabolic type;   (9.81c)

\frac{\partial^2 u}{\partial \xi^2} + \frac{\partial^2 u}{\partial \eta^2} + \dots = 0 ,  \delta > 0, elliptic type.   (9.81d)

The terms not containing second-order partial derivatives of the unknown function are indicated by the dots.
2. Reduction of a Hyperbolic Type Equation to Canonical Form (9.81b): If, in the hyperbolic case, we choose the two families of characteristics as the coordinate lines of the new coordinate system (9.81a), i.e., if we substitute \xi_1 = \varphi(x,y), \eta_1 = \psi(x,y), where \varphi(x,y) = const and \psi(x,y) = const are the equations of the characteristics, then (9.79a) takes the form

\frac{\partial^2 u}{\partial \xi_1\,\partial \eta_1} + \dots = 0 .   (9.81e)

This form is also called the canonical form of a hyperbolic type differential equation. From here we get the canonical form (9.81b) by the substitution

\xi = \xi_1 + \eta_1 ,   \eta = \xi_1 - \eta_1 .   (9.81f)

3. Reduction of a Parabolic Type Equation to Canonical Form (9.81c): The only family of characteristics given in this case is chosen as the family \xi = const, while an arbitrary function of x and y, which must not depend on \xi, can be chosen for \eta.
4. Reduction of an Elliptic Type Equation to Canonical Form (9.81d): If the coefficients A(x,y), B(x,y), C(x,y) are analytic functions (see 14.1.2.1, p. 670) in the elliptic case, then the characteristics define two complex conjugate families of curves \varphi(x,y) = const, \psi(x,y) = const. If we substitute \xi = \varphi + \psi and \eta = i(\varphi - \psi), the equation takes the form (9.81d).
4. Generalized Form Every statement about the classification and the reduction to canonical form remains valid for equations given in the more general form

A(x,y)\frac{\partial^2 u}{\partial x^2} + 2B(x,y)\frac{\partial^2 u}{\partial x\,\partial y} + C(x,y)\frac{\partial^2 u}{\partial y^2} + F\left(x, y, u, \frac{\partial u}{\partial x}, \frac{\partial u}{\partial y}\right) = 0 ,   (9.82)

where F is a non-linear function of the unknown function u and of its first-order partial derivatives \partial u/\partial x and \partial u/\partial y, in contrast to (9.79a).
9.2.2.2 Classification and Properties of Linear Second-Order Differential Equations with More than Two Independent Variables
1. General Form A differential equation of this kind has the form

\sum_{i,k} a_{ik}\,\frac{\partial^2 u}{\partial x_i\,\partial x_k} + \dots = 0 ,   (9.83)

where the a_{ik} are given functions of the independent variables and the dots in (9.83) denote terms not containing second-order derivatives of the unknown function. In general, the differential equation (9.83) cannot be reduced to a simple canonical form by transforming the independent variables. However, there is an important classification, similar to the one introduced above in 9.2.2.1, p. 520 (see [9.5]).
2. Linear Second-Order Partial Differential Equations with Constant Coefficients If all coefficients a_{ik} in (9.83) are constants, then the equation can be reduced by a linear homogeneous transformation of the independent variables to the simpler canonical form

\sum_{i} \kappa_i\,\frac{\partial^2 u}{\partial x_i^2} + \dots = 0 ,   (9.84)

where the coefficients \kappa_i are +1, -1 or 0. We distinguish between several characteristic cases.
1. Elliptic Differential Equation All coefficients \kappa_i are different from zero and have the same sign: elliptic differential equation.
2. Hyperbolic and Ultrahyperbolic Differential Equation All coefficients \kappa_i are different from zero, but one has a sign different from the others: hyperbolic differential equation. If both types of sign occur at least twice, it is an ultrahyperbolic differential equation.
3. Parabolic Differential Equation One of the coefficients \kappa_i is equal to zero, the others are different from zero and have the same sign: parabolic differential equation.
4. Simple Case for Elliptic and Hyperbolic Differential Equations A relatively simple case occurs if not only the coefficients of the highest derivatives of the unknown function are constants, but also those of the first derivatives. Then the terms with the first derivatives for which \kappa_i \ne 0 can be eliminated by the substitution

u = v\,\exp\left(-\frac{1}{2}\sum_{k}\frac{b_k}{\kappa_k}\,x_k\right) ,   (9.85)

where b_k is the coefficient of \partial u/\partial x_k in (9.84) and the summation runs over all k with \kappa_k \ne 0. In this way, every elliptic and hyperbolic differential equation with constant coefficients can be reduced to a simple form:
a) Elliptic Case:   \Delta v + k v = g ,   (9.86)
b) Hyperbolic Case:   \frac{\partial^2 v}{\partial x_1^2} - \Delta v + k v = g ,   (9.87)
where \Delta denotes the Laplace operator (see 13.2.6.5, p. 655); in the hyperbolic case it acts on the remaining variables.
9.2.2.3 Integration Methods for Linear Second-Order Partial Differential Equations
1. Method of Separation of Variables Certain solutions of several differential equations of physics can be determined by special substitutions; although these are not general solutions, we obtain a family of solutions depending on arbitrary parameters. Linear differential equations, especially those of second order, can often be solved by looking for a solution in the form of a product

u(x_1, \dots, x_n) = \varphi_1(x_1)\,\varphi_2(x_2)\cdots\varphi_n(x_n) .   (9.88)

Next we try to separate the functions \varphi_k(x_k), i.e., for each of them we want to obtain an ordinary differential equation containing only the one variable x_k. This separation of variables succeeds in many cases when the trial solution in the product form (9.88) is substituted into the given differential equation. In order to guarantee that the solution of the original equation satisfies the required homogeneous boundary conditions, it is often sufficient that some of the functions \varphi_1(x_1), \varphi_2(x_2), \dots, \varphi_n(x_n) satisfy certain boundary conditions.
By summation, differentiation and integration, new solutions can be obtained from the ones already found; the parameters should be chosen so that the remaining boundary and initial conditions are satisfied (see the examples). Finally, we must not forget that the solutions obtained in this way, often infinite series or improper integrals, are at first only formal solutions. That is, we have to check whether the solution makes physical sense, e.g., whether it is convergent, satisfies the original differential equation and the boundary conditions, whether it may be differentiated term by term and whether the limit at the boundary exists. The infinite series and improper integrals in the examples of this paragraph are convergent if the functions defining the boundary conditions satisfy the required conditions, e.g., the continuity assumption for the second derivatives in the first and second examples.
■ A: Equation of the Vibrating String is a linear second-order partial differential equation of hyperbolic type

\frac{\partial^2 u}{\partial t^2} = a^2\,\frac{\partial^2 u}{\partial x^2} .   (9.89a)

It describes the vibration of a stretched string. The boundary and initial conditions are

u(0,t) = u(l,t) = 0 ,  u(x,0) = f(x) ,  \frac{\partial u}{\partial t}\Big|_{t=0} = \varphi(x) .   (9.89b)

We look for a solution in the product form

u = X(x)\,T(t) ,   (9.89c)

and after substituting it into the given equation (9.89a) we have

\frac{X''}{X} = \frac{T''}{a^2 T} .   (9.89d)

The variables are separated: the left side depends only on x and the right side only on t, so each of them must be a constant. This constant must be negative, otherwise the boundary conditions cannot be satisfied. We get an ordinary linear second-order differential equation with constant coefficients for each variable. For the general solution see 9.1.2.4, p. 500. Denoting this negative constant by -\lambda^2, we get the linear differential equations

X'' + \lambda^2 X = 0 ,   (9.89e)
T'' + a^2\lambda^2 T = 0 .   (9.89f)

We have X(0) = X(l) = 0 from the boundary conditions. Hence X(x) is an eigenfunction of a Sturm–Liouville boundary value problem and \lambda is the corresponding eigenvalue (see 9.1.3.1, 3., p. 513). Solving the differential equation (9.89e) for X with the corresponding boundary conditions we get

X(x) = C\sin\lambda x   with   \sin\lambda l = 0 ,  i.e.,  \lambda = \frac{n\pi}{l} = \lambda_n   (n = 1, 2, \dots).   (9.89g)

Solving equation (9.89f) for T yields a particular solution of the original differential equation (9.89a) for every eigenvalue \lambda_n:

u_n = \left(a_n\cos\frac{n\pi a t}{l} + b_n\sin\frac{n\pi a t}{l}\right)\sin\frac{n\pi x}{l} .   (9.89h)

Requiring that for t = 0

u = \sum_{n=1}^{\infty} u_n   is equal to f(x)   (9.89i)   and   \frac{\partial}{\partial t}\sum_{n=1}^{\infty} u_n   is equal to \varphi(x) ,   (9.89j)

we get with a Fourier series expansion in sines (see 7.4.1.1, 1., p. 418)

a_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx ,   b_n = \frac{2}{n\pi a}\int_0^l \varphi(x)\sin\frac{n\pi x}{l}\,dx .   (9.89k)
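For a concrete f(x) and φ(x), the coefficients (9.89k) and partial sums of the series of particular solutions (9.89h) can be evaluated numerically; here is a minimal sketch in Python (the truncation order N and the sample initial data are illustrative assumptions):

```python
import numpy as np

l, a, N = 1.0, 1.0, 50          # string length, wave speed, truncation order

def f(x):                        # assumed initial displacement (plucked string)
    return np.where(x < 0.5 * l, x, l - x)

def phi(x):                      # assumed initial velocity
    return np.zeros_like(x)

x = np.linspace(0.0, l, 2001)
n = np.arange(1, N + 1)
S = np.sin(np.outer(n, np.pi * x / l))     # sin(n*pi*x/l) on the quadrature grid

# Fourier sine coefficients (9.89k), integrals by the trapezoidal rule
a_n = 2.0 / l * np.trapz(f(x) * S, x, axis=1)
b_n = 2.0 / (n * np.pi * a) * np.trapz(phi(x) * S, x, axis=1)

def u(xp, t):
    """Partial sum of the series of particular solutions (9.89h)."""
    modes = a_n * np.cos(n * np.pi * a * t / l) + b_n * np.sin(n * np.pi * a * t / l)
    return np.sin(np.outer(xp, n * np.pi / l)) @ modes

print(u(np.array([0.25, 0.5, 0.75]), 0.3))
```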
■ B: Equation of Longitudinal Vibrations of a Bar is a linear second-order partial differential equation of hyperbolic type which describes the longitudinal vibration of a bar with one end free and a constant force p acting on the other end. We have to solve the same differential equation as in example A (p. 523), i.e.,

\frac{\partial^2 u}{\partial t^2} = a^2\,\frac{\partial^2 u}{\partial x^2} ,   (9.90a)

with the same initial conditions but different boundary conditions:

\frac{\partial u}{\partial x}\Big|_{x=0} = 0   (free end),   (9.90c)
\frac{\partial u}{\partial x}\Big|_{x=l} = k p .   (9.90d)

These conditions can be replaced by the homogeneous conditions

\frac{\partial z}{\partial x}\Big|_{x=0} = \frac{\partial z}{\partial x}\Big|_{x=l} = 0 ,   (9.90e)

if instead of u we introduce the new unknown function z = u - \frac{k p x^2}{2l}. The differential equation then becomes inhomogeneous:

\frac{\partial^2 z}{\partial t^2} = a^2\,\frac{\partial^2 z}{\partial x^2} + \frac{k a^2 p}{l} .   (9.90f)

We look for the solution in the form z = v + w, where v satisfies the homogeneous differential equation with the initial and boundary conditions for z, and w satisfies the inhomogeneous differential equation with zero initial and boundary conditions. This gives w = \frac{k a^2 p\,t^2}{2l}. Substituting the product form

v = X(x)\,T(t)   (9.90i)

into the differential equation, we get the separated ordinary differential equations as in example A (p. 523):

\frac{X''}{X} = \frac{T''}{a^2 T} = -\lambda^2 .   (9.90j)

Integrating the differential equation for X with the boundary conditions X'(0) = X'(l) = 0 we find the eigenfunctions

X_n = \cos\frac{n\pi x}{l}   (9.90k)

and the corresponding eigenvalues

\lambda_n^2 = \frac{n^2\pi^2}{l^2}   (n = 0, 1, 2, \dots).   (9.90l)

Proceeding as in example A (p. 523) we finally obtain

u = \frac{k p x^2}{2l} + \frac{k a^2 p\,t^2}{2l} + \sum_{n=0}^{\infty}\left(a_n\cos\frac{n\pi a t}{l} + b_n\sin\frac{n\pi a t}{l}\right)\cos\frac{n\pi x}{l} ,   (9.90m)
where a_n and b_n (n = 0, 1, 2, \dots) are the coefficients of the Fourier cosine expansions of the functions f(x) - \frac{k p x^2}{2l} and \frac{1}{a\lambda_n}\varphi(x) in the interval (0, l) (see 7.4.1.1, 1., p. 418).
■ C: Equation of a Vibrating Circular Membrane fixed along its boundary: The differential equation is linear, partial and of hyperbolic type. In Cartesian and in polar coordinates (see 3.5.3.1, 6., p. 209) it has the form

\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{1}{a^2}\frac{\partial^2 u}{\partial t^2} ,   (9.91a)
\frac{\partial^2 u}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial u}{\partial \rho} + \frac{1}{\rho^2}\frac{\partial^2 u}{\partial \varphi^2} = \frac{1}{a^2}\frac{\partial^2 u}{\partial t^2} .   (9.91b)

The initial and boundary conditions are

u\big|_{t=0} = f(\rho, \varphi) ,   (9.91c)
\frac{\partial u}{\partial t}\Big|_{t=0} = F(\rho, \varphi) ,   (9.91d)
u\big|_{\rho = R} = 0 .   (9.91e)

The substitution of the product form

u = U(\rho)\,\Phi(\varphi)\,T(t)   (9.91f)

with three factors into the differential equation in polar coordinates yields (9.91g); from it, three ordinary differential equations are obtained for the separated variables, analogously to examples A (p. 523) and B (p. 524):

T'' + a^2\lambda^2 T = 0 ,   (9.91h)
\frac{\rho^2 U'' + \rho U'}{U} + \lambda^2\rho^2 = -\frac{\Phi''}{\Phi} = \nu^2 ,   (9.91i)
\Phi'' + \nu^2\Phi = 0 .   (9.91j)

From the conditions \Phi(0) = \Phi(2\pi), \Phi'(0) = \Phi'(2\pi) it follows that

\Phi(\varphi) = a_n\cos n\varphi + b_n\sin n\varphi ,   \nu^2 = n^2   (n = 0, 1, 2, \dots).

U and \lambda are determined from the equation

\left[\rho U'\right]' - \frac{n^2}{\rho}U = -\lambda^2\rho U   (9.91k)

and U(R) = 0. Considering the obvious condition of boundedness of U(\rho) at \rho = 0 and substituting \lambda\rho = z, we get

z^2 U'' + z U' + (z^2 - n^2)U = 0 ,  i.e.,  U(\rho) = J_n(z) = J_n\!\left(\frac{\mu\rho}{R}\right) ,   (9.91l)

where the J_n are the Bessel functions (see 9.1.2.6, 2., p. 507), \lambda = \mu/R and J_n(\mu) = 0. The function system

u_{nk}(\rho) = J_n\!\left(\mu_{nk}\frac{\rho}{R}\right)   (k = 1, 2, \dots),   (9.91m)

with \mu_{nk} the k-th positive root of the function J_n(z), is a complete system of eigenfunctions of the self-adjoint Sturm–Liouville problem; the eigenfunctions are orthogonal with the weight function \rho. The solution of the problem can be written as a double series:

u = \sum_{n=0}^{\infty}\sum_{k=1}^{\infty}\left[(a_{nk}\cos n\varphi + b_{nk}\sin n\varphi)\cos\frac{a\mu_{nk}t}{R} + (c_{nk}\cos n\varphi + d_{nk}\sin n\varphi)\sin\frac{a\mu_{nk}t}{R}\right]J_n\!\left(\mu_{nk}\frac{\rho}{R}\right) ,   (9.91n)

where the coefficients a_{nk} and b_{nk} are obtained from the Fourier–Bessel expansion of f(\rho,\varphi) (formulas (9.91o)–(9.91r)). In the case n = 0 the numerator 2 in these formulas should be changed to 1. To determine the coefficients c_{nk} and d_{nk}, we replace f(\rho,\varphi) by F(\rho,\varphi) in the formulas for a_{nk} and b_{nk} and multiply by \frac{R}{a\mu_{nk}}.
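The eigenfrequencies a\mu_{nk}/R appearing in (9.91n) only require the positive roots \mu_{nk} of J_n; here is a minimal sketch in Python using scipy (the numerical values of R and a are illustrative assumptions):

```python
import numpy as np
from scipy.special import jn_zeros, jv

R, a = 1.0, 1.0                      # assumed membrane radius and wave speed

# mu[n, k-1] = k-th positive root mu_nk of the Bessel function J_n
mu = np.array([jn_zeros(n, 3) for n in range(3)])
print("roots mu_nk:\n", mu)
print("angular eigenfrequencies a*mu_nk/R:\n", a * mu / R)

# Radial eigenfunction u_nk(rho) = J_n(mu_nk * rho / R) from (9.91m)
rho = np.linspace(0.0, R, 5)
n, k = 1, 2
print("u_12(rho) =", jv(n, mu[n, k - 1] * rho / R))   # vanishes at rho = R
```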
■ D: Dirichlet Problem (see 13.5.1, p. 668) for the rectangle 0 ≤ x ≤ a, 0 ≤ y ≤ b (Fig. 9.17): Find a function u(x,y) satisfying the Laplace differential equation of elliptic type

\Delta u = 0   (9.92a)

and the boundary conditions

u(0,y) = \varphi_1(y) ,  u(a,y) = \varphi_2(y) ,  u(x,0) = \psi_1(x) ,  u(x,b) = \psi_2(x) .   (9.92b)

Figure 9.17

First we determine a particular solution for the boundary conditions \varphi_1(y) = \varphi_2(y) = 0. Substituting the product form

u = X(x)\,Y(y)   (9.92c)

into (9.92a) we get the separated differential equations

\frac{X''}{X} = -\frac{Y''}{Y} = -\lambda^2   (9.92d)

with the eigenvalue \lambda, analogously to examples A (p. 523) through C (p. 525). Since X(0) = X(a) = 0, we get

X = C\sin\lambda x ,  \lambda = \frac{n\pi}{a} = \lambda_n   (n = 1, 2, \dots).   (9.92e)

In the second step we write the general solution of the differential equation

Y'' - \frac{n^2\pi^2}{a^2}Y = 0   (9.92f)

in the form

Y = a_n\sinh\frac{n\pi}{a}(b - y) + b_n\sinh\frac{n\pi}{a}y .   (9.92g)

From these equations we get a particular solution of (9.92a) satisfying the boundary conditions u(0,y) = u(a,y) = 0, which has the form

u_n = \left[a_n\sinh\frac{n\pi}{a}(b - y) + b_n\sinh\frac{n\pi}{a}y\right]\sin\frac{n\pi x}{a} .   (9.92h)

In the third step we consider the general solution as a series

u = \sum_{n=1}^{\infty} u_n ,   (9.92i)

so from the boundary conditions for y = 0 and y = b we get

u = \sum_{n=1}^{\infty}\left[a_n\sinh\frac{n\pi}{a}(b - y) + b_n\sinh\frac{n\pi}{a}y\right]\sin\frac{n\pi x}{a}   (9.92j)

with the coefficients

a_n = \frac{2}{a\sinh\dfrac{n\pi b}{a}}\int_0^a \psi_1(x)\sin\frac{n\pi}{a}x\,dx ,   b_n = \frac{2}{a\sinh\dfrac{n\pi b}{a}}\int_0^a \psi_2(x)\sin\frac{n\pi}{a}x\,dx .   (9.92k)

The problem with the boundary conditions \psi_1(x) = \psi_2(x) = 0 can be solved in a similar manner, and adding the resulting series to (9.92j) we get the general solution of (9.92a) with the boundary conditions (9.92b).
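The series (9.92j) with the coefficients (9.92k) is easy to evaluate numerically; here is a minimal sketch in Python (the rectangle size, the boundary data ψ₁, ψ₂ and the truncation order are illustrative assumptions; φ₁ = φ₂ = 0 in this sketch):

```python
import numpy as np

a, b, N = 1.0, 1.0, 30                     # rectangle 0<=x<=a, 0<=y<=b, truncation

def psi1(x):                                # assumed boundary values u(x, 0)
    return np.sin(np.pi * x / a)

def psi2(x):                                # assumed boundary values u(x, b)
    return x * (a - x)

xq = np.linspace(0.0, a, 2001)
n = np.arange(1, N + 1)
S = np.sin(np.outer(n, np.pi * xq / a))     # sin(n*pi*x/a) on the quadrature grid

# Coefficients (9.92k), integrals by the trapezoidal rule
a_n = 2.0 / (a * np.sinh(n * np.pi * b / a)) * np.trapz(psi1(xq) * S, xq, axis=1)
b_n = 2.0 / (a * np.sinh(n * np.pi * b / a)) * np.trapz(psi2(xq) * S, xq, axis=1)

def u(x, y):
    """Partial sum of the series (9.92j) at a single point (x, y)."""
    return np.sum((a_n * np.sinh(n * np.pi * (b - y) / a)
                   + b_n * np.sinh(n * np.pi * y / a)) * np.sin(n * np.pi * x / a))

print(u(0.5, 0.0), psi1(np.array([0.5]))[0])   # nearly agree on the boundary y = 0
```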
■ E: Heat Conduction Equation Heat conduction in a homogeneous bar with one end at infinity and the other end kept at constant temperature is described by the linear second-order partial differential equation of parabolic type

\frac{\partial u}{\partial t} = a^2\,\frac{\partial^2 u}{\partial x^2} ,   (9.93a)

whose solution has to satisfy the initial and boundary conditions

u\big|_{t=0} = f(x) ,   u\big|_{x=0} = 0   (9.93b)

in the domain 0 ≤ x < +∞, t ≥ 0. We also suppose that the temperature tends to zero at infinity. Substituting

u = X(x)\,T(t)   (9.93c)

into (9.93a) we obtain the ordinary differential equations

\frac{T'}{a^2 T} = \frac{X''}{X} = -\lambda^2 ,   (9.93d)

whose parameter \lambda is introduced analogously to the previous examples A (p. 523) through D (p. 526). We get

T(t) = C_{\lambda}\,e^{-\lambda^2 a^2 t}   (9.93e)

as a solution for T(t). Using the boundary condition X(0) = 0, we get

X(x) = C\sin\lambda x ,   (9.93f)

and so

u_{\lambda} = C_{\lambda}\,e^{-\lambda^2 a^2 t}\sin\lambda x ,   (9.93g)

where \lambda is an arbitrary real number. The solution can be sought in the form

u(x,t) = \int_0^{\infty} C(\lambda)\,e^{-\lambda^2 a^2 t}\sin\lambda x\,d\lambda .   (9.93h)

From the initial condition u\big|_{t=0} = f(x) it follows that

f(x) = \int_0^{\infty} C(\lambda)\sin\lambda x\,d\lambda ,   (9.93i)

which is satisfied if we substitute

C(\lambda) = \frac{2}{\pi}\int_0^{\infty} f(s)\sin\lambda s\,ds   (9.93j)

for the coefficient function (see 7.4.1.1, 1., p. 418). Combining this equation with (9.93h) we get

u(x,t) = \frac{2}{\pi}\int_0^{\infty} f(s)\left(\int_0^{\infty} e^{-\lambda^2 a^2 t}\sin\lambda s\,\sin\lambda x\,d\lambda\right)ds ,   (9.93k)

or, after replacing the product of the two sines by one half of the difference of two cosines ((2.122), p. 81) and using formula (21.27), p. 1052,

u(x,t) = \frac{1}{2a\sqrt{\pi t}}\int_0^{\infty} f(s)\left[e^{-\frac{(x-s)^2}{4a^2 t}} - e^{-\frac{(x+s)^2}{4a^2 t}}\right]ds .   (9.93l)
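The integral representation (9.93l) can be evaluated numerically for a given initial temperature profile; here is a minimal sketch in Python (the profile f and the quadrature parameters are illustrative assumptions):

```python
import numpy as np

a = 1.0                                    # assumed parameter a in (9.93a)

def f(s):                                   # assumed initial temperature, f(0) = 0
    return s * np.exp(-s)

def u(x, t, s_max=40.0, ns=20001):
    """Evaluate (9.93l) on the half-line by the trapezoidal rule."""
    s = np.linspace(0.0, s_max, ns)
    kern = (np.exp(-(x - s) ** 2 / (4 * a * a * t))
            - np.exp(-(x + s) ** 2 / (4 * a * a * t)))
    return np.trapz(f(s) * kern, s) / (2 * a * np.sqrt(np.pi * t))

print(u(1.0, 0.01))    # short times approximately reproduce f(1) = exp(-1)
print(u(0.0, 0.5))     # the boundary condition u(0, t) = 0 is preserved
```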
2. Riemann Method for Solving Cauchy's Problem for the Hyperbolic Differential Equation

\frac{\partial^2 u}{\partial x\,\partial y} + a\,\frac{\partial u}{\partial x} + b\,\frac{\partial u}{\partial y} + c u = F .   (9.94a)

1. Riemann Function is a function v(x, y; \xi, \eta), where \xi and \eta are considered as parameters, satisfying the homogeneous equation

\frac{\partial^2 v}{\partial x\,\partial y} - \frac{\partial(a v)}{\partial x} - \frac{\partial(b v)}{\partial y} + c v = 0 ,   (9.94b)

which is the adjoint of (9.94a), together with the conditions (9.94c), which prescribe v along the characteristics x = \xi and y = \eta through the point (\xi, \eta); in particular v(\xi, \eta; \xi, \eta) = 1.
In general, linear second-order differential equations and their adjoint differential equations have the form

\sum_{i,k} a_{ik}\,\frac{\partial^2 u}{\partial x_i\,\partial x_k} + \sum_i b_i\,\frac{\partial u}{\partial x_i} + c u = f   (9.94d)

and

\sum_{i,k} \frac{\partial^2(a_{ik}v)}{\partial x_i\,\partial x_k} - \sum_i \frac{\partial(b_i v)}{\partial x_i} + c v = 0 .   (9.94e)

2. Riemann Formula is the integral formula used to determine the function u(\xi, \eta) satisfying the given differential equation (9.94a) and taking, together with its derivative in the direction of the curve normal (see 3.6.1.2, 2., p. 226), previously given values along a previously given curve Γ (Fig. 9.18).

Figure 9.18

The smooth curve Γ (Fig. 9.18) must not have tangents parallel to the coordinate axes, i.e., the curve must not be tangent to the characteristics. The line integral in this formula can be calculated, since the values of both partial derivatives can be determined from the function values and from the derivative in a non-tangential direction along the curve arc.
In the Cauchy problem, the values of a partial derivative of the unknown function, e.g., \partial u/\partial y, are often given instead of the normal derivative along the curve. Then another form of the Riemann formula is used.
■ Electric Circuit Equation (Telegraph Equation) is a linear second-order partial differential equation of hyperbolic type

a\,\frac{\partial^2 u}{\partial t^2} + b\,\frac{\partial u}{\partial t} + c u = \frac{\partial^2 u}{\partial x^2} ,   (9.95a)

where a > 0, b and c are constants. The equation describes the current flow in wires; it is a generalization of the differential equation of a vibrating string. We replace the unknown function u(x,t) by u = z\,e^{-\frac{b}{2a}t}. Then (9.95a) is reduced to the form

\frac{\partial^2 z}{\partial t^2} = m^2\,\frac{\partial^2 z}{\partial x^2} + n^2 z   \left(m^2 = \frac{1}{a} ,\; n^2 = \frac{b^2 - 4ac}{4a^2}\right) .   (9.95b)

Replacing the independent variables by

\xi = \frac{n}{m}(m t + x) ,   \eta = \frac{n}{m}(m t - x) ,   (9.95c)

we finally get the canonical form

\frac{\partial^2 z}{\partial \xi\,\partial \eta} - \frac{1}{4}z = 0   (9.95d)

of a hyperbolic type linear partial differential equation (see 9.2.2.1, 1., p. 521). The Riemann function v(\xi, \eta; \xi_0, \eta_0) should satisfy this equation and take the value one at \xi = \xi_0, \eta = \eta_0. If we choose

w = (\xi - \xi_0)(\eta - \eta_0)   (9.95e)

for w in v = f(w), then f(w) is a solution of the differential equation

w\,\frac{d^2 f}{dw^2} + \frac{df}{dw} - \frac{1}{4}f = 0   (9.95f)

with the initial condition f(0) = 1. The substitution w = \alpha^2 reduces this differential equation to Bessel's differential equation of order zero (see 9.1.2.6, 2., p. 507)

\frac{d^2 f}{d\alpha^2} + \frac{1}{\alpha}\frac{df}{d\alpha} - f = 0 ,   (9.95g)

hence the solution is

v = I_0\!\left[\sqrt{(\xi - \xi_0)(\eta - \eta_0)}\right] .   (9.95h)

A solution of the original differential equation (9.95a) satisfying the initial conditions

z\big|_{t=0} = f(x) ,   \frac{\partial z}{\partial t}\Big|_{t=0} = 0   (9.95i)

can be obtained by substituting the found value of v into the Riemann formula and then returning to the original variables:

z(x,t) = \frac{1}{2}\big[f(x - m t) + f(x + m t)\big] + \frac{n t}{2}\int_{x - m t}^{x + m t} \frac{f(s)\,I_1\!\left(\dfrac{n}{m}\sqrt{m^2 t^2 - (s - x)^2}\right)}{\sqrt{m^2 t^2 - (s - x)^2}}\,ds .   (9.95j)
3. Green's Method of Solving the Boundary Value Problem for Elliptic Differential Equations with Two Independent Variables This method is very similar to the Riemann method of solving the Cauchy problem for hyperbolic differential equations.
If we want to find a function u(x,y) satisfying the elliptic type linear second-order partial differential equation

\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + a\,\frac{\partial u}{\partial x} + b\,\frac{\partial u}{\partial y} + c u = f   (9.96a)

in a given domain and taking prescribed values on its boundary, we first determine the Green function G(x, y; \xi, \eta) for this domain, where \xi and \eta are regarded as parameters. The Green function must satisfy the following conditions:
1. The function G(x, y; \xi, \eta) satisfies the homogeneous adjoint differential equation

\frac{\partial^2 G}{\partial x^2} + \frac{\partial^2 G}{\partial y^2} - \frac{\partial(a G)}{\partial x} - \frac{\partial(b G)}{\partial y} + c G = 0   (9.96b)

everywhere in the given domain except at the point x = \xi, y = \eta.
2. The function G(x, y; \xi, \eta) has the form

G = U\ln\frac{1}{r} + V   (9.96c)   with   r = \sqrt{(x - \xi)^2 + (y - \eta)^2} ,   (9.96d)

where U has the value one at the point x = \xi, y = \eta, and U and V are continuous functions in the entire domain together with their second derivatives.
3. The function G(x, y; \xi, \eta) is equal to zero on the boundary of the given domain.
The second step is to express the solution of the boundary value problem with the help of the Green function by the formula

u(\xi, \eta) = \frac{1}{2\pi}\oint_S u\,\frac{\partial G}{\partial n}\,ds - \frac{1}{2\pi}\iint_D f\,G\,dx\,dy ,   (9.96e)

where D is the considered domain, S is its boundary on which the values of u are assumed to be known, and \partial/\partial n denotes the normal derivative directed toward the interior of D.
Condition 3 depends on the formulation of the problem. If, for instance, instead of the function values the values of the derivative of the unknown function in the direction normal to the boundary are given, then in 3 we require the condition

\frac{\partial G}{\partial n} - (a\cos\alpha + b\cos\beta)\,G = 0   (9.96f)

on the boundary, where \alpha and \beta denote the angles between the interior normal to the boundary of the domain and the coordinate axes. In this case, the solution is given by the formula

u(\xi, \eta) = -\frac{1}{2\pi}\oint_S \frac{\partial u}{\partial n}\,G\,ds - \frac{1}{2\pi}\iint_D f\,G\,dx\,dy .   (9.96g)
4. Green's Method for the Solution of Boundary Value Problems with Three Independent Variables The solution of the differential equation

\Delta u + a\,\frac{\partial u}{\partial x} + b\,\frac{\partial u}{\partial y} + c\,\frac{\partial u}{\partial z} + e u = f   (9.97a)

should take given values on the boundary of the considered domain. As the first step, we again construct the Green function, which now depends on the three parameters \xi, \eta and \zeta. The adjoint differential equation satisfied by the Green function has the form

\Delta G - \frac{\partial(a G)}{\partial x} - \frac{\partial(b G)}{\partial y} - \frac{\partial(c G)}{\partial z} + e G = 0 .   (9.97b)

As in condition 2, the function G(x, y, z; \xi, \eta, \zeta) has the form

G = U\,\frac{1}{r} + V   (9.97c)   with   r = \sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2} .   (9.97d)

The solution of the problem is then given by a formula (9.97e) analogous to (9.96e), in which the line integral is replaced by a surface integral over the boundary and the area integral by a volume integral over the domain.
Both methods, Riemann's and Green's, have the common idea of first determining a special solution of the differential equation, which can then be used to obtain a solution with arbitrary boundary conditions. An essential difference between the Riemann and the Green function is that the first depends only on the form of the left-hand side of the differential equation, while the second also depends on the considered domain. Finding the Green function is in practice an extremely difficult problem, even if it is known to exist; therefore, Green's method is used mostly in theoretical research.
■ A: Construction of the Green function for the Dirichlet problem of the Laplace differential equation (see 13.5.1, p. 668)

\Delta u = 0   (9.98a)

for the case when the considered domain is a circle (Fig. 9.19). The Green function is

G(x, y; \xi, \eta) = \ln\frac{1}{r} + \ln\frac{r_1\rho}{R} ,   (9.98b)

Figure 9.19

where r = \overline{PM}, \rho = \overline{OM}, r_1 = \overline{PM_1} and R is the radius of the considered circle (Fig. 9.19). The points M and M_1 are symmetric with respect to the circle, i.e., both points lie on the same ray starting from the center and

\overline{OM}\cdot\overline{OM_1} = R^2 .   (9.98c)

The formula (9.96e) for the solution of the Dirichlet problem yields, after substituting the normal derivative of the Green function and some calculation, the so-called Poisson integral

u(\xi, \eta) = \frac{1}{2\pi}\int_0^{2\pi}\frac{R^2 - \rho^2}{R^2 - 2R\rho\cos(\varphi - \psi) + \rho^2}\,u(\varphi)\,d\varphi .   (9.98d)

The notation is the same as above. The known values of u on the boundary of the circle are given by u(\varphi), and for the coordinates of the point M(\xi,\eta) we have \xi = \rho\cos\psi, \eta = \rho\sin\psi.
■ B: Construction of the Green function for the Dirichlet problem of the Laplace differential equation (see 13.5.1, p. 668)

\Delta u = 0   (9.99a)

for the case when the considered domain is a sphere with radius R. The Green function now has the form

G(x, y, z; \xi, \eta, \zeta) = \frac{1}{r} - \frac{R}{\rho\,r_1}   (9.99b)

with \rho = \sqrt{\xi^2 + \eta^2 + \zeta^2} as the distance of the point (\xi, \eta, \zeta) from the center, r as the distance between the points (x, y, z) and (\xi, \eta, \zeta), and r_1 as the distance of the point (x, y, z) from the point \left(\frac{R^2\xi}{\rho^2}, \frac{R^2\eta}{\rho^2}, \frac{R^2\zeta}{\rho^2}\right), which is symmetric to (\xi, \eta, \zeta) with respect to the sphere according to (9.98c). In this case the Poisson integral has (with the same notation as in example A (p. 531)) the form

u(\xi, \eta, \zeta) = \frac{1}{4\pi R}\iint_S \frac{R^2 - \rho^2}{r^3}\,u\,dS .   (9.99c)
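The Poisson integral (9.98d) can be checked numerically against a harmonic function whose boundary values are known; here is a minimal sketch in Python (the test function u = x and the quadrature grid are illustrative assumptions):

```python
import numpy as np

R = 1.0                                     # assumed radius of the circle

def boundary_u(phi):                         # boundary values of u = x, harmonic in the disk
    return R * np.cos(phi)

def poisson_disk(rho, psi, nphi=20001):
    """Evaluate the Poisson integral (9.98d) by the trapezoidal rule."""
    phi = np.linspace(0.0, 2.0 * np.pi, nphi)
    kern = (R**2 - rho**2) / (R**2 - 2.0 * R * rho * np.cos(phi - psi) + rho**2)
    return np.trapz(kern * boundary_u(phi), phi) / (2.0 * np.pi)

rho, psi = 0.4, 0.7
print(poisson_disk(rho, psi))      # numerical value of u at the interior point M
print(rho * np.cos(psi))           # exact value x = rho*cos(psi) for comparison
```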
5. Operational Method Operational methods can be used not only to solve ordinary differential equations but also partial differential equations (see 15.1.6, p. 707). They are based on the transition from the unknown function to its transform (see 15.1, p. 705). In this process, we regard the unknown function as a function of only one variable and perform the transformation with respect to this variable; the remaining variables are considered as parameters. The differential equation determining the transform of the unknown function contains one independent variable less than the original equation. In particular, if the original equation is a partial differential equation in two independent variables, then we obtain an ordinary differential equation for the transform. If we can find the transform of the unknown function from the obtained equation, then we determine the original function either from the inversion formula or from a table of transforms.
6. Approximation Methods In order to solve practical problems with partial differential equations, different approximation methods are often used. We distinguish between analytical and numerical methods.
1. Analytical Methods make it possible to determine approximate analytical expressions for the unknown function.
2. Numerical Methods result in approximate values of the unknown function for certain values of the independent variables. The following methods are used (see 19.5, p. 908):
a) Finite Difference Method, or Lattice-Point Method: The derivatives are replaced by divided differences, so the differential equation, including the initial and boundary conditions, becomes an algebraic system of equations. A linear differential equation with linear initial and boundary conditions becomes a linear system of equations (a minimal sketch follows after this list).
b) Finite Element Method, or briefly FEM, for boundary value problems: A variational problem is assigned to the boundary value problem. The unknown function is approximated by a spline whose coefficients should be chosen to get the best possible solution. The domain of the boundary value problem is decomposed into regular subdomains. The coefficients are determined by solving an extreme value problem.
c) Integral Equation Method (along a Closed Curve) for special boundary value problems: The boundary value problem is formulated as an equivalent integral equation problem along the boundary of the domain of the boundary value problem. To do this, the theorems of vector analysis, e.g., the Green formulas, are applied. The remaining integrals along the closed curve are determined numerically by a suitable quadrature formula.
3. Physical Solutions of differential equations can be obtained by experimental methods. This is based on the fact that various physical phenomena can be described by the same differential equation. To solve a given equation, we first construct a model with which we can simulate the given problem, and we obtain the values of the unknown function directly from this model. Since such models are often known and can be constructed with parameters varying over a wide range, the differential equation can also be investigated in a wide domain of the variables.
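As an illustration of the finite difference method from item a) above, here is a minimal sketch in Python for the one-dimensional heat conduction equation ∂u/∂t = a²∂²u/∂x² (cf. (9.93a)) with homogeneous boundary values; the grid sizes, the diffusivity and the initial profile are illustrative assumptions:

```python
import numpy as np

a2 = 1.0                      # assumed diffusivity a^2
L, nx = 1.0, 101              # interval length and number of grid points
dx = L / (nx - 1)
dt = 0.4 * dx * dx / a2       # explicit scheme is stable for a^2*dt/dx^2 <= 1/2

x = np.linspace(0.0, L, nx)
u = np.sin(np.pi * x)         # assumed initial temperature with u(0) = u(L) = 0

for _ in range(500):
    # replace u_xx by the central divided difference (u[i-1] - 2u[i] + u[i+1]) / dx^2
    u[1:-1] += a2 * dt / dx**2 * (u[:-2] - 2.0 * u[1:-1] + u[2:])
    u[0] = u[-1] = 0.0        # boundary conditions

t = 500 * dt
exact = np.exp(-a2 * np.pi**2 * t) * np.sin(np.pi * x)
print(np.max(np.abs(u - exact)))   # small discretization error
```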
9.2.3 Some Further Partial Differential Equations from Natural Sciences and Engineering
9.2.3.1 Formulation of the Problem and the Boundary Conditions
1. Problem Formulation The modeling and the mathematical treatment of different physical phenomena in classical theoretical physics, especially the modeling of media considered structureless or continuously changing, such as gases, fluids and solids, and of the fields of classical physics, leads to partial differential equations. Examples are the wave equation (see 9.2.3.2, p. 534) and the heat conduction equation (see 9.2.3.3, p. 535). Many problems of non-classical theoretical physics are also governed by partial differential equations. An important area is quantum mechanics, which is based on the recognition that media and fields are discontinuous. The most famous relation is the Schrödinger equation. Linear second-order partial differential equations occur most frequently, and they have special importance in today's natural sciences.
2. Initial and Boundary Conditions The solution of problems of physics, engineering and the natural sciences must usually fulfill two basic requirements:
1. The solution must satisfy not only the differential equation but also certain initial and/or boundary conditions. There are problems with only initial conditions, with only boundary conditions, or with both. All the conditions together must determine the unique solution of the differential equation.
2. The solution must be stable with respect to small changes in the initial and boundary conditions, i.e., its change should be arbitrarily small if the perturbations of these conditions are small enough.
Then a correct problem formulation is given. We can assume that the mathematical model of the given problem describes the real situation adequately only when these conditions are fulfilled. For instance, the Cauchy problem (see 9.2.1.1, 5., p. 516) is correctly defined with a differential equation of hyperbolic type when investigating vibration processes in continuous media. This means that the values of the required function and the values of its derivative in a non-tangential (mostly normal) direction are given on an initial manifold, i.e., on a curve or on a surface. In the case of differential equations of elliptic type, which occur in investigations of steady-state and equilibrium problems in continuous media, the formulation of a boundary value problem is correct. If the considered domain is unbounded, then the unknown function must satisfy certain prescribed properties for unlimited increase of the independent variables.
3. Inhomogeneous Conditions and Inhomogeneous Differential Equations The solution of homogeneous or inhomogeneous linear partial differential equations with inhomogeneous initial or boundary conditions can be reduced to the solution of an equation which differs from the original one only by a free term not containing the unknown function, and which has homogeneous conditions. It is sufficient to replace the original function by its difference from an arbitrary twice differentiable function satisfying the given inhomogeneous conditions. In general, we use the fact that the solution of a linear inhomogeneous partial differential equation with given inhomogeneous initial or boundary conditions is the sum of a solution of the same differential equation with zero conditions and a solution of the corresponding homogeneous differential equation with the given conditions.
To reduce the solution of the linear inhomogeneous partial differential equation

\frac{\partial^2 u}{\partial t^2} - L[u] = g(x, t)   (9.100a)

with homogeneous initial conditions

u\big|_{t=0} = 0 ,   \frac{\partial u}{\partial t}\Big|_{t=0} = 0   (9.100b)

to the solution of a Cauchy problem for the corresponding homogeneous differential equation, we substitute

u = \int_0^t \varphi(x, t; \tau)\,d\tau .   (9.100c)

Here \varphi(x, t; \tau) is the solution of the differential equation

\frac{\partial^2 \varphi}{\partial t^2} - L[\varphi] = 0   (9.100d)

which satisfies the conditions \varphi\big|_{t=\tau} = 0, \frac{\partial\varphi}{\partial t}\Big|_{t=\tau} = g(x, \tau). In these equations, x represents symbolically all n variables x_1, x_2, \dots, x_n of the n-dimensional problem. L[u] denotes a linear differential expression which may contain the derivative \partial u/\partial t, but no higher-order derivatives with respect to t.
9.2.3.2 Wave Equation The propagation of oscillations in a homogeneous medium is described by the wave equation

\frac{\partial^2 u}{\partial t^2} - a^2\Delta u = Q(x, t) ,   (9.101a)

whose right-hand side Q(x,t) vanishes when there is no perturbation. The symbol x represents the n variables x_1, \dots, x_n of the n-dimensional problem. The Laplace operator \Delta (see also 13.2.6.5, p. 655) is defined by

\Delta u = \sum_{i=1}^{n}\frac{\partial^2 u}{\partial x_i^2} .   (9.101b)

The solution of the wave equation is the wave function u. The differential equation (9.101a) is of hyperbolic type.
1. Homogeneous Problem The solution of the homogeneous problem with Q(x,t) = 0 and with the initial conditions

u\big|_{t=0} = \varphi(x) ,   \frac{\partial u}{\partial t}\Big|_{t=0} = \psi(x)   (9.102)

is given for the cases n = 1, 2, 3 by the following integrals.
Case n = 3 (Kirchhoff Formula):

u(x_1, x_2, x_3, t) = \frac{1}{4\pi a^2}\left[\frac{\partial}{\partial t}\left(\frac{1}{t}\iint_{S_{at}}\varphi\,dS\right) + \frac{1}{t}\iint_{S_{at}}\psi\,dS\right] ,   (9.103a)

where the integration is performed over the spherical surface S_{at} given by the equation (\alpha_1 - x_1)^2 + (\alpha_2 - x_2)^2 + (\alpha_3 - x_3)^2 = a^2 t^2.
Case n = 2 (Poisson Formula):

u(x_1, x_2, t) = \frac{1}{2\pi a}\left[\frac{\partial}{\partial t}\iint_{C_{at}}\frac{\varphi(\alpha_1, \alpha_2)\,d\alpha_1\,d\alpha_2}{\sqrt{a^2 t^2 - r^2}} + \iint_{C_{at}}\frac{\psi(\alpha_1, \alpha_2)\,d\alpha_1\,d\alpha_2}{\sqrt{a^2 t^2 - r^2}}\right] ,   r^2 = (\alpha_1 - x_1)^2 + (\alpha_2 - x_2)^2 ,   (9.103b)

where the integration is performed over the disc C_{at} given by (\alpha_1 - x_1)^2 + (\alpha_2 - x_2)^2 \le a^2 t^2.
Case n = 1 (d'Alembert Formula):

u(x_1, t) = \frac{\varphi(x_1 + a t) + \varphi(x_1 - a t)}{2} + \frac{1}{2a}\int_{x_1 - a t}^{x_1 + a t}\psi(\alpha)\,d\alpha .   (9.103c)
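The d'Alembert formula (9.103c) is straightforward to evaluate; here is a minimal sketch in Python (the initial data φ, ψ and the wave speed are illustrative assumptions):

```python
import numpy as np

a = 2.0                                          # assumed wave velocity

def phi(x):                                       # assumed initial displacement
    return np.exp(-x**2)

def psi(x):                                       # assumed initial velocity
    return np.zeros_like(np.asarray(x, dtype=float))

def u(x, t, nq=4001):
    """d'Alembert formula (9.103c) for the one-dimensional wave equation."""
    alpha = np.linspace(x - a * t, x + a * t, nq)
    return 0.5 * (phi(x + a * t) + phi(x - a * t)) + np.trapz(psi(alpha), alpha) / (2 * a)

print(u(0.5, 0.0))   # reproduces phi(0.5) at t = 0
print(u(0.5, 1.0))   # two half-pulses travelling with speed a
```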
2. Inhomogeneous Problem In the case Q(x,t) ≠ 0 we have to add the following correction terms to the right-hand sides of (9.103a,b,c):
Case n = 3 (Retarded Potential): For the domain K given by r ≤ at with r = \sqrt{(\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 + (\xi_3 - x_3)^2}, the correction term is

\frac{1}{4\pi a^2}\iiint_K \frac{Q\!\left(\xi_1, \xi_2, \xi_3,\, t - \dfrac{r}{a}\right)}{r}\,d\xi_1\,d\xi_2\,d\xi_3 .   (9.104a)

Case n = 2:

\frac{1}{2\pi a}\iiint_K \frac{Q(\xi_1, \xi_2, \tau)}{\sqrt{a^2(t - \tau)^2 - (\xi_1 - x_1)^2 - (\xi_2 - x_2)^2}}\,d\xi_1\,d\xi_2\,d\tau ,   (9.104b)

where K is the domain of the \xi_1, \xi_2, \tau space defined by the inequalities 0 \le \tau \le t, (\xi_1 - x_1)^2 + (\xi_2 - x_2)^2 \le a^2(t - \tau)^2.
Case n = 1:

\frac{1}{2a}\iint_T Q(\xi, \tau)\,d\xi\,d\tau ,   (9.104c)

where T is the triangle 0 \le \tau \le t, |\xi - x| \le a(t - \tau). Here a denotes the propagation velocity of the perturbation.
9.2.3.3 Heat Conduction and Diffusion Equation for Homogeneous Media
1. Three-Dimensional Heat Conduction Equation The propagation of heat in a homogeneous medium is described by a linear second-order partial differential equation of parabolic type

\frac{\partial u}{\partial t} - a^2\Delta u = Q(x, t) ,   (9.105a)

where \Delta is the three-dimensional Laplace operator in the three directions of propagation x_1, x_2, x_3 determined by the position vector \vec{r}. If the heat flow has neither source nor sink, the right-hand side vanishes: Q(x,t) = 0. The Cauchy problem can be posed in the following way: we want to determine a bounded solution u(x,t) for t > 0 with u\big|_{t=0} = f(x). The requirement of boundedness guarantees the uniqueness of the solution. For the homogeneous differential equation with Q(x,t) = 0 we get the solution

u(x, t) = \frac{1}{\left(2a\sqrt{\pi t}\right)^3}\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} f(\xi_1, \xi_2, \xi_3)\,\exp\!\left(-\frac{(x_1 - \xi_1)^2 + (x_2 - \xi_2)^2 + (x_3 - \xi_3)^2}{4a^2 t}\right)d\xi_1\,d\xi_2\,d\xi_3 .   (9.105b)

In the case of an inhomogeneous differential equation with Q(x,t) ≠ 0, we have to add the following expression to the right-hand side of (9.105b):

\int_0^t\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \frac{Q(\xi_1, \xi_2, \xi_3, \tau)}{\left(2a\sqrt{\pi(t - \tau)}\right)^3}\,\exp\!\left(-\frac{(x_1 - \xi_1)^2 + (x_2 - \xi_2)^2 + (x_3 - \xi_3)^2}{4a^2(t - \tau)}\right)d\xi_1\,d\xi_2\,d\xi_3\,d\tau .   (9.105c)

The problem of determining u(x,t) for t < 0 from given values u(x,0) cannot be solved in this way, since the Cauchy problem is not correctly formulated in this case. Since the temperature difference is proportional to the heat, one often introduces u = T(\vec{r}, t) (temperature field) and a^2 = D_W (heat diffusion constant or thermal diffusivity) and gets

\frac{\partial T}{\partial t} - D_W\Delta T = Q_W(\vec{r}, t) .   (9.105d)

2. Three-Dimensional Diffusion Equation In analogy to the heat conduction equation, the propagation of a concentration C in a homogeneous medium is described by the same linear partial differential equation as (9.105a) and (9.105d), where D_W is replaced by the three-dimensional diffusion coefficient D_C. The diffusion equation is

\frac{\partial C}{\partial t} - D_C\Delta C = Q_C(\vec{r}, t) .   (9.106)

We get the solutions by changing the corresponding symbols in (9.105b) and (9.105c).
9.2.3.4 Potential Equation The linear second-order partial differential equation

\Delta u = -4\pi\varrho   (9.107a)

is called the potential equation or Poisson differential equation (see 13.5.2, p. 668). It makes possible the determination of the potential u(x) of a scalar field generated by a scalar point function \varrho(x), where x has the coordinates x_1, x_2, x_3 and \Delta is the Laplace operator. The solution, the potential u_M(x_1, x_2, x_3) at the point M, is discussed in 13.5.2, p. 668. For the homogeneous differential equation with \varrho \equiv 0 we get the Laplace differential equation (see 13.5.1, p. 667)

\Delta u = 0 .   (9.107b)

The differential equations (9.107a) and (9.107b) are of elliptic type.
9.2.3.5 Schrödinger Equation
1. Notion of the Schrödinger Equation
1. Determination and Dependencies The solutions of the Schrödinger equation, the wave functions \psi, describe the properties of a quantum mechanical system, i.e., the properties of the states of a particle. The Schrödinger equation is a second-order partial differential equation containing the second-order derivatives of the wave function with respect to the space coordinates and the first-order derivative with respect to the time coordinate:

i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\Delta\psi + U(x_1, x_2, x_3, t)\,\psi = \hat{H}\psi ,   (9.108a)

\hat{H} = \frac{\hat{p}^2}{2m} + U(\vec{r}, t) ,   \hat{p} = \frac{\hbar}{i}\nabla \equiv -i\hbar\nabla .   (9.108b)

Here \Delta is the Laplace operator, \hbar = \frac{h}{2\pi} is the reduced Planck constant, i is the imaginary unit and \nabla is the nabla operator. The relation between the momentum p of a free particle with mass m and the wavelength \lambda is \lambda = h/p.
2. Remarks: a) In quantum mechanics, an operator is assigned to every measurable quantity. The operator occurring in (9.108a) and (9.108b) is called the Hamilton operator \hat{H} ("Hamiltonian"). It plays the same role as the Hamilton function of classical mechanical systems (see, e.g., the example of the two-body problem on p. 518). It represents the total energy of the system, which is divided into kinetic and potential energy: the first term in \hat{H} is the operator of the kinetic energy, the second that of the potential energy.
b) The imaginary unit appears explicitly in the Schrödinger equation. Consequently, the wave functions are complex functions. Both real functions contained in \psi are needed to calculate the observable quantities. The square |\psi|^2 of the wave function, describing the probability dw of the particle being in an arbitrary volume element dV of the considered domain, must satisfy special further conditions.
c) Besides the potential of the interaction, every special solution also depends on the initial and boundary conditions of the given problem. In general, we have a linear second-order boundary value problem whose solutions have physical meaning only for the eigenvalues. The squares of the absolute value of meaningful solutions are everywhere unique and regular and tend to zero at infinity.
d) Microparticles have both wave and particle properties (wave–particle duality), so the Schrödinger equation is a wave equation (see 9.2.3.2, p. 534) for the de Broglie matter waves.
e) The restriction to the non-relativistic case means that the velocity v of the particle is very small compared with the velocity of light c (v ≪ c).
The application of the Schrödinger equation is discussed in detail in the literature of theoretical physics (see, e.g., [9.15], [9.7], [9.10], [22.13]). In this chapter we demonstrate only some of the most important examples.
2. Time-Dependent Schrödinger Equation The time-dependent Schrödinger equation (9.108a) describes the general non-relativistic case of a spinless particle with mass m in a position- and time-dependent potential field U(x_1, x_2, x_3, t). The special conditions which must be satisfied by the wave function are:
a) The function \psi must be bounded and continuous.
b) The partial derivatives \partial\psi/\partial x_1, \partial\psi/\partial x_2 and \partial\psi/\partial x_3 must be continuous.
c) The function |\psi|^2 must be integrable, i.e.,

\iiint |\psi|^2\,dV < \infty .   (9.109a)

According to the normalization condition, the probability that the particle is in the considered domain must be equal to one. (9.109a) is sufficient to guarantee this condition, since by multiplying \psi by a constant the value of the integral can be made equal to one. A solution of the time-dependent Schrödinger equation has the form

\psi(x_1, x_2, x_3, t) = \Psi(x_1, x_2, x_3)\,e^{-\frac{iEt}{\hbar}} .   (9.109b)

The state of the particle is described by a periodic function of time with angular frequency \omega = E/\hbar. If the energy of the particle has the fixed value E = const, then the probability dw of finding the particle in a space element dV is independent of time:

dw = |\psi|^2\,dV = \psi\psi^*\,dV .   (9.109c)

Then we speak of a stationary state of the particle.
3. Time-Independent Schrödinger Equation If the potential U does not depend on time, i.e., U = U(x_1, x_2, x_3), we obtain the time-independent Schrödinger equation, and the wave function \Psi(x_1, x_2, x_3) is sufficient to describe the state. It follows from the time-dependent Schrödinger equation (9.108a) with the solution (9.109b):

\Delta\Psi + \frac{2m}{\hbar^2}(E - U)\Psi = 0 .   (9.110a)

In this non-relativistic case, the energy of the particle is

E = \frac{p^2}{2m}   (9.110b)

with momentum p = h/\lambda. The wave functions \Psi satisfying this differential equation are the eigenfunctions; they exist only for certain energy values E, which are determined by the special boundary conditions of the considered problem. The union of the eigenvalues forms the energy spectrum of the particle. If U is a potential of finite depth which tends to zero at infinity, then the negative eigenvalues form a discrete spectrum. If the considered domain is the entire space, then it can be required as a boundary condition that \Psi is quadratically integrable over the entire space in the Lebesgue sense (see 12.9.3.2, p. 635 and [8.5]). If the domain is finite, e.g., a sphere or a cylinder, then we can require, e.g., \Psi = 0 on the boundary as in the first boundary value problem. In the special case U(x) = 0 we get the Helmholtz differential equation

\Delta\Psi + \lambda\Psi = 0   (9.111a)   with the eigenvalue   \lambda = \frac{2mE}{\hbar^2} .   (9.111b)

\Psi = 0 is often required here as a boundary condition. (9.111a) also represents the basic mathematical equation for acoustic oscillations in a finite domain.
4. Force-Free Motion of a Particle in a Box
1. Formulation of the Problem A particle with mass m moves freely in a box with impenetrable walls of edge lengths a, b, c; it is therefore in a potential box which is infinitely high in all three directions because of the impenetrability of the walls. That is, the probability of the presence of the particle, and hence also the wave function \Psi, vanishes outside the box. The Schrödinger equation and the boundary conditions for this problem are

\frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} + \frac{\partial^2\Psi}{\partial z^2} + \frac{2m}{\hbar^2}E\Psi = 0 ,   (9.112a)

\Psi = 0   for   x = 0, x = a;  y = 0, y = b;  z = 0, z = c.   (9.112b)

2. Solution Separating the variables,

\Psi(x, y, z) = \Psi_x(x)\,\Psi_y(y)\,\Psi_z(z) ,   (9.113a)

and substituting into (9.112a) we get

\frac{1}{\Psi_x}\frac{d^2\Psi_x}{dx^2} + \frac{1}{\Psi_y}\frac{d^2\Psi_y}{dy^2} + \frac{1}{\Psi_z}\frac{d^2\Psi_z}{dz^2} = -\frac{2m}{\hbar^2}E = -B .   (9.113b)

Every term on the left-hand side depends on only one independent variable. Their sum can be a constant -B for arbitrary x, y, z only if every single term is a constant. In this case the partial differential equation is reduced to three ordinary differential equations:

\frac{d^2\Psi_x}{dx^2} = -k_x^2\Psi_x ,   \frac{d^2\Psi_y}{dy^2} = -k_y^2\Psi_y ,   \frac{d^2\Psi_z}{dz^2} = -k_z^2\Psi_z .   (9.113c)

The relation between the separation constants -k_x^2, -k_y^2, -k_z^2 is

k_x^2 + k_y^2 + k_z^2 = B ,   (9.113d)   consequently   E = \frac{\hbar^2}{2m}\left(k_x^2 + k_y^2 + k_z^2\right) .   (9.113e)

3. Solutions of the three equations (9.113c) are the functions

\Psi_x = A_x\sin k_x x ,   \Psi_y = A_y\sin k_y y ,   \Psi_z = A_z\sin k_z z   (9.114a)

with the constants A_x, A_y, A_z. With these functions \Psi satisfies the boundary conditions \Psi = 0 for x = 0, y = 0 and z = 0. To satisfy also the conditions \Psi = 0 for x = a, y = b and z = c,

\sin k_x a = \sin k_y b = \sin k_z c = 0   (9.114b)

must hold, i.e., the relations

k_x = \frac{n_x\pi}{a} ,   k_y = \frac{n_y\pi}{b} ,   k_z = \frac{n_z\pi}{c}   (9.114c)

must be satisfied, where n_x, n_y and n_z are integers. For the total energy we get

E_{n_x, n_y, n_z} = \frac{\hbar^2\pi^2}{2m}\left[\left(\frac{n_x}{a}\right)^2 + \left(\frac{n_y}{b}\right)^2 + \left(\frac{n_z}{c}\right)^2\right]   (n_x, n_y, n_z = \pm 1, \pm 2, \dots).   (9.114d)

It follows from this formula that the energy of the particle can change by interaction with its surroundings only discontinuously, which is possible only in quantum systems. The numbers n_x, n_y and n_z belonging to the eigenvalues of the energy are called quantum numbers. After calculating the product of the constants A_x A_y A_z from the normalization condition

\int_0^a\!\!\int_0^b\!\!\int_0^c (A_x A_y A_z)^2\sin^2\frac{n_x\pi x}{a}\,\sin^2\frac{n_y\pi y}{b}\,\sin^2\frac{n_z\pi z}{c}\,dx\,dy\,dz = 1   (9.114e)

we get the complete eigenfunctions of the states characterized by the three quantum numbers:

\Psi_{n_x, n_y, n_z} = \sqrt{\frac{8}{abc}}\,\sin\frac{n_x\pi x}{a}\,\sin\frac{n_y\pi y}{b}\,\sin\frac{n_z\pi z}{c} .   (9.114f)

The eigenfunctions vanish at the walls, since there one of the three sine functions is equal to zero. Inside the box this is also the case on the planes

x = \frac{a}{n_x}, \frac{2a}{n_x}, \dots, \frac{(n_x - 1)a}{n_x} ;   y = \frac{b}{n_y}, \frac{2b}{n_y}, \dots, \frac{(n_y - 1)b}{n_y} ;   z = \frac{c}{n_z}, \frac{2c}{n_z}, \dots, \frac{(n_z - 1)c}{n_z} .   (9.114g)

So there are n_x - 1, n_y - 1 and n_z - 1 planes perpendicular to the x-, y- and z-axis, respectively, on which \Psi vanishes. These planes are called nodal planes.
4. Special Case of a Cube, Degeneracy In the special case of a cube with a = b = c, the particle can be in different states which are described by different linearly independent eigenfunctions but have the same energy. This is the case when the sum n_x^2 + n_y^2 + n_z^2 has the same value in different states. These are called degenerate states, and if there are z states with the same energy, one speaks of z-fold degeneracy. The quantum numbers n_x, n_y and n_z can run through all integers except zero; the value zero would mean that the wave function is identically zero, i.e., the particle does not exist at any place in the box. The particle energy must remain finite, even if the temperature reaches absolute zero. This zero-point translational energy for a box is

E_{1,1,1} = \frac{\hbar^2\pi^2}{2m}\left(\frac{1}{a^2} + \frac{1}{b^2} + \frac{1}{c^2}\right) .   (9.114h)
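The level scheme (9.114d) and the degeneracies for the cubic box are easy to tabulate; here is a minimal sketch in Python working in units of ℏ²π²/(2ma²) (the cut-off n_max is an illustrative assumption):

```python
from collections import Counter
from itertools import product

n_max = 6   # consider quantum numbers 1..n_max in each direction (cube, a = b = c)

# Energy in units of hbar^2*pi^2/(2*m*a^2) is n_x^2 + n_y^2 + n_z^2, cf. (9.114d)
levels = Counter(nx * nx + ny * ny + nz * nz
                 for nx, ny, nz in product(range(1, n_max + 1), repeat=3))

for energy in sorted(levels)[:8]:
    print(f"E = {energy:3d} * hbar^2*pi^2/(2*m*a^2)   degeneracy {levels[energy]}")
# e.g. E = 3 (state 1,1,1) is non-degenerate, E = 6 (2,1,1) is threefold degenerate
```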
5. Particle Movement in a Symmetric Central Field (see 13.1.2.2, p. 641)
1. Formulation of the Problem The considered particle moves in a centrally symmetric potential V(r). This model reproduces the motion of an electron in the electrostatic field of a positively charged nucleus. Since this is a spherically symmetric problem, it is reasonable to use spherical coordinates (Fig. 9.20). We have the relations

r = \sqrt{x^2 + y^2 + z^2} ,  \vartheta = \arccos\frac{z}{r} ,  \varphi = \arctan\frac{y}{x} ;   x = r\sin\vartheta\cos\varphi ,  y = r\sin\vartheta\sin\varphi ,  z = r\cos\vartheta ,   (9.115a)

where r is the absolute value of the radius vector, \vartheta is the angle between the radius vector and the z-axis (polar angle) and \varphi is the angle between the projection of the radius vector onto the x, y plane and the x-axis (azimuthal angle).

Figure 9.20

For the Laplace operator we get

\Delta = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\vartheta^2} + \frac{\cos\vartheta}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta} + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2} ,   (9.115b)

so the time-independent Schrödinger equation is

\Delta\Psi + \frac{2m}{\hbar^2}\left[E - V(r)\right]\Psi = 0 .   (9.115c)
2. Solution We look for a solution in the form

\Psi(r, \vartheta, \varphi) = R_l(r)\,Y_l^m(\vartheta, \varphi) ,   (9.116a)

where R_l is the radial wave function depending only on r, and Y_l^m(\vartheta, \varphi) is the wave function depending on both angles. Substituting (9.116a) into (9.115c) we get

\frac{Y_l^m}{r^2}\frac{d}{dr}\!\left(r^2\frac{dR_l}{dr}\right) + \frac{R_l}{r^2}\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\!\left(\sin\vartheta\,\frac{\partial Y_l^m}{\partial\vartheta}\right) + \frac{1}{\sin^2\vartheta}\frac{\partial^2 Y_l^m}{\partial\varphi^2}\right] + \frac{2m}{\hbar^2}\left[E - V(r)\right]R_l Y_l^m = 0 .   (9.116b)

Dividing by R_l Y_l^m and multiplying by r^2 we get

\frac{1}{R_l}\frac{d}{dr}\!\left(r^2\frac{dR_l}{dr}\right) + \frac{2m r^2}{\hbar^2}\left[E - V(r)\right] = -\frac{1}{Y_l^m}\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\!\left(\sin\vartheta\,\frac{\partial Y_l^m}{\partial\vartheta}\right) + \frac{1}{\sin^2\vartheta}\frac{\partial^2 Y_l^m}{\partial\varphi^2}\right] .   (9.116c)

Equation (9.116c) can be satisfied only if the expression on the left-hand side, depending only on r, and the expression on the right-hand side, depending only on \vartheta and \varphi, are equal to the same constant; both sides are independent of each other. From the partial differential equation we thus get two ordinary differential equations. If the constant is chosen equal to l(l+1), then we get the so-called radial equation, depending only on r and the potential V(r):

\frac{1}{r^2}\frac{d}{dr}\!\left(r^2\frac{dR_l}{dr}\right) + \left\{\frac{2m}{\hbar^2}\left[E - V(r)\right] - \frac{l(l+1)}{r^2}\right\}R_l = 0 .   (9.116d)

We look for the angle-dependent part also in the separated form

Y_l^m(\vartheta, \varphi) = \Theta(\vartheta)\,\Phi(\varphi) .   (9.116e)

Substituting (9.116e) into (9.116c) we get

\frac{\sin\vartheta}{\Theta}\frac{d}{d\vartheta}\!\left(\sin\vartheta\,\frac{d\Theta}{d\vartheta}\right) + l(l+1)\sin^2\vartheta = -\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} .   (9.116f)

If the separation constant is reasonably chosen as m^2, then the so-called polar equation is

\frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\!\left(\sin\vartheta\,\frac{d\Theta}{d\vartheta}\right) + \left[l(l+1) - \frac{m^2}{\sin^2\vartheta}\right]\Theta = 0   (9.116g)

and the azimuthal equation is

\frac{d^2\Phi}{d\varphi^2} + m^2\Phi = 0 .   (9.116h)

Both equations are independent of the potential, so they are valid for every centrally symmetric potential. We have three requirements for (9.116a): it should tend to zero for r → ∞, and it should be single-valued and quadratically integrable on the surface of the sphere.
3. Solution of the Radial Equation Besides the potential V(r), the radial equation (9.116d) also contains the separation constant l(l+1). We substitute

u_l(r) = r\,R_l(r) ,   (9.117a)

since the square of the function u_l(r) gives the required probability |u_l(r)|^2 dr = |R_l(r)|^2 r^2 dr of the presence of the particle in a spherical shell between r and r + dr. The substitution leads to the one-dimensional Schrödinger equation

\frac{d^2 u_l}{dr^2} + \frac{2m}{\hbar^2}\left[E - V_{\text{eff}}(r)\right]u_l = 0 .   (9.117b)

It contains the effective potential

V_{\text{eff}}(r) = V(r) + \frac{l(l+1)\hbar^2}{2m r^2} ,   (9.117c)

which has two parts. The rotation energy

V_{\text{rot}}(l) = \frac{l(l+1)\hbar^2}{2m r^2}   (9.117d)

is called the centrifugal potential. The physical meaning of l as the orbital angular momentum quantum number follows from the analogy with the classical rotation energy

E_{\text{rot}} = \frac{\vec{L}^2}{2\Theta}   (9.117e)

of a rotating particle with moment of inertia \Theta = m r^2 and orbital angular momentum

\vec{L}^2 = l(l+1)\hbar^2 ,   |\vec{L}| = \hbar\sqrt{l(l+1)} .   (9.117f)
4. Solution of the Polar Equation The polar equation (9.116g), containing both separation constants l(l+1) and m^2, is a Legendre differential equation (9.57a), p. 509. Its solution is denoted by \Theta_l(\vartheta) and can be determined by a power series expansion. Finite, single-valued and continuous solutions exist only for l(l+1) = 0, 2, 6, 12, \dots. For l and m we get

l = 0, 1, 2, \dots ,   |m| \le l ,   (9.118a)

so m can take the (2l+1) values

-l, (-l+1), (-l+2), \dots, (l-2), (l-1), l .   (9.118b)

For m ≠ 0 we get the associated Legendre polynomials, which are defined in the following way:

P_l^m(\cos\vartheta) = \frac{1}{2^l\,l!}\left(1 - \cos^2\vartheta\right)^{m/2}\frac{d^{\,l+m}\left(\cos^2\vartheta - 1\right)^l}{\left(d\cos\vartheta\right)^{l+m}} .   (9.118c)

The Legendre function of the first kind (9.57c), p. 509, is obtained as the special case (m = 0, l = n, \cos\vartheta = x). Its normalization results in the equation

\Theta_l^m(\cos\vartheta) = N_l^m\,P_l^m(\cos\vartheta) .   (9.118d)

5. Solution of the Azimuthal Equation Since the motion of the particle in the potential field V(r) is independent of the azimuthal angle, even in the case of a physically distinguished space direction, e.g., given by a magnetic field, we specialize the general solution \Phi = \alpha e^{im\varphi} + \beta e^{-im\varphi} by fixing

\Phi_m(\varphi) = A\,e^{\pm im\varphi} ,   (9.119a)

because in this case |\Phi_m|^2 is independent of \varphi. The requirement of uniqueness \Phi_m(\varphi + 2\pi) = \Phi_m(\varphi) implies that m can take only the values 0, \pm 1, \pm 2, \dots. From the normalization

\int_0^{2\pi}|\Phi_m(\varphi)|^2\,d\varphi = 1   (9.119b)

it follows that

A = \frac{1}{\sqrt{2\pi}} .   (9.119c)

The quantum number m is called the magnetic quantum number.
6. Complete Solution for the Angular Dependence In accordance with (9.116e), the solutions of the polar and the azimuthal equations are multiplied by each other:

Y_l^m(\vartheta, \varphi) = \Theta_l^m(\vartheta)\,\Phi_m(\varphi) .   (9.120a)

The functions Y_l^m(\vartheta, \varphi) are the so-called surface spherical harmonics. When the radius vector \vec{r} is reflected with respect to the origin (\vec{r} \to -\vec{r}), the angle \vartheta becomes \pi - \vartheta and \varphi becomes \varphi + \pi, so the sign of Y_l^m may change:

Y_l^m(\pi - \vartheta, \varphi + \pi) = (-1)^l\,Y_l^m(\vartheta, \varphi) .   (9.120b)

Thus the parity of the considered wave function is

P = (-1)^l .   (9.121a)
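The surface spherical harmonics (9.120a) are available in scipy; here is a minimal sketch in Python checking the reflection relation (9.120b) numerically (the chosen l, m and angles are illustrative assumptions; note that scipy.special.sph_harm takes the azimuthal angle before the polar angle):

```python
import numpy as np
from scipy.special import sph_harm

l, m = 3, 1
theta, phi = 0.7, 1.9          # polar and azimuthal angle in radians

# scipy's call signature is sph_harm(m, l, azimuthal, polar)
y        = sph_harm(m, l, phi, theta)
y_mirror = sph_harm(m, l, phi + np.pi, np.pi - theta)
print(y_mirror, (-1) ** l * y)          # equal, illustrating (9.120b): parity (-1)^l

# Normalization: integral of |Y_l^m|^2 sin(theta) dtheta dphi over the sphere is 1
th = np.linspace(0.0, np.pi, 401)
ph = np.linspace(0.0, 2.0 * np.pi, 401)
TH, PH = np.meshgrid(th, ph, indexing="ij")
vals = np.abs(sph_harm(m, l, PH, TH)) ** 2 * np.sin(TH)
print(np.trapz(np.trapz(vals, ph, axis=1), th))   # approximately 1
```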
7. Parity The parity property serves to characterize the behavior of the wave function under space inversion \vec{r} \to -\vec{r} (see 4.3.5.1, 1., p. 269). The inversion is performed by the inversion or parity operator \hat{P}: \hat{P}\Psi(\vec{r}, t) = \Psi(-\vec{r}, t). If we denote the eigenvalue of the operator by P, then applying \hat{P} twice must yield the original wave function, \hat{P}\hat{P}\Psi(\vec{r}, t) = P^2\Psi(\vec{r}, t) = \Psi(\vec{r}, t), so

P^2 = 1 ,   P = \pm 1 .   (9.121b)

A wave function is called even if its sign does not change under space inversion, and odd if its sign changes.
6. Linear Harmonic Oscillator
1. Posing the Problem Harmonic oscillation occurs when the restoring force in the oscillator satisfies Hooke's law F = -kx. For the frequency of the oscillation, for the angular frequency and for the potential energy we get:

\nu = \frac{1}{2\pi}\sqrt{\frac{k}{m}} ,   (9.122a)
\omega = \sqrt{\frac{k}{m}} ,   (9.122b)
E_{\text{pot}} = \frac{k}{2}x^2 = \frac{m\omega^2}{2}x^2 .   (9.122c)

Substituting into (9.110a), the Schrödinger equation becomes

\frac{d^2\Psi}{dx^2} + \frac{2m}{\hbar^2}\left(E - \frac{m\omega^2}{2}x^2\right)\Psi = 0 .   (9.123a)

With the substitutions

y = x\sqrt{\frac{m\omega}{\hbar}} ,   (9.123b)   \lambda = \frac{2E}{\hbar\omega} ,   (9.123c)

where \lambda is a parameter and not the wavelength, (9.123a) can be transformed into the simpler form of the Weber differential equation

\frac{d^2\Psi}{dy^2} + (\lambda - y^2)\Psi = 0 .   (9.123d)

2. Solution We seek a solution of the Weber differential equation in the form
\Psi(y) = e^{-y^2/2}\,H(y) .   (9.124a)

Differentiation shows that

\frac{d^2\Psi}{dy^2} = \left(\frac{d^2 H}{dy^2} - 2y\frac{dH}{dy} + (y^2 - 1)H\right)e^{-y^2/2} .   (9.124b)

Substitution into the Schrödinger equation (9.123d) yields

\frac{d^2 H}{dy^2} - 2y\frac{dH}{dy} + (\lambda - 1)H = 0 .   (9.124c)

We determine a solution in the form of a series

H(y) = \sum_{j=0}^{\infty} a_j\,y^j .   (9.125a)

Substitution of (9.125a) into (9.124c) results in

\sum_{j=0}^{\infty}\left[(j+2)(j+1)a_{j+2} - 2j\,a_j + (\lambda - 1)a_j\right]y^j = 0 .   (9.125b)

Comparing the coefficients of y^j we get the recursion formula

(j+2)(j+1)\,a_{j+2} = \left[2j - (\lambda - 1)\right]a_j   (j = 0, 1, 2, \dots).   (9.125c)

The coefficients a_j for even powers of y follow from a_0, those for odd powers from a_1; a_0 and a_1 can be chosen arbitrarily.
3. Physical Solutions We want to determine the probability of the presence of the particle in the different states. It is described by a quadratically integrable wave function \Psi(x), i.e., by an eigenfunction which has physical meaning: it is normalizable and tends to zero for large values of y. The exponential function \exp(-y^2/2) in (9.124a) guarantees that the solution \Psi(y) tends to zero for y → ∞ if the function H(y) is a polynomial. To get a polynomial, the coefficients a_j in (9.125a) must vanish from a certain n on, i.e., for every j > n: a_n ≠ 0, a_{n+1} = a_{n+2} = a_{n+3} = \dots = 0. The recursion formula (9.125c) with j = n,

a_{n+2} = \frac{2n - (\lambda - 1)}{(n+2)(n+1)}\,a_n ,   (9.126a)

can be satisfied with a_{n+2} = 0 and a_n ≠ 0 only if

2n - (\lambda - 1) = 0 ,  i.e.,  \lambda = \frac{2E}{\hbar\omega} = 2n + 1 .   (9.126b)

The coefficients a_{n+2}, a_{n+4}, \dots vanish for this choice of \lambda. Also a_{n-1} = 0 must hold to make the coefficients a_{n+1}, a_{n+3}, \dots equal to zero. With the special choice a_n = 2^n we get the Hermite polynomials from the second defining equation (see 9.1.2.6, 6., p. 512). The first six of them are:

H_0 = 1 ,  H_1 = 2y ,  H_2 = 4y^2 - 2 ,  H_3 = 8y^3 - 12y ,  H_4 = 16y^4 - 48y^2 + 12 ,  H_5 = 32y^5 - 160y^3 + 120y .   (9.126c)
The solution \Psi(y) for the vibration quantum number n is

\Psi_n = N_n\,e^{-y^2/2}H_n(y) ,   (9.127a)

where N_n is the normalizing factor. From the normalization condition \int_{-\infty}^{+\infty}\Psi_n^2\,dx = 1 we get, using (9.123b), p. 542,

N_n^2 = \sqrt{\frac{m\omega}{\pi\hbar}}\,\frac{1}{2^n\,n!} .   (9.127b)

From the terminating condition (9.126b) of the series we get

E_n = \hbar\omega\left(n + \frac{1}{2}\right)   (n = 0, 1, 2, \dots)   (9.127c)

for the eigenvalues of the vibration energy. The spectrum of the energy levels is equidistant. The summand 1/2 in the parentheses means that, in contrast to the classical case, the quantum mechanical oscillator has energy even in the deepest energy level with n = 0, known as the zero-point vibration energy.
Fig. 9.21 shows a graphical representation of the equidistant spectrum of the energy states, the corresponding wave functions from \Psi_0 to \Psi_5 and also the function of the potential energy (9.122c). The points of the parabola of the potential energy represent the reversal points of the classical oscillator, which are calculated from the energy E = \frac{1}{2}m\omega^2 a^2 as the amplitude a = \sqrt{\frac{2E}{m\omega^2}}. The quantum mechanical probability of finding the particle in the interval (x, x + dx) is given by dw_{\text{qu}} = |\Psi(x)|^2 dx; it is different from zero also outside these points.

Figure 9.21

For example, for n = 1, hence for E = \frac{3}{2}\hbar\omega, we get from dw_{\text{qu}} = |\Psi_1(x)|^2\,dx the maximum of the probability of presence at

x = \pm\sqrt{\frac{\hbar}{m\omega}} .   (9.127d)

For the corresponding classical oscillator the probability density is

dw_{\text{cl}} = \frac{dx}{\pi\sqrt{a^2 - x^2}} .   (9.127e)

The quantum mechanical probability density function approaches the classical one, in its mean value, for large quantum numbers n.
9.2.4 Non-Linear Partial Differential Equations: Solitons, Periodic Patterns and Chaos
9.2.4.1 Formulation of the Physical-Mathematical Problem
1. Notion of Solitons Solitons, also called solitary waves, are, from the viewpoint of physics, pulses or localized disturbances of a non-linear medium or field; the energy related to such propagating pulses or disturbances is concentrated in a narrow spatial region. They occur, for example, in conductors, in linear molecules, and in classical and quantum field theory. Solitons have both particle and wave properties; they are localized during their evolution, and the domain of localization, or the point around which the wave is localized, travels like a free particle; in particular, it can also be at rest. A soliton has a permanent wave structure: based on a balance between non-linearity and dispersion, the form of this structure does not change.
Mathematically, solitons are special solutions of certain non-linear partial differential equations occurring in physics, engineering and applied mathematics. Their special features are the absence of any dissipation and the fact that the non-linear terms cannot be handled by perturbation theory. Important examples of equations with soliton solutions are:
a) Korteweg–de Vries (KdV) Equation

u_t + 6 u u_x + u_{xxx} = 0 ,   (9.128)

b) Non-Linear Schrödinger (NLS) Equation

i u_t + u_{xx} \pm 2|u|^2 u = 0 ,   (9.129)

c) Sine-Gordon (SG) Equation

u_{tt} - u_{xx} + \sin u = 0 .   (9.130)

The subscripts x and t denote partial derivatives, e.g., u_{xx} = \partial^2 u/\partial x^2. We consider the one-dimensional case in these equations, i.e., u has the form u = u(x, t), where x is the spatial coordinate and t is the time. The equations are given in scaled form, i.e., the two independent variables x and t are dimensionless quantities. In practical applications, they must be multiplied by quantities having the corresponding dimensions and being characteristic of the given problem. The same holds for the velocity.
2. Interaction between Solitons If two solitons moving with different velocities collide, they reappear after the interaction as if they had not collided. Every soliton asymptotically keeps its form and velocity; there is only a phase shift. Two solitons can thus interact without disturbing each other asymptotically. This is called an elastic interaction, which is equivalent to the existence of an N-soliton solution, where N (N = 1, 2, 3, \dots) is the number of solitons. Solving an initial value problem with a given initial pulse u(x, 0) that disaggregates into solitons, the number of solitons does not depend on the shape of the pulse but on its total amount \int_{-\infty}^{+\infty} u(x, 0)\,dx.
3. Non-Linear Phenomena in Dissipative Systems In dissipative systems (hence systems with friction or damping), periodic patterns and non-linear waves can appear through the action of external forces. One striking example is a fluid (gas or liquid), where the combined action of gravitation and a temperature gradient causes (with increasing temperature difference) a transition from a purely heat-conducting state (without convection) to a very special, namely regular, cell convection state, and finally to turbulence. Depending on the magnitude of the temperature difference, bifurcation and chaos can appear (see 17.3, p. 827). Important examples of equations for such phenomena are:
a) Ginsburg–Landau (GGL) Equation

u_t - u - (1 + i b)u_{xx} + (1 + i c)|u|^2 u = 0 ,   (9.131)

b) Kuramoto–Sivashinsky (KS) Equation

u_t + u_{xx} + u_{xxxx} + u_x^2 = 0 .   (9.132)

In contrast to the dissipation-free KdV, NLS and SG equations, the equations (9.131) and (9.132) are non-linear dissipative equations which have, besides spatio-temporally periodic solutions, also spatio-temporally disordered (chaotic) solutions.
4. Non-Linear Evolution Equation An evolution equation describes the evolution of a physical quantity in time. Examples of such evolution equations are the wave equation (see 9.2.3.2, p. 534), the heat conduction equation (see 9.2.3.3, p. 535) and the Schrödinger equation (see 9.2.3.5, 1., p. 536). The solutions of evolution equations are called evolution functions.
In contrast to linear evolution equations, the non-linear evolution equations (9.128), (9.129) and (9.130) contain the non-linear terms u\,\partial u/\partial x, |u|^2 u and \sin u. These equations are (with the exception of (9.131)) parameter-free. From the viewpoint of physics, non-linear evolution equations describe structure formation: solitons (dispersive structures) as well as periodic patterns and non-linear waves (dissipative structures).
9.2.4.2 Korteweg de Vries Equation (KdV) 1. Occurrance The KdV equation is used in the discussion of problems of plasma physics and non-linear electric networks. 2. Equation and Solutions The KdV equation for the evolution function u is (9.133) t 62121, u,,,= 0. It has the soliton solution u u ( 5 :t ) = . (9.134) 2 cosh' [ f f i ( x - v t - p)]
+
b
-6
-4
-2
0' 2 x-vt-cp
4
6
Figure 9.22
This KdV soliton is uniquely defined by the two dimensionless parameters v (v > 0) and φ. In Fig. 9.22, v = 1 is chosen. A typical non-linear effect is that the velocity v of the soliton determines both the amplitude and the width of the soliton: KdV solitons with larger amplitude and smaller width move faster than those with smaller amplitude and larger width (taller waves travel faster than shorter ones). The soliton phase φ describes the position of the maximum of the soliton at time t = 0. Equation (9.133) also has N-soliton solutions. Such an N-soliton solution can be represented asymptotically for t → ±∞ by the linear superposition of one-soliton solutions:
u(x,t) ≈ Σ_{n=1}^{N} u_n(x,t).   (9.135)
Here every evolution function u_n(x,t) is characterized by a velocity v_n and a phase φ_n. The initial phases φ_n before the interaction or collision differ from the final phases φ_n' after the collision, while the velocities v_1, v_2, ..., v_N do not change, i.e., it is an elastic interaction. For N = 2, (9.133) has a two-soliton solution. It cannot be represented for a finite time by a linear superposition, and with k_n = (1/2)√(v_n) and α_n = (1/2)√(v_n) (x - v_n t - φ_n) (n = 1, 2) it has the form (9.136).
Equation (9.136) describes for t → -∞ asymptotically two non-interacting solitons with velocities v_1 = 4k_1^2 and v_2 = 4k_2^2, which transform after their mutual interaction again into two non-interacting solitons with the same velocities for t → +∞. The non-linear evolution equation
w_t + 6(w_x)^2 + w_xxx = 0,   (9.137a)
where w = F_x / F, has the following properties:
a) For F(x,t) = 1 + e^α with α = (1/2)√v (x - vt - φ)   (9.137b)
it has a soliton solution, and
b) for F(x,t) = 1 + e^{α_1} + e^{α_2} + ((k_1 - k_2)/(k_1 + k_2))^2 e^{α_1 + α_2}   (9.137c)
it has a two-soliton solution. With 2w_x = u the KdV equation (9.133) follows from (9.137a). Equation (9.136) and the expression w following from (9.137c) are examples of a non-linear superposition. If the term +6u u_x is replaced by -6u u_x in (9.133), then the right-hand side of (9.134) has to be multiplied by (-1). In this case the notation antisoliton is used.
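The one-soliton formula (9.134) can be checked numerically. The following Python sketch assumes v = 1 and φ = 0 (the values of Fig. 9.22), evaluates u(x,t) on a grid and confirms that the finite-difference residual of the KdV equation (9.133) is small.

    import numpy as np

    v, phi = 1.0, 0.0                     # soliton velocity and phase (assumed values)
    dx, dt = 1e-3, 1e-6
    x = np.arange(-15.0, 15.0, dx)

    def u(x, t):
        """KdV one-soliton (9.134): u = v / (2 cosh^2[(sqrt(v)/2)(x - v t - phi)])."""
        return v / (2.0 * np.cosh(0.5 * np.sqrt(v) * (x - v*t - phi))**2)

    u0, up, um = u(x, 0.0), u(x, dt), u(x, -dt)
    u_t = (up - um) / (2*dt)                          # central difference in t
    u_x = np.gradient(u0, dx)
    u_xxx = np.gradient(np.gradient(u_x, dx), dx)

    residual = u_t + 6.0*u0*u_x + u_xxx               # left-hand side of (9.133)
    print(np.max(np.abs(residual[5:-5])))             # small; (9.134) solves (9.133)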
9.2.4.3 Non-Linear Schrödinger Equation (NLS)
1. Occurrence The NLS equation occurs in non-linear optics, where the refractive index n depends on the electric field strength E, as, e.g., for the Kerr effect, where n(E) = n_0 + n_2 |E|^2 with n_0, n_2 = constant holds, and in the hydrodynamics of self-gravitating discs, which allows us to describe galactic spiral arms.
2. Equation and Solution The NLS equation for the evolution function u and its soliton solution are:
i u_t + u_xx ± 2|u|^2 u = 0,   (9.138)
u(x,t) = 2η e^{-i[2ξx - 4(η^2 - ξ^2)t + χ]} / cosh[2η(x + 4ξt - φ)].   (9.139)
Here u(x,t) is complex. The NLS soliton is characterized by the four dimensionless parameters η, ξ, φ, and χ. The envelope of the wave packet moves with the velocity -4ξ; the phase velocity of the wave packet is 2(η^2 - ξ^2)/ξ. In contrast to the KdV soliton (9.134), the amplitude and the velocity can be chosen independently of each other. In the case of N interacting solitons, we can characterize them by 4N arbitrarily chosen parameters η_n, ξ_n, φ_n, χ_n (n = 1, 2, ..., N). If the solitons have different velocities, the N-soliton solution splits asymptotically for t → ±∞ into a sum of N individual solitons of the form (9.139). Fig. 9.23 displays the real part of (9.139) with v = -4ξ, η = 1/2 and ξ = 2/5.
Figure 9.23
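A similar numerical check applies to (9.139). The sketch below assumes η = 1/2 and ξ = 2/5 (as in Fig. 9.23) and φ = χ = 0; the sign convention of the phase is the one for which (9.139) solves (9.138) with the '+' sign.

    import numpy as np

    eta, xi, phi, chi = 0.5, 0.4, 0.0, 0.0     # eta, xi as in Fig. 9.23; phi, chi assumed 0
    dx, dt = 1e-3, 1e-6
    x = np.arange(-20.0, 20.0, dx)

    def u(x, t):
        """NLS one-soliton (9.139) for i u_t + u_xx + 2|u|^2 u = 0."""
        phase = -(2*xi*x - 4*(eta**2 - xi**2)*t + chi)
        return 2*eta * np.exp(1j*phase) / np.cosh(2*eta*(x + 4*xi*t - phi))

    u0, up, um = u(x, 0.0), u(x, dt), u(x, -dt)
    u_t = (up - um) / (2*dt)
    u_xx = np.gradient(np.gradient(u0, dx), dx)

    residual = 1j*u_t + u_xx + 2*np.abs(u0)**2*u0
    print(np.max(np.abs(residual[5:-5])))      # small; the envelope travels with velocity -4*xi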
9.2.4.4 Sine-Gordon Equation (SG)
1. Occurrence The SG equation is obtained from the Bloch equation for spatially inhomogeneous quantum mechanical two-level systems. It describes the propagation of spin waves in superfluid helium-3 (3He). The soliton solutions of the SG equation can be illustrated by a mechanical model of pendula and springs. The evolution function goes continuously from 0 to a constant value c; such SG solitons are often called kink solitons.
2. Equation and Solution The SG equation for the evolution function u is
u_tt - u_xx + sin u = 0.   (9.140)
It has the following soliton solutions:
1. Kink Soliton
u(x,t) = 4 arctan e^{γ(x - x_0 - vt)},   (9.141)
where γ = 1/√(1 - v^2) and -1 < v < +1.
The kink soliton (9.141) for v = 1/2 is shown in Fig. 9.24. The kink soliton is determined by the two dimensionless parameters v and x_0. The velocity is independent of the amplitude. The time and position derivatives are ordinary bell-shaped localized solitons:
u_t = -v u_x = -2γv / cosh[γ(x - x_0 - vt)].   (9.142)
Figure 9.24
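That (9.141) solves (9.140) can again be verified numerically. The sketch below assumes v = 1/2 (as in Fig. 9.24) and x_0 = 0.

    import numpy as np

    v, x0 = 0.5, 0.0                       # kink velocity (Fig. 9.24) and assumed position
    gamma = 1.0 / np.sqrt(1.0 - v**2)
    dx, dt = 1e-3, 1e-3
    x = np.arange(-15.0, 15.0, dx)

    def u(x, t):
        """SG kink (9.141): u = 4 arctan exp[gamma (x - x0 - v t)]."""
        return 4.0 * np.arctan(np.exp(gamma * (x - x0 - v*t)))

    u0 = u(x, 0.0)
    u_tt = (u(x, dt) - 2*u0 + u(x, -dt)) / dt**2
    u_xx = np.gradient(np.gradient(u0, dx), dx)

    residual = u_tt - u_xx + np.sin(u0)    # left-hand side of (9.140)
    print(np.max(np.abs(residual[5:-5])))  # small; (9.141) solves (9.140)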
If the evolution function changes from the constant value c to 0, it describes a so-called antikink soliton. Walls of domain structures can be described with this type of solution.
2. Antikink Soliton
u(x,t) = 4 arctan e^{-γ(x - x_0 - vt)}.   (9.143)
3. Kink-Antikink Soliton We get a static kink-antikink soliton from (9.141), (9.143) with v = 0:
u(x,t) = 4 arctan e^{±(x - x_0)}.   (9.144)
Further solutions of (9.140) are:
4. Kink-Kink Collision
u(x,t) = 4 arctan[ v sinh(γx) / cosh(γvt) ].   (9.145)
5. Kink-Antikink Collision
u(x,t) = 4 arctan[ sinh(γvt) / (v cosh(γx)) ].   (9.146)
6. Double or Breather Soliton, also called Kink-Antikink Doublet
u(x,t) = 4 arctan[ √(1 - ω^2) sin(ωt) / (ω cosh(√(1 - ω^2) x)) ].   (9.147)
Equation (9.147) represents a stationary wave whose envelope is modulated with the frequency ω.
7. Local Periodic Kink Lattice
u(x,t) = 2 arcsin[ ±k sn( γ(x - vt)/k, k ) ] + π.   (9.148a)
The relation between the wavelength λ and the lattice constant k is
λ = 4 K(k) k √(1 - v^2).   (9.148b)
For k = 1, i.e., for λ → ∞, we get
u(x,t) = 4 arctan e^{±γ(x - vt)},   (9.148c)
which is the kink soliton (9.141) and the antikink soliton (9.143) again, with x_0 = 0.
Remark: sn x is a Jacobian elliptic function with parameter k and quarter-period K (see 14.6.2, p. 701):
sn x = sin φ(x, k).   (9.149a)
Equation (9.149b) comes from (14.102a), p. 701, by the substitution sin α = q. The series expansion of the complete elliptic integral is given as equation (8.104), p. 460.
9.2.4.5 Further Non-Linear Evolution Equations with Soliton Solutions
1. Modified KdV Equation
u_t ± 6u^2 u_x + u_xxx = 0.   (9.150)
The even more general equation
u_t + u^p u_x + u_xxx = 0   (9.151)
has the soliton
u(x,t) = [ (p+1)(p+2) v / (2 cosh^2[ (p/2)√v (x - vt - φ) ]) ]^{1/p}   (9.152)
as its solution.
2. Sinh-Gordon Equation
u_tt - u_xx + sinh u = 0.   (9.153)
3. Boussinesq Equation
u_xx - u_tt + (u^2)_xx + u_xxxx = 0.   (9.154)
This equation occurs in the description of non-linear electric networks as a continuous approximation of the charge-voltage relation.
4. Hirota Equation
i u_t + 3iα|u|^2 u_x + β u_xx + iσ u_xxx + δ|u|^2 u = 0,   αβ = σδ.   (9.155)
5. Burgers Equation
u_t - u_xx + 2u u_x = 0.   (9.156)
This equation occurs when modeling turbulence. With the Hopf-Cole transformation it is transformed into the diffusion equation, i.e., into a linear differential equation.
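The Hopf-Cole transformation can be checked symbolically. For the normalization used in (9.156) it reads u = -ψ_x/ψ; the sketch below takes a concrete solution ψ of the heat equation ψ_t = ψ_xx (the particular ψ is an assumed example) and confirms that u solves (9.156).

    import sympy as sp

    x, t, a = sp.symbols('x t a')

    # a concrete solution of the heat equation psi_t = psi_xx
    psi = 1 + sp.exp(a**2*t - a*x)
    assert sp.simplify(sp.diff(psi, t) - sp.diff(psi, x, 2)) == 0

    # Hopf-Cole transformation for the normalization of (9.156): u = -psi_x / psi
    u = -sp.diff(psi, x) / psi

    # residual of the Burgers equation (9.156)
    residual = sp.diff(u, t) - sp.diff(u, x, 2) + 2*u*sp.diff(u, x)
    print(sp.simplify(residual))           # 0: u solves (9.156)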
6. Kadomtsev-Petviashvili Equation The equation
(u_t + 6u u_x + u_xxx)_x = u_yy   (9.157a)
has the soliton (9.157b) as its solution. The equation (9.157a) is an example of a soliton equation with a higher number of independent variables, e.g., with two spatial variables.
10 Calculus of Variations
10.1 Defining the Problem
1. Extremum of an Integral Expression A very important problem of the differential calculus is to determine for which x values a given function y(x) has extreme values. The calculus of variations discusses the following problem: For which functions does a certain integral, whose integrand depends also on the unknown function and its derivatives, take an extreme value? The calculus of variations concerns itself with determining all the functions y(x) for which the integral expression
I[y] = ∫_a^b F(x, y(x), y'(x), ..., y^(n)(x)) dx   (10.1)
has an extremum, if the functions y(x) are from a previously given class of functions. Here, boundary and side conditions may be prescribed for y(x) and for its derivatives.
2. Integral Expressions of Variational Calculus There can also be several variables instead of x in (10.1). In this case the occurring derivatives are partial derivatives and the integral in (10.1) is a multiple integral. In the calculus of variations the following types of integral expressions are discussed:
Here the unknown function is u = u(x, y), and R represents a plane domain of integration.
The unknown function is u = u(x, y, z), and R represents a spatial region of integration. Additionally, boundary values can be given for the solution of a variational problem: at the endpoints of the interval a and b in the one-dimensional case, and on the boundary of the domain of integration R in the two-dimensional case. Besides, various further side conditions can be defined, e.g., in integral form or as a differential equation. A variational problem is called first-order or higher-order depending on whether the integrand F contains only the first derivative y' or higher derivatives y^(n) (n > 1) of the function y.
3. Parametric Representation of the Variational Problem A variational problem can also be posed in parametric form. If we consider a curve in parametric form x = x(t), y = y(t) (α ≤ t ≤ β), then, e.g., the integral expression (10.2) has the form
I[x, y] = ∫_α^β F(x(t), y(t), ẋ(t), ẏ(t)) dt.   (10.7)
10.2 Historical Problems
10.2.1 Isoperimetric Problem
The general isoperimetric problem is to determine the plane region with the largest area among the plane regions with a given perimeter. The solution of this problem, a circle with the given perimeter, originates from queen Dido, who was allowed, as legend has it, to take as much land for the foundation of Carthage as she could surround with one bull's hide. She cut the hide into fine strips and formed a circle with them. A special case of the isoperimetric problem is to find the equation of the curve y = y(x) in a Cartesian coordinate system connecting the points A(a, 0) and B(b, 0) and having the given length l, for which the area determined by the line segment and the curve is the largest possible (Fig. 10.1).
Figure 10.1
The mathematical formalization is: We have to determine a once continuously differentiable function y(x) such that
I[y] = ∫_a^b y(x) dx = max   (10.8a)
holds, where the side condition (10.8b) and the boundary conditions (10.8c) are satisfied:
G[y] = ∫_a^b √(1 + y'^2) dx = l,   (10.8b)        y(a) = y(b) = 0.   (10.8c)
10.2.2 Brachistochrone Problem
The brachistochrone problem was formulated in 1696 by J. Bernoulli, and it is the following: A point mass descends from the point P_0(x_0, y_0) to the origin in the vertical x,y plane only under the influence of gravity. We should determine the curve y = y(x) along which the point reaches the origin in the shortest possible time from P_0 (Fig. 10.2). Considering the formula for the time of fall T, we get the mathematical description: We have to determine a once continuously differentiable function y = y(x) for which
I[y] = ∫_0^{x_0} √( (1 + y'^2) / (2g(y_0 - y)) ) dx = min,   (10.9)
(g is the acceleration due to gravity) and the boundary conditions are
y(0) = 0,   y(x_0) = y_0.   (10.10)
We see that there is a singularity at x = x_0 in (10.9).
Figure 10.2
10.3 Variational Problems of One Variable
10.3.1 Simple Variational Problems and Extremal Curves
A simple variational problem is to determine the extreme value of the integral expression given in the form
I[y] = ∫_a^b F(x, y(x), y'(x)) dx,   (10.11)
where y(x) is a twice continuously differentiable function satisfying the boundary conditions y(a) = A and y(b) = B. The values a , b and A , B , and the function F are given.
The integral expression (10.11) is an example of a so-called functional. A functional assigns a real number to every function y(x) from a certain class of functions. If the functional I[y] in (10.11) takes, e.g., its relative maximum for a function y_0(x), then
I[y_0] ≥ I[y]   (10.12)
for every twice continuously differentiable function y satisfying the boundary conditions. The curve y = y_0(x) is called an extremal curve. Sometimes all the solutions of the Euler differential equation of the variational calculus are called extremal curves.
10.3.2 Euler Differential Equation of the Variational Calculus
We get a necessary condition for the solution of the variational problem in the following way: We construct an auxiliary or comparison curve for the extremal y_0(x) characterized by (10.12),
y(x) = y_0(x) + ε η(x),   (10.13)
with a twice continuously differentiable function η(x) satisfying the special boundary conditions η(a) = η(b) = 0; ε is a real parameter. Substituting (10.13) into (10.11) we get a function depending on ε instead of the functional I[y]:
I(ε) = ∫_a^b F(x, y_0 + εη, y_0' + εη') dx,   (10.14)
and the functional I[y] has an extreme value for y_0(x) if the function I(ε), as a function of ε, has an extreme value for ε = 0. Thus we reduce the variational problem to an extreme value problem with the necessary condition
dI/dε = 0 for ε = 0.   (10.15)
Supposing that the function F, as a function of three independent variables, is differentiable as many times as needed, by its Taylor expansion we get (see 7.3.3.3, p. 415)
I(ε) = I(0) + ε ∫_a^b ( ∂F/∂y · η + ∂F/∂y' · η' ) dx + ... .   (10.16)
The necessary condition (10.15) results in the equation
∫_a^b ( ∂F/∂y · η + ∂F/∂y' · η' ) dx = 0.   (10.17)
By partial integration of this equation and considering the boundary conditions for η(x), we get
∫_a^b ( ∂F/∂y - d/dx (∂F/∂y') ) η dx = 0.   (10.18)
From the assumption of continuity and because the integral in (10.18) must vanish for every admissible η(x),
∂F/∂y - d/dx (∂F/∂y') = 0   (10.19)
must hold. The equation (10.19) gives a necessary condition for the simple variational problem, and it is called the Euler differential equation of the calculus of variations. The differential equation (10.19) can be written in the form
∂F/∂y - ∂²F/∂x∂y' - ∂²F/∂y∂y' · y' - ∂²F/∂y'² · y'' = 0.   (10.20)
It is an ordinary second-order differential equation if F_{y'y'} ≠ 0 holds.
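The Euler equation (10.19) can be generated symbolically for a given integrand. The sketch below applies SymPy's euler_equations to the arclength integrand F = √(1 + y'^2); the resulting equation is equivalent to y'' = 0, i.e., the extremals are straight lines (compare example A below).

    import sympy as sp
    from sympy.calculus.euler import euler_equations

    x = sp.Symbol('x')
    y = sp.Function('y')

    # integrand of the arclength functional I[y] = ∫ sqrt(1 + y'^2) dx
    F = sp.sqrt(1 + sp.diff(y(x), x)**2)

    eq = euler_equations(F, [y(x)], [x])[0]   # Euler differential equation (10.19)
    print(sp.simplify(eq))                    # equivalent to y''(x) = 0: straight lines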
The Euler differential equation has a simpler form in the following special cases:
Case 1: F(x, y, y') = F(y'), i.e., x and y do not appear explicitly. Then instead of (10.19) we get
∂²F/∂y'² · y'' = 0   (10.21a)
and
d/dx (∂F/∂y') = 0.   (10.21b)
Case 2: F(x, y, y') = F(y, y'), i.e., x does not appear explicitly. We consider
d/dx (F - y' ∂F/∂y') = ∂F/∂y · y' + ∂F/∂y' · y'' - y'' ∂F/∂y' - y' d/dx (∂F/∂y') = y' ( ∂F/∂y - d/dx (∂F/∂y') ),   (10.22a)
and because of (10.19) we get
d/dx (F - y' ∂F/∂y') = 0,   (10.22b)
i.e.,
F - y' ∂F/∂y' = c (c const)   (10.22c)
as a necessary condition for the solution of the simple variational problem in the case F = F(y, y').
■ A: The functional to determine the shortest curve connecting the points P_1(a, A) and P_2(b, B) in the x,y plane is:
I[y] = ∫_a^b √(1 + y'^2) dx = min.   (10.23a)
It follows from (10.21b) for F = F(y') = √(1 + y'^2) that
y' / √(1 + y'^2) = const,   (10.23b)
so y'' = 0, i.e., the shortest curve is the straight line.
■ B: We connect the points P_1(a, A) and P_2(b, B) by a curve y(x), and we rotate it around the x-axis. Then the surface area is
I[y] = 2π ∫_a^b y √(1 + y'^2) dx.   (10.24a)
For which curve y(x) will the surface area be the smallest? It follows from (10.22c) with F = F(y, y') = 2πy√(1 + y'^2) that 2πy/√(1 + y'^2) = c, i.e., y'^2 = (y/c_1)^2 - 1 with c_1 = c/(2π). This differential equation is separable (see 9.2.2.3, 1., p. 522), and its solution is
y = c_1 cosh( (x + c_2)/c_1 )   (c_1, c_2 const),   (10.24b)
the equation of the so-called catenary curve (see 2.15.1, p. 105). We determine the constants c_1 and c_2 from the boundary values y(a) = A and y(b) = B. We have to solve a non-linear equation system (see 19.2.2, p. 893), which cannot be solved for every boundary value.
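For given boundary data the constants c_1 and c_2 of the catenary (10.24b) can be found iteratively. The sketch below assumes the data a = 0, b = 1, A = B = 1 and uses scipy.optimize.fsolve on the two boundary equations.

    import numpy as np
    from scipy.optimize import fsolve

    a, b, A, B = 0.0, 1.0, 1.0, 1.0        # assumed boundary data

    def equations(p):
        c1, c2 = p
        return (c1*np.cosh((a + c2)/c1) - A,
                c1*np.cosh((b + c2)/c1) - B)

    c1, c2 = fsolve(equations, x0=(1.0, -0.5))   # starting guess
    print(c1, c2)                                # catenary y = c1*cosh((x + c2)/c1)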
10.3.3 Variational Problems with Side Conditions
These problems are usually isoperimetric problems (see 10.2.1, p. 551): The simple variational problem (see 10.3.1, p. 551), given by the functional (10.11), is completed by a further side condition in the form
∫_a^b G(x, y(x), y'(x)) dx = l   (l const),   (10.25)
where the constant l and the function G are given. A method to solve this problem is given by Lagrange (extreme values with side conditions in equation form, see 6.2.5.6, p. 401). We consider the expression
H(x, y(x), y'(x), λ) = F(x, y(x), y'(x)) + λ G(x, y(x), y'(x)),   (10.26)
where λ is a parameter, and we consider the problem
∫_a^b H(x, y(x), y'(x), λ) dx = extreme!,   (10.27)
i.e., an extreme value problem without side condition. The corresponding Euler differential equation is:
∂H/∂y - d/dx (∂H/∂y') = 0.   (10.28)
The solution y = y(x, λ) depends on the parameter λ, which can be determined by substituting y(x, λ) into the side condition (10.25).
■ For the isoperimetric problem 10.2.1, p. 551, we get
H(x, y(x), y'(x), λ) = y + λ√(1 + y'^2).   (10.29a)
Because the variable x does not appear in H, we get instead of the Euler differential equation (10.28), analogously to (10.22c), the differential equation
y + λ/√(1 + y'^2) = c_1   or   y'^2 = λ^2/(c_1 - y)^2 - 1   (c_1 const),   (10.29b)
whose solution is the family of circles
(x - c_2)^2 + (y - c_1)^2 = λ^2   (c_1, c_2, λ const).   (10.29c)
The values c_1, c_2 and λ are determined from the conditions y(a) = 0, y(b) = 0 and from the requirement that the arc length between A and B should be l. We get a non-linear equation for λ, which should be solved by an appropriate iterative method.
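Determining λ from the prescribed arc length is a scalar root-finding problem: for a circular arc of radius λ over the chord from A(a,0) to B(b,0) with arc length l, chord and arc are related by b - a = 2λ sin(l/(2λ)). The sketch below assumes a = 0, b = 1, l = 2 and solves this equation with scipy.optimize.brentq.

    import numpy as np
    from scipy.optimize import brentq

    a, b, l = 0.0, 1.0, 2.0                 # assumed chord endpoints and arc length (l > b - a)

    def f(lam):
        # chord length of a circular arc of radius lam and length l, minus (b - a)
        return 2.0*lam*np.sin(l/(2.0*lam)) - (b - a)

    lam = brentq(f, 0.35, 100.0)            # bracket chosen for these data
    print(lam)                               # radius of the extremal circular arc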
10.3.4 Variational Problems with Higher-Order Derivatives
We consider two types of problems.
1. F = F(x, y, y', y'') The variational problem is:
I[y] = ∫_a^b F(x, y, y', y'') dx = extreme!   (10.30a)
with the boundary values
y(a) = A, y(b) = B, y'(a) = A', y'(b) = B',   (10.30b)
where the numbers a, b, A, B, A', and B', and the function F are given. Similarly as in 10.3.2, p. 552, we introduce comparison curves y(x) = y_0(x) + ε η(x) with η(a) = η(b) = η'(a) = η'(b) = 0, and we get the Euler differential equation
∂F/∂y - d/dx (∂F/∂y') + d²/dx² (∂F/∂y'') = 0   (10.31)
as a necessary condition for the solution of the variational problem (10.30a). The differential equation (10.31) represents a fourth-order differential equation. Its general solution contains four arbitrary constants, which can be determined from the boundary values (10.30b).
■ Consider the problem
I[y] = ∫_a^b (y''^2 - α y'^2 - β y^2) dx = extreme!   (10.32a)
with the given constants α and β, i.e., F = F(y, y', y'') = y''^2 - α y'^2 - β y^2. Then F_y = -2βy, F_y' = -2αy', F_y'' = 2y'', d/dx (F_y') = -2αy'', d²/dx² (F_y'') = 2y^(4), and the Euler differential equation is
y^(4) + α y'' - β y = 0.   (10.32b)
This is a fourth-order linear differential equation with constant coefficients (see 9.1.2.3, p. 498).
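For concrete constants, the general solution of (10.32b) follows from the characteristic equation k^4 + αk^2 - β = 0. The sketch below assumes the sample values α = 1, β = 6 and lets SymPy produce the four-parameter general solution.

    import sympy as sp

    x = sp.Symbol('x')
    y = sp.Function('y')
    alpha, beta = 1, 6                      # assumed sample constants

    # Euler equation (10.32b): y'''' + alpha*y'' - beta*y = 0
    ode = sp.Eq(sp.diff(y(x), x, 4) + alpha*sp.diff(y(x), x, 2) - beta*y(x), 0)
    print(sp.dsolve(ode, y(x)))
    # k^2 = 2, -3: y = C1*exp(-sqrt(2)*x) + C2*exp(sqrt(2)*x) + C3*sin(sqrt(3)*x) + C4*cos(sqrt(3)*x)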
2. F = F(x, y, y', ..., y^(n)) In this general case, when the functional I[y] of the variational problem depends on the derivatives of the unknown function y up to order n (n ≥ 1), the corresponding Euler differential equation is
∂F/∂y - d/dx (∂F/∂y') + d²/dx² (∂F/∂y'') - ... + (-1)^n d^n/dx^n (∂F/∂y^(n)) = 0,   (10.33)
whose solution should satisfy the boundary conditions analogously to (10.30b) up to order n - 1.
10.3.5 Variational Problem with Several Unknown Functions
Suppose the functional of the variational problem has the form
I[y_1, y_2, ..., y_n] = ∫_a^b F(x, y_1, y_2, ..., y_n, y_1', y_2', ..., y_n') dx,   (10.34)
where the unknown functions y_1(x), y_2(x), ..., y_n(x) should take given values at x = a and x = b. We consider n twice continuously differentiable comparison functions
y_i(x) = y_i0(x) + ε_i η_i(x)   (i = 1, 2, ..., n),   (10.35)
where the functions η_i(x) should vanish at the endpoints. (10.34) becomes I(ε_1, ε_2, ..., ε_n) with (10.35), and from the necessary conditions
∂I/∂ε_i = 0   (i = 1, 2, ..., n)   (10.36)
for the extreme values of a function of several variables, we get the n Euler differential equations
∂F/∂y_i - d/dx (∂F/∂y_i') = 0   (i = 1, 2, ..., n),   (10.37)
whose solutions y_1(x), y_2(x), ..., y_n(x) must satisfy the given boundary conditions.
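The system (10.37) can also be generated symbolically. As an assumed example, the sketch below applies SymPy's euler_equations to the integrand F = y_1'^2 + y_2'^2 + 2y_1y_2, which gives the coupled equations y_1'' = y_2 and y_2'' = y_1.

    import sympy as sp
    from sympy.calculus.euler import euler_equations

    x = sp.Symbol('x')
    y1, y2 = sp.Function('y1'), sp.Function('y2')

    # assumed sample integrand with two unknown functions
    F = sp.diff(y1(x), x)**2 + sp.diff(y2(x), x)**2 + 2*y1(x)*y2(x)

    for eq in euler_equations(F, [y1(x), y2(x)], [x]):
        print(eq)                           # 2*y2 - 2*y1'' = 0 and 2*y1 - 2*y2'' = 0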
10.3.6 Variational Problems Using Parametric Representation
For some variational problems it is useful to determine the extremal not in the explicit form y = y(x), but in the parametric form
x = x(t), y = y(t)   (t_1 ≤ t ≤ t_2),   (10.38)
where t_1 and t_2 are the parameter values corresponding to the points (a, A) and (b, B). Then the simple variational problem (see 10.3.1, p. 551) is
I[x, y] = ∫_{t_1}^{t_2} F(x(t), y(t), ẋ(t), ẏ(t)) dt = extreme!   (10.39a)
with the boundary conditions
x(t_1) = a, x(t_2) = b, y(t_1) = A, y(t_2) = B.   (10.39b)
Here ẋ and ẏ denote the derivatives of x and y with respect to the parameter t, as usual in the parametric representation. The variational problem (10.39a) makes sense only if the value of the integral is independent of the parametric representation of the extremal curve. To ensure that the integral in (10.39a) is independent of the parametric representation of the curve connecting the points (a, A) and (b, B), F must be a positive homogeneous function, i.e.,
F(x, y, μẋ, μẏ) = μ F(x, y, ẋ, ẏ)   (μ > 0)   (10.40)
must hold. Because the variational problem (10.39a) can be considered as a special case of (10.34), the corresponding Euler differential equations are
∂F/∂x - d/dt (∂F/∂ẋ) = 0,   ∂F/∂y - d/dt (∂F/∂ẏ) = 0.   (10.41)
They are not independent of each other, but they are equivalent to the so-called Weierstrass form of the Euler differential equation (10.42a) with (10.42b). Starting with the calculation of the radius of curvature R of a curve given in parametric representation (see 3.6.1.1, 1., p. 225), we calculate the radius of curvature of the extremal curve considering (10.42a) with (10.42c).
The isoperimetric problem (10.8a to 10.8c) (see 10.2.1, p. 551) has the following form in parametric representation:
I[x, y] = ∫_{t_1}^{t_2} y(t) ẋ(t) dt = max!   (10.43a)
with
G[x, y] = ∫_{t_1}^{t_2} √(ẋ^2 + ẏ^2) dt = l.   (10.43b)
This variational problem with the side condition becomes a variational problem without the side condition according to (10.26) with
H = H(x, y, ẋ, ẏ) = y ẋ + λ √(ẋ^2 + ẏ^2).   (10.43c)
We see that H satisfies the condition (10.40), so it is a positive homogeneous function of first degree. Furthermore, we have
H_ẋy = 1,   (10.43d)
so (10.42c) yields that the radius of curvature is R = |λ|. Since λ is a constant, the extremals are circles.
10.4 Variational Problems with Functions of Several Variables
10.4.1 Simple Variational Problem
One of the simplest problems with a function of several variables is the following variational problem for a double integral:
I[u] = ∬_(G) F(x, y, u(x,y), u_x, u_y) dxdy = extreme!   (10.44)
Here the unknown function u = u(x, y) should take given values on the boundary Γ of the domain G. Analogously to 10.3.2, p. 552, we introduce the comparison function in the form
u(x, y) = u_0(x, y) + ε η(x, y),   (10.45)
where u_0(x, y) is a solution of the variational problem (10.44) and takes the given boundary values, while η(x, y) satisfies the condition
η(x, y) = 0 on the boundary Γ   (10.46)
and, together with u_0(x, y), is differentiable as many times as needed. The quantity ε is a parameter. By u = u(x, y) we determine a surface which is close to the solution surface u_0(x, y). I[u] becomes I(ε) with (10.45), i.e., the variational problem (10.44) becomes an extreme value problem which must satisfy the necessary condition
dI/dε = 0 for ε = 0.   (10.47)
We get from this the Euler differential equation
∂F/∂u - ∂/∂x (∂F/∂u_x) - ∂/∂y (∂F/∂u_y) = 0   (10.48)
as a necessary condition for the solution of the variational problem (10.44).
■ A free membrane, fixed at the perimeter Γ of a domain G of the x,y plane, covers a surface with area
I_1 = ∬_(G) dxdy.   (10.49a)
If the membrane is deformed by a load so that every point has an elongation u = u(x, y) perpendicular to the x,y plane, then its area is calculated by the formula
I_2 = ∬_(G) √(1 + u_x^2 + u_y^2) dxdy.   (10.49b)
If we linearize the integrand in (10.49b) using a Taylor series (see 6.2.2.3, p. 394), then we get the relation
I_2 - I_1 ≈ (1/2) ∬_(G) (u_x^2 + u_y^2) dxdy   (10.49c)
and
V = (σ/2) ∬_(G) (u_x^2 + u_y^2) dxdy   (10.49d)
for the potential energy V of the deformed membrane, where the constant σ denotes the tension of the membrane. We obtain the so-called Dirichlet variational problem in this way: We have to determine the function u = u(x, y) so that the functional
I[u] = ∬_(G) (u_x^2 + u_y^2) dxdy   (10.49e)
should have an extremum, and u vanishes on the boundary Γ of the plane domain G. The corresponding Euler differential equation is
∂²u/∂x² + ∂²u/∂y² = 0.   (10.49f)
It is the Laplace differential equation for functions of two variables (see 13.5.1, p. 667).
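The step from (10.49e) to (10.49f) can be reproduced symbolically; the sketch below applies SymPy's euler_equations to the Dirichlet integrand u_x^2 + u_y^2.

    import sympy as sp
    from sympy.calculus.euler import euler_equations

    x, y = sp.symbols('x y')
    u = sp.Function('u')

    F = sp.diff(u(x, y), x)**2 + sp.diff(u(x, y), y)**2   # Dirichlet integrand (10.49e)

    print(euler_equations(F, [u(x, y)], [x, y]))
    # [Eq(-2*u_xx - 2*u_yy, 0)], i.e. the Laplace equation (10.49f)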
10.4.2 More General Variational Problems
We consider two generalizations of the simple variational problem.
1. F = F(x, y, u(x,y), u_x, u_y, u_xx, u_xy, u_yy) The functional depends on higher-order partial derivatives of the unknown function u(x, y). If the partial derivatives occur up to second order, then the Euler differential equation is:
∂F/∂u - ∂/∂x (∂F/∂u_x) - ∂/∂y (∂F/∂u_y) + ∂²/∂x² (∂F/∂u_xx) + ∂²/∂x∂y (∂F/∂u_xy) + ∂²/∂y² (∂F/∂u_yy) = 0.   (10.50)
2. F = F(x_1, x_2, ..., x_n, u(x_1, ..., x_n), u_x1, ..., u_xn) In the case of a variational problem with n independent variables x_1, x_2, ..., x_n, the Euler differential equation is:
∂F/∂u - Σ_{i=1}^{n} ∂/∂x_i (∂F/∂u_xi) = 0.   (10.51)
10.5 Numerical Solution of Variational Problems Most often two ways are used to solve variational problems in practice.
1. Solution of the Euler Differential Equation and Fitting the Found Solution to
the Boundary Conditions Usually, an exact solution of the Euler differential equation is possible only in the simplest cases, so we have to use a numerical method to solve the boundary value problem for ordinary or partial differential equations (see 19.5, p. 908, or 20.4.4, p. 989ff).
2. Direct Methods The direct methods start directly from the variational problem and do not use the Euler differential equation. The most popular and probably the oldest procedure is the Ritz method. It belongs to the so-called approximation methods, which are also used for approximate solutions of differential equations (see 19.4.2.2, p. 906 and 19.5.2, p. 909), and we demonstrate it with the following example.
■ Solve numerically the isoperimetric problem
1’
y”(z)dx = extreme! (10.52a)
for
1’
y2(z)dz = 1 and y(0) = y(1) = 0. (10.52b)
The corresponding variational problem without side condition according to 10.3.3, p. 553, is: I[y] =
s1
[y”(z)dx - Xy2(x)] = extreme!
(10.52c)
\Ye want to find the best solution of the form (10.52d) y(z) = alz(x - 1) + a2x2(x - 1). Both approximation functions z(z-1) and x2(x-1) are linearly independent, and satisfy the boundary conditions. (10.52~)is reduced with (10.52d) to 1 2 1 (10.52e) I(a1, al) = -a; + -ai t -ala2 3 15 3 and the necessary conditions
#- # = 0 result in he homogeneous linear equation system a1
=
a2
105
(10.52f)
This system has a non-trivial solution only if the determinant of the coefficient matrix is equal to zero. So we get:
λ^2 - 52λ + 420 = 0, i.e., λ_1 = 10, λ_2 = 42.   (10.52g)
For λ = λ_1 = 10 we get from (10.52f) a_2 = 0, a_1 arbitrary, so the normed solution belonging to λ_1 = 10 is:
y = 5.48 x(x - 1).   (10.52h)
To make a comparison, consider the Euler differential equation belonging to (10.52c). We get the boundary value problem
y'' + λy = 0 with y(0) = y(1) = 0   (10.52i)
with the eigenvalues λ_k = k^2 π^2 (k = 1, 2, ...) and the solutions y_k = c_k sin kπx. The normed solution, e.g., for the case k = 1, i.e., λ_1 = π^2 ≈ 9.87, is
y = √2 sin πx,   (10.52j)
which is really very close to the approximate solution (10.52h).
Remark: With today's level of computers and science we have to apply, first of all, the finite element method (FEM) for numerical solutions of variational problems. The basic idea of this method is given in 19.5.3, p. 910, for numerical solutions of differential equations. The correspondence between differential and variational equations is used there, e.g., via Euler differential equations or bilinear forms according to (19.146a,b). Also the gradient method can be used for the numerical solution of variational problems, as an efficient numerical method for non-linear optimization problems (see 18.2.6, p. 868).
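The Ritz computation above can be reproduced with a few lines of SymPy: the matrices of ∫ y'^2 dx and ∫ y^2 dx in the basis x(x-1), x^2(x-1) lead to a generalized eigenvalue problem whose characteristic polynomial is proportional to λ^2 - 52λ + 420.

    import sympy as sp

    x, lam = sp.symbols('x lambda')
    phi = [x*(x - 1), x**2*(x - 1)]          # Ritz basis satisfying y(0) = y(1) = 0
    n = len(phi)

    A = sp.Matrix(n, n, lambda i, j: sp.integrate(sp.diff(phi[i], x)*sp.diff(phi[j], x), (x, 0, 1)))
    B = sp.Matrix(n, n, lambda i, j: sp.integrate(phi[i]*phi[j], (x, 0, 1)))

    char_poly = sp.factor((A - lam*B).det())  # non-trivial solutions of (A - lambda*B) a = 0
    print(char_poly)                          # proportional to (lambda - 10)*(lambda - 42)
    print(sp.solve(char_poly, lam))           # eigenvalue approximations 10 and 42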
10.6 Supplementary Problems
10.6.1 First and Second Variation
In the derivation of the Euler differential equation with a comparison function (see 10.3.2, p. 552), we stopped after the linear term with respect to ε of the Taylor expansion of the integrand of
I(ε) = ∫_a^b F(x, y_0 + εη, y_0' + εη') dx.   (10.53)
If we also consider quadratic terms, then we get
I(ε) = I(0) + ε ∫_a^b (F_y η + F_y' η') dx + (ε²/2) ∫_a^b (F_yy η² + 2F_yy' ηη' + F_y'y' η'²) dx + ... .   (10.54)
If we denote by the
1. Variation δI of the functional I[y] the expression
δI = ∫_a^b (F_y η + F_y' η') dx,   (10.55)
2. Variation δ²I of the functional I[y] the expression
δ²I = ∫_a^b (F_yy η² + 2F_yy' ηη' + F_y'y' η'²) dx,   (10.56)
then we can write:
I(ε) - I(0) ≈ ε δI + (ε²/2) δ²I.   (10.57)
We can formalize the different optimality conditions with these variations for the functional I[y] (see [10.7]).
10.6.2 Application in Physics Variational calculus has a determining role in physics. We can derive the fundamental equations of Newtonian mechanics from a variational principle and arrive at the Jacobi-Hamilton theory. Variational calculus is also very important in both atomic theory and quantum physics. It is obvious that the extension and generalization of classical mathematical notions is undoubtedly necessary. So, the calculus of variations must be discussed today by modern mathematical disciplines, e.g., functional analysis and optimization. Unfortunately, we can only give a brief account of the classical part of the calculus of variations (see [10.4], [10.5], [10.7]).
11 Linear Integral Equations 11.1 Introduction and Classification 1. Definitions A n integral equation is an equation in which the unknown function appears under the integral sign. There is no universal method for solving integral equations. Solution methods and even the existence of a solution depend on the particular form of the integral equation. An integral equation is called linear if linear operations are performed on the unknown function. The general form of a linear integral equation is: b(C
g(z)d.(s)= f(z)+ A
J KhY)P(Y)
dY,
c
(11.1)
I z I d.
a(.)
The unknown function is ~ ( 2 the ) ~function K ( z ,y) is called the kernel of the integral equation, and f ( z ) is the so-called perturbation function. These functions can take complex values as well. The integral equation is homogeneous if the function f ( z ) is identically zero over the considered domain, i.e., f(z)= 0, otherwise it is inhomogeneous. X is usually a complex parameter. Two types of equation (11.1) are of special importance. If the limits of the integral are independent of z,Le., a(.) a and b(z) b, we call it a Fredholrn integral equation (11.2a,11.2b). If a(.) t a and b(z) = z,we call it a Volterra integral equation (11.2c, 11.2d). If the unknown function p(z) appears only under the integral sign, Le., g(z) I 0 holds, we have an integral equation ofthe first kindas (11,2a), (11.2~). The equation is called an integral equation ofthe secondkindifg(z) t 1 asin (11.2b). (11.2d).
2'
Pk)= f(.) a
+
J K ( z ,Y)P(Y) dY.
(11W
a
2. Relations with Differential Equations The problems of physics and mechanics relative rarely lead directly to an integral equation. These problems can be described mostly by differential equations. The importance of integral equations is that many of these differential equations, together with the initial and boundary values, can be transformed into integral equations. W From the initial value problem ~'(z) = f(z,y) with z 2 2 0 and y(z0) = yo by integration from zo to z we get
d.1
= Yo +
1:f(E>!/(
d<.
(11.3)
The unknown function y(z) appears on the left-hand side of (11.3) and also under the integral sign. The integral equation (11.3) is linear if the function f(<.y(<)) has the form f.((, ~ ( 0=)a ( < )y(E) b(<), i.e.. the original differential equation is also linear. Remark: In this chapter 11 we only deal with integral equations of the first and second kind of Fredholm and Volterra types. and with some singular integral equations.
+
562
11. Linear Intearal Eauations
11.2 Fredholm Integral Equations oft he Second Kind 11.2.1 Integral Equations with Degenerate Kernel If the kernelK(z. y ) of an integral equation is the finite sum of products of two functions of one variable, Le., one depends only on z and the other one only on y I it is called a degenerate kernelor a product kernel. 1. Solution in the Case of a Degenerate Kernel The solution of a Fredholm integral equation of the second kind with a degenerate kernel leads to the solution of a finite-dimensional equation system. Consider the integral equation (11.4a)
K ( z . y ) = c y ~ ( z ) & (+y )~ z ( ~ ) P z + ( v' )' '
+ an(x)Pn(y).
(11.4b)
The functions al ( z ) $... , cy,(x) and B1(x), . . . , p n ( x )are given on the interval [a, b] and are supposed to be continuous. Furthermore, the functions cy~(z),. . . , cy,(z) are supposed to be linearly independent of one another, Le., the equality (11.5) with constant coefficients c k holds for every x in [a,b] only if c1 = cz = K ( z .y) can be expressed as the sum of a smaller number of products. From (11.4a) and (11.4b) we get:
. . . = c,
= 0. Otherwise,
(11.6a) a
a
The integrals are nolonger functions of the variable z, they are constant values. Let's denote them by A k: b
Ak =
1
& ( y ) p ( y ) &, k = 1,.. . , n.
( 11.6b)
a
The solution function p(x), if any exists. is the sum of the perturbation function f(x) and a linear combination of the functions a l ( z ) ,. . . , a,(z): (11.6~) ~ ( 2= ) f ( ~t)X A ~ C Y ~ (XA~CQ(X) Z) . . . XA,an(x).
+
+ +
2. Calculation of the Coefficients of the Solution The coefficients A I . , , , A, are calculated as follows. Equation (11.6~)is multiplied by &(z) and its integral is calculated with respect to 1c with the limits a and b: ~
The left-hand side of this equation is equal to Ah according to (11.6b). Using the following notation (11.7b)
(11 . 7 ~ )
11.2 Fredholm Intearal Eauations of the Second Kind 563
It is possible that the exact values of the integrals in (11.7b) cannot be calculated. When this is the case. their approximate values must be calculated by one of the formulas given in 19.3, p. 895. The linear equation system (11.7~)contains n equations for the unknown values A I , . . . ,A,: A2 -
...
-kin
Ai +(1 - X ~ 2 2 ) A z-
...
-XCZ, .4, = bz
(1 - Xcl1)Al -XCZ~
-hi:!
A, = b l
~
~
(11.7d)
...................................................... -Xcni AI
-XC,Z
Az -
. . . +(1 - Xc,,)A,
= b, .
3. Analyzing the Solution, Eigenvalues and Eigenfunctions It is known from the theory of linear equation systems that (11.7d) has one and only one solution for Al, , , , A, if the determinant of the matrix of the coefficients is not equal to zero, Le., ~
~
D(X) =
-XCz1
(1 - Xczz) . . .
-XCzn
(11.8)
.................................... -Xc,z . . . (1 - Xc,,) -Xc,1
Obviously D(X) is not identically zero, as D(0) = 1 holds. So there is a number R > 0 such that D(X) # 0 if 1x1 < R. For further investigation we have to consider two different cases. Case D(X) # 0: The integral equation has exactly one solution in the form (11,6c), and the coefficients A I , . . . ,A, are given by the solution of the equation system (11.7d). If in (11.4a) we have a homogeneous integral equation, Le., f(x) 0, then bl = bz = . . . = b, = 0. Then the homogeneous equation system (11.7d) has only the trivial solution AI = .A:! = . . . = A, = 0. In this case only the function p(z) I 0 satisfies the integral equation. Case D(X) = 0: D(X) is a polynomial of no higher than n-th degree, so it can have at most n roots. For these values of X the homogeneous equation system (11.7d) with bl = bz = . . . = b, = 0 also has non-trivial solutions. so besides the trivial solution ( ~ ( zI ) 0 the homogeneous equation system has other solutions of the form p(x) = C . (Alal(x) A ~ c u ~ ( x ). . . A,a,(z)) (C is an arbitrary constant.) Because al(x)., , , . CY,(I)are linearly independent, p(x) is not identically zero. The roots of D(X) are called the eigenvalues of the integral equation. The corresponding non-vanishing solutions of the homogeneous integral equation are called the eigenfunctions belonging to the eigenvalue A. Several linearly independent eigenfunctions can belong to the same eigenvalue. If we have an integral equation with a general kernel, we consider all values of X eigenvalues, for which the homogeneous integral equation has 1. non-trivial solutions. Some authors call the X with D(X) = 0 the characteristic number, and p = - is X
+
+ +
1
b
called the eigenvalue corresponding to an equation form p p ( x ) =
a
K ( x ,y)p(y) d y .
4. Adjoint Integral Equation Now we need to investigate the conditions under which the inhomogeneous integral equation will have solutions if D(X) = 0. For this purpose we introduce the adjoint or transposed integral equation of (11.4a):
564
11. Linear Integral Equations
Let X be an eigenvalue and p(x) a solution of the inhomogeneous integral equation (11.4a). It is easy to show that X is also an eigenvalue of the adjoint equation. Now multiply both sides of (11.4a) by any solution ~ ( zof) the homogeneous adjoint integral equation and evaluate the integral with respect to x between the limits a and 6:
j
b
v ( z ) v ( x dx ) =
a
1f(x)ll(z) 1
dx +
p (i]
1
(11.9b)
K ( z ,Y)llb) dz V(Y) dY.
a
a
a
l
b
Assuming that ~ ( y =) X
K ( x ,y)@(zjdz, we get f(x)$(z) dx = 0 That is: The inhomogeneous integral equation (11.4a) has a solution for some eigenvalue X if and only if the perturbation function f(x) is orthogonalto every non-vanishing solution of the homogeneous adjoint integral equation belonging to the same A. This statement is valid not only for integral equations with degenerate kernels, but also for those with general kernels. a
IA: p(z) = z
+
1,
+ zy'
+I
(z'y
- xy)p(y) dy,
z z 1Q ~ ( z= )
( Y I ( ~ )=
2,
~ ( z =) --z, P ~ ( Y = )
y, &(y) = y') b3(y) = y. The functions c y k ( x ) are linearly dependent. This is why we transform the integral equation into the form p(z) = z+ have (
3 ( ~=)5
+
1:
[x'y
+ x(y' - y)]p(y)dy. For this integral equation we
a'(.) = x, pl(y) = y, Oz(y) = y2 - y. If any solution p(z) exists, it has the form A 1 2 t A:!2 .
~ ~ = ( 5x',)
2 2 2 With these values we have the equations for Al and A': Al - -A2 = - , --Al + 3 3 5 10 2 10 2 10 5 which in turn yield that Al = - , Az = - - and p(z) = x + -2' - -x = -z2 + -2. 21 7 21 7 21 7
I
IB: p(x) = x + X / n s i n ( x + y ) p ( y ) d y , i . e . : K ( z , y ) = s i n ( z + y ) =sinzcosy+cosxsiny, p(z) =z
+ Xsinx
1';
cosy p(y) dy
ell = i " s i n z c o s x d z = 0,
+ A cosx T
c12= [cos'xdz
bl = [zcosxdz
=2
= -2,
=7
cZ2= i * c o s x s i n z d z = 0, bz = [ z s i n z d z = T. 2' IVith these values the system (11.7d) is Al - ATAZ = -2, -ApAl+ A2 = K . It has a unique solution 2 2 cZ1 = L'sin'zdz
~
1
;;
1 -Afor any X with D(X) = 1 -A-
T '
= 1 - X'-
the solution of the integral equation is p(x) = x
K'
4
#
0. SO A1
A--2 = +z, 1- A'-
A2 = 4
T(l
- A)
~
1 - A'-
'x , and 4
+ ___ A Kz [ ( A ~ - 2 ) s i n x + ~ ( l - A ) c o s s 1 - A'-
eigenvalues of the integral equation are A1
2 =-
The homogeneous integral equation p(z) =
x
Ah
4 2 XZ = -- .
/'sin(z
+ y)p(y) dy has non-trivial solutions of the
11.2 Fredholm Inteoral Eouations of the Second Kind 565 2 form pk(z) = Xk(A1 sinz t Azcosz) ( k = 1,2). For XI = - we get AI = A2, and with an arbitrary ?r
constant A we have pl(z)= A(sinz t cosz). Similarlyfor
A2
2
= -- we get pz(z) = B(sin z K
- cos T )
with an arbitrary constant B. Remark: The previous solution method is fairly simple but it only works in the case of a degenerate kernel. By this method. however, we can get a good approximate solution in the case of a general kernel too if we can approximate the general kernel by a degenerate one closely enough.
11.2.2 Successive Approximation Method, Neumann Series 1. Iteration Method Similarly to the Pzcard zteratzon method (see 9.1.1.5, l., p. 494) for the solution ofordinary differential equations, an iterative method needs to be given to solve Redholm integral equations of the second kind. Starting with the equation b
P(2) = f(.)
i-
1
( 11.10)
K ( z ,Y ) d Y ) dY,
a
we define a sequence of functions .po(z), p1(z), p2(2),. . . . Let the first be po(z) = f(~).We get the subsequent pn(z)by the formula b
Pn(.) = f(z)+
1
K ( z ,Y)Pn-l(Y) dy ( n = 4 % .. . : d z ) = f(.)).
( 11.1l a )
a
Following the given method our first step is b
(11.11b)
P1(2)= f(z)i- x/K!T.Y)f(y)dY. a
.4ccording to the iteration formula this expression of p(y) is substituted into the right-hand side of (11.10). To avoid the accidental confusion of the integral variables, let’s denote y by q in (1l.llb). (1l.llc) b b
b
=
f(.) + A /
K ( z ,y ) f ( y ) dY + xz
a
SJK ( z ,
Y)K(Y.
a)f(o)dY d7).
( 11.1Id)
a a
1 b
Introducing the notation of K l ( z ,y) = K ( s ,y) and Kz(z,y) =
K ( z ,<)K(E.y) d<, and renaming q
as y , we can write yz(x) in the form
(1l.llf)
I
566
11. Linear Inte.qral Equations
we get the representation of the n-th iterated cp,(z): b
/
+
pn(x) =
b ~ l ( xY ), ~
( Y )dy + . ' ' +
J
~ " ( y 2 ) ,f ( y )
( 11.1lg)
dy.
a
a
We call Kn(x,y) the n-th zterated kernel of K ( z ,y).
2. Convergence of the Neumann Series To get the solution p(z),we have to discuss the convergence of the power series of X
f(x) +
5
n=l
1
Kn(x. Y ) f ( Y ) dy,
(11.12)
a
which is called the Yeumann series. If the functions K ( r ,y) and f ( z ) are bounded, Le., the inequalities (11.13a) IK(x,y)l < M ( a 5 x 5 b, a 5 y 5 b) and I f ( x ) i < N ( a 5 z 5 b ) , hold. then the series s.
I X M ( b - a)In
A'
(11.13b)
n=O
is a majorant series for the power series (11.12). This geometric series is convergent for all 1
IX' <
m,
(11.13~)
The Seumann series is absolute and uniformly convergent for all values of X satisfying (11.13~).By a sharper estimation of the terms of the Neumann series we can give the convergence interval more precisely. According to this, the Neumann series is convergent for (11.13d)
This limit for the parameter X does not mean that there are no solutions for any 1x1 outside the bounds set by (11,13d), but only that we cannot get it by the Neumann series. Let's denote by
c Xn-lKn(s: y) r
r ( z .y; A) =
(11.14a)
n=l
the resolvent or solving kernel of the integral equation. Using the resolvent we get the solution in the form b
=
+
J T ( x ,Y1 X)f(y) dy.
(11.14b)
a
IFor the inhomogeneous Fredholm integral equation of the second kind p(x) = x + X 1 1 we have Kl(x,y) = zy. Kz(z,y) = / l s i ) i ) y d y = ?xy, Ks(z,y) = -zy,. 9
and from this r ( z . y: X) = zy
c,
1x1 < 1, because i K ( x ,y)i 5 .d '
1'
xy p(y)dy
. . . Kn(z,y) =
XY
17 . With the limit (11.13~)the series is definitely convergent for = 1 holds. The resolvent T ( z ,y; A) = -is a geometric series
11.2 Fredholm Integral Equations of the Second Kind
which is convergent even for (XI < 3. Thus from (11.14b) we get cp(x)= x+X
%
dy=
(l-?)
567
X
1-3
Remark: If for a given X the relation (11.13d) does not hold, then we can decompose any continuous , where K ' ( x , y) is a kernel into the sum of two continuous kernels K ( z :y) = K ' ( z , y) + K Z ( zy): degenerat,e kernel, and K Z ( x y) , is so small that for this kernel (11.13d) holds. This way we have an exact solution method for any X which is not an eigenvalue.
11.2.3 Fredholm Solution Method, Fredholm Theorems 11.2.3.1 Fredholm Solution Method 1. Approximate Solution by Discretization A Fredholm integral equation of the second kind b
v(z) = f(z)+
1
(11.15)
K ( x ,Y)cp(Y) dY
a
can be approximately represented by a linear equation system. We need to assume that the functions K ( z .y) and f(x) are continuous for a 5 x 5 b, a 5 y 5 b. 1Ve willapproximate the integral in (11.15) with the so-calledleft-hand rectangular formula (see 19.3.2.1, p. 896). It is also possible to use any other quadrature formula (see 19.3.1, p. 895). With an equidistant partition yk=a+(k-1)h
( k = 1 . 2 . . . . , n; h = - b ; a )
(11.16a)
we get the approximation x f(x) + [ K ( xYI)P(YI) , + + K ( x ,~ n ) ~ ( ~ n ) l ' Let's replace p(z) in this expression by a function F ( x ) exactly satisfying (11.16b): I ,
(11.16b) ( 11 . 1 6 ~ ) [ K ( x ,Y I ) V ( Y I ) + , + K ( x ,y n ) ~ ( y n ) l . To determine this approximate solution, we need the substitution values of F(z) at the interpolation nodes xk = a + ( k . 1)h. If we substitute z = 51, z = x Z 1 .. ,. z = z, into (11.16c), we get a linear equation system for the required n substitution values of F(zk). Using the shorthand notation (11.17a) Kjk = K ( x j .Y k ) . Y'k = V(zk)r f k = f (zk) we get (1 . XhK11)$7, -AhK12 9 2 .... -XhK1n Pn = f l i ~ ( x=) f (x)+
-XhK2191 +(I . XhK22)~2....
I
-XhKzn pn = f z l
................................................. -XhKnl -XhKn2cp2 .... +(1 . XhKnn)qn= f, This system has the determinant of the coefficients (1 - XhKl1) -XhK12 . . -XhKIn -XhK21 (1 . XhK22) . . . -XhKzn
D,IX) = ........................................... -XhKnl -XhK,2 . . . (1 - XhK,,)
1
1.I
(11.17b)
(11 . 1 7 ~ )
This determinant has the same structure as the determinant of the coefficients in the solution of an integral equation with a degenerate kernel. The equation system (11.17b) has a unique solution for every X where Dn(X) # 0. The solution gives the approximate substitution values of the unknown function 'j (x)at the interpolation nodes. The values of X with Dn(X) = 0 are approximations of the
I
568
11. Linear Inteoral Eauations
eigenvalues of the integral equations. The solution of (11.17b) can be written in quotient form (see Cramer rule, 4.4.2.3, p. 275):
Here we get Dk(X) from Dn(X) by replacing the elements of the k-th column by f l , f2,. . . , fn.
2. Calculation of the Resolvent If n tends to infinity, so the number of rows and columns of the determinant Dk(X) and Dn(X), too. We use the determinant to get the solution kernel (resolvent) T ( z ,y; A) (see 11.2.2, p. 565) in the form (11.19b) It is true that every root of D(X) is a pole of T ( z ,y; A). Exactly these values of A. for which D(X) = 0, are the eigenvalues of the integral equation (11.15),and in this case the homogeneous integral equation has non-vanishing solutions, the eigenfunctions belonging to the eigenvalue A. In the case of D(X) # 0, knowing the resolvent T ( z ,y; A), we have an explicit form of the solution: b
d z ) = f(l)+xJr(z:Y;X)f(Y)dY = f(z)+mjD(z’”;”)f(Y)dy. A b
( 11.19c)
a
To get the resolvent, we need the power series of D(z, y ; A) and D(X) with respect to A:
’
where do = 1, Ko(z,y) = K ( z ,y), and we get further coefficients from the recursive formula:
1
_-
X7r
DdX) =
12
--&AT 12
0
0
I-- &Xa 24
_-X7r 24
_ 3x77 _
l--
24
system has thesolutionpz =
~
2--A
192
fixa 24
1 &a
6
I @ = -
v3 . IfX = 1,then(ol = 0, 2--A & 6
‘pz = 0.915, (p3
=
11.2 Fredholm Integral Equations of the Second Kind 569
1.585. The substitution values of the exact solution are: ~ ( 0 = ) 0,
(o
(T) = 1. p (T) = 1.732 6 3
In order to achieve better accuracy, the number of interpolation nodes needs to be increased ( 4 s~ x2)p(y)dy; do = 1, K ~ ( xy), = 42y - x 2 , d l = 4
4
1
(4xt-x2)(4ty-tZ)dt = ~ + 2 ~ ~ y - - z ~ - - x y , d=2 3 3
1 K2(z,y)= - ( 4 x y - z z ) - J 1 K ( x . t ) ~ ~ ( t , y ) d t= 0. Withthesethevaluesdj, K3(2,y)andallthefol18
lowingvaluesofdkandKk(x,y)areequaltozero. T(x,y:X) = l-X+-
*
XZ
XZ
18
From 1 - X + - = 0. we get the two eigenvalues Xl,z = 9 3fi. If X is not an eigenvalue, we have the l8
solution p(2) = z +
1
1
~ ( xY;,XU(Y) dy =
+
3~(2X- 3Xx 6) X2 - 18X 18 . +
11.2.3.2 Fredholm Theorems For the Fredholm integral equation of the second kind b
94x1 = f(2)+
I K(X>Y)P(Y)dY
(11.21a)
J
0
the correspondent adjoint integral equation is given by b
v ( z ) = g(2) +
J K(y, x)dJ(Y)dY.
(11.21b)
0.
For this pair of integral equations the following statements are valid (see also 11.2.1,p. 562). 1. A Fredholrn integral equation of the second kind can only have finite or countably infinite eigenvalues. The eigenvalues cannot accumulate in any finite interval, Le., for any positive R there are only a finite number of X for which /XI < R. 2. If X is not an eigenvalue of (11,21a), then both of the inhomogeneous integral equations have a unique solution for any perturbation function f(x) or g(x), and the corresponding homogeneous integral equations have only trivial solutions. 3. If X is a solution of (11,21a), then X is also an eigenvalue of the adjoint equation (11.21b). Both homogeneous integral equations have non-vanishing solutions, and the number of linearly independent solutions are the same for both equations. 4. For an eigenvalue X the homogeneous integral equation can be solved if and only if the perturbation function is orthogonal to every solution of the homogeneous adjoint integral equation, Le., for every solution of the integral equation
a
a
The Fredholm alternative theorem follows from these statements: Either the inhomogeneous integral equation can be solved for any perturbation function f ( x ) or the corresponding homogeneous equation
I
570
11. Linear Inteqral Equations
has non-trivial solutions
11.2.4 Numerical Methods for Fredholm Integral Equations ofthe
Second Kind Often it is either impossible or takes too much work to get the exact solution of a Fredholm integral equation of the second kind b
P(5) = f(z)+
1
K ( z :Y ) d Y ) dY
(11.23)
a
by the solution methods given in 11.2.1, p. 562, 11.2.2, p. 565 and 11.2.3, p. 567. In such cases certain numerical methods can be used for approximation. Three different methods are given below to get the numerical solution of an integral equation of the form (11.23).
11.2.4.1 Approximation of the Integral 1. Semi-Discrete Problem Working on the integral equation (11.23) we replace the integral by an approximation formula. These approximation formulas are called quadrature formulas. They take the form
1
f(z)dz SZ
Q[a,b](f)
a
=
5
(11.24)
ukf(zk).
k=l
Le.. instead of the integral we have a sum of the substitution values of the function at the interpolation nodes x k weighted by the values w k . The numbers w k should be suitably chosen (so as to be independent of f j . Equation (11.23) can be written in the approximate form: n
a f(z)+ xQ[a,b](K(z,‘)p(‘))= f(z)+
1
W k K ( z , Yk)’P(Yk).
(11.25a)
k=l
The quadrature formula &(a,b](K(z, .)p(.))also depends on the variable z.The dot in the argument of the function means that the quadrature formula will be used with respect to the variable y . Defining the relation n
a ( z )= f(x)t
WkK(z,Y k ) F ( Y k ) .
( 11.25b)
k=l
p(zj is an approximation of the exact solution p(x). &’e can consider (11.25b) as a semi-discreteproblem, because the variable y is turned into discrete values while the variable z can still be arbitrary. If the equation (11.25b) holds for a function V(z) for every z E [a>b], it must also be valid for the interpolation nodes z = z k : (11 . 2 5 ~ )
This is a linear equation system containing n equations for the n unknown values T ( z k ) . Substituting these solutions into (11.25b) we have the solution of the semi-discrete problem. The accuracy and the amount of calculations of this method depend on the quadrature formula used. For example if we use the left-hand rectangular formula (see 19.3.2.1, p. 896) with an equidistant partition y k = xk = a + h(k - l ) , h = ( b - a ) / n , ( k = 1,.. . .n): (11.26a)
(11.26b)
11.2 Fredholm Integral Eazlations of the Second Kind 571
the system (11.25~)has the form: - X h K 1 2 ~2 - . . . (1 - X h K 1 1 ) p l -XhK2: ~1 +(1 - XhK2292) - . . .
.................
- X h K 1 n 'Pn
= f1S
-XhK2n Pn = fi,
(11 . 2 6 ~ )
I...........................................
-XhKn2 ~2 - . . . +(1 - X h K n n ) i p n = fn. - X h K n l (31 We had the same system in the Fredholm solution method (see 11.2.3, p. 567). As the rectangular formula is not accurate enough, for a better approximation of the integral we have to increase the number of interpolation nodes, along with an increase in the dimension of the equation system. Hence we get the idea of looking for another quadrature formula.
2. Nystrom Method In the so-called Nystrom method we use the Gauss quadrature formula for the approximation of the integral (see 19.3.3, p. 897). In order to derive this, we consider the integral
I=
j
f(x)dx.
(11.27a)
We replace the integrand by a polynomial p(x), namely the interpolation polynomial of f(x) at the interpolation nodes x k : n
P(z) =
Lk(x)f(xk)
with
k=l
(z- 21) . . . (x - xk-I)(x - Q + l ) . . . (x - xn)
Lk(x)=
For this polynomial, p ( x k ) = f ( i k ) , results in the quadrature formula b
/ f(x)dx a
(11.27b) -z k + l ) . ..(xk -xn)' k = 1,.. . n. The replacement of the integrand f(x) by p(x)
(xk - 2:).. . (xk - X k - l ) ( x k
/
2
a
k=l
b
b
x
~
f(q)& ( x ) dx =
/p(x) dx = a
k=l
Idkf(2k)
b
with
wk
=
/ Lk(x)dx.
(11.27~)
a
For the Gauss quadrature formula the interpolation nodes cannot be chosen arbitrarily but we have to choose them by the formula: L
L
The n values t k are the n roots of the Legendre polynomial of the first kind (see 9.1.2.6, 3., p. 509) 1 d" [ ( t 2 - l)"] (11.28b) P,(t) = 2" n! dtn ' These roots are in the interval [-I, +1]. We calculate the coefficients w k by the substitution x - x k = b-a -(t - t k ) , so: 2
= (b - a ) A k . (11.29) In Table 11.1 we give the roots of the Legendre polynomial of the first kind and the weights A k for n = 1 , . . . , 6.
I
572
11. Linear Inteqral E~uations
Table 11.1Roots of the Legendre polynomial of the first kind
I
I n / t 1 tl= 0 2 1 tl = -0.5774 t 2 = 0.5774 ti = -0.7746 3 t2 = 0 t 3 = 0.7746 ti = -0.8612 4 t z = -0.3400 t 3 = 0.3400 t 4 = 0.8612
1
I
1
A Al=l A1 = 0.5 A2 = 0.5 Ai = 0.2778 A2 = 0.4444 A3 = 0.2778 Ai = 0.1739 A2 = 0.3261 A3 = 0.3261 A4 = 0.1739
I1
n 5
t
ti = -0.9062 t 2 = -0.5384 t3 = 0 t 4 = 0.5384 - t 5 = 0.9062 6 ti = -0.9324 tz = -0.6612 t 3 = -0.2386 t 4 = 0.2386 t 5 = 0.6612 tfi = 0.9324
A
'41 = 0.1185 A2
= 0.2393
A3
= 0.2844
A4
= 0.2393
A5
= 0.1185
Ai = 0.0857 A2 =
0.1804
A3 = 0.2340
A4 = 0.2340 A5 = 0.1804 A6 = 0.0857
ISolvetheintegralequationq(2) = c o s r t ~ (2 e2 '++ Rl Z) + - / l e z v p ( y ) d y b y t h e Y y s t r o m m e t h o d for n = 3. 12 = 3 : 2 1 = 0.1127. 2 2 = 0.5, 53 = 0.8873, A1 = 0.2778, A2 = 0.4444, A3 = 0.2778, f l = 0.96214, f 2 = 0.13087, j 3 = -0.65251. K11 = 1.01278, Kzz = 1.28403, K33 = 2.19746, Kl2 = K Z I= 1,05797, K13 = K31 = 1.10517. K23 = K32 = 1.55838. The equation system (11.25~)for P I , ' p z ! and 93 is 0.71864~1- 0.4701692 - 0.30702p3 = 0.96214. -0.2939091 t 0.4293892 - 0.43292~3= 0.13087, -0.30702~1- O.69254pz 0.38955~3= -0.65251.
+
The solution of the system is: 'p1 = 0.93651, p2 = -0.00144, 9 3 = -0.93950. The substitution values of the exact solution at the interpolation nodes are: p ( q ) = 0.93797, y ( z z )= 0; ( ~ ( 2=~-0.93797. )
11.2.4.2 Kernel Approximation Replace the kernel K ( z ,y) by a kernel R ( z ,y) so that K(z,y) x K ( z ,y) for a 5 z 5 b. a 5 y Try to choose a kernel making the solution of the integral equation
5 b.
b
4 . 1 = f b ) t x-/R(z:Y)F(Y)dY
-
(11.30)
a
the easiest possible.
1. Tensor Product Approximation A frequently-used approximation of the kernel is the tensor product approzzmatzon in the form (11.31a) with given linearly independent functions ~ ( z ). .,. , a,(z) and po(y), . . . % pn(y) whose coefficients dje must be chosen so that the double sum approximates the kernel closely enough in a certain sense.
11.2 FTedholm Intearal Eauations of the Second Kind 573
(11.31~) Functions cyo(z), . . . a,(.) and &(y), . . . , D,(y) should be chosen so that the coefficients d j k in (11.31a) can be calculated easily and also that the solution of (11.31~)isn't too difficult.
2. Special Spline Approach Let's choose
( 11.32)
lo
otherwise
for a special kernel approximation on the interval of integration [a,b] = [0,1]. The function
(
ah(.)
has
')
non-zero values only in the so called carrier interval -, - , (Fig. 11.1).
To calculate the coefficients d j k in (11,31a),consider x ( z ,y ) at the points z = l/n, y = i/n( I , i = 0 , 1 , . . . , n ) . We get 1 f o r j = 1, k = i, aj ak = 0 otherwise
(i) (i) { R (:, i) (4,i).
and consequently X ( l / n ,i / n ) = dli. Hence, we substitute dl, =
n
n
n
=K
-
Figure 11.1
n n
Now (11.31a) has the form
aj(z)Pk(Y). j=O
(11.34)
k=O
As we know, the solution of (11.31~)has the form
a(.)
+
+. +
(11.35) = f ( ~ ) Aoc~o(z) . . A,a,(z). The expression Aocuo(z) + . . . + A,a,(z) is a piecewise linear function with substitution values Ak at the points i k = k / n . Solving (11.31~)by the method given for the degenerate kernel, we get a linear equation system for the numbers Ao, . . . ,A,: -Xcol A1 - . . . -Xclo A0 + (1 - XcllA1) - . . .
(1 - Xco0)Ao
-XCO, A, = bo, A, = b l ,
....................................................... -XC,~ A1 - . . . + (1 - Xc,,)A, = b,, -&o AO where
(11.36a)
574
11. Linear Inteqral Equations
. . . +K ( ~n, ~n) ] a n ( z ) c Y k ( z ) d z .
= K(i,i)/ao(z)ak(z)dz+
(11.36b)
0
3n 1
I,, = /a,(z)ak(x) dx = 0
3n 1 6n ,0
for j = 0, k = 0 and j = n, k = n, for j = k , l < j < n , (11.36~) for j = k + l , j = k - 1 , otherwise.
(11.36d) Taking a matrix C with numbers c,k from (11.36a), a matrix B with the values K ( j / n ,k / n ) and a matrix A with the values I,, respectively, a vector b from the numbers bo,. . . , b,. and a vector a from the unknown values .'io2. . . A,, the equation system (11.36a) has the form (I - XC)a = (I - XBA)a = b. (11.36e) In the case when the matrix (I - XBA) is regular, this system has a unique solution a = (Ao,. . . , A n ) .
11.2.4.3 Collocation Method Suppose the n functions q l ( z ) ,. , , , pn(z)are linearly independent in the interval [a,b]. They can be used to form an approximation function $ ( E ) of the solution p(z): ( 11.37a) p(x) zz S ( x ) = a l p l ( z ) + azc~z(z)+ . . . + anpn(z). The problem is now to determine the coefficients al, . . . , a n . Usually, there are no values al... . , a n such that the function F(z)given in this form represents the exact solution p(z) = F(z)of the integral equation (11.23). Therefore, n interpolation points ~ 1 ,. .., z nare defined in the interval of integration, and it is required that the approximation function (11.37a) satisfies the integral equation at least at these points: (11.37b) ~ ( 5 ,= ) aiyi(zk) + . . . + anpn(zk) b
= f(xk)
+ ~ / ~ . ( z , , y[ a)l p l ( y )+ . + anpn(Y)l dy ,,
a
LVith some transformations this equation system takes the form:
( k = 1,.. . . n ) .
(11 . 3 7 ~ )
11.3 Fredholm Inteqral Equations ofthe First Kind
575
Then the equation system to determine the numbers al,. . . , a, can be written in matrix form: (A - AB) a = b. (11.37g)
Ip(x) =
$- + 1' &p(y) ."
+
d y . The approximation function is p(x) = alzZ uzx
+
u3,
y l ( z )=
x2,pz(x)= 2. p3(x)= 1. The interpolation nodes are x1 = 0, x2 = 0 . 5 , = ~ 1. 0 0 0 0 1
4 4 4 -
1 1 1
-
_
The system of equations is a3
=
0.
(4 - 7) al + (i- T) az + (1 - $)a3 = 2J;i 0 1 1 +-as= 7 a1 + z3 a2 3 2' 1
4
1
4
1
= - 0 . 8 1 9 7 ~+ ~ 1.80922, whose solutions are a l = -0.8197,a2 = 1.8092. a3 = 0 and with these and so q(0) = 0. g(O.5)= 0.6997, T(1) = 0.9895. The exact solution of the integral equation is p(z) = fiwith the values p(0) = 0, y(0.5) = 0.7071, p(1) = 1. In order to improve the accuracy in this example. it is not a good idea to increase the degree of the polynomial, as polynomials of higher degree are numerically unstable. It is much better to use different spline approximations, e.g., a piecewise linear approximationq(2) = u l y l ( s )+azyz(z) . '+anpn(z) with the functions introduced in 11.2.4.2
a(.)
+.
otherwise
a(.).
In this case. the solution p(2) is approximated by a polygon Remark: There is no theoretical restriction as to the choice of the interpolation nodes for the collocation method. In the case, however, when the solution function oscillates considerably in a subinterval, we have to increase the number of interpolation points in this interval.
11.3 Fredholm Integral Equations ofthe First Kind 11.3.1 Integral Equations with Degenerate Kernels 1. Formulation of the Problem Consider the Fredholm integral equation of the first kind with degenerate kernel b
f(.)
= /(ul(x!3liY)
+
"
'
+ % l ( ~ ) P n ( Y ) M YdY)
(c 5 z I 4,
(11.38a)
a
and introduce the notation similar to that used in 11.2. p. 562, b
.4, = / f l j ( ~ ) & ) d y a
(.i=1,2,..~,n).
(11.38b)
576
11. Linear Inteqral Equations
Then (11.38a) has the form (11 . 3 8 ~ ) f(z)= Alal(z) i- , + Anan(.), Le.. the integral equation has a solution only iff(.) is a linear combination of the functions a l ( z ) ,. . . a,(z).If this assumption is fulfilled, the constants A I , .. . , A , are known.
2. Initial Approach We are looking for the solution in the form ~ ( z=) C I & ( ~ )i- ' ,
+ cnPn(z)
(11.39a)
where the coefficients c l l . .. , c, are unknown. Substituting in (11.38b) (11.39b) a
a
and introducing the notation b
(11 . 3 9 ~ )
K*, = /3%(Y)PAY)dY a
we have the following equation system for the unknown coefficients cl,. . . ,:c, Kllcl t + Klnc, = A i . t..
+. . . + K,,c,
K,lcl
(11.39d) = A,.
3. Solutions The matrix of the coefficients is non-singular if the functions Pl(y),. . . , ,&(y) are linearly independent (see 12.1.3,p. 596). However, the solution obtained in (11.39a) is not the only one. Unlike the integral equations of the second kind with a degenerate kernel, the homogeneous integral equation always has a solution. Suppose @(z) is a solution of the homogeneous equation and p(z) is a solution of (11.38a). Then p(z) + @(z) is also a solution of (11.38a). To determine all the solutions of the homogeneous equation, let us consider the equation (11.38~)with f(z)= 0. If the functions a l ( z ) ,. . . , a,(z) are linearly independent, the equation holds if and only if b
Aj =
/ ~,(Y)v(Y)
=0
( j = 1,2,. . . I n ) l
(11.40)
a
Le.. every function y h ( y )orthogonal to every function Dj(y) is a solution of the homogeneous integral equation.
11.3.2 Analytic Basis 1. Initial Approach Several methods for the solution of Fredholm integral equations of the first kind b
f(.)
=
/ K(.?
Y)yo(Y) dy (c I z I 4
(11.41)
a
determine the solution p(y) as a function series of a given system offunctions (P,(y)) = {Pl(y), &(y), , . .}. i.e., we are looking for the solution in the form
c 00
Pb)=
C,P,(Y)
(11.42)
3=1
where we have to determine the unknown constants cj. When choosing the system of functions we have to consider that the functions (P,(y)) should generate the whole space of solutions, and also that the
11.3 Fredholm Intearal Eouations of the First Kind 577
calculation of the coefficients cj should be easy. For an easier survey, we will discuss only real functions in this section. All of the statements can be extended to complex-valued functions, too. Because of the solution method we are going to establish, certain properties of the kernel function K ( z ,y) are required. We assume that these requirements are always fulfilled. Next, we discuss some relevant information.
2. Quadratically Integrable Functions .4function q(y) is quadratzcally zntegrable over the interval [a,b] if
f
lV(Y)I2dY <
(11.43)
n
holds. For example, every continuous function on [a,b] is quadratically integrable. The space ofquadratically integrable functions over [a,b] will be denoted by L2[a,b].
3. Orthonormal System Two quadratically integrable functions a ( y ) , bj(y),y E [a,b] are considered orthogonal to each other if the equality (11.44a)
f3t(Y)bAY)dY = 0 a
holds LVe call a system of functions (pn(y)) in the space L2[a,b] an orthononnalsystem if the following equalities are true: ( 11.44b)
.4n orthonormal system of functions is complete if there is no function B ( g ) # 0 in LZ[a,b] orthogonal to every function of this system. A complete orthonormal system contains countably many functions. These functions form a baszs of the space L2[a,b]. To transform a system of functions (P,(y)) into an orthonormal system (3i(y)) we can use the Schmzdt orthogonalzzatzon procedure. This determines the coefficients bnl, b n z r , , , b, for n = 1 , 2 , . . . successively so that the function ~
(11 . 4 4 ~ ) j=1
is normalized and orthogonal to every function p;(y), . . . , PA-l(y). 4. Fourier Series If (bn(y)) is an orthonormal system and $(y) E L2[a,b], we call the series 00
Cd333(Y) = Y ( Y )
(11.4Sa)
j=1
the Fourier series of ~ ( y with ) respect to (Pn(y)), and the numbers dj are the corresponding Fourier coefficients. Based on (11.44b) we have: b
/ (1
ak(Y)@(Y)
dy =
dj
3=1
1
p~(?l)bkok(Y) d!4 = dk.
(11.45b)
a
If (&(y)) is complete, we have the Parseval equality (11.45~)
I
578
11. Linear Integral Equations
11.3.3 Reduction of an Integral Equation into a Linear System of Equations -4 linear equation system is needed in order to determine the Fourier coefficients of the solution function p(y) with respect to an orthonormal system. First, we have to choose a complete orthonormal system (,!-ln(y)).y E [a. b]. A corresponding complete orthonormalsystem (cyn(x))can be chosen for the interval x E [c, d ] . With respect to the system (cyn(x))the function f(z)has the Fourier series (11.46a) If the integral equation (11.41) is multiplied by cyl(z)and the integral is evaluated for 2 running from c to d alone, we get: d b
fi
=
11
K ( x ,YMY14z) dY dx
c a
(11.46b)
The expression in braces is a function of y with the Fourier representation
1
m
K ( x ,~ ) ~ t dz ( z =) K ~ ( Y=)
Kqb'j(y) with
(11.46~)
j=1
C
a c
b'ith the Fourier series approach 35
dY) = k=l C k 3 k ( Y )
(11.46d)
we get
(11.46e)
Because of the orthonormal property (11.44b), we have the equation system m
ft
=
1K I j c j
(i = 1,2,. . .).
(11.46f)
J=1
This is an infinite system of equations to determine the Fourier coefficients cl, c 2 . .. . . The matrix of coefficients of the equation system
(11.46g)
11.3 Fredholm Integral Equations of the First Kind 579
is called a kernel rnatrzz. The numbers f, and Kt3 (isj = 1 , 2 , . . .) are known, although they depend on the orthonormal system chosen. sin y If(x) = p(y) dy, 0 5 x 5 T.The integral is considered in the sense of the Cauchy T 0 cosy - cosx principal value. .4s a complete orthogonal system we use: . . 1 I. aa(x)= - a,(x) =
'ST ,/li
%
By (11.46d), the coefficients of the kernel matrix are
dxdy = 0 ( j = 1 , 2 , . . .),
-
??j~i~sinysinzycosrx dx dy = $ i' sin
y sin zy
'
cosix
d z } dy (z = 1 , 2 , . . .).
cos y - cos 5 For the inner integral the equation z3-7r7r
S 0
0
cosix sin i y dx = -Tcosy - cosz sin y
holds. Consequently Kzj =
-2T
(11.47)
0 sinjysiniy d y = O
1.1
-1 for i = j .
f(z)cu,(x)dx (i = 0 , 1 , 2 . . . .). The equation
The Fourier coefficients of f(x)from (11.46a) are fi =
[-:-: : :1 [!;) 0 0 0 ."
system is
=
solution only if the equality fa =
According to the first equation, the system can have any
[f(z)a~(x) dz = - f(x)dx = 0 holds. Then we have cj = S' A
-fi
for i # j :
0
(1 = 1.2.. . .), and p(y) = -
f (2)dx.
11.3.4 Solution ofthe Homogeneous Integral Equation ofthe First Kind If p(y) and $(y) are arbitrary solutions of the inhomogeneous and the homogeneous integral equation respectively, Le., h
h
then the sum p(y) + @(y) is a solution of the inhomogeneous integral equation. Therefore we have to determine all the solutions of the homogeneous integral equation. This problem is the same as determining all the non-trivial solutions of the linear equation system X
1K& j=1
= 0 ( 2 = 1 , 2 , ,. .)
(11.49)
580
11. Linear Inteqral Equations
.4s sometimes this system is not so easy to solve, we can use the following method for our calculations. take the functions If we have a complete orthonormal system (an(z)), d
K,(y) =
1
K ( z ,y)a,(x) dz (i = 1,2,. . .).
(11.50a)
C
If @(y) is an arbitrary solution of the homogeneous equation, Le.,
p
(11.50b)
K ( x ,y ) v h ( y )dy = 0
a
holds, then multiplying this equality by q ( z ) and performing an integration with respect to z,we get (11.50~) i.e., every solution ph(y)of the homogeneous equation must be orthogonal to every function K,(y). Ifwe replace the system (K,(y)) by an orthonormal system (K;l(y)) using an orthogonalization procedure, instead of (11.50~)we have (11.50d) a
If we extend the system (KG(y))into a complete orthonormal system, the conditions (11.50d) are obviously valid for every linear combination of the new functions. If the orthonormal system (KG(y))is already complete, then only the trivial solution ph(y)= 0 exists. We can calculate the solution system of the adjoint homogeneous integral equation in exactly the same way: (11.50e)
K ( z ,y)@(x)dx = 0. p(y) dy = 0, 0 5 z 5
1.2,. . .), K,(y) =
pi/
7r.
An orthonormal system is: a,(z) =
= sinzsinzz
T7r
0
cosy-cosx
(11.47) twice we get K,(y) = -
de. Applying
cos y - cos z
--s (sin(z - ~);~nysin(z+ 1)y) = $coszy
(z = 1 , 2 , . ..). The
1
system (K,(y)) is already an orthonormal system. The function Ko(y) = - completes this system.
fi
Consequently the homogeneous equation has only the solution: ph(y)= ' e
fi
= 2,
( e is arbitrary).
11.3.5 Construction of Two Special Orthonormal Systems for a Given Kernel 1. Preliminaries The solution of infinite systems of linear equations we saw in 11.3.3, p. 578, is not usually easier than the solution of the original problem. Choosing suitable orthonormal systems (a,(z))and (&,(g)) we can change the structure of the kernel matrix K in such a way that the equation system can be solved easily. By the following method we can construct two orthonormal systems such that the coefficients
11.3 Fredholm Inteqral Equations of the First Kind 581
Kt3of the kernel matrix are non-zero only for z = j and i = j + 1 Using the method given in the previous paragraph, we first determine two orthonormal systems (p,h(y)) and ( a h ( z ) ) ,the solution systems of the homogeneous and the corresponding adjoint homogeneous integral equations respectively. This means that we can give all the solutions of these two integral equations by a linear combination of the functions p,"(y) and ak(x). These orthonormal systems are not complete. By the following method we complete these systems step by step into complete orthonormal systems a J ( z ),!$(y) . (J = 1 , 2 , . . .).
2. Procedure First a normalized function a1(z) is determined, which is orthogonal to every function (ak(x)). Then the following steps are performed for j = 1 , 2 , . . .: 1. Determination of the function OJ(y) and the number vJ from the formulas (11.5la) d vJ$j(y)
= / K ( x $Y ) a j ( z ) d x - /13-1pJ-l(y)
(3 #
'11
(11.51b)
C
so that vj is never equal to zero and R j ( y ) is normalized. Then bJ(y) is orthogonal to the functions
((P,h(Y))>$*(Y)>
$j-l(Y)).
" ' 3
2. Determination of the function aJtl(z) and the number p, from the formula b PJaJ+1(")
=/K(5,y)OJ(~)dy-vJQ3(z)' a
There are two possibilities:
(11'51c)
a) pJ # 0: The function aJtl(z)is orthogonal to the functions ((ak(x)),al(z), . . . , a,(.)).
b) p J = 0: Then the function aJtl(z) is not uniquely defined. Here again we have two cases: bl) Thesystem ((ak(z)),a1(x),...,a~(2)) isalreadycomplete. Thenthesystem ( ( & ( y ) ) , b l ( y ) , . . . , $J(y)) is also complete. and the procedure is finished.
bz) The system ((ak(x)), al(z), . . . ,a,(.)) is not complete. Then again we choose an arbitrary function aJ+l(z)orthogonal to the previous functions. This procedure is repeated until the orthonormal systems are complete. It is possible that after a certain step the case b) does not occur during a countable number of steps, but the system of this countable number of functions ((ah(.)), al(z),. . .) is still not complete. Then again we can start the procedure by a function &I(.), which is orthogonal to every function of the previous system. If the functions aj(x), 3,(y) and the numbers v J spJ are determined by the procedure given above, we have the kernel matrix K in the form
(11.52)
I
582
11. Linear Inteqral Equations
The matrices K"' ( m = 1,2.. . .) are finite if during the procedure we have p p ) = 0 after a finite number of steps. They are infinite if for countably many values of 3 . p y ) # 0 holds. The number of zero rows and zero columns in K corresponds to the number of functions in the systems (ak(z))and ( O t ( y ) ) . We have a very simple case if the matrices Km contain one number vi"' = v, only, Le., all numbers p y ' are equal to zero. Using the notation of 11.3.3, p. 578, we have the solution of the infinite equation system under the assumptions f, = o for a 3 ( z )E (ak(x)) : (11.53)
11.3.6 Iteration Method To solve the integral equation b
/
f ( z ) = Kb.3 Y)P(Y) dy
(c
I z 5 4.
(11.54a)
a
starting with ag(z)= f ( z ) we determine the functions d
b
3,(y) = /K(z.y)a,:(z)dz
(11.54b)
and a,(z)= JK(z,y)&(y) dy,
(11.54~)
a
C
for n = 1.2. . . equalities hold:
.
If there is a quadratically integrable solution y(y) of (11.54a) then the following
b
b d
/ P(y)3n(21) J J dy =
a
~ ( y ) ~ (y)On-l(x) z, dzdy
a c
= jf(z)a,l(z)dz
( n = 1.2. . . .).
(11.54d)
e
By the orthogonalizationand normalizationofthe function systems that we have obtained from (11.54b), (11 54c) we get the orthonormal systems (a;(.))and (Oi(y)). Using the Schmidt orthogonalization method we have Di(y) in the form (11.54e) 3=1
\$'e need to show that the solution p(y) of (ll.54a) has the representation by the series
c JEi
P(Y) =
C,%(Y).
(1l.54f)
3=1
In this case we have for the coefficients c, regarding (11.54d): (1134g)
12.1 Volterra Intearal Eouations 583
To have a solution in the form (11.549 the following conditions are both necessary and sufficient: m
d
1.
/[f(3:)lZ d3:
= n=l
c
d
1 J ~ ( z ) Q ; ( zdxI2, )
m
(11.55a)
2.
1IcnI2 < m.
(11.55b)
n=l
11.4 Volterra Integral Equations 11.4.1 Theoretical Foundations A Volterra integral equation of the second kind has the form
d3:)= fb)+
1
K ( x .Y ) d Y ) dy.
(11.56)
a
The solution function p(z) with the independent variable z from the closed interval I = [a,b] or from the semi-open interval I = [a,w) is required. We have the following theorem about the solution of a L'olterra integral equation of the second kind: If the functions f(x) for z E I and K ( x ,y) on the triangular region 2 E I and y E [a,z]are continuous, then there exists a unzque solution p(x) of the integral equation such that it is continuous for z E I . For this solution yn(a) = ( a ) (11.57) holds. In many cases, the Volterra integral equation of the first kind can be transformed into an equation of the second kind. Hence, theorems about existence and uniqueness of the solution are valid with some modifications.
s
1. Transformation bv Differentiation Assuming that p(x), K ( z ,y). and K,(z, y) are continuous functions, we can transform the integral equation of the first kind (11.58a) a
into the form
by differentiation with respect to z.If K ( z .x) # 0 for all 3: E I , then dividing the equation by K ( z ,z) we get an integral equation of the second kind. 2. Transformation by Partial Integration -4ssuming that yn(x),K ( z ,y) and KY(z, y) are continuous, we can evaluate the integral in (ll.58a) by partial integration. Substituting J
a
gives
(11.59b)
I
584
11. Linear Inteqral Eauations
If K ( z ,z)# 0 for z E I , then dividing by K ( z ,z)we have an integral equation of the second kind: (11 . 5 9 ~ )
Differentiating the solution $(z)we get the solution p(z) of (11.58a).
11.4.2 Solution by Differentiation In some Volterra integral equations the integral vanishes after differentiation with respect to 2, or it can be suitably substituted. Assuming that the functions K ( z ,y), K,(z, y), and p(z) are continuous or. in the case of an integral equation of the second kind, p(z) is differentiable, and differentiating
f(.)
=
j K ( z .Y)P(Y)dY
(11.60a)
or P (.) = f(.)
1
+ K ( z ,Y)P(Y)dY
a
(11.60b)
a
with respect to z we get (11.60~)
d ( z ) = f ' b )+ K(z.z)p(z)+
/" a
zK(z,Y) d
Y ) dY.
(11.60d)
Q
1 x)
tiatingit twicewithrespect t o z we have y(z)cosz-
LX
1 . cos(z-2y)p(y) dy = -zsinz (I). Differen2 1 sin(z-Zy)p(y)dy = - ( s i n z + z c o s r ) (IIa), 2
I' I'
IFind thesolutionp(z) forz E 0, - oftheequation
1 . cos(z - 2y)(p(y)dy = cosz - -zsinz (IIb). The integral in the second equation 2 is the same as that in the original problem, so we can substitute it. We get $(z) COST = cosz and
and p'(z) cosz -
I ;)
because cos5 # 0 for z E 0, -
+
p'(z) = 1, so y ( z ) = z C.
To determine the constant C substitute z = 0 in (IIa) to obtain y ( 0 ) = 0. Consequently C = 0, and the solution of (I) is y ( z ) = z. Remark: If the kernel of a Volterra integral equation is a polynomial, then we can transform the integral equation by differentiation into a linear differential equation. Suppose the highest power of z in the kernel is n. After differentiating the equation (n t 1) times with respect to zwe have a differential equation of n-th order in the case of an integral equation of the first kind, and of the order n + 1 in the case of an integral equation of the second kind. Of course we have to assume that p(z) and f(z)are differentiable as many times as necessary. I X [ 2 ( z- y)* t l]p(y)dy = z3 (I*). After differentiating three times with respect to z we have p(z)t4[(z--y)p(y)
dy = 32' (II*a), y'(z)t4/'p(y) dy = 62 (II*b). 9"(5)+4p(z)= 6 (II*c).
3 The general solution of this differential equation is p(z) = A sin 2x + B cos 22 t -. Substituting z = 0 2 in (II*a) and (II*b) results in y ( 0 ) = 0, p'(0) = 0, so we have A = 0, B = -1.5. The solution of the 3 integral equation (I*) is ~ ( z=) -(1 - cos2z) 2
11.4 Volterru Integral Equations 585
11.4.3 Solution of the Volterra Integral Equation of the Second
Kind by Neumann Series Ll'e can solve Volterraintegral equations of the second kind by using Neumann series (see 11.2.3,p. 567). If we have the equation (11.61)
fory 5 x, for y > x.
(11.62a)
Il'ith this transformation (11.61) is equivalent to a Fredholm integral equation b
4 x 1I .=( !
J%
+
( 11.62b)
Y M Y ) dY,
a
allowing b = co as well. The solution has the representation m
=
b
I(.) + n=l X" Ja KL(x,y ) f ( y ) dY.
(11.62~)
The zteruted kernels K1,K z . . . . are defined by the following equalities: K1(z. Y) =
w. Y),
b
Kz(z,Y)
=
1 2
J"!.:
dwl> I/) dq =
K ( x ,M 1 ,Y) d% . .
(11.62d)
Y
a
and in general: (11.62e) The equalities K3(x,y) I 0 for y > x ( 3 = 1,2,. . .) are also valid for iterated kernels. Contrary to Fredholm integral equations if (11.61) has any solution, the Neumann series converges to it regardless of the value of X. I p(z) = 1 x 12e"-Y9(y) dy. ~ 1 ( xy), = ~ ( xy) ,= ex-21, ~ z ( zy), = eX-tleY-Y dV = ex-Y (x -
dx
+
ex-y
Y),,
' '
>
Kd.. Y) = -(x - y)"-'. ( n - I)!
r(x,y; A)
A" 1-(x O0
- y)" = e('-~)(Xtlj. It is well-known that n! this series is convergent for any value of the parameter A. edXt*)Ydy, in particular if X = -1: p(x) = We get p(z) = 1 + X e(2-gj(X+1) dy = 1 + Xe(At')"
Consequently the resolvent is:
= e2-Y
n=O
1'
1 - z,x
+ -1:
1 X f l
p(z) = -(1
+ Xe("'j2).
1'
11.4.4 Convolution Type Volterra Integral Equations If the kernel of a Volterra integral equation has the special form ~ ( z $ y=)
[ k(z 0
-
y)
for 0 5 y 5 x, for 0 5 x < y,
(11.63a)
586
11. Linear Inteqral Equations
we can use the Laplace transformation to solve the equations
j
k ( z - y ) p ( y ) dy = f ( z ) (11.63b)
or ~ ( z=) f(z)+ j k ( z - y)cp(y) dy.
0
(11.63~)
0
If the Laplace transforms L{p(z)} = @ ( p ) :L { f ( z ) }= F ( p ) , and L { k ( z ) }= K ( p ) exist, then the transformed equations have the form (see 15.2.1.2. ll., p. 711) K(P)@(P)= F (P )
+
(11.64a) or
@ ( p ) = F ( p ) K ( p ) @ ( p )resp.
(11.64b)
(11.64~)or
O(p)=
F ( p ) resp. 1 - K(P)
(11.64d)
From these we get ~
The inverse transformation gives the solution p(z) of the original problem. Rewriting the formula for the Laplace transform of the solution of the integral equation of the second kind we have (11.64e)
The formula (11.640 depends only on the kernel, and if we denote its inverse by h ( z ) ,the solution is
dz)= f(.) + j h ( .
-
Y).f(Y)
(11.64g)
dY.
0
The function h(z - y j is the resolvent kernel of the integral equation. 1 P-1 Ip(a) = f(z)+ i z e z - y p ( y )dy: @ ( p ) = F ( p ) + - @ ( p ) , Le., @ ( p ) = -F(p). P-1 P-2
The inverse
transformation gives cp(z). From H ( p ) = I it follows that h ( z )= e2‘. By (11.64g) the solution is P-2 p(z) = f ( z ) e2(”-y)f(y) dy.
+ 1’
11.4.5 Numerical Methods for Volterra Integral Equation of
the Second Kind \$’e are to find the solution for the integral equation 2
= f(.)
J
+ K ( 2 ,Y)cp(Y) dY
(11.65)
a
for 5 from the interval I = [alb]. The purpose of numerical methods is somehow to approximate the integral by a quadrature formula:
j
K ( z .y)u(u) du E ~ [ a , r l ( ~,jp(.)), (z,
(11.66a)
a
Both the interval of integration and the quadrature formula depend on z.This fact is emphasized by the index [a, z]of QI.,.,(.. .). We get the following equation as an approximation of (11.65): (11.66b) ~ ( x=)f ( z ) + Q[a,zl(K(z,.)F(.)).
11.1 Volterra Inteoral Eouations 587 The function v(z) is an approximation of the solution of (11.65). The number and the arrangement of the interpolation nodes of the quadrature formula depend on z,so as to allow little choice. If is an interpolation node of Qla,%;(K(z, . ) q ( . )then ) , (K(z,<)F(<)) and especially must be known. For this purpose, the right-hand side of (11.66b) should be evaluated first for z = <,which is equivalent to a quadrature over the interval [a. (1. As a consequence, the use of the popular Gauss quadrature formula is not possible. We solve the problem by choosing the interpolation nodes as a = zo < z1 < . . . < XI; < . . . and we use a quadrature formula &I.,., ! with the interpolation nodes zo,zl, . . . ,z,.The substitution values of the function at the interpolation nodes are denoted by the brief notation pk = F(Q) (k = 0,1! 2 , . . .). For po we have (see 11.3.1, p. 575)
a(<)
po = f ( z o )= f(a), (11.66~) and with this:
<
v1= f ( z l ) + Q[a,rll(K(zl, . ) q ( . ) ) .(11.66d)
has the interpolation points 20 and x1 and consequently it has the form (11.66e) Q ; R . z ~ ] (,)p(,) K ( ~ =~ woK(zi, , zo)Po t WK(zi,zi)pi with suitable coefficients wo and w1. Continuing this procedure, the values pk are successively determined from the general relation:
k = 1 , 2 , 3 , .. . . have the following form:
Pk = f ( Q ) -k Q [ o , z k ] ( K ( z k.)V(.)), r
The quadrature formulas
(11.66f)
( 11.66g) Hence, (11.66f) takes the form: k
Pk
= .f(zk)+ ~
W j k K ( ~ k , ~ j ) p j ~
(11.66h)
j=0
The simplest quadrature formula is the left-hand rectangular formula (see 19.3.2.1, p. 896). For this the coefficients are wjk = X j + l - Zj for j < k and W k k = 0. (11.66i) We have the system Po = f ( a ) , (11.67a) P1 = f(.l) + (21 - zo)K(zl,zo)po, Pz = f(zz) t (21 - zo)K(zz?xo)Po+ (z2- Z1)K(zzrZl)pl and generally k-1
Pk
= f(zk)+ E ( z j + l- z j ) K ( z k i x j ) p j .
(11.67b)
j=O
More accurate approximations of the integral can be obtained by using the trapezoidal formula (see 19.3.2.2, p. 896). To make it simple, we choose equidistant interpolation nodes zk = a t kh, k = 0,1.2. . . . :
(11 . 6 7 ~ ) Using this approximation for (11.66f) we get: Po = f(a)l
(11.67d) (11.67e)
588
11. Linear Inteqral Equations
Altough the unknown values also appear on the right-hand side of the equation, they are easy to express. Remark: With the previous method we can approximate the solution of non-linear integral equations as well. If we use the trapezoidal formula to determine the values 'Pk we have to solve a non-linear equation. We can avoid this by using the trapezoidal formula for the interval [a,~ k - ~ ]and . we use the rectangular formula for the interval [zk-l, zk]. If h is small enough, this quadrature error does not have a significant effect on the solution. ILet'sgivethe approximatevaluesofthesolutionoftheintegralequation ~ ( z=) 2+/'(z-y)q(y)
dy
by the formula (11.664 using the rectangular formula. The interpolation nodes are the equidistant points zk = k 0.1, and hence h = 0.1. rectangular trapezoidal formula formula rpn = 2. exact
I
etc.
0.8 1.0
2.6749 3.0862
2.0602 2.2030 2.4342 2.7629 3.2025
2.0401 2.1620 2.3706 2.6743 3.0852
In the table the values of the exact solution are given, as well as the approximate values calculated by the rectangular and the trapezoidal formulas, respectively, so the accuracies of these methods can be compared. The step size used is h = 0.1.
11.5 Singular Integral Equations .4n integral equation is called a szngular zntegral equatzon if the range of the integral in the equation is not finite, or if the kernel has singularities inside of the range of integration. We suppose that the integrals exist as improper integrals, or as Cauchy principal values (see 8.2.3, p. 451ff.). The properties and the conditions for the solutions of singular integral equations are very different from those in the case of "ordinary" integral equations. We will discuss only some special problems in the following paragraph. For further discussions see [11.8].
11.5.1 Abel Integral Equation
'LY
One of the first applications of integral equations for a physical problem was considered by Abel. A particle is moving in a vertical plane along a curve under the influence only of gravity from the point Po(z0,yo) to the point Pl(O,O)(Fig. 11.2). The velocity of the particle at a point of the curve is (11.68) Figure 11.2
By integration we calculate the time of fall as a function of yo:
(11.69a) If s is considered as a function of y, i.e., s = f(y), then (11.69b)
11.5 Sinaular Intearal Eauations 589
The next problem is to determine the shape of the curve as a function of yo if the time of the fall is given. By substitution
fi,T(Y0) = F ( Y o )
(11 . 6 9 ~ ) and f'(Y) = V(Y) and changing the notation of the variable yo into x, a Volterra integral equation of the first kind is obtained: (11.69d)
We will consider the slightly more general equation (11.70)
The kernel of this equation is not bounded for y = z.In (11.70), the variable y is formally replaced by E and the variable x by y. By these substitutions the solution is obtained in the form 'p = ~ ( z )If. both sides of (11.70) are multiplied by the term
~
(x - y)l-"
and integrated with respect to y between
the limits a and z,it yields the equation
jL(x
- y)'-"
(r""d()dy=/(.dy. (Y - Ela
xf ( Yy)'-" )
(11.71a)
Changing the order of integration on the left-hand side we have
The inner integral can be evaluated by the substitution y = E
F
dY ( x - y)I-"(y
-<)a
du = ./ uo(1- u11-a
-
+ (x- E)u:
sinras) '
(11 . 7 4
We substitute this result into (11.71b). After differentiation with respect to x we get the function p(x):
( 11.71d)
11.5.2 Singular Integral Equation with Cauchy Kernel 11.5.2.1 Formulation of the Problem Consider the following integral equation: (11.72) is a system consisting of a finite number of smooth, simple, closed curves in the complex plane such that they form a connected interior domain St with 0 E Stand an exterior domain S - . Traversing the
590
2 1 . Linear Integral Equations
curve we have S+always on the left-hand side of T . A function u(x)is Holder contznuous (or satisfies the Holder condztzon) over if for any pair 21, x2 E the relations
r
r
(11.73) lU(X1)- U ( X 2 ) J < Klx1 - z#, 0 < p 5 1, K > 0 are valid. We suppose that the functions a(z), f(z), and p(x) are Holder continuous with exponent pl, and K ( s .y) is Holder continuous with respect to both variables with the exponents pz > 81. The kernel K ( z .y)(y - s)-' has a strong singularity for x = y. The integral exists as a Cauchy principal value. mlth K ( x ,x) = b(s)and k ( z ,y) = K(x' - K ( z ' we have (11.72) in the form
Y-"
The expression (Cp)(z)denotes the left-hand side of the integral equation in abbreviated form. L is a singular operator. The function k ( z ,y) is a weakly singular kernel. It is assumed that the normality - b(x)* # 0,z E r holds. The equation condition (11.74b)
is the characterzstic equation pertaining to (11.74a). The operator LOis the characteristic part of the operator C. From the adjoint integral equation of (11.74a) we get the equality
11.5.2.2 Existence of a Solution The equation (Lp)(z)= f(z) has a solution p(z) if and only if for every solution V(y) of the homogeneous adjoint equation (CTw)(y) = 0 the condition of orthogonality
1
f(Y)V(Y)
(11.75a)
dY = 0
r
is satisfied. Similarly, the adjoint equation (CT$)(y) = g(y) has a solution if for every solution p(x) of the homogeneous equation (Cp)(s)= 0 the following is valid: (11.75b)
11.5.2.3 Properties of Cauchy Type Integrals We call the function (11.76a)
r
a Cauchy type integral over r. For z $ the integral exists in the usual sense and the result is a holomorphic function (see 14.1.2, p. 670). We also have @(m)= 0. For z = x E r in (11.76a) we consider the Cauchy principal value (11.76b)
11.5 Singular Integral Eouations 591
The Cauchy type integral @ ( z )can be extended continuously over F from St and from S-.Approaching the point z E r with z we denote the limit by Qt(z)and @-(z),respectively. The formulas of Plemelj and Sochozki are valid:
11.5.2.4 The Hilbert Boundary Value Problem 1. Relations The solution of the characteristic integral equation and the Hilbert boundary value problem strongly correlate. If 9(z) is a solution of (11.74b), then (11.76a) is a holomorphic function on St and S-with @(E) = 0. Because of the formulas of Plemelj and Sochozki (11.76~)we have: p(.) = @+(.) - @-(z). 2(%p)(z) = @+(z) @-(z)) z E (11.77a) With the notation
+
r.
(11.77b) the characteristic integral equation has the form: o+(z) = ~ ( z ) @ - (tz )g(z), z E r.
(11 . 7 7 ~ ) 2. Hilbert Boundary Value Problem We are looking for a function @ ( z )which is holomorphic on S+and S-, and vanishes at infinity, and satisfies the boundary conditions (11.77~)over r. 4 solution @ ( z )of the Hilbert problem can be given in the form (11.76a). So, as a consequence of the first equation of (11.77a), a solution p(z) of the characteristic integral equation is determined.
11.5.2.5 Solution of the Hilbert Boundary Value Problem (in short: Hilbert Problem) 1. Homogeneous Boundary Conditions @'(z) = G ( z ) @ - ( z ) , z E r.
(11.78)
During a single circulation of the point z along the curve rl the value of logG(z) changes by 27riX1, where XI is an integer. The change of the value of the function log G ( z )during a single traverse of the complete curve system is
r
(11.79a) n
The number
K.
=
A/ is called the index of the Hilbert problem. We compose a function
1=0
GO(^) = (z- aO)-T(z)G(z)
( 11.79b)
with (11.794 n(z) = (z- al)'l(z - a#? ' . (z- a,)'", where a. E St and a1 (1 = 1,. . . n ) are arbitrarily fixed points inside f i . If r = ro is a simple closed curve ( n = 0), then we define n(z) = 1. With
.
(11.79d)
the following particular solution of the homogeneous Hilbert problems is obtained, which is called the fundamental solution: n-'(z)expI(z) for z E S+: (1 1.79e) X ( Z ) = ( z - ao)-K expI(2) for z E S-.
{
I
592
11. Linear Inteoral Eauations
This function doesn't vanish for any finite 2. The most general solution of the homogeneous Hilbert problem, which vanishes at infinity, for ti > 0 is @h(Z) = x ( z ) p ~ - l ( z ) , z E (11.80) with an arbitrary polynomial P,-l(z) of degree at most (ti - 1). For K 5 0 there exists only the trivial solution @ h ( Z ) = 0 which satisfies the condition @ h ( a ) = 0, so in this case PK-l(z) I 0. For K > 0 the homogeneous Hilbert problem has ti linearly independent solutions vanishing at infinity.
2. Inhomogeneous Boundary Conditions The solution of the inhomogeneous Hilbert problem is the following:
If IC < 0 holds, for the existence of a solution vanishing at infinity the following necessary and sufficient conditions must be fulfilled: (11.83)
11.5.2.6 Solution of the Characteristic Integral Equation 1. Homogeneous Characteristic Integral Equation If @,,(z) is the solution of the corresponding homogeneous Hilbert problem, from (11.77a) we have the solution of the homogeneous integral equation p h ( 5 ) = @ i ( z )- @ h ( z ) , x E (11.84a) For IC 5 0 only the trivial solution vh(z) = 0 exists. For ti > 0 the general solution is (11.84b) P h b ) = [X'(4 - X-(z)lPn-1(2) with a polynomial P,-l of degree at most ti - 1.
r.
2. Inhomogeneous Characteristic Integral Equation If @ ( z )is a general solution of the inhomogeneous Hilbert problem, the solution of the inhomogeneous integral equation can be given by (11.77a): (11.85a) p(z) = @+(2) - F ( z ) (11.85b) = X + ( T ) R + ( X )- X - ( z ) R - ( 2 )+ @ i ( z )- @jh(2), z E r Using the formulas of Plemelj and Sochozki (11.76~)for R ( z )we have
Substituting (11.85~)into (11.85a) and considering (11.76b) and g(z) = f(z)/(a(z) + b ( z ) )finally results in the the solution:
+
X+(x) X-(z)
dz)= 2(a(z)t b ( z ) ) X + ( zf)( 2 )
According to (11.83) in the case ti < 0 the following relations must hold simultaneously in order to ensure the existence of a solution:
11.5 Singular Integral Equations 593
W The characteristic integral equation is given with constant coefficients a and b: 6 d y ) dy = f(x). Here is a simple closed curve, Le., = (TI = 0). From (11.77b) a g ( x ) + ?ri
121-2
r a - b and g(x) = f (XI . G is a constant, consequently IG = 0. Therefore, n ( x ) = 1 and we get G = atb a+b
\
’
Since K. = 0 holds, the homogeneous Hilbert boundary value problem has only the function @ h ( z ) = 0 as the solution vanishing at infinity. From (11.86) we have
594
12. Functional Analusis
12 Functional Analysis 1. Functional Analysis Functional analysis arose after the recognition of a common structure in different disciplines such as the sciences, engineering and economics. General principles were discovered that resulted in a common and unified approach in calculus, linear algebra, geometry, and other mathematical fields, showing their interrelations.
2. Infinite Dimensional Spaces There are many problems, the mathematical modeling of which requires the introduction of infinite systems of equations or inequalities. Differential or integral equations, approximation, variational or optimization problems could not be treated by using only finite dimensional spaces. 3. Linear and Non-Linear Operators In the first phase of applying functional analysis - mainly in the first half of the twentieth century - linear or linearized problems were thoroughly examined, which resulted in the development of the theory of linear operators. More recently the application of functional analysis in practical problems required the development of the theory of non-linear operators, since more and more problems had to be solved that could be described only by non-linear methods. Functional analysis is increasingly used in solving differential equations, in numerical analysis and in optimization, and its principles and methods became a necessary tool in engineering and other applied sciences.
4. Basic Structures In this chapter only the basic structures will be introduced, and only the most important types of abstract spaces and some special classes of operators in these spaces will be discussed. The abstract notion will be demonstrated by some examples, which are discussed in detail in other chapters of this book, and the existence and uniqueness theorems of the solutions of such problems are stated and proved there. Because of its abstract and general nature it is clear that functional analysis offers a large range of general relations in the form of mathematical theorems that can be directly used in solving a wide variety of practical problems.
12.1 Vector Spaces 12.1.1 NotionofaVector Space A non-empty set V is called a vector space or h e a r space over the field F of scalars if there exist two operations on V - addition of the elements and multiplication by scalars from F - such that they have the following properties: 1. for any two elements z,y E pi, there exists an element z = z + y E V, which is called their sum. 2. For every z E V and every scalar (number) a E F there exists an element ax E V, the product of x and the scalar a so that the following properties, the anoms of vector spaces (see also 5.3.7.1, p. 314), are satisfied for arbitrary elements x,y, z E V and scalars a, D E F: (12.1) ( V l ) 2 (y 2 ) = (z y) 2 . (12.2) (V2) There exists an element 0 E V, the zero element, such that z + 0 = z.
+ +
+ +
(V3) To every vector x there is a vector - z such that x + (-z)= 0. (V4) x + y = y + z . (V5) l . z = 2 .
O.z=O.
(12.3) (12.4) (12.5)
(V6) a ( 0 z )= (aP)z.
(12.6)
+ +
(12.7) (12.8)
(V7) (a + O)z = az Pz. (V8) a ( z + y) = ax ay.
12.1 Vector Spaces 595
V is called a real or complex vector space, depending on whether F is the field R of real numbers or the field C of complex numbers. The elements of V are called either points or, according to linear algebra, vectors. In functional analysis, we do not use the vector notation Z or 2. We can also define in V the difference z - y of two arbitrary vectors x,y E V as x - y = z (-y). From the previous definition, it follows that the equation x+y = z can be solved uniquely for arbitrary elements y and z . The solution is x = z - y. Further properties follow from axioms (Vl)-(V8): the zero element is uniquely defined, ax = Ox and z # 0, imply a = az = ay and a # 0, imply x = y: 0 -(ax)= a ' (-x).
+
a,
12.1.2 Linear and Affine Linear Subsets 1. Linear Subsets A non-empty subset VOof a vector space V is called a linear subspaceor a linear manifoldof V if together with two arbitrary elements x,y E VOand two arbitrary scalars a , p E F, their linear combination ax t 3y is also in Vo. VOis a vector space in its own right, and therefore satisfies the axioms (V1)(V8). The subspace Vo can be V itself or only the zero point. In these cases the subspace is called trivial. 2. Affine Subspaces A subset of a vector space V is called an afine linear subspace or an afine manifoldif it has the form (12.9) {zo t 2/ : Y E Val, where xo E V is a given element and VO is a linear subspace. It can be considered (in the case 10 # 0) as the generalization of the lines or planes not passing through the origin in R3. 3. The Linear Hull The intersection of an arbitrary number of subspaces in V is also a subspace. Consequently, for every non-empty subset E c V, there exists a smallest linear subset lin(E)or [E]in V containing E , namely the intersection of all the linear subspaces, which contain E . The set lin(E)is called the linear hull of the set E . or the linear subspace generated by the set E . It coincides with the set of all (finite) linear combinations (12.10) C y 1 5 1 t azxz . . . a,x,, comprised of elements ~ 1 ~ x. 2. ., 5, E E and scalars a ] ,a2,. . . , a, E F. 4. Examples for Vector Spaces of Sequences I A Vector Space IFn: Let n be a given natural number and V the set of all n-tuples, Le., all finite sequences consisting of n scalar terms { (('1,. . . ,En) : ti E F, i = 1,.. . , n}. The operations will be defined componentwise or termwise, Le., if z = (Ell. . . E,) and y = ( ~ 1. ~. , 7.), are two arbitrary elements from V and a is an arbitrary scalar, a E F, then
+ +
~
a ' x = (aE1,.. . a<,).
(12.11b)
~
In this way, we get the vector space Fn.In the special case of n = 1 we get the linear spaces R or 6. This example can be generalized in two different ways (see examples B and C). I B Vector Space s of all Sequences: If we consider the infinite sequences as elements z = En E F and define the operations componentwise, similar to (12.11a) and (12.11b)!then we get the vector space s of all sequences. I C Vector Space cp(a1so COO) of all Finite Sequences: Let V be the subset of all elements of s containing only a finite number of non-zero components, where the number of non-zero components depends on the element. This vector space - the operations are again introduced termwise is denoted by cp or also by coo.and it is called the space of all finite sequences of numbers. ~
I
12. Functional Analysis
596
ID Vector Space m (also loo) of all Bounded Sequences: A sequence z = {&}F=lbelongs to m if and only if there exists C, > 0 with /,[I 5 C, V n = 1 , 2 , .. . . This vector space is also denoted by 1". IE Vector Space c of all Convergent Sequences: .4sequence z = {<,}F=lbelongs to c if and only if there exists a number (0 E F such that for V E > 0 there exists an index no = no(&)such that for all n > no one has (I, - to1< E (see 7.1.2, p. 403). IF Vector Space cg of all Null Sequences: The vector space co of all null sequences, Le., the subspace of c consisting of all sequences converging to zero ((0 = 0). IG Vector Space lp: The vector space of all sequences z = {&}p=l such that E,"==, I<,lP is convergent is denoted by lp (1 5 p < m). It can be shown by the Minkowski inequality that the sum of two sequences from 1P also belongs to lp, (see 1.4.2.13,p. 32). Remark: For the vector spaces introduced in examples A-G, the following inclusions hold: (12.12) p c co c c c m c s and p C lP c lq c CO, where 1 5 p < q < 03. 5. Examples of Vector Spaces of Functions IA Vector Space F ( T ) : Let V be the set of all real or complex valued functions defined on a given set T , where the operations are defined pointwise, Le., if z = z ( t ) and y = y(t) are two arbitrary elements of V and a E F is an arbitrary scalar, then we define the elements (functions) z+ y and c y . z by the rules ( 12.13a) (z y ) ( t ) = z ( t )+ y ( t ) v t E T , (cyz)(t)=cy.z(t)V t E T . (12.13b) We denote this vector space by F ( T ) . We introduce some of its subspaces in the following examples. IB Vector Space B ( T )or M ( T ) : The space B ( T )is the space of all functions bounded on T . This vector space is often denoted by M ( T ) . In the case of T = N,we get the space M(N) = m from example D of the previous paragraph. IC Vector Space C ( [ a ,b ] ) : The set C([a,b]) of all functions continuous on the interval [a,b] (see 2.1.5.1,p. 57). ~
+
ID Vector Space C(")([a,b ] ) : Let k E N, k 2 1. The set C ( k ) ( [ ab,] ) of all functions k-times continuously differentiable on [a,b] (see 6.1, p. 377-382) is a vector space. At the endpoints a and b of the interval [alb], the derivatives have to be considered as right-hand and left-hand derivatives, respectively. Remark: For the vector spaces of examples A-D of this paragraph, and T = [a, b] the following subspace relations hold: (12.14) C ( k ) ( [ abl). c c(b, 4)c B([a,bl) C F([albl). IE Vector Subspace of C ( [ a ,b]): For any given point to E [a, b ] , the set {z E C([a,b ] ] : z(to) = 0) forms a linear subspace of C ( [ a ,b]).
12.1.3 Linearly Independent Elements 1. Linear Independence A finite subset {XI, , , , ,E,} of a vector space V is called linearly independent if alxl+ . . . + cy,z, = 0 implies cy1 = . . = cy, = 0.
(12.15) Otherwise, it is called lznearly dependent. If a1 = . . . = cy, = 0. then for arbitrary vectors ~ 1 , ... ,x, from V, the vector alzl + . ' . + a,z, is trivially the zero element of V. Linear independence of the vectors zl,. . . ,z, means that the only way to produce the zero element 0 = cqzl+. . . + a , ~ ,is when all coefficients are zero cy1 = . . = cy, = 0. This important notion is well known from linear algebra
12.1 Vector Spaces 597
(see 5.3.7.2, p. 316) and was used for the definition of a fundamental system of homogeneous differential equations (see 9.1.2.3,2., p. 498). An infinite subset E c Vis called linearly independent ifevery finite subset of E is linearly independent. Otherwise, E is called linearly dependent. IIf we denote by ek the sequence whose k-th term is equal to 1 and all the others are 0, then ek is in the space cp and consequently in any space of sequences. The set { e l , ezr. . .} is linearly independent in every one of these spaces. In the space C([O,T I ) , e.g., the system of functions 1, sinnt, cosnt (n = 1,2,3,.. .) is linearly independent, but the functions 1,cos 2t, cos't are linearly dependent (see (2.97), p. 79). 2. Basis and Dimension of a Vector Space A linearly independent subset B from V, which generates the whole space V, i.e., lin(B)= V holds, is called an algebraic basis or a Hamel basis of the vector space V (see 6.3.7.2, p. 315). B = {z~: E E} is a basis of V if and only if every vector z E V can be written in the form z = q x ~where , the (€5
coefficients at are uniquely determined by x and only a finite number of them (depending on z)can be different from zero. Every non-trivial vector space V, i.e., V # {0}, has at least one algebraic basis, and for every linearly independent subset E of V,there exists at least one algebraic basis of V, which contains E . A vector space V is m-dimensional if it possesses a basis consisting of m vectors. That is, there exist m linearly independent vectors in V, and every system of m t 1vectors is linearly dependent. A vector space is infinite dimensional if it has no finite basis, Le., if for every natural number m there are m linearly independent vectors in V. The space F" is n-dimensional, and all the other spaces in examples B-E are infinite dimensional. The subspace /in({l: t , t 2 } ) c C ( [ a ,b])is three-dimensional. In the finite dimensional case, every two bases of the same vector space have the same number of elements. Also in an infinite dimensional vector space any two bases have the same cardinality, which is denoted by dim(\'). The dimension is an invariant quantity of the vector space, it does not depend on the particular choice of an algebraic basis.
12.1.4 Convex Subsets and the Convex Hull 12.1.4.1 Convex Sets A4subset C of a real vector space V is called convex if for every pair of vectors z,y
E C all vectors of the form Ax + (1 - X)y, 0 5 X 5 1, also belong to C. In other words, the set C is convex, if for any two elements z and y, the whole line segment (12.16) {Ax + (1 - X)y : 0 5 x 5 l}, (which is also called an interval), belongs to C. (For examples of convex sets in R2 see the sets denoted by A and B in Fig. 12.5, p. 624.) The intersection of an arbitrary number of convex sets is also a convex set, where the empty set is agreed to be convex. Consequently, for every subset E c V there exists a smallest convex set which contains E , namely, the intersection of all convex subsets of V containing E . It is called the convex hull of the set E and it is denoted by co ( E ) . eo ( E ) is identical to the set of all finite convex linear combinations of elements from E , Le.. eo ( E )consists of all elements of the form X l z l + . . . + Xnz,,where z i r . . , 2 , are arbitrary elements from E and A, E [0,1] satisfy the equality A1 + ' . . + A, = 1. Linear and affine subspaces are always convex.
12.1.4.2 Cones A non-empty subset C of a (real) vector space V is called a convex cone if it satisfies the following properties: 1. C is a convex set.
598
12. Functional Analusis
n o m z E C and X 2 0, it follows that Xz E C. 3. From z E C and -z E C , it follows that z = 0. .L\ cone can be characterized also by 3. together with z , y ~ Cand X , p > O imply X z t p y E C . (12.17) IA: The set R: of all vectors z = ([I,. . . ,En) with non-negative components is a cone in R". IB: The set C+ of all real continuous functions on [a, b] with only non-negative values is a cone in the space C ( [ a .b ] ) with only non-negative terms, Le., [,,2 0. V n , IC: The set of all sequences of real numbers is a cone in s. Analogously we obtain cones in the spaces of examples C-G on p. 593, if we consider the sets of non-negative sequences in these spaces. ID: The set C C 1P (1 5 p < m), consisting of all sequences such that for some a > 0 2.
(12.18) is a convex set in lp, but obviously. not a cone. IE: Examples from R2 see Fig. 12.1: a) convex set, not a cone, b) not convex, c) convex hull.
Yt
Figure 12.1
12.1.5 Linear O p e r a t o rs a n d h n c t i o n a l s 12.1.5.1 Mappings .A mapping T : D --t Y from the set D c X into the set Y is called injective, if (12.19) T ( z )= T(y) ==+ 5 = y: surjective, if for V y E Y there exists an element z E D such that T ( z )= y, (12.20) bijective, if T is both injective and surjective. D is called the domain of the mapping T and is denoted by DT or D ( T ) ,while the subset {y E Y : 3 z E DT with T ( z )= y} of Y is called the range of the mapping T and is denoted by R ( T )or Im(T).
12.1.5.2 Homomorphism and Endomorphism Let X and Y be two vector spaces over the same field F and D a linear subset of X. .4mapping T : D t Y is called linear (or a linear transformation, linear operator or homomorphism), if for arbitrary z,y E D and CY: 3 E F, (12.21) T ( a 5 t By) = cvTz PTy. For a linear operator T we prefer the notation 2 's. which is similarly used for linear functions, while the notation T ( z )is used for general operators. N ( T ) = {z E X : T z = 0 ) is the null space or kernel
+
12.1 Vector Spaces 599
of the operator T and is also denoted by ker(T). A mapping of the vector space X into itself is called an endomorphism. If T is an injective linear mapping, then the mapping defined on R ( T )by y c-t x. such that Tx = y, y E R ( T ) (12.22) is linear. It is denoted by T-': R ( T )-+ X and is called the inverse of T . If U is the vector space F: then a linear mapping f: X t F is called a linearfunctional or a linearfonn.
12.1.5.3 Isomorphic Vector Spaces A bijective linear mapping T X -i Y is called an isomorphism of the vector spaces X and Y. Two vector spaces are called isomorphic provided an isomorphism exists.
12.1.6 Complexification of Real Vector Spaces Every real vector space V can be extended to a complex vector space P. The set (z. y) with z,y
E
consists of all pairs
Pi. The operations (addition and multiplication by a complex number a + ib E C)
are defined as follows: 2/11
+ (zz; Yz) = (21 + 5 2 1 Y l + YZ),
(12.23a) (12.23b)
( a +ib)(z,y) = (az - b y , b x + ay).
Since the special relations (12.24) (z:y) = (x.0 ) (0, y) and i(y, 0) = (0 t il)(y, 0) = (0 y - 1 0, l y 0 . 0 ) = (0, y) hold, the pair (z, y) can also be written as z t iy. The set is a complex vector space, where the set Y is identified with the linear subspace 9, = {(z,0 ) : z E V}, Le.. z E V is considered as (z, 0) or as 5 t io. This procedure is called the complexificationof the vector space V. .4linearly independent subset in V is also linearly independent in 9. The same statement is valid for a basis in V, so dim(V)= dim(v).
+
+
12.1.7 Ordered Vector Spaces 12.1.7.1 Cone and Partial Ordering If a cone C is fixed in a vector space V,then an order can be introduced for certain pairs of vectors in W. Namely, if z - y E C for some z, y E V then we write z 2 y or y 5 z and say x is greater than or equal to y or y is smaller than or equal to z. The pair (V, C) is called an ordered vector space or a vector space partially ordered by the cone C. An element z is called positive, if z 2 0 or, which means the same, if x E C holds. Moreover = {z E r.: z 2 0 ) . (12.25)
c
If we consider the vector space R2ordered by its first quadrant as the cone C(= R t ) , then a typical phenomenon of ordered.vector spaces will be seen. This is referred to as "partially ordered or sometimes as "semi-ordered. Namely, only several pairs of two vectors are comparable. Considering the vectors z = (1,-1) and y = (0,2), neither the vector z - y = (1,-3) nor y - z = (-1.3) is in C, so neither x 2 y nor z 5 y holds. .4n ordering in a vector space, generated by a cone, is always only a partial ordering. It can be shown that the binary relation 2 has the following properties: (01) x 2 z V x E V (reflexivity). (12.26) ( 0 2 ) z 2 y and y 2 z imply z 2 z (transitivity).
(12.27)
(03) z 2 y and a 2 0,
(12.28)
(04) x1 2
y1
and zz 2
CY
YZ
E R,
imply
CY% 2
ay.
imply zl + z2 2 y1 + YZ.
(12.29)
600
12. Functional Analusis
Conversely, if in a vector space V there exists an ordering relation, Le., a binary relation 2 is defined for certain pairs of elemegts and satisfies axioms (01)-(04),and if one puts (12.30) Vt = {z E V: z O } , then it can be shown that V, is a cone. The order >v+ in V induced by Vt is identical to the original order >; consequently, the two possibilities of introducing an order in a vector space are equivalent. A cone C C X is called generating or reproducing if every element z E X can be represented as z = u - u with u , u E C. We also write X = C-C. A: .4n obvious order in the space s (see example B, p. 595) is induced by means of the cone (12.31) C = {z = {&}:=* : En 2 0 V n } (see example C, p. 598). We usually consider the natural coordinatewise order in the spaces of sequences (see (12.12), p 596). This is defined by the cone obtained as the intersection of the considered space with C (see (12.31), p. 600). The positive elements in these ordered vector spaces are then the sequences with non-negative terms. It is clear that we can define other orders by other cones, as well. Then we obtain orderings different from the natural ordering (see [12.17], [12.19]).
>
B: In the real spaces of functions F ( T ) , B ( T ) , C ( [ a ,b]) and C@)([a,b]) (see 12.1.2, 5., p. 596), we define the natural order z 2 y for two functions z and y by z ( t ) 2 y(t), V t E T , or V t E [a,b]. Then z 0 if and only if z is a non-negative function in 2'. The corresponding cones are denoted by Ft(T), B,(T),etc. WecanalsoobtainC+ = C + ( T )= F + ( T ) n C ( T ) i f T = [a,b].
>
12.1.7.2 Order Bounded Sets Let E be an arbitrary non-empty subset of an ordered vector space V. An element z E V is called an upper bound of the set E if for every z E E , z 5 z . An element u E V is a lower bound of E if u 5 z,V z E E . For any two elements z, y E V with z 5 y, the set (12.32) [z, y] = (2: E v: 2 5 v 5 y} is called an order interval or (0)-interval. Obviously, the elements z and y are a lower bound and an upper bound of the set [z, y], respectively, where they even belong to the set. A set E c V is called order bounded or simply (0)-bounded, if E is a subset of an order interval, Le., if there exist two elements u , z E V such that u 5 z 5 z , V z E E or, equivalently, E c [u,z]. A set is called bounded above or bounded below if it has an upper bound, or a lower bound, respectively.
12.1.7.3 Positive Operators A linear operator (see 112.21, [12.17]) T : X -+ Y from an ordered vector space X = (X,Xt) into an ordered vector space Y = (Y: Y,) is called positive, if T(X,) c Y,, Le., T z 2 0 for all z > 0. (12.33)
12.1.7.4 Vector Lattices 1. Vector Lattices In the vector space R' of the real numbers the notions of (0)-boundedness and boundedness (in the usual sense) are identical. It is known that every set of real numbers which is bounded from above has a supremum: the smallest of its upper bounds (or the least upper bound, sometimes denoted by lub). Analogously, if a set of reals is bounded from below, then it has an infimum, the greatest lower bound, sometimes denoted by glb. In a general ordered vector space, the existence of the supremum and infimum cannot be guaranteed even for finite sets. They must be given by axioms. An ordered vector space V is called a vector lattice or a linear lattice or a Riesz space, if for two arbitrary elements z,y E \.'there exists an element z E V with the following properties: 1. z 5 z and y 5 z , 2. if u E V with z 5 u and y 5 u, then z 5 u.
12.1 Vector Spaces 601
Such an element z is uniquely determined, is denoted by z V y , and is called the supremum of x and y (more precisely: supremum of the set consisting of the elements z and y). In a vector lattice, there also exists the infimum for any z and y, which is denoted by IC A y. For applications of positive operators in vector lattices see, e.g., [12.2],[12.3] [12.15]. A vector lattice in which every non-empty subset E that is order bounded from above has a supremum lub(E) (equivalently, if every non-empty subset that is order bounded from below has an infimum glb(E)) is called Dedekind complete or a K-space (Kuntorovzch space).
■ A: In the vector lattice $F([a,b])$ (see 12.1.2, 5., p. 596), the supremum of two functions $x, y$ is calculated pointwise by the formula
$(x \vee y)(t) = \max\{x(t), y(t)\} \quad \forall t \in [a,b].$ (12.34)
In the case of $[a,b] = [0,1]$, $x(t) = 1 - \tfrac{3}{2}t$ and $y(t) = t^2$ (Fig. 12.2), we get
$(x \vee y)(t) = \begin{cases} 1 - \tfrac{3}{2}t, & \text{if } 0 \le t \le \tfrac{1}{2}, \\ t^2, & \text{if } \tfrac{1}{2} \le t \le 1. \end{cases}$ (12.35)
Figure 12.2
■ B: The spaces $C([a,b])$ and $B([a,b])$ (see 12.1.2, 5., p. 596) are also vector lattices, while the ordered vector space $C^{(1)}([a,b])$ is not a vector lattice, since the minimum or maximum of two differentiable functions may fail to be differentiable on $[a,b]$, in general.
A linear operator $T: X \to Y$ from a vector lattice $X$ into a vector lattice $Y$ is called a vector lattice homomorphism or homomorphism of the vector lattice if for all $x, y \in X$
$T(x \vee y) = Tx \vee Ty$ and $T(x \wedge y) = Tx \wedge Ty$. (12.36)
2. Positive and Negative Parts, Modulus of an Element For an arbitrary element $x$ of a vector lattice $V$, the elements
$x^+ = x \vee 0, \quad x^- = (-x) \vee 0 \quad \text{and} \quad |x| = x^+ + x^-$ (12.37)
are called the positive part, negative part, and modulus of the element $x$, respectively. For every element $x \in V$ the three elements $x^+$, $x^-$, $|x|$ are positive, and for $x, y \in V$ the following relations are valid:
$x \le x^+ \le |x|, \quad x = x^+ - x^-, \quad x^+ \wedge x^- = 0, \quad |x| = x \vee (-x),$ (12.38a)
$(x+y)^+ \le x^+ + y^+, \quad (x+y)^- \le x^- + y^-, \quad x \le y \ \text{implies} \ x^+ \le y^+ \ \text{and} \ x^- \ge y^-,$ (12.38b)
and for arbitrary $\alpha \ge 0$: $(\alpha x)^+ = \alpha x^+, \quad (\alpha x)^- = \alpha x^-, \quad |\alpha x| = \alpha |x|,$ (12.38c)
$|x + y| \le |x| + |y|.$ (12.38d)
Figure 12.3
In the vector spaces $F([a,b])$ and $C([a,b])$ we get the positive part, the negative part, and the modulus of a function $x(t)$ by means of the following formulas (Fig. 12.3):
$x^+(t) = \max\{x(t), 0\},$ (12.39a)
$x^-(t) = \max\{-x(t), 0\},$ (12.39b)
$|x|(t) = |x(t)| \quad \forall t \in [a,b].$ (12.39c)
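The pointwise formulas (12.39a)–(12.39c) are easy to check numerically. The following short sketch (an illustration only, not part of the handbook; the sample function is chosen arbitrarily) evaluates the positive part, negative part, and modulus on a grid and verifies the identities (12.37) and (12.38a):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 101)          # grid on [a, b] = [0, 1]
x = 1.0 - 1.5 * t                        # sample function x(t)

x_plus = np.maximum(x, 0.0)              # x^+ = x v 0        (12.39a)
x_minus = np.maximum(-x, 0.0)            # x^- = (-x) v 0     (12.39b)
x_abs = np.abs(x)                        # |x|(t) = |x(t)|    (12.39c)

# identities (12.37), (12.38a) hold pointwise
assert np.allclose(x_abs, x_plus + x_minus)            # |x| = x^+ + x^-
assert np.allclose(x, x_plus - x_minus)                # x   = x^+ - x^-
assert np.allclose(np.minimum(x_plus, x_minus), 0.0)   # x^+ ^ x^- = 0
```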
12.2 Metric Spaces
12.2.1 Notion of a Metric Space
Let $X$ be a set, and suppose a real, non-negative function $\rho(x,y)$ ($x, y \in X$) is defined on $X \times X$. If this function $\rho: X \times X \to \mathbb{R}_+^1$ satisfies the following properties (M1)–(M3) for arbitrary elements $x, y, z \in X$, then it is called a metric or distance in the set $X$, and the pair $X = (X, \rho)$ is called a metric space. The axioms of metric spaces are:
(M1) $\rho(x,y) \ge 0$ and $\rho(x,y) = 0$ if and only if $x = y$ (non-negativity), (12.40)
(M2) $\rho(x,y) = \rho(y,x)$ (symmetry), (12.41)
(M3) $\rho(x,y) \le \rho(x,z) + \rho(z,y)$ (triangle inequality). (12.42)
A metric can be defined on every subset $Y$ of a metric space $X = (X, \rho)$ in a natural way by restricting the metric $\rho$ of the space $X$ to the set $Y$, i.e., by considering $\rho$ only on the subset $Y \times Y$ of $X \times X$. The space $(Y, \rho)$ is called a subspace of the metric space $X$.
■ A: The sets $\mathbb{R}^n$ and $\mathbb{C}^n$ are metric spaces with the Euclidean metric defined for points $x = (\xi_1, \ldots, \xi_n)$ and $y = (\eta_1, \ldots, \eta_n)$ as
$\rho(x,y) = \sqrt{\sum_{k=1}^{n} |\xi_k - \eta_k|^2}.$ (12.43)
■ B: The function
$\rho(x,y) = \max_{1 \le k \le n} |\xi_k - \eta_k|$ (12.44)
for vectors $x = (\xi_1, \ldots, \xi_n)$ and $y = (\eta_1, \ldots, \eta_n)$ also defines a metric in $\mathbb{R}^n$ and $\mathbb{C}^n$, the so-called maximum metric. If $\tilde{x} = (\tilde{\xi}_1, \ldots, \tilde{\xi}_n)$ is an approximation of the vector $x$, then it is of interest to know how large the maximal deviation between the coordinates is: $\max_{1 \le k \le n} |\xi_k - \tilde{\xi}_k|$.
The function
$\rho(x,y) = \sum_{k=1}^{n} |\xi_k - \eta_k|$ (12.45)
for vectors $x, y \in \mathbb{R}^n$ (or $\mathbb{C}^n$) defines a metric in $\mathbb{R}^n$ and $\mathbb{C}^n$, the so-called absolute value metric. The metrics (12.43), (12.44) and (12.45) reduce in the case of $n = 1$ to the absolute value $|x - y|$ in the spaces $\mathbb{R} = \mathbb{R}^1$ and $\mathbb{C}$ (the sets of real and complex numbers).
■ C: Finite 0–1 sequences, e.g., 1110 and 010110, are called words in coding theory. If we count the number of positions in which two words of the same length $n$ have different digits, i.e., for $x = (\xi_1, \ldots, \xi_n)$, $y = (\eta_1, \ldots, \eta_n)$, $\xi_k, \eta_k \in \{0, 1\}$, we define $\rho(x,y)$ as the number of indices $k \in \{1, \ldots, n\}$ such that $\xi_k \ne \eta_k$, then the set of words of a given length $n$ is a metric space, and the metric is the so-called Hamming distance, e.g., $\rho((1110), (0100)) = 2$.
■ D: In the set $m$ and in its subsets $c$ and $c_0$ (see (12.12), p. 596) a metric is defined by
$\rho(x,y) = \sup_{n} |\xi_n - \eta_n|, \quad x = (\xi_1, \xi_2, \ldots), \ y = (\eta_1, \eta_2, \ldots).$ (12.46)
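As an illustration (not part of the original text), the metrics (12.43)–(12.45) and the Hamming distance can be written down directly; the helper names below are ad hoc:

```python
import numpy as np

def euclidean(x, y):          # (12.43)
    return np.sqrt(np.sum(np.abs(x - y) ** 2))

def maximum(x, y):            # (12.44), maximum metric
    return np.max(np.abs(x - y))

def absolute_value(x, y):     # (12.45), absolute value metric
    return np.sum(np.abs(x - y))

def hamming(x, y):            # Hamming distance for 0-1 words of equal length
    return sum(a != b for a, b in zip(x, y))

x, y = np.array([1.0, 2.0, 3.0]), np.array([2.0, 0.0, 3.0])
print(euclidean(x, y), maximum(x, y), absolute_value(x, y))
print(hamming("1110", "0100"))   # -> 2, as in example C
```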
■ E: In the set $l^p$ ($1 \le p < \infty$) of sequences $x = (\xi_1, \xi_2, \ldots)$ with absolutely convergent series $\sum_{n=1}^{\infty} |\xi_n|^p$, a metric is defined by
$\rho(x,y) = \left( \sum_{n=1}^{\infty} |\xi_n - \eta_n|^p \right)^{1/p}.$ (12.47)
■ F: In the sets $B(T)$ and $C([a,b])$ (see 12.1.2, 5., p. 596) a metric is defined by
$\rho(x,y) = \sup_{t \in T} |x(t) - y(t)|.$ (12.48)
■ G: In the set $C^{(k)}([a,b])$ a metric is defined by
$\rho(x,y) = \sum_{l=0}^{k} \max_{t \in [a,b]} |x^{(l)}(t) - y^{(l)}(t)|,$ (12.49)
where (see (12.14)) $C^{(0)}([a,b])$ is understood as $C([a,b])$.
■ H: Consider the set $L^p(\Omega)$ ($1 \le p < \infty$) of the equivalence classes of Lebesgue measurable functions which are defined almost everywhere on a bounded domain $\Omega \subset \mathbb{R}^n$ and satisfy $\int_{\Omega} |x(t)|^p \, d\mu < \infty$ (see also 12.9, p. 633). A metric in this set is defined by
$\rho(x,y) = \left( \int_{\Omega} |x(t) - y(t)|^p \, d\mu \right)^{1/p}.$ (12.50)
12.2.1.1 Balls, Neighborhoods and Open Sets
In a metric space $X = (X, \rho)$, whose elements are also called points, the sets
$B(x_0; r) = \{x \in X : \rho(x, x_0) < r\},$ (12.51)
$\overline{B}(x_0; r) = \{x \in X : \rho(x, x_0) \le r\},$ (12.52)
defined by means of a real number $r > 0$ and a fixed point $x_0$, are called an open and a closed ball with radius $r$ and center $x_0$, respectively. The balls (circles) defined by the metrics (12.43), (12.44) and (12.45) in the vector space $\mathbb{R}^2$ are shown in Fig. 12.4a,b with $x_0 = 0$ and $r = 1$.
Figure 12.4
A subset $U$ of a metric space $X = (X, \rho)$ is called a neighborhood of the point $x_0$ if $U$ contains $x_0$ together with an open ball centered at $x_0$, in other words, if there exists an $r > 0$ such that $B(x_0; r) \subset U$. A neighborhood $U$ of the point $x$ is also denoted by $U(x)$. Obviously, every ball is a neighborhood of its center; an open ball is a neighborhood of all of its points. A point $x_0$ is called an interior point of a set $A \subset X$ if $x_0$ belongs to $A$ together with some of its neighborhoods, i.e., there is a neighborhood $U(x_0)$ such that $x_0 \in U(x_0) \subset A$. A subset of a metric space is called open if all of its points are interior points. Obviously, $X$ is an open set. The open balls in every metric space, especially the open intervals in $\mathbb{R}$, are the prototypes of open sets.
The collection of all open sets satisfies the following axioms of open sets:
If $G_\alpha$ is open for every $\alpha \in I$, then the set $\bigcup_{\alpha \in I} G_\alpha$ is also open.
If $G_1, G_2, \ldots, G_n$ are finitely many arbitrary open sets, then the set $\bigcap_{k=1}^{n} G_k$ is also open.
The empty set $\emptyset$ is open by definition.
A subset $A$ of a metric space is bounded if for a certain element $x_0$ (which does not necessarily belong to $A$) and a real number $R > 0$ the set $A$ lies in the ball $B(x_0; R)$, i.e., $\rho(x, x_0) < R$ for all $x \in A$.
12.2.1.2 Convergence of Sequences in Metric Spaces
Let $X = (X, \rho)$ be a metric space, $x_0 \in X$ and $\{x_n\}_{n=1}^{\infty}$, $x_n \in X$, a sequence of elements of $X$. The sequence $\{x_n\}_{n=1}^{\infty}$ is said to be convergent to the point $x_0$ if for every neighborhood $U(x_0)$ there is an index $n_0 = n_0(U(x_0))$ such that $x_n \in U(x_0)$ for all $n > n_0$. We use the usual notation
$x_n \to x_0 \ (n \to \infty)$ or $\lim_{n \to \infty} x_n = x_0$ (12.53)
and call the point $x_0$ the limit of the sequence $\{x_n\}_{n=1}^{\infty}$. The limit of a sequence is uniquely determined. Instead of arbitrary neighborhoods of the point $x_0$ it is sufficient to consider only open balls with arbitrary radii, so (12.53) is equivalent to the following: $\forall \varepsilon > 0$ (think of the open ball $B(x_0; \varepsilon)$) there is an index $n_0 = n_0(\varepsilon)$ such that $\rho(x_n, x_0) < \varepsilon$ whenever $n > n_0$. Notice that (12.53) means $\rho(x_n, x_0) \to 0$.
With these notions, introduced in special metric spaces, we can calculate the distance between points and investigate the convergence of point sequences. This has great importance in numerical methods and in approximating functions by certain classes of functions (see, e.g., 19.6, p. 914). In the space $\mathbb{R}^n$, equipped with one of the metrics given above, convergence always means coordinatewise convergence. In the spaces $B([a,b])$ and $C([a,b])$, the convergence introduced by (12.48) means uniform convergence of the function sequence on the set $[a,b]$ (see 7.3.2, p. 412). In the space $L^2(\Omega)$, convergence with respect to the metric (12.50) means convergence in the (quadratic) mean, i.e., $x_n \to x_0$ if
$\int_{\Omega} |x_n(t) - x_0(t)|^2 \, d\mu \to 0 \quad (n \to \infty).$ (12.54)
12.2.1.3 Closed Sets and Closure 1. Closed Sets
A subset $F$ of a metric space $X$ is called closed if $X \setminus F$ is an open set. Every closed ball in a metric space, especially every interval of the form $[a,b]$, $[a,\infty)$, $(-\infty,a]$ in $\mathbb{R}$, is a closed set. Corresponding to the axioms of open sets, the collection of all closed sets of a metric space has the following properties:
If $F_\alpha$ is closed for every $\alpha \in I$, then the set $\bigcap_{\alpha \in I} F_\alpha$ is closed.
If $F_1, \ldots, F_n$ are finitely many closed sets, then the set $\bigcup_{k=1}^{n} F_k$ is closed.
The empty set $\emptyset$ is a closed set by definition.
The sets $\emptyset$ and $X$ are open and closed at the same time. A point $x_0$ of a metric space $X$ is called a limit point of the subset $A \subset X$ if for every neighborhood $U(x_0)$,
$U(x_0) \cap A \ne \emptyset.$ (12.55)
If this intersection always contains at least one point different from $x_0$, then $x_0$ is called an accumulation point of the set $A$. A limit point which is not an accumulation point is called an isolated point. An accumulation point of $A$ does not need to belong to the set $A$, e.g., the point $a$ with respect to the set $A = (a, b]$, while an isolated point of $A$ must belong to the set $A$.
A point $x_0$ is a limit point of the set $A$ if there exists a sequence $\{x_n\}_{n=1}^{\infty}$ with elements $x_n$ from $A$ which converges to $x_0$. If $x_0$ is an isolated point, then $x_n = x_0$, $\forall n \ge n_0$, for some index $n_0$.
2. The Closure of a Set Every subset $A$ of a metric space $X$ obviously lies in the closed set $X$. Therefore there always exists a smallest closed set containing $A$, namely the intersection of all closed sets of $X$ which contain $A$. This set is called the closure of the set $A$ and is usually denoted by $\overline{A}$. $\overline{A}$ is identical to the set of all limit points of $A$: we get $\overline{A}$ from the set $A$ by adding all of its accumulation points to it. $A$ is a closed set if and only if $A = \overline{A}$. Consequently, closed sets can be characterized by sequences in the following way: $A$ is closed if and only if for every sequence $\{x_n\}_{n=1}^{\infty}$ of elements of $A$ which converges in $X$ to an element $x_0 \,(\in X)$, the limit $x_0$ also belongs to $A$.
Boundary points of $A$ are defined as follows: $x_0$ is a boundary point of $A$ if for every neighborhood $U(x_0)$, both $U(x_0) \cap A \ne \emptyset$ and $U(x_0) \cap (X \setminus A) \ne \emptyset$; $x_0$ itself does not need to belong to $A$. Another characterization of a closed set is the following: $A$ is closed if and only if it contains all of its boundary points. (The set of boundary points of the metric space $X$ itself is the empty set.)
12.2.1.4 Dense Subsets and Separable Metric Spaces
A subset $A$ of a metric space $X$ is called everywhere dense if $\overline{A} = X$, i.e., each point $x \in X$ is a limit point of the set $A$. That is, for each $x \in X$ there is a sequence $\{x_n\}$, $x_n \in A$, such that $x_n \to x$.
■ A: According to the Weierstrass approximation theorem, every continuous function on a bounded closed interval $[a,b]$ can be approximated arbitrarily well by polynomials in the metric of the space $C([a,b])$, i.e., uniformly. This theorem can now be formulated as follows: the set of polynomials on the interval $[a,b]$ is everywhere dense in $C([a,b])$.
■ B: Further examples of everywhere dense subsets are the set of rational numbers $\mathbb{Q}$ and the set of irrational numbers in the space of the real numbers $\mathbb{R}$.
A metric space $X$ is called separable if there exists a countable everywhere dense subset in $X$. A countable everywhere dense subset in $\mathbb{R}^n$ is, e.g., the set of all vectors with rational components. The space $l = l^1$ is also separable, since a countable everywhere dense subset is formed, for example, by the set of its elements of the form $x = (r_1, r_2, \ldots, r_N, 0, 0, \ldots)$, where the $r_n$ are rational numbers and $N = N(x)$ is an arbitrary natural number. The space $m$ is not separable.
12.2.2 Complete Metric Spaces
12.2.2.1 Cauchy Sequences
Let $X = (X, \rho)$ be a metric space. A sequence $\{x_n\}_{n=1}^{\infty}$ with $x_n \in X$ is called a Cauchy sequence if for every $\varepsilon > 0$ there is an index $n_0 = n_0(\varepsilon)$ such that for all $n, m > n_0$ the inequality
$\rho(x_n, x_m) < \varepsilon$ (12.56)
holds. Every Cauchy sequence is a bounded set. Furthermore, every convergent sequence is a Cauchy sequence. In general, the converse statement is not true, as the following example shows.
■ Consider the space $l^1$ with the metric (12.46) of the space $m$. The elements
$x^{(n)} = \left(1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, \tfrac{1}{n}, 0, 0, \ldots\right)$
obviously belong to $l^1$ for every $n = 1, 2, \ldots$, and the sequence $\{x^{(n)}\}_{n=1}^{\infty}$ is a Cauchy sequence with respect to this metric. If the sequence (of sequences) $\{x^{(n)}\}_{n=1}^{\infty}$ converged, then it would have to converge coordinatewise to the element $x^{(0)} = \left(1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\right)$. However, $x^{(0)}$ does not belong to $l^1$, since $\sum_{n=1}^{\infty} \tfrac{1}{n} = +\infty$ (see 7.2.1.1, 2., p. 404, harmonic series).
12.2.2.2 Complete Metric Spaces
A metric space $X$ is called complete if every Cauchy sequence converges in $X$. Hence complete metric spaces are the spaces in which the Cauchy principle, known from real calculus, is valid: a sequence is convergent if and only if it is a Cauchy sequence. Every closed subspace of a complete metric space (considered as a metric space on its own) is complete. The converse statement holds in the following sense: if a subspace $Y$ of a (not necessarily complete) metric space $X$ is complete, then the set $Y$ is closed in $X$.
■ Complete metric spaces are, e.g., the spaces $m$, $c$, $l^p$ ($1 \le p < \infty$), $B(T)$, $C([a,b])$, $C^{(k)}([a,b])$, $L^p(a,b)$ ($1 \le p < \infty$).
12.2.2.3 Some Fundamental Theorems in Complete Metric Spaces
The importance of complete metric spaces can be illustrated by a series of theorems and principles which are known and used in real calculus and which we want to apply even in the case of infinite dimensional spaces.
1. Theorem on Nested Balls Let $X$ be a complete metric space. If
$\overline{B}(x_1; r_1) \supset \overline{B}(x_2; r_2) \supset \cdots \supset \overline{B}(x_n; r_n) \supset \cdots$ (12.57)
is a sequence of nested closed balls with $r_n \to 0$, then the intersection of all of these balls is non-empty and consists of exactly one point. If this property holds in some metric space for every sequence of balls satisfying the assumptions, then the metric space is complete.
2. Baire Category Theorem Let $X$ be a complete metric space and $\{F_k\}_{k=1}^{\infty}$ a sequence of closed sets in $X$ with $\bigcup_{k=1}^{\infty} F_k = X$. Then there exists at least one index $k_0$ such that the set $F_{k_0}$ has an interior point.
3. Banach Fixed-Point Theorem Let $F$ be a non-empty closed subset of a complete metric space $(X, \rho)$. Let $T: X \to X$ be a contracting operator on $F$, i.e., there exists a constant $q \in [0,1)$ such that $\rho(Tx, Ty) \le q\,\rho(x,y)$ for all $x, y \in F$; suppose further that $x \in F$ implies $Tx \in F$. Then the following statements are valid:
a) For an arbitrary initial point $x_0 \in F$ the iteration
$x_{n+1} := Tx_n \quad (n = 0, 1, 2, \ldots)$ (12.58)
is well defined, i.e., $x_n \in F$ for every $n$.
b) The iteration sequence $\{x_n\}_{n=1}^{\infty}$ converges to an element $x^* \in F$. (12.59)
c) $Tx^* = x^*$, i.e., $x^*$ is a fixed point of the operator $T$. (12.60)
d) The only fixed point of $T$ in $F$ is $x^*$.
e) The following error estimate is valid:
$\rho(x_n, x^*) \le \frac{q^n}{1-q}\,\rho(x_1, x_0) \quad (n = 1, 2, \ldots).$ (12.61)
The Banach fixed-point theorem is sometimes called the contraction mapping principle.
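A minimal numerical sketch of the contraction mapping principle (illustration only, not from the handbook): the map $Tx = \tfrac{1}{2}\cos x$ is a contraction on $\mathbb{R}$ with $q = \tfrac{1}{2}$, and the a priori bound (12.61) can be checked along the iteration (12.58).

```python
import math

def T(x):
    return 0.5 * math.cos(x)      # contraction on R with q = 1/2

q, x0 = 0.5, 0.0
x = [x0, T(x0)]
for n in range(1, 30):            # iterate x_{n+1} = T x_n   (12.58)
    x.append(T(x[-1]))

x_star = x[-1]                    # numerically converged fixed point x*
for n in (1, 5, 10):
    bound = q**n / (1 - q) * abs(x[1] - x[0])      # a priori estimate (12.61)
    print(n, abs(x[n] - x_star) <= bound + 1e-12)  # True for all n
```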
12.2.2.4 Some Applications of the Contraction Mapping Principle
1. Iteration Method for Solving a System of Linear Equations The given linear $(n, n)$ system of equations
$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\ &\ \,\vdots\\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n \end{aligned}$ (12.62a)
can be transformed, according to 19.2.1, p. 887, into the equivalent system
$\begin{aligned} x_1 - (1 - a_{11})x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\ x_2 + a_{21}x_1 - (1 - a_{22})x_2 + \cdots + a_{2n}x_n &= b_2,\\ &\ \,\vdots\\ x_n + a_{n1}x_1 + a_{n2}x_2 + \cdots - (1 - a_{nn})x_n &= b_n. \end{aligned}$ (12.62b)
If the operator $T: \mathbb{F}^n \to \mathbb{F}^n$ is defined by
$Tx = \left( x_1 - \sum_{k=1}^{n} a_{1k}x_k + b_1, \ \ldots, \ x_n - \sum_{k=1}^{n} a_{nk}x_k + b_n \right),$ (12.63)
then the last system is transformed into the fixed-point problem
$x = Tx$ (12.64)
in the metric space $\mathbb{F}^n$, where an appropriate metric is considered: the Euclidean metric (12.43), the maximum metric (12.44) or the absolute value metric $\rho(x,y) = \sum_{k=1}^{n} |x_k - y_k|$ (compare with (12.45)). If, with $c_{jk} = \delta_{jk} - a_{jk}$, one of the numbers
$\left( \sum_{j,k=1}^{n} |c_{jk}|^2 \right)^{1/2}, \quad \max_{j} \sum_{k=1}^{n} |c_{jk}|, \quad \max_{k} \sum_{j=1}^{n} |c_{jk}|$ (12.65)
is smaller than one, then $T$ turns out to be a contracting operator. It then has exactly one fixed point according to the Banach fixed-point theorem, which is the componentwise limit of the iteration sequence started from an arbitrary point of $\mathbb{F}^n$.
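The following sketch (an illustration, not the handbook's algorithm; the coefficient matrix is chosen arbitrarily) carries out the iteration $x^{(m+1)} = Tx^{(m)}$ from (12.63)/(12.64) for a small system and checks one of the contraction conditions (12.65):

```python
import numpy as np

A = np.array([[1.0, 0.2, 0.1],
              [0.1, 0.9, 0.2],
              [0.2, 0.1, 1.1]])
b = np.array([1.0, 2.0, 3.0])

C = np.eye(3) - A                        # c_jk = delta_jk - a_jk
row_sum = np.max(np.abs(C).sum(axis=1))  # contraction constant for the maximum metric
assert row_sum < 1                       # one of the conditions (12.65)

x = np.zeros(3)                          # arbitrary starting point
for _ in range(200):                     # iteration x <- x - Ax + b, cf. (12.63)
    x = x - A @ x + b

print(np.allclose(A @ x, b))             # True: the fixed point solves (12.62a)
```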
2. Fredholm Integral Equations The Fredholm integral equation of the second kind (see also 11.2, p. 562)
$\varphi(x) - \int_a^b K(x,y)\,\varphi(y)\, dy = f(x), \quad x \in [a,b],$ (12.66)
with a continuous kernel $K(x,y)$ and continuous right-hand side $f(x)$ can be solved by iteration. By means of the operator $T: C([a,b]) \to C([a,b])$ defined as
$(T\varphi)(x) = \int_a^b K(x,y)\,\varphi(y)\, dy + f(x) \quad \forall \varphi \in C([a,b]),$ (12.67)
it is transformed into the fixed-point problem $T\varphi = \varphi$ in the metric space $C([a,b])$ (see example A in 12.1.2, 4., p. 595). If $\max_{x \in [a,b]} \int_a^b |K(x,y)|\, dy < 1$, then $T$ is a contracting operator and the fixed-point theorem can be applied. The unique solution is then obtained as the uniform limit of the iteration sequence $\{\varphi_n\}_{n=1}^{\infty}$, where $\varphi_n = T\varphi_{n-1}$, starting with an arbitrary function $\varphi_0(x) \in C([a,b])$. It is clear that $\varphi_n = T^n\varphi_0$, so the iteration sequence is $\{T^n\varphi_0\}_{n=1}^{\infty}$.
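A discretized sketch of this successive approximation (the kernel and right-hand side below are assumptions chosen only for illustration): the integral in (12.67) is replaced by the trapezoidal rule on a grid.

```python
import numpy as np

a, b, N = 0.0, 1.0, 201
x = np.linspace(a, b, N)
w = np.full(N, (b - a) / (N - 1)); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal weights

K = 0.5 * np.exp(-np.abs(x[:, None] - x[None, :]))   # kernel with max_x int |K(x,y)| dy < 1
f = np.sin(np.pi * x)                                 # right-hand side f(x)

phi = np.zeros(N)                                     # arbitrary phi_0
for _ in range(100):                                  # phi_{n+1} = T phi_n, cf. (12.67)
    phi = K @ (w * phi) + f

residual = phi - (K @ (w * phi) + f)                  # ~0 at the fixed point of T
print(np.max(np.abs(residual)) < 1e-10)
```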
3. Volterra Integral Equations The Volterra integral equation of the second kind (see 11.4, p. 583)
$\varphi(x) - \int_a^x K(x,y)\,\varphi(y)\, dy = f(x), \quad x \in [a,b],$ (12.68)
with a continuous kernel and a continuous right-hand side can be solved by means of the Volterra integral operator
$(V\varphi)(x) = \int_a^x K(x,y)\,\varphi(y)\, dy, \quad \varphi \in C([a,b]),$ (12.69)
and $T\varphi = f + V\varphi$ as the fixed-point problem $T\varphi = \varphi$ in the space $C([a,b])$.
4. Picard–Lindelöf Theorem Consider the differential equation
$\dot{x} = f(t,x)$ (12.70)
with a continuous mapping $f: I \times G \to \mathbb{R}^n$, where $I$ is an open interval of $\mathbb{R}$ and $G$ is an open domain of $\mathbb{R}^n$. Suppose the function $f$ satisfies a Lipschitz condition with respect to $x$ (see 9.1.1.1, 2., p. 486), i.e., there is a positive constant $L$ such that
$\varrho(f(t,x_1), f(t,x_2)) \le L\,\varrho(x_1, x_2) \quad \forall (t,x_1), (t,x_2) \in I \times G,$ (12.71)
where $\varrho$ is the Euclidean metric in $\mathbb{R}^n$. (Using the norm (see 12.3.1, p. 609) and the formula (12.81), $\varrho(x,y) = \|x - y\|$, (12.71) can be written as $\|f(t,x_1) - f(t,x_2)\| \le L\|x_1 - x_2\|$.) Let $(t_0, x_0) \in I \times G$. Then there are numbers $\beta > 0$ and $r > 0$ such that the set $\Omega = \{(t,x) \in \mathbb{R} \times \mathbb{R}^n: |t - t_0| \le \beta,\ \varrho(x, x_0) \le r\}$ lies in $I \times G$. Let $M = \max_{\Omega} \varrho(f(t,x), 0)$ and $\alpha = \min\{\beta, \tfrac{r}{M}\}$. Then there is a number $b > 0$ such that for each $\tilde{x} \in B = \{x \in \mathbb{R}^n: \varrho(x, x_0) \le b\}$, the initial value problem
$\dot{x} = f(t,x), \quad x(t_0) = \tilde{x}$ (12.72)
has exactly one solution $\varphi(t, \tilde{x})$, i.e., $\dot{\varphi}(t, \tilde{x}) = f(t, \varphi(t, \tilde{x}))$ for all $t$ satisfying $|t - t_0| \le \alpha$, and $\varphi(t_0, \tilde{x}) = \tilde{x}$. The solution of this initial value problem is equivalent to the solution of the integral equation
$\varphi(t, \tilde{x}) = \tilde{x} + \int_{t_0}^{t} f(s, \varphi(s, \tilde{x}))\, ds, \quad t \in [t_0 - \alpha, t_0 + \alpha].$ (12.73)
If $X$ denotes the closed ball $\{\varphi(t,x): \varrho(\varphi(t,x), x_0) \le r\}$ in the complete metric space $C([t_0 - \alpha, t_0 + \alpha] \times B;\ \mathbb{R}^n)$ with metric
$d(\varphi, \psi) = \max_{(t,x) \in [t_0-\alpha,\, t_0+\alpha] \times B} \varrho(\varphi(t,x), \psi(t,x)),$ (12.74)
then $X$ is a complete metric space with the induced metric. If the operator $T: X \to X$ is defined by
$T\varphi(t,x) = x + \int_{t_0}^{t} f(s, \varphi(s,x))\, ds,$ (12.75)
then $T$ is a contracting operator, and the solution of the integral equation (12.73) is the unique fixed point of $T$, which can be calculated by iteration.
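A numerical sketch of the Picard iteration (12.75) for a single equation (illustration only; the right-hand side $f(t,x) = x$ with $x(0) = 1$ is an assumed example whose exact solution is $e^t$):

```python
import numpy as np

t0, alpha, x_tilde = 0.0, 0.5, 1.0
t = np.linspace(t0, t0 + alpha, 501)

def f(t, x):
    return x                       # right-hand side of x' = f(t, x)

phi = np.full_like(t, x_tilde)     # phi_0(t) = x_tilde
for _ in range(20):                # Picard iteration, cf. (12.75)
    integrand = f(t, phi)
    integral = np.concatenate(([0.0], np.cumsum(
        0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))))
    phi = x_tilde + integral       # (T phi)(t) = x_tilde + int_{t0}^t f(s, phi(s)) ds

print(np.max(np.abs(phi - np.exp(t))) < 1e-4)   # close to the exact solution e^t
```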
12.2.2.5 Completion of a Metric Space
Every (non-complete) metric space $X$ can be completed; more precisely, there exists a metric space $\tilde{X}$ with the following properties:
a) $\tilde{X}$ contains a subspace $Y$ isometric to $X$ (see 12.2.3, 2., p. 609).
b) $Y$ is everywhere dense in $\tilde{X}$.
c) $\tilde{X}$ is a complete metric space.
d) If $Z$ is any metric space with the properties a)–c), then $Z$ and $\tilde{X}$ are isometric.
The complete metric space defined uniquely in this way, up to isometry, is called the completion of the space $X$.
12.2.3 Continuous Operators 1. Continuous Operators
Let $T: X \to Y$ be a mapping of the metric space $X = (X, \rho)$ into the metric space $Y = (Y, \varrho)$. $T$ is said to be continuous at the point $x_0 \in X$ if for every neighborhood $V = V(y_0)$ of the point $y_0 = T(x_0)$
there is a neighborhood $U = U(x_0)$ such that
$T(x) \in V$ for all $x \in U$. (12.76)
$T$ is called continuous on the set $A \subset X$ if $T$ is continuous at every point of $A$. Equivalent properties for $T$ to be continuous on $X$ are:
a) For any point $x \in X$ and any sequence $\{x_n\}_{n=1}^{\infty}$, $x_n \in X$, with $x_n \to x$, there always holds $T(x_n) \to T(x)$; hence $\rho(x_n, x) \to 0$ implies $\varrho(T(x_n), T(x)) \to 0$.
b) For any open subset $G \subset Y$ the inverse image $T^{-1}(G)$ is an open subset of $X$.
c) For any closed subset $F \subset Y$ the inverse image $T^{-1}(F)$ is a closed subset of $X$.
d) For any subset $A \subset X$ one has $T(\overline{A}) \subset \overline{T(A)}$.
2. Isometric Spaces If there is a bijective mapping $T: X \to Y$ for two metric spaces $X = (X, \rho)$ and $Y = (Y, \varrho)$ such that
$\rho(x, y) = \varrho(T(x), T(y)) \quad \forall x, y \in X,$ (12.77)
then the spaces $X$ and $Y$ are called isometric, and $T$ is called an isometry.
12.3 Normed Spaces
12.3.1 Notion of a Normed Space
12.3.1.1 Axioms of a Normed Space
Let $X$ be a vector space over the field $\mathbb{F}$. A function $\|\cdot\|: X \to \mathbb{R}_+^1$ is called a norm on the vector space $X$, and the pair $X = (X, \|\cdot\|)$ is called a normed space over the field $\mathbb{F}$, if for arbitrary elements $x, y \in X$ and for any scalar $\alpha \in \mathbb{F}$ the following properties, the so-called axioms of a normed space, are fulfilled:
(N1) $\|x\| \ge 0$, and $\|x\| = 0$ if and only if $x = 0$, (12.78)
(N2) $\|\alpha x\| = |\alpha| \cdot \|x\|$ (homogeneity), (12.79)
(N3) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality). (12.80)
A metric can be introduced in any normed space by means of
$\rho(x, y) = \|x - y\|, \quad x, y \in X.$ (12.81)
The metric (12.81) has the following additional properties, which are compatible with the structure of the vector space:
$\rho(x + z, y + z) = \rho(x, y),$ (12.82a)
$\rho(\alpha x, \alpha y) = |\alpha|\,\rho(x, y).$ (12.82b)
So in a normed space both the properties of a vector space and the properties of a metric space are available, and these properties are compatible in the sense of (12.82a) and (12.82b). The advantage is that most of the local investigations can be restricted to the unit ball
$B(0; 1) = \{x \in X: \|x\| < 1\}$ or $\overline{B}(0; 1) = \{x \in X: \|x\| \le 1\},$ (12.83)
since
$B(x; r) = \{y \in X: \|y - x\| < r\} = x + rB(0; 1), \quad \forall x \in X \ \text{and} \ \forall r > 0.$ (12.84)
Moreover, the algebraic operations in a vector space are continuous, i.e., $x_n \to x$, $y_n \to y$, $\alpha_n \to \alpha$ imply
$x_n + y_n \to x + y, \quad \alpha_n x_n \to \alpha x, \quad \|x_n\| \to \|x\|.$ (12.85)
In normed spaces, instead of (12.53) we may write for convergent sequences
$\|x_n - x_0\| \to 0 \quad (n \to \infty).$ (12.86)
12.3.1.2 Some Properties of Normed Spaces
Among the linear metric spaces, those spaces are normable (i.e., a norm can be introduced by means of the metric, by defining $\|x\| = \rho(x, 0)$) whose metric satisfies the conditions (12.82a) and (12.82b). Two normed spaces $X$ and $Y$ are called norm isomorphic if there is a bijective linear mapping $T: X \to Y$ with $\|Tx\| = \|x\|$ for all $x \in X$.
Let $\|\cdot\|_1$ and $\|\cdot\|_2$ be two norms on the vector space $X$, and denote the corresponding normed spaces by $X_1$ and $X_2$, i.e., $X_1 = (X, \|\cdot\|_1)$ and $X_2 = (X, \|\cdot\|_2)$. The norm $\|\cdot\|_1$ is stronger than the norm $\|\cdot\|_2$ if there is a number $\gamma > 0$ such that $\|x\|_2 \le \gamma \|x\|_1$ for all $x \in X$. In this case the convergence of a sequence to $x$ with respect to the stronger norm $\|\cdot\|_1$, i.e., $\|x_n - x\|_1 \to 0$, implies the convergence to $x$ with respect to the norm $\|\cdot\|_2$, i.e., $\|x_n - x\|_2 \to 0$.
Two norms $\|\cdot\|$ and $|||\cdot|||$ are called equivalent if there are two numbers $\gamma_1 > 0$, $\gamma_2 > 0$ such that $\gamma_1 \|x\| \le |||x||| \le \gamma_2 \|x\|$ for all $x \in X$. In a finite dimensional vector space all norms are equivalent to each other.
A subspace of a normed space is a closed linear subspace of the space.
12.3.2 Banach Spaces
A complete normed space is called a Banach space. Every normed space $X$ can be completed into a Banach space $\tilde{X}$ by the completion procedure given in 12.2.2.5, p. 608, and by the natural extension of its algebraic operations and the norm to $\tilde{X}$.
12.3.2.1 Series in Normed Spaces
In a normed space $X$ we can consider infinite series. That means for a given sequence $\{x_n\}_{n=1}^{\infty}$ of elements $x_n \in X$ a new sequence $\{s_k\}_{k=1}^{\infty}$ is constructed by
$s_1 = x_1, \quad s_2 = x_1 + x_2, \ \ldots, \ s_k = s_{k-1} + x_k.$ (12.87)
If the sequence $\{s_k\}_{k=1}^{\infty}$ is convergent, i.e., $\|s_k - s\| \to 0$ $(k \to \infty)$ for some $s \in X$, then a convergent series is defined. The elements $s_1, s_2, \ldots, s_k, \ldots$ are called the partial sums of the series. The limit
$s = \lim_{k \to \infty} s_k$ (12.88)
is the sum of the series, and we write $s = \sum_{n=1}^{\infty} x_n$. A series $\sum_{n=1}^{\infty} x_n$ is called absolutely convergent if the number series $\sum_{n=1}^{\infty} \|x_n\|$ is convergent. In a Banach space every absolutely convergent series is convergent, and $\|s\| \le \sum_{n=1}^{\infty} \|x_n\|$ holds for its sum $s$.
12.3.2.2 Examples of Banach Spaces
■ A: $\mathbb{F}^n$ with
$\|x\| = \left( \sum_{k=1}^{n} |\xi_k|^p \right)^{1/p}$ if $1 \le p < \infty$; $\quad \|x\| = \max_{1 \le k \le n} |\xi_k|$ if $p = \infty$. (12.89a)
These normed spaces over the same vector space $\mathbb{F}^n$ are often denoted by $l^p(n)$ ($1 \le p \le \infty$). For $1 \le p < \infty$ we call them Euclidean spaces in the case of $\mathbb{F} = \mathbb{R}$, and unitary spaces in the case of $\mathbb{F} = \mathbb{C}$.
■ B: $m$ with $\|x\| = \sup_{n} |\xi_n|$. (12.89b)
■ C: $c$ and $c_0$ with the norm from $m$. (12.89c)
■ D: $l^p$ ($1 \le p < \infty$) with $\|x\| = \left( \sum_{n=1}^{\infty} |\xi_n|^p \right)^{1/p}$. (12.89d)
■ E: $C([a,b])$ with $\|x\| = \max_{t \in [a,b]} |x(t)|$. (12.89e)
■ F: $L^p((a,b))$ ($1 \le p < \infty$) with $\|x\| = \left( \int_a^b |x(t)|^p \, dt \right)^{1/p}$. (12.89f)
■ G: $C^{(k)}([a,b])$ with $\|x\| = \sum_{l=0}^{k} \max_{t \in [a,b]} |x^{(l)}(t)|$, where $x^{(0)}(t)$ stands for $x(t)$. (12.89g)
12.3.2.3 Sobolev Spaces
Let $\Omega \subset \mathbb{R}^n$ be a bounded domain, i.e., an open connected set, with a sufficiently smooth boundary $\partial\Omega$. For $n = 1$ or $n = 2, 3$ we can imagine $\Omega$ being something similar to an interval $(a,b)$ or a convex set. A function $f: \overline{\Omega} \to \mathbb{R}$ is $k$-times continuously differentiable on the closed domain $\overline{\Omega}$ if $f$ is $k$-times continuously differentiable on $\Omega$ and each of its partial derivatives has a finite limit on the boundary, i.e., if $x$ approaches an arbitrary point of $\partial\Omega$. In other words, all partial derivatives can be continuously extended to the boundary of $\Omega$, i.e., each partial derivative is a continuous function on $\overline{\Omega}$. In this vector space (for $p \in [1, \infty)$) and with the Lebesgue measure $\lambda$ in $\mathbb{R}^n$ (see example C in 12.9.1, 2., p. 634) the following norm is defined:
$\|f\|_{k,p} = \left( \int_{\Omega} |f(x)|^p \, d\lambda + \sum_{1 \le |\alpha| \le k} \int_{\Omega} |D^{\alpha} f(x)|^p \, d\lambda \right)^{1/p}.$ (12.90)
The resulting normed space is denoted by $\widetilde{W}^{k,p}(\Omega)$ or also by $\widetilde{W}_p^k(\Omega)$ (in contrast to the space $C^{(k)}([a,b])$, which has a quite different norm). Here $\alpha$ means a multi-index, i.e., an ordered $n$-tuple $(\alpha_1, \ldots, \alpha_n)$ of non-negative integers, where the sum of the components of $\alpha$ is denoted by $|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n$. For a function $f(x) = f(\xi_1, \ldots, \xi_n)$ with $x = (\xi_1, \ldots, \xi_n) \in \overline{\Omega}$ we use the brief notation, as in (12.90),
$D^{\alpha} f = \frac{\partial^{|\alpha|} f}{\partial \xi_1^{\alpha_1} \cdots \partial \xi_n^{\alpha_n}}.$ (12.91)
The normed space $\widetilde{W}^{k,p}(\Omega)$ is not complete. Its completion is denoted by $W^{k,p}(\Omega)$, or in the case of $p = 2$ by $H^k(\Omega)$, and it is called a Sobolev space.
12.3.3 Ordered Normed Spaces
1. Cones in a Normed Space Let $X$ be a real normed space with the norm $\|\cdot\|$. A cone $X_+ \subset X$ (see 12.1.4.2, p. 597) is called solid if $X_+$ contains a ball (with positive radius), or, equivalently, if $X_+$ contains at least one interior point.
■ The usual cones are solid in the spaces $\mathbb{R}$, $C([a,b])$, $c$, but in the spaces $L^p((a,b))$ and $l^p$ ($1 \le p < \infty$) they are not solid.
A cone $X_+$ is called normal if the norm in $X$ is semimonotonic, i.e., there exists a constant $M > 0$ such that
$0 \le x \le y \ \Longrightarrow \ \|x\| \le M \|y\|.$ (12.92)
If $X$ is a Banach space ordered by a cone $X_+$, then every (o)-interval is bounded with respect to the norm if and only if the cone $X_+$ is normal.
■ The cones of the vectors with non-negative components and of the non-negative functions in the spaces $\mathbb{R}^n$, $m$, $c$, $c_0$, $C$, $l^p$ and $L^p$, respectively, are normal.
A cone is called regular if every monotonically increasing sequence which is bounded above,
$x_1 \le x_2 \le \cdots \le x_n \le \cdots \le z,$ (12.93)
is a Cauchy sequence in $X$. In a Banach space every closed regular cone is normal.
■ The cones in $\mathbb{R}^n$, $l^p$ and $L^p$ for $1 \le p < \infty$ are regular, but in $C$ and $m$ they are not.
2. Normed Vector Lattices and Banach Lattices Let $X$ be a vector lattice which is at the same time a normed space. $X$ is called a normed lattice or normed vector lattice (see [12.15], [12.19], [12.22], [12.23]) if the norm satisfies the condition
$|x| \le |y|$ implies $\|x\| \le \|y\|$ $\forall x, y \in X$ (monotonicity of the norm). (12.94)
A normed lattice which is complete with respect to the norm is called a Banach lattice.
■ The spaces $C([a,b])$, $L^p$, $l^p$, $B([a,b])$ are Banach lattices.
12.3.4 Normed Algebras
A vector space $X$ over $\mathbb{F}$ is called an algebra if, in addition to the operations defined in the vector space $X$ and satisfying the axioms (V1)–(V8) (see 12.1.1, p. 594), a product $x \cdot y \in X$ (written simply $xy$) is defined for every two elements $x, y \in X$ so that for arbitrary $x, y, z \in X$ and $\alpha \in \mathbb{F}$ the following conditions are satisfied:
(A1) $x(yz) = (xy)z$, (12.95)
(A2) $x(y + z) = xy + xz$, (12.96)
(A3) $(x + y)z = xz + yz$, (12.97)
(A4) $\alpha(xy) = (\alpha x)y = x(\alpha y)$. (12.98)
An algebra is commutative if $xy = yx$ holds for any two elements $x, y$. A linear operator (see (12.21), p. 598) $T: X \to Y$ of the algebra $X$ into the algebra $Y$ is called an algebra homomorphism if for any $x_1, x_2 \in X$:
$T(x_1 \cdot x_2) = Tx_1 \cdot Tx_2.$ (12.99)
An algebra $X$ is called a normed algebra or a Banach algebra if it is a normed vector space or a Banach space, respectively, and the norm has the additional property
$\|x \cdot y\| \le \|x\| \cdot \|y\|.$ (12.100)
In a normed algebra all the operations are continuous, i.e., in addition to (12.85), if $x_n \to x$ and $y_n \to y$, then also $x_n y_n \to xy$ (see [12.20]). Every normed algebra can be completed to a Banach algebra, where the product is extended to the norm completion in accordance with (12.100).
■ A: $C([a,b])$ with the norm (12.89e) and the usual (pointwise) product of continuous functions.
■ B: The vector space $W([0, 2\pi])$ of all complex-valued functions $x(t)$ continuous on $[0, 2\pi]$ and having an absolutely convergent Fourier series expansion, i.e.,
$x(t) = \sum_{n=-\infty}^{\infty} c_n e^{int},$ (12.101)
with the norm $\|x\| = \sum_{n=-\infty}^{\infty} |c_n|$ and the usual multiplication.
■ C: The space $L(X)$ of all bounded linear operators on the normed space $X$ with the operator norm and the usual algebraic operations (see 12.5.1.2, p. 617), where the product $TS$ of two operators is defined as their sequential application, i.e., $TS(x) = T(S(x))$, $x \in X$.
■ D: The space $L^1(-\infty, \infty)$ of all measurable and absolutely integrable functions on the real axis (see 12.9, p. 633) with the norm
$\|x\| = \int_{-\infty}^{\infty} |x(t)|\, dt$ (12.102)
is a Banach algebra if the multiplication is defined as the convolution
$(x * y)(t) = \int_{-\infty}^{\infty} x(t - s)\, y(s)\, ds.$
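As a numerical illustration (not from the text; the two functions are assumed examples), a discretized convolution can be computed with NumPy, and the submultiplicativity (12.100) of the $L^1$ norm, $\|x * y\|_1 \le \|x\|_1 \|y\|_1$, can be checked approximately:

```python
import numpy as np

dt = 0.01
t = np.arange(-10.0, 10.0, dt)
x = np.exp(-t**2)                            # two absolutely integrable functions
y = np.where(np.abs(t) < 1.0, 1.0, 0.0)

conv = np.convolve(x, y, mode="same") * dt   # discrete approximation of (x*y)(t)

norm1 = lambda f: np.sum(np.abs(f)) * dt     # approximate L^1 norm
print(norm1(conv) <= norm1(x) * norm1(y) + 1e-6)   # True, cf. (12.100)
```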
12.4 Hilbert Spaces
12.4.1 Notion of a Hilbert Space
12.4.1.1 Scalar Product
A vector space $V$ over a field $\mathbb{F}$ (mostly $\mathbb{F} = \mathbb{C}$) is called a space with scalar product or an inner product space or pre-Hilbert space if to every pair of elements $x, y \in V$ there is assigned a number $(x, y) \in \mathbb{F}$ (the scalar product of $x$ and $y$) such that the axioms of the scalar product are satisfied, i.e., for arbitrary $x, y, z \in V$ and $\alpha \in \mathbb{F}$:
(H1) $(x, x) \ge 0$ (i.e., $(x, x)$ is real), and $(x, x) = 0$ if and only if $x = 0$, (12.103)
(H2) $(\alpha x, y) = \alpha (x, y)$, (12.104)
(H3) $(x + y, z) = (x, z) + (y, z)$, (12.105)
(H4) $(x, y) = \overline{(y, x)}$. (12.106)
(Here $\overline{w}$ denotes the conjugate of the complex number $w$, which is denoted by $w^*$ in (1.133c). Sometimes the notation of a scalar product is $\langle x, y \rangle$.) In the case of $\mathbb{F} = \mathbb{R}$, i.e., in a real vector space, (H4) means the commutativity of the scalar product. Some further properties follow from the axioms:
$(x, \alpha y) = \overline{\alpha}\,(x, y)$ and $(x, y + z) = (x, y) + (x, z).$ (12.107)
12.4.1.2 Unitary Spaces and Some of their Properties
In a pre-Hilbert space $H$ a norm can be introduced by means of the scalar product as follows:
$\|x\| = \sqrt{(x, x)} \quad (x \in H).$ (12.108)
A normed space $H = (H, \|\cdot\|)$ is called unitary if there is a scalar product satisfying (12.108). Based on the previous properties of scalar products and (12.108), in unitary spaces the following facts are valid:
a) Triangle Inequality:
$\|x + y\|^2 \le (\|x\| + \|y\|)^2.$ (12.109)
b) Cauchy–Schwarz Inequality or Schwarz–Buniakowski Inequality (see also 1.4.2.9, p. 31):
$|(x, y)| \le \|x\| \cdot \|y\|.$ (12.110)
c) Parallelogram Identity: This property characterizes the unitary spaces among the normed spaces:
$\|x + y\|^2 + \|x - y\|^2 = 2\left( \|x\|^2 + \|y\|^2 \right).$ (12.111)
d) Continuity of the Scalar Product: $x_n \to x$, $y_n \to y$ imply $(x_n, y_n) \to (x, y)$. (12.112)
12.4.1.3 Hilbert Space
A complete unitary space is called a Hilbert space. Since Hilbert spaces are also Banach spaces, they possess, in particular, the properties of the latter (see 12.3.1, p. 609; 12.3.1.2, p. 610; 12.3.2, p. 610). In addition they have the properties of unitary spaces (12.4.1.2, p. 613). A subspace of a Hilbert space is a closed linear subspace.
■ A: $l^2(n)$, $l^2$ and $L^2((a,b))$ with the scalar products
$(x, y) = \sum_{k=1}^{n} \xi_k \overline{\eta_k}, \quad (x, y) = \sum_{k=1}^{\infty} \xi_k \overline{\eta_k}, \quad (x, y) = \int_a^b x(t)\, \overline{y(t)}\, dt.$ (12.113)
■ B: The space $H^k(\Omega)$ with the scalar product
$(f, g) = \int_{\Omega} f(x)\, g(x)\, dx + \sum_{1 \le |\alpha| \le k} \int_{\Omega} D^{\alpha} f(x)\, D^{\alpha} g(x)\, dx.$ (12.114)
■ C: Let $\rho(t)$ be a measurable positive function on $[a,b]$. The complex space $L^2((a,b), \rho)$ of all measurable functions which are quadratically integrable with the weight function $\rho$ on $(a,b)$ is a Hilbert space if the scalar product is defined as
$(x, y) = \int_a^b x(t)\, \overline{y(t)}\, \rho(t)\, dt.$ (12.115)
12.4.2 Orthogonality
Two elements $x, y$ of a Hilbert space $H$ are called orthogonal (we write $x \perp y$) if $(x, y) = 0$ (the notions of this paragraph also make sense in pre-Hilbert spaces and in unitary spaces). For an arbitrary subset $A \subset H$, the set
$A^{\perp} = \{x \in H: (x, y) = 0 \ \forall y \in A\}$ (12.116)
of all vectors which are orthogonal to each vector of $A$ is a (closed linear) subspace of $H$; it is called the orthogonal space to $A$ or the orthogonal complement of $A$. We write $A \perp B$ if $(x, y) = 0$ for all $x \in A$ and $y \in B$. If $A$ consists of a single element $x$, then we write $x \perp B$.
12.4.2.1 Properties of Orthogonality
The zero vector is orthogonal to every vector of $H$. The following statements hold:
a) $x \perp y$ and $x \perp z$ imply $x \perp (\alpha y + \beta z)$ for any $\alpha, \beta \in \mathbb{C}$.
b) From $x \perp y_n$ and $y_n \to y$ it follows that $x \perp y$.
c) $x \perp A$ if and only if $x \perp \overline{\operatorname{lin}}(A)$, where $\overline{\operatorname{lin}}(A)$ denotes the closed linear hull of the set $A$.
d) If $x \perp A$ and $A$ is a fundamental set, i.e., $\overline{\operatorname{lin}}(A)$ is everywhere dense in $H$, then $x = 0$.
e) Pythagoras Theorem: If the elements $x_1, \ldots, x_n$ are pairwise orthogonal, that is $x_k \perp x_l$ for all $k \ne l$, then
$\left\| \sum_{k=1}^{n} x_k \right\|^2 = \sum_{k=1}^{n} \|x_k\|^2.$ (12.117)
f) Projection Theorem: If $H_0$ is a subspace of $H$, then each vector $x \in H$ can be written uniquely as
$x = x' + x'', \quad x' \in H_0, \ x'' \perp H_0.$ (12.118)
g) Approximation Problem: Furthermore, the equation $\|x''\| = \rho(x, H_0) = \inf_{y \in H_0} \|x - y\|$ holds, and so the problem
$\|x - y\| \to \inf, \quad y \in H_0,$ (12.119)
has the unique solution $x'$ in $H_0$. In this statement $H_0$ can be replaced by a convex closed non-empty subset of $H$. The element $x'$ is called the projection of the element $x$ on $H_0$; it has the smallest distance from $x$ to $H_0$, and the space $H$ can be decomposed as $H = H_0 \oplus H_0^{\perp}$.
12.4.2.2 Orthogonal Systems
A set $\{x_\xi : \xi \in \Xi\}$ of vectors from $H$ is called an orthogonal system if it does not contain the zero vector and $x_\xi \perp x_\eta$ for $\xi \ne \eta$, hence $(x_\xi, x_\eta) = \|x_\xi\|^2\, \delta_{\xi\eta}$ holds, where
$\delta_{\xi\eta} = \begin{cases} 1, & \xi = \eta, \\ 0, & \xi \ne \eta, \end{cases}$ (12.120)
denotes the Kronecker symbol (see 4.1.2, 10., p. 253). An orthogonal system is called orthonormal if in addition $\|x_\xi\| = 1$ $\forall \xi$. In a separable Hilbert space an orthogonal system may contain at most countably many elements. In what follows we therefore assume $\Xi = \mathbb{N}$.
■ A: The system
$\frac{1}{\sqrt{2\pi}}, \ \frac{1}{\sqrt{\pi}}\cos t, \ \frac{1}{\sqrt{\pi}}\sin t, \ \frac{1}{\sqrt{\pi}}\cos 2t, \ \frac{1}{\sqrt{\pi}}\sin 2t, \ \ldots$ (12.121)
in the real space $L^2((-\pi, \pi))$ and the system
$\frac{1}{\sqrt{2\pi}}\, e^{int} \quad (n = 0, \pm 1, \pm 2, \ldots)$ (12.122)
in the complex space $L^2((-\pi, \pi))$ are orthonormal systems. Both of these systems are called trigonometric.
in the complex space L'((-T, T ) ) are orthonormal systems. Both of these systems are called trigonometric. IB: The Legendre polynomials of the first kind (see 9.1.2.6, 2., p. 510)
P,(t) =
d" --[(e - l),] d tn
( n = 0, 1,.. .)
(12.123)
form an orthogonal system of elements in the space L2((-1.1)). The corresponding orthonormal system is (12.124)
■ C: The Hermite polynomials (see 9.1.2.6, 6., p. 512 and 9.2.3.5, 3., p. 543), taken according to the second definition of the Hermite differential equation (9.63b), (12.125) form an orthogonal system in the space $L^2((-\infty, \infty))$.
■ D: The Laguerre polynomials (see 9.1.2.6, 5., p. 511) form an orthogonal system in the space $L^2((0, \infty))$.
Every orthogonal system is linearly independent, since the zero vector was excluded. Conversely, if we have a system $x_1, x_2, \ldots, x_n, \ldots$ of linearly independent elements in a Hilbert space $H$, then there exist vectors $e_1, e_2, \ldots, e_n, \ldots$, obtained by the Gram–Schmidt orthogonalization method (see 4.5.2.2, 1., p. 280), which form an orthonormal system. They span the same subspace, and by this method they are determined up to a scalar factor of modulus 1.
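A compact sketch of the Gram–Schmidt orthogonalization (illustration only), applied to the monomials $1, t, t^2$ on $[-1,1]$ with the $L^2$ scalar product approximated on a grid; up to normalization and sign, the result reproduces the normalized Legendre polynomials (12.124):

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 2001)
dt = t[1] - t[0]
inner = lambda f, g: np.sum(f * g) * dt      # approximate L^2((-1,1)) scalar product

monomials = [t**0, t**1, t**2]               # linearly independent system x_1, x_2, x_3
basis = []
for x in monomials:                          # Gram-Schmidt: subtract projections, normalize
    e = x - sum(inner(x, b) * b for b in basis)
    basis.append(e / np.sqrt(inner(e, e)))

# pairwise orthonormality with respect to the discrete scalar product
for i, ei in enumerate(basis):
    for j, ej in enumerate(basis):
        assert abs(inner(ei, ej) - (1.0 if i == j else 0.0)) < 1e-6
```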
12.4.3 Fourier Series in Hilbert Spaces
12.4.3.1 Best Approximation
Let $H$ be a separable Hilbert space and
$\{e_n : n = 1, 2, \ldots\}$ (12.126)
a fixed orthonormal system in $H$. For an element $x \in H$ the numbers $c_n = (x, e_n)$ are called the Fourier coefficients of $x$ with respect to the system (12.126). The (formal) series
$\sum_{n=1}^{\infty} c_n e_n$ (12.127)
is called the Fourier series of the element $x$ with respect to the system (12.126) (see 7.4.1.1, 1., p. 418). The $n$-th partial sum of the Fourier series of an element $x$ has the property of best approximation, i.e., for fixed $n$, the $n$-th partial sum of the Fourier series
$\sigma_n = \sum_{k=1}^{n} c_k e_k$ (12.128)
gives the smallest value of $\left\| x - \sum_{k=1}^{n} \alpha_k e_k \right\|$ among all vectors of $H_n = \operatorname{lin}(\{e_1, \ldots, e_n\})$. Furthermore, $x - \sigma_n$ is orthogonal to $H_n$, and the Bessel inequality holds:
$\sum_{n=1}^{\infty} |c_n|^2 \le \|x\|^2, \quad c_n = (x, e_n) \quad (n = 1, 2, \ldots).$ (12.129)
12.4.3.2 Parseval Equation, Riesz–Fischer Theorem
The Fourier series of an arbitrary element $x \in H$ is always convergent. Its sum is the projection of the element $x$ onto the subspace $H_0 = \overline{\operatorname{lin}}(\{e_n\}_{n=1}^{\infty})$. If an element $x \in H$ has the representation $x = \sum_{n=1}^{\infty} a_n e_n$, then the $a_n$ are the Fourier coefficients of $x$ $(n = 1, 2, \ldots)$. If $\{a_n\}_{n=1}^{\infty}$ is an arbitrary sequence of numbers with the property $\sum_{n=1}^{\infty} |a_n|^2 < \infty$, then there is a unique element $x$ in $H$ whose Fourier coefficients are equal to $a_n$ and for which the Parseval equation holds:
$\sum_{n=1}^{\infty} |(x, e_n)|^2 = \sum_{n=1}^{\infty} |a_n|^2 = \|x\|^2$ (Riesz–Fischer theorem). (12.130)
An orthonormal system $\{e_n\}$ in $H$ is called complete if there is no non-zero vector $y$ orthogonal to every $e_n$; it is called a basis if every vector $x \in H$ has the representation $x = \sum_{n=1}^{\infty} a_n e_n$, i.e., $a_n = (x, e_n)$ and $x$ is equal to the sum of its Fourier series. In this case we also say that $x$ has a Fourier expansion. The following statements are equivalent:
a) $\{e_n\}$ is a fundamental set in $H$.
b) $\{e_n\}$ is complete in $H$.
c) $\{e_n\}$ is a basis in $H$.
d) For all $x, y \in H$ with the corresponding Fourier coefficients $c_n$ and $d_n$ $(n = 1, 2, \ldots)$ there holds
$(x, y) = \sum_{n=1}^{\infty} c_n \overline{d_n}.$ (12.131)
e) For every vector $x \in H$, the Parseval equation (12.130) holds.
■ A: The trigonometric system (12.121) is a basis in the space $L^2((-\pi, \pi))$.
■ B: The system of the normalized Legendre polynomials (12.124) $p_n(t)$ $(n = 0, 1, \ldots)$ is complete and consequently a basis in the space $L^2((-1, 1))$.
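A numerical sketch (illustration only; the element $x(t) = t$ is an assumed example) of Fourier coefficients with respect to the trigonometric system (12.121) and of the Bessel/Parseval relations (12.129)/(12.130) in $L^2((-\pi,\pi))$:

```python
import numpy as np

t = np.linspace(-np.pi, np.pi, 20001)
dt = t[1] - t[0]
inner = lambda f, g: np.sum(f * g) * dt      # approximate scalar product in L^2((-pi,pi))

x = t                                         # sample element x(t) = t

# orthonormal trigonometric system (12.121), truncated after N cosine/sine pairs
N = 50
system = [np.ones_like(t) / np.sqrt(2 * np.pi)]
for n in range(1, N + 1):
    system += [np.cos(n * t) / np.sqrt(np.pi), np.sin(n * t) / np.sqrt(np.pi)]

c = np.array([inner(x, e) for e in system])   # Fourier coefficients c_n = (x, e_n)

print(np.sum(c**2) <= inner(x, x))            # Bessel inequality (12.129): True
print(inner(x, x) - np.sum(c**2))             # remainder -> 0 as N grows (Parseval (12.130))
```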
12.4.4 Existence of a Basis, Isomorphic Hilbert Spaces
In every separable Hilbert space there exists a basis. From this fact it follows that every orthonormal system can be completed to a basis. Two Hilbert spaces $H_1$ and $H_2$ are called isometric or isomorphic as Hilbert spaces if there is a linear bijective mapping $T: H_1 \to H_2$ with the property $(Tx, Ty)_{H_2} = (x, y)_{H_1}$ (that is, it preserves the scalar product and, because of (12.108), also the norm). Any two infinite dimensional separable Hilbert spaces are isometric; in particular, every such space is isometric to the separable space $l^2$.
12.5 Continuous Linear Operators and Functionals
12.5.1 Boundedness, Norm and Continuity of Linear Operators
12.5.1.1 Boundedness and the Norm of Linear Operators
Let $X = (X, \|\cdot\|_X)$ and $Y = (Y, \|\cdot\|_Y)$ be normed spaces. In the following discussion we omit the index indicating in which space the norm is taken, because it is always clear from the context which norm and which space we are talking about. An arbitrary operator $T: X \to Y$ is called bounded if there is a real number $\lambda > 0$ such that
$\|T(x)\| \le \lambda \|x\| \quad \forall x \in X.$ (12.132)
A bounded operator with a constant $\lambda$ "stretches" every vector at most $\lambda$ times, and it transforms every bounded set of $X$ into a bounded set of $Y$; in particular, the image of the unit ball of $X$ is bounded in $Y$. This last property is characteristic of bounded linear operators. A linear operator is continuous (see 12.2.3, p. 608) if and only if it is bounded. The smallest constant $\lambda$ for which (12.132) still holds is called the norm of the operator $T$ and is denoted by $\|T\|$, i.e.,
$\|T\| := \inf\{\lambda > 0: \|Tx\| \le \lambda \|x\|,\ x \in X\}.$ (12.133)
For a continuous linear operator the following equalities hold:
$\|T\| = \sup_{\|x\| \le 1} \|Tx\| = \sup_{\|x\| < 1} \|Tx\| = \sup_{\|x\| = 1} \|Tx\|$ (12.134)
and, furthermore, the estimate
$\|Tx\| \le \|T\| \cdot \|x\| \quad \forall x \in X.$ (12.135)
Let $T$ be the operator in the space $C([a,b])$ with the norm (12.89e), defined by the integral
$(Tx)(s) = y(s) = \int_a^b K(s,t)\, x(t)\, dt \quad (s \in [a,b]),$ (12.136)
where $K(s,t)$ is a (complex-valued) continuous function on the rectangle $\{a \le s, t \le b\}$. Then $T$ is a bounded linear operator which maps $C([a,b])$ into $C([a,b])$. Its norm is
$\|T\| = \max_{s \in [a,b]} \int_a^b |K(s,t)|\, dt.$ (12.137)
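A short numerical check of formula (12.137) (illustration only; the kernel below is an assumed example): the operator norm of the integral operator (12.136) is approximated by discretizing $\max_{s} \int_a^b |K(s,t)|\,dt$ and compared with the "stretching" of a few sample functions, as in (12.135).

```python
import numpy as np

a, b, N = 0.0, 1.0, 401
s = np.linspace(a, b, N)
t = np.linspace(a, b, N)
dt = t[1] - t[0]

K = np.sin(np.pi * np.outer(s, t))             # assumed continuous kernel K(s, t)

op_norm = np.max(np.abs(K).sum(axis=1) * dt)   # ||T|| = max_s int_a^b |K(s,t)| dt, cf. (12.137)

# ||Tx|| <= ||T|| * ||x|| in the maximum norm for sample functions x, cf. (12.135)
for x in (np.ones(N), np.cos(3 * t), t**2):
    Tx = K @ x * dt                            # (Tx)(s) = int K(s,t) x(t) dt, cf. (12.136)
    assert np.max(np.abs(Tx)) <= op_norm * np.max(np.abs(x)) + 1e-9
print(op_norm)
```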
12.5.1.2 The Space of Linear Continuous Operators
The sum $U = S + T$ and the multiple $\alpha T$ of two linear (continuous) operators $S, T: X \to Y$ are defined pointwise:
$U(x) = S(x) + T(x), \quad (\alpha T)(x) = \alpha \cdot T(x), \quad \forall x \in X \ \text{and} \ \forall \alpha \in \mathbb{F}.$ (12.138)
The set $L(X, Y)$, often denoted by $B(X, Y)$, of all linear continuous operators $T$ from $X$ into $Y$, equipped with the operations (12.138), is a vector space, and $\|T\|$ from (12.133) turns out to be a norm on it. So $L(X, Y)$ is a normed space, and even a Banach space if $Y$ is a Banach space; the axioms (V1)–(V8) and (N1)–(N3) are satisfied. If $Y = X$, then a product can be defined for two arbitrary elements $S, T \in L(X, X) = L(X) = B(X)$ as
$(ST)(x) = S(Tx) \quad (\forall x \in X),$ (12.139)
which satisfies the axioms (A1)–(A4) from 12.3.4, p. 612, and also the compatibility condition (12.100) with the norm. $L(X)$ is in general a non-commutative normed algebra, and if $X$ is a Banach space, then it is a Banach algebra. For every operator $T \in L(X)$ its powers are defined by
$T^0 = I, \quad T^n = T\,T^{n-1} \quad (n = 1, 2, \ldots),$ (12.140)
where $I$ is the identity operator $Ix = x$, $\forall x \in X$. Then
$\|T^n\| \le \|T\|^n \quad (n = 0, 1, \ldots),$ (12.141)
and, furthermore, there always exists the (finite) limit
$r(T) = \lim_{n \to \infty} \sqrt[n]{\|T^n\|},$ (12.142)
which is called the spectral radius of the operator $T$ and satisfies the relations
$r(T) \le \|T\|, \quad r(T^n) = [r(T)]^n, \quad r(\alpha T) = |\alpha|\, r(T), \quad r(T) = r(T^*),$ (12.143)
where $T^*$ is the adjoint operator of $T$ (see 12.6, p. 624, and (12.159)). If $L(X)$ is complete, then for $|\lambda| > r(T)$ the operator $(\lambda I - T)^{-1}$ has a representation in the form of a Neumann series
$(\lambda I - T)^{-1} = \lambda^{-1} I + \lambda^{-2} T + \cdots + \lambda^{-n} T^{n-1} + \cdots,$ (12.144)
which is convergent for $|\lambda| > r(T)$ in the operator norm on $L(X)$.
12.5.1.3 Convergence of Operator Sequences
1. Pointwise Convergence of a sequence of linear continuous operators $T_n: X \to Y$ to an operator $T: X \to Y$ means that
2. Uniform Convergence The usual norm-convergence of a sequence of operators {T,}~xlin a space L ( X ,Y) to T , i.e.,
is the unzforrn convergence on the unit ball of X. It implies pointwise convergence, while the converse statement is not true in general.
3. Applications The convergence of quadrature formulas when the number n of interpolation nodes tends to co,the performence principle of summation, limiting methods, etc.
12.5.2 Linear Continuous Operators in Banach Spaces Suppose now X and Y to be Banach spaces
1. Banach-Steinhaus Theorem (Uniform Boundedness Principle) The theorem characterizes the pointwise convergence of a sequence {T,} of linear continuous operators T, to some linear continuous operator by the conditions: a) For every element from an everywhere dense subset D c X, the sequence {Tnz}has a limit in Y. b) there is a constant C such that I ~ T ,5~C, ~ V n.
2. Open Mappings Theorem The theorem tells us that a linear continuous operator mapping from X onto Y is open, Le., the image T ( G )of every open set G from X is an open set in Y.
3. Closed Graph Theorem An operator T : DT -+ Y with DT C X is called closed if z, E DT, z, -+ $0 in X and T z , -+ yo in Y imply zo E DT and yo = 2'20. A necessary and sufficient condition is that the graph of the operator T in the space X x Y, Le., the set (12.147) rj-= {(x,Tz): x E DT} is closed, where here (z, y) denotes an element of the set X x Y. If T is a closed operator with a closed domain DT. then T is continuous 4. Hellinger-Toeplitz Theorem Let T be a linear operator in a Hilbert space H. If (z, Ty) = ( T z ,y) for every 5,y E H, then T is continuous (here (z.Ty)denotes the scalar product in H).
12.5 Continuous Linear Operators and Functionals 619
5 . Krein-Losanovskij Theorem on the Continuity of Positive Linear
Operators
If X = (X,X,. 11 . 11) and Y = (Y,Yt, 11 11) are ordered normed spaces, where X, is a generating cone. then the set Lt(X, Y) of all positive linear and continuous operators T , i.e., T(X,) c Y,, is a cone in L(X,Y). The theorem ofKrein and Losanovskzjasserts (see [12.17]): If X and Y are ordered Banach spaces with closed cones Xt and Y+, and X, is a generating cone, then the positivity of a linear operator implies its continuity. 6. Inverse Operator Let X and Y be arbitrary normed spaces and let 2': X + Y be a linear, not necessarily continuous operator. T has a continuous inverse T-' : Y -i X, if T(X) = Y and there exists a constant m > 0 such that l/Tzl/2 rnllzll for each 2 E X . Then l~T-'l~ 5 1. m In the case of Banach spaces X, Y we have the
7. Banach Theorem on the Continuity of the Inverse Operator If T is a linear continuous bijective operator from X onto Y. then the inverse operator T-' is also continuous. An important application is, e.g., the continuity of (XI - T)-I given the injectivity and surjectivity of XI - T . This fact has importance in investigating the spectrum of an operator (see 12.5.3.2, p. 620). It also applies to the 8 . Continuous Dependence of the Solution on the right-hand side and also on the initial data of initial value problems for linear differential equations. We will demonstrate this fact by the following example. 1 The initial value problem
?(t)+ p l ( t ) i ( t ) + p 2 ( t ) x ( t )= q ( t ) , t E [ a b ] , z(t0)= E , i ( t 0 ) = E , t o E [a,b] (12.148a) with coefficients pl(t),pz(t) E C([a,b])has exactly one solution z from Cz([a,b])for every right-hand side q ( t ) E C([a,b]) and for every pair of numbers E , (. The solution zdepends continuously on q ( t ) , and ( in the following sense. If qn(t) E C([a, b]),En, E R' are given and z, E C([a, b]) denotes the solution of (12.148b) f n ( t ) + p l ( t ) & ( t ) + ~ 2 ( t ) z n ( t )= q n ( t ) , zn(a) = Ennr &(a) = En, for n = 1 . 2 . . . ., then:
in
qn(t) + q ( t ) in C([% bl),
5,
tn +E
1
implies that
2,
-+ z in the space C2([a,b]).
<
(12.148~)
9. Method of Successive Approximation to solve an equation of the form (12.149) a:-Tz=y with a continuous linear operator T in a Banach space X for a given y. This method starts with an arbitrary initial element xO,and constructs a sequence (2,) of approximating solutions by the formula (12.150) ',Z , = y + T z , ( n = 0 , 1 , . . .) . This sequence converges to the solution z' in X of (12.149). The convergence of the method, i.e., z, + z*, is based on the convergence of the series (12.144) with X = 1. Let 11TIl 5 q < 1. Then the following statements are valid: 1 a) The operator I - T has a continuous inverse with lI(1- T)-'Il 5 -, and (12.149) has exactly 1-4J one solution for each y
620
12. Functional Analysis
b) The series (12.144) converges and its sum is the operator ( I - T)-'. c) The method (12.150) converges to the unique solution z' of (12.149) for any initial element 2 0 , if the series (12.144) converges. Then the following estimation holds:
I",
- z*l/ 5 -1ITzo Q" 1-q Equations of the type
- zo/( (n= 1,2,. . .).
(12.151)
x - / L T z = ~ , Xz-Tz=y, p , X E F can be handled in an analogous way (see 11.2.2, p. 565, and [12.8]).
(12.152)
12.5.3 Elements of the Spectral Theory of Linear Operators 12.5.3.1 Resolvent Set and the Resolvent of an Operator For an investigation of the solvability of equations one tries to rewrite the problem in the form (12.153) ( I - T)" = y with some operator T having a possible small norm. This is especially convenient for using a functional analytic method because of (12.143) and (12.144). In order to handle large values of llTI1 as well, we investigate the whole family of equations (12.154) (XI - T ) z = y z E X, with X E C in a complex Banach space X. Let T be a linear, but in general not a bounded operator in a Banach space X. The set Q ( T )of all complex numbers X such that (XI - T)-' E B(X) = L(X) is called the resolvent set and the operator Rx = Rx(T) = (XI - T)-' is called the resolvent. Let T now be a bounded linear operator in a complex Banach space X. Then the following statements are valid: a) The set Q ( T is ) open. hlore precisely, if Xo E Q(T)and X E C satisfy the inequality
( 12.155) then Rx exists and
Rx = Rxa + (A - Xo)Ri, + (A
-
Xo)2Ri,+ . . . =
c(X- XO)~-'R?,. 30
(12.156)
k=l
b) { A E C : 1x1 > liTll} c Q ( T )More . precisely, V X E C with / X I > llTll, the operator Rx exists and I T TZ Rx = -- - - - - - , . , , (12.157) X
XZ
A3
X' (see 12.5.4.1, p. 621) and arbitrary x E X the function F(X) = f(Rx(z))is holomorphic on @). f ) For arbitrary A, p E p(T),and X # p one has: e) For an arbitrary functional f E
12.5.3.2 Spectrum of an Operator 1. Definition of the Spectrum
The set u(T)= C \ Q ( Tis) called the spectrum of the operator T . Since I - T has a continuous inverse (and consequently (12.153) has a solution, which continuously depends on the right-hand side) if and only if 1 E Q(T),we must know the spectrum u(T) as well as possible. From the properties of the
12.5 Continuous Linear Operators and Functionals 621
resolvent set it follows immediately that the spectrum u ( T )is a closed set of C which lies in the disk { A : ( X I 5 llT(i}lhowever. in many cases m(T) is much smaller than this disk. The spectrum of any linear continuous operator on a complex Banach space is never empty and r ( T )= sup 1x1. (12.159) XWT)
It is possible to say more about the spectrum in the cases of different special classes of operators. If T is an operator in a finite dimensional space X and if the equation (XI - T ) z = 0 has only the trivial solution (Le., XI - T is injective), then X E e(T) (Le., XI - T is surjective). If this equation has a non-trivial solution in some Banach space, then the operator XI - T is not injective and (XI - T)-' is in general not defined. The number X E C is called an eigenvalue of the linear operator T , if the equation Xz = T z has a nontrivial solution. All those solutions are called eigenvectors, or in the case when X is a function space (which occurs very often in applications), they are called eigenfunctions of the operator T associated to A. The subspace spanned by them is called the eigenspace (or characteristic space) associated to X. The set op(T)of all eigenvalues of T is called the point spectrum of the operator T .
2. Comparison to Linear Algebra, Residual Spectrum An essential difference between the finite dimensional case which is considered in linear algebra and the infinite dimensional case discussed in functional analysis is that in the first case o(T)= u p ( T )always holds, while in the second case the spectrum usually also contains points which are not eigenvalues of T . If XI - T is injective and surjective as well, then X E Q ( T )due to the theorem on the continuity of the inverse (see 12.5.2, 7.) p. 619). In contrast to the finite dimensional case where the surjectivity follows automatically from the injectivity, the infinite dimensional case has to be dealt with in a very different way. The set o,(T) of all X E @), for which XI - T is injective and Im(XI - T ) is dense in X, is called the continuous spectrum and the set a,(T) of all X with an injective XI - T and a non-dense image, is called the residual spectrum of operator T . For a bounded linear operator T in a complex Banach space X
4 T ) = u p m u o m u 40 where the terms of the right-hand side are mutually exclusive.
12.5.4 Continuous Linear F'unct ionals 12.5.4.1 Definition For Y = F we call a linear mapping a linear functional or a linear form. In the following discussions, for a Hilbert space we consider the complex case; in other situations almost every times the real case is considered. The Banach space L(X, F) of all continuous linear functionals is called the adjoint space or the dual space of X and it is denoted by X* (sometimes also by X'). The value (in F) of a linear continuous functional f E X* on an element 2 E Xis denoted by f(x),often also by (Lf ) emphasizing the bilinear relation of X and X' - (compare also with the Riesz theorem (see 12.5.4.2, p. 622). , tn be fixed points of the interval [a,b] and el, c2,. . . cn real numbers. By the formula n
C!d(tk)
f(z)=
(12.161)
k=l
a linear continuous functional is defined on the space C ( [ a ,b ] ) ;the norm off is l l f ( l = case of (12.161) for a fixed t E [a,b] is the 6 functional adz) = z ( t ) ( 2 E C([a,bl)).
n
C Ickl. h special k=l
(12.162)
12. Functional Analusis
622
IB: With an integrable function p(t) (see 12.9.3.1, p. 635) on [a,b] b
f(.)
(12.163)
P(t)Z(t)dt
=
is a linear continuous functional on C([a,b]) and also on B([a,b]) in each case with the norm
llfii
=
12.5.4.2 Continuous Linear Functionals in Hilbert Spaces.
Riesz Representation Theorem
In a Hilbert space H every element y E H defines a linear continuous functional by the formula f(z)= (z, y). where its norm is l l f i l = lly/l. Conversely, iff is a linear continuous functional on H, then there exists a unique element y E H such that S(.) = ( 2 ,Y1 'JzE H , (12.164) where l i f i l = /lyjl. .kcording to this theorem the spaces H and H' are isomorphic and might be identified. The Riesz representation theorem contains a hint on how to introduce the notion of orthogonality in an arbitrary normed space. Let A c X and A* c X*. We call the sets A' = {f E X: f(z)= 0 Vz E A} and A" = { E E X: f(x) = 0 V f E A*} (12.165) the orthogonal complement or the annulator of A and A*, respectively.
12.5.4.3 Continuous Linear F'unctionals in LP
1 1 Let p 2 1. The number q is called the conjugate exponent t o p if - + - = 1, where it is assumed that P Q q = cx in the case ofp = 1. IBased on the Holder integral inequality (see 1.4.2.12,p. 32) the functional (12.163) can be considered 1 1 also in the spaces Lp([a.b]) (1 5 p 5 co) (see 12.9.4, p. 637) if (o E Lq([a,b]) and - + - = 1. Its norm
is then
llfll = IIYII =
P
{ (s,"
lp(t)lqdt)'. if 1 < p 5 m, ess. sup Ip(t)I, if p = 1
q
(12.166)
tEIa,bl
(with respect to the definition ofess. sup Ipl see (12.2171,p. 637). To every linear continuous functional f in the space Lp([a,b])there is a uniquely (up to its equivalence class) defined element y E L*([a.b ] ) such that (12.167) f(.) = = 4t)Y(t)dt, E E LP and IIfII = I l Y I l q = lY(t)l4dt '
p
I/!,.(
(p
)'
For the case ofp = cx see [12.15].
12.5.5 Extension of a Linear Functional 1. Seminorm A mapping p: X +R of a vector space X is called a seminorm or pseudonorm, if it has the following properties: ("1)
P(Z)2
0,
(12.168)
("2)
P(oz1 = lQlP(5).
("3)
P(.
(12.169) (12.170)
+ Y) I A T ) +P(Y).
12.5 Continuous Linear Operators and Functionals 623
Comparison with 12.3.1.p. 609, shows that a seminorm is a norm if and only ifp(z) = 0 holds only for x = 0.
Both for theoretical mathematical questions and for practical reasons in applications of mathematics, the problem of the extension of a linear functional given on a linear subspace XO c X to the entire space (and, in order to avoid trivial and uninteresting cases) with preserving certain '' g o o d properties became a fundamental question. The solution of this problem is guaranteed by
2. Analytic Form of the Hahn-Banach Extension Theorem Let X be a vector space over F and p a pseudonorm on X. Let XObe a linear (complex in the case of F = C and real in the case of F = R) subspace of X, and let fo be a (complex-valued in the case of F = C and real-valued in the case of F = R) linear functional on Xo satisfying the relation lfo(.)l I P ( S ) V x E Xo. (12.171) Then there exists a linear functional f on X with the following properties: f(x) = fo(z) Va: E Xo, lf(.)l 5 P ( 5 ) v z E X. (12.172) So, f is an extension of the functional fo onto the whole space X preserving the relation (12.171). If Xo is a linear subspace of a normed space X and fo is a continuous linear functional on Xo2 then p ( z ) = lifoll . IlzIl is a pseudonorm on X satisfying (12.171), so we get the Hahn-Banach theorem on the extension of contznuous linear functionals. Two important consequences are: 1. For every element z # 0 there is a functional f E X* with f(x) = llzll and l l f i i = 1. 2. For every linear subspace XOC X and 10 !$ XOwith the positive distance d = infzEXo112 - xo((> 0 there is an f E X' such that 1 (12.173) f ( z ) = 0 Vx E XO, f(q) = 1 and llfil = -. d
12.5.6 Separation of Convex Sets
1. Hyperplanes A linear subset L of the real vector space X, L ≠ X, is called a hypersubspace or hyperplane through 0 if there exists an x₀ ∈ X such that X = lin(x₀, L). Sets of the form x + L (L a linear subset) are affine-linear manifolds (see 12.1.2, p. 595). If L is a hypersubspace, these manifolds are called hyperplanes. There exist the following close relations between hyperplanes and linear functionals:
a) The kernel f⁻¹(0) = {x ∈ X : f(x) = 0} of a linear functional f on X is a hypersubspace in X, and for each number λ ∈ R there exists an element x_λ ∈ X with f(x_λ) = λ and f⁻¹(λ) = x_λ + f⁻¹(0).
b) For any given hypersubspace L ⊂ X and each x₀ ∉ L and λ ≠ 0 (λ ∈ R) there always exists a uniquely determined linear functional f on X with f⁻¹(0) = L and f(x₀) = λ.
The closedness of f⁻¹(0) in the case of a normed space X is equivalent to the continuity of the functional f.
2. Geometric Form of the Hahn-Banach Extension Theorem Let X be a normed space, x₀ ∈ X and L a linear subspace of X. Then for every non-empty convex open set K which does not intersect the affine-linear manifold x₀ + L, there exists a closed hyperplane H such that x₀ + L ⊂ H and H ∩ K = ∅.
3. Separation of Convex Sets Two subsets A, B of a real normed space X are said to be separated by a hyperplane if there is a functional f ∈ X* such that
sup_{x∈A} f(x) ≤ inf_{y∈B} f(y).   (12.174)
The separating hyperplane is then given by f⁻¹(α) with α = sup_{x∈A} f(x), which means that the two sets are contained in the different half-spaces
A ⊂ {x ∈ X : f(x) ≤ α}  and  B ⊂ {x ∈ X : f(x) ≥ α}.   (12.175)
In Fig. 12.5b,c two cases of separation by a hyperplane are shown. Disjointness alone is not decisive for the separation of two sets: Fig. 12.5a shows two sets A and B which are not separated although A and B are disjoint and B is convex. The convexity of both sets is the essential property for separating them. In that case it is even possible that the sets have common points, which are then contained in the separating hyperplane.
Figure 12.5
If A is a convex set of a normed space X with a non-empty interior Int(A) and B ⊂ X is a non-empty convex set with Int(A) ∩ B = ∅, then A and B can be separated. The hypothesis Int(A) ≠ ∅ in this statement cannot be dropped (see [12.3], example 4.47). A (real linear) functional f ∈ X* is called a supporting functional of the set A at the point x₀ ∈ A if there is a real number λ ∈ R such that f(x₀) = λ and A ⊂ {x ∈ X : f(x) ≤ λ}; f⁻¹(λ) is then called the supporting hyperplane at the point x₀. For a convex set K with a non-empty interior, there exists a supporting functional at each of its boundary points.
Remark: The famous Kuhn-Tucker theorem (see 18.2, p. 860), which yields practical methods to determine the minimum of convex optimization problems (see [12.5]), is also based on the separation of convex sets.
12.5.7 Second Adjoint Space and Reflexive Spaces
The adjoint space X* of a normed space X is also a normed space if it is equipped with the norm ||f|| = sup_{||x||≤1} |f(x)|, so the second adjoint space (X*)* = X** of X can also be considered. The canonical embedding
J: X → X**  with  Jx = F_x,  where F_x(f) = f(x) ∀f ∈ X*,   (12.176)
is a norm isomorphism (see 12.3.1, p. 609); hence X is identified with the subset J(X) ⊂ X**.
A Banach space X is called reflexive if J(X) = X**. The canonical embedding is then a surjective norm isomorphism.
■ Every finite dimensional Banach space and every Hilbert space is reflexive, as are the spaces L^p (1 < p < ∞); however, C([a,b]), L¹([0,1]) and c₀ are examples of non-reflexive spaces.
12.6 Adjoint Operators in Normed Spaces
12.6.1 Adjoint of a Bounded Operator
For a given linear continuous operator T: X → Y (X, Y normed spaces), to every g ∈ Y* there is assigned a functional f ∈ X* by f(x) = g(Tx), ∀x ∈ X. In this way we get a linear continuous operator
T*: Y* → X*,  (T*g)(x) = g(Tx),  ∀g ∈ Y* and ∀x ∈ X,   (12.177)
which is called the adjoint operator of T and has the following properties:
(T + S)* = T* + S*,  (ST)* = T*S*,  ||T*|| = ||T||,
where for the linear continuous operators T: X → Y and S: Y → Z (X, Y, Z normed spaces) the operator ST: X → Z is defined in the natural way as ST(x) = S(T(x)). With the notation introduced in 12.1.5, p. 598, and 12.5.4.2, p. 622, the following identities are valid for an operator T ∈ B(X, Y):
Im(T) = ker(T*)^⊥,  Im(T*) = ker(T)^⊥,   (12.178)
where the closedness of Im(T) implies the closedness of Im(T*).
The operator T**: X** → Y**, obtained as (T*)* from T*, is called the second adjoint of T. Due to (T**F_x)(g) = F_x(T*g) = (T*g)(x) = g(Tx) = F_{Tx}(g), the operator T** has the following property: if F_x ∈ X**, then T**F_x = F_{Tx} ∈ Y**. Hence the operator T**: X** → Y** is an extension of T.
In a Hilbert space H the adjoint operator can also be introduced by means of the scalar product: (Tx, y) = (x, T*y), x, y ∈ H. This is based on the Riesz representation theorem, where the identification of H and H** implies (λT)* = λ̄T*, I* = I and even T** = T. If T is bijective, then the same holds for T*, and (T*)⁻¹ = (T⁻¹)*. For the resolvents of T and T* there holds
[R_λ(T)]* = R_λ̄(T*),   (12.179)
from which σ(T*) = {λ̄ : λ ∈ σ(T)} follows for the spectrum of the adjoint operator.
■ A: Let T be an integral operator in the space L^p([a,b]) (1 < p < ∞)
(Tx)(s) = ∫_a^b K(s, t) x(t) dt   (12.180)
with a continuous kernel K(s, t). The adjoint operator of T is again an integral operator (12.181), with the kernel K*(s, t) = K(t, s), acting on the element y_g ∈ L^q associated to g ∈ (L^p)* according to (12.167).
■ B: In a finite dimensional complex vector space the adjoint of an operator represented by the matrix A = (a_ij) is represented by the matrix A* with a*_ij = ā_ji.
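The finite dimensional case of example B can be checked directly. A minimal NumPy sketch (the matrix and the vectors are arbitrary test data, not taken from the text) verifying that the conjugate transpose satisfies the defining relation (Ax, y) = (x, A*y) of the adjoint:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))  # operator on C^n
A_star = A.conj().T                                          # adjoint = conjugate transpose

x = rng.normal(size=n) + 1j * rng.normal(size=n)
y = rng.normal(size=n) + 1j * rng.normal(size=n)

# scalar product (u, v) = sum_k u_k * conj(v_k); np.vdot(a, b) = conj(a) . b
inner = lambda u, v: np.vdot(v, u)

print(np.isclose(inner(A @ x, y), inner(x, A_star @ y)))     # True: (Ax, y) = (x, A*y)
```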
12.6.2 Adjoint Operator of an Unbounded Operator
Let X and Y be real normed spaces and T a (not necessarily bounded) linear operator with a (linear) domain D(T) ⊂ X and values in Y. For a given g ∈ Y*, the expression g(Tx), depending obviously linearly on x, is meaningful. Now the question is: does there exist a well-defined functional f ∈ X* such that
f(x) = g(Tx)   ∀x ∈ D(T)?   (12.182)
Let D* ⊂ Y* be the set of all those g ∈ Y* for which the representation (12.182) holds with a certain f ∈ X*. If D(T) is dense in X, then for the given g the functional f is uniquely defined. So a linear operator T* is defined by f = T*g with D(T*) = D*. Then for arbitrary x ∈ D(T) and g ∈ D(T*) one has
g(Tx) = (T*g)(x).   (12.183)
The operator T* turns out to be closed and is called the adjoint of T. The naturalness of this general procedure stems from the fact that D(T*) = Y* holds if and only if T is bounded on D(T). In this case T* ∈ B(Y*, X*) and ||T*|| = ||T|| hold.
12.6.3 Self-Adjoint Operators
An operator T ∈ B(H) (H a Hilbert space) is called self-adjoint if T* = T. In this case the number (Tx, x) is real for each x ∈ H, and one has the equality
||T|| = sup_{||x||=1} |(Tx, x)|,   (12.184)
and with m = m(T) = inf_{||x||=1} (Tx, x) and M = M(T) = sup_{||x||=1} (Tx, x) also the relations
m(T)||x||² ≤ (Tx, x) ≤ M(T)||x||²  and  ||T|| = max{|m|, M}.   (12.185)
The spectrum of a self-adjoint (bounded) operator lies in the interval [m, M], and m, M ∈ σ(T) holds.
12.6.3.1 Positive Definite Operators
A partial ordering can be introduced in the set of all self-adjoint operators of B(H) by defining
T ≥ 0  if and only if  (Tx, x) ≥ 0 ∀x ∈ H.   (12.186)
An operator T with T ≥ 0 is called positive (or, more exactly, positive definite). For any self-adjoint operator T (with (H1) from 12.4.1.1, p. 613), (T²x, x) = (Tx, Tx) ≥ 0, so T² is positive definite. Every positive definite operator T possesses a square root, i.e., there exists a unique positive definite operator W such that W² = T. Moreover, the vector space of all self-adjoint operators is a K-space (Kantorovich space, see 12.1.7.4, p. 600), where the operators
|T| = √(T²),  T⁺ = ½(|T| + T),  T⁻ = ½(|T| − T)   (12.187)
are the corresponding elements with respect to (12.37). They are of particular importance for the spectral decomposition and for spectral and integral representations of self-adjoint operators by means of a Stieltjes integral (see 8.2.3.1, 2., p. 451, and [12.1], [12.11], [12.12], [12.15], [12.18]).
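For a finite dimensional self-adjoint operator the square root and the elements |T|, T⁺, T⁻ of (12.187) can be computed from the eigendecomposition. A minimal sketch under the assumption that T is represented by a real symmetric matrix (the test matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
T = B + B.T                              # real symmetric, i.e. self-adjoint on R^4

w, V = np.linalg.eigh(T)                 # T = V diag(w) V^T

def apply(func):
    """Function of T via the spectral decomposition."""
    return (V * func(w)) @ V.T

W = apply(np.abs)                        # |T| = sqrt(T^2)
T_plus = 0.5 * (W + T)                   # positive part
T_minus = 0.5 * (W - T)                  # negative part
S = apply(lambda s: np.sqrt(np.abs(s)))  # square root of the positive definite |T|

print(np.allclose(W @ W, T @ T))         # |T|^2 = T^2
print(np.allclose(S @ S, W))             # S is the square root of |T|
print(np.allclose(T_plus - T_minus, T))  # T = T+ - T-
```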
12.6.3.2 Projectors in a Hilbert Space
Let H₀ be a subspace of a Hilbert space H. Then every element x ∈ H has its projection x′ onto H₀ according to the projection theorem (see 12.4.2, p. 614), and therefore an operator P with Px = x′ is defined on H with values in H₀. P is called a projector onto H₀. Obviously, P is linear and continuous, and ||P|| = 1. A continuous linear operator P in H is a projector (onto a certain subspace) if and only if:
a) P = P*, i.e., P is self-adjoint, and
b) P² = P, i.e., P is idempotent.
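A concrete orthogonal projector in the finite dimensional Hilbert space R^n: the sketch below (all test data arbitrary) projects onto the column space of a matrix A and checks the two characterizing properties P = P* and P² = P together with ||P|| = 1.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 2))                   # H0 = column space of A, a 2-dim subspace of R^6
P = A @ np.linalg.solve(A.T @ A, A.T)         # P = A (A^T A)^{-1} A^T

print(np.allclose(P, P.T))                    # a) self-adjoint
print(np.allclose(P @ P, P))                  # b) idempotent
print(np.isclose(np.linalg.norm(P, 2), 1.0))  # operator norm ||P|| = 1

x = rng.normal(size=6)
print(np.allclose(A.T @ (x - P @ x), 0.0))    # x - Px is orthogonal to H0
```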
12.7 Compact Sets and Compact Operators
12.7.1 Compact Subsets of a Normed Space
A subset A of a normed space† X is called compact if every sequence of elements from A contains a convergent subsequence whose limit lies in A, and relatively compact or precompact if its closure (see 12.2.1.3, p. 604) is compact, i.e., every sequence of elements from A contains a convergent subsequence (whose limit does not necessarily belong to A). This is the Bolzano-Weierstrass theorem known from real calculus, and we say that such a set has the Bolzano-Weierstrass property.
Every compact set is closed and bounded. Conversely, if the space X is finite dimensional, then every closed and bounded set is compact. The closed unit ball in a normed space X is compact if and only if X is finite dimensional. For characterizations of relatively compact subsets in metric spaces (the Hausdorff theorem on the existence of a finite ε-net), in the spaces s and C (Arzelà-Ascoli theorem), and in the spaces L^p (1 < p < ∞) see [12.15].
12.7.2 Compact Operators
12.7.2.1 Definition of Compact Operator
An arbitrary operator T: X → Y of a normed space X into a normed space Y is called compact if the image T(A) of every bounded set A ⊂ X is a relatively compact set in Y. If, in addition, the operator T is also continuous, then it is called completely continuous. Every compact linear operator is bounded and consequently completely continuous. For a linear operator to be compact it is sufficient to require that it transforms the unit ball of X into a relatively compact set in Y.
† It is enough that X is a metric (or an even more general) space. We do not use this generality in what follows.
12.7.2.2 Properties of Linear Compact Operators
A characterization by sequences of the compactness of an operator from B(X, Y) is the following: for every bounded sequence {x_n} from X the sequence {Tx_n} contains a convergent subsequence.
A linear combination of compact operators is also compact. If one of the operators U ∈ B(W, X), T ∈ B(X, Y), S ∈ B(Y, Z) in each of the products TU and ST is compact, then the operators TU and ST are also compact.
If Y is a Banach space, then one has the following important statements:
a) Convergence: If a sequence of compact operators {T_n} is convergent in the space B(X, Y), then its limit is a compact operator, too.
b) Schauder Theorem: If T is a linear continuous operator, then either both T and T* are compact or neither is.
c) Spectral Properties of a Compact Operator T in an (Infinite Dimensional) Banach Space X: The zero belongs to the spectrum. Every non-zero point of the spectrum σ(T) is an eigenvalue with a finite dimensional eigenspace X_λ = {x ∈ X : (λI − T)x = 0}, and ∀ε > 0 there is always only a finite number of eigenvalues of T outside the circle {|λ| ≤ ε}, so only zero can be an accumulation point of the set of eigenvalues. If λ = 0 is not an eigenvalue of T, then T⁻¹, if it exists, is unbounded.
12.7.2.3 Weak Convergence of Elements
A sequence {x_n} of elements of a normed space X is called weakly convergent to an element x₀ if for each f ∈ X* the relation f(x_n) → f(x₀) holds (written as x_n ⇀ x₀). Obviously, x_n → x₀ implies x_n ⇀ x₀. If Y is another normed space and T: X → Y is a continuous linear operator, then:
a) x_n ⇀ x₀ implies Tx_n ⇀ Tx₀;
b) if T is compact, then x_n ⇀ x₀ implies Tx_n → Tx₀.
■ A: Every finite dimensional operator is compact. From this and 12.7.1, p. 626, it follows that the identity operator in an infinite dimensional space cannot be compact.
■ B: Suppose X = l², and let T be the operator in l² given by the infinite matrix (12.188). If Σ_{k,n=1}^∞ |t_{nk}|² = M² < ∞, then T is a compact operator from l² into l² with ||T|| ≤ M.
■ C: The integral operator (12.136) is a compact operator in the spaces C([a,b]) and L^p((a,b)) (1 < p < ∞).
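Example B can be illustrated with a finite section of such a matrix. A sketch (the decaying entries t_{nk} = 1/(n+k)² are an arbitrary illustrative choice) comparing the operator norm of a truncation with the Hilbert-Schmidt bound M = (Σ |t_{nk}|²)^{1/2}:

```python
import numpy as np

N = 200
n = np.arange(1, N + 1)
T = 1.0 / (n[:, None] + n[None, :]) ** 2   # t_nk = 1/(n+k)^2, square summable

M = np.sqrt((T ** 2).sum())                # truncated Hilbert-Schmidt sum
op_norm = np.linalg.norm(T, 2)             # operator norm of the finite section

print(op_norm <= M)                        # ||T|| <= M
print(op_norm, M)
```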
12.7.3 Fredholm Alternative
Let T be a compact linear operator in a Banach space X. We consider the following equations (of the second kind) with a parameter λ ≠ 0:
λx − Tx = y,   λx − Tx = 0,
λf − T*f = g,   λf − T*f = 0.   (12.189)
The following statements are valid:
a) dim(ker(λI − T)) = dim(ker(λI − T*)) < +∞, i.e., both homogeneous equations always have the same number of linearly independent solutions.
b) Im(λI − T) = ker(λI − T*)^⊥ and Im(λI − T*) = ker(λI − T)^⊥.‡
c) Im(λI − T) = X if and only if ker(λI − T) = {0}.
d) The Fredholm alternative (also called the Riesz-Schauder theorem):
α) Either the homogeneous equation has only the trivial solution. In this case λ ∈ ρ(T), the operator (λI − T)⁻¹ is bounded, and the inhomogeneous equation has exactly one solution x = (λI − T)⁻¹y for arbitrary y ∈ X.
β) Or the homogeneous equation has at least one non-trivial solution. In this case λ is an eigenvalue of T, i.e., λ ∈ σ(T), and the inhomogeneous equation has a (non-unique) solution if and only if the right-hand side y satisfies the condition f(y) = 0 for every solution f of the adjoint equation T*f = λf. In this case every solution x of the inhomogeneous equation has the form x = x₀ + h, where x₀ is a fixed solution of the inhomogeneous equation and h ∈ ker(λI − T).
Linear equations of the form Tx = y with a compact operator T are called equations of the first kind. Their mathematical investigation is in general more difficult (see [12.11], [12.18]).
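In finite dimensions statement d) reduces to familiar linear algebra and can be used as a sanity check. The sketch below (the matrix T and the value of λ are arbitrary test data) builds a singular λI − T, computes the null space of the adjoint, and tests the solvability condition f(y) = 0:

```python
import numpy as np
from scipy.linalg import null_space

lam = 1.0
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.5]])
A = lam * np.eye(3) - T                      # lam is an eigenvalue of T, so A is singular

N_adj = null_space(A.T)                      # basis of ker(lam*I - T*)

def solvable(y):
    """(lam*I - T)x = y has a solution iff f(y) = 0 for every f in ker(lam*I - T*)."""
    return bool(np.allclose(N_adj.T @ y, 0.0))

print(solvable(np.array([0.0, 0.0, 1.0])))   # True: compatibility condition holds
print(solvable(np.array([0.0, 1.0, 0.0])))   # False: no solution of the inhomogeneous equation
```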
12.7.4 Compact Operators in Hilbert Space
Let T: H → H be a compact operator. Then T is the limit (in B(H)) of a sequence of finite dimensional operators. The similarity to the finite dimensional case can be seen from the following: If C is a finite dimensional operator and T = I − C, then the injectivity of T implies the existence of T⁻¹ and T⁻¹ ∈ B(H). If C is a compact operator and T = I − C, then the following statements are equivalent:
a) T⁻¹ exists and is continuous;
b) x ≠ 0 ⇒ Tx ≠ 0, i.e., T is injective;
c) T(H) = H, i.e., T is surjective.
12.7.5 Compact Self-Adjoint Operators
1. Eigenvalues A compact self-adjoint operator T ≠ 0 in a Hilbert space H possesses at least one non-zero eigenvalue. More precisely, T always has an eigenvalue λ with |λ| = ||T||. The set of eigenvalues of T is at most countable. Any compact self-adjoint operator T has the representation
T = Σ_k λ_k P_{λ_k}   (in B(H)),
where the λ_k are the different eigenvalues of T and P_{λ_k} denotes the projector onto the eigenspace H_{λ_k}. We say in this case that the operator T can be diagonalized. From this fact it follows that
Tx = Σ_k λ_k (x, e_k) e_k   for every x ∈ H,
where {e_k} is the orthonormal system of the eigenvectors of T. If λ ∉ σ(T) and y ∈ H, then the solution of the equation (λI − T)x = y can be represented as
x = R_λ(T) y = Σ_k (y, e_k)/(λ − λ_k) e_k.
2. Hilbert-Schmidt Theorem If T is a compact self-adjoint operator in a separable Hilbert space H, then there is a basis in H consisting of the eigenvectors of T.
The so-called spectral (mapping) theorems (see [12.8], [12.10], [12.12], [12.13], [12.18]) can be considered as generalizations of the Hilbert-Schmidt theorem to the non-compact case of self-adjoint (bounded or unbounded) operators.
‡ Here the orthogonality is considered in Banach spaces (see 12.5.4.2, p. 622).
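For a symmetric matrix, the finite dimensional model of a compact self-adjoint operator, the representation T = Σ_k λ_k P_{λ_k} and the series for R_λ(T)y can be reproduced directly. A sketch with arbitrary test data:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.normal(size=(5, 5))
T = 0.5 * (B + B.T)                       # symmetric, hence self-adjoint on R^5

w, E = np.linalg.eigh(T)                  # eigenvalues w_k, orthonormal eigenvectors e_k

# spectral representation  T = sum_k w_k e_k e_k^T
T_rebuilt = sum(w[k] * np.outer(E[:, k], E[:, k]) for k in range(5))
print(np.allclose(T, T_rebuilt))

# resolvent formula  x = R_lambda(T) y = sum_k (y, e_k)/(lambda - w_k) e_k
lam = w.max() + 1.0                       # lambda outside the spectrum
y = rng.normal(size=5)
x = sum((y @ E[:, k]) / (lam - w[k]) * E[:, k] for k in range(5))
print(np.allclose((lam * np.eye(5) - T) @ x, y))
```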
12.8 Non-Linear Operators
In the theory of non-linear operator equations the most important methods are based on the following principles:
1. Principle of the Contracting Mapping, Banach Fixed-Point Theorem (see 12.2.2.3, p. 606, and 12.2.2.4, p. 606). For further modifications of this principle see [12.8], [12.11], [12.12], [12.18].
2. Generalization of the Newton Method (see 18.2.5.2, p. 867, and 19.1.1.2, p. 882) to the infinite dimensional case.
3. Schauder Fixed-Point Principle
4. Leray-Schauder Theory
Methods based on principles 1 and 2 yield information on the existence, uniqueness, constructivity, etc. of the solution, while methods based on principles 3 and 4 in general allow "only" the qualitative statement of the existence of a solution. If further properties of the operators are known, see also 12.8.6, p. 631, and 12.8.7, p. 632.
12.8.1 Examples of Non-Linear Operators
For non-linear operators the relation between continuity and boundedness discussed for linear operators in 12.5.1, p. 617, is no longer valid in general. In studying non-linear operator equations, e.g., non-linear boundary value problems or integral equations, the following non-linear operators occur most often. The iteration methods described in 12.2.2.4, p. 606, can be successfully applied for solving non-linear integral equations.
1. Nemytskij Operator Let Ω be an open measurable subset of R^n (12.9.1, p. 633) and f: Ω × R → R a function of two variables f(x, s) which is measurable with respect to x for every s and continuous with respect to s for almost every x (Carathéodory conditions). The non-linear operator N defined on F(Ω) by
(Nu)(x) = f[x, u(x)]   (x ∈ Ω)   (12.190)
is called the Nemytskij operator. It is continuous and bounded if it maps L^p(Ω) into L^q(Ω), where p⁻¹ + q⁻¹ = 1. This is the case, e.g., if
|f(x, s)| ≤ a(x) + b|s|^{p/q}   with a(x) ∈ L^q(Ω), b > 0,   (12.191)
or if f: Ω̄ × R → R is continuous. The operator N is compact only in special cases.
2. Hammerstein Operator Let Ω be a relatively compact subset of R^n, f a function satisfying the Carathéodory conditions and K(x, y) a continuous function on Ω̄ × Ω̄. The non-linear operator H on F(Ω)
(Hu)(x) = ∫_Ω K(x, y) f[y, u(y)] dy   (x ∈ Ω)   (12.192)
is called the Hammerstein operator. H can be written in the form H = K ∘ N with the Nemytskij operator N and the integral operator K determined by the kernel K:
(Kv)(x) = ∫_Ω K(x, y) v(y) dy.   (12.193)
If the kernel K(x, y) satisfies the additional condition
∫_{Ω×Ω} |K(x, y)|^p dx dy < ∞   (12.194)
and the function f satisfies the condition (12.191), then H is a continuous and compact operator on L^p(Ω).
3. Urysohn Operator Let Ω ⊂ R^n be an open measurable subset and K(x, y, s): Ω × Ω × R → R a function of three variables. Then the non-linear operator U on F(Ω)
(Uu)(x) = ∫_Ω K[x, y, u(y)] dy   (x ∈ Ω)   (12.195)
is called the Urysohn operator. If the kernel K satisfies the corresponding conditions, then U is a continuous and compact operator in C(Ω̄) or in L^p(Ω), respectively.
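A discretized Hammerstein operator can be evaluated with a simple quadrature rule. The sketch below (the kernel K, the nonlinearity f and the grid are arbitrary illustrative choices, not data from the text) approximates (Hu)(x) = ∫₀¹ K(x, y) f(y, u(y)) dy on Ω = [0, 1] by the trapezoidal rule:

```python
import numpy as np

m = 201
y = np.linspace(0.0, 1.0, m)                 # grid on Omega = [0, 1]

K = lambda x, s: np.exp(-abs(x - s))         # continuous kernel (illustrative choice)
f = lambda s, u: np.sin(u) + s               # Caratheodory nonlinearity (illustrative)

def hammerstein(u):
    """Trapezoidal approximation of (Hu)(x) = int_0^1 K(x,s) f(s,u(s)) ds on the grid."""
    g = f(y, u)                              # pointwise Nemytskij part (N u)(s)
    return np.array([np.trapz(K(x, y) * g, y) for x in y])

u = np.cos(2 * np.pi * y)                    # a test function u on the grid
print(hammerstein(u)[:3])                    # first few values of Hu
```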
12.8.2 Differentiability of Non-Linear Operators
Let X, Y be Banach spaces, D ⊂ X an open set and T: D → Y. The operator T is called Fréchet differentiable (or, briefly, differentiable) at the point x ∈ D if there exists a linear operator L ∈ B(X, Y) (in general depending on the point x) such that
T(x + h) − T(x) = Lh + ω(h)  with  ||ω(h)|| = o(||h||)   (12.196)
or, in an equivalent form,
lim_{||h||→0} ||T(x + h) − T(x) − Lh|| / ||h|| = 0,   (12.197)
i.e., ∀ε > 0 ∃δ > 0 such that ||h|| < δ implies ||T(x + h) − T(x) − Lh|| ≤ ε||h||. The operator L, which is usually denoted by T′(x), T′(x;·) or T′(x)(·), is called the Fréchet derivative of the operator T at the point x. The value dT(x; h) = T′(x)h is called the Fréchet differential of the operator T at the point x (for the increment h). The differentiability of an operator at a point implies its continuity at that point. If T ∈ B(X, Y), i.e., T itself is linear and continuous, then T is differentiable at every point, and its derivative is equal to T.
12.8.3 Newton's Method
Let X, D be as in the previous paragraph and T: D → X. Under the assumption of the differentiability of T at every point of the set D, an operator T′: D → B(X) is defined by assigning the element T′(x) ∈ B(X) to every point x ∈ D. Suppose the operator T′ is continuous on D (in the operator norm); in this case T is called continuously differentiable on D. Suppose Y = X and also that the set D contains a solution x* of the equation
T(x) = 0.   (12.198)
Furthermore, we suppose that the operator T′(x) is continuously invertible for each x ∈ D, hence [T′(x)]⁻¹ is in B(X). Because of (12.196), for an arbitrary x₀ ∈ D one conjectures that the elements T(x₀) = T(x₀) − T(x*) and T′(x₀)(x₀ − x*) are "not far" from each other, and therefore the element x₁ defined as
x₁ = x₀ − [T′(x₀)]⁻¹ T(x₀)   (12.199)
is an approximation of x* (under the assumptions we made). Starting with an arbitrary x₀, the so-called Newton approximation sequence
x_{n+1} = x_n − [T′(x_n)]⁻¹ T(x_n)   (n = 0, 1, ...)   (12.200)
can be constructed. There are many theorems known from the literature discussing the behavior and the convergence properties of this method. We mention here only the following most important result, which demonstrates the main properties and advantages of Newton's method: ∀ε ∈ (0, 1) there exists a ball B = B(x₀; δ), δ = δ(ε), in X such that all points x_n lie in B and the Newton sequence converges to the solution x* of (12.198). Moreover, ||x_n − x*|| ≤ ε^n ||x₀ − x*||, which yields a practical error estimate.
The modified Newton's method is obtained if the operator [T′(x₀)]⁻¹ is used instead of [T′(x_n)]⁻¹ in formula (12.200). For further estimates of the speed of convergence and for the (in general sensitive) dependence of the method on the choice of the starting point x₀ see [12.7], [12.12], [12.18].
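In R^n the iteration (12.200) is the classical Newton method with the Jacobian playing the role of T′(x). A minimal sketch with an arbitrary two-dimensional test system (not taken from the text), comparing (12.200) with the modified variant that keeps [T′(x₀)]⁻¹ fixed:

```python
import numpy as np

def T(x):                                    # arbitrary test mapping T: R^2 -> R^2
    return np.array([x[0]**2 + x[1]**2 - 1.0,
                     x[1] - x[0]**2])

def T_prime(x):                              # Frechet derivative = Jacobian matrix
    return np.array([[2.0 * x[0], 2.0 * x[1]],
                     [-2.0 * x[0], 1.0]])

def newton(x0, modified=False, steps=50, tol=1e-12):
    x = np.array(x0, dtype=float)
    J0 = T_prime(x)                          # kept fixed in the modified method
    for _ in range(steps):
        J = J0 if modified else T_prime(x)   # (12.200) vs. modified Newton
        x = x - np.linalg.solve(J, T(x))
        if np.linalg.norm(T(x)) < tol:
            break
    return x

print(newton([0.8, 0.6]))                    # Newton sequence (12.200)
print(newton([0.8, 0.6], modified=True))     # modified Newton: only one Jacobian is inverted
```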
12.8.4 Schauder's Fixed-Point Theorem
Let T: D → X be a non-linear operator defined on a subset D of a Banach space X. The non-trivial question of whether the equation x = T(x) has at least one solution can be answered as follows: If X = R and D = [−1, 1], then every continuous function mapping D into D has a fixed point in D. If X is an arbitrary finite dimensional normed space (dim X ≥ 2), then Brouwer's fixed-point theorem holds.
1. Brouwer's Fixed-Point Theorem Let D be a non-empty closed bounded and convex subset of a finite dimensional normed space. If T is a continuous operator which maps D into itself, then T has at least one fixed point in D.
The answer in the case of an arbitrary infinite dimensional Banach space X is given by Schauder's fixed-point theorem.
2. Schauder's Fixed-Point Theorem Let D be a non-empty closed bounded and convex subset of a Banach space. If the operator T: D → X is continuous and compact (hence completely continuous) and maps D into itself, then T has at least one fixed point in D.
By using this theorem it can be proved, e.g., that the initial value problem (12.70), p. 608, always has a local solution for t ≥ 0 if the right-hand side is assumed only to be continuous.
12.8.5 Leray-Schauder Theory
For the existence of solutions of the equations x = T(x) and (I + T)(x) = y with a completely continuous operator T, a further principle is available which is based on deep properties of the mapping degree. It can be successfully applied to prove the existence of solutions of non-linear boundary value problems. We mention here only those results of this theory which are the most useful in practical problems, and for simplicity we have chosen a formulation which avoids the notion of the mapping degree.
Leray-Schauder Theorem: Let D be an open bounded set in a real Banach space X and let T: D̄ → X be a completely continuous operator. Let y ∈ D be a point such that x + λT(x) ≠ y for each x ∈ ∂D and λ ∈ [0, 1], where ∂D denotes the boundary of the set D. Then the equation (I + T)(x) = y has at least one solution.
The following version of this theorem is very useful in applications: Let T be a completely continuous operator in the Banach space X. If all solutions of the family of equations
x = λT(x)   (λ ∈ [0, 1])   (12.201)
are uniformly bounded, i.e., ∃c > 0 such that ∀λ and ∀x satisfying (12.201) the a priori estimate ||x|| ≤ c holds, then the equation x = T(x) has a solution.
12.8.6 Positive Non-Linear Operators
The successful application of Schauder's fixed-point theorem requires the choice of a set with appropriate properties which is mapped into itself by the operator under consideration. In applications, especially in the theory of non-linear boundary value problems, one often works in ordered normed function spaces with positive operators, i.e., operators which leave the corresponding cone invariant, or with isotone increasing operators, i.e., x ≤ y ⇒ T(x) ≤ T(y). If confusion (see, e.g., 12.8.7, p. 632) is excluded, we also call these operators monotone.
Let X = (X, X₊, ||·||) be an ordered Banach space, X₊ a closed cone and [a, b] an order interval of X. If X₊ is normal and T is a completely continuous (not necessarily isotone) operator that satisfies T([a, b]) ⊂ [a, b], then T has at least one fixed point in [a, b] (Fig. 12.6b).
Figure 12.6
Notice that the condition T([a, b]) ⊂ [a, b] automatically holds for an isotone increasing operator T defined on an (o)-interval (order interval) [a, b] of the space X as soon as it maps the endpoints a, b into [a, b], i.e., when the two conditions T(a) ≥ a and T(b) ≤ b are satisfied. Then both sequences
x₀ = a, x_{n+1} = T(x_n)  (n ≥ 0)   and   y₀ = b, y_{n+1} = T(y_n)  (n ≥ 0)   (12.202)
are well defined, i.e., x_n, y_n ∈ [a, b] for n ≥ 0, and they are monotone increasing and decreasing, respectively, i.e., a = x₀ ≤ x₁ ≤ ... ≤ x_n ≤ ... and b = y₀ ≥ y₁ ≥ ... ≥ y_n ≥ ....
A fixed point x* of the operator T is called minimal (maximal) if for every fixed point z of T the inequality x* ≤ z (z ≤ x*) holds. Now we have the following statement (Fig. 12.6a): Let X be an ordered Banach space with a closed cone X₊ and T: D → X, D ⊂ X, a continuous isotone increasing operator. Let [a, b] ⊂ D be such that T(a) ≥ a and T(b) ≤ b. Then T([a, b]) ⊂ [a, b], and the operator T has a fixed point in [a, b] if one of the following conditions is fulfilled:
a) X₊ is normal and T is compact;
b) X₊ is regular.
The sequences {x_n} and {y_n} defined in (12.202) then converge to the minimal and to the maximal fixed point of T in [a, b], respectively. The notion of super- and subsolutions is based on these results (see [12.14]).
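The two monotone sequences of (12.202) can be watched on a one-dimensional model. In the sketch below (the map and the interval are arbitrary illustrative choices) T(x) = √(x + 2) is isotone increasing on [a, b] = [0, 4] with T(a) ≥ a and T(b) ≤ b, and both iterations converge to the fixed point x* = 2:

```python
import numpy as np

T = lambda x: np.sqrt(x + 2.0)    # isotone increasing map (illustrative choice)

a, b = 0.0, 4.0
assert T(a) >= a and T(b) <= b    # endpoint conditions for (12.202)

x, y = a, b
for n in range(40):
    x, y = T(x), T(y)             # x_n increases, y_n decreases

print(x, y)                       # both approach the fixed point x* = 2
```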
12.8.7 Monotone Operators in Banach Spaces
1. Special Properties An arbitrary operator T: D ⊂ X → Y (X, Y normed spaces) is called demi-continuous at the point x₀ ∈ D if for each sequence {x_n} ⊂ D converging to x₀ (in the norm of X) the sequence {T(x_n)} converges weakly to T(x₀) in Y. T is called demi-continuous on the set D if T is demi-continuous at every point of D.
In this paragraph we introduce another generalization of the notion of monotonicity known from real analysis. Let X now be a real Banach space, X* its dual, D ⊂ X and T: D → X* a non-linear operator. T is called monotone if ∀x, y ∈ D the inequality (T(x) − T(y), x − y) ≥ 0 holds. If X = H is a Hilbert space, then (·,·) means the scalar product, while in the case of an arbitrary Banach space we refer to the notation introduced in 12.5.4.1, p. 621. The operator T is called strongly monotone if there is a constant c > 0 such that (T(x) − T(y), x − y) ≥ c||x − y||² for ∀x, y ∈ D. An operator T: X → X* is called coercive if
lim_{||x||→∞} (T(x), x)/||x|| = ∞.
2. Existence Theorems Existence theorems for solutions of operator equations with monotone operators are given here only as examples: If the operator T, mapping the real separable Banach space X into X* (D_T = X), is monotone, demi-continuous and coercive, then the equation T(x) = f has a solution for arbitrary f ∈ X*. If in addition the operator T is strongly monotone, then the solution is unique, and the inverse operator T⁻¹ exists.
For a monotone, demi-continuous operator T: H → H in a Hilbert space H with D_T = H, there holds Im(I + T) = H, and (I + T)⁻¹ is continuous. If T is in addition strongly monotone, then T is bijective with a continuous inverse T⁻¹.
Constructive approximation methods for the solution of the equation T(x) = 0 with a monotone operator T in a Hilbert space are based on the idea of Galerkin's method (see 19.4.2.2, p. 906, or [12.10], [12.18]). By means of this theory, set-valued operators T: X → 2^{X*} can also be handled. The notion of monotonicity is then generalized to (f − g, x − y) ≥ 0 ∀x, y ∈ D_T and f ∈ T(x), g ∈ T(y).
12.9 Measure and Lebesgue Integral
12.9.1 Sigma Algebra and Measures
The starting point for introducing measures is a generalization of the notion of the length of an interval in R, of the area, and of the volume of subsets of R² and R³, respectively. This generalization is necessary in order to "measure" as many sets as possible and to "make integrable" as many functions as possible. For instance, the volume of an n-dimensional rectangular parallelepiped
Q = {x ∈ R^n : a_k ≤ x_k ≤ b_k  (k = 1, 2, ..., n)}   (12.203)
has the value ∏_{k=1}^n (b_k − a_k).
1. Sigma Algebra Let X be an arbitrary set. A non-empty system A of subsets of X is called a σ-algebra if:
a) A ∈ A implies X \ A ∈ A, and   (12.204a)
b) A₁, A₂, ..., A_n, ... ∈ A implies ⋃_{n=1}^∞ A_n ∈ A.   (12.204b)
Every σ-algebra contains the sets ∅ and X, the intersection of countably many of its sets and also the difference sets of any two of its sets. In the following, R̄ denotes the set R of real numbers extended by the elements {−∞} and {+∞} (extended real line), where the algebraic operations and the order properties of R are extended to R̄ in the natural way. The expressions (+∞) + (−∞) and (−∞) + (+∞) are meaningless, while 0·(+∞) and 0·(−∞) are assigned the value 0.
2. Measure A function μ: A → R̄₊ = R₊ ∪ {+∞}, defined on a σ-algebra A, is called a measure if
a) μ(A) ≥ 0  ∀A ∈ A,   (12.205a)
b) μ(∅) = 0,   (12.205b)
c) A₁, A₂, ..., A_n, ... ∈ A, A_k ∩ A_l = ∅ (k ≠ l) implies μ(⋃_{n=1}^∞ A_n) = Σ_{n=1}^∞ μ(A_n).   (12.205c)
Property c) is called the σ-additivity of the measure. If μ is a measure on A and for the sets A, B ∈ A the inclusion A ⊂ B holds, then μ(A) ≤ μ(B) (monotonicity). If A_n ∈ A (n = 1, 2, ...) and A₁ ⊂ A₂ ⊂ ..., then μ(⋃_{n=1}^∞ A_n) = lim_{n→∞} μ(A_n) (continuity from below).
Let A be a σ-algebra of subsets of X and μ a measure on A. The triplet X = (X, A, μ) is called a measure space, and the sets belonging to A are called measurable or A-measurable.
■ A: Counting Measure: Let X be a finite set {x₁, x₂, ..., x_N}, A the σ-algebra of all subsets of X, and assign a non-negative number p_k to each x_k (k = 1, ..., N). Then the function μ defined on A for every set A = {x_{n₁}, x_{n₂}, ..., x_{n_k}} ∈ A by μ(A) = p_{n₁} + p_{n₂} + ... + p_{n_k} is a measure which takes only finite values, since μ(X) = p₁ + ... + p_N < ∞. This measure is called the counting measure.
■ B: Dirac Measure: Let A be a σ-algebra of subsets of a set X and a an arbitrary given point of X. Then a measure is defined on A by
δ_a(A) = 1 if a ∈ A,  δ_a(A) = 0 if a ∉ A.   (12.206)
It is called the δ function (concentrated at a). Obviously δ_a(A) = χ_A(a) (see 12.5.4, p. 621), where χ_A denotes the characteristic function of the set A.
■ C: Lebesgue Measure: Let X be a metric space and B(X) the smallest σ-algebra of subsets of X which contains all the open sets of X. B(X) exists as the intersection of all σ-algebras containing all the open sets, and it is called the Borel σ-algebra of X. Every element of B(X) is called a Borel set (see [12.6]). Suppose now X = R^n (n ≥ 1). Using an extension procedure one can construct a σ-algebra and a measure on it which coincides with the volume on the set of all rectangular parallelepipeds in R^n. More precisely: There exists a uniquely defined σ-algebra A of subsets of R^n and a uniquely defined measure λ on A with the following properties:
a) Each open set of R^n belongs to A; in other words, B(R^n) ⊂ A.
b) If A ∈ A, λ(A) = 0 and B ⊂ A, then B ∈ A and λ(B) = 0.
c) If Q is a rectangular parallelepiped, then Q ∈ A and λ(Q) = ∏_{k=1}^n (b_k − a_k).
d) λ is translation invariant, i.e., for every vector x ∈ R^n and every set A ∈ A one has x + A = {x + y : y ∈ A} ∈ A and λ(x + A) = λ(A).
The elements of A are called Lebesgue measurable subsets of R^n; λ is the (n-dimensional) Lebesgue measure in R^n.
Remark: In measure theory and integration theory one says that a certain statement (property or condition) with respect to the measure μ is valid almost everywhere or μ-almost everywhere on a set X if the set where the statement is not valid has measure zero. We write a.e. or μ-a.e.§ For instance, if λ is the Lebesgue measure on R and A, B are two disjoint sets with R = A ∪ B and f is a function on R with f(x) = 1 ∀x ∈ A and f(x) = 0 ∀x ∈ B, then f = 1 λ-a.e. on R if and only if λ(B) = 0.
12.9.2 Measurable Functions
12.9.2.1 Measurable Function
Let A be a σ-algebra of subsets of a set X. A function f: X → R̄ is called measurable if for an arbitrary α ∈ R the set f⁻¹((α, +∞]) = {x ∈ X : f(x) > α} is in A. A complex-valued function g + ih is called measurable if both functions g and h are measurable. If A is the σ-algebra of the Lebesgue measurable sets of R^n and f: R^n → R is a continuous function, then the set f⁻¹((α, +∞]) = f⁻¹((α, +∞)) is open for every α ∈ R (according to 12.2.3, p. 608), hence f is measurable.
12.9.2.2 Properties of the Class of Measurable Functions
The notion of a measurable function requires no measure, only a σ-algebra. Let A be a σ-algebra of subsets of the set X and let f, g, f_n: X → R̄ be measurable functions. Then the following functions (see 12.1.7.4, p. 600) are also measurable:
a) αf for every α ∈ R; f · g;
b) f⁺, f⁻, |f|, f ∨ g and f ∧ g;
c) f + g, if there is no point of X where the expression (+∞) + (−∞) occurs;
d) sup f_n, inf f_n, lim sup f_n (= lim_{n→∞} sup_{k≥n} f_k), lim inf f_n;
e) the pointwise limit lim f_n, in case it exists;
f) if f ≥ 0 and p ∈ R, p > 0, then f^p is measurable.
A function f: X → R is called elementary or simple if there is a finite number of pairwise disjoint sets A₁, ..., A_n ∈ A and n real numbers a₁, ..., a_n such that f = Σ_{k=1}^n a_k χ_k, where χ_k denotes the characteristic function of the set A_k. Obviously, each characteristic function of a measurable set is measurable, so every elementary function is measurable. It is interesting that each measurable function can be approximated arbitrarily well by elementary functions: For each measurable function f ≥ 0 there exists a monotone increasing sequence of non-negative elementary functions which converges pointwise to f.
§ Here and in the following, "a.e." is an abbreviation for "almost everywhere".
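A standard construction of such a sequence is f_n(x) = min(n, 2^{-n}⌊2^n f(x)⌋). The sketch below checks the monotone pointwise convergence for a sample non-negative function (the function and the sample points are arbitrary test data):

```python
import numpy as np

def simple_approx(f_vals, n):
    """Elementary (simple) function values: dyadic steps of width 2^{-n}, cut off at n."""
    return np.minimum(n, np.floor((2.0 ** n) * f_vals) / (2.0 ** n))

x = np.linspace(0.0, 3.0, 7)
f_vals = np.exp(x)                               # a measurable function f >= 0

prev = np.zeros_like(f_vals)
for n in range(1, 12):
    cur = simple_approx(f_vals, n)
    assert np.all(cur >= prev - 1e-15)           # sequence is monotone increasing in n
    prev = cur

print(np.max(np.abs(prev - np.minimum(f_vals, 11))))  # error <= 2^{-11} where f <= 11
```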
12.9.3 Integration
12.9.3.1 Definition of the Integral
Let (X, A, μ) be a measure space. The integral ∫_X f dμ (also denoted by ∫ f dμ; in this section, except in point 5, we prefer the latter notation) of a measurable function f is defined by means of the following steps:
1. If f is an elementary function f = Σ_{k=1}^n a_k χ_k, then
∫ f dμ = Σ_{k=1}^n a_k μ(A_k).   (12.207)
2. If f: X → R̄ with f ≥ 0, then
∫ f dμ = sup { ∫ g dμ : g is an elementary function with 0 ≤ g(x) ≤ f(x) ∀x ∈ X }.   (12.208)
3. If f: X → R̄ and f⁺, f⁻ are the positive and the negative parts of f, then
∫ f dμ = ∫ f⁺ dμ − ∫ f⁻ dμ,   (12.209)
under the condition that at least one of the integrals on the right side is finite (in order to avoid the meaningless expression ∞ − ∞).
4. For a complex-valued function f = g + ih, if the integrals (12.209) of the functions g, h are finite, put
∫ f dμ = ∫ g dμ + i ∫ h dμ.   (12.210)
5. If for a measurable set A and a function f there exists the integral of the function f·χ_A, then put
∫_A f dμ = ∫ f χ_A dμ.   (12.211)
The integral of a measurable function is in general a number from R̄. A function f: X → R̄ is called integrable or summable over X with respect to μ if it is measurable and ∫ |f| dμ < ∞.
12.9.3.2 Some Properties of the Integral
Let (X, A, μ) be a measure space, f, g: X → R̄ measurable functions and α, β ∈ R.
1. If f is integrable, then f is finite a.e., i.e., μ{x ∈ X : |f(x)| = +∞} = 0.
2. If f is integrable, then |∫ f dμ| ≤ ∫ |f| dμ.
3. If f is integrable and f ≥ 0, then ∫ f dμ ≥ 0.
4. If 0 ≤ g(x) ≤ f(x) on X and f is integrable, then g is also integrable, and ∫ g dμ ≤ ∫ f dμ.
5. If f, g are integrable, then αf + βg is integrable, and ∫ (αf + βg) dμ = α ∫ f dμ + β ∫ g dμ.
6. If f, g are integrable on A ∈ A, i.e., the integrals ∫_A f dμ and ∫_A g dμ exist according to (12.211), and f = g μ-a.e. on A, then ∫_A f dμ = ∫_A g dμ.
If X = R^n and λ is the Lebesgue measure, then we have the notion of the (n-dimensional) Lebesgue integral (see also 8.2.3.1, 3., p. 452). In the case n = 1 and A = [a,b], for every continuous function f on [a,b] both the Riemann integral ∫_a^b f(x) dx (see 8.2.1.1, 2., p. 439) and the Lebesgue integral ∫_{[a,b]} f dλ are defined. Both values are finite and equal to each other. Furthermore, if f is a bounded Riemann integrable function on [a,b], then it is also Lebesgue integrable and the values of the two integrals coincide. The set of Lebesgue integrable functions is larger than the set of Riemann integrable functions, and it has several advantages, e.g., when passing to the limit under the integral sign; moreover, f and |f| are Lebesgue integrable simultaneously.
12.9.3.3 Convergence Theorems
In the following, Lebesgue measurable functions are considered throughout.
1. B. Levi's Theorem on Monotone Convergence Let {f_n} be an a.e. monotone increasing sequence of non-negative integrable functions with values in R̄. Then
lim_{n→∞} ∫ f_n dμ = ∫ lim_{n→∞} f_n dμ.   (12.212)
2. Fatou's Theorem Let {f_n} be a sequence of non-negative R̄-valued measurable functions. Then
∫ lim inf f_n dμ ≤ lim inf ∫ f_n dμ.   (12.213)
3. Lebesgue's Dominated Convergence Theorem Let {f_n} be a sequence of measurable functions convergent on X a.e. to some function f. If there exists an integrable function g such that |f_n| ≤ g a.e., then f = lim f_n is integrable and
lim ∫ f_n dμ = ∫ (lim f_n) dμ.   (12.214)
4. Radon-Nikodym Theorem
a) Assumptions: Let (X, A, μ) be a σ-finite measure space, i.e., there exists a sequence {A_n}, A_n ∈ A, such that X = ⋃_{n=1}^∞ A_n and μ(A_n) < ∞ for ∀n. In this case the measure is called σ-finite. It is called finite if μ(X) < ∞, and it is called a probability measure if μ(X) = 1. A real function φ defined on A is called absolutely continuous with respect to μ if μ(A) = 0 implies φ(A) = 0. We denote this property by φ ≪ μ. For an integrable function f, the function φ defined on A by φ(A) = ∫_A f dμ is σ-additive and absolutely continuous with respect to the measure μ. The converse of this property plays a fundamental role in many theoretical investigations and practical applications:
b) Radon-Nikodym Theorem: Suppose a σ-additive function φ and a measure μ are given on a σ-algebra A, and let φ ≪ μ. Then there exists a μ-integrable function f such that for each set A ∈ A
φ(A) = ∫_A f dμ.   (12.215)
The function f is uniquely determined up to its equivalence class, and f ≥ 0 μ-a.e. if and only if φ is non-negative.
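For discrete (counting-type) measures the Radon-Nikodym density is simply the ratio of the point masses. The sketch below (all point masses are arbitrary test values) checks φ(A) = ∫_A f dμ for the density f = dφ/dμ:

```python
import numpy as np

# measures on the finite set {0, 1, 2, 3}: mu gives positive mass to every point, so any
# phi with phi({x}) = 0 whenever mu({x}) = 0 is absolutely continuous with respect to mu
mu = np.array([0.5, 1.0, 2.0, 0.25])
phi = np.array([1.0, 0.0, 3.0, 0.5])

f = phi / mu                                    # Radon-Nikodym density dphi/dmu

def integral_over(A):
    """int_A f dmu for a subset A given as a list of indices."""
    return float(np.sum(f[A] * mu[A]))

for A in ([0], [1, 2], [0, 2, 3]):
    print(np.isclose(integral_over(A), phi[A].sum()))   # phi(A) = int_A f dmu
```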
12.9.4 L^p Spaces
Let (X, A, μ) be a measure space and p a real number, 1 ≤ p < ∞. For a measurable function f, according to 12.9.2.2, p. 634, the function |f|^p is measurable as well, so the expression
N_p(f) = (∫ |f|^p dμ)^{1/p}   (12.216)
is defined (and may be equal to +∞). A measurable function f: X → R̄ is called p-th power integrable, or an L^p-function, if N_p(f) < +∞ holds or, equivalently, if |f|^p is integrable. For every p with 1 ≤ p < +∞ we denote the set of all L^p-functions, i.e., all functions p-th power integrable with respect to μ on X, by L^p(μ) or L^p(X) or, in full detail, L^p(X, A, μ). For p = 1 we use the simple notation L(X). For p = 2 the functions are called quadratically integrable. We denote the set of all measurable μ-a.e. bounded functions on X by L^∞(μ) and define the essential supremum of a function f as
N_∞(f) = ess. sup f = inf{α ∈ R : |f(x)| ≤ α μ-a.e.}.   (12.217)
L^p(μ) (1 ≤ p ≤ ∞), equipped with the usual operations for measurable functions and taking into account the Minkowski inequality for integrals (see 1.4.2.13, p. 32), is a vector space, and N_p(·) is a semi-norm on L^p(μ). If f ≤ g means that f(x) ≤ g(x) holds μ-a.e., then L^p(μ) is also a vector lattice and even a K-space (see 12.1.7.4, p. 600). Two functions f, g ∈ L^p(μ) are called equivalent (or we declare them to be equal) if f = g μ-a.e. on X. In this way functions are identified if they are equal μ-a.e. The factorization of the set L^p(X) modulo the linear subspace N_p⁻¹(0) leads to a set of equivalence classes on which the algebraic operations and the order can be transferred naturally. So we get a vector lattice (K-space) again, which is now denoted by L^p(X, μ) or L^p(μ). Its elements are called functions, as before, but actually they are classes of equivalent functions.
It is very important that ||f̂||_p = N_p(f) is now a norm on L^p(μ) (f̂ stands here for the equivalence class of f, which will later be denoted simply by f), and (L^p(μ), ||·||_p) for every p with 1 ≤ p ≤ +∞ is a Banach lattice with several good compatibility conditions between norm and order. For p = 2, with (f, g) = ∫ f g dμ as a scalar product, L²(μ) is also a Hilbert space (see [12.12]).
Very often, for a measurable subset Ω ⊂ R^n, the space L^p(Ω) is considered. Its definition is not a problem because of step 5 in 12.9.3.1, p. 635. The spaces L^p(Ω, λ), where λ is the n-dimensional Lebesgue measure, can also be introduced as the completions (see 12.2.2.5, p. 608, and 12.3.2, p. 610) of the non-complete normed spaces C(Ω) of all continuous functions on the set Ω ⊂ R^n equipped with the integral norm
||x||_p = (∫_Ω |x|^p dλ)^{1/p}   (1 ≤ p < ∞)
(see [12.18]). Let X be a set with a finite measure, i.e., μ(X) < +∞, and suppose for the real numbers p₁, p₂ that 1 ≤ p₁ < p₂ ≤ +∞. Then L^{p₂}(X, μ) ⊂ L^{p₁}(X, μ), and with a constant C = C(p₁, p₂, μ(X)) > 0 independent of x the estimate ||x||_{p₁} ≤ C||x||_{p₂} holds for x ∈ L^{p₂} (here ||x||_{p_k} denotes the norm of the space L^{p_k}(X, μ), k = 1, 2).
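On a finite measure space the embedding L^{p₂} ⊂ L^{p₁} can be checked numerically. The sketch below approximates N_p(f) on X = [0, 1] with the trapezoidal rule and uses the constant C = μ(X)^{1/p₁ − 1/p₂} coming from the Hölder inequality (the test function is an arbitrary choice):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 20001)
f = np.abs(np.sin(7 * np.pi * t)) ** 0.5 + t      # arbitrary test function on [0, 1]

def Np(f_vals, p):
    """Trapezoidal approximation of N_p(f) = (int_0^1 |f|^p dt)^(1/p)."""
    return np.trapz(np.abs(f_vals) ** p, t) ** (1.0 / p)

p1, p2 = 2.0, 5.0
C = 1.0 ** (1.0 / p1 - 1.0 / p2)                  # mu([0,1]) = 1, so C = 1
print(Np(f, p1) <= C * Np(f, p2) + 1e-12)         # embedding estimate ||f||_p1 <= C ||f||_p2
print(Np(f, p1), Np(f, p2))
```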
12.9.5 Distributions
12.9.5.1 Formula of Partial Integration
For an arbitrary (open) domain Ω ⊂ R^n, C₀^∞(Ω) denotes the set of all functions φ that are arbitrarily many times differentiable in Ω and have compact support, i.e., the set supp(φ) = closure of {x ∈ Ω : φ(x) ≠ 0} is compact in R^n and lies in Ω. By L¹_loc(Ω) we denote the set of all functions locally summable with respect to the Lebesgue measure in R^n, i.e., all measurable functions f (equivalence classes) on Ω such that ∫_ω |f| dλ < +∞ for every bounded domain ω ⊂ Ω. Both sets are vector spaces (with the natural algebraic operations). There holds L^p(Ω) ⊂ L¹_loc(Ω) for 1 ≤ p ≤ ∞, and L¹_loc(Ω) = L¹(Ω) for a bounded Ω. If we consider the elements of C^k(Ω̄) as the classes generated by them in L^p(Ω), then the inclusion C^k(Ω̄) ⊂ L^p(Ω) holds for bounded Ω, where C^k(Ω̄) is moreover dense. If Ω is unbounded, then the set C₀^∞(Ω) is dense (in this sense) in L^p(Ω).
For a given function f ∈ C^k(Ω̄) and an arbitrary function φ ∈ C₀^∞(Ω), the formula of partial integration has the form
∫_Ω f(x) D^α φ(x) dλ = (−1)^{|α|} ∫_Ω φ(x) D^α f(x) dλ   (12.218)
∀α with |α| ≤ k (we have used the fact that D^β φ vanishes on ∂Ω), which can be taken as the starting point for the definition of the generalized derivative of a function f ∈ L¹_loc(Ω).
12.9.5.2 Generalized Derivative
Suppose f ∈ L¹_loc(Ω). If there exists a function g ∈ L¹_loc(Ω) such that for some multi-index α the equation
∫_Ω f(x) D^α φ(x) dλ = (−1)^{|α|} ∫_Ω g(x) φ(x) dλ   (12.219)
holds ∀φ ∈ C₀^∞(Ω), then g is called the generalized derivative (derivative in the Sobolev sense or distributional derivative) of order α of f. We write g = D^α f as in the classical case.
We define the convergence of a sequence {φ_k} in the vector space C₀^∞(Ω) to φ ∈ C₀^∞(Ω) as follows: φ_k → φ if and only if
a) there is a compact set K ⊂ Ω with supp(φ_k) ⊂ K ∀k, and
b) D^α φ_k → D^α φ uniformly on K for each multi-index α.   (12.220)
The set C₀^∞(Ω), equipped with this convergence of sequences, is called the fundamental space and is denoted by D(Ω). Its elements are often called test functions.
12.9.5.3 Distributions
A linear functional ℓ on D(Ω) which is continuous in the following sense (see 12.2.3, p. 608):
φ_k, φ ∈ D(Ω) and φ_k → φ implies ℓ(φ_k) → ℓ(φ)   (12.221)
is called a generalized function or a distribution.
■ A: If f ∈ L¹_loc(Ω), then
ℓ_f(φ) = ∫_Ω f(x) φ(x) dλ,  φ ∈ D(Ω),   (12.222)
is a distribution. A distribution defined by a locally summable function as in (12.222) is called regular. Two regular distributions are equal, i.e., ℓ_f(φ) = ℓ_g(φ) ∀φ ∈ D(Ω), if and only if f = g a.e. with respect to λ.
■ B: Let a ∈ Ω be an arbitrary fixed point. Then δ_a(φ) = φ(a), φ ∈ D(Ω), is a linear continuous functional on D(Ω), hence a distribution; it is called the Dirac distribution, δ distribution or δ function. Since δ_a cannot be generated by any locally summable function (see [12.11], [12.24]), it is an example of a non-regular distribution.
The set of all distributions is denoted by D′(Ω). From a duality theory more general than that discussed in 12.5.4, p. 621, we obtain D′(Ω) as the dual space of D(Ω); consequently, one should actually write D*(Ω). In the space D′(Ω) it is possible to define several operations with its elements and with functions from C^∞(Ω), e.g., the derivative of a distribution or the convolution of two distributions, which make
D′(Ω) important not only in theoretical investigations but also in practical applications in electrical engineering, mechanics, etc. For a review and for simple examples of applications of generalized functions see, e.g., [12.11], [12.24]. Here we discuss only the notion of the derivative of a generalized function.
12.9.5.4 Derivative of a Distribution
If ℓ is a given distribution, then the distribution D^α ℓ defined by
(D^α ℓ)(φ) = (−1)^{|α|} ℓ(D^α φ),  φ ∈ D(Ω),   (12.223)
is called the distributional derivative of order α of ℓ.
Let f be a continuously differentiable function, say on R (so f is locally summable on R and can be considered as a distribution), let f′ be its classical derivative and D¹ℓ_f its distributional derivative of order 1. Then
(D¹ℓ_f)(φ) = −ℓ_f(φ′) = −∫_R f(x) φ′(x) dx,   (12.224a)
from which by partial integration there follows
(D¹ℓ_f)(φ) = ∫_R f′(x) φ(x) dx = ℓ_{f′}(φ).   (12.224b)
In the case of a regular distribution ℓ_f with f ∈ L¹_loc(Ω), by using (12.222) we obtain
(D^α ℓ_f)(φ) = (−1)^{|α|} ℓ_f(D^α φ) = (−1)^{|α|} ∫_Ω f(x) D^α φ dλ,   (12.225)
and get the generalized derivative of the function f in the Sobolev sense (see (12.219)).
■ A: For the regular distribution generated by the (obviously locally summable) Heaviside function
Θ(x) = 1 for x ≥ 0,  Θ(x) = 0 for x < 0,   (12.226)
we get the non-regular δ distribution as the derivative.
■ B: In the mathematical modeling of technical and physical problems we are faced with (in a certain sense idealized) influences concentrated at one point, such as a "point-like" force, needle deflection, collision, etc., which can be expressed mathematically by using the δ or Heaviside function. For example, mδ_a is the mass density of a point-like mass m concentrated at the point a (0 < a < l) of a beam of length l. The motion of a spring-mass system on which a momentary external force F acts at time t₀ is described by the equation ẍ + ω₀²x = F δ_{t₀}. With the initial conditions x(0) = ẋ(0) = 0 its solution is
x(t) = (F/ω₀) sin(ω₀(t − t₀)) Θ(t − t₀).
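The closed-form impulse response of example B can be checked against a numerical solution in which the δ term is replaced by a narrow rectangular pulse of unit area. A sketch (pulse width, ω₀, F and t₀ are arbitrary test values):

```python
import numpy as np
from scipy.integrate import solve_ivp

omega0, F, t0, eps = 3.0, 2.0, 1.0, 1e-3           # arbitrary test parameters

def rhs(t, y):
    # delta_{t0} approximated by a rectangular pulse of height 1/eps on [t0, t0 + eps]
    force = F / eps if t0 <= t <= t0 + eps else 0.0
    return [y[1], -omega0**2 * y[0] + force]

t_eval = np.linspace(0.0, 5.0, 1001)
sol = solve_ivp(rhs, (0.0, 5.0), [0.0, 0.0], t_eval=t_eval,
                max_step=eps / 4, rtol=1e-8, atol=1e-10)

exact = (F / omega0) * np.sin(omega0 * (t_eval - t0)) * (t_eval >= t0)
print(np.max(np.abs(sol.y[0] - exact)))             # small: pulse response ~ impulse response
```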
13 Vector Analysis and Vector Fields
13.1 Basic Notions of the Theory of Vector Fields
13.1.1 Vector Functions of a Scalar Variable
13.1.1.1 Definitions
1. Vector Function of a Scalar Variable t A vector function of a scalar variable is a vector a⃗ whose components are real functions of t:
a⃗ = a⃗(t) = a_x(t) e⃗_x + a_y(t) e⃗_y + a_z(t) e⃗_z.   (13.1)
The notions of limit, continuity, and differentiability are defined componentwise for the vector a⃗(t).
2. Hodograph of a Vector Function If we consider the vector function a⃗(t) as a position or radius vector r⃗ = r⃗(t) of a point P, then this function describes a space curve while t varies (Fig. 13.1). This space curve is called the hodograph of the vector function a⃗(t).
Figure 13.1
Figure 13.2
Figure 13.3
13.1.1.2 Derivative of a Vector Function
The derivative of (13.1) with respect to t is also a vector function of t:
da⃗/dt = (da_x/dt) e⃗_x + (da_y/dt) e⃗_y + (da_z/dt) e⃗_z.   (13.2)
The geometric representation of the derivative dr⃗/dt of the radius vector is a vector pointing in the direction of the tangent of the hodograph at the point P (Fig. 13.2). Its length depends on the choice of the parameter t. If t is the time, then the vector r⃗(t) describes the motion of a point P in space (the space curve is its path), and dr⃗/dt has the direction and magnitude of the velocity of this motion. If t = s is the arc length of this space curve, measured from a certain point, then obviously |dr⃗/ds| = 1.
13.1.1.3 Rules of Differentiation for Vectors
d/dt (a⃗ ± b⃗ ± c⃗) = da⃗/dt ± db⃗/dt ± dc⃗/dt,   (13.3a)
d/dt (φ a⃗) = (dφ/dt) a⃗ + φ da⃗/dt   (φ is a scalar function of t),   (13.3b)
d/dt (a⃗ · b⃗) = (da⃗/dt) · b⃗ + a⃗ · (db⃗/dt),   (13.3c)
d/dt (a⃗ × b⃗) = (da⃗/dt) × b⃗ + a⃗ × (db⃗/dt)   (the order of the factors must not be interchanged),   (13.3d)
d/dt a⃗[φ(t)] = (da⃗/dφ)(dφ/dt)   (chain rule).   (13.3e)
If |a⃗(t)| = const, i.e., a⃗²(t) = a⃗(t) · a⃗(t) = const, then it follows from (13.3c) that a⃗ · da⃗/dt = 0, i.e., da⃗/dt and a⃗ are perpendicular to each other. Examples of this fact:
■ A: Radius and tangent vectors of a circle in the plane.
■ B: Position and tangent vectors of a curve on a sphere; the hodograph is then a spherical curve.
13.1.1.4 Taylor Expansion for Vector Functions
a⃗(t + h) = a⃗(t) + h da⃗/dt + (h²/2!) d²a⃗/dt² + ... + (hⁿ/n!) dⁿa⃗/dtⁿ + ...   (13.4)
The expansion of a vector function in a Taylor series makes sense only if it is convergent. Because the limit is defined componentwise, the convergence can be checked componentwise, so the convergence of this series with vector terms is determined by exactly the same methods as the convergence of a series with complex terms (see 14.3.2, p. 689). The convergence of a series with vector terms is thus reduced to the convergence of series with scalar terms.
The differential of a vector function a⃗(t) is defined by
da⃗ = (da⃗/dt) Δt.   (13.5)
13.1.2 Scalar Fields
13.1.2.1 Scalar Field or Scalar Point Function
If we assign a number (scalar value) U to every point P of a subset of space, then we write
U = U(P)   (13.6a)
and we call (13.6a) a scalar field (or scalar function).
■ Examples of scalar fields are temperature, density, potential, etc., of solids.
A scalar field U = U(P) can also be considered as
U = U(r⃗),   (13.6b)
where r⃗ is the position vector of the point P with a given pole O (see 3.5.1.1, 6., p. 181).
13.1.2.2 Important Special Cases of Scalar Fields
1. Plane Field We have a plane field if the function is defined only for the points of a plane in space.
2. Central Field If a function has the same value at all points P lying at the same distance from a fixed point C(r⃗₁), called the center, then we call it a centrally symmetric field or also a central or spherical field. The function U depends only on the distance CP = |r⃗|:
U = f(|r⃗|).   (13.7a)
■ The field of the intensity of a point-like source, e.g., the field of brightness of a point-like source of light at the pole, can be described with |r⃗| = r as the distance from the light source:
U = c/r²   (c const).   (13.7b)
3. Axial Field If the function U has the same value at all points lying at an equal distance from a certain straight line (the axis of the field), then the field is called cylindrically symmetric or axially symmetric, or briefly an axial field.
13.1.2.3 Coordinate Definition of a Field
If the points of a subset of space are given by their coordinates, e.g., by Cartesian, cylindrical, or spherical coordinates, then the corresponding scalar field (13.6a) is represented, in general, by a function of three variables:
U = Φ(x, y, z),  U = Φ̃(ρ, φ, z)  or  U = χ(r, θ, φ).   (13.8a)
In the case of a plane field, a function of two variables is sufficient; in Cartesian or polar coordinates it has the form
U = Φ(x, y)  or  U = Φ̃(ρ, φ).   (13.8b)
The functions in (13.8a) and (13.8b) are in general assumed to be continuous, except, maybe, at some points, curves, or surfaces of discontinuity. The functions have the form
a) for a central field:  U = U(√(x² + y² + z²)) = U(√(ρ² + z²)) = U(r),   (13.9a)
b) for an axial field:  U = U(√(x² + y²)) = U(ρ) = U(r sin θ).   (13.9b)
Dealing with central fields is easiest in spherical coordinates, with axial fields in cylindrical coordinates.
13.1.2.4 Level Surfaces and Level Lines of a Field
1. Level Surface A level surface is the union of all points in space where the function (13.6a) has a constant value
U = const.   (13.10a)
Different constants U₀, U₁, U₂, ... define different level surfaces. There is a level surface passing through every point, except the points where the function is not defined. The level surface equations in the three coordinate systems used so far are:
U = Φ(x, y, z) = const,  U = Φ̃(ρ, φ, z) = const,  U = χ(r, θ, φ) = const.   (13.10b)
■ Examples of level surfaces of different fields:
A: U = c⃗ · r⃗ = c_x x + c_y y + c_z z: parallel planes.
B: U = x² + 2y² + 4z²: similar ellipsoids in similar positions.
C: Central field: concentric spheres.
D: Axial field: coaxial cylinders.
2. Level Lines Level lines replace level surfaces in plane fields. They satisfy the equation
U = const.   (13.11)
Level lines are usually drawn for equal intervals of U, and each of them is marked by the corresponding value of U (Fig. 13.3).
■ Well-known examples are the isobars on a synoptic weather map or the contour lines on topographic maps.
In particular cases, level surfaces degenerate into points or lines, and level lines degenerate into separate points.
■ The level lines of the fields a) U = xy, b) U = y/x², c) U = r², d) U = 1/ρ are represented in Fig. 13.4.
Figure 13.4
13.1.3 Vector Fields
13.1.3.1 Vector Field or Vector Point Function
If we assign a vector V⃗ to every point P of a subset of space, then we denote it by
V⃗ = V⃗(P)   (13.12a)
and we call (13.12a) a vector field.
■ Examples of vector fields are the velocity field of a fluid in motion, a field of force, and a magnetic or electric field strength.
A vector field V⃗ = V⃗(P) can be regarded as a vector function
V⃗ = V⃗(r⃗),   (13.12b)
where r⃗ is the position vector of the point P with a given pole O. If all values of r⃗ as well as V⃗ lie in a plane, then the field is called a plane vector field (see 3.5.2, p. 189).
13.1.3.2 Important Cases of Vector Fields
1. Central Vector Field In a central vector field all vectors V⃗ lie on straight lines passing through a fixed point called the center (Fig. 13.5a). If we place the pole at the center, then the field is defined by the formula
V⃗ = f(r⃗) r⃗,   (13.13a)
where all the vectors have the same direction as the radius vector r⃗. It is often advantageous to define the field by the formula
V⃗ = φ(r⃗) r⃗/r,   (13.13b)
where φ(r⃗) is the length of the vector V⃗ and r⃗/r is a unit vector.
Figure 13.5
2. Spherical Vector Field A spherical vector field is a special case of a central vector field in which the length of the vector depends only on the distance |r⃗| (Fig. 13.5b).
■ Examples are the Newton and the Coulomb force fields of a point-like mass or of a point-like electric charge:
V⃗ = (c/r³) r⃗ = (c/r²) (r⃗/r)   (c const).   (13.14)
The special case of a plane spherical vector field is called a circular field.
3. Cylindrical Vector Field a) All vectors V⃗ lie on straight lines intersecting a certain line (called the axis) and perpendicular to it, and b) all vectors V⃗ at points lying at the same distance from the axis have equal length, and they are directed either toward the axis or away from it (Fig. 13.5c). If we place the pole on the axis, parallel to the unit vector c⃗, then the field has the form
V⃗ = φ(ρ) r⃗*/ρ,   (13.15a)
where r⃗* is the projection of r⃗ onto a plane perpendicular to the axis:
r⃗* = c⃗ × (r⃗ × c⃗).   (13.15b)
Intersecting this field with planes perpendicular to the axis always yields equal circular fields.
13.1.3.3 Coordinate Representation of Vector Fields
1. Vector Field in Cartesian Coordinates The vector field (13.12a) can be defined by three scalar fields V₁(r⃗), V₂(r⃗), V₃(r⃗), which are the coordinate functions of V⃗, i.e., the coefficients of its decomposition with respect to any three non-coplanar base vectors e⃗₁, e⃗₂, e⃗₃:
V⃗ = V₁ e⃗₁ + V₂ e⃗₂ + V₃ e⃗₃.   (13.16a)
If we take the coordinate unit vectors i⃗, j⃗, and k⃗ as the base vectors and express the coefficients V_x, V_y, V_z in Cartesian coordinates, then we get
V⃗ = V_x(x, y, z) i⃗ + V_y(x, y, z) j⃗ + V_z(x, y, z) k⃗.   (13.16b)
So, the vector field can be defined with the help of three scalar functions of three scalar variables.
2. Vector Field in Cylindrical and Spherical Coordinates In cylindrical and spherical coordinates, the coordinate unit vectors
e⃗_ρ, e⃗_φ, e⃗_z (= k⃗)   and   e⃗_r (= r⃗/r), e⃗_θ, e⃗_φ   (13.17a)
are tangent to the coordinate lines at each point (Fig. 13.6, 13.7). In this order they always form a right-handed system. The coefficients are expressed as functions of the corresponding coordinates:
V⃗ = V_ρ(ρ, φ, z) e⃗_ρ + V_φ(ρ, φ, z) e⃗_φ + V_z(ρ, φ, z) e⃗_z,   (13.17b)
V⃗ = V_r(r, θ, φ) e⃗_r + V_θ(r, θ, φ) e⃗_θ + V_φ(r, θ, φ) e⃗_φ.   (13.17c)
At the transition from one point to another, the coordinate unit vectors change their directions, but they remain mutually perpendicular.
Figure 13.6    Figure 13.7    Figure 13.8
13.1.3.4 Transformation of Coordinate Systems
See also Table 13.1.
1. Cartesian Coordinates in Terms of Cylindrical Coordinates
$V_x = V_\rho\cos\varphi - V_\varphi\sin\varphi,\quad V_y = V_\rho\sin\varphi + V_\varphi\cos\varphi,\quad V_z = V_z$.  (13.18)
2. Cylindrical Coordinates in Terms of Cartesian Coordinates
$V_\rho = V_x\cos\varphi + V_y\sin\varphi,\quad V_\varphi = -V_x\sin\varphi + V_y\cos\varphi,\quad V_z = V_z$.  (13.19)
3. Cartesian Coordinates in Terms of Spherical Coordinates
$V_x = V_r\sin\vartheta\cos\varphi + V_\vartheta\cos\vartheta\cos\varphi - V_\varphi\sin\varphi$,
$V_y = V_r\sin\vartheta\sin\varphi + V_\vartheta\cos\vartheta\sin\varphi + V_\varphi\cos\varphi$,
$V_z = V_r\cos\vartheta - V_\vartheta\sin\vartheta$.  (13.20)
4. Spherical Coordinates in Terms of Cartesian Coordinates
$V_r = V_x\sin\vartheta\cos\varphi + V_y\sin\vartheta\sin\varphi + V_z\cos\vartheta$,
$V_\vartheta = V_x\cos\vartheta\cos\varphi + V_y\cos\vartheta\sin\varphi - V_z\sin\vartheta$,
$V_\varphi = -V_x\sin\varphi + V_y\cos\varphi$.  (13.21)
5. Expression of a Spherical Vector Field in Cartesian Coordinates
$\vec{V} = \varphi\!\left(\sqrt{x^2+y^2+z^2}\right)(x\,\vec{i} + y\,\vec{j} + z\,\vec{k})$.  (13.22)
6. Expression of a Cylindrical Vector Field in Cartesian Coordinates
$\vec{V} = \varphi\!\left(\sqrt{x^2+y^2}\right)(x\,\vec{i} + y\,\vec{j})$.  (13.23)
In the case of a spherical vector field, spherical coordinates are most convenient for investigations, i.e., the form $\vec{V} = V(r)\,\vec{e}_r$; for investigations in cylindrical fields, cylindrical coordinates are most convenient, i.e., the form $\vec{V} = V(\rho)\,\vec{e}_\rho$. In the case of a plane field (Fig. 13.8) we have
$\vec{V} = V_x(x,y)\,\vec{i} + V_y(x,y)\,\vec{j} = V_\rho(\rho,\varphi)\,\vec{e}_\rho + V_\varphi(\rho,\varphi)\,\vec{e}_\varphi$,  (13.24)
and for a circular field
$\vec{V} = \varphi\!\left(\sqrt{x^2+y^2}\right)(x\,\vec{i} + y\,\vec{j}) = \varphi(\rho)\,\vec{e}_\rho$.  (13.25)
Table 13.1 Relations between the components of a vector in Cartesian, cylindrical, and spherical coordinates

The vector is $\vec{V} = V_x\vec{e}_x + V_y\vec{e}_y + V_z\vec{e}_z = V_\rho\vec{e}_\rho + V_\varphi\vec{e}_\varphi + V_z\vec{e}_z = V_r\vec{e}_r + V_\vartheta\vec{e}_\vartheta + V_\varphi\vec{e}_\varphi$.

Cartesian components:
$V_x = V_\rho\cos\varphi - V_\varphi\sin\varphi = V_r\sin\vartheta\cos\varphi + V_\vartheta\cos\vartheta\cos\varphi - V_\varphi\sin\varphi$
$V_y = V_\rho\sin\varphi + V_\varphi\cos\varphi = V_r\sin\vartheta\sin\varphi + V_\vartheta\cos\vartheta\sin\varphi + V_\varphi\cos\varphi$
$V_z = V_z = V_r\cos\vartheta - V_\vartheta\sin\vartheta$

Cylindrical components:
$V_\rho = V_x\cos\varphi + V_y\sin\varphi = V_r\sin\vartheta + V_\vartheta\cos\vartheta$
$V_\varphi = -V_x\sin\varphi + V_y\cos\varphi = V_\varphi$
$V_z = V_z = V_r\cos\vartheta - V_\vartheta\sin\vartheta$

Spherical components:
$V_r = V_x\sin\vartheta\cos\varphi + V_y\sin\vartheta\sin\varphi + V_z\cos\vartheta = V_\rho\sin\vartheta + V_z\cos\vartheta$
$V_\vartheta = V_x\cos\vartheta\cos\varphi + V_y\cos\vartheta\sin\varphi - V_z\sin\vartheta = V_\rho\cos\vartheta - V_z\sin\vartheta$
$V_\varphi = -V_x\sin\varphi + V_y\cos\varphi = V_\varphi$
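The component relations (13.20) and (13.21) can be checked mechanically. The following short SymPy sketch (not part of the handbook; the variable names are ours and the check is only illustrative) converts Cartesian components into spherical components and back and verifies that the round trip is the identity.

```python
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
Vx, Vy, Vz = sp.symbols('V_x V_y V_z')

# (13.21): spherical components from Cartesian ones
Vr  = Vx*sp.sin(th)*sp.cos(ph) + Vy*sp.sin(th)*sp.sin(ph) + Vz*sp.cos(th)
Vth = Vx*sp.cos(th)*sp.cos(ph) + Vy*sp.cos(th)*sp.sin(ph) - Vz*sp.sin(th)
Vph = -Vx*sp.sin(ph) + Vy*sp.cos(ph)

# (13.20): Cartesian components recovered from the spherical ones
Vx_back = Vr*sp.sin(th)*sp.cos(ph) + Vth*sp.cos(th)*sp.cos(ph) - Vph*sp.sin(ph)
Vy_back = Vr*sp.sin(th)*sp.sin(ph) + Vth*sp.cos(th)*sp.sin(ph) + Vph*sp.cos(ph)
Vz_back = Vr*sp.cos(th) - Vth*sp.sin(th)

print([sp.simplify(a - b) for a, b in
       ((Vx_back, Vx), (Vy_back, Vy), (Vz_back, Vz))])   # [0, 0, 0]
```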
13.1.3.5 Vector Lines
A curve C is called a line of a vector or a vector line of the vector field $\vec{V}(\vec{r})$ (Fig. 13.9) if the vector $\vec{V}(\vec{r})$ is a tangent vector of the curve at every point P. There is a vector line passing through every point of the field. Vector lines do not intersect each other, except possibly at points where the function $\vec{V}$ is not defined or where it is the zero vector. The differential equations of the vector lines of a vector field $\vec{V}$ given in Cartesian coordinates are

Figure 13.9

a) in general: $\frac{dx}{V_x} = \frac{dy}{V_y} = \frac{dz}{V_z}$,  (13.26a)
b) for a plane field: $\frac{dx}{V_x} = \frac{dy}{V_y}$.  (13.26b)
For solving these differential equations see 9.1.1.2, p. 487, or 9.2.1.1, p. 515.
A: The vector lines of a central field are rays starting at the center of the vector field.
B: The vector lines of the vector field $\vec{V} = \vec{c}\times\vec{r}$ are circles lying in planes perpendicular to the vector $\vec{c}$; their centers are on the axis parallel to $\vec{c}$.
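As a minimal sketch (not from the handbook, the chosen field is illustrative), the plane vector-line equation (13.26b) can be solved symbolically. For the plane rotation field $\vec{V} = (-y, x)$ the vector lines come out as circles, in agreement with Example B above.

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# (13.26b): dy/dx = Vy/Vx = x / (-y)
ode = sp.Eq(y(x).diff(x), -x / y(x))
solutions = sp.dsolve(ode, y(x))
print(solutions)   # y(x) = +/- sqrt(C1 - x**2), i.e. the circles x^2 + y^2 = C1
```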
13.2 Differential Operators of Space
13.2.1 Directional and Space Derivatives
13.2.1.1 Directional Derivative of a Scalar Field
The directional derivative of a scalar field $U = U(\vec{r})$ at a point P with position vector $\vec{r}$ in the direction $\vec{c}$ (Fig. 13.10) is defined as the limit of the quotient
$\frac{\partial U}{\partial\vec{c}} = \lim_{\varepsilon\to 0}\frac{U(\vec{r}+\varepsilon\vec{c}) - U(\vec{r})}{\varepsilon}$.  (13.27)
If the derivative of the field $U = U(\vec{r})$ at a point $\vec{r}$ in the direction of the unit vector $\vec{c}^{\,0}$ of $\vec{c}$ is denoted by $\frac{\partial U}{\partial\vec{c}^{\,0}}$, then the relation between the derivative with respect to the vector $\vec{c}$ and with respect to its unit vector $\vec{c}^{\,0}$ at the same point is
$\frac{\partial U}{\partial\vec{c}} = |\vec{c}|\,\frac{\partial U}{\partial\vec{c}^{\,0}}$.  (13.28)
The derivative $\frac{\partial U}{\partial\vec{c}^{\,0}}$ with respect to the unit vector represents the speed of increase of the function U in the direction of the vector $\vec{c}$ at the point $\vec{r}$. If $\vec{n}$ is the normal unit vector to the level surface passing through the point $\vec{r}$, and $\vec{n}$ points in the direction of increasing U, then $\frac{\partial U}{\partial n}$ has the greatest value among all the derivatives at this point taken with respect to unit vectors in different directions. Between the directional derivative with respect to $\vec{n}$ and with respect to any direction $\vec{c}^{\,0}$ we have the relation
$\frac{\partial U}{\partial\vec{c}^{\,0}} = \frac{\partial U}{\partial n}\cos(\vec{c}^{\,0},\vec{n}) = \frac{\partial U}{\partial n}\cos\varphi = \vec{c}^{\,0}\cdot\operatorname{grad}U$  (see (13.35), p. 649).  (13.29)
In the following, directional derivatives always mean the directional derivative with respect to a unit vector.
Figure 13.10    Figure 13.11
13.2.1.2 Directional Derivative of a Vector Field
The directional derivative of a vector field is defined analogously to the directional derivative of a scalar field. The directional derivative of the vector field $\vec{V} = \vec{V}(\vec{r})$ at a point P with position vector $\vec{r}$ (Fig. 13.11) with respect to the vector $\vec{a}$ is defined as the limit of the quotient
$\frac{\partial\vec{V}}{\partial\vec{a}} = \lim_{\varepsilon\to 0}\frac{\vec{V}(\vec{r}+\varepsilon\vec{a}) - \vec{V}(\vec{r})}{\varepsilon}$.  (13.30)
If the derivative of the vector field $\vec{V} = \vec{V}(\vec{r})$ at a point $\vec{r}$ in the direction of the unit vector $\vec{a}^{\,0}$ of $\vec{a}$ is denoted by $\frac{\partial\vec{V}}{\partial\vec{a}^{\,0}}$, then
$\frac{\partial\vec{V}}{\partial\vec{a}} = |\vec{a}|\,\frac{\partial\vec{V}}{\partial\vec{a}^{\,0}}$.  (13.31)
In Cartesian coordinates, i.e., for $\vec{V} = V_x\vec{e}_x + V_y\vec{e}_y + V_z\vec{e}_z$ and $\vec{a} = a_x\vec{e}_x + a_y\vec{e}_y + a_z\vec{e}_z$, we have:
$(\vec{a}\cdot\operatorname{grad})\vec{V} = (\vec{a}\cdot\operatorname{grad}V_x)\vec{e}_x + (\vec{a}\cdot\operatorname{grad}V_y)\vec{e}_y + (\vec{a}\cdot\operatorname{grad}V_z)\vec{e}_z$.  (13.32a)
In general coordinates we have:
$(\vec{a}\cdot\operatorname{grad})\vec{V} = \frac{1}{2}\big[\operatorname{rot}(\vec{V}\times\vec{a}) + \operatorname{grad}(\vec{a}\cdot\vec{V}) + \vec{a}\operatorname{div}\vec{V} - \vec{V}\operatorname{div}\vec{a} - \vec{a}\times\operatorname{rot}\vec{V} - \vec{V}\times\operatorname{rot}\vec{a}\big]$.  (13.32b)
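A small SymPy sketch (not from the handbook; the helper name and the sample field are ours) evaluates the Cartesian form (13.32a) of the directional derivative of a vector field componentwise.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)

def directional_derivative(V, a):
    """(a . grad) V computed componentwise, as in (13.32a)."""
    return [sum(ai*sp.diff(Vc, xi) for ai, xi in zip(a, (x, y, z))) for Vc in V]

V = (x*y, y*z, z*x)     # sample vector field
a = (1, 0, 0)           # direction e_x
print(directional_derivative(V, a))   # [y, 0, z], i.e. dV/dx
```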
13.2.1.3 Volume Derivative
Volume derivatives of a scalar field $U = U(\vec{r})$ or a vector field $\vec{V}(\vec{r})$ at a point $\vec{r}$ are quantities of three forms, which are obtained as follows:
1. We surround the point $\vec{r}$ of the scalar field or of the vector field by a closed surface $\Sigma$. This surface can be represented in parametric form $\vec{r} = \vec{r}(u,v) = x(u,v)\vec{e}_x + y(u,v)\vec{e}_y + z(u,v)\vec{e}_z$, so the corresponding vectorial surface element is
$d\vec{S} = \frac{\partial\vec{r}}{\partial u}\times\frac{\partial\vec{r}}{\partial v}\,du\,dv$.  (13.33a)
2. We evaluate the surface integral over the closed surface $\Sigma$. Here, the following three types of integrals can be considered:
$\oint_{(\Sigma)} U\,d\vec{S},\qquad \oint_{(\Sigma)}\vec{V}\cdot d\vec{S},\qquad \oint_{(\Sigma)}\vec{V}\times d\vec{S}$.  (13.33b)
3. We determine the limits (if they exist)
$\lim_{V\to 0}\frac{1}{V}\oint_{(\Sigma)} U\,d\vec{S},\qquad \lim_{V\to 0}\frac{1}{V}\oint_{(\Sigma)}\vec{V}\cdot d\vec{S},\qquad \lim_{V\to 0}\frac{1}{V}\oint_{(\Sigma)}\vec{V}\times d\vec{S}$.  (13.33c)
Here V denotes the volume of the region of space that contains the point with position vector $\vec{r}$ inside and which is bounded by the considered closed surface $\Sigma$. The limits (13.33c) are called volume derivatives. The gradient of a scalar field and the divergence and the rotation of a vector field can be derived from them in the given order. In the following paragraphs we discuss these notions in detail (and we even define them again).
13.2.2 Gradient of a Scalar Field
The gradient of a scalar field can be defined in different ways.
13.2.2.1 Definition of the Gradient
The gradient of a function U is a vector grad U, which can be assigned to every point of a scalar field $U = U(\vec{r})$ and which has the following properties:
1. The direction of grad U is always perpendicular to the level surface U = const passing through the considered point;
2. grad U always points in the direction in which the function U is increasing;
3. $|\operatorname{grad}U| = \frac{\partial U}{\partial n}$, i.e., the magnitude of grad U is equal to the directional derivative of U in the normal direction.
If the gradient is defined in another way, e.g., as a volume derivative or by the differential operator, then the previous defining properties become consequences of the definition.
13.2.2.2 Gradient and Volume Derivative
The gradient of the scalar field $U = U(\vec{r})$ at a point $\vec{r}$ can be defined as its volume derivative. If the following limit exists, then we call it the gradient of U at $\vec{r}$:
$\operatorname{grad}U = \lim_{V\to 0}\frac{1}{V}\oint_{(\Sigma)} U\,d\vec{S}$.  (13.34)
Here V is the volume of the region of space containing the point belonging to $\vec{r}$ inside and bounded by the closed surface $\Sigma$. (If the independent variable is not a three-dimensional vector, then the gradient is defined by the differential operator.)
13.2.2.3 Gradient and Directional Derivative
The directional derivative of the scalar field U with respect to the unit vector $\vec{c}^{\,0}$ is equal to the projection of grad U onto the direction of the unit vector $\vec{c}^{\,0}$:
$\frac{\partial U}{\partial\vec{c}^{\,0}} = \vec{c}^{\,0}\cdot\operatorname{grad}U$,  (13.35)
i.e., the directional derivative can be calculated as the dot product of the gradient and the unit vector pointing in the required direction.
Remark: The directional derivative at certain points in certain directions may exist even if the gradient does not exist there.
13.2.2.4 Further Properties of the Gradient
1. The absolute value of the gradient is greater where the level lines or level surfaces, drawn as described in 13.1.2.4, 2., p. 642, are denser.
2. The gradient is the zero vector (grad U = $\vec{0}$) if U has a maximum or minimum at the considered point. The level lines or surfaces degenerate to a point there.
13.2.2.5 Gradient of the Scalar Field in Different Coordinates
1. Gradient in Cartesian Coordinates
$\operatorname{grad}U = \frac{\partial U(x,y,z)}{\partial x}\,\vec{i} + \frac{\partial U(x,y,z)}{\partial y}\,\vec{j} + \frac{\partial U(x,y,z)}{\partial z}\,\vec{k}$.  (13.36)
2. Gradient in Cylindrical Coordinates ($x = \rho\cos\varphi,\ y = \rho\sin\varphi,\ z = z$)
$\operatorname{grad}U = \operatorname{grad}_\rho U\,\vec{e}_\rho + \operatorname{grad}_\varphi U\,\vec{e}_\varphi + \operatorname{grad}_z U\,\vec{e}_z$  (13.37a)
with
$\operatorname{grad}_\rho U = \frac{\partial U}{\partial\rho},\quad \operatorname{grad}_\varphi U = \frac{1}{\rho}\frac{\partial U}{\partial\varphi},\quad \operatorname{grad}_z U = \frac{\partial U}{\partial z}$.  (13.37b)
3. Gradient in Spherical Coordinates ($x = r\sin\vartheta\cos\varphi,\ y = r\sin\vartheta\sin\varphi,\ z = r\cos\vartheta$)
$\operatorname{grad}U = \operatorname{grad}_r U\,\vec{e}_r + \operatorname{grad}_\vartheta U\,\vec{e}_\vartheta + \operatorname{grad}_\varphi U\,\vec{e}_\varphi$  (13.38a)
with
$\operatorname{grad}_r U = \frac{\partial U}{\partial r},\quad \operatorname{grad}_\vartheta U = \frac{1}{r}\frac{\partial U}{\partial\vartheta},\quad \operatorname{grad}_\varphi U = \frac{1}{r\sin\vartheta}\frac{\partial U}{\partial\varphi}$.  (13.38b)
4. Gradient in General Orthogonal Coordinates ($\xi,\eta,\zeta$)
If $\vec{r}(\xi,\eta,\zeta) = x(\xi,\eta,\zeta)\vec{i} + y(\xi,\eta,\zeta)\vec{j} + z(\xi,\eta,\zeta)\vec{k}$, then we get
$\operatorname{grad}U = \operatorname{grad}_\xi U\,\vec{e}_\xi + \operatorname{grad}_\eta U\,\vec{e}_\eta + \operatorname{grad}_\zeta U\,\vec{e}_\zeta$,  (13.39a)
where
$\operatorname{grad}_\xi U = \frac{1}{\left|\partial\vec{r}/\partial\xi\right|}\frac{\partial U}{\partial\xi},\quad \operatorname{grad}_\eta U = \frac{1}{\left|\partial\vec{r}/\partial\eta\right|}\frac{\partial U}{\partial\eta},\quad \operatorname{grad}_\zeta U = \frac{1}{\left|\partial\vec{r}/\partial\zeta\right|}\frac{\partial U}{\partial\zeta}$.  (13.39b)
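A minimal SymPy sketch (not part of the handbook; the helper name and the sample function are ours) evaluates the Cartesian gradient (13.36) and, for the same rotationally symmetric function, the spherical components (13.38b).

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)

def grad(U):
    """Cartesian gradient (13.36)."""
    return [sp.diff(U, c) for c in (x, y, z)]

U = x**2 + y**2 + z**2
print(grad(U))     # [2*x, 2*y, 2*z], i.e. grad(r^2) = 2*r_vec

# Spherical form (13.38b) of the same field U = r^2: only the radial part survives
r, th, ph = sp.symbols('r theta phi', positive=True)
Ur = r**2
print(sp.diff(Ur, r), sp.diff(Ur, th)/r, sp.diff(Ur, ph)/(r*sp.sin(th)))  # 2*r 0 0
```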
13.2.2.6 Rules of Calculation
In the following we assume that $\vec{c}$ and c are constant.
$\operatorname{grad}c = \vec{0}$.  (13.40)
$\operatorname{grad}(U_1+U_2) = \operatorname{grad}U_1 + \operatorname{grad}U_2,\quad \operatorname{grad}(cU) = c\operatorname{grad}U$.  (13.41)
$\operatorname{grad}(U_1U_2) = U_1\operatorname{grad}U_2 + U_2\operatorname{grad}U_1,\quad \operatorname{grad}\varphi(U) = \frac{d\varphi}{dU}\operatorname{grad}U$.  (13.42)
$\operatorname{grad}(\vec{V}_1\cdot\vec{V}_2) = (\vec{V}_1\cdot\operatorname{grad})\vec{V}_2 + (\vec{V}_2\cdot\operatorname{grad})\vec{V}_1 + \vec{V}_1\times\operatorname{rot}\vec{V}_2 + \vec{V}_2\times\operatorname{rot}\vec{V}_1,\quad \operatorname{grad}(\vec{r}\cdot\vec{c}) = \vec{c}$.  (13.43)
1. Differential of a Scalar Field as the Total Differential of the Function U
$dU = \operatorname{grad}U\cdot d\vec{r} = \frac{\partial U}{\partial x}dx + \frac{\partial U}{\partial y}dy + \frac{\partial U}{\partial z}dz$.  (13.44)
13.2.3 Vector Gradient
The directional derivative of a vector field can be written with the vector gradient $\operatorname{grad}\vec{V}$, a tensor of second order:
$(\vec{a}\cdot\operatorname{grad})\vec{V} = (\operatorname{grad}\vec{V})\,\vec{a}$,  (13.47b)
$\operatorname{grad}\vec{V} = \begin{pmatrix}\partial V_x/\partial x & \partial V_x/\partial y & \partial V_x/\partial z\\ \partial V_y/\partial x & \partial V_y/\partial y & \partial V_y/\partial z\\ \partial V_z/\partial x & \partial V_z/\partial y & \partial V_z/\partial z\end{pmatrix}$.  (13.47c)
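In Cartesian coordinates the vector gradient (13.47c) is simply the Jacobian matrix of the component functions. A short sketch (ours, not the handbook's; the sample field and direction are illustrative), assuming SymPy:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
V = sp.Matrix([x*y, y*z, z*x])

# The vector gradient (13.47c) as the Jacobian matrix of the components
gradV = V.jacobian([x, y, z])
print(gradV)

# (13.47b): (a . grad) V = (grad V) a for a fixed direction a
a = sp.Matrix([1, 2, 3])
print(gradV * a)
```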
13.2.4 Divergence of Vector Fields
13.2.4.1 Definition of Divergence
To a vector field $\vec{V}(\vec{r})$ we can assign a scalar field, called its divergence. The divergence is defined as a space derivative of the vector field at a point $\vec{r}$:
$\operatorname{div}\vec{V} = \lim_{V\to 0}\frac{1}{V}\oint_{(\Sigma)}\vec{V}\cdot d\vec{S}$.  (13.48)
If the vector field $\vec{V}$ is considered as a stream field, then the divergence can be interpreted as the source density, because it gives the amount of fluid per unit volume and unit time flowing out at the considered point of the vector field $\vec{V}$. In the case $\operatorname{div}\vec{V} > 0$ the point is called a source, in the case $\operatorname{div}\vec{V} < 0$ it is called a sink.
13.2.4.2 Divergence in Different Coordinates
1. Divergence in Cartesian Coordinates
$\operatorname{div}\vec{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}$  (13.49a)
with $\vec{V}(x,y,z) = V_x\vec{i} + V_y\vec{j} + V_z\vec{k}$.  (13.49b)
The scalar field div $\vec{V}$ can be represented as the dot product of the nabla operator and the vector $\vec{V}$,
$\operatorname{div}\vec{V} = \nabla\cdot\vec{V}$,  (13.49c)
and it is translation and rotation invariant, i.e., a scalar invariant (see 4.3.3.2, p. 265).
2. Divergence in Cylindrical Coordinates
$\operatorname{div}\vec{V} = \frac{1}{\rho}\frac{\partial(\rho V_\rho)}{\partial\rho} + \frac{1}{\rho}\frac{\partial V_\varphi}{\partial\varphi} + \frac{\partial V_z}{\partial z}$  (13.50a)
with $\vec{V}(\rho,\varphi,z) = V_\rho\vec{e}_\rho + V_\varphi\vec{e}_\varphi + V_z\vec{e}_z$.  (13.50b)
3. Divergence in Spherical Coordinates
$\operatorname{div}\vec{V} = \frac{1}{r^2}\frac{\partial(r^2V_r)}{\partial r} + \frac{1}{r\sin\vartheta}\frac{\partial(V_\vartheta\sin\vartheta)}{\partial\vartheta} + \frac{1}{r\sin\vartheta}\frac{\partial V_\varphi}{\partial\varphi}$  (13.51a)
with $\vec{V}(r,\vartheta,\varphi) = V_r\vec{e}_r + V_\vartheta\vec{e}_\vartheta + V_\varphi\vec{e}_\varphi$.  (13.51b)
4. Divergence in General Orthogonal Coordinates
$\operatorname{div}\vec{V} = \frac{1}{D}\left[\frac{\partial}{\partial\xi}\!\left(\frac{D\,V_\xi}{\left|\partial\vec{r}/\partial\xi\right|}\right) + \frac{\partial}{\partial\eta}\!\left(\frac{D\,V_\eta}{\left|\partial\vec{r}/\partial\eta\right|}\right) + \frac{\partial}{\partial\zeta}\!\left(\frac{D\,V_\zeta}{\left|\partial\vec{r}/\partial\zeta\right|}\right)\right]$  (13.52a)
with $\vec{V}(\xi,\eta,\zeta) = V_\xi\vec{e}_\xi + V_\eta\vec{e}_\eta + V_\zeta\vec{e}_\zeta$  (13.52b)
and $D = \left|\frac{\partial\vec{r}}{\partial\xi}\right|\left|\frac{\partial\vec{r}}{\partial\eta}\right|\left|\frac{\partial\vec{r}}{\partial\zeta}\right|$.  (13.52c)
13.2.4.3 Rules for Evaluation of the Divergence
$\operatorname{div}\vec{c} = 0,\quad \operatorname{div}(\vec{V}_1+\vec{V}_2) = \operatorname{div}\vec{V}_1 + \operatorname{div}\vec{V}_2,\quad \operatorname{div}(c\vec{V}) = c\operatorname{div}\vec{V}$.  (13.53)
$\operatorname{div}(U\vec{V}) = U\operatorname{div}\vec{V} + \vec{V}\cdot\operatorname{grad}U$.  (13.54)
$\operatorname{div}(\vec{V}_1\times\vec{V}_2) = \vec{V}_2\cdot\operatorname{rot}\vec{V}_1 - \vec{V}_1\cdot\operatorname{rot}\vec{V}_2$.  (13.55)
13.2.4.4 Divergence of a Central Field
$\operatorname{div}\vec{r} = 3,\qquad \operatorname{div}\big(\varphi(r)\,\vec{r}\big) = 3\varphi(r) + r\,\varphi'(r)$.  (13.56)
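A short SymPy sketch (ours, not the handbook's; the helper name is illustrative) checks (13.49a) and the central-field results (13.56): div r = 3, and for the Coulomb field (13.14) with $\varphi(r)=1/r^3$ the divergence vanishes away from the origin.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
r = sp.sqrt(x**2 + y**2 + z**2)

def div(V):
    """Cartesian divergence (13.49a)."""
    return sum(sp.diff(Vc, c) for Vc, c in zip(V, (x, y, z)))

print(div((x, y, z)))                              # 3, i.e. div r = 3
# phi(r) = 1/r^3 gives 3*phi + r*phi' = 3/r^3 - 3/r^3 = 0 (Coulomb field, 13.14)
print(sp.simplify(div((x/r**3, y/r**3, z/r**3))))  # 0
```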
13.2.5 Rotation of Vector Fields
13.2.5.1 Definitions of the Rotation
1. Definition
The rotation or curl of a vector field $\vec{V}$ at the point $\vec{r}$ is a vector denoted by rot $\vec{V}$, curl $\vec{V}$, or with the nabla operator $\nabla\times\vec{V}$, and defined as the negative space derivative of the vector field:
$\operatorname{rot}\vec{V} = -\lim_{V\to 0}\frac{1}{V}\oint_{(\Sigma)}\vec{V}\times d\vec{S}$.  (13.57)
2. Definition
The vector field of the rotation of the vector field $\vec{V}(\vec{r})$ can also be defined in the following way:
a) We put a small surface sheet S (Fig. 13.12) through the point $\vec{r}$. We describe this surface sheet by a vector $\vec{S}$ whose direction is the direction of the surface normal $\vec{n}$ and whose absolute value is equal to the area of this surface patch. The boundary of this surface is denoted by C.
b) We evaluate the integral $\oint_{(C)}\vec{V}\cdot d\vec{r}$ along the closed boundary curve C of the surface (the sense of the curve is positive when looking at the surface from the direction of the surface normal, see Fig. 13.12).
c) We find the limit (if it exists) $\lim_{S\to 0}\frac{1}{S}\oint_{(C)}\vec{V}\cdot d\vec{r}$, while the position of the surface sheet remains unchanged.
d) We change the position of the surface sheet in order to get a maximum value of the limit. The surface area in this position is $S_{\max}$, and the corresponding boundary curve is $C_{\max}$.
e) We determine the vector rot $\vec{V}$ at the point $\vec{r}$, whose absolute value is equal to the maximum value found above and whose direction coincides with the direction of the surface normal of the corresponding surface. We then get:
$|\operatorname{rot}\vec{V}| = \lim_{S_{\max}\to 0}\frac{1}{S_{\max}}\oint_{(C_{\max})}\vec{V}\cdot d\vec{r}$.  (13.58a)

Figure 13.12

The projection of rot $\vec{V}$ onto the surface normal $\vec{n}$ of a surface with area S, i.e., the component of the vector rot $\vec{V}$ in an arbitrary direction $\vec{n}$, is
$\operatorname{rot}_n\vec{V} = \vec{n}\cdot\operatorname{rot}\vec{V} = \lim_{S\to 0}\frac{1}{S}\oint_{(C)}\vec{V}\cdot d\vec{r}$.  (13.58b)
The vector lines of the field rot $\vec{V}$ are called the curl lines of the vector field $\vec{V}$.
13.2.5.2 Rotation in Different Coordinates
1. Rotation in Cartesian Coordinates
$\operatorname{rot}\vec{V} = \left(\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right)\vec{i} + \left(\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right)\vec{j} + \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right)\vec{k} = \begin{vmatrix}\vec{i} & \vec{j} & \vec{k}\\ \partial/\partial x & \partial/\partial y & \partial/\partial z\\ V_x & V_y & V_z\end{vmatrix}$.  (13.59a)
The vector field rot $\vec{V}$ can be represented as the cross product of the nabla operator and the vector $\vec{V}$:
$\operatorname{rot}\vec{V} = \nabla\times\vec{V}$.  (13.59b)
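A small SymPy sketch (ours, not the handbook's; helper name and sample fields are illustrative) implements the Cartesian rotation (13.59a) and reproduces the vector-line example of 13.1.3.5: the central field r is irrotational, while the field c x r has constant curl.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)

def rot(V):
    """Cartesian rotation (13.59a)."""
    Vx, Vy, Vz = V
    return (sp.diff(Vz, y) - sp.diff(Vy, z),
            sp.diff(Vx, z) - sp.diff(Vz, x),
            sp.diff(Vy, x) - sp.diff(Vx, y))

print(rot((x, y, z)))     # (0, 0, 0): the central field r is irrotational
print(rot((-y, x, 0)))    # (0, 0, 2): the field c x r with c = e_z
```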
2. Rotation in Cylindrical Coordinates
$\operatorname{rot}\vec{V} = \operatorname{rot}_\rho\vec{V}\,\vec{e}_\rho + \operatorname{rot}_\varphi\vec{V}\,\vec{e}_\varphi + \operatorname{rot}_z\vec{V}\,\vec{e}_z$  (13.60a)
with
$\operatorname{rot}_\rho\vec{V} = \frac{1}{\rho}\frac{\partial V_z}{\partial\varphi} - \frac{\partial V_\varphi}{\partial z},\quad \operatorname{rot}_\varphi\vec{V} = \frac{\partial V_\rho}{\partial z} - \frac{\partial V_z}{\partial\rho},\quad \operatorname{rot}_z\vec{V} = \frac{1}{\rho}\frac{\partial(\rho V_\varphi)}{\partial\rho} - \frac{1}{\rho}\frac{\partial V_\rho}{\partial\varphi}$.  (13.60b)
3. Rotation in Spherical Coordinates
$\operatorname{rot}\vec{V} = \operatorname{rot}_r\vec{V}\,\vec{e}_r + \operatorname{rot}_\vartheta\vec{V}\,\vec{e}_\vartheta + \operatorname{rot}_\varphi\vec{V}\,\vec{e}_\varphi$  (13.61a)
with
$\operatorname{rot}_r\vec{V} = \frac{1}{r\sin\vartheta}\frac{\partial(V_\varphi\sin\vartheta)}{\partial\vartheta} - \frac{1}{r\sin\vartheta}\frac{\partial V_\vartheta}{\partial\varphi},\quad \operatorname{rot}_\vartheta\vec{V} = \frac{1}{r\sin\vartheta}\frac{\partial V_r}{\partial\varphi} - \frac{1}{r}\frac{\partial(rV_\varphi)}{\partial r},\quad \operatorname{rot}_\varphi\vec{V} = \frac{1}{r}\frac{\partial(rV_\vartheta)}{\partial r} - \frac{1}{r}\frac{\partial V_r}{\partial\vartheta}$.  (13.61b)
4. Rotation in General Orthogonal Coordinates
$\operatorname{rot}\vec{V} = \operatorname{rot}_\xi\vec{V}\,\vec{e}_\xi + \operatorname{rot}_\eta\vec{V}\,\vec{e}_\eta + \operatorname{rot}_\zeta\vec{V}\,\vec{e}_\zeta$  (13.62a)
with
$\operatorname{rot}_\xi\vec{V} = \frac{\left|\partial\vec{r}/\partial\xi\right|}{D}\left[\frac{\partial}{\partial\eta}\!\left(\left|\frac{\partial\vec{r}}{\partial\zeta}\right|V_\zeta\right) - \frac{\partial}{\partial\zeta}\!\left(\left|\frac{\partial\vec{r}}{\partial\eta}\right|V_\eta\right)\right]$  (13.62b)
and the components $\operatorname{rot}_\eta\vec{V}$, $\operatorname{rot}_\zeta\vec{V}$ obtained by cyclic permutation of $\xi,\eta,\zeta$; D is defined in (13.52c).  (13.62c)
13.2.5.3 Rules for Evaluating the Rotation
$\operatorname{rot}(\vec{V}_1+\vec{V}_2) = \operatorname{rot}\vec{V}_1 + \operatorname{rot}\vec{V}_2,\quad \operatorname{rot}(c\vec{V}) = c\operatorname{rot}\vec{V}$.  (13.63)
$\operatorname{rot}(U\vec{V}) = U\operatorname{rot}\vec{V} + \operatorname{grad}U\times\vec{V}$.  (13.64)
$\operatorname{rot}(\vec{V}_1\times\vec{V}_2) = (\vec{V}_2\cdot\operatorname{grad})\vec{V}_1 - (\vec{V}_1\cdot\operatorname{grad})\vec{V}_2 + \vec{V}_1\operatorname{div}\vec{V}_2 - \vec{V}_2\operatorname{div}\vec{V}_1$.  (13.65)
13.2.5.4 Rotation of a Potential Field
It follows from the Stokes theorem (see 13.3.3.2, p. 664) that the rotation of a potential field is identically zero:
$\operatorname{rot}\vec{V} = \operatorname{rot}(\operatorname{grad}U) = \vec{0}$.  (13.66)
This also follows from (13.59a) for $\vec{V} = \operatorname{grad}U$ if the assumptions of the Schwarz interchange theorem are fulfilled (see 6.2.2.2, 1., p. 393).
Example: For $\vec{r} = x\vec{i} + y\vec{j} + z\vec{k}$ with $r = |\vec{r}| = \sqrt{x^2+y^2+z^2}$ we have $\operatorname{rot}\vec{r} = \vec{0}$ and $\operatorname{rot}\big(\varphi(r)\vec{r}\big) = \vec{0}$, where $\varphi(r)$ is a differentiable function of r.
13.2.6 Nabla Operator, Laplace Operator
13.2.6.1 Nabla Operator
The symbolic vector $\nabla$ is called the nabla operator. Its use simplifies the representation of and calculations with spatial differential operators. In Cartesian coordinates we have:
$\nabla = \frac{\partial}{\partial x}\,\vec{i} + \frac{\partial}{\partial y}\,\vec{j} + \frac{\partial}{\partial z}\,\vec{k}$.  (13.67)
The components of the nabla operator are considered as partial differential operators, i.e., the symbol $\frac{\partial}{\partial x}$ means partial differentiation with respect to x, where the other variables are considered as constants. The formulas for spatial differential operators in Cartesian coordinates can be obtained by formal multiplication of this vector operator by the scalar U or by the vector $\vec{V}$. For instance, in the case of the operators gradient, vector gradient, divergence, and rotation:
$\operatorname{grad}U = \nabla U$  (gradient of U, see 13.2.2, p. 648),  (13.68a)
$\operatorname{grad}\vec{V} = \nabla\vec{V}$  (vector gradient of $\vec{V}$, see 13.2.3, p. 650),  (13.68b)
$\operatorname{div}\vec{V} = \nabla\cdot\vec{V}$  (divergence of $\vec{V}$, see 13.2.4, p. 651),  (13.68c)
$\operatorname{rot}\vec{V} = \nabla\times\vec{V}$  (rotation or curl of $\vec{V}$, see 13.2.5, p. 652).  (13.68d)
13.2.6.2 Rules for Calculations with the Nabla Operator
1. If $\nabla$ stands in front of a linear combination $\sum a_i X_i$ with constants $a_i$ and point functions $X_i$, then, independently of whether they are scalar or vector functions, we have:
$\nabla\left(\sum a_i X_i\right) = \sum a_i\,\nabla X_i$.  (13.69)
2. If $\nabla$ is applied to a product of scalar or vector functions, then we apply it to each of these functions one after the other and add the results. The function currently subject to the operation is marked by an arrow (written here as $\downarrow$):
$\nabla(XYZ) = \nabla(\overset{\downarrow}{X}YZ) + \nabla(X\overset{\downarrow}{Y}Z) + \nabla(XY\overset{\downarrow}{Z})$,  (13.70)
i.e., $\nabla(XYZ) = (\nabla X)YZ + X(\nabla Y)Z + XY(\nabla Z)$.
We transform the products according to vector algebra so that the operator $\nabla$ is applied only to the factor marked by $\downarrow$. Having performed the computation we omit that sign.
A: $\operatorname{div}(U\vec{V}) = \nabla\cdot(U\vec{V}) = \nabla\cdot(\overset{\downarrow}{U}\vec{V}) + \nabla\cdot(U\overset{\downarrow}{\vec{V}}) = \vec{V}\cdot\nabla U + U\,\nabla\cdot\vec{V} = \vec{V}\cdot\operatorname{grad}U + U\operatorname{div}\vec{V}$.
B: $\operatorname{grad}(\vec{V}_1\cdot\vec{V}_2) = \nabla(\vec{V}_1\cdot\vec{V}_2) = \nabla(\overset{\downarrow}{\vec{V}}_1\cdot\vec{V}_2) + \nabla(\vec{V}_1\cdot\overset{\downarrow}{\vec{V}}_2)$. Because $\vec{b}(\vec{a}\cdot\vec{c}) = (\vec{a}\cdot\vec{b})\vec{c} + \vec{a}\times(\vec{b}\times\vec{c})$ we get:
$\operatorname{grad}(\vec{V}_1\cdot\vec{V}_2) = (\vec{V}_2\cdot\operatorname{grad})\vec{V}_1 + (\vec{V}_1\cdot\operatorname{grad})\vec{V}_2 + \vec{V}_1\times\operatorname{rot}\vec{V}_2 + \vec{V}_2\times\operatorname{rot}\vec{V}_1$.
13.2.6.3 Vector Gradient
The vector gradient grad $\vec{V}$ is represented by the nabla operator as
$\operatorname{grad}\vec{V} = \nabla\vec{V}$.  (13.71a)
For the expression $(\vec{a}\cdot\nabla)\vec{V}$ occurring in the directional derivative we get
$2(\vec{a}\cdot\nabla)\vec{V} = \operatorname{rot}(\vec{V}\times\vec{a}) + \operatorname{grad}(\vec{a}\cdot\vec{V}) + \vec{a}\operatorname{div}\vec{V} - \vec{V}\operatorname{div}\vec{a} - \vec{a}\times\operatorname{rot}\vec{V} - \vec{V}\times\operatorname{rot}\vec{a}$.  (13.71b)
In particular, for $\vec{r} = x\vec{i} + y\vec{j} + z\vec{k}$ we get
$(\vec{a}\cdot\nabla)\vec{r} = \vec{a}$.  (13.71c)
13.2.6.4 Nabla Operator Applied Twice
For every field $\vec{V}$ and U:
$\nabla\cdot(\nabla\times\vec{V}) = \operatorname{div}\operatorname{rot}\vec{V} = 0$,  (13.72a)
$\nabla\times(\nabla U) = \operatorname{rot}\operatorname{grad}U = \vec{0}$,  (13.72b)
$\nabla\cdot(\nabla U) = \operatorname{div}\operatorname{grad}U = \Delta U$.  (13.72c)
13.2.6.5 Laplace Operator
1. Definition
The dot product of the nabla operator with itself is called the Laplace operator:
$\Delta = \nabla\cdot\nabla = \nabla^2$.  (13.73)
The Laplace operator is not a vector; it prescribes the summation of the second partial derivatives. It can be applied to scalar functions as well as to vector functions. Applied componentwise to a vector function it results in a vector. The Laplace operator is an invariant, i.e., it does not change under translation and/or rotation of the coordinate system.
2. Formulas for the Laplace Operator in Different Coordinates
In the following we apply the Laplace operator to the scalar point function $U(\vec{r})$; the result is a scalar. Its application to a vector function $\vec{V}(\vec{r})$ results in a vector $\Delta\vec{V}$ with components $\Delta V_x$, $\Delta V_y$, $\Delta V_z$.
1. Laplace Operator in Cartesian Coordinates
$\Delta U(x,y,z) = \frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} + \frac{\partial^2 U}{\partial z^2}$.  (13.74)
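A short SymPy sketch (ours, not the handbook's) applies the Cartesian Laplacian (13.74); the classical check is that the Newton/Coulomb potential 1/r is harmonic away from the origin.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
r = sp.sqrt(x**2 + y**2 + z**2)

def laplace(U):
    """Cartesian Laplace operator (13.74)."""
    return sum(sp.diff(U, c, 2) for c in (x, y, z))

print(sp.simplify(laplace(1/r)))     # 0: the potential 1/r is harmonic for r != 0
print(laplace(x**2 + y**2 + z**2))   # 6
```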
2. Laplace Operator in Cylindrical Coordinates
$\Delta U(\rho,\varphi,z) = \frac{1}{\rho}\frac{\partial}{\partial\rho}\!\left(\rho\frac{\partial U}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2 U}{\partial\varphi^2} + \frac{\partial^2 U}{\partial z^2}$.  (13.75)
3. Laplace Operator in Spherical Coordinates
$\Delta U(r,\vartheta,\varphi) = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial U}{\partial r}\right) + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\!\left(\sin\vartheta\frac{\partial U}{\partial\vartheta}\right) + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2 U}{\partial\varphi^2}$.  (13.76)
4. Laplace Operator in General Orthogonal Coordinates
$\Delta U(\xi,\eta,\zeta) = \frac{1}{D}\left[\frac{\partial}{\partial\xi}\!\left(\frac{D}{\left|\partial\vec{r}/\partial\xi\right|^2}\frac{\partial U}{\partial\xi}\right) + \frac{\partial}{\partial\eta}\!\left(\frac{D}{\left|\partial\vec{r}/\partial\eta\right|^2}\frac{\partial U}{\partial\eta}\right) + \frac{\partial}{\partial\zeta}\!\left(\frac{D}{\left|\partial\vec{r}/\partial\zeta\right|^2}\frac{\partial U}{\partial\zeta}\right)\right]$,  (13.77)
with D as in (13.52c).
3. Special Relations between the Nabla Operator and the Laplace Operator
$\nabla(\nabla\cdot\vec{V}) = \operatorname{grad}\operatorname{div}\vec{V}$,  (13.78)
$\nabla\times(\nabla\times\vec{V}) = \operatorname{rot}\operatorname{rot}\vec{V}$,  (13.79)
$\nabla(\nabla\cdot\vec{V}) - \nabla\times(\nabla\times\vec{V}) = \Delta\vec{V}$,  (13.80)
where, in Cartesian coordinates,
$\Delta\vec{V} = \Delta V_x\,\vec{i} + \Delta V_y\,\vec{j} + \Delta V_z\,\vec{k}$.  (13.81)
13.2.7 Review of Spatial Differential Operations
13.2.7.1 Fundamental Relations and Results (see Table 13.2)

Table 13.2 Fundamental relations for spatial differential operators

| Operator | Symbol | Relation | Result | Meaning |
| --- | --- | --- | --- | --- |
| Gradient | grad U | $\nabla U$ | vector | maximal increase |
| Vector gradient | grad $\vec{V}$ | $\nabla\vec{V}$ | tensor of second order |  |
| Divergence | div $\vec{V}$ | $\nabla\cdot\vec{V}$ | scalar | source, sink |
| Rotation | rot $\vec{V}$ | $\nabla\times\vec{V}$ | vector | curl |
| Laplace operator | $\Delta U$ | $(\nabla\cdot\nabla)U$ | scalar | source of a potential field |
| Laplace operator | $\Delta\vec{V}$ | $(\nabla\cdot\nabla)\vec{V}$ | vector |  |
13.2.7.2 Rules of Calculation for Spatial Differential Operators
U, $U_1$, $U_2$ scalar functions; c constant; $\vec{V}$, $\vec{V}_1$, $\vec{V}_2$ vector functions:
$\operatorname{grad}(U_1+U_2) = \operatorname{grad}U_1 + \operatorname{grad}U_2$.  (13.82)
$\operatorname{grad}(cU) = c\operatorname{grad}U$.  (13.83)
$\operatorname{grad}(U_1U_2) = U_1\operatorname{grad}U_2 + U_2\operatorname{grad}U_1$.  (13.84)
$\operatorname{grad}F(U) = F'(U)\operatorname{grad}U$.  (13.85)
$\operatorname{div}(\vec{V}_1+\vec{V}_2) = \operatorname{div}\vec{V}_1 + \operatorname{div}\vec{V}_2$.  (13.86)
$\operatorname{div}(c\vec{V}) = c\operatorname{div}\vec{V}$.  (13.87)
$\operatorname{div}(U\vec{V}) = \vec{V}\cdot\operatorname{grad}U + U\operatorname{div}\vec{V}$.  (13.88)
$\operatorname{rot}(\vec{V}_1+\vec{V}_2) = \operatorname{rot}\vec{V}_1 + \operatorname{rot}\vec{V}_2$.  (13.89)
$\operatorname{rot}(c\vec{V}) = c\operatorname{rot}\vec{V}$.  (13.90)
$\operatorname{rot}(U\vec{V}) = U\operatorname{rot}\vec{V} - \vec{V}\times\operatorname{grad}U$.  (13.91)
$\operatorname{div}\operatorname{rot}\vec{V} \equiv 0$.  (13.92)
$\operatorname{rot}\operatorname{grad}U \equiv \vec{0}$  (zero vector).  (13.93)
$\operatorname{div}\operatorname{grad}U = \Delta U$.  (13.94)
$\operatorname{rot}\operatorname{rot}\vec{V} = \operatorname{grad}\operatorname{div}\vec{V} - \Delta\vec{V}$.  (13.95)
$\operatorname{div}(\vec{V}_1\times\vec{V}_2) = \vec{V}_2\cdot\operatorname{rot}\vec{V}_1 - \vec{V}_1\cdot\operatorname{rot}\vec{V}_2$.  (13.96)
13.2.7.3 Expressions of Vector Analysis in Cartesian, Cylindrical, and Spherical Coordinates (see Table 13.3)

Table 13.3 Expressions of vector analysis in Cartesian, cylindrical, and spherical coordinates

Line element $d\vec{r}$:
Cartesian: $d\vec{r} = \vec{e}_x\,dx + \vec{e}_y\,dy + \vec{e}_z\,dz$
Cylindrical: $d\vec{r} = \vec{e}_\rho\,d\rho + \vec{e}_\varphi\,\rho\,d\varphi + \vec{e}_z\,dz$
Spherical: $d\vec{r} = \vec{e}_r\,dr + \vec{e}_\vartheta\,r\,d\vartheta + \vec{e}_\varphi\,r\sin\vartheta\,d\varphi$

grad U:
Cartesian: $\vec{e}_x\frac{\partial U}{\partial x} + \vec{e}_y\frac{\partial U}{\partial y} + \vec{e}_z\frac{\partial U}{\partial z}$
Cylindrical: $\vec{e}_\rho\frac{\partial U}{\partial\rho} + \vec{e}_\varphi\frac{1}{\rho}\frac{\partial U}{\partial\varphi} + \vec{e}_z\frac{\partial U}{\partial z}$
Spherical: $\vec{e}_r\frac{\partial U}{\partial r} + \vec{e}_\vartheta\frac{1}{r}\frac{\partial U}{\partial\vartheta} + \vec{e}_\varphi\frac{1}{r\sin\vartheta}\frac{\partial U}{\partial\varphi}$

div $\vec{V}$:
Cartesian: $\frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}$
Cylindrical: $\frac{1}{\rho}\frac{\partial(\rho V_\rho)}{\partial\rho} + \frac{1}{\rho}\frac{\partial V_\varphi}{\partial\varphi} + \frac{\partial V_z}{\partial z}$
Spherical: $\frac{1}{r^2}\frac{\partial(r^2V_r)}{\partial r} + \frac{1}{r\sin\vartheta}\frac{\partial(V_\vartheta\sin\vartheta)}{\partial\vartheta} + \frac{1}{r\sin\vartheta}\frac{\partial V_\varphi}{\partial\varphi}$

rot $\vec{V}$: Cartesian form (13.59a), cylindrical form (13.60b), spherical form (13.61b).
$\Delta U$: Cartesian form (13.74), cylindrical form (13.75), spherical form (13.76).
13.3 Integration in Vector Fields
Integration in vector fields is usually performed in Cartesian, cylindrical, or spherical coordinate systems. Usually we integrate along curves, over surfaces, or over volumes. The line, surface, and volume elements needed for these calculations are collected in Table 13.4.

Table 13.4 Line, surface, and volume elements in Cartesian, cylindrical, and spherical coordinates

Line element:
Cartesian: $d\vec{r} = \vec{e}_x\,dx + \vec{e}_y\,dy + \vec{e}_z\,dz$
Cylindrical: $d\vec{r} = \vec{e}_\rho\,d\rho + \vec{e}_\varphi\,\rho\,d\varphi + \vec{e}_z\,dz$
Spherical: $d\vec{r} = \vec{e}_r\,dr + \vec{e}_\vartheta\,r\,d\vartheta + \vec{e}_\varphi\,r\sin\vartheta\,d\varphi$

Vectorial surface element:
Cartesian: $d\vec{S} = \vec{e}_x\,dy\,dz + \vec{e}_y\,dz\,dx + \vec{e}_z\,dx\,dy$
Cylindrical: $d\vec{S} = \vec{e}_\rho\,\rho\,d\varphi\,dz + \vec{e}_\varphi\,d\rho\,dz + \vec{e}_z\,\rho\,d\rho\,d\varphi$
Spherical: $d\vec{S} = \vec{e}_r\,r^2\sin\vartheta\,d\vartheta\,d\varphi + \vec{e}_\vartheta\,r\sin\vartheta\,dr\,d\varphi + \vec{e}_\varphi\,r\,dr\,d\vartheta$

Volume element:
Cartesian: $dv = dx\,dy\,dz$; cylindrical: $dv = \rho\,d\rho\,d\varphi\,dz$; spherical: $dv = r^2\sin\vartheta\,dr\,d\vartheta\,d\varphi$.

Relations between the coordinate unit vectors: $\vec{e}_x = \vec{e}_y\times\vec{e}_z$, $\vec{e}_y = \vec{e}_z\times\vec{e}_x$, $\vec{e}_z = \vec{e}_x\times\vec{e}_y$; $\vec{e}_\rho = \vec{e}_\varphi\times\vec{e}_z$, $\vec{e}_\varphi = \vec{e}_z\times\vec{e}_\rho$, $\vec{e}_z = \vec{e}_\rho\times\vec{e}_\varphi$; $\vec{e}_r = \vec{e}_\vartheta\times\vec{e}_\varphi$, $\vec{e}_\vartheta = \vec{e}_\varphi\times\vec{e}_r$, $\vec{e}_\varphi = \vec{e}_r\times\vec{e}_\vartheta$. The indices take the place of x, y, z or $\rho,\varphi,z$ or $r,\vartheta,\varphi$. The volume is denoted here by v to avoid confusion with the absolute value of the vector function $|\vec{V}| = V$.
13.3.1 Line Integral and Potential in Vector Fields
13.3.1.1 Line Integral in Vector Fields
1. Definition
The scalar-valued curvilinear integral or line integral of a vector function $\vec{V}(\vec{r})$ along a rectifiable curve $\overparen{AB}$ (Fig. 13.13) is the scalar value
$P = \int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r}$.  (13.97a)
2. Evaluation of this Integral in Five Steps
a) We divide the path $\overparen{AB}$ (Fig. 13.13) by division points $A_1(\vec{r}_1), A_2(\vec{r}_2), \ldots, A_{n-1}(\vec{r}_{n-1})$ (A = $A_0$, B = $A_n$) into n small arcs, which are approximated by the vectors $\vec{r}_i - \vec{r}_{i-1} = \Delta\vec{r}_{i-1}$.
b) We choose arbitrarily the points $P_i$ with position vectors $\vec{r}_i^{\,*}$ lying inside or on the boundary of each small arc.
c) We calculate the dot product of the value of the function $\vec{V}(\vec{r}_i^{\,*})$ at these chosen points with the corresponding $\Delta\vec{r}_{i-1}$.
d) We add all the n products.
e) We calculate the limit of the sums $\sum_{i=1}^{n}\vec{V}(\vec{r}_i^{\,*})\cdot\Delta\vec{r}_{i-1}$ obtained this way for $\Delta\vec{r}_{i-1}\to 0$, while obviously $n\to\infty$.
If this limit exists independently of the choice of the points $A_i$ and $P_i$, then it is called the line integral
$P = \int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r} = \lim_{n\to\infty}\sum_{i=1}^{n}\vec{V}(\vec{r}_i^{\,*})\cdot\Delta\vec{r}_{i-1}$.  (13.97b)
A sufficient condition for the existence of the line integral (13.97a,b) is that the vector function $\vec{V}(\vec{r})$ and the curve $\overparen{AB}$ are continuous and the curve has a continuously varying tangent. A vector function $\vec{V}(\vec{r})$ is continuous if its components, the three scalar functions, are continuous.

Figure 13.13    Figure 13.14

13.3.1.2 Interpretation of the Line Integral in Mechanics
If $\vec{V}(\vec{r})$ is a field of force, i.e., $\vec{V}(\vec{r}) = \vec{F}(\vec{r})$, then the line integral (13.97a) represents the work done by $\vec{F}$ while a particle m moves along the path $\overparen{AB}$ (Fig. 13.13, 13.14).
13.3.1.3 Properties of the Line Integral
$\int_{\overparen{ABC}}\vec{V}(\vec{r})\cdot d\vec{r} = \int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r} + \int_{\overparen{BC}}\vec{V}(\vec{r})\cdot d\vec{r}$  (Fig. 13.14).  (13.98)
$\int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r} = -\int_{\overparen{BA}}\vec{V}(\vec{r})\cdot d\vec{r}$.  (13.99)
$\int_{\overparen{AB}}\big[\vec{V}(\vec{r}) + \vec{W}(\vec{r})\big]\cdot d\vec{r} = \int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r} + \int_{\overparen{AB}}\vec{W}(\vec{r})\cdot d\vec{r}$.  (13.100)
$\int_{\overparen{AB}} c\,\vec{V}(\vec{r})\cdot d\vec{r} = c\int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r}$  (c const).  (13.101)
13.3.1.4 Line Integral in Cartesian Coordinates
In Cartesian coordinates we have
$\int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r} = \int_{\overparen{AB}}(V_x\,dx + V_y\,dy + V_z\,dz)$.  (13.102)
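Along a parametrized curve $\vec{r}(t)$ the integral (13.102) reduces to an ordinary integral over t. A small SymPy sketch (ours, not the handbook's; curve and field are examples) evaluates it for the rotation field along the unit circle.

```python
import sympy as sp

t = sp.symbols('t')
x, y, z = sp.cos(t), sp.sin(t), sp.Integer(0)     # unit circle in the x,y plane
Vx, Vy, Vz = -y, x, sp.Integer(0)                 # the field V = c x r with c = e_z

integrand = Vx*sp.diff(x, t) + Vy*sp.diff(y, t) + Vz*sp.diff(z, t)
P = sp.integrate(integrand, (t, 0, 2*sp.pi))
print(P)   # 2*pi: a nonzero contour integral, so this field is not conservative
```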
13.3.1.5 Integral Along a Closed Curve in a Vector Field
A line integral is called a contour integral if the path of integration is a closed curve. If the scalar value of the integral is denoted by P and the closed curve by C, then we write
$P = \oint_{(C)}\vec{V}(\vec{r})\cdot d\vec{r}$.  (13.103)
13.3.1.6 Conservative or Potential Field
1. Definition
If the value P of the line integral (13.97a) in a vector field depends only on the initial point A and the endpoint B, and is independent of the path between them, then we call this field a conservative field or a potential field. The value of the contour integral in a conservative field is always equal to zero:
$\oint\vec{V}(\vec{r})\cdot d\vec{r} = 0$.  (13.104)
A conservative field is always irrotational:
$\operatorname{rot}\vec{V} = \vec{0}$,  (13.105)
and conversely, this equality is a sufficient condition for a vector field to be conservative. Of course, we have to suppose that the partial derivatives of the field function are continuous with respect to the corresponding coordinates and that the domain of $\vec{V}$ is simply connected. This condition, also called the integrability condition (see 8.3.4.2, p. 467), has the form in Cartesian coordinates
$\frac{\partial V_x}{\partial y} = \frac{\partial V_y}{\partial x},\quad \frac{\partial V_y}{\partial z} = \frac{\partial V_z}{\partial y},\quad \frac{\partial V_z}{\partial x} = \frac{\partial V_x}{\partial z}$.  (13.106)
2. Potential of a Conservative Field
The potential of a conservative field, also called its potential function or briefly its potential, is the scalar function
$\varphi(\vec{r}) = \int_{\vec{r}_0}^{\vec{r}}\vec{V}(\vec{r})\cdot d\vec{r}$.  (13.107a)
We can calculate it in a conservative field with a fixed initial point $A(\vec{r}_0)$ and a variable endpoint $B(\vec{r})$ from the integral
$\varphi(\vec{r}) = \int_{\overparen{AB}}\vec{V}(\vec{r})\cdot d\vec{r}$.  (13.107b)
Remark: In physics, the potential $\varphi^*(\vec{r})$ of a field $\vec{V}(\vec{r})$ at the point $\vec{r}$ is considered with the opposite sign:
$\varphi^*(\vec{r}) = -\int_{\vec{r}_0}^{\vec{r}}\vec{V}(\vec{r})\cdot d\vec{r} = -\varphi(\vec{r})$.  (13.108)
3. Relations between Gradient, Line Integral, and Potential
If the relation $\vec{V}(\vec{r}) = \operatorname{grad}U(\vec{r})$ holds, then $U(\vec{r})$ is the potential of the field $\vec{V}(\vec{r})$, and conversely, $\vec{V}(\vec{r})$ is then a conservative or potential field. In physics the negative sign corresponding to (13.108) is used.
4. Calculation of the Potential in a Conservative Field
If the function $\vec{V}(\vec{r})$ is given in Cartesian coordinates, $\vec{V} = V_x\vec{i} + V_y\vec{j} + V_z\vec{k}$, then for the total differential of its potential function
$dU = V_x\,dx + V_y\,dy + V_z\,dz$.  (13.109a)
Here the coefficients $V_x, V_y, V_z$ must fulfill the integrability condition (13.106). The determination of U follows from the equation system
$\frac{\partial U}{\partial x} = V_x,\quad \frac{\partial U}{\partial y} = V_y,\quad \frac{\partial U}{\partial z} = V_z$.  (13.109b)
In practice, the calculation of the potential can be done by integrating along three straight line segments parallel to the coordinate axes and connected to each other (Fig. 13.15):
$U(x,y,z) = \int_{x_0}^{x}V_x(\xi,y_0,z_0)\,d\xi + \int_{y_0}^{y}V_y(x,\eta,z_0)\,d\eta + \int_{z_0}^{z}V_z(x,y,\zeta)\,d\zeta + U_0$.  (13.110)

Figure 13.15
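The recipe (13.110) is easy to carry out symbolically. The following sketch (ours, not the handbook's; the sample field is constructed so that U = x^2*y + x*z^2 is its potential) checks the integrability condition (13.106) and then recovers the potential with the initial point at the origin, assuming SymPy.

```python
import sympy as sp

x, y, z, xi, eta, zeta = sp.symbols('x y z xi eta zeta')

# Example of a conservative field: V = grad(x^2*y + x*z^2)
Vx, Vy, Vz = 2*x*y + z**2, x**2, 2*x*z

# Integrability condition (13.106)
assert sp.simplify(sp.diff(Vx, y) - sp.diff(Vy, x)) == 0
assert sp.simplify(sp.diff(Vy, z) - sp.diff(Vz, y)) == 0
assert sp.simplify(sp.diff(Vz, x) - sp.diff(Vx, z)) == 0

# Potential by (13.110) with (x0, y0, z0) = (0, 0, 0) and U0 = 0
U = (sp.integrate(Vx.subs({x: xi, y: 0, z: 0}), (xi, 0, x))
     + sp.integrate(Vy.subs({y: eta, z: 0}), (eta, 0, y))
     + sp.integrate(Vz.subs({z: zeta}), (zeta, 0, z)))
print(sp.simplify(U))   # x**2*y + x*z**2
```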
13.3.2 Surface Integrals
13.3.2.1 Vector of a Plane Sheet
The vector representation of the surface integral of general type (see 8.3.4.2, p. 483) requires assigning a vector $\vec{S}$ to a plane surface region S, which is perpendicular to this region and whose absolute value is equal to the area of S. Fig. 13.16a shows the case of a plane sheet. The positive direction in S is given by defining the positive sense along a closed curve C according to the right-hand law (also called the right screw rule): if we look from the initial point of the vector toward its final point, then the positive sense is the clockwise direction. By this choice of orientation of the boundary curve we fix the exterior side of this surface region, i.e., the side on which the vector lies. This definition works in the case of any surface region bounded by a closed curve (Fig. 13.16b,c).

Figure 13.16
13.3.2.2 Evaluation of the Surface Integral
The definition of a surface integral in scalar or vector fields is independent of whether the surface S is bounded by a closed curve or is itself a closed surface. The evaluation is performed in five steps:
a) We divide the surface region S, on the exterior side defined by the orientation of the boundary curve (Fig. 13.17), into n arbitrary elementary surfaces $\Delta S_i$ so that each of these surface elements can be approximated by a plane surface element. We assign the vector $\Delta\vec{S}_i$ to every surface element $\Delta S_i$ as given in (13.33a). In the case of a closed surface, the positive direction is defined so that $\Delta\vec{S}_i$ starts on the exterior side.
b) We choose an arbitrary point $P_i$ with position vector $\vec{r}_i$ inside or on the boundary of each surface element.
c) We form the products $U(\vec{r}_i)\,\Delta\vec{S}_i$ in the case of a scalar field, and the products $\vec{V}(\vec{r}_i)\cdot\Delta\vec{S}_i$ or $\vec{V}(\vec{r}_i)\times\Delta\vec{S}_i$ in the case of a vector field.
d) We add all these products.
e) We evaluate the limit as the diameters of the $\Delta S_i$ tend to zero, i.e., $\Delta\vec{S}_i\to 0$ for $n\to\infty$. The surface elements tend to zero in the sense given in 8.4.1, 1., p. 469, for double integrals.
If this limit exists independently of the partition and of the choice of the points $\vec{r}_i$, then we call it the surface integral of $\vec{V}$ on the given surface.

Figure 13.17    Figure 13.18
13.3.2.3 Surface Integrals and Flow of Fields
1. Vector Flow of a Scalar Field
$\vec{Q} = \lim_{\Delta S_i\to 0}\sum_{i=1}^{n} U(\vec{r}_i)\,\Delta\vec{S}_i = \int_{(S)} U\,d\vec{S}$.  (13.111)
2. Scalar Flow of a Vector Field
$Q = \lim_{\Delta S_i\to 0}\sum_{i=1}^{n}\vec{V}(\vec{r}_i)\cdot\Delta\vec{S}_i = \int_{(S)}\vec{V}\cdot d\vec{S}$.  (13.112)
3. Vector Flow of a Vector Field
$\vec{Q} = \lim_{\Delta S_i\to 0}\sum_{i=1}^{n}\vec{V}(\vec{r}_i)\times\Delta\vec{S}_i = \int_{(S)}\vec{V}\times d\vec{S}$.  (13.113)
13.3.2.4 Surface Integral in Cartesian Coordinates as Surface Integral of Second Type
$\int_{(S)} U\,d\vec{S} = \iint_{(S_{yz})} U\,dy\,dz\,\vec{i} + \iint_{(S_{zx})} U\,dz\,dx\,\vec{j} + \iint_{(S_{xy})} U\,dx\,dy\,\vec{k}$.  (13.114)
$\int_{(S)}\vec{V}\cdot d\vec{S} = \iint_{(S_{yz})} V_x\,dy\,dz + \iint_{(S_{zx})} V_y\,dz\,dx + \iint_{(S_{xy})} V_z\,dx\,dy$.  (13.115)
$\int_{(S)}\vec{V}\times d\vec{S} = \iint(V_z\vec{j} - V_y\vec{k})\,dy\,dz + \iint(V_x\vec{k} - V_z\vec{i})\,dz\,dx + \iint(V_y\vec{i} - V_x\vec{j})\,dx\,dy$.  (13.116)
The existence theorems for these integrals can be given similarly to those in 8.5.2, 4., p. 482. In the formulas above, each of the integrals is taken over the projection of S onto the corresponding coordinate plane (Fig. 13.18), where one of the variables x, y, or z should be expressed by the others from the equation of S.
Remark: Integrals over a closed surface are denoted by
$\oint_{(S)} U\,d\vec{S},\qquad \oint_{(S)}\vec{V}\cdot d\vec{S},\qquad \oint_{(S)}\vec{V}\times d\vec{S}$.  (13.117)
A: Calculate the integral $\vec{P} = \int_{(S)} xyz\,d\vec{S}$, where the surface is the plane region $x + y + z = 1$ bounded by the coordinate planes; the upward side is the positive side:
$\vec{P} = \iint_{(S_{yz})}(1-y-z)yz\,dy\,dz\,\vec{i} + \iint_{(S_{zx})}(1-x-z)xz\,dz\,dx\,\vec{j} + \iint_{(S_{xy})}(1-x-y)xy\,dx\,dy\,\vec{k}$.
With $\int_0^1\!\int_0^{1-z}(1-y-z)yz\,dy\,dz = \frac{1}{120}$, and the two further integrals obtained analogously, the result is $\vec{P} = \frac{1}{120}(\vec{i} + \vec{j} + \vec{k})$.
B: Calculate the integral $Q = \int_{(S)}\vec{r}\cdot d\vec{S}$ over the same plane region as in A:
$Q = \iint_{(S_{yz})} x\,dy\,dz + \iint_{(S_{zx})} y\,dz\,dx + \iint_{(S_{xy})} z\,dx\,dy$
with $\iint_{(S_{yz})} x\,dy\,dz = \int_0^1\!\int_0^{1-z}(1-y-z)\,dy\,dz = \frac{1}{6}$. Both other integrals are calculated similarly. The result is $Q = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$.
C: Calculate the integral $\vec{Q} = \int_{(S)}\vec{r}\times d\vec{S} = \int_{(S)}(x\vec{i} + y\vec{j} + z\vec{k})\times(dy\,dz\,\vec{i} + dz\,dx\,\vec{j} + dx\,dy\,\vec{k})$, where the surface region is the same as in A. Performing the computations we get $\vec{Q} = \vec{0}$.
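The projection integrals of Examples A and B are plain double integrals over a triangle and can be checked directly, assuming SymPy (the code is ours, not the handbook's).

```python
import sympy as sp

y, z = sp.symbols('y z', nonnegative=True)

# Example A, i-component: (1 - y - z)*y*z over the triangle y, z >= 0, y + z <= 1
IA = sp.integrate((1 - y - z)*y*z, (y, 0, 1 - z), (z, 0, 1))
# Example B, first term: x = 1 - y - z over the same triangle
IB = sp.integrate(1 - y - z, (y, 0, 1 - z), (z, 0, 1))
print(IA, IB)   # 1/120 1/6
```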
13.3.3 Integral Theorems
13.3.3.1 Integral Theorem and Integral Formula of Gauss
1. Integral Theorem of Gauss or the Divergence Theorem
The integral theorem of Gauss gives the relation between a volume integral of the divergence of $\vec{V}$ over a volume v and a surface integral over the surface S surrounding this volume. The orientation of the surface (see 8.5.2.1, p. 481) is defined so that the exterior side is the positive one. The vector function $\vec{V}$ must be continuous, and its first partial derivatives must exist and be continuous:
$\oint_{(S)}\vec{V}\cdot d\vec{S} = \iiint_{(v)}\operatorname{div}\vec{V}\,dv$.  (13.118a)
The scalar flow of the field $\vec{V}$ through a closed surface S is equal to the integral of the divergence of $\vec{V}$ over the volume v bounded by S. In Cartesian coordinates we get:
$\oint_{(S)}(V_x\,dy\,dz + V_y\,dz\,dx + V_z\,dx\,dy) = \iiint_{(v)}\left(\frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}\right)dx\,dy\,dz$.  (13.118b)
2. Integral Formula of Gauss
In the planar case, the integral theorem of Gauss restricted to the x,y plane becomes the integral formula of Gauss. It represents the correspondence between a line integral and the corresponding surface integral:
$\iint_{(B)}\left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)dx\,dy = \oint_{(C)}\big[P(x,y)\,dx + Q(x,y)\,dy\big]$.  (13.119)
B denotes a plane region which is bounded by C. P and Q are continuous functions with continuous first partial derivatives.
3. Sector Formula
The sector formula is an important special case of the Gauss integral formula. We can calculate the area of plane regions with it. For Q = x, P = -y it follows that
$S = \frac{1}{2}\oint_{(C)}(x\,dy - y\,dx)$.  (13.120)
13.3.3.2 Integral Theorem of Stokes
The integral theorem of Stokes gives the relation between a surface integral over an oriented surface region S defined in the vector field $\vec{V}$ and the integral along the closed boundary curve C of the surface S. The sense of the curve C is chosen so that the sense of traverse forms a right screw with the surface normal (see 13.3.2.1, p. 661). The vector function $\vec{V}$ must be continuous and have continuous first partial derivatives:
$\iint_{(S)}\operatorname{rot}\vec{V}\cdot d\vec{S} = \oint_{(C)}\vec{V}\cdot d\vec{r}$.  (13.121a)
The flow of the rotation through a surface S bounded by the closed curve C is equal to the contour integral of the vector field $\vec{V}$ along the curve C. In Cartesian coordinates we have:
$\iint_{(S)}\left[\left(\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right)dy\,dz + \left(\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right)dz\,dx + \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right)dx\,dy\right] = \oint_{(C)}(V_x\,dx + V_y\,dy + V_z\,dz)$.  (13.121b)
In the planar case, the integral theorem of Stokes, just as that of Gauss, becomes the integral formula (13.119) of Gauss.
13.3.3.3 Integral Theorems of Green
The Green integral theorems give relations between volume and surface integrals. They are applications of the Gauss theorem to the function $\vec{V} = U_1\operatorname{grad}U_2$, where $U_1$ and $U_2$ are scalar field functions and v is the volume surrounded by the surface S:
$\iiint_{(v)}\big(U_1\Delta U_2 + \operatorname{grad}U_1\cdot\operatorname{grad}U_2\big)\,dv = \oint_{(S)} U_1\operatorname{grad}U_2\cdot d\vec{S}$,  (13.122)
$\iiint_{(v)}\big(U_1\Delta U_2 - U_2\Delta U_1\big)\,dv = \oint_{(S)}\big(U_1\operatorname{grad}U_2 - U_2\operatorname{grad}U_1\big)\cdot d\vec{S}$.  (13.123)
In particular for $U_1 = 1$ we have:
$\iiint_{(v)}\Delta U_2\,dv = \oint_{(S)}\operatorname{grad}U_2\cdot d\vec{S} = \oint_{(S)}\frac{\partial U_2}{\partial n}\,dS$.  (13.124)
In Cartesian coordinates the third Green theorem has the following form (compare (13.118b)):
$\iiint_{(v)}\left(\frac{\partial^2 U_2}{\partial x^2} + \frac{\partial^2 U_2}{\partial y^2} + \frac{\partial^2 U_2}{\partial z^2}\right)dx\,dy\,dz = \oint_{(S)}\frac{\partial U_2}{\partial n}\,dS$.  (13.125)
A: Calculate the line integral $I = \oint_{(C)}(x^2y^3\,dx + dy + z\,dz)$ with C the intersection curve of the cylinder $x^2 + y^2 = a^2$ and the plane z = 0. With the Stokes theorem we get
$I = \iint_{(S^*)}\operatorname{rot}\vec{V}\cdot d\vec{S} = -3\iint_{(S^*)} x^2y^2\,dx\,dy = -3\int_{\varphi=0}^{2\pi}\int_{r=0}^{a} r^5\cos^2\varphi\,\sin^2\varphi\,dr\,d\varphi = -\frac{\pi a^6}{8}$
with $\operatorname{rot}\vec{V} = -3x^2y^2\,\vec{k}$, $d\vec{S} = \vec{k}\,dx\,dy$ and the circle $S^*\!: x^2 + y^2 \le a^2$.
B: Determine the flux $I = \oint_{(S)}\vec{V}\cdot d\vec{S}$ of the field $\vec{V} = x^3\vec{i} + y^3\vec{j} + z^3\vec{k}$ through the surface S of the sphere $x^2 + y^2 + z^2 = a^2$. The theorem of Gauss yields:
$I = \oint_{(S)}\vec{V}\cdot d\vec{S} = \iiint_{(v)}\operatorname{div}\vec{V}\,dv = 3\iiint_{(v)}(x^2+y^2+z^2)\,dx\,dy\,dz = 3\int_{\varphi=0}^{2\pi}\int_{\vartheta=0}^{\pi}\int_{r=0}^{a} r^4\sin\vartheta\,dr\,d\vartheta\,d\varphi = \frac{12}{5}\pi a^5$.
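Both right-hand sides above are elementary iterated integrals and can be reproduced symbolically, assuming SymPy (the snippet is ours, not the handbook's).

```python
import sympy as sp

r, p, th, a = sp.symbols('r phi theta a', positive=True)

# Example A (Stokes): surface integral of rot V = -3*x^2*y^2 over the disc of radius a
IA = sp.integrate(-3*r**5*sp.cos(p)**2*sp.sin(p)**2, (r, 0, a), (p, 0, 2*sp.pi))
# Example B (Gauss): volume integral of div V = 3*r^2 over the ball of radius a
IB = sp.integrate(3*r**4*sp.sin(th), (r, 0, a), (th, 0, sp.pi), (p, 0, 2*sp.pi))
print(sp.simplify(IA), sp.simplify(IB))   # -pi*a**6/8   12*pi*a**5/5
```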
C: Heat conduction equation: The change in time of the heat Q of a space region v containing no heat source is given by $\frac{dQ}{dt} = \frac{d}{dt}\iiint_{(v)} c\varrho T\,dv$ (specific heat capacity c, density $\varrho$, temperature T), while the corresponding time-dependent heat flow through the surface S of v is given by $\frac{dQ}{dt} = \oint_{(S)}\lambda\operatorname{grad}T\cdot d\vec{S}$ (thermal conductivity $\lambda$). Applying the theorem of Gauss to the surface integral we get from
$\iiint_{(v)}\left[c\varrho\frac{\partial T}{\partial t} - \operatorname{div}(\lambda\operatorname{grad}T)\right]dv = 0$
the heat conduction equation
$c\varrho\frac{\partial T}{\partial t} = \operatorname{div}(\lambda\operatorname{grad}T)$,
which has the form $\frac{\partial T}{\partial t} = a^2\Delta T$ in the case of a homogeneous solid (c, $\varrho$, $\lambda$ constants).
13.4 Evaluation of Fields
13.4.1 Pure Source Fields
We call a field $\vec{V}_1$ a pure source field or an irrotational source field when its rotation is equal to zero everywhere. If the divergence is $q(\vec{r})$, then we have:
$\operatorname{div}\vec{V}_1 = q(\vec{r}),\qquad \operatorname{rot}\vec{V}_1 = \vec{0}$.  (13.126)
In this case the field has a potential U, which is defined at every point P by the Poisson differential equation (see 13.5.2, p. 668)
$\vec{V}_1 = \operatorname{grad}U,\qquad \operatorname{div}\operatorname{grad}U = \Delta U = q(\vec{r})$,  (13.127a)
where $\vec{r}$ is the position vector of P. (In physics, $\vec{V}_1 = -\operatorname{grad}U$ is used.) The evaluation of U follows from
$U(\vec{r}) = -\frac{1}{4\pi}\iiint\frac{q(\vec{r}^{\,\prime})}{|\vec{r} - \vec{r}^{\,\prime}|}\,dv'$.  (13.127b)
The integration is taken over the whole of space (Fig. 13.19). The divergence of $\vec{V}_1$ must be differentiable and must decrease quickly enough at large distances.

Figure 13.19    Figure 13.20
13.4.2 Pure Rotation Field or Zero-Divergence Field
A pure rotation (or curl) field or a solenoidal field is a vector field $\vec{V}_2$ whose divergence is equal to zero everywhere. If the rotation is $\vec{w}(\vec{r})$, then we have:
$\operatorname{div}\vec{V}_2 = 0,\qquad \operatorname{rot}\vec{V}_2 = \vec{w}(\vec{r})$.  (13.128a)
The rotation $\vec{w}(\vec{r})$ cannot be arbitrary: it must satisfy the equation $\operatorname{div}\vec{w} = 0$. With the assumptions
$\vec{V}_2(\vec{r}) = \operatorname{rot}\vec{A}(\vec{r}),\qquad \operatorname{div}\vec{A} = 0$,  (13.128b)
we get according to (13.95)
$\operatorname{rot}\operatorname{rot}\vec{A} = \operatorname{grad}\operatorname{div}\vec{A} - \Delta\vec{A} = \vec{w}$, i.e., $\Delta\vec{A} = -\vec{w}$.  (13.128c)
So $\vec{A}(\vec{r})$ formally satisfies the Poisson differential equation just as the potential U of an irrotational field does, and that is why it is called a vector potential. For every point P:
$\vec{V}_2 = \operatorname{rot}\vec{A}\quad\text{with}\quad \vec{A}(\vec{r}) = \frac{1}{4\pi}\iiint\frac{\vec{w}(\vec{r}^{\,\prime})}{|\vec{r} - \vec{r}^{\,\prime}|}\,dv'$.  (13.128d)
The meaning of $\vec{r}^{\,\prime}$ is the same as in (13.127b); the integration is taken over the whole of space.
13.4.3 Vector Fields with Point-Like Sources
13.4.3.1 Coulomb Field of a Point-Like Charge
The Coulomb field is an important example of an irrotational field which is also solenoidal, except at the location of the point charge, the point source (Fig. 13.20). The field and potential equations are:
$\vec{E} = \frac{e}{r^3}\,\vec{r},\qquad U = \frac{e}{r}$  (in physics also $U = -\frac{e}{r}$).  (13.129a)
The scalar flow is $4\pi e$ or 0, depending on whether the surface S encloses the point source or not:
$\oint_{(S)}\vec{E}\cdot d\vec{S} = \begin{cases}4\pi e & \text{if S encloses the point source},\\ 0 & \text{otherwise}.\end{cases}$  (13.129b)
The quantity e is the source intensity or source strength.
13.4.3.2 Gravitational Field of a Point Mass
The field of gravity of a point mass is a second example of an irrotational and at the same time solenoidal field, except at the center of mass. We also call it the Newton field. Every relation valid for the Coulomb field is valid analogously for the Newton field.
13.4.4 Superposition of Fields
13.4.4.1 Discrete Source Distribution
Analogously to the superposition of the fields of physics, the vector fields of mathematics superpose on each other. The superposition law is: If the vector fields $\vec{V}_\nu$ have the potentials $U_\nu$, then the vector field $\vec{V} = \sum\vec{V}_\nu$ has the potential $U = \sum U_\nu$. For n discrete point sources with source strengths $e_\nu$ ($\nu = 1, 2, \ldots, n$), whose fields are superposed, the resulting field can be determined by the algebraic sum of the potentials $U_\nu$:
$\vec{V}(\vec{r}) = -\operatorname{grad}\sum_{\nu=1}^{n} U_\nu\quad\text{with}\quad U_\nu = \frac{e_\nu}{|\vec{r} - \vec{r}_\nu|}$.  (13.130a)
Here the vector $\vec{r}$ is again the position vector of the point under consideration, and $\vec{r}_\nu$ are the position vectors of the sources. If there is an irrotational field $\vec{V}_1$ and a zero-divergence field $\vec{V}_2$ together and they are everywhere continuous, then
$\vec{V} = \vec{V}_1 + \vec{V}_2$.  (13.130b)
If the vector field extends to infinity, then the decomposition of $\vec{V}(\vec{r})$ is unique if $|\vec{V}(\vec{r})|$ decreases sufficiently rapidly as $r\to\infty$. The integration is taken over the whole of space.
13.4.4.2 Continuous Source Distribution
If the sources are distributed continuously along lines, surfaces, or in domains of space, then instead of the finite source strengths $e_\nu$ we have infinitesimals corresponding to the density of the source distribution, and instead of the sums we have integrals over the domain. In the case of a continuous spatial distribution of source strength, the divergence is $q(\vec{r}) = \operatorname{div}\vec{V}$. Similar statements are valid for the potential of a field defined by rotation. In the case of a continuous spatial rotation distribution, the "rotation density" is defined by $\vec{w}(\vec{r}) = \operatorname{rot}\vec{V}$.
13.4.4.3 Conclusion
A vector field is determined uniquely by its sources and rotations in space if all these sources and rotations lie in a finite part of space.
13.5 Differential Equations of Vector Field Theory
13.5.1 Laplace Differential Equation
The problem of determining the potential U of a vector field $\vec{V}_1 = \operatorname{grad}U$ containing no sources leads, according to (13.126) with $q(\vec{r}) = 0$, to the equation
$\operatorname{div}\vec{V}_1 = \operatorname{div}\operatorname{grad}U = \Delta U = 0$,  (13.131a)
i.e., to the Laplace differential equation. In Cartesian coordinates we have:
$\Delta U = \frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} + \frac{\partial^2 U}{\partial z^2} = 0$.  (13.131b)
Every function satisfying this differential equation which is continuous and possesses continuous first and second order partial derivatives is called a Laplace or harmonic function. We distinguish three basic types of boundary value problems:
1. Boundary value problem (for an interior domain) or Dirichlet problem: We determine a function U(x,y,z) which is harmonic inside a given space or plane domain and takes the given values on the boundary of this domain.
2. Boundary value problem (for an interior domain) or Neumann problem: We determine a function U(x,y,z) which is harmonic inside a given domain and whose normal derivative $\frac{\partial U}{\partial n}$ takes the given values on the boundary of this domain.
3. Boundary value problem (for an interior domain): We determine a function U(x,y,z) which is harmonic inside a given domain and for which the expression $\alpha U + \beta\frac{\partial U}{\partial n}$ ($\alpha$, $\beta$ const, $\alpha^2 + \beta^2 \ne 0$) takes given values on the boundary of this domain.
+
13.5.2 Poisson Differential Equation The problem to determine the potential U of a vector field to the equation according to (13.126) with q(?) # 0
= grad U with given divergence, leads
(13.132a) d i v v l = div grad U = AU = q(?) # 0, Le., to the Poisson diflerential equation. Since in Cartesian coordinates: d2U d2U d2U A v = -+--+ (13.132b) 622 ’ 8x2 a y 2 the Laplacedifferential equation (13.131b)isa specialcase of the Poisson differential equation (13.13213). The solution is the Newton potential (for point masses) or the Coulomb potential (for point charges) (13.1324 The integration is taken over the whole of space. U(?) tends to zero sufficiently rapidly for increasing values. We can discuss the same three boundary value problems for the Poisson differential equation as for the solution of the Laplace differential equation. The first and the third boundary value problems can be solved uniquely: for the second one we have to prescribe further special conditions (see [9.5]).
14 Function Theory
14.1 Functions of Complex Variables
14.1.1 Continuity, Differentiability
14.1.1.1 Definition of a Complex Function
Analogously to real functions, we can assign complex values to complex values, i.e., to the value z = x + iy we can assign a complex number w = u + iv, where u = u(x,y) and v = v(x,y) are real functions of two real variables. We then write w = f(z). The function w = f(z) is a mapping from the complex z plane to the complex w plane. The notions of limit, continuity, and derivative of a complex function w = f(z) can be defined analogously to real functions of real variables.
14.1.1.2 Limit of a Complex Function
The limit of a function f(z) is equal to the complex number $w_0$ if for z approaching $z_0$ the value of the function f(z) approaches $w_0$:
$\lim_{z\to z_0} f(z) = w_0$.  (14.1a)
In other words: for any positive $\varepsilon$ there is a (real) $\delta$ such that for every z satisfying (14.1b), except maybe $z_0$ itself, the inequality (14.1c) holds:
$|z_0 - z| < \delta$,  (14.1b)    $|w_0 - f(z)| < \varepsilon$.  (14.1c)
The geometrical meaning is as follows: any point z in the circle with center $z_0$ and radius $\delta$, except maybe the center $z_0$ itself, is mapped into a point w = f(z) inside a circle with center $w_0$ and radius $\varepsilon$ in the w plane where f has its range, as shown in Fig. 14.1. The circles with radii $\delta$ and $\varepsilon$ are also called the neighborhoods $U_\delta(z_0)$ and $U_\varepsilon(w_0)$.
Figure 14.1
14.1.1.3 Continuous Complex Functions
A function w = f(z) is continuous at $z_0$ if it has a limit there and a substitution value, and these two are equal, i.e., if for an arbitrarily small given neighborhood $U_\varepsilon(w_0)$ of the point $w_0 = f(z_0)$ in the w plane there exists a neighborhood $U_\delta(z_0)$ of $z_0$ in the z plane such that $w = f(z) \in U_\varepsilon(w_0)$ for every $z \in U_\delta(z_0)$. As represented in Fig. 14.1, $U_\varepsilon(w_0)$ is, e.g., a circle with radius $\varepsilon$ around the point $w_0$. We then write
$\lim_{z\to z_0} f(z) = f(z_0)$  or  $\lim_{\delta\to 0} f(z_0 + \delta) = f(z_0)$.  (14.2)
14.1.1.4 Differentiability of a Complex Function
A function w = f(z) is differentiable at z if the difference quotient
$\frac{\Delta w}{\Delta z} = \frac{f(z+\Delta z) - f(z)}{\Delta z}$  (14.3)
has a limit for $\Delta z\to 0$, independently of how $\Delta z$ approaches zero. This limit is denoted by f'(z) and is called the derivative of f(z).
Example: The function f(z) = Re z = x is not differentiable at any point $z = z_0$, since approaching $z_0$ parallel to the x-axis the limit of the difference quotient is one, and approaching parallel to the y-axis this value is zero.
14.1.2 Analytic Functions
14.1.2.1 Definition of Analytic Functions
A function f(z) is called analytic, regular, or holomorphic on a domain G if it is differentiable at every point of G. The boundary points of G where f'(z) does not exist are singular points of f(z). The function f(z) = u(x,y) + iv(x,y) is differentiable in G if u and v have continuous partial derivatives in G with respect to x and y and they also satisfy the Cauchy-Riemann differential equations:
$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y},\qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$.  (14.4)
The real and imaginary parts of an analytic function satisfy the Laplace differential equation:
$\Delta u(x,y) = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$,  (14.5a)
$\Delta v(x,y) = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0$.  (14.5b)
The derivatives of the elementary functions of a complex variable can be calculated with the same formulas as the derivatives of the corresponding real functions.
A: f(z) = z^3, f'(z) = 3z^2.    B: f(z) = sin z, f'(z) = cos z.
14.1.2.2 Examples of Analytic Functions
1. Elementary Functions
The elementary algebraic and transcendental functions are analytic in the whole z plane except at some isolated singular points. If a function is analytic on a domain, i.e., it is differentiable there, then it is differentiable arbitrarily many times.
A: The function w = z^2 with u = x^2 - y^2, v = 2xy is everywhere analytic.
B: The function w = u + iv defined by the equations u = 2x + y, v = x + 2y is not analytic at any point.
C: The function f(z) = z^3 with f'(z) = 3z^2 is analytic.
D: The function f(z) = sin z with f'(z) = cos z is analytic.
2. Determination of the Functions u and v
If both functions u and v satisfy the Laplace differential equation, then they are harmonic functions (see 13.5.1, p. 667). If one of these harmonic functions is known, e.g., u, then the second one, the conjugate harmonic function v, can be determined up to an additive constant with the Cauchy-Riemann differential equations:
$v = \int\frac{\partial u}{\partial x}\,dy + \varphi(x)\quad\text{with}\quad \frac{d\varphi}{dx} = -\left(\frac{\partial}{\partial x}\int\frac{\partial u}{\partial x}\,dy + \frac{\partial u}{\partial y}\right)$.  (14.6)
Analogously, u can be determined if v is known.
14.1.2.3 Properties of Analytic Functions
1. Absolute Value or Modulus of an Analytic Function
The absolute value (modulus) of an analytic function is
$|w| = |f(z)| = \sqrt{[u(x,y)]^2 + [v(x,y)]^2} = \varphi(x,y)$.  (14.7)
The surface $|w| = \varphi(x,y)$ is called its relief, i.e., |w| is the third coordinate above every point z = x + iy.
A: The absolute value of the function $\sin z = \sin x\cosh y + \mathrm{i}\cos x\sinh y$ is $|\sin z| = \sqrt{\sin^2 x + \sinh^2 y}$. The relief is shown in Fig. 14.2a.
B: The relief of the function $w = e^{1/z}$ is shown in Fig. 14.2b. For the reliefs of several analytic functions see [14.10].
Figure 14.2
2. Roots
Since the absolute value of a function is never negative, the relief is always above the z plane, except at the points where |f(z)| = 0, i.e., f(z) = 0. The z values where f(z) = 0 holds are called the roots of the function f(z).
3. Boundedness
A function is bounded on a certain domain if there exists a positive number N such that |f(z)| < N holds everywhere in this domain. In the opposite case, if no such number N exists, the function is called unbounded.
4. Theorem about the Maximum Value
If w = f(z) is an analytic function on a closed domain, then the maximum of its absolute value is attained on the boundary of the domain.
5. Theorem about the Constant (Theorem of Liouville)
If w = f(z) is analytic in the whole plane and also bounded, then this function is a constant: f(z) = const.
14.1.2.4 Singular Points
If a function w = f(z) is analytic in a neighborhood of z = a, i.e., in a small circle with center a, except at a itself, then f has a singularity at a. There are three types of singularities:
1. f(z) is bounded in the neighborhood. Then $w = \lim_{z\to a} f(z)$ exists, and setting f(a) = w the function becomes analytic also at a. In this case f has a removable singularity at a.
2. If $\lim_{z\to a}|f(z)| = \infty$, then f has a pole at a. About poles of different orders see 14.3.5.1, p. 691.
3. If f has neither a removable singularity nor a pole, then f has an essential singularity. In this case, for any complex w there exists a sequence $z_n\to a$ such that $f(z_n)\to w$.
A: The function $w = \frac{1}{z-a}$ has a pole at a.
B: The function $w = e^{1/z}$ has an essential singularity at 0 (Fig. 14.2b).
14.1.3 Conformal Mapping
14.1.3.1 Notion and Properties of Conformal Mappings
1. Definition
A mapping from the z plane to the w plane is called a conformal mapping if it is analytic and injective. In this case,
$w = f(z) = u + \mathrm{i}v,\qquad f'(z) \ne 0$.  (14.8)
The conformal mapping has the following properties: the transformation $dw = f'(z)\,dz$ of the line element dz is the composition of a dilatation by $\sigma = |f'(z)|$ and a rotation by $\alpha = \arg f'(z)$. This means that infinitesimal circles are transformed into (almost) circles and triangles into (almost) similar triangles (Fig. 14.3). The curves keep their angles of intersection, so an orthogonal family of curves is transformed into an orthogonal family (Fig. 14.4).

Figure 14.3    Figure 14.4

Remark: Conformal mappings can be found in physics, electrotechnics, hydro- and aerodynamics, and in other areas of mathematics.
2. The Cauchy-Riemann Equations
The mapping between dz and dw is given by the affine differential transformation
$\begin{pmatrix} du\\ dv\end{pmatrix} = \begin{pmatrix} u_x & u_y\\ v_x & v_y\end{pmatrix}\begin{pmatrix} dx\\ dy\end{pmatrix}$,  (14.9a)
i.e., in matrix form
$dw = A\,dz$ with $A = \begin{pmatrix} u_x & u_y\\ v_x & v_y\end{pmatrix}$.  (14.9b)
According to the Cauchy-Riemann differential equations, A has the form of a rotation-stretching matrix (see 3.5.2.2, 2., p. 191) with $\sigma$ as the stretching factor:
$A = \begin{pmatrix} u_x & u_y\\ v_x & v_y\end{pmatrix} = \sigma\begin{pmatrix}\cos\alpha & -\sin\alpha\\ \sin\alpha & \cos\alpha\end{pmatrix}$,  (14.10a)
$u_x = v_y = \sigma\cos\alpha$,  (14.10b)    $-u_y = v_x = \sigma\sin\alpha$,  (14.10c)
$\sigma = |f'(z)| = \sqrt{u_x^2 + v_x^2}$,  (14.10d)    $\alpha = \arg f'(z) = \arg(u_x + \mathrm{i}v_x)$.  (14.10e)
3. Orthogonal Systems
The coordinate lines x = const and y = const of the z plane are transformed by a conformal mapping into two orthogonal families of curves. In general, analytic functions generate a multitude of orthogonal curvilinear coordinate systems; conversely, for every conformal mapping there exists an orthogonal net of curves which is transformed into an orthogonal coordinate system.
A: In the case of w = u + iv with u = 2x + y, v = x + 2y (Fig. 14.5), orthogonality does not hold.

Figure 14.5

B: In the case w = z^2 the orthogonality is retained, except at the point z = 0, because there w' = 0. The coordinate lines are transformed into two confocal families of parabolas (Fig. 14.6); the first quadrant of the z plane is mapped onto the upper half of the w plane.

Figure 14.6
14.1.3.2 Simplest Conformal Mappings
In this paragraph we discuss some transformations with their most important properties, and we give the graph of their isometric net in the z plane, i.e., the net which is transformed into an orthogonal Cartesian net in the w plane. The boundaries of the z regions mapped into the upper half of the w plane are denoted by shading. Black regions are mapped onto the square in the w plane with vertices (0,0), (0,1), (1,0), and (1,1) by the conformal mapping (Fig. 14.7).
1. Linear Function
For the conformal mapping given in the form of a linear function
$w = az + b$,  (14.11a)
the transformation can be done in three steps:
a) rotation of the plane by the angle $\alpha = \arg a$: $w_1 = e^{\mathrm{i}\alpha}z$;
b) stretching by the factor |a|: $w_2 = |a|\,w_1$;
c) parallel translation by b: $w = w_2 + b$.  (14.11b)
Altogether, every figure is transformed into a similar one. The points $z_1 = \infty$ and $z_2 = \frac{b}{1-a}$ (for $a \ne 1$, $b \ne 0$) are transformed into themselves, and they are called fixed points. Fig. 14.8 shows the orthogonal net which is transformed into the orthogonal Cartesian net.

Figure 14.7    Figure 14.8    Figure 14.9
2. Inversion
The conformal mapping
$w = \frac{1}{z}$  (14.12)
represents an inversion with respect to the unit circle together with a reflection in the real axis: a point z of the z plane with absolute value r and argument $\varphi$ is transformed into a point w of the w plane with absolute value 1/r and argument $-\varphi$ (see Fig. 14.10). Circles are transformed into circles, where lines are considered as limiting cases of circles (radius $\to\infty$). Points of the interior of the unit circle become exterior points and conversely (see Fig. 14.11). The point z = 0 is transformed into $w = \infty$. The points z = -1 and z = 1 are fixed points of this conformal mapping. The orthogonal net of the transformation (14.12) is shown in Fig. 14.9.
Remark: In general, a geometric transformation is called an inversion with respect to a circle with radius r when a point $P_2$ at distance $r_2$ from the center inside the circle is transformed into a point $P_1$ on the elongation of the same radius vector $\overrightarrow{OP_2}$ outside of the circle, at distance $r_1$ with $r_1r_2 = r^2$. Points of the interior of the circle become exterior points and conversely.

Figure 14.10    Figure 14.11
3. Linear Fractional Function
For the conformal mapping given in the form of a linear fractional function
$w = \frac{az + b}{cz + d}$,  (14.13a)
the transformation can be performed in three steps:
a) linear function: $w_1 = cz + d$;
b) inversion: $w_2 = \frac{1}{w_1}$;
c) linear function: $w = \frac{a}{c} + \frac{bc - ad}{c}\,w_2$.  (14.13b)
Circles are transformed again into circles (circular transformation), where straight lines are considered as limiting cases of circles with $r\to\infty$. Fixed points of this conformal mapping are the two points satisfying the quadratic equation
$z = \frac{az + b}{cz + d}$.  (14.14)
If the points $z_1$ and $z_2$ are inverses of each other with respect to a circle $K_1$ of the z plane, then their images $w_1$ and $w_2$ in the w plane are also inverses of each other with respect to the image circle $K_2$ of $K_1$. The orthogonal net which has the orthogonal Cartesian net as its image is represented in Fig. 14.12.

Figure 14.12    Figure 14.13    Figure 14.14
4. Quadratic Function
The conformal mapping described by the quadratic function
$w = z^2$  (14.15a)
has the form in polar coordinates and as a function of x and y:
$w = \rho^2 e^{2\mathrm{i}\varphi}$,  (14.15b)    $w = u + \mathrm{i}v = x^2 - y^2 + 2\mathrm{i}xy$.  (14.15c)
It is obvious from the polar coordinate representation that the upper half of the z plane is mapped onto the whole w plane, i.e., the image of the whole z plane covers the w plane twice. The representation in Cartesian coordinates shows that the coordinate lines of the w plane, u = const and v = const, come from the orthogonal families of hyperbolas $x^2 - y^2 = u$ and $2xy = v$ of the z plane (Fig. 14.13). Fixed points of this mapping are z = 0 and z = 1. The mapping is not conformal at z = 0.
5. Square Root
The conformal mapping given in the form of the square root of z,
$w = \sqrt{z}$,  (14.16)
transforms the whole z plane either onto the upper half of the w plane or onto the lower half of it, i.e., the function is double-valued. The coordinate lines of the w plane come from two orthogonal families of confocal parabolas with the focus at the origin of the z plane and with the positive or with the negative
real half-axis as their axis (Fig. 14.14). Fixed points of the mapping are z = 0 and z = 1. The mapping is not conformal at z = 0.
6. Sum of Linear and Fractional Linear Functions
The conformal mapping given by the function
$w = \frac{k}{2}\left(z + \frac{1}{z}\right)$  (k a real constant, k > 0)  (14.17a)
can be transformed by the polar coordinate representation $z = \rho e^{\mathrm{i}\varphi}$ and by separating the real and imaginary parts according to (14.8):
$u = \frac{k}{2}\left(\rho + \frac{1}{\rho}\right)\cos\varphi,\qquad v = \frac{k}{2}\left(\rho - \frac{1}{\rho}\right)\sin\varphi$.  (14.17b)
Circles with $\rho = \rho_0$ = const of the z plane (Fig. 14.15a) are transformed into confocal ellipses
$\frac{u^2}{a^2} + \frac{v^2}{b^2} = 1$  with  $a = \frac{k}{2}\left(\rho_0 + \frac{1}{\rho_0}\right),\ b = \frac{k}{2}\left|\rho_0 - \frac{1}{\rho_0}\right|$  (14.17c)
in the w plane (Fig. 14.15b). The foci are the points $\pm k$ of the real axis. For the unit circle with $\rho = \rho_0 = 1$ we get the degenerate ellipse of the w plane, the double segment $(-k, +k)$ of the real axis. Both the interior and the exterior of the unit circle are transformed onto the entire w plane with the cut $(-k, +k)$, so the inverse function is double-valued:
$z = \frac{w + \sqrt{w^2 - k^2}}{k}$.  (14.17d)
The lines $\varphi = \varphi_0$ of the z plane (Fig. 14.15c) become confocal hyperbolas
$\frac{u^2}{\alpha^2} - \frac{v^2}{\beta^2} = 1$  with  $\alpha = k\cos\varphi_0,\ \beta = k\sin\varphi_0$  (14.17e)
with foci $\pm k$ (Fig. 14.15d). The hyperbolas corresponding to the coordinate half-axes of the z plane ($\varphi = 0, \frac{\pi}{2}, \pi, \frac{3\pi}{2}$) degenerate into the axis u = 0 (the v axis) and into the intervals $(-\infty, -k)$ and $(k, \infty)$ of the real axis traversed there and back.
Figure 14.15
7. Logarithm
The conformal mapping given in the form of the logarithm function
$w = \operatorname{Ln} z$  (14.18a)
has the following form for z given in polar coordinates:
$u = \ln\rho,\qquad v = \varphi + 2k\pi\quad (k = 0, \pm 1, \pm 2, \ldots)$.  (14.18b)
We can see from this representation that the coordinate lines u = const and v = const come from concentric circles around the origin of the z plane and from rays starting at the origin of the z plane
Figure 14.16
(Fig. 14.16). The isometric net is a polar net. The logarithm function Ln z is infinitely many-valued (see (14.74c), p. 697). If we restrict our investigation to the principal value ln z of Ln z ($-\pi < v \le +\pi$), then the whole z plane is transformed into a strip of the w plane bounded by the lines $v = \pm\pi$, where $v = \pi$ belongs to the strip.
8. Exponential Function
The conformal mapping given in the form of an exponential function (see also 14.5.2, 1., p. 696)
$w = e^{z}$  (14.19a)
has the form in polar coordinates
$w = \rho e^{\mathrm{i}\psi}$.  (14.19b)
From z = x + iy we get
$\rho = e^{x}$  and  $\psi = y$.  (14.19c)
If y changes from $-\pi$ to $+\pi$ and x changes from $-\infty$ to $+\infty$, then $\rho$ takes all values from 0 to $\infty$ and $\psi$ from $-\pi$ to $\pi$. A $2\pi$ wide strip of the z plane, parallel to the x-axis, is transformed into the entire w plane (Fig. 14.17).
Figure 14.17
9. The Schwarz-Christoffel Formula
By the Schwarz-Christoffel formula
$z = C_1\int\frac{dt}{(t - w_1)^{\alpha_1}(t - w_2)^{\alpha_2}\cdots(t - w_n)^{\alpha_n}} + C_2$  (14.20a)
the interior of a polygon of the z plane can be mapped onto the upper half of the w plane. The polygon has n exterior angles $\alpha_1\pi, \alpha_2\pi, \ldots, \alpha_n\pi$ (Fig. 14.18a,b). We denote by $w_i$ (i = 1, ..., n) the points of the real axis in the w plane assigned to the vertices of the polygon, and by t the integration variable. The oriented boundary of the polygon is transformed into the oriented real axis of the w plane by this mapping.
Figure 14.18
For large values of t the integrand behaves as 1/t² and is regular at infinity. Since the sum of all the exterior angles of an n-gon is equal to 2π, we get:
Σ_{ν=1}^{n} αν = 2.   (14.20b)
The complex constants C₁ and C₂ produce a rotation, a stretching and a translation; they do not affect the shape of the polygon, only its size and position in the z plane. Three arbitrary points w₁, w₂, w₃ of the w plane can be assigned to three points z₁, z₂, z₃ of the polygon in the z plane. If we assign the point at infinity of the w plane, i.e., |w| = ∞, to a vertex of the polygon in the z plane, e.g., to z = z₁, then the factor (t − w₁)^(α₁) is omitted. If the polygon is degenerate, e.g., a vertex is at infinity, then the corresponding exterior angle is π, i.e., αν = 1; the polygon is then, e.g., a half-strip.
Figure 14.19
■ A: We want to map a certain region of the z plane, the shaded region in Fig. 14.19a. Considering Σαν = 2 we choose three points as the table shows (Fig. 14.19a,b). The formula of the mapping is:
z = C₁ ∫₀^w (√t/(t + 1)) dt = 2C₁ (√w − arctan √w) = i (2d/π)(√w − arctan √w).
For the determination of C₁ we substitute t = ρe^(iφ) − 1:
i d = C₁ lim_{ρ→0} ∫_π^0 [(−1 + ρe^(iφ))^(1/2) / (ρe^(iφ))] i ρe^(iφ) dφ = C₁ π,   i.e.,   C₁ = i d/π.
We get the constant C₂ = 0 considering that the mapping assigns z = 0 → w = 0.
■ B: Mapping of a rectangle. Let the vertices of the rectangle be z₁,₄ = ±K, z₂,₃ = ±K + iK′. The points z₁ and z₂ should be transformed into the points w₁ = 1 and w₂ = 1/k (0 < k < 1) of the real axis; z₄ and z₃ are reflections of z₁ and z₂ with respect to the imaginary axis. According to the Schwarz reflection principle (see 14.1.3.3, p. 679) they must correspond to the points w₄ = −1 and w₃ = −1/k (Fig. 14.20a,b). So, the mapping formula for a rectangle (α₁ = α₂ = α₃ = α₄ = 1/2) of the position
Figure 14.20
sketched above is:
z = C₁ ∫₀^w dt / √[(t − w₁)(t − w₂)(t − w₃)(t − w₄)] = C₁ k ∫₀^w dt / √[(1 − t²)(1 − k²t²)].
The point z = 0 has the image w = 0 and the image of z = iK′ is w = ∞. With C₁ = 1/k we get
z = ∫₀^w dw / √[(1 − w²)(1 − k²w²)] = ∫₀^φ dϑ / √(1 − k² sin²ϑ) = F(φ, k)   (substituting t = sin ϑ, w = sin φ).
F(φ, k) is the elliptic integral of the first kind (see 8.1.4.3, p. 436). We get the constant C₂ = 0 considering that the mapping assigns z = 0 → w = 0.
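The elliptic integral F(φ, k) appearing here can be evaluated numerically. The sketch below (not part of the original text) assumes SciPy and its convention that scipy.special.ellipkinc takes the parameter m = k², not the modulus k itself.

# F(phi, k) via SciPy and by direct quadrature; K(k) is the quarter-period
# that fixes the side length of the rectangle in the example above.
import numpy as np
from scipy.special import ellipkinc, ellipk
from scipy.integrate import quad

k, phi = 0.6, 1.0
val_scipy = ellipkinc(phi, k**2)
val_quad, _ = quad(lambda t: 1.0 / np.sqrt(1.0 - (k * np.sin(t))**2), 0.0, phi)
print(val_scipy, val_quad)     # the two values agree
print(ellipk(k**2))            # complete integral K(k)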
14.1.3.3 The Schwarz Reflection Principle 1. Statement Suppose f ( z ) is an analytic complex function in a domain G, and the boundary of G contains a line segment gl. If the function is continuous on g1 and it maps the line g1 into a line g;, then the points lying symmetric with respect to g1 are transformed into points lying symmetric with respect to g; (Fig. 14.21).
Figure 14.21
Figure 14.23
2. Application
The application of this principle simplifies the calculation and the representation of plane regions with straight-line boundaries: If the straight boundary is a stream line (insulating boundary in Fig. 14.22), then the sources are reflected as sources, the sinks as sinks, and curls as curls with the opposite sense of rotation. If the straight boundary is a potential line (conducting boundary in Fig. 14.23), then the sources are reflected as sinks, the sinks as sources, and curls as curls with the same sense of rotation.
14.1.3.4 Complex Potential 1. Notion of the Complex Potential We consider a field V = V(x, y) in the x, y plane with continuous and differentiable components Vx(x, y) and Vy(x, y) of the vector V, for the zero-divergence and the irrotational case.
a) Zero-divergence field with div V = 0, i.e., ∂Vx/∂x + ∂Vy/∂y = 0: This is the integrability condition for the differential equation expressed with the field or stream function Ψ(x, y)
dΨ = −Vy dx + Vx dy = 0,   (14.21a)   and then Vx = ∂Ψ/∂y,   Vy = −∂Ψ/∂x.   (14.21b)
For two points P₁, P₂ of the field V the difference Ψ(P₂) − Ψ(P₁) is a measure of the flux through a curve connecting the points P₁ and P₂, in the case when the whole curve lies in the field.
b) Irrotational field with rot V = 0, i.e., ∂Vy/∂x − ∂Vx/∂y = 0: This is the integrability condition for the differential equation with the potential function Φ(x, y):
dΦ = Vx dx + Vy dy = 0,   (14.22a)   and then Vx = ∂Φ/∂x,   Vy = ∂Φ/∂y.   (14.22b)
The functions Φ and Ψ satisfy the Cauchy-Riemann differential equations (see 14.1.2.1, p. 670) and both satisfy the Laplace differential equation (ΔΦ = 0, ΔΨ = 0). We combine the functions Φ and Ψ into the analytic function
W = f(z) = Φ(x, y) + iΨ(x, y),   (14.23)
and this function is called the complex potential of the field V. Then −Φ(x, y) is the potential of the vector field V in the sense of the usual notation in physics and electrotechnics (see 13.3.1.6, 2., p. 660). The level lines of Ψ and Φ form an orthogonal net. We have the following equalities for the derivative of the complex potential and the field vector V:
dW/dz = f′(z) = ∂Φ/∂x − i ∂Φ/∂y = Vx − iVy;   the complex conjugate of dW/dz equals Vx + iVy.   (14.24)
2. Complex Potential of a Homogeneous Field The function W = az. (14.25) with real a is the complex potential of a field whose potential lines are parallel to the y-axis and whose direction lines are parallel to the x-axis (Fig. 14.24). A complex a results in a rotation of the field (Fig. 14.25).
Figure 14.24
Figure 14.25
3. Complex Potential of Source and Sink The complex potential of a field with a source of strength s > 0 at the point z = z₀ has the equation:
W = (s/(2π)) ln(z − z₀).   (14.26)
A sink with the same intensity has the equation:
W = −(s/(2π)) ln(z − z₀).   (14.27)
The direction lines run radially away from z = z₀, while the potential lines are concentric circles around the point z₀ (Fig. 14.26).
4. Complex Potential of a Source-Sink System We get the complex potential for a source at the point z₁ and for a sink at the point z₂, both having the same intensity, by superposition:
W = (s/(2π)) ln[(z − z₁)/(z − z₂)].   (14.28)
The potential lines Φ = const form Apollonius circles with respect to z₁ and z₂; the direction lines Ψ = const are circles through z₁ and z₂ (Fig. 14.27).
Figure 14.26
Figure 14.27
5. Complex Potential of a Dipole The complex potential of a dipole with dipole moment M > 0 at the point z₀, whose axis encloses an angle α with the real axis (Fig. 14.28), has the equation: (14.29)
6. Complex Potential of a Curl If the intensity of the curl is |Γ| with real Γ and its center is at the point z₀, then its equation is:
W = (Γ/(2πi)) ln(z − z₀).   (14.30)
In comparison with Fig. 14.26, the roles of the direction lines and the potential lines are interchanged. For complex Γ, (14.30) gives the potential of a vortex source, whose direction and potential lines form two families of spirals orthogonal to each other (Fig. 14.29).
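As a brief numerical sketch (assuming NumPy; not part of the original text), the complex potential of a source-sink pair from (14.28) can be evaluated directly, and the field components can be recovered from dW/dz as in (14.24); a central difference confirms the derivative.

# Source at z1, equal sink at z2: W(z) = (s/(2*pi)) * log((z - z1)/(z - z2)).
import numpy as np

s, z1, z2 = 1.0, -1.0 + 0.0j, 1.0 + 0.0j

def W(z):
    return s / (2.0 * np.pi) * np.log((z - z1) / (z - z2))

def dWdz(z):
    return s / (2.0 * np.pi) * (1.0 / (z - z1) - 1.0 / (z - z2))

z, h = 0.3 + 0.7j, 1e-6
numeric = (W(z + h) - W(z - h)) / (2.0 * h)   # derivative along the real direction
Vx, Vy = dWdz(z).real, -dWdz(z).imag          # field components, see (14.24)
print(np.allclose(numeric, dWdz(z)))          # True
print(Vx, Vy)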
14.1.3.5 Superposition Principle 1. Superposition of Complex Potentials A system composed of several sources, sinks, and curls is an additive superposition of the single fields, i.e., we get its complex potential and stream function by adding those of the components. Mathematically this is possible because of the linearity of the Laplace differential equations ΔΦ = 0 and ΔΨ = 0.
2. Composition of Vector Fields 1. Integration A new field can be constructed from complex potentials not only by addition but also by integration over weight functions.
■ Let a curl be given with density ρ(s) on a line segment l. Then we get a Cauchy-type integral (see
Figure 14.28
Figure 14.29
14.2.3, p. 687) for the derivative of the complex potential:
(14.31)
where ζ(s) is the complex parametric representation of the curve l with arc length s as its parameter.
2. Maxwell Diagonal Method If we want to compose the superposition of two fields with the potentials Φ₁ and Φ₂, then we draw the potential line figures [Φ₁] and [Φ₂] so that the value of the potential changes by the same amount h between two neighboring lines in both systems, and we direct the lines so that the higher Φ values are on the left-hand side. The lines lying in the diagonal direction to the net elements formed by [Φ₁] and [Φ₂] give the potential lines of the field whose potential is Φ = Φ₁ + Φ₂ or Φ = Φ₁ − Φ₂. We get the figure of [Φ₁ + Φ₂] if the oriented sides of the net elements are added as vectors (Fig. 14.30a), and we get the figure of [Φ₁ − Φ₂] when we subtract them (Fig. 14.30b). The value of the composed potential changes by h at the transition from one potential line to the next one.
■ Vector lines and potential lines of a source and a sink with an intensity ratio |ρ₁|/|ρ₂| = 3/2 (Fig. 14.31a,b).
Figure 14.30
14.1.3.6 Arbitrary Mappings of the Complex Plane A function
w = f(z = x + iy) = u(x, y) + i v(x, y)   (14.32a)
is defined if the two functions u = u(x, y) and v = v(x, y) of real variables are defined and known. The function f(z) need not be analytic, as was required for conformal mappings. The function w maps the z plane into the w plane, i.e., it assigns to every point zν a corresponding point wν.
a) Transformation of the Coordinate Lines x is a parameter: y = c → u = u(x, c), v = v(x, c);
Figure 14.31
y is a parameter: x = c₁ → u = u(c₁, y), v = v(c₁, y).   (14.32b)
b) Transformation of Geometric Figures Geometric figures such as curves or regions of the z plane are usually transformed into curves or regions of the w plane: x = x(t), y = y(t) → u = u(x(t), y(t)), v = v(x(t), y(t)), t is a parameter.   (14.32c)
■ For u = 2x + y, v = x + 2y, the lines y = c are transformed into u = 2x + c, v = x + 2c, hence into the lines v = u/2 + 3c/2. The lines x = c₁ are transformed into the lines v = 2u − 3c₁ (Fig. 14.5). The shaded region in Fig. 14.5a is transformed into the shaded region in Fig. 14.5b.
c) Riemann Surface If we get the same value w for several different z for the mapping w = f(z), then the image of the function consists of several planes "lying on each other". If we cut these planes and connect them along a curve, then we get a many-sheeted surface, the so-called many-sheeted Riemann surface (see [14.13]). This correspondence can also be considered in the inverse relation, in the case of multiple-valued functions such as, e.g., the functions √z, Ln z, Arcsin z, Arctan z.
Figure 14.32
■ w = z²: While z = re^(iφ) runs over the entire z plane, i.e., 0 ≤ φ < 2π, the values of w = ρe^(iψ) = r²e^(i2φ) cover the w plane twice. We can imagine that we place two w planes on each other, and we cut and connect them along the negative real axis according to Fig. 14.32. This surface is called the Riemann surface of the function w = z².
The zero point is called a branch point. The image of the function e^z (see (14.69)) is a Riemann surface with infinitely many sheets. (In many cases the planes are cut along the positive real axis. It depends on whether the principal value of the complex number is defined for the interval (−π, +π] or for the interval [0, 2π).)
14.2 Integration in the Complex Plane 14.2.1 Definite and Indefinite Integral 14.2.1.1 Definition of the Integral in the Complex Plane 1. Definite Complex Integral Suppose f ( z ) is continuous in a domain G, and the curve C is rectifiable, it connects the points A and B , and the whole curve is in this domain. We decompose the curve C between the points A and B by
choose a point ζν on every arc segment and form the sum
Σ_{ν=1}^{n} f(ζν) Δzν   with Δzν = zν − zν₋₁.   (14.33a)
If the limit of this sum exists for n → ∞ and max |Δzν| → 0, independently of the choice of the subdivision points zν and of the points ζν, then it is called the definite complex integral of f(z) along the curve C from A to B:
∫_(C) f(z) dz.   (14.34)
2. Indefinite Complex Integral
F(z) = ∫ f(z) dz + C   with F′(z) = f(z).   (14.35)
Here C is the integration constant which is complex, in general. The function F ( z ) is called an indefinite complex integral. The indefinite integrals of the elementary functions of a complex variable can be calculated with the same formulas as the integrals of the corresponding elementary function of a real variable.
■ A: ∫ sin z dz = −cos z + C.   ■ B: ∫ e^z dz = e^z + C.
3. Relation between Definite and Indefinite Complex Integrals If the function f(z) has an indefinite integral, then the relation between its definite and indefinite integral is
∫_{z_A}^{z_B} f(z) dz = F(z_B) − F(z_A).   (14.36)
14.2.1.2 Properties and Evaluation of Complex Integrals 1. Comparison with the curvilinear integral of the second type The definite complex integral has the same properties as the curvilinear integral of the second type (see 8.3.2. p. 462): a ) If we reverse the direction of the path of integration, then the integral changes its sign. b) If we decompose the path of integration into several parts, then the value of the total integral is the sum of the integrals on the parts.
2. Estimation of the Value of the Integral If the absolute value of the function f(z) does not exceed a positive number M at the points z of the path of integration AB, which has the length s, then:
|∫_(AB) f(z) dz| ≤ M s.   (14.37)
3. Evaluation of the Complex Integral in Parametric Representation If the path of integration AB (or the curve C) is given in the form
x = x(t), y = y(t)   (14.38)
and the t values for the initial point and endpoint are t_A and t_B, then the definite complex integral can be calculated with two real integrals. We separate the real and the imaginary parts of the integrand and get:
(C)∫_A^B f(z) dz = ∫_A^B (u dx − v dy) + i ∫_A^B (v dx + u dy)
   = ∫_{t_A}^{t_B} [u(t)x′(t) − v(t)y′(t)] dt + i ∫_{t_A}^{t_B} [v(t)x′(t) + u(t)y′(t)] dt   (14.39a)
with f(z) = u(x, y) + i v(x, y),   z = x + i y.   (14.39b)
The notation (C)∫_A^B f(z) dz means that the definite integral is calculated along the curve C between the points A and B. The notations ∫_{AB} f(z) dz and ∫_C f(z) dz are also often used with the same meaning.
■ I = ∮_C (z − z₀)^n dz (n ∈ ℤ). Let the curve C be a circle around the point z₀ with radius r₀:
x = x₀ + r₀ cos t, y = y₀ + r₀ sin t with 0 ≤ t ≤ 2π. For every point z of the curve C: z = x + iy = z₀ + r₀(cos t + i sin t), dz = r₀(−sin t + i cos t) dt. After substituting these values and transforming (cos nt + i sin nt)(−sin t + i cos t) dt according to the de Moivre formula we get:
I = r₀^{n+1} ∫₀^{2π} [i cos(n + 1)t − sin(n + 1)t] dt = 2πi for n = −1 and 0 for n ≠ −1.
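The same result can be checked by evaluating the contour integral numerically from the parametric representation (14.39a). The short sketch below is not part of the original text and assumes NumPy.

# I = integral of (z - z0)**n dz over the circle z = z0 + r0*exp(i*t):
# 2*pi*i for n = -1, zero for every other integer n.
import numpy as np

z0, r0 = 1.0 + 2.0j, 0.5
t = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
z = z0 + r0 * np.exp(1j * t)
dz = 1j * r0 * np.exp(1j * t)              # dz/dt on the circle

for n in (-2, -1, 0, 3):
    I = np.mean((z - z0) ** n * dz) * 2.0 * np.pi
    print(n, np.round(I, 10))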
4. Independence of the Path of Integration Suppose a function of a complex variable is defined in a simply connected domain. The integral (14.34) of the function can be independent of the path of integration which connects the fixed points A(z_A) and B(z_B). A sufficient and necessary condition for this is that the function is analytic in this domain, i.e., the Cauchy-Riemann differential equations (14.4) are satisfied. Then also the equality (14.36) is valid. If a domain is bounded by a simple closed curve, then the domain is simply connected.
5. Complex Integral along a Closed Curve Suppose f(z) is analytic in a simply connected domain. If we integrate the function f(z) along a closed curve C which is the boundary of this domain, the value of the integral, according to the Cauchy integral theorem, is equal to zero (see 14.2.2, p. 686):
∮_C f(z) dz = 0.   (14.40)
If f(z) has singular points in this domain, then the integral is calculated by using the residue theorem (see 14.3.5.5, p. 692), or by the formula (14.39a).
■ The function f(z) = 1/(z − a), with a singular point at z = a, has, for a closed curve C directed counterclockwise around a (Fig. 14.34), the integral value
∮_(C) dz/(z − a) = 2πi.
14.2.2 Cauchy Integral Theorem 14.2.2.1 Cauchy Integral Theorem for Simply Connected Domains If a function f ( z ) is analytic in a simply connected domain, then we get two equivalent statements: a) The integral is equal to zero along any closed curve C:
∮_C f(z) dz = 0.   (14.41)
b) The value of the integral ∫_A^B f(z) dz is independent of the curve connecting the points A and B, i.e., it depends only on A and B. This is the Cauchy integral theorem.
14.2.2.2 Cauchy Integral Theorem for Multiply Connected Domains If C, C₁, C₂, ..., Cₙ are simple closed curves such that the curve C encloses all the Cν (ν = 1, 2, ..., n), none of the curves Cν encloses or intersects another one, and furthermore the function f(z) is analytic in a domain G which contains the curves and the region between C and the Cν, i.e., at least the region shaded in Fig. 14.35, then
∮_(C) f(z) dz = ∮_(C₁) f(z) dz + ∮_(C₂) f(z) dz + ... + ∮_(Cₙ) f(z) dz,   (14.42)
if the curves C, C₁, ..., Cₙ are oriented in the same direction, e.g., counterclockwise. This theorem is useful for the calculation of an integral along a closed curve C, if it also encloses singular points of the function f(z) (see 14.3.5.5, p. 692).
Figure 14.34
Figure 14.35
■ Calculate the integral
Figure 14.36
∮_(C) [(z − 1)/(z(z + 1))] dz, where C is a curve enclosing the origin and the point z = −1 (Fig. 14.36). Applying the Cauchy integral theorem, the integral along C is equal to the sum of the integrals along C₁ and C₂, where C₁ is a circle around the origin with radius r₁ = 1/2 and C₂ is a circle around the point z = −1 with radius r₂ = 1/2. The integrand can be decomposed into partial fractions:
(z − 1)/(z(z + 1)) = 2/(z + 1) − 1/z.
(Compare the integrals with the example in 14.2.1.2, 3., p. 685.) Then we get:
∮_(C) [(z − 1)/(z(z + 1))] dz = 2∮_(C₁) dz/(z + 1) + 2∮_(C₂) dz/(z + 1) − ∮_(C₁) dz/z − ∮_(C₂) dz/z = 0 + 4πi − 2πi − 0 = 2πi.
14.2.3 Cauchy Integral Formulas
14.2.3.1 Analytic Function on the Interior of a Domain If f(z) is analytic on a simple closed curve C and on the simply connected domain inside it, then the following representation is valid for every interior point z of this domain (Fig. 14.37):
f(z) = (1/(2πi)) ∮_C f(ζ)/(ζ − z) dζ   (Cauchy integral formula),   (14.43)
where ζ traces the curve C counterclockwise. With this formula, the values of an analytic function in the interior of a domain are expressed by the values of the function on the boundary of this domain. The existence and the integral representation of the n-th derivative of the function analytic on the domain G follow from (14.43):
f^(n)(z) = (n!/(2πi)) ∮_C f(ζ)/(ζ − z)^(n+1) dζ.   (14.44)
Consequently, if a complex function is differentiable, i.e., it is analytic, then it is differentiable arbitrarily many times. In contrast to this, differentiability does not imply repeated differentiability in the real case. The equations (14.43) and (14.44) are called the Cauchy integral formulas.
14.2.3.2 Analytic Function on the Exterior of a Domain If a function f(z) is analytic on the entire part of the plane outside of a closed curve of integration C, then the values and the derivatives of the function f(z) at a point z of this domain can be given with the same Cauchy formulas (14.43), (14.44), but the orientation of the curve C is now clockwise (Fig. 14.38). Also certain real integrals can be calculated with the help of the Cauchy integral formulas (see 14.4, p. 692).
Figure 14.37
Figure 14.38
14.3 Power Series Expansion of Analytic Functions
14.3.1 Convergence of Series with Complex Terms
14.3.1.1 Convergence of a Number Sequence with Complex Terms An infinite sequence of complex numbers z₁, z₂, ..., zₙ, ... has a limit z (z = lim_{n→∞} zₙ) if for an arbitrarily given positive ε there exists an n₀ such that the inequality |z − zₙ| < ε holds for every n > n₀, i.e., from a certain n₀ on, the points representing the numbers zₙ, zₙ₊₁, ... lie inside a circle with radius ε and center at z.
■ If the expression {ⁿ√a} means the root with the smallest non-negative argument, then the limit lim_{n→∞} {ⁿ√a} = 1 is valid for arbitrary a (Fig. 14.39).
Figure 14.39
Figure 14.40
14.3.1.2 Convergence of an Infinite Series with Complex Terms A series a₁ + a₂ + ... + aₙ + ... with complex terms aₙ converges to the number s if
s = lim_{n→∞} (a₁ + a₂ + ... + aₙ)   (14.45)
holds. The number s is the sum of the series. If we connect the points corresponding to the numbers sₙ = a₁ + a₂ + ... + aₙ in the z plane by a broken line, then convergence means that the end of the broken line approaches the point s.
■ A: i + i²/2 + i³/3 + i⁴/4 + ....   ■ B: i + i²/2 + i³/2² + ... (Fig. 14.40).
A series is called absolutely convergent (see B) if the series of the absolute values of the terms |a₁| + |a₂| + |a₃| + ... is also convergent. The series is called conditionally convergent (see A) if the series is convergent but not absolutely convergent. If the terms of a series are functions fₙ(z), like
f₁(z) + f₂(z) + ... + fₙ(z) + ...,   (14.46)
then its sum is a function defined for the values z for which the series of the substituted values is convergent.
14.3.1.3 Power Series with Complex Terms 1. Convergence A power series with complex coefficients has the form
P(z − z₀) = a₀ + a₁(z − z₀) + a₂(z − z₀)² + ... + aₙ(z − z₀)ⁿ + ...,   (14.47a)
where z₀ is a fixed point in the complex plane and the coefficients aₙ are complex constants (which can also have real values). For z₀ = 0 the power series has the form
P(z) = a₀ + a₁z + a₂z² + ... + aₙzⁿ + ....   (14.47b)
If the power series P(z − z₀) is convergent for a value z₁, then it is absolutely and uniformly convergent for every z in the interior of the circle with radius r = |z₁ − z₀| and center at z₀.
2. Circle of Convergence The boundary between the domain of convergence and the domain of divergence of a complex power series is a uniquely defined circle, the circle of convergence. We determine its radius r just as in the real case, if the limits
r = lim_{n→∞} |aₙ/aₙ₊₁|   or   1/r = lim_{n→∞} ⁿ√|aₙ|   (14.48)
exist. If the series is divergent everywhere except at z = z₀, then r = 0; if it is convergent everywhere, then r = ∞. The behavior of the power series on the boundary circle of the domain of convergence should be investigated point by point.
■ The power series P(z) = Σ_{n=1}^{∞} zⁿ/n with radius of convergence r = 1 is divergent for z = 1 (harmonic
series) and convergent for z = −1 (according to the Leibniz criterion for alternating series). This power series is convergent at all further points of the unit circle |z| = 1 except the point z = 1.
3. Derivative of Power Series in the Circle of Convergence Every power series represents an analytic function f(z) inside of the circle of convergence. We get the derivative by a term-by-term differentiation. The derivative series has the same radius of convergence as the original one.
4. Integral of Power Series in the Circle of Convergence We get the power series expansion of the integral ∫_{z₀}^{z} f(ζ) dζ by a term-by-term integration of the power series of f(z). The radius of convergence remains the same.
14.3.2 Taylor Series Every function f(z) analytic in a domain G can be expanded uniquely into a power series of the form
f(z) = Σ_{n=0}^{∞} aₙ(z − z₀)ⁿ   (Taylor series),   (14.49a)
for any z₀ in G, where the circle of convergence is the greatest circle around z₀ which belongs entirely to the domain G (Fig. 14.41). The coefficients aₙ are complex numbers in general, and for them we get:
aₙ = f^(n)(z₀)/n!.   (14.49b)
The Taylor series can be written in the form
f(z) = f(z₀) + (f′(z₀)/1!)(z − z₀) + (f″(z₀)/2!)(z − z₀)² + ... + (f^(n)(z₀)/n!)(z − z₀)ⁿ + ....   (14.49c)
Every power series is the Taylor expansion of its sum function in the interior of its circle of convergence.
■ Examples of Taylor expansions are the series representations of the functions e^z, sin z, cos z, sinh z, and cosh z in 14.5.2, p. 696.
Figure 14.41
Figure 14.43
Figure 14.42
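A symbolic sketch of (14.49a)-(14.49c), assuming SymPy (not part of the original text): the expansion of e^z around z₀ = 1 is generated and one coefficient is compared with f^(n)(z₀)/n!.

# Taylor expansion around z0 = 1 and check of the coefficient formula (14.49b).
import sympy as sp

z = sp.symbols('z')
f = sp.exp(z)
print(f.series(z, 1, 5))                          # expansion up to (z - 1)**4
coeff3 = sp.diff(f, z, 3).subs(z, 1) / sp.factorial(3)
print(sp.simplify(coeff3 - sp.exp(1) / 6))        # 0: a_3 = f'''(1)/3!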
14.3.3 Principle of Analytic Continuation We consider the case when the circles of convergence K₀ around z₀ and K₁ around z₁ of the two power series
f₀(z) = Σ_{n=0}^{∞} aₙ(z − z₀)ⁿ   and   f₁(z) = Σ_{n=0}^{∞} bₙ(z − z₁)ⁿ   (14.50a)
have a certain common domain (Fig. 14.42) and in this domain they are equal:
f₀(z) = f₁(z).   (14.50b)
Then both power series are the Taylor expansions of the same analytic function f(z), belonging to the points z₀ and z₁. The function f₁(z) is called the analytic continuation into K₁ of the function f₀(z) defined only in K₀.
■ The geometric series f₀(z) = Σ_{n=0}^{∞} zⁿ with the circle of convergence K₀ (r₀ = 1) around z₀ = 0 and f₁(z) = Σ_{n=0}^{∞} (1/(1 − i)) ((z − i)/(1 − i))ⁿ with the circle of convergence K₁ (r₁ = √2) around z₁ = i have the same analytic function f(z) = 1/(1 − z) as their sum in their own circles of convergence, consequently also on their common part (doubly shaded region in Fig. 14.42) for z ≠ 1. So, f₁(z) is the analytic continuation of f₀(z) from K₀ into K₁ (and conversely).
14.3.4 Laurent Expansion Every function f(z) which is analytic in the interior of a circular ring between two concentric circles with center z₀ and radii r₁ and r₂ can be expanded into a generalized power series, the so-called Laurent series:
f(z) = Σ_{n=−∞}^{∞} aₙ(z − z₀)ⁿ = ... + a₋ₖ/(z − z₀)ᵏ + a₋ₖ₊₁/(z − z₀)^(k−1) + ... + a₋₁/(z − z₀) + a₀ + a₁(z − z₀) + a₂(z − z₀)² + ... + aₖ(z − z₀)ᵏ + ....   (14.51a)
The coefficients aₙ are usually complex and they are uniquely defined by the formula
aₙ = (1/(2πi)) ∮_C f(ζ)/(ζ − z₀)^(n+1) dζ,   (14.51b)
where C denotes an arbitrary closed curve which lies in the circular ring, encloses the circle with radius r₁, and is oriented counterclockwise (Fig. 14.43). If the domain G of the function f(z) is larger than the circular ring, then the domain of convergence of the Laurent series is the largest circular ring with center z₀ lying entirely in G.
■ Determine the Laurent series expansion of the function f(z) = 1/((z − 1)(z − 2)) around z₀ = 0 in the circular ring 1 < |z| < 2, where f(z) is analytic. First we decompose the function f(z) into partial fractions: f(z) = 1/(z − 2) − 1/(z − 1). Since |1/z| < 1 and |z/2| < 1 hold in the considered domain, the two terms of this decomposition can be written as sums of geometric series absolutely convergent in the entire circular ring 1 < |z| < 2. We get:
f(z) = −Σ_{n=1}^{∞} 1/zⁿ − Σ_{n=0}^{∞} zⁿ/2^(n+1).
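The coefficient formula (14.51b) can be checked numerically for this example. The sketch below (not part of the original text; it assumes NumPy) evaluates the contour integral on the circle |z| = 1.5 and reproduces aₙ = −1 for n ≤ −1 and aₙ = −1/2^(n+1) for n ≥ 0.

# Laurent coefficients of f(z) = 1/((z-1)(z-2)) in the ring 1 < |z| < 2.
import numpy as np

def laurent_coeff(f, n, radius=1.5, m=4000):
    t = np.linspace(0.0, 2.0 * np.pi, m, endpoint=False)
    z = radius * np.exp(1j * t)
    dz = 1j * z                                   # dz/dt on the circle
    integral = np.mean(f(z) / z**(n + 1) * dz) * 2.0 * np.pi
    return integral / (2.0j * np.pi)

f = lambda z: 1.0 / ((z - 1.0) * (z - 2.0))
for n in (-3, -1, 0, 2):
    print(n, laurent_coeff(f, n).real)            # -1, -1, -0.5, -0.125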
14.3.5 Isolated Singular Points and the Residue Theorem
14.3.5.1 Isolated Singular Points If a function f(z) is analytic in the neighborhood of a point z₀ but not at the point z₀ itself, then z₀ is called an isolated singular point of the function f(z). If f(z) can be expanded into a Laurent series in the neighborhood of z₀,
f(z) = Σ_{n=−∞}^{∞} aₙ(z − z₀)ⁿ,   (14.52)
then the isolated singular point can be classified by the behavior of the Laurent series:
1. If the Laurent series does not contain any term with a negative power of (z − z₀), i.e., aₙ = 0 for n < 0 holds, then the Laurent series is a Taylor series with coefficients given by the Cauchy integral formula, aₙ = f^(n)(z₀)/n!.   (14.53) In this case, the function f(z) itself is either analytic at the point z₀ with f(z₀) = a₀, or z₀ is a removable singularity.
2. If the Laurent series contains only a finite number of terms with negative powers of (z − z₀), i.e., a₋ₘ ≠ 0 and aₙ = 0 for n < −m (m > 0), then z₀ is called a pole, a pole of order m, or a pole of multiplicity m. If we multiply f(z) by (z − z₀)ᵐ, and not by any lower power, then f(z) is transformed into a function which is analytic at z₀ and in its neighborhood.
■ f(z) = ½(z + 1/z) has a pole of order one at z = 0.
3. If the Laurent series contains an infinite number of terms with negative powers of (z − z₀), then z₀ is an essential singularity of the function f(z). Approaching a pole, |f(z)| tends to ∞. Approaching an essential singularity, f(z) gets arbitrarily close to any complex number c.
■ The function f(z) = e^(1/z), whose Laurent series is f(z) = Σ_{n=0}^{∞} 1/(n! zⁿ), has an essential singularity at z = 0.
14.3.5.2 Meromorphic Functions If an otherwise holomorphic function has only a finite number of poles as singular points, then it is called meromorphic. A meromorphic function can always be represented as the quotient of two analytic functions. IExamples of functions meromorphic on the whole plane are the rational functions which have a finite number of poles, and also transcendental functions such as tan z and cot z .
14.3.5.3 Elliptic Functions Elliptic functions are doubly periodic functions whose singularities are poles, i.e., they are meromorphic functions with two independent periods (see 14.6, p. 700). If the two periods are ω₁ and ω₂, whose ratio is non-real, then
f(z + mω₁ + nω₂) = f(z)   (m, n = 0, ±1, ±2, ...; Im(ω₁/ω₂) ≠ 0).   (14.54)
The range of f(z) is already attained in a primitive period parallelogram with the vertices 0, ω₁, ω₁ + ω₂, ω₂.
Am-l
( 14.55b)
692
14. Function Theorv
If the function can be represented as a quotient f ( z ) = cp(z)/$(z),where the functions cp(z) and $ ( z ) are analytic at the point z = zo and zo is a simple root of the equation $ ( z ) = 0, Le., $(zo) = 0, ~ ' ( 2 0 ) # 0 holds, then the point z = zo is a pole of order one of the function f ( z ) . It follows from (14.55b) that (14.55~)
) . . . = $("'-')(z~) If zo is a root of multiplicity m of the equation $ ( z ) = 0, Le., $(zo) = ~ ' ( z o= = 0,d * ) ( z o ) # 0 holds, then the point z = 20 is a pole of order m of f ( z ) .
14.3.5.5 Residue Theorem With the help of residues we can calculate the integral of a function along a closed curve enclosing isolated singular points (Fig. 14.44). If the function f ( z ) is single valued and analytic in a simply connected domain G except at a finite number of points 20, zll 2 2 , . , , zn, and the domain is bounded by the closed curve C, then the value of the integral of the function along this closed curve in a counterclockwise direction is the product of 2 ~ i and the sum of the residues in all these singular points: ~
f f ( z ) d z = 2nif:Resf(z) lz=zk.
(14.56)
b=O
(K)
IThe function f ( z ) = e z / ( z 2+ 1) has poles of order one at z1,2 = f i . The corresponding residues have the sum sin 1. If K is a circle around the origin with radius r
> 1, then
f
z2: dz = P ~ i s i n1.
(K)
Figure 14.44
Figure 14.45
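The residue sum of the example above can be confirmed symbolically. The short sketch below is not part of the original text and assumes SymPy.

# Residues of e**z/(z**2 + 1) at z = i and z = -i; their sum is sin(1),
# so the closed integral equals 2*pi*i*sin(1) by (14.56).
import sympy as sp

z = sp.symbols('z')
f = sp.exp(z) / (z**2 + 1)
r_plus = sp.residue(f, z, sp.I)
r_minus = sp.residue(f, z, -sp.I)
print(sp.simplify(r_plus + r_minus - sp.sin(1)))           # 0
print(sp.simplify(2 * sp.pi * sp.I * (r_plus + r_minus)))  # 2*I*pi*sin(1)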
14.4 Evaluation of Real Integrals by Complex Integrals
14.4.1 Application of Cauchy Integral Formulas The value of certain real integrals can be calculated with the help of the Cauchy integral formula.
■ The function f(z) = e^z, which is analytic in the whole z plane, can be represented with the Cauchy integral formula (14.43), where the path of integration C is a circle with center z and radius r. The equation of the circle is ζ = z + re^(iφ). We get from (14.43):
Since the imaginary part is equal to zero, we get
∫₀^{2π} e^(r cos φ) cos(r sin φ − nφ) dφ = 2π rⁿ/n!.
14.4.2 Application of the Residue Theorem Several definite integrals of real functions of one variable can be calculated with the help of the residue theorem. If f(z) is a function which is analytic in the whole upper half of the complex plane including the real axis, except at the singular points z₁, z₂, ..., zₙ above the real axis (Fig. 14.45), and if z = 0 is a root of multiplicity m ≥ 2 of the equation f(1/z) = 0 (see 1.6.3.1, 1., p. 43), then
∫_{−∞}^{+∞} f(x) dx = 2πi Σ_{ν=1}^{n} Res f(z)|_{z=zν}.   (14.57)
■ Calculation of the integral ∫_{−∞}^{+∞} dx/(1 + x²)³. The equation f(1/x) = x⁶/(1 + x²)³ = 0 has a root of order six at x = 0. The function w = 1/(1 + z²)³ has a single singular point z = i in the upper half-plane, which is a pole of order 3, since the equation (1 + z²)³ = 0 has two triple roots at i and −i. The residue is, according to (14.55b):
Res f(z)|_{z=i} = (1/2!) d²/dz² [(z + i)⁻³]|_{z=i} = 6(z + i)⁻⁵|_{z=i} = 6/(2i)⁵ = −(3/16) i,
and with (14.57): ∫_{−∞}^{+∞} f(x) dx = 2πi · (−(3/16) i) = (3/8)π. For further applications of residue theory see, e.g., [14.12].
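As a quick cross-check of the value 3π/8 obtained from the residue, a direct numerical quadrature can be used (sketch not from the original text; it assumes SciPy).

# Numerical confirmation of the integral of 1/(1 + x**2)**3 over the real line.
import numpy as np
from scipy.integrate import quad

val, err = quad(lambda x: 1.0 / (1.0 + x**2)**3, -np.inf, np.inf)
print(val, 3.0 * np.pi / 8.0)      # both ~1.178097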
14.4.3 Application of the Jordan Lemma
14.4.3.1 Jordan Lemma In many cases, real improper integrals with an infinite domain of integration can be calculated by complex integrals along a closed curve. To avoid the constantly recurring estimations, we use the Jordan lemma about improper integrals of the form
∫_(C_R) f(z) e^(iαz) dz,   (14.58a)
where C_R is the half-circle arc with center at the origin and with radius R in the upper half of the z plane (Fig. 14.46). The Jordan lemma distinguishes the following cases:
a) α > 0: If f(z) tends to zero uniformly in the upper half-plane and also on the real axis for |z| → ∞ and α is a positive number, then for R → ∞
∫_(C_R) f(z) e^(iαz) dz → 0.   (14.58b)
b) α = 0: If the expression z f(z) tends to zero uniformly for |z| → ∞, then the above statement is also valid in the case α = 0.
c) α < 0: If the half-circle is now below the real axis, then the corresponding statement is also valid for α < 0.
d) The statement is also valid if only an arc segment is considered instead of the complete half-circle.
e) The corresponding statement is valid for the integral of the form
∫_(C_R′) f(z) e^(αz) dz,   (14.58c)
where C_R′ is a half-circle or an arc segment in the left half-plane for α > 0, or in the right one for α < 0.
Figure 14.46
Figure 14.47
Figure 14.48
14.4.3.2 Examples of the Jordan Lemma
1. Evaluation of the Integral ∫₀^∞ [x sin(αx)/(x² + a²)] dx. The following complex integral is assigned to the above real integral:
2i ∫₀^R [x sin(αx)/(x² + a²)] dx = i ∫_{−R}^{R} [x sin(αx)/(x² + a²)] dx   (even function)
   + ∫_{−R}^{R} [x cos(αx)/(x² + a²)] dx   (= 0, odd integrand)
   = ∫_{−R}^{R} [x e^(iαx)/(x² + a²)] dx.
The very last of these integrals is part of the complex integral I = ∮_(C) [z e^(iαz)/(z² + a²)] dz. The curve C contains the half-circle C_R defined above and the part of the real axis between the values −R and R (R > |a|). The complex integrand has its only singular point of the upper half-plane at z = ai. We obtain from the residue theorem:
I = 2πi lim_{z→ai} [z e^(iαz)(z − ai)/(z² + a²)] = 2πi lim_{z→ai} [z e^(iαz)/(z + ai)] = πi e^(−αa),   hence
I = ∫_(C_R) [z e^(iαz)/(z² + a²)] dz + ∫_{−R}^{R} [x e^(iαx)/(x² + a²)] dx = πi e^(−αa).
It follows for R → ∞ from the Jordan lemma that
∫₀^∞ [x sin(αx)/(x² + a²)] dx = (π/2) e^(−αa)   (α > 0, a ≥ 0).
Several further integrals can be evaluated in a similar way (see Table 21.6, p. 1050). 2. Sine Integral (see also (8.95), p. 458)
The integral ∫₀^∞ (sin x/x) dx is called the sine integral or the integral sine (see also 8.2.5, 1., p. 458). Analogously to the previous example, we investigate the complex integral ∮_(C) (e^(iz)/z) dz with the curve C according to Fig. 14.47. The integrand of the complex integral has a pole of first order at z = 0, so
I = 2i ∫_r^R (sin x/x) dx + i ∫ e^(ir(cos φ + i sin φ)) dφ + ∫_(C_R) (e^(iz)/z) dz = 2πi,
where the second integral is taken over the small arc of radius r around the origin. This limit is evaluated as R → ∞, r → 0, where the integrand of the second integral tends to 1 uniformly for r → 0 with respect to φ, i.e., the limiting process r → 0 can be carried out behind the integral sign. Then we get with the Jordan lemma:
2i ∫₀^∞ (sin x/x) dx + πi = 2πi,   hence   ∫₀^∞ (sin x/x) dx = π/2.   (14.59)
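The value (14.59) can be compared with a numerical evaluation of the sine integral; the sketch below is not from the original text and assumes SciPy, whose sici routine returns Si(x) = ∫₀^x (sin t/t) dt.

# Si(x) approaches pi/2 for large x, in agreement with (14.59).
import numpy as np
from scipy.special import sici

si, ci = sici(1e4)
print(si, np.pi / 2.0)     # both ~1.5708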
3. Step Function Discontinuous real functions can be represented as complex integrals. The so-called step function is an example:
F(t) = (1/(2πi)) ∫_→ (e^(itz)/z) dz = 1 for t > 0, 0 for t < 0.   (14.60)
The symbol → denotes a path of integration along the real axis (|R| → ∞) going round the origin (Fig. 14.47). If t denotes time, then the function Φ(t) = c F(t − t₀) represents a quantity which jumps at time t = t₀ from 0 through the value c/2 to the value c. We call it a step function or also a Heaviside function. It is used in electrotechnics to describe suddenly occurring voltage or current impulses.
4. Rectangular Pulse A further example of the application of complex integrals and the Jordan lemma is the representation of the rectangular pulse:
P(t) = (1/(2πi)) ∫_→ (e^(i(b−t)z)/z) dz − (1/(2πi)) ∫_→ (e^(i(a−t)z)/z) dz = 0 for t < a and t > b, 1 for a < t < b, 1/2 for t = a and t = b.   (14.61)
5. Fresnel Integrals To derive the Fresnel integrals
∫₀^∞ sin(x²) dx = ∫₀^∞ cos(x²) dx = (1/2)√(π/2),   (14.62)
we investigate the integral I = ∮_K e^(−z²) dz on the closed path of integration shown in Fig. 14.48. We get, according to the Cauchy integral theorem: I = I₁ + I₂ + I₃ = 0 with
I₁ = ∫₀^R e^(−x²) dx,
I₂ = iR ∫₀^{π/4} e^(−R²(cos 2φ + i sin 2φ) + iφ) dφ,
I₃ = e^(iπ/4) ∫_R^0 e^(−ir²) dr = −(√2/2)(1 + i) [∫₀^R cos(r²) dr − i ∫₀^R sin(r²) dr].
Estimation of I₂: Since |i| = |e^(iφ)| = 1 (φ real) holds, we get |I₂| ≤ R ∫₀^{π/4} e^(−R² cos 2φ) dφ, and this bound tends to zero for R → ∞. Performing the limiting process lim_{R→∞} I we get the values of the integrals I₁ and I₂: lim_{R→∞} I₁ = (1/2)√π, lim_{R→∞} I₂ = 0. We get the given formulas (14.62) by separating the real and imaginary parts.
14.5 Algebraic and Elementary Transcendental Functions
14.5.1 Algebraic Functions 1. Definition A function which is the result of finitely many algebraic operations performed with z and possibly with finitely many constants is called an algebraic function. In general, a complex algebraic function w(z) can be defined in an implicit way by a polynomial equation, just as its real analogue:
a₁ z^(m₁) w^(n₁) + a₂ z^(m₂) w^(n₂) + ... + a_k z^(m_k) w^(n_k) = 0.   (14.63)
Such functions cannot always be solved for w.
2. Examples of Algebraic Functions
Linear function: w = az + b.   (14.64)
Quadratic function: w = z².   (14.65)
Inverse function: w = 1/z.   (14.66)
Square root function: w = √z.   (14.67)
Fractional linear function: w = (z + i)/(z − 1).   (14.68)
14.5.2 Elementary Transcendental Functions The complex transcendental functions have definitions corresponding to the transcendental real functions, just as in the case of the algebraic functions. For a detailed discussion of them see, e.g., [21.1] or [21.10].
1. Natural Exponential Function
e^z = 1 + z/1! + z²/2! + z³/3! + ....   (14.69)
The series is absolutely convergent in the whole z plane.
a) Pure imaginary exponent iy: The Euler relation (see 1.5.2.4, p. 35) is valid:
e^(iy) = cos y + i sin y   with e^(πi) = −1.   (14.70)
b) General case z = x + iy:
e^z = e^(x+iy) = e^x e^(iy) = e^x (cos y + i sin y),   (14.71a)
Re(e^z) = e^x cos y,   Im(e^z) = e^x sin y,   |e^z| = e^x,   arg(e^z) = y.   (14.71b)
The function e^z is periodic, its period is 2πi:
e^z = e^(z+2kπi)   (k = 0, ±1, ±2, ...).   (14.71c)
In particular: e^0 = e^(2kπi) = 1,   e^((2k+1)πi) = −1.   (14.71d)
c) Exponential form of a complex number (see 1.5.2.4, p. 35): a + ib = ρe^(iφ).   (14.72)
d) Euler relation for complex numbers: e^(iz) = cos z + i sin z,   e^(−iz) = cos z − i sin z.   (14.73a,b)
2. Natural Logarithm w = Ln z, if z = e^w.   (14.74a)
Since z = ρe^(iφ), we can write: Ln z = ln ρ + i(φ + 2kπ)   (14.74b)   and Re(Ln z) = ln ρ, Im(Ln z) = φ + 2kπ (k = 0, ±1, ±2, ...).   (14.74c)
Since Ln z is a multiple-valued function (see 2.8.2, p. 84), we usually give only the principal value of the logarithm ln z:
ln z = ln ρ + iφ   (−π < φ ≤ +π).
3. General Exponential Function
a^z = e^(z Ln a)   (a ≠ 0)   (14.75a)
is a multiple-valued function (see 2.8.2, p. 84) with principal value a^z = e^(z ln a).   (14.75b)
4. Trigonometric Functions and Hyperbolic Functions
sin z = z − z³/3! + z⁵/5! − ... = (e^(iz) − e^(−iz))/(2i),   (14.76a)
cos z = 1 − z²/2! + z⁴/4! − ... = (e^(iz) + e^(−iz))/2,   (14.76b)
sinh z = z + z³/3! + z⁵/5! + ... = (e^z − e^(−z))/2,   (14.77a)
cosh z = (e^z + e^(−z))/2 = 1 + z²/2! + z⁴/4! + ....   (14.77b)
All four series are convergent on the entire plane and they are all periodic. The period of the functions (14.76a,b) is 2π, the period of the functions (14.77a,b) is 2πi. The relations between these functions for any real or complex z are:
sin iz = i sinh z,   (14.78a)   cos iz = cosh z,   (14.78b)
sinh iz = i sin z,   (14.79a)   cosh iz = cos z.   (14.79b)
The transformation formulas of the real trigonometric and hyperbolic functions (see 2.7.2, p. 79, and 2.9.3, p. 89) are also valid for the complex functions. We calculate the values of the functions sin z, cos z, sinh z, and cosh z for the argument z = x + iy with the help of the formulas for sin(a + b), cos(a + b), sinh(a + b), and cosh(a + b), or by using the Euler relation (see 1.5.2.4, p. 35).
■ cos(x + iy) = cos x cos iy − sin x sin iy = cos x cosh y − i sin x sinh y.   (14.80)
Therefore,
Re(cos z) = cos Re(z) cosh Im(z),   (14.81a)   Im(cos z) = −sin Re(z) sinh Im(z).   (14.81b)
The functions tan z, cot z, tanh z, and coth z are defined by the following formulas:
tan z = sin z/cos z,   cot z = cos z/sin z,   (14.82a)   tanh z = sinh z/cosh z,   coth z = cosh z/sinh z.   (14.82b)
5. Inverse Trigonometric Functions and Inverse Hyperbolic Functions These functions are many-valued functions, and we can express them with the help of the logarithm function:
Arcsin z = −i Ln(iz + √(1 − z²)),   (14.83a)   Arsinh z = Ln(z + √(z² + 1)),   (14.83b)
Arccos z = −i Ln(z + √(z² − 1)),   (14.84a)   Arcosh z = Ln(z + √(z² − 1)),   (14.84b)
Arctan z = (1/(2i)) Ln((1 + iz)/(1 − iz)),   (14.85a)   Artanh z = (1/2) Ln((1 + z)/(1 − z)),   (14.85b)
Arccot z = −(1/(2i)) Ln((iz + 1)/(iz − 1)),   (14.86a)   Arcoth z = (1/2) Ln((z + 1)/(z − 1)).   (14.86b)
The principal values of the inverse trigonometric and the inverse hyperbolic functions can be expressed by the same formulas using the principal value of the logarithm ln z:
arcsin z = −i ln(iz + √(1 − z²)),   (14.87a)   arsinh z = ln(z + √(z² + 1)),   (14.87b)
arccos z = −i ln(z + √(z² − 1)),   (14.88a)   arcosh z = ln(z + √(z² − 1)),   (14.88b)
arctan z = (1/(2i)) ln((1 + iz)/(1 − iz)),   (14.89a)   artanh z = (1/2) ln((1 + z)/(1 − z)),   (14.89b)
arccot z = −(1/(2i)) ln((iz + 1)/(iz − 1)),   (14.90a)   arcoth z = (1/2) ln((z + 1)/(z − 1)).   (14.90b)
6. Real and Imaginary Part of the Trigonometric and Hyperbolic Functions (See Table 14.1)
7. Absolute Values and Arguments of the Trigonometric and Hyperbolic Functions (See Table 14.2)

Table 14.1 Real and imaginary parts of the trigonometric and hyperbolic functions
Function w = f(x ± iy) | Real part Re(w)                    | Imaginary part Im(w)
sin(x ± iy)            | sin x cosh y                       | ±cos x sinh y
cos(x ± iy)            | cos x cosh y                       | ∓sin x sinh y
tan(x ± iy)            | sin 2x / (cos 2x + cosh 2y)        | ±sinh 2y / (cos 2x + cosh 2y)
sinh(x ± iy)           | sinh x cos y                       | ±cosh x sin y
cosh(x ± iy)           | cosh x cos y                       | ±sinh x sin y
tanh(x ± iy)           | sinh 2x / (cosh 2x + cos 2y)       | ±sin 2y / (cosh 2x + cos 2y)

Table 14.2 Absolute values and arguments of the trigonometric and hyperbolic functions
Function      | Absolute value |w|        | Argument arg w
sin(x ± iy)   | √(sin²x + sinh²y)         | ±arctan(cot x tanh y)
cos(x ± iy)   | √(cos²x + sinh²y)         | ∓arctan(tan x tanh y)
sinh(x ± iy)  | √(sinh²x + sin²y)         | ±arctan(coth x tan y)
cosh(x ± iy)  | √(sinh²x + cos²y)         | ±arctan(tanh x tan y)
14.5.3 Description of Curves in Complex Form A complex function of one real variable t can be represented in parametric form:
z = x(t) + iy(t) = f(t).   (14.91)
As t changes, the points z trace out a curve z(t). In the following, we give the equations and the corresponding graphical representations of the line, circle, ellipse, hyperbola, and logarithmic spiral.
1. Straight Line a) Line through a point z₁ with direction angle φ (φ is the angle with the x-axis, Fig. 14.49a): z = z₁ + t e^(iφ).   (14.92a) b) Line through two points z₁, z₂ (Fig. 14.49b): z = z₁ + t(z₂ − z₁).   (14.92b)
2. Circle a) Circle with radius r, center at the point z₀ = 0 (Fig. 14.50a): z = r e^(it) (|z| = r).   (14.93a) b) Circle with radius r, center at the point z₀ (Fig. 14.50b): z = z₀ + r e^(it) (|z − z₀| = r).   (14.93b)
Figure 14.49
Figure 14.50
3. Ellipse a) Ellipse, Normal Form x²/a² + y²/b² = 1 (Fig. 14.51a):
z = a cos t + i b sin t   (14.94a)   or   z = c e^(it) + d e^(−it)   (14.94b)   with c = (a + b)/2, d = (a − b)/2,   (14.94c)
i.e., c and d are arbitrary real numbers. b) Ellipse, General Form (Fig. 14.51b): The center is at z₁, the axes are rotated by an angle:
z = z₁ + c e^(it) + d e^(−it).   (14.95)
Here c and d are arbitrary complex numbers; they determine the lengths of the axes of the ellipse and the angle of rotation.
4. Hyperbola, Normal Form x²/a² − y²/b² = 1 (Fig. 14.52):
z = a cosh t + i b sinh t   (14.96a)   or   z = c e^t + c̄ e^(−t),   (14.96b)
where c and c̄ are conjugate complex numbers: c = (a + ib)/2, c̄ = (a − ib)/2.   (14.96c)
Figure 14.51
Figure 14.52
5. Logarithmic Spiral (Fig. 14.53):
z = a e^(bt),   (14.97)
where a and b are arbitrary complex numbers.
14.6 Elliptic Functions
14.6.1 Relation to Elliptic Integrals
Figure 14.53
Integrals of the form (8.21) with integrands R(x, √P(x)) cannot be integrated in closed form if P(x) is a polynomial of degree three or four, except in some special cases, but they are calculated numerically as elliptic integrals (see 8.1.4.3, p. 435). The inverse functions of elliptic integrals are the elliptic functions. They are similar to the trigonometric functions and can be considered as their generalization. As an illustration, let us consider the special case
∫₀^u (1 − t²)^(−1/2) dt = x   (|u| ≤ 1).   (14.98)
We can tell that a) there is a relation between the trigonometric function u = sin x and the principal value of its inverse function,
u = sin x ⇔ x = arcsin u   for −1 ≤ u ≤ 1, −π/2 ≤ x ≤ π/2;   (14.99)
b) the integral (14.98) is equal to arcsin u. The sine function can be considered as the inverse function of the integral (14.98). Analogies are valid for the elliptic integrals.
■ The period of a mathematical pendulum with mass m, hanging on a non-elastic weightless thread of length l (Fig. 14.54), can be calculated from a second-order non-linear differential equation. We get this equation from the balance of the forces acting on the mass of the pendulum:
l θ̈ = −g sin θ.   (14.100a)
The relation between the length l and the arc length s measured from the rest position is s = lθ, so ṡ = lθ̇ and s̈ = lθ̈ hold. The force acting on the mass is F = mg, where g is the acceleration due to gravity, and it is decomposed into a normal component F_N and a tangential component F_T with respect to its path (Fig. 14.54). The normal component F_N = mg cos θ is balanced by the thread tension. Since it is perpendicular to the direction of motion, it has no effect on the equation of motion. The tangential component F_T yields the acceleration of the motion: F_T = ms̈ = mlθ̈ = −mg sin θ. The tangential component always points toward the rest position.
We get by separation of variables: (14.100b)
Here, t₀ denotes the time at which the pendulum is in the deepest position for the first time, i.e., where θ(t₀) = 0 holds. We get the equation (14.100c)
after some transformations and with the substitutions sin(ψ/2) = k sin φ, k = sin(θ₀/2). Here F(k, φ) is an elliptic integral of the first kind (see (8.24a), p. 435). The angle of deflection θ = θ(t) is a periodic function of period 2T with (14.100d), where K represents a complete elliptic integral of the first kind (Table 21.7). T denotes the period of the pendulum, i.e., the time between two consecutive extreme positions, for which dθ/dt = 0. If the amplitude is small, i.e., sin θ ≈ θ, then T = 2π√(l/g) holds.
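A small numerical sketch of the pendulum period follows. It is not part of the original text; it assumes SciPy and the standard fact that the full oscillation period equals 4√(l/g) K(k) with modulus k = sin(θ₀/2), where scipy.special.ellipk expects the parameter m = k². For small amplitudes the value approaches 2π√(l/g).

# Full oscillation period of the pendulum as a function of the amplitude theta0.
import numpy as np
from scipy.special import ellipk

g, l = 9.81, 1.0
for theta0 in (0.1, 1.0, 2.0):                 # amplitude in radians
    k = np.sin(theta0 / 2.0)
    T_full = 4.0 * np.sqrt(l / g) * ellipk(k**2)
    print(theta0, T_full, 2.0 * np.pi * np.sqrt(l / g))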
Figure 14.54
Figure 14.56
Figure 14.55
14.6.2 Jacobian Functions 1. Definition It follows for 0 < k < 1 from the representations (8.23a) and (8.24a), 8.1.4.3, p. 435, for the elliptic integral of the first kind F(k, φ) that
dF/dφ = (1 − k² sin²φ)^(−1/2) > 0,   (14.101)
i.e., F(k, φ) is strictly monotone with respect to φ, so the inverse function
φ = am(k, u) = φ(u)   (14.102a)   of   u = ∫₀^φ dψ/√(1 − k² sin²ψ) = u(φ)   (14.102b)
exists. It is called the amplitude function. The so-called Jacobian functions are defined as:
sn u = sin φ = sin am(k, u)   (amplitude sine),   (14.103a)
cn u = cos φ = cos am(k, u)   (amplitude cosine),   (14.103b)
dn u = √(1 − k² sn²u)   (amplitude delta).   (14.103c)
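The three Jacobian functions and the amplitude can be evaluated numerically; the sketch below is not from the original text and assumes SciPy, whose ellipj routine takes the parameter m = k² and returns sn, cn, dn and the amplitude.

# sn, cn, dn and am(k, u) checked against the definitions (14.103a-c).
import numpy as np
from scipy.special import ellipj

k, u = 0.7, 1.3
sn, cn, dn, ph = ellipj(u, k**2)
print(sn, np.sin(ph))                      # sn u = sin am(k, u)
print(cn, np.cos(ph))                      # cn u = cos am(k, u)
print(dn, np.sqrt(1.0 - (k * sn)**2))      # dn u = sqrt(1 - k**2 * sn**2 u)
print(sn**2 + cn**2)                       # = 1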
2. Meromorphic and Doubly Periodic Functions The Jacobian functions can be continued analytically into the z plane. The functions sn z, cn z, and dn z are then meromorphic functions (see 14.3.5.2, p. 691), i.e., they have only poles as singularities. Besides, they are doubly periodic: Each of these functions f(z) has exactly two periods ω₁ and ω₂ with
f(z + ω₁) = f(z),   f(z + ω₂) = f(z).   (14.104)
Here, ω₁ and ω₂ are two arbitrary complex numbers whose ratio is not real. The general formula
f(z + mω₁ + nω₂) = f(z)   (14.105)
follows from (14.104), where m and n are arbitrary integers. Meromorphic doubly periodic functions are called elliptic functions. The set
{z₀ + α₁ω₁ + α₂ω₂ : 0 ≤ α₁, α₂ < 1},   (14.106)
with an arbitrary fixed z₀ ∈ ℂ, is called the period parallelogram of the elliptic function. If this function (Fig. 14.55) is bounded in the whole period parallelogram, then it is a constant.
■ The Jacobian functions (14.103a) and (14.103b) are elliptic functions. The amplitude function (14.102a) is not an elliptic function.
3. Properties of the Jacobian Functions The properties of the Jacobian functions given in Table 14.3 can be obtained with the substitutions
k′ = √(1 − k²),   K = F(k, π/2),   K′ = F(k′, π/2).   (14.107)

Table 14.3 Periods, roots and poles of the Jacobian functions
Function | Periods ω₁, ω₂  | Roots                     | Poles
sn z     | 4K, 2iK′        | 2mK + 2niK′               | 2mK + (2n + 1)iK′
cn z     | 4K, 2(K + iK′)  | (2m + 1)K + 2niK′         | 2mK + (2n + 1)iK′
dn z     | 2K, 4iK′        | (2m + 1)K + (2n + 1)iK′   | 2mK + (2n + 1)iK′

2. Addition theorems:
sn(u + v) = [(sn u)(cn v)(dn v) + (sn v)(cn u)(dn u)] / [1 − k²(sn²u)(sn²v)],   (14.109a)
cn(u + v) = [(cn u)(cn v) − (sn u)(dn u)(sn v)(dn v)] / [1 − k²(sn²u)(sn²v)],   (14.109b)
dn(u + v) = [(dn u)(dn v) − k²(sn u)(cn u)(sn v)(cn v)] / [1 − k²(sn²u)(sn²v)].   (14.109c)
3. Derivatives:
(sn z)′ = (cn z)(dn z),   (14.110a)   (cn z)′ = −(sn z)(dn z),   (14.110b)   (dn z)′ = −k²(sn z)(cn z).   (14.110c)
For further properties of the Jacobian functions and further elliptic functions see [14.8], [14.12].
14.6.3 Theta Functions We apply the theta functions to evaluate the Jacobian functions:
ϑ₁(z, q) = 2 ⁴√q Σ_{n=0}^{∞} (−1)ⁿ q^(n(n+1)) sin(2n + 1)z,   (14.111a)
ϑ₂(z, q) = 2 ⁴√q Σ_{n=0}^{∞} q^(n(n+1)) cos(2n + 1)z,   (14.111b)
ϑ₃(z, q) = 1 + 2 Σ_{n=1}^{∞} q^(n²) cos 2nz,   (14.111c)
ϑ₄(z, q) = 1 + 2 Σ_{n=1}^{∞} (−1)ⁿ q^(n²) cos 2nz.   (14.111d)
If |q| < 1 (q complex) holds, then the series (14.111a)-(14.111d) are convergent for every complex argument z. We use the brief notation in the case of a constant q
ϑ_k(v) := ϑ_k(πv, q)   (k = 1, 2, 3, 4).   (14.112)
Then, the Jacobian functions have the representations
sn z = (1/√k) ϑ₁(v)/ϑ₄(v),   cn z = √(k′/k) ϑ₂(v)/ϑ₄(v),   dn z = √k′ ϑ₃(v)/ϑ₄(v)   with v = z/(2K), q = e^(−πK′/K),   (14.113a,b)
and K, K′ are as in (14.107).
14.6.4 Weierstrass Functions The functions
℘(z) = ℘(z, ω₁, ω₂),   (14.114a)   ζ(z) = ζ(z, ω₁, ω₂),   (14.114b)   σ(z) = σ(z, ω₁, ω₂)   (14.114c)
were introduced by Weierstrass; here ω₁ and ω₂ represent two arbitrary complex numbers whose quotient is not real. We substitute
w_mn = 2(mω₁ + nω₂),   (14.115a)
where m and n are arbitrary integers, and we define
℘(z, ω₁, ω₂) = z⁻² + Σ′_{m,n} [(z − w_mn)⁻² − w_mn⁻²].   (14.115b)
The accent behind the sum sign denotes that the value pair m = n = 0 is omitted. The function ℘(z, ω₁, ω₂) has the following properties:
1. It is an elliptic function with periods 2ω₁ and 2ω₂.
2. The series (14.115b) is convergent for every z ≠ w_mn.
3. The function ℘(z, ω₁, ω₂) satisfies the differential equation
℘′² = 4℘³ − g₂℘ − g₃   (14.116a)   with g₂ = 60 Σ′_{m,n} w_mn⁻⁴,   g₃ = 140 Σ′_{m,n} w_mn⁻⁶.   (14.116b)
The quantities g₂ and g₃ are called the invariants of ℘(z, ω₁, ω₂).
4. The function u = ℘(z, ω₁, ω₂) is the inverse function of the integral (14.117).
The functions ζ(z) and σ(z), defined by (14.118), (14.119a), (14.119b), are not doubly periodic, so they are not elliptic functions. The following relations are valid:
1. ζ′(z) = −℘(z),   ζ(z) = (ln σ(z))′,   (14.120)
2. ζ(−z) = −ζ(z),   σ(−z) = −σ(z),   (14.121), (14.122)
3. ζ(z + 2ω₁) = ζ(z) + 2ζ(ω₁),   ζ(z + 2ω₂) = ζ(z) + 2ζ(ω₂),   (14.123)
5. Every elliptic function is a rational function of the Weierstrass functions ℘(z) and ζ(z).
15 Integral Transformations
15.1 Notion of Integral Transformation
15.1.1 General Definition of Integral Transformations An integral transformation is a correspondence between two functions f(t) and F(p) in the form
F(p) = ∫_{−∞}^{+∞} K(p, t) f(t) dt.   (15.1a)
The function f(t) is called the original function, its domain is the original space. The function F(p) is called the transform, its domain is the image space. The function K(p, t) is called the kernel of the transformation. In general, t is a real variable, and p = σ + iω is a complex variable. We may use a shorter notation by introducing the symbol T for the integral transformation with kernel K(p, t):
F(p) = T{f(t)}.   (15.1b)
Then we call it a T transformation.
15.1.2 Special Integral Transformations We get different integral transformations for different kernels K ( p , t ) and different original spaces. The most widely known transformations are the Laplace transformation, the Laplace-Carson transformation, and the Fourier transformation. We give an overview of the integral transformations of functions of one variable in Table 15.1. More recently, some additional transformations have been introduced for use in pattern recognition and in characterizing signals, such as the Wavelet transformation, the Gabor transformation and the Walsh transformation (see 15.6, p. 738ff.).
15.1.3 Inverse Transformations The inverse transformation of a transform into the original function has special importance in applications. With the symbol T⁻¹ the inverse integral transformation of (15.1a) is
f(t) = T⁻¹{F(p)}.   (15.2a)
The operator T⁻¹ is called the inverse operator of T, so
T⁻¹(T{f(t)}) = f(t).   (15.2b)
The determination of the inverse transformation means the solution of the integral equation (15.1a), where the function F(p) is given and the function f(t) is to be determined. If there is a solution, then it can be written in the form f(t) = T⁻¹{F(p)}.   (15.2c)
The explicit determination of inverse operators for different integral transformations, i.e., for different kernels K(p, t), belongs to the fundamental problems of the theory of integral transformations. The user can solve practical problems by using the given correspondences between transforms and original functions in the corresponding tables (Table 21.11, p. 1061, Table 21.12, p. 1066, and Table 21.13, p. 1080).
15.1.4 Linearity of Integral Transformations If f₁(t) and f₂(t) are transformable functions, then
T{k₁f₁(t) + k₂f₂(t)} = k₁T{f₁(t)} + k₂T{f₂(t)},   (15.3)
where k₁ and k₂ are arbitrary numbers. That is, an integral transformation represents a linear operation on the set of T-transformable functions.
Table 15.1 Overview of integral transformations of functions of one variable
Transformation                     | Kernel K(p, t)
Laplace transformation             | 0 for t < 0;   e^(−pt) for t > 0
Two-sided Laplace transformation   | e^(−pt)
Finite Laplace transformation      | 0 for t < 0;   e^(−pt) for 0 < t < a;   0 for t > a
Laplace-Carson transformation      | 0 for t < 0;   p e^(−pt) for t > 0
Fourier transformation             | e^(−iωt)
One-sided Fourier transformation   | 0 for t < 0;   e^(−iωt) for t > 0
Finite Fourier transformation      | 0 for t < 0;   e^(−iωt) for 0 < t < a;   0 for t > a
Fourier cosine transformation      | 0 for t < 0;   cos ωt = Re[e^(iωt)] for t > 0
Fourier sine transformation        | 0 for t < 0;   sin ωt = Im[e^(iωt)] for t > 0
Mellin transformation              | 0 for t < 0;   t^(p−1) for t > 0
Hankel transformation of order ν   | 0 for t < 0;   t J_ν(σt) for t > 0
Stieltjes transformation           | 0 for t < 0;   1/(p + t) for t > 0
J_ν(σt) is the ν-th order Bessel function of the first kind.
15.1.5 Integral Transformations for Functions of Several Variables Integral transformations for functions of several variables are also called multiple integral transformations (see [15.13]). The best-known ones are the double Laplace transformation, i.e., the Laplace transformation for functions of two variables, the double Laplace-Carson transformation and the double Fourier transformation. The definition of the double Laplace transformation is
F(p, q) = L₂{f(x, y)} = ∫_{x=0}^{∞} ∫_{y=0}^{∞} e^(−px−qy) f(x, y) dx dy.   (15.4)
The symbol L denotes the Laplace transformation for functions of one variable (see Table 15.1).
15.1.6 Applications of Integral Transformations 1. Fields of Applications Besides the great theoretical importance that integral transformations have in such basic fields of mathematics as the theory of integral equations and the theory of linear operators, they have a large field of application in the solution of practical problems in physics and engineering. Methods with applications of integral transformations are often called operator methods. They are suitable to solve ordinary and partial differential equations, integral equations and difference equations.
2. Scheme of the Operator Method The general scheme of the use of an operator method with an integral transformation is represented in Fig. 15.1. We get the solution of a problem not directly from the original defining equation; we first apply an integral transformation. The inverse transformation of the solution of the transformed equation gives the solution of the original problem.
Figure 15.1
The application of the operator method to solve ordinary differential equations consists of the following three steps:
1. Transition from a differential equation of an unknown function to an equation of its transform.
2. Solution of the transformed equation in the image space. The transformed equation is usually no longer a differential equation, but an algebraic equation.
3. Inverse transformation of the transform with the help of T⁻¹ into the original space, i.e., determination of the solution of the original problem.
The major difficulty of the operator method is usually not the solution of the transformed equation, but the transformation of the function and the inverse transformation.
15.2 Laplace Transformation 15.2.1 Properties of the Laplace Transformation 15.2.1.1 Laplace Transformation, Original and Image Space 1. Definition of the Laplace Transformation The Laplace transformation
1 P
L { f ( t ) }=
(15.5)
e - P t f ( t )dt = F ( p )
0
assigns a function F ( p ) of a complex variablep to a function f ( t ) of a real variable t , if the given improper integral exists. f ( t ) is called the original function, F ( p ) is called the t r a n s f o m of f(t). The improper integral exists if the original function f(t) is piecewise smooth in its domain t 2 0, in the original space, and for t --t co,suppose lf(t)l 5 Keet with certain constants K > 0, cy > 0. The domain of the transform F ( p ) is called the image space. In the literature one can find the Laplace transformation also introduced in the b'agner or LaplaceCarson form m
(15.6)
L w { f ( t ) }= p J e - P t f ( t ) d t = p ~ ( p ) . 0
2. Convergence The Laplace integral L { f ( t ) }converges in the half-plane Rep is an analytic function with the properties: 1. lim F ( p ) = 0.
> LY (Fig. 15.2). The transform F ( p ) (15.7a)
Repim
This property is a necessary condition for F ( p ) to be a transform. 2. pF(p)= A ,
bs
(15.7b)
(P-'CU)
if the original function f(t) has a finite limit
!iir f(t) = A . (t-10)
Figure 15.2
Figure 15.3
3. Inverse Laplace Transformation h'e can retrieve the original function from the transform with the formula
(15.8)
15.2 Laplace Transformation 709
The path of integration of this complex integral is a line Rep = c parallel to the imaginary axis, where Rep = c > cy. If the function f ( t )has a jump at t = 0, i.e., lim f(t) # 0, then the integral has the t++O
1 mean value -f(+O) 2
there.
15.2.1.2 Rules for t he Evaluation of the Laplace Transformation The rules for evaluation are the mappings of operations in the original domain into operations in the transform space. In the following, we denote the original functions by lowercase letters, the transforms are denoted by the corresponding capital letters.
1. Addition or Linearity Law The Laplace transform of a linear combination of functions is the same linear combination of the Laplace transforms. if they exist. With constants XI,. . . , A n we get: L{x,fl(t)+~zfz(t)+".+Xnfn(t)} = X,Fl(P) +XZFZ(P)+..'+X,F,(P). (15.9) 2. Similarity Laws The Laplace transform of f(at) (a > 0, a real) is the Laplace transform of the original function divided by a and with the argument p / a :
Analogously for the inverse transformation
( 15. lob) Fig. 15.3 shows the application of the similarity laws for a sine function. IDetermination of the Laplace transform of f(t) = sin(&). For the correspondence of the sine function we have L{sin(t)} = F ( p ) = l/(pz + 1). Application of the similarity law gives L{sin(wt)} = 1
1
W
w (p/w)2+1
-F(p/w) = -
1
W
~
-pZ+wZ'
3. Translation Law 1. Shifting t o t h e Right The Laplace transform of an original function shifted to the right by a ( a > 0) is equal to the Laplace transform of the non-shifted original function multiplied by the factor p p :
L{f(t- a)} = e-apF(p).
(15.1la)
2. Shifting t o t h e Left The Laplace transform of an original function shifted to the left by a is equal to eaP multiplied by the difference of the transform of the non-shifted function and the integral ]:f(t)e-ptdt:
L { f ( t + a ) } = eap F ( p ) - / e - P t f ( t )
[
oa
dt
1
.
(15.11b)
Figs. 15.4 and 15.5 show the cosine function shifted to the right and a line shifted to the left.
4. Frequency Shift Theorem The Laplace transform of an original function multiplied by e-bt is equal to the Laplace transform with the argument p + b ( b is an arbitrary complex number):
L{e-*'f(t)}= F ( p + b ) .
(15.12)
710
15. Inteqral Transformations
,
L{f’(t)l = P F b ) - f ( + % L{f”(t)l = PZF(P) - f(+O) P - f’(+OL
........................................................ L { p ( t ) }= p ” F ( p ) - f ( f 0 ) p n - l - f’(+o)p”-* - . . . - f(n-z)(+O)p- f(”-’)(+O)
f‘”’(+O)
=
’
(15.13)
with
t!yof‘”’(t).
/
In the original space. differentiation and integration act in converse ways if the initial values are zeros. 8. Integration in the Image Space m m 00 1 (15.18) T ( z - p)n-’F(z) dz. L = dPl JdP* ’ ’ ’ F(Pn) dPn =
{ T}J P
J
PI
P,-1
P
This formula is valid only if f ( t ) / t ” has a Laplace transform. For this purpose, f(x)must tend to zero fast enough as t -+ 0. The path of integration can be any ray starting at p, which forms an acute angle
15.2 Laplace Transformation 711
with the positive half of the real axis.
9. Division Law In the special case of n = 1 of (15.18) we have (15.19)
f ( t ) must also exist. For the existence of the integral (15.19);the limit lim -
t
t-0
10. Differentiation and Integration with Respect to a Parameter
C
{ y} aa
(15.20a)
=
,C
{ lf(t,
a ) da} = T F ( t ,a ) da.
01
(15.20b)
a1
Sometimes we can calculate Laplace integrals from known integrals with the help of these formulas.
11. Convolution 1. Convolution in t h e Original Space The convolution of two functions fl(z)and f2(z)is the integral
f * fz=jh(r)~fz(t-~)d7.
(15.21)
0
Equation (15.21) is also called the one-sided convolution in the interval (0, t ) . A two-sided convolution occurs for the Fourier transformation (convolution in the interval (-m, co)see 9., p. 727). The convolution (15.21) has the properties a) Commutative law: b) Associative law: c) Distributive law:
fl * fz = fz * f l ' ( f ~* f ~ *) f3 = fi * (fi * A). (fl
+ fz) * f3 = fl * f3 + fz * f3.
(15.22a) ( 15.22b) (15.22~)
In the image domain, the usual multiplication corresponds to the convolution: (15.23) L{fi * fz} = Fib).Fz(P). The convolution of two functions is shown in Fig. 15.6. We can apply the convolution theorem to determine the original function: a ) Factoring the transform F(P) = Fl(P) ' &(PI. b) Determining the original functions f l ( t ) and f Z ( t ) of the transforms 4 ( p ) and Fz(p) (from a table). c) Determining the original function associated to F ( p ) by convolution of f l ( t ) and f2(t) in the original space ( f ( t ) = h ( t )* fz(t)).
712
15. Intearal Transformations
2. Convolution in the Image Space (Complex Convolution)
xl-im
(15.24)
x2-icc
The integration is performed along a line parallel to the imaginary axis. In the first integral, 21 and p must be chosen so that z is in the half plane of convergence of C{fi} and p - z is in the half plane of convergence of L{f2}. The corresponding requirements must be valid for the second integral.
15.2.1.3 Transforms of Special Functions 1. Step Function The unit jump at t = t o is called a step function (Fig. 15.7) (see also 14.4.3.2,3., p. 695); it is also called the Heaviside unit step function: (15.25) IA: f ( t ) = u ( t - t o )sinwt,
F(p) = e+p
wcoswt~+psinwto p2 w2
(Fig. 15.8).
IB: f ( t ) = u ( t - to)sinw (t - to),
F(p) =
W p2 w2
(Fig. 15.9)
Figure 15.7
+
+
Figure 15.8
Figure 15.9
2. Rectangular Impulse .4rectangular impulse of height 1 and width T (Fig. 15.10)is composed by the superposition of two step functions in the form
0 uT(t - t o )= u(t - to) - u( t - to - T) = 1 0
for t < to, for to < t < to T , for t > t o + T ;
+
(15.26)
(15.27)
3. Impulse Function (Dirac 6 Function) (See also 12.9.5.4, p. 639.) The impulse function 6 ( t - t o )can obviously be interpreted as a limit of the rectangular impulse of width T and height 1/T at the point t = to (Fig. 15.11): 1 (15.28) b(t - t o )= lim -[ u ( t - to) - u(t - to - T ) ] . T+O T
15.2 Lavlace Transformation 713
Figure 15.10
Figure 15.11
For a continuous function h ( t ) , (15.29) Relations such as
6(t - to) = d’(t - t o ) ~
dt
C{ 6(t - to)} = e-top
’
(to 2 0)
(15.30)
are investigated generally in distribution theory (see 12.9.5.3, p. 638).
4. Piecewise Differentiable Functions The transform of a piecewise differentiable function can be determined easily with the help of the b function: If f ( t ) is piecewise differentiable and at the points t , (v = 1 , 2 , . . . , n) it has jumps a,. then its first derivative can be represented in the form
dfo = j i ( t )+ alb(t + a z ~ (-t t z ) + + a,b(t dt tl)
’.
- t,)
(13.31)
where f : ( t ) is the usual derivative of f ( t ) ,where it is differentiable. Ifjumps occur first in the derivative, then similar formulas are valid. In this way, we can easily determine the transform of functions which correspond to curves composed of parabolic arcs of arbitrarily high degree. e.g.. curves found empirically. In formal application of (15.13), we should replace the values f(+O). f‘(+O). . . . by zero in the case of a jump. W A: f ( t ) = [~~bff”bl~~~~,to~(Fig.15.12);f’(t)=au,(t)+b6(t)-(at~+b)6(t-t~)~C{f’ !(I - e 8 O p )
P
+ b - ( a t 0 + b) e?Op;
C { f ( t ) }=
W B:
1 for 0 < t < to, for 0 < t < to, f ( t ) = 2to - t for to < t < 2ta, (Fig.15.13);f ’ ( t ) = -1 for to < t < 2t0, (Fig. 15.14): 0 for t > 2t0, for t > 2t0, 0
i
it
f”(t ) = 6(t )- 6(t - t o )-6( t -t o )t 6(t - 2to): C{f ” ( t )} = 1- 2 e ~ ~ o ~ + e - ~ ~C{ o Pf (; t )} = for for otherwise,
0
< t < to,
$ Iio>{ 2$: (Fig.15.15);
( 1- e - t o P ) 2
PZ
714
15. Inteoral Transformations
;boob;pu; ~
~
2tO Figure 15.12
Figure 15.13
< t < to, < < - t o > (t T - to < t < T ,
Elto 0
for for
0
otherwise,
0
to
'
Figure 15.14
(Fig, 15.16);
a t
01 to
T-t, T
-t
Figure 15.15 ID: f(t) =
[ toWt2
f " ( t ) = -2211(t)
gEr:ii2
(Fig. 15.17); f'(t) =
T-t T
Figure 15.16
0 < t < 1, [ 1 - 2t forotherwise, (Fig. 15.18);
+ b(t) t q t - 1);
Figure 15.17
Figure 15.18
5 . Periodic Functions The transform of a periodic function f * ( t )with period T , which is a periodic continuation of a function f ( t ) , can be obtained from the Laplace transform of f(t) multiplied by the perioditation factor (1- e-Tp)-l. (15.32)
15.2 Laplace Transformation 715
IA: The periodic continuation of f(t) from example B (see above) with period T = 2to is f * ( t )with
IB: The periodic continuation of f ( t ) from example C (see above) with period T is f ' ( t ) with E (1 - ,-top) (1 - e-(T-toiP) '{f*(t)} = tOP2(1 - e-Tp)
15.2.1.4 Dirac S Function and Distributions In describing certain technical systems by linear differential equations, functions u ( t ) and 6 ( t ) often occur as perturbation or input functions, although the conditions required in 15.2.1.1, 1. p. 708, are not satisfied: u ( t ) is discontinuous, and 6 ( t ) cannot be defined in the sense of classical analysis. Distribution theory offers a solution by introducing so-called generalized functions (distrzbutions),so that with the known continuous real functions 6 ( t ) can also be examined, where the necessary differentiability is also guaranteed. Distributions can be represented in different ways. One of the best known representations is the continuous real linear form. introduced by L. Schwartz (see 12.9.5, p. 637). We can associate Fourier coefficients and Fourier series uniquely to periodic distributions, analogously to real functions (see 7.4, p. 418).
1. Approximations of the 6 Function hnalogously to (15.28), the impulse function b(t) can be approximated by a rectangular impulse of width E and height 1/E ( E > 0): (15.33a) Further examples of the approximation of b ( t )are the error curve (see 2.6.3, p. 72) and Lorentz function (see 2.11.2. p. 94): (15.3313) (15.33~)
These functions have the common properties:
1.
7
(15.34a)
f ( t . E ) dt = 1.
--3o
2.
f(-t,
E)
= f ( t , E ) , Le., they are even functions.
( 15.34b) (15.34~)
3.
2. Properties of the 6 Function Important properties of the 6 function are:
1.
T j ( t ) J ( x- t ) dt = f(x) (f is continuous, a > 0).
(15.35)
x-a
2.
1
6((Yz) = -6(z)
3. 6 ( g ( z ) )=
( a > 0).
n 1 1 -6(z ls'(xz)l ,=I
- 2,)
(15.36) with g ( z l )= 0 and g'(zt) # 0 (i = 1 , 2 , .. . , n).
(15.37)
I
716
15. Inteoral Transformations
Here we consider all roots of g ( x )and they must be simple. 4. n - t h Derivative of t h e 6 Function: After n repeated partial integrations of
f(")(z)= Taf(")(t)6(z- t ) dt,
(15.38a)
x-a
we obtain a rule for the n-th derivative of the b function:
(-l)"f'"'(z) = T f ( t )b(n)(z- t ) dt.
(15.38b)
x-a
15.2.2 Inverse Transformation into the Original Space To perform an inverse transformation, we have the following possibilities: 1. Csing a table of correspondences, Le., a table with the corresponding original functions and transforms (see Table 21.11, p. 1061). 2. Reducing to known correspondences by using some properties of the transformation (see 15.2.2.2, p. 716, and 15.2.2.3, p. 717). 3. Evaluating the inverse formula (see 15.2.2.4, p. 718).
15.2.2.1 Inverse Transformation with the Help of Tables The use of a table is shown here by an example with Table 21.11, p. 1061. Further tables can be found, e.g., in [15.3]. 1 = Fl(P). FZ(P),c-'{Fl(P)I = pz = t s i n w t = fl(t), F ( p ) = ( p + c)(p2 + d)
W l2 j
e-' {
'
{ L}
= e-ct = fi(t). We have to apply the convolution theorem (15.23):
LC-'{F2(p)} =
f(t)
P+C
= l-l{Fl(P)
=
it
'
FZ(P)}
f l (7) , f2(t
- 7 ) d7 =
1'
e-c(t-7)
csinwt -wcoswt sin w~ 1 -dr=-( W W cz t WZ
+ e-'') .
15.2.2.2 Partial Fraction Decomposition 1. Principle In many applications, we have transforms in the form F ( p ) = H(p)/G(p), where G(p) is a polynomial of p . If we already have the original functions for H ( p ) and l/G(p), then we get the required original function F ( p ) by applying the convolution theorem. 2. Simple Real Roots of G ( p ) If the transform 1/G(p) has only simple poles p , (v = 1,2,. . . , n),then we get the following partial fraction decomposition: (15.39) The corresponding original function is (15.40)
15.2 Laplace Transformation 717
3. The Heaviside Expansion Theorem If the numerator H(p)is also a polynomial ofp with a lower degree than G ( p ) ,then we can obtain the original function of F ( p ) with the help of the Heaviside formula (15.41)
4. Complex Roots Even in cases when the denominator has complex roots, we can use the Heaviside expansion theorem in the same way. We can also collect the terms belonging to complex conjugate roots into one quadratic expression. whose inverse transformation can be found in tables also in the case of roots of higher multiplicity.
F ( p ) = (p + c)(;2 + d)' The poles p l = -e, pz = iw,
f(t) = I e - c wz + cz
t
-I
e
i
Le., H ( p ) = 1, G ( p ) = (p
-iw are all simple. According to the Heaviside theorem we get
p3 =
w
2w(w - ic)
+ c)(p2 + w'), G'(p) = 3p2 + 2pc + 2.
-I e-iwt or by using partial fraction decomposition and 2w(w ic) 1
t
+
expressions are identical.
15.2.2.3 Series Expansion X
In order to obtain f ( t ) from F ( p ) , we can try to expand F ( p ) into a series F ( p ) =
n=O
Fn(p), whose
terms Fn(p)are transforms of known functions, Le., F,(p) = C{fn(t)}. 1. F ( p ) is an Absolutely Convergent Series If F ( p ) has an absolutely convergent series (16.42) for Ipl > R. where the values A, form an arbitrary increasing sequences of numbers 0 . . < A, < . ' . < ' . . + co,then a termwise inverse transformation is possible: d - 1
pm. P
f(t) =
< A0 < Ai <
L '
T denotes the gamma function (see 8.2.5, 6., p. 459). In particular, for A, = n + 1, Le., for F ( p ) = anti we get the series f(t) = %tn, which is convergent for every real and complex t. Fur. n=O pnt1 n=O n ! thermore, we can have an estimation in the form if(t)l < CeCltl(C, c real constants)
5
-)
I F ( p ) = -- - (1 +
m
P
$)
the original space we get f ( t ) =
-112
1
=
e (-nk) -& ,
After a termwise transformation into
n=O
=
Jo(t) (Bessel function of
n=O
0 order). 2. F ( p ) is a Meromorphic Function If P(p)is a meromorphic function, which can be represented as the quotient of two integer functions (of two functions having everywhere convergent power series expansions) which do not have common
718
15. Inteoral Transformations
roots. and so can be rewritten as the sum of an integer function and finitely many partial fractions. then we get the equality
(15.44) Here p , (v = 1 , 2 , . . . , n) are the first-order poles of the function F ( p ) ,b, are the corresponding residues (see 14.3.9.4,p. 691), y, are certain values and K, are certain curves, for example, half circles in the sense represented in Fig. 15.19.We get the solution f ( t ) in the form
1 etpF(p)
m
f(t) =
b,ePvt,
if
as y
dp
277i
"=l
--t
0
(15.49)
(K,)
+ co,what is often not easy to verify.
X
-Y1
Figure 15.19
Figure 15.20
In certain cases, e.g., when the rational part of the meromorphic function F ( p ) is identically zero, the above result is a formal application of the Heaviside expansion theorem to meromorphic functions.
15.2.2.4 Inverse Integral The inverse formula
f ( t ) = lim V"+==
1' T e t PF ( p )dp 2771
(15.46)
c-iy,
represents a complex integral of a function analytic in a certain domain. The usual methods of integration for complex functions can be used, e.g., the residue calculation or certain changes of the path of integration according to the Cauchy integral theorem.
F ( p ) = -e-@P P2
+u2
is double valued because of Jij. Therefore, we chose the following path of inte-
1 gration (Fig. 15.20): 277i
f e tp2p +wz p e - @
(K)
Res etPF(p)= e-'JW/' cos(wt - a*).
-
A
1 1 / ...+ / ...+ /
. . + . . + .. . +
dp = h
h
h
AB
CD
EF
-
DA
-
.. =
-
BE
FC
According to the Jordan lemma (see 14.4.3, p. 693), the
h
integral part over AB and C D vanishes as yn
+ 30.
The integrand remains bounded on the circular
arc E F (radius E ) , and the length of the path of integration tends to zero for E + 0; so this term of the integral also vanishes. We have to investigate the integrals on the two horizontal segments BE and E, where we have to consider the upper side ( p = re'") and the lower side ( p = re-'") of the negative real axis:
15.2 Laplace Transformation 719
Finally we get:
15.2.3 Solution of Differential Equations using Laplace Transformat ion We have already noticed from the rules of calculation of the Laplace transformation (see 15.2.1.2, p. 709), that using the Laplace transformation we can replace complicated operations, such as differentiation or integration in the original space, by simple algebraic operations in the image space. Here, we haye to consider some additional conditions, such as initial conditions in using the differentiation rule. These conditions are necessary for the solution of differential equations.
15.2.3.1 Ordinary Differential Equations with Constant Coefficients 1. Principle The n-th order differential equation of the form
+ c,-1
y'"'(t)
y y t )t
'
.
'
+
t c1 y'(t) co y ( t ) = f ( t )
(15.47a)
with the initialvalues y(+O) = yo, y'(+O) = yb, . . . , y(n-l)(+O) = yr-') can be transformed by Laplace transformation into the equation (15.47b) Here G ( p )=
5 ckpk = 0 is the characteristic equation of the differential equation (see 4.5.2.1, p. 279).
k=O
2. First-Order Differential Equations The original and the transformed equations are: y'(t)
+ c d t ) = f(t).
Y(+O) = YO,
(15.48a)
(Pt CO) Y ( P )- YO
= F(P),
(15.48b)
where co = const. For Y(p)we get (15.484 (15.49a)
Special case: For f ( t ) = X ept (A. p const) we get X Yo Y(P)= (P - P)(P + co) P t GI'
+-
(15.49b) (15.49~)
P + co
3. Second-Order Differential Equations The original and transformed equations are: Y(+O) = YO, y"(t) 2ay'(t) + b y ( t ) = f ( t ) ,
+
(pZ+ 2ap + b ) Y ( P )- 2 ~ -~(PYO 0 Jr Y;) = F(P). We then get for Y ( p )
Y'(+O) = Y;,
( 15.50a) ( 15.50b)
(15.504
720 15. Inteoral Transformations
(15.51a) (15.51b)
b) b = a 2 : G(p) = (p - a)’, c) b > a’:
q(t)= t eat.
(15.52a)
G(p)
has complex roots, 1 d t ) = -e-ot sin & 7 t .
(15.52b) (15.53a) (15.53b)
Jbl-;li
We obtain the solution y(t) as the convolution of the original function of the numerator of Y(p) and q(t). The application of the convolution can be avoided if we can find a direct transformation of the right-hand side.
+
+
+ 2y’(t) 10y(t) = 37cos 3t 9e-t 9 P+2 37P p2 2p 10 (p2 t 9)(p’ 2p 10) (p l)(pZ + 2p + 10) ’ 19 P 18 1 -p We get the representation Y(p) = p2 2p 10 (p2 t 2p 10) (PZ 9) (p’ 9) (p 1) by partial fraction decomposition of the second and third terms of the right-hand side but not separating the second-order terms into linear ones. We get the solution after termwise transformation (see Table 21.11, p. 1061) y(t) = (-cos 3t - 6sin 3t)e-t +cos 3t 6sin 3t e&.
I The transformed equation for the differential equation y”(t) with yo = 1 and yb = OisY‘(PI -
+ + + + +
+ + + + + +-+-++ +
+
+
+
4. n-th Order Differential Equations
The characteristic equation G(p) = 0 of this differential equation has only simple roots a l ,CY’, . . . ,an, and none of them is equal to zero. We can distinguish two cases for the perturbation function f ( t ) . 1. If the perturbation function f ( t )is the jump function ~ ( twhich ) often occurs in practical problems, then the solution is: 1 ” 1 for t > 0, (15.54a) y(t) = eaut . (15.54b) ‘(‘1 = 0 for t < 0, G(O) a,G’(a,)
+
[
~
2. For a general perturbation function f ( t ) ,we get the solution c(t)from (15.54b) in the form of the Duhamel formula which uses the convolution (see 15.2.1.2, ll., p. 711):
(15.55)
15.2.3.2 Ordinary Linear Differential Equations with Coefficients Depending on the Variable Differential equations whose coefficients are polynomials in t can also be solved by Laplace transformation. Applying (15.16), in the image space we get a differential equation, whose order can be lower than the original one. If the coefficients are first-order polynomials, then the differential equation in the image space is a firstorder differential equation and maybe it can be solved more easily.
I Bessel differential equation of 0 order: t-d 2 f
d f t tf = 0 +dt2 dt transformation into the image mace results in d dF(P) --[p’F(p) -pf(O) - f’(O)] +pF(p) - f(0) - dp = 0 or dP
(see (9.51a, p. 507) for n = 0). The
-~
dF P - = -dp p2 + lF(’)
15.2 Laplace Transformation 721
Separation of the variables and integration yields log F ( p ) = L
/ ?-!?!- + p2
(C is the integration constant), F ( t ) = CJo(t) m) = v5n-i
1
=
- log
+ log C,
(see example p. 717 with the
Bessel function of 0 order).
15.2.3.3 Partial Differential Equations 1. General Introduction The solution of a partial differential equation is a function of at least two variables: u = u ( z ,t ) . Since the Laplace transformation represents an integration with respect to only one variable, the other variable should be considered as a constant in the transformation:
/ e-Ptu(z,t ) m
C { u ( z .t ) } =
(15.56)
dt = U ( z > p ) .
0
z also remains fixed in the transformation of derivatives: (15.57)
For differentiation with respect to zwe suppose that they are interchangeable with the Laplace integral:
C
{ y]
= ;r{u(z:
t ) }= -U(z,p). d
(15.58)
dz
This way, we get an ordinary differential equation in the image space. Furthermore, we have to transform the boundary and initial conditions into the image space.
2. Solution ofthe One-Dimensional Heat Conduction Equation for a Homogeneous Medium 1. Formulation of the Problem Suppose the one-dimensional heat conduction equation with vanishing perturbation and for a homogeneous medium is given in the form (15.59a) u,, - a-2ut = u,, - uy = O in the original space 0 < t < co,0 < 2 < 1 and with the initial and boundary conditions (15.59b) u(+O, t ) = ao(t), u(1 - 0,t ) = a1(t). u ( z , SO) = u&), The time coordinate is replaced by y = at. (15.59a) is also a parabolic type equation, just as the three-dimensional heat conduction equation (see 9.2.3.3, p. 535). 2. Laplace Transformation The transformed equation is
d2U dx2 and the boundary conditions are G(+O,p) = Ao(p). U ( 1 - 0 , ~=)AI(P). The solution of the transformed equation is ~ ( z , p=)clezfi + c2e-xfi. It is a good idea to produce two particular solutions Ul and
(15.60a)
-= p U - u o ( z ) ,
C;(O.p) = 1, & ( l , p ) = 0,
(15.61a)
( 15.60b) (15.60~) U2
with the properties
Uz(0.p)= 0, U ~ ( 1 , p=) 1, Le..
(15.61b)
I
15. Inte.qra1 Trunsformations
722
(15.61d) The required solution of the transformed equation has the form (15.62) U ( ~ , P=) Ao(P)U I ( ~ ,+P2 4)1 ( ~ )UZ(X,P). 3. Inverse Transformation The inverse transformation is especially easy in the case of 1 + m:
1- (-g)
(15.63a) u ( z , t ) = 2fi
U ( z , p )= ao(p)e-'fi,
-
r3~2
exp
dr.
(15.63b)
15.3 Fourier Transformation 15.3.1 Properties of the Fourier Transformation 15.3.1.1 Fourier Integral 1. Fourier Integral in Complex Representation The basis of the Fourier transformation is the Fourier integral, also called the integral f o r m u l ~of Fourier If a non-periodic function f ( t ) satisfies the Dirichlet conditions (see 7.4.1.2, 3., p. 420) in an arbitrary finite interval, and furthermore the integral
T i f ( t ) idt
11
1 t'X+m e'"("-"f(r)d w d r is convergent, then f ( t ) = 2a
(15.64a)
(15.64b)
-m -m
--35
at every point where the function f ( t ) is continuous, and (15.64~) 0
--3o
at the points of discontinuity.
2. Equivalent Representations Other equivalent forms for the Fourier integral (15.64b) are: (15.65a) 'X
2. f ( t ) = /[a(.)
coswt
+ b(w) sinwt] dw
with the coefficients
(15.65b)
0
a(.)
=
1
1 +m ; f(t)coswtdt
(15.65~)
-'X
b(w) =
1 T f ( t ) sinwtdt.
(15.65d)
-W
'X
3. f ( t ) = 1 A ( w ) c o s [ . t + ~ ( w ) ] d w .
(15.66)
0
A(w)sin[wt+ip(w)]dw.
(15.67)
15.3 Fourier Transformation 723
The following relations are valid here:
(15.68~)
(15.68d) (15.68f)
15.3.1.2 Fourier Transformation and Inverse Transformation 1. Definition of the Fourier Transformation The Fourier transformation is an integral transformation of the form (15.1a)) which comes from the Fourier integral (15.64b) if we substitute
F ( d ) = T e - ' " ' f ( r ) dr.
(15.69)
-m
We get the following relation between the real original function f ( t )and the usually complex transform F(Ld): (15.70) -r;
In the brief notation we use E
/ e-'"tf(t)
tr;
F ( d ) = F{f ( t ) } =
dt
(15.71)
-ffi
The original function f ( t ) is Fourier transformable if the integral (15.69), Le., an improper integral with the parameter w , exists. If the Fourier integral does not exist as an ordinary improper integral, we consider it as the Cauchy principal value (see 8.2.3.3, l.,p. 455). The transform F ( w ) is also called the Fozlrzer transform: it is bounded, continuous, and it tends to zero for lw(+ co: lim F ( d ) = 0. (15.72) lw~+ffi
The existence and boundedness of F ( w ) follow directly from the obvious inequality (15.73) The existence of the Fourier transform is a sufficient condition for the continuity of F(w) and for the properties F ( w ) -+ 0 for 1wl + w. This statement is often used in the following form: If the function f ( t ) in (-co,co)is absolutely integrable, then its Fourier transform is a continuous function of w,and (15.72) holds. The following functions are not Fourier transformable: Constant functions, arbitrary periodic functions (e.g sin ij t , cos ~ t )power , functions, polynomials, exponential functions (e.g., e a t , hyperbolic functions). ~
2. Fourier Cosine and Fourier Sine Transformation In the Fourier transformation (15.71), the integrand can be decomposed into a sine and a cosine part. So. we get the sine and the cosine Fourier transformation.
724
15. Intearal Transformations
1. Fourier Sine Transformation 33
/
( 15.74a)
Fs(w)= FS{f ( t ) } = f ( t ) sin (ut)dt. 0
2. Fourier Cosine Transformation M
/
( 15.74b)
Fc(w)= FC{f ( t ) } = f ( t )cos (w t ) dt. 0
3. Conversion Formulas Between the Fourier sine (15.74a) and the Fourier cosine transformation (15.74b) on one hand, and the Fourier transformation (15.71) on the other hand, the following relations are valid: (15.75a) F ( w ) = Ft f(t) 1 = Fc{ f(t) + f(-4 I - iFS{ f(t)- f(-C 13 i
F'(w) = sF{ f(ltl)signt},
(15.75b)
1
FCb)= f{ f ( t ) I.
(15.75~)
For an even or for an odd function f ( t )we have the representation
f ( t ) even: F{f ( t ) 1 = 2 3 4 f ( t )I, f ( t ) odd: 3{f ( t )} = -2iFS{ f ( t ) }.
(15.75d)
3. Exponential Fourier Transformation Differently from the definition of F ( w ) in (15.71), the transform (15.76)
F,(w) = Fe{f(t)} = T e ' " ' f ( t )d t -M
is called the ezponentzal Fourzer transformatton, and we have F ( d ) = 2F,(-w).
(15.77)
4. Tables of the Fourier Transformation Based on formulas (15.75a,b,c) we either do not need special tables for the corresponding Fourier sine and Fourier cosine transformations, or we have tables for Fourier sine and Fourier cosine transformations and we may calculate F(w)with the help of (15.75a,b,c). In Table 21.12.1 (see 21.12.1, p. 1066) and Table 21.12.2 (see 21.12.2, p. 1072) the Fourier sine transforms Fs(w), the Fourier cosine transforms FC(w), and for some functions the Fourier transform 3 ( w ) in Table 21.12.3 (see 21.12.3, p. 1077) and the exponential transform F,(w) in Table 21.12.4 (see 21.12.4, p. 1079) are given. IThe function of the unipolar rectangular impulse f ( t ) = 1 for It1 < to, f ( t ) = 0 for It1 > to (.4.1) (Fig. 15.21) satisfies the assumptions of the definition of the Fourier integral (15.64a). Ac1 +to 2 . cos w t dt = -sin w to and b(w) = cording to (15.65c,d) we get for the coefficients a ( @ ) = T
1
2 sin w t dt = 0 (A.2) and so from (15.65b), f ( t ) = -
/ /
TW
-tn
m
sin wtocos w t
w 5. Spectral Interpretation of the Fourier Transformation 71
-to
T O
dw (.4.3).
Analogously to the Fourier series of a periodic function, the Fourier integral for a non-periodic function has a simple physical interpretation. A function f ( t ) ,for which the Fourier integral exists, can be represented according to (15.66) and (15.67) as a sum of sinusoidal vibrations with continuously changing frequency w' in the form
A ( w ) d w sin[wt + p(w)]>
(15.78a)
A(w)dwc o s [ w t + @ ( w ) ] .
(15.78b)
The expression A(w)dw gives the amplitudeof the wave components and p(w) and $ ( w ) are the phases.
15.9 Fourier Transformation 725
-to
0'
to
t
Figure 15.21
Figure 15.22
We have the same interpretation for the complex formulation: The function f ( t )is a sum (or integral) of summands depending on w of the form
L F (dw~eiut; ) 277
(15.79)
1 where the quantity -F(w) also determines the amplitude and the phase of all the parts. 27T
This spectral interpretation of the Fourier integral and the Fourier transformation has a big advantage in applications in physics and engineering. The transform
~ ( w =) IF(w)le'+) or l ~ ( w ) eiv@) l ( 15.80a) is called the spectrum or frequency spectrum of the function f ( t ) , the quantity IF(.)I = 7 7 4 w ) (15.80b) is the amplitude spectrum and io(.) and @(w)are the phase spectra of the function f(t). The relation between the spectrum F ( w ) and the coefficients (15.65c,d) is (15.81) F(w)= T [ a(.) - ib(w)], from which we get the following statements: 1. If f ( t ) is a real function, then the amplitude spectrum F ( w ) is an even function ofw, and the phase spectrum is an odd function of w. 2. If f ( t ) is a real and even function, then its spectrum F ( w ) is real, and if f ( t )is real and odd, then the spectrum F ( w ) is imaginary. I If we substitute the result (A.2) for the unipolar rectangular impulse function on p. 724 into (15.81), then we get for the transform F ( w ) and for the amplitude spectrum IF(u)l (Fig.15.22) sin w to sin u t o F ( w ) = F{ f ( t ) } = na(w) = 2(A.3), IF(u)I = 2 - (A.4). The points of contact of W I w ! 2 the amplitude spectrum lF(w)l with the hyperbola - are at wto = f ( 2 n t l ) ? ( n = 0 , 1 , 2 , .. .) . w 2
15.3.1.3 Rules of Calculation with the Fourier Transformation As we have already pointed out for the Laplace transformation, the rules of calculation with integral transformations mean the mappings of certain operations in the original space into operations in the image space. If we suppose that both functions f ( t ) and g(t) are absolutely integrable in the interval (--33, -33) and their Fourier transforms are FkJ)= F{f ( 4 1 and G(w)= F{ g(t)1 (15.82) then the following rules are valid.
1. Addition or Linearity Laws If cy and
B are two coefficients from (-m, ca),then:
F{af(t)+ Ps(t) } = ~ F ( w+ )PG(w).
(15.83)
726
15. Inteoral Transformations
2. Similarity Law For real cy # 0,
FI f ( t l a )1 = IQI F(aw). 3. Shifting Theorem
(15.84)
For real cy # 0 and real p,
F{ j(cyt + p) } = ( ~ / ae'PW'aF(w/cu) ) F{f ( t - t o )} = e-iwtoF(w).
(lL85a)
or
(15.85b)
If we replace to by -to in (15.85b), then we get
F{j ( t + t o )} = elWtoF(u).
(15.85~)
4. Frequency-Shift Theorem For real cy
> 0 and R E
(-m, ea),
F{e'"f(cyt) } = ( l / a ) F ( ( w- p ) / a ) or F{e'""f(t) } = F(u - wo).
(15.86a) (15.86b)
5. Differentiation in the Image Space If the function t n f ( t )is Fourier transformable, then
F{ t n f ( t ) }= i n W ( w ) ,
(15.87)
) the n-th derivative of F ( w ) . where F ( n ) ( w denotes 6. Differentiation in the Original Space 1. First Derivative If a function f ( t ) is continuous and absolutely integrable in (-m, 03) and it tends to zero fort -+ &coo,and the derivative f'(t) exists everywhere except, maybe, at certain points, and this derivative is absolutely integrable in (-ea,co), then (15.88a) F{ f'(4 1 = iwF{ f(t)}. 2. n-th Derivative If the requirements of the theorem for the first derivative are valid for all derivatives up to f ( n - l ) , then
(15.88b) F{f ( ' ) ( t )} = (iw)nF{f ( t )}. These rules of differentiation will be used in the solution of differential equations (see 15.3.2. p. 729).
7. Integration in the Image Space If the function t n f ( t )is absolutely integrable in (--03, ~ 0 then ) ~the Fourier transform of the function f ( t ) has n continuous derivatives, which can be determined for k = 1,2, . . . , n as +m
/
d k F ( w ) +m dk [ e-'"tf(t)] dt = (-l)k e @ t k f ( t ) dt: d u k = --3o -m
/
( 15.89a)
and we have dkF(d) lim -- 0. dwk With the above assumptions these relations imply that
(15.89b)
d i f c x
dnF(w) dwn '
F{tnf(t) } = in -
(15.89~)
15.3 Fourier Transformation 727
8. Integration in the Original Space and the Parseval Formula 1. Integration Theorem If the assumption
1
tso
f ( t )dt = 0
(l5.9Oa)
is fulfilled, then
F
(15.90b)
-X
2. Parseval Formula If the function f ( t ) and its square are integrable in the interval then
(-ffi.ffi),
(15.91) -w
9. Convolution The two-sided convolutzon
1
'X
fl(t)
* fdt) =
(15.92)
f1(r)fdt- 7 )d7
-X
is considered in the interval (-03, x)and exists under the assumptions that the functions h ( t ) and f2(t) are absolutely integrable in the interval (-03, 03). If fl(t) and fz(t) both vanish for t < 0, then we get the one-szded convolutzon from (15.92) (15.93)
So. it is a special case of the two-sided convolution. While the Fourier transformation uses the twosided convolution, the Laplace transformation uses the one-sided convolution. For the Fourier transform of a two-sided convolution we have (15.94)
(15.95) -X
-X
exist. Le.. the functions and their squares are integrable in the interval (-w, ffi) ICalculate the two-sided convolution$(t) = f ( t ) * f(t) =
:1
f ( r ) f ( t - r )dr (il.1)for the function
of the unipolar rectangular impulse function (A.1) on p. 724. Since y ( t ) =
1
to
f(t - r ) dr =
-60
for -2to 5 t 5 0. ~ ( t=)
1
ttta
:1
f(r)dr (A.2): we get fort < -2to and t > 2t0, w(u)= 0 and
dr = t t 2to. (A.3)
-60
Analogously. n e get for 0 < t 5 2to: y ( t ) =
1
to
t-to
dr = -t
+ 2to.
(.4.4)
.4ltogether, for this convolution (Fig. 15.23) we get t + 2t0 for -2to 5 t 5 0, ~ ( t=)f ( t ) * f ( t ) = -t 2to for 0 < t 5 2to. (A.5) (0 for It1 > 2to. n'ith the Fourier transform of the unipolar rectangular impulse function (.4.1) (p. 724 and Fig. 15.21)
+
728
15. Inteqral Transfomations
we get P(d)=
F{ v(t)}
=
sin' w to
F{ f ( t ) * f(t) = [ F ( w ) ] ' = 4-
spectrum of the function f ( t ) we have IF(w)I = 2
1
(A.6) and for the amplitude
w2
sin' w to and IF(w)I' = 4 . (A.7)
w
W2
-i. -2t0
~
2t,
-1
Figure 15.23
Figure 15.24
10. Comparing the Fourier and Laplace Transformations There is a strong relation between the Fourier and Laplace transformation, since the Fourier transformation is a special case of the Laplace transformation with p = iw. Consequently, every Fourier transformable function is also Laplace transformable, while the reverse statement is not valid for every f ( t ) . Table 15.2 contains comparisons of several properties of both integral transformations. Table 15.2 Comparison of the properties of the Fourier and the Laplace transformation
1
1 Laplace transformation
Fourier transformation ~ ( d=)
F{ f ( t ) } =
tm
J' e-'"'f(t) dt
~ ( p=) L{ f ( t ) , p } = r e - P t f ( t ) d t
--cc
is real, it has a physical meaning, e.g., frequency. One shifting theorem. interval: (-x.+m) Solution of differential equations, problems described by two-sided domain, e.g., the wave equation. Differentiation law contains no initial values.
p is complex, p = r
J
~
1
+ iz.
Two shifting theorems. interval: [ 0 , co) Solution of differential equations, problems described by one-sided domain. e.g., the heat conduction equation. Differentiation law contains initial values.
~~~
Convergence of the Fourier integral depends only Convergence of the Laplace integral can be improved by the factor e-@. on f ( t ) . It satisfies the one-sided convolution law. It satisfies the two-sided convolution law.
15.3.1.4 Transforms of Special Functions IA: Which image function belongs to the originalfunction f ( t ) = e-altl, Rea > 0 (A.l)? Considere-iWt-altldt= e-""-<dt+ ingthat It1 = -tfort < Oandltl = tfort > Owith(15.71)weget:
/_"
e-ARea
and Re a > 0. the limit exists for A
-+
2a co,so we get F ( w ) = F{ ecaltI } = -(A.3). a2
+ w2
B: Which image function belongs to the original function f ( t ) = e-at, Re a > O? The function is not Fourier transformable, since the limit A -+ co does not exist.
15.3 Fourier Transformation 729
C: Determinate the Fourier transform of the bipolar rectangular impulse function (Fig. 15.24) 1 for -2to < t < 0: -1 for 0 < t < 2t0, (C.1) 0 for It1 > 2to where p(t) can be expressed by using equation (A.l) given for the unipolar rectangular impulse on p. 724. since p(t) = f ( t + to) - f(t - to) (C.2). By the Fourier transformation according to (15.85b, 15.85~)we get @(w) = F{p(t) } = eiwtOF(w) - e-iwtoF(w), (C.3) from which. using (A.1).we have: sin2w t o 2 sin w to Q(wj = - e-id ) iJ = 4i((2.4). W
D: Transform of a damped oscillation: The damped oscillation represented in Fig. 15.25a is given 0 for t < 0, To simplify the calculations, the Fourier transformation is calculated with the complex function f * ( t ) = e(-a+ido)t,since f ( t ) = Re (f"( t ) ) .The Fourier transformation gives
F{f * ( t )} CY
CY2
=
is
+ i(w.0 - IJ) + (d- W o ) 2
n f(t)1=
e-iwte(-a+iwo)t
e-atei(w-wo)t
=
-a
I=
1
O0
+ i(w0 - w )
cy
-
- iwo - w)
The result is the Lorentz or Breit-Wigner curve (see also p. 94) '
cy (y2
dt
+( -.
(Fig. 15.25b). It has a unique peak in the frequency domain which cor-
responds to a damped oscillation in the time domain.
t b)
Figure 15.25
Figure 15.26
15.3.2 Solution of Differential Equations using the Fourier Transformat ion Analogously t o Laplace transformation. an important field of application of the Fourier transformation is the solution of differential equations. since these equations can be transformed by the integral transformation into a simple form. In the case of ordinary differential equations we get algebraic equations, in the case of partial differential equations we get ordinary differential equations.
15.3.2.1 Ordinary Linear Differential Equations The differential equation y'(t)
+ a y(t) = f(t)
with f ( t )=
1 for ltl < to, 0 for It1 2 to,
(15.96a)
730
15. Inteqral Transformations
i.e., with the function f ( t ) of Fig. 15.21,is transformed by the Fourier transformation FI Y(t) } = I~(d) into the algebraic equation sin wto 2 sin u to iuY+aY = , (15.96~) soweget Y(w)= 2- w(a iw) w The inverse transformation gives
+
y ( t ) = F-1{
Y ( w )} = F-l
and
dw
for -co
(15.96b) (15.96d)
(15.96e)
< t < -to,
Function (15.96f) is represented graphically in Fig. 15.26.
15.3.2.2 Partial Differential Equations 1. General Remarks The solution of a partial differential equation is a function of at least two variables: u = u ( z ,t ) . .4s the Fourier transformation is an integration with respect to only one variable, the other variable is considered a constant during the transformation. Here we keep as a constant the variable z and transform with respect to t:
/
tx
F{ u ( z ,t ) } =
e-lwtu(z,t) dt = G(z,w ) .
(15.97)
-X
During the transformation of the derivatives we also keep the variable z: (15.98) The differentiation with respect to z supposes that it is interchangeable with the Fourier integral: (15.99)
So, we get an ordinary differential equation in the image space. We also have to transform the boundary and initial conditions into the image space, of course.
2. Solution of the One-Dimensional Wave Equation for a Homogeneous Medium 1. Formulation of t h e P r o b l e m The one-dimensional wave equation with vanishing perturbation term and for a homogeneous medium is: (15.100a) u,, - utt = 0. Like the three-dimensional wave equation (see 9.2.3.2,~. 534), the equation (l5.100a) is a partial differential equation of hyperbolic type. The Cauchy problem is correctly defined by the initial conditions u(z.0) = f(z) (-cc < z < co), ut(z.0) = g(z) (0 5 t < co). (15.100b) 2. Fourier Tkansformation We perform the Fourier transformation with respect to z where the time coordinate is kept constant: F{U(X%t ) } = U(d.t ) . (15.101a)
15.4 2-Transformation 731
U'e get: dzu(d t )
(id)%(&. t ) - -= 0
with
dt2
(15,101h)
( 15.101e) (15.101d) 2 u + U'I = 0. (15.101e) The result is an ordinary differential equation with respect t o t with the parameter w of the transform. The general solution of this known differential equation with constant coefficients is u ( d . t ) = CleiVt+ C2e-iwt. (15.102a) \Ye determine the constants C1 and C2 from the initial values (15.102b) u ( d 30 ) = C1+ Cz = F ( w ) . d ( w , 0) = iwC1 - iwC2 = G(w) and we get 1 1 1 1 (15.102c) C1 = - [ F ( d ) t -G(u)]. C, = - [ F ( u ) - ,G(w)]. 2 1W 2 l(u' The solution is therefore 1 1 1 1 u(d,t) = -[ F ( J ) + -G(w)]e'"'+ -[F(w.)- TG(w)]e-'"t (15.102d) 2 1U 2 1w
F{ u(.. 0) } = U(W,0) = F{f(.) } = F ( d ) , F{ u~(z, 0) } = ~ ' ( 4 . 0 = ) F{ g(z) } = G(w)
3. Inverse Transformation We use the shifting theorem
F{f ( a z + b ) } = 1 / a . eibWIaF(u/a), for the inverse transformation of F ( w ) ,and we get
(15.103a)
e i W t F ( w }) = f ( z + t ) ) Applying the integration rule
(15.103b)
F 7-l
F-'[e - i W t ~ ( w )=] f(x - t ) .
1 } {-:
f(7)dr = -F(w) :i
{
7G(d)eid'} =
:1
after substituting s
(15.103~)
weget
] F-'{G(w)eiwf} ] d~ =
--si
-m
1
X t t
+ t ) dT =
g (T
g ( z )dz
(15.103d)
-m
+ t = z. analogously to the previous integral we get (15.103e)
Finally, the solution in the original space is (15.104)
15.4 Z-Transformation In natural sciences and also in engineering we can distinguish between continuous and discrete processes. jf'hile continuous processes can he described by differential equations, the discrete processes result mostly in diflereerence equations. The solution of differential equations mostly uses Fourier and Laplace transformations, however. to solve difference equations other operator methods have been
732
15. Inteqral Transformations
developed. The best known method is the z-transformation, which is closely related to the Laplace transformation.
15.4.1 Properties of the Z-Transformation 15.4.1.1 Discrete Functions
L e fo
f,
0 T 2T3T
t
Figure 15.27
If a function f ( t ) (0 5 t < m) is known only at discrete values t, = nT ( n = 0,1,2,. . . ; T > 0 is a constant) of the argument, then we write f ( n T ) = fn and we form the sequence {fn}. Such a sequence is produced, e.g., in electrotechnics by “scanning” a function f ( t )at discrete time periods t,. Its representation results in a step function (Fig. 15.27). The sequence {fn} and the function f ( n T )defined only at disCrete points of the argument, which is called a discrete function, are equivalent.
15.4.1.2 Definition of the Z-Transformation 1. Original Sequence and Transform We assign the infinite series (15.105) to the sequence {in}.If this series is convergent, then we call the sequence {f,} z-transformable, and we write F ( z ) = Z{fn}, (15.106) { f,} is called the original sequence, F ( z ) is the transform. z denotes a complex variable, and F ( z ) is a complex-valued function. Ifn = 1 (n = 0,1,2,, , .) . The corresponding infinite series is
F ( z )=
e (i)n.
n=O
(15.107) I ,
It1< 1 and its sum is
It represents a geometric series with common ratio 1/z, which is convergent if -
F(z)=
z, It is divergent for - 2 1. Therefore, the sequence (1) is z-transformable for - < 1, iil IkI 2-1
Le., for every exterior point of the unit circle 1x1 = 1 in the z plane. 2. Properties Since the transform F ( z ) according to (15.105) is a power series of the complex variable l / z , the properties of the complex power series (see 14.3.1.3,p. 688) imply the following results: a) For a z-transformable sequence {f,},there exists a real number R such that the series (15.105) is absolutely convergent for Iz/ > 1/R and divergent for IzI < 1/R. The series is uniformly convergent for IzI 2 l/Ro > 1/R. R is the radius of convergence of the power series (15.105) of l/z. If the series is convergent for every IzJ> 0, we write R = co. For non z-transformable sequences we write R = 0. b) If {fn} is z-transformable for IzI > 1/R, then the corresponding transform F ( z ) is an analytic func> 1/R and it is the unique transform of {f,}. Conversely, if F ( z ) is an analytic function tion for for > 1/R and is regular also at z = co, then there is a unique original sequence {f,} for F ( z ) . Here, F ( z ) is called regular at z = m, if F ( z ) has a power series expansion in the form (15.105) and F(m)= fo.
3. Limit Theorems Analogously to the limit properties of the Laplace transformation ((15.7b), p. 708), the following limit theorems are valid for the z-transformation:
15.4 Z- Transformation 733
a) If F ( z ) = Z { f n }exists, then fo = hlF(Z).
(15.108)
Here z can tend to infinity along the real axis or along any other path. Since the series 1 1
Z { F ( z )- fo} = fl + f i r + f32
zz
1 1 1 2' [ F ( z ) - f 0 - fi;) = f2 t f3; + f4'2
+. . +
' I '
'
,
(15.109)
>
(15.110)
are obviously z transforms. analogously to (15.108) we get f l = ~ ~ g F ( Z ) - . f o ~ ,f z = ~ ~ ~ Z ~ { F ( z ) -1f o - f l ~ } , . . .
(15.111)
IVe can determine the original function {fn} from its transform F ( z ) in this way. b) If fn exists, then lim
nim
fn =
lim ( z - 1)F(z).
(15.112)
ZilCO
\Ve can however determine the value
Ail fn from (15.112) only if its existence is guaranteed, since the
above statement is not reversible 1 lim (z-1)-=O,but z+l
If n = ( - l ) n ( n = 0 , 1 . 2 . . . . ) . T h e n Z { f , } = L a n d z t 1 not exist.
nlim(-l)ndoes im
2-+1+0
15.4.1.3 Rules of Calculations In applications of the z-transformation it is very important to know how certain operations defined on the original sequences affect the transforms, and conversely. For the sake of simplicity we will use the notation F ( z ) = Z { f n }for 121 > 1/R.
1. Translation We distinguish between forward and backward translations. 1. First Shifting Theorem: Z{fn-k} = z - ~ F ( z ) ( k = 0 , l . 2 , . . .) . here fn-k = 0 is defined for n - k < 0. k-1
2. Second Shifting Theorem: Z{fn+k}= zk [ p ( z ) -
fv "=O
(;)1 "]
(15.113)
( k = 1 , 2 , . . .) . (15.114)
2. Summation (15.115)
3. Differences For the dzferences Lfn = fn+l - fn, the following holds:
Z{Afn) Z{A'fn}
= =
Amfn = A(Am-'fn)
(VI=
1 , 2 , . , , : Aofn = f n )
1)F(2)- z f o , - ~ ) ' F ( z-) Z ( Z - I)fo - ~ A f o ,
(2(Z
-
(15.117 ) k-1
Z{Lkfn} =
(15.116)
(2 - l)kF(Z)
-2
(2 Y'O
1)k--.-'A.
f0.
734
15. Inteoral Transformations
4. Damping For an arbitrary complex number A 2{Xnfn} = F
# 0 and /zI > -:1x1
(i) .
R
(15.118)
5 . Convolution The convolution of two sequences {fn} and {gn} is the operation
fn *
n
gn =
(15.119)
fu gn-v
"=a
If the z-transformed functions 2{fn} = F ( z ) for then
Z{fn for
121
1x1 > 1/R1 and 2{gn}
= G ( z )for IzI
> l/Rz exist,
* gn} = P ( z ) G ( z )
> max
(+R
(15.120)
L). Relation (15.120) is called the convolution theorem of the z-transformation. Rz
It corresponds to the rules of multiplying two power series.
6. Differentiation of the Transform 2 { n f n }= -2- d F ( z ) dt
(15.121) '
We can determine higher-order derivatives of F ( z ) by the repeated application of (15.121).
7. Integration of the Transform Under the assumption fo = 0. (15.122)
15.4.1.4 Relation t o the Laplace Transformation If we describe a discrete function f ( t ) (see 15.4.1.1, p. 732) as a step function, then for nT 5 t < ( n + 1)T ( n = 0 , 1 , 2 , . . . : T > 0. T const). (15.123) f ( t ) = f ( n T ) = fn 118 can use the Laplace transformation (see 15.2.1.1, l . ,p. 708) for this piecewise constant function, and for T = 1 we get:
The infinite series in (15.124) is called the dzscrete Laplace transformatzon and it is denoted by
D:
X
D{f(t)} = D { f n } =
1fne-np.
(15.125)
n=O
If n e substitute e p = z in (15.125), then D{fn} represents a series with increasing powers of l/z. which is a so-called Laurent s e w s (see 14.3.4. p. 690). The substitution eP = z suggested the name of the z transformation. With this substitution from (15.124) we finally get the following relations between the Laplace and z-transformation in the case of step functions:
p F ( p ) = (1 -
1 --) F ( t ) (15.126a)
(
;)
or P L { f ( t ) }= 1 - - 2{fn}.
(15.12613)
In this way. we can transform the relations of z-transforms of step functions (see Table 21.13.p. 1080)
15.4 2-Transformation 735
into relations of Laplace transforms (see Table 21.11,p. 1061) of step functions, and conversely.
15.4.1.5 Inverse of t h e Z-Transformation The inverse of the z-transformation is to find the corresponding unique original sequence {fn} from its transform F ( z ) . We write
z - ' { F ( z ) } = {jn}.
(15.127)
There are different possibilities for the inverse transformation
1. Using Tables If the function F ( z ) is not given in tables. then we can try to transform it to a function which is given in Table 21.13.
2. Laurent Series of F ( z ) We get the inverse transform directly from the definition (15.105)) p. 732, if a series expansion of F ( z ) with respect to 1/z is known or if it can be determined.
I:(
3. Taylor Series of F is a series of increasing powers of 2 , from (15.105) and the Taylor formula we get (15.128)
4. Application of Limit Theorems Using the limits (15.108) and (Kill), p. 733, we can directly determine the original sequence {fn} from its transform F ( z ) . H F ( t )=
22
We use the previous four methods. - 2 ) ( 2 - 1)2 ' 1. By the partial fraction decomposition (see 1.1.7.3,p. 15) of F ( z ) / zwe obtain functions which are contained in Table 21.13. (2
F(2) -
-2
2 (2-2)(2-1)2 22
=
A 2 - 2
22
22
(2-1)2
2-1
+- B (2-1)2
+-.2 -C1
so
F ( 2 ) = -- -- - and therefore jn= 2 ( P - n - 1) for n 2 0. 2-2
2. F ( z ) will be a series with decreasing powers of 2 by division: 22 1 1 1 1 1 F ( z ) = z3 - 4 2 + 52 - 2 = 2-zZ t 8-23 t 22-z4 + 52- t 114-z6
From this expression we get fo = fi = 0, f z = 2, f 3 = 8, not obtain a closed expression for the general term f n .
3. For formulating F
f4
+ ... .
= 22, f 5 = 52.
(15.129) f6
=
114.
. . ., but we do
and its required derivatives, (see (15.128)) we consider the partial fraction
736
15. Intearal Transformations
decomposition of F ( z ) ,and get: 2 22 = 1-22 (1-2)2
4
- -
dz
(E)
d2F
dz2
i.e.,
42 4 -~ - - - (1 - 22)’ (1 - 2) 3 (1 - 2 ) ’ ’ -
d3F(:) dz3
2
1-2’
F ( ~ ) = o
for z = 0,
i.e., dz
16 122 12 (1 - 2 ~ ) (1~- z ) ~ (1 - z ) ~ ’
96
(1- 2 2 ) 4
482 (1 - z ) 5
48 (1 - ~
’
(’I
d3F i.e., 2-48 dz3
) 4
for z = O ,
from which we can easily obtain fo, f l , f 2 , f31 . . . considering also the factorials in (15.128). 4. .4pplication of the limit theorems (see 15.4.1.2, 3., p. 732) gives: 22 fo = hlF(z) = lim = 0, z+m 23 - 422 + 52 - 2 222 f l = ) m z ( F ( z ) - fo) = lim = 0, 2-m z3 - 4z2 + 52 - 2
- fo - fl - - f2 ?) = ;;I 23 (
--)
22 2 =8, ... - 42’ 52 - 2 2’ where the Bernoulli-l’Hospita1 rule is applied (see 2.1.4.8, 2., p. 54). We can determine the original sequence {fn} successively.
j3 =
p%
23 ( ~ ( 2 )
1
1
Z3
+
15.4.2 Applications of the Z-Transformation 15.4.2.1 General Solution of Linear Difference Equations A linear difference equation of order k with constant coefficients has the form (15.130) akyn+k + ak-lyn+k-l+ ’. + ~ Z Y ~ + Za l y n + l + aovn = gn (n = O,I,2 . . .). Here k is a natural number. The coefficients a, (z = 0 , l . . . . , k ) are given real or complex numbers and they do not depend on n. Here a0 and ak are non-zero numbers. The sequence {gn} is given, and the sequence {y,} is to be determined. To determine a particular solution of (15.130) the values yo, yl,. . . , yk-l have to be previously given. Then we can determine the next value y k for n = 0 from (15.130). We then get yk+l for n = 1 from y1. y ~ .. . . yk and from (15.130). In this way, we can calculate recursively all values yn. We can give however a general solution for the values yn with the z-transformation. We use the second shifting theorem (15.114) applied for (15.130) to get:
+
[
akzk Y(2) - yo - y l K 1 - .
. - yk-l~-(k-’)] + . . + a l z [ Y ( z ) - yo] + aoY ( 2 ) = G ( z ) (15.131) .
=p(z), HerewedenoteY(2) = Z(y,) andG(z) = Z(g,). Ifwesubstitutea~zkta~-lzk-’+~~~+alz+ao then the solution of the so-called transformed equation (15.131) is
(15.132)
As in the case of solving linear differential equations with the Laplace transformation, we have the similar advantage of the z-transformation that initial values are included in the transformed equation, so the solution contains them automatically. We get the required solution {y,} = 2-'{ Y ( z ) }from (15.132) by the inverse transformation discussed in 15.4.1.5, p. 735.
15.4.2.2 Second-Order Difference Equations (Initial Value Problem) The second-order difference equation is (15.133) Yn+2 + Q l Y n + l + QYn = Sn. where yo and y1 are given as initial values. Using the second shifting theorem for (15.133) we get the transformed equation 2'
[EI(z) -
If we substitute '2
YO
- YI;
ll
+ w [Y ( z )- yo] + aoY(z)= G ( z ) .
(15.134)
+ a l z + a0 = p ( z ) .then the transform is (15.135)
If the roots of the polynomialp(z) ar eal and az, then cy1 # 0 and cyz # 0, otherwise a. is to be zero, and then the difference equation could be reduced to a first-order one. By partial fraction decomposition and applying Table 21.13 for the z-transformation we get the following: for
a1
# QZ,
for
a 1
= CYZ,
(15.136a) Since pO = 0.by the second shifting theorem (15.136b) and by the first shifting theorem
Here we substitute p-l = 0. Based on the convolution theorem we get the original sequence with n
~n = C P n - l q n - u "=O
+ y o ( P n + l +alp") + ~
(15.136d)
1 ~ 1 .
Since p - 1 = PO = 0, this relation and (15.136a) imply that in the case of cy1
This form can be further simplified, since a 1 = -(a1 + az) and Vieta, 1.6.3.1. 3., p. 44).so
a0
# cyz
= alaz (see the root theorems of
(15.136f)
738
15. Inteoral Transformations
In the case of ai = a2 similarly n
gnJv
y, =
-
1)Q’;-2
- yoao(n
- 1)Cl-2
+ y1na;-’.
( 15.136g)
v=2
In the case of second-order difference equations the inverse transformation of the transform Y ( z )can be performed without partial fraction decomposition if we use correspondences such as, e.g.,
[
)
(15.137) sinh n and the second shifting theorem. By substituting al = -2acosh b, and an = a2 the original sequence of (15.135) becomes: 2-1
y
z 2 ’- 2az cosh b t a2
1
[
-7 - sinhb u=2
gn-&’
sinh(v - l ) b - yOan sinh(n - l)b
1
+ y1an-l sinhnb .
(15.138)
This formula is useful in numerical computations especially if a0 and a1 are complex numbers
Remark: Notice that the hyperbolic functions are also defined for complex variables.
15.4.2.3 Second-Order Difference Equations (Boundary Value Problem) It often happens in applications that the values yn of a difference equation are needed only for a finite number of indices 0 5 n 5 N . In the case of a second-order difference equation (15.133) both boundary values yo and y , ~are usually given. To solve this boundary value problem we start with the solution (15,136f) of the corresponding initial value problem. where instead of the unknown value y1 we have to introduce y ~ If. we substitute n = Y into (l5,136f), we can get y1 depending on yo and y , ~ :
If we substitute this value into (l5.136f) then we get
+-a? -1 a;
[ yo(a;a; - a;.,”,
+ yN(a;-
a;)].
(15.140)
The solution (15.140) makes sense only if Q? - a; # 0. Otherwise, the boundary value problem has no general solution, but analogously to the boundary value problems of differential equations, we have to solve the eigenvalue problem and to determine the eigenfunctions.
15.5 Wavelet Transformat ion 15.5.1 Signals If a physical object emits an effect which spreads out and can be described mathematieally, e.g.. by a function or a number sequence, then we call it a signal. Signal analysis means that we characterize a signal by a quantity that is typical for the signal. This means mathematically: The function or the number sequence, which describes the signal. will be mapped into another function or number sequence. from which the typical properties of the signal can be clearly seen. For such mappings, of course. some informations can also be lost. The reverse operation of signal analysis, Le.. the reconstruction of the original signal, is called signal synthesis. The connection between signal analysis and signal synthesis can be well represented by an example of Fourier transformation: A signal f ( t ) (t denotes time) is characterized by the frequency w. Then, formula (l5.141a) describes the signal analysis, and formula (15.141b) describes the signal synthesis:
    F(\omega) = \int_{-\infty}^{\infty} e^{-i\omega t} f(t)\,dt    (15.141a)        and        f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega t} F(\omega)\,d\omega.    (15.141b)
15.5.2 Wavelets
The Fourier transformation has no localization property, i.e., if a signal changes at one position, then the transform changes everywhere, without the position of the change being recognizable "at a glance". The reason is that the Fourier transformation decomposes a signal into plane waves, which are described by trigonometric functions oscillating arbitrarily long with the same period. For wavelet transformations, in contrast, there is an almost freely chosen function \psi, the wavelet (a small, localized wave), which is shifted and compressed for analysing a signal. Examples are the Haar wavelet (Fig. 15.28a) and the Mexican hat (Fig. 15.28b).
■ A Haar wavelet:
    \psi(x) = 1 for 0 \le x < 1/2,   -1 for 1/2 \le x < 1,   0 otherwise.    (15.142)
■ B Mexican hat:
    \psi(x) = -\frac{d^2}{dx^2}\,e^{-x^2/2}    (15.143)        = (1 - x^2)\,e^{-x^2/2}.    (15.144)
Figure 15.28: a) Haar wavelet, b) Mexican hat
Every function \psi comes into consideration as a wavelet if it is quadratically integrable and its Fourier transform \hat\psi(\omega) according to (15.141a) yields a positive finite value of the integral (15.145). Concerning wavelets, the following properties and definitions can be mentioned:
1. For the mean value of the wavelet:
    \int_{-\infty}^{\infty} \psi(t)\,dt = 0.    (15.146)
2. The following integral is called the k-th moment of a wavelet \psi:
    \mu_k = \int_{-\infty}^{\infty} t^k \psi(t)\,dt.    (15.147)
The smallest positive integer n such that \mu_n \neq 0 is called the order of the wavelet \psi. ■ For the Haar wavelet (15.142), n = 1; for the Mexican hat (15.144), n = 2.
3. When \mu_k = 0 for every k, \psi has infinite order. Wavelets with bounded support always have finite order.
4. A wavelet of order n is orthogonal to every polynomial of degree \le n - 1.
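As a small illustration of these properties, the following Python sketch (not from the handbook) evaluates the first moments (15.147) of the Haar wavelet and of the Mexican hat numerically; the integration grid is an arbitrary choice.

import numpy as np

def haar(x):
    return np.where((x >= 0) & (x < 0.5), 1.0, np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def mexican_hat(x):
    return (1.0 - x**2) * np.exp(-x**2 / 2.0)

t = np.linspace(-20.0, 20.0, 400001)
for name, psi in (("Haar", haar), ("Mexican hat", mexican_hat)):
    moments = [float(np.trapz(t**k * psi(t), t)) for k in range(3)]
    print(name, [round(m, 4) for m in moments])
# the mean value (k = 0) vanishes for both wavelets; the first non-vanishing moment
# appears at k = 1 for the Haar wavelet (order 1) and at k = 2 for the Mexican hat (order 2)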
15.5.3 Wavelet Transformation
For a wavelet \psi(t) we can form a family of curves with parameter a:
    \psi_a(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t}{a}\right)   (a \neq 0).    (15.148)
In the case of |a| < 1 the initial function \psi(t) is compressed; in the case of a < 0 there is an additional reflection. The factor 1/\sqrt{|a|} is a scaling factor. The functions \psi_a(t) can also be shifted with a second parameter b. Then we get a two-parameter family of curves:
    \psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right)   (a, b real; a \neq 0).    (15.149)
The real shifting parameter b characterizes the first moment, while the parameter a gives the deviation of the function \psi_{a,b}(t). The function \psi_{a,b}(t) is called a basis function in connection with the wavelet transformation. The wavelet transformation of a function f(t) is defined as:
    L_\psi f(a,b) = \int_{-\infty}^{\infty} f(t)\,\psi_{a,b}(t)\,dt = \frac{1}{\sqrt{|a|}}\int_{-\infty}^{\infty} f(t)\,\psi\!\left(\frac{t-b}{a}\right)dt.    (15.150a)
For the inverse transformation: (15.150b). Here c is a constant depending on the particular wavelet \psi.
■ Using the Haar wavelet (15.142) we get
    \psi_{a,b}(t) = 1 if b < t < b + a/2,   -1 if b + a/2 < t < b + a,   0 otherwise,
and therefore
    L_\psi f(a,b) = \frac{1}{\sqrt{|a|}}\left(\int_{b}^{b+a/2} f(t)\,dt - \int_{b+a/2}^{b+a} f(t)\,dt\right).    (15.151)
The value L_\psi f(a,b) given in (15.151) represents the difference of the mean values of the function f(t) over two neighbouring intervals of length |a|/2 joined at the point b.
Remarks: 1. The dyadic wavelet transformation has an important role in applications. As basis functions we select the functions (15.152), i.e., we get different basis functions from one wavelet \psi(t) by doubling or halving the width and shifting by an integer multiple of the width.
2. A wavelet \psi(t) for which the basis functions given in (15.152) form an orthogonal system is called an orthogonal wavelet.
3. The Daubechies wavelets have especially good numerical properties. They are orthogonal wavelets with compact support, i.e., they are different from zero only on a bounded subset of the time scale. They do not have a closed-form representation (see [15.9]).
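A direct numerical evaluation of the Haar wavelet transform can be based on (15.151); the sketch below is not part of the handbook, and the test signal is an arbitrary choice. It illustrates the localization property: the transform is noticeably different from zero only where the signal changes.

import numpy as np

def haar_wt(f, a, b, n=2000):
    # (1/sqrt(a)) * (integral over [b, b+a/2] minus integral over [b+a/2, b+a]), a > 0
    t1 = np.linspace(b, b + a / 2.0, n)
    t2 = np.linspace(b + a / 2.0, b + a, n)
    return (np.trapz(f(t1), t1) - np.trapz(f(t2), t2)) / np.sqrt(a)

f = lambda t: np.where(t < 0.3, 0.0, 1.0)        # signal with a jump at t = 0.3
for b in (0.0, 0.2, 0.6):
    print(b, round(haar_wt(f, a=0.2, b=b), 4))   # non-zero only near the jump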
15.5.4 Discrete Wavelet Transformation
15.5.4.1 Fast Wavelet Transformation
The integral representation (15.150b) is very redundant, so the double integral can be replaced by a double sum without loss of information. We consider this idea for the concrete application of the wavelet transformation. We need
1. an efficient algorithm for the transformation, which leads to the concept of multi-scale analysis, and
2. an efficient algorithm for the inverse transformation, i.e., an efficient way to reconstruct signals from their wavelet transforms, which leads to the concept of frames.
For more details about these concepts see [15.9], [15.1].
Remark: The great success of wavelets in many different applications, such as the calculation of physical quantities from measured sequences, pattern and voice recognition, and data compression in news transmission, is based on "fast algorithms". Analogously to the FFT (Fast Fourier Transformation, see 19.6.4.2, p. 925) we talk here about the FWT (Fast Wavelet Transformation).
15.5.4.2 Discrete Haar Wavelet Transformation
As an example of a discrete wavelet transformation we describe the Haar wavelet transformation. The values f_i (i = 1, 2, ..., N) of a signal are given. The smoothed values s_i and the detail values d_i (i = 1, 2, ..., N/2) are calculated as
    s_i = \frac{1}{\sqrt{2}}\,(f_{2i-1} + f_{2i}),        d_i = \frac{1}{\sqrt{2}}\,(f_{2i-1} - f_{2i}).    (15.153)
The values d_i are stored, while the rule (15.153) is applied again to the values s_i, i.e., in (15.153) the values f_i are replaced by the values s_i. This procedure is continued sequentially, so that finally from (15.154) a sequence of detail vectors with components d_j^{(n)} is formed. Every detail vector contains information about the properties of the signal.
Remark: For large values of N the discrete wavelet transformation converges to the integral wavelet transformation (15.150a).
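A minimal Python sketch of this procedure (not from the handbook; the signal values are arbitrary, and the signal length is assumed to be a power of two):

import math

def haar_step(f):
    # (15.153): f_{2i-1}, f_{2i} in 1-based indexing correspond to f[2i], f[2i+1] here
    s = [(f[2 * i] + f[2 * i + 1]) / math.sqrt(2) for i in range(len(f) // 2)]
    d = [(f[2 * i] - f[2 * i + 1]) / math.sqrt(2) for i in range(len(f) // 2)]
    return s, d

def haar_dwt(f):
    details = []
    while len(f) > 1:
        f, d = haar_step(f)
        details.append(d)        # one detail vector per level
    return f, details

print(haar_dwt([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]))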
15.5.5 Gabor Transformation
Time-frequency analysis is the characterization of a signal with respect to the frequencies it contains and the time periods in which these frequencies occur. To this end the signal is divided into time segments (windows) and a Fourier transformation is applied; we call this a windowed Fourier transformation (WFT). The window function should be chosen so that the signal is considered only within the window. Gabor applied the window function
    g(t) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-t^2/(2\sigma^2)}    (15.155)
(Fig. 15.29). This choice can be explained as follows: g(t), with "total unit mass", is concentrated at the point t = 0, and the width of the window can be considered as a constant (about 2\sigma).
Figure 15.29
The Gabor transformation of a function f(t) then has the form
    G f(\omega, s) = \int_{-\infty}^{\infty} f(t)\,g(t - s)\,e^{-i\omega t}\,dt.    (15.156)
It determines with which complex amplitude the dominant wave (fundamental harmonic) e^{i\omega t} occurs during the time interval [s - \sigma, s + \sigma] in f, i.e., if the frequency \omega occurs in this interval, then it has the amplitude |G f(\omega, s)|.
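The following Python sketch (not from the handbook) evaluates (15.156) numerically for a signal whose frequency changes at t = 0; the signal, sigma and the grid are arbitrary illustrative choices. The magnitude of the Gabor transform is large only for windows in which the analysed frequency actually occurs.

import numpy as np

def gabor(f, omega, s, sigma=1.0, T=30.0, n=6000):
    t = np.linspace(-T, T, n)
    g = np.exp(-(t - s) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    return np.trapz(f(t) * g * np.exp(-1j * omega * t), t)

f = lambda t: np.sin(np.where(t < 0, 2.0, 6.0) * t)   # frequency 2 for t < 0, 6 for t >= 0
for s in (-5.0, 5.0):
    print(s, round(abs(gabor(f, omega=6.0, s=s)), 3))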
15.6 Walsh Functions
15.6.1 Step Functions
In the approximation theory of functions orthogonal function systems play an important role. For instance, special polynomials or trigonometric functions are used, since they are smooth, i.e., they are differentiable sufficiently many times in the considered interval. There are problems, however, e.g., the transmission of the points of a coarse picture, where smooth functions are not suitable for the mathematical description, and step functions, i.e., piecewise constant functions, are more appropriate. Walsh functions are very simple step functions. They take only the two function values +1 and -1. These two function values correspond to two states, so the Walsh functions can be implemented on computers very easily.
15.6.2 Walsh Systems
Analogously to trigonometric functions we can consider periodic step functions. We take the interval I = [0, 1) as a period interval and divide it into 2^n equally long subintervals. Suppose S_n is the set of periodic step functions with period 1 over such a subdivision. We can consider the different step functions belonging to S_n as vectors of a finite-dimensional vector space, since every function g \in S_n is defined by its values g_0, g_1, g_2, ..., g_{2^n-1} in the subintervals and can be considered as a vector:
    g^T = (g_0, g_1, g_2, ..., g_{2^n-1}).    (15.157)
The Walsh functions belonging to S_n form an orthogonal basis with respect to a suitable scalar product in this space. The basis vectors can be enumerated in many different ways, so we can get many different Walsh systems, which actually contain the same functions. Three of them should be mentioned: Walsh-Kronecker functions, Walsh-Kaczmarz functions and Walsh-Paley functions. The Walsh transformation is constructed analogously to the Fourier transformation, where the role of the trigonometric functions is taken by the Walsh functions. We get, e.g., Walsh series, Walsh polynomials, Walsh sine and Walsh cosine transformations, the Walsh integral, and, analogously to the fast Fourier transformation, there is a Fast Walsh Transformation. For an introduction to the theory and applications of Walsh functions see [15.6].
16 Probability Theory and Mathematical Statistics
When experiments or observations are made, various outcomes are possible even under the same conditions. Probability theory and statistics deal with the regularity of random outcomes of certain results with respect to given experiments or observations. (In probability theory and statistics, observations are also called experiments, since they have certain outcomes.) We suppose, at least theoretically, that these experiments can be repeated arbitrarily many times under the same circumstances; namely, these disciplines deal with the statistics of mass phenomena. The term stochastics is used for the mathematical handling of random phenomena.
16.1 Combinatorics
We often compose new sets, systems, or sequences from the elements of a given set in a certain way. Depending on the way we do it, we get the notions of permutation, combination, and arrangement. The basic problem of combinatorics is to determine how many different choices or arrangements are possible with the given elements.
16.1.1 Permutations
1. Definition A permutation of n elements is an ordering of the n elements.
2. Number of Permutations without Repetition The number of different permutations of n different elements is
    P_n = n!.    (16.1)
■ In a classroom 16 students are seated on 16 places. There are 16! different possible arrangements.
3. Number of Permutations with Repetitions The number P_n^{(k)} of different permutations of n elements containing k identical elements (k \le n) is
    P_n^{(k)} = \frac{n!}{k!}.    (16.2)
■ In a classroom the 16 schoolbags of 16 students are placed on 16 chairs. Four of them are identical. There are 16!/4! different arrangements of the schoolbags.
4. Generalization The number P_n^{(k_1, k_2, ..., k_m)} of different permutations of n elements containing m different types of elements with multiplicities k_1, k_2, ..., k_m respectively (k_1 + k_2 + ... + k_m = n) is
    P_n^{(k_1, k_2, ..., k_m)} = \frac{n!}{k_1!\,k_2!\cdots k_m!}.    (16.3)
■ Suppose we compose five-digit numbers from the digits 4, 4, 5, 5, 5. We can have P_5^{(2,3)} = \frac{5!}{2!\,3!} = 10 different numbers.
16.1.2 Combinations
1. Definition A combination is a choice of k elements from n different elements without considering their order. We call it a combination of k-th order and we distinguish between combinations with and without repetition.
2. Number of Combinations without Repetition The number C_n^{(k)} of different possibilities to choose k elements from n different elements without considering the order is
    C_n^{(k)} = \binom{n}{k}   with   0 \le k \le n   (see binomial coefficient in 1.1.6.4, 3., p. 13),    (16.4)
if we choose any element at most once. We call this a combination without repetition.
■ There are \binom{30}{4} = 27405 possibilities to choose an electoral board of four persons from 30 participants.
3. Number of Combinations with Repetition The number of possibilities to choose k elements from n different ones, where each element may be chosen arbitrarily many times and the order is not considered, is
    C_n^{(k)} (with repetition) = \binom{n+k-1}{k}.    (16.5)
In other words, we consider the number of different selections of k elements chosen from n different elements, where the selected ones need not be different.
■ Rolling k dice, we can get C_6^{(k)} = \binom{6+k-1}{k} different results. Consequently, we can get C_6^{(2)} = \binom{7}{2} = 21 different results with two dice.
16.1.3 Arrangements
1. Definition An arrangement is an ordering of k elements selected from n different ones, i.e., arrangements are combinations considering the order.
2. Number of Arrangements without Repetition The number V_n^{(k)} of different orderings of k different elements selected from n different ones is
    V_n^{(k)} = \binom{n}{k}\,k! = \frac{n!}{(n-k)!}.    (16.6)
■ How many different ways are there to choose a chairman, his deputy, and a first and a second assistant from 30 participants at an election meeting? The answer is \binom{30}{4}\,4! = 657720.
3. Number of Arrangements with Repetition An ordering of k elements selected from n different ones, where any of the elements can be selected arbitrarily many times, is called an arrangement with repetition. Their number is
    V_n^{(k)} (with repetition) = n^k.    (16.7)
■ A: In a soccer-toto with 12 games there are 3^{12} different outcomes.
■ B: We can represent 2^8 = 256 different symbols with the digital unit called a byte, which contains 8 bits; see for example the well-known ASCII table.
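The counts in the examples above can be reproduced with Python's standard library; a small sketch (the functions math.factorial, math.comb and math.perm are available from Python 3.8 on):

import math

print(math.factorial(16))                                            # permutations of 16 students
print(math.factorial(16) // math.factorial(4))                       # 16 schoolbags, 4 identical
print(math.factorial(5) // (math.factorial(2) * math.factorial(3)))  # P_5^(2,3) = 10
print(math.comb(30, 4))                                              # combinations without repetition: 27405
print(math.comb(6 + 2 - 1, 2))                                       # combinations with repetition: 21
print(math.perm(30, 4))                                              # arrangements without repetition: 657720
print(3 ** 12, 2 ** 8)                                               # arrangements with repetition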
16.1.4 Collection of the Formulas of Combinatorics (see Table 16.1)
Table 16.1 Collection of the formulas of combinatorics
Type of choice of k from n elements | without repetition | with repetition
Permutation (k = n) | P_n = n! | P_n^{(k_1,...,k_m)} = n!/(k_1!\cdots k_m!)
Combination | C_n^{(k)} = \binom{n}{k} | \binom{n+k-1}{k}
Arrangement | V_n^{(k)} = n!/(n-k)! | n^k
16.2 Probability Theory
16.2.1 Event, Frequency and Probability
16.2.1.1 Events
1. Different Types of Events All the possible outcomes of an experiment are called events in probability theory, and they form the fundamental probability set A. We distinguish the certain event, the impossible event and random events. The certain event occurs every time the experiment is performed, the impossible event never occurs; a random event sometimes occurs, sometimes does not. All possible outcomes of the experiment excluding each other are called elementary events (see also Table 16.2). We denote the events of the fundamental probability set A by A, B, C, ..., the certain event by I, the impossible event by 0. We define some operations and relations between the events; they are given in Table 16.2.
2. Properties of the Operations The fundamental probability set forms a Boolean algebra with complement, addition, and multiplication defined in Table 16.2, and it is called the field of events. The following rules are valid:
1. a) A + B = B + A,  (16.8)        b) AB = BA.  (16.9)
2. a) A + A = A,  (16.10)        b) AA = A.  (16.11)
3. a) A + (B + C) = (A + B) + C,  (16.12)        b) A(BC) = (AB)C.  (16.13)
4. a) A + \bar{A} = I,  (16.14)        b) A\bar{A} = 0.  (16.15)
5. a) A(B + C) = AB + AC,  (16.16)        b) A + BC = (A + B)(A + C).  (16.17)
6. a) \overline{A + B} = \bar{A}\,\bar{B},  (16.18)        b) \overline{AB} = \bar{A} + \bar{B}.  (16.19)
7. a) B - A = B\bar{A},  (16.20)        b) \bar{A} = I - A.  (16.21)
8. a) A(B - C) = AB - AC,  (16.22)        b) AB - C = (A - C)(B - C).  (16.23)
9. a) 0 \subseteq A,  (16.24)        b) A \subseteq I.  (16.25)
10. From A \subseteq B follows a) A = AB  (16.26)  and b) B = A + B\bar{A}  (16.27), and conversely.
11. Complete System of Events: A system of events A_\alpha (\alpha \in \Theta, \Theta a finite or infinite set of indices) is called a complete system of events if the following is valid:
11. a) A_\alpha A_\beta = 0 for \alpha \neq \beta  (16.28)    and    11. b) \sum_{\alpha \in \Theta} A_\alpha = I.  (16.29)
Table 16.2 Relations between events
Name | Notation | Definition
1. Complementary event of A | \bar{A} | \bar{A} occurs exactly if A does not.
2. Sum of the events A and B | A + B | A + B is the event which occurs if A or B or both occur.
3. Product of the events A and B | AB | AB is the event which occurs exactly if both A and B occur.
4. Difference of the events A and B | A - B | A - B occurs exactly if A occurs and B does not.
5. Event as a consequence of the other | A \subseteq B | A \subseteq B means that from the occurrence of A follows the occurrence of B.
6. Elementary or simple event | E | From E = A + B it follows that E = A or E = B.
7. Compound event | | Event which is not elementary.
8. Disjoint or exclusive events A and B | AB = 0 | The events A and B cannot occur at the same time.
■ A: Tossing two coins. Elementary events for the separate tossings (see the adjacent table):
          Head    Tail
1. Coin   A_{11}  A_{12}
2. Coin   A_{21}  A_{22}
1. Elementary event for tossing both coins, e.g.: the first coin shows head, the second shows tail: A_{11}A_{22}. 2. Compound event for tossing both coins: the first coin shows head: A_{11} = A_{11}A_{21} + A_{11}A_{22}. Compound event for tossing one coin, e.g., the first one: the first coin shows head or tail: A_{11} + A_{12} = I. Head and tail on the same coin are disjoint events: A_{11}A_{12} = 0.
■ B: Lifetime of light-bulbs. We can define the elementary events A_n: the lifetime t satisfies the inequalities (n-1)\Delta t < t \le n\Delta t (n = 1, 2, ...; \Delta t > 0 an arbitrary unit of time). Compound event A: the lifetime is at most n\Delta t, i.e., A = \sum_{\nu=1}^{n} A_\nu.
16.2.1.2 Frequencies and Probabilities
1. Frequencies Let A be an event belonging to the field of events A of an experiment. If the event A occurred n_A times while the experiment was repeated n times, then n_A is called the frequency, and n_A/n = h_A is called the relative frequency of the event A. The relative frequency satisfies certain properties which can be used to build up an axiomatic definition of the notion of the probability P(A) of the event A in the field of events A.
2. Definition of the Probability A real function P defined on the field of events is called a probability if it satisfies the following properties:
1. For every event A \in A we have 0 \le P(A) \le 1, and 0 \le h_A \le 1.    (16.30)
2. For the impossible event 0 and the certain event I we have P(0) = 0, P(I) = 1, and h_0 = 0, h_I = 1.    (16.31)
3. If the events A_i \in A (i = 1, 2, ...) are finitely or countably many mutually exclusive events (A_iA_k = 0 for i \neq k), then P(A_1 + A_2 + \cdots) = P(A_1) + P(A_2) + \cdots, and h_{A_1+A_2+\cdots} = h_{A_1} + h_{A_2} + \cdots.    (16.32)
3. Rules for Probabilities
1. B \subseteq A yields P(B) \le P(A).    (16.33)
2. P(A) + P(\bar{A}) = 1.    (16.34)
3. For two disjoint events A and B (AB = 0), we have P(A + B) = P(A) + P(B).    (16.35)
4. a) For arbitrary events A_i (i = 1, ..., n), we have
    P(A_1 + \cdots + A_n) = P(A_1) + \cdots + P(A_n) - P(A_1A_2) - \cdots - P(A_1A_n) - P(A_2A_3) - \cdots - P(A_2A_n) - \cdots - P(A_{n-1}A_n) + P(A_1A_2A_3) + \cdots + P(A_1A_2A_n) + \cdots + P(A_{n-2}A_{n-1}A_n) - + \cdots + (-1)^{n-1}P(A_1A_2\cdots A_n).    (16.36a)
4. b) In particular for n = 2: P(A_1 + A_2) = P(A_1) + P(A_2) - P(A_1A_2).    (16.36b)
5. Equally likely events: If every event A_i (i = 1, 2, ..., n) of a finite complete system of events occurs with the same probability, then
    P(A_i) = \frac{1}{n}.    (16.37)
If A is a sum of m (m \le n) events of such a complete system of equally likely events A_i (i = 1, 2, ..., n), then
    P(A) = \frac{m}{n}.    (16.38)
4. Examples of Probabilities
■ A: The probability P(A) of getting a 2 when rolling a fair die is P(A) = \frac{1}{6}.
■ B: What is the probability of guessing four numbers in the lotto "6 from 49", i.e., 6 numbers are to be chosen from the numbers 1, 2, ..., 49? If 6 numbers are drawn, then there are \binom{6}{4} possibilities to choose 4 of them. On the other hand there are \binom{49-6}{2} = \binom{43}{2} possibilities for the false numbers. Altogether, there are \binom{49}{6} different possibilities to draw 6 numbers. Therefore, the probability P(A_4) is:
    P(A_4) = \frac{\binom{6}{4}\binom{43}{2}}{\binom{49}{6}} \approx 9.69 \cdot 10^{-4}.
Similarly, the probability P(A_6) of a direct hit is:
    P(A_6) = \frac{1}{\binom{49}{6}} \approx 0.715 \cdot 10^{-7} = 7.15 \cdot 10^{-6}\,\%.
■ C: What is the probability P(A) that among k persons at least two have their birthday on the same day? (The years of birth need not be identical, and we suppose that every day has the same probability of being a birthday.) It is easier to consider the complementary event \bar{A}: all k persons have different birthdays. We get:
    P(\bar{A}) = \frac{365}{365}\cdot\frac{365-1}{365}\cdot\frac{365-2}{365}\cdots\frac{365-k+1}{365}.
From this it follows that
    P(A) = 1 - P(\bar{A}) = 1 - \frac{365 \cdot 364 \cdot 363 \cdots (365-k+1)}{365^k}.
k     | 10    | 20    | 23    | 30    | 60
P(A)  | 0.117 | 0.411 | 0.507 | 0.706 | 0.994
We can see that the probability that among 23 or more persons at least two have the same birthday is greater than 50 %.
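A short Python sketch (not from the handbook) reproducing the table via the complementary event:

def birthday(k):
    p_distinct = 1.0
    for j in range(k):
        p_distinct *= (365 - j) / 365      # probability that all k birthdays differ
    return 1.0 - p_distinct

for k in (10, 20, 23, 30, 60):
    print(k, round(birthday(k), 3))        # 0.117, 0.411, 0.507, 0.706, 0.994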
16.2.1.3 Conditional Probability, Bayes Theorem
1. Conditional Probability The probability of an event B, when it is known that some event A has already occurred, is called a conditional probability and is denoted by P(B|A) or P_A(B) (read: the probability that B occurs given that A has occurred). It is defined by
    P(B|A) = \frac{P(AB)}{P(A)},    P(A) \neq 0.    (16.39)
The conditional probability satisfies the following properties:
a) If P(A) \neq 0 and P(B) \neq 0 hold, then
    \frac{P(B|A)}{P(B)} = \frac{P(A|B)}{P(A)}.    (16.40a)
b) If P(A_1A_2\cdots A_n) \neq 0 holds, then
    P(A_1A_2\cdots A_n) = P(A_1)\,P(A_2|A_1)\cdots P(A_n|A_1A_2\cdots A_{n-1}).    (16.40b)
2. Independent Events The events A and B are independent if
    P(A|B) = P(A)   and   P(B|A) = P(B)    (16.41a)
hold. In this case we have
    P(AB) = P(A)\,P(B).    (16.41b)
3. Events in a Complete System of Events If A is a field of events and the events B_i \in A with P(B_i) > 0 (i = 1, 2, ...) form a complete system of events, then for an arbitrary event A \in A the following formulas are valid:
a) Total Probability Theorem
    P(A) = \sum_i P(A|B_i)\,P(B_i).    (16.42)
b) Bayes Theorem with P(A) > 0
    P(B_i|A) = \frac{P(A|B_i)\,P(B_i)}{\sum_j P(A|B_j)\,P(B_j)}.    (16.43)
■ Three machines produce the same type of product in a factory. The first one gives 20 % of the total production, the second one 30 % and the third one 50 %. It is known from past experience that 5 %, 4 %, and 2 % of the products made by the respective machines are defective. Two types of questions often arise: a) What is the probability that an article selected randomly from the total production is defective? b) If the randomly selected article is defective, what is the probability that it was made, e.g., by the first machine?
We use the following notation: A_i denotes the event that the randomly selected article is made by the i-th machine (i = 1, 2, 3), with P(A_1) = 0.2, P(A_2) = 0.3, P(A_3) = 0.5. The events A_i form a complete system of events: A_iA_j = 0 (i \neq j), A_1 + A_2 + A_3 = I. A denotes the event that the chosen article is defective. P(A|A_1) = 0.05 gives the probability that an article produced by the first machine is defective; analogously P(A|A_2) = 0.04 and P(A|A_3) = 0.02 hold. Now we can answer the questions:
a) P(A) = P(A_1)P(A|A_1) + P(A_2)P(A|A_2) + P(A_3)P(A|A_3) = 0.2 \cdot 0.05 + 0.3 \cdot 0.04 + 0.5 \cdot 0.02 = 0.032.
b) By the Bayes theorem (16.43), P(A_1|A) = \frac{P(A_1)P(A|A_1)}{P(A)} = \frac{0.2 \cdot 0.05}{0.032} = 0.3125.
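The same computation, the total probability theorem (16.42) followed by the Bayes theorem (16.43), as a Python sketch:

prior = [0.2, 0.3, 0.5]       # P(A_i): production shares of the three machines
defect = [0.05, 0.04, 0.02]   # P(A | A_i): defect rates

p_defect = sum(p * d for p, d in zip(prior, defect))               # total probability theorem
posterior = [p * d / p_defect for p, d in zip(prior, defect)]      # Bayes theorem

print(p_defect)       # 0.032
print(posterior[0])   # P(A_1 | A) = 0.3125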
16.2.2 Random Variables, Distribution Functions
To apply the methods of analysis in probability theory, we introduce the notions of variable and function.
16.2.2.1 Random Variable
If we assign numbers to the elementary events, then we define a random variable X. Then every random event can be described by this variable X. The random variable X can be considered as a quantity which takes its values x randomly from a subset R of the real numbers. If R contains finitely or countably many different values, then we call X a discrete random variable. In the case of a continuous random variable, R can be the whole real axis or it may contain subintervals. For the precise definition see 16.2.2.2, 2., p. 750. There are also mixed random variables.
■ A: If we assign the values 1, 2, 3, 4 to the elementary events A_{11}, A_{12}, A_{21}, A_{22}, respectively, in example A, p. 746, then we define a discrete random variable X.
■ B: The lifetime T of a randomly selected light-bulb is a continuous random variable. The elementary event T = t occurs if the lifetime T is equal to t.
16.2.2.2 Distribution Function
1. Distribution Function and its Properties A random variable X can be defined by its distribution function
    F(x) = P(X \le x)   for   -\infty < x < \infty.    (16.44)
It determines the probability that the random variable X takes a value between -\infty and x. Its domain is the whole real axis. The distribution function has the following properties:
a) F(-\infty) = 0,   F(+\infty) = 1.
b) F(x) is a non-decreasing function of x.
c) F(x) is continuous on the right.
Remarks:
1. From the definition it follows that P(X = a) = F(a) - \lim_{x \to a-0} F(x).
2. In the literature the definition F(x) = P(X < x) is also often used. In this case P(X = a) = \lim_{x \to a+0} F(x) - F(a).
2. Distribution Function of Discrete and Continuous Random Variables
a) Discrete Random Variable: A discrete random variable X, which takes the values x_i (i = 1, 2, ...) with probabilities P(X = x_i) = p_i (i = 1, 2, ...), has the distribution function
    F(x) = \sum_{x_i \le x} p_i.    (16.45)
b) Continuous Random Variable: A random variable is called continuous if there exists a non-negative function f(x) such that the probability P(X \in S) can be expressed as P(X \in S) = \int_S f(x)\,dx for any domain S over which an integral can be considered. This function is the so-called density function. A continuous random variable takes any given value x_i with probability 0, so we rather consider the probability that X takes its value from a finite interval [a, b]:
    P(a \le X \le b) = \int_a^b f(t)\,dt.    (16.46)
A continuous random variable has an everywhere continuous distribution function:
    F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt.    (16.47)
F'(x) = f(x) holds at the points where f(x) is continuous.
Remark: When there is no confusion about the upper integration limit, the integration variable is often denoted by x instead of t.
3. Area Interpretation of the Probability By introducing the distribution function and density function in (16.47), we can represent the probability P(X \le x) = F(x) as the area between the density function f(t) and the x-axis over the interval -\infty < t \le x (Fig. 16.1a).
Figure 16.1
Often a probability value \alpha is given. If
    P(X > x_\alpha) = \alpha    (16.48)
holds, we call the corresponding value of the abscissa x = x_\alpha the quantile or the fractile of order \alpha (Fig. 16.1b). This means the area under the density function f(t) to the right of x_\alpha is equal to \alpha.
Remark: In the literature, the area to the left of x_\alpha is also used for the definition of the quantile. In mathematical statistics, for small values of \alpha, e.g., \alpha = 5 % or \alpha = 1 %, we also use the notion of significance level or type I error rate. The most often used quantiles of the most important distributions in practice are given in tables (Table 21.14, p. 1083, to Table 21.18, p. 1090).
16.2.2.3 Expected Value and Variance, Chebyshev Inequality
For a global characterization of a distribution we mostly use the parameters expected value, denoted by \mu, and variance \sigma^2 of a random variable X. The expected value can be interpreted in the terminology of mechanics as the abscissa of the center of gravity of the surface bounded by the curve of the density function f(x) and the x-axis. The variance represents a measure of the deviation of the random variable X from its expected value \mu.
1. Expected Value If g(X) is a single-valued function of the random variable X, then g(X) is also a random variable. Its expected value or expectation is defined as:
a) Discrete case: E(g(X)) = \sum_{k=1}^{\infty} g(x_k)\,p_k,   if the series \sum_k |g(x_k)|\,p_k exists.    (16.49a)
b) Continuous case: E(g(X)) = \int_{-\infty}^{+\infty} g(x)\,f(x)\,dx,   if \int_{-\infty}^{+\infty} |g(x)|\,f(x)\,dx exists.    (16.49b)
The expected value of the random variable X is defined as
    \mu_X = E(X) = \sum_k x_k\,p_k   or   \int_{-\infty}^{+\infty} x\,f(x)\,dx,    (16.50a)
if the corresponding sum or integral of the absolute values exists. We note that (16.49a,b) yields that
    E(aX + b) = a\,\mu_X + b   (a, b const)    (16.50b)
is also valid. Of course, it is possible that a random variable does not have an expected value.
2. Moments of Order n We introduce:
a) Moment of order n: E(X^n).    (16.51a)
b) Central moment of order n: E((X - \mu_X)^n).    (16.51b)
3. Variance and Standard Deviation In particular, for n = 2 the central moment is called the variance or dispersion:
    E((X - \mu_X)^2) = D^2(X) = \sigma_X^2,    (16.52)
if the expected values occurring in the formula exist. The quantity \sigma_X is called the standard deviation. The following relations are valid:
    D^2(X) = \sigma_X^2 = E(X^2) - \mu_X^2,        D^2(aX + b) = a^2\,D^2(X).    (16.53)
4. Weighted and Arithmetical Mean In the discrete case, the expected value is obviously the weighted mean
    E(X) = p_1 x_1 + \cdots + p_n x_n    (16.54)
of the values x_1, ..., x_n with the probabilities p_k as weights (k = 1, ..., n). The probabilities of the discrete uniform distribution are p_1 = p_2 = \cdots = p_n = 1/n, and E(X) is the arithmetical mean of the values x_k:
    E(X) = \frac{x_1 + x_2 + \cdots + x_n}{n}.    (16.55)
In the continuous case, the density function of the continuous uniform distribution on the finite interval [a, b] is
    f(x) = \frac{1}{b-a}   for   a < x < b,    f(x) = 0   otherwise,    (16.56)
and it follows that
    E(X) = \frac{a+b}{2}.    (16.57)
5. Chebyshev Inequality If the random variable X has the expected value \mu and standard deviation \sigma, then for arbitrary \lambda > 0 the Chebyshev inequality is valid:
    P(|X - \mu| \ge \lambda\sigma) \le \frac{1}{\lambda^2}.    (16.58)
That is, it is very unlikely that the values of the random variable X are farther from the expected value \mu than a large multiple \lambda of the standard deviation.
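A simulation sketch (not from the handbook) illustrating the Chebyshev inequality; the exponential distribution with parameter 1 (so \mu = \sigma = 1) is an arbitrary choice, and the observed tail probability stays well below the bound 1/\lambda^2.

import random

random.seed(0)
mu, sigma, lam, n = 1.0, 1.0, 3.0, 100000
sample = [random.expovariate(1.0) for _ in range(n)]     # exponential with parameter 1
tail = sum(abs(x - mu) >= lam * sigma for x in sample) / n
print(tail, "<=", 1.0 / lam ** 2)                        # roughly 0.018 <= 0.111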
16.2.2.4 Multidimensional Random Variable
If the elementary events mean that n random variables X_1, ..., X_n take n real values x_1, ..., x_n, then a random vector X = (X_1, X_2, ..., X_n) is defined (see also random vector, 16.3.1.1, 4., p. 768). The corresponding distribution function is defined by
    F(x_1, ..., x_n) = P(X_1 \le x_1, ..., X_n \le x_n).    (16.59)
The random vector is called continuous if there is a function f(t_1, ..., t_n) such that
    F(x_1, ..., x_n) = \int_{-\infty}^{x_1}\cdots\int_{-\infty}^{x_n} f(t_1, ..., t_n)\,dt_1\cdots dt_n    (16.60)
holds. The function f(t_1, ..., t_n) is called the density function. It is non-negative. If some of the variables x_1, ..., x_n tend to infinity, then we get the so-called marginal distributions. Further investigations and examples can be found in the literature. The random variables X_1, ..., X_n are independent random variables if
    F(x_1, ..., x_n) = F_1(x_1)F_2(x_2)\cdots F_n(x_n),        f(t_1, ..., t_n) = f_1(t_1)\cdots f_n(t_n).    (16.61)
16.2.3 Discrete Distributions
1. Two-Stage Population and Urn Model Suppose we have a two-stage population with N elements, i.e., the population we consider has two classes of elements. One class has M elements with a property A, the other one has N - M elements which do not have the property A. If we investigate the probabilities P(A) = p and P(\bar{A}) = 1 - p for randomly chosen elements, then we distinguish between two cases: when we select n elements one after the other, we either replace the previously selected element before selecting the next one, or we do not replace it. The selected n elements, of which k have the property A, are called the sample, n being the size of the sample. This can be illustrated by the urn model.
2. Urn Model Suppose there are many black balls and white balls in a container. The question is: what is the probability that among n randomly selected balls there are k black ones? If we put every chosen ball back into the container after determining its color, then the number k of black ones among the chosen n balls has a binomial distribution. If we do not put back the chosen balls and n \le M and n \le N - M hold, then the number of black ones has a hypergeometric distribution.
16.2.3.1 Binomial Distribution
Suppose we observe only the two events A and \bar{A} in an experiment, and we perform n independent experiments. If P(A) = p and P(\bar{A}) = 1 - p hold every time, then the probability that A occurs exactly k times is
    W_p^n(k) = \binom{n}{k} p^k (1-p)^{n-k}    (k = 0, 1, 2, ..., n).    (16.62)
For every choice of an independent element from the population, the probabilities are
    P(A) = \frac{M}{N} = p,        P(\bar{A}) = \frac{N-M}{N} = 1 - p = q.    (16.63)
The probability of getting an element with property A for the first k choices, then an element with the property \bar{A} for the remaining n - k choices, is p^k(1-p)^{n-k}, because the results of the choices are independent of each other. We get the same result assigning the k places in any other way. We can assign these places in
    \binom{n}{k} = \frac{n!}{k!\,(n-k)!}    (16.64)
different ways, and these events are mutually exclusive, so we add \binom{n}{k} equal numbers to get the required probability. A random variable X_n for which P(X_n = k) = W_p^n(k) holds is called binomially distributed with parameters n and p.
1. Expected Value and Variance
    E(X_n) = \mu = n\,p,    (16.65a)        D^2(X_n) = \sigma^2 = n\,p\,(1-p).    (16.65b)
2. Approximation of the Binomial Distribution by the Normal Distribution If X_n has a binomial distribution, then
    \lim_{n\to\infty} P\!\left(\frac{X_n - np}{\sqrt{np(1-p)}} \le x\right) = \Phi(x).    (16.65c)
This means that, if n is large, the binomial distribution can be well approximated by a normal distribution (see 16.2.4.1, p. 756) with parameters \mu = E(X_n) and \sigma^2 = D^2(X_n), if p or 1 - p are not too small. The approximation is the more accurate the closer p is to 0.5 and the larger n is, but it is acceptable if np > 4 and n(1-p) > 4 hold. For very small p or 1 - p, the approximation by the Poisson distribution (see (16.68) in 16.2.3.3) is useful.
3. Recursion Formula The following recursion formula is recommended for practical calculations with the binomial distribution:
    W_p^n(k+1) = \frac{n-k}{k+1}\cdot\frac{p}{q}\cdot W_p^n(k).    (16.65d)
4. Sum of Binomially Distributed Random Variables If X_n and X_m are binomially distributed random variables with parameters n, p and m, p, then the random variable X = X_n + X_m is also binomially distributed, with parameters n + m, p.
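A Python sketch computing the probabilities (16.62) both directly and with the recursion formula (16.65d); the parameters n = 5, p = 0.25 correspond to one of the cases discussed below.

import math

def binom_pmf(n, p, k):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.25
w = [binom_pmf(n, p, 0)]
for k in range(n):
    w.append(w[k] * (n - k) / (k + 1) * p / (1 - p))   # recursion (16.65d)

print([round(binom_pmf(n, p, k), 4) for k in range(n + 1)])
print([round(x, 4) for x in w])                        # the same values
print(n * p, n * p * (1 - p))                          # expected value and variance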
Fig. 16.2a,b,c represents the distributions of three binomially distributed random variables with parameters n = 5 and p = 0.5, 0.25, and 0.1. Since the binomial coefficients are symmetric, the distribution is symmetric for p = q = 0.5, and the farther p is from 0.5 the less symmetric the distribution is.
16.2.3.2 Hypergeometric Distribution
Just as with the binomial distribution, we suppose that we have a two-stage population with N elements, i.e., the population we consider has two classes of elements. One class has M elements with a property A, the other one has N - M elements which do not have the property A. In contrast to the case of the binomial distribution, we do not replace the chosen ball of the urn model. The probability that among the n chosen balls there are k black ones is
    W_{M,N}^n(k) = \frac{\binom{M}{k}\binom{N-M}{n-k}}{\binom{N}{n}},    (16.66a)        0 \le k \le n,   k \le M,   n - k \le N - M.    (16.66b)
If also n \le M and n \le N - M hold, then the random variable X with the distribution (16.66a) is said to be hypergeometrically distributed.
Figure 16.2
1. Expected Value and Variance of the Hypergeometric Distribution
    E(X) = n\,\frac{M}{N},    (16.67a)        D^2(X) = n\,\frac{M}{N}\left(1 - \frac{M}{N}\right)\frac{N-n}{N-1}.    (16.67b)
2. Recursion Formula
    W_{M,N}^n(k+1) = \frac{(n-k)(M-k)}{(k+1)(N-M-n+k+1)}\,W_{M,N}^n(k).    (16.67c)
In Fig. 16.3a,b,c we represent three hypergeometric distributions for the cases N = 100 and M = 50, 25 and 10, with n = 5. These cases correspond to the cases p = 0.5, 0.25, and 0.1 of Fig. 16.2a,b,c. There is no significant difference between the binomial and hypergeometric distributions in these examples. If M and N - M are both much larger than n, then the hypergeometric distribution can be well approximated by a binomial one with parameters as in (16.63).
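A Python sketch comparing the hypergeometric probabilities (16.66a) with the corresponding binomial approximation for N = 100, M = 25, n = 5 (i.e., p = M/N = 0.25):

import math

def hyper_pmf(N, M, n, k):
    return math.comb(M, k) * math.comb(N - M, n - k) / math.comb(N, n)

N, M, n = 100, 25, 5
p = M / N
for k in range(n + 1):
    binom = math.comb(n, k) * p**k * (1 - p)**(n - k)
    print(k, round(hyper_pmf(N, M, n, k), 4), round(binom, 4))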
Figure 16.3
16.2.3.3 Poisson Distribution
If the possible values of a random variable X are the non-negative integers with probabilities
    P(X = k) = \frac{\lambda^k}{k!}\,e^{-\lambda}    (k = 0, 1, 2, ...;  \lambda > 0),    (16.68)
then it has a Poisson distribution with parameter \lambda.
1. Expected Value and Variance of the Poisson Distribution
    E(X) = \lambda,    (16.69a)        D^2(X) = \lambda.    (16.69b)
2. Sum of Independent Poisson Distributed Random Variables If X_1 and X_2 are independent Poisson distributed random variables with parameters \lambda_1 and \lambda_2, then the random variable X = X_1 + X_2 also has a Poisson distribution, with parameter \lambda = \lambda_1 + \lambda_2.
3. Recursion Formula
    P(X = k+1) = \frac{\lambda}{k+1}\,P(X = k).    (16.69c)
We can get the Poisson distribution as a limit of binomial distributions with parameters n and p if n \to \infty and p (p \to 0) changes with n so that np = \lambda = const, i.e., the Poisson distribution is a good approximation of a binomial distribution for large n and small p with \lambda = np. In practice we use it if p \le 0.08 and n \ge 1500p hold, because the calculations are easier with a Poisson distribution. Table 21.14, p. 1083, contains numerical values of the Poisson distribution.
Fig. 16.4a,b,c represents three Poisson distributions with \lambda = np = 2.5, 1.25 and 0.5, i.e., with parameters corresponding to Figs. 16.2 and 16.3. The number of independently occurring point-like discontinuities in a continuous medium can usually be described by a Poisson distribution, e.g., the number of clients arriving in a store during a certain time interval, the number of misprints in a book, etc.
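A Python sketch illustrating the approximation of a binomial distribution by a Poisson distribution with \lambda = np; the values n = 1000, p = 0.0025 are an arbitrary choice satisfying p \le 0.08 and n \ge 1500p.

import math

n, p = 1000, 0.0025
lam = n * p
for k in range(6):
    binom = math.comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = lam**k / math.factorial(k) * math.exp(-lam)
    print(k, round(binom, 5), round(poisson, 5))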
Figure 16.4
16.2.4 Continuous Distributions
16.2.4.1 Normal Distribution
1. Distribution Function and Density Function A random variable X has a normal distribution if its distribution function is
    F(x) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x} e^{-(t-\mu)^2/(2\sigma^2)}\,dt.    (16.70)
Then it is also called a normal variable, and the distribution is called a (\mu, \sigma) normal distribution. The function
    f(t) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(t-\mu)^2/(2\sigma^2)}    (16.71)
is the density function of the normal distribution. It takes its maximum at t = \mu and it has inflection points at \mu \pm \sigma (see (2.59), p. 72, and Fig. 16.5a).
Figure 16.5
2. Expected Value and Variance The parameters \mu and \sigma^2 of the normal distribution are its expected value and variance, respectively, i.e.,
    E(X) = \mu,        D^2(X) = \sigma^2.    (16.72a)
If the normal random variables X_1 and X_2 are independent with parameters \mu_1, \sigma_1 and \mu_2, \sigma_2, respectively, then the random variable X = k_1X_1 + k_2X_2 (k_1, k_2 real constants) also has a normal distribution, with parameters \mu = k_1\mu_1 + k_2\mu_2,  \sigma = \sqrt{k_1^2\sigma_1^2 + k_2^2\sigma_2^2}.
If we perform the substitution \tau = \frac{t-\mu}{\sigma} in (16.70), then the calculation of the values of the distribution function of any normal distribution is reduced to the calculation of the values of the distribution function of the (0,1) normal distribution, which is called the standard normal distribution. Consequently, the probability P(a \le X \le b) of a normal variable can be expressed by the distribution function \Phi(x) of the standard normal distribution:
    P(a \le X \le b) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right).    (16.73)
16.2.4.2 Standard Normal Distribution, Gaussian Error Function
1. Distribution Function and Density Function From (16.70) with \mu = 0 and \sigma^2 = 1 we get the distribution function
    P(X \le x) = \Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt = \int_{-\infty}^{x} \varphi(t)\,dt    (16.74a)
of the so-called standard normal distribution. Its density function is
    \varphi(t) = \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2};    (16.74b)
it is called the Gaussian error curve (Fig. 16.5b). The values of the distribution function \Phi(x) of the (0,1) normal distribution are given in Table 21.15, p. 1085. Only the values for positive arguments x are given, while the values for negative arguments are obtained from the relation
    \Phi(-x) = 1 - \Phi(x).    (16.75)
2. Probability Integral The integral \Phi(x) is also called the probability integral or Gaussian error integral. In the literature one also finds definitions based on the error function,
    \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt,    so that    \Phi(x) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right];
erf denotes the error function.
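In numerical work \Phi(x) is therefore conveniently obtained from the error function; a Python sketch using math.erf (the parameters and interval bounds in the last two lines are arbitrary example values):

import math

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

print(phi(1.96))                  # approximately 0.975
print(phi(-1.0), 1 - phi(1.0))    # illustrates relation (16.75)

mu, sigma, a, b = 0.0, 1.0, -1.0, 2.0
print(phi((b - mu) / sigma) - phi((a - mu) / sigma))   # P(a <= X <= b) by (16.73)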
16.2.4.3 Logarithmic Normal Distribution
1. Density Function and Distribution Function The continuous random variable X has a logarithmic normal distribution, or lognormal distribution, with parameters \mu_L and \sigma_L^2 if it can take all positive values and if the random variable Y, defined by
    Y = \log X,    (16.77)
has a normal distribution with parameters \mu_L and \sigma_L^2. Using the natural logarithm, the random variable X consequently has the density function
    f(t) = \frac{1}{\sigma_L\,t\sqrt{2\pi}}\,\exp\!\left(-\frac{(\ln t - \mu_L)^2}{2\sigma_L^2}\right)   for   t > 0,    f(t) = 0   for   t \le 0,    (16.78)
and the distribution function
    F(x) = 0   for   x \le 0,    F(x) = \int_0^x f(t)\,dt   for   x > 0.    (16.79)
We can use either the natural or the decimal logarithm in practical applications.
2. Expected Value and Variance Using the natural logarithm we get the expected value and variance of the lognormal distribution:
    E(X) = e^{\mu_L + \sigma_L^2/2},        D^2(X) = e^{2\mu_L + \sigma_L^2}\left(e^{\sigma_L^2} - 1\right).    (16.80)
3. Remarks
a) The density function of the lognormal distribution is continuous everywhere and takes positive values only for positive arguments. Fig. 16.6 shows the density functions of lognormal distributions for different \mu_L and \sigma_L; here the natural logarithm is used.
b) The values \mu_L and \sigma_L^2 are not the expected value and variance of the lognormal random variable itself, but of the variable Y = \log X.
c) The values of the distribution function F(x) of the lognormal distribution can be calculated by the distribution function \Phi(x) of the standard normal distribution (see (16.74a)) in the following way:
    F(x) = \Phi\!\left(\frac{\ln x - \mu_L}{\sigma_L}\right)   for   x > 0.    (16.81)
d) The lognormal distribution is often applied in the lifetime analysis of economical, technical, and biological processes.
e) The normal distribution arises from the additive superposition of a large number of independent random variables, the lognormal distribution from the multiplicative superposition of a large number of independent random variables.
Figure 16.6 (lognormal densities for \mu_L = \ln 0.5, \sigma_L = 0.5 and \mu_L = 0, \sigma_L = 1)        Figure 16.7
16.2.4.4 Exponential Distribution
1. Density Function and Distribution Function A continuous random variable X has an exponential distribution with parameter \lambda (\lambda > 0) if its density function is (Fig. 16.7)
    f(t) = 0   for   t < 0,    f(t) = \lambda e^{-\lambda t}   for   t \ge 0;    (16.82)
consequently, the distribution function is
    F(x) = 0   for   x < 0,    F(x) = 1 - e^{-\lambda x}   for   x \ge 0.    (16.83)
2. Expected Value and Variance
    \mu = \frac{1}{\lambda},        \sigma^2 = \frac{1}{\lambda^2}.    (16.84)
The following quantities can usually be described by an exponential distribution: the length of phone calls, the lifetime of radioactive particles, the working time of a machine between two stops in certain processes, the lifetime of light-bulbs or of certain building elements.
16.2.4.5 Weibull Distribution
1. Density Function and Distribution Function The continuous random variable X has a Weibull distribution with parameters \alpha and \beta (\alpha > 0, \beta > 0) if its density function is
    f(t) = 0   for   t < 0,    f(t) = \frac{\alpha}{\beta}\left(\frac{t}{\beta}\right)^{\alpha-1} e^{-(t/\beta)^\alpha}   for   t \ge 0,    (16.85)
and so its distribution function is
    F(x) = 0   for   x < 0,    F(x) = 1 - e^{-(x/\beta)^\alpha}   for   x \ge 0.    (16.86)
2. Expected Value and Variance
    E(X) = \beta\,\Gamma\!\left(1 + \frac{1}{\alpha}\right),        D^2(X) = \beta^2\left[\Gamma\!\left(1 + \frac{2}{\alpha}\right) - \Gamma^2\!\left(1 + \frac{1}{\alpha}\right)\right].    (16.87)
Here \Gamma(z) denotes the gamma function (see 8.2.5, 6., p. 459):
    \Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt   for   z > 0.    (16.88)
In (16.85), \alpha is the shape parameter and \beta is the scale parameter (Fig. 16.8, Fig. 16.9).
Figure 16.8 (\beta = 1)        Figure 16.9 (\alpha = 2)
Remarks:
a) For \alpha = 1 the Weibull distribution becomes an exponential distribution with \lambda = \frac{1}{\beta}.
b) The Weibull distribution also has a three-parameter form if we introduce a position parameter \gamma. Then the distribution function is
    F(x) = 1 - \exp\!\left(-\left(\frac{x-\gamma}{\beta}\right)^{\alpha}\right)   for   x \ge \gamma,    F(x) = 0   otherwise.    (16.89)
c) The Weibull distribution is especially useful in life expectancy theory because, e.g., it describes the functional lifetime of building elements with great flexibility.
16.2.4.6 \chi^2 (Chi-Square) Distribution
1. Density Function and Distribution Function Let X_1, X_2, ..., X_n be n independent (0,1) normal random variables. Then the distribution of the random variable
    \chi^2 = X_1^2 + X_2^2 + \cdots + X_n^2    (16.90)
is called the \chi^2 distribution with n degrees of freedom. Its distribution function is denoted by F_{\chi^2}(x), the corresponding density function by f_{\chi^2}(t):
    F_{\chi^2}(x) = 0   for   x \le 0,    F_{\chi^2}(x) = \int_0^x f_{\chi^2}(t)\,dt   for   x > 0,    (16.91a)
    f_{\chi^2}(t) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\,t^{n/2-1}\,e^{-t/2}   for   t > 0,    f_{\chi^2}(t) = 0   for   t \le 0.    (16.91b)
2. Expected Value and Variance
    E(\chi^2) = n,    (16.92a)        D^2(\chi^2) = 2n.    (16.92b)
3. Sum of Independent Random Variables If X_1 and X_2 are independent random variables both having a \chi^2 distribution, with n and m degrees of freedom, then the random variable X = X_1 + X_2 has a \chi^2 distribution with n + m degrees of freedom.
4. Sum of Independent Normal Random Variables If X_1, X_2, ..., X_n are independent (0, \sigma) normal random variables, then
    X = \sum_{i=1}^{n} X_i^2   has the density function   f(t) = \frac{1}{\sigma^2}\,f_{\chi^2}\!\left(\frac{t}{\sigma^2}\right),    (16.93)
    X = \frac{1}{n}\sum_{i=1}^{n} X_i^2   has the density function   f(t) = \frac{n}{\sigma^2}\,f_{\chi^2}\!\left(\frac{nt}{\sigma^2}\right),    (16.94)
    X = \sqrt{\frac{1}{n}\sum_{i=1}^{n} X_i^2}   has the density function   f(t) = \frac{2nt}{\sigma^2}\,f_{\chi^2}\!\left(\frac{nt^2}{\sigma^2}\right).    (16.95)
5. Quantile For the quantile \chi^2_{\alpha,m} (see 16.2.2.2, 3., p. 750) of the \chi^2 distribution with m degrees of freedom (Fig. 16.10),
    P(X > \chi^2_{\alpha,m}) = \alpha.    (16.96)
Figure 16.10        Figure 16.11
Quantiles of the \chi^2 distribution can be found in Table 21.16, p. 1087.
16.2.4.7 Fisher F Distribution
1. Density Function and Distribution Function If X_1 and X_2 are independent random variables having \chi^2 distributions with m_1 and m_2 degrees of freedom, respectively, then the distribution of the random variable
    F_{m_1,m_2} = \frac{X_1/m_1}{X_2/m_2}    (16.97)
is a Fisher distribution or F distribution with (m_1, m_2) degrees of freedom. Its density function is
    f_F(t) = \frac{\Gamma\!\left(\frac{m_1+m_2}{2}\right)}{\Gamma\!\left(\frac{m_1}{2}\right)\Gamma\!\left(\frac{m_2}{2}\right)}\left(\frac{m_1}{m_2}\right)^{m_1/2} t^{\,m_1/2-1}\left(1 + \frac{m_1}{m_2}\,t\right)^{-(m_1+m_2)/2}   for   t > 0,    f_F(t) = 0   for   t \le 0.    (16.98a)
For x \le 0 we have F_F(x) = P(F_{m_1,m_2} \le x) = 0; for x > 0,
    F_F(x) = P(F_{m_1,m_2} \le x) = \int_0^x f_F(t)\,dt.
2. Expected Value and Variance
    E(F_{m_1,m_2}) = \frac{m_2}{m_2-2}   (m_2 > 2),        D^2(F_{m_1,m_2}) = \frac{2m_2^2(m_1+m_2-2)}{m_1(m_2-2)^2(m_2-4)}   (m_2 > 4).
The quantiles t_{\alpha,m_1,m_2} (see 16.2.2.2, 3., p. 750) of the Fisher distribution (Fig. 16.11) can be found in Table 21.17, p. 1088.
16.2.4.8 Student t Distribution
1. Density Function and Distribution Function If X is a (0,1) normal random variable and Y is a random variable independent of X which has a \chi^2 distribution with m = n - 1 degrees of freedom, then the distribution of the random variable
    T = \frac{X}{\sqrt{Y/m}}    (16.100)
is called a Student t distribution or t distribution with m degrees of freedom. The distribution function is denoted by F_S(x), the corresponding density function by f_S(t):
    f_S(t) = \frac{\Gamma\!\left(\frac{m+1}{2}\right)}{\sqrt{m\pi}\,\Gamma\!\left(\frac{m}{2}\right)}\left(1 + \frac{t^2}{m}\right)^{-(m+1)/2},    (16.101a)        F_S(x) = \int_{-\infty}^{x} f_S(t)\,dt.    (16.101b)
Figure 16.12
2. Expected Value and Variance
    E(T) = 0,    (16.102a)        D^2(T) = \frac{m}{m-2}   (m > 2).    (16.102b)
3. Quantile The quantiles t_{\alpha,m} and t_{\alpha/2,m} of the t distribution (Fig. 16.12a,b), for which
    P(T > t_{\alpha,m}) = \alpha    (16.103a)    or    P(|T| > t_{\alpha/2,m}) = \alpha    (16.103b)
holds, are given in Table 21.18, p. 1090. The Student t distribution, introduced by Gosset under the pseudonym "Student", is used in the case of samples with small sample size n, when only estimates can be given for the mean and for the standard deviation. The standard deviation (16.102b) no longer depends on the deviation of the population from which the sample is taken.
16.2.5 Law of Large Numbers, Limit Theorems
The law of large numbers gives a relation between the probability P(A) of a random event A and its relative frequency n_A/n for a large number of repeated experiments.
1. Law of Large Numbers of Bernoulli The following inequality holds for arbitrary given numbers \varepsilon > 0 and \eta > 0:
    P\!\left(\left|\frac{n_A}{n} - P(A)\right| \ge \varepsilon\right) \le \eta    (16.104a)    if    n \ge \frac{1}{4\varepsilon^2\eta}.    (16.104b)
For other similar theorems see [16.5].
■ How many times should we roll a not necessarily fair die if the relative frequency of the 6 should be closer to its probability than 0.01 with a probability of at least 95 %?
Now \varepsilon = 0.01 and \eta = 0.05, so 4\varepsilon^2\eta = 2 \cdot 10^{-5}, and according to the law of large numbers of Bernoulli n \ge 5 \cdot 10^4 must hold. This is an extremely large number. We can reduce n if we know the distribution function.
2. Central Limit Theorem of Lindeberg-Levy If the independent random variables X_1, ..., X_n all have the same distribution with expected value \mu and variance \sigma^2, then the distribution of the random variable
    Y_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}    with    \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i    (16.105)
tends to the (0,1) normal distribution for n \to \infty, i.e., for its distribution function F_n(y) we get
    \lim_{n\to\infty} F_n(y) = \Phi(y).    (16.106)
If n > 30 holds, then F_n(y) can be replaced by the (0,1) normal distribution. Further limit theorems can be found in [16.5], [16.7].
■ We take a sample of 100 items from a production of resistors. We suppose that their actual resistance values are independent and have the same distribution with variance \sigma^2 = 150. The mean value for these 100 resistors is \bar{x} = 1050 \Omega. In which domain is the true expected value \mu with a probability of 99 %? We are looking for an \varepsilon such that P(|\bar{X} - \mu| \le \varepsilon) = 0.99 holds. We can suppose (see (16.105)) that the random variable Y = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} has a (0,1) normal distribution. From P(|Y| \le \lambda) = P(-\lambda \le Y \le \lambda) = P(Y \le \lambda) - P(Y < -\lambda) and from P(Y \le -\lambda) = 1 - P(Y \le \lambda) it follows that P(|Y| \le \lambda) = 2P(Y \le \lambda) - 1 = 0.99. So P(Y \le \lambda) = \Phi(\lambda) = 0.995, and from Table 21.15, p. 1085, we get \lambda = 2.58. Since \sigma/\sqrt{n} = 1.225, we get with 99 % probability: |1050 - \mu| < 2.58 \cdot 1.225, i.e., 1046.8 \Omega < \mu < 1053.2 \Omega.
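The interval of the resistor example can be recomputed without the table; the Python sketch below determines \lambda with \Phi(\lambda) = 0.995 by bisection (an arbitrary numerical choice) and then evaluates the bounds.

import math

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

lo, hi = 0.0, 10.0
while hi - lo > 1e-10:                      # solve Phi(lambda) = 0.995
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if phi(mid) < 0.995 else (lo, mid)
lam = lo

n, var, mean = 100, 150.0, 1050.0
eps = lam * math.sqrt(var) / math.sqrt(n)
print(round(lam, 3), round(mean - eps, 1), round(mean + eps, 1))   # about 2.576, 1046.8, 1053.2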
16.2.6 Stochastic Processes and Stochastic Chains
Many processes occurring in nature and studied in engineering and economics can be realistically described only by time-dependent random variables.
■ The electric consumption of a city at a certain time t has a random fluctuation that depends on the actual demand of the households and industry. The electric consumption can be considered as a continuous random variable X. When the observation time t changes, the electric consumption is a continuous random variable at every moment, so it is a function of time.
The stochastic analysis of time-dependent random variables leads to the concept of stochastic processes, which has a huge literature of its own (see, e.g., [16.7], [16.8]). Some introductory notions are given next.
16.2.6.1 Basic Notions, Markov Chains
1. Stochastic Processes A set of random variables depending on one parameter is called a stochastic process. The parameter can, in general, be considered as time t, so the random variable can be denoted by X_t and the stochastic process is given by the set
    \{X_t \mid t \in T\}.    (16.107)
The set of parameter values is called the parameter space T; the set of values of the random variables is the state space Z.
2. Stochastic Chains If both the parameter space and the state space are discrete, i.e., the state variable X_t and the parameter t can take only finitely or countably infinitely many different values, then the stochastic process is called a stochastic chain. In this case the different states and the different parameter values can be numbered:
    Z = \{1, 2, ..., i, i+1, ...\},    (16.108)
    T = \{t_0, t_1, ..., t_m, t_{m+1}, ...\}   with   0 \le t_0 < t_1 < \cdots < t_m < t_{m+1} < \cdots.    (16.109)
The times t_0, t_1, ... are not necessarily equally spaced.
x,,
4
(16.112)
The times tl and t 2 are not necessarily consecutive.
4. Time-Homogeneous (Stationary) Markov Chains If the transition probabilities (16.111) of a Markov chain do not depend on time, i.e.,
    p_{ij}(t_m, t_{m+1}) = p_{ij},    (16.113)
then the Markov chain is called time-homogeneous or stationary. A stationary Markov chain with a finite state space Z = \{1, 2, ..., N\} has the transition matrix
    P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1N} \\ p_{21} & p_{22} & \cdots & p_{2N} \\ \vdots & & & \vdots \\ p_{N1} & p_{N2} & \cdots & p_{NN} \end{pmatrix},    (16.114a)
where
a) p_{ij} \ge 0   for all   i, j,    (16.114b)        b) \sum_{j=1}^{N} p_{ij} = 1   for all   i.    (16.114c)
Being independent of time, p_{ij} gives the transition probability from the state i into the state j during one time unit.
■ The number of busy lines in a telephone exchange can be modeled by a stationary Markov chain. For the sake of simplicity we suppose that there are only two lines. Hence the states are i = 0, 1, 2. Let the time unit be, e.g., 1 minute. Suppose the transition matrix (p_{ij}) is:
    (p_{ij}) = \begin{pmatrix} 0.7 & 0.3 & 0.0 \\ 0.2 & 0.5 & 0.3 \\ 0.1 & 0.4 & 0.5 \end{pmatrix}.
In the matrix (p_{ij}) the first row corresponds to the state i = 0. The matrix element p_{12} = 0.3 (second row, third column) gives the probability that two lines are busy at time t_m, given that one was busy at t_{m-1}.
Remark: Every quadratic matrix P = (p_{ij}) of size N \times N satisfying the properties (16.114b) and (16.114c) is called a stochastic matrix. Its row vectors are called stochastic vectors.
Although the transition probabilities of a stationary Markov chain do not depend on time, the distribution of the random variable X_t at a given time t is given by the probabilities
    P(X_t = i) = p_i(t)   (i = 1, 2, ..., N)    (16.115a)    with    \sum_{i=1}^{N} p_i(t) = 1,    (16.115b)
since the process is in one of the states with probability one at any time t. The probabilities (16.115a) can be written in the form of a probability vector
    p(t) = (p_1(t), p_2(t), ..., p_N(t))^T.    (16.116)
The probability vector p is a stochastic vector. It determines the distribution of the states of a stationary Markov chain at time period t.
5. Probability Vector and Transition Matrix Let the transition matrix P of a stationary Markov chain be given (according to (16.114a,b,c)). Starting with the probability distribution at time period t, the probability distribution at t + 1 is obtained from P and p(t) as
    p^T(t+1) = p^T(t)\,P,    (16.117)
and more generally
    p^T(t+k) = p^T(t)\,P^k.    (16.118)
Remarks: 1. For t = 0 it follows from (16.118) that
    p^T(k) = p^T(0)\,P^k,    (16.119)
that is, a stationary Markov chain is uniquely determined by the initial distribution p(0) and the transition matrix P.
2. If the matrices A and B are stochastic matrices, then C = AB is a stochastic matrix as well. Consequently, if P is a stochastic matrix, then the powers P^k are also stochastic matrices.
■ A particle changes its position (state) x (1 \le x \le 5) along a line in time periods t = 1, 2, 3, ... according to the following rules:
a) If the particle is at x = 2, 3, 4, then during the next time unit it moves to the right by one unit with probability p = 0.6 and to the left with probability 1 - p = 0.4.
b) At the points x = 1 and x = 5 the particle is absorbed, i.e., it stays there with probability 1.
c) At time t = 0 the position of the particle is x = 2.
Determine the probability distribution p(3) at time period t = 3. By (16.119) the probability distribution p^T(3) = p^T(0)\,P^3 holds with p^T(0) = (0, 1, 0, 0, 0) and with the transition matrix
    P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.4 & 0 & 0.6 & 0 & 0 \\ 0 & 0.4 & 0 & 0.6 & 0 \\ 0 & 0 & 0.4 & 0 & 0.6 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}.
We get
    P^3 = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0.496 & 0 & 0.288 & 0 & 0.216 \\ 0.160 & 0.192 & 0 & 0.288 & 0.360 \\ 0.064 & 0 & 0.192 & 0 & 0.744 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix},
and finally p^T(3) = (0.496,\; 0,\; 0.288,\; 0,\; 0.216).
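The same result can be obtained in a few lines with NumPy; a sketch recomputing p(3) = p(0) P^3 for the particle example:

import numpy as np

P = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
              [0.4, 0.0, 0.6, 0.0, 0.0],
              [0.0, 0.4, 0.0, 0.6, 0.0],
              [0.0, 0.0, 0.4, 0.0, 0.6],
              [0.0, 0.0, 0.0, 0.0, 1.0]])
p0 = np.array([0.0, 1.0, 0.0, 0.0, 0.0])

p3 = p0 @ np.linalg.matrix_power(P, 3)
print(p3)   # [0.496 0.    0.288 0.    0.216]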
16.2.6.2 Poisson Process 1. The Poisson Process In the case of a stochastic chain both the state space 2 and the parameter space T are discrete, that is, the stochastic process is observed only at discrete time periods to, t l . tz.. . . . Now; we study a process nith continuous parameter space T . and it is called a Poisson process. 1. Mathematical Formulation Mathematical formulation of the Poisson process: a) Let the random variable X t be the number of signals in the time interval [O.t); b) let the probabilitypx(t) = P ( X t = z) be the probability of z signals during the time interval [O. t ) . .\dditionally. the following assumptions are required. which hold in the process of radioactive decay and many other random processes (at least approximately): c) The probablity P ( X t = x) of I signals in a time interval of length t depends only on x and t. and does not depend on the position of the time interval on the time axis. d) The numbers of signals in disjoint time intervals are independent random variables. e) The probability to get at least one signal in a very short interval of length At is approximately proportional to this length. The proportionality factor is denoted by X ( A > 0). 2. Distribution Function By properties a)-e) the distribution of the random variable X t is determined. \Ye get: (16.120) where = At is the expected value and uz = Xt the variance. 3. Remarks 1. From (16.120) we get the Poisson distribution as a special case for t = 1 (see 16.2.3.3, p.755). 2. To interpret the parameter X or to estimate its value from observed data the following properties are useful: X is the average number of signals during a time unit. 1. - is the average distance (in time) between two signals in a Poisson process.
I^
3. The Poisson process can be interpreted as the random motion of a particle in the state space Z = {0, 1, 2, ...}. The particle starts in state 0, and at every signal it jumps from state i into the next state i + 1. Furthermore, for a small interval Δt the transition probability p_{i,i+1} from state i into state i + 1 should be

p_{i,i+1} ≈ λΔt.   (16.121)

λ is called the transition rate.
4. Examples of Poisson Processes
■ Radioactive decay is a typical example of a Poisson process: The decays (signals) are registered with a counter and marked on the time axis. The observation interval should be relatively small with respect to the half-life of the radiating matter.
■ Consider the number of calls registered in a telephone exchange until time t and calculate, e.g., the probability that at most x calls are registered until time t, under the assumption that the average number of calls during a time unit is λ.
■ In reliability testing, the number of failures of a repairable system is counted during a period of duty. In queuing theory we consider the number of customers arriving at the counter of a department store, at a booking office or at a gasoline station.
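The probabilities needed in the telephone example follow directly from (16.120). The sketch below sums the Poisson probabilities; the numerical values of λ, t and x are illustrative assumptions, not taken from the text.

```python
from math import exp, factorial

def poisson_pmf(x, lam, t):
    """p_x(t) = (lam*t)**x / x! * exp(-lam*t), see (16.120)."""
    return (lam * t) ** x / factorial(x) * exp(-lam * t)

def prob_at_most(x, lam, t):
    """P(X_t <= x): probability of at most x signals in [0, t)."""
    return sum(poisson_pmf(k, lam, t) for k in range(x + 1))

# Illustrative values: lam = 2 calls per time unit, observation length t = 3
lam, t = 2.0, 3.0
print(poisson_pmf(4, lam, t))      # probability of exactly 4 calls until time t
print(prob_at_most(4, lam, t))     # probability of at most 4 calls until time t
```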
2. Birth and Death Processes One generalization of the Poisson process is the following: We assume that the transition rate λ_i in (16.121) depends on the state i. Another generalization is when the transition from state i into state i − 1 is also allowed; the corresponding transition rate is denoted by μ_i. The state i can be considered, e.g., as the number of individuals in a population. It increases by one at a transition from state i into state i + 1, and decreases by one at a transition from i into i − 1. These stochastic processes are called birth and death processes. Let P(X_t = i) = p_i(t) be the probability that the process is in state i at time t. Analogously to the Poisson process,

from i − 1 into i:  p_{i−1,i} ≈ λ_{i−1}·Δt,
from i + 1 into i:  p_{i+1,i} ≈ μ_{i+1}·Δt,   (16.122)
from i into i:      p_{i,i} ≈ 1 − (λ_i + μ_i)·Δt.

Remark: The Poisson process is a pure birth process with a constant transition rate.
3. Queuing The simplest queuing system is a counter where customers are served one by one in the order of their arrival. The waiting room is sufficiently large, so no one needs to leave because it becomes full. The customers arrive according to a Poisson process, that is, the interarrival time between two clients is exponentially distributed with parameter λ, and these interarrival times are independent. In many cases the serving time also has an exponential distribution, with parameter μ. The parameters λ and μ have the following meanings:
λ: average number of arrivals per time unit,
1/λ: average interarrival time,
μ: average number of served clients per time unit,
1/μ: average serving time.
Remarks: 1. If the number of clients standing in the queue is considered as the state of this stochastic process, then the above simple queuing model is a birth and death process with constant birth rate λ and constant death rate μ.
2. The above queuing model can be modified and generalized in many different ways, e.g., there can be several counters where the clients are served and/or the arrival times and serving times can follow different distributions (see [16.8], [16.17]).
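The simple one-counter queue described above can also be studied by simulation. The following minimal sketch generates exponential interarrival and serving times with rates λ and μ and estimates the mean waiting time; the rates and the number of simulated customers are illustrative assumptions.

```python
import random

def simulate_queue(lam, mu, n_customers, seed=1):
    """Simulate the simple one-counter queue: exponential interarrival times
    (rate lam) and exponential serving times (rate mu); returns the mean
    waiting time of the served customers."""
    random.seed(seed)
    arrival = 0.0          # arrival time of the current customer
    server_free = 0.0      # time at which the counter becomes free again
    total_wait = 0.0
    for _ in range(n_customers):
        arrival += random.expovariate(lam)            # next arrival
        start = max(arrival, server_free)             # service starts when the counter is free
        total_wait += start - arrival                 # waiting time in the queue
        server_free = start + random.expovariate(mu)  # service completion
    return total_wait / n_customers

# Illustrative rates: lam = 0.8 arrivals and mu = 1.0 served clients per time unit
print(simulate_queue(0.8, 1.0, 100000))   # roughly lam/(mu*(mu - lam)) = 4 for this model
```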
16.3 Mathematical Statistics
Mathematical statistics provides an application of probability theory to given mass phenomena. Its theorems allow us to make statements with a certain probability about properties of given sets; these statements are based on the results of experiments whose number should be kept low for economic reasons.
16.3.1 Statistic Function or Sample Function
16.3.1.1 Population, Sample, Random Vector
1. Population The population is the set of all elements of interest in a particular study. We can consider any set of things having the same property in a certain sense, e.g., every article of a certain production process
or all the values of a measuring sequence occurring in a permanent repetition of an experiment. The number N of the elements of a population can be very large, even practically infinite. We often use the word population also to denote the set of numerical values assigned to the elements.
2. Sample In order not to check the total population for the considered property, data are collected only from a subset, a so-called sample of size n (n ≤ N). We talk about a random choice if every element of the population has the same chance of being chosen. A random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. A random sample from an infinite population is a sample selected such that each element is selected independently. The random choice can be made by so-called random numbers. We often use the word sample for the set of values assigned to the selected elements.
3. Random Choice with Random Numbers It often happens that a random selection is physically impossible on the spot, e.g., in the case of piled material like concrete slabs. Then we apply random numbers for a random selection (see Table 21.19, p. 1091). Most calculators can generate uniformly distributed random numbers from the interval [0, 1]. Pushing the button RAN or RAND we get a number between 0.00...0 and 0.99...9. The digits after the decimal point form a sequence of random numbers. We often take random numbers from tables. Two-digit random numbers are given in Table 21.19, p. 1091. If we need larger ones, then we can compose several-digit numbers from them by writing them one after the other.
■ A random sample is to be examined from a transport of 70 piled pipes. The sample size is supposed to be 10. We number the pipes from 00 to 69. A two-digit table of random numbers is applied to select the numbers. We fix the way we choose the numbers, e.g., horizontally, vertically or diagonally. If during this process random numbers occur repeatedly, or they are larger than 69, then they are simply omitted. The pipes corresponding to the chosen random numbers are the elements of the sample. If we have a several-digit table of random numbers, we can decompose its entries into two-digit numbers.
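The pipe example can be reproduced with a random number generator instead of a printed table. The sketch below follows the same rules (two-digit numbers, values above 69 and repetitions are omitted); the seed is only an assumption to make the run repeatable.

```python
import random

# Hypothetical reproduction of the pipe example: pipes numbered 00..69, sample size 10
random.seed(0)

sample = set()
while len(sample) < 10:
    r = random.randint(0, 99)       # a two-digit random number 00..99
    if r <= 69:                     # numbers larger than 69 are simply omitted
        sample.add(r)               # repeated numbers are omitted as well (set)

print(sorted(sample))
```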
4. Random Vector A random variable X can be characterized by its distribution function or by its parameters, where the distribution function itself is determined completely by the properties of the population. These are unknown at the beginning of a statistical investigation, so we want to collect as much information as possible with the help of samples. Usually we do not restrict our investigation to one sample, but we take several samples (with the same size n if possible, for practical reasons). The elements of a sample are chosen randomly, so the realizations take their values randomly, i.e., the first value of the first sample is usually different from the first value of the second sample. Consequently, the first value of a sample is a random variable itself, denoted by X_1. Analogously, we can introduce the random variables X_2, X_3, ..., X_n for the second, third, ..., n-th sample values, and they are called sample variables. Together they form the random vector
X = (X_1, X_2, ..., X_n). Every sample of size n with elements x_i can be considered as a vector x = (x_1, x_2, ..., x_n), i.e., as a realization of the random vector X.
16.3.1.2 Statistic Function or Sample Function Since the samples are different from each other, their arithmetic means x̄ are also different. We can consider them as realizations of a new random variable denoted by X̄, which depends on the sample
variables X_1, X_2, ..., X_n.
1. sample:    x_11, x_12, ..., x_1n   with mean x̄_1,
2. sample:    x_21, x_22, ..., x_2n   with mean x̄_2,
   ...
m-th sample:  x_m1, x_m2, ..., x_mn   with mean x̄_m.   (16.123)
We denote the realization of the j-th sample variable in the i-th sample by x_ij (i = 1, 2, ..., m; j = 1, 2, ..., n). A function of the random vector X = (X_1, X_2, ..., X_n) is again a random variable, and it is called a statistic or sample function. The most important sample functions are the mean, variance, median and range.
1. Mean The mean X̄ of the random variables X_i is

X̄ = (1/n) Σ_{i=1}^{n} X_i.   (16.124a)

The mean x̄ of the sample (x_1, x_2, ..., x_n) is

x̄ = (1/n) Σ_{i=1}^{n} x_i.   (16.124b)

It is often useful to introduce an estimate x_0 in the calculation of the mean. It can be chosen arbitrarily, but preferably close to the mean x̄. If, e.g., the x_i (i = 1, 2, ...) are several-digit numbers in a long measuring sequence and they differ only in the last few digits, it is simpler to do the calculations only with the smaller numbers

z_i = x_i − x_0.   (16.124c)

Then we get

x̄ = x_0 + (1/n) Σ_{i=1}^{n} z_i = x_0 + z̄.   (16.124d)
2. Variance The variance S² of the random variables X_i with mean X̄ is defined by

S² = (1/(n − 1)) Σ_{i=1}^{n} (X_i − X̄)².   (16.125a)

The realization of the variance with the help of the sample (x_1, x_2, ..., x_n) is

s² = (1/(n − 1)) Σ_{i=1}^{n} (x_i − x̄)².   (16.125b)

It can be proven that in the estimation of the variance of the original population we get a more accurate estimate by dividing by n − 1 than by dividing by n. With the estimated value x_0 we get

s² = ( Σ_{i=1}^{n} z_i² − n z̄² ) / (n − 1) = ( Σ_{i=1}^{n} z_i² − n (x̄ − x_0)² ) / (n − 1).   (16.125c)

For x_0 = x̄ the correction term n(x̄ − x_0)² vanishes, because then z̄ = (1/n) Σ_{i=1}^{n} z_i = 0 holds.
3. Median Let the n elements of the sample be arranged in ascending (or descending) order. If n is odd, then the median x̃ is the value of the (n+1)/2-th item; if n is even, then the median is the average of the two items in the middle.
. whose elements are arranged in ascending (or The median j: in a particular sample ( ~ 1 ~ x 2. .,, x,), descending) order, is if n = 2 m + 1, ,
2 4.
(16.126)
if n = 2 m .
4. Range

R = max X_i − min X_i   (i = 1, 2, ..., n).   (16.127a)

The range R of a particular sample (x_1, x_2, ..., x_n) is

R = x_max − x_min.   (16.127b)

Every particular realization of a sample function is denoted by a lowercase letter, except the range R; i.e., for a particular sample (x_1, x_2, ..., x_n) we calculate the particular values x̄, s², x̃ and R.
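The four sample functions just introduced can be computed together in a few lines. The following sketch implements (16.124b), (16.125b), (16.126) and (16.127b); the sample values used at the end are purely illustrative.

```python
def sample_functions(x):
    """Mean (16.124b), variance (16.125b), median (16.126) and range (16.127b)
    of a sample x = (x_1, ..., x_n)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    xs = sorted(x)
    m, odd = divmod(n, 2)
    median = xs[m] if odd else (xs[m - 1] + xs[m]) / 2
    rng = max(x) - min(x)
    return mean, var, median, rng

# Hypothetical sample values for illustration only
print(sample_functions([1.2, 1.5, 1.1, 1.4, 1.3, 1.6]))
```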
■ We take a sample of 15 loudspeakers from a running production ...
16.3.2 Descriptive Statistics
16.3.2.1 Statistical Summarization and Analysis of Given Data In order to describe the properties of certain elements statistically, we characterize them by a random variable X. Usually the n measured or observed values x_i of the property X form the starting point of a statistical investigation, which is made to find some parameters of the distribution, or the distribution itself, of X. Every measured sequence of size n can be considered as a random sample from an infinite population if the experiment or the measurement could be repeated infinitely many times under the same conditions. Since the size n of a measuring sequence can be very large, we proceed as follows:
1. Protocol, Prime Notation The measured or observed values x_i are recorded in a protocol list.
2. Intervals or Classes We consider an interval which contains the n data x_i (i = 1, 2, ..., n) of the sample, and divide it into k subintervals, so-called classes or class intervals. Usually 10-20 classes are selected with equal length h, and their endpoints are called class boundaries. The endpoints of the total interval are not uniquely defined; in general, we choose them approximately symmetrically with respect to the smallest and largest value of the sample, and class boundaries should be different from any sample value.
3. Frequencies and Frequency Distribution The absolute frequencies h_j (j = 1, 2, ..., k) are the numbers of data (occupancy numbers) belonging to a given interval Δx_j. The ratios h_j/n (in %) are called relative frequencies. If the values h_j/n are represented over the classes as rectangles, then we get a graphical representation of the given frequency distribution, and this representation is called a histogram (Fig. 16.13a). The values h_j/n can be considered as empirical values of the probabilities or of the density function f(x).
4. Cumulative Frequency Adding the absolute or relative frequencies, we get the cumulative absolute or relative frequencies

F_j = (h_1 + h_2 + ... + h_j)/n  in %   (j = 1, 2, ..., k).   (16.128)

If we plot the value F_j at the upper class boundary and draw a horizontal line until the next boundary, then we get a graphical representation of the empirical distribution function, which can be considered as an approximation of the unknown underlying distribution function F(x) (Fig. 16.13b).
■ Suppose we perform n = 125 measurements during a study. The results spread over the interval from 50 to 270, so it is reasonable to divide this interval into k = 11 classes of length h = 20. We get the frequency table Table 16.3.

Table 16.3 Frequency table
Class      h_j   h_j/n (%)   F_j (%)
 50- 70      3      2.4        2.4
 71- 90      3      2.4        4.8
 91-110      3      2.4        7.2
111-130      4      3.2       10.4
131-150     15     12.0       22.4
151-170     22     17.6       40.0
171-190     30     24.0       64.0
191-210     27     21.6       85.6
211-230      9      7.2       92.8
231-250      6      4.8       97.6
251-270      3      2.4      100.0
Figure 16.13 a) Histogram of the relative frequencies h_j/n over the classes; b) empirical distribution function F_j (abscissa x = 70 ... 270, ordinate F_j in %)
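Frequency tables of the kind shown in Table 16.3 are easily produced from raw data. The sketch below computes h_j, h_j/n and F_j according to (16.128) for equal classes; the short data list at the end is a hypothetical illustration, not the 125 measurements of the example.

```python
def frequency_table(data, lower, width, k):
    """Absolute frequencies h_j, relative frequencies h_j/n (%) and cumulative
    frequencies F_j (16.128) for k classes of equal width starting at 'lower'."""
    n = len(data)
    h = [0] * k
    for x in data:
        j = min(int((x - lower) // width), k - 1)   # class index of x
        h[j] += 1
    F, cum = [], 0
    for hj in h:
        cum += hj
        F.append(100.0 * cum / n)                   # cumulative frequency in %
    return h, [100.0 * hj / n for hj in h], F

# Hypothetical measured values; the class division mirrors the example (width 20 starting at 50)
data = [72, 95, 118, 133, 147, 152, 160, 171, 183, 195, 204, 228, 246]
print(frequency_table(data, lower=50, width=20, k=11))
```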
16.3.2.2 Statistical Parameters After summarizing and analyzing the data of the sample as given in 16.3.2.1, p. 770, we can approximate the parameters of the random variable by the following parameters:
1. Mean If we use the data of the sample directly, then the sample mean is

x̄ = (1/n) Σ_{i=1}^{n} x_i.   (16.129a)

If we use the means x̄_j and frequencies h_j of the classes, then we get

x̄ = (1/n) Σ_{j=1}^{k} h_j x̄_j.   (16.129b)

2. Variance If we use the data of the sample directly, then the sample variance is

s² = (1/(n − 1)) Σ_{i=1}^{n} (x_i − x̄)².   (16.130a)
If we use the means x̄_j and frequencies h_j of the classes, then we get

s² = (1/(n − 1)) Σ_{j=1}^{k} h_j (x̄_j − x̄)².   (16.130b)

The class midpoint u_j (the midpoint of the corresponding interval) is also often used instead of x̄_j.
3. Median The median x̃ of a distribution is defined by

P(X < x̃) = 1/2.   (16.131a)

The median may not be a uniquely determined point. The median of a sample is

x̃ = x_{m+1}               if n = 2m + 1,
x̃ = (x_m + x_{m+1}) / 2   if n = 2m.      (16.131b)

4. Range

R = x_max − x_min.   (16.132)
5. Mode or Modal Value The mode or modal value is the data value that occurs with the greatest frequency. It is denoted by D.
16.3.3 Important Tests One of the fundamental problems of mathematical statistics is to draw conclusions about the population from the sample. The two most important types of questions are:
1. The type of the distribution is known, and we want to get some estimate for its parameters. A distribution can mostly be characterized quite well by the parameters μ and σ² (here μ is the exact expected value and σ² is the exact variance); consequently one of the most important questions is how good an estimate we can give for them based on the samples.
2. Some hypotheses are known about these parameters, and we want to check whether they are true. The most frequently occurring questions are: a) Is the expected value equal to a given number or not? b) Are the expected values of two populations equal or not? c) Does the distribution of the random variable with μ and σ² fit a given distribution or not? etc.
Because the normal distribution has a very important role in observations and measurements, we discuss the goodness of fit test for a normal distribution. The basic idea can be used for other distributions, too.
16.3.3.1 Goodness of Fit Test for a Normal Distribution There are different tests in mathematical statistics to decide whether the data of a sample come from a normal distribution. We discuss a graphical one based on normal probability paper, and a numerical one based on the use of the chi-square distribution (the "χ² test").
1. Goodness of Fit Test with Probability Paper
a) Principle of Probability Paper The x-axis in a right-angled coordinate system is scaled equidistantly, while the y-axis is scaled in the following way: It is divided equidistantly with respect to Z, but labeled with the values

y = Φ(Z) = (1/√(2π)) ∫_{−∞}^{Z} e^{−t²/2} dt.   (16.133)
If a random variable X has a normal distribution with expected value μ and variance σ², then for its distribution function (see 16.2.4.2, p. 757)

F(x) = Φ((x − μ)/σ) = Φ(Z)   (16.134a)

holds, i.e.,

Z = (x − μ)/σ,   (16.134b)   that is,   Z = (1/σ)·x − μ/σ   (16.134c)

must be valid, and so there is a linear relation between x and Z according to (16.134b) and (16.134c).
b) Application of Probability Paper We consider the data of the sample, calculate the cumulative relative frequencies according to (16.128), and plot them on the probability paper as the ordinates of the points whose abscissae are the upper class boundaries. If these points lie approximately on a straight line (with small deviations), then the random variable can be considered a normal random variable (Fig. 16.14). As we see from Fig. 16.14, the distribution to which the data of Table 16.3 belong can be considered a normal distribution. We can also read off μ ≈ 176 and σ ≈ 37.5 (from the x values belonging to the values Z = 0 and Z = ±1).

Figure 16.14 Probability-paper plot of the cumulative relative frequencies of Table 16.3 (abscissa x = 70 ... 270, marks at μ − σ, μ, μ + σ)

Remark: The values F_j of the relative cumulative frequencies can be plotted more easily on the probability paper if its scaling is equidistant with respect to y, which means a non-equidistant scaling for the ordinates.
2. χ² Test We want to check whether a random variable X can be considered normal. We divide the range of X into k classes and denote the upper limit of the j-th class (j = 1, 2, ..., k) by ξ_j. Let p_j be the "theoretical" probability that X falls into the j-th class, i.e.,

p_j = F(ξ_j) − F(ξ_{j−1}),   (16.135a)

where F(x) is the distribution function of X (j = 1, 2, ..., k; ξ_0 is the lower limit of the first class with F(ξ_0) ≈ 0). If X is to have a normal distribution, then

p_j = Φ((ξ_j − μ)/σ) − Φ((ξ_{j−1} − μ)/σ)   (16.135b)

must hold, where Φ(z) is the distribution function of the standard normal distribution (see 16.2.4.2, p. 757). The parameters μ and σ² of the population are usually not known; we use x̄ and s² as approximations of them. We have to make the decomposition of the range so that the expected frequency of every class exceeds 5, i.e., if the size of the sample is n, then np_j ≥ 5. Now we consider the sample (x_1, x_2, ..., x_n) of size n and calculate the corresponding frequencies h_j (for the classes given above). Then the random variable

χ̂² = Σ_{j=1}^{k} (h_j − np_j)² / (np_j)   (16.135c)

has approximately a χ² distribution with m = k − 1 degrees of freedom if we know μ and σ², m = k − 2 if we estimated one of them from the sample, and m = k − 3 if we estimated both by x̄ and s².
Now we choose a level α, which is called the significance level, and determine the quantile χ²_{α;k−i} (i depends on the number of unknown parameters) of the corresponding χ² distribution, e.g., from Table 21.16, p. 1087. This means that P(χ² ≥ χ²_{α;k−i}) = α holds. Then we compare the value χ̂² obtained in (16.135c) with this quantile, and if

χ̂² < χ²_{α;k−i}   (16.135d)
holds, we accept the assumption that the sample came from a normal distribution. This test is called the χ² test for goodness of fit.
■ The following χ² test is based on the example on p. 771. The sample size is n = 125, with mean x̄ = 176.32 and standard deviation s = 36.70. These values are used as approximations of the unknown parameters μ and σ² of the population. We can determine the test statistic χ̂² according to (16.135c) after performing the calculations according to (16.135a) and (16.135b), as shown in Table 16.4, p. 774.
Table 16.4 χ² test (x̄ = 176.32, s = 36.70, n = 125)

 ξ_j    z_j = (ξ_j − x̄)/s    Φ(z_j)    p_j      np_j
  70        −2.90            0.0019   0.0019    0.2375
  90        −2.35            0.0094   0.0075    0.9375
 110        −1.81            0.0351   0.0257    3.2125
 130        −1.26            0.1038   0.0687    8.5875
 150        −0.72            0.2358   0.1320   16.5000
 170        −0.17            0.4325   0.1967   24.5875
 190         0.37            0.6443   0.2118   26.4750
 210         0.92            0.8212   0.1769   22.1125
 230         1.46            0.9279   0.1067   13.3375
 250         2.01            0.9778   0.0499    6.2375
 270         2.55            0.9946   0.0168    2.1000

After merging the first four and the last two classes (so that np_j ≥ 5 holds in every class):

class        np_j      h_j    (h_j − np_j)²/(np_j)
up to 130   12.9750     13         0.00005
131-150     16.5000     15         0.1635
151-170     24.5875     22         0.2723
171-190     26.4750     30         0.4693
191-210     22.1125     27         1.0803
211-230     13.3375      9         1.4106
over 230     8.3375      9         0.0526
                                 χ̂² = 3.4486
It follows from the last column that χ̂² = 3.4486. Because of the requirement np_j ≥ 5, the number of classes is reduced from k = 11 to k* = k − 4 = 7. We calculated the theoretical frequencies np_j with the estimated values x̄ and s² of the sample instead of μ and σ² of the population, so the number of degrees of freedom of the corresponding χ² distribution is reduced by 2. The critical value is the quantile χ²_{α;k*−1−2}. For α = 0.05 we get χ²_{0.05;4} = 9.5 from Table 21.16, p. 1087, so because of the inequality χ̂² < χ²_{0.05;4} there is no contradiction to the assumption that the sample is from a population with a normal distribution.
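The computation of Table 16.4 can be sketched in a few lines. The class frequencies below are those of Table 16.3; the critical value 9.49 is the quantile χ²_{0.05;4} taken from a table (the text rounds it to 9.5), and the result differs slightly from 3.4486 because the table above rounds the intermediate values.

```python
from statistics import NormalDist

phi = NormalDist().cdf

n, xbar, s = 125, 176.32, 36.70
bounds = [70, 90, 110, 130, 150, 170, 190, 210, 230, 250, 270]   # upper class limits xi_j
h      = [3, 3, 3, 4, 15, 22, 30, 27, 9, 6, 3]                   # observed frequencies h_j (Table 16.3)

# Theoretical class probabilities p_j according to (16.135a,b)
F = [phi((b - xbar) / s) for b in bounds]
p = [F[0]] + [F[j] - F[j - 1] for j in range(1, len(F))]

# Merge neighbouring classes until every expected frequency n*p_j is at least 5
exp_f, obs_f, acc_p, acc_h = [], [], 0.0, 0
for pj, hj in zip(p, h):
    acc_p += pj
    acc_h += hj
    if n * acc_p >= 5:
        exp_f.append(n * acc_p)
        obs_f.append(acc_h)
        acc_p, acc_h = 0.0, 0
exp_f[-1] += n * acc_p          # a remaining short group joins the last class
obs_f[-1] += acc_h

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_f, exp_f))        # (16.135c)
print(round(chi2, 4), "accept" if chi2 < 9.49 else "reject")      # close to 3.4486 in Table 16.4
```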
16.3.3.2 Distribution of the Sample Mean Let X be a continuous random variable. Suppose we can take arbitrarily many samples of size n from the corresponding population. Then the sample mean X̄ is also a random variable, and it is also continuous.
1. Confidence Probability of the Sample Mean If X has a normal distribution with parameters μ and σ², then X̄ is also a normal random variable, with parameters μ and σ²/n, i.e., the density function of X̄ is concentrated more closely around μ than the density function f(x) of the population. For any value ε > 0:

P(|X̄ − μ| < ε) = 2Φ(ε√n / σ) − 1.   (16.136)
It follows from this that with increasing sample size n, the probability that the sample mean is a good approximation of μ is also increasing.
■ For ε = σ/2 we get from (16.136) P(|X̄ − μ| < σ/2) = 2Φ(√n/2) − 1, and for different values of n we get the values listed in Table 16.5. We see from Table 16.5, e.g., that with a sample size n = 49 the probability that the sample mean x̄ differs from μ by less than ±σ/2 is 99.95 %.

Table 16.5 Confidence level for the sample mean
 n     P(|X̄ − μ| < σ/2)
  4        68.27 %
 16        95.45 %
 25        98.76 %
 49        99.95 %
2. Sample Mean Distribution for an Arbitrary Distribution of the Population The random variable X̄ has an approximately normal distribution with parameters μ and σ²/n for any distribution of the population with expected value μ and variance σ². This fact is based on the central limit theorem.
16.3.3.3 Confidence Limits for the Mean
1. Confidence Interval for the Mean with a Known Variance σ² If X is a random variable with parameters μ and σ², then according to 16.3.3.2, p. 774, X̄ is approximately a normal random variable with parameters μ and σ²/n. Then the substitution

Z̄ = (X̄ − μ)√n / σ   (16.137)

yields a random variable Z̄ which has approximately a standard normal distribution; therefore

P(|Z̄| ≤ ε) = ∫_{−ε}^{ε} φ(x) dx = 2Φ(ε) − 1,   (16.138)

where φ(x) is the density function of the standard normal distribution.
If the significance level α is given, that is,

P(|Z̄| ≤ ε) = 1 − α   (16.139)

is required, then ε = ε(α) can be determined from (16.138), e.g., from Table 21.15, p. 1085, for the standard normal distribution. From |Z̄| ≤ ε(α) and from (16.137) we get the relation

μ = x̄ ± (σ/√n)·ε(α).   (16.140)

The values x̄ ± (σ/√n)·ε(α) in (16.140) are called confidence limits for the expected value, and the interval between them is called a confidence interval for the expected value μ with a known σ² and a given significance level α. In other words: The expected value μ lies between the confidence limits (16.140) with probability 1 − α.
Remark: If the sample size is large enough, then we can use s² instead of σ² in (16.140). The sample size is considered to be large if n > 100, but in practice, depending on the actual problem, it is often considered sufficiently large if n > 30. If n is not large enough, then we have to apply the t distribution to determine the confidence limits as in (16.143).
2. Confidence Interval for the Expected Value with an Unknown Variance σ² If the variance σ² of the population is unknown, then we replace it by the sample variance s², and instead of (16.137) we get the random variable

T = (X̄ − μ)√n / S,   (16.141)
which has a t distribution (see 16.2.4.8, p. 761) with m = n − 1 degrees of freedom. Here n is the size of the sample. If n is large, e.g., if n > 100 holds, then T can be considered as a normal random variable like Z̄ in (16.137). For a given significance level α we get

P(|T| ≤ ε) = 1 − α  with  ε = ε(α, n) = t_{α/2;n−1},   (16.142)

where t_{α/2;n−1} is the quantile of the t distribution (with n − 1 degrees of freedom) for the significance level α (Table 21.18, p. 1090). It follows from |T| ≤ t_{α/2;n−1} that

μ = x̄ ± (s/√n)·t_{α/2;n−1}.   (16.143)

The values x̄ ± (s/√n)·t_{α/2;n−1} are called the confidence limits for the expected value μ of the distribution of the population with an unknown variance σ² and with a given significance level α. The interval between these limits is the confidence interval.
■ A sample contains the following 6 measured values: 0.842, 0.846, 0.835, 0.839, 0.843, 0.838. We get from this x̄ = 0.8405 and s = 0.00394. What is the maximum deviation of the sample mean from the expected value μ of the population distribution if the significance level α is given as 5 % or 1 %?
1. α = 0.05: We read from Table 21.18, p. 1090, that t_{α/2;5} = 2.57, and we get |x̄ − μ| ≤ 2.57 · 0.00394/√6 = 0.0042. Thus, the sample mean x̄ = 0.8405 differs from the expected value μ by less than ±0.0042 with a probability of 95 %.
2. α = 0.01: t_{α/2;5} = 4.03; |x̄ − μ| ≤ 4.03 · 0.00394/√6 = 0.0065, i.e., the sample mean differs from μ by less than ±0.0065 with a probability of 99 %.
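The confidence interval (16.143) for the six measured values of this example can be reproduced as follows; the t quantile 2.57 is the tabulated value used above, not computed by the code.

```python
from math import sqrt

# Confidence limits (16.143) for the measured values of the example above
x = [0.842, 0.846, 0.835, 0.839, 0.843, 0.838]
n = len(x)
xbar = sum(x) / n
s = sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))

t_quantile = 2.57            # t_{alpha/2; n-1} for alpha = 0.05, n - 1 = 5 (from Table 21.18)
delta = t_quantile * s / sqrt(n)
print(round(xbar, 4), "+/-", round(delta, 4))   # -> 0.8405 +/- 0.0041 (the text rounds to 0.0042)
```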
16.3.3.4 Confidence Interval for the Variance If the random variable X has a normal distribution with parameters μ and σ², then the random variable

χ² = (n − 1) S² / σ²   (16.144)

has a χ² distribution with m = n − 1 degrees of freedom, where n is the sample size and S² is the sample variance. If f_{χ²}(x) denotes the density function of the χ² distribution (Fig. 16.15), we require

P(χ² < χ_l²) = P(χ² > χ_u²) = α/2.   (16.145)

Thus, using the quantiles of the χ² distribution (see Table 21.16, p. 1087) we obtain

χ_l² = χ²_{1−α/2;n−1},   χ_u² = χ²_{α/2;n−1}.   (16.146)

Figure 16.15 Density function of the χ² distribution with the two tail areas of size α/2

Considering (16.144) we get the following estimate for the unknown variance σ² of the population distribution with significance level α:

(n − 1)s² / χ²_{α/2;n−1} ≤ σ² ≤ (n − 1)s² / χ²_{1−α/2;n−1}.   (16.147)

The confidence interval given in (16.147) for σ² is fairly wide for small sample sizes.
■ For the numerical data of the example on p. 776 and for α = 5 % we get from Table 21.16, p. 1087,
x&~
= 12.8, so it follows from (16.147) that 0.625 s
= 0.831 and s = 0.00394.
5
u
5 2.453
s with
16.3.3.5 Structure of Hypothesis Testing A statistical hypothesis test has the following structure:
1. First we formulate a hypothesis H that the sample belongs to a population with some given properties, e.g., H: The population has a normal distribution with parameters μ and σ² (or another given distribution), or H: The expected value is equal to a given value μ_0, or H: Two populations have the same expected value, μ_1 − μ_2 = 0, etc.
2. We determine a confidence interval B based on our hypothesis (in general with tables). The value of the sample function should lie in this interval with the given probability (e.g., with probability 99 % for α = 0.01).
3. We calculate the value of the sample function and accept the hypothesis if this value lies in the given interval B; otherwise we reject it.
■ Test the hypothesis H: μ = μ_0 with a significance level α. The random variable T = (X̄ − μ_0)√n / S has a t distribution with m = n − 1 degrees of freedom according to 16.3.3.3, p. 775. It follows from this that we have to reject the hypothesis if T is not in the confidence interval given by (16.143), i.e., if

|x̄ − μ_0| ≥ (s/√n)·t_{α/2;n−1}   (16.148)

holds. We then say that there is a significant difference. For further problems about tests see [16.14].
16.3.4 Correlation and Regression Correlation analysis is used to determine some kind of dependence between two or more properties of the population from the experimental data. The form of this dependence between these properties is determined with the help of regression analysis.
16.3.4.1 Linear Correlation of Two Measurable Characters
1. Two-Dimensional Random Variable In the following we mostly use the formulas for continuous random variables, but it is easy to replace them by the corresponding formulas for discrete variables. Suppose that X and Y, as a two-dimensional random variable (X, Y), have the joint distribution function

F(x, y) = P(X ≤ x, Y ≤ y)   (16.149a)

and the marginal distribution functions

F_1(x) = P(X ≤ x, Y < ∞),   F_2(y) = P(X < ∞, Y ≤ y).   (16.149b)
The random variables X and Y are said to be independent of each other if

F(x, y) = F_1(x) · F_2(y)   (16.150)

holds. We can determine the fundamental quantities assigned to X and Y from their joint density function f(x, y):
a) Expected Values

μ_X = E(X) = ∫∫ x f(x, y) dx dy,   (16.151a)      μ_Y = E(Y) = ∫∫ y f(x, y) dx dy,   (16.151b)

where both integrals extend over −∞ < x < ∞, −∞ < y < ∞.
b) Variances

σ_X² = E((X − μ_X)²),   (16.152a)      σ_Y² = E((Y − μ_Y)²).   (16.152b)

c) Covariance

σ_XY = E((X − μ_X)(Y − μ_Y)).   (16.153)

d) Correlation Coefficient

ρ_XY = σ_XY / (σ_X σ_Y).   (16.154)

We assume that every expected value above exists. The covariance can also be calculated by the formula

σ_XY = E(XY) − μ_X μ_Y,  where  E(XY) = ∫∫ x y f(x, y) dx dy  (over −∞ < x, y < ∞).   (16.155)

The correlation coefficient is a measure of the linear dependence of X and Y because of the following facts: All points (X, Y) lie on one line with probability 1 if |ρ_XY| = 1 holds. If X and Y are independent random variables, then their covariance is equal to zero, ρ_XY = 0. From ρ_XY = 0 it does not follow that X and Y are independent, but it does follow if they have a two-dimensional normal distribution, which is defined by the density function
f(x, y) = 1/(2π σ_X σ_Y √(1 − ρ_XY²)) · exp{ −[ (x − μ_X)²/σ_X² − 2ρ_XY (x − μ_X)(y − μ_Y)/(σ_X σ_Y) + (y − μ_Y)²/σ_Y² ] / (2(1 − ρ_XY²)) }.   (16.156)
2. Test for Independence of Two Variables We often face the question of whether the variables X and Y can be considered independent, i.e., ρ_XY = 0, if the sample of size n with the measured values (x_i, y_i) (i = 1, 2, ..., n) comes from a two-dimensionally normally distributed population. The test is performed in the following way:
a) We formulate the hypothesis H: ρ_XY = 0.
b) We choose a significance level α and determine the quantile t_{α;m} of the t distribution from Table 21.18, p. 1090, for m = n − 2.
c) We calculate the empirical correlation coefficient r_XY and the test statistic (sample function)

t = r_XY √(n − 2) / √(1 − r_XY²)   (16.157a)

with

r_XY = Σ_{i=1}^{n} (x_i − x̄)(y_i − ȳ) / √( Σ_{i=1}^{n} (x_i − x̄)² · Σ_{i=1}^{n} (y_i − ȳ)² ).   (16.157b)

d) We reject the hypothesis if |t| ≥ t_{α;m} holds.
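The quantities (16.157a,b) are easily computed from paired data; the sketch below does exactly that, and the data at the end are a hypothetical illustration. The resulting |t| is then compared with the tabulated quantile t_{α;n−2}.

```python
from math import sqrt

def correlation_test(x, y):
    """Empirical correlation coefficient r_XY (16.157b) and the test
    statistic t (16.157a) for the independence test."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)
    t = r * sqrt(n - 2) / sqrt(1 - r * r)
    return r, t

# Hypothetical paired measurements; compare |t| with t_{alpha; n-2} from Table 21.18
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.8]
print(correlation_test(x, y))
```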
16.3.4.2 Linear Regression for Two Measurable Characters
1. Determination of the Regression Line If we have detected a certain dependence between the variables X and Y by the correlation coefficient, then the next problem is to find the functional dependence Y = f(X). We consider mostly linear dependence. The simplest case is linear regression, when we suppose that for any fixed value of x the random variable Y in the population has a normal distribution with the expected value

E(Y) = a + bx   (16.158)

and a variance σ² independent of x. The relation (16.158) means that the mean value of the random variable Y depends linearly on the fixed value of x. The values of the parameters a, b and σ² of the
16.3 Mathematical Statistics 779 population are usually unknown, and we estimate them approximately by the least squares method from a sample with values (x2.yJ (i = 1 , 2 , . . . , n). The least squares method requires that
C[yt- (a+ bx,)12 = min!
(16.159)
2=1
and we get the estimates
(16.160a)
(16.160b) and the empirical correlation coefficient rZYis given in (16.157b). The coefficients 6 and 6 are called regression coefficients. The line y(x) = ii 6x is called the regression line.
"-
+
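The estimates (16.160a,b) can be computed directly from the sums in (16.157b). The following sketch does so for a small, purely illustrative data set.

```python
def regression_line(x, y):
    """Least squares estimates (16.160a,b) of the regression line y = a + b*x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
        sum((xi - xbar) ** 2 for xi in x)
    a = ybar - b * xbar
    return a, b

# Hypothetical data points for illustration only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 2.8, 4.1, 4.9, 6.2]
print(regression_line(xs, ys))     # coefficients of the regression line y(x) = a + b*x
```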
2. Confidence Intervals for the Regression Coefficients After the determination of the regression coefficients ã and b̃, our next question is how well these estimates approximate the theoretical values a and b. We form the test statistics

t_b = (b̃ − b)·(s_X/s_Y)·√((n − 2)/(1 − r_XY²))   (16.161a)

and

t_a = (ã − a)·(s_X/s_Y)·√((n − 2)/(1 − r_XY²)) · √( n / Σ_{i=1}^{n} x_i² ).   (16.161b)
These are realizations of random variables having a t distribution with m = n − 2 degrees of freedom. We can determine the quantile t_{α/2;m} from Table 21.18, p. 1090, for a given significance level α, and because P(|t| < t_{α/2;m}) = 1 − α holds for t = t_a and t = t_b, we obtain the confidence intervals

b = b̃ ± t_{α/2;m}·(s_Y/s_X)·√((1 − r_XY²)/(n − 2)),   (16.162a)
a = ã ± t_{α/2;m}·(s_Y/s_X)·√((1 − r_XY²)/(n − 2))·√( Σ_{i=1}^{n} x_i² / n ).   (16.162b)

We can determine a confidence region for the regression line y = a + bx with the confidence intervals given in (16.162a,b) for a and b.
16.3.4.3 Multidimensional Regression
1. Functional Dependence Suppose that there is a functional dependence between the characters X_1, X_2, ..., X_n and Y, which is described by the theoretical regression function

y = f(x_1, x_2, ..., x_n) = Σ_{j=0}^{s} a_j g_j(x_1, x_2, ..., x_n).   (16.163)

The functions g_j(x_1, x_2, ..., x_n) are known functions of n independent variables. The coefficients a_j in (16.163) are constant multipliers in this linear combination. We also call expression (16.163) linear regression, although the functions g_j can be arbitrary.
■ The function f(x_1, x_2) = a_0 + a_1 x_1 + a_2 x_2 + a_3 x_1² + a_4 x_2² + a_5 x_1 x_2, which is a complete quadratic polynomial of two variables with g_0 = 1, g_1 = x_1, g_2 = x_2, g_3 = x_1², g_4 = x_2², and g_5 = x_1 x_2, is an example of a theoretical linear regression function.
2. Writing in Vector Form It is useful to write the formulas in vector form in the multidimensional case:

x = (x_1, x_2, ..., x_n)^T,   (16.164)

so (16.163) now has the form

y = f(x) = Σ_{j=0}^{s} a_j g_j(x).   (16.165)

3. Solution and Normal Equation System The theoretical dependence (16.163) cannot be determined exactly from the measured values

(x^(i), f_i)   (i = 1, 2, ..., N)   (16.166a)

because of random measuring errors. We look for the solution in the form

ỹ = f̃(x) = Σ_{j=0}^{s} ã_j g_j(x),   (16.166b)

and using the least squares method (see 16.3.4.2, 1., p. 779) we determine the coefficients ã_j as estimates of the theoretical coefficients a_j from the condition

Σ_{i=1}^{N} [f_i − f̃(x^(i))]² = min!   (16.166c)

Introducing the notation

G = (g_j(x^(i)))  (i = 1, ..., N; j = 0, ..., s),   ã = (ã_0, ã_1, ..., ã_s)^T,   f = (f_1, f_2, ..., f_N)^T,   (16.166d)

we get from (16.166c) the so-called normal equation system
G^T G ã = G^T f   (16.166e)

to determine ã. The matrix G^T G is symmetric, so the Cholesky method (see 19.2.1.2, p. 890) is especially suitable for solving (16.166e).
■ Consider the sample whose results are given in the following table. Determine the coefficients of the regression function (16.167):

  x_1    x_2    f(x_1, x_2)
   3     0.3       3.2
   3     0.5       3.5
   5     0.3       6.2
   5     0.5       1.5

ỹ(x_1, x_2) = ã_0 + ã_1 x_1 + ã_2 x_2.   (16.167)

From (16.166d) it follows that

G = ( 1  3  0.3
      1  3  0.5
      1  5  0.3
      1  5  0.5 ),   G^T G = ( 4     16    1.6
                               16    68    6.4
                               1.6   6.4   0.68 ),   G^T f = (14.4, 58.6, 5.32)^T,   (16.168)

and (16.166e) becomes

4 ã_0 + 16 ã_1 + 1.6 ã_2 = 14.4,
16 ã_0 + 68 ã_1 + 6.4 ã_2 = 58.6,
1.6 ã_0 + 6.4 ã_1 + 0.68 ã_2 = 5.32,

i.e.,

ã_0 = 7.0,   ã_1 = 0.25,   ã_2 = −11.   (16.169)
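The normal equation system of this example can be solved in a few lines; the sketch below uses numpy's general solver in place of the Cholesky factorization mentioned above, starting directly from the coefficients of (16.166e).

```python
import numpy as np

# Normal equation system (16.166e) of the example above
GtG = np.array([[4.0, 16.0, 1.6],
                [16.0, 68.0, 6.4],
                [1.6, 6.4, 0.68]])
Gtf = np.array([14.4, 58.6, 5.32])

a = np.linalg.solve(GtG, Gtf)     # a Cholesky factorization could be used as well
print(a)                          # -> [ 7.    0.25 -11.  ]
```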
4. Remarks
1. To determine the regression coefficients we start with the interpolation conditions f̃(x^(i)) = f_i (i = 1, 2, ..., N), i.e., with

G ã = f.   (16.170)

In the case s + 1 < N, (16.170) is an overdetermined system of equations, which can be solved by the Householder method (see 19.6.2.2, p. 918). The multiplication of (16.170) by G^T to get (16.166e) is also called the Gauss transformation. If the columns of the matrix G are linearly independent, i.e., rank G = s + 1 holds, then the normal equation system (16.166e) has a unique solution, which coincides with the result of (16.170) obtained by the Householder method.
2. Also in the multidimensional case it is possible to determine confidence intervals for the regression coefficients with the t distribution, analogously to (16.162a,b).
3. We can analyze the assumption (16.166b) with the help of the F distribution (see 16.2.4.7, p. 761) if there are some superfluous variables in it.
16.3.5 Monte Carlo Methods
16.3.5.1 Simulation Simulation methods are based on constructing equivalent mathematical models. These models are then easily analyzed by computer. In such cases we talk about digital simulation. A special case is given by Monte Carlo methods, when certain quantities of the model are randomly selected. These random elements are selected by using random numbers.
16.3.5.2 Random Numbers Random numbers are realizations of certain random quantities (see 16.2.2, p. 749) satisfying given distributions.
1. Uniformly Distributed Random Numbers These numbers are uniformly distributed in the interval [0, 1]; they are realizations of the random variable X with the following density function f_0(x) and distribution function F_0(x):

f_0(x) = 1 for 0 ≤ x ≤ 1, 0 otherwise;      F_0(x) = 0 for x ≤ 0,  x for 0 ≤ x ≤ 1,  1 for x ≥ 1.   (16.171)

1. Method of the Inner Digits of Squares A simple method to generate random numbers was suggested by J. v. Neumann. It is called the method of the inner digits of squares, and it starts from a decimal number z ∈ (0, 1) which has 2n digits. Then we form z², so we get a decimal number which has 4n digits. We erase its first and its last n digits, so we again have a number with 2n digits. To get further numbers, we repeat this procedure. In this way we get 2n-digit decimal numbers from the interval [0, 1] which can be considered random numbers with a uniform distribution. The value of 2n is selected according to the largest number representable in the computer; for example, we may choose 2n = 10. This procedure is seldom recommended, since it produces more small numbers than it should. Several other methods have been developed.
2. Congruence Method The so-called congruence method is widely used: A sequence of integers z_i (i = 0, 1, 2, ...) is formed by the recursion formula

z_{i+1} ≡ c·z_i (mod m).   (16.172)

Here z_0 is an arbitrary positive integer and c and m denote positive integers which must be suitably chosen. For z_{i+1} we take the smallest non-negative integer satisfying the congruence (16.172). The numbers z_i/m lie between 0 and 1 and can be used as uniformly distributed random numbers.
3. Remarks
a) We choose m = 2^r, where r is the number of bits in a computer word, e.g., r = 40. Then c should be of the order of √m.
b) Random number generators using certain algorithms produce so-called pseudorandom numbers.
c) On calculators and also in computers, "ran" or "rand" is used for generating random numbers.
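The congruence method (16.172) is easy to implement. The following sketch uses the multiplier c = 16807 and the modulus m = 2^31 − 1, which are illustrative choices and not values from the text.

```python
def congruence_generator(z0, c, m, count):
    """Congruence method (16.172): z_{i+1} = c*z_i mod m;
    the numbers z_i/m are used as uniformly distributed random numbers."""
    z = z0
    for _ in range(count):
        z = (c * z) % m
        yield z / m

# Illustrative parameters (z0, c, m are assumptions, not values from the text)
for u in congruence_generator(z0=12345, c=16807, m=2**31 - 1, count=5):
    print(u)
```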
2. Random Numbers with Other Distributions To get random numbers with an arbitrary distribution function F(x) we adopt the following procedure: Consider a sequence ξ_1, ξ_2, ... of uniformly distributed random numbers from [0, 1]. With these numbers we form the numbers η_i = F^{−1}(ξ_i) for i = 1, 2, .... Here F^{−1}(x) is the inverse function of the distribution function F(x). Then we get

P(η_i ≤ x) = P(F^{−1}(ξ_i) ≤ x) = P(ξ_i ≤ F(x)) = ∫_0^{F(x)} f_0(t) dt = F(x),   (16.173)

i.e., the random numbers η_1, η_2, ... satisfy a distribution with the distribution function F(x).
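As an illustration of (16.173), the sketch below generates exponentially distributed random numbers from uniform ones; the distribution F(x) = 1 − e^{−λx} and the parameter value are assumptions chosen only for the example.

```python
import random
from math import log

def exponential_random(lam, count):
    """Random numbers with distribution function F(x) = 1 - exp(-lam*x):
    eta = F^{-1}(xi) = -ln(1 - xi)/lam for uniformly distributed xi, see (16.173)."""
    for _ in range(count):
        xi = random.random()            # uniformly distributed on [0, 1)
        yield -log(1.0 - xi) / lam

random.seed(42)
print(list(exponential_random(lam=2.0, count=5)))
```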
3. Tables and Application of Random Numbers
1. Construction Random number tables can be constructed in the following way: We index ten identical chips by the numbers 0, 1, 2, ..., 9. We place them into a box and shuffle them. One of them is then selected, and its index is written into the table. Then we replace the chip into the box, shuffle again, and choose the next one. In this way a sequence of random numbers is produced, which is written in groups (for easier usage) into the table. In Table 21.19, p. 1091, four random digits form a group. In the procedure we have to guarantee that the digits 0, 1, 2, ..., 9 always have equal probability.
2. Application of Random Numbers The use of a table of random numbers is demonstrated with an example.
■ Suppose we choose randomly n = 20 items from a population of N = 250 items. We number the objects from 000 to 249. We choose a number in an arbitrary column or row in Table 21.19, p. 1091, and we determine a rule for how the remaining 19 random numbers should be chosen, e.g., vertically, horizontally or diagonally. We consider only the first three digits of these random numbers, and we use them only if they are smaller than 250.
16.3.5.3 Example of a Monte Carlo Simulation We consider the approximate evaluation of the integral

I = ∫_0^1 g(x) dx   (16.174)

as an example of the use of uniformly distributed random numbers in a simulation. We discuss two solution methods.
1. Applying the Relative Frequency We suppose 0 ≤ g(x) ≤ 1 holds. We can always guarantee this condition by a transformation (see (16.179), p. 783). Then the integral I is an area inside the unit square E (Fig. 16.16). If we consider the numbers of a sequence of uniformly distributed random numbers from the interval [0, 1] in pairs as the coordinates of points of the unit square E, then we get n points P_i (i = 1, 2, ..., n). If we denote by m the number of points inside the area A, then considering the notion of relative frequency (see 16.2.1.2, p. 746):

∫_0^1 g(x) dx ≈ m/n.   (16.175)

Figure 16.16 The area A under the curve y = g(x) inside the unit square E
To achieve relatively good accuracy with the ratio in (16.175). we need a very large number of random numbers. This is the reason why we are looking for possibilities to improve the accuracy. One of these methods is the following Monte Carlo method. Some others can be found in the literature.
2. Approximation by the Mean Value To determine (16.174), we start with n uniformly distributed random numbers ξ_1, ξ_2, ..., ξ_n as realizations of the uniformly distributed random variable X. Then the values g_i = g(ξ_i) (i = 1, 2, ..., n) are realizations of the random variable g(X), whose expectation according to formula (16.49a,b), p. 751, is

E(g(X)) = ∫_0^1 g(x) f_0(x) dx = ∫_0^1 g(x) dx = I,  so that  I ≈ (1/n) Σ_{i=1}^{n} g(ξ_i).   (16.176)

This method, which uses a sample mean, is also called the usual Monte Carlo method.
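Both approximations (16.175) and (16.176) can be compared with a short simulation. In the sketch below the integrand g(x) = x² is a purely illustrative choice; its exact integral over [0, 1] is 1/3.

```python
import random

def monte_carlo_integral(g, n, seed=0):
    """Approximate I = integral of g over [0,1] (0 <= g <= 1) by the
    relative frequency (16.175) and by the mean value method (16.176)."""
    random.seed(seed)
    hits = 0
    mean_sum = 0.0
    for _ in range(n):
        x, y = random.random(), random.random()
        if y <= g(x):                 # point below the curve counts for m in (16.175)
            hits += 1
        mean_sum += g(x)              # realization g(xi_i) for (16.176)
    return hits / n, mean_sum / n

print(monte_carlo_integral(lambda x: x * x, 100000))   # both values close to 1/3
```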
16.3.5.4 Application of the Monte Carlo Method in Numerical Mathematics
1. Evaluation of Multiple Integrals First we show how to transform a definite integral of one variable

I' = ∫_a^b h(x) dx   (16.177)

into an expression which contains the integral

I = ∫_0^1 g(x) dx  with  0 ≤ g(x) ≤ 1.   (16.178)

Then we can apply the Monte Carlo method given in 16.3.5.3. Introduce the following notation:

x = a + (b − a)u,   m = min_{x∈[a,b]} h(x),   M = max_{x∈[a,b]} h(x).   (16.179)

Then (16.177) becomes

I' = (b − a)(M − m) ∫_0^1 g(u) du + m(b − a),   (16.180)

where the integrand g(u) = [h(a + (b − a)u) − m] / (M − m) satisfies the relation 0 ≤ g(u) ≤ 1.
■ The approximate evaluation of multiple integrals with Monte Carlo methods is demonstrated by the example of a double integral
V = ∬_S h(x, y) dx dy  with  h(x, y) ≥ 0,   (16.181)

where S denotes a plane surface domain given by the inequalities a ≤ x ≤ b and φ_1(x) ≤ y ≤ φ_2(x), where φ_1(x) and φ_2(x) denote given functions. Then V can be considered as the volume of a cylindrical solid K which stands perpendicular to the x,y plane and whose upper surface is given by h(x, y). If h(x, y) ≤ e holds, then this solid lies in a block Q given by the inequalities a ≤ x ≤ b, c ≤ y ≤ d, 0 ≤ z ≤ e (a, b, c, d, e const). After a transformation similar to (16.179) we get from (16.181) an expression containing the integral

V* = ∬_{S*} g(u, v) du dv  with  0 ≤ g(u, v) ≤ 1,   (16.182)

where V* can be considered as the volume of a solid K* in the three-dimensional unit cube. The integral (16.182) is approximated by the Monte Carlo method in the following way: We consider the numbers of a sequence of uniformly distributed random numbers from the interval
[0, 1] in triplets as the coordinates of points P_i (i = 1, 2, ..., n) of the unit cube, and count how many of the points P_i belong to the solid K*. If m points belong to K*, then analogously to (16.175)

V* ≈ m/n.   (16.183)

Remark: For definite integrals with one integration variable we should use the methods given in 19.3, p. 895. For the evaluation of multiple integrals, the Monte Carlo method is still often recommended.
2. Solution of Partial Differential Equations with the Random Walk Process The Monte Carlo method can be used for the approximate solution of partial differential equations with the random walk process.
a) Example of a Boundary Value Problem: Consider the following boundary value problem as an example:

Δu = ∂²u/∂x² + ∂²u/∂y² = 0  for (x, y) ∈ G,   (16.184a)
u(x, y) = f(x, y)  for (x, y) ∈ Γ.   (16.184b)

Here G is a simply connected domain in the x,y plane; Γ denotes the boundary of G (Fig. 16.17). Similarly to the difference method for boundary value problems (Chapter 19), G is covered by a quadratic lattice, where we can assume, without loss of generality, that the step size is chosen as h = 1.

Figure 16.17 Domain G with boundary Γ covered by a quadratic lattice
In this way we get interior lattice points P(x, y) and boundary points R_i. The boundary points R_i, which are at the same time also lattice points, are considered in the following as points of the boundary Γ of G, i.e.,

u(R_i) = f(R_i)   (i = 1, 2, ..., N).   (16.185)

b) Solution Principle: We imagine that a particle starts a random walk from an interior point P(x, y). That is:
1. The particle moves randomly from P(x, y) to one of the four neighboring points. We assign to each of these four grid points the probability 1/4 of moving into it.
2. If the particle reaches a boundary point R_i, then the random walk terminates there with probability one.
It can be proven that a particle starting at any interior point P reaches a boundary point R_i after a finite number of steps with probability one. We denote by

p(P, R_i) = P(the walk starting at P terminates at R_i)   (16.186)

the probability that a random walk starting at P(x, y) will terminate at the boundary point R_i. Then we get p(R_i, R_i) = 1, p(R_i, R_j) = 0 for i ≠ j and

Σ_{i=1}^{N} p(P, R_i) = 1.   (16.187)

Furthermore,

p((x, y), R_i) = (1/4) [ p((x−1, y), R_i) + p((x+1, y), R_i) + p((x, y−1), R_i) + p((x, y+1), R_i) ].   (16.188)

The equation (16.188) is a difference equation for p((x, y), R_i). If we start n random walks from the point P(x, y), of which m_i terminate at R_i (m_i ≤ n), then we get

p((x, y), R_i) ≈ m_i / n.   (16.189)
The relation (16.189) gives an approximate solution of the differential equation (16.184a) with the boundary condition (16.185). The boundary condition (16.184b) will be fulfilled if we substitute

v(P) = v(x, y) = Σ_{i=1}^{N} f(R_i) p((x, y), R_i),   (16.190)

because then, by the properties of p stated above,

v(R_j) = Σ_{i=1}^{N} f(R_i) p(R_j, R_i) = f(R_j).

To calculate v(x, y) we multiply (16.188) by f(R_i). After summation over i we get the following difference equation for v(x, y):

v(x, y) = (1/4) [ v(x−1, y) + v(x+1, y) + v(x, y−1) + v(x, y+1) ].   (16.191)

If we start n random walks from an interior point P(x, y), and among them m_i terminate at the boundary point R_i (i = 1, 2, ..., N), then we get an approximate value of the solution of the boundary value problem (16.184a,b) at the point P(x, y) by

v(x, y) ≈ Σ_{i=1}^{N} f(R_i) · m_i / n.   (16.192)
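The random walk estimate (16.192) is illustrated by the sketch below for a square lattice domain; the boundary function f (value 1 on the upper edge, 0 elsewhere), the lattice size and the number of walks are assumptions made only for this example.

```python
import random

def random_walk_solution(x, y, size, f, n_walks, seed=0):
    """Estimate the solution v(x, y) of (16.184a,b) at the interior lattice
    point (x, y) of the square 0..size by (16.192): average of the boundary
    values f at the terminating points of n random walks."""
    random.seed(seed)
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    total = 0.0
    for _ in range(n_walks):
        px, py = x, y
        while 0 < px < size and 0 < py < size:      # interior point: keep walking
            dx, dy = random.choice(steps)           # each neighbour with probability 1/4
            px, py = px + dx, py + dy
        total += f(px, py)                          # walk terminated at a boundary point
    return total / n_walks

# Illustrative boundary values: f = 1 on the upper edge, 0 on the other edges
f = lambda px, py: 1.0 if py == 10 else 0.0
print(random_walk_solution(5, 5, size=10, f=f, n_walks=20000))   # close to 0.25 by symmetry
```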
16.3.5.5 Further Applications of the Monte Carlo Method Monte Carlo methods as stochastic simulation, sometimes called methods of statistical experiments, are used in many different areas. As examples we mention problems of operations research: queueing systems, process design, inventory control, service. For further details of these problem areas see for example [16.13].
16.4 Calculus of Errors
Every scientific measurement, giving certain numerical quantities, is always subject to errors and uncertainties, regardless of the care with which the measurements are made. There are observational errors, errors of the measuring method, instrumental errors, and often errors arising from the inherent random nature of the phenomena being measured. Together they compose the measurement error. All measurement errors arising during the measuring process are called deviations. As a consequence, a measured quantity represented by a number of significant digits can be given only with a rounding error, i.e., with a certain statistical error, which we call the uncertainty of the result.
1. The deviations of the measuring process should be kept as small as possible. On this basis we have to evaluate the best possible approximation, which can be done with the help of smoothing methods that have their origin in the Gaussian least squares method.
2. The uncertainties have to be estimated as well as possible, which can be done with the help of the methods of mathematical statistics. Because of the random character of the measuring results we can consider them as statistical samples (see 16.2.3.1, p. 752) with their probability distribution, whose parameters contain the desired information. In this sense, measurement errors can be seen as sampling errors.
16.4.1 Measurement Error and its Distribution 16.4.1.1 Qualitative Characterization of Measurement Errors If we qualify the measurement errors by their causes, we can distinguish between the following three types of errors: 1. Rough errors are caused by inaccurate readings or confusion; they are excludable.
2. Systematic measurement errors are caused by inaccurately scaled measuring devices and by the method of measuring, where the method of reading the data and also the measured error of the measurement system can play a role. They are not always avoidable. 3. Statistical or random measurement errors can arise from random changes of the measuring conditions that are difficult or impossible to control and also by certain random properties of the events observed. In the theory of measurement errors the rough errors and the systematic measurement errors are excluded and we deal only with the statistical properties and with the random measurement errors in the calculation of the rounding errors.
16.4.1.2 Density Function of the Measurement Error
1. Measurement Protocol We suppose that, to characterize the uncertainty, we have the measured results listed in a measurement record (protocol), and we have the relative frequencies or the density function f(x), or the cumulative frequencies or the distribution function F(x) (see 16.3.2.1, p. 770) of the uncertain values. By x we denote the realization of the random variable X under consideration.
2. Error Density Function Special assumptions about the properties of the measurement error result in certain special properties of the density function of the error distribution:
1. Continuous Density Function Since the random measurement errors can take any value in a certain interval, they are described by a continuous density function f(x).
2. Even Density Function If measurement errors with the same absolute value but with different signs are equally likely, then the density function is an even function: f(−x) = f(x).
3. Monotonically Decreasing Density Function If a measurement error with larger absolute value is less likely than an error with smaller absolute value, then the density function f(x) is monotonically decreasing for x > 0.
4. Finite Expected Value The expected value of the absolute value of the error must be finite:

E(|X|) = ∫_{−∞}^{∞} |x| f(x) dx < ∞.   (16.193)
Different properties of the errors result in different types of density functions.
3. Normal Distribution of the Error
1. Density Function and Distribution Function In most practical cases we can suppose that the distribution of the measurement error is a normal distribution with expected value μ = 0 and variance σ², i.e., the density function f(x) and the distribution function F(x) of the measurement error are

f(x) = (1/(σ√(2π))) e^{−x²/(2σ²)}   (16.194a)   and   F(x) = (1/(σ√(2π))) ∫_{−∞}^{x} e^{−t²/(2σ²)} dt = Φ(x/σ).   (16.194b)

Here Φ(z) is the distribution function of the standard normal distribution (see (16.74a), p. 757, and Table 21.15, p. 1085). In the case of (16.194a,b) we speak of normal errors.
2. Geometrical Representation The density function (16.194a) is represented in Fig. 16.18a with its inflection points and the centers of gravity of the half-areas; its behavior when the variance changes is shown in Fig. 16.18b. The inflection points are at the abscissa values ±σ; the centers of gravity of the half-areas are at ±η. The maximum of the function is at x = 0 and equals 1/(σ√(2π)). The curve widens as σ² increases, while the area under the curve always equals one. This distribution shows that small errors occur often, large errors only seldom.
Figure 16.18 a) Density function of the normal error distribution with inflection points at ±σ and centers of gravity of the half-areas at ±η; b) its behavior for different values of σ

4. Parameters to Characterize the Normally Distributed Error Besides the variance σ² or the standard deviation σ, which is also called the mean square error or standard error, there are other parameters to characterize the normally distributed error, such as the measure of accuracy h, the average error or mean error η, and the probable error γ.
1. Measure of Accuracy Besides the variance σ², the measure of accuracy

h = 1/(σ√2)   (16.195)
is used to characterize the width of the normal distribution. A narrower Gauss curve corresponds to a better accuracy (Fig. 16.18b). If we replace σ by the experimental value σ̃ or σ̃_AM obtained from the measured values, the measure of accuracy characterizes the accuracy of the measurement method.
2. Average or Mean Error The expected value η of the absolute value of the error is defined as

η = E(|X|) = 2 ∫_0^∞ x f(x) dx.   (16.196)
3. Probable Error The bound γ of the absolute value of the error with the property

P(|X| ≤ γ) = 1/2   (16.197a)

is called the probable error. It implies that

2Φ(γ/σ) − 1 = 1/2,  i.e.,  Φ(γ/σ) = 3/4,   (16.197b)

where Φ(z) is the distribution function of the standard normal distribution.
4. Given Error Bounds If an upper bound a > 0 of an error is given, then we can calculate the probability that the error lies in the interval [−a, a] by the formula

P(|X| ≤ a) = 2Φ(a/σ) − 1.   (16.198)

5. Relations between Standard Deviation, Average Error, Probable Error, and Accuracy If the error has a normal distribution, then the following relations hold with the constant factor ρ = 0.4769:
σ = 1/(h√2),   γ = ρ/h = ρ√2·σ ≈ 0.6745·σ,   (16.199a)

η = 1/(h√π) = √(2/π)·σ ≈ 0.7979·σ,   (16.199b)

and

2Φ(ρ√2) − 1 = 1/2.   (16.200)
16.4.1.3 Quantitative Characterization of the Measurement Error
1. True Value and its Approximations The true value x_t of a measurable quantity is usually unknown. We choose the expected value of the random variable, whose realizations are the measured values x_i (i = 1, 2, ..., n), as an estimate of x_t. Consequently, the following means can be considered as approximations of x_t:
1. Arithmetical Mean

x̄ = (1/n) Σ_{i=1}^{n} x_i   (16.201a)   or   x̄ = (1/n) Σ_{j=1}^{k} h_j x̄_j,   (16.201b)

if the measured values are distributed into k classes with absolute frequencies h_j and class means x̄_j (j = 1, 2, ..., k).
2. Weighted Mean

x̄_g = Σ_{i=1}^{n} g_i x_i / Σ_{i=1}^{n} g_i.   (16.202)

Here the single measured values are weighted by the weighting factors g_i (g_i > 0) (see 16.4.1.6, 1., p. 791).
2. Error of a Single Measurement in a Measurement Sequence
1. True Error of a Single Measurement in a Measurement Sequence is the difference between the true value x_t and the measured result. Because the true value is usually unknown, the true error ε_i of the i-th measurement with the result x_i is also unknown:

ε_i = x_t − x_i.   (16.203a)

2. Mean Error of a Single Measurement in a Measurement Sequence is the difference between the arithmetical mean and the measured result x_i:

v_i = x̄ − x_i.   (16.203b)

3. Mean Square Error of a Single Measurement or Standard Error of a Single Measurement Since the expected value of the sum of the true errors ε_i and the expected value of the sum of the mean errors v_i of n measurements are zero (independently of how large the errors are), we also consider the sums of the error squares:

ε² = Σ_{i=1}^{n} ε_i²,   (16.204a)      v² = Σ_{i=1}^{n} v_i².   (16.204b)

From a practical point of view only the value of (16.204b) is of interest, since only the values v_i can be determined from the measuring process. Therefore, the mean square error of a single measurement of a measurement sequence is defined by

σ̃ = √( Σ_{i=1}^{n} v_i² / (n − 1) ).   (16.205)

The value σ̃ is an approximation of the standard deviation σ of the error distribution. For σ̃ = σ we get in the case of a normally distributed error

P(|ε| ≤ σ̃) = 2Φ(1) − 1 ≈ 0.68.   (16.206)

That is: The probability that the absolute value of the true error does not exceed σ̃ is about 68 %.
4. Probable Error is the number γ for which

P(|ε| ≤ γ) = 1/2.   (16.207)

That is: The probability that the absolute value of the error does not exceed γ is 50 %. The abscissae ±γ divide the areas of the left and the right halves under the density function into two equal parts (Fig. 16.18a). The relation between γ̃ and σ̃ in the case of a normally distributed error is

γ̃ ≈ 0.6745·σ̃ ≈ (2/3)·σ̃.   (16.208)

5. Average Error is the number η, which is the expected value of the absolute value of the error:
η = E(|ε|) = ∫_{−∞}^{∞} |x| f(x) dx.   (16.209)

In the case of a normally distributed error we get η = 0.798·σ. It follows from the relation

P(|ε| ≤ η) = 2Φ(0.798) − 1 ≈ 0.576   (16.210)

that the probability that the error does not exceed the value η is about 57.6 %. The centers of gravity of the left and right areas under the density function (Fig. 16.18a) are at the abscissae ±η. We also get

η̃ = 0.7978·σ̃ = 0.7978·√( Σ_{i=1}^{n} v_i² / (n − 1) ).   (16.211)
3. Error of the Arithmetical Mean of a Measurement Sequence The error of the arithmetical mean x̄ of a measurement sequence is given in terms of the errors of the single measurements:
1. Mean Square Error or Standard Deviation

σ̃_AM = σ̃/√n = √( Σ_{i=1}^{n} v_i² / (n(n − 1)) ).   (16.212)

2. Probable Error

γ̃_AM = γ̃/√n ≈ 0.6745·σ̃_AM.   (16.213)

3. Average Error

η̃_AM = η̃/√n ≈ 0.7978·σ̃_AM.   (16.214)

4. Accessible Level of Error Since the three types of errors defined above in (16.212)-(16.214) are directly proportional to the corresponding errors of the single measurement (16.205), (16.208) and (16.211), and they are proportional to the reciprocal of the square root of n, it is not reasonable to increase the number of measurements beyond a certain value. It is more efficient to improve the accuracy h of the measuring method (16.195).
790
16. Probabilitu Theoru and Mathematical Statistics
measurement. The notion of the absolute uncertaznity, given as the absolute error, is meaningful for all types of errors and for the calculation of error propagation (see 16.4.2,p. 792). They have the same dimension as the measured quantity. The word ‘‘absolute” error is introduced to avoid confusion with the notion of relative error. We often use the notation Axi or Ax. The word “absolute“ has a different meaning from the notion of absolute value: It refers to the numerical value of the measured quantity (e.g., length, weight, energy), without restriction of its sign. 2. Relative Uncertainty, Relative Error The relative uncertainty, given by the relative error, is a measure of the quality of the method of measurement with respect to the numerical value of the measured quantity. In contrast to the absolute error. the relative errorhas no dimension, because it is the quotient of the absolute error and the numerical value of the measured quantity. If this value is not known. we replace it by the mean value of the quantity x:
δx_i = Δx_i / x ≈ Δx_i / x̄.  (16.215a)
The relative error is mostly given as a percentage and is then also called the percentage error:
δx_i /% = δx_i · 100%.  (16.215b)
5. Absolute and Relative Maximum Error
1. Absolute Maximum Error If the quantity z we want to determine is a function of the measured quantities x₁, x₂, ..., x_n, i.e., z = f(x₁, x₂, ..., x_n), then the resulting error must be calculated taking the function f into consideration. There are two different ways to examine the errors. In the first approach, statistical error analysis is applied by smoothing the data values with the least squares method (min Σ(x_i − x̄)²); in the second approach, an upper bound Δz_max is determined for the absolute error of the quantities. If we have n independent variables x_i, then:
Δz_max = Σ_{i=1}^{n} |∂f/∂x_i| Δx_i,  (16.216)
where we substitute the mean value x̄_i for x_i.
2. Relative Maximum Error We get the relative maximum error if we divide the absolute maximum error by the numerical value of the measured value (mostly by the mean of z):
δz_max = Δz_max / z̄.  (16.217)
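The bounds (16.216) and (16.217) can be evaluated numerically. The following Python sketch approximates the partial derivatives by central differences; the function f, the mean values and the error bounds are illustrative assumptions, not values from the text.

```python
# Sketch: absolute and relative maximum error, cf. (16.216)-(16.217).
# The function f and all numbers below are illustrative assumptions.

def f(x1, x2):
    return x1 * x2**2          # hypothetical measured relationship z = f(x1, x2)

x_mean = (2.0, 3.0)            # assumed mean values of the measured quantities
dx_max = (0.01, 0.02)          # assumed error bounds of the single quantities
h = 1e-6                       # step for numerical differentiation

dz_max = 0.0
for i in range(len(x_mean)):
    xp, xm = list(x_mean), list(x_mean)
    xp[i] += h; xm[i] -= h
    dfdx = (f(*xp) - f(*xm)) / (2 * h)   # central difference for df/dx_i
    dz_max += abs(dfdx) * dx_max[i]

z = f(*x_mean)
print("z =", z, " abs. max. error =", dz_max, " rel. max. error =", dz_max / abs(z))
```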
16.4.1.4 Determining the Result of a Measurement with Bounds on the Error
A realistic interpretation of a measurement result is possible only if the expected error is also given; error estimates and bounds are components of measurement results. It must be clear from the data what type of error is given, what the confidence interval is and what the significance level is.
1. Defining the Error The result of a single measurement is required to be given in the form
x = x_i ± Δx ≈ x_i ± σ̃,  (16.218a)
and the result of the mean has the form
x = x̄ ± Δx̄ ≈ x̄ ± σ̃_AM.  (16.218b)
Here Δx is most often the standard deviation; γ̃ and η̃ could also be used.
2. Prescription of Arbitrary Confidence Limits The quantity T = (x̄ − x_s)/σ̃_AM has a t distribution (16.101b) with f = n − 1 degrees of freedom in the case of a population with a distribution N(μ, σ²) according to (16.100). For a required significance level α, or for an acceptance probability S = 1 − α, we get the confidence limits for the unknown quantity x_s = μ with the t quantile t_{α/2;f}:
μ = x̄ ± t_{α/2;f} · σ̃_AM.  (16.219)
That is, the true value x_s lies in the interval given by these limits with probability S = 1 − α. We are mostly interested in keeping the size n of the measurement sequence as small as possible. The length 2 t_{α/2;f} σ̃_AM of the confidence interval decreases for smaller values of 1 − α and also for a larger number n of measurements. Since σ̃_AM decreases proportionally to 1/√n and the quantile t_{α/2;f} with f = n − 1 degrees of freedom also decreases approximately proportionally to 1/√n for values of n between 5 and 10 (see Table 21.18, p. 1090), the length of the confidence interval decreases proportionally to 1/n for such values of n.
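A minimal numerical sketch of the confidence limits (16.219); scipy's Student-t quantile is used in place of Table 21.18, and the data are the measurement sequence of the example in 16.4.1.5 below.

```python
# Sketch: confidence interval x_bar +/- t_{alpha/2; f} * sigma_AM, cf. (16.219).
import numpy as np
from scipy import stats

x = np.array([1.592, 1.581, 1.574, 1.566, 1.603,
              1.580, 1.591, 1.583, 1.571, 1.559])   # measurement sequence
n = len(x)
x_bar = x.mean()
sigma = np.sqrt(np.sum((x - x_bar)**2) / (n - 1))   # cf. (16.205)
sigma_am = sigma / np.sqrt(n)                        # cf. (16.212)

alpha = 0.05                                         # assumed significance level
t_q = stats.t.ppf(1 - alpha/2, df=n - 1)             # t quantile with f = n - 1
print(f"mu in [{x_bar - t_q*sigma_am:.4f}, {x_bar + t_q*sigma_am:.4f}]  (S = {1-alpha})")
```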
16.4.1.5 Error Estimation for Direct Measurements with the Same Accuracy
If we can achieve the same variance σ_i² for all n measurements, we talk about measurements with the same accuracy h = const. In this case, the least squares method results in the error quantities given in (16.205), (16.208) and (16.210).
■ Determine the final result for the measurement sequence given in the following table, which contains n = 10 direct measurements with the same accuracy.
x_i        1.592  1.581  1.574  1.566  1.603  1.580  1.591  1.583  1.571  1.559
v_i·10³     −12     −1     +6    +14    −23      0    −11     −3     +9    +21
v_i²·10⁶    144      1     36    196    529      0    121      9     81    441
Final result: x = x̄ ± σ̃_AM = 1.580 ± 0.0042.
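The numbers of this example can be reproduced with a few lines of Python; this is a plain re-computation of x̄, v_i and σ̃_AM with no assumptions beyond the tabulated data.

```python
# Re-computation of the example: mean, mean errors and error of the mean.
x = [1.592, 1.581, 1.574, 1.566, 1.603, 1.580, 1.591, 1.583, 1.571, 1.559]
n = len(x)
x_bar = sum(x) / n                         # arithmetical mean
v = [x_bar - xi for xi in x]               # mean errors, cf. (16.203b)
sigma = (sum(vi**2 for vi in v) / (n - 1)) ** 0.5   # cf. (16.205)
sigma_am = sigma / n ** 0.5                # error of the mean, cf. (16.212)
print(round(x_bar, 3), round(sigma, 4), round(sigma_am, 4))
# -> 1.58 0.0132 0.0042, i.e. x = 1.580 +/- 0.0042
```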
16.4.1.6 Error Estimation for Direct Measurements with Different Accuracy
1. Weighted Measurements If the direct measurement results x_i are obtained from different measuring methods, or if they represent means of single measurements which belong to the same mean x̄ with different variances σ̃_i², we calculate a weighted mean
x̄ = Σ_{i=1}^{n} g_i x_i / Σ_{i=1}^{n} g_i,  (16.220)
where the weight g_i is defined as
g_i = σ̃₀² / σ̃_i².  (16.221)
Here σ̃₀ is an arbitrary positive value, mostly the smallest σ̃_i. It serves as a weight unit of the deviations, i.e., for σ̃_i = σ̃₀ we have g_i = 1. It follows from (16.221) that a larger weight of a measurement corresponds to a smaller deviation σ̃_i.
2. Standard Deviations The standard deviation of the weight unit is estimated as
σ̃₀^(g) = √( Σ_{i=1}^{n} g_i v_i² / (n − 1) ).  (16.222)
We have to be sure that σ̃₀^(g) < σ̃₀. In the opposite case, if σ̃₀^(g) > σ̃₀, then there are x_i values which have systematic deviations.
The standard deviation of the single measurement is
σ̃_i^(g) = σ̃₀^(g) / √g_i,  (16.223)
where σ̃_i^(g) < σ̃_i can be expected. The standard deviation of the weighted mean is:
σ̃_x̄^(g) = σ̃₀^(g) / √( Σ_{i=1}^{n} g_i ).  (16.224)
3. Error Description The error can be described as represented in 16.4.1.4, p. 790, either by the definition of the error or by the t quantile with f degrees of freedom.
■ The final results of measurement sequences (n = 5) with different means x̄_i (i = 1, 2, ..., 5) and with different standard deviations σ̃_AM,i are given in Table 16.6. We calculate (x̄_i)_m = 1.5830 and we choose x₀ = 1.585 and σ̃₀ = 0.009. With z_i = x̄_i − x₀ and g_i = σ̃₀²/σ̃_i² we get z̄ = Σ g_i z_i / Σ g_i = −0.0036 and x̄ = x₀ + z̄ = 1.582. The standard deviation of the weight unit is
σ̃₀^(g) = √( Σ_{i=1}^{n} g_i z_i² / (n − 1) ) = 0.0088 < σ̃₀,
and the standard deviation of the weighted mean is σ̃_x̄^(g) = σ̃₀^(g) / √(Σ g_i) = 0.0027. The final result is x = x̄ ± σ̃_x̄^(g) = 1.582 ± 0.0027.
Table 16.6 Error description of a measurement sequence

x̄_i            1.573       1.580       1.582       1.589       1.591
σ̃_AM,i         0.010       0.004       0.005       0.009       0.011
σ̃_AM,i²        1.0·10⁻⁴    1.6·10⁻⁵    2.5·10⁻⁵    8.1·10⁻⁵    1.21·10⁻⁴
g_i z_i²        1.16·10⁻⁴   1.26·10⁻⁴   2.91·10⁻⁵   1.6·10⁻⁵    2.37·10⁻⁵

σ̃₀ = 0.009,  Σ_{i=1}^{5} g_i = 10.7
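The weighted-mean computation of Table 16.6 can be reproduced directly; the following sketch follows the worked example above (reference value x₀ and weight unit σ̃₀ as chosen there).

```python
# Sketch: weighted mean and its standard deviation, cf. (16.220)-(16.224),
# using the values of Table 16.6.
x  = [1.573, 1.580, 1.582, 1.589, 1.591]      # means x_i
s  = [0.010, 0.004, 0.005, 0.009, 0.011]      # standard deviations sigma_i
s0 = 0.009                                    # chosen weight unit sigma_0
x0 = 1.585                                    # chosen reference value

g  = [s0**2 / si**2 for si in s]              # weights, cf. (16.221)
z  = [xi - x0 for xi in x]
z_bar = sum(gi*zi for gi, zi in zip(g, z)) / sum(g)
x_bar = x0 + z_bar                            # weighted mean, cf. (16.220)

n = len(x)
s0_g  = (sum(gi*zi**2 for gi, zi in zip(g, z)) / (n - 1)) ** 0.5  # cf. (16.222)
s_x_g = s0_g / sum(g) ** 0.5                                      # cf. (16.224)
print(round(x_bar, 3), round(s0_g, 4), round(s_x_g, 4))
# -> 1.582 0.0088 0.0027  (cf. the final result above)
```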
16.4.2 Error Propagation and Error Analysis Measured quantities appear in final results often after a functional transformation. If the error is small, we can use a linear Taylor expansion with respect to the error. Then we talk about error propagation.
16.4.2.1 Gauss Error Propagation Law
1. Problem Formulation Suppose we have to determine the numerical value and the error of a quantity z given by the function z = f(x₁, x₂, ..., x_k) of the independent variables x_j (j = 1, 2, ..., k). The mean value x̄_j obtained from n_j measured values is considered as a realization of the random variable X_j with variance σ_j². We wish to examine how the errors of the variables affect the function value f(x₁, x₂, ..., x_k). It is assumed that the function f(x₁, x₂, ..., x_k) is differentiable and that its variables are stochastically independent; however, they may follow any type of distribution with different variances σ_j².
2. Taylor Expansion Since the errors represent relatively small changes of the independent variables, the function f(x₁, x₂, ..., x_k) can be approximated in the neighborhood of the means x̄_j by the linear part of its Taylor expansion with the coefficients a_j, so for its error Δf we have:
Δf = f(x₁, x₂, ..., x_k) − f(x̄₁, x̄₂, ..., x̄_k),  (16.225a)
Δf ≈ a₁(x₁ − x̄₁) + a₂(x₂ − x̄₂) + ⋯ + a_k(x_k − x̄_k),  a_j = ∂f/∂x_j,  (16.225b)
where the partial derivatives ∂f/∂x_j are taken at (x̄₁, x̄₂, ..., x̄_k). The variance of the function is
σ_f² = a₁² σ_{x₁}² + a₂² σ_{x₂}² + ⋯ + a_k² σ_{x_k}² = Σ_{j=1}^{k} a_j² σ_{x_j}².  (16.226)
3. Approximation of the Variance σ_j² Since the variances of the independent variables x_j are unknown, we approximate them by the variances of their means, which are determined from the measured values x_{jl} (l = 1, 2, ..., n_j) of the single variables as follows:
σ̃_{x̄_j}² = Σ_{l=1}^{n_j} (x_{jl} − x̄_j)² / (n_j(n_j − 1)).  (16.227)
With these values we form an approximation of σ_f²:
σ̃_f² = Σ_{j=1}^{k} a_j² σ̃_{x̄_j}².  (16.228)
The formula (16.228) is called the Gauss error propagation law.
4. Special Cases
1. Linear Case An often occurring case is when the errors of sequentially acting error sources are added with a_j = 1:
σ̃_f = √( σ̃₁² + σ̃₂² + ⋯ + σ̃_k² ).  (16.229)
■ The pulse length is to be measured at the output of a pulse amplifier of a detector channel for radiation spectrometry, whose error can be traced back to three components: 1. the statistical energy distribution of the radiation passing through the spectrometer with energy E₀, characterized by σ̃_Str; 2. statistical interference processes in the detector with σ̃_Det; 3. electronic noise of the amplifier of the detector pulse, σ̃_el. The total pulse length has the error
σ̃_f = √( σ̃_Str² + σ̃_Det² + σ̃_el² ).  (16.230)
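A numerical sketch of the error propagation law (16.228): the coefficients a_j are approximated by central differences at the mean values. The function and all numbers are illustrative assumptions.

```python
# Sketch of the Gauss error propagation law (16.228).
# f, the means and the mean-value errors are illustrative assumptions.
import math

def f(x1, x2, x3):
    return x1 + x2 * x3          # hypothetical relationship

means      = (1.0, 2.0, 0.5)     # assumed mean values x_bar_j
sigma_mean = (0.02, 0.01, 0.03)  # assumed errors of the means, cf. (16.227)
h = 1e-6

var_f = 0.0
for j in range(3):
    xp, xm = list(means), list(means)
    xp[j] += h; xm[j] -= h
    a_j = (f(*xp) - f(*xm)) / (2 * h)     # coefficient a_j = df/dx_j at the means
    var_f += a_j**2 * sigma_mean[j]**2    # accumulate a_j^2 * sigma_j^2

print("sigma_f =", math.sqrt(var_f))
```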
2. Power Rule The variables x_j often occur in the following form:
z = f(x₁, x₂, ..., x_k) = a · x₁^{b₁} · x₂^{b₂} ⋯ x_k^{b_k}.  (16.231)
By logarithmic differentiation we get the relative error
Δz/z = Σ_{j=1}^{k} b_j Δx_j / x_j,  (16.232)
from which by the error propagation law we get for the mean relative error:
σ̃_z / z̄ = √( Σ_{j=1}^{k} b_j² (σ̃_{x̄_j} / x̄_j)² ).  (16.233)
■ Suppose that the function f(x₁, x₂, x₃) has the form f(x₁, x₂, x₃) = a x₂² x₃³, and the standard deviations are σ̃_{x₁}, σ̃_{x₂} and σ̃_{x₃}. The relative error is then
δz = Δz/z = √( (2 σ̃_{x₂}/x̄₂)² + (3 σ̃_{x₃}/x̄₃)² ).
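For the power form (16.231) the relative error (16.233) needs only the exponents b_j and the relative errors of the single variables; a minimal sketch with assumed numbers:

```python
# Sketch: relative error for z = a * x2^2 * x3^3, cf. (16.233).
# The relative errors below are assumed for illustration.
rel = {"x2": 0.01, "x3": 0.005}            # sigma_xj / x_bar_j (assumed)
b   = {"x2": 2,    "x3": 3}                # exponents from the power form
rel_z = sum((b[k] * rel[k])**2 for k in rel) ** 0.5
print("relative error of z:", rel_z)       # sqrt((2*0.01)^2 + (3*0.005)^2)
```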
5. Difference from the Maximum Error Declaring the absolute or relative maximum error (16.216), (16.217) means that we do not use smoothing for the measured values. For the determination of the relative and absolute error with the error propagation laws (16.228) or (16.231), smoothing between the measurement values x_j means that we determine for them a confidence interval for a previously given level. This procedure is given in 16.4.1.4, p. 790.
16.4.2.2 Error Analysis The general analysis of error propagation in the calculation of a function φ(x_i), when quantities of higher order are neglected, is called error analysis. In the framework of error analysis we investigate, using an algorithm, how an input error Δx_i affects the value of φ(x_i); in this context we also talk about differential error analysis. In numerical mathematics, error analysis means the investigation of the effect of errors of methods, of rounding, and of input errors on the final result (see [19.24]).
17 Dynamical Systems and Chaos
17.1 Ordinary Differential Equations and Mappings
17.1.1 Dynamical Systems
17.1.1.1 Basic Notions
1. The Notion of Dynamical Systems and Orbits A dynamical system is a mathematical object describing the time evolution of a physical, biological or other real-life system. It is defined by a phase space M and by a one-parameter family of mappings φ^t: M → M, where t is the parameter (the time). In the following, the phase space is often R^n, a subset of it, or a metric space. The time parameter t is from R (time-continuous system) or from Z or Z₊ (time-discrete system). Furthermore, it is required for arbitrary x ∈ M that a) φ⁰(x) = x and b) φ^t(φ^s(x)) = φ^{t+s}(x) for all t, s. The mapping φ¹ is denoted briefly by φ. In the following, the time set is denoted by Γ; hence Γ = R, Γ = R₊, Γ = Z or Γ = Z₊. If Γ = R, then the dynamical system is also called a flow; if Γ = Z or Γ = Z₊, then the dynamical system is discrete. In the cases Γ = R and Γ = Z, the properties a) and b) are satisfied for every t ∈ Γ, so the inverse mapping (φ^t)^{−1} = φ^{−t} also exists, and these systems are called invertible dynamical systems.
If the dynamical system is not invertible, then φ^{−t}(A) means the pre-image of A with respect to φ^t, for an arbitrary set A ⊂ M and arbitrary t > 0, i.e., φ^{−t}(A) = {x ∈ M: φ^t(x) ∈ A}. If the mapping φ^t: M → M is continuous or k times continuously differentiable for every t ∈ Γ (here M ⊂ R^n), then the dynamical system is called continuous or C^k-smooth, respectively. For an arbitrary fixed x ∈ M, the mapping t ↦ φ^t(x), t ∈ Γ, defines a motion of the dynamical system starting from x at time t = 0. The image γ(x) of a motion starting at x is called the orbit (or the trajectory) through x, namely γ(x) = {φ^t(x)}_{t∈Γ}. Analogously, the positive semiorbit through x is defined by γ₊(x) = {φ^t(x)}_{t≥0} and, if Γ ≠ R₊ and Γ ≠ Z₊, the negative semiorbit through x is defined by γ₋(x) = {φ^t(x)}_{t≤0}. An orbit γ(x) is called periodic if there exists a T > 0 such that φ^{t+T}(x) = φ^t(x) for all t ∈ Γ, and T ∈ Γ is the smallest positive number with this property. The number T is called the period.
2. Flow of a Differential Equation Consider the ordinary autonomous differential equation
ẋ = f(x),  (17.1)
where f: M → R^n (vector field) is an r-times continuously differentiable mapping and M = R^n or M is an open subset of R^n. In the following, the Euclidean norm || · || is used in R^n, i.e., for arbitrary
x = (x₁, ..., x_n) ∈ R^n, its norm is ||x|| = √( x₁² + ⋯ + x_n² ). If the mapping f is written componentwise, then
(17.1) is a system of n scalar differential equations ẋ_i = f_i(x₁, ..., x_n), i = 1, 2, ..., n. The Picard–Lindelöf theorem on the local existence and uniqueness of solutions of differential equations and the theorem on the r-times differentiability of solutions with respect to the initial values (see [17.11]) guarantee that for every x₀ ∈ M there exist a number ε > 0, a sphere B_δ(x₀) = {x: ||x − x₀|| < δ} in M and a mapping φ: (−ε, ε) × B_δ(x₀) → M such that:
1. φ(·,·) is (r + 1)-times continuously differentiable with respect to its first argument (time) and r-times continuously differentiable with respect to its second argument (phase variable);
2. for every fixed x ∈ B_δ(x₀), φ(·, x) is the locally unique solution of (17.1) in the time interval (−ε, ε) which starts from x at time t = 0, i.e., ∂φ(t, x)/∂t = φ̇(t, x) = f(φ(t, x)) holds for every t ∈ (−ε, ε), φ(0, x) = x, and every other solution with initial point x at time t = 0 coincides with φ(t, x) for all small |t|.
Suppose that every local solution of (17.1) can be extended uniquely to the whole of R. Then there exists a mapping φ: R × M → M with the following properties:
1. φ(0, x) = x for all x ∈ M.
2. φ(t + s, x) = φ(t, φ(s, x)) for all t, s ∈ R and all x ∈ M.
3. φ(·,·) is (r + 1)-times continuously differentiable with respect to its first argument and r-times with respect to the second one.
4. For every fixed x ∈ M, φ(·, x) is a solution of (17.1) on the whole of R.
Then the C^r-smooth flow generated by (17.1) can be defined by φ^t := φ(t, ·). The motions φ(·, x): R → M of a flow of (17.1) are called integral curves.
■ The equation
ẋ = σ(y − x),  ẏ = rx − y − xz,  ż = xy − bz  (17.2)
is called the Lorenz system of convective turbulence (see also 17.2.4.3, p. 825). Here σ > 0, r > 0 and b > 0 are parameters. The Lorenz system corresponds to a C^∞ flow on M = R³.
3. Discrete Dynamical System Consider the difference equation
x_{t+1} = φ(x_t),  (17.3)
which can also be written as an assignment x ↦ φ(x). Here φ: M → M is a continuous or r-times continuously differentiable mapping, where in the second case M ⊂ R^n. If φ is invertible, then (17.3) defines an invertible discrete dynamical system through the iteration of φ, namely,
φ^t = φ ∘ ⋯ ∘ φ (t times) for t > 0,   φ^t = φ^{−1} ∘ ⋯ ∘ φ^{−1} (−t times) for t < 0,   φ⁰ = id.  (17.4)
If φ is not invertible, then the mappings φ^t are defined only for t ≥ 0. For the realization of φ^t see (5.86), p. 296.
■ A: The difference equation
x_{t+1} = α x_t (1 − x_t),  t = 0, 1, ...  (17.5)
with parameter α ∈ (0, 4] is called a logistic equation. Here M = [0, 1], and φ: [0, 1] → [0, 1] is, for a fixed α, the function φ(x) = αx(1 − x). Obviously, φ is infinitely many times differentiable, but not invertible. Hence (17.5) defines a non-invertible dynamical system.
■ B: The difference equation
x_{t+1} = y_t + 1 − a x_t²,  y_{t+1} = b x_t,  t = 0, ±1, ...,  (17.6)
with parameters a > 0 and b ≠ 0 is called a Hénon mapping. The mapping φ: R² → R² corresponding to (17.6) is defined by φ(x, y) = (y + 1 − ax², bx); it is infinitely often differentiable and invertible.
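The discrete systems (17.5) and (17.6) are easy to iterate directly. The sketch below produces a finite piece of an orbit for each map; the parameter values are chosen only for illustration.

```python
# Sketch: iterating the logistic map (17.5) and the Henon map (17.6).
def logistic(x, alpha=3.8):            # alpha in (0, 4], illustrative value
    return alpha * x * (1.0 - x)

def henon(x, y, a=1.4, b=0.3):         # illustrative parameter values
    return y + 1.0 - a * x * x, b * x

x = 0.2
orbit_logistic = []
for _ in range(10):
    x = logistic(x)
    orbit_logistic.append(x)

p = (0.0, 0.0)
orbit_henon = []
for _ in range(10):
    p = henon(*p)
    orbit_henon.append(p)

print(orbit_logistic[-1], orbit_henon[-1])
```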
4. Volume Contracting and Volume Preserving Systems The invertible dynamical system {φ^t}_{t∈Γ} on M ⊂ R^n is called dissipative (respectively volume preserving or conservative) if the relation vol(φ^t(A)) < vol(A) (respectively vol(φ^t(A)) = vol(A)) holds for every set A ⊂ M with positive n-dimensional volume vol(A) and every t > 0 (t ∈ Γ).
■ A: Let φ in (17.3) be a C^r-diffeomorphism (i.e., φ: M → M is invertible, M ⊂ R^n open, φ and φ^{−1} are C^r-smooth mappings) and let Dφ(x) be the Jacobian matrix of φ at x ∈ M. The discrete system (17.3) is dissipative if |det Dφ(x)| < 1 for all x ∈ M, and conservative if |det Dφ(x)| = 1 in M.
■ B: For the system (17.6),
Dφ(x, y) = ( −2ax  1 ;  b  0 ),  and so |det Dφ(x, y)| = |b|.
Hence, (17.6) is dissipative if |b| < 1, and conservative if |b| = 1. The Hénon mapping can be decomposed into three mappings (Fig. 17.1): first, the initial domain is stretched and bent by the mapping x' = x, y' = y + 1 − ax² in an area-preserving way, then it is contracted in the direction of the x'-axis by x'' = bx', y'' = y' (for |b| < 1), and finally it is reflected with respect to the line y'' = x'' by x''' = y'', y''' = x''.
Figure 17.1
17.1.1.2 Invariant Sets
1. α- and ω-Limit Set, Absorbing Sets Let {φ^t}_{t∈Γ} be a dynamical system on M. The set A ⊂ M is invariant under {φ^t} if φ^t(A) = A holds for all t ∈ Γ, and positively invariant under {φ^t} if φ^t(A) ⊂ A holds for all t ≥ 0 from Γ. For every x ∈ M, the ω-limit set of the orbit passing through x is the set
ω(x) = {y ∈ M: ∃ t_n ∈ Γ, t_n → +∞, φ^{t_n}(x) → y as n → +∞}.  (17.7)
The elements of ω(x) are called ω-limit points of the orbit. If the dynamical system is invertible, then for every x ∈ M the set
α(x) = {y ∈ M: ∃ t_n ∈ Γ, t_n → −∞, φ^{t_n}(x) → y as n → +∞}  (17.8)
is called the α-limit set of the orbit passing through x; the elements of α(x) are called the α-limit points of the orbit. For many systems whose flow is volume decreasing there exists a bounded set in phase space such that every orbit reaching it stays there as time increases. A bounded, open and connected set U ⊂ M is called absorbing with respect to {φ^t} if φ^t(Ū) ⊂ U holds for all positive t from Γ. (Ū is the closure of U.)
■ Consider the system of differential equations
ẋ = −y + x(1 − x² − y²),  ẏ = x + y(1 − x² − y²)  (17.9a)
in the plane. Using the polar coordinates x = r cos θ, y = r sin θ, the solution of (17.9a) with initial state (r₀, θ₀) at time t = 0 has the form
r(t, r₀) = [1 + (r₀^{−2} − 1) e^{−2t}]^{−1/2},  θ(t, θ₀) = t + θ₀.
This representation of the solution shows that the flow of (17.9a) has a periodic orbit with period 2π, which can be given in the form γ((1, 0)) = {(cos t, sin t), t ∈ [0, 2π]}. The limit sets of an orbit through p are: ω(p) = γ((1, 0)) for every p ≠ (0, 0); α(p) = {(0, 0)} for 0 < ||p|| < 1, α(p) = γ((1, 0)) for ||p|| = 1, and α(p) = ∅ for ||p|| > 1.
Every open sphere B_r = {(x, y): x² + y² < r²} with r > 1 is an absorbing set for (17.9a).
2. Stability of Invariant Sets Let A be an invariant set of the dynamical system {φ^t}_{t∈Γ} defined on (M, ρ). The set A is called stable if every neighborhood U of A contains another neighborhood U₁ ⊂ U of A such that φ^t(U₁) ⊂ U holds
for all t > 0. The set A, which is invariant under {φ^t}, is called asymptotically stable if it is stable and the following relation is satisfied:
∃ Δ > 0 ∀ x ∈ M with dist(x, A) < Δ:  dist(φ^t(x), A) → 0 for t → +∞.  (17.10)
Here, dist(x, A) = inf_{y∈A} ρ(x, y).
3. Compact Sets Let (M, ρ) be a metric space. A system {U_i}_{i∈I} of open sets is called an open covering of M if every point of M belongs to at least one U_i. The metric space (M, ρ) is called compact if from every open covering {U_i}_{i∈I} of M it is possible to choose finitely many sets U_{i₁}, ..., U_{i_r} such that M = U_{i₁} ∪ ⋯ ∪ U_{i_r} holds. The set K ⊂ M is called compact if it is compact as a subspace.
4. Attractor, Domain of Attraction Let {φ^t}_{t∈Γ} be a dynamical system on (M, ρ) and Λ an invariant set for {φ^t}. Then W(Λ) = {x ∈ M: ω(x) ⊂ Λ} is called the domain of attraction of Λ. A compact set Λ ⊂ M is called an attractor of {φ^t}_{t∈Γ} on M if Λ is invariant under {φ^t} and there is an open neighborhood U of Λ such that ω(x) = Λ for almost every (in the sense of Lebesgue measure) x ∈ U.
■ Λ = γ((1, 0)) is an attractor of the flow of (17.9a). Here W(Λ) = R² \ {(0, 0)}. For some dynamical systems a more general notion of attractor makes sense. There are invariant sets Λ which have, in every neighborhood of Λ, periodic orbits that are not attracted by Λ, e.g., the Feigenbaum attractor. The set Λ may then not be characterized by a single limit set ω(x). A compact set Λ is called an attractor in the sense of Milnor of the dynamical system {φ^t}_{t∈Γ} on M if Λ is invariant under {φ^t} and the domain of attraction of Λ contains a set with positive Lebesgue measure.
17.1.2 Qualitative Theory of Ordinary Differential Equations
17.1.2.1 Existence of Flows, Phase Space Structure
1. Extensibility of Solutions Besides the differential equation (17.1), which is called autonomous, there are differential equations whose right-hand side depends explicitly on the time; they are called non-autonomous:
ẋ = f(t, x).  (17.11)
Let f: R × M → R^n be a C^r-mapping with M ⊂ R^n. By the new variable x_{n+1} := t, (17.11) can be interpreted as the autonomous differential equation ẋ = f(x_{n+1}, x), ẋ_{n+1} = 1. The solution of (17.11) starting from x₀ at time t₀ is denoted by φ(·, t₀, x₀). In order to show the global existence of the solutions, and with this the existence of the flow of (17.1), the following theorems are useful.
1. Criterion of Wintner and Conti If M = R^n in (17.1) and there exists a continuous function
ω: [0, +∞) → [1, +∞) such that ||f(x)|| ≤ ω(||x||) for all x ∈ R^n and ∫₀^{+∞} dr/ω(r) = +∞ holds, then
every solution of (17.1) can be extended onto the whole of R₊. For example, the following functions satisfy the criterion of Wintner and Conti: ω(r) = Cr + 1 or ω(r) = Cr|ln r| + 1, where C > 0 is a constant.
2. Extension Principle If a solution of (17.1) stays bounded as time increases, then it can be extended to the whole of R₊.
Assumption: In the following, the existence of the flow {φ^t}_{t∈R} of (17.1) is always assumed.
2. Phase Portrait a) If φ(t) is a solution of (17.1), then the function φ(t + c) with an arbitrary constant c is also a solution.
b) Two arbitrary orbits of (17.1) have no common point or they coincide. Hence, the phase space of (17.1) is decomposed into disjoint orbits. The decomposition of the phase space into disjoint orbits is called a phase portrait.
c) Every orbit different from a steady state is a regular smooth curve, which can be closed or not closed.
3. Liouville's Theorem Let {φ^t}_{t∈R} be the flow of (17.1), D ⊂ M ⊂ R^n be an arbitrary bounded and measurable set, D_t := φ^t(D) and V_t := vol(D_t) be the n-dimensional volume of D_t (Fig. 17.2a). Then the relation
dV_t/dt = ∫_{D_t} div f(x) dx  (17.12)
holds for arbitrary t ∈ R. For n = 3, Liouville's theorem states:
Figure 17.2
Corollary: If div f(x) < 0 in M holds for (17.1), then the flow of (17.1) is volume contracting. If div f(x) ≡ 0 in M holds, then the flow of (17.1) is volume preserving.
■ A: For the Lorenz system (17.2), div f(x, y, z) ≡ −(σ + 1 + b). Since σ > 0 and b > 0, div f(x, y, z) <
0 holds. With Liouville's theorem,
dV_t/dt = ∫∫∫_{D_t} −(σ + 1 + b) dx₁ dx₂ dx₃ = −(σ + 1 + b) V_t
obviously holds for any arbitrary bounded and measurable set D ⊂ R³. The solution of the linear differential equation V̇_t = −(σ + 1 + b) V_t is V_t = V₀ e^{−(σ+1+b)t}, so that V_t → 0 follows for t → +∞.
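The Lorenz flow (17.2) and the constant divergence just used can be checked numerically. The sketch below integrates (17.2) with scipy; σ = 10, r = 28, b = 8/3 are common illustrative parameter values, not values prescribed by the text.

```python
# Sketch: the Lorenz flow (17.2) and its constant divergence -(sigma + 1 + b).
import numpy as np
from scipy.integrate import solve_ivp

sigma, r, b = 10.0, 28.0, 8.0 / 3.0      # illustrative parameter values

def lorenz(t, u):
    x, y, z = u
    return [sigma * (y - x), r * x - y - x * z, x * y - b * z]

sol = solve_ivp(lorenz, (0.0, 20.0), [1.0, 1.0, 1.0], rtol=1e-8)

div_f = -(sigma + 1 + b)                 # div f is constant for (17.2)
print("div f =", div_f, "  final state:", sol.y[:, -1])
```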
■ B: Let G ⊂ R^n × R^n be an open subset and H: G → R a C²-function. Then
ẋ_i = ∂H/∂y_i (x, y),  ẏ_i = −∂H/∂x_i (x, y)  (i = 1, 2, ..., n)
is called a Hamiltonian differential equation. The function H is called the Hamiltonian of the system. If f denotes the right-hand side of this differential equation, then obviously
div f(x, y) = Σ_{i=1}^{n} [ ∂²H/(∂x_i ∂y_i)(x, y) − ∂²H/(∂y_i ∂x_i)(x, y) ] ≡ 0.
Hence, Hamiltonian differential equations are volume preserving.
17.1.2.2 Linear Differential Equations
1. General Statements
Let A(t) = [a_{ij}(t)]_{i,j=1}^{n} be a matrix function on R, where every component a_{ij}: R → R is a continuous function, and let b: R → R^n be a continuous vector function on R. Then
ẋ = A(t)x + b(t)  (17.13a)
is called an inhomogeneous linear first-order differential equation in R^n, and
ẋ = A(t)x  (17.13b)
is the corresponding homogeneous linear first-order differential equation.
1. Fundamental Theorem for Homogeneous Linear Differential Equations Every solution of (17.13a) exists on the whole of R. The set of all solutions of (17.13b) forms an n-dimensional vector subspace L_H of the C¹-smooth vector functions over R.
2. Fundamental Theorem for Inhomogeneous Linear Differential Equations The set L_I of all solutions of (17.13a) is an n-dimensional affine vector subspace of the C¹-smooth vector functions over R of the form L_I = φ₀ + L_H, where φ₀ is an arbitrary solution of (17.13a). Let φ₁, ..., φ_n be arbitrary solutions of (17.13b) and Φ = [φ₁, ..., φ_n] the corresponding solution matrix. Then Φ satisfies the matrix differential equation Ż(t) = A(t)Z(t) on R, where Z ∈ R^{n×n}. If the solutions φ₁, ..., φ_n form a basis of L_H, then Φ = [φ₁, ..., φ_n] is called the fundamental matrix of (17.13b). W(t) = det Φ(t) is the Wronskian determinant with respect to the solution matrix Φ of (17.13b). The formula of Liouville states that:
Ẇ(t) = Sp A(t) · W(t)  (t ∈ R).  (17.13c)
For a solution matrix, either W(t) ≡ 0 on R or W(t) ≠ 0 for all t ∈ R. The system φ₁, ..., φ_n is a basis of L_H if and only if det[φ₁(t), ..., φ_n(t)] ≠ 0 for one t (and so for all t).
3. Theorem (Variation of Constants Formula) Let Φ be an arbitrary fundamental matrix of (17.13b). Then the solution φ of (17.13a) with initial point p at time t = τ can be represented in the form
φ(t) = Φ(t)Φ(τ)^{−1} p + ∫_τ^t Φ(t)Φ(s)^{−1} b(s) ds  (t ∈ R).  (17.13d)
2. Autonomous Linear Differential Equations Consider the differential equation
ẋ = Ax,  (17.14)
where A is a constant matrix of type (n, n). The operator norm (see also 12.5.1.1, p. 617) of a matrix A is given by ||A|| = max{||Ax||: x ∈ R^n, ||x|| ≤ 1}, where for the vectors of R^n the Euclidean norm is again considered. Let A and B be two arbitrary matrices of type (n, n). Then
a) ||A + B|| ≤ ||A|| + ||B||;   b) ||λA|| = |λ| ||A|| (λ ∈ R);   c) ||Ax|| ≤ ||A|| ||x|| (x ∈ R^n);
d) ||AB|| ≤ ||A|| ||B||;   e) ||A|| = √λ_max, where λ_max is the greatest eigenvalue of AᵀA.
The fundamental matrix of (17.14) with initial value E_n at time t = 0 is the matrix exponential function
e^{At} = E_n + At + A²t²/2! + A³t³/3! + ⋯  (17.15)
with the following properties:
a) the series of e^{At} is uniformly convergent with respect to t on an arbitrary compact time interval and absolutely convergent for every fixed t;
b) ||e^{At}|| ≤ e^{||A|| t} (t ≥ 0);
c) d/dt (e^{At}) = (e^{At})' = A e^{At} = e^{At} A (t ∈ R);
d) e^{(t+s)A} = e^{tA} e^{sA} (s, t ∈ R);
e) e^{At} is regular for all t and (e^{At})^{−1} = e^{−At};
f) if A and B are commutative matrices of type (n, n), i.e., AB = BA holds, then B e^A = e^A B and e^{A+B} = e^A e^B;
g) if A and B are matrices of type (n, n) and B is regular, then e^{BAB^{−1}} = B e^A B^{−1}.
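The matrix exponential (17.15) is available as scipy.linalg.expm; the sketch checks property d) for an arbitrarily chosen matrix and evaluates the solution of ẋ = Ax.

```python
# Sketch: matrix exponential as fundamental matrix of x' = Ax, cf. (17.15).
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])            # illustrative constant matrix
t, s = 0.7, 0.4

# property d): e^{(t+s)A} = e^{tA} e^{sA}
lhs = expm((t + s) * A)
rhs = expm(t * A) @ expm(s * A)
print(np.allclose(lhs, rhs))            # True

# solution of x' = Ax with x(0) = x0 is x(t) = e^{At} x0
x0 = np.array([1.0, 0.0])
print(expm(t * A) @ x0)
```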
3. Linear Differential Equations with Periodic Coefficients We consider the homogeneous linear differential equation (17.13b), where A(t) = [a_{ij}(t)]_{i,j=1}^{n} is a T-periodic matrix function, i.e., a_{ij}(t) = a_{ij}(t + T) (∀ t ∈ R, i, j = 1, ..., n). In this case we call (17.13b) a linear T-periodic differential equation. Then every fundamental matrix Φ of (17.13b) can be written in the form Φ(t) = G(t) e^{tR}, where G(t) is a smooth, regular T-periodic matrix function and R is a constant matrix of type (n, n) (Floquet's theorem). Let Φ(t) be the fundamental matrix of the T-periodic differential equation (17.13b), normed at t = 0, i.e., Φ(0) = E_n, and let Φ(t) = G(t) e^{tR} be a representation of it according to Floquet's theorem. The matrix Φ(T) = e^{TR} is called the monodromy matrix of (17.13b); the eigenvalues ρ_j of Φ(T) are the multipliers of (17.13b). A number ρ ∈ C is a multiplier of (17.13b) if and only if there exists a solution φ ≢ 0 of (17.13b) such that φ(t + T) = ρ φ(t) (t ∈ R) holds.
17.1.2.3 Stability Theory
1. Lyapunov Stability and Orbital Stability Consider the non-autonomous differential equation (17.11). The solution φ(t, t₀, x₀) of (17.11) is said to be stable in the sense of Lyapunov if for every ε > 0 and every t₁ ≥ t₀ there exists a δ = δ(ε, t₁) > 0 such that every solution φ(t, t₁, x₁) with ||x₁ − φ(t₁, t₀, x₀)|| < δ satisfies ||φ(t, t₁, x₁) − φ(t, t₀, x₀)|| < ε for all t ≥ t₁. The solution φ(t, t₀, x₀) is called asymptotically stable in the sense of Lyapunov if it is stable and, in addition, ||φ(t, t₁, x₁) − φ(t, t₀, x₀)|| → 0 for t → +∞ whenever ||x₁ − φ(t₁, t₀, x₀)|| is sufficiently small.
For the autonomous differential equation (17.1) there are other important notions of stability besides Lyapunov stability. The solution φ(t, x₀) of (17.1) is called orbitally stable (asymptotically orbitally stable) if the orbit γ(x₀) = {φ(t, x₀), t ∈ R} is stable (asymptotically stable) as an invariant set. A solution of (17.1) which represents an equilibrium point is Lyapunov stable exactly if it is orbitally stable. The two types of stability can differ for periodic solutions of (17.1).
■ Let a flow be given in R³ whose invariant set is the torus T². Locally, let the flow be described in a rectangular coordinate system by θ̇₁ = 0, θ̇₂ = f₂(θ₁), where f₂: R → R is a 2π-periodic smooth function which is not constant on any neighborhood of any θ₁ ∈ R. An arbitrary solution with the initial conditions (θ₁(0), θ₂(0)) can be given on the torus by θ₁(t) ≡ θ₁(0), θ₂(t) = θ₂(0) + f₂(θ₁(0)) t (t ∈ R). From this representation it can be seen that every solution is orbitally stable but not Lyapunov stable (Fig. 17.2b).
2. Asymptotic Stability Theorem of Lyapunov A scalar-valued function V is called positive definite in a neighborhood U of a point p ∈ M ⊂ R^n if
1. V: U ⊂ M → R is continuous,
2. V(x) > 0 for all x ∈ U \ {p} and V(p) = 0.
Let U ⊂ M be an open subset and V: U → R a continuous function. The function V is called a Lyapunov function of (17.1) in U if V(φ(t)) does not increase along any solution as long as φ(t) ∈ U holds. Let V: U → R be a Lyapunov function of (17.1) and let V be positive definite in a neighborhood U of the equilibrium point p. Then p is stable. If the condition V(φ(t, x₀)) = constant (t ≥ 0) always yields φ(t, x₀) ≡ p for
a solution φ of (17.1) with φ(t, x₀) ∈ U (t ≥ 0), i.e., if the Lyapunov function is constant along a complete trajectory only when this trajectory is an equilibrium point, then the equilibrium point p is also asymptotically stable.
■ The point (0, 0) is a steady state of the planar differential equation ẋ = y, ẏ = −x − x²y. The function V(x, y) = x² + y² is positive definite in every neighborhood of (0, 0), and along an arbitrary solution its derivative satisfies d/dt V(x(t), y(t)) = −2x(t)²y(t)² < 0 for x(t)y(t) ≠ 0. Hence, (0, 0) is asymptotically stable.
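The decrease of V(x, y) = x² + y² along solutions of ẋ = y, ẏ = −x − x²y can be confirmed numerically; a simple check using scipy for the integration:

```python
# Sketch: V = x^2 + y^2 shrinks along solutions of x' = y, y' = -x - x^2*y.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, u):
    x, y = u
    return [y, -x - x * x * y]

sol = solve_ivp(rhs, (0.0, 50.0), [0.5, 0.5], max_step=0.01)
V = sol.y[0]**2 + sol.y[1]**2
print("V(0) =", V[0], "  V(T) =", V[-1])   # V(T) is much smaller than V(0)
```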
3. Classification and Stability of Steady States Let x₀ be an equilibrium point of (17.1). In the neighborhood of x₀ the local behavior of the orbits of (17.1) can be described, under certain assumptions, by the variational equation ẏ = Df(x₀)y, where Df(x₀) is the Jacobian matrix of f at x₀. If Df(x₀) does not have an eigenvalue λ_i with Re λ_i = 0, then the equilibrium point x₀ is called hyperbolic. The hyperbolic equilibrium point x₀ is of type (m, k) if Df(x₀) has exactly m eigenvalues with negative real parts and k = n − m eigenvalues with positive real parts. The hyperbolic equilibrium point of type (m, k) is called a sink if m = n, a source if k = n, and a saddle point if m ≠ 0 and k ≠ 0 (Fig. 17.3). A sink is asymptotically stable; sources and saddles are unstable (theorem on stability in the first approximation). Within the three topological basic types of hyperbolic equilibrium points (sink, source, saddle point) further algebraic distinctions can be made. A sink (source) is called a stable node (unstable node) if every eigenvalue of the Jacobian matrix is real, and a stable focus (unstable focus) if there are eigenvalues with non-vanishing imaginary parts. For n = 3, we get a classification of saddle points into saddle nodes and saddle foci.
Figure 17.3 (types of hyperbolic equilibrium points: sink, source, saddle point)
4. Stability of Periodic Orbits Let φ(t, x₀) be a T-periodic solution of (17.1) and γ(x₀) = {φ(t, x₀), t ∈ [0, T]} its orbit. Under certain assumptions, the phase portrait in a neighborhood of γ(x₀) can be described by the variational equation ẏ = Df(φ(t, x₀))y. Since A(t) = Df(φ(t, x₀)) is a T-periodic continuous matrix function of type (n, n), it follows from the Floquet theorem (see p. 801) that the fundamental matrix Φ_{x₀}(t) of the variational equation can be written in the form Φ_{x₀}(t) = G(t)e^{Rt}, where G is a T-periodic regular smooth matrix function with G(0) = E_n, and R represents a constant matrix of type (n, n) which is not uniquely given. The matrix Φ_{x₀}(T) = e^{RT} is called the monodromy matrix of the periodic orbit γ(x₀), and the eigenvalues ρ₁, ..., ρ_n of e^{RT} are called multipliers of the periodic orbit γ(x₀). If the orbit γ(x₀) is represented by another solution φ(t, x₁), i.e., if γ(x₀) = γ(x₁), then the multipliers of γ(x₀) and γ(x₁) coincide. One of the multipliers of a periodic orbit is always equal to one (Andronov–Witt theorem). Let ρ₁, ..., ρ_{n−1}, ρ_n = 1 be the multipliers of the periodic orbit γ(x₀) and let Φ_{x₀}(T) be the monodromy matrix of γ(x₀). Then
ρ₁ ρ₂ ⋯ ρ_n = det Φ_{x₀}(T) = exp( ∫₀^T div f(φ(t, x₀)) dt ).  (17.17)
Hence, if n = 2, then ρ₂ = 1 and ρ₁ = exp( ∫₀^T div f(φ(t, x₀)) dt ).
■ Let φ(t, (1, 0)) = (cos t, sin t) be a 2π-periodic solution of (17.9a). The matrix A(t) of the variational equation with respect to this solution is
A(t) = ( −2 cos²t   −(1 + sin 2t) ;  1 − sin 2t   −2 sin²t ).
The fundamental matrix Φ_{(1,0)}(t) normed at t = 0 is given by
Φ_{(1,0)}(t) = ( e^{−2t} cos t   −sin t ;  e^{−2t} sin t   cos t ) = ( cos t  −sin t ;  sin t  cos t ) ( e^{−2t}  0 ;  0  1 ),
where the last product is a Floquet representation of Φ_{(1,0)}(t). Thus, ρ₁ = e^{−4π} and ρ₂ = 1. The multipliers can also be determined without the Floquet representation: for system (17.9a), div f(x, y) = 2 − 4x² − 4y² holds, and hence div f(cos t, sin t) ≡ −2. According to the formula above, ρ₁ = exp( ∫₀^{2π} (−2) dt ) = e^{−4π}.
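The same multipliers can be obtained purely numerically: integrate the variational equation ẏ = A(t)y over one period with the identity matrix as initial value and take the eigenvalues of the result. A sketch using scipy:

```python
# Sketch: monodromy matrix of the periodic orbit (cos t, sin t) of (17.9a).
import numpy as np
from scipy.integrate import solve_ivp

def A(t):
    x, y = np.cos(t), np.sin(t)
    return np.array([[1 - 3*x*x - y*y, -1 - 2*x*y],
                     [1 - 2*x*y,        1 - x*x - 3*y*y]])

def var_eq(t, Y):                       # Y holds the 2x2 solution matrix, flattened
    return (A(t) @ Y.reshape(2, 2)).reshape(-1)

T = 2 * np.pi
sol = solve_ivp(var_eq, (0.0, T), np.eye(2).reshape(-1), rtol=1e-10, atol=1e-12)
M = sol.y[:, -1].reshape(2, 2)          # monodromy matrix Phi(T)
print(np.linalg.eigvals(M))             # approximately [e^{-4*pi}, 1]
```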
5. Classification of Periodic Orbits If the periodic orbit γ of (17.1) has no further multiplier on the complex unit circle besides ρ_n = 1, then γ is called hyperbolic. The hyperbolic periodic orbit is of type (m, k) if there are m multipliers inside and k = n − 1 − m multipliers outside the unit circle. If m > 0 and k > 0, then the periodic orbit of type (m, k) is called a saddle point. According to the Andronov–Witt theorem, a hyperbolic periodic orbit γ of (17.1) of type (n − 1, 0) is asymptotically stable. Hyperbolic periodic orbits of type (m, k) with k > 0 are unstable.
■ A: A periodic orbit γ = {φ(t), t ∈ [0, T]} in the plane with multipliers ρ₁ and ρ₂ = 1 is asymptotically stable if |ρ₁| < 1, i.e., if ∫₀^T div f(φ(t)) dt < 0.
■ B: If there is a further multiplier besides ρ_n = 1 on the complex unit circle, then the Andronov–Witt theorem cannot be applied. The information about the multipliers is then not sufficient for the stability analysis of the periodic orbit.
■ C: As an example, let the planar system ẋ = −y + x f(x² + y²), ẏ = x + y f(x² + y²) be given with a smooth function f: (0, +∞) → R which additionally satisfies the properties f(1) = f'(1) = 0 and f(r)(r − 1) < 0 for all r ≠ 1, r > 0. Obviously, φ(t) = (cos t, sin t) is a 2π-periodic solution of the system, and
Φ_{(1,0)}(t) = ( cos t  −sin t ;  sin t  cos t )
is the Floquet representation of the fundamental matrix. It follows that ρ₁ = ρ₂ = 1. The use of polar coordinates results in the system ṙ = r f(r²), θ̇ = 1. This representation yields that the periodic orbit γ((1, 0)) is asymptotically stable.
6. Properties of Limit Sets, Limit Cycles The α- and ω-limit sets defined in 17.1.1.2, p. 797, have the following properties with respect to the flow of the differential equation (17.1) with M ⊂ R^n. Let x ∈ M be an arbitrary point. Then:
a) The sets α(x) and ω(x) are closed.
b) If γ₊(x) (respectively γ₋(x)) is bounded, then ω(x) ≠ ∅ (respectively α(x) ≠ ∅) holds. Furthermore, ω(x) (respectively α(x)) is in this case invariant under the flow of (17.1) and connected.
■ If, for instance, γ₊(x) is not bounded, then ω(x) is not necessarily connected (Fig. 17.4a).
For a planar autonomous differential equation (17.1) (i.e., M ⊂ R²) the Poincaré–Bendixson theorem is valid.
Poincaré–Bendixson Theorem: Let φ(·, p) be a non-periodic solution of (17.1) for which γ₊(p) is bounded. If ω(p) contains no equilibrium point of (17.1), then ω(p) is a periodic orbit of (17.1).
Figure 17.4
Hence, for autonomous differential equations in the plane, attractors more complicated than an equilibrium point or a periodic orbit are not possible. A periodic orbit γ of (17.1) is called a limit cycle if there exists an x ∉ γ such that either γ ⊂ ω(x) or γ ⊂ α(x) holds. A limit cycle is called a stable limit cycle if there exists a neighborhood U of γ such that γ = ω(x) holds for all x ∈ U, and an unstable limit cycle if there exists a neighborhood U of γ such that γ = α(x) holds for all x ∈ U.
■ A: For the flow of (17.9a), the property γ = ω(p) for all p ≠ (0, 0) is valid for the periodic orbit γ = {(cos t, sin t), t ∈ [0, 2π)}. Hence, U = R² \ {(0, 0)} is a neighborhood of γ with this property, so γ is a stable limit cycle (Fig. 17.4b).
■ B: In contrast, for the linear differential equation ẋ = −y, ẏ = x, the orbit γ = {(cos t, sin t), t ∈ [0, 2π]} is a periodic orbit but not a limit cycle (Fig. 17.4c).
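The attraction towards the limit cycle of (17.9a) can be seen directly from the closed-form radial solution r(t, r₀); the sketch evaluates it for one initial radius inside and one outside the unit circle.

```python
# Sketch: r(t, r0) = [1 + (r0^-2 - 1) e^{-2t}]^{-1/2} tends to 1 for every r0 > 0.
import math

def r(t, r0):
    return (1.0 + (r0**-2 - 1.0) * math.exp(-2.0 * t)) ** -0.5

for r0 in (0.2, 3.0):                       # one orbit inside, one outside the cycle
    print([round(r(t, r0), 4) for t in (0.0, 1.0, 3.0, 6.0)])
# both rows approach 1.0, i.e. the periodic orbit r = 1 attracts them
```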
7. m-Dimensional Embedded Tori as Invariant Sets A differential equation (17.1) can have an m-dimensional torus as an invariant set. An m-dimensional torus T^m embedded into the phase space M ⊂ R^n is defined by a differentiable mapping g: R^m → R^n, which is supposed to be 2π-periodic in every coordinate θ_i as a function (θ₁, ..., θ_m) ↦ g(θ₁, ..., θ_m).
■ In simple cases, the motion of the system (17.1) on the torus can be described in a rectangular coordinate system by the differential equations θ̇_i = ω_i (i = 1, 2, ..., m). The solution of this system with initial values (θ₁(0), ..., θ_m(0)) at time t = 0 is θ_i(t) = ω_i t + θ_i(0) (i = 1, 2, ..., m; t ∈ R).
A continuous function f: R → R^n is called quasiperiodic if f has a representation of the form f(t) = g(ω₁t, ω₂t, ..., ω_m t), where g is a differentiable function as above, which is 2π-periodic in every component, and the frequencies ω_i are incommensurable, i.e., there are no integers n_i with Σ_{i=1}^{m} n_i² > 0 for which n₁ω₁ + ⋯ + n_mω_m = 0 holds.
17.1.2.4 Invariant Manifolds
1. Definition, Separatrix Surfaces
Let γ be a hyperbolic equilibrium point or a hyperbolic periodic orbit of (17.1). The stable manifold W^s(γ) (respectively unstable manifold W^u(γ)) of γ is the set of all points of the phase space such that the orbits passing through these points tend to γ as t → +∞ (respectively t → −∞):
W^s(γ) = {x ∈ M: ω(x) = γ} and W^u(γ) = {x ∈ M: α(x) = γ}.  (17.18)
Stable and unstable manifolds are also called separatrix surfaces.
■ In the plane, the differential equation
ẋ = −x,  ẏ = y + x²  (17.19a)
is considered. The solution of (17.19a) with initial state (x₀, y₀) at time t = 0 is explicitly given by
φ(t, x₀, y₀) = ( e^{−t} x₀,  e^t y₀ + (x₀²/3)(e^t − e^{−2t}) ).  (17.19b)
For the stable and unstable manifolds of the equilibrium point (0, 0) of (17.19a) we get:
W^s((0, 0)) = {(x, y): y = −x²/3},  W^u((0, 0)) = {(x, y): x = 0}.
Figure 17.5
Let M and N be two smooth surfaces in R^n, and let L_xM and L_xN be the corresponding tangent planes to M and N through x. The surfaces M and N are transversal to each other if for all x ∈ M ∩ N the following relation holds: dim L_xM + dim L_xN − n = dim(L_xM ∩ L_xN).
■ For the intersection represented in Fig. 17.5b we have dim L_xM = 2, dim L_xN = 1 and dim(L_xM ∩ L_xN) = 0. Hence, the intersection represented in Fig. 17.5b is transversal.
2. Theorem of Hadamard and Perron Important properties of separatrix surfaces are given by the Theorem of Hadamard and Perron: Let 7 be a hyperbolic equilibrium point or a hyperbolic periodic orbit of (17.1). a) The manifolds ! P ( y ) and W"(y) are generalized CT-surfaces,which locally look like CT-smoothelementary surfaces. Every orbit of (17.1),which does not tend t o y fort -+ +cc or t + -cc. respectively. leaves a sufficiently small neighborhood of 7 for t -+ +cx: or t -+ -m, respectively. b) If /; = 2 0 is an equilibrium point of type ( m ,k ) , then W s ( 2 0 )and W'(x0) are surfaces of dimension m and k. respectively. The surfaces W s ( e oand ) W"(z0)are tangent at 10 to the stable vector subspace
E' = {y E R": eDf(zo)ty--t 0 for t and the unstable vector subspace
+ +m}
of equation y = Df(zo)y
(17.20a)
E" = {y E R": eDf(to)ty-+ 0 for t -+ -m} of equation y = Df(zo)y> respectively. (17.20b) is a hyperbolic periodic orbit of type ( m ,k ) , then W ( 7 )and W"(y) are surfaces of dimension m + 1 and k + 1, respectively, and they intersect each other transversally along y (Fig. 17.6a). IA: To determine a local stable manifold through the steady state (0,O) of the differential equation 0)) has the following form: (17.19a) we suppose that W,S,,((O, Wly,sO,((O.O)) = {(xly):y = h(x),1x1 < A,h : (-A.A)+ R differentiable}. Let ( z ( t )y, ( t ) ) be a solution of (17.19a) lying in W,S,,((O,0)). Based on the invariance, for times s near c ) If
. differentiation and representation of k and j , from the system (17.19a) t o t we get y(s) = h ( z ( s ) ) By we get the initial value problem h'(x) (-x) = h ( z )+ x 2 , h(0) = 0 for the unknown function h ( s ) .If we are looking for the solution in the form of a series expansion h ( z ) =
sx2+ %x3 +. 2 3!
'
, where
h'(0) = 0
2 is taken under consideration, then we get by comparing the coefficients a2 = -- and ak = 0 for k 2 3. 3 W B: For the system (17.21) x = -y z ( l - 22 - y2), y = 2 y(1 - 22 - y2). i = az with a parameter cy > 0. the orbit Y = {(cost. sin t , 0). t E [0,27~]}is a periodic orbit with multipliers p 1 = e-4n. pz = eoz'i and p3 = 1. In cylindrical coordinates 5 = r cos 29, y = r sin 29, z = z , with initial values ( T O , 80.~0) a t time t = 0.
+
+
806
17. Dvnamical Svstems and Chaos
the solution of (17.21) has the representation ( r ( t ,r o ) ,d ( t , t90), solution of (17.9a) in polar coordinates. Consequently,
where r(t. T O ) and d ( t , do) is the
W ( ~ ) = { ( z . y . z ) : z = O )\ { ( O , O , O ) } and W U ( y ) = { ( z , y , z ) : z Z + y 2 = 1 } (cylinder). Both separatrix surfaces are shown in Fig. 17.6b.
Figure 17.6
3. Local Phase Portraits Near Steady States for n = 3 We consider now the differential equation (17.1) with the hyperbolic equilibrium point 0 for n = 3. Set A = D f(0) and let det[XE - A] = X3 + pXz + qX + T be the characteristic polynomial of A. LVith the notation 6 = p q - r and A = -p2q2 + 4p3r + 4q3 - 18pqr + 27r2 (discriminant of the characteristic
polynomial), the different equilibrium point types are characterized in Table 17.1.
4. Homoclinic and Heteroclinic Orbits Suppose 7 , and 7 2 are two hyperbolic equilibrium points or periodic orbits of (17.1). If the separatrix surfaces and intersect each other, then the intersection consists of complete orbits. For two equilibrium points or periodic orbits, the orbit y c Ws(yl) n WU(y2)is called heteroclznzc if # (Fig. 17.7a).and homoclznzcify = nyz. Homoclinic orbits ofequilibrium points are also called separatrzz loops (Fig. 17.7b). !&'"(my2)
Figure 17.7 W Consider the Lorenz system (17.2) with fixed parameters u = 10, b = 8/3 and with variable r. The equilibrium point (O,O,0) of (17.2) is a saddle for 1 < r < 13.926.. . , which is characterized by a twodimensional stable manifold W s and a one-dimensional unstable manifold W". If r = 13.926. . . , then there are two separatrix loops at ( O , O , 0); Le., as t -+ +xbranches of the unstable manifold return (over the stable manifold) to the origin (see [17.9]).
17.1.2.5 Poincard Mapping 1. Poincar6 Mapping for Autonomous Differential Equations Let -, = { y ( t . z o ) .t E [O,T]}be a T-periodic orbit of (17.1) and a (TI - 1)-dimensional smooth
hypersurface, which intersects the orbit y transversally in zo (Fig. 17.8a). Then, there is a neighborhood C of zo and a smooth function 7 : G + R such that T ( Q ) = T and 9(7(z)..) E for all z € U . The mapping P : c' n -+ with P ( z ) = p ( ~ ( zz) ) : is called the Poincare' mapping of y at 10.If the right-hand side f of (17.1) is r times continuously differentiable, then P is also r times continuously differentiable . The eigenvalues of the Jacobi matrix DP(z0) are the multipliers PI,.. . , pn-l of the periodic orbit. They do not depend on the choice of zo on /; and on the choice of the transversal surface.
Parameter domain b > 0:T > 0. q>o
A
A< A>
Type of equilibrium point stable node stable focus
Roots ofthe charac- Dimension of teristic polynomial W *and Wu ImXj = 0 dim W s = 3. dim W” = 0 Xj
ReX1,2< 0 <0
A3
808
17. Dynamical Systems and Chaos
X system (17.3) in hl = U can be connected with the Poincar6 mapping, which makes sense until the iterates stay in G. The periodic orbits of (17.1) correspond to the equilibrium points of this discrete system. and the stability of these equilibrium points corresponds to the stability of the periodic orbits of (17.1).
Figure 17.8
■ We consider for the system (17.9a) the transversal hyperplanes Σ = {(r, θ): r > 0, θ = θ₀}
in polar coordinate form. For these planes U = Σ can be chosen. Obviously, τ(r) = 2π (∀ r > 0) and so P(r) = [1 + (r^{−2} − 1) e^{−4π}]^{−1/2}, where the solution representation of (17.9a) is used. It is also valid that P(Σ) = Σ, P(1) = 1 and P'(1) = e^{−4π} < 1.
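The Poincaré map of (17.9a) on such a ray can also be obtained numerically by integrating the system for one revolution (return time 2π) and comparing with the closed-form P(r); a sketch using scipy:

```python
# Sketch: Poincare map of (17.9a) on the ray theta = 0, return time tau = 2*pi.
import numpy as np
from scipy.integrate import solve_ivp

def f(t, u):
    x, y = u
    s = 1.0 - x*x - y*y
    return [-y + x*s, x + y*s]

def P_analytic(r):
    return (1.0 + (r**-2 - 1.0) * np.exp(-4*np.pi)) ** -0.5

r0 = 0.5
sol = solve_ivp(f, (0.0, 2*np.pi), [r0, 0.0], rtol=1e-10, atol=1e-12)
r_numeric = np.hypot(*sol.y[:, -1])
print(r_numeric, P_analytic(r0))        # the two values agree closely
# P(1) = 1 and P'(1) = e^{-4*pi} < 1, so r = 1 is an attracting fixed point of P.
```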
+
2. Poincar6 Mapping for Non-Autonomous Time-Periodic Differential Equations A non-autonomous differential equation (17.11), whose right-hand side f has period T with respect to t. Le.. for which f(t + T , z ) = f ( t , z) ( V t E R , Vz E M) holds, is interpreted as an autonomous differential equation x = f ( s ,z), s = 1 with cylindrical phase space M x {s mod 7'). Let SO E {s mod T } be arbitrary. Then, C = M x {so} is a transversal plane (Fig. 17.8b). The PoincarC mapping is given globally as P : --t over zo H q(s0 + T ,$0, zO)rwhere ip(t, so, 20)is the solution of (17.11) with the initial state zo at time SO.
17.1.2.6 Topological Equivalence of Differential Equations 1. Definition Suppose. besides (17.1) with the corresponding flow { ( o t } t e R , that a further autonomous differential equation x = Y(XL (17.22) is given, where y : N + R" is a C'-mapping on the open set N c R". Of course, the flow { y ' } t e of ~ (17.22) should exist. The differential equations (17.1) and (17.22) (or their flows) are called topologically equivalent if there exists a homeomorphism h: M + N (Le., h is bijective, h and h-' are continuous), which transforms each orbit of (17.1) to an orbit of (17.22) preserving the orientation, but not necessarily preserving the parametrization. The systems (17.1) and (17.22) are topologically equivalent if there also exists a continuous mapping r : R x M + R , besides the homeomorphism h : M + N , such that 7 is strictly monotonically increasing at every fixed z E M, maps R onto R, with ~ ( 0z), = 0 for all z E M and asatisfies the relation h ( p t ( z ) )= $ T ( t + ) ( h ( zfor ) ) all z E M and t E R . In the case of topological equivalence, the equilibrium points of (17.1) go over into steady states of (17.22) and periodic orbits of (17.1) go over into periodic orbits of (17.22),where the periods are not necessarily coincident. Hence! if two systems (17.1) and (17.22) are topologically equivalent, then the topological structure of the decomposition of the phase spaces into orbits is the same. If two systems (17.1) and (17.22) are topologicallyequivalent with the homeomorphism h: M --t N and if h preserves the parametrization, Le., h(cpt(z))= $ ( h ( z ) )holds for every t , z , then (17.1) and (17.22) are called
topologically conjugate. Topological equivalence or conjugacy can also refer to subsets of the phase spaces A4 and N . Suppose, e.g.. (17.1) is defined on C; c M and (17.22) on Uz c N . We say that (17.1) on U1 is topologically equivalent to (17.22) on U2 if there exists a homeomorphism h : U1 + U, which transforms the intersection of the orbits of (17.1) with Ul into the intersection of the orbits of (17.22) with U2 preserving the orientation. H A: Homeomorphisms for (17.1) and (17.22) are mappings where, e.g., stretching and shrinking of the orbits are allowed; cutting and closing are not. The flows corresponding to phase portraits of Fig. 17.9a and Fig. 17.9b are topologically equivalent; the flows shown in Fig. 17.9a and Fig. 1 7 . 9 ~are not.
Figure 17.9
B: Consider the two lznear planar diflerential equations (see [17.11])
S = Az and x
=
Bz with A =
(1;1;) and B = (: -:).
The phase portraits of these systems
close to (0,O) are shown in Fig. 17.10a and Fig. 17.10b. The homeomorphism h : R2 T :R x
+ R2 with h ( z ) = Rx.where R
=
-
JT
('1 -'),1
and the function
1
R2 + R with T ( t ,x) = - t transform the orbits of the first system into the orbits of the second
2 one. Hence, the two systems are topologically equivalent.
Figure 17.10
2. Theorem of Grobman and Hartman Let p be a hyperbolic equilibrium point of (17.1). Then, in a neighborhood ofp the differential equation (17.1) is topologically equivalent to its linearization y = Df(p)y.
17.1.3 Discrete Dynamical Systems 17.1.3.1 Steady States, Periodic Orbits and Limit Sets 1. Types of Steady State Points Let zo be an equilibrium point of (17.3) with M c R". The local behavior of the iteration (17.3) close to zo is given. under certain assumptions, by the variational equation yt+l = Dy(zo)y,, t E r. If Dp(zo)has no eigenvalue Xi with lX,I = 1,then the steady state point xo,analogously to the differential equation case, is called hyperbolic. The hyperbolic equilibrium point 20 is of type (m,k ) if D f ( x 0 ) has exactly m eigenvalues inside and k = n -m eigenvalues outside the complex unit circle. The hyperbolic equilibrium point of type (rn,k ) is called a sink for m = n, a source for k = n and a saddle point for
I
810 17. Dvnamical Systems and Chaos m > 0 and k > 0. .4 sink is asymptotically stable; sources and saddles are unstable (theorem on stability in the first approximation for discrete systems). 2. Periodic Orbits Let y(z0) = k = 0,. . ' , T - 1) be a T-periodic orbit (T 2 2) of (17.3). If zo is a hyperbolic equilibrium point of the mapping pT,then ~ ( z ois) called hyperbolic. The matrix Dp'(z0) = py(pT-'(zO)).. .Dp(zo) is called the monodromy matriq the eigenvalues pi of DqT(z0)are the multzplzers of y(zo). If all multipliers pz of y(z0) have an absolute value less than one, then the periodic orbit 5(zo) is asymptotically stable.
3. Properties of w -Limit Set Every w-limit set w ( z ) of (17.3) with M = R" is closed, and w(p(z))= ~ ( 2 )If .the semiorbit ? + ( E ) is bounded. then w ( x ) # 0 and .(E) is invariant under p. Analogous properties are valid for a-limit sets. ISuppose the difference equation zt+l = -zt: t = 0, f l . . . ', is given on R with p(z) = - 2 . Obviously. the relations w(1) = (1, -1}, w(p(1)) = a(-1) = ~ ( 1 )and ) ~ ( w ( 1 )= ) w(1) are satisfied for z = 1. We mention that u(1) is not connected. is different from the case of differential equations.
17.1.3.2 Invariant Manifolds 1. Separatrix Surfaces Let .CO be an equilibrium point of (17.3). Then W s ( r O=) {y E M : @(y) -+ zo for i -+ +m} is called a stable manifold and W'(x0) = {y E M : @(y) -+ zo for i -+ -m} an unstable manifoldof 2 0 . Stable and unstable manifolds are also called separatriz surfaces.
2. Theorem of Hadamard and Perron The theorem of Hadamard and Perron describes the properties of separatrix surfaces for discrete systems in N c R": If zo is a hyperbolic equilibrium point of (17.3) of type ( m ,k ) , then WS(zo)and W"(z0)are generalized Cr-smooth surfaces of dimension m and k , respectively, which locally look like C'-smooth elementary surfaces. The orbits of (17.3),which do not tend to EO for i -+ +w or i i -m, leave a sufficiently small neighborhood of zo for i i +m or i -+ -w, respectively. The surfaces W'(z0) and W"(z0) are tangent at EO to the stable vector subspace E' = {y E R" : [Dp(xo)]*y -+ 0 for i -+ - m } of yztl = Dp(z0)yzand the unstable vector subspace EU = {y E R" : [Dp(zo)]'y -+ 0 for i -+ -m}, respectively. ILye consider the following time discrete dynamical system from the family of HCnon mappings: (17.23) z,+1 = 5; + yz - 2, yz+l = 2 , ; i E z. Both hyperbolic equilibrium points of (17.23) are PI = (4,d) and P2 = (-J"i,-4). Determination of the local stable and unstable manifolds of PI: The variable transformation 2 , = 4. yz = 7, t 4 transforms system (17.23) into the system = [f 2 4 c, vi, 9i+1 = with the equilibrium point (0,O).The eigenvectors al = (d + fi,1) and a2 = (4- fi,1) of the Jacobian matrix D f ( ( 0 , O ) )correspond to the eigenvalues = J"i fi,so E s = {taz, t E R } and E" = { t a l , t E R}. Supposing that W&((O>O)) = { ( t , ~77 )=: p((), 151 < A,p:(-A,A) + Rdifferentiahle}, aearelookingforPintheformofapowerseriesB(<) = (fi-8) [ + k < 2 + , ... From (&,vi) E M'zc((O,O)), ( & + l , v z t l ) E follows. This leads to an equation for the coefficients of the decomposition of 9,where k < 0. The theoretical shape of the stable and unstable manifolds is shown in Fig. 17.11a (see [17.12]).
+
+
+
*
woc((O,O))
3. Transverse Homoclinic Points The separatrix surfaces W s ( x Oand ) WU(zo)of a hyperbolic equilibrium point zo of (17.3) can intersect each other. If the intersection W S ( q n) W'(z0) is transversal. then every point y E Ws(zo)n W"(z,) is called a transversal homoclinic point.
Figure 17.11 Fact: I f y isa transversal homoclinicpoint, then theorbit {pi(y)}oftheinvertiblesystem (17.3) consists only of transversal homoclinic points (Fig. 17.11b).
17.1.3.3 Topological Conjugacy of Discrete Systems 1. Definition Suppose. besides (17.3), a further discrete system Zt-1 = 4.t) (17.24) with u : .I' + ,V is given; where :V c R" is an arbitrary set and $ is continuous (Mand .V can be general metric spaces). The discrete systems (17.3) and (17.24) (or the mappings p and q) are called topologically conjugate if there exists a homeomorphism h: M -+ N such that p = h-' o q o h. If (17.3) and (17.24) are topologically conjugated, then the homeomorphism h transforms the orbits of (17.3) into orbits of (17.24).
2. Theorem of Grobman and Hartman
If p in (17.3) is a diffeomorphism 9:R" + R": and 20 a hyperbolic equilibrium point of (17.3): then in a neighborhood of 20 (17.3) is topologically conjugate to the linearization yt+' = Dp(zO)yt.
17.1.4 Structural Stability (Robustness) 17.1.4.1 Structurally Stable Differential Equations 1. Definition
The differential equation (17.1),Le., the vector field f : M -+ R",is called structurally stableor robust, if small perturbations off result in topologically equivalent differential equations. The precise definition of robustness requires the notion of distance between two vector fields defined on M .We restrict our investigations to smooth vector fields on M , which have a common open connected absorbing set U c M.Let the boundary d C of C be a smooth ( n - 1)-dimensional hypersurface and suppose that it can be represented as dCT = {x E R": h ( z ) = 0}, where h : R" -+ R is a C'-function with gradh(z) # 0 in a neighborhood of ab'. Let X'(U) be the metric space of all smooth vector fields on M with the C' metrzc (17.25) p ( f , g ) =suPlIf(Z) -g(s)Il + s u p I l D f ( s ) - D s ( z ) / / . ZEC
XEC
(In the first term of the right-hand side 11 . 11 means the Euclidean vector norm, in the second one the operator norm.) The smooth vector fields f intersecting transversally the boundary dC in the direction U . Le., for which grad h ( ~ ) ~ f (#z 0.) (z E dU)and p'((z) E U (z E dU,t > 0) hold. form the set X:(C-) c X'(C-). The vector field f 6 X:(U) is called structurally stable if there is a 6 > 0 such that every other vector field g E Xi(C) with p ( f , g ) < 6 is topologically equivalent to f. H Consider the planar differential equation g (,, a )
x = -y + 2 ( a -xz
-
y",
y = 2 t y(a-22
- y2)
(17.26)
I
with parameter a , where Jal < 1. The differential equation g belongs, e.g., to X:(U) with U = {(z, y): x* + y* < 2) (Fig. 17.12a). Obviously, p(g (., 0), g ( . , a ) )= la/ (4 1). The vector field g (., 0) is structurally unstable: there exist vector fields arbitrarily close to g (,, 0), which are not topologically equivalent to g (., 0) (Fig. 17.12b,c). This is clear if we consider the polar coordinate representation i = -r3 a r . 8 = 1 of (17.26). For a > 0 there always exists a stable limit cycle r = fi.
+
+
Figure 17.12
2. Structurally Stable Systems in the Plane Suppose the planar differential equation (17.1) with f E X:(U) is structurally stable. Then: a) (17.1) has only a finite number of equilibrium points and periodic orbits. b) A11 d-limit sets ~ ( xwith ) z E of (17.1) consist of equilibrium points and periodic orbits only. Theorem of Andronov and Pontryagin: The planar differential equation (17.1) with f E Xi(U)
u
is structurally stable if and only if a) All equilibrium points and periodic orbits in are hyperbolic. b) There are no separatrices, i.e., no heteroclinic or homoclinic orbits, coming from a saddle and tending to a saddle point.
17.1.4.2 Structurally Stable Discrete Systems In the case of discrete systems (17.3), i.e.. of mappings p: M --t M ,let U C M c R" be a bounded, open. and connected set with a smooth boundary. Let Diff ' ( U ) be the metric space of all diffeomorphisms on 121with the corresponding U defined C'-metric. Suppose the set Diff i ( U ) C Diff(U) consists of the diffeomorphisms p, for which p(V) C U is valid. The mapping ~p E Diff:(U) (and the corresponding dynamical system (17.3)) is called structurally stable if there exists a 6 > 0 such that every other mapping w E Diff \(L') with p(p, +) < 6 is topologically conjugate to 9.
17.1.4.3 Generic Properties
1. Definition
A property of elements of a metric space $(M,\rho)$ is called generic (or typical) if the set $B$ of the elements of $M$ with this property forms a set of the second Baire category, i.e., if it can be represented as $B = \bigcap_{m=1,2,\dots} B_m$, where every set $B_m$ is open and dense in $M$.
■ A: The sets $\mathbb{R}$ and $I \subset \mathbb{R}$ (the irrational numbers) are sets of second Baire category, but $\mathbb{Q} \subset \mathbb{R}$ is not.
■ B: Density alone is not enough for "typical": $\mathbb{Q} \subset \mathbb{R}$ and $I \subset \mathbb{R}$ are both dense, but they cannot both be typical at the same time.
■ C: There is no connection between the Lebesgue measure $\lambda$ (see 12.9.1, 2., p. 634) of a subset of $\mathbb{R}$ and its Baire category. The set $B = \bigcap_{k=1,2,\dots} B_k$ with $B_k = \bigcup_{n \ge 0} \left(a_n - \frac{1}{k\,2^n},\ a_n + \frac{1}{k\,2^n}\right)$, where $\mathbb{Q} = \{a_n\}_{n=0}^{\infty}$ is an enumeration of the rational numbers, is a set of second Baire category (see [17.5], [17.10]).
On the other hand, since $B_k \supseteq B_{k+1}$ and $\lambda(B_k) \le \sum_{n\ge 0}\frac{2}{k\,2^n} = \frac{2}{k}\cdot\frac{1}{1-1/2} < +\infty$, also
$$\lambda(B) = \lim_{k\to\infty}\lambda(B_k) \le \lim_{k\to\infty}\frac{2}{k}\cdot\frac{1}{1-1/2} = 0$$
holds.
2. Generic Properties of Planar Systems, Hamiltonian Systems
For planar differential equations the set of all structurally stable systems from $X^1_+(\overline{U})$ is open and dense in $X^1_+(\overline{U})$. Hence, structurally stable systems are typical for the plane. It is also typical that every orbit of a planar system from $X^1_+(\overline{U})$ tends for increasing time to one of a finite number of equilibrium points and periodic orbits. Quasiperiodic orbits are not typical. Under certain assumptions, in the case of Hamiltonian systems, the quasiperiodic orbits of the differential equation are preserved under small perturbations. Hence, Hamiltonian systems are not typical systems.
■ Given in $\mathbb{R}^4$ a Hamiltonian system in action-angle variables $\dot{j}_1 = 0$, $\dot{j}_2 = 0$, $\dot{\theta}_1 = \frac{\partial H_0}{\partial j_1}$, $\dot{\theta}_2 = \frac{\partial H_0}{\partial j_2}$, where the Hamiltonian $H_0(j_1,j_2)$ is analytic. Obviously, this system has the solutions $j_1 = c_1$, $j_2 = c_2$, $\theta_1 = \omega_1 t + c_3$, $\theta_2 = \omega_2 t + c_4$ with constants $c_1,\dots,c_4$, where $\omega_1$ and $\omega_2$ can depend on $c_1$ and $c_2$. The relation $(j_1, j_2) = (c_1, c_2)$ defines an invariant torus $T^2$. Consider now the perturbed Hamiltonian $H_0(j_1,j_2) + \varepsilon H_1(j_1,j_2,\theta_1,\theta_2)$ instead of $H_0$, where $H_1$ is analytic and $\varepsilon > 0$ is a small parameter. The Kolmogorov-Arnold-Moser theorem (KAM theorem) says in this case that if $H_0$ is non-degenerate, i.e., $\det\left(\frac{\partial^2 H_0}{\partial j^2}\right) \neq 0$, then in the perturbed Hamiltonian system most of the invariant non-resonant tori do not vanish for sufficiently small $\varepsilon > 0$ but are only slightly deformed. "Most of the tori" means that the Lebesgue measure of the complement set with respect to the tori tends to zero as $\varepsilon$ tends to 0. A torus, defined as above and characterized by $\omega_1$ and $\omega_2$, is called non-resonant if there exists a constant $c > 0$ such that the inequality $\left|\frac{\omega_1}{\omega_2} - \frac{p}{q}\right| \ge \frac{c}{q^{2.5}}$ holds for all positive integers $p$ and $q$.
3. Non-Wandering Points, Morse-Smale Systems
Let $\{\varphi^t\}_{t\in\mathbb{R}}$ be a dynamical system on the $n$-dimensional compact orientable manifold $M$. The point $p \in M$ is called non-wandering with respect to $\{\varphi^t\}$ if
$$\forall\, T > 0 \ \ \exists\, t,\ |t| \ge T:\quad \varphi^t(U_p) \cap U_p \neq \emptyset \qquad (17.27)$$
holds for an arbitrary neighborhood $U_p \subset M$ of $p$.
■ Steady states and periodic orbits consist only of non-wandering points.
The set $\Omega(\varphi^t)$ of all non-wandering points of the dynamical system generated by (17.1) is closed, invariant under $\{\varphi^t\}$ and contains all periodic orbits and all $\omega$-limit sets of points from $M$. The dynamical system $\{\varphi^t\}_{t\in\mathbb{R}}$ on $M$ generated by a smooth vector field is called a Morse-Smale system if the following conditions are fulfilled:
1. The system has finitely many equilibrium points and periodic orbits, and they are all hyperbolic.
2. All stable and unstable manifolds of equilibrium points and periodic orbits are transversal to each other.
3. The set of all non-wandering points consists only of equilibrium points and periodic orbits.
Theorem of Palis and Smale: Morse-Smale systems are structurally stable.
The converse of the theorem of Palis and Smale is not true: in the case $n \ge 3$, there exist structurally stable systems with infinitely many periodic orbits. For $n \ge 3$, structurally stable systems are not typical.
17.2 Quantitative Description of Attractors
17.2.1 Probability Measures on Attractors
17.2.1.1 Invariant Measure
1. Definition, Measure Concentrated on the Attractor
Let $\{\varphi^t\}_{t\in\Gamma}$ be a dynamical system on $(M,\rho)$. Let $\mathcal{B}$ be the $\sigma$-algebra of Borel sets on $M$ (12.9.1, 2., p. 634) and let $\mu\colon \mathcal{B} \to [0,+\infty]$ be a measure on $\mathcal{B}$. Every mapping $\varphi^t$ is supposed to be $\mu$-measurable. The measure $\mu$ is called invariant under $\{\varphi^t\}_{t\in\Gamma}$ if $\mu(\varphi^{-t}(A)) = \mu(A)$ holds for all $A \in \mathcal{B}$ and $t > 0$. If the dynamical system $\{\varphi^t\}_{t\in\Gamma}$ is invertible, then the invariance of the measure can also be expressed as $\mu(\varphi^{t}(A)) = \mu(A)$ $(A \in \mathcal{B},\ t > 0)$. The measure $\mu$ is said to be concentrated on the Borel set $A \subset M$ if $\mu(M \setminus A) = 0$. If $\Lambda$ is also an attractor of $\{\varphi^t\}_{t\in\Gamma}$ and $\mu$ is an invariant measure under $\{\varphi^t\}$, then it is concentrated on $\Lambda$ if $\mu(B) = 0$ for every Borel set $B$ with $\Lambda \cap B = \emptyset$.
The support of a measure $\mu\colon \mathcal{B} \to [0,+\infty]$, denoted by $\operatorname{supp}\mu$, is the smallest closed subset of $M$ on which the measure $\mu$ is concentrated.
■ A: We consider the Bernoulli shift mapping on $M = [0,1]$:
$$x_{t+1} = 2 x_t \pmod 1. \qquad (17.28a)$$
In this case the map $\varphi\colon [0,1] \to [0,1]$ is defined as
$$\varphi(x) = \begin{cases} 2x, & 0 \le x \le 1/2, \\ 2x - 1, & 1/2 < x \le 1. \end{cases} \qquad (17.28b)$$
The definition yields that the Lebesgue measure is invariant under the Bernoulli shift mapping. If a number $x \in [0,1)$ is written in dyadic form $x = \sum_{n=1}^{\infty} a_n 2^{-n}$ $(a_n = 0$ or $1)$, then this representation can be identified with $x = .a_1 a_2 a_3\ldots$ The result of the operation $2x \pmod 1$ can be written as $.a_1' a_2'\ldots$ with $a_i' = a_{i+1}$, i.e., all digits $a_k$ are shifted to the left by one position and the first digit is omitted.
■ B: The mapping $\varphi\colon [0,1] \to [0,1]$ with
$$\varphi(y) = \begin{cases} 2y, & 0 \le y \le 1/2, \\ 2(1-y), & 1/2 < y \le 1 \end{cases} \qquad (17.29)$$
is called a tent mapping, and the Lebesgue measure is an invariant measure for it. The homeomorphism $h\colon [0,1] \to [0,1]$ with $y = \frac{2}{\pi}\arcsin\sqrt{x}$ transforms the mapping $\varphi$ from (17.5) into (17.29). Hence, in the case $a = 4$, (17.5) has an invariant measure which is absolutely continuous. For the density $p_1(y) \equiv 1$ of (17.29) and $p(x)$ of (17.5) at $a = 4$ it is valid that $p_1(y) = p(h^{-1}(y))\,|(h^{-1})'(y)|$. It follows directly that $p(x) = \dfrac{1}{\pi\sqrt{x(1-x)}}$.
■ C: If $x_0$ is a stable periodic point of period $T$ of the invertible discrete dynamical system $\{\varphi^k\}$, then $\mu = \frac{1}{T}\sum_{i=0}^{T-1}\delta_{\varphi^i(x_0)}$ is an invariant probability measure for $\{\varphi^k\}$. Here, $\delta_{x_0}$ is the Dirac measure concentrated at $x_0$.
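The absolutely continuous invariant density from example B can be checked numerically. The following Python sketch (sample sizes, bin count and seed are arbitrary illustrative choices, not part of the handbook) compares a histogram of the iterates of $x_{t+1} = 4x_t(1-x_t)$ with the density $p(x) = 1/(\pi\sqrt{x(1-x)})$.

```python
# A small numerical check (plain NumPy; not from the handbook) that the iterates of
# x_{t+1} = 4 x_t (1 - x_t) are distributed with density p(x) = 1/(pi*sqrt(x(1-x))).
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0.1, 0.9)
for _ in range(1000):                 # discard a transient
    x = 4.0 * x * (1.0 - x)

n_samples = 200_000
samples = np.empty(n_samples)
for i in range(n_samples):
    x = 4.0 * x * (1.0 - x)
    samples[i] = x

hist, edges = np.histogram(samples, bins=50, range=(0.0, 1.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
theory = 1.0 / (np.pi * np.sqrt(centers * (1.0 - centers)))
print(np.max(np.abs(hist - theory)))  # small, except near the endpoints where p(x) diverges
```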
2. Natural Measure
Let $\Lambda$ be an attractor of $\{\varphi^t\}_{t\in\Gamma}$ in $M$ with domain of attraction $W$. For an arbitrary Borel set $A$ and an arbitrary point $x_0 \in W$ define the number
$$\mu(A; x_0) = \lim_{T\to+\infty}\frac{t(T, A, x_0)}{T}. \qquad (17.30)$$
Here, $t(T, A, x_0)$ is that part of the time $T > 0$ for which the orbit portion $\{\varphi^t(x_0)\}_{t=0}^{T}$ lies in the set $A$. If $\mu(A; x_0) = \alpha$ for $\lambda$-a.e.* $x_0$ from $W$, then let $\mu(A) := \mu(A; x_0)$. Since almost all orbits with initial points $x_0 \in W$ tend to $\Lambda$ for $t \to +\infty$, $\mu$ is a probability measure concentrated on $\Lambda$.
* Here and in the following, a.e. is an abbreviation for "almost everywhere".
17.2.1.2 Elements of Ergodic Theory
1. Ergodic Dynamical Systems
A dynamical system $\{\varphi^t\}_{t\in\Gamma}$ on $(M,\rho)$ with invariant measure $\mu$ is called ergodic (we also say that the measure is ergodic) if either $\mu(A) = 0$ or $\mu(M \setminus A) = 0$ for every Borel set $A$ with $\varphi^{-t}(A) = A$ $(\forall\, t > 0)$. If $\{\varphi^k\}$ is a discrete dynamical system (17.3), $\varphi\colon M \to M$ is a homeomorphism and $M$ is a compact metric space, then there always exists an invariant ergodic measure.
■ A: Consider the rotation mapping of the circle $S^1$,
$$x_{t+1} = x_t + \Phi \pmod{2\pi}, \quad t = 0,1,\dots, \qquad (17.31)$$
with $\varphi\colon [0,2\pi) \to [0,2\pi)$ defined by $\varphi(x) = x + \Phi \pmod{2\pi}$. The Lebesgue measure is invariant under $\varphi$. If $\frac{\Phi}{2\pi}$ is irrational, then (17.31) is ergodic; if $\frac{\Phi}{2\pi}$ is rational, then (17.31) is not ergodic.
■ B: Dynamical systems with stable equilibrium points or stable periodic orbits as attractors are ergodic with respect to the natural measure.
Birkhoff Ergodic Theorem: Suppose that the dynamical system $\{\varphi^t\}_{t\in\Gamma}$ is ergodic with respect to the invariant probability measure $\mu$. Then, for every integrable function $h \in L^1(M,\mathcal{B},\mu)$, the time average along the positive semiorbit $\{\varphi^t(x_0)\}_{t\ge 0}$, i.e.,
$$\bar{h}(x_0) = \lim_{T\to+\infty}\frac{1}{T}\int_0^T h(\varphi^t(x_0))\,dt \quad\text{for flows and}\quad \bar{h}(x_0) = \lim_{n\to+\infty}\frac{1}{n}\sum_{i=0}^{n-1} h(\varphi^i(x_0)) \quad\text{for discrete systems},$$
coincides with the space average $\int_M h\,d\mu$ for $\mu$-a.e. points $x_0 \in M$.
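The Birkhoff theorem can be illustrated numerically for the rotation (17.31). The following Python sketch is only an illustration (the test function $h = \cos$, the rotation number and the orbit length are arbitrary choices): for an irrational rotation number the time average along one orbit approaches the space average taken with the normalized Lebesgue measure.

```python
# A hedged numerical sketch of the Birkhoff theorem for the circle rotation (17.31)
# with Phi/(2*pi) irrational; here h = cos, whose space average is 0.
import numpy as np

phi = 2.0 * np.pi * (np.sqrt(5.0) - 1.0) / 2.0   # irrational rotation number
n = 100_000
x0 = 0.3
orbit = (x0 + phi * np.arange(n)) % (2.0 * np.pi)

time_average = np.mean(np.cos(orbit))
space_average = 0.0                               # (1/(2*pi)) * integral of cos over [0, 2*pi]
print(time_average, space_average)                # the time average is within O(1/n) of 0
```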
2. Physical or SBR Measure
The statement of the ergodic theorem is useful only if the support of the measure $\mu$ is large. Let $\varphi\colon M \to M$ be a continuous mapping and $\mu\colon \mathcal{B} \to \mathbb{R}$ an invariant measure. We call $\mu$ (see [17.9]) an SBR measure (after Sinai, Bowen and Ruelle) if for any continuous function $h\colon M \to \mathbb{R}$ the set of all points $x_0 \in M$ for which
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=0}^{n-1} h(\varphi^i(x_0)) = \int_M h\,d\mu \qquad (17.32a)$$
holds has positive Lebesgue measure. For this it is sufficient that the sequence of measures
$$\mu_n := \frac{1}{n}\sum_{i=0}^{n-1}\delta_{\varphi^i(x)}, \qquad (17.32b)$$
where $\delta_x$ is the Dirac measure, converges weakly to $\mu$ for almost all $x \in M$, i.e., for every continuous function $h$,
$$\int_M h\,d\mu_n \to \int_M h\,d\mu \quad\text{as } n \to +\infty.$$
■ For some important attractors, such as the Hénon attractor, the existence of an SBR measure has been proven.
3. Mixing Dynamical Systems
A dynamical system $\{\varphi^t\}_{t\in\Gamma}$ on $(M,\rho)$ with invariant probability measure $\mu$ is called mixing if $\lim_{t\to+\infty}\mu(A \cap \varphi^{-t}(B)) = \mu(A)\,\mu(B)$ holds for arbitrary Borel sets $A, B \subset M$. For a mixing system, the measure of the set of all points which are at $t = 0$ in $A$ and under $\varphi^t$ for large $t$ in $B$ depends only on the product $\mu(A)\,\mu(B)$. A mixing system is also ergodic: let $\{\varphi^t\}$ be a mixing system and $A$ a Borel set with $\varphi^{-t}(A) = A$ $(t > 0)$. Then $\mu(A)^2 = \lim_{t\to\infty}\mu(\varphi^{-t}(A)\cap A) = \mu(A)$ holds, and so $\mu(A)$ is 0 or 1.
A flow $\{\varphi^t\}$ of (17.1) is mixing if and only if the relation
$$\lim_{t\to+\infty}\int_M [g(\varphi^t(x)) - \bar{g}]\,[h(x) - \bar{h}]\,d\mu = 0 \qquad (17.33)$$
holds for arbitrary square-integrable functions $g, h \in L^2(M,\mathcal{B},\mu)$. Here, $\bar{g}$ and $\bar{h}$ denote the space averages, which can be replaced by the time averages.
■ The modulo mapping (17.28a) is mixing. The rotation mapping (17.31) is not mixing with respect to the probability measure $\frac{\lambda}{2\pi}$.
4. Autocorrelation Function
Suppose the dynamical system $\{\varphi^t\}_{t\in\Gamma}$ on $M$ with invariant measure $\mu$ is ergodic. Let $h\colon M \to \mathbb{R}$ be an arbitrary continuous function, $\{\varphi^t(x)\}_{t\ge 0}$ an arbitrary semiorbit, and let the space average $\bar{h}$ be replaced by the time average, i.e., by $\lim_{T\to\infty}\frac{1}{T}\int_0^T h(\varphi^t(x))\,dt$ in the time-continuous case and by $\lim_{n\to\infty}\frac{1}{n}\sum_{t=0}^{n-1} h(\varphi^t(x))$ in the time-discrete case. With respect to $h$, the autocorrelation function along the semiorbit $\{\varphi^t(x)\}_{t\ge 0}$ at a time point $\tau \ge 0$ is defined for a flow by
$$C_h(\tau) = \lim_{T\to\infty}\frac{1}{T}\int_0^T h(\varphi^{t+\tau}(x))\,h(\varphi^t(x))\,dt - \bar{h}^2 \qquad (17.34a)$$
and for a discrete system by
$$C_h(\tau) = \lim_{n\to\infty}\frac{1}{n}\sum_{t=0}^{n-1} h(\varphi^{t+\tau}(x))\,h(\varphi^t(x)) - \bar{h}^2. \qquad (17.34b)$$
The autocorrelation function is defined also for negative times, where $C_h(\cdot)$ is considered as an even function on $\mathbb{R}$ or $\mathbb{Z}$. Periodic or quasiperiodic orbits lead to periodic or quasiperiodic behavior of $C_h$. A rapid decay of $C_h(\tau)$ for increasing $\tau$ and arbitrary test functions $h$ indicates chaotic behavior. If $C_h(\tau)$ decreases with exponential speed for increasing $\tau$, this indicates mixing behavior.
5. Power Spectrum
The Fourier transform of $C_h(\tau)$ is called the power spectrum (see also 15.3.1.2, 5., p. 724) and is denoted by $P_h(\omega)$. In the time-continuous case, under the assumption that $\int_{-\infty}^{+\infty}|C_h(\tau)|\,d\tau < \infty$, we have
$$P_h(\omega) = \int_{-\infty}^{+\infty} C_h(\tau)\,e^{-i\omega\tau}\,d\tau = 2\int_0^{+\infty} C_h(\tau)\cos(\omega\tau)\,d\tau. \qquad (17.35a)$$
In the time-discrete case, if $\sum_{k=-\infty}^{\infty}|C_h(k)| < +\infty$ holds, then
$$P_h(\omega) = C_h(0) + 2\sum_{k=1}^{\infty} C_h(k)\cos(\omega k). \qquad (17.35b)$$
If the absolute integrability or summability of $C_h(\cdot)$ does not hold, then, in the most important cases, $P_h$ can be considered as a distribution. The power spectrum corresponding to the periodic motions of a dynamical system is characterized by equidistant impulses. For quasiperiodic motions, impulses occur in the power spectrum at frequencies which are linear combinations with integer coefficients of the basic frequencies of the quasiperiodic motion. A "wide-band spectrum with singular peaks" can be considered as an indicator of chaotic behavior.
■ A: Let $\gamma$ be a $T$-periodic orbit of (17.1), $h$ a test function such that the time average of $h(\varphi(t))$ is zero, and suppose $h(\varphi(t))$ has the Fourier representation
$$h(\varphi(t)) = \sum_{k=-\infty}^{+\infty} a_k\,e^{ik\omega_0 t} \quad\text{with}\quad \omega_0 = \frac{2\pi}{T}.$$
Then we have, with $\delta$ as the $\delta$-distribution,
$$C_h(\tau) = \sum_{k=-\infty}^{+\infty}|a_k|^2\cos(k\omega_0\tau) \quad\text{and}\quad P_h(\omega) = 2\pi\sum_{k=-\infty}^{+\infty}|a_k|^2\,\delta(\omega - k\omega_0).$$
■ B: Suppose $\gamma$ is a quasiperiodic orbit of (17.1), $h$ is a test function such that the time average is zero along $\gamma$, and let $h(\varphi(t))$ have the representation (double Fourier series)
$$h(\varphi(t)) = \sum_{k_1=-\infty}^{+\infty}\sum_{k_2=-\infty}^{+\infty} a_{k_1 k_2}\,e^{i(k_1\omega_1 + k_2\omega_2)t}.$$
Then
$$C_h(\tau) = \sum_{k_1=-\infty}^{+\infty}\sum_{k_2=-\infty}^{+\infty} |a_{k_1 k_2}|^2\cos\big((k_1\omega_1 + k_2\omega_2)\tau\big).$$
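In practice $C_h$ and $P_h$ are estimated from a finite orbit. The following Python sketch is only an illustration (the test system, the logistic mapping at $a = 4$, the orbit length and the lag range are assumptions of this sketch, not taken from the handbook): it estimates (17.34b) along one orbit and evaluates the finite sum (17.35b); the autocorrelation decays at once and the spectrum is broad-band, as expected for a mixing system.

```python
# A rough numerical sketch of (17.34b) and (17.35b) for a chaotic test orbit
# of x -> 4x(1-x) (an assumed test system, not the handbook's example).
import numpy as np

n = 4096
x = np.empty(n)
x[0] = 0.123
for t in range(n - 1):
    x[t + 1] = 4.0 * x[t] * (1.0 - x[t])

h = x - x.mean()                       # test function with the time average removed
kmax = 64
C = np.array([np.mean(h[:n - k] * h[k:]) for k in range(kmax)])   # C_h(0), ..., C_h(kmax-1)

omega = np.linspace(0.0, np.pi, 200)
P = C[0] + 2.0 * np.sum(C[1:, None] * np.cos(np.outer(np.arange(1, kmax), omega)), axis=0)
print(C[:5])                           # immediate decay of the autocorrelation
print(P[:5])                           # broad-band spectrum without dominant sharp peaks
```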
17.2.2 Entropies
17.2.2.1 Topological Entropy
Let $(M,\rho)$ be a compact metric space and $\{\varphi^k\}_{k\in\Gamma}$ a continuous dynamical system with discrete time on $M$. A distance function $\rho_n$ on $M$ for arbitrary $n \in \mathbb{N}$ is defined by
$$\rho_n(x,y) := \max_{0 \le i \le n}\rho\big(\varphi^i(x),\varphi^i(y)\big). \qquad (17.36)$$
Furthermore, let $N(\varepsilon,\rho_n)$ be the largest number of points from $M$ which have in the metric $\rho_n$ a distance of at least $\varepsilon$ from each other. The topological entropy of the discrete dynamical system (17.3) or of the mapping $\varphi$ is $h(\varphi) = \lim_{\varepsilon\to 0}\limsup_{n\to\infty}\frac{1}{n}\ln N(\varepsilon,\rho_n)$. The topological entropy is a measure of the complexity of the mapping. Let $(M_1,\rho_1)$ be a further compact metric space and $\varphi_1\colon M_1 \to M_1$ a continuous mapping. If the mappings $\varphi$ and $\varphi_1$ are topologically conjugate, then their topological entropies coincide. In particular, the topological entropy does not depend on the metric. For arbitrary $n \in \mathbb{N}$, $h(\varphi^n) = n\,h(\varphi)$ holds. If $\varphi$ is a homeomorphism, then $h(\varphi^k) = |k|\,h(\varphi)$ for all $k \in \mathbb{Z}$. Based on the last property, we define the topological entropy of a flow $\varphi^t = \varphi(t,\cdot)$ of (17.1) on $M \subset \mathbb{R}^n$ as $h(\varphi^t) := h(\varphi^1)$.
17.2.2.2 Metric Entropy
Let $\{\varphi^t\}_{t\in\Gamma}$ be a dynamical system on $M$ with attractor $\Lambda$ and with an invariant probability measure $\mu$ concentrated on $\Lambda$. For an arbitrary $\varepsilon > 0$ consider the cubes $Q_1(\varepsilon),\dots,Q_{n(\varepsilon)}(\varepsilon)$ of the form $\{(x_1,\dots,x_n)\colon k_i\varepsilon \le x_i < (k_i+1)\varepsilon\ (i = 1,2,\dots,n)\}$ with $k_i \in \mathbb{Z}$, for which $\mu(Q_i) > 0$. For arbitrary $x$ from a $Q_i$ the semiorbit $\{\varphi^t(x)\}_{t\ge 0}$ is followed for increasing $t$. At time distances of $\tau > 0$ ($\tau = 1$ in discrete systems), the $N$ cubes in which the semiorbit is found one after the other are denoted by $i_1,\dots,i_N$. Let $E_{i_1,\dots,i_N}$ be the set of all starting points in the neighborhood of $\Lambda$ whose semiorbits at the times $t_i = i\tau$ $(i = 1,2,\dots,N)$ are in $Q_{i_1},\dots,Q_{i_N}$, respectively, and let $p(i_1,\dots,i_N) = \mu(E_{i_1,\dots,i_N})$ be the probability that a (typical) starting point is in $E_{i_1,\dots,i_N}$. The entropy gives the average increment of information gained by an experiment which shows which one among a finite number of disjoint events has really happened. In the above situation this is
$$H_N = -\sum_{(i_1,\dots,i_N)} p(i_1,\dots,i_N)\,\ln p(i_1,\dots,i_N), \qquad (17.37)$$
where the summation is over all symbol sequences $(i_1,\dots,i_N)$ of length $N$ which are realized by the orbits described above. The metric entropy or Kolmogorov-Sinai entropy $h_\mu$ of the attractor $\Lambda$ of $\{\varphi^t\}$ with respect to the invariant measure $\mu$ is the quantity $h_\mu = \lim_{\varepsilon\to 0}\lim_{N\to\infty}\frac{H_N}{\tau N}$. For discrete systems, the limit as $\varepsilon \to 0$ is omitted. For the topological entropy $h(\varphi)$ of $\varphi\colon \Lambda \to \Lambda$ the inequality $h_\mu \le h(\varphi)$ holds. In several cases $h(\varphi) = \sup\{h_\mu\colon \mu$ invariant probability measure on $\Lambda\}$.
■ A: Suppose $\Lambda = \{x_0\}$ is a stable equilibrium point of (17.1) as an attractor, with the natural measure $\mu$ concentrated on $x_0$. For these attractors $h_\mu = 0$.
■ B: For the shift mapping (17.28a), $h(\varphi) = h_\mu = \ln 2$, where $\mu$ is the invariant Lebesgue measure.
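The value $h_\mu = \ln 2$ for the shift mapping can be approached numerically from block entropies $H_N$. The following Python sketch is an illustration under an assumption stated in the comments: the binary itinerary over the partition $[0,1/2)$, $[1/2,1]$ is generated directly as independent fair digits, which is what the dyadic expansion of a Lebesgue-typical point looks like (iterating $2x \bmod 1$ in floating-point arithmetic would destroy the digits after about 50 steps).

```python
# A minimal sketch (not from the handbook) estimating the metric entropy of the shift
# mapping (17.28a) from block entropies H_N of its symbolic itinerary.  The itinerary
# is modeled by i.i.d. fair binary digits (the dyadic digits of a typical point),
# so H_N / N should approach h_mu = ln 2.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
n = 400_000
digits = rng.integers(0, 2, size=n)

for N in (1, 2, 4, 8):
    blocks = [tuple(digits[i:i + N]) for i in range(0, n - N, N)]
    p = np.array(list(Counter(blocks).values()), dtype=float)
    p /= p.sum()
    print(N, float(-np.sum(p * np.log(p)) / N))   # each value is close to ln 2 ~ 0.693
```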
17.2.3 Lyapunov Exponents
1. Singular Values of a Matrix
Let $L$ be an arbitrary matrix of type $(n,n)$. The singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n$ of $L$ are the non-negative square roots of the eigenvalues $\alpha_1 \ge \cdots \ge \alpha_n \ge 0$ of the positive semidefinite matrix $L^{\mathrm T}L$. The eigenvalues $\alpha_i$ are enumerated according to their multiplicity. The singular values can be interpreted geometrically: if $K_\varepsilon$ is a sphere with center at 0 and radius $\varepsilon > 0$, then the image $L(K_\varepsilon)$ is an ellipsoid with semi-axis lengths $\sigma_i\varepsilon$ $(i = 1,2,\dots,n)$ (Fig. 17.13a).
Figure 17.13
2. Definition of Lyapunov Exponents
Let $\{\varphi^t\}_{t\in\Gamma}$ be a smooth dynamical system on $M \subset \mathbb{R}^n$ which has an attractor $\Lambda$ with an invariant ergodic probability measure $\mu$ concentrated on $\Lambda$. Let $\sigma_1(t,x) \ge \cdots \ge \sigma_n(t,x)$ be the singular values of the Jacobian matrix $D\varphi^t(x)$ of $\varphi^t$ at the point $x$ for arbitrary $t \ge 0$ and $x \in M$. Then there exists a sequence of numbers $\lambda_1 \ge \cdots \ge \lambda_n$, the Lyapunov exponents, such that $\frac{1}{t}\ln\sigma_i(t,x) \to \lambda_i$ for $t \to +\infty$ $\mu$-a.e. in the sense of $L^1$. According to the theorem of Oseledec there exists $\mu$-a.e. a sequence of subspaces of $\mathbb{R}^n$
$$\mathbb{R}^n = E_x^1 \supseteq E_x^2 \supseteq \cdots \supseteq E_x^{s+1} = \{0\}, \qquad (17.38)$$
such that for $\mu$-a.e. $x$ the quantity $\frac{1}{t}\ln\|D\varphi^t(x)\,u\|$ tends to an element $\lambda_{i_j} \in \{\lambda_1,\dots,\lambda_n\}$ uniformly with respect to $u \in E_x^j \setminus E_x^{j+1}$.
3. Calculation of Lyapunov Exponents
Suppose $\sigma_i(t,x)$ are the semi-axis lengths of the ellipsoid obtained by deformation of the unit sphere with center at $x$ by $D\varphi^t(x)$. The formula $\lambda_i(x) = \limsup_{t\to\infty}\frac{1}{t}\ln\sigma_i(t,x)$ can be used to calculate the Lyapunov exponents if, additionally, a reorthonormalization method, such as Householder's, is used. The function $y(t,x,v) = D\varphi^t(x)\,v$ is the solution of the variational equation with initial value $v$ at $t = 0$ associated to the semiorbit $\gamma^+(x)$ of the flow $\{\varphi^t\}$. Actually, $\{\varphi^t\}_{t\in\mathbb{R}}$ is the flow of (17.1), so the variational equation is $\dot{y} = Df(\varphi^t(x))\,y$. The solution of this equation with initial value $v$ at time $t = 0$ can be represented as $y(t,x,v) = \Phi_x(t)\,v$, where $\Phi_x(t)$ is the normed fundamental matrix of the variational equation at $t = 0$, i.e., the solution of the matrix differential equation $\dot{Z} = Df(\varphi^t(x))\,Z$ with initial value $Z(0) = E_n$, according to the theorem about differentiability with respect to the initial state (see 17.1.1.1, 2., p. 795).
The number $\chi(x,v) = \limsup_{t\to\infty}\frac{1}{t}\ln\|D\varphi^t(x)\,v\|$ describes the behavior of the orbit $\gamma(x + \varepsilon v)$, $0 < \varepsilon \ll 1$, with initial point $x + \varepsilon v$ with respect to the initial orbit $\gamma(x)$ in the direction $v$. If $\chi(x,v) < 0$, then the orbits approach each other for increasing $t$ in the direction $v$. If, on the contrary, $\chi(x,v) > 0$, then the orbits move apart (Fig. 17.13b).
Let $\Lambda$ be the attractor of the dynamical system $\{\varphi^t\}_{t\in\Gamma}$ and $\mu$ the invariant ergodic measure concentrated on it. Then the sum of all Lyapunov exponents is, for $\mu$-a.e. $x \in \Lambda$,
$$\sum_{i=1}^{n}\lambda_i = \lim_{t\to\infty}\frac{1}{t}\int_0^t \operatorname{div} f(\varphi^s(x))\,ds \qquad (17.39a)$$
in the case of flows (17.1), and for a discrete system (17.3) it is
$$\sum_{i=1}^{n}\lambda_i = \lim_{N\to\infty}\frac{1}{N}\sum_{k=0}^{N-1}\ln\big|\det D\varphi(\varphi^k(x))\big|. \qquad (17.39b)$$
Hence, in dissipative systems $\sum_{i=1}^{n}\lambda_i < 0$ holds. Considering that one of the Lyapunov exponents is equal to zero if the attractor is not an equilibrium point, the calculation of Lyapunov exponents can be simplified (see [17.9]).
■ A: Let $x_0$ be an equilibrium point of the flow of (17.1) and let $\alpha_i$ be the eigenvalues of the Jacobian matrix at $x_0$. With the measure concentrated on $x_0$, the following holds for the Lyapunov exponents: $\lambda_i = \operatorname{Re}\alpha_i$ $(i = 1,2,\dots,n)$.
■ B: Let $\gamma(x_0) = \{\varphi^t(x_0),\ t \in [0,T]\}$ be a $T$-periodic orbit of (17.1) and let $\rho_i$ be the multipliers of $\gamma(x_0)$. With the measure concentrated on $\gamma(x_0)$ we have $\lambda_i = \frac{1}{T}\ln|\rho_i|$ for $i = 1,2,\dots,n$.
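For discrete systems the reorthonormalization idea can be realized with a QR factorization. The following Python sketch (QR in place of the Householder procedure named above; the Hénon parameters $a = 1.4$, $b = 0.3$, the assumed standard form of the Hénon mapping and the iteration counts are illustrative assumptions) estimates the two Lyapunov exponents of the Hénon mapping (17.6).

```python
# A hedged sketch: Lyapunov exponents of the Henon map, assumed here in the usual form
# x' = 1 - a*x^2 + y, y' = b*x, via repeated QR factorization of Jacobian products.
import numpy as np

a, b = 1.4, 0.3

def step(v):
    x, y = v
    return np.array([1.0 - a * x * x + y, b * x])

def jacobian(v):
    x, _ = v
    return np.array([[-2.0 * a * x, 1.0], [b, 0.0]])

v = np.array([0.1, 0.1])
for _ in range(1000):                 # discard a transient
    v = step(v)

Q = np.eye(2)
sums = np.zeros(2)
n = 100_000
for _ in range(n):
    Q, R = np.linalg.qr(jacobian(v) @ Q)
    sums += np.log(np.abs(np.diag(R)))
    v = step(v)
print(sums / n)                       # approximately [0.42, -1.62]; their sum is ln b
```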
4. Metric Entropy and Lyapunov Exponents
If $\{\varphi^t\}_{t\in\Gamma}$ is a dynamical system on $M \subset \mathbb{R}^n$ with attractor $\Lambda$ and an ergodic probability measure $\mu$ concentrated on $\Lambda$, then the inequality $h_\mu \le \sum_{\lambda_i>0}\lambda_i$ holds for the metric entropy $h_\mu$, where in the sum the Lyapunov exponents are repeated according to their multiplicity. The equality
$$h_\mu = \sum_{\lambda_i>0}\lambda_i \qquad \text{(Pesin's formula)} \qquad (17.40)$$
is not valid in general. If the measure $\mu$ is absolutely continuous with respect to the Lebesgue measure and $\varphi\colon M \to M$ is a $C^2$-diffeomorphism, then Pesin's formula is valid.
17.2.4 Dimensions
17.2.4.1 Metric Dimensions
1. Fractals
Attractors or other invariant sets of dynamical systems can be geometrically more complicated than a point, a line or a torus. Fractals, independently of dynamics, are sets which distinguish themselves by one or several characteristics such as fraying, porosity, complexity, and self-similarity. Since the usual notion of dimension used for smooth surfaces and curves cannot be applied to fractals, we must give a generalized definition of dimension. For more details see [17.2], [17.12].
■ The interval $G_0 = [0,1]$ is divided into three subintervals of the same length and the middle open third is removed, so we obtain the set $G_1 = [0,\frac{1}{3}] \cup [\frac{2}{3},1]$. Then from both subintervals of $G_1$ the open middle thirds are removed, which yields the set $G_2 = [0,\frac{1}{9}] \cup [\frac{2}{9},\frac{1}{3}] \cup [\frac{2}{3},\frac{7}{9}] \cup [\frac{8}{9},1]$. Continuing this procedure, $G_k$ is obtained from $G_{k-1}$ by removing the open middle thirds from the subintervals. So we get a sequence of sets $G_0 \supset G_1 \supset \cdots \supset G_n \supset \cdots$, where every $G_n$ consists of $2^n$ intervals of length $\frac{1}{3^n}$. The Cantor set $C$ is defined as the set of all points belonging to all $G_n$, i.e., $C = \bigcap_{n=1}^{\infty} G_n$. The set $C$ is compact, uncountable, its Lebesgue measure is zero, and it is perfect, i.e., $C$ is closed and every point is an accumulation point. The Cantor set is an example of a fractal.
3
4
p E ( A )= i n f ( 1 - T T : } over all finite coverings of A by spheres with radius T , 5 E . If E tends to zero, * 3 then we get the Lebesgue outer measure X(A) of A. If A is measurable, the outer measure is equal to the volume vol(A). S U D D O S e :li is the Euclidean mace R" or. more generallv. a seuarable metric mace with metric LJ and let'i c hi be a subset of it. 6 r arbitrary parankers d 0 and E > 0, the quantity
>
:A
c
u
B,, diamB, 5 E
(17.41a)
is determined. where B, C M are arbitrary subsets with diameter diamB, = sup p(z,y). 2,BEB.
The Hausdorfl outer measure of dzmenszon d of A is defined by Pd(A) = l & p , E ( A ) = suPPd,EV)
(17.41b)
E>O
and it can be either finite or infinite. The Hausdor$dzmenszondH(A) of the set A is then the (unique) critical value of the Hausdorff measure: +x,if pd(A) # 0 for all d 2 0. (17.41~) dH(A) = inf . { d 2 0: pd(A) = 0 ) .
[
Remark: The quantities p d + ( A ) can be defined with coverings of spheres with radius T , 5 E or, in the case of Rn,of cubes with side length 5 E . Important Properties of t h e Hausdorff Dimension: (HD1) d ~ ( 0 =) 0. (HD2) If .4C R", then 0 5 dH(A)5 n.
17.2 Quantitative Description of Attractors 821
(HD3) From A (HD4) If A =
c B. it follows that ~ H ( A 5) & ( B ) . m
U At, then ~ H ( A =) s uz p d ~ ( A , ) .
2=1
(HD5) If A is finite or countable, then &(A) = 0. (HD6) If 9.AI' + AI' is Lipschitz continuous, Le., there exists aconstant L > 0 such that p(p(z).&)) 5 Lp(z. y) Vz. y E -21, then d ~ ( p ( A )5) &(A). If the inverse mapping 9-l exists as well. and it is also Lipschitz continuous. then &(A) = d ~ ( p ( A ) ) . I For the set Q of all rational numbers ~ H ( Q=) 0 because of (HD5). The dimension of the Cantor In 2 set C is ~ H ( C= )- x 0.6309.. . . In 3 3. Box-Counting Dimension or Capacity Let A be a compact set of the metric space ( X , p ) and let N t ( A ) be the minimal number of sets of diameter 5 E , necessary to cover A. The quantity In :YE(A) (17.42a) dB(A) = limsup E + ~ Inf is called the upper box-counting dimension or upper capacity. the quantity In.VE(A) (17.42b) d B ( A )= lim inf 7 E+O In; iscalled the lo~erbox-countingdimensionorlower capacity (then dc) ofA. IfaB(A) = &(A) := ~ B ( A ) holds. then dB(A) is called the box-counting dimensionof A. In Rn the box-counting dimension can be considered also for bounded sets which are not closed. For a bounded set A C R". the number ArE(A) in the above definition can also be defined in the following way: Let R" be covered by a grid from n-dimensional cubes with side length E . Then, XL(A) can be the number of cubes of the grid having a non-empty intersecting A. Important Properties of the Box-Counting Dimension: ( B D l ) d ~ ( ' 4 )5 ~ B ( A always ) holds. (BD2) For m-dimensional surfaces F C R" holds ~ H ( F=)~ B ( F=) m.
(BD3) dB(.4) = d B ( 3 ) holds for the closure 2 of A, while often d*(A) < d ~ ( 2is) valid. (BD4) If 4 = U A,, then, in general, the formula ~ B ( A = ) sup ~B(A,)does not hold for the boxn
n
counting dimension. 1 1 1 ISuppose '4 = {O. 1,-, -, . . . } . Then d ~ ( . 4 )= 0 and d ~ ( ' 4 )= - . 2 3 2 If A is the set of all rational points in [O. 11, then because of BD2 and BD3 ~ B ( A=) 1 holds. On the other hand d H ( A ) = 0. 4. Self-similarity Several geometric figures, which are called self-similar, can be derived by the following procedure: .4n initial figure is replaced by a new one which is composed o f p copies of the original, any of them scaled linearly by a factor y > 1. All figures that are k times scalings of the initial figure in the k-th step are handled as the in the first step.
-
IA: Cantor set: p = 2. y = 3. IB: Koch curve: p = 4, y = 3. The first three steps are shown in Fig. 17.14. Figure 17.14
I
822
17. Dynamical Systems and Chaos
IC: Sierpinski gasket: p = 3, q = 2. The first three steps are shown in Fig. 17.15. (The white triangles are always removed.) Figure 17.15
ID: Sierpinski carpet: p = 8, q = 3. The first three steps are shown in Fig. 17.16. (The white squares are always removed.) For the sets in the examples A-D: In D dg = d H = 2 In 4
Figure 17.16
17.2.4.2 Dimensions Defined by Invariant Measures 1. Dimension of a Measure Let p be a probability measure in (M,p), concentrated on '1 If z E .Iis an arbitrary point and & ( x ) is a sphere with radius 6 and center at 2 , then (17.43a) denotes the upper and p(B6 (17.43b) In6 > d,(z) is called the denotes the lower pointwise dimension of p in z.If &(x) = a,(.) := d P ( x )then dimension of the measure p in 2 . Young Theorem 1: If the relation d,(z) = cy holds for p-a.e.', 5 E A, then a = d H ( p ) := inf {dH(X)}. (17.44) d,(x) = liminf 6+0
xc'1.@[x)=I
The quantity d ~ ( pis) called the Hausdorf dimension ofthe measure p . ISuppose M = R" and let .1 c R" be a compact sphere with Lebesgue measure A(.!) restriction
x of p to .2 is p.1 = -.
> 0. The
Then
p ( & ( z ) ) 6" and d ~ ( p=) n . N
2. Information Dimension Suppose, theattractor.!of{pt}tcr iscovered by cubes Q1(&),. . . Qn(€)(c)OfsidelengthEasin 17.2.2.2, p. 817. Let p be an invariant probability measure on A. The entropy of the covering Q1 ( E ) , . . . , &,(,I( E ) is nk)
H ( E )= - EP,(E) Inpi(€). with p i ( & )= ~ ( Q , ( E ) ) (i = 1,.. . , I L ( E ) ) .
(17.45)
2=1
W E ) , If the limit d l ( p ) = - lim exists, then this quantity has the property of a dimension and is called E+O lne the information dimension.
Young T h e o r e m 2: If the relation d,(z) = cy holds for p-a.e. z E 11,then (17.46) a = dH(p) = di(p). A: Let the measure p be concentrated at the equilibrium point zo of {@}.Since H E ( p )= -1 In 1 =
0 always holds for E > 0, so dr(p) = 0. B: Suppose the measure p is concentrated on the limit cycle of holds and so d ~ ( p=) 1.
I$}. For E > 0, H E ( p )= - lne
3. Correlation Dimension Let { Y , } Z ~ = ~ be a typical sequence of points of the attractor '1 c R" of { $ } t E rp, an invariant probability measure on '1 and let m E K be arbitrary. For the vectors z,:= (ut!.. . , yitm) let the distance be defined as dist(z,.x ) .- max llyz+s- y3tsllrwhere 11 . 11 is the Euclidean vector norm. If 0 denotes 05sjrn
'-
the Heaviside function O ( x ) =
,O: then the expression
{O:
1 Crn(E)= limsup-card{(z,,z,): r++s.v2 I
= limsup N-toc
'
10
(E
dist(x,.z,)
<E}
- dist(x,.z,))
(17.47a)
J 2 $,]=I
is called the correlatzon integral. The quantity lnCm(E) 1nE (if it exists) is the correlation dimension. dK
= lim
(1i.47b)
~
E+O
4. Generalized Dimension Let the attractor .1of {.$}tGron ,ti' with invariant probability measure p be covered by cubes with side length E as in 17.2.2.2. p. 817. For an arbitrary parameter q E R, q # 1: the sum n(0
H,(E) = -In
1-q
1 p , ( ~ ) ,where p , ( ~ = ) p(Qi(~)) ,=I
(17.48a)
is called the generalized entropy ofq-th order with respect to the covering Q ~ ( E .) ., . . QncE)(~) The Re'nyi dimension o f q - t h orderis (17.48b) if this limit exists. Special Cases of the RBnyi Dimension: a ) q = 0: do = dc(suppp).
b) q = l :
(17.49a)
dl:=limd,=dI(p).
(17.49b)
4-1
C)
q=2:
d2=
(17.49~)
dK.
5 . Lyapunov Dimension Let {p'}terbe a smooth dynamical system on ,ti' C R" with attractor A (or invariant set) and with the invariant ergodic probability measure p concentrated on '4. If AI 2 A2 2 ' ' 2 A, are the Lyapunov exponents with respect to p and if k is the greatest index for which
k ,=I
A, 2 0 and
ktl t=1
A,
< 0 hold, then
I
17. Dynamical Systems and Chaos
824
(17.50) is called the Lyapunov damensaon of the measure p. If
2A, 2 0. then d L ( p ) = n;if XI
%=I
< 0, then d t ( p ) = 0
Ledrappier Theorem: Let {pt} be a discrete system (17.3) on M C R" with a C2-function 9 and p , as above, an invariant ergodic probability measure concentrated on the attractor . I of {pt}. Then d ~ ( p5) d ~ ( pholds. )
IA: Suppose the attractor A c RZof a smooth dynamical system {I$} is covered by .VEsquares with side length E . Let 0 1 > 1 > 0 2 be the singular values of D p. Then for the dB-dimensional volume of the attractor mdw N A'E . €dB holds. Every square of side length E is transformed by 'p approximately into a parallelogram with side length U ~ and E ~ I EIf. the covering is made by rhombi with side length then X,,,,
U ~ E ,
Y
N -01 holds. From the relation NE~dW N Nn2E(~02)dB we get directly 02
(17.51) This heuristic calculation gives a hint of the origin of the formula for the Lyapunov dimension. IB: Suppose the HBnon system (17.6) is given with a = 1.4 and b = 0.3. The system (17.6) has an attractor '1 (Henon attractor) with a complicated structure for these parameter values. The numerically determined box-counting dimension is d ~ ( i 1 )N 1.26. It can be shown that there exists an SBR measure for the HBnon attractor '2. For the Lyapunov exponents A1 and A 2 the formula A1 + A 2 = In 1 det D p(z)1 = In b = In 0.3 N -1.204 holds. With the numerically determined value XI N 0.42 we get X2 = -1.62. So, 0.42 d ~ ( pN) 1 + - N 1.26. (17.52) 1.62
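The Kaplan-Yorke expression (17.50) used in the Hénon example above can be wrapped in a small helper. The following Python sketch is illustrative only; the function name is made up, and the exponent values in the second call are commonly quoted Lorenz-type numbers used here as an assumption, not taken from the handbook.

```python
# A small sketch implementing (17.50): the Lyapunov (Kaplan-Yorke) dimension
# computed from a list of Lyapunov exponents.
def lyapunov_dimension(exponents):
    lam = sorted(exponents, reverse=True)
    if lam[0] < 0.0:
        return 0.0                                  # all exponents negative
    partial = 0.0
    for k, value in enumerate(lam):
        if partial + value < 0.0:                   # k is the largest index with sum >= 0
            return k + partial / abs(value)
        partial += value
    return float(len(lam))                          # all partial sums are non-negative

print(lyapunov_dimension([0.42, -1.62]))            # ~1.26, the Henon value quoted above
print(lyapunov_dimension([0.9, 0.0, -14.6]))        # ~2.06 for Lorenz-type exponents (assumed values)
```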
17.2.4.3 Local Hausdorff Dimension According to Douady and Oesterl6 Let {ps'}tEr be a smooth dynamical system on M C R" and A a compact invariant set. Suppose that an arbitrary to 2 0 is fixed and let @ := ptO. Theorem of Douady a n d OesterlB: Let ul(z) 2 . . . 2 un(z) be the singular values of D @ ( s ) and let d E (0, n] be a number written as d = do s with do E (0, 1, . . . , n - 1) and s E [0,1]. If . . O d o ( z ) ~ & + , ( z ) ]< 1 holds, then &(A) < d. sup [o1(z)o2(z).
+
XE.2
Special Version for Differential Equations: Let {pt}tER be the flow of (17.1), .I be a compact invariant set and let ( Y ~ ( z2) ... 2 cx,(z) be the eigenvalues of the symmetrized Jacobian matria: 1 2 where do E (0,.
- [ D ~ ( Zt)D~f ( z ) ]at an arbitarary point z E A. If d E (0,n] is a number of the form d = do + s
..
~
n - l} and s E [0,1],and sup[cq(z) + ' . . + ado(z)+ sad,+l(Z)] < 0 holds, then XEA
dH('1) < d. The quantity 0. i f q ( z )< 0,
dDO(z) =
{ sup(& 0 5 d 5 n,
cyl(s)
+ . . + q d l ( 2 ) + (d - [d])a[d~+~(z) 2 0 ) otherwise, (17.53)
where z E .1 is arbitrary and [d] is the integer part of d , is called the Douady-OesterlC dimension at the point z. Under the assumptions of the Douady-OesterlB theorem for differential equations, d ~ ( ' 1 5) SUP dDo(z). ZE.1
17.2 Quantitative Description of Attractors 825
IThe Loren2 system (17.2), in the case of u = 10, b = 8/3, r = 28, has an attractor A (Lorenz attractor) with the numerically determined dimension &(A) = 2.06 (Fig. 17.17 is generated by Mathernatica). With the Douady-OesterlB theorem, for arbitrary > 1, u > 0 and r > 0 we get the estimation
b
~ H ( A 5) 3 - ___ u + b + l where
(17.54a)
17.2.4.4 Examples of Attractors IA: The horseshoe mapping p occurs in connection with Poincarh mappings containing the transversale intersections of stable and unstable manifolds. The unit square M = [0,1] x [0,1] is stretched linearly in one coordinate direction and contracted in the other direction. Finally, this rectangle is bent at the middle (Fig. 17.18). Repeating this procedure infinitely many times, we get a sequence of sets M 2 p(M) 2 ’ . ’ , for which
Figure 17.17
n m
A=
(17.55)
p k ( ~ )
k=O
of)
M
is a comnact set and an invariant set with respect to cp. A attracts all points of M. Apart from one point, il can be described locally as a product “line x Cantor set”.
q2(M)
Figure 17.18
IB: Let cy E (0,1/2) be a parameter and 12.I = [0,1]x [0,1]be the unit square. The mapping p: M M given by
+
(17.56a) is called thedissipative baker’s mapping. Two iterationsof the baker’smappingareshown in Fig. 17.19. The “flaky pastry structure” is recognizable. The set Y
1
n W
A=
(17.56b)
p k ( ~ )
k=O
l x
l x
lx
is invariant under p and all points from M are attracted by A. The value of the Hausdorff dimension is (17.56~)
Figure 17.19
For the dynamical system {pk} there exists an invariant measure p on M , which is different from the Lebesgue measure. .4t the points where the derivative exists, the Jacobian matrix is Dpk((z,y))=
(?d; :.),
Hence, the singular values are
u l ( k , (z, y))
= Z k , ~ ( k(z,, y)) =
ffk
and, consequently.
826
i7. Dvnamical Svstems and Chaos
the Lyapunov exponents are $\lambda_1 = \ln 2$, $\lambda_2 = \ln\alpha$ (with respect to the invariant measure $\mu$). For the Lyapunov dimension we get
$$d_L(\mu) = 1 + \frac{\ln 2}{-\ln\alpha} = d_H(\Lambda). \qquad (17.56d)$$
Pesin's formula for the metric entropy is valid here, i.e., $h_\mu = \sum_{\lambda_i>0}\lambda_i = \ln 2$.
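As a complement to the baker's-mapping example, here is a minimal Python sketch (the contraction value $\alpha = 0.4$ is an arbitrary admissible choice, and the offset recursion assumes the standard dissipative baker's mapping as described above): it generates the $2^k$ horizontal strips forming the $k$-th image of the unit square and evaluates the dimension formula (17.56d).

```python
# A hedged sketch of the dissipative baker's mapping with contraction alpha in (0, 1/2):
# after k iterations the image of the unit square consists of 2**k horizontal strips of
# height alpha**k, which explains the Cantor-like cross-section of the attractor.
import numpy as np

alpha, k = 0.4, 10
offsets = np.array([0.0])                        # y-offsets of the strips
for _ in range(k):
    offsets = np.concatenate((alpha * offsets, alpha * offsets + 0.5))

print(offsets.size, offsets.size * alpha ** k)   # 2**k strips, total height (2*alpha)**k
print(1.0 + np.log(2.0) / (-np.log(alpha)))      # Lyapunov dimension, cf. (17.56d)
```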
IC: Let T be the whole torus with local coordinates (Q,x,y), as is shown in Fig. 17.20a.
Figure 17.20 Let a mapping p: T
+ T be defined by (17.57)
with parameter a E (0,1/2). The image p(T).with the intersections p(T)n D(8)and pZ(T)n D ( Q ) , is shown in Fig. 17.20b and Fig. 17.20~. The result of infinitely many intersections is the set A =
p*(T),which is called a solenoid. The attractor A consists of a continuum of curves in the length k=O
direction. and each of them is dense in A, and unstable. The cross-section of the A transversal to these curves is a Cantor set. In 2 The set A has a neighborhood which is a domain of atThe Hausdorff dimension is dH(A) = 1 - -. In a traction. Furthermore, the attractor A is structurally stable, i.e.; the qualitative properties formulated above do not change for @-small perturbations of p. ID: The solenoid is an example of a hyperbolic attractor.
17.2.5 Strange Attractors and Chaos 1. Chaotic Attractor Let {pt}terbe a dynamical system in the metric space ( M .p ) . The attractor A of this system is called chaotic if there is a sensitive dependence on the initial condition in 12. The property " sensitive dependence on the initial conditions " will be made more precise in different ways. It is given, e.g., if one of the two following conditions is fulfilled: a) All motions of {pt}on '2 are unstable in a certain sense. b) The greatest Lyapunov exponent of {p*}is positive with respect to an invariant ergodic probability measure concentrated on '2. ISensitive dependence in the sense of a) occurs for the solenoid. Property b) can be found, e.g.. for HCnon attractors.
2. Fractals and Strange Attractors An attractor A of {p"ttEris called fractal if it represents neither a finite number of points or a piecewise differentiable curve or surface nor a set which is bounded by some closed piecewise differentiable surface. An attractor is called strange if it is chaotic, fractal or both. The notions chaotic, fractal and strange are used for compact invariant sets analogously even if they are not attractors. h dynamical system is called chaotic if it has a compact invariant chaotic set.
17.3 Bifurcation Theory and Routes to Chaos 827 I Themapping
+
(17.58) z,+1 = 2 2 , + Y, (mod 11, Y"+I = 2, yn (mod 1) (Anosov difleomorphism) is considered on the unit square. The adequate phase space for this system is the torus 2'. It is conservative, has the Lebesgue measure as invariant measure, has a countable number of periodic orbits whose union is dense and is mixing. Otherwise, .A = T 2 is an invariant set with integer dimension 2.
3. Systems Chaotic in the sense of Devaney Let {pt}tErbe a dynamical system in the metric space (M,p)with a compact invariant set '4. The system {pt}ter (or the set A) is called chaotic in the sense of Devaney, if a) {pt}tEr is topologically transitive on A, Le., there is a positive semiorbit, which is dense in .A. b) The periodic orbits of {pt}tErare dense in A. c) {pt}tEr is sensitive with respect to the initial conditions in the sense of Guckenheimer on '2, i.e.! 3 F > o v X E 1 I t / 6 > 0 3 y E A f l U 6 ( 2 ) I t 2 0 :p(pt((.),pt(y))>& (17.59) where c6(z)= { z E M :p ( a , z ) < 6). W Consider the space of the 0-1-sequences =
{ s = SOSl5-2.. . , s, E {0,1} (i = 0 , l . . .)}
For two sequences s = sosls2.. . and s' = sbs{s;. . ., their distance is defined by 0. if s = st, p('% "1 = 2 - 3 , if s # st.
(
where j is the smallest index for which s3 # 51,. So, (E, p ) is a complete metric space which is also compact. The mapping p : s = soslsz.. . H~ ( s=) s' = s ~ s Z S ~ .. . is called a Bernoulli shift. The Bernoulli shift is chaotic in the sense of Devaney.
17.2.6 Chaos in One-Dimensional Mappings For continuous mappings of a compact interval into itself, there exist several sufficient conditions for the existence of chaotic invariant sets. We mention three examples. Shinai Theorem: Let p : I -i I be a continuous mapping of a compact interval I , e.g., I = [0,1] into itself. Then the system {pk} on I is chaotic in the sense of Devaney if and only if the topological entropy of cp on I . i.e., h(p),is positive. Sharkovsky Theorem: Consider the following ordering of positive integer numbers: 3 t 5 t 7 t . . . + 2 . 3 + 2 . 5 + . . . t 2 ' . 3 + 2 ' . 5 + . . . . . t 23 t 2* + 2 + 1. (17.60) Let p: I -i I be a continuous mapping of a compact interval into itself and suppose {pk} has an nperiodic orbit on I . Then {pk} also has an m-periodic orbit if n + m. Block, Guckenheimer and Misiuriewicz Theorem: Let p : I + I be a continuous mapping of the compact interval I into itself such that {pk} has a 2"m-periodic orbit ( m > 1, odd). Then In 2 h(p) 1holds. I
17.3 Bifurcation Theory and Routes to Chaos 17.3.1 Bifurcations in Morse-Smale Systems Let {p:}ter be a dynamical system generated by a differential equation or by a mapping on M C R", which additionally depends on a parameter E E V C R'. Every change of the topological structure of the phase portrait of the dynamical system for small changes of the parameter is called bifurcation. The parameter E = 0 E V is called the bifurcation value if there exist parameter values E E V in every
I
828
17. Dvnamical Svstems and Chaos
neighborhood of 0 such that the dynamical systems {cp:} and { cpk} are not topologically equivalent or conjugated on M. The smallest dimension of the parameter space for which a bifurcation can be observed is called the codzmenszon of the bifurcation. We distinguish between local bifurcations, which occur in the neighborhood of a single orbit of the dynamical system, and global bifurcations, which affect a large part of the phase space.
17.3.1.1 Local Bifurcations in Neighborhoods of Steady States 1. Center Manifold Theorem Consider a parameter-dependent differential equation
x = f(x,E) or & = fi(zl,.. . , x,, . . . , E L ) (z = 1,2,. . . , n) (17.61) with f : A1 x V + R". where h.3 C R" and V C RLare open sets and f is supposed to be r times continuously differentiable. Equation (17.61) can be interpreted as a parameter-free differential equation x = f(x.E ) , 6 = 0 in the phase space 12.3 x V . From the Picard-Lindelof theorem and the theorem on differentiability with respect to the initial values (see 17.1.1.1, 2., p. 795) it follows that (17.61) has a locally unique solution p ( , , p )E) with initial point p at timet = 0 for arbitraryp E M and E E I', which is r times continuously differentiable with respect t o p and E . Suppose all solutions exist on the whole of R. Furthermore. we suppose that system (17.61) has the equilibrium point x = 0 at E = 0. Le.. f(0,O) = 0 holds. Let XI,.
..
~
: :1
A, be the eigenvalues of Dzf(O,O) = -(O,
0)
I:,=
with ReX, = 0. Furthermore,
suppose, D,f(O, 0) has exactly m eigenvalues with negative real part and k = n - s - m eigenvalues with positive real part. According to the center manifold theorem for differential equations (theorem of Shoshitaishvili, see [17.12]),the differential equation (17.61),for E with a sufficiently small norm 1 1 ~ 1 1in the neighborhood of 0. is topologically equivalent to the system 5 = F(z,EE ) AX + g ( x , E ) , j , = -y, t= (17.62) where A is a matrix of type ( s , s) with eigenvalues XI,. . . , A,, and g with x E RS,y E R"' and z E Rk, represents a CT-functionwith g(0,O) = 0 and Dzg(O,0) = 0. It follows from representation (17.62) that the bifurcations of (17.61) in a neighborhood of 0 are uniquely described by the differential equation x =F(~,E). (17.63) Equation (17.63)represents the reduceddiferential equation to the local centermanifoldW& = {x,y: z : y = 0. z = 0) of (17.62). The reduced differential equation (17.63) can often be transformed into a relatively simple form, e.g., with polynomials on the right-hand side, by a non-linear parameterdependent coordinate transformation so that the topological structure of its phase portrait does not change close to the considered equilibrium point. This form is a so-called normalform. .4 normal form cannot be determined uniquely; in general, a bifurcation can be described equivalently by different normal forms.
2. Saddle-Node Bifurcation and Transcritical Bifurcation Suppose (17.61) is given with 1 = 1, where f is continuously differentiable at least twice and D,f(O, 0) has the eigenvalue XI = 0 and TI - 1 eigenvalues A, with ReXj # 0. According to the center manifold theorem. in this case, all bifurcations (17.61) near 0 are described by a one-dimensional reduced differ8F
entia1 equation (17.63). Obviously, here F(0,O) = -(O, 8X
a2
--F(O, ax2
0)
0 ) = 0. If, additionally, it is supposed that
8F
# 0, K(O, 0) # 0 and the right-hand side of (17.63) is expanded according to the Taylor
formula, then this representation can be transformed by coordinate transformation (see [17.6]) into
17.9 Bifurcation Theoru and Routes to Chaos 829
the normal form j: = a . + X 2 +
...
(17.64)
(for s ( O >. 0 0) ) or j: = a. - x2 + . . .
TX:
(
for -(O,O)
)
< 0 , where a. =
a(€)
is a differentiable
function with a(0) = 0 and the points indicate higher-order terms. For a < 0, (17.64) has two equilibrium points close to x = 0, among which one is stable, the other is unstable. For a. = 0, these equilibrium points fuse into x = 0, which is unstable. For a > 0, (17.64) has no equilibrium point near to 0 (Fig. 17.21b). The multidimensional case results in a saddle-node bijurcatzon in a neighborhood of 0 in (17.61). This bifurcation is represented in Fig. 17.22 for n = 2 and A1 = 0, A2 < 0. The representation of the saddle-node bifurcation in the extended phase space is shown in Fig. 17.21a. For sufficiently smooth vector fields (17.61), the saddle-node bifurcations are generic.
-
-
--:
a<0
a>O
Figure 17.21
a=O
Figure 17.22
dF If among the conditions which yield a saddle-node bifurcation for F , the condition -(O,O) de
dF
# 0 is
82 F
replaced by the conditions -(O. 0) = 0 and -(0,O) # 0, then we get from (17.63) the truncated dE dXdE normal form (without higher-order terms) x = ax - x2 of a transcrztical bzfurcatzon. For n = 2 and A 2 < 0. the transcritical bifurcation together with the bifurcation diagram is shown in Fig. 17.23. Saddle-node and transcritical bifurcations have codimension 1-bifurcations.
Figure 17.23
3. Hopf Bifurcation Consider (17.61) with n 2 2. 1 = 1 and T 2 4. Suppose that f(0.e) = 0 is valid for all E with / E / 5 EO (€0 > 0 sufficiently small). Suppose the Jacobian matrix DJ(0,O) has the eigenvalues X I = X2 = iw
830
17. Dynamical Systems and Chaos
with w # 0 and n - 2 eigenvalues X j with ReX, # 0. According to the center manifold theorm, the bifurcation is described by a two-dimensional reduced differential equation (17.63)of the form (17.65) i = a(E)z - w ( i ) y g * ( q 1/, E ) , y = W ( & ) Z a ( E ) y t gz(2. Y . E ) where cy. w ,g1 and g2 are differentiable functions and u(0)= w and also a(0)= 0 hold. By a non-linear complex coordinate transformation and by the introduction of polar coordinates ( r ,19). (17.65) can be written in the normal form i = O ( E ) T + a(c)r3 ’ . ’ , 8 = W ( E ) b(&)r2+ ’ . . (17.66) where dots denote the terms of higher order. The Taylor expansion of the coefficient functions of (17.66) yields the truncated normal form
+
+ +
+
+
+
1: = a ’ ( 0 ) ~ r a(0)r3, 8 = w ( 0 ) + ~ ’ ( O ) E b(0)r2. (17.67) The theorem of Andronov and Hopf guarantees that (17.67) describes the bifurcations of (17.66) in a neighborhood of the equilibrium point for E = 0. The following cases occur for (17.67) under the assumption cy’(0) > 0: 1. a(0) < 0 (Fig. 17.24a). : 2. a(0) > 0 (Fig. 17.2413) :
a)E > 0 : Stable limit cycle and a)E < 0 : Unstable limit cycle. unstable equilibrium point. b) E = 0 : Cycle and equilibrium point fuse b) E = 0 : Cycle and equilibrium point fuse into an unstable equilibrium point. into a stable equilibrium point. c) E < 0 : A l l orbits close to (0:0) tend c) E > 0 : Spiral type unstable as in b) fort + +x equilibrium point as in b). spirally to the equilibrium point (0,O).
6 a) E>O
& d
6 &SO
b)
E20
E
Figure 17.24 The interpretation of the above cases for the initial system (17.61) shows the bifurcation of a limit cycle of a compound equilibrium point (compound focus point of multiplicity 1))which is called a Hopf bijurcation (or Andronov-Hopf bijurcation). The case a(0) < 0 is also called supercritical. the case a(0) > 0 subcritical (supposing that a’(0) > 0). The case n = 3: X1 = XZ = i. X3 < 0. a’(0) > 0 and a(0) < 0 is shown in Fig. 17.25.
Figure 17.25 Hopf bifurcations are generic and have codimension 1. The cases above illustrate the fact that a supercritical Hopf bifurcation under the above assumptions can be recognized by the stability of a focus:
17.3 Bifurcation Theorv and Routes to Chaos 831
Suppose the eigenvalues XI(&) and & ( E ) of the Jacobian matrix on the right-hand side of (17.61) at 0 are pure imaginary for E = 0, and for the other eigenvalues A, ReX, # 0 holds. Suppose furthermore d that - ReX1(E),E=o> 0 and let 0 be an asymptotically stable focus for (17.61) at E = 0. Then there is dE a supercritical Hopf bifurcation in (17.61) at E = 0. The van der Pol differential equation x &(z2 - l ) i+ z = 0 with parameter E can be written as a planar differential equation (17.68) x = y. y = -&(2 - l ) y - 5. For E = 0, (17.68) becomes the harmonic oscillator equation and it has only periodic solutions and an equilibrium point, which is stable but not asymptotically stable. With the transformation u = f i x , u = f i y for E > 0 (17.68) is transformed into the planar differential equation
+
i = v , i.=-u-(U2-&)C. (17.69) For the eigenvalues of the Jacobian matrix at the equilibrium point (0,O) of (17.69): d 1 and so X1,2(0) = *i and - ReX1(E)i,,o = - > 0. X ~ . Z ( E )= 5 dE 2 As shown in the example of 17.1.2.3, 1.)p. 801. (0,O)is an asymptotically stable equilibrium point of (17.69) for i = 0. There is a supercritical Hopf bifurcation for E = 0, and for small E > 0, (0.0) is an unstable focus surrounded by a limit cycle whose amplitude is increasing as E increases.
4. Bifurcations in Two-Parameter Differential Equations 1. Cusp Bifurcation Suppose the differential equation (17.61) is given with r 2 4 and 1 = 2. Let the Jacobian matrix D J ( 0 . 0 ) have the eigenvalue XI = 0 and n - 1 eigenvalues A, with ReX, # 0 and
dF
suppose that for the reduced differential equation (17.63) F(O.0) = -(O,O) dX
l3
d3F
:= -(O.
ax3
a 2F
= -(O,O)
8x2
= 0 und
0) # 0. Then the Taylor espansion of F close to (0,O) leads to the truncated normal form
(without higher-order terms see. [17.1])
+
+
(17.70) 3 = cy1 cy22 sign /3z3 with the parameters aland a2. The set ( ( ~ az, 1 ~E ) : cy1 cyzz sign/& = 0 ) represents a surface in extended phase space and this surface is called a cusp (Fig. 17.26a).
+
+
Figure 17.26 In the following, we suppose l3 < 0. The non-hyperbolic equilibrium points of (17.70) are defined by the svstem a1 + cyzz - z3 = 0. a2 - 3z2 = 0 and thus they lie on the curves S1 and Sz,which are
832
17. Dvnamacal Svstems and Chaos
determined by the set {(aiI 0 2 ) : 27aT - 40; = 0) and form a cusp (Fig. 17.26b). If (a1. c y 2 ) = (0,O) then the equilibrium point 0 of (17.70) is stable. The phase portrait of (17.61) in a neighborhood of 0, e.g.. for n = 2: l 3 < 0 and XI = 0 is shown in Fig. 1 7 . 2 6 ~for Xz < 0 (triple node) and in Fig. 17.26d for Xz > 0 (triple saddle) (see [17.6]). At transition from ( a i Ia z ) = (0,O) into the interior of domain 1 (Fig. 17.26b) the compound nodetype non-hyperbolic equilibrium point 0 of (17.61) splits into three hyperbolic equilibrium points (two stable nodes and a saddle) (supercritical pitchfork bifurcation). In the case of the two-dimensional phase space of (17.61)the phase portraits are shown in Fig. 17.26c,e. When the parameter pair of Si \ { (0,O)) (i = 1,2) traverse from 1 into 2 then a double saddle nodetype equilibrium point is formed which finally vanishes. A stable hyperbolic equilibrium point remains.
Figure 17.27
2. Bogdanov-Takens Bifurcation Suppose, for (17,61)>n 2 2, l = 2, T 2 2 hold and the matrix D,f(O, 0) has two eigenvalues X I = Xz = 0 and n - 2 eigenvalues X j with ReXj # 0. Let the reduced two-dimensional differential equation (17.63) be topologically equivalent to the planar system (17.71) x = y , y = a1 t a2x t 22- xy. Then there is a saddle-node bifurcation on the curve SI= ((01,0 2 ) : ai - 4a1 = 0). r\t the transition on the curve Sz = {(ai,a?):a1 = 0, a2 < 0) from the domain cy1 < 0 into the domain a1 > 0 a stable limit cycle arises by a Hopf bifurcation and on the curve S3 = { ( a ~a ,~: a1 ) = -kai + . . .} ( k > 0, constant) there exists a separatrix loop for the original system (Fig. 17.27). which bifurcates into a stable limit cycle in domain 3 (see [17.6],[17.9]). This bifurcation is of a global nature and we say that a single periodic orbit arisesfrom the homoclinic orbit ofa saddle or a separatriz loop disappears. 3. Generalized Hopf Bifurcation Suppose that the assumptions of the Hopf bifurcation with r 2 6 are fulfilled for (17.61),and the two-dimensional reduced differential equation after a coordinate transformation into polar coordinates has the normal form i = e l r + ~~r~- r5 +. . , $ = 1 t. .. The bifurcation diagram (Fig. 17.28) of this system contains the lines1 = {(EL, E Z ) : E L = 0, ~2 # 0}, whose points represent a Hopf bifurcation (see [17.4], [17.6]). There exist two periodic orbits in domain 3, amongwhichoneisstable,theotheroneisunstable. On thecurveSz = { ( E ~ , E zE): +: ~ E Z > 0. EI < 0}, these non-hyperbolic cycles fuse into a compound cycle which disappears in domain 2. 5. Breaking Symmetry Some differential equations (17.61)have symmetries in the following sense: There exists a linear transformation T (or a group of transformations) such that f(Tz,E ) = T f ( 2 ,E) holds for all z € M and E E I/. An orbit y of (17.61) is called symmetric with respect t o T if T y = y. IVe talk about a symmetry breaking bifurcation at E = 0, e.g., in (17.61) (for 1 = l ) , if there is a stable equilibrium point or a stable limit cycle for E < 0, which is always symmetric with respect to T , and for E = 0 two further stable steady states or limit cycles arise, which are nolonger symmetric with respect to T .
17.3 Bifurcation Theory and Routes to Chaos 833
Figure 17.28 For system (17.61) with f ( z lE ) = c x - x 3 the transformation T defined as T : z H --z is a symmetry, since f ( - z , ~ = ) -f(z, E ) (zE R, E E R). For E < 0 the point x1 = 0 is a stable equilibrium point. For E > 0, besides z1 = 0, there exist the two other equilibrium points ~ 2 , 3= i&; both are nonsymmetric.
17.3.1.2 Local Bifurcations in a Neighborhood of a Periodic Orbit 1. Center Manifold Theorem for Mappings Let y be periodic orbit of (17.61) for E = 0 with multipliers p1,. . . , pn-l, pn = 1. A bifurcation close to 7 is possible, if when changing E , at least one of the multipliers lies on the complex unit circle. The use of a surface transversal to y leads to the parameter-dependent PoincarC mapping z H P ( LE ) . (17.72) Then, with open sets E c Rn-' and V c RLlet P : E x V + Rn-' be a Cr-mapping where the mapping P : E x I' --t Rn-l x R' with P ( ~ , E=) ( P ( z ,E ) , & ) should be a Cr-diffeomorphism. Furthermore, let P(0,O) = 0 and suppose the Jacobian matrix D,P(O, 0) has s eigenvalues p1,. . . , ps with 1p,l = 1, m eigenvalues pstl.. . . , pstm with JpiI < 1 and k = n - s - m - 1 eigenvalues p s + m t l r . . , pn-1 with (pzl > 1. Then, according to the the center manijold theorem for mappings (see [17.4]),close to (0.0) E E x V . the mapping P is topologically conjugate to the mapping (17.73) (x,y. z , E ) +-+( F ( z ,E ) , A'y, A"z,E )
+
near (0.0) E Rn-' x RLwith F ( z ,E ) = ACz g ( z ,E ) . Here g is a Cr-differentiable mapping satisfying the relations g(0,O)= 0 and D,g(O. 0) = 0. The matrices A', As and A" are of type (s, s),( m ,rn) and ( k , k ) , respectively. It follows from (17.73) that bifurcations of (17.72) close to (0,O) are described only by the reduced mapping z HF ( z ,E ) (17.74) on the local center manijold &, = {(z,y. z ) : y = 0, t = 0).
;kg ;kg
;34$
a)
b)
U
a=O
a>O
Figure 17.29
2. Bifurcation of Double Semistable Periodic Orbits Let the system (17.61) be given with n 2 2, r 2 3 and 1 = 1. Suppose, at E = 0, the system (17.61) has periodic orbit y with multipliers pi = $1, lpzl # 1 (i = 2,3,. . . , n - 1) and pn = 1. According to the
834
1'7. Dvnamical Svstems and Chaos
center manifold theorem for mappings, the bifurcations of the Poinear6 mapping (17.72) are described 82 F dF by the one-dimensional reduced mapping (17.74) with A' = 1. If -(0,O)# 0 and - (0,O)# 0 is ax2
supposed, then it leads to the normal forms 2 HF ( z , a )= cy
+a:
or
$22
( E
z ~ c y + x - z ~ for-(O,O)<~
).
a€
(17.75a) (17.75b)
The iterations from (17.758) close to 0 and the corresponding phase portraits are represented in Fig. 17.29a and in Fig. 17.291, for different cy (see [17.6]). Close to z = 0 there are for cy < 0 a stable and an unstable equilibrium point, which fuse for cy = 0 into the unstable steady state z = 0 . For a > 0 there exists no equilibrium point close to z = 0. The bifurcation described by (17.75a) in (17.74) is called a subcritical saddle node bifurcation f o r mappings. In the case of the differential equation (17.61)1the properties of the mapping (17.75a) describe the bifurcation o f a double semistable periodic orbit: For (Y < 0 there exists a stable periodic orbit y1 and an unstable periodic orbit which fuse for cy = 0 into a semistable orbit 7 ,which disappears for cy > 0 (Fig. 17.30a,b).
Figure 17.30
3. Period Doubling or Flip Bifurcation Let system (17.61) be given with n 2 2, r 2 4 and 1 = 1. We consider a periodic orbit y of (17.61) at E = 0 with the multipliers p1 = -1, lpll # 1 (i = 2 , . . . , n - l ) ,and pn = 1. The bifurcation behavior of the Poinear6 mapping in the neighborhood of 0 is described by the one-dimensional mapping (17.74) with Ac = -1. if we suppose the normal form (17.76) z * B(z,c y ) = (-1 + cy)z + 2 3 . The steady state z = 0 of (17.76) is stable for small cy 2 0 and unstable for (Y < 0. The second iterated mapping B2 has for cy < 0 two further stable fixed points besides z = 0 for z1,2 = 5 6 + o(1aI). which are not fixed points of F, Consequently. they must be points of period 2 of (17.76). In general, for a C4-mapping(17.74) there is a two-periodic orbit at E = 0, if the following conditions are fulfilled (see [17.2]):
F(0,O) = 0,
82 F2 -(O.O) dXdE dF2
Since -(O,
dX
# 0.
dF -(o> ax d2F2
-(O,O)
8x2
dF2
0) = -1, = 0,
dF
0) = +1 holds (because of -(O, OX
-(O.O) dE
= 0.
d3F 2 -(O.O) 6x3
# 0.
(17.77)
0) = -1)$ the conditions for a pitchfork bzfurcation are
formulated for the mapping F 2 . For the differential equation (17.61) the properties of the mapping (17.76) imply that at cy = 0 a stable
17.3 Bifurcation Theory and Routes to Chaos 835
periodic orbit ye splits from y with approximately a double period (period doubling), where y loses its stability (Fig. 17.30~). I The logistic mapping pa:[0,1] -+ [0,1] is given for 0 < a 5 4 by ve(s)= a z ( 1 - s), Le., by the discrete dynamical system (17.78) Pt+l = azt(1 - st). The mapping has the following bifurcation behavior (see [17.10]): For 0 < a 5 1 system (17.78) has the equilibrium point 0 with domain of attraction [O,11. For 1 < a < 3, (17.78) has the unstable 1 equilibrium point 0 and the stable equilibrium point 1 - - , where this last one has the domain of a 1. attraction ( 0 , l ) . For a1= 3 the equilibrium point 1- - is unstable and leads to a stable two-periodic a orbit. the two-periodic orbit is also unstable and leads to a stable 22-periodic orAt the value a2 = 1 + 6, bit. The period doubling continues, and stable 29-periodic orbits arise at a = aq.Numerical evidence shows that cyp + a, x 3.570.. . as q -+ +m For cy = a x . there is an attractor F (the Feigenbaum attractor), which has a structure similar to the Cantor set. There are points arbitrarily close to the attractor which are not iterated towards the attractor, but towards an unstable periodic orbit. The attractor F has dense orbits and the Hausdorff dimension is dH(F) x 0.538 On the other hand, the dependence on initial conditions is not sensitive. In the domain a, < a < 4, there exists a parameter set A with positive Lebesgue measure such that system (17.78) has a chaotic attractor of positive Lebesgue measure for a E A . The set A is interspersed with windows in which period doublings occur. The bifurcation behavior of the logistic mapping can also be found in a class of unimodal mappings. Le.. of mappings of the interval I into itself, which has a single maximum in I . Although the parameter values cyzr for which period doubling occurs, are different from each other for different unimodal mappings. the rate of convergence by which these parameters tend to am,is equal: ak - a , x Cb-k, where b = 4.6692.. . is the Feigenbaum constant (C depends on the concrete mapping). The Hausdorff dimension is the same for all attractors F at a = a,: &(F)x 0.538.. . . 4. Creation of a Torus Consider system (17.61) with n 2 3, r 2 6 and 1 = 1. Suppose that for allE close to 0 system (17.61) has
aperiodicorbity,. Letthemultipliersof-/obep,,2 = e*%*with!P# ( O ,;- $ - .7T
}
3 , pJ
(- - 3 , . . . , n - l )
with lp31 # 1 and pn = 1. according to the center manifold theorem. in this case there exists a two-dimensional reduced C6mapping
z HF ( P . E ) (17.79) with F ( 0 ,E ) = 0 for E close to 0. If the Jacobian matrix D,F(O.E) has the conjugate complex eigenvalues P ( E ) and P(E) with Ip(0)l = 1 d for all E near 0. if d := - /p(e)l~,=o> 0 holds and p(0) is not a q-th root of 1 for q = 1,2,3.4. dE
then (17 79) can be transformed by a smooth E dependent coordinate transformation into the form z e P(z,E) = Fo(z, E ) + 0(11~1/~) (0Landau symbol), where Fois given in polar coordinates by (17.80) Here a , LU’ and b are differentiable functions. Suppose a(0) < 0 holds. Then, the equilibrium point r = 0 of (17.80) is asymptotically stable for all E < 0 and unstable for E > 0. Furthermore, for E > 0 there
I
836
17. Dunamical Sustems and Chaos
exists the circle r =
, which is invariant under the mapping (17.80) and asymptotically stable
(Fig. 17.31a).
Figure 17.31 The Neimark-Sacker Theorem (see [17.10],[17.1])states that the bifurcation behavior of (17.80) is similar to that of (supercritical Hopf bifurcation for mappings). IIn mapping (17.79), given by 1 (1+ € ) Iy +t2 2 - 2y2
(r) +-+ -(
fi
-5+(1+E)y+52-53
1
'
there is a supercritical Hopf bifurcation at E = 0. h'ith respect to the differential equation (1i.61)3the existence of a closed invariant curve of mapping (17.79) means that the periodic orbit yo is unstable for a(0) < 0 and for E > 0 a torus arises which is invariant with respect to (17.61) (Fig. 17.31b).
17.3.1.3 Global Bifurcation Besides the periodic creation orbit which arises if a separatrix loop disapears, (17.61) can have further global bifurcations. Two of them are shown in [17.12]by examples.
1. Emergence of a Periodic Orbit due to the Disappearance of a Saddle-Node IThe parameter-dependent system = z ( l - 2 2 - y2) + y ( 1 + 5 + c y ) > y = -5(1+ 5 + a ) + y ( l - 2 2 has in polar coordinates I = r cos 8,y = r sin 8the following form:
x
- y2)
r: = r ( l - .*), 9 = -(1 + a + r cos 8 ) . (17.81) Obviously, the circler = 1 is invariant under (17.81) for an arbitrary parameter a , and all orbits (except the equilibrium point (0,O)) tend to this circle for t + +m. For cy < 0 there are a saddle and a stable node on the circle, which fuse into a compound saddle-node type equilibrium point at cy = 0. For cy > 0, there is no equilibrium point on the circle, which is a periodic orbit (Fig. 17.32).
Figure 17.32
2. Disappearance of a Saddle-Saddle Separatrix in the Plane IConsider the parameter-dependent planar differential equation x = cy + 25y, y = 1 + 5 2 - yz.
(17.82)
17.3 Bifurcation Theorv and Routes to Chaos 837
For a = 0, equation (17.82) has the two saddles ( 0 , l ) and (0, -1) and the y-axis as invariant sets. The heteroclinic orbit is part of this invariant set. For small la1 # 0, the saddle-points are retained while the heteroclinic orbit disappears (Fig. 17.33).
Figure 17.33
17.3.2 Transitions to Chaos Often a strange attractor does not arise suddenly but as the result of a sequence of bifurcations, from which the typical ones are represented in Section 17.3.1. The most important ways to create strange attractors or strange invariant sets are described in the following.
17.3.2.1 Cascade of Period Doublings Analogously to the logistic equation (17.78), a cascade of period doublings can also occur in timecontinuous systems in the following way. Suppose system (17.61) has the stable periodic orbit 7:’)for E < E , . For E = E~ a period doubling occurs near $:), at which the periodic orbit 7:’) loses its stability for E > A periodic orbit 7::) with approximately double period splits from it. At E = ~ 2 there , is a new period doubling, where ntf) loses its stability and a stable orbit 7:;)with an approximately double period arises. For important classes of systems (17.61) this procedure of period doubling continues. so a sequence of parameter values { E ~ arises. } Numerical calculations for certain differential equations (17.61), e.g., for hydrodynamical differential equations such as the Lorenz system, show the existence of the limit lim J-++X
EJt1 - Ej Elf2 - E j t 1
= 6. where 6
(17.83)
is again the Feigenbaum constant. For E, = lim E j . the cycle with infinite period loses its stability, and a strange attractor appears. 31-
The geometric background for this strange attractor in (17.61)by acascade of period doubling is shown in Fig. 17.34a. The Poincari! section shows approximately a baker mapping, which suggests that a Cantor set-like structure is created.
Figure 17.34
17.3.2.2 Intermittency Consider a stable periodic orbit of (17.61)>which loses its stability for E = 0 when exactly one of the multipliers, for E < 0 inside the unit circle takes the value + l . According to the center manifold theorem. the corresponding saddle-node bifurcation of the Poincari! mapping can be described by a one-dimensional mapping in the normal form x e P ( x , a ) = a + z + x2 + . . . . Here a is a parameter
838
17. Dynamical Systems and Chaos
depending on E . Le., LY = a(&)with ( ~ ( 0 )= 0. The graph of p ( . )a)for positive a is represented in Fig. 17.34b. As can be seen in Fig. 17.3413, the iterates of ."(., a ) stay for a relatively long time in the tunnel zone for 0 2 0. For equation (17.61), this means that the corresponding orbits stay relatively long in the neighborhood of the original periodic orbit. During this time, the behavior of (17.62) is approximately periodic (laminar phase). After getting through the tunnel zone the considered orbit escapes, which results in irregular motion (turbulent phase). After a certain time the orbit is recovered and a new laminar phase begins. A strange attractor arises in this situation if the periodic orbit vanishes and its stability goes over to the chaotic set. The saddle-node bifurcation is only one of the typical local bifurcations playing a role in the intermittence scenario. Two further ones are period doubling and the creation of a torus.
17.3.2.3 Global Homoclinic Bifurcations 1. Smale's Theorem Let the invariant manifolds of the Poinear6 mapping of the differential equation (17.61) in R3 near the correspond periodic orbit y be as in Fig. 17.11b, p. 17.11. The transversal homoclinic points PJ(xo) to a homoclinic orbit of (17.61) to y. The existence of such a homoclinic orbit in (17.61) leads to e dependence on initial conditions. In connection with the considered PoincarC mapping. horseshoe mappings, introduced by Smale, can be constructed. This leads to the following statements: a) In every neighborhood of a transversal homoclinic point of the PoincarC mapping (17.74) there exists a periodic point of this mapping (Smale's theorm). Hence, in every neighborhood of a transversal homoclinic point there exists an invariant set of P"(m E N), .4,which is of Cantor type. The restriction of Pmto .2 is topologically conjugate to a Bernoulli shift. Le., to a mixing system. b) The invariant set of the differential equation (17.61) close to the homoclinic orbit is like a product of a Cantor set with the unit circle. If this invariant set is attracting, then it represents a strange attractor of (17.61).
2. Shilnikov's Theorem Consider the differential equation (17.61) in R3 with scalar parameter E . Suppose that the system (17.61) has a saddle-node type hyperbolic steady state 0 at E = 0, which exists so long as I E ~ remains small. Suppose. that the Jacobian matrix Dzf(O,0) has the eigenvalue A3 > 0 and a pair of conjugate complex eigenvalues XI,* = a & i w with a < 0. Suppose, additionally, that (17.61) has a separatrix loop 7 0 for E = 0, Le.: a homoclinic orbit which tends to 0 fort + -m and t -+ +m (Fig. 17.35a). Then, in a neighborhood of a separatrix loop (17.61) has the following phase portrait : a) Let A 3 + a < 0. If the separatrix loop breaks at E # 0 according to the variant denoted by A in (Fig. 17.35a). then there is exactly one periodic orbit of (17.61) for E = 0. If the separatrix loop breaks at E # 0 according to the variant denoted by B in (Fig. 17.35a), then there is no periodic orbit. b) Let X3 + a > 0. Then there exist countably many saddle-type periodic orbits at E = 0 (respectively. for small lcl) close to the separatrix loop yo (respectively, close to the destroid loop The Poincar6 mapping with respect to a transversal to the *io plane generates a countable set of horseshoe mappings at E = 0. from which there remain finitely many for small (€1 # 0. e,,,).
Figure 17.35
17.3 Bifurcation Theorv and Routes to Chaos 839
3. Melnikov's Method Consider the planar differential equation (17.84) i = f(x) c g ( t , x), where E is a small parameter. For E = 0, let (17.84) be a Hamiltonian system (see 17.1.4.3, 2., p. 813), dH dH i e , for f = ( f l . f 2 ) f l = - and f 2 = -- hold, where H : C' c R2 i R is supposed to be a C3ax2 axl function. Suppose the time-dependent vector field g: R x U + R2 is twice continuously differentiable, and T-periodic with respect to the first argument. Furthermore, let f and g be bounded on bounded sets Suppose that for E = 0 there exists a homoclinic orbit with respect to the saddle point 0. and the PoincarC section ztoof (17.84) in the phase space {(XI, 2 2 . t ) }f o r t = to looks as in Fig. 17.35b. The Poincark mapping P,,tO: Et, --t Et,, for small l ~ l has . a saddle point p , close to x = 0 with the invariant manifolds TVs(pE)and W ( p E ) If . the homoclinic orbit of the unperturbed system is given by p(t - t o ) , then the distance between the manifold W 9 ( p e )and W"(p,),measured along the line passing through 9(0)and perpendicular to f ( + ( O ) ) , can be calculated by the formula
+
(17.858) Here, M(.) is the Melnikovfunction which is defined by (17.85b)
M ( t o ) = T f ( p ( t - t o ) ) A g ( t , ~ (- to)) i dt. -m
(For a = ( a l , a2) and b = ( b l , b2): A means a A b = alb2 - a2bl.) If the Melnikov function M has a simple root at to3Le., M ( t 0 ) = 0 and M ' ( t 0 ) # 0 hold, then the manifolds W s ( p Eand ) W'.(pe) intersect each other transversally for sufficiently small&> 0. If M has no root, then W s ( p En) W"(p,) = 0, Le., there is no homoclinic point. Remark: Suppose the unperturbed system (17.84) has a heteroclinic orbit given by p(t - t o ) running , from a saddle point O1 in a saddle 02. Let p: and pz be the saddle points of the Poincark mapping P,,,, for small IEI. If )[I. calculated as above, has a simple root at to) then W s ( p l )and W " ( p : ) intersect each other transversally for small E > 0. I Consider the periodically perturbated pendulum equation x + sinx = E sinwt, Le.. the system x = y, k = - sinx t ~ s i n c j t in , which E is a small parameter and w is a further parameter. The 1 unperturbed system x = y) j, = - sinx is a Hamiltonian system with H ( z ,y) = -yz - cosx. It has 2 (among others) a pair of heteroclinic orbits through (-x,0) and ( T )0) (in the cylindrical phase space
S' x R these are homoclinic orbits) given by p*(t) = ( 1 2 arctan(sinh t ) ,1 2 -
)
(t E R ) . The cosh t 2 i sin ~ direct calculation of the hlelnikov function yields .M(to) = Since .Lf has a simple root cosh(.rrw/2) ' at to = 0, the Poincark mapping of the perturbed system has transversal homoclinic points for small E > 0.
.&
17.3.2.4 Destruction of a Torus 1. From Torus to Chaos 1. Hopf-Landau Model of Turbulence The problem of transition from regular (laminar) behavior to irregular (turbulent) behavior is especially interesting for systems with distributed parameters, which are described, e.g.. by partial differential equations. From this viewpoint, chaos can be interpreted as behavior irregular in time but ordered in space. On the other hand, turbulence is the behavior of the system, that is irregular in time and in space. The Hopf-Landau model explains the arising of turbulence by an infinite cascade of Hopf bifurcations: For
I
840
17. Dunarnica1 Sustems and Chaos
E = c1 a steady state befurcates in a limit cycle, which becomes unstable for E Z > E I and leads to a .At the k-th bifurcation of this type a k-dimensional torus arises, generated by non-closed torus TZ. orbits which wind on it. The Hopf-Landau model does not lead in general to an attractor which is characterized by sensitive dependence on the initial conditions and mixing. 2. Ruelle-Takens-Newhouse Scenario Suppose that in system (17.61) we have n 2 4 and 1 = 1. Suppose also that changing the parameter E , the bifurcation sequence “equilibrium point + periodic orbit + torus TZ+ torus T3” is achieved by three consecutive Hopf bifurcations. Let the quasiperiodic flow on T 3be structurally unstable. Then, certain small perturbation of (17.61) can lead to the destruction of T 3and to the creation of a structurally stable strange attractor. 3. Theorem of Afraimovich a n d Shilnikov on the Loss of Smoothness a n d the Destruction of t h e Torus T 2 Let the sufficiently smooth system (17.61) be given with n 2 3 and I = 2. Suppose that for the parameter value the system (17.61) has an attracting smooth torus T * ( Espanned ~) by a stable periodic orbit ys, a saddle-type periodic orbit 7%and its unstable manifold W’”(y,) (resonance torus). The invariant manifolds of the equilibrium points of the Poincark mapping computed with respect to a surface transversal to the torus in the longitudinal direction, are represented in Fig. 17.36a. The multiplier p of the orbit ys, which is the nearest to the unit circle, is assumed to be real and simple. Furthermore, let E ( . ) : [0,1] + V be an arbitrary continuous curve in parameter space. for which E(Oj = EO and for which system (17.61) has no invariant resonance torus for E = ~ ( 1 ) Then . the following statements are true: a) There exists a value s, E ( 0 , l ) for which TZ(&(s,)) loses its smoothness. Here, either the multiplier p(s,j is complex or the unstable manifold Wu(yujloses its smoothness close toys. b) There exists a further parameter values,, E (s., 1) such that system (17.61) has no resonance torus for s E (s*.. 11. The torus is destroyed in the following way: a ) The periodic orbit rS loses its stability for E = ~(s,,). A local bifurcation arises as period doubling or the creation of a torus. p ) The periodic orbits */u and ys coincide for E = E(s,,) (saddle-node bifurcation) and so they vanish. 7) The stable and unstable manifolds of -yu intersect each other non-transversally for E = E(s,,) (see the bijurcation diagram in Fig. 1 7 . 3 6 ~ ) .The points of the beak-shaped curve S1 correspond to the fused ;(s and yu (saddle-node bifurcation). The tip C1 of the beak-shaped curve is on a curve SO,which corresponds to a splitting of the torus.
Figure 17.36 The parameter points where the smoothness is lost, are on the curve S,,while the points on S3 characterize the dissolving of a T z torus. The parameter points for which the stable and unstable manifolds of yu intersect each other non-transversally, are on the curve S,. Let PObe an arbitrary point in the beaked shaped tip of the beak such that for this parameter value a resonance torus T Zarises. The transition from Po to PI corresponds to the case a)of the theorem. If the multiplier p becomes -1 on Sz.then there is a period doubling. .4 cascade of further period doublings can lead to a strange attractor. If a then it can result pair of complex conjugate multipliers pl,z arises on the unit circle passing through SZ. in the splitting of a further torus, for which the Afraimovich-Shilnikov theorem can be used again. The transition from Po to Pz represents the case p) of the theorem: The torus loses its smoothness, and on passing through on S1, there is a saddle-node bifurcation. The torus is destroyed, and a transition to chaos through intermittence can happen. The transition from Po to P3, finally, corresponds to the
17.3 Bifurcation Theorv and Routes to Chaos 841
case 7):After the loss of smoothness, a non-robust homoclinic curve forms on passing through on S,. The stable cycle ys remains and a hyperbolic set arises which is not attracting for the present. If ys vanishes, then a strange attractor arises from this set.
2. Mappings on the Unit Circle and Rotation Number 1. Equivalent and Lifted Mappings The properties of the invariant curves of the Poincark mapping play an important role in the loss ofsmoothness and destruction of a torus. If the Poincark mapping is represented in polar coordinates, then, under certain assumptions, we get decoupled mappings of the angular variables as informative auxiliary mappings on the unit circle. These are invertible in the case of smooth invariant curves (Fig. 17.36a) and in the case of non-smooth curves (Fig. 17.36b) they are not invertible. .4mapping F : R + R with F ( O + 1) = F ( O ) + 1for all 0 E R, which generates the dynamical system @ n + ~= F ( @ n ) , (17.86) is called equivariant. For every such mapping, an associated mapping of the unit circle f : SI i SI can be assigned where SI = R \ 2 = (0 mod 1,0E R}. Here f(z):= F ( 0 ) if the relation z = [e] holds for the equivalence class [O]. F is called a lified mapping off. Obviously, this construction is not unique. In contrast to (17.86) X t + l = f (4 (17.87) is a dynamical system on S'.
IFor two parameters w and K let the mapping F(*;w , K ) be defined by P(u;w , K ) = o+w - K sin u for all 7 E R. The corresponding dynamical system on+l= u n + w - K s i n o n (17.88) can be transformed by the transformation on = 2n0, into the system K On+, = 0, t R - -sin 27r0, (17.89) 2i7
K .
With F(0; R,K ) = 0 + R - - s i n 2 ~ an 0 equivariant mappingarises, which generates where R = 2. 27l 277 the cunonical form of the circle mapping. 2. Rotation Number The orbit y(0) = {Fn(0)} of (17.86) is a q-periodic orbit of (17.87) in S' if and only if it is a P- cycle of (17.86), Le., if there exists an integer p such that On+q= 0, 4
+ p , ( n E Z)
holds. The mapping f : S' + S' is called orientation preserving if there exists a corresponding lifted mapping F . which is monotone increasing. If F from (17.86) is a monotone inreasing homeomorphism,
F"(z) for every z E R, and this limit does not depend on z.Hence, then there exists the limit lim Inlim n Fn(z) can be defined. I f f : S' + S' is a homeomorphism and F and the expression p(F) := lim In/+-
TI
are two lifted mappings o f f , then p(F) = p(p)+ k , where k is an integer. Based on this last property, the rotation number p ( f ) of an orientation-preserving homeomorphism f : S' i S' can he defined as p ( f ) = p(F) mod 1. where F is an arbitrary lifted mapping o f f . I f f : SI i SI in (17.87) is an orientation-preserving homeomorphism, then the rotation number has the following properties (see [17.4]):
P
a) If (17.87) has a q-periodic orbit, then there exists an integer p such that p ( f ) = - holds. 4
b) If p ( f ) = 0, then (17.87) has an equilibrium point. c) Ifpjf) =
5.
wherep # 0 is an integer and q is a natural number ( p and q are coprimes), then (17.87)
has a q-periodic orbit.
I
842
17. Dunamical Systems and Chaos
d) p ( f ) is irrational if and only if (17.87) has neither a periodic orbit nor an equilibrium point Theorem of Denjoy: I f f : S' + S' is an orientation-preserving C2-diffeomorphismand the rotation number cy = p ( f ) is irrational, then f is topologically conjugate to a pure rotation whose lifted mapping is F ( x ) = x t a.
3. Differential Equations on the Torus Ta Let
61= f i ( Q i , @ ~ )Q~z = f z ( Q i , @ z ) (17.90) be a planar differential equation, where fi and f 2 are differentiable and one-periodic functions in both arguments. In this case (17.90) defines a flow, which can also be interpreted as a flow on the torus T 2 = S' x S' with respect to 01 and 02. If f l ( 0 1 , O Z ) > 0 for all (01,02),then (17.90) has no equilibrium points and it is equivalent to the scalar first-order differential equation (17.91) f2 LVith the relations 01 = t , 02 = 5 and f = -, (17.91) can be written as a non-autonomous differential
fl
equation
x = f ( t ,x) (17.92) whose right-hand side is one-periodic with respect to t and 5 . Let g ( , ,xo)be the solution of (17.92) with initial state 50 at timet = 0. So, a mapping PI(.) = p(1, ,) can be defined for (17.92),which can be considered as the lifted mapping of a mapping f : S'+ S'. ILet d1,d2E R be constants and 6,= wlr Q2 = . ~ 2a differential equation on the torus, which w2 is equivalent to the scalar differential equation x = - for w1 # 0. Thus, p(t, 20) = %:2t t 50 and w1
W1
4. Canonical Form of a Circle Mapping 1. Canonical Form The mapping F from (17.89) is an orientation-preserving diffeornorphism for
aF
0 5 K < 1, because - = 1 - Kcos2~19> 0 holds. For K = 1, F is nolonger a diffeomorphism,
a19
but it is still a homeomorphism, while for K > 1, the mapping is not invertible, and hence nolonger a homeomorphism. In the parameter domain 0 5 K 5 1. the rotation number p(R, K ) := p(F(8,R, K ) ) is defined for F ( . , R,K ) . Let K E ( 0 , l ) be fixed. Then p(.. K ) has the following properties on [0,1]: a) The function p(.. K ) is not decreasing, it is continuous, but it is not differentiable.
b) For every rational number P- E [0, 1) there exists an interval Iplqrwhose interior is not empty and 4
for which p(R, K ) = P- holds for all R E Ipiq 4 c) For every irrational number a E ( 0 , l ) there exists exactly one R with p(R, K ) = cy. 2. Devil's Staircase and Arnold Tongues For every K E (0, l ) ,p ( , 2K ) is a Cantor function. The graph of p ( . , K ) ,which is represented in Fig. 17.37b, is called the devil's staircase. The bifurcation diagram of (17.89) is represented in Fig. 17.37a. At every rational number on the R-axis: a beak-shaped region (Arnold tongue) with a non-empty interior starts, where the rotation number is constant and equal to the rational number. The reason for the formation of the tongue is a synchronization of the frequencies (frequency locking). a) For 0 5 K < 1, these regions are not overlapping. At every irrational number of the Q-axis, a continuous curve starts which always reaches the line K = 1. In the first Arnold tongue with p = 0, the dynamical system (17.89) has equilibrium points. If K is fixed and R increases, then two of these
17.3 Bifurcation Theoru and Routes to Chaos 843
equilibrium points fuse on the boundary of the first Arnold tongue and vanish at the same time. As a result ofsuch a saddle-node bifurcation, a dense orbit arises on 5’’. Similar phenomena can be observed when leaving other Arnold tongues. b ) For K > 1 the theory of the rotation numbers is nolonger applicable. The dynamics become more complicated, and the transition to chaos takes place. Here, similarly to the case of Feigenbaum constants, further constants arise, which are equal for certain classes of mappings to which also the standard circle mapping belongs. One of them is described in the following.
0.6 0.4
0.2
b)
0.4
0.6
0.8
Figure 17.37 &-1 3. Golden Mean, Fibonacci Numbers The irrational number II is called the golden mean L
and it has a simple continued fraction representation
&- 1 ~
2
1 = 1++ ~
= [l;1,1,.. .] (see 1.1.1.4,
1 + r
3.) p. 4). By successive evaluation of the continued fraction we get a sequence {r,} of rational numbers: 61 Fn which converges to . The numbers T, can be represented in the form T, = , where F, are 2 F,+l Fibonacci numbers, which are determined by the iteration F,+1 = F, +F,,-, (n = 1.2, . .) with initial &- 1 values F, = 0 and Fl = 1. Now, let R, be the parameter value of (17.89), for which p(R, 1) = 2 and let R, be the closest value to R,, for which p(R,, 1) = r, holds. Numerical calculation gives the R, - R*-*- -2.8336.. . . limit lim “+a: R,+l - n, ~
~
-
844
18. Optimization
18 Optimization 18.1 Linear Programming 18.1.1 Formulation of the Problem and Geometrical Represent at ion 18.1.1.1 The Form of a Linear Programming Problem 1. The Subject of lznearprogrammzngis the minimization or maximization of a h e a r objectwe junctzon (OF) of finitely many variables subject to a finite number of constraints (CT),which are given as linear equations or inequalities. Many practical problems can be directly formulated as a linear programming problem, or they can be modeled approximately by a linear programming problem.
2. General Form A linear programming problem has the following general form: OF:
f(x)= ~ 1 x +1 ' . . + c,x,
+ cr+lxr+l +. . + c,x,
= max!
(18.la)
(18.1b)
1
xi 2 0 , . . . ,x, 2 0; x r t l l . . . ,z, free. In a more compact vector notation this problem becomes: OF:
f ( r )= siTx1+cZTx2 = max!
(18.2a)
CT:
Allxi + A12x25 b', A z i ~ ' AzzxZ= bz, xi 2 0 , x2 free.
+
Here. we denote: (18.2~)
18.1 Linear Programming 845
3. Constraints with the inequality sign
.’
2 ’’ will have the above form if we multiply them by (-1).
4. Minimum Problem A minimum problem f(x) = min! becomes an equivalent maximum problem by multiplying the objective function by (-1) -f(x)= max!
(18.3)
5. Integer Programming Sometimes certain variables are restricted to be only integers. We do not discuss this discrete problem here.
6. Formulation with only Non-Negative Variables and Slack Variables In applying certain solution methods. we have to consider only non-negative variables, and constraints (18.1b). (18.2b) given in equality form. Every free variable xk must be decomposed into the difference of two non-negative variables xk = xi - OF:f(x)= c1z1+ . . . + c , ~ , = max! (18.4a) xi. The inequalities become equalities by adding non-negative variables; the): are called slack van- CT: a1,121 -t ’ + ai,n% = b l , 1 ables. That is. we can consider the problem in the form as given in (18.4a,b), where n is the increased number of variables. In vector form we have: “
OF: f ( x )= cTx= max! (18.5a) CT: Ax=b, x>Q. (18.5b) We can suppose that m 5 n. otherwise the system of equations contains linearly dependent or contradictory equations. 7. Feasible Set The set of all vectors satisfying constraints (18.2b) is called the feasible set of the original problem. If we rewrite the free variables as above, and every inequality of the form “5”into an equation as in (18.4a) and (18.4b), then the set of all non-negative vectors x 2 Q satisfying the constraints is called the feasible set M : M = {x E R” : x 0; Ax = b}. (18.6a) A point x* E :2f with the property f(x*)2 f(x) for every x E M (18.6b) is called the maximum point or the solution point of the linear programming problem. Obviously, the components of x not belonging to slack variables form the solution of the original problem.
>
18.1.1.2 Examples and Graphical Solutions 1. Example of the Production of Two Products Suppose we need primary materials Rl! R2, and RBto produce two products E1 and E*. Scheme 18.1 shows how many units of primary materials are needed to produce each unit of the products El and Ez, and there are given also the available amount of the primary materials. Scheme 18.1 Sellingone unit of the products El or E2 results in 20 or 60 units of profit: respectively ( P R ) . Determine a production program which yields maximum profit, if at least 10 units must be produced from product El. If we denote by x1and x2 the num-
I
846
18. Optimization
ber of units produced from El and Ez, then we get the following problem: OF: f(x) = 20x1 6Oz2 = max! CT:
1221 8x1
+ + 6x2 5 + 1222 5 10x2
21
5 2
630, 620, 350! 10.
Introducing the slack variables 2 3 > 2 4 , 2 5 , 2 6 , we get: OF: f(x) = 2021 60z2 0 . 2 3 0 . 2 4 0 x5 CT :
+ 1221 + 821
+ 622 +
+ 12x2
10x2
+
23
+
+
+ 0.z6 = max! = 630,
24
+
-21
25
+
= 620, = 350! 26 =
-10.
2. Properties of a Linear Programming Problem On the basis of this example, we can demonstrate some properties
"2,'
'
4 of the linear programming problem by graphical representation. To do this, we do not consider the slack variables; only the original two 35 ._ variables are used. a) X line alzl +a222 = b divides the 21)x2 plane into two half-planes. 25" The points ( q l2 2 ) satisfying the inequality alz1+a2x2 5 bare in one of these half-planes. The graphical representation of this set of points in a Cartesian coordinate system can be made by a line, and the halfXI plane containing the solutions of the inequalities is denoted by an -@+ arrow. The set of feasible solutions M.i.e., the set of points satisfying all inequalities is the intersection of these half-planes (Fig. 18.1). Figure 18.1 In this example the points of M form a polygonal domain. It may happen that ,PI is unbounded or empty. If more then two boundary lines go through a vertex of the polygon. we call this vertex a degenerate vertex (Fig. 18.2).
t,
r x
" I
Figure 18.2
b) Every point in the Z ~ , X Zplane satisfying the equality f ( 2 ) = 2021 + 6Oz2 = co is on one line. on the level line associated to the value eo. With different choices of eo, a family of parallel lines is defined, on each of which the value of the objective function is constant. Geometrically, those points are the solutions of the programming problem, which belong to the feasible set ,21and also to the level line 2021 + 60x2 = co with maximal value of eo. In this example, the solution point is (21, 2 2 ) = (25,35) on the line 2021 + 6Oz2 = 2600. The level lines are represented in Fig. 18.3,where the arrows point in the direction of increasing values of the objective function. Obviously. if the feasible set M is bounded, then there is at least one vertex such that the objective function takes its maximum. If the feasible set M is unbounded, it is possible that the objective function
18.1 Linear Programming 847
Figure 18.3
Figure 18.4
is unbounded. as well
18.1.2 Basic Notions of Linear Programming, Normal Form \Ye consider problem (18&,b) Kith the feasible set M
18.1.2.1 Extreme Points and Basis 1. Definition ofthe Extreme Point A point x E -21 is called an extreme point or vertex of M ,if for all E , , E A4 with x, # x,: 0 < x < 1, -x # AXl t (1 -A)&,. Le.. E is not on any line segment connecting two different points of M.
(18.7)
2. Theorem about Extreme Points The point x E .If is an extreme point of M if the columns of matrix A associated to the positive components of x are linearly independent. If the rank of A is m. then the maximal number of independent columns in A is m. So, an extreme point can have at most m positive components. The other components, a t least n - m: are equal to zero. In the usual case, there are exactly n positive components. If the number of positive components is less then m. we call it a degenerate extreme point.
3. Basis We can assign m linearly independent column vectors of the matrix A to every extreme point, the columns belonging to the positive components. This system of linearly independent column vectors is called the basis ofthe extreme point. Usually. exactly one basis belongs to every extreme point. However several bases can be assigned to a degenerate extreme point. There are at most
(3
possibilities to
choose m linearly independent vectors from n columns of A . Consequently, the number of different bases. and therefore the number of different extreme points is least one extreme point
. If M is not empty, then M has at
848
18. Optimization
OF:
f(x) =
CT:
221 21
t 352
+
22 22
-21
+ 4x3 = max!
+ I 2 1, 2, + 253 I2, 53
(18.8)
2x1 - 3x2 t 2x3 5 2.
The feasible set N determined by the constraints is represented in Fig. 18.4. Introduction of slack variables 2 4 ) 25) 2 6 , 2 7 leads to: CT: 21 .CZ ~3 - 5 4 = 1, 22 + 55 = 2, -21 t 223 x6 = 2, 221 - 322 223 27 = 2.
+
+
+
+
+
The extreme point PZ = (0,1,0) of the polyhedron corresponds to the point x = (zl1xz.z3,x4>z5,56, z7)= ( 0 , 1 , 0 , 0 , 1 , 2 , 5 )of the extended system. The columns 2 , 5 , 6 and 7of A form the corresponding basis. The degenerated extreme point Pl corresponds to (1,O , O , 0,2,3,0). .4basis of this extreme point contains the columns 1 , 5 , 6 and one of the columns 2,4 or 7. Remark: Here, the first inequality was a “2”inequality and we did not add but subtract x4. Frequently these types of additional variables both with a negative sign and a corresponding bi > 0 are called surplus variables, rather than slack variables. As we will see in 18.1.3.3,p. 852, the occurrence of surplus variables requires additional effort in the solution procedure.
4. Extreme Point with a Maximal Value of the Objective Function Theorem: If .Vi is not empty, and the objective function f(x) = cTxis bounded from above on rll. then there is at least one extreme point of M where it has its maximum. A linear programming problem can be solved by determining at least one of the extreme points with maximum value of the objective function. Usually, the number of extreme points of M is very large in practical problems, so we need a method by which we can find the solution in a reasonable time. Such a method is the simplez method which is also called the simplez algorithm or simplex procedure. ~
18.1.2.2 Normal Form of the Linear Programming Problem 1. Normal Form and Basic Solution The linear programming problem (18.4a,b) can always be transformed to the following form with a suitable renumbering of the variables:
OF:
CT:
f(x)= clzl +. ’ . + cn-,x,+,
+ cg = max!
Ui,iXi
+...+ ~ i , n - m ~ , - m t X,-,+i
am,lxl
+ ’ . + am,n-mxn-m
21:.
. . z,-, ~
2,-,+1,.
. . ,5, 2 0.
(18.9a)
!I
= bi,
+ X, = b,
(18.9b)
,
The last m columns of the coefficient matrix are obviously independent, and they form a basis. The basic solution ( X I 2, 2 ) . . . ,~,-,,x,-,~~,.. . ,x,) = (0,. . . , 0 , b i , . . . , b,) can be determined directly from the system of equations, but if b 2 0 does not hold, it is not a feasible solution. If b 2 0. then (18.9a.b) is called a normalform or canonicalform ofthe linearprogrammingproblem. In this case, the basic solution is a feasible solution, as well, Le., x 2 0, and it is an extreme point of .\I. The variables 21, . . . , zn-mare called non-basic variables and z,-,+l, . . . z,are called basic variables. The objective function has the value cg at this extreme point, since the non-basic variables are equal to zero.
I
~
18.1 Linear Programming 849
2. Determination of the Normal Form If an extreme point of ,\.Iis known, then we can get a normal form of the linear programming problem in the following nay. We choose a basis from the columns of A corresponding to the extreme point. Suppose the basic variables are collected into the vector xBand the non-basic variables are in x N .The columns associated to the basis form the basis matrix AB, the other columns form the matrix AN. Then. (18.10) Ax = ANSA, + ABXB= b. The matrix AB is non-singular and it has an inverse AB1, the so-called basas inverse. Multiplying (18.10) by AB1 and changing the objective function according to the non-basic variables results in the canonical form of the linear programming problem:
OF: f ( ~ =) CT,XN
+ CO,
( 18.1la)
CT: Ai1A1vxN+xB= Ai'b with
XN
2 0,
XB
2 0.
(18.11b)
Remark: In the original system (18.lb) has only constraints of type "5" and simultanously b 2 0. Then the extended system (18.4b) contains no surplus variables (see 18.1.2.1,p. 847). In this case a normal form is immediatlely known. Selecting all slack variables as basic variables xB we have AB = I and xB = b. xN = 0 is a feasible extreme point. IIn the above example x = (0, l , O , 0,1,2,5) is an extreme point. Consequently:
(18.12b)
+ + = 1, + + + + I;:} 521 + 523 - 354 t = 5. From f ( x ) = + 322 + 423. we get the transformed objective function f ( x )= + t 3x4 + 3, 22
-
-21
23
24
23
24
25
223
-21
26
(18.13)
27
221 -21
23
(18.14)
if we subtract the triple of the first constraint.
18.1.3 Simplex Method 18.1.3.1 Simplex Tableau The simple2 method is used to produce a sequence of extreme points of the feasible set with increasing values of the objective function. The transition to the new extreme point is performed starting from the normal form corresponding to the given extreme point, and arriving at the normal form corresponding to the new extreme point. In order to get a clear arrangement, and easier numerical performance, we put the normal form (18.9a,b) in the simplex tableau (Scheme 18.2a, 18.2b):
850
18. Optimization
Scheme 18.2a
Scheme 18.2b or briefly
1
e1
'.'
en-m
I-co
The k-th row of the tableau corresponds to the constraint xn-m+k a k . l X 1 + ' ' . ak,n-mxn-m= b k . ( 18.15a) We have for the objective function ClZl t .' ' + cn-mxn-m = f(x)- eo. (18.15b) , = (Qlb). We also get the value of From this simplex tableau, we can find the extreme point ( x NxB) the objective function at this point f(x)= eo. We can always find exactly one of the following three cases in every tableau: a) c3 5 0, j = 1, . . . , n- m: The tableau is optimal. The point (xN, xe) = (0,b) is the maximal point. b) There exists at least one j such that cj > 0 and aij 5 0, i = 1,.. . , m: The linear programming problem has no solution, since the objective function is not bounded on the feasible set; for increasing values of xj it increases without a bound. c) For every j with el > 0 there exists at least one i with at, > 0: We can move from the extreme point x to a neighboring extreme point with f(g)2 f(x). In the case of a non-degenerate extreme point x. the ">" sign always holds.
+
+
18.1.3.2 Transition to the New Simplex Tableau 1. Non-Degenerate Case If a tableau is not in final form (case c)), then we determine a new tableau (Scheme 18.3). We interchange a basic variable xp and a non-basic variable x, by the following calculations: a)
cPq= L.
(18.16a)
aPP
(18.16b) 6, = bp . Lip,. j # q> kPq, i # p. e, = -ep a,, (18.16~) d) 6ij = aij a,, . iitq3 i # p , j # q> 6, = bi b, hi,. i # p , j # q, Eo = co b, . E,. (18.16d) E3 = c3 a,j , E,, The element apqis called the pivot e l e m e n t , the p t h row is the pivot TOW, and the q-th column is the pivot column. We must consider the following requirements for the choice of a pivot element: a) Eo 2 co should hold; b) the new tableau must also correspond to a feasible solution, Le., h 2 0 must hold. Then. (&. &) = (0,b) is a new extreme point at which the value of the objective function f(g)= i.0 is not smaller than it was previously. These conditions are satisfied if we choose the pivot element in the following way: a) To increase the value of the objective function, a column with c, > 0 can be chosen for a pivot column; b) to get a feasible solution, the pivot row must be chosen as b)
6p3
C)
6,,= -ai,
= a,j , E,,
+
+
+
+
(18.17)
18.1 Linear Programming 851
If the extreme points of the feasible set are not degenerate, then the simplex method terminates in a finite number of steps (case a) or case b)). I The normal form in 18.1.2 can be written in a simplex tableau (Scheme 18.4a). Scheme 18.3
Scheme 18.4a
# %B
AN b - -co
27
-1 -1 5 -1
-1 2 0 5 -3 1 3 -3
Scheme 18.4b
24 26
27
-1 1 -1 2 0 -1
2 2 3 2 4 - 3 -6
2:2 8:2
This tableau is not optimal, since the objective function has a positive coefficient in the third column. The third column is assigned as the pivot column (the second column could also be taken under consideration). \\.'e calculate the quotients b,/a,* with every positive element of the pivot column (there is only one of them). The quotients are denoted behind the last column. The smallest quotient determines the pivot row. If it is not unique, then the extreme point corresponding to the new tableau is degenerate. After performing the steps of (18.16a)-(18.16d) we get the tableau in Scheme 18.4b. This tableau determines the extreme point (0,2,0,1,0,2,8),which corresponds to the point P7 in Fig. 18.4. Since . extreme point this new tableau is still not optimal, we interchange 26 and 2 3 (Scheme 1 8 . 4 ~ ) The of the third tableau corresponds to the point P6 in Fig. 18.4. After an additional change we get an optimal tableau (Scheme 18.4d) with the maximal point E* = (2,2,2,5,0,0,0),which corresponds to the point P5,and the objective function has a maximal value here: f(x*)= 18. Scheme 1 8 . 4 ~ Scheme 18.4d Scheme 18.5 Xn
2. Degenerate Case If the next pivot element cannot be chosen uniquely in a simplex tableau, then the new tableau represents a degenerate extreme point. .4degenerate extreme point can be interpreted geometrically as the coincident vertices of the convex polyhedron of the feasible solutions. There are several bases for such a vertex. In this case, it can therefore happen that we perform some steps without reaching a new extreme point. It is also possible that we get a tableau that we had before, so an infinite cycle may occur. In the case of a degenerate extreme point, one possibility is to perturb the constants bi by adding E' (with a suitable E , > 0) such that the resulting extreme points are nolonger degenerate. We get the solution from the solution of the perturbed problem, if we substitute E = 0. If the pivot column is chosen "randomly" in the non-uniquely determined case, then the occurrence of an infinite cycle is unlikely in practical cases.
852
18. Optimization
18.1.3.3 Determination of an Initial Simplex Tableau 1. Secondary Program, Artificial Variables If there are equalities among the original constraints (18.lb) or inequalities with negative h,, then it is not easy to find a feasible solution to start the simplex method. In this case, we start with a secondary program to produce a feasible solution, which can be a starting point for a simplex procedure for the original problem. We add an artificzal variable yh 2 0 (k = 1 , 2 , . . . , m) to every left-hand side of A s = b with b 2 0, and we consider the secondary program:
OF*: g(x,2)= -y1 - . . - ym = max!
CT':
a1,121
+...+ a1,nZn + YI
.,
~ 1 ,, . 2 ,
L 0;
~ 1 , ,.
(18.18a) = hi,
)
+ gm = h i ,
}
(18.18b)
J
. ,Ym 2 0.
For this problem, the variables yl,. . . , ym are basic variables, and we can start the first simplex tableau (Scheme 18.5). The last row of the tableau contains the sums of the coefficients of the non-basic variables, and these sums are the coefficients of the new secondary objective function OF*. Obviously, g(x,y) 5 0 always. If g(x*,y*)= 0 for a maximal point ( x ' , ~ 'of ) the secondary problem, then obviously y' = 0, and consequently x*is a solution of Ax = b. If g(x*,y') < 0, then Ax = b does not have any Glution.
2. Solution of the Secondary Program Our goal is to eliminate the artificial variables from the basis. We do not prepare a scheme only for the secondary program separately. We add the columns of the artificial variables and the row of the secondary objective function to the original tableau. The secondary objective function now contains the sums of the corresponding coefficients from the rows corresponding to the equalities, as shown below. If an artificial variable becomes a non-basic variable, we can omit its column, since we will never choose it again as a basis variable. If we determined a maximal point (x*,y*), then we distinguish between two cases: 1. g(x*,y') < 0: The system Ar = b has no solution, the linear programing problem does not have any feasible solution. 2. g(r*,y') = 0: If there are no artificial variables among the basic variables, this tableau is an initial tableau fir the original problem. Otherwise we remove all artificial variables among the basic variables by additional steps of the simplex method. By introducing the artificial variables, the size of the problem can be increased considerably. .4s we see, it is not necessary to introduce artificial variables for every equation. If the system of constraints before introducing the slack and surplus variables (see Remark on p. 848) has the form A1x 2 bl, Azr = bz.A3x 5 & with bI,b2,B2 0, then we have to introduce artificial variables only for the first two systems. For the third system the slack variables can be chosen as basic variables. IIn the example of 18.1.2, p. 848, only the first equation requires an artificial variable: OF': g(x,y)= - Y1 = max!
CT*:
x1
+
22 22
-21
2x1 -
322
+ 23 + 2x3
+ 2x3
- 24
+ y1
+ 25
= 1,
+ 26
= 2, =
2,
+ 2 7 = 2.
The tableau (Scheme 18.6b) is optimal with g(x*, y*)= 0. After omitting the second column we get the first tableau of the original problem.
18.1 Linear Programming 853
Scheme 18.6a
Scheme 18.6b
-1 -1 -1 1 - 1 0 2 0 5 3 5-3 -1-3 1 3 -3
OF
18.1.3.4 Revised Simplex Method 1. Revised Simplex Tableau Suppose the linear programming problem is given in normal form: co = max! OF: f(x)= c l z l + ' . . t C ~ - ~ X , - , , ,
+
CT:
+ .+ a 1 , n - m x n - m + Zn-m+l
0 1 1x1
2 1 2 0
'
=
(18.19a)
131,
1
. . . . ,x , > o .
Obviously. the coefficient vectors ( a = 1,.. . , n ) are the 2-th unit vectors. In order to change into another normal form and therefore to reach another extreme point, it is sufficient to multiply the system of equations (18.19b) by the corresponding basis inverse. (We refer to the fact that if AB denotes a new basis, the coordinates of a vector x can be expressed in this new basis as AB'X. If we know the inverse of the new basis. we can get any column as well as the objective function from the very first tableau by simple multiplication.) The simplex method can be modified so that we determine only the basis inverse in every step instead of a new tableau. From every tableau, we determine only those elements which are required to find the new pivot element. If the number of variables is considerably larger than the number of constraints ( n > 3m), then the revised simplex method requires considerably less computing cost and therefore has better accuracy. The general form of a revised simplex tableau is shown in Scheme 18.7. Scheme 18.7
1
I
I
1
c1 ' . ' en-m Cn-mtl ' . ' cn -cO cq The quantities of the scheme have the following meaning: z;, . . . x i : Actual basic variables (in the first step the same as xn-mtl ' . zn). el;. . , c, : Coefficients of the objective function (the coefficients associated to the basic variables are zeros). b l , . . . , b, : Right-hand side of the actual normal form. : \:slue of the objective function at the extreme point (x;,. . . ~ z i =) ( b l s . . . . bm). co
.
A*=
(
aipm+i
.' . 0 1 , ~ ;
j
2,-,+1,. am,n-m+l
' ''
A* are the columns of . . ,x, corresponding to the actual normal form;
) : actual basis inverse, where the columns of
am,n
1: = ( T I , . . . , T , ) ~ : ,Actual pivot column.
I
18. Optimization
854
2.
Revised Simplex Step
a) The tableau is not optimal when at least one of the coefficients e, ( j = 1 . 2 . . . . , n ) is positiv. Lye choose a pivot column q for a cq > 0. b) We calculate the pivot column 1: by multiplying the q-th column of the original coefficient matrix (18.19b) by A' and we introduce the new vector as the last vector of the tableau. \Ye determine the pivot row k in the same way as in the simplex algorithm (18.17). c) We calculate the new tableau by the pivoting step (18.16a-d), where utq is formally replaced by ri and the indices are restricted for n - m + 1 5 j 5 n. The column f is omitted. x q becomes a basic variable. For j = 1 , .. . . n - m, we get E, = e, t $E, where E = and gj is the j-th . ., column of the coefficient matrix of (18.19b). IConsider the normal form of the example in 18.1.2, p. 848. We want to bring x4 into the basis. The corresponding pivot column 1: = a is placed into the last column of the tableau (Scheme 18.8a) (initially A* is the unit matrix). Scheme 18.8a
Scheme 18.8b 21 2 3
x4
x2 2 5 2 6 2 7
0 0l0 0
x7
3 0 1
For j = 1.3.4 we have: E, = c, - 3~22,: (c1, c3, c4) = (2,4,0). The determined extreme point = (0,2,0,1,0,2,8)corresponds to the point P7 in Fig. 18.4. The next pivot column can be chosen for j = 3 = q . The vector 1:is determined by 1 1 0 0
[-;I[-!)
= ( r l , .. . r,) = A*% = ~
(00 03 1O0 01 ) . = and we place it into the very last column of the second tableau (Scheme 18.8b).We proceed as above analogously to the method shown in 18.1.3.2, p. 851. If n e want to return to the original method, then we have to multiply the matrix of the original columns of the non-basic variables by A* and we keep only these columns.
18.1.3.5 Duality in Linear Programming 1. Correspondence To any linear programming problem (primal problem) we can assign another unique linear programming problem (dual problem):
Przmal problem
OF: CT:
f(x)= $zl + cTx2= max! A, 1x1 + A1.2~25 bi,
+ -42.2112
A2,1~1
xl 2 0,
Dual problem (18.20a)
OF*: g(n)= bTnl + bTu2= min! (18.21a)
= b2,
x2free.
(18.20b)
The coefficients of the objective function of one of the problems form the right-hand side vector of the constraints of the other problem. Every free variable corresponds to an equation, and every variable with restricted sign corresponds to an inequality of the other problem.
18.1 Linear Programminq 855
2. Duality Theorems a) If both problems have feasible solutions, Le., M # 0,M*# 0 (where M and M' denote the feasible sets of the primal and dual problems respectively), then (18.22a) f ( x ) 5 g(u) for all 1~.E M, y E M*: and both problems have optimal solutions. b) The points x E M and u E M* are optimal solutions for the corresponding problem, if and only if f ( x )= d u ) . (18.22b) c ) If f ( x ) has no upper bound on M or g(u) has no lower bound on M*,then M*= 0 or M = 0;Le., the dual problem has no feasible solution. d) The points x E M and u E M* are optimal points of the corresponding problems if and only if (18.22~) uT(A1,lxl t A 1 . 2 ~ 11,) ~ = 0 and xT(AT1ult Allu2 - cl) = 0. Using these last equations, we can find a solution x of the primal problem from a non-degenerate optimal solution u of the dual problem by solving the following linear system of equations: (18.23a) (18.23b) (18.23~) We can also solve the dual problem by the simplex method.
3. Application of the Dual Problem Working with the dual problem may have some advantages in the following cases: a) If it is simple to find a normal form for the dual problem, we switch from the primal problem to the dual. b) If the primal problem has a large number of constraints compared to the number of variables. then we can use the revised simplex method for the dual problem. Consider the original problem of the example of 18.1.2, p. 848
Dual problem
Primal problem
OF:
CT:
f(x) = 2x1 + 3x2 + 4x3 = max! 5 x2 5 -21 t 2x3 5 2xl - 3x2 t 2x3 5 -21
-
22
-
23
-1. 2> 2) 2,
OF': CT':
g(u) = -ul
+
2u2 t 2u3 t 2u4 = min! - u3 t 2u4 2 2 , -211 t u2 - 3214 2 3. -u1 2u3 2uq 2 4, -ul
+
+
2111 u2.74ru4 2 0. 0. If the dual problem is solved by the simplex method after introducing the slack variables, then we get the optimal solution ut = (211, uzl 213,214) = (0,7,2/3,4/3) with g(G) = 18. We can obtain a solution x* of the primal problem by solving the system (Ax - b)i = 0 for ui > 0, Le., x2 = 2 , -xl + 2x3 = 2, 2x1 - 3x2 + 2x3 = 2, therefore: x* = ( 2 , 2 , 2 ) with f(x)= 18.
xl,X2,x3 2
18.1.4 Special Linear Programming Problems 18.1.4.1 Transportation Problem 1. Modeling A certain product, produced by m producers El, Ez,. . . , E, in quantities al, a 2 : . . . , a,, is to be transported t o n consumers \{. &, . . . V, with demands bl, b 2 : . . :, b,. Transportation cost of a unit product of producer 5, to consumer tJ is c,j. The amount of the produet transported from E, to 4is z , ~units. ~
856
18. Optimization
We are looking for an optimal transportation plan with minimum total transportation cost. We suppose the system is balanced, i.e., supply equals demand: m
n
E a , =Eb3.
(18.24)
3=1
t=1
Lye construct the matrix of costs C and the distribution matrix X:
(
=
v:
c1,1
'.
i
C1,n
j
cm,1 ' . ' cm,n
v, ... vn
)
E:
(
El
j
X=
)(18.25a)
Em
C:
Xi,i
. .. x1,n
;
xm,1
bl
i
. . . xm,n ... b,
)
c: ai j
.
(18.25b)
am
If condition (18.24) is not fulfilled, then we distinguish between two cases: a) If a, > b , , then we introduce a fictitious consumer V,+l with demand b,+l = a, - b3 and with transportation costs c,,,+1 = 0. b) If C a, < bl, then we introduce a fictitious producer E,,,,, with capacity am+l = b3 - a, and with transportation costs c,,,+,,~= 0. In order to determine an optimal program, we have to solve the following programming problem: m
n
OF: f ( X ) = E
ct3xt3= min!
(18.26a)
,=13=1
n
m
]=I
,=1
CT: E z t 3 = a t ( z = l ,. . . .m ) , c ~ , ~ =( ~b= ~1.,. . ,n ) , x t 3 > O ) .
(18.26b)
The minimum of the problem occurs at a vertex of the feasible set. There are m + n - 1 linearly independent constraints among the m + n original constraints, so, in the non-degenerate case, the solution contains m + n - 1 positive components zt3. To determine an optimal solution the following algorithm is used, which is called the transportation algorithm. 2. Determination of a Basic Feasible Solution Il'ith the Northwest corner rule we can determine an initial basic feasible solution: a) Choose zll = rnin(a1, b1). ( 18.27a)
b) If a, < bl, we omit the first row of X and proceed to the next source. If a, > bl, we omit the first column of X and proceed to the next destination. If al = bl, we omit either the first row or the first column of X.
(18.27b) (18.27~) (18.27d)
If there are only one row but several columns, we cancel one column. The same applies for the rows. a1 - xll and bl by bl - 211 and we repeat the procedure with the reduced scheme. The variables obtained in step a) are the basic variables, all the others are non-basic variables with zero values. c) We replace a] by
c=
( 58 32 21 71 ) : E: z , 9 2 6 3
1':
v, v, v3 v,
(
x=
E3
E:
x1,l
x 1 , ~ x1,3
xZ,1 x3,l
x2,2
x2,3
x2,4
x3,2
x3,3
x3,4
bl=
21,4
4 b2 = 6 b3 = 5 b4 = 7
Determination of an initial extreme point with the Northwest corner rule:
18.1 Linear Proqramminq 857
first step
x=(i' 4657
)
second step
further steps
x = ( j 5 - ) !ioo
9 5
3
2
0657
b
,
450 1 5 4 ) 1pgjo. I I3 3 0 $7
x=(
~
103 0 There are alternative methods to find an initial basic solution which also takes the transportation costs into consideration (see, e.g., the Vogel approximation method in [18.10]) and they usually result in a better initial solution. 3. Solution of the Transportation Problem with the Simplex Method If we prepare the usual simplex tableau for this problem, we get a huge tableau ( ( m+ n) x ( m . n ) ) with a large number of zeros: In each column, only two elements are equal to 1. So, we will work with a reduced tableau, and the following steps correspond to the simplex steps working only with the nonzero elements of the theoretical simplex tableau. The matrix of the cost data contains the coefficients of the objective function. The basic variables are exchanged for non-basic variables iteratively, while the corresponding elements of the cost matrix are modified in each step. The procedure is explained by an example.
0
a) Determination of the modified cost matrix C from C by (18.28a) Etj = c,,+p,+y, (i = 1,. . . ,m>j = 1, . . . ,n ) , with the conditions (18.28b) E,, = 0 for ( i , j ) if zt3is an actual basic variable. MC mark the elements of C belonging to basic variables and we substitute pl = 0. The other quantities p , and yj, also called potentials or simplex multiplicators, are determined so that the sum of p , ; qj and the marked costs c,? should be 0:
y 1 = -5 y2 = -3 b) We determine: Epq = n${ElJ}.
y3
= -2 q4 = -2
(18.28d)
If tpq2 0, then the given distribution X is optimal; otherwise zpqis chosen as a new basic variable. In our example: Epq = E32 = -2.
c) In C , we mark Epppand the costs associated to the basic variables. If C contains rows or columns with at most one marked element, then these rows or columns will be omitted. We repeat this procedure with the remaining matrix, until no further cancellation is possible. (18.28e) d) The elements ztj associated to the remaining marked elements E,, form a cycle. The new basic variable ipp is to be set to a positive value 6. The other variables Zzj associated to the marked elements E,, are determined by the constraints. In practice, we subtract and add 6 from or to every second element
858
18. Optimization
of the cycle. To keep the variables non-negative, the amount 6 must be chosen as 6 = z,,= min{x,, : 2 t j = ztj - 6}, where z,,will be the non-basic variable. In the example 6 = min{l,3} = 1. 4
(18.284
c9
5 t
5 4$6
1-6
X=
t
4.
6 +3-6
c
3
5 7 Then, we repeat this procedure with X = X but in the calculation of the new p , , qj values we always start with the original C. 4
6
(5)
(3)
q1 = -5 q2 = -3
2
q3
= -4
44
(0) 3
= -4
X=
\
l+b
-+
f
(
(0)
3 7 ) , f ( X ) = 49.
(18.281)
l
2-6)
The next matrix C does not contain any negative element. So, X is an optimal solution
18.1.4.2 Assignment Problem The representation is made by an example. In shipping contracts should be given to n shipping companies so that each company receives exactly one contract. We want to determine the assignment which minimizes the total costs, if the i-th company charges cZj for the j-th contract. An assignment problem is a special transportation problem with m = n and ai = bj = 1 for all i, j : n
OF:
n
f(x)=
(18.29a)
c,jzt3 = min! i = i 3=1
n
n
CT: c x i j = 1 ( i = l , . . . , n ) , J=1
= 1 ( j = 1 ,. . . > T I ) ,
zijE{O;l}.
(18.29b)
i=l
Every feasible distribution matrix contains exactly one 1 in every row and every column, all other elements are equal to zero. In a general transportation problem of this dimension, however, a nondegenerate basic solution would have 2n - 1 positive variables. Thus, basic feasible solutions to the assignment problem are highly degenerate, with n - 1 basic variables equal to zero. Starting with a feasible distribution matrix X,we can solve the assignment problem by the general transportation algorithm. It is time consuming to do so. However, because of the highly degenerate nature of the basic feasible solutions, the assignment problem can be solved with the highly efficient Hungarian method (see [18.6]).
18.1.4.3 Distribution Problem The problem is represented by an example.
18.2 Non-linear Optimization 859
Im products E l . E l . . . . em should be produced in quantities a l , az,. . . ,a,. Every product can be produced on any of n machines M I ,Mz,. . . , M,.The production of a unit of product E, on machine .$Ij needs processing time b,, and cost cij. The time capacity of machine M,is b,. Denote the quantity produced by machine .Lfj from product E, by x,j. We want to minimize the total production costs. K e get the following general model for this distribution problem:
OF:
m
n
1=1
,=I
f(x)=
C ~ , L ,= ~ min!
m
(18.30a) n
CT: x z L j = a l ( i = l > . . . % m ) ,x b , , x t 3 < b j ( j = l ,. . . ,n ) > x i j 2 0 forall i , j . 3=1
(18.30b)
1=1
The distribution problem is a generalization of the transportation problem and it can be solved by the simplex method. If all b,, = 1, then we can use the more effective transportation algorithm (see 18.1.4.1. p. 856) after introducing a fictitious product E,, (see 18.1.4.1, p. 856).
18.1.4.4 Travelling Salesman Suppose there are n places 01,02,.. . 0,. The travelling time from 0, to 0, is elj. Here, ctj # cji is possible. We want to determine the shortest route such that the traveller passes through every place exactly once, and returns to the starting point. Similarly to the assignment problem, we have to choose exactly one element in every row and column of the time matrix C so that the sum of the chosen elements is minimal. The difficulty of the numerical solution of this problem is the restriction that the marked elements cij should be arranged in order of the following form: (18.31) e,,,,,. czz,i3,.. . cln,i,,+] with i k # if for k # 1 and intl = il. The trayelling salesman problem can be solved by the branch and bound method. ~
~
18.1.4.5 Scheduling Problem n different products are processed on m different machines in a product-dependent order. At any time only one product can be processed on a machine. The processing time of each product on each machine is assumed to be known. Waiting times, when a given product is not in process, and machine idle times are also possible. An optimal scheduling of the processing jobs is determined where the objective function is selected as the time when all jobs are finished. or the total waiting time of jobs, or total machine idle time. Sometimes the sum of the finishing times for all jobs is chosen as the objective function when no waiting time or idle time is allowed.
18.2 Non-linear Optimization 18.2.1 Formulation ofthe Problem, Theoretical Basis 18.2.1.1 Formulation of the Problem 1. Non-linear Optimization Problem A non-linear optimization problem has the general form f(x)= min! subject to x E R" with (18.32a) gi(x) 50, i ~ I = { . l. . . ) m } ) h , ( x ) = o , j E J = { 1 .... ) r } (18.32b) where at least one of the functions f,gi, h, is non-linear. The set of feasible solutions is denoted by (18.33) .Lf = {x E Rn : gz(x)< 0, i E I , hj(x) = 0 , j E J } . The problem is to determine the minimum points.
I
860
18. Optimization
2. Minimum Points A point x* E M is called the global minimum point if
f(x*)5 f(x) holds for every x E M . If this relation holds for only the points x of a neighborhood U of x*>then x' is called a local minimum point. Since the equality constraints h,(x) = 0 can be expressed by two inequalities, -h, (x)I 0, h, (4 5 0, (18.34) the set can be supposed to be empty, J = 0. 18.2.1.2 Optimality Conditions 1. Special Directions a) The Cone of the Feasible Directions at x E M is defined by Z ( x ) = { d E R " : % > O : x + a d E M , 0 5 a < & } , EM,
(18.35) where the directions are denoted by d. If d E Z(x),then every point of the ray x + ad belongs to M for sufficient small values of a. b) A Descent Direction at a point x is a vector d E R" for which there exists an d > 0 such that f(x + ad) < f(3v1(2.E (0,Cu). (18.36) There exists no feasible descent direction at a minimum point. Iff is differentiable, then d is a descent direction V f ( ~ )<~0.dHere, V denotes the nabla operator, so Vf(x) represents the gradient of the scalar-valued function f at x.
2. Necessary Optimality Conditions I f f is differentiable and x' is a local minimum point, then ~ f ( x * )2~o d for every d E Z(x*). (18.37a) In particular. if x*is an interior point of M ,then Vf(x*= ) 0. (18.37b) 3. Lagrange Function and Saddle Point Optimality conditions (18.37a,b) should be transformed into a more practical form including the constraints. We construct the so-called Lagrange function or Lagrangian: m
L(x:u) = f(x)-t C u l g i ( x )= f(3 +UTg(3,
x E R, u E RT,
(18.38)
2=1
according to the Lagrange multiplier method (see 6.2.5.6, p. 401) for problems with equality constraints. A point (x',u') E R" x RT is called a saddle point of L , if L ( ~ * , u5) L(x':u*)5 L ( ~ , u * )for every E R", y E R;". (18.39)
4. Global Kuhn-Tucker Conditions .4 point x* E R" satisfies the global Kuhn-Tucker conditions if there is an U' E RT, i.e., u' 2 0 such that (x'.u') is a saddle point of L. For the proof of the Kuhn-Tucker conditions see 12.5.6: p. 623. 5 . Sufficient Optimality Condition If (x*, u*)E Rn x RT is a saddle point of L: then E' is a global minimum point of (18.32a,b). If the functions f and gz are differentiable, then we can also deduce local optimality conditions. 6 . Local Kuhn-Tucker Conditions .A point x* E M satisfies the local Kuhn-Tucker conditions if there are numbers ui 2 0, i E la(x*)such that -Vj(x') = u,Vgi(x*), where (18.40a) iE Io(?(' )
1,(x)= { i E { 1, . . . , m } : gt (x)= 0) is the index set of the active constraints at x. The point point.
x'
(18.40b) is also called a Kuhn-Tucker stationary
18.2 Non-linear Optimization 861 This means geometrically that a point x' E 'kf satisfies the local Kuhn-Tucker conditions. if the negative gradient -Vf(x') lies in the cone spanned by the gradients Vgi(x*) i E lo(x*)of the constraints active at x* (Fig. 18.5). The following equivalent formulation for (18.40a,b)is also often used: x* E R" satisfies the local Kuhn-Tucker conditions) if there is a u* E R;" such that dx*)5 0, (18.41a)
uZgt(x*) = 0, i = 1 , .. .,m,
g2(x)
(18.41b)
+ x u t V g t ( x * )= 0.
, /
/
,
level lines
f(x) = const
m
Vf(x*)
I , 0 /
(18.41~)
t=l
Figure 18.5
7. Necessary Optimality Conditions and Kuhn-Tucker Conditions If x' E ,21is a local minimum point of (18.32a.b) and the feasible set satisfies the regularity condition at x* : 3d E R" such that Vgz(x*)Td< 0 for every i E lo(x*), then x*satisfies the local Kuhn-Tucker conditions,
18.2.1.3 Duality in Optimization 1. Dual Problem With the associated Lagrangian (18.38) we form the maximum problem, the so-called dualof (18.32a,b):
L(s.u)= max! subject to ( x , ~E )M* with .21* = {(x,u)E R" x R;" : L(x,u)= min L(z,u)} ZER"
(18.42a) (18.42b)
2. Duality Theorems
(x2,uz)E M'. then a) L(x2
(18.42a,b).
18.2.2 Special Non-linear Optimization Problems 18.2.2.1 Convex Optimization 1. Convex Problem The optimization problem f(x)= min! subject to gz(x) 5 0 (i = 1:.. . , m) (18.43) is called a convex problem if the functions f and gi are convex. In particular. f and gt can be linear functions. The following statements are valid for convex problems: a ) Every local minimum o f f over M is also a global minimum. b) If JI is not empty and bounded, then there exists at least one solution of (18.43). c) I f f is strictly convex. then there is at most one solution of (18.43). 1. Optimality Conditions a) I f f has continuous partial derivatives, then x* E M is a solution of (18.43))if
(x- z * ) ~ v ~ ( z *2 )o for every x E M . (18.44) b) The Slater condition is a regularity condition for the feasible set ,tI. It is satisfied if there exists an x E A1 such that gi(x) < 0 for every non-affine linear functions gi.
I
862
18. Optimization
x' is a minimum point of (18.43) if and only if there exists a u* 2 0 such that (x*, u*)is a saddle point of the Lagrangian. Moreover, if functions f and gi are differentiable, then s*is a solution of (18.43) if and only ifx* satisfies the local Kuhn-Tucker conditions. d) The dual problem (18.42a,b) can be formulated easily for a convex optimization problem with differentiable functions f and gt: L(x,u)= max!, subject to (x,g)E M* with (18.45a) (18.45b) M' = {(x,p)E R" x RT : V,L(x,u) = Q}. The gradient of L is calculated here only with respect to 11. e) For convex optimization problems, the strong duality theorem also holds: If M satisfies the Slater condition and ifx' E M is a solution of (18.43), then there exists a 11' E RT; such that ( x * , g *is) a solution of the dual problem (18.45a,b), and f(x*)= minf(Z) = max L(x;u)= L(x*,u*). (18.46) c ) If the Slater condition is satisfied, then
ZE M
(X,U)€M'
18.2.2.2 Quadratic Optimization 1. Formulation of the Problem Quadratic optimization problems have the form f(x) = xTCx+ pTx= min! , subject to E hil c R" with M = .WT: M = {x E R" : Ax 5 b. x 2 Q}. Here, C is a symmetric ( n ,n) matrix, p E Rn, A is an (m,n) matrix, and b E R". The feasible set M can be written alternatively in the following way: '2il = .\ilI, : M = {x : Ax = b, x 2 Q}, M = A V l ~ ~ : = {x : Ax 5 b}. db'
2.
(18.47a) (18.47b)
(18.48a) ( 18.48b)
Lagrangian and Kuhn-Tucker Conditions
The Lagrangian to the problem (18.47a,b) is
L(x,u)= x T C x + p T x + g T ( A x - b ) . By introducing the notation dL
+
(18.49)
dL
v = - = p 2Cx+ ATy and y = -- = - A x + b ax du the Kuhn-Tucker conditions are as follows:
(18.50)
Case 11:
Case I:
Case 111:
a) A x + y = b ,
a) A x = b ,
a) A x + y = b .
b) 2Cx - 1 + ATu = -p,
b) 2Cx - y + ATu = -p, -
b) 2Cx + ATq = -p. (18.5lb)
c) x 2 0 . Y 2 0 , y 2 0 , u 2 0 , c) x20,Y 2 0 , d)
xTx+ y'u
= 0.
d) xTy= 0.
c)
112
0, y 2 0,
d) z T g = 0.
(18.5la)
(18.51~)
( 18.5Id)
3. Convexity The function f(x)is convex (strictly convex) if and only if the matrix C is positive semidefinite (positive definite). Every result on convex optimization problems can be used for quadratic problems with a positive semidefinite matrix C; in particular, the Slater condition always holds, so it is necessary and sufficient for the optimality of a point x*that there exists a point (x*, y, p,y),which satisfies the corresponding system of local Kuhn-Tucker conditions.
18.2 Non-linear Optimization 863
4. Dual Problem If C is positive definite, then the dual problem (18.45a) of (18.47a) can be expressed explicitly: L(2i.g) = max!. subject to ( x , ~E )M*, where (18.52a)
M" = {(x.u)E R" x RI; : x = - k - ' ( A T ~ + p ) } .
(18.52b)
2
1
If we substitute the expression x = --C-'(ATg 2 we get the equivalent problem 1
+ p) into the dual objective function L(x.u),then
+ blTp- a p ~ c --l p= max!,
q ( y )= - - ~ T A C - ~ A-T ~
4-
g2
0.
(18.53)
Hence: If x* E M is a solution of (18.47a,b),then (18.53) has a solution g' 2 0, and
f(x*)= du*).
(18.54)
Problem (18.53) can be replaced by an equivalent formulation: ~ ( g=)uTEu
+ hTu= min! ,
1
E = -AC-'AT 4
and
subject to g 2 0 where
(18.55a)
1
h = -AC-'p+b.
(18.55b)
2
18.2.3 Solution Methods for Quadratic Optimization Problems 18.2.3.1 Wolfe's Method 1. Formulation of the Problem and Solution Principle The method of Wolfe is to solve quadratic problems of the special form: f ( x )= xTCx+ eTx= min! subject to Ax = b! x 2 0. ~
'
(18.56)
We suppose that C is positive definite. The basic idea is the determination of a solution (x*, u*,y') of the corresponding system of Kuhn-Tucker conditions, associated to problem (18.56): Ax = b, (18.57a)
2Cx - 41 + ATg = -p, -
(18.57b)
x 2 0, v20;
(18.57~)
0. (18.58) Relations (18.57a,b,c)represent a linear equation system with m + n equations and 2n + m variables. i = 0 or ut = 0 (i = 1,2, . . . , n) must hold. Therefore, every solution Because of relation (18.581,either z of (18.57a.b:c), (18.58) contains at most m + n non-zero components. Hence, it must be a basic solution of (18.57a.b,c). xT41 =
2. Solution Process First, we determine a feasible basic solution (vertex) g of the system Ax = b. The indices belonging to the basis variables of form the set IB. In order to find a solution of system (18.57a,b,c),which also satisfies (18.58).we formulate the problem - p = min!, ( p E R); (18.59)
Ax = b,
(18.60a)
2 C x - y + A T g - p q-= -p- with q=2Cz+p,
x 2 0, 41 2 0, XTq: = 0.
p20;
(18.60b) (18.60~) (18.61)
I
864
18. Optimization
If (x,v>u> p ) is a solution of this problem also satisfying (18.57a,b,c) and (18.58), then p = 0. The vector ( ~ , v , u $ p=) (x,Q,Q, 1) is a known feasible solution of the system (18.60a,b,c), and it satisfies the relation (18.61), too. We form a basis associated to this basic solution from the columns of the coefficient matrix A 0 I denotes the unit matrix, 0 the zero matrix and 0 (18.62) 2 c -1 AT -g ' is the zero vector of the corresponding dimension,
0
in the following way: a) m columns belonging to x, with z E I B , b) n - m columns belonging to u, with z $ I B , c) all m columns belonging to u,, d) the last column, but then a suitable column determined in b) or c) will be dropped. If q = 0. then the interchange according to d) is not possible. Then g is already a solution. Now, we can construct a first simplex tableau. The minimization of the objective function is performed by the simplex method with an additional rule that guarantees that the relation sTy= 0 is satisfied: The variables x, and u, (i = 1 , 2 , .. . , n) must not be simultaneously basic variables. Thesimplexmethodprovidesasolutionofproblem (18.59),(18.60a,b,c),(18.61)withp = Ofor positive definite C considering this additional rule. For a positive semidefinite matrix C, based on the restricted pivot choice, it may happen that although p > 0, no more exchange-step can be made without violating the additional rules. We can show that in this case p cannot be reduced any further. If(x)= x: + 42; - loxl - 32x2 = min! with 21 + 2x2 2 3 = 7, 2x1 x2 2 4 = 8.
+
+ +
In this case C is positive semidefinite. A feasible basic solution of Ax = Ir is % = (0,0,7, 8)T, g = 2CZ + p = (-10, -32,0, O)T. We choose the basis vectors: a) columns 3 and 4 of
, b) columns 1 and 2 of
(
column Jg) instead of the first column of
( -4).
The basis matrix is formed from these columns, and the basis inverse is calculated (see 18.1, p. 844). Multiplying matrix (18.62) and the vectors
and d)
T :()
(-3 ~I
by the basis inverse, we get the first simplex tableau (Scheme 18.9). Only z1 can be interchanged with u2 in this tableau according to the complementary constraints. After a few steps, we get the solution x* = (2.5/2.0, 3/2)T. The last two equations of 2Cx v + ATu- pg = -pare: v3 = u1, u4 = u2. Therefore, by eleminating u1 and u2 the dimension of the problem can be reduced.
Scheme 18.9 21
x2
Ul
213
214
1 2
2 1
0 0 32
0 0 12
0 0 54
0 0 2 10
0 0
o--
0 0 - 1 1 2 10 10
0 - 1 0 1 10
7 8
0 0 0
1 -1
18.2 Non-linear Optimization 865
18.2.3.2 Hildreth-d'Esopo Method 1. Principle The strictly convex optimization problem f(x) = xTCxt pTx = min!, Ax 5 b has the dual problem (see l.,p. 862)
(18.63)
+
hTu = min! u 2 0 with 1 1 E = -AC-'AT, h = -AC-'p + b. 4 2 Matrix E is positive definite and it has positive diagonal elements e,, variables x and 11satisfy the following relation: 1 x = --C-'(ATu+p). 2 y(u) = uTEu
(18.64a) (18.64b)
> 0, (z
= 1 , 2 , .. m). The t ,
(18.65)
2. Solution by Iteration
The dual problem (18.64a), which contains only the condition u 2 0, can be solved by the following simple iteration method: a) Substitute u' 2 0, (e g., u' = Qjl k = 1. b) Calculate uf" for z = 1 , 2 , . . . , m according to
c) Repeat step b) with k + 1 instead of k until a stopping rule is satisfied, e.g., l y ( ~ ~ + - 'y)( a k ) l< E . E > O
Under the assumption that there is an x such that Ax < b, the sequence {$(uk)} converges to the minimum value pmlnand sequence { x k } given by (18.65) converges to the solution x* of the original problem. The sequence {uk}is not always convergent.
18.2.4 Numerical Search Procedures By using non-linear optimization procedures we can find acceptable approximate solutions with reasonable computing costs for several types of optimization problems. They are based on the principle of comparison of function values.
18.2.4.1 One-Dimensional Search Several optimization methods contain the subproblem of finding the minimum of a real function f(x) for I E [a. b]. It is often sufficient to find an approximation zof the minimum point x'.
1. Formulation of the Problem A function f(z),x E R, is called unimodal in [a,b] if it has exactly one local minimum point on every closed subinterval J [a,b]. Let f be a unimodal function on [a,b] and I * the global minimum point. Then we have to find an interval [c, d] C [a,b] with z* E [e,d] such that d - c < E , E > 0. 2. Uniform Search b-a E LVe choose a positive integer n such that 6 = -< -, and we calculate the values f ( x k )for xk = n t l 2 a + k6 ( k = 1 , .. . . n). If f(x) is the smallest value among these function values. then the minimum point z*is in the interval [x- 6, I t 61. The number of required function values for the given accuracy can be estimated by 2(b - a) (18.67) n>-1. E
866
18. Optimization
3. Golden Section Method, Fibonacci Method The interval [a.b] = [al.b l ] will be reduced step by step so that the new subinterval always contains the minimum point x*. m'e determine the points XI, p1 in the interval [al, b l ] as (18.68a) XI = al t (1 - r)(bl - al), p1 = ai t r(bl - al) with 1 (18.68 b) T = 2 (6 - 1) zz 0.618. This corresponds to the golden section. We distinguish between two cases: a) f ( A 1 ) < f(p1): We substitute a2 = al, bz = p1 and pz = XI. (18.69a) b) f ( A l ) 2 f ( p l ) : We substitute a2 = XI, bz = bl and XZ = p ~ . (18.69b) If b2 - a2 2 E , then we repeat the procedure with the interval [az,bz],where one value is already known, f(&) in case a) and f(pg) in case b), from the first step. To determine an interval [a,: b,], which contains the minimum point x*,we must calculate n function values altogether. From the requirement (18.70) E > bn - a, = T ' ( b 1 - ai) we can estimate the necessary number of steps n. By using the golden section method, at most one more function value should be determined compared to the Fibonacci method. Instead ofsubdividing the interval according to the golden section, we subdivide the interval according to the Fibonacci numbers (see 5.4.1.5, p. 323, and 17.3.2.4, 4., p. 843).
18.2.4.2 Minimum Search in n-Dimensional Euclidean Vector Space The search for an approximation of the minimum point x* of the problem f(x) = min!, x E R", can be reduced to the solution of a sequence of one-dimensional optimization problems. We take (18.71a) a) x = xl, k = 1: where x' is an appropriate initial approximation of x*. b) h'e solve the one-dimensional problems (18.71b) ~(a,) = f(~!+'~ . . ., x;?;) zf +a,, x:+', . . . ,E:) = min! with a, E R for T = 1.2,. . . . n. If c& is an exact or approximating minimum point of the r-th problem, then we substitute E:+' = E," t &. c) If two consecutive approximations are close enough to each other, Le., with some vector norm,
llxk+' - xk/I < €1 or If(xk+')- f(xk)I< € 2 , (18.71~) then xkL' is an approximation of x*.Otherwise we repeat step b) with k + 1 instead of k . The onedimensional problem in b) can be solved, by using the methods given in 18.2.4.1, p. 865.
18.2.5 Methods for Unconstrained Problems The general optimization problem f(x)= min! for 11E R" (18.72) is considered with a continuously differentiable function f. Each method described in this section constructs, in general, an infinite sequence of points { x k } E R", whose accumulation point is a stationary point. The sequence of points will be determined starting with a point x' E R" and according to the formula X k ~l xk + a d k ( k = L 2 , . . .), (18.73) -
I
Le., we first determine a direction dk E R" at xk and by the step size ak E R we indicate how far xktl is from xk in the direction dk.Such a method is called a descent method, if f(Z'"+') < f(xk) ( k = 1 , 2 . . . .). (18.74)
18.2 Non-linear Optimization 867
The equality Vf(x) = 0. where V is the nablaoperator (see 13.2.6.1, p. 664). characterizes a stationary point and can be used as a stopping rule for the iteration method.
18.2.5.1 Method of Steepest Descent Starting from an actual point xk,the direction dk in which the function has its steepest descent is dk =
-Vf(xk)
(18.75a)
and consequently xkt' = xk - cykVf(xk).
(18.75b)
.A schematic representation of the steepest descent method with level lines f(x)= f(x')is shown in Fig. 18.6. The step size q is determined by a line search, Le., cXI; is the solution of the onedimensional problem ')
Figure 18.6 Xk-l = xk t ffk&
f ( x k+ adk)= min! cy 2 0. (18.76) This problem can be solved by the methods given in 18.2.4, p. 865. The steepest descent method (18.75b) converges relatively slowly. For every accumulation point x' of the sequence { x k } , Vf(x*) = 0. In the case of a quadratic objective function, Le., f(x)= xTCxt pTx, the method has the special form:
(18.77a) with dk = -(2@
I
+p) and
(Yk
= dkTdk
(18.77b)
2dkTCdk'
18.2.5.2 Application of the Newton Method Suppose we approximate the function f at the actual approximation point xk by a quadratic function: 1 (18.78) + (X- x k ) ' V f ( x k ) -(x - xk)'H(xk)(x- x~). q(x) = 2Here H(xk)is the Hessian matrix, Le., the matrix of second partial derivatives off at the point x k . If H(xk)is positive definite, then q ( x ) has an absolute minimum at xkt'with Vq(ykt') = 0, therefore we get the Newton method: xktl = xk - H-'(xk)Vf(xk) ( k = 4 2 , . , .), Le., (18.79a)
+
dk = - H - ' ( x ~ ) v ~ ( ~ and ' ) ak in (18.73). (18.79b) The Sewton method converges quickly but it has the following disadvantages: a) The matrix H(xk)must be positive definite. b) The method converges only for sufficiently good initial points. c ) \Ye cannot influence the step size. d) The method is not a descent method. e) The computational cost of computing the inverse of H-'(xk) is fairly high. Some of these disadvantages can be reduced by the following version of the damped Newton method: (18.80) xktl = zk- ~ y k H - * ( x ' ) V f ( ~ ~( k) = 1 , 2 , .. .) . The relaxation factor ah can be determined, for example, by the principle given earlier (see 18.2.6.1, p. 867).
18.2.5.3 Conjugate Gradient Met hods Two vectors d'.rJz E R" are called conjugate vectors with respect to a symmetric, positive definite matrix C. if dlTCdZ= 0.
(18.81)
I
868
18. Optimization
If d'. dz.,, . ,dn are pairwise conjugate vectors with respect to a matrix C, then the convex quadratic problem q ( ~ =) xTCx + pTx, x E Rn,can be solved in n steps if we construct a sequence xk+'= xk+akdkstarting from x', where (Yk is the optimal step size. Under the assumption that f(x)is approx1 imately quadratic in the neighborhood of x', Le., C M -H(x*),the method developed for quadratic 2 objective functions can also be applied for more general functions f(x), without the explicit use of the matrix H(x*). The conjugate gradient method has the following steps: a) x' E R". d' = -Of($), (18.82) where X' is an appropriate initial approximation for x*. b) xktl = xk + (Ykdk ( k = 1 , .. . n) with ak 2 0 so that f(xk+ adk)will be minimized. (18.83a) ~
dk+' = -Vf(xktl) + pkdk ( k = 1,.. . , n - 1) with
(18.83b) (18.83~)
c) Repeating steps b) with zn+l and #+'instead of E' and dl.
18.2.5.4 Method of Davidon, Fletcher and Powell (DFP) LVith the DFP method, we determine a sequence of points starting from x' E Rnaccording to the formula -x k - akMkVf(xk) ( k = 1 , 2 , . . .). (18.84) Here, Mk is a symmetric, positive definite matrix. The idea of the method is a stepwise approximation of the inverse Hessian matrix by matrices Mk in the case when f(x)is a quadratic function. Starting with a symmetric, positive definite matrix M1,e.g., M1 = I (I is the unit matrix), the matrix Mk is determined from Mk-1 by adding a correction matrix of rank two (18.85) - of(&') ( k = 2,3.. . .). w e get the step size ( Y k from with yk = xk - xk-' and y k = vf(xk) f(xk- cuMkVf(xk)))= min!, (Y 2 0. (18.86) If f(x)is a quadratic function, then the DFP method becomes the conjugate gradient method with Mi = I.
18.2.6 Gradient Method for Problems with Inequality
Type Constraints If the problem f($ = min! subject to the constraints gz(x)5 0 (z = 1,. . . , m ) (18.87) has to be solved by an iteration method of the type (18.88) Xk-l = xk f(Ykdk (k = 1 ~ 2 , . . . ) then we have to consider two additional rules because of the bounded feasible set: 1. The direction dk must be a feasible descent direction at xk. 2. The step size (Yk must be determined so that xk+' is in M. The different methods based on the formula (18.88) differ from each other only in the construction of the direction dk. To ensure the feasibility of the sequence {xk}C M, we determine a; and ai in the following nay:
18.2 N o n - h e a r Optimization 869
+
a; from f ( x k odk)= min! (Y 2 o ai = max{a E R : xk t adk E M}. ~
(18.89)
Then ok = min{a;.a;}.
(18.90)
If there is no feasible descent direction dkin a certain step k , then xk is a stationary point.
18.2.6.1 Method of Feasible Directions 1 . Direction Search Program A feasible descent direction dk at point xk can be determined by the solution of the following optimization problem: u = min!.
(18.91)
If u < 0 for the result d = dk of this direction search program, then (18.92a) ensures feasibility and (18.92b) ensures the descending property of dk. The feasible set for the direction search program is bounded by the normalizing condition (18.92~).If u = 0. then xk is a stationary point, since there is no feasible descent direction at xk. A direction search program. defined by (18.92a,b,c),can result in a zig-zag behavior of the sequence x k . which can be avoided if the index set 10(xk)is replaced by the index set which are the so-called E k active constraints in sk.Thus, we exclude local directions of descent which are going from xk and lead closer to the boundaries of M consisting of the &k active constraints (Fig. 18.7).
Figure 18.7 If
D
= 0 is a solution of (18.92a,b,c) after these modifications, then
xk is a stationary
point only if
(xk).Otherwise E k must be decreased and the direction search program must be repeated.
= IEk
f l ;@
-Vf(xk)
4 Figure 18.8
2. Special Case of Linear Constraints If the functions gz(x)are linear. Le., g z ( r ) = azTx- b,, then we can establish a simpler direction search method: = V f ( ~ ~=)min! ~ d with (18.94)
atTd lldll 51. 5 0,
z E 10(xk)or z E l E k ( x k )(18.95b) (18.95a) ,
= The effect of the choice of different norms J/d/l
max{ld,l} 5 1 or Fig. 18.8a,b.
lldll = @ 5 1 is shown in
870
18. Optimization
In a certain sense. the best choice is the norm lldll = lldiiz = @d, since by the direction search program we get the direction dk,which forms the smallest angle with - V f ( x k ) .In this case the direction search program is not linear and requires higher computational costs. With the choice 1 /dl1= 1 ldIlrn= max{ I d J } 5 1 we get a system of linear constraints -1 I d, I 1, (i = 1,.. . n ) ,so the direction search program can be solved. e.g.. by the simplex method. In order to ensure that the method of feasible directions for a quadratic optimization problem f(x)= x T C+~pTx = min! with Ax I b results in a solution in finitely many steps, the direction search program is completed by the following conjugate requirements: If ak-1 = holds in a step, Le.. xk is an "interior" point, then we add the condition ~
dk-IT Cd=O (18.96) to the direction search program. Furthermore we keep the corresponding conditions from the previous steps. If in a later step ah = ai we remove the condition (18.96). I f ( x ) = .r: + 42; - lox1 - 32xz = min! SI($ = -21 I 0, gz(x) = -22 I 0, g3(x) = 51 2x2 - 7 5 0, g4(x) = 2x1 22 - 8 5 0.
+
+
Step 1: Startingwith
x1 = (3.0)T, of@)= (-4. -4dl - 32d2 = min!
Direction search program:
-dz
IO. lldll, I 1
-32)T,
] =+ d'
lo(x') = (2). = (1,I)T.
: for z such that
hlaximal feasible step size. ai = min
2 ==+ 3
atTdk> 0 ,ai = $. a;' = 1
3
, 182 2 11 2 al=mln - - = - xz=(--) (5.31 3'3'3
Directionsearch program: 4 a2 = -
3
I
113
{
X2t
= (3.2)T
Step 3: V f ( x 3 )= (-4, -16)T, Direction search program:
(-I>
;IT,
152
- :dz = 2dl +dz 5 0. IIdIl, 5 1 -:dl
a$ = 1, a; = 3
IO = ($)
= {3,4}
23
=+a3 = 1, x4 =
The next direction search program results in the minimum point is x*= x4 (Fig. 18.9).
0 =
>:IT
.
(0. Here 2 -
1
0
b
* 2 1x
1 2 3 Figure 18.9
d 4x1
18.2.6.2 Gradient Projection Method 1. Formulation of the Problem and Solution Principle Suppose the convex optimization problem f($ = min! with a,T1lI b,,
(18.97)
18.2 Non-linear Optimization 871
for i = 1,, , , , m is given. A feasible descent direction dk at the point xk E M is determined in the following way: If -Of($) is a feasible direction, then dk = -Vf(xk) is selected. Otherwise xk is on the boundary of M and - V l ( x k ) points outward from M. The vector - V j ( x k ) is projected by a linear mapping Pk into a linear submanifold of the boundary of M defined by a subset of active constraints of xk. Fig. 18.10a shows a projection into an edge, Fig. 18.10b shows a projection into a face. Supposing non-degeneracy, Le., if the vectors a,, i E 1 0 ( ~ are ) linearly independent for every x E R", such a projection is given by
dk = -Pkvf(xk)= - (I - AkT(AkAkT)-'Ak)Of($).
(18.98)
Here. Ak consisrs of all vectors L ~whose , corresponding constraints form the submanifold, into which - V f ( x k )should be projected.
Figure 18.10
2. Algorithm The gradient projection method consists of the following steps. starting with x' E M and substituting k = 1 and proceeding in accordance to the following scheme: I: If - V f ( x k )is a feasible direction, then we substitute dk = - V f ( x k ) , and we continue with 111. Otherwise we construct Ak from the vectors atTwith z E l o ( x k )and we continue with 11.
11: R'e substitute dk = - (I - AkT(AkAkT)-*Ak)Vf(rk)).If dk # 0. then we continue with 111. If
dk = 0 and p = -(AkAhT)-'AkVf(xk)2 0. then zkis a minimum point. The local Kuhn-Tucker conditions -Of($) = C utat = AkTu are obviously satisfied
-
%EIo (x_k )
2: 0, then we choose an i with u, < 0, delete the i-th row from Ak and repeat 11. 111: Calculation of ak and xk+l = xk + Qkdkand returning to I with k = k + 1. If
3. Remarks on the Algorithm If -Of@) is not feasible. then this vector is mapped onto the submanifold of the smallest dimension which contains sk.If dk = 0, then - V f ( z k ) is perpendicular to this submanifold. If u 2 0 does not hold: then the dimension of the submanifold is increased by one by omitting one of the active constraints, somaybedk # Qcanoccur (Fig. 18.10b) (with projection onto a (latera1) face). Since Ak is often obtained from Ak-1 by adding or erasing a row. the calculation of (AkAkT)-'can be simplified by the use of (Ak-1Ak-lT)-l. Solution of the problem of the previous example on p. 870. Step 1: =(3.0)~. I: O f ( x ' ) = (-4, -32)T, -Vf(xl) is feasible, d' = (4, 32)T.
111: The step size is determined as in the previous example: a1 = Step 2:
1
16 8
I
872
18. OPtimization
I: Of(xz)=
18 96 [-T, -T) .
.
..
(not feasible), I&')
= {4}, A2 = (2 1).
8 16 111:
cy2
=
g5 , x3 = (3.2)T
Step 3: I: Of@) = (-4, -16)* (not feasible), I&)
,:(
= {3,4}, A3 = T ) : -
(.1
2 1), .
uz < 0 : A3 = (1 2).
16 8
111:
5 x4 = (2: 16' -
cy3 = -
%)'
Step 4:
I: Of($) = (-6, -12)T (not feasible), 10(x4) = {3}, A4 = A3. 11: P4 = P3. d4 = (0,O)T, u = 6 2 0. It follows that x4 is a minimum point.
18.2.7 Penalty Function and Barrier Methods The basic principle of these methods is that a constrained optimization problem is transformed into a sequence of optimization problems without constraints by modifying the objective function. The modified problem can be solved, e.g., by the methods given in Section 18.2.5. With an appropriate construction of the modified objective functions, every accumulation point of the sequence of the solution points of this modified problem is a solution of the original problem.
18.2.7.1 Penalty Function Method The problem
f(x)= min! subject to gz(x)5 0 (i = 1,2,. . . , m ) is replaced by the sequence of unconstrained problems H(&,pk) = !(E) + p k S ( x ) = min! with x E R", p k > 0 (k = 1 , 2 , . . .). Here, pk is a positive parameter, and for s(x)
(18.99) (18.100)
(18.101) . problem (18.100) holds, i.e., leaving the feasible set M is punished with a "penalty" term p k S ( ~ )The is solved with a sequence of penalty parameters pk tending to co.Then lim H ( x , p k ) = ! ( E ) , x E M. (18.102) kim
If xk is the solution of the Ic-th penalty problem, then: H ( d , p k ) 2 H(Xk-',pk-l), f(Xk)2 f(Xk-')9 and every accumulation point x' of the sequence {xk}is a solution of (18.99). If xk E solution of the original problem. For instance, the following functions are appropriate realizations of S(x): S(x)= maxr{0,g1(x)>.. . ,gm(x)} ( T = 1 , 2 , . . .) or
(18.103)
M ,then xk is a (18.104a)
18.2 Non-linear Optimization 873 m
maxP{O,gz(x)}( r = 1,2,.. .).
S(x)=
(18.104b)
1=1
If functions f(x)and gz(x)are differentiable, then in the case r > 1, the penalty function H(z,Pk) is also differentiable on the boundary of M, so analytic solutions can be used to solve the auxiliary problem (18.100). Fig. 18.11shows a representation of the penalty function method.
Figure 18.11
Figure 18.12
+
W f(x) = xf xi = min! for 51 + 2 2 2 1, ff(x,pk) = 2: The necessary optimality condition is: 221 - 2pkmax{0,1- 2 1 - Q}
+ + pk max2{0, 1 - zl - x2}. 2:
The gradient of H is evaluated here with respect to x. By subtracting the equations we have x1 = z2. The equation 221 - 2pk max(0.1
-
221} = 0 has a unique solution 2: = x; =
Pk 1 solution 2; = x; = lim = - by letting k kiW1$2pk 2 ~
A. We get the 1 f 2Pk
-+ co.
18.2.7.2 Barrier Method We consider a sequence of modified problems in the form
H(& qk) = f ( x ) + q&B(x)= 3 qk > 0 . (18.105) The term q k 8 ( x ) prevents the solution leaving the feasible set M ,since the objective function increases unboundedly on approaching the boundary of Ilf. The regularzty condition M ' = { ~ E M: g2(x)<0( z = 1 , 2 m ) } # 0 and @ = M must be satisfied, Le.. the interior of A4 must be non-empty and it is possible to get to any boundary (18'106) point by approaching it from the interior, Le., the closure of M o is M. The function B ( x ) is defined to be continuous on M'. It increases to co at the boundary of M.The modified problem (18.105) is solved by a sequence of barrier parameters q k tending to zero. For the solution xk of the k-th problem (18.105) holds ) . . . ,
f(xk)I f(xk-'), and every accumulation point x' of the sequence {xk}is a solution of (18.99). Fig. 18.12shows a representation of the barrier method.
(18.107)
I
874
18. Optimization
The functions. e.g.. m
(18.108a) (18.108b) are appropriate realizations of B ( x ) . Ij(x) = 2: 2; = min! subject to x1 x2 2 1, H(x,qk) = 1
+
21
t 2 2 > 1. VH(X. qk) =
with respect to x.
[
+
yz ')= (:)
- qkxl - 2x2 - q k 51 t 5 2 - 1 221
Subtracting the equations results in x1 = 2 2 ,
221
- qk-
,
21
2:
+ xi + qk(-
In(z1 t 2 2 - I ) ) ,
+ 2 2 > 1. The gradient of H is given
1 = 0, 2x1 - 1
21
>1 -)
2
*
2;
21 qk - -- - 2 4
The solutions of problems (18.100) and (18.105) at the k-th step do not depend on the solutions of the previous steps. The application of higher penalty or smaller barrier parameters often leads to convergence problems with numerical solution of (18.100) and (18.105), e.g., in the method of (18.2.4), in particular, if we do not have any good initial approximation. Using the result of the k-th problem as the initial solution for the (k + 1)-th problem we can improve the convergence behavior.
18.2.8 Cutting Plane Methods 1. Formulation of the Problem and Principle of Solution We consider the problem f(x)= cTx= min!, c E R" (18.109) over the bounded region M c R" given by convex functions gi(X) (i = 1 , 2 , . . . , m) in the form Si(&) 5 0. A problem with a non-linear but convex objective function j(x)is transformed into this form, if (18.110) f ( ~ -) 2nt1 5 0, 2 n + 1 E R is considered as a further constraint and (18.111) j(x)= xnt1 = min! for all = (x,zn+l) E Rnil is solved with gi(x) = g i b ) 5 0. The basic idea of the method is the iterative linear approximation of M by a convex polyhedron in the neighborhood of the minimum point x*,and therefore the original program is reduced to a sequence of linear programming problems. First, we determine a polyhedron (18.112) PI = {X E R" : atTz5 b,, i = 1 , .. . , s}. From the linear program j(x)= min! with X E P1 ( 18.113) an optimal extreme point x1of Pi is determined with respect to j b ) . If&' E M holds, then the optimal solution of the original problem is found. Otherwise, we determine a hyperplane H1 = {a : & + l T ~= b,tl, &tlTxl > bstl}, which separates the point 21' from M ,so the new polyhedron contains (18.114) pz = {X E 9 : as+iTX5 h+l}.
x
18.3 Discrete Dynamic Programming 875
Fig. 18.13 shows a schematic representation of the cutting plane method. V f(X)
Figure 18.13
2. Kelley Met hod The different methods differ from each other in the choice of the separating planes Hk. In the case of the Kelley method Hk is chosen in the following way: A jk is chosen so that gJk(xk)=mm{g2(xk) ( i = 1, . . . ,m ) } . (18.115) At the point x = xk,the function gjk(x) has the tangent plane
T(x)= gjk(Xk)+ (X- x ~ ) ) ' V Y ~ , ( X ~ )(18.116) . The hyperplane Hk = {X E R" : T ( x )= 0) separates the point zkfrom all points x with y J x(x)5 0. So, for the ( k + 1)-th linear program, T ( x )5 0 is added as a further constraint. Every accumulation point x*of the sequence {g}is a minimum point of the original problem. In practical applications this method shows a low speed of convergence. Furthermore, the number of constraints is always increasing.
18.3 Discrete Dynamic Programming 18.3.1 Discrete Dynamic Decision Models A wide class of optimization problems can be solved by the methods of dynamic programming. The problem is considered as a process proceeding naturally or formally in time, and it is controlled by time-dependent decisions. If the process can be decomposed into a finite or countably infinite number of steps, then n e talk about discrete dynamic programming,otherwise about continuous dynamic programming. In this section, we discuss only n-stage discrete processes.
18.3.1.1 n-Stage Decision Processes An n-stage process P starts at stage 0 with an initial s t a t e s = % and proceeds through the intermediate states xl. xz,. . . , z ~ into - ~ a final state xn = & E X , C Rm.The state vectors xj are in the state space X , R". To drive a state x ~ into - ~the state xj,a decision uj is required. All possible decision vectors IJ, in the state xj-lform the decision space U ~ ( X ~ -R'. ~ )From xj-lwe get the consecutive state xj by the transformation (Fig. 18.14)
x, = Y,(x,-bIJJ
(18.117 )
j = l(1)n.
Un(Xn.1)
Figure 18.14
18.3.1.2 Dynamic Programming Problem Our goal is to determine a polzcy (nl,.. . ,un)which drives the process from the initial state into the state considering all constraints so that it minimizes an objective function or cost functzon f ( f l ( % . g l.). . .fn(xn-lsgn)). The functions .fj(xj-],uj) are called stage costs. The standard form of the dynamic programming problem is
876
18. Optimization
( 18.118b) 1
.
, -
The first type of constraints ~r, are called dynamzc and the others s,uj are called s t a t ~ Similarly . to (18.118a).a maximum problem can also be considered. A policy (&,. . . ,un)satisfying all constraints is called jeaszble. The methods of dynamic programming can be applied if the objective function satisfies certain additional requirements (see 18.3.3,p. 876).
18.3.2 Examples of Discrete Decision Models 18.3.2.1 Purchasing Problem In the j-th period of a time interval which can be divided into n periods, a workshop needs v j units of a certain primary material. The available amount of this material at the beginning of period j is denoted by zj-], in particular, 20 = 2 , is given. We have to determine the amounts u j , for a unit price cJ,which should be purchased by the workshop at the end of each period. The given storage capacity K must not be exceeded, Le., t u, 5 K . We have to determine a purchase policy (ul,. . . , un), which minimizes the total cost. This problem leads to the following dynamic programming problem
OF:
j ( q , ... , u,) =
n
n
jj(u,) = C c , u 3
+min!
( 18.119a)
j=1
3x1
I
= ~ j - 1t u3 - v j j = l(l)n, .c~=x,, O
CT:
~j
+
n
f(zo,u1,.. .>2,-1,u,) = C(Cj.j j=1
+
(+I
t u , - vJ2) .l).
(18.120)
18.3.2.2 Knapsack Problem Il'e have to select some of the items A I , . . . ,A, with weights w l , . . . , w, and with values c l , . . . , c, so that the total weight does not exceed a given bound W,and the total value is maximal. This problem does not depend on time. It will be reformulated in the following way: At every stage we make a decision u, about the selection of item A,. Here, u, ,= l holds if A, is selected, otherwise u j = 0. The capacity still available at the beginning of a stage is denoted by ~ ~ - so 1 ,we get the following dynamic problem:
OF:
f ( u l . . . u,) = ~
n
c,u, --+ max!
(18.121a)
(18.121b)
18.3.3 Bellman Functional Equations 18.3.3.1 Properties of the Cost Function In order to state the Bellman functional equations, the cost function must satisfy two requirements:
18.3 Discrete Dynamic Programming 877
1. Separability The function f(fl(%,u,),. . . , f,,(~~-~,u~)) is called separable, if it can be given by binary functions HI, . . . . Hn-l and by functions F1,. . . , F,,in the following way:
f(f1(Eo.U l ) , . fn(xn-1r En)) = Fl (fl(Eo, Ul),. ' ' fn(x,,-114)). Fl (fl(Eor u1). . , fn(xn-l > u,)) = H l (fl(Eo,Ul),FZ(fZ(X1,nz),' ' ' * fn(Xn-1,u,,)))> . . . . . . . . . . . . . . . . . . . . . . . . . . . ... . . .... . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . ... . . . . .. . . . . ' '
2
' '
1
Fn- I ( fn- 1 k n - 2 > ~ n - f n ( ~ n 1-I ~ n )= Hn-I ( f n - I ( ~ n - 2>
1)
Fn ( fn (xn11 u n1) = f n (&,,-I
I
un
un- 1) Fn(f n ( ~ n1- ~
1)1
l n
(18.122)
7
1.
2. Minimum Interchangeability A4function H(f(a),ki(f?)) is called minimum interchangeable,if min
(g,b)E..lxB
H
(io,~ ( b )=)min H (~ ( a min ) , ~ ( b ). ) BEE
(18.123)
&?EA
This property is satisfied, for example, if H is monotone increasing with respect to its second argument for every a E A . i.e., if for every a E A ,
H
(io,b)) 5 H (SM b,)) for F(b1)I &d.
(18.124)
Now, for the cost function of the dynamic programmingproblem, the separability off and the minimum interchangeability of all functions HI, j = l(1)n - 1 are required. The following often occurring type of cost function satisfies both requirements: (18.125)
(18.126)
(18.127)
18.3.3.2 Formulation of t h e Functional Equations We define the following functions:
k=j(l)n
@ntl(xn)= 0. If there is no policy (ul,., . .un)driving the state x ~ into - ~a final state & E X,. then n e substitute (18.129) 4J( x ~ -=~ )x.Using the separability and minimum interchangeability conditions and the dynamic constraints for = l(1)n w e get:
I
878
18. Optimization
(18.130) Equations (18.130) and (18.129) are called the Bellman junctzonal equatzons. value of the cost function f .
@I(%)
is the optimal
18.3.4 Bellman Optimality Principle The evaluation of the functional equation (18.131) corresponds to the determination of an optimal policy ($$,
,,
,$) minimizing the cost function, Le.,
(18.132) F j ( f j ( x j - 1 ) u j ) 3 . . f n ( l i n - 1 . U n ) ) -+min! based on the subprocess Pj starting at state x3-1and consisting of the last n - j t 1 stages of the total process P . The optimal policy of the process P1 with the initial state x3-l is independent of the decisions u,, . . . . u351 of the first j - 1 stages of P which have driven P to the state x3-l. To determine q ~ j ( ~ ~ we - ~ need ) . to know the value d3+1(x3).Now, if ($%. . . )u:) is an optimal policy for P3. then. obviously. ( u ; + ~. % . ,$). is an optimal policy for the subprocess Pjtl starting at xj = gj(xl-i,u;). This statement is generalized in the Bellman optimality principle. Bellman Principle: If (I$. . , . ,a;) is an optimal policy of the process P and (&.. . . , x;) is the corresponding sequence of states, then for every subprocess PI,j = l ( l ) n , with initial state the policy is also optimal (u;. . . . , I
18.3.5 Bellman Functional Equation Method 18.3.5.1 Determination of Minimal Costs Il'ith the functional equations (18.129). (18.130) and starting with &+1(xn) = 0 we determine every value q3(xj-1)with xI-l E X3-l in decreasing order of j . It requires the solution of an optimum problem over the decision space L'3(xj-l)for every xj-* E X3-1. For every xI-l there is a minimum point uj E C; as an optimal decision for the first stage of a subprocess PI starting at x3-1.If the sets Xj are not finite or they are too large: then the values can be calculated for a set of selected nodes E Xj-l. The intermediate values can be calculated by a certain interpolation method. &(a) is the optimal value of the cost function of process P . The optimal policy ($, . . . ,$) and the corresponding states (6%. . , f )can be determined by one of the following two methods.
.
18.3.5.2 Determination of the Optimal Policy 1. Variant 1: During the evaluation of the functional equations, the computed uj is also saved for After the calculation of $I(%), we get an optimal policy if we determine xy = every 53-1E X,-l. gl(&, u;) from xo = & and the saved ul = LI;~then from x; and the saved decision & we get $, etc. After every $3(x3-l) is known, we make 2. Variant 2: We save only q$(x3-!)for every x3-1E a forward calculation. Starting with J = 1 and x, = & we determine u; in increasing order of j by the evaluation of the functional equation
(18.133) During the forward calculation, we again have to solve an optimization Il'e obtain xi = gI(xi-l,g;), problem at every stage. 3. Comparison of the two Variants: The computation costs of variant 1 are less than variant 2 requires because of the forward calculations. However decision uj is saved for every state xj-i,which may require very large memory in the case of a higher dimensional decision space b'j(gl-l). while in
18.9 Discrete Dynamic Programming 879
Therefore, sometimes variant 2 is used
the case of variant 2, we have to save only the values q$(+J. on computers.
18.3.6 Examples for Applications of the Functional Equation Method 18.3.6.1 Optimal Purchasing Policy 1. Formulation of the Problem The problem from 18.3.2.1, p. 876, to determine an optimal purchasing policy
OF f ( u l , . . . . un) =
n
1c3u3 t min!
(18.134a)
3=1
(18.134b)
(18.135) (18.136)
j=l 2 3 4 5 6
~ , = 0 1 2 3 4 75 59 56 53 50 47 44 39 34 29 24 24 2 1 1 8 1 5 1 2 22 1 8 1 4 1 0 6 6 4 2 0 0
5
6
7
8
9 1 0
44 41 38 35 32 29 21 18 15 12 9 6 9 6 4 2 0 0 4 2 0 0 0 0 0 0 0 0 0 0
I
880
18. Optimization
18.3.6.2 Knapsack Problem 1. Formulation of the Problem Consider the problem given in 18.3.2.2,p. 876 n
OF : f ( u l . . . . , u,) =
1c,u, +max!
(18.137a)
,=1
CT:
X, = ~ zo =
~ -- w,u,, 1
3 = l(l)n, 3 = l(l)n,
w. 0 5 z,5 w,
(18.13713)
if < w,, Since we have a maximum problem, the Bellman functional equations are now O n t I ( X n ) = 0, u, = 0 ,
The decisions can be only 0 and 1, so it is practical to apply variant 1of the functional equation method. For j = n.n - 1 , . . . , 1 we get:
2. Numerical Example c3 = 3, cq = 1, c5 = 5, cg = 4, = 6, W4 = 3, W5 = 7, Wg = 6. Sincetheweightsw, areintegers, thepossiblevaluesforz, arez, E { O , l , . . . , l o } , j = l ( l ) n , and zo = ) every stage and and the actual decision U , ( X ~ - ~for 10 The table contains the function values for every state x , - ~ . For example. the values of ~ , & ( 2 5 ) ~4 3 ( 2 ) > 4+~3(6), and 43(8) are calculated:
W = 10, n = 6.
= 1,
c2
W1 = 2,
W2
c1
x,=O J=1 2 3 4 5 6
0: 0 0; 0 0: 0 0; 0 0: 0
= 2, = 4,
W3
1
2
3
3; 0 3; 0 3; 1 0; 0 0; 0
4; 1 3; 0 3; 1 0:0 0;0
7; 1 6; 1 3; 1 0; 0 0; 0
8
9
9; 0 10; 1 13; 1 13; 1 15; 0 9; 1 9; 0 10; 0 12; 1 15; 1 6; 0 9; 1 10: 1 10; 1 10; 1 6; 0 7; 1 7; 1 7 ; 1 7; 1 6; 1 6; 1 6; 1 6; 1 6; 1
16; 0 16; 1 13; 0 13; 1 6; 1
4
5
6
7
10 19; 0 19; 1 16; 0 16; 1 13; 1 6; 1
19 Numerical Analysis The most important principles of numerical analysis will be the subject of this chapter. The solution of practical problems usually requires the application of a professional nurnerzcal library of numerical methods, developed for computers. Some of them will be introduced at the end of Section 19.8.3. Special computer algebra systems such as Mathernatica and Maple will be discussed with their numerical programs in Chapter 20, p. 950 and in Section 19.8.4, p. 943. Error propagation and computation errors will be examined in Section 19.8.2, p. 936.
19.1 Numerical Solution of Non-Linear Equations in a Single Unknown Every equation with one unknown can be transformed into one of the normal forms: Zero form: f ( z ) = 0.
(19.1)
(19.2) Fixed point form: z = p(z). Suppose equations (19.1) and (19.2) can be solved. The solutions are denoted by z*.To get a first approximation of z*.WP can try to transfom the equation into the form f l ( z ) = f i ( z ) , where the curves of the functions y = f ~ ( zand ) y = f2(z) are more or less simple to sketch. f ( z ) = x2 - sin z = 0. We can see from the shapes of the curves y = x2 and y = sin z that z;= 0 and z;x 0.87 are roots (Fig. 19.1).
Yt
Figure 19.1
Figure 19.2
19.1.1 Iteration Method The general idea of iterative methods is that starting with known initial approximations ZR (k = 0,1,. . . , n ) we form a sequence of further and better approximations, step by step, hence we approach the solution of the given equation by iteratzon. by a convergent sequence. We try to create a sequence with convergence as fast as possible.
19.1.1.1 Ordinary Iteration Method To solve an equation given in or transformed into the fixed point form x = cp(z), we use the iteration rule (19.3) x,+~ = p(~,) (n = 0 , 1 , 2 , .. . ; zo given), which is called the ordznary zteratzon method. It converges to a solution z*if there is a neighborhood of z*(Fig. 19.2)such that (19.4)
I
882
19. Numerical Analusis
holds, and the initial approximation xo is in this neighborhood. If p(z) is differentiable, then the corresponding condition is (19.5) lp’(z)l 5 K < 1. The convergence of the ordinary iteration method becomes faster with smaller values of K .
I
1
sinx, 10.7643 10.7670 10.7681 10.7684 0.7686 0.7686 R e m a r k 1: In the case of complex solutions, we substitute z = u + iw. Separating the real and the imaginary part, we get an equation system of two equations for real unknowns u and w. R e m a r k 2: The iterative solution of non-linear equation systems can be found in 19.2.2, p. 893.
19.1.1.2 Newton’s Method 1. Formula of the Newton Method To solve an equation given in the form f(z) = 0, we mostly use the Newton method which has the formula
-
( n = 0,1,2,.. . ; xo is given), (19.6) f’(xn) Le.. to get a new approximation znil. we need the value of the function f(z) and its first derivative f’(z) at z,. 2. Convergence of the Newton Method The condition f‘(x) # 0 (19.7a) is necessary for convergence of the Newton method, and the condition z,+1
= 2,
- f(’n)
is sufficient. The conditions (19.7a.b) must be fulfilled in a neighborhood of z*such that it contains all the points I, and E* itself. If the Newton method is convergent, it converges very quickly. It has quadratic convergence, which means that the error of the (n+l)-st approximation is less than a constant multiple of the square of the error of the n-th approximation. In the decimal system, this means that after a while the number of exact digits will double step by step. IThe solution of the equation f ( z ) = x2 - a = 0. Le., the calculation of 2 = & (a > 0 is given), with the Newton method results in the iteration formula (19.8) We get for a = 2:
2,+1
= 2,
n
O
1
2
3
z, 1.5 1.4166666 1.414137 1.4142136
f‘(h)
- f(2n)
( m fixed, m < n).
(19.9)
19.1 Numerical Solution of Non-Linear Equations in a Single Unknown 883
Figure 19.3
Figure 19.4
The goodness of the convergence is hardly modified by this simplification.
5. Differentiable Functions with Complex Argument The Newton method also works for differentiable functions with complex arguments.
19.1.1.3 Regula Falsi 1. Formula for Regula Falsi To solve the equation f ( x ) = 0, the regula falsi method has the rule: (19.10) We need to compute only the function values. The method follows from the Newton method (19.6) by approximating the derivative f’(xn) by the finite difference of f ( x ) between x, and a previous approximation z, (rn < n ) .
2. Geometric Interpretation The geometric interpretation of the regula falsi method is represented in Fig. 19.4.The basic idea of the regula falsi method is the local approximation of the curve y = f ( x ) by a secant line.
3. Convergence The method (19.10) is convergent if we choose m so that f (5,) and f(x,) always have different signs. If the convergence already seems to be quick enough during the process, it will speed up if we ignore the change of sign, and we just substitute x, = zn-i. W f ( x ) = z2 - sinx = 0.
0 1
2
-0.3 0.0065
0.9 0.0267 0.87 -0.0074 0.8765 -0.000252 0.876729 0.000003 0.876726
-0.0341 0.007148 0.000255
0.8798 0.9093 0.8980
3 0.000229 -0.000003 4 If during the process the value of Ax,/Ayn only barely changes, we do not need to recalculate it again and again.
4. Steffensen Method Applying the regula falsi method with x, = x,-1 for the equation f ( x ) = x - p(x) = 0 we can often speed up the convergence. especially in the case ~ ’ ( x<) -1. This algorithm is known as the Stefensen
method. W To solye the equation z2 = sinx with the Steffensen method, we should use the form f ( x ) = x G = O .
I
884
19. Numerical Analusis
0 1 2 3
a,
xo
0.9
-0.03 0.006654
a,-l zoa:,-,
0.014942 -0.004259 0.876654 -0.000046 0.876727 0.000001 0.87
-0.019201 0.004213
1.562419 1.579397
an-? . . . a3 a2 a1 a0 ~ o a h -. .~. zoa; xoa; xoai zoab
(19.16)
(19.17)
j1 i ?;,
19.1 Numerical Solution of Non-Linear Equations in a Sinole Unknown 885
■ p_4(x) = x^4 + 2x^3 - 3x^2 - 7. The substitution value and the derivatives of p_4(x) are calculated at x_0 = 2 according to (19.16):
          1    2   -3    0   -7
x_0 = 2:  1    4    5   10   13
          1    6   17   44
          1    8   33
          1   10
          1
We see: p_4(2) = 13, p_4'(2) = 44, p_4''(2) = 66, p_4'''(2) = 60, p_4''''(2) = 24.
Remarks: 1. We can rearrange the polynomial p_n(x) with respect to the powers of x - x_0; e.g., in the example above we get p_4(x) = (x - 2)^4 + 10(x - 2)^3 + 33(x - 2)^2 + 44(x - 2) + 13.
2. The Horner scheme can also be used for complex coefficients a_k. In this case, for every coefficient we have to compute a real and an imaginary column according to (19.16).
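The following Python sketch (not part of the original text) evaluates the repeated Horner scheme; the function name is an assumption, and the example reproduces the table above.

    from math import factorial

    def horner_taylor(coeffs, x0):
        # coeffs = [a_n, ..., a_1, a_0] (highest power first).
        # Repeated synthetic division by (x - x0); the successive remainders are
        # the Taylor coefficients p(x0), p'(x0)/1!, p''(x0)/2!, ...
        c = list(coeffs)
        taylor = []
        while c:
            b = c[0]
            quotient = [b]
            for a in c[1:]:
                b = b * x0 + a
                quotient.append(b)
            taylor.append(quotient.pop())   # remainder of this division
            c = quotient                    # continue with the quotient
        return taylor

    t = horner_taylor([1, 2, -3, 0, -7], 2.0)
    print(t)                                             # [13.0, 44.0, 33.0, 10.0, 1.0]
    print([factorial(k) * t[k] for k in range(len(t))])  # [13, 44, 66, 60, 24]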
2. Complex Arguments If the coefficients a_k in (19.11) are real, then the calculation of p_n(x_0) for complex values x_0 = u_0 + i v_0 can be kept real. To show this, we decompose p_n(x) as follows:
p_n(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0
       = (x^2 - px - q)(a'_{n-2} x^{n-2} + ... + a'_0) + r_1 x + r_0   (19.18a)
with
x^2 - px - q = (x - x_0)(x - conj(x_0)), i.e., p = 2u_0, q = -(u_0^2 + v_0^2).   (19.18b)
Then we get
p_n(x_0) = r_1 x_0 + r_0 = (r_1 u_0 + r_0) + i r_1 v_0.   (19.18c)
To find (19.18a) we can construct the so-called two-row Horner scheme introduced by Collatz (19.18d).
■ p_4(x) = x^4 + 2x^3 - 3x^2 - 7. Calculate the value of p_4 at x_0 = 2 - i, i.e., for p = 4 and q = -5.
19.1.2.2 Positions of the Roots
1. Real Roots, Sturm Sequence With the Cartesian rule of signs we can get a first idea of whether the polynomial equation (19.11) has a real root or not.
a) The number of positive roots is equal to the number of sign changes in the sequence of the coefficients
a_n, a_{n-1}, ..., a_1, a_0,   (19.19a)
or it is less by an even number.
b) The number of negative roots is equal to the number of sign changes in the coefficient sequence
a_0, -a_1, a_2, ..., (-1)^n a_n,   (19.19b)
or it is less by an even number.
■ p_5(x) = x^5 - 6x^4 + 10x^3 + 13x^2 - 15x - 16 has 1 or 3 positive roots and 0 or 2 negative roots.
To determine the number of real roots in any given interval (a, b), Sturm sequences are used (see 1.6.3.2, 2., p. 44). After computing the function values y_nu = p_n(x_nu) at a uniformly distributed set of nodes x_nu = x_0 + nu*h (h constant), which can easily be done by using the Horner scheme, a good guess of the graph of the function and of the locations of the roots is obtained. If p_n(c) and p_n(d) have different signs, there is at least one real root between c and d.
2. Complex Roots In order to localize the real or complex roots in a bounded region of the complex plane, consider the following equation, which is a simple consequence of (19.11):
f*(r) = |a_{n-1}| r^{n-1} + |a_{n-2}| r^{n-2} + ... + |a_1| r + |a_0| = |a_n| r^n,   (19.20)
and determine, e.g., by systematic repeated trial and error, an upper bound r_0 for the positive roots of (19.20). Then, for all roots x_k (k = 1, 2, ..., n) of (19.11),
|x_k| <= r_0.   (19.21)
■ f(x) = p_4(x) = x^4 + 4.4x^3 - 20.01x^2 - 50.12x + 29.45 = 0, f*(r) = 4.4r^3 + 20.01r^2 + 50.12r + 29.45 = r^4. We get for
r = 6: f*(6) = 2000.93 > 1296 = r^4,
r = 7: f*(7) = 2869.98 > 2401 = r^4,
r = 8: f*(8) = 3963.85 < 4096 = r^4.
From this it follows that |x_k| < 8 (k = 1, 2, 3, 4). Actually, for the root x_k with maximal absolute value we have: -7 < x_k < -6.
Remark: A special method, the so-called root locus theory, has been developed in electrical engineering for the determination of the number of complex roots with negative real parts. It is used to examine stability (see [19.11], [19.30]).
19.1.2.3 Numerical Methods
1. General Methods The methods discussed in Section 19.1.1, p. 881, can be used to find real roots of polynomial equations. The Newton method is well suited for polynomial equations because of its fast convergence and the fact that the values of f(x_n) and f'(x_n) can be computed easily by using Horner's rule. Assuming that an approximation x_n of the root x* of a polynomial equation f(x) = 0 is sufficiently good, the correction term delta = x* - x_n can be improved iteratively by using the corresponding fixed-point equation.
2. Special Methods The Bairstow method is well suited for finding root pairs, especially complex conjugate pairs of roots. The Horner scheme (19.18a-d) is used to find quadratic factors of the given polynomial containing the root pair, i.e., to determine the coefficients p and q which make the coefficients r_1 and r_0 of the linear remainder equal to zero (see [19.29], [19.11], [19.30]). If the computation of the root with largest or smallest absolute value is required, then the Bernoulli method is the choice (see [19.19]). The Graeffe method has some historical importance. It gives all roots simultaneously, including complex conjugate roots; however, the computational costs are tremendous (see [19.11], [19.30]).
19.2 Numerical Solution of Equation Systems
In several practical problems we have m conditions for the n unknown quantities x_i in the form of equations:
F_i(x_1, x_2, ..., x_n) = 0   (i = 1, 2, ..., m).   (19.23)
We have to determine the unknowns x_i so that they form a solution of the equation system (19.23). Mostly m = n holds, i.e., the number of unknowns and the number of equations are equal to each other. In the case of m > n we call (19.23) an overdetermined system; in the case of m < n it is an underdetermined system. Overdetermined systems usually have no solution. Then we look for the "best" solution of (19.23), in the Euclidean metric with the least squares method
sum_{i=1}^m F_i^2(x_1, x_2, ..., x_n) = min!,   (19.24)
or in other metrics as another extreme value problem. Usually, the values of n - m variables of an underdetermined problem can be chosen freely, so the solution of (19.23) depends on n - m parameters. We call it an (n - m)-dimensional manifold of solutions. We distinguish between linear and non-linear equation systems, depending on whether the equations are linear or non-linear in the unknowns.
19.2.1 Systems of Linear Equations
Consider the linear equation system
a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1,
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2,
. . .
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n.   (19.25)
The system (19.25) can be written in matrix form
Ax = b   (19.26a)
with A = (a_ik), x = (x_1, ..., x_n)^T, b = (b_1, ..., b_n)^T.   (19.26b)
Suppose the quadratic matrix A = (a_ik) (i, k = 1, 2, ..., n) is regular, so that system (19.25) has a unique solution (see 4.4.2.1, 2., p. 273). For the practical solution of (19.25) we distinguish between two types of solution methods:
1. Direct Methods are based on elementary transformations, from which the solution can be obtained immediately. These are the pivoting techniques (see 4.4.1.2, p. 271) and the methods given in 19.2.1.1-19.2.1.3.
2. Iteration Methods start with a known initial approximation of the solution and form a sequence of approximations that converges to the solution of (19.25) (see 19.2.1.4, p. 892).
19.2.1.1 Triangular Decomposition of a Matrix
1. Principle of the Gauss Elimination Method By the elementary transformations
1. interchanging rows,
2. multiplying a row by a non-zero number and
3. adding a multiple of a row to another row,
the system Ax = b is transformed into the so-called row echelon form
Rx = c   (19.27)
with an upper triangular matrix R. Since only equivalent transformations were made, the equation system Rx = c has the same solutions as Ax = b. From (19.27) we get
x_n = c_n / r_nn,   x_i = (c_i - sum_{k=i+1}^n r_ik x_k) / r_ii   (i = n-1, n-2, ..., 1).   (19.28)
The rule given in (19.28) is called backward substitution, since the equations of (19.27) are used in the opposite order to how they follow each other. The transition from A to R is made by n - 1 so-called elimination steps, whose procedure is shown by the first step, which transforms the matrix A into the matrix A_1 of (19.29): the first column of A_1 contains the pivot in the first row and zeros below it.
We proceed as follows:
1. We choose an element a_r1 != 0 (according to (19.33)). If there is none, we stop: A is singular. Otherwise a_r1 is called the pivot.
2. We interchange the first and the r-th row of A. The result is the matrix Abar.
3. We subtract l_i1 = abar_i1 / abar_11 (i = 2, 3, ..., n) times the first row from the i-th row of the matrix Abar. As a result we get the matrix A_1, and analogously the new right-hand side b_1, with the elements
a_ik^(1) = abar_ik - l_i1 abar_1k,   b_i^(1) = bbar_i - l_i1 bbar_1   (i, k = 2, 3, ..., n).   (19.30)
The framed submatrix in A_1 (see (19.29)) is of type (n - 1, n - 1) and is handled analogously to A, etc. This method is called the Gaussian elimination method or the Gauss algorithm (see 4.4.2.4, p. 276).
2. Triangular Decomposition The result of the Gauss elimination method can be formulated as follows: For every regular matrix A there exists a so-called triangular decomposition or LU factorization of the form
PA = LR   (19.31)
with
R = (r_ik), r_ik = 0 for i > k (upper triangular), and L = (l_ik), l_ii = 1, l_ik = 0 for i < k (lower triangular with unit diagonal).   (19.32)
Here R is called an upper triangular matrix, L a lower triangular matrix and P a so-called permutation matrix. A permutation matrix is a quadratic matrix which has exactly one 1 in every row and every column, while all other elements are zeros. The multiplication PA results in row interchanges in A, which come from the choices of the pivot elements during the elimination procedure. If the Gauss elimination method is carried out in schematic form, where the coefficient matrix and the right-hand side vector are written next to each other (in the so-called extended coefficient matrix), we get:
■ For the matrix A with rows (3, 1, 6), (2, 1, 3), (1, 1, 1), elimination with column pivoting interchanges the second and the third row, i.e.,
P = ( 1 0 0; 0 0 1; 0 1 0 ),   PA = ( 3 1 6; 1 1 1; 2 1 3 ),
and yields the triangular factors
L = ( 1 0 0; 1/3 1 0; 2/3 1/2 1 ),   R = ( 3 1 6; 0 2/3 -1; 0 0 -1/2 ).
In the extended coefficient matrices, the matrices A, A_1 and A_2, and also the pivots, are shown in boxes.
3. Application of Triangular Decomposition With the help of the triangular decomposition, the solution of a linear equation system Ax = b can be described in three steps:
1. PA = LR: determination of the triangular decomposition and substitution Rx = c.
2. Lc = Pb: determination of the auxiliary vector c by forward substitution.
3. Rx = c: determination of the solution x by backward substitution.
If a system of linear equations is handled via the extended coefficient matrix (A, b), as in the above example, by the Gauss elimination method, then the lower triangular matrix L is not needed explicitly. Keeping the decomposition can be especially useful if several systems of linear equations are to be solved one after the other with the same coefficient matrix but different right-hand sides.
4. Choice of the Pivot Elements Theoretically, every non-zero element a_ik^(k-1) of the first column of the matrix A_{k-1} could be used as pivot element at the k-th elimination step. In order to improve the accuracy of the solution (to decrease the accumulated rounding errors of the operations), the following strategies are recommended.
1. Diagonal Strategy The successive diagonal elements are chosen as pivot elements, i.e., there is no row interchange. This choice of the pivot element makes sense if the absolute values of the elements of the main diagonal are fairly large compared to the other elements in the same row.
2. Column Pivoting To perform the k-th elimination step, we choose the row index r for which
|a_rk^(k-1)| = max over i >= k of |a_ik^(k-1)|.   (19.33)
If r != k, then the r-th and the k-th rows are interchanged. It can be proven that this strategy makes the accumulated rounding errors smaller.
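A small Python sketch (not part of the original text) of Gaussian elimination with column pivoting, producing factors with PA = LR as in (19.31); function name and the test matrix of the example above are used for illustration only.

    import numpy as np

    def lu_column_pivot(A):
        # Gaussian elimination with column pivoting, cf. (19.31) and (19.33).
        A = np.array(A, dtype=float)
        n = A.shape[0]
        L, P, R = np.eye(n), np.eye(n), A.copy()
        for k in range(n - 1):
            r = k + np.argmax(np.abs(R[k:, k]))        # pivot row, (19.33)
            if r != k:                                  # interchange rows
                R[[k, r], :] = R[[r, k], :]
                P[[k, r], :] = P[[r, k], :]
                L[[k, r], :k] = L[[r, k], :k]
            for i in range(k + 1, n):
                L[i, k] = R[i, k] / R[k, k]
                R[i, k:] -= L[i, k] * R[k, k:]
        return P, L, R

    A = np.array([[3., 1., 6.], [2., 1., 3.], [1., 1., 1.]])
    P, L, R = lu_column_pivot(A)
    print(np.allclose(P @ A, L @ R))   # True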
19.2.1.2 Cholesky's Method for a Symmetric Coefficient Matrix In several cases, the coefficient matrix A in (19.26a) is not only symmetric but also positive definite, i.e., for the corresponding quadratic form Q(x),
Q(x) = x^T A x = sum_{i=1}^n sum_{k=1}^n a_ik x_i x_k > 0   (19.34)
holds for every x in R^n, x != 0. For every symmetric positive definite matrix A there exists a unique triangular decomposition
A = L L^T   (19.35)
with a lower triangular matrix L, whose elements are computed according to (19.36a), (19.36b) together with the update
a_ij^(k) = a_ij^(k-1) - l_ik l_jk   (i, j = k+1, k+2, ..., n).   (19.36c)
Therefore the solution of the corresponding linear equation system Ax = b can be determined by the Cholesky method in the following steps:
1. A = L L^T: determination of the so-called Cholesky decomposition and substitution L^T x = c.
2. Lc = b: determination of the auxiliary vector c by forward substitution.
3. L^T x = c: determination of the solution x by backward substitution.
For large values of n, the computational cost of the Cholesky method is approximately half of that of the LR decomposition given in (19.31), p. 888.
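The three steps listed above can be sketched in Python as follows (not part of the original text); the example matrix is a hypothetical symmetric positive definite test case.

    import numpy as np
    from scipy.linalg import solve_triangular

    def cholesky_solve(A, b):
        # Solve Ax = b for symmetric positive definite A:
        # A = L L^T, then Lc = b (forward), then L^T x = c (backward).
        L = np.linalg.cholesky(A)                        # (19.35)
        c = solve_triangular(L, b, lower=True)           # forward substitution
        return solve_triangular(L.T, c, lower=False)     # backward substitution

    A = np.array([[4., 2., 2.], [2., 5., 3.], [2., 3., 6.]])
    b = np.array([1., 2., 3.])
    x = cholesky_solve(A, b)
    print(np.allclose(A @ x, b))   # True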
19.2.1.3 Orthogonalization Method
1. Linear Fitting Problem Suppose an overdetermined linear equation system
sum_{k=1}^n a_ik x_k = b_i   (i = 1, 2, ..., m; m > n)   (19.37)
is given in matrix form
Ax = b.   (19.38)
Suppose the coefficient matrix A = (a_ik) of size (m x n) has full rank n, i.e., its columns are linearly independent. Since an overdetermined linear equation system usually has no solution, instead of (19.37) we consider the so-called error equations
r_i = sum_{k=1}^n a_ik x_k - b_i   (i = 1, 2, ..., m)   (19.39)
with residues r_i, and we require that the sum of their squares be minimal:
F(x_1, x_2, ..., x_n) = sum_{i=1}^m r_i^2 = min.   (19.40)
The problem (19.40) is called a linear fitting problem or a linear least squares problem (see also 6.2.5.5, p. 401). The necessary condition for the relative minimum of the sum of residual squares F(x_1, x_2, ..., x_n) is
dF/dx_k = 0   (k = 1, 2, ..., n).   (19.41)
Condition (19.41) leads to the so-called normal equations
A^T A x = A^T b.   (19.42)
Numerically it is preferable to transform the error equations (19.39) by an orthogonal decomposition
A = QR   with Q^T Q = E and an upper triangular matrix R = (r_ik), r_ik = 0 for i > k,   (19.44)
since an orthogonal transformation changes the residual vector into a vector with components r~_1, ..., r~_m (19.46) without changing the sum of the squares of the residuals. From (19.46) it follows that the sum of the squares is minimal for r~_1 = r~_2 = ... = r~_n = 0, and the minimum value is equal to the sum of the squares of r~_{n+1} to r~_m. We get the required solution x by backward substitution from
Rx = b^,   (19.47)
where b^ is the vector with components b^_1, ..., b^_n obtained from (19.46). There are two methods most often used for the stepwise transformation of (19.39) into (19.46): 1. the Givens transformation, 2. the Householder transformation. The first one produces the QR decomposition of the matrix A by rotations, the other one by reflections. The numerical implementations can be found in [19.23]. Practical problems in linear mean square approximation are solved mostly by the Householder transformation, where the frequently occurring special band structure of the coefficient matrix A can be exploited.
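A minimal Python sketch (not part of the original text) of solving the linear fitting problem via a QR decomposition as in (19.44) and (19.47); the data of the straight-line fit are hypothetical.

    import numpy as np

    def lstsq_qr(A, b):
        # Solve the linear least squares problem (19.40) via A = QR, then R x = Q^T b.
        Q, R = np.linalg.qr(A)                 # reduced QR of the (m x n) matrix A
        return np.linalg.solve(R, Q.T @ b)

    # hypothetical overdetermined example: fit a straight line to four points
    A = np.array([[1., 0.], [1., 1.], [1., 2.], [1., 3.]])
    b = np.array([0.1, 0.9, 2.1, 2.9])
    x = lstsq_qr(A, b)
    print(x)                                                        # intercept and slope
    print(np.allclose(x, np.linalg.lstsq(A, b, rcond=None)[0]))     # True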
19.2.1.4 Iteration Methods
1. Jacobi Method Suppose that in the coefficient matrix of the linear equation system (19.25) every diagonal element a_ii (i = 1, 2, ..., n) is different from zero. Then the i-th row can be solved for the unknown x_i, and we immediately get the following iteration rule, where mu is the iteration index:
x_i^(mu+1) = (1/a_ii) (b_i - sum_{k != i} a_ik x_k^(mu))   (i = 1, 2, ..., n)   (19.48)
(mu = 0, 1, 2, ...; x_1^(0), x_2^(0), ..., x_n^(0) are given initial values).
Formula (19.48) is called the Jacobi method. Every component of the new vector x^(mu+1) is calculated from the components of x^(mu). If at least one of the conditions
max over k of sum_{i != k} |a_ik| / |a_kk| < 1   (column sum criterion)   (19.49)
or
max over i of sum_{k != i} |a_ik| / |a_ii| < 1   (row sum criterion)   (19.50)
holds, then the Jacobi method converges for any initial vector x^(0).
2. Gauss-Seidel Method If the first component x_1^(mu+1) has been calculated by the Jacobi method, then this value can already be used in the calculation of x_2^(mu+1). Proceeding similarly in the calculation of the further components, we get the iteration formula
x_i^(mu+1) = (1/a_ii) (b_i - sum_{k < i} a_ik x_k^(mu+1) - sum_{k > i} a_ik x_k^(mu))   (19.51)
(i = 1, 2, ..., n; x_1^(0), x_2^(0), ..., x_n^(0) given initial values; mu = 0, 1, 2, ...).
Formula (19.51) is called the Gauss-Seidel method. The Gauss-Seidel method usually converges more quickly than the Jacobi method, but its convergence criterion is more complicated.
■ 10x_1 - 3x_2 - 4x_3 + 2x_4 = 14,
  -3x_1 + 26x_2 + 5x_3 -  x_4 = 22,
  -4x_1 + 5x_2 + 16x_3 + 5x_4 = 17,
   2x_1 + 3x_2 - 4x_3 - 12x_4 = -20.
The corresponding iteration formula according to (19.51) is:
x_1^(mu+1) = (1/10)(14 + 3x_2^(mu) + 4x_3^(mu) - 2x_4^(mu)),
x_2^(mu+1) = (1/26)(22 + 3x_1^(mu+1) - 5x_3^(mu) + x_4^(mu)),
x_3^(mu+1) = (1/16)(17 + 4x_1^(mu+1) - 5x_2^(mu+1) - 5x_4^(mu)),
x_4^(mu+1) = (1/12)(20 + 2x_1^(mu+1) + 3x_2^(mu+1) - 4x_3^(mu+1)).
3. Relaxation Method The iteration formula of the Gauss-Seidel method (19.51) can be written in the so-called correction form
x_i^(mu+1) = x_i^(mu) + d_i^(mu)   (i = 1, 2, ..., n; mu = 0, 1, 2, ...).   (19.52)
By an appropriate choice of a relaxation parameter omega and rewriting (19.52) in the form
x_i^(mu+1) = x_i^(mu) + omega d_i^(mu)   (i = 1, 2, ..., n; mu = 0, 1, 2, ...)   (19.53)
we can try to improve the speed of convergence. It can be shown that convergence is possible only for
0 < omega < 2.   (19.54)
For omega = 1 we retrieve the Gauss-Seidel method. The case omega > 1 is called overrelaxation, and the corresponding iteration method is called the SOR method (successive overrelaxation). The determination of an optimal relaxation parameter is possible only for some special types of matrices. We apply iterative methods for solving linear equation systems primarily when the main diagonal elements a_ii of the coefficient matrix have absolute values much larger than the other elements a_ik (i != k) in the same row or column, or when the rows of the equation system can be rearranged to produce such a form.
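The following Python sketch (not part of the original text) implements the Gauss-Seidel/SOR iteration (19.51)-(19.53) and applies it to the 4x4 example system above; function name and tolerances are assumptions.

    import numpy as np

    def sor(A, b, omega=1.0, tol=1e-10, max_iter=200):
        # Gauss-Seidel for omega = 1, SOR for 1 < omega < 2, cf. (19.51)-(19.53)
        n = len(b)
        x = np.zeros(n)
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                s = b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]
                x[i] += omega * (s / A[i, i] - x[i])
            if np.max(np.abs(x - x_old)) < tol:
                break
        return x

    A = np.array([[10., -3., -4.,  2.],
                  [-3., 26.,  5., -1.],
                  [-4.,  5., 16.,  5.],
                  [ 2.,  3., -4., -12.]])
    b = np.array([14., 22., 17., -20.])
    x = sor(A, b, omega=1.0)        # Gauss-Seidel
    print(np.allclose(A @ x, b))    # True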
19.2.2 Non-Linear Equation Systems Suppose the system of n non-linear equations
F_i(x_1, x_2, ..., x_n) = 0   (i = 1, 2, ..., n)   (19.55)
for the n unknowns x_1, x_2, ..., x_n has a solution. Usually, a numerical solution can be obtained only by an iteration method.
19.2.2.1 Ordinary Iteration Method We can use the ordinary iteration method if the equations (19.55) can be transformed into the fixed-point form
x_i = f_i(x_1, x_2, ..., x_n)   (i = 1, 2, ..., n).   (19.56)
Then, starting from estimated approximations x_1^(0), x_2^(0), ..., x_n^(0), we get improved values either by
1. iteration with simultaneous steps,
x_i^(mu+1) = f_i(x_1^(mu), x_2^(mu), ..., x_n^(mu))   (i = 1, 2, ..., n; mu = 0, 1, 2, ...),   (19.57)
or by
2. iteration with sequential steps,
x_i^(mu+1) = f_i(x_1^(mu+1), ..., x_{i-1}^(mu+1), x_i^(mu), ..., x_n^(mu))   (i = 1, 2, ..., n; mu = 0, 1, 2, ...).   (19.58)
It is of crucial importance for the convergence of this method that in the neighborhood of the solution the functions f_i depend only weakly on the unknowns, i.e., if the f_i are differentiable, the absolute values of the partial derivatives must be rather small. We get as a convergence condition
K < 1 with K = max over i of sum_{k=1}^n max |df_i/dx_k|,   (19.59)
where the inner maximum is taken over the neighborhood of the solution under consideration. With this quantity K, the error estimate is
|x_i^(mu+1) - x_i| <= (K/(1 - K)) max over j of |x_j^(mu+1) - x_j^(mu)|.   (19.60)
Here x_i is a component of the required solution, and x_i^(mu) and x_i^(mu+1) are the corresponding mu-th and (mu+1)-th approximations.
19.2.2.2 Newton's Method The Newton method is used for the problem given in the form (19.55). After finding initial approximation values x_1^(0), x_2^(0), ..., x_n^(0), the functions F_i are expanded in Taylor form as functions of the n independent variables x_1, x_2, ..., x_n (see p. 415). Terminating the expansion after the linear terms, we get from (19.55) a linear equation system, and with it we can compute iterative improvements by the formula
sum_{k=1}^n (dF_i/dx_k)(x^(mu)) (x_k^(mu+1) - x_k^(mu)) = -F_i(x^(mu))   (i = 1, 2, ..., n; mu = 0, 1, 2, ...).   (19.61)
The coefficient matrix of the linear equation system (19.61), which has to be solved in every iteration step, is
F'(x) = (dF_i/dx_k)   (i, k = 1, 2, ..., n),   (19.62)
and it is called the Jacobian matrix. If the Jacobian matrix is invertible in a neighborhood of the solution, the Newton method is locally quadratically convergent, i.e., its convergence essentially depends on how good the initial approximations are. If we substitute x_k^(mu+1) - x_k^(mu) = d_k^(mu) in (19.61), then the Newton method can be written in the correction form
x_i^(mu+1) = x_i^(mu) + d_i^(mu)   (i = 1, 2, ..., n; mu = 0, 1, 2, ...).   (19.63)
To reduce the sensitivity to the initial values, analogously to the relaxation method, we can introduce a so-called damping or step length parameter gamma:
x_i^(mu+1) = x_i^(mu) + gamma d_i^(mu)   (i = 1, 2, ..., n; mu = 0, 1, 2, ...; gamma > 0).   (19.64)
Methods to determine gamma can be found in [19.24].
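A minimal Python sketch (not part of the original text) of the Newton iteration (19.61)-(19.64) for systems; the intersection problem used as a test case is a hypothetical example.

    import numpy as np

    def newton_system(F, J, x0, tol=1e-12, max_iter=50, gamma=1.0):
        # Solve F(x) = 0: in each step solve J(x) d = -F(x), cf. (19.61),
        # and update x <- x + gamma*d as in (19.63)/(19.64).
        x = np.array(x0, dtype=float)
        for _ in range(max_iter):
            d = np.linalg.solve(J(x), -F(x))
            x += gamma * d
            if np.max(np.abs(d)) < tol:
                break
        return x

    # hypothetical example: intersection of x^2 + y^2 = 4 with y = exp(x) - 1
    F = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, np.exp(v[0]) - 1.0 - v[1]])
    J = lambda v: np.array([[2*v[0], 2*v[1]], [np.exp(v[0]), -1.0]])
    print(newton_system(F, J, x0=[1.0, 1.0]))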
19.2.2.3 Derivative-Free Gauss-Newton Method To solve the least squares problem (19.24) in the non-linear case, we proceed iteratively as follows:
1. Starting from a suitable initial approximation x_1^(0), x_2^(0), ..., x_n^(0), we approximate the non-linear functions F_i(x_1, x_2, ..., x_n) (i = 1, 2, ..., m), as in the Newton method (see (19.61)), by linear approximations F~_i(x_1, x_2, ..., x_n), which are calculated in every iteration step according to
F~_i(x) = F_i(x^(mu)) + sum_{k=1}^n (dF_i/dx_k)(x^(mu)) (x_k - x_k^(mu))   (i = 1, 2, ..., m; mu = 0, 1, 2, ...).   (19.65)
2. We substitute x_k - x_k^(mu) = d_k^(mu) in (19.65) and determine the corrections d_k^(mu) by the Gaussian least squares method, i.e., by the solution of the linear least squares problem
sum_{i=1}^m F~_i^2(x_1, ..., x_n) = min,   (19.66)
e.g., with the help of the normal equations (see (19.42)) or the Householder method (see 19.6.2.2, p. 918).
3. We get approximations for the required solution by
x_k^(mu+1) = x_k^(mu) + d_k^(mu)   (19.67a)
or
x_k^(mu+1) = x_k^(mu) + gamma d_k^(mu)   (k = 1, 2, ..., n),   (19.67b)
where gamma (gamma > 0) is a step length parameter similar to that of the Newton method.
By repeating steps 2 and 3 with x^(mu+1) instead of x^(mu) we get the Gauss-Newton method. It produces a sequence of approximation values whose convergence strongly depends on the accuracy of the initial approximation. We can reduce the sum of the error squares by introducing the step length parameter gamma.
If the evaluation of the partial derivatives dF_i/dx_k (x^(mu)) (i = 1, 2, ..., m; k = 1, 2, ..., n) requires too much work, we can approximate the partial derivatives by difference quotients:
dF_i/dx_k (x^(mu)) ≈ [F_i(x_1^(mu), ..., x_k^(mu) + h_k^(mu), ..., x_n^(mu)) - F_i(x_1^(mu), ..., x_k^(mu), ..., x_n^(mu))] / h_k^(mu)
(i = 1, 2, ..., m; k = 1, 2, ..., n; mu = 0, 1, 2, ...).   (19.68)
The so-called discretization step sizes h_k^(mu) may depend on the iteration step and on the values of the variables. If we use the approximations (19.68), then we have to calculate only function values F_i while performing the Gauss-Newton method, i.e., the method is derivative-free.
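A Python sketch (not part of the original text) of the derivative-free Gauss-Newton method with the forward difference quotients (19.68); the exponential-fit data are hypothetical and the constant step size h is an assumption.

    import numpy as np

    def gauss_newton(F, x0, h=1e-6, gamma=1.0, tol=1e-10, max_iter=100):
        # Corrections d solve the linear least squares problem (19.66);
        # the Jacobian is approximated by difference quotients (19.68).
        x = np.array(x0, dtype=float)
        for _ in range(max_iter):
            Fx = F(x)
            J = np.empty((Fx.size, x.size))
            for k in range(x.size):
                xk = x.copy()
                xk[k] += h
                J[:, k] = (F(xk) - Fx) / h           # difference quotient (19.68)
            d = np.linalg.lstsq(J, -Fx, rcond=None)[0]
            x += gamma * d
            if np.max(np.abs(d)) < tol:
                break
        return x

    # hypothetical data-fitting example: F_i(a, b) = a*exp(b*t_i) - y_i
    t = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([2.1, 1.1, 0.6, 0.35])
    F = lambda p: p[0] * np.exp(p[1] * t) - y
    print(gauss_newton(F, x0=[2.0, -0.5]))           # fitted parameters a, b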
19.3 Numerical Integration
19.3.1 General Quadrature Formulas The numerical evaluation of the definite integral
I(f) = integral from a to b of f(x) dx   (19.69)
can only be done approximately if the integrand f(x) cannot be integrated by elementary calculus, or is too complicated, or when the function is known only at certain points x_nu, the so-called interpolation nodes, from the integration interval [a, b]. For the approximate calculation of (19.69) we use so-called quadrature formulas. They have the general form
Q(f) = sum_{nu=0}^n c_0nu y_nu + sum_{nu=0}^n c_1nu y_nu' + ... + sum_{nu=0}^n c_pnu y_nu^(p)   (19.70)
with y_nu^(p) = f^(p)(x_nu) for the derivative orders 1, ..., p and nu = 0, 1, ..., n, y_nu = f(x_nu), and constant values c_inu. Obviously,
I(f) = Q(f) + R,   (19.71)
where R is the error of the quadrature formula. In the application of quadrature formulas we suppose that the required values of the integrand f(x) and of its derivatives at the interpolation nodes are known as numerical values. Formulas using only the values of the function are called mean value formulas; formulas also using the derivatives are called Hermite quadrature formulas.
19.3.2 Interpolation Quadratures The following formulas represent so-called interpolation quadratures. Here the integrand f(x) is interpolated at certain interpolation nodes (possibly the least number of them) by a polynomial p(x) of corresponding degree, and the integral of f(x) is replaced by that of p(x). The formula for the integral over the entire interval is obtained by summation. In the following we give the formulas for the most practical cases. The interpolation nodes are equidistant:
x_nu = x_0 + nu*h   (nu = 0, 1, 2, ..., n),   x_0 = a,   x_n = b,   h = (b - a)/n.   (19.72)
We give an upper bound for the magnitude of the error |R| for every quadrature formula. Here M_p denotes an upper bound of |f^(p)(x)| on the entire domain.
19.3.2.1 Rectangular Formula In the interval [x_0, x_0 + h], f(x) is replaced by the constant function y = y_0 = f(x_0), which interpolates f(x) at the interpolation node x_0, the left endpoint of the integration interval. We get in this way the simple rectangular formula
integral from x_0 to x_0+h of f(x) dx ≈ h*y_0,   |R| <= (h^2/2) M_1.   (19.73a)
By summation we get the left-sided rectangular formula
integral from a to b of f(x) dx ≈ h (y_0 + y_1 + ... + y_{n-1}).   (19.73b)
M_1 denotes an upper bound of |f'(x)| on the entire domain of integration. We get analogously the right-sided rectangular sum if we replace y_0 by y_1 in (19.73a). The formula is
integral from a to b of f(x) dx ≈ h (y_1 + y_2 + ... + y_n).   (19.74)
19.3.2.2 Trapezoidal Formula f(x) is replaced by a polynomial of first degree in the interval [x_0, x_0 + h], which interpolates f(x) at the interpolation nodes x_0 and x_1 = x_0 + h. We get:
integral from x_0 to x_0+h of f(x) dx ≈ (h/2)(y_0 + y_1).   (19.75)
By summation we get the so-called trapezoidal formula
integral from a to b of f(x) dx ≈ h (y_0/2 + y_1 + y_2 + ... + y_{n-1} + y_n/2),   |R| <= ((b - a)/12) h^2 M_2.   (19.76)
M_2 denotes an upper bound of |f''(x)| on the entire integration domain. The error of the trapezoidal formula is proportional to h^2, i.e., the trapezoidal sum has an error of order 2. It follows that it converges to the definite integral for h -> 0 (hence n -> infinity), if we do not consider rounding errors.
19.3.2.3 Simpson's Formula f(x) is replaced by a polynomial of second degree in the interval [x_0, x_0 + 2h], which interpolates f(x) at the interpolation nodes x_0, x_1 = x_0 + h and x_2 = x_0 + 2h:
integral from x_0 to x_0+2h of f(x) dx ≈ (h/3)(y_0 + 4y_1 + y_2).   (19.77)
By summation (n even) we get the Simpson formula
integral from a to b of f(x) dx ≈ (h/3)(y_0 + 4y_1 + 2y_2 + 4y_3 + ... + 2y_{n-2} + 4y_{n-1} + y_n),   |R| <= ((b - a)/180) h^4 M_4.   (19.78)
M_4 is an upper bound for |f^(4)(x)| on the entire integration domain. The Simpson formula has an error of order 4 and it is exact for polynomials up to third degree.
19.3.2.4 Hermite's Trapezoidal Formula
f(x) is replaced by a polynomial of third degree in the interval [x_0, x_0 + h], which interpolates f(x) and f'(x) at the interpolation nodes x_0 and x_1 = x_0 + h:
integral from x_0 to x_0+h of f(x) dx ≈ (h/2)(y_0 + y_1) + (h^2/12)(y_0' - y_1').   (19.79)
By summation we get the Hermite trapezoidal formula
integral from a to b of f(x) dx ≈ h (y_0/2 + y_1 + ... + y_{n-1} + y_n/2) + (h^2/12)(y_0' - y_n').   (19.80)
M_4 denotes an upper bound for |f^(4)(x)| on the entire integration domain. The Hermite trapezoidal formula has an error of order 4 and it is exact for polynomials up to third degree.
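A Python sketch (not part of the original text) of the composite trapezoidal and Simpson formulas (19.76), (19.78); the test integrand sin(x)/x anticipates the worked example of 19.3.4, so the printed values can be compared with the numbers quoted there.

    import numpy as np

    def trapezoid(f, a, b, n):
        # composite trapezoidal formula (19.76); n = number of subintervals
        x = np.linspace(a, b, n + 1)
        y = f(x)
        h = (b - a) / n
        return h * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

    def simpson(f, a, b, n):
        # composite Simpson formula (19.78); n must be even
        x = np.linspace(a, b, n + 1)
        y = f(x)
        h = (b - a) / n
        return h / 3 * (y[0] + 4 * y[1:-1:2].sum() + 2 * y[2:-2:2].sum() + y[-1])

    f = lambda x: np.sinc(x / np.pi)      # sin(x)/x, continuous at x = 0
    print(trapezoid(f, 0.0, 1.0, 8))      # approx 0.94569086
    print(simpson(f, 0.0, 1.0, 8))        # approx 0.94608331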
19.3.3 Quadrature Formulas of Gauss Quadrature formulas of Gauss have the general form
integral from a to b of f(x) dx ≈ sum_{nu=0}^n c_nu f(x_nu),   (19.81)
where not only the coefficients c_nu are considered as parameters but also the interpolation nodes x_nu. These parameters are determined so as to make the formula (19.81) exact for polynomials of the highest possible degree. The quadrature formulas of Gauss give very accurate approximations, but the interpolation nodes must be chosen in a very special way.
19.3.3.1 Gauss Quadrature Formulas If the integration interval in (19.81) is chosen as [a, b] = [-1, 1], and we choose the interpolation nodes as the roots of the Legendre polynomials (see 9.1.2.6, 3., p. 509, 21.10, p. 1060), then the coefficients c_nu can be determined so that the formula (19.81) gives the exact value for polynomials up to degree 2n + 1. The roots of the Legendre polynomials are symmetric with respect to the origin. For the cases n = 1, 2 and 3 we get:
n = 1:  x_0 = -x_1,  x_1 = 1/sqrt(3) = 0.577350269...,  c_0 = c_1 = 1.
n = 2:  x_0 = -x_2,  x_1 = 0,  x_2 = sqrt(3/5) = 0.774596669...,  c_0 = c_2 = 5/9,  c_1 = 8/9.
n = 3:  x_0 = -x_3,  x_1 = -x_2,  x_2 = 0.339981043...,  x_3 = 0.861136311...,
        c_0 = c_3 = 0.347854845...,  c_1 = c_2 = 0.652145155... .   (19.82)
Remark: A general integration interval [a, b] can be transformed into [-1, 1] by the transformation
t = ((b - a)/2) x + (a + b)/2   (t in [a, b], x in [-1, 1]).
Then
integral from a to b of f(t) dt = ((b - a)/2) sum_{nu=0}^n c_nu f(((b - a)/2) x_nu + (a + b)/2)   (19.83)
with the values x_nu and c_nu given above for the interval [-1, 1].
19.3.3.2 Lobatto's Quadrature Formulas In some cases it is reasonable also to choose the endpoints of the integration interval as interpolation nodes. Then we have only 2n free parameters in (19.81). These values can be determined so that polynomials up to degree 2n - 1 are integrated exactly. We get for the cases n = 2 and n = 3:
n = 2:  x_0 = -1,  x_1 = 0,  x_2 = 1;  c_0 = c_2 = 1/3,  c_1 = 4/3.   (19.84a)
n = 3:  x_0 = -1,  x_1 = -x_2 = -1/sqrt(5) = -0.447213595...,  x_3 = 1;  c_0 = c_3 = 1/6,  c_1 = c_2 = 5/6.   (19.84b)
The case n = 2 represents the Simpson formula.
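The Gauss formula (19.81) with the transformation (19.83) can be sketched in Python as follows (not part of the original text); the Legendre nodes and weights are taken from numpy rather than from the table (19.82).

    import numpy as np

    def gauss_legendre(f, a, b, npoints):
        # Gauss quadrature (19.81) on [a, b] via the transformation (19.83)
        x, c = np.polynomial.legendre.leggauss(npoints)
        t = (b - a) / 2 * x + (a + b) / 2
        return (b - a) / 2 * np.sum(c * f(t))

    f = lambda x: np.sinc(x / np.pi)          # sin(x)/x
    for npoints in (2, 3, 4):
        print(npoints, gauss_legendre(f, 0.0, 1.0, npoints))
    # with 4 points the 8-digit value 0.94608307 quoted in 19.3.4 is reproduced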
19.3.4 Method of Romberg To increase the accuracy of numerical integration, the method of Romberg can be recommended; it starts with a sequence of trapezoidal sums obtained by repeated halving of the integration step size.
19.3.4.1 Algorithm of the Romberg Method The method consists of the following steps:
1. Determination of Trapezoidal Sums We determine the trapezoidal sums T(h_i) according to (19.76) as approximations of the integral from a to b of f(x) dx with the step sizes
h_i = (b - a)/2^i   (i = 0, 1, 2, ..., m).   (19.85)
Here we use the recursive relation
T(h_i) = (1/2) T(h_{i-1}) + h_i [ f(a + h_i) + f(a + 3h_i) + f(a + 5h_i) + ... + f(b - h_i) ].   (19.86)
The recursion formula (19.86) shows that for the calculation of T(h_i) from T(h_{i-1}) we need to compute the function values only at the new interpolation nodes.
2. Triangular Scheme We substitute T_0i = T(h_i) (i = 0, 1, 2, ...) and calculate recursively the values
T_ki = T_{k-1,i} + (T_{k-1,i} - T_{k-1,i-1}) / (4^k - 1)   (k = 1, 2, ..., m; i = k, k+1, ...).   (19.87)
The most practical arrangement of the values calculated according to (19.87) is a triangular scheme, whose elements are calculated column by column:
T(h_0) = T_00
T(h_1) = T_01   T_11
T(h_2) = T_02   T_12   T_22
T(h_3) = T_03   T_13   T_23   T_33   (19.88)
...
The scheme is continued downwards (with a fixed number of columns) until the lower values at the right are almost the same. The values T_1i (i = 1, 2, ...) of the second column correspond to those calculated by the Simpson formula.
19.3.4.2 Extrapolation Principle The Romberg method is an application of the so-called extrapolation principle. We demonstrate this by deriving the formula (19.87) for the case k = 1. We denote by I the required integral and by T(h) the corresponding trapezoidal sum (19.76). If the integrand of I is (2m + 2) times continuously differentiable in the integration interval, then it can be shown that an asymptotic expansion with respect to h is valid for the error R of the quadrature formula, and it has the form
R(h) = I - T(h) = a_1 h^2 + a_2 h^4 + ... + a_m h^{2m} + O(h^{2m+2})   (19.89a)
or
T(h) = I - a_1 h^2 - a_2 h^4 - ... - a_m h^{2m} + O(h^{2m+2}).   (19.89b)
The coefficients a_1, a_2, ..., a_m are constants independent of h. We form T(h) and T(h/2) according to (19.89b) and consider the linear combination
T_1(h) = alpha_1 T(h) + alpha_2 T(h/2).   (19.90)
If we choose alpha_1 + alpha_2 = 1 and alpha_1 + alpha_2/4 = 0, then T_1(h) has an error of order 4, while T(h) and T(h/2) both have errors of order only 2. We get
T_1(h) = (4 T(h/2) - T(h)) / 3.   (19.91)
This is the formula (19.87) for k = 1. Repeated application of the above procedure results in the approximations T_ki according to (19.87), and
T_ki = I + O(h_i^{2k+2}).   (19.92)
■ The definite integral I = integral from 0 to 1 of (sin x)/x dx (integral sine, see 8.2.5, 1., p. 458) cannot be obtained in an elementary way. We calculate approximate values of this integral (working with 8 digits). The last rows of the Romberg scheme are:
             k = 0         k = 1         k = 2         k = 3
h = 1/4:   0.94451352   0.94608693   0.94608300
h = 1/8:   0.94569086   0.94608331   0.94608307   0.94608307
The Romberg method results in the approximation value 0.94608307. The value calculated to 10 digits is 0.9460830704, so the error order O((1/8)^8) ≈ 6*10^-8 according to (19.92) is confirmed.
2. Trapezoidal and Simpson Formulas: We can read directly from the scheme of the Romberg method that for h_3 = 1/8 the trapezoidal formula gives the approximation value 0.94569086 and the Simpson formula the value 0.94608331. The correction of the trapezoidal formula by Hermite according to (19.79) results in the value
I ≈ 0.94569086 + 0.30116868/(64 * 12) = 0.94608301.
3. Gauss Formulas: By formula (19.83) with a = 0, b = 1 we get
n = 1:  I ≈ (1/2)[c_0 f(x_0/2 + 1/2) + c_1 f(x_1/2 + 1/2)] = 0.94604113;
n = 2:  I ≈ (1/2)[c_0 f(x_0/2 + 1/2) + c_1 f(x_1/2 + 1/2) + c_2 f(x_2/2 + 1/2)] = 0.94608313;
n = 3:  I ≈ (1/2)[c_0 f(x_0/2 + 1/2) + ... + c_3 f(x_3/2 + 1/2)] = 0.94608307.
We see that the Gauss formula results in an 8-digit exact approximation value already for n = 3, i.e., with only four function values. With the trapezoidal rule this accuracy would need a very large number (> 1000) of function values.
Remarks: 1. Fourier analysis has an important role in integrating periodic functions (see 7.4.1.1, 1., p. 418). The details of numerical realizations can be found under the title of harmonic analysis (see 19.6.4, p. 924). The actual computations are based on the so-called Fast Fourier Transform FFT (see 19.6.4.2, p. 925).
2. In many applications it is useful to take special properties of the integrands into consideration. Further integration routines can be developed for such special cases. A large variety of convergence properties, error analyses, and optimal integration formulas is discussed in the literature (see, e.g., [19.4]).
3. Numerical methods for computing the values of multiple integrals are discussed in the literature (see, e.g., [19.26]).
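The complete Romberg scheme (19.85)-(19.88) can be sketched in Python as follows (not part of the original text); the function name and the output formatting are free choices, and the test integrand is the integral sine example above.

    import numpy as np

    def romberg(f, a, b, m):
        # trapezoidal sums with halved step sizes (19.85), (19.86)
        # and extrapolation (19.87); returns the rows of the triangular scheme
        rows = []
        n = 1
        T = (b - a) / 2 * (f(a) + f(b))              # T(h_0)
        rows.append([T])
        for i in range(1, m + 1):
            n *= 2
            h = (b - a) / n
            new_nodes = a + h * np.arange(1, n, 2)   # only the new nodes
            T = T / 2 + h * np.sum(f(new_nodes))     # recursion (19.86)
            row = [T]
            for k in range(1, i + 1):
                row.append(row[k-1] + (row[k-1] - rows[i-1][k-1]) / (4**k - 1))  # (19.87)
            rows.append(row)
        return rows

    f = lambda x: np.sinc(x / np.pi)                 # sin(x)/x
    for row in romberg(f, 0.0, 1.0, 3):
        print(["%.8f" % t for t in row])
    # the last row ends with 0.94608307, as in the scheme above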
19.4 Approximate Integration of Ordinary Differential Equations
In many cases, the solution of an ordinary differential equation cannot be given in closed form as an expression of known elementary functions. The solution, which still exists under rather general conditions (see 9.1.1.1, p. 486), must then be determined by numerical methods. These produce only particular solutions, but it is possible to reach high accuracy. Since differential equations of order higher than one can pose either initial value problems or boundary value problems, numerical methods have been developed for both types of problems.
19.4.1 Initial Value Problems
19.4.1.1 Euler Polygon Method The simplest method for the initial value problem
y' = f(x, y) with y(x_0) = y_0   (19.93)
is the Euler polygon method: on the grid x_i = x_0 + ih (i = 0, 1, 2, ...) the approximations y_i ≈ y(x_i) are computed from
y_{i+1} = y_i + h f(x_i, y_i)   (i = 0, 1, 2, ...; y_0 given).   (19.97)
(Figure: the Euler polygon through the points (x_0, y_0), (x_1, y_1), (x_2, y_2), ... approximates the solution curve.)
Comparing (19.97) with the Taylor expansion
y_1 = y(x_0 + h) = y_0 + f(x_0, y_0) h + (h^2/2) y''(xi)   (19.98)
with x_0 < xi < x_0 + h, we see that the approximation y_1 has an error of order h^2. The accuracy can be improved by reducing the step size h. Practical calculations show that halving the step size h roughly halves the error of the approximations y_i.
19.4.1.2 Runge-Kutta Methods
1. Calculation Scheme The basic idea of the Runge-Kutta methods is to use, in addition to the point (x_0, y_0), further auxiliary points between (x_0, y_0) and the possible next point (x_0 + h, y_1) of the curve; depending on the appropriate choice of these additional points we get more accurate values for y_1. We have Runge-Kutta methods of different orders depending on the number and the arrangement of these auxiliary points. Here we show a fourth-order method (see 19.4.1.5, 1., p. 904). (The Euler method is a first-order Runge-Kutta method.) The calculation scheme of fourth order for the step from x_0 to x_1 = x_0 + h, giving an approximate value y_1 for the solution of (19.93), is:
x             y                 k = h f(x, y)
x_0           y_0               k_1
x_0 + h/2     y_0 + k_1/2       k_2
x_0 + h/2     y_0 + k_2/2       k_3
x_0 + h       y_0 + k_3         k_4
x_1 = x_0 + h   y_1 = y_0 + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)   (19.99)
The further steps follow the same scheme. The error of this Runge-Kutta method has order h^5 (at every step), so with an appropriate choice of the step size we can reach high accuracy.
■ y' = (1/4)(x^2 + y^2) with y(0) = 0. We determine y(0.5) in one step, i.e., h = 0.5. One step of the scheme (19.99) gives y_1 ≈ 0.0104186; the exact value to 8 digits is 0.01041860.
2. Remarks
1. For the special differential equation y' = f(x), this Runge-Kutta method becomes the Simpson formula (see 19.3.2.3, p. 897).
2. For a large number of integration steps, a change of the step size is possible or sometimes necessary. The change of the step size can be decided by checking the accuracy: we repeat the last step with the doubled step size 2h. If we have, e.g., the approximate value y_2(h) for y(x_0 + 2h) (calculated with the single step size) and y_2(2h) (calculated with the doubled step size), then we have the estimate (19.100) for the error epsilon(h) = y(x_0 + 2h) - y_2(h). Information about the implementation of step size changes can be found in the literature (see [19.24]).
3. Runge-Kutta methods can easily be applied to higher-order differential equations as well, see [19.24]. Higher-order differential equations can be rewritten as first-order differential equation systems (see p. 495). The approximation methods are then performed as parallel calculations according to (19.99), as the differential equations are coupled to each other.
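A Python sketch (not part of the original text) of the classical fourth-order scheme (19.99), applied to the example above; the function name is a free choice.

    def rk4(f, x0, y0, h, steps):
        # classical fourth-order Runge-Kutta scheme (19.99)
        x, y = x0, y0
        for _ in range(steps):
            k1 = h * f(x, y)
            k2 = h * f(x + h/2, y + k1/2)
            k3 = h * f(x + h/2, y + k2/2)
            k4 = h * f(x + h, y + k3)
            y += (k1 + 2*k2 + 2*k3 + k4) / 6
            x += h
        return y

    f = lambda x, y: 0.25 * (x**2 + y**2)
    print(rk4(f, 0.0, 0.0, 0.5, 1))   # approx 0.0104186 (exact value 0.01041860)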
19.4.1.3 Multi-Step Methods The Euler method (19.97) and the Runge-Kutta method (19.99) are so-called single-step methods, since only y_i is used in the calculation of y_{i+1}. In general, linear multi-step methods have the form
sum_{j=0}^k alpha_j y_{i+j} = h sum_{j=0}^k beta_j f_{i+j}   (j = 0, 1, ..., k; alpha_k = 1).   (19.101)
The formula (19.101), with appropriately chosen constants alpha_j and beta_j, is called a k-step method if |alpha_0| + |beta_0| != 0. It is called explicit if beta_k = 0, since in this case the values f_{i+j} = f(x_{i+j}, y_{i+j}) on the right-hand side of (19.101) contain only the already known approximation values y_i, y_{i+1}, ..., y_{i+k-1}. If beta_k != 0 holds, the method is called implicit, since then the required new value y_{i+k} occurs on both sides of (19.101). In the application of a k-step method we have to know the k initial values y_0, y_1, ..., y_{k-1}. We can get these initial values, e.g., by one-step methods. A special multi-step method for the solution of the initial value problem (19.93) can be derived if we replace the
derivative y'(x_i) in (19.93) by a difference formula (see 9.1.1.5, 1., p. 494), or if we approximate the integral in (19.96) by a quadrature formula (see 19.3.1, p. 895). Examples of special multi-step methods are:
1. Midpoint Rule The derivative y'(x_{i+1}) in (19.93) is replaced by the slope of the secant line between the interpolation nodes x_i and x_{i+2}. We get:
y_{i+2} - y_i = 2h f_{i+1}.   (19.102)
2. Rule of Milne The integral in (19.95) is approximated by the Simpson formula:
y_{i+2} - y_i = (h/3)(f_i + 4f_{i+1} + f_{i+2}).   (19.103)
3. Rule of Adams and Bashforth The integrand in (19.95) is replaced by the Lagrange interpolation polynomial (see 19.6.1.2, p. 915) based on the k interpolation nodes x_i, x_{i+1}, ..., x_{i+k-1}. Integrating between x_{i+k-1} and x_{i+k} gives the explicit formula (19.104) for y_{i+k}. For the calculation of the coefficients beta_j see [19.2].
19.4.1.4 Predictor-Corrector Method In practice, implicit multi-step methods have the great advantage over explicit ones that they allow much larger step sizes at the same accuracy. However, an implicit multi-step method usually requires the solution of a non-linear equation to get the approximation value y_{i+k}. This follows from (19.101) and has the form
y_{i+k} = h beta_k f(x_{i+k}, y_{i+k}) - sum_{j=0}^{k-1} alpha_j y_{i+j} + h sum_{j=0}^{k-1} beta_j f_{i+j}.   (19.105)
Equation (19.105) is solved iteratively. We proceed as follows: An initial value is determined by an explicit formula, the so-called predictor. Then it is corrected by the iteration rule
y_{i+k}^(mu+1) = F(y_{i+k}^(mu))   (mu = 0, 1, 2, ...),   (19.106)
which is called the corrector and comes from the implicit method. Special predictor-corrector formulas are given by (19.107) and (19.108a,b).
The Simpson formula as corrector in (19.108b) is numerically unstable, and it can be replaced, e.g., by
y_{i+1} = 0.9 y_{i-1} + 0.1 y_i + (h/24)(0.1 f_{i-2} + 6.7 f_{i-1} + 30.7 f_i + 8.1 f_{i+1}).   (19.109)
19.4.1.5 Convergence, Consistency, Stability
1. Global Discretization Error and Convergence Single-step methods can generally be written in the form
y_{i+1} = y_i + h F(x_i, y_i, h)   (i = 0, 1, 2, ...; y_0 given).   (19.110)
Here F(x, y, h) is called the increment function or progressive direction of the single-step method. The approximating solution obtained by (19.110) depends on the step size h and is therefore denoted by y(x, h). Its difference from the exact solution y(x) of the initial value problem (19.93) is called the global discretization error g(x, h), and we say: The single-step method (19.110) is convergent with order p if p is the largest natural number such that
g(x, h) = y(x, h) - y(x) = O(h^p)   (19.111)
holds. Formula (19.111) says that the approximation y(x, h), determined with the step size h = (x - x_0)/n, converges to the exact solution y(x) for every x from the domain of the initial value problem as h -> 0.
■ The Euler method (19.97) has order of convergence p = 1. For the Runge-Kutta method (19.99), p = 4 holds.
2. Local Discretization Error and Consistency The order of convergence according to (19.111) shows how well the approximating solution y(x, h) approximates the exact solution y(x). Besides this, it is an interesting question how well the increment function F(x, y, h) approximates the derivative y' = f(x, y). For this purpose we introduce the so-called local discretization error l(x, h), and we say: The single-step method (19.110) is consistent with order p if p is the largest natural number with
l(x, h) = (y(x + h) - y(x))/h - F(x, y, h) = O(h^p).   (19.112)
It follows directly from (19.112) that for a consistent single-step method
lim as h -> 0 of F(x, y, h) = f(x, y).   (19.113)
■ The Euler method has order of consistency p = 1, the Runge-Kutta method has order of consistency p = 4.
3. Stability with Respect to Perturbation of the Initial Values In the practical performance of a single-step method, a rounding error of size O(1/h) adds to the global discretization error O(h^p). Consequently, we have to select a not too small, finite step size h > 0. It is also an important question how the numerical solution y_i behaves under perturbations of the initial values or in the case x_i -> infinity. In the theory of ordinary differential equations, an initial value problem (19.93) is called stable with respect to perturbations of its initial values if
|y~(x) - y(x)| <= |y~_0 - y_0|.   (19.114)
Here y~(x) is the solution of (19.93) with the perturbed initial value y~(x_0) = y~_0 instead of y_0. Estimate (19.114) says that the absolute value of the difference of the solutions is not larger than the perturbation of the initial values. In general, it is hard to check (19.114). Therefore we consider the linear test problem
y' = lambda*y with y(x_0) = y_0   (lambda constant, lambda <= 0),   (19.115)
which is stable, and the single-step method is applied to this special initial value problem. A consistent method is called absolutely stable with step size h > 0 with respect to perturbed initial values if the approximating solution y_i of the above linear test problem (19.115) obtained by the method satisfies the condition
|y_i| <= |y_0|.   (19.116)
■ Applying the Euler polygon method to equation (19.115) results in the solution y_{i+1} = (1 + lambda*h) y_i (i = 0, 1, ...). Obviously, (19.116) holds if |1 + lambda*h| <= 1, so the step size must satisfy -2 <= lambda*h <= 0.
4. Stiff Differential Equations Many application problems, among them problems of chemical kinetics, lead to differential equations whose solutions consist of terms that converge to zero exponentially, but with exponential rates of very different magnitude. Such equations are called stiff differential equations. An example is
y(x) = C_1 e^{lambda_1 x} + C_2 e^{lambda_2 x}   (C_1, C_2, lambda_1, lambda_2 const)   (19.117)
with lambda_1 < 0, lambda_2 < 0 and |lambda_1| < |lambda_2|, e.g., lambda_1 = -1, lambda_2 = -1000. The term with lambda_2 does not have a significant effect on the solution function, but it does on the selection of the step size h of a numerical method. In such cases the choice of the most appropriate numerical method has special importance (see [19.23]).
19.4.2 Boundary Value Problems The most important methods for solving boundary value problems of ordinary differential equations are demonstrated on the following simple linear boundary value problem for a second-order differential equation:
y''(x) + p(x) y'(x) + q(x) y(x) = f(x)   (a <= x <= b)   with y(a) = alpha, y(b) = beta.   (19.118)
The functions p(x), q(x) and f(x) and also the constants alpha and beta are given. The methods described here can also be adapted to boundary value problems of higher-order differential equations.
19.4.2.1 Difference Method
We divide the interval [a, b] by equidistant interpolation points x_nu = x_0 + nu*h (nu = 0, 1, 2, ..., n; x_0 = a, x_n = b) and replace the values of the derivatives in the differential equation at the interior interpolation points,
y''(x_nu) + p(x_nu) y'(x_nu) + q(x_nu) y(x_nu) = f(x_nu)   (nu = 1, 2, ..., n - 1),   (19.119)
by so-called finite divided differences, e.g.:
y'(x_nu) ≈ (y_{nu+1} - y_{nu-1}) / (2h),   (19.120a)
y''(x_nu) ≈ (y_{nu+1} - 2y_nu + y_{nu-1}) / h^2.   (19.120b)
In this way we get n - 1 linear equations for the n - 1 approximation values y_nu ≈ y(x_nu) in the interior of the integration interval [a, b], taking into account the conditions y_0 = alpha and y_n = beta. If the boundary conditions also contain derivatives, these must also be replaced by finite expressions. Eigenvalue problems for differential equations (see 9.1.3.2, p. 513) are handled analogously: the application of the difference method, described by (19.119) and (19.120a,b), leads to a matrix eigenvalue problem (see 4.5, p. 278).
■ The solution of the homogeneous differential equation y'' + lambda^2 y = 0 with boundary conditions y(0) = y(1) = 0 leads to a matrix eigenvalue problem. The difference method transforms the differential equation into the difference equation y_{nu+1} - 2y_nu + y_{nu-1} + h^2 lambda^2 y_nu = 0. If we choose three interior points, hence h = 1/4, then considering y_0 = y(0) = 0, y_4 = y(1) = 0 we get the discretized system
(-2 + lambda^2/16) y_1 + y_2 = 0,
y_1 + (-2 + lambda^2/16) y_2 + y_3 = 0,
y_2 + (-2 + lambda^2/16) y_3 = 0.
This homogeneous equation system has a non-trivial solution only if the coefficient determinant is zero. This condition results in the eigenvalues lambda_1^2 = 9.37, lambda_2^2 = 32 and lambda_3^2 = 54.63. Among them only the smallest one is close to its corresponding true value pi^2 ≈ 9.87.
Remark: The accuracy of the difference method can be improved by
1. decreasing the step size h,
2. application of derivative approximations of higher order (approximations such as (19.120a,b) have an error of order O(h^2)),
3. application of multi-step methods (see 19.4.1.3, p. 902).
If we have a non-linear boundary value problem, the difference method leads to a system of non-linear equations for the unknown approximation values y_nu (see 19.2.2, p. 893).
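The matrix eigenvalue problem of the example above can be checked with a few lines of Python (not part of the original text); the discretized eigenvalues 9.37, 32, 54.63 quoted above are reproduced.

    import numpy as np

    h, n = 0.25, 3                              # three interior points, as above
    A = (np.diag(-2.0 * np.ones(n))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))
    # y'' + lambda^2 y = 0 discretized as A y + h^2 lambda^2 y = 0,
    # so lambda^2 are the eigenvalues of -A / h^2
    lam2 = np.sort(np.linalg.eigvalsh(-A / h**2))
    print(lam2)                                 # approx [9.37, 32.0, 54.63]
    print((np.pi * np.arange(1, n + 1))**2)     # true values 9.87, 39.5, 88.8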
19.4.2.2 Approximation by Using Given Functions For the approximate solution of the boundary value problem (19.118) we use a linear combination of suitably chosen functions g_i(x), which are linearly independent and each of which satisfies the boundary conditions:
y(x) ≈ g(x) = sum_{i=1}^n a_i g_i(x).   (19.121)
If we substitute g(x) into the differential equation (19.118), we get an error, the so-called defect
epsilon(x; a_1, a_2, ..., a_n) = g''(x) + p(x) g'(x) + q(x) g(x) - f(x).   (19.122)
To determine the coefficients a_i, we can use the following principles (see also p. 910):
1. Collocation Method The defect is required to be zero at n given points x_nu, the so-called collocation points. The conditions
epsilon(x_nu; a_1, a_2, ..., a_n) = 0   (nu = 1, 2, ..., n),   a < x_1 < x_2 < ... < x_n < b,   (19.123)
result in a linear equation system for the unknown coefficients.
2. Least Squares Method We require that the integral
F(a_1, a_2, ..., a_n) = integral from a to b of epsilon^2(x; a_1, a_2, ..., a_n) dx,   (19.124)
depending on the coefficients, should be minimal. The necessary conditions
dF/da_i = 0   (i = 1, 2, ..., n)   (19.125)
give a linear equation system for the coefficients a_i.
3. Galerkin Method We require that the so-called error orthogonality is satisfied, i.e.,
integral from a to b of epsilon(x; a_1, a_2, ..., a_n) g_i(x) dx = 0   (i = 1, 2, ..., n),   (19.126)
and in this way we get a linear equation system for the unknown coefficients.
4. Ritz Method The solution y(x) often has the property that it minimizes a variational integral
I[y] = integral from a to b of H(x, y, y') dx   (19.127)
(see (10.4), p. 550). If we know the function H(x, y, y'), then we replace y(x) by the approximation g(x) as in (19.121) and minimize I[y] = I(a_1, a_2, ..., a_n). The necessary conditions
dI/da_i = 0   (i = 1, 2, ..., n)   (19.128)
result in n equations for the coefficients a_i.
Under certain conditions on the functions p, q, f and y, the boundary value problem
-[p(x) y'(x)]' + q(x) y(x) = f(x)   with y(a) = alpha, y(b) = beta   (19.129)
and the variational problem
I[y] = integral from a to b of [ p(x) y'^2(x) + q(x) y^2(x) - 2 f(x) y(x) ] dx = min!   with y(a) = alpha, y(b) = beta   (19.130)
are equivalent, so for a boundary value problem of the form (19.129) we can get H(x, y, y') immediately from (19.130).
Instead of the approximation (19.121) we often consider
g(x) = g_0(x) + sum_{i=1}^n a_i g_i(x),   (19.131)
where g_0(x) satisfies the boundary conditions and the functions g_i(x) satisfy the conditions
g_i(a) = g_i(b) = 0   (i = 1, 2, ..., n).   (19.132)
For the problem (19.118) we can choose, e.g.,
g_0(x) = alpha + ((beta - alpha)/(b - a)) (x - a).   (19.133)
Remark: For a linear boundary value problem, the forms (19.121) and (19.131) result in linear equation systems for the coefficients. In the case of non-linear boundary value problems we get non-linear equation systems, which can be solved by the methods given in Section 19.2.2, p. 893.
19.4.2.3 Shooting Method With the shooting method, we reduce the solution of a boundary value problem to the solution of an initial value problem. The basic idea of the method is described below as the single-target method.
1. Single-Target Method The initial value problem
y'' + p(x) y' + q(x) y = f(x)   with y(a) = alpha, y'(a) = s   (19.134)
is associated with the boundary value problem (19.118). Here s is a parameter on which the solution y of the initial value problem (19.134) depends, i.e., y = y(x, s) holds. The function y(x, s) satisfies the first boundary condition y(a, s) = alpha according to (19.134). We have to determine the parameter s so that y(x, s) also satisfies the second boundary condition y(b, s) = beta. Therefore we have to solve the equation
F(s) = y(b, s) - beta = 0,   (19.135)
and the regula falsi (or secant) method is an appropriate method for doing this. It needs only values of the function F(s), but the computation of every function value requires the solution of an initial value problem (19.134) up to x = b for the special parameter value s, with one of the methods given in 19.4.1.
2. Multiple-Target Method In the so-called multiple-target method, the integration interval [a, b] is divided into subintervals, and the single-target method is used on every subinterval. The required solution is then composed from the solutions on the subintervals, where a continuous transition at the endpoints of the subintervals must be ensured. This requirement results in further conditions. For the numerical implementation of the multiple-target method, which is used mostly for non-linear boundary value problems, see [19.24].
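A Python sketch (not part of the original text) of the single-target shooting method: the slope s = y'(a) is adjusted with the secant method so that F(s) = y(b, s) - beta = 0 as in (19.135); the second-order equation is integrated as a first-order system with the RK4 scheme, and the test problem is a hypothetical example with known slope.

    import numpy as np

    def solve_ivp_rk4(f, a, b, y0, n):
        # integrate the first-order system y' = f(x, y) from a to b with n RK4 steps
        h = (b - a) / n
        x, y = a, np.array(y0, dtype=float)
        for _ in range(n):
            k1 = h * f(x, y); k2 = h * f(x + h/2, y + k1/2)
            k3 = h * f(x + h/2, y + k2/2); k4 = h * f(x + h, y + k3)
            y = y + (k1 + 2*k2 + 2*k3 + k4) / 6
            x += h
        return y

    def shooting(f, a, b, alpha, beta, s0, s1, n=100, tol=1e-10, max_iter=50):
        # secant iteration on F(s) = y(b, s) - beta, cf. (19.135)
        F = lambda s: solve_ivp_rk4(f, a, b, [alpha, s], n)[0] - beta
        F0, F1 = F(s0), F(s1)
        for _ in range(max_iter):
            s2 = s1 - F1 * (s1 - s0) / (F1 - F0)
            s0, F0, s1, F1 = s1, F1, s2, F(s2)
            if abs(F1) < tol:
                break
        return s1

    # hypothetical test problem: y'' = -y, y(0) = 0, y(pi/2) = 1; exact slope y'(0) = 1
    f = lambda x, y: np.array([y[1], -y[0]])
    print(shooting(f, 0.0, np.pi/2, alpha=0.0, beta=1.0, s0=0.5, s1=2.0))   # approx 1.0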
19.5 Approximate Integration of Partial Differential Equations In the following, we discuss only the principle of the numerical solution of partial differential equations, using the example of linear second-order partial differential equations in two independent variables with the corresponding boundary and/or initial conditions.
19.5.1 Difference Method We impose a regular grid on the integration domain with chosen points (x_mu, y_nu). Usually this grid is chosen to be rectangular and equally spaced:
x_mu = x_0 + mu*h,   y_nu = y_0 + nu*l   (mu, nu = 1, 2, ...).   (19.136)
We get squares for l = h. If we denote the required solution by u(x, y), then we replace the partial derivatives occurring in the differential equation and in the boundary or initial conditions by finite divided differences, where u_{mu,nu} denotes an approximate value for the function value u(x_mu, y_nu). Typical replacements of this kind, with their error orders given by the Landau symbol O, are, e.g.,
u_xx(x_mu, y_nu) ≈ (u_{mu+1,nu} - 2u_{mu,nu} + u_{mu-1,nu}) / h^2   with error of order O(h^2),
u_yy(x_mu, y_nu) ≈ (u_{mu,nu+1} - 2u_{mu,nu} + u_{mu,nu-1}) / l^2   with error of order O(l^2).   (19.137)
In some cases it is more practical to apply the approximation (19.138): a convex linear combination, with a fixed parameter sigma (0 <= sigma <= 1), of the two finite expressions obtained from the corresponding formula (19.137) for the values y = y_nu and y = y_{nu+1}. With the formulas (19.137), a partial differential equation can be rewritten as a difference equation at every interior point of the grid, where the boundary and initial conditions are taken into account as well. For small step sizes h and l, this equation system for the approximation values has a large dimension, so it is usually solved by an iteration method (see 19.2.1.4, p. 892).
■ A: The function u(x, y) should be the solution of the differential equation Delta u = u_xx + u_yy = -1 at the points (x, y) with |x| < 1, |y| < 2, i.e., in the interior of a rectangle, and it should satisfy the boundary conditions u = 0 for |x| = 1 and |y| = 2. The difference equation corresponding to the differential equation for a square grid with step size h is
4u_{mu,nu} = u_{mu+1,nu} + u_{mu-1,nu} + u_{mu,nu+1} + u_{mu,nu-1} + h^2.
The step size h = 1 (Fig. 19.6) results in a first rough approximation for the function values at the three interior points:
4u_{0,1} = 0 + 0 + 0 + u_{0,0} + 1,
4u_{0,0} = 0 + u_{0,1} + 0 + u_{0,-1} + 1,
4u_{0,-1} = 0 + u_{0,0} + 0 + 0 + 1.
We get: u_{0,0} = 3/7 ≈ 0.429, u_{0,1} = u_{0,-1} = 5/14 ≈ 0.357.
Figure 19.6
■ B: The equation system arising in the application of the difference method for partial differential equations has a very special structure. We demonstrate it by the following, more general boundary value problem. The integration domain is the square G: 0 <= x <= 1, 0 <= y <= 1. We are looking for a function u(x, y) with Delta u = u_xx + u_yy = f(x, y) in the interior of G and u(x, y) = g(x, y) on the boundary of G. The functions f and g are given. The difference equation associated with this differential equation is, for h = l = 1/n,
u_{mu+1,nu} + u_{mu,nu+1} + u_{mu-1,nu} + u_{mu,nu-1} - 4u_{mu,nu} = h^2 f(x_mu, y_nu)   (mu, nu = 1, 2, ..., n - 1).
In the case of n = 5, the coefficient matrix of this difference equation system for the approximation values at the 4 x 4 interior points has the block-tridiagonal form
      ( T  I       )                  ( -4  1       )
A =   ( I  T  I    )    with    T =   (  1 -4  1    )    and the (4 x 4) identity matrix I,   (19.139)
      (    I  T  I )                  (     1 -4  1 )
      (       I  T )                  (        1 -4 )
if we consider the grid row-wise from left to right and take into account that the values of the function are given on the boundary. We see that the coefficient matrix is symmetric and is a sparse matrix. This form is called block-tridiagonal. We can also see that the form of the matrix depends on how the grid points are numbered. For the different classes of second-order partial differential equations, such as elliptic, parabolic and hyperbolic differential equations, more effective methods have been developed, and the convergence and stability conditions have also been investigated. There is a huge number of books about this topic (see, e.g., [19.22], [19.24]).
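The block-tridiagonal matrix (19.139) can be assembled in a few lines of Python with Kronecker products (not part of the original text; the assembly via np.kron is one possible choice, not the only one).

    import numpy as np

    n = 5                                      # grid spacing h = 1/n, 4 x 4 interior points
    m = n - 1
    T = (np.diag(-4.0 * np.ones(m))
         + np.diag(np.ones(m - 1), 1)
         + np.diag(np.ones(m - 1), -1))
    S = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)
    A = np.kron(np.eye(m), T) + np.kron(S, np.eye(m))   # block-tridiagonal matrix (19.139)
    print(A.shape)                                       # (16, 16)
    print(np.allclose(A, A.T))                           # symmetric
    print(np.count_nonzero(A), "nonzero of", A.size, "entries")   # sparse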
19.5.2 Approximation by Given Functions
We approximate the solution u(x, y) by a function of the form
u(x, y) ≈ v(x, y) = v_0(x, y) + sum_{i=1}^n a_i v_i(x, y).   (19.140)
Here we distinguish between two cases:
1. v_0(x, y) satisfies the given inhomogeneous differential equation, and the further functions v_i(x, y) (i = 1, 2, ..., n) satisfy the corresponding homogeneous differential equation (then we look for the linear combination approximating the given boundary conditions as well as possible).
2. v_0(x, y) satisfies the inhomogeneous boundary conditions, and the other functions v_i(x, y) (i = 1, 2, ..., n) satisfy the homogeneous boundary conditions (then we look for the linear combination approximating the solution of the differential equation in the considered domain as well as possible).
If we substitute the approximating function v(x, y) from (19.140), in the first case into the boundary conditions and in the second case into the differential equation, then in both cases we get an error term, the so-called defect
epsilon = epsilon(x, y; a_1, a_2, ..., a_n).   (19.141)
To determine the unknown coefficients a_i, we can apply one of the following methods:
1. Collocation Method The defect epsilon should be zero at n reasonably distributed points, the collocation points (x_nu, y_nu) (nu = 1, 2, ..., n):
epsilon(x_nu, y_nu; a_1, a_2, ..., a_n) = 0   (nu = 1, 2, ..., n).   (19.142)
The collocation points are boundary points in the first case (we speak of boundary collocation) and interior points of the integration domain in the second case (we speak of domain collocation). From (19.142) we get n equations for the coefficients. Boundary collocation is usually preferred to domain collocation.
■ We apply this method to the example solved in 19.5.1 by the difference method, using functions that satisfy the differential equation:
v(x, y; a_1, a_2, a_3) = -(1/4)(x^2 + y^2) + a_1 + a_2 (x^2 - y^2) + a_3 (x^4 - 6x^2 y^2 + y^4).
The coefficients are determined so that the boundary conditions are satisfied at the points (x_1, y_1) = (1, 0.5), (x_2, y_2) = (1, 1.5) and (x_3, y_3) = (0.5, 2) (boundary collocation). We get the linear equation system
-0.3125 + a_1 + 0.75 a_2 - 0.4375 a_3 = 0,
-0.8125 + a_1 - 1.25 a_2 - 7.4375 a_3 = 0,
-1.0625 + a_1 - 3.75 a_2 + 10.0625 a_3 = 0
with the solution a_1 = 0.4562, a_2 = -0.2000, a_3 = -0.0143. With the approximating function we can calculate approximate values of the solution at arbitrary points. To compare with the values obtained by the difference method: v(0, 1) = 0.3919 and v(0, 0) = 0.4562.
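The small boundary-collocation system of the example above can be set up and solved in Python as follows (not part of the original text); the variable names are free choices.

    import numpy as np

    pts = [(1.0, 0.5), (1.0, 1.5), (0.5, 2.0)]        # boundary collocation points
    basis = [lambda x, y: 1.0,
             lambda x, y: x**2 - y**2,
             lambda x, y: x**4 - 6*x**2*y**2 + y**4]   # harmonic functions
    v0 = lambda x, y: -0.25 * (x**2 + y**2)            # particular solution of Delta u = -1

    M = np.array([[g(x, y) for g in basis] for (x, y) in pts])
    rhs = np.array([-v0(x, y) for (x, y) in pts])      # boundary condition u = 0
    a = np.linalg.solve(M, rhs)
    print(a)                                           # approx [0.4562, -0.2, -0.0143]

    v = lambda x, y: v0(x, y) + sum(ai * g(x, y) for ai, g in zip(a, basis))
    print(v(0.0, 1.0), v(0.0, 0.0))                    # approx 0.3919, 0.4562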
2. Least Squares Method Depending on whether the approximating function (19.140) satisfies the differential equation or the boundary conditions, we require
1. either that the line integral over the boundary C,
I = integral over C of epsilon^2(x(t), y(t); a_1, ..., a_n) dt = min,   (19.143a)
where the boundary curve C is given by a parametric representation x = x(t), y = y(t),
2. or that the double integral over the domain G,
I = double integral over G of epsilon^2(x, y; a_1, ..., a_n) dx dy = min.   (19.143b)
From the necessary conditions dI/da_i = 0 (i = 1, 2, ..., n) we get n equations for computing the parameters a_1, a_2, ..., a_n.
19.5.3 Finite Element Method (FEM) After the appearance of modern computers, the finite element methods became the most important technique for solving partial differential equations. These powerful methods give results which are easy to interpret. Depending on the type of application, the FEM is implemented in very different ways, so here we give only the basic idea. It is similar to that used in the Ritz method (see 19.4.2.2, p. 906) for the numerical solution of boundary value problems of ordinary differential equations and is related to spline approximations (see 19.7, p. 928). The finite element method has the following steps:
1. Defining a Variational Problem We formulate a variational problem for the given boundary value problem. The process is demonstrated on the following boundary value problem:
Delta u = u_xx + u_yy = f in the interior of G,   u = 0 on the boundary of G.   (19.144)
We multiply the differential equation in (19.144) by an appropriate smooth function v(x, y) vanishing on the boundary of G and integrate over the entire domain G to get (19.145). Applying the Gauss integral formula (see 13.3.3.1, 1., p. 663), where we substitute P(x, y) = -v u_y and Q(x, y) = v u_x in (13.119), we get from (19.145) the variational equation
a(u, v) = b(v)   (19.146a)
with the bilinear form a(u, v) and the right-hand side b(v) defined in (19.146b).
2. Triangularization The domain of integration G is decomposed into simple subdomains. Usually we use a triangularization, where G is covered by triangles so that neighboring triangles have either a complete side or only a single vertex in common. Every domain bounded by curves can be approximated quite well by a union of triangles (Fig. 19.7).
Remark: To avoid numerical difficulties, the triangularization should not contain obtuse-angled triangles.
■ A triangularization of the unit square can be performed as shown in Fig. 19.8. Here we start from the grid points with coordinates x_mu = mu*h, y_nu = nu*h (mu, nu = 0, 1, 2, ..., N; h = 1/N). We get (N - 1)^2 interior points. Considering the choice of the solution functions, it is always useful to consider the surface elements G_{mu,nu} composed of the six triangles having the common point (x_mu, y_nu). (In other cases the number of triangles may differ from six. These surface elements are obviously not mutually disjoint.)
3. Solution We define a supposed approximating solution for the required function u(x, y) on every triangle. A triangle with the corresponding supposed solution is called a finite element. Polynomials in x and y are the most suitable choice. In many cases the linear approximation
u~(x, y) = a_1 + a_2 x + a_3 y   (19.147)
912
19. Numerical Analusis
is sufficient. The supposed approximating function must be continuous under the transition from one triangle to neighboring ones, so we get a continuous final solution. The coefficients a l , a2 and a3 in (19.147) are uniquely defined by the values of the functions u1, u2 and 113 at the three vertices of the triangle. The continuous transition to the neighboring triangles is ensured by this at the same time. The supposed solution (19.147) contains the approximating values u,of the required function as unknown parameters. For the supposed solution, which is applied as an approximation in the entire domain G for the required solution u ( z ,Y ) ~we choose (19.148) ,=l
,=l
n'e have to determine the appropriate coefficients cypy. The following must be valid for the functions uPv(z,y): They represent a linear function over every triangle of G,, according to (19.147) with the following conditions: 1 fork = p,1 = v, (19.149a) at any other grid point of GI".
2. u,&,
Y)
=0
for (z, Y)
# G,,.
(19.149b)
The representation of u,,(z,y) over G,, is shown in Fig. 19.9. The calculation of up,, over G,,, Le., over all triangles 1 to 6 in Fig. 19.8 is shown here only for triangle 1: uPv(z,y) = al + a22 a3 with (19.E O )
+
1 for z = z,,y = y, , u P L y ( z , y ) = 0 for z = ~ , - l , y = y , - ~ , 0 for z = z,,y = yY--l.
{
(19.151)
From (19.151) we have a1 = 1 - v, a2 = 0, a3 = l / h , and we get for triangle 1: u&,y)
Analogously. we have:
I
1-
(X
- p)
+ (f - v)
1+
(: v ) - p)
for triangle 2,
for triangle 4,
--
(i + (f
(19.152)
for triangle 3,
1-(;-p)
u,v(z,y) = ( 1 -
(: )
= 1t - - v
- v)
(19.153)
for triangle 5, for triangle 6.
4. Calculation of t h e Solution Coefficients We determine the solution coefficients cypv by the requirements that the solution (19.148) satisfies the variational problem (19.146a) for every solution function u , ~ .Le., we substitute G(z, y) for u(z. y) and u,,(z, y) for v(z,y) in (19.146a). This way, we get a linear equation system N-1 N - 1 ,=1
u=l
~ + v a ( u puki) ) = b ( ~ () k , 1 = 1 , 2 , . . . , N - 1)
(19.154)
19.5 Approximate Inteqration of Partial Differential Equations 913
for the unknown coefficients, where (19.155) In the calculation of u(upLy, ukl) we must pay attention to the fact that we have to integrate only in the cases of domains G ,, and Gkl with non-empty intersection. These domains are denoted by shadowing in Table 19.1. Table 19.1 Auxiliary table for FEM
Surface region
:;$ 1
~
1
1
2 3 4 5 6
2 3 4 5 6
1
pjj I
l
l
I% El6 4, p = k + l v=l+l
#
v i E--
Graphical Triangle of representation Gki G,, 1
;
1 5 2 4
2 6 3 5
-l/h -l/h
3 1
-l/h
4 2 5 1
n
0
llh
0
0 -l/h
5 3 6 2
6 4
7, p = k - l v=l-1
l 3
The integration is always performed over a triangle with an area h2/2, so for the partial derivatives with respect to z we get:
p1 ( 4 w - 2 a k t 1 , l - 2Qk-1,lj
hZ 2
-.
(19.156a)
914
19. Numerical Analysis
Analogously. for the partial derivatives with respect to y we have: (19.156b)
(19.157a) where V p is the volume of the pyramid over Since 1
Gkl
with height 1, determined by
ukl(z,y)
(Fig. 19.9).
1
(19.157b) VP= - 6 . -hZ we have b(ukl)FZ jklhz 3 2 So. the variational equations (19.154) result in the linear equation system . . , N - 1) ( k .1 = (19.158) 4~ - Q ~ + I , L ~ k - 1 ,~ O k , l t l - a k . l - 1 = hZ.fkl for the determination of the solution coefficients. Remarks: 1. If the solution coefficients are determined by (19.158), then C(z3y) from (19.148) represents an explicit approximating solution, whose values can be calculated for an arbitrary point (x,y) from G. 2. If the integration domain must be covered by an irregular triangular grid then it is useful to introduce triangular coordinates (also called barycentric coordinates). In this way, the position of a point can be easily determined with respect to the triangular grid, and the calculation of the multidimensional integral is made easier as in (19.155), because every triangle can be easily transformed into the unit triangle with vertices ( O , O ) , (0, l),(1,O). 3. If accuracy must be improved or also the differentiability of the solution is required, we have to apply piecewise quadratic or cubic functions to obtain the supposed approximation (see. e.g., [19.22]). 4. In practical applications, we usually obtain equation systems of huge dimensions. This is the reason why so many special methods have been developed, e.g., for automatic triangularization and for practical enumeration of the elements (the structure of the equation system depends on it). For detailed discussion of FEM see [19.13], [19.7], [19.22].
19.6 Approximation, Computation of Adjustment) Harmonic Analysis 19.6.1 Polynomial Interpolation The basic problem of interpolation is to fit a curve through a sequence of points (x,,, y,,) (v = 0,1, , . . n). This can happen graphically by any curve-fitting gadget, or numerically by a function g(x), which takes given values y,, at the points z, at the so-called interpolation points. That is g(z) satisfies the interpolation conditions (19.159) g(x,,)=y, ( v = 0 , 1 , 2 In the first place, we use polynomials as interpolation functions, or for periodic functions so-called trigonometric polynomials. In this last case we talk about trigonometric interpolation (see 19.6.4.1, 2., p. 924). If we have n + 1 interpolation points, the order of the interpolation is n, and the highest degree of the interpolation polynomial is at most n. Since with increasing degree of the polynomials, strong oscillation may occur, which is usually not required, we decompose the interpolation interval into subintervals and we perform a spline interpolation (see 19.7, p. 928). ~
I
19.6.1.1 Newton's Interpolation Formula To solve the interpolation problem (19.159)we consider a polynomial of degree n in the following form: g(z) = p,(z) = a0 al(z - zo) az(z - q ) ( z - 21) ..
+
+
+.
19.6 Approximation, Computation of Adjustment, Harmonic Analusis 915
+an(z - 2 0 ) ( 2 - 2 1 ) . . . (2 - & I ) . ( 19.160) This is called the Newton interpolation fornula, and it gives an easy calculation of the coefficients a, (i = 0 , l . . . . . n ) . since the interpolation conditions (19.159) result in a linear equation system with a triangular matrix. IFor n = 2 we get the annexed equation = Yo system from (19.159). The interpolation p z ( z O=) = Y1 polynomial pn(z)is uniquely determined pz(21)= + a1(21 p2('2) = 0' + 'l('2 - '0) + az(zz - '0)('2 - '1) = Y2 by the interpolation conditions (19.159). The calculation of the function values can be simplified by the Horner schema (see 19.1.2.1. p. 884).
19.6.1.2 Lagrange's Interpolation Formula Wecanfit apolynomialofn-thdegreethroughn+lpoints(z,, y,,) formula:
(u = 0,1,.
. . , n ) )withtheLagrange
n
g(2) = pn(.)
=
1~fiLfi(2).
(19.161)
=,O
Here L,(x) ( p = 0 , 1 , .. . , n ) are the Lagrange interpolation polynomials. Equation (19.161) satisfies the interpolation conditions (19.159), since 1 for p = u, (19.162) 0 for p # u. Here 6," is the Kronecker symbol. The Lagrange interpolation polynomials are defined by the formula
.
(19.163)
"#P
I\$e' fit a polynomial through the points given by the table b'e use the Lagrange interpolation formula (19.161) and we get:
5 17 L2(z) = - - 2 2 + -2 + 1. 6 6 The Lagrange interpolation formula depends explicitly and linearly on the given values y, of the function. This is its theoretical importance (see, e.g., the rule of Adams-Bashforth, 19.4.1.3, 3., p. 903). For practical calculation the Lagrange interpolation formula is rarely reasonable. p z ( 2 ) = 1 . Lo(2)
+3
'
Ll(2)
+2
'
19.6.1.3 Aitken-Neville Interpolation In several practical cases: we do not need to know the explicit form of the polynomial pn(z),but only its value at a given location z of the interpolation domain. We can get this function value in a recursive way due to Aitken and Neville. We apply the useful notation (19.164) pn(2) = PO,^ ...,,n(x)* in which the interpolation points 2 0 , xl, . . . , 2 , and the degree n of the polynomial are denoted. Notice that (2 - ZO)PI,Z,...,n(2) - ( 2 - Z ~ ) P O , ,..., ~ ,n-1(2) Z (19.165) P0,l ,...,n(x) = Xn - 20
I
916
19. Numerical Analysis
i.e.. the function value p o ~...,,(z) , can be obtained by linear interpolation of the function values of p l ~...,,(x) , and PO,~J,,.,,,-~(X), two interpolation polynomials of degree 5 n - 1. Application of (19.165) leads to a scheme which is given here for the case of n = 4: Yo =Po Y1 = PI Po1 Y Z = P2 PI2 Po12 !/3 = P3 P23 P123 PO123 y4 = P4 P34 P234 P1234 PO1234 = P4(x).
(19.166)
The elements of (19.166) are calculated column-wise. A new value in the scheme is obtained from its west and north-west neighbors (19.167a)
(19.167b) (19.167~)
For performing the Aitken-Neville algorithm on a computer we need to introduce only a vector p with n + 1 components (see [19.4]), which takes the values of the columns in (19.166) after each other according to the rule that the value ~ i - + k + l , . . . , ~ (i = k, k + 1, , n) of the k-th column will be the i-th component p, of p. The columns of (19.166)must be calculated from the top down. so we will have all necessary values. The algorithm has the following two steps: (19.168a) 1. For i = O . l , . . . . n set p,=y,. 2-x, 2. For k = 1,2,. . . , n and for i = n,n - 1,.. . , k compute pi = p i t L ( p , - pi-1).(19.168b) xi - Xi-k t . .
after finishing (19.168b) we have the required function value p,(x) at x in element p,.
19.6.2 Approximation in Mean The principle of approximation in mean is known as the Gauss least squares method. In calculations we distinguish between continuous and discrete cases.
19.6.2.1 Continuous Problems, Normal Equations The function f ( x ) is approximated by a function g(x) on the interval [a, b] so that the expression b
F = /.(.)If(.)
- 9(x)I2dz,
(19.169)
a
depending on the parameters contained by g(x), should be minimal. W(X)denotes a given weight function, such that ~ ( x >) 0 in the integration interval. If we are looking for the best approximation g(x) in the form n
g(x) = C a t g i ( x )
(19.170)
t=O
with suitable linearly independent functions go(x), gl(x), . . . , gn(x), then the necessary conditions
aF
-= 0 8Qt
(2
= O , l , . . .,n)
(19.171)
19.6 Approximation, Computation of Adjustment, Harmonic Analysis 917
for an extreme value of (19.169) result in the so-called normal equation system (19.172)
( 19.173a)
(19.173b) which are considered as the scalar products of the two indicated functions The system of normal equations can be solved uniquely, since the functions go(z),g l ( z ) ,. . . , gn(z) are linearly independent. The coefficient matrix of the system (19.172) is symmetric, so we could apply the Cholesky method (see 19.2.1.2. p. 890). The coefficients a, can be determined directly, without solving the equation system, if the system of functions gz(z)is orthogonal, that is, if = 0 for z
(gz.gk)
# k.
(19.174)
We call it an orthonormalsystem. if (19.175) R'ith (19.175). the normal equations (19.172) are reduced to a, = ( f . g Z ) (i = 0.1.. . . n).
(19.176)
~
Linearly independent function systems can be orthogonalized. From the power functions gt(z) = z2 (i = 0.1.. . . n ) ,depending on the weight function and on the interval, we may obtain the orthogonal polynomials in Table 19.2. Table 19.2 Orthogonal polynomials
.
i
' Name of the polynomials see p. Legendre polynomial P,(z) (19.177)
Chebysev polynomial Tn(z) Laguerre polynomial L,(z) Hermite polynomial H,(x)
511 512
With these polynomial systems we can work on arbitrary intervals: 1. Finite approximation interval. 2. Approximation interval infinite at one end, e.g., in time-dependent problems. 3. Approximation interval infinite at both ends, e.g., in stream problems. Every finite interval [a,b] can be transformed by the substitution
b+a b-a z = -t -t 2
2
(x E [a,b], t
E
[-1.11)
(19.178)
918
19. Numerical Analusis
into the interval [-1.11.
19.6.2.2 Discrete Problems, Normal Equations, Householder's Method Let S pairs of \slues (xu,y u ) be given. e.g., by measured values. We are looking for a function g(x). whose values g(lru)differ from the given values yv in such a way that the quadratic expression Y
F =
C[Yy- g(xv)12
(19.179)
"=l
is minimal. The value of F depends on the parameters contained in the function g(z). Formula (19.179) represents the classical sum ofresidualsquares. The minimizationof the sum of residual squares is called l3F the least squares method. From the assumption (19.170) and the necessary conditions - = 0 (i = aa, 0 . 1 , . . . n ) for a relative minimum of (19.179) we obtain a linear equation system for the coefficients: which is called the normal equations: n
a,[g,gkI = [ygkl ( k = 0%1.. . . , n).
(19.180)
0'2
Here we used the Gaussean sum symbols in the following notation: ,v
Y
[g&] =
gl(zv)gc(z,), (19.181a)
[ygk] =
1yvgk(z,)
(i, k = 0 , l . . . . , n).
(19.181b)
Lsually, n < Z. IFor the polynomial g(z) = a. + a l z + ' . + a,zn: the normal equations are a0[zk] al[zkt1] ' . u,[xktn] = [z"] ( k = 0:1.. . . n) with [zk]= Cf=, x V k l [xO]= N ! [xky] = zykyy: [y] = y,. The coefficient matrix of the normal equation system (19.180) is symmetric, so for the numerical solution we may apply the Cholesky method. The normal equations (19.180) and the residue sum square (19.179) have the following compact form:
+
+ +
~
Y1
a0
a= YN
I:]
(19.182a)
.
(19.18213)
an
If, instead of the minimalization of the sum of residual squares. we want to solve the interpolation problem for the S points ( z v 3yy), then we have to solve the following system of equations:
Gn = y.
(19.183)
This equation system is overdetermined in the case of n < M - 1, and usually it does not have any solution. \Ye get (19.180) or (19.182a) ifwe multiply (19.183) by GT.
19.6 Approximation, Computation of Adjustment, Harmonic Analysis 919
From a numerical viewpoint, the Householder method (see 4.4.3.2,2., p. 278) is recommended to solve equation (19.183), and this solution results in the minimal sum of residual squares (19.179).
19.6.2.3 Multidimensional Problems 1. Computation of Adjustments Suppose that there is a function f(zl,2 2 , . . . ,z,) of n inde, z,. We do not know its explicit pendent variables 21, 2 2 , form; only N substitution values fy are given, which are, in general, measured values. We can write these data in a table (see (19.184)). The formulation of the adjustment problem is clearer if we introduce the following vectors: I
x
.
z1
.
= ( X I )z2).. .~z,)~
:
&“, . . .
x(’) = (xi“’,
:
x2
i Xn
f
.. .. zp zp fl
fz
... z ”
’
i
(19.184) p
fw
Vector of n independent variables, Vector of the v-th interpolation node (v = 1,.. . , N ) , Vector of the N function values at the N interpolation nodes.
: f = (fl, j z , . . . f N ) T We approximate f(z1,xZr.. I,) = f(x)by a function of the form I
(19.185) Here, the m + 1 functions gz(zl,2 2 , . . . ,z,) = g,(x) are suitable, selected functions. IA: Linear approximation by n variables: g(zl, 2 2 , . , , , 2 , ) = a0 t alzl t a222 + * . t anz,. IB: Complete quadratic approximation with three variables: g(q.z2,z3)= a0 alzl a222 a313 a4z12 a5xZ2 a623’ a72122 t Q8z1z3t a92223.
+
+
+
+
+
+
+
The coefficients are chosen to minimize C,”=,[fy - g (zp),zr),. . . , z;))]’.
2. Normal Equation System Analogously to (19.182b) we form the matrix G, in which we replace the interpolation nodes z, by vectorial interpolation nodes x(”)(v = 1,2,. . . , N ) . To determine the coefficients, we can use the normal equation system GTGa = GTf (19.186) or the overdetermined equation system Ga = f . (19.187)
IFor an example of multidimensional regression see 16.3.4.3, 3., p. 780.
19.6.2.4 Non-Linear Least Squares Problems We show the main idea for a one-dimensional discrete case. The approximation function g(z) depends non-linearly on certain parameters. IA: g(z) = aOeal” t a2ea3’. This expression does not depend linearly on the parameters a1 and a3. IB: g(z) = aOeal‘ cos azz. This function does not depend linearly on the parameters a1 and az. We indicate the fact that the approximation function g(z) depends on a parameter vector a = (ao, a l , . . . , a,)T by the notation g = g ( r , a )= d z ; a o > a l , , , . , a n ) . (19.188) Suppose, N pairs of values ( z v ,yy) (v = 1 , 2 , . . . , N ) are given. To minimize the sum of residual squares (19.189)
I
920
19. Numerical Analusis
dF
the necessary conditions - = 0 (i = 0, 1,. . . , n) lead to a non-linear normal equation system which da, must be solved by an iterative method, e.g., by the Newton method (see 19.2.2.2, p. 894). .4nother way to solve the problem, which is usually used in practical problems, is the application of the Gauss-Newton method (see 19.2.2.3, p. 894) given for the solution of the non-linear least squares problem (19.24). We need the following steps to apply it for this non-linear approximation problem (19.189): 1. Linearization of the approximating function g(z, a) with the help of the Taylor formula with respect to a,. To do this, we need the approximation values ato) (i = 0 , 1,. . . , n):
2. Solution of the linear minimum problem N
C[yv- 5(zY,a)12= min!
(19.191)
"=I
with the help of the normal equation system
G T G h = GTAy -
(19.192)
or by the Householder method. In (19.192) the components of the vectors Aai = a, - a?)
and & are given as
(i = 0,1,2,. . . n) and ~
(19.193a)
(19.193b) Ayu = yu - g(z,,a(O)) (v = 1 , 2 , .. . , N). The matrix G can be determined analogously to G in (19.182b), where we replace g,(z,) by ( 2 = 0,1,. . . ,n; v = 1 , 2 , .. . , N). da, 3. Calculation of a new approximation
-(zu,ajO') ag
(19.194) a!') = ato)+ Aai or a!') = a!') + TAai (i = 0,1,2, . . . , n), where y > 0 is a step length parameter. By repeating steps 2 and 3 with a!') instead of a?), etc. we obtain a sequence of approximation values for the required parameters, whose convergence strongly depends on the accuracy of the initial approximations. We can reduce the value of the sum of residual squares with the introduction of the multiplyer y.
19.6.3 Chebyshev Approximation 19.6.3.1 Problem Definition and the Alternating Point Theorem 1. Principle of Chebyshev Approximation Chebyshev approximationor uniform approximationin the continuous case is the following: The function j ( z ) is to be approximated in an interval a 5 z 5 b by the approximation function g(x) = g(z; ao, a l , . . . ,an) so that the error defined by (19.195) max I j ( z ) - g(z; ao, a ] , . . . , an)l = O(ao,a ' , . . . ,an) asxsb should be as small as possible for the appropriate choice of the unknown parameters a, (i = 0, 1,. . . , n). If there exists such an approximating function for f(z),then the maximum of the absolute error value will be taken at least at n+ 2 points zv of the interval, at the so-called altematingpoints, with changing signs (Fig. 19.10).This is actually the meaning of the altematingpoint theorem for the characterization of the solution of a Chebyshev approximation problem.
19.6 Approximation, Computation of Adjustment, Harmonic Analysis 921
Figure 19.10
IIfwe approximate the function !(I) = x" on the interval [-1,1] by a polynomialofdegree 5 n- 1in the Chebyshev sense, then we get the ChebyshevpolynomialT,(z)as an error function whose maximum is normed to one. The alternating points, being at the endpoints and at exactly n - 1 points in the interior of the interval, correspond to the extreme points of T,(z) (Fig. 19.11a-f).
d)
e)
f)
Figure 19.11
19.6.3.2 Properties of the Chebyshev Polynomials 1. Representation T,(z) = cos(narccosz), 1
+ m)" +
(19.196a)
my]
- [(I (I , 2 cos nt, I = cost for l 2 1 < 1, (n = 1 , 2 , . . .), Tn(x) = coshnt, x = cosht for 1x1 > 1 T"(Z) =
{
(19.196b)
( 19.196~)
I
922
19. Numerical Analusis
2. Roots of Tn(z) ( p = 1 , 2 , . . , , n). 2n 3. Position of the e x t r e m e values of Tn(z)for z E [-1,1]
xp = cos (2p - l)?r
(19.197)
~
UT
( v = 0 . 1 , 2 ,...,n). n 4. Recursion Formula Tn+l= 2xTn(2) - Tn-l(Z)( n = 1 , 2 , . . . ; To(x)= 1, T,(z)= 2) From this recursion we have for example T2(2) = 2x2 - 1, T3(2)= 4x3 - 32, T~(Z) = 8x4- 82’+ 1, T5(x)= 16x5 - 20x3 52, 5u=cos-
T6(2) =
(19.198) (19.199)
(19.200a)
+
+
(19.200b)
32x6 - 48x4 182’ - 1,
(19.2OOc)
T,(x)= 64x’ - I l k 5 + 56x3 - 72, TB(z) = 1282’ - 2 5 6 + ~ 1~ 6 0 -~ 322’ ~ + 1, Tg(x)= 2562’ - 5762’ + 432x5 - 120x3 + 9x, Tlo(x)= 5 1 2 ~ ”- 12802’ + 1 1 2 0 -~ ~4 0 0 + ~ 50x’ ~ - 1.
(19.200d) (19.200e) (19.200f) (19.200g)
19.6.3.3 Remes Algorithm 1. Consequences of the Alternating Point Theorem The numerical solution of the continuous Chebyshev approximation problem originates from the alternating point theorem. We choose the approximating function (19.201) with n + l linearly independent known functions, and we denote by ai’ (i = 0,1,. . . , n)the coefficients of the solution of the Chebyshev problem and by Q = @(ao’,a ~ *. .~,an*) . the minimaldeviation according to (19.195). In the case when the functions f and gl (i = 0,1,.. . , n) are differentiable, from the alternating point theorem we have n
n
Cai’gi(2,)+(-1)”e=f(2,),
CaZ*gI(zv) = f’(x,) ( v = 1,2, . . . ,n + 2 ) .
i=O
i=O
(19.202)
The nodes x, are the alternating points with (19.203) a 5 x1 < x2 < . . . < x,+~5 b. The equations (19.202) give 2n + 4 conditions for the 2n + 4 unknown quantities of the Chebyshev approximation problem: n + 1 coefficients, n + 2 alternating points and the minimal deviation e. If the endpoints of the interval belong to the alternating points, then the conditions for the derivatives are not necessarily valid there.
2. Determination of the Minimal Solution according to Remes According to Remes, we proceed with the numerical determination of the minimal solution as follows: 1. We determine an approximation of the alternating points xu(0) (v = 1 , 2 , .. . , n 2) according to (19.203).e.g.! equidistant or as the positions of the extrema of Tn+l(x) (see 19.6.3.2, p. 920). 2. We solve the linear equation system
+
19.6 Approximation, Computation of Adjustment, Harmonic Analusis 923 and as a solution, we get the approximations aI(0) (i = 0 , 1 , . . . , n) and
eo.
3. \$’e determine a new approximation of the alternating points z Y ( l )(v = 1,2,. . . , n positions of the extrema of the error function f ( x ) -
e
+ 2), e.g., as
a,(’)g,(z). Now, it is sufficient to apply only
,=O
approximations of these points. By repeating steps 2 and 3 with z V ( l )and a,(’) instead of zY(”)and a,(’), etc. we obtain a sequence of approximations for the coefficients and the alternating points, whose convergence is guaranteed under certain conditions, which can be given (see [19.25]). We stop calculations if, e.g., from a certain iteration index p (19.204) holds with a sufficient accuracy.
19.6.3.4 Discrete Chebyshev Approximation and Optimization From the continuous Chebyshev approximation problem I
n
I
(19.205)
+
we get the corresponding discrete problem, if we choose N nodes X, (v = 1,2,. . . , N ; N 2 n 2) with the property a 5 $1 < 22 . < X N 5 b and require (19.206) We substitute (19.207) and obviously we have if(X”)
- kaigi(Xu)1I Y (v = 1 > 2 , .. . , N ) .
(19.208)
2=0
Eliminating the absolute values from (19.208) we obtain a linear inequality system for the coefficients a, and 7,so the problem (19.206) becomes a linear programming problem (see 18.1.1.1, p. 844):
I :I 7
y = min! subject to
+ ,E aig,(zu) 2
7-
f(XV),
c a,gi(z”) 2 - f ( z v )
( v = 1 , 2 ,..., N ) .
(19.209)
Equation (19.209) has a minimal solution with y > 0. For a sufficiently large number N of nodes and with some further conditions the solution of the discrete problem can be considered as the solution of the continuous problem. n
a,gi(z) a non-linear approximation
If we use instead of the linear approximation function g(z) = i=O
function g(z) = g(z: ao,a l , . . . , a n ) ,which does not depend linearly on the parameters ao, al, . . . a,, then we obtain analogously a non-linear optimization problem. It is usually non-convex even in the cases of simple function forms. This essentially reduces the number of numerical solution methods for
I
924
19. Numerical Analgais
non-linear optimization problems (see 18.2.2.1, p. 861).
19.6.4 Harmonic Analysis We want to approximate a periodic function f(z)with period 27r, which is given formally or empirically, by a trigonometric polynomial or a Fourier sum of the form a0 g(x) = - t C ( a k c o s k z t b k s i n k z ) ,
(19.210)
k=l
where the coefficients ao, a k and bk are unknown real numbers. The determination of the coefficients is the topic of harmonic analysis.
19.6.4.1 Formulas for Trigonometric Interpolation 1. Formulas for the Fourier Coefficients Since the function system 1, cos kx, sin kx ( k = 1 , 2 , .. . , n) is orthogonal in the interval [0,27~]with respect to the weight function w
I 1, we
get the formulas for the coefficients (19.211)
by applying the continuous least squares method according to (19.172). The coefficients ah and bk calculated by formulas (19.211) are called Fourier coeficienta of the periodic function f ( z ) (see 7.4, p. 418). If the integrals in (19.211) are complicated or the function f(z) is known only at discrete points, then the Fourier coefficients can be determined only approximately by numerical integration. Using the trapezoidal formula (see 19.3.2.2, p. 896) with N t 1 equidistant nodes 27T ( 19.212) 2, = vh (U = O , l , . . . N ) , h = -
N
we get the approximation formula 2 2 N 6k = - f ( Z v ) COS kz,, b k X gk = - f(X,) Sinkz, ( k = 0,1,2,. . . , n). (19.213) N u=l N w=l The trapezoidal formula becomes the very simple rectangular formula in the case of periodic functions. It has higher accuracy here as a consequence of the following fact: Iff z is periodic and (2m+ 2) times differentiable, then the trapezoidal formula has an error of order O(h'+'). 2. Trigonometric Interpolation Some special trigonometric polynomials formed with the approximation coefficients 6k and 6, have important properties. Two of them are mentioned here: 1. Interpolation Suppose N = 2n holds. The special trigonometric polynomial
ak
1
RZ
n-1 1 (ick cos kx t 6, sin kz) t -6, cos nz ijl(x)= $ t i o 2 2
(19.214)
k=l
with coefficients (19.213) satisfies the interpolation conditions (19.215) at the interpolation nodes z, (19.212). Because of the perodicity of f(z) we have f(zo) = f(z,v). 2. Approximation in M e a n Suppose N = 2n. The special trigonometric polynomial ij2(x)= sa0 1- t x ( 6 k c o s k x t i k s i n k z )
k=l
(19.216)
19.6 Approximation, Computation of Adjustment, Harmonic Analysis 925
form < nand with the coefficients (19.213) approximates the function f(z)in discrete quadratic mean with respect to the N nodes x, (19.212), that is, the residual sum of squares N
[fb") - Bz(r,)I2
F=
(19.217)
"=l
is minimal. The formulas (19.213) are the originatingpoint for the different ways of effective calculation of Fourier Coefficients.
19.6.4.2 Fast Fourier Transformation (FFT) 1. Computation costs of computing Fourier coefficients The sums in the formulas (19.213) also occur in connection with discrete Fourier transformation, e.g., in electrotechnics, in impulse and picture processing. Here N can be very large, so the occurring sums must be calculated in a rational way, since the calculation of the N approximating values (19.213) of the Fourier coefficients requires about N Zadditions and multiplications. For the special case of N = 2 P , the number of multiplications can be largely reduced from NZ(=22P) topN(= p2P) with the help of the so-called fast Fourier transformation FFT.The magnitude of this reduction is demonstrated on the example on the right-hand side. By this method, the computation costs and computation time are reduced so effectively that in some important application fields even a smaller computer is sufficient. The FFT uses the properties of the N-th unit roots, Le., the solutions of equation z N = 1to a successive sum up in (19.213).
2. Complex Representation of the Fourier Sum The principle of FFT can be described fairly easily if we rewrite the Fourier sum (19.210) with the formulas
into the complex form (19.220)
If we substitute
'/
U k - ibk ck = -, (19.221a) then because of (19.211) ck = 2 2.rr and (19.220) becomes the complex representation of the Fourier sum:
Ckeikx with
g(2) =
C-k
211
0
f(x)e-ikz2dx,
= Ek.
(19.221b)
(19.222)
k=-n
If the complex coefficients ck are known, then we get the required real Fourier coefficients in the following simple way: Uo = 2C0, Uk = 2Re(Ck), bk = -2Im(Ck) (k = 1,2,...,?I). (19.223)
3. Numerical Calculation of the Complex Fourier Coefficients For the numerical determination of ck we apply the trapezoidal formula for (19.221b) analogously to (19.212) and (19.213), and get the discrete complex Fourier coefficients Ek:
.
hT-1
N-1
(19.224a)
926
19. Numerical Analysis 1
fu
= lyf(”u)’
“u
2au N
Zni -W N = ~N .
= - ( u = 0 , 1 , 2 ,...,N - 1 ) ,
(19.224b)
Relation (19.224a) with the quantities (19.224b) is called the dzscrete complez Fourier transformation of length N of the values fv ( u = 0 , 1 , 2 , . . . , N - 1). The powers uk = z ( u = 0,1,2,. . . , N - 1) satisfy equation z N = 1. So, they are called the N-th unit roots. Since e-’‘’ = 1, W”+Z = w;, . . . . UN” = 1, w”+1 = (19.225) The effective calculation of the sum (19.224a) uses the fact that a discrete complex Fourier transforN mation of length N = 2n can be reduced to two transformations with length - = n in the following 2 way:
a) For every coefficient Ek with an even index, i.e., k = 21, we have In-1
E21
n-1
fyWp= 1 [ f u w 2t fn+uW;(n+”)]
= u=o
n-1
=
[fu
t f n + J w?.
(19.226)
”=O
u=o
wpwb
Here we use the equality w:(nt”) = = up. If we substitute Y, = f v t fn+v ( u = 0,1,2, , , , n - 1) and consider that w;, = wn, then j
n-1 E21
=
y”U:
( u = 0 , l : 2,. . . , n - 1)
(19.227)
(19.228)
”=O
is the discrete complex Fourier transformation of the values yu (u = 0 , 1 , 2 , . . . , n - 1) with length N
n=T. b) For every coefficient
with an odd index, i.e., with k = 22 + 1, we get analogously: (19.229)
(19.230) “-1
C2ltl
yn+&k
=
( u = 0,1,2:.
I . ,
n - 1)
(19.231)
”=O
which is the discrete complex Fourier transformation of the values yn+u ( u = 0, 1 , 2 , . . . , n - 1) with IY
length n = -. 2 The reduction according to a) and b), Le., the reduction of a discrete complex Fourier transformation to two discrete complex Fourier transformations of half the length, can be continued if N is a power of 2, Le., if N = 2 P (pis a natural number). The application of the reduction after p times is called the FFT. N Since every reduction step requires - complex multiplications because of (19.230), the computation 2 cost of the FFT method is
I
-.vp = 1 v log, N .
2
2
(19.232)
19.6 Avwoxzmation, Computation of Adiustment, Harmonic Analysis 927
4. Scheme for FFT For the special case N = 8 = 23, the three corresponding reduction steps of the FFT according to
Step 1
fo Yo = fo + f 4
+
fl
Y1
fi
YZ
= fl f 5 = f2 t f 6
f3
Y3
=f3
+f 7
Step 2
Step 3
yo := Yo + Yz Y1 := Y1 t Y3 Yz := (Yo - Y z ) 4
Yo := Yo f Y1 Y1 := (Yo - Y l ) 4
Y4
= (fo - f
4 ) 4
f5
Y5
= (fl - f
5 ) 4
f6
?j6
= ( f 2 - f6)lU’;
f7
Y7
y3
Y3
:= (Yz - Y 3 ) 4
:= Y4
Y4
:= Y4
Y5
:= (Y4 - Y 5 1 4
Y6
:= (Y4 - Y6)w40
y6
:= y 6 t 9 7
Y7
:= (Y5 - Y 7 ) 4
N
:= 4,
:= (Y1 - Y 3 ) 4
+ Y6 Y5 := Y5 + Y7
= (f3 - f,)wg3 277, IV = 8, n := 4, wg = e - T
:= YZ f
Y4
Y3
f4
92
+ Y5
Y7 := (Y6 - Y 7 ) 4 n := 2, w4 = w; N := 2, n := 1, w2 = wi
We can observe how terms with even and odd indices appear. In Scheme 2 (19.233) the structure of the method is illustrated. Scheme 2:
(19.233)
If the coefficients Ek are substituted into Scheme 1 and we consider the binary forms of the indices before step 1 and after step 3, then we recognize that the order of the required coefficients can be obtained by simply reverszng the order of the bzts of the binary form of their indices. This is shown in Scheme 3.
Scheme 3:
Index
Eo
000
6
Eo
E1
E2
E4
E5
OOL OLO OLL LOO LOL
E6
LLO
E7
LLL
E2
E4
Step 1 Step 2
Step 3 Index
6
000
E4
E4
LOO
E2
E2
OLO
26
E6
E6
LLO
E1
El
El
OOL
E3
E5
E5
E5
E3
E3
LOL OLL
E7
E7
E7
LLL
271 1 creteFouriertransformation. WechooseN = 8. Withz, = -, f - -f(x,) (v = 0,1,2,. . . , 7 ) , wg = 8 ’-8 e
_-Zni8
= 0.707107(1 - i). w,’= -is ui = -0.707107(1 t i ) we get Scheme 4:
I
928
19. Numerical Analusis
Scheme 4:
Step 1
Step 2
fo = 2.467401 yo = 3.701102
yo = 6.785353
Step 3 yo = 13.262281
fl
= 0.077106 y1 = 2.004763
y1 = 6.476928
y1 = 0.308425
= E4
fi
= 0.308425 y~ = 3.084251
yz = 0.616851
+ 2.467402 i = E2
f3
= 0.693957
f4
= 1.233701 y4 = 1.233700
U?
= 4.472165
y2
= 0.616851
un = 2.467402 i 214
= 1.233700
=&
y3 = 0.616851 - 2.467402 i = E6
y4 = 2.106058 + 5.9568331 = E 1
+2.467401 i f5
= 1.927657 y5 = -1.308537(1 - i)
y5
= 0,872358
ys = 0.361342 - 1.022031 i = E5
$3.489432 i fe = 2.775826 y6 = 2.467401 i
y6 =
1.233700
y6 = 0.361342 + 1.022031 i = E3
-2.467401 i f7
= 3.778208 y7 = 2.180895(1+ i)
y7 = -0.872358
y7 = 2.106058 - 5.9568331 = E7
+3.489432 i From the third (last) reduction step we get the required real Fourier coefficients according to (19.223). (See the right-hand side.) In this example, the general property CN-k
= ck
= 26.524 562 = 4.212 116 az = 1.233 702 a3 = 0.722684 a4 = 0.616850 a0
a1
= -11.913 666 bz = - 4.934 804 b3 = - 2.044 062 bd = 0 (19.234) bl
Of the discrete complex Fourier coefficients can be observed. For k = 1 , 2 , 3 , we see that E-, = & ,
4=
E2, E5 = E3.
19.7 Representation of Curves and Surfaces with Splines 19.7.1 Cubic Splines Since interpolation and approximation polynomials of higher degree usually have unwanted oscillations, it is useful to divide the approximation interval into subintervals by the so-called nodes and to consider a relatively simple approximation function on every subinterval. In practice, cubic polynomials are mostly used. We require a smooth transition at the nodes of this piecewise approximation.
19.7.1.1 Interpolation Splines 1. Definition of the Cubic Interpolation Splines, Properties Suppose there are given N interpolation points (xi,fi) (i = 1 , 2 , .. . , N ;z1 < 22 < . . . X N ) . The cubic interpolation spline S(z) is determined uniquely by the following properties: 1. S(z) satisfies the interpolation conditions S(zi)= fi (i = 1 , 2 , .. . , N). 2. S(z) is a polynomial of degree 5 3 in any subinterval [xi, (i = 1 , 2 , .. . , N - 1). 3. S(z) is twice continuously differentiable in the entire approximation interval [XI,ZN]. 4. S(z) satisfies the special boundary conditions: a) S”(z1) = S”(ZN) = 0 (we call them natural splines) or b) S’(z1) = fl’, s ’ ( z ~ = ) fN’ (fit and f ~ are ’ given values) or c) S(z1) = S ( X N in ) ~the cme of f1 = f N , S’(z1) = S ’ ( I N )and S”(z1) = S ” ( X N(we ) call them
19.7 Representation of Curves and Surfaces with Splines 929
periodic splines). It follows from these properties that for all twice continuously differentiable functions g ( z )satisfying the interpolation conditions g(zl)= ft (i = 1 , 2 , .. . , N ) (19.235) is valid (Holladay’s Theorem). Based on (19.235) we can say that S(z) has minimal total curvature, since for the curvature ti of a given curve, in a first approximation, ti M S”(see 3.6.1.2, 4.,p. 228). It can be shown that if we lead a thin elastic ruler (its name is spline) through the points (xi,j,) (i = 1:2 , . . . , N ) , its bending line follows the cubic spline S(z). 2. Determination of the Spline Coefficients We suppose that the cubic interpolation spline S(z) for z E [xi,z,+l]has the form:
+
+
(19.236) S(z) = Si(z)= a, bi(z - xi) t ci(z - xi)’ di(z - z ~ i ) (i ~ = 1 , 2 , .. . , N - 1). The length of the subinterval is denoted by hi = ziti - zi.We can determine the coefficients of the natural spline on the following way: 1. From the interpolation conditions we get a,=fi ( i = 1 , 2,...,N - 1 ) . (19.237) It is reasonable to introduce the additional coefficient a N = j N , which does not occur in the polynomials. 2. The continuity of Sff(z)at the interior nodes requires that
di-l - a‘ - c$-l (i = 2,3, . . . , N - 1). (19.238) 3h,-1 The natural conditions result in c1 = 0, and (19.238) still holds for i = N , if we introduce C N = 0. 3. The continuity of S(z) at the interior nodes results in the relation 2ci-l c, (19.239) b.1-1 - ai - ai-1 h,-l (i = 2 , 3 , . . .,A’). 3 hi-i 4. The continuity of S f ( z )at the interior nodes requires that
+
(i = 2,3,. . . ,N - 1). (19.240)
Because of (19.237), the right-hand side of the linear equation system (19.240) to determine the coefficients c, (i = 2,3, . . . , N - 1; c1 = c N = 0 ) is known. The left hand-side has the following form:
(19.241)
I
0
The coefficient matrix is tridzagonal, so the equation system (19.240) can be solved numerically very easily by an LR decomposition (see 19.2.1.1, 2., p. 888). We can then determine all other coefficients in (19.239) and (19.238) with these values c,.
19.7.1.2 Smoothing Splines The given function values j , are usually measured values in practical applications so they have some error. In this case, the interpolation requirement is not reasonable. This is the reason why cubic smoothang
I
930
19. Numerical Analusis
splines are introduced. We get this spline if in the cubic interpolation splines we replace the interpolation requirements by
2[1-’
+ X T[S”(r)]’dx = min!.
,=l
(19.242)
XI
We keep the requirements of continuity of S, S’and S“, so the determination of the coefficients is a constrained optimization problem with conditions given in equation form. The solution can be obtained by using a Lagrange function (see 6.2.5.6, p. 401). For details see [19.26]. In (19.242)X (A 2 0) represents a smoothingparameter, which must be given previously. For X = 0 we get the cubic interpolation spline, as a special case, for “large” X we get a smooth approximation curve, but it returns the measured values inaccurately, and for X = 03 we get the approximating regression line as another special case. A suitable choice of X can be made, e.g., on computer by screen dialog. The parameter q (q > 0) in (19.242) represents the standard deviation (see 16.4.1.3,2., p. 788) of the measurement errors, of the values fi (i = 1 , 2 , .. . , N ) . Until now, the abscissae of the interpolation points and the measurement points were the same as the nodes of the spline function. For large N this results in a spline containing a large number of cubic functions (19.236). A possible solution is to choose the number and the position of the nodes freely, because in many practical applications only a few spline segments are satisfactory. It is reasonable also from a numerical viewpoint to replace (19.236) by a spline of the form
c
rt2
S(x)=
aix,4(2).
(19.243)
i=l
Here r is the number of freely chosen nodes, and the functions Ne,4(x) are the so-called normalized Bsplines (basissplines) of order 4, i.e., polynomials of degree three, with respect to the i-th node. For details see [19.5].
19.7.2 Bicubic Splines 19.7.2.1 Use of Bicubic Splines Bicubic splines are used for the following problem: A rectangle R of the x,y plane, given by a 5 2 5 b, c 5 y 5 d, is decomposed by the grid points (xi,yj) ( i = 0, 1,. . . , n; j = 0,1,. . . , m) with (19.244) a = 2 0 < 21 < ’ . ’ < 2 , = b, c = YO < yi < ... < ym = d into subdomains h&, where the subdomain hl contains the points (x,y) with x, 5 x 5 xiflr yj 5 y 5 yjti (i = 0,1,. . . , n - 1; j = 0, 1,. . . m - 1). The values of the function j(x,y) are given at the grid points (19.245) f ( x i , yj) = jtJ (i = 0,1,. . . ,n; j = 0,1,. . . , m). A possible simple, smooth surface over R is required which approximates the points (19.245).
19.7.2.2 Bicubic Interpolation Splines 1. Properties The bicubic interpolation spline S ( x ,y) is defined uniquely by the following properties: 1. S(Z,y) satisfies the interpolation conditions S(Z,,Yj) = fil ( 2 = 0,1,. . . , n; j = 0,1,. . . , m). 2. S ( x ,y) is identical to a bicubic polynomial on every &j of the rectangle R, that is,
(19.246)
(19.247) on R,,3. So, S,,(z, y) is determined by 16 coefficients, and for the determination of S ( x ,y) we need 16 m n coefficients.
19.7 Representation of Curves and Surfaces with Splines 931
3. The derivatives
as as
82s
ax:
dsdy
-
ay’
(19.248)
are continuous on R. So, a certain smoothness is ensured for the entire surface. 4. S(z, y) satisfies the special boundary conditions:
dS
= p,,
z(z,,y,)
dS
%(xt,
d2S
&dy(z,l
for i = 0,n; j = O , l , .
y,) = qej for
y,) = T
, ~
. . ,m,
i = 0 , 1 , . . . n; j = O,m,
(19.249)
for i = 0, n;j = 0, m.
Here pi, qiJ and T , are ~ previously given values. We can use the results of one-dimensional cubic spline interpolation for the determination of the coefficients aykl. We see: 1. There is a very large number (271 + m + s) of linear equation systems but only with tridiagonal coefficient matrices. 2. The linear equation systems differ from each other only on their right-hand sides. In general. it can be said that bicubic interpolation splines are useful with respect to computation cost and accuracy, and so they are appropriate procedures for practical applications. For practical methods of computing the coefficients see the literature. ~
2. Tensor Product Approach The bicubic spline approach (19.247) is an example of the so-called tensor product approach having the form n
m
i=O
j=O
S(2,Y)= CCazjsl(z)MY)
(19.250)
and which is especially suitable for approximations over a rectangular grid. The functions si(.) (i = , , n ) and h,(y) ( j = 0 , 1 , . . . m) form two linearly independent function systems. The tensor product approach has the big advantage, from numerical viewpoint, that, e.g., the solution of a twodimensional interpolation problem (19.246) can be reduced to a one-dimensional one. Furthermore, the two-dimensional interpolation problem (19.246) is uniquely solvable with the approach (19.250) if 1. the one-dimensional interpolation problem with functions si(.) with respect to the interpolation nodes 2 0 , zl,. . . ,z,and 2. the one-dimensional interpolation problem with functions h,(y) with respect to the interpolation nodes YO,?/I,. . . , Ym are uniquely solvable. An important tensor product approach is that with the cubic B-splines: 0,1,,
.
~
(19.251) Here, the functions Nt,4(z) and Nj,4(y)are normalized B-splines of order four. Here r denotes the number of nodes with respect to 2 , p denotes the number of nodes with respect to y. The nodes can be chosen freely but their positions must satisfy certain conditions for the solvability of the interpolation problem. The B-spline approach results in an equation system with a band structured coefficient matrix, which is a numerically useful structure.
19. Numerical Analusis
932
For solutions of different interpolation problems using bicubic B-splines see the literature
19.7.2.3 Bicubic Smoothing Splines The one-dimensional cubic approximation spline is mainly characterized by the optimality condition (19.242). For the two-dimensional case we could determine a whole sequence of corresponding optimality conditions, however only a few special cases make the existence of a unique solution possible. For appropriate optimality conditions and algorithms for solution of the approximation problem with bicubic B-splines see the literature.
19.7.3 Bernstein-B6zier Representation of Curves and Surfaces 1. Bernstein Basis Polynomials The Bernstein-Bkzier representation (briefly B-B representation) of curves and surfaces applies the Bernstein polynomials
B,,,(t) =
(;)
tt(1 - t)n-,
(2
= 41,. . .,n)
(19.252)
and uses the following fundamental properties: 1. 0 5 Bi,,(t) 5 1 for 0 5 t 5 1,
(19.253)
n
2. CB,,,(t) = 1. a
(19.254)
d
Formula (19.254) follows directly from the binomial theorem (see 1.1.6.4, p. 12). IA: Bol(t) = 1 - t , B,,,(t)= t (Fig. 19.12). IB: B03(t)= (1 - t ) 3 ,B1,3(t)= 3t(l - t ) ' , B2,3(t)= 3t2(1- t ) B3,3(t)= t3 (Fig. 19.13).
-b
t
Figure 19.12
Figure 19.13
2. Vector Representation In the following, a space curve, whose parametric representation is I = z ( t ) ,y = y(t), z = z ( t ) ,will be denoted in vector form by (19.255) r'= i ( t )= I ( t )& + y(t) sv z ( t )&. Here t is the parameter of the curve. The corresponding representation of a surface is (19.256) r' = P(u, v) = z(u,w) & + y(u, v) Gy + z ( u ,v) &. Here, u and w are the surface parameters.
+
19.7.3.1 Principle of the B-B Curve Representation
Suppose there are given n + 1 vertices P, (i = O , l , . . . , n) of a three-dimensional polygon with the position vectors P,. Introducing the vector-valued function n
r'(t) =
CBi,,(t)Pi t=O
(19.257)
19.8 Using the Computer 933
bp:
we assign a space curve to these points, which is called the B-B curve. Because of (19.254) we can consider (19.257) as a “variable convex combination” of the given points. The three-dimensional curve (19.257) has the following important properties: 1. The pointsPo and Pnzre interpolated. andPn. 2. Vectors Popl and Pn-lP, are tangents to t ( t )at points PO The relation between a polygon and a B-B curve is shown in Fig. 19.14. The B-B representation is considered as a design of the curve, since we can easily influence the shape of the curve by changing p4 the polygon vertices. We often use normalized B-splines instead of Bernstein polyFigure 19.14 nomials. The corresponding space curves are called the B-spline curves. Their shape corresponds basically to the B-B curves with the following advantages: 1. The polygon is better approximated. 2. The B-spline curve changes only locally if the polygon vertices are changed. 3. In addition to the local changes of the shape of the curve the differentiability can also be influenced. So, it is possible to produce break points and line segments for example.
19.7.3.2 B-B Surface Representation Suppose there are given the points Pij (i = 0, 1,. . . , n; j = 0,1,. . . , rn) with the position vectors P i j , which can be considered as the nodes of a grid along the parameter curves of a surface. Analogously to the B-B curves (19.257), we assign a surface to the grid points by n
m
=
<(u,
Bi,n(u)Bj,m(u)*ij.
(19.258)
%=O j=O
Representation (19.258) is useful for surface design, since by changing the grid points we may change the surface. Anyway, the influence of every grid point is global, so we should change from the Bernstein polynomials to the B-splines in (19.258).
19.8 Using the Computer 19.8.1 Internal Symbol Representation Computer are machines that work with symbols. The interpretation and processing of these symbols is determined and controlled by the software. The external symbols, letters, cyphers and special symbols are internally represented in binary code by a form of bit sequence. A bit (binary digit) is the smallest representable information unit with values 0 and 1. Eight bits form the next unit, the byte. In a byte we can distinguish between 28 bit combinations, so 256 symbols can be assigned to them. Such an assignment is called a code. There are different codes; one of the most widespread is A S C I I (American Standard Code for Information Interchange).
19.8.1.1 Number Systems 1. Law of Representation Numbers are represented in computers in a sequence of consecutive bytes. The basis for the internal representation is the binary system, which belongs to the polyadic systems, similarly to the decimal system. The law of representation for a polyadic number system is n
a=
1 z,Bz t=-m
(rn > 0, n 2 0;m, n integer)
(19.259)
934 19. Numerical Analysis with B as basis and z, (0 5 zi < B ) a6 a cypher of the number system. The positions i 2 0 form the integers, those with a < 0 the fractional part of the number. W For the decimal number representation, i.e., B = 10, of the decimal number 139.8125 we get 139.8125 = 1 '10' + 3 10' + 9.10' + 8 . lo-' + 1 . lo-' + 2 . +5. The number systems occurring most often in computers are shown in Table 19.3. Table 19.3 Number systems -
Number system
Basis Corresponding cyphers
Binary system Octal system Hexadecimal system Decimal system
2. Conversion The transition from one number system to another is called convemon. If we use different number systems in the same time, in order to avoid confusion we put the basis as an index. W The decimal number 139.8125 is in different systems: 139.812510 = 10001011.1101~= 213.648 = 8B.D16. 1. Conversion of Binary Numbers into Octal or Hexadecimal Numbers The conversion of binary numbers into octal or hexadecimal numbers is simple. We form groups of three or four bits starting at the binary point to the left and to the right, and we determine their values. These values are the cyphers of the octal or hexadecimal systems. 2. Conversion of Decimal Numbers into Binary, Octal or Hexadecimal Numbers For the conversion of a decimal numbers into another system, we adept the following rules for the integer and for the fractional part separately: a) Integer Part: If G is an integer in the decimal system, then for the number system with basis B the law of formation (19.259) is:
cz,B~ n
G=
(19.260)
( n 2 0)
a=O
If we divide G by B , then we get an integer part (the sum) and a residue: (19.261) Here. zo can have the values 0,1,. . . , B - 1, and it is the lowest valued cypher of the required number. If we repeat this method for the quotients, we get further cyphers. b) Fractional Part: If g is a proper fraction, then the method to convert it into the number system with basis B is
+ c z.J-"+l, m
gB = z-1
(19.262)
a='
i t . , we get the next cypher as the integer part of the product gB. The values z-2, obtained in the same way.
z-3,
. . . can be
19.8 Usinq the Computer 935
IB: Conversion of a decimal fraction 0.8125 into a binary fraction. 0.8125.2 = 1.625 (1 = 2-1) 0.625 . 2 = 1.25 (1 = 2-2) 0.25 . 2 = 0.5 (0 = Z-3) 0.5 . 2 = 1.0 (1 = 2-4) 0.0 . 2 = 0.0
IA: Conversion of the decimal number 139 into a binary number. 139:2 = 69 residue 1 (1 = zo) 69: 2 = 34 residue 1 (1 = zl) 34:2 = 17 residue 0 (0 = 22) : 1 7 : 2 = 8 residue 1 8 : 2 = 4 residue 0 : 4 : 2 = 2 residue 0 : 2 : 2 = 1 residue 0 : 1 : 2 = 0 residue 1 (1 = z7)
0.812510 = 0.11012
13910 = 100010112 3. Conversion of Binary, Octal, a n d Hexadecimal N u m b e r s i n t o a Decimal N u m b e r The algorithm for the conversion of a value from the binary, octal, or hexadecimal system into the decimal system is the following, where the decimal point is after ZO: a=
2
( m > 0, n 2 0,integer).
Z,B%
(19.263)
t=-m
The calculation is convenient with the Horner rule. LLLOL = 1 .24 1 .23 + 1 . 2 ’ + 0 . 2 l + 1.2’ = 29. The corresponding Horner scheme is shown on the right.
+
111 0 1 21 2614; 1 3 7 14 29
19.8.1.2 Internal Number Representation Binary numbers are represented in computers in one or more bytes. We distinguish between two types of form of representation, the jmd-point numbers and the jloatzng-point numbers. In the first case, the decimal point is at a fixed place, in the second case it is “floating” with the change of the exponent.
-I
1. F i x e d - p o i n t Numbers binary number (t bits) The range for fixed-point numbers with the I given parameters is 0 5 1 a I 5 2t - 1. (19.264) Fixed-point numbers can be represented in the form of Fig. 19.15. sign IJ of the fixed-point number 2. F l o a t i n g - P o i n t Numbers Figure 19.15 Basically. two different forms are in use for the representation of floating-point numbers, where the internal implementation can vary in detail. 1. Normalized Semilogarithmic Form In the first form, the signs of the exponent E and the mantissa M of the number a are stored separately a = fMBiE. (19.26Sa) Here the exponent E is chosen so that for the sign u, sign IJ, mantissa 1/B I M < 1 (19.265b) of the exponent of the mantissa holds. We call it the normalized semilogarithFigure 19.16 micfonn (Fig. 19.16).
I-
The range of the absolute value of the floating-point numbers with the given parameters is: 2 4 5 I a 1 5 (1 - 2 - t )
. z(V-1).
(19.266)
I
936
19. Numerical Analysis
2. IEEE Standard The second (nowadays used) form of floating-point numbers corresponds to the IEEE (Institute of Electrical and Electronics Engineers) standard accepted in 1985. It deals with the requirements of computer arithmetic, roundoff behavior, arithmetical operators, conversion of numbers, comparison operators and handling of exceptional cases such as over- and underflow. The floating-point number representations are mantissaM pharacteristic C shown in Fig. 19.17. We get the characteristic C from the exponent E by addition of a suitable constant K . This is chosen so that we get only positive numbers in the characteristic. The representable number is sign IJ of the floating-point number u = (-1)”2’.l.blbz
. . .bt-1
with E = C - K.
(19.267)
Figure 19.17
Single precision Double precision
Parameter Word length in bits Maximal exponent E,, Minimal exponent E,,, Constant K Number of bits in exponent Number of bits in mantissa
32 $127 -126 $127 8 24
64 $1023 -1022 $1023 11 53
19.8.2 Numerical Problems in Calculations with Computers 19.8.2.1 Introduction, Error Types The general properties of calculations with a computer are basically the same as those of calculations done by hand, however some of them need special attention, because the accuracy comes from the r e p resentation of the numbers, and from the missing judgement with respect to the errors of the computer. Furthermore, computers perform many more calculation steps than human can do manually. So, we have the problem of how to influence and control the errors, e.g., by choosing the most appropriate numerical method among the mathematically equivalent methods. In further discussions, we will use the following notation, where x denotes the exact value of a quantity, which is mostly unknown, and j.is an approximation value of 2: Absolute error: 14x1 = 1x - 51.
(19.268)
Relative error:
lTl
ax =1 2 -T j. 1 .
(19.269)
The notation C(Z)= z
2-5 - j. and crel(x)= -
(19.270)
is also often used.
19.8.2.2 Normalized Decimal Numbers and Round-Off 1. Normalized Decimal Numbers Every real number z # 0 can be expressed as a decimal number in the form z = *O.blbZ.. . . loE (bi # 0).
(19.271)
19.8 Usina the Computer 937 Here 0, b l b z . . . is called the mantissa formed with the cyphers b, E (0, 1 , 2 , . . . ,9}.The number E is an integer, the so-called exponent with respect to the base 10. Since bl # 0, we call (19.271) a normalized decimal number. Since only finitely many cyphers can be handled by a real computer, we have to restrict ourselves to a fixed number t of mantissa cyphers and to a fixed range of the exponent E . So, from the number x given in (19.271) we obtain the number
i={
*O.blbz...bt. loE f ( 0 . b l b z ' . bt 10-t)lOE
+
for bt+l 5 5 (round-down), for bt+' > 5 (round-up),
(19.272)
by round-off (as it is usual in practical calculations). For the absolute error caused by round-off, =
IL
- j.1 5 0.5. 10-tlOE.
( 19.273)
2. Basic Operations and Numerical Calculations Every numerical process is a sequence of basic calculation operations. Problems arise especially from the finite number of positions in the floating-point representation. We give here a short overview. We suppose that x and y are normalized, error-free floating-point numbers with the same sign and with non-zero values:
x = m₁B^(E₁),  y = m₂B^(E₂)   (19.274a)
with
m_i = Σ_{k=1}^{t} a_k^(i) B^(−k),  a_1^(i) ≠ 0,   (19.274b)
and
a_k^(i) ∈ {0, 1, …, B − 1}  for k > 1  (i = 1, 2).   (19.274c)
1. Addition If E₁ > E₂, then the common exponent becomes E₁, since normalization allows only a left-shift of the mantissa. The mantissas are then added. If
B^(−1) ≤ |m₁ + m₂B^(−(E₁−E₂))| < 2   (19.275a)
and
|m₁ + m₂B^(−(E₁−E₂))| ≥ 1,   (19.275b)
then shifting the decimal point by one position to the left results in an increase of the exponent by one.
■ 0.9604 · 10³ + 0.5873 · 10² = 0.9604 · 10³ + 0.05873 · 10³ = 1.01913 · 10³ = 0.1019 · 10⁴.
2. Subtraction The exponents are equalized as in the case of addition, and the mantissas are then subtracted. If
|m₁ − m₂B^(−(E₁−E₂))| ≤ 1 − B^(−t)   (19.276a)
and
|m₁ − m₂B^(−(E₁−E₂))| < B^(−1),   (19.276b)
then shifting the decimal point to the right by at most t positions results in the corresponding decrease of the exponent.
■ 0.1004 · 10³ − 0.9988 · 10² = 0.1004 · 10³ − 0.09988 · 10³ = 0.00052 · 10³ = 0.5200 · 10⁰.
This example shows the critical case of subtractive cancellation. Because of the limited number of positions (here four), zeros are carried in from the left instead of the correct digits.
3. Multiplication The exponents are added and the mantissas are multiplied. If
m₁m₂ < B^(−1),   (19.277)
then the decimal point is shifted to the right by one position and the exponent is decreased by one.
■ (0.3176 · 10³) · (0.2504 · 10⁵) = 0.07952704 · 10⁸ = 0.7953 · 10⁷.
4. Division The exponents are subtracted and the mantissas are divided. If
m₁/m₂ ≥ 1,   (19.278)
then the decimal point is shifted to the left by one position and the exponent is increased by one.
■ (0.3176 · 10³)/(0.2504 · 10⁵) = 1.2683706… · 10⁻² = 0.1268 · 10⁻¹.
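The alignment of the mantissas and the subsequent rounding can be imitated with Python's decimal module by setting the precision to t = 4 significant digits, as a stand-in for the four-digit decimal machine used in the examples above:

from decimal import Decimal, getcontext

getcontext().prec = 4                  # t = 4 mantissa digits, B = 10

print(Decimal("0.9604E3") + Decimal("0.5873E2"))   # 1019, i.e. 0.1019 * 10^4
print(Decimal("0.1004E3") - Decimal("0.9988E2"))   # 0.52: the leading digits cancel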
5. Error of the Result The error of the result of the four basic operations with error-free operands is a consequence of round-off. For the relative error, with t positions and base B, the bound is
(B/2) · B^(−t).   (19.279)
6. Subtractive Cancellation As we have seen above, the critical operation is the subtraction of nearly equal floating-point numbers. If possible, we should avoid it by changing the order of operations or by using suitable identities.
■ z = √1985 − √1984 = 0.4455 · 10² − 0.4454 · 10² = 0.1 · 10⁻¹,  or
z = (1985 − 1984)/(√1985 + √1984) = 1/(√1985 + √1984) = 0.1122 · 10⁻¹.
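The same effect can be reproduced with four-digit decimal arithmetic; the following Python sketch is only an added illustration of the cancellation and of the reformulated expression:

from decimal import Decimal, getcontext
from math import sqrt

getcontext().prec = 4                 # 4-digit decimal machine

a = +Decimal(str(sqrt(1985)))         # unary + rounds to the working precision: 44.55
b = +Decimal(str(sqrt(1984)))         # 44.54
print(a - b)                          # 0.01      -- only one significant digit survives
print(1 / (a + b))                    # 0.01122   -- the identity avoids the cancellation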
19.8.2.3 Accuracy in Numerical Calculations
1. Types of Errors Numerical methods have errors. There are several types of errors from which the total error of the final result is accumulated: input errors, round-off errors, and the errors of the method, i.e., truncation and discretization errors (Fig. 19.18).
2. Input Error
1. Notion of Input Error The input error is the error of the result caused by inaccurate input data. Slight inaccuracies of the input data are also called perturbations. The determination of the error of the result from the errors of the input data is called the direct problem of error calculus. The inverse problem is the following: how large an error may the input data have so that the final error does not exceed an acceptable tolerance? The estimation of the input error in rather complex problems is very difficult and usually hardly possible. In general, for a real-valued function y = f(x) with x = (x₁, x₂, …, x_n)ᵀ, the absolute value of the input error satisfies
|Δy| = |f(x₁, x₂, …, x_n) − f(x̃₁, x̃₂, …, x̃_n)| = |Σ_{i=1}^{n} (∂f/∂x_i)(ξ₁, ξ₂, …, ξ_n)(x_i − x̃_i)|   (19.280)
if we use the Taylor formula (see 7.3.3.3, p. 415) for y = f(x) = f(x₁, x₂, …, x_n) with a linear remainder. Here ξ₁, ξ₂, …, ξ_n denote intermediate values and x̃₁, x̃₂, …, x̃_n denote the approximations of x₁, x₂, …, x_n; the approximating values are the perturbed input data. In this context we also refer to the Gauss error propagation law (see 16.4.2.1, p. 792).
2. Input Error of Simple Arithmetic Operations The input error is known for the simple arithmetical operations. With the notation of (19.268)–(19.270) we get for the four basic operations:
ε(x ± y) = ε(x) ± ε(y),   (19.281)
ε(xy) = yε(x) + xε(y) + ε(x)ε(y),   (19.282)
ε(x/y) = (yε(x) − xε(y))/y² + terms of higher order in ε,   (19.283)
ε_rel(x/y) = ε_rel(x) − ε_rel(y) + terms of higher order in ε.   (19.286)
The formulas show: small relative errors of the input data result in small relative errors of the result for multiplication and division. For addition and subtraction, the relative error can become very large if |x ± y| ≪ |x| + |y|.
3. Error of the Method
1. Notion of the Error of the Method The error of the method comes from the fact that theoretically continuous phenomena must be approximated numerically, in many different ways, as limits. Hence we have truncation errors in limiting processes (as, e.g., in iteration methods) and discretization errors in the approximation of continuous phenomena by a finite discrete system (as, e.g., in numerical integration). Errors of the method exist independently of the input and round-off errors; consequently, they can be investigated only in connection with the applied solution method.
2. Applying Iteration Methods If we use an iteration method, we should keep in mind that both cases may occur: we can get a correct solution or also a false solution of the problem. It is also possible that an iteration method yields no solution although one exists. To make an iteration method more transparent and safer, the following advice should be considered:
a) To avoid "infinite" iterations, we should count the number of steps and stop the process if this number exceeds a previously given value (i.e., we stop without reaching the required accuracy).
b) We should keep track of the intermediate results on the screen by a numerical or a graphical representation.
c) We should use all known properties of the solution, such as gradient, monotonicity, etc.
d) We should investigate the possibilities of scaling the variables and functions.
e) We should perform several tests, varying the step size, truncation conditions, initial values, etc.
4. Round-off Errors Round-off errors occur because the intermediate results must be rounded. So they are essential in judging numerical methods with respect to the required accuracy. Together with the input errors and the errors of the method, they determine whether a given numerical method is strongly stable, weakly stable or unstable. Strong stability, weak stability, or instability occur if the total error, for an increasing number of steps, decreases, stays of the same order, or increases, respectively. In the case of instability we distinguish between the sensitivity with respect to round-off and discretization errors (numerical instability) and the sensitivity with respect to errors in the initial data at a theoretically exact calculation (natural instability). A calculation process is appropriate if the numerical instability is not greater than the natural instability. For the local propagation of round-off errors, i.e., errors at the transition from one calculation step to the next, the same estimation technique can be used as for the input error.
5. Examples of Numerical Calculations We illustrate some of the problems mentioned above by numerical examples.
■ A: Roots of a Quadratic Equation ax² + bx + c = 0 with real coefficients a, b, c and D = b² − 4ac ≥ 0 (real roots). Critical situations are the cases a) |4ac| ≪ b² and b) 4ac ≈ b². Recommended procedure:
a) x₁ = −(b + sign(b)√D)/(2a),  x₂ = c/(a x₁)  (Vieta's root theorem, see 1.6.3.1, 3., p. 44).
b) The vanishing of D cannot be avoided by a direct method. Subtractive cancellation occurs, but the error in (b + sign(b)√D) is not too large, since |b| ≫ √D holds.
■ B: Volume of a Thin Spherical Shell for h ≪ r. In the form V = (4π/3)[(r + h)³ − r³] there is a case of subtractive cancellation because (r + h) ≈ r. In the equivalent form V = (4π/3)(3r²h + 3rh² + h³) there is no such problem.
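A short Python sketch of procedure a) for the case |4ac| ≪ b² (the function name is ours):

import math

def roots_stable(a, b, c):
    # Real roots of a*x^2 + b*x + c = 0, avoiding cancellation when |4ac| << b^2
    D = b * b - 4 * a * c
    x1 = -(b + math.copysign(math.sqrt(D), b)) / (2 * a)
    x2 = c / (a * x1)                       # Vieta: x1 * x2 = c / a
    return x1, x2

# b^2 dominates 4ac: the naive formula (-b + sqrt(D)) / (2a) would cancel for one root
print(roots_stable(1.0, 1.0e8, 1.0))        # approximately (-1e8, -1e-8)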
■ C: Determining the Sum S = Σ_{k=1}^{∞} 1/(k² + 1)  (S = 1.07667…) with an accuracy of three significant digits. Performing the calculations with 8 digits, we would have to add about 6000 terms. After the identical transformation
1/(k² + 1) = 1/k² − 1/(k²(k² + 1)),
we see that
S = Σ_{k=1}^{∞} 1/k² − Σ_{k=1}^{∞} 1/(k²(k² + 1)) = π²/6 − Σ_{k=1}^{∞} 1/(k²(k² + 1)).
By this transformation we have to consider only eight terms.
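The effect of the transformation can be checked numerically; the following Python sketch (only an added illustration) compares the slowly converging direct sum with the transformed one:

import math

direct = sum(1.0 / (k * k + 1) for k in range(1, 6001))                                  # about 6000 terms
transformed = math.pi**2 / 6 - sum(1.0 / (k * k * (k * k + 1)) for k in range(1, 9))     # 8 terms

print(direct, transformed)    # both agree with S = 1.07667... to about three digits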
■ D: Avoiding the 0/0 Situation in the function t = (1 − √(1 − x² − y²))/(x² + y²) for x = y = 0. Multiplying the numerator and the denominator by (1 + √(1 − x² − y²)) we avoid this situation.
■ E: Example of an Unstable Recursive Process. Algorithms of the general form y_(n+1) = a y_n + b y_(n−1) (n = 1, 2, …) are stable if the condition |a/2 ± √(a²/4 + b)| ≤ 1 is satisfied. The special case y_(n+1) = −3y_n + 4y_(n−1) (n = 1, 2, …) is unstable. If y₀ and y₁ have the errors ε and −ε, then for y₂, y₃, y₄, y₅, y₆, … we get the errors 7ε, −25ε, 103ε, −409ε, 1639ε, …. The process is unstable for the parameters a = −3 and b = 4.
■ F: Numerical Integration of a Differential Equation. For the first-order ordinary differential equation
y' = f(x, y)  with  f(x, y) = ay   (19.287)
and the initial value y(x₀) = y₀ we consider the numerical solution.
a) Natural Instability. Together with the exact solution y(x) for the exact initial value y(x₀) = y₀, let u(x) be the solution for a perturbed initial value. Without loss of generality we may assume that the perturbed solution has the form
u(x) = y(x) + εη(x),   (19.288a)
where ε is a parameter with 0 < ε < 1 and η(x) is the so-called perturbation function. Considering that u'(x) = f(x, u), we get from the Taylor expansion (see 7.3.3.3, p. 415)
u'(x) = f(x, y(x) + εη(x)) = f(x, y) + εη(x) f_y(x, y) + terms of higher order,   (19.288b)
which implies the so-called differential variation equation
η'(x) = f_y(x, y) η(x).   (19.288c)
The solution of this equation with f(x, y) = ay is
η(x) = η₀ e^(a(x−x₀))  with  η₀ = η(x₀).   (19.288d)
For a > 0, even a small initial perturbation η₀ results in an unboundedly increasing perturbation η(x). So there is a natural instability.
b) Investigation of the Error of the Method for the Trapezoidal Rule. With a = −1, the stable
differential equation y'(x) = −y(x) has the exact solution y(x) = y₀e^(−(x−x₀)), where y₀ = y(x₀). The trapezoidal rule is
∫_{x_i}^{x_(i+1)} y(x) dx ≈ (h/2)(y_i + y_(i+1))   (19.289a)    with  h = x_(i+1) − x_i.   (19.289b)
Using this formula for the given differential equation we get
ỹ_(i+1) = ỹ_i + ∫_{x_i}^{x_(i+1)} (−y) dx ≈ ỹ_i − (h/2)(ỹ_i + ỹ_(i+1)),  i.e.,  ỹ_(i+1) = ((2 − h)/(2 + h)) ỹ_i  or  ỹ_i = ((2 − h)/(2 + h))^i ỹ₀.   (19.289c)
With x_i = x₀ + ih, i.e., with i = (x_i − x₀)/h, we get for 0 ≤ h < 2
ỹ_i = [c(h)]^(x_i − x₀) ỹ₀  with  c(h) = ((2 − h)/(2 + h))^(1/h).   (19.289d)
If ỹ₀ = y₀, then ỹ_i < y_i, and for h → 0 the numerical solution tends to the exact solution y₀e^(−(x_i − x₀)).
c) Input Error. In b) we supposed that the exact and the approximate initial values coincide. Now we investigate the behavior when ỹ₀ ≠ y₀ with |ỹ₀ − y₀| ≤ ε₀. Then ε_(i+1) is at most of the same order as ε₀, and the method is stable with respect to the initial values. We have to mention that in solving the above differential equation with the Simpson method an artificial instability is introduced. In this case, for h → 0, we would get the general solution
ỹ_i = C₁e^(−x_i) + C₂(−1)^i e^(x_i/3).   (19.290b)
The problem is that the numerical solution method uses higher-order differences than those corresponding to the order of the differential equation.
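The behaviour described in b) can be observed directly; the following Python sketch (only an added illustration) applies the recursion (19.289c) to y' = −y, y(0) = 1, and compares the value at x = 1 with the exact solution for decreasing step sizes:

import math

def trapezoidal_solution(h, x_end, y0=1.0):
    # y_{i+1} = (2 - h)/(2 + h) * y_i, i.e. the trapezoidal rule applied to y' = -y
    y = y0
    for _ in range(round(x_end / h)):
        y *= (2.0 - h) / (2.0 + h)
    return y

exact = math.exp(-1.0)
for h in (0.5, 0.1, 0.01):
    approx = trapezoidal_solution(h, 1.0)
    print(h, approx, abs(approx - exact))   # the error of the method shrinks as h -> 0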
19.8.3 Libraries of Numerical Methods
Over time, libraries of functions and procedures for numerical methods have been developed independently of each other in different programming languages. An enormous amount of computing experience went into their development, so for the solution of practical numerical problems we should use the programs of one of these program libraries. Programs are available for the current operating systems such as Windows, UNIX and Linux and for almost every type of computational problem; since they follow certain conventions, it is more or less easy to use them. The application of methods from program libraries does not relieve the user of the necessity of thinking about the expected results. This is a warning that the user should be informed about the advantages, but also about the disadvantages and weaknesses, of the mathematical method he or she is going to use.
19.8.3.1 NAG Library The NAG library (Numerical Algorithms Group) is a rich collection of numerical methods in the form of functions and subroutines/procedures in the programming languages FORTRAN 77, FORTRAN 90,
and C. Here is a contents overview:
1. Complex arithmetic
2. Roots of polynomials
3. Roots of transcendental equations
4. Series
5. Integration
6. Ordinary differential equations
7. Partial differential equations
8. Numerical differentiation
9. Integral equations
10. Interpolation
11. Approximation of curves and surfaces from data
12. Minimum/maximum of a function
13. Matrix operations, inversion
14. Eigenvalues and eigenvectors
15. Determinants
16. Simultaneous linear equations
17. Orthogonalization
18. Linear algebra
19. Simple calculations with statistical data
20. Correlation and regression analysis
21. Random number generators
22. Non-parametric statistics
23. Time series analysis
24. Operations research
25. Special functions
26. Mathematical and computer constants
Furthermore, the NAG library contains extensive software for statistics and financial mathematics.
19.8.3.2 IMSL Library
The IMSL library (International Mathematical and Statistical Library) consists of three synchronized parts: general mathematical methods, statistical problems, and special functions. The sublibraries contain functions and subroutines in FORTRAN 77, FORTRAN 90 and C. Here is a contents overview:
General Mathematical Methods
1. Linear systems
2. Eigenvalues
3. Interpolation and approximation
4. Integration and differentiation
5. Differential equations
6. Transformations
7. Non-linear equations
8. Optimization
9. Vector and matrix operations
10. Auxiliary functions
Statistical Problems
1. Elementary statistics
2. Regression
3. Correlation
4. Variance analysis
5. Categorization and discrete data analysis
6. Non-parametric statistics
7. Test of goodness of fit and test of randomness
8. Analysis of time series and forecasting
9. Covariance and factor analysis
10. Discriminance analysis
11. Cluster analysis
12. Random sampling
13. Life time distributions and reliability
14. Multidimensional scaling
15. Estimation of reliability function, hazard rate and risk function
16. Line-printer graphics
17. Probability distributions
18. Random number generators
19. Auxiliary algorithms
20. Auxiliary mathematical tools
Special Functions
1. Elementary functions
2. Trigonometric and hyperbolic functions
3. Exponential and related functions
4. Gamma function and relatives
5. Error functions and relatives
6. Bessel functions
7. Kelvin functions
8. Bessel functions with fractional orders
9. Weierstrass elliptic integrals and related functions
10. Different functions
19.8.3.3 Aachen Library
The Aachen library is based on the collection of formulas for numerical mathematics by G. Engeln-Müllges (Fachhochschule Aachen) and F. Reutter (Rheinisch-Westfälische Technische Hochschule Aachen). It exists in the programming languages BASIC, QUICKBASIC, FORTRAN 77, FORTRAN 90, C, MODULA 2 and TURBO PASCAL. Here is an overview:
1. Numerical methods for solving non-linear and special algebraic equations
2. Direct and iterative methods for solving systems of linear equations
3. Systems of non-linear equations
4. Eigenvalues and eigenvectors of matrices
5. Linear and non-linear approximation
6. Polynomial and rational interpolation, polynomial splines
7. Numerical differentiation
8. Numerical quadrature
9. Initial value problems of ordinary differential equations
10. Boundary value problems of ordinary differential equations
The programs of the Aachen library are especially suitable for the investigation of individual algorithms of numerical mathematics.
19.8.4 Application of Computer Algebra Systems
19.8.4.1 Mathematica
1. Tools for the Solution of Numerical Problems The computer algebra system Mathematica offers very effective tools for solving a large variety of numerical mathematical problems. The numerical procedures of Mathematica are totally different from symbolic calculations: Mathematica determines a table of values of the considered function according to certain previously given principles, similarly to the case of graphical representations, and it determines the solution using these values. Since the number of points must be finite, this can be a problem with badly behaving functions. Although Mathematica tries to choose more nodes in problematic regions, a certain continuity on the considered domain has to be assumed. This can be the cause of errors in the final result. It is advisable to use as much information as possible about the problem under consideration and, if possible, to perform calculations in symbolic form, even if this is possible only for subproblems. In Table 19.5 we list the operations for numerical computations:
Table 19.5 Numerical operations
NIntegrate   calculates definite integrals
NSum         calculates sums Σ_{i=1}^{n} f(i)
NProduct     calculates products
NSolve       numerically solves algebraic equations
NDSolve      numerically solves differential equations
After starting Mathematica, the prompt In[1] := is shown; it indicates that the system is ready to accept an input. Mathematica denotes the output of the corresponding result by Out[1]. In general: the text in the rows denoted by In[n] := is the input; the rows with the sign Out[n] are given back by Mathematica as answers. The arrow -> in the expressions means, e.g., replace x by the value a.
2. Curve Fitting and Interpolation
1. Curve Fitting Mathematica can perform the fitting of chosen functions to a set of data using the least squares method (see 6.2.5, p. 399ff.) and the approximation in the mean for discrete problems (see 19.6.2.2, p. 918). The general instruction is
Fit[{y1, y2, …}, funkt, x].   (19.291)
Here the values y_i form the list of data, funkt is the list of the chosen functions by which the fitting should be performed, and x denotes the corresponding independent variable. If we choose funkt, e.g., as Table[x^i, {i, 0, n}], then the fitting will be made by a polynomial of degree n.
■ Let the following list of data be given:
In[1] := l = {1.70045, 1.2523, 0.638803, 0.423479, 0.249091, 0.160321, 0.0883432, 0.0570776, 0.0302744, 0.0212794}
With the input
In[2] := f1 = Fit[l, {1, x, x^2, x^3, x^4}, x]
we suppose that the elements of l are assigned to the values 1, 2, …, 10 of x. We get the following approximation polynomial of degree four:
Out[2] = 2.48918 − 0.853487 x + 0.0998996 x² − 0.00371393 x³ − 0.0000219224 x⁴
With the command
In[3] := Show[ListPlot[l], Plot[f1, {x, 1, 10}], AxesOrigin -> {0, 0}]
we get a representation of the data and the approximation curve shown in Fig. 19.19a.
Figure 19.19
For the given data this is completely satisfactory. The terms are the first four terms of the series expansion of e^(1−0.5x).
2. Interpolation Mathematica offers special algorithms for the determination of interpolation functions. They are represented as so-called interpolating function objects, which are formed similarly to pure functions. The directions for using them are given in Table 19.6. Instead of the function values y_i we can also give a list of function values and values of specified derivatives at the given points.
■ With In[3] := Plot[Interpolation[l][x], {x, 1, 10}] we get Fig. 19.19b. We can see that Mathematica gives a precise correspondence to the data list.
Table 19.6 Commands for interpolation
Interpolation[{y1, y2, …}]                gives an approximation function with the values y_i at the integer points x_i = i
Interpolation[{{x1, y1}, {x2, y2}, …}]    gives an approximation function for the point sequence
3. Numerical Solution of Polynomial Equations As shown in 20.4.2.1, p. 981, Mathematica can determine the roots of polynomials numerically. The command is NSolve[p[x] == 0, x, n], where n prescribes the accuracy with which the calculations should be done. If we omit n, the calculations are made to machine accuracy. We get the complete solution, i.e., m roots, if the input polynomial has degree m.
■ In[1] := NSolve[x^6 + 3x^2 − 5 == 0]
Out[1] = {{x -> −1.07432}, {x -> −0.867262 − 1.15292 I}, {x -> −0.867262 + 1.15292 I},
{x -> 0.867262 − 1.15292 I}, {x -> 0.867262 + 1.15292 I}, {x -> 1.07432}}
4. Numerical Integration For numerical integration Mathematica offers the procedure NIntegrate. In contrast to the symbolic method, it works with a table of values of the integrand. We consider two improper integrals (see 8.2.3, p. 451) as examples.
■ A: In[1] := NIntegrate[Exp[−x^2], {x, −Infinity, Infinity}]   Out[1] = 1.77245
■ B: In[2] := NIntegrate[1/x^2, {x, −1, 1}]
Power::infy: Infinite expression 1/0 encountered.
NIntegrate::inum: Integrand ComplexInfinity is not numerical at {x} = {0}.
Mathematica recognizes the discontinuity of the integrand at x = 0 in example B and gives a warning. Mathematica applies a table of values with a higher number of nodes in the problematic domain, and it recognizes the pole. However, the answer can still be wrong. Mathematica applies certain previously specified options for numerical integration, and in some special cases they are not sufficient. We can determine the minimal and maximal number of recursion steps with which Mathematica works in a problematic domain by the parameters MinRecursion and MaxRecursion. The default values are 0 and 6. If we increase these values, Mathematica works more slowly but gives a better result.
■ In[3] := NIntegrate[Exp[−x^2], {x, −1000, 1000}]
Mathematica cannot find the peak at x = 0, since the integration domain is too large, and the answer is:
NIntegrate::ploss: Numerical integration stopping due to loss of precision. Achieved neither the requested PrecisionGoal nor AccuracyGoal; suspect one of the following: highly oscillatory integrand or the true value of the integral is 0.
Out[3] = 1.34946
If we require
In[4] := NIntegrate[Exp[−x^2], {x, −1000, 1000}, MinRecursion -> 3, MaxRecursion -> 10],
then we get
Out[4] = 1.77245
Similarly, we get a result closer to the actual value of the integral with the command
NIntegrate[fun, {x, xa, x1, x2, …, xe}].   (19.292)
Here the points of singularities xi between the lower and upper limits of the integral can be given to force Mathematica to evaluate more accurately.
5. Numerical Solution of Differential Equations For the numerical solution of ordinary differential equations, and also of systems of differential equations, Mathematica represents the result by an InterpolatingFunction. It allows us to get the numerical values of the solution at any point of the given interval and also to sketch the graphical representation of the solution function. The most frequently used commands are given in Table 19.7.
Table 19.7 Commands for the numerical solution of differential equations
NDSolve[dgl, y, {x, xa, xe}]                computes the numerical solution of the differential equation in the domain between xa and xe
InterpolatingFunction[liste][x]             gives the solution at the point x
Plot[Evaluate[y[x] /. los], {x, xa, xe}]    sketches the graphical representation
■ Solution of a differential equation describing the motion of a heavy object in a medium with friction. The equations of motion in two dimensions are
ẍ = −γ √(ẋ² + ẏ²) · ẋ,   ÿ = −g − γ √(ẋ² + ẏ²) · ẏ.
The friction is supposed to be proportional to the velocity. If we substitute g = 10, γ = 0.1, then the following command solves the equations of motion with the initial values x(0) = y(0) = 0 and ẋ(0) = 100, ẏ(0) = 200:
In[1] := dg = NDSolve[{x''[t] == −0.1 Sqrt[x'[t]^2 + y'[t]^2] x'[t],
   y''[t] == −10 − 0.1 Sqrt[x'[t]^2 + y'[t]^2] y'[t],
   x[0] == y[0] == 0, x'[0] == 100, y'[0] == 200}, {x, y}, {t, 15}]
Mathematica gives the answer in terms of interpolating functions:
Out[1] = {{x -> InterpolatingFunction[{0., 15.}, <>], y -> InterpolatingFunction[{0., 15.}, <>]}}
We can represent the solution with In[2] := ParametricPlot[{x[t], y[t]} /. dg, {t, 0, 2}, PlotRange -> All] as a parametric curve (Fig. 19.20a).
NDSolve accepts several options which affect the accuracy of the result. The accuracy of the calculations can be given by the option AccuracyGoal; the option PrecisionGoal works similarly. During the calculations Mathematica works with the so-called WorkingPrecision, which should be increased by five units in calculations requiring higher accuracy. The number of steps with which Mathematica works in the considered domain is preset to 500. In general, Mathematica increases the number of nodes in the neighborhood of a problematic domain; in the neighborhood of singularities it can exhaust the step limit. In such cases it is possible to increase the number of steps with MaxSteps; it is also possible to prescribe Infinity for MaxSteps.
■ The equations for the Foucault pendulum are
ẍ(t) + ω²x(t) = 2Ωẏ(t),   ÿ(t) + ω²y(t) = −2Ωẋ(t).
With ω = 1, Ω = 0.025 and the initial conditions x(0) = 0, y(0) = 10, ẋ(0) = ẏ(0) = 0 we get the solution:
In[3] := dg3 = NDSolve[{x''[t] == −x[t] + 0.05 y'[t], y''[t] == −y[t] − 0.05 x'[t],
   x[0] == 0, y[0] == 10, x'[0] == y'[0] == 0}, {x, y}, {t, 0, 40}]
Out[3] = {{x -> InterpolatingFunction[{0., 40.}, <>], y -> InterpolatingFunction[{0., 40.}, <>]}}
With
In[4] := ParametricPlot[{x[t], y[t]} /. dg3, {t, 0, 40}, AspectRatio -> 1] we get Fig. 19.20b.
19.8.4.2 Maple
The computer algebra system Maple can solve several problems of numerical mathematics with built-in approximation methods. The number of digits used in the calculations can be set by assigning an arbitrary value n to the global variable Digits. We should not forget, however, that selecting an n higher than the default value results in a lower speed of calculation.
1. Numerical Calculation of Expressions and Functions
After starting Maple, the prompt "> " is shown, which indicates readiness for input. Related inputs and outputs are often represented here in one row, separated by the arrow operator →.
Figure 19.20
1. Operator evalf Numerical values of expressions containing built-in and user-defined functions, which can be evaluated as real numbers, can be calculated with the command
evalf(expr, n).   (19.293)
Here expr is the expression whose value should be determined; the argument n is optional and requests evaluation to n digits accuracy. The default accuracy is set by the global variable Digits.
■ Prepare a table of values of the function y = f(x) = √x + ln x. First, we define the function by the arrow operator:
> f := x -> sqrt(x) + ln(x);   →   f := x -> √x + ln(x)
Then we get the required values of the function with the command evalf(f(x));, where a numerical value has to be substituted for x. A table of values of the function with step size 0.2 between 1 and 4 can be obtained by
> for x from 1 by 0.2 to 4 do print(f[x] = evalf(f(x), 12)) od;
Here it is required to work with twelve digits. Maple gives the result in the form of a one-column table with elements of the form f[3.2] = 2.95200519181.
2. Operator evalhf(expr) Besides evalf there is the operator evalhf. It can be used in a similar way to evalf; its argument is also an expression with a real value. It evaluates the symbolic expression numerically, using the hardware floating-point double-precision arithmetic available on the computer; a Maple floating-point value is returned. Using evalhf speeds up the calculations in most cases, but the definable accuracy of using evalf and Digits together is lost. For instance, in the problem in 19.8.2, p. 936, it can produce a considerable error.
2. Numerical Solution of Equations As discussed in Chapter 20 (see 20.4.4.2, p. 992), Maple can solve equations and systems of equations numerically in many cases. The command to do this is fsolve. It has the syntax
fsolve(eqn, var, option).   (19.294)
This command determines real solutions. If eqn is in polynomial form, the result is all the real roots. If eqn is not in polynomial form, fsolve will likely return only one solution. The available options are given in Table 19.8.
Table 19.8 Options for the command fsolve
complex        determines a complex root (or all roots of a polynomial)
maxsols = n    determines at least n roots (only for polynomial equations)
fulldigits     ensures that fsolve does not lower the number of digits used during the computations
interval       looks for roots in the given interval
■ A: Determination of all solutions of the polynomial equation x⁶ + 3x² − 5 = 0. With
> eq := x^6 + 3*x^2 - 5 = 0:
we get
> fsolve(eq, x);   →   −1.074323739, 1.074323739
Maple determined only the two real roots. With the option complex we also get the complex roots:
> fsolve(eq, x, complex);   →   −1.074323739, −0.8672620244 − 1.152922012 I, −0.8672620244 + 1.152922012 I, 0.8672620244 − 1.152922012 I, 0.8672620244 + 1.152922012 I, 1.074323739
■ B: Determination of both solutions of the transcendental equation e^(−x³) − 4x² = 0. After defining the equation
> eq := exp(-x^3) - 4*x^2 = 0:
we get
> fsolve(eq, x);   →   0.4740623572
as the positive solution. With
> fsolve(eq, x, x = -2..0);   →   −0.5412548544
Maple also determines the second (negative) root.
3. Numerical Integration The determination of definite integrals is often possible only numerically. This is the case when the integrand is too complicated, or if the primitive function cannot be expressed by elementary functions. The command to determine a definite integral numerically in Maple is evalf:
evalf(int(f(x), x = a..b), n).   (19.295)
Maple calculates the integral by using an approximation formula.
■ Calculation of the definite integral ∫_{−2}^{2} e^(−x³) dx. Since the primitive function is not known, for the integral command we get the following answer:
> int(exp(-x^3), x = -2..2);   →   ∫_{−2}^{2} e^(−x³) dx
If we type
> evalf(int(exp(-x^3), x = -2..2), 15);
then we get 277.745841695583. Maple used the built-in approximation method for numerical integration with 15 digits. In certain cases this method fails, especially if the integration interval is too large. Then we can try to call another approximation procedure from a library with readlib('evalf/int'):, which applies an adaptive Newton-Cotes method.
■ The input
> evalf(int(exp(-x^2), x = -1000..1000));
results in an error message. With
> readlib('evalf/int'):
> 'evalf/int'(exp(-x^2), x = -1000..1000, 10, _NCrule);   →   1.772453851
we get the correct result. The third argument specifies the accuracy and the last one specifies the internal name of the approximation method.
4. Numerical Solution of Differential Equations Ordinary differential equations are solved with the Maple operation dsolve described in 20.4.4, 5., p. 991. In most cases, however, it is not possible to determine the solution in closed form. In these cases we can try to solve the equation numerically, for which the corresponding initial conditions have to be given. To do this, we use the command dsolve in the form
dsolve(deqn, var, numeric)   (19.296)
with the option numeric as third argument. Here the argument deqn contains the differential equation together with the initial conditions. The result of this operation is a procedure; if we denote it, e.g., by f, the call f(t) gives the value of the solution function at the value t of the independent variable. Maple applies the Runge-Kutta method to get this result (see 19.4.1.2, p. 901). The default accuracy for the relative and for the absolute error is 10^(−Digits+3). The user can modify these default error tolerances with the global symbols _RELERR and _ABSERR. If problems occur during the calculations, Maple gives various error messages.
■ For the problem solved with the Runge-Kutta method in 19.4.1.2, p. 902, Maple gives:
> r := dsolve({diff(y(x), x) = (1/4)*(x^2 + y(x)^2), y(0) = 0}, y(x), numeric);
r := proc 'dsolve/numeric/result2'(x, 1592392, [1]) end
With
> r(0.5);   →   {x(.5) = 0.5000000000, y(x)(.5) = 0.01041860472}
we can determine the value of the solution, e.g., at x = 0.5.
20 Computer Algebra Systems
20.1 Introduction
20.1.1 Brief Characterization of Computer Algebra Systems
1. General Purpose of Computer Algebra Systems The development of computers has made possible the introduction of computer algebra systems for "doing mathematics". They are software systems able to perform mathematical operations formally. These systems, such as Macsyma, Reduce, Derive, Maple, Mathcad, Mathematica, can also be used on relatively small computers (PCs), and with their help we can transform complicated expressions, calculate derivatives and integrals, solve systems of equations, represent functions of one and of several variables graphically, etc. They can manipulate mathematical expressions, i.e., they can transform and simplify mathematical expressions according to mathematical rules if this is possible in closed form. They also provide a wide range of numerical solutions to a required accuracy, and they can represent functional dependence between data sets graphically. Most computer algebra systems can import and export data. Besides a basic set of definitions and procedures, which are activated at every start of the system, most systems provide a large variety of libraries and program packages for special fields of mathematics, which can be loaded and activated on request (see [20.4]). Computer algebra systems also allow users to build up their own packages. However, the possibilities of computer algebra systems should not be overestimated. They spare us the trouble of boring, time-consuming, and mechanical computations and transformations, but they do not save us from thinking. For frequent errors see 19.8.2, p. 936.
2. Restriction to Mathematica and Maple The systems are under perpetual development, so every concrete description reflects only a momentary state. In the following, we introduce the basic ideas and applications of these systems for the most important fields of mathematics. This introduction will help with the first steps in working with computer algebra systems. In particular, we discuss the two systems Mathematica (version 2.2) and Maple V. These two systems seem to be very popular among users, and the basic structure of the other systems is similar. (Today Mathematica version 4.1 [20.7, 20.11] and Maple V release 6 [20.6] are current.)
3. Input and Output in Mathematica and Maple In this book we do not discuss how computer algebra systems are installed on computers. It is assumed that the computer algebra system has already been started by a command and is ready to communicate via command lines or in a Windows-like graphical environment. The input and output are always represented, for both Mathematica (see 19.8.4.1, 1., p. 943) and Maple (see 19.8.4.2, 1., p. 946), in rows which are distinguished from other text parts, e.g., in the form
In[1] := Solve[3x − 5 == 0, x]    in Mathematica,   (20.1)
> solve(3*x - 5 = 0, x);          in Maple.
System-specific symbols (commands, type notation, etc.) are represented in typewriter style. In order to save space, we often write the input and the output in the same row in this book, separated by the symbol →.
20.1.2 Examples of Basic Application Fields
20.1.2.1 Manipulation of Formulas
Formula manipulation means here the transformation of mathematical expressions in the widest sense, e.g., simplification or transformation into a useful form, representation of the solution of equations or systems of equations by algebraic expressions, differentiation of functions or determination of indefinite integrals, solution of differential equations, formation of infinite series, etc.
■ Solution of the following quadratic equation:
x² + ax + b = 0  with a, b ∈ R.   (20.2a)
In Mathematica we type
Solve[x^2 + a x + b == 0, x].   (20.2b)
After pressing the corresponding input key(s) (ENTER or SHIFT+RETURN, depending on the operating system), Mathematica replaces this row by
In[1] := Solve[x^2 + a x + b == 0, x]   (20.2c)
and starts the evaluation process. In a moment the answer appears in a new row:
Out[1] = {{x -> (−a − Sqrt[a² − 4b])/2}, {x -> (−a + Sqrt[a² − 4b])/2}}.   (20.2d)
Mathematica has solved the equation, and both solutions are represented in the form of a list consisting of two sublists. Here Sqrt means the square root. In Maple the input has the following form:
solve(x^2 + a*x + b = 0, x);   (20.3a)
The semicolon after the last symbol is very important. After the equation is entered with the ENTER key, Maple evaluates this input and the resulting output is displayed directly below the input:
1/2(−a + (a² − 4b)^(1/2)), 1/2(−a − (a² − 4b)^(1/2))   (20.3b)
The result is given in the form of a sequence of two expressions representing the solutions. Except for some special symbols of the systems, the basic structures of the commands are very similar. At the beginning there is a symbol which is interpreted by the system as an operator; it is applied to the operand given in braces or brackets. The result is displayed as a list or sequence of solutions or answers. Several other operations and formula manipulations are represented similarly.
20.1.2.2 Numerical Calculations
Computer algebra systems provide many procedures to handle numerical problems of mathematics. These are solutions of algebraic equations and linear systems of equations, the solution of transcendental equations, calculation of definite integrals, numerical solution of differential equations, interpolation problems, etc.
■ Problem: Solution of the equation
x⁶ − 2x⁵ − 30x⁴ + 36x³ + 190x² − 36x − 150 = 0.   (20.4a)
Although this equation of degree six cannot be solved in closed form, it has six real roots, which are to be determined numerically. In Mathematica the input is
In[2] := NSolve[x^6 − 2x^5 − 30x^4 + 36x^3 + 190x^2 − 36x − 150 == 0, x]   (20.4b)
It results in the answer
Out[2] = {{x -> −4.42228}, {x -> −2.14285}, {x -> −0.937397}, {x -> 0.972291}, {x -> 3.35802}, {x -> 5.17217}}   (20.4c)
This is a list of the six solutions with a certain accuracy, which will be discussed later. The input in Maple is
fsolve(x^6 - 2*x^5 - 30*x^4 + 36*x^3 + 190*x^2 - 36*x - 150 = 0, x);   (20.4d)
Here the input of "= 0" can be omitted, and the assignment of the variable x is also not necessary here, since it is the only one; Maple automatically considers the entered expression to be equal to zero.
The output is the sequence of the six solutions. The command fsolve tells Maple that the result is wanted numerically, in the form of floating-point decimal numbers.
20.1.2.3 Graphical Representations
Most computer algebra systems allow the graphical representation of built-in and user-defined functions. Usually this comprises the representation of functions of one variable in Cartesian and in polar coordinates, parametric representations, and the representation of implicit functions. Functions of two variables can be represented as surfaces in space; a parametric representation is also possible. Furthermore, curves can be displayed in three-dimensional space. In addition, there are further graphical possibilities to demonstrate functional dependence between data sets, e.g., in the form of diagrams. All systems provide a wide selection of options of representation, such as the line thickness of the applied elements, e.g., vectors, different colours, etc. Usually the graphics can be exported in an appropriate format such as PostScript, raster or plotter formats, and they can be embedded in other programs or directly printed on a printer or plotter.
20.1.2.4 Programming in Computer Algebra Systems
All systems allow users to develop their own packages to solve special problems. This means, on the one hand, the use of well-known tools to build up procedures, e.g., DO, IF–THEN, WHILE, FOR, etc., and, on the other hand, the application of the built-in methods of the system, which allow elegant solutions for many problems. Self-constructed program blocks can be added to the libraries and reactivated at any time.
20.1.3 Structure of Computer Algebra Systems
20.1.3.1 Basic Structure Elements
1. Types of Objects Computer algebra systems work with several different types of objects. Objects are the familiar classes of numbers, variables, operators, functions, etc., which are loaded at the start of the system or which are defined by the user according to the appropriate syntax. Classes of objects, like types of numbers or lists, etc., are called types. Most objects are identified by their names, which can be thought of as associated with an object class and which must satisfy the given grammatical rules. The user enters a sequence of objects, i.e., their names, in the input row, corresponding to the given syntax, closes the input with the corresponding special key and/or by a special system command; then the system starts the evaluation and returns the result in a new row or rows. (Input can be spread over several lines.) The objects, object types and object classes described below are available in every computer algebra system, and their particular features are described in the manuals of the systems.
2. Numbers Computer algebra systems usually use the number types integers, rational numbers, real numbers (floating-point numbers) and complex numbers; some systems also know algebraic numbers, radical numbers, etc. With different type-check operations, the type or certain properties of given numbers can be determined, like non-negative, prime, etc. Floating-point numbers can be determined with arbitrary accuracy. Usually the systems work with a default precision, which can be changed on request. The systems know the special numbers which have a fundamental importance in mathematics, such as e, π and ∞. They use these numbers symbolically, but they also know their numerical approximations to arbitrary accuracy.
3. Variables and Assignments Variables have names represented by given symbols, which are usually determined by the user. There are names that are predefined and reserved by the system; they cannot be chosen. While no value is assigned to a variable, the symbol itself stands for the variable. Values can be assigned to the variables by special assignment operators. These values can be numbers, other variables, special sequences of objects, or even expressions. In general, there exist several assignment operators, which differ, first of all, in the time of their evaluation, i.e., right after their input or for the first time at a later call of the variable.
4. Operators All systems have a basic set of operators. The usual operators of mathematics +, −, *, /, ^ (or **), >, <, = belong to this set, for which the usual order of precedence is valid during evaluation. The infix form of an expression means that these operators are written between the operands. The set of operators written in prefix form, where the operator is written in front of the operands, is dominant in all systems. This type of operator operates on special classes of objects, e.g., on numbers, polynomials, sets, lists, matrices, or on systems of equations, or they operate as functional operators, such as differentiation, integration, etc. In general, there are operators for organizing the form of the output, manipulating strings and further systems of objects. Some systems also allow certain operators in suffix form, i.e., the operator is written behind the operand. Operators often take optional arguments.
5. Terms and Functions The notion of a term means an arrangement of objects connected by mathematical operators, usually in infix form; hence it denotes certain basic elements often occurring in mathematics. A basic task of computer algebra systems is transforming terms and solving equations.
■ The following sequence
x^4 − 5*x^3 + 2*x^2 − 8   (20.5)
is, e.g., a term in which x is a variable.
Computer algebra systems know the usual elementary functions, such as the exponential function, the logarithmic function, the trigonometric functions and their inverses, and several other special functions. These functions can be placed into terms instead of variables; in this way, complicated terms or functions can be generated.
6. Lists and Sets All computer algebra systems know the object class of lists, which is considered as a sequence of objects. The elements of a list can be reached by special operators. In general, the elements of a list can themselves be lists, so we can get nested lists, which are used in the construction of special types of objects, such as matrices and tensors; all systems provide such special object classes. They make it possible to manipulate objects like vectors and tensors symbolically in vector spaces, and to apply linear algebra. The notion of a set is also known in computer algebra systems, and the operators of set theory are defined. In the following sections, the basic structure elements and their syntax are discussed for the two chosen computer algebra systems, Mathematica 4.1 and Maple V.
20.2 Mathernatica Mathematica is a computer algebra system, developed by Wolfram Research Inc. A detailed description of Mathematica 4.1 can be found in j20.7, 20.111.
20.2.1 Basic Structure Elements In Mathematica the basic structure elements are called expressions. Their syntax is (we emphasize again, the current objects are given by their corresponding symbol, by their names): (20.6) obj,[obj,, obj,, . . . , obj,]
I
954
20. Computer Aloebra Svstems
obj, is called the head of the expression; the number 0 is assigned to it. The parts obji (i = 1,.. . , n) are the elements or arguments of the expression, and one can refer to them by their numbers 1,.. . , n. In many cases the head of the expression is an operator or a function, the elements are the operands or variables on which the head acts. Also the head, as an element of an expression, can be an expression, too. Square brackets are reserved in Mathernatica for the representation of an expression, and they can be applied only in this relation. IThe term xA2+2*x+1,which can also be entered in this infix form in Mathernatica, has the complete form (FullFonn) Plus[l,Times[2,x], Power[x,211 which is also an expression. Plus, Power and Times denote the corresponding arithmetical operations. The example shows that all simple mathematical operators exist in prefix form in the internal representation, and the term notation is only a facility in Mathernatica. Parts of expressions can be separated. This can be done with Part[expr,i], where i is the number of the corresponding element. In particular, i = 0 gives back the head of the expression. I If we enter in Mathernatica I~[I] := 2 9 + 2 * x 1 where the sign can be omitted, then after the ENTER key is pressed, Mathernatica answers Uut[ll= 1 + 2 x t x* Mathernatica analyzed the input and returned it in mathematical standard form. If the input had been terminated by a semicolon, then the output would have been suppressed. If we enter Inf2-l := FullForm[%] then the answer is Out [2l = Plus[l,Times[2,xI1Power[z,211 The sign % in the square brackets tells Mathernatica that the argument of this input is the last output. From this expression it is possible to get, e.g., the third element 1n[3] := Part[%, 31 for instance Out f31= Power[x,21 which is an expression in this case. Symbols in Mathernatica are the notation of the basic objects; they can be any sequence of letters and numbers but they must not begin with a number. The special sign $ is also allowed. Upper-case and lower-case letters are distinguished. Reserved symbols begin with a capital letter, and in compound words also the second word begins with a capital letter. Users should write their own symbols using only lower-case letters.
+
20.2.2 Types of Numbers in Mathematica
Type of number Integers Rational numbers Real numbers Complex numbers
Head Integer Rational Real Complex
Characteristic exact integer, arbitrarily long fraction of coprimes in form Integer/Integer floating-point number, arbitrary given precision complex number in the form number+number *I
Input nnnnn pppplqqqq
nnnn.mmmm
Real numbers, Le., floating-point numbers, can be arbitrarily long. If an integer nnn is written in the form nnn.,then Mathernatica considers it as a floating-point number, that is, of type Real.
20.2 Mathernatica 955
The type of a number z can be determined with the command Head[x]. Hence, I n f f l := Head[5l] results in Outff] = Integer, while In[2] := Head[51.] OutC21 = Real. The real and imaginary components of a complex number can belong to any type of numbers. A number such as 5.731 + 0 I is considered as a Real type by Mathernatica, while 5.731 0. I is of type Complex, since 0. is considered as a floating-point approximation of 0. There are some further operations, which give information about numbers. So, if z is a number Inf31 := NumberQ[x]Out C31 := True, (20.7a) Otherwise, if z is not a number, then the output is Uutf31 =False. Here, True and False are the symbols for Boolean constants. IntegerQ[z] tests if x is an integer, or not, so Inf41 := IntegerQ[2.] Uutf41 = False (20.7b) Similar tests can be performed for numbers with heads EvenQ, OddQand PrimeQ.Their meanings are obvious. So. we get (20.7~) Inf51 := PrimeQ[1075643] Out C51 = True, while Inf61 := PrimeQ[1075641] Uutf61 = False (20.7d) These last tests belong to a group of test operators, which all end by Q and always answer True or False in the sense of a logical test (in this case a type check).
+
20.2.2.2 Special Numbers In Mathernatica, there are some special numbers which are often needed, and they can be called with arbitrary accuracy. They include A with the symbol Pi, e with the symbol E, as the transformation 180" factor from degree measure into radian measure with the command Degree, I n f i n i t y as the symbol for w and the imaginary unit I.
20.2.2.3 Representation and Conversion of Numbers Numbers can be represented in different forms which can be converted into each other. So, every real number x can be represented by a floating-point number N[x, n]with an n-digit precision. INf71 := N[E, 201 yields Outf71 = 2.7182818284590452354 (20.8a) LVith Rationalize[x, dz], the number x with an accuracy dx can be converted into a rational number, i.e., into the fraction of two integers. 1457 (20.8b) I n f 8 1 := Rationalize[%, 10" - 51 OutC81 = 536 With 0 accuracy, Mathernatica gives the possible best approximation of the number x by a rational number. Numbers of different number systems can be converted into each other. With BaseForm[z, b], the number z given in the decimal system is converted into the corresponding number in the number system with base b. If b > 10, then the consecutive letters of the alphabet a, b, c, . . . are used for the further digits having a meaning greater than ten. A: Inf151 := BaseForm[255, 161 Uutf15l = //BaseForm = ff16 (20.9a) Inf16l := BaseForm[N[E, lo], 81 Outff6l = //BaseForm = 2.5576052138
(20.9b)
The reversed transformation can be performed by b""mmmm.
B: Inf171 := 8""735
Out[f7] = 477
(20.9~)
956
20. Computer Aloebra Susterns
Numbers can be represented with arbitrary precision (the default here is the hardware precision), and for large numbers so-called scientific form is used, Le., the form n.mmmml0” f qq.
20.2.3 Important Operators Several basic operators can be written in infix form (as in the classical form in mathematics) < symbl op symbz > . However, in every case, the complete form of this simplified notation is an expression. The most often occurring operators and their complete form are collected in Table 20.2. Table 20.2 Important Operators in Mathernatica ~
+
a b a b or a * b a“b alb u-> v r =s
Plus[a, b] Times[a, b] Power[a, b] Times[a, Power[b, -111 Rule[u, v] Set[?, s]
u == v Equal[u, v] w!= u Unequal[w, u]
r >t r >= t s s
<= t
Greater[r, t] GreaterEqual[r, t] Less[s, t] LessEqual[s, t]
Most symbols in Table 20.2 are obvious. For multiplication in the form ab, the space between the factors is very important. The expressions with the heads Rule and Set will be explained. Set assigns the value of the expression s on the right-hand side, e.g., a number, to the expression r on the left-hand side, e.g., a variable. From here on, r is represented by this value until this assignment is changed. The change can be done either by a new assignment or by x = . or Clear[x],Le., by releasing every assignment so far. The construction Rule should be considered as a transformation rule. It occurs together with the substitution operator
1.
,
Replace[t, u -> v] or t / . u-> v means that every element u which occurs in the expression t will be replaced by the expression v. IInf51 := x + y2 /. y-> a b Out 151 = x ( a b)’ It is typical in the case of both operators that the right-hand side is evaluated immediately after the assignment or transformation rule. So, the left-hand side will be replaced by this evaluated right-hand side at every later call. Here, we have to mention two further operators with delayed evaluation. (20.10a) u := v FullForm = SetDelayed[u,v] and (20.10b) u :> v FullForrn = RuleDelayed[u,v] The assignment or the transformation rule are also valid here until it is changed. although the lefthand side is always replaced by the right side, the right-hand side is evaluated for the first time only at the moment when the left one is called. The expression u == ti or Equal[u, v] means that u and v are identical. Equal is used, e.g., in manipulation of equations.
+
+ +
20.2.4 Lists 20.2.4.1 Notions Lists are important tools in Mathernatica for the manipulation of whole groups of quantities, which are important in higher-dimensional algebra and analysis. A list is a collection of several objects into a new object. In the list, each object is distinguished only by its place in the list. The definition of a list is made by the command (20.11) List[al, a2. a3, . . .] I {a& a2, a3, . . .}
20.2 Mathernatica 957
To explain the work with lists, a concrete list is used, denoted by 11: I n f f l := 11 = Lis t[al, a2, a3, a4, a5, a61 O u t f l l = {al, a2, 03, a4, a5, a6}
(20.12) Mathematica applies a short form to the output of the list: It is enclosed in curly braces. Table 20.3 represents commands which choose one or more elements from a list, and the output is a “sublist”. Table 20.3 Commands for the choice of list elements First [1] Last[1] Part[1, n] or 1[[n]] Part[1, {nl, n2, .}] 1[[{nl, n2, . . .}]I T&e[1, m] T&e[l, {m, .)I Drop[l. n] Drop[l, {m, .)I I
.
selects the first element selects the last element selects the n-th element gives a list of the elements with the given numbers equivalent to the previous operation gives the list of the first m elements of 1 gives the list of the elements from m through n gives the list without the first n elements gives the list without the elements from m through n
I For the list 11 in (20.11)we get, e.g., Inf21 := F i r s t [ l l ] Uutf21 = a1 Inf31 := l1[[3]] Uutf3]= a3 Inf41 := l1[[{2, 4, 6)]] Uutf41 = (a2, a4, as} Inf5l := Take[ll, 21 Uutf51 = {al, a2}
20.2.4.2 Nested Lists, Arrays or Tables The elements of lists can again be lists, so we get nested lists. If we enter, e.g., for the elements of the previous list 11 In[6] := a1 = { b l l , b12, b13, b14, b15) In[7] := a2 = (b21, b22, b23, b24, b25) Inf81 := a3 = (b31, b32, b33, b34, b35)
and analogously for a4, a5 and a6, then because of (20.12) we get a nested list (an array) which we do not represent here explicitly. We can refer to the j-th element of the i-th sublist with the command Part[1, i, j ] . The expression l[[i, j]]has the same result. In the above example, e.g., I n f f 2 l := 11[[3,411 yields Uutfl21 = b34 Furthermore, Part[1, { i l l 22 ...}, {jl,j 2 ...}I or l[[{il, i2, ...)${jl,j 2 , ...}]I results in a list consisting of the elements numbered with j l , j 2 . .. from the lists numbered with il, i2,. . .. IFor the above example Inff3-7:= 11[[{3,5), {2, 3, 4}]] Uut[131 = {{b32, b33, b34), (b52, b53, b54)) The idea of nesting lists is obvious from these examples. It is easy to create lists of three or higher dimensions, and it is easy to refer to the corresponding elements.
20.2.4.3 Operations with Lists Mathernatica provides several further operations by which lists can be monitored, enlarged or shortened (Table 20.4). IWith Delete, the list 11 can be shortened by the term a6: In[l4] := 12 = Delete[ll, 61 Uutf141 = {al, a2, a3, a4, as}, where in the output the ai are shown by their values - they are lists themselves.
I
958
20. Computer Al.qebra &stems
Table 20.4 Operations with lists ~~
Position[l, a] MemberQ[l, a] FreeQ[l, a] Prepend[l, a] Append[&a1 Insert[l, a, i] Delete[l. {i, j , . . .}] Replacepart [ I , a, i]
gives a list of the positions where a occurs in the list checks whether a is an element of the list checks if a does not occur in the list changes the list by adding a to the front changes the list by appending a to the end inserts a at position z in the list delete the elements at positions z,j, . . . from the list replace the element at position z by a
20.2.4.4 Special Lists In Mathematica, several operations are available to create special lists. One of them, which often occurs in working with mathematical functions, is the command Table shown in Table 20.5. Table 20.5 Operation Table creates a list with imax values of f : f ( l ) , f(2), . . . , f(imax) Table[f, {imax}] Table[f, {i, imin,imaz}] creates a list with values off from imin to imax Table[f, {z, imin,imax, di}] the same as the last one, but by steps di ITable of binominal coefficients for n = 7: I n T l 5 l := Table[Binomial[7, 21, {i, 0, 7 } ] ] Uut[f51 = (1, 7, 21, 35, 35, 21, 7 , 1) With Table, also higher-dimensional arrays can be created. With the expression Table[f, (2, i l , i2}, ( j , jl, j2}, ...I we get a higher-dimensional, multiple nested table, Le.! entering InTl61 := Table[Binomial[i,j ] ) {z, 1, 7 } , {jl0, i} ] we get the binominal coefficients up to degree 7: OutT161 = ((1, 11, (1, 2,
11, (1, 3, 3, 11, (1, 4, 6, 4, 11,
(1, 5 , 10, 10, 5 , l}, (1, 6, 15, 20, 15, 6, l}, {II 7, 21, 35, 35, 21, 7, l}}
The operation Range produces a list of consecutive numbers or equally spaced numbers: Range[n] yields the list (1, 2, . . . n) Similarly, Range[nl, n2] and Range[nl, n2, dn]produce lists of numbers from n l to n2 with step-size 1 or dn respectively. ~
20.2.5 Vectors and Matrices as Lists 20.2.5.1 Creating Appropriate Lists Several special (list) commands are available for defining vectors and matrices. A one-dimensional list of the form (20.13) v = { v l , v2, . . . , vn} can always be considered as a vector in n-dimensional space with components v l , v2,. . . , vn. The special operation Array[v, n] produces the list (the vector) (v[1], 421,. . . , v[n]}. Symbolic vector operations can be performed with vectors defined in this way. The two-dimensional lists 11 of (20.2.4.2, p. 957) and 12 (20.2.4.3, p. 957) introduced above can be considered as matrices with rows i and columns j . In this case bij would be the element of the matrix
20.2 Mathernatica 959 in the i-th row and the j-th column. A rectangular matrix of type (6,5) is defined by 11, and a square matrix of type ( 5 , 5 ) by 12. With the operation Array[b, {n, m}] a matrix of type (n,m) is generated, whose elements are denoted by b[i, j]. The rows are numbered by i , i changes from 1 to n; by j the columns are numbered from 1 to m. In this symbolic form 11 can be represented as (20.14a)
11 = Array[b, {6, 5}], where
(20.14b) bji, j] = bij ( i = 1,.. . ,6; j = 1,.. . , 5 ) . The operation IdentityMatrix[n] produces the n-dimensional unit matrix. With the operation DiagonalMatrix[list] a diagonal matrix is produced with the elements of the list in its main diagonal. The operation Dimension[list] gives the size (number of rows, columns, . . . ) of a matrix, whose structure is given by a list. Finally, with the command MatrixForm[lzst],we get a matrix-type representation of the list. A further possibility to define matrices is the following: Let f(i, j ) be a function of integers i and j . Then, the operation Table[f, {i, n}, { j , m}]defines a matrix of type (n,m ) ,whose elements are the corresponding f ( i ,j ) .
20.2.5.2 Operations with Matrices and Vectors Mathernatica allows formal manipulation of matrices and vectors. The operations given in Table 20.6 can be applied. Table 20.6 Operations with matrices ca
a.b Det[a] Inverse[a] Transpose[a] MatrixPower[a, n] Eigenvalues[a] Eigenvectors[a]
matrix a is multiplied by the scalar c the product of matrices a and b the determinant of matrix a the inverse of matrix a the transpose of matrix a the n-th power of matrix a the eigenvalues of matrix a the eigenvectors of matrix a
Here, the transpose matrix rT of T is produced. If the general four-dimensional vector v is defined by Inf2Ol := u = Array[u, 41, then we get Outf2Ol = {u[l]>421, @I,
u[4]}
960
20. Computer Al.qebra Svsterns
' 'I
Now, the product of the matrix r and the vector v is again a vector (see Calculations with Matrices, 4.1.4, p. 254). Inf211 := r. v Out f 2 l l = { a 1, 1 u 11 a[l, 21 421 a[l, 31 u[3] a[l, 41 u[4] , a 2, 1 u 11 4 2 , 21 421 4 2 , 31 u[3] 4 2 , 41 u[4] , 4 3 , 11u 11 43, 21 421 43, 31 u[3] 43, 41 u[4] , 4 4 , 11411 4 4 , 21 421 4 4 , 31 u[3] 4 4 , 41 u[4] }. There is no difference between row and column vectors in Mathernatica. In general, matrix multiplication is not commutative (see Calculations with Matrices 4.1.4, p. 254). The expression T . v corresponds to the product in linear algebra when a matrix is multiplied by a column vector from the right, while u. T means a multiplication by a row vector from the left. I B: In the section on Cramer's rule (4.4.2.3, p. 275) the linear system of equations p t = b is solved with the matrix In1221 : = p = MatrixForm[{{2, 1, 3}, (1, -2, l}, (3, 2, 2}}] Out [22] = //MatrixForm = 2 1 3 1 -2 1 3 22
+
+ + +
+
+ + +
+
+ + +
and vectors Inf231 := t = Array[z, 31 Out f a 1 = (411, 421, z[3]} I n f q l := b = (9, -2, 7) Out [&I = (9, -2, 7). Since in this case det(p) # 0 holds, the system can be solved by t = p-'b. This can be done by In[25] := Inverseb]. b with the output of thesolutionvector Outf251 = {-1, 2, 3).
20.2.6 Functions 20.2.6.1 Standard Functions Mathernatica knows several mathematical standard functions, which are listed in Table 20.7 Table 20.7 Standard functions Exponential function Logarithmic functions Trigonometric functions Arc functions Hyperbolic functions Area functions
Exp[x] Log[x], Log[b,x] Sin[x],Cos[x], Tan[x], Cot[x], Sec[x], Csc[x] ArcSin[x], ArcCos[x], ArcTan[x], ArcCot[x], ArcSec[x], ArcCsc[x] Sinh[x], Cosh[x], Tanh[x], Coth[x], Sech[x], Csch[x] ArcSinh[x], ArcCosh[x], ArcTanh[x], ArcCoth[x], ArcSech[x], ArcCsch[x]
All these functions can also be applied with complex arguments In every case we have to consider the single-valuedness of the functions. For real functions we have to choose one branch of the function (if it is needed); for functions with complex arguments the principal value (see 14.5, p. 696) should be chosen.
20.2.6.2 Special Functions Mathernatica knows several special functions, which are not elementary functions. Table 20.8 lists some of these functions. Tabelle 20.8 Special functions Bessel functions & ( z ) and Y,(z) Modified Bessel functions I n ( z )and K , ( z ) Legendre polynomials P,(z) Spherical harmonic ~"'(t9,4)
BesselJ[n,z],BesselY[n,z] BesselI[n,z], BesselK[n,z] LegendrP[n,x] SphericalHarmonicY[l, m, theta, phi]
20.2 Mathernatica 961 Further functions can be loaded with the corresponding special packages (see also [17.1]).
20.2.6.3 Pure Functions Mathernatica supports the use of so-called pure functions. A pure function is an anonymous function, an operation with no name assigned to it. They are denoted by Function[z, body]. The first argument specifies the formal parameters and the second one is the body of the function, Le., body is an expression for the function of the variable 2 . InCf] := Function[x, 2"3 + 2"2] Out ClI = Function[z, z3+ 21' (20.15) and so In[2] := Function[z, zA3+ zA2][c] gives OutC21 = c3 + e'. (20.16) We can use a simplified version of this command. It has the form body &, where the variable is denoted by # . Instead of the previous two rows we can also write (20.17) 1nC31 := (#"3 + #^2) & [e] out C3I = c3 t c2 It is also possible to define pure functions of several variables: Function[ { q 5>2 , . . .}, body] or in short form body&, where the variables in body are denoted by the elements #1, #2:. . .. The sign & is very important for closing the expression, since it can be seen from this sign that the previous expression should be considered as a pure function.
20.2.7 Patterns Mathernatica allows users to define their own functions and to use them in calculations. (20.18) With the command InCfI := fk-1 := Polynom(z) with Polynom(s) as an arbitrary polynomial of variable x,a special function is defined by the user. In the definition of the function f , there is no simple 2 , but 2- (pronounced 2-blank) with a symbol for the blank. The symbol 5- means "something with the name 2 " . From here on, every time when the expression f [something] occurs, Mathernatica replaces it by its definition given above. This type of definition is called a pattern. The symbol blank- denotes the basic element of a pattern; y- stands for y as a pattern. It is also possible to apply in the corresponding definition only a ''-", that is y"-. This pattern stands for an arbitrary power of y with any exponent, thus, for an entire class of expressions with the same structure. The essence of a pattern is that it defines a structure. When Mathernatica checks an expression with respect to a pattern, it compares the structure of the elements of the expression to the elements of the pattern, Mathernatica does not check mathematical equality! This is important in the following example: Let 1 be the list (20.19) : 2"~)) 1n[21:= 1 = (1, y, y'a, y " ~ q r t [ z ] {f[y"(r/q)], If we write (20.20) 1nC3]:=1/.y~_-> j a then Mathernatica returns the list (20.21) 0utC31 = (1. y, j a , j a , {f[ja], Z'}} Mathernatica checked the elements of the list with respect to its structural identity to its pattern y"and in every case when it determined coincidence it replaced the corresponding element by j a . The elements 1 and y were not replaced, since they have not the given structure, even though yo = 1,y' = y holds. Remark: Pattern comparison always happens in FullForm. If we examine In&] := bly /. y"- -> j a then we get Out C4]= b j a This is a consequence of the fact that FullForm of b/y is Times[b, Power[y, -111, and for structure (20.22) comparison the second argument of Times is identified as the structure of the pattern. With the definition mc51 := f[zJ := 2"3 (20.23a)
I
962
20. Computer Algebra Systems
Mathernatica replaces, corresponding to the given pattern, I n f 6 l = f [T] by Out f6l = r3 etc.
+
I n R 1 := f [a] t f [z] yields Out f71 = a3 x3
(20.2313) (20.23~)
If Inf81 := f[z] := z”3, so for the same input Inf71 := . . . then the output would be Out [71= f [a] + z3 In this case only the “identical” input corresponds to the definition.
(20.23d) (20.23e)
20.2.8 Functional Operations Functions operate on numbers and expressions. Mathernatica can also perform operations with functions, since the names of functions are handled as expressions so they can be manipulated as expressions. 1. Inverse Function The determination of the inverse function of a given function f(z) can be made by the functional operation InverseFunction. IA: I n f l l := InverseFunction[f] [z] O u t f l l = f-’ [z] Outf21 = Log IB: In[2] := InverseFunction[Exp] 2. Differentiation Mathernatica uses the possibility that the differentiation of functions can be considered as a mapping in the space of functions. In Mathernatica, the differentiation operator is Derivative[l][f]or in short form f’. If the function f is defined, then its derivative can be got by f’. IInf31 := f [z-] := Sin[z]Cos[z] With
1,1141
:= f ’
we get 0utf4]=
COS[#^]^
- ~ i n [ # l ]& *,
hence f ’ is represented as a pure function and it corresponds to 1nf5I := % [z] Out E] = C O S [ Z ] ~ - ~ i n [ z ] ’ 3. Nest The command Nest[!, E, n] means that the function f nested n times into itself should be applied on z. The result is f [f [. . . f [E]] . . .]. 4. NestList By NestList[f, z, n] a list {zl f [z],f [f [z]],. . .} will be shown, where finally f is nested n times. 5. Fixedpoint For FixedPoint[f, E], the function is applied repeatedly until the result does not change. 6. FixedPointList The functional operation FixedPointList[f, z] shows the continued list with the results after f is applied, until the value no longer changes. IAs an example for this type of functional operation the NestList operation will be used for the approximation of a root of an equation f(z) = 0 with Newton’s method (see 19.1.1.2,p. 882). We seek a root of the equation z cos z = sin z in the neighborhood of 3 ~ 1 2 : In[6] := f[z-] := z - Tan[z] Inf71 := f‘[z] Outf71 = 1 - Sec[zj2 I n D ] = g[z-] := z - f [z]/f’[z] Inf91 := NestList[g, 4.6, 41 // N Outf91 = (4.6, 4.54573, 4.50615, 4.49417, 4.49341) InflOl := FixedPoint[g, 4.61 OutflO] = 4.49341
A higher precision of the result can also be achieved.
20.2 Mathernatica 963 7. Apply Let f be a function which is considered in connection with a list { a , b, c, . . .}. Then, we get (20.24) APPlY[f, { a : 6,c, ' ' .}I f[Q,b, c, ' ' .I W In[l] := Apply[Plus, { u , u, w}]
Outfl] = u t w
+w
In[2] := Apply[List, Q + b + c] Outf2l = {a, b, c) Here, the general scheme of how Mathernatica handles expressions of expressions can be easily recognized. We write the last operation in FullForm: In[3] := Apply[List, Plus[a, b, c]] Out[3] = List[a, b, c] The functional operation Apply obviously replaces the head of the considered expression Plus by the required List. 8. Map With a defined function f the operation Map gives: (20.25) Map[!. {a, 6,c, . . -+ {fblIfP1, f[cI,. . .I Map generates a list whose elements are the values when f is applied for the original list. ILet f be the function f(z)= z2. It is defined by 1n[4] := f[z..] := 2%' With this f we get In[51 := Map[f, { u , u , tu}] Outf51 = { u * ) u2, w2} Map can be applied for more general expressions: In[6] := Map[f, Plus[a, 6,e]] OutC6l = a2 t b2 + c2
.}I
20.2.9 Programming Mathernatica can handle the loop constructions known from other languages for procedural programming. The two basic commands are Do[ezpr, {i, i l , 22, di}] (20.26a) and While[test, ezpr] (20.26b) The first command evaluates the expression ezpr, where i runs over the values from i l to 22 in steps di. If di is omitted, the step size is one. If i l is also missing, then it starts from 1. The second command evaluates the expression while test has the value True. IIn order to determine an approximate value of e', the series expansion of the exponential function is used: I n f l I := sum = 1.0; Do[sum = sum t (2%/2!), {i, 1,lo}]; (20.27) sum Out [ll = 7.38899 The Do loop evaluates its argument a previously given number of times, while the While loop evaluates until a previously given condition becomes false. Among other things, Mathematica provides the possibility of defining and using local variables. This can be done by the command (20.28) Module[{tl, t2,. . .}>procedure] The variables or constants enclosed in the list are locally usable in the module; their values assigned here are not valid outside of this module.
I
964
20. Computer A1,qebra Systems
IA: We have to define a procedure which calculates the sum of the square roots of the integers from 1 to n. 1n[11 := sumq[n-] := Module[{sum = l.}, (20.29) Do[sum = sum N[Sqrt[i]], {i,2, n}]; sum 1;
+
The call sumq[30] results in 112.083. The real power of the programming capabilities of Mathematica is, first of all, the use of functional methods in programming, which are made possible by the operations Nest, Apply, Map and by some further ones. IB: Example A can be written in a functional manner for the case when an accuracy of ten digits is required: sump[n-] := N[Apply[Pl~~, Table[Sqrt[i],{i, 1,n}]], lo], sumq[30] results in 112.0828452. For the details, see [20.6].
20.2.10 Supplement about Syntax, Information, Messages 20.2.10.1 Contexts, Attributes Mathernatica must handle several symbols; among them there are those which are used in further program modules loaded on request. To avoid many-valuedness, the names of symbols in Mathernatica consist of two parts, the context and the short name. Short names mean here the names (see 20.2, p. 953) of heads and elements of the expressions. In addition, in order to name a symbol Mathernatica needs the determination of the program part to which the symbol belongs. This is given by the context, which holds the name of the corresponding program part. The complete name of a symbol consists of the context and the short name, which are connected by the ’ sign. When Mathernatica starts there are always two contexts present: System’ and Global’. We can get information about other available program modules by the command Contexts[ 1. All built-in functions of Mathematica belong to the context System’, while the functions defined by the user belong to the context Global’. If a context is actual, thus, the corresponding program part is loaded, then the symbols can be referred to by their short names. For the input of a further Mathernatica program module by << Namepackage, the corresponding context is opened and introduced into the previous list. It can happen that a symbol has already been introduced with a certain name before this module is loaded, and in this newly opened context the same name occurs with another definition. In this case Mathernatica gives a warning to the user. Then we can erase the previously defined name by the command Remove[Clobal‘name], or we can apply the complete name for the newly loaded symbol. Besides the properties that the symbols have per definition, it is possible to assign to them some other general properties, called attributes, like Orderless, Le., unordered, commutative, Protected, i.e., values cannot be changed, or Locked, i.e, attributes cannot be changed, etc. Informations about the already existing attributes of the considered object can be obtained by Attributes[f]. Some symbols can be protected by Protect[somesymbol]; then no other definition can be introduced for this symbol. This attribute can be erased with the command Unprotect.
20.2.10.2 Information Information can be obtained about the fundamental properties of objects by the commands
20.3 Maple 965
?symbol information about the object given by the name symbol, ??symbol detailed information about the object, ?B* information about all Mathernatica objects, whose name begins with B. It is also possible to get information about special operators, e.g., by ? := about the SetDelay operator.
20.2.10.3 Messages Mathernatica has a message system which can be activated and used for different reasons. The messages are generated and shown during the calculations. Their presentation has a uniform form: symbol : : tag, providing the possibility to refer to them later. (Such messages can also be created by the user.) Consider the following examples as illustrations. W A: In[lI := f[z_] := 1/x; Inf21 := f[O] 1 0
Power: :infy:Infinite expression - encountered.
Out[2] = ComplexInfinity
W B: In[3] := Log[$ 16, 251 Log: :argt:Log calledwith 3 arguments; 1 or 2 arguments are expected. Out[3] =Log[3,16,25] IC: In&] := Multply[x, zAn] General: :spelll: Possible spelling Error: new symbol name “Multply’ ’ i s s i m i l a r t o existing symbol “Multiply’ ’ . Out[4] = Multply[z, xAn] In example A. Mathematica warns us that during the evaluation of an expression it got the value m. The calculation itself can be performed. In example B the call of logarithm contains three arguments, which is not allowed according to the definition. Calculations cannot be performed. Mathernatica cannot do anything with the expression. In example C, Mathernatica finds a new symbol name, which is very similar to an existing one which is often used. A spelling error is supposed, and Mathernatica does not evaluate this expression. The user can switch off a message with Off[s : : tag]. With On the message will appear again. With Messages[symbol] all messages associated to the symbol with the name symbol can be recalled.
20.3 Maple The computer algebra system Maple was developed at the University of Waterloo (Ontario Canada). We discuss Maple V, release 4 by Waterloo Maple Software. A good intruduction can be found in the handbooks and in [20.6].
20.3.1 Basic Structure Elements 20.3.1.1 Types and Objects In Maple, all objects have a type which determines its proper affiliation in an object class. An object can be assigned to several types, as e.g., if a given object class contains a subclass defined by an additional relation. As an example it can be mentioned that the number 6 is of type integer and of type posint. By giving the type and arranging all objects in a hierarchy, a consistent formalization and evaluation of given classes of mathematical problems is guaranteed. The user can always ask about the type of an object with the question > whattype(obj); (20.30) The semicolon must be written at the end of the input. The output is the basic type of the object. Maple knows the following basic types of objects, collected in Table 20.9. The more detailed type structure can be determined by the help of commands like type(obj,typname), the values of which are the Boolean functions t r u e or f a l s e . Table 20.10 contains all type names known by Maple.
I
966
20. Computer Al.qebra Systems
Table 20.9 Basic types in Maple
\+.
\
*
\A'
\
\ = \
'<>'
'<'
I'
.. \
' or'
not' exprseq float fraction function indexed integer procedure series set string table uneval
'and' list
Table 20.10 Types
*
**
-
A
+
PLOT algn algnumext and colourabl connected constant float expanded facint linear intersect laurent monomial name minus oddfunc numeric odd primeint positive posint radnum radfunext radical series relation scalar tree symmfunc taylor vector
< PLOT3D anything cubic fraction list negative operator procedure radnumext set trig
RootOf array digraph function listlist negint or quadratic range sqrt type
algebraic biconnect equation graph logical nonneg planar quartic rational square undigraph
I
<>
algext bipartite even indexed mathfunc nonnegint point radext ratpoly string uneval
algfun boolean evenfunc integer matrix not polynom radfun realcons subgraph union
It can be seen that the type-check functions themselves have a type, namely type.
20.3.1.2 Input and Output In Maple the input has the form obJ,(obJ,,obJ,,...,obJ~)~ (20.31) The first term. which is in front of the left parenthesis, is usually an operator, a command or a function, which acts on the parts in parentheses. In certain cases there are special options as arguments, which control the special application of operators or functions. The terminating semicolon is very important; it tells Maple that the input has ended. If the input is terminated by a :, this means that although the input is finished, the result should not be calculated. Symbols. Le., names in Maple. can consist of letters, numbers, and the Blank (-). Names cannot begin with numbers. In the first place numbers are not allowed. Upper- and lower-case letters are always distinguished. The blank is used by Maple for internal symbols; it should be avoided in user-defined symbols. Strzngs. Le., objects of type s t r i n g , must be input between single quotes: (20.32) > S := 'This is a string' S := This is a string The output is without the single quotes; type checking with whattype gives s t r i n g . While no value is assigned to a symbol, the symbol is of type s t r i n g or name, Le., the type checking (20.33) > type(symb,name); or type(symb, s t r i n g ) ; results in true. If the user does not knob, if a symbol in Maple is already reserved, then it can be asked for by Name. If Maple answers that it does not know this symbol, then it can be used freely. If a value is assigned to a symbol by the assignment operator :=, then the symbol automatically takes the type of the assigned value. ILet 21 be a symbol, which should be a variable. For the input > whattype(i1); Maple answers by s t r i n g
20.3 Maple 967
If an integer value is assigned to it: > s l : = 5 ; -+ X I : = 5 and then it is asked > whattype(z1); then the answer is integer. Maple knows a huge number of commands, functions, and operators. Not all of them can be called right after the start. Several special functions and operators are in different packages in the Maple library. There exist packages for linear algebra, for statistics, etc. These packages can be loaded by the command > with(packagename);if they are needed (see Supplement about Syntax, Information, Messages. 20.3.9.1, p. 975) then their operations and functions can be used as usual.
20.3.2 Types of Numbers in Maple 20.3.2.1 Basic Types of Numbers in Maple Maple knows the basic types of numbers introduced in Table 20.11.
Tabelle 20.11 Type of numbers in Maple
Number Integer Fraction Floating-point number
Type integer fraction float
Representation form
nnnnnn sequence of (almost) arbitrary number of digits ppplqqq fraction of two integer nn.mmm or in scientific notation n.mm * lO"(pp)
With the help of type control functions, some further properties of numbers can be asked: 1. Rational Numbers (Type rational): Rational numbers are in Maple the integers and fractions. A fraction, which becomes an integer with all common factors removed, will not be recognized by Maple as afraction (type f r a c t i o n ) . 2. Floating-point Numbers (Type f l o a t ) : If a decimal point is placed behind an integer (nnn.), it is automatically considered a floating-point number. 3. CommonProperties: All three typesofnumbers have the types realcons, numeric and constant. The last two types also belong to the complex numbers. 4. Complex Numbers: Complex numbers are formed with the imaginary unit I as usual. The number 1 is of type radnum, i.e. it is the root of a real number. Its internal definition is: alias(1 = (-1)~(1/2)) (20.34) The command a l i a s provides the possibility to rename functions or other mathematical symbols. (In electro-engineering the imaginary unit is denoted by j instead of I , or sometimes it is convenient to use a shorter name for a function than the one it has in the library.) It has the form > alias(gll!g/2,. . .); (20.35) Here, gli are equations, which define the new symbol by the corresponding Maple functions. If the function a l i a s is called, Maple shows all the other previously defined a l i a s with the new one. An alias can be removed by aliasing the name of itself a l i a s ( s y m = sym).
20.3.2.2 Special Numbers Maple knows several special numbers such as Pi, E, gamma.
20.3.2.3 Representation and Conversion of Numbers 1. Floating-Point Numbers The command evalf (number);results in a floating-point number with a previously given precision; the default value is 20 digits. The argument can be a rational number or a symbolic number given by an expression, and in this last case the result of the calculations is represented. I > evalf ( E ) ; 2.7182818284590452354
968
20. Computer Aloebra Systems
The precision is defined in Maple by the environment variable Digits.If the default value of the number of digits is not suitable for the concrete problem, then it can be changed by the command > Digits := m; m required number ofdigits (20.36) It remains valid until the next definition.
2. Numbers of Different Bases The conversion of decimal numbers into another base can be made by the command convert. The basic form of this command is
convert(ezpr,form,opt),
(20.37)
and it transforms certain types of expressions of one form into another form (if it has any meaning). The argument form can be a type enumerated in Table 20.12. Tabelle 20.12Argument of function convert
't' degrees factorial lessthan multiset rational vector
*' diff float lessequal name ratpoly
D double fraction list octal RootOf
array eqnlist
base binary equality exp GAMMA hex horner listlist I n matrix parfrac polar polynom series set sincos
confrac expln hostfile metric radians sqrfree
decimal expsincos hypergeom mod2 radical tan
The table shows that several forms are provided for conversion of numbers. IExamples of using the command convert: > convert(73, octal); > convert(73,binary); --+ 1001001
> convert(79,hex); --f 4F > convert(11001101,decimal,binary); ---f
+ 111 29 > convert(l.45,rational); + 20 > convert('FFAP',decimal,hex);
205
+ 65442
In the last one, the hexadecimal number is enclosed between backward quotes. With the command convert(list,base,basl,bas2) a number, which is given in the form of a list in basl, is converted into a number in bas2, and the result is given in the form of a list. Input in a form of a list means that the number has the form z = z1 * (bas)O t z2 * (bas)' t 23 * (bas)3t . . . and the list is[....~3~22,21].
IThe octal number 153 should be converted into a hexadecimal number: > convert([l,5,3],base,8,16);--f [9,14] The result is a list.
20.3.3 Important Operators in Maple Important operators are +, -, *, / 2 as the known arithmetical operations; = I <, <=, >, >=, <> as relational operators. The cat operator (concatenation operator) has special importance and it can be written in the short infix form as the point operator '.'. With this operator, two symbols can be connected. I > x := written : h := re :
> cat(h,z); + rewritten Instead of the input cat(h.z) it can be written ' '.h.z;, the result is rewritten. It is important to put an empty string in the first place, since Maple does not replace the very first operand by its value. With this construction, we can produce easily indexed variables. I > 2 := 1,2,3,4,5 : a sequence of integers
20.3 Maple 969
>
y.2;
connecting the sequence to a variable t y l , y2, y3, y4, y5
The result is a sequence of indexed variables.
20.3.4 Algebraic Expressions Expressions can be constructed from variables (symbols) with the help of arithmetical operators. All of them have the type algebraic, which includes the "subtypes" integer, f r a c t i o n , f l o a t , s t r i n g , indexed, s e r i e s , function, uneval, the arithmetical operator types and the point operator. We can see that a single variable (a string) also belongs to the type algebraic. The same is valid for basic number types, since algebraic expressions with the command subs are evaluated into them, in general. I > p:=x"3-4*2"2+3*x+5: Here, an expression, namely, a polynomial of degree 3 in z is defined. With the substitution operator subs a value can be assigned to the variable z in the polynomial (expression) and then the evaluation is performed: 347 > subs(z = 3/4,p); t > subs(z = 3,p); t 5 64 > subs(z = 1.543p);t 3.785864 The operator op displays the internal structure, the subexpressions from an expression. With oP(P); (20.38) we get the sequence (see next section) of subexpressions on the first level, 2 3 , -4x2,32, 5 (20.39) In the form op(i,p);, the i-th term is displayed, so, e.g., op(2,p) yields the term - 4 ~ ' . The number of terms of the expression is given by nops(p);.
20.3.5 Sequences and Lists In Maple, a sequence consists of consecutive expressions separated by commas. The order of the elements is important. Sequences with the same elements but in different orders are considered as different objects. The sequence is a basic type of Maple: exprseq. I > f l := z"3, -4 * ~ " 2 *~z,5 3 : (20.40a) defines a sequence, then (20.40b) > type(f1, exprseq); results in t r u e . With the command (20.41) > seq(f(i).i = 1.n): the sequence j ( l ) , f(2), . . . , j ( n ) is shown. IWith > seq(i2,i = 1..5);we get 1,4,9,16,25. The range operator range defines the range of integer variables represented in the form i = n..m, and it means that the index variable i takes the consecutive values n, n 1,n 2,. . . , m. The type of this structure is ' ..'. An equivalent form to generate a sequence is provided by the simplified form > f ( i ) $ i = n..m; n, n + which also generates f(n))j ( n + l), . . . , j ( m ) . Consequently, $n..m; results in the sequence (20.42) 1 , .. . , m and z$i;the sequence with i terms z. Sequences can be completed by further terms: sequence, a, b, . . . (20.43)
+
+
I
970
20. Computer Alqebra Swtems
If we put a sequence f into square brackets, then we get a list, which has a type l i s t . I > 1 := [i$i = 1..6)]; --il := [1,2,3,4,5,6] With the already known operator op, for the command op(1iste); we get the sequence back, which was the base of the list. A list can be completed if first it is changed back into a sequence, this sequence is completed, then it is changed into a new list by square brackets. Lists can have elements which are themselves lists; their type is l i s t l i s t . These types of constructions have an important role when matrices are constructed. The selection of one particular element of a list can be done by op(n, liste);.This command gives the n-th element of the list. If the list has a name, like L , then it is easier to type L[n];. For a double list we find the elements on a lower level with op(rn, op(n, L ) ) ;or with the equivalent call L[n][v];. There are no difficulties in building up lists with higher levels. IGenerating a simple list:
> L1 := [a,b,c,d,e,f]; -+ L1 := [a,b,c,d,e,f] Selecting the fourth element of this list: > op(4,L); or > L[4]; --+ d Generating a double nested list: > L2 := [ [ a ,b, c], [d, e , f]]: (output is suppressed!) Selecting the third element of the second sublist: > oP(3,OP(2,L)); or L[21[3]; + Generating a triple nested list:
f
> L3 := [[[a,b, cl, [d, e, fll, [[s, tl, [u,41)[[x,YI,
[wzlll :
20.3.6 Tables, Arrays, Vectors and Matrices 20.3.6.1 Tables and Arrays Maple knows the commands t a b l e and array to construct tables and arrays. With table(zfc, Izst)
(20.44) Maple generates a table-type structure. Here, zfc is an indexing function, list is a list of expressions, whose elements are equations. In this case Maple use the left-hand side of the equation as the numbering of the table elements and the right-hand side as the current table element. If the list contains only elements, then Maple uses the natural numbering, starting at one. I > T := table([a,b, c]); -+ T := table([ l=a 2=b 3=e])
> R := table([a = x,b = y,c = 21);
R
:= table([
a=x
I
b=y c=z])
The repeated call of T or R gives only the symbols T or R. With op(T);, the output is the table; for the call op(op(T)); we get the components of the table in the form fnc, a list of the equations for the table elements. Here, we can see that the evaluation principle for these structures is different from the general one. In general, Maple evaluates an expression until the end, Le., until no further
20.3 Maple 971
transformations are possible. However, while the definition is recognized in the example above, further evaluation is suppressed until it is explicitly required with the special command op. The indices of T form a sequence with the command indices(T);,a sequence of the elements can be obtained by entries ( T ) ; . IFor the previous examples > indices(T); yields [l],[2],[3] > indices(R); yields [a],[b],[c] and correspondingly > entries(R); yields [x]:[y],[t] With the command array(ifc, list); (20.45) special tables can be generated, which can be of several dimensions, but they can have only integer indices in every dimension.
20.3.6.2 One-Dimensional Arrays With array(l..5);,e.g., a one-dimensional field of length 5 is generated without explicit elements; with v := array(l.5,[a(l)>a(2), a(3).a(4), a(5)l);onegetsthesamebutwithgivencomponents.Theseonedimensional fields are considered by Maple as vectors. With the type check function type(v,vector); we get true. If we ask whattype(v);,then the answer is string. This is in connection with the special evaluation mentioned above. If we first give the command v l := eval(v);, then we can get for whattype(v1); the required answer table.
20.3.6.3 Two-Dimensional Arrays Two-dimensional arrays can be defined similarly with A := array(l..m,1 . q [[a(l,I), . . . ,a(l,n)ll.. . , [a(m,I), . . . ,a(m,n)]]); (20.46) The structure defined in this way is considered by Maple as a matrix of size m x R . The values of a ( i , j ) are the corresponding matrix elements. W > x := array(1..3,[xl,x2,x3]); x := [xlx2 231 results in a vector. A matrix is obtained, e.g., by > A := array(l..3,1..4, A := array(l..3,1..4, The input
[I); [I)
I
?[1,1] ?[1,2] ?[1,3]?[1,4]
> eval(A): yields the output ?[3,1] ?[3,2] ?[3,3] ?[3,4]
Maple characterizes the undefined values of the matrix by the question mark ?[,,31. to all or some of these elements, like > -4[1.1]:= 1 : A[2,2]:= 1 : A[3,3] := 0 : then the renewed call for A will be displayed with the given values:
> eval(A); +
1
1
?[2,11
?[I$
1
I
?[1,3] ?[1,4] ?[2,3] ?[2,4]
0 ?[3,4] With the command > B := array([[bll,b12,b131,[b21,b22.b23],[b31,b32,b33]]); ?[3,l] ?[3,2]
If a value is assigned
972
20. Computer Al.qebra Systems
c :=
c l l ~ 1 2e13 C [ i , d ] c21 c22 c[2,3c ][2,4] c[3,11
c[3,21
cY,31
c[3,41
c14,11
c[4,21 c[4,31
c[4,41
-
20.3 Maple 973
Table 20.14 Special functions Bessel functions Jn(z) and Y,(z) Modified Bessel functions In(z)and K , ( z ) Gamma function Integral exponential function
BesselJ(v, z ) , BesselY(v,z) BesselI(v, z ) , BesselK(v, z ) Gamma(x)
Ei(x)
The Fresnel functions can also be found among the special functions. The package for orthogonal polynomials contains among others the Hermite, Laguerre, Legendre, Jacobi and Chebyshev polynomials. For more details see [20.6].
20.3.7.2 Operators In Maple the functions behave as the so-called X functions in the programming language LISP. Slightly simplified this means that the name of a function, as defined in Maple, is considered as an operator. In other words, type(sin, operator); yields true. If the argument, or several arguments if this is needed, is attached to the operator in parentheses, then one gets the corresponding function of the given variables. I > type(cos, operator); yields t r u e and > type(cos, function); yields f a l s e . If we replace the argument cos by cos(x), then type checking will give the opposite results Maple provides the possibility to generate self-defined functions in operator form. A function can be defined by the arrow operator ->. With (20.47) > F := x-> mathexpr : and with mathexpr as an algebraic expression of the variable x, a new function with name F is defined in operator form. The algebraic expression can contain previously defined and/or built-in functions. If an independent variable is attached in parentheses to the operator symbol generated in this way, then it becomes a function of this independent variable. H > F := x-> sin($) * cos(z) t xA3* tan(z) t xA2 : > F(y);--+ F(y) := sin(y)cos(y)t y3 tan(y) yz If a numerical value (e.g., a floating-point number) is assigned to this argument, say, with the call > F(nn.mmm); Maple gives the corresponding function value. Conversely, it can be generated from a function (e.g.>from a polynomial of the variable x) the corresponding operator with the command unapply(function, uw).So, we get back from F ( y ) with > maPPlY(F(Y),Y); --+ F the operator with symbol F . It is possible to work with operators according to the usual rules. The sum and difference of two operators is again an operator. For multiplication we have to be careful that the product is again an operator. Maple uses the special multiplication symbol @ for operator multiplication. In general, this multiplication is not commutative. Let F := z-> cos(2 ii z)and G := E-> x"2. Then
+
> ( G Q F ) ( z ) ;t c0s2(2z), while
> ( F @ G ) ( z )t ; c0s(2z2) can be formed by the product oftwo functions given in operator representation (FrG)(z)= ( G * F ) ( x ) , which results in F ( z )ii G(z).
20.3.7.3 Differential Operators The differentiation operator in Maple is D. Its application on functions of operator form is D(F) or D[z](G). In the first case the derivative of a function of one variable is defined in operator form. The attachment of the variable in parentheses results in the derivative as a function. In another form, it
974
20. Computer Alqebra Swtems
can be written as D(F)(x) = d i f f ( F ( z ) , z ) . Higher derivatives can be got by repeated application of the operator D, which can be simplified by the notation ( D O Bn)(F) where B B n means the n-th “power” of the differential operator. If G is a function of several variables, then D[i](G)generates the partial derivatives of G with respect to the i-th variable. This result is also an operator. With D[i,j](G)we get D[i](D[j](G)),Le., the second partial derivative with respect to the j-th and i-th variables. Higher derivatives can be formed similarly. The rules of differential calculus (see 6.1.2.2, p. 378) are valid for the differential operator D, where F and H are differentiable functions: (20.48a) D(F + H ) = D(F)+ D(H),
D(F * H ) = (D(F)* H)+ (F * D(H)),
(20.48b)
D(FQH ) = D(F)Q H * D(H).
(20.48~)
20.3.7.4 The Functional Operator map The operator map can be used in Maple to apply an operator or a function to an expression or to its components. Let, e.g., F be an operator representing a function. Then map(F, z x”2 z * y) yields the expression F ( x ) t F ( x 2 )t F(xy). Similarly, with map(F; y * z ) the result is F ( y ) * F ( z ) .
+
+
map(f. [a>5, c, 4); + [f(a)sf(b), f(c), f(4l
20.3.8 Programming in Maple Maple provides the usual control and loopstructures in a special form to build procedures and programs. Case distinction is made by the i f command. Its basic structure is (20.49) i f cond. then stat1 e l s e stat2 f i The e l s e branch can be omitted. Before the e l s e branch, arbitrarily many further branches can be introduced with the structure (20.50) e l i f condi then stati f i Loops are generated with f o r and while, which require in the command part the form do. . . stat. . . od. In the f o r loop the running index must be written in the form i from n t o m by di where di is the step size. If the initial value and step size are missing, then they are automatically replaced by 1. In the while loop, the first part is f o r ind while cond. Also loops can be multiply nested into each other. In order to write a closed program, the procedure command is needed in Maple. It can have several rows, and if it is stored appropriately, then it can be recalled by name if it is needed. Its basic structure is: proc (args) local . . . (20.51) options . . . commands end; The number of arguments of the procedure is not necessarily equal to number of the variables used by the kernel of the procedure; in particular they can be completely missing. All variables defined by l o c a l are known only in the procedure.
20.4 Applications of Computer Algebra Systems 975
IWrite a procedure which calculates the sum of the square roots of the first n natural numbers: > sumqlli := proc(n) > l o c a l s, i; > s[O] := 0; > foriton do s[i] := s[z - 11 + s q r t ( i ) od; > > evalf ( ~ [ n ] ) ; > end; Maple displays the procedure defined in this way. Then the procedure can be called by name with the required argument n: > sumqw(30); Output: 112.0828452
20.3.9 Supplement about Syntax, Information and Help 20.3.9.1 Using the Maple Library Besides the innumerable commands available to the user after starting Maple, there exist so-called mixed library functions and commands which must be made available by the command r e a d l i b ( f ) . .4n enumeration and short description of these commands is in [20.11], Section 2.2. Maple has a huge library of packages. .4special package can be loaded with the command > with(name); (20.52) Here the name is the name of the current package, hence linalg for the package of linear algebra. After loading, Maple lists all the commands of the package and gives a warning if in the new definitions there are commands already available which were introduced earlier. If only one particular command is needed from a package, then it can be called by paket[command] (20.53)
20.3.9.2 Environment Variable The output of Maple is controled by several environment variables. We already introduced the variable Digits (see 20.3.2.3, l.,p. 967), by which the number of the displayed digits of floating-point numbers is defined. The general form of the output of the result is defined by p r e t t y p r i n t . Default is here > interface(prettyprint = true) (20.54) This provides centered output in mathematical style. If this option is defined f a l s e , then the output starts at the left-hand side with the form of input.
20.3.9.3 Information and Help Help about the meaning of commands and keywords is available by the input ?notion; (20.55) Instead of the question mark, help(notzon) can also be written. This results in a help screen, which contains the corresponding part of the library handbook for the required notion. If Maple runs under Windows, then the call for HELP opens a menu usually on the right-hand side, and the explanation about the required notion can be obtained by clicking on it with the mouse.
20.4 Applications of Computer Algebra Systems This section describes how to handle mathematical problems with computer algebra systems. The choice of the considered problems is organized according to their frequency in practice and also according to the possibilities of solving them with a computer algebra system. Examples will be given for
976
20. Computer Aloebra Svsterns
functions, commands, operations and supplementary syntax, and hints for current computer algebra systems. When it is important, the corresponding special package is also discussed briefly.
20.4.1 Manipulation of Algebraic Expressions In practice, further operations must usually be performed with the occurring algebraic expressions (see 1.1.5, p. 10) such as differentiation, integration, series representation, limitingor numerical evaluation, transformations, etc. In general, these expressions are considered over the ring of integers (see 5.3.6, p. 313) or over the field (see 5.3.6.1, 2., p. 313) of real numbers. Computer algebra systems can handle, e.g., polynomials also over finite fields or over extension fields (see 5.3.6.1, 3., p. 313) of the rational numbers. Interested people should study the special literature. The algebraic operations with polynomials over the field of rational numbers have special importance.
20.4.1.1 Mat hematica Mathernatica provides the functions and operations represented in Table 20.15 for transformation of algebraic expressions. Tabelle 20.15 Commands for manipulation of algebraic expressions EXPandbl EXPandb, TI PowerExpand[a] Factor[p] Collect[p,x] Collect[p, {x, y,. . .}] ExpandNumerator[r] ExpandDenominator[r] ExpandAll [r] Together [r] Apart[r] Cancel[r]
expands the powers and products in a polynomial p by multiplication multiplies only the parts in p , which contain r expands also the powers of products and powers of powers factorizes a polynomial completely orders the polynomial with respect the powers of x the same as the previous one, with several variables expands only the numerator of a rational expression expands only the denominator expands both numerator and denominator completely combines the terms in the expression over a common denominator represents the expression in partial fractions cancels the common factors in the fraction
1. Multiplication of Expressions The operation of multiplication of algebraic expressions can always be performed. The coefficients can also be undefined expressions. gives I I n [ l I : = Expand[(x + y - z)"4] Out[I] = x4 + 4 x 3 y + 6 x 2 y 2+ 4 2 y3 + y4 - 4x32 - 1 2 x 2 y z- 1 2 ~ - ~ 4 y 3~z 2 +6x2 z2 + 1 2 x y t ' + 6y2 z2 - 4 x z 3 - 4 y t 3 + t4 Similarly,
+
Inf21 : = Expand[(a x byA2)(cx"3 - d yA2)] Out [21 = a c x4 - a d x y2 b cx3yz - bd y4
+
2. Factorization of Polynomials Mathernatica performs factorization over the integer or rational numbers if it is possible. Otherwise the original expression is returned.
+
+
I I n [ 2 ] : = p = x"6 7xA5 12x"4 + 6x"3 - 252"2 - 30x - 25; In[3] := Factorb] , gives Out[3]= ( ( 5 + 2 ) ( 1 + x + x 2 ) (-5+x2+x3))
20.4 Applications of Computer Alqebra Sustems 977
Mathernatica decomposes the polynomial into three factors which are irreducible over the rational numbers. If a polynomial can be completely decomposed over the complex rational numbers, then this can be obtained by the option GaussianIntegers. I In[,$] := Factor[x2 - 22 t 51 --t Out[41= 5 - 22 t x2 , but In[5l := Factor[%, GaussianIntegers-> True] Out[51= ( - 1 - 2 1 + x ) ( - l t 2 1 t x )
3. Operations with Polynomials Table 20.16 contains a collection of operations by which polynomials can be algebraically manipulated over the field of rational numbers. Table 20.16 Algebraic polynomial operations
PolynomialLCM[pl,p2] PolynomialQuotient[pl,p2,x]
p l and p2 determines the least common multiple of the polynomials p l and p2 dividespl (as a function of x) by p2, the residue is omitted
ITwo polynomials are defined: I n [ l ] := p = x"6 t 7x"5 + 12x"4 + 6x"3 - 25x"2 - 30x - 25; q = ~ " +4 ~ " -3 6 ~ " 2- 72 - 7; LVith these polynomials the following operations are performed: In[2] := PolynomialGCD[p,q] t Out[2] = 1 x x2 I n [ ] := PolynomialLCM[p,q] a ~ t [ 3 1= 3 ( - 7 + ~ ) ( i t ~ t x ~ ) ( - 2 5 - 5 ~ t 5 ~ ~ + 6 ~ ~ + ~ ~ ) In&] := PolynomialQuotient [p, q: x] t Out [4] = 12 + 6x + x2 In[5] := PolynomialRemainder[pp,q . x] t Out[51 = 59 962 96x2 37x3 With the two last results we get 37x3 962' t 962 t 59 x6 + 7x5 t E x 4 t 6x3 - 252' - 302 - 25 = x2 t 62 12 t x4 + 2 3 - 6x2 - 7~ - 7 x4 x3 - 62' - 7x - 7
+ +
+
+
+
+
+
+
4. Partial Fraction Decomposition Mathernatica can decompose a fraction of two polynomials into partial fractions, of course, over the field of rational numbers. The degree of the numerator of any part is always less than the degree of the denominator. Using the polynomials p and q from the previous example we get -6 -55 11x 6s' InL61 := Apart[q/p] t Out C61 = 35 ( 5 + x ) $ 3 5 ( - 5 + x Z + x 3 )
+
+
~
5 . Manipulation of Non-Polynomial Expressions Complicated expressions, not necessarily polynomials, can often be simplified by the help of the command Simplify. Mathernatica will always try to manipulate algebraic expressions, independently of the nature of the symbolic quantities. Here, certain built-in knowledge is used. Mathernatica knows the rules of powers (see 1.1.4.1,p. 7 ) : I ~ [ I := I ~ i m p ~ i f y [ a " n / a " m )-+ ] o u t ~ =i a(-m+n) ~
(20.56)
I
978
20. Computer Alqebra Svstems
With the option T r i g -> True, the commands Expand and Factor can express powers of trigonometric functions by trigonometric functions with multiple arguments, and conversely. ISome trigonometric formulas (see 2.7.2.3, p. 79) can be verified with the following input: InL21 := Factor[Sin[4x],Trig-> True] t Uutf21 = 4 c 0 s ( x ) ~sin(x) - 4 cos(x) sin(^)^ In[31 := Factor[Cos[5x],Trig-> True] t Uutf31 = cos(x) (1 - 2 cos(22) + 2 cos(4x))
Beginning with version 2.2 of Mathernatica, the option Trig-> True can be directly called by the command TrigExpand. It is valid for several commands of the package Algebra' Trigonometry' which can be loaded. Finally, we mention that the command ComplexExpand[expr] assumes a real variable expr, while in ComplexExpand[expr, ( x l , x 2 , . . .}]the variables xi are supposed to be complex.
I
In[il := ComplexExpand[Sin[2 x], {x}] Uut [I] = Cosh[2 Im[x]]Sin[:! Re[z]] I Cos[2 Re[.]] Sinh[2 Im[x]]
+
20.4.1.2 Maple Maple provides the operations enumerated in Table 20.17 for transformation and simplification of algebraic expressions. Table 20.17 Operations to manipulate algebraic expressions expands the powers and the products in an algebraic expression p . The optional arguments qi prevent the further expansion of the subexpressions qz. factorizes the expression p. K is an optional RootOf argument. factor(p,K) simplify(p, q l , q2,. . .) applies built-in simplifying rules on p. In the presence of the optional arguments these rules are applied only for them. simplifies p containing radicals. radsimp(p) normalizes the rational expression p. normal(p) sorts the terms of polynomialp in order of decreasing degree. sort(p) determines the coefficient of xt. coeff (p, x,i) collects the terms of a polynomial of several variables containing the c o l l e c t ( p ,u ) variable u.
1. Multiplication of Expressions In the simplest case Maple decomposes the expression into the sum of powers of the variables: W > expand((x y - z)"4); 4 x3y - 4 x3z + 6 x'y' + 6 x z z 2+ 4 xy3 - 4xz3 - 4 y3z + 6 yzz2 - 4 yz3 + x4 + y4 + z4 -122'yz - 12zy*z + 12xyzZ IHere we can demonstrate the effect of the absence and presence of an optional argument on the Maple procedure. > expand((a * x"3 b * y"4) * sZn(3 * x) * cos(2 * x)); 8 ax3sin(x) COS(^)^ - 6 ax3 sin(x) cos(x)' + ax3 sin(x) + 8 by4 sin(x) -6 by4 sin(z) cos(x)' by4 sin(x) The expression is completely expanded. > expand((a * x"2 - b * y"3) 1; sin(3 * x) *cos(:! * x),a * x"2 - b * ~ " 3 ) ; 8 (axz- by3) sin(x) C O S ( Z-) ~6 (ax' - by3) sin(x) cos(z)' + (ax' - by3) sin(z) Maple kept the expression of the optional argument unchanged.
+
+
+
20.4 Applications of Computer Alqebra Svstems 979
The following example shows the effectiveness of Maple:
I > expand(ezp(2 * a * z)* sznh(2 * z)+ ln(x3)* sin(4 * z)); 2 eaZ2sinh(z) cosh(z) + 24 ln(x) sin(z) C O S ( X ) ~- 12 ln(z) sin(z) cos(z)
2. Factorization of Polynomials Maple can decompose polynomials over algebraic extension fields (if it is possible anyway). W
> p := ~
+ +
+
+
" 6 7 * ~ " 5 12 * ~ " 4 6 * ~ " -3 25 * ~ " -2 30 * x - 25: * *x -7:
q := ~ " 4 ~ " 3 6 ~ " -2 7 > p l := factor(p);
(z+ 5 ) (z2+ z + 1) (z3+ z2 - 5) and
> q l := factor(q); (2t z + 1) ( 2- 7) Here, Maple decomposed both polynomials into irreducible factors over the field of rational numbers. If we want to have a decomposition over an algebraic extension field, then we can continue:
I > p2 := factor(p, (-3)"(1/2)); (.3 + z 2 - 5 ) ( 2 z + 1 - ,m) ( 2 x + 1 + G)( z + 5 ) 4 Maple decomposed the second factor (in this case after a formal extension of the field by G). In general. we do not know if such an extension is possible. If the degrees of the factors are 5 4, then it is possible. With the operation RootOf, the roots can be determined as algebraic expressions.
I> r := RootOf (z3+ z2 - 5 ) : k := allvalues(r) : > k[l]:
1
-+-
- 113
> ki21: L
-
11
2
-
1
- 113
t
2 The call for k[3] results in the complex conjugate of k[2]. The procedure given in this example yields a sequence of floating-point numbers in case the polynomial can be decomposed only over the field of real or complex numbers.
3. Operations with Polynomials Besides the operations discussed above, the operations gcd and lcm have special importance. They find the greatest common divisor and the least common multiple of two polynomials. Correspondingly,
I
980
20. Computer A1,qebra Systems
quo(p, q , 2 ) yields the integer part of the ratio of polynomialsp and q, and rem(p, q, x) gives the residue.
I > p := x6 + 7 * x5 + 12 * x4 + 6 * x3 - 25* X' - 30* x - 25: q := x4 + x3 - 6 'X - 7 * x - 7 : > gcd(p, q ) ; X'+X+l
> Wp,q);
2102 + 5z6- 43x5 - 1 0 9 -~ 72x3 ~
+ 1502' + 175 + 72' + x8
With the command normal the ratio of two polynomials can be transformed into normal form over the field of rational numbers, Le., the quotient of two relatively prime polynomials with integer coefficients. IWith the polynomials from the previous example
> normal(p/q);
+
x4 + 6x3 5 X' - 5 2 - 25 2' - 7 With numer and denom the numerator and the denominator can be represented separately.
> factor(denom("), (7)"(1/2)); 4. Partial Fraction Decomposition The partial fraction decomposition in Maple is performed by the command convert, which is called with the option parfrac. IUsing the polynomials p and q from the previous examples we get
> convert(p/q, p a r f r a c , r ) ;
+
372 59 and + 6 ~12 > convert(q/p, p a r i r a c , i ) ; -~ 6 + -55 + 11x + 62' 352 + 175 35x3 + 352' - 175 ' 2
+ +
5. Manipulation of General Expressions The operations introduced in the following allow the transformation of algebraic and transcendental expressions with rational and algebraic functions containing functions which are self-defined or built-in in Maple. In general, optional arguments can be given, which modify the transformations under certain conditions. The command simplify is introduced here in an example. In the simple form simplify(exppr) Maple performs some built-in simplification rules on the expression. I
> t := sZnh(3 * 2 ) + Cosh(4 * X ) : > simplify(t); 4 sinh(z) cosh(x)' - sinh(x) t 8 c ~ s h ( x-) ~8 cosh(x)'
Similarly,
I > r := sin(2 * 2 ) * cos(3 * 2 ) : > simplify(r); 8 sin(z) C O S ( Z ) ~- 6 sin(x) cos(x)'
+1
20.4 Applications of Computer Alqebra Sustems 981
There exists a command combine, which is the reverse command of expand in a certain sense. I > t := t a n ( 2 * ~ ) ~ : > t l := expand(t); 4 sin(x)' cos(x)' t l := ( 2 cos(2)' - 1)' > combine(t1,trig); cos(22)4 - 1 Here, combine was called with the option t r i g , which provides the use ofthe basic rules of trigonometry. Using the command simplify we get cos(2 x)' - 1 > t2 := simplify(t); ; t cos(2x)2 Here, Maple reduced the tangent function to the cosine function. Transformations can be performed with the exponential, logarithmic and further functions.
20.4.2 Solution of Equations and Systems of Equations Computer algebra systems know procedures to solve equations and systems of equations. If the equation can be solved explicitly in the domain of algebraic numbers, then the solution will be represented with the help of radicals. If it is not possible to give the solution in closed form, then at least numerical solutions can be found with a given accuracy. In the following, we introduce some basic commands. The solution of systems of linear equations (see 20.4.3.2, l . ,p. 986) is discussed in a special section.
20.4.2.1 Mathernatica 1. Equations Mathernatica allows the manipulation and solution of equations within a wide range. In Mathernatica. an equation is considered as a logical expression. If one writes (20.57a) 1n[1I := g = x"2 t 22 - 9 == 0, then Mathernatica considers it as a definition of an identity. Giving the input In[21 := %/. x-> 2 , we get Out[2] = False, (20.57b) since with this value of x the left-hand side and right-hand side are not equal. The command Roots[g,x] transforms the above identity into a form which contains x explicitly. Mathematica represents the result with the help of the logical OR in the form of a logical statement: In[2] : = Roots[g,I] yields -2 - 2Sqrt[40] -2 Sqrt[40] Ilx== (20.57~) aut 121 = x == 2 2 In this sense, logical operations can be performed with equations. With the operation ToRules,the last logical type equations can be transformed as follows: In[31 : = {ToRules[%]} + -2 + Sqrt[40] -2 - 2Sqrt[40] (20.57d) Out [3i = {{x -> 1,{x -> 11 2 2
+
2. Solution of Polynomial Equations Mathernatica provides the command Solve to solve equations. In a certain sense, Solve perform the operations Roots and ToRules after each other. Mathernatica solves polynomialequations in symbolic form up to fourth degree, since for these equations solutions can be given in the form of algebraic expressions. However, if equations of higher degree can be transformed into a simpler form by algebraic transformations, such as factorization, then Mathematica
I
982 20. Computer Alqebra Sustems
provides symbolic solutions. In these cases, Solve tries to apply the built-in operations Expand and Decompose. In Mathernatica numerical solutions are also available. IThe general solution of an equation of third degree: In[41 := Solve[zA3+ a xA2t b z c == 0,2] Mathernatica gives -a aut[ql = {{x->
+
1
23 (-a2 3
3 -2 a3 t 9 a b - 27c + 35
(
(-2 a3 + 9 a b - 27c+ 39
4-(a2
+ 3 b) 1
b2)
+ 4 b3 + 4a3c - 1 8 a bc + 27c2):
4-(a2b2) + 4
b3
+ 4a3c - 1 8 a bc + 27c2)
32 Q
. . .}
The solution list shows only the first term explicitly because of the length of their terms. If we want to solve an equation with given coefficients a, b, c, then it is better to handle the equation itself with Solve than to substitute a, b, c into the solution formula. IA: For the cubic equation (see 1.6.2.3, p. 40) z3 + 62 + 2 = 0 we get: InL51 : = Solve[xA3+ 6 z 2 == 0, 21 Out[5] = {(z-> -0.32748}, (2-> 0.16374 + 2.46585 I}, (2-> 0.16374 - 2.46585 I}} IB: Solution of an equation of sixth degree:
+
+
In[6l : = Solve[zA6- 6 d 5 6 d 4 - 4zA3t 6 5 ~ -~ 382 2 - 120 == 0,2] Out[6] = ((2-> -l}, (2-> 4)) (2-> 3}, (2-> 2}, (2-> -1 - 2 I}, (E-> -1 2 I}} Mathernatica succeeded in factorizing the equation in B with internal tools; then it is solved without difficulty. If numerical solutions are required, then the command NSolve is recommended, since it is faster.
+
IThe following equation is solved by NSolve: In[7] : = NSolve[xA6- 42"5 + 6xA4- 52"3 + 3xA2- 42 2 == 0 , 2 ] Out [7] = ((x-> -0.379567 - 0.76948 I}, (2-> -0.379567 0.76948 I}, {x-> 0.641445}, (2-> 1. - 1. I}, (2-> 1. t 1. I}, {z-> 2.11769))
+ +
3. Solution of Transcendental Equations Mathernatica can solve transcendental equations, as well. In general, this is not possible symbolically, and these equations often have infinitely many solutions. In these cases, a domain should be given, where Mathernatica has to find the solutions. This is possible with the command FindRoot[g, (z,zS}], where 2, is the initial value for the search of the root. I In[8] : = FindRoot[z + ArcCoth[z] - 4 == 0, ( 2 )1.1}] Out[8] = 2-> 1.00502 t O.I} and In[9] : = FindRoot[z + ArcCoth[z] - 4 == 0, {x,5}] -i Out[9] = (2-> 3.72478 t 0.1)
4. Solution of Systems of Equations Mathernatica can solve simultaneous equations. The operations, built-in for this purpose, are represented in Table 20.18.and they present the symbolical solutions, not the numerical ones. Similarly to the case of one unknown, the command NSolve gives the numerical solution. The solution of systems of linear equations is discussed in 20.4.3, p. 984.
20.4 Applications of Computer Algebra Systems 983
Table 20.18 Operations to solve systems of equations Solve[{ll == rl, /2 == r 2 , .. .}, {z,yI . . .}] solves the given system of equations with respect to the unknowns eliminates the elements 2,. . . from the system of Eliminate[{ll = = T I , . . .},{z,. . .}] equations simplifies the system of equations and gives the possiReduce[{ll ==TI).. .},{z,. . .}] ble solutions
20.4.2.2 Maple 1. Important Operations The two basic operations in Maple to solve equations symbolically are solve and RootOf or r o o t s . With them. or with their variations with certain optional arguments, it is possible to solve several equations, even transcendental ones. If an equation cannot be solved in closed form, Maple provides numerical solutions. RootOf represents the roots of an equation of one variable: k := RootOf (2"s - 5 * z 7 , Z) + k := RootOf (-Z3 - 5 -Z 7 ) (20.58) In Maple k denotes the set of the roots of the equation x3 - 52 7 = 0. Here, the given expression is transformed into a simple form, if it is possible, and the global variable is represented by -Z. (Maple returns an unevaluated RootOf.) The command a l l v a l u e s ( k ) results in a sequence of the roots. The command solve yields the solution of an equation if any exists. W > k := solve(z"4 x"3 - 6 * x"2 - 7 * z - 7, x); 1 1 1 1 k : = - - + - l ~ , - - - - I ~ , ~ , - ~ ~ 2 2 2 2 If the equation entered is of degree greater then four, then answers are provided in terms of RootOf: > r := solve(x"6 4 * z"5 - 3 * z t 2, x);; --+ r := RootOf (-Z6 4-Z5 - 3-2 2) This equation has no solution for rational numbers. With a l l v a l u e s we get the approximate numerical solutions.
+
+
+
+
+
+
+
2. Solution of Equations with One Variable 1. Polynomial Equations Polynomial equations with one unknown, whose degree is 5 4, can be solved by Maple symbolically.
W
> solve(z"4 - 5 1; x"2 - 6); t I, -I, v%, -&
2. Equations of Degree Three Maple can solve the general third degree equation with general
coefficients. W > r := sohe(z"3 + a * x"2
> $1;
+ b * x + c, z):
J4b3-b2a2-18bact27c2+4ca3& 18 -b - -a2 i
-
0
.
n3
a
- -
3 b_a _ c_ -a3_ ~ 4 b 3 - b 2 a 2 - 1 8 b a c t 2 7 c 2 + 4 c a 3 &
d. 6_
2-
27' -
18
~.
We get the corresponding expression for the other roots r[2],r[3],too, but we do not give them here because of their length. If the input equation in solve has floating-point numbers for coefficierits,then Maple solves the equation numerically.
I
984
20. Computer Algebra Systems
I > solve(1. * z”3 t 6. * z + 2., 2); -3.27480002, .1637400010 - 2,465853271, .1637400010
+ 2.465853271
3. General E q u a t i o n of Degree Four The general solution is given by Maple also for polynomial equations of degree four. 4. O t h e r Algebraic Expressions Maple can solve equations containing radical expressions of the unknown. 5. Extra root We must be careful, when taking roots, because occasionally we may get equations whose roots are not roots of the original equation. These roots are called extra roots. Hence, every root offered by Maple should be substituted into the original expression. - 1 = 0 is to be determined. The input is IThe solution of the equation + > p := s q r t ( z t 7) t s q r t ( 2 3 z - 1) - 1 : 1 := solve(p = 0,z) : With > s := allvalues(l) : we obtain 1 1 1 1 s[l] := - + -(2 + fi)z and s[2]:= - t -(2 - fi)z 2 2 2 2 By > subs(z = s[i ]. p );i = 1 , 2 we can convince ourselves that only s[2] is a solution.
3. Solution of Transcendental Equations Equations containing transcendental parts can usually be solved only numerically. Maple provides the command f s o l v e for numerical solution of any kind of equation. With this command, Maple finds real roots of the equation. Usually, we get only one root. However, transcendental equations often have several roots. The command f s o l v e has an optional third argument, the domain where the root is to be found. I > f s o l v e ( z arccoth(z) - 4 = 0, E ) ; --t 3.724783628 but (20.59) > f s o l v e ( z + arccoth(z) - 4 = O,z, 1..1.5); + 1.005020099
+
4. Solution of Nonlinear System of Equations Systems of equations can be solved by the same commands solve and fsolve. The first argument contains all of the equations in curly braces, and the second one, also in curly braces, lists the unknowns for which the equations are to be solved: (20.60) > solve({gl1,9/2,. . .}, {z1,z2,.. .})I I
> solve((a”2 - yh2 = 2,2”2 + yh2 = 4}, {z, y}); {y=l,z=J?;},{y=l,z=-J?;},{y=-l,
(20.61) z=&},{y=-l,z=fl}
20.4.3 Elements of Linear Algebra 20.4.3.1 Mat hemat ica In 20.2.4, p. 956, the notion of matrix and several operations with matrices were defined on the basis of lists, Mathernatica applies these notions in the theory of systems of linear equations. In the followings (20.62) P = Arrayb, { m , defines a matrix of type (m,n) with elements pi, = p[[i,j]].Furthermore (20.63) z = Array[z, {n}] und b = Array[b, { m } ] are n- or m-dimensional vectors. With these definitions the general system of linear homogeneous or inhomogeneous equations can be written in the form (see 4.4.2, p. 272) p.z==b p.z==O (20.64)
I).
1. Special Case n = m, detp # 0 In the special case n = m, detp # 0, the system of inhomogeneous equations has a unique solution, which can be determined directly by s = Inverse[p].b (20.65)
20.4 Applications of Computer Algebra Systems 985
Mathematica can handle such systems of up to ca. 50 unknowns in a reasonable time, depending on the computer system. .An equivalent, but much faster solution is obtained by LinearSolveb, b].
2. General Case With the commands Linearsolve and NullSpace, all the possible cases can be handled as discussed in 4.4.2,p. 272, i.e., it can be determined first if any solution exists, and if it does, then it is calculated. In the following, we discuss some of the examples from Section 4.4.2, p. 272ff. A: The example in 4.4.2.1, 2., p. 274, is a system of homogeneous equations 21 - 2 2 + 5x3 - 2 4 = 0 x 1 + x2 - 2x3 3x4 = 0 3x1 - Z2 + 8x3 24 =0 21 3x2 - 9x3 7x4 = 0 which has non-trivial solutions. These solutions are the linear combinations of the basis vectors of the null space of matrixp. It is the subspace of the n-dimensional vector space which is mapped into the zero by the transformation p . A basis for this space can be generated by the command NullSpace[p]. lVith the input In[11:=p={{1,-1,5~-1},{1,1,-2,3},{3,-1~8,1},{1,3,-9,7}} we define the matrix whose determinant is actually zero, which can be checked by Det[p]. Now we take: 3 7 In121 := NullSpace[p] andget Uut121 = {{--,-,1,0},{-1~-2,0,1}} 2 2 is displayed, a list of two linearly independent vectors of four-dimensional space, which form a basis for the two-dimensional null-space of matrixp. An arbitrary linear combination of these vectors is also in the null-space, so it is a solution of the system of homogeneous equations. This solution coincides with the solution found in Section 4.4.2.1, 2., p. 274. B: Consider the example A in 4.4.2.1,2., p. 273, x1 - 2x2 + 3x3 - x4 t 2x5 = 2 3x1 - 2 2 523 - 3x4 - ~5 = 6 2xl + x2 t 2x3 - 2x4 - 3x5 = 8 with matrix m l of type ( 3 , 5 ) ,and vector bl Inr31 := m l = {{1, -2,3, -1,2}, (3, -1,5, -3, -1}, { 2 , 1 , 2 , -2, -3}}; In141 := b l = { 2 , 6 , 8 } ; For the command In141 := LinearSolve[ml, bl] the response is Linearsolve : : nosol: Linear equation encountered which has no solution. The input appears as output. C: According to example B from Section 4.4.2.1, l., p. 273, x 1 - x2 + 2x3 = 1 2 1 - 2x2 - 2 3 = 2 321 - x2 + 5x3 = 3 -221 + 2x2 3x3 = -4 the input is 1n[5] := m2 = {{1,-1,2},{1~-2,-1},{3,-1~5}~{-2,2,3}}; In161 := b2= { 1 , 2 , 3 , - 4 } ; If we want to know how many equations have independent left-hand sides, then we call InL71 := RowReduce[m2];+ U u t f 7 ] = {{1,0,0}, {0,1, O } , {0,0, I}, {0,0,0}}
+ + +
+
+
+
986
20. Computer Algebra systems
Then the input is In[8] := LinearSolve[m2, b2];
1 2 + Outf81 = {-107 ’ --7’ --} 7
The answer is the known solution.
3. Eigenvalues and Eigenvectors Eigenvalues and eigenvectors of matrices are defined in 4.5, p. 278. Mathernatica provides the possibility of determining eigenvalues and eigenvectors by special commands. So, the command Eigenvalues [m] produces a list of eigenvalues of a square matrix m, Eigenvectors[m] creates a list of the eigenvectors of m. If N[m]is substituted instead of m, then we get the numerical eigenvalues. In general, if the order of the matrix is greater than four ( n > 4), then no algebraic expression can be obtained, since the characteristic polynomial has degree higher than four. In this case, we should ask for numerical values. I In[9l := h = Table[l/(i + j - l),{i, 5}, {j,5}] This generates a five-dimensional so-called Hilbert matrix. 1 1 -1 -} 1 { -1 -1 -1 -1 -}.{1 1 -1 -1 -1 -} 1 {-1 -1 1 1 -} 1 { -1 -1 -1 -1 -}} 1 Out[g]= {{I ’ 2 3’4’5 ’ 2’3’4’5’6 3’4’5’6’7’ 4’5’6’7’8 ’ 5’6’7’8’9 With the command In[lOl := Eigenvalues[h] we get the answer Eigenva1ues::eival: Unable to find all roots of the characteristic polynomial. But with the command I n[ l l ] := Eigenvalues [ N [ h ] ] we get Out [ill = { 1.56705,0.208534,0.0114075,0.000305898,3.28793 ~
20.4.3.2 Maple The Maple library provides the special package l i n a l g . After the command > with (linalg) : (20.66) all the 100 commands and operations of this package are available for the user. For a complete list and description see [20.6]. It is important that matrices and vectors must be generated by the commands matrix and vector while using this package, and not by the general structure array. With matrix(m, n, s), an m x n matrix is generated. If s is missing, then the elements of this matrix are not specified, but they can be determined later by the assignments A[i,j]:= . . .. If s is a function f = f ( i , j ) of the indices, then Maple generates the matrix with these elements. Finally, s can be a list with elements, e.g., vectors. Analogously, the definition of vectors can happen by vector(n, e). The input of a vector is similar to a 1 x n matrix, but a vector is always considered to be a column vector. Table 20.19 gives the most important matrix and vector operations. Addition of vectors and matrices can be performed by the command add(%,u , k , l ) . This procedure adds the vectors or matrices u and v multiplied by the scalars k and 1. The optional arguments k and I can be omitted. The addition is performed only if the corresponding matrices have the same number of rows and columns. Multiplication of matrices can be performed by multiply(u, w) or with the short form (see 20.3.6.4, p. 972) &* as an infix operator.
1. Solution of Systems of Linear Equations To handle systems of linear equations Maple provides special operations contained in the linear algebra package. One of them is the command linsolve(A, c). It handles the system of linear equations in the form d . Z = C (20.67)
20.d Applications of Computer Aloebra Sustems 987
Table 20.19 Matrix operations determines the transpose of A determines the determinant of the square matrix A determines the inverse of the square matrix A determines the adjoint of the square matrix A, Le., A & * adjoint(A) = det(A) mulcol(A, s, expr) multiplies the s-th column of the matrix A by expr mulrow(A, T,expr) multiplies the r-th row by ezpr transpose(A) det(A) inverse(A) adjoint (A)
where A denotes the coefficient matrix and c is the vector on the right-hand side. If there is no solution, then the null-sequence Null is returned. If the system has several linearly independent solutions, they will be given in parametric form. The operation nullspace(A) finds a basis in the null space of matrix A , which is different from zero if the matrix is singular. It is also possible to solve a system of linear equations with the operators of multiplication and evaluation of the inverse. IA: Consider the example E from 4.4.2.1,2.,p. 274, of the homogeneous system 21 - 22 + 523 - x4 = 0 X2 - 2x3 324 = 0 IC1 321 - 22 + 8x3 24 = 0 z1 + 3x2 - 923 7x4 = 0 whose matrix is singular. This system has non-trivial solutions. In order to solve it, first we define matrix ‘4: > A := matrix([[l,-1,5, -11, [l,1, -2,3], [3, -1,8,1], [1,3, -9,711: (With det(A) we can be convinced that the matrix is singular.) With the command
+ + +
+
the list of two linearly independent vectors is determined. They form a basis in the two-dimensional null space of matrix A. For the general case, operations to apply the Gaussian algorithm are available in Maple. They are enumerated in Table 20.20. If the number of unknowns is equal to the number of equations and the coefficient matrix is non-singular, then the command linsolve is recommended. Table 20.20 Operations of the Gaussian algorithm row to the others, whose j-th column consists of zeros except A,, gausselim(A) generates the Gaussian triangle matrix by row-pivoting; the elements of A have to be rational numnbers gauss jord(A) generates a diagonal matrix according to the Gauss-Jordan method
B: The system from 19.2.1.4,2., p. 892, 10x1 - 322 - 4x3 + 2x4 = 14 -321 4- E x 2 + 5x3 - 2 4 = 22 -4x1 + 5 x 2 16x3 + 5x4 = 17 221 + 322 - 4x3 - 1224 = -20
+
988
20. Computer Algebra Systems
is to be solved. Now, the input is > A := rnatrix([[lO. -3, -4,2], [-3,26,5, -11, [-4,5,16,5], [ 2 , 3 ,-4, -1211):
> v := vector([l4, 22,17, -201): With l i n s o l v e we get
> l i n s o l v e ( A , v ) ; ;-+
[z 1 -f 2l
The Gaussian algorithm results in
>F
:= gaussjord(augment(A, u ) ) ;; --t
C: The inhomogeneous system of example
then by gaussjord the matrix F is transformed into an upper triangular form:
n _
1 0 0 0 ~ 2 0100 1 F := 1 00102 -00 0 1 2 -
10 100 7 1 > F1 := g a u s s j o r d ( F ) ; ; -i F1 := --7 2 O O L 7 The solution can be read from F1. 000 0
20.4 Applications of Computer A1,qebra Systems 989
With a l l v a l u e s , a sequence of the approximated values can be produced.
20.4.4 Differential and Integral Calculus 20.4.4.1 Mat hematica The notation of the derivative as a functional operator was introduced in 20.2.8, p. 962. Mathernatica provides several possibilities to apply the operations of analysis, e.g., determination of the differential quotient of arbitrarily high order, of partial derivatives, of the complete differential, determination of indefinite and definite integrals, series expansion of functions, and also solutions of differential equations. 1. Calculation of Differential Quotients 1. Differentiation Operator The differentiation operator (see Section 20.2.8, p. 962) has the name Derivative. Its complete form is (20.68) Derivative[nl,nz,.. .] The arguments say how many times the function is to be differentiated with respect to the current variables. In this sense. it is an operator of partial differentiation. Mathernatica tries to represent the result as a pure function. 2. Differentiation of Functions The differentiation of a given function can be performed in a simplified manner with the operator D. With D[f[x],x])the derivative of the function f(x) will be determined. D belongs to a group of differential operations, which are enumerated in Table 20.21. Tabelle 20.21 Operations of differentiation
I>.
D[f[XI, {x, yields the n-th derivative of function f(x) with respect to x D[f) {xl,nl}, {x2,nZ),...] multiplederivatives,n,-thderivativewithrespect tox, (z = 1 , 2 , . . . ) the complete differential of the function f D t [f1 df of the function f the complete derivative Wfl
XI
Dtlf, x1,zzr . .
dx
the complete derivative of a function of several variables
For both examples in 6.1.2.2, p. 380, we get IA : I n f f I : = D[Sqrt[z”3 Exp[4x] Sin[x]],x] E4’x2 (xCos[x] +3Sin[x] +4xSin[x]) outl-11 = 2 Sqrt[E4zx3 Sin[x]]
IB : I n f 2 l : = D [ ( 2 x t 1 ) ” ( 3 z ) , x ] + O u t f 2 1 = 6 x ( 1 t 2 x ) - ’ t 3 ’ + 3 ( 1 + 2 z ) 3 z L 0 g [ l t 2 z ] The command D t results in the complete derivative or complete differential. IC : Inf31 := Dt[x”3 + y”3] t Outf31 = 3xzDt[z] 3y2Dt[y]
+
ID : Inf41:=Dt[xA3+y”3,x] + Outf41=3x2+3y2Dt[y.x] In this last example, Mathernatica supposes y to be a function of x. which is not known, so it writes the second part of the derivative in a symbolic way. If Mathernatica finds a symbolic function while calculating a derivative, it leaves it in this general form, and expresses its derivative by f’. IE : Inf5I:=D[xf[x]”3,x] + Outf51=f[z]3+3xf[x]zf’[x] Mathernatica knows the rules for differentiation of products and quotients, it knows the chain rule, and it can apply these formally: IF : Inf61 := D[f[u[z]],z] Outf61 = f’[u[x]]u‘[z]
I
990
20. Computer Algebra Svstems
u'[z] - u[x] v'[z] IG : In[7] := D[u[x]/v[x],x] Out 171 = 4x1 v[xI'
2. Indefinite Integrals
I
LVith the command Integrate[!, $1, Mathernatica tries to determine the indefinite integral f(x) dx. If Mathernatica knows the integral, it gives it without the integration constant. Mathernatica supposes that every expression not containing the integration variable does not depend on it. In general, Mathernatica finds an indefinite integral, if there exists one which can be expressed in closed form by elementary functions, such as rational functions, exponential and logarithmic functions, trigonometric and their inverse functions, etc. If Mathernatica cannot find the integral, then it returns the original input. Mathernatica knows some special functions which are defined by non-elementary integrals, such as the elliptic functions. and some others. To demonstrate the possibilities of Mathernatica, some examples will be shown, which are discussed in 8.1, p. 425ff. 1. Integration of Rational Functions (see 8.1.3.3, p. 430ff.) IA : In[fl : = Integrate[(2x + 3)/(x"3 + x"2 - 2s),x] 5 Log[-l + 21 3 LOg[z] LOg[2 X] Out[fl = 2 6
+
IB : In[2] : = Integrate[(xA3+ l)/(x(x - 1 ) " 3 ) , ~ ] Uut[21 =
-
(-1
1 + 2 Log[-1 + x] - Log[x] + x)-2 - -1+x
(20.69)
2. Integration of Trigonometric Functions (see 8.1.5, p. 436ff.) IA: The example A in 8.1.5.2, p. 437, with the integral
1
sin' x cos5 x dx =
s
sin' x (1 - sin' x)' cos x dx =
1
t2(1 - t 2 ) 2dt with t = sin x
is calculated: 1n[31 : = 1ntegrate[~in[x]"2~os[x]~5,x] Sin[7z] 5 ~ i n [ x ] ~ i n [ J x ] 3 ~ i n [ 5 x -] Out[3] = -- -320 448 64 192 IB: The example B in 8 . 1 5 2 , p. 437, with the integral sin x dt dx = with t = cosx ~
1
1
is calculated: 1n[41 := ~ntegrate[Sin[z]/Sqrt[Cos[x]],x] -+Out 41 = -2Sqrt[Cos[x]] Remark: In the case of non-elementary integrals Mathernatica performs only a transformation. I1n[5] := Integrate[x"x, x] -i Out = 1ntegrate[x2,z]
[a
3. Definite Integrals With the command Integrate[!, {z,x,,xe}], Mathernatica can evaluate the definite integral of the function f ( z ) with a lower limit x, and upper limit 2 , .
IA : In[il := Integrate[Exp[-x'], (2, - I n f i n i t y , Infinity}] General: : i n t i n i t : Loading i n t e g r a t i o n packages.
+ Out [f1
= Sqrt[Pi]
After Mathernatica has loaded a special package for integration, it gives the value 7r (see Table 21.6, p. 1050, Nr. 9 for a = 1).
20.4 Applications of Computer Al.qebra Sgsterns 991
IB: It can happen, with earlier releases of Mathernatica, that if the input is 2 1n[2] := ~ n t e g r a t e [ l / z " Z {z, , -1, I}], we get 0ut[2] = -3 which is actually false since this integral is an improper integral. Mathernatica takes the primitive 1 1 function (antiderivative) of -, namely --, then it substitutes the limits and subtracts these values 2
22
The fact that the integrand is not finite is not considered. In version 2.2 of Mathernatica, this mistake is avoided. After a slightly longer working time, Mathernatica says that the integral cannot be calculated since it is improper. In the calculation of definite integrals, we should be careful. If the properties of the integrand are not known, it is recommended before integration to ask for a graphical representation of the function in the considered domain. 4. Multiple Integrals Definite double integrals can be called by the command (20.70) Integrate[f[z, YL { zsxor {Y. yar yell The evaluation is performed from right to left, so, first the integration is evaluated with respect to y. The limits ya and ye can be functions of z,which are substituted into the primitive function. Then the integral is evaluated with respect to 2. IFor the integral A, which calculates the area between a parabola and a line intersecting it twice, in 8.4.1.2, p. 470. we get 32 1n[3I:= Integrate[z y"2, {z,O, 2}, {y, x"2,22}] t Out DI =. :
GI,
.Also in this case, it is important to be careful with the discontinuities of the integrand. 5 . Solution of Differential Equations Mathernatica can handle ordinary differential equations symbolically if the solution can be given in closed form. In this case, Mathernatica gives the solution in general. The commands discussed here are listed in Table 20.22. The solutions (see 9.1, p. 485) are represented as general solutions with the arbitrary constants C[z]. Initial values and boundary conditions can be introduced in the part of the list which contains the equation or equations. In this case we get a special solution. Table 20.22 Commands to solve differential equations DSolve[dgl, y[z],z]
solves the differential equation for y[z] (if it is possible); y[z] may be given in implicit form gives the solution of the differential equation in the form of a DSolve[dgl, y >z] pure function DSolve[{dgll, dg12,.. .}, y, z] solves a system of ordinary differential equations As examples, two differential equations are solved here from 9.1.1.2, p. 487. IA: The,solution of the differential equation y'(z) - y(z) t a n z = cosz is to be determined. In[l] := DSolve[y'[z] - y[z] Tan[z] == Cos[z], y>z] Mathernatica solves this equation, and gives the solution as a pure function with the integrations constant C[1] sec[slot[l]] (4C(1) +Sin[2Slot[l]] + 2Slot[l]) Out[ll = {{y -+ Function[ Ill 4 The symbol Slot is for #, it is its FullForm. If it is required to get the solution for 2/[z],then Mathernatica gives Sec[z] ( 2 2 + 4 C ( 1 ) +Sin[Pz]) 1n[2] := y[z]/. 561 + Out [21 = { l 4
992
20. Computer Alqebra Swtems
We also could make the substitution for other quantities, e.g., for y’[x] or y[l]. The advantage of using pure functions is obvious here. IB: The solution of the differential equation y’(x)x(x - y(x)) + yz(z) = 0 (see 9.1.1.2,2., p. 487) is to be determined. In[3] := DSolve[y’[z]x(z - ~ [ x ]+) y[x]”2 == O,y[x],x] Mathernatica returns the input without any comment. The reason for doing so is that Mathernatica cannot solve this differential equation for y. The solution of this differential equation was given in implicit form (see 9.1.1.2, 2., p. 487). In such cases, the solutions can be found by numerical solutions (see 19.8.4.1, 5., p. 945). Also in the case of symbolic solutions of differential equations, like in the evaluation of indefinite integrals, the efficiency of Mathernatica should not be overestimated. If the result cannot be expressed as an algebraic expression of elementary functions, the only way is to find a numerical solution. Mathernatica 2.2 contains a special package which can solve partial differential equations with first partial derivatives of one single function.
20.4.4.2 Maple Maple provides many possib es to handle the problems of analysis. Besides differentiation of functions, it can also evaluate indefinite and definite integrals, multiple integrals, and expansion offunctions into power series. The basic elements of the theory of analytic functions are also provided. Several differential equations can be solved, as well. 1. Differentiation The operator of differentiation D was introduced in 20.3.7, p. 972ff. Its application with different optional arguments allows us to differentiate functions in operator representation. Its complete syntax is D[il(f) (20.71a) Here, the partial derivative of the (operator) function f is determined with respect to the i-th variable. The result is a function in operator representation. D[i, j](f)is equivalent to (20.71b) D[il(D[Jl(f))and D[l(f) = f. The argument f is here a function expression represented in operator form. The argument can contain self-defined functions besides the built-in functions, functions defined by the arrow operator, etc. ILet > f := (2, y) -> exp(x * y) sin(z y): . Then one sets
+
> > > >
D [I(f);-+
f
+
+
D IlI(f); + (z, Y) -> Y exp(x Y) cos(x + Y) D [2](f); -+ (z, Y)-> x exp(x Y) + cos(z Y) D [I,2](f); + (2, Y) -> exp(x 9 ) x Y exp(x y) - s i n ( x + Y)
+
+
Besides the differentiation operator there exists the operator d i f f with the syntax (20.72a) diff(ezpr, z l , x 2 , . . . ,zn) Here ezpr is an algebraic expression of the variables x l , x 2 , . . . . The result is the partial derivative of the expression with respect to the variables 21,. . . ,xn. If n > 1 holds, then the same result can be obtained by repeated application of the operation d i f f : (20.7213) d i f f (a, z1,22) = d i f f (diff(a, x1),22) Multiple differentiation with respect to the same argument can be got by the sequence operator $. I > d i f f ( s i n ( z ) , z $ j ) ; (E d i f f ( s i n ( x ) , x , z , x , x , x ) ) t COS(Z)
20.4 Applications of Computer Alyebra Sustems 993
a
If the function f ( z ) is not defined, then the operation d i f f gives the derivatives symbolically: -f(z). a d
ax
IA > d i f f ( f ( z ) / g ( z ) ) ; t IB
--f(.)
dz-
f(.)&4
dx)
9(x)2
a > d i f f ( z t f(z),z);t f(z) + z -f(x)
ax
2. Indefinite Integrals If the primitive function F ( z ) can be represented as an expression of elementary functions for a given function f (z), then Maple can usually find it with i n t ( f , z). The integration constant is not displayed. If the primitive function does not exist or is not known in closed form, then Maple returns the integrand. Instead of the operator i n t , the long form i n t e g r a t e can be used. 1. Integration of Rational Functions 3 1 5 IA : > i n t ( ( 2 * z + 3 ) / ( a A 3 + z " 2 - 2 * z ) , z ) t ; --ln(z) - - 1 n ( a : + 2 ) + - l n ( z - l ) 2 6 3 IB : > int((z"3 + l ) / ( z * (z - 1)'3), z); t - ln(z) -
4 -I + 2 ln(z - 1) (z-1) z-1
2. Integral of Radicands Maple can determine the indefinite integrals given in the table of indefinite integrals (see 21.5, p. 1017ff.). IIf the input is > X := sqrt(z"2 - a"2): then we have the output:
> i n t ( X , z ) ; --+ i > i n t ( X / a , z);t
z
v - 9~l n ( ~z +1- ~
vGFZ
2 - aarcsec
(20.73)
(a)
1 3 3. Integrals with Trigonometric Functions -a3x3 cos(az) 3a2x2sin(az) - 6 sin(az) 6az cos(az) IA : > int(z"3 * s i n ( a * 2))a ) ; a4 1 cos(az) 1 ln(csc(az) - cot(az) IB : > i n t ( l / ( s i n ( a * 2))"3,z); --+ +7 a 2 a s i n ( a ~ ) ~2
> i n t ( X * z,z);t -(zz -
+
+
Remark: In the case of non-elementary integrals, only a formal transformation is performed. I > i n t (z^z, z); the output is /z" dz, since this integral cannot be given in an elementary way.
3. Definite Integrals
To determine a definite integral the command i n t is used with the second argument z = a .. b. Here, z is the integration variable. and a .. bare the lower and upper limits of the integration interval. 1 1 26 (20.74) IA : > int(z"2,z = a h ) ; --+ -b3 - -a3 > int(z"2,z = 1..3); -+ 3 3 3 IB : > int(exp(-zA2),z = -infinity..infinity);+ fi W C : > int(1/zA4,z = -1..1); t co If Maple cannot solve the integral symbolically, it returns the input. In this case we can try a numerical integration (see 19.3, p. 895ff.) with the commands evalf ( i n t ( e x p r , v a r = a . . b ) ) or evalf ( I n t (expr,var=a. .b ) ) .
20. Computer Algebra S,ystems
994
4. Multiple Integrals Maple can even calculate multiple integrals ifit can be done explicitly. The operation i n t can be nested. IA : > int(int(z"2 t y"2 * exp(x t y),z),y);
IB :
> i n t ( i n t ( z * y"2,y
= 2^2.,2 * x ) , z = 0-2);
-+
32
T 3
5 . Solution of Differential Equations Maple provides the possibility of solving ordinary differential equations and differential equation systems with the different forms of the operator dsolve. The solution can be a general solution or a particular solution with given initial conditions. The solution is given either explicitly or implicitly as a function of a parameter. The operator dsolve accepts as a last argument the options shown in Table 20.23 Table 20.23 Options of operation dsolve explicit laplace series numeric
-
gives the solution in explicit form if it is possible applies the Laplace transformation for the solution the power series expansion is used for the solution the result is a procedure to calculate numerical values of the solution
1. General Solution > dsolve(diff(y(z).x)- y(z) * tan(s) = cos(z),y(s)); Y(X)
=
+
1 cos(x) sin(x) x t 2 -C1 cos(2)
5
(20.75a) (20.75b)
Maple gives the general solution in explicit form with a constant. In the following example the solution is given in implicit form, since y(z) cannot be expressed from the defining equation. The additional option e x p l i c i t has no effect here. (20.76a) > dsolve(diff (y(z), 2) * (x - y(z)) t y(r)^2,~ ( 2 ) ) ;
(20.76b) 2. Solution with Initial Values Consider the differential equation y' - e5 - y2 = 0 with y(0) = 0. Here we give the option s e r i e s . If we use this option, the initial values should belong to x = 0. The same is valid for the option laplace. (20.77a) > dsolve({diff(y(z),x)- exp(z) - y(x)"2, y(0) = 0}, y(x), s e r i e s ) ; y(2) = x
1 7 3 1 + -2 +1-z3 t -x4 + -x5 + O ( 2 ) 2 2 24 120
(20.7713)
The equation and the initialvalues have to be in curly braces. The same is true for systems of differential equations.
20.5 Graphics in Computer Algebra Systems By providing routines for graphical representation of mathematical relations such as the graphs of functions, space curves, and surfaces in three-dimensional space, modern computer algebra systems provide extensive possibilities for combining and manipulating formulas, especially in analysis, vector calculus,
20.5 Graphics in Computer Al.qebra Sgstems 995
and differential geometry, and they provide immeasurable help in engineering designing. Graphics is a special strength of Mathernatica.
20.5.1 Graphics with Mathernatica 20.5.1.1 Basic Elements of Graphics Mathernatica builds graphical objects from built-in (so-called) graphics primitives. These are objects such as points (Point), lines (Line) and polygons (Polygon) and properties of these objects such as thickness and colour. Mathernatica has several options to specify the environment for graphics and how the graphical objects should be represented. With the command Graphics[list], where list is a list of graphics primitives, Mathernatica is called to generate a graphic from the listed objects. The object list can follow a list of options about the appearance of the representation. With the following input I n [ l l := g = Graphics[{Line[{{O,O}>{ 5 , 5 } , {l0,3}}],Circle[{5,5),4], (20.78a) Text [FontForm[“Example”,“Helvetica-Bold” ,251 { 5, S}]}, AspectRat i o -> Automatic] (20.78b) a graphic is built from the following elements: a) Broken line of two line segments starting at the point (0,O) through the point ( 5 , 5 ) to the point (10,3). b) Circle with the center at (s35 ) and radius 4. c) Text with the content “Example”, written in Helvetica-Bold font (the text appears centered with respect to the reference point (5,6)). ~
Figure 20.1
With the call Show[g], Mathernatica displays the picture of the generated graphic (Fig.20.1). Certain options should be previously specified. Here the option AspectRatio is set to Automatic. By default Mathernatica makes the ratio of the height to the width of the graph 1 : GoldenRatio. It corresponds to a relation between the extension in the z direction to the one in the y direction of 1 : 1/1.618 = 1 : 0.618. With this option the circle would be deformed into an ellipse. The value of the option Automatic ensures that the representation is not deformed.
20.5.1.2 Graphics Primitives Mathernatica provides the two-dimensional graphic objects enumerated in Table 20.24.
20.5.1.3 Syntax of Graphical Representation 1. Building Graphic Objects If a graphic object is to be built from primitives, then first a list of the corresponding objects with their global definition should be given in the form {objectl, object?. . . .}, (20.79a) where the objects themselves can be lists of graphic objects
996
20. Computer Al.qebra Svstems
Table 20.24 Two-dimensional graphic objects Point[{z, YII Line[{{zl, yl}, { x 2 ,Y ~ } .~.}]. Rectangle[{zl,, ylu}, {z,, yvo}] Polygon[{{zl, yl}, { x z ,yz}, . . .}] Circle[{z, y}, TI Circle[{z, Y }T ,~{ ~ I , Q z ) ~ CircleKz, Y I I { a , bll Circle[{x, y}) { a , b}, {cq, az}] Disk[{z, y}, T ] , Disk[{x, y}, { a , b}] Textltezt, {x,y)]
point at position x,y broken line through the given points shaded rectangle with the given coordinates left-down, right-up shaded polygon with the given vertices circle with radius T around the center x,y circular arc with the given angles as limits ellipse with half-axes a and b elliptic arc shaded circle or ellipse writes text centered to the point 2 ,y
Besides these objects Mathernatica provides further primitives to control the appearance of the representation, the graphics commands. They specify how graphic objects should be represented. The commands are listed in Table 20.25. Table 20.25 Graphics commands point is drawn with radius a as a fraction of the total picture denotes the absolute radius b of the point (measured in pt (0.3515 mm)) Thickness[a] draws lines with relative thickness a Abs o l u t eThi cknes s [ b] draws lines with absolute thickness b (also in pt) Dashing[{al, az, a3,. . .}] draws a line as a sequence of stripes with the given length (in relative measure) AbsoluteDashing[{bl, b 2 , , ,.}] the same as the previous one but in absolute measure GrayLevellp] specifies the level of shade ( p = 0 is for black, p = 1 is for white) ~ointSize[a] AbsolutePointSize [b]
There is a wide scale of colors to choose from but their definitions are not discussed here.
20.5.1.4 Graphical Options Mathernatica provides several graphical options which have an influence on the appearance of the entire picture. Table 20.26 gives a selection of the most important commands. For a detailed explanation, see [20.5]. Table 20.26 Some graphical options AspectRatio -> w Axes -> True Axes -> False Axes -> {True, False} Frame -> True GridLines -> Automatic AxesLabel --> { Z S y n b o i , Ysymboi} Ticks -> Automatic
sets the ratio w of height and width. Automatic determines w from the absolute coordinates; the default setting is w = 1 : GoldenRatio draws coordinate axes does not draw coordinate axes shows only the z-axis shows frames shows grid lines denotes axes with the given symbols denotes scaling marks automatically; with None they can be suppressed scaling marks are placed at the given nodes
Let object 1 be, e.g., m[I] := 01 = {Circ~e[{5,5},{5,3}],Line[{{0,5},{10,5}}]} and corresponding to it 1nE1 := 02 = {Circle[{5,5}, 31)
80.5 Graphics an Computer Alqebra Sustems 997
Ifagraphic object, e.g., 02, is to be provided withcertain graphicalcommands. then it should be written into one list with the corresponding command In[31 := 03 = {Thickness[O.Ol],o2}. This command is valid for all objects in the corresponding braces, and also for nested ones, but not for the objects outside of the braces of the list. From the generated objects two different graphic lists are defined: In&] := g l = Graphics[{ol,o2}] ; 92 = Graphics[{ol,o3}];, which differs only in the second object by the thickness of the circle. With the call (20.79b) Show[gl] and Shou[g2,Axes -> True] we get the picture represented in Fig. 20.2. In the call of the picture in Fig. 20.2b,the option Axes -> True was activated. This results in the representation of the axes with marks on them chosen by Mathernatica and with the corresponding scaling.
Figure 20.2
1. Graphical Representation of Functions Mathernatica has special commands for the graphical representation of functions. With (20.80) Plot[f[zl. { z , ~ , , n , z m m l l the function f(z)is represented graphically in the domain between z = z,,, and z = z , , , . Mathematica produces a function table by internal algorithms and reproduces the graphic following from this table by graphics primitives. IIf the function sin 22 is to be graphically represented in the domain between -27r and 2 7 ~then , the input is In[5] := Plot[Sin[2z], {zI -2P1, ZPi}]
Figure 20.3
Mathernatica produces the curve shown in Fig. 20.3. It is obvious that Mathernatica uses certain default graphical options in the representation. So, the axes are automatically drawn. they are scaled and denoted by the corresponding z and y values. In this example, the influence of the default AspectRatio can be seen. The ratio of the total width to the total height is 1 : 0.618. With the command InputForm[%] the whole representation of the graphic objects can be shown. For the previous example we get:
Graphics[{{Line[{{-6.283185307179587,4.90059381963448* loA- 16). List of points from the function table calculated by Mathernatica {6.283185307179S37. -(4.90059381963448 * 10" - IS)}}]}},
I
998
20. Computer Alqebra Systems
{Plot Range-> Automatic AspectRatio-> GoldenRatio" (- l), ~
DisplayFunction:> $DisplayFunction,ColorOutput-> Automatic, Axes -> Automatic Axesorigin-> Automatic,PlotLabel-> None, ~
AxesLabel-> None,Ticks-> Automatic,GridLines-> None,Prolog-> { }
~
Epilog-> { } AxesStyle-> Automatic,Background-> Automatic, ~
DefaultColor-> Automatic,DefaultFont :> $Def aultFont, RotateLabel-> True,Frame-> False,Framestyle-> Automatic, FrameTicks-> Automatic,FrameLabel-> None,PlotRegion-> Automatic}] Consequently, the graphic object consists of two sublists. The first one contains the graphics primitive Line,with which the internal algorithm connects the calculated points ofthe curve by lines. The second sublist contains the options needed by the given graphic. These are the default options. If the picture is to be altered at certain positions: then the new settings in the Plot command must be set after the main input. With (20.81) In[6] := Plot[Sin[21],{x,-2Pi,2Pi}, AspectRatio-> 11 the representation would be done with equal I and y absolute scaling. It is possible to give several options at the same time after each other.With the input
Plot[{fl
"f
[I] >
' '
.I>{I>Iminr xrnaz}l
(20.82)
several functions are shown in the same graphic. With the command Show[plot,options] (20.83) an earlier picture can be renewed with other options.With (20.84) Show[GraphicsArray[list]l, (with lzst as lists of graphic objects) pictures can be placed next to each other, under each other, or they can be arranged in matrix form.
20.5.1.5 Two-Dimensional Curves .A series of curves from the chapter on functions and their representations (see 2.1,p. 47ff.) is shown as examples.
1. Exponential Functions A4family of curves with several exponential functions (see 2.6.1, p. 71) is generated by Mathernatica (Fig.20.4a) with the following input: I ~ [ I ] := f[z-] := 2"x; := 1O"x;
&:_I
In[2] := h[z-] := (1/2)"x;j[x-] := (l/E)"z;k[x-] := (1/10)"1;
These are the definitions of the considered functions. The function ez need not be defined. since it is built into Mathematica. In the second step the following graphics are generated: In[3] := pl = Plot[{f [I], h[x]}, {x,-4,4}, Plotstyle-> Dashing[{O.Ol,0.02}]] In&] :=p2 = Plot[{Exp[z],j[x]}, {I, -4,4}] In[5] := p3 = Plot[{g[x],k[x]}, {I,-4,4}, Plotstyle-> Dashing[{0.005,0.02,0.01,0.02}]]
The whole picture (Fig. 20.4a) can be obtained by: In[6] := Show[{pl,p2,p3},PlotRange-> {0,18}, AspectRatio-> 1.21 The question of how to write text on the curves is not discussed here. This is possible with the graphics primitive Text.
20.5 Graphics in Computer Alqebra Sustems 999
-11 1
-6
6
Figure 20.4
2. Function y = 2
.
+ Arcoth z
.
Considering the properties of the function .4rcothx discussed in 2.10, p. 91, the function y = z + Arcoth x can be graphically represented in the following way: In[ll := f l = Plot[x t ArcCoth[x], {x,1.000000000005,7)] In[2l := f 2 = Plot[x ArcCoth[x], {x,-7: -1.00000000000S}] In[] := 3Show[{fl, f2}, PlotRange-> {-lo! lo}, AspectRatio-> 1.2,Ticks->
+
{{{-6 -6}> (-1, -11, {1,1}. {6,6}}, {{2.5,2.5). {10,10)}) The high precision of the x values in the close neighborhood of 1 and -1 was chosen to get sufficiently large function values for the required domain of y. The result is shown in Fig. 20.4b. 3. Bessel Functions With the calls In[l] := bjO = Plot[{BesselJ[O, z],BesselJ[2,2],BesselJ[4,2]}) { z >0: lo}, PlotLabel-> “ J ( n z, ) n = 0,2,4”] In[2l := b j l = Plot[{BesselJ[l, z ] )BesselJ[3, z], BesselJ[5, z]}, { z , 0,lo}, (20.85) PlotLabel-> “ J ( n z, ) n = 1,3,5”] the graphics of the Bessel function Jn(z) for n = 0,2,4 and n = 1 , 3 , 5 are generated, which are then
represented by the call In[3] := Show[CraphicsArray[{bjO,bjl}]] next to each other in Fig. 20.5.
0.6
J(n,z)n=1,3,5
-0.2
b) Figure 20.5
20.5.1.6 Parametric Representation of Curves Mathernatica has a special graphics command, with which curves given in parametric form can be graphically represented. This command is: ParametricPlot[{f,(t), fy(t)}, {t,tl,tz}l. (20.86)
1000 20. Computer A1,qebra Systems
It provides the possibility of showing several curves in one graphic. A list of several curves must be given in the command. With the option AspectRatio-> Automatic, Mathernatica shows the curves in their natural forms. The parametric curves in Fig. 20.6 are the Archimedean spiral (see 2.14.1,p. 103) and the logarithmic spiral (see 2.14.3,p. 104). They are represented with the input I n f f I := ParametricPlot [{tCos[t],t Sin[t]},{ t ,0,3Pi}, AspectRatio-> Automatic] and I n f 2 1 := ParametricPlot[{Exp[O.lt]Cos[t],Exp[O.lt]Sin[t]},{ t ,0,3Pi}, AspectRatio-> Automatic] With I n f 3 1 := ParametricPlot[{t - 2 Sin[& 1 - 2 Cos[t]}, { t ,-Pi, llPi}, AspectRatio-> 0.31 a trochoid (see 2.13.2,p. 100)is generated (Fig. 20.7).
Figure 20.6
20.5.1.7 Representation of Surfaces and Space Curves Mathematica provides the possibility of representing three-dimensional graphics primitives. Similarly to the two-dimensional case, three-dimensional graphics can be generated by applying different options. The objects can be represented and observed from different viewpoints and from different perspectives. Also the representation of curved surfaces in three-dimensional space, Le., the graphical representation of functions of two variables, is possible. Furthermore it is possible to represent curves in three-dimensional space, e.g., if they are given in parametric form. For a detailed description of three-dimensional graphics primitives see [20.5].The introduction of these representations is similar to the two-dimensional case.
1. Graphical Representation of Surfaces The command Plot3D in its basic form requires the definition of a function of two variables and the domain of these two variables: I n f l := Plot3D[f[z,yl,{z,&,&},{y,ya,ye}1 (20.87) All options have the default setting. I For the function z = z2 + yz, with the input I n f f l := Plot3D[zA2t yA2,{z,- 5 , 5 } , {y, -5, 5},PlotRange-> {0,25}] we get Fig. 20.8a, while Fig. 20.8b is generated by the command In[2] := Plot3D[(1 - Sin[z]) (2 - Cos[2 y]), {z,-2,2}, {y, -2,2}] For the paraboloid. the option PlotRange is given with the required z values, because the solid is cut at z = 25. 2. Options for 3D Graphics The number of options for 3 D graphics is large. In Table 20.27, only a few are enumerated, where options known from 2 D graphics are not included. They can be applied in a similar sense. The option Viewpoint has special importance, by which very different observational perspectives can be chosen.
20.5 Graphics in Computer Al.qebra Svstems 1001
3. Three-Dimensional Objects in Parametric Representation Similarly to 2D graphics, three-dimensional objects given in parametric representation can also be represented. With (20.88) ParametricPlot3D[(f,[t, 4 f&, 4, fz[tl4}>it, ta,t e l , ua,4 1 a parametrically given surface is represented, with (20.89) ParametricPlot3D[{f,[tl, fy[tl,fz[tl),{t, ta,tell a three-dimensional curve is generated parametrically. Table 20.27 Options for 3D graphics default setting is True; it draws a three-dimensional frame around the surface Boxed Hiddensurf ace sets the non-transparency of the surface; default setting is True specifies the point (2, y, z ) in space, from where the surface is observed. DeViewpoint fault values are {1.3, -2.4,2} default setting is True; the surface is shaded; False yields white surfaces Shading {za,ze), {{za,ze}> {ya,ye}, {za, z , } } can be chosen for the values A l l . DePlotRange fault is Automatic
W The objects in Fig. 20.9a and Fig. 20.9b are represented with the commands I n f 3 1 := ParametricPlot3D[{Cos[t] Cos[u],Sin[t] Cos[u],Sin[%]},{ t ,0,2Pi},
1002
20. Computer Alqebra Systems
{ u , -Pi/2, Pi/2}]
In&]
:= ParametricPlot3D[{Cos[t],Sin[t],t/4},
(20.90)
{ t ,0,20}]
Mathernatica provides further commands by which density, and contour diagrams, bar charts and sector diagrams, and also a combination of different types of diagrams, can be generated. H The representation of the Lorenz attractor (see 17.2.4.3, p. 825) can be generated by Mathernatica.
20.5.2 Graphics with Maple 20.5.2.1 Two-Dimensional Graphics Maple can graphically represent functions with the command p l o t with several different options. The input functions can be explicit functions of one variable. functions given in parametric form and lists of two-dimensional points. Maple prepares a table of values from the input function by internal algorithms, and its points are connected by a spline method to get a smooth curve. There are several options by which the shape of the graphic can be influenced. The graphic itself is represented in a special environment, and it can be connected to the work document by the corresponding system commands, or it can be sent to the printer or plotter. The data can be saved in various formats, for example as a Postscript file.
1. Syntax of Two-Dimensional Graphics The two-dimensional plot command has the basic structure (20.91) plot(funct, hb, vb, options): The first argument f u n d can have the following meanings: a) a real function of one independent variable, e.g., f(x); b) an operator of a function, generated by, e.g., the arrow symbol; c) the parametric representation of a real function in the form of a list [ u ( t ) , v ( t ) , = t a..b], where t = a..b is the domain of the parameter; d) several functions enclosed in curly braces, which should be represented together; e ) a list of numbers, which are considered to be the coordinates (2,y) of the points to be represented. For completeness it should be mentioned that the first argument in this command can also be functions generated by procedures. The second argument hb is the domain of the independent variable; it has the form x = a..b. If no argument is given, then Maple automatically takes the domain -10..10. It is possible to assign to one or to both limits the values -co and/or co. In this case, Maple chooses a representation of the x-axis with arctan. The third argument vb directs the domain of the dependent variable (vertical). It should be given in the form y = a..b. If it is omitted, Maple takes the values determined from the equation of the function for the domain of the independent variable. It can cause problems if in this domain there is, e.g., a pole. Then, if it is necessary, this domain should be limited. One or several options can follow as further arguments. They are represented in Table 20.28. The representation of several functions by Maple in one graphic is made in general by different colors or by different line structure. Maple V/2 running under Windows provides the possibility of making changes directly on the graphic according to corresponding menus, e.g., the ratio of the horizontal and vertical measure, the frame of the picture, etc. 2. Examples for Two-Dimensional Graphics The followinggraphics are generated by Maple, then vectorized by Coreltrace and finished by Coreldraw!. This was necessary because the direct conversion of a Maple graphic in EPS data results in very thin lines and so unattractive pictures.
20.5 Graphics in Computer Al.qebra Systems 1003
Table 20.28 Options for Plot command yields the representation of a parametric input in polar coordinates (the first variable is the radius, the second one is the angle) sets the minimal number of the generated points (default 49) numpoints = n resolution = m sets the horizontal resolution of the representation in pixels (default m = 200) xtickmarks = p sets the number of scaling marks on the z-axis s t y l e = SPLINE generates the connection with cubic spline interpolation (default) generates linear interpolation s t y l e = LINE shows only the points s t y l e =POINT places a title for the graphic, T must be a string title =T coords = polar
Figure 20.10 1. Exponential a n d Hyperbolic Functions With the construction > plot({2"z, 1O^z,(1/2)"x, (1/1O)"x, exp(x), l/exp(z)}, x = -4..4, y = 0.20,
> xtickmarks = 2,ytickmarks = 2);
(20.92a) (20.92b)
we get the exponential functions represented in Fig. 20.10a. Similarly, the command (20.92c) > plot({sinh(x), cosh(z), tanh(z), coth(x)}, z = -2.1..2.1, y = -2.5-2.5): yields the common representation of the four hyperbolic functions (see 2.9.1, p. 87) in Fig. 20.10b. A4dditionalstructures, such as arrow heads on axes, captions, etc., are added subsequently with the help of graphic programs. 2. Bessel Functions With the calls (20.93a) > plot({BesselJ(O,z).BesselJ(Z, z),BesselJ(4, z ) } ,z = 0..10);
> plot({BesselJ(l,z),BesselJ(3,z),BesselJ(5,z ) } ,z = 0..10);
(20.93b)
we get the first three Bessel functions J(n,z ) with even n (Fig. 20.11a) and with odd n (Fig. 20.11b).
Figure 20.11 The other special functions built into Maple can be represented in a similar way.
1004
20. Computer Aloebra Susterns
3. Parametric Representation With the call
(20.94a) > plot([t * cos(t),t * sin(t),t = 0..3* Pi]); we get the curve represented in Fig. 20.12a. For the following two commands Maple gives a loop function similar to a trochoid Fig. 20.12b (compare: Curtate trochoid in 2.13.2, p. 100) and the hyperbolic spiral Fig. 2 0 . 1 2 ~(see 2.14.2, p. 104). (20.94b) > p l o t @ - sin(2 * t ) ,1 - cos(2 * t ) ,t = -2 * Pa.2 *Pi]);
> plot([l/t, t , t = 0..2 * PZ],rc= -.5..2,coords = polar);
(20.94~)
Because of the introduction the option coords, Maple interprets the parametric representation as polar coordinates at the execution of the command.
Figure 20.12
3. Special Package p l o t s The special package p l o t s with additional graphical operations can be found in the Maple library. In the two-dimensional case, the commands conformal and polarplot have special interest. With (20.95) polarplot ( L ,options) curves given in polar coordinates can be drawn. L denotes a set (enclosed in curly braces) of several functions r(p). Maple interprets the variable 'p as an angle and it denotes the curve in the domain between -7r 5 9 5 ?T if no other domain is prescribed. The command (20.96) conformal(F,rl, r2, optzons) displays the mapping of the standard grid lines of the plane to a set of grid curves with the help of a function F of complex variables. The new grid lines also intersect each other orthogonally. The domain r l determines the original grid lines. Default values are 0..1 + (-l)l/z.The domain r2 gives the size of the window in which the range lies. The default range here is calculated to completely enclose the resulting conformal lines.
20.5.2.2 Three-Dimensional Graphics Maple provides the command plot3d to represent functions of two independent variables as surfaces in space or to represent space curves. Maple represents the objects generated by this command analogously to two-dimensional ones in one window. The number of options for representation is usually larger, especially by the additional options about the viewpoint (of observation). 1. Syntax of plot3d Commands This command can be used in four different forms: a) plot3d(funct, z = a..b, y = c d ) . In this form, funct is a function of two independent variables, whose domain is given by rc = a..b and y = c..d. The result is a space surface. b) plot3d(f, a..b, c..d). Here f is an operator or a procedure with two arguments, e.g., generated by the arrow operator, the domains are associated to these variables. c) plot3d([u(s,t ) .u ( s , t ) )w(s, t ) ] ,s = a..b, t = c..d). The three functions u, v, 2u depending on the two parameters s and t define a parametric representation of a space surface, restricted to the domain of
the parameters. d) plot3d([f, g, h], a..b, c..d). This is the equivalent form of the parametric representation, where f, g, h must be operators or procedures of two arguments. All further arguments of the operator plot3d are interpreted by Maple as options. The possible options are represented in Table 20.29. They should be used in the form option = value.
Table 20.29 Options of the command plot3d
numpoints = n: sets the minimal number of generated points (default n = 625)
grid = [m, n]: specifies that an m x n grid of equally spaced points is sampled (default 25 x 25)
labels = [x, y, z]: indicates the labels used along the axes (strings are required)
style = s: s is one of POINT, HIDDEN, PATCH, WIREFRAME; it defines how the surface is represented
axes = f: f can have the values BOXED, NORMAL, FRAME or NONE; it specifies the representation of the axes
coords = c: specifies the required coordinate system; values can be cartesian, spherical, cylindrical (default is cartesian)
projection = p: p takes values between 0 and 1 and defines the observational perspective (default value is 1, orthogonal projection)
orientation = [theta, phi]: specifies the angles, in spherical coordinates, of the point in space from which the surface is observed
view = z1..z2: gives the domain of the z values for which the surface should be represented (default is the total surface)
In general, almost all options can also be reached and appropriately set in the corresponding menus on the screen. In this way, the picture can subsequently be improved.
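As a hedged illustration of the option mechanism, a surface plot with several options set explicitly (the function and the option values are arbitrary choices):
> plot3d(sin(x)*cos(y), x = -Pi..Pi, y = -Pi..Pi, grid = [40, 40], style = PATCH, axes = BOXED, orientation = [45, 60], title = "plot3d with explicit options");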
2. Additional Operations from Package plots The library package plots provides further possibilities for the representation of space structures. Especially important is the representation of space curves with the command spacecurve. The first argument is a list of three functions depending on a parameter, the second argument must specify the
domain of this parameter. The options of the command plot3d remain valid as far as they have a meaning in this case. For further information about this package one should study the literature. With the inputs
> plot3d([cos(t) * cos(u), sin(t) * cos(u), sin(u)], t = 0..2 * Pi, u = 0..2 * Pi);    (20.97a)
> spacecurve([cos(t), sin(t), t/4], t = 0..7 * Pi);    (20.97b)
the graphics of a perspectively represented sphere (Fig. 20.13a) and a perspectively represented space spiral (Fig. 20.13b) are generated.
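A short sketch of the same two calls in one session; the added options (scaling, numpoints) and the restriction of u to -Pi/2..Pi/2, which produces each point of the sphere only once, are assumptions for illustration, not part of the example above:
> with(plots):
> plot3d([cos(t)*cos(u), sin(t)*cos(u), sin(u)], t = 0..2*Pi, u = -Pi/2..Pi/2, scaling = constrained);   # sphere, kept round by constrained scaling
> spacecurve([cos(t), sin(t), t/4], t = 0..7*Pi, numpoints = 400);   # smooth helix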
21 Tables
21.1 Frequently Used Constants
π = 3.141592654...
1° = π/180 = 0.017453293...
1′ = π/10800 = 0.000290888...
1″ = π/648000 = 0.000004848...
1 rad = 1 = 57.29577951...°
1% = 0.01
1‰ = 0.001
e = 2.718281828...
lg e = M = 0.434294482...
ln 10 = 1/M = 2.302585093...
C = 0.577215665... (Euler constant)
21.2 Natural Constants
The table contains the values of the physical constants given in "Determination of fundamental physical constants 1986" (E.R. Cohen and B.N. Taylor, Reviews of Modern Physics, 59, No. 4, October 1987), which are based on CODATA Bulletin No. 63, November 1986. The numbers in round parentheses represent the standard deviation of the last digits of the values.
Avogadro constant: N_A = 6.022 136 7(36)·10^23 mol^-1
Atomic mass unit: 1 u = (g mol^-1)/N_A = 1.660 540 2(10)·10^-27 kg = 931.494 32(28) MeV c^-2 = 1822.89 m_e
Bohr magneton: μ_B = eħ/2m_e = 9.274 015 4(31)·10^-24 J T^-1 = 5.788 382 63(52)·10^-5 eV T^-1
Bohr radius: a_0 = r_e/α^2 = 0.529 177 249(24)·10^-10 m
Boltzmann constant: k = R/N_A = 1.380 658(12)·10^-23 J K^-1 = 8.617 385(73)·10^-5 eV K^-1
Compton wavelength ħ/mc of the electron: 3.861 593 23(35)·10^-13 m, of the neutron: 2.100 194 45(19)·10^-16 m, of the proton: 2.103 089 37(19)·10^-16 m
Electronic charge: e = 1.602 177 33(49)·10^-19 C (e/h = 2.417 988 36(72)·10^14 A J^-1)
Acceleration due to gravity: g = 9.806 65 m s^-2
Faraday constant: F = N_A e = 96 485.309(29) C mol^-1
Fine structure constant: α = μ_0 c e^2/2h = 7.297 353 08(33)·10^-3 = 1/137.035 989 5(61)
Gas constant, universal: R = N_A k = 8.314 510(70) J mol^-1 K^-1
Gravitation constant: G = 6.672 59(85)·10^-11 m^3 kg^-1 s^-2
Permeability of free space: μ_0 = 4π·10^-7 N A^-2 = 12.566 370 614...·10^-7 N A^-2
Permittivity of vacuum: ε_0 = 1/μ_0 c^2 = 8.854 187 817...·10^-12 F m^-1
Nuclear magneton: μ_N = eħ/2m_p = 5.050 786 6(17)·10^-27 J T^-1 = 3.152 451 66(28)·10^-8 eV T^-1
Nuclear radius: R = r_0 A^(1/3); r_0 = (1.2...1.4) fm, 1 ≤ A ≤ 250
Classical electron radius: r_e = 2.817 940 92·10^-15 m
Velocity of light: c = 2.997 924 58·10^8 m s^-1
Loschmidt constant: n_0 = 2.686 763·10^25 m^-3
Magic numbers: 2, 8, 20, 28, 50, 82, 126
Magnetic moment of the electron: μ_e = 1.001 159 652 193(10) μ_B = 928.477 01(31)·10^-26 J T^-1
Magnetic moment of the proton: μ_p = +2.792 847 386(63) μ_N = 1.410 607 61(47)·10^-26 J T^-1
Magnetic moment of the neutron: μ_n = -1.913 042 75(45) μ_N = 0.966 237 07(40)·10^-26 J T^-1
Molar volume: V_m = R T_0/p_0 = 0.022 414 10(19) m^3 mol^-1
Planck constant: h = 6.626 075 5(40)·10^-34 J s = 4.135 669 2(12)·10^-15 eV s; ħ = h/2π = 1.054 572 66(63)·10^-34 J s = 6.582 122 0(20)·10^-16 eV s
Rest mass of the electron: m_e = 9.109 389 7(54)·10^-31 kg = 5.485 799 03(13)·10^-4 u
Rest mass of the proton: m_p = 1.672 623 1(10)·10^-27 kg = 1836.152 701(37) m_e = 1.007 276 470(12) u
Rest mass of the neutron: m_n = 1.674 928 6(10)·10^-27 kg = 1838.683 662(40) m_e = 1.008 664 904(14) u
Rest mass of the muon: m_μ = 1.883 532 7(11)·10^-28 kg = 206.768 262(30) m_e = 0.113 428 913(17) u
Rest mass of the pions: m_π± = 2.488·10^-28 kg = 273.19 m_e; m_π0 = 2.406·10^-28 kg = 264.20 m_e
Rest energies: E_0(e) = 0.510 999 06(15) MeV, E_0(p) = 938.272 31(28) MeV, E_0(n) = 939.565 63(28) MeV, E_0(μ) = 105.658 389(34) MeV, E_0(π±) = 139.57 MeV, E_0(π0) = 134.972 MeV, E_0(u) = 931.494 32(28) MeV
Rydberg constant: R_∞ = μ_0^2 m_e e^4 c^3/8h^3 = 10 973 731.534(13) m^-1
Rydberg energy: 13.605 698 1(40) eV
Stefan-Boltzmann constant: σ = (π^2/60) k^4/ħ^3 c^2 = 5.670 51(19)·10^-8 W m^-2 K^-4
Thomson cross-section: σ_T = 8π r_e^2/3 = 0.665 246 16(18)·10^-28 m^2
Interaction constants: strong ≈ 0.08...14, electromagnetic ≈ 1/137, weak ≈ 3·10^-12, gravitational ≈ 5.1·10^-39
Characteristic impedance of vacuum: Z_0 = 376.730 3 Ω
21.3 Important Series Expansions
The columns of the following table give the function, its series expansion, and the convergence region.
Algebraic Functions: Binomial Series
( a fx
) ~After transforming to the form am the following series:
(1fX)rn
Binomial Series with Positive Exponents m(m m(m - l ) ( m - 2) 2 3 + . . . l*ma:+2! 3! m(m - 1).. . ( m - n t 1) zn+ ... (fl)n n! 1 1'3 1.3.7 1.3.7.1lz4*... 1 f -z- -22 f4 4.8 4.8.12x3-4.8.12.16 1.2 1.2'5 -2 2 L L i X 4f.. . -23 1 f -12 - -22 3 3.6 3.6.9 3.6.9.12 1 . 1 f -23 1 . 1 . 3 - ----x4 1 . 1 . 3 . 5 f.. . 1 f -12 - -22 2 2.4 2.4.6 2.4.6.8 3 3.1 3.1.1 1 -2 + -22 -23 + -3224 .. 41 .. 61 .' 83 F..' 2 2.4 2.4.6 5 + -x2 5 . 3 f -23 5 ' 3 ' 1 - -x54 . 3 . 1 . 1 F.. 1 -5 . 2 2.4 2'4.6 2.4.6.8
' +
( m 0)
(1 42): (1 f x)i (1 f 2 ) ; (1f 2 ) :
(1i.Z):
*
* *
Binomial Series with Negative Exponents (1f X ) - m
'
( m 0)
lTrnx+---
+
m(m + 1) x2 F m(m + l3!) ( m + 2Ix3 2! m(m 1).. . ( m t n - 1)E " + ...
+
,, ,
+
n!
1 1.5 1.5.9 1 . 5 . 9 . 1 3 24F... ( l f z ) - i 1 7 -x+ - 2 F 23 4.8.12.16 4 4.8 4.8.12 1 1.4 1'4.7 ( l f z ) - f 1 7 -x + -2 F -23 + -3x14.. 64 .. 97 .. 11 20 7 . .. 3.6.9 3 3.6 1.3.5 1 1'3 (1f 2)-i 1 -zt - 2 2 -23 + -2214 .. 43 .. 65 .. 87 F..' 2 2.4 2.4.6 (1 f 2)-1 1 T 2 + 2 2 2 3 + 2 4 7 . . .
+
3 (13k z)-: 1 F -x 2 (1 f x ) - 2 1 F 22
3.5 3.5.7x3+3.5.7.9 + -22 7-24 2.4 2.4.6 2.4'6.8 + 322 7 4x3 + 5x4 T . ' .
7 . ..
(1k x)-4
*
(1 x)-3
(1 4x)-4
*
(1 x)-5
Convergence Region
Series Expansion
Function
5.7.9 5.7.9.11 5 5.7 1 F -x + -22 F -23 -24 2.4.6 2.4.6.8 2 2.4 1 1 F -(2 ’ 3x 7 3.422 4.5x3 ‘F 5 ’ 6x4 1.2
+
+
+
’
”
’.
.)
( 2 . 3 42 7 3 . 4 . 5 2 ’ 1.2.3 $4 5 . 6x3 7 5 . 6 7x4 + ...) 1 1 F ----(2,3,4,5~ F3.4.5.62’ 1.2.3.4
IF-
$4 5 . 6 . 7x3 7 5 . 6 . 7 . 8x4 t ...) 8
Trigonometric Functions sin x x2 sin a x3 cos a ;in(x + a) sina t xcosa - -- 2! 3! xn sin ( a + 7 ) x4 sin a ... +-+,..+ 4! n! XZn x2 1 4 2 6 1--+---+... + (-1)n-k(k)! cos x 2! 4! 6! x2 cos a x3 sin a :os(x + a) cosa - xsina - - 2! 3! X ~ C O S( a t 7 ) x4 cos a +--... 4! n! 1 2 x t -x3+ -25 + tan x 3 15
+
+
cot x 0 < 1x1 < T
sec x
1t
1 2
-22
277 61 + -24254 + -26 + -28 + .’ . 720 8064 -
cosec x
Convergence Region
Series Expansion
Function 1 x
7 + -x61 + -x3+ 360
31 x 19120
7
5t
127 x7 + ... 604800 0 < 1x1 < 77
Exponential Functions x x' 23 Xn 1 + -+--$,+...+-+... n! l! 2! 3.
e" = ex
az
in a
X ex - 1
1x1 < cx2
x In a ( x In a)' ( x In a)3 +-+-+...+2! 3! I! x B1 X' Bz x4 B3 1--+--+ - - .x6 .. 2 2! 4! 6!
( x In a)n n!
+
1x1 < 02
B xZn i., L .
1x1 < 277
1t-
+(-l)n+1
(2n)!
Logarithmic Functions
In x
+ (2n +(x1)(x - l)'n+l t
In x
In x
In (1 t x ) In (1 - x)
In =2
(. (. (.
- 1)' - 113 ( x - 1) - -+---+... 2 3
-
114
2-1+-
*
23
25
27
x>o
4
( x - l ) n & . .. +(-1)n+1 n (x-1)2 ( x - 1 ) 3 + , , . ; (x- l)n / . , . tX 2x2 3x3 nxn 22 23 24 + (_l)n+l-Xn ... x--+---+... n 2 3 4 2' x3 2 4 x5
(Z)
x2n+1
2 x + -+-+-+...+-+... [ 3 5 7 2n+1
Artanh x
+. .I
1
O<X52 1 x>2 -1<xI1
-l<x
Convergence
Series Expansion
Function
1 1 1 1 -+-+-+-+...+ x 3x3 5x5 7x7
(2n + 1) xZn+l
=2 Arcoth x In /sinxi
In l x / - - - - - -- . . . 6 180 2835 22
x4 12
x2 2
In cos x
26
24
17x8 2520
x6 45
XI
-...
n (271) !
.. - 22n-1
In ltan
2 2 n - 1 ~ 22n n
+
7 x4 -62 In 1x1 + -1 x2 t x 3 90 2835
6
(22n - 1)BnxZn_ . . . n(2n) !
+ ...
=I
0 < 1x1 < 2
Inverse Trigonometric Functions arcsin x
xt
23 -+-+ 2.3
1.325 2.4.5
1.3.52 t... 2.4.6.7 1 . 3 . 5 . ' (2n - 1)xZn+l+ . . . 2 . 4 . 6 . . . (2n)(2n 1)
~
+
+
arccos x
1.325
23
1.3'527
2 +
23
25
arctan x
a:--+---+...
arctan x
&7 - - - r+ - l- - +1 2 x 3x3
3
5
5 . . . (2n - 1) xZnt1 1. 3 2 . 4 . 6 . .. (2n)(2n 1)
+
27
x2n+l
+(-l)n-*".
7 1 5x5
2n+1
_1- . . . 7x7 t(-l)n+l
arccot x
+. .I
(2n + 1)x Z n t 1
+ (-1)n
x2ntl
*
2n+1
*.
f...
1
.
Function
Hyperbolic Functions x7 x2ntl -+-+-+...+-+... 3! 5! 7 ! (2n + 1)!
x3
sinh x cosh x tanh x
Convergence Region
Series Expansion
I+
x5
1 + -2 2+ - +2 4- + .26. . + - + . . .X2n 2 ! 4! 6! (2n) ! 1 2 17 z - - x 3 + -x5 - -x7 + -xQ62 3 15 315 2835
-.
,
.
coth x 0 < 1x1 < K sech x
1 x2 + 5 x4 - 61 x6 t 1385 x 8 - . . . 1- 2! 4! 6! 8!
+-(2n) ! Enxln 5 . . . cosech x
x 3. . . 31s' -1- - +x - - -7 + x
6
360
15120
(2+ 2(-1)" (2n) !
Arsinh x
- 1)
Area Functions 1 . 3 x5 - 1 . 3 . 5 x7 2.4'5 2'4.6.7
1 x--x3+2.3
~
n x2n-l
+
iZrcosh x 23
Artanh x Arcoth x
25
-3+ - +5
27
x2ntl
+
-t . . ' t 7 2n+ 1 1 1 1 1 1 -+-+-+-+...+ +... x 3x3 5x5 7x7 (2n 1)x*n+1
x+
+
+
,,
0 < 1x1 < x
21.4 Fourier Series 1. y = x f o r O < x < 2 a
tY 2. y = x f o r O < x < . i r
y = 2a
-x
< x 5 271
for ?r
tY
y = - 2- -
cos32 cos52 a cosz+- 32 +- 5 2
'(
+...)
3. y = z f o r - T < x < n
tY 4. y = x f o r - T < x < z 22 y = 57 - z for
377 2
<x 5 -
2-
tY I
5. y = a f o r O < x < r y = -a for 7r
SY
< x < 2n sin32 sin52 y = - sin2 + - -t ..*) 3 5
3
+
6. y = 0 for 0 5 z < a and for a - cy < z 5 a + cyand 2a - LY < z 5 2a y = a for a < z < T - a y = -a for ~ + < az 5 2a - cy
3
1 3
y = - cosasinz+-cos3asin3z
+; 1 cos jcysin jz + . .) ,
ax 7. y=-for-a<x
y=-
a(.
-
a
x)
for T - a
y = -afor T + a
5 z5 a + a
1 +-sin
5 z 5 2a - cy
Especially, for cy = I holds: 3
1 . -sin3cysin3z 32
Tff
32
y=
5asin5z
+
'.
1 . sin 52
1 1 . +sin 7z - -isin llz+ . 72 11
=T2 -
(F +T+T
8. y = x2 for -a 5 z 5 T sy 2 t
-x O x 2 x 3 i l 4 x 5 ~ 6 x 7 i l
X
9. y = Z ( T - z ) for0 5 z 5 T
6
10.
g = T(T
- z)for
05z5T
y = (T- z)(h- z) for T
5 x 5 2a
cos42
cos6z
I
11. y =sinzforO 5 z 5 A
.tY 12. y = cosz for 0
< z < 71. 4 /2sin2z
13. y = sinz for 0 5 z
14. y = cos uz for --x
4sin4z
6sin6z
5 71.
5 5 71. +--u2-1 u2-4 (u arbitrary, but not integer number)
15. y = sin uz for
-A
\
212-9
+...) (u arbitrary, but not integer number)
16. y = z c o s z for -71. < z
y = cosz
1 1 + -COS22 + -COS32 + ... 2 3
1 2
y = cosz - - cos22
+ -31 cos3z - . .
1
19. y = -1n cot 2 for 0 < z < 71. 2
2
y = cos2
1 1 + -cos 32 + - cos 5 1 + . . 3 5
,
21.5 Indefinite Integrals (For instructions on using these tables see 8.1.1.2,2., p. 427).
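Individual entries of the following tables can be checked with a computer algebra system (see Chapter 20); a minimal Maple sketch, with the caveat that Maple omits the constant of integration and may return an algebraically equivalent form:
> int(1/(a*x + b), x);      # ln(a*x + b)/a, cf. the entry for the integrand 1/X with X = ax + b
> int(x*(a*x + b)^n, x);    # equivalent to X^(n+2)/(a^2(n+2)) - bX^(n+1)/(a^2(n+1))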
21.5.1 Integral Rational Functions 21.5.1.1 Integrals with X = ax + b
I Notation: X = ax + b ]
1. / X n d x =
1
xntl
( n # -1);
(for n = -1
see No. 2).
2. j ' sX= l lan x .
3. / x X n d x =
1 Xnt2 az(n + 2)
~
b xntl a2(n 1 )
+
(n # - l , # -2);
'I
4. /xmXn dx = am+' (X - b)mXndX
(n # -1,
(for n = -1, = -2
see No. 5 und 6).
# - 2 , . . . ,# -m).
The integral is used for m < n or for integer m and fractional n; in these cases ( X - b)m is expanded by the binomial theorem (see 1.1.6.4,p. 12).
a3
16.
3b
-
3b2 ( n - 2 ) ~ n - z ( n - 1)xn-l b3 +
1
( n # 1, # 2, # 3, # 4).
19.
dx /5
X ax ; + x).
1
= - p (In
dx
22. / E =dx- G + p ~l n - . a
x
23. /-=-a dx
dx dx
26.
dx /x3X
=
-1 [a2 b3
dx
- - -+ x 2aX x 2x2
3a21nX x
28.
dx / 23x3
29.
I-=-dx
(n 2 2).
nalnz]
=
.x".+ x" - K] 2x2 x ,
-; - + x [6u2 In
X x
,
4a3x
[ : ( )
a4x2 X 2 - - -- 2x2 2x2 x
+
1 n+l n + 1 ( - u ) W 2 a2X2 +--bnt2 i (i -2)X'-2 2x2
(n+ 1)aX X
+
t-lnz] n(n l ) a 2 2 30.
(n23).
/2mXn A= bm+n-1 -i=O
If the denominators of the terms behind the sum vanish, then such terms should be replaced by (-a)m-' In. ;X
(,,my T ')
I Notation: A = bf
x dx 33’ . / ( a x t b ) ( j x + g )
35.
36’
J ( a t x )x(dbx+ x ) ’
-
b (a - b ) ( b t x )
a
--
( a - b)’
In-
-1
1
--(-t1 -
38. J ( u + x ) ’ ( b t x ) 2
(a-b)’
a:x
b:x)
-1 x2 dx -( a t x ) 2 ( b t x ) 2 ( a - b)2 ( a ? x
1
b2 - 2ab + I)t ln(b + x) (b - a)2
21.5.1.2 Integrals with X = az2
I Notation: X 2 -
2ax t b
A)t m
a
(for A < 0).
=
2a
2ax+b ( n - 1)AXn-1
(afb)
(a#b)
+ bx t c : A = 4ac - b2 1
(for A < 0),
2ax+ b
a+x btx
(a # b).
+ bz + c
= ax2
-7z dx
In-
(a # b).
(for A > 0),
a
62arctan 2ax + b Artanh VQ 1 In2ax-tb-2 a x + b + a
JFdx
t ya tl b n - a t x (a-b) b+x 2ab
+
( a # b)
2 a+x In( ~ - b ) ~b t x
1 b-tx
t-)t-m(z
43.
a+x btx
x 2 dx b2 a’ tln(a (a+x)(btx)’ (b - a ) ( b + z ) ( b - a)’
40.
(A # 0)
f
dx 37’ / ( a - t x ) 2 ( b + x ) 2
39.
1
b ln(ax + b) - 9 In(fx -t g)]
1
x dx
- ag
(see No. 40).
dx +- (2n-3)2a ( n 1)A Xn-1 -
1
(see No. 40).
(see No. 40).
56'
dx
1 - 2(cf2 - gbf
+ g2a) [
7 1 (fx + SI2
In
2ga - bf +2(cfZ - gbf g2a)
+
dx
Jx
(see No. 40).
21.5.1.3 Integrals with X = u2 k x 2
I
Notation: X = a2 & x2, arctan
for the
"+" sign,
for the "-"sign and 1x1 < a, x 1 x t a Arcoth - = - In -for the "-"sign and 1x1 > a. a 2 x-a If there is a double sign in a formula, then the upper one belongs to X = a2 and the lower one to X = a2 - xz, a > 0.
+ 2'.
dx
1
x
dx
1
ST=dx
59.
x 32 +-+-Y. 4a2X2 8a4X
3 8a5
dx
61. /$=f21nX. 1
/ g =2x~ - . / g 4= ~ ~ , x 1
62. 63.
1
x dx
1
(n # 0).
65. l F = * x ~ a Y . 1 66. /%=?-*-Y. x 2X 2a 67. / % = T $ * ~ *x ~ Y 1.
68.
l--=~--&-/ x2 dx
dx
X.
( n # 0).
x2 a2 - -1nX. 2 2 a2 1 --+-InX. 2X 2 = &-
69.
70. 71.
L 1 2 n ~ n 2n
~ n + l
1%-
/ x3dx x = -1 za2+ s .
x3 dx 72. I s = -
1 a2 t2(n - l)Xn-l 2nXn
73. I g = 3 1 1n z .x2 74.
1
dx
1 2a2X 1
1s = dx
1 2a4
= -+-In-.
x2
X
1 1 75' t - -In 2a4X 2a6 76. /&=-a2ria3y 1 1
+
22
-.X
( n > 1).
21.5.1.4 Integrals with X = a3 f z3 Notation: a3 i x3 = X ; if there is a double sign in a formula, the upper sign belongs to X = a3 + x3,the lower one to X = a3 - x3. 83. 84.
/ dx
1
(six)'
6a2
a2Fax+x2
= +---In
a&
'
Is=1
85. /$=-ln
86.
1 2x a +arctan a2&
6a
(see No. 83). a2Tax+x2 1 22 F a f -arctan (a+x2) a d a& ' 1
/$=x+~/$ 22
(see No. 85).
87. J Tx=2 id-x1 n X 1. 3 1 3x 89.
I x3 - -dx= + ~ ~ ~dx~ J x X
(see No. 83). (see No. 83).
(see No. 85). (see No. 85).
dx
(see No. 83). (see No. 83).
21.5.1.5 Integrals with X = a4 t x4 dx
97.
I=-=
-
I
In
ax Jz +-2 a13 4 arctan aZ-x2
x2+axfi+a2 x2-axa++2
21.5.1.6 Integrals with X = a4 - x4 101.
dx 1 -= -1n I a 4 - x 4 4a3
103.
I
x2dx
104.
1
x3dx
1
= =
a+x a-x
~
1 +arctan 2a3 a
a+x 1 In a-2- arctan I.
-41 ln(a4 - x4).
21.5.1.7 Some Cases of Partial Fraction Decomposition 105. 106.
1
1
1 --A (x$a)(x+b)(x+c)-x+a
B +-+x+b
where it holds A = 107.
1 -- A (x+a)(x+b)(z+c)(s+d) - x + a
108.
1 (a + b x 2 ) ( f gx2) E
+
C x+c' 1 (b-a)(c-a)'
B +-+-+x+b
C
B=
1 (a-b)(c-b)'
C=
1 (a-c)(b-c)'
D
x+c s+d' 1 1 B= usw. where it holds A = (a - b)(c- b)(d - b) ( b - a)(c - a)(d - a ) ' 1
21.5.2 Integrals of Irrational Functions 21.5.2.1 Integrals with &and u2 f b2x I Notation: arctan !@ for the sign "+", a -1nfor the sign "-". 2 a-b& If there is a double sign in a formula, then the upper one belongs to X = a2 + b2x,the lower one to X = a2 - b2x.
113.
2 dx JW = ;17Y.
116.
dx /m
2 3 b 2 f i 3b T -Y. a 2 X f i T a 4 X a5
= --
21.5.2.2 Other Integrals with 6 117.
/=fidx a4-22
120.
= --
2a&
ln
1 a& +arctan -. a& a2-x
1 a+& 1 fi -In -- - arctan 2a a - & a
J (a4 dxx2 I f i --In-2a3' -
x+a&+a2 x-a&+a'
-
'+fi+Larctan--. a - fi a3
fi
21.5.2.3 Integrals with d
a
I Notation: X = ax + b I 121. / a d z =
L a. 3a 2(3ax - 2 b ) m 15a2 '
122. / x f i d x =
2(15a2z2- 12abx + 8 b 2 ) m 105a3
123. / r 2 & d x = 124.
125. 126.
/E dx = a 2 6 x dx - 2(ax - 2b)
1%
=
+
2(3a2x2- 4abx 8 b ' ) a 15a3 forb > 0, f o r b < 0.
128.
/g d x
129.
/m=----
= 2&
+b /
dx
dx -
(see No. 127).
XVT
a dx 2 b / z
bx
(see No. 127).
(see No. 127).
130.
2 m 132. / m d x = 5a ' 133. / x a d x = -
%a2
(5 dX 7- -7bd-X)5
134. 135.
136.
/Tds=
-2 +f2 lb f i + b 2 / 3
dx
XVT
/s=L(fi+-$=), a2
(see No. 127).
(see No. 127). (see No. 1 2 7 ) .
21.5.2.4 Integrals with
Jn and d
[Notation: X = ax
m + b, Y = f x + g, A = bf - ag 1
(see No. 146).
149.
I%=
dx
'
2
f d
J-af arctan m a-m L r n h ffO+rn
forAf < 0 ,
(
A +2aY 150. / m d x = ___ 4af
forAf>O. (see No. 146).
(see No. 146). (see No. 149).
153,
I!?!?-fi dx
154.
155. J a Y ' d x = 156.
Yn-'dx
2 - (2n+ 1).
1 ~
P n + 3)f
(
2fiY"+' t A
IF=( J17 --
( n - 1)f
2"JL). fiyn-1
Yn-1
21.5.2.5 Integrals with d
158. l x f i d x =
m
1 Notation: X = a2 - x2 1
-3\/57j. 1
159. / x 2 f l d x =
m m
160. / r 3 a d x = --a2-. 5 3 161. / g d x = 162. 163. 164. 165. 166.
$dx
1 1J17 1~
a-ah--.a t f i
-e
- arcsin
fi
g d z = -dx
x dx
2x2
= arcsin =
z.
1 a +In -. 2a
t 0 x
z.
-e.
--a t 2
x'dx x -= 0
167.
=
I$$--fl 3
a2
x
arcsin -.
a2fi.
Y" dx JT )
(see No. 153).
168. 169.
1zJI? -. /a=-dx
1
a + o
= --In
dx
a2x ' dx
172. / x @ d x
=
1 -;m.
x f l 173. / x 2 m d x = --
+-a 224x f l +
6
JsTi
a2fl
7
5
m 175. J $dz = -+ 177.
/z
178. 179.
.
a + O
- a3 In -.
3
1
+
= -- -
174. / x 3 @ d x
176.
a 4 x 0 a6 x - - arcsin -. 16 16 a
m -3- x c J f l -
dx = --
x
2
X
3
-a2arcsin 2
@ 3 J f ? 3a d x = -- - -In 2x2 2 2 dx
+
I.
a+dX
-.
/*m - E ' 1
-
arcsin 2.
a
181. /*=d?+g x3 dx
Jrr' 182.
184.
I
dx
Jm=--
/- dx
1 a 2 0
1 -In a3
1 2a2x2Jrr
= -~
-a. + d X x
3 3 +- -In-.
2a4dX
2a5
a+dX
x
21.5.2.6 Integrals with
4 [Notation: X = x2 + a2
185.
/ v%dx
=
1
(xv% + a2Arsinhz) + C c1. + a)+]
1 = - [ x a a2 In (x 2
+
=
186. /W??dx
I
id%
a2fl Jx3 - -
188. /x3v%dx =
3
'
/$ dx a-ah-, a + J s ? Js? 190. / 5dx -z+ Arsinh I + C =
189.
=
191.
0 123
192.
/ Edx
193.
/$=a .
Js?
dx = -2x2
-
I a+Js? - In ---x 2a
+ C = In (x + a)+ C1.
= Arsinh
/ $ iv%- 22 Arsinh E + C 195. /$= -3 194.
Js? +In (x + a)+ C1.
= --
=
=2
az
-
2 (x -In
+ 0)+ 4.
aza.
196. / a =dx- - l n - ,1 a 197.
a+Js? x
/nJs? / m Js? dx
= --
aZx '
198.
dx
= --
2a2x2
1 +-ln-. 2a3
a+Js? x +C
1 200. / x m d x = ;v%.
x m
aZx@ a4x0 201. / x 2 m d x = -- -- -16 16 6 24 a6 a Z x m- a 4 x 0- xJl?s - -16 16 6 24
gArsinhz +c a
202 203 204
/$
F 3 3 - x a -a2 Arsinh C x 2 2 - --f l + ! x f i + ! a 2 l n ( x + f i ) + C 1 . x 2 2
d dx = --
+
+
+
205
206 207 208 209
210 211 212
21.5.2.7 Integrals with
4 I Notation: X = x2 - a21
213. / a d z = 2.1 (Xfl-a'Arcosh?)
+C
q +cl,
215. / x 2 d f d x =
216. /x3&dx
=
rn + a 2 m 3
,
220.
/ Js? dx f i aarccos fl. 0dx Jls + .4rcosh + C 0+In (x + a)+ /7 / $dx 2x2 + 2a arccos fl. / Edx Arcosh + In (x + 6) + 9.
222.
/ x x5d?+
217. 218. 219.
-
=
= --
= --
0
1 -
= --
C=
=
x2dx
a2 ?a+ -2l n ( x + a) +C1
a’
-ArcoshC 2 +C = 2
=
223. 224. 225. 226.
0 dx
/a /=dx
C1.
3 =
1
- arccos
.!
=-
I==dx
229. /x2@dx=
a2x
0+ 1 a arccos -.
2a2x2
2a3
X
x@ a2xfl -+
-230. / x 3 m d x
’
6
24
6
24
fl+ a Z m = ~
7
5
a 4 x 0 fArcosh 2 16
+
16
a
+
231. J ~ d x = ~ - a 2 f i + a 3 a r c c o sa - . 3 232.
1-m -g
233.
1
m3 3 dx = -- x G - -a2 Arcosh 2 2 2 - -~ @+!xfi-!a21n(x+fi)+C1. 2 2 2
23
dx
234.
+
+C
fl+ 3 0 3 - -aarccos E, 2 2
dx = -2x2 X
= --
a2JIS' x dx 1 = --
235.
1-
236.
I
237.
I$$=&---.
0'
x'dx
= --
x
+ Arcosh -X + C = -z + In (x + 0)+ C1. 0
0
a2
Jr? 238. 239. 240.
J
1
dx dx x2vF dx
1 a -a3 arccos -. X
-m
=
I-=-----
')
1 0 a4 1 3 2 ~ 2 x 2 0 2a4Jr?
+ Jr? -.
3 2a5
a arccos -. X
+ +c
21.5.2.8 Integrals with dux2 bx Notation: X = ax2
J;;
241.
dx
+
+ bx + e ,
A = 4ac- b 2 , k
1 2ax b -Arsinh -+ c1
fora>O, A > O ,
+ b)
fora>O, A = O ,
/ E = J;; .1
-ln(2ax
J;;
-- 1
a 2ax
+b
4a
=-
fora
A
(see No. 241). (see No. 241). (see No. 241).
(see No. 241).
(see No. 244). (see No. 241). (see No. 241). (see No. 241). (see No. 246). (see No. 248). (see No. 245).
dx
1 --Arsinh
+
bx 2c -
bx+ 2c _f 1_ ilnT1
bx
forc>O, A > O , forc>O, A = O ,
+ 2c (see No. 258). (see No. 241 and 258).
(see No. 241 and 258).
264.
-1
dx , x - a = arcsin -. ax - x x dx
267'
1
-
x-a
-J'ZZG?+ a arcsin -.
dx (ax2 b ) J m =
1
&Jm
+
( a s - bf > 0 ) .
21.5.2.9 Integrals with other Irrational Expressions
269.
1-W dx
n ( a x + b)
1
= ~( n - 1)a
W'
dx
2
= - arccos -
~
21.5.2.10 Recursion Formulas for an Integral with Binomial Differential 273. / x m ( a x n -
+ b)Pdx
m+np+l 1 ~
bn(p + 1) 1
+
( m 1)b
[xmil(axn
+ b ) P + npb
[-xm-'(axn+b)P~l+(m+n+np+l)/xm(axn+b)P+'dx
+
xm+'(axn b)pt' - a ( m
1,
+ n + np + 1)/xmtn(axn + b)pdx],
-
1
+ + 1) [zm-n+'(azn+ b)pt'
- ( m- n
a(m np
+l)b
I
zm-"(uzn
+ b)Pdz
21.5.3 Integrals of Trigonometric Functions Integrals of functions also containing sin z and cos z together with hyperbolic and exponential functions are in the table of integrals of other transcendental functions (see 21.5.4, p. 1044).
21.5.3.1 Integrals with Sine Function
278.
1 1 1 1 1
280.
1
274. 275. 276. 277.
1 sin ax dx = - - cos ax. 1 1 . sin' ax dz = -z - - sin 2az. 2 4a 1 1 sin3 az dz = -- cos az t - cos3 az 3a 3 1 1 sin4ax dx = -z- - sin 2az -sin 4az. 8 4a 32a
+
sinn-' ax cos ax na sinaz zcosaz 279. I x s i n a z d z = -- -.
sinn-' az dz
sinn az dx = -
(ninteger numbers, > 0).
a2
:( $) (5 ): :(
22 . z' sin ax dz = -sin az a'
281. / z 3 s i n a z d z =
-
-
-
sinaz -
Xn
282. /x"sin az dz = -- cos az t
The definite integral
cos az -
2)
cosaz.
zn-' cos az dz
(n> 0).
-dt is called the sine integral (see 8.2.5, l.,p. 458) and it is denoted si(z).
si:t For the calculation of the integral see 14.4.3.2. 2., p. 694. The power series expansion is 23 27 si(z) = z + x5 - + ' . . ; see 8.2.5, l.,p. 458. 3.3! 5 . 5 ! 7 . 7 ! sin az a cos dz 284. y d z : = -(see No. 322). ~
1
~
+
285.
I%&=---
286.
1
287.
I
1
2
1 sinaz n-1zn-'
~
a n-1
,[ zl!;c
dr
dx 1 a5 1 -= cosec az dz = - In tan - = - In(cosec az cot ax) sin ax 2 a dx 1 -= --cot ax. sin'ax a
(see No. 324).
dx
289.
1-
290.
I*=-
cosax
1 ax +--Intan--. 2
1 cosax dx = sinnax a(n - 1)sin'-' a z
sinax
(
1 (ax)3 ax+-+-+3.3! a'
+-nn --21 I sin'-'dx a z
( n > 1).
~
7 ( a ~ ) ~3 1 ( a ~ ) ~ 3.5.5! 3'7'7! 2(22n-' -
+-+...+ 127(
')Bn(ax)''+l (2n + I)!
3.5.9! B, denote the Bernoulli numbers (see 7.2.4.2, p. 410). 1 xdx x 291. = - a c o t a x + -1nsinaz. a' 1 x dx x cos ax
+ ...
/&
293.
1
dx l+sinax
( n > 2).
a
dx
296.
298.
1
dx sin az( 1 sin ax)
*
a
(a
dx - _ _1t a n 299' l ( 1+sinax)' - 2a
300. 301.
1 1
(1 sinaxdx
=
1 cot
- _ 1_
7)
--- -1 tan3 6a
(f - y ) +
(1 + sin ax)2 - 2a tan
1
Cot3
I-=-
dx 1+ & a x
2JZa
-
.
(f - 7) + sa1 tan3
=--cot(%-'i.)+sacot3(~--). 1 ax 1
303.
(f
7) 7)
(f -
arcsin
(3
)
sin' ax - 1 sin' ax + 1
ax
305.
/ sin ax sin bx dx
=
+ +
sin(a - b)x - sin(a b)x 2(a - b) 2(a b) 2
dx -306' / b + c s i n a x a -
m
b tan ax12 + c
arctan
m
1 b tan ax12 a m L nb tanax12 x b dx
for la1 = /bl see No.275).
(la1 f Ibl ;
+c +c +
for b2 > c2),
for b2
/- sinaxdx dx 1 ax In tan 4 / dx 308. / sinax(b+csinax) ab 2 b b+csinax c cos ax b dx dx 309. / (b+csinax)2 a(b2-c2)(b+csinax) + m / b + c s i n a x sinaxdx cos ax dx 310. / ( b + c ~ i n a x ) -a(c2-b2)(b+csinax) ~ + Ab+csinax / -tan ax dx arctan (b > 0). b 311' / b2 + sin2ax abm t a n a x dx (bZ> arctan b 312' / c2sin2ax ab-tanax+ b 307.
=-
c2).
(see No. 306). (see No. 306).
--
-
b2
-
b
--
1
--
1
-
1
cz
c2,
In -tan
2ab-
ax - b
21.5.3.2 Integrals with Cosine Function 1 .
313. Jcosaxdx = -sinax.
/ cos2ax dx = -x2 + 4a sin 2ax. 1 315. / cos3 ax dx sin ax sin3ax. 3a 3 1 316. / cos4 ax dx -x + sin 2ax + sin4ax. 8 4a 32a ax sin ax n 1 + / cos"-2 ax dx. 317. / cos" ax dx na 1
314.
1
==
(see No. 306).
1 -
--
1 -
-
-
=
cosax xsinax 318. /xcosaxdx = - -. a2 22 x2 cos a z dx = - cos ax + a2
+
:(
5)
320. / x 3 c o s a x d x = ( s - $ ) c o s a x +
x" sin ax n 321. /zncosaxdx = -- -
/
xn-l
sin ax.
(:-$)sinax. sin ax dx.
(see No. 306).
b > 0),
(2> b2, b > 0).
322.
21. Tables
1
(ax)2 (ax)4e)..( E d x = ln(ax) - - -- -t . . 2.2! 4.4! 6.6!
+
dt is called the cosine integral (see 14.4.3.2, p. 694) and it is denoted by
The definite integral -
22
+
24
26
+. . see 8.2.5,2., p. 458;
Ci(x). The power series expansion is Ci(x)= C t In x - - -- 252! 4 . 4 ! 6 . 6 ! C denotes the Euler constant (see 8.2.5, 2., p. 458). --cos ax -
(see No. 283).
X
(see No. 285). 325.
I-=- '
326.
a
dx cosax dx
1 Artanh Artanh(sin ax) = - In tan a 1 = tan ax.
dx dx
329.
( n > 1). En(ax)2n+2 t . , (2n 2) (2n!)
+
cos ax
.)
E, denote the Euler numbers (see 7.2, p. 411).
332.
/* cos2ax 1% 1
333.
1
334.
I---
330. 331.
cosnax
1
=
f tan ax + - In cos ax.
=
1 x sin ax (n- l)acosn-'ax (n- l ) ( n - 2)a2cos"-2ax
a
a2
dx 1 ax = - t a n -. l+cosax a 2 dx 1 ax = --cot -. 1 - cosax Q 2 x ax 2 ax xdx - -tan - - lncos -. 1t c o s a x - a 2 a2 2 x dx x ax 2 ax ---cot-+-Insin--. 1 -cosax a 2 a2 2 cos ax dx 1 ax = x - - t a n -. 1t COSUZ a 2 1 ax cos ax dx = -x - -cot -, 1 - cos ax a 2 dx 1 1 ax = - lntan - - tan cos ax(1 cos ax) a a 2
~
~
335.
I--
336.
/-
+
~
+
(: + y )
x dx
(n > 2).
339.
1
dx cos ax(1 - cos ax)
a
dx 1 ax 1 ax = -tan - + - tan3 340' s ( 1 t cosax)2 2a 2 6a 2 ax 1 ax - - - Cot32 6a 2
1 dx = --cot 341' s ( 1 -cosax)' 2a
1 6a
ax 2
cosaxdx = -cot 1 ax - 1 343' / ( I - cosax)' 2a 2 6a
ax 2
cosazdx 342'/(l+cosax)'
344.
I---
(
1 - 3 cos' ax dx arcsin 1 +cosZax 2 4 % 1 +cos'ax
346. 1cos ax cos b s dx =
347'
ax 2
1 2a
= - tan - - - tan3 -
1+ b
sin(a - b)x sin(a t 2(a - b) 2(a
dx -ccos ax a
)
'
+ b)x + b)
( b - c) tan ax/2
2
m
+m
( c - b) tanax/2 m In (c - b) tan ax/2 cosaxdx = b dx c -c btccosax b+ccosaz 1 dx = -1ntan cos ax(b + c cos ax) ab
a
1
349' 350'
351'
352' 353'
1 1+ 1 1+ 1
(for b' > e')
d F 7
1
-
(y + t)
(for bZ < e').
m
(see No. 347).
J
dx
+
dx
cosaxdx b sin ax ( b + ccosax)' - a(b2 - c2)(b+ ccosax)
dx
b'
dx -cZcosZax abdx
b2 - c2cos2ax
1
b tan ax arctan -
b tan ax -~ 1 arctan ab1 btanax 2 a b m In b tan ax
1 . sin ax cos ax dx = - sin2 ax. 2a
(see No. 347). (see No. 347).
(b > 0).
m
21.5.3.3 Integrals with Sine and Cosine Function
1
(see No. 347).
CCOSax
c sin ax dx ccosax)2 a(cz - bz)(b + ccosax)
(b
+
354.
(for la1 = J b / see No. 314).
( l a # lbl);
(b2 > c 2 ; b
> 0) ,
(e' > b2 b > 0 ) . ~
355.
1
356.
/ sinn ax cos ax dx
357. 358.
x sin4ax sin' ax cos' ax dx = - - 8 32a '
1 1
=sinn+laa:
a(n
( n + -1).
+ 1)
= - -coSn+lax
sin ax cosn ax
a(n
sinn ax cos"' ax dx = -
+ 1)
( n + -1).
sinn-' ax COS"'+' ax a(n m)
+
n- 1
+n + m /"
ax COS"' ax dx
(lowering the exponent n ; m and n > 0), -
sinn+' ax cosm-' ax a ( n m)
+
/
m-1 +n+m
sinn ax c o P 2 ax dx
(lowering the exponent m ;m and n > 0).
359. 360. 361. 362. 363. 364. 365. 366. 367.
368. 369.
dx 1 = -In tan ax. J sinaxcosaz a
1 1 1 1
dx sin' ax cos ax dx sin ax cos' ax dx sin3ax cos ax dx sin ax cos3 ax
I I I
=
= - (1i n t a n - + ax a 2 =
cosax
).
1a (Intanax - 7). 1 2sin ax
= - (1I n t a n a x + 2cosZax a l )' 2 dx = --cot2ax. sin' ax cos2 ax a dx sin'axcos3ax dx sin3axcos2ax 'dx dx 1 sin ax cosn-' ax sin ax COS^ ax a(n - 1) cosn-1 ax
1
I I
-1.
1 [lntan (i+ y ) - 1 a sin ax
+I I
( n # 1) (see No.361 and 363).
1 dx dx -_ a(n - 1) sin'-1 ax sin'-2 ax cos ax sin' ax cos ax 1 1 dx - --, a ( n - 1) sin'-' ax cos"'-1 ax sinn ax c o p ax
( n # l)(see No. 360 and 362).
+
+
%I
dx sin"-' ax cosm ax
(lowering the exponent n; m > 0, n > l ) , --.
1 1 a(m - 1) sinn-1 ax cos"'-1 ax
n+m-2 +
I
dx
7sin" a x cos"'-2
(lowering the exponent m; n > 0,m > 1).
ax
370.
I---=
sinaxdx -
1 acosax
sin ax dx
372.
373. 374.
375.
1-
sin ax dx COS"
ax
a
1 2a
1
-
-1 sec ax. + C = -tanZax+Cl.
1 a(n - 1)cosn-1 a x '
I-
sin'axdx 1 1 =--sinax+-lntan cos ax a
-1 -1
sin2axdx = -1 C O S ~ ~ Z
sin' ax dx COS"
ax
-
[- sinax
- - ~1n t a n ( : + y ) ] . 2
a 2cos'ax
dx
sin ax a(n - 1)COS"-1 ax
( n # 1)
(see No. 325,326, 328).
cos ax sin3ax dx cos ax 1
379. 1-dx 380.
sinn-' ax a(n - 1)
= -~
ax dx = = cosrnax
382. 383.
I---
dx
sinn-' ax
cos ax dx 1 = -~ sin3ax 2a sin2ax
( m # 1).
cot' ax +c=-+GI . 2a
cosaxdx 1 a(n - 1)sinn-' a x ' sin ax a
(-
-)
ax cos'axdx 1 cosax - -- In tan . 2 sin3ax - 2a sin'ax
-1
(n # 1). dx
cos ax dx - -- 1 - --1 cosec ax. asinax a
sinax 385.
sin:;:
sin"-' ax
-
1 1-
1 1:
sinni1 ax
1
-_ -
381.
+
1
386.
I----
387.
-1
388.
I--
cos2ax dx sinn ax
+JL) ( n # 1) sinn-' ax
(see No. 289).
(
1 , - -- (sinax a
/--A[cos3axdx -
390.
=
sinax
I GFT
cosn ax dx
+ -)sin1ax 1
ax a(n - 1)
c0P-l ~
--
. -
a ( n - 3)
sin ax
391.
___ cosaz
( n - 1) a sinn-' a x
cos3axdx 1 cos2ax -- sinax - a 2
cos3axdx
389.
(
-
ax
+
1
( n - 1)sinn-' ax
J ;;:c o:s
dx
( n # 1).
cosn+1ax a(m - 1)sinm-1 ax cosn-1 ax
(m#
111
( m# n ) , ( m # 1).
dx
1
dx
1
394.
1 1f c o s a x J cos ax(sin1axidxcos ax) = -In a cos ax
395.
1
,-
1 lisinax cosaxdx = --ln7, sin ax(1 f sin ax) a sin ax
sinaxdx
397. 398,
-
1
I
cosaxdx -1 1 ax f-lntan-. 2 2 4 1 & cos ax) 2a sin a x ( l & cos ax)
J
sinaxdx x 1 = - 7 - ln(sinax i cosax). sinax f cosax 2 2a cos ax dx dx dx = illn (1 l+cosax+sinax a dx bsinax
--
+ ccosax - a
1 m
* tan y ). ax t e In tan 2
and t a n e =
with sin0 = ~
!.
b
/ 404. / 403.
sinaxdx 1 = - - In(b b+ccosax ac b
+ c cos ax).
1 cosaxdz = -ln(b+csinax). csinaz ac
+
d (x
dx
405'
/
/b+
+ fsinaz =
btccosax
+
:)
m s i n ( a z +e) with sin0 =
dx
406'
/
407'
1
408.
/
b2
b2
1 abc
= -arctan
cos2 ax + cz sin2ax
tan ax)
(see No. 306).
(a2 # b2); for a = b
(see No. 354).
.
ctanax+b c tan ax - b'
1 dx =-In cos2 ax - c2 sin' ax 2abc
sin ax cos bx dx = -
(i
C
r n and t a n 0 = 2f
~
COS(^ + b)x - COS(^ - b ) ~ 2(a
+ b)
2(a - b)
21.5.3.4 Integrals with Tangent Function 1
409. / t a n az dz = -- In cos ax. tan ax 410. / t a n Z ax dx = -- 2 .
/ 412. / 411.
1 tan3 az dz = - tan' ax 2a
+ -1 In cos ax.
tann ax dz = -tann-' ax a(n - 1)
1
tann-' ax dx.
22n(22n - 1 ) a2n-l ~ 2n+l ax3 a3x5 2a5x7 17a7x9 n x 413. /z tanaxdx = - + - -t -t ... (2n + I)! 3 15 105 2835
+
+
+..
B, denote the Bernoulli numbers (see 7.2.4.2, p. 410). tan ax dx
414.
/-
=
ax+-+-
22"(22n - l)B,(ax)'n-l +t ...+ 2205 (2n 1)(2n!)
2 ( a ~ ) 1~ 7 ( a ~ ) ~
9
75
-
/ t a n adxx & 1 *-x2 + -2a1 ln(sinax cosaz). tanazdz x 1 417. /- - - ln(sin ax cos ax). t a n a s k 1 2 2a 416.
~
=
&
zt
21.5.3.5 Integrals with Cotangent Function 418.
/ cot a z dz
1
= - In sin ax.
+...
cot ax --x.
419. /cot'axdx=
1 1 420. / c o t 3 a x d x = --cot'ax--1nsinax. 2a
421.
1
cotn ax dx = -cotn-' ax a(n - 1)
/ cotn-' ax dx
(n # 1).
22ngna2n-lX2n+l x ax3 a3x5 -*.. 422. I x c o t axdx = - - - - -- ... 9 225 (2n + I)! a
B, denote the Bernoulli numbers (see 7.2.4.2, p. 410). (ax)3 --2 ( a ~ _) ~. . . - 2'"Bn (a2)'n-l 423, cotaxdx = _ 1__ ax _ _4725 (2n - 1)(2n)! X ax 3 135
1-
cotn ax 424. / x d x = -cotn+lax a(n t 1)
425.
I-=/*
dx 1 cot ax
-...
( n # -1).
tan ax dx tanax % 1
(see No. 417).
21.5.4 Integrals of other Transcendental Functions 21.5.4.1 Integrals with Hyperbolic Functions
429.
1 1 1 1
430.
1
426. 427. 428.
1 sinh ax dx = - cosh ax. 1 cosh ax dx = - sinh ax. 1 1 sinh' ax dx = - sinh ax cosh ax - -x. 2a 2 1 . 1 cosh' ax dx = - sinh ax cosh ax + -x. 2a 2
sinhn ax dx 1 . an
= - sinhn-' ax cosh ax -
1
n+2 sinhn+l ax cosh ax - - sinhn+' ax dx n+1 a(n + 1)
431.
coshn ax dx
1
n-1 +coshn-' ax dx n + 2 j - -sinh ax coshntl ax + - coshnt2 ax dx n+l a(n t 1) an
433.
(for n < 0) (n # -1).
1
1 sinh . ax coshn-' ax =-
432.
(for n > 0) ,
1 1
dx
=
a1 In tanh -.ax2
dx 2 -= - arctan eaz. coshax a
(for n
>O),
(for n < 0) (n # -1)
22.5 Indefinite Inte.qrals 1045 1 1 434. [ z sinh a z dz = -zcosh a z - - sinh az. a2
1 . 1 435. [z cosh ax dx = -x sinh a z - - cosh az. a2
436.
1
1 tanh ax dx = - lncosh ax.
[coth ax dx = -1 In sinh ax. tanh ax 438. [tanh2ax dx = x coth ax 439. [coth2 a z ds = a: - -. 440. [sinh a z sinh bx dz = -(asinhbzcoshax-bcoshbzsinhaz) a2 - b2 437.
-,
441.
1
cosh a z cosh bz dz = -( a sinh ax cosh bx - bsinh bx cosh ax) a2 - b2 1
442. I c o s h a z s i n h b z d z = -(asinhbxsinhax a2 - b2
- bcoshbzcoshaz)
1 2a
443. [sinhaxsinazdx = -(coshazsinax - sinhaxcosax). 444.
1
1 . cosh ax cos a z da: = -(sinh a z cos ax + cosh ax sin ax). 2a
1 445. [sinhaxcosaxdx = -(coshazcosax +sinhazsinax). 2a
446.
1 [cosh ax sin ax da: = -(sinh 2a
ax sin a z - cosh a z cos ax).
21.5.4.2 Integrals with Exponential Functions 447.
1
1 ear dz = -eax.
eax a2
- 1).
448. [zeaxdz = -(ax 449.
[z2eardx = ea' :(
22 7 +
-
$-)
1
1
450. /zneax dz = -xneaz - - xn-'ear dx 451. / $ d z
ax l.l!
(ax)' (ax)3 +... ++2.2! 3 . 3 !
=lnz+ -
(a2 # b 2 ) . (a2 #
b2)
(a2 # b2).
dt is called the exponential function integral (see 8.2.5, 4., p. 459) and it is --m
denoted by Ei(x). For x > 0 the integrand is divergent at t = 0; in this case we consider the principal value of the improper integral Ei(x) (see 8.2.5,4.,p. 459). ;dt = C + l n i x l + --oc
x +x= 23 xn t-+ ... + +. . . 1.1! 2.2!
3.3!
C denotes the Euler constant (see 8.2.5, 2., p. 458)
dx dx 454' 455.
/ enzdx
456.
1
1 ens - -1na 1 t ens' x 1 In(b t tea").
dx
beax
=
1 ; ln(b t ce"')
+ ce-ar
1
=
arctan ( e n s 4
I c t e a z G 2 a a In c - ens-
--
xens dx
--
ear x
1 a
easInx a
458. /en"Inxdx = -- - 1 - d x ;
(see No. 451).
en' a2 t b2 ear ear cos bx dx = -(acos bx + bsin bx). a2 + bZ
459. /ear sin bx dx = -(asinbx - bcosbx). 460. 461.
1 1
ear sinn x dx =
Sinn-l
+
a2 n2
'(a sin x - n cos x) - 1) +-n(n /enx sinn-' x dz ; az t nz
462.
1
ens cosn x dx =
a 2 t n2
+ 463. /senzsinbxdx =
x
ear
+
a2 n2
(a cos x + n sin x) ens
x dx ;
xeax ( a sin bx - b cos bx) a2 t b2
(see No. 447 and 459).
(see No. 447 and 460). ear ~
(a'
+ b 2 ) 2 [(a2 - b2) sin bx - 2ab cos bx]
,
xeax
ea'
464. / r e n ' cos bx dx = -(acosbztbsinbz)---a2 + b2
[(a'
b2)2
-
b2) cos bz t 2absin bx].
21.5.4.3 Integrals with Logarithmic Functions 465. / l n r d x = x l n z - x . 466. /(lnx)2dz = z(1nx)' - 221nx + 22.
I
467. ( l n ~ ) ~ =d zx ( l n ~ -)3~z ( l n x ) 2 + 6 z I n z - 6 z . 468. 469.
S(lnz)"
1
S
dz = z(1nz)" - n (lnz)"-'dz
= In lnx
( n # -1).
(lnz)' (1112)~ + In x t t-t..,. 2.2! 3.3!
1
is called the logarithm integral (see 8.2.5,p. 458) and it is denoted by Li(z). In t For x > 1 the integrand is divergent at t = 1. In this case we consider the principal value of the improper integral Li(x) (see 8.2.5, p. 458). The relation between the logarithm integral and the exponential function integral (see 8.2.5, p. 459) is: Li(x) = Ei(ln2).
The definite integral
( n # 1) ;
[-
1-
1 In x 471. 1 x " l n x d z = xmtl m + 1 - ( m t l ) '
472. /xm(lnx)"dz =
474.
l$dx=-
475.
1
xm
xmdx 476. II,,=/,dy
dx = e-31
( m # -1).
zm+'(lnz)" n - -/xm(Inz)"-' dx ( m # -1 m t l m+l
1 In' ( m - 1)xm-1 ( m - 1 ) 2 2 m - 1
(see No. 469)
~
n # -1 ; (see No. 470).
( m # 1).
(Inz)"
( m # 1);
(see No. 474).
with y = -(m t 1 ) l n x ;
(see No. 451).
478.
1%dx
479.
( n - 1)2(lnz)2 ( n dx Ja = In lnx - ( n - 1)l n z t 2.2!
= lnlnx. -
~)~(inz)~ +..'. 3.3!
25
22n-IB n x Z n t l
900
n(2n t l)!
482. / l n s i n x d x = x l n x - x - - - - -x3 s..-
18
-...
B, denote the Bernoulli numbers (see 7.2.4.2, p. 410). 2%-1
483. / l n c o s x d x = - - - -2-3- - . .2. -5
6
484. I l n t a n x d x = x l n x - x
315
+23 t 7x5 t ... + 2 2 y p - 1 - 1)Bnx2n+l , . . 9
485.
/ sin In x dx
2n
(2 - 1)Bnx2n+l - . . . n(2n + l)!
27
60
450
n(2n
+ l)!
+
= :(sin In x - cos In x)
2
486. S c o s l n x d x = z(sinlnx+coslnx). 2 487.
1 1 ear In x dx = -ear In x - a
1
eez
(see No. 451).
;
-dx x
21.5.4.4 Integrals with Inverse Trigonometric Functions 488. 489. 490.
/ arcsin z dx x arcsin a + m. / x arcsin F dx (g <) arcsin + % d m . 1 / x2arcsin 2a dx 3 arcsin xa + 9-(x2 +2 a 2 ) m . =
23
=-
,
491, /arCSi:fdx
492. 493. 494. 495.
-
=
arcs:z:
1 1
arccos
-
1 . 3 d' 1 . 3 . 5 -x7+ . . . , 2 . 4 . 5 . 5 a5 2 . 4 . 6 . 7 . 7 a 7
1 x3 2 . 3 3a3
x a
--+--+--
dx
I
1 a
x a
1
+
= -- arcsin - - -In X
:dx = x arccos
x arccos F dx =
-d
(5 ): -
a t d F 7 X
m .
arccos ! -
2m.
x 23 x 1 x2 arccos - dx = - arccos - - -(x2 t 2 a 2 ) d n . a 3 a 9 arccos!dx
x
1
13
1.3
25
1.3'5
27
498.
/ arctan
x a dx = x arctan - - - ln(az + xz). a
x
2
1
x
ax
499. / x arctan - dx = -(x2 t a') arctan - - -. a 2 a 2
x
ax2 a3
x
x3
500. Jx' arctan - dx = - arctan - - a 3 a 6
+ln(a2 + x'). 6 ( n # -1).
,
a r c t ~dx 2~
503.
504.
arctzi=
507.
dx
510,
-
z
a
1 J x arccot x- dx = -(x' a
2
/ / xn arccot -a dx
+2
+
x + -.ax
+ a z ) arccot a
2
x x3 x ax' a3 x2 arccot - dx = - arccot - + - - - ln(a' a 3 a 6 6 X
508.
aZ+x2
1
1 dx arctan - + ( n - 1)xn-1 a n - 1 xn-l(a2+z2) x a arccot dx = x arccot - - ln(a' x').
505. 506.
x
1
= -- arctan - - - In x2 ' X a 2a
arcc;2'
a
dx
+ x').
xntl
= -arccot
( n # -1).
n+1
1 x 1 =--arccot-+-In2 a 2a
aZ+x2 x2 '
21.5.4.5 Integrals with Inverse Hyperbolic Functions 512.
Arsinh dx = x Arsinh
513.
Arcosh
514.
1
515.
/ Arcoth
Artanh
dx = x Arcosh
-
d m .
z d m . -
x a dx = x Artanh - + - In(a2 - x'), a
2
x
a
a
2
dx = x Arcoth - + - ln(x2 - a').
(n # 1).
21.6 Definite Integrals
21.6.1 Definite Integrals of Trigonometric Functions
For natural numbers m, n:
1. ∫_0^(2π) sin nx dx = 0.   (21.1)
2. ∫_0^(2π) cos nx dx = 0.   (21.2)
3. ∫_0^(2π) sin nx cos mx dx = 0.   (21.3)
4. ∫_0^(2π) sin nx sin mx dx = 0 for m ≠ n, = π for m = n.   (21.4)
5. ∫_0^(2π) cos nx cos mx dx = 0 for m ≠ n, = π for m = n.   (21.5)
T
(21.6)
7a. ∫_0^(π/2) sin^α x cos^β x dx = (1/2) B((α+1)/2, (β+1)/2).   (21.7a)
B(x, y) = Γ(x)Γ(y)/Γ(x+y) denotes the beta function or Euler integral of the first kind, Γ(x) denotes the gamma function or Euler integral of the second kind (see 8.2.5, 6., p. 459). The formula (21.7a) is valid for arbitrary α and β; we use it, e.g., to evaluate integrals over (0, π/2) containing square roots of sin x, cos x or tan x.
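A hedged numerical cross-check of (21.7a) with Maple, for the arbitrarily chosen exponents α = 2, β = 4:
> int(sin(x)^2*cos(x)^4, x = 0..Pi/2);   # Pi/32
> (1/2)*Beta(3/2, 5/2);                  # (1/2)*B((2+1)/2, (4+1)/2), again Pi/32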
For positive integer cy, 3:
1 TI2
7b.
sinZat1 x coszDf1 x dx =
0
for a
X
9.
I---
cos ax dx
- cc
.!P!
2(.
+ p + l)!
< 0.
( a arbitrary).
(21.7b) '
(21.8)
(21.9)
0
'x
tan ax dx
0
11, ~ c o s n x - cXo s h x n
for a b dx = ln-a
< 0.
(21.10)
(21.11)
I for la1 < 1,
sin x cos ax
(21.12)
0
0 for ( a (> 1.
(21.13)
14.
7-
dx = iIe-labl (the sign is the same as the sign of b) , 2
0
(21.14)
(21.15)
(21.16)
/ sin(x2)dx / cos(x2)dx E.
tcc
17.
tm
=
=
(21.17)
-m
-CX
(21.18)
(21.19) 20,
7'
sinzzdz
1 k2
= -(K
odGmiG
-
E) for Ik/ < 1 .
(21.20)
Here, and in the following, E and K mean complete elliptic integrals (see 8.1.4.3, 2., p. 435):
E = E (k, 21.
j.;.
22,
1
);
, K =F
cosZxdx
J1 - k2sinZx
(k) 5) (see also the table of elliptic integrals 21.7, p. 1055). 1 kZ
= -[E - (1 - k z ) K].
cosaxdx 1 - 2 b c 0 s ~+ bz
-
nb" for integer a 2 0, Ibl < 1. 1 - b2
21.6.2 Definite Integrals of Exponential Functions (partially combined with algebraic, trigonometric, and logarithmic functions)
n! - -p+I
for a > 0, n = 0,1,2,. . . .
(21.21)
(21.22)
r ( n )denotes the gamma function (see 8.2.5, 6., p. 459); see also the table of the gamma function 21.8, p. 1057).
r (+)
P
for a > 0, n > -1,
dx = 2a(+)
24. /xne-"' 0
(2 1.24a)
(21.2413)
k! --
for n = 2 k + 1 ( k = O , 1 , 2 , . . .) , a > 0 .
2ak+1
/
(21.24~)
P
25.
J;; for a > 0 e-a2xzdx = 2a
0
(21.25)
m
J;; for a 26. /x2e-nazadx = 4a3
0
1
> 0.
(21.26)
X
27.
cos bx dx = J". e-b2/4a2 for a > 0 2a
e-""
0
(2 1.27)
(21.28)
(21.29)
30,
7%
sin x
1
dx = arccot a = arctan - for a > 0. U
0
In x dx = -C
31.
M
-0,5772
(21.30)
(21.31)
0
C denotes the Euler constant (see 8.2.5, 2.)p. 458).
21.6.3 Definite Integrals of Logarithmic Functions (combined with algebraic and trigonometric functions) 32. ]In
1 lnx I
dx = -C = -0,5772
(reduced to Nr. 21.31).
(21.32)
0
C is the Euler constant (see 8.2.5, 2., p. 458). 33.
12-i lnx
0
34.
1 0
R2
dx = 6
(reduced to Nr. 21.28)
(21.33)
dx = -- (reduced to Nr. 21.29) 12
(21.34)
R2
(21.35)
(21.36)
(2 1.37)
r ( z ) denotes the gamma function (see 8.2.5, 6., p. 459; see also the table of the gamma function 21.8, p. 1057).
38.
1
1
0
0
lnsinzdz =
lncosxdz = - p l n 2 . 2
(21.38)
n2 In 2 39. j z l n s i n z d z = -2 '
(21.39)
40. ~ s i n z l n s i n x d = r ln2 - 1.
(21.40)
0
0
41. j l n ( a
* bcosz) dz = Tln a +
0
42. ]ln(a2 - 2abcosz + bZ) dx = 0
2
fora 2 b.
(21.41)
P a l n a for (a 2 b > 0), 2alnb for (b 2 a > 0 ) .
43. ?'In tan z dz = 0.
(21.42)
(2 1.43)
0
(21.44)
21.6.4 Definite Integrals of Algebraic Functions 45. ]P(1 - z)P dx = 2 0
=
i
zZa+l(1 - " 2 ) B
0
dz =
r(cP+ l ) r ( p+ 1)
+ +
r ( a p 2)
B ( a t 1, /3 + l ) , (reduced to Nr. 21.7a).
(21.45)
B(z,y) = r(z)r(y) denotes the beta function (see 21.6.1, p. 1050) or the Euler integral of the first ~
r ( x Jr Y)
kind, r ( z ) denotes the gamma function (see 8.2.5, 6., p. 459) or the Euler integral of the second kind. W
46.
for a < 1.
(21.46)
I
=-?rcotan
fora
(21.47)
forO
(21.48)
xa-l
48. J = d x = A
bsin b
0
(21.49) r ( x ) denotes the gamma function (see 8.2.5, 6., p. 459; see also the table of the gamma function 21.8, p. 1057).
(21.50)
51.
7
dx - -a 1 + 2 x c o s a + x 2 sinx
@
(21.51)
21.7 Elliptic Integrals
21.7.1 Elliptic Integral of the First Kind F(φ, k), k = sin α
I" -
---
10
20
30
0.0000 0.1745 0.3491 0.5236 0.6981 0.8727 1.0472 1.2217 1.3963 1.5708
0.0000 0.1746 0.3493 0.5243 0.6997 0.8756 1.0519 1.2286 1.4056 1.5828
0.0000 0.1746 0.3499 0.5263 0.7043 0.8842 1.0660 1.2495 1.4344 1.6200
0.0000 0.1748 0.3508 0.5294 0.7116 0.8982 1.0896 1.2853 1.4846 1.6858
0
~
0 10 20 30 40 50 60 70 80 90
--
60
70
0.0000 0.1752 0.3545 0.5422 0.7436 0.9647 1.2126 1.4944 1.8125 2.1565
0.0000 0.1753 0.3555 0.5459 0.7535 0.9876 1.2619 1.5959 2.0119 2.5046
-
80
90
0.0000 0.1754 0.3561 0.5484 0.7604 1.0044 1.3014 1.6918 2.2653 3.1534
0.0000 0.1754 0.3564 0.5493 0.7629 1.0107 1.3170 1.7354 2.4362
--
30
21.7.2 Elliptic Integral of the Second Kind E(φ, k), k = sin α
'I"
0 -0 10 20 30 40 50 60 70 80 90
-
0.0000 0.1745 0.3491 0.5236 0.6981 0.8727 1.0472 1.2217 1.3963 1.5708
10
I
-- -20
30
40
50
60
70
0.0000 0.1743 0.3473 0.5179 0.6851 0.8483 1.0076 1.1632 1.3161 1.4675
0.0000 0.1742 0.3462 0.5141 0.6763 0.8317 0.9801 1.1221 1.2590 1.3931
0.0000 0.1740 0.3450 0.5100 0.6667 0.8134 0.9493 1.0750 1.1926 1.3055
0.0000 0.1739 0.3438 0.5061 0.6575 0.7954 0.9184 1.0266 1.1225 1.2111
0.0000 0.1738 0.3429 0.5029 0.6497 0.7801 0.8914 0.9830 1.0565 1.1184
1
80
90
0.0000 0.1737 0.3422 0.5007 0.6446 0.7697 0.8728 0.9514 1.0054 1.0401
0.0000 0.1736 0.3420 0.5000 0.6428 0.7660 0.8660 0.9397 0.9848 1.0000
21.7.3 Complete Elliptic Integrals K and E, k = sin α (the columns repeat the pattern α, K, E)
K
E
-2.1565 1.2111 2.1842 1.2015 2.2132 1.1920 2.2435 1.1826 2.2754 1.1732
0 1 2 3 4
1.5708 1.5709 1.5713 1.5719 1.5727
1.5708 1.5707 1.5703 1.5697 1.5689
30 31 32 33 34
1.6858 1.6941 1.7028 1.7119 1.7214
1.4675 1.4608 1.4539 1.4469 1.4397
60 61 62 63 64
5
1.5738 1.5751 1.5767 1.5785 1.5805
1.5678 1.5665 1.5649 1.5632 1.5611
35
1.7312 1.7415 1.7522 1.7633 1.7748
1.4323 1.4248 1.4171 1.4092 1.4013
65
66 67 68 69
2.3088 2.3439 2.3809 2.4198 2.4610
1.1638 1.1545 1.1453 1.1362 1.1272
1.5828 1.5854 1.5882 1.5913 1.5946
1.5589 1.5564 1.5537 1.5507 1.5476
40
1.7868 1.7992 1.8122 1.8256 1.8396
1.3931 1.3849 1.3765 1.3680 1.3594
70 71 72 73 74
2.5046 2.5507 2.5998 2.6521 2.7081
1.1184 1.1096 1.1011 1.0927 1.0844
1.5981 1.6020 1.6061 1.6105 1.6151
1.5442 1.5405 1.5367 1.5326 1.5283
45
1.8541 1.8691 1.8848 1.9011 1.9180
1.3506 1.3418 1.3329 1.3238 1.3147
75
76 77 78 79
2.7681 2.8327 2.9026 2.9786 3.0617
1.0764 1.0686 1.0611 1.0538 1.0468
1.6200 1.6252 1.6307 1.6365 1.6426
1.5238 1.5191 1.5141 1.5090 1.5037
50
1.9356 1.9539 1.9729 1.9927 2.0133
1.3055 1.2963 1.2870 1.2776 1.2681
80 81 82 83 84
3.1534 3.2553 3.3699 3.5004 3.6519
1.0401 1.0338 1.0278 1.0223 1.0172
1.6490 1.6657 1.6627 1.6701 1.6777
1.4981 1.4924 1.4864 1.4803 1.4740
55
2.0347 2.0571 2.0804 2.1047 2.1300
1.2587 1.2492 1.2397 1.2301 1.2206
85
3.8317 4.0528 4.3387 4.7427 5.4349 m
1.0127 1.0080 1.0053 1.0026 1.0008 1.0000
6 7 8 9 10 11
12 13 14 15
16 17 18 19 20
21 22 23 24 25
26 27 28 29
- --
36 37 38 39 41 42 43 44 46 47 48 49 51 52 53 54 56 57 58 59
86 87 88 89 90
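The tabulated values can presumably be reproduced with Maple's built-in complete elliptic integrals, which take the modulus k = sin α as argument; a brief check for α = 30°:
> evalf(EllipticK(sin(Pi/6)));   # 1.6858
> evalf(EllipticE(sin(Pi/6)));   # 1.4675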
21.8 Gamma Function X
X
X
X
1.00 01 02 03 04
1.00000 1.25 0.90640 1.50 0.88623 26 0.90440 51 0.88659 0.99433 27 0.90250 52 0.88704 0.98884 28 0.90072 53 0.88757 0.98355 29 0.89904 54 0.88818 0.97844
1.75 76 77 78 79
0.91906 0.92137 0.92376 0.92623 0.92877
1.05 06 07 08 09
0.97350 0.96874 0.96415 0.95973 0.95546
0.89747 1.55 0.88887 56 0.88964 0.89600 0.89464 57 0.89049 0.89338 58 0.89142 0.89222 59 0.89243
1.80 81 82 83 84
0.93138 0.93408 0.93685 0.93969 0.94261
1.30 31 32 33 34
1.10 0.95135 1.35 0.89115 1.60 0.89352 1.85 0.94561 11 0.94740 61 0.89468 86 0.94869 36 0.89018 62 0.89592 87 0.95184 12 0.94359 37 0.88931 63 0.89724 88 0.95507 38 0.88854 13 0.93993 64 0.89864 89 0.95838 39 0.88785 14 0.93642 0.90012 1.90 0.96177 0.90167 91 0.96523 0.90330 92 0.96877 0.90500 93 0.97240 94 0.97610 0.90678
1.15 16 17 18 19
0.93304 1.40 0.88726 41 0.88676 0.92980 0.92670 42 0.88636 0.92373 43 0.88604 44 0.88581 0.92089
1.20 21 22 23 24
0.91817 1.45 0.88566 1.70 0.90864 1.95 0.97988 46 0.88560 71 0.91057 96 0.98374 0.91558 72 0.91258 97 0.98768 47 0.88563 0.91311 98 0.99171 48 0.88575 73 0.91467 0.91075 49 0.88592 74 0.91683 99 0.99581 0.90852
1.25 0.90640
1.65 66 67 68 69
1.50 0.88623 1.75 0.91906 2.00 1.00000
The values of the gamma function for x < 1 (x ≠ 0, -1, -2, ...) and x > 2 can be calculated with the formulas Γ(x) = Γ(x+1)/x and Γ(x) = (x-1)·Γ(x-1).
A: Γ(0.7) = Γ(1.7)/0.7 = 0.90864/0.7 = 1.2981.
B: Γ(3.5) = 2.5·Γ(2.5) = 2.5·1.5·Γ(1.5) = 2.5·1.5·0.88623 = 3.32336.
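The same values follow directly from Maple's built-in gamma function, e.g.:
> evalf(GAMMA(0.7));   # 1.2981
> evalf(GAMMA(3.5));   # 3.3234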
21.9 Bessel Functions (Cylindrical Functions) X
0.0 +1.0000 0.1 0.9975 0.2 0.9900 0.9776 0.3 0.4 0.9604 0.5 $0.9385 0.9120 0.6 0.7 0.8812 0.8 0.8463 0.8075 0.9 1.0 +0.7652 1.1 0.7196 1.2 0.6711 1.3 0.6201 1.4 0.5669 1.5 t0.5118 0.4554 1.6 1.7 0.3980 0.3400 1.8 0.2818 1.9 2.0 t0.2239 2.1 0.1666 0.1104 2.2 0.0555 2.3 2.4 0.0025 2.5 -0.0484 0.0968 2.6 0.1424 2.7 2.8 0.1850 0.2243 2.9 3.0 -0.2601 0.2921 3.1 0.3202 3.2 0.3443 3.3 3.4 0.3643 3.5 -0.3801 0.3918 3.6 3.7 0.3992 0.4026 3.8 3.9 0.4018 4.0 -0.3971 4.1 0.3887 4.2 0.3766 4.3 0.3610 4.4 0.3423 4.5 -0.3205 0.2961 4.6 4.7 0.2693 0.2404 4.8 4.9 0.2097
+o.oooo
0.0499 0.0995 0.1483 0.1960 +0.2423 0.2867 0.3290 0.3688 0.4059 +0.4401 0.4709 0.4983 0.5220 0.5419 +0.5579 0.5699 0.5778 0.5815 0.5812 +0.5767 0.5683 0.5560 0.5399 0.5202 f0.4971 0.4708 0.4416 0.4097 0.3754 t0.3391 0.3009 0.2613 0.2207 0.1792 t0.1374 0.0955 0.0538 t0.0128 -0.0272 -0.0660 0.1033 0.1386 0.1719 0.2028 -0.2311 0.2566 0.2791 0.2985 0.3147
-m
-1.5342 1.0181 0.8073 0.6060 -0.4445 0.3085 0.1907 -0.0868 +0.0056 +0.0883 0.1622 0.2281 0.2865 0.3379 $0.3824 0.4204 0.4520 0.4774 0.4968 10.5104 0.5183 0.5208 0.5181 0.5104 +0.4981 0.4813 0.2605 0.4359 0.4079 +0.3769 0.3431 0.3070 0.2691 0.2296 +0.1890 0.1477 0.1061 0.0645 t0.0234 -0.016s 0.0561 0.0938 0.129€ 0.1632 -0.1947 0.2235 0.2494 0.2723 0.2921
--03
-6.4590 3.3238 2.2931 1.7809 -1.4715 1.2604 1.1032 0.9781 0.8731 -0.7812 0.6981 0.6211 0.5485 0.4791 -0.4123 0.3476 0.2847 0.2237 0.1644 -0.1070 -0.0517 +0.0015 0.0523 0.1005 +0.1459 0.1884 0.2276 0.2635 0.2959 t0.3247 0.3496 0.3707 0.3879 0.4010 +0.4102 0.4154 0.4167 0.4141 0.4078 $0.3979 0.3846 0.3680 0.3484 0.3260 +0.3010 0.2737 0.2445 0.2136 0.1812
+1.000 1.003 1.010 1.023 1.040 1.063 1.092 1.126 1.167 1.213 1.266 1.326 1.394 1.469 1.553 1.647 1.750 1.864 1.990 2.128 2.280 2.446 2.629 2.830 3.049 3.290 3.553 3.842 4.157 4.503 4.881 5.294 5.747 6.243 6.785 7.378 8.028 8.739 9.517 10.37 11.30 12.32 13.44 14.67 16.01 17.48 19.09 20.86 22.79 24.91
0.0000 +0.0501 0.1005 0.1517 0.2040 0.2579 0.3137 0.3719 0.4329 0.4971 0.5652 0.6375 0.7147 0.7973 0.8861 0.9817 1.085 1.196 1.317 1.448 1.591 1.745 1.914 2.098 2.298 2.517 2.755 3.016 3.301 3.613 3.953 4.326 4.734 5.181 5.670 6.206 6.793 7.436 8.140 8.913 9.759 10.69 11.71 12.82 14.05 15.39 16.86 18.48 20.25 22.20
m
2.4271 1.7527 1.3725 1.1145 0.9244 0.7775 0.6605 0.5653 0.4867 0.4210 0.3656 0.3185 0.2782 0.2437 0.2138 0.1880 0.1655 0.1459 0.1288 0.1139 0.1008 0.08927 0.07914 0.07022 0.06235 0.05540 0.04926 0.04382 0.03901 3.03474 3.03095 3.02759 3.02461 3.02196 3.01960 3.01750 3.01563 3.01397 3.01248 0.01116 0.00998C 0.008927 0.007988 0.007146 0.00640C 0.00573C 0.005132 0.004595 0.004119
oc;
9.8538 4.7760 3.0560 2.1844 1.6564 1.3028 1.0503 0.8618 0.7165 0.6019 0.5098 0.4346 0.3725 0.3208 0.2774 0.2406 0.2094 0.1826 0.1597 0.1399 0.1227 0.1079 0.09498 0.08372 0.07389 0.06528 0.05774 0.05111 0.04529 0.04016 0.03563 0.03164 0.02812 0.02500 0.02224 0.01979 0.01763 0.01571 0.01400 0.01248 0.01114 0.00993E 0.008872 0.00792: 0.007078 0.006325 0.005654 0.00505~ 0.004521
X
5.0
5.1 5.2 5.3 5.4
5.5
5.6 5.7 5.8 5.9
6.0
6.1 6.2 6.3 6.4
6.5
6.6 6.7 6.8 6.9
7.0
7.1 7.2 7.3 7.4
7.5
7.6 7.7 7.8 7.9 8.0 8.1 8.2 8.3 8.4 8.5
8.6 8.7 8.8 8.9
9.0
9.1 9.2 9.3 9.4
9.5
9.6 9.7 9.8 9.9
10.0
-0.1776 0.1443 0.1103 0.0758 0.0412 -0.0068 $0.0270 0.0599 0.0917 0.1220 +0.1506 0.1773 0.2017 0.2238 0.2433 +0.2601 0.2740 0.2851 0.2931 0.2981 $0.3001 0.2991 0.2951 0.2882 0.2786 t0.2663 0.2516 0.2346 0.2154 0.1944 t0.1717 0.1475 0.1222 0.0960 0.0692 t0.0419 t0.0146 -0.0125 0.0392 0.0653 -0.0903 0.1142 0.1367 0.1577 0.1768 -0.1939 0.2090 0.2218 0.2323 0.2403 -0.2459
-0.3276 0.3371 0.3432 0.3460 0.3453 -0.3414 0.3343 0.3241 0.3110 0.2951 -0.2767 0.2559 0.2329 0.2081 0.1816 -0.1538 0.1250 0.0953 0.0652 0.0349 -0.0047 +0.0252 0.0543 0.0826 0.1096 t0.1352 0.1592 0.1813 0.2014 0.2192 t0.2346 0.2476 0.2580 0.2657 0.2708 +0.2731 0.2728 0.2697 0.2641 0.2559 +0.2453 0.2324 0.2174 0.2004 0.1816 t0.1613 0.1395 0.1166 0.0928 0.0684 f0.0435
-0.3085 0.3216 0.3313 0.3374 0.3402 -0.3395 0.3354 0.3282 0.3177 0.3044 -0.2882 0.2694 0.2483 0.2251 0.1999 -0.1732 0.1452 0.1162 0.0864 0.0563 -0.0259 t0.0042 0.0339 0.0628 0.0907 t0.1173 0.1424 0.1658 0.1872 0.2065 t0.2235 0.2381 0.2501 0.2595 0.2662 t0.2702 0.2715 0.2700 0.2659 0.2592 t0.2499 0.2383 0.2245 0.2086 0.1907 t0.1712 0.1502 0.1279 0.1045 0.0804 t0.0557
t0.1479 0.1137 0.0792 0.0445 tO.O1O1 -0.0238 0.0568 0.0887 0.1192 0.1481 -0.1750 0.1998 0.2223 0.2422 0.2596 -0.2741 0.2857 0.2945 0.3002 0.3029 -0.3027 0.2995 0.2934 0.2846 0.2731 -0.2591 0.2428 0.2243 0.2039 0.1817 -0.1581 0.1331 0.1072 0.0806 0.0535 -0.0262 tO.OO1l 0.0280 0.0544 0.0799 t0.1043 0.1275 0.1491 0.1691 0.1871 t0.2032 0.2171 0.2287 0.2379 0.2447 t0.2490
27.24 29.76 32.58 35.6; 39.01 42.66 46.74 51.li 56.04 61.38 67.23 73.6E 80.72 88.46 96.96 106.3 116.5 127.8 140.1 153.7 168.6 185.0 202.9 222.7 244.3 268.2 294.3 323.1 354.7 389.4 427.6 469.5 515.6 566.3 621.9 683.2 750.5 824.4 905.8 995.2 1094 1202 1321 1451 1595 1753 1927 2119 2329 2561 2816 -
24.34 26.68 29.2: 32.08 35.18 38.56 42.33 46.44 50.95 55.9c 61.34 67.32 73.85 81.1C 89.03 97.74 107.3 117.8 129.4 142.1 156.0 171.4 188.3 206.8 227.2 249.6 274.2 301.3 331.1 363.9 399.9 439.5 483.0 531.0 583.7 641.6 705.4 775.5 852.7 937.5 1031 1134 1247 1371 1508 1658 1824 2006 2207 2428 2671 -
0.00 3691 3308 2966 2659 2385 2139 1918 1721 1544 1386 1244 1117 1003 09001 08083 07259 06520 05857 05262 04728 04248 03817 03431 03084 02772 02492 02240 02014 01811 01629 01465 01317 01185 01066 009588 008626 007761 006983 006283 005654 005088 004579 004121 003710 003339 003036 002706 002436 002193 001975 001778
0.00 4045 3619 3239 2900 2597 2326 2083 1866 1673 1499 1344 1205 1081 09691 08693 07799 06998 06280 05636 05059 04542 04078 03662 03288 02953 02653 02383 02141 01924 01729 01554 01396 01255 01128 01014 009120 308200 307374 306631 305964 305364 304825 304340 303904 303512 303160 302843 302559 302302 302072 301865
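Reading the columns from left to right as J0(x), J1(x), Y0(x), Y1(x), I0(x), I1(x), K0(x), K1(x) (this assignment is inferred from the values), the table can be cross-checked with Maple's Bessel functions:
> evalf(BesselJ(0, 1.0)), evalf(BesselY(0, 1.0));   # 0.7652, 0.0883
> evalf(BesselI(0, 1.0)), evalf(BesselK(0, 1.0));   # 1.2661, 0.4210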
21.10 Legendre Polynomials of the First Kind
P0(x) = 1;   P1(x) = x;
P2(x) = (1/2)(3x^2 - 1);   P3(x) = (1/2)(5x^3 - 3x);
P4(x) = (1/8)(35x^4 - 30x^2 + 3);   P5(x) = (1/8)(63x^5 - 70x^3 + 15x);
P6(x) = (1/16)(231x^6 - 315x^4 + 105x^2 - 5);   P7(x) = (1/16)(429x^7 - 693x^5 + 315x^3 - 35x).
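The table below lists, presumably, the values of P2(x) through P7(x) for x = 0(0.05)1; both the polynomials and the values can be checked with Maple's orthopoly package:
> with(orthopoly):
> P(4, x);            # 35/8*x^4 - 15/4*x^2 + 3/8
> evalf(P(4, 0.5));   # -0.2891, cf. the table entry in the P4 column at x = 0.50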
x =S ( x ) 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.70 0.75 0.80 0.85 0.90 0.95 1.00
-0.3000 -0.4962 -0.4850 -0.4662 -0.4400 -0.4062 -0.3650 -0.3162 -0.2600 -0.1962 -0.1250 -0.0462 +0.0400 0.1338 0.2350 0.3438 0.4600 0.5838 0.7150 0.8538 1.0000
0.0000 -0.0747 -0.1475 -0.2166 -0.2800 -0.3359 -0.3825 -0.4178 -0.4400 -0.4472 -0.4375 -0.4091 -0.3600 -0.2884 -0.1925 -0.0703 +0.0800 0.2603 0.4725 0.7184 1.0000
0.3750 0.3657 0.3379 0.2928 0.2320 0.1577 $0.0729 -0.0187 -0.1130 -0.2050 -0.2891 -0.3590 -0.4080 -0.4284 -0.4121 -0.3501 -0.2330 -0.0506 +0.2079 0.5541 1.0000
0.0000 0.0927 0.1788 0.2523 0.3075 0.3397 0.3454 0.3225 0.2706 0.1917 +0.0898 -0.0282 -0.1526 -0.2705 -0.3652 -0.4164 -0.3995 -0.2857 -0.0411 +0.3727 1.0000
-0.3125 -0.2962 -0.2488 -0.1746 -0.0806 +0.0243 0.1292 0.2225 0.2926 0.3290 0.3232 0.2708 0.1721 $0.0347 -0.1253 -0.2808 -0.3918 -0.4030 -0.2412 $0.1875 1.oooo
0.0000 -0.1069 -0.1995 -0.2649 -0.2935 -0.2799 -0.2241 -0.1318 -0.0146 +0.1106 0.2231 0.3007 0.3226 0.2737 +0.1502 -0.0342 -0.2397 -0.3913 -0.3678 $0.0112 1.0000
21.11 Laplace Transformation (see 15.2.1.1, p. 708)
F(p) = ∫_0^∞ e^(-pt) f(t) dt,   f(t) = 0 for t < 0.
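Rows of the following correspondence table F(p) ↔ f(t) can be reproduced with Maple's integral transform package, which uses the same convention; a minimal sketch:
> with(inttrans):
> laplace(sin(a*t), t, p);   # a/(p^2 + a^2)
> laplace(t^2, t, p);        # 2/p^3, i.e. (n-1)!/p^n with n = 3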
C is the Euler constant: C = 0.577216 (see 8.2.5, 2., p. 458).
No.
-
1
2 3
0
0
-1 P 1 -
1 in-1
(n- I)! tn-1
Pn 1
4
( n - I)! ePt
5
eQt
- eat
P-. BePt - cue"
6
P-. e-"t
7
d~
8
i n at
9
jin(.t
10
+ P)
P
p2
t 2ap + P 2
p2
+
P
11
12
sin
:os a t .2
p c o s p - cusinp p2 t QZ
:os(& t p)
13
jinh cut
14
:osh cut
15 -
t
No.
-
16 17 18
( a - a)2 a sin Bt - p sin at
19
aP(ff2- P') cos Pt - cos at
20
(a2- PZ)
21
cos2 at
22
sin2 at
23
cosh' at
24
2a2 P(P2 - 4 4
sinh' at
25
2dp p4 + 4a4
sin at . sinh at
26 27 28 29 30 31 -
+
a(p2 2a2)
p4 t 4a4 a(p2 - 2 2 ) p4 + 4a4
sin at . cosh at cos a t . sinh at
P3
p4
+ 4a4 ffP
(p2
+ a')'
t .
- sin at 2
t
- sinh
2
at
B sinh at - a sinh Pt a2 - p2
No. P
32 33 34 35 36
n! 4n in-+ -(2n)!J;;
1
Pnfi 1
(n> 0 , integer)
m
37 sin at -
38
t
g gcos sin at
39
at
40
E s i n , rrt
41
gcosh
42
m
44
.)m m
46
at
1
43
45
m
1
(P+ P
1
dF=
&(at) (Bessel function of order 0, p. 507)
1
47
Io(rrt) (modified Bessel function of order 0, p. 507)
No. 48 49 50
51
a
arctan P
t 2 - sin at . cos Pt t
52 53
arctan
54
1nP P
55 56
P2-ffP+P
a8
eat 2 sin Pt t -C - l n t
Inp
$(n) = - t
pn+l
(lnt+C)’--
P
1(,Pt
57
-1 + ... + -1- C 2
T2
6
- eat)
t
58
a P+ff In -= 2artanhP-ff P
59
p2 a2 In P2 + P2
60
pz - cy2 In P2 - P2
+
2 . - sinh at t
2.
cos pt - cos f f t
t
e-a2/4t
61 1
62
--e-Qfi Rea20
63
(m - P)” m , R e v > -1
~
v+
a”Jy(at) (see Bessel function, p. 507)
-
No. 64
(P -
m)” , R e v > -1
Cy”l,,(at) (see Bessel function, p. 507) 0 fort < p 1 fort > ,8
65
66
67
68
69
70
71
for t < p
- e-P&3
e-PP
for t < p
Pa -
72 1 - e-QP
73 74
P
e-QP - ,-PP
P
(-1
a t2 -
0 for t > Q 1 for 0 < t < Q 0 for 0 < t < cy 1 for Q < t < /3 0 for t > p
2
fort > p
21.12 Fourier Transformation
The symbols in the table are defined in the following way:
C: Euler constant (C = 0.577215...);
Γ(z) = ∫_0^∞ e^(-t) t^(z-1) dt, Re z > 0 (gamma function, see 8.2.5, 6., p. 459);
J_ν(z) = Σ_(n=0)^∞ (-1)^n (z/2)^(ν+2n) / (n! Γ(ν+n+1)) (Bessel functions, see 9.1.2.6, 2., p. 507);
I_ν(z) (modified Bessel functions, see 9.1.2.6, 3., p. 507);
C(x), S(x) (Fresnel integrals, see 14.4.3.2, 5., p. 695);
Si(x) = ∫_0^x (sin t)/t dt, si(x) = Si(x) - π/2 (integral sine, see 14.4.3.2, 2., p. 694);
Ci(x) = -∫_x^∞ (cos t)/t dt (integral cosine, see 14.4.3.2, 2., p. 694).
The abbreviations for functions occurring in the table correspond to those introduced in the corresponding chapters.
21.12.1 Fourier Cosine Transformation   F_c(ω) = ∫_0^∞ f(t) cos(tω) dt
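Maple's inttrans package presumably provides fouriercos with the symmetric normalization (2/π)^(1/2) ∫_0^∞ f(t) cos(tω) dt; if so, its results must be multiplied by (π/2)^(1/2) to match the convention of this table. A hedged sketch:
> with(inttrans):
> sqrt(Pi/2)*fouriercos(exp(-a*t), t, w) assuming a > 0;   # a/(a^2 + w^2)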
O2 3.
~
p' t '
1
0.
sin(a i d )
-) w-2 id
4 cos w sin* ( 2
Oa
t>a
-Ci(aw)
4
00
t>a
+ t)-'
( a - t)-'
e 11.
:a2
~
O
(a
10.
= 1 f ( t ) c o s ( t w ) dt
FJW)
a>O
,
[- si (aw) sin (aw) - Ci ( a u )cos ( m ) ]
a>O A e-aw _
+ t2)-'
2
a
--sin(au) A
2
b b2 + ( a - t ) 2
b
+
b2
I2+ ( a t ) 2 + b2
(a2 - P - i ,
0,
A
e-& cos (au)
A
e-bw sin(au)
a-t
a+t
+
+ ( a t t)2 + (a
-
t)'
w
0< t < a t>a
t-",
0 < Rev < 1
~
a
,-at
a' ?-bt
- e-at
t
.-at
-
fi
+ w2
I
m
F,(w) = J f ( t ) c o s ( t w ) d t
No. 20.
tn e-at
21.
tu-1
22.
-
23.
e-at2
e-at
1 (-1 - -1 + -) 1 t 2 t et-1
1 --ln(12
e-2aw)
2
24.
25.
s e- ot
t-2
In t
26.
27.
,
0,
Ol
Si (w)
--
w
-F
In t -
2w ( C + f + l n ? w )
&
~
7rl - - (sin (aw) Ci (w) - cos (w) si (w))
28.
2 a
29.
(t2- aZ)-' In ( b t )
30.
-
31.
In lb-tl
32. -
1
t
a1 - - {sin (aw) [Ci (au) - In (ab)]- cos (w) si(w)} 2 a
In (1 + t )
a+t
:-at
In t
1 [f [cos (bw)- cos (aw)] W +cos (bw)Si (bw)+ cos (w) Si (aw) -sin (awl -
ci (awl - sin ( b ~ c)i (bw))
a 2 [a. + ln (a2 + + w arctan 0 a2 i- w2 1 w2j
21.12
FOUTaeT
Ransformation
Q3
F,(w) = J f ( t ) c o s ( t w ) d t
No.
0
T
33.
-
W
34. - 2 si(aw) ~
35.
36.
In ( a Zt t')
d7TF
37.
38.
1 - cos (aw)
71
W
R
39.
40.
41.
sin (at) ~
t
45.
2 -
sin (at) t ( P + bZ)
t
w
e-ab
t sin (ut) t2 + b2
42. e-bt sin (at) e - t sin t 43.
44.
2 ' R 4'
2
cosh (bw),
w
e-& sinh (ab) ,
w>a
T b-'
(1 - e-ab cosh (bw)),
w
T b-'
e-& sinh (ab) ,
w>a
2 2
1 2 - arctan ( 2 ) 2
sin' (at)
t
4
sin (at) sin@)
t
1
(a+b)'- wz
5 In~(a-b)Z-wzl
1069
-
46.
3
sin2 ( a t ) t2
p ),
(a- 1
0,
-
47.
m
= J f ( t ) cos(tw)dt
F&)
No.
1 - { (w 8
sin3 ( a t ) t2
w
< 2a
w
> 2a
+ 3a) In (w t 3a)
+(w - 3a) In Iw - 3a/ - (w -(w - a ) In ~w - a i }
-
?! ( 3 2 - w') ,
O<w
8
48.
iwz,
sin3 ( a t )
16 0:
49.
(3a-w)2,
a<w
< 3a
w
> 3a
1 - cos ( a t )
t
-
50.
w=a
t3
2
1 - cos (at) t2
-
51.
T e-ab c o s h ( h ) , 2 b T e-& cosh(ab) , 2 b
cos ( a t ) b2 + t Z
52.
:-bt
1 b2 + ( a - d)'
cos(at)
-
a'
1 & Te - -
53.
54. -
t +
cot (at) 55. b2 t2 -
w>a
+ b2 + (a + w)*
+ w2 4b
w
cosh
(E)
+ eZab)-l
R
cosh (bw)(1
R
cosh (h) (ezab- l)-'
+ a ) In (w + a )
21.12 Fourier Transformation 1071 -
I F&)
No.
:E
56.
sin (at')
57.
sin [a(l - t')]
58. 59.
60.
= b ( t ) cos(to)dt
~
(g)
(cos
-sin
(a+"+jl) 4 4a
-iEc0s
sin (at')
t' sin (at')
t
e-
sin (bt') '
61.
(z))
sin
[
arctan
(a) 1-"
- 4 (a'
+ b2)
LF (g)I ) : (
cos (at')
2
2a [cos
+sin
62. cos [a(l - t')] -
63.
e-
cos ( bt') cos
64.
-
- - arctan 2
(a)]
1t sin j:( 1 . - sin
(:)
1 67. - cos -
(;)
65.
~
[?(ap: b2)
4
66.
4
68.
17T [cos ( 2 2 2w
- sin ( 2
+e - 2 6 1
m
No.
F,(w) = ./ f ( t ) cos(t w ) dt
69.
2~
70.
[. (g)
sin
ecbt sin ( a d )
.xcos
t
72.
1 - cos ( a 4)
73.
e- at -c o s ( b 4 )
-S
2
sin
4
fi
74.
75.
e- a Ji -[cos ( a 4)- sin ( a d ) ]
d
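A typical entry of such a table, for instance the pair f(t) = e^{−at} with F_c(ω) = a/(a² + ω²), can be verified by direct numerical integration. The following is a minimal sketch assuming SciPy is available for the quadrature; the parameter values are arbitrary.

```python
# Numerical check of the Fourier cosine transform of f(t) = exp(-a*t):
#   F_c(w) = integral_0^oo exp(-a*t) * cos(t*w) dt = a / (a**2 + w**2)
import numpy as np
from scipy.integrate import quad

a, w = 1.3, 2.0
numeric, _ = quad(lambda t: np.exp(-a * t) * np.cos(t * w), 0, np.inf)
closed_form = a / (a**2 + w**2)
print(numeric, closed_form)   # both approximately 0.2285
```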
21.12.2 Fourier Sine Transformation

$F_s(\omega) = \int_0^\infty f(t)\sin(t\omega)\,dt$

[Table of numbered Fourier sine transform pairs (Nos. 1 to 62). The original functions f(t) include pieces constant on 0 < t < a, rational functions such as $t/(a^2+t^2)$ and $b/(b^2+(a\pm t)^2)$, powers $t^{\nu-1}$ and $t^n e^{-at}$, exponentials $e^{-at}$, logarithms, and products with $\sin(at)$, $\cos(at)$, $\sin(at^2)$, $\cos(at^2)$; the transforms $F_s(\omega)$ involve exponentials $e^{-a\omega}$, hyperbolic functions, arctangents, logarithms, the integral sine and cosine Si, si, Ci, and the Euler constant C.]
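The sine-transform entries can be checked in exactly the same way; for example f(t) = e^{−at} has F_s(ω) = ω/(a² + ω²). A minimal sketch, again assuming SciPy:

```python
# Numerical check of the Fourier sine transform of f(t) = exp(-a*t):
#   F_s(w) = integral_0^oo exp(-a*t) * sin(t*w) dt = w / (a**2 + w**2)
import numpy as np
from scipy.integrate import quad

a, w = 1.3, 2.0
numeric, _ = quad(lambda t: np.exp(-a * t) * np.sin(t * w), 0, np.inf)
print(numeric, w / (a**2 + w**2))   # both approximately 0.3515
```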
21.12.3 Fourier Transformation

$F(\omega) = \int_{-\infty}^{\infty} e^{-\mathrm{i}\omega t} f(t)\,dt$

[Table of numbered Fourier transform pairs, including: the Dirac δ function $\delta(t)$ with $F(\omega)=1$; $\delta^{(n)}(t-a)$ with $F(\omega)=(\mathrm{i}\omega)^n e^{-\mathrm{i}\omega a}$ ($n=0,1,2,\dots$); the constant 1 and the powers $t^n$; the Heaviside unit step function $H(t)=1$ for $t>0$, $H(t)=0$ for $t<0$, with $F(\omega)=\pi\delta(\omega)+1/(\mathrm{i}\omega)$; $t^n H(t)$; $e^{-at}H(t)$ with $F(\omega)=1/(a+\mathrm{i}\omega)$; $1/(t^2+a^2)$; the rectangular pulse $H(t+a)-H(t-a)$ with $F(\omega)=2\sin(a\omega)/\omega$; $e^{\mathrm{i}at}$ with $F(\omega)=2\pi\delta(\omega-a)$; $\cos(at)$ with $F(\omega)=\pi[\delta(\omega+a)+\delta(\omega-a)]$; $\sin(at)$; $1/\cosh t$; $1/\sinh t$ with $F(\omega)=-\mathrm{i}\pi\tanh(\pi\omega/2)$; and $\sin(at^2)$, $\cos(at^2)$ (a > 0).]
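For the two-sided transform $F(\omega)=\int e^{-\mathrm{i}\omega t}f(t)\,dt$ an elementary check is f(t) = e^{−|t|}, whose transform is 2/(1 + ω²). The sketch below (SciPy assumed) integrates the real and imaginary parts directly; the imaginary part vanishes because f is even.

```python
# Numerical check of F(w) = integral_-oo^oo exp(-i*w*t) * f(t) dt for f(t) = exp(-|t|):
#   F(w) = 2 / (1 + w**2)
import numpy as np
from scipy.integrate import quad

w = 1.7
re, _ = quad(lambda t: np.exp(-abs(t)) * np.cos(w * t), -np.inf, np.inf)
im, _ = quad(lambda t: -np.exp(-abs(t)) * np.sin(w * t), -np.inf, np.inf)
print(re, im, 2 / (1 + w**2))   # re ~ 0.5141, im ~ 0, closed form ~ 0.5141
```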
21.12.4 Exponential Fourier Transformation

Although the exponential Fourier transformation $F_e(\omega)$ can be represented by the Fourier transformation $F(\omega)$ according to (15.77), i.e., $F_e(\omega) = \frac{1}{2}F(-\omega)$, here we give some direct transforms.

[Table of exponential Fourier transform pairs, including: 1. f(t) = A for a < t < b, f(t) = 0 otherwise; 2. f(t) = t^n for 0 ≤ t ≤ b, f(t) = 0 otherwise (n = 1, 2, ...); and pairs with Re ν > 0 whose transforms are $\omega^{\nu-1}e^{-\omega}$ for ω > 0 and 0 for ω < 0.]
21.13 Z Transformation

For the definition see 15.4.1.2, p. 732, for rules of calculation see 15.4.1.3, p. 733, for inverses see p. 735.

[Table of numbered Z transform pairs: original sequences $f_n$ (for example 1, n, n², n³, $e^{an}$, $a^n$, $a^n/n!$, $n\,a^n$, $n^2 a^n$, binomial coefficients, $\sin bn$, $\cos bn$, $e^{an}\sin bn$, $e^{an}\cos bn$, $\sinh bn$, $\cosh bn$, $a^n\sinh bn$, $a^n\cosh bn$, 1/n!, 1/(2n+1)!, $(n+1)e^{an}$, and sequences defined separately on even and odd indices), their transforms $F(z)$ (for example $z/(z-1)$ for $f_n = 1$, $z/(z-1)^2$ for $f_n = n$, $z/(z-a)$ for $f_n = a^n$, $z\sin b/(z^2 - 2z\cos b + 1)$ for $f_n = \sin bn$), and the corresponding regions of convergence.]
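Entries of a Z-transform table can be checked from the defining series $F(z)=\sum_{n\ge 0} f_n z^{-n}$ inside the region of convergence. The sketch below, in plain Python with no extra libraries, sums the series for $f_n = a^n$ at a point with |z| > |a| and compares it with the closed form z/(z − a); the numerical values are arbitrary.

```python
# Partial sums of the Z transform F(z) = sum_{n>=0} f_n * z**(-n) for f_n = a**n.
# Inside the region of convergence |z| > |a| the series converges to z / (z - a).
a, z = 0.6, 2.0

partial = sum((a ** n) * z ** (-n) for n in range(200))
closed_form = z / (z - a)
print(partial, closed_form)   # both approximately 1.428571
```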
21.14 Poisson Distribution

For the formula of the Poisson distribution see 16.2.3.3, p. 755. For each parameter value x the table lists the probabilities P(X = k) for k = 0, 1, 2, ...

x = 0.1: 0.904837 0.090484 0.004524 0.000151 0.000004
x = 0.2: 0.818731 0.163746 0.016375 0.001091 0.000055 0.000002
x = 0.3: 0.740818 0.222245 0.033337 0.003334 0.000250 0.000015 0.000001
x = 0.4: 0.670320 0.268128 0.053626 0.007150 0.000715 0.000057 0.000004
x = 0.5: 0.606531 0.303265 0.075816 0.012636 0.001580 0.000158 0.000013 0.000001
x = 0.6: 0.548812 0.329287 0.098786 0.019757 0.002964 0.000356 0.000035 0.000003
x = 0.7: 0.496585 0.347610 0.121663 0.028388 0.004968 0.000696 0.000081 0.000008 0.000001
x = 0.8: 0.449329 0.359463 0.143785 0.038343 0.007669 0.001227 0.000164 0.000019 0.000002
x = 0.9: 0.406570 0.365913 0.164661 0.049398 0.011115 0.002001 0.000300 0.000039 0.000004
x = 1.0: 0.367879 0.367879 0.183940 0.061313 0.015328 0.003066 0.000511 0.000073 0.000009 0.000001
x = 2.0: 0.135335 0.270671 0.270671 0.180447 0.090224 0.036089 0.012030 0.003437 0.000859 0.000191 0.000038 0.000007 0.000001
x = 3.0: 0.049787 0.149361 0.224042 0.224042 0.168031 0.100819 0.050409 0.021604 0.008102 0.002701 0.000810 0.000221 0.000055 0.000013 0.000003 0.000001
x = 4.0: 0.018316 0.073263 0.146525 0.195367 0.195367 0.156293 0.104194 0.059540 0.029770 0.013231 0.005292 0.001925 0.000642 0.000197 0.000056 0.000015 0.000004 0.000001
x = 5.0: 0.006738 0.033690 0.084224 0.140374 0.175467 0.175467 0.146223 0.104445 0.065278 0.036266 0.018133 0.008242 0.003434 0.001321 0.000472 0.000157 0.000049 0.000014 0.000004 0.000001
x = 6.0: 0.002479 0.014873 0.044618 0.089235 0.133853 0.160623 0.160623 0.137677 0.103258 0.068838 0.041303 0.022529 0.011264 0.005199 0.002228 0.000891 0.000334 0.000118 0.000039 0.000012 0.000004 0.000001
x = 7.0: 0.000912 0.006383 0.022341 0.052129 0.091126 0.127717 0.149003 0.149003 0.130377 0.101405 0.070983 0.045171 0.026350 0.014188 0.007094 0.003311 0.001448 0.000596 0.000232 0.000085 0.000030 0.000010 0.000003 0.000001
x = 8.0: 0.000335 0.002684 0.010735 0.028626 0.057252 0.091604 0.122138 0.139587 0.139587 0.124077 0.099262 0.072190 0.048127 0.029616 0.016924 0.009026 0.004513 0.002124 0.000944 0.000397 0.000159 0.000061 0.000022 0.000008 0.000003 0.000001
x = 9.0: 0.000123 0.001111 0.004998 0.014994 0.033737 0.060727 0.091090 0.117116 0.131756 0.131756 0.118580 0.097020 0.072765 0.050376 0.032384 0.019431 0.010930 0.005786 0.002893 0.001370 0.000617 0.000264 0.000108 0.000042 0.000016 0.000006 0.000002 0.000001
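The tabulated values are P(X = k) = x^k e^{−x}/k!, so any column can be regenerated directly from the formula. A minimal sketch using only the Python standard library, reproducing the column for x = 2.0:

```python
# Reproduce the Poisson probabilities P(X = k) = x**k * exp(-x) / k! for x = 2.0.
from math import exp, factorial

x = 2.0
for k in range(13):
    p = x**k * exp(-x) / factorial(k)
    print(k, f"{p:.6f}")   # 0.135335, 0.270671, 0.270671, 0.180447, ...
```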
21.15 Standard Normal Distribution

For the formula of the standard normal distribution see 16.2.4.2, p. 757.

21.15.1 Standard Normal Distribution for 0.00 ≤ x ≤ 1.99

Each row lists Φ(x) for ten consecutive values of x in steps of 0.01.

x = 0.00-0.09: 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
x = 0.10-0.19: 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
x = 0.20-0.29: 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
x = 0.30-0.39: 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
x = 0.40-0.49: 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
x = 0.50-0.59: 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
x = 0.60-0.69: 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
x = 0.70-0.79: 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
x = 0.80-0.89: 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8079 0.8106 0.8133
x = 0.90-0.99: 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
x = 1.00-1.09: 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
x = 1.10-1.19: 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
x = 1.20-1.29: 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
x = 1.30-1.39: 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
x = 1.40-1.49: 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
x = 1.50-1.59: 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
x = 1.60-1.69: 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
x = 1.70-1.79: 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
x = 1.80-1.89: 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
x = 1.90-1.99: 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767

21.15.2 Standard Normal Distribution for 2.00 ≤ x ≤ 3.90

x = 2.00-2.09: 0.9773 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
x = 2.10-2.19: 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
x = 2.20-2.29: 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
x = 2.30-2.39: 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
x = 2.40-2.49: 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
x = 2.50-2.59: 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
x = 2.60-2.69: 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
x = 2.70-2.79: 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
x = 2.80-2.89: 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
x = 2.90-2.99: 0.9981 0.9982 0.9983 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
x = 3.00-3.90 (steps of 0.10): 0.9987 0.9990 0.9993 0.9995 0.9997 0.9998 0.9998 0.9999 0.9999 0.9999
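Φ(x) can be computed from the error function as Φ(x) = ½(1 + erf(x/√2)). The sketch below regenerates a few of the tabulated entries using only the Python standard library.

```python
# Recompute Phi(x) = 0.5 * (1 + erf(x / sqrt(2))) for a few arguments of the table.
from math import erf, sqrt

for x in (0.00, 0.50, 1.00, 1.96, 2.58, 3.00):
    phi = 0.5 * (1.0 + erf(x / sqrt(2.0)))
    print(f"{x:.2f}  {phi:.4f}")   # 0.5000, 0.6915, 0.8413, 0.9750, 0.9951, 0.9987
```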
21.16 χ² Distribution

For the formula of the χ² distribution see 16.2.4.6, p. 760.

χ² Distribution: Quantiles χ²(α, m)

m = degrees of freedom; the columns give the quantile for the probabilities α = 0.99, 0.975, 0.95, 0.05, 0.025, 0.01.

m    | 0.99     0.975    0.95    0.05    0.025   0.01
1    | 0.00016  0.00098  0.0039  3.8     5.0     6.6
2    | 0.020    0.051    0.103   6.0     7.4     9.2
3    | 0.115    0.216    0.352   7.8     9.4     11.3
4    | 0.297    0.484    0.711   9.5     11.1    13.3
5    | 0.554    0.831    1.15    11.1    12.8    15.1
6    | 0.872    1.24     1.64    12.6    14.4    16.8
7    | 1.24     1.69     2.17    14.1    16.0    18.5
8    | 1.65     2.18     2.73    15.5    17.5    20.1
9    | 2.09     2.70     3.33    16.9    19.0    21.7
10   | 2.56     3.25     3.94    18.3    20.5    23.2
11   | 3.05     3.82     4.57    19.7    21.9    24.7
12   | 3.57     4.40     5.23    21.0    23.3    26.2
13   | 4.11     5.01     5.89    22.4    24.7    27.7
14   | 4.66     5.63     6.57    23.7    26.1    29.1
15   | 5.23     6.26     7.26    25.0    27.5    30.6
16   | 5.81     6.91     7.96    26.3    28.8    32.0
17   | 6.41     7.56     8.67    27.6    30.2    33.4
18   | 7.01     8.23     9.39    28.9    31.5    34.8
19   | 7.63     8.91     10.1    30.1    32.9    36.2
20   | 8.26     9.59     10.9    31.4    34.2    37.6
21   | 8.90     10.3     11.6    32.7    35.5    38.9
22   | 9.54     11.0     12.3    33.9    36.8    40.3
23   | 10.2     11.7     13.1    35.2    38.1    41.6
24   | 10.9     12.4     13.8    36.4    39.4    43.0
25   | 11.5     13.1     14.6    37.7    40.6    44.3
26   | 12.2     13.8     15.4    38.9    41.9    45.6
27   | 12.9     14.6     16.2    40.1    43.2    47.0
28   | 13.6     15.3     16.9    41.3    44.5    48.3
29   | 14.3     16.0     17.7    42.6    45.7    49.6
30   | 15.0     16.8     18.5    43.8    47.0    50.9
40   | 22.2     24.4     26.5    55.8    59.3    63.7
50   | 29.7     32.4     34.8    67.5    71.4    76.2
60   | 37.5     40.5     43.2    79.1    83.3    88.4
70   | 45.4     48.8     51.7    90.5    95.0    100.4
80   | 53.5     57.2     60.4    101.9   106.6   112.3
90   | 61.8     65.6     69.1    113.1   118.1   124.1
100  | 70.1     74.2     77.9    124.3   129.6   135.8
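Since α in this table is the probability of exceeding the quantile, each entry corresponds to the (1 − α)-quantile of the χ² distribution with m degrees of freedom. A short sketch with SciPy (assumed available) reproduces a few rows:

```python
# Reproduce chi-square quantiles: the table value for (alpha, m) is chi2.ppf(1 - alpha, m).
from scipy.stats import chi2

for m in (1, 10, 30, 100):
    row = [chi2.ppf(1 - alpha, m) for alpha in (0.99, 0.975, 0.95, 0.05, 0.025, 0.01)]
    print(m, [round(v, 3) for v in row])
# m = 10 gives approximately [2.558, 3.247, 3.940, 18.307, 20.483, 23.209]
```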
21.17 Fisher F Distribution

For the formula of the Fisher F distribution see 16.2.4.7, p. 761.

Fisher F Distribution: Quantiles f(α, m1, m2) for α = 0.05

Rows: m2 = 1, ..., 30, 40, 60, 125, ∞; columns: m1 = 1, 2, 3, 4, 5, 6, 8, 12, 24, 30, 40, ∞.

m2 \ m1 | 1     2     3     4     5     6     8     12    24    30    40    ∞
1       | 161.4 199.5 215.7 224.6 230.2 234.0 238.9 243.9 249.0 250.0 251.0 254.3
2       | 18.51 19.00 19.16 19.25 19.30 19.33 19.37 19.41 19.45 19.46 19.47 19.50
3       | 10.13 9.55  9.28  9.12  9.01  8.94  8.85  8.74  8.64  8.62  8.59  8.53
4       | 7.71  6.94  6.59  6.39  6.26  6.16  6.04  5.91  5.77  5.75  5.72  5.63
5       | 6.61  5.79  5.41  5.19  5.05  4.95  4.82  4.68  4.53  4.50  4.46  4.36
6       | 5.99  5.14  4.76  4.53  4.39  4.28  4.15  4.00  3.84  3.81  3.77  3.67
7       | 5.59  4.74  4.35  4.12  3.97  3.87  3.73  3.57  3.41  3.38  3.34  3.23
8       | 5.32  4.46  4.07  3.84  3.69  3.58  3.44  3.28  3.12  3.08  3.05  2.93
9       | 5.12  4.26  3.86  3.63  3.48  3.37  3.23  3.07  2.90  2.86  2.83  2.71
10      | 4.96  4.10  3.71  3.48  3.33  3.22  3.07  2.91  2.74  2.70  2.66  2.54
11      | 4.84  3.98  3.59  3.36  3.20  3.09  2.95  2.79  2.61  2.57  2.53  2.40
12      | 4.75  3.89  3.49  3.26  3.11  3.00  2.85  2.69  2.51  2.47  2.43  2.30
13      | 4.67  3.81  3.41  3.18  3.03  2.92  2.77  2.60  2.42  2.38  2.34  2.21
14      | 4.60  3.74  3.34  3.11  2.96  2.85  2.70  2.53  2.35  2.31  2.27  2.13
15      | 4.54  3.68  3.29  3.06  2.90  2.79  2.64  2.48  2.29  2.25  2.20  2.07
16      | 4.49  3.63  3.24  3.01  2.85  2.74  2.59  2.42  2.24  2.19  2.15  2.01
17      | 4.45  3.59  3.20  2.96  2.81  2.70  2.55  2.38  2.19  2.15  2.10  1.96
18      | 4.41  3.55  3.16  2.93  2.77  2.66  2.51  2.34  2.15  2.11  2.06  1.92
19      | 4.38  3.52  3.13  2.90  2.74  2.63  2.48  2.31  2.11  2.07  2.03  1.88
20      | 4.35  3.49  3.10  2.87  2.71  2.60  2.45  2.28  2.08  2.04  1.99  1.84
21      | 4.32  3.47  3.07  2.84  2.68  2.57  2.42  2.25  2.05  2.01  1.96  1.81
22      | 4.30  3.44  3.05  2.82  2.66  2.55  2.40  2.23  2.03  1.98  1.94  1.78
23      | 4.28  3.42  3.03  2.80  2.64  2.53  2.37  2.20  2.00  1.96  1.91  1.76
24      | 4.26  3.40  3.01  2.78  2.62  2.51  2.36  2.18  1.98  1.94  1.89  1.73
25      | 4.24  3.39  2.99  2.76  2.60  2.49  2.34  2.16  1.96  1.92  1.87  1.71
26      | 4.23  3.37  2.98  2.74  2.59  2.47  2.32  2.15  1.95  1.90  1.85  1.69
27      | 4.21  3.35  2.96  2.73  2.57  2.46  2.31  2.13  1.93  1.88  1.84  1.67
28      | 4.20  3.34  2.95  2.71  2.56  2.45  2.29  2.12  1.91  1.87  1.82  1.65
29      | 4.18  3.33  2.93  2.70  2.55  2.43  2.28  2.10  1.90  1.85  1.80  1.64
30      | 4.17  3.32  2.92  2.69  2.53  2.42  2.27  2.09  1.89  1.84  1.79  1.62
40      | 4.08  3.23  2.84  2.61  2.45  2.34  2.18  2.00  1.79  1.74  1.69  1.51
60      | 4.00  3.15  2.76  2.53  2.37  2.25  2.10  1.92  1.70  1.65  1.59  1.39
125     | 3.92  3.07  2.68  2.44  2.29  2.17  2.01  1.83  1.60  1.55  1.49  1.25
∞       | 3.84  3.00  2.60  2.37  2.21  2.10  1.94  1.75  1.52  1.46  1.39  1.00

Fisher F Distribution: Quantiles f(α, m1, m2) for α = 0.01

Each column lists the quantiles for m2 = 1, 2, ..., 30, 40, 60, 125, ∞.

m1 = 1:  4052 98.50 34.12 21.20 16.26 13.74 12.25 11.26 10.56 10.04 9.65 9.33 9.07 8.86 8.68 8.53 8.40 8.29 8.18 8.10 8.02 7.95 7.88 7.82 7.77 7.72 7.68 7.64 7.60 7.56 7.31 7.08 6.84 6.63
m1 = 2:  4999 99.00 30.82 18.00 13.27 10.92 9.55 8.65 8.02 7.56 7.21 6.93 6.70 6.51 6.36 6.23 6.11 6.01 5.93 5.85 5.78 5.72 5.66 5.61 5.57 5.53 5.49 5.45 5.42 5.39 5.18 4.98 4.78 4.60
m1 = 3:  5403 99.17 29.46 16.69 12.06 9.78 8.45 7.59 6.99 6.55 6.22 5.95 5.74 5.56 5.42 5.29 5.18 5.09 5.01 4.94 4.87 4.82 4.76 4.72 4.68 4.64 4.60 4.57 4.54 4.51 4.31 4.13 3.94 3.78
m1 = 4:  5625 99.25 28.71 15.98 11.39 9.15 7.85 7.01 6.42 5.99 5.67 5.41 5.21 5.04 4.89 4.77 4.67 4.58 4.50 4.43 4.37 4.31 4.26 4.22 4.18 4.14 4.11 4.07 4.04 4.02 3.83 3.65 3.48 3.32
m1 = 5:  5764 99.30 28.24 15.52 10.97 8.75 7.46 6.63 6.06 5.64 5.32 5.06 4.86 4.70 4.56 4.44 4.34 4.25 4.17 4.10 4.04 3.99 3.94 3.90 3.86 3.82 3.78 3.76 3.73 3.70 3.51 3.34 3.17 3.02
m1 = 6:  5859 99.33 27.91 15.21 10.67 8.47 7.19 6.37 5.80 5.39 5.07 4.82 4.62 4.46 4.32 4.20 4.10 4.01 3.94 3.87 3.81 3.76 3.71 3.67 3.63 3.59 3.56 3.53 3.50 3.47 3.29 3.12 2.95 2.80
m1 = 8:  5981 99.37 27.49 14.80 10.29 8.10 6.84 6.03 5.47 5.06 4.74 4.50 4.30 4.14 4.00 3.89 3.79 3.71 3.63 3.56 3.51 3.45 3.41 3.36 3.32 3.29 3.26 3.23 3.20 3.17 2.99 2.82 2.66 2.51
m1 = 12: 6106 99.42 27.05 14.37 9.89 7.72 6.47 5.67 5.11 4.71 4.40 4.16 3.96 3.80 3.67 3.55 3.46 3.37 3.30 3.23 3.17 3.12 3.07 3.03 2.99 2.96 2.93 2.90 2.87 2.84 2.66 2.50 2.33 2.18
m1 = 24: 6235 99.46 26.60 13.93 9.47 7.31 6.07 5.28 4.73 4.33 4.02 3.78 3.59 3.43 3.29 3.18 3.08 3.00 2.92 2.86 2.80 2.75 2.70 2.66 2.62 2.58 2.55 2.52 2.49 2.47 2.29 2.12 1.94 1.79
m1 = 30: 6261 99.47 26.50 13.84 9.38 7.23 5.99 5.20 4.65 4.25 3.94 3.70 3.51 3.35 3.21 3.10 3.00 2.92 2.84 2.78 2.72 2.67 2.62 2.58 2.54 2.50 2.47 2.44 2.41 2.38 2.20 2.03 1.85 1.70
m1 = 40: 6287 99.47 26.41 13.74 9.29 7.14 5.91 5.12 4.57 4.17 3.86 3.62 3.43 3.27 3.13 3.02 2.92 2.84 2.76 2.69 2.64 2.58 2.54 2.49 2.45 2.42 2.38 2.35 2.33 2.30 2.11 1.94 1.75 1.59
m1 = ∞:  6366 99.50 26.12 13.46 9.02 6.88 5.65 4.86 4.31 3.91 3.60 3.36 3.16 3.00 2.87 2.75 2.65 2.57 2.49 2.42 2.36 2.31 2.26 2.21 2.17 2.13 2.10 2.06 2.03 2.01 1.80 1.60 1.37 1.00
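The same upper-tail convention applies here: the tabulated value for (α, m1, m2) is the (1 − α)-quantile of the F distribution with (m1, m2) degrees of freedom. A minimal sketch with SciPy (assumed available):

```python
# Reproduce Fisher F quantiles: table value for (alpha, m1, m2) is f.ppf(1 - alpha, m1, m2).
from scipy.stats import f

print(round(f.ppf(0.95, 6, 10), 2))   # 3.22  (alpha = 0.05, m1 = 6, m2 = 10)
print(round(f.ppf(0.99, 12, 20), 2))  # 3.23  (alpha = 0.01, m1 = 12, m2 = 20)
```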
21.18 Student t Distribution

For the formula of the Student t distribution see 16.2.4.8, p. 761.

Student t Distribution: Quantiles t(α, m)

m = degrees of freedom. The column headings give the probability α for the two-sided problem (0.10, 0.05, 0.02, 0.01, 0.002, 0.001); the corresponding probabilities for the one-sided problem are 0.05, 0.025, 0.01, 0.005, 0.001, 0.0005.

m    | 0.10  0.05  0.02   0.01  0.002  0.001
1    | 6.31  12.7  31.82  63.7  318.3  637.0
2    | 2.92  4.30  6.97   9.92  22.33  31.6
3    | 2.35  3.18  4.54   5.84  10.22  12.9
4    | 2.13  2.78  3.75   4.60  7.17   8.61
5    | 2.01  2.57  3.37   4.03  5.89   6.86
6    | 1.94  2.45  3.14   3.71  5.21   5.96
7    | 1.89  2.36  3.00   3.50  4.79   5.40
8    | 1.86  2.31  2.90   3.36  4.50   5.04
9    | 1.83  2.26  2.82   3.25  4.30   4.78
10   | 1.81  2.23  2.76   3.17  4.14   4.59
11   | 1.80  2.20  2.72   3.11  4.03   4.44
12   | 1.78  2.18  2.68   3.05  3.93   4.32
13   | 1.77  2.16  2.65   3.01  3.85   4.22
14   | 1.76  2.14  2.62   2.98  3.79   4.14
15   | 1.75  2.13  2.60   2.95  3.73   4.07
16   | 1.75  2.12  2.58   2.92  3.69   4.01
17   | 1.74  2.11  2.57   2.90  3.65   3.96
18   | 1.73  2.10  2.55   2.88  3.61   3.92
19   | 1.73  2.09  2.54   2.86  3.58   3.88
20   | 1.73  2.09  2.53   2.85  3.55   3.85
21   | 1.72  2.08  2.52   2.83  3.53   3.82
22   | 1.72  2.07  2.51   2.82  3.51   3.79
23   | 1.71  2.07  2.50   2.81  3.49   3.77
24   | 1.71  2.06  2.49   2.80  3.47   3.74
25   | 1.71  2.06  2.49   2.79  3.45   3.72
26   | 1.71  2.06  2.48   2.78  3.44   3.71
27   | 1.70  2.05  2.47   2.77  3.42   3.69
28   | 1.70  2.05  2.46   2.76  3.40   3.66
29   | 1.70  2.05  2.46   2.76  3.40   3.66
30   | 1.70  2.04  2.46   2.75  3.39   3.65
40   | 1.68  2.02  2.42   2.70  3.31   3.55
60   | 1.67  2.00  2.39   2.66  3.23   3.46
120  | 1.66  1.98  2.36   2.62  3.17   3.37
∞    | 1.64  1.96  2.33   2.58  3.09   3.29
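For the two-sided probability α the tabulated quantile is the (1 − α/2)-quantile of the t distribution; for the one-sided probability one uses 1 − α directly. A minimal sketch with SciPy (assumed available):

```python
# Reproduce Student t quantiles for the two-sided probability alpha:
#   table value = t.ppf(1 - alpha / 2, m)
from scipy.stats import t

for m, alpha in ((5, 0.05), (30, 0.01), (120, 0.10)):
    print(m, alpha, round(t.ppf(1 - alpha / 2, m), 2))
# (5, 0.05) -> 2.57,  (30, 0.01) -> 2.75,  (120, 0.10) -> 1.66
```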
21.19 Random Numbers

For the meaning of random numbers see 16.3.5.2, p. 781.

4730 0612 0285 7768 4450 17332 4044 0067 5358 0038 8344 7164 7454 3454 0401 6202 8284 9056 9747 2992 6170 3265 0179 1839 2276 4146 3526 3390 4806 7959 8245 7551 5903 9001 0265
1530 8004 2278 8634 1888 9284 9078 3428 8085 8931 6563 4013 1643 9005 7697 9278 5256 7574 4772 0449 2271 4689 7492 5157 7616 8021 6292 0067 7414 3186 0195 1077 0338 4286 0151 7260 3840 7921 8836 3342 4595 2539 8619 0814 3949 6995 6042 9650 8078 9973 9952 7945 3809 5523 7825 7012 9286 5051 5983 0204 9611 0641 4915 2913 2744 7318 4521 5070 3305 3814
7993 2549 3672 2217 3162 7406 5969 4765 3219 6906 3835 8731 2995 5579 3081 7406 5969 4765 3219 6906 7592 5133 3170 3024 4398 5207 0648 9934 4651 4325 7024 9031 7614 4150 0973
3141 3737 7033 0293 9968 4439 9442 9647 2532 8859 2938 4980 7868 9028 5876 4439 9442 9647 2532 8859 1339 7995 9915 0680 3121 1967 3326 7022 1580 5039 3899 9735 5999 5059 4958
0103 7686 4844 3978 6369 5683 7696 4364 7577 5044 2671 8674 0683 5660 8150 5683 7696 4364 7577 5044 4802 8030 6960 1127 7749 7325 1933 2260 5004 7342 8981 7820 1246 5178 4830
4528 0723 0149 5933 1256 6877 7510 1037 2815 8826 4691 4506 3768 5006 1360 6877 7510 1037 2815 8826 5751 7408 2621 8088 8191 7584 6265 0190 8981 7252 1280 2478 9759 7130 6297
7988 4505 7412 1032 0416 2920 1620 4975 8696 6218 0559 7262 0625 8325 1868 2920 1620 4975 8696 6218 3785 2186 6718 0200 2087 3485 0649 1816 1950 2800 5678 9200 6565 2641 0575
4635 6841 6370 5192 4326 9588 4973 1998 9248 3206 8382 8127 9887 9677 9265 9588 6973 1998 9248 3206 7125 0725 4059 5868 8270 5832 6177 7933 2201 4706 8096 7269 1012 7812 4843
8478 1379 1884 1732 7840 3002 1911 1359 9410 9034 2825 2022 7060 2169 3277 3002 1911 1359 9410 9034 4922 5554 9919 0084 5233 8118 2139 2906 3852 6881 7010 6284 0059 1381 3437
9094 6460 0717 2137 6525 2869 1288 1346 9282 0843 4928 2178 0514 3196 8465 2869 1288 1346 9282 0843 8877 5664 1007 6362 3980 8433 7236 3030 6855 8828 1435 9861 2419 6158 5629
9077 1869 5740 9357 2608 3746 6160 6125 6572 9832 5379 7463 0034 0357 7502 3746 6160 6125 6572 9832 9530 6791 6469 6808 6774 0606 0441 6032 5489 2785 7631 2849 0036 9539 3496
5306 5700 8477 5941 5255 3690 9797 5078 3940 2703 8635 4842 8600 7811 6458 3690 9797 5078 3940 2703 6499 9677 5410 3727 8522 2719 1352 1685 6386 8375 7361 2208 2027 3356 5406
4357 5339 6583 6564 4811 6931 8755 6742 6655 8514 8135 4414 3727 5434 7195 2705 1547 3424 8969 5225 6432 3085 0246 8710 5736 2889 1499 3100 3736 7232 8903 8616 5467 5861 4790
8353 6862 0717 2171 3763 1230 6120 3443 9014 4124 7299 0127 5056 0314 9869 6251 4972 1354 3659 8898 1516 8319 3687 6065 3132 2765 3068 1929 0498 2483 8684 5865 5577 9371 9734
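Blocks of uniform four-digit random numbers like the ones above can be produced with any pseudo-random generator; the sketch below uses only the Python standard library (the seed is arbitrary and serves only to make the run reproducible).

```python
# Generate rows of four-digit pseudo-random numbers (0000-9999), similar in layout
# to the table above. Standard library only.
import random

rng = random.Random(12345)          # arbitrary seed for reproducibility
for _ in range(3):                  # three rows of ten numbers each
    row = [f"{rng.randrange(10000):04d}" for _ in range(10)]
    print(" ".join(row))
```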
22 Bibliography 1. Arithmetic
BECKENBACH, E.; BELLMANN, R.: Inequalities. - Springer-Verlag 1983. BOSCH,K.: Finanzmathernatik. - Oldenbourg-Verlag 1991. HARDY,G.: A Course in Pure Mathematics. - Cambridge University Press 1952. HEILMANN, W.-R.: Grundbegriffe der Risikotheorie.- Verlag Versicherungswirtschaft 1986. F.; MUNZER,H.: Lebensversicherungsmathematik fur Praxis und Studium. ISENBART, Verlag Gabler, 2nd ed. 1986. GELLERT,W . ; KASTNER,H.; NEUBER,S.: Fachlexikon ABC Mathematik. - Verlag H. Deutsch 1978. HEITZINGER, W.: TROCH, I.; VALENTIN, G.: Praxis nichtlinearer Gleichungen. - -C. Hanser Verlag 1984. PFEIFER, A,: Praktische Finanzmathematik. - Verlag H. Deutsch 1995. 2. Functions
[2.1]
[2.2] [2.3] [2.4] [2.5] [2.6]
FETZER, A.; FRANKEL, H.: Mathematik Lehrbuchfur Fachhochschulen, Bd. 1.-VDI-Verlag 1995. FICHTENHOLZ, G.M.: Differential- und Integralrechnung, Bd. 1.- Verlag H. Deutsch 1994. HARDY,G.: A Course in Pure Mathematics. -Cambridge University Press 1952. Handbook of Mathematical, Scientific and Engineering Formulas, Tables, Functions, Graphs, Transforms. - Research and Education Association 1961. PAPULA, L.: Mathematik fur Ingenieure, Bd. 1, 2, 3. - Verlag Vieweg 199441996, SMIRNOW, W.I.: Lehrbuch der hoheren Mathematik, Bd. 1. - Verlag H. Deutsch 1994.
3. Geometry
[3.1] [3.2] [3.3] [3.4] [3.5] [3.6] [3.7] [3.8] [3.9] [3.10] [3.11] [3.12] [3.13] [3.14]
[3.15] [3.16] [3.17] [3.18]
BAR, G.: Geometrie. - B. G. Teubner 1996. BERGER,M.: Geometry, Vol. 1, 2. - Springer-Verlag 1987. BOHM,J.: Geornetrie, Bd. 1, 2. - Verlag H. Deutsch 1988. DRESZER,J.: Mathematik Handbuch fur Technik und Naturwissenschaft. - Verlag H. Deutsch 1975. A.; NOVIKOV,S.: Modern Geometry, Vol. 1-3. -Springer-Verlag DUBROVIN, B.; FOMENKO, 1995. EFIMOW,N.V.: Hohere Geometrie, Bd. 1, 2. - Verlag Vieweg 1970. FISCHER, G.: Analytische Geornetrie. - Verlag Vieweg 1988. JENNINGS, G.A.: Modern Geometry with Applications. - Springer-Verlag 1994. LANG,S.; MURROW,G.: A High School Course. - Springer-Verlag 1991. GELLERT.W.; KUSTNER,H.; KASTNER,H . (Eds.): Kleine Enzyklopadie Mathematik. Verlag H. Deutsch 1988. KLINGENBERG, W.: Lineare Algebra und Geometrie. - Springer-Verlag 1992. KLOTZEK,B.: Einfiihrung in die Differentialgeornetrie, Bd. 1 , 2 . - Verlag H. Deutsch 1995. KOECHER,M.: Lineare Algebra und analytische Geornetrie. - Springer-Verlag 1997. H. v . ; KNOPP.K.: Einfuhrung in die hohere Mathematik, Bd. 11. - S. Hirzel MANGOLDT, Verlag 1978. MATTHEWS.V.: Vermessungskunde Teil 1,2. - B. G. Teubner 1993. RASCHEWSKI, P.K.: Riemannsche Geornetrie und Tensoranalysis. -Verlag H. Deutsch 1995. SIGL,R.: Ebene und spharische Trigonometrie. - Verlag H. Wichrnann 1977. SINGER.D. A.: Plane and Fancy. - Springer-Verlag 1998.
[3.19] STEINERT,K.-G.: Spharische Trigonometrie. - B. G. Teubner 1977. 4. Linear Algebra
[4.1] [4.2] [4.3] [4.4] [4.5] [4.6] [4.7] [4.8] [4.9]
BERENDT,G.: WEIMAR,E.: Mathematik fur Physiker, Bd. 1, 2. - VCH 1990. BLYTH,T. S.; ROBERTSON, E. F.: Basic Linear Algebra. - Springer-Verlag 1998. CURTIS,C. W.: Linear Algebra. An Introductory Approach. - Springer-Verlag 1984. FADDEJEW, D.K.; FADDEJEWA, W.N.: Numerische Methoden der linearen Algebra. Deutscher Verlag der Wissenschaften 1970. JANICH,K.: Lineare illgebra. - Springer-Verlag 1996. KIELBASI~K A,; I , SCHWETLICK, H.: Numerische lineare Algebra. Eine computerorientierte Einfuhrung. - Verlag H. Deutsch 1988. KLINGENBERG, W .: Lineare Algebra und Geometrie. - Springer-Verlag 1992. KOECHER,M.: Lineare algebra und analytische Geometrie. - Springer-Verlag 1997. LIPPMANN, H.: .4ngewandte Tensorrechnung. Fur Ingenieure, Physiker und Mathematiker. Springer-Verlag 1993. RASCHEWSKI: P.K.: Riemannsche Geometrie und Tensoranalysis. -Verlag H. Deutsch 1995. SMIRNOW, W.I.: Lehrbuch der hoheren Mathematik, Teil II1,l. - Verlag H. Deutsch 1994. SMITH,L.: Lineare Algebra. - Springer-Verlag 1998. S.: Matrizen und ihre Anwendung - 1. Grundlagen. - Springer-Verlag ZURMUHL, R.; FALK, 1997. R.: Praktische Mathematik fur Ingenieure und Physiker. - Springer-Verlag 1984. ZURMUHL,
~
[4.10] [4.11] [4.12] [4.13] [4.14]
5. Algebra and Discrete Mathematics
[5.5]
A ) Algebra and Discrete Mathematics, General AIGNER,M.: Diskrete Mathematik. - Verlag Vieweg 1993. BURRIS,S . ; SANKAPPANAVAR, H. P . : A Course in Universal Algebra. - Springer-Verlag 1981. EHRIG,H.; MAHR,B.: Fundamentals of Algebraic Specification 1. - Springer-Verlag 1985. F.: A Logical Approach to Discrete Mathematics. - Springer-Verlag GRIES,D.; SCHNEIDER, 1993. WECHLER,W.: Universal Algebra for Computer Scientists. - Springer-Verlag 1992.
[5,6]
B) Algebra a n d Discrete Mathematics, Group Theory FASSLER, A.; STIEFEL,E.: Group Theoretical Methods and their Applications. -Birkhauser
[5.1] [5.2] [5.3] [5.4]
1992. HEIN,W.: Struktur und Darstellungstheorie der klassischen Gruppen. - Springer-Verlag 1990. [5.8] HEINE,V.: Group Theory in Quantum Mechanics. - Dover 1993. C.: Symmetries in Physics. Group Theory Applied to Physical Prob[5.9] LUDWIG, W . , FALTER, lems. - Springer-Verlag 1996. [5.10] VARADARAJAN, V.: Lie Groups, Lie Algebras and their Representation. - Springer-Verlag 1990. [5.11] WALLACE, D.: Groups, Rings and Fields. - Springer-Verlag 1998. [5.12] ZACHMANN. H.G.: Mathematik fur Chemiker. - VCH 1990. [5.7]
C) Algebra and Discrete Mathematics, N u m b e r Theory G.A.; JONES, J.M.: Elementary Number Theory. - Springer-Verlag. [5.13] JONES, [5.14] NATHANSON. M.: Elementary Methods in Number Theory. - Springer-Verlag.
[5.15] RIVEST,R.L.; SHAMIR,A.; ADLEMAN, L.: A Method for Obtaining Digital Signatures and Public Key Cryptosystems. - Comm. ACM 21 (1978) 12-126. [5.16] [5.17] [5.18] [5.19]
D ) Algebra a n d Discrete Mathematics, Cryptology BAUER,F. L.: Decrypted Secrets. - Springer-Verlag. BEUTELSPACHER, A.: Cryptology. The Mathematical Association of iZmerica 1996. SCHNEIDER, B.: bpplied Cryptology. - John Wiley 1995. STALLINGS, W.: Cryptology and Network Security. Prentice Hall 1998. - Addison Wesley Longman 1997.
E) Algebra and Discrete Mathematics, G r a p h Theory [5.20] DIESTEL.R.: Graph Theory. - Springer-Verlag. [5.21] EDMONDS, J.: Paths, Trees and Flowers. - Canad. J. Math. 17 (1965) 449-467. J., JOHNSON,E.L.: Matching, Euler Tours and the Chinese Postman. - Math. [5.22] EDMONDS, Programming 5 (1973) 88-129. [5.23] HARARY,F.: Graph Theory. -Addison Wesley.
F) Algebra a n d Discrete Mathematics, Fuzzy-Logik [5.24] BANDEMER, H.; GOTTWALD, S.: Einfiihrung in Fuzzy-Methoden - Theorie und Anwendungen unscharfer Mengen. - Akademie-Verlag 1993. M.: An Introduction to Fuzzy Control. [5.25] DRIANKOV, D.; HELLENDORN, H.; REINFRANK, Springer-Verlag 1993. H.: Fuzzy Sets and System Theory and Applications. -Academic Press [5.26] DUBOIS,D.; PRADE, 1980. [5,27] GRAUEL;A,: Fuzzy-Logik. Einfiihrung in die Grundlagen mit Anwendungen. - B.I. Wissenschaftsverlag 1995. j5.281 HORDESON, J.N.; NAIR,N.S.: Fuzzy Mathematics. -An Introduction for Engineers and Scientists. - Physica Verlag 1998. J.; KLAWONN: F.: Fuzzy-Systeme. - B. G. Teubner 1993. [5.29] KRUSE,R.; GEBHARDT, [5.30] WANG,Z.; KLIR,G.T.: Fuzzy Measure Theory. - Plenum Press 1992. [5,31] ZIMMERMANN, H-J.: Fuzzy Sets. Decision Making and Expert Systems. - Verlag KluwerNijhoff 1987. 6. Differential Calculus
[6.1] [6.2]
[E:] [6.5] [6.6] [6.7] [6.8] [6.9]
R.: Introduction to Calculus and .4nalysis, Vols. 1 and 2. - Springer-Verlag 1989. COURANT, FETZER,A , ; FRANKEL, H.: Mathematik. - Lehrbuch fur Fachhochschulen, Bd. 1 , 2 . -VDIVerlag 1995. FICHTENHOLZ, G.M.: Differential- und Integralrechnung, Bd. 1-3. -Verlag H. Deutsch 1994. LANG,S.: Calculus of Several Variables. - Springer-Verlag 1987. KNOPP,K.: Theorie und Anwendung der unendlichen Reihen. - Springer-Verlag 1964. MANGOLDT, H. v.; KNOPP,K.: Einfiihrung in die hohere Mathematik, Bd. 2, 3. - S. Hirzel Verlag 1978-81. PAPULA, L.: Mathematik fur Ingenieure, Bd. 1-3. Verlag Vieweg 1994-1996. SMIRNOW, W.I.: Lehrbuch der hoheren Mathematik, Bd. 11,111. - Verlag H. Deutsch 1994. ZACHMANN, H.G.: Mathematik fur Chemiker. - VCH 1990. -
7. Infinite Series A.: Tables of Integrals and Series. - Verlag H. Deutsch 1996. [7.1] APELBLAT, COURANT, R.: Introduction to Calculus and Analysis, Vols. 1 and 2. - Springer-Verlag 1989. [7.2] A.: FRANKEL, H.: Mathematik. - Lehrbuch fur Fachhochschulen, Bd. 1 , 2 . - VDI[7.3] FETZER, Verlag 1995.
[7.4] [73] [7.6] [7.7] [7.8] [7.9]
FICHTENHOLZ, G.M.: Differential- und Integralrechnung, Bd. 1-3. - Verlag H. Deutsch 1994. KNOPP:K.: Theorie und Anwendung der unendlichen Reihen. - Springer-Verlag 1964. MANGOLDT, H. v.; KNOPP,K.; HRG. F. LOSCH: Einfiihrung in die hohere Mathematik, Bd. 1-4. - S. Hirzel Verlag 1989. PAPULA. L.: h'lathernatik fur Ingenieure, Bd. 1-3. - Verlag Vieweg 1994-1996. PLASCHKO, P.; BROD,K.: Hohere mathematische Methoden fur Ingenieure und Physiker. -Springer-Verlag 1989. SMIRNOW, W.I.: Lehrbuch der hoheren Mathematik, Bd. 11,111.- Verlag H. Deutsch 1994.
8. Integral Calculus
[8.1] [8.2]
APELBLAT, A.: Tables of Integrals and Series. - Verlag H. Deutsch 1996. BRYTSCHKOW, J.A.; MARITSCHEW, 0.1.; PRUDNIKOV, A.P.: Tabellen unbestimmter Integrale. - Verlag H. Deutsch 1992. [8.3] COURANT, R.: Introduction to Calculus and Analysis, L'ols. 1and 2. - Springer-Verlag 1989. [8.4] FICHTENHOLZ, G.M.: Differential- und Integralrechnung, Bd. 1-3. -Verlag H. Deutsch 1994. [83] KAMKE,E.: Das Lebesgue-Stieltjes-Integral. - B. G. Teubner 1960. [8.6] KNOPP,K.: Theorie und Anwendung der unendlichen Reihen. - Springer-Verlag 1964. [8.7] MANGOLDT, H. v.; KNOPP,K.; HRSG. F. LOSCH: Einfiihrung in die hohere Mathematik, Bd. 1-4. - S. Hirzel Verlag 1989. L.: Mathematik fur Ingenieure, Bd. 1-3. - Verlag Vieweg 1994-1996. [8.8] PAPULA. [8.9] SMIRNOW, W.I.: Lehrbuch der hoheren Mathematik. Bd. 11,111.- Verlag H. Deutsch 1994. [8.10] ZACHMANN,H.G.: Mathematik fur Chemiker. VCH 1990. ~
9. Differential Equations
i9.1I] [9.12] [9.13] [9.14]
A) Ordinary and Partial Differential Equations BRAUN.M.: Differentialgleichungen und ihre Anwendungen. - Springer-Verlag 1991. CODDINGTON, E.; LEVINSON,N.: Theory of Ordinary Differential Equations. - McGraw Hill 1955. COLLATZ, L.: Differentialgleichungen. - B. G. Teubner 1990. COLLATZ, L.: Eigenwertaufgaben mit technischen Anwendungen. - .4kademische Verlagsgesellschaft 1963. COURANT, R.; HILBERT,D.: Methoden der mathematischen Physik, Bd. 1. 2. - SpringerVerlag 1968. EGOROV,Yu.: SHUBIN,M.: Partial Differential Equations, Vols. 1-4. - Encyclopaedia of Mathematical Sciences. Springer-Verlag 1991. FRANK, PH.; MISES, R. v.: Die Differential- und Integralgleichungen der Mechanik und Physik. Bd. 1. 2. - Verlag Vieweg 1961. GREINER.W.: Quanten Mechanics. An Introduction. - Springer-Verlag 1994. KAMKE,E.: Differentialgleichungen, Losungsmethoden und Losungen, Teil 1, 2. - BSB B. G. Teubner 1977. LANDAU. L.D.; LIFSCHITZ, E.M.: Quantenmechanik. - Verlag H. Deutsch 1992 P OLJ ANIN A.D .; SAIZEW, V. F. : Sammlung gewohnlicher Differentialgleichungen.- Verlag H. Deutsch 1996. REISSIG,R . ; SANSONE, G.; CONTI,R.: Yichtlineare Differentialgleichungen hoherer Ordnung. - Edizioni Cremonese 1969. SMIRNOW, W.I.: Lehrbuch der hoheren Mathematik, Teil2. -Verlag H. Deutsch 1994. SOMMERFELD, A,: Partielle Differentialgleichungen der Physik. - Verlag H. Deutsch 1992. ~
I
[9.15] STEPANOW, W.W.: Lehrbuch der Differentialgleichungen. - Deutscher Verlag der Wissenschaften 1982.
B) Non-Linear Partial Differential Equations [9.16] DODD,R.K.; EILBECK, J.C.; GIBBON,J.D.; MORRIS,H.C.: Solitons and Non-Linear Wave Equations. - Academic Press 1982. [9.17] DRAZIN, P.G., JOHNSON, R.: Solitons. An Introduction. -Cambridge University Press 1989. [9.18] G U CHAOHAO (Ed.): Soliton Theory and its Applications. - Springer-Verlag 1995. [9.19] LAMB,G.L.: Elements of Soliton Theory. - John Wiley 1980. [9.20] MAKHANKOV, V.G.: Soliton Phenomenology. - Verlag Kluwer 1991. [9.21] REMOISSENET, S.: Waves Called Solitons. Concepts and Experiments. - Springer-Verlag 1994. [9.22] TODA, M.: Nonlinear Waves and Solitons. - Verlag Kluwer 1989. [9.23] VVEDENSKY, D.: Partical Differential Equations with Mathematica. - Addison Wesley 1993. 10. Calculus of Variations [10.1] BLANCHARD, P.; BRUNING,E.: Variational Methods in Mathematical Physics. - SpringerVerlag 1992. [10.2] GIAQUINTA, M.; HILDEBRANDT, S.: Calculus of Variations. - Springer-Verlag 1995. E.: Variationsrechnung. - BI-Verlag 1988. [10.3] KLINGBEIL, [10.4] KLOTZLER,R.: Mehrdimensionale Variationsrechnung. - Birkhauser 1970. [10.5] KOSMOL,P.: Optimierung und Approximation. - Verlag W. de Gruyter 1991. [10.6] MICHLIN, S.G.: Numerische Realisierung von Variationsmethoden. -Akademie-Verlag 1969. [10.7] ROTHE. R.: Hohere Mathematik fur Mathematiker, Physiker, Ingenieure, Teil VII. B. G. Teubner 1960. 11. Linear Integral Equations [11.1] CORDUNEANU, I.C.: Integral Equations and Applications. - Cambridge University Press
1991. [11.2] ESTRADA, R . ; KANWAL, R.P.: Singular Integral Equations. -John Wiley 1999. [11.3] HACKBUSCH, W.: Integral Equations: Theory and Numerical Treatment. - Springer-Verlag 1995. [11.4] KANWAL, R.P.: Linear Integral Equations. - Springer-Verlag 1996. [ll.5] KRESS,R.: Linear Integral Equations. - Springer-Verlag 1999. S.: Singular Integral Operators. - Springer-Verlag 1986. [11.6] MICHLIN, S.G.; PROSSDORF, [11.7] MICHLIN,S.G.: Integral Equations and their Applications to Certain Problems in Mechanics. - MacMillan 1964. [11.8] MUSKELISHVILI, N.I.: Singular Integral Equations: Boundary Problems of Functions Theory and their Applications to Mathematical Physics. - Dover 1992. [11.9] PIPKIN, A.C.: A Course on Integral Equations. - Springer-Verlag 1991. [ll.lO] POLYANIN, A.D.; MANZHIROV, A.V.: Handbook of Integral Equations. - CRC Press 1998. 12. Functional Analysis [12.1] ACHIESER,N.I.; GLASMANN, I.M.: Theory of Linear Operators in Hilbert Space. - M. Yestell. Ungar. 1961. [12.2] ALIPRANTIS, C.D.; BURKINSHAW, 0.: Positive Operators. - Academic Press 1985. [12.3] ALIPRANTIS,C.D.; BORDER,K.C.; LUXEMBURG, W.A.J.: Positive Operators, Riesz Spaces and Economics. - Springer-Verlag 1991. [12.4] ALT. H.W .: Lineare Funktionalanalysis. - Eine anwendungsorientierte Einfuhrung. - Springer-Verlag 1976.
1097 [12.5] BALAKRISHNAN, A.V.: Applied Functional Analysis. - Springer-Verlag 1976. [12.6] BAUER,H.: MaB- und Integrationstheorie. - Verlag W. de Gruyter 1990. [12.7] BRONSTEIN, I.N.; SEMENDAJEW, K.A.: Erganzende Kapitel zum Taschenbuch der Mathematik. - BSB B. G. Teubner 1970; Verlag H. Deutsch 1990. [12.8] DUNFORD, N.; SCHWARTZ, J.T.: Linear Operators, Vols. I, 11,111.-1ntersciences 1958,1963, 1971. [12.9] EDWARDS, R.E.: Functional Analysis. - Holt, Rinehart and Winston 1965. [12.10] GAJEWSKI, H.; GROGER,K.; ZACHARIAS, K.: Nichtlineare Operatorengleichungen und Operatordifferentialgleichungen.- Akademie-Verlag 1974. [12.11] HALMOS,P . R . : A Hilbert Space Problem Book. -Van Nostrand 1967. [12.12] HUTSON,V.C.L.; PYM,J.S.: Applications of Functional Analysis and Operator Theory. Academic Press 1980. [12.13] HEWITT,E.; STROMBERG, K.: Real and Abstract Analysis. - Springer-Verlag 1965. [12.14] JOSHI. M.C.; BOSE,R.K.: Some Topics in Nonlinear Functional Analysis. - Wiley Eastern 1985. [12.15] KANTOROVICH, L.V.; AKILOW,G.P.: Functional .4nalysis - Pergamon Press 1982. S.W.: Introduction toFunctional Analysis. -Graylock Press [12.16] KOLMOGOROW, A.N.; FOMIN, 1961. [12.17] KRASNOSEL'SKIJ, M.A.; LIFSHITZ,J.A., SOBOLEV,A.V.: Positive Linear Systems. Heldermann-Verlag 1989. [12.18] LUSTERNIK, L.A.; SOBOLEW,V.I.: Elements of Functional Analysis. - Gordon and Breach 1961, Hindustan Publishing Corporation Delhi 1974, in German: Verlag H. Deutsch 1975. [12.19] MEYER-NIEBERG, P.: Banach Lattices. - Springer-Verlag 1991. [12.20] NAIMARK, M.A.: Normed Rings. - Wolters-Noordhoff 1972. [12.21] RUDIN,W . : Functional Analysis. - McGraw-Hill 1973. [12.22] SCHAEFER, H.H.: Topological Vector Spaces. - Macmillan 1966. [12.23] SCHAEFER, H.H.: Banach Lattices and Positive Operators. - Springer-Verlag 1974. [12.24] YOSIDA,K.: Functional Analysis. - Springer-Verlag 1965.
13. Vector Analysis a n d Vector Fields [13.1] DOMKE,E.: Vektoranalysis: Einfiihrung fur Ingenieure und Naturwissenschaftler. Verlag 1990. [13.2] JANICH.K.: Vector Analysis. - Springer-Verlag 1999. R.: Vektoranalysis fur Ingenieurstudenten. - Verlag H. Deutsch 1992. [13.3] SCHARK,
-
BI-
14.Function Theory [14.1] ABRAMOWITZ, M.; STEGUN,I. A.: Pocketbook of Mathematical Functions. - Verlag H. Deutsch 1984. [14.2] BEHNKE.H.; SOMMER.F.: Theorie der analytischen Funktionen einer kornplexen Veranderlichen. - Springer-Verlag 1976. [14.3] FICHTENHOLZ, G.M.: Differential- und Integralrechnung, Bd. 2. - Verlag H. Deutsch 1994. E.; BUSAM,R.: Funktionentheorie. - Springer-Verlag 1994. [14.4] FREITAG, [I451 GREUEL,0.;KADNER,H.: Komplexe Funktionen und konforme Abbildungen. -B. G. Teubner 1990. [14.6] JAHNKE, E.; EMDE,F.: Tafeln hoherer Funktionen. - B. G. Teubner 1960. [14.7] JANICH,K.: Funktionentheorie. Eine Einfiihrung. - Springer-Verlag 1993. [14.8] KNOPP,SONI,R.: Funktionentheorie. - Verlag W. de Gruyter 1976. [14.9] MAGNUS.W.; OBERHETTINGER, F.: Formulas and Theorems for the Special Functions of mathematical Physics, 3rd ed. - Springer-Verlag 1966. F.; MAGNUS,W.: Anwendung der elliptischen Funktionen in Physik und [14.10] OBERHETTINGER, Technik. -- Springer-Verlag 1949.
I
[14.11] SCHARK, R.: Funktionentheorie fur Ingenieurstudenten. - Verlag H. Deutsch 1993. [14.12] SMIRNOW: Lehrbuch der hoheren Mathematik, Bd. 111. - Verlag H. Deutsch 1994. [14.13] SPRINGER, S.: Introduction to Riemann Surfaces. - Chelsea Publishing Company 1981. 15. Integral Transformations [l5.l] BLATTER.C.: Wavelets. - Eine Einfuhrung. - Vieweg 1998. [15.2] DOETSCH.G.: Introduction to the Theory and Application of the Laplace Transformation. Springer-Verlag 1974. [ K 3 ] DYKE,P.P.G.: .4n Introduction to Laplace Transforms and Fourier Series. - Springer-Verlag 2000. V.: Integral-Transformationen. - Hiithig 1977. [ K 4 ] FETZER. 0.: Laplace- und Fourier-Transformation. - Hiithig 1993. [l5.5] FOLLINGER, [l5.6] GAUSS,E.: WALSH-Funktionen fur Ingenieure und Naturwissenschaftler. - B. G. Teubner 1994. B.B.: Wavelets. Die Mathematik der kleinen Wellen. - Birkhauser 1997. [l5.7] HUBBARD, R.C.: Fourier Transforms and Convolutions for the Experimentalist. -Pergamon [l5.8] JENNISON, Press 1961. [15.9] LOUIS,A. K.; MAASS.P.; RIEDER,A.: Wavelets. Theorie und Anwendungen. - B. G. Teubner 1994. [15.10] OBERHETTINGER, F.: Tables of Fourier Transforms of Distributions. - Springer-Verlag 1990. F.; BADIL,L.: Tables of Laplace Tkansforms. - Springer-Verlag 1973. [15.11] OBERHETTINGER: [15.12] PAPOCLIS, A.: The Fourier Integral and its Applications. - McGraw Hill 1962. j15.131 SCHIFF,J.L.: The Laplace Transform. - Theory and Applications. Springer-Verlag 1999. [15.14] SIROVICH, L.: Introduction to Applied Mathematics. Springer-Verlag 1988. R.; AN, M.; Lu, C.: Mathematics of Multidimensional Fourier Transforms Al[15.15] TOLIMIERI. gorithms. Springer-Verlag 1997. R . ; AN: M.; Lu, C.: Algorithms for Discrete Transform and Convolution. [15.16] TOLIMIERI, Springer-Verlag 1997. [15.17] VOELKER,D.; DOETSCH,G.: Die zweidimensionale Laplace-Transformation. - Birkhauser 1950. [15.18] WALKER,J.S.: Fast Fourier Transforms. - Springer-Verlag 1996. ~
~
~
16. Probability Theory and Mathematical Statistics
[16.1] BERGER,M.,A.: An Introduction to Probability and Stochastic Processes. - Springer-Verlag 1993. [16.2] BEHNEN, K.; NEUHAUS, G.: Grundkurs Stochastik. - B. G. Teubner 1995. [16.3] BRANDT,S.: Data Analysis. Statistical and Computational Methods for Scientists and Engineers. Springer-Verlag 1999. [16.4] CLAUSS,G.; FINZE, F.-R.; PARTZSCH, L.: Statistik fur Soziologen, Padagogen, Psychologen und Mediziner, Bd. 1. - Verlag H. Deutsch 1995. [16.5] FISZ, M.: Wahrscheinlichkeitsrechnung und mathematische Statistik. - Deutscher Verlag der Wissenschaften 1988. [16.6] GARDINER, C.W.: Handbook of Stochastic Methods. - Springer-Verlag 1997. [16.7] GNEDENKO, B.W.: Lehrbuch der Wahrscheinlichkeitstheorie. - Verlag H. Deutsch 1997. [16.8] HUBNER,G.: Stochastik. - Eine anwendungsorientierte Einfuhrung fur Informatiker, Ingenieure und Slathematiker. - Vieweg, 3rd ed. 2000. [l6.9] KOCH,K.-R.: Parameter Estimation and Hypothesis Testing in Linear Models. SpringerVerlag 1988. [16.10] RINNE:H.: Taschenbuch der Statistik. - Verlag H. Deutsch 1997. [16.11] SHAO,JUN:Mathematical Statistics. - Springer-Verlag 1999. ~
~
1099 [16.12] SINIAI, J.G.: Probability Theory. - Springer-Verlag 1992. [16.13] SOBOL.I.M.: Die Monte-Carlo-Methode. - Verlag H. Deutsch 1991. [16.14] STORM. R.: Wahrscheinlichkeitsrechnung, mathematische Statistik und statistische Qualitatskontrolle. - Fachbuchverlag 1995. J.R.: An Introduction to Error Analysis. - University Science Books 1982, VCH [16 151 TAYLOR, 1988. G.R.: Mathematical Statistics. A Unified Introduction. - Springer-Verlag 1999. [16.16] TERRELL. [16.17] WEBER,H.: Einfuhrung in die Wahrscheinlichkeitsrechnung und Statistik fur Ingenieure. B. G. Teubner 1992. 17. Dynamical Systems a n d Chaos (17.11 ARROWSMITH, D.K.; PLACE, C.M.: An introduction to Dynamical Systems. - Cambridge University Press 1990. K.: Fractal Geometry. - John Wiley 1990. [17.2] FALCONER, [17.3] GUCKENHEIMER, J . ; HOLMES.P.: Non-Linear Oscillations, Dynamical Systems and Bifurcations of Vector Fields. - Springer-Verlag 1990. [17.4] HALE.J.; KOFAK:H.: Dynamics and Bifurcations. - Springer-Verlag 1991. B.: Introduction to the Modern Theory of Dynamical Systems. [17.5] KATOK,A , ; HASSELBLATT, - Cambridge University Press 1995. [17.6] KUZNETSOV, Yu.A.: Elements of Applied Bifurcation Theory. No. 112 in: Applied Mathematical Sciences. - Springer-Verlag 1995. [17.7] LEONOV,G.A.; REITMANN, V.; SMIRNOVA, V.B.: Non-Local Methods for Pendulum-Like Feedback Systems - B. G. Teubner 1987. 17.8 MANE, R.: Ergodic Theory and Differentiable Dynamics. - Springer-Verlag 1994. I.: Chaotic Behaviour of Deterministic Dissipative Systems. 17.9 MAREK.M.; SCHREIBER. Cambridge University Press 1991. [17.10] MEDVED',M.: Fundamentals of Dynamical Systems and Bifurcations Theory. -Adam Hilger 1992. [17.11] PERKO, L.: Differential Equations and Dynamical Systems. - Springer-Verlag 1991.
t l
[17.12] PESIN, YA., B.: Dimension Theory in Dynamical Systems: Contemporary Views and Applications. Chicago Lectures in Mathematics. - The University of Chicago Press 1997. 18. Programming, Optimization [18.1] BELLMAN, R.: Dynamic Programming. Princeton University Press 1957. [18.2] BERTSEKAS, D.P.: Nonlinear Programming. - Athena Scientific 1999. [18.3] CHVATAL, V: Linear Programming. - W.H. Freeman 1983. [18.4] DANTZIG, G.B.: Linear Programming and Extensions. - Princeton University Press 1998. R.B.: Numerical Methods for Unconstrained Optimization and [18.5] DENNIS,J . E . ; SCHNABEL, Nonlinear Equations. - SIAM 1996. [18.6] KUHN,H.W.: The Hungarian Method for the Assignment Problem. Naval. Res. Logist. Quart., 2 (1995). [18.7] MURTY,K.G.: Operations Research: Deterministic Optimization Models. - Prentice Hall 1995. [18.8] ROCKAFELLAR, R.T.: Convex Analysis. - Princeton University Press 1996. [18.9] SHERALI.H.D.; BAZARAA, M.S.; SHETTY,C.M.: Nonlinear Programming: Theory and Algorithms. -John Wiley 1993. M.N.: DANTZIG,G.B.: Linear Programming 1: Introduction. - Springer-Verlag [18.10] THAPA, 1997. [18.11] WOLSEY,L.A.: Integer Programming. - John Wiley 1998. -
-
19. Numerical Analysis 119.11 BRENNER.S.C.; SCOTT, L.R.: The Mathematical Theory of Finite Element Methods. Springer-Verlag 1994. 119.21 CHAPRA.S.C.: CANALE.R.P.: Numerical Methods for Engineers. - McGraw Hill 1989. 119.31 COLLATZ,L.: Numerical'Treatment of Differential Equations. - Springer-Verlag 1966. [19.4] DAVIS,P.J.; RABINOWITZ, P : Methods of Numerical Integration. -Academic Press 1984. [19.5] DE BOOR,C.: A Practical Guide to Splines. - Springer-Verlag 1978. [19.6] GOLUB,G.; ORTEGA,J.M.: Scientific Computing. - B. G. Teubner 1996. [19.7] GROSSMANN, CH.; ROOS,H.-G.: Numerik partieller Differentialgleichungen. - B. G. Teubner 1992. 119.81 HACKBUSCH, W.: Elliptic Differential Equations. - Springer-Verlag 1992. j19.91 HAMMERLIN, G.; HOFFMANN, K.-H.: Numerische Mathematik. - Springer-Verlag 1994. [19.10] HAIRER,E.; NORSETT,S.P.; WANNER,G.: Solving Ordinary Differential Equations. Vol. 1: Nonstiff Problems. Vol. 2: Stiff and Differential Problems. Vol. 3: Algebraic Problems. Springer-Verlag 1994. 119.111 HEITZINGER. I.: VALENTIN. G.: Praxis nichtlinearer Gleichunnen. - C. Hanser , W.:, TROCH. Verlag 1984. [19.12] KIELBASINSKI: A.; SCHWETLICK, H.: Numerische lineare Algebra. Eine computerorientierte Einfiihrung. - Verlag H. Deutsch 1988. [19.13] KNOTHE,K.; WESSELS,H.: Finite Elemente. Eine Einfiihrung fur Ingenieure. - SpringerVerlag 1992. [19.14] KRESS,R.: Numerical Analysis. - Springer-Verlag 1998. [19.15] LANCASTER, P.; SALKAUSKA, S.K.: Curve and Surface Fitting. - Academic Press 1986. [19.16] MAESS,G.: Vorlesungen iiber numerische Mathematik, Bd. 1, 2. - Akademie-Verlag 19841988. G.: Approximation von Funktionen und ihre numerische Behandlung. - Sprin[19.17] MEINARDUS, ger-Verlag 1964. [19.18] NURNBERGER, G.: Approximation by Spline Functions. - Springer-Verlag 1989. [19.19] PAO,Y.-C.: Engineering Analysis. - Springer-Verlag 1998. [19.20] QUARTERONI, A , ; VALLI,A.: Numerical Approximation of Partial Differential Equations. Springer-Verlag 1994. [19.21] REINSCH,CHR.: Smoothing by Spline Functions. - Numer. Math. 1967. [19.22] SCHWARZ, H.R.: Methode der finiten Elemente. - B. G. Teubner 1984. [19.23] SCHWARZ, H.R.: Numerische Mathematik. - B. G. Teubner 1986. [19.24] SCHWETLICK, H.; KRETZSCHMAR, H.: Numerische Verfahren fur Naturwissenschaftler und Ingenieure. Fachbuchverlag 1991. R.: Introduction to Numerical Analysis. - Springer-Verlag 1993. [19.25] STOER,J.; BULIRSCH, [19.26] STROUD,A.H.: Approximate Calculation of Multiple Integrals. - Prentice Hall 1971. W.: Numerische Mathematik fur Ingenieure und Physiker, Bd. 1, 2. - Springer[19.27] TORNIG, Verlag 1990. 19.28 UBERHUBER: C.: Numerical Computation 1, 2. - Springer-Verlag 1997. 19.29 WILLERS,F.A.: Methoden der praktischen Analysis. - Akademie-Verlag 1951. 19.30 ZURMUHL, R.: Praktische Mathematik fur Ingenieure und Physiker. - Springer-Verlag 1984. 1
L
1
I
,
,
-
~
i l
20. Computer Algebra Systems I20.11 BENKER.M.: Mathematik mit Mathcad. - Springer-Verlag 1996. [20.2] BURKHARDT, W . : Erste Schritte mit Mathematica. - Springer-Verlag, 2nd ed. 1996. [20.3] BURKHARDT, W.: Erste Schritte mit Maple. - Springer-Verlag, 2nd ed. 1996. [20.4] CHAR;GEDDES;GONNET;LEONG;MONAGAN: WATT: Maple V Library, Reference Manual. - Springer-Verlag 1991.
[20.5] DAVENPORT, J.H.; SIRET,Y.; TOURNIER, E.: Computer Algebra. -Academic Press 1993. H.: Maple V. - Verlag Markt & Technik 1993. [20.6] GLOGGENGIESSER, [20.7] GRABE,H.-G.; KOFLER.M.: Mathematica. Einfuhrung, Anwendung,Referenz. - AddisonWesley 1999. [20.8] JENKS. R.D.; SUTOR,R.S.: Axiom. Springer-Verlag 1992. [20.9] KOFLER.M.: Maple V, Release 4. -Addison Wesley (Deutschland) GmbH 1996. [20.10] MAEDER,R.: Programmierung in Mathematica. Addison Wesley, 2nd ed. 1991. [20.11] WOLFRAM,S.: The Mathematica Book. - Cambridge University Press 1999. ~
~
21. Tables [21.1] ABRAMOWITZ, M.; STEGUN,I.A.: Pocketbook of Mathematical Functions. - \’erlag H. Deutsch 1984. [21 21 APELBLAT. A.: Tables of Integrals and Series. - Verlag H. Deutsch 1996. [21.3] BRYTSCHKOW. Ju.A.; MARITSCHEW, 0.1.;PRUDNIKOW. A.P.: Tabellen unbestimmter Integrale. Verlag H. Deutsch 1992. [21.4] EMDE,F.: Tafeln elementarer Funktionen. - B. G. Teubner 1969. I.M.: Summen-, Produkt- und Integraltafeln, Bd. 1, 2. Verlag [21.6] GRAD STEIN.^.^.; RYSHIK, H. Deutsch 1981. [21.6] GROBNER,W . ; HOFREITER,N.: Integraltafel, Teil 1: Unbestimmte Integrale, Teil 2: Bestimmte Integrale. Springer-Verlag, Teil l, 1975: Teil2, 1973. E.; EMDE,F.; LOSCH,F.: Tafeln hoherer Funktionen. - B. G. Teubner 1960. 21.7 JAHNKE. 21.8 MADELUNG, E.: Die mathematischen Hilfsmittel des Physikers. - Springer-Verlag 1964. F.: Formulas and Theorems for the Special Functions of [21.9] MAGNUS, W . ; OBERHETTINGER, mathematical Physics, 3rd ed. - Springer-Verlag 1966. [21.10] MULLER, H.P.; NEUMANN,P.; STORM, R.: Tafeln der mathematischen Statistik. C. Hanser Verlag 1979. [21.11] POLJANIN. A.D.; SAIZEW,V.F.: Sammlung gewohnlicher Differentialgleichungen. L’erlag H. Deutsch 1996. [21.12] SCHULER,M.: Acht- und neunstellige Tabellen zu den elliptischen Funktionen, dargestellt mittels des Jacobischen Parameters q. - Springer-Verlag 1955. [21.13] SCHULER,M.; GEBELEIN, H.: Acht- und neunstellige Tabellen zu den elliptischen Funktionen. - Springer-Verlag 1965. [21.14] SCHUTTE,K.: Index mathematischer Tafelwerke und Tabellen. Munchen 1966. ~
~
~
H
~
~
22. Handbooks, Guide Books and Reference Books M.; STEGUN:I. A.: Pocketbook of Mathematical Functions. - Verlag [22.1] ABRAMOWITZ, H. Deutsch 1984. [22.2] BAULE,B.: Die Mathematik des Naturforschers und Ingenieurs, Bd. 1,2.- Verlag H. Deutsch 1979. [22.3] BERENDT,G.; WEIMAR, E.: Mathematik fur Physiker, Bd. 1, 2. - VCH 1990. 22.4 BOURBAKI, N.: The Elements of Mathematics, Vols. Iff. - Springer-Verlag 199Off. 22.5 BRONSTEIN, J.N.; SEMENDJAJEW, K.A.: Taschenbuch der Mathematik. - B. G. Teubner 1989,24.,Auflage; Verlag H. Deutsch 1989, [22.6] BRONSTEIN, J.N.; SEMENDJAJEW, K.A.: Taschenbuch der Mathematik, Erganzende Kapitel. - Verlag H. Deutsch 1991. [22.7] DRESZER, J.: Mathematik. - Handbuch fur Technik und Naturaissenschaft. - Verlag H. Deutsch 1975. [22.8] GELLERT,W . ; KASTNER,H.; NEUBER,S . (Eds.): Fachlexikon ABC Mathematik. - Verlag H. Deutsch 1978. [22.9] FICHTENHOLZ, G.M.: Differential- und Integralrechnung, Bd. 1,3. Verlag H. Deutsch 1994.
I1
~
I
[22.11] GELLERT.W.: KUSTNER,H.; HELLWICH,M.; KASTNER,H. (Eds.): Kleine Enzyklopadie Mathematik. - Verlag Enzyklopadie, Leipzig 1965. [ 22. lo] JOOS.G.: RICHTER,E.W.: Hohere Mathematik fur den Praktiker. - Verlag H. Deutsch 1994. [ 22,121 MANGOLDT, H. v.; KNOPP,K.; HRG. F. LOSCH: Einfuhrung- in die hohere Mathematik, Bd. 1-4. - S. Hirzel-Verlag 1989. H.: MURPHY.G.M.: The Mathematics of Phvsics and Chemistrv. Vols. 1.2. [22.13] MARGENAU. Van Nostrand 1956; Verlag H. Deutsch 1965, 1967. L.: hlathematik fur Ingenieure, Bd. 1-3. - Verlag Vieweg 1994-1996. [22.14] PAPULA, P.; BROD,K.: Hohere mathematische Methoden fur Ingenieure und Physiker. [22.15] PLASCHKO, Springer-Verlag 1989. M.; VOIT, K.; KRAFT, R.: Mathematik fur Nichtmathematiker, Bd. 1, 2. [ 22,161 PRECHT, Oldenbourg-Verlag 1991. [ 22.171 ROTHE: R.: Hohere Mathematik fur Mathematiker, Physiker, Ingenieure, Teil I-IV. B. G. Teubner 1958-1964. E.: Grundlagen der theoretischen Physik, Bd. 1,4.- Deutscher Verlag der Wis[ 22.181 SCHMUTZER, senschaften 1991. W.I.: Lehrbuch der hoheren hlathematik, Bd. 1-5. - Verlag H. Deutsch 1994. [ 22,191 SMIRNOW, [ 22.201 ZEIDLER,E. (ED.): Teubner Taschenbuch der Mathematik. - B. G. Teubner, Teil 1, 1996; Teil 2. 1995. ~
23. Encyclopedias [23.1] EISENREICH, G.. SUBE,R.: Worterbuch Mathematik, Vols. 1, 2 (English, German, French, Russian) - Verlag Technik 1982; Verlag H. Deutsch 1982. [23.2] Encyclopaedia Britannica. [23.3] Encyclopaedia of Mathematical Sciences (Transl. from the Russian). Springer-Verlag 1990. [23.4] Encyclopaedia of Mathematics (Revised Transl. from the Russian). - Kluwer 1987-1993. ~
INDEX
A Abel group
299
integral equation
588
theorem
414
abscissa plane coordinates
189
space coordinates
208
absolute term
272
absolutely continuous
636
convergent
453
integrable
453
absorbing set
797
absorption law Boolean algebra
340
propositional logic
287
sets
293
account number system uniform accumulation factor
331 331 22
accumulation point
604
accuracy, measure of
787
addition complex numbers
36
computer calculation
937
polynomials
11
rational numbers
1
addition theorem area functions
93
hyperbolic functions
89
inverse trigonometric functions
86
trigonometric functions
79
additivity, σ-additivity
633
adjacency
346
matrix
81
348
adjacent side
130
adjoint
259
admittance matrix
353
a.e. (almost everywhere)
634
aggregation operator
368
algebra
251
286
340
745
Boolean finite
341
classical structures
298
commutative
612
factor algebra
338
algebra (Con.) free
339
linear
251
normed
612
Ω algebra
338
Ω-subalgebra
338
σ-algebra
633
Borelian
634
switch algebra
340
term algebra
339
universal algebra
338
344
algorithm Aitken–Neville
916
Dantzig
355
Euclidean
3
Ford-Fulkerson
357
Gauss
276
Gauss algorithm
888
graph theory
346
Kruskal
353
maximum flow
357
QR
283
Rayleigh-Ritz
283
Remes
922
Romberg method
898
theorem for the Euclidean algorithm
322
14
321
alignment chart
123
nomogram
125
α -cut
124
362
strong
362
α-level set
362
strong
362
α-limit set
797
alternating point
920
theorem
920
alternating tensor
265
803
810
altitude cone
156
cylinder
154
polyhedron
151
triangle
132
amortization calculus
23
amplitude function
701
sine function
75
spectrum
725
analysis functional
594
harmonic
914
multi-scale analysis
741
numerical
881
924
analysis (Cont.) vector
640
analytical geometry plane
189
space
207
angle
128
acute
129
between vectors
189
central
130
chord and tangent
139
circumference
139
convex
129
directional
145
escribed circle
140
Eulerian
211
exterior
139
full
129
geodesy
145
inclination
377
interior
139
line and plane
219
lines, plane
128
lines, space
219
measure in degrees
130
names
128
notion
128
140
angle (Cont.) nutation
211
obtuse
129
perigon
130
plane
151
notion
128
plane curves
228
planes, space
216
precession
211
radian measure
130
reduction
174
right
129
round
129
secant and tangent
140
slope
377
tangent
227
solid
151
space curves
246
straight
129
tilt
143
angles adjacent
129
alternate
129
at parallels
129
complementary
129
corresponding
129
angles (Cont.) exterior–interior
129
opposite
129
sum plane triangle
132
spherical triangle
163
supplementary
129
vertex
129
angular
coefficient, curve third degree
66
coefficient, plane
194
frequency
82
annuity
23
calculation
25
payment
25
perpetual
25
annulator
622
annulus
140
area
140
Anosov diffeomorphism
827
anticommutativity, vector product
183
antiderivative
425
antikink soliton
548
antilog
24
10
antisoliton
547
Apollonius circle
681
theorem
199
apothem
132
applicate
208
approximate equation
494
approximate formula, empirical curve
106
approximation
914
asymptotic, polynomial part
15
best, Hilbert space
615
bicubic
932
Chebyshev
920
δ function
715
formulas series expansion in mean
416 401
Liouville theorem
4
numbers
4
partial differential equations
909
problem
614
solution by extreme value
916
401
successive Banach space
619
differential equations, ordinary
494
integral equation
565
uniform
920
approximation (Cont.) using given functions
906
Weierstrass theorem
605
ellipse
200
graph
346
arc
chain
355
hyperbola
203
intersection
149
length circular segment
140
line integral, first type
462
parabola
204
plane curve
140
447
space curve
246
462
sequences
355
Archimedean spiral
103
area annulus sector
140 140
circle
139
circular sector
140
circular segment
140
cosine
91
cotangent
92
curved surface
480
area (Cont.) curvilinear bounded
446
curvilinear sector
446
double integral
472
ellipse
199
formula, Heron’s
143
function
91
hyperbola
202
parabola
204
parallelogram
134
parallelogram with vectors
189
planar figures
446
polyeder with vectors
189
polygon
193
rectangle, square
135
rhombus
135
similar plane figures
133
sine
91
subset
633
surface patch
247
tangent
92
triangle
193
plane
141
143
spherical
163
167
argument function of one variable
47
argument (Cont.)
function of several variables
117
arithmetic sequence
1
18
Arnold tongue
842
arrangement
743
with repetition
744
without repetition
744
744
arrow diagram
294
function
294
article number, European
331
ASCII (American Standard Code for Information Interchange)
933
assignment problem
858
associative law Boolean algebra
340
matrices
254
propositional logic
287
sets
293
tensors
264
associativity, product of vectors
184
astroid
102
472
curve
231
234
definition
234
asymptote
attractor
798
chaotic
826
fractal
826
Hénon
824
hyperbolic
826
Lorenz
825
strange
826
autocorrelation function
816
axial field
642
826
axioms algebra
612
closed set
604
metric spaces
602
normed space
609
open set
604
ordered vector space
599
pseudonorm
622
scalar product
613
vector space
594
axis abscissae plane coordinates
189
space coordinates
208
ordinates plane coordinates
189
space coordinates
208
axis (Cont.) parabola
203
azimuth
159
azimuthal equation
541
B
backward substitution
888
Baire, second Baire category
812
Bairstow method
886
ball, metric space
603
Banach Fixed-point theorem
629
space
610
example
610
series
610
theorem, continuity, inverse operator band structure, coefficient matrix
619 892
barrel
157
circular
157
parabolic
157
base
7 power
7
vector
186
reciprocal vector space
185
186
315
basic formulas plane trigonometry
141
problems plane trigonometry
143
spherical trigonometry
167
theorems propositional logic basis
287 597
contravariant
266
covariant
266
inverse
849
vector contravariant
266
covariant
266
vector space Bayes theorem
597 748
B-B representation curve
932
surface
932
Berge theorem
354
Bernoulli inequality
30
numbers
410
shift
827
Bernoulli–l'Hospital rule
54
Bernstein polynomial
932
Bessel differential equation
507
differential equation, linear, zero order
720
function
507
imaginary variables
507
modified
507
inequality beta function biangle, spherical
616 1050 161
bifurcation Bogdanov-Takens
832
codimension
827
cusp
831
flip bifurcation
834
global
827
homoclinic
838
Hopf bifurcation
829
generalized
832
local
827
mappings, subcritical saddle node
834
pitchfork
834
supercritical
836
832
saddle node
829
transcritical
829
binary number
814
binary system
934
binomial
934
69
coefficient
13
distribution
753
formula
12
linear
69
quadratic
69
theorem
12
binormal, space curve
239
birth process
767
bisection method
283
bisector
142
triangle bit
241
132 933
reversing the order
927
block
152
body of revolution, lateral surface
447
Bolzano theorem one variable several variables Bolzano–Weierstrass property
59 123 626
Boolean algebra finite expression
340
745
341 342
Boolean (Cont.) function n-ary variable
287
342
342 342
Borel set
634
σ-algebra
634
bound function
50
sequence
402
boundary collocation
910
condition
550
uncertain for some variables (fuzzy)
367
boundary value conditions
485
512
problem
485
512
Hilbert
591
Hilbert, homogeneous
591
homogeneous
512
inhomogeneous
512
linear
512
bounded set (order-bounded)
905
600
boundedness of a function one variable several variables
60 123
Bravais lattice
310
break of symmetry, bifurcation
832
Breit–Wigner curve
729
Brouwer fixed-point theorem
631
business mathematics
21
byte
933
C
calculation
complex numbers
36
determinants
260
numerical accuracy
938
basic operations
937
tensors
263
calculus differentiation
377
errors
785
integral
425
observations, measurement error
785
propositional
286
variations
550
canonical form, circle-mapping
841
canonical system
517
Cantor function
842
Cantor set
820
842
821
825
cap, spherical
157
capacity, edge
356
Caratheodory condition
629
Cardano formula
41
cardinal number
291
cardinality, set
298
cardioid
298
98
carrier function
573
Carson transformation
705
Cartesian coordinates plane
189
space
208
folium
94
cartography scale
144
cascade, period doubling
837
Cassinian curve category, second Baire category
840
98 812
catenary curve
88
catenoid
88
105
553
Cauchy integral
590
integral formula
687
integral formulas, application
692
method, differential equations of higher order
500
Cauchy (Cont.) principal value improper integral
455
principal value, improper integral
452
principle
606
problem
516
sequence
605
theorem
388
Cauchy–Riemann differential equations, partial
670
Cayley table
300
theorem
302
353
center circle
197
curvature
230
spherical
175
center manifold theorem differential equations
828
mappings
833
center of area method
372
center of gravity
213
arbitrary planar figure
451
arc segment
450
closed curve
450
double integral
472
center of gravity (Cont.) line integral, first type
462
method
371
generalized
372
parametrized
372
trapezoid
451
triangle
132
triple integral
478
center of mass
191
213
central angle
130
curves
205
field
641
limit theorem, Lindeberg–Levy
763
surface
220
centroid, points having masses
191
chain
297
directed, elementary
355
graph
355
elementary Markov
355 764
stationary
764
time-homogeneous
764
rule
349
641 composite function
stochastic
380 763
764
chaos
795
attractor, strange
826
from torus to chaos
839
one-dimensional mapping
827
routes to chaos
827
through intermittence
840
transition to chaos
837
839
character group element
304
representation of groups
305
characteristic characteristic strip
10 517
Chebyshev approximation
920
continuous
920
discrete
923
formula
87
inequality
31
polynomials
921
theorem
434
Chinese remainder theorem
326
χ² distribution
760
χ² test
773
774
Cholesky decomposition
890
method
278
890
chord
140
theorem
138
circle Apollonius
680
area
138
center
197
chord
138
circumference
138
convergence
688
curvature
230
plane curve
228
dangerous
149
definition
138
197
equation Cartesian coordinates
197
polar coordinates
197
great circle
158
173
intersection
158
mapping
841
parametric representation
197
periphery
138
plane
138
197
radius
138
197
small circle
158
175
tangent
138
198
842
circuit directed, graph
355
Euler
350
Hamilton
351
circuit integral
466
being zero
468
circular field
644
point
249
sector
140
segment
140
circumcircle quadrangle
136
triangle
132
142
circumscribing quadrangle
136
triangle
132
cissoid
95
Clairaut differential equation, ordinary
491
differential equation, partial
518
class defined by identities
339
equivalence class
297
midpoint
771
statistics
770
Clebsch–Gordan coefficient
306
series
306
theorem
306
closure closed linear
614
linear
595
set
605
transitive
295
clothoid
105
code
329
ASCII
933
ASCII (American Standard Code for Information Interchange)
933
public key
329
RSA (Rivest, Shamir, Adleman)
329
codimension
827
coding
329
coefficient Clebsch–Gordan comparison
333
11 306 16
Fourier
418
leading
38
matrix, extended
889
metric
185
vector decomposition
182
186
collinearity, vectors
184
collocation boundary
910
domain
910
method
574
906
points
906
910
column pivoting
889
column sum criterion
892
combination
743
with repetition
743
without repetition
743
combinatorics
743
commensurability
4
743 4
commutative law Boolean algebra
340
matrices
254
propositional logic
287
sets
293
vectors
255
commutativity, scalar product
183
commutator
317
comparable function
556
complement
292
algebraic
259
fuzzy
366
fuzzy set
363
complement (Cont.) orthogonal
614
sets
292
Sugeno complement
366
Yager complement
366
complementary angles formulas
78
completeness relation
616
completion, metric space
608
complex analysis
669
complex function
47
pole
622
671
complex number
34
plane
34
complex mapping
682
complex-valued function
47
complexification
599
composition
369
max-average
369
max-(t-norm)
369
compound interest
22
calculation
22
computation of adjustment
914
computer basic operations
937
error of the method
939
error types
936
938
computer (Cont.) internal number representation
935
internal symbol representation
933
numerical accuracy
938
numerical problems in calculation
936
use of computers
933
computer algebra systems applications
975
basic structure elements
952
differential and integral calculus
989
elements of linear algebra
984
equations and systems of equations
981
functions
953
graphics
994
infix form
953
list
953
manipulation of algebraic expressions
976
numbers
952
object
952
operator
953
prefix form
953
programming
952
purpose
950
set
953
suffix form
953
terms
953
computer algebra systems (Cont.) type
952
variable
953
concave, curve
228
conchoid Nicomedes
96
general
96
of the circle
96
of the line
96
conclusion
370
concurrent expressions
343
condition boundary value, ordinary differential equation
485
Caratheodory
629
Cauchy (convergence)
122
Dirichlet (convergence)
420
initial value, ordinary differential equation
485
Kuhn-Tucker
860
Lipschitz, higher-order differential equation
496
Lipschitz, ordinary differential equation
486
number
891
regularity condition
873
cone
156
central surface
224
circular
156
221
611
cone (Cont.) generating
600
imaginary
224
normal
611
regular
611
solid
611
truncated
156
vector space
597
confidence interval
779
regression coefficient
779
variance
776
limit for the mean
775
prescription
790
probability
774
region
779
congruence algebraic
325
corners
151
linear
326
method
781
plane figures
133
polynomial congruence
328
quadratic
327
relation
338
congruence (Cont.) kernel
339
simultaneous linear
326
system simultaneous linear
326
theorems
133
congruent directly
132
indirectly
133
mapping
132
133
conic section
156
205
singular
207
conjugate complex number conjunction elementary
207
36 286 343
consistency integration of differential equation
904
order p
904
constant Euler in polynomial
458 60
propositional calculus
286
term
272
constants (often used), table continuation, analytic continued fractions
1007 689 3
continuity composite functions
59
elementary function
58
from below
633
function
one variable
57
several variables
122
Hölder
589
continuous, absolutely
636
continuum
298
contour integral vector field
660
contracting principle
606
contraction
264
tensor
268
contradiction, Boolean function
342
control digit
330
convergence absolute complex terms
407 688
alternating series test of Leibniz
408
Banach space
610
circle of convergence
688
condition of Cauchy
52
conditional complex terms
414
407 688
convergence (Cont.) in mean
420
infinite series, complex terms
688
integration of differential equation
904
non-uniform
412
order p
904
sequence of numbers
403
complex terms series complex terms
687 404
406
687
uniform
412
uniformly, function sequences
604
weak
627
Weierstrass criterion
413
convergence criterion comparison criterion
405
D’Alembert’s ratio test
405
integral test of Cauchy
406
root test of Cauchy
406
convergence theorem
404
measurable function
636
conversion, number systems
934
convex, curve
228
convolution Fourier transformation
727
Laplace transformation
711
convolution (Cont.) one-sided
727
two-sided
727
z-transformation
734
coordinate inversion
269
coordinate line
209
266
coordinate surface
209
266
coordinate system double logarithmic
115
Gauss–Krueger
143
left-hand
207
orientation
207
orthogonal
180
orthonormal
180
plane
189
right-hand
207
semilogarithmic
115
Soldner
143
spatial
207
transformation
262
coordinate transformation
210
267
645
equation central curves second order
205
quadratic curve (parabolic)
206
coordinates affine
185
coordinates (Cont.) axis
189
barycentric
914
Cartesian
186
plane
189
space
208
Cartesian, transformation
191
contravariant
187
268
covariant
187
268
curvilinear
190
244
three dimensional cylindrical vector field
266
209 209 645
Descartes
189
equation, space curve
241
Gauss
244
Gauss–Krüger
160
geodetic
143
geographical
160
mixed
267
point
189
polar, plane
190
polar, spherical
209
representation with scalar product
187
Soldner
160
spherical
209
coordinates (Cont.) vector field
645
transformation
190
triangle coordinates
914
vector
182
coprime
5
corner convex
151
figure
150
northwest corner rule
856
symmetric
151
trihedral
151
correction form
893
corrector
903
correlation
777
analysis
777
coefficient
778
empirical
778
cosecant geometric definition
130
hyperbolic
87
trigonometric
76
coset left
301
right
301
cosine geometric definition hyperbolic geometric definition
130 87 131
law
142
law for sides
164
rule, spherical triangle
164
trigonometric
75
cotangent geometric definition
130
hyperbolic
87
trigonometric
76
Coulomb field, point-like charge
644
counterpoint
158
course angle
159
covariance, two-dimensional distribution
778
covering transformation
299
covering, open
798
Cramer rule
275
credit
666
22
criterion convergence sequence of numbers
403
series
404
series with positive terms
405
uniform, Weierstrass
413
criterion (Cont.) divisibility
321
subspace
315
cross product
183
cryptanalysis, classical, methods
335
Kasiski-Friedman test
335
statistical analysis
335
cryptography, conventional methods linear substitution ciphers substitution
333 334 333
monoalphabetic
333
monographic
333
polyalphabetic
333
polygraphic
333
transposition cryptology
333 332
classical Hill cypher method
334
matrix substitution method
334
Vigenere cypher method
334
cryptosystem
332
DES algorithm
337
Diffie–Hellman key exchange
336
encryption context free
332
cryptology (Cont.) context sensitive
332
IDEA algorithm
338
mathematical foundation
332
one-time pad
336
one-way function
337
public key methods
336
RSA method
337
security of cryptosystems
333
subject
332
crystal class
311
crystal system
311
crystallography lattice
310
symmetry group
310
cube
152
curl
666 density
667
field pure
666
zero-divergence field
666
line
653
curvature center
230
circle
230
curves on a surface
247
curvature (Cont.) Gauss surface
249
mean, surface
249
minimal total curvature
929
plane curve
228
radius
230
curve on a surface
247
principal
247
space curve
240
surface
247
constant curvature total
249
249 243
curve algebraic n-th order
93 234
arc cosine
84
arc cotangent
84
arc sine
84
arc tangent
84
Archimedean spiral
194
103
area cosine
91
area cotangent
92
area sine
91
area tangent
92
astroid
102
asymptote
231
234
curve (Cont.) asymptotic point
232
B–B representation
932
cardioid
98
Cartesian folium
94
Cassinian
98
catenary
105
cissoid
95
clothoid
105
concave
228
conchoid of Nicomedes
96
convex
228
corner point
232
cosecant
76
cosine
75
cotangent
76
curvature
228
cuspidal point
232
cycloid
100
damped oscillation
83
directing
154
double point
232
empirical
106
envelope
236
epicycloid
101
epitrochoid
102
curve (Cont.) equation complex form
699
plane
194
second degree
205
second order
205
space
214
error curve
72
evolute
236
evolvent
236
of the circle
225
237
104
exponential
71
fourth order
96
Gauss error curve
757
general discussion
235
hyperbolic cosine
88
hyperbolic cotangent
89
hyperbolic sine
88
hyperbolic tangent
88
hyperbolic type
68
hyperbolic spiral
104
hypocycloid
102
hypotrochoid
102
imaginary
194
inflection point
231
involute
237
70
curve (Cont.) isolated point
232
Koch curve
821
lemniscate
99
length, line integral, first type logarithmic logarithmic spiral loops
462 71 104 1004
Lorentz curve
94
multiple point
233
normal, plane
226
n-th degree
63
194
n-th order
63
194
parabolic type
63
Pascal limaçon
96
plane
225
angle
228
direction
225
vertex
232
quadratic
205
radius of curvature
228
representation with splines
928
secant
76
second degree
205
second order
205
semicubic parabola
229
93
curve (Cont.) sine
75
space
238
spherical
158
spiral
103
strophoid
641
95
tacnode
232
tangent
76
plane
226
terminal point
232
third degree
65
third order
93
tractrix
106
transcendent
194
trochoid
100
witch of Agnesi
172
94
curves family of, envelope
237
second order central curves
205
parabolic curves
206
polar equation
207
spherical, intersection point cut
178 292
Dedekind’s
297
fuzzy sets
362
cut (Cont.) set
292
cutting plane method
874
cycle
349
chain
355
limit
831
cycloid
100
basis
100
common
100
congruent
100
curtate
100
prolate
100
cylinder
154
circular
155
elliptic
223
hollow
155
hyperbolic
223
223
invariant signs elliptic
224
hyperbolic
224
parabolic
224
parabolic
223
cylindrical coordinates
209
cylindrical function
507
cylindrical surface
154
213
D
d'Alembert formula
534
damping parameter
894
damping, oscillations
83
Darboux vector
243
data type
338
De Morgan law
293
rule
287 Boolean algebra
death process debt
341 767 23
decay, radioactive
766
decimal number
934
normalized
936
representation
934
system decoding
934 329
decomposition orthogonal partial fractions vectors
614 15 182
decomposition theorem differential equation, higher order
499
Dedekind cut
297
defect
906
vector space
316
definite, positive
890
defuzzification
371
degeneracy of states
539
910
degree curve, second degree
205
curve, n-th degree
194
homogeneity
120
in-degree
346
matrix
353
measure in degrees
130
out-degree
346
Delambre equations
166
δ distribution
638
δ function
634
application
715
approximation
715
Dirac
712
δ functional
621
density function
750
multidimensional dependence, linear
638
752 272
315
deposit in the course of the year
22
deposit (Cont.) regular
22
single
22
depreciation
26
arithmetically declining
26
digital
27
geometrically declining
27
straight-line
26
derivative complex function
669
constant
378
directional
647
scalar field
647
vector field
648
distribution
639
exterior
380
fraction
380
Fréchet
630
function composite
380
elementary
378
implicit
381
inverse
381
parametric representation
382
several variable
390
generalized
638
derivative (Cont.) higher order
383
inverse function
385
parametric representation
385
interior
380
left-hand
378
logarithmic
380
mixed
393
one variable
377
partial
390
product
379
quotient
380
right-hand
378
scalar multiple
378
Sobolev sense
638
space
647
sum
378
table
379
vector function
640
volume differentiation
648
Derive (computer algebra system)
950
Descartes rule
45
descent
895
descent method
866
determinant
259
differentiation
393
261
determinant (Cont.) evaluation
261
functional
121
Jacobian
121
multiplication
260
reflection
260
rules of calculation
260
Wronskian
498
zero value
260
800
determination of extrema absolute extremum
390
implicit function
390
deviation, standard deviation
751
devil’s staircase
842
diagonal matrix
252
diagonal method, Maxwell
682
diagonal strategy
889
diameter parabola
204
circle
139
conjugate ellipse
199
hyperbola
202
ellipse
199
hyperbola
202
diffeomorphism
796
Anosov
827
orientation-preserving
842
difference bounded
364
finite expression
905
sets
293
symmetric
293
significant
777
symmetric sets
293
z-transformation
733
difference equation
731
boundary value
738
linear
736
partial differential equations
908
second-order
737
difference method
905
partial differential equations
738
908
difference quotient
895
difference schema
18
differentiability complex function
669
function of one variable
377
function of several variables
392
with respect to the initial conditions
795
differentiable
continuously
630
Fréchet
630
differential arc plane
226
surface
245
complete
392
first-order
378
higher order
392
integrability
466
notion
391
partial
392
principal properties
392
quotient (see also derivative)
377
second-order
394
total
392 n-th order
394
second-order
394
differential calculus fundamental theorem differential equation
393
377 386 485
boundary value problem
512
eigenfunction
513
eigenvalue
513
eigenvalues
539
differential equation (Cont.) flow
795
Fourier transformation
729
Laplace transformation
719
numerical solution
495
operational notation
500
order
485
orthogonality
513
self-adjoint
512
singular solution
487
stiff
905
topological equivalent
808
Weber
543
differential equation, higher order
495
constant coefficients
498
decomposition theorem
499
Euler, n-th order
502
fundamental system
498
n-th order
498
lowering the order
497
normal form
503
quadrature
499
superposition principle
499
system
495
system of linear
503
system of solutions
496
499
differential equation, higher order (Cont.) variation of constants
499
differential equation, linear autonomous, on the torus
801
first-order
799
fundamental theorem
799
homogeneous
799
homogeneous systems
504
inhomogeneous
799
inhomogeneous systems
504
matrix-differential equation
799
non-autonomous, on the torus
842
periodic coefficients
801
second order
505
Bessel
507
defining equation
506
Hermite
512
hypergeometric
511
Laguerre
511
Legendre
509
method of unknown coefficients
505
differential equation, ordinary
485
approximate integration
901
autonomous
798
Bernoulli
489
boundary value conditions
485
differential equation, ordinary (Cont.) boundary value problem
485
center
493
central point
493
Clairaut
491
direction field
486
element, singular
491
exact
487
existence theorem
486
explicit
485
first-order
486
approximation methods
494
important solution methods
487
flow
795
fraction of linear functions
492
general solution
485
general, n-th order
485
graphical solution
495
homogeneous
487
implicit
485
initial value conditions
485
initial value problem
485
integral
485
integral curve
486
integral, singular
491
integrating factor
488
490
differential equation, ordinary (Cont.) Lagrange
490
linear
729
constant coefficients
719
first-order
488
variable coefficient
720
linear, planar
795
multiplier
488
non-autonomous
798
notion
485
particular solution
485
point, singular
491
radicals
491
ratio of arbitrary functions
494
Riccati
489
separation of variables
487
series expansion
494
singular point
492
solution
485
successive approximation
494
van der Pol
831
differential equation, partial
515
approximate integration
908
Cauchy–Riemann
528
characteristic system
516
Clairaut
518
809
670
differential equation, partial (Cont.) completely integrable
519
eigenfunction
539
electric circuit
529
elliptic type
520
constant coefficients
552
Euler, calculus of variations
552
field theory
667
first-order
515
canonical system
517
characteristic strip
517
characteristic system
515
linear
515
non-linear
517
non-linear, complete integral
517
total differentials
519
two variables
518
Fourier transformation
730
Hamilton
799
839
heat conduction equation one dimensional
529
one-dimensional
527
three-dimensional
535
Helmholtz
538
hyperbolic type
520
integral surface
516
522
differential equation, partial (Cont.) Laplace
667
Laplace transformation
721
linear
515
non-linear
544
Schrödinger
547
normal system
517
notion
485
parabolic type
520
constant coefficients
522
Poisson
536
quasilinear
515
reduced
828
reduced, normal form
828
second-order
520
constant coefficients
665
668
522
separation of variables
538
ultrahyperbolic type, constant coefficients
522
differential operation review
656
vector components
657
differential operator divergence
651
656
gradient
648
649
Laplace
655
656
nabla
654
656
non-linear
630
656
differential operator (Cont.) relations
656
rotation
652
rules of calculations
656
space
647
vector gradient
650
differential transformation, affine
672
differentiation
377
complex function
669
composite function
395
656
654
function elementary
378
implicit
381
inverse
381
one variable
377
parametric representation
382
several variables
390
graphical
382
higher order inverse function
385
parametric representation
385
implicit function
396
logarithmic
380
several variables
395
under the integration sign
457
volume
648
differentiation rules basic rules
378
derivative of higher order
383
function one variable
378
several variables
378
table
384
vector function
640
vectors
640
diffusion coefficient
536
diffusion equation
549
three-dimensional
393
535
digon, spherical
161
dihedral angle
150
dihedral group
299
dimension
820
capacity
821
correlation
823
defined by invariant measures
822
Douady–Oesterlé
824
formula
315
generalized
823
Hausdorff
820
information
822
lower pointwise
822
Lyapunov
823
dimension (Cont.) measure
822
metric
820
Renyi
823
upper pointwise
822
vector space
315
597
Dirac distribution
638
measure
634
theorem
351
directing curve
814
154
direction cosine, space
210
plane curve
225
space
210
space curve
238
directional angle
145
derivative
649
directrix ellipse
198
hyperbola
201
parabola
203
Dirichlet condition
420
problem
526
668
discontinuity
57
function
57
removable
58
discount
21
discretization error global
904
local
904
discretization step interval
895
discriminant
520
disjoint
292
disjunction
286
elementary
343
dispersion
751
dissolving torus
839
distance
449
Hamming distance
602
line–point
195
metric space
602
planes
217
parallel
217
point–line, space
218
point–plane, space
215
spherical
158
two lines, space
218
two points
191
space
212
distribution
637
binomial
753
χ²
760
continuous
756
derivative
638
Dirac
638
discrete
752
exponential
758
Fisher
761
frequency
770
function
749
function, continuous
750
hypergeometric
753
logarithmic normal
757
lognormal distribution
757
measurement error density
786
normal
756
Poisson
755
regular
638
standard normal
757
Student
761
t distribution
761
theory
712
Weibull
759
distribution problem
638
715
754
858
distributive law Boolean algebra
340
matrices
254
propositional logic
287
ring, field
313
sets
293
tensors
264
distributivity, product of vectors
184
divergence
665
central field
652
definition
651
different coordinates
651
improper
403
proper
403
remark
648
sequence of numbers
403
series
407
theorem
663
vector components
657
vector field
651
divisibility
318
criteria
320
321
division complex numbers
37
computer calculation
937
external
192
division (Cont.) golden section
192
harmonic
192
in extreme and mean ratio
192
internal
192
line segment, plane
192
polynomial rational numbers segment, space divisor
14 1 212 318
greatest common (g.c.d.) integer numbers
321
linear combination
322
polynomials positive
14 320
dodecahedron
154
domain
118
closed
118
convergence, function series
412
doubly-connected
118
function
47
image, set
48
multiply-connected
118
non-connected
118
of attraction
798
of individuals, predicate calculus
289
domain (Cont.) open
118
operator
598
set
48
simply-connected
118
three or multidimensional
118
two-dimensional
118
values, set
685
48
dot product
183
double integral
469
application
472
notion
469
double line
205
dual
341
duality linear programming
854
non-linear optimization
861
theorem, strong
862
duality principle, Boolean algebra
341
dualizing
341
Duhamel formula
720
E
eccentricity, numerical
curve second order
207
ellipse
198
eccentricity, numerical (Cont.) hyperbola
200
parabola
203
edge angle
150
figure
150
graph
346
length
348
valuation
348
multiple
346
sequence
349
cycle
349
directed circuit
349
isolated edge
349
open
349
path
349
effective rate
22
eigenfunction
539
differential equation
513
integral equation
563
normalized
514
eigenvalue
278
differential equation
513
integral equation
563
normalized
514
568
539
568
eigenvalue
278
differential equation
513
integral equation
563
operator
620
539
568
eigenvalue problem general
278
matrices
278
special
278
eigenvector
264
Einstein’s summation convention
262
element
290
finite
911
generic
812
linearly independent
596
neutral
298
positive
599
element of area, plane, table
472
element of surface, curved, table
480
element of volume, table
477
278
620
elementary cell crystal lattice
310
non-primitive
310
primitive
310
elementary formula, predicate logic
289
elementary surface, parametric form
480
elementary volume arbitrary coordinates
477
Cartesian coordinates
474
cylindrical coordinates
475
spherical coordinates
476
elimination method, Gauss
276
elimination step
888
ellipse
198
arc
200
area
199
diameter
198
equation
198
focal properties
198
focus
198
perimeter
200
radius of curvature
199
semifocal chord
198
tangent
199
transformation
205
vertex
198
ellipsoid
220
central surface
224
cigar form
220
imaginary
224
lens form
220
of revolution
220
ellipsoid (Cont.) surface second order
224
embedding, canonical
624
encoding method, RSA
329
encyphering
332
endomorphism, linear operators
599
endpoint
346
energy particle
537
spectrum
538
system
536
zero-point translational energy
539
zero-point vibration energy
544
entropy generalized
823
metric
817
topological
817
827
envelope
236
237
epicycloid
101
curtate
102
prolate
102
epitrochoid
102
equality asymptotic complex numbers matrices
417 34 254
equality relation identity equation
10 10
algebraic
38
43
278
492
40
63
plane
194
225
second degree
205
second order
205
characteristic cubic curve
defining degree Diophantine linear ellipse
506 38 323 323 198
exponential, solution
45
first degree
39
fourth degree
42
homogeneous
627
hyperbola
200
hyperbolic functions, solution inhomogeneous irrational Korteweg de Vries
46 627 39 546
line plane
194
equation (Cont.) space
217
linear
39
logarithmic, solution
46
logistic
796
835
non-linear fixed point
881
numerical solution
881
normal form
38
n-th degree
43
operator equation
627
parabola
203
Parseval
420
pendulum
839
plane
214
general
214
Hessian normal form
214
intercept form
215
plane curve
514
616
194
polynomial numerical solution
884
quadratic
39
root, definition
38
62
Schrödinger linear
536
non-linear
545
equation (Cont.) second degree
39
sine–Gordon
547
solution, general space curve vector form sphere
38 214
238
238 243
surface normal form
220
second order
223
space
213
system of
38
term algebra
339
third degree
40
transcendent
38
trigonometric, solution
46
vector
243
187
equation system linear overdetermined
271 890
non-linear
887
numerical method, iteration
887
direct method
887
893
887
overdetermined
277
row echelon form
887
underdetermined
887
887
equations Delambre
166
L’Huilier
167
Mollweide
142
Neper
167
equilibrium point hyperbolic equivalence
795 802 286
class
297
logic
287
proof
5
relation Eratosthenes sieve
296 318
error absolute
789
absolute maximum error
790
apparent
788
average
787
computer calculation
938
defined
790
density function
786
discretization
939
equation
890
estimation, iteration method
894
input error
938
least mean squares
419
936
789
error (Cont.) mean square
787 788
normally distributed
786
percentage
790
probable
787
relations between error types
787
relative
790
relative maximum error
790
round-off
939
single measurement
788
standard
788
true
788
truncation error
939
type 1 error rate
751
type, measurement errors
785
error analysis
789
789
936
794
error calculus direct problem
938
inverse problem
938
error curve
72
error estimation, mean value theorem
387
error function
459
Gauss
757
error integral, Gauss
459
error orthogonality
906
error propagation law
792 793
error types, computer calculation
938
ess, sup (essential supremum)
637
estimate
769
Euclidean algorithm
3
norm
316
vector norm
258
vector space
316
14
Euler angle
211
broken line method
901
circuit
350
constant
458
differential equation n-th order
502
variational calculus
552
formula
247
function
329
graph
350
integral of the second kind
457
integral, first kind
1050
integral, second kind
1050
numbers
410
polygonal method
901
418
Euler (Cont.) relation complex numbers
35 696
theorem
154
trail
350 open
350
Euler–Hierholzer theorem
350
event
745
certain
745
complete system of
746
elementary
745
impossible
745
independent
748
random
745
set of events
745
simple
745
evolute
748
236
evolution equation
545
function
545
evolvent of circle
236
237
104
excess, spherical triangle
163
exchange theorem
393
exchange, cyclic, sides and angles
141
excluded middle
287
existential quantifier
289
expansion Fourier expansion, forms
421
Laplace expansion
259
Laurent expansion
690
Maclaurin
416
Taylor
415
Taylor expansion
387
expectation
751
expected value
751
bivariate distribution
777
two-dimensional distribution
777
exponent exponential distribution exponential function
7 758 71
complex
677
general
697
natural
696
exponential integral exponential sum
459 72
expression algebraic manipulation analytic
10 976 48
domain
48
explicit form
48
expression (Cont.) implicit form
48
parametric form
49
Boolean
342
concurrent
343
explicit form finite partial differential equations
48 905 980
implicit form
48
integral rational
11
irrational
11
non-polynomial, manipulation parametric form propositional logic rational
977 49 286 11
semantically equivalent
343
tautology
288
transcendent
17
14
11
vector analysis
657
extension principle
366
extension theorem of Hahn
623
extension, linear functional
622
extraction of the root complex numbers real numbers extrapolation principle
38 8 899
extremal, radius of curvature
556
extreme value
388
absolute
388
determination
389
higher derivatives
389
side conditions
401
sign change
388
function
399
relative
388
F
face, corner
151
factor algebra
338
factor group
302
factor ring
314
factor, polynomial, product representation
43
factorial
13
generalization of the notion
460
factoring out
11
Falk scheme
254
feasible set
845
Feigenbaum constant
835
FEM (finite element method)
910
FFT (fast Fourier transformation)
925
837
Fibonacci numbers
323
843
Fibonacci (Cont.) sequence field
323 313
axial field
642
central symmetric
641
circular
644
conservative
660
Coulomb field, point-like charge
644
cylinder symmetric
642
extension
313
flow
662
function
679
gravitational, point mass
667
Newton field, point-like mass
644
potential
660
scalar field
641
source field
665
spherical field
641
666
field theory basic notions
640
differential equations, partial
667
fields, superposition
667
finite difference method
532
finite element method
532
910
fitting problem different versions
401
fitting problem (Cont.) linear
890
non-linear
107
fixed point conformal mapping
673
flip bifurcation
834
fixed-point number
935
floating-point number
935
IEEE standard
936
Maple
967
Mathematica
954
semilogarithmic form
935
674
Floquet representation
803
theorem
801
flow differential equation
795
edge
356
scalar field
662
vector field scalar flow
662
vector flow
662
focal line, tractrix
106
focus
807
compound
830
ellipse
198
focus (Cont.) hyperbola
200
parabola
203
saddle
802
saddle focus
807
stable
802
form quadratic
890
saddle
249
formula binomial
12
Cardano
41
closed, predicate logic
289
d’Alembert
534
Duhamel
720
Euler
247
Heron’s
143
interpretation, predicate logic
289
Kirchhoff
534
Leibniz
383
Liouville
499
manipulation
950
418
800
de Moivre complex number
37
hyperbolic functions
90
trigonometric functions
79
Pesin
819
Plemelj and Sochozki
591
Poisson
534
predicate logic
289
rectangular
896
Riemann
528
Simpson
897
Stirling
460
tangent
142
tautology
290
826
Taylor one variable
387
several variables
395
trapezoidal
896
formulas Frenet
243
predicate logic
289
four-group, Klein’s
302
Fourier analysis
418
Fourier coefficient
418
asymptotic behavior
420
determination
401
numerical methods
422
Fourier expansion
924
418
complex functions
424
forms
421
Fourier expansion (Cont.) symmetries
420
Fourier integral
422
complex representation
722
equivalent representations
722
Fourier series
419
best approximation
615
complex representation
419
Hilbert space
615
Bessel inequality Fourier sum complex representation
616 419
722
addition law
725
comparing to Laplace transformation
728
convolution
727
definition
924
925
Fourier transformation
two-sided
722
711 723
differentiation image space
726
original space
726
discrete complex
925
fast
925
Fourier cosine transformation
723
Fourier exponential transformation
724
Fourier sine transformation
723
Fourier transformation (Cont.) frequency-shift theorem
726
integration
726
image space
726
original space
727
inverse
723
linearity law
725
shifting theorem
726
similarity law
726
spectral interpretation
724
survey
705
tables
724
transform
728
fractal
820
fractile
750
fraction continued
3
decimal
1
improper
15
proper
15
fractional part of x frames
49 741
Fréchet derivative
630
differential
630
Fredholm alternative
627
alternative theorem
569
integral equation
561
first kind
575
solution method
567
theorems
567
Frenet formulas frequency angular/radial
243 75 82
distribution
770
locking
842
relative
746
sine
607
82
spectrum
725
continuous
423
discrete
423
statistics cumulative Fresnel integral
746
770
771 695
frustum cone
156
pyramid
152
function absolutely integrable algebraic
47 453
456
60
function (Cont.) amplitude function, elliptic
701
analytic
670
area
91
arrow
294
autocorrelation
816
Bessel
507
modified
507
beta function
1050
Boolean
287
bounded
50
circular, geometric definition
130
comparable
556
complement
366
complex
47
algebraic
696
bounded
671
linear
673
linear fractional
674
quadratic
675
square root
675
complex variable
669
complex-valued
47
composite
62
derivative
380
intermediate variable
380
342
669
continuity in interval
57
one-sided
57
piecewise
57
continuous, complex
669
cosecant
76
cosine
75
cotangent
76
cyclometric
84
cylindrical
507
density
750
dependent
121
discontinuity removable
57 58
discrete
732
distribution
749
double periodic
702
elementary
60
elementary, transcendental
696
elliptic
435
entire rational
60
error function
459
Euler
329
even
50
exponential
61
exponential, complex
691
702
71
677
discontinuity (Cont.) exponential, natural
696
function series
412
gamma
459
generalized
637
Green
530
Hamilton
518
799
harmonic
667
670
Heaviside
639
Hermite
614
holomorphic
670
homogeneous
120
homographic
61
65
hyperbolic
87
697
geometric definition
130
131
impulse
712
638
715
function (continued II) increment
904
independent
121
integrable
439
integral rational
60
first degree
62
n-th degree
63
second degree
62
third degree
63
inverse
51
function (continued II) (Cont.) complex, hyperbolic
697
complex, trigonometric
697
derivative
381
derivative of higher order
385
existence
60
hyperbolic
91
trigonometric
61
84
irrational
61
69
Jacobian
701
Lagrangian
860
Laguerre
614
Laplace
667
limit
51
at infinity
53
infinity
52
iterated
122
left-hand
52
right-hand
52
Taylor expansion
55
limit theorems
53
linear
60
62
linear fractional
61
65
local summable
637
logarithm, complex
676
logarithmic
61
71
function (continued II) (Cont.) loop function
1004
MacDonald
507
matrix exponential
507
mean value
442
measurable
634
meromorphic
691
702
717
monotone decreasing
49
increasing
49
non-elementary
60
notion
47
odd
50
of angle
74
one variable
47
order of magnitude
55
parametric representation derivative of higher order
385
periodic
50
piecewise continuous
57
point of discontinuity
57
finite jump
58
tending to infinity
57
positive homogeneous
556
power primitive
714
70 425
parametric representation (Cont.) quadratic random variable
60 749
rational
61
real
47
regular
670
Riemann
528
sample function
768
secant
76
several variables
47
sign of
49
sine
74
special fractional linear fractional
64
step
732
stream
679
strictly monotone
676
summable
635
switch
344
tangent
76
theory
669
theta
703
transcendental
61
trigonometric
61
truth
117
49
sum of linear and fractional linear functions
geometric definition
64
74
697
130 286
287
parametric representation (Cont.) Weber
507
Weierstrass
703
function system orthogonal
917
orthonormal
917
function theory
669
functional
551
definition linear
47 317
linear continuous p
L space functional determinant
617
598
599
266
472
621 622 121
fundamental form first quadratic, of a surface
245
second quadratic, of a surface
248
fundamental formulas spherical trigonometry
164
fundamental laws, set algebra
292
fundamental matrix
800
fundamental problem first, triangulation
147
second, triangulation
148
fundamental system differential equation, higher order
498
fundamental theorem Abelian groups algebra
302 43
elementary number theory
319
integral calculus
440
future value
25
fuzzy control
372
inference
370
linguistics
359
logic
358
logical inferences
370
product relation
369
relation
367
relation matrix
368
system
375
systems, applications
372
valuation
367
fuzzy set aggregation
363
aggregation operator
368
complement
363
composition
369
cut (representation theorem)
362
degree
362
empty
361
fuzzy set (Cont.) intersection
363
intersection set
364
level set
362
normal
362
peak
361
similarity
362
subnormal
362
subset
361
support
358
tolerance interval
361
union
363
universal
361
fuzzy sets cut
362
intersection
364
union
364
G
g.c.d. (greatest common divisor)
322
g.c.d. and l.c.m., relation between
322
Gabor transformation
741
Galerkin method
906
gamma function
457
459
276
888
Gauss algorithm
Gauss (Cont.) coordinates
244
curvature, surface
249
elimination method
276
error curve
757
error function
757
error integral
459
error propagation law
793
integral theorem
663
least squares method
401
plane
34
step
276
transformation
277
Gauss–Krüger coordinates
161
Gauss–Newton method
894
derivative free
895
Gauss–Seidel method
892
generating line
154
generator
154
ruled surface geodesic line
887
891
916
781
891
222 250
geodesy angle
145
coordinates
143
polar coordinates
143
geometric sequence
19
geometry
128
analytical
180
plane
189
space
207
differential
226
plane
128
Girard theorem
164
golden section gon
2
4
192
145
gradient definition
649
different coordinates
649
remark
648
scalar field
648
vector components
657
vector gradient
650
Graeffe method
649
655
886
graph alternating way
354
arc
346
bipartite
347
complete
347
complete bipartite
347
components
349
connected
349
cycle
355
355
843
graph (Cont.) directed
346
directed circuit
355
directed edge
346
edge
346
Euler
350
flow
356
increasing way
354
infinite
347
isomorphism
347
loop
346
mixed
346
non-planar
355
partial
347
planar
355
plane
347
regular
347
simple
346
special classes
347
strongly connected
355
subdivision
355
subgraph
347
transport network
347
tree
347
undirected
346
vertex
346
graph (Cont.) weighted
348
graph paper double logarithmic
115
log–log paper
115
notion
115
reciprocal scale
116
semilogarithmic
115
graph theory, algorithm
346
gravitational field, point mass
667
great circle
158
173
greatest common divisor (g.c.d.) integer numbers
321
linear combination
322
polynomials
14
Green function
530
integral theorem
664
method
529
group
530
299
Abelian
299
dihedral
299
element
299
character
304
inverse
299
factor group
302
group (Cont.) permutation
301
point group
308
representation
303
irreducible
305
reducible
305
subgroup
300
symmetry group
308
group table Cayley table grouping
300 300 11
groups applications
307
direct product
301
homomorphism
302
homomorphism theorem
302
isomorphism
302
growth factor
22
Guldin’s first rule
451
Guldin’s second rule
451
H
half-angle formulas
plane trigonometry
142
spherical trigonometry
165
half-line, notion
128
half-side formulas
165
Hamel basis
597
Hamilton circuit
351
differential equation
813
differential equation, partial
799
839
function
518
799
system
813
Hamiltonian
536
Hamming distance
602
Hankel transformation
705
harmonic analysis
914
harmonic, spherical first kind
509
second kind
510
Hasse diagram
298
heat conduction equation one-dimensional
527
three-dimensional
535
721
Heaviside expansion theorem
717
function
639
695
unit step function
712
1078
helix
242
Helmholtz, differential equation
538
Hénon mapping
796
810
Hermite differential equation
512
polynomial
512
trapezoidal formula
897
543
Hessian matrix
867
normal form line equation, plane
195
plane equation, space
214
hexadecimal number
934
system
934
Hilbert boundary value problem
591
homogeneous
591
inhomogeneous
592
matrix
986
space
613
isomorphic
616
pre-Hilbert space
613
scalar product
613
histogram, statistics
770
hodograph, vector function
640
Hölder condition
589
continuity
589
Hölder (Cont.) inequality
32
integrals
32
series
32
Holladay, theorem
929
holoedry
311
homeomorphism and topological equivalence
808
conjugate
811
orientation preserving
841
homomorphism
302
algebra
612
groups
302
linear operators
598
natural
302
ring
314
theorem
339
groups
302
ring
314
vector lattice
601
Hopf bifurcation
829
Hopf–Landau model, turbulence
839
339
314
Horner rule
935
scheme
884
two rows
935
885
horseshoe mapping
825
Householder method
278
tridiagonalization
283
L'Huilier equations
918
167
hull convex
597
linear
595
Hungarian method
858
hyperbola
200
arc
203
area
202
asymptote
201
binomial
69
conjugate
202
diameter
202
equation
200
equilateral
64
focal properties
200
focus
200
radius of curvature
202
segment
202
semifocal chord
200
tangent
201
transformation
205
vertex
200
203
hyperbolic cosecant
87
cosine
87
cotangent
87
secant
87
sine
87
tangent
87
hyperbolic function
697
geometric definition
131
inverse, complex
697
hyperboloid one sheet central surface two sheets central surface hyperplane
220 220
222
224 220 224 623
of support
624
hypersubspace
623
hypersurface
117
hypocycloid
102
curtate
102
prolate
102
hypotenuse
130
hypothesis testing
777
hypotrochoid
102
I
icosahedron
154
ideal
313
principal ideal
313
idempotence law Boolean algebra
341
propositional logic
287
sets
293
identically valid
10
identity
10
Boolean function
342
representation of groups, identity matrix
253
IEEE standard
936
IF–THEN rule
370
iff (if and only if)
318
image function
48
set
48
space
48
subspace
705
315
imaginary number
34
part
34
unit
34
implication
286
proof
5
impulse function
712
impulse, rectangular
712
incidence function
346
incidence matrix
348
incircle quadrangle
136
triangle
132
143
4
804
21
392
linear
272
596
path of integration
466
685
potential field
660
incommensurability increment independence
induction step inequality
5 28
arithmetic and geometric mean
30
arithmetic and quadratic mean
30
Bernoulli
30
Bessel binomial Cauchy–Schwarz Chebyshev Chebyshev, generalized
616 31 31 31
752
32
different means
30
first degree
33
Hölder
32
linear
33
first degree
33
solution
33
Minkowski
32
product of scalars
31
pure
28
quadratic
33
solution
33
Schwarz–Buniakowski
613
second degree
33
solution
28
special
30
triangle
181
norm
609
triangle inequality complex numbers
30
real numbers
30
infimum
600
infinite denumerable
298
non-denumerable
298
infinitesimal quantity
439
infinity
1
infix form
298
inflection point
231
initial conditions
485
initial value problem
485
inner product
613
388
389
901
inscribed circle quadrangle
136
triangle
132
inscribed pentagram
138
instability, round-off error, numerical calculation
939
insurance mathematics
21
integer
1
non-negative
part of x
49
programming
845
integrability complete
519
condition
467
conditions
466
differential
466
quadratic
577
660
integral absolutely convergent
456
antiderivative
425
basic
integral (Cont.) notion
426
calculus
425
circuit
460
466
complex
definite
683
indefinite
684
complex function measurable function
635
elementary functions
426
error integral
459
Euler
457
exponential integral
459
Fourier integral
422
Fresnel
695
interval of integration
439
Lebesgue integral
635
comparison with Riemann integral
722
451
limits of integration
439
line integral
460
first type, applications
459
462
logarithmic integral
429
lower limit
439
non-elementary
429
parametric
457
primitive function
425
458
integral (Cont.) probability integral
757
Riemann
439
comparison with Stieltjes integral Stieltjes
451 626
comparison with Riemann integral
451
notion
451
surface integral
477
triple integral
474
upper limit
439
volume integral
474
integral calculus
662
425
fundamental theorem
440
mean value theorem
442
integral cosine
458
integral curves
796
456
integral equation Abel
588
adjoint
563
590
approximation successive
565
characteristic
590
collocation method
574
eigenfunction
563
eigenvalue
563
first kind
561
integral equation (Cont.) Fredholm
561
degenerate kernel
575
first kind
575
second kind
562
general form
561
homogeneous
561
inhomogeneous
561
iteration method
565
kernel
561
degenerate
562
iterated
585
product
562
kernel approximation tensor product
582
572 572
linear
561
Nyström method
571
orthogonal system
577
perturbation function
561
quadratically integrable function
577
quadrature formula
570
second kind
561
singular
588
Cauchy kernel
607
580
589
transposed
563
Volterra
561
607
integral equation (Cont.) convolution type
585
first kind
583
second kind
583
integral equation method, closed curve
532
integral exponential function, table
1045
integral formula
Cauchy
687
application
692
Gauss
664
integral logarithm, table
1047
integral norm
637
integral surface
516
integral test of Cauchy
406
integral theorem
663
Cauchy
686
Gauss
663
Green
664
Stokes
664
integral transformation
705
application
707
Carson
705
definition
705
Fourier
705
Gabor
741
Hankel
705
integral transformation (Cont.) image space
705
inverse
705
kernel
705
Laplace
705
linearity
705
Mellin
705
multiple
707
one variable
705
original space
705
several variables
707
special
705
Stieltjes
705
Walsh
742
wavelet
738
fast
741
integral, definite
438
differentiation
441
notion
425
particular integral
440
table
739
1050
algebraic functions
1053
exponential function
1051
logarithmic function
1052
trigonometric functions
1050
integral, elliptic
435
first kind
429
second kind
435
series expansion
460
table
435
1055
third kind
435
integral, improper
438
Cauchy’s principal value
452
455
convergent
452
454
divergent
452
454
infinite integration limits
451
notion
451
principal value
452
unbounded integrand
451
integral, indefinite
425
basic integral
426
cosine function, table
1037
cotangent function, table
1043
elementary functions
701
426
elementary functions, table
1017
exponential function, table
1045
hyperbolic functions, table
1044
inverse hyperbolic functions, table
1049
inverse trigonometric function, table
1048
irrational functions, table
1024
logarithmic functions, table
1047
integral, indefinite (Cont.) notion
426
other transcendental functions, table
1044
sine and cosine function, table
1039
sine function, table
1035
table
1017
tangent function, table
1043
trigonometric functions, table
1035
integral, logarithmic
429
integrand
426
integrating factor
488
439
integration approximate ordinary differential equation
901
partial differential equation
908
complex plane
683
constant
426
function, non-elementary
458
graphic
430
graphical
444
in complex
692
interval
439
limit depending on parameter
457
lower
439
upper
439
integration (Cont.) logarithmic
428
numerical
895
multiple integrals
900
partial
429
power
428
rational functions
430
rules by series expansion
429
by substitution
429
constant multiple rule
427
definite integrals
441
general rule
427
indefinite integrals
428
interchange rule
442
interval rule
441
series expansion
444
sum rule
427
under the integration sign
457
variable
439
variable, notion
426
vector field
658
volume
448
integrator
445
intensity, source
666
interaction, soliton
545
458
intercept theorem interest
134 22
calculation
21
compound
22
intermediate value theorem one variable
59
several variables
123
intermediate variable
380
intermittence
837
International Standard Book Number (ISBN)
330
840
interpolation Aitken–Neville
915
condition
914
formula Lagrange
915
Newton
914
fuzzy system
375
knowledge-based
378
node
895
equidistant
376
901
points
914
quadrature
896
spline
914
bicubic
930
cubic
928
trigonometric
914
928
924
interpretation formula, predicate logic
289
variable
287
intersection
292
angle
159
by two oriented lines
147
fuzzy set
363
on the sphere
171
point four planes
216
lines
195
plane and line
218
three planes
216
two lines, space
219
set, fuzzy set
364
sets
292
without visibility
147
interval convergence numbers
414 2
order (0)–interval
600
rule
441
statistics
770
invariance rotation invariance
265
transformation invariance
265
invariance (Cont.)
translation invariance
265
invariant
704
quadratic curve
206
scalar
212
scalar invariant
184
surface second order
224
inverse
599
inverse function
51
hyperbolic
91
trigonometric
84
inverse transformation
705
inversion Cartesian coordinate system
269
conformal mapping
674
space
269
involute
237
irrational algebraic
2
quadratic
2
isometry, space
609
isomorphic vector spaces
599
isomorphism
302
Boolean algebra
341
graph
347
339
isomorphism (Cont.) groups
302
surjective norm
624
iteration
881
inverse
283
method
283
881
881
893
ordinary sequential steps
893
simultaneous steps
893
vector
284
892
J
Jacobi
function
701
method
283
892
determinant
121
266
matrix
894
Jacobian
Jordan matrix
282
normal form
282
jump, finite
58
K
KAM (Kolmogorov–Arnold–Moser) theorem
813
ker
339
699
kernel
302
approximation integral equation
572
spline approach
573
tensor product
572
congruence relation
339
homomorphism
339
integral equation
561
degenerate
572
iterated
566
resolvent
568
solving
566
integral transformation
705
operator
599
ring
314
subspace
315
key equation
123
kink soliton
548
Kirchhoff formula
534
Klein’s four-group
302
Koch curve
821
Korteweg de Vries equation
545
585
Kronecker generalized delta
265
product
258
Kronecker (Cont.) symbol
253
Kuan
360
Kuhn-Tucker conditions
860
reference Kuratowski theorem
265
624 355
L
l.c.m. (least common multiple)
322
Lagrange function
401
identity, vectors
185
interpolation formula
915
method of multiplier
401
theorem
301
Lagrangian
401
860
Laguerre differential equation, linear, second order
511
polynomial
511
Lanczos method
283
Landau, order symbol
56
Laplace differential equation, partial
536
expansion
259
wave equation
534
667
Laplace operator different coordinates
655
polar coordinates
399
vector components
657
Laplace transformation
708
addition law
709
comparing to Fourier transformation
728
convergence
708
convolution
711
one-sided
711
convolution, complex
712
definition
708
differential equation, ordinary linear, constant coefficients
719
linear, variable coefficients
720
differential equation, partial
721
differential equations
719
differentiation, image space
710
differentiation, original space
710
discrete
734
division law
711
frequency-shift theorem
709
image space
708
integration, image space
710
integration, original space
710
inverse
708
716
Laplace transformation (Cont.) inverse integral
718
linearity law
709
original function
708
original space
708
partial fraction decomposition
716
piecewise differentiable function
713
series expansion
717
similarity law
709
step function
712
survey
705
table
1061
transform
708
translation law
709
largest area method
372
lateral area cone
156
cylinder
154
polyhedron
151
lateral face
151
latitude, geographical
160
lattice
340
Banach
612
Bravais
310
crystallography
310
distributive
340
244
lattice (Cont.) vector
612
Laurent expansion
690
series
690
cosine
164
734
law
spherical triangle
164
large numbers
762
Bernoulli
762
limit theorem, Lindeberg–Levy
763
sine–cosine polar layer, spherical
164 164 157
least common multiple (l.c.m.), integer numbers least squares method
322 107
277
910
916
calculus of observations
785
Gauss
401
regression analysis
778
least squares problem, linear
891
887
Lebesgue
integral
452
635
comparison with Riemann integral
451
measure
634
814
906
left singular vector
285
left-hand coordinate system
207
leg or side of an angle
128
Legendre differential equation
509
polynomial associated
511
first kind
509
second kind
510
symbol
327
Leibniz alternating series test
408
formula
383
lemma Jordan
693
Schur
306
lemniscate
99
234
length interval
633
line integral, first type
462
reduced
174
vector
189
length, arc space curve
246
level curve
117
level (Cont.) line
642
surface
642
level set (fuzzy set)
362
library (numerical methods)
941
Aachen library
943
IMSL library
942
NAG library
941
limit cycle
804
stable
804
unstable
804
definite integral
438
831
function complex variable one variable several variables theorems
669 51 122 53
function series
412
integration, depending on parameter
457
partial sum
412
sequence of numbers
403
sequence, in metric space
604
series
404
superior
414
line
62
curvature, surface
248
geodesic
159
imaginary
205
notion
128
space
150
vector equation
188
214
line element surface
245
vector components
658
line equation plane
194
Hessian normal form
195
intercept form
195
polar coordinates
195
through a point
194
through two points
194
slope, plane
194
space
217
line integral
460
Cartesian coordinates
659
first type
461
applications
462
general type
465
second type
462
vector field
658
linear combination, vectors
linear form
continuous
83
181
598
599
184
621
linear programming assignment problem
858
basic notions
847
basic solution
848
basis of the extreme point
847
constraints
844
degenerate extreme point
847
distribution problem
858
dual problem
854
duality
854
extreme point
847
normal form
847
objective function
844
primal problem
854
round-tour problem
859
surplus variable
848
848
linearly dependent
315
independent
315
lines angle between, plane
196
intersection point, plane
195
orthogonal
128
197
lines (Cont.) parallel
128
pencil
196
perpendicular
128
skew
150
150
197
197
Liouville approximation theorem
4
formula
499
theorem
799
800
Lipschitz condition higher-order differential equation
496
ordinary differential equation
486
locus, geometric
194
logarithm binary Briggsian complex function
10 9 676
decimal
9
definition
9
natural
9
Neperian
9
principal value table taking of an expression logarithmic decrement logarithmic integral
696
696 10 9 83 429
458
logarithmic normal distribution
757
logic
286
fuzzy
358
predicate
289
propositional
286
longitude, geographical
160
loop function
1004
loop, graph
346
Lorentz curve
729
Lorenz system
796
loxodrome
177
arclength
177
course angle
178
intersection point
178
intersection point of two loxodromes
179
p
L space
244
799
837
637
LU factorization (lower and upper triangular matrix)
888
Lyapunov exponent
818
function
801
M MacDonald function
507
Maclaurin series expansion
416
Macsyma (computer algebra system)
950
majorant series
411
manifold center manifold theorem local
833
mappings
833
of solutions
887
stable
804
810
unstable
804
810
manipulation algebraic expressions
976
non-polynomial expressions
977
mantissa
10
manyness set
298
Maple (computer algebra system)
950
936
Maple addendum to syntax
975
algebraic expressions
969
manipulation
978
multiplication
978
array
970
attribute
975
basic structure elements
965
context
975
conversion of numbers, different bases
968
differential equations
994
differential operators
973
Maple (Cont.) differentiation
992
eigenvalues, eigenvectors
988
elements of linear algebra
986
environment variable
975
equation one unknown
983
transcendental
984
factorization of polynomials
979
floating-point number, conversion
967
formula manipulation, introduction
950
functions
972
graphics
1002
introduction
952
three-dimensional
1004
two-dimensional
1002
help and information
975
input, output
946
input–output
950
966
integral definite
993
indefinite
993
multiple
994
lists
969
manipulation, general expressions
980
matrices
970
Maple (Cont.) numerical calculations, introduction
951
numerical mathematics
946
differential equations
949
equations
947
expressions and functions
946
integration
948
object classes
965
objects
965
operations important
983
polynomials
979
operators functions
973
important
968
partial fraction decomposition
980
programming
974
sequences
969
short characteristic
950
special package plots
1004
system description
965
system of equations
984
systems of equations, linear
986
table structures
970
types
965
types of numbers
967
Maple (Cont.) vectors
970
mapping
294
between groups
302
bijective
296
complex number plane
682
conformal
672
circular transformation
674
exponential function
677
fixed-point
673
inversion
674
linear fractional
674
linear function
673
logarithm
676
quadratic function
675
square root
675
296
598
674
sum of linear and fractional linear functions
676
contracting
606
equivalent
841
function
48
Hénon mapping
810
horseshoe
825
injective
296
inverse mapping
296
kernel
302
294
598
mapping (Cont.) lifted
841
linear
315
modulo
814
one-to-one
296
Poincaré
825
Poincaré mapping
806
reduced
833
regular
316
rotation mapping
815
shift
818
surjective
296
tent
814
topological conjugate
811
unit circle
841
marginal distribution
316
598
598
752
mass double integral
472
line integral, first type
462
triple integral
478
matching
354
maximal
354
perfect
354
saturated
354
Mathcad (computer algebra system)
950
Mathematica (computer algebra system)
950
Mathematica 3D graphics
1000
algebraic expressions, manipulation
976
algebraic expressions, multiplication
976
apply
963
attribute
964
basic structure element
953
context
964
curves parametric representation
999
two-dimensional
998
differential and integral calculus
989
differential equation
991
differential quotient
989
differentiation
962
eigenvalues, eigenvectors
986
elements
953
elements of linear algebra
984
equation
981
manipulation
981
transcendent
982
expression
953
factorization, polynomials
976
FixedPoint
962
FixedPointList
962
floating–point number, conversion
955
Mathematica (Cont.) formula manipulation, introduction
950
function, inverse
962
functional operations
962
functions
960
graphics
995
functions
997
introduction
952
options
996
primitives
995
head
953
input and output
943
input–output
950
954
integral definite
990
indefinite
990
multiple
991
lists
956
manipulation with matrices
959
manipulation with vectors
959
Map
963
matrices as lists
958
messages
965
Nest
962
NestList
962
numerical calculations, introduction
951
Mathematica (Cont.) numerical mathematics
943
curve fitting
943
differential equations
945
integration
945
interpolation
944
polynomial equations
944
objects, three-dimensional
1001
operations, polynomials
977
operators, important
956
partial fraction decomposition
977
pattern
961
programming
963
short characteristic
950
surfaces
1000
surfaces and space curves
1000
syntax, additional information
964
system description
953
system of equations
982
general case
985
special case
984
types of numbers
954
vectors as lists
958
mathematics, discrete
286
matrices arithmetical operations
254
matrices (Cont.) associative law
254
commutative law
254
distributive law
254
division
254
eigenvalue
278
eigenvalue problem
278
eigenvectors
278
equality
254
multiplication
254
powers
258
rules of calculation
257
matrix
255
251
adjacency
348
adjoint
251
admittance
353
anti-Hermitian
253
antisymmetric
252
block tridiagonal
909
complex
251
conjugate
251
deflation
284
degree
353
diagonal
252
distance
349
extended coefficient
889
260
matrix (Cont.) full rank
277
fundamental
800
Hermitian
253
Hessian
867
identity
253
incidence
348
inverse
256
inverse calculation
272
Jacobian
894
Jordan
282
lower triangular
889
main diagonal element
252
monodromy
801
normal
252
of coefficients
272
augmented
260
810
273
orthogonal
257
rank
255
real
251
rectangular
251
regular
256
rotation
257
scalar
252
self-adjoint
253
singular
256
matrix (Cont.) size
251
skew-symmetric
252
spanning tree theorem
353
sparse
909
spur
252
square
251
stochastic
764
symmetric
252
trace
252
transposed
251
triangle decomposition
887
triangular
253
unitary
257
upper triangular
889
zero
251
matrix exponential function
800
matrix product, disappearing
257
max-min composition
369
252
maximum absolute
60
global
388
point
845
relative
388
maximum-criterion method
371
Maxwell, diagonal method
682
123
388
mean arithmetic
19
geometric
20
golden
20
quadratic
20
sample function
769
statistics
771
weighted
788
788
791
843
harmonic
value
751
19 751
mean squares problem different versions
401
non-linear
919
mean squares value problem linear
277
rank deficiency case
278
mean value formula
896
mean value function, integral calculus
442
mean value method
107
mean value theorem differential calculus
387
generalized
388
integral calculus
442
generalized
443
concentrated on a set
814
measure
measure (Cont.) counting
633
degrees
130
dimension
822
Dirac
634
ergodic
815
function, convergence theorems
636
Hausdorff
820
invariant
814
Lebesgue
634
natural
814
physical
815
probability
636
invariant
814
814
814
SBR (Sinai–Bowen–Ruelle)
815
σ-algebra
633
σ-finite
636
support
814
measured value
785
measurement error
785
error density
786
error, characterization
785
protocol
786
measuring error, distribution
785
median
142
sample function
770
statistics
772
triangle
132
Mellin transformation
705
Melnikov method
839
membership degree
358
function
358
bell-shaped
360
trapezoidal
359
relation meridian
359
290 160
convergence
169
tangential
176
244
method Bairstow
886
barrier
873
Bernoulli
886
bisection
283
broken line, Euler
901
center of area
372
center of gravity
371
generalized
372
parametrized
372
Cholesky
278
890
method (Cont.) collocation boundary value problem
906
integral equation
574
partial integral equation
910
cutting plane
874
descent
866
difference ordinary differential equations
905
partial differential equations
908
extrapolation
902
Fibonacci
866
finite difference
532
finite element
532
Galerkin
906
Gauss elimination
887
Gauss–Newton
894
derivative free
559
910
895
Gauss–Seidel
892
gradient
559
Graeffe
886
Green
529
Hildreth–d’Esopo
865
Householder
278
Hungarian
858
inner digits of squares
781
530
918
method (Cont.) integrating factor
488
iteration
283
881
881
893
Jacobi
283
892
Kelley’s, optimization
875
Lagrange multiplier
860
Lanczos
283
largest area
372
least squares
107
887
916
918
ordinary
Mamdani
372
maximum-criterion
371
mean-of-maximum
371
Monte Carlo
781
multi-step
902
multiple-target
907
Newton
882
modified
882
non-linear operators
630
operator
707
orthogonalization
278
parametrized center of area
372
pivoting
271
polygonal, Euler
901
predictor–corrector
903
887
892
906
910
894
890
891
method (Cont.) regula falsi
883
relaxation
893
Riemann
528
Ritz
558
Romberg
898
Runge–Kutta
901
separation of variables
487
shooting
907
single–target
907
SOR (successive overrelaxation)
893
statistical experiment
785
steepest descent
867
Steffensen
883
906
successive approximation Banach space
619
Picard
494
Sugeno
373
transformation
283
undetermined coefficients
16
variation of constants
499
Wolfe
863
metric
602
Euclidean
602
surface
247
Meusnier, theorem
247
midpoint line segment plane
192
space
213
midpoint rule
903
minimal surface
249
minimum absolute
60
global
388
point
860
global
860
local
860
problem
845
relative
388
123
388
Minkowski inequality
32
integrals
32
series
32
mixed product
187
modal value, statistics
772
mode, statistics
772
model Hopf–Landau, turbulence
839
urn model
753
module of an element
601
modulo congruence
297
325
modulo (Cont.)
mapping
814
modulus, analytic function
670
de Moivre formula complex number
37
hyperbolic functions
90
trigonometric functions
79
Mollweide equations
142
moment of inertia
263
double integral
472
line integral, first type
462
triple integral
478
moment, order n
751
monodromy matrix
801
monotonicity
633
monotony
386
function sequence, numbers Monte Carlo method
810
402 781 783
usual
783
example
802
49
application in numerical mathematics
Monte Carlo simulation
450
781 782
Morse–Smale system
813
multi-index
611
multi-scale analysis
741
multi-step method
902
multiple integral
469
transformation
707
multiple-target method
907
multiplication
complex numbers
36
computer calculation
937
polynomials
11
rational numbers
1
multiplicity, of divisors
318
multiplier
488
Lagrange method
801
802
401
N
nabla operator
definition
654
divergence
654
gradient
654
rotation
654
twice applied
655
vector gradient
654
NAND function
288
nautical, radio hearing
171
navigation
172
negation
286
Boolean function
655
342
810
negation (Cont.) double
288
neighborhood
603
point
603
Neper equations
167
logarithm
9
rules
169
chart
123
net
more than three variables
127
three variables
123
isometric
673
points
930
Neumann problem
668
series
566
585
Newton field, point-like mass
644
interpolation formula
914
Newton method
882
adaptive
948
modified
882
non-linear operators
630
approximation sequence non-linear optimization
894
830 867
Newton method (Cont.) damped
867
nonlinear operators modified
630
niveau line
117
nodal plane
539
node
807
approximation interval
928
differential equation, ordinary
492
saddle node
802
stable
802
triple
832
nominal rate
22
nomogram
123
nomography
123
non-negative integer NOR function
807
1 288
norm axioms, linear algebra
258
Euclidean
316
integral
637
isomorphism
624
matrix
258
column sum norm
259
row sum norm
259
spectral norm
259
norm (Cont.) subordinate
259
uniform norm
259
operator norm
617
pseudonorm
622
residual vector
277
seminorm
622
s-norm
363
space
609
t-norm
363
vector
258
Euclidean norm
258
matrix norm
258
sum norm
258
normal distribution
756
logarithmic normal distribution observational error standard normal distribution standard, table two-dimensional
800
757 72 757 1085 778
normal equation
891
916
system
780
917
normal form
918
343
equation of a surface
220
Jordan
282
principal conjunctive
343
normal form (Cont.)
principal disjunctive
343
normal plane, space curve
238
241
normal vector plane
211
plane sheet
661
surface
244
normal, plane curve
226
normalization condition
539
normalizing factor
195
northern direction geodesical
169
geographical
169
northwest corner rule
856
notation Polish notation
353
postfix notation
353
prefix notation
353
reversed polish
353
n-tuple
117
number
1
approximation
4
cardinal
291
complex
34
absolute value
35
addition
36
298
number (Cont.) algebraic form
34
argument
35
division
37
exponential form
35
main value
35
multiplication
36
power
37
subtraction
36
taking the root
38
trigonometric form
35
composite
318
conjugate complex
36
imaginary
34
integer
1
interval
2
irrational
2
natural
1
non-negative
1
random
768
rational
1
real
2 taking of root
sequence
297
3
8 402
convergent
604
metric space
604
number (Cont.)
transcendental
2
number line, extended
633
number plane, complex, Gauss
34
number representation, computer internally
933
number system
933
binary
934
decimal
934
hexadecimal
934
octal
934
number theory, elementary
935
318
numbers Bernoulli
410
Euler
410
Fermat, prime
319
Fibonacci
323
Mersenne, prime
319
prime
318
843
numerical analysis axis
881 1
calculations, computer
936
library (numerical methods)
881
nutation angle
211
Nyström method
571
O
obelisk
153
objective function, linear
844
observational value
785
occupancy number, statistics
770
octahedron
154
octal number
934
octal system
934
octant
209
ω-limit set
797
operation
298
algebraic arithmetical
803
810
263 1
associative
298
binary
298
commutative
298
exterior
299
n-ary
298
operational method
532
operational notation, differential equation
500
operator adjoint
624
AND
366
bounded
617
closed
618
operator (Cont.) coercivity
632
compact
626
compensatory
366
completely continuous
626
continuous
608
inverse
619
contracting
606
demi-continuous
632
differentiable
630
finite dimensional
626
gamma
366
Hammerstein
629
idempotent
626
inverse
599
isotone
631
lambda
366
linear
316
bounded
617
continuous
617
linear, notion
316
linear, permutability
317
linear, product
317
monotone
632
Nemytskij
629
non-linear
629
598
operator (Cont.) norm
800
notion
48
OR
366
positive
600
positive definite
626
positive non-linear
631
self-adjoint
625
singular
590
strongly monotone
632
Urysohn
630
operator method
707
opposite side
130
optimality condition
860
sufficient
861
860
optimality principle, Bellman
878
optimization
844
optimization method Bellman functional equation
878
conjugate gradient
867
cutting plane method
874
damped
867
DFP (Davidon–Fletcher–Powell)
868
feasible direction
869
Fibonacci method
866
golden section
866
optimization method (Cont.) Hildreth–d’Esopo
865
Kelley
875
penalty method
872
projected gradient
870
unconstrained problem
866
Wolfe
863
optimization problem convex
623
dynamic
875
non-linear
923
optimization, non-linear
859
barrier method
873
convex
861
convexity
862
descent method
866
direction search program
869
duality
861
861
gradient method inequality constraints
868
Newton method
867
numerical search procedure
865
quadratic
862
saddle point
860
steepest descent method
867
orbit
795
double, periodic
834
heteroclinic
806
homoclinic
806
periodic
795
hyperbolic
803
saddle type
803
839
order (0)-interval
600
curve, second order
205
curve, n-th order
194
differential equation
485
interval
600
of magnitude, function
55
second order surface
220
wavelet
739
order relation
296
linear
297
order symbol, Landau ordering
56 297
lexicographical
297
linear
297
partial
297
599
ordinate plane coordinates
189
space coordinates
208
Ore theorem
351
orientation
269
coordinate system numerical axis
207 1
origin numerical axis
1
plane coordinates
189
space coordinates
208
original function
705
original space
705
set
48
orthocenter, triangle
132
orthodrome
173
arclength
174
course angle
173
intersection point
174
intersection point of two orthodromes
178
point, closest to the north pole
173
175
orthonormal function system
917
polynomials
917
space
614
spherical
169
orthogonality eigenvalues, differential equation
513
Hilbert space
614
orthogonality (Cont.) lines
128
real vector space
316
trigonometric functions
316
vectors
184
weight
513
orthogonality conditions line–plane
219
lines in space
219
planes
216
orthogonalization method
890
Givens
892
Gram–Schmidt
615
Householder
278
Schmidt
577
orthogonalization process
280
Gram–Schmidt
280
891
892
orthogonal function system
917
orthonormalization, vectors
262
oscillation duration
82
harmonic
82
oscillator, linear harmonic
542
osculating plane, space curve
238
osculation point, curve
232
241
oversliding
266
P
pair, ordered
294
parabola
203
arclength
204
area
204
axis
203
binomial
69
cubic
63
diameter
204
directrix
203
equation
203
focus
203
intersection figure
221
n-th degree
64
parameter
203
quadratic polynomial radius of curvature semicubic
62 204 93
semifocal chord
203
tangent
204
transformation
205
vertex
203
paraboloid
221
elliptic
221
paraboloid (Cont.) hyperbolic central surface
222 224
invariant signs elliptic
224
hyperbolic
224
parabolic
224
of revolution
222
parallel circle
244
parallelepiped
152
rectangular
152
parallelism lines
128
parallelism conditions line–plane
219
lines in space
219
planes
216
parallelogram
134
parallelogram identity, unitary space
613
parameter
11
parabola
203
statistical
771
parameter space, stochastic
763
parametric integral
457
parametric representation, circle
197
parametrized center of area method
372
49
parity
542
Parseval equation
420
formula
727
partial fraction decomposition special cases
616
15 1023
partial ordering
297
partial sum
404
partition
297
Pascal limaçon
96
Pascal triangle
13
path
514
349 closed
349
integration
461
PCNF (principal conjunctive normal form)
343
PDNF (principal disjunctive normal form)
343
pencil of lines
196
pendulum equation
839
Foucault
946
mathematical
700
period
701
pentagon, regular pentagram pentagram, regular
138 4 138
percentage
21
calculation
21
performance score
370
perimeter circle
138
ellipse
200
polygon
137
period
50
secant
76
sine
75
tangent
76
period doubling
834
cascade
835
period parallelogram
702
permutability, linear operators
317
permutation
743
cyclic, vectors
207
group
301
matrix
889
with repetition
743
without repetition
743
82
840
perturbation
533
938
Pesin formula
819
826
phase initial portrait, dynamical systems
82 798
phase (Cont.) shift
75
82
sine
75
82
space, dynamical systems
795
spectrum
725
Picard iteration method integral equation successive approximation method
565 494
Picard–Lindelöf theorem
608
pivot
888
column
271
850
element
271
850
row
271
850
scheme
271
pivoting
271
column pivoting
889
step
271
274
plane equation general
214
in space
214
geometry
128
intersecting
158
rectifying
239
241
space
150
214
plane (Cont.) vector equation
188
planes orthogonality, conditions
216
parallel
150
distance parallelism, conditions
217 216
planimeter
445
Poincaré mapping
806
Poincaré section
839
point
128
accumulation
604
asymptotic, curve
232
boundary
604
circular
249
coordinates
189
corner
232
cuspidal, curve
232
discontinuity
57
double, curve
232
fixed, conformal mapping
674
fixed, stable
834
focal, ordinary differential equation
493
improper
192
infinite
192
interior
603
808
point (Cont.) isolated
604
isolated, curve
232
limit
237
limiting
604
multiple, curve
233
n-dimensional space
117
nearest approach
237
neighborhood
603
non-wandering
813
notion
128
plane curve
231
rational
1
regular, surface
244
saddle, ordinary differential equation
493
singular
225
isolated
492
ordinary differential equation
492
surface
244
spectrum
620
spiral, ordinary differential equation
493
232
245
surface point circular
249
elliptic
248
hyperbolic
249
parabolic
249
point (Cont.) spherical
248
umbilical
249
terminal, curve
232
transversal homoclinic
810
umbilical
249
Poisson differential equation, partial
536
distribution
755
formula
534
integral
531
polar
665
668
162
angle
190
axis
190
coordinates, plane
190
coordinates, spherical
209
coordinates, transformation
191
distance
244
equation
540
curve second order
207
subnormal
227
subtangent
227
complex function
671
pole
function
58
left
162
pole (Cont.) multiplicity m, complex function
691
on the sphere
162
order m, complex function
691
origin
190
right
162
Polish notation
353
polyeder
151
polygon area 2n-gon
138
n-gon
137
base angle
137
central angle
137
circumcircle radius
137
circumscribing
137
exterior angle
137
inscribed circle radius
137
interior angle
137
perimeter
137
plane
137
regular, convex
137
side length
137
similar
133
polyhedral angle
151
138
polyhedron
151
convex
154
regular
154
polynomial
62
Bernstein
932
characteristic
278
Chebyshev formula equation, numerical solution first degree Hermite integral rational function
87 884 62 543
615
60
interpolation
914
Laguerre
511
615
Legendre
509
614
n-th degree
63
product representation
43
quadratic
62
second degree
62
third degree
63
trigonometric
924
polynomials
11
Chebyshev
921
orthogonal
917
population
767
two-stage
752
Posa theorem
351
position coordinate, reflection
269
positive definite
890
postfix notation
353
postman problem, Chinese
350
potential complex
679
equation
536
field
660
conservative
660
rotation
654
retarded
535
power complex number
37
notion
7
real number
7
reciprocal power series
68 414
asymptotic
417
complex terms
688
expansion, analytic function
687
inverse
415
power set
298
power spectrum
816
precession angle
211
predicate
289
logic
289
418
predicate (Cont.) n-ary
289
predictor
903
predictor–corrector method
903
pre-Hilbert space
613
present value pressure
25 449
prime coprime
5
decomposition, canonical
319
element
318
factorization
319
canonical
319
Fermat
319
Mersenne
319
notation (measurement protocol)
786
number
318
pair
319
quadruplet
319
relatively
14
triplet prime formula, predicate logic principal (a sum of money)
14
319 289 21
principal axis direction
265
transformation
264
principal ideal
313
principal normal section, surface
248
space curve
239
principal quantities
241
11
principal value integral, improper
452
inverse hyperbolic function
698
inverse trigonometric function logarithm
84
698
696
698
principle Cauchy
606
contracting mapping
606
extensionality
291
extrapolation principle
899
Neumann
307
of bivalence
286
prism
151
lines
151
regular
151
probability area interpretation
750
conditional
748
definition
747
total
748
probability integral
757
probability measure
814
ergodic
818
invariant
814
probability paper
772
probability theory
743
probability vector
765
745
problem Cauchy
516
Dirichlet
526
discrete
918
inhomogeneous
535
668
multidimensional computation of adjustments
919
Neumann
668
regularized
278
scheduling
859
shortest way
349
Sturm–Liouville
513
two-body
518
process birth process
767
death process
767
orthogonalization process
280
Poisson process
766
stochastic
763
product
7
algebraic
364
Cartesian fuzzy sets
367
n-ary
368
sets
294
cross
183
derivative
379
direct group
301
Ω-algebra
339
dot
183
drastic
364
dyadic
255
dyadic, tensors
264
inner product
613
Kronecker
258
n times direct
341
product sign
7
rules of calculation
7
scalar
183
vector
183
programming
255
844
computer algebra system
952
continuous dynamic
875
discrete dynamic
876
programming (Cont.) Bellman functional equation
876
Bellman functional equation method
878
Bellman optimality principle
878
constraint dynamic
876
constraint static
876
knapsack problem
876
problem
875
purchasing problem
876
state vector
875
linear
844
linear, transportation problem
855
Maple
974
Mathematica
963
problem dual problem
854
primal problem
854
programming, discrete continuous dynamical
875
cost function
876
decision
875
decision space
875
dynamic
875
functional equation
877
Bellman
876
method
878
programming, discrete (Cont.) knapsack problem
876
minimum interchangeability
877
minimum separability
877
n-stage decision process
875
optimal policy
878
optimal purchasing policy
879
purchasing problem
876
state costs
875
state vector
875
880
programming, linear constraints
844
forms
844
general form
844
properties
846
scheduling problem
859
projection sides
142
projection theorem, orthogonal space
614
projector
626
proof by contradiction
5
constructive
6
direct
5
indirect implication mathematical induction
288 5 5
proof (Cont.) step from n to n + 1
5
proportionality direct
62
inverse
64
proportions
17
proposition
286
dual
341
propositional logic
286
basic theorems
287
expression
286
propositional operation
286
extensional
286
propositional variable
286
protocol
770
pseudonorm
622
pseudorandom numbers
781
pseudoscalar
270
pseudotensor
268
270
pseudovector
268
270
Ptolemy’s theorem
136
pulse, rectangular
695
pyramid
152
frustum
152
n-faced
152
regular
152
pyramid (Cont.) right
152
truncated
152
Pythagoras general triangle
142
right-angled triangle
141
Q
QR algorithm
283
QR decomposition
891
quadrangle
134
circumscribing
136
concave
136
convex
136
inscribed
136
quadrant relations, trigonometric functions quadrant, Cartesian coordinates
891
136
77 189
quadratic curve
205
surface
223
quadratic form first fundamental, of a surface
245
index of inertia
282
real positive definite
282
second fundamental, surface
248
transformation
quadratic form (Cont.) principal axis
281
quadrature formula
895
Gauss type
897
Hermite
896
integral equation
570
interpolation quadrature
896
Lobatto type
898
quadruple (ordered four-tuple)
294
quantification, restricted
290
quantifier
289
quantile
750
quantity, infinitesimal
439
quantum number
539
energy
539
magnetic
542
orbital angular momentum
541
vibration quantum number
544
quartic
445
96
quasiperiodic
804
queuing
767
theory quintuple (ordered 5-tuple) quotient
767 294 1
derivative
380
differential
377
12
quotient (Cont.) set
297
R
radial equation
540
radian definition
130
measure
130
radicals, ordinary differential equation radicand
491 8
radius circle
139
circumcircle
142
convergence
414
curvature
230
curve
229
space curve
240
curvature, extremal
556
polar coordinates
190
197
principal curvature surface
247
short
132
torsion, space curve
242
vector
181
raising to a power complex numbers
37
raising to a power (Cont.) real numbers
7
9
random number
768
781
application
782
congruence method
781
construction
782
different distributions
782
pseudorandom
781
uniformly distributed
781
random numbers, table random variable
1091 749
continuous
749
discrete
749
independent
752
mixed
749
multidimensional
752
two-dimensional
777
750
777
random vector mathematical statistics
768
multidimensional random variable
752
random walk process range
784 47
operator
598
sample function
770
statistics
772
rank matrix
255
tensor
262
vector space
316
rate effective
22
nominal
22
of interest
22
ray point
493
ray, notion
128
Rayleigh–Ritz algorithm
283
reaction, chemical, concentration
116
real part (complex number)
34
rebate
21
rectangle
135
rectangular formula, simple
896
rectangular impulse
712
rectangular pulse
695
rectangular sum
896
rectification
107
Reduce (computer algebra system)
950
reduction
21
angle
174
reduction formula, trigonometric functions
77
reflection principle, Schwarz
679
reflection, position coordinate
269
region
118
multiply-connected
118
non-connected
118
simply-connected
118
two-dimensional
118
analysis
777
coefficient
779
line
778
linear
778
multidimensional
779
regula falsi
883
regularity condition
861
regularization method
285
regularization parameter
278
relation
294
binary
294
congruence relation
338
equivalence relation
296
fuzzy-valued
367
inverse
295
less or equal than (≤ relation)
289
matrix
294
n-ary
294
n-place
294
order relation
296
product
295
779
873
relaxation method
893
relaxation parameter
893
reliability testing
767
relief, analytic function
670
remainder estimation
411
series
404
term
404
Remes algorithm rent
922 25
ordinary, constant representation of groups
25 303
adjoint
304
direct product
305
direct sum
305
equivalent
304
faithful
303
identity
303
irreducible
305
non-equivalent
304
particular
303
properties
303
reducible
305
reducible, complete
306
representation matrix
303
representation space
303
representation of groups (Cont.) subspace
305
true
303
unitary
304
representation theorem (fuzzy logic)
362
resection Cassini
149
Snellius
148
residual spectrum
621
residual sum of squares
277
residual vector
277
891
residue quadratic modulo m
327
complex function
691
theorem
692
residue class
325
addition
325
multiplication
325
primitive
326
relatively prime
326
ring
313
ring, modulo m
325
residue theorem
690
application
693
residuum
277
325
890
resolvent set
566
568
585
620
resonance torus
840
reversing the order of the bits
927
rhombus
135
Riemann formula
528
function
528
integral
439
comparison with Lebesgue integral
451
comparison with Stieltjes integral
451
method
528
sum
439
surface, many-sheeted
683
theorem
407
right screw rule
661
right singular vector
285
right-hand coordinate system
207
right-hand rule
183
ring
313 factor ring
314
homomorphism
314
theorem
314
isomorphism
314
subring
313
risk theory
661
21
620
Ritz method
558
Rn, n-dimensional Euclidean vector space
254
Romberg method
898
906
root complex function
671
complex number
38
equation
43
non-linear notion N-th of unity real number square root, complex theorem of Vieta
881 8 926 8 675 43
root locus theory
886
root test of Cauchy
406
rotation definition
652
different coordinates
653
mapping
815
potential field
654
remark
648
vector components
657
vector field
652
rotation angle
212
rotation field pure
666
rotation field (Cont.) zero-divergence field
666
rotation invariance
265
rotation matrix
190
orthogonal
262
rotation number
841
rotator, rigid
540
round-off
937
error
939
measurement error method
257
785 939
round-tour problem
859
row echelon form, system of linear equations
276
row sum criterion
892
Ruelle–Takens–Newhouse scenario
840
rule
903 Adams and Bashforth
903
Bernoulli-l’Hospital
54
Cartesian rule of signs
885
Cramer
275
De Morgan
287
Descartes
211
45
Guldin, first rule
451
Guldin, second rule
451
linguistic
373
Milne
903
rule (Cont.) Sarrus ruled surface
261 250
rules composition
369
divisibility, elementary
318
Neper
169
Runge–Kutta method
901
S
saddle
802
saddle form
249
809
saddle point differential equation
493
Lagrange function
860
sample
752
function
768
random
768
size
752
summarizing the sample
770
variable
768
Sarrus rule
768
261
scalar invariant
263
notion
180
scalar field
641
axial field
642
central field
641
coordinate definition
642
directional derivative
647
gradient
648
plane
641
scalar matrix
252
scalar product
183
Hilbert space
613
representation in coordinates
186
rotation invariance property
270
two functions
917
vectors
255
649
316
188
scale cartography
144
equation
114
factor
114
logarithmic
114
notion
114
semilogarithmic
115
scenario, Ruelle–Takens–Newhouse
840
Schauder fixed-point theorem
631
scheduling problem
859
146
scheme Falk
254
scheme (Cont.) Young
306
Schmidt, orthogonalization method
577
Schoenflies symbolism
307
Schrödinger equation non-linear, partial
545
time-dependent
537
time-independent
537
547
Schrödinger’s equation linear
536
Schur’s lemma
306
Schwarz exchange theorem
393
reflection principle
679
Schwarz–Buniakowski inequality
613
Schwarz–Christoffel formula
677
screw left
242
right
242
search procedure, numerical
865
secant geometric definition hyperbolic theorem
130 87 138
trigonometric
76
secant–tangent theorem
138
section, golden
2
sector formula
664
sector, spherical
157
4
192
segment normal
227
notion
128
polar normal
227
polar tangent
227
tangent
227
self-similar
821
semantically equivalent
287
expressions
343
semifocal chord ellipse
198
hyperbola
200
parabola
69
semigroup free
299 299
semimonotone
611
seminorm
622
semiorbit
795
sense class
133
of a figure
203
132
sensitive, with respect to the initial values
827
sentence
286
sentence-forming functor
286
843
sentential operation
286
separable sets
623
separation constant
538
theorems (convex sets)
623
variables
487
522
loop
806
838
surface
804
810
538
separatrix
sequence
402
bounded
596
Cauchy sequence
605
convergence
604
finite
595
in metric space
604
infinite
402
numbers
402
bounded
402
bounded above
403
bounded below
403
convergence
403
converging to zero
596
divergence
403
finite
595
law of formation
402
limit
403
596
sequence (Cont.) monotone
402
term
402
series
18
alternating
408
arithmetic
18
Banach space
610
Clebsch–Gordan
306
comparison criterion
405
constant term
404
convergence
406
absolute
414
non-uniform
412
uniform
412
convergence theorems
404
convergent
404
D’Alembert’s ratio test
405
definite
402
414
18
divergence
407
divergent
404
expansion
415
Fourier
418
Laplace transformation
717
Laurent
690
Maclaurin
416
power series
414
series (Cont.)
Taylor
387
415
finite
18
Fourier
419
complex representation
419
function
412
domain of convergence
412
general term
404
geometric
19
infinite
19
harmonic
404
hypergeometric
511
infinite
402
integral test of Cauchy
406
Laurent
690
Maclaurin
416
Neumann
566
Neumann series
618
partial sum
404
power
414
expansion
415
inverse
415
remainder
404
root test of Cauchy
406
sum
404
Taylor
387
404
404
412
415
series (Cont.) uniformly convergent continuity
413
differentiation
413
integration
413
properties
413
Weierstrass criterion, uniform convergence set
413 290
absorbing
797
algebra, fundamental laws
292
axioms of closed sets
604
axioms of open sets
604
Borel
634
bounded in metric space
604
cardinality
298
closed
604
closure
605
compact
626
complex numbers
798
34
convex
597
dense
605
denumerable infinite
298
disjoint
292
element
290
empty
291
equality
291
set (Cont.) equinumerous
298
fundamental
292
fuzzy
358
image
48
infinite
298
integers
1
invariant
797
chaotic
826
fractal
826
stable
797
irrational numbers
2
linear
595
manyness
298
measurable
633
natural numbers
614
1
non-denumerable infinite
298
notion of set
290
open
603
operation Cartesian product
294
complement
292
difference
293
intersection
292
symmetric difference
293
union
292
set (Cont.) operations
291
order-bounded
600
original space
48
power
298
power set
291
quotient set
297
rational numbers
1
real numbers
2
relative compact
626
subset
291
theory
290
universal
292
void
291
sets
290 coordinates x, y
294
difference, symmetric
293
sexagesimal degree
130
shift mapping
818
shooting method
907
simple
907
shore-to-ship bearing
171
side or leg of an angle
128
side-condition
553
Sierpinski carpet
822
Sierpinski (Cont.) gasket
822
sieve of Eratosthenes
318
σ-additivity
633
σ-algebra
633
Borelian
634
sign of a function
49
signal
738
analysis
738
synthesis
738
signature, universal algebra
338
significance
777
level similarity transformation
751
774
280
281
848
849
simplex method revised
853
step, revised
854
tableau
849
revised Simpson’s formula
853 897
simulation digital
781
Monte Carlo
781
geometric definition
130
sine
sine (Cont.)
hyperbolic
87
geometric definition
131
trigonometric
74
sine integral
458
694
sine law
142
164
sine–cosine law
164
polar
164
sine–Gordon equation
545
single-target method
907
singleton
361
singular value
285
decomposition singular value decomposition
547
818
285 285
singularity analytic function
671
essential
671
isolated
690
pole
691
removable
671
sink
802 vector field
651
vertex
356
sinusoidal amount
691
809
82
slack variable
845
Slater condition
861
slide rule
10
logarithmic scale
114
slope plane
194
tangent
227
small circle
158
arclength
175
course angle
176
intersection point
176
radius, plane
175
radius, spherical
175
175
smoothing continuous problem
916
parameter
930
spline
929
cubic
929
Sobolev space
611
solenoid
826
solid angle
151
soliton antikink
548
antisoliton
547
Boussinesq
549
Burgers
549
differential equation, partial, non-linear
544
Hirota
549
soliton (Cont.) interaction
545
Kadomtsev–Petviashvili
549
kink
548 lattice
548
kink-antikink
548
collision
548
doublet
548
kink-kink collision
548
Korteweg de Vries
546
non-linear, Schrödinger
547
solution point
845
product form, method separation of variables 522 SOR method (successive overrelaxation)
893
source
802
809
distribution continuous
667
discrete
667
field irrotational
665
pure
665
vector field
651
vertex
356
space abstract
48
Banach
610
complete, metric
605
directions
210
finite-dimensional
626
higher-dimensional
117
Hilbert
613
infinite-dimensional
597
isometric
609
Kantorovich
601
linear
594
P
L space
637
metric
602
completion
608
convergence of sequence
604
normable
610
separable
605
non-reflexive
624
606
626
normed axiom
609
properties
610
ordered normed
611
orthogonal
614
reflexive
624
Riesz
600
space (Cont.) second adjoint
624
separable
605
Sobolev
611
unitary
613
vector
594
space curve
238
binormal
239
coordinate equation
241
curvature
240
direction
238
equation
214
moving trihedral
239
normal plane
238
241
osculating plane
238
241
principal normal
239
241
radius of curvature
240
radius of torsion
242
tangent
238
torsion
242
vector equation
238
space inversion
241
238
241
241
269
mixed product
270
scalar product
270
spectral radius
618
spectral theory
620
spectrum
725
amplitude
725
continuous
621
frequency
725
linear operator
620
phase
725
sphere
156
ellipsoid
220
equation, three forms
243
spherical biangle
161
coordinates
209
digon
161
distance
158
field
641
helix
177
lune
161
Archimedean
103
hyperbolic
104
logarithmic
104
233
spline basis spline
930
bicubic
930
approximation
932
interpolation
930
cubic
928
interpolation
928
smoothing
929
interpolation
914
natural
928
normalized B-spline
930
periodic
928
smoothing
929
928
spur
  matrix  252
  tensor  265
square
135
stability absolutely stable
904
first approximation
802
integration of differential equation
904
Lyapunov stability
801
orbital
801
periodic orbit
802
perturbation of initial values
904
round-off error, numerical calculation
939
structural
811
812
standard  791
  deviation  751  789
  error  788
  normal distribution  757
state
  degenerate
539
particle
536
space, stochastic
763
stationary
537
statistics
743
descriptive
770
estimate
769
mathematical
743
sample function
768
steady state
795
Steffensen method
883
767
step
  from n to n + 1  5
  function  695  712  732
  interval parameter  894
  size  901
    change  902
steradian
151
stereometry
150
Stieltjes integral
626
comparison with Riemann integral
451
notion
451
transformation
705
Stirling formula
460
742
stochastic basic notions
763
chains
764
process
763
processes
763
stochastics
743
Stokes, integral theorem
664
strangling
264
stream function
679
strip, characteristic
517
strophoide
95
structure algebraic
286
classical algebraic
298
Sturm chain
44
function
44
sequence
886
theorem
44
Sturm–Liouville problem
513
subdeterminant
259
subdomain
930
subgraph
347
  induced  347
subgroup  300
  criterion  300
cyclic
300
invariant
301
normal
301
trivial
300
subinterval
439
subnormal
227
subring
313
  trivial  313
subset  291
affine
595
linear
595
open
603
subspace
315
affine
595
criterion
315
invariant, representation of groups
305
subtangent
595
227
subtraction
  complex numbers  36
  computer calculation  937
  polynomials  11
  rational numbers  1
subtractive cancellation  937
sum  6
algebraic
364
drastic
364
of the digits
  alternating, first order
320
alternating, second order
320
alternating, third order
320
first order
320
second order
320
third order
320
residual squares
918
Riemann
439
rules of calculation
6
summation sign
6
transverse
320
vectors
181
summation convention, Einstein’s
262
superposition fields
667
law, fields
667
non-linear
547
oscillations
82
principle
681
differential equation, linear
505
differential equations, higher order
499
supplementary angle formulas
79
support
  compact
740
measure
814
membership function
358
supporting functional
623
supremum
600
surface
243
area, double integral
472
B–B representation
932
barrel
157
block
152
cone
156
conical
156
constant curvature
249
cube
152
curvature of a curve
247
cylinder
154
developable
250
element
247
element, vector components
658
equation
223
in normal form
220
space
213
equation, space
243
first quadratic fundamental form
246
Gauss curvature
249
214
249
geodesic line
250
harmonic
542
integral
477
first type
477
general form
483
second type
481
line element
245
line of curvature
248
metric
247
minimal
249
normal
244
normal vector
244
oriented
481
patch, area
247
polyhedron
151
principal normal section
248
pyramid
152
quadratic
223
radius of principal curvature
247
rectangular parallelepiped
152
rectilinear generator
222
representation with splines
928
rotation symmetric
213
ruled
250
second order
220
661
662
482
246
223
central surfaces
224
invariant signs
224
types
223
second quadratic fundamental form
248
sphere
156
tangent plane
244
torus
157
transversal
804
surplus variable, linear programming
246
848
switch algebra
340
function
344
value
344
Sylvester, theorem of inertia
344
282
symbol internal representation (computer)
933
Kronecker
265
Legendre
327
symmetry axial
133
central
132
element
307
Fourier expansion
420
group
308
applications in physics
312
crystallography
310
molecules
308
quantum mechanics
312
operation
307
crystal lattice structure
310
improper orthogonal mapping
307
reflection
307
rotation
307
without fixed point
307
with respect to a line
133
system chaotic according to Devaney
827
cognitive
373
complete
616
differential equation, higher order
495
linear
503
differential equations, partial canonical system
517
normal system
517
dynamical
795
chaotic
827
conservative
796
continuous
795
C^r-smooth
795
discrete
796
dissipative
796
ergodic
815
invertible
795
mixing
815
motion
795
time continuous
795
time discrete
795
time dynamical
796
volume decreasing
796
volume preserving
796
volume shrinking
796
equation numerical solution
887
four points
213
generators
300
knowledge based interpolation
375
linear equations
271
compatible
273
consistent
273
fundamental system
273
homogeneous
272
inconsistent
273
inhomogeneous
272
overdetermined
277
272
pivoting
274
row echelon form
276
solvable
273
trivial solution
273
mixing
827
Morse–Smale
813
normal equations
277
orthogonal
614
orthonormal
615
term-substitutions
338
trigonometric
614
T
table with double entry  119
tacnode, curve  232
tangent
  circle
198
formula
142
geometric definition
130
  hyperbolic  87
    geometric definition  131
law
142
plane
158
393
surface
244
246
plane curve
226
polygon
137
space curve
238
trigonometric
241
76
tautology Boolean function
342
predicate logic
290
propositional logic
288
Taylor expansion
55
one variable
415
several variables
395
vector function
641
formula
395
two variables
394 387
analytic function
689
one variable
415
several variables
395
theorem several variables
415
387 394
telegraphic equation
529
telephone-call distribution
767
tensor
262
addition, subtraction
415
387
several variables
series
387
268
alternating
265
antisymmetric
264
components
262
contraction
264
definition
262
dyadic product
264
eigenvalue
264
generalized Kronecker delta
265
inertia
263
invariant
265
multiplication
268
oversliding
268
product
263
vectors
268
255
rank n
262
rank 0
263
rank 1
263
rank 2
263
rules of calculation
263
264
skew-symmetric
264
268
spur
265
symmetric
264
tension
263
trace
265
tensor product approach
268
931
tent mapping
814
term algebra
339
test problem, linear
904
test, statistical
772
tetraeder
213
tetragon
136
tetrahedron
153
Thales theorem
139
141
theorem Abel
414
Afraimovich–Shilnikov
840
alternating point
920
Andronov–Pontryagin
812
Andronov–Witt
802
Apollonius
199
Arzeli–Ascoli
626
Baire category
606
803
Banach continuity, inverse operator
619
fixed-point
629
Banach–Steinhaus
618
Bayes
748
Berge
354
binomial
12
Birkhoff
340
Block–Guckenheimer–Misiuriewicz
827
749
815
Bolzano
  one variable  59
  several variables  123
boundedness of a function one variable
60
several variables
123
Cauchy integral theorem
686
Cayley
302
353
center manifold differential equations
828
mappings
833
Chebyshev
434
Chinese remainder
326
Clebsch–Gordan
306
closed graph
618
constant analytic function
671
convergence, measurable function
636
decomposition
297
Denjoy
842
differentiability, respect to initial conditions
795
Dirac
351
Douady–Oesterlé
824
Euclidean theorems  141
Euclidean algorithm  322
Euler
154
Euler–Hierholzer
350
Fatou
636
Fermat
329
Fermat–Euler
329
329
386
fixed point theorem Banach
606
fixed-point Banach
629
Brouwer
631
Schauder
631
Floquet
801
fundamental integral calculus
456
Girard
164
Grobman–Hartman
809
811
Hadamard–Perron
805
810
Hahn (extension theorem)
623
Hellinger–Toeplitz
618
Hilbert–Schmidt
628
Holladay
929
Hurwitz
501
intermediate value
  one variable  59
  several variables  123
KAM (Kolmogorov–Arnold–Moser)  813
Krein–Losanovskij
619
Kuratowski
355
Lagrange
301
Lebesgue, majorized convergence
636
Ledrappier
824
Leibniz
408
Lerey–Schauder
631
Levi, B.
636
limits functions
403
sequences of numbers
403
Liouville
671
Lyapunov
801
maximum value, analytic function
671
Meusnier
247
nested balls
606
Ore
351
Oseledec
818
Palis–Smale
813
Picard–Lindelöf
608
Poincaré–Bendixson
803
Posa
351
Ptolemy’s
136
799
795
Pythagoras general triangle
142
orthogonal space
614
right-angled triangle
141
Radon–Nikodym
636
Riemann
407
Riesz
622
Riesz–Fischer
616
Rolle
386
Schauder
627
Schwarz, exchange
393
Sharkovsky
827
Shilnikov
838
Shinai
827
Shoshitaishvili
828
Smale
838
stability in the first approximation
802
Sturm
44
superposition law
667
Sylvester, of inertia
282
Taylor
387
one variable
415
several variables
394
Thales
139
total probability
748
Tutte
354
variation of constants
800
141
Weierstrass  412  605
  one variable  60
  several variables  123
Wilson
329
Wintner–Conti
798
Young
822
823
theory distribution
712
elementary number
318
field
667
function
669
graph, algorithms
346
probability
743
risk
21
set
290
spectral
620
vector fields
640
theta function
703
time frequency analysis
741
tolerance
362
topological conjugate
808
equivalent
808
torsion, space curve
242
torus
157
differential equation, linear, autonomous
801
dissolving
839
formation
835
invariant set
804
losing smoothness
840
resonance torus
840
total curvature
826
827
840
243
trace matrix
252
tensor
265
tractrix
106
trail
349 Euler open
350 350
trajectory
795
transform
705
transformation Cartesian into polar coordinates
398
covering
299
determinant
211
element of area
472
element of curved surface
480
geometrical
141
Hopf–Cole
549
identical
11
invariance
265
linear
262
method
283
orthogonal coordinates
210
principal axes
264
quadratic form
281
similarity
280
wavelet transformation
738
316
598
281
281
transition matrix
764
probability, stochastic
764
transitivity
765
29
translation invariance
265
primitive
310
transport, network
356
transportation problem
855
transposition law
288
trapezoid
135
Hermite’s trapezoidal formula
897
trapezoidal formula
896
sum
896
traversing
147
tree
352
height
352
ordered binary
352
regular binary
352
rooted
352
spanning
352
minimum
353
353
triangle altitude
132
area
193
bisector
132
center of gravity
132
circumcircle
132
congruent
133
coordinates
914
equilateral
132
Euler
162
incircle
132
inequality
181
axioms of norm complex numbers
258 30
metric space
602
norm
609
real numbers
30
unitary space
613
isosceles
132
median
132
orthocenter
132
Pascal
13
plane
131
area
141
basic problems
143
Euclidean theorems
141
general
141
incircle
143
radius of circumcircle
142
right-angled
132
tangent formula
142
polar
162
similar
133
spherical
161
basic problems
167
calculation
167
Euler
162
oblique
169
right-angled
168
143
141
triangular decomposition
888
matrix
253
lower
253
upper
253
triangularization, FEM (finite element method)
911
triangulation, geodesy first fundamental problem
147
second fundamental problem
148
trigonometry plane
141
spherical
158
trihedral angle
151
moving
238
triple (ordered 3-tuple)
294
triple integral
474
application
163
478
trochoid
100
truncation, measurement error
785
truth function
286
conjunction
286
disjunction
286
equivalence
286
implication
286
NAND function
288
negation
286
NOR function
288
table
286
value
286
287
turbulence
796
Tutte theorem
354
two lines, transformation
205
two-body problem
518
type, universal algebra
338
839
U
umbilical point
249
uncertainty absolute
789
fuzzy
358
relative
790
ungula, cylinder
155
union fuzzy sets
363
sets
292
universal quantifier
289
universal substitution
436
urn model
753
364
V
vagueness
358
valence in-valence
346
out-valence
346
value
  expected, measurement
751
system (function of several variables)
117
true, measurement
788
van der Pol differential equation
831
variable artificial
852
basic
848
Boolean
342
bound variable, predicate logic
289
dependent  47
free, predicate logic  289
independent  47
linguistic
359
non-basic
848
propositional
286
random
749
variance
271
117
271
751
distribution
751
sample function
769
statistics
771
two-dimensional distribution
778
variation
  function  60
  of constants, method  499
variation of constants
  differential equation, linear
505
theorem
800
variational calculus
550
auxiliary curve
552
comparable curve
552
Euler differential equation
552
side-condition
550
variational equation equilibrium point points variational problem
802
819
911
809 550
brachistochrone problem
551
Dirichlet
557
extremal curves
551
first order
550
first variation
559
functional
551
higher order
550
higher order derivatives
554
isoperimetric, general
551
more general
558
numerical solution
558
direct method
558
finite element method
559
gradient method
559
Ritz method
558
906
parameter representation
550
parametric representation
555
second variation
559
several unknown functions
555
side-conditions
553
simple one variable
551
several variable
556
variety
339
vector absolute value
180
affine coordinates
182
axial
180
reflection behavior base
268 186
base vector reciprocal
185
bound
180
Cartesian coordinates
182
column
253
components
646
conjugate
867
coordinates
182
Darboux vector
243
decomposition
182
diagram, oscillations
83
differentiation rules
640
direction in space
180
direction, vector triple
180
directional coefficient
183
expansion coefficient
183
fields
640
free
180
left singular
285
length
189
line
646
magnitude
180
matrix
253
metric coefficients
185
notion
180
null vector
181
polar
180
reflection behavior
268
pole, origine
181
position vector
181
complex-number plane radius vector complex-number plane
34 181 34
reciprocal
186
reciprocal basis vectors
186
residual
277
right singular
285
row
253
scalar invariant
655
sliding
180
space
595
stochastic
764
unit
180
zero vector
181
vector algebra
180
geometric application
189
notions and principles
180
vector analysis
640
vector equation line
188
plane
188
space curve
238
vector equations
187
vector field
643
Cartesian coordinates
644
central
643
circular field
644
components
646
contour integral
660
coordinate definition
644
241
795
cylindrical
644
cylindrical coordinates
645
directional derivative
648
divergence
651
point-like source
666
rotation
652
sink
651
source
651
spherical
644
spherical coordinates
645
vector function
640
derivative
640
differentiation
640
hodograph
640
linear
267
scalar variable
640
Taylor expansion
641
vector gradient
650
vector iteration
283
vector lattice
600
homomorphism
643
284
601
vector potential
666
vector product
183
hints
255
representation in coordinates
187
vector space
314
all null sequences
596
bounded sequences
596
B(T)
596
C([a, b])
596
(κ)
C ([a, b])
596
complex
595
convergent sequences
596
Euclidean
316
finite sequence of numbers
595
n
F
595
F(T)
596
functions
596
infinite-dimensional
315
L I
p
594
597
637
p
596
n-dimensional
315
n-dimensional Euclidean
254
ordered by a cone
599
partial ordering
599
real
314
s of all sequences
595
sequences
595
595
vector subspace stable
805
810
unstable
805
810
vectors
180
253
angle between
189
collinear
181
collinearity
184
commutative law
255
coplanar
181
cyclic permutation
207
double vector product
184
dyadic product
255
equality
180
Lagrange identity
185
linear combination
181
184
mixed product
184
187
Cartesian coordinates orthogonality
185 184
products affine coordinates
185
Cartesian coordinates
185
products, properties
183
scalar product
255
Cartesian coordinates
185
representation in coordinates
186
sum
181
tensor product
255
triple product
184
vector product
  representation in coordinates  187
Venn diagram  291
verification, proof  5
vertex angle
128
degree
346
ellipse
198
graph
346
hyperbola
200
initial
346
isolated
346
level
352
parabola
203
plane curve
232
sink
356
source
356
terminal
346
vertical angle
143
vertices, distance
349
vibration, differential equation bar
524
round membrane
525
string
523
Vieta, root theorem
43
Volterra integral equation
561
first kind
583
second kind
585
607
volume barrel
157
block
152
cone
156
cube
152
cylinder
154
double integral
472
element, vector components
658
hollow cylinder
155
obelisk
153
parallelepipedon with vectors
189
polyhedron
151
prism
152
pyramid
152
rectangular parallelepiped
152
sphere
156
subset
633
tetraeder
213
torus
157
triple integral
478
wedge
154
volume derivative
649
volume integral
474
volume scale
114
W
Walsh functions
742
systems
742
wave amplitude
82
frequency
82
length
82
period
82
phase
82
plane
739
wave equation n-dimensional
534
one-dimensional
730
Schrödinger equation
537
wave function classical problem
534
heat-conduction equation
535
Schrödinger’s equation
536
wavelet
739
Daubechies
740
orthogonal
740
transformation
738
discrete
739
741
discrete, Haar
741
dyadic
740
fast
741
Weber differential equation
543
function
507
wedge
154
Weibull distribution
759
Weierstrass approximation theorem
605
criterion uniform convergence
413
function
703
theorem
412
  one variable  60
  several variables  123
weight measurement
791
of orthogonality
513
statistical
751
weighting factor, statistical  788
witch of Agnesi  94
word  602
work (mechanics) general
467
special
449
Wronskian determinant
498
800
Y
Young scheme
306
Z
zenith
143
distance
143
zero matrix
251
zero-point translational energy
539
zero-point vibration energy
544
zeropoint
1
z-transformation
731
convolution
734
damping
734
definition
732
difference
733
differentiation
734
integration
734
inverse
735
original sequence
732
rules of calculation
733
summation
733
transform
732
translation
733
z-transformable
732
Mathematical Symbols

Relational Symbols
=            equal to
≡            identically equal to
:=           equal to by definition
≈            approximately equal to
≠            unequal to, different from
<            less than
>            greater than
≤            less than or equal to
≥            greater than or equal to
≪            much less than
≫            much greater than
⪯, ⪰         partial order relation
≙            corresponding to
Greek Alphabet
Α α  Alpha      Η η  Eta       Ν ν  Nu        Τ τ  Tau
Β β  Beta       Θ θ  Theta     Ξ ξ  Xi        Υ υ  Upsilon
Γ γ  Gamma      Ι ι  Iota      Ο ο  Omicron   Φ φ  Phi
Δ δ  Delta      Κ κ  Kappa     Π π  Pi        Χ χ  Chi
Ε ε  Epsilon    Λ λ  Lambda    Ρ ρ  Rho       Ψ ψ  Psi
Ζ ζ  Zeta       Μ μ  Mu        Σ σ  Sigma     Ω ω  Omega
Constants
const              constant amount (constant)
π = 3.14159…       ratio of the perimeter of the circle to the diameter
C = 0.57722…       Euler constant
e = 2.71828…       base of the natural logarithms
Algebra
A, B, …            propositions
¬A                 negation of the proposition A
A ∧ B              conjunction, logical AND
A ∨ B              disjunction, logical OR
A ⇒ B              implication, IF A THEN B
A ⇔ B              equivalence, A IF AND ONLY IF B
A, B, C, …         sets
Ā                  closure of the set A, or complement of the set A with respect to a universal set
N                  set of natural numbers
Z                  set of the integers
Q                  set of the rational numbers
R                  set of the real numbers
R+                 set of the positive real numbers
Rⁿ                 n-dimensional Euclidean vector space
C                  set of the complex numbers
A ⊂ B              A is a proper subset of B
A ⊆ B              A is a subset of B
A \ B              difference of two sets
A △ B              symmetric difference
A × B              Cartesian product; relation product
x ∈ A              x is an element of A
x ∉ A              x is not an element of A
∅                  empty set, zero set
card A             cardinal number of the set A
A₁ ∩ … ∩ Aₙ, A ∩ B    intersection of n sets, intersection of two sets
A₁ ∪ … ∪ Aₙ, A ∪ B    union of n sets, union of two sets
∃ x                there exists an element x
∀ x                for all elements x
{x : p(x)}, {x | p(x)}    set of all x with the property p(x)
{x ∈ X : p(x)}     subset of all x from X with the property p(x)
T : X → Y          mapping T from the space X into the space Y
≅                  isomorphy of groups
∼_R                equivalence relation
⊕                  residue class addition
⊙                  residue class multiplication
H = H₁ ⊕ H₂        orthogonal decomposition of the space H
A ⊗ B              Kronecker product
supp               support
sup M              supremum: least upper bound of the non-empty set M (M ⊂ R) bounded above
inf M              infimum: greatest lower bound of the non-empty set M (M ⊂ R) bounded below
[a, b]             closed interval, i.e., {x ∈ R : a ≤ x ≤ b}
(a, b), ]a, b[     open interval, i.e., {x ∈ R : a < x < b}
(a, b], ]a, b]     interval open from the left, i.e., {x ∈ R : a < x ≤ b}
[a, b), [a, b[     interval open from the right, i.e., {x ∈ R : a ≤ x < b}
sign a             sign of the number a, e.g., sign(±3) = ±1, sign 0 = 0
|a|                absolute value of the number a
aᵐ                 a to the power m, a to the m-th
√a                 square root of a
ⁿ√a                n-th root of a
log_b a            logarithm of the number a to the base b, e.g., log₂ 32 = 5
lg a               decimal logarithm (base 10) of the number a, e.g., lg 100 = 2
ln a               natural logarithm (base e) of the number a, e.g., ln e = 1
a | b              a is a divisor of b, a divides b
a : b              the ratio of a to b
a ∤ b              a is not a divisor of b
a ≡ b mod m, a ≡ b (m)    a is congruent to b modulo m, i.e., b − a is divisible by m
g.c.d.(a₁, a₂, …, aₙ)     greatest common divisor of a₁, a₂, …, aₙ
l.c.m.(a₁, a₂, …, aₙ)     least common multiple of a₁, a₂, …, aₙ
$\binom{n}{k}$            binomial coefficient, n over k
$\left(\frac{a}{p}\right)$    Legendre symbol
n! = 1 · 2 · 3 · … · n    factorial, e.g., 6! = 1 · 2 · 3 · 4 · 5 · 6 = 720; specially 0! = 1! = 1
(2n)!! = 2 · 4 · 6 · … · (2n) = 2ⁿ · n!    double factorial; in particular 0!! = 1!! = 1
(2n + 1)!! = 1 · 3 · 5 · … · (2n + 1)
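A quick numerical illustration of the divisibility notation above (the numbers are chosen here only as an example): for a = 12 and b = 18 one has g.c.d.(12, 18) = 6, l.c.m.(12, 18) = 36 and 12 · 18 = 6 · 36; moreover 26 ≡ 2 mod 12, since 12 | (26 − 2).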
A = (a_ij)         matrix A with elements a_ij
Aᵀ                 transposed matrix
A⁻¹                inverse matrix
det A, D           determinant of the square matrix A
E = (δ_ij)         unit matrix
0                  zero matrix
δ_ij               Kronecker symbol: δ_ij = 0 for i ≠ j and δ_ij = 1 for i = j
a                  column vector in Rⁿ
a⁰                 unit vector in the direction of (parallel to) a
‖a‖                norm of a
a, b, c, …         vectors in R³
i, j, k            basis vectors (orthonormed) of the Cartesian coordinate system
a_x, a_y, a_z      coordinates (components) of the vector a
|a|                absolute value, length of the vector a
αa                 multiplication of a vector by a scalar
a · b              scalar product, dot product
a × b              vector product, cross product
(a, b, c) = a · (b × c)    parallelepipedal product, mixed product (triple scalar product)
0                  zero vector
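As a small illustration of the vector products just listed (with concrete coordinate vectors chosen only for this example): for a = (1, 0, 0), b = (0, 1, 0), c = (0, 0, 1) one gets a · b = 0, a × b = (0, 0, 1) = c, and the mixed product (a, b, c) = a · (b × c) = 1, the volume of the unit cube.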
T                  tensor
G = (V, E)         graph with the set of vertices V and the set of edges E
Geometry
⊥                  orthogonal (perpendicular)
∥                  parallel
⋕                  equal and parallel
∼                  similar, e.g., △ABC ∼ △DEF; proportional
△                  triangle
∠                  angle, e.g., ∠ABC
$\overset{\frown}{AB}$    arc segment, the arc between A and B
rad                radian
°, ′, ″            degree, minute, second as measure of angle and circular arc, e.g., 32° 14′ 11.5″
$\overline{AB}$    the line segment between A and B
$\overrightarrow{AB}$    the directed line segment from A to B, the ray from A to B
Complex Numbers
i (sometimes j)    imaginary unit (i² = −1)
I                  imaginary unit in computer algebra
Re(z)              real part of the number z
Im(z)              imaginary part of the number z
|z|                absolute value of z
arg z              argument of the number z
z̄ or z*            complex conjugate of z, e.g., z = 2 + 3i, z̄ = 2 − 3i
Ln z               logarithm (natural) of a complex number z
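For instance (a small worked example with a concrete number, added here only as an illustration of this notation): for z = 2 + 3i one has Re(z) = 2, Im(z) = 3, |z| = √(2² + 3²) = √13, z̄ = 2 − 3i and z · z̄ = |z|² = 13.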
Trigonometric Functions, Hyperbolic Functions
sin       sine                     cos       cosine
tan       tangent                  cot       cotangent
sec       secant                   cosec     cosecant
arcsin    principal value of arc sine (sine inverse)
arccos    principal value of arc cosine (cosine inverse)
arctan    principal value of arc tangent (tangent inverse)
arccot    principal value of arc cotangent (cotangent inverse)
arcsec    principal value of arc secant (secant inverse)
arccosec  principal value of arc cosecant (cosecant inverse)
sinh      hyperbolic sine          cosh      hyperbolic cosine
tanh      hyperbolic tangent       coth      hyperbolic cotangent
sech      hyperbolic secant        cosech    hyperbolic cosecant
Arsinh    area-hyperbolic sine     Arcosh    area-hyperbolic cosine
Artanh    area-hyperbolic tangent  Arcoth    area-hyperbolic cotangent
Arsech    area-hyperbolic secant   Arcosech  area-hyperbolic cosecant
Analysis
$\lim_{n \to \infty} x_n = A$        A is the limit of the sequence $(x_n)$; we also write $x_n \to A$ as $n \to \infty$; e.g., $\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e$
$\lim_{x \to a} f(x) = B$            B is the limit of the function $f(x)$ as $x$ tends to $a$
$f = o(g)$ for $x \to a$             Landau symbol "small o"; means $f(x)/g(x) \to 0$ as $x \to a$
$f = O(g)$ for $x \to a$             Landau symbol "big O"; means $f(x)/g(x) \to C$ ($C = \text{const}$, $C \neq 0$) as $x \to a$
$\sum_{i=1}^{n}$                     sum of n terms for i equals 1 to n
$\prod_{i=1}^{n}$                    product of n factors for i equals 1 to n
$f(\,)$, $\varphi(\,)$               notation for a function, e.g., $y = f(x)$, $u = \varphi(x, y, z)$
$\Delta$                             difference or increment, e.g., $\Delta x$ (delta x)
$d$                                  differential, e.g., $dx$ (differential of x)
$\frac{d}{dx}, \frac{d^2}{dx^2}, \ldots, \frac{d^n}{dx^n}$    determination of the first, second, …, n-th derivative with respect to x
$f'(x), f''(x), f'''(x), f^{(4)}(x), \ldots, f^{(n)}(x)$ or $y', y'', \ldots, y^{(n)}$    first, second, …, n-th derivative of the function $f(x)$ or of the function y
$\frac{\partial}{\partial x}, \frac{\partial^2}{\partial x^2}, \ldots$    determination of the first, second, …, n-th partial derivative
$\frac{\partial^2}{\partial y\,\partial x}$    determination of the second partial derivative, first with respect to x, then with respect to y
$f_x, f_{xx}, \ldots$                first, second, … partial derivative of the function $f(x, y)$
$D$                                  differential operator, e.g., $Dy = y'$, $D^2 y = y''$
$\operatorname{grad}\varphi$         gradient of a scalar field ($\operatorname{grad}\varphi = \nabla\varphi$)
$\operatorname{div}\vec{V}$          divergence of a vector field ($\operatorname{div}\vec{V} = \nabla\cdot\vec{V}$)
$\operatorname{rot}\vec{V}$          rotation or curl of a vector field ($\operatorname{rot}\vec{V} = \nabla\times\vec{V}$)
$\nabla = \frac{\partial}{\partial x}\,\vec{i} + \frac{\partial}{\partial y}\,\vec{j} + \frac{\partial}{\partial z}\,\vec{k}$    nabla operator, here in Cartesian coordinates (also called the Hamiltonian differential operator, not to be confused with the Hamilton operator in quantum mechanics)
$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$    Laplace operator
$\frac{\partial\varphi}{\partial\vec{a}} = \vec{a}^{\,0}\cdot\operatorname{grad}\varphi$    directional derivative, i.e., derivative of the scalar field $\varphi$ into the direction $\vec{a}$
$\int_a^b f(x)\,dx$                  definite integral of the function f between the limits a and b
$\int_C f\,ds$                       line integral of the first kind with respect to the space curve C with arclength s
$\oint$                              integral along a closed curve (circulatory integral)
$\iint_S$                            double integral over a planar region S
$\iint_S f\,dS$                      surface integral of the first kind over a spatial surface S (see (8.152b), p. 480)
$\iiint_V$                           triple integral or volume integral over the volume V
$\oiint$                             surface integrals over a closed surface in vector analysis
$A = \max!$                          expression A is to be maximized; similarly min!, extreme!
$A = \max$                           expression A is maximal; similarly min, extreme.
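As a brief worked illustration of the operator notation above (the fields are chosen here only as simple examples): for the scalar field $\varphi(x, y, z) = x^2 + y^2 + z^2$ one has $\operatorname{grad}\varphi = 2x\,\vec{i} + 2y\,\vec{j} + 2z\,\vec{k}$ and $\Delta\varphi = 6$; for the vector field $\vec{V} = x\,\vec{i} + y\,\vec{j} + z\,\vec{k}$ one has $\operatorname{div}\vec{V} = 3$ and $\operatorname{rot}\vec{V} = \vec{0}$.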