\>n(pim)\ f ° r n ,—•&* + s n\
an
Y w, m, the result follows. D
We now give some other recurrence relations that hold under certain conditions. Theorem 4.5.5. Assume that $\phi_n$ is nondegenerate. Then the following recurrence is satisfied for $n = 1, 2, \ldots$: $\delta_n\phi_n^*(z) + (|\varepsilon_n|^2 - |\delta_n|^2)$
*n
mn-i(an-i)mn(z)
(4.47)

Proof. This formula is obtained by taking $\phi_{n-1}^*$ from (4.46) and substituting it into (4.45). Formula (4.12) then gives the result. □

In the disk situation, with all $\alpha_k = 0$, we get the Szegő polynomials and then this formula is well known. See, for example, Ref. [120]. A combination of (4.46) and (4.47) leads in the nondegenerate case to a three-term recurrence relation for the sequence $\{\phi_0, \phi_0^*, \phi_1, \phi_1^*, \phi_2, \ldots\}$, hence to a continued fraction like the one studied in Section 4.4. For the general situation we need that the $\phi_n$ are nondegenerate and that $|\delta_n| \ne |\varepsilon_n|$. This is automatically satisfied when all the $\alpha_k$ are in $\mathbb{O}$ or when they are all in $\mathbb{O}^e$. Also, the contracted form holds, that is, a three-term recurrence between the elements of the sequence $\{\phi_0, \phi_1, \phi_2, \ldots\}$, but since we should be able to divide by $\lambda_n$ (see (4.39)), we now have to require that the $\delta_n$ are nonzero (i.e., that the sequence $\phi_n$ is nonexceptional). Thus without further proof, we may state
4. Recurrence and second kind functions
Theorem 4.5.6. Assume that the sequence $\phi_n$ is nondegenerate and $|\delta_n| \ne |\varepsilon_n|$. Define the sequence $f_{2k} = \phi_k$, $f_{2k+1} = \phi_k^*$, $k = 0, 1, \ldots$. Then the subsequent $f_k$'s are the denominators of the subsequent convergents of the continued fraction
where for k = 1, 2 , . . . \sk\2 -
= zkek KkUJ^iz) b2k(z) = Zk&k —,
_ Sk *2Jk+l 00 = Zk — •
()
fi
The recurrence relations for numerators and denominators are
$$f_k(z) = b_k(z)f_{k-1}(z) + a_k(z)f_{k-2}(z).$$
The initial conditions $f_0 = f_1 = \kappa_0 = 1$ give the denominators; $f_0 = -\kappa_0^{-1} = -1$ and $f_1 = \kappa_0^{-1} = 1$ give the corresponding numerators (which are the associated functions of the second kind). In the disk case, with all $\alpha_k = 0$, these continued fractions are known as Perron-Carathéodory fractions (or PC-fractions for short); see Refs. [120, 122]. Since the previous continued fractions are related to the Nevanlinna-Pick generalization of the Carathéodory coefficient problem, it is appropriate to call them Nevanlinna-Pick fractions (or NP-fractions for short). Contracted versions of the previous relations give the following result: Theorem 4.5.7. Assume that $\phi_{n-1}$ is nonexceptional; then the following relation holds:
where Onto = -Zn-1
Kn-\ I
n-26n-x
ZUn(z)
mn + Z»-1-
4.5. Points not on the boundary
while
The initial conditions are
0o = ^o,
If the sequence $\phi_n$ is nonexceptional and $|\varepsilon_n| \ne |\delta_n|$ for all $n$, then the $\phi_n$ are the denominators of the convergents of the continued fraction
The numerators are the corresponding functions of the second kind. Taking once more the special case of the disk with all $\alpha_k = 0$, we get the recurrences
^n-dz)
- (1 - 1 5 — 1 1 2 ) ^ ^ — Z ^
= fl + =
$n-\
The continued fractions corresponding to the latter recurrences were called M-fractions and T-fractions, respectively (see Ref. [123]). The more general fractions of the previous theorem are called multipoint Padé fractions (or MP-fractions).
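The way a three-term recurrence of the form $f_k = b_k f_{k-1} + a_k f_{k-2}$ produces both numerators and denominators of the convergents can be illustrated with a small sketch. It uses constant coefficients and the standard textbook starting values $(A_{-1}, A_0) = (1, b_0)$, $(B_{-1}, B_0) = (0, 1)$, which are an assumption made here for illustration and not the normalization of the theorems above; the function name `convergents` is hypothetical.

```python
from fractions import Fraction

def convergents(a, b):
    """Convergents of b[0] + a[1]/(b[1] + a[2]/(b[2] + ...)).

    Numerators A_k and denominators B_k satisfy the same recurrence
        f_k = b_k * f_{k-1} + a_k * f_{k-2},
    and differ only in their starting values.
    a[0] is unused; it only keeps the indices aligned with b.
    """
    A_prev, A = Fraction(1), Fraction(b[0])   # A_{-1}, A_0
    B_prev, B = Fraction(0), Fraction(1)      # B_{-1}, B_0
    out = [A / B]
    for k in range(1, len(b)):
        A_prev, A = A, b[k] * A + a[k] * A_prev
        B_prev, B = B, b[k] * B + a[k] * B_prev
        out.append(A / B)
    return out

# All a_k = b_k = 1: the convergents are ratios of Fibonacci numbers.
golden = convergents([0, 1, 1, 1, 1], [1, 1, 1, 1, 1])
```

In the setting of this section the coefficients $a_k(z)$ and $b_k(z)$ depend on $z$, so the same loop would be run once per evaluation point.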
Para-orthogonality and quadrature
In this chapter we consider quadrature formulas that can be associated with rational functions. For the case of an interval, it is well known that the abscissas of the Gaussian quadrature formulas are zeros of orthogonal polynomials. For quadrature over the unit circle, the zeros of orthogonal polynomials are inside the unit disk. One can obtain abscissas on the unit circle itself by introducing para-orthogonal polynomials, whose zeros are simple and guaranteed to lie on the unit circle. In this chapter, we shall generalize such results to the rational case.

5.1. Interpolatory quadrature

Consider the space of rational functions
$$\mathcal{L}_{p,q} = \mathcal{L}_q \cdot \mathcal{L}_{p*} = \{fg : f \in \mathcal{L}_q,\ g \in \mathcal{L}_{p*}\}, \qquad p, q \ge 0,$$
where $\mathcal{L}_{p*} = \{f : f_* \in \mathcal{L}_p\}$. The space $\mathcal{L}_{p,q}$ can also be characterized as
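where $\pi_n(z)$ denotes the product of the factors $\varpi_k(z)$. Define the point sets $A_n = \{\alpha_1, \ldots, \alpha_n\} \subset \mathbb{C}$ and $\hat A_n = \{\hat\alpha_1, \ldots, \hat\alpha_n\}$. Select a set of $n$ mutually distinct points $N_n = \{\xi_1, \ldots, \xi_n\}$.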
We want to construct an element $R_n \in \mathcal{L}_{p,q}$, where $p + q = n - 1$, that interpolates some function $f$ in the points $N_n$. In order to give a Lagrange formula for the solution, we define the nodal polynomial
$$x_n(z) = \prod_{k=1}^{n}(z - \xi_k).$$
Then (the prime denotes derivative)
$$\ell_{nk}(z) = \frac{x_n(z)}{(z - \xi_k)\,x_n'(\xi_k)}, \qquad k = 1, \ldots, n,$$
are the well-known Lagrange polynomials, satisfying $\ell_{ni}(\xi_k) = \delta_{ik}$, $1 \le i, k \le n$. Thus the polynomial $P_n(z) = \sum_{k=1}^{n} f(\xi_k)\ell_{nk}(z)$ is the interpolating polynomial of degree $n - 1$ for $f$ in the interpolation points $N_n$. Let $L_{nk}$ be defined by
Note that we still have $L_{nk}(\xi_j) = \delta_{kj}$, so that $\sum_{k=1}^{n} f(\xi_k)L_{nk}(z) \in \mathcal{L}_{p,q}$ is an interpolant for $f(z)$ in the points $N_n$. Defining
it can be easily shown that
where the prime means derivative. As a special case, one can take $p = 0$, $q = n - 1$, so that $\mathcal{L}_{p,q} = \mathcal{L}_q = \mathcal{L}_{n-1}$. We are now interested in quadrature formulas $I_n\{f\}$ that approximate the integral
$$I_{\mathring\mu}\{f\} = \int f(t)\,d\mathring\mu(t).$$
The space $\mathcal{L}_{p,q}$ is called a domain of validity for the quadrature formula $I_n\{f\}$ if the quadrature is exact for all $f \in \mathcal{L}_{p,q}$. It is called a maximal domain of validity when neither $\mathcal{L}_{p+1,q}$ nor $\mathcal{L}_{p,q+1}$ is a domain of validity.
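The polynomial Lagrange construction used in this section, with basis functions built from the nodal polynomial, can be sketched as follows. This covers only the special case $p = 0$, $q = n - 1$ with all $\alpha_k = 0$, so that the interpolant is an ordinary polynomial of degree $n - 1$. The node set, the test function and the names `lagrange_basis` and `interpolate` are assumptions chosen for illustration.

```python
import numpy as np

def lagrange_basis(nodes, k, z):
    """l_k(z) = x_n(z) / ((z - xi_k) * x_n'(xi_k)), written as a product
    over the other nodes, where x_n is the nodal polynomial."""
    others = np.delete(nodes, k)
    return np.prod((z - others) / (nodes[k] - others))

def interpolate(nodes, values, z):
    """Evaluate sum_k values[k] * l_k(z)."""
    return sum(values[k] * lagrange_basis(nodes, k, z) for k in range(len(nodes)))

# Five distinct nodes on the unit circle and a polynomial of degree n - 1 = 4.
nodes = np.exp(2j * np.pi * np.arange(5) / 5)
f = lambda z: 3 * z**4 - 2 * z + 1
z0 = 0.3 + 0.2j
approx = interpolate(nodes, f(nodes), z0)
```

Because `f` has degree $n - 1$, the interpolant reproduces it exactly (up to rounding), which is the mechanism behind the notion of a domain of validity.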
We can immediately construct an interpolatory quadrature formula as being the integral of the interpolating function from $\mathcal{L}_{p,q}$, that is,
$$I_n\{f\} = \sum_{k=1}^{n} \lambda_{nk} f(\xi_k), \qquad \text{with } \lambda_{nk} = \int L_{nk}(t)\,d\mathring\mu(t).$$
It can be shown that an $n$-point quadrature formula with distinct nodes $\xi_k$, $k = 1, \ldots, n$, is of interpolatory type in $\mathcal{L}_{p,q}$, $p + q = n - 1$, iff $\mathcal{L}_{p,q}$ is a domain of validity for that formula. Concerning the maximality of the domain of validity we have the following theorem.

Theorem 5.1.1. Suppose the quadrature formula $I_n\{f\}$ has $n$ nodes on $\partial\mathbb{O}$ and suppose that $\mathcal{R}_{n-1} = \mathcal{L}_{n-1,n-1}$ is a domain of validity; then it is a maximal domain of validity.

Proof. We have to show that neither $\mathcal{L}_{n-1,n}$ nor $\mathcal{L}_{n,n-1}$ is a domain of validity. Suppose that $I_n\{f\}$ is exact in $\mathcal{L}_{n-1,n}$. Define $Q_n$ as the unique $Q_n \in \mathcal{L}_n$ with zeros $\xi_k$, $k = 1, \ldots, n$, of unit norm, $\langle Q_n, Q_n\rangle_{\mathring\mu} = 1$, and with positive highest degree coefficient. Because $Q_n B_{k*} \in \mathcal{L}_{n-1,n}$ for all $0 \le k < n$, we have
$$\langle Q_n, B_k\rangle_{\mathring\mu} = I_n\{Q_n B_{k*}\} = \sum_{i=1}^{n} \lambda_{ni}\,Q_n(\xi_i)\,B_{k*}(\xi_i) = 0.$$
Thus $Q_n \in \mathcal{L}_n$ is orthogonal to $\mathcal{L}_{n-1}$, and since it satisfies the precise normalizations, it has to be the orthonormal function $\phi_n$. However, we have shown before that all the zeros of $\phi_n$ are in $\mathbb{O}$ and thus we get a contradiction. The proof for $\mathcal{L}_{n,n-1}$ is similar. □

The next question to be solved is: How can we construct quadrature formulas with maximal domains of validity? The answer will be that they are of interpolatory type and use as nodes the zeros of a para-orthogonal function. This concept will be considered in the next section.

5.2. Para-orthogonal functions

In this section we introduce para-orthogonal rational functions. The zeros of such para-orthogonal functions are simple and are all on the boundary $\partial\mathbb{O}$. They will play an important role as abscissas of quadrature formulas. We follow the development and terminology used in Ref. [122].
A sequence of functions $f_n \in \mathcal{L}_n$ is called para-orthogonal when $f_n \perp \mathcal{L}_{n-1}(\alpha_n)$ while $\langle f_n, 1\rangle \ne 0 \ne \langle f_n, B_n\rangle$. Note that
$$\mathcal{L}_{n-1}(\alpha_n) = \mathcal{L}_{n-1} \cap \zeta_n\mathcal{L}_{n-1} = \{g \in \mathcal{L}_n : g(\alpha_n) = g^*(\alpha_n) = 0\} = \left\{\frac{(z - \alpha_n)p_{n-2}(z)}{\pi_{n-1}(z)} : p_{n-2} \in \mathcal{P}_{n-2}\right\}$$
(see Theorem 2.2.1). The sequence of orthogonal rational functions $\phi_n$ is not para-orthogonal since $\langle\phi_n, 1\rangle = 0$ and neither is the sequence $\phi_n^*$ because $\langle\phi_n^*, B_n\rangle = 0$. However, define the following functions for $u$ and $v$ nonzero complex numbers:
$$Q_n(z, u, v) = u\phi_n(z) + v\phi_n^*(z) \in \mathcal{L}_n.$$
Obviously these functions are para-orthogonal. It is not difficult to show (see Ref. [34]) that all functions of the form $c_nQ_n(z, u_n, v_n)$ are para-orthogonal when $c_n, u_n, v_n$ are nonzero and any para-orthogonal function can be written in this form. Next we observe that since neither $u$ nor $v$ is allowed to be zero, there is no loss of generality if we choose $(u, v) = (1, \tau)$. The normalizing constant in front does not really matter. Thus we consider in what follows only
$$Q_n(z) = Q_n(z, \tau) = \phi_n(z) + \tau\phi_n^*(z), \qquad 0 \ne \tau \in \mathbb{C}. \qquad (5.2)$$

Another concept used in this context is $k$-invariance. A function $f_n \in \mathcal{L}_n$ is called $k$-invariant iff $f_n^* = kf_n$. Again, neither $\phi_n$ nor $\phi_n^*$ can be $k$-invariant for any $k$, but $Q_n$ is $\bar\tau$-invariant if $\tau \in \mathbb{T}$. It is possible to prove [34] that any function of the form $c_nQ_n(z, \tau_n)$, $|\tau_n| = 1$, $c_n \ne 0$ is $k_n$-invariant with $k_n = \bar c_n\bar\tau_n/c_n$ and conversely that any para-orthogonal $f_n$ that is $k_n$-invariant is of that form. Thus a para-orthogonal function with $|\tau| = 1$ is $k$-invariant and conversely. In view of the previous remarks, we may only discuss the para-orthogonal functions $Q_n$ with $|\tau| = 1$. These functions have simple zeros that all lie in $\partial\mathbb{O}$ as we shall presently show.

Theorem 5.2.1. Let $\tau \in \mathbb{T}$ be given and define the para-orthogonal functions $Q_n(z) = Q_n(z, \tau)$ by (5.2). Then all the zeros of $Q_n(z)$ are on $\partial\mathbb{O}$ and they are simple.

Proof. We shall give this proof only for the case of $\partial\mathbb{O} = \mathbb{T}$. The case $\partial\mathbb{O} = \overline{\mathbb{R}}$ is even simpler and we leave it as an exercise.
Since $\phi_n^*$ does not vanish in $\mathbb{D}$, the ratio $\phi_n/\phi_n^*$ is well defined in $\mathbb{D}$ and it equals $p_n/p_n^*$ if the polynomial $p_n \in \mathcal{P}_n$ is defined by $\phi_n = p_n/\pi_n$ with as always $\pi_n(z) = \prod_{k=1}^{n}(1 - \bar\alpha_kz)$. Since $b_n = p_n/p_n^*$ is a finite Blaschke product, with all its poles in $\mathbb{E}$, we have $|b_n(z)| < 1$ in $\mathbb{D}$. Suppose that $\alpha$ is a zero of $Q_n$ in $\mathbb{D}$; then $Q_n(\alpha) = 0$ and this implies $p_n(\alpha)/p_n^*(\alpha) = b_n(\alpha) = -\tau$. Since $\tau \in \mathbb{T}$ and $|b_n(\alpha)| < 1$, we get a contradiction. Hence $Q_n$ has no zeros in $\mathbb{D}$. Because, however, $Q_n^*(z, \tau) = \bar\tau Q_n(z, \tau)$, it follows that if $\alpha$ is a zero of $Q_n(z)$, then $\overline{Q_n(1/\bar\alpha)} = B_{n*}(\alpha)\bar\tau Q_n(\alpha) = 0$. Thus zeros appear in pairs $(\alpha, 1/\bar\alpha)$. Because we showed that there are no zeros in $\mathbb{D}$, there can also be no zeros in $\mathbb{E}$ by this duality property. We may therefore conclude that all the zeros are on $\mathbb{T}$. Taking derivatives, it also follows that the multiplicity of a zero $\alpha$ is equal to the multiplicity of the zero $1/\bar\alpha$.

Now we prove that they are simple zeros. Suppose there are only $s \le n - 2$ zeros with an odd multiplicity. Call them $\xi_1, \ldots, \xi_s$, possibly repeated if they are multiple. They are all in $\mathbb{T}$ and depend on $\tau$. The remaining zeros are all of even multiplicity and hence we can arrange them in doublets $(\xi_i, \xi_i)$ for $i = s + 1, \ldots, s + r$. Again, we assume that a doublet $(\xi_i, \xi_i)$ is repeated if the multiplicity of $\xi_i$ is more than 2, whence $n = s + 2r$ with $r \ge 1$. Now we note that $(z - \xi_i)^2 = (z - \xi_i)(z - 1/\bar\xi_i) = c_iz(z - \xi_i)(z - \xi_i)_*$ with $c_i = -\xi_i$. Hence, $Q_n(z) = N(z)/\pi_n(z)$ with $N$ of the form
$$N(z) = c\prod_{i=1}^{s}(z - \xi_i)\; z^r \prod_{i=s+1}^{s+r}(z - \xi_i)(z - \xi_i)_* \qquad \text{with } n - s = 2r,\ r \ge 1,$$
for some constant $c$. Consider now the function $T(z) = M(z)/\pi_{n-1}(z)$ with $M(z)$ of the form $c\prod_{i=1}^{s}(z - \xi_i)\,z^{r-1}(z - \alpha_n)$. Clearly $T \in \mathcal{L}_{n-1}$, and hence it is orthogonal to $\phi_n$, but also $T(z) \in \mathcal{L}_{n-1}(\alpha_n)$. This means that it is of the form $(z - \alpha_n)p_{n-2}(z)/\pi_{n-1}(z)$ with $p_{n-2} \in \mathcal{P}_{n-2}$. Thus it is also in $\zeta_n\mathcal{L}_{n-1}$ and therefore $\langle T, \phi_n^*\rangle_{\mathring\mu} = 0$. Consequently, $T$ is orthogonal to $Q_n$. In contrast, if we write explicitly $\langle Q_n, T\rangle_{\mathring\mu}$, we get
an integral that does not vanish. This is a contradiction, so that $s \ge n - 1$. This means that all the zeros of $Q_n$ should be simple and on $\mathbb{T}$. □

Note that in the case of the real line, there can be zeros at infinity. For example, in the case of the Lebesgue measure $\mathring\mu = \lambda$, then $\phi_0 = 1$ and $\phi_1(z) = (z - i)/(z + i)$ are the first two orthonormal functions and $Q_1(z) = [(1 + \tau)z + i(\tau - 1)]/(z + i)$, which is zero at $\infty$ for $\tau = -1$. Of course it is always possible to choose $\tau = \tau_n$ such that there is no zero at infinity.

With the para-orthogonal functions $Q_n$, we associate corresponding functions of the second kind as follows:
$$P_n(z) = P_n(z, \tau) = \psi_n(z) - \tau\psi_n^*(z), \qquad (5.3)$$
where the $\psi_n$ are the usual functions of the second kind. The following is true:

Lemma 5.2.2. Let $Q_n(z) = Q_n(z, \tau)$ be the para-orthogonal functions and let $P_n(z) = P_n(z, \tau)$ be the associated functions of the second kind as defined by (5.3); then for $n \ge 2$, we have
$$P_n(z)f(z) = \int D(t, z)\,[Q_n(t)f(t) - Q_n(z)f(z)]\,d\mathring\mu(t)$$
for an arbitrary function $f \in \mathcal{L}_{(n-1)*}(\alpha_n) = \{f : f_* \in \mathcal{L}_{n-1},\ f(\hat\alpha_n) = 0\}$.

Proof. From our results in Lemmas 4.2.2 and 4.2.3, it follows that for $n \ge 2$ and $f \in \mathcal{L}_{(n-1)*} \cap \zeta_{n*}\mathcal{L}_{(n-1)*} = \mathcal{L}_{(n-1)*}(\alpha_n)$
Multiplying this from the left by $[1\ \ \tau]$ gives the result. □
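Theorem 5.2.1 can be checked numerically in the polynomial special case (the disk with all $\alpha_k = 0$), where $\phi_n$ is a Szegő polynomial and $\phi_n^*(z) = z^n\overline{\phi_n(1/\bar z)}$. The sketch below is an illustration under assumptions made here: the weight $e^{\cos\theta}$, the value of $\tau$, and the function names are hypothetical choices, and the monic orthogonal polynomial is obtained by solving the Gram system directly rather than by the recurrences of Chapter 4. Since $\phi_n$ and its monic multiple differ by a positive constant, the zeros of $Q_n$ are unaffected.

```python
import numpy as np

def monic_szego(weight, n, m=4096):
    """Coefficients (low to high degree) of the monic polynomial of degree n
    orthogonal to 1, z, ..., z^{n-1} for d mu = weight(theta) d theta / (2 pi)."""
    theta = 2 * np.pi * np.arange(m) / m
    w = weight(theta)
    # Gram matrix G[j, k] = <z^j, z^k> = int e^{i(j-k) theta} d mu (trapezoidal rule)
    G = np.array([[np.mean(w * np.exp(1j * (j - k) * theta))
                   for k in range(n + 1)] for j in range(n + 1)])
    # Orthogonality to z^k, k < n: sum_j a_j G[j, k] = -G[n, k]
    a = np.linalg.solve(G[:n, :n].T, -G[n, :n])
    return np.append(a, 1.0)

def para_orthogonal_zeros(weight, n, tau):
    phi = monic_szego(weight, n)
    phi_star = np.conj(phi[::-1])      # coefficients of z^n * conj(phi(1/conj(z)))
    q = phi + tau * phi_star           # Q_n = phi_n + tau * phi_n^*
    return np.roots(q[::-1])           # np.roots expects the highest degree first

zeros = para_orthogonal_zeros(lambda t: np.exp(np.cos(t)), n=6, tau=np.exp(0.7j))
moduli = np.abs(zeros)                 # all expected to equal 1 up to rounding
```

Replacing `tau` by a number of modulus different from 1 moves the zeros off the circle, which is consistent with the restriction $|\tau| = 1$ made above.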
The previous lemma says nothing about the case $n = 1$. The following result will be useful in that case.

Lemma 5.2.3. Let $Q_n$ denote the para-orthogonal functions and suppose $Q_1$ has a zero $\xi \in \partial\mathbb{O}$. Let $P_1$ be the associated second kind function. Then
with $D(t, z)$ the Riesz-Herglotz-Nevanlinna kernel. More explicitly, this means that if $\eta$ is the zero of $P_1$, then $\eta$ and $\xi$ are related by $\eta = -\xi$ for $\partial\mathbb{O} = \mathbb{T}$ and $\eta = -1/\xi$ for $\partial\mathbb{O} = \overline{\mathbb{R}}$.

Proof. The proof requires some calculations, but one uses essentially the first step of the recurrence relation to find explicit expressions for $Q_1 = \phi_1 + \tau\phi_1^*$ and $P_1 = \psi_1 - \tau\psi_1^*$. Since for $\partial\mathbb{O} = \mathbb{T}$, $\zeta_0(z) = z$, one sees that the only difference then gives a sign change, which accounts for $\eta = -\xi$. In the case $\partial\mathbb{O} = \overline{\mathbb{R}}$, this essentially corresponds to a Cayley transform of this result. □
5.3. Quadrature
Now we consider the space of rational functions
$$\mathcal{R}_n = \mathcal{L}_n \cdot \mathcal{L}_{n*} = \mathcal{L}_{n,n},$$
where $\mathcal{L}_{n*} = \{f : f_* \in \mathcal{L}_n\}$. The space $\mathcal{R}_n$ can also be characterized as
where $\pi_n(z)$ denotes the product of the factors $\varpi_k(z)$. Let
$$\ell_{ni}(z) = \prod_{k \ne i}\frac{z - \xi_k}{\xi_i - \xi_k}$$
denote the Lagrange polynomial for the interpolation points $\xi_1, \xi_2, \ldots, \xi_n$, so that $\sum_{k=1}^{n} R(\xi_k)\ell_{nk}(z)$ is the interpolating polynomial of degree $n - 1$ for $R$, which we take from $\mathcal{R}_{n-1}$. Let $L_{ni}$ be defined by (compare with (5.1))
r.^"-lfe)
m
»(z) QnJZ) TTn-1 fe) ^ « (& ) (Z - §i) Q'n (§i)
where the prime means derivative and $\xi_k$, $k = 1, \ldots, n$ are the zeros of $Q_n(z) = \phi_n(z) + \tau\phi_n^*(z)$, which are all on $\partial\mathbb{O}$. Note that $L_{ni}(\xi_j) = \delta_{ij}$, so that $\sum_{k=1}^{n} R(\xi_k)L_{nk}(z)$ is an interpolant for $R(z)$ from $\mathcal{L}_{n-1}$ in the points $\xi_1, \ldots, \xi_n$. Consider the interpolation error
k=\
Here $t_{n-1}(z) = z^{-(n-1)}p_{n-1}(z)$ for the disk and $t_{n-1}(z) = p_{n-1}(z)$ for the half plane, with $p_{n-1} \in \mathcal{P}_{2n-2}$ in both cases. From the interpolating property, we find
with $q(z)$ of the form $q(z) = q_{n-2}(z)z^{-(n-1)}$ for the disk and $q(z) = q_{n-2}(z)$ for the half plane, where $q_{n-2} \in \mathcal{P}_{n-2}$ in both cases. Now, because $\xi_i$, $i = 1, \ldots, n$ are the zeros of $Q_n(z)$, we can also write this as
$$E(z) = Q_n(z)\,S_*(z). \qquad (5.4)$$
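The second factor can be written as $S_*$ with $S$ defined by
$$S(z) = \frac{(z - \alpha_n)\,q_{n-2}^*(z)}{\pi_{n-1}(z)}, \qquad q_{n-2} \in \mathcal{P}_{n-2}.$$
Observe that $S \in \mathcal{L}_{n-1}$ and also $S \in \zeta_n\mathcal{L}_{n-1}$, so that $S$ is orthogonal to $\phi_n$ and to $\phi_n^*$ and hence is orthogonal to $Q_n$. Thus $\langle Q_n, S\rangle_{\mathring\mu} = \int E\,d\mathring\mu = 0$. In other words, if $R \in \mathcal{R}_{n-1}$ and $\xi_k$, $k = 1, \ldots, n$ are the zeros of $Q_n = \phi_n + \tau\phi_n^*$, $|\tau| = 1$, then
$$\int R\,d\mathring\mu = \sum_{i=1}^{n} R(\xi_i)\int L_{ni}\,d\mathring\mu = \sum_{i=1}^{n} \lambda_{ni}R(\xi_i),$$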
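which is a quadrature on the boundary $\partial\mathbb{O}$. We shall refer to this as a rational Szegő quadrature formula or an R-Szegő formula for short. Since $L_{ni}L_{ni*} \in \mathcal{R}_{n-1}$ and also $L_{ni}L_{ni*} - L_{ni} \in \mathcal{R}_{n-1}$, we get
$$\int (L_{ni}L_{ni*} - L_{ni})\,d\mathring\mu = \sum_{k=1}^{n} \lambda_{nk}\cdot 0 = 0.$$
Thus $\lambda_{ni} = \int L_{ni}\,d\mathring\mu = \int |L_{ni}|^2\,d\mathring\mu > 0$. Thus we have proved the following theorem.

Theorem 5.3.1. Let $\{\phi_k\}$ be an orthonormal system for $\mathcal{L}_n$, $\tau \in \mathbb{T}$ and $Q_n = \phi_n + \tau\phi_n^*$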
jxn-x(z)
(5.5)
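is exact for all $R \in \mathcal{R}_{n-1} = \mathcal{L}_{n-1}\cdot\mathcal{L}_{(n-1)*}$.

Consider the discrete measure $\mathring\mu_n$ for this quadrature formula. This means
$$\mathring\mu_n(t) = \sum_{j=1}^{n} \lambda_{nj}\,\delta_{\xi_j}(t), \qquad (5.6)$$
where $\delta_{\xi_j}(t)$ is the Dirac delta function at $t = \xi_j$. The previous theorem now says the following.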
Corollary 5.3.2. With the notation of the previous theorem and the discrete measure $\mathring\mu_n$ as just introduced, it holds that
$$\int R(t)\,d\mathring\mu(t) = \int R(t)\,d\mathring\mu_n(t) \quad \text{for all } R \in \mathcal{R}_{n-1}$$
or, equivalently,
$$\langle f, g\rangle_{\mathring\mu} = \langle f, g\rangle_{\mathring\mu_n} \quad \text{for all } f, g \in \mathcal{L}_{n-1}.$$

Proof. The first observation is a restatement of the previous theorem. The second one is a consequence of the fact that $R \in \mathcal{R}_{n-1}$ iff $R = f\cdot g_*$ for $f, g \in \mathcal{L}_{n-1}$. □

Clearly, using the Riesz-Herglotz-Nevanlinna kernel $D(t, z)$,
$$R_n(z) = \int D(t, z)\,d\mathring\mu_n(t) = \sum_{j=1}^{n} \lambda_{nj}D(\xi_j, z)$$
can be written as
$$R_n(z) = -\frac{S_n(z)}{Q_n(z)}, \qquad (5.7)$$
where $Q_n$ is the para-orthogonal function and $S_n$ is some element from $\mathcal{L}_n$. We shall now show that $S_n = P_n = \psi_n - \tau\psi_n^*$ is the associated function of the second kind.

Theorem 5.3.3. Let $Q_n$ be the para-orthogonal function $Q_n = \phi_n + \tau\phi_n^*$, $|\tau| = 1$. Let $\xi_1, \ldots, \xi_n \in \partial\mathbb{O}$ be its zeros and $\mathring\mu_n$ the discrete measure (5.6) for the associated quadrature formula. Then, for $n \ge 1$,
$$R_n(z) = \int D(t, z)\,d\mathring\mu_n(t) = -\frac{P_n(z)}{Q_n(z)},$$
where $P_n = \psi_n - \tau\psi_n^*$ is the function of the second kind associated with $Q_n$.

Proof. To avoid notational confusion, suppose $S_n$ is defined by (5.7), that is,
$$S_n(z) = -Q_n(z)\int D(t, z)\,d\mathring\mu_n(t),$$
while $P_n = \psi_n - \tau\psi_n^*$ is the second kind function. We have to prove that $S_n = P_n$.
We first consider the case $n \ge 2$. We know by Lemma 5.2.2 that then
$$P_n(z) = \int D(t, z)\left[Q_n(t)\frac{f(t)}{f(z)} - Q_n(z)\right]d\mathring\mu(t),$$
where $f \in \mathcal{L}_{(n-1)*} \cap \zeta_{n*}\mathcal{L}_{(n-1)*}$. For such functions $f$, it can be verified that
$$D(t, z)\left[Q_n(t)\frac{f(t)}{f(z)} - Q_n(z)\right] \in \mathcal{R}_{n-1} = \mathcal{L}_{n-1}\cdot\mathcal{L}_{(n-1)*}.$$
By the previous theorem, we may then replace the integral by the quadrature formula (i.e., we may replace $\mathring\mu$ by $\mathring\mu_n$). Thus (recall that the $\xi_j$ are zeros of $Q_n$)
$$P_n(z) = -Q_n(z)\sum_{j=1}^{n} \lambda_{nj}D(\xi_j, z) = -Q_n(z)\int D(t, z)\,d\mathring\mu_n(t) = S_n(z).$$
Hence the theorem is proved for $n \ge 2$. For the case $n = 1$, we use Lemma 5.2.3. Note that $\lambda_{11} = \int d\mathring\mu(t) = 1$, and thus
$$S_1(z) = -Q_1(z)R_1(z) = -Q_1(z)\int D(t, z)\,d\mathring\mu_1(t) = -Q_1(z)D(\xi_1, z) = P_1(z).$$
The proof of the theorem is now complete. □
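The quadrature formula of this section can be tried out in the polynomial special case (the disk with all $\alpha_k = 0$), where $\mathcal{R}_{n-1}$ is spanned by $z^k$, $|k| \le n - 1$. The following sketch is only an illustration under assumptions made here: the weight $1 + 0.6\cos\theta$, the value of $\tau$ and all variable names are hypothetical, and the weights are obtained by solving the interpolatory moment equations rather than by any formula from the text.

```python
import numpy as np

n, tau, m = 5, np.exp(1.1j), 4096
theta = 2 * np.pi * np.arange(m) / m
w = 1.0 + 0.6 * np.cos(theta)            # hypothetical positive weight, total mass 1

def moment(k):
    """int z^k d mu with d mu = w(theta) d theta / (2 pi) (trapezoidal rule)."""
    return np.mean(w * np.exp(1j * k * theta))

# Monic orthogonal polynomial of degree n, coefficients from low to high degree.
G = np.array([[moment(j - k) for k in range(n + 1)] for j in range(n + 1)])
phi = np.append(np.linalg.solve(G[:n, :n].T, -G[n, :n]), 1.0)

# Para-orthogonal polynomial Q_n = phi_n + tau * phi_n^*; its zeros are the nodes.
q = phi + tau * np.conj(phi[::-1])
nodes = np.roots(q[::-1])

# Interpolatory weights: exactness on 1, z, ..., z^{n-1} determines them uniquely.
V = np.vander(nodes, n, increasing=True).T        # V[k, j] = nodes[j] ** k
weights = np.linalg.solve(V, np.array([moment(k) for k in range(n)]))

# Largest error over z^k with |k| <= n - 1, and the error for z^n.
err_inside = max(abs(weights @ nodes**k - moment(k)) for k in range(-(n - 1), n))
err_outside = abs(weights @ nodes**n - moment(n))
```

Under these assumptions the weights come out real and positive, `err_inside` is at rounding level, and `err_outside` is not small, in line with the maximality of the domain of validity $\mathcal{R}_{n-1}$.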
We have shown in this section how to construct a quadrature formula with maximal domain of validity $\mathcal{R}_{n-1}$. It was of interpolatory type and the nodes were the zeros of a para-orthogonal function. We can also prove that these are essentially all the possible quadrature formulas with this domain of validity.

Theorem 5.3.4. Let $I_n\{f\}$ be an $n$-point quadrature formula with distinct nodes $\xi_k$, $k = 1, \ldots, n$, all on $\partial\mathbb{O}$. Then it has maximal domain of validity $\mathcal{R}_{n-1}$ iff it is of interpolatory type in $\mathcal{L}_{p,q}$ with $p + q = n - 1$ and the nodes are the zeros of a para-orthogonal function in $\mathcal{L}_n$ with $|\tau| = 1$.

Proof. ⇒: Let $p$ and $q$ be nonnegative integers with $p + q = n - 1$; then $\mathcal{L}_{p,q} \subset \mathcal{R}_{n-1}$. Hence $I_n\{f\}$ is exact in $\mathcal{L}_{p,q}$. Let $x_n(z) = \prod_{k=1}^{n}(z - \xi_k)$ be the nodal polynomial and define $X_n(z) = x_n(z)/\pi_n(z)$, $\pi_n(z) = \prod_{k=1}^{n}\varpi_k(z)$; then
$$X_n^*(z) = B_n(z)X_{n*}(z) = k_nX_n(z),$$
with $k_n = \prod_{k=1}^{n} z_k\bar\xi_k$ for $\mathbb{D}$ and $k_n = \prod_{k=1}^{n} z_k$ for $\mathbb{U}$. Hence $X_n$ is $k_n$-invariant and thus, by the remarks preceding Theorem 5.2.1, it is para-orthogonal since $|k_n| = 1$.

⇐: Conversely, for nonnegative integers $p$ and $q$ with $p + q = n - 1$, define $L_k^{(p)} = L_k \in \mathcal{L}_{p,q}$ by $L_k(\xi_j) = \delta_{kj}$ and set $\lambda_k^{(p)} = \lambda_k = I_{\mathring\mu}\{L_k\}$, $k = 1, \ldots, n$. We prove that $I_n\{f\} = \sum_{k=1}^{n} \lambda_kf(\xi_k)$ is exact in $\mathcal{R}_{n-1}$. So let $f \in \mathcal{R}_{n-1}$ and consider the function $E(z) = f(z) - \sum_{k=1}^{n} L_k(z)f(\xi_k) \in \mathcal{R}_{n-1}$. Since $E(\xi_k) = 0$ for $k = 1, \ldots, n$, we can now show as in (5.4) that it is of the form $E(z) = X_n(z)g_*(z)$, where $X_n \in \mathcal{L}_n$ is para-orthogonal and $g(z) = (z - \alpha_n)q_{n-2}^*(z)/\pi_{n-1}(z)$, $q_{n-2} \in \mathcal{P}_{n-2}$, and thus with $g \in \mathcal{L}_{n-1}(\alpha_n)$. Consequently,
with
This means that the error is zero for any $f \in \mathcal{R}_{n-1}$. We conclude by showing that the weights are independent of $p$ (i.e., that $\lambda_k^{(p)} = \lambda_k^{(p')}$). It suffices to prove this for $p' = p + 1$. Define $X_n^{(p)}(z)$ as $x_n(z)/[\pi_p^*(z)\pi_{q+1}(z)]$; then we know that (prime means derivative)
-
Now =
X^ (z)
and
Hence
or L^
(z) = L\P) (z) + — —
Thus we have to prove that t that is, t -
J
r— L\P) (
The latter is equal to $\int X_n^{(p)}(t)h_*(t)\,d\mathring\mu(t)$, with $h \in \mathcal{L}_{n-1}(\alpha_n)$, and this is zero by para-orthogonality of $X_n^{(p)}$. □
This theorem shows that there is a one-parameter family (depending on $\tau \in \mathbb{T}$) of quadrature formulas that have maximal domain of validity $\mathcal{R}_{n-1}$. It has the following properties:
1. Its nodes $\xi_k$, $k = 1, \ldots, n$ are the distinct zeros of the para-orthogonal function $Q_n = \phi_n + \tau\phi_n^*$, $\tau \in \mathbb{T}$, which are all on $\partial\mathbb{O}$.
2. The positive weights are given as $\lambda_{nk} = \int L_{nk}(t)\,d\mathring\mu(t)$, where the $L_{nk} \in \mathcal{L}_{n-1}$ are defined by $L_{nk}(\xi_j) = \delta_{kj}$, $k, j = 1, \ldots, n$.
This characterizes the set of all $n$-point rational Szegő quadrature formulas.

5.4. The weights

We shall derive in this section some alternative expressions for the weights of the quadrature formula as given in Theorem 5.3.1. A first result expresses them in terms of the para-orthogonal functions and the associated ones of the second kind.

Theorem 5.4.1. The weights $\lambda_{nk}$ of the quadrature formula of Theorem 5.3.1 are given by
nk
1
UT0(a0)
Pni^k)
~ 2m^)m^^)Q^y
" '''"' *'
where the prime means differentiation with respect to $z$, $Q_n$ are the para-orthogonal functions $Q_n = \phi_n + \tau\phi_n^*$, $|\tau| = 1$ with zeros $\xi_1, \ldots, \xi_n \in \partial\mathbb{O}$, and $P_n = \psi_n - \tau\psi_n^*$ are the associated functions of the second kind. Note that for $\partial\mathbb{O} = \mathbb{T}$ the middle factor is just $1/\xi_k$, and for $\partial\mathbb{O} = \overline{\mathbb{R}}$, it is
Proof. From the partial fraction decomposition
it follows that
Taking the limit for $z \to \xi_k$ cancels all the other terms in the sum and we get
= ^-nk l i m ^K£jb z)(z — &)•
This gives the result.
•
Another expression for the weights can be obtained in terms of the Christoffel function. This is given in the next theorem.

Theorem 5.4.2. With the same notation as in the previous theorem, it holds that
$$\lambda_{nk} = \left[\sum_{j=0}^{n-1} |\phi_j(\xi_k)|^2\right]^{-1}.$$

Proof. We note that $\xi_k \in \partial\mathbb{O}$, so that $\overline{\zeta_n(\xi_k)} = 1/\zeta_n(\xi_k)$. Therefore, we can use the Christoffel-Darboux relation with $w = \xi_k$ and $z = t$ to get
Using
the limit for t -> i-k yields 1
The determinant formula also gives p (t
-
2
Using the formula of the previous theorem then yields n-\
Kk =
>„*(&)
-l
Since $\xi_k \in \partial\mathbb{O}$, and hence
this proves the theorem.

5.5. An alternative approach

An R-Szegő formula can also be characterized in an alternative way, by using Hermite interpolation. We shall presently explain this. It is similar to the approach given by Markov for classical Gauss formulas. Using the same notation as in the rest of this chapter, we consider a set $N_n = \{\xi_1, \ldots, \xi_n\} \subset \mathbb{C}\setminus\{A_n \cup \hat A_n\}$ of distinct nodes. Since $\mathcal{L}_{p,q}$ is a Haar subspace (i.e., it is spanned by a Chebyshev system) there exists a unique $R \in \mathcal{L}_{p,q}$ ($p + q = 2(n - 1)$) that is a Hermite interpolant for a given function $f$, in the sense that
$$R(\xi_i) = f(\xi_i), \quad i = 1, \ldots, n, \qquad \text{and} \qquad R'(\xi_i) = f'(\xi_i), \quad i = 1, \ldots, n - 1. \qquad (5.8)$$
We can represent $R$ as given in the following lemma, which can be found in Ref. [39].

Lemma 5.5.1. With the notation introduced above, the Hermite interpolant $R$ can be given as
$$R(z) = \sum_{j=1}^{n} H_{j0}(z)f(\xi_j) + \sum_{j=1}^{n-1} H_{j1}(z)f'(\xi_j),$$
where $H_{j0}, H_{j1} \in \mathcal{R}_{n-1}$ can be characterized by
H!0Gj) = 0 , HnGj)=0, H!0Gj) = 0 ,
1 < i < n, 1 < j < n - 1; l < i
The functions $H_{i1}$ are of the form
$$H_{i1}(z) = c_{in}\,\frac{x_n(z)\,x_{n-1}(z)}{(z - \xi_i)\,\pi_{n-1}(z)\,\pi_{n-1}^*(z)},$$
where $x_n(z)$ is the nodal polynomial and $c_{in}$ is a constant.
Obviously, such an interpolant gives rise to a quadrature formula n
n—\
7=1
7=1
with Kj=
[Hi0(t)diX(t),
l < j < n ,
l
n j
=
fHn(t)dil(t),
l < j < n - l .
It can now be shown that this is in fact the same formula we had before.

Theorem 5.5.2. Consider the quadrature formula (5.9) where the nodes $\xi_k$ are supposed to be the zeros of the para-orthogonal function $Q_n = \phi_n + \tau\phi_n^*$, $|\tau| = 1$; then it is an R-Szegő formula, that is, it is identical to the $n$-point quadrature formula (5.5).

Proof. We have to show that the weights multiplying the derivative values $f'(\xi_i)$ vanish for $1 \le i \le n - 1$. Using the expression for $H_{i1}$ from the previous lemma, we get, up to a constant factor, that the weights are given by
$$\int \frac{x_n(t)}{\pi_n(t)}\,\frac{\varpi_n(t)\,x_{n-1}(t)}{(t - \xi_i)\,\pi_{n-1}^*(t)}\,d\mathring\mu(t) = \int Q_n(t)h_*(t)\,d\mathring\mu(t) = \langle Q_n, h\rangle_{\mathring\mu},$$
where $h \in \mathcal{L}_{n-1}(\alpha_n)$, and therefore the latter inner product vanishes. This proves the theorem. □
Interpolation
In this chapter we discuss several aspects related to interpolation. In the first section, we derive some simple interpolation properties that can be easily obtained from the properties of the functions of the second kind that were studied earlier. It also turns out that interpolation of the positive real function $\Omega_\mu$, whose Riesz-Herglotz-Nevanlinna measure $\mathring\mu$ is the measure that we used for the inner product, will imply that in $\mathcal{L}_n$ the measure can be replaced by the rational Riesz-Herglotz-Nevanlinna measure for the interpolant without changing the inner product. Some general theorems in this connection will be proved in Section 6.2. This will be important for the constructive proof of the Favard theorems to be discussed in Chapter 8. We then summarize the interpolation results that can be obtained with the reproducing kernels and some functions that are in a sense reproducing kernels of the second kind. We then show the connection with the algorithm of Nevanlinna-Pick in Section 6.4. This algorithm provides an alternative way to find the coefficients for the recurrence of the reproducing kernels that we gave in Section 3.2, without explicitly generating the kernels themselves. If all the interpolation points are at the origin, then the algorithm reduces to the Schur algorithm. It was designed originally to check whether a given function is in the Schur class. It basically generates a sequence of Schur functions by Möbius transforms and extractions of zeros. Section 6.5 gives a similar algorithm that works for the orthogonal functions rather than the reproducing kernels.
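As a rough illustration of the last remark, the classical Schur algorithm in its usual textbook form, $f_{k+1}(z) = \dfrac{1}{z}\,\dfrac{f_k(z) - \gamma_k}{1 - \bar\gamma_kf_k(z)}$ with $\gamma_k = f_k(0)$, can be run on truncated Taylor coefficients. The sketch below is an assumption-laden illustration and not the Nevanlinna-Pick algorithm of Section 6.4; the function name `schur_parameters` and the stopping tolerance are hypothetical.

```python
import numpy as np

def schur_parameters(c, tol=1e-12):
    """Schur parameters gamma_k = f_k(0) from the Taylor coefficients c of f_0 at 0.

    One step: f_{k+1}(z) = (f_k(z) - gamma_k) / (z * (1 - conj(gamma_k) * f_k(z))).
    Stops when |gamma_k| >= 1 (up to tol) or when the coefficients run out.
    """
    f = np.asarray(c, dtype=complex)
    gammas = []
    while f.size > 0:
        g = f[0]
        gammas.append(g)
        if abs(g) >= 1 - tol or f.size == 1:
            break
        num = f.copy()
        num[0] = 0.0                       # f_k - gamma_k
        den = -np.conj(g) * f
        den[0] += 1.0                      # 1 - conj(gamma_k) * f_k
        quo = np.zeros_like(f)
        for m in range(f.size):            # power-series division num / den
            s = sum(den[j] * quo[m - j] for j in range(1, m + 1))
            quo[m] = (num[m] - s) / den[0]
        f = quo[1:]                        # extract the zero at the origin
    return gammas

# Blaschke factor (z + a) / (1 + a z) with a = 0.5, expanded at the origin.
a = 0.5
coeffs = [a] + [(1 - a**2) * (-a) ** (k - 1) for k in range(1, 8)]
gammas = schur_parameters(coeffs)
```

For this degree-one Blaschke factor the recursion terminates after two steps with a parameter of modulus 1, which is the usual signature of a finite Blaschke product; a parameter of modulus larger than 1 would indicate that the function is not in the Schur class.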
6.1. Interpolation properties for orthogonal functions

We shall give some interpolation properties that are easily derived from the properties given in Section 4.2 in connection with the functions of the second kind.
Since by definition
$$\int D(t, z)\phi_n(z)\,d\mathring\mu(t) = \phi_n(z)\int D(t, z)\,d\mathring\mu(t) = \phi_n(z)\Omega_\mu(z),$$
we can derive the following interpolation properties.

Theorem 6.1.1. Let $\Omega_\mu$ be the positive real function with Riesz-Herglotz-Nevanlinna measure $\mathring\mu$. We introduce the Blaschke products $\dot B_{-1} = 1$ and $\dot B_n = \zeta_n\dot B_{n-1} = \zeta_0B_n$ for $n \ge 0$. Then for the orthonormal functions and the functions of the second kind, it holds that
Bn-l
For their superstar conjugates, we find — G H (O),
n > 0.
(6.2)
Proof. For $n = 0$, the relation (6.1) is obvious knowing that $\phi_0 = \psi_0 = 1$ and that $\Omega_\mu \in H(\mathbb{O})$. Use (4.22) for $f = 1/B_{n-1}$ and $n > 0$ to get
(6.3)
For $z = \alpha_0$, the integral equals
$$\int \frac{\phi_n(t)}{B_{n-1}(t)}\,d\mathring\mu(t) = \langle\phi_n, B_{n-1}\rangle_{\mathring\mu} = 0.$$
Moreover, Equation (6.3) is analytic in $\mathbb{O}$ as a Cauchy-Stieltjes integral. For the relation (6.2), one can similarly check the case $n = 0$ (use $\Omega_\mu(\alpha_0) = 1$), and for $n > 0$, use (4.23) with $g = 1/B_n$ to see that
which for $z = \alpha_0$ equals zero because $D(t, \alpha_0) = 1$ and $\langle\phi_n^*, B_n\rangle_{\mathring\mu} = 0$. Also, Equation (6.4) is analytic in $\mathbb{O}$ as a Cauchy-Stieltjes integral. This proves the theorem. □

We have a simple consequence for the para-orthogonal functions.
Corollary 6.1.2. With the same notation as in the previous theorem and with the para-orthogonal functions $Q_n(z) = \phi_n(z) + \tau\phi_n^*(z)$, $\tau \in \mathbb{T}$, and the associated functions of the second kind $P_n(z) = \psi_n(z) - \tau\psi_n^*(z)$, it holds that
Proof. This function is of the form $f(z) + \tau(z - \alpha_n)g(z)$ with $f, g \in H(\mathbb{O})$, which gives the result. □

More general interpolation results can be obtained as follows.

Theorem 6.1.3. Let $\Omega_\mu$ be the Riesz-Herglotz-Nevanlinna transform of the measure $\mathring\mu$. Let $Q_n =$
- Rn(z)] = h{z) e H(Oe),
n>l,
with h(6to) = 0, and [Bn-iWr^iz)
- Rn(z)] = h(z) e
withh(a0) = 0. Proof. We use the fact that / s ( f ) d ( £ - £ „ ) ( ' ) = 0 , Vg when
£„(') = is the discrete measure associated with the quadrature formula of Theorem 5.3.1. If g e lZn-\, then obviously D(t,z)[g(t)-g(z)]enn^ and thus \
1
D(t,
Because
n»(z) - Rn(z) = JD(t, we get for any g e 1Zn-\
h(z) = [D(t, z)g(t)d(jl - AJ(O = g(z) z) - Rn(z)l Clearly, heH(OUOe)
and h(a0) =0 and h(a0) =0. By choosing g(z) =
5 n _i(z), we get the first result and by choosing g(z) = l/Bn-\(z) second one.
we get the •
For the orthonormal functions, we similarly obtain extra interpolating properties in Oe. Theorem 6.1.4. Let (pn be the orthonormal functions and tyn the functions of the second kind. Then
[Bn(z)] \Q^Z) + ^ 4 4 1 = £(*) G H (° e ) L
and
*(«o) =0,
n > 0,
0/i (Z)J
while and n>\. Also, = h(z) e H(O) a^J h(a0) = 0, w > 1,
and g(ao)=O,
n>0.
Proof. Note that this does not follow from the previous results by simply setting $\tau = 0$ or $\tau = \infty$ since the previous result relied on the fact that $\tau \in \mathbb{T}$. We shall use a direct proof instead.

For the first result, we should prove that $B_n(z)[\Omega_\mu(z) + \psi_n(z)/\phi_n(z)]$ is analytic in $\mathbb{O}^e$ and vanishes at $\hat\alpha_0$. Clearly $B_n(z)/\phi_n(z)$ is analytic in $\mathbb{O}^e$ because $\phi_n(z)$ has all its zeros in $\mathbb{O}$. We prove that the other factor is also analytic. Use (4.22) with $f(z) = 1$ to get
/
This is analytic in $\mathbb{O}^e$ as a Cauchy-Stieltjes integral. Moreover, because $D(t, \hat\alpha_0) = -1$, we find for $z = \hat\alpha_0$ and $n > 0$ that the latter integral equals $-\langle\phi_n, 1\rangle_{\mathring\mu} = 0$. For $n = 0$, we note that $D(t, \hat\alpha_0) = -1$, so that $\Omega_\mu(\hat\alpha_0) = -\int d\mathring\mu = -1$. Hence the first relation also holds for $n = 0$ because $\phi_0 = \psi_0 = 1$. For the second one, we observe that we should show that for $n \ge 1$
is analytic in O e . We use (4.23) with / = l/fn to get
and this is again analytic in $\mathbb{O}^e$ as a Cauchy-Stieltjes integral. Moreover, for $z = \hat\alpha_0$, this is equal to $\langle\phi_n^*, B_n\rangle_{\mathring\mu} = 0$. The third result is equivalent with (6.1) and the fourth is a simple consequence of (6.2) since $\phi_n^*$ has no zeros in $\mathbb{O}$. □

We can give expressions for the errors of the interpolants $R_n(z)$ of Theorem 6.1.3. We first need the following lemma, which is easily proved by induction.

Lemma 6.1.5. If $D(t, z)$ is the Riesz-Herglotz-Nevanlinna kernel, then the Newton interpolating polynomial for $D(t, z)$ in the points $A_m^0 = \{\alpha_0, \ldots, \alpha_m\}$ plus the error term are given by
, s.
D(t, z) = 1 + 2 ^ a t ( r ) ( z - «<,)»;_,(z) + 2j^-{z
- ao)n*m(z), (6.5)
where 0,(0 =
m i t )
,
Jfc=l,2,....
We prove next

Lemma 6.1.6. If $R_n(z) = -P_n(z)/Q_n(z)$ is the rational interpolant to $\Omega_\mu$ of Theorem 6.1.3, then for all $z \in \mathbb{O} \cup \mathbb{O}^e$, the approximation error $E_n = \Omega_\mu - R_n$ is given by
=
2m*(z)K-l^)Xn-l(z) uro(ao)Xn(z)
where x«fe) =
f
Xn(t)UT0(t) djiif)
J 7rw_i(O7r*_i(O
^
>
^
t-z'
Qn(z)nn(z).
Proof. By taking / ( z ) = mn(z)/ui*_l(z) find
e £(n-i)*(aw) in Lemma 5.2.2, we
(6.6) We replace D(^, z) by (6.5) with m — n — 2, and use orthogonality properties of G n (z)toget E
2TU^(Z)K_X{Z)K_2(Z)
f
mo(t)mn(t)
Qn(t)
)mn(z)Qn(z) J <_i(0<_ 2 (0 t ~
Since Qn (z) = Xn (z)/rcn (z), the proof follows.
•
Note that in the case of $\tau = 0$, and thus $R_n = -\psi_n/\phi_n$, we can use Lemma 4.2.2 instead of Lemma 5.2.2 and choose $f = 1/\pi_{n-1}^*$ in the first line of the previous proof. We then obtain

Corollary 6.1.7. If $R_n(z) = -\psi_n(z)/\phi_n(z)$ is the rational interpolant to $\Omega_\mu$ of Theorem 6.1.1, then for all $z \in \mathbb{O} \cup \mathbb{O}^e$, the approximation error $E_n = \Omega_\mu - R_n$ is given by $E_n(z) =$
2m^(z)7t*_l(z)7tn(z) f Xn(t)mo(t) ——— / —,. » .
dfr(t) , n > 1,
where xn(z) =
We transform the error expression in Lemma 6.1.6 somewhat further.
Lemma 6.1.8. For n > 2, the error En (z) of Lemma 6.1.6 can also be expressed as
' Z)
J
$z \in \mathbb{O} \cup \mathbb{O}^e$, (6.7)
where $\delta_n = c_n\int Q_n(t)\varpi_n(t)\,d\mathring\mu(t)$, with $Q_n(z) = X_n(z)/\pi_n(z)$ and $X_n(z) = c_nz^n + \cdots$.

Proof. First we show that in the integral of Lemma 6.1.6 we can bring the factor $\varpi_0(t)$ outside. This is obvious for $\mathbb{O} = \mathbb{D}$ because then $\varpi_0(t) = 1$. For $\mathbb{O} = \mathbb{U}$, we have $\varpi_0(t) = t + i$. Now
(t + i)xn(t)dfr(t) f(t + i)(t-an)Qn(t)dijL(t) = J n*_x(t)7Tn-X{t)(t -z)~ J <_i(Oa - z)
+ fe + i ) / ^ ^ M ) ^ ( 0 . J
7tn_1(t)(t — Z)
The first integral is zero by the para-orthogonality of Qn. Thus f UTo(t)Xn(f)djX(t) _ J Tt^^Ttn-iitXt -Z)~m
Z
f Xn(t)dfr(t) J TT^iOTtn-iitXt - Z)'
It remains to transform the integral in the right-hand side. Since Xn(z) = cnzn
+ •••,
so that Xnit) - Xnfe)
Cntn'1 + P(t)
Therefore,
- z)
OTnCOtc^"-1 + P(t)]
and also
The second term in the right-hand side is zero by the para-orthogonality of Qn. Since tn~l = n^_x (t) + S " I Q YjX* (0> we find, again by the para-orthogonality of Qn, that the first term is Qn(t)cntnnt-nl-mlmn(t) fQn(t)c n(t) / ^r—
/ /• 0 o dM(O = ^^ // QnQn(t)mn(t)dfi(t). dM(O
If we call this $\delta_n$, then Lemma 6.1.6 gives the desired form for the error. □

Remark. The presence of the "strange" term $\delta_n$ in (6.7) is due to the deficiencies in the orthogonality properties of $Q_n$. For a Stieltjes function with a measure on the real line, true orthogonal functions are used instead of para-orthogonal ones, and this term does not appear. Compare with, for example, Ref. [193].

As an introduction to the next section, we derive the following result, which is implied by the interpolation properties. (Note that the $\mu_n$ in this theorem is different from the discrete measure defined in Theorem 5.3.1.)

Theorem 6.1.9. Let $\phi_n$ be the orthonormal basis functions of $\mathcal{L}_n$ with respect to the measure $\mu$. Define the absolutely continuous measure $\mathring{\mu}_n$ by
$|\varpi_0(t)|^2 P(t, \alpha_n)$
K
P(t,an)
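where $P$ is the Poisson kernel. Then on $\mathcal{L}_n$, the inner products with respect to $\mathring{\mu}_n$ and $\mathring{\mu}$ are the same: $\langle\cdot,\cdot\rangle_{\mathring{\mu}} = \langle\cdot,\cdot\rangle_{\mathring{\mu}_n}$. Proof. We prove first that the norm of $\phi_n$ is the same. This is obvious since

Next we show that $\langle\phi_n, \phi_k\rangle_{\mathring{\mu}}$ and $\langle\phi_n, \phi_k\rangle_{\mathring{\mu}_n}$ are the same for $k < n$. They are both zero.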
^n~ J 0^(0 =
J
r
Since 0* has its zeros in Oe, we know that 5 n \^0^/0* is analytic in the closed domain O U 3O and then we may apply Poisson's formula, which gives zero because Bn\k(an) = 0. Of course also {fa, fa)^ = 0. Hence (j)n is a function of norm 1 and orthogonal to Cn-\ both with respect to /xw and with respect to /x. By Theorem 4.1.6 this 0 n will uniquely define all the previous fa, provided they are normalized properly with >%((Xk) = Kk > 0. Thus the orthonormal system in Cn for (in and for /x is the same: {fa, fa)^ = {fa, fa)p,n = Ski. Since every element from Cn can be expressed as a linear combination of the fa, it also holds that (/, g)^ = (/, g)^ for every / and g e Cn. u The previous theorem was proved for orthogonal polynomials in Ref. [87, p. 199]. 6.2. Measures and interpolation In this section, we want to show how interpolation properties of positive real functions are practically equivalent with the equality of inner products in some Cn spaces. In Section 4.2 we already saw that Qn = V^/0n w a s m C and interpolated Q^ e C in the points of A® = {a?o, ot\,. •., otn) in Hermite sense. By the latter we mean that if in A® some of the a^ are repeated, then the interpolation conditions satisfied refer to function values,firstderivatives, etc. In the same section, it was shown that d/Xn = \UJQ\2P(-, an)/\(pn\2dk = Re QndX and djl are two measures defining the same inner product in Cn. That this is not a coincidence will be shown in this section. See also Section 2.1 for results giving a relation between interpolation and equality of the inner product. We can already conclude from Theorem 2.1.3 that the following holds: Theorem 6.2.1. Let \x and v be two normalized measures ondO with associated positive real functions Q^ and Qv respectively: D{t, z) z) dv{t). = JD{t, z) d/JL(t) d/JL(t) and and Q Qvv(z) (z) == ID(t, I Then the equality of the inner products (-,•)# = (•»*)*> holds in Cn if and only of Q^ and Qv mutually interpolate (in Hermite sense) in the point set
A® = {(Yo, • . . , oin}. This means that Q (Z
" i~^(Z)'=gfe)€ff(O)
and g{ao) = O,
where 7T*(z) = I I L i f e ~ <**)• Proof. Indeed, when the functions in Cn are written in the appropriate basis {wk : k = 0 , . . . , n} (see Section 2.1), then the metric or Gram matrix Gn of the space Cn is a matrix that is completely defined in terms of &Sk\fii), and by the previous interpolation conditions, these are precisely the values that for Q^ and Qv coincide. D We want to give a more general result where we want to replace a?o by an arbitrary point w e Q. We shall therefore first prove a simple lemma. Lemma 6.2.2. Let /i be a normalized positive measure on T and let the positive real function Q^ be associated with it by (1.27) with c = 0. This means that £2^(0) > 0. Define also the positive real function Q^Cz, w), with w e D some parameter, by
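$$\Omega_\mu(z, w) = \int \frac{D(t, z)}{P(t, w)}\, d\mu(t) + c,$$
where $P(t, w)$ is the Poisson kernel, and the constant $c$ is given by
$$c = -i\int \frac{\operatorname{Im} D(t, w)}{\operatorname{Re} D(t, w)}\, d\mu(t) = c_w[\overline{w}\mu_{-1} - w\mu_1] \in i\mathbb{R},$$
where $c_w = 1/(1 - |w|^2)$, and $\mu_k = \overline{\mu}_{-k} = \int t^{-k}\, d\mu(t)$ are the moments of $\mu$. Then the relation between $\Omega_\mu(z)$ and $\Omega_\mu(z, w)$ is given by
$$\Omega_\mu(z, w) = c_w[(z - w)(z^{-1} - \overline{w})\Omega_\mu(z) + (z^{-1}w - \overline{w}z)\mu_0] = \frac{\Omega_\mu(z)}{P(z, w)} + c_w(z^{-1}w - \overline{w}z)\mu_0$$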
for $z \neq 0$. Moreover, $\Omega_\mu(z, 0) = \Omega_\mu(z)$ and $\mu_0 = \Omega_\mu(0) = \Omega_\mu(w, w)$ for any $w \in \mathbb{D}$, also for $w = 0$.

Proof. First, we recall that the real part of the Riesz-Herglotz-Nevanlinna kernel is given by the Poisson kernel, that is, for $t \in \mathbb{T}$
$$\operatorname{Re} D(t, w) = \frac{1 - |w|^2}{|t - w|^2} = P(t, w),$$
whereas it can be computed that
$$\operatorname{Im} D(t, w) = -i\,\frac{wt^{-1} - \overline{w}t}{(t - w)(t^{-1} - \overline{w})}.$$
Hence
$$c = -i\int \frac{\operatorname{Im} D(t, w)}{\operatorname{Re} D(t, w)}\, d\mu(t) = c_w\int (\overline{w}t - wt^{-1})\, d\mu(t) = c_w(\overline{w}\mu_{-1} - w\mu_1),$$
which is indeed in $i\mathbb{R}$. Next we compute
$$\frac{D(t, z)}{P(t, w)} - \frac{D(t, z)}{P(z, w)} = c_w\,\frac{t + z}{t - z}\,[(t - w)(t^{-1} - \overline{w}) - (z - w)(z^{-1} - \overline{w})] = c_w(t + z)\left[\frac{w}{tz} - \overline{w}\right] = c_w[(wz^{-1} - \overline{w}z) + (wt^{-1} - \overline{w}t)].$$
After integration, this results in
$$\Omega_\mu(z, w) = \frac{\Omega_\mu(z)}{P(z, w)} + c_w(wz^{-1} - \overline{w}z)\mu_0,$$
and this proves the lemma for $z \neq 0$. For $w = 0$, we have $P(t, 0) = 1$ and $D(t, 0) = 1$, so that $c = 0$, and we get $\Omega_\mu(z, 0) = \Omega_\mu(z)$. Moreover, $1/P(w, w) = 0$, so that $\Omega_\mu(w, w)$ is indeed equal to $\mu_0 = 1 = \Omega_\mu(0)$. □

For the case of the half plane, one can formulate a similar result.

Lemma 6.2.3. Let $\mu$ be a normalized positive measure on $\mathbb{R}$ and let the positive real function $\Omega_\mu$ be associated with it by the Nevanlinna transform $\Omega_\mu(z) = \int D(t, z)\, d\mathring{\mu}(t)$. This means that $\Omega_\mu(i) = 1 > 0$. Define also the positive real function $\Omega_\mu(z, w)$, with $w \in \mathbb{U}$ some parameter, by
,z) >(t,w)(l+t2)
with P(t, w) the Poisson kernel and n
lm D(t,w)
•
-
/
Re D(t, w) (1 + t2)
If fdjl = co(= 1), then the following relation between Q^iz) and £2M(z, w) holds for z / i :
with cw = —i/Im u;. Furthermore, Q/j,(z, i) = ^ ( z ) and /xo = 1 = ^ ( i ) = £2M(w, it;) for all I O G U , also for w; = i. Proof. The calculations are long and technical, but they are along the lines of the previous proof. We only give a sketch. We first note that
D(t,z)
1
D(t,z)
1
,
..
with
Integration gives
where
/(z, w) = cw A(t, z, w) djl(t). A long computation then yields /(z, w) + c = ——-^[(z2 — 1) Re w + z(l —
|W|2)]MO>
from which the lemma follows. Note that the latter formula is just $\mu_0$ if $z = w$. For $w = i$ we have $P(t, i) = (1 + t^2)^{-1}$ and $D(t, i) = 1$, so that $c = 0$ and thus $\Omega_\mu(z, i) = \Omega_\mu(z)$. Hence we get for all $w \in \mathbb{U}$ that $\mu_0 = 1 = \Omega_\mu(w, w) = \Omega_\mu(i)$. □

We note that in the two previous lemmas, the constant $c$ depends on $w$, and it is chosen such that $\Omega_\mu(w, w) = \mu_0 = 1$ in both cases. Also, it is interesting to note that the expression
$$\Omega_\mu(z, w) - \frac{\Omega_\mu(z)}{P(z, w)}$$
depends on $z$ and $w$, but the measure enters only via the moment $\mu_0$. Thus, it is independent of the actual measure in the sense that it depends only on its normalization. The previous lemmas give the following corollary.

Corollary 6.2.4. Suppose $\mu$ and $\nu$ are two normalized measures on $\partial\mathbb{O}$ to which we associate the positive real functions
$$\Omega_\mu(z) = \int D(t, z)\, d\mathring{\mu}(t) \quad\text{and}\quad \Omega_\nu(z) = \int D(t, z)\, d\mathring{\nu}(t)$$
and also
$$\Omega_\mu(z, w) = c_\mu + \int \frac{D(t, z)}{P(t, w)\,|\varpi_0(t)|^2}\, d\mathring{\mu}(t) \quad\text{and}\quad \Omega_\nu(z, w) = c_\nu + \int \frac{D(t, z)}{P(t, w)\,|\varpi_0(t)|^2}\, d\mathring{\nu}(t)$$
with proper normalizing constants c^ and cv tomakeQ^iw, w) = 1 = Qv(w, w) and where P(z, w) is the Poisson kernel. Then ° ' " ' = g(z) e H(O)
Kb)
and
g(a0) = 0
(6.8)
if and only if = gw(z)eH(O)
and
gw(w)=O.
(6.9)
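In other words, $\Omega_\mu(z)$ is a Hermite interpolant for $\Omega_\nu(z)$ in the point set $A_n^0 = \{\alpha_0, \alpha_1, \ldots, \alpha_n\}$ if and only if $\Omega_\mu(z, w)$ is a Hermite interpolant for $\Omega_\nu(z, w)$ in the point set $A_n^w = \{w, \alpha_1, \ldots, \alpha_n\}$. By Theorem 6.2.1, this happens if and only if equality of the inner products $\langle\cdot,\cdot\rangle_{\mathring{\mu}} = \langle\cdot,\cdot\rangle_{\mathring{\nu}}$ holds in $\mathcal{L}_n$.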
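The relation of Lemma 6.2.2, on which this corollary rests in the case of the disk, is easy to check numerically. The following sketch is an illustration only: it assumes a discrete probability measure on $\mathbb{T}$ with randomly chosen nodes and weights, and the helper names (`D`, `poisson`, `omega`, `omega_w`, `omega_w_closed`) are hypothetical and do not come from the text.

```python
import numpy as np

# Assumed test measure: a discrete probability measure on the unit circle.
rng = np.random.default_rng(0)
nodes = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 6))   # points t_j on T
weights = rng.uniform(0.1, 1.0, 6)
weights /= weights.sum()                                 # normalization mu_0 = 1


def D(t, z):
    """Riesz-Herglotz kernel for the disk."""
    return (t + z) / (t - z)


def poisson(t, w):
    """Poisson kernel P(t, w) for |t| = 1 and |w| < 1."""
    return (1.0 - abs(w) ** 2) / np.abs(t - w) ** 2


def omega(z):
    """Omega_mu(z) = int D(t, z) dmu(t) for the discrete measure."""
    return np.sum(weights * D(nodes, z))


def omega_w(z, w):
    """Omega_mu(z, w) computed directly from its definition."""
    d = D(nodes, w)
    c = -1j * np.sum(weights * d.imag / d.real)          # the constant c
    return np.sum(weights * D(nodes, z) / poisson(nodes, w)) + c


def omega_w_closed(z, w):
    """Omega_mu(z, w) from the closed formula of Lemma 6.2.2 (z != 0)."""
    cw = 1.0 / (1.0 - abs(w) ** 2)
    inv_p = cw * (z - w) * (1.0 / z - np.conj(w))        # 1 / P(z, w)
    return omega(z) * inv_p + cw * (w / z - np.conj(w) * z) * weights.sum()


z, w = 0.3 + 0.2j, -0.4 + 0.1j
print(abs(omega_w(z, w) - omega_w_closed(z, w)))   # difference of the two forms
print(omega_w(w, w))                               # compare with mu_0 = 1
```

Both printed quantities depend only on the assumed nodes and weights through $\Omega_\mu$ and $\mu_0$, which is the point made before the corollary: the correction term involves the measure only via its normalization.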
Proof. It follows from the previous lemmas that piz, w) — Qv(z, w) = Thus, if (6.8) holds, then
2M(z) - Qv(z) P(z,
within the case of the disk 8w(z) =
(zw)(l
-wz) g{z) —~ .
1 - \w\2
z
Clearly, gw e H(D) because g(0) = 0. For the half plane, gw(z) = (z-w)(z-w) g(z) Imw and also here gw e H(V) because g(i) = 0. In both cases, the factor z — w also gives gw(w) = 0. The converse is also true and the result follows. • With the previous results on interpolation, it is now easy to prove that the functions of the second kind \jrn associated with the orthonormal rational functions (j)n with respect to the measure /x are also orthogonal rational functions in Cn with respect to some associated measure v. First note that since Q^ e C has a positive real part in O, and hence no zeros, its inverse is also in C. This means that there is a uniquely defined measure v such that ^fi(z)
J
= QV(Z)= I D(t,z)dv(t). J
J
By the normalization ^ ( a o ) = 1, hence £2v(ao) = 1, we have dv(t) = 1. The positive real function \J/* (z)/0* (z) is also invertible in O, and by Theorem 6.1.4, which says
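$$[B_n(z)]^{-1}\left[\Omega_\mu(z) - \frac{\psi_n^*(z)}{\phi_n^*(z)}\right] = g(z) \in H(\mathbb{O}) \quad\text{and}\quad g(\alpha_0) = 0,$$
it follows that
$$[B_n(z)]^{-1}\left[\Omega_\nu(z) - \frac{\phi_n^*(z)}{\psi_n^*(z)}\right] = \tilde{g}(z) \in H(\mathbb{O}) \quad\text{and}\quad \tilde{g}(\alpha_0) = 0,$$
where $\tilde{g}(z) = -g(z)\phi_n^*(z)/[\Omega_\mu(z)\psi_n^*(z)]$. Thus the positive real function $\phi_n^*(z)/\psi_n^*(z)$ interpolates $\Omega_\nu(z)$ in the points $A_n^0 = \{\alpha_0, \ldots, \alpha_n\}$. By Theorem 6.2.1 and the determinant formula, it then follows that in $\mathcal{L}_n$ the inner product with respect to $\nu$ and with respect to the measure $\nu_n$ defined by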
are the same. We can repeat the proof of Theorem 6.1.9 with 0 replaced by i/r and find
Theorem 6.2.5. Let $\Omega_\mu(z) = \int D(t, z)\, d\mathring{\mu}(t)$ and define $\Omega_\nu(z) = 1/\Omega_\mu(z)$ with uniquely defined Riesz-Herglotz-Nevanlinna measure $\nu$. Then the functions of the second kind $\psi_n$ with respect to the measure $\mu$ are orthogonal rational functions for $\mathcal{L}_n$ with respect to the measure $\nu$.

6.3. Interpolation properties for the kernels

In Section 6.1, we discussed rational interpolants for $\Omega_\mu$, the positive real function associated with the measure $\mu$. These rational interpolants were constructed from the orthogonal or para-orthogonal functions and the corresponding functions of the second kind. In this section we shall give rational interpolants for $\Omega = \Omega_\mu$, which are based on the reproducing kernels for $\mathcal{L}_n$. By the results of the previous section, we also find approximants of the measure $\mu$, and hence also of the spectral factor $\sigma$ defined in (1.43). Obviously, if we make use of the spectral factor, then we assume it exists and thus that Szegő's condition $\log\mu' \in L_1(\lambda)$ is satisfied. The following theorem is such an interpolation result for the spectral factor. It is a consequence of Theorems 2.3.8 and 2.1.6. It says that the normalized kernel for $\mathcal{L}_n$ interpolates $1/\sigma$ in the points $A_n^0$.

Theorem 6.3.1. Let $K_n(z, w)$ denote the normalized reproducing kernel for $\mathcal{L}_n$ w.r.t. the measure $\mu$. Suppose that $\log\mu' \in L_1(d\lambda)$ and let $\sigma(z)$ be the outer spectral factor of $\mu$ such that $\sigma(\alpha_0) > 0$. Then
where Bn = ^Bn, n > 0, with Bn the finite Blaschke products. Proof. By Theorem 2.3.8, we know that kn(z, w) is the projection in H2 of the Szego kernel sw (see (2.18)). Theorem 2.1.6 then says that kn (z, w) = sw (z) for z e A®n = {ao, •-.,<*„} (interpolation in Hermite sense). By setting z = w = «o, we obtain kn(a0, a0) = Kn(a0, a0)2 = \a(ao)\~2. Hence, because kn(z, w) = Kn(z, w)Kn(w, w), the result follows.
•
We next prove the following theorem, which can be found in Ref. [61, p. 48] or [62, p. 654] in the case of the disk.
Theorem 6.3.2. Let Kn(z, w) be the normalized reproducing kernel ofCn w.r.t. /x and define the absolutely continuous measure \x on the boundary dO by djln(t) = \Kn(t, cto)\~2di(t), t e 3O. Then in the space Cn we have equality of the inner products (•, - ^ = (•, •)# Proof. We only have to show that for all / e Cn and arbitrary w e B (/, Kn(-, UO)AM = , Kn(-, u>)>A = f(w)/Kn(w,
w),
(6.10)
because the [Kn (•, W()}, with wl:, / = 0 , . . . , n some set of distinct points in O, form a basis for Cn. For the proof we shall need the spaces Cn where we let all the points a^ coincide at a^. For the circle, this corresponds to the polynomials (= rationals with all poles at OLQ = oo), but for the real line, these are rational functions with all poles in OLQ = —i. Thus we consider the spaces (ZUQ = 1 for T and Q = {pn(z)/[mo(z)]n
: pn e
Note that any / = p/7tn e Cn can be written as
f(z) =
p{z)
mo(z)n Ttn{z)
The first factor is in £Q and the second factor is incorporated in the measure fc defined by
Let {pk} be a system of orthonormal functions for CQ with respect to the measure TC\ then
is a reproducing kernel for the space CQ with measure n. Thus for any function q e £Q, it holds that
{q,kn(',w))z
=q(w).
Hence, multiplying by mo(w)n/7Tn(w) and extracting again d\± from dfc, we get T
kn(', w)[mo(')mo(w)]n \ 7tn(-)7Vn(w)
q(w)[uT0(w)]n
which means that
7tn(Z)7Tn(w)
kn(z, w) = kn(z, w)
(6.11)
is the reproducing kernel for Cn w.r.t. /x. However, if the pn are chosen such that p*(ao)mo(ao)n/7Tn(oto) > 0, then Kn(z, a0) = p*(z)[mo(z)]n/nn(z).
(6.12)
To see this, note that by the Christoffel-Darboux relation obtained from Theorem 3.1.3 by setting all at = ofo we have , _ Pn
, ,
so that ^(z^o) = p*(z)p*(oco). Using the relation (6.11) between kn and kn, we get after normalization Kn(z, w) =kn(z, w)/^kn(w, w) the expression (6.12). With these tools, we can now see that (6.10) will be proved if we can show that =p(w) for arbitrary p e CQ and J/x(r) = \Kn(t, ao)\ 2dk(t). The left-hand side can be written explicitly as
Pit)
[
\^(t)^^pl{w)^{w)jAt)pn{w)]^r^
d
m
=h
"/2'
where
Pit)
since |p*(OI2 = \pn(t)\2fort
So(f)dk(t)
e T. Now we can use in the case of the disk that
tdX(t) = C(t,w)dX(t), t-w
where C(t, w) = t/(t — w) is the Cauchy kernel for the disk. Similarly, we find for the case of the half plane that = C(t, w)
-dk(t),
where C(t, w) = [2i(t — w)]~l is the Cauchy kernel for the half plane. Thus we find that in both cases 1\ = p(w) and 72 = 0 by the Cauchy formula and the fact that p* (z) ^ 0 in © while pn (z) ^ 0 in ((X n Note that for the polynomial case on the circle (i.e., when all at =0), then djjin = \(p*(z)\~2dk = \(pn(z)\~2dk; a theorem in this style can be found in Ref. [87, p. 198]. From the previous result and the theorems from Section 6.2 we can now find the interpolation property. Theorem 6.3.3. Let us define
nfl(z) = JD(t,z)dil(t)
and Qn(z) = f D(t, z)djin(t)9
with D(t,z) the Riesz-Herglotz-Nevanlinna kernel and with djln(z)] = \Kn(z, ao)\~2dk(z) as defined in Theorem 6.3.2. Then ttn(z) = —— -, Kn(z,a0)
(6.13)
with Kn(t, z) the normalized kernel and Ln(t, z) the associated function as defined in (3.18). Furthermore (recall / / = dfi/dk = djl/di)
2
Kn(z,cto)Kn*(z, ao)
(6.14)
(which is obviously positive on dO) has a spectral factor given by an{z) = \/Kn{z,a0).
(6.15)
Moreover, the function g defined by (6.16) is analytic in O.
Proof. By Theorem 3.3.3, we get for t e 3O,
while $\Omega_n \in \mathcal{C}$. Thus $\Omega_n$ of (6.13) is the class $\mathcal{C}$ function associated with the measure $\mu_n$ as follows from (1.36). Hence $1/K_n(z, \alpha_0)$ is outer in $H_2$, so that also (6.15) follows and (6.16) is a consequence of the previous theorem. □

The interpolation result of the last theorem states that $\Omega_n$ is a partial multipoint Padé approximant of $\Omega_\mu$ in the points of $A_n^0 = \{\alpha_0, \alpha_1, \ldots, \alpha_n\}$. It is only a partial or multipoint Padé-type interpolant since it is of degree type $(n/n)$ whereas only $n + 1$ interpolation conditions are satisfied. Because $\Omega_\mu - \Omega_n = \zeta_0 B_n g$, with $g$ analytic in $\mathbb{O}$, we find by taking the substar conjugate that $\Omega_{\mu*} - \Omega_{n*} = \zeta_0^{-1} B_{n*} g_*$, with $g_*$ analytic in $\mathbb{O}^e$. Summing up gives
In the case of the disk, this generalizes the notion of Laurent-Padé approximant [29], since in the case where all $\alpha_k = 0$, $\mu_n'$ is the inverse of a Laurent polynomial of degree $n$, which fits the expansion of $\mu'$ from $-n$ till $+n$. In the present case, $\mu_n'$ takes the form
$$\mu_n'(z) = \frac{\pi_n(z)\pi_{n*}(z)}{p_n(z)p_{n*}(z)},$$
where $p_n(z) = K_n(z, \alpha_0)\pi_n(z) \in \mathcal{P}_n$ is a polynomial. Thus $\mu_n'$ is the ratio of two Laurent polynomials of degree $n$ and fits only $2n + 2$ interpolation conditions. It is a partial Laurent-Padé approximant since by fixing the interpolation points $\alpha_k$, one fixes the zeros and hence the numerator of the approximant. We could repeat here the same kind of arguments as we used at the end of Section 6.2 concerning the orthogonality of the functions of the second kind with respect to the associated measure $\nu$ defined through
$$1/\Omega_\mu(z) = \int D(t, z)\, d\mathring{\nu}(t).$$
We would find that the kernels of the second kind $L_n(z, w)$ are normalized reproducing kernels for $\mathcal{L}_n$ with respect to the measure $\nu$. Thus the kernels $l_n(z, w) = L_n(z, w)L_n(w, w)$ are the reproducing kernels, and since the $\psi_n$ are the orthonormal functions, we have
$$l_n(z, w) = \sum_{k=0}^{n} \psi_k(z)\overline{\psi_k(w)}.$$
6.4. The interpolation algorithm of Nevanlinna-Pick
In this section we describe the algorithm of Nevanlinna-Pick for interpolation of class $\mathcal{C}$ functions or, equivalently, class $\mathcal{B}$ functions. The recursions can be described by $J$-unitary matrices. This approach gives an alternative way of computing the recurrence coefficients $\rho_k$ and $\gamma_k$, exactly like the duality of the two algorithms considered in Ref. [29].

Suppose we start with some function $S_0 \in \mathcal{B}$, which is zero at $w \in \mathbb{O}$. Since it depends on the parameter $w$, we write it as a function in $z$ but include the dependence on $w$ explicitly when appropriate. We now transform this $S_0$ into some other $S_1 \in \mathcal{B}$ in three steps. Let $\alpha_1$ in $\mathbb{O}$ be given. Then
$$S_1(z, w) = \tau_{31} \circ \tau_{21} \circ \tau_{11}(S_0(z, w)) = \tau_1(S_0(z, w)),$$
where
$$\tau_{11} : S_0 \mapsto S_1' = \frac{S_0 - \gamma_1}{1 - \overline{\gamma}_1 S_0}, \quad \gamma_1 = \gamma_1(w) = S_0(\alpha_1, w);$$
$$\tau_{21} : S_1' \mapsto S_1'' = S_1'/\zeta_1;$$
$$\tau_{31} : S_1'' \mapsto S_1 = \frac{S_1'' - \rho_1}{1 - \overline{\rho}_1 S_1''}, \quad \rho_1 = \rho_1(w) = S_1''(w, w).$$
Clearly, $S_1$ is again a function in $\mathcal{B}$ and it will be zero at $w$. This follows from Theorem 1.2.3. What was done in the previous transformations is the following. First, $S_0$ is transformed into $S_1'$ to make it zero in $\alpha_1$. This zero can be taken out by dividing by $\zeta_1$. The last step will normalize $S_1$ by making it zero at $w$ just as $S_0$ was. We are now in a position like the one we started with and we can repeat the same procedure with some point $\alpha_2$ from $\mathbb{O}$ to produce $S_2$. Note that if $S_0$ is not a rational function, this procedure can go on indefinitely as long as the points $\alpha_k$ are chosen in $\mathbb{O}$ since both $\gamma_k$ and $\rho_k$ will be in $\mathbb{D}$ as evaluations of functions in $\mathcal{B}$. Note the following relation between $\gamma_k$ and $\rho_k$. Since $S_{k-1}(w, w) = 0$, it follows that
$$S_k'(w, w) = \frac{S_{k-1}(w, w) - \gamma_k}{1 - \overline{\gamma}_k S_{k-1}(w, w)} = -\gamma_k,$$
whereas
$$S_k'(w, w) = \zeta_k(w) S_k''(w, w) = \zeta_k(w)\rho_k,$$
so that for all $k > 0$: $\gamma_k = -\zeta_k(w)\rho_k$. Note that if $S_0$ were a rational function, then each step will decrease the degree of the function by 1, so that one will end up with a constant unimodular function and then the algorithm will break
down. Everything we said however still holds up to the step where the algorithm breaks down.

We can also invert the previous procedure. Suppose the coefficients $\rho_k$ and $\gamma_k$ for $k = 1, \ldots, n$ are produced by the previous algorithm starting from some $S_0$. Now choose some $\Gamma_0 \in \mathcal{B}$ such that $\Gamma_0(w) = 0$. Then generate the sequence $\Gamma_{k+1} = \tau_{n-k}^{-1}(\Gamma_k)$ for $k = 0, \ldots, n - 1$. It turns out that $\Gamma_n$ will interpolate $S_0$ in the points $A_n^w = \{w, \alpha_1, \ldots, \alpha_n\}$. Indeed, if we denote $n - k$ as $j$, then it holds in general that $\Gamma_j$ will interpolate $S_k$ in the points $\{w, \alpha_n, \ldots, \alpha_{k+1}\}$. We show this for the case where all $\alpha_k$ are different. If some of them are confluent, the proof becomes messy by technicalities. For that case, we refer to the homogeneous formulation to be given later in this section. If the interpolation points are all different, then the result follows easily by induction. For $j = 0$, we have interpolation at $w$ since both $\Gamma_0$ and $S_n$ are zero in that point. The induction step can be proved by noting that $\Gamma_{j+1} = \tau_k^{-1}(\Gamma_j)$ and $S_{k-1} = \tau_k^{-1}(S_k)$ so that the interpolation from the previous step is inherited. There is one extra interpolation condition satisfied, namely, in the point $\alpha_k$ since $\Gamma_{j+1}(\alpha_k) = \gamma_k = S_{k-1}(\alpha_k, w)$. We have now (almost) proved the following theorem.

Theorem 6.4.1. Let $S_0(z, w) \in \mathcal{B}$ be a Schur function that is zero at $z = w$ and that is not a rational function. Construct iteratively for a sequence of points $\{\alpha_k : k > 0\} \subset \mathbb{O}$ the functions $S_k$ by $S_k(z, w) = \tau_k(S_{k-1}(z, w))$, where
$$\tau_{1k} : S_{k-1} \mapsto S_k' = \frac{S_{k-1} - \gamma_k}{1 - \overline{\gamma}_k S_{k-1}}, \quad \gamma_k = \gamma_k(w) = S_{k-1}(\alpha_k, w), \qquad (6.17)$$
$$\tau_{2k} : S_k' \mapsto S_k'' = S_k'/\zeta_k, \qquad (6.18)$$
$$\tau_{3k} : S_k'' \mapsto S_k = \frac{S_k'' - \rho_k}{1 - \overline{\rho}_k S_k''}, \quad \rho_k = \rho_k(w) = S_k''(w, w). \qquad (6.19)$$
Then all the $S_k$ are in $\mathcal{B}$ and $S_k(w, w) = 0$. All the $\gamma_k(w)$ and $\rho_k(w)$ are in $\mathbb{D}$. Conversely, by choosing an arbitrary $\Gamma_0(z, w) \in \mathcal{B}$ that vanishes for $z = w$, we can construct, using the previous $\gamma_k$ and $\rho_k$, the functions $\Gamma_k$ by $\Gamma_{k+1} = \tau_{n-k}^{-1}(\Gamma_k)$. All these $\Gamma_k$ are in $\mathcal{B}$ and $\Gamma_k$ will interpolate $S_{n-k}$ in the points $\{w, \alpha_n, \ldots, \alpha_{n-k+1}\}$. Specifically,
$$\frac{S_0(z, w) - \Gamma_n(z, w)}{B_n(z)} = h(z), \quad h \in H(\mathbb{O}) \quad\text{and}\quad h(w) = 0.$$
If we start from a rational function $S_0(z, w) \in \mathcal{B}$, then it will happen that at some stage we get $\gamma_n \in \mathbb{T}$. The algorithm then stops because obviously
Sn-\(z, w) = yn(w) for all Z G D . Thus S'n = 0 and all the remaining ys and ps are zero. In this case, So was a Blaschke product of degree n. The previous theorem is an elaborated version of the Schur lemma (see, for example, Refs. [2], [200], and others), which says Theorem 6.4.2 (Schur lemma). We have S(z) y = S(a) e D
and
eBiff
S'(z) = ^ - " ^
e B
for some a If y e T, then by the maximum modulus principle, S(z) = y for all z e B. Since we are here mainly interested in the case where all the orthogonal functions (pn exist for n = 0, 1, 2 , . . . , we shall concentrate on the case where So(z, w) e B is not rational, so that the algorithm never breaks down. We shall now give an equivalent homogeneous formulation of the previous algorithm. Suppose that a Schur function S e B is described as the ratio of two functions S = A1/A2 that are both holomorphic in O and where A2 is zero-free in O. If S(w) = 0, then of course A\(w) = 0 too. We place these two functions in a vector A = [Ai A 2 ], which can be considered as a set of homogeneous coordinates for S. Following Dewilde-Dym [60], we shall call the set of such A-matrices admissible and denote it as ^ = { A = [Ai
A2] : Ai, A2 G # ( O ) , A2(z) / 0 , z e O, Ai/A 2 e B}. (6.20)
We replace A by A if B is replaced by B in the previous definition. Note that A1/A2 G B can also be written as AJ AH < 1, where J = 1 0 — 1 is our usual signature matrix. We can now describe the Nevanlinna-Pick algorithm of the previous theorem in terms of J-unitary matrix multiplications applied to admissible matrix functions. Let An = [An\ An2] be the admissible matrix containing the homogeneous coordinates for Sn. Then the inverse transform Sn-\ = x~l{Sn) can be written as A n _! = A n 0 n ,
with the matrix 9n given by On(z, w) = cn
fnfe)
o"
Pn
0
1
and
dn = (1 — \yn\2)
Pn
dn
"l Yn
Yn
1
with cn = (i -
\pn\2yl/2
1/2
,
i n _i 5 2(« n , w;),
Pn =Pn(w)=
if w / an,
\Yn{w < ^ _
^n-i 2(2, u;))|
if w = an
(dz means derivative w.r.t. z). Let us define the J-unitary matrix 0 n as 0 n = 6n • - -O\ for ft > 1. Because this matrix is formally the same as the 0 n matrix of Section 3.3, there must exist some Kn and Ln, both functions in Cn, parametrized in w, such that a relation like (3.19) holds. The interpolation property given in Theorem 6.4.1 implies that if we choose An = [0 1] e A, then A n 0 w = [Kn—Ln Kn+Ln] G w4 has the property that (Kn—Ln)/(Kn+Ln) interpolates So in the points of the set A™ = {w, a\,..., an] if they are all different. We can now easily give the proof for confluent points too. If we define 0* as Bn@n* we get the form \ 2
\Kn-Ln
Because we know that for a J-unitary matrix Gn l = JSn*J = Bn*J®*nJ, we get
Furthermore, since So = AOi/A02 € B and vanishes at z = w, the positive real function Q(z, w) = (1 - S0(z, w))/{\ + S0(z, w)) e C will be 1 for z = w : Q(w, w) = 1. We can write Ao = [1 — £2(z, w) 1 + Q(z, w)]. If we multiply this from the right with / 0 * / , then we get 1 [1 - ft 1 + 2
Ln Ln
= Bn[Anl
Thus An2\.
An2\.
(6.21)
The first of these relations shows that
$$L_n(z, w) - K_n(z, w)\Omega(z, w) = B_n(z)A_{n1}(z, w), \quad\text{with}\quad A_{n1} \in H(\mathbb{O}) \quad\text{and}\quad A_{n1}(w, w) = 0.$$
Because $K_n$ does not vanish in $\mathbb{O}$ and $L_n/K_n \in \mathcal{C}$ by a property of $J$-unitary matrices, we can also say that the positive real function $\Omega$ is approximated by the positive real function $\Omega_n = L_n/K_n$ such that
$$\frac{\Omega - \Omega_n}{B_n} = h \in H(\mathbb{O}) \quad\text{with}\quad h(w) = 0.$$
Taking a Cayley transform results in the interpolation property of the Schur function $S_0$ by the Schur function $\Gamma_n = (K_n - L_n)/(K_n + L_n) \in \mathcal{B}$:
$$\frac{S_0 - \Gamma_n}{B_n} = g \in H(\mathbb{O}) \quad\text{with}\quad g(w) = 0.$$
Suppose $\mu$ is some positive measure on $\partial\mathbb{O}$ and that we associate with it the positive real function $\Omega_\mu(z, w)$ as in Lemmas 6.2.2 and 6.2.3. Suppose we start the Nevanlinna-Pick algorithm as described above with $\Omega$ equal to this $\Omega_\mu(z, w)$. Since we just showed that $\Omega_n$ will then interpolate this starting $\Omega_\mu$ at the point set $A_n^w$, it follows from Corollary 6.2.4 that the inner product on $\mathcal{L}_n$ is the same for the measure $\mathring{\mu}(t)$ and for the measure $\mathring{\mu}_n(t, w)$ defined by
dfin(t, w) =
— - 2 — dX(t) = -———— dk(t). \Kn(t,w)\2 \Kn(t,w)\2
Thus, because $\mathring{\mu}(t)$ does not depend on $w$, it means that as far as the inner product in $\mathcal{L}_n$ is concerned, $\mathring{\mu}_n(t, w)$ does not depend on $w$. Therefore, we may replace $w$, for example, by $\alpha_0$ and we thus have
f f / f(t)g*(t)dfr(t) = / f(t)g*(t)diXn(t, w) P(t, w)
This has the important consequence that $K_n(z, w)$ as generated by the Nevanlinna-Pick algorithm applied to the positive real $\Omega_\mu(z, w)$ is the normalized reproducing kernel for $\mathcal{L}_n$ w.r.t. the measure $\mu$. To see this, we have to show that, up to normalization, $K_n$ reproduces every $f \in \mathcal{L}_n$; and it does indeed
since
=
J Kn(t,z)P{
because $f/K_n$ is analytic in $\mathbb{O}$ so that the Poisson formula holds. Finally, we note that the real part of $\Omega_\mu(z, w)$ has a radial limit satisfying a.e.
$$\operatorname{Re}\Omega_\mu(t, w) = \mu'(t)/P(t, w), \quad t \in \partial\mathbb{O},\ w \in \mathbb{O}.$$
For further reference, we state the next theorem, which follows from our previous remarks.

Theorem 6.4.3. Let $d\mathring{\mu}$ be a normalized measure ($\int d\mathring{\mu} = 1$) and define for $w \in \mathbb{O}$
$$\Omega_\mu(z, w) = c + \int \frac{D(t, z)}{|\varpi_0(t)|^2 P(t, w)}\, d\mathring{\mu}(t), \qquad (6.22)$$
where $c$ is chosen such that $\Omega_\mu(w, w) > 0$. ($D$ is the Riesz-Herglotz-Nevanlinna kernel as always and $P$ is the Poisson kernel.) Furthermore, let $K_n(z, w)$ be the kernels produced by the Nevanlinna-Pick algorithm starting from $A_0 = [1 - \Omega_\mu(z, w) \quad 1 + \Omega_\mu(z, w)]$. Define the absolutely continuous measure
\mo(t)\2P(t,w)
o
P(t,w)
Then the inner product $\langle\cdot,\cdot\rangle_{\mathring{\mu}_n}$ will not depend on $w$ for functions in $\mathcal{L}_n$ and it is the same as the inner product $\langle\cdot,\cdot\rangle_{\mathring{\mu}}$ on $\mathcal{L}_n$. Moreover, $K_n(z, w)$ is the normalized reproducing kernel for $\mathcal{L}_n$ with respect to the measure $\mu_n$ and hence with respect to the measure $\mu$.
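The three-step transformation of Theorem 6.4.1 can be illustrated numerically for the disk. The sketch below is an illustration under assumptions that are not part of the text: the starting function `S0`, the points `w` and `alpha`, and the use of the plain Blaschke factor $(z - \alpha)/(1 - \overline{\alpha}z)$ in the role of $\zeta_1$ (any unimodular normalizing constant is omitted) are chosen only for the example.

```python
import numpy as np

w = 0.2 - 0.1j        # point where S0 vanishes (assumed)
alpha = -0.3 + 0.4j   # interpolation point alpha_1 (assumed)


def blaschke(z, a):
    """Blaschke factor with zero a; a unimodular constant is omitted here."""
    return (z - a) / (1.0 - np.conj(a) * z)


def S0(z):
    """Assumed non-rational Schur function with S0(w) = 0."""
    return blaschke(z, w) * np.exp(z - 1.0)


def mobius(s, c):
    """Moebius map s -> (s - c) / (1 - conj(c) s), used in steps 1 and 3."""
    return (s - c) / (1.0 - np.conj(c) * s)


gamma = S0(alpha)                       # gamma_1 = S0(alpha_1, w)


def S_pp(z):
    """Steps 1 and 2: create a zero at alpha, then divide that zero out."""
    return mobius(S0(z), gamma) / blaschke(z, alpha)


rho = S_pp(w)                           # rho_1 = S''_1(w, w)


def S1(z):
    """Step 3: renormalize so that S1(w) = 0."""
    return mobius(S_pp(z), rho)


# Random sample points in the disk of radius 0.95.
rng = np.random.default_rng(1)
radii = 0.95 * np.sqrt(rng.uniform(size=200))
samples = radii * np.exp(2j * np.pi * rng.uniform(size=200))

print(abs(S1(w)))                               # S1 vanishes at w
print(abs(gamma + blaschke(w, alpha) * rho))    # gamma_1 = -zeta_1(w) rho_1
print(np.max(np.abs(S1(samples))))              # modulus of S1 on the sample
```

Repeating the same three steps with further points $\alpha_2, \alpha_3, \ldots$ would produce the coefficients $\gamma_k$ and $\rho_k$ of the recursion; the sketch stops after one step to keep the role of each transformation visible.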
6.5. Interpolation algorithm for the orthonormal functions

In this section we give an algorithm in the style of the Nevanlinna-Pick algorithm, which, based on the idea of successive interpolation, will generate the recursion for the orthonormal functions $\phi_n$ and the functions of the second kind $\psi_n$. As a matter of fact, it is difficult to do this for these functions because of the rotating factors $\eta_n^1$ and $\eta_n^2$ in the recurrence relation. These rotations depend on the angle that $\phi_n(\alpha_{n-1})$ forms with the real axis and this is difficult to find
without evaluating 4>n (an-\). However, the rotated functions „ and *I>n satisfied a recurrence that got rid of these ijs and it will be possible tofindan interpolation algorithm for these rotated functions. That is what we shall currently do. Recall that On = en(/)n and Wn = en^j/n, with en e T as defined by (4.14). We shall use once more the following notation: £_! = 1,
Bn(Z) = Bn-xizKniz) = fofe)^n(z),
* = 0, 1, . . . .
We now define Rn\ and Rn2 by Bn-\Rn\{z)
*»(z)
(6.23)
BnRn2(z)
where £2 = £2M = jD(t, -)d£i(t) is the positive real function associated with the measure /JL for which the orthogonality holds. Note that the functions in the left-hand side are in fact rotated versions of the functions g and h as defined in (6.1) and (6.2) respectively. These are indeed the remainders in the linearized interpolation properties of the rotated functions. We shall call the functions Rn\ and Rn2 the remainder functions. Since J5_i = 1 it follows that RQ\ = Q + 1 and R02 = £2 — I. Note also that for n > 0, both Rn\ and Rn2 are zero in a0. The right-hand side in the defining relation (6.23) of the remainder functions satisfies the recurrence for the rotated functions as in Theorem 4.1.3. Hence, the left-hand side shall satisfy Zn-i(z)
= en
BnRn2(z)
\An
1
0
0 1 (6.24)
This can be rewritten as given in the next theorem. Theorem 6.5.1. The remainderfunctions as defined above satisfy the following recursion: = en (6.25) with en > 0 and A/I =
~^»-i zl™ ^ 2
TT' urn(an)
r
ln-\=znZn-\, 1
and (6.26)
The An in the previous expression are the same as the An of Theorem 4.1.3. We can make the recursion even simpler and avoid the explicit use of the r)n-i by introducing rn\(z) =-znRn\(z)
(6.27)
and rn2(z) =
With this notation, the recursion (6.25) becomes 1
, , \rnl(z)
0
1
0
l/Uz)
Ln 1
Ln
r
n-l,\(z)
rn-\,i(z) (6.28)
with - A
T
Ln = -znAn
r
r
1/2
n-l,2(z)
en =
= - hm
- \Ln\2 (6.29)
Proof. We shall only prove (6.26), because (6.29) is a direct consequence. We can start from the relation (6.24) and use fn_i = ^ n _iZ n _i to get ,i-i (z)
1
= en
An
(6.30) which now easily gives (6.25). To find the expression for An, one can use the last line of (6.30) for z = an, which gives 0 = Anrjn-iRn-iti(an)
+ Rn-i,2(an),
from which the expression for An follows. The expression for en was shown in Theorem 4.1.2. • The previous theorem has the following consequence. Corollary 6.5.2. Define the function rn(z) in terms of the remainder functions by rni(z) Rni(z)
,
W=0, 1,
especially
ro(z) =
1
l-Q(z)
(6.31)
Then for all k > 0: Fk e B and they satisfy the recurrence
with Ln =
—rn-i(an).
Proof. The proof follows immediately from the previous theorem. All the $\Gamma_k$ are in $\mathcal{B}$ because $\Gamma_0$ is, whereas the Möbius transforms are done with $L_k \in \mathbb{D}$. Moreover, the division by $\zeta_n$ respects the analyticity because the function between brackets was made zero in $z = \alpha_n$ by the choice of $L_n$. □
Density of the rational functions
As a step toward some convergence results to be considered in Chapter 9, we consider here some asymptotic phenomena. For instance, what happens with the spaces $\mathcal{L}_n$ and $\mathcal{R}_n = \mathcal{L}_n \cdot \mathcal{L}_{n*}$ when $n$ tends to infinity? In general, are these spaces dense in $H_p$ or $L_p$ as they are in the polynomial case? This is important if we want to know how well functions in $H_p$ or $L_p$ can be approximated within the space of rationals that we considered. These results will of course depend on the spaces, that is, on the placement of the poles $\alpha_k$. More precisely, they will depend on the convergence or divergence of the Blaschke products for these numbers.

7.1. Density in $L_p$ and $H_p$

The functions $t^k$, $k \in \mathbb{Z}$, are known to be complete in $L_p$. Similarly, it is possible to prove that the basis functions of finite Blaschke products and their inverses are complete in the space $L_p$ if and only if $\sum(1 - |\alpha_k|) = \infty$. In analogy with the powers of $z$ we can define the finite Blaschke products $B_n$ for $n = 0, 1, \ldots$ as before and we set by definition $B_{-n} = B_{n*} = 1/B_n$ for $n = 1, 2, \ldots$. Hence, the $\{B_k\}_{|k| \le n}$ span $\mathcal{R}_n$. We set
$$\mathcal{L}_\infty = \bigcup_{n=0}^{\infty} \mathcal{L}_n \quad\text{and}\quad \mathcal{R}_\infty = \bigcup_{n=0}^{\infty} \mathcal{R}_n,$$
just as $\mathcal{P}_\infty$ is the set of all polynomials. The notation $\mathcal{P}$, $\mathcal{L}$, and $\mathcal{R}$ will be used for the closures of $\mathcal{P}_\infty$, $\mathcal{L}_\infty$, and $\mathcal{R}_\infty$ respectively in some topological space (for example $L_p$). The completeness of the functions $\{B_n\}_{n\in\mathbb{Z}}$ in $L_p$ is by definition the same as the density of $\mathcal{R}_\infty$ in $L_p$. If $\mathcal{R}$ denotes the closure in $L_p$ of $\mathcal{R}_\infty$, then if $\mathcal{R}_\infty$ is dense in $L_p$, $\mathcal{R}$ should coincide with $L_p$.
In order to prove this density property for the disk, we start with a lemma that can be found in Akhiezer [1, p. 243].

Lemma 7.1.1. Let $z_1, z_2, \ldots, z_n$ be some fixed points in $\mathbb{C}$. We have the following optimization result in $L_p(\mathbb{T})$, $0 < p \le \infty$:
$$\inf\left\{\|f\|_p : f(z) = \frac{q(z)}{(z - z_1)\cdots(z - z_n)},\ q \text{ monic of degree } N,\ N \ge n\right\} = \prod_{k=1}^{n} \frac{1}{|z_k|_+}, \qquad (7.1)$$
where $q$ ranges over the set of all monic polynomials of degree $N$ and $|z|_+ = \max\{|z|, 1\}$. The unique solution is obtained for $q = Q$ with $Q$ as in (7.2) below.

Proof. Suppose without loss of generality that the points are ordered such that $z_1, \ldots, z_d$ are all in $\mathbb{E}$ while $z_{d+1}, \ldots, z_n$ are all in $\overline{\mathbb{D}}$. It is clear that the right-hand side value can be reached. It is obtained for $q = Q$ with
$$Q(z) = z^{N-n}\,\frac{(\overline{z}_1 z - 1)\cdots(\overline{z}_d z - 1)(z - z_{d+1})\cdots(z - z_n)}{\overline{z}_1\cdots\overline{z}_d}. \qquad (7.2)$$
To prove that this is the unique solution, we have to show that for any other monic polynomial $P \in \mathcal{P}_N^M$
$$\left\|\frac{P(z)}{\pi(z)}\right\|_p > \left\|\frac{Q(z)}{\pi(z)}\right\|_p = \prod_{k=1}^n \frac{1}{|z_k|_+}, \tag{7.3}$$
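where $\pi(z) = (z - z_1)\cdots(z - z_n)$. Since $\|f\|_\infty \ge \|f\|_p$, it is sufficient to prove that (7.3) is true for $0 < p < \infty$. Suppose that $q_1, \ldots, q_j$ are all the zeros of $P$ in the open unit disk. Thus $P(z) = (z - q_1)\cdots(z - q_j)R(z)$ with $|R(0)| \ge 1$ because $R(z) = \prod_k (z - d_k)$ with all $|d_k| \ge 1$. The function
$$g(z) = \frac{(1 - \bar{q}_1 z)\cdots(1 - \bar{q}_j z)\,R(z)}{(z - z_1)\cdots(z - z_d)(1 - \bar{z}_{d+1}z)\cdots(1 - \bar{z}_n z)} \tag{7.4}$$
has the same modulus as $P/\pi$ on $\mathbb{T}$ and is analytic and nonzero in $\mathbb{D}$. It is even analytic in $\overline{\mathbb{D}}$ since we can assume that the $z_k$ that are on the boundary $\mathbb{T}$ are canceled by zeros of $P$, otherwise the inequality (7.3) would be trivial. Therefore,
$$\int \left|\frac{P}{\pi}\right|^p d\lambda = \int |g|^p\, d\lambda \ge |g(0)|^p = \frac{|R(0)|^p}{|z_1\cdots z_d|^p} \ge \frac{1}{|z_1\cdots z_d|^p},$$
with strict inequality if $P \ne Q$. □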
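As a numerical sanity check of Lemma 7.1.1, the following Python sketch approximates the $L_p(\mathbb{T})$ norms by an equispaced rule on the unit circle. The sample points, the degree $N$, the competing polynomial `P`, and all function names are assumptions chosen purely for illustration; none of them comes from the text.

```python
import cmath
import math


def lp_norm_on_circle(f, p, samples=4096):
    """Approximate (integral |f|^p dlambda)^(1/p), lambda = normalized
    Lebesgue measure on the unit circle, with an equispaced rule."""
    total = 0.0
    for j in range(samples):
        t = cmath.exp(2j * math.pi * j / samples)
        total += abs(f(t)) ** p
    return (total / samples) ** (1.0 / p)


def monic_from_roots(roots):
    """Return the monic polynomial z -> prod_k (z - r_k) as a callable."""
    def poly(z):
        value = complex(1.0)
        for r in roots:
            value *= z - r
        return value
    return poly


# Assumed sample data: two points outside the closed disk, one inside.
points = [complex(2.0), complex(0.0, -1.5), complex(0.3, 0.2)]
N = 4  # degree of the monic numerator, N >= n = 3

pi_n = monic_from_roots(points)

# Extremal numerator as in (7.2): each exterior point z_k is replaced by its
# reflection 1/conj(z_k), and N - n zeros are placed at the origin.
reflected = [1 / z.conjugate() if abs(z) > 1 else z for z in points]
Q = monic_from_roots(reflected + [complex(0.0)] * (N - len(points)))

# An arbitrary (assumed) competing monic polynomial of degree N.
P = monic_from_roots([0.5, -0.5, complex(0.0, 0.1), 0.0])

# Right-hand side of (7.1): product of 1 / max(|z_k|, 1).
bound = 1.0
for z in points:
    bound /= max(abs(z), 1.0)

norm_Q_p1 = lp_norm_on_circle(lambda t: Q(t) / pi_n(t), 1)
norm_Q_p2 = lp_norm_on_circle(lambda t: Q(t) / pi_n(t), 2)
norm_P_p1 = lp_norm_on_circle(lambda t: P(t) / pi_n(t), 1)
```

For this data the quotient $Q/\pi$ has constant modulus on $\mathbb{T}$, so its norm agrees with the bound for every $p$, while the competitor gives a strictly larger value.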
With the previous lemma, we can now prove the following completeness theorem. It says that $\mathcal{R}_\infty$ is dense in the spaces $L_p(\mathbb{T})$ for $1 \le p < \infty$ and in the space of continuous $2\pi$-periodic functions if and only if the Blaschke product diverges. It is a slight generalization of a result in Akhiezer [1, p. 244] where the poles were supposed to be simple. However, Akhiezer's result is more general since poles need not appear in reflection pairs $(\alpha_k, 1/\bar\alpha_k)$ as we suppose here.

Theorem 7.1.2. For given $\alpha_1, \alpha_2, \ldots$ all in $\mathbb{D}$, let the finite Blaschke products $B_n$ be defined as before for $n \ge 0$ and $B_{-n} = 1/B_n$ for $n = 1, 2, \ldots$. Then the system $\{B_n\}_{n\in\mathbb{Z}}$ is complete in any $L_p(\mathbb{T})$ space ($1 \le p < \infty$) as well as in the class $C(\mathbb{T})$ of continuous $2\pi$-periodic functions (with respect to the Chebyshev norm) if and only if $\sum(1 - |\alpha_k|) = \infty$.

Proof. First note that the case where infinitely many $\alpha_k$ are equal to zero can immediately be discarded because in that case the system contains the trigonometric polynomials and these are known to be complete while the sum certainly diverges. So we suppose that only finitely many (say $q$) of the $\alpha_k$ are zero. Without loss of generality we suppose $\alpha_1 = \cdots = \alpha_q = 0$. In this case the system contains the trigonometric polynomials of degree at most $q$. Suppose we set
$$\mathcal{R}_n = \operatorname{span}\{B_k\}_{k=-n}^{n} = \mathcal{L}_n \cdot \mathcal{L}_{n*} = \left\{\frac{P_{2n}(z)}{D(z)} : P_{2n} \in \mathcal{P}_{2n},\ D(z) = z^q \prod_{k=q+1}^{n} (z - 1/\bar\alpha_k)(z - \alpha_k)\right\}.$$
In view of the inclusions $C(\mathbb{T}) \subset L_\infty(\mathbb{T}) \subset \cdots \subset L_1(\mathbb{T})$, it is sufficient to prove that the divergence of the sum implies completeness in $C(\mathbb{T})$ and conversely that completeness in $L_1(\mathbb{T})$ implies divergence of the sum.

Completeness in $L_1(\mathbb{T})$ $\Rightarrow$ divergence of the sum. By the previous lemma we have for $p = 1$ that
$$\inf_{f \in \mathcal{R}_n} \left\|z^{q+1} - f\right\|_1 = \inf_{Q \in \mathcal{P}_{2n+1}^M} \left\|\frac{Q(z)}{D(z)}\right\|_1 = \prod_{k=q+1}^{n} |\alpha_k|.$$
Since the system is supposed to be complete in $L_1(\mathbb{T})$, the previous expression should go to zero as $n \to \infty$. Whence $\sum(1 - |\alpha_k|) = \infty$.

Divergence of the sum $\Rightarrow$ completeness in $C(\mathbb{T})$. We already know that all the powers $z^k$, $k = -q, \ldots, q$ are in the system, so we should prove that all $z^k$ for $|k| > q$ can be approximated arbitrarily close in $C(\mathbb{T})$ (the norm is the
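Chebyshev norm) by elements from $\mathcal{R}_n$ for $n$ sufficiently large. This follows from the following observations:
$$\inf_{a_i,\ f \in \mathcal{R}_n} \left\|z^{m+q} + a_1 z^{m+q-1} + \cdots + a_{m-1} z^{q+1} + f(z)\right\|_\infty = \inf_{Q \in \mathcal{P}_{2n+m}^M} \left\|\frac{Q(z)}{D(z)}\right\|_\infty = \prod_{k=q+1}^{n} |\alpha_k|$$
for $m = 1, 2, \ldots$ and
$$\inf_{a_i,\ f \in \mathcal{R}_n} \left\|z^{-m-q} + a_1 z^{-m-q+1} + \cdots + a_{m-1} z^{-q-1} + f(z)\right\|_\infty = \inf_{a_i,\ f \in \mathcal{R}_n} \left\|z^{m+q} + \bar{a}_1 z^{m+q-1} + \cdots + \bar{a}_{m-1} z^{q+1} + f_*(z)\right\|_\infty = \prod_{k=q+1}^{n} |\alpha_k|$$
for $m = 1, 2, \ldots$. Since the sum diverges to infinity, we must have that the right-hand side $\prod_{k=q+1}^{n} |\alpha_k| \to 0$. By an induction argument, it then follows that $z^{\pm(q+m)}$ for $m = 1, 2, \ldots$ can be approximated arbitrarily closely in $C(\mathbb{T})$ by the system $\{B_n\}$. This means that the system $\{B_n\}$ is complete in $C(\mathbb{T})$. □

The same proof, with some simplifications, can be used to prove that $\mathcal{L}_\infty$ is dense in $H_p(\mathbb{D})$.

Corollary 7.1.3. The system $\{B_n : n = 0, 1, \ldots\}$ of Blaschke products, with zeros $\alpha_1, \alpha_2, \ldots$ all in $\mathbb{D}$, is complete in the spaces $H_p(\mathbb{D})$, $1 \le p < \infty$, if and only if $\sum(1 - |\alpha_k|) = \infty$.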
Along the same lines, one can adapt the completeness condition given by Akhiezer [1, pp. 246-249] for the real line. We leave it to the reader to check the details. The result is

Theorem 7.1.4. For given $\alpha_1, \alpha_2, \ldots$ all in $\mathbb{U}$, let the finite Blaschke products $B_n$ be defined as before for $n \ge 0$ and $B_{-n} = 1/B_n$ for $n = 1, 2, \ldots$. Then the system $\{B_n : n \in \mathbb{Z}\}$ is complete in any $L_p(\mathbb{R})$ space ($1 \le p < \infty$) as well as in the class $C(\overline{\mathbb{R}})$ of continuous functions in $\overline{\mathbb{R}}$ (with respect to the Chebyshev norm) if and only if $\sum_{k=1}^\infty \operatorname{Im}\alpha_k/(1 + |\alpha_k|^2) = \infty$. Note that continuous in $\overline{\mathbb{R}}$ means continuous in $\mathbb{R}$ and that the limits $\lim_{x\to\infty} f(x)$ and $\lim_{x\to-\infty} f(x)$ both exist and are equal to each other.
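Thus in the previous results we had a density result iff the Blaschke product diverges, that is, iff
$$\sum_{k=1}^\infty (1 - |\alpha_k|) = \infty \ \text{ for } \mathbb{D} \quad\text{and}\quad \sum_{k=1}^\infty \frac{\operatorname{Im}\alpha_k}{1 + |\alpha_k|^2} = \infty \ \text{ for } \mathbb{U}. \tag{7.5}$$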
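The disk condition in (7.5) can be illustrated numerically. The sketch below compares two hypothetical sequences of moduli $|\alpha_k|$ (both are assumptions made for illustration): one with a divergent sum $\sum(1 - |\alpha_k|)$, for which the partial products $\prod|\alpha_k|$ used in the proof of Theorem 7.1.2 tend to zero, and one with a convergent sum, for which they stay bounded away from zero.

```python
import math


def blaschke_diagnostics(moduli):
    """Return (sum of 1 - |alpha_k|, product of |alpha_k|) for given moduli."""
    return sum(1.0 - r for r in moduli), math.prod(moduli)


n = 2000
# Assumed example 1: 1 - |alpha_k| = 1/(k+1), a divergent (harmonic) sum.
slow = [1.0 - 1.0 / (k + 1) for k in range(1, n + 1)]
# Assumed example 2: 1 - |alpha_k| = 1/(k+1)^2, a convergent sum.
fast = [1.0 - 1.0 / (k + 1) ** 2 for k in range(1, n + 1)]

sum_slow, prod_slow = blaschke_diagnostics(slow)
sum_fast, prod_fast = blaschke_diagnostics(fast)

# Telescoping gives closed forms for comparison:
#   prod_slow = 1/(n+1)           (tends to 0 as n grows)
#   prod_fast = (n+2)/(2(n+1))    (tends to 1/2 as n grows)
print(sum_slow, prod_slow)
print(sum_fast, prod_fast)
```

Only the first sequence satisfies (7.5); the second one corresponds to a convergent Blaschke product.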
We now have a look at the case $p = 2$ and consider the density of $\mathcal{L}_\infty$ in $H_2(\mathbb{O})$. It can already be expected from Theorem 2.1.1, which characterized $\mathcal{L}_n$ as $H_2 \ominus \zeta_0 B_n H_2$, that if the Blaschke product $B_n$ diverges to 0 in $\mathbb{O}$, then $\mathcal{L}_\infty$ will be dense in $H_2$. In fact the previous characterization links interpolation with least squares approximation as we can find in the next theorem. It is in fact a restatement of Theorem 2.1.1. For the disk case, this theorem can be found in the book of Walsh [200, p. 224].

Theorem 7.1.5. Let $f \in H_2$ be given. Then the following least squares approximation problem in $H_2$ norm,
$$\inf\{\|f - f_n\|_2 : f_n \in \mathcal{L}_n\},$$
has a unique solution, which is the function $f_n \in \mathcal{L}_n$ that interpolates $f$ in the point set $A_n^0 = \{\alpha_0, \alpha_1, \ldots, \alpha_n\}$.

Proof. Obviously, $f_n$ is the projection in $H_2$ of $f$ onto $\mathcal{L}_n$. Therefore, the residual $f - f_n$ must be orthogonal to $\mathcal{L}_n = H_2 \ominus M_n$ with $M_n = \zeta_0 B_n H_2$ and thus it is an element from $M_n$. This means that $f(z) = f_n(z)$ for all the points $z \in A_n^0$ in Hermite sense. This identifies the interpolant as the least squares approximant. □

In fact Walsh observes that we may replace in the previous theorem $f \in H_2(\mathbb{D})$ by $f \in L_2(\mathbb{T})$. The best least squares approximant is the interpolant for its Cauchy integral
t-z in the point set A°. See Walsh [200, p. 225]. Completely analogous is the following result, which uses Theorem 2.1.6. Theorem 7.1.6. Let f e H2 be given and recall (Theorem 2.1.6) £ w (a 0 ) = {/„ e Cn : fn(&o) = 0} = H2 0 BnH2. The unique solution of the least
squares approximation problem
inf{||/-/n||2:/»e£n(a0)} Jn
in H2 is the function fn e Cn (&o) that interpolates f in the point set An =
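The link between least squares approximation and interpolation stated in Theorem 7.1.5 can be checked numerically in the simplest setting, namely $H_2$ of the disk with normalized Lebesgue measure. The sketch below is only an illustration under stated assumptions: the test function $f = \exp$, the points $\alpha_1, \alpha_2$, the use of the Takenaka-Malmquist functions as an orthonormal basis of $\mathcal{L}_n$, and all identifiers are choices made here, not taken from the text.

```python
import cmath
import math


def malmquist_basis(alphas):
    """Takenaka-Malmquist functions for the points 0, alpha_1, ..., alpha_n.

    psi_k(z) = sqrt(1 - |a_k|^2) / (1 - conj(a_k) z) * prod_{j<k} zeta_j(z),
    with zeta_j(z) = (z - a_j) / (1 - conj(a_j) z) and a_0 = 0.
    """
    pts = [complex(0.0)] + [complex(a) for a in alphas]
    basis = []
    for k, a in enumerate(pts):
        def psi(z, k=k, a=a):
            value = math.sqrt(1.0 - abs(a) ** 2) / (1.0 - a.conjugate() * z)
            for b in pts[:k]:
                value *= (z - b) / (1.0 - b.conjugate() * z)
            return value
        basis.append(psi)
    return pts, basis


def inner(f, g, samples=2048):
    """<f, g> = integral of f * conj(g) over the circle (normalized measure)."""
    total = 0j
    for j in range(samples):
        t = cmath.exp(2j * math.pi * j / samples)
        total += f(t) * g(t).conjugate()
    return total / samples


f = cmath.exp  # assumed test function in H_2 of the disk
pts, basis = malmquist_basis([0.5, complex(-0.3, 0.4)])  # assumed alpha_1, alpha_2

# Fourier coefficients of f with respect to the orthonormal basis.
coeffs = [inner(f, psi) for psi in basis]


def f_n(z):
    """Orthogonal projection of f onto span{psi_0, ..., psi_n}."""
    return sum(c * psi(z) for c, psi in zip(coeffs, basis))


# The projection should agree with f at 0, alpha_1, alpha_2.
interpolation_errors = [abs(f(a) - f_n(a)) for a in pts]

# Largest deviation of the Gram matrix from the identity.
gram_defect = max(
    abs(inner(basis[i], basis[j]) - (1.0 if i == j else 0.0))
    for i in range(len(basis))
    for j in range(len(basis))
)
```

The projection is computed purely from inner products, yet it reproduces the values of $f$ at the points $0, \alpha_1, \alpha_2$ up to quadrature and rounding error.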
Let us now return to our density problem. The idea that the interpolation error, which now turns out to be also related to the $L_2$ approximation error, is proportional to a Blaschke product forms the backbone of the following convergence result that gives a local uniform convergence result in $\mathbb{O}$, which means uniform convergence on compact subsets of $\mathbb{O}$. For the disk, this theorem can be found again in Walsh [200, pp. 305-306].

Theorem 7.1.7. Let $f \in H_2(\mathbb{D})$ and suppose that $\sum(1 - |\alpha_k|)$ diverges. Let $f_n \in \mathcal{L}_n$ be the function that interpolates $f$ in the point set $A_n^0 = \{0, \alpha_1, \ldots, \alpha_n\}$. (The zero in this theorem is not essential. It could be replaced by any other $\alpha_0 \in \mathbb{D}$.) Then $f_n$ converges uniformly on compact subsets of $\mathbb{D}$ to $f(z)$. If $f \in H_2(\mathbb{D})$ is also analytic on $\mathbb{T}$, then the convergence is also uniform on $\mathbb{T}$.

Proof. The details of the proof can be found in Walsh's book. The basic idea is to use the error formula
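$$f(z) - f_n(z) = \frac{1}{2\pi i}\int_{\mathbb{T}} \frac{z\,B_n(z)B_{n*}(t)}{t}\,\frac{f(t)}{t - z}\,dt.$$
For $z \in \mathbb{D}$, use the fact that $B_n(z)$ converges to zero while the modulus of $B_{n*}(t)$ is 1. For $z \in \mathbb{T}$, take the integral over a circle slightly larger than the unit circle. Then $|B_n(z)| = 1$ while ($|t| > 1$)
$$|B_{n*}(t)| = |B_n(1/\bar{t})| \to 0,$$
from which the theorem follows. □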
The second part of the theorem implies that the set $\{B_n\}_{n \ge 0}$ is complete in $H_2$. The result is stronger. It says that if the sum diverges, it is possible for an arbitrary polynomial $p$ and any $\epsilon > 0$ to find $n$ sufficiently large such that there is an $f_n \in \mathcal{L}_n$ with $|p - f_n| < \epsilon$ uniformly on $\overline{\mathbb{D}}$.
7.2. Density in $L_2(\mathring\mu)$ and $H_2(\mathring\mu)$

For a more general positive measure $\mu$, it is well known [1, p. 261] that the system $\{t^k\}_{k \ge 0}$ is complete in $L_p(\mu, \mathbb{T})$, $p \ge 1$ iff $\int |\log\mu'(t)|\,d\lambda(t) = \infty$, and for the real line, that the system $\{e^{i\alpha x}\}_{\alpha \ge 0}$, or equivalently the system $\{(t + i)^{-k}\}_{k \ge 0}$, is complete in $L_p(\mu, \mathbb{R})$, $p \ge 1$ iff $\int |\log\mu'(t)|\,d\mathring\lambda(t) = \infty$ [1, pp. 263-266]. We call
$$\log\mathring\mu' \in L_1(\mathring\lambda) \tag{7.6}$$
the Szegő condition. Thus from classical polynomial theory [2, p. 186; 116, p. 50; 92, p. 144] we have

Theorem 7.2.1. Let $p \ge 1$ be an integer. The set of polynomials is dense in $L_p(\mu, \mathbb{T})$ if and only if $\log\mu' \notin L_1(\lambda, \mathbb{T})$. The set of rational functions $\{(z + i)^{-k}\}_{k \ge 0}$ is dense in $L_p(\mu, \mathbb{R})$ if and only if $\log\mu' \notin L_1(\mathring\lambda, \mathbb{R})$.

We want to investigate the density of $\mathcal{L}_\infty$. From the comments at the end of the previous section, we may immediately conclude the second part of the following theorem.

Theorem 7.2.2. Let $1 \le p < \infty$ and $\mathring\mu$ be a finite positive measure on $\partial\mathbb{O}$. Define $\mathcal{L}$ as the $L_p(\mathring\mu)$-closure of $\mathcal{L}_\infty$. We can then give the following statements:
1. It is always true that $\mathcal{P}_\infty$ is uniformly dense in $\mathcal{L}_\infty$, that is, dense w.r.t. the $\|\cdot\|_\infty$ norm. Thus also $\mathcal{L} \subset H_p(\mathring\mu)$ for any $p$.
2. If the Blaschke product diverges, that is, if (7.5) is satisfied, then $\mathcal{L}_\infty$ is uniformly dense in $\mathcal{P}_\infty$. We then have $\mathcal{L} = H_p(\mathring\mu)$ for any $p$.

Proof. Recall that $H_p(\mathring\mu)$ is the $L_p(\mathring\mu)$-closure of $\mathcal{P}_\infty$. For part 1, we note that since every element from some $\mathcal{L}_n$ is meromorphic in $\mathbb{C}$ and analytic in $\mathbb{O} \cup \partial\mathbb{O}$ and since for $1 \le p < \infty$, $L_p(\mathring\mu)$ is a complete metric space [187, p. 69], all $\mathcal{L}_n \subset H_p(\mathring\mu)$. Hence $\mathcal{L}_\infty \subset \mathcal{L} \subset H_p(\mathring\mu)$. For part 2, it follows from the remarks given above that $H_p(\mathring\mu) \subset \mathcal{L}$, hence, in combination with part 1 we get equality. □

If $\mathcal{L}$ is the $L_2(\mathring\mu)$-closure of $\mathcal{L}_\infty$, then we know from Theorem 7.2.2 that $\mathcal{L} \subseteq H_2(\mathring\mu)$ and if the Blaschke product diverges, we have equality. Thus the divergence of the Blaschke product is a sufficient condition for completeness. It is, however, not necessary (see Theorem 10.3.4). In the case $p = 2$, we have the following property concerning the density of $\mathcal{L}_\infty$ in $H_2(\mathring\mu)$, which includes a partial converse.
Theorem 7.2.3. Let $\mathring\mu$ be a positive measure on $\partial\mathbb{O}$. Then $\mathcal{L}_\infty$ is dense in $H_2(\mathring\mu)$ if the Blaschke product diverges, that is, if (7.5) holds. Conversely, let $\log\mathring\mu' \in L_1(\mathring\lambda)$, then the density of $\mathcal{L}_\infty$ in $H_2(\mathring\mu)$ implies the divergence of the Blaschke product.

Proof. The first part is the second part of the previous theorem for $p = 2$. For the second part, we can use a similar construct as in Ref. [60]. If (7.5) is not satisfied, then it is known that $B_n(z)$ converges to a Blaschke product $B(z)$, which is an inner function, that is, with modulus bounded by 1 in $\mathbb{O}$, while its radial limit has modulus 1 $\mathring\lambda$-a.e. It has zeros in $z = \alpha_1, \alpha_2, \ldots$. We have to show that there exists a nonzero function in $H_2(\mathring\mu)$ that is not in $\mathcal{L}$. Let us take $f(z) = \zeta_0(z)B(z)$
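$$\left|\int \frac{g(t)}{\sigma(t)}\,\overline{B_k(t)}\,d\mathring\mu_s\right|^2 \le \int |g(t)B_k(t)|^2\,d\mathring\mu_s \int \frac{1}{|\sigma(t)|^2}\,d\mathring\mu_s.$$
The last factor is zero because $1/\sigma$ vanishes $d\mathring\mu_s$-a.e., and the other factor is finite because
$$\int |gB_k|^2\,d\mathring\mu_s = \int |g|^2\,d\mathring\mu_s \le \int |g|^2\,d\mathring\mu < \infty.$$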
Hence $\langle h, B_k\rangle_{\mathring\mu_s} = 0$ for all $k = 0, 1, \ldots$. Thus the situation is exactly as in the case of an absolutely continuous measure. □

We can combine Theorem 7.2.1 with the previous theorem to get the following corollary.

Corollary 7.2.4. The following holds:
1. If the Blaschke product diverges (i.e., (7.5) holds), then $\log\mathring\mu' \notin L_1(\mathring\lambda)$ iff $\mathcal{L}_\infty$ is dense in $L_2(\mathring\mu)$.
2. If $\log\mathring\mu' \in L_1(\mathring\lambda)$, then the Blaschke product diverges iff $\mathcal{L}_\infty$ is dense in $H_2(\mathring\mu)$.

Proof. 1. If the Blaschke product diverges then $\mathcal{L}$, the $L_2(\mathring\mu)$-closure of $\mathcal{L}_\infty$, is equal to $H_2(\mathring\mu)$ by Theorem 7.2.2. Moreover, by Theorem 7.2.1, $\log\mathring\mu' \notin L_1(\mathring\lambda)$ iff $H_2(\mathring\mu) = L_2(\mathring\mu)$. Thus if the Blaschke product diverges, $\log\mathring\mu' \notin L_1(\mathring\lambda)$ iff $\mathcal{L} = L_2(\mathring\mu)$, which proves the first part.
2. It is always true that the divergence of the Blaschke product implies the density of $\mathcal{L}_\infty$ in $H_2(\mathring\mu)$. However, if $\log\mathring\mu' \in L_1(\mathring\lambda)$, then the density of $\mathcal{L}_\infty$ in $H_2(\mathring\mu)$ implies the divergence of the Blaschke product by the second part of Theorem 7.2.3. This proves the second part. □

Later, in Section 9.4, we shall give other equivalent conditions for density, convergence, and boundedness under the more restrictive condition that the $|\alpha_n|$ are bounded away from the boundary. Of course if the $|\alpha_n|$ are bounded away from the boundary, then the Blaschke product will automatically diverge. We can also prove that when the Blaschke product diverges then $\mathcal{R}_\infty$ is dense in $L_2(\mathring\mu)$.

Corollary 7.2.5. Suppose the Blaschke product diverges to zero and that $\mathring\mu$ is a positive measure on $\partial\mathbb{O}$. Then $\mathcal{R}_\infty = \operatorname{span}\{B_n\}_{n\in\mathbb{Z}}$ is dense in $L_2(\mathring\mu)$.

Proof. To prove the density, we have to show that if $f \in L_2(\mathring\mu)$ is orthogonal to $\mathcal{R}_\infty$, then it is zero. Consider first the case of the disk, so that $\mathbb{O} = \mathbb{D}$ and $d\mathring\mu = d\mu$. By the previous theorem $\mathcal{L}_\infty$ is dense in $H_2(\mu)$, so that if $\int f(t)\overline{B_k(t)}\,d\mu(t) = 0$ for all $k = 0, 1, \ldots$, then $\int f(t)\,t^{-k}\,d\mu(t) = 0$ for all $k = 0, 1, \ldots$. Now consider the function
$$F(z) = \int C(t, z)\,f_*(t)\,d\mu(t),$$
where $C(t, z)$ is the Cauchy kernel. This $F$ is analytic in $\mathbb{D}$ because it is the Cauchy-Stieltjes integral of the complex measure $d\nu = f_*\,d\mu$. Moreover $\nu$ is of bounded variation, so that $F$ belongs to $H_p$ for any $p < 1$ [76, Theorem 3.5, p. 39]. By our assumption, $\langle f, B_k\rangle = 0$ for $k \in \mathbb{Z}$, and in particular $\int f_* B_k\,d\mu = 0$, $k = 0, -1, -2, \ldots$. This implies that the $\alpha_k$ are zeros of $F$ since $C(t, \alpha_k) \in \operatorname{span}\{B_0, B_{-1}, \ldots\}$ and hence
$$F(\alpha_k) = \int C(t, \alpha_k)\,f_*(t)\,d\mu(t) = 0 \quad\text{for all } k = 0, 1, \ldots.$$
Suppose for simplicity that there are infinitely many different $\alpha_k$. If not, one has to show first that the $\alpha_k$ that is infinitely many times repeated is also a zero of $F$ of infinite order. Because $F$ vanishes in the zeros $\alpha_k$ that are the zeros of a divergent Blaschke product, and because $F$ belongs to the Nevanlinna class $N \supset H_p$, it follows from its inner-outer decomposition that it vanishes identically in $\mathbb{D}$. This then implies that
$$\int f_*(t)\,t^{-k}\,d\mu(t) = \overline{\int f(t)\,t^{k}\,d\mu(t)} = 0 \quad\text{for all } k = 0, 1, \ldots$$
(compare with Ref. [92, p. 62]). Thus $f$ is orthogonal to all the elements in $\{t^n\}_{n\in\mathbb{Z}}$, which is complete in $L_2(\mu)$. Thus $f$ is at the same time in $L_2(\mu)$ and orthogonal to it. Hence it is zero $\mu$-a.e.

The case of the half plane is treated similarly. We give a brief sketch. First it is observed that $f \perp H_2(\mathring\mu)$; hence
$$\int f(t)\,e^{-ixt}\,d\mathring\mu(t) = 0 \quad\text{for } x \ge 0.$$
Furthermore, we note that $C(t, z)$ can be written as an integral (compare with Ref. [92, p. 62]), $C(t, z) = \int_0^\infty \exp\{i(z - t)x\}\,d\lambda(x)$, so that in that case
$$\int f_*(t)\,C(t, \alpha_k)\,d\mathring\mu(t) = 0 \quad\text{implies}\quad \int f_*(t)\,e^{-ixt}\,d\mathring\mu(t) = 0 \quad\text{for all } x \ge 0$$
and thus also
$$\int f(t)\,e^{ixt}\,d\mathring\mu(t) = 0 \quad\text{for all } x \ge 0.$$
Thus $f \in L_2(\mathring\mu)$, while at the same time, its Fourier transform vanishes identically. Thus $f = 0$ $\mathring\mu$-a.e. □

Another kind of density result relates to the representation of positive functions in $L_1$. It is for instance well known that $f \in L_1$, $f \ge 0$ a.e. on $\mathbb{T}$ and $\log f \in L_1$ if and only if $f = |g|^2$ with $g \in H_2$. In fact, any positive trigonometric polynomial can be written as the square modulus of a polynomial of the same degree. For a more general function $f$, we can take $g$ to be the outer spectral factor of $f$. One finds this result in any standard work, for example, in Grenander and Szegő [102, pp. 23-26]. The density of the trigonometric polynomials then implies that any positive function from $L_1$ with integrable logarithm can be approximated arbitrarily well by a positive trigonometric polynomial, and hence by the square modulus of an outer polynomial. This result can be generalized as follows.
Theorem 7.2.6. Let the Blaschke product diverge to 0 and take $f \in L_1(\mathring\lambda)$. Suppose $f \ge 0$ a.e. on $\partial\mathbb{O}$ and $\log f \in L_1(\mathring\lambda)$. Then for every $\epsilon > 0$, there is some $f_n \in \mathcal{L}_n$ for $n$ sufficiently large such that $\|f - |f_n|^2\|_1 < \epsilon$ (norm in $L_1(\mathring\lambda)$).

Proof. By the above-mentioned property, we may replace $f$ by $|g|^2$ with $g \in H_2$ and outer. Thus we have to prove that there exists an $f_n$ such that $\||g|^2 - |f_n|^2\|_1 < \epsilon$. By a property that can be found in the book by Rudin [187, p. 78], we may use for $p = 2$
$$\||g|^p - |h|^p\|_1 \le 2pR^{p-1}\|g - h\|_p, \quad\text{with}\quad \max\{\|g\|_p, \|h\|_p\} \le R.$$
Thus
$$\||g|^2 - |f_n|^2\|_1 \le 4R\left[\int |g - f_n|^2\,d\mathring\lambda\right]^{1/2}.$$
We can always find an interpolant $f_n$ that makes $|g - f_n|$ arbitrarily small and hence also $|f_n| < |g| + \epsilon_1$ with $\epsilon_1 > 0$ arbitrarily small. Thus, because $g \in H_2$, $f_n$ will also be in $H_2$, so that $R$ is bounded, whereas $\|g - f_n\|_2$ can be made as small as we want. This proves the theorem. □

The space $\mathcal{L}_\infty$ will be dense in $L_2(\mathring\mu)$ whenever the Parseval equality holds for all functions (or for a set of basis functions) in $L_2(\mathring\mu)$. This result was stated for the polynomial case in Ref. [2, Theorem 2.2.3]. The same argument can be used in the rational case. Consider a function $f \in L_2(\mathring\mu)$. Suppose it has the formal Fourier series expansion
$$f(t) \sim \sum_{k=0}^\infty c_k\phi_k(t), \qquad c_k = \langle f, \phi_k\rangle_{\mathring\mu}.$$
It is well known that
$$\min\{\|f - f_n\|^2 : f_n \in \mathcal{L}_n\}$$
is obtained for
$$f_n = \sum_{k=0}^n c_k\phi_k$$
and the minimum is
$$\|f\|^2 - \sum_{k=0}^n |c_k|^2.$$
Obviously, this implies the Bessel inequality
$$\sum_{k=0}^\infty |c_k|^2 \le \|f\|^2.$$
If, however, equality holds, then we get Parseval's equality and this is equivalent with the fact that the function $f$ can be approximated arbitrarily closely in $L_2(\mathring\mu)$ by functions from $\mathcal{L}_n$ if $n$ is sufficiently large. Basically, the previous argument is used to obtain another density result in Chapter 10; see Theorem 10.3.4. A solution of the moment problem considered in Chapter 10 is called N-extremal if the Parseval equality holds for a certain function. If the Blaschke product diverges, then the moment problem considered in Chapter 10 will have a unique solution, which will be N-extremal and Theorem 10.3.4 then says that $\mathcal{L}_\infty$ is dense in $L_2(\mathring\mu)$. Thus it is sufficient that the Blaschke product diverges for $\mathcal{L}_\infty$ to be dense in $L_2(\mathring\mu)$. It is, however, not a necessary condition because the same theorem says that if $\mathring\mu$ is an N-extremal solution of the moment problem, which may well exist even if the Blaschke product does not diverge, then also $\mathcal{L}_\infty$ will be dense in $L_2(\mathring\mu)$.
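The least squares identity behind the Bessel inequality can be checked in the simplest special case. The sketch below assumes all poles at infinity (all $\alpha_k = 0$), so that the orthonormal functions are $\phi_k(t) = t^k$, and it assumes normalized Lebesgue measure on the circle; the test function $f = \exp$ and all identifiers are likewise illustrative choices, not part of the text.

```python
import cmath
import math

SAMPLES = 1024
NODES = [cmath.exp(2j * math.pi * j / SAMPLES) for j in range(SAMPLES)]


def inner(f, g):
    """<f, g> with respect to normalized Lebesgue measure on the circle."""
    return sum(f(t) * g(t).conjugate() for t in NODES) / SAMPLES


def phi(k):
    """Orthonormal basis in the assumed special case: phi_k(t) = t^k."""
    return lambda t: t ** k


f = cmath.exp  # assumed test function; its k-th coefficient is 1/k!
n = 5
coeffs = [inner(f, phi(k)) for k in range(n + 1)]


def f_n(t):
    """Partial Fourier sum, the best approximant from span{phi_0..phi_n}."""
    return sum(c * phi(k)(t) for k, c in enumerate(coeffs))


def residual(t):
    return f(t) - f_n(t)


norm_f_sq = inner(f, f).real
bessel_sum = sum(abs(c) ** 2 for c in coeffs)
residual_sq = inner(residual, residual).real
```

Here `residual_sq` equals `norm_f_sq - bessel_sum` up to rounding, which is the minimum stated above, and letting `n` grow drives the residual to zero, which is Parseval's equality for this $f$.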
8. Favard theorems

In this chapter we shall prove two Favard type theorems, which say that if we have a sequence of functions generated by a recurrence of the type we studied in Chapter 4, then there will be a measure with respect to which these are orthogonal. We not only prove the existence of the measure, but really give a constructive proof. Since in our development, we have derived the recurrence relation for the orthogonal functions from the recurrence relation for the reproducing kernels, it seems a natural question to ask whether a Favard theorem exists for reproducing kernels as well. Thus, if we have a recurrence relation for some kernel functions as we gave them in Chapter 3, are these then reproducing kernels for the rational spaces with respect to some positive measure? We were not completely successful in this respect, but we include the results obtained so far.

8.1. Orthogonal functions

In Sections 3.3 and 4.1, we have seen how the kernels, as well as the orthogonal functions, satisfied certain recurrence relations that generalize the Szegő recurrence relations. It is thus true that all rational functions that are orthogonal (or reproducing kernels) with respect to a certain positive measure on the boundary $\partial\mathbb{O}$ will satisfy such a recurrence. The converse of this theorem is known as a Favard theorem, named after Favard's paper [79]. Such a Favard theorem states that if functions satisfy recurrence relations as we have given in Section 4.1, then they will give orthogonal rational functions with respect to some positive measure on the boundary $\partial\mathbb{O}$. A simple proof for the Szegő polynomials was given in Ref. [78]. There, not only the existence of the measure is proved, but the measure is actually constructed. We shall follow in this section a similar approach for the rational case. Related results in a somewhat different setting were obtained in Refs. [112] and [31].
We shall give a sequence of lemmas that will eventually lead to the proof of the Favard Theorem. Lemma 8.1.1. Suppose we are given two sequences of numbers $\alpha_k \in \mathbb{O}$ and $\lambda_k \in \mathbb{D}$ for $k = 1, 2, \ldots$. Define the numbers $e_k > 0$ by their squares
mk(ak)
4=
1 -
\Xk\2
for&= 1,2,
and
(8.1)
Finally, define the functions 0& by
(8.2) where the numbers / | ] G T are chosen such that (/)k(ak) > 0. Then the functions 0£ satisfy the following recurrence: 0o = l ,
-i(z)
k_i(z)l,
*=1,2,..., (8.3)
with (8.4)
with ZA: as in (2.1). Moreover, 1/0* G Proof. We can leave the proof of the first part to the reader because it comes down to simple calculus. For the second part, note that we can couple the two recurrences (8.2) and (8.3) into the form (4.1). We can use this form recursively and end up with (recall ao = 0) cn where cn = n&=i ek
an
d where ©„ is given by 0 n = 0n • • • #o with
ek =
\4
2
L° %
?*-ife) 0 0 1
which is a J-contractive matrix. Hence, because
it follows by Theorem 1.5.3(5) that
$$1/\phi_n^* = c_n^{-1}\,\varpi_n(z)\,[(\Theta_n)_{21} + (\Theta_n)_{22}]^{-1} \in H_2,$$
which concludes the proof. □
Lemma 8.1.2. Under the conditions of Lemma 8.1.1, it holds that Xk = nk "; v
K L/
~
with
r\k = zkzk-x
_~
[ e T.
(8.5)
Proof. Using
and <j>l_x (ptk-\) > 0, we find from the recurrences for (j)k and (pi that (pk\OLk—\) = ekTifc
[U + Ak(Pk—i\Oik—l)\
\®-v)
[ 0 "I" 0jfe—lV^ik—1)]«
(8-7)
and (pfc^Otk—l)
=
e
kflk
Dividing (8.6) by (8.7) gives
The result now follows.
•
Lemma 8.1.3. Let the functions
^n(0 =
|nr o (OI 2 ^a, <*n) .-. „ , TTT^
P ^ , <*n) .. , „
^ ( 0 = TT-T^TT ^ ( 0 ,
(8.8)
with $P(z, w)$ the Poisson kernel. Then the functions $\phi_k$ satisfy the orthogonality relations
$$\langle\phi_k, \phi_l\rangle_{\mu_n} = \delta_{kl} \quad\text{for } 0 \le k, l \le n. \tag{8.9}$$
Proof. We shall first prove that $\phi_n$ is orthonormal with respect to all its predecessors:
$$\langle\phi_n, \phi_m\rangle_{\mu_n} = \delta_{nm} \quad\text{for } 0 \le m \le n. \tag{8.10}$$
r 0n(O0 m*(t) P(t,an)dX(t)
-I ~J
0n(O0 n*(0
0m* (0 P(t,an)dX(t) 0n*(O
Bn\m(t
= Bn \m(z)
^P{t,an)dX(t)
(>*(z)
The general orthogonality follows from Theorem 4.1.6, which says that all the previous orthogonal functions are defined in terms of $\phi_n$, orthonormal to $\mathcal{L}_{n-1}$, by the inverse recurrence, and we have just proved by the previous Lemma 8.1.2 that the recurrence in Lemma 8.1.1 is the same as the one from Theorem 4.1.1.
□

We are now ready to prove the following Favard type theorem.

Theorem 8.1.4. There exists a Borel measure $\mu$ on $\partial\mathbb{O}$ for which the $\phi_n$ as constructed in Lemma 8.1.1 are the orthonormal functions. The measure is unique when the Blaschke product with zeros $\alpha_k$ diverges (to zero).

Proof. For notational reasons, we give the proof for the case of the disk. It can be easily adapted for the half plane. Define the linear functional $M$ on $\mathcal{R}_\infty = \mathcal{L}_\infty \cdot \mathcal{L}_{\infty*}$ by $M(\phi_k\phi_{l*}) = \delta_{kl}$, $k, l = 0, 1, \ldots$. Obviously the $\phi_k$ are orthogonal with respect to this functional. We prove that this functional can be represented as an integral such that
$$M(f) = \int f\,d\mu \quad\text{for all } f \in C(\mathbb{T}).$$
Define the measure $\mu_n$ as in (8.8). Let us temporarily switch back from our notation for the unit circle to the notation for the interval $(0, 2\pi]$. To avoid confusion, we set $d\hat\mu_n(\theta) = d\mu_n(e^{i\theta})$. Thus for $0 \le t \le 2\pi$
$$\hat\mu_n(t) = \int_0^t d\hat\mu_n(\theta).$$
These are all increasing functions and uniformly bounded ($\int d\hat\mu_n = 1$). Hence, there exists a subsequence $\{\hat\mu_{n_k}\}$ and a distribution function $\hat\mu$ such that
$$\lim_{k\to\infty} \hat\mu_{n_k}(\theta) = \hat\mu(\theta)$$
and
$$\lim_{k\to\infty} \int f(e^{i\theta})\,d\hat\mu_{n_k}(\theta) = \int f(e^{i\theta})\,d\hat\mu(\theta) = \int f(t)\,d\mu(t)$$
for all functions $f$ continuous on $\mathbb{T}$. Thus the $\{\phi_n\}$ are an orthonormal system with respect to this measure $\mu(e^{it}) = \hat\mu(t)$. Hence, the linear functional $M$ defined on $\mathcal{R}_\infty = \mathcal{L}_\infty \cdot \mathcal{L}_{\infty*}$ by $M(\phi_k\phi_{l*}) = \delta_{kl}$, $k, l = 0, 1, \ldots$ can be extended to a bounded functional on $C(\mathbb{T})$. The uniqueness follows from the representation of bounded linear functionals on $C(\partial\mathbb{O})$ (F. Riesz) and the density property of the previous chapter. □
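The measures used in this chapter are built from the Poisson kernel, and the orthogonality arguments rest on its reproducing property: for $h$ analytic on the closed disk, $\int h(t)P(t, w)\,d\lambda(t) = h(w)$. The following sketch checks this numerically for the disk. The test function `h`, the point `w`, and the function names are assumptions made for illustration, and the kernel is taken in its usual disk form $P(t, w) = (1 - |w|^2)/|t - w|^2$.

```python
import cmath
import math


def poisson_kernel(t, w):
    """Poisson kernel of the unit disk, P(t, w) = (1 - |w|^2) / |t - w|^2."""
    return (1.0 - abs(w) ** 2) / abs(t - w) ** 2


def poisson_integral(h, w, samples=4096):
    """Approximate the integral of h(t) P(t, w) over the unit circle with
    respect to normalized Lebesgue measure (equispaced rule)."""
    total = 0j
    for j in range(samples):
        t = cmath.exp(2j * math.pi * j / samples)
        total += h(t) * poisson_kernel(t, w)
    return total / samples


def h(z):
    """Assumed test function, analytic on the closed unit disk."""
    return (z * z + 1.0) / (1.0 - 0.4 * z)


w = complex(0.5, -0.3)  # assumed evaluation point inside the disk
reproduced = poisson_integral(h, w)
exact = h(w)
```

The two values `reproduced` and `exact` agree to rounding error, which is the mechanism by which an integral against $P(t, \alpha_n)\,d\lambda(t)$ collapses to a point evaluation.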
8.2. Kernels

We shall in this section try to formulate a Favard theorem for the reproducing kernels. We were not completely successful, but we give the results obtained so far. Even in the polynomial case, this problem has not been considered before.

Lemma 8.2.1. Recall the definition $\alpha_0 = 0$ for $\mathbb{O} = \mathbb{D}$ and $\alpha_0 = i$ for $\mathbb{O} = \mathbb{U}$. Furthermore, let $\alpha_1, \alpha_2, \ldots$ be complex numbers in $\mathbb{O}$. Let $\rho_k$, $k = 1, 2, \ldots$ be given functions in $\mathbb{B}$ (i.e., $|\rho_k(w)| < 1$ for $w \in \mathbb{O}$). Define $\gamma_k(w) = -\zeta_k(w)\rho_k(w) \in \mathbb{B}$ for $k = 1, 2, \ldots$. Let the functions $K_n(z, w)$ be generated by (3.12), that is,
\Kn(z,w)
] = 0 (z,w) n
Kn-\(z,w)
(8.11)
, w)
with the matrix On given by 1
1
Pn
1
Yn
Pn
1
Yn
1
(8.12)
Then the following properties hold: l/Kn(z, w) e H2 Kn(w, w) = Kn-i(w, u
for w e © fixed.
y 1 - \pnyw)r
"x \ i - \pnKW)\- J
(8.13)
(8.14)
Proof. Again, (8.13) follows from Theorem 3.3.3 as for the orthogonal functions because the recurrence is J-unitary by definition. The proof of (8.14) is pure calculus: Just replace z by w and play a bit with the formulas. Equality (8.15) is also immediate, since
K*(an, w) _ p J / ^ - j f e , w) + yn{w)K*n_x(an, w)] _ _ Kn(an,w)
n
[Kn-i(an, w) + yn(u))K*_l(an, w)]
Note that as in the case of the reproducing kernels, the previous result implies that $K_n$ defines all the previous $K_j$, $j = n-1, n-2, \ldots, 0$ uniquely. Thus if $K_n(z, w)$ is a normalized reproducing kernel for $\mathcal{L}_n$ with respect to some measure $\mu$, then the $K_j(z, w)$ will be normalized reproducing kernels for $\mathcal{L}_j$, $0 \le j \le n$. Concerning the reproducing property of $K_n(z, w)$ we have the following lemma, which is somewhat deceiving as we shall discuss after its proof.

Lemma 8.2.2. Let the $K_n$ be generated as in the previous Lemma 8.2.1 and define
$$k_m(z, w) = K_m(z, w)K_m(w, w) \quad\text{for } m = 0, 1, \ldots. \tag{8.16}$$
Define the measure $\mu_n$ by
$$d\mu_n(t) = \frac{K_n(w, w)^2\,P(t, w)}{|k_n(t, w)|^2}\,d\lambda(t) = \frac{P(t, w)}{|K_n(t, w)|^2}\,d\lambda(t), \tag{8.17}$$
where $P(z, w)$ is the Poisson kernel. Then
$$\langle f, k_m(\cdot, w)\rangle_{\mu_n} = f(w), \quad \forall f \in \mathcal{L}_m, \quad 0 \le m \le n. \tag{8.18}$$
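Proof. We first prove this for $m = n$. This is easily seen as follows. It holds for all $f \in \mathcal{L}_n$ that
$$\langle f, k_n(\cdot, w)\rangle_{\mu_n} = \int f(t)\,k_{n*}(t, w)\,\frac{P(t, w)\,d\lambda(t)}{K_n(t, w)K_{n*}(t, w)} = \int \frac{f(t)}{K_n(t, w)}\,\frac{k_{n*}(t, w)}{K_{n*}(t, w)}\,P(t, w)\,d\lambda(t) = K_n(w, w)\,\frac{f(w)}{K_n(w, w)} = f(w).$$
We show next the induction step, which starts from the fact that (8.18) is true for a certain $m$, and we shall prove that it will be also true for $m - 1$. That is, we have to prove
$$\langle f, k_m(\cdot, w)\rangle_{\mu_n} = f(w),\ \forall f \in \mathcal{L}_m \quad\Rightarrow\quad \langle f, k_{m-1}(\cdot, w)\rangle_{\mu_n} = f(w),\ \forall f \in \mathcal{L}_{m-1}. \tag{8.19}$$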
We first derive the following expression from the recurrence, which can be obtained with some tedious calculations:
\2Wm_x{Z, W) After dividing by Bm this becomes kmiZ^w)
Bm(z)
i - i
+ Pm\
= - ( 1 - \Ym\2)km-u(z, w) + [U(z)ymPm + Now we use this in the right-hand side of (8.19) to get
The first term can be simplified because (£mymPm + 1 ) / £ Cm and km reproduces / in w. Therefore, the inner product in the first term equals
f(w)(U(w)ym(w)pm(w)
+ 1) = -(1 - \Ym(vo)\2).
The inner product in the second term equals MmYm + Pm)L *«>£„ = <*m, / * ( / m + SmPm))fim.
The latter has a factor (yw + t;mpm)f*, which is again in Cm and we can again use the reproducing property of km to find that inner product equals the complex conjugate of
which is zero by the definition of ym. Thus, after filling in the last two results for the inner products, we get
(/, *„_!(., w))^ = f(w) + 0 = f(w), which proves the induction step.
•
Note that the previous result does not imply that $k_m$ is a reproducing kernel. Indeed, it reproduces $f$ in $w$ but only with respect to a measure $\mu_n$ that depends upon the specific $w$ in which the function is reproduced. It will only be a reproducing kernel if there exists a measure that is independent of $w$. This is about as far as we can get without further specification of how $\rho_n$ depends upon $w$. For an arbitrary sequence of numbers $\rho_k(w)$, depending on $w$ and satisfying $|\rho_k(w)| < 1$, one may not expect that the corresponding $\Theta_n$ matrix given by $\Theta_n = \theta_n\theta_{n-1}\cdots\theta_1$ contains (normalized) reproducing kernels for $\mathcal{L}_n$ with respect to any measure whatsoever. The situation is different from the Nevanlinna-Pick algorithm of Section 6.4 where the $\Theta_n$ matrix did give these kernels, but there the $\rho_k(w)$ were special since they were generated starting from a measure $\mu$ that did not depend on $w$. The $w$ there was introduced during the definition of the starting values for the Nevanlinna-Pick algorithm. For arbitrary $\rho_k(w)$, $k = 1, \ldots, n$, one can build $\Theta_n$ as the product of the $\theta_k$ given by Lemma 8.2.1, from which we can extract $K_n = (\Theta_n)_{21} + (\Theta_n)_{22}$ and the corresponding measure $\mu_n$ as in Lemma 8.2.2. We then do have (8.18), which does not identify $k_n(z, w)$ as a reproducing kernel as we said before. If $\{\phi_0, \ldots, \phi_n\}$ is an orthonormal basis for $\mathcal{L}_n$, then the kernels are
$$k_n(z, w) = \sum_{k=0}^n \phi_k(z)\overline{\phi_k(w)}$$
and this reflects a specific symmetry in $z$ and $w$. It implies, for example, that as a function of $w$, $k_n(z, w)$ should be in $\mathcal{L}_n$. In general, a reproducing kernel should be sesqui-analytic, that is, $k_n(z, w) = \overline{k_n(w, z)}$ and, more specifically, in $\mathcal{L}_n$ all the relations given in Theorem 2.2.3 should hold. This means that the way in which $k_n(z, w)$ depends upon $w$ is very special, and one should not expect that the choice of arbitrary $\rho_k(w)$, which depend in some exotic way on $w$, will provide this. One can easily check this by considering the simple case of $n = 1$ for example. So we shall have to introduce the notion of a sequence $\rho_k(w)$ having the property that the corresponding $k_n$ are indeed reproducing kernels. We shall say that such a sequence $\rho_k(w)$ has the RK (reproducing kernel) property. Since the $k_n(z, w)$ as they were generated in the previous lemmas depend upon $w$ via $\rho_i(w)$ in a very complex way, it is not easy to find conditions on how the coefficients $\rho_i(w)$ should depend upon $w$ to ensure that $k_n(z, w)$, as a function of $w$, is in $\mathcal{L}_n$. The reader is invited to try and check this for the simplest possible case $n = 1$. It remains an open problem to find a direct and simple characterization of the $\rho_i(w)$ having the RK property. For the moment we content ourselves with a characterization that is in the line of how these $\rho_i(w)$ are produced by the Nevanlinna-Pick algorithm and we shall formulate some equivalent conditions. Unfortunately, none of these will give a direct characterization of how the coefficients $\rho_k$ should depend on $w$. If such a characterization exists, it is still to be found. As explained in the previous lemmas, the coefficients $\{\rho_i(w) : i = 1, \ldots, n\}$ define uniquely the normalized kernels $\{K_i(z, w) : i = 1, \ldots, n\}$ and thus also the kernels $\{k_i(z, w) : i = 1, \ldots, n\}$, as well as the J-inner factors $\{\theta_i(z, w) : i = 1, \ldots, n\}$ and thus also the products $\{\Theta_i(z, w) = \theta_i \cdots \theta_1 : i = 1, \ldots, n\}$.
Conversely, the K((z, w) = (0/)2i + (®*)22 (and similarly the Lt(z, w)) can be recovered from the 0/. These K((z, w) define uniquely the k((z, w) and also the pt(w) and thus the 0,- as well. In other words, there is a one-to-one correspondence between all these sets of quantities. We shall say that one of these (and therefore also all the others) has the RK property, if on Cn, the inner product (•, -)frn is independent of w, where /xn is the measure defined in terms of the Kn(z, vo) by an expression like (8.17). It is an immediate consequence of Theorem 6.4.3 that the pl (w) will have the RK property if they can be generated by the Pick-Nevanlinna algorithm applied to some Ao e A, AOi (w) = 0 of the form A o = [1 - SW(P>) 1 + Sw(fi)]
(8.20)
for some measure jl satisfying Jdjl = 1 and independent of w. Recall that Sw is as defined in (6.22). Thus we have by Lemma 6.2.2 for the disk case P(z, w)
1 - \w\z \z
)
and for the half plane we have by Lemma 6.2.3 Qjx(z, w) =Sw(jl)(z)
= —. 7 P(z, iy)(l + z 2 )
9. (ilmw;)(l+z 2 )
(8.22)
where in both cases Qp,(z) = So(jl)(z), with So the usual Riesz-HerglotzNevanlinna transform, which is a special case of Sw obtained by setting w = o?oSince Qn (z,w) = Ln (z, w)/Kn (z, w) as generated by the Pick-Nevanlinna algorithm shall interpolate this Sw (jl) in A™ = {w, ct\,..., a n },alsoL n (z, ao)/ i^n(z, «o) will interpolate £2^ = <So(A) m ^« = i^o, «i, • • •, ^n\- Thus in the case of the disk, for example, 0 n (z, w) will have the RK property if
Ln(z,w)
Ln(z,0)/Kn(z,0)
The construction of these interpolants $\Omega_n$ can be done by applying the Nevanlinna-Pick algorithm as explained in Section 6.4, since this algorithm constructs $\Theta_n$, and the $K_n$ and $L_n$ can be obtained from the latter. However, they can also be obtained by running the algorithm backwards. Indeed, setting $\Delta_n = [\,0 \quad 2\,]$ and forming the array $\Delta_0 = \Delta_n\Theta_n$, where $\Theta_n$ is the J-unitary matrix generated by the Nevanlinna-Pick algorithm, we shall get
$$\Omega_n(z, w) = \frac{L_n(z, w)}{K_n(z, w)} = \frac{\Delta_{02}(z, w) - \Delta_{01}(z, w)}{\Delta_{02}(z, w) + \Delta_{01}(z, w)}.$$
More generally, one can choose any $\Delta_n \in A$ with $\Delta_{n1}(w) = 0$ and generate $\Delta_0 = \Delta_n\Theta_n$, which gives
$$\tilde\Omega_n(z, w) = \frac{\Delta_{02}(z, w) - \Delta_{01}(z, w)}{\Delta_{02}(z, w) + \Delta_{01}(z, w)},$$
which will also interpolate $\Omega_\mu(z, w)$ for $z \in A_n^w$. If this $\tilde\Omega_n(z, w)$ equals $S_w(\mathring\mu)(z)$ for some measure $\mathring\mu$ that does not depend on $w$, then in view of Theorem 6.4.3, the inner product with respect to $\mathring\mu_n$ will in $\mathcal{L}_n$ not depend on $w$ since it is there equal to the inner product with respect to $\mathring\mu$. Thus the $\theta_i(z, w)$ will have the RK property if there exists some $\Delta_n \in A$ with $\Delta_{n1}(w) = 0$ and $\Delta_0 = \Delta_n\Theta_n$ of the form (8.20)-(8.21).
We can now use $\Delta_n(z, w) = [\,S_n(z, w) \quad 1\,]$, with $S_n(z, w) \in \mathcal{B}$ and $S_n(w, w) = 0$, to get
$$\Delta_0 = \Delta_n\Theta_n = \tfrac{1}{2}\,[\,S_n(K_n^* + L_n^*) + (K_n - L_n) \quad S_n(K_n^* - L_n^*) + (K_n + L_n)\,].$$
If this has to be of the form (8.20), then
$$\Omega_\mu(z, w) = \frac{\Delta_{02} - \Delta_{01}}{\Delta_{02} + \Delta_{01}} = \frac{L_n - S_nL_n^*}{K_n + S_nK_n^*}.$$
We may thus conclude that 0/, i = 1 , . . . , n will have the RK property if there exists some function Sn(z, w) e 23, which may depend upon a parameter w and which satisfies Sn(w, w) = 0, such that the function Qn(z), defined by (case of the disk) Ln(z, w) - Sn(z, w)L*n(z, w) z~lw - zw] — «~ —?- — j — j - P(Z, W), Kn(z,w)-Sn(z,w)KZ(z,w) l-|w;|2 J belongs to C and is independent of w. Similar observations can be made for the half plane. We now have a Favard type theorem. Theorem 8.2.3. Let the kn(z, w) be generated as in the previous lemmas and let Kn(z, w) = kn(z, w)/<s/kn(w, w) be their normalized versions. Suppose the Pn (w)form a sequence with the RK property. Then there exists a Borel measure on dO such that for n = 0, 1, 2 , . . . the function kn(z, w) is a reproducing kernel for Cn. Thus there is a measure /x such that for n = 0, 1, 2 , . . . (/(Z), kn(Z, W))} = f(w),
V / € Cn, Vw G D.
If the rational functions U%L01Zn, where lZn = Cn- Cn* and Cn* = { / * : / € Cn], are dense in the space of continuous functions C(3O), then the measure li is unique. Proof. If the pn have the RK property, then A« (0 = Aw(^ w) as defined in (8.17) will define an inner product (•,•)#„ that on Cn will be independent of w, which implies as in Theorem 6.4.3 that the kn(z, w) is a reproducing kernel for Cn with respect to /x«(0 = A«(0 = An(^' a o)- Because the kernel kn uniquely
defines all the previous ones, we shall also have that kj(z, w) is a reproducing kernel for Cj with respect to the measure /x n (0 for j = n — 1, n — 2, We can now use the same reasoning as in the case of the Favard theorem for the orthogonal functions given before. We shall again restrict the formulation to the case of the disk. Since the distribution functions (recall our convention that jln(O)dO =
d\(6) are increasing functions and uniformly bounded (Jdfin = 1, because <So(/zw) = Qn (z) = Ln (z, 0)/Kn (z, 0) and Qn (0) = 1 and fdfi = co = Qn (0)), there exists a subsequence such that
lim AnftW = A(fl)
and
lim /
f{ei0)djlnk{e)=[f{t)d^{t)
for all continuous / with d/ji(eie) = p,(O)dO. Thus, for n = 0, 1,..., the kernels kn(z, w) are all reproducing in Cn with respect to this measure /x. To prove the uniqueness, we note that, because these kn are reproducing kernels, there exists a sequence of complex numbers wn, n = 0, 1 , . . . such that the sequence of functions {kn(z, wn), n = 0, 1,...} forms a basis for COQ. Thus we may define a linear bounded functional M on T ^ (and, because of the denseness, also in the space of continuous functions) by means of
=
kt(z, wfikjiz, wfidji = km(wj9 wt),
where m = min{/, j}. By the Riesz representation theorem of bounded linear functionals, it follows that /x is unique. •
Convergence
This rather long chapter contains many different convergence results. We shall start by recalling the background of the classical Szegő problem of weighted least squares approximation, which is related to the construction of the Szegő kernel. Traditionally, finite-dimensional approximants are taken as reproducing kernels in the set of polynomials $\mathcal{P}_n$. We give its generalization in the case where these approximants are reproducing kernels for the rational spaces $\mathcal{L}_n$.

In Section 9.2 we give some further preliminary convergence results related to rational interpolants for the Riesz-Herglotz-Nevanlinna transform of the measure. Such results hold locally uniformly, that is, uniformly on compact subsets of the unit disk or the half plane. In Section 9.3 we study some preliminary convergence results that hold for the reciprocal orthonormal functions $\phi_n^*$. These are called preliminary, because some rather strong conditions are imposed on the location of the points $\alpha_k$. They are supposed to stay away from the boundary, so that they are all in a compact subset of $\mathbb{O}$. When this is assumed, we obtain in the course of this section several other results, and many equivalent formulations of the Szegő condition $\log\mu' \in L_1(\mathring\lambda)$ are used. These equivalences are collected in Section 9.4.

These more restrictive conditions on the $\alpha_k$ can be deleted if we apply results from orthogonal rational functions with respect to varying measures. Such orthogonal polynomials can be found in the work of López, which is reviewed in Section 9.5. These are used in the subsequent Section 9.6 to obtain stronger results for the convergence of the orthogonal rational functions and the reproducing kernels. Some theorems about convergence in the weak star topology are given in Section 9.7. Ratio asymptotics are described in Section 9.8. These ratio asymptotics are typical when the measure is assumed to satisfy the Erdős-Turán condition $\mu' > 0$ a.e., which is weaker than the Szegő condition.
Root asymptotics, as given in Section 9.9, involve the convergence of expressions such as $|\phi_n|^{1/n}$. These typically involve some potential theory and are related to estimates for the rates of convergence. Such estimates for the rates of convergence of the rational interpolants for the Riesz-Herglotz-Nevanlinna transform of the measure and for R-Szegő quadrature formulas are given in the final Section 9.10.

As suggested by this survey, there is a hierarchy in the different kinds of asymptotics. In power asymptotics, one typically considers limits of the form $\lim_{n\to\infty}\phi_n(z)/p_n(z)$, with $p_n$ a polynomial. In ratio asymptotics, one considers limits of the form $\lim_{n\to\infty}\phi_{n+1}(z)/\phi_n(z)$, and in root asymptotics, one considers $\lim_{n\to\infty}|\phi_n(z)|^{1/n}$.
9.1. Szegő problem
answer is that $\mathcal{P}_\infty$ is dense in $L_2(\mu)$ if and only if the infimum $\kappa^{-2} = 0$, which happens if and only if $\int\log\mu'\,d\lambda = -\infty$ [102, pp. 49-50]. In general, the limiting value of $\kappa_n^{-2}$ as $n \to \infty$ is given by $|\sigma(0)|^2 = \exp\{\int\log\mu'\,d\lambda\}$. This is equal to the geometric mean of $\mu'$. If $\log\mu' \notin L_1$, then this has to be replaced by 0. This was proved, for example, in Ref. [102, p. 44]. See also Ref. [87, p. 200 ff].

As a generalization, Grenander and Szegő consider next the problem P^(1, w) with $w \in \mathbb{D}$, again for the polynomial case. We know that the infimum is reached for $f = k_n(z, w)[k_n(w, w)]^{-1}$ and that the minimum is found to be $[k_n(w, w)]^{-1}$. The latter function is known as the Christoffel function. For $w = 0$ we rediscover the previous result in the case of polynomials. In Ref. [102, Section 3.2] it is shown that this minimum converges to $1/s_w(w) = (1 - |w|^2)|\sigma(w)|^2$, where $s_w(z)$ is the Szegő kernel.

We shall generalize the latter approach to the rational case. However, in the rational case, it requires putting $w = \alpha_n$ to rediscover the former problem. This is an extra complication since this $w$ is supposed to be a constant in $\mathbb{D}$, whereas when replacing it by $\alpha_n$, it depends upon $n$, and when $n$ tends to $\infty$ this may converge to a point on the unit circle $\mathbb{T}$ or to a point in the disk $\mathbb{D}$ or it may not converge at all. We shall also have to consider $\mathcal{L}_\infty$ and $\mathcal{L}$. In analogy with the polynomials, the latter is again supposed to denote the closure in $L_2(\mu)$ of $\mathcal{L}_\infty$. We shall have solved the same problem as Szegő did if $\mathcal{L} = \mathcal{P} = H_2(\mu)$. We have seen in Section 7.1 that this happens when the sum $\sum(1 - |\alpha_k|)$ diverges.

In Dewilde and Dym [60] we find many of the results of this section proved for an absolutely continuous measure $\mu$, say $d\mu = \mu'\,d\lambda$ on $\mathbb{T}$. We now generalize this to the case of a general finite positive measure on $\partial\mathbb{O}$. We follow closely the development in Ref. [60]. Assume that $\log\mu' \in L_1(\lambda)$, so that the outer spectral factor $\sigma$ exists. In Ref. [102, p. 51] we find that for polynomials the limit function $\lim_{n\to\infty}k_n(z, w)$ equals the Szegő kernel,
$$s_w(z) = s(z, w) = [(1 - \overline{w}z)\,\sigma(z)\,\overline{\sigma(w)}]^{-1}. \qquad (9.1)$$
We also introduce its normalized form given by
$$S_w(z) = s_w(z)/\sqrt{s_w(w)}.$$
Thus $s_w(z) = S_w(z)S_w(w)$. This is the function appearing in Theorem 2.3.8. Note that it satisfies $s_w(w) > 0$. Theorem 2.3.8 then stated that
$$\|k_n(z, w) - s_w(z)\|^2_{\mathring\mu} = s_w(w) - k_n(w, w). \qquad (9.2)$$
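As a quick sanity check of (9.2), the following sketch treats the simplest special case only: the unit disk, polynomials instead of general rational functions, and normalized Lebesgue measure, so that $\sigma \equiv 1$, $s_w(z) = 1/(1 - \overline{w}z)$ and $k_n(z, w) = \sum_{k=0}^{n}(z\overline{w})^k$. These choices, the function names, and the equispaced quadrature rule are assumptions made for illustration; they are not taken from the text.

```python
import numpy as np

def szego_kernel(z, w):
    # Szego kernel (9.1) for normalized Lebesgue measure, where sigma == 1.
    return 1.0 / (1.0 - np.conj(w) * z)

def poly_kernel(z, w, n):
    # Reproducing kernel of the polynomials of degree <= n for Lebesgue
    # measure: sum_{k=0}^{n} z^k conj(w)^k.
    k = np.arange(n + 1)
    return np.sum((np.asarray(z)[..., None] * np.conj(w)) ** k, axis=-1)

def error_norm_sq(w, n, num_nodes=4096):
    # Squared L2 norm of k_n(., w) - s_w on the unit circle, approximated
    # with an equispaced quadrature rule.
    t = np.exp(2j * np.pi * np.arange(num_nodes) / num_nodes)
    diff = poly_kernel(t, w, n) - szego_kernel(t, w)
    return np.mean(np.abs(diff) ** 2)

w = 0.4 + 0.3j
for n in (0, 2, 5, 10):
    lhs = error_norm_sq(w, n)                                   # left side of (9.2)
    rhs = (szego_kernel(w, w) - poly_kernel(w, w, n)).real      # right side of (9.2)
    print(n, lhs, rhs)
```

In this special case both sides equal $|w|^{2(n+1)}/(1 - |w|^2)$, so the two printed columns should agree up to rounding and decay geometrically in $n$.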
We now turn to the invariant formulation including the disk as well as the half plane case. Thus we replace (9.1) by (2.18): sw(z) =
= [1
= = = =
Z()Z()]a(z)cr(w)
==•
mo(ao)
(9.3)
mw(z)cr(z)cr(w)
We shall denote by $\Pi_n$ the orthogonal projection operator in $L_2(\mathring\mu)$ onto $\mathcal{L}_n$. Theorem 2.3.8 then says that $\Pi_n[s_w(z)] = S_w(w)\,\Pi_n[S_w(z)] = k_n(z, w)$ and that the squared norm of the error is given by the expression (9.2). Recall that if $\log\mu' \in L_1(\mathring\lambda)$, then by Theorem 7.2.4 the divergence of the Blaschke product, that is, (7.5), is equivalent with the density of $\mathcal{L}_\infty$ in $H_2(\mathring\mu)$. Now we formulate our first convergence result:

Theorem 9.1.1. Let $k_n(z, w)$ be the reproducing kernel for $\mathcal{L}_n$, assume $\log\mu' \in L_1(\mathring\lambda)$, and let $s_w(z)$ be the Szegő kernel as defined in (9.3). If the Blaschke product $B_\infty(z)$ diverges, that is, if (7.5) is satisfied, then
$$\lim_{n\to\infty}\|k_n(z, w) - s_w(z)\|_{\mathring\mu} = 0 \qquad (9.4)$$
and
$$\lim_{n\to\infty}k_n(w, w) = s_w(w) \quad\text{and}\quad \lim_{n\to\infty}\phi_n(w) = 0, \qquad (9.5)$$
pointwise in $\mathbb{O}$.

Proof. Note that $s_w \in H_2(\mathring\mu)$ (it is the reproducing kernel for $H_2(\mathring\mu)$). Since the Blaschke product diverges, we know by Theorem 7.2.3 that $\mathcal{L}_\infty$ is dense in $H_2(\mathring\mu)$. Therefore $\Pi_ns_w$, the projection of $s_w$ onto $\mathcal{L}_n$, converges to $s_w$ in $L_2(\mathring\mu)$, which is the first result. Thus the squared norm of the error, which by (9.2) equals $s_w(w) - k_n(w, w)$, has to go to zero, which is Parseval's equality. Since $k_n(w, w) = \sum_{k=0}^{n}|\phi_k(w)|^2$ converges, its terms must tend to zero, which gives the second limit in (9.5).
inf{||/||A : / € #2, /(uO = 1} = - j - r , that is, the infimum equals (1 - \w\2)\a(w)\2
forB
and
\w
Proof. Indeed, Theorem 1.4.2 says that in $\mathcal{L}_n$ the optimization problem
$$\inf\{\|f\|^2_{\mathring\mu} : f \in \mathcal{L}_n,\ f(w) = 1\}$$
has the solution $f_n(z) = k_n(z, w)/k_n(w, w)$ and that the minimum is given by $1/k_n(w, w)$. Since the Blaschke product diverges, the space $\mathcal{L}_\infty$ is dense in $H_2(\mathring\mu)$ and by the previous theorem $k_n(w, w)$ converges to $s_w(w)$, so that the infimum of $\|f\|^2_{\mathring\mu}$ in $H_2(\mathring\mu)$ is equal to $1/s_w(w)$. □

Theorem 9.1.1 gives $L_2(\mathring\mu)$ convergence and pointwise convergence of the reproducing kernels for $\mathcal{L}_n$ to the Szegő kernel. In Section 9.3 we shall prove under somewhat more restrictive conditions on the $\alpha_k$ that $k_n(z, w)$ converges locally uniformly to $s_w(z)$ in $\mathbb{O}$. See also Theorem 9.6.7.

Sometimes it is more interesting to work with the normalized kernels, which were defined as $K_n(z, w) = k_n(z, w)/\sqrt{k_n(w, w)}$. Therefore, we also introduced a normalized form of $s_w(z)$, which was given before. We can write it explicitly as
V1 -
urw(z)cr(z) (9.6)
where the factor $\eta(w)$ of modulus 1 is chosen to satisfy $S_w(w) > 0$. This is obtained by setting $\eta(w) =$
mo(w)\cr(w)\\mo(ao)\ =
_ e T.
\ujo(w)\a(w)ujo(ao) Thus we have the explicit expressions T](W)
Sw(z) = T](W)
1-wz
"|2
1 cr(z)'
(z + i) Vim w 1 -, z-w o(z
rj(w)/a(w) > 0
for©,
r](w)(l-wi)/cr(w) >0
forU.
Note that Sw(w) =
1 W(w)\y/l
=<
- l£o(w)l2 1
/ I - |u;|2 i|/2 w
|a(uOIV^(w)/s7 0 (a 0 ) forD, forU.
In this notation the Szegő kernel can be written as $s_w(z) = S_w(w)S_w(z)$ and conversely $S_w(z) = s_w(z)/\sqrt{s_w(w)} = s_w(z)/S_w(w)$. Finally we note that
which brings about the following interesting result. Lemma 9.1.3. Let log// e L\(k). Then the normalized Szego kernel Sw(z) is an outer spectral factor associated with the density ii'(t)[P(t, w)\ujo(t)\2]~l with P(t, w) the Poisson kernel. Thus 1 where r](w) e T is such that Sw(w) > 0. Proof. We know that
^ JD(t,z)logii'(t)dk(t)\ . Furthermore, P(t, u;)|^ 0 (0l 2 = \Xw(t)\2 =
Xw(t)Xw*(t),
with
an outer function in H2. Hence the result follows.
•
For the normalized kernel we have the following:

Lemma 9.1.4. Let $\log\mu' \in L_1(\mathring\lambda)$ and let $S_w$ be the normalized Szegő kernel defined in (9.6) and $K_n(z, w)$ the normalized reproducing kernel for $\mathcal{L}_n$. Then
$$\|K_n(z, w) - S_w(z)\|^2_{\mathring\mu} = 2[1 - K_n(w, w)/S_w(w)]. \qquad (9.8)$$

Proof. We evaluate the norm in the left-hand side. From the reproducing property of the kernels we readily find $\|k_n(z, w)\|^2_{\mathring\mu} = k_n(w, w)$. Since $K_n(z, w) = k_n(z, w)/\sqrt{k_n(w, w)}$, we find that $\|K_n(z, w)\|^2_{\mathring\mu} = 1$.
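By similar arguments we get for the normalized kernel $\|S_w\|^2_{\mathring\mu} = 1$. Furthermore,
$$\mathrm{Re}\,\langle K_n(z, w), S_w(z)\rangle_{\mathring\mu} = \mathrm{Re}\,\langle K_n(z, w), s_w(z)/S_w(w)\rangle_{\mathring\mu} = K_n(w, w)/S_w(w) > 0.$$
Hence the assertion is proved since
$$\|K_n(z, w) - S_w(z)\|^2_{\mathring\mu} = \|K_n(\cdot, w)\|^2_{\mathring\mu} + \|S_w\|^2_{\mathring\mu} - 2\,\mathrm{Re}\,\langle K_n(z, w), S_w(z)\rangle_{\mathring\mu}. \qquad\square$$

Theorem 9.1.5. Let $\log\mu' \in L_1(\mathring\lambda)$ and let $S_w(z)$ be the normalized Szegő kernel satisfying $S_w(w) > 0$ and $K_n(z, w)$ the normalized reproducing kernel for $\mathcal{L}_n$. Suppose also that the Blaschke product diverges. Then
$$\lim_{n\to\infty}\|K_n(z, w) - S_w(z)\|_{\mathring\mu} = 0 \qquad (9.9)$$
and also
$$\lim_{n\to\infty}\left\|1 - \frac{K_n(z, w)}{S_w(z)}\right\|_{\mathring\lambda_w} = 0 \qquad (9.10)$$
and
$$\lim_{n\to\infty}\left\|1 - \frac{k_n(z, w)}{s_w(z)}\right\|_{\mathring\lambda_w} = 0, \qquad (9.11)$$
where $d\mathring\lambda_w(t) = P(t, w)\,d\mathring\lambda(t)$ with $P$ the Poisson kernel.

Proof. From Theorem 9.1.1, we easily derive by taking the square roots that $K_n(w, w) \to S_w(w)$. The first result now follows easily from the previous lemma. To prove the second one, note that (recall (9.7))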
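with $d\mathring\mu_a = \mu'\,d\mathring\lambda = |\sigma|^2\,d\mathring\lambda$. Thus
$$\left\|1 - \frac{K_n(z, w)}{S_w(z)}\right\|^2_{\mathring\lambda_w} = \|S_w(z) - K_n(z, w)\|^2_{\mathring\mu_a} \le \|S_w(z) - K_n(z, w)\|^2_{\mathring\mu}.$$
Since the latter converges to zero, the assertion (9.10) follows. The last relation is shown similarly. The result (9.10) is a generalization of Theorem 5.8 of Ref. [94].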
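To make Lemma 9.1.4 and Theorem 9.1.5 concrete, here is a small numerical sketch for one special case only: the unit disk with normalized Lebesgue measure ($\sigma \equiv 1$). In that case an orthonormal basis of the rational space with poles at $1/\overline{\alpha}_k$ is given by the classical Malmquist-Takenaka functions, and the kernel has the closed form $k_n(w, w) = (1 - |b(w)|^2)/(1 - |w|^2)$ with $b(z) = \prod_{j=0}^{n}(z - \alpha_j)/(1 - \overline{\alpha}_jz)$, $\alpha_0 = 0$. The choice of measure, the pole locations, and all function names are assumptions made for illustration.

```python
import numpy as np

def mt_basis(z, alphas):
    # Malmquist-Takenaka functions phi_0, ..., phi_n evaluated at z,
    # for the points alpha_0 = 0, alpha_1, ..., alpha_n in the unit disk.
    z = np.asarray(z, dtype=complex)
    points = np.concatenate(([0.0], np.asarray(alphas, dtype=complex)))
    blaschke = np.ones_like(z)
    values = []
    for a in points:
        values.append(np.sqrt(1 - abs(a) ** 2) / (1 - np.conj(a) * z) * blaschke)
        blaschke = blaschke * (z - a) / (1 - np.conj(a) * z)
    return np.array(values)

def kernel(z, w, alphas):
    # Reproducing kernel k_n(z, w) = sum_k phi_k(z) conj(phi_k(w)).
    return np.tensordot(np.conj(mt_basis(w, alphas)), mt_basis(z, alphas), axes=(0, 0))

def both_sides_of_9_8(w, alphas, num_nodes=4096):
    # Left side: squared L2 norm of K_n(., w) - S_w on the circle (quadrature).
    # Right side: 2 [1 - K_n(w, w) / S_w(w)].
    t = np.exp(2j * np.pi * np.arange(num_nodes) / num_nodes)
    knww = kernel(w, w, alphas).real
    K_n = kernel(t, w, alphas) / np.sqrt(knww)
    S_w = np.sqrt(1 - abs(w) ** 2) / (1 - np.conj(w) * t)
    lhs = np.mean(np.abs(K_n - S_w) ** 2)
    rhs = 2 * (1 - np.sqrt(knww * (1 - abs(w) ** 2)))
    return lhs, rhs

w = 0.3 + 0.2j
for n in (1, 5, 20, 40):
    alphas = 0.5 * np.exp(1j * np.arange(1, n + 1))   # poles stay away from the circle
    print(n, *both_sides_of_9_8(w, alphas))
```

With the points $\alpha_k$ kept on the circle of radius 1/2, the Blaschke product diverges, and both printed columns should agree and tend to zero, in line with (9.8) and (9.9).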
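Some direct consequences are

Corollary 9.1.6. Let $\log\mu' \in L_1(\mathring\lambda)$ and suppose the Blaschke product diverges. Let $S_w(z)$ be the normalized Szegő kernel and $K_n(z, w)$ the normalized reproducing kernels for $\mathcal{L}_n$. Then the following convergence results hold:
$$\lim_{n\to\infty}\int\left|\,\left|\frac{K_n(t, w)}{S_w(t)}\right|^2 - 1\,\right|\,d\mathring\lambda_w(t) = 0 \qquad (9.12)$$
$$\lim_{n\to\infty}\int\left|\,\frac{1}{|K_n(t, w)|} - \frac{1}{|S_w(t)|}\,\right|\,d\mathring\lambda_w(t) = 0 \qquad (9.13)$$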
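where $d\mathring\lambda_w(t) = P(t, w)\,d\mathring\lambda(t)$ and $P$ is the Poisson kernel.

Proof. First note that it follows from (9.10) by using $|a - b|^2 \ge (|a| - |b|)^2$ that
$$\lim_{n\to\infty}\int\left|\,\left|\frac{K_n(t, w)}{S_w(t)}\right| - 1\,\right|^2 d\mathring\lambda_w(t) = 0. \qquad (9.14)$$
Denoting the integral in (9.12) as $I_n$, we have by the Schwarz inequality
$$I_n^2 = \left(\int\big|\,|K_n/S_w| - 1\,\big|\;\big|\,|K_n/S_w| + 1\,\big|\,d\mathring\lambda_w\right)^2 \le \left(\int\big|\,|K_n/S_w| - 1\,\big|^2\,d\mathring\lambda_w\right)\cdot\left(\int\big|\,|K_n/S_w| + 1\,\big|^2\,d\mathring\lambda_w\right).$$
The first integral goes to zero for $n \to \infty$ by (9.14). For the second integral use $|a + b|^2 \le 2(|a|^2 + |b|^2)$ to find that it is bounded by
$$2\left[\int\left|\frac{K_n}{S_w}\right|^2 d\mathring\lambda_w + 1\right] = 2\left[\int|K_n|^2|\sigma|^2\,d\mathring\lambda + 1\right] \le 2\left[\int|K_n|^2\,d\mathring\mu + 1\right] = 4.$$
This proves (9.12). For the relation (9.13), we note that it can be written as
$$I = \int\frac{\big|\,|K_n/S_w| - 1\,\big|}{|K_n|}\,d\mathring\lambda_w.$$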
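After squaring this and using the Schwarz inequality, we find that
$$I^2 \le \left(\int|K_n|^{-2}\,d\mathring\lambda_w\right)\cdot\left(\int\big|\,|K_n/S_w| - 1\,\big|^2\,d\mathring\lambda_w\right).$$
The second integral goes to zero for $n \to \infty$ by (9.14), whereas for the first integral we can use Theorem 6.4.3 to find that it is
$$\int\frac{P(t, w)}{|K_n(t, w)|^2}\,d\mathring\lambda(t) = \int d\mathring\mu = 1.$$
This proves (9.13). □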
In the case of the disk and for $w = 0$, the latter results can be found as Theorem 3.3 in Ref. [168].

9.2. Further convergence results and asymptotic behavior

This section includes a number of convergence results of the approximants we obtained, such as local uniform convergence in $\mathbb{O}$. It is a well-known fact that an infinite Blaschke product $B(z) = B_\infty(z)$ will converge to zero locally uniformly (i.e., uniformly on compact subsets of $\mathbb{O}$) if (7.5) is satisfied. See, for example, Ref. [200, p. 281 ff]. This can be used to obtain some other convergence results of the same type.

Theorem 9.2.1. Let $\phi_n$ be the orthonormal functions for $\mathcal{L}_n$ and $\psi_n$ the functions of the second kind. Define $\Omega_n = \psi_n^*/\phi_n^*$ and let the positive real function $\Omega_\mu(z) = \int D(t, z)\,d\mathring\mu(t)$ be the Riesz-Herglotz-Nevanlinna transform of the measure $\mu$. Then $\Omega_n = \psi_n^*/\phi_n^*$ converges to $\Omega_\mu$ uniformly on compact subsets of $\mathbb{O}$ if (7.5) is satisfied. Under the same conditions, $-\psi_n/\phi_n$ converges locally uniformly in $\mathbb{O}^e$.
Tn
~~ 2
Hence O*[l - Qn 1 + Qn] = 2[0 l]Tn and thus
+1
{<$>:&„ i
_ Bn
where B^ = fo#fc and Rn\ and Rn2 are as defined in (6.23). Thus £2M - Qn = BnRn2/^l
in ©.
(9.15)
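Now define the Schur functions $\Gamma$ and $\Gamma_n$ by Cayley transforms of $\Omega_\mu$ and $\Omega_n$:
$$\Gamma = \frac{1 - \Omega_\mu}{1 + \Omega_\mu} \in \mathcal{B} \quad\text{and}\quad \Gamma_n = \frac{1 - \Omega_n}{1 + \Omega_n} \in \mathcal{B}.$$
Then,
$$\Gamma - \Gamma_n = 2\,\frac{\Omega_n - \Omega_\mu}{(1 + \Omega_\mu)(1 + \Omega_n)},$$
which in view of (9.15) gives
$$\frac{\Gamma - \Gamma_n}{2B_n} \in H(\mathbb{O}).$$
On the boundary $\partial\mathbb{O}$, we have $|B_n| = 1$ a.e. and $|\Gamma| \le 1$ and $|\Gamma_n| \le 1$, so that
$$\left|\frac{\Gamma - \Gamma_n}{2B_n}\right| \le 1 \quad\text{on } \partial\mathbb{O}.$$
The maximum modulus theorem then gives
$$|\Gamma - \Gamma_n| \le 2|B_n| \quad\text{in } \mathbb{O}.$$
The right-hand side, and hence also the left-hand side, converges to zero uniformly on compact subsets of $\mathbb{O}$ if (7.5) holds. With inverse Cayley transforms we now find that
$$\Omega_\mu - \Omega_n = \frac{1 - \Gamma}{1 + \Gamma} - \frac{1 - \Gamma_n}{1 + \Gamma_n} = 2\,\frac{\Gamma_n - \Gamma}{(1 + \Gamma)(1 + \Gamma_n)},$$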
which will converge exactly as $\Gamma - \Gamma_n$ does. Indeed, the denominator is bounded away from 0 in a compact subset $K \subset \mathbb{O}$ because $\Omega_\mu$ is continuous in $K$ and thus maps $K$ to a compact subset of the finite plane. Therefore $\Gamma$ maps $K$ to a compact subset of $\mathbb{D}$ so that $|\Gamma(z)| \le r < 1$ and hence $|1 + \Gamma(z)| \ge 1 - r > 0$ in $K$. Because $\Gamma_n$ converges uniformly to $\Gamma$ in $K$, then also $|\Gamma_n(z)| \le \rho < 1$ in $K$ for all $n$. Hence $|1 + \Gamma_n(z)| \ge 1 - \rho > 0$ for all $n$ and $z \in K$. This gives the proof in $\mathbb{O}$. The proof that $-\psi_n/\phi_n$ converges locally uniformly in $\mathbb{O}^e$ is given along the same lines, knowing that $\lim_{n\to\infty}1/B_n(z)$ goes locally uniformly to zero in $\mathbb{O}^e$. □

Practically the same proof can be repeated for the following theorem.

Theorem 9.2.2. Let $K_n(z, w)$ be the normalized kernels for $\mathcal{L}_n$ and $L_n(z, w)$ the associated kernels. Define $\Omega_n(z, w) = L_n(z, w)/K_n(z, w)$. Let $\Omega_\mu(z, w)$ be as defined in Lemma 6.2.2. Consider these as functions in $z$ for fixed $w \in \mathbb{O}$. Then, if (7.5) holds, $\Omega_n(z, w)$ converges uniformly to $\Omega_\mu(z, w)$ on compact subsets of $\mathbb{O}$.

Proof. We can use the result of Section 6.4 to find that
From then on the proof runs exactly as in the previous theorem.
•
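The Cayley-transform step that both proofs rely on is a purely algebraic identity and can be checked numerically. The sketch below uses randomly generated stand-ins for the values of $\Omega_\mu$ and $\Omega_n$ at sample points (an assumption made for illustration; no orthogonal rational functions are computed) and verifies that $\Omega_\mu - \Omega_n = 2(\Gamma_n - \Gamma)/[(1 + \Gamma)(1 + \Gamma_n)]$, together with the resulting bound in terms of $|\Gamma - \Gamma_n|$.

```python
import numpy as np

def cayley(omega):
    # Cayley transform (1 - omega) / (1 + omega): maps Re(omega) > 0 into
    # the unit disk and is its own inverse.
    return (1 - omega) / (1 + omega)

rng = np.random.default_rng(0)
# Hypothetical sample values with positive real part.
omega_mu = rng.uniform(0.1, 3.0, 1000) + 1j * rng.uniform(-3.0, 3.0, 1000)
omega_n = omega_mu + 1e-3 * (rng.standard_normal(1000) + 1j * rng.standard_normal(1000))

gamma, gamma_n = cayley(omega_mu), cayley(omega_n)

direct = omega_mu - omega_n
via_schur = 2 * (gamma_n - gamma) / ((1 + gamma) * (1 + gamma_n))

# Bound used at the end of the proof: |1 + Gamma| >= 1 - |Gamma|.
bound = 2 * np.abs(gamma - gamma_n) / ((1 - np.abs(gamma)) * (1 - np.abs(gamma_n)))

print(np.max(np.abs(direct - via_schur)))
print(bool(np.all(np.abs(direct) <= bound)))
```

The first printed number should be at rounding level, and the second line should read True: once $|\Gamma|$ and $|\Gamma_n|$ stay below 1 on a compact set, smallness of $\Gamma - \Gamma_n$ carries over to $\Omega_\mu - \Omega_n$.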
The previous theorem actually corresponds to Lemma 3.4 in Ref. [60]. Again by the same arguments, we also have Theorem 9.2.3. Let Qn{z) =
•
9.3. Convergence of $\phi_n^*$

In this section we discuss the convergence of the $\phi_n^*$ under the more restrictive condition that the sequence $\{\alpha_n\}$ is bounded away from the boundary $\partial\mathbb{O}$. This means that it is completely included in a compact subset of $\mathbb{O}$. Thus the distance to the boundary $\partial\mathbb{O}$ is then some finite positive value. We shall say that $A = \{\alpha_1, \alpha_2, \ldots\}$ is compactly included in $\mathbb{O}$. In this section we shall use the notation $\mathbb{O}_r$ to denote such a subset of $\mathbb{O}$ for which $\varpi_z(z)/\varpi_0(\alpha_0) \ge r$ for all $z \in \mathbb{O}_r$. Note that this means that $1 - |z|^2 \ge r$ for $z \in \mathbb{D}_r$ and that $\mathrm{Im}\,z \ge r$ for $z \in \mathbb{U}_r$. Thus we suppose that there is some $\delta > 0$ such that
$$\alpha_k \in \mathbb{O}_\delta, \quad k = 0, 1, 2, \ldots.$$
It is clear that this condition is stronger than the divergence of the Blaschke product guaranteed by (7.5), which was our main condition in earlier sections. Since $\mathbb{O}_\delta$ is compact, we have for the case $\mathbb{O} = \mathbb{U}$ that the $\alpha_k$ can not go to $\infty$. Thus the Blaschke product diverges in that case when $\sum\mathrm{Im}\,\alpha_k = \infty$ (see Section 1.3). Note that under this more restrictive condition we can make use of the following bounds for all $z \in \mathbb{O}_r$:
$$|\zeta_n(z)| \le m(r) \qquad (9.16)$$
and for the Poisson kernel we have for some positive $m_\delta$ and $M_\delta$
$$0 < m_\delta \le P(t, \alpha_k)\,|\varpi_0(t)|^2 \le M_\delta < \infty, \quad t \in \partial\mathbb{O}. \qquad (9.17)$$
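The difference between "compactly included" and the weaker divergence condition can be seen on the simplest quantity, $|B_n(0)| = \prod_k|\alpha_k|$ in the disk. The sketch below compares three hypothetical point sequences (chosen for illustration, not taken from the text): points on the circle of radius 1/2, points $\alpha_k = 1 - 1/(k+1)$ for which $\sum(1 - |\alpha_k|)$ diverges although they approach the boundary, and points $\alpha_k = 1 - 1/(k+1)^2$ for which the sum converges.

```python
import numpy as np

def blaschke_abs(z, alphas):
    # |B_n(z)| for the finite Blaschke product with zeros alphas in the unit disk.
    alphas = np.asarray(alphas, dtype=complex)
    return float(np.prod(np.abs((z - alphas) / (1 - np.conj(alphas) * z))))

n = 200
k = np.arange(1, n + 1)
inside = 0.5 * np.exp(1j * k)      # compactly included: |alpha_k| = 1/2
slow = 1 - 1 / (k + 1.0)           # sum(1 - |alpha_k|) diverges
fast = 1 - 1 / (k + 1.0) ** 2      # sum(1 - |alpha_k|) converges

print(blaschke_abs(0, inside))     # geometric decay, 0.5 ** n
print(blaschke_abs(0, slow))       # decays like 1 / (n + 1)
print(blaschke_abs(0, fast))       # tends to 1/2, not to zero
```

The first case decays geometrically, as a bound of the type (9.16) suggests; the second still tends to zero but only algebraically; in the third the Blaschke product converges to a nonzero limit.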
The main result will be that whenever a subsequence $\alpha_{n(s)} \to \alpha \in \mathbb{O}$, then $\phi^*_{n(s)}$ converges to some function $F_\alpha$, depending on $\alpha$. It will turn out that this $F_\alpha(z)$ is the normalized Szegő kernel $S_\alpha(z)$ that appeared in previous sections. In Section 9.6, we shall give stronger results that rely on convergence properties of orthogonal rational functions with respect to varying measures. We think, however, that the results of this section are interesting in their own right. They also include the case where all the poles of the rational function coincide at some point $\alpha$. This is precisely what is considered in Section 9.5. The rational functions orthogonal with respect to varying measures will have all their poles in $\hat\alpha_0$. This means that they are polynomials in the case of the disk and functions of the form $p_n(z)/(z + i)^n$ with $p_n \in \mathcal{P}_n$ in the case of the half plane. We start with the following dichotomy:

Lemma 9.3.1. Suppose that $A = \{\alpha_1, \alpha_2, \ldots\}$ is compactly included in $\mathbb{O}$. Then every subsequence $\{\phi^*_{n(s)}\}$ of $\{\phi_n^*\}$ has a subsequence $\{\phi^*_{n(s(t))}\}$ that either converges locally uniformly in $\mathbb{O}$ to an analytic function $F$, without zeros, or diverges locally uniformly to $\infty$.
Proof. From the Christoffel-Darboux relation (3.2) with $w = z$, we get
$$|\phi_n^*(z)|^2 \ge 1 - |\zeta_n(z)|^2 \ge 1 - m(r)^2 > 0$$
for $z \in \mathbb{O}_r$. The existence of a subsequence that converges or diverges to $\infty$ then follows from the theory of normal families (Montel's theorem), applied to the locally uniformly bounded functions $1/\phi_n^*$. The limit function $F$ has no zeros by Hurwitz's theorem because $\phi_n^*$ has no zeros in $\mathbb{O}$. □

Another dichotomy is

Lemma 9.3.2. Let $A = \{\alpha_1, \alpha_2, \ldots\}$ be compactly included in $\mathbb{O}$. Then the sequence $\{\phi_n^*\}$ is either locally uniformly bounded in $\mathbb{O}$ or diverges locally uniformly to $\infty$. In the first case, that is, when $\{\phi_n^*\}$ is locally uniformly bounded in $\mathbb{O}$, then the sequence $\{k_n(z, z)\}$ with $k_n$ the reproducing kernels converges locally uniformly in $\mathbb{O}$ and $\{\phi_n\}$ converges locally uniformly to zero in $\mathbb{O}$.

Proof. Suppose $\{\phi_n^*\}$ does not diverge locally uniformly to $\infty$. Then, by the previous lemma, there exists a subsequence $\{\phi^*_{n(s)}\}$ that converges locally uniformly to a bounded function in $\mathbb{O}$. Again by the Christoffel-Darboux relation (3.2) with $w = z$, and (9.16), we get
$$k_{n(s)}(z, z) = \sum_{k=0}^{n(s)}|\phi_k(z)|^2 \le \frac{|\phi^*_{n(s)}(z)|^2}{1 - m(r)^2}$$
for $z \in \mathbb{O}_r$. Since $\{k_{n(s)}(z, z)\}$ is nondecreasing, and $\{\phi^*_{n(s)}(z)\}$ is locally uniformly bounded, it follows that the whole sequence $\{k_n(z, z)\}$ is locally uniformly bounded and in particular $\{\phi_n(z)\}$ converges locally uniformly to zero. Hence, the monotonically increasing sequence $\{k_n(z, z)\}$ converges to a continuous limit. Consequently, $\{k_n(z, z)\}$ converges locally uniformly, $\{\phi_n(z)\}$ converges locally uniformly to zero, and $\{\phi_n^*(z)\}$ is locally uniformly bounded. □

We now get immediately

Corollary 9.3.3. Let $A = \{\alpha_1, \alpha_2, \ldots\}$ be compactly included in $\mathbb{O}$. Then the sequence of leading coefficients $\{\kappa_n\}$ is either bounded or diverges to $\infty$.

Proof. If $\{\kappa_n\}$ does not diverge to $\infty$, then, since $\kappa_n = \phi_n^*(\alpha_n)$, this implies that $\{\phi_n^*\}$ does not diverge locally uniformly to $\infty$. Therefore, $\{\phi_n^*\}$ is locally uniformly bounded by Lemma 9.3.1, implying that the sequence $\{\kappa_n\}$ is bounded. □

As we shall see in the next section, if $A$ is compactly included in $\mathbb{O}$, then the condition that the sequence $\{\kappa_n\}$ is bounded and the condition that $\{\phi_n^*\}$ is locally uniformly bounded in $\mathbb{O}$ are equivalent and they are both equivalent with the Szegő condition $\log\mu' \in L_1(\mathring\lambda)$. Below we frequently use the phrase that for a subsequence $\alpha_{n(s)}$ converging to $\alpha \in \mathbb{O}$, there is some $F_\alpha$ to which $\phi^*_{n(s)}$ will converge. This assumes that $\{\phi_n^*\}$ does not diverge locally uniformly to $\infty$, which, by our previous observation, actually means that $\log\mu' \in L_1(\mathring\lambda)$. Thus unless Szegő's condition is satisfied, there can not be such an $F_\alpha$. However, we shall not mention the Szegő condition and give an interpretation to our results as follows. If Szegő's condition holds, then $F_\alpha$ is actually a function; if Szegő's condition does not hold, then the results still hold if $F_\alpha$ is interpreted as being $\infty$, $1/F_\alpha$ as zero, etc. We now can prove

Theorem 9.3.4. Let $\alpha_{n(s)} \to \alpha \in \mathbb{O}$ and suppose that $\phi^*_{n(s)}(z) \to F_\alpha(z)$, locally uniformly in $\mathbb{O}$. Then $[F_\alpha(z)]^{-1} \in H_2$ and hence $F_\alpha$ has no zeros in $\mathbb{O}$.

Proof. We prove only the disk case; but similar observations hold for the half plane. We have to prove that there exists a constant $B$ such that
$$\int|F_\alpha(rt)|^{-2}\,d\lambda(t) \le B \quad\text{for } 0 \le r < 1.$$
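For $\alpha_{n(s)} \to \alpha$, we know that $|\phi^*_{n(s)}(rt)|^{-2}$ converges uniformly to $|F_\alpha(rt)|^{-2}$. Thus
$$\int\frac{d\lambda(t)}{|\phi^*_{n(s)}(rt)|^2} \quad\text{converges to}\quad \int\frac{d\lambda(t)}{|F_\alpha(rt)|^2}.$$
Thus we should prove that
$$\int\frac{d\lambda(t)}{|\phi_n^*(rt)|^2} \le B.$$
Since $|\phi_n^*(z)|^{-2}$ is subharmonic in $\mathbb{D}$, this integral is nondecreasing with $r$, and it is sufficient to prove that this integral is bounded for $r = 1$. Now by Theorem 6.1.9
$$\int\frac{P(t, \alpha_n)}{|\phi_n^*(t)|^2}\,d\lambda(t) = \int d\mu(t) = 1.$$
The result now follows by (9.17).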
We want to prove next that $\lim_{s\to\infty}\kappa^2_{n(s)} = F_\alpha(\alpha)^2$.
Before we can do this, we need several lemmas. Lemma 9.3.5. With <j>n the orthonormal functions and yj/n the associated functions of the second kind, we have a
^
<918)
Proof. First we note that both Re—^
and log
are harmonic in O U 3O and hence by Poisson's formula, lo
S
\ujo(z)\2uTn(an)/mo(ao) i , ,,*, M9
=
f
P(t,z)log
P(t, an)\m0(t)\2 ——- 7 (9.19)
whereas by Theorem 4.2.6
Jensen's formula for the inequality of arithmetic and geometric mean then yields
> exp Hence, using (9.20) for the left-hand side and (9.19) for the right-hand side, we get the result. • Letting n(s) -> oo we get the following consequence: Lemma 9.3.6. With an(s) -> a e O, so that 0*(j) (z) ->• Fa (z), and with Q^ (z) = /D(f, z) dA(O, we have ^^
°!, I 77T ^"7^ I ^
< Re Quiz) = / P(t, z) dfi(t). /
(9.21)
Proof. This follows immediately from the previous lemma since $\psi_n^*(z)/\phi_n^*(z) = \Omega_n(z)$ converges locally uniformly to $\Omega_\mu$ by Theorem 9.2.1. □

Lemma 9.3.7. With the notation of Lemma 9.3.6 we have
Fa(af > exp { - IP(t, a) log [ ^ ^ J
<*(*)} } • (9-22)
Proof. Taking nontangential limits to the boundary in (9.21), we get
Consequently, we have d
m
* -
By Poisson's formula, the right-hand side is equal to $-\log F_\alpha(\alpha)^2$. (Note $F_\alpha(\alpha) > 0$.) This is equivalent with the result. □
We have one more lemma: Lemma 9.3.8. With the notation of Lemma 9.3.6 we have
Proof. Recalling Theorem 6.1.9, we have
By Jensen's inequality for the arithmetic and geometric mean we then get
Splitting the logarithm and using
$$\exp\left\{\int P(t, \alpha_n)\log|\phi_n^*(t)|^2\,d\mathring\lambda(t)\right\} = |\phi_n^*(\alpha_n)|^2 = \kappa_n^2$$
we get the result. □
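Two facts carry the proofs of Lemmas 9.3.5 and 9.3.8: Jensen's inequality between the geometric and arithmetic means with respect to the probability measure $P(t, z)\,d\lambda(t)$, and the mean value property of $\log|g|^2$ for a zero-free analytic $g$. The sketch below checks both in the disk for a hypothetical test density $f(t) = |1 - at|^2$ with $|a| < 1$ (an assumption made for illustration), for which the geometric mean equals $|1 - az|^2$ and the arithmetic mean exceeds it by exactly $|a|^2(1 - |z|^2)$.

```python
import numpy as np

def poisson(t, z):
    # Poisson kernel of the unit disk, P(t, z) = (1 - |z|^2) / |t - z|^2 for |t| = 1.
    return (1 - abs(z) ** 2) / np.abs(t - z) ** 2

def means(f, z, num_nodes=4096):
    # Arithmetic and geometric means of f with respect to P(t, z) dlambda(t),
    # approximated with an equispaced quadrature rule on the unit circle.
    t = np.exp(2j * np.pi * np.arange(num_nodes) / num_nodes)
    weights = poisson(t, z) / num_nodes
    arithmetic = np.sum(weights * f(t))
    geometric = np.exp(np.sum(weights * np.log(f(t))))
    return arithmetic, geometric

a = 0.6
z = 0.3 + 0.4j
am, gm = means(lambda t: np.abs(1 - a * t) ** 2, z)

print(gm, abs(1 - a * z) ** 2)               # mean value property of log|1 - a t|^2
print(am - gm, a ** 2 * (1 - abs(z) ** 2))   # Jensen gap, nonnegative
```

The first line illustrates the same mechanism as the identity for $\kappa_n^2$ above, with $1 - at$ in the role of the zero-free function $\phi_n^*$; the second shows the inequality between the two means with its explicit gap.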
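Knitting the previous results together gives us the following convergence result for the coefficients of highest degree $\kappa_n$.

Theorem 9.3.9. If $\alpha_{n(s)} \to \alpha \in \mathbb{O}$, and if $\log\mu' \in L_1(\mathring\lambda)$, then we have for the leading coefficients $\kappa_n$ of the orthonormal functions
$$\lim_{s\to\infty}\kappa^2_{n(s)} = \exp\left\{-\int P(t, \alpha)\log\left[\frac{\mu'(t)}{P(t, \alpha)\,|\varpi_0(t)|^2}\right]d\mathring\lambda(t)\right\}, \qquad (9.25)$$
where $P(t, z)$ represents the Poisson kernel. If $\int\log\mu'\,d\mathring\lambda = -\infty$, then (9.25) still holds if the right-hand side is interpreted as being $\infty$.

Proof. Denoting the right-hand side as $S(\alpha)$, we know from Lemma 9.3.7 and the proof of Lemma 9.3.8 that $\kappa_n^2 \le S(\alpha) \le F_\alpha(\alpha)^2$. Now since $\kappa^2_{n(s)} = |\phi^*_{n(s)}(\alpha_{n(s)})|^2 \to F_\alpha(\alpha)^2$, the result follows directly. □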
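A further convergence result is

Theorem 9.3.10. When $\log\mu' \in L_1(\mathring\lambda)$, and $\alpha_{n(s)} \to \alpha \in \mathbb{O}$, then
$$\frac{\mu'(t)}{P(t, \alpha)\,|\varpi_0(t)|^2} = \frac{1}{|F_\alpha(t)|^2} \quad\text{a.e. on } \partial\mathbb{O}. \qquad (9.26)$$

Proof. By (9.23)
$$\frac{P(t, \alpha)\,|\varpi_0(t)|^2}{|F_\alpha(t)|^2} \le \mu'(t) \quad\text{a.e. } t \in \partial\mathbb{O}.$$
If strict inequality holds on a set of positive measure, then we would have strict inequality, $F_\alpha(\alpha)^2 > S(\alpha)$, in the previous proof and this is a contradiction with $\kappa^2 \le S(\alpha)$. □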
Finally, we relate $F_\alpha(z)$ to the normalized Szegő kernel.

Theorem 9.3.11. Suppose $\log\mu' \in L_1(\mathring\lambda)$ and $\alpha_{n(s)} \to \alpha \in \mathbb{O}$. Then $\phi^*_{n(s)}(z)$ converges locally uniformly in $\mathbb{O}$ to $S_\alpha(z)$, the normalized Szegő function as defined in (9.6). Thus
-I jm.»•« [Pi,^m]
dkl)
)•
If flog \i' dX = —oo, then this relation still holds if the right-hand side is interpreted as being oo. Proof. We know that 0*(iS)(z) will converge to some Fa(z) in O. We have to show that it is equal to the normalized Szego kernel. Since [F^iz)]'1 is in # 2 and has no zeros in O, its inner-outer factorization takes the form
where rj eT arranges for a normalization Fa (a) > 0. The factor Is is a singular inner function and the last factor is the outer factor. The outer factor is, in view of (9.26), equal to
The inner factor is of the form
Is(z)=expUD(t,z)d&(t)\, where &(t) is some singular (positive) measure. Thus we also have
But by Theorem 9.3.9
—J— = exp ( [p(t, z)dco(t)\ • —L-. Thus fP(t, a)d(o(t) = 0, so that the singular measure co is just zero. This proves the theorem. •
9.4. Equivalence of conditions

We shall here show the equivalence of the Szegő condition and certain other conditions of boundedness, convergence conditions, and density conditions when the interpolation points $\alpha_n$ are bounded away from the boundary $\partial\mathbb{O}$. We have

Theorem 9.4.1. Suppose $A = \{\alpha_1, \alpha_2, \ldots\}$ is compactly included in $\mathbb{O}$. Denote by $\phi_n$ the orthonormal functions, by $\kappa_n > 0$ their coefficient of $B_n$, by $\varepsilon_n$ and $\delta_n$ the parameters from their recurrence relation (4.8), and by $k_n(z, w)$ the reproducing kernels. Then the following statements are equivalent:
(I) $\log\mu' \in L_1(\mathring\lambda)$ (Szegő's condition).
(II) The sequence $\{\kappa_n\}$ is bounded.
(III) The sequence $\{\phi_n^*(z)\}$ is locally uniformly bounded in $\mathbb{O}$.
(IV) The sequence $\{\phi_n^*(w)\}$ is bounded for some $w \in \mathbb{O}$.
(V) The series $\sum$ ...
n
lim TT[|ejfe|2 - \h|2] = lim
n—too •*-•*k=l
n—>oo
n
n
fc
,
n
and hence it follows that (II) $\Leftrightarrow$ (VII). The equivalence (VIII) $\Leftrightarrow$ (I) was shown in Corollary 7.2.4. By the dichotomy of Lemma 9.3.2, we get (III) $\Leftrightarrow$ (IV). The implications (VI) $\Rightarrow$ (V) $\Rightarrow$ (III) $\Rightarrow$ (II) are simple: We just observe that (VI) $\Rightarrow$ (V) by putting $z = w$. This implies that $\{\phi_n\}$ goes locally uniformly to 0 and the Christoffel-Darboux relation then implies (III). By setting $z = \alpha_n$ in (III), we get (II). We presently show (II) $\Rightarrow$ (VI), which implies with all the previous results that (II)-(VII) are equivalent. From (II): $\phi_n^*(\alpha_n) = \kappa_n$ does not diverge to $\infty$ and hence by the dichotomy of Lemma 9.3.2, we get (III). As in the proof of Lemma 9.3.2, we get also (V). Finally, we apply the Schwarz inequality
k=0
to see that (VI) holds.
The remaining link is the equivalence (II) $\Leftrightarrow$ (I). Suppose $\alpha_{n(s)} \to \alpha$. Then Theorem 9.3.9 shows that (II) is equivalent with
$$\log\frac{\mu'(t)}{P(t, \alpha)\,|\varpi_0(t)|^2} \in L_1(\mathring\lambda),$$
where $P(t, z)$ is the Poisson kernel. By (9.17), this is equivalent with (I). □
We also have the following equivalence for a more restrictive situation.

Theorem 9.4.2. With the same notation as in the previous theorem but now under the condition that $\alpha_n \to \alpha \in \mathbb{O}$, the following statements are equivalent with the conditions (I)-(VIII):
(II*) The sequence $\{\kappa_n\}$ converges.
(III*) The sequence $\{\phi_n^*\}$ converges locally uniformly in $\mathbb{O}$.
(VII*) The infinite product $\prod_{k=1}^{\infty}[\,|\varepsilon_k|^2 - |\delta_k|^2\,]$ converges.

Proof. (II*) $\Rightarrow$ (II), (III*) $\Rightarrow$ (III), and (II*) $\Leftrightarrow$ (VII*) are obvious. (III) $\Rightarrow$ (III*) because (III) $\Leftrightarrow$ (VI) and the Christoffel-Darboux relation (3.2). (II) $\Rightarrow$ (II*) because (II) $\Rightarrow$ (III) $\Rightarrow$ (III*) $\Rightarrow$ (II*). □

9.5. Varying measures

Much stronger results than the ones of Section 9.3 were obtained by Li and Pan [135] in the case of the disk. They used the relation between orthogonal rational functions in $\mathcal{L}_n$ and orthogonal polynomials with respect to varying measures. The theory of orthogonal polynomials with respect to varying measures was mainly developed by G. López [138, 139, 140, 141]. In this chapter, we shall give an overview of his work. His results and proofs are for the case of the disk. However, the adaptations needed to include the half plane are only minor. We restrict ourselves to the basic results and include here the definitions and summarize without proof the convergence properties that we shall use later on.

We consider rational functions of degree $m$ that have all their poles in $\hat\alpha_0$. We construct an orthonormal basis with respect to a measure $d\mathring\mu_n$ that depends on $n$. So we consider the spaces
$$\mathcal{L}_m^0 = \mathrm{span}\{\zeta_0(z)^k : k = 0, 1, \ldots, m\} = \left\{\frac{p_m(z)}{\varpi_0(z)^m} : p_m \in \mathcal{P}_m\right\}$$
and the measure
$$d\mathring\mu_n(t) = |w_n(t)|^2\,d\mathring\mu(t), \quad\text{with } w_0 = 1,\ w_n(z) = \frac{\varpi_0(z)^n}{\pi_n(z)},\ n > 0,$$
where as always $\pi_n(z) = \varpi_1(z)\cdots\varpi_n(z)$. In the case of the disk we have obviously $\mathcal{L}_n^0 = \mathcal{P}_n$. We can use our general theory for this special case and construct orthogonal rational functions $\phi_{n,k}$, $k = 0, 1, \ldots$ with $\phi_{n,k} \in \mathcal{L}_k^0$. Since we used a double index here, there should be no confusion with our previous
k,l = 0,1,....
The functions are uniquely defined by making their leading coefficient positive: 4>n,k(z) = vntkSo(z)k + • • •,
iv* > 0-
These functions satisfy the following recurrence relations: 4>n,m(z) = «n,
X
with
V
y/l-\K,m\2
n,m
V
n,m-l
We have used the special superstar notation $f^*$ for $f \in \mathcal{L}_n^0$ to mean
$$f^*(z) = \zeta_0(z)^n f_*(z), \qquad f \in \mathcal{L}_n^0.$$
Thereby we can distinguish between a superstar in $\mathcal{L}_n$ and a superstar in $\mathcal{L}_n^0$. Of course, this should also be clear from the context. Furthermore, it is very easy to obtain that
and it is known [141, (6) p. 201] that
\K,m\
f
l0»,m-l(O!2
- 1 dk(t),
with $C$ a constant that can be chosen independent of $n$ and $m$. Obviously, these $\{\phi_{n,m}\}$ have the following relation with the orthonormal basis for $\mathcal{L}_n$. Set
$$F_{n,m}(z) = t_{n,m}\,\phi_{n,m}(z)\,w_n(z), \qquad t_{n,m} \in \mathbb{T}. \tag{9.27}$$
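Then
$$\langle \phi_{n,k}, \phi_{n,l} \rangle_{\mathring\mu_n} = \delta_{kl} = t_{n,k,l} \int F_{n,k}(t)\,\overline{F_{n,l}(t)}\, d\mathring\mu(t) = t_{n,k,l}\,\langle F_{n,k}, F_{n,l} \rangle_{\mathring\mu},$$
with $t_{n,k,l} = t_{n,l}/t_{n,k} \in \mathbb{T}$, so that $t_{n,k,k} = 1$. Thus the $F_{n,k}$ are orthonormal rational basis functions for $\mathcal{L}_n$, which are obtained by orthonormalizing the basis
$$\{\zeta_0(z)^k\, w_n(z) : k = 0, 1, \ldots, n\}$$
with respect to $d\mathring\mu$. Thus, we see here yet another basis for $\mathcal{L}_n$, which has the unfortunate property that when $n$ changes, the whole set of basis functions changes. The Blaschke products as basis functions, and their orthonormal derivates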
k=0
In the following theorems we shall assume that we select the measure and the point set $A = \{\alpha_1, \alpha_2, \ldots\} \subset \overline{\mathbb{O}}$ such that they satisfy the following conditions:
1. $\mu' > 0$ a.e. (Erdős-Turán condition).
2. $\int d\mathring\mu = 1$, $m_n = \int d\mathring\mu_n < \infty$, $n > 0$.
3. $\sum_{n=1}^{\infty} m_n^{-1/2n} = \infty$ (Carleman condition).
We shall indicate this briefly by writing $(A, \mu) \in \mathcal{AM}$.

The condition $\mu' > 0$ a.e. was considered by Erdős for the interval $[-1, 1]$. Erdős and Turán used it to prove certain convergence results in $L_2$. Therefore, $\mu' > 0$ a.e. is usually called the Erdős-Turán condition. Rahmanov used the same condition to show that the recurrence coefficients for the associated orthogonal polynomials behave asymptotically as the recurrence coefficients of Chebyshev polynomials. In Ref. [141], López considers a non-Newtonian triangular array of interpolation points $\{\alpha_{i,n} : i = 0, \ldots, n;\ n = 0, 1, \ldots\}$, which we do not need, in view of the other material treated here. Thus we give the formulation only for the situation of a Newton sequence of interpolation points. López calls the conditions defining $\mathcal{AM}$ the C-conditions because condition 3 is a Carleman-type condition for the moments $m_n$. In an earlier paper [138], the properties to follow were proved under the stronger assumption that the set $A$ is compactly included in $\mathbb{O}$. An intermediate condition is considered in Ref. [140]:
3'. The Blaschke product with zeros $\alpha_k$ diverges,
instead of the Carleman condition 3. For results where the spectral factor $\sigma$ is needed, this factor should be defined, which requires that we replace the condition $\mu' > 0$ a.e. by the stronger one:
1'. $\log \mu' \in L_1(\mathring\lambda)$ (Szegő condition).
We indicate that $A$ and $\mu$ satisfy conditions 1', 2, and 3 by $(A, \mu) \in \mathcal{AM}'$. If $A$ and $\mu$ satisfy conditions 1', 2, and 3', we write $(A, \mu) \in \mathcal{AM}''$. Since 3' implies 3 and 1' implies 1, $\mathcal{AM}'' \subset \mathcal{AM}' \subset \mathcal{AM}$.

Note that here the points of $A$ are in general points in the closed set $\overline{\mathbb{O}} = \mathbb{O} \cup \partial\mathbb{O}$. For the moment, we only need them to be in the open set $\mathbb{O}$. However, these results are more general and when we treat the boundary case in Chapter 11, where the points of $A$ are all in $\partial\mathbb{O}$, practically all the results from this section and from the two sections to follow will be applicable. Note also that for $A \subset \mathbb{O}$, the functions $w_n = \varpi_0^n/\pi_n$ are continuous on $\partial\mathbb{O}$ and thus condition 2 will be satisfied for any normalized measure. Thus if $A \subset \mathbb{O}$, then condition 2 is actually superfluous, and $\mathcal{AM}''$ is basically defined by 1' and 3' and the normalization $\int d\mathring\mu = 1$. A fortiori this holds when $A$ is compactly included in $\mathbb{O}$. If we allow some $\alpha_i \in \partial\mathbb{O}$, then, in the case of the line, such an $\alpha_i$ is allowed to be $\infty$. In that case a factor $z - \bar\alpha_i$ in the product $\pi_n(z)$ should be replaced by 1. For example, if all $\alpha_i = \infty$, then $w_n(z) = (z + i)^n$ and condition 2 is necessary to ensure that the measure is sufficiently weak at $\infty$.

The following theorems are readily obtained from López [138, 139, 140]. We give them without proof.

Theorem 9.5.1. Let $(A, \mu) \in \mathcal{AM}$. Then the measure
$$d\nu_n(t) = \frac{d\mathring\lambda(t)}{|w_n(t)\,\phi_{n,n+k}(t)|^2},$$
with integer $k$, converges in weak star topology to $d\mathring\mu(t)$, that is,
$$\lim_{n\to\infty} \int f(t)\, d\nu_n(t) = \int f(t)\, d\mathring\mu(t), \qquad \forall f \in C(\partial\mathbb{O}).$$
Theorem 9.5.2. Let $(A, \mu) \in \mathcal{AM}$. Then for integers $k$ and $m$
Hm /i , r w 7 ; l 2 - i
dk(t) = 0.
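Theorem 9.5.3. Let $(A, \mu) \in \mathcal{AM}$. The $\lambda_{n,m}$ are the recurrence coefficients from (9.28) and the $v_{n,m}$ are the leading coefficients from (9.27). Then the following convergence results hold for $k$ integer:
1. $\lim_{n\to\infty} \lambda_{n,n+k+1} = 0$.
2. $\lim_{n\to\infty} \dfrac{v_{n,n+k+1}}{v_{n,n+k}} = 1$.
3. $\lim_{n\to\infty} \dfrac{\phi_{n,n+k+1}(z)}{\phi_{n,n+k}(z)} = \zeta_0(z)$, locally uniformly for $z \in \mathbb{O}^e$.
4. $\lim_{n\to\infty} \dfrac{\phi^*_{n,n+k+1}(z)}{\phi^*_{n,n+k}(z)} = 1$, locally uniformly for $z \in \mathbb{O}$.
5. $\lim_{n\to\infty} \dfrac{\phi^*_{n,n+k}(z)}{\phi_{n,n+k}(z)} = 0$, locally uniformly for $z \in \mathbb{O}^e$.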
Let $(A, \mu) \in \mathcal{AM}'$ and let $\sigma$ be the spectral factor of $\mu$, normalized by $\sigma(\alpha_0) > 0$. Then for $k$ integer 6. lim
4^4T = -rr>
locally uniformly
z
9.6. Stronger results

We shall now give a relation between the $F_{n,m}$ defined in the previous section and our orthogonal rational functions $\phi_n$. This relation will lead us to asymptotic results for the case where $(A, \mu) \in \mathcal{AM}$. More specifically, the $\alpha_k$ need not be contained in a compact subset of $\mathbb{O}$. Let us simplify the notation by setting $F_n = F_{n,n}$. Furthermore, we choose the normalizing constants $t_{n,n} \in \mathbb{T}$ in our definition of $F_n$ such that it gets the same normalization as the $\phi_n$. That is, it is chosen such that
$$F_n(z) = \kappa_{n,n} B_n(z) + \cdots \quad \text{with } \kappa_{n,n} > 0.$$
This means that (setting r\n = fl/Ui zk) _
tn,n — Vn
\
• c/)*n(an)wn(an)
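Define the functions $f_n$ and $g_n$, both in $\mathcal{L}_n^0$, by
$$f_n(z) = \frac{F_n(z)}{w_n(z)} \quad \text{and} \quad g_n(z) = \frac{\phi_n(z)}{w_n(z)}. \tag{9.29}$$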
In our previous notation, $f_n = t_{n,n}\phi_{n,n}$. We shall also need the leading coefficients of $g_n$ and $f_n$ in $\mathcal{L}_n^0$:
^
and vn = f:(ao) = r j n ^ \ .
(9.30)
This means that
$$g_n(z) = \tau_n\,\zeta_0(z)^n + \cdots \quad \text{and} \quad f_n(z) = v_n\,\zeta_0(z)^n + \cdots.$$
First we need the following lemma (see Ref. [135]).

Lemma 9.6.1. With the previous definitions, it holds that
Proof. A simple calculation shows that hn(z) = — is a monic function in £g. Next consider // n (z) = hn(z)wn(z) G £„. We shall show below that it is orthogonal to Cn-\ with respect to A- Since gn/^n and /in are both monic in £g, the functions //„ and 0 n /r n must have the same normalization and hence Hn = >n/Tn. The orthogonality is shown as follows: Let Gn-\(z) = tn-\(z)wn-\(z) be an arbitrary function from Cn-\ obtained by choosing tn-\ e £Q~1 arbitrarily. The ci appearing below are irrelevant constants. Then (Hn, Gn_i
= cx [[fn
The first inner product is zero by the orthogonality of $f_n$ and the second one by Theorem 2.2.1(5), applied in the case where all $\alpha_k = \alpha_0$. This proves the lemma. □

As a last preparation for the main convergence results, we state

Lemma 9.6.2. Let $(A, \mu) \in \mathcal{AM}$ and define with the notation of this section the functions n(z)
=
e Cl and Fn(z) =
€ Cn0.
Then
locally uniformly i Proof. Since we have by the previous lemma
*•„*(««)
then also — 1_ To find the limit of the second term in the right-hand side, we observe that —?—- =
n n
:
< 1 in O.
Hence < 1. And observe that < 1
in<
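whereas by Theorem 9.5.3(5) with $k = 0$
locally uniformly in $\mathbb{O}^e$. The lemma thus follows. □

Now we can give the main results of this section.

Theorem 9.6.3. Let $(A, \mu) \in \mathcal{AM}$. Then
$$\lim_{n\to\infty} \frac{\phi_n^*(z)\,\varpi_n(z)}{\phi_n^*(\alpha_0)\,\varpi_n(\alpha_0)} = \frac{\sigma(\alpha_0)\,\varpi_0(z)}{\sigma(z)\,\varpi_0(\alpha_0)}$$
locally uniformly in $\mathbb{O}$.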
Proof. First note that
$$\phi_n^*(z) = [g_n(z)\,w_n(z)]^* = \eta_n\, w_n(z)\, g_n^*(z), \qquad \eta_n = z_1 \cdots z_n,$$
so that for z = oto we obtain 0*(a?o) =
rjnwn(a0)Tn.
Thus the expression in the previous lemma can be rewritten as
wn(ao)gn(z)vntto(z) - ft (op This goes to 1 locally uniformly in Oe, so that we also have local uniform convergence in O of
By Theorem 9.5.3(6), there exist $c_n \in \mathbb{T}$ such that
$$\lim_{n\to\infty} c_n F_n^*(z) = \lim_{n\to\infty} c_n\,\eta_n\, f_n^*(z)\, w_n(z) = 1/\sigma(z)$$
locally uniformly in O; hence also lim^^oo cnF*(a?o) = l/cr(oto). If cr(ao) > 0, then we choose the cn such that cnF*(ao) > 0. Using this result and wo(ao)mn(z) UTn(ao)mo(z)' we get the convergence result of this theorem.
•
Note that when $\alpha_n \to \alpha \in \mathbb{O}$, then the previous result can also be derived from Theorem 9.3.11, which says that lim
where r](a) e T as in (9.6).
a(a)/mo(ao)
cr(z)
The point $\alpha_0$ in our previous theorem is arbitrary and used for normalization only. It could be any other point $w \in \mathbb{O}$, since indeed by applying the previous theorem in $z$ and $w$, we get for the ratio
$$\lim_{n\to\infty} \frac{\phi_n^*(z)\,\varpi_n(z)}{\phi_n^*(w)\,\varpi_n(w)} = \frac{\sigma(w)\,\varpi_0(z)}{\sigma(z)\,\varpi_0(w)}.$$
This holds locally uniformly for $z \in \mathbb{O}$ and for $w$ fixed, but by the symmetry in the relation, it is immediately seen that this must hold locally uniformly for $(z, w) \in \mathbb{O} \times \mathbb{O}$.

We can also prove a local uniform convergence result for the reproducing kernels. For the case of the unit disk, see Ref. [135].

Theorem 9.6.4. Suppose $(A, \mu) \in \mathcal{AM}'$ and let $k_n(z, w)$ be the reproducing kernel for $\mathcal{L}_n$ and $s_w(z)$ the Szegő kernel (9.3). Then we have local uniform convergence of
$$\lim_{n\to\infty} k_n(z, w) = s_w(z), \qquad (z, w) \in \mathbb{O} \times \mathbb{O}.$$
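Proof. We know that
$$k_n(z, w) = \Big[ \sum_{k=0}^{n} \phi_{n,k}(z)\,\overline{\phi_{n,k}(w)} \Big]\, w_n(z)\,\overline{w_n(w)}.$$
With the Christoffel-Darboux relation for the $\phi_{n,k}$, we get
$$k_n(z, w) = \frac{\phi_{n,n}^*(z)\,\overline{\phi_{n,n}^*(w)} - \zeta_0(z)\,\overline{\zeta_0(w)}\,\phi_{n,n}(z)\,\overline{\phi_{n,n}(w)}}{1 - \zeta_0(z)\,\overline{\zeta_0(w)}}\; w_n(z)\,\overline{w_n(w)}.$$
Thus
$$\frac{k_n(z, w)}{F_n^*(z)\,\overline{F_n^*(w)}} = \frac{1}{1 - \zeta_0(z)\,\overline{\zeta_0(w)}} \left[ 1 - \zeta_0(z)\,\overline{\zeta_0(w)}\; \frac{\phi_{n,n}(z)}{\phi_{n,n}^*(z)}\; \overline{\left( \frac{\phi_{n,n}(w)}{\phi_{n,n}^*(w)} \right)} \right].$$
Since we can use Theorem 9.5.3(6), we get local uniform convergence in $\mathbb{O}$ for some $c_n \in \mathbb{T}$,
$$\lim_{n\to\infty} c_n F_n^*(z) = \frac{1}{\sigma(z)},$$
so that locally uniformly for $(z, w) \in \mathbb{O} \times \mathbb{O}$:
$$\lim_{n\to\infty} F_n^*(z)\,\overline{F_n^*(w)} = \frac{1}{\sigma(z)\,\overline{\sigma(w)}}.$$
By Theorem 9.5.3(5), the same type of convergence holds for
$$\lim_{n\to\infty} \frac{\phi_{n,n}(z)}{\phi_{n,n}^*(z)}\; \overline{\left( \frac{\phi_{n,n}(w)}{\phi_{n,n}^*(w)} \right)} = 0.$$
Thus the result of the theorem follows because $s_w(z) = 1 / [(1 - \zeta_0(z)\,\overline{\zeta_0(w)})\,\sigma(z)\,\overline{\sigma(w)}]$. □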
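As a numerical illustration of Theorem 9.6.4, the following sketch treats the simplest case only, under assumptions that are not part of the text above: the unit disk, with $\mathring\mu$ the normalized Lebesgue measure, so that $\sigma \equiv 1$, $\zeta_0(z) = z$, and the Szegő kernel reduces to $1/(1 - z\bar w)$. For this measure the classical Takenaka-Malmquist system is an orthonormal basis of $\mathcal{L}_n$, and it is used here in place of the $\phi_k$ (the reproducing kernel does not depend on which orthonormal basis is chosen). The function names and the sample points are hypothetical.

```python
import cmath


def blaschke_factor(a, z):
    """Elementary Blaschke factor (z - a) / (1 - conj(a) z) for the unit disk."""
    return (z - a) / (1 - a.conjugate() * z)


def kernel_partial_sum(alphas, z, w):
    """Sum_{k=0}^{n} e_k(z) conj(e_k(w)) for the Takenaka-Malmquist system.

    The system is built on the points 0, alphas[0], ..., alphas[n-1]; the
    leading point 0 plays the role of alpha_0 in the disk.
    """
    points = [0j] + [complex(a) for a in alphas]
    total = 0j
    prod_z = 1 + 0j  # product of the Blaschke factors of earlier points, at z
    prod_w = 1 + 0j  # the same product, at w
    for a in points:
        scale = (1 - abs(a) ** 2) ** 0.5
        e_z = scale / (1 - a.conjugate() * z) * prod_z
        e_w = scale / (1 - a.conjugate() * w) * prod_w
        total += e_z * e_w.conjugate()
        prod_z *= blaschke_factor(a, z)
        prod_w *= blaschke_factor(a, w)
    return total


def szego_kernel_disk(z, w):
    """Szego kernel 1 / (1 - z conj(w)) for normalized Lebesgue measure."""
    return 1 / (1 - z * w.conjugate())


if __name__ == "__main__":
    # Hypothetical points on the circle |alpha| = 1/2: they are compactly
    # included in the disk, so the Blaschke product diverges.
    alphas = [0.5 * cmath.exp(1j * k) for k in range(1, 60)]
    z, w = 0.3 + 0.2j, -0.1 + 0.4j
    for n in (1, 5, 20, 59):
        gap = abs(kernel_partial_sum(alphas[:n], z, w) - szego_kernel_disk(z, w))
        print(n, gap)
```

In this special case the gap equals $|\Theta_n(z)\overline{\Theta_n(w)}| / |1 - z\bar w|$, where $\Theta_n$ is the Blaschke product with zeros $0, \alpha_1, \ldots, \alpha_n$, so the convergence of the kernels is visibly tied to the divergence of the Blaschke product.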
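As an immediate corollary we have

Corollary 9.6.5. Let $(A, \mu) \in \mathcal{AM}'$. Then
$$\lim_{n\to\infty} \phi_n(z) = 0 \quad \text{and} \quad \lim_{n\to\infty} \frac{\phi_n(z)\,\phi_n^*(\alpha_0)}{\phi_n^*(z)} = 0$$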
locally uniformly in $\mathbb{O}$.

Proof. The first limit holds because by the previous theorem with $w = z$ we have local uniform convergence of $k_n(z, z) = \sum_{k=0}^{n} |\phi_k(z)|^2$ for $n \to \infty$. Thus $\phi_n(z) \to 0$ locally uniformly. For the second limit, we observe that by Theorem 9.6.3 we have, after inverting the expression and multiplying it by $\phi_n(z)$,
$$\lim_{n\to\infty} \frac{\phi_n(z)\,\phi_n^*(\alpha_0)\,\varpi_n(\alpha_0)}{\phi_n^*(z)\,\varpi_n(z)} = 0$$
locally uniformly in $\mathbb{O}$ since indeed, by the first part of this corollary, $\phi_n(z)$ converges to zero. When $z$ is in a compact subset of $\mathbb{O}$, then $\varpi_n(\alpha_0)/\varpi_n(z)$ is bounded away from zero. Thus the second limit follows. □

Next we extend the well-known property for orthogonal polynomials $\phi_n/\phi_n^* \to 0$ locally uniformly in $\mathbb{O}$. We show this under the more restrictive condition that $A = \{\alpha_1, \alpha_2, \ldots\}$ is compactly included in $\mathbb{O}$. Recall that this condition in combination with Szegő's condition implies that $(A, \mu) \in \mathcal{AM}'$.

Theorem 9.6.6. Let $A$ be compactly included in $\mathbb{O}$ and $\log \mu' \in L_1(\mathring\lambda)$. Then the following limits hold locally uniformly in the indicated regions:
$$\lim_{n\to\infty} \frac{\phi_n(z)}{\phi_n^*(z)} = 0, \quad z \in \mathbb{O}, \qquad \text{and} \qquad \lim_{n\to\infty} \frac{\phi_n^*(z)}{\phi_n(z)} = 0, \quad z \in \mathbb{O}^e.$$

Proof. The second property follows from the first one by taking superstar conjugates. We shall prove that $|\phi_n/\phi_n^*| \to 0$ locally uniformly in $\mathbb{O}$.
By the Christoffel-Darboux relation (3.1), we have for z = w = n-\
(9.31) k=0
Thus
Since $A$ is compactly included in $\mathbb{O}$, we have $|\zeta_n(z)| \le m(r) < 1$ for all $z \in \mathbb{O}_r$. Hence
$$1 - |\zeta_n(\alpha_0)|^2 \ge 1 - m(r)^2 = C > 0. \tag{9.32}$$
This inequality, in combination with the second limit of the previous corollary, leads to the result. □

For the kernels, we can prove the following theorem, which was given by Li and Pan [135] for the unit disk. It gives a result when no conditions on the divergence or convergence of the Blaschke product are given.

Theorem 9.6.7. Let $\log \mu' \in L_1(\mathring\lambda)$ and let $\sigma$ be an outer spectral factor of $\mu$. The sequence of reproducing kernels $\{k_n(z, w)\}$ for $\mathcal{L}_n$ is locally uniformly bounded in $\mathbb{O} \times \mathbb{O}$. Every limit function $k(z, w)$ will be analytic in $z$ and $w$ for $(z, w) \in \mathbb{O} \times \mathbb{O}$ and for fixed $w \in \mathbb{O}$, $\sigma(z)k(z, w) \in H_2$.

Proof. Clearly, $k_n(z, w)\sigma(z) \in H_2$. Thus $k_n(z, w)\sigma(z)$
eH2.
By Cauchy's formula, we find kn(z, w)a(z)
\J < /
'
row; —
dX(t).
Noting that
J\kn(t, W)\2fj,'(t)dk(t) < J \kn(t, W)\2dfr(t)=kn(w,
W)
and that for t G
\C(t,z)\2 = so that j\C{t,z)\2dX{t)
= Inr^OnroCoo)!"1,
we get by an application of the Schwarz inequality
fcfc
"*wN{/ic<^w}.
{/,*.<,.•w)\ n\t)di{t) 2
kn(w,w) Setting z = w, we get kn(w, w) <
\mo(w)\2 \mw(w)u70(a0)cr(w)2\
Plugging this into the previous inequality, we get kn(z, w)
1 \mz(z)uTw(w)mo(ao)2a(z)2cr(w)2\
Now suppose as before that O r is a compact subset of O with | mz (z) TUQ (a?o) I > r for all z e O r . Define M(r) as the supremum of |(j(z)|~2 for z G O r . This is a finite value since a is outer. Thus for z and w in O r we find that M(r)2 —
r2
Since obviously $|\varpi_0(z)|$ is bounded in $\mathbb{O}_r$, we have found that $\{k_n(z, w)\}$ is locally uniformly bounded as claimed. Thus it is a normal family (in two variables) and this means that there must exist a convergent subsequence with limit function, say $k(z, w)$:
$$\lim_{s\to\infty} k_{n(s)}(z, w) = k(z, w),$$
locally uniformly for $(z, w) \in \mathbb{O} \times \mathbb{O}$. We have to prove that $k(z, w)\sigma(z) \in H_2$ and thus that $k(z, w)\sigma(z)/\varpi_0(z) \in H_2$. This follows from the following relations (the formulation with $rt$ is for the disk, but an easy adaptation can be made for the real line):
k(ruw)o(rt)
kn(s)(rt,w)a(rt) mo(rt)
vro(rt) < lim /
dX{t)
mo(t)
s->ooj
< ^liir^ / \kn(s)(t, w)\2dKt) = lim kn(S)(w, w) = k(w, w) < oo. This proves the theorem.
•
Corollary 9.6.8. Under the conditions of the previous theorem, the sequence $\{\phi_n^*\}$ is locally uniformly bounded in $\mathbb{O}$. Every limit function $F$ is an analytic function without zeros in $\mathbb{O}$.

Proof. By the previous theorem, for $w = z$, the sequence $\{k_n(z, z)\}$ is locally uniformly bounded. Hence, by $k_n(z, z) = \sum_{k=0}^{n} |\phi_k(z)|^2$, it follows that $\{\phi_n\}$ is locally uniformly bounded. Now by the Christoffel-Darboux formula, kn(z, Z)
< kn(z, Z)
it follows that $\{|\phi_n^*|^2\}$ and hence also $\{\phi_n^*\}$ are locally uniformly bounded. None of the limit functions can have a zero in $\mathbb{O}$ by Hurwitz's theorem because $\phi_n^*$ has no zeros in $\mathbb{O}$. □

In Theorem 9.3.11, we have shown that if $\alpha_{n(s)} \to \alpha \in \mathbb{O}$, then $\phi_n^*(z) \to S_\alpha(z)$ with $S_w(z)$ the normalized Szegő kernel. Following Pan [173], we now prove a similar result when it is only assumed that $A = \{\alpha_1, \alpha_2, \ldots\}$ is compactly included in $\mathbb{O}$.

Theorem 9.6.9. Suppose $(A, \mu) \in \mathcal{AM}'$ and let $\sigma$ be the outer spectral factor normalized by $\sigma(\alpha_0) > 0$. Assume also that $A$ is compactly included in $\mathbb{O}$. Then we have locally uniformly in $\mathbb{O}$
n
cr(z)
_ n
Proof. We first prove that lim
, "
^
= 2
»^°° x/1 - IC«(«O)I
,
1
(9.33) 2
a ( a o ) \ / l - l?o(«o)l
To this end, we first note that by the Christoffel-Darboux relation I0>O)| 2
l^(«o)| 2 ,Q_ ,. ( , | 2 (9.34) 1 lf()|2 Theorem 9.6.4 implies that the limit of the second term in (9.34) is given by ,^ *o) +
lim £n_i(c*o, <*o) =
n-»oo
The denominator of the last term in (9.34) is bounded away from zero and the numerator goes to zero by Corollary 9.6.5. This proves (9.33). Noting that
we can rewrite this as lim
\(p*(ao)mn(ao)\
The result now follows by combining this with the result of Theorem 9.6.3. □

Note that the factor $\eta_n$ in this theorem is equal to $|\phi_n^*(\alpha_0)|/\phi_n^*(\alpha_0)$ in the case of the disk. It rotates $\phi_n^*$ such that it becomes normalized by the condition $\phi_n^*(\alpha_0) > 0$ instead of our usual $\phi_n^*(\alpha_n) = \kappa_n > 0$. In the case of the half plane, an extra rotation with $\varpi_0(\alpha_0)/|\varpi_0(\alpha_0)|$ is needed. Note also that the extra factor given by the ratio in the left-hand side of this theorem is in the case of the disk simply $1/\sqrt{1 - |\alpha_n|^2}$ whereas in the case of the half plane it is $1/\sqrt{\operatorname{Im}\alpha_n}$.

As a consequence we can extend the sequence of equivalent conditions given in Theorem 9.4.1.

Corollary 9.6.10. Under the same conditions as in Theorem 9.4.1, and with $\eta_n$ as defined in Theorem 9.6.9, the following statements are equivalent: (B) The conditions (I)-(VIII) of Theorem 9.4.1. (IX) Local uniform convergence of the sequence
(X) Convergence of the sequence $|\phi_n^*(\alpha_0)\varpi_n(\alpha_0)|$

Proof. The condition (I) implies (IX) by the previous theorem. By setting $z = \alpha_0$ in (IX) we get (X) apart from a constant unimodular factor. Hence the sequence of (X) is bounded, but because $A$ is compactly included in $\mathbb{O}$, this means that $\{\phi_n^*(\alpha_0)\}$ is bounded. This is condition (IV) and that closes the circle. □

The previous theorems assumed the Szegő condition. Following the work of Pan, it is possible to replace this condition by $\mu' > 0$ a.e. This Erdős-Turán condition will then give rise to several ratio asymptotic results. This will be investigated in Section 9.8.
9.7. Weak convergence

We also have a weak star convergence result for the measures considered in Theorem 6.1.9.

Theorem 9.7.1. Suppose that the Blaschke product diverges. Let $P(t, z)$ be the Poisson kernel. Then the measure defined by $d\mu_n(t) = P(t, \alpha_n)/|\phi_n(t)|^2\, d\lambda(t)$ converges to $d\mathring\mu$ in the weak star topology, that is, for any continuous $f \in C(\partial\mathbb{O})$,
$$\lim_{n\to\infty} \int f(t)\, d\mu_n(t) = \int f(t)\, d\mathring\mu(t).$$

Proof. The proof is by standard arguments. We know that $\langle f, g \rangle_{\mu_n} = \langle f, g \rangle_{\mathring\mu}$ for any $f, g \in \mathcal{L}_n$ by Theorem 6.1.9. Since $f\overline{g} \in \mathcal{R}_n$ ($t \in \partial\mathbb{O}$) and because the divergence of the Blaschke product implies that the space $\mathcal{R}_\infty$ is dense in $C(\partial\mathbb{O})$ by Theorems 7.1.2 and 7.1.4, the result follows. □

The same kind of proof can be repeated for the measures considered in Theorem 6.4.3.

Theorem 9.7.2. Suppose that the Blaschke product diverges. Let $P(t, z)$ be the Poisson kernel. Then the measure defined by $d\mu_n(t) = P(t, w)/|K_n(t, w)|^2\, d\lambda(t)$ converges to $d\mathring\mu$ in the weak star topology, that is, for any continuous $f \in C(\partial\mathbb{O})$,
$$\lim_{n\to\infty} \int f(t)\, d\mu_n(t) = \int f(t)\, d\mathring\mu(t).$$
For $w = \alpha_0$ and setting $K_n(z) = K_n(z, \alpha_0)$, we find as a special case:
Other weak convergence results in the Erdős-Turán class are given in Theorem 9.8.5 and Theorem 9.8.18.

It is now possible to give a characterization theorem for the Szegő condition, now assuming a Carleman condition, rather than assuming that the point set $A = \{\alpha_1, \alpha_2, \ldots\}$ is compactly included in $\mathbb{O}$, as we did in Section 9.4. We recall that the Carleman condition is that
$$\sum_{n=1}^{\infty} m_n^{-1/2n} = \infty \quad \text{with} \quad m_n = \int |w_n(t)|^2\, d\mathring\mu(t).$$
Theorem 9.7.3. Assume that the Carleman condition holds. Let $k_n(z, w)$ be the reproducing kernels for the spaces $\mathcal{L}_n$. Then the following conditions are equivalent:
(I) $\log \mu' \in L_1(\mathring\lambda)$ (Szegő's condition).
(VI) $k_n(z, w)$ converges locally uniformly in $\mathbb{O} \times \mathbb{O}$.
(XI) $k_n(z, \alpha_0)$ converges locally uniformly in $\mathbb{O}$.
(XII) $k_n(\alpha_0, \alpha_0)$ converges.

Proof. The implication (I) $\Rightarrow$ (VI) is given in Theorem 9.6.4. The implications (VI) $\Rightarrow$ (XI) $\Rightarrow$ (XII) are trivial. Thus it remains to show the implication (XII) $\Rightarrow$ (I). Set
$$K_n(z) = K_n(z, \alpha_0) = \frac{k_n(z, \alpha_0)}{\sqrt{k_n(\alpha_0, \alpha_0)}}.$$
By the reproducing property, we have
fr(t)> f
1 =
J
Fo(Or
If $P(t, z)$ is the Poisson kernel and $t \in \partial\mathbb{O}$, then $P(t, \alpha_0)|\varpi_0(t)|^2 = 1$, so that we can write
$$1 \ge \int |K_n(t)|^2 P(t, \alpha_0)\,\mu'(t)\, d\lambda(t),$$
and by the inequality for arithmetic and geometric mean
$$1 \ge \exp\left\{ \int P(t, \alpha_0) \log[\,|K_n(t)|^2 \mu'(t)\,]\, d\lambda(t) \right\}.$$
By splitting the logarithm and replacing $P(t, \alpha_0)$ again by $1/|\varpi_0(t)|^2$, we find
$$1 \ge \exp\left\{ \int P(t, \alpha_0) \log |K_n(t)|^2\, d\lambda(t) \right\} \exp\left\{ \int \log \mu'(t)\, d\mathring\lambda(t) \right\}.$$
The first factor is, by Poisson's formula, equal to $|K_n(\alpha_0)|^2$. Thus we have shown that
$$|K_n(\alpha_0)|^2 \le \exp\left\{ -\int \log \mu'(t)\, d\mathring\lambda(t) \right\}.$$
Since the left-hand side converges to a finite limit, which is obviously larger than 1, it follows that $\log \mu' \in L_1(\mathring\lambda)$. This proves the theorem. □

9.8. Erdős-Turán class and ratio asymptotics

Using our previous results, it is not difficult to obtain ratio asymptotics when we assume that Szegő's condition $\log \mu' \in L_1(\mathring\lambda)$ is satisfied. We give a simple example.

Lemma 9.8.1. Let $(A, \mu) \in \mathcal{AM}$. lim
n-+oo 0*( z )
Then
mn(z)ujn+i(a0)
(p*+
and (pn(z) T&%{Z)Z
where convergence is locally uniform in the indicated regions. Proof. The first relation, which holds in O, follows immediately from Theorem 9.6.3. The second relation is derived from the first one by taking the substar conjugate. • Note that in the second limit of this lemma, ^ * j _ i (z)
—7T =
_
i
Zn+\-
When $\mathbb{O} = \mathbb{D}$ and all $\alpha_k = 0$, that is, in the polynomial case, then the first part of Lemma 9.8.1 becomes
$$\lim_{n\to\infty} \frac{\phi_{n+1}^*(z)\,\phi_n^*(0)}{\phi_n^*(z)\,\phi_{n+1}^*(0)} = 1.$$
But in the polynomial case, we also have $\lim_{n\to\infty} \phi_{n+1}^*(0)/\phi_n^*(0) = 1$ and thus we then have $\lim_{n\to\infty} \phi_{n+1}^*(z)/\phi_n^*(z) = 1$ locally uniformly in $\mathbb{D}$. Similarly, part (2) of the lemma implies that in the polynomial case $\lim_{n\to\infty} \phi_{n+1}(z)/\phi_n(z) = z$ locally uniformly in $\mathbb{E}$.

The previous lemma illustrated that when Szegő's condition holds, then the ratio asymptotics are easy to obtain. However, ratio asymptotics are typically obtained not assuming the Szegő condition $\log \mu' \in L_1(\mathring\lambda)$, but using instead the much weaker Erdős-Turán condition $\mu' > 0$ a.e. For the polynomial case, several such results are obtained in a paper of Máté, Nevai, and Totik [145]. K. Pan has extended their results to the rational case in several papers. The rest of this section is mostly an account of Pan's generalizations. His results rely on a lemma from the paper [145] that we shall formulate first. Originally it is proved for the unit circle, but a Cayley transform allows us to formulate it for the general case. We give it without proof.

Lemma 9.8.2. Let $\mu$ be a measure on $\partial\mathbb{O}$ with $\mu' > 0$ a.e. and let $p > 0$ and $A$ be real numbers. Assume that ( f[f(t)fi'(t)]Pdi(t)\
If(t)dKt)
holds for every nonnegative continuous function $f \in C(\partial\mathbb{O})$. Then $A \ge 1$.

Before we arrive at a reformulation of Lemma 9.8.1 under the weaker condition $\mu' > 0$ a.e., we shall have to go through several other results. We start with the following theorem, which was proved by K. Pan in [168] in the case of the unit disk.

Theorem 9.8.3. Suppose $\mu' > 0$ a.e. on $\partial\mathbb{O}$ and assume that the Blaschke product diverges. Let $K_n(z, w)$ be the normalized reproducing kernels and $K_n(z) = K_n(z, \alpha_0)$. Then we have
$$\lim_{n\to\infty} \int \left( |K_n(t)|\sqrt{\mu'(t)} - 1 \right)^2 d\mathring\lambda(t) = 0.$$
Proof. Expanding the square in the integrand gives 0
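Furthermore, by Theorem 9.7.2,
$$\int |K_n(t)|^2 \mu'(t)\, d\mathring\lambda(t) \le \int |K_n(t)|^2\, d\mathring\mu(t) = 1. \tag{9.35}$$
Hence it has to be shown that
$$\liminf_{n\to\infty} \int |K_n(t)|\sqrt{\mu'(t)}\, d\mathring\lambda(t) \ge 1. \tag{9.36}$$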
Now, for any nonnegative function / e C(3O), we have by the Schwarz inequality
Squaring both sides and applying the Schwarz's inequality once more on the second integral of the right-hand side gives
Because the last integral converges to $\int f\, d\mathring\mu$ by Theorem 9.7.2, we get
< (liminf fy/^\Kn\dk\
(ffdfiY
We now apply Lemma 9.8.2 to find that (9.36) holds, which proves the theorem.
•
Similar arguments can be used to prove the following theorem, which holds under the more restrictive Szegő condition.

Theorem 9.8.4. Let $K_n(t, w)$ be the normalized reproducing kernel for $\mathcal{L}_n$ and let $\phi_n$ be the $n$th orthonormal rational function. Suppose that $(A, \mu) \in \mathcal{AM}''$. Then
$$\lim_{n\to\infty} \sup_{l \ge 0} \int \left| \frac{|K_n(t, w)|^2}{|K_{n+l}(t, w)|^2} - 1 \right| d\lambda_w(t) = 0, \qquad w \in \mathbb{O}, \tag{9.37}$$
where $d\lambda_w(t) = P(t, w)\, d\lambda(t)$, with $P$ the Poisson kernel.

Proof. Note that for $l \ge 0$, it follows from Theorem 6.4.3 that
$$\int \left| \frac{K_n(t, w)}{K_{n+l}(t, w)} \right|^2 d\lambda_w(t) = \int |K_n(t, w)|^2\, d\mathring\mu(t) = 1.$$
However,
= f\Kn(t, w)K
Kn(f,w) Kn+l(t,w)
n+i(t,w)\
=
dXw(t) \Kn+l(t,w)\2
f\Kn(t, w)Kn+t(t,w)\dp,(t)
> \(Kn(t,w),Kn+l(t,w))ii\
=
Kn(w,w) Kn+i(w,w)'
Since Kn+i (w, w) is monotonically increasing with/, and lim/^oo Kn+i(w9 w) = Sw(w), where Sw is the normalized Szego kernel, we have inf /
Kn(f,w)
dkw(t) >
i>oj Kn+i(t,w)
Kn(w,w) Sw(w)
Now Kn+i
-1
\2 J
r v dXw= / —^ J
Kn+i
2
dXw +
r J
dXw-2
Kn
dXw.
As we have seen, the first integral equals 1. The second integral is also 1, so that we get
0 < sup <2| 1-
dXn
Kn+i
Kn(w,w) Sw(w)
Taking the limit for n —> oo, knowing that Kn(w, w) —> Sw(w),we find sup
- 1
dXw = 0.
Now, by Schwarz's inequality, we may bound the integral / in (9.37) by
y2 dxw}'{i{\jh\~i)i
Kn
+ 1) dku
dK
}-
Using $(|a| + |b|)^2 \le 2(|a|^2 + |b|^2)$, we can bound the first factor by 4. Taking the supremum for $l \ge 0$ and the limit for $n \to \infty$, we know that the second factor goes to zero, and hence the theorem follows. □

Note that for $w = \alpha_0$, we will be able to replace the Szegő condition by the condition $\mu' > 0$ a.e. See Theorem 9.8.6.

The following theorem is obtained as a direct consequence of Theorem 9.8.3. See Pan [168]. Note that it also includes a weak convergence type of result for the Erdős-Turán class.

Theorem 9.8.5. Assume that $\mu' > 0$ a.e. on $\partial\mathbb{O}$. Define $K_n(z) = K_n(z, \alpha_0)$ with $K_n$ the normalized reproducing kernel. Then
$$\lim_{n\to\infty} \int \left|\, |K_n(t)|^2 \mu'(t) - 1 \right| d\mathring\lambda(t) = 0 \tag{9.38}$$
and
$$\lim_{n\to\infty} \int \left|\, |K_n(t)|^{-1} - \sqrt{\mu'(t)} \right| d\mathring\lambda(t) = 0. \tag{9.39}$$
Furthermore, we have the following weak convergence theorem. For any bounded measurable function $f \in L_1(\mathring\lambda)$,
$$\lim_{n\to\infty} \int f(t)\,|K_n(t)|^2 \mu'(t)\, d\mathring\lambda(t) = \int f(t)\, d\mathring\lambda(t) \tag{9.40}$$
and
$$\lim_{n\to\infty} \int f(t)\,|K_n(t)|^2\, d\mathring\mu(t) = \int f(t)\, d\mathring\lambda(t). \tag{9.41}$$
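Proof. To prove the first limit (9.38), we observe that by the Schwarz inequality
$$\int \left|\, |K_n|^2 \mu' - 1 \right| d\mathring\lambda \le \left\{ \int \left( |K_n|\sqrt{\mu'} - 1 \right)^2 d\mathring\lambda \right\}^{1/2} \left\{ \int \left( |K_n|\sqrt{\mu'} + 1 \right)^2 d\mathring\lambda \right\}^{1/2}.$$
By Theorem 9.8.3, the first integral on the right-hand side converges to zero as $n \to \infty$. For the second integral we use $(|a| + |b|)^2 \le 2(|a|^2 + |b|^2)$ and (9.35) to find that it is bounded by 4. Thus the first limit (9.38) follows. For the second limit (9.39), we use again Schwarz's inequality to get
$$\int \left|\, |K_n|^{-1} - \sqrt{\mu'} \right| d\mathring\lambda \le \left\{ \int |K_n|^{-2}\, d\mathring\lambda \right\}^{1/2} \left\{ \int \left( |K_n|\sqrt{\mu'} - 1 \right)^2 d\mathring\lambda \right\}^{1/2}.$$
The first integral on the right-hand side is 1 by Theorem 6.4.3 whereas the second integral converges to zero as $n \to \infty$ by Theorem 9.8.3. The weak convergence result (9.40) is a direct consequence of (9.38). For the formula (9.41) we notice that by (9.40) for $f = 1$, we find
$$\lim_{n\to\infty} \int |K_n(t)|^2 \mu'(t)\, d\mathring\lambda(t) = 1 = \int |K_n(t)|^2\, d\mathring\mu(t).$$
Hence $\lim_{n\to\infty} \int |K_n(t)|^2\, d\mathring\mu_s(t) = 0$ if $d\mathring\mu_s = d\mathring\mu - \mu'\, d\mathring\lambda$ is the singular part of $d\mathring\mu$ and thus also for every bounded $f \in L_1(\mathring\lambda)$
$$\lim_{n\to\infty} \int f(t)\,|K_n(t)|^2\, d\mathring\mu_s(t) = 0,$$
and this implies (9.41) as a consequence of (9.40).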
•
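The limit (9.38) can be observed numerically in the polynomial case. The sketch below is only an illustration under assumptions not made in the text: the unit disk with all $\alpha_k = 0$, where $K_n$ coincides with $\phi_n^*$ up to a unimodular constant (a standard fact for Szegő polynomials), so that $|K_n(t)| = |\phi_n(t)|$ on the circle; and a hypothetical weight $\mu'(\theta) = 1 + 0.8\cos\theta$, which has total mass 1 with respect to $d\theta/2\pi$. The orthonormal polynomials are obtained by Gram-Schmidt on an equispaced grid, which integrates trigonometric polynomials of degree below the number of nodes exactly. All function names are hypothetical.

```python
import cmath
import math


def orthonormal_values(weight, degree, nodes=256):
    """Grid values of the polynomials orthonormal w.r.t. weight(theta) d(theta)/(2 pi).

    Modified Gram-Schmidt is applied to 1, z, ..., z^degree, with the inner
    product evaluated by the equispaced rule on the unit circle.
    """
    thetas = [2 * math.pi * j / nodes for j in range(nodes)]
    points = [cmath.exp(1j * th) for th in thetas]
    dens = [weight(th) for th in thetas]

    def inner(f, g):
        return sum(a * b.conjugate() * d for a, b, d in zip(f, g, dens)) / nodes

    basis = []
    for k in range(degree + 1):
        v = [t ** k for t in points]
        for e in basis:
            c = inner(v, e)
            v = [a - c * b for a, b in zip(v, e)]
        norm = math.sqrt(inner(v, v).real)
        basis.append([a / norm for a in v])
    return dens, basis


def l1_defect(weight, degree, nodes=256):
    """Approximate the integral of | |phi_n|^2 mu' - 1 | over the circle."""
    dens, basis = orthonormal_values(weight, degree, nodes)
    phi_n = basis[-1]
    return sum(abs(abs(p) ** 2 * d - 1) for p, d in zip(phi_n, dens)) / nodes


if __name__ == "__main__":
    def weight(theta):
        # Hypothetical smooth positive weight with total mass 1
        return 1 + 0.8 * math.cos(theta)

    for n in (1, 5, 10, 20):
        print(n, l1_defect(weight, n))
```

For this smooth, strictly positive weight the defect decays quickly with $n$; the theorem itself only requires $\mu' > 0$ a.e. and gives no rate.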
The next theorem is also in one of Pan's papers [168] for the case of the circle. It gives a Nevai type characterization for measures satisfying $\mu' > 0$ a.e.

Theorem 9.8.6. Let $K_n(t, w)$ be the normalized reproducing kernel for $\mathcal{L}_n$ and let $K_n(z) = K_n(z, \alpha_0)$. Suppose that the Blaschke product diverges to 0. Then $\mu' > 0$ a.e. on $\partial\mathbb{O}$ iff
$$\lim_{n\to\infty} \sup_{l \ge 0} \int \left| \frac{|K_n(t)|^2}{|K_{n+l}(t)|^2} - 1 \right| d\mathring\lambda(t) = 0. \tag{9.42}$$
Proof. First suppose that $\mu' > 0$ a.e. Note that for $l \ge 0$, it follows from Theorem 6.4.3 that
$$\int \left| \frac{K_n(t)}{K_{n+l}(t)} \right|^2 d\mathring\lambda(t) = \int |K_n(t)|^2\, d\mathring\mu(t) = 1.$$
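Now, expanding the square gives
$$\int \left( \left| \frac{K_n}{K_{n+l}} \right| - 1 \right)^2 d\mathring\lambda = \int \left| \frac{K_n}{K_{n+l}} \right|^2 d\mathring\lambda + \int d\mathring\lambda - 2 \int \left| \frac{K_n}{K_{n+l}} \right| d\mathring\lambda.$$
As we have seen, the first integral equals 1. Also, the second integral is 1. We show below that
$$\liminf_{n\to\infty} \inf_{l \ge 0} \int \left| \frac{K_n}{K_{n+l}} \right| d\mathring\lambda \ge 1. \tag{9.43}$$
This will imply that
$$\lim_{n\to\infty} \sup_{l \ge 0} \int \left( \left| \frac{K_n}{K_{n+l}} \right| - 1 \right)^2 d\mathring\lambda = 0.$$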
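So, by Schwarz's inequality, we may bound the integral in (9.42) by
$$\left\{ \int \left( \left| \frac{K_n}{K_{n+l}} \right| + 1 \right)^2 d\mathring\lambda \right\}^{1/2} \left\{ \int \left( \left| \frac{K_n}{K_{n+l}} \right| - 1 \right)^2 d\mathring\lambda \right\}^{1/2}.$$
Using $(|a| + |b|)^2 \le 2(|a|^2 + |b|^2)$, we can bound the first factor by 4. Taking the supremum for $l \ge 0$ and the limit for $n \to \infty$, we know that by (9.43) the second factor goes to zero, and hence the theorem follows. Thus it remains to show that (9.43) holds. Therefore we use again Lemma 9.8.2. Assume $f \in C(\partial\mathbb{O})$ is nonnegative. Then using Hölder's inequality a couple of times, we get: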
dX 1/4 Kw+/
The second integral on the right-hand side is bounded by 1, since
The third integral converges to $\int f\, d\mathring\mu$ by Theorem 9.7.2. Thus we have
(liminfinf Application of Lemma 9.8.2 gives then (9.43). This completes the proof of one direction of this theorem.
For the converse statement, we refer to Pan's paper [168]. He in turn refers to a proof given by Li and Saff [136] for the polynomial case on the circle. The adaptations for the real line and for the rational case are simple. □

The following lemma corresponds to Theorem 2.3 in Ref. [166].

Lemma 9.8.7. Let $K_n(z, w)$ be the normalized reproducing kernels for $\mathcal{L}_n$ and define $K_n(z) = K_n(z, \alpha_0)$ and set $v_n = K_n(\alpha_0)$. Then
$$\lim_{n\to\infty} \frac{v_{n-1}}{v_n} = 1 \quad \text{iff} \quad \lim_{n\to\infty} \frac{K_{n-1}(z)}{K_n(z)} = 1 \ \text{locally uniformly in } \mathbb{O}.$$
Proof. Note that by Theorem 6.4.3 1 = /|K n _!(O for
dfjbn(t) =
P(t,ao)dk(t) |K n (0|
2
dX(t) |K n (0| 2
Thus
Now set cn = vn-i/vn and fn{z) = K n _i(z)/K n (z). Note that / n (a 0 ) =
[P(t,z)gn(t)dk(t).
Since $|\varpi_0(t)|^2 P(t, z)$ is locally uniformly bounded for $z \in \mathbb{O}$ (and $t \in \partial\mathbb{O}$), we get
$$|g_n(z)| \le M \int |g_n(t)|\, d\mathring\lambda(t)$$
and thus
$$|g_n(z)|^2 \le M^2 \int |g_n(t)|^2\, d\mathring\lambda(t) = M^2 \|g_n\|^2.$$
Continuing to use the norm in $H_2$, we also have
$$1 = \|f_n\|^2 = \|c_n + g_n\|^2 = c_n^2 + \|g_n\|^2 + 2c_n \operatorname{Re} \int g_n(t)\, d\mathring\lambda(t).$$
Because $g_n(\alpha_0) = 0$, the last integral is zero, so that $\|g_n\|^2 = 1 - c_n^2$. Thus, if $\lim_{n\to\infty} c_n = 1$, then it follows from
$$|g_n(z)|^2 \le M^2 (1 - c_n^2)$$
that $g_n(z)$ converges locally uniformly to 0, that is, $f_n(z)$ converges locally uniformly to 1. The converse statement in the lemma is trivial. □

Note that $K_n(z) = F_n^*(z)$, where $F_n = F_{n,n}$ is defined in the beginning of Section 9.6. Now we finally can formulate the first result on ratio asymptotics for the functions $K_n$.

Theorem 9.8.8. Suppose that the Blaschke product diverges and that $\mu' > 0$ a.e. on $\partial\mathbb{O}$. Let $K_n(z, w)$ be the normalized reproducing kernels and set $K_n(z) = K_n(z, \alpha_0)$. Then
$$\lim_{n\to\infty} \frac{K_{n+1}(z)}{K_n(z)} = 1 \quad \text{locally uniformly in } \mathbb{O}.$$
Proof. By the previous lemma, it is sufficient to show that $\lim_{n\to\infty} v_n/v_{n+1} = 1$ for $v_n = K_n(\alpha_0)$.
Define the function
Note that for t e
- 1 Since F(z) is analytic in O, we can apply Cauchy's theorem to get
\F(ao)\ <
|K n +i(OI 2
- 1 dX(t).
Because kn+i (z, ceo) = kn(z, a0) +
we get after taking the superstar conjugate
Call the last of these expressions Sn(z). Then, evaluating F(z) for z = cto, one finds u
n+\
Thus we have ,.2
1
— U'n+1
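Because the last integral converges to zero for $n \to \infty$ by Theorem 9.8.6, it follows that $v_n/v_{n+1} \to 1$. □

This entails immediately

Corollary 9.8.9. Suppose that the Blaschke product diverges and that $\mu' > 0$ a.e. Let $k_n(z, w)$ be the (nonnormalized) reproducing kernels. Then
$$\lim_{n\to\infty} \frac{k_{n+1}(z, \alpha_0)}{k_n(z, \alpha_0)} = 1, \quad \text{locally uniformly in } \mathbb{O}.$$

Proof. Since by definition $K_n(z) = k_n(z, \alpha_0)/\sqrt{k_n(\alpha_0, \alpha_0)}$ and $v_n = K_n(\alpha_0) = \sqrt{k_n(\alpha_0, \alpha_0)}$, it follows that
$$\frac{k_{n-1}(z, \alpha_0)}{k_n(z, \alpha_0)} = \frac{v_{n-1}}{v_n} \cdot \frac{K_{n-1}(z)}{K_n(z)}.$$
Since $v_{n-1}/v_n \to 1$, the result follows. □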
Now we want to move toward ratio asymptotics for the orthogonal functions. We start with the following lemmas from Ref. [171].
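Lemma 9.8.10. Let $\mu' > 0$ a.e. on $\partial\mathbb{O}$ and suppose the Blaschke product with zeros $A = \{\alpha_1, \alpha_2, \ldots\}$ diverges. Let $k_n(z, w)$ be the reproducing kernels and $\phi_n$ the orthonormal functions. Then
$$\lim_{n\to\infty} \frac{|\phi_n(\alpha_0)|^2}{k_n(\alpha_0, \alpha_0)} = 0.$$

Proof. Note that $k_n(\alpha_0, \alpha_0) = v_n^2$, so that
$$\frac{|\phi_n(\alpha_0)|^2}{k_n(\alpha_0, \alpha_0)} = \frac{k_n(\alpha_0, \alpha_0) - k_{n-1}(\alpha_0, \alpha_0)}{k_n(\alpha_0, \alpha_0)} = 1 - \frac{v_{n-1}^2}{v_n^2}.$$
Because $v_{n-1}/v_n \to 1$ for $n \to \infty$, the result follows. □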
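The hypothesis that the Blaschke product diverges, used in the lemma above and throughout this section, is easy to probe numerically in the disk, where it is equivalent to $\sum_k (1 - |\alpha_k|) = \infty$ (the standard criterion, taken here as an assumption about the disk case). The following sketch compares two hypothetical point sequences at $z = 0$: for $\alpha_k = k/(k+1)$ the sum diverges and $|B_n(0)| = 1/(n+1) \to 0$, whereas for $\alpha_k = 1 - 1/(k+1)^2$ the sum converges and $|B_n(0)| = (n+2)/(2(n+1)) \to 1/2$. The function names are hypothetical, and only the modulus of the product is computed, so the unimodular normalizing factors are ignored.

```python
def blaschke_modulus(alphas, z):
    """|B_n(z)| = prod_k |z - alpha_k| / |1 - conj(alpha_k) z| in the unit disk."""
    value = 1.0
    for a in alphas:
        a = complex(a)
        value *= abs(z - a) / abs(1 - a.conjugate() * z)
    return value


def divergent_points(n):
    """alpha_k = k / (k + 1): sum of (1 - |alpha_k|) is the harmonic series."""
    return [k / (k + 1) for k in range(1, n + 1)]


def convergent_points(n):
    """alpha_k = 1 - 1 / (k + 1)^2: sum of (1 - |alpha_k|) is finite."""
    return [1 - 1 / (k + 1) ** 2 for k in range(1, n + 1)]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n,
              blaschke_modulus(divergent_points(n), 0),
              blaschke_modulus(convergent_points(n), 0))
```

Only the first sequence satisfies the divergence hypothesis; the second one is the kind of point set for which the results of this section do not apply as stated.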
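Lemma 9.8.11. Let $\mu' > 0$ a.e. on $\partial\mathbb{O}$ and suppose the points $A = \{\alpha_1, \alpha_2, \ldots\}$ are compactly included in $\mathbb{O}$. Then
$$\lim_{n\to\infty} \frac{\phi_n(\alpha_0)}{\phi_n^*(\alpha_0)} = 0.$$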
Proof. Because A is bounded away from the boundary, |f n (z)| < m < 1 and thus
Furthermore, we know that |^>n(z)/0*(z)| < 1 in O and thus 1
> 1.
Thus, using the Christoffel-Darboux relation, kn(a0, <x0) —
we get
•*(«o)P 1 — m2 n-^oo kn((Xo, Qfo)
where the last equality follows from the previous lemma.
•
Lemma 9.8.12. Let $(A, \mu) \in \mathcal{AM}$ and let $k_n(z, w)$ and $K_n(z, w)$ be the nonnormalized and normalized reproducing kernels respectively and let $K_n(z) = K_n(z, \alpha_0)$. Then
$$\lim_{n\to\infty} \frac{k_n(z, \alpha_0)}{k_n^*(z, \alpha_0)} = \lim_{n\to\infty} \frac{K_n(z)}{K_n^*(z)} = 0, \quad \text{locally uniformly in } \mathbb{O}^e.$$
Proof. Using the notation of Section 9.5 on varying measures, it can be checked that
where $\phi_{n,n}$ are the orthonormal functions in $\mathcal{L}_n^0$, $w_n(z) = \varpi_0(z)^n/\pi_n(z)$, $\pi_n = \varpi_1 \cdots \varpi_n$, and $f^* = \zeta_0^n f_*$. Consequently,
y
0 (z)Wn(z) 2>
nn
2
V^«(ao,ao) Thus
By Theorem 9.5.3(5), the right-hand side goes to 0, locally uniformly in $\mathbb{O}^e$ when $n \to \infty$. This proves the lemma. □

Lemma 9.8.13. Let $K_n(z, w)$ be the reproducing kernels, $K_n(z) = K_n(z, \alpha_0)$, and $v_n = K_n(\alpha_0)$. Then
and
Proof. This is a matter of simple algebra. First write down the Christoffel-Darboux formula for $k_n(z, \alpha_0)$ and for its superstar conjugate $k_n^*(z, \alpha_0)$. Then either eliminate $\phi_n(z)$ or $\phi_n^*(z)$ between these formulas and the result follows.
•
We are now ready to prove the ratio asymptotics of Lemma 9.8.1 under the weaker assumption $\mu' > 0$ a.e. but with a stronger condition for the set of points $A$. See Ref. [171].

Theorem 9.8.14. Assume that $\mu' > 0$ a.e. on $\partial\mathbb{O}$ and that the Blaschke product with zeros $A = \{\alpha_1, \alpha_2, \ldots\}$ diverges. Then
$$\lim_{n\to\infty} \frac{\phi_{n+1}^*(z)\,\varpi_{n+1}(z)\,\varpi_n(\alpha_0)\,\phi_n^*(\alpha_0)}{\phi_n^*(z)\,\varpi_n(z)\,\varpi_{n+1}(\alpha_0)\,\phi_{n+1}^*(\alpha_0)} = 1, \qquad z \in \mathbb{O},$$
and
where convergence is locally uniform in the indicated regions. Proof. We only prove the first formula since the second is an immediate consequence. Some extensive calculations show that
and
Using this in the first formula of Lemma 9.8.13 leads to (j)*(z)UJn(z)
Xn(z),
with Xn(z)= Since for all z e O and for all« we have \£o(z)$o(an)\ < 1, |0 n (z)/0*(z)| < 1, and because the superstar of Lemma 9.8.12 says that Kn(z)/K*(z) goes to zero locally uniformly in O, it follows that lim^oo Xn(z) = 1, locally uniformly inO.
Taking the ratio of two such expressions and letting n -> oo gives the result we wanted because lim —— = lim
Kn+1(z)
vn(z)
lim
r
lim
Xn+i(z) Xn(z)
and lim —-—,
lim —
,
and
lim Xn(z)
all converge to 1 (the last two locally uniformly in $\mathbb{O}$) by Lemma 9.8.7, Theorem 9.8.8, and our previous observations. •
Under somewhat more restrictive conditions on the set $A$ we have
Theorem 9.8.15. Let $\mu' > 0$ a.e. on $\partial\mathbb{O}$ and suppose the set $A$ is compactly included in $\mathbb{O}$. Then lim — / n ^ v /_N = lim ^ ,_^ = 0,
locally uniformly in Oe,
and lim
n
— = lim - ~ — = 0,
locally uniformly in O.
Proof. The convergence in O follows from the convergence in Oe by taking the superstar conjugates. We only prove the convergence in Oe. By Lemma 9.8.13 we get by dividing out:
(9.44) where
By Lemma 9.8.11, $\lim_{n\to\infty} f_n = 0$, and by the superstar of Lemma 9.8.12, $\lim_{n\to\infty} g_n(z) = 0$, locally uniformly in $\mathbb{O}^e$. Moreover, $|\zeta_n(\alpha_0)| < 1$ and because $A$ is compactly included in $\mathbb{O}$ we have $|\zeta_n(z)| \le M < \infty$, locally uniformly
in Oe. This implies that
are locally uniformly bounded in $\mathbb{O}^e$. Therefore the numerator of the right-hand side of (9.44) and the first term in its denominator converge to zero, locally uniformly in $\mathbb{O}^e$. However, $|\zeta_n(z) - \zeta_n(\alpha_0)| \ge |\zeta_n(z)| - |\zeta_n(\alpha_0)|$ is locally uniformly bounded away from zero for $z\in\mathbb{O}^e$. Thus the ratio in the right-hand side of (9.44) converges to zero locally uniformly in $\mathbb{O}^e$. Also, because $A$ is compactly included in $\mathbb{O}$, it follows that $1 < m \le |\zeta_n(z)| \le M < \infty$, locally uniformly in $\mathbb{O}^e$, so that the factor $\zeta_n(z)$ can be dropped. This proves the theorem. •
Note that this theorem is the same as Theorem 9.6.6, except that the Szegő condition is replaced by the weaker condition $\mu' > 0$ a.e. Under the same conditions, we also have the ratio asymptotics. See Ref. [171].
Theorem 9.8.16. Let $\mu' > 0$ a.e. on $\partial\mathbb{O}$ and suppose the set $A$ is compactly included in $\mathbb{O}$. Let $\epsilon_n\in\mathbb{T}$ be such that $\epsilon_n\phi_n^*(\alpha_0) > 0$ (i.e., $\epsilon_n = |\phi_n^*(\alpha_0)|/\phi_n^*(\alpha_0)$). Then lim
.
= 1, locally uniformly in O.
Proof. In view of Theorem 9.8.14, it is sufficient to show that lim
=
«^oo 7^—7—\i27i—^T~7—\T2\ |0 n + 1 (a o )| z (l - K n (a o )| z )
lj
locally uniformly in O. It follows from the Christoffel-Darboux relation that
= kn(a0, a0) 1 + -—" I ,,. Because A is compactly included in O, it holds that for every z £ O (hence also for z = «o) there is a positive constant m < 1 such that for all n we have
Therefore, noting that kn(ao, G?O) = v% and setting Xn = Xn(ao) and Yn = \(t>n(uo)\2/Vn, we have 10>o)| 2 (l - |£n+i(c*o)|2) - \U<*o)\2)
vj [1 + XnYn] 2 v n+x [1 +Xn+lYn+{\'
=
Because $\lim_{n\to\infty} v_n^2/v_{n+1}^2 = 1$ by Lemma 9.8.12, $\lim_{n\to\infty} Y_n = 0$ by Lemma 9.8.10, and $X_n$ is uniformly bounded, it follows that the right-hand side converges to 1 locally uniformly in $\mathbb{O}$. •
In the beginning of this section, we gave several norm convergence theorems that were valid for $\mu' > 0$ a.e. and involved reproducing kernels. We shall conclude this section with similar theorems of Pan [171] that involve the orthogonal functions.
Theorem 9.8.17. Let $A$ be compactly included in $\mathbb{O}$ and $\mu' > 0$ a.e. on $\partial\mathbb{O}$. Then
$$\lim_{n\to\infty}\int\left(|\phi_n(t)x_n(t)|\sqrt{\mu'(t)} - 1\right)^2 d\mathring\lambda(t) = 0,$$
where x
=
y/
Proof. Using the Christoffel-Darboux relation, we can show that the normalized reproducing kernel Kn(z) = Kn(z, «o) = equals Kn(z) = xn(.z)Xn(.z)*n""H™',
(9.45)
where 1 tn(<Xo)Sn(z)rn(z)rn(cto) , .
V^G3O.
(9.46)
Next observe that
[(\
[(\
The second integral on the right converges to 0 as $n\to\infty$ by Theorem 9.8.3. Using (9.45), we find that the first integral is

$$\int (1-|\chi_n|)^2\,|x_n\phi_n|^2\,\mu'\,d\mathring\lambda.$$

By using the definition of $\zeta_n$ and of the Poisson kernel $P(t,z)$, it follows after some algebra that for $t\in\partial\mathbb{O}$

$$|x_n(t)|^2 = \frac{1}{|\varpi_0(t)|^2 P(t,\alpha_n)}.$$
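Because $A$ is compactly included in $\mathbb{O}$, there exist positive constants $m$ and $M$ (see (9.17)) such that

$$0 < m \le |x_n(t)|^2 \le M < \infty,\quad \forall t\in\partial\mathbb{O}.$$

Furthermore, if we let $M_n = \max\{|1-|\chi_n(t)|| : t\in\partial\mathbb{O}\}$, then we can bound our last integral by

$$\int (1-|\chi_n|)^2\,|x_n\phi_n|^2\,\mu'\,d\mathring\lambda \le M_n^2 M\int|\phi_n|^2\mu'\,d\mathring\lambda \le M_n^2 M.$$

Taking the limit for $n\to\infty$, we find by (9.46) that this goes to zero. Thus also

$$\lim_{n\to\infty}\int\left(|\phi_n x_n|\sqrt{\mu'} - 1\right)^2 d\mathring\lambda = 0.$$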
This concludes the proof.
•
Note that because $|x_n(t)|^{-2} = |\varpi_0(t)|^2 P(t,\alpha_n)$, we can write the limit of the previous theorem as
= 0,
MiW = l
Finally, we give the general formulation of Theorem 2.6 in Ref. [171].
Theorem 9.8.18. Let P(t,z) denote the Poisson kernel. Suppose that /z' > 0 a.e. on dO and that A = {a\, a?2, • • •} is compactly included in Q. Then 'n(t) - 1| dX(t) = 0,
lim
n'n(t) =
n'(t)
/
For any bounded f e L\(k)
lim [f(t)\(Pn(t)\2^n(t)dk(t)= n^ooj
ff(t)dk(t),
n'n(t) = -
J
(9.49)
Proof. Let xn be as in Theorem 9.8.17. Then by the Schwarz inequality, we find for the left-hand side of (9.47) \\
=
\\
< (
l\\
The first of these integrals converges to zero by Theorem 9.8.17 and the second one is bounded, so that (9.47) follows. For (9.48), we have again by the Schwarz inequality
= ( [\4>nXnrl\\4>nXn\\/iJ-l\dk' \4>nxn\-2dX The first of these integrals equals rP(t,an)\mo(t)\2dk(t)
=
rP(t,an)dk(t)
=1
by Theorem 9.7.1. The second of these integrals converges to 0 by Theorem 9.8.17, so that (9.48) follows. The relation (9.49) is an immediate consequence of (9.47). •
9.9. Root asymptotics
We finally give some examples of root asymptotics. Such asymptotics typically involve potential theory, which we briefly recall. See, for example, Ref. [193]. Let us first consider the normalized counting measure $\nu_n^A$ defined by

$$\nu_n^A = \frac{1}{n}\sum_{j=1}^{n}\delta_{\alpha_j},$$

which assigns a point mass at $\alpha_j$, taking into account the multiplicity of $\alpha_j$. It is called the zero distribution of the polynomial $\pi_n^*(z) = \prod_{k=1}^{n}(z-\alpha_k)$. For any measure $\nu$ with compact support in $\mathbb{C}$, the logarithmic potential is defined as

$$V_\nu(z) = -\int\log|z-x|\,d\nu(x).$$

For example,

$$V_{\nu_n^A}(z) = -\int\log|z-x|\,d\nu_n^A(x) = -\frac{1}{n}\sum_{j=1}^{n}\log|z-\alpha_j|.$$

Thus obviously

$$|\pi_n^*(z)|^{1/n} = \exp\{-V_{\nu_n^A}(z)\}.$$

Now assume that $\nu_n^A$ converges to some measure $\nu^A$ with compact support in the weak star topology, which we denote as $\nu_n^A \xrightarrow{*} \nu^A$, that is,

$$\lim_{n\to\infty}\int f(x)\,d\nu_n^A(x) = \int f(x)\,d\nu^A(x),\quad \forall f\in C(\mathbb{C}).$$

This convergence implies

$$\lim_{n\to\infty}|\pi_n^*(z)|^{1/n} = \exp\{-V_{\nu^A}(z)\},\quad z\in\mathbb{C}\setminus\operatorname{supp}(\nu^A) \tag{9.50}$$

and

$$\limsup_{n\to\infty}|\pi_n^*(z)|^{1/n} \le \exp\{-V_{\nu^A}(z)\},\quad z\in\mathbb{C},$$

where convergence is uniform on each compact set of the indicated region.
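The identity between the $n$th root of the monic polynomial and the potential of its zero distribution can be checked numerically. The following is only a sketch under assumptions that are not in the text: the point set `alphas`, the limit point `a`, and the evaluation point `z` are hypothetical, chosen so that the counting measures tend to a point mass at `a`.

```python
import cmath
import math


def log_potential_counting(z, points):
    """V(z) = -(1/n) * sum_j log|z - alpha_j| for the normalized counting measure."""
    n = len(points)
    return -sum(math.log(abs(z - p)) for p in points) / n


def root_of_monic_poly(z, points):
    """|prod_k (z - alpha_k)|**(1/n), evaluated through logarithms to avoid overflow."""
    n = len(points)
    return math.exp(sum(math.log(abs(z - p)) for p in points) / n)


# Hypothetical data: alpha_k -> a, so the zero distributions tend to delta_a.
a = 0.3 + 0.2j
alphas = [a + 0.5 / (k + 1) * cmath.exp(1j * k) for k in range(1, 2001)]
z = 1.5 - 0.5j

lhs = root_of_monic_poly(z, alphas)                   # |pi_n^*(z)|^(1/n)
rhs = math.exp(-log_potential_counting(z, alphas))    # exp{-V(z)} for the counting measure
limit = abs(z - a)                                    # exp{-V(z)} for the point mass at a
```

Here `lhs` and `rhs` agree up to rounding for every `n`, while `lhs` approaches `limit` only as `n` grows, which is the content of (9.50) in this special case.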
Set
7=1
Let vA be the measure associated with the point set A = {an}o°, just as vA was associated with the set A = {oik}™- Note that from the definition of VVA and V T, we have
Vife) = - Jlog \z-x\dvj(x) = - I ^ l o g |Z - a y | 7= 1
Therefore Vvj(z) = VVA(Z). The reason for introducing W* (z) is that 7Tn(z)=ZnW*(l/z) nn (z) = W* (z)
for U.
Thus the introduction of vA allows us to state that in the case of D (for z ^ O ) lim \nn(z)\l/n = lim \z\\7t*n(l/z)\l/n = |z|exp{-Vr(l/z)} = |z|exp{-V w .(z)},
(9.51)
with z = 1/z, and in the case of U lim \nn(z)\l/n
= lim |7r:(z)| 1/n = exp{-V v7 (z)} = «p{-V,,.(z)},
(9.52)
with z = z. This holds for any z e C o \ supp(v i ), where C o = C \ {0} for D and Co = C for U and where A refers to the set A = [a\, &2,. • •}• Furthermore, since ET0*(Z)/'z&o*(z) equals z for D and 1 for U, VVA(Z)},
where C o = C \ {0} for D and C o = C for U. A combination of (9.50) and (9.51, 9.52) leads to
ZGC0,
(9.53)
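Lemma 9.9.1. Let $B_n$ denote the Blaschke products with zeros from the set $A_n = \{\alpha_k\}_{k=1}^{n}$. Suppose that for the zero distributions $\nu_n^A$ we have the weak star convergence $\nu_n^A \xrightarrow{*} \nu^A$ with $\operatorname{supp}(\nu^A)$ compact. Then

$$\lim_{n\to\infty}|B_n(z)|^{1/n} = \exp\{\lambda(z)\}\quad\text{and}\quad \lim_{n\to\infty}|B_n(z)|^{-1/n} = \exp\{\lambda(\hat z)\}$$

for $z\in\mathbb{C}_0\setminus\{\operatorname{supp}(\nu^A)\cup\operatorname{supp}(\nu^{\hat A})\}$, with

$$\lambda(z) = \int\log|\zeta_z(x)|\,d\nu^A(x), \tag{9.54}$$

where

$$\zeta_z(x) = \frac{x-z}{1-\bar z x}\ \text{ for } \mathbb{D}\quad\text{and}\quad \zeta_z(x) = \frac{x-z}{x-\bar z}\ \text{ for } \mathbb{U}.$$

For $z\in\mathbb{C}_0$ we have

$$\limsup_{n\to\infty}|B_n(z)|^{1/n} \le \exp\{\lambda(z)\}\quad\text{and}\quad \limsup_{n\to\infty}|B_n(z)|^{-1/n} \le \exp\{\lambda(\hat z)\}.$$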
Proof. We give the proof for O = D, since for O = U it is even simpler. Because we have for z € Co \ {supp(yA) U supp(vA)} lim \Bn(z)\l/n
n->oo
= Izr 1 exp{-V V A( Z ) + V > ( l / z ) }
while VVA(Z) = -
[\og\x-z\dvA(x)
and VVA{\/Z)
= - flog\l-zx\
dvA(x) + log |z|
we get the first result. For the second formula we note that \Bn(z)\~l the result is then immediate. The proofs for the lim sup results are similar.
= |#«*(z)| = |# n (2)| and •
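The root asymptotics of the lemma can be illustrated for the disk when the zeros accumulate at a single point. This is a sketch only, assuming $\nu^A = \delta_a$ (so that $\exp\{\lambda(z)\}$ should equal $|z-a|/|1-\bar a z|$); the sequence `alphas` and the points `a`, `z` are hypothetical.

```python
import math


def blaschke_root(z, alphas):
    """|B_n(z)|**(1/n) for the disk, using |zeta_k(z)| = |z - a_k| / |1 - conj(a_k) z|."""
    n = len(alphas)
    log_sum = sum(
        math.log(abs(z - p) / abs(1 - p.conjugate() * z)) for p in alphas
    )
    return math.exp(log_sum / n)


# Hypothetical zeros inside the unit disk with alpha_k -> a.
a = 0.4 - 0.1j
alphas = [a * (1 - 1 / (k + 1)) for k in range(1, 5001)]
z = -0.2 + 0.6j

value = blaschke_root(z, alphas)
expected = abs((z - a) / (1 - a.conjugate() * z))   # exp{lambda(z)} when nu^A = delta_a

# Reflection in the unit circle: |B_n(1/conj(z))| = 1/|B_n(z)|.
z_hat = 1 / z.conjugate()
value_hat = blaschke_root(z_hat, alphas)
```

The product `value * value_hat` equals 1 up to rounding, which is the reflection identity used in the proof for the second formula.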
Note that $\lambda(\hat z) = -\lambda(z)$. Since $\lim f_{n+1}/f_n = L$ implies $\lim f_n^{1/n} = L$, we can deduce the root asymptotics for $\phi_n^*$ from the known ratio asymptotics of Theorem 9.8.14.
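The step from ratio asymptotics to root asymptotics can be seen on a toy positive sequence. A minimal sketch, assuming the hypothetical sequence $f_n = n^3 L^n$ with $L = 0.7$ (not taken from the text); logarithms are used so that the small values of $f_n$ do not underflow.

```python
import math

L = 0.7  # hypothetical limit of the ratios


def log_f(n):
    """log of the hypothetical positive sequence f_n = n**3 * L**n."""
    return 3 * math.log(n) + n * math.log(L)


n = 2000
ratio = math.exp(log_f(n + 1) - log_f(n))   # f_{n+1}/f_n = L * (1 + 1/n)**3
root = math.exp(log_f(n) / n)               # f_n**(1/n) = L * n**(3/n)
```

Both quantities tend to `L`, but the root converges more slowly than the ratio; the implication runs in this direction only, since root asymptotics do not force the ratios to converge.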
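Theorem 9.9.2. If $\mu' > 0$ a.e. on $\partial\mathbb{O}$ and if $A$ is compactly included in $\mathbb{O}$, then $\lim_{n\to\infty}|\phi_n^*(z)|^{1/n} = 1$, locally uniformly in $\mathbb{O}$.
Proof. It follows from Theorem 9.8.14 that

$$\lim_{n\to\infty}\left|\frac{\phi_n^*(z)\,\varpi_n(z)}{\phi_n^*(\alpha_0)\,\varpi_n(\alpha_0)}\right|^{1/n} = 1,\quad\text{locally uniformly in } \mathbb{O}. \tag{9.55}$$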
Because ratio asymptotics imply root asymptotics, it follows from Theorem 9.8.16 that lim
2
1 - |f n (« 0 )| 2
1.
However, because $A$ is compactly included in $\mathbb{O}$, there is some $m$ such that $1 - m^2 \le 1 - |\zeta_n(\alpha_0)|^2 \le 1$, and thus $\lim_{n\to\infty}[1-|\zeta_n(\alpha_0)|^2]^{1/n} = 1$. Consequently, $\lim_{n\to\infty}|\phi_n^*(\alpha_0)|^{1/n} = 1$.
(9.56)
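Furthermore, for any $z\in\mathbb{O}_r$, there exist $0 < m(r) < M(r) < \infty$ such that

$$m(r) \le \left|\frac{\varpi_n(z)}{\varpi_n(\alpha_0)}\right| \le M(r),$$

which means that

$$\lim_{n\to\infty}\left|\frac{\varpi_n(z)}{\varpi_n(\alpha_0)}\right|^{1/n} = 1,\quad\text{locally uniformly in } \mathbb{O}. \tag{9.57}$$

Finally, from (9.55), (9.56), and (9.57), the proof is achieved. •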
For the sequence {4>n(z)} we have the following result. Theorem 9.9.3. Assume \x' > 0 a.e. in 3O and let A be compactly included in O and v% —% vA (hence supp(vA) C A compact). Then for the case of the disk lim | «-»oo
n
= \z\ ~l
VVA(1/Z)}
=
and for the case of the half plane lim \(/)n(z)\l/n = exp{-V v .(z) +
n-»oo
VVA(Z)}
= exp{A(z)},
locally uniformly in Oe \ supp(vA). The function k(z) is as in (9.54). Proof. Taking the substar in the previous theorem gives l/n
lim
= 1, locally uniformly in Oe.
Thus by Lemma 9.9.1
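locally uniformly in $\mathbb{O}^e\setminus\operatorname{supp}(\nu^{\hat A})$. This completes the proof because $|\varpi_{0*}(z)/\varpi_0^*(z)|$ is $|z|^{-1}$ for $\mathbb{O} = \mathbb{D}$ and it is 1 for $\mathbb{O} = \mathbb{U}$. •
As an example, we consider the disk situation where $\lim_{n\to\infty}\alpha_n = a\in\mathbb{D}$ so that $\nu^A = \delta_a$. In this case $\operatorname{supp}(\nu^A) = \{a\}$ and $\operatorname{supp}(\nu^{\hat A}) = \{\hat a\} = \{1/\bar a\}$. Furthermore,

$$V_{\delta_a}(z) = -\int\log|z-x|\,d\delta_a(x) = -\log|z-a|\quad\text{and}\quad V_{\delta_a}(\hat z) = -\log|1/\bar z - a| = \log|z| - \log|1-\bar z a|.$$

Thus for $z\in\mathbb{E}\setminus\{\hat a\}$,

$$\lim_{n\to\infty}|\phi_n(z)|^{1/n} = |z|^{-1}\exp\{V_{\nu^A}(\hat z) - V_{\nu^A}(z)\} = |z|^{-1}\exp\{\log|z| - \log|1-\bar a z| + \log|z-a|\} = \left|\frac{z-a}{1-\bar a z}\right|.$$

So when $a = 0$, we have $\lim_{n\to\infty}|\phi_n(z)|^{1/n} = |z|$. In particular, if $\alpha_n = 0$ for all $n$, then

$$\lim_{n\to\infty}|\phi_n(z)|^{1/n} = |z|,\quad\text{locally uniformly in } \mathbb{E},$$

is recovered (compare with Ref. [40]).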
A similar computation for the case $\mathbb{O} = \mathbb{U}$ leads to the same result: If $\lim_{n\to\infty}\alpha_n = a\in\mathbb{O}$, then $\lim_{n\to\infty}|\phi_n(z)|^{1/n} = |\zeta_a(z)|$, locally uniformly in $\mathbb{O}^e\setminus\{\hat a\}$.
Next we prove $n$th root asymptotics for the para-orthogonal rational functions

$$Q_n(z,\tau_n) = \phi_n(z) + \tau_n\phi_n^*(z),\quad \tau_n\in\mathbb{T}.$$
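Theorem 9.9.4. Let $\log\mu'\in L_1(d\mathring\lambda)$ and let $A$ be compactly included in $\mathbb{O}$. Then
1. $\lim_{n\to\infty}|Q_n(z,\tau_n)|^{1/n} = 1$, locally uniformly in $\mathbb{O}$.
2. If, moreover, $\nu_n^A \xrightarrow{*} \nu^A$ with $\operatorname{supp}(\nu^A)$ compact, then $\lim_{n\to\infty}|Q_n(z,\tau_n)|^{1/n} = \exp\{\lambda(z)\}$ locally uniformly in $\mathbb{O}^e\setminus\operatorname{supp}(\nu^{\hat A})$, where $\lambda(z)$ is as in (9.54).
Proof. We shall give the proof only for $\mathbb{O} = \mathbb{D}$. For $\mathbb{O} = \mathbb{U}$, the proof is completely similar.
1. Set $\chi_n(z)$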
~
0*(O)
Then for any z e D W(W(O)] \Tn+l \_ (1 — a n z)0*(z)0 n+1 (O) JL ?n + (j)n(z)/(p*{z)
Xn(z)
J
Clearly by Theorem 9.8.14 limn^oo yn = 1 whereas 1 - l0w+l(z)/0w+iCg)l
l + l0n(z)/0ife)l
< |A
~
n
. < l + \
~
I _ |0n(z)/0*(z)|
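Since by Theorem 9.6.6 $\lim_{n\to\infty}\phi_n(z)/\phi_n^*(z) = 0$, locally uniformly in $\mathbb{D}$, we find that $\lim_{n\to\infty}|A_n| = 1$. Thus we have proved that $\lim_{n\to\infty}|\chi_{n+1}(z)/\chi_n(z)| = 1$, which implies that $\lim_{n\to\infty}|\chi_n(z)|^{1/n} = 1$. Now, since $Q_n(z,\tau_n) = \phi_n^*(z)\chi_n(z)/(1-\bar\alpha_n z)$, $\lim_{n\to\infty}|1-\bar\alpha_n z|^{1/n} = 1$ locally uniformly in $\mathbb{D}$, and $\lim_{n\to\infty}|\phi_n^*(0)|^{1/n} = 1$, the proof follows.
2. Since $\phi_n(z)\ne 0$ in $\mathbb{E}$, we can write $Q_n(z,\tau_n) = \phi_n(z)\,[1 + \tau_n\phi_n^*(z)/\phi_n(z)]$. This gives

$$|\phi_n(z)|\left(1 - \left|\frac{\phi_n^*(z)}{\phi_n(z)}\right|\right) \le |Q_n(z,\tau_n)| \le |\phi_n(z)|\left(1 + \left|\frac{\phi_n^*(z)}{\phi_n(z)}\right|\right)$$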
and consequently
By Theorem 9.6.6, lim^oo 0*(z)/0 n (z) = 0, locally uniformly in E, which yields immediately lim^oo \
OuaO
Theorem 9.9.5. Let $\log\mu'\in L_1(\mathring\lambda)$ and assume that $A$ is compactly included in $\mathbb{O}$. Then $\lim_{n\to\infty}\|Q_n\|_\infty^{1/n} = 1$.
Proof. Because the Cayley transform maps $H_\infty(\mathbb{D})$ onto $H_\infty(\mathbb{U})$, it is sufficient to give the proof for $\mathbb{D}$. If we take $z\in\mathbb{D}$, then $|Q_n(z,\tau_n)| \le \|Q_n\|_\infty$ and therefore
and

$$\liminf_{n\to\infty}|Q_n(z,\tau_n)|^{1/n} = \lim_{n\to\infty}|Q_n(z,\tau_n)|^{1/n} = 1 \le \liminf_{n\to\infty}\|Q_n\|_\infty^{1/n}. \tag{9.58}$$
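Since $A$ is compactly included in $\mathbb{D}$ and hence $\hat A$ is bounded away from $\mathbb{T}$, it follows that $Q_n(z,\tau_n)\in\mathcal{L}_n$, having poles in the set $\hat A = \{\hat\alpha_1, \hat\alpha_2, \ldots\}$, is analytic in a disk of radius $\rho > 1$ for any $n = 1, 2, \ldots$. Hence we have

$$\|Q_n\|_\infty = \max_{z\in\mathbb{T}\cup\mathbb{D}}|Q_n(z,\tau_n)| \le \max_{z\in\mathbb{T}_\rho}|Q_n(z,\tau_n)|,\quad \mathbb{T}_\rho = \{z\in\mathbb{C} : |z| = \rho\}.$$

By Theorem 9.9.4(2), we have for $z\in\mathbb{T}_\rho\subset\mathbb{E}$

$$\lim_{n\to\infty}|Q_n(z,\tau_n)|^{1/n} = \exp\{\lambda(z)\}.$$

Define $\gamma(z) = \exp\{\lambda(z)\}$. Then for any $\epsilon > 0$, there exists some $n_0$ such that for all $n > n_0$ and $z\in\mathbb{T}_\rho$

$$\big|\,|Q_n(z,\tau_n)|^{1/n} - \gamma(z)\big| < \epsilon$$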
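or, equivalently,

$$\gamma(z) - \epsilon < |Q_n(z,\tau_n)|^{1/n} < \epsilon + \gamma(z).$$

Hence, for sufficiently large $n$,

$$\Big[\max_{z\in\mathbb{T}_\rho}|Q_n(z,\tau_n)|\Big]^{1/n} \le \max_{z\in\mathbb{T}_\rho}|Q_n(z,\tau_n)|^{1/n} < \max_{z\in\mathbb{T}_\rho}\{\epsilon + \gamma(z)\}.$$

Let $\rho e^{i\theta}$ be a point in $\mathbb{T}_\rho$ where $\gamma(z)$ reaches its maximum. Then

$$\|Q_n\|_\infty^{1/n} \le \Big[\max_{z\in\mathbb{T}_\rho}|Q_n(z,\tau_n)|\Big]^{1/n} < \epsilon + \gamma(\rho e^{i\theta}), \tag{9.59}$$

where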
y(pew) = p~l exp{Vv.(p^) - VvA(ei6/p)}. If we now let p tend to 1 + , then we find lim/0^i+ y(pel°) = 1. In combination with (9.59) this yields GJI^
(9.60)
Finally, by (9.58) and (9.60) the proof follows.
•
The previous results were used in Ref. [44] to obtain estimates of the rate of convergence of $R_n(z,\tau_n)$ to $\Omega_\mu$. We give examples of such estimates in the next section.
9.10. Rates of convergence
The root asymptotics of the last section can be used to prove geometric convergence for rational interpolants to $\Omega_\mu$ and for R-Szegő quadrature formulas. Let us start with the rational interpolants of Theorem 9.2.1. It was shown that both sequences of rational functions
^
and ^
=^
=$£
converge locally uniformly to $\Omega_\mu(z)$, the first one in $\mathbb{O}^e$ and the second one in $\mathbb{O}$. We now add to this that the convergence is geometric. For similar results see also Ref. [172].
Theorem 9.10.1. With the notation

$$E_n(z) = \Omega_\mu(z) - \Omega_n(z) = \Omega_\mu(z) - \frac{\psi_n^*(z)}{\phi_n^*(z)},$$

assuming that $\nu_n^A \xrightarrow{*} \nu^A$ with $\operatorname{supp}(\nu^A)$ compact and that the Blaschke product diverges, we have
1. For all $z\in\mathbb{O}$: $\limsup_{n\to\infty}|E_n(z)|^{1/n} \le \exp\{\lambda(z)\} < 1$.
2. For all $z\in\mathbb{O}^e\cup\{\infty\}$: $\limsup_{n\to\infty}|E_{n*}(z)|^{1/n} \le \exp\{\lambda(\hat z)\} < 1$.
Here $\lambda(z)$ is as in (9.54).
Proof. By using a substar conjugate, part (2) is immediately obtained from part (1). We thus only have to prove the first part. From the proof of Theorem 9.2.1, we obtain
\En(z)\ < \Bn(z)\\Hz)l
/* = —
Hence $0 < m(z) \le |h(z)| \le M(z) < \infty$ in compact subsets of $\mathbb{O}$. Therefore, we also have $\limsup|E_n(z)|^{1/n} \le \limsup|B_n(z)|^{1/n}$, and the result follows for $z\in\mathbb{C}_0\cap\mathbb{O}$ from Lemma 9.9.1. In the case of the disk, the result is also true for $z = \alpha_0 = 0$ because $E_n(0) = 0$ for all $n$. •
A similar result can be obtained for the interpolants $L_n(z,w)/K_n(z,w)$ of Theorem 9.2.2 and the interpolants $R_n(z) = -P_n(z)/Q_n(z)$ of Theorem 9.2.3. We give the explicit formulation for the latter case. The proof is as the previous one.
Theorem 9.10.2. Suppose $R_n(z) = -P_n(z)/Q_n(z)$ is the rational interpolant to $\Omega_\mu$ of Theorem 9.2.3 with error $E_n(z) = \Omega_\mu(z) - R_n(z)$. Suppose that $\nu_n^A \xrightarrow{*} \nu^A$, with $\operatorname{supp}(\nu^A)$ compact, and that the Blaschke product diverges. Then we have the following estimates for the convergence rates:
1. For all $z\in\mathbb{O}$: $\limsup_{n\to\infty}|E_n(z)|^{1/n} \le \exp\{\lambda(z)\} < 1$.
2. For all $z\in\mathbb{O}^e\cup\{\infty\}$: $\limsup_{n\to\infty}|E_n(z)|^{1/n} \le \exp\{\lambda(\hat z)\} < 1$.
Here again $\lambda(z)$ is as defined in (9.54).
To end this section, we give the convergence for the R-Szegő quadrature formulas introduced in Chapter 5. For each $n = 1, 2, \ldots$, let $\{\xi_{nk}\}_{k=1}^{n}$ be the zeros of the para-orthogonal rational functions $Q_n(z,\tau_n)$ and consider the corresponding R-Szegő formulas

$$I_n\{f\} = \sum_{k=1}^{n}\lambda_{nk}f(\xi_{nk}),$$

which approximate the integral $I_\mu\{f\} = \int f(t)\,d\mathring\mu(t)$. We first prove the convergence of the quadrature formula when $f\in C(\partial\mathbb{O})$, that is, when $f$ is continuous on $\partial\mathbb{O}$.
Theorem 9.10.3. If the Blaschke product diverges, then, with the previous notation,

$$\lim_{n\to\infty}I_n\{f\} = I_\mu\{f\},\quad \forall f\in C(\partial\mathbb{O}).$$
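The simplest instance of such a quadrature can be tried numerically. The sketch below is an illustration under assumptions that go beyond the text: it takes the classical polynomial case on the unit circle (all $\alpha_k = 0$) with the normalized Lebesgue measure, where the para-orthogonal polynomial is $z^n + \tau$, the nodes are the $n$th roots of $-\tau$, and all weights equal $1/n$. The test function `f` and the choice `tau = 1` are hypothetical.

```python
import cmath
import math


def szego_nodes(n, tau):
    """Zeros of z**n + tau with |tau| = 1: the n-th roots of -tau, all on the unit circle."""
    base = cmath.phase(-tau) / n
    return [cmath.exp(1j * (base + 2 * math.pi * j / n)) for j in range(n)]


def quadrature(f, n, tau=1.0 + 0j):
    """Equal-weight rule (1/n) * sum_j f(xi_j) for the normalized Lebesgue measure."""
    return sum(f(x) for x in szego_nodes(n, tau)) / n


# Exactness on z**k for |k| <= n - 1: the moments are 1 for k = 0 and 0 otherwise.
n = 8
moment_errors = [
    abs(quadrature(lambda t, k=k: t ** k, n) - (1.0 if k == 0 else 0.0))
    for k in range(-(n - 1), n)
]

# A function analytic in a neighborhood of the circle; its mean value over T is f(0) = 1/2.
f = lambda t: 1 / (2 - t)
exact = 0.5
errors = [abs(quadrature(f, m) - exact) for m in (4, 8, 16)]
```

For this `f` the error decays roughly like $2^{-n}$, a small instance of the geometric convergence for analytic integrands discussed later in this section.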
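Proof. If $\mathcal{R}_n = \mathcal{L}_n\cdot\mathcal{L}_{n*}$ and $\mathcal{R}_\infty = \bigcup_{n=0}^{\infty}\mathcal{R}_n$, then we know from Chapter 7 that $\mathcal{R}_\infty$ is dense in the space of continuous functions $C(\partial\mathbb{O})$ iff the Blaschke product diverges. Let $f\in C(\partial\mathbb{O})$ and let $\epsilon > 0$. Take $g\in\mathcal{R}_\infty$ such that

$$\max_{t\in\partial\mathbb{O}}|f(t)-g(t)| < \frac{\epsilon}{2}.$$

Then there is a $k$ such that $g\in\mathcal{R}_k$. Since $I_n\{\cdot\}$ is exact in $\mathcal{R}_k$ for $n > k$, we have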
$$|I_\mu\{f\} - I_n\{f\}| \le |I_\mu\{f\} - I_\mu\{g\}| + |I_n\{g\} - I_n\{f\}| \le \int|f-g|\,d\mathring\mu + \sum_{j=1}^{n}\lambda_{nj}|g(\xi_{nj}) - f(\xi_{nj})| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$$

whenever $n > k$ since $\lambda_{nj} > 0$ and $\int d\mathring\mu = \sum_{j=1}^{n}\lambda_{nj} = 1$. This proves the convergence. •
In the case of the disk, one can prove along the lines of Ref. [54, pp. 127-129] that the quadrature converges not only for the continuous functions, but for all integrable functions $f\in L_1(\mu)$. Such a theorem in the case where the $\alpha_k$ are cyclically repeated can be found in Ref. [32]. When all $\alpha_k = 0$, the proof was given in Ref. [122]. We add to the previous results the rate of convergence. The rate of convergence of the quadrature formula $I_n\{f\}$, where $f$ is a function analytic in
a region containing $\partial\mathbb{O}$, will depend on how large this region is. Thus, to get better estimates, we make the following construction. Let $\mathcal{G}$ denote the set of all regions (closed and connected) $G$ in $\mathbb{C}$ such that $\partial\mathbb{O}\subset G$ and $G\cap\{A_0\cup\hat A_0\} = \emptyset$ and that have a boundary $\Gamma = \partial G$ which is a finite union of rectifiable Jordan curves. Let $f\in H(G)$ be an analytic function in $G$. In order to simplify the notation, we give a separate treatment for $\mathbb{O} = \mathbb{D}$ and $\mathbb{O} = \mathbb{U}$. Suppose first that $\mathbb{O} = \mathbb{D}$. Then we have by Cauchy's theorem
'
2ni 2ni JT Z -
Since T C G, we get by Fubini's theorem
Let $R_n(z) = -P_n(z)/Q_n(z)$ be the rational interpolant for $\Omega_\mu(z)$ constructed from the para-orthogonal functions $Q_n = \phi_n + \tau_n\phi_n^*$ and the associated functions $P_n = \psi_n - \tau_n\psi_n^*$, with $\tau_n\in\mathbb{T}$. As before, $\xi_{nj}\in\partial\mathbb{O}$ denote the zeros of $Q_n$. It holds then that for $f\in H(G)$, $G\in\mathcal{G}$
f(x)\ By Theorem 5.3.3 we know that
Rn{x) = [D(t,x)diin(t) = J
Hence
and an alternative expression for the error is En = W J - /„{/} = ^ 7
7=
In the case O = U, we have similarly
and
and
!„{/} = Jf(t)dUn(t) = ^ JR^x) (-^^)
dx.
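Theorem 9.10.4. Suppose the Blaschke product diverges and $\nu_n^A \xrightarrow{*} \nu^A$, with $\operatorname{supp}(\nu^A)$ compact, $f\in H(G)$, and with $G\in\mathcal{G}$ ($\mathcal{G}$ is defined above) and let the error of the R-Szegő formula $I_n\{f\}$ be given by $e_n = I_\mu\{f\} - I_n\{f\}$. Then

$$\limsup_{n\to\infty}|e_n|^{1/n} \le \rho < 1,\quad \rho = \max\{\rho_1,\rho_2\},$$

where $\rho_1 = \max_{x\in\Gamma\cap\mathbb{O}}\exp\{\lambda(x)\}$, $\rho_2 = \max_{x\in\Gamma\cap\mathbb{O}^e}\exp\{\lambda(\hat x)\}$ with $\lambda(z)$ as in (9.54), and $\Gamma = \partial G$.
Proof. By our previous derivations, we have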
\en\ = I W 1 - In{f}\ < max|^(x) - Rn(x)\^- [ \F(x)\dx with 2x
and and
ff o o rr T T
F(x) F(x) =
^2lIS^
^ forR . 1 ++ x22
Because F is bounded o n F c G and V is rectifiable, there exists afinitepositive constant M such that \n\
jeer
|
M
( )
n
( ) |
We can use Theorem 9.10.2 (1) for x e rt• = T H O to establish the bound limsup \en\x/n < limsup [max
|^(JC)
-
< max limsup | ^ ( x ) - i xeF
where $\lambda(x)$ is given by (9.54). If $x\in\Gamma_e = \Gamma\cap\mathbb{O}^e$, one can use the second part of Theorem 9.10.2 to deduce in a similar way that $\limsup_{n\to\infty}|e_n|^{1/n} \le \max\exp\{\lambda(\hat x)\} = \rho_2$, and this proves the theorem. •
Note that we can always take $\Gamma_e = \Gamma\cap\mathbb{O}^e$ to be the reflection of $\Gamma_i = \Gamma\cap\mathbb{O}$ in the boundary $\partial\mathbb{O}$. That is, $\Gamma_e = \{z\in\mathbb{C} : \hat z\in\Gamma_i\}$. In that case of course $\rho = \rho_1 = \rho_2$.
Take for example the case of the disk with $\lim_{n\to\infty}\alpha_n = a\in\mathbb{D}$. We know then that $\exp\{\lambda(z)\} = |\zeta_a(z)|$. To find $\rho$, we have to find its maximum on a curve $\Gamma_i\subset\mathbb{D}$, which contours all the $\alpha_n$. Thus the more these $\alpha_n$ are concentrated in the neighborhood of $a$, the better we can make our estimate. If in the extreme case all $\alpha_n = a$, then we could take a small circle around $a$, so that we can make $|\zeta_a(z)|$ (and hence also $\lambda(z)$ and thus also $\rho$) as small as we want. For more results about the rates of convergence for multipoint Padé and multipoint Padé-type approximants and corresponding quadrature formulas see Refs. [44, 40].
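The effect of shrinking the contour in the extreme case where all $\alpha_n = a$ can be made concrete. A minimal sketch, assuming a hypothetical point `a` in the disk and circles of hypothetical radii around it; the maximum of $|\zeta_a(z)|$ on each circle is estimated by sampling.

```python
import cmath
import math


def zeta_abs(z, a):
    """|zeta_a(z)| = |z - a| / |1 - conj(a) z| for the unit disk."""
    return abs((z - a) / (1 - a.conjugate() * z))


def rho_on_circle(a, r, samples=720):
    """Sampled maximum of |zeta_a| on the circle |z - a| = r (assumed to lie inside the disk)."""
    return max(
        zeta_abs(a + r * cmath.exp(2j * math.pi * k / samples), a)
        for k in range(samples)
    )


a = 0.3 + 0.2j              # hypothetical common value of all alpha_n
radii = [0.5, 0.1, 0.01]    # hypothetical contour radii, each circle inside the disk
rhos = [rho_on_circle(a, r) for r in radii]
```

The values in `rhos` decrease with the radius, in line with the remark that the estimate improves as the contour tightens around `a`.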
10. Moment problems
In this chapter we will study the moment problem. This is equivalent to the Nevanlinna-Pick problem for the disk or the half plane. For a finite measure on $\mathbb{T}$, we may define the moments

$$c_k = \int t^k\,d\mu,\quad k\in\mathbb{Z}.$$

The trigonometric moment problem is the following: Given the moments $c_k$, $k\in\mathbb{Z}$, find the corresponding measure on $\mathbb{T}$. Necessary and sufficient conditions for the existence of a solution and for the uniqueness are to be given. If possible, find a way to construct the solution. All this is related to orthogonal polynomials with respect to a linear functional defined on the set of polynomials with the given moments. A quadrature formula, based on these polynomials, then gives a way to construct a solution for the problem. In our case, we shall consider more general moments, which are related to orthogonal rational functions and we will treat again the unit circle and the real line in parallel.
10.1. Motivation and formulation of the problem
We suppose that we are given a linear functional $M$ defined on $\mathcal{R}_\infty = \mathcal{L}_\infty\cdot\mathcal{L}_{\infty*}$ with $\mathcal{L}_\infty = \bigcup_{n=0}^{\infty}\mathcal{L}_n$ and $\mathcal{L}_{\infty*} = \{f : f_*\in\mathcal{L}_\infty\}$. We suppose it satisfies and
$$M\{ff_*\} > 0,\quad \forall f\ne 0,\ f\in\mathcal{L}_\infty.$$

This functional induces an inner product on $\mathcal{L}_\infty$ (or equivalently in $\mathcal{L}_{\infty*}$) that is given by
By our assumption, this inner product is Hermitian and positive definite. When $\mu$ is a positive finite measure on $\partial\mathbb{O}$, then

$$M\{f\} = \int f(t)\,d\mathring\mu(t),\quad f\in\mathcal{R}_\infty$$

is an example of such a functional. We also assume that $M$ is bounded and normalized by the condition $M\{1\} = 1$. In this chapter we shall address the problem of finding conditions under which such a measure exists that will represent the functional $M$ defined on the space $\mathcal{R}_\infty$ by an infinite set of generalized moments. Such a measure will be called a solution of the moment problem. We can motivate this as follows. We recall from Lemma 6.1.5 that the Riesz-Herglotz-Nevanlinna kernel $D(t,z)$ has the following formal Newton series expansion (all $\alpha_k\in\mathbb{O}$):
D(t, z) = k=\
with
Thus we have by formal integration
Qu(z) = / D(t, z) dji(t) = 1 + with /xo = 1,
f mo(t)dfr(t)
jjik =
—,
A: = 1, 2, . . .
as the general moments. These moments $\mu_k$, $k = 0, 1, 2, \ldots$ define the functional $M$ on $\mathcal{L}_\infty$. However, because
J we also know (e.g., by partial fraction decomposition) all the values = m
r \mo(t)\2dfr(t) J l^o(«o)lX(O[^*(O]*
r diLit) J l^o(«o)lX*(O[7rr(O]*' k,l = 0 , 1 , 2 , . . .
in terms of $\mu_k$, $k = 0, 1, 2, \ldots$, and these of course define $M$ on $\mathcal{R}_\infty = \mathcal{L}_\infty\cdot\mathcal{L}_{\infty*}$. Note that in the case of $\partial\mathbb{O} = \overline{\mathbb{R}}$, we have formulated the moment problem to include a possible mass point at $\infty$ (recall the integrals are over the extended real line $\overline{\mathbb{R}} = \mathbb{R}\cup\{\infty\}$). This is necessary because the moments are generated by rational basis functions. In the classical situation of polynomials, there cannot be a solution of the moment problem with a mass point at infinity because all the basis functions $\{x, x^2, x^3, \ldots\}$ tend to infinity at $\infty$. Thus in the classical situation, the moment problem where the integrals are over $\mathbb{R}$ and the moment problem where the integrals are over $\overline{\mathbb{R}}$ are the same. In our situation the basis functions are rational and almost all of them tend to zero at $\infty$, so that a point mass at $\infty$ should not be excluded.
Almost all the previous results still hold when $\langle\cdot,\cdot\rangle_{\mathring\mu}$ is replaced by $\langle\cdot,\cdot\rangle_M$ and $\int\cdot\,d\mathring\mu(t)$ is replaced by $M\{\cdot\}$. Therefore, we shall keep the notation with the indication of $\mu$ instead of $M$, in anticipation of the measure $\mu$ we want to find. In fact, the positivity of the inner product we defined in this section guarantees the existence of at least one solution. This follows, for example, from Theorem 9.2.1. See also our Theorem 10.3.1 to be proved later (more precisely Corollaries 10.3.2 and 10.3.3). We also keep the familiar notation of $\phi_n$ for the orthonormal rational functions, $\psi_n$ for the functions of the second kind, $k_n(z,w)$ for the reproducing kernels, etc. Thus, if we know that a solution exists, the only problem that remains is whether this solution is unique or not. When the Blaschke products diverge, then the uniqueness of the solution is guaranteed by Theorem 9.2.1, so that in this chapter we shall be mainly interested in the other situation, that is,

$$\sum_k(1-|\alpha_k|) < \infty\ \text{ for } \mathbb{D}\quad\text{and}\quad \sum_k\frac{\operatorname{Im}\alpha_k}{1+|\alpha_k|^2} < \infty\ \text{ for } \mathbb{U}. \tag{10.1}$$
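For the disk, the dichotomy behind (10.1) can be seen on the value of the finite Blaschke product at the origin, where $|B_n(0)| = \prod_k|\alpha_k|$. This is a sketch with two hypothetical choices of moduli, $|\alpha_k| = 1 - 1/k^2$ (the sum in (10.1) is finite) and $|\alpha_k| = 1 - 1/k$ (the sum is infinite), both for $k \ge 2$.

```python
def partial_product(moduli):
    """Product of the moduli |alpha_k|, which equals |B_n(0)| in the disk case."""
    p = 1.0
    for m in moduli:
        p *= m
    return p


n = 10_000
convergent = [1 - 1 / k**2 for k in range(2, n + 1)]   # sum of (1 - |alpha_k|) is finite
divergent = [1 - 1 / k for k in range(2, n + 1)]       # sum of (1 - |alpha_k|) is infinite

p_conv = partial_product(convergent)   # telescopes to (n + 1) / (2 n), tending to 1/2
p_div = partial_product(divergent)     # telescopes to 1 / n, tending to 0
```

In the first case the product stays bounded away from zero (the Blaschke product converges); in the second it tends to zero, which is what "the Blaschke product diverges" means here.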
However, we shall formulate the results so that they cover both situations, that is, the case of the divergent as well as the convergent Blaschke product. We shall follow rather closely the analysis given in Akhiezer's book on the classical moment problem [2].
10.2. Nested disks
Let us introduce the notation $\mathbb{O}_0$ for the set $\mathbb{O}$, excluding the points $\alpha_k$, $k = 1, 2, \ldots$, and similarly $\mathbb{O}_0^e$ excludes the reflected points $\hat\alpha_k$, $k = 1, 2, \ldots$ from $\mathbb{O}^e$:

$$\mathbb{O}_0 = \mathbb{O}\setminus A = \{z\in\mathbb{O} : z\ne\alpha_k,\ k = 1, 2, \ldots\},\qquad \mathbb{O}_0^e = \mathbb{O}^e\setminus\hat A = \{z\in\mathbb{O}^e : z\ne\hat\alpha_k,\ k = 1, 2, \ldots\}.$$
For fixed $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$, we consider the values of

$$s = R_n(z,\tau) = -\frac{P_n(z,\tau)}{Q_n(z,\tau)} = -\frac{\psi_n(z) - \tau\psi_n^*(z)}{\phi_n(z) + \tau\phi_n^*(z)},\quad \tau\in\mathbb{T},\ n\ge 1. \tag{10.2}$$

Note that $Q_n(z,\tau)$ are the para-orthogonal rational functions and $P_n(z,\tau)$ are the associated functions of the second kind. It turns out that when $\tau$ runs over $\mathbb{T}$ and $z$ is fixed, then $s$ will describe a circle $K_n(z)$ in the complex plane. Another equation that describes this is
+ S
Since {<)>„,
- \fnJZ)
+ S(j)n(z)\2
\\ - S\2 - \\ + S\2
1 - l?n(z)l2
1 - l?o(z)l2
n-\
=
Y,Wn(z)+S(i)n(z)\2. k=0
lfseKn
(z), then the first term vanishes and the equation of the circle becomes
Recall that (see Section 2.1)
f
2
forID)
'
We shall denote the closed disk bounded by $K_n(z)$ as $\Delta_n(z)$. Thus $s\in\Delta_n(z)$ when in the previous relation the equality sign is replaced by a $\le$ sign. Since $z\in\mathbb{O}$ iff $1 - |\zeta_0(z)|^2 > 0$, and $z\in\mathbb{O}^e$ iff $1 - |\zeta_0(z)|^2 < 0$, we find that $K_n(z)$ is completely in the right (left) half plane iff $z$ is in $\mathbb{O}$ (is in $\mathbb{O}^e$). It follows immediately from the equations for the disks $\Delta_n(z)$ that they are nested: $\Delta_{n+1}(z)\subset\Delta_n(z)$ for $n = 1, 2, \ldots$. The intersection of all the disks is denoted as $\Delta_\infty = \Delta_\infty(z)$. This $\Delta_\infty$ is thus either a circular disk (with positive radius) or it reduces to a point. The center $c_n$ and the radius $r_n$ of $K_n$ are given by
\
|//»*|2 _ \fh 12
By the determinant formula and Christoffel-Darboux relation, we may rewrite the latter as
\Bn-i(z)\ [ao)ujz(z) kn-i(z,z)'
rn=2
(10.3)
where kn-\ (z, w) is the reproducing kernel. More explicitly we have
\Bn.x(z)\ rn =
\Bn-i(z)\ 2\lmz\ kn-i(z,z)
forB, forU.
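That a fractional linear expression in $\tau$ traces a circle as $\tau$ runs over $\mathbb{T}$, with center and radius given in closed form, can be verified on generic coefficients. The sketch below uses the standard Möbius-image formulas and hypothetical coefficients `a, b, c, d`; in the setting of (10.2) these would presumably correspond to $-\psi_n(z)$, $\psi_n^*(z)$, $\phi_n(z)$, $\phi_n^*(z)$, which is an assumption and not a reconstruction of the formulas above.

```python
import cmath
import math


def mobius_circle(a, b, c, d):
    """Center and radius of {(a + b*tau)/(c + d*tau) : |tau| = 1}, assuming |c| != |d|."""
    denom = abs(c) ** 2 - abs(d) ** 2
    center = (a * c.conjugate() - b * d.conjugate()) / denom
    radius = abs(a * d - b * c) / abs(denom)
    return center, radius


# Hypothetical coefficients with |c| != |d|, so the image is a genuine circle.
a, b, c, d = 1 + 2j, 0.5 - 1j, 3 + 0j, 1 + 1j
center, radius = mobius_circle(a, b, c, d)

# Sample tau on the unit circle and measure how far the images are from the circle.
taus = [cmath.exp(2j * math.pi * k / 64) for k in range(64)]
max_dev = max(abs(abs((a + b * t) / (c + d * t) - center) - radius) for t in taus)
```

The radius is a determinant $|ad - bc|$ divided by $\big||c|^2 - |d|^2\big|$, which is the structural reason a determinant formula and a Christoffel-Darboux type identity can be used to rewrite $r_n$.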
Since the disks are nested, the sequence of $|B_n(z)|/k_n(z,z)$ is nonincreasing. This is obvious for $z\in\mathbb{O}_0$, but it also holds for $z\in\mathbb{O}_0^e$. Moreover, $\Delta_\infty$ is a point iff this sequence tends to 0. Obviously, $k_n(z,z)\ge 1$, so that the sequence tends to zero in $\mathbb{O}$ when $B_n(z)$ does, that is, when the Blaschke product diverges and thus when (10.1) is not satisfied. In this case we have a limiting point. Thus the divergence of the Blaschke product is sufficient to have a limiting point if $z\in\mathbb{O}_0$. We now prove that this is also true for $z\in\mathbb{O}_0^e$.
Lemma 10.2.1. If $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$ and if the Blaschke product diverges, then $\Delta_\infty(z)$ is a point.
Proof. As we explained above, this is clear for $z\in\mathbb{O}_0$. For $z\in\mathbb{O}_0^e$, we note that by the Christoffel-Darboux relation
\4>*n+l(z)\2-\
Taking the superstar of this gives ~ \
= J2 \B«\k(z)\2\4>t(z)\2. k=0
k=0
Hence
kn(z,z) = ^|Z? nU (z)| 2 |0*(z)| 2 = \Bn(z)\2Y, k=0
k=0
Thus, the expression for the radius gives 1 mo(ao)mz(z) and this goes to zero for $z\in\mathbb{O}_0^e$ because the Blaschke product diverges to infinity in $\mathbb{O}^e$. •
Thus we have proved that if (10.1) is not satisfied, that is, if the Blaschke product diverges (and if it diverges, it will diverge for all $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$), then we shall have a limiting point. If, however, the conditions (10.1) are satisfied, then the Blaschke product converges (to a nonzero value) for every $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$. Thus in this case we shall have a limiting point if and only if

$$\sum_{k}|\phi_k(z)|^2 = \infty,\quad z\in\mathbb{O}_0\cup\mathbb{O}_0^e.$$
We shall refer to the case where $\Delta_\infty$ is a point as the limit point situation, whereas the case where $\Delta_\infty$ is a disk (with positive radius) is the limit disk situation. We shall now generalize the theorems on invariance and analyticity given in Akhiezer's book [2]. By invariance is meant that if $\Delta_\infty(z)$ is a point (disk) for some $z = w$, then it will be a point (disk) for all $z$. By analyticity, it is meant that if $\Delta_\infty(z)$ is a point, then it is an analytic function of $z$ everywhere in $\mathbb{O}_0\cup\mathbb{O}_0^e$. We shall need some lemmas first.
Lemma 10.2.2. Choose $z\in\mathbb{O}_0\cup$
withs
2. If $\Delta_\infty(z)$ has a positive radius, then $(\phi_k(z))\in\ell_2$ and for $s\in\Delta_\infty(z)$, also $(\psi_k(z) + s\phi_k(z))\in\ell_2$. This implies that also $(\psi_k(z))\in\ell_2$. Since $(\phi_n, \phi_n^*)$ and $(\psi_n, -\psi_n^*)$ form a basis for the solution space of (4.1), it follows that all solutions are in $\ell_2$.
3. The previous part proves this in one direction (which does not need the convergence of the Blaschke product). Conversely, if $(x_n)\in\ell_2$ for every solution $(x_n, x_n^+)$ of (4.1), then $(\phi_n(z))\in\ell_2$ and because the Blaschke product converges, the radius of $\Delta_\infty(z)$ will be positive. •
Lemma 10.2.3. Define for $k = 0, 1, \ldots, n-1$; $n = 0, 1, \ldots$ akn(w) = 4>k(M>)tHw) + MwWnW.
(10.4)
Then
7Uo(ao)mw(z) _ / ^ — 2^akn(w)4>k(z)\
x |2nj-o(z)0*(i<;)
(10.5)
and 2mo*(w)Bn(w)mn(z) nro(w)
^
Proof. Consider the Christoffel-Darboux relation (10.7) ny
"'
k=0
Formula 2 of Corollary 4.3.4 is (10.8) whereas formula I of the same corollary yields the following two equalities:
rn(z)j*W)+n(z)jjw)_\
2 . - £o(zRo(u>)
x'.,„,.,,...,";
(l09)
and
2
*(W) + fn(z)(t)n (tu) 1 - fn(z)f»(ty)
1 - fofe)fo(tw) (10.10)
Elimination of 0*(z) from (10.7) and (10.9) gives
(10.11) while elimination of ^r* (z) from (10.8) and (10.10) gives r
ann(w)irn(z) = [1 - ?„(*)?„
/ ^ » j i c / \
w—l
1 (10.12)
Now we use — _
ujn(an)mz(w) ZUn(z)UTn(w)
and the determinant formula of Theorem 4.2.6 to get
W)
2ujn(z)Bn{w)'
so that (10.11) and (10.12) can be transformed into (10.5) and (10.6). From the Christoffel-Darboux relation follows Lemma 10.2.4. Assume the Blaschke product converges. Choose z Then (0 n (z)) G l2 ^ (4>*(z)) € €2 and (^w(z)) € h O (^(z)) Proof. As in the proof of Lemma 10.2.1, we have
= Y, \Bn\k{z)\2\(pt(z)\2. k=0
k=0
e £2
It follows similarly from the formulas in Corollary 4.3.4 that
k=0
k=0
for each $n$. Now, let $B(z)$ denote the infinite Blaschke product. Then for $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$ and for $k = 0, 1, \ldots, n$ we have

$$0 < |B(z)| \le |B_n(z)| \le |B_{n\backslash k}(z)| \le 1,\quad z\in\mathbb{O}_0$$

and

$$1 \le |B_{n\backslash k}(z)| \le |B_n(z)| \le |B(z)| < \infty,\quad z\in\mathbb{O}_0^e,$$

so that the lemma follows. •
Finally we are ready to state and prove the invariance theorem.
Theorem 10.2.5 (Invariance). If $\Delta_\infty(w)$ is a disk with positive radius for some $w\in\mathbb{O}_0\cup\mathbb{O}_0^e$, then $\Delta_\infty(z)$ is a disk with positive radius for every $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$ and

$$\sum_{k=0}^{n}|\phi_k(z)|^2\quad\text{and}\quad\sum_{k=0}^{n}|\psi_k(z)|^2 \tag{10.13}$$

converge locally uniformly in $\mathbb{O}\cup\mathbb{O}^e$ as $n\to\infty$. This situation can only occur if (10.1) is satisfied, that is, if the Blaschke product converges. If $\Delta_\infty(w)$ is a point for some $w\in\mathbb{O}_0\cup\mathbb{O}_0^e$, then $\Delta_\infty(z)$ is a point for every $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$. This situation will certainly occur if (10.1) is not satisfied, that is, if the Blaschke product diverges.
Proof. First notice that if (10.1) is not satisfied (i.e., if the Blaschke product diverges for some $w\in\mathbb{O}_0\cup\mathbb{O}_0^e$), then it will diverge for every $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$ and, by Lemma 10.2.1, this implies that $\Delta_\infty(z)$ is a point for every $z\in\mathbb{O}_0\cup\mathbb{O}_0^e$. Now, assume that (10.1) is satisfied. Define $A_n(z) =$
mo*(w)Bn(w)mn(z)
and r ( \—
2mo(w)mo*(w)Bn(w)mn(z)'
and let ank be as in (10.4). Then, because Aoo is a disk (see Lemma 10.2.2(3)), e l2 and (i/rk(z)) e l2 so that oo n—1
oo n—1 n=0 £=0
\n=0
\k=0
\n=0
\k=0
< OO
by Lemma 10.2.4. Let $K$ be a compact subset of $\mathbb{O}_0\cup\mathbb{O}_0^e$. Then $A_n(z)$ and $C_n(z)$ are uniformly bounded for $z\in K$. Say
Then Lemma 10.2.3 expresses <j>n or ^rn by a formula of the form 71-1
k=0
where (cn) e i2 and E^lo E ^ o We shall now prove that X ^ o
converges uniformly in K. Clearly
p /
Let 0 < 6 < 1 and choose m =
1/2
+«2
2\ 1/2
E
itn^
\n=m
, /?i, /?2) such that
\n=m k=0
Then, for N > m,we have nl/2
\_n=m \k=0
N
k=0 \
!
/2
/ oo
n-1
1/2
N
\ V2
\k=0 fN
\k=m
/
\k=0
SO
As ^ S ) 1 l^l 2 i s continuous on K, there is an M > 0 such that
\t=0
Hence
implying that Y2T=o lx«l2 converges uniformly on K. All this implies with cn = 0*(w), xn =
Proof. This follows from (10.3) and the previous theorem. □

Next we prove the theorem on analyticity.

Theorem 10.2.7 (Analyticity). Let $\Delta_\infty$ be a point (limit point situation). Then (see (10.2)) $s(z) = \lim_{n\to\infty} R_n(z, \tau)$ exists for $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$ and $\tau \in \mathbb{T}$ and is independent of $\tau$. The function $s$ is analytic in $\mathbb{O}_0 \cup \mathbb{O}_0^e$ and

Proof. For $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$, let $s(z)$ be the point $\Delta_\infty(z)$. Then clearly $R_n(z, \tau) \to s(z)$ as $n \to \infty$ for each $\tau \in \mathbb{T}$. Now take a fixed $\tau \in \mathbb{T}$ and put $s_n(z) = R_n(z, \tau)$. Then $s_n$ is analytic in $\mathbb{O}_0 \cup \mathbb{O}_0^e$. From the equation of $K_n(z)$, we obtain
Thus 1 + sB(z) + sB(z) + |*n(z)| < 2
1 - l?o(z)l2
or, using \s +s\ < 2\s\,
Thus \sn(z)\ <2|A(z)| with
A(z) =
Y
is uniformly bounded on compact subsets of $\mathbb{O}_0 \cup \mathbb{O}_0^e$. Therefore, $s(z)$ is analytic in $\mathbb{O}_0 \cup \mathbb{O}_0^e$. Moreover, from the inequality defining $\Delta_\infty(z)$, the last inequality of the theorem follows. □

Note that the denominator in the last inequality of the theorem is positive (negative) iff $z \in \mathbb{O}$ ($z \in \mathbb{O}^e$). Thus the last inequality of the theorem says that $s(z)$ is in the right (left) half plane iff $z \in \mathbb{O}$ ($z \in \mathbb{O}^e$).
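The half-plane statement can be checked numerically on a toy example. The sketch below is only an illustration: it assumes the kernels $D(t,z) = (t+z)/(t-z)$ for the circle and $D(t,z) = -i(1+tz)/(t-z)$ for the line, and it replaces the measure by a hypothetical discrete positive measure with random nodes and masses (the names `transform`, `circle_nodes`, `line_nodes` are not from the text).

```python
import numpy as np

def D_circle(t, z):
    """Assumed kernel for the unit circle: (t + z) / (t - z)."""
    return (t + z) / (t - z)

def D_line(t, z):
    """Assumed kernel for the real line: -i (1 + t z) / (t - z)."""
    return -1j * (1 + t * z) / (t - z)

def transform(kernel, nodes, weights, z):
    """Integral of D(t, z) against a discrete measure with the given masses."""
    return np.sum(weights * kernel(nodes, z))

rng = np.random.default_rng(0)
weights = rng.uniform(0.1, 1.0, size=8)                        # positive masses
circle_nodes = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=8))
line_nodes = rng.normal(size=8)

z_disk = 0.3 + 0.4j                       # inside the unit disk
z_disk_reflected = 1.0 / np.conj(z_disk)  # reflected point, outside the disk
z_upper = 0.2 + 0.7j                      # upper half plane
z_upper_reflected = np.conj(z_upper)      # reflected point, lower half plane

s_in = transform(D_circle, circle_nodes, weights, z_disk)
s_out = transform(D_circle, circle_nodes, weights, z_disk_reflected)
u_in = transform(D_line, line_nodes, weights, z_upper)
u_out = transform(D_line, line_nodes, weights, z_upper_reflected)

print("circle:", s_in, s_out)
print("line:  ", u_in, u_out)
```

For a positive measure the transform lands in the right half plane for $z$ in the disk (or upper half plane) and in the left half plane at the reflected point, where it equals minus the complex conjugate of the value at $z$.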
10.3. The moment problem

Let $\mathcal{M}$ denote the set of all solutions of the moment problem. As we already mentioned before, the theorem below will imply that our moment problem will always have a solution: $\mathcal{M} \neq \emptyset$. We identify two solutions as being the same in a measure theoretic sense, that is, the solutions are the same whenever $d\mu_1 = d\mu_2$, thus whenever they define the same linear functional on the set $C(\partial\mathbb{O})$ of all continuous functions on the boundary $\partial\mathbb{O}$.

A $\mu \in \mathcal{M}$ will always have an infinite support. If it were only supported on a finite number of points $t_i \in \partial\mathbb{O}$, $i = 1, \ldots, n$, then we may define the polynomial $N_n(z) = \prod_{k=1}^{n}(z - t_k)$ and set $R(z) = N_n(z)/\pi_n(z) \in \mathcal{L}_n$. Obviously
$$0 < M\{RR_*\} = \|R\|^2 = \int |R(t)|^2\,d\mu(t) = 0,$$
which is a contradiction.

Theorem 10.3.1. Fix $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$ and define for $\mu \in \mathcal{M}$
$$\Omega_\mu(z) = \int D(t, z)\,d\mu(t).$$
Then
$$\Delta_\infty(z) = \{\Omega_\mu(z) : \mu \in \mathcal{M}\}.$$

Proof. Set $s = \Omega_\mu(z)$ for some $\mu \in \mathcal{M}$. Let $f(t) = D(t, z)$, $t \in \partial\mathbb{O}$, and let
$$\sum_{k=0}^{\infty} \gamma_k \phi_k(t)$$
be the generalized Fourier series of $f(t)$. Then
$$\gamma_k = \int f(t)\phi_{k*}(t)\,d\mu(t), \qquad k = 0, 1, \ldots,$$
especially
$$\gamma_0 = \int f(t)\,d\mu(t) = s.$$
Moreover,
[
t, z)[(pkit) -
I D(t,
Bessel's inequality gives k\2< /
\D(t,z)\2d?i(t).
But, some computations lead to the identity
1 + \D(t, z)\2 = j + j ^ j 2 [D(t, z) so that
Also,
Hence Bessel's inequality becomes
Thus
which means that $s \in \Delta_\infty(z)$.

It remains to show that any $s \in \Delta_\infty(z)$ corresponds to the Riesz-Herglotz-Nevanlinna transform of a $\mu \in \mathcal{M}$. Let us assume first that $s$ is a boundary point of $\Delta_\infty(z)$. Then, for each $n$, there is a point $s_n \in K_n(z)$ such that $s_n \to s$ as $n \to \infty$. For each $n$, there is a point $\tau_n \in \mathbb{T}$ such that $R_n(z, \tau_n) = s_n$. Let $\mu_n$ be a solution of the truncated moment problem (in $\mathcal{L}_{(n-1)*} \cdot \mathcal{L}_{(n-1)}$) with parameter $\tau_n$. Then
$$s_n = R_n(z, \tau_n) = \int D(t, z)\,d\mu_n(t).$$
By Helly's selection theorem [115, p. 575; 87, p. 56; 94, p. 222], there is a subsequence $(\mu_{n(j)})$ of $(\mu_n)$ and a positive measure $\mu$ on $\partial\mathbb{O}$ such that $\mu_{n(j)} \stackrel{*}{\to} \mu$. By Helly's convergence theorem [115, p. 573; 87, p. 56; 94, p. 222],
$$\int g(t)\,d\mu_{n(j)}(t) \to \int g(t)\,d\mu(t), \qquad j \to \infty,$$
for all $g \in C(\partial\mathbb{O})$. For the case $\partial\mathbb{O} = \mathbb{T}$, Helly's theorems are directly applicable. For the case $\partial\mathbb{O} = \mathbb{R}$, one can use a Cayley transform of the circle case. By this mapping we obtain a so-called one-point or Alexandroff compactification of the real line [96] and any neighborhood of $\infty$ is isomorphic to a neighborhood of a finite point, so that Helly's theorems are also applicable in this case. Clearly $\mu \in \mathcal{M}$. Moreover,
$$s_{n(j)} = \int D(t, z)\,d\mu_{n(j)}(t) \to \int D(t, z)\,d\mu(t), \qquad j \to \infty.$$
As $s_n \to s$ for $n \to \infty$, this implies
$$s = \int D(t, z)\,d\mu(t),$$
so that $s = \Omega_\mu(z)$ for some $\mu \in \mathcal{M}$.

Now assume that $\Delta_\infty(z)$ is a disk and let $s$ belong to its interior. Then $s$ is a convex combination $\lambda s_1 + (1 - \lambda)s_2$ ($0 < \lambda < 1$) of points $s_1$ and $s_2$ on the boundary $K_\infty(z)$. By the above, there are $\mu_1, \mu_2 \in \mathcal{M}$ such that
$$s_j = \int D(t, z)\,d\mu_j(t), \qquad j = 1, 2.$$
Clearly $\mu = \lambda\mu_1 + (1 - \lambda)\mu_2 \in \mathcal{M}$ and $s = \Omega_\mu(z)$. □
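The last step of the proof only uses that both the moments and the transform depend linearly on the measure. A minimal sketch of that linearity, assuming the circle kernel $D(t,z) = (t+z)/(t-z)$ and two hypothetical discrete probability measures on common nodes (they are not solutions of one and the same moment problem; only the affine dependence on the masses is illustrated):

```python
import numpy as np

def D(t, z):
    """Assumed kernel for the unit circle: (t + z) / (t - z)."""
    return (t + z) / (t - z)

# Twelve common nodes on the unit circle carrying two different sets of masses.
nodes = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False))
rng = np.random.default_rng(1)
w1 = rng.uniform(0.1, 1.0, nodes.size)
w1 /= w1.sum()
w2 = rng.uniform(0.1, 1.0, nodes.size)
w2 /= w2.sum()

lam = 0.35
w_mix = lam * w1 + (1.0 - lam) * w2      # masses of lam*mu1 + (1 - lam)*mu2
z = 0.25 - 0.5j

def omega(weights):
    """Transform of the discrete measure at the fixed point z."""
    return np.sum(weights * D(nodes, z))

def moment(weights, k):
    """k-th trigonometric moment of the discrete measure."""
    return np.sum(weights * nodes**k)

s1, s2, s_mix = omega(w1), omega(w2), omega(w_mix)
print(s_mix, lam * s1 + (1.0 - lam) * s2)
```

If $\mu_1$ and $\mu_2$ share their moments, the same computation shows that the mixture has those moments too, which is why it is again a solution.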
Corollary 10.3.2. In the case of a limiting disk, for each $s \in \Delta_\infty(z)$, $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$, there are infinitely many $\mu \in \mathcal{M}$ such that $s = \Omega_\mu(z)$. In this case, the moment problem has infinitely many solutions.

Corollary 10.3.3. In the case of a limiting point, the moment problem has a unique solution.

Proof. If $\mu_1, \mu_2 \in \mathcal{M}$, then the functions $\Omega_{\mu_1}$ and $\Omega_{\mu_2}$ coincide on $\mathbb{C} \setminus \partial\mathbb{O}$. For $\mu = \mu_1 - \mu_2$, we have that $\mu$ is of bounded variation whereas the Poisson-Stieltjes integral
$$0 = \Omega_{\mu_1}(z) - \Omega_{\mu_2}(z) = \int P(t, z)\,d\mu(t), \qquad z \in \mathbb{C} \setminus \partial\mathbb{O},$$
is analytic in $\mathbb{O}$. This implies that $\mu$ is absolutely continuous (Theorem of F. and M. Riesz [76, p. 41; 92, p. 61]). Thus
0 = /x/(z)= f
P(t,z)fi'(t)dk(f),
so that the difference between the boundary functions of $\mu_1$ and $\mu_2$ is a constant.
As in Akhiezer's book [2, p. 43], we adopt the following definition: We say that a solution $\mu \in \mathcal{M}$ is N-extremal at a point $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$ if $s = \Omega_\mu(z)$ belongs to the boundary $K_\infty(z)$ of $\Delta_\infty(z)$. This implies that $\mu \in \mathcal{M}$ is N-extremal at $z$ iff
Let us denote by $\mathcal{L}$ the closure in $L_2(\mu)$ of $\mathcal{L}_\infty$. We can now state a density result.

Theorem 10.3.4. Let $\mu \in \mathcal{M}$ be a solution of the moment problem. Then
1. If $\mathcal{L}_\infty$ is dense in $L_2(\mu)$, then $\mu$ is N-extremal at every $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$.
2. If $\mu$ is N-extremal at some $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$, then $\mathcal{L}_\infty$ is dense in $L_2(\mu)$, that is, $\mathcal{L} = L_2(\mu)$.

Proof. The proof is along the same lines as the proof of the previous theorem. Let $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$. Put
and let fz(t) = D(r, z)* = -ft{t\
t e dO
(10.15)
(substar w.r.t. t). Let
k=0
be the generalized Fourier series of fz. Then, as in the proof of Theorem 10.3.1, the Bessel inequality
< / \fz(t)\2dfr(t) k=0
becomes
k=0
which is the equation defining
1. Now suppose that $\mathcal{L}_\infty$ is dense in $L_2(\mu)$. Because obviously $f_z \in L_2(\mu)$, we should have equality in (10.16) by Parseval. This means precisely that $s \in K_\infty(z)$ and thus that $\mu$ is N-extremal.

2. Conversely, suppose that the solution $\mu \in \mathcal{M}$ of the moment problem is N-extremal at $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$. From (10.15)
and it follows that Aoo(z) = —Aoo(z) and in particular that K^iz) — —Kooiz). Hence \i is also N-extremal at z and fa is in the closure C of COQ in L2(/x). Because also 1 e Coo, we find that both 1
l
A
mz(t)
l
and
t-z
belong to C We show that also 1
[ruz(t)f
1
and
1 k
(t-z)k
[m*(t)]
are in C for A: = 1,2, For each k and each p G Vn, there exists a constant A and a polynomial q eVn-i such that 1 t-Z
1 [it-
k Z)
Pit)] 7tn(t)\
==
1 (t - Z)M
A t- Z
(A = p(z)/7tniz) and 0(f) = (p(t) - Aitn(t))/(t (t -
t -z
q(t) nn(t)
<M
q(t) Jtn(t)
- z)), so 1 (t-Z)
k
Pit) 7ln(t)
where $M$ is a constant (depending on $z$), given by $M = \sup\{|t - z|^{-1} : t \in \partial\mathbb{O}\}$. For fixed $z \notin \partial\mathbb{O}$, it is a finite constant. As $q/\pi_n$ and $p/\pi_n$ are in $\mathcal{L}$, it follows by induction that $(t - z)^{-k} \in \mathcal{L}$ for $k = 1, 2, \ldots$. In a similar way, we also obtain that $[\varpi_z(t)]^{-k} \in \mathcal{L}$ for $k = 1, 2, \ldots$. Now, let $g \in L_2(\mu)$ be such that (10.17) for all $f \in \mathcal{L}$. In order to show that $\mathcal{L}_\infty$ is dense in $L_2(\mu)$, it is sufficient to show that $g(t) = 0$ $\mu$-a.e. for $t \in \partial\mathbb{O}$. To do so, set
Then $\nu$ is a complex measure of bounded variation since $g \in L_2(\mu) \subset L_1(\mu)$. Therefore, when $C(t, z)$ is the Cauchy kernel,
$$H(z) = \int C(t, z)\,d\nu(t)$$
is analytic in $\mathbb{C} \setminus \partial\mathbb{O}$. As (10.17) holds for all $f \in \mathcal{L}$, it follows that
r = 0
and
/
=0,
k = 1, 2,
But then $H(z) = 0$ on $\mathbb{C} \setminus \partial\mathbb{O}$ because $H$ and all its derivatives vanish at $z$ and at $\hat z$. As in the proof of Corollary 10.3.3, it now follows that $d\nu(t) = 0$ or that $g(t) = 0$ $\mu$-a.e. on $\partial\mathbb{O}$. □

The previous theorem says that $\mu \in \mathcal{M}$ is N-extremal at all $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$ iff $\mu$ is N-extremal in at least one $z \in \mathbb{O}_0 \cup \mathbb{O}_0^e$. Thus we can say that $\mu$ is N-extremal without specifying the $z$, so that we have

Theorem 10.3.5. If $\mu \in \mathcal{M}$ is a solution of the moment problem, then $\mu$ is N-extremal iff $\mathcal{L}_\infty$ is dense in $L_2(\mu)$.
11. The boundary case
Whereas in the previous chapters, we have considered the situation where all the interpolation points $\alpha_k$ were in $\mathbb{O}$, we shall in this chapter consider the situation where the interpolation points $\alpha_k$ are all on the boundary $\partial\mathbb{O}$. We shall also start out from the beginning with an inner product that is defined by a linear functional $M$:
$$\langle f, g \rangle = M\{f g_*\}.$$
When $\langle f, f \rangle \neq 0$ for all $f \neq 0$ that are in $\mathcal{L}$, then the functional is called quasi-definite and when, moreover, $\langle f, f \rangle > 0$ for $f \neq 0$, it is called positive definite. When $M\{f\} \in \mathbb{R}$ for all $f \in \mathcal{L}_\infty \cdot \mathcal{L}_{\infty*}$, then $M$ is called real. We shall assume that $M$ is real and positive definite. We suppose that, with the new location of the $\alpha_k$, the spaces $\mathcal{L}_n$ are as defined before and that the $\phi_n$ are the corresponding orthogonal rational functions, whenever they exist.

11.1. Recurrence for points on the boundary

The situation where the points are on the boundary is considerably different from the situation where they are not. One simple observation, for example, is that the Blaschke products cannot be used as a basis anymore. Indeed, these are all equal to a constant of modulus one. Recall that $\hat z = 1/\bar z$ for the circle and $\hat z = \bar z$ for the line. Let $\alpha_k$, $k = 1, 2, \ldots$ be a sequence of points that are all in $\partial\mathbb{O}$. Note that then $\hat\alpha_k = \alpha_k$, whence we can write our definition of the spaces $\mathcal{L}_n$ as
$$\mathcal{L}_n = \left\{ \frac{p_n}{\pi_n^*} : p_n \in \mathcal{P}_n \right\}.$$
Since we consider only a countable number of $\alpha_k \in \partial\mathbb{O}$, we can always find a point $\alpha_\emptyset \in \partial\mathbb{O}$ such that $\alpha_k \neq \alpha_\emptyset$ for all $k = 0, 1, \ldots$. A simple transformation can bring this $\alpha_\emptyset$ to any position on $\partial\mathbb{O}$ that we would prefer. Thus it is not a real restriction to assume that, in the circular case, $\alpha_k \neq 1$ for all finite $k$. Thus we shall set by definition $\alpha_\emptyset = 1$ in the case of the circle. For the real line, this corresponds to all $\alpha_k$ being different from zero, and there we set $\alpha_\emptyset = 0$. The slash in the index symbolizes that it is a "forbidden" value for the $\alpha_k$.

In contrast, we had in the previous situation that, for the disk, the polynomials were a special case when all $\alpha_k = \alpha_0 = 0$. For interpolation on the boundary, we have in the situation of the real line that the polynomial case is recovered when all poles are at infinity. Therefore, it seems natural to set there $\alpha_0 = \infty$ and the corresponding point on the circle is $\alpha_0 = -1$. However, we still need the old meaning of $\alpha_0$ and we shall denote it from now on as $\beta$. Thus $\beta = 0$ for the circle and $\beta = i$ for the line. These definitions make the point $\alpha_0$ "less exceptional" because it now belongs to $\partial\mathbb{O}$ just like all the other $\alpha_k$, whereas $\beta$ does not.

This unfortunate, but necessary, change of notation will be used consistently, meaning that, whenever appropriate, an index 0 (like in the definition of $Z_k$ below) will refer to the use of $\alpha_0$ (new meaning) in the general definition. If we want to refer to $\beta$ we shall use the index $\beta$ (instead of the index 0 as we did before with the old meaning of $\alpha_0$). For example, from now on $1/\varpi_0^*(z) = 1/(z + 1)$ for $\mathbb{T}$ and $1/\varpi_0^*(z) = 0$ for $\mathbb{R}$, whereas $\varpi_\beta^*(z) = z$ for $\mathbb{T}$ and $\varpi_\beta^*(z) = z - i$ for $\mathbb{R}$. The table below summarizes our notational changes.

Notation alert

        $\alpha_0$ (old)    $\alpha_0$ (new)    $\beta$ (new)    $\alpha_\emptyset$ (new)
  T     0                   -1                  0                1
  R     i                   $\infty$            i                0

A remarkable observation is that in the present situation we have (with the same definition of the substar we had before)
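Therefore, it is natural to consider a basis $\{b_k\}_{k=0}^{n}$ for $\mathcal{L}_n$ that satisfies $b_{k*} = b_k$. Such a basis is described by
$$b_0 = 1, \qquad b_n = b_{n-1}Z_n = Z_1 Z_2 \cdots Z_n, \quad n \ge 1,$$
where (note that we use the definition below also for $k = 0$)
$$Z_k(z) = \frac{i(1 - z)}{(\alpha_k - z)/(\alpha_k - 1)}, \quad k \ge 0 \quad \text{for } \mathbb{T}, \qquad Z_k(z) = \frac{z}{1 - z/\alpha_k}, \quad k \ge 0 \quad \text{for } \mathbb{R},$$
or in an invariant notation
$$Z_k(z) = i\,\frac{(z - \alpha_\emptyset)(\alpha_\emptyset - \alpha_k)}{(z - \alpha_k)(\beta - \alpha_\emptyset)} = \frac{i\,\varpi_\emptyset^*(z)/\varpi_\emptyset^*(\beta)}{\varpi_k^*(z)/\varpi_k^*(\alpha_\emptyset)} = Z_{k*}(z), \qquad k \ge 0, \quad \text{for } \partial\mathbb{O}. \qquad (11.1)$$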
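Thus these basis functions indeed satisfy the relation $b_{n*} = b_n$. We shall use the notation $b(z)$ for the numerator in $Z_k$, that is, we set
$$b(z) = \begin{cases} i(1 - z) & \text{for } \mathbb{T}, \\ z & \text{for } \mathbb{R}, \end{cases}$$
and thus
$$Z_k(z) = \frac{b(z)}{\varpi_k^*(z)/\varpi_k^*(\alpha_\emptyset)}.$$
Some very useful observations can be made here. First, note that with this basis, the polynomial case will appear naturally for the situation of the real line, by setting all $\alpha_k = \alpha_0 = \infty$. Indeed,
$$b_n(z) = z^n \prod_{k=1}^{n} \frac{1}{1 - z/\alpha_k} = z^n \qquad \text{if } \alpha_k = \infty, \quad k = 1, 2, \ldots.$$
We also note that for $n \ge 1$
$$b_n(z) = \frac{[b(z)]^n}{\pi_n^*(z)/\pi_n^*(\alpha_\emptyset)}.$$
Thus writing $f \in \mathcal{L}_n$ as $p_n(z)/\pi_n^*(z)$ or as $q_n(z)/[\pi_n^*(z)/\pi_n^*(\alpha_\emptyset)]$ is just a matter of a constant factor relating the polynomial numerators: $p_n(z) = q_n(z)\pi_n^*(\alpha_\emptyset)$. We will use both possibilities, depending on what is the most convenient. It is also useful to see that $1/Z_k(z)$ vanishes for $z = \alpha_k$.

Now let us consider as before a linear functional $M$, Hermitian and positive definite, which is defined on the space $\mathcal{R}_\infty = \mathcal{L}_\infty \cdot \mathcal{L}_\infty$, $\mathcal{L}_\infty = \cup_{n=0}^{\infty}\mathcal{L}_n$. This means that
$$M\{f_*\} = \overline{M\{f\}}, \quad f \in \mathcal{L}_\infty \cdot \mathcal{L}_\infty \qquad \text{and} \qquad M\{f f_*\} > 0, \quad 0 \neq f \in \mathcal{L}_\infty.$$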
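The properties of the factors $Z_k$ listed above are easy to check numerically. The following sketch assumes the explicit forms $Z_k(z) = i(1-z)(\alpha_k-1)/(\alpha_k-z)$ on the circle (with $\alpha_\emptyset = 1$, $\beta = 0$) and $Z_k(z) = z/(1-z/\alpha_k)$ on the line (with $\alpha_\emptyset = 0$, $\beta = i$); the function names are illustrative only, and on the line the pole is passed as $1/\alpha_k$ so that a pole at infinity is simply the value 0.

```python
import cmath

def Z_circle(z, alpha):
    """Assumed Z_k on the unit circle (forbidden point 1)."""
    return 1j * (1 - z) * (alpha - 1) / (alpha - z)

def Z_line(z, inv_alpha):
    """Assumed Z_k on the real line, parametrized by 1/alpha_k (0 means alpha_k = infinity)."""
    return z / (1 - z * inv_alpha)

def Z_invariant(z, alpha, beta, alpha_forbidden):
    """Invariant form: i (z - a0)(a0 - alpha) / ((z - alpha)(beta - a0))."""
    return (1j * (z - alpha_forbidden) * (alpha_forbidden - alpha)
            / ((z - alpha) * (beta - alpha_forbidden)))

def substar_circle(f, z):
    """f_*(z) = conj(f(1/conj(z))) for the circle."""
    return f(1 / z.conjugate()).conjugate()

def substar_line(f, z):
    """f_*(z) = conj(f(conj(z))) for the line."""
    return f(z.conjugate()).conjugate()

def b_line(n, z, inv_alphas):
    """b_n = Z_1 Z_2 ... Z_n on the real line."""
    value = 1.0
    for k in range(n):
        value *= Z_line(z, inv_alphas[k])
    return value

z = 0.3 + 0.2j
alpha_T = cmath.exp(2.1j)      # a pole on the unit circle, different from 1
alpha_R = -1.7                 # a real pole, different from 0

zc = Z_circle(z, alpha_T)
zl = Z_line(z, 1 / alpha_R)
zc_star = substar_circle(lambda u: Z_circle(u, alpha_T), z)
zl_star = substar_line(lambda u: Z_line(u, 1 / alpha_R), z)
poly = b_line(4, z, [0.0, 0.0, 0.0, 0.0])   # all poles at infinity

print(zc, zc_star)
print(zl, zl_star)
print(poly, z**4)
```

The two explicit forms agree with the invariant one, the substar leaves $Z_k$ unchanged when $\alpha_k$ lies on the boundary, and with all poles at infinity the basis on the line reduces to the monomials.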
Note that in the previous situation, where the points $\alpha_k$ are not on the boundary, then $\mathcal{L}_n \cdot \mathcal{L}_{n*}$ is the same as $\mathcal{L}_n + \mathcal{L}_{n*}$, but this is no longer true in general for the boundary situation. We now have $\mathcal{R}_n = \mathcal{L}_n \cdot \mathcal{L}_n$. With the linear functional so defined, we can again introduce an inner product
$$\langle f, g \rangle = M\{f g_*\}.$$
We can construct orthonormal functions $\phi_n \in \mathcal{L}_n$ with respect to this inner product and they can be expressed in terms of the $b_k$ we have just introduced:
$$\phi_n(z) = \sum_{k=0}^{n} \beta_k^{(n)} b_k(z).$$
We assume that the orthonormal functions $\phi_n$ have a leading coefficient in the basis $\{b_k\}$ that is positive. We continue calling it $\kappa_n = \beta_n^{(n)}$. Also, the coefficient $\beta_{n-1}^{(n)}$ will play a special role and we shall also reserve a special notation for it: $\beta_{n-1}^{(n)} = \kappa_n'$. Thus
$$\phi_n(z) = \cdots + \kappa_n' b_{n-1}(z) + \kappa_n b_n(z).$$
It is easily seen that we can get Kn and K'H from Zn(an-i)
z=(Xn
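With the normalization $\kappa_n > 0$, we have

Lemma 11.1.1. The orthonormal functions $\phi_n$ have real coefficients with respect to the basis $\{b_k\}$ and $\phi_{n*} = \phi_n$.

Proof. The Gram-Schmidt procedure gives
$$\phi_n = \chi_n/\|\chi_n\| \qquad \text{with} \qquad \chi_n = b_n - \sum_{i=0}^{n-1} \gamma_i \phi_i, \qquad \gamma_i = \langle b_n, \phi_i \rangle.$$
Using $M\{f_*\} = \overline{M\{f\}}$, $\langle f, g \rangle = M\{f g_*\}$, $b_{n*} = b_n$, and $\phi_{i*} = \phi_i$ for $i < n$, it follows that the coefficients
$$\gamma_i = \langle b_n, \phi_i \rangle = M\{b_n \phi_{i*}\} = M\{b_{n*}\phi_i\} = \overline{M\{b_n \phi_{i*}\}} = \overline{\gamma_i}$$
are real. Since $\phi_i$, $i < n$, has real coefficients with respect to the basis $\{b_k\}$, then $\chi_n$ and thus also $\phi_n$ will have real coefficients with respect to the basis $\{b_k\}$. □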
The notions "degenerate" and "exceptional" as defined in Section 4.5 coincide and are replaced by the notion "singular." We shall now call $\phi_n$ (and also its index $n$) singular when $\phi_n = p_n/\pi_n^*$ and $p_n(\alpha_{n-1}) = 0$ and regular otherwise. In the case of the real line, $\alpha_k$ can be $\infty$. A zero of $p_k$ at $\infty$ then means that the degree of $p_k$ is less than $k$. We are now ready to formulate the recurrence relation.

Theorem 11.1.2. For $n = 2, 3, \ldots$, let $\phi_k \in \mathcal{L}_k$, $k = n-2, n-1, n$ be three successive orthonormal rational functions. Then $\phi_{n-1}$ and $\phi_n$ are regular if and only if there exists a recurrence relation of the form
$$\phi_n(z) = \left( A_n Z_n(z) + B_n \frac{Z_n(z)}{Z_{n-2}(z)} \right)\phi_{n-1}(z) + C_n \frac{Z_n(z)}{Z_{n-2}(z)}\phi_{n-2}(z), \qquad (11.3)$$
with constants $A_n$, $B_n$, $C_n$ satisfying the conditions
$$E_n = A_n + B_n/Z_{n-2}(\alpha_{n-1}) \neq 0, \qquad (11.4)$$
$$C_n \neq 0. \qquad (11.5)$$
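Proof. First suppose that $\phi_n$ and $\phi_{n-1}$ are regular. Choose $A_n$ arbitrary and define
$$W_n(z) = \frac{Z_{n-2}(z)}{Z_n(z)}\phi_n(z) - A_n Z_{n-2}(z)\phi_{n-1}(z).$$
Let $\phi_n(z) = q_n(z)/[\pi_n^*(z)/\pi_n^*(\alpha_\emptyset)]$. Then
$$W_n(z) = \frac{\alpha_\emptyset - \alpha_{n-2}}{z - \alpha_{n-2}} \cdot \frac{q_n(z) - A_n b(z) q_{n-1}(z)}{\pi_{n-1}^*(z)/\pi_{n-1}^*(\alpha_\emptyset)}$$
with $b(z)$ as defined above. Thus, if we choose
$$A_n = \frac{q_n(\alpha_{n-2})}{b(\alpha_{n-2})\,q_{n-1}(\alpha_{n-2})}, \qquad (11.6)$$
we obtain that $W_n \in \mathcal{L}_{n-1}$. Recall that $\phi_{n-1}$ is regular and $\alpha_{n-2} \neq \alpha_\emptyset$, so that $A_n$ is well defined; this is also true if $\alpha_{n-2} = \infty$. This implies that $W_n$ can be written as
$$W_n(z) = B_n\phi_{n-1}(z) + C_n\phi_{n-2}(z) + \sum_{k=0}^{n-3} D_k\phi_k(z).$$
For $n = 2$, the sum is empty and the result is obvious. For $n \ge 3$, it is easily checked that $W_n \perp \mathcal{L}_{n-3}$, hence that all $D_k = 0$, $k = 0, \ldots, n-3$. What then remains is equivalent with the formula (11.3).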
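Taking the numerator of this formula and putting $z = \alpha_{n-1}$ gives
$$q_n(\alpha_{n-1}) = b(\alpha_{n-1})[A_n + B_n/Z_{n-2}(\alpha_{n-1})]\,q_{n-1}(\alpha_{n-1}). \qquad (11.7)$$
Because $q_n(\alpha_{n-1}) \neq 0$, this gives (11.4). Observe then that $f_n(z) = b_{n-1}(z)/Z_n(z)$ is an element from $\mathcal{L}_{n-1}$. Thus, it is orthogonal to $\phi_n$ and we get
$$0 = \left\langle Z_n\left(A_n + \frac{B_n}{Z_{n-2}}\right)\phi_{n-1}, \frac{b_{n-1}}{Z_n} \right\rangle + C_n\left\langle \frac{Z_n}{Z_{n-2}}\phi_{n-2}, \frac{b_{n-1}}{Z_n} \right\rangle,$$
which can be written in the form (use $Z_{k*} = Z_k$)
$$\left\langle \left(A_n + \frac{B_n}{Z_{n-2}}\right)\phi_{n-1}, b_{n-1} \right\rangle + C_n\left\langle \frac{\phi_{n-2}}{Z_{n-2}}, b_{n-1} \right\rangle = 0.$$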
The left factor in the first inner product is in $\mathcal{L}_{n-1}$ and its leading coefficient is $A_n + B_n/Z_{n-2}(\alpha_{n-1})$, which is nonzero. Thus the first inner product is nonzero and therefore also the second one will be nonzero. This implies that $C_n \neq 0$.

Conversely, suppose that (11.3)-(11.5) holds for some $n \ge 2$. Since $\phi_{n-1} \in \mathcal{L}_{n-1} \setminus \mathcal{L}_{n-2}$, it follows that $q_{n-1}(\alpha_{n-1}) \neq 0$. Therefore it follows by (11.4) and (11.7) that $q_n(\alpha_{n-1}) \neq 0$. Because $A_n$ is well defined, it follows from (11.6) that $q_{n-1}(\alpha_{n-2}) \neq 0$. Hence $\phi_{n-1}$ and $\phi_n$ are regular. □

Some particular cases in the situation of the real line are worth mentioning.

Corollary 11.1.3. Suppose that we are in the case of the real line and $\alpha_n = \infty$ for all $n$. Then the $\phi_n$ are polynomials and they satisfy a recurrence relation of the form
$$\phi_n(z) = (A_n z + B_n)\phi_{n-1}(z) + C_n\phi_{n-2}(z), \qquad n = 2, 3, \ldots,$$
with
$$A_n \neq 0, \qquad C_n \neq 0, \qquad n = 2, 3, \ldots.$$

Proof. Noting that in this case the sequence $\phi_n$ is always regular, the result follows immediately from the previous theorem. □

We can also relate the recurrence of Theorem 11.1.2 to orthogonal Laurent polynomials, which are obtained by orthogonalizing the basis $\{1, z^{-1}, z, z^{-2}, z^2, \ldots\}$ with respect to a measure on $\mathbb{R}$. This result is not immediate though. The powers of $z$ can be obtained by setting $\alpha_k = \infty$ in the previous definition of the basis
$\{b_n\}$. However, for the powers of $z^{-1}$, we need the "forbidden" value $\alpha_k = 0$. We should therefore give an alternative definition. So, setting
z-ctk
and
$$b_0 = 1; \qquad b_{2n} = Z_2 Z_4 \cdots Z_{2n}, \quad n \ge 1; \qquad b_{2n+1} = Z_1 Z_3 \cdots Z_{2n+1}, \quad n \ge 0,$$
we get the basis required here by using $\alpha_{2k} = \infty$ and $\alpha_{2k+1} = 0$. The proof of the previous theorem can now be adapted, according to this choice of the basis functions, and one obtains

Theorem 11.1.4. Suppose we are in the case of the real line. Let $\phi_n$ be the orthonormal Laurent polynomials obtained by orthogonalization of $1, z^{-1}, z, z^{-2}, z^2, \ldots$. Assume that the sequence $\phi_n$ is regular. Then these Laurent polynomials satisfy recurrences
$$\phi_{2k}(z) = (A_{2k}z + B_{2k})\phi_{2k-1}(z) + C_{2k}\phi_{2k-2}(z), \qquad k = 1, 2, \ldots,$$
$$\phi_{2k+1}(z) = (A_{2k+1}z^{-1} + B_{2k+1})\phi_{2k}(z) + C_{2k+1}\phi_{2k-1}(z), \qquad k = 1, 2, \ldots,$$
and $C_n \neq 0$.
Many more details and applications of these Laurent polynomials can be found for example in Refs. [118], [162], [119], [114], and [51]. Note that the latter recurrence relations for regular orthogonal Laurent polynomials are formally obtained from our general recurrences by setting $Z_0 = 1$, $Z_{2k} = z$, and $Z_{2k+1} = z^{-1}$ and $\alpha_{2k} = \infty$ and $\alpha_{2k+1} = 0$. The analog for the unit circle would be obtained with $Z_k(z) =$
z-ak
and $\alpha_{2k} = -1$ and $\alpha_{2k+1} = 1$. This results in orthogonal Laurent polynomials in the variable $(z + 1)/(z - 1)$.

We shall now derive an explicit relation for the coefficient $C_n$ from the recurrence relation. We introduce the expression
$$D_n = \frac{1}{Z_{n-2}(z)} - \frac{1}{Z_{n-1}(z)}.$$
We need the following equality, valid for $n \ge 2$, which can be obtained by some calculations:
$$D_n = -i\,\frac{(\beta - \alpha_\emptyset)(\alpha_{n-2} - \alpha_{n-1})}{(\alpha_\emptyset - \alpha_{n-2})(\alpha_\emptyset - \alpha_{n-1})}. \qquad (11.8)$$
Note that $D_n$ is a constant not depending on $z$. Thus we can use
$$D_n = \frac{1}{Z_{n-2}(\alpha_{n-1})} = -\frac{1}{Z_{n-1}(\alpha_{n-2})}$$
for any $n \ge 2$ when it will be convenient. One application is to replace $Z_{n-2}^{-1}$ by $D_n + Z_{n-1}^{-1}$ so that we can rewrite the recurrence relation (11.3) as
$$\phi_n(z) = Z_n(z)\left(E_n + \frac{B_n}{Z_{n-1}(z)}\right)\phi_{n-1}(z) + C_n\frac{Z_n(z)}{Z_{n-2}(z)}\phi_{n-2}(z), \qquad n = 2, 3, \ldots, \qquad (11.9)$$
with $E_n$ as in (11.4). The definition of $E_n$ as given above will define $E_n$ only when the recurrence relation holds, that is, when the system $\phi_n$ is regular. It is possible to define $E_n$ independent of the regularity of the system as we do in the following lemma.

Lemma 11.1.5. Suppose the expansion of the orthonormal rational functions in the basis $\{b_k(z)\}$ is given by
$$\phi_n(z) = \kappa_n b_n(z) + \kappa_n' b_{n-1}(z) + \cdots.$$
For $n \ge 2$, define
$$E_n = \frac{1}{\kappa_{n-1}}\left[\kappa_n + \frac{\kappa_n'}{Z_n(\alpha_{n-1})}\right]. \qquad (11.10)$$
Then $E_n \in \mathbb{R}$. If the recurrence relation (11.3) holds, so that $A_n$ and $B_n$ are defined, then the $E_n$ of (11.10) coincide with the $E_n = A_n + B_n D_n$ of (11.4). The latter also holds for $n = 1$, if we set by definition $A_1 = \kappa_1$ and $B_1 = \kappa_1'$.

Proof. The $E_n$ are real since $\kappa_n$, $\kappa_n'$, and $\kappa_{n-1}$ are real and also $Z_n(\alpha_{n-1})$ is real for $n \ge 1$. To show that $E_n = A_n + B_n D_n$, we take the recurrence relation (11.3), divide it by $b_n(z)$, and set $z = \alpha_{n-1}$. With the identities (11.2) and using $1/Z_k(\alpha_k) = 0$, the given relation follows immediately. □
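Note that it follows from definition (11.10) and the formula (11.2) that $n$ is a singular index iff $E_n = 0$. This statement does not depend on the existence of the recurrence relation. We can now prove

Lemma 11.1.6. Suppose that the recurrence relation (11.3) holds with coefficients $A_n$, $B_n$, and $C_n$. Let $D_n$ be as defined above and $E_n = A_n + B_n D_n$. Then
$$E_n = -C_n E_{n-1}, \qquad n \ge 2. \qquad (11.11)$$

Proof. Clearly $b_{n-1}/Z_n \in \mathcal{L}_{n-1}$ so that $\langle \phi_n, b_{n-1}/Z_n \rangle = 0$. Using the recurrence relation for $\phi_n$ and $Z_{k*} = Z_k$, we get
$$0 = \left\langle \phi_n, \frac{b_{n-1}}{Z_n} \right\rangle = A_n\langle \phi_{n-1}, b_{n-1} \rangle + B_n\left\langle \phi_{n-1}, \frac{b_{n-1}}{Z_{n-2}} \right\rangle + C_n\left\langle \phi_{n-2}, \frac{b_{n-1}}{Z_{n-2}} \right\rangle.$$
Since $b_{n-1}/Z_{n-2} = D_n b_{n-1} + b_{n-2}$, using $\langle b_{n-2}, \phi_{n-1} \rangle = 0$ and $\langle b_i, \phi_i \rangle = 1/\kappa_i$, we have
$$0 = \frac{A_n}{\kappa_{n-1}} + \frac{B_n D_n}{\kappa_{n-1}} + \frac{C_n}{\kappa_{n-2}} + C_n D_n \langle \phi_{n-2}, b_{n-1} \rangle. \qquad (11.12)$$
The remaining inner product can be evaluated when we use
$$b_{n-1}(z) = \frac{\phi_{n-1}(z)}{\kappa_{n-1}} - \frac{\kappa_{n-1}'}{\kappa_{n-1}} b_{n-2}(z) + \cdots,$$
so that
$$\langle \phi_{n-2}, b_{n-1} \rangle = -\frac{\kappa_{n-1}'}{\kappa_{n-1}\kappa_{n-2}}. \qquad (11.13)$$
When we combine (11.12) and (11.13), and use $D_n = -1/Z_{n-1}(\alpha_{n-2})$, we get
$$0 = \frac{E_n}{\kappa_{n-1}} + \frac{C_n}{\kappa_{n-2}}\left[1 - \frac{D_n\kappa_{n-1}'}{\kappa_{n-1}}\right] = \frac{1}{\kappa_{n-1}}\left[E_n + C_n E_{n-1}\right],$$
which gives us the expression we wanted. □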
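The relations of this section can be tested on a small numerical example. The sketch below is a toy setup, not the construction used in the text: it takes the real line with hypothetical finite poles $\alpha_1, \ldots, \alpha_5$ (and $\alpha_0 = \infty$), a discrete positive measure on $[-1, 1]$, and assumes the recurrence in the form $\phi_n = (A_n Z_n + B_n Z_n/Z_{n-2})\phi_{n-1} + C_n (Z_n/Z_{n-2})\phi_{n-2}$ together with $E_n = [\kappa_n + \kappa_n'/Z_n(\alpha_{n-1})]/\kappa_{n-1}$. The orthonormal functions come from a QR factorization, and $A_n, B_n, C_n$ are recovered by least squares.

```python
import numpy as np

# Hypothetical data: 1/alpha_k for k = 0..5 (index 0 is alpha_0 = infinity).
inv_alpha = np.array([0.0, 1 / 2, -1 / 3, 1 / 4, -1 / 5, 1 / 6])
N = inv_alpha.size - 1

# Discrete positive measure: equal masses on 60 nodes in [-1, 1].
t = np.linspace(-1.0, 1.0, 60)
w = np.full(t.size, 2.0 / t.size)

def Z(k, x):
    """Z_k(x) = x / (1 - x/alpha_k) on the real line."""
    return x / (1.0 - x * inv_alpha[k])

def Z_ratio(n, m, x):
    """Z_n(x) / Z_m(x), written without the common factor x."""
    return (1.0 - x * inv_alpha[m]) / (1.0 - x * inv_alpha[n])

# Basis b_0 = 1, b_n = Z_1 ... Z_n evaluated at the nodes.
B = np.ones((t.size, N + 1))
for n in range(1, N + 1):
    B[:, n] = B[:, n - 1] * Z(n, t)

# Gram-Schmidt through QR of the weighted basis; make the diagonal of R positive.
_, R = np.linalg.qr(np.sqrt(w)[:, None] * B)
R = np.sign(np.diag(R))[:, None] * R
coef = np.linalg.inv(R)              # column n: coefficients of phi_n in {b_k}
Phi = B @ coef                       # phi_n at the nodes
gram = Phi.T @ (w[:, None] * Phi)    # should be the identity
kappa = np.diag(coef).copy()
kappa_prime = np.concatenate(([0.0], np.diag(coef, k=1)))

# E_n from the leading coefficients; 1/Z_n(alpha_{n-1}) = 1/alpha_{n-1} - 1/alpha_n.
E = np.zeros(N + 1)
for n in range(1, N + 1):
    E[n] = (kappa[n] + kappa_prime[n] * (inv_alpha[n - 1] - inv_alpha[n])) / kappa[n - 1]

results = []
for n in range(2, N + 1):
    cols = np.column_stack([
        Z(n, t) * Phi[:, n - 1],
        Z_ratio(n, n - 2, t) * Phi[:, n - 1],
        Z_ratio(n, n - 2, t) * Phi[:, n - 2],
    ])
    (A_n, B_n, C_n), *_ = np.linalg.lstsq(cols, Phi[:, n], rcond=None)
    residual = np.max(np.abs(cols @ np.array([A_n, B_n, C_n]) - Phi[:, n]))
    D_n = inv_alpha[n - 1] - inv_alpha[n - 2]      # 1/Z_{n-2} - 1/Z_{n-1}
    results.append((n, residual, A_n + B_n * D_n, E[n], -C_n * E[n - 1]))

for row in results:
    print(row)
```

Each row lists the index $n$, the residual of the fitted recurrence, and three values that should agree if the example is regular: $A_n + B_n D_n$, the $E_n$ computed from $\kappa_n, \kappa_n'$, and $-C_n E_{n-1}$.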
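For solutions of the recurrence relation, we can prove a general summation theorem. To formulate this theorem, we define for $n \ge 1$
$$H(z, w) = \frac{1}{Z_{n-1}(z)} - \frac{1}{Z_{n-1}(w)} \qquad (11.14)$$
and find after some computations that
$$H(z, w) = \begin{cases} i\,\dfrac{(\beta - \alpha_\emptyset)(z - w)}{(z - \alpha_\emptyset)(w - \alpha_\emptyset)} & \text{for } \partial\mathbb{O}, \\[2mm] -i\,\dfrac{z - w}{(w - 1)(z - 1)} & \text{for } \mathbb{T}, \\[2mm] -\dfrac{z - w}{zw} & \text{for } \mathbb{R}. \end{cases}$$
Note that this expression does not depend on $n$. Furthermore, we set
$$H_n(z, w) = \frac{1}{Z_{n-2}(w)Z_{n-1}(z)} - \frac{1}{Z_{n-2}(z)Z_{n-1}(w)}$$
and hence
$$H_n(z, w) = \left[\frac{1}{Z_{n-2}(z)} - \frac{1}{Z_{n-1}(z)}\right]H(z, w) = D_n H(z, w). \qquad (11.15)$$
We now have the following analog of Theorem 4.3.1.

Theorem 11.1.7. Let $x_n(z)$ and $y_n(z)$ be two solutions of the recurrence relation (11.3) and define
$$F_n(z, w) = \frac{x_n(w)\,y_{n-1}(z)}{Z_n(w)Z_{n-1}(z)} - \frac{y_n(z)\,x_{n-1}(w)}{Z_n(z)Z_{n-1}(w)}.$$
Then, with $H(z, w)$ as in (11.14) and $E_n$ as in (11.10),
$$F_n(z, w) = y_{n-1}(z)x_{n-1}(w)H(z, w)E_n - C_n F_{n-1}(z, w)$$
$$= H(z, w)E_n\left[\sum_{k=1}^{n-1} y_k(z)x_k(w)\right] + (-1)^{n-1}C_n C_{n-1}\cdots C_2\,F_1(z, w).$$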
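Proof. We use the recurrence relation for $x_n$ and $y_n$ in the definition of $F_n(z, w)$, which gives
$$F_n(z, w) = A_n x_{n-1}(w)y_{n-1}(z)\left[\frac{1}{Z_{n-1}(z)} - \frac{1}{Z_{n-1}(w)}\right] + B_n x_{n-1}(w)y_{n-1}(z)\left[\frac{1}{Z_{n-2}(w)Z_{n-1}(z)} - \frac{1}{Z_{n-2}(z)Z_{n-1}(w)}\right] - C_n F_{n-1}(z, w).$$
Using the expressions (11.14) and (11.15), we find
$$F_n(z, w) = x_{n-1}(w)y_{n-1}(z)H(z, w)[A_n + B_n D_n] - C_n F_{n-1}(z, w) = x_{n-1}(w)y_{n-1}(z)H(z, w)E_n - C_n F_{n-1}(z, w).$$
An induction argument, together with $E_n = -C_n E_{n-1}$ from Lemma 11.1.6, leads to the result. □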
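The two facts used in this proof, that $H(z,w)$ does not depend on the index and that $H_n = D_n H$ with $D_n$ independent of the evaluation point, can be checked numerically. The sketch assumes the sign convention $H(z,w) = 1/Z_k(z) - 1/Z_k(w)$ and the explicit forms of $1/Z_k$ for the circle ($\alpha_\emptyset = 1$) and the line ($\alpha_\emptyset = 0$); the poles are arbitrary illustrative values and the helper names are not from the text.

```python
import cmath

def inv_Z_circle(x, alpha):
    """Assumed 1/Z_k(x) on the circle: (alpha - x) / (i (1 - x)(alpha - 1))."""
    return (alpha - x) / (1j * (1 - x) * (alpha - 1))

def inv_Z_line(x, inv_alpha):
    """Assumed 1/Z_k(x) on the line: 1/x - 1/alpha."""
    return 1 / x - inv_alpha

z, w = 0.3 + 0.2j, -0.4 + 0.5j
circle_poles = [cmath.exp(1j * th) for th in (0.7, 2.0, 3.5, 5.1)]
line_inv_poles = [0.5, -1 / 3, 0.0, 0.25]       # 0.0 stands for a pole at infinity

# H computed with each pole in turn; all entries should coincide.
H_circle = [inv_Z_circle(z, a) - inv_Z_circle(w, a) for a in circle_poles]
H_line = [inv_Z_line(z, a) - inv_Z_line(w, a) for a in line_inv_poles]
H_circle_closed = -1j * (z - w) / ((z - 1) * (w - 1))
H_line_closed = -(z - w) / (z * w)

def H_n(inv_Z, pole_nm2, pole_nm1):
    """1/(Z_{n-2}(w) Z_{n-1}(z)) - 1/(Z_{n-2}(z) Z_{n-1}(w))."""
    return (inv_Z(w, pole_nm2) * inv_Z(z, pole_nm1)
            - inv_Z(z, pole_nm2) * inv_Z(w, pole_nm1))

def D_n(inv_Z, pole_nm2, pole_nm1, x):
    """1/Z_{n-2}(x) - 1/Z_{n-1}(x); expected to be the same for every x."""
    return inv_Z(x, pole_nm2) - inv_Z(x, pole_nm1)

Hn_circle = H_n(inv_Z_circle, circle_poles[0], circle_poles[1])
Dn_circle_z = D_n(inv_Z_circle, circle_poles[0], circle_poles[1], z)
Dn_circle_w = D_n(inv_Z_circle, circle_poles[0], circle_poles[1], w)
Hn_line = H_n(inv_Z_line, line_inv_poles[0], line_inv_poles[1])
Dn_line_z = D_n(inv_Z_line, line_inv_poles[0], line_inv_poles[1], z)

print(H_circle, H_circle_closed)
print(H_line, H_line_closed)
```

With the opposite sign convention for $H$ every closed form simply changes sign; the independence of the index is not affected.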
It is possible to derive from this formula the Christoffel-Darboux type formulas given below. However, this would require that the system $\phi_n$ be regular, since it is based on the existence of the recurrence relation. It is possible, however, to prove the Christoffel-Darboux formulas without using the recurrence relation and only relying on the orthogonality properties of the $\phi_n$. This is what we shall do in Section 11.3. We first introduce the functions of the second kind in the next section.

11.2. Functions of the second kind

As we did before, we shall associate with our orthonormal functions $\phi_n$ some functions $\psi_n \in \mathcal{L}_n$, which are called functions of the second kind. They are defined by
ψn(z) = Mt{D(t, z)[
!
2
^
where we used the notation $M_t$ to indicate that $M$ works on the argument as a function of $t$, considering $z$ as a parameter. We first want to show that these functions $\psi_n$ also satisfy the recurrence relation (11.3) whenever the $\phi_n$ do.

Theorem 11.2.1. Suppose that the system of orthogonal rational functions $\phi_n$ is regular and let $\psi_n$ be the functions of the second kind associated with them. Then these $\psi_n$ satisfy the same recurrence relation (11.3) as the $\phi_n$.
Proof. We use the recurrence relation for >n(t) and >n(z) in the definition of ij/n. This gives for n > 2 = AnMt{D(t,z)[Zn{t)4>n-\{t)
- Zn(z)
+ BnMt D(t,
^iz)
+ Cn -
+ M,{D(t, z)fn(t, z)} + 8n2^-C2,
(11.16)
with /nft, Z) = An[Zn{t) - Zn(z)](pn-l(t)
We note that Zn(t)-Zn(z)=iand Zn(t) Z n _ 2 ft)
Zn(z) Z n _ 2 (z) ft - an)(z - oin)(oi0 -
an-2)'
Therefore, /n(f,Z)
7 1 7 7 7
A
A
+ an~2~an[Bn(j>n.x{t) «0 - Otn-2
i(«0a
n—o
w
),
..
0n-lft)
+ Cn0n_2ft)]l . J
Note that in the argument of the linear functional in (11.16), the factor $t - z$ in the numerator of $f_n(t, z)$ cancels against the same factor in the denominator of $D(t, z)$.
Next we split D(t, z) as Dx(t, z) + D2(t,z): (t-an D(t, z) = {
1
t-z -^(t-an)
for T,
t-z
t-z
zan) t-z
I
t-z
t-z
with cn — \ and c'n = z + an for T, whereas cn = — \z and c'n = — i( 1 + zotn) for R. Using the orthogonality of the >£, we find
{
forn > 3,
0
(z -- aa2)(a 0 - a0) ^ (oi . 0 2)(a0 - a2) c C n = 2.to write 2 2 For the second term with D2(t, z), we use again the recurrencefor relation ( )( )
fn(t,z) =
z-an a 0 - an-2
(Bn(t>n-l(t) + Cn
J
Again, by the orthogonality of the c/)k, we get
{
0 (z-a2)(a0
-a0)
for n = 2.
This proves the recurrence relation for n > 3 directly. For « = 2, we can put together all the terms involved to find that the recurrence relation is again satisfied because =rzrC2 + c 2 C 2 2 C + + c2C2 2 Z0(P) (z - a2)(cx0 - do) (z -
- a2
=0.
In analogy with Lemma 4.2.2, we can prove Lemma 11.2.2. Let
In particular, we can take for / any function in Cn-\ or we can take it of the form f(t) = g(t)(t — an)/(t + z) for the case of T or of the form f(t) = g(t)(t - an)/{\ + tz) for the case of R, where g e Cn. We shall now give a Liouville-Ostrogradskii type determinant formula. Note that here we do not assume the system {4>n} to be regular. Theorem 11.2.3 (Determinant formula). Let (j)n be the orthonormal functions and tyn the functions of the second kind. Define _ jfn(w) (J)n-\(Z)
0nfe) V^-lO)
Zn(w) Zw_i(z)
Zn(z)
Zn-i(w)'
Then for n > 1 we have, with En as in (11.10), Fn(z,z) = -EnZ0(P)
'
*
or more explicitly, ,,2l\7En (l - z)2 Proof. We note that Fn(z,z)=
forT
and
Fn(z, z) = - ^
\Z z1
)
En
forR.
Multiply with D(t, z) and apply Mf to get for the left-hand side:
for the right-hand side, we find
Note that in the second term D(t, z)[(pn-\(O — >n-i(z)] € Cn-\9 so that this term is zero by the orthogonality of >n. To compute the first term, we write h{t) = D(t, Z)[
and
Then, by the orthogonality of 0w_i = 0(n_i)*, M{(j)n-ih} = ynM{(/)n-\bn} + y ^ Because 0^ = /c^Z?^ + Kfkbk-\ H <
M{(j)n-\bn} =
, it follows by orthogonality that —
1
and
Thus we obtain
YnKn
+ D(a n _i, z)En -
= EnZn(<xn-i)[D(an-i,
z) - D(an, z)].
Working this out gives the result. Indeed, for the case T we have D(t, z) - D(w, z) =
z t —z
w+z w—z
2z(w -1) (t - z)(w - z)
With t =<xn-\ and w = an, this shows that M{0n_i/z} = EnZn(an-i)-
(an -z)(an-\
For the case R, we have D(t, z) - D(w, z) = - i
-z)
=
(1 -z)2
-EnZn(z)Zn-l(z).
With t = an-\ and w = an, this gives **
/i
-Z7 rj
(an
(
M{(f)n-ih} = -iEnZn(an-i)-
- an-i)(l
— (an -z)(an-i
jq + z 2 ),, 7 , . „ ,, -, EnZn(z)Zn-i(z). z2
= This completes the proof.
+
z2)
-z)
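11.3. Christoffel-Darboux relation

We shall now derive the Christoffel-Darboux relation for the boundary situation.

Theorem 11.3.1 (Christoffel-Darboux relation). Let $\phi_n$ be the orthonormal functions in $\mathcal{L}_n$ and let $H(z, w)$ and $E_n$ be defined by (11.14) and (11.10). Then
$$\frac{\phi_n(w)\phi_{n-1}(z)}{Z_n(w)Z_{n-1}(z)} - \frac{\phi_n(z)\phi_{n-1}(w)}{Z_n(z)Z_{n-1}(w)} = H(z, w)E_n\sum_{k=0}^{n-1}\phi_k(z)\phi_k(w).$$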
Consequently, if we let z —> w, then we obtain
where
Mz) = and the prime means derivative. Proof. Define
Fn(z, w) = gn(z, w) - gn(w, z).
Set w
Fn(z,w) H(z, w) c = l
=
cF
(P — a 0 ) 2 08 - a
z—w gn(z, w) = (j)n{w)(j)n-x{z){z - an-i)(w - an). We have to prove that F(w) = YTkZo Yk(z)
F(w) = ^ n ( z ) 0 i k ( ^ ) »
Yk(z) = Yk(z)c,
yk(z) = M{F(pk}.
k=0
But h(z) = M{F(/)k} = MW{F (w)[(f)k(w) - 4>k(z)]} + 4>k(z)M{F}. For the first term we find MW{F (W)[(f)k(w) -
q{ = M { j }
()
l
z-w
Hence, by orthogonality qt = 0 and thus yk{z) = (/)k(z)M{F}. It remains to show that M{F) = cM{F} = En. Note that M{F} = ( z - a B _ i ) ^ i ( z ) / B ( z ) - (z -« n )0 n (z)/ w _i(z), fi(z) = M For 3O = T we add and subtract (z - an)(z - a n _i)0 n _i(z)0 n (z)D(z, w) — . 2z This leads to M{F} = (z - an-i)4>n-i(z)hn(z) - (z - an)(/)n(z)hn^(z), where, using Lemma 11.2.2 z + iv
(11.17)
For 9O = R we add and subtract
(z - an)(z - an-i)(pn-i(z)(pn(z)D(z, w)-
1+z2'
which leads to the same formula (11.17), but now with f
[ W — (%i
ht(z) =iMw < D(z,w) \-
{
[l+zw
Z — OL[
4>i(w) -
1+z
2
1 1
Z ~ Oti
«0i(z) } = -~
JJ
1 + z2
^fi(z).
Or, in an invariant notation,
Therefore,
M{F} =
-i^±^~
Finally, using the determinant formula of Theorem 11.2.3 to substitute inside the square brackets, we obtain after some computations
which is what remained to be shown. The proof for the confluent formula is obtained by dividing out $H(z, w)$ and letting $w \to z$. We leave the details to the reader. □

Note that for the case of the real line and when all $\alpha_i$ are at infinity, we should find the Christoffel-Darboux relation for polynomials on the line as a special case. Noting that then $Z_k(z) = z$ and $1/Z_k(\alpha_i) = 0$, we get the classical relation for polynomials
$$\phi_n(w)\phi_{n-1}(z) - \phi_n(z)\phi_{n-1}(w) = (w - z)\frac{\kappa_n}{\kappa_{n-1}}k_{n-1}(z, w),$$
where kn-\(z, w) = YllZo 0^fe)0^(^) is the reproducing kernel for Cn-\ = Vn-\- For the confluent formula, the factor zu£(z)/m£(l3) is in the polynomial case just 1, so that fk is equal to the orthogonal polynomial fa, and \UJ^ (/?) = — 1. Thus we find again a classical formula. A bivariate generalization of the determinant formula is formulated as follows. Theorem 11.3.2. Let 0 n be the orthogonal functions for Cn and \jrn the associatedfunctions of the second kind. Let H(z, w) be as in (11.14), En as in (11.10),
and D(z, w) the usual Riesz-Herglotz-Nevanlinna kernel. Then Vn-\
Zn(w)Zn-i(z)
H(z,w)En.
*
lk=l
Note that for w —> z, this reduces to the determinant formula. Proof. Define as in the previous proof gn(z, w) = (w - an)(z - an-x)(j)n( and set G(z, w) = gn(z, w) — gn(w, z). Now apply Mt to the expression DO, w)
F(t,z,w) =
t +w
-G{t,z)
2w
G(w,z)
forT,
w2)
- i ( l +tw) Then we obtain by Lemma 11.2.2
Mt{F(t, z, w)} = (z - otn)(w - a n _i)0 n -(w - an)(z - a n _i) But, noting that with the notation of the previous theorem G(z, w) = (z- w)F (w) =
cH(z, w)
Zn(z)Zn-i(w)\
we find by the Christoffel-Darboux formula that n-l
G(z, w) = k=0
with ,
d =
z-w
,(
= —l
Using this expression for G(z, w) in the definition of F(t, z, w) leads as before by Lemma 11.2.2 to the following result in the case that 3O = T: n-\
Mt{F(t,z,w)} = -En c
(z + w) lk=l [n-\
= -dEn
~ D(z,w) lk=l
For 3O = R the final result is the same, but in the derivation, the last term (z + w) at the end of the first line is replaced by —i(l +zw). Equating the two expressions for Mt{F(t, z, w)} gives a formula that is equivalent to the formula that had to be proved. For z = w, the determinant formula follows because H(z, z) = 0 whereas H(z, w)D(z, w) gives indeed the required expression since the z — w in the numerator of H(z, w) cancels against the z — w in the denominator of D(z, w).
• Also for the functions of the second kind, we can obtain Christoffel-Darboux relations by applying the trick of the previous proof once more. Theorem 11.3.3. Let \//n be the functions of the second kind. Then, in analogy with the Christoffel-Darboux relation, we have 1 ( \ I ( \ rn\UJ)yfn—\\Z)
Zn(w)Zn-i(z)
~n-\
1 ( \ j ( \ Yn\Z)Yn—\\W)
~-H(z, it*) En
Zn(z)Zn-i(w)
with H(z, w) as in (11.14) and En asin (11. 10). Proof. We set now G(z, w) = (w — an)(z
- Un- l)4>n(U)
— (z — an)(w — an-i)(/)n-\(w)'^n(z)With this G(z, w), we define F(t, z, vo) as in the previous proof. By applying Mt on F(t,z,w) we find with the help of Lemma 11.2.2 Mt{F(t, z, w)} = (w - an)(z -(z-
an)(w -
an-i)irn(w)\l/n-i(z) an-i)\l/n-i(w)\lrn(z).
However, we have by the previous theorem (we use the notation introduced there) G(z, w) = dEn
^ .k=\
Substituting this in the definition of F(t, z, w) and applying Mt to it gives, as in the previous proof, Mt{F(t, z, w)} = dEn
^2 lk=i
Setting the two expressions for Mt{F(t, z, w)} equal to each other gives a formula that is equivalent with the required identity. • Note that again here we can let z —> w, to obtain a formula containing derivatives just as in the Christoffel-Darboux case. Finally, we give an identity that is obtained by combination of the three previous theorems. Theorem 11.3.4. Let
Zn(w)Zn-i(z)
Zn(z)Zn-i(w) ~n—\
= H(z, w)En Y^ Xk(z; s)Xk(w; t) + [st - 1 + D(z, w)(t - s)] .k=i
with $H(z, w)$ as in (11.14) and $E_n$ as in (11.10).
Proof. This is obtained directly by working out the left-hand side and using the three previous theorems. •

11.4. Green's formula

We can give a complex version of the formulas given in the previous section. The same method is used in the proofs, with complex conjugates at the appropriate places. Therefore, we shall not include the details but only give the main steps. First we need the analogs of the expressions (11.14) and (11.15). They are
1 Zn(z)
1 Zn(w)
- i ( l - zw) ^-
, for forT,
(11.18)
zw This relation holds for any n > 0. Note that H (z, w) = H*(z, w), where the substar is with respect to z. For w = z, we have H (z, z) = 2P(«0, z)/Zo(/O, where P(t,z) is the Poisson kernel.
Also, for n > 2,
= DnH(z,w). (11.19) We now give without proof the following complex analog of Theorem 11.1.7. Theorem 11.4.1 (Green's formula). Let $x_n(z)$ and $y_n(z)$ both be solutions of the recurrence relation (11.3) and define
Then, with H (z, w) as in (11.18) and En as in (11.10), Gn(z, w) = yn-i(z)xn-i(w)H
(z, w)En - CnGn-i(z, w) H (z, w)En + (-l) B C n C n _i • • • C 2 Gi(z, u;).
,k=\
This result holds because the numbers $A_n$, $B_n$, and $C_n$ are real. As a corollary of this formula, one can find the summation formulas in the next theorem. However, when derived from the Green formula, they would only be proved for the case of a regular system, that is, when the recurrence relation holds. With the techniques of the previous section, the same result can be derived using only the orthogonality of the $\phi_n$. However, a much simpler technique is to take the substar of the corresponding relations in Theorems 11.3.1-11.3.3 of the previous section, taking into account that $\phi_{n*} = \phi_n$ and $\psi_{n*} = -\psi_n$.
We give the result without further proof. Theorem 11.4.2. With the notation for H (z,w) and En of the previous theorem and with 4>n the orthonormal functions and \jrn the functions of the second kind, we get (4>n(M>)\ (
E
U=o
H(z,w)En
*n-l(.Z)\
( \Zn(w)
(fnjZ)
n-\
H(z,w)En, U=i
zn(Z)
zn(w)
H(z,w)En, k=l
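where in the last equation, the substar is with respect to $z$. Note that this also holds for $w = z$. In that case
$$D_*(z, z) = \frac{1 + |z|^2}{1 - |z|^2} \ \text{ for } \mathbb{T} \quad\text{and}\quad D_*(z, z) = i\,\frac{1 + |z|^2}{z - \bar z} \ \text{ for } \mathbb{R},$$
which corresponds to (recall that $\zeta_\beta(z) = z$ for $\mathbb{T}$ and $\zeta_\beta(z) = (z - i)/(z + i)$ for $\mathbb{R}$)
$$D_*(z, z) = \frac{1 + |\zeta_\beta(z)|^2}{1 - |\zeta_\beta(z)|^2}.$$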
The previous relations can be combined as in the real case to give the following:
Theorem 11.4.3. Let $\phi_n$ be the orthonormal functions and $\psi_n$ the functions of the second kind. For any complex $s$, we set
$$X_n(z; s) = \psi_n(z) + s\phi_n(z).$$
Then, for arbitrary complex $s$ and $t$,
V Zn(w)
Zn(z)
\0
n-\
\ t)Xk(z; s) +
H(z,w)En.
l+st-(s
k=l
In particular, for $z = w$ and $s = t$,
Zn(z)~
t'Xn-i(z;s)'
Xn-i(z;s) Zn-i(z)
n-l .k=l
(11.20)
where
^K-
(11.21)
Proof. By the previous theorem, we can evaluate the left-hand side and obtain the first formula. For z = w and s = t, this equals \Xk(z; s)\2 lk=l
J
H (z, z)En + EnY{z\ s),
with - (s + s)D*(z, z) + \s\2] = H (z, z)\\ - s\2 + (j + )£ (z, z)[l - D*(z,
z)l
In the case of T we get ^
(z, z)[l - D*(z, Z)] = l -
^ 2
^ r-T 2
~
U-zl U-kl
- 1 = 1
J
2|z| 2
U-.
which gives the result. Similarly, we find for R /
NF1
n /
M
Z
^
(Z, Z)[l - D*(Z, Z)] = —r-
[
.1
+
-1—
IZ!
1
1
ki2 L z - 2 z - z + i(l2 + lzl2) i k ~ i2l kl ~ ' \z\
In both cases we get the last term in (11.20).
•
11.5. Quasi-orthogonal functions

We shall introduce in this section the quasi-orthogonal functions, which are the analogs of the para-orthogonal functions in the situation where the $\alpha_k$ are not on the boundary. Let $\phi_n$ be the orthonormal functions with poles on the boundary $\partial\mathbb{O}$. We define the quasi-orthogonal functions as
$$Q_n(z, \tau) = \phi_n(z) + \tau\,\frac{Z_n(z)}{Z_{n-1}(z)}\,\phi_{n-1}(z), \qquad \tau \in \mathbb{R},\ n \ge 1. \tag{11.22}$$
We set by definition
$$Q_n(z, \infty) = \frac{Z_n(z)}{Z_{n-1}(z)}\,\phi_{n-1}(z).$$
Our immediate aim is to prove that $Q_n(z, \tau)$ has $n$ simple zeros that are all on $\partial\mathbb{O}$. We shall need several lemmas before we are able to prove this. For the formulations to follow, it will be convenient to say that a polynomial $p_n \in \mathcal{P}_n$ has a zero at $z = \infty$ when $p_n^*$ has a zero at $z = 0$, that is, when its degree is less than $n$. Thus a polynomial is understood to have a zero at $\infty$ when its degree is defective. So, in the proofs below, where it is allowed that a polynomial has a zero at $\infty$, the reader should modify the proofs given for finite zeros whenever necessary. Usually this is dealt with by setting, in a factorization $p_n(z) = a(z - t_1) \cdots (z - t_n)$, the factor $z - t_i$ equal to 1 for $t_i = \infty$. The following lemma is a simple observation.

Lemma 11.5.1. Let $Q_n(z, \tau) = p_n(z, \tau)/\pi_n^*(z)$ be a quasi-orthogonal function. Then we have

2. If $\xi \in \mathbb{C} \setminus \partial\mathbb{O}$ is a zero of order $m$ for $p_n(z, \tau)$, then $\hat\xi$ is also a zero of order $m$.

Proof. We check each of the claims. 1. This equation is obvious from the definition. 2. Since $\xi \notin \partial\mathbb{O}$, it is a zero of $p_n(z, \tau)$ iff it is a zero of $Q_n(z, \tau)$. For a zero $\xi$ that is not 0 or $\infty$, with multiplicity $m = 1$, the result is a direct consequence of the previous relation. Assuming $m > 1$, it is clear that $h(z) = Q_n(z, \tau)/[(z - \xi)(z - \hat\xi)]$ will again have the property that $h(\xi) = 0$ implies $h(\hat\xi) = 0$. A simple adaptation of this argument will show that the result also holds for $\xi = 0$ or $\infty$. •

We are now in a position to prove the following:

Lemma 11.5.2. Let $Q_n(z, \tau) = p_n(z, \tau)/\pi_n^*(z)$ be a quasi-orthogonal function. Then the polynomial $p_n(z, \tau)$ has $n$ simple zeros on $\partial\mathbb{O}$.

Proof. For notational reasons, it is awkward to give an invariant proof. Therefore we give separate derivations for the case $\mathbb{R}$ and the case $\mathbb{T}$.

1. The case $\mathbb{R}$. This is the simplest case. Let $t_1, \ldots, t_l$ be all the zeros lying on $\mathbb{R}$ that have odd multiplicity $2m_i + 1$, with $m_i \ge 0$. All the other zeros (on $\mathbb{R}$ or not) will come in pairs $(\xi_i, \bar\xi_i)$, $i = 1, \ldots, k$. The couple of zeros $(\xi_i, \bar\xi_i)$ is repeated if its order is larger than 1. (We recall that $t_i$ can be $\infty$, in which case we replace $z - t_i$ by 1 in what follows, and a similar observation holds for $\xi_i = \infty$.) Thus there is some nonzero constant $c$ such that
$$p_n(z, \tau) = c \prod_{i=1}^{l} (z - t_i)^{2m_i + 1} \prod_{i=1}^{k} [(z - \xi_i)(z - \bar\xi_i)].$$
Set $M = m_1 + \cdots + m_l$. Then $n = l + 2M + 2k$. We have to prove that all $m_i = 0$ and $k = 0$, or, equivalently, because all these numbers are nonnegative integers, that $M + k \ge 1$ leads to a contradiction. Assume therefore that $M + k \ge 1$. Define the function
$$T(z) = \frac{(z - t_1) \cdots (z - t_l)(z - \alpha_n)}{(z - \alpha_1) \cdots (z - \alpha_{n-1})}.$$
We shall prove first that $T$ is orthogonal to $Q_n$. Because $l + 1 = n - 2(M + k) + 1 \le n - 1$, it follows that $T \in \mathcal{L}_{n-1}$. Thus $\langle T, \phi_n \rangle_M = 0$. Clearly
so that / \
Z
" Q z-dn
g > t 1
\
IM
= 0.
(11.24)
Therefore (use the definition of $Q_n(z, \tau)$), we obtain $\langle T, Q_n \rangle_M = 0$. This is, however, impossible because (make use of (11.23))
where U7(
,
(z - ^i) m i + 1 • • • (z -
Thus we come to a contradiction, and this means that $M + k = 0$, that is, all $m_i = 0$, $i = 1, \ldots, l$, and $k = 0$. Thus all the zeros are simple and are in $\mathbb{R}$.

2. The case $\mathbb{T}$. The difficulty here is that the substar transformation introduces powers of $z$, which should be taken care of by considering the multiplicity of $z = 0$ as a zero of $p_n(z, \tau)$. As in part 1, we let $t_i$, $i = 1, \ldots, l$, denote the zeros of $p_n(z, \tau)$ of odd multiplicity $2m_i + 1$ that are on $\mathbb{T}$. Let $z = 0$ be a zero of multiplicity $m$. Hence, by Lemma 11.5.1, also $z = \infty$ will be a zero of order $m$ for $Q_n(z, \tau)$, that is, $p_n(z, \tau)$ will have degree at most $n - m$. All the other zeros will come in (possibly repeated) pairs $(\xi_i, 1/\bar\xi_i)$, $i = 1, \ldots, k$. Thus, again setting $M = m_1 + \cdots + m_l$, we have $2k + 2M + 2m + l = n$. We have to prove that $M + m + k \ge 1$ leads to a contradiction. Define the function
$$T(z) = \frac{(z - t_1) \cdots (z - t_l)(z - \alpha_n)\, z^{M+m+k-1}}{(z - \alpha_1) \cdots (z - \alpha_{n-1})}.$$
If $M + k + m \ge 1$, then the numerator has degree $l + M + m + k = n - (M + m + k)$
r) =
1=1
and also z - an = -(z - an)*/(zotn), so that we may write again (T, QH)M — c f(W, W)M / 0, where d is some (nonzero) constant and =
(z - ; p m i + 1 " ' ( z - ^/) m / + 1 q - g o - • • (z - ^ ) (z - « i ) - - - ( z
-an_i)
Q
This contradicts $\langle T, Q_n \rangle_M = 0$, so that we may conclude that $M + m + k = 0$ and hence that $m = k = 0$ and all $m_i = 0$, $i = 1, \ldots, l$. Thus all the zeros are on $\mathbb{T}$ and they are simple. •

Note that the previous lemma holds for any real value of $\tau$ and thus also for $\tau = 0$, so that we have also proved that the numerators $p_n(z) = p_n(z, 0)$ of the orthogonal rational functions $\phi_n$ have $n$ simple zeros, which are all on $\partial\mathbb{O}$. The problem is that some of these zeros may coincide with one of the $\{\alpha_k\}_{k=1}^{n-1}$, and this cannot be excluded. Since the zeros of $p_n$ are simple, such a zero will cancel against the corresponding factor in the denominator of $\phi_n$, and so $\phi_n$ will not have a zero at that point. We have, however, the following lemma.

Lemma 11.5.3. If $\phi_n$ is regular, then the numerators of $\phi_n$ and $\phi_{n-1}$ have no common zero.

Proof. Set $\phi_n(z) = p_n(z)/\pi_n^*(z)$. The adaptations for the real line when some of the $\alpha_k$ are $\infty$ can be made as mentioned above. We leave the details to the reader. We have to prove that a common zero $w$, that is, $p_n(w) = p_{n-1}(w) = 0$, leads to a contradiction. Obviously $w \ne \alpha_n$, for otherwise $\phi_n$ would be in $\mathcal{L}_{n-1}$. For similar reasons, $w$ cannot be $\alpha_{n-1}$. It turns out that $w$ can only be one of $\alpha_1, \ldots, \alpha_{n-2}$. Indeed, if $w$ were in $\partial\mathbb{O} \setminus \{\alpha_1, \ldots, \alpha_n\}$, then, recalling the definition of $f_k$ from the confluent form of the Christoffel-Darboux Theorem 11.3.1, this would mean that $f_n$ and $f_{n-1}$ had a common zero. Thus the left-hand side in the confluent Christoffel-Darboux formula would vanish for $z = w$, and this is impossible since the right-hand side is not zero because $E_n \ne 0$ if $\phi_n$ is regular. Thus if $p_n(w) = p_{n-1}(w) = 0$, then the only possibility is that $w \in \{\alpha_1, \ldots, \alpha_{n-2}\}$. Assume $w = \alpha_k$ with 1
7
/y
z-ak
e Cn-\
and 0n_i(z)
7
/y
z-otk
e Cn-\.
However, if we denote h(z) = (z — an)/(z — oik), then
because $\phi_{n-1}h \in \mathcal{L}_{n-1} \perp \phi_n$. Therefore, $h\phi_n \in \mathcal{L}_{n-2}$. If we denote the $n - 1$ zeros of $p_n(z)$ that are not equal to $\alpha_k$ by $t_1, \ldots, t_{n-1}$, then
$$h(z)\phi_n(z) = c\,\frac{(z - t_1) \cdots (z - t_{n-1})(z - \alpha_n)}{(z - \alpha_1) \cdots (z - \alpha_n)} \in \mathcal{L}_{n-2},$$
with $c$ a nonzero constant, and this can only be when $\alpha_{n-1} \in \{t_1, \ldots, t_{n-1}\}$. Thus $p_n(\alpha_{n-1}) = 0$, and this contradicts the regularity of $\phi_n$. •
Now we turn to the zeros of the quasi-orthogonal functions.

Lemma 11.5.4. Suppose that $\phi_n$ is regular and let $Q_n(z, \tau)$ be a quasi-orthogonal function. Then for a fixed $w \in \partial\mathbb{O}$, the numerator of $Q_n(z, \tau)$ will vanish at $z = w$ for exactly one $\tau \in \overline{\mathbb{R}}$.

Proof. Set $Q_n(z, \tau) = p_n(z, \tau)/\pi_n^*(z)$ and $\phi_n(z) = p_n(z)/\pi_n^*(z)$. Then
Suppose $p_n(w, \tau) = 0$. If $w = \alpha_{n-1}$, then $\tau = \infty$ is the only possibility because otherwise $p_n(\alpha_{n-1})$ would be zero, contradicting the regularity of $\phi_n$. For $w \ne \alpha_{n-1}$, there are two possibilities: either $p_{n-1}(w) \ne 0$, and then there is a unique finite $\tau$ making $p_n(w, \tau) = 0$, or $p_{n-1}(w) = 0$, in which case $\tau = \infty$ is the only possibility because for any finite $\tau$ we would have $p_n(w, \tau) = p_n(w)$, and this cannot be zero by Lemma 11.5.3. •

The following result is an immediate consequence, which is obvious without proof.

Lemma 11.5.5. Assume that $\phi_n$ is regular and let $Q_n(z, \tau) = p_n(z, \tau)/\pi_n^*(z)$ be a corresponding quasi-orthogonal function. Then, except for at most $n + 1$ values of $\tau \in \overline{\mathbb{R}}$, none of the points in $\{\alpha_\emptyset, \alpha_1, \ldots, \alpha_n\}$ are zeros of $p_n(z, \tau)$.

We shall call the values of $\tau$ for which the conclusion of the previous lemma is true regular values of $\tau$. Note that $\tau = \infty$ can never be a regular value, because a regular value implies that $p_n(\alpha_{n-1}, \tau) \ne 0$, whereas by definition this is always zero for $\tau = \infty$. Excluding the zeros $\{\alpha_\emptyset, \alpha_1, \ldots, \alpha_{n-2}, \alpha_n\}$ gives another set of $n$ conditions that exclude at most $n$ finite values as being regular. Thus all finite real values of $\tau$, except at most $n$, are regular. We shall call $Q_n(z, \tau)$ regular when $\phi_n$ is regular and $\tau$ is a regular value for $Q_n(z, \tau)$.

From Lemmas 11.5.2 and 11.5.5, and recalling that all except at most $n$ values of $\tau \in \mathbb{R}$ are regular values for $Q_n(z, \tau)$, we can formulate the following:

Corollary 11.5.6. Assume that $Q_n(z, \tau)$ is regular, and thus that the orthogonal function $\phi_n$ is regular and that $\tau$ is a regular value. Then $Q_n(z, \tau)$ has $n$ simple zeros, all lying on $\partial\mathbb{O} \setminus \{\alpha_\emptyset, \alpha_1, \ldots, \alpha_n\}$.

Proof. This follows immediately from the previous lemmas and the definition of a regular value $\tau$. •
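The statements of this section can be explored numerically in the classical polynomial case on the real line. The sketch below is only an illustration under stated assumptions: it uses Legendre polynomials $P_k$ on $[-1, 1]$ (the situation with all poles at $\infty$, not the general rational case), the function names are hypothetical, and its parameter tau is only proportional to the parameter of the quasi-orthogonal functions because the $P_k$ are not normalized. It counts the real zeros of $P_n + \tau P_{n-1}$ by sign changes on a grid, which is adequate for the small $n$ used here but is not a general-purpose root counter, and it computes the single finite $\tau$ for which a prescribed point $w$ is a zero.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def quasi(n, x, tau):
    """Quasi-orthogonal polynomial P_n(x) + tau * P_{n-1}(x)."""
    return legendre(n, x) + tau * legendre(n - 1, x)


def tau_vanishing_at(n, w):
    """The unique finite tau with quasi(n, w, tau) == 0, assuming P_{n-1}(w) != 0."""
    return -legendre(n, w) / legendre(n - 1, w)


def count_real_zeros(n, tau, steps=20000):
    """Count sign changes of quasi(n, ., tau) on a grid over [-R, R].

    R = 2 + |tau| is a generous bound on the zeros for the small n used here.
    Grid points where the value is exactly zero are skipped.
    """
    radius = 2.0 + abs(tau)
    changes, last_sign = 0, 0
    for i in range(steps + 1):
        value = quasi(n, -radius + 2.0 * radius * i / steps, tau)
        sign = (value > 0) - (value < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                changes += 1
            last_sign = sign
    return changes


print([count_real_zeros(3, tau) for tau in (-2.0, -0.5, 0.0, 0.7, 3.0)])
print(tau_vanishing_at(3, 0.3), quasi(3, 0.3, tau_vanishing_at(3, 0.3)))
```

For $n = 3$ every sampled $\tau$ yields three sign changes, in line with Lemma 11.5.2, and the value returned by `tau_vanishing_at` makes the chosen point a zero up to rounding, in line with Lemma 11.5.4.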
11.6. Quadrature formulas
We shall construct in this section quadrature formulas for the linear functional $M\{f\} = \int f\,d\mathring{\mu}$. Assume that the quasi-orthogonal function $Q_n(z, \tau)$ is regular, and thus that the sequence $\{\phi_n\}$ is regular and that $\tau$ is a regular value for $Q_n(z, \tau)$. Let us denote by $\xi_i = \xi_{ni}(\tau)$, $i = 1, \ldots, n$, the $n$ simple zeros of the regular quasi-orthogonal function $Q_n(z) = Q_n(z, \tau)$. We know that $\xi_i \in \partial\mathbb{O} \setminus \{\alpha_\emptyset, \alpha_1, \ldots, \alpha_n\}$ for all $i$. Our aim is to construct quadrature formulas of the form
$$\sum_{i=1}^{n} \lambda_{ni}(\tau)\, f(\xi_{ni}(\tau)),$$
where the $\lambda_{ni}(\tau)$ are appropriate positive weights. The fundamental interpolating functions are exactly as in Chapter 5:
, , _ <-l(fe) TT Z ~ $k _ < f e )
Lni{z)
Quiz)
~ K-iV) J j 6 - & - <(£) (z - 6)fii(6)
n !
"
(prime is derivative). For the situation where one of the $\xi_i$ is at $\infty$, recall the comments at the beginning of Section 11.5. The weights $\lambda_{ni} = \lambda_{ni}(\tau)$ are defined by $\lambda_{ni} = M\{L_{ni}\}$. Since we have in the present situation $\mathcal{L}_n = \mathcal{L}_{n*}$, the spaces $\mathcal{R}_n = \mathcal{L}_n \cdot \mathcal{L}_n$ are

We have the following theorem.

Theorem 11.6.1. The quadrature formula

with nodes and weights as defined above, has domain of validity $\mathcal{R}_{n-1}$. The weights are positive.

Proof. The proof is practically the same as the proof of Theorem 5.3.1. Define the auxiliary function
Clearly, it can be written as *
P2n-2(Z) / x * f \
=
(Z ~ an) *, \ *
, \
P2n~2
»
e
Vln-2-
Since $h(\xi_i) = 0$ for $i = 1, \ldots, n$ (interpolation property), and since the numerator of $Q_n(z)$ is a nonzero constant times the nodal polynomial $(z - \xi_1) \cdots (z - \xi_n)$, we can write $h(z) = Q_n(z)g(z)$,
g(z) =
„*
( z)
'
q
"~2 e
Vn 2
--
By its form, g e Cn-\. Hence g* e £ n _i while Z-OLn
and thus
This implies that M{h] = d <0n, g*)M + c 2 / ^ _ i ( z ) , ^ " ^ " ^ . ( z ) ^
=0
(where $c_1$ and $c_2$ are constants). Thus the quadrature is exact for all $f \in \mathcal{R}_{n-1}$. To prove that the weights are positive, note that $L_{ni*}(z) = L_{ni}(z)$. Set

Because this function vanishes at $\xi_i$ for all $i = 1, \ldots, n$, its image under $M$ is zero, since the quadrature formula is exact. Therefore, $\lambda_{ni} = M\{L_{ni}\} = M\{L_{ni}L_{ni*}\} = \langle L_{ni}, L_{ni} \rangle_M > 0$. This concludes the proof. •
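Theorem 11.6.2. Assume that the system $\{\phi_n\}$ is regular and that $\tau = 0$ is a regular value. Then the quadrature formula
$$\sum_{k=1}^{n} \lambda_{nk}(0)\, f(\xi_{nk}(0))$$
is exact in $\mathcal{L}_n \cdot \mathcal{L}_{n-1}$.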
Proof. Indeed, as in the previous proof, it holds for any / e Cn • £ n _i, that is, for / of the form r, x Pn(z)pn-\(Z) f(z) = ., , „ , , , Pk e Vk, that
Because r = 0, the £,- = ^ m (0) are the zeros of >n. Hence h(Z) = MZ)^1^-
= 4>n(z)g(z), Vn.X G Vn-l, g € £ B _i.
Therefore, $M\{h\} = \langle \phi_n, g_* \rangle_M = 0$ since $g \in \mathcal{L}_{n-1} = \mathcal{L}_{(n-1)*}$. This means that
This concludes the proof.
•
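The effect of the parameter $\tau$ on the degree of exactness can be checked by hand in the classical polynomial case. The following sketch is an illustration only, under assumptions that are not part of the text: Lebesgue measure on $[-1, 1]$, Legendre polynomials $P_k$, and $n = 2$. The function names are hypothetical, and tau is only proportional to the parameter of the quasi-orthogonal functions since the $P_k$ are not normalized. The nodes are the zeros of $P_2 + \tau P_1$ and the weights are the interpolatory ones.

```python
import math


def nodes_and_weights(tau):
    """Two-point rule on [-1, 1] whose nodes are the zeros of P_2 + tau*P_1.

    P_2(x) + tau*P_1(x) = (3x^2 - 1)/2 + tau*x, so the nodes solve
    3x^2 + 2*tau*x - 1 = 0.  The weights are the interpolatory ones,
    fixed by exactness for the integrands 1 and x.
    """
    root = math.sqrt(tau * tau + 3.0)
    x1, x2 = (-tau - root) / 3.0, (-tau + root) / 3.0
    w1 = 2.0 * x2 / (x2 - x1)
    w2 = -2.0 * x1 / (x2 - x1)
    return (x1, x2), (w1, w2)


def quadrature(tau, k):
    """Apply the rule to the monomial x**k."""
    (x1, x2), (w1, w2) = nodes_and_weights(tau)
    return w1 * x1 ** k + w2 * x2 ** k


def exact(k):
    """Integral of x**k over [-1, 1]."""
    return 0.0 if k % 2 else 2.0 / (k + 1)


for tau in (0.0, 0.5, -1.3):
    errors = [abs(quadrature(tau, k) - exact(k)) for k in range(4)]
    print(tau, nodes_and_weights(tau)[1], errors)
```

For $\tau = 0$ this is the two-point Gauss-Legendre rule, exact up to degree $2n - 1 = 3$. For $\tau \ne 0$ the rule stays exact up to degree $2n - 2 = 2$ with positive weights, while the rule applied to $x^3$ gives $-4\tau/9$ instead of 0.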
The situation of the previous theorem is comparable with the classical situation where all $\alpha_k = \infty$ in the case of the real line. It is known that constructing quadrature formulas with $\tau = 0$ gives formulas exact in the set of polynomials $\mathcal{P}_{2n-1}$ rather than in the set $\mathcal{P}_{2n-2}$, obtained when $\tau \ne 0$. See Akhiezer [2, p. 21]. The quadrature with $\tau \ne 0$ integrates exactly in a subspace of one dimension lower than in the case $\tau = 0$. This can be explained by the fact that, for $\tau \ne 0$, the orthogonal functions are replaced by quasi-orthogonal functions, which have an orthogonality defect of one. As in Section 5.4, we can derive alternative expressions for the weights. Therefore, we introduce functions $P_n(z)$ of the second kind, associated with the quasi-orthogonal functions $Q_n(z)$, as
$$P_n(z) = P_n(z, \tau) = \psi_n(z) + \tau\,\frac{Z_n(z)}{Z_{n-1}(z)}\,\psi_{n-1}(z), \qquad \tau \in \mathbb{R},\ n \ge 1,$$
with the same convention as for the quasi-orthogonal functions to define
$$P_n(z, \infty) = \frac{Z_n(z)}{Z_{n-1}(z)}\,\psi_{n-1}(z).$$
Note that although we call them functions of the second kind, they are not given by $\int D(t, z)[Q_n(t) - Q_n(z)]\,d\mathring{\mu}(t)$.
The formulation and the proof of Theorem 5.4.1 can be given without change. It still holds that the following is true. Theorem 11.6.3. The weights of the quadrature formula are given by = nk
1 2
where the prime means differentiation with respect to $z$, $Q_n$ is a regular quasi-orthogonal function with zeros $\xi_k$, $k = 1, \ldots, n$, and $P_n$ is the associated function of the second kind. We need an adaptation of the proof of Theorem 5.4.2 to cope with the boundary case. Theorem 11.6.4. With the notation of the previous theorem
j=0
Proof. Since Qn w = t;k and z = t
= 0, we get from the first formula in Theorem 11.4.2 with n-\
H{t,t-k)En.
" Qn(0] =
Multiplying by Z B _i(§t)Z n (0/(§ ~ {) and letting t tend to §*, we get
"*-•&
where kn-\(z, Then
z) = YTj=o 107 fe)I2- Let L be the limit in the right-hand side.
for 30, T
L = lim
. (an - !)(«„_! - 1) -1
forT, forR. (11.25)
Similarly, using the complex conjugate of the third formula in Theorem 11.4.2 we obtain
L j=0
We multiply by Zn-\(£k)Zn(t)
and let t -> &. Because
the sum immediately drops out in the right-hand side. So there remains = -En lim[D»(&, and this limit equals (L is the limit in (11.25)) -2L0Z B _i(&)Z n (0] =
2%k(an - !)((*„_-! - 1)
for 3 0 , forT, forR. (11.26)
Plugging the limits (11.25) and (11.26) into the expressions for $Q_n'(\xi_k)P_n(\xi_k)$ and then substituting the results in the expression for $\lambda_{nk}$ of the previous theorem gives the result. •

For further reference, we shall denote by $\mathring{\mu}_n(\cdot, \tau)$ the discrete measure of these quadrature formulas, that is, the measure that takes masses $\lambda_{ni}(\tau)$ at the nodes $\xi_{ni}(\tau)$. We also recall that such a measure exists for all, except at most $n$, values $\tau \in \mathbb{R}$.

11.7. Nested disks

If $Q_n(z, \tau)$ are the quasi-orthogonal functions, then we recall that the functions of the second kind associated with them are given by
$$P_n(z, \tau) = \psi_n(z) + \tau\,\frac{Z_n(z)}{Z_{n-1}(z)}\,\psi_{n-1}(z).$$
Furthermore, we define the rational functions
$$R_n(z, \tau) = -\frac{P_n(z, \tau)}{Q_n(z, \tau)}. \tag{11.27}$$
When $n$ is a regular index, the mapping from $\tau$ to $s = s_n(z) = R_n(z, \tau)$ will transform (for a fixed $z \in \mathbb{C} \setminus \partial\mathbb{O}$) the extended real line onto a circle $K_n(z)$. The closed disk defined by $K_n(z)$ and its interior will be denoted as $\Delta_n(z)$. Note that this is well defined as a circle with finite radius because $Q_n(z, \tau)$ has all its zeros on $\partial\mathbb{O}$, so that for $z \notin \partial\mathbb{O}$, $Q_n(z, \tau) \ne 0$ for all $\tau \in \overline{\mathbb{R}}$, and thus $s_n(z)$ will never be infinite. When $n$ is a singular index, the transformation is degenerate. In that case, the whole plane is mapped to a point. Indeed, since for a singular index $n$ we have $E_n = 0$, it follows from the Christoffel-Darboux relation that
(pn(z) 4>n-\(w) Zn(z) Zn-i
for any z and w. Choose w such that
—
Since it follows similarly from Theorem 11.3.3 that the same holds for the functions of the second kind i//n, we get " ' , = - " " , ""* ,—, ,n~\Z, = ~ / " * QnyZ, T) (cnZn/Zn_i + r)0 n _i(z)
Z
Theorem 11.7.1. Suppose that the index n is regular. Then for z € C 1. The equation of the circle Kn(z) is given by n-\
z)\2 + \l-s\2
= {s
2. The (closed) disk $\Delta_n(z)$ is obtained by replacing the equality sign by $\le$. 3. The center $c_n$ and the radius $r_n$ of the disk are
(fc--) - (l2^) ) (1 (1) (fc 1
n-1
and 1
\zn)
n
(
(fn-
V Z n _i )
VZ n _
(fe)
-
i/'zj
( ^
where kn-l(z,z) = EnkZl \
ifn(z) + S(f)n(z) S(j)n-\(z)
— -
\Zn-\\ Taking the imaginary part gives Im r = — with
Now suppose that the index $n$ is regular. Then with the help of Green's formula (11.20) (since $n$ is a regular index and hence $E_n \ne 0$) we can write the equation of the circle $K_n(z)$ as
H{z,zY with X(z) as given in Theorem 11.4.3 by (11.21). Thus
2kl 2 - \z\2
-X(z) H (z, z)
forT,
2Imz for 3O.
This is (1).
Recall that the denominator $1 - |\zeta_\beta(z)|^2$ is positive, negative, or zero iff $z$ belongs to $\mathbb{O}$, $\mathbb{O}^e$, or $\partial\mathbb{O}$, respectively. This means that the circle will be in the right or left half plane depending on $z$ being in $\mathbb{O}$ or $\mathbb{O}^e$, respectively. This is (4). The closed disk $\Delta_n(z)$ with boundary $K_n(z)$ is given by using an inequality sign instead of an equality. To find out in which sense the inequality has to be set to get the interior of the circle, we make the following observations. Since

with $A = -\sum_{k=0}^{n-1} |\phi_k|^2 < 0$, it follows that this expression will become negative for $|s|$ sufficiently large. Thus this expression is negative outside the disk. Therefore, the closed disk is described by
k=l
This is (2). Since the sum in the left-hand side of (11.28) is nondecreasing with $n$, it follows that we have nested disks, that is, for any regular index $m > n$, we have $\Delta_m(z) \subset \Delta_n(z)$. This is (5). Since $R_n(z, \infty) = R_{n-1}(z, 0)$, the circles will touch, even if the index $n$ is singular. In the latter case, $K_n(z)$ is a point on $K_{n-1}(z)$. The expressions for the center and radius for a general linear fractional transform ($\tau \in \mathbb{R}$)
$$s = \frac{a - \tau b}{c - \tau d} = \frac{a\bar d - b\bar c}{c\bar d - d\bar c} - \frac{ad - bc}{c\bar d - d\bar c} \cdot \frac{\bar c - \tau \bar d}{c - \tau d}$$
are obviously
$$\frac{a\bar d - b\bar c}{c\bar d - d\bar c} \quad\text{and}\quad \left| \frac{ad - bc}{c\bar d - d\bar c} \right|.$$
Using the Green and Christoffel-Darboux formulas, the expressions for $c_n$ and $r_n$ as in (3) will follow. •

Corollary 11.7.2. For $z \in \{\beta, \hat\beta\}$, we find the following special cases. All the circles $K_n(\beta)$, $n = 1, 2, \ldots$, reduce to the same point $s = 1$ and all the circles $K_n(\hat\beta)$ reduce to the same point $s = -1$.

Proof. Recall that $\zeta_\beta(\beta) = 0$ and $\zeta_\beta(\hat\beta) = \infty$. Let $n$ be regular. It then follows from the expression for the radius $r_n$ that for $z \in \{\beta, \hat\beta\}$ this radius is zero.
Writing out the equation for $K_n(z)$ for $z = \beta$ and $z = \hat\beta$, we find that only $s = 1$ satisfies the equation when $z = \beta$ and only $s = -1$ satisfies the equation when $z = \hat\beta$. Since successive circles touch, also for singular indices, we get the same point for all $n$. •

Assume from now on that there are infinitely many regular indices. Suppose these are $n(\nu)$, $\nu = 1, 2, \ldots$. Because the disks $\Delta_{n(\nu)}$ are nested, it follows that $\Delta_\infty = \lim_{\nu \to \infty} \Delta_{n(\nu)}$ is a disk or reduces to a point. Its radius is
$$r(z) = \lim_{\nu \to \infty} r_{n(\nu)}(z).$$
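The center and radius of the image of the real line under a linear fractional transform, as used in the proof of Theorem 11.7.1, are easy to check numerically. The following sketch is an illustration only: the coefficients a, b, c, d are arbitrary sample values that do not come from the text, and the helper names are hypothetical. It maps several real values of $\tau$ through $s = (a - \tau b)/(c - \tau d)$ and measures how far each image is from the circle with center $(a\bar d - b\bar c)/(c\bar d - d\bar c)$ and radius $|ad - bc|/|c\bar d - d\bar c|$.

```python
def lft(a, b, c, d, tau):
    """Image s = (a - tau*b) / (c - tau*d) of a real parameter tau."""
    return (a - tau * b) / (c - tau * d)


def image_circle(a, b, c, d):
    """Center and radius of the circle onto which the real line is mapped.

    Assumes c*conj(d) - d*conj(c) != 0, i.e. the pole tau = c/d is not real.
    """
    denom = c * d.conjugate() - d * c.conjugate()
    center = (a * d.conjugate() - b * c.conjugate()) / denom
    radius = abs(a * d - b * c) / abs(denom)
    return center, radius


# Arbitrary sample coefficients (an assumption made for this illustration only).
a, b, c, d = 1 + 2j, 0.5 - 1j, 2 + 1j, 1 - 3j
center, radius = image_circle(a, b, c, d)

# Every real tau should land on the circle |s - center| = radius.
deviations = [abs(abs(lft(a, b, c, d, tau) - center) - radius)
              for tau in (-50.0, -3.0, -0.25, 0.0, 1.0, 7.5, 1e6)]
print(center, radius, max(deviations))
```

The center is the image of the complex conjugate of the pole $\tau = c/d$, which is the point symmetric to the pole with respect to the real line. The limit point $b/d$ for $\tau \to \infty$ lies on the same circle.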
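We have the following lemma.

Lemma 11.7.3. Suppose $z \in \mathbb{C} \setminus (\partial\mathbb{O} \cup \{\beta, \hat\beta\})$. If $\Delta_\infty(z)$ is a disk (with positive radius), then
$$\sum_{k=0}^{\infty} |\phi_k(z)|^2 < \infty \quad\text{and}\quad \sum_{k=0}^{\infty} |\psi_k(z)|^2 < \infty.$$
If $\Delta_\infty(z)$ is a point, then
$$\sum_{k=0}^{\infty} |\phi_k(z)|^2 = \infty \quad\text{and}\quad \sum_{k=0}^{\infty} |\psi_k(z)|^2 = \infty.$$

Proof. It follows from the expression for the radii that $r(z)$ is positive (zero) iff $k_\infty(z, z)$ is finite (infinite), that is, iff $\sum_{k=0}^{\infty} |\phi_k(z)|^2$ is finite (infinite). Let $s$ be a point from the disk $\Delta_\infty(z)$. Then it follows from (11.28) that in any case (disk or point)
$$\sum_{k=1}^{\infty} |\psi_k(z) + s\phi_k(z)|^2 < \infty.$$
Thus $\sum_{k=1}^{\infty} |\phi_k(z)|^2 < \infty$ iff $\sum_{k=1}^{\infty} |\psi_k(z)|^2 < \infty$. •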
Before we can prove an invariance theorem, we also need the following lemma. Lemma 11.7.4. If n is a regular index, then for some parameter w e C^ = C \ (3O U {/3, $ }), the functions (pn and \/rn can be written as (j)n{z) =
Z—W ttn
~~ Z
v-^
^On(w) + 2 ^ L A:=l
'
77.7. Nested disks
295 n-\
—w
k=\
where
= Vn(w)fa(w) - Yn(w)isn(w), = Vn(w)\j/n(w) - Yn(w)fa(w), = Yn(w)akn(w), n(w),
Yn(w)=
W an
~
and Yn{w) = i W ~ a"2
forT
2w
k — 1, . . . , « - 1, u;
and
2w
Vn(w) =
u;2
forR, for R.
Proof. From the Christoffel-Darboux formula in Theorem 11.3.1 and the mixed formula in Theorem 11.3.2, we get n-l =
Zn(z)Zn-i(w)
Zn(w)Zn-i(z)
~ ^ (Z, W) Ey,
= -H(z, w)En n-\
lk=l
Elimination of fa-\ (z) gives 4>n(.Z)
t
(
= -H(z, w)En
f- 1
, w)fa(w)]
lk=\
From the determinant formula of Theorem 11.3.2, we find
an-i,n(w) =
-EnZn-i(w)Zn(w)X(w),
with
X(w) =
2iw (1 - w)2
forT
and X(w) = - i
forR.
Hence
z -- w z z -• w an •- z z -- w
of,,
an •-
H(z,w)EnZn(z)Zn.x(w) an-hn(w)
Qln
-
2w
forT,
for dO.
£
Thus
Next we compute ' (z + w)(an - w) _ z-w
2w(an-z) (l+zw)(an-w) (l + w2)(an-z)
Yn(w)D(z,w)={
(z - w)(w+an) (otn-z)2w (z-w)(l+anw)
forT, forR, for 3O.
Thus
— w
n-\ lk=l
This is the first formula required. The second formula is proved similarly.
•
We can now prove the following invariance theorem. Theorem 11.7.5 (Invariance). Suppose w e C^ = C \ (3O U {0, $}) and suppose that AOO(M;) is a disk with positive radius. Then Aoo(z) is a disk with positive radius for every z € C^ and
and k=0
k=0
converge locally uniformly in C^ as n -> oo.
Proof. From Lemma 11.7.4, we know that with xn = 4>n and yn = tyn or vice versa, we have, with the notation as in Lemma 11.7.4, n-\
xn(z) = xn(w) H z — w n(w)
-
Yn(w)yn(w)].
From the definition of akn(w), it follows that oo
n—1
n=\
k=\
n=\
k=l
by Lemma 11.7.3. Now suppose that C is a compact subset of C^. Then for arbitrary z e C and w e C^ fixed 7 — W
7 —W
Yn{w)
and
-Yn(w) < R\
and
Vn(w)
are uniformly bounded, say z- w
- Z
z-w
Vn(w)
So (
N
1/2
n-l
For any e e (0, 1), choose m = m(e, R\, Ri) > 1 such that 1/2
1/2
and
1/2
\n=m k=l
Then <€+€
(n-\
y^Jakn(w)\2
+ n=m
\k=l 1/2
oo
1/2
n—1
\n=m k=\ 1/2
'/2
/m-1
Hence 1/2
(l-€)( \n=m sa Because ^ n 1 ' ^ ' ^ | 2 ^ continuous function in C, there is an Mm, depending
on m, such that
1/2
/m-l
holds uniformly in C. Thus 1/2
^ (2 + Mm)e < , 1-6
N > m.
Now we first fix an 6, for example, 6 = 1/2, and let mo be the corresponding index m. It follows that ^2^=m \xk(z) I2 has a finite upper bound (2 + Mm)2 in C, and consequently, YlkLi \xk(z)\2 has a finite upper bound M2 in C. We now return to the argument with an arbitrary 6 and corresponding m. It follows that 1/2
hT|
1/2"
2+
and hence from the foregoing that ,
N>_m.
\k=m
This shows that $\{(\sum_{k=1}^{N} |x_k(z)|^2)^{1/2}\}$ is a uniform Cauchy sequence in $C$, which proves the uniform convergence of $\sum_{k=0}^{\infty} |x_k(z)|^2$ in $C$. •

We can now also prove the analyticity theorem.

Theorem 11.7.6 (Analyticity). Let $\Delta_\infty(z)$ be a point and let $R_n$ be defined by (11.27). Then the limit
$$s(z) = \lim_{n \to \infty} R_n(z, \tau), \qquad z \in \mathbb{C} \setminus \partial\mathbb{O},$$
is an analytic function of $z$ not depending on $\tau$. Moreover,
Proof. By Theorem 11.7.1 (6), it follows that if $\Delta_\infty(z)$ is a point, then
$$\lim_{n \to \infty} R_n(z, \tau) = \lim_{n \to \infty} s_n(z) = s(z)$$
exists and is independent of $\tau$. Obviously, $s_n(z) = R_n(z, \tau)$ is analytic for $z \in \mathbb{C} \setminus \partial\mathbb{O}$. Thus the analyticity of $s(z)$ will follow if the functions $s_n(z)$ are uniformly bounded in compact subsets of $\mathbb{C} \setminus \partial\mathbb{O}$. This is shown as follows. We know that
Thus 1 + \t I2
n l
~
Sn)
^'
Therefore,
Since the right-hand side is uniformly bounded in compact subsets of $\mathbb{C} \setminus \partial\mathbb{O}$, the analyticity of $s(z)$ follows. The last inequality is a direct consequence of Theorem 11.7.1 (4).

11.8. Moment problem

Recall that in the previous chapters we had the relation $d\mu(t) = |\varpi_0(t)|^2\,d\mathring{\mu}(t)$. For $\mathbb{R}$ we had $\varpi_0(t) = t + i$, whereas for $\mathbb{T}$ this factor was just 1. In this chapter, we have changed the meaning of $\alpha_0$ and renamed it as $\beta$. Therefore, for $t$ on the boundary $\partial\mathbb{O}$, we shall now need $|\varpi_\beta(t)|^2 = |t - \beta|^2$ to replace $|\varpi_0(t)|^2$. It is just 1 for $t \in \mathbb{T}$ and it is $1 + t^2$ for $t \in \mathbb{R}$. After this introductory note, we can state our moment problem. It has been outlined in the introduction that the moment problem is different from the moment problem in the previous chapters. In fact we should consider two kinds of moment problems. We are given a functional $M$ defined on $\mathcal{R}_\infty = \mathcal{L}_\infty \cdot \mathcal{L}_\infty$, which is real and positive definite, that is,
$$M\{f_*\} = \overline{M\{f\}}, \quad f \in \mathcal{L}_\infty \cdot \mathcal{L}_\infty, \qquad\text{and}\qquad M\{ff_*\} > 0, \quad 0 \ne f \in \mathcal{L}_\infty.$$
To represent the functional in $\mathcal{L}_\infty$ as an integral, we suppose it is defined there by the moments
$$\mu_{n0} = M\{b_n(z)\}, \qquad n = 0, 1, \ldots. \tag{11.29}$$
The problem is to find a representation of this functional by a measure $\mu$ on $\partial\mathbb{O}$ such that
$$\int b_n(t)\,d\mathring{\mu}(t) = \mu_{n0}, \qquad n = 0, 1, \ldots, \tag{11.30}$$
and thus such that
$$M\{f\} = \int f(t)\,d\mathring{\mu}(t), \qquad \forall f \in \mathcal{L}_\infty. \tag{11.31}$$
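We call this the moment problem in $\mathcal{L}_\infty$. Note that when $\partial\mathbb{O} = \mathbb{R}$ and all $\alpha_k = \infty$, then this is the classical Hamburger moment problem [104,105,106]. If we want to represent the functional in $\mathcal{R}_\infty$ as an integral, we assume that it is defined by the moments
$$\mu_{nm} = M\{b_n b_m\}, \qquad n, m = 0, 1, \ldots.$$
The problem is to find a measure $\mu$ on $\partial\mathbb{O}$ such that
$$\mu_{nm} = \int b_n(t) b_m(t)\,d\mathring{\mu}(t), \qquad n, m = 0, 1, \ldots.$$
This problem is equivalent to finding a measure such that
$$M\{f\} = \int f(t)\,d\mathring{\mu}(t), \qquad \forall f \in \mathcal{R}_\infty,$$
or also such that
$$\langle f, g \rangle_M = \int f(t)\overline{g(t)}\,d\mathring{\mu}(t) = \langle f, g \rangle_{\mathring{\mu}}, \qquad \forall f, g \in \mathcal{L}_\infty. \tag{11.32}$$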
We call this the moment problem in $\mathcal{R}_\infty$. Note, as already explained in the introduction, that both the moment problem in $\mathcal{L}_\infty$ and the moment problem in $\mathcal{R}_\infty$ are rational generalizations of the classical Hamburger moment problem in the sense that they both reduce to the Hamburger moment problem when $\partial\mathbb{O} = \mathbb{R}$ and all $\alpha_k = \infty$. We note that when there is only a finite number of different $\alpha_k \ne \infty$ that are repeated cyclically, $\alpha_1, \ldots, \alpha_p, \alpha_1, \ldots, \alpha_p, \alpha_1, \ldots$, then the study of our orthogonal rational functions would simplify considerably. Indeed, in that case $\mathcal{R}_\infty = \mathcal{L}_\infty \cdot \mathcal{L}_\infty = \mathcal{L}_\infty$, since a product of rational functions whose poles are among the points $\alpha_1, \ldots, \alpha_p$ is again a rational function of the same type. Thus, in that case, the moment problems in $\mathcal{R}_\infty$ and in $\mathcal{L}_\infty$ are the same. Another remark to make is that we could as well have used the moments $\mu'_{nm}$ to define the functional in $\mathcal{R}_\infty$, as was done in [157, 158, 160, 45]. However, this representation does not allow an elegant treatment of the points $\alpha_k = \infty$ in the case of the real line. On the other hand, when there are only a finite number of different finite $\alpha_k$, the choice of $\mu'_{nm}$ would be more natural to solve a multipoint Hamburger moment problem. As in the introduction of Section 10.3, it can be shown that a solution should have infinitely many points on which it is supported. We have the following existence theorem.

Theorem 11.8.1. Let $M$ be a real positive functional defined on $\mathcal{R}_\infty$. Then $\langle f, g \rangle = M\{fg_*\}$ defines an inner product for $f, g \in \mathcal{L}_\infty$. Assume that the corresponding sequence of orthogonal rational functions $\phi_n$ has infinitely many regular indices. Then there exists at least one measure $\mu$ on $\partial\mathbb{O}$ that solves the moment problem in $\mathcal{L}_\infty$, that is, that satisfies (11.30), or equivalently, the problem (11.31).
Proof. The proof is based on the fact that, by the regularity assumption, we can find for a regular index $k$ a regular value $\tau_k$ such that there is a quadrature formula. Thus there is an infinite sequence of such quadrature formulas. Denote by $\mathring{\mu}_k = \mathring{\mu}_k(\cdot, \tau_k)$ the discrete measure representing the quadrature formula
$$\int f(t)\,d\mathring{\mu}_k(t) = \sum_{i=1}^{k} \lambda_{ki}(\tau_k)\, f(\xi_{ki}(\tau_k)),$$
which is exact in $\mathcal{R}_{k-1}$ and hence also in $\mathcal{L}_{k-1}$. In particular, $\int d\mathring{\mu}_k(t) = M\{1\}$ for all $k$. Thus there is an infinite sequence of these measures and it is uniformly bounded by $\int d\mathring{\mu}_k(t) = M\{1\} = 1$. Hence by Helly's selection principle, the sequence $(\mathring{\mu}_k)$ will have a convergent subsequence $\mathring{\mu}_{k(j)} \to \mathring{\mu}$. Next we prove that such a $\mathring{\mu}$ solves the moment problem in $\mathcal{L}_\infty$, that is, that
$$M\{b_n\} = \int b_n(t)\,d\mathring{\mu}(t), \qquad n = 0, 1, \ldots.$$
For $n = 0$, we can apply Helly's convergence theorem since $b_0 = 1$ is continuous. Recall that for the extended real line we can use the one-point compactification [96] to make Helly's convergence theorem work, since $\infty$ may be considered as any other (finite) point of the line. See Section 10.3. This observation also implies that the argumentation below will hold for $\partial\mathbb{O} = \mathbb{R}$. We now prove for an arbitrary but fixed $n > 0$ that the moment relation holds. Let us denote the elements of the subsequence $k(j)$ by $k$ for simplicity. Note that in this case Helly's convergence theorem is not applicable because the $b_n$ are not continuous on $\partial\mathbb{O}$. However, consider any compact subset $J \subset \partial\mathbb{O}$ that does not contain the points $\alpha_1, \ldots, \alpha_n$. Then $b_n$ is indeed continuous on $J$, and this means that by Helly's convergence theorem we can find a $K$ large enough such that for a given $\epsilon > 0$ we have for all $k = k(j) > K$
$$\left| \int_J b_n(t)\,[d\mathring{\mu}(t) - d\mathring{\mu}_k(t)] \right| < \frac{\epsilon}{3}. \tag{11.33}$$
In contrast, setting / = 3O \ J, we have for k = k(j) > n,
< max — — / \bn(t)\2djj,k(t) < max \l/bn(t)\ 1 1 \bn\t)\ Ji
• finn.
11.8. Moment problem
303
We can always choose $J$ large enough such that the maximum of $|1/b_n|$ in $I$ is arbitrarily small. Indeed, the set $I$ will then contain small neighborhoods of the $\alpha_1, \ldots, \alpha_n$, none of which is $\alpha_\emptyset$. The zeros of $1/b_n$ are precisely in the points $\alpha_1, \ldots, \alpha_n$ and its pole is at $z = \alpha_\emptyset$. Because $\mu_{nn}$ is finite, it follows that for any $\epsilon > 0$, we can make $J$ large enough to satisfy
$$\left|\int_I b_n(t)\,d\mathring{\mu}_k(t)\right| < \frac{\epsilon}{3}. \tag{11.34}$$
Note that this holds for any $k > n$, that is, $J$ is independent of $k$. Next we want to show that $J$ can also be made large enough to satisfy
$$\left|\int_I b_n(t)\,d\mathring{\mu}(t)\right| < \frac{\epsilon}{3}. \tag{11.35}$$
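To obtain this, we consider sets $J_p$, none of which contain $\alpha_1, \ldots, \alpha_n$, and such that $J_q \subset J_p$ for $p > q$ and $\bigcup_p J_p = \partial\mathbb{O} \setminus \{\alpha_1, \ldots, \alpha_n\}$. Note that for $k = k(j) > n$ and for any $p$
$$\int_{J_p} b_n(t)\,d\mathring{\mu}(t) = \int_{J_p} b_n(t)\,[d\mathring{\mu}(t) - d\mathring{\mu}_k(t)] + \int_{J_p} b_n(t)\,d\mathring{\mu}_k(t).$$
Thus if $p > q$ and $k > n$
$$\left|\int_{J_p - J_q} b_n(t)\,d\mathring{\mu}(t)\right| \le \left|\int_{J_p - J_q} b_n(t)\,[d\mathring{\mu}(t) - d\mathring{\mu}_k(t)]\right| + \left|\int_{J_p - J_q} b_n(t)\,d\mathring{\mu}_k(t)\right|. \tag{11.36}$$
By (11.34), there is always a $p$ and $q$ large enough such that for any $\eta > 0$
$$\left|\int_{J_p - J_q} b_n(t)\,d\mathring{\mu}_k(t)\right| < \frac{\eta}{2} \tag{11.37}$$
for all large $k$. By (11.33), we can make $k$ so large that for any $\eta > 0$
$$\left|\int_{J_p - J_q} b_n(t)\,[d\mathring{\mu}(t) - d\mathring{\mu}_k(t)]\right| < \frac{\eta}{2}. \tag{11.38}$$
Combining (11.36), (11.37), and (11.38) shows that it is possible to make
$$\left|\int_{J_p - J_q} b_n(t)\,d\mathring{\mu}(t)\right| < \eta$$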
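for any $\eta > 0$. This means that
$$\left(\int_{J_p} b_n(t)\,d\mathring{\mu}(t)\right)_p$$
is a Cauchy sequence, so that its limit for $p \to \infty$ (which is the integral over $\partial\mathbb{O} \setminus \{\alpha_1, \ldots, \alpha_n\}$) exists. We find that
$$\int_I d\mathring{\mu}_k(t) = \int_I \frac{b_n(t)\,b_{n*}(t)}{|b_n(t)|^2}\,d\mathring{\mu}_k(t) \le \max_{t \in I}\frac{1}{|b_n(t)|^2}\,\mu_{nn}$$
for all $k > n$. Thus $\int_I d\mathring{\mu}_k(t)$ can be made arbitrarily small for all $k > n$ by choosing $I$ small enough; hence $\int_I d\mathring{\mu}(t)$ can also be made arbitrarily small by choosing $I$ small enough. Hence $\mathring{\mu}\{\alpha_1, \ldots, \alpha_n\} = 0$. It thus follows that for any $\epsilon > 0$ there exists a $p$ large enough such that, with $I = \partial\mathbb{O} \setminus J_p$,
$$\left|\int_I b_n(t)\,d\mathring{\mu}(t)\right| = \left|\int b_n(t)\,d\mathring{\mu}(t) - \int_{J_p} b_n(t)\,d\mathring{\mu}(t)\right| < \frac{\epsilon}{3},$$
which proves (11.35). Finally, by (11.34), (11.35), and (11.33),
$$\left|\int b_n(t)\,[d\mathring{\mu}_k(t) - d\mathring{\mu}(t)]\right| \le \left|\int_I b_n(t)\,d\mathring{\mu}_k(t)\right| + \left|\int_I b_n(t)\,d\mathring{\mu}(t)\right| + \left|\int_J b_n(t)\,[d\mathring{\mu}_k(t) - d\mathring{\mu}(t)]\right| < \epsilon$$
because each of the terms in the right-hand side can be bounded by $\epsilon/3$. Thus
$$\lim_{j \to \infty}\int b_n(t)\,d\mathring{\mu}_{k(j)}(t) = \int b_n(t)\,d\mathring{\mu}(t).$$
Since $\int b_n(t)\,d\mathring{\mu}_k(t) = \mu_{n0} = M\{b_n\}$ for $k > n$, this proves the theorem. □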
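We now use our framework of nested disks to obtain information about when the solution is unique. Let us denote by $\mathcal{M}^{\mathcal{L}}$ the set of solutions of the moment problem in $\mathcal{L}_\infty$ and $\mathcal{M}^{\mathcal{R}}$ the set of solutions of the moment problem in $\mathcal{R}_\infty$. Then we have

Theorem 11.8.2. Assume that the sequence of orthonormal functions $\phi_n$ has infinitely many regular indices, and hence $\mathcal{M}^{\mathcal{L}} \ne \emptyset$. Fix $z \in \mathbb{C} \setminus \partial\mathbb{O}$ and define for $\mu \in \mathcal{M}^{\mathcal{L}}$ the Riesz-Herglotz-Nevanlinna transform
$$\Omega_\mu(z) = \int D(t, z)\,d\mathring{\mu}(t).$$
Then
$$\{\Omega_\mu(z) : \mu \in \mathcal{M}^{\mathcal{R}}\} \subset \Delta_\infty(z) \subset \{\Omega_\mu(z) : \mu \in \mathcal{M}^{\mathcal{L}}\}.$$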
Proof. Let $s = \Omega_\mu(z)$ for some $\mu \in \mathcal{M}^{\mathcal{R}}$. Note that the system $\{\phi_n\}$ is then orthonormal with respect to the inner product defined by the measure $\mu$. Let $f(t) = D(t, z)$, $t \in \partial\mathbb{O}$. Writing the generalized Fourier series of $f$ as
k=0
and then using
yk= [ D(t,z)
jD(t,z)
= / D(t, z)[(t>k(t) —(/)k(z)]djjL(t) + >n(z)£2/x(z), However, it can be shown that for t G 3D
Using Bessel's inequality, it then follows that oo
A:=l
oo
fc=0
^ G 3C
which can be rearranged as
k=l
This means that $s \in \Delta_\infty(z)$. Note that for $z = \beta$, the inequality reduces to $|1 - s|^2 \le 0$, whereas for $z = \hat{\beta}$, it reduces to $|1 + s|^2 \le 0$. This is consistent with the fact that for all normalized measures $\mu$, $\Omega_\mu(\beta) = 1$ and $\Omega_\mu(\hat{\beta}) = -1$.

Next we show that if $s \in \Delta_\infty(z)$, then it is the Riesz-Herglotz-Nevanlinna transform of some $\mu \in \mathcal{M}^{\mathcal{L}}$. This is readily shown by using the quadrature formulas we have discussed. We consider the limiting point and the limiting disk cases separately.

If $\Delta_\infty(z)$ is a point, then since $s \in \Delta_\infty(z)$, there must exist $s_n \in K_n(z)$ such that $s_n \to s$. Since there is for each regular $n$ some $\tau_n$ such that
$$s_n = R_n(z, \tau_n) = \int D(t, z)\,d\mathring{\mu}_n(t),$$
Helly's selection criterion then yields that there is a subsequence $\mathring{\mu}_{n(j)} \to \mathring{\mu}$. By the proof of the previous theorem, $\mu \in \mathcal{M}^{\mathcal{L}}$, and by Helly's convergence theorem,
$$\lim_{j \to \infty}\int D(t, z)\,d\mathring{\mu}_{n(j)}(t) = \int D(t, z)\,d\mathring{\mu}(t).$$
Thus $s$ is the Riesz-Herglotz-Nevanlinna transform of a $\mu \in \mathcal{M}^{\mathcal{L}}$.

If $\Delta_\infty(z)$ is a disk, let $s$ be a point on the boundary $K_\infty(z)$. Recall that for a fixed $n$, we can, except for finitely many values of $\tau$, associate a quadrature formula with $R_n(z, \tau)$. Let us denote the discrete measure that is associated with this quadrature by $\mathring{\mu}_n(\cdot, \tau)$. It depends on $n$ but also on the choice of $\tau$. We can then, for every regular index $n$, choose an $s_n \in K_n(z)$ such that these $s_n$ tend to $s$ and such that $s_n = \Omega_{\mu_n}(z)$, where $\mu_n = \mu_n(\cdot, \tau_n)$ and where $\tau_n$ is chosen such that $s_n = R_n(z, \tau_n)$. By Helly's theorems and the proof of the previous theorem, there exists a $\mu \in \mathcal{M}^{\mathcal{L}}$ such that $\Omega_\mu(z) = s$. Thus every $s$ on the boundary $K_\infty(z)$ is of the form $\Omega_\mu(z)$ with $\mu \in \mathcal{M}^{\mathcal{L}}$.

Now let $s$ be an interior point of $\Delta_\infty(z)$. Then it can be found as a convex combination $s = \lambda s_1 + (1 - \lambda)s_2$ ($0 < \lambda < 1$) of points $s_1, s_2$ on the boundary $K_\infty(z)$. Thus there exist $\mu_1, \mu_2 \in \mathcal{M}^{\mathcal{L}}$ such that
$$s_j = \int D(t, z)\,d\mathring{\mu}_j(t), \qquad j = 1, 2.$$
Thus $\mu = \lambda\mu_1 + (1 - \lambda)\mu_2 \in \mathcal{M}^{\mathcal{L}}$ and $s = \Omega_\mu(z)$. □

Now the following corollary is obvious.
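The convex-combination step at the end of this proof can be illustrated numerically. The following is only a sketch under assumptions: the function name `boundary_decomposition` and the sample disk are hypothetical, and the limiting disk is modeled as an ordinary closed disk with known center and radius. It uses the two endpoints of the diameter through $s$; because the Riesz-Herglotz-Nevanlinna transform is linear in the measure, the same weight $\lambda$ would then combine the two measures.

```python
def boundary_decomposition(center, radius, s):
    """Write a point s of a closed disk as lam*s1 + (1-lam)*s2 with s1, s2 on the circle."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    offset = s - center
    dist = abs(offset)
    if dist > radius:
        raise ValueError("s lies outside the disk")
    # Unit direction of the diameter through s (any direction works at the center).
    direction = offset / dist if dist > 0 else 1.0 + 0.0j
    s1 = center + radius * direction
    s2 = center - radius * direction
    # s = center + dist*direction, so lam*radius - (1-lam)*radius = dist.
    lam = (radius + dist) / (2.0 * radius)
    return lam, s1, s2


# Hypothetical example: disk with center 1+1j and radius 2.
lam, s1, s2 = boundary_decomposition(1 + 1j, 2.0, 1.5 + 0.5j)
reconstructed = lam * s1 + (1 - lam) * s2
```

For a strictly interior point the returned weight lies in $[1/2, 1)$, so it is an admissible $\lambda$ in the sense of the proof.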
Corollary 11.8.3. In the case of a limiting disk, for each $s \in \Delta_\infty(z)$, $z \in \mathbb{C} \setminus \partial\mathbb{O}$, there is a $\mu \in \mathcal{M}^{\mathcal{L}}$ such that $s = \Omega_\mu(z)$. The moment problem in $\mathcal{L}_\infty$ has infinitely many solutions. In the case of a limiting point, a solution of the moment problem in $\mathcal{R}_\infty$ is unique.
11.9. Favard type theorem

Here we want to show that if we generate rational functions by the recurrence relation of Theorem 11.1.2, then they are orthogonal with respect to some functional. Thus if $\phi_0 \in \mathbb{R} \setminus \{0\}$,
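and
$$\phi_n(z) = \left(A_n Z_n(z) + B_n\frac{Z_n(z)}{Z_{n-1}(z)}\right)\phi_{n-1}(z) + C_n\frac{Z_n(z)}{Z_{n-2}(z)}\,\phi_{n-2}(z), \qquad n = 2, 3, \ldots \tag{11.39}$$
or, by (11.9),
$n = 2, 3, \ldots,$ (11.40)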
with $E_n$ and $C_n$ nonzero, then they are orthogonal with respect to some functional $M$, meaning that
$$M\{\phi_k\phi_l\} = 0 \quad\text{for}\quad k \ne l.$$
If we want this to be a real inner product, that is, if we want $M\{f_*\} = \overline{M\{f\}}$, then the coefficients $A_n$, $B_n$, $C_n$, and $E_n$ should be real:
$$A_n, B_n, E_n \in \mathbb{R}, \quad E_n = A_n + B_n D_n \ne 0, \quad n = 1, 2, \ldots,$$
$$C_n \in \mathbb{R}, \quad C_n \ne 0, \quad n = 2, 3, \ldots, \tag{11.41}$$
which is shown in Section 11.1 (see Lemma 11.1.1), and if we want it to be positive ($M\{ff_*\} > 0$ for $f \ne 0$), then by a small adaptation of Lemma 11.1.6, we obtain
^ ^ 1
2
} ,
,, = 2 , 3 , . . . ,
(11.42)
so that the positivity is guaranteed if
$$M\{\phi_0^2\} > 0 \quad\text{and}\quad C_n E_{n-1}/E_n < 0, \qquad n = 2, 3, \ldots. \tag{11.43}$$
Here we assume that the leading coefficients $\kappa_k$ are positive. If we want the rational functions to be normalized, then we should choose
$$E_n = -C_n E_{n-1}, \qquad n = 2, 3, \ldots \tag{11.44}$$
with $M\{\phi_0^2\} = 1$. Thus we assume in this section that the coefficients of the recurrence satisfy the conditions (11.41) and (11.44).

We start with an extension of the above recurrence relation. To this end we use the following lemma.

Lemma 11.9.1.
1. For all constants $A$ and $B$ and for all $j$, $k$, $n$ integer such that $\alpha_n \ne \alpha_k$, there exist unique constants $a$ and $b$ such that
$$A + \frac{B}{Z_j} = \frac{a}{Z_n} + \frac{b}{Z_k}.$$
2. For all constants $A$ and $B$ and for all $j$, $k$ integer, there exist unique constants $a$ and $b$ such that
$$A + \frac{B}{Z_j} = a + \frac{b}{Z_k}.$$
Proof. To have the first relation, we note first that, by the definitions of $Z_k$ and $Z_n$, there exist nonzero constants $c_1$, $c_2$ and nonzero constants $d_1$, $d_2$ such that it is rewritten as
$$\frac{c_1 A(z - 1) + c_2 B(z - \alpha_j)}{z - 1} = \frac{d_1 a(z - \alpha_n) + d_2 b(z - \alpha_k)}{z - 1}.$$
Thus the constants $a$ and $b$ exist if the following system can be solved for $a$ and $b$:
$$c_1 A + c_2 B = d_1 a + d_2 b, \qquad c_1 A + c_2 B\alpha_j = d_1 a\alpha_n + d_2 b\alpha_k.$$
Since the determinant of the system is $d_1 d_2(\alpha_k - \alpha_n) \ne 0$, there is a unique solution.
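The first relation of the lemma can be checked numerically. The sketch below makes simplifying assumptions that are not taken from the text: it uses the hypothetical normalization $1/Z_m(z) = (z - \alpha_m)/(z - \alpha_0)$, so that all the constants $c_1, c_2, d_1, d_2$ equal 1, and it treats the point written as 1 in the proof as a parameter `alpha0`. The names `inv_Z` and `solve_part1` are illustrative only.

```python
def inv_Z(alpha, z, alpha0=1.0):
    # Hypothetical normalization: 1/Z_alpha(z) = (z - alpha) / (z - alpha0).
    return (z - alpha) / (z - alpha0)


def solve_part1(A, B, alpha_j, alpha_n, alpha_k, alpha0=1.0):
    """Return (a, b) with A + B/Z_j = a/Z_n + b/Z_k, assuming alpha_n != alpha_k."""
    if alpha_n == alpha_k:
        raise ValueError("alpha_n and alpha_k must differ")
    # Multiply both sides by (z - alpha0) and match the coefficients of z and 1:
    #   A + B                 = a + b
    #   A*alpha0 + B*alpha_j  = a*alpha_n + b*alpha_k
    b = (A * (alpha0 - alpha_n) + B * (alpha_j - alpha_n)) / (alpha_k - alpha_n)
    a = A + B - b
    return a, b


# Hypothetical data: three distinct points on the unit circle, none equal to alpha0.
A, B = 2.0, -1.5
alpha_j, alpha_n, alpha_k = 1j, -1.0, -1j
a, b = solve_part1(A, B, alpha_j, alpha_n, alpha_k)
z = 0.3 + 0.2j
lhs = A + B * inv_Z(alpha_j, z)
rhs = a * inv_Z(alpha_n, z) + b * inv_Z(alpha_k, z)
```

The division by `alpha_k - alpha_n` is the numerical counterpart of the nonvanishing determinant in the proof.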
For the second relation, we find a similar system:
$$c_1 A + c_2 B = d_1 a + d_2 b, \qquad c_1 A + c_2 B\alpha_j = d_1 a + d_2 b\alpha_k.$$
Now the determinant is $d_1 d_2(\alpha_k - 1) \ne 0$ since none of the $\alpha_k$ is assumed to be 1. □

This lemma is used to derive the following extension of our recurrence relation.

Theorem 11.9.2. Let the $\phi_n$ be generated by the previous recurrence relation. Consider an integer $n \ge j + 2 \ge 3$ and assume that $\alpha_n \notin \{\alpha_{n-1}, \alpha_{n-2}, \ldots, \alpha_{j+1}\}$. Then there exist constants $a_k$, $k = 1, \ldots, n - j - 1$ and constants $b_k$, $k = n - j - 1, n - j - 2$, all depending on $n$ and $j$, such that
h an-j-i(/)j
Zn(z) Zj(z)
<
Proof. For n = j + 2 > 3, the formula reduces to
which, by the previous lemma, is seen to be the recurrence itself in the form (11.40). To prove the induction step, assume that for some k with n — 2 > k > j + 2 we have
h an-k(j)k(z)
Zn(z). + bn-k—-—(pk(z) + Zk(z)
Zn(z) bn-k-iCk+i--—— Zk-i(z)
We apply (11.40) to
Zn \( Bk \ bn-k—- \[E k + —— Z k(j)k-i + ^ LV A + bn-k-\Ck+\ , ^ , bn-kBk On-ktk H
Zk Ck-— ^
Since an ^ a ^ , we can apply part 1 of the previous lemma for the bracketed expression to write it as Qn-k-l
bn-k+\
Zn
Zk-i
Hence (t>n - a\
an-k4>k = an-k-\
bn-k+i—^—(j)k-\ z
fc-l
+ &„_£ — (pk-2z fc-2
This proves the induction step.
•
Applying the recurrence once more on the extended recurrence gives another form of the extended recurrence as in the following theorem.

Theorem 11.9.3. Let the $\phi_n$ be generated by the previous recurrence relation. Consider an integer $n \ge j + 2 \ge 3$ and assume that $\alpha_n \notin \{\alpha_{n-1}, \alpha_{n-2}, \ldots, \alpha_{j+1}\}$. Then there exist constants $a_k$, $k = 1, \ldots, n - j - 1$, $b_{n-j-1}$, $b'_{n-j}$, and $a'_{n-j}$, all depending on $n$ and $j$, such that
Proof. Substitute in the next to last term of the extended recurrence of the previous theorem
+ so that the last two terms become Zn
\
bn-j-iEj+i +
Now apply part 2 of Lemma 11.9.1 to the bracketed expression to write it as
and the desired formula follows as in the previous theorem.
Now we will prove the Favard theorem. We recall all the basic assumptions made:

(A1) $\alpha_k \ne \alpha_\emptyset$, $k = 1, 2, \ldots$.
(A2) $\phi_n$ is generated by the recurrence (11.39) or, equivalently, by (11.40).
(A3) $\phi_n \in \mathcal{L}_n \setminus \mathcal{L}_{n-1}$, $n = 1, 2, \ldots$, $0 \ne \phi_0 \in \mathbb{R}$.
(A4) $E_n, B_n \in \mathbb{R}$, $n = 1, 2, \ldots$.
(A5) $E_n \ne 0$, $n = 1, 2, \ldots$.
(A6) $E_n = -C_n E_{n-1}$, $n = 2, 3, \ldots$.

If we set $\phi_n = p_n/\pi_n^*$, with $p_n \in \mathcal{P}_n$, then (A3) implies that $p_n(\alpha_n) \ne 0$ whereas (A5) implies regularity, that is, $p_n(\alpha_{n-1}) \ne 0$. Assume that we have constructed a functional $M$ such that $\{\phi_n\}$ is orthogonal with respect to $M$. The conditions (A4) and (A6) imply that all the coefficients in the recurrence relations are real and this guarantees that the functional is real. Assumption (A6) together with $M\{\phi_0^2\} = 1$ imply positivity of $M$ as well as the normalization $M\{\phi_n^2\} = 1$, $n = 0, 1, 2, \ldots$. Assumptions (A5) and (A6) imply that $C_n \ne 0$ for $n = 2, 3, \ldots$.

We first note that we should only define the functional $M$ such that we have orthogonality, because if we define $M\{\phi_0^2\} = 1$, then the assumption (A6) guarantees that, if we have orthogonality, then we have also defined all the norms to be 1 because of (11.42). Defining $M$ on $\mathcal{L}_\infty$ is simple. We set by definition
$$M\{\phi_0\} = 1/\phi_0 \quad\text{and}\quad M\{\phi_k\} = 0, \quad k = 1, 2, \ldots.$$
Note that the definition of $M\{\phi_0\}$ takes care of the normalization. The problem is to extend $M$ to $\mathcal{R}_\infty = \mathcal{L}_\infty \cdot \mathcal{L}_\infty$ such that $M\{\phi_k\phi_l\} = 0$ for $k \ne l$. Therefore, we consider the following table of products:
$$\begin{array}{lllllll}
\phi_0\phi_1 & \phi_0\phi_2 & \phi_0\phi_3 & \phi_0\phi_4 & \phi_0\phi_5 & \cdots \\
 & \phi_1\phi_2 & \phi_1\phi_3 & \phi_1\phi_4 & \phi_1\phi_5 & \cdots \\
 & & \phi_2\phi_3 & \phi_2\phi_4 & \phi_2\phi_5 & \cdots \\
 & & & \phi_3\phi_4 & \phi_3\phi_5 & \cdots
\end{array}$$
When $M$ is applied to these entries, we should always get zero. The first row (row 0) is dealt with by our previous definition of $M$ on $\mathcal{L}_\infty$. For the subsequent rows, we consider $B_{-1} = \{\phi_0\}$ and
$$B_n = B_{n-1} \cup \{\phi_n\phi_l : l = n + 1, n + 2, \ldots\}, \qquad n \ge 0.$$
Thus $B_n$ contains the first $n + 1$ rows in the above scheme (and $\phi_0$). Furthermore, define
$$B_{n,n} = B_{n-1} \quad\text{and}\quad B_{n,k} = B_{n,k-1} \cup \{\phi_n\phi_k\}, \qquad k > n,$$
that is, $B_{n,k}$, $k = n, n + 1, \ldots$ adds to $B_{n-1}$ (the first $n$ rows) the elements of the $(n + 1)$st row, one by one. These sets generate the following spaces:
$$\mathcal{R}_{n,k} = \operatorname{span} B_{n,k}, \qquad n = 0, 1, \ldots;\ k = n, n + 1, \ldots.$$
(Note that $\mathcal{L}_\infty = \mathcal{R}_{1,1} = \bigcup_{k=0}^{\infty}\mathcal{R}_{0,k}$.) The strategy is to obtain a definition of $M$ on the successive spaces
such that we have the required orthogonality relations. Thus we walk through the table above row by row and in each row from left to right. Each time we consider the next product (j)n(j)k, we have to check whether that product is in a subspace on which M has already been defined or not. If the new product is not in a previous subspace, we can just define M{0 n ^} to be zero; otherwise we have to prove that it gives zero. Eventually we have M defined on the subspace n'^ = span [\JBn,k : n = 0, 1 , . . . ; k = n + 1, n + 2,...} c U^ such that M{
M{(pn(pk} = 0,
n = 0, 1, . . . ; k = n + 1, n + 2 , . . .
It will turn out that in fact TZ^ = IZoo and by assumption (A6) we then also have M{0^} = 1, n = 0, 1, We now elaborate these successive steps. RowO We set by definition = O, which defines M on C^.
*=l,2,
Rowl Initialization: M{(j)\ 02} w e set
If 0102 ^ £oo, by definition M{0i0 2 } = 0. If 0i02 € Coo, then there is a A: such that , , P1P2 0102 =
qk
_ tf€P
This A: should be at least 3 because otherwise we wouldfindthat p\ (a\ )p2 (ot\) = 0 while by our assumptions both p\(ot\) and P2(&\) are nonzero. Thus
Therefore n^/n^ has a zero in ai. This means that there is some m > 3 such that a m = «i. Let m be the smallest such m. We distinguish between m = 3 and m > 4. (a) m = 3 Thus o?i = a3 and we get from the recurrence relation (11.39)
03 = ( A3 + ^ ) Z302 + C 3 |^0l = (A3Z1 + 5 3 )0 2 + C301. \ Z\J Z\ Note that A3 ^ 0 because otherwise 03 would be a linear combination of 01 and 02, contradicting assumption (A3). Apply M to the previous relation and this gives M{03} = A3M{ZKI>2} + S3M{02} + C 3 M{0i}. Because M{03} = M{02} = M{0i} = 0, it follows that M{Zi0 2 } = Oand consequently that also M{0i02} = 0. (b) m > 4 In this case we know that ot\ — am g {a m _i, a m _ 2 ,..., 0^2}. We now use the extended recurrence 0m = «10m-l H
h «m-303 +
fl
m-2^2
+ ^ m _ 2 Zi0 2 + Z?m_3C30i.
Note that &m_2 7^ 0; otherwise 0 m G £ m _i, contradicting assumption (A3). Applying M to this relation and using MJ0*} = 0 for k = 1, 2 , . . . , m, we find that M{Zi0 2 } = 0, which implies M{0!02} = 0.
Induction step for row 1
We have to prove that M{0i0^} = 0, given that M{0/>^} = Ofor0;0jt e #i,&-iThis is treated as in the general induction step for row n. We refer to that step below. Row n, n > 2 Initialization:
M{4>n4>n+x} = 0
If 0 n 0 n +i & 1Zn,n then we define M{0 n 0 w+ i} = 0. If 0 n 0 w+ i G 7£n?n then there is some k such that 1 —^ H h
0
'\
=
that is, PnPn+l
Pnqn
Pn-\Qk,n-\
and thus PnPn+X
7t JT
l k
n
^n-X^k
k
If k < n + 1, then for z = an, we get Pn(an)Pn+x(Mn) = 0, which is contradicting our assumptions, and hence k > n + 1. But then n
n+X
PnPn+X
H
7tt * *PxC[kX H
n
X
1
n
7T * 7T ;—Pn-Xqk,n-X
n-X
=
^ ^n
shows that TT^/TV*^ should have a zero at a n . This means that there is some m > n + 2 such that a m = aw. Let m be the smallest such index. We distinguish again between (a) m = n + 2 and (b) m > n + 2. (a) m = n + 2 Thus otm=an=
ctn+2' We use the recurrence relation (11.39):
(Pn+2 = [ = A w + 2 Z n 0 w + i + Bn+2
Bn+2M{bn-i
Using the orthogonalities given by the induction hypothesis, we find that the left-hand side and the last two terms on the right-hand side vanish so that M{bn(/)n+i} = 0 and thus also M{0 n 0 n+ i} = 0. (b) m > n + 3 In this case am $ {am-\, Qf m _2,..., otn+i)- We use the extended recurrence relation to write (note Zm = Zn) 0m =
The coefficient b'm_n_x 7^ 0; otherwise 0 m e £m-\. We multiply this again by bn-\ and apply M to the result: M{bn-i(/)m} = aiM{bn~i(f)m-i} H
h am_n_
By the induction hypothesis all the terms vanish except the first one on the second row. Thus M{bn(j)n+\} = 0 so that also M{(j)n(j)n+\} = 0. Induction step for row n We now consider n > 2 and 7 > n + 2 for which we know that fc} = O, m = 0, l , . . . , n - 1 , A: > m + 1, n0*}
= 0, fc = n + 1, n + 2 , . . . , j - 1.
We have to prove that M{0 n 0 7 } = 0. From the recurrence relation (11.40), we get
and thus
Applying M to this expression gives
Because bn and bn/Zj-\ are both in Cn and n < j —2, it follows by the induction hypothesis that the first two terms in the right-hand side are zero. For the third
term, we note that for j = n+2, Z?n/Z7_2 = bn-\, so that again by the induction hypothesis the third term is zero. However, if j > n+2, then Z?n/Z7-_2 € £ n with n < j —2, so that again the third term is zero by the induction hypothesis. Thus in any case, the right-hand side vanishes completely. Therefore, we conclude that
Now we distinguish two cases: (A) an / otj and (B) otn = oij. (A) an # otj We may then set by Lemma 11.9.1 (2) that 1/Z7 = \/Zn + c, with c a nonzero constant. Hence M{bn(t)j-X} = M{bn-i4>j] + cMibntj} = 0. The first term in the middle part is zero by the induction hypothesis and thus we find that M{bn4>j} = 0 and thus also M{(j)n(j)j} = 0. (B) an = (Xj
Here again, we distinguish two cases (1) 0 n 0 7 ^ lZnj-\
and (2) 0 n 0 7 G
(1) If 0 n 0 7 ^ TZnj-i, then we may define M{(j)n(j)j} = 0 and we are done. (2) If 0 n 0 7 efcnj-i then there exist an integer /: and constants c/ and polynomials #&,/ G 7 \ such that 0 w 0 y - + Cj^\4>n4>j-\ H
h Cn+\(j)n
or, setting 0/ = Pi/n*, —f-pPji + • • • + C i —± "^—; 7tn_17tk
*Pn-\qk%n-\
H
1—r7^i2;u 7Tl7Tk
n
Ttk
The index & can not be less than j + 1, for otherwise, putting z = otj in the previous relation would yield pn(aj)pj(ctj) = pn(an)pj(oij) = 0, which contradicts our assumptions. By rewriting the previous identity as n
n n k k k — PnPj + Cj-i ——pnPj-i + • • • + Cn+i —-p n Pn+\ n n n j j-l n+l 7T*
+ —f-Pn-\qk,n-l
7T*
H
h -7
^
and putting z = ctn we find that n^/nj is zero for z = an. Thus there must be an index m > j + 1 such that am = an. Let m be the smallest such index. We once more distinguish two cases: (a) m = j + 1 and (b) m > j + 2. (a) m = 7 + 1: Thus we have now that an = Qf7 = otj+\. By the recurrence (11.40), we have
Because Z7-+i = Zn = Z 7 , we get after multiplication with /?n_i
After applying M, this gives
Since y > n + 2, the left-hand side and the second term in the right-hand side are zero by the induction hypothesis, but also the last term is zero by the same argument because bn/Zj-\ e Cn. Thus it follows that M{bn(j)j} = 0 sothatM{0 n 0 7 } = 0. (b) m > j + 2: Thus we have am = an = a 7 £ {a w _i, a w _ 2 , • • •, a 7 + i}. By the extended recurrence relation of Theorem 11.9.2, we have
Note that ^ m _ 7 _i 7^ 0, for otherwise 0 m € £ m - i . By the extended recurrence relation of Theorem 11.9.3, we have 0m = «10m-l + * * ' +«m-j-10j + l +« m _;07
with (see the proof of Theorem 11.9.3)/?^_7 = cEj+\bm-j-\ andcbeinga nonzero constant. Because bm-j-\ ^ 0, also Z4_ ; 7^ 0- Now we multiply this relation with bn-\ and apply M to obtain
+ b'm_jM{bn
i-i
By the induction hypothesis (recall $m > j + 1$), all the terms vanish, except for the first term on the second line. Thus $M\{b_n\phi_j\} = 0$ and hence
This concludes the induction step for row n. The diagonal So far we have defined M on IZ'^ such that = 1 and We also have to satisfy M{0^} = 1, n > 1. By the recurrence relation, we have Zi — 02 = E2Z i(/)i ^2
Noting that, by Lemma 11.9.1, Z\/Z2 = c\ +C2Z1, it follows that the left-hand side is in the span of {0i0i ,02}. Hence it follows that 01 Zi € Span{0O, 0102, 0102} C ft'oQ. Thus also (j)\ € T^J^. Similarly, it follows in general, for n > 3, that from the recurrence relation we get ^ t
= En+\bn4>n + #n+A-10« + Cn+\ ——0n_l.
Again by Lemma 11.9.1 wecan replace l/Z n + i and 1/Zn_i by 1/Zn plus some constant. Thus the previous relation implies that bncj)n e span{0 n _i0 n , 0 n _i0 n _i, 0 n _i0 n +i, 0«0«+i}. Therefore, 0^ e ^ for all n > 0, which means that ^ = H^ = C^ -Coo. In other words, we have defined M on the whole space TZOQ and by our assumption (A6)weknowthatwehaveautomaticallyM{0^} = 1. Thus we have now proved the Favard theorem. Theorem 11.9.4 (Favard). Let {0n} be a sequence of rational functions generated by the recurrence (11.39) or, equivalently, by (11.40) and assume that (A1)-(A6) are satisfied. Then there exists a functional M on IZQQ = CQQ • COQ such that , g) = M{/g,}
defines a real positive inner product on C^ for which the 4>n form an orthonormal system. Proof. By the previous analysis, the definition is given such that the orthonormality is satisfied. It is easily proved that M{h*} = M{h] for any h e T^oo since h = /g*, with f = J2 ai^i e ^oo and g = ^ bifa £ £00, so that
atbt = Y^idi = M{h}. Also, positivity is guaranteed by
11.10. Interpolation

Suppose now that we have a solution of the moment problem in $\mathcal{R}_\infty$ so that we have the inner product defined as before by
$$\langle f, g\rangle = \int f(t)\,\overline{g(t)}\,d\mathring{\mu}(t), \qquad f, g \in \mathcal{L}_\infty, \tag{11.45}$$
where we assume the normalization $\int d\mathring{\mu}(t) = 1$. We make the following observation (see also the appendix of Ref. [62]):

Lemma 11.10.1. If $\mu$ is a positive measure on $\mathbb{T}$, such that
$$\int \frac{d\mu(t)}{|\pi_n^*(t)|^2} < \infty,$$
then $\mathcal{L}_n$, as a subspace of $L_2(\mu)$, is in the closure $\overline{\mathcal{P}}$ of
span$\{1, e^{i\theta}, e^{2i\theta}, \ldots\}$.
and f€(t) =
Pn(f)
,
with
0
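For a fixed $\alpha \in \mathbb{T}$ and $t \in \mathbb{T}$ arbitrary, we have
$$\frac{1}{|rt - \alpha|} - \frac{1}{|t - \alpha|} \le \left|\frac{1}{rt - \alpha} - \frac{1}{t - \alpha}\right| = \frac{1 - r}{|rt - \alpha|\,|t - \alpha|} \le \frac{1}{|t - \alpha|}.$$
Hence $|rt - \alpha|^{-1} \le 2|t - \alpha|^{-1}$. Thus we can find a constant $c_n$ such that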
\Ut)\ <
cn
K*(0l
Because cn/n*(t) e Z,2(/x), also f€ e L 2 (/x). Obviously, f€ is analytic in D so that f€ e V. Since there is also a constant dn such that
then / e L2(/x). It remains to show that in L2(/x) Um||/6-/||=0. Since by the Schwarz inequality
the right-hand side being uniformly bounded (not depending on e), we can apply the dominated convergence theorem to find that Um The analog for the real line is Lemma 11.10.2. If [i is a finite measure on R, such that
/
(t)\2dfl(t)
< 00,
bn(Z) = k=l
\bn then £„, as a subspace of L 2 (A) ? is in the closure V of
Proof. Define
fit)
= %fy P»^n,<eR, ^(O = n ( i - ^ )
and Mt)=
.,"..,.
withO<€
For a fixed a e R and t e R arbitrary, we have
1 1-^i
1 \l-t/a\
1
1
1- **
1 - I/a
\ae\ |a - f -
Since |a — t — \€ \ > 6, the right-hand side is bounded by 1 /11 — t /a \ and thus is
1
:— <
2
Thus we can find constant cn and dn such that
\feit)\
*(t)\2 \n*(t)\
imply that the measure does not have a mass point at any of the a e An = {a\,..., an}. Consequently, defining as usual the Riesz-Herglotz-Nevanlinna transform
then the boundary function Q^(t)9 t e 3O, will not have a jump at a (see Ref. [62]). If a# is the number of times that a appears in An, then also the derivatives Q^\t) will not be discontinuous at a for k = 0, 1 , . . . , ct# — 1. In
particular, for the case 3O = R and a = oo, we interpret continuity at oo of fit) to mean that rim,_++oo fit) = lim,_»_oo fit). The kind of limits that were used in the last two lemmas will be needed frequently in this section. For the case of the circle, these limits will be radial limits; for the case of the real line, these are limits along vertical lines. We could call it orthogonal limits in general. We shall indicate this by a special notation defined as follows: lim fiz) = lim/(rof),
aeT
and lim fiz) = lim fia + k ) ,
a e R,
and
If a & 3O, we can interpret the orthogonal limit as an ordinary limit. We can now give an analog of Theorem 2.1.3, namely that the inner product in Cn is characterized by values of the class C function Q^ and possibly its derivatives in the points at. Consider the case of the circle T and let us use the basis {wo, w\,..., wn] with
wo = 1 and
i)
Recall that (wo, wo) = / dftit) = 1 = Q^iP). This is the left top element in the Gram matrix for these basis functions. The other entries in the Gram matrix involve integrals of the form ik -\-1 > 1) k, Wi) =
/
,
;
.
(A product n/=i w ^ m J = 0 is replaced by 1.) By partial fraction decomposition of the integrand, we see that this expression depends upon integrals of the form td[l{t)
it - a ) * + 1 '
a eAn = {«!,...,«„},
(11.46)
where k = 0, 1 , . . . , 2a# — 1 with OL# the number of times that a appears in An. By Lemma 2.1.2, we know that for a replaced by w e D, such integrals are completely characterized by the values of Q^\ where Q^i w) = fDit, w)djiit)
and where the superscript means derivative. Using the estimates as in the proof of Lemma 11.10.1 for the integrand of the integral (11.46) above, it should be clear that we can apply the dominated convergence theorem so that
lim n{*\w) = lim /
^ -
= /
^Yr-
w^a * w^a J (t - W)M J (t ~ 0t)k+l Thus we find that the integrals of the form (11.46), and hence also the inner product in £„, are completely characterized by the radial limit values lim n^iw),
a e An, k = 0, 1 , . . . , 2 c / - 1
and by £2^(/?), the latter assumed to be 1 by normalization. For the real line, a similar argument can be used. We leave the technicalities to the reader. Thus we can formulate the following theorem. Theorem 11.10.3. Consider the inner product (11.45) and the Riesz-HerglotzNevanlinna transform
=f Then in Cn this inner product is completely characterized by the values QJP)
and
lim Q(*\w),
a e An, k = 0 , . . . , 2 c / - 1,
where a# is the multiplicity of a in An. This entails immediately the following result, which says as before that equality of the inner product in Cn corresponds to an interpolation result for the corresponding Riesz-Herglotz-Nevanlinna transforms. (Recall the definitions A£ = {ft, «i, . . . , an] and i f = {$ , au ..., an}.) Corollary 11.10.4. Let JJL and v be two positive measures on 3O and define the sets (counting multiplicities)
if = A{ U if = {p, $ , auaua2, a2, ..., an, an}. Then in Cn we have
(•, •)# = (•, -)y iff
lim \nZ\w) - ^(w))
=0, a e if, k = 0, 1,..., a# - 1,
with a# the multiplicity of a in An.
Proof. In view of the foregoing, taking into account that the orthogonal limits are replaced by ordinary limits when a $ 3O, we need only to explain the ft . Therefore, we note that for any positive measure /JL, we have, by definition, [^/x(z)]* = -flU(z). Therefore, ^ Q 3 ) = -Q^iP). Thus Q^P) = QV(P) is equivalent with Qll0) = £lv($). n Wefirstderive some results involving the quasi-orthogonal functions. Recall Qn(z, r) = 0 n (z) + T ^ ^ - ^ - i f e ) , Z ( )
r G I,
PW(Z, T) = tfrB(z) + T_ Z w ( f.^r w -i(z), Z ( )
T GI .
It has also been remarked that, although we called Pn(z,r) the functions of the second kind associated with Qn(z, r), it isft(?£true in general that
Pn(z, r) = y ^ , z)[Qn(t, r) - Qn(z, unless r = 0. However, we do have the following: Lemma 11.10.5. Let Qn(z, r) be the quasi-orthogonal functions and Pn(z, r) the associated functions of the second kind. Then for n > 2 Pn(Z,T)=
D(t,Z)
Proof. The right-hand side is
f
z)
/ D(t, Z)
/D(^, The second of these integrals is V^n_i by definition. The first integral turns out to be Zn(z) by Lemma 11.2.2. We need to check that, as a function of t, D(t, z)[f(t) /(z)] G £„_!, where Z n _i(z)
(a0 - a - an-i)(z -
an-\)
-
This is indeed the case because t, z)[f(t) - f(z)] = ^
(t ~ Z)(<X" ~ «-»> (t - an_i)(z - an-i)
^ a® -an
This proves the lemma.
•
Now suppose that Qn(z, r) has precisely n simple zeros §m-(r) e 9 O \ An, An = {ai, . . . , an}, so that we can associate with this quasi-orthogonal function the quadrature formula as in Section 11.6
f f(t)dfi(t)^Yff^nj(r))knj(T)=
= ff
f(t)djj,n(t,T)
7=1
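with exact equality in $\mathcal{R}_{n-1} = \mathcal{L}_{n-1} \cdot \mathcal{L}_{n-1}$. If $\tau = 0$ is a regular value, then we can set $\tau = 0$ and the quadrature is even exact in $\mathcal{L}_{n-1} \cdot \mathcal{L}_n$ (see Theorem 11.6.2). Let us define the positive real function $\Omega_n(z, \tau)$ by the Riesz-Herglotz-Nevanlinna transform of $\mathring{\mu}_n$:
$$\Omega_n(z, \tau) = \int D(t, z)\,d\mathring{\mu}_n(t) = \sum_{j=1}^{n}\lambda_{nj}(\tau)\,D(\xi_{nj}(\tau), z).$$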
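The transform of a discrete measure is a finite sum and can be evaluated directly. The sketch below is an illustration under assumptions, not the construction of the text: it takes the unit circle with the kernel $D(t, z) = (t + z)/(t - z)$, and the nodes and weights are made-up values rather than those of an actual quadrature formula. The names `D` and `rhn_transform` are hypothetical.

```python
import cmath


def D(t, z):
    # Kernel assumed for the unit circle: D(t, z) = (t + z) / (t - z).
    return (t + z) / (t - z)


def rhn_transform(nodes, weights, z):
    """Transform of the discrete measure sum_j weights[j] * delta(nodes[j])."""
    return sum(w * D(x, z) for x, w in zip(nodes, weights))


# Made-up 4-point rule on the unit circle: positive weights that sum to 1.
nodes = [cmath.exp(1j * theta) for theta in (0.3, 1.7, 3.1, 4.9)]
weights = [0.1, 0.2, 0.3, 0.4]
omega_at_0 = rhn_transform(nodes, weights, 0.0)          # equals the total mass
omega_inside = rhn_transform(nodes, weights, 0.4 - 0.3j)  # has positive real part
```

With positive weights the real part is positive inside the disk because $\operatorname{Re} D(t, z) = (1 - |z|^2)/|t - z|^2$ for $|t| = 1$, which is the sense in which the function is positive real.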
We show the following: Lemma 11.10.6. With Qn(z, r) as defined above we have
where Qn(z, T) is the quasi-orthogonal function and Pn(z, T) is the associated function of the second kind. Proof. Let us drop the argument r for simplicity. By the previous Lemma 11.10.5, we have
It is easily checked that this integrand is in lZn-\ and therefore dfi can be replaced by d\xn. Recalling that Qn(%i) = 0, we have Z B -i(z)
Pnil) = ~
Zn-l(z)
Quiz).
Since in the square brackets we recognize Qn(z), we have found the required equality. • Because the quadrature is exact in 1Zn-\, which is equivalent to saying that the inner products with respect to ft and ftn are the same for functions in £n-\, we have by Theorem 11.10.3 the following consequence. Theorem 11.10.7. Let Qn(z,r) be the quasi-orthogonalfunctions and Pn(z, r) the associated functions of the second kind, and let Q^iz) = fD(t, z) d£i(t). Then, setting for a regular value r P(ZT)
Qn(z,r) = - Qn(z,r)
we have for all a e An_l = {/3, $ , ot\, ot\,..., otn-\» &n-i} lim fo<*}(z) - n(n\z, r)l = 0,
k = 0, 1 , . . . , a# - 1,
where a* is the multiplicity of a in An_v Ifz = 0 is a regular value, then setting Bn = An_l U{an} = {ft, ft , au au a2, 012, -.., an-u an-X, an}, we have (recall that Qn(z, 0) = lim \Q^\z)
-^n{z)/(pn{z))
- to(}Hz, 0)1 = 0,
k = 0, 1 , . . . , ot# - 1,
where a# is the multiplicity of a, in Bn. As a special interpretation of this theorem, we consider the example where 3O = R and where all ctk — 00 so that the orthogonal functions become the orthogonal polynomials on the real line. However, since we used the kernel D(t,z) and not the simplified kernel (t — z)~l as Akhiezer does, we do not find the same result as given in Akhiezer [2, p. 22]. By the previous theorem we have that the first 2n — 2 coefficients in the asymptotic expansion of Q^ — Qn at 00 will vanish. Thus Qn(z,r)
L
Si
52
S2rc-3|1
z
z2
z2n
2n-3
(11.47) where
$$s_0 = \mu_{10}, \qquad s_k = \mu_{k-1,0} + \mu_{k+1,0}, \quad k \ge 1, \qquad \mu_{k0} = \int t^k\,d\mathring{\mu}(t), \quad k \ge 0.$$
Indeed,
$$D(t, z) = -i\,\frac{1 + tz}{t - z} = i\left[t + (1 + t^2)\sum_{j=1}^{l}\frac{t^{j-1}}{z^j} + \frac{(1 + t^2)\,t^l}{(z - t)\,z^l}\right],$$
JD(t, Z) dfct) = i f t djjL(t) + X) i
/(I + t2)tj-ldjj,(t)
(1 + Now
ftkd[in(t) = ftkdfr(t) = m ,
k = 0, 1,...
and hence
Because
lim / z^ooj
= 0, Z-t
(11.47) will follow. We also note that the result in Akhiezer is somewhat stronger since his results would translate as S
S2nZ2n-
2
Z
Z2
,
Zh>OO,
that is, • 11 so H
h -r
z z2 as $z \to \infty$, where $K$ is some constant. However, by imposing stronger conditions on the measure (or on the linear functional), it is possible to obtain such a result directly. This is done, for example, in Ref. [38]. We explain the thread that is followed there.
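The expansion of the kernel $D(t, z) = -i(1 + tz)/(t - z)$ in powers of $1/z$, on which the polynomial example above rests, can be checked numerically. The sketch below assumes the splitting $(1 + tz)/(z - t) = t + (1 + t^2)/(z - t)$ followed by a finite geometric series with an explicit remainder; the function names and the truncation parameter `terms` are illustrative choices.

```python
def D_real(t, z):
    # Kernel for the real line as quoted in the text: D(t, z) = -i (1 + t z) / (t - z).
    return -1j * (1 + t * z) / (t - z)


def D_expansion(t, z, terms):
    """i * [ t + sum_{j=1}^{terms} (1+t^2) t^(j-1) / z^j + remainder ]."""
    head = t + sum((1 + t * t) * t ** (j - 1) / z ** j for j in range(1, terms + 1))
    # Exact remainder of the truncated geometric series for 1/(z - t).
    remainder = (1 + t * t) * t ** terms / ((z - t) * z ** terms)
    return 1j * (head + remainder)


# The two expressions agree for any truncation order, here 5 terms.
t, z = 0.7, 2 + 3j
difference = abs(D_real(t, z) - D_expansion(t, z, 5))
```

Integrating the $j$-th term against the measure gives $\mu_{j-1,0} + \mu_{j+1,0}$, which is how the coefficients $s_k$ of the text arise.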
In the previous approach, we assumed that Jf(t)dft(t) was finite for all / e Coo • Coo- Now we need these integrals to be defined for a somewhat larger space, namely the space Coo • Coo • £00 • We note that then the integrals are automatically defined on the space 7£, which consists of all the functions / having the form fit)
=
7ktl
t —z
_ git),
geCoo'Coo,
k, I = 0 , 1 ,
ze(C\
dO) U A ^ U Afi,
with Ap = UnA^ and Ap = ]JnArl. Of course, this is needed for the general rational case. If, as in the previous example, we only consider the polynomial case on the real line, then obviously Coo = Coo • COQ = Coo - Coo * £00 • Recall the Newton expansion for D(t,z) given in Lemma 6.1.5:
Dit,z) - 1 +2g^KfeK_ 1 fe),
ak{t) =
This is associated with successive interpolating polynomials (in z) n
Pn(t,z)
= 1 +2^«i(/)^(zK_1(z)
(11.48)
k=\
in the interpolation points A^ = {/?, a\, ..., an}. The interpolation error for pn(t, z) is given by
Similarly, one may obtain successive interpolating "polynomials" in the points An = {$ , &\,..., an}. For the unit circle, these are the points $ = 00, and for k > 1, l/ak = ak since these are on T. For the real line, these points correspond to $ = — i and for k > 1 again ak = ak. The expansion is easily obtained by the relation D(t,z)* = —D(t,z), where the substar is with respect to t and recalling that /*(z) = /(2). Thus we get -D(t,z) ~ 1 + k=\
with the same notation as before. Note that for t e 9O, we have [ak (7)]* = ak it). For the other factor, we can write explicitly rz-i(2-i_a-i)...(z-i_a-i)
I (z + i)(z - « ! ) • • • (z - a n )
forT;
forM.
Thus n
pn(t, z) = 1 +2j2lak(t)Uux;(z)n*k_1(z)h
(11.50)
k=l
1
is a polynomial (in z for R and in z" 1 for T) that interpolates — D(t, z) in the points z e An. The interpolation error is [(z-P)*Z(z)]* (the first substar w.r.t. t\ the second one w.r.t. z). The associated relations for Q^ are
k=\
and -fl^(z) = P n (z) + En(z)
where the generalized moments are /
ak(t)dft(t)
r and vk = / ak(t)dft(t) =
The expressions
A:=l
^nfe) = fen(t,z)djji(t) = (z-P)n*(z) f j show that Pn(z) interpolates Qfi(z) in the point set A^, whereas
n(z)
- /e n (f, z)
show that P n (z) interpolates — ^ ( z ) in the points of A n .
(11.51)
330
11. The boundary case
We want to show that tyn/(f)n interpolates Q^ linearly in Hermite sense in all the points of Bn as defined before and if r = 0 is a regular value then this holds in a nonlinear sense. We prove this by showing that this kind of interpolation holds for a truncated Newton expansion for Q^ as given above. We need a series of lemmas to come to this result. We define
The polynomial numerators of 0 n and tyn are denoted by fn and gn respectively, that is, 0 = fn/n* and \frn = gn/n*. Lemma 11.10.8. Assume that Jfdfi exists for all / e V^oo- Let
J
with
for all z e (C \ 3O) U Ap U Ap. Proof. Use Lemma 11.2.2 with / = 1/TT^ and the expression (11.49) for the error en(t, z) to get Anl(z) = [ {E(t, zy^-^t)
= j
- [D(t, z) - Pi(t, z)]4>n(z)\djjL(t)
Bn,mj(t,z)dfr(t),
where pi(t, z) is as in (11.48). Furthermore, E(t, z) = D(t, z) - 1 = 2-
so that
0
nf(z)fn{z)
- z) Note that the two parts of Bn^mj(t, z) belong to H^ so that the integral is well defined for all z € (C \ 3D) U A ^ U A ^ . Hence the lemma follows. n Similarly we can obtain Lemma 11.10.9. Assume that ffd£i exists for all f e R. Let (pn = / n /?r* and yjrn = gn/n* be the orthogonal functions and the functions of the second kind and let Ant be as above. Furthermore, suppose 0 < m < n and let / be arbitrary. Then
with
for all Z G ( C \ 9 O ) U A ^ U Ap.
Proof. This proceeds along the same lines as the previous proof. Use Lemma 11.2.2 with/ = l / t ^ L and the expression (11.51) for the error en(t,z) to get djJL(t) Anl(Z) = f {E{f, Z )^^ ( A% % * ( O " [D(f, Z) + pt(t, Z)](t*n(z)\ J I [<(0]* J , ^^fa)]^ 2^(O[^*(z)7r;(z)]^
where p/(r, z) is as in (11.50). In this expression, we can replace the kernel E(t, z) by E(t, z) — 2 since the additional term is zero by orthogonality. Furthermore, using D(t,z) = —D(t,z)* (substar with respect to t) we arrive at
so that Ani(z) = f Bn,mJ(t, z) dfa(t) with Bn,m,l(t, Z) = ~
Note that again the integral of Bn,mj(t, z) will converge so that the lemma follows. • Note that (r - 2)* = (t~l - z~l) for T
and (t - 2)* = {t - z)
for R
and [;r*(z)L =
TT*(Z)/
n i-zoti)
for T
while
[TT*(Z)]*
=
TT*(Z)
for R.
Our next step is to construct Newton interpolants for Q^. We define for each n two sets of interpolation points C2n-\ = (KO, Yu - - -, Y2n-\] and C2n_i = {%, Yu • - •, K2«-i}, where y0 = /?,
/jk = «ik for 1 < k < n and yk = oik-n forn
2n.
Recall oik — oik and hence also % = yk for all A: > 1. Furthermore, we define the corresponding nodal polynomials k
TTQ n = 1,
ft£n(z)
= I J(z — Yi)t
I < k < 2n.
i=\
Note that ftln(z) = n*(z)7tZ_n(z)
for any n < k < In.
The generalized moments are defined as follows: ak{t) =
J while
Vk,n = / ak(t)dix{t)
= \±k,n-
The interpolating "polynomials" for these nodes are
k=\
and 2n-\ k=l
We have the following theorem: Theorem 11.10.10. Let Jfdjl be definedfor all f e IZOQ. Then the polynomial P^n-x as defined above interpolates Q^ in the following sense:
with R2n-\ given by 2
mp(t)dp.(t) (t - z)n*{t)n*n_x{ty
lKZ)
"-
and this integral converges for z € A%. A similar result holds for the other interpolation. The polynomials interpolate —£1^ in the following sense:
with R2n-\ given by 2n l{Z)
~
"
r m^{t)dKt) J (t- 2)*[<(0<-i
anJ the integral converges for z G An. Proof. This follows immediately from our general polynomial interpolation given above, and the integrals defining /?2«-i and R2n-\ converge because the integrands are in TN^. • We need another lemma Lemma 11.10.11. Suppose the integrals ffdfa converge for / e 7£oo- Let (j)n = fn/7T* be the orthonormal functions and tyn = gn/^n m e functions of the second kind. The interpolating polynomials P2Z-1 a n ^ ^2«-i a r e a s defined above. Then the following formulas hold: gn(z) ~ fn
and
with r2n-i (z) bounded for z e A% and f2n-i (z) bounded for z e An. Proof. We use Lemma 11.10.8 with m = n — 1 and / = In — 1 to get
An,2n-lfe) = with (recall irfn-i « =
2mo(z)
;
f~
/
37-0 («o) J
Anjn-it2n-l(t,z)dtl(t),
Kn^n-i)
An,n-\,2n-\{t, Z) = 7V*_l(z)7t*(z)Bn(t, z),
where n
„ ,
/»(0 - fn(z)
Hence the integrand will be in ^oo and therefore the integral converges for Z € Al A similar derivation can be given for the second part. • Now we have the final interpolation result. Theorem 11.10.12. Suppose ffdfi converges for all f e T ^ . Let Q^ be the Riesz-Herglotz-Nevanlinna transform of the measure \x. Let (f)n be the orthonormal functions and xj/n the associated functions of the second kind. Then \j/n/n interpolates £2^ linearly in Hermite sense in the points Bn = {&,$, <*!,(*!, • • • , Oin-U <*„_!,(*„},
by which we mean the following: Let 0 n =
/ W /TT*
gn(z) ~ MZ)^(Z) = (Z - ^X(z)<_!(z)[(z with Rn(z) finite for z e A? U i f . Ifx —0 is a regular value, then we also have
itfe rn{z)finitefor z e Aj U i f .
and \j/n = gn/^n- Then
Proof. The linear interpolation follows immediately from a combination of Lemma 11.10.11 and Theorem 11.10.10. If r = 0 is a regular value and 0 n = fn/n*, then /„ has no zeros in A^UAn, so that we can divide by fn in the linear relations and obtain the result as stated. • We remark that for the case of the line and taking all at = oo, the previous theorem immediately yields the earlier mentioned result of Akhiezer [2, p. 22]. Our next move is to show interpolation properties involving reproducing kernels. These properties are obtained via our framework of orthogonality with respect to varying measures. We start, however, with a lemma not in terms of this framework. Set
kn(z,w) =
k=0
the reproducing kernel for Cn, and define as before the normalized kernels Kn(z, w) =
" *' W
.
Then we have Lemma 11.10.13. For the reproducing kernels kn(z, w) in Cn, we have
kn(z,w)
= Kn(/)n(z)
and
Kn(z, an) = (j)n(z),
where 4>n{z) = Knbn(z) + Krnnn-\{z) -f • • • are the orthonormal functions with Kn > 0 . Proof. The first part follows from = 00 (*)
+ bn(w)J
+
Because (j)k(w)/bn{w) retains a factor \/Zn{w) for all k < n, and 1/Zn(an) = 0, and since
it follows that the first equality holds.
For the second equality, we note that
and so
r ,
,
K(z, w) bn(w) V'td)/!^^)!2
bn(w) l*(w)l
with rf{w) real for iy G 9O. This gives Kn(z,an)
=>n(z)rj
with r] = dzl. Because [0«(z)//?n(z)]z=an = /cw > 0, it follows that r\ = 1.
•
Let us now recall some basic facts concerning the varying measures considered in Section 9.5. Recall also that the $\alpha_0$ from that section has to be replaced now by $\beta$. Define the spaces
and the measure n
dftn(t)
= \wn(t)\2dfi(t),
w0 = 1,
wn(z) = F T ^ L ,
n>\.
To avoid a lot of notational complication we shall assume that, in the case of the real line, a factor rut (z) = z — oit is replaced by 1 if a?; = oo. Let (pnk, k = 0, 1 , . . . , ft be the orthonormal functions obtained by GramSchmidt orthogonalization of the basis {1, f^, f|,..., £p] for£o- SettingF^ = $nkW>n G Cn,wc have that (see Section 9.5 for the nonboundary case, but the arguments are the same as in the boundary situation) kn(z, W) = k=0
k=0
is the reproducing kernel for Cn. Next we show the following theorem. Theorem 11.10.14. Considered as a function of z,kn(z, w) is an outer function in H2.
Proof. We use the notation introduced above. Recall also that the functions in CQ have only poles in Oe, so that we can apply the results of the previous chapters. Since kn(z, w) = kn(z, w)wn(z)wn(w),
kjriz, w) =
and because kn(z, w) is an outer function in H2, it follows that kn(z, w) is a rational function without zeros in O and without poles in O. Hence, it is outer in H2. D By this, we can define the measure
dkt) where Kn (z, if) is the normalized reproducing kernel for Cn. It is a finite positive measure on 3O. Now the proof of Theorem 6.3.2 can be repeated without changes and we immediately can formulate Theorem 11.10.15. Let Kn(z, w) be the normalized reproducing kernel for Cn and define djjin(t) = dX(t)/\Kn(t, p)\2. Then the inner product in Cn is the same for the measure [i and the measure \xn, that is,
J f(t)J(t)dljL(t) = J f(t)W)d[Ln(t), V/, g e Cn. As a consequence we have the following general interpolation result (see also Ref. [62]). Corollary 11.10.16. Let \xn be defined as in the previous theorem. Then f(z)g*(z)
j
ID(!, z)d[fi - £„](*) = fn(t, z)f(t)jfit)d[jjL - £„](*). J
Proof. Since
Jh(t)d[/i-M(t) = o, vh€Hn, we just have to check that, as a function of t,
D(t, z)[f(z)gAz) - f(f)g*(t)] € 1Zn, which is obviously true.
•
When the recurrence relations of Section 11.1 are linked with the interpolation properties discussed in this section, one can set up a generalization of the theory of T-fractions and M-fractions. A (modified) T-fraction and a (modified) M-fraction arise as the even and odd contractions of a Perron-Caratheodory (PC-) fraction. These PC-fractions are related to two-point Pade approximants; that is, their convergents approximate two formal power series: one at 0 and one at oo. The canonical denominators of T- and M-fractions are orthogonal Laurent polynomials obtained by orthogonalization of the basis {l,z~l, z, z~2, z 2 ,...} or {1, z, z ~ \ z2, z~ 2 ,...} respectively. For this theory see Refs. [18], [123], [142], [118], [120], [121], [126], and [147]. The present situation is related to a multipoint generalization of this theory. The PC-fractions have to be replaced by Extended Multipoint Pade (EMP-) fractions. There, the two points 0 and oo are replaced by a sequence of points {a\, a?2, &3,...} on the (extended) real line. Such an EMP-fraction defines two formal Newton series: one associated with the interpolation table {oo, 0, o?i, c*i, #2, &2, <*3, «3,...} and one with the interpolation table {oo, 0, ai, #2, Qfi, Q?I, G£3, c*3,...}. The even contractions of an EMP-fraction are called Multipoint Pade (MP-) fractions because their convergents are multipoint Pade approximants for the first Newton series. The canonical denominators of the convergents are orthogonal rational functions of the form studied in this chapter for the real line. An odd contraction of an EMP-fraction is not exactly an MP-fraction, but it is similar to an MP-fraction. This odd contraction is related to the second Newton series defined by the EMP-fraction just as the even contraction is related to the first one. The denominators of its convergents are now orthogonal rational functions related to a reordered set of basic points: {a2, oi\, #4, <*3, •. 
}. An extensive study of these properties and relations can be found in Ref. [41].

11.11. Convergence

We continue our treatment of convergence results in the same setting as in the previous section. In particular, it was already mentioned in Section 9.5 on these varying measures that the results for these functions were much more general than needed at that moment. There we applied these results for the case where all the $\alpha_k$ were in $\mathbb{O}$. However, they were formulated such that most of them also hold when the $\alpha_k$ are in $\overline{\mathbb{O}}$; thus, they can also be on the boundary (as they are here). The proofs of most of the results, as long as they do not need the divergence of the Blaschke products or the $\alpha_k$ to be compactly included in $\mathbb{O}$, can be applied without change in the present situation. We only draw attention to the following facts.
When a e T, then
f()
at z - at
L
Similarly, we also get for the case of the real line that £ (z) = 1. Thus all these factors and also the Blaschke products Bn can be replaced by 1. We also have explained in Section 11.1 that >„* = 0 n ; thus we get 0* =>„. In the case of the line, and an = oo, then a factor crn(z) has to be replaced by 1. Finally, the reader is advised to check the meaning of the condition (A, fi) e AM' given in Section 9.5. Taking these things into account, we can just reformulate the theorems given in Sections 9.6 and 9.8 in our new notation and refer for the proofs to the proofs given for the corresponding theorems in those sections. We give some illustrative examples. Theorem 9.6.3 becomes Theorem 11.11.1. Let (A, /JL) e AM' and a(z) the outer spectral factor for the measure, normalized with a(/3) > 0. Then lim
locally uniformly in O, and
locally uniformly in Oe. Proof. The first part is by Theorem 9.6.3. The second part is by taking substar conjugates. • For Theorem 9.6.4 and its Corollary 9.6.5 we obtain Theorem 11.11.2. Suppose (A, /x) e AM' andletkn(z, w) be the reproducing kernel for Cn and sw(z) the Szegokernel (9.3). Then we have lim kn(z, w) = sw(z),
(z, w) e O x Q,
n-»oo
while lim (pn(z) = 0 ,
z
eC\~dO,
where convergence is locally uniform in the indicated regions.
Proof. The latter convergence is given in O by Corollary 9.6.5, but because >n(z) — 0«*(z) = 0n(2) and z € O iff z e Oe, we get the convergence in the whole complex plane except on the boundary 3O. • Note that for the first relation we can take substar conjugates with respect to z and with respect to w. The result for kn (z, vo) is kn (z, w). By a similar operation on sw(z), we obtain r , / x Sfi(z)Sfi(w) 1 hm kn(z, w) = ^°° 1 locally uniformly for (z,w)eQexOe.
m*(z)zu$(w)
=
1
Corollary 11.11.3. Under the same conditions as in the previous theorem, letting Kn(z, w) be the normalized reproducing kernel, lim Kn(z, fi) = +oo
,
locally uniformly in O.
a(z)
Proof. Since by the previous theorem we have for z € l * ( 8 ) (recall cr(P) > 0), thus also r9
from which the first result follows by using the definition Kn(z, ft) —kn{z,P)/
Concerning ratio asymptotics, we have from Lemma 9.8.1 Theorem 11.11.4. Let (A, /x) e AM'.
Then
lim
(j)n{Z) UJn(z)U(P)
0WQ3)
= 1
QiP)
and
where convergence is locally uniform in the indicated regions.
For the quasi-orthogonal functions we get immediately from the proof of Theorem 11.8.2 that the following holds: Theorem 11.11.5. Let Qn(z, Tn) be the quasi-orthogonal functions and suppose that there is an infinite sequence of indices n = n{h) for which 4>n is regular and xn is a regular value. We can then associate a quadrature formula with each of these indices characterized by the discrete measure ftn(t, xn). Define the class C functions Qn(Z) = Qn(z, Xn) = - ^ ' ^ Qn(Z,Tn)
= [D(t, Z)dfin(t9 Tn). J
Then there is a subsequence of(n(h)) such that ^in{h) converges to a solution /x of the moment problem in C^ and we have lim Qn(h(s))(z) = Q^(z) = s-^oo J
D(t,z)dfr(t),
locally uniformly in C \ 9O. As a special case we get convergence of ^/n/(pn, if we assume that there are infinitely many regular indices for which xn = 0 is a regular value. Then we can select a subsequence of Qn(z, 0) = — \lfn(z)/
12. Some applications
We give some applications of the previous theory. The theory of linear prediction of stationary stochastic processes is a classical application that dates back to the work of Kolmogorov and Wiener. By the work of Dewilde and Dym, this is known to be mathematically equivalent with Darlington synthesis and lossless inverse scattering. The basic connection is that the solution for these problems can be constructed recursively by the application of the Nevanlinna-Pick algorithm.

In the classical Nevanlinna-Pick problem one has to find a Schur function $g$ that satisfies interpolation conditions $g(\alpha_i) = w_i$, $i = 1, \ldots, n$, where the $\alpha_i \in \mathbb{D}$. It is well known that this problem has a solution if and only if the Pick matrix (we assume for simplicity that the $\alpha_i$ are all different from each other)

$$P_w = \left[\frac{1 - w_i\overline{w}_j}{1 - \alpha_i\overline{\alpha}_j}\right]_{i,j=1}^{n}$$

is positive definite. If one wants to find a Schur function that interpolates these data and satisfies $\|g\|_\infty < \gamma$, then this problem has a solution if and only if the matrix

$$P_w(\gamma) = \left[\frac{\gamma^2 - w_i\overline{w}_j}{1 - \alpha_i\overline{\alpha}_j}\right]_{i,j=1}^{n}$$

is positive definite. If this matrix has negative eigenvalues, then there cannot be a solution. However, a solution with $k$ poles in $\mathbb{D}$ can be found, where $k$ is the number of negative eigenvalues of $P_w(\gamma)$. This is the Nevanlinna-Pick-Takagi problem. The solution will therefore not be an $H_\infty$ function anymore, but we can still require that it is in $L_\infty$, that is, that $\|g\|_\infty < \gamma$.

Recalling our discussion of the Nevanlinna-Pick algorithm in Chapter 6, we know that if we start the recursion with a Schur function $F$, or equivalently from data $(\alpha_i, w_i)$ drawn from a Schur function [$w_i = F(\alpha_i)$ with $F \in \mathcal{B}$], then the Pick matrix will be positive definite and, by the Cayley transform of $F$, we can associate a Carathéodory function $\Omega$ with it and hence a positive measure on the unit circle. In that case we do have orthogonal rational functions and the whole machinery gets to work. If there are negative eigenvalues of the Pick matrix, then the Nevanlinna-Pick-Takagi algorithm can be used to solve a form of the Nehari problem. These kinds of problems are related to several applications in $H_\infty$ control and Hankel norm approximation problems in model reduction. We shall also outline the relation to the previous chapters.
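The Pick criterion recalled above is easy to test numerically. The following is a minimal sketch, assuming the classical form of the Pick matrix with entries $(1 - w_i\overline{w}_j)/(1 - \alpha_i\overline{\alpha}_j)$. The function names `pick_matrix` and `is_positive_definite` and the sample data are illustrative choices, not taken from the text, and positive definiteness is checked by attempting a plain Cholesky factorization.

```python
import math


def pick_matrix(alphas, ws):
    """Classical Pick matrix [(1 - w_i conj(w_j)) / (1 - a_i conj(a_j))]."""
    n = len(alphas)
    return [[(1 - ws[i] * complex(ws[j]).conjugate())
             / (1 - alphas[i] * complex(alphas[j]).conjugate())
             for j in range(n)] for i in range(n)]


def is_positive_definite(a, tol=1e-12):
    """Attempt a Cholesky factorization a = L L^H of a Hermitian matrix."""
    n = len(a)
    low = [[0j] * n for _ in range(n)]
    for j in range(n):
        # Squared pivot: it must be (numerically) strictly positive.
        d = a[j][j] - sum(abs(low[j][k]) ** 2 for k in range(j))
        if d.real <= tol:
            return False
        low[j][j] = complex(math.sqrt(d.real))
        for i in range(j + 1, n):
            s = sum(low[i][k] * low[j][k].conjugate() for k in range(j))
            low[i][j] = (a[i][j] - s) / low[j][j]
    return True


# Data sampled from the Schur function F(z) = z / 2 (chosen purely for
# illustration): the Pick matrix is then positive definite.
alphas = [0.1, 0.3 + 0.2j, -0.4j]
schur_data = [a / 2 for a in alphas]
print(is_positive_definite(pick_matrix(alphas, schur_data)))

# Data violating the Schwarz lemma (g(0) = 0 but |g(0.5)| > 0.5):
# no Schur interpolant exists, so the factorization breaks down.
print(is_positive_definite(pick_matrix([0.0, 0.5], [0.0, 0.9])))
```

The first call reports a positive definite matrix and the second does not. Counting the negative eigenvalues, as needed for the Nevanlinna-Pick-Takagi problem, would require an eigenvalue routine that is outside the scope of this sketch.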
12.1. Linear prediction

For the basic material of this section we used Refs. [202] and [203]. See also Refs. [174], [64], [60], and [62] for more details.

Let $\{x_t : t \in \mathbb{Z}\}$ be a complex-valued, zero-mean, finite-variance, second-order stationary process. This means the following. Suppose $E$ represents the expectation operator. Then

$$E\{x_t\} = 0, \qquad E\{|x_t|^2\} = \mu_0 < \infty.$$

The covariance is defined by

$$\mu_k = E\{x_i\overline{x}_{i-k}\}, \qquad k \in \mathbb{Z}.$$

Note that being stationary means that, in this definition, $\mu_k$ does not depend on $i$. Note also that $\mu_{-k} = \overline{\mu}_k$ and $|\mu_k| \le \mu_0$. The value $\mu_0 = E\{|x_t|^2\}$ is positive. It is called the energy of the stochastic process. Since we assume that the energy is finite and nonzero, we can suppose without loss of generality that the process is normalized such that its energy is equal to 1.

With a stochastic process one can associate a (backward) shift or delay operator $Z$, which maps $x_t$ to $x_{t-1}$. Since the index runs over all integers, the shift operator has an inverse $Z^{-1}$, which maps $x_t$ to $x_{t+1}$. So we can write $x_t = Z^{-t}x_0$, $t \in \mathbb{Z}$. A stationary process is also said to be shift invariant because if we define $y_t = Z^l x_t$ for arbitrary integer $l$, then the process $\{x_t\}_{t=-\infty}^{\infty}$ and the process $\{y_t\}_{t=-\infty}^{\infty}$ have exactly the same first- and second-order statistics and are for our purposes indistinguishable.

We shall consider the pre-Hilbert space span $\{x_t : t \in \mathbb{Z}\}$ with inner product $\langle x, y\rangle_{\mathcal{X}} = E\{x\overline{y}\}$. Completing this space with respect to the norm $\|x\|_{\mathcal{X}} = \langle x, x\rangle_{\mathcal{X}}^{1/2}$ turns it into a Hilbert space that we shall denote as $\mathcal{X} = \text{clos. span}\,\{x_t : t \in \mathbb{Z}\}$.
Note that
The space $\mathcal{X}$ is called the time domain for the process. The projection of $x \in \mathcal{X}$ onto a subspace $\mathcal{Y}$ is denoted by $(x|\mathcal{Y})$. To describe the prediction problem in the time domain, we introduce the following subspaces:

$$\mathcal{X}_t^- = \text{clos. span}\,\{x_s : s = t, t-1, \ldots\} \quad\text{and}\quad \mathcal{X}_t^+ = \text{clos. span}\,\{x_s : s = t, t+1, \ldots\}.$$

If we consider $x_t$ as being the present of the stochastic process, then its strict past is defined as $\mathcal{X}_{t-1}^-$ and its past (in a weak sense since it includes the present) as $\mathcal{X}_t^-$. Similarly, we define the strict future and future of $x_t$ as $\mathcal{X}_{t+1}^+$ and $\mathcal{X}_t^+$ respectively. The remote past is defined as

$$\mathcal{X}_{-\infty} = \bigcap_{t\in\mathbb{Z}}\mathcal{X}_t^-.$$

The stochastic process is called regular if $\mathcal{X}_{-\infty} = \{0\}$. Let $\hat{x}_t = (x_t|\mathcal{X}_{t-1}^-)$ be the orthogonal projection of $x_t$ onto its strict past. Then it is called the (one-step) prediction for $x_t$. The stochastic process $e_t = x_t - \hat{x}_t$ gives the error made by the prediction and it is called the forward innovation process. Note that for a stationary process, the inner product is shift invariant and therefore

$$(x_t|\mathcal{X}_{t-1}^-) = (Z^{-t}x_0|Z^{-t}\mathcal{X}_{-1}^-) = Z^{-t}(x_0|\mathcal{X}_{-1}^-), \qquad \hat{x}_t = Z^{-t}\hat{x}_0, \qquad e_t = Z^{-t}e_0.$$

Thus the innovation process will be stationary too and the forward prediction error, which is defined as the energy of the forward innovation process, is $E = E\{|e_t|^2\}$ and does not depend on $t$. Thus instead of finding the prediction $\hat{x}_t$ for each and every $t$, it is sufficient to find the prediction at time zero. If the prediction error is nonzero, then it is impossible to predict $x_0$ (and for that matter also $x_t$) exactly from its strict past. The stochastic process is then called unpredictable or nondeterministic. If the prediction error is zero, then the process is called predictable or deterministic. If a stationary process is unpredictable, then its time domain $\mathcal{X}$ is infinite dimensional.

For a stationary deterministic process, $x_0$ belongs to $\mathcal{X}_{-1}^-$ since $x_0$ is perfectly predictable from its strict past. By stationarity, $x_{-1} \in \mathcal{X}_{-1}^-$ is again perfectly predictable from its strict past $\mathcal{X}_{-2}^-$, which implies that $x_0 \in \mathcal{X}_{-2}^-$. This
argument can be repeated, which leads to the conclusion that, for a deterministic process, $x_0$ (and also any $x_t$) belongs to the remote past $\mathcal{X}_{-\infty}$ of the process.

Obviously, the prediction $\hat{x}_t$ will give the best possible estimate of $x_t$ given its strict past. Thus it is a solution of the least-squares minimization problem

$$\inf\{\|x_t - x\|_{\mathcal{X}}^2 : x \in \mathcal{X}_{t-1}^-\}.$$

The infimum is the prediction error $E$. Note that if $\mathcal{E}_t^-$ represents the past for the innovation process $e_t$, then $\mathcal{X}_t^- = \mathcal{X}_{-\infty} \oplus \mathcal{E}_t^-$ is an orthogonal decomposition. More generally, the Wold decomposition theorem (see Ref. [202, p. 137]) says

Theorem 12.1.1 (Wold decomposition). Let $\{x_t\}_{-\infty}^{\infty}$ be an unpredictable stationary stochastic process with innovation process $\{e_t\}_{-\infty}^{\infty}$. Let $\mathcal{X}_t^-$ and $\mathcal{E}_t^-$ represent the past of $x_t$ and $e_t$ respectively. Then $x_t = u_t + v_t$, with $u_t = (x_t|\mathcal{E}_t^-) \perp v_t = (x_t|\mathcal{X}_{-\infty})$. The stochastic process $v_t$ is predictable. The innovation process is orthogonal in the sense that $\langle e_s, e_t\rangle_{\mathcal{X}} = \delta_{st}E$. If the process is unpredictable, then $\{e_s : s = t, t-1, \ldots\}$ forms an orthogonal basis for $\mathcal{E}_t^-$ for any $t \in \mathbb{Z}$. If the process $x_t$ is also regular, then $\mathcal{X}_t^- = \mathcal{E}_t^-$ and then $\{e_s : s = t, t-1, \ldots\}$ is also an orthogonal basis for $\mathcal{X}_t^-$.

Note that, for a predictable process, $x_t \in \mathcal{X}_{-\infty}$ and thus in the Wold decomposition of $x_t$ we get $v_t = (x_t|\mathcal{X}_{-\infty}) = x_t$ and $u_t = 0$.

We shall now consider the prediction problem for an unpredictable stationary process. This problem reduces to finding the projection $\hat{x}_0$ of $x_0$ onto its strict past $\mathcal{X}_{-1}^-$. Note that unpredictable means that $\mathcal{X}_{-1}^-$ is infinite dimensional. A possible way of solving this infinite-dimensional problem is by projecting $x_0$ onto finite-dimensional subspaces

$$\{0\} = \mathcal{X}_{0,0}^- \subset \cdots \subset \mathcal{X}_{0,n}^- \subset \mathcal{X}_{0,n+1}^- \subset \cdots \subset \mathcal{X}_{-1}^-$$

of its strict past and letting $\dim\mathcal{X}_{0,n}^- = n$ go to infinity, preferably such that the union of the $\mathcal{X}_{0,n}^-$ is dense in $\mathcal{X}_{-1}^-$. Denote the projection of $x_0$ onto $\mathcal{X}_{0,n}^-$ as $\hat{x}_{0,n} = (x_0|\mathcal{X}_{0,n}^-)$. If the process is regular, and if we arrange that $\overline{\bigcup_{n=0}^{\infty}\mathcal{X}_{0,n}^-} = \mathcal{X}_{-1}^-$, then it is reasonable to expect that $\hat{x}_{0,n} \to \hat{x}_0$ for $n \to \infty$. If the process is not regular, then we might expect that at least $\hat{x}_{0,n} \to (x_0|\mathcal{E}_{-1}^-)$ as $n \to \infty$.

Let us now cast this problem into a function theoretic setting. This corresponds to the analysis of the process in the frequency domain. The reformulations of the previous concepts and prediction problem will then be very familiar in view of the analysis in the previous chapters.
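In the time domain, the projection of $x_0$ onto the span of $x_{-1}, \ldots, x_{-n}$ comes down to a finite linear system in the covariances $\mu_k$. The sketch below assumes the predictor is written as $-(a_1 x_{-1} + \cdots + a_n x_{-n})$, so that orthogonality of the error to each $x_{-j}$ gives the normal equations $\sum_i \mu_{j-i}a_i = -\mu_j$ for $j = 1, \ldots, n$. The helper names `solve_linear`, `mu`, and `predictor` and the sample covariances are illustrative assumptions, not part of the text.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(complex, row)) + [complex(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mu(k, moments):
    """Covariance mu_k for any integer k, using mu_{-k} = conj(mu_k)."""
    return complex(moments[k]) if k >= 0 else complex(moments[-k]).conjugate()


def predictor(moments, n):
    """Order-n coefficients a_1..a_n and the resulting prediction error.

    Normal equations: sum_i mu_{j-i} a_i = -mu_j for j = 1..n.
    """
    gram = [[mu(j - i, moments) for i in range(1, n + 1)]
            for j in range(1, n + 1)]
    rhs = [-mu(j, moments) for j in range(1, n + 1)]
    a = solve_linear(gram, rhs)
    # Energy of the error e = x_0 + sum_i a_i x_{-i}, which equals <e, x_0>.
    err = mu(0, moments) + sum(a[i - 1] * mu(-i, moments)
                               for i in range(1, n + 1))
    return a, err.real


# Illustrative data: mu_k = 0.5**|k| (a first-order autoregressive process).
moments = [0.5 ** k for k in range(4)]
coeffs, error = predictor(moments, 3)
print(coeffs, error)
```

For these sample covariances only the first coefficient is nonzero and the error equals $1 - 0.5^2$, because the process is already perfectly modeled at order 1. A direct solve costs $O(n^3)$ operations; exploiting the Toeplitz structure leads to the recursive scheme discussed later in this section.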
With the correlation coefficients $\mu_k$ given above, we can define

$$\Omega(z) = \mu_0 + 2\sum_{k=1}^{\infty}\mu_k z^k.$$

The series converges for $z \in \mathbb{D}$ to an analytic function and it has a positive real part. In other words, it is a Carathéodory function and by the Riesz-Herglotz representation theorem, there should be a positive measure $\mu$ on $\mathbb{T}$ (recall that we assumed $\mu_0 = 1$) such that

$$\Omega(z) = \int D(t,z)\,d\mu(t).$$

This measure $\mu$ is called the spectral measure of the stochastic process. Suppose the Lebesgue decomposition of the spectral measure is $\mu = \mu_a + \mu_s$ with absolutely continuous part $\mu_a \ll \lambda$, that is, $d\mu_a = \mu'\,d\lambda$, where $\lambda$ is the normalized Lebesgue measure, and singular part $\mu_s = \mu - \mu_a \perp \lambda$. The Radon-Nikodym derivative $\mu'$ is called the spectral density of the process. This decomposition is related to the Wold decomposition of the process. One has the following theorem (see Ref. [202]):

Theorem 12.1.2. Let $\{x_t\}_{-\infty}^{\infty}$ be a stationary stochastic process with spectral measure $\mu$. Then the following hold:
1. The process is unpredictable if and only if $\log\mu' \in L_1 = L_1(\mathbb{T}, \lambda)$, that is, $\mu$ satisfies the Szegő condition.
2. If the process is unpredictable and regular, then $\mu$ is absolutely continuous.
3. Assume the process is unpredictable and let $x_t = u_t + v_t$ be the Wold decomposition of the stationary stochastic process $x_t$. Let $\mu_u$ and $\mu_v$ be the spectral measures corresponding to the stochastic processes $u_t$ and $v_t$ respectively. Then $\mu = \mu_u + \mu_v$ corresponds to the Lebesgue decomposition of $\mu$: $\mu_u$ is the absolutely continuous part and $\mu_v$ is the singular part with respect to the Lebesgue measure.

The second conclusion is obvious since for a regular unpredictable process, the remote past is $\mathcal{X}_{-\infty} = \{0\}$, which means that the $v_t$ component of the Wold decomposition is zero and thus also $\mu_v$ is zero, so that $\mu = \mu_u$ is absolutely continuous. An intuitive explanation for the first statement can be given as follows. Via the relations

$$\langle x_k, x_l\rangle_{\mathcal{X}} = E\{x_k\overline{x}_l\} = \mu_{k-l} = \int t^{l-k}\,d\mu(t) = \langle z^{-k}, z^{-l}\rangle_\mu,$$

where the last inner product is the inner product of $L_2(\mu) = L_2(\mathbb{T}, \mu)$, it can be shown that the mapping $x_t \mapsto e^{-it\theta}$, called the Kolmogorov isomorphism, will be an isomorphism between the Hilbert spaces $\mathcal{X}$ and $L_2(\mu)$, that is, between the time domain and the frequency domain. This explains why we have used the notation $Z$ for the backward shift operator in $\mathcal{X}$, since a multiplication with $z$ in $L_2(\mu)$ corresponds with applying the operator $Z$ in $\mathcal{X}$. By definition the stationary process $x_t$ is unpredictable if $x_0 \notin \mathcal{X}_{-1}^-$, or equivalently, if $x_1 \notin \mathcal{X}_0^-$. It should be clear that the Kolmogorov isomorphism maps the past $\mathcal{X}_0^-$ onto the Hardy space $H_2(\mu)$. Thus for an unpredictable process, the time domain statement $x_1 \notin \mathcal{X}_0^-$ translates into the frequency domain as the fact that $z^{-1}$ is not in $H_2(\mu)$, which means that the polynomials are not dense in $L_2(\mu)$. By Theorem 7.2.1 we know that this is true iff Szegő's condition is satisfied, and this is an explanation for the first statement of Theorem 12.1.2 above.

It is now easy to formulate and, at least in principle, solve the prediction problem in the frequency domain. The projection of $x_0$ onto $\mathcal{X}_{-1}^-$ in the Hilbert space $\mathcal{X}$ corresponds to the projection of 1 onto the space $\{f \in H_2(\mu) : f(0) = 0\}$. The projecting vector $e_0 = x_0 - \hat{x}_0 \in \mathcal{X}$ corresponds to the projecting vector $f \in H_2(\mu)$, which is the function solving the optimization problem (compare with Theorem 1.4.2)

$$P^2(1,0): \quad \inf\{\|f\|_\mu^2 : f \in H_2(\mu),\ f(0) = 1\}.$$
If the process is unpredictable, then $\log\mu' \in L_1$ and the Szegő kernel $s_w(z)$ will be defined in terms of the outer spectral factor $\sigma$ as [see (1.43) and (2.18)]

$$s_w(z) = \frac{1}{(1 - \overline{w}z)\,\sigma(z)\,\overline{\sigma(w)}}, \qquad \sigma(z) = \exp\left\{\frac{1}{2}\int D(t,z)\log\mu'(t)\,d\lambda(t)\right\}.$$

From Lemma 2.3.6 we know that the Szegő kernel is reproducing for $H_2(\mu)$ and by Theorem 1.4.2 we know that the solution of problem $P^2(1,0)$ will be given by $f(z) = s_0(z)/s_0(0) = \sigma(0)/\sigma(z)$ with minimum $s_0(0)^{-1} = |\sigma(0)|^2$. In other words, the optimal prediction is given by ($Z$ is the backward shift and $I$ is the identity)

$$\hat{x}_0 = x_0 - e_0 = x_0 - f(Z)x_0 = [I - f(Z)]x_0, \qquad f(z) = \sigma(0)/\sigma(z).$$

The operator $I - f(Z)$ is the orthogonal projection operator in $\mathcal{X}_0^-$ onto the subspace $\mathcal{X}_{-1}^-$. The prediction error is given by
$$E = |\sigma(0)|^2.$$
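Since $D(t,0) = 1$ on the unit circle, the formula for $\sigma$ gives $|\sigma(0)|^2 = \exp\{\int\log\mu'(t)\,d\lambda(t)\}$, so the prediction error is the geometric mean of the spectral density. This can be checked numerically. The sketch below assumes, purely for illustration, the spectral density $\mu'(\theta) = (1-a^2)/|1 - ae^{i\theta}|^2$ of a unit-energy first-order autoregressive process. Neither this density nor the name `prediction_error` comes from the text.

```python
import cmath
import math


def prediction_error(density, samples=4096):
    """Approximate exp of the mean of log(density) over [0, 2*pi)."""
    step = 2 * math.pi / samples
    mean_log = sum(math.log(density(k * step)) for k in range(samples)) / samples
    return math.exp(mean_log)


a = 0.5


def ar1_density(theta):
    # Spectral density of a unit-energy first-order autoregressive process
    # (an illustrative assumption); its covariances are mu_k = a**|k|.
    return (1 - a * a) / abs(1 - a * cmath.exp(1j * theta)) ** 2


print(prediction_error(ar1_density))  # close to 1 - a**2 = 0.75
```

The equispaced sum converges very quickly here because the integrand is smooth and periodic. A density with zeros on the circle would make $\log\mu'$ singular and would need more careful quadrature.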
Thus the prediction problem (for an unpredictable process) is in fact equivalent with the Szegő problem (see Section 9.1), which is also a spectral factorization problem: Given a measure $\mu$ satisfying the Szegő condition, find its spectral factor $\sigma$. The classical polynomial approach is to consider the problem $P^2(1,0)$ in the finite-dimensional subspaces $\mathcal{P}_n \subset H_2(\mu)$ of all polynomials of degree at most $n$, hence $\dim\mathcal{P}_n = n + 1$. Denoting this problem as $P_n^2(1,0)$ (see Theorem 2.3.2),

$$P_n^2(1,0): \quad \inf\{\|f\|_\mu^2 : f(0) = 1,\ f \in \mathcal{P}_n\},$$

we get the solution $f = \varphi_n^* = \kappa_n^{-1}\phi_n^*$, where $\phi_n$ is the $n$th orthonormal Szegő polynomial with leading coefficient $\kappa_n > 0$, and the $\varphi_n$ are the monic ones. The minimum is given by $\kappa_n^{-2}$. Such finite-dimensional solutions approximate the prediction in the sense that one finds the best possible prediction of order $n$: $\hat{x}_{0,n} = (x_0|\mathcal{X}_{0,n}^-)$, where $\mathcal{X}_{0,n}^- = \text{span}\,\{x_{-1}, \ldots, x_{-n}\}$. Thus, setting

$$\hat{x}_{0,n} = -(a_1 x_{-1} + \cdots + a_n x_{-n}) = -(a_1 Z + \cdots + a_n Z^n)x_0,$$

we get for the $n$th-order forward innovation

$$e_{0,n} = x_0 - \hat{x}_{0,n} = (1 + a_1 Z + \cdots + a_n Z^n)x_0 = \varphi_n^*(Z)x_0.$$
Here $\varphi_n^*$ is called the (forward) predictor (polynomial) of order $n$. To give an interpretation for the Szegő polynomials $\varphi_n$ and the Szegő-Levinson recurrence, we consider the $n$th-order backward prediction problem where it is required to "predict" the variable $x_{-n}$ from its future values $x_{1-n}, \ldots, x_0$. Thus we have to find the projection of $x_{-n}$ onto span $\{x_{1-n}, \ldots, x_0\}$. The projecting vector $f_{0,n}$, that is, $x_{-n}$ minus this projection, is called the $n$th-order backward innovation. In the frequency domain, this backward problem is translated into finding the solution of
where V^ is the set of monic polynomials of degree at most n. Because || (pn\\fi = \\
k=\
where the $\lambda_k$ are the Szegő parameters or reflection coefficients that appear in the Szegő-Levinson recurrence [see (4.1)]
1
K
0 1 Replacing z by Z and applying this to xt we find the time-domain relations
\f,.n e
\ t,n
1
K
L 1
Z 0 0 1
1
ft,n-\
K
L 1
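The recursion and its time-domain form can be tried out on a small example. The sketch below is written under stated assumptions rather than as a transcription of (4.1): it uses the standard Levinson-Durbin recursion for covariances $\mu_k = E\{x_i\overline{x}_{i-k}\}$ and the predictor $1 + a_1 z + \cdots + a_n z^n$, and its reflection coefficients `k` agree with the Szegő parameters only up to the conjugation convention chosen in the recurrence. The names `levinson` and `lattice` and the sample covariances are hypothetical.

```python
def levinson(moments):
    """Levinson-Durbin recursion for mu_0..mu_n with mu_{-k} = conj(mu_k).

    Returns (a, refl, err): the coefficients a_1..a_n of the predictor
    1 + a_1 z + ... + a_n z^n, the reflection coefficients k_1..k_n, and
    the prediction errors E_0..E_n.
    """
    r = [complex(m) for m in moments]
    a, refl, err = [], [], [r[0].real]
    for m in range(1, len(r)):
        acc = r[m] + sum(a[i - 1] * r[m - i] for i in range(1, m))
        k = -acc / err[-1]
        # Order update: a_i <- a_i + k * conj(a_{m-i}), then append a_m = k.
        a = [a[i - 1] + k * a[m - i - 1].conjugate()
             for i in range(1, m)] + [k]
        refl.append(k)
        err.append(err[-1] * (1 - abs(k) ** 2))
    return a, refl, err


def lattice(x, refl):
    """Run the cascade of lattice sections on a finite sample x_0..x_{T-1}.

    Samples before time 0 are taken to be zero. Returns the forward and
    backward innovation sequences of order len(refl).
    """
    e = [complex(v) for v in x]
    f = list(e)
    for k in refl:
        delayed = [0j] + f[:-1]  # the delay Z applied to the backward signal
        e, f = ([ev + k * fd for ev, fd in zip(e, delayed)],
                [fd + k.conjugate() * ev for ev, fd in zip(e, delayed)])
    return e, f


# Illustrative covariances (an assumption, not data from the text): an
# equal mixture of two first-order autoregressive covariance sequences.
c1, c2 = 0.6, -0.3 + 0.4j
moments = [0.5 * c1 ** k + 0.5 * c2 ** k for k in range(4)]
coeffs, refl, errors = levinson(moments)

# Feeding a unit impulse through the lattice reproduces 1, a_1, ..., a_n.
forward, backward = lattice([1, 0, 0, 0], refl)
print(coeffs)
print(forward)
```

The impulse response check shows that the cascade of elementary sections and the predictor polynomial describe the same filter. Each section needs only one reflection coefficient, which is what makes the cascade attractive for implementation.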
The innovation prediction filter takes as input the stochastic process x, and produces as output the nth-order innovations et>n and / r „ as in Figure 12.1. By the Szego-Levinson recursion, this can be realized in a cascade of elementary 2-ports as in Figure 12.2. Each 2-port is described by an elementary 2 x 2 matrix of the form 1
A*
Z
0
0
1
which can be realized as a lattice in Figure 12.3. The $Z$-block represents a delay. This innovation prediction filter is sometimes called a whitening filter because it can be shown that, under suitable conditions, the processes $e_{t,n}$ and $f_{t,n}$ become white noise as $n \to \infty$. The same filter can be formulated in the frequency domain. In that case, the innovation prediction filter is just the Szegő-Levinson recurrence relation. The input is 1, the delay operator is replaced by a multiplication with $z$, and the outputs of the filter are the polynomials $\varphi_n$ and $\varphi_n^*$ as in Figure 12.4. Inverting the filter gives a scheme like that shown in Figure 12.5.

Figure 12.1. Innovation prediction filter.
Figure 12.2. Cascade form of innovation prediction filter.
Figure 12.3. Lattice section of innovation prediction filter.
Figure 12.4. Frequency-domain formulation of innovation prediction filter.
Figure 12.5. Modeling filter in the time domain.
Figure 12.6. Modeling lattice section in the frequency domain.
Figure 12.7. Modeling ladder section in the frequency domain.

In the frequency domain the effect can be described by the equations
\h\2)z
These are directly obtained from the Szego-Levinson recursions. This elementary section is depicted in Figure 12.6. Such a filter section is equivalent to the ladder structure of Figure 12.7 since the previous relations can be rewritten as (Pk(z) = z
The filter of Figure 12.5 is called a modeling filter because, in the time domain, it gives the stochastic process as the output provided the innovation process $e_{t,n}$ is applied at the input. The backward innovation process $f_{t,n}$ comes as a bonus. Thus to generate the process $x_t$ exactly from an $n$th-order modeling filter, we need to know the $e_{t,n}$, but storing this information is as difficult and expensive as storing the process $x_t$ itself. However, in practical applications, it is assumed that during the analysis of the process, we obtained a whitening filter (predictor) that catches the characteristic information of the process quite well, which means that the process $e_{t,n}$ is almost informationless. In stochastic terms, this means that $e_{t,n}$ is approximately white noise and the only information it contains is its energy $E_n$. Thus, if we feed the modeling filter with a normalized white noise process $w_t$, that is, $E\{w_k\overline{w}_l\} = \delta_{kl}$ so that it has a perfectly flat spectrum $W(z) = 1$, then we can assume that
xt =
Eln/2
Since the spectral measure of wt is the normalized Lebesgue measure, we see that the autocorrelation coefficients of the process xt are given by (see Theorem 6.1.9 and recall that En = K~2 and 0 n = Kn
dx
(t)
f i k
dx
(0
f i k
for all $0 \le k, l \le n$. Thus our model will match the autocorrelation coefficients $\mu_k$ of the given process for $k = 0, \pm 1, \ldots, \pm n$. The spectral density $\mu' = |\sigma|^2$ of the original process and the spectral density $\hat{\mu}' = |\phi_n^*|^{-2}$ of the simulated process will interpolate in the sense that $\sigma(z) - 1/\phi_n^*(z)$ has a zero of order $n + 1$ at the origin (see Theorem 6.3.1). This is equivalent with the Riesz-Herglotz transforms
$$\Omega(z) = \int D(t,z)\,d\mu(t) \quad\text{and}\quad \hat{\Omega}(z) = \int D(t,z)\,d\hat{\mu}(t) = \frac{\psi_n^*(z)}{\phi_n^*(z)}$$
interpolating each other such that $\Omega(z) - \hat{\Omega}(z)$ has a zero of order $n + 1$ at the origin. From Chapter 9 (in fact, as a special case, setting all $\alpha_k = 0$), we know how and when $\phi_n^*$ and $\kappa_n$ (and also $\phi_n$) converge as $n \to \infty$. Under suitable conditions we have convergence in some sense of $E_n = \kappa_n^{-2} \to |\sigma(0)|^2$ and $\phi_n^*(z) \to \sigma(0)/$
12.1. Linear prediction
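it means that among all filters with the same amplitude characteristic, the range of the phase angle will be minimized. Therefore, if the transmission zeros are outside the closed unit disk, the filter will be called minimal phase. A filter that has no finite transmission zeros is called an autoregressive filter. However, the fact that there is no freedom left in choosing the transmission zeros can be seen as a restriction. This is the point where the orthogonal rational functions have an advantage over the orthogonal polynomials. Let us motivate this by a simple example. If the spectral measure were
$$d\mu(t) = \left|\frac{t-\alpha}{t-\beta}\right|^2 d\lambda(t), \qquad \alpha, \beta \in \mathbb{D},$$
then we are trying to approximate the predictor
$$\frac{\sigma(0)}{\sigma(z)} = \frac{1-\overline{\beta}z}{1-\overline{\alpha}z} = (1-\overline{\beta}z)[1 + \overline{\alpha}z + (\overline{\alpha}z)^2 + (\overline{\alpha}z)^3 + \cdots]$$
by a polynomial. If instead an estimate $\alpha'$ of $\alpha$ is available, one can try to approximate
$$\frac{(1-\overline{\beta}z)(1-\overline{\alpha}'z)}{1-\overline{\alpha}z} = (1-\overline{\beta}z)\{1 + (\overline{\alpha}-\overline{\alpha}')z[1 + \overline{\alpha}z + (\overline{\alpha}z)^2 + (\overline{\alpha}z)^3 + \cdots]\}$$
by a polynomial $p_n \in \mathcal{P}_n$. This will be much easier because the ratio $(1-\overline{\alpha}'z)/(1-\overline{\alpha}z)$ is almost 1, so that the rational function on the left-hand side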
is almost a polynomial of degree 1, which is reflected by the fact that on the right-hand side the part of degree 1 is dominant. Indeed, the slowly converging series, which also appeared in the previous expansion, is now multiplied by a small number $(\overline{\alpha} - \overline{\alpha}')$. Thus if this number is small enough, then an approximation with $n = 1$ will already give a fairly good fit. This idea can of course be generalized, which leads to the setting of our orthogonal rational functions. Instead of working with the subspaces $\mathcal{P}_n$ of polynomials, we work with the subspaces $\mathcal{L}_n$, which are rational functions with given poles. If we want the modeling filter (which was $1/\phi_n^*$ in the polynomial case and is now of the form $p_n/q_n$ with $p_n, q_n \in \mathcal{P}_n$) to be stable and minimum phase, then its poles and zeros (transmission modes and transmission zeros) should all be outside the closed unit disk. We fix the zeros by choosing them as estimates for the transmission zeros of the given process. Since we have only the spectral information of the given process, we can choose the transmission zeros to be either $\alpha$ or $1/\overline{\alpha}$ since indeed the spectral density is $|\sigma(t)|^2 = \sigma(t)\sigma_*(t)$. Thus if we want a minimal phase model, we should take care that they are chosen to be all outside the closed unit disk. Now it turns out that these zeros appear as the poles of the function spaces $\mathcal{L}_n$. So we are working in the spaces $\mathcal{L}_n$ of functions analytic in $\mathbb{D}$ as they were considered in the first chapters of this book. The stability of the optimal predictor will come as a bonus, just as in the polynomial case (see below). One might expect that the optimal predictor is given by the rational function $\phi_n^*$, but this is not true. Indeed, the finite-dimensional subspaces of the past on which we project $x_0$ have now a Kolmogorov image, which are the spaces $\mathcal{L}_n$. We know that the optimal $n$th-order predictor is found by solving the problem
The solution to this problem is given by $f(z) = k_n(z,0)/k_n(0,0)$, where $k_n(z,w)$ is the reproducing kernel for $\mathcal{L}_n$ (see Theorem 2.3.2). If $K_n(z,w) = k_n(z,w)[k_n(w,w)]^{-1/2}$ is the normalized kernel, then, with $K_n(z) = K_n(z,0)$, we can also describe the optimizing function as $f(z) = K_n(z)/K_n(0)$. The $n$th-order prediction error is given by $E_n = 1/k_n(0,0) = K_n(0)^{-2}$. See Theorem 1.4.2. By (3.14) we know that this error equals
$$E_n = \frac{1}{K_n(0)^2} = \prod_{k=1}^{n}\frac{1-|\rho_k(0)|^2}{1-|\gamma_k(0)|^2} = \prod_{k=1}^{n}\frac{1-|\rho_k(0)|^2}{1-|\alpha_k\rho_k(0)|^2},$$
where $\rho_k$ and $\gamma_k$ are the coefficients that appear in the recurrence relation for the reproducing kernels. In analogy with the polynomial case, we find a modeling
filter, which is given by
We know that Kn is an outer function in H2 [see Theorem 3.3.3(3)]. Thus its zeros, which are the transmission modes of the modeling filter, will all be outside the closed unit disk. Hence the modeling filter is stable. As in the polynomial case we have that the process
has spectral measure $d\hat{\mu} = d\lambda/|K_n|^2$ and autocorrelation coefficients
$$\hat{\mu}_k = \int t^{-k}\,\frac{d\lambda(t)}{|K_n(t)|^2}.$$
Since $\hat{\mu}_0 = \mu_0$, the synthesized process has the same energy as the original one, but for the other autocorrelation coefficients we have in general $\hat{\mu}_k \neq \mu_k$, $k \ge 1$. Instead, the approximation is such that the outer spectral factors $\hat{\sigma} = 1/K_n(z)$ and $\sigma$ interpolate each other in the points $0, \alpha_1, \ldots, \alpha_n$ (Theorem 6.3.1). Also, the Riesz-Herglotz transforms $\hat{\Omega}(z) = L_n(z,0)/K_n(z,0)$ and $\Omega_\mu$ interpolate in these points (Theorem 6.3.3). The inner product in $\mathcal{L}_n$ is the same with respect to $\mu$ and with respect to $\hat{\mu}$. This means that the spectral information of the original process $x_t$, which is contained in the "information space" $\mathcal{L}_n$, is matched completely by the simulated process $\hat{x}_t$. The realization of the filters (whitening and modeling) has the same cascade structure but the sections are somewhat more complicated because the recurrence relation for the kernels is more complicated. For example, section $k$, which is the frequency-domain formulation of a normalized innovation prediction filter, would be of the form given in Figure 12.8. The notation used
Figure 12.8. Section n of normalized innovation prediction filter in the frequency domain.
is the same as in Theorem 3.3.1. For practical applications, one can use the Nevanlinna-Pick algorithm as described in Section 6.4. The convergence of Kn(z) = Kn(z, 0) to the inverse of the spectral factor was studied in Chapter 9.
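The advantage claimed above for rational information spaces can be checked numerically. The following is a minimal sketch, not the book's algorithm: it assumes the example density $|(t-\alpha)/(t-\beta)|^2$, builds a Malmquist-type basis of $\mathcal{L}_n$ (orthonormal for the normalized Lebesgue measure), approximates the Gram matrix with the trapezoidal rule on the unit circle, and evaluates the prediction error $E_n = 1/k_n(0,0)$. The function names, the quadrature size, and the values of $\alpha$, $\beta$, and the estimate $\alpha'$ are illustrative assumptions.

```python
import numpy as np

def malmquist_basis(alphas, z):
    """Rows b_0, ..., b_n of a Malmquist-type basis for the points 0, alphas[0], ...

    b_0 = 1 and b_k(z) = sqrt(1-|a_k|^2)/(1 - conj(a_k) z) * prod_{j<k} (z-a_j)/(1-conj(a_j) z),
    with a_0 = 0.  For all a_k = 0 this reduces to 1, z, ..., z^n.
    """
    z = np.atleast_1d(np.asarray(z, dtype=complex))
    points = [0.0] + list(alphas)
    blaschke = np.ones_like(z)
    rows = []
    for a in points:
        rows.append(np.sqrt(1.0 - abs(a) ** 2) / (1.0 - np.conj(a) * z) * blaschke)
        blaschke = blaschke * (z - a) / (1.0 - np.conj(a) * z)
    return np.array(rows)

def prediction_error(alphas, density, n_nodes=4096):
    """E_n = 1/k_n(0,0) for the space spanned by the basis above (assumed setup)."""
    t = np.exp(2j * np.pi * np.arange(n_nodes) / n_nodes)   # nodes on the unit circle
    w = density(t) / n_nodes                                # trapezoidal weights times density
    B = malmquist_basis(alphas, t)
    M = (B * w) @ B.conj().T                                # M[i, j] = <b_i, b_j>_mu
    e0 = malmquist_basis(alphas, 0.0)[:, 0]                 # basis evaluated at z = 0
    k00 = np.real(np.vdot(e0, np.linalg.solve(M, e0)))      # k_n(0,0) = e0^H M^{-1} e0
    return 1.0 / k00

alpha, beta = 0.9, 0.5                                      # illustrative values
density = lambda t: np.abs((t - alpha) / (t - beta)) ** 2

E_poly = prediction_error([0.0], density)      # polynomial case, n = 1
E_est = prediction_error([0.85], density)      # rational case with a rough estimate alpha'
E_exact = prediction_error([alpha], density)   # rational case with the exact alpha
print(E_poly, E_est, E_exact)
```

With the exact $\alpha$ the space $\mathcal{L}_1$ contains the predictor itself, so the error should reach the limiting value $|\sigma(0)|^2 = 1$ up to rounding, while a rough estimate $\alpha'$ already gives a smaller error than the polynomial space of the same dimension.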
12.2. Pisarenko modeling problem
This problem was originally considered in Ref. [179]. For other discussions of the problem see Refs. [53], [108], [11], and [55]. Let $\{x_t\}$ be a stationary process as before and let its $n + 1$ first autocorrelation coefficients
$$\mu_k^x = E\{x_t\overline{x}_{t-k}\}, \qquad k = 0, \ldots, n$$
be given. Suppose we want to model this process as $x_t = y_t + w_t$, where $y_t$ and $w_t$ are uncorrelated processes and $w_t$ is a white noise process with variance $\sigma$. Then one has $\mu_k^x = \mu_k^y + \sigma\delta_{k0}$, $k = 0, 1, \ldots$, where $\mu_k^y = E\{y_t\overline{y}_{t-k}\}$ are the autocorrelation coefficients of the process $\{y_t\}$. Define the covariance matrices $G_n^x = [\mu^x_{i-j}]_{i,j=0}^n$ and $G_n^y = [\mu^y_{i-j}]_{i,j=0}^n$.
Then $G_n^y = G_n^x - \sigma I_n$ and since this is a covariance matrix it has to be positive semidefinite. Therefore, the maximal possible value of $\sigma$ is the smallest nonnegative eigenvalue of $G_n^x$. For simplicity let us assume that this eigenvalue is simple. The Carathéodory-Toeplitz theorem says:

Theorem 12.2.1 (Carathéodory-Toeplitz). Let $G_k = [\mu_{i-j}]_{i,j=0}^k$, $k = 0, 1, \ldots$ be the covariance matrices of size $k + 1$ that are associated with the spectral measure $\mu$. Then $\det G_k \neq 0$, $k = 0, 1, \ldots, n - 1$ and $\det G_n = 0$ iff $\mu$ is a positive measure with $n$ mass points.

If the measure has $n$ mass points, then the $n$th orthogonal polynomial $\phi_n$ with respect to this measure is $\eta$-reciprocal, that is, $\phi_n^* = \eta\phi_n$ with $\eta \in \mathbb{T}$. Moreover, $\phi_n$ has $n$ zeros that are all on $\mathbb{T}$ and coincide with the mass points of the measure. The outer spectral factor of the measure is $1/\phi_n^*$ and its Riesz-Herglotz transform is $\psi_n^*/\phi_n^*$, where $\psi_n$ is the polynomial of the second kind. We apply this in the case of $G_n = G_n^y$ and $\mu = \mu^y$. Let the mass points be
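$\xi_k = e^{-i\theta_k} \in \mathbb{T}$, and the corresponding masses $h_k > 0$. Then
$$\Omega_\mu(z) = \sum_{k=1}^{n} h_k\,\frac{\xi_k + z}{\xi_k - z}, \qquad h_k > 0. \qquad (12.2)$$
Note that (see Chapter 5) $h_k = 1/k_{n-1}(\xi_k, \xi_k)$. The above formula (12.2) for $\Omega_\mu$ implies that with
$$\Omega_\mu(z) = \mu_0 + 2\sum_{k=1}^{\infty}\mu_k z^k, \qquad \mu_k = \sum_{i=1}^{n} h_i e^{ik\theta_i}.$$
Thus
$$y_t = \sum_{k=1}^{n}\sqrt{h_k}\,e^{i(t\theta_k + \gamma_k)},$$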
where the phase angles $\gamma_k$ are uncorrelated zero-mean random variables. Such a process is generated by $n$ sinusoidal wave generators. This is called the Pisarenko model [179] (see Figure 12.9). Thus the procedure to obtain this model goes as follows: First find the smallest eigenvalue $\sigma$ of $G_n^x$. Then define the moments $\mu_0^y = \mu_0^x - \sigma$ and $\mu_k^y = \mu_k^x$ for $k = 1, \ldots, n$. From these $\mu_k^y$ one can generate the Szegő polynomials $\phi_0, \ldots, \phi_n$. The zeros of $\phi_n$ are the $\xi_k = e^{-i\theta_k} \in \mathbb{T}$, $k = 1, \ldots, n$ and the weights are given by $h_k = 1/k_{n-1}(\xi_k, \xi_k)$.
Figure 12.9. Pisarenko frequency model ($n$ sinusoidal wave generators and a white noise generator).
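The polynomial procedure just described can be sketched in a few lines. This is a hedged illustration rather than the method of the text: instead of running the Szegő recursion it takes the eigenvector of the smallest eigenvalue of the Gram matrix (whose associated polynomial vanishes at the mass points), and instead of the kernel formula for the weights it solves the moment equations by least squares. The function name `pisarenko`, the moment convention $\mu_k = \sum_i h_i \xi_i^{-k}$, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def pisarenko(mu):
    """Estimate noise variance, angles theta_k and weights h_k from mu_0, ..., mu_n.

    Assumed convention: mu_k = sum_i h_i * xi_i**(-k) + sigma * delta_{k0},
    with mass points xi_i = exp(-1j * theta_i).
    """
    mu = np.asarray(mu, dtype=complex)
    n = len(mu) - 1
    idx = np.arange(n + 1)
    diff = idx[:, None] - idx[None, :]
    # Hermitian Toeplitz Gram matrix G[i, j] = mu_{i-j}, with mu_{-k} = conj(mu_k)
    G = np.where(diff >= 0, mu[np.abs(diff)], np.conj(mu[np.abs(diff)]))
    evals, evecs = np.linalg.eigh(G)           # eigenvalues in ascending order
    sigma = evals[0]                           # smallest eigenvalue = noise variance
    null_vec = evecs[:, 0]                     # coefficients v_0, ..., v_n of sum_j v_j z^j
    xi = np.roots(null_vec[::-1])              # np.roots expects the highest degree first
    mu_y = mu.copy()
    mu_y[0] -= sigma                           # moments of the noise-free process
    A = xi[None, :] ** (-idx[:, None])         # A[k, i] = xi_i**(-k)
    h = np.linalg.lstsq(A, mu_y, rcond=None)[0].real
    theta = np.mod(-np.angle(xi), 2 * np.pi)
    order = np.argsort(theta)
    return sigma, theta[order], h[order]

# Synthetic test data (illustrative values): three sinusoids plus white noise.
theta_true = np.array([0.5, 1.3, 2.4])
h_true = np.array([1.0, 0.5, 2.0])
sigma_true = 0.3
k = np.arange(4)
mu = (h_true[None, :] * np.exp(1j * k[:, None] * theta_true[None, :])).sum(axis=1)
mu[0] += sigma_true

sigma_est, theta_est, h_est = pisarenko(mu)
print(sigma_est, theta_est, h_est)
```

The sketch relies on the assumption made in the text that the smallest eigenvalue is simple; with noisy estimated autocorrelations the roots would only lie near the unit circle.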
In this classical model, one draws information about the spectrum of the process from the information space $\mathcal{P}_n$. This means that we use the matrix $G_n^x$, which is the Gram matrix of the basis $\{1, z, \ldots, z^n\}$ for $\mathcal{P}_n$ with respect to the measure $\mu$. Again, in this application, the use of orthogonal rational functions instead of orthogonal polynomials may have advantages. In this generalization to the rational case one draws information about the spectrum from the information space $\mathcal{L}_n$ instead of $\mathcal{P}_n$. Suppose we choose for $\mathcal{L}_n$ the Malmquist basis given by (2.13). Then if $\mu^x$, $\mu^y$, and $\mu^w$ are the spectral measures for the processes $\{x_k\}$, $\{y_k\}$, and $\{w_k\}$ respectively, then $\mu^x = \mu^y + \mu^w$.
Because $\mu^w$ is $\sigma$ times the normalized Lebesgue measure, and because the Malmquist basis is orthonormal for this measure, we find that the Gram matrices are related by $G_n^x = G_n^y + \sigma I_n$.
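Because $G_n = G_n^y = G_n^x - \sigma I_n$ has to be positive semidefinite, it follows that $\sigma$ is again the smallest eigenvalue of $G_n^x$. For $\mu = \mu^y$, it follows from Theorem 2.2.4 and (3.14) that
$$\frac{1}{|\kappa_n|^2} = \frac{\det G_n}{\det G_{n-1}} = \frac{1}{k_n(\alpha_n, \alpha_n)} = \prod_{k=1}^{n}\frac{1-|\rho_k(\alpha_n)|^2}{1-|\gamma_k(\alpha_n)|^2}.$$
Thus if $\det G_n = 0$ then $\rho_n(\alpha_n) \in \mathbb{T}$. Thus $\phi_n(\alpha_n)/\phi_n^*(\alpha_n) \in \mathbb{T}$. Because $\phi_n(z)/\phi_n^*(z)$ is an inner function, and $\alpha_n \in \mathbb{D}$, this implies that this function is a unimodular constant $\eta \in \mathbb{T}$.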
Thus $\phi_n(z) = \eta\phi_n^*(z)$ is $\eta$-reciprocal as in the polynomial case. The paraorthogonal function $Q_n$ is thus given by
$$Q_n(z, \tau) = \phi_n(z) + \tau\phi_n^*(z) = (\eta + \tau)\phi_n^*(z).$$
Because $Q_n$ has $n$ simple zeros on $\mathbb{T}$, $\phi_n^*(z)$, and hence also $\phi_n(z)$, will have $n$ simple zeros on $\mathbb{T}$. The singularity of $G_n$ also means that the Gram-Schmidt orthogonalization process breaks down in step $n + 1$, so that $L_2(\mu) = \mathcal{L}_n$ is $n + 1$ dimensional because $\mu$ is positive. This is only possible if $\mu$ is a discrete measure having $n$ mass points with positive weights. The rest of the solution goes as in the polynomial case. One computes the smallest eigenvalue $\sigma$ of $G_n^x$, sets $G_n = G_n^x - \sigma I_n$, and from this Gram matrix generates the orthogonal rational functions $\phi_k(z)$. The zeros $\xi_k$ of $\phi_n$ will all be on $\mathbb{T}$ and the weights $h_k$ are given by $h_k = 1/k_{n-1}(\xi_k, \xi_k)$. Again, as in the previous applications, when the information from the spaces $\mathcal{L}_n$ is available, or even approximately available via estimates of the $\alpha_k$, then the rational Pisarenko model will give a better approximation of the signal than the polynomial one.
12.3. Lossless inverse scattering
The material of this section is mainly inspired by the work of Dewilde and Dym. For more detailed information refer to Refs. [64], [60], [58], [62], and [59]. Consider a dissipative scattering medium $M$ as depicted in Figure 12.10. In inverse scattering one wants to find a model for the medium $M$, given an input signal $u(t)$ (incident wave) and an output signal $v(t)$ (reflected wave). For digital processing, the signals are sampled at discrete time intervals and we shall therefore consider discrete time signals $w_k$, $k \in \mathbb{Z}$. The energy of such a signal is defined as its $\ell_2$-norm: $\sum_{k\in\mathbb{Z}}|w_k|^2$. We consider signals with finite energy, which are thus $\ell_2$ sequences. The z-transform
$$W(z) = \sum_{k=-\infty}^{\infty} w_k z^k$$
Figure 12.10. Scattering medium.
will converge to a function $W \in L_2(\mathbb{T})$. Obviously the energy is also given by
$$\sum_{k=-\infty}^{\infty}|w_k|^2 = \int |W(t)|^2\,d\lambda(t),$$
where $\lambda$ is the normalized Lebesgue measure on $\mathbb{T}$. The medium is supposed to act as a linear system with transfer function $S$. This means that if we denote the z-transforms of the incident wave and reflected wave as $U(z)$ and $V(z)$ respectively, then $V(z) = S(z)U(z)$. The transfer function $S(z)$ is called the scattering function of the medium. For physical reasons, it is plausible that the medium behaves as a causal system. This means that there cannot be any output different from zero before there has been any input that was different from zero. Thus, when the medium is excited with a unit impulse at time zero, that is, with a signal $u_k = \delta_{k0}$ whose z-transform is $U(z) = 1$, then the output, with z-transform $V(z) = S(z)$, has Fourier coefficients $s_k = 0$ for $k < 0$. This means that $S$ is analytic in $\mathbb{D}$. The system is supposed to be stable in the sense that bounded inputs are transformed into bounded outputs, and this means that $S \in H_\infty = H_\infty(\mathbb{T})$. Moreover, if the medium is passive, then it should add no energy to any signal while it is being scattered. Mathematically this forces $S(z)$ to be bounded by 1: $|S(z)| \le 1$ for all $z$ in the closed unit disk. In other words, $S$ is a Schur function: $S \in \mathcal{B}$. In what follows we use scattering function in the meaning of passive scattering function and this is a synonym for Schur function. If the medium does not absorb energy, then the medium and its scattering function are called lossless, which means mathematically that $|S(t)| = 1$ a.e. for $t \in \mathbb{T}$. Thus a lossless scattering function is an inner function in $H_\infty$.

We need a generalization of these concepts to square matrix valued functions. We say that an $n \times n$ matrix valued function $\Sigma(z)$ is a (passive) scattering matrix if it is analytic in $\mathbb{D}$ and if it is contractive in $\mathbb{D}$; thus $\Sigma(z)^H\Sigma(z) \le I_n$ for $z \in \mathbb{D}$. Here $I_n$ stands for the $n \times n$ unit matrix and the inequality is understood in the sense of positive definiteness: $\Sigma^H\Sigma \le I_n$ iff $I_n - \Sigma^H\Sigma$ is positive semidefinite.
Note that $\Sigma^H\Sigma \le I_n \Leftrightarrow \Sigma\Sigma^H \le I_n$. Thus $\Sigma$ is a scattering matrix iff $\Sigma^H$ is a scattering matrix. A scattering matrix is called lossless if it is unitary a.e. on $\mathbb{T}$; thus $\Sigma(t)^H\Sigma(t) = I_n$ a.e., $t \in \mathbb{T}$. For $n = 1$ we obtain the previous definition of a (lossless) scattering function. Thus a scattering function is a scattering matrix of size $1 \times 1$. If we think of the scattering medium $M$ as having a top surface at which the incident and reflected waves are observed and at the other end a bottom surface
Figure 12.11. Scattering medium with load.
where another medium starts with some other scattering properties, then we can represent the whole system as in Figure 12.11. The scattering function of the underlying medium is $S_L$ (the load). The scattering properties of the medium $M$ are described by a scattering matrix $\Sigma$. We have
$$V_L(z) = S_L(z)U_L(z) \qquad (12.3)$$
and
(12.4)
where $U_0$ and $V_0$ now represent the incident wave and reflected wave at the top surface of the medium $M$, while $U_L$ is the transmitted wave that emerges at the bottom surface of $M$ and is incident for the underlying medium and $V_L$ is the wave reflected by the underlying medium and is incident at the bottom surface of $M$. Using circuit terminology, we say that the load is described by a 1-port (1 input, 1 output) and the medium itself is described by a 2-port (2 inputs, 2 outputs). The whole system is a cascade of the 2-port and the 1-port. We say that the 2-port is loaded by a passive load $S_L$. The scattering matrix gives the relation between the incident waves ($U_0$ and $V_L$) and the output waves ($V_0$ and $U_L$). Although this is the most logical description, it has mathematically a number of disadvantages. For example, the overall scattering function for the combination of the medium $M$ and its load is given by
$$S_0(z) = \frac{V_0(z)}{U_0(z)} = \cdots, \qquad (12.5)$$
where $\Sigma = [\Sigma_{ij}]_{i,j=1,2}$. This is not a simple formula. Similarly, if we have two consecutive media, each having a scattering matrix, then the scattering matrix of the cascade of the two media is difficult to obtain. If we consider the cascade
Figure 12.12. Cascade of two scattering media. of Figure 12.12 then and
V'
V"
For the cascade we have
v j
~
[v"\'
where S" is given by the Redheffer product [183] of S and E': S" = S * £' =
+ ^22
with $r = (1 - \Sigma_{22}\Sigma'_{11})^{-1}$. If $1 - \Sigma_{22}\Sigma'_{11}$ is not identically zero, this will exist for all values of $z$, except for at most a countable number of values in $\mathbb{D}$. For our purposes, it is much easier to work with chain scattering matrices. Whereas the scattering matrix gives the relation between input and output waves for the 2-port, the chain scattering matrix gives the relation between the waves at the bottom and the waves at the top surface. Thus
$$\begin{bmatrix} U' \\ V' \end{bmatrix} = \Theta\begin{bmatrix} U \\ V \end{bmatrix}. \qquad (12.6)$$
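The relation between $\Sigma$ and $\Theta$ can be found as follows. Define the projection matrices
$$P = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{and}\quad P^{\perp} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.$$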
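Then from (12.6)
$$\begin{bmatrix} U \\ V \end{bmatrix} = (P^{\perp}\Sigma + P)\begin{bmatrix} U \\ V' \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} U' \\ V' \end{bmatrix} = (P\Sigma + P^{\perp})\begin{bmatrix} U \\ V' \end{bmatrix}.$$
Therefore,
$$\begin{bmatrix} U' \\ V' \end{bmatrix} = (P\Sigma + P^{\perp})(P^{\perp}\Sigma + P)^{-1}\begin{bmatrix} U \\ V \end{bmatrix},$$
so that
$$\Theta = (P\Sigma + P^{\perp})(P^{\perp}\Sigma + P)^{-1} = \begin{bmatrix} \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} & \Sigma_{12}\Sigma_{22}^{-1} \\ -\Sigma_{22}^{-1}\Sigma_{21} & \Sigma_{22}^{-1} \end{bmatrix} \qquad (12.7)$$
if $\Sigma_{22}$ is not identically zero. Note that the inverse formulas, which express $\Sigma$ in terms of $\Theta$, are completely symmetric, that is,
$$\Sigma = \begin{bmatrix} \Theta_{11} - \Theta_{12}\Theta_{22}^{-1}\Theta_{21} & \Theta_{12}\Theta_{22}^{-1} \\ -\Theta_{22}^{-1}\Theta_{21} & \Theta_{22}^{-1} \end{bmatrix}.$$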
These formulas connecting $\Sigma$ and $\Theta$ are sometimes called the Mason rules [52, p. 100] or the Redheffer transformation rules [21, p. 58]. The chain scattering matrices have the enjoyable property that a cascade is described by an ordinary product, rather than a Redheffer product:
$$\Sigma'' = \Sigma \star \Sigma', \qquad \Theta'' = \Theta'\Theta.$$
Since $\Sigma$ is passive (i.e., contractive in $\mathbb{D}$), it follows that $\Theta$ is also passive, which means that it is $J$-contractive in $\mathbb{D}$:
$$\Theta(z)^H J\,\Theta(z) \le J, \quad\text{where}\quad J = P - P^{\perp} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.$$
If $\Sigma$ is lossless (i.e., unitary a.e. on $\mathbb{T}$), then $\Theta$ is $J$-lossless, which means $J$-unitary on $\mathbb{T}$ a.e.:
$$\Theta(t)^H J\,\Theta(t) = J, \qquad \text{a.e. } t \in \mathbb{T}.$$
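The correspondence between scattering and chain scattering matrices can be illustrated numerically at a single point of $\mathbb{T}$. The sketch below assumes the entrywise form of (12.7) (first row $\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$, $\Sigma_{12}\Sigma_{22}^{-1}$; second row $-\Sigma_{22}^{-1}\Sigma_{21}$, $\Sigma_{22}^{-1}$) and $J = \mathrm{diag}(1, -1)$; the function names and the random test matrices are illustrative and not taken from the text. It checks that the map is its own inverse, that a unitary $\Sigma$ gives a $J$-unitary $\Theta$, that a contractive $\Sigma$ gives a $J$-contractive $\Theta$, and that an ordinary product of chain matrices again corresponds to a unitary scattering matrix.

```python
import numpy as np

J = np.diag([1.0, -1.0])

def chain_from_scattering(S):
    """Transformation (12.7) for 2x2 matrices; applying it twice returns the input."""
    s11, s12, s21, s22 = S[0, 0], S[0, 1], S[1, 0], S[1, 1]
    return np.array([[s11 - s12 * s21 / s22, s12 / s22],
                     [-s21 / s22, 1.0 / s22]])

def random_unitary(rng):
    """A random 2x2 unitary matrix from the QR factorization of a complex Gaussian matrix."""
    A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    Q, _ = np.linalg.qr(A)
    return Q

def j_defect(Theta):
    """Norm of Theta^H J Theta - J; zero for a J-unitary matrix."""
    return np.linalg.norm(Theta.conj().T @ J @ Theta - J)

rng = np.random.default_rng(0)
Sigma1, Sigma2 = random_unitary(rng), random_unitary(rng)      # two lossless media at one frequency
Theta1, Theta2 = chain_from_scattering(Sigma1), chain_from_scattering(Sigma2)

roundtrip_error = np.linalg.norm(chain_from_scattering(Theta1) - Sigma1)

Theta12 = Theta2 @ Theta1                                       # cascade: ordinary product
Sigma12 = chain_from_scattering(Theta12)                        # scattering matrix of the cascade
unitary_defect = np.linalg.norm(Sigma12.conj().T @ Sigma12 - np.eye(2))

Theta_passive = chain_from_scattering(0.8 * Sigma1)             # strictly contractive Sigma
min_eig = np.linalg.eigvalsh(J - Theta_passive.conj().T @ J @ Theta_passive).min()

print(roundtrip_error, j_defect(Theta1), j_defect(Theta12), unitary_defect, min_eig)
```

Working with the product `Theta2 @ Theta1` avoids forming the Redheffer product explicitly, which is the practical reason given above for preferring chain scattering matrices.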
The chain scattering matrices are precisely the matrices discussed in Section 1.5. They showed up in the recurrence relations for the reproducing kernels and the orthogonal functions in Sections 3.3 and 4.1. The following theorem is due to Dewilde and Dym [62, p. 647] (see also Theorem L5.1). Theorem 12.3.1. Every J-lossless chain scattering matrix is of the form _ 1 2 with F = (0 2 1 + ©22)"1 € H2,
- 012*)(011* " 012*)"1 € C,
-(ft+ 00 and where T= is inner. The conclusion of the discussion is that a lossless inverse scattering problem can be formulated as follows. Given a lossless scattering function So(z) = VO(Z)/UQ(Z), find a J-lossless chain scattering matrix 0 and a passive load SL = VL/UL such that
where $U_L$ is analytic in $\mathbb{D}$. Once this problem is solved, we have the relation
$$S_0 = -(\Theta_{22} - S_L\Theta_{12})^{-1}(\Theta_{21} - S_L\Theta_{11}). \qquad (12.8)$$
If at the bottom surface of the medium we have perfect absorption, then SL = 0 and in that case °
02i = n - i ©22 Q + 1 '
Since $V_0 = S_0U_0$, we can rearrange this by an inverse Cayley transform as
If we consider $U_0$ and $V_0$ as incoming and outgoing waves for an electrical 1-port network of characteristic impedance 1, then $U_0 + V_0$ can be interpreted as a "voltage" and $U_0 - V_0$ as a "current" (see the next section or Ref. [23, pp. 160-161]). Therefore, $\Omega$ is sometimes called the input impedance with matched load (the latter referring to $S_L = 0$).

Once more, the orthogonal rational functions can be advantageous over the orthogonal polynomials. The original Schur algorithm is based on a recursive application of the Schur lemma (6.4.2) where each time $\alpha$ is taken to be 0. This algorithm checks whether a given function is a Schur function, but at the same time it constructs rational Schur functions that approximate (by repeated interpolation in the point $\alpha = 0$) the given Schur function. The Nevanlinna-Pick algorithm does basically the same thing but now it is allowed to choose for each of the successive applications of the Schur lemma arbitrary values for $\alpha$, provided they are all in $\mathbb{D}$. Instead of orthogonal polynomials, one obtains orthogonal rational functions. To be more precise, the prominent approximating functions will be related to the reproducing kernels $k_n(z, 0)$ for the rational function spaces we have considered. Note that, in the polynomial case, these are just the reciprocal Szegő polynomials.

We recapitulate from Chapter 6 the following facts. Applying the Nevanlinna-Pick algorithm to the given Schur function $S_0$ computes the successive parameters $\rho_k$ and $\gamma_k = \zeta_k\rho_k$ for a chosen sequence of points $\alpha_k \in \mathbb{D}$. As we know, this gives the elementary matrices $\theta_k$ of Theorem 3.3.1 of the recurrence for the normalized reproducing kernels. Let us assume without loss of generality that we apply the Nevanlinna-Pick algorithm with $w = 0$, that is, we assume that $S_0(0) = 0$. If this is not true, then a simple transformation
$$\frac{S_0(z) - S_0(0)}{1 - \overline{S_0(0)}S_0(z)}$$
will arrange for it. Then, setting
$$\Omega(z) = \frac{1 - S_0(z)}{1 + S_0(z)},$$
we find after application of the algorithm the relation (6.21), which can be rewritten as
2 \ Kn(l
-&«)
].
Bn
Figure 12.13. Layered medium.
Thus setting Un = BnAn2/(& + 1) € H(B) and Sn = K\/Ki
€ B, we see
by comparing with Theorem 12.3.1 that we have solved an nth-order lossless inverse scattering problem with SL = —Sn, F(z) = l/Kn(z,0), Q = Qn, and R*(z) = l/K*(z,0). This solution corresponds to considering the medium as consisting of n layers. (See Figure 12.13.) Each layer is described by an elementary chain scattering matrix 0^. The "unexplained" part of the medium is deferred to the load Sn. This corresponds to a parameterization of all Schur functions that match So in all the interpolation points a\,..., an. In our notation this is given by l l *n
B
_
r („)-,
where $\Theta_n = \theta_n\theta_{n-1}\cdots\theta_1$ is a factorization of $\Theta_n$ in elementary matrices. The parallel with the prediction problem of the previous section should now also be obvious. If we neglect the unexplained part completely, that is, if we set $S_n = 0$, then the scattering function $S_0$ is approximated by
$$\Gamma_n(z) = \frac{K_n(z,0) - L_n(z,0)}{K_n(z,0) + L_n(z,0)}.$$
Because we know that $L_n(z,0)/K_n(z,0)$ interpolates the input impedance $\Omega$ in the points $0, \alpha_1, \ldots, \alpha_n$ (Theorem 6.3.3), it follows that $\Gamma_n$ interpolates the scattering function $S_0$ at the same points. This solution (i.e., the one obtained by setting $S_n = 0$) can be interpreted as a maximal entropy approximant. Recall that the entropy integral for any positive $F \in L_1$ is given by
$$\int \log F(t)\,d\lambda(t).$$
Theorem 12.3.2. Let $\mu$ be the Riesz-Herglotz measure for $\Omega = (1 - S_0)/(1 + S_0)$ and let $\Theta_n$ be the matrix that is constructed by $n$ steps of the Nevanlinna-Pick algorithm applied with $w = 0$ (we assume $S_0(0) = 0$). Then $\Omega_n(z) = L_n(z,0)/K_n(z,0)$ and the outer spectral factor of the Riesz-Herglotz measure for $\Omega_n$ is equal to $\sigma_n(z) = 1/K_n(z,0)$. Moreover,
$$\int \log\mu'(t)\,d\lambda(t) \le \int \log|\sigma_n(t)|^2\,d\lambda(t)$$
with equality if $S_L = 0$, that is, $\mu' = |\sigma_n|^2$.
Proof. Since [116, p. 149]
$$\exp\left\{\int \log\mu'(t)\,d\lambda(t)\right\} = \inf\{\|f\|_\mu^2 : f \in H_2,\ f(0) = 1\},$$
we shall not decrease the infimum if we replace $H_2$ by the subspace $\mathcal{L}_n$. Thus
$$\exp\left\{\int \log\mu'(t)\,d\lambda(t)\right\} \le \inf\{\|f\|_\mu^2 : f \in \mathcal{L}_n,\ f(0) = 1\},$$
which proves the result. □
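The Nevanlinna-Pick recursion referred to in this section can be sketched as repeated Schur steps at freely chosen points of $\mathbb{D}$. The code below is an illustrative assumption, not the book's formulation: it uses the standard step $S_{k+1} = \zeta_k^{-1}(S_k - \rho_k)/(1 - \overline{\rho}_k S_k)$ with $\rho_k = S_k(\alpha_k)$ and $\zeta_k(z) = (z - \alpha_k)/(1 - \overline{\alpha}_k z)$, omitting any unimodular normalization constants the text may attach to $\zeta_k$. The test function is a lossless scattering function (a Blaschke product of degree 3) with hypothetical zeros, for which three steps should leave a unimodular constant.

```python
import numpy as np

def blaschke_product(zeros, c=1.0):
    """Return S(z) = c * prod (z - a)/(1 - conj(a) z), an inner function if |c| = 1."""
    def S(z):
        z = np.asarray(z, dtype=complex)
        out = c * np.ones_like(z)
        for a in zeros:
            out = out * (z - a) / (1 - np.conj(a) * z)
        return out
    return S

def nevanlinna_pick_step(S, alpha):
    """One Schur step at the point alpha: returns rho = S(alpha) and the next function.

    The returned function has a removable singularity at z = alpha, so it should
    not be evaluated exactly there in this simple sketch.
    """
    rho = complex(S(alpha))
    def S_next(z):
        z = np.asarray(z, dtype=complex)
        zeta = (z - alpha) / (1 - np.conj(alpha) * z)
        Sz = S(z)
        return (Sz - rho) / (1 - np.conj(rho) * Sz) / zeta
    return rho, S_next

# Lossless scattering function of degree 3 (illustrative zeros and constant).
S = blaschke_product([0.3, -0.5 + 0.2j, 0.1j], c=np.exp(0.7j))
alphas = [0.2, -0.4j, 0.6]            # distinct interpolation points in the open disk

rhos = []
for alpha in alphas:
    rho, S = nevanlinna_pick_step(S, alpha)
    rhos.append(rho)

test_points = 0.5 * np.exp(1j * np.array([0.3, 1.7, 4.0]))
remainder = S(test_points)            # should be one unimodular constant
print(np.abs(rhos), remainder)
```

Each parameter has modulus less than 1, and after as many steps as the degree the remaining load is a unimodular constant, which is a small-scale instance of the exact fit after a finite number of steps discussed in this section.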
Thus setting $S_L = 0$ corresponds to picking from all solutions of the $n$-point Nevanlinna-Pick interpolation problem the one with maximal entropy. Because of the factor $\zeta_k$ in $\theta_k$, each section will add a transmission zero $1/\overline{\alpha}_k$ to the scattering medium. A transmission zero $e^{i\omega}$ would correspond to the fact that the frequency $\omega$ is completely absorbed. A transmission zero $re^{i\omega}$, with $r$ close to 1, causes a significant reduction in the amplitudes that correspond to frequencies in the neighborhood of $\omega$. Thus if we have good estimates of these transmission zeros, one might expect $\Gamma_n$ to be a good approximation for $S_0$ for rather small values of $n$. In fact, if $S_0$ is rational of degree $n$ and if the transmission zeros are estimated exactly, then we should have a perfect fit after $n$ steps of the Nevanlinna-Pick algorithm. Most inverse problems are notoriously ill conditioned. It is no different for the lossless inverse scattering problem. It is very difficult to recover the transmission zeros $1/\overline{\alpha}_k$ from $S_0$. The optimal choice for the $\alpha_k$ would correspond to letting the $\alpha_k$ vary freely, giving an approximation error that depends on these $\alpha_k$. In the terminology of Section 12.1 we could use the prediction error $E_n$ as a measure of how well the spectral measure is approximated, hence also as a measure of how well its Riesz-Herglotz transform and the associated scattering function are approximated. Therefore, we have to minimize this prediction error $E_n$ with respect to the $\alpha_k$ if we want optimal locations of the transmission zeros. In
Ref. [28] some numerical examples are given that reveal that plots of $E_n$ as a function of the $\alpha_k$ show a very flat behavior near the minimum. This means that this is an ill-posed problem, which means that the $\alpha_k$ cannot be pinned down with great accuracy when working in finite-precision arithmetic. However, by the same observation, this also means that the location of the points $\alpha_k$ (at least when moved in certain directions) does not influence the value of the approximation error $E_n$ by large amounts. Thus if we are only interested in finding a good model giving a small approximation error, then it is of no crucial importance to find the location of these $\alpha_k$ exactly. Rough estimates will do for approximating purposes.

So far, we have only considered transmission zeros that were chosen in $\mathbb{E}$, that is, $\alpha_k$ that were chosen inside $\mathbb{D}$. Each $\alpha_k$ gave rise to an elementary section in the cascade, which is described by a $J$-lossless chain scattering matrix $\theta_k$. In Ref. [62], such an elementary section is called a Schur section in the cascade or, equivalently, $\theta_k$ is called a Schur factor of $\Theta$. It is also possible to extract factors from $\Theta$ that have zeros on $\mathbb{T}$. Such sections are called Brune sections, referring to work of Brune related to network theory synthesis [25]. However, a Brune factor with transmission zero $\alpha \in \mathbb{T}$ is only possible if $\alpha$ is a point of local losslessness (PLL). Recalling the relation among the chain scattering matrix $\Theta$, the scattering function $S_0$, the input impedance $\Omega$, and its Riesz-Herglotz measure $\mu$, one can define a PLL as a point where either $\mu(\{\alpha\}) > 0$ (there is a point mass in $\alpha$) or $\mu(\{\alpha\}) = 0$, but then such that
$$\int \frac{d\mu(t)}{|t-\alpha|^2} < \infty.$$
It can be shown [62, Theorem 2.3] that $\alpha \in \mathbb{T}$ is a PLL iff $\lim_{r\uparrow 1} S_0(r\alpha) = c \in \mathbb{T}$ and
$$\lim_{r\uparrow 1}\frac{1 - \overline{c}S_0(r\alpha)}{1-r} < \infty.$$
If one wants to extract several Brune sections at the same point a e T, then one needs PLLs of higher order, which basically means the following. The point a G T will be a PLL of order k if either =
00
or
while f-^rr^oo J Il2^1)
and
/ £™ = oo,
J
\t-a\2k
where $\mu_\alpha$ is the measure $\mu$ with the mass point at $\alpha$ deleted (for more details see Ref. [62, pp. 649-650]). Of course such conditions were also required for the study of the boundary situation in Chapter 11. However, our study was based on a three-term recurrence relation, whereas the form of the Brune factors $\theta_k$ in Ref. [62] is based on a coupled recurrence. Therefore the link is not immediate and we shall not elaborate on it here.
12.4. Network synthesis
The problem of Darlington synthesis of a passive lossless network is mathematically the same as the problem of lossless inverse scattering. In the previous section, we already used some terminology referring to that. We introduce this terminology here in a more systematic way. We essentially used the book by Belevitch [23]. An electrical network consists of a finite number of interconnected elements. Such elements involve, for example, resistances, capacitances, inductances, current or voltage generators, etc. Such a network can have several terminals to which other subnetworks can be connected. A network is called an $n$-port if it has $2n$ terminals that are paired in couples. Each couple is called a port and it is characterized by the port variables, which are, for example, a voltage and a current over that port. Figure 12.14 shows a 2-port. Note that the name variables is misleading since voltage and current are actually functions of time, which are related through a system of ordinary differential equations that describe the electrical properties of the composing elements. After taking the Laplace transform, the derivative with respect to time is transformed into a multiplication with the complex variable $z$. We shall only work in the Laplace transform domain, that is, the frequency domain. Thus voltages and currents will be represented by functions of a complex frequency variable $z$. Suppose $V_i$ and $I_i$ are the voltage and current for port $i$.
The complex power dissipated by the $n$-port is defined as $W = \sum_{i=1}^{n}\overline{I}_iV_i$, where the bar represents the complex conjugate. If we define the vectors $V = [V_1\ V_2\ \cdots\ V_n]^T$ and $I = [I_1\ I_2\ \cdots\ I_n]^T$, then the complex power is $W = I^HV$, where the superscript
Figure 12.14. A 2-port.
means complex conjugate transpose. If the $n$-port does not contain internal generators, then the relation between the vectors $V$ and $I$ can be described in homogeneous form by
$$V = RI \quad\text{or}\quad I = AV. \qquad (12.9)$$
The $n \times n$ matrices $R$ and $A$ are complex rational matrix functions. $R$ is called the impedance matrix of the network and $A$ is called the admittance matrix. An $n$-port is called passive if the active power (that is, the real part of $W$) is nonnegative in the right half plane: $\mathrm{Re}\,W(z) \ge 0$ for all $\mathrm{Re}\,z \ge 0$. If, moreover, it satisfies $\mathrm{Re}\,W(z) = 0$ for $\mathrm{Re}\,z = 0$ then it is said to be lossless. Consequently, if (12.9) holds, then $\mathrm{Re}\,W = I^H(\mathrm{Re}\,R)I = V^H(\mathrm{Re}\,A)V$ so that the impedance and admittance matrices of a passive $n$-port have a real part that is nonnegative definite in the right half plane. A matrix satisfying $\mathrm{Re}\,R(z) \ge 0$ in $\mathrm{Re}\,z \ge 0$ is called a passive matrix. Similarly, for a lossless $n$-port, one has $\mathrm{Re}\,R(z) = 0$ for $\mathrm{Re}\,z = 0$ and such a matrix is called lossless.

If a 1-port contains internal (current or voltage) generators, then the relation between voltage $V$ and current $I$ is inhomogeneous: $V = RI + E$ or $I = AV + J$, where $E$ and $J$ are some combinations of internal variables. This means that it can be represented by a voltage generator in series with an impedance $R$ (Thévenin's theorem) or as a current generator in parallel with an admittance $A$ (Norton's theorem). We find $V = E$ when $I = 0$; thus $E$ is the open-circuit voltage of the port. Similarly, $J$ will be the short-circuit current of the port. Thus if all internal generators are put to zero (voltage generators replaced by short circuits and current generators by open circuits), then $V = RI$ and $I = AV$, so that we have the situation as above (12.9). The impedance $R = V/I$ is called the internal impedance of the 1-port.

A voltage generator can be seen as a 1-port that produces a voltage $V$ and has some (internal) impedance $\Omega$. If it is loaded with an impedance $\Omega_L$, then there will be a current given by
$$I = \frac{V}{\Omega + \Omega_L}.$$
(See Figure 12.15.) If one makes the load $\Omega_L$ equal to $\Omega$, then the total power dissipated in the load will be maximal [23, p. 159]. The current obtained for this matched load is
$$I_0 = \frac{V}{2\Omega}$$
and the relative difference for the currents with open circuit (load 0) and with load
$$s = \frac{\Omega_L - \Omega}{\Omega_L + \Omega}$$
is called the reflectance of $\Omega_L$ relative to $\Omega$. Note that if we match the load impedance $\Omega_L$ with the internal impedance $\Omega$ (i.e., $\Omega_L = \Omega$) then $s = 0$.
Figure 12.15. Generator with load.
Choosing the normalized variables $i = I\sqrt{\Omega}$, $v = V/\sqrt{\Omega}$, and $\omega = \Omega_L/\Omega$, this becomes
$$s = \frac{\omega - 1}{\omega + 1} \quad\text{or}\quad \omega = \frac{1+s}{1-s}.$$
Note $\omega = v/i$ because $\Omega_L = V/I$. These relations imply
$$\mathrm{Re}\,\omega = \frac{1-|s|^2}{|1-s|^2}.$$
This shows that $\Omega_L$, and hence $\omega$, being passive is equivalent to $|s(z)| \le 1$ for all $\mathrm{Re}\,z \ge 0$, and if $\omega$ is lossless, then $|s(z)| = 1$ for $\mathrm{Re}\,z = 0$. Replacing $\omega$ by $v/i$ gives
$$s = \frac{v - i}{v + i}.$$
By another change of variables
$$x = \frac{v + i}{2} \quad\text{and}\quad y = \frac{v - i}{2}$$
one obtains the simple relation $y = sx$. The function $x$ is called the incoming wave and $y$ is the outgoing wave. For an $n$-port, one can make the same change of variables for each port and thus obtain new wave variables $(x_i, y_i)$ for port $i$. Defining the vectors
$X = [x_1\ x_2\ \cdots\ x_n]^T = (V + I)/2$ and $Y = [y_1\ y_2\ \cdots\ y_n]^T = (V - I)/2$, one obtains a relation of the form $Y = SX$, which replaces (12.9). The matrix function $S$ is called the scattering matrix of the $n$-port. It is related to the (internal) impedance matrix by
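\[ S = (R - I_n)(R + I_n)^{-1} \quad\text{or}\quad R = (I_n + S)(I_n - S)^{-1}, \]
where $I_n$ is the $n \times n$ unit matrix. For the 2-port of Figure 12.16, we have in terms of the wave variables
\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = S \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}. \]

Figure 12.16. A 2-port with wave variables.

Note that for the dissipated power, we have, with $W = I^H V = i^H v = (X - Y)^H(X + Y)$,
\[ \operatorname{Re} W = X^H X - Y^H Y = X^H(I_n - S^H S)X. \]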
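The scalar relations between the normalized impedance $\omega$ and the reflectance $s$ can be checked numerically. The following is a minimal sketch, not part of the original text; the function names are hypothetical and only the Python standard library is used. It evaluates $s = (\omega - 1)/(\omega + 1)$, its inverse, and the identity $\operatorname{Re}\omega = (1 - |s|^2)/|1 - s|^2$.

```python
def reflectance(omega: complex) -> complex:
    """Reflectance s of a normalized load impedance omega = Omega_L / Omega."""
    return (omega - 1) / (omega + 1)


def normalized_impedance(s: complex) -> complex:
    """Inverse map omega = (1 + s) / (1 - s); requires s != 1."""
    return (1 + s) / (1 - s)


def real_part_from_reflectance(s: complex) -> float:
    """Re(omega) expressed through s: (1 - |s|^2) / |1 - s|^2."""
    return (1 - abs(s) ** 2) / abs(1 - s) ** 2


if __name__ == "__main__":
    # A matched load, two passive loads (Re omega > 0) and a lossless one (Re omega = 0).
    for omega in (1 + 0j, 2 + 1j, 0.3 - 4j, 3j):
        s = reflectance(omega)
        print(omega, s, abs(s), real_part_from_reflectance(s))
```

For a load with positive real part the printed modulus of $s$ stays below 1, for a purely imaginary load it equals 1 up to rounding, and the matched load $\omega = 1$ gives $s = 0$.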
Thus for a passive $n$-port $S^H S \le I_n$ on $\operatorname{Re} z > 0$, and for a lossless $n$-port, it is moreover true that $S^H S = I_n$ on $\operatorname{Re} z = 0$. Now, modulo some minor adaptations and a change of notation, we are back in the situation of the previous section. There we had the special case of $n = 2$ and the scattering matrix was contractive in $\mathbb{D}$ instead of the right half plane, but the latter is easily dealt with by a bilinear transformation of the variable that maps the right half plane to the disk. Or one can rotate the right half plane to the upper half plane and then one has the theory of orthogonal rational functions on the real line. If we had sampled the electrical signals and used the z-transform instead of the Laplace transform, we would have obtained the disk situation directly. By the same observations as made in the previous section, it is much more interesting for us to describe the cascade connection of two ports not by scattering matrices, but by chain scattering matrices.

Suppose we have a lossless 2-port where port 2 is replaced by an open circuit. Then as seen from port 1, we obtain a 1-port circuit with some impedance $\Omega$. If this 2-port is loaded at port 2 by a passive 1-port with impedance $\Omega_L$, then there will be a reflectance $S_L$ (= scattering function) of $\Omega_L$ with respect to $\Omega$: $S_L = (\Omega_L - \Omega)/(\Omega_L + \Omega)$. This $S_L$ will be zero when we match the load impedance $\Omega_L$ with the internal impedance $\Omega$ of the 2-port (i.e., if we set $\Omega_L = \Omega$). This explains why the function $\Omega$ in the lossless inverse scattering framework is called the input impedance with matched load.

In the realization theory of these networks, the problem is to realize a certain complex network, which is considered as being a passive 1-port with possibly a high internal complexity. This means that the scattering matrix, or equivalently the chain scattering matrix $\Theta$, is a rational function of a relatively high degree. The realization will be obtained as a cascade connection of simple sections represented by 2-ports, just like in the lossless inverse scattering framework. Mathematically this means that the rational matrix function $\Theta$ is factored as a product of elementary matrices. Elementary usually means of degree 1, but for practical reasons it is sometimes more interesting to put together two "complex conjugate" sections of degree one and combine them into a real section of degree two. Since in practical situations the given scattering function is often rational, we know, at least in principle, the transmission zeros and we should be able to apply the machinery of the factorization and the orthogonal rational functions, which should give an exact realization after a finite number of steps. Thus certainly in such situations the rational, rather than the polynomial, approach is highly recommended.

12.5. $H_\infty$ problems

12.5.1. The standard $H_\infty$ control problem

Several problems considered in $H_\infty$ control are tightly connected to inverse scattering. Again the machinery of J-unitary matrices and Nevanlinna-Pick interpolation can be set to work and that is the only point we want to make here. The reader who is interested in more details should consult the extensive literature on $H_\infty$ control. A simple approach, which is close to ours, is given, for example, in Ref. [128]. See also Ref. [22, Part VI]. Suppose we consider a discrete time system; otherwise one should replace the unit circle by the real or the imaginary axis (depending on the formalism one wants to use). Let us redraw the picture given in Figure 12.11 as in Figure 12.17.
Figure 12.17. Plant with feedback.
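This makes it easier to see this as a plant with input $U_0$ and output $V_0$ with some feedback loop, which contains a controller characterized by $S_L$. Suppose the plant is described by (12.3) and (12.4), or by the chain scattering equivalent as in (12.6). Then the closed loop transfer function of the plant is given by (12.5). Thus
\[ S_0 = \frac{V_0}{U_0} = \Sigma_{21} + \Sigma_{22}S_L(1 - \Sigma_{12}S_L)^{-1}\Sigma_{11} \quad\text{or}\quad S_0 = -(\Theta_{22} - S_L\Theta_{12})^{-1}(\Theta_{21} - S_L\Theta_{11}). \qquad (12.10) \]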
The $H_\infty$ control problem is to find a controller $S_L$ such that the closed loop transfer function $S_0$ (from $U_0$ to $V_0$) satisfies $\|S_0\|_\infty < \gamma$ for some given positive number $\gamma > 0$. Of course, for physical reasons, one expects the system to be stable, which means in mathematical terms that $S_0$ is analytic in $\mathbb{D}$, and thus the norm $\|S_0\|_\infty$ that we used above is the norm in $H_\infty(\mathbb{D})$. This means that $\gamma^{-1}S_0$ should be a Schur function. This problem can be given the following interpretation. We can consider $V_0$ as some error that can be observed. $U_0$ is some exogenous input and $U_L$ is an observed output of the system. The controller $S_L$ uses these output observations to steer a control input $V_L$ such that the effect of $U_0$ on the error signal $V_0$ is brought below a certain tolerance $\gamma$. Alternatively, one could consider $U_0$ as some noise that influences the output $V_0$ of the system, and a controller has to be designed to bring this influence below a certain level.

The problem of inverse scattering is indeed the inverse of this problem in the sense that, there, the scattering function $S_0$ is observed and one aims at constructing a model for the scattering medium, that is, one wants to compute $\Sigma$ or $\Theta$ and $S_L$. Here, one knows in principle the plant as $\Sigma$ or $\Theta$ and one wishes to construct a controller $S_L$ such that the closed loop transfer function $S_0$ is a (scaled) Schur function. Let us assume for simplicity that the problem is scaled such that we can take $\gamma = 1$. Thus $S_0$ should be a genuine Schur function.

There is, however, an important constraint imposed by the physical realizability of the system. Namely, the closed loop system should be internally stable. This means that none of the signals that are generated somewhere internally in the system may become unbounded, whatever the input signals may be, provided of course that these input signals are bounded. This is physically obvious because otherwise the device could "explode" while operating with finite-energy input signals. It may well be that the system transfer function is stable, but that it is realized such that it is not internally stable. For example, if the transfer function is $1/(z - 5)$ but this is implemented as a cascade of a transfer function $1/(1 - 2z)$ and a transfer function $(1 - 2z)/(z - 5)$, then it is not internally stable: the signal after the first transfer function is internally generated and it can become unbounded because the first transfer function has a pole at $z = 1/2 \in \mathbb{D}$ and is therefore not stable (not in $H_\infty$). Thus internal stability refers to a specific (state space) realization of the system, but we do not want to go into the details of state space realizations in this context.

There are several other problems in $H_\infty$ control, including sensitivity minimization and robust stabilization, that can be reduced to this standard problem or variations thereof. See, for example, Refs. [19, Chap. 5], [84, Chap. 12], [128], [22, Part VI], [85], and [86] for much more on $H_\infty$ control and interpolation. Because the internal variable $U_L$ of the closed loop system is
\[ U_L = (1 - \Sigma_{12}S_L)^{-1}\Sigma_{11}U_0 \]
and because of internal stability, it follows that $(1 - \Sigma_{12}S_L)^{-1}\Sigma_{11}$ should be stable. Furthermore, for internal stability reasons, the internal variable $V_L$ should be bounded, and because $V_L = S_LU_L = S_L(1 - \Sigma_{12}S_L)^{-1}\Sigma_{11}U_0$, this means that $S_L$ and $S_L(1 - \Sigma_{12}S_L)^{-1}\Sigma_{11}$ should be stable. Now if $\alpha_i$ is an unstable zero of $\Sigma_{22}$ (i.e., $\Sigma_{22}(\alpha_i) = 0$ with $\alpha_i \in \mathbb{D}$), then it can not be compensated by a pole of $S_L(1 - \Sigma_{12}S_L)^{-1}\Sigma_{11}$, since this is stable and thus has only stable poles. Thus, if we fill in $\alpha_i$, we get
\[ S_0(\alpha_i) = \Sigma_{21}(\alpha_i), \qquad (12.11) \]
which should hold for all unstable zeros of $\Sigma_{22}$. Consequently, internal stability conditions impose Nevanlinna-Pick interpolation constraints. In fact, it will be shown below that, if a solution exists, then the solution for this Nevanlinna-Pick problem actually solves the control problem.

As in the inverse scattering problem, there is a considerable advantage in using the chain scattering matrix $\Theta$ instead of the scattering matrix $\Sigma$. These are related by (12.7), that is, $\Theta = (P\Sigma + P^\perp)(P^\perp\Sigma + P)^{-1}$ if $\Sigma_{22}^{-1}$ exists. The terminology "scattering" and "chain scattering" matrix is not exactly correct since there is no reason why the system should be described by a matrix $\Sigma$ that has the properties of a scattering matrix. Moreover, in the multidimensional case, $\Sigma$ is some $m \times n$ matrix that need not even be square. It is possible to generalize the ideas of J-contractive, J-unitary, J-lossless, etc. to nonsquare matrix functions, but since this was not discussed in the main part of this monograph, we shall not go into the details. The control problems where the signals are all scalar are of little practical importance. Usually, all the signals are vector valued and often of different dimensions, but since we only want to illustrate the simplest possible ideas, let us assume that $\Sigma$ and $\Theta$ are $2 \times 2$ matrices, which means that all the inputs and outputs are scalar functions. Moreover, in all practical computations, they are assumed to be rational functions. Assuming that we use the "chain scattering matrix" $\Theta$ to describe the system, then
\[ \begin{bmatrix} U_L \\ V_L \end{bmatrix} = \Theta \begin{bmatrix} U_0 \\ V_0 \end{bmatrix}, \]
which means that (12.10) holds, or equivalently,
\[ S_L = (\Theta_{21} + \Theta_{22}S_0)(\Theta_{11} + \Theta_{12}S_0)^{-1}. \]
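If the controller is also described in the same way,
\[ S' = (\Theta'_{21} + \Theta'_{22}S_L)(\Theta'_{11} + \Theta'_{12}S_L)^{-1}, \]
that is,
\[ \begin{bmatrix} U' \\ V' \end{bmatrix} = \Theta' \begin{bmatrix} U_L \\ V_L \end{bmatrix}, \qquad S' = \frac{V'}{U'}, \]
then
\[ S' = (\Theta''_{21} + \Theta''_{22}S_0)(\Theta''_{11} + \Theta''_{12}S_0)^{-1}, \qquad \Theta'' = \Theta'\Theta. \]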
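That cascading two sections amounts to multiplying their chain scattering matrices can be verified numerically in the scalar case. The sketch below is an illustration only: the helper names `lft` and `matmul2` and the sample matrices are hypothetical, and the linear fractional map is taken in the form $s \mapsto (\Theta_{21} + \Theta_{22}s)(\Theta_{11} + \Theta_{12}s)^{-1}$ used in this section.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def lft(theta, s):
    """Linear fractional map s -> (theta21 + theta22*s) / (theta11 + theta12*s)."""
    return (theta[1][0] + theta[1][1] * s) / (theta[0][0] + theta[0][1] * s)


if __name__ == "__main__":
    # Arbitrary sample data (assumed values, not taken from the text).
    theta_plant = [[2.0, 0.5j], [0.3, 1.0 + 0.2j]]
    theta_ctrl = [[1.5, -0.4], [0.1j, 2.0]]
    s0 = 0.25 - 0.1j

    step_by_step = lft(theta_ctrl, lft(theta_plant, s0))      # S' from S_L from S_0
    in_one_go = lft(matmul2(theta_ctrl, theta_plant), s0)     # S' from Theta'' = Theta' Theta
    print(step_by_step, in_one_go)
```

Both printed values agree up to rounding, which is the scalar content of $\Theta'' = \Theta'\Theta$.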
Thus, if we have realized the system $\Theta$ as a cascade of elementary sections, then adding the controller just means that we have to extend the cascade with a few more sections. This illustrates once more the advantages of working with $\Theta$ instead of $\Sigma$.

Let us reformulate the control problem once again: Given some matrix $\Theta$, we have to find a function $S_L$ such that $S_0$ given by (12.10) is a Schur function and such that the system is internally stable. It is obvious that if $\Theta$ is a J-lossless chain scattering matrix, then $S_0$ will be a Schur function for any choice of a Schur function $S_L$. This is of course a trivial system, which does not need a controller because setting $S_L = 0$ would solve the $H_\infty$ control problem. By letting $S_L$ range over all Schur functions, we get all the possible controllers. However, for an arbitrary plant, there is no reason why $\Theta$ should be a J-lossless chain scattering matrix. In that case, one can hope to replace the system by another system as in Figure 12.18.

Figure 12.18. Lossless factorization.

There we replaced $\Theta$ by $\hat\Theta\Pi$, where $\hat\Theta$ is J-lossless and $\Pi$ is outer. Recall that a matrix is J-lossless if it is (1) all-pass (i.e., J-unitary) on $\mathbb{T}$ and (2) passive (i.e., J-contractive) in $\mathbb{D}$. The matrix function $\Pi$ is outer (in $H_\infty$) when $\Pi$ and its inverse $\Pi^{-1}$ have entries that are analytic functions in $\mathbb{D}$ (it is a matrix version of an outer function). Such a factorization (if it exists) is called a J-lossless factorization of the matrix $\Theta$. It can be seen from the figure that for the new system, where $\hat\Theta$ is loaded (controlled) by $\hat S_L$, we can choose as above $\hat S_L$ to be any Schur function since $\hat\Theta$ is J-lossless. Because $(U_L, V_L)$ are the port values of a port described by $\Pi^{-1}$ and terminated by $\hat S_L$, it is clear that this corresponds to a controller $S_L$ for the original system, which is given by
sL =
n 12 )
with $\hat S_L$ an arbitrary Schur function. Equivalently, because of (12.10), the controller is given by
\[ S_L = (\Theta_{21} + \Theta_{22}S_0)(\Theta_{11} + \Theta_{12}S_0)^{-1}. \]
Thus the problem of $H_\infty$ control is reduced to a problem of J-lossless factorization. Since $\hat\Theta$ is J-lossless, it can not have poles on $\mathbb{T}$. Then it is always possible to factor it as $\hat\Theta = \Theta_a\Theta_s$, with $\Theta_s$ J-lossless with all its poles in $\mathbb{E}$ (it is stable) and $\Theta_a$ J-lossless with all its poles in $\mathbb{D}$ (it is anti stable) [128, Lemma 4.9, p. 89].

So, if a J-lossless factorization of $\Theta$ exists, then it can be written as $\Theta = \Theta_a\Theta_s\Pi$. This implies (recall from Section 1.5 that, since $\Theta_a$ is J-lossless, $\Theta_a^{-1} = J\Theta_{a*}J$)
\[ \Theta_* J\Theta_a = \Pi_*\Theta_{s*}J. \]
Since $\Pi$ is outer, and hence stable, it follows that $\Pi_*$ is anti stable. Also, $\Theta_{s*}$ is anti stable. Thus the right-hand side is anti stable (i.e., has all its poles in $\mathbb{D}$), and therefore the left-hand side should also be anti stable. Thus one has to find first a J-lossless matrix $\Theta_a$ (with all its poles in $\mathbb{D}$) such that $G_* = \Theta_* J\Theta_a$ is anti stable. If we can then find a J-lossless factorization for the stable matrix $JG$ of the form $JG = \Theta_s\Pi$, then the problem has been solved because we then have $\Theta = \Theta_a\Theta_s\Pi$. Thus there remain two problems to be solved:

1. Given a rational matrix function $H$, find a J-lossless matrix $\Theta_a$ that makes $H\Theta_a$ anti stable.
2. Given a rational and stable matrix function $G$, find a J-lossless factorization $G = \Theta_s\Pi$.

The first problem is a problem of J-lossless anti stabilizing conjugation. One defines this problem as follows. Given a rational matrix $H$, then one says that $\Theta$ is a J-lossless stabilizing (anti stabilizing) conjugator for $H$ if $\Theta$ is J-lossless and $H\Theta$ is stable (anti stable) and if the degree of $\Theta$ is equal to the number of anti stable (stable) poles of $H$. The last condition expresses some minimal degree condition for the conjugator. The $\Theta$-matrix cancels the anti stable (stable) poles $\alpha_i$ and replaces them by their reflections $1/\bar\alpha_i$. The zeros and the stable (unstable) poles are left untouched. In the scalar case, this is a trivial matter. For example, the lossless stabilizer of
\[ G(z) = \frac{1}{(1 - 2z)(z - 5)} \]
is given by $\Theta(z) = (1 - 2z)/(z - 2)$ (a Blaschke factor, and thus lossless) because
\[ G(z)\Theta(z) = \frac{1}{(z - 2)(z - 5)} \]
is stable.

We first note that problem 2 above can be solved by a J-lossless stabilizing conjugation. Indeed, suppose $G$ is stable and let $\Theta$ be a J-lossless stabilizing conjugator for $G^{-1}$. Then $G^{-1}\Theta = G'$ with $\Theta$ J-lossless and with $G'$ stable. Because $G$ is stable, $G^{-1}$ has no anti stable zeros, and because J-lossless conjugation keeps all the zeros, $G'$ can have no anti stable zeros. Thus $G'$ is outer. Setting $\Pi^{-1} = G'$, we find that $G = \Theta\Pi$ and this is a J-lossless factorization of $G$.

Thus the only problem left to be solved is the problem of constructing a stabilizing (anti stabilizing) J-lossless conjugator. Such a construction is obtained step by step, where each elementary step eliminates an unstable pole by multiplying with an elementary J-lossless matrix $\Theta_k$ as in the Nevanlinna-Pick algorithm. It is in fact equivalent with a Nevanlinna-Pick problem. In practical applications, where the signals are vector valued, the algorithm is often performed on a state space realization of the system. The details can, for example, be found in Ref. [128]. We prefer, however, to reformulate the problem as a Nehari problem for which we shall give a solution below.

In the Nehari problem, one wants to do better than in the $H_\infty$ control problem formulated so far. One wants to find a controller that is optimal in a certain sense. So, instead of just constructing a controller that arranges for $\|S_0\|_\infty < \gamma$, one wants to go further and find the best controller, that is, one that minimizes $\|S_0\|_\infty$. If such an optimal controller can be found for which $\|S_0\|_\infty < \gamma$, then the $H_\infty$ problem as we defined it here has a solution; otherwise, there is no solution. This problem of optimal control is a so-called minimal norm problem. An alternative would be to solve the minimal degree problem, which will construct a (rational) controller that makes $\|S_0\|_\infty < \gamma$, but among all solutions finds the simplest possible one, that is, the rational function of lowest possible degree. Nehari problems are usually formulated in terms of Hankel operators. Therefore, we shall give an introduction first.
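The scalar conjugation example given earlier in this subsection lends itself to a quick numerical check. The following sketch is only an illustration under the disk convention used here (a pole is stable when it lies outside the closed unit disk); the function names are hypothetical. It samples the Blaschke factor $\Theta(z) = (1 - 2z)/(z - 2)$ on the unit circle, confirms that it has modulus one there, and compares the pole locations before and after conjugation.

```python
import cmath
import math


def theta(z: complex) -> complex:
    """Blaschke factor with zero 1/2 and pole 2."""
    return (1 - 2 * z) / (z - 2)


def g(z: complex) -> complex:
    """Scalar example with an anti stable pole at 1/2 and a stable pole at 5."""
    return 1 / ((1 - 2 * z) * (z - 5))


def conjugated(z: complex) -> complex:
    """Expected product g(z) * theta(z) = 1 / ((z - 2)(z - 5))."""
    return 1 / ((z - 2) * (z - 5))


def is_stable(poles) -> bool:
    """Disk convention: stable means every pole lies outside the closed unit disk."""
    return all(abs(p) > 1 for p in poles)


if __name__ == "__main__":
    # |theta| = 1 on the unit circle (all-pass) and |theta| < 1 inside (contractive).
    samples = [cmath.exp(2j * math.pi * k / 16) for k in range(16)]
    print(max(abs(abs(theta(t)) - 1) for t in samples))
    print(abs(theta(0.3 + 0.2j)))

    # The pole 1/2 of g is replaced by its reflection 2; the pole 5 is untouched.
    print(is_stable([0.5, 5]), is_stable([2, 5]))
    z = 0.3 + 0.2j
    print(abs(g(z) * theta(z) - conjugated(z)))
```

The same check applies to the internal stability example above: the cascade factor $1/(1 - 2z)$ has its pole at $1/2$, so `is_stable([0.5])` is false even though the overall transfer function $1/(z - 5)$ is stable.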
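12.5.2. Hankel operators

For an elementary introduction to Hankel operators see Ref. [175]. See also Ref. [182] for a more advanced text. Let $\{x_k\}_{k=-\infty}^{\infty}$ be an input signal in $\ell_2$ and suppose we apply this to a linear system with impulse response $\{h_k\}_{k=-\infty}^{\infty} \in \ell_\infty$ to give an output signal $\{y_k\}_{k=-\infty}^{\infty} \in \ell_2$. Then we can describe this as
\[
\begin{bmatrix} \vdots \\ y_{-1} \\ y_0 \\ y_1 \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
\ddots & \vdots & \vdots & \vdots & \\
\cdots & h_0 & h_{-1} & h_{-2} & \cdots \\
\cdots & h_1 & h_0 & h_{-1} & \cdots \\
\cdots & h_2 & h_1 & h_0 & \cdots \\
 & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\begin{bmatrix} \vdots \\ x_{-1} \\ x_0 \\ x_1 \\ \vdots \end{bmatrix}.
\]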
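For causal systems, it will only be the past of the input that will define the future of the output. Thus the relevant part of the system will be the operator that maps the past of $x_0$ into the future of $y_0$. This operator can be described as
\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
h_0 & h_1 & h_2 & \cdots \\
h_1 & h_2 & h_3 & \cdots \\
h_2 & h_3 & h_4 & \cdots \\
\vdots & \vdots & \vdots &
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_{-1} \\ x_{-2} \\ \vdots \end{bmatrix}.
\]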
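A finite section of this Hankel map is easy to write down explicitly. The sketch below is a minimal illustration with hypothetical function names: it builds the $n \times n$ matrix $[h_{i+j}]$ from the first $2n - 1$ impulse response coefficients and applies it to a truncated past input $(x_0, x_{-1}, \ldots, x_{-n+1})$.

```python
def hankel_section(h, n):
    """Leading n x n section [h[i + j]] of the Hankel matrix; needs len(h) >= 2n - 1."""
    if len(h) < 2 * n - 1:
        raise ValueError("need at least 2n - 1 impulse response coefficients")
    return [[h[i + j] for j in range(n)] for i in range(n)]


def future_output(h, past_input):
    """Map past_input = [x_0, x_-1, x_-2, ...] to [y_0, y_1, y_2, ...] (truncated)."""
    n = len(past_input)
    section = hankel_section(h, n)
    return [sum(section[i][j] * past_input[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    # Assumed sample impulse response h_k = 2^(-k), k = 0, 1, ..., 4.
    h = [1.0, 0.5, 0.25, 0.125, 0.0625]
    print(future_output(h, [1.0, 0.0, 0.0]))  # a unit pulse at time 0
    print(future_output(h, [0.0, 1.0, 0.0]))  # a unit pulse at time -1
```

A pulse applied one step earlier produces the same future output shifted by one position, which is exactly the constant-antidiagonal structure of the matrix.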
To rewrite this in the frequency domain, we define some operators for the frequency domain. Let $R$ be the reversion operator
\[ R : L_2 \to L_2 : f(z) \mapsto f(1/z). \]
For given $h \in L_\infty$, let $M_h$ be the (bounded) multiplication operator
\[ M_h : L_2 \to L_2 : f(z) \mapsto h(z)f(z). \]
We note incidentally that
Furthermore, let $\Pi_+$ be the projection in $L_2$ onto $H_2$. Then a Hankel operator with symbol $h \in L_\infty$ is an operator on $H_2$ defined as $H_h = \Pi_+ M_h R|_{H_2}$, that is,
\[ H_h : H_2 \to H_2 : f(z) \mapsto \Pi_+ M_h R|_{H_2} f(z) = \Pi_+[h(z)f(1/z)], \]
where $M|_X$ means the restriction of operator $M$ to the space $X$. Now we can transform our time domain input-output relation $(x_{-k})_0^\infty \mapsto (y_k)_0^\infty$ to the frequency domain. Therefore we define the z-transforms $x(z) = \sum_{k=0}^\infty x_{-k}z^k \in H_2$ (note the minus sign in the index), $y(z) = \sum_{k=0}^\infty y_k z^k \in H_2$, and the transfer function $h(t) = \sum_{k=-\infty}^\infty h_k t^k \in L_\infty$, $t \in \mathbb{T}$. Thus
\[ x_k = \int t^k x(t)\,d\lambda(t), \qquad y_k = \int t^{-k}y(t)\,d\lambda(t), \qquad k \in \mathbb{Z}, \]
where $x_k = y_{-k} = 0$ for $k = 1, 2, \ldots$, and
\[ h_k = \int t^{-k}h(t)\,d\lambda(t), \qquad k \in \mathbb{Z}. \]
Then the input-output relation simply reads $y(z) = H_h x(z)$. Obviously, with respect to the standard basis $\{1, z, z^2, \ldots\}$, the operator $H_h$ is represented by
the Hankel matrix $H_h = [h_{i+j}]_{i,j=0,1,2,\ldots}$. Note that we use the notation $H_h$ for the operator in $H_2$ as well as for the corresponding matrix representation. Given $h \in L_\infty$, this defines uniquely the Hankel operator $H_h$, but the converse is not true. If $H$ is a Hankel operator and $h$ is a symbol for it, that is, $H = H_h$, then any function $f = h + g$ with $g \in H_\infty^\perp$ arbitrary will also be a symbol for $H$. We used the notation $H_\infty^\perp$ for
\[ H_\infty^\perp = \Bigl\{ f \in L_\infty : \int t^{-k}f(t)\,d\lambda(t) = 0,\ k = 0, 1, \ldots \Bigr\}. \]
Obviously $H_f = H_h$ iff $f - h \in H_\infty^\perp$. The classical Nehari Theorem [151; 175, p. 31] says that it is always possible to find a symbol for a Hankel operator that is optimal in a certain sense.

Theorem 12.5.1 (Nehari). Let $H$ be a Hankel operator. Then there exists a symbol $h \in L_\infty$ such that $H = H_h$ and $\|H\|_2 = \|h\|_\infty$. This $h$ is the solution of the optimization problem (the infimum is $\|H\|_2$)
\[ \inf\{\|h\|_\infty : h \in L_\infty,\ H = H_h\}. \]
Since we can write any symbol for $H = H_h$ as $f = h - h^\perp$ with $h^\perp \in H_\infty^\perp$, it follows that
\[ \|H_h\|_2 = \inf\{\|h - h^\perp\|_\infty : h^\perp \in H_\infty^\perp\}. \]
In other words, $\|H_h\|_2$ is equal to the $L_\infty$-distance of $h \in L_\infty$ to $H_\infty^\perp$.
This optimization problem is also called a Nehari extension problem. That is defined as follows. Given a sequence $\{h_k : k = 0, 1, 2, \ldots\}$ with $\sum_{k=0}^\infty h_k z^k \in H_\infty$, one has to extend it with a sequence $\{h_k : k = -1, -2, \ldots\}$ such that $h(z) = \sum_{k=-\infty}^\infty h_k z^k \in L_\infty$. Then one has to find, among all the solutions of this extension problem, the one that has minimal norm. Sometimes another variant is formulated where the problem is to find among all the solutions of the extension problem some $h$ that satisfies $\|h\|_\infty < 1$. Of course, if one can solve for the optimal $h$, then one can decide whether the second variant has a solution and if a solution exists, it is actually constructed.

Let us now see how the solution of this problem can help to solve the previous $H_\infty$ control problem. First we recall that internal stability required that $f = S_L(1 - \Sigma_{12}S_L)^{-1}\Sigma_{11}$ was stable, and hence in $H_\infty$. Thus if we can find $f \in H_\infty$ that minimizes $\|\Sigma_{21} + \Sigma_{22}f\|_\infty$, then we have solved our problem, because we can easily compute $S_L$ from $f$. Now let us assume for simplicity that our given system is stable and that $\Sigma_{21}$ and $\Sigma_{22}$ are in $H_\infty$. Let $\Sigma_{22} = B_{22}O_{22}$ be an inner-outer factorization of $\Sigma_{22}$. Then because $OH_\infty = H_\infty$ for any outer function $O$, and because $\|Bf\|_\infty = \|f\|_\infty$ for any inner function $B$, it follows that our problem can be further reduced to finding the infimum of $\|h - f\|_\infty$, where $f$ ranges over $H_\infty$ and $h = -\Sigma_{21}/B_{22} \in L_\infty$ is given. Thus we have to find $\operatorname{dist}_\infty(h, H_\infty)$, and this can of course be formulated as a Nehari optimization problem of the previous type by a transformation $z \to 1/z$. Indeed,
\[ \|h - f\|_\infty = \|\tilde h - \tilde f\|_\infty \quad\text{if}\quad \tilde h(z) = h(1/z)/z \quad\text{and}\quad \tilde f(z) = f(1/z)/z, \]
and $\tilde f$ ranges over $H_\infty^\perp$ when $f$ ranges over $H_\infty$.
For the Nehari optimization problem, Adamyan, Arov, and Krein [6] gave a solution that is much more general, but for the solution of the Nehari problem, it simplifies to the following theorem. We first define an all-pass function $f \in L_\infty$ to be a function for which $|f(t)| = 1$ a.e. on $\mathbb{T}$. Note that an all-pass function is not necessarily inner because it need not be analytic in $\mathbb{D}$.

Theorem 12.5.2 (Adamyan-Arov-Krein). Let $H$ be a Hankel operator with $\gamma = \|H\|_2$. Let $x \in H_2$ be nonzero such that $\|Hx\|_2 = \gamma\|x\|_2$. Set $y = \gamma^{-1}Hx$ and $\tilde x = Rx$, and thus $\tilde x(z) = x(1/z)$, and define $h = \gamma g = \gamma y/\tilde x$. Then $g$ is an all-pass function and $h \in L_\infty$ is the only function that solves the Nehari extension problem. Thus it is the only function for which $H = H_h$ and $\|H\|_2 = \|h\|_\infty$. Equivalently, if $f$ is a symbol for $H$, then the function $h^\perp \in H_\infty^\perp$ that is closest to $f$ in $L_\infty$ is given by $h^\perp = f - h$, with $h$ as constructed above and where the distance $\operatorname{dist}_\infty(f, H_\infty^\perp)$ is equal to $\|h\|_\infty$.

Note that if the operator $H$ is compact, then it has the singular value decomposition
\[ Hf = \sum_{k=1}^\infty s_k\langle f, x_k\rangle y_k, \qquad f, x_k, y_k \in H_2, \qquad Hx_k = s_k y_k, \qquad (12.13) \]
where $\|H\|_2 = s_1 \ge s_2 \ge s_3 \ge \cdots$ are the singular values of $H$ and $(x_k, y_k)$ is a pair of singular vectors (Schmidt pairs) corresponding to $s_k$. The $x$ and $y$ in the previous theorem are given by any Schmidt pair that corresponds to the maximal singular value $\gamma$.

If we return to our $H_\infty$ control problem, then this theorem implies that there can only be a solution to the problem if the Hankel matrix $H$, defined by the symbol $\tilde h$ with $\tilde h(z) = h(1/z)/z$ and $h = -\Sigma_{21}/B_{22}$, has a norm $\|H\|_2 < \gamma$. If not, there is no solution. If the given system $\Sigma$ is rational, then it can be seen that the rank of this Hankel operator is finite and equal to the number of unstable zeros of $\Sigma_{22}$ (multiplicities counted) that are the zeros of the Blaschke product $B_{22}$. This follows directly from the following classical theorem by Kronecker [132; 175, p. 37].

Theorem 12.5.3 (Kronecker). Let $H = H_h$ be a Hankel matrix with symbol $h \in H_\infty$. Then the following statements are equivalent:
1. $H$ has finite rank $n$.
2. $g(z) = z^{-1}h(z^{-1}) \in H_\infty^\perp$ is rational and has $n$ poles (in $\mathbb{D}$).
3. $H = H_f$ with $f$ of the form $f = B_n g$, where $g \in H_\infty^\perp$ and $B_n$ is a Blaschke product with $n$ zeros in $\mathbb{D}$.

Now if the rank of the Hankel matrix is finite, then this may suggest that the problem is a finite-dimensional problem and that we do not have to compute the infinite-dimensional Schmidt pair as suggested in Theorem 12.5.2. This is indeed the case. It is in fact solved as a Nevanlinna-Pick interpolation problem. To see this, assume that $H_f$ has finite rank. This means that the given symbol $f$ has a finite number of poles $\{1/\bar\alpha_i\}_{i=1}^n$ that are in $\mathbb{E}$ (they are repeated according to their multiplicity). We can collect them in a Blaschke product $B_n$ with zeros $\{\alpha_i\}_{i=1}^n$ so that $f$ has the form $f = B_n\hat f$ with $\hat f \in H_\infty^\perp$. But because the approximant $h^\perp \in H_\infty^\perp$, it is obvious that the error $h = f - h^\perp$ will have the same poles as $f$ in $\mathbb{E}$ and thus it will also be of the form $h = B_n\hat h$. Now the solution $h^\perp = f - h = B_n(\hat f - \hat h)$ should be in $H_\infty^\perp$. By taking the substar, this is equivalent to saying that the solution $h^\perp$ should satisfy
\[ \hat f_*(z) - \hat h_*(z) = B_n(z)h^\perp_*(z) \in zB_n(z)H_\infty. \]
Thus we have reduced the interpolation problem to the following one: Given $\Gamma = \hat f_* = B_n f_*$ and the numbers $\alpha_i$ (which are the unstable poles of $f_*$), find $\Gamma_n = \hat h_* = B_n h_*$ such that $\Gamma_n$ interpolates $\Gamma$ in the points $\{\alpha_i\}_{i=0}^n$ (where $\alpha_0 = 0$). Thus
\[ \Gamma(z) - \Gamma_n(z) \in zB_n(z)H_\infty. \]
The solution of the original problem is then given by
\[ h = B_n\Gamma_{n*}, \qquad h^\perp = f - h. \]
This interpolation problem can of course be solved by the Nevanlinna-Pick algorithm as described in Section 6.4 (where we take $w = 0$). The algorithm constructs J-unitary matrices with parameters $\rho_k$ (and $\gamma_k$). These matrices are related to the recurrence of reproducing kernels as has been explained there. They can also be used to give all solutions to the interpolation problem. Indeed, this algorithm constructs transformations $\tau_k$ (see Theorem 6.4.1) and all solutions to the interpolation problem were given by $\Gamma_n = T_n(\Gamma_0)$ with $\Gamma_0$ arbitrary in $\mathcal{B}$ and where $T_n = (\tau_n \circ \tau_{n-1} \circ \cdots \circ \tau_1)^{-1}$ is defined by the associated $\Theta$-matrix. If in the algorithm all the $\rho_k \in \mathbb{D}$, then $\Gamma_n$ will be a Schur function for any choice of $\Gamma_0 \in \mathcal{B}$. A solution of minimal degree for $\Gamma_n$ will be found if $\Gamma_0$ is a constant. Thus if we set $\Gamma_0 = \gamma$, then $\|\Gamma_n\|_\infty < \gamma$ and we have solved the minimal degree $H_\infty$ control problem.

However, the problem we had above is to find an all-pass function, which need not be a Schur function. Of course if $\Gamma \in \mathcal{B}$, then $\Gamma_n$ will be in $\mathcal{B}$ too. However, if $\Gamma \notin \mathcal{B}$, then interpolation in the $\alpha_i$ may generate some $\Gamma_n$ that is not in $\mathcal{B}$. This can of course be checked by the modulus of the numbers $\rho_k$. If some $\rho_k$ falls in $\mathbb{E}$, then $\Gamma_n$ will not be a Schur function, and the $H_\infty$ control problem will not have a solution. However, it can well be that, if no degenerate situation occurs, the Nevanlinna-Pick algorithm can go on and construct some $\Gamma_n$. Of course the Pick matrix associated with this problem will not be positive definite anymore and there will not be a positive measure associated with the problem. This (rational) function $\Gamma_n$ is, however, not completely useless. It is a solution of a Nevanlinna-Pick-Takagi problem. The function $\Gamma_n$ will have unstable poles (poles in $\mathbb{D}$). The number of these unstable poles is equal to the number of negative eigenvalues of the (generalized) Pick matrix that can be associated with the interpolation problem.
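Counting negative eigenvalues of a Pick matrix through sign changes of its leading principal minors can be tried out on small examples. The sketch below is an illustration under stated assumptions, not the normalization of this monograph: it uses the classical Pick matrix $[(1 - w_i\bar w_j)/(1 - \alpha_i\bar\alpha_j)]$ for interpolation data $f(\alpha_i) = w_i$ in the disk, all function names are hypothetical, and the sign-change rule is only applied when every leading minor is nonzero.

```python
def pick_matrix(points, values):
    """Classical Pick matrix for data f(points[i]) = values[i], points in the unit disk."""
    pts = [complex(p) for p in points]
    vals = [complex(v) for v in values]
    n = len(pts)
    return [
        [(1 - vals[i] * vals[j].conjugate()) / (1 - pts[i] * pts[j].conjugate())
         for j in range(n)]
        for i in range(n)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1 + 0j
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) == 0:
            return 0j
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det


def leading_minors(matrix):
    """Real parts of the leading principal minors (real for a Hermitian matrix)."""
    return [determinant([row[:k] for row in matrix[:k]]).real
            for k in range(1, len(matrix) + 1)]


def sign_changes(sequence):
    """Number of sign changes in a sequence, zeros skipped."""
    signs = [x > 0 for x in sequence if x != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def negative_eigenvalue_count(points, values):
    """Sign changes in 1, D_1, ..., D_n; valid when all leading minors D_k are nonzero."""
    return sign_changes([1.0] + leading_minors(pick_matrix(points, values)))


if __name__ == "__main__":
    # Data met by the Schur function f(z) = z/2: no sign change expected.
    print(negative_eigenvalue_count([0.0, 0.5], [0.0, 0.25]))
    # f(0) = 0 and f(1/2) = 0.9 violates the Schwarz lemma: one sign change.
    print(negative_eigenvalue_count([0.0, 0.5], [0.0, 0.9]))
```

In the second example no Schur function can interpolate the data, and the single sign change signals one negative eigenvalue, in line with the count of unstable poles described above.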
The number of negative eigenvalues is equal to the number of sign changes in the sequence of determinants of the leading submatrices in this Pick matrix. By the determinant formulas for the kernels given in Theorem 2.2.2 and formula (12.1), this means that it is equal to the number of sign changes in the sequence of $E_k$, $k = 0, 1, \ldots, n$, where
1
I
|2
and the $\rho_i$ are as produced in the Nevanlinna-Pick algorithm.

The Nevanlinna-Pick-Takagi problem has another important application in best rational approximation and in model reduction for linear systems. These problems are characterized by the criterion of Hankel norm approximation. Since almost all our results only involved positive measures, they do not apply to these kinds of applications. We therefore will only outline this problem very briefly in the next section.

12.5.3. Hankel norm approximation

Given a function $h \in H_\infty$, one can associate with it the Hankel operator $H_h$, and the norm $\|H_h\|_2$ is in fact a norm for $h$. It is known as the Hankel norm of $h$ and we shall denote it as $\|h\|_H = \|H_h\|_2$. The Hankel norm lies somewhere between the $L_2$ and the $L_\infty$ norm since one can show that [57, p. 184]
\[ h \in H_\infty \ \Rightarrow\ \|h\|_2 \le \|h\|_H \le \|h\|_\infty. \]
Note that for $h \in L_\infty$, this is not a norm, because the $H_\infty^\perp$-part of $h$ does not contribute to the Hankel norm of $h$. Suppose that $f \in H_\infty$ is the transfer function of a linear system. Then the corresponding input-output map is the Hankel operator $H = H_f$. The problem is to approximate the system by another system, where the approximating criterion is the Hankel norm, which corresponds to approximating the Hankel matrix $H = H_f$ by another Hankel matrix $H'$. Thus, if $h'$ is the symbol of $H'$ and if $g = f - h'$, then we consider
\[ \|H - H'\|_2 = \|H_{f-h'}\|_2 = \|H_g\|_2 = \|g\|_H = \|f - h'\|_H \]
as the error of approximation. Either we want to bring this below a certain tolerance $r$ and, if there is more than one solution to achieve this, we try to find the simplest possible one, that is, the one for which the rank of $H'$ is as low as possible (this is the minimal degree problem), or we just want to find the optimal one of a certain degree, that is, we minimize the norm, given that the rank of $H'$ is bounded by $k$ (this is the minimal norm problem). We first remark that approximating an operator by an operator of finite rank is a standard problem. If $H$ is a Hankel operator with singular value decomposition (12.13), then
\[ \inf\{\|H - K\|_2 : K \text{ has rank} \le k\} = s_{k+1}. \]
An operator that is optimal is given by $K = \sum_{j=1}^k s_j\langle\cdot, x_j\rangle y_j$. However, this approximating operator is in general not Hankel. The remarkable result of Theorem 12.5.4 below, which generalizes Theorem 12.5.2, is that there is indeed an optimal approximating operator within the class of Hankel operators. Thus given a Hankel operator $H$, there exists a Hankel operator $H'$ such that it solves
\[ \inf\{\|H - H'\|_2 : H' \text{ has rank} \le k \text{ and } H' \text{ is Hankel}\}. \qquad (12.14) \]
The minimum is $s_{k+1}$. We can reformulate this in terms of the symbols as follows. Let $H_\infty^{(k)}$ denote the class of functions that are symbols of Hankel operators of rank $k$. Thus $H_\infty^{(k)}$ represents functions that belong to classes of the form $B_kH_\infty^\perp$, where $B_k$ is an arbitrary Blaschke product with $k$ zeros in $\mathbb{D}$. Then the optimization problem (12.14) becomes: Given $f \in L_\infty$, find
\[ \inf\{\|f - h'\|_H : h' \in H_\infty^{(k)}\}. \qquad (12.15) \]
Theorem 12.5.4 (Adamyan-Arov-Krein). Let $H = H_f$ be a compact Hankel operator with $s_{k+1}$ the $(k+1)$st singular value and $(x, y)$ an arbitrary Schmidt pair associated with $s_{k+1}$. A solution to the optimization problem (12.14) is obtained for $K$ equal to a unique Hankel operator $H'$. The Hankel operator $H - H'$ has a unique symbol $h$ of minimal norm, that is, there is a function $h \in L_\infty$ such that $H - H' = H_h$ and
\[ \|H - H'\|_2 = \|h\|_\infty = s_{k+1}. \]
This symbol $h$ can be constructed as follows: $h(z) = s_{k+1}x(z)/y(1/z)$. The function $h/s_{k+1}$ is an all-pass function. Equivalently, $h' = f - h$ is a solution to the Hankel norm problem (12.15).

Note that for $k = 0$ this theorem reduces to Theorem 12.5.2. This solves the minimal norm problem. Given the Hankel operator $H$, one finds the Hankel operator $H'$ of rank $k$ at most that minimizes $\|H - H'\|_2$. In the minimal degree problem, one is given the Hankel operator $H$ and a tolerance $r$. The problem is then to find the Hankel operator $H'$ of minimal degree that satisfies $\|H - H'\|_2 < r$. For the latter problem, it is seen from the previous theorem that if Hankel operators of rank $k$ are allowed, then the minimal norm that can be obtained is $s_{k+1}$. This implies that the minimal rank required to satisfy $\|H - H'\|_2 < r$ will be $k$ if $s_k > r > s_{k+1}$.
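The rank selection rule can be stated as a one-line search over the singular values. The following sketch is only an illustration with a hypothetical function name; it assumes the singular values are supplied in nonincreasing order as a finite list $s_1, s_2, \ldots$ and returns the smallest $k$ with $s_{k+1} < r$.

```python
def minimal_rank(singular_values, tolerance):
    """Smallest k such that s_{k+1} < tolerance, for s_1 >= s_2 >= ... given as a list.

    Returns None when no listed singular value lies below the tolerance, in which
    case the (truncated) list does not determine the answer.
    """
    for k, s in enumerate(singular_values):  # index k holds s_{k+1}
        if s < tolerance:
            return k
    return None


if __name__ == "__main__":
    s = [3.0, 2.0, 1.0, 0.5]          # assumed sample values
    print(minimal_rank(s, 1.5))       # s_2 = 2 > 1.5 > s_3 = 1, so k = 2
    print(minimal_rank(s, 4.0))       # already s_1 < 4, so rank 0 suffices
```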
The solution is constructed by the Nevanlinna-Pick algorithm as in the previous section. It is now allowed that there are unstable poles and the measure that is involved will not be positive definite anymore. We note that this kind of problem really requires the Nevanlinna-Pick interpolation algorithm and thus in this case the problem can not be solved by a polynomial approach. When all $\alpha_i = 0$ in this kind of problem, then the corresponding Nehari problem is equivalent with a Carathéodory coefficient problem [52] and this of course can be solved by a polynomial approach.

In a recent paper [99], Gohberg and Landau discuss the linear prediction problem for two stationary stochastic processes. More precisely, one of these stochastic processes is predicted using the cross correlation between the past of the other process and the future of the predicted process. In their formulation of the problem, the authors obtain a unifying framework for prediction problems as we have studied them in Section 12.1 and the problems we have discussed in this section, but only for the polynomial case. Extensions to the rational case may be expected to be straightforward with the tools provided by this monograph.
Conclusion
We have given an introduction to the theory of orthogonal rational functions. Although it required a slightly more complicated notation, our treatment allowed us to discuss the case of the circle (3O = T) and the case of the real line (3O = R) simultaneously. The case where all the poles are in O e and the so-called boundary case, where the poles of the rational functions are on the boundary of Oe, were considered separately. If A represents the basic points A = {a\, c*2,...}, which fix the poles, then the first case (internal poles) corresponds to all the points A being interior to O: A c CD (see cases IT and IR in the table below) and the boundary case corresponds to A c 9O (see cases BT and BR in the table below).
                                  $\mathbb{O} = \mathbb{D}$    $\mathbb{O} = \mathbb{U}$
$A \subset \mathbb{O}$                    IT              IR
$A \subset \partial\mathbb{O}$            BT              BR
The case $A \subset \mathbb{O}$ was extensively discussed in Chapters 2-10; the case $A \subset \partial\mathbb{O}$ was discussed in Chapter 11. The origin of these problems can be found in multipoint generalizations of classical moment problems and associated interpolation problems, which are usually related to one- or two-point rational approximants in a Padé-like sense. The case IT from the previous scheme, where $A \subset \mathbb{D}$ and orthogonality is considered for a measure on $\mathbb{T}$, is in a sense the most natural multipoint generalization of the theory of orthogonal polynomials on the unit circle and the associated trigonometric moment problem. The polynomial problem occurs as a special case by choosing all $\alpha_k$ equal to 0.
The case BR in that scheme corresponds to the multipoint generalization of the polynomials on the real line and the associated Hamburger moment problem. Here the polynomials appear as a special case when all $\alpha_k$ are chosen to be at $\infty$. The cases IR and BT are obtained by conformal mapping of the cases IT and BR, respectively. However, the polynomial situations for IT and BR are not mapped to polynomial situations in IR and BT. The polynomial situation in IT corresponds to a special rational situation in IR, in fact polynomials in $(z - i)/(z + i)$, and the polynomial case in BR is mapped to the case of special rational functions, namely polynomials in $(1 - z)/(1 + z)$. The band in between has been left open. There is much room left for considering the case where the points from $A$ can be anywhere on the Riemann sphere $\overline{\mathbb{C}}$ or, maybe more realistically, when $A \subset \overline{\mathbb{O}} = \mathbb{O} \cup \partial\mathbb{O}$. Some remarks about $A \subset \mathbb{O} \cup \mathbb{O}^e$ were given in Section 4.5. The case $A \subset \overline{\mathbb{D}} = \mathbb{D} \cup \mathbb{T}$ was considered by Dewilde and Dym [62] in the context of lossless inverse scattering, as was briefly mentioned in Section 12.3.

While the authors were lecturing about the topic of this monograph at several international conferences, they were often asked how this collaboration of people from four different countries came about. In fact, this was not just a coincidence. Coming from different directions (analytical, numerical, or applied), each of the authors was interested in the multipoint generalizations of the polynomial ideas as outlined above. Thus it was unavoidable that some day they should meet at a conference somewhere in the world. Once they had discovered their common interest, mutual visits and frequent participation in conferences, which sometimes took place in none of the countries where the authors reside, turned professional contacts into personal friendship. The collaboration has been quite intense during the past five years. For daily communication, the fax machine and the internet were indispensable tools.
We finish with some historical notes about how the authors were brought together. Olav Njåstad has been working for quite a while on moment problems, continued fractions, orthogonal polynomials, and related topics, both for the case of the real line and for the complex unit circle. Much of his work was in collaboration with W. B. Jones and W. Thron. For the trigonometric moment problem, the polynomials studied by Szegő are the natural building blocks to be used. As is well known, this theory naturally leads to rational approximants for positive real functions (Carathéodory functions), quadrature formulas, etc. For moment problems on the real line such as the Hamburger moment problem, the same kind of problems, tools, and solutions occur, but now using polynomials orthogonal on the real line. The discussion of the strong Hamburger moment problem gave rise to a first generalization. The orthogonal polynomials have to be replaced by
orthogonal Laurent polynomials and the rational approximants were replaced by approximants that did not approximate in one point, but in two points. So a step was made from one-point Padé-type approximation to two-point Padé approximation. The step from two to more than two is then only an unavoidable generalization. So the extended moment problems were born, which are related to multipoint Padé approximation. However, the idea of treating power series in more than one point somehow suggested dealing with only a finite number of points at which these power series were given. This explains why the earlier papers dealt with the so-called cyclic situation where the infinite sequence of points $\alpha_k$ was a cyclic repetition of only a finite number of different points.

Pablo González-Vera promoted in 1985 Two-Point Padé Type Approximants, which were closely related to orthogonal Laurent polynomials, quadrature, and strong moment problems. It was in the fall of 1985 at a workshop organized by Claude Brezinski in Luminy in the South of France that Erik Hendriksen, Pablo González-Vera, and Olav Njåstad met. Erik visited Trondheim at the end of the year and Olav was invited to Amsterdam in 1986, and this started a collaboration on the topic of orthogonal Laurent polynomials and multipoint generalizations, which give rise to the orthogonal rational functions. Erik's promotion of Strong Moment Problems and Orthogonal Laurent Polynomials was held in Trondheim in April 1989. Through their common interests in two-point Padé approximation, quadrature formulas, and more general multipoint Padé approximation, Pablo and Olav kept in touch. It was in May-June 1989 that Pablo spent some time in Trondheim. Olav and Pablo prepared some work to be presented at another Luminy workshop to be held in September of that year.

Adhemar Bultheel, who earned his Ph.D. under the guidance of P. Dewilde, had entered these topics motivated by the interests of his promoter in applied topics such as digital speech processing, lossless inverse scattering, etc. The algorithms of Schur and Nevanlinna-Pick were the tools par excellence to deal with these problems. The orthogonal rational functions are not of central interest there, though, because the optimal predictors are given by reproducing kernels in the first place. His Ph.D. (1979) dealt with Recursive Rational Approximation, discussing both matrix versions of the Nevanlinna-Pick algorithm and Padé approximation in one and two points. Olav and Adhemar met for the first time at the NATO Advanced Study Institute on Orthogonal Polynomials and Their Applications organized by Paul Nevai in Columbus, Ohio in May-June 1989. It was only at the September meeting organized by Claude Brezinski on Extrapolation and Rational Approximation in Luminy, France, from September 24 to 30, 1989 that the present four authors met. Adhemar and Olav visited
La Laguna (Tenerife) in October-November of the same year, and this really started a successful collaboration between the four authors in trying to extend the theory of rational functions, first for the unit circle with the $\alpha_k$ inside the disk, but later also for the boundary situation, that is, where the points are on the circle, which is the obvious choice when trying to get analogs for the extended moment problems on the line. From December 1993 till December 1996, A. Bultheel, P. González-Vera, and O. Njåstad were collaborating in the framework of a European Human Capital and Mobility project ROLLS under contract number CHRX-CT93-0416. This project tried to unify topics in rational approximation, orthogonal functions, linear algebra, linear systems, and signal processing. The financial contribution from this project for the accomplishment of this monograph is greatly appreciated. The present book is one of the accomplishments of the project.

This historical note may explain how this introduction to the theory of orthogonal rational functions came about. The treatment is only introductory since we have not attempted to cover all possible generalizations of the corresponding Szegő theory. Many aspects of the theory were not discussed and there is much work to be done. There is the matrix case, which was, for example, discussed in Refs. [56], [26], [146], [3], and [165] and many other papers. For a matrix theory of orthogonal polynomials on the unit circle see Refs. [74] and [75] or even the case of operator-valued functions [186, 84, 19, 149]. This is related to many directional or tangential interpolation problems such as discussed in Refs. [80], [82], [81], [83], [164], [117], [100], and [10] or the theory of more general J-unitary matrices, a theory initiated by Potapov [181, 77, 9, 13, 14, 15, 16, 127, 89, 90, 88, 91]. More of the polynomial results in Ref. [19] could have been generalized. We could have stressed more the multipoint Padé aspect of this theory (see Refs.
[101], [113], and [159]). Much more extensive results can be obtained for the asymptotics of the polynomials or the recursion coefficients and the convergence of Fourier series in these orthogonal functions. There is the theory of time-varying systems, which can be generalized. See Refs. [98], [197], [63], [20], and [198]. And there are of course the many beautiful applications, with their own terminology and their own problem settings. We have given but a brief introduction to some of these in the last chapter. One could consult various volumes [97, 84, 19, 52, 128]. And there is a very long bibliography that is related to all these topics. Citing them all would be a project on its own. So we are well aware of the fact that the present discussion can only be an appetizing survey that may hopefully evoke some interest in the field, if that were necessary at all. We think it is a fascinating subject, even more fascinating than the theory of orthogonal polynomials, if that does not sound too much like blasphemy.
Bibliography
[1] N. I. Achieser. Theory of Approximation. Frederick Ungar Publ. Co., New York, 1956.
[2] N. I. Akhiezer [Achieser]. The Classical Moment Problem. Oliver and Boyd, Edinburgh, 1969. Originally published Moscow, 1961.
[3] R. Ackner, H. Lev-Ari, and T. Kailath. The Schur algorithm for matrix-valued meromorphic functions. SIAM J. Matrix Anal. Appl., 15:140-150, 1994.
[4] V. M. Adamjan, D. Z. Arov, and M. G. Krein. Infinite Hankel matrices and generalized Carathéodory-Fejér and Riesz problems. Functional Anal. Appl., 2:1-18, 1968.
[5] V. M. Adamjan, D. Z. Arov, and M. G. Krein. Infinite Hankel matrices and generalized problems of Carathéodory-Fejér and I. Schur. Functional Anal. Appl., 2:269-281, 1968.
[6] V. M. Adamjan, D. Z. Arov, and M. G. Krein. Analytic properties of Schmidt pairs for a Hankel operator and the generalized Schur-Takagi problem. Math. USSR-Sb., 15:31-73, 1971.
[7] V. M. Adamjan, D. Z. Arov, and M. G. Krein. Infinite Hankel block matrices and related extension problems. Izv. Akad. Nauk. Armjan SSR Ser. Mat., 6:87-112, 1971. See also Am. Math. Soc. Transl., 111:133-156, 1978.
[8] A. C. Allison and N. J. Young. Numerical algorithms for the Nevanlinna-Pick problem. Numer. Math., 42:125-145, 1983.
[9] D. Alpay, J. A. Ball, I. Gohberg, and L. Rodman. j-Unitary preserving automorphisms of rational matrix functions: State space theory, interpolation, and factorization. Linear Algebra Appl., 197/198:531-566, 1994.
[10] D. Alpay and V. Bolotnikov. Two-sided interpolation for matrix functions with entries in the Hardy space. Linear Algebra Appl., 223/224:31-56, 1995.
[11] G. S. Ammar and W. B. Gragg. Determination of Pisarenko frequency estimates as eigenvalues of an orthogonal matrix. In Proc. SPIE, Int. Soc. for Optical Eng. Advanced Algorithms and Architectures for Signal Processing 2, 826:143-145, 1987.
[12] N. Aronszajn. Theory of reproducing kernels. Trans. Am. Math. Soc., 68:337-404, 1950.
[13] D. Z. Arov. γ-generating matrices, j-inner matrix-functions and related extrapolation problems. Part I. Theory of Functions, Functional Analysis and Their Applications, 51:61-67, 1989. (In Russian.)
[14] D. Z. Arov. γ-generating matrices, j-inner matrix-functions and related extrapolation problems. Part II. Theory of Functions, Functional Analysis and Their Applications, 52:103-109, 1989. (In Russian.)
[15] D. Z. Arov. γ-generating matrices, j-inner matrix-functions and related extrapolation problems. Part III. Theory of Functions, Functional Analysis and Their Applications, 53:57-64, 1990. (In Russian.)
[16] D. Z. Arov. Regular j-inner matrix-functions and related continuation problems. In G. Arsene et al., eds., Linear Operators in Function Spaces, vol. 43 of Oper. Theory: Adv. Appl., pp. 63-87, 1990.
[17] G. A. Baker, Jr. and P. R. Graves-Morris. Padé Approximants. Part II: Extensions and Applications, vol. 14 of Encyclopedia of Mathematics and Its Applications. Addison-Wesley, Reading, MA, 1981.
[18] G. A. Baker, Jr. and P. R. Graves-Morris. Padé Approximants, vol. 59 of Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge, 2nd ed., 1996.
[19] M. Bakonyi and T. Constantinescu. Schur's Algorithm and Several Applications, vol. 261 of Pitman Research Notes in Mathematics. Longman, Harlow, UK, 1992.
[20] J. A. Ball, I. Gohberg, and M. A. Kaashoek. Two-sided Nudelman interpolation for input-output operators of discrete time-varying systems. Integral Equations Operator Theory, 21:174-211, 1995.
[21] J. A. Ball, I. Gohberg, and L. Rodman. Realization and interpolation of rational matrix functions. In I. Gohberg, ed., Topics in Interpolation Theory of Rational Matrix-Valued Functions, vol. 33 of Oper. Theory: Adv. Appl., pp. 1-72, Birkhäuser Verlag, Basel, 1988.
[22] J. A. Ball, I. Gohberg, and L. Rodman. Interpolation of Rational Matrix Functions, vol. 45 of Oper. Theory: Adv. Appl., Birkhäuser Verlag, Basel, 1990.
[23] V. Belevitch. Classical Network Theory, pp. 93, 136, 141. Holden-Day, San Francisco, 1968.
[24] R. P. Brent and F. T. Luk. A systolic array for the linear-time solution of Toeplitz systems of equations. J. VLSI and Comp. Systems, 1(1):1-23, 1983.
[25] O. Brune. Synthesis of finite two-terminal network whose driving point impedance is a prescribed function of frequency. J. Math. Phys., 10:191-236, 1931.
[26] A. Bultheel. Orthogonal matrix functions related to the multivariable Nevanlinna-Pick problem. Bull. Soc. Math. Belg. Sér. B, 32(2):149-170, 1980.
[27] A. Bultheel. On a special Laurent-Hermite interpolation problem. In L. Collatz, G. Meinardus, and H. Werner, eds., Numerische Methoden der Approximationstheorie 6, vol. 59 of Int. Ser. of Numer. Math., pp. 63-79, Birkhäuser Verlag, Basel-New York-Berlin, 1981.
[28] A. Bultheel. On the ill-conditioning of locating the transmission zeros in least squares ARMA filtering. J. Comput. Appl. Math., 11(1):103-118, 1984.
[29] A. Bultheel. Laurent Series and Their Padé Approximations, vol. OT-27 of Oper. Theory: Adv. Appl. Birkhäuser Verlag, Basel-Boston, 1987.
[30] A. Bultheel and P. Dewilde. Orthogonal functions related to the Nevanlinna-Pick problem. In P. Dewilde, ed., Proc. 4th Int. Conf. on Math. Theory of Networks and Systems at Delft, pp. 207-212, Western Periodicals, North Hollywood, CA, 1979.
[31] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. A Szegő theory for rational functions. Technical Report TW131, Department of Computer Science, K. U. Leuven, May 1990.
[32] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Orthogonality and quadrature on the unit circle. In C. Brezinski, L. Gori, and A. Ronveaux, eds., Orthogonal Polynomials and Their Applications, vol. 9 of IMACS Annals on Computing and Applied Mathematics, pp. 205-210, J. C. Baltzer AG, Basel, 1991.
[33] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. The computation of orthogonal rational functions and their interpolating properties. Numer. Algorithms, 2(1):85-114, 1992.
[34] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Orthogonal rational functions and quadrature on the unit circle. Numer. Algorithms, 3:105-116, 1992.
[35] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Moment problems and orthogonal functions. J. Comput. Appl. Math., 48:49-68, 1993.
[36] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Asymptotics for orthogonal rational functions. Trans. Am. Math. Soc., 346:331-340, 1994.
[37] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Orthogonal rational functions with poles on the unit circle. J. Math. Anal. Appl., 182:221-243, 1994.
[38] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Orthogonality and boundary interpolation. In A. M. Cuyt, ed., Nonlinear Numerical Methods and Rational Approximation II, pp. 37-48. Kluwer, Dordrecht, 1994.
[39] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Quadrature formulas on the unit circle based on rational functions. J. Comput. Appl. Math., 50:159-170, 1994.
[40] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. On the convergence of multipoint Padé-type approximants and quadrature formulas associated with the unit circle. Numer. Algorithms, 13:321-344, 1996.
[41] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Continued fractions and orthogonal rational functions. In W. B. Jones and A. S. Ranga, eds., Orthogonal Functions, Moment Theory and Continued Fractions: Theory and Applications, pp. 69-100, Marcel Dekker, New York, 1998.
[42] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. A Favard theorem for rational functions with poles on the unit circle. East J. Approx., 3:21-37, 1997.
[43] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Orthogonal rational functions and nested disks. J. Approx. Theory, 89:344-371, 1997.
[44] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. Rates of convergence of multipoint rational approximants and quadrature formulas on the unit circle. J. Comput. Appl. Math., 77:77-102, 1997.
[45] A. Bultheel, P. González-Vera, E. Hendriksen, and O. Njåstad. A rational moment problem on the unit circle. Methods Appl. Anal., 4(3):283-310, 1997.
[46] R. B. Burckel. An Introduction to Classical Complex Analysis. Birkhäuser Verlag, Basel, 1971.
[47] J. P. Burg. Maximum entropy spectral analysis. In D. G. Childers, ed., Modern Spectral Analysis, pp. 34-39, IEEE Press, New York, 1978. Originally presented at 37th Meet. Soc. Exploration Geophysicists, 1967.
[48] C. Carathéodory. Über den Variabilitätsbereich der Koeffizienten von Potenzreihen die gegebene Werte nicht annehmen. Math. Ann., 64:95-115, 1907.
[49] C. Carathéodory. Über den Variabilitätsbereich der Fourier'schen Konstanten von positiven harmonischen Funktionen. Rend. Circ. Mat. Palermo, 32:193-217, 1911.
[50] C. Carathéodory and L. Fejér. Über den Zusammenhang der Extremen von harmonischen Funktionen mit ihren Koefficienten und über den Picard-Landauschen Satz. Rend. Circ. Mat. Palermo, 32:218-239, 1911.
[51] L. Cochran and S. C. Cooper. Orthogonal Laurent polynomials on the real line. In S. C. Cooper and W. J. Thron, eds., Continued Fractions and Orthogonal Functions, pp. 47-100, Marcel Dekker, New York, 1994.
[52] T. Constantinescu. Schur Analysis, Factorization and Dilation Problems, vol. 82 of Oper. Theory: Adv. Appl. Birkhäuser Verlag, Basel, 1996.
[53] G. Cybenko. Computing Pisarenko frequency estimates. In Proc. 1984 Conf. Inform. Syst. Sci., pp. 587-591. Princeton Univ., 1984.
[54] P. J. Davis and P. Rabinowitz. Methods of Numerical Integration. Academic Press, 2nd ed., 1984.
[55] Ph. Delsarte and Y. Genin. A survey of the split approach based techniques in digital signal processing applications. Philips J. Res., 43:346-374, 1988.
[56] Ph. Delsarte, Y. Genin, and Y. Kamp. The Nevanlinna-Pick problem for matrix-valued functions. SIAM J. Appl. Math., 36:47-61, 1979.
[57] Ph. Delsarte, Y. Genin, and Y. Kamp. On the role of the Nevanlinna-Pick problem in circuit and system theory. Int. J. Circuit Th. Appl., 9:177-187, 1981.
[58] P. Dewilde. Stochastic modeling with orthogonal filters. In Outils et Modèles Mathématiques pour l'Automatique, l'Analyse de Systèmes et le Traitement du Signal, Vol. 2, pp. 331-398, Éditions du CNRS, Paris, 1982.
[59] P. Dewilde. The lossless inverse scattering problem in the network-theory context. In H. Dym and I. Gohberg, eds., Topics in Operator Theory, Systems and Networks, vol. 12 of Oper. Theory: Adv. Appl., pp. 109-128, Birkhäuser Verlag, Basel, 1984.
[60] P. Dewilde and H. Dym. Schur recursions, error formulas, and convergence of rational estimators for stationary stochastic sequences. IEEE Trans. Inf. Th., IT-27:446-461, 1981.
[61] P. Dewilde and H. Dym. Lossless inverse scattering with rational networks: theory and applications. Technical Report 83-14, Delft University of Technology, Dept. of Elec. Eng. Network Theory Section, December 1982.
[62] P. Dewilde and H. Dym. Lossless inverse scattering, digital filters, and estimation theory. IEEE Trans. Inf. Th., IT-30:644-662, 1984.
[63] P. Dewilde, M. A. Kaashoek, and M. Verhaegen, eds. Challenges of a Generalized System Theory. Essays of the Dutch Academy of Arts and Sciences. Dutch Acad. Arts Sci., Amsterdam, 1993.
[64] P. Dewilde, A. Viera, and T. Kailath. On a generalized Szegő-Levinson realization algorithm for optimal linear predictors based on a network synthesis approach. IEEE Trans. Circuits and Systems, CAS-25:663-675, 1978.
[65] M. M. Djrbashian. Expansions in systems of rational functions on a circle with a given set of poles. Doklady Akademii Nauk SSSR, 143:17-20, 1962. (In Russian. Translation in Soviet Mathematics Doklady, 3:315-319, 1962.)
[66] M. M. Djrbashian. Orthogonal systems of rational functions on the unit circle with given set of poles. Doklady Akademii Nauk SSSR, 147:1278-1281, 1962. (In Russian. Translation in Soviet Mathematics Doklady, 3:1794-1798, 1962.)
[67] M. M. Djrbashian. Orthogonal systems of rational functions on the circle. Izv. Akad. Nauk Armyan. SSR, 1:3-24, 1966. (In Russian.)
[68] M. M. Djrbashian. Orthogonal systems of rational functions on the unit circle. Izv. Akad. Nauk Armyan. SSR, 1:106-125, 1966. (In Russian.)
[69] M. M. Djrbashian. Expansions by systems of rational functions with fixed poles. Izv. Akad. Nauk Armyan. SSR, 2:3-51, 1967. (In Russian.)
[70] M. M. Djrbashian. A survey on the theory of orthogonal systems and some open problems. In P. Nevai, ed., Orthogonal Polynomials: Theory and Practice, vol. 294 of Series C: Mathematical and Physical Sciences, pp. 135-146, NATO-ASI, Kluwer Academic Publishers, Boston, 1990.
[71] W. F. Donoghue Jr. Monotone Matrix Functions and Analytic Continuation. Springer-Verlag, Berlin, 1974.
[72] R. G. Douglas. Banach Algebra Techniques in Operator Theory. Academic Press, New York, 1972.
[73] R. G. Douglas, H. S. Shapiro, and A. L. Shields. Cyclic vectors and invariant subspaces for the backward shift operator. Ann. Inst. Fourier, 20:37-76, 1970.
[74] V. K. Dubovoj, B. Fritzsche, and B. Kirstein. On a class of matrix completion problems. Math. Nachr., 143:211-226, 1989.
[75] V. K. Dubovoj, B. Fritzsche, and B. Kirstein. Matricial Version of the Classical Schur Problem, vol. 129 of Teubner-Texte zur Mathematik. Teubner Verlagsgesellschaft, Stuttgart, Leipzig, 1992.
[76] P. L. Duren. The Theory of Hp Spaces, vol. 38 of Pure and Applied Mathematics. Academic Press, New York, 1970.
[77] H. Dym. J-Contractive Matrix Functions, Reproducing Kernel Hilbert Spaces and Interpolation, vol. 71 of CBMS Regional Conf. Ser. in Math. Am. Math. Soc., Providence, RI, 1989.
[78] T. Erdélyi, P. Nevai, J. Zhang, and J. S. Geronimo. A simple proof of "Favard's theorem" on the unit circle. Atti. Sem. Mat. Fis. Univ. Modena, 29:551-556, 1991. Proceedings of the Meeting "Trends in Functional Analysis and Approximation Theory," 1989, Italy.
[79] J. Favard. Sur les polynômes de Tchebicheff. C. R. Acad. Sci. Paris, 200:2052-2053, 1935.
[80] I. P. Fedcina. A criterion for the solvability of the Nevanlinna-Pick tangent problem. Mat. Issled. (Kishinev), 7:213-227, 1972. (In Russian.)
[81] I. P. Fedcina. A description of the solutions of the Nevanlinna-Pick tangent problem. Dokl. Akad. Nauk Armjan SSR, Ser. Mat., 60:37-42, 1975. (In Russian.)
[82] I. P. Fedcina. The tangential Nevanlinna-Pick problems with multiple points. Dokl. Akad. Nauk Armjan SSR, Ser. Mat., 61:214-218, 1975.
[83] I. P. Fedcina. The Schur problem for vector valued functions. Ukrain. Mat. Z., 30(6):797-805, 861, 1978.
[84] C. Foiaş and A. Frazho. The Commutant Lifting Approach to Interpolation Problems, vol. 44 of Oper. Theory: Adv. Appl. Birkhäuser Verlag, Basel, 1990.
[85] C. Foiaş, J. W. Helton, H. Kwakernaak, and J. B. Pearson. H∞-Control Theory. Number 1496 in Lecture Notes in Math. Springer-Verlag, Berlin, 1991.
[86] B. A. Francis. A Course in H∞ Control Theory. Springer-Verlag, Berlin, Heidelberg, 1987.
[87] G. Freud. Orthogonal Polynomials. Pergamon Press, Oxford, 1971.
[88] B. Fritzsche, B. Fuchs, and B. Kirstein. Schur sequence parametrizations of Potapov-normalized full rank $j_{pq}$-elementary factors. Linear Algebra Appl., 191:107-150, 1994.
[89] B. Fritzsche and B. Kirstein. Darlington synthesis with Arov-singular $j_{pq}$-inner functions. Analysis, 13:215-228, 1993.
[90] B. Fritzsche and B. Kirstein. On the Weyl balls associated with nondegenerate matrix-valued Carathéodory functions. Z. Anal. Anw., 12:239-261, 1993.
[91] B. Fritzsche and B. Kirstein. Carathéodory sequence parametrizations of Potapov-normalized full rank $j_{qq}$-elementary factors. Linear Algebra Appl., 214:145-186, 1995.
[92] J. B. Garnett. Bounded Analytic Functions. Academic Press, New York, 1981.
[93] Ya. Geronimus. Polynomials Orthogonal on a Circle and Their Applications, vol. 3 of Transl. Math. Monographs, pp. 1-78. Am. Math. Soc., 1954.
[94] Ya. Geronimus. Polynomials Orthogonal on a Circle and Interval. International Series of Monographs in Pure and Applied Mathematics. Pergamon Press, Oxford, 1960.
[95] Ya. Geronimus. Orthogonal Polynomials. Consultants Bureau, New York, 1961.
[96] L. Gillman and M. Jerison. Rings of Continuous Functions, vol. 43 of Graduate Texts in Mathematics. Van Nostrand, Princeton, NJ, 1976.
[97] I. Gohberg, ed. I. Schur Methods in Operator Theory and Signal Processing, vol. 18 of Oper. Theory: Adv. Appl. Birkhäuser Verlag, Basel, 1986.
[98] I. Gohberg, ed. Time-Variant Systems and Interpolation, vol. 56 of Oper. Theory: Adv. Appl. Birkhäuser Verlag, Basel, 1992.
[99] I. Gohberg and H. J. Landau. Prediction of two processes and the Nehari problem. J. Fourier Anal. Appl., 3:43-62, 1997.
[100] I. Gohberg and L. A. Sakhnovich, eds. Matrix and Operator Valued Functions, vol. 72 of Oper. Theory: Adv. Appl. Birkhäuser Verlag, Basel, 1994.
[101] P. González-Vera and O. Njåstad. Szegő functions and multipoint Padé approximation. J. Comput. Appl. Math., 32:107-116, 1990.
[102] U. Grenander and G. Szegő. Toeplitz Forms and Their Applications. University of California Press, Berkeley, 1958.
[103] T. H. Gronwall. On the maximum modulus of an analytic function. Ann. of Math., 16(2):77-81, 1914-15.
[104] H. Hamburger. Ueber eine Erweiterung des Stieltjesschen Moment Problems I. Math. Ann., 81:235-319, 1920.
[105] H. Hamburger. Ueber eine Erweiterung des Stieltjesschen Moment Problems II. Math. Ann., 82:120-164, 1921.
[106] H. Hamburger. Ueber eine Erweiterung des Stieltjesschen Moment Problems III. Math. Ann., 82:168-187, 1921.
[107] G. Hamel. Eine charakteristische Eigenschaft beschränkter analytischer Funktionen. Math. Ann., 78:257-269, 1918.
[108] M. H. Hayes and M. A. Clements. An efficient algorithm for computing Pisarenko's harmonic decomposition using Levinson's algorithm. IEEE Trans. Acoust. Speech Signal Process., ASSP-34:485-491, 1986.
[109] G. Heinig and K. Rost. Algebraic Methods for Toeplitz-Like Matrices and Operators. Akademie Verlag, Berlin, 1984. Also Birkhäuser Verlag, Basel.
[110] H. Helson. Lectures on Invariant Subspaces. Academic Press, New York, 1964.
[111] J. W. Helton. Orbit Structure of the Möbius Transformation Semi-Group Acting on H∞ (Broadband Matching), vol. 3 of Adv. in Math. Suppl. Stud., pp. 129-197. Academic Press, New York, 1978.
[112] E. Hendriksen and O. Njåstad. A Favard theorem for rational functions. J. Math. Anal. Appl., 142(2):508-520, 1989.
[113] E. Hendriksen and O. Njåstad. Positive multipoint Padé continued fractions. Proc. Edinburgh Math. Soc., 32:261-269, 1989.
[114] E. Hendriksen and H. van Rossum. Orthogonal Laurent polynomials. Proc. of the Kon. Nederl. Akad. Wetensch., Proceedings A, 89(1):17-36, 1986.
[115] P. Henrici. Applied and Computational Complex Analysis. Volume 2: Special Functions, Integral Transforms, Asymptotics, Continued Fractions, vol. II of Pure and Applied Mathematics, a Wiley-Interscience Series of Texts, Monographs and Tracts. John Wiley & Sons, New York, 1977.
[116] K. Hoffman. Banach Spaces of Analytic Functions. Prentice-Hall, Englewood Cliffs, 1962.
[117] T. S. Ivanchenko and L. A. Sakhnovich. An operator approach to the Potapov scheme for the solution of interpolation problems. In I. Gohberg and L. A. Sakhnovich, eds., Matrix and Operator Valued Functions, vol. 72 of Oper. Theory: Adv. Appl., pp. 48-86. Birkhäuser Verlag, Basel, 1994.
[118] W. B. Jones, O. Njåstad, and W. J. Thron. Two-point Padé expansions for a family of analytic functions. J. Comput. Appl. Math., 9:105-124, 1983.
[119] W. B. Jones, O. Njåstad, and W. J. Thron. Orthogonal Laurent polynomials and the strong Hamburger moment problem. J. Math. Anal. Appl., 98:528-554, 1984.
[120] W. B. Jones, O. Njåstad, and W. J. Thron. Continued fractions associated with the trigonometric moment problem and other strong moment problems. Constr. Approx., 2:197-211, 1986.
[121] W. B. Jones, O. Njåstad, and W. J. Thron. Perron-Carathéodory continued fractions. In J. Gilewicz, M. Pindor, and W. Siemaszko, eds., Rational Approximation and Its Applications in Mathematics and Physics, vol. 1237 of Lecture Notes in Math., pp. 188-206, Springer-Verlag, Berlin, 1987.
[122] W. B. Jones, O. Njåstad, and W. J. Thron. Moment theory, orthogonal polynomials, quadrature and continued fractions associated with the unit circle. Bull. London Math. Soc., 21:113-152, 1989.
[123] W. B. Jones and W. J. Thron. Continued Fractions. Analytic Theory and Applications. Addison-Wesley, Reading, MA, 1980.
[124] T. Kailath. A view of three decades of linear filtering theory. IEEE Trans. Inf. Th., IT-20:146-181, 1974. Reprinted in [125], 10-45.
[125] T. Kailath et al., ed. Linear Least-Squares Estimation, vol. 17 of Benchmark Papers in Electrical Engineering and Computer Science. Dowden, Hutchinson and Ross, Stroudsburg, PA, 1977.
[126] J. Karlsson. Rational interpolation and best rational approximation. J. Math. Anal. Appl., 52:38-52, 1976.
[127] V. E. Katsnelson. Left and right Blaschke-Potapov products and Arov-singular matrix-valued functions. Integral Equations Operator Theory, 13:836-848, 1990.
[128] M. Kimura. Chain Scattering Approach to H-Infinity-Control. Birkhäuser Verlag, Basel, 1997.
[129] A. N. Kolmogorov. Stationary sequences in Hilbert's space. Bull. Moscow State Univ., 2(6):1-40, 1940. Reprinted in [125], 66-89.
[130] P. Koosis. Introduction to Hp Spaces, vol. 40 of London Mathematical Society Lecture Notes. Cambridge University Press, Cambridge, 1980.
[131] M. G. Krein and A. A. Nudel'man. The Markov Moment Problem and Extremal Problems, vol. 50 of Transl. Math. Monographs. Am. Math. Soc., Providence, RI, 1977.
[132] L. Kronecker. Algebraische Reduction der Schaaren bilinearer Formen. S.-B. Akad. Berlin, pp. 763-776, 1890.
[133] H. J. Landau. Maximum entropy and the moment problem. Bull. Am. Math. Soc. (N.S.), 16(1):47-77, 1987.
[134] N. Levinson. The Wiener rms (root mean square) error criterion in filter design and prediction. J. Math. Phys., 25:261-278, 1947.
[135] X. Li and K. Pan. Strong and weak convergence of rational functions orthogonal on the unit circle. J. London Math. Soc., 53:289-301, 1996.
[136] X. Li and E. B. Saff. On Nevai's characterization of measures with almost everywhere positive derivative. J. Approx. Theory, 63:191-197, 1990. [137] G. L. Lopes [Lopez-Lagomasino]. Conditions for convergence of multipoint Pade approximants for functions of Stieltjes type. Math. USSR-Sb, 35:363-376, 1979. [138] G. L. Lopes [Lopez-Lagomasino]. On the asymptotics of the ratio of orthogonal polynomials and convergence of multipoint Pade approximants. Math. USSR-Sb, 56:207-219, 1985. [139] G. L. Lopez [Lopez-Lagomasino]. Szego's theorem for orthogonal polynomials with respect to varying measures. In M. Alfaro et al., eds., Orthogonal Polynomials and Their Applications, vol. 1329 of Lecture Notes in Math., pp. 255-260. Springer-Verlag, Berlin, 1988. [140] G. L. Lopez [Lopez-Lagomasino]. Asymptotics of polynomials orthogonal with respect to varying measures. Constr. Approx., 5:199-219, 1989. [141] G. L. Lopez [Lopez-Lagomasino]. Convergence of Pade approximants of Stieltjes type meromorphic functions and comparative asymptotice for orthogonal polynomials. Math. USSR-Sb, 64:207-227, 1989. [142] L. Lorentzen and H. Waadeland. Continued Fractions with Applications, vol. 3 of Studies in Computational Mathematics. North-Holland, Dordrecht, 1992. [143] J. Makhoul. Linear prediction: a tutorial review. Proc. IEEE, 63:561-580, 1975. [144] J. D. Markel and A. H. Gray Jr. Linear Prediction of Speech. Springer-Verlag, New York, 1976. [145] A. Mate, P. Nevai, and V. Totik. Strong and weak convergence of orthogonal polynomials. Am. J. Math., 109:239-281, 1987. [146] R. Mathias. Matrices with positive Hermitian part: inequalities and linear systems. SIAM J. Matrix Anal. Appl, 13(2):640-654, 1992. [147] J. H. McCabe and J. A. Murphy. Continued fractions which correspond to power series expansions at two points. J. Inst. Math. Appl., 17:233-247, 1976. [148] H. Meschkowski. Hilbertsche Rdume mit Kernfunktion. Springer-Verlag, Berlin, 1962. [149] K. Miiller. 
Arov-Dewilde-Dym-Parametrization of $j_{qq}$-inner functions. PhD thesis, Univ. Leipzig, 1995. [150] K. Müller and A. Bultheel. Translation of the Russian paper "Orthogonal systems of rational functions on the unit circle" by M. M. Dzrbasian. Technical Report TW253, Department of Computer Science, K. U. Leuven, February 1997. [151] Z. Nehari. On bounded bilinear forms. Ann. of Math., 65:153-162, 1957. [152] P. Nevai. Géza Freud, orthogonal polynomials and Christoffel functions. A case study. J. Approx. Theory, 48:3-167, 1986. [153] R. Nevanlinna. Über beschränkte Funktionen die in gegebenen Punkten vorgeschriebene Werte annehmen. Ann. Acad. Sci. Fenn. Ser. A., 13(1):71, 1919. [154] R. Nevanlinna. Asymptotische Entwickelungen beschränkter Funktionen und das Stieltjessche Momentenproblem. Ann. Acad. Sci. Fenn. Ser. A., 18(5):53, 1922. [155] R. Nevanlinna. Kriterien für die Randwerte beschränkter Funktionen. Math. Z., 13:1-9, 1922. [156] R. Nevanlinna. Über beschränkte analytische Funktionen. Ann. Acad. Sci. Fenn. Ser. A., 32(7):75, 1929. [157] O. Njåstad. An extended Hamburger moment problem. Proc. Edinburgh Math. Soc., 28:167-183, 1985.
[158] O. Njåstad. Unique solvability of an extended Hamburger moment problem. J. Math. Anal. Appl., 124:502-519, 1987. [159] O. Njåstad. Multipoint Padé approximation and orthogonal rational functions. In A. Cuyt, ed., Nonlinear Numerical Methods and Rational Approximation, pp. 258-270, D. Reidel, Dordrecht, 1988. [160] O. Njåstad. A modified Schur algorithm and an extended Hamburger moment problem. Trans. Am. Math. Soc., 327(1):283-311, 1991. [161] O. Njåstad. Classical and strong moment problems. Comm. Anal. Th. Continued Fractions, 4:4-38, 1995. [162] O. Njåstad and W. J. Thron. The theory of sequences of orthogonal L-polynomials. In H. Waadeland and H. Wallin, eds., Padé Approximants and Continued Fractions, Det Kongelige Norske Videnskabers Selskab Skrifter (No. 1), pp. 54-91, 1983. [163] O. Njåstad and W. J. Thron. Unique solvability of the strong Hamburger moment problem. J. Austral. Math. Soc. (Series A), 40:5-19, 1986. [164] A. A. Nudelman. Matrix versions of interpolation problems of Nevanlinna-Pick and Loewner type. In U. Helmke, R. Mennicken, and J. Saurer, eds., Systems and Networks: Mathematical Theory and Applications, Vol. I (Regensburg, 1993), number 77 in Mathematical Research, pp. 291-309, Akademie Verlag, Berlin, 1994. [165] A. A. Nudelmann. Multipoint matrix moment problem. Dokl. Acad. Nauk., 298:812-815, 1988. [166] K. Pan. On characterization theorems for measures associated with orthogonal systems of rational functions on the unit circle. J. Approx. Theory, 70:265-272, 1992. [167] K. Pan. On orthogonal systems of rational functions on the unit circle and polynomials orthogonal with respect to varying measures. J. Comput. Appl. Math., 47(3):313-322, 1993. [168] K. Pan. Strong and weak convergence of orthogonal systems of rational functions on the unit circle. J. Comput. Appl. Math., 46:427-436, 1993. [169] K. Pan. On orthogonal polynomials with respect to varying measures on the unit circle. Trans. Am. Math. Soc., 346:331-340, 1994. [170] K.
Pan. On the convergence of rational interpolation approximant of Carathéodory functions. J. Comput. Appl. Math., 54:371-376, 1994. [171] K. Pan. Extensions of Szegő's theory of rational functions orthogonal on the unit circle. J. Comput. Appl. Math., 62:321-331, 1995. [172] K. Pan. On the orthogonal rational functions with arbitrary poles and interpolation properties. J. Comput. Appl. Math., 60:347-355, 1995. [173] K. Pan. On the convergence of rational functions orthogonal on the unit circle. J. Comput. Appl. Math., 76:315-324, 1996. [174] A. Papoulis. Levinson's algorithm, Wold decomposition, and spectral estimation. SIAM Rev., 27(3):405-441, 1985. [175] J. R. Partington. An Introduction to Hankel Operators, vol. 13 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1988. [176] G. Pick. Über die Beschränkungen analytischen Funktionen welche durch vorgegebene Funktionswerte bewirkt werden. Math. Ann., 77:7-23, 1916. [177] G. Pick. Über die Beschränkungen analytischen Funktionen durch vorgegebene Funktionswerte. Math. Ann., 78:270-275, 1918. [178] G. Pick. Über beschränkte Funktionen mit vorgeschriebenen Wertzuordnungen. Ann. Acad. Sci. Fenn. Ser. A, 15(3):17, 1920.
[179] V. P. Pisarenko. The retrieval of harmonics from a covariance function. Geophys. J. R. Astron. Soc., 33:347-366, 1973. [180] V. P. Potapov. The Multiplicative Structure of J-Contractive Matrix Functions, vol. 15 of Am. Math. Soc. Transl. Ser. 2, pp. 131-243. Am. Math. Soc., Providence, RI, 1960. [181] V. P. Potapov. Linear Fractional Transformations of Matrices, vol. 138 of Am. Math. Soc. Transl. Ser. 2, pp. 21-35. Am. Math. Soc., Providence, RI, 1988. [182] S. C. Power. Hankel Operators on Hilbert Space. Pitman Advanced Public Program, Boston, 1982. [183] R. Redheffer. On the relation of transmission-line theory to scattering and transfer. J. Math. Phys., 41:1-41, 1962. [184] F. Riesz. Über ein Problem des Herrn Carathéodory. J. Reine Angew. Math., 146:83-87, 1916. [185] F. Riesz. Über Potenzreihen mit vorgeschriebenen Anfangsgliedern. Acta Math., 42:145-171, 1918. [186] M. Rosenblum and J. Rovnyak. Hardy Classes and Operator Theory. Oxford University Press, New York, 1985. [187] W. Rudin. Real and Complex Analysis. McGraw-Hill, New York, 2nd ed., 1974. [188] D. Sarason. Generalized interpolation in $H^\infty$. Trans. Am. Math. Soc., 127:179-203, 1967. [189] I. Schur. Über ein Satz von C. Carathéodory. S.-B. Preuss. Akad. Wiss. (Berlin), pp. 4-15, 1912. [190] I. Schur. Über Potenzreihen die im Innern des Einheitskreises Beschränkt sind I. J. Reine Angew. Math., 147:205-232, 1917. See also [97, pp. 31-59]. [191] I. Schur. Über Potenzreihen die im Innern des Einheitskreises Beschränkt sind II. J. Reine Angew. Math., 148:122-145, 1918. See also [97, pp. 36-88]. [192] J. A. Shohat and J. D. Tamarkin. The Problem of Moments, volume 1 of Math. Surveys. Am. Math. Soc., Providence, RI, 1943. [193] H. Stahl and V. Totik. General Orthogonal Polynomials. Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge, 1992. [194] T. J. Stieltjes. Recherches sur les fractions continues. Ann. Fac. Sci. Toulouse, 8:J.1-122, 1894, 9:A.1-47, 1895.
English transl.: Oeuvres Complètes, Collected Papers, Vol. 2, 609-745, Springer-Verlag, Berlin, 1993. [195] M. H. Stone. Linear Transformations in Hilbert Space and Their Applications to Analysis, vol. 15 of Am. Math. Soc. Colloq. Publ. Am. Math. Soc., New York, 1932. [196] G. Szegő. Orthogonal Polynomials, vol. 33 of Am. Math. Soc. Colloq. Publ. Am. Math. Soc., Providence, RI, 3rd ed., 1967. First edition 1939. [197] A.-J. van der Veen. Time-variant system theory and computational modeling. Realization, approximation and factorization. PhD thesis, Technical University Delft, The Netherlands, June 1993. [198] A.-J. van der Veen and P. Dewilde. Embedding time-varying contractive systems in lossless realizations. Math. Control Signals Systems, 7:306-330, 1995. [199] J. L. Walsh. Interpolation and functions analytic interior to the unit circle. Trans. Am. Math. Soc., 34:523-556, 1932. [200] J. L. Walsh. Interpolation and Approximation, vol. 20 of Am. Math. Soc. Colloq. Publ. Am. Math. Soc., Providence, RI, 3rd ed., 1960. First edition 1935. [201] N. Wiener. Extrapolation, Interpolation and Smoothing of Time Series. Wiley, New York, 1949.
[202] N. Wiener and P. Masani. The prediction theory of multivariate stochastic processes, I. The regularity condition. Acta Math., 98:111-150, 1957. [203] N. Wiener and P. Masani. The prediction theory of multivariate stochastic processes, II. The linear predictor. Acta Math., 99:93-139, 1958. [204] D. C. Youla and M. Saito. Interpolation with positive-real functions. J. Franklin Inst., 284(2):77-108, 1967.
Index
admissible, 142
admittance matrix, 370
all-pass function, 377
autoregressive filter, 353
backward shift, 343
Banach space, 20, 36
Bessel inequality, 160
Beurling theorem, 45
Beurling-Lax theorem, 34
Blaschke factor, 42, 65
Blaschke product, 31, 43, 53, 84, 91, 110, 122, 135, 142, 149, 151, 153, 157, 164, 176, 179, 181, 184, 241, 244, 257
Blaschke-Potapov factor, 38
boundary situation, 101
Brune section, 368
Carathéodory class, 11, 12, 15, 23
Carathéodory coefficient problem, 104, 387
Carathéodory-Toeplitz theorem, 356
Carleman condition, 194, 195
Cauchy integral, 22, 46, 62, 138, 153
Cauchy kernel, 23
Cauchy-Stieltjes integral, 22, 122
causal system, 360
Cayley transform, 15, 16, 25, 111, 144, 182
chain scattering matrix, 40, 362
Christoffel function, 118, 175
Christoffel-Darboux relation, 12, 64, 67, 78, 93, 101, 137, 185, 191, 192, 200, 204, 243, 245, 246, 272
compactification, 253, 302
continued fraction, 96, 103, 105
  approximants, 95
  convergents, 95
control problem, 373
convergence factor, 32
covariance, 343
Darlington synthesis, 369
density, 34, 149, 155, 174, 254
determinant expression, 55, 57
determinant formula, 89, 90, 92, 118, 243, 246
dissipated power, 369
EMP-fraction, 338
energy of stochastic process, 343
Erdős-Turán condition, 173, 194, 209
expectation operator, 343
extended multipoint Padé fraction, 338
extended recurrence relation, 309
extremal problem, 35, 36, 42, 58, 174
Favard theorem, 13, 121, 161, 307
Fourier transform, 21
Fourier-Stieltjes transform, 21
frequency domain, 345, 369
functions of second kind, 92, 100, 104, 105, 111, 114, 117, 121, 123, 145, 181, 241, 242, 267, 269, 331
Gram matrix, 46, 48, 49, 53, 56, 57
Gram-Schmidt orthogonalization, 56
Green's formula, 93, 277
Hamburger representation, 29
Hankel norm approximation, 385
Hardy class, 15, 17
harmonic function, 186
harmonic majorant, 17, 40
Helly's theorems, 252
Hurwitz theorem, 185, 204
impedance matrix, 370
incident wave, 359
incoming wave, 371
inner function, 31, 34, 156
inner-outer factorization, 31, 190, 382
innovation prediction filter, 349
innovation process, 344
internal impedance, 370
internal stability, 374
invariant subspace, 34, 45
J-contractive, 36, 88
J-inner, 41
J-lossless conjugation, 378
J-lossless factorization, 377
J-unitary, 12, 36, 64, 70, 74, 75, 87, 88, 140, 142-144, 166
Kolmogorov isomorphism, 347
Laplace transform, 369
Laurent-Padé approximation, 139
leading coefficient, 54, 260
Lebesgue decomposition, 21, 28, 346
linear functional, 172
linear prediction, 343
Liouville-Ostrogradskii formula, 78, 90, 92
load, 361
lossless n-port, 370
lossless inverse scattering, 359
lossless scattering function, 360
M-fraction, 105, 338
Möbius transform, 24, 25, 121, 148
Malmquist basis, 51
Mason rules, 363
maximal entropy, 366
measure
  normalized Lebesgue, 17, 18
minimal phase filter, 353
moment, 5, 21, 46, 130, 195, 239, 240, 300, 329
moment problem, 5, 239-241, 251, 302
  Hamburger, 9
  strong Hamburger, 9
  trigonometric, 7
Montel theorem, 185
MP-fraction, 105, 338
multipoint Padé approximation, 105, 139, 238, 338
multipoint Padé fraction, 338
multipoint Padé-type approximation, 238
N-extremal, 254
Nehari problem, 381
Nevanlinna class, 15, 20
Nevanlinna kernel, 28
Nevanlinna measure, 29
Nevanlinna representation, 29
Nevanlinna-Pick algorithm, 121, 140, 145, 169, 356, 365
Nevanlinna-Pick fraction, 104
Nevanlinna-Pick problem, 6, 25, 47, 104, 239, 342
Nevanlinna-Pick-Takagi problem, 342, 384
nondeterministic process, 344
normal family, 185
Norton theorem, 370
NP-fraction, 104
orthogonal Laurent polynomial, 9
outer function, 31, 33, 34, 63, 135, 139, 156, 377
outgoing wave, 371
Paley-Wiener theorem, 22
para-orthogonal, 106, 108, 114, 117, 120, 122, 242, 280
Parseval equality, 160
passive n-port, 370
passive scattering medium, 360
past, 344
PC-fraction, 104, 338
Perron-Carathéodory fraction, 104, 338
Pick matrix, 47, 342
Pisarenko modeling problem, 356
Poisson integral, 31
Poisson kernel, 27, 28, 63, 89, 128, 145, 163, 166, 184, 189, 192
Poisson-Stieltjes integral, 27, 253
port, 361, 369
positive real function, 11, 15, 23, 29, 82, 89, 121, 122, 130, 131, 143, 146, 181
present, 344
projection, 13, 35, 36, 46, 60, 135, 153, 176
pseudo-hyperbolic distance, 25
pseudo-meromorphic extension, 32, 33, 37
quadrature formula, 12, 106, 112, 123, 239, 286
quasi-orthogonal, 280
R-Szegő quadrature, 113
rational Szegő formula, 117, 119
Redheffer transformation, 363
reflected wave, 359
regular function, 261
regular index, 261
regular process, 344
regular values, 285
remote past, 344
reproducing kernel, 34, 52, 55, 58, 63, 66, 67, 70, 135, 144, 165, 178, 243
Riesz representation theorem, 172
Riesz-Herglotz kernel, 27, 47
Riesz-Herglotz measure, 27
Riesz-Herglotz transform, 356
Riesz-Herglotz-Nevanlinna measure, 89
Riesz-Herglotz-Nevanlinna representation, 27
Riesz-Herglotz-Nevanlinna transform, 27, 121, 122, 170, 252, 334
scattering function, 360
scattering matrix, 40, 360
scattering medium, 359
Schur algorithm, 6, 7, 121, 365
Schur class, 12, 15, 23, 69, 141, 182
Schur continued fraction, 5
Schur lemma, 142, 365
Schur section, 368
Schwarz inequality, 180
Schwarz lemma, 24-26
spectral factor, 61, 63, 135, 138, 156, 178, 356
spectral measure, 346
stable system, 360
stationary process, 343
stochastic process, 343
strict past, 344
subharmonic function, 17
Szegő condition, 33, 61, 135, 155, 173, 175, 191, 195, 209, 346
Szegő kernel, 61, 135, 175, 176, 178, 184, 189
Szegő polynomial, 33, 74, 103, 155, 161, 174
Szegő problem, 60, 173, 174
T-fraction, 105, 338
Thévenin theorem, 370
time domain, 344
transmission modes, 352
transmission zeros, 352
unpredictable process, 344
Wold decomposition, 345