[0,oo)) = J-. A e A 2 n V 2 , r € A 2 n V 2 , 0p,r(<) = (P + r)tp+r~l + ptp~l is convex on R+ and , t>0 a + <po, (TTJ).

6.1 Asymptotically Stationary Processes

The definition of asymptotic stationarity is the exact analogue of the one for the abelian case.

DEFINITION 6.3 Let $\{X(t), t \in G\}$ be an $H$-valued stochastic process on a $\sigma$-compact amenable group $G$, and let $K(t,\tau)$ be its covariance function (27). The process $X$ is called asymptotically stationary with asymptotically stationary covariance $K(\tau)$ if conditions (i)-(iii) and (4) of Definition 2.2 are satisfied with $+$ replaced by group multiplication in $G$ and $K$ is independent of the Følner sequence $\{A_n\}$.

As in the abelian case, stationary processes on $G$ are asymptotically stationary, and one can prove that every asymptotically stationary covariance $K$ is continuous and positive definite. To appreciate the notion of an associated spectral measure in the nonabelian case, note that we have $K = K(e)\varphi$ for some $\varphi \in P_1(G)$. If $B(G)$ is given the weak-* topology relative to its duality with $C^*(G)$, then $P_0(G)$ is a compact, convex set, and on $P_0(G)$ the relative weak-* topology is the topology of uniform convergence on compact sets ([18], Theorem 3.31; [12], Theorem 13.5.2). Since $G$ is $\sigma$-compact, this topology is metrizable. Thus given -l[ I (i.e.
(16)
NONSQUARE CONSTANTS OF ORLICZ SPACES
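(A1) In addition, if $F_\Phi(t) = t\phi(t)/\Phi(t)$ is decreasing for $t > 0$, then (16) becomes
$J(L^{(\Phi)}[0,\infty)) = 2^{1/C_\Phi}$,   (17)
where $C_\Phi = \lim_{t\to\infty} F_\Phi(t)$ as in (7).
(A2) On the other hand, if $F_\Phi(t)$ is increasing for $t > 0$, then (16) reduces to
$J(L^{(\Phi)}[0,\infty)) = 2^{1/C_\Phi^0}$,   (18)
where $C_\Phi^0 = \lim_{t\to 0} F_\Phi(t)$.
Part B. Suppose that $\phi$ is a convex function on $\mathbb{R}^+$. Then
$J(L^{(\Phi)}[0,\infty)) = 2\tilde\beta_\Phi$.   (19)
(B1) If $F_\Phi(t) = t\phi(t)/\Phi(t)$ is decreasing for $t > 0$, then (19) reduces to
$J(L^{(\Phi)}[0,\infty)) = 2^{1-1/C_\Phi^0}$.   (20)
(B2) If $F_\Phi(t)$ is increasing for $t > 0$, then (19) becomes
$J(L^{(\Phi)}[0,\infty)) = 2^{1-1/C_\Phi}$.   (21)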
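Proof. Part A. Since $\phi$ is concave, one has that
$\Phi(Kv) = \int_0^{Kv} \phi(s)\,ds \le K^2\Phi(v)$.
Letting $K = \sqrt{2}$ and $v = \Phi^{-1}(u)$, we obtain that $\sqrt{2}\,\Phi^{-1}(u) \le \Phi^{-1}(2u)$, $u > 0$, or
$2\tilde\beta_\Phi = \sup_{u>0} \frac{2\Phi^{-1}(u)}{\Phi^{-1}(2u)} \le \sqrt{2} \le \sup_{u>0} \frac{\Phi^{-1}(2u)}{\Phi^{-1}(u)} = \frac{1}{\tilde\alpha_\Phi}$.
The above inequalities and (12) in Lemma 5 imply
$\frac{1}{\tilde\alpha_\Phi} \le J(L^{(\Phi)}[0,\infty))$.   (22)
Next we show
$J(L^{(\Phi)}[0,\infty)) \le \frac{1}{\tilde\alpha_\Phi}$   (23)
if $\phi$ is concave on $\mathbb{R}^+$. Note that by (3), (23) is equivalent to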
Z. D. Ren
(24)
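Now we prove (24), which will finish the proof of (16) together with (22). Let $G_\Phi(u) = \Phi^{-1}(u)/\Phi^{-1}(2u)$, $u > 0$. Then $\tilde\alpha_\Phi \le G_\Phi(u)$ and $\Phi[\Phi^{-1}(u)/G_\Phi(u)] = 2u$ for $u > 0$. For any $x \in L^{(\Phi)}[0,\infty)$ with $\rho_\Phi(x) = 1$, by letting $u = \Phi(|x(t)|)$, $x(t) \ne 0$, one has that $\tilde\alpha_\Phi \le G_\Phi[\Phi(|x(t)|)]$ and
$\Phi\left(\frac{|x(t)|}{G_\Phi[\Phi(|x(t)|)]}\right) = 2\Phi(|x(t)|)$.   (25)
Therefore, we get from (25)
$\rho_\Phi\left(\frac{x}{\tilde\alpha_\Phi}\right) \ge 2\rho_\Phi(x) = 2$.   (26)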
If $k_x > 0$ satisfies the equality $\rho_\Phi(2x/k_x) = 2$, then $k_x \ge 2\tilde\alpha_\Phi$ by (26), which proves (24) in view of (10) in Lemma 4. We have proved (16).
(A1) If, in addition, $F_\Phi(t)$ is decreasing on $[0,\infty)$, then $C_\Phi = \lim_{t\to\infty} F_\Phi(t)$ exists and $C_\Phi < \infty$ since $\Phi \in \Delta_2$, and $G_\Phi(u) = \Phi^{-1}(u)/\Phi^{-1}(2u)$ is also decreasing on $u \in (0,\infty)$ by Lemma 2. Thus, Lemma 3 implies
$\tilde\alpha_\Phi = \lim_{u\to\infty} G_\Phi(u) = 2^{-1/C_\Phi}$,
proving (17) by (16).
(A2) If $F_\Phi(t)$ is increasing, then $G_\Phi(u)$ is also increasing, again by Lemma 2. It follows from Lemma 3 that
$\tilde\alpha_\Phi = \lim_{u\to 0} G_\Phi(u) = 2^{-1/C_\Phi^0}$,
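which proves (18) again by (16).
Part B. If $\phi$ is convex, then $\phi(Kt) \ge K\phi(t)$ for $K \ge 1$ and $t \ge 0$, and hence
$2\tilde\beta_\Phi = \max\left(\frac{1}{\tilde\alpha_\Phi}, 2\tilde\beta_\Phi\right) \le J(L^{(\Phi)}[0,\infty))$.   (27)
Next we show
$J(L^{(\Phi)}[0,\infty)) \le 2\tilde\beta_\Phi$,   (28)
which will finish the proof of (19) together with (27). Note that $\tilde\beta_\Phi \ge G_\Phi(u) = \Phi^{-1}(u)/\Phi^{-1}(2u)$ for all $u > 0$. For any $x \in L^{(\Phi)}[0,\infty)$ with $\rho_\Phi(x) = 1$, the inequalities $\tilde\beta_\Phi \ge G_\Phi[\Phi(|x(t)|)]$ for $x(t) \ne 0$ and (25) imply
$\rho_\Phi\left(\frac{x}{\tilde\beta_\Phi}\right) \le 2$.   (29)
If $k_x > 0$ satisfies $\rho_\Phi(2x/k_x) = 2$, then $k_x \le 2\tilde\beta_\Phi$ by (29), which proves (28) by (11) in Lemma 4.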
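The function $G_\Phi(u) = \Phi^{-1}(u)/\Phi^{-1}(2u)$ used in the proof can be explored numerically. The following is only a sketch under stated assumptions: the N-function $\Phi(u) = |u|^{p+r} + |u|^p$ with the illustrative values $p = 3$, $r = 1$ is chosen here for convenience, the inverse is computed by bisection, and the helper names (`big_phi`, `big_phi_inv`, `G`) are hypothetical. For this choice $F_\Phi(t) = (4t+3)/(t+1)$ is increasing, so $G_\Phi$ should increase from $2^{-1/3}$ (as $u \to 0$) to $2^{-1/4}$ (as $u \to \infty$).

```python
def big_phi(u, p=3.0, r=1.0):
    """Illustrative N-function Phi(u) = |u|^(p+r) + |u|^p (p, r are assumed values)."""
    a = abs(u)
    return a ** (p + r) + a ** p


def big_phi_inv(y, p=3.0, r=1.0):
    """Inverse of Phi on [0, inf) by bisection (Phi is continuous and increasing there)."""
    lo, hi = 0.0, max(1.0, y)  # Phi(hi) >= y because Phi(v) >= v for v >= 1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if big_phi(mid, p, r) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def G(u, p=3.0, r=1.0):
    """G_Phi(u) = Phi^{-1}(u) / Phi^{-1}(2u)."""
    return big_phi_inv(u, p, r) / big_phi_inv(2.0 * u, p, r)


# Sample G on a logarithmic grid and compare with the two limiting values.
grid = [10.0 ** k for k in range(-9, 10)]
values = [G(u) for u in grid]
print("min of G on grid:", min(values), " vs 2^(-1/3) =", 2 ** (-1 / 3))
print("max of G on grid:", max(values), " vs 2^(-1/4) =", 2 ** (-1 / 4))
print("2 * max of G    :", 2 * max(values), " vs 2^(1-1/4) =", 2 ** 0.75)
```

The grid only approximates the infimum and supremum of $G_\Phi$, so the printed numbers are indicative rather than exact values of $\tilde\alpha_\Phi$ and $\tilde\beta_\Phi$.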
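(B1) In addition, if $F_\Phi(t)$ is decreasing, then $C_\Phi^0 = \lim_{t\to 0} F_\Phi(t)$ exists and $C_\Phi^0 < \infty$ since $\Phi \in \Delta_2$. It follows from Lemmas 2 and 3 that
$\tilde\beta_\Phi = \lim_{u\to 0} G_\Phi(u) = 2^{-1/C_\Phi^0}$,
proving (20) by (19).
(B2) On the other hand, if $F_\Phi(t)$ is increasing, then $C_\Phi = \lim_{t\to\infty} F_\Phi(t) < \infty$, and by Lemmas 2 and 3 we have
$\tilde\beta_\Phi = \lim_{u\to\infty} G_\Phi(u) = 2^{-1/C_\Phi}$,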
which proves (21) again by (19). □
Remark 1. Theorem 1 is also valid for the Orlicz function space $L^{(\Phi)}(\Omega, \Sigma, \mu)$, where $\mu(\Omega) = \infty$ with $\mu$ being nonatomic.
Similarly to Theorem 1 we can prove the following result on the Orlicz function space $L^{(\Phi)}[0,1]$ with the usual Lebesgue measure.
Theorem 2. Let $(u) = /Q is a concave function on IT1", then ])<
a$
(30)
where a$ is as in (13) and
(Ai) In addition, if F$(t] = t±t is decreasing on (0, oo), then inequality (30) deduces to equality
(32) where C$= lim $$. t^cc ^vt^ (^2) If F$(t) is increasing on (0, oo), then (30) becomes ])<2%,
(33)
where Cg = lim ^. Part B. Suppose that > is convex on R+ , then 2/3i<J(L(*)[0,l])<2^,
(34)
where /3$ is as in (13) and
(35)
(Bi) In particular, if F$(t) is decreasing on (0, oo), then (34) becomes l])<2~^.
(36)
(.82) If F$(t) is increasing on (0, oo), then (34) deduces to equality 1]) = 2 1 ~^.
(37)
Remark 2. Inequalities (30) is also proved by Yan [11], independently. It should be pointed out that Theorem 3(ii) in Yan [12] is incorrect. Let (fi,S,/^) be a nonatomic measure space with /^(fi) < oo. Then Theorem 2 is also valid for the space Z/ $ )(fi) if we replace 0*$ in (31), /3|, in (35), and |^jf} in (33), fr^ in (36) by
and
respectively.

3 Examples
To illustrate the main results in the above section, we give several examples. Example 1. Consider N-function $(u) = /Q
0
1
t -: = r, (e-1) 1 ^'
e-l
It is easily seen that 4> is concave and
e
_—
TTTT^
T-.
;
-T7,
e - l < t < 0 0 .
is decreasing for t > 0 (cf. Rao and Ren [7]). Note that C$ = limF$(i) = 2,F$(e- 1) = e- 1,(7$ = lim F$(t) = 2 - e"1 and that $ e A 2 n V 2 . By t—>00
Theorem 1 (A\ ) and Theorem 2 (A\ ) we have Example 2. Let $p,\(u) - \u\p+x + \u\p, Kp
is increasing on (0,oo). Since C$o =p and <& ^(2) = 1, from Theorem 1 p,\ "' (A 2 ) and Theorem 2 (^2) one gets
and
Example 3. Consider N-function $p(u) = u\plu(l + u\) with p > 2. It is seen that
and
Remark 3. It should also be noted that Example 5 in Yan [12] is wrong. In fact, if $M(u) = |u|[\ln(1+|u|)]^2$, then $C_M = \lim_{t\to\infty} tM'(t)/M(t) = 1$, so that $M \notin \nabla_2(\infty)$ and $J(L^{(M)}[0,1]) = 2$ by Theorem 4 in Rao and Ren [8, p.56].
Example 4. Let $\Phi_{p,r}(u) = |u|^{p+r} + |u|^p$, 2
is increasing on (0, oo). Since C$p r = p + r, we have from Theorem 1 (13%) and Theorem 2 (52) Remark 4. Consider N-function Q(u) = |u|3 + 2 ti|z. Its derivative Q'(t) = 3(t2 + 1 2 ) is neither convex nor concave on R. In fact, Q' is convex on [0, ^],
but concave on [|,oc). Since
2
—
is increasing on (0, oo), one gets that ~O.Q = lim GQ(U) = 2~s , pQ = lim GQ(U) u —»0
u —*oc
2~ 3 . Prom Lemma 5 and the fact that Q € A 2 n V 2 we obtain 23 = max
-,2]30 ) < J ( L [ 0 , o o ) ) < 2. V«Q /
It is an open problem to calculate the exact value of J(L^^[0, oo)). 4
A note on
Let $l^{(\Phi)}$ be the Orlicz sequence space defined by an N-function $\Phi$, equipped with the gauge norm. For the values of $J(l^{(\Phi)})$ and $g(l^{(\Phi)})$ defined by (1) and (2), Ji and Zhan [5] obtained a useful result similar to Lemma 4 in Section 1. With some misprints corrected and some refinement, we restate it as follows. Lemma 6 (Ji and Zhan ([5], Theorem 2)) Let <E>(u) = JQ
sup (fc x > 0 : p* f^U H. IMI ( *)=i *\k^J ^)
(38)
(ii) If (j) is a convex function on [0, 2~ 1 (1)], then =
fcx>0:P<E>=i.
inf
(39)
Since $\Phi \notin \Delta_2(0) \cap \nabla_2(0)$ iff $J(l^{(\Phi)}) = 2$, we may assume that $\Phi \in \Delta_2(0) \cap \nabla_2(0)$ in this section. By using Lemma 6, Yan [13] obtained some estimates for $J(l^{(\Phi)})$, which can be generalized as follows.
Theorem 3. Let $\Phi(u) = \int_0^{|u|} \phi(t)\,dt$ denote an N-function and let $\Phi \in \Delta_2(0) \cap \nabla_2(0)$. Part A. Suppose that
; 5^' where
(40)
(Ai) In addition, if F$(t) = t-^r is decreasing on (0,$~ a (l)], then (41) deduces to equality
(Az) On the other hand, if F<s>(t) is increasing on (0, $~1(1)], then (41) deduces to equality -2C*,
(43)
where C$ = limF$(t). Part B. Suppose that <j> is convex on [0,2$~1(1)], then 2/3' < J(i(*>) < 2ft,,
(44)
where & = BUP
{ $-l(u\ l] 0 S^' ^-* / • \
(Bi) In particular, if F$(t) is decreasing on (0, $-1(l)], then (44) deduces to equality J(J<*>) = 2 5 £.
(46)
(^2) On the other hand, if F$(t] is increasing on (0, $~1(1)], then (44) turns to equality ' $-1(1) ' Proof. Part A. Under the assumption, (40) has been proved by Yan [13]. (Ai) Further, if F$(t} is decreasing on (0, $~1(1)], then G$(u) = $-1/2^) is also decreasing on (0, ^] (cf. Rao and Ren ([8], p.154)), or Yan [11]). Thus, one gets
which proves (42) by (40). (^2) In this case, C| < oo since $ e A2(0), and by Lemma 3 _ i o4 = a$ = lim G$(u) = 2 c* , u—>0
implying (43) again by (40).
(45)
Part B. Under the hypothesis, (44) was proved by Yan [13]. We now prove (B\) and (-82). (BI) Suppose that > is convex on (0, 23>~1(1)] and F$ is decreasing on (0, ^(l)]. It is seen that C% < oo and that u—>0
which finishes the proof of (46) by (44). (.62) Suppose that (f) is convex on (0, 23>~1(1)] and (0, ^(l)]. It is seen that C$ < oo and that
implying (47) again by (44).
is increasing on
D
Example 5. (Yan [13]) Let $\Phi(u) = (1+|u|)\ln(1+|u|) - |u|$. Then $\Phi \in \Delta_2(0) \cap \nabla_2(0)$, $\phi(t) = \ln(1+t)$ is concave on $\mathbb{R}^+$ and $F_\Phi(t) = t\phi(t)/\Phi(t)$ is decreasing for $t > 0$ (cf. Rao and Ren [8], p. 95). By Theorem 3 (A1) one gets
$J(l^{(\Phi)}) \approx 1.487$.
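The numerical value in Example 5, which reads as 1.487, can be cross-checked with a short computation. The sketch below rests on an assumption that is not stated explicitly in the surviving text: since $G_\Phi(u) = \Phi^{-1}(u)/\Phi^{-1}(2u)$ is decreasing on $(0, 1/2]$, its infimum there is $G_\Phi(1/2)$, and the value of $J(l^{(\Phi)})$ is taken to be $1/G_\Phi(1/2) = \Phi^{-1}(1)/\Phi^{-1}(1/2)$. The function names are hypothetical.

```python
import math


def big_phi(u):
    """Phi(u) = (1 + |u|) ln(1 + |u|) - |u| from Example 5."""
    a = abs(u)
    return (1.0 + a) * math.log1p(a) - a


def big_phi_inv(y):
    """Inverse of Phi on [0, inf) by bisection (Phi is continuous and increasing there)."""
    lo, hi = 0.0, 1.0
    while big_phi(hi) < y:  # enlarge the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if big_phi(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed closed form: J = Phi^{-1}(1) / Phi^{-1}(1/2).
ratio = big_phi_inv(1.0) / big_phi_inv(0.5)
print("Phi^{-1}(1)   =", big_phi_inv(1.0))  # equals e - 1, since Phi(e - 1) = 1
print("Phi^{-1}(1/2) =", big_phi_inv(0.5))
print("ratio         =", ratio)
```

Under this assumption the ratio agrees with the printed value to three decimal places, which supports reading the garbled result as 1.487.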
Example 6. Let &pt\(u) be given in Example 2. By using Theorem we obtain = 2* .
Example 7. Let $\Phi_p(u) = |u|^p\ln(1+|u|)$ with $2 < p < \infty$, as in Example 3. It is seen from Theorem 3 (B1) that

Example 8. (Yan [13]) Consider $\Psi(v) = e^{|v|} - |v| - 1$, complementary to $\Phi(u)$ in Example 5. Since $\Psi'(s) = e^s - 1$ is convex on $\mathbb{R}^+$ and $F_\Psi(s) = s\Psi'(s)/\Psi(s)$ is increasing on $(0,\infty)$, by Theorem 3 (B2) one obtains
References
1. Chen, S. T., Geometry of Orlicz Spaces, Dissertationes Math., 356(1996), 1-204.
2. Gao, J. and K. S. Lau, On the geometry of spheres in normed linear spaces, J. Austral. Math. Soc., A 48(1990), 101-112.
3. James, R. C., Uniformly nonsquare Banach spaces, Ann. of Math., 80(1964), 542-550.
4. Ji, D. H. and T. F. Wang, Nonsquare constants of normed spaces, Acta Sci. Math. (Szeged), 59(1994), 421-428.
5. Ji, D. H. and D. P. Zhen, Some equivalent representations of nonsquare constants and its applications, Northeast. Math. J., 15(1999), 439-444.
6. Rao, M. M. and Z. D. Ren, Theory of Orlicz Spaces, Marcel Dekker, New York, 1991.
7. Rao, M. M. and Z. D. Ren, Packing in Orlicz sequence spaces, Studia Math., 126(1997), 235-251.
8. Rao, M. M. and Z. D. Ren, Applications of Orlicz Spaces, Marcel Dekker, New York, 2002.
9. Ren, Z. D., Nonsquare constants of Orlicz Spaces, Lect. Notes in Pure and Applied Math., Marcel Dekker, 186(1997), 179-197.
10. Schäffer, J. J., Geometry of Spheres in Normed Spaces, Lect. Notes in Pure and Applied Math., Marcel Dekker, 20(1976).
11. Yan, Y. Q., Some results on packing in Orlicz sequence spaces, Studia Math., 147(2001), 73-88.
12. Yan, Y. Q., An estimate of nonsquare constants of Orlicz function spaces, J. of Suzhou Univ., 17(4)(2001), 1-5.
13. Yan, Y. Q., The values of nonsquare constants of Orlicz sequence spaces, Soochow J. of Math., (to appear).
Asymptotically Stationary and Related Processes

Bertram M. Schreiber
Department of Mathematics
Wayne State University
Detroit, MI 48202 U.S.A.
E-mail: [email protected]

Dedicated to my friend and colleague Professor M.M. Rao
Abstract

We survey the known properties of asymptotically stationary processes and discuss their relationship with other processes with finite variances. The appropriate notions of asymptotic stationarity, harmonizability, and other classes of processes are recalled and discussed, first for abelian locally compact groups and then for nonabelian groups, using the notion of a Følner sequence. Issues such as prediction, filtering, and estimation of the associated spectral measure are discussed.

1 INTRODUCTION

Asymptotically stationary processes on $\mathbb{R}$ or $\mathbb{Z}$ were first introduced by Kampé de Fériet and Frenkiel [34], Parzen [50], and Rozanov [64] in an effort to study nonstationary processes which have some features that reflect those of stationary processes. In this paper we survey the notion of asymptotic stationarity and its relationship with other classes of processes that have been introduced to extend the class of stationary processes in a meaningful way. We confine our attention to so-called $L^2$-processes, that is, those for which means and variances exist.

Most of our exposition will involve abelian groups, since much of the general interest in nonstationary processes and fields revolves around processes parameterized by $\mathbb{R}$, $\mathbb{Z}$, $\mathbb{R}^n$, and $\mathbb{Z}^n$. In the final section we shall discuss processes on nonabelian groups. We refer the reader to the excellent book of Folland [18] for background material from Harmonic Analysis.

Thus if $G$ is a locally compact group and $(\Omega, \mathcal{F}, P)$ is a (complete) probability space, by a process or field $X = \{X_t, t \in G\}$ one usually means a mapping $X : G \to L^2(\Omega, P)$. It is usually assumed that each $X_t$ has mean 0, so that $E(X_s\overline{X_t})$ represents the covariance of $X_s$ and $X_t$, but that does not seem to affect any of the arguments developed below. So we shall just refer to a process as a measurable mapping $X : G \to H$, where $H$ is a Hilbert space, whose inner product is denoted by $\langle\cdot,\cdot\rangle$ and norm by $\|\cdot\|$. Thus we are including vector-valued processes. Without loss of generality, we may assume that $H = \overline{\mathrm{span}}\{X_t, t \in G\}$. We shall assume that our process $X$ is locally integrable on $G$ with respect to a left Haar measure on $G$ and that it is mean-square continuous; further assumptions will be introduced below.
2 ASYMPTOTICALLY STATIONARY PROCESSES

Let $G$ be a $\sigma$-compact, locally compact, abelian (LCA) group, written additively, denote a Haar measure on $G$ by $m$ or $dt$, and let $\hat G$ denote its character (dual) group, i.e., the group of continuous homomorphisms $t \mapsto (t,\gamma)$ of $G$ into the group $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$. The dual group of $\hat G$ is $G$, the characters on $\hat G$ being given by the binary form above with the elements of $G$ fixed. If $f$ is a bounded function on a set $E$, then we shall denote the usual supremum norm of $f$ on $E$ by $\|f\|_E$.

Recall that a process $X = \{X_t, t \in G\}$ is called stationary if, setting
$K(t,\tau) = \langle X_{t+\tau}, X_t\rangle$,   (1)
we have
$K(t,\tau) = K(0,\tau) = K(\tau)$, $t, \tau \in G$.
Since $K$ is continuous and positive definite on $G$, there is a finite, positive measure $\mu$ on $\hat G$, called the spectral measure of $X$, such that $K$ is the Fourier-Stieltjes transform of $\mu$:
$K(\tau) = \int_{\hat G} (\tau,\gamma)\,d\mu(\gamma)$.   (2)
Let $G = \mathbb{R}$, so $\hat G = \hat{\mathbb{R}}$, the "dual line," and $(t,\gamma) = \exp(i\gamma t)$. In an effort to define a class of processes that are "nearly stationary," Kampé de Fériet and Frenkiel considered the class of real-valued processes $\{X_t, t \in \mathbb{R}\}$ that satisfy the following condition:
For $T > 0$, let
$R_T(\tau) = \frac{1}{T}\int_{|\tau|/2}^{T-|\tau|/2} E(X_{t-\tau/2}X_{t+\tau/2})\,dt$.
Then the limit
$\lim_{T\to\infty} R_T(\tau) = R(\tau)$   (3)
exists for all $\tau \in \mathbb{R}$, in which case $R(\tau)$ was called the correlation function of $X$. The function $R$ is obviously positive definite also, so if $R$ is continuous (at 0), then it is the Fourier-Stieltjes transform of a measure, called the associated spectral measure of $X$. If $X$ is stationary, then $R = K$, so the class of processes introduced by Kampé de Fériet and Frenkiel, called the KF class by Rao [57], includes the class of stationary processes. If $X$ is periodically correlated in the sense that $K(t,\tau)$ is periodic in $t$ for all $\tau$, then Kampé de Fériet and Frenkiel show that $X$ is in the KF class. They conclude with a numerical example illustrating the approximation of $R$ by $R_T$.

At about the same time Parzen [50] and Rozanov [64] introduced a similar definition, which applies to the same class of processes and which Parzen termed asymptotically stationary. Specifically, Parzen assumes the real-valued process $X$ possesses fourth moments and defines
$R_T(\tau) = \frac{1}{T}\int_0^T X_{t+\tau}X_t\,dt$.
Suppose there is a function $R(\tau)$ such that
$\lim_{T\to\infty} E\left[|R_T(\tau) - R(\tau)|^2\right] = 0$, $\tau \in \mathbb{R}$.
Then he calls $R$ the covariance function of $X$ and shows that it is positive definite. He develops conditions for $X$ to be asymptotically stationary in terms of the covariance $K(t,\tau)$ and provides several examples. These include a signal whose amplitude is randomly modulated, a signal whose frequency is randomly modulated, and an autoregressive time series (in discrete time) before reaching a stationary state. Later Bhagavan [4] and then Anh and Lunney [1] studied the KF class, ultimately removing the requirement of the existence of fourth moments, leaving only the assumption that $R$ be continuous at 0. Bhagavan showed that strongly harmonizable processes are asymptotically stationary (see below) and obtained a mean ergodic theorem for asymptotically stationary processes; this result was later augmented by Rao [57].

The definition of Kampé de Fériet and Frenkiel was extended to real-valued processes on $\mathbb{R}^n$ by Anh and Lunney [2] in a natural way. Namely,
for T = (T!, . . . ,T n ), t = (*i, . . . ,t n ), r = (n, . . . ,T n ) e R n , define K(t,r) &!•••&„ when TJ! < 1}, j = 1, . . . , n and RT(T) = 0 otherwise. Here
for $j = 1, \ldots, n$. Again $R_T$ is positive definite, and Anh and Lunney call $X$ asymptotically stationary if (3) holds.

Note that the definitions of asymptotic stationarity above only involve the values of $X_t$ for $t > 0$ when $G = \mathbb{R}$ and for $t$ in the positive cone when $G = \mathbb{R}^n$. Thus even when $n = 1$, the behavior of $X_t$ for $t < 0$ is not taken into account (unless $X$ is stationary), and as $n \to \infty$ less and less of the nature of $X$ plays a role in the definition, so the function $R(\tau)$ carries less and less information about the process. Moreover, when $G \ne \mathbb{R}^n$ or $\mathbb{Z}^n$, there is no notion of positivity. (The exception here is the collection of ordered groups considered by Helson and Lowdenslager [26], [27]. For an application of this setting to prediction theory, see the work of Mandrekar and Nadkarni [43].) So we shall adopt the more symmetric notion of an asymptotically stationary process below.

DEFINITION 2.1 By a Følner sequence in $G$ we mean an increasing sequence $\{A_n\}$ of compact sets in $G$ such that $G = \bigcup_{n=1}^\infty A_n$, and for every $x \in G$,
$\lim_{n\to\infty} \frac{m((A_n + x)\,\triangle\,A_n)}{m(A_n)} = 0$,
where $A \triangle B$ stands for the symmetric difference of the sets $A$ and $B$.

It is well known (cf. [29] §18, [22]) that every locally compact abelian group contains Følner sequences. Any Følner sequence satisfies
$\lim_{n\to\infty} \frac{1}{m(A_n)}\int_{A_n} f(t)\,dm(t) = M(f)$, $f \in AP(G)$,
where $AP(G)$ is the space of (uniform) almost periodic functions on $G$ and $M$ is the (uniquely defined) invariant mean on $AP(G)$ (Hewitt and Ross [29], Sec. 18). The reader is reminded that a bounded, continuous function on $G$ is called almost periodic if the set of its translates is relatively compact in the space $C_b(G)$ of bounded continuous functions on $G$, or equivalently,
if it is the uniform limit of a sequence of linear combinations of characters (Hewitt and Ross [29], Sec. 18).

Examples of Følner sequences are easy to construct and verify in common examples of groups $G$. Here are a few.

EXAMPLES
(1) If $T_k > 0$ and $T_k \to \infty$, then $\{[-T_k, T_k]\}$ is a Følner sequence in $\mathbb{R}$. Similarly, if $T_1^k, \ldots, T_n^k > 0$ and $T_j^k \to \infty$, $j = 1, \ldots, n$, then
$\{[-T_1^k, T_1^k] \times \cdots \times [-T_n^k, T_n^k]\}$
is a Følner sequence in $\mathbb{R}^n$.
(2) Consider the field $G_p$ of $p$-adic numbers (or more generally the group $G_a$ of $a$-adic numbers) [29], [63]. The group $Z_p$ of $p$-adic integers is an open, compact subgroup of $G_p$ such that if $G_n = p^{-n}Z_p$, then $\{G_n\}$ is an increasing sequence of open compact subgroups whose union is $G_p$. It follows immediately that $\{G_n\}$ is a Følner sequence in $G_p$.
(3) Let $G = \mathbb{Z}^\infty$, the group of all integer sequences $t = (t_1, t_2, \ldots)$ such that for some $N = N(t)$, $t_i = 0$ for all $i > N$. For any positive integer sequence $d_n \to \infty$, let
$A_n = [-n,n]^{d_n} = \{(t_1, \ldots, t_{d_n}, 0, 0, \ldots) : |t_i| \le n, 1 \le i \le d_n\}$.
Then $\{A_n\}$ is a Følner sequence in $\mathbb{Z}^\infty$ (cf. [24], Example 1).

DEFINITION 2.2 Let $X$ be a process on $G$. We make the following assumptions:
(i) $\sup_n \frac{1}{m(A_n)}\int_{A_n} \|X(t)\|^2\,dm(t) < \infty$
(ii) $\limsup_n \frac{1}{m(A_n)}\int_{A_n} \|X(t+h) - X(t)\|^2\,dm(t) \to 0$ as $h \to 0$
(iii) $\lim_{n\to\infty} \frac{1}{m(A_n)}\int_{(A_n+x)\triangle A_n} \|X(t)\|^2\,dm(t) = 0$
A process $X$ satisfying (i)-(iii) is called asymptotically stationary if for any Følner sequence of sets $\{A_n\}$, the finite limit
$K(\tau) = \lim_{n\to\infty} \frac{1}{m(A_n)}\int_{A_n} K(t,\tau)\,dm(t)$   (4)
exists for every $\tau \in G$ and is independent of the choice of $\{A_n\}$. The function $K(\tau)$ is called the asymptotically stationary covariance of the process $\{X(t)\}$. In particular, if $G = \mathbb{R}^n$, then $X$ is asymptotically stationary if and only if, using the notation above,
=
lim
Ti,...,Tn->oo If
• ln J_
'
exists for all $\tau$. If $X$ is stationary, then it is clearly asymptotically stationary and the asymptotically stationary covariance of $X$ is just the covariance. In some ways asymptotically stationary processes can be handled by methods similar to stationary processes. The class of asymptotically stationary processes on $G$ is fairly large. For instance, we have the following elementary facts.

PROPOSITION 2.3 (i) If $X$ and $Y$ are asymptotically stationary processes on $G$ such that $X_s \perp Y_t$ for all $s, t \in G$ ($X$ and $Y$ are uncorrelated), then $X + Y = \{X_t + Y_t, t \in G\}$ is asymptotically stationary.
(ii) If $X^n = \{X^n_t, t \in G\}$, $n = 1, 2, \ldots$ are asymptotically stationary processes on $G$, $Y$ is a process on $G$, and $\|X^n_t - Y_t\| \to 0$ uniformly, then $Y$ is asymptotically stationary.
(iii) Let $X$ be a bounded and asymptotically stationary process on $G$ and $\mu$ be a complex (Borel) measure on $G$. Set
$Y_t = \int_G X_{t-u}\,d\mu(u)$.
Then $Y$ is asymptotically stationary.

Proof: Assertions (i) and (ii) are trivial, so we concentrate on (iii). Write
$K_Y(t,\tau) = \left\langle \int_G X_{t+\tau-u}\,d\mu(u), \int_G X_{t-v}\,d\mu(v) \right\rangle = \int_G\int_G \langle X_{t+\tau-u}, X_{t-v}\rangle\,d\mu(u)\,d\bar\mu(v)$.
Thus
$\frac{1}{m(A_n)}\int_{A_n} K_Y(t,\tau)\,dt = \int_G\int_G \frac{1}{m(A_n)}\int_{A_n} \langle X_{t+\tau-u}, X_{t-v}\rangle\,dt\,d\mu(u)\,d\bar\mu(v) = \int_G\int_G \frac{1}{m(A_n)}\int_{A_n - v} \langle X_{t+\tau+v-u}, X_t\rangle\,dt\,d\mu(u)\,d\bar\mu(v)$.
Since $\{A_n\}$ is a Følner sequence, the inner integrals converge boundedly to $K(\tau + v - u)$. Hence $Y$ is asymptotically stationary, and
$K_Y(\tau) = \int_G\int_G K(\tau + v - u)\,d\mu(u)\,d\bar\mu(v)$.
The following proposition ([24], Proposition 2) is an elementary consequence of our assumptions and the definition of a Følner sequence. We shall sketch the proof here because we are using the minimalist assumptions (i)-(iii) and because we wish to refer to it below.

PROPOSITION 2.4 If $X$ is asymptotically stationary, then the function $K$ is continuous and positive definite on $G$.

Proof: For $K(\tau)$ defined by (4), let us denote the expression on the right of (4) by $M(K(\cdot,\tau))$. It follows from Assumptions (i) and (iii) that for all $s, \tau \in G$,
$M(K(\cdot + s, \tau)) = M(K(\cdot,\tau))$.
For any $a_1, \ldots, a_N \in \mathbb{C}$ and $\tau_1, \ldots, \tau_N \in G$,
$\sum_{i,j=1}^N a_i\bar a_j K(\tau_i - \tau_j) = \sum_{i,j=1}^N a_i\bar a_j M(K(\cdot + \tau_j, \tau_i - \tau_j)) = M\Big(\sum_{i,j=1}^N a_i\bar a_j \langle X_{\cdot+\tau_i}, X_{\cdot+\tau_j}\rangle\Big) = M\Big(\Big\|\sum_{i=1}^N a_i X_{\cdot+\tau_i}\Big\|^2\Big) \ge 0$.
Thus $K$ is positive definite. Turning to the continuity, the positive definiteness implies that it suffices to show continuity at 0. For each $n$ the Cauchy-Schwarz inequality gives
$\left|\frac{1}{m(A_n)}\int_{A_n} K(t,h)\,dm(t) - \frac{1}{m(A_n)}\int_{A_n} K(t,0)\,dm(t)\right| \le \frac{1}{m(A_n)}\int_{A_n} |\langle X_{t+h} - X_t, X_t\rangle|\,dm(t) \le \left[\frac{1}{m(A_n)}\int_{A_n} \|X_t\|^2\,dm(t)\right]^{1/2}\left[\frac{1}{m(A_n)}\int_{A_n} \|X_{t+h} - X_t\|^2\,dm(t)\right]^{1/2}$.
Letting $n \to \infty$ we obtain
$|K(h) - K(0)| \le K(0)^{1/2}\left[\limsup_{n\to\infty} \frac{1}{m(A_n)}\int_{A_n} \|X_{t+h} - X_t\|^2\,dm(t)\right]^{1/2}$.
Invoking (ii), we obtain the desired continuity.
DEFINITION 2.5 Let $X$ be asymptotically stationary. Since $K$ is positive definite, it can be represented by a measure $\mu$ on $\hat G$ as in (2). This measure $\mu$ is called the associated spectral measure of $X$.
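To make the averaging in (4) concrete, here is a minimal sketch for $G = \mathbb{R}$ with the Følner sets $A_T = [-T, T]$. Everything specific in it is an assumption chosen for illustration, not taken from the text: the process is a hypothetical two-frequency process $X_t = \xi_1 e^{i\lambda_1 t} + \xi_2 e^{i\lambda_2 t}$ whose amplitudes have Gram matrix `C` (correlated, so $X$ is not stationary), and the function names are invented. The averages of $K(t,\tau)$ over $A_T$ should converge to $\sum_j C_{jj} e^{i\lambda_j\tau}$, so the associated spectral measure puts mass $C_{jj}$ at $\lambda_j$.

```python
import cmath
import math

# Hypothetical process X_t = xi_1 e^{i*lam_1*t} + xi_2 e^{i*lam_2*t};
# C[j][k] = <xi_j, xi_k> is the (assumed) Gram matrix of the amplitudes.
LAMS = (1.0, 2.0)
C = ((1.0, 0.5), (0.5, 1.0))


def folner_ratio(T, x):
    """m((A + x) symmetric-difference A) / m(A) for A = [-T, T] in R."""
    overlap = max(0.0, 2.0 * T - abs(x))
    return 2.0 * (2.0 * T - overlap) / (2.0 * T)


def K(t, tau):
    """K(t, tau) = <X_{t+tau}, X_t>, expanded through the Gram matrix."""
    total = 0j
    for j, lj in enumerate(LAMS):
        for k, lk in enumerate(LAMS):
            total += C[j][k] * cmath.exp(1j * lj * (t + tau)) * cmath.exp(-1j * lk * t)
    return total


def averaged_K(tau, T):
    """(1 / m(A)) * integral of K(t, tau) over A = [-T, T], in closed form."""
    total = 0j
    for j, lj in enumerate(LAMS):
        for k, lk in enumerate(LAMS):
            w = lj - lk
            # mean of e^{i w t} over [-T, T] is sin(wT)/(wT), or 1 when w = 0
            mean_exp = 1.0 if w == 0 else math.sin(w * T) / (w * T)
            total += C[j][k] * cmath.exp(1j * lj * tau) * mean_exp
    return total


def limit_K(tau):
    """Candidate limit: only the diagonal (j = k) terms survive the averaging."""
    return sum(C[j][j] * cmath.exp(1j * lj * tau) for j, lj in enumerate(LAMS))


for T in (10.0, 100.0, 1000.0):
    print(T, folner_ratio(T, 1.0), abs(averaged_K(0.7, T) - limit_K(0.7)))
```

The first printed column after $T$ is the Følner ratio for the shift $x = 1$, which equals $|x|/T$ here; the second is the distance between the averaged covariance and its limit, which decays like $1/T$ in this example.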
3 EXAMPLES

In this section we shall recall the definitions and basic properties of various previously-studied classes of processes $X$ on the LCA group $G$ and discuss their relationship to asymptotic stationarity. All of these classes are easily seen to contain all stationary processes.

3.1 Harmonizable Processes

The notion of a harmonizable process was first introduced by Loève [40]. Nowadays these processes are termed strongly harmonizable, per the following definition.

DEFINITION 3.1 The process $X = \{X_t, t \in G\}$ is called strongly harmonizable if there is a complex Borel measure $\mu$ on $\hat G \times \hat G$ such that
$K(t,\tau) = \int_{\hat G\times\hat G} (t,\gamma)(\tau,\gamma')\,d\mu(\gamma,\gamma') = \hat\mu(t,\tau)$, $t, \tau \in G$.

Clearly $X$ is stationary if and only if it is strongly harmonizable and the measure $\mu$ is nonnegative and concentrated on $\{0\} \times \hat G$. The following was first proved on $\mathbb{R}$ by Rozanov [64] and later appears in Bhagavan [4].

PROPOSITION 3.2 Every strongly harmonizable process is asymptotically stationary.
Proof: Since the averages over any Følner sequence of any character $\gamma \ne 1$ converge to 0, the Dominated Convergence Theorem implies that the limit in (4) exists and equals
$K(\tau) = \int_{\hat G} (\tau,\gamma)\,d\mu(0,\gamma)$, $\tau \in G$,
i.e., $K(\tau)$ is the Fourier-Stieltjes transform of the measure $d\nu(\gamma) = d\mu(0,\gamma)$ on $\hat G$.

To study the larger class of weakly harmonizable processes on $G$, we need to recall the notion of a bimeasure.

DEFINITION 3.3 Let $E$ and $F$ be locally compact Hausdorff spaces, and denote by $C_0(E)$ and $C_0(F)$ the usual spaces of continuous functions that "vanish at infinity," normed via the supremum norms. By a bimeasure on $E \times F$ we mean a bounded, complex-valued bilinear form $u$ on $C_0(E) \times C_0(F)$, i.e., a bilinear form such that
$\|u\| = \sup\{|u(f,g)| : \|f\|_E \le 1, \|g\|_F \le 1\} < \infty$.   (5)
Equivalently, if $V(E,F) = C_0(E)\,\hat\otimes\,C_0(F)$ is the projective tensor product of $C_0(E)$ and $C_0(F)$, consisting of all functions of the form
$\phi(s,t) = \sum_{n=1}^\infty f_n(s)g_n(t)$   (6)
such that
$\sum_{n=1}^\infty \|f_n\|_E\|g_n\|_F < \infty$,   (7)
normed by setting $\|\phi\|$ to be the infimum of all sums (7) such that $\phi$ is represented as in (6), then each bimeasure $u$ on $E \times F$ corresponds to a continuous linear functional on $V(E,F)$.

Another way to look at a bimeasure $u$ is to view it as a set function. By considering the functional $u^{**}$ (second adjoint) of the bimeasure $u$, acting on $C_0(E)^{**} \otimes C_0(F)^{**}$, and then restricting $u^{**}$ to bounded Borel-measurable functions on $E$ and $F$ (as elements of $C_0^{**}$), we are led to the following definition. Via Stone's Theorem one can easily see that Definitions 3.3 and 3.4 are equivalent. Also note that the extension of $u$ to $u^{**}$ implies that defining $u(f,g) = u^{**}(f,g)$ makes $u$ well defined for $f$ and $g$ bounded continuous functions.

DEFINITION 3.4 Let $(\Omega_1, \mathcal{F}_1)$ and $(\Omega_2, \mathcal{F}_2)$ be measurable spaces. A bimeasure $u$ on $\Omega_1 \times \Omega_2$ is a bounded bilinear form on $L^\infty(\Omega_1,\mathcal{F}_1) \times L^\infty(\Omega_2,\mathcal{F}_2)$, in the sense of (5), where $L^\infty$ denotes the space of bounded measurable functions, such that for all $f \in L^\infty(\Omega_1,\mathcal{F}_1)$, $g \in L^\infty(\Omega_2,\mathcal{F}_2)$, $A \in \mathcal{F}_1$ and $B \in \mathcal{F}_2$,
$\mu_1(A) = u(\chi_A, g)$ and $\mu_2(B) = u(f, \chi_B)$
define countably additive measures on $\mathcal{F}_1$ and $\mathcal{F}_2$, respectively.

The following definition is due to Rao [57] and Rozanov [64].

DEFINITION 3.5 The process $\{X_t, t \in G\}$ is weakly harmonizable if there exists a bimeasure $u$ on $\hat G \times \hat G$ such that
$K(t, s-t) = \langle X_s, X_t\rangle = u((s,\cdot), \overline{(t,\cdot)})$.   (8)
If such a bimeasure exists, it is unique ([21], Theorem 2.4). It follows from (8), by considering linear combinations of characters on $G$, that $u$ must be positive definite in the sense that for any $a_1, \ldots, a_n \in \mathbb{C}$ and $f_1, \ldots, f_n$ bounded measurable functions on $\hat G$,
$\sum_{i,j=1}^n a_i\bar a_j u(f_i, \bar f_j) \ge 0$.   (9)
In particular,
$u(f,\bar g) = \overline{u(g,\bar f)}$ and $u(f,\bar f) \ge 0$   (10)
for all $f$ and $g$. It is known [71],[60],[48] that there exist weakly harmonizable processes that are not asymptotically stationary. Nevertheless, we shall take this opportunity to show how a modern approach to bimeasures leads easily to the most important properties of weakly harmonizable processes and elucidates these properties. This approach appears in [45], but it does not seem to be widely known to practitioners in the area of nonstationary processes. Once the key tool in this approach is introduced, the properties we wish to mention are readily obtained.

The key tool in this approach to weakly harmonizable processes is the celebrated theorem of Grothendieck, which he dubbed the "fundamental theorem of the metric theory of tensor products." This theorem has been generalized in very deep ways and is viewed today as part of the theory of operator spaces and completely bounded maps [8],[54],[17]. The version we present here is a special case adapted to our needs. The connection between Grothendieck's inequality and the theory of harmonizable processes seems to have been observed first in [45] (cf. [49], [61]).
THEOREM 3.6 Let $u$ be a bimeasure on $E \times E$. Then there is a probability measure $\mu$ on $E$ and a universal constant $K_G$ such that for all $f, g \in C_0(E)$,
$|u(f,g)| \le K_G\|u\|\,\|f\|_{L^2(\mu)}\|g\|_{L^2(\mu)}$.   (11)
Similarly, if $u$ is a bimeasure on $(\Omega,\mathcal{F}) \times (\Omega,\mathcal{F})$, then there exists a probability measure $\mu$ on $\mathcal{F}$ such that (11) holds for all $f, g \in L^\infty(\Omega,\mathcal{F})$.

It follows from (11) that for $u$ as in Grothendieck's theorem there is an operator $T$ on $L^2(E,\mu)$ such that
$u(f,\bar g) = \langle Tf, g\rangle_{L^2(\mu)} = \int_E (Tf)\bar g\,d\mu$.   (12)
If $u$ is the bimeasure determined by a weakly harmonizable process, then it follows immediately from (10) that $T$ is a self-adjoint nonnegative definite operator. Let $S$ denote the nonnegative definite square root of $T$. Then we have
$u(f,\bar g) = \langle Sf, Sg\rangle_{L^2(\mu)}$.   (13)
THEOREM 3.7 Let $X$ be a weakly harmonizable process on $G$. Then there exists a countably additive $H$-valued (i.e. stochastic) measure $Z$ on $\hat G$ such that
$X_t = \int_{\hat G} (t,\gamma)\,dZ(\gamma)$.   (14)
Conversely, if $X$ is represented as in (14), then $X$ is weakly harmonizable and the corresponding bimeasure $u$ is given by
$u(A,B) = \langle Z(A), Z(B)\rangle$   (15)
for $A$ and $B$ Borel subsets of $\hat G$.

Proof: Let $K$ be represented as in (8), and let $S$ be the operator appearing in (13). For each Borel set $E$ in $\hat G$, let $Z'(E) = S(\chi_E)$. Then $Z'$ is clearly finitely additive. If $\{E_n\}$ is a pairwise disjoint sequence of sets with union $E$, then it follows immediately from the monotone convergence theorem that the series $\sum \chi_{E_n} = \chi_E$ converges in $L^2(\mu)$. Since $S$ is a bounded linear operator, it follows that $\sum Z'(E_n) = Z'(E)$ in $L^2(\mu)$. Thus $Z'$ is a vector measure. For any simple function $f$ on $\hat G$, we have
$Sf = \int_{\hat G} f\,dZ'$,   (16)
so this formula persists for all $f \in L^2(\mu)$. By (8) and (13) we have
$\langle S((s,\cdot)), S((t,\cdot))\rangle = \langle X_s, X_t\rangle$, $s, t \in G$.   (17)
It follows by consideration of linear combinations of characters on $G$ and corresponding linear combinations of the vectors $X_t$ that (17) induces a unitary isomorphism $V$ of the closure $M$ of the range of $S$ in $L^2(\mu)$ onto $H$ such that $VS((t,\cdot)) = X_t$, $t \in G$. Let $Z(A) = VZ'(A)$. Then (17) implies that (14) holds.

Conversely, if (14) holds, first note that clearly (15) extends to the identity
$u(f,\bar g) = \left\langle \int_{\hat G} f\,dZ, \int_{\hat G} g\,dZ \right\rangle$   (18)
for simple functions $f$ and $g$. Since every bounded function is the uniform limit of simple functions, it follows that $u$ is a bimeasure, in the sense of Definition 3.4. Moreover, for any simple function $f$ and bounded function $g$, the definition of $Z$ says that (18) holds, and then it holds for all bounded $f$ and $g$. Combining (18) with (14) gives (8).

REMARK It is interesting to note that very early on the canonical examples of bimeasures were seen to be of the type (15), except that one could allow for taking the inner product of two vector measures, say $Z_1$ and $Z_2$. For instance see [32].
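A finite toy model may help fix ideas about bimeasures of the type (15). The sketch below is an illustration under explicit assumptions, not a construction from the text: the underlying space is a hypothetical three-point set, the vector measure takes values in $H = \mathbb{R}^2$, all functions are real (so no complex conjugates appear), and the names `Z_ATOMS`, `integrate`, `u` and `norm_u` are invented. It checks symmetry and nonnegativity as in (10) and computes the norm (5) by brute force.

```python
from itertools import product

# Hypothetical H-valued measure on the three-point set {0, 1, 2}:
# Z({i}) = Z_ATOMS[i], with H = R^2 (real scalars only).
Z_ATOMS = [(1.0, 0.0), (0.5, 0.5), (0.0, 2.0)]


def inner(a, b):
    """Euclidean inner product in H = R^2."""
    return sum(x * y for x, y in zip(a, b))


def integrate(f):
    """Vector integral of f against Z: sum over points i of f[i] * Z({i})."""
    dim = len(Z_ATOMS[0])
    return tuple(sum(fi * z[d] for fi, z in zip(f, Z_ATOMS)) for d in range(dim))


def u(f, g):
    """Bimeasure u(f, g) = <int f dZ, int g dZ> for real f, g, in the spirit of (18)."""
    return inner(integrate(f), integrate(g))


def indicator(subset):
    """Indicator function of a subset of {0, 1, 2}, as a list of values."""
    return [1.0 if i in subset else 0.0 for i in range(len(Z_ATOMS))]


def norm_u():
    """sup |u(f, g)| over |f|, |g| <= 1; a real bilinear form attains it at sign vectors."""
    signs = list(product((-1.0, 1.0), repeat=len(Z_ATOMS)))
    return max(abs(u(f, g)) for f in signs for g in signs)


A, B = indicator({0, 1}), indicator({1, 2})
print("u(A, B) =", u(A, B))   # <Z(A), Z(B)> as in (15)
print("u(B, A) =", u(B, A))   # symmetry in the real case
print("u(A, A) =", u(A, A))   # nonnegative
print("||u||   =", norm_u())
```

In this real, finite setting the bimeasure is just the Gram matrix of the vectors `Z_ATOMS`, which is why positivity is automatic; the general case in the text replaces the finite sums by vector integrals.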
THEOREM 3.8 Let $X$ be weakly harmonizable. Then there exist a Hilbert space $K$ containing $H$ and a stationary process $Y$ with values in $K$ such that $X_t = PY_t$, $t \in G$, where $P : K \to H$ is the orthogonal projection.

Proof: We preserve the notation of the proof of Theorem 3.7. Let $S$ and $V$ be as above, and set $W = VS$. Then $W$ maps $L^2(\hat G,\mu)$ onto $H$ with $W((t,\cdot)) = X_t$. Let $H_1$ be the Hilbert space $L^2(\hat G,\mu) \oplus H$, and let $W'$ be the operator on $H_1$ such that $W'(f) = (W(f), 0)$ if $f \in L^2(\hat G,\mu)$ and $W' = 0$ on $H$. Let $w = \|W'\| = \|W\|$. Recall that by a classical theorem of Nagy, the contraction operator $w^{-1}W'$ has a unitary dilation $U$ on a Hilbert space $K$ containing $H_1$. That is, there is a Hilbert space $K$ containing $H_1$ and a unitary operator $U$ on $K$ such that if $P : K \to H_1$ is the orthogonal projection, then $w^{-1}W'(\xi) = PU(\xi)$, $\xi \in H_1$. Now, the process $\gamma_t = (t,\cdot)$, $t \in G$ with values in $L^2(\hat G,\mu)$ is clearly stationary. Hence when considered as a process with values in $H_1$ and then in $K$ it is stationary, so the process $Y_t = wU(\gamma_t)$, $t \in G$ is stationary, since $U$ is unitary. Finally,
$PY_t = wPU(\gamma_t) = W'(\gamma_t) = W(\gamma_t) = X_t$, $t \in G$,
as desired.

THEOREM 3.9 Let $X$ be a continuous process on $G$. The following are equivalent.
(i) $X$ is weakly harmonizable.
(ii) There exists $C > 0$ such that
$\left\|\int_G f(t)X_t\,dt\right\| \le C\|\mathcal{F}f\|_{\hat G}$, $f \in L^1(G)$,
where $\mathcal{F}f$ denotes the Fourier transform of $f$. The integral in (ii) is an ordinary vector integral in $H$.

The condition (ii) of this theorem was introduced in 1953 by Bochner, who termed it V-boundedness.

Proof: (Cf. [21], §2.) (i) implies (ii). If $X$ is weakly harmonizable, let $u$ be the defining bimeasure. By Definition 3.3 and the subsequent discussion, the measures $\mu_1$ and $\mu_2$ in Definition 3.4 are regular. Thus the function $t \mapsto \gamma_t$ is continuous in $L^1(\hat G,|\mu_1|)$ and $L^1(\hat G,|\mu_2|)$. Hence if $f \in L^1(G)$, then
ASYMPTOTICALLY STATIONARY PROCESSES
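$$\left\| \int_G f(t)X_t\,dt \right\|^2 = \int_G\int_G f(s)\overline{f(t)}\,\langle X_s,X_t\rangle\,ds\,dt = \int_G\int_G f(s)\overline{f(t)}\,u(\gamma_s,\overline{\gamma_t})\,ds\,dt$$
$$= u\!\left( \int_G f(s)\gamma_s\,ds,\ \overline{\int_G f(t)\gamma_t\,dt} \right) = u(\hat f,\overline{\hat f}) \le \|u\|\,\|\hat f\|_\infty^2.$$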
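Thus $X$ is V-bounded. (ii) implies (i). Let $C$ be as in (ii). Then for all $f, g \in L^1(G)$,
$$\left| \int_G\int_G f(s)\overline{g(t)}\,\langle X_s,X_t\rangle\,ds\,dt \right| = \left| \left\langle \int_G f(s)X_s\,ds,\ \int_G g(t)X_t\,dt \right\rangle \right| \le \left\| \int_G f(s)X_s\,ds \right\| \left\| \int_G g(t)X_t\,dt \right\| \le C^2\,\|\hat f\|_\infty\|\hat g\|_\infty.$$
Since $A(\widehat G) = \{\hat f : f \in L^1(G)\}$ is dense in $C_0(\widehat G)$, there is a unique bounded bilinear form $u$ on $C_0(\widehat G) \times C_0(\widehat G)$ of norm at most $C^2$ such that if $\varphi = \hat f$ and $\psi = \overline{\hat g}$, then
$$u(\varphi,\psi) = \int_G\int_G f(s)\overline{g(t)}\,\langle X_s,X_t\rangle\,ds\,dt.$$
By extending $u$ to bounded functions in the canonical way, using properties of the second adjoint $u^{**}$, and running $f$ and $g$ through translates, by $s$ and $t$ respectively, of an approximate identity in $L^1(G)$, we conclude that (8) holds.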
3.2 Almost Periodically Correlated Processes

DEFINITION 3.10 A process $X$ is called almost periodically correlated if $K(t,\tau)$ is uniformly continuous on $G \times G$ and uniformly almost periodic on $G$ as a function of $t$ for each $\tau$.

This definition is due to Gladyshev [19]. Such processes were studied in depth (on $\mathbb R$) by Hurd and Leskow [30],[31],[39]. In particular, necessary and sufficient conditions for a strongly harmonizable process to be almost periodically correlated are presented in [30], while problems related to statistical estimation of coefficients in the almost periodic Fourier series expansion of the functions $K(\cdot,\tau)$ are studied in [31],[39].
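As a hedged numerical illustration (not taken from the text), consider on $G = \mathbb R$ the toy process $X_t = \cos(t)\,Z$ with $E|Z|^2 = 1$, so that $K(t,\tau) = \langle X_{t+\tau}, X_t\rangle = \cos(t+\tau)\cos(t)$ is periodic in $t$ for each $\tau$. The sketch averages $K(\cdot,\tau)$ over the intervals $[-T,T]$ and compares the result with $\tfrac12\cos\tau$. The process, the function names and the midpoint rule are assumptions made for this example only.

```python
import math

def covariance(t, tau):
    """K(t, tau) = <X_{t+tau}, X_t> for the toy process X_t = cos(t) * Z, E|Z|^2 = 1."""
    return math.cos(t + tau) * math.cos(t)

def interval_average(func, T, steps=20000):
    """Midpoint-rule approximation of (1 / 2T) * integral of func over [-T, T]."""
    h = 2.0 * T / steps
    total = sum(func(-T + (k + 0.5) * h) for k in range(steps))
    return total * h / (2.0 * T)

def averaged_covariance(tau, T):
    """Average of K(., tau) over the interval [-T, T]."""
    return interval_average(lambda t: covariance(t, tau), T)

if __name__ == "__main__":
    tau = 0.7
    for T in (10.0, 100.0, 1000.0):
        # The averages should approach cos(tau) / 2 as T grows.
        print(T, averaged_covariance(tau, T), 0.5 * math.cos(tau))
```

In this example the discrepancy is of order $1/(4T)$, since $\cos(t+\tau)\cos(t) = \tfrac12[\cos\tau + \cos(2t+\tau)]$ and the oscillating term averages out.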
Specifically, let $M$ be the unique translation-invariant mean on the space $AP(G)$, so if $\{A_n\}$ is any Følner sequence in $G$ and $f \in AP(G)$, then
$$\lim_{n\to\infty} \frac{1}{m(A_n)} \int_{A_n} f(t)\,dt = M(f). \tag{19}$$
An almost periodic function on $G$ has a Fourier series defined via the mean $M$. Thus if $X$ is almost periodically correlated we can write the Fourier expansion of $K$ as
$$K(t,\tau) \sim \sum_{\gamma \in \widehat G} \theta(\gamma,\tau)\,\gamma(t),$$
where for each $\tau \in G$ the function $\theta(\gamma,\tau)$ vanishes off of a countable set in $\widehat G$ (depending on $\tau$). In [31] the authors present strongly consistent and asymptotically normal estimators for the function $\theta(\gamma,\tau)$ when $G = \mathbb R$. The estimators are the natural ones and are somewhat similar to the estimators appearing in Theorem 4.1. The following proposition is now clear.

PROPOSITION 3.11 Every almost periodically correlated process on $G$ is asymptotically stationary.

One can enlarge the class of almost periodically correlated processes and preserve the asymptotic stationarity as follows. Recall that a bounded continuous function $f$ on $G$ is called weakly almost periodic if the set of translates $\{f_s : s \in G\}$ of $f$ is relatively compact in the weak topology on the space of bounded continuous functions on $G$ [14], [15], [16], [5]. Let $WAP(G)$ denote the set of all weakly almost periodic functions on $G$. Every function in $WAP(G)$ is uniformly continuous, and $WAP(G)$ is an algebra of functions. Moreover, the mean $M$ on $AP(G)$ extends uniquely to a translation-invariant mean on $WAP(G)$, so (19) holds for $f \in WAP(G)$ [11], [5]. Thus we may make the following definition.

DEFINITION 3.12 A process $X$ on $G$ is weakly almost periodically correlated if $K(t,\tau)$ is uniformly continuous on $G \times G$ and for all $\tau \in G$, $K(t,\tau)$ is weakly almost periodic in $t$.

It is now immediate that every weakly almost periodically correlated process is asymptotically stationary. If one transfers the almost periodicity from the "time" domain to the "frequency" domain, one is led to the almost harmonizable processes introduced by Rao in [57]. This class is a subset of the Cramer class defined below, but we include it here because it involves almost periodicity. It is a natural extension of the class of strongly harmonizable processes.

DEFINITION 3.13 The process $X$ on $G$ is almost harmonizable if there is a complex measure $\mu$ on $\widehat G \times \widehat G$ and a family $\{g(t,\gamma) : \gamma \in \widehat G\}$ of jointly
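continuous functions such that $g_\gamma(t) = g(t,\gamma)$ is an almost periodic function on $G$ for each $\gamma$ and such that the following hold:
(i) $\displaystyle\int_{\widehat G \times \widehat G} \|g_\gamma\|_\infty\,\|g_{\gamma'}\|_\infty\,d|\mu|(\gamma,\gamma') < \infty$.
(ii) For all $s, t \in G$,
$$\langle X_s, X_t\rangle = \int_{\widehat G \times \widehat G} g(s,\gamma)\,\overline{g(t,\gamma')}\,d\mu(\gamma,\gamma'). \tag{20}$$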
PROPOSITION 3.14 If $X$ is almost harmonizable, then it is almost periodically correlated, and hence asymptotically stationary.

Proof: Write $K'(s,t) = \langle X_s, X_t\rangle$. Then
$$K' = \int_{\widehat G \times \widehat G} g_\gamma \otimes \overline{g_{\gamma'}}\,d\mu(\gamma,\gamma')$$
as a vector integral in $AP(G \times G)$. Since $K' \in AP(G \times G)$, we see in particular that for all $\tau \in G$, $K(\cdot,\tau) = K'(\tau + \cdot, \cdot) \in AP(G)$.

REMARK The term almost harmonizable was introduced by Rao in [57] on $\mathbb R$ with a different condition on the functions $g(t,\gamma)$. Namely, the hypothesis there which replaces our condition (i) amounts to the assumption that for every compact set $C$ in $\widehat G$ the set $F(C) = \{g_\gamma : \gamma \in C\}$ lies in a norm-compact subset of $AP(G)$, from which it follows that all the translates of all the functions in $F(C)$ lie in a compact set in the space of bounded continuous functions on $G$. This is formulated using the Besicovitch definition of almost periodicity as follows on the line: For every $\varepsilon > 0$ there is a number $\ell_0 > 0$ such that each interval $I$ in $\mathbb R$ of length $\ell_0$ contains a number $\tau \in I$ for which
$$|g(t+\tau,\gamma) - g(t,\gamma)| < \varepsilon, \qquad t \in \mathbb R,\ \gamma \in C.$$
This definition does not lead to a process that is asymptotically stationary, however, as the following example shows.

EXAMPLE Let $G = \mathbb R$, and let $\mu$ be supported on the positive integer points of the diagonal in $\mathbb R \times \mathbb R$ with
$$\mu(\{(n,n)\}) = \frac{1}{n^2}, \qquad n \in \mathbb N.$$
Define $g(t,\gamma) = 0$ if $\gamma \notin \mathbb N$, and for $n \in \mathbb N$ let $g_n(t) = g(t,n)$ be a continuous, real-valued function on $[0,1]$ such that $g(t,m)g(t,n) = 0$ if $m \ne n$ and
$$\int_0^1 g^2(t,n)\,dt = n, \qquad n \in \mathbb N.$$
(One could easily modify this family so as to be jointly continuous, but that is immaterial.) Extend each $g_n$ to be periodic on $\mathbb R$ with period 1. Then clearly all of the functions $\{g_\gamma : \gamma \in C\}$ lie in a compact subset of $AP(\mathbb R)$ for any compact set $C \subset \mathbb R$, and there is a process $X$ on $\mathbb R$ for which $K(t,\tau)$ is given by condition (ii) of Definition 3.13. In fact,
$$K(t,\tau) = \sum_{n} \frac{1}{n^2}\,g(t+\tau,n)\,g(t,n) = \frac{1}{n_0^2}\,g(t+\tau,n_0)\,g(t,n_0)$$
if there is a unique $n_0$ for which the right-hand side is nonzero, and $K(t,\tau) = 0$ otherwise. For $T \in \mathbb N$, however, the common periodicity gives
$$\frac{1}{2T}\int_{-T}^{T} K(t,0)\,dt = \int_0^1 K(t,0)\,dt = \sum_n \frac{1}{n^2}\int_0^1 g_n^2(t)\,dt = \infty.$$
Thus $X$ is not asymptotically stationary.

3.3 The Karhunen Class

The following definition was introduced by Karhunen in 1947 [35]. The special case in which the functions $g(t,\gamma)$ have a certain type of oscillatory behavior gives the oscillatory processes of Priestley [55],[56].

DEFINITION 3.15 The process $X$ belongs to the Karhunen class if there is a measure space $(\Lambda,\mathcal F,\mu)$ and a set of complex-valued functions $g(t,\cdot) \in L^2(\Lambda,\mu)$, $t \in G$, such that
$$K(t,\tau) = \int_\Lambda g(t+\tau,\lambda)\,\overline{g(t,\lambda)}\,d\mu(\lambda). \tag{21}$$
The following theorem was first proved by Rao and appears as [60], Theorem 3.1. It follows immediately from Grothendieck's theorem, as portrayed in (17).

THEOREM 3.16 Every weakly harmonizable process is in the Karhunen class relative to a measure $\mu$ on $\widehat G$.

If $X$ is an almost harmonizable process, represented as in (20) with respect to a measure supported on the diagonal of $\widehat G \times \widehat G$, then $X$ is in the Karhunen class and is asymptotically stationary. On the other hand, since not all weakly harmonizable processes are asymptotically stationary, the same is true a fortiori for the Karhunen class.
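3.4 The Cramer Class

If we think of the Karhunen processes as having a "spectral measure" concentrated on the diagonal of $\Lambda \times \Lambda$, as suggested in the previous paragraph, we are led to the following definition, first suggested by Cramer in [7].

DEFINITION 3.17 The process $X$ belongs to the Cramer class if there is a measurable space $(\Lambda,\mathcal F)$, a measure $\nu$ on $(\Lambda \times \Lambda, \mathcal F \times \mathcal F)$, and a family $g_t(\lambda) = g(t,\lambda)$ of $\mathcal F$-measurable functions, $t \in G$, such that
(i) $g_s \otimes \overline{g_t} \in L^1(\Lambda \times \Lambda, \nu)$, $s, t \in G$,
(ii) $\displaystyle K(t,\tau) = \int_{\Lambda \times \Lambda} g(t+\tau,\lambda)\,\overline{g(t,\lambda')}\,d\nu(\lambda,\lambda')$.
We say that $X$ is in the weak Cramer class if each of the functions $g_t$ is bounded and there is a positive definite bimeasure $u$ on $\Lambda \times \Lambda$ such that
$$K(t,s-t) = \langle X_s, X_t\rangle = u(g_s,\overline{g_t}), \qquad s, t \in G. \tag{22}$$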
The relationship between the weak Cramer and Karhunen classes and the "spectral" representation of processes in the weak Cramer class is similar to the relationship between weakly harmonizable processes and stationary processes. A proof of the following theorem may be found in [6].

THEOREM 3.18 If $X$ is in the weak Cramer class, then it is in the Karhunen class. If $K$ is represented as in (22), then there exists a countably additive $H$-valued measure $Z$ on $\Lambda$ such that
$$X_t = \int_\Lambda g(t,\lambda)\,dZ(\lambda), \qquad t \in G, \tag{23}$$
and we have
$$u(A,B) = \langle Z(A), Z(B)\rangle, \qquad A, B \in \mathcal F. \tag{24}$$
Conversely, if $X$ is represented as in (23), then $X$ is in the weak Cramer class and its covariance function $K(s,t)$ is represented as in (22) with respect to the bimeasure given by (24).

Proof: If $X$ is in the weak Cramer class, then Grothendieck's theorem (3.6) implies that it is in the Karhunen class with respect to the measure $\mu$ and the functions $S(g_t)$ via (13). By (22) and (13), if $F$ is the closed subspace of $L^2(\Lambda,\mu)$ spanned by the functions $g_t$, $t \in G$, then $F$ and $H$ are unitarily equivalent. Let $Q : L^2(\Lambda,\mu) \to F$ be the orthogonal projection, and set $Z'(A) = QS(\chi_A)$. Then $Z'$ is a vector measure such that
$$S(f) = \int_\Lambda f\,dZ', \qquad f \in F.$$
The proof now proceeds as in the weakly harmonizable case.
The relationship between the Cramer classes and asymptotic stationarity is as elucidated earlier and in the following section. Rao [57] studied that relationship in depth for processes generated by stochastic difference equations and differential equations. The reader is referred to [57] for the details, as they are much too elaborate to be included here. His result on difference equations was extended to conditions for higher-order asymptotic stationarity (see the following section) by Swift [70].

3.5 Higher-Order Asymptotic Stationarity

As noted by Rao [60] and Swift [68], one can define weaker notions of asymptotic stationarity by considering iterations of the averaging process. Their notions on $\mathbb R$ are based on iterated averaging of the functions $R_T$ above over $[0,T]$. This translates to the following definition for processes on an LCA group $G$.

DEFINITION 3.19 For a process $X$, a Følner sequence $\{A_n\}$, and $s, \tau \in G$, let
$$K_n(s,\tau) = \frac{1}{m(A_n)} \int_{A_n} K(t+s,\tau)\,dt.$$
Set $K_n(0,\tau) = K_n(\tau)$. If $X$ satisfies assumptions (i)-(iii) of Definition 2.2, $\{K_n(\tau)\}$ converges for all $\tau$, and the limit is independent of the Følner sequence, then $X$ is asymptotically stationary. Otherwise, for each $p \in \mathbb N$ we can define averages $K_n^{(p)}(\tau)$ of order $p$ inductively by setting $K_n^{(1)} = K_n$,
$$K_n^{(p)}(s,\tau) = \frac{1}{n}\sum_{k=1}^{n} K_k^{(p-1)}(s,\tau), \qquad p > 1,$$
and $K_n^{(p)}(0,\tau) = K_n^{(p)}(\tau)$. The process $X$ is called asymptotically stationary of order $p$ if $p$ is the least integer such that assumptions (i)-(iii) of Definition 2.2 hold with first-order means replaced by means of order $p$, $K_n^{(p)}(\tau)$ converges for all $\tau$, and the limit function $K^{(p)}(\tau)$ is independent of the Følner sequence $\{A_n\}$.

Clearly a process that is asymptotically stationary of order $p$ will be asymptotically stationary of order $q$ for all $q > p$. For instance, the process in the example at the end of §3.2 could be modified so as to create a process which is asymptotically stationary of order $p$ for any $p$. In the setting of Rao and Swift cited above, each of the functions $R_T$ is positive definite, hence it follows immediately that the same is true for any
limit of iterated averages. For the sequences defined above, however, just as for the case of asymptotically stationary processes as defined herein, only the limit function $K^{(p)}$ is positive definite.

PROPOSITION 3.20 If $X$ is asymptotically stationary of order $p$, then $K^{(p)}$ is continuous and positive definite.

Proof: Let $\{A_n\}$ be a Følner sequence in $G$. It follows from our assumptions that for all $s, \tau \in G$,
$$\lim_{n\to\infty}\left[ K_n^{(p)}(s,\tau) - K_n^{(p)}(\tau) \right] = 0. \tag{25}$$
That is, if for any appropriate function $f$ on $G$ we let $M^{(p)}(f)$ denote the limit of $p$-th order averages as indicated in Definition 3.19, then we have $M^{(p)}(K(\cdot + s,\tau)) = M^{(p)}(K(\cdot,\tau))$. The proof that $K^{(p)}$ is positive definite now proceeds like that of Proposition 2.4. The proof of continuity in that Proposition may also be modified easily to show that $K^{(p)}$ is continuous.

In order to move the averaging process from the time domain to the frequency domain, Swift [68] introduced the notion of a $(c,p)$ summable [weak] Cramer process. Namely for $G = \widehat G = \mathbb R$, a process $X$ in the [weak] Cramer class, as defined in Definition 3.17, is called $(c,p)$ summable [weak] Cramer if the functions $g(\cdot + \tau,\gamma)\overline{g(\cdot,\gamma')}$ in the definition are $(c,p)$ summable in the usual Cesaro sense on $\mathbb R$ for all $\gamma, \gamma' \in \widehat G$. That is, define
$$a_T^{(1)}(\tau,\gamma,\gamma') = \frac{1}{T}\int_0^T g(t+\tau,\gamma)\,\overline{g(t,\gamma')}\,dt$$
and for $p > 1$ set
$$a_T^{(p)}(\tau,\gamma,\gamma') = \frac{1}{T}\int_0^T a_u^{(p-1)}(\tau,\gamma,\gamma')\,du.$$
Then $X$ is a $(c,p)$ summable [weak] Cramer process if $p$ is the smallest integer such that the functions $a_T^{(p)}$ are uniformly bounded and $\lim_{T\to\infty} a_T^{(p)}(\tau,\gamma,\gamma')$ exists uniformly in $\tau$ for all $\gamma$ and $\gamma'$. As above, a process that is $(c,p)$ summable [weak] Cramer will be $(c,q)$ summable [weak] Cramer for all $q > p$. Notice that this definition easily extends to random fields over $\mathbb R^n$. In fact, using the notion of a Følner sequence we may extend it further.

DEFINITION 3.21 Let $X$ be a [weak] Cramer process on $G$, defined as in Definition 3.17. For a Følner sequence $\{A_n\}$ on $G$, $\tau \in G$, and $\lambda, \lambda' \in \Lambda$, let
$$a_n^{(1)}(\tau,\lambda,\lambda') = \frac{1}{m(A_n)}\int_{A_n} g(t+\tau,\lambda)\,\overline{g(t,\lambda')}\,dt.$$
For each $p \in \mathbb N$ we may define averages $a_n^{(p)}(\tau,\lambda,\lambda')$ of order $p$ inductively by setting
$$a_n^{(p)}(\tau,\lambda,\lambda') = \frac{1}{n}\sum_{k=1}^{n} a_k^{(p-1)}(\tau,\lambda,\lambda'), \qquad p > 1.$$
The [weak] Cramer process $X$ is called $(c,p)$ summable [weak] Cramer if $p$ is the least integer such that the $a_n^{(p)}(\tau,\lambda,\lambda')$ are uniformly bounded and converge for all $\tau$, $\lambda$, and $\lambda'$ and the limit function $a^{(p)}(\tau,\lambda,\lambda')$ is independent of the Følner sequence $\{A_n\}$.

Again, a process that is $(c,p)$ summable [weak] Cramer will be $(c,q)$ summable [weak] Cramer for all $q > p$. For $G = \mathbb R$, using the definition above, Swift proves the following proposition as [68], Theorem 5.3.2. The proof is analogous to that of Proposition 3.2 and other arguments that appear earlier in this work. The details are left to the reader.

PROPOSITION 3.22 Suppose that $X$ is a process in the Cramer class on $G$ such that the measure $\nu$ in Definition 3.17 is finite. If $X$ is $(c,p)$ summable Cramer, then $X$ is asymptotically stationary of order $p$.

In a similar way one can define the class of $(c,p)$ summable Karhunen processes (cf. [68], §5.3) and obtain relationships of the type described above. For the details on $\mathbb R$, see [68].

4 ESTIMATION OF THE ASSOCIATED SPECTRAL MEASURE

For any measurable function $f$ on $G$, set
$$\tilde f(\gamma) = \lim_{n\to\infty} \frac{1}{m(A_n)}\int_{A_n} f(t)\,\overline{\gamma(t)}\,dt, \qquad \gamma \in \widehat G, \tag{26}$$
provided that $f$ is locally integrable and the limit exists for all $\gamma \in \widehat G$ and is independent of the choice of the Følner sequence $\{A_n\}$. The function $\tilde f$ is called the almost periodic Fourier transform of $f$, since it is equal to $M(f\overline{\gamma})$ when $f \in AP(G)$. It is well known that a measure $\mu$ on $\widehat G$ is discrete if and only if the function $K$ defined by (2) is almost periodic. Thus if the process $X$ on $G$ is asymptotically stationary, then its associated spectral measure is discrete if and only if its asymptotically stationary covariance is almost periodic. In general, of course, this will not be the case. But it is often of interest to estimate the discrete part of the measure, or equivalently, the almost periodic component of the covariance. We refer the reader to [24] for more motivation and examples.
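The almost periodic Fourier transform can be approximated numerically on $G = \mathbb R$ with the intervals $A_n = [-T, T]$ and characters $\gamma(t) = e^{i\gamma t}$. The following minimal sketch is an illustration only: the test function $2e^{it} + \tfrac12 e^{i\sqrt2 t}$, the function names and the midpoint rule are assumptions, not part of the text. Averaging $f(t)\overline{\gamma(t)}$ over a long interval should recover the coefficient of $f$ at frequency $\gamma$ and give approximately zero elsewhere.

```python
import cmath
import math

def ap_fourier_coefficient(func, gamma, T, steps=50000):
    """Approximate (1 / 2T) * integral over [-T, T] of func(t) * exp(-i*gamma*t) dt."""
    h = 2.0 * T / steps
    total = 0j
    for k in range(steps):
        t = -T + (k + 0.5) * h          # midpoint of the k-th subinterval
        total += func(t) * cmath.exp(-1j * gamma * t)
    return total * h / (2.0 * T)

def sample_function(t):
    """Almost periodic test function with frequencies 1 and sqrt(2)."""
    return 2.0 * cmath.exp(1j * t) + 0.5 * cmath.exp(1j * math.sqrt(2.0) * t)

if __name__ == "__main__":
    for gamma in (1.0, math.sqrt(2.0), 0.3):
        print(gamma, ap_fourier_coefficient(sample_function, gamma, 500.0))
```

The two frequencies are incommensurable, so the test function is almost periodic but not periodic. The finite-$T$ error decays like $1/T$, which is why the interval is taken fairly long.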
In the stationary case, even for scalar processes, it is well known that standard estimators for both the discrete and absolutely continuous parts of the spectral measure are not consistent. This difficulty is usually overcome by using smoothing kernels and spectral windows (see e.g. Priestley [55]). This approach is essential to the work of Hurd and Leskow mentioned above. As first observed in Hanin and Schwarz [25], however, replacing the usual Fourier transform by its almost periodic analogue leads to a simple, natural, consistent estimator of the discrete component of the spectral measure which is free of artificial kernels and spectral windows. Moreover, this is true not only for stationary processes but also for asymptotically stationary processes. Their work was later generalized by Hanin and Schreiber, yielding the following theorem. For the proof and some further elaborations, see [24].

In order to state the theorem, we must go to the classical setting of a vector-valued process over a probability space. Thus we consider a process $X$ on $G$ with values in $L^2(\Omega,\mathcal F,P;H)$. Our estimator is constructed from a sequence of independent samples $\{X^i\}$, $1 \le i \le N$, of the process $X$.

THEOREM 4.1 Let $\{X_t, t \in G\}$ be an $L^2(\Omega,P;H)$-valued asymptotically stationary stochastic process which satisfies conditions (ii) and (iii) of Definition 2.2. Suppose that the following conditions are also met:
(iv) $\displaystyle\lim_{n\to\infty}\sup_{\tau \in A_n}\left| \frac{1}{m(A_n)}\int_{A_n} K(t,\tau)\,dm(t) - K(\tau) \right| = 0$;
(v) $\displaystyle\sup_n \sup_{\tau \in A_n} \frac{1}{m(A_n)}\int_{A_n} \left(E\|X_{t+\tau}\|^4\right)^{1/2} dm(t) < \infty$.
Then for each $\gamma \in \widehat G$,
$$\hat\mu_{n,N}(\gamma) = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{m(A_n)^2}\int_{A_n}\int_{A_n} \langle X^i_{t+\tau}, X^i_t\rangle\,\overline{\gamma(\tau)}\,dm(t)\,dm(\tau)$$
is a consistent estimator for $\mu(\{\gamma\})$, in the sense that $E|\hat\mu_{n,N}(\gamma) - \mu(\{\gamma\})|^2 \to 0$ as $n, N \to \infty$.
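To make the shape of such an estimator concrete, here is a small Monte Carlo sketch for a scalar process on $G = \mathbb R$. Everything in it is an assumption chosen for illustration: the process $X_t = Z_1 e^{it} + \tfrac12 Z_2 e^{2it}$ with independent complex Gaussian $Z_1, Z_2$ of unit second moment (so the point masses at frequencies 1 and 2 are $1$ and $\tfrac14$), the window $A_n = (-\pi,\pi)$, the normalisation by $m(A_n)^2$, and all function names. The estimator is taken to be a sample mean, over $N$ independent paths, of the double average of $X_{t+\tau}\overline{X_t}\,e^{-i\gamma\tau}$.

```python
import cmath
import math
import random

def draw_path(rng):
    """One sample path t -> Z1*exp(it) + 0.5*Z2*exp(2it), with E|Z1|^2 = E|Z2|^2 = 1."""
    z1 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
    z2 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
    return lambda t: z1 * cmath.exp(1j * t) + 0.5 * z2 * cmath.exp(2j * t)

def estimate_point_mass(gamma, n_samples, grid=16, seed=0):
    """Sample mean of double averages of X(t+tau) * conj(X(t)) * exp(-i*gamma*tau)."""
    rng = random.Random(seed)
    half_width = math.pi                      # window A_n = (-pi, pi), one full period
    step = 2.0 * half_width / grid
    nodes = [-half_width + (k + 0.5) * step for k in range(grid)]
    total = 0j
    for _ in range(n_samples):
        path = draw_path(rng)
        acc = 0j
        for tau in nodes:
            # inner average over t of <X_{t+tau}, X_t>
            inner = sum(path(t + tau) * path(t).conjugate() for t in nodes)
            acc += inner * cmath.exp(-1j * gamma * tau)
        total += acc / (grid * grid)          # midpoint rule for (1/m(A)^2) * double integral
    return total / n_samples

if __name__ == "__main__":
    for gamma in (1.0, 2.0, 3.0):
        print(gamma, estimate_point_mass(gamma, 400))
```

Because the window covers a whole period, the cross terms between the two frequencies cancel on the grid, and the estimate at frequency 1 reduces to the sample mean of $|Z_1|^2$. The remaining error is purely statistical and shrinks as the number of samples grows.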
5 APPLICATIONS

Prediction and filtering of stationary processes on $\mathbb Z$ and $\mathbb R$ have been among the most carefully studied and useful problems from the point of view of applications of the spectral theory of such processes. Their study dates back to the work of Kolmogorov [37] and Wiener [72], and there are many summaries of the results and applications, such as [55] and [73]. There are numerous results aimed at extending these ideas to nonstationary processes, particularly to the class of harmonizable processes and its extensions. For instance, see [6], [59], [70]. Prediction problems for asymptotically stationary processes were studied by Niemi [45], [46], [48]. When one is dealing with asymptotically stationary processes one can envision the prediction and filtering problems as follows.

PREDICTION PROBLEM Given an asymptotically stationary process on $\mathbb Z$ or $\mathbb R$, let $s(T,t) = t + T\,\mathrm{sgn}\,t$. Calculate or estimate
$$\lim_{T\to\infty} \|\hat X_{t,T} - X_{s(T,t)}\|,$$
where $\hat X_{t,T}$ is the closest element to $X_t$ in $M_T$.

Formulated in this way, one can easily extend the notion of prediction to $\mathbb Z^n$, $\mathbb R^n$, or any LCA group. For instance, the Linear Prediction Problem on a general group $G$ becomes:

LINEAR PREDICTION PROBLEM Let $X$ be an asymptotically stationary process on $G$, with some further hypotheses, and $\{A_n\}$ be a Følner sequence in $G$. Find
$$\lim_{n\to\infty}\sup_{t \in G} \|X_{t,n} - X_t\|,$$
where $X_{t,n}$ denotes the closest element to $X_t$ in the closed linear span in $H$ of $\{X_s : s \in A_n\}$. Prove that this limit is independent of the choice of the Følner sequence.

As a first step in the direction of the linear prediction problem in $\mathbb R$, we have the following theorem, whose proof is based on an argument suggested in ([48], pp. 23-24).

THEOREM 5.1 Let $X$ be an asymptotically stationary process on $\mathbb R$. Let $Y$ be a stationary process on $\mathbb R$ whose covariance is $K$, the asymptotically stationary covariance of $X$. Denote by $X_{t,\tau}$ the best linear predictor of $X_{t+\tau}$ given $\{X_s, s \le t\}$, and define $Y_{t,\tau}$ similarly. Then
$$\limsup_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} \|X_{t+\tau} - X_{t,\tau}\|^2\,dt \le \|Y_{t+\tau} - Y_{t,\tau}\|^2 = \|Y_\tau - Y_{0,\tau}\|^2.$$
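Proof: For $a_1,\dots,a_n \in \mathbb C$ and $t_1,\dots,t_n \ge 0$,
$$\frac{1}{2T}\int_{-T}^{T} \|X_{t+\tau} - X_{t,\tau}\|^2\,dt \le \frac{1}{2T}\int_{-T}^{T} \Big\| X_{t+\tau} - \sum_{j=1}^{n} a_j X_{t-t_j} \Big\|^2 dt.$$
Thus
$$\limsup_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} \|X_{t+\tau} - X_{t,\tau}\|^2\,dt \le K(0) - 2\,\mathrm{Re}\sum_{j=1}^{n} a_j K(-t_j-\tau) + \sum_{i,j=1}^{n} a_i\overline{a_j}\,K(t_j - t_i) = \Big\| Y_\tau - \sum_{j=1}^{n} a_j Y_{-t_j} \Big\|^2.$$
Choosing an appropriate sequence of linear combinations, we have
$$\limsup_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} \|X_{t+\tau} - X_{t,\tau}\|^2\,dt \le \inf_{a_1,\dots,a_n;\,t_1,\dots,t_n} \Big\| Y_\tau - \sum_{j=1}^{n} a_j Y_{-t_j} \Big\|^2 = \|Y_\tau - Y_{0,\tau}\|^2,$$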
and the theorem follows.

With regard to prediction and its relationship to asymptotic stationarity, however, one should mention that an example due to Veilahti [71] quoted by Niemi in [48] shows that specific prediction information solely based on knowledge of the associated spectral measure is not likely to be available. Similarly one can formulate the filtering problem.

FILTERING PROBLEM Consider a process on $G$ written as
$$Y_t = F(X_t, W_t), \qquad t \in G,$$
where $X$ is an input process whose nature one would like to determine, $W$ is a "noise" process, and $Y$ is the observed behavior. (Sometimes the function $F$ is also a function of $t$.) One applies a filter (linear, Kalman, etc.) to $Y$ in
order to obtain or approximate $X$, and one would like to find an optimal filter, i.e. an estimator $\hat X_t$ which minimizes the distance $\|\hat X_t - X_t\|$, $t \in G$. If the process is asymptotically stationary, one would like to use this information to determine the effectiveness of the filtering process.

Already the following was shown in [24]. Let $X$, $Y$, and $W$ be as above with $F(x,y) = x + y$, and suppose that $X$ satisfies all of the hypotheses of Theorem 4.1. Suppose that $W$ is stationary and uncorrelated with $X$ and has covariance $K_W(t)$ tending to zero at infinity and uniformly bounded fourth moments. Then the estimator $\hat\mu_{n,N}(\gamma)$ constructed for the observed process $Y$ in Theorem 4.1 converges in the sense of Theorem 4.1 to the discrete part of the associated spectral measure of $X$.

Clearly information of this type bears on both problems elucidated here. Much more attention needs to be given to this subject, and we hope to return to it in the future. Another area in which the structure of processes discussed in this work appears is in the development of ergodic theorems. In the interest of brevity, we shall omit a discussion of this area.

6 NONABELIAN GROUPS

The spectral theory of processes on a locally compact nonabelian group is intimately connected to the structure of its space of irreducible unitary representations, as one would expect from the fact that the irreducible representations of an abelian group correspond to its characters. The theory was launched in the classical paper of Yaglom [75] (summarized in [74]), where the nature of stationary processes (termed homogeneous by Yaglom) on locally compact groups and homogeneous spaces is examined in detail. Recall that $\{X_t, t \in G\}$ with values in $H$ is called (left) stationary if $X$ is continuous on $G$ and
$$\langle X(gs), X(gt)\rangle = \langle X(s), X(t)\rangle, \qquad s, t, g \in G.$$
That is, if we look at the (right) covariance function
$$K(t,\tau) = \langle X(t\tau), X(t)\rangle, \tag{27}$$
then stationarity says
$$K(t,\tau) = \langle X(\tau), X(e)\rangle = K(\tau). \tag{28}$$
Just as in the abelian case, (27) and (28) imply that $K$ is positive definite on $G$. The tie-in with representation theory is now forced by the classical theorem of Gelfand, Naimark, and Segal to the effect that every continuous positive definite function $\varphi$ on $G$ has the form
$$\varphi(x) = \langle \pi(x)\xi, \xi\rangle, \qquad x \in G, \tag{29}$$
for some strongly continuous unitary representation $\pi$ of $G$ on a Hilbert space $H_\pi$ and some $\xi \in H_\pi$.

Let us summarize some relevant facts from the theory of unitary group representations. Let $B(G)$ denote the linear span of the set of continuous positive definite functions on $G$. Then $B(G)$ is an algebra; in fact it is a Banach algebra when it is normed in the following way. There is a canonical way to lift each unitary representation $\pi$ of $G$ to a *-representation of the convolution *-algebra $L^1(G)$; namely for $f \in L^1(G)$ and $\xi \in H_\pi$, let $\pi(f)$ be the operator defined by the vector integral
$$\pi(f)\xi = \int_G f(t)\,\pi(t)\xi\,dm(t). \tag{30}$$
Let $\Pi$ denote the set of all (unitary equivalence classes of) representations of $G$, and set $\|f\|_* = \sup\{\|\pi(f)\| : \pi \in \Pi\}$. If $L^1(G)$ is completed in this norm, one obtains a C*-algebra, called the group C*-algebra of $G$ and denoted by $C^*(G)$. The dual space of this algebra is precisely $B(G)$, which is given its natural norm as the dual space to $C^*(G)$. The duality is effected by defining the action of the function $\varphi \in B(G)$ on functions $f \in L^1(G)$ to be
$$\langle f, \varphi\rangle = \int_G f(t)\,\varphi(t)\,dm(t).$$
It follows immediately that if $\varphi$ is represented as in (29) then
$$\langle f, \varphi\rangle = \langle \pi(f)\xi, \xi\rangle.$$
When $G$ is abelian, $C^*(G)$ can be canonically identified with $C_0(\widehat G)$, $B(G)$ is the algebra of Fourier-Stieltjes transforms of complex measures on $\widehat G$, and the embedding of $L^1(G)$ in $C^*(G)$ is just the Fourier transform.

Let $P(G)$ stand for the set of all continuous positive definite functions on $G$. The elements of $P(G)$ whose values at the identity element of $G$ are less than or equal to [resp., equal to] one will be denoted by $P_0(G)$ [resp., $P_1(G)$]. Let $\widehat G$ denote the set of all (unitary equivalence classes of) irreducible unitary representations of $G$. We may assume that all the irreducible representations of the same dimension act on the same space, and since $G$ is assumed to be $\sigma$-compact, all the representation spaces are separable. Thus we may write $\widehat G = \widehat G_1 \cup \widehat G_2 \cup \cdots \cup \widehat G_\infty$, where for each $\nu \le \infty$ the elements of $\widehat G_\nu$ act on a Hilbert space $H_\nu$ of dimension $\nu$. For each $\nu$, fix a complete orthonormal set $\{\xi_k^\nu\}_{k=1}^{\nu}$ of $H_\nu$. Given a strongly continuous, unitary representation $\pi$ of $G$ on a Hilbert space $H_\pi$ and $\xi, \eta \in H_\pi$, let $\varphi^\pi_{\xi,\eta}(x) = \langle \pi(x)\xi, \eta\rangle$; if $\xi = \eta$, denote this function by $\varphi^\pi_\xi$ (these functions are usually called matrix coefficients of $\pi$).

Since we are primarily interested here in asymptotic stationarity and are considering the analogues of the classical version of that notion which involves convergence of a sequence of averages, we shall assume that our group $G$ is $\sigma$-compact and amenable, with Haar measure $m$ and identity element $e$. It is well known that this class of groups is precisely those that satisfy the "Følner condition," i.e., as in the abelian case, there exist Følner sequences in $G$: increasing sequences $\{A_n\}_{n=1}^{\infty}$ of nonempty, open, symmetric sets with compact closure such that $G = \bigcup_{n=1}^{\infty} A_n$, and for every $x \in G$,
$$\lim_{n\to\infty} \frac{m[(A_n x)\,\Delta\,A_n]}{m(A_n)} = 0. \tag{31}$$
An amenable group is one for which there exist (right) translation-invariant means $M$ on the space $C_b(G)$, and for any Følner sequence $\{A_n\}$,
$$M(f) = \lim_{n\to\infty} \frac{1}{m(A_n)}\int_{A_n} f(t)\,dm(t), \qquad f \in AP(G). \tag{32}$$
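For orientation, the Følner condition (31) can be checked by hand in the simplest case $G = \mathbb R$ with $A_n = (-n, n)$: the translate $A_n + x$ overlaps $A_n$ in an interval of length $\max(0, 2n - |x|)$, so the ratio in (31) equals $|x|/n$ once $n \ge |x|/2$. The short sketch below only encodes this computation; the choice of group, the sets and the function names are illustrative assumptions.

```python
def symmetric_difference_length(n, x):
    """Lebesgue measure of (A_n + x) symmetric-difference A_n for A_n = (-n, n)."""
    overlap = max(0.0, 2.0 * n - abs(x))      # length of the intersection
    return 2.0 * (2.0 * n) - 2.0 * overlap    # |A| + |B| - 2 |A intersect B|

def folner_ratio(n, x):
    """The ratio m[(A_n + x) delta A_n] / m(A_n) appearing in the Folner condition."""
    return symmetric_difference_length(n, x) / (2.0 * n)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, folner_ratio(n, 3.0))
```

For fixed $x$ the ratio decreases to zero as $n$ grows, which is the content of (31) in this case.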
As in the abelian case, the functionals $M$ all agree on the space of weakly almost periodic functions on $G$. The function $\varphi^\pi_\xi$ is always weakly almost periodic on $G$, and it is almost periodic if and only if $\dim \pi < \infty$. It can be shown that every $\varphi \in P(G)$ decomposes uniquely as a sum
$$\varphi = \varphi_a + \varphi_0, \tag{33}$$
where $\varphi_a \in AP(G)$ and $M(|\varphi_0|^2) = 0$. If $\varphi \in P(G) \cap AP(G)$, then $\varphi$ can be written uniquely in the form
$$\varphi(t) = \sum_{i=1}^{\infty} \mathrm{tr}(\pi_i(t)T_i), \qquad t \in G, \tag{34}$$
where $\pi_i \in \widehat G_{\nu_i}$, $\nu_i < \infty$, $\pi_i \ne \pi_j$ if $i \ne j$, and each $T_i$ is a nonnegative definite operator on $H_{\nu_i}$, so in particular $\sum_i \mathrm{tr}\,T_i = \varphi(e)$. Thus we are led to the following definition.

DEFINITION 6.1 For $\varphi \in P(G)$, the almost periodic spectrum of $\varphi$ is the collection $\{(\pi_i, T_i)\}$ associated to $\varphi_a$ as in (34).

In fact we can identify the relationship between the function $\varphi$ and the operators $T_i$ in a nicer way. The following definition is usually restricted to $AP(G)$. For convenience we define it more generally, but we shall refer to it as the almost periodic Fourier transform.

DEFINITION 6.2 For any function $f$ on $G$ and $\{A_n\}$ a Følner sequence in $G$, set
$$\tilde f(\pi) = \lim_{n\to\infty} \frac{1}{m(A_n)}\int_{A_n} f(t)\,\pi(t^{-1})\,dm(t), \qquad \pi \in \widehat G, \tag{35}$$
in the sense that
$$\langle \tilde f(\pi)\xi, \eta\rangle = \lim_{n\to\infty} \frac{1}{m(A_n)}\int_{A_n} f(t)\,\langle \pi(t^{-1})\xi, \eta\rangle\,dm(t), \qquad \pi \in \widehat G, \tag{36}$$
provided that the limit exists for all $\pi \in \widehat G$ and is independent of the choice of the Følner sequence $\{A_n\}$.

Note that by the uniqueness of the mean on $WAP(G)$, the almost periodic Fourier transform is well defined for every function $f \in WAP(G)$. If
We have $\mathcal E(P_0(G)) = \{0\} \cup \{\varphi^\pi_\xi : \pi \in \widehat G,\ \|\xi\| = 1\}$. Moreover, if $\varphi \in P_1(G)$, then we may assume that $\mu(\{0\}) = 0$. Setting $E = \mathcal E(P_0(G)) \setminus \{0\}$, it follows that for every asymptotically stationary process $\{X(t), t \in G\}$ on $G$ with $K(e) \ne 0$ there exists a positive measure $\mu$ on $E$ such that
$$K(t) = \int_E \psi(t)\,d\mu(\psi), \qquad t \in G. \tag{37}$$
DEFINITION 6.4 A measure $\mu$ as above is called an associated spectral measure of the asymptotically stationary process $\{X(t)\}$.

If $G$ is nonabelian, the measure $\mu$ given by Choquet's Theorem is not unique (cf. [51], §9); hence our terminology above. The almost periodic spectrum of $K$ is unique, however. When $G$ is abelian, $E$ consists of the characters of $G$, the existence of $\mu$ is a consequence of Bochner's Theorem, and it is unique. This is the associated spectral measure of Definition 2.5. To illustrate these ideas, before discussing general classes of processes, let us reproduce the following example taken from [66], where more details appear.

EXAMPLE Suppose that $\{X(t), t \in G\}$ is a process with discrete spectrum, namely
$$X(t) = \sum_n c_n\,\mathrm{tr}(\pi_n(t)T_n), \qquad t \in G, \tag{38}$$
where the $\pi_n$ are distinct elements of $\widehat G$, $T_n$ is a trace-class operator on $H_{\pi_n}$, and $\{c_n\}$ is a corresponding collection of $H$-valued random variables. Assume that
$$\sum_n \left(E|c_n|_H^2\right)^{1/2}\|T_n\|_1 < \infty, \tag{39}$$
which implies that conditions (i), (ii) and (iii) are satisfied. Here $\|T_n\|_1$ is the trace-class norm of $T_n$. Now
$$K(t,\tau) = \sum_{m,n} E(\langle c_m, c_n\rangle_H)\,\mathrm{tr}(\pi_m(t)\pi_m(\tau)T_m)\,\overline{\mathrm{tr}(\pi_n(t)T_n)}.$$
It can be shown that the process $\{X(t)\}$ is asymptotically stationary with
$$K(\tau) = \sum_{n \in F} \frac{1}{\nu_n}\,E(|c_n|_H^2)\,\mathrm{tr}(\pi_n(\tau)T_nT_n^*), \tag{40}$$
where $F = \{n : \nu_n = \dim \pi_n < \infty\}$. For each $n \in F$, let $F_n = \{k : T_n\xi_k^{\nu_n} \ne 0\}$, and for $k \in F_n$ set
$$\eta_k^n = \frac{T_n\xi_k^{\nu_n}}{\|T_n\xi_k^{\nu_n}\|}, \qquad \psi_k^n = \varphi^{\pi_n}_{\eta_k^n}.$$
Then an associated spectral measure for $\{X(t)\}$ is
$$\mu = \sum_{n \in F}\sum_{k \in F_n} \frac{1}{\nu_n}\,E(|c_n|_H^2)\,\|T_n\xi_k^{\nu_n}\|^2\,\delta_{\psi_k^n}. \tag{41}$$
6.2 Examples

Every locally compact group $G$ is either of Type I, Type II, or Type III, according as its group von Neumann algebra $W^*(G) = C^*(G)^{**}$ is of Type I, II, or III in the classical Murray-von Neumann classification. Let us assume that $G$ is of Type I, so every element of $(G \times G)^\wedge$ is of the form $\pi_1 \otimes \pi_2$, where $\pi_1, \pi_2 \in \widehat G$. Thus $(G \times G)^\wedge$ is identified with $\widehat G \times \widehat G$. Let $B(G)$ be as above. There is a natural Borel structure on $\widehat G$ with respect to which each $\widehat G_n$ is measurable. (If $G$ is separable, we may use the Mackey Borel structure. Otherwise, recall that $G$ is of Type I if and only if the Fell topology is Hausdorff, and we may use the Borel sets of the Fell topology. This topology is simply the topology on $\widehat G$ induced from $\mathcal E(P_1(G))$ via (29) and the map $\varphi^\pi_\xi \mapsto \pi$. See [18].) If $\varphi \in P(G)$ there is a measurable mapping $\pi \mapsto T_\pi$, where $T_\pi$ is a nonnegative definite trace-class operator on $H_\pi$, $\pi \in \widehat G$, and a finite, nonnegative measure $\mu$ on $\widehat G$ such that
$$\varphi(t) = \int_{\widehat G} \mathrm{tr}(\pi(t)T_\pi)\,d\mu(\pi), \qquad t \in G.$$
Now, $G \times G$ is also of Type I, so following Rao [61] we can modify the definition in §3.1 as follows.

DEFINITION 6.5 Let $G$ be of Type I. We call the continuous process $\{X(t), t \in G\}$ strongly harmonizable if $K \in B(G \times G)$, that is, when there exist a measurable mapping $T$ on $\widehat G \times \widehat G$ such that $T_{\pi_1,\pi_2}$ is a trace-class operator on $H_{\pi_1} \otimes H_{\pi_2}$ and a complex measure $\mu$ on $\widehat G \times \widehat G$ such that
$$K(t,\tau) = \int_{\widehat G \times \widehat G} \mathrm{tr}\big(\pi_1(t) \otimes \pi_2(\tau)\,T_{\pi_1,\pi_2}\big)\,d\mu(\pi_1,\pi_2), \qquad t, \tau \in G. \tag{42}$$
PROPOSITION 6.6 Let $G$ be of Type I. Then every strongly harmonizable process on $G$ is asymptotically stationary.

For the proof, see [66], §3. To define the notion of weak harmonizability we need to introduce bilinear forms. In this case we may abandon the restriction that $G$ be of Type I. The definition is due to Ylinen [76], summarized in [78]; cf. [61], §2.3. Again, this is a nonabelian version of the definition for abelian groups. We shall not provide many details here, as this does not relate directly to our discussion of asymptotic stationarity. Suffice it to say that in a canonical way one can define a mapping $t \mapsto \gamma_t$ of $G$ into $W^*(G)$. If $u$ is a bounded bilinear form on $C^*(G) \times C^*(G)$, then just as in the commutative case there is a natural extension of $u$ to $W^*(G) \times W^*(G)$.
DEFINITION 6.7 The continuous process X on G is weakly harmonizable if there is a bounded bilinear form u on C*(G) x C*(G) such that
It is not hard to see that every strongly harmonizable process on $G$ is weakly harmonizable when $G$ is of Type I. Theorems 3.7-3.9 can be extended to the nonabelian case, using this definition, as shown by Ylinen in [76],[77]. For a summary of this and related material, see [78]. The key to much of this work is the operator-space point of view, namely the extension of Grothendieck's theorem due to Pisier [53] and Haagerup [23]. The remainder of the classes defined in §3 and the discussion of higher-order asymptotic stationarity can be defined just as well on nonabelian groups, and one can easily modify the arguments to fit this context. The relevance of some of these classes for nonabelian groups is not clear, however, since they are not extensions of the harmonizable classes. To rectify that one needs to redefine them appropriately using the theory of operator algebras and operator spaces, just as was done for the weakly harmonizable processes.
6.3 Estimation of the Associated Almost Periodic Spectrum

Let $K(\tau)$ be the associated covariance function of an asymptotically stationary process X on G. It follows from the discussion in §6.1 that to estimate the almost periodic spectrum, as given by Definition 6.1, for each $\pi \in \widehat{G} \setminus \widehat{G}_\infty$ we need to estimate the corresponding operator $K(\pi)$. If $\pi \in \widehat{G}_\nu$, then to find that operator, we must estimate the matrix entries $(K(\pi)\xi_i, \xi_j)$ corresponding to the basis $\{\xi_1, \dots, \xi_\nu\}$ of $H_\nu$. As in §4, we must consider a process X on G with values in $L^2(\Omega, \mathcal{F}, P; H)$ and construct an estimator from a sequence of independent samples $\{X_i\}$, $1 \le i \le N$, of the process X. The appropriate generalization of Theorem 4.1 to nonabelian groups is then the following theorem, whose proof may be found in [66].

THEOREM 6.8 Let $\{X(t), t \in G\}$ be an H-valued asymptotically stationary stochastic process which satisfies conditions (ii) and (iii). Suppose that the following conditions are also met:
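(iv) $\displaystyle \lim_{n \to \infty} \sup_{\tau} \left| \frac{1}{m(A_n)} \int_{A_n} K(t\tau, t)\, dm(t) - K(\tau) \right| = 0$;

(v) $\displaystyle C = \sup_{n} \sup_{\tau} \frac{1}{m(A_n)} \int_{A_n} \big(E|X(t\tau)|_H^4\big)^{1/2}\, dm(t) < \infty$.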
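Let $\{X_1(t)\}, \{X_2(t)\}, \dots$ be a sequence of independent samples from $\{X(t), t \in G\}$. Then for each $\pi \in \widehat{G}_\nu$ and $\xi, \eta \in H_\nu$ ($\nu < \infty$),

$$K_{n,N}(\pi; \xi, \eta) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{m(A_n)^2} \int_{A_n \times A_n} \langle X_i(t\tau), X_i(t) \rangle_H\, (\pi(\tau)\xi, \eta)_{H_\nu}\, dm \times m(t,\tau) \tag{43}$$

is a consistent estimator for $(K(\pi)\xi, \eta)_{H_\nu}$, in the sense that

$$E\,\big|K_{n,N}(\pi;\xi,\eta) - (K(\pi)\xi,\eta)_{H_\nu}\big|^2 \to 0 \quad \text{as } n, N \to \infty. \tag{44}$$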
REMARK A Moore group ([62]; [28], p. 15) is a locally compact group all of whose irreducible unitary representations are finite dimensional. The structure of such groups was described by C. C. Moore in [42]. Let G be a Moore group and $\{X(t), t \in G\}$ be an asymptotically stationary process on G satisfying the conditions of Theorem 6.8. Then we may view that theorem as providing a consistent estimator for the discrete part of the asymptotically stationary covariance $K(\tau)$. This notion of the discrete part can easily be made precise (for a group that is not necessarily a Moore group).
REFERENCES

1. V. V. Anh and K. E. Lunney, Spectral representation and ergodicity of asymptotically stationary processes, Austral. J. Statist., 33:85-94 (1991).
2. V. V. Anh and K. E. Lunney, Covariance function and ergodicity of asymptotically stationary random fields, Bull. Austral. Math. Soc., 44:49-62 (1991).
3. A. S. Besicovitch, Almost Periodic Functions, Cambridge University Press, Cambridge, 1932.
4. C. S. K. Bhagavan, Nonstationary Processes. Spectral and Some Ergodic Theorems, Andhra University Press, Waltair, India, 1974.
5. R. B. Burckel, Weakly Almost Periodic Functions on Semigroups, Gordon and Breach, New York, 1970.
6. D. K. Chang and M. M. Rao, Bimeasures and nonstationary processes. In M. M. Rao, ed., Real and Stochastic Analysis, Wiley, New York, 1986, pp. 7-118.
7. H. Cramér, Random Variables and Probability Distributions, Cambridge University Press, Cambridge, 1937.
8. E. Christensen and A. M. Sinclair, Representations of completely bounded multilinear operators, J. Functional Anal., 72:151-181 (1987).
9. D. Dehay, On a class of asymptotically stationary harmonizable processes, J. Multivariate Anal., 22:251-257 (1987).
10. D. Dehay, Asymptotic behavior of estimators of cyclic functional parameters for some nonstationary processes, Statist. and Decisions, 13:273-286 (1995).
11. K. de Leeuw and I. Glicksberg, Applications of almost periodic compactifications, Acta Math., 105:63-97 (1961).
12. J. Dixmier, Les C*-algèbres et leurs représentations, Cahiers Sci. Fasc. XXIX, Gauthier-Villars, Paris, 1964.
13. J. L. Doob, Stochastic Processes, Wiley, New York, 1953.
14. W. F. Eberlein, Abstract ergodic theorems and weak almost periodic functions, Trans. Amer. Math. Soc., 67:217-240 (1949).
15. W. F. Eberlein, A note on Fourier-Stieltjes transforms, Proc. Amer. Math. Soc., 6:310-312 (1955).
16. W. F. Eberlein, The point spectrum of weakly almost periodic functions, Michigan Math. J., 3:137-139 (1955-56).
17. E. G. Effros and Z. J. Ruan, Operator Spaces, Oxford University Press, New York, 2000.
18. G. B. Folland, A Course in Abstract Harmonic Analysis, CRC Press, Boca Raton, FL, 1995.
19. E. G. Gladyshev, Periodically and almost periodically correlated random processes with continuous time parameter, Theory Probab. Appl., 8:173-177 (1987).
20. R. Godement, Les fonctions de type positif et la théorie des groupes, Trans. Amer. Math. Soc., 63:1-84 (1948).
21. C. C. Graham and B. M. Schreiber, Bimeasure algebras on LCA groups, Pacific J. Math., 115:91-127 (1984).
22. F. P. Greenleaf, Invariant Means on Topological Groups, van Nostrand Math. Studies No. 16, American Book, New York, 1969.
23. U. Haagerup, The Grothendieck inequality for bilinear forms on C*-algebras, Adv. Math., 56:93-116 (1985).
24. L. G. Hanin and B. M. Schreiber, Discrete spectrum of nonstationary stochastic processes on groups, J. Theoretical Probab., 11:1111-1133 (1998).
25. L. G. Hanin and M. A. Schwarz, Consistent estimation of spectral measure discrete component for a class of random process, Nonparam. Statist., 2:81-87 (1992).
26. H. Helson and D. Lowdenslager, Prediction theory and Fourier series in several variables, Acta Math., 99:165-201 (1958).
27. H. Helson and D. Lowdenslager, Prediction theory and Fourier series in several variables II, Acta Math., 106:175-212 (1961).
28. H. Heyer, Probability Measures on Locally Compact Groups, Ergebnisse der Math. und ihrer Grenzgeb. Vol. 94, Springer-Verlag, Berlin, 1977.
29. E. Hewitt and K. A. Ross, Abstract Harmonic Analysis, Vol. 1, Grundl. der Math. Wiss. Vol. 115, Springer-Verlag, Berlin, 1963.
30. H. L. Hurd, Correlation theory of almost periodically correlated processes, J. Multivariate Anal., 37:24-45 (1991).
31. H. L. Hurd and J. Leskow, Strongly consistent and asymptotically normal estimation of the covariance for almost periodically correlated processes, Statist. and Decisions, 10:201-225 (1992).
32. T. Ito and B. M. Schreiber, A space of sequences given by pairs of unitary operators, Proc. Japan Acad., 46:637-641 (1970).
33. Y. Kakihara, Multidimensional Second Order Stochastic Processes, Series on Multivariate Anal. Vol. 1, World Scientific, Singapore, 1997.
34. J. Kampé de Fériet and F. N. Frenkiel, Correlation and spectra for nonstationary random functions, Math. Comput., 16:1-21 (1962).
35. K. Karhunen, Über lineare Methoden in der Wahrscheinlichkeitsrechnung, Ann. Acad. Sci. Fenn. Ser. A I Math., 37:3-79 (1947).
36. T. Kawata, Almost periodic weakly stationary processes. In G. Kallianpur and P. R. Krishnaiah, eds., Statistics and Probability: Essays in Honor of C. R. Rao, North-Holland, Amsterdam, 1982, pp. 383-396.
37. A. Kolmogorov, Stationary Sequences in Hilbert Space, Bjul. Moskov. Gos. Univ. Vol. 2 No. 6, Moscow, 1941.
38. V. E. Kulikov, On the mean value of the estimate for the variance of an asymptotically stationary random process of bounded duration (Russian), Izv. Akad. Nauk. Teor. Sist. Upr., 1:28-32 (1998).
39. J. Leskow, Asymptotic normality of the spectral density estimators for almost periodically correlated stochastic processes, Stoch. Processes Appl., 52:351-360 (1994).
40. M. Loève, Probability Theory, 3rd ed., van Nostrand, Princeton, 1963.
41. L. H. Loomis, An Introduction to Abstract Harmonic Analysis, van Nostrand, Princeton, 1953.
42. C. C. Moore, Groups with finite-dimensional irreducible representations, Trans. Amer. Math. Soc., 166:401-410 (1972).
43. V. Mandrekar and M. Nadkarni, On the linear prediction theory of stationary processes indexed by lattice points, Tech. Rep. No. 100, Univ. Minnesota Dept. of Stat., 1967.
44. J. Michalek, Associated spectra of some nonstationary processes, Kybernetika (Prague), 24:428-438 (1988).
45. H. Niemi, On stationary dilations and the linear prediction of certain stochastic processes, Soc. Sci. Fenn. Comment. Phys.-Math., 45:111-130 (1975).
46. H. Niemi, On the linear prediction of certain non-stationary stochastic processes, Math. Scand., 39:146-160 (1976).
47. H. Niemi, On orthogonally scattered dilations of bounded vector measures, Ann. Acad. Sci. Fenn. Ser. A I Math., 3:43-51 (1977).
48. H. Niemi, Asymptotic stationarity of nonstationary L2-processes with applications to linear prediction, Bol. Soc. Mat. Mexicana, 28:15-29 (1983).
49. H. Niemi, Grothendieck's inequality and minimally orthogonally scattered dilations. In Probability Theory on Vector Spaces, III (Lublin, 1983), Lect. Notes in Math. Vol. 1080, Springer-Verlag, Berlin, 1984, pp. 175-187.
50. E. Parzen, Spectral analysis of asymptotically stationary time series, Bull. Inst. Internat. Statist., 39(2):87-103 (1962).
51. R. B. Phelps, Lectures on Choquet's Theorem, van Nostrand Math. Ser. No. 7, American Book, New York, 1966.
52. J. P. Pier, Amenable Locally Compact Groups, Pure and Applied Math., Wiley Interscience, New York, 1984.
53. G. Pisier, Grothendieck's theorem for non-commuting C*-algebras with an appendix on Grothendieck's constant, J. Funct. Anal., 29:397-415 (1978).
54. G. Pisier, Operator spaces and similarity problems. In Proc. Internat. Congress of Mathematicians (Berlin, 1998), Vol. 1, Springer-Verlag, Berlin, 1998.
55. M. B. Priestley, Spectral Analysis of Time Series, 2 vols., Academic Press, New York, 1981.
56. M. B. Priestley, Non-linear and Non-stationary Time Series Analysis, Academic Press, London and New York, 1988.
57. M. M. Rao, Covariance analysis on nonstationary time series. In Developments in Statistics Vol. 1, Academic Press, New York, 1978, pp. 171-225.
58. M. M. Rao, Harmonizable processes: structure theory, L'Enseignement Math. (2), 28:295-351 (1982).
59. M. M. Rao, Harmonizable signal extraction, filtering, and sampling. In Topics in NonGaussian Signal Processes, Springer-Verlag, New York, 1984, pp. 98-117.
60. M. M. Rao, Harmonizable, Cramér, and Karhunen classes of processes. In E. J. Hannan, P. R. Krishnaiah, and M. M. Rao, eds., Handbook of Statistics Vol. 5: Time Series in the Time Domain, North-Holland, Amsterdam, 1985, pp. 279-310.
61. M. M. Rao, Bimeasures and harmonizable processes (analysis, classification, and representation). In Probability Measures on Groups IX (Oberwolfach, 1988), Lect. Notes in Math. Vol. 1379, Springer-Verlag, Berlin, Heidelberg, and New York, 1989, pp. 254-298.
62. L. C. Robertson, A note on the structure of Moore groups, Bull. Amer. Math. Soc., 75:594-599 (1969).
63. L. C. Robertson and B. M. Schreiber, The additive structure of integer groups and p-adic number fields, Proc. Amer. Math. Soc., 19:1453-1456 (1968).
64. Yu. A. Rozanov, Spectral analysis of abstract functions, Theory Probab. Appl., 4:271-287 (1959).
65. Yu. A. Rozanov, Stationary Random Processes, Holden-Day, San Francisco, 1967.
66. B. M. Schreiber, Asymptotically stationary processes on amenable groups, in press.
67. R. J. Swift, Almost periodic harmonizable processes, Georgian Math. J., 3:275-292 (1996).
68. R. J. Swift, Some aspects of harmonizable processes and fields. In M. M. Rao, ed., Real and Stochastic Analysis: Recent Advances, CRC Press, Boca Raton, FL, 1997, pp. 303-365.
69. R. J. Swift, Signal extraction for a class of nonstationary processes, Portugal. Math., 57:49-58 (2000).
70. R. J. Swift, Covariance analysis and associated spectra for classes of nonstationary processes, J. Statist. Plann. Inference, 100:145-157 (2002).
71. D. Tjøstheim and J. B. Thomas, On the analysis of a class of multivariate nonstationary stochastic processes, IEEE Trans. Information Theory, 21:257-262 (1975).
72. N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series with Engineering Applications, Wiley, New York, 1949.
73. A. M. Yaglom, An Introduction to the Theory of Stationary Random Functions, revised English ed., transl. and ed. by R. A. Silverman, Prentice-Hall, Englewood Cliffs, NJ, 1962.
74. A. M. Yaglom, Positive-definite functions and homogeneous random fields on groups and homogeneous spaces, Soviet Math. Dokl., 1:1402-1405 (1961).
75. A. M. Yaglom, Second-order homogeneous random fields. In Proc. Fourth Berkeley Symp. Math. Stat. Probab. Vol. 2, University of California Press, Berkeley, 1961, pp. 593-622.
76. K. Ylinen, Fourier transforms of noncommutative analogues of vector measures and bimeasures with applications to stochastic processes, Ann. Acad. Sci. Fenn. Ser. A I Math., 1:355-385 (1975).
77. K. Ylinen, Dilations of V-bounded stochastic processes indexed by a locally compact group, Proc. Amer. Math. Soc., 90:378-380 (1974).
78. K. Ylinen, Random fields on noncommutative locally compact groups. In Probability Measures on Groups VIII (Oberwolfach, 1985), Lect. Notes in Math. Vol. 1210, Springer-Verlag, Berlin, 1986, pp. 365-386.
Superlinearity and Weighted Sobolev Spaces
Victor L. Shapiro
Department of Mathematics, University of California, Riverside, CA 92521
Abstract

Working in the special weighted Sobolev space $H^1_{p,\rho}(\Omega,\Gamma)$, which meets the criterion for being a simple $V_L$-region$^1$ with $Lu = -\sum_{i=1}^{N} D_i(p_i D_i u)$ a weighted elliptic operator, where $p = (p_1, \dots, p_N)$ and $p_i$ and $\rho$ are positive weights in $\Omega$, multiple solutions to the following superlinear Dirichlet problem are established: $Lu = [-|u|^{q-2}u + \lambda u + g(x,u)]\rho$, $u = 0$ on $\Gamma \subset \partial\Omega$. Here, $2 < q < N^{\diamond}$, where $N^{\diamond}$ is the critical exponent associated with simple $V_L$-regions, $g(x,s)$ is odd in s, sublinear and nonnegative for $s > 0$, and $\lambda$ is sufficiently large.

1. Introduction. Let $\Omega \subset \mathbf{R}^N$, $N \ge 1$, be an open (possibly unbounded) set and let $\rho(x), p_i(x) \in C^0(\Omega)$ be positive functions with the property that

$$\int_\Omega \rho(x)\,dx < \infty \quad \text{and} \quad \int_\Omega p_i(x)\,dx < \infty \quad \text{for } i = 1, \dots, N. \tag{1.1}$$
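Also, let $\Gamma \subset \partial\Omega$ designate a fixed closed set. ($\Gamma$ may be the empty set.) We introduce the pre-Hilbert space:

$$C^1_{p,\rho}(\Omega,\Gamma) = \Big\{ u \in C^0(\bar\Omega) \cap C^2(\Omega) : u(x) = 0\ \ \forall x \in \Gamma;\ \int_\Omega \Big[ \sum_{i=1}^{N} p_i |D_i u|^2 + \rho u^2 \Big] dx < \infty \Big\} \tag{1.2}$$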
1
Dedicated to my colleague M. M. Rao
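where $p = (p_1, \dots, p_N)$ and $D_i u = \partial u/\partial x_i$. In $C^1_{p,\rho}(\Omega,\Gamma)$, we have the inner product

$$\langle u, v \rangle_{p,\rho} = \int_\Omega \Big[ \sum_{i=1}^{N} p_i D_i u\, D_i v + \rho u v \Big] dx. \tag{1.3}$$

$H^1_{p,\rho}(\Omega,\Gamma)$ will be the real Hilbert space (see [Sh, p. 2]) that we obtain by completing $C^1_{p,\rho}(\Omega,\Gamma)$ by the method of Cauchy sequences with respect to the norm $\|u\|_{p,\rho} = \langle u, u \rangle_{p,\rho}^{1/2}$. $L^2_\rho(\Omega)$ will be the real Hilbert space with the inner product $\langle u, v \rangle_\rho = \int_\Omega u v \rho\, dx$ where $\|u\|_\rho^2 = \langle u, u \rangle_\rho$. In a similar manner we have the spaces $L^2_{p_i}(\Omega)$, $i = 1, \dots, N$. Hence we see from (1.3) that

$$\langle u, v \rangle_{p,\rho} = \sum_{i=1}^{N} \langle D_i u, D_i v \rangle_{p_i} + \langle u, v \rangle_\rho. \tag{1.4}$$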
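Also, in the sequel, sometimes we shall write $H^1_{p,\rho}$ for $H^1_{p,\rho}(\Omega,\Gamma)$. We consider the operator

$$Lu = -\sum_{i=1}^{N} D_i [p_i D_i u] \tag{1.5}$$

and the two-form

$$\mathcal{L}(u,v) = \sum_{i=1}^{N} \int_\Omega p_i D_i u\, D_i v \tag{1.6}$$

for $u, v \in H^1_{p,\rho}$. We have the possibility of (1.5) being singular because the $p_i$'s may tend to zero on all or part of $\partial\Omega$, or $\Omega$ may be unbounded, or both.

We shall say $(\Omega,\Gamma)$ is a $V_L$-region if the following two conditions hold:

($V_L$-1) There exists a complete orthonormal system $\{\varphi_n\}_{n=1}^{\infty}$ in $L^2_\rho$. Also $\varphi_n \in H^1_{p,\rho}(\Omega,\Gamma) \cap C^2(\Omega)$ $\forall n$.

($V_L$-2) There exists a sequence of eigenvalues $\{\lambda_n\}_{n=1}^{\infty}$ with $0 < \lambda_1 < \lambda_2 \le \lambda_3 \le \cdots \le \lambda_n \to \infty$ such that $\mathcal{L}(\varphi_n, v) = \lambda_n \langle \varphi_n, v \rangle_\rho$ $\forall v \in H^1_{p,\rho}(\Omega,\Gamma)$. Also, $\varphi_1 > 0$ in $\Omega$.

In this paper we shall be interested in a special type of $V_L$-region called a Simple $V_L$-region. We say that $(\Omega,\Gamma)$ is a Simple $V_L$-region if $(\Omega,\Gamma)$ is a $V_L$-region and the following four conditions prevail:

(i) $\Omega = \Omega_1 \times \cdots \times \Omega_N$ where $\Omega_i \subset \mathbf{R}$ is an open set for $i = 1, \dots, N$;

(ii) associated with each $\Omega_i$ there are positive functions $p_i^*$ and $\rho_i^*$ in $\Omega_i$ satisfying $\int_{\Omega_i} [p_i^*(s) + \rho_i^*(s)]\, ds < \infty$ for $i = 1, \dots, N$;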
(iii) $\rho(x) = \rho_1^*(x_1) \cdots \rho_N^*(x_N)$ and $p_i(x) = \rho_1^*(x_1) \cdots \rho_{i-1}^*(x_{i-1})\, p_i^*(x_i)\, \rho_{i+1}^*(x_{i+1}) \cdots \rho_N^*(x_N)$ for $i = 1, \dots, N$;

(iv) for each $\Omega_i$ ($i = 1, \dots, N$), $\exists h_i \in C^0(\Omega_i) \cap L^\theta_{\rho_i^*}(\Omega_i)$ for $2 < \theta < \infty$ with the property that $\forall u \in C^1(\Omega_i)$,

$$|u(s)| \le h_i(s)\, \|u\|_{p_i^*,\rho_i^*} \quad \forall s \in \Omega_i.$$

In (iv) above $h_i$ is understood to be in every $L^\theta_{\rho_i^*}(\Omega_i)$ for $2 < \theta < \infty$, and also, to be quite explicit,

$$\|u\|^2_{p_i^*,\rho_i^*} = \int_{\Omega_i} \big[ p_i^*(s)\, |du(s)/ds|^2 + \rho_i^*(s)\, u^2(s) \big]\, ds. \tag{1.8}$$

It is easy to give examples of interesting Simple $V_L$-regions, and in the concluding section of this paper, we shall discuss two of them. The whole point of introducing the concept of Simple $V_L$-regions is that both a weighted continuous and a weighted compact imbedding theorem hold for these regions (see [Sh, p. 15 and p. 46]). Using these imbedding theorems, we shall establish multiplicity results for a weighted (i.e., singular) elliptic operator. Also, Simple $V_L$-regions need not be bounded or have finite volume, unlike the situation which prevails for most of the previous multiplicity results in the literature (see [Am], [He], and [Ra, p. 54]).

The following two imbedding results were established in [Sh, p. 15 and p. 46]:

Theorem A. Let Lu be given by (1.5) and suppose that $(\Omega,\Gamma)$ is a Simple $V_L$-region. Then $H^1_{p,\rho}(\Omega,\Gamma)$ is continuously imbedded in $L^\theta_\rho(\Omega)$ for every $\theta$ satisfying $2 < \theta < 2N/(N-1)$, i.e., $\exists K_\theta > 0$ such that

$$\Big[ \int_\Omega |u|^\theta \rho\, dx \Big]^{1/\theta} \le K_\theta\, \|u\|_{p,\rho} \quad \forall u \in H^1_{p,\rho}. \tag{1.9}$$
Theorem B. Let Lu be given by (1.5) and suppose that $(\Omega,\Gamma)$ is a Simple $V_L$-region. Then for $N \ge 2$, $H^1_{p,\rho}(\Omega,\Gamma)$ is compactly imbedded in $L^\theta_\rho(\Omega)$ for every $\theta$ satisfying $2 < \theta < 2N/(N-1)$. For $N = 1$, $H^1_{p,\rho}(\Omega,\Gamma)$ is compactly imbedded in $L^\theta_\rho(\Omega)$ for every $\theta$ satisfying $2 < \theta < \infty$.

Remark 1. It is clear from the very definition of a Simple $V_L$-region that for $N = 1$, $H^1_{p,\rho}(\Omega,\Gamma)$ is continuously imbedded in $L^\theta_\rho(\Omega)$ and that (1.9) holds for every $\theta$ satisfying $2 < \theta < \infty$.

Remark 2. For $N \ge 2$, we shall set $2N/(N-1) = N^{\diamond}$. (For $N = 1$, $N^{\diamond} = \infty$.) Now it is clear from Theorems A and B that $N^{\diamond}$ is the critical exponent
for very general Simple $V_L$-regions, but for a specific Simple $V_L$-region, it may be larger. For example for $N \ge 2$, letting $\Omega \subset \mathbf{R}^N$ be a bounded open set, choosing $\Gamma = \partial\Omega$, taking $p_i = 1$ for $i = 1, \dots, N$ and $\rho = 1$, we see that $H^1_{p,\rho}(\Omega,\Gamma)$ is a Simple $V_L$-region, and that $H^1_{p,\rho}(\Omega,\Gamma) = W_0^{1,2}(\Omega)$, the familiar Sobolev space. It is well-known that the critical exponent in this case is $N^*$ where $N^* = 2N/(N-2)$ for $N \ge 3$ and $N^* = \infty$ for $N = 1, 2$.

Using the above two imbedding theorems in conjunction with various theorems in nonlinear analysis, we shall obtain multiplicity results for weak solutions of the following superlinear (singular) elliptic boundary value problem:

$$Lu = \rho\,[-|u|^{q-2}u + \lambda u + g(x,u)] \ \text{ for } x \in \Omega, \qquad u = 0 \ \text{ on } \Gamma \subset \partial\Omega. \tag{1.10}$$
(Note, $\Gamma$ may be the empty set.) Here, Lu is given by (1.5), $2 < q < N^{\diamond}$, $\lambda > \lambda_j$, where $\lambda_j$ is the j-th eigenvalue given in ($V_L$-2) above, and $g(x,s)$ meets the following conditions:

(g-1) the Carathéodory conditions: the map $x \to g(x,s)$ is measurable for all $s \in \mathbf{R}$, and the map $s \to g(x,s)$ is continuous for a.e. $x \in \Omega$;

(g-2) given $\varepsilon > 0$, $\exists b_\varepsilon \in L^{q'}_\rho(\Omega)$ where $q' = q/(q-1)$ such that $|g(x,s)| \le \varepsilon |s|^{q-1} + b_\varepsilon(x)$ $\forall s \in \mathbf{R}$ and for a.e. $x \in \Omega$;

(g-3) $s\,g(x,s) \ge 0$ for $s \in \mathbf{R}$ and a.e. $x \in \Omega$;

(g-4) $g(x,s) = -g(x,-s)$ $\forall s \in \mathbf{R}$ and for a.e. $x \in \Omega$.

By a weak solution of the boundary value problem (1.10), we shall mean: $\exists u \in H^1_{p,\rho}(\Omega,\Gamma)$ such that

$$\mathcal{L}(u,v) = \int_\Omega [-|u|^{q-2}u + \lambda u + g(x,u)]\, v \rho \quad \forall v \in H^1_{p,\rho}(\Omega,\Gamma) \tag{1.11}$$
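where $\mathcal{L}(u,v)$ is defined in (1.6). The theorem we shall establish concerning (1.10) is the following:

Theorem 1. Assume $(\Omega,\Gamma)$ is a Simple $V_L$-region, that $\lambda > \lambda_j$, that (g-1)-(g-4) hold, and that $2 < q < N^{\diamond}$. Then the boundary value problem (1.10) possesses at least j distinct pairs of nontrivial weak solutions, i.e., $\exists \psi_1, \dots, \psi_j \in H^1_{p,\rho}(\Omega,\Gamma)$ such that

$$\mathcal{L}(\psi_i, v) = \int_\Omega [-|\psi_i|^{q-2}\psi_i + \lambda \psi_i + g(x,\psi_i)]\, v \rho \quad \forall v \in H^1_{p,\rho}(\Omega,\Gamma) \tag{1.12}$$

for $i = 1, \dots, j$ with $\psi_i \ne 0$ and $\psi_i \ne \pm\psi_k$ for $i \ne k$, $i, k = 1, \dots, j$.

It is clear from (g-4) that $-\psi_i$ also satisfies the equation in (1.12). So the j distinct pairs of weak solutions of (1.10) are $\{\pm\psi_i\}_{i=1}^{j}$. We recall that $N^{\diamond} = 2N/(N-1)$ for $N \ge 2$ and $N^{\diamond} = \infty$ for $N = 1$.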
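Also, it follows from the observations in Remark 2 that for specific Simple $V_L$-regions, Theorem 1 above may even hold for $2 < q < N^*$ where $N^{\diamond} < N^*$.

For $u \in H^1_{p,\rho}(\Omega,\Gamma)$, we set

$$I(u) = \int_\Omega \Big[ \sum_{i=1}^{N} p_i |D_i u|^2/2 + \big[ |u|^q/q - \lambda |u|^2/2 - G(x,u) \big] \rho \Big] \tag{2.1}$$

where

$$G(x,s) = \int_0^s g(x,t)\, dt \quad \forall s \in \mathbf{R} \text{ and for a.e. } x \in \Omega. \tag{2.2}$$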
It is clear from the conditions in the hypothesis of Theorem 1 that $I(u)$ is well-defined for $u \in H^1_{p,\rho}(\Omega,\Gamma)$. Also, it is clear from these same conditions that the Gateaux derivative $I'(u)$ exists and that

$$I'(u)v = \mathcal{L}(u,v) + \int_\Omega [|u|^{q-2}u - \lambda u - g(x,u)]\, v \rho \tag{2.3}$$

for $v \in H^1_{p,\rho}(\Omega,\Gamma)$. Now, $I'(u)$ is a bounded linear functional in $[H^1_{p,\rho}(\Omega,\Gamma)]^*$, the dual of $H^1_{p,\rho}(\Omega,\Gamma)$. As an element in this Banach space, it is not difficult to show from Theorem B and from the conditions in the hypothesis of Theorem 1, that $I'(u)$ is continuous as a function of u. We state all this in our first lemma as follows:

Lemma 1. Under the conditions in the hypothesis of Theorem 1, $I \in C^1(H^1_{p,\rho}(\Omega,\Gamma), \mathbf{R})$, where $I(u)$ is defined in (2.1) and $I'(u)v$ is defined in (2.3).

It follows from (2.3) that if $u^* \in H^1_{p,\rho}$ is a critical point of $I(u)$, i.e., $I'(u^*) = 0$, then $u^*$ is a weak solution of the boundary value problem (1.10). (Note, in this case, $I(u^*)$ is called a critical value.) We will establish our theorem by invoking a result in nonlinear analysis called Clark's theorem, which shows that $I(u)$ has at least j distinct pairs of nontrivial critical points. One of the conditions in the hypothesis of Clark's theorem is that $I(u)$ satisfies the (PS) condition, which is defined as follows: Suppose $\{u_n\}_{n=1}^{\infty} \subset H^1_{p,\rho}$ is such that (i) the sequence $\{I(u_n)\}_{n=1}^{\infty}$ is uniformly bounded and (ii) $I'(u_n) \to 0$ in $[H^1_{p,\rho}]^*$. Then $\exists \{u_{n_k}\}_{k=1}^{\infty}$ and $u \in H^1_{p,\rho}$ such that $\|u_{n_k} - u\|_{p,\rho} \to 0$.

Lemma 2. Under the conditions in the hypothesis of Theorem 1, $I(u)$ satisfies the (PS) condition.
To prove the lemma, we assume that $\{u_n\}_{n=1}^{\infty} \subset H^1_{p,\rho}$ is such that

(i) $\exists K$ with $|I(u_n)| \le K$ $\forall n$, and (ii) $I'(u_n) \to 0$ in $[H^1_{p,\rho}]^*$. (2.4)

We shall first show that under the assumption (2.4) with $2 < q < N^{\diamond}$, $\exists K_1$ such that

$$\int_\Omega |u_n|^q \rho \le K_1 \quad \forall n. \tag{2.5}$$

To establish (2.5), we first observe from Hölder's inequality that $\exists K_2$ such that

$$\int_\Omega |u|^2 \rho \le K_2 \Big[ \int_\Omega |u|^q \rho \Big]^{2/q} \quad \forall u \in L^q_\rho. \tag{2.6}$$

Also, from (g-2) and (2.2), we see that $\forall \varepsilon > 0$, $\exists b_\varepsilon \in L^{q'}_\rho$ such that

$$\int_\Omega |G(x,u)| \rho \le \varepsilon \int_\Omega |u|^q \rho + \int_\Omega b_\varepsilon |u| \rho \quad \forall u \in L^q_\rho. \tag{2.7}$$

Since $u_n \in H^1_{p,\rho}$, it follows from Theorem A that $u_n \in L^q_\rho$. Consequently, since $q > 2$, if (2.5) did not hold, it would follow from (2.6) and (2.7) that for a subsequence, the expression

$$\int_\Omega \big[ |u_n|^q/q - \lambda |u_n|^2/2 - G(x,u_n) \big] \rho \to \infty.$$

But, from (2.1), we see that

$$I(u_n) \ge \int_\Omega \big[ |u_n|^q/q - \lambda |u_n|^2/2 - G(x,u_n) \big] \rho.$$

Hence, for a subsequence, $I(u_n) \to \infty$. This fact contradicts (2.4)(i). We conclude that (2.5) does indeed hold.

Next, we observe that (2.4)-(2.7) imply that $\exists K_3$ such that

$$\mathcal{L}(u_n, u_n) \le K_3 \quad \forall n. \tag{2.8}$$

For (2.5)-(2.7) imply that $\exists K_4$ such that $\big| \int_\Omega [\, |u_n|^q/q - \lambda |u_n|^2/2 - G(x,u_n)]\rho \big| \le K_4$ $\forall n$. Hence, it follows from (1.6) and (2.1) that $I(u_n) \ge \mathcal{L}(u_n,u_n)/2 - K_4$ $\forall n$. But then we infer from (2.4)(i) and this last fact that (2.8) does indeed hold.

(1.3), (1.6), (2.5), (2.6), and (2.8) in turn imply that $\exists K_5$ such that

$$\|u_n\|_{p,\rho} \le K_5 \quad \forall n. \tag{2.9}$$

We now show that this last fact joined with (2.4)(ii) gives us the conclusion of the lemma. To do this, we first observe that well-known facts about Hilbert spaces in conjunction with Theorem B and (2.9) imply that $\exists \{u_{n_k}\}_{k=1}^{\infty}$ and
$u \in H^1_{p,\rho}$ such that

(i) $u_{n_k} \to u$ weakly in $H^1_{p,\rho}$ and (ii) $\int_\Omega |u_{n_k} - u|^q \rho \to 0$. (2.10)

Consequently, we have that

$$I'(u)(u_{n_k} - u) \to 0,$$

and also from (2.4)(ii) and (2.9) that

$$I'(u_{n_k})(u_{n_k} - u) \to 0.$$

Hence, we have that

$$[\, I'(u_{n_k})(u_{n_k} - u) - I'(u)(u_{n_k} - u)\,] \to 0.$$

Likewise, we obtain from (2.10)(ii) and (g-2) that

$$\int_\Omega [\,|u|^{q-2}u - \lambda u - g(x,u)\,](u_{n_k} - u)\rho \to 0$$

and also that

$$\int_\Omega [\,|u_{n_k}|^{q-2}u_{n_k} - \lambda u_{n_k} - g(x,u_{n_k})\,](u_{n_k} - u)\rho \to 0.$$

These last three facts in conjunction with (2.3) give us that

$$\mathcal{L}(u_{n_k} - u, u_{n_k} - u) \to 0.$$

But we also know from (2.10)(ii) and (2.6) that $\|u_{n_k} - u\|_\rho \to 0$. Consequently, from (1.3) and (1.6), we conclude that $\|u_{n_k} - u\|_{p,\rho} \to 0$, and the proof of the lemma is complete.

Next, we need a lemma which deals with the Fourier analytic properties associated with the eigenfunctions $\{\varphi_n\}_{n=1}^{\infty}$ and eigenvalues $\{\lambda_n\}_{n=1}^{\infty}$ in ($V_L$-1) and ($V_L$-2) above. In particular, the following lemma holds:

Lemma 3. Assume that $\mathcal{L}$ is given by (1.6), that $(\Omega,\Gamma)$ is a Simple $V_L$-region, and that $f \in L^2_\rho$. Set

$$\hat{f}(n) = \langle f, \varphi_n \rangle_\rho.$$

Then $f \in H^1_{p,\rho}$ if and only if

$$\sum_{n=1}^{\infty} \lambda_n |\hat{f}(n)|^2 < \infty.$$

Furthermore, if $f \in H^1_{p,\rho}$, then $\mathcal{L}(f,f) = \sum_{n=1}^{\infty} \lambda_n |\hat{f}(n)|^2$.

For a proof of the above lemma, we refer the reader to [Sh, p. 37].

3. Proof of Theorem 1. In order to prove Theorem 1, we shall need Clark's Theorem which was first established in 1972 (see [Cl] and [Ra, p. 53]) and which is the following:

Theorem C. Let H be a real Hilbert space with $I \in C^1(H, \mathbf{R})$. Suppose that I is (i) even, (ii) bounded below in H, (iii) $I(0) = 0$, and (iv) satisfies
(PS). Suppose also that (v) $\exists E \subset H$ homeomorphic to $S^{j-1}$ by an odd map with the property that

$$\sup_{u \in E} I(u) < 0.$$

Then I possesses at least j distinct pairs of critical points with each critical value strictly less than 0.

In order to prove Theorem 1, we let $I(u)$ be defined by (2.1) for $u \in H^1_{p,\rho}$. The proof of Theorem 1 will be complete if we can show that, with $H = H^1_{p,\rho}$, the conditions in the hypothesis of Theorem 1 imply that $I(u)$, which is in $C^1(H, \mathbf{R})$ by Lemma 1, satisfies all the other conditions in the hypothesis of Theorem C above.

It is clear from (2.1) that $I(u)$ meets conditions (i) and (iii) and, from Lemma 2, that $I(u)$ also satisfies condition (iv) in Theorem C. That $I(u)$ is bounded below for $u \in H$ follows from the fact that

$$I(u) \ge \int_\Omega \big[ |u|^q/q - \lambda |u|^2/2 - G(x,u) \big] \rho,$$

and the right-hand side of this inequality is bounded below for $u \in H$ because of (2.6), (2.7), and the fact that $q > 2$. (Note, $|t|/q - \gamma_1 |t|^{2/q} - \gamma_2 |t|^{1/q}$ is bounded below for $t \in \mathbf{R}$ and $\gamma_1, \gamma_2$ positive constants.)

So to complete the proof of Theorem 1, it remains to show that condition (v) in Theorem C also holds. In order to accomplish this, we set

$$E_r = \Big\{ u = \sum_{i=1}^{j} \alpha_i \varphi_i : \sum_{i=1}^{j} \alpha_i^2 = r^2 \Big\} \tag{3.1}$$

where the $\varphi_i$ are the eigenfunctions in ($V_L$-1) and ($V_L$-2) above. We also set

$$\eta(u) = (\alpha_1/r, \dots, \alpha_j/r) \quad \text{for } u \in E_r.$$

So $\eta(u) = -\eta(-u)$ and $\eta : E_r \to S^{j-1}$ in a continuous manner. Also, it is clear that $\eta$ is one-to-one and onto, and that $\eta^{-1}$ is also continuous. Hence for each $r > 0$, $E_r$ is homeomorphic to $S^{j-1}$ by an odd map, and we see condition (v) will be established if we can find an $r_0 > 0$ such that

$$\sup_{u \in E_{r_0}} I(u) < 0. \tag{3.2}$$

It follows from Lemma 3 and (3.1) that

$$\langle u, u \rangle_\rho = \sum_{i=1}^{j} \alpha_i^2 \quad \text{and} \quad \mathcal{L}(u,u) = \sum_{i=1}^{j} \lambda_i \alpha_i^2$$
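for $u \in E_r$. Consequently, we obtain from (2.1) that

$$I(u) \le \sum_{i=1}^{j} (\lambda_i - \lambda)\alpha_i^2/2 + \int_\Omega |u|^q \rho/q \quad \forall u \in E_r, \tag{3.3}$$

where we have also made use of the fact that $-G(x,u) \le 0$ by (g-3), (g-4), and (2.2). By Theorem A and (1.9) above

$$\int_\Omega |u|^q \rho \le (K_q)^q\, [\mathcal{L}(u,u) + \langle u, u \rangle_\rho]^{q/2}$$

for $u \in E_r$. It is also easy to see from Lemma 3 and (3.1) that $\mathcal{L}(u,u) + \langle u, u \rangle_\rho \le (\lambda_j + 1) r^2$ $\forall u \in E_r$. We apply these last two inequalities to the situation in (3.3) to obtain

$$I(u) \le \sum_{i=1}^{j} (\lambda_i - \lambda)\alpha_i^2/2 + (K_q)^q (\lambda_j + 1)^{q/2} r^q/q \tag{3.4}$$

for $u \in E_r$. But for $i = 1, \dots, j$, $\lambda_i - \lambda \le \lambda_j - \lambda = -\varepsilon$ where $\varepsilon > 0$. Using this in (3.4), we obtain that

$$I(u) \le -\varepsilon r^2/2 + (K_q)^q (\lambda_j + 1)^{q/2} r^q/q \tag{3.5}$$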
for $u \in E_r$. However, $q > 2$, and $r^q = r^{q-2} r^2$. Also, there is an $r_0 > 0$ such that $(K_q)^q (\lambda_j + 1)^{q/2} r^{q-2} < \varepsilon/2$ for $0 < r \le r_0$. We conclude from (3.5) that $I(u) \le -\varepsilon r^2/4$ for $0 < r \le r_0$ and $u \in E_r$. Consequently, (3.2) does indeed hold. Condition (v) in Theorem C is established, and the proof of Theorem 1 is complete.

4. Examples of Simple $V_L$-Regions. In this section, we give two examples of Simple $V_L$-regions, the first of which is bounded and the second unbounded.

For the first example, we take $\Omega \subset \mathbf{R}^2$ to be the rectangle $(0,1) \times (-1,1)$ and $\Gamma = \{(1,s) : -1 \le s \le 1\}$. We observe that $\Gamma \subset \partial\Omega$ and set $\rho = x_1$, $p_1 = x_1$, and $p_2 = x_1(1 - x_2^2)$. Then it is easy to see that $(\Omega,\Gamma)$ is a $V_L$-region with the CONS on $L^2_\rho(\Omega)$ given by $\{\varphi_{m,n}\}_{m=1,n=0}^{\infty,\infty}$ where

$$\varphi_{m,n}(x_1,x_2) = a_{m,n} J_0(k_m x_1) P_n(x_2) \tag{4.1}$$

with $a_{m,n}$ a normalizing constant, $J_0(t)$ the familiar Bessel function of the first kind of order 0, and $k_m$ the m-th positive root of $J_0(t)$. Also, $P_n(t)$ is the n-th Legendre polynomial. We see from [BD, p. 234 and p. 619] and (4.1) that with Lu given by (1.5)

$$L\varphi_{m,n} = \rho\,[k_m^2 + n(n+1)]\,\varphi_{m,n}.$$

So, $(\Omega,\Gamma)$ is indeed a $V_L$-region with $\lambda_1 = k_1^2$ and $\varphi_1(x_1,x_2) = a_{1,0} J_0(k_1 x_1)$.

To see that $(\Omega,\Gamma)$ is a Simple $V_L$-region, we take $\Omega_1 = (0,1)$, $\Omega_2 = (-1,1)$, $p_1^*(x_1) = x_1$, $\rho_1^*(x_1) = x_1$, $p_2^*(x_2) = (1 - x_2^2)$, $\rho_2^*(x_2) = 1$. Then we have that $p_1(x_1,x_2) = p_1^*(x_1)\rho_2^*(x_2)$, $p_2(x_1,x_2) = \rho_1^*(x_1)p_2^*(x_2)$, and $\rho(x_1,x_2) = \rho_1^*(x_1)\rho_2^*(x_2)$. Hence, conditions (i), (ii), and (iii) in the definition of a Simple $V_L$-region are satisfied. It remains to show that condition (iv) holds for our example. This is done explicitly on page 24 of [Sh]. So our first example is completely established.

For our second example, we take $\Omega \subset \mathbf{R}^3$ to be the unbounded open set $\Omega = (-1,1) \times (0,1) \times (0,\infty)$ and $\Gamma = \{(r,i,t) : -1 \le r \le 1,\ i = 0,1,\ 0 \le t < \infty\} \cup \{(r,s,0) : -1 \le r \le 1,\ 0 \le s \le 1\}$. We observe that $\Gamma \subset \partial\Omega$ and take $p_1(x_1,x_2,x_3) = (1 - x_1^2)e^{-x_3}$, $p_2(x_1,x_2,x_3) = e^{-x_3}$, $p_3(x_1,x_2,x_3) = (1 + x_3^2)^{-1}$, and $\rho(x_1,x_2,x_3) = e^{-x_3}$. It turns out that

$$\Phi_{j,k,m}(x_1,x_2,x_3) = P_j(x_1) \sin(k\pi x_2)\, \psi_m(x_3)\, a_{j,k,m} \tag{4.2}$$

for $j = 0, 1, 2, \dots$ and $k, m = 1, 2, \dots$, is a CONS on $L^2_\rho(\Omega)$, where $a_{j,k,m}$ is a normalizing constant, $P_j(t)$ is the j-th Legendre polynomial, and $\{\psi_m(x_3)\}_{m=1}^{\infty}$ is a CONS on $L^2_{\rho_3^*}(0,\infty)$ where $\rho_3^*(x_3) = e^{-x_3}$. $\psi_m(x_3)$ is discussed on page 22 of [Sh] where it is shown that $\psi_m(x_3) \in C^2[0,\infty)$ with $\psi_m(0) = 0$. Furthermore, it is shown that there is a sequence $0 < \eta_1 < \eta_2 < \cdots < \eta_m \to \infty$ such that

$$-D_3[(1 + x_3^2)^{-1} D_3 \psi_m](x_3) = \eta_m \psi_m(x_3) e^{-x_3} \quad \text{for } 0 < x_3 < \infty \tag{4.3}$$

where $D_3 = d/dx_3$ and $\int_0^\infty (x_3^2 + 1)^{-1} |D_3\psi_m(x_3)|^2\, dx_3 < \infty$. As a consequence of all this, we see from (4.2) that $\Phi_{j,k,m}(x_1,x_2,x_3) \in H^1_{p,\rho}(\Omega,\Gamma) \cap C^2(\Omega)$, and hence that ($V_L$-1) holds. From the property elucidated about $\psi_m(x_3)$ in (4.3) above and from well-known facts about Legendre polynomials, we also see that

$$L\Phi_{j,k,m}(x_1,x_2,x_3) = \rho\,[j(j+1) + (k\pi)^2 + \eta_m]\,\Phi_{j,k,m}(x_1,x_2,x_3) \tag{4.4}$$

where Lu is given by (1.5) and $p_1$, $p_2$, $p_3$, and $\rho$ are given above. As a consequence of (4.4) and other facts given about $\psi_m(x_3)$ on page 22 of [Sh], we obtain that

$$\mathcal{L}(\Phi_{j,k,m}, v) = [j(j+1) + (k\pi)^2 + \eta_m]\, \langle \Phi_{j,k,m}, v \rangle_\rho$$

for all $v \in H^1_{p,\rho}(\Omega,\Gamma)$, and hence that ($V_L$-2) holds where $\varphi_1(x_1,x_2,x_3) = \Phi_{0,1,1}(x_1,x_2,x_3)$ and $\lambda_1 = (\pi^2 + \eta_1)$. Consequently, $(\Omega,\Gamma)$ is a $V_L$-region.

To see that $(\Omega,\Gamma)$ is a Simple $V_L$-region, we take $\Omega_1 = (-1,1)$, $\Omega_2 = (0,1)$, and $\Omega_3 = (0,\infty)$, $p_1^*(x_1) = (1 - x_1^2)$, $\rho_1^*(x_1) = 1$, $p_2^*(x_2) = 1$, $\rho_2^*(x_2) = 1$, $p_3^*(x_3) = (x_3^2 + 1)^{-1}$, $\rho_3^*(x_3) = e^{-x_3}$. Then we have that $p_1(x_1,x_2,x_3) = p_1^*(x_1)\rho_2^*(x_2)\rho_3^*(x_3)$, $p_2(x_1,x_2,x_3) = \rho_1^*(x_1)p_2^*(x_2)\rho_3^*(x_3)$, $p_3(x_1,x_2,x_3) = \rho_1^*(x_1)\rho_2^*(x_2)p_3^*(x_3)$, and $\rho(x_1,x_2,x_3) = \rho_1^*(x_1)\rho_2^*(x_2)\rho_3^*(x_3)$. So the conditions (i), (ii), and (iii) for $(\Omega,\Gamma)$ to be a Simple $V_L$-region are met. It remains to show that condition (iv) is met. But this is done explicitly on pages 23 and 24 of [Sh]. Hence, $(\Omega,\Gamma)$ is a Simple $V_L$-region, and our second example is complete.

We close with the comment that it is easy to construct other examples of Simple $V_L$-regions.
REFERENCES

[Am] A. Ambrosetti, On the existence of multiple solutions for a class of nonlinear boundary value problems, Rend. Sem. Mat. Univ. Padova, 49 (1972), 195-204.
[BD] W. E. Boyce and R. C. DiPrima, Elementary Differential Equations and Boundary Value Problems, Fifth Edition, John Wiley & Sons, New York, 1992.
[Cl] D. C. Clark, A variant of the Ljusternik-Schnirelmann theory, Indiana Univ. Math. J. 22 (1972), 65-74.
[He] J. A. Hempel, Multiple solutions for a class of nonlinear boundary value problems, Indiana Univ. Math. J. 20 (1971), 983-996.
[Ra] P. H. Rabinowitz, Minimax methods in critical point theory with applications to differential equations, CBMS Regional Conference Series Math. 65, Amer. Math. Soc., Providence, 1986.
[Sh] V. L. Shapiro, Singular quasilinearity and higher eigenvalues, Memoirs of the AMS 726, Amer. Math. Soc., Providence, 2001.
Doubly Stochastic Operators and the History of Birkhoff's Problem 111
Sheila King
Ray Shiflett
Department of Mathematics
California State Polytechnic University
Pomona, CA 91768
Introduction

In 1923, while deriving inequalities between the characteristic roots and diagonal elements of Hermitian matrices, Schur [30] found the solution was dependent upon the existence of n × n positive matrices with row and column sums all being one. Some thirty years later, in his famous book on probability, Feller [12] named such matrices "doubly stochastic".

The original edition of Garrett Birkhoff's Lattice Theory text was written between 1937 and 1939 and published in 1940 [1]. Birkhoff subsequently published a revised edition in 1948 and a third edition in 1967. Birkhoff presented the following statement in his 1948 version of Lattice Theory [3], on page 266 in Exercise 4:

Ex. 4. Let $\|t_{ij}\|$ be any matrix of non-negative terms from an ordered field, such that the sum of the terms in each row and column is one. Show that it is a weighted mean of permutation matrices.

This statement is now referred to as Birkhoff's Theorem. Birkhoff gave a proof of it in Tres Observaciones sobre Algebra Lineal [2]. In his original Lattice Theory, Birkhoff posed seventeen "unsolved" problems. By 1948, eight of these problems had been solved. In his revised edition of Lattice Theory [3], Birkhoff offered one hundred and eleven unsolved problems. The last one, Problem 111, has been extensively studied. The problem as it appears in the 1948 Lattice Theory text simply challenges us as follows:

Problem 111: Extend the result of Ex. 4 to the infinite-dimensional case, under suitable hypotheses.
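The finite-dimensional statement in Ex. 4 can be illustrated with a small computation. The following is only a sketch under stated assumptions, not a method taken from the text: it assumes a small doubly stochastic matrix with exact rational entries, and it peels off permutation matrices greedily using a brute-force search over all permutations, so it is practical only for small n. The names `birkhoff_decomposition`, `M`, and `terms` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def birkhoff_decomposition(matrix):
    """Write a doubly stochastic matrix as a weighted mean of permutations.

    Returns a list of (weight, perm) pairs, where perm[i] is the column
    holding the 1 in row i of the corresponding permutation matrix.
    Brute force over all permutations, so intended for small n only.
    Raises StopIteration if the input is not doubly stochastic.
    """
    n = len(matrix)
    rest = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x != 0 for row in rest for x in row):
        # Find a permutation supported on strictly positive entries.
        perm = next(
            p for p in permutations(range(n))
            if all(rest[i][p[i]] > 0 for i in range(n))
        )
        # Remove as much of that permutation matrix as possible.
        weight = min(rest[i][perm[i]] for i in range(n))
        for i in range(n):
            rest[i][perm[i]] -= weight
        terms.append((weight, perm))
    return terms


# Hypothetical sample matrix: every row and column sums to one.
M = [
    [Fraction(1, 2), Fraction(1, 2), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
]
terms = birkhoff_decomposition(M)
for weight, perm in terms:
    print(weight, perm)
```

Each pass zeroes at least one entry of the remainder, so the loop terminates; exact fractions are used so that the weights sum to exactly one without rounding concerns.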
By 1967, when Birkhoff published the third and final edition of Lattice Theory [4], considerable work had been done on Problem 111. In the third edition, Birkhoff offered no "unsolved" problems, but did give numerous exercises. On page 394, exercise 8 asks the reader to prove Birkhoff's Theorem.
1. Finite Case

The matrices noted by Schur and named doubly stochastic by Feller arise in many different areas of mathematics. One of the more familiar areas is that of stochastic processes. A stochastic process is a family of random variables $\{x_t \mid t \in A\}$ defined on a common probability space and indexed on some set $A$. $A$ is often taken to be time, space, or some other set on which the $x_t$ are indexed. If $A$ is time, we can think of the $x_t$ as representing the possible outcomes of a random variable at different points in time. The probability that a random variable which is in state $i$ at time $t$ is in state $j$ at time $t+1$ is a transition probability: the probability that in one unit of time the random variable will go from state $i$ to state $j$. Given a stochastic process with $n$ possible states, a probability transition matrix, $P$, is the $n \times n$ matrix whose entry $p_{ij}$ in the $i$th row and $j$th column is the transition probability from state $i$ to state $j$. The sum of the entries on any given row in $P$ is equal to 1, since a random variable currently in state $i$ must, at the end of the next unit of time, be in one of the possible states available. Thus, it will proceed to some state in the state space with probability 1 in the next unit of time. Consequently, each row of $P$ is a probability vector.

Definition 1.1: An $n \times n$ matrix is called a stochastic matrix iff each row is an $n$-dimensional probability vector, that is, iff $\sum_j p_{ij} = 1$ and $p_{ij} \ge 0$ for every $i, j \in \{1, 2, \dots, n\}$.

Stochastic Operators

An $n \times n$ matrix, $P$, maps the set of $n$-tuples to itself by the operation of vector-matrix multiplication. Each $n$-tuple represents a function from $\{1, 2, \dots, n\}$ to the reals, $R$. Let $\ell^n$ be the set of functions $f : \{1, 2, 3, \dots, n\} \to R$. Hence, $P$ induces a linear operator, $T$, which takes $\ell^n$ to $\ell^n$, where $f = (x_1, \dots, x_n) \in \ell^n$ and $Tf = fP$. On $\ell^n$, we impose the norm $\|f\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$, $1 \le p < \infty$, and we write $\ell_p^n$ to denote the normed linear space. If $p = \infty$, we use the norm $\|f\|_\infty = \max\{|x_i| : 1 \le i \le n\}$.

We will represent the standard unit basis vector in $\ell^n$ that has a one in the $k$th coordinate and zeros elsewhere by $e_k$. We represent the function $f \in \ell^n$, where $f(k) = x_k$, by either $(x_k)$ or $\sum_{k=1}^n x_k e_k$. We will denote the function that is constantly one everywhere on its domain by the symbol $\mathbf{1}$. All norms are equivalent on the vector space $V$ iff $V$ is finite dimensional, and the closed unit ball of $V$ is compact iff $V$ is finite dimensional [10].
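The definitions above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names `is_stochastic` and `apply_operator` and the sample matrix are hypothetical and do not come from the text. The sketch checks Definition 1.1 and computes $Tf = fP$ for a row vector $f$; because every row of $P$ sums to one, the coordinates of $Tf$ have the same sum as those of $f$.

```python
from fractions import Fraction

def is_stochastic(P):
    """Definition 1.1: every row is a probability vector."""
    return all(all(x >= 0 for x in row) and sum(row) == 1 for row in P)

def apply_operator(f, P):
    """Return Tf = fP, treating f as a row vector."""
    n = len(P)
    return [sum(f[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical 3-state transition matrix (exact arithmetic via Fraction).
half, quarter = Fraction(1, 2), Fraction(1, 4)
P = [[half, half, 0],
     [0, 1, 0],
     [quarter, quarter, half]]

f = [Fraction(3), Fraction(-1), Fraction(2)]
Tf = apply_operator(f, P)
```

Exact rational arithmetic is used so that the row-sum test can be an equality rather than a tolerance check.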
Definition 1.2: An operator, $T$, induced by the matrix, $P$, is called a stochastic operator iff $P$ is stochastic.

Theorem 1.3: The positive linear operator, $T$ on $\ell^n$, is stochastic iff $\sum Tf = \sum f$ $\forall f \in \ell^n$. (Here $\sum f$ denotes the sum of the coordinates of $f$.)

Proof: Let $T$ be stochastic. Then $T$ is induced by the stochastic matrix $P = (p_{ij})$, where $\sum_{j=1}^n p_{ij} = 1$ $\forall i$. Let $f \in \ell^n$ be represented by $(x_1, x_2, \dots, x_n)$; then
$$\sum Tf = \sum_{j=1}^n \sum_{i=1}^n x_i p_{ij} = \sum_{i=1}^n x_i \sum_{j=1}^n p_{ij} = \sum_{i=1}^n x_i = \sum f.$$
Now assume that $\forall f \in \ell^n$, $\sum Tf = \sum f$, and let $f = e_i$. Then $\sum e_i = 1$ and $\sum Te_i = \sum_j p_{ij} = 1$. So $\sum_j p_{ij} = 1$ $\forall i$. Thus $P$ is stochastic.

Definition 1.4: Let $T$ be the linear operator on $\ell_p^n$ induced by an $n \times n$ matrix. The norm of $T$ is defined to be $\|T\|_p = \sup_{f \ne 0} \frac{\|Tf\|_p}{\|f\|_p}$, for $1 \le p \le \infty$. Note that $\|T\|_p = \sup\{\|Tf\|_p : \|f\|_p = 1\}$.

Definition 1.5: A linear operator, $T$, is a positive linear operator iff $Tf \ge 0$ whenever $f \ge 0$.

Theorem 1.6: Let $T$ be a linear operator induced by the $n \times n$ matrix $(p_{ij})$. $T$ is positive iff $p_{ij} \ge 0$ $\forall i, j$.

Proof: Let $e_i$ be the standard basis vector. Then, given $i$, $Te_i = (p_{i1}, \dots, p_{in}) \ge 0 \Rightarrow p_{ij} \ge 0$ $\forall j$. Now assume $p_{ij} \ge 0$. Clearly $Tf \ge 0$ whenever $f \ge 0$.

By Theorem 1.6, every stochastic operator is positive.

Theorem 1.7: If $T$ is a positive linear operator, then $T|f| \ge |Tf|$ $\forall f \in \ell^n$.

Proof: $Tf \ge 0$ whenever $f \ge 0$, so if $f - g \ge 0$ then $T(f - g) = Tf - Tg \ge 0$. Therefore $Tf \ge Tg$. Since $|f| \ge f$ and $|f| \ge -f$, we have $T|f| \ge Tf$ and $T|f| \ge -Tf$. Thus $T|f| \ge |Tf|$.

By $P^*$ we will denote the transpose of the matrix $P$. That is, if $P = (p_{ij})$ then $P^* = (p_{ji})$. Therefore $(P^*)^* = P$. Also observe that for the row vector $f = (x_1, x_2, \dots, x_n)$, $f^*$ is a column vector, and vice versa. We will simply use $f$ to denote either of these, and let the reader determine the type of vector from the context. Note that matrix multiplication requires that the $f$ in the product $fP$ be a row vector and that, in the product $Pf$, $f$ must be a column vector. Because we will not require strict use of the transpose notation to indicate where a row or column vector is required, we will be able to write $(Pf)^* = f^*P^* = fP^* = Pf$. If $T$ is the linear operator induced by $P$, then
$T^*$ will represent the linear operator induced by $P^*$. By Theorem 1.6, if $T$ is positive then so is $T^*$.

Theorem 1.8: A positive linear operator, $T$, defined on $\ell^n$ is stochastic iff $T^*\mathbf{1} = \mathbf{1}$.

Proof: Let $T$ be a stochastic operator induced by the stochastic matrix $P$. Then $T^*\mathbf{1} = \mathbf{1}P^* = P\mathbf{1} = \mathbf{1}$. Now let $T$ be a positive linear operator induced by $(p_{ij})$ and let $T^*\mathbf{1} = \mathbf{1}P^* = P\mathbf{1} = \mathbf{1}$. Since $P\mathbf{1}$ is the vector whose entries are the row sums of $P$, we have $\sum_j p_{ij} = 1$ $\forall i$. Thus $P$ is stochastic, and therefore $T$ is stochastic.

Theorem 1.9: If $M$ is a positive linear operator defined on $\ell^n$ and if for some $g \in \ell^n$, with $\|g\|_\infty = 1$, we have $\|Mg\|_\infty = 1$, then $\|M\|_\infty = 1$.

Proof: Let $f$ be in $\ell^n$ with $\|f\|_\infty = 1$, i.e. the largest $|x_i|$ in $|f|$ is 1. Thus $|f| \le \mathbf{1}$ and, since $M$ is positive, $M|f| \le M\mathbf{1}$. So for all $f$ in $\ell^n$ with $\|f\|_\infty = 1$, we have $|Mf| \le M|f| \le M\mathbf{1}$ by Theorem 1.7. Therefore $\|Mf\|_\infty \le \|M|f|\|_\infty \le \|M\mathbf{1}\|_\infty \le 1$ for all $f$ in $\ell^n$ with $\|f\|_\infty = 1$, yielding $\|M\|_\infty \le 1$. Since $\|g\|_\infty = 1$ and $\|Mg\|_\infty = 1$, we have $\|M\|_\infty = 1$.

Corollary 1.10: If $T$ is stochastic, then $\|T^*\|_\infty = 1$.

Theorem 1.11: If $T$ is stochastic on $\ell^n$, then $\|T^*\|_p = 1$, $1 \le p \le \infty$.

Proof: The case where $p = \infty$ is taken care of by Corollary 1.10. Let $p = 1$. If $T$ is stochastic then $T^*$ is positive and, by Theorem 1.8, $T^*\mathbf{1} = \mathbf{1}$. By Theorem 1.7, $|T^*f| \le T^*|f|$, which implies $\|T^*f\|_1 = \sum |T^*f| \le \sum T^*|f| = \sum |f| = \|f\|_1$. Therefore $\|T^*\|_1 \le 1$. Since $T^*\mathbf{1} = \mathbf{1}$, we have $\|T^*\|_1 = 1$. Now let $1 < p < \infty$, let $\frac{1}{p} + \frac{1}{q} = 1$, and let $f \in \ell^n$ be represented by $(x_i) = \sum x_i e_i$. We use the Hölder inequality to obtain $|T^*f|^p \le \left[\sum |x_i| T^*e_i\right]^p \le \sum |x_i|^p T^*e_i$, since $T^*\left(\sum e_i\right) = T^*\mathbf{1} = \mathbf{1}$. So we have $|T^*f|^p \le \sum |x_i|^p T^*e_i = T^*|f|^p$, which implies $\|T^*f\|_p^p \le \left\| |f|^p \right\|_1 = \|f\|_p^p$, and since $T^*\mathbf{1} = \mathbf{1}$ we have $\|T^*\|_p = 1$.
Doubly Stochastic Matrices and Operators

Definition 1.12: The $n \times n$ stochastic matrix $P$ is a doubly stochastic matrix iff $P^*$ is stochastic. The stochastic operator $T$ is a doubly stochastic operator iff it is induced by a doubly stochastic matrix, that is, iff $T$ and $T^*$ are both stochastic.

Theorem 1.13: The positive linear operator $T$ is doubly stochastic iff $\sum Tf = \sum T^*f = \sum f$ $\forall f \in \ell^n$.
Proof: $T$ is doubly stochastic iff $T$ and $T^*$ are stochastic. The result follows immediately by Theorem 1.3.

Theorem 1.14: For a positive linear operator, $T$, defined on $\ell^n$, the following statements are equivalent:
1. $T$ is doubly stochastic.
2. $T\mathbf{1} = T^*\mathbf{1} = \mathbf{1}$.
3. $\sum Tf = \sum T^*f = \sum f$ $\forall f \in \ell^n$.

Proof: Assume $T \ge 0$. The fact that $T$ is doubly stochastic gives $T\mathbf{1} = T^*\mathbf{1} = \mathbf{1}$ by Theorem 1.8. Also by Theorem 1.8, $T^*\mathbf{1} = \mathbf{1}$ shows $T$ is stochastic and $T\mathbf{1} = \mathbf{1}$ shows $T^*$ is stochastic. Therefore, $T$ is doubly stochastic. The equivalence of (1) and (3) follows directly from Theorem 1.13.

Corollary 1.15: A linear operator, $T$, defined on $\ell^n$, is doubly stochastic iff $T$ is positive, $T\mathbf{1} = \mathbf{1}$, and $\sum Tf = \sum f$ $\forall f \in \ell^n$.

Corollary 1.16: If $T$ is doubly stochastic, then $\|T\|_p = \|T^*\|_p = 1$ $\forall\, 1 \le p \le \infty$.
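As a numerical sanity check on the characterization above, the following Python sketch tests whether a matrix is doubly stochastic and compares $\|Tf\|_p$ with $\|f\|_p$ for a few values of $p$. The helper names, the sample matrix, and the tolerance are our own assumptions for illustration; a check on one vector illustrates the norm bound but of course does not prove it.

```python
def is_doubly_stochastic(P, tol=1e-12):
    """Nonnegative entries with every row sum and column sum equal to 1."""
    n = len(P)
    nonneg = all(x >= 0 for row in P for x in row)
    rows_ok = all(abs(sum(P[i]) - 1) <= tol for i in range(n))
    cols_ok = all(abs(sum(P[i][j] for i in range(n)) - 1) <= tol
                  for j in range(n))
    return nonneg and rows_ok and cols_ok

def apply_operator(f, P):
    """Return Tf = fP, treating f as a row vector."""
    n = len(P)
    return [sum(f[i] * P[i][j] for i in range(n)) for j in range(n)]

def p_norm(f, p):
    """The l_p norm of f; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(x) for x in f)
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

# Hypothetical doubly stochastic matrix and test vector.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.25, 0.5],
     [0.25, 0.25, 0.5]]
f = [3.0, -1.0, 2.0]
Tf = apply_operator(f, P)
ones_image = apply_operator([1.0, 1.0, 1.0], P)  # T1 should equal 1
```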
Convex Sets and Extreme Points

Definition 1.17: In a real linear space, a non-empty set $C$ is called a convex set iff for any $c_1$ and $c_2$ in $C$ and real numbers $\lambda_1 \ge 0$ and $\lambda_2 \ge 0$ with $\lambda_1 + \lambda_2 = 1$, we have $\lambda_1 c_1 + \lambda_2 c_2 \in C$. We say $\lambda_1 c_1 + \lambda_2 c_2$ is a convex combination of $c_1$ and $c_2$.

Definition 1.18: Let $C$ be a convex set. $x \in C$ is an extreme point of $C$ iff $x$ is not a convex combination of any two other points in $C$.

The above definition implies that if $y \in C$ and $z \in C$ with $\lambda_1 > 0$, $\lambda_2 > 0$ and $\lambda_1 + \lambda_2 = 1$, then if $\lambda_1 y + \lambda_2 z = x$, either $y = z = x$ or else $x$ is not extreme.

Example 1.19: In the linear space $\ell^n$, the set of all probability vectors is a convex set and the extreme points are the standard basis vectors. Every probability vector can be uniquely written as a convex combination of these extreme points.

We will denote the set of stochastic operators on $\ell^n$ by $S_n$ and the set of doubly stochastic operators by $D_n$. The set of all linear operators on $\ell^n$ forms a real linear space. The algebra of this linear space is isomorphic to the algebra of the linear space of all $n \times n$ real matrices. When considering questions of convexity in this linear space, it is often easier to use the matrix representation of the operators.
Theorem 1.20: $S_n$ and $D_n$ are convex sets in the linear space of all linear operators on $\ell^n$.

Proof: Let $T_1$ and $T_2$ be stochastic and let $\lambda_1 \ge 0$, $\lambda_2 \ge 0$ be real numbers such that $\lambda_1 + \lambda_2 = 1$. Let $T = \lambda_1 T_1 + \lambda_2 T_2$. Then $T^*\mathbf{1} = \lambda_1 T_1^*\mathbf{1} + \lambda_2 T_2^*\mathbf{1} = \lambda_1 \mathbf{1} + \lambda_2 \mathbf{1} = \mathbf{1}$, so $T \in S_n$ by Theorem 1.8. Let $T_1$ and $T_2$ be in $D_n$. Then $T_1$ and $T_2$ are in $S_n$. So $T = \lambda_1 T_1 + \lambda_2 T_2 \in S_n$ and, since $T_1^*$ and $T_2^*$ are in $S_n$ by definition, we have $T^* = \lambda_1 T_1^* + \lambda_2 T_2^* \in S_n$. So $T \in D_n$ by definition.

Definition 1.21: The $n \times n$ doubly stochastic matrix $P$ is a permutation matrix iff it has exactly one 1 in each row and column. The linear operator $T$ induced by a permutation matrix will be called a permutation operator. Denote the set of $n \times n$ permutation operators by $P_n$ and notice that $P_n \subset D_n \subset S_n$.

Theorem 1.22: The stochastic operator $T$, induced by the $n \times n$ stochastic matrix $P$, is an extreme point of $S_n$ iff $P$ has only ones and zeros as entries.

Proof: Assume, for some $ij$, that $0 < p_{ij} < 1$. Then $\exists\, ik$, $k \ne j$, such that $0 < p_{ik} < 1$. Let $A$ and $B$ be two $n \times n$ matrices such that $A = B = P$ for all entries except $ij$ and $ik$. Let $0 < \varepsilon < \min\{p_{ij}, p_{ik}\}$ and let $a_{ij} = p_{ij} + \varepsilon$, $a_{ik} = p_{ik} - \varepsilon$, $b_{ij} = p_{ij} - \varepsilon$, and $b_{ik} = p_{ik} + \varepsilon$. Then $A$ and $B$ are stochastic, $A \ne B$ and $P = \frac{1}{2}(A + B)$. Thus $T$ is not an extreme point of $S_n$. Now assume every $p_{ij}$ in $P$ is either 1 or 0 and that $A$ and $B$ are stochastic matrices such that $P = \frac{1}{2}(A + B)$. If $p_{ij} = 1$, then $\frac{1}{2}(a_{ij} + b_{ij}) = 1 \Rightarrow a_{ij} = b_{ij} = p_{ij} = 1$. If $p_{ij} = 0$, then $\frac{1}{2}(a_{ij} + b_{ij}) = 0 \Rightarrow a_{ij} = b_{ij} = p_{ij} = 0$. So $A = B = P$ and therefore $T$ is extreme in $S_n$.

Theorem 1.22 shows that every permutation matrix is an extreme point of $S_n$. We now turn our attention to finding the extreme points of $D_n$. We will use a well known technique that employs paths and loops connecting entries in the matrix. These ideas have been successfully extended and employed in the infinite settings in [33, 7, 24].

Definition 1.23: A path in the positive matrix $M = (m_{ij})$ is a finite set $\{m_{r_1 c_1}, \dots, m_{r_u c_u}\}$ of distinct entries from $M$, such that $0 < m_{r_s c_t}$
A path ends if it ever returns to a row or column already visited. So if $\{m_{r_1 c_1}, \dots, m_{r_u c_u}\}$ is a path that started at $m_{r_1 c_1}$ and ended at $m_{r_u c_u}$, then either $r_u = r_1$ or $c_u = c_1$, and $r_i \ne r_j$ or $c_i \ne c_j$ for any other $i$ and $j$ between 1 and $u$. A sequence $\{m_{r_1 c_1}, \dots, m_{r_u c_u}, m_{r_1 c_1}\}$ is a loop from $m_{r_1 c_1}$ through $m_{r_u c_u}$ to $m_{r_1 c_1}$ iff $\{m_{r_1 c_1}, \dots, m_{r_u c_u}\}$ is a path that started at $m_{r_1 c_1}$ and ended at $m_{r_u c_u}$, and if $r_1 = r_2$ then $c_u = c_1$, and if $c_1 = c_2$ then $r_u = r_1$. From this definition, we see that a loop starting at and returning to $m_{r_1 c_1}$ has an even number of steps.

If a matrix has two distinct paths from $m_{r_1 c_1}$ that end at $m_{r_u c_u}$, then one can produce a loop from $m_{r_1 c_1}$ to $m_{r_1 c_1}$ using some, perhaps not all, of the elements from these two paths. So if $\{m_{r_1 c_1}, \dots, m_{r_u c_u}\}$ is a path in a matrix that has no loops, it is the unique path connecting $m_{r_1 c_1}$ and $m_{r_u c_u}$. If $0 < m_{r c_u} < 1$ is another element in the $c_u$ column, the only path from $m_{r_1 c_1}$ to $m_{r c_u}$ is the unique path through $m_{r_u c_u}$. Therefore there is only one path from $m_{r_1 c_1}$ to column $c_u$ and it enters this column at $m_{r_u c_u}$. A similar statement holds for rows.

Lemma 1.24: The $n \times n$ doubly stochastic matrix $P$ has an entry $p_{r_1 c_1} \in (0,1)$ iff $P$ has a loop. Consequently, $P$ is a permutation matrix iff it has no loops.

Proof: By definition, if $P$ has a path, then $P$ has a non-zero entry less than 1. If $P$ is doubly stochastic and if $p_{r_1 c_1} \in (0,1)$, then $\exists\, p_{r_1 c_2} \in (0,1)$ and so $\exists\, p_{r_2 c_2} \in (0,1)$. We can continue until, by finiteness of $P$, we must return to a previous row or column, yielding a loop.

Theorem 1.25: The doubly stochastic operator $T$ is an extreme point of $D_n$ iff $T \in P_n$.

Proof: Let $T \in P_n$ and let $P$ be the permutation matrix that induces $T$. If $P = \lambda A + (1 - \lambda)B$ where $0 < \lambda < 1$ and $A$ and $B$ are doubly stochastic, then for each $ij$ either $1 = \lambda a_{ij} + (1 - \lambda) b_{ij} \le 1$ and $a_{ij} = b_{ij} = 1$, or $0 = \lambda a_{ij} + (1 - \lambda) b_{ij} \ge 0$ so $a_{ij} = b_{ij} = 0$, which implies $A = B$ and therefore $T$ is extreme in $D_n$. Now assume $T \notin P_n$ and that $Tf = fP$. So $P$ is not a permutation matrix. By Lemma 1.24, there is a loop in $P$: $p_{r_1 c_1}, \dots, p_{r_u c_u}, p_{r_1 c_1}$. Let $d = \frac{1}{2}\min\{p_{r_1 c_1}, \dots, p_{r_u c_u}\}$. Let $A = (a_{ij})$ be defined by $a_{ij} = p_{ij}$ if the entry $ij$ is not in the loop and $a_{r_1 c_1} = p_{r_1 c_1} + d$, then alternately subtract and add $d$ around the loop. $B$ is defined in the same manner with the addition and subtraction of $d$ reversed on the loop. Clearly $A$ and $B$ are doubly stochastic and $P = \frac{1}{2}(A + B)$. So $T$ is not extreme in $D_n$.
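The loop construction in the proof of Theorem 1.25 can be sketched in Python as follows. The function name `split_along_loop`, the representation of a loop as a list of `(row, column)` index pairs, and the sample matrix are assumptions made here for illustration. The caller is assumed to supply a valid loop: successive entries alternately share a row and a column, and the last entry shares a row or column with the first.

```python
from fractions import Fraction

def split_along_loop(P, loop):
    """Build A and B with P = (A + B)/2 by alternately adding and
    subtracting d = (1/2) * min(loop entries) around the loop."""
    d = min(P[r][c] for r, c in loop) / 2
    A = [row[:] for row in P]
    B = [row[:] for row in P]
    for k, (r, c) in enumerate(loop):
        sign = 1 if k % 2 == 0 else -1   # +d, -d, +d, ... for A
        A[r][c] += sign * d
        B[r][c] -= sign * d              # reversed signs for B
    return A, B

def is_doubly_stochastic(M):
    """Exact check: nonnegative entries, all row and column sums equal 1."""
    n = len(M)
    return (all(x >= 0 for row in M for x in row)
            and all(sum(M[i]) == 1 for i in range(n))
            and all(sum(M[i][j] for i in range(n)) == 1 for j in range(n)))

h = Fraction(1, 2)
P = [[h, h, 0],
     [h, 0, h],
     [0, h, h]]
# Hypothetical loop: row 0, column 1, row 2, column 2, row 1, column 0.
loop = [(0, 0), (0, 1), (2, 1), (2, 2), (1, 2), (1, 0)]
A, B = split_along_loop(P, loop)
```

Because the loop has an even number of steps, each row and each column touched by the loop receives one $+d$ and one $-d$, which is why the row and column sums are unchanged.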
When Birkhoff published his Theorem in 1946, the language and theory of convex sets had not yet fully developed. It was over a decade later that mathematicians began to attack questions regarding doubly stochastic matrices using convexity as the mathematical setting.
Definition 1.26: The convex hull of a set, E, written chE, is the intersection of all convex sets containing E. Birkhoff's Theorem may now be restated as follows: Theorem 1.27 (Birkhoff's Theorem for the Finite Case) Dn = ch Pn
Proof: It is well known that for finite dimensional linear spaces, every closed and bounded convex set is the convex hull of its extreme points [28, p.167]. We wish to prove that $D_n$ is a closed and bounded subset of the finite dimensional space of linear operators defined on $\ell^n$. $D_n$ is bounded since every $T \in D_n$ has norm 1 by Corollary 1.16. Assume $(T_k)$ is a sequence from $D_n$ that converges in the operator norm to $T$. We want to show that $T \in D_n$. Using the $\ell_\infty^n$ norm, for every $f \in \ell^n$ we have $\max\{|T_k f - Tf|\} \le \|T_k - T\| \, \|f\|_\infty < \varepsilon$ for large $k$. So $T_k f \to Tf$ for all $f$. Let $f \ge 0$; then $T_k f \ge 0$ and $T_k f \to Tf$, so $Tf \ge 0$. Let $f = \mathbf{1}$ and we have $T_k f = \mathbf{1} \to Tf$, so $Tf = \mathbf{1}$. Finally, $\left|\sum f - \sum Tf\right| = \left|\sum T_k f - \sum Tf\right| = \left|\sum (T_k f - Tf)\right| \le \sum |T_k f - Tf| < n\varepsilon$. So $\sum f = \sum Tf$ and $T \in D_n$.

Corollary 1.28: $S_n$ is the convex hull of the stochastic operators induced by the zero-one stochastic matrices.
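The proof above is an existence argument. For small matrices, a convex combination of permutation matrices can also be computed explicitly. The Python sketch below uses a standard greedy procedure (repeatedly find a permutation supported on the positive entries, then subtract the largest possible multiple of it); this is a different technique from the extreme-point argument in the text, and all function names are our own. The perfect matching is found with a simple augmenting-path search.

```python
from fractions import Fraction

def perfect_matching(support):
    """support[i] lists the columns allowed for row i.
    Returns perm with perm[i] = column matched to row i, or None."""
    n = len(support)
    col_owner = [None] * n

    def try_row(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            if col_owner[j] is None or try_row(col_owner[j], seen):
                col_owner[j] = i
                return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            return None
    perm = [None] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm

def birkhoff_decomposition(P):
    """Return [(weight, perm), ...] with P = sum of weight * permutation."""
    n = len(P)
    R = [[Fraction(x) for x in row] for row in P]   # remaining mass
    terms = []
    while any(x > 0 for row in R for x in row):
        support = [[j for j in range(n) if R[i][j] > 0] for i in range(n)]
        perm = perfect_matching(support)
        if perm is None:
            raise ValueError("input is not doubly stochastic")
        weight = min(R[i][perm[i]] for i in range(n))
        for i in range(n):
            R[i][perm[i]] -= weight                 # zeroes at least one entry
        terms.append((weight, perm))
    return terms

def recombine(terms, n):
    """Sum the weighted permutation matrices back into one matrix."""
    M = [[Fraction(0)] * n for _ in range(n)]
    for weight, perm in terms:
        for i in range(n):
            M[i][perm[i]] += weight
    return M

h = Fraction(1, 2)
P = [[h, h, 0], [h, 0, h], [0, h, h]]
terms = birkhoff_decomposition(P)
```

Each pass zeroes at least one positive entry, so the loop runs at most $n^2$ times; the decomposition it returns is generally not unique.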
2. The Countably Infinite Case

Countably Infinite Stochastic Matrices and Operators

Definition 2.1: $P = (p_{ij})$ is an infinite matrix iff every row and column is a sequence of reals. $P$ is an infinite stochastic matrix iff $0 \le p_{ij} \le 1$ and $\sum_j p_{ij} = 1$ $\forall i$. $\ell_q$ will denote the Banach space of $q$-summable sequences of reals, for $1 \le q < \infty$, under the norm
$$\|f\|_q = \left(\sum_n |x_n|^q\right)^{1/q}.$$

The Banach space of bounded real sequences is denoted by $\ell_\infty$, with norm $\|f\|_\infty = \sup |x_n|$. Note that $f \in \ell_q$, with $1 \le q < \infty$, $\Rightarrow |x_n| \to 0$, and that $\ell_1 \subset \ell_2 \subset \cdots \subset \ell_\infty$. We will represent $f \in \ell_q$, where $f(k) = x_k$, by the sequence $(x_k)$ and by $\sum_{k=1}^\infty x_k e_k$, where $e_k$ is still the standard unit vector with one in the $k$th coordinate and zeros elsewhere. Note that $f_n = \sum_{k=1}^n x_k e_k \to f$ pointwise on $\{1, 2, 3, \dots\}$. $P^*$ is still the transpose of $P$.

Theorem 2.2: Let $P$ be an infinite stochastic matrix. $Tf = fP$ defines a positive, bounded, linear operator on $\ell_1$ with $\sum Tf = \sum f$ $\forall f \in \ell_1$. Conversely, let $T: \ell_1 \to \ell_1$ be a positive, bounded, linear operator with $\sum Tf = \sum f$ $\forall f \in \ell_1$. Then $\exists$ an infinite stochastic matrix, $P$, such that $Tf = fP$ $\forall f \in \ell_1$.

Proof: Let $P = (p_{ij})$ be a given infinite stochastic matrix and let $Tf = fP$ $\forall f \in \ell_1$. If $f = (x_n)$, let $Tf = fP = \left(\sum_i x_i p_{ij}\right)$, where $|x_i p_{ij}| \le |x_i|$ and $\sum |x_i| < \infty$. Therefore $T$ is positive and linear. It follows that $\|Tf\|_1 = \sum_j \left|\sum_i x_i p_{ij}\right| \le \sum_j \sum_i |x_i| p_{ij} = \sum_i \sum_j |x_i| p_{ij}$, since all terms are positive. So $\sum_i \left(|x_i| \sum_j p_{ij}\right) = \sum_i |x_i| = \|f\|_1 < \infty$. We have $T: \ell_1 \to \ell_1$, and $T$ is bounded with $\|T\| \le 1$. Finally, from absolute convergence,
$$\sum Tf = \sum_j \sum_i x_i p_{ij} = \sum_i x_i \sum_j p_{ij} = \sum_i x_i = \sum f.$$

Now suppose $T: \ell_1 \to \ell_1$ is a given positive bounded linear operator where $\sum Tf = \sum f$ $\forall f \in \ell_1$. Let $Te_i = (p_{ij})_j$ and let $P = (p_{ij})$. Since $T$ is positive, $p_{ij} \ge 0$. Since $\sum Te_i = \sum_j p_{ij}$ and $\sum Te_i = \sum e_i = 1$, $P$ is stochastic.

Definition 2.3: The positive bounded linear operator on $\ell_1$ induced by an infinite stochastic matrix is called a stochastic operator. The set of stochastic operators will be denoted by $S$.

Theorem 2.4: If $T \in S$ then $\|T\| = 1$.

Proof: Let $f \in \ell_1$; then $\|Tf\|_1 = \sum |Tf| \le \sum T|f| = \sum |f| = \|f\|_1$. Therefore, $\|T\| \le 1$. However, $\|Te_i\|_1 = \sum Te_i$ since $T$ is positive, so $\|Te_i\|_1 = \sum e_i = 1$ and hence $\|T\| = 1$.

Recall that $\ell_\infty$ is the dual space of $\ell_1$ and every $g \in \ell_\infty$ represents a bounded linear functional on $\ell_1$ by $\sum gf$, $\forall f \in \ell_1$. Similarly for $\ell_p$ and $\ell_q$ where $\frac{1}{p} + \frac{1}{q} = 1$. Consequently, for every bounded linear operator $T: \ell_1 \to \ell_1$ $\exists$ a dual operator $T^*: \ell_\infty \to \ell_\infty$ defined by $\sum g\,Tf = \sum T^*g\,f$ $\forall g \in \ell_\infty$ and $f \in \ell_1$.
One should be careful to note here that, unlike the finite case, there are stochastic operators, $T$, that are not defined on $\ell_\infty$. A simple example is the operator induced by the stochastic matrix, $P$, for which the first column is constantly 1 and all other entries are 0. Clearly $\mathbf{1}P$ is undefined.

Corollary 2.5: If $T \in S$ is induced by $P$, then $T^*$ is a positive linear operator mapping $\ell_\infty \to \ell_\infty$ such that $T^*g = gP^* = Pg$ $\forall g \in \ell_\infty$ and $\|T^*\| \le 1$.

Proof: Given $g \in \ell_\infty$, $\sum (T^*g)f = \sum g(Tf) = \sum g(fP) = \sum g(P^*f) = \sum (gP^*)f$ $\forall f \in \ell_1$ $\Rightarrow T^*g = gP^* = Pg$ $\forall g \in \ell_\infty$. If $g = (y_n)$, then
$$\|T^*g\|_\infty = \sup_i \Big\{\Big|\sum_j p_{ij} y_j\Big|\Big\} \le \sup_i \Big\{\sum_j p_{ij} |y_j|\Big\} \le \sup_j |y_j| = \|g\|_\infty.$$

Theorem 2.6: Let $T$ be a positive bounded linear operator on $\ell_1$. $T \in S$ iff $T^*\mathbf{1} = \mathbf{1}$.

Proof: If $T \in S$, then $\sum f = \sum Tf = \sum \mathbf{1}(Tf) = \sum (T^*\mathbf{1})f$ $\forall f \in \ell_1$ $\Rightarrow T^*\mathbf{1} = \mathbf{1}$. If $T^*\mathbf{1} = \mathbf{1}$, then $\sum f = \sum (\mathbf{1})f = \sum (T^*\mathbf{1})f = \sum (\mathbf{1})Tf = \sum Tf$. So $T \in S$.

Corollary 2.7: If $T \in S$, then $\|T^*\| = 1$.

Proof: By Corollary 2.5, $\|T^*\| \le 1$; by Theorem 2.6, $T^*\mathbf{1} = \mathbf{1}$, and we have $\|T^*\| = 1$.

The following new and rather surprising result clearly distinguishes the countable case from both the finite case studied in section 1 of this work and the uncountable case to be studied in section 3.

Theorem 2.8: There exist $T \in S$ such that neither $T$ nor $T^*$ is a linear operator on $\ell_q$ $\forall q \in (1, \infty)$.

Proof: Let $P$ be the stochastic matrix with every row equal to the probability vector $\left(\frac{1}{2^n}\right)$. Let $q \in (1, \infty)$. If $f = \left(\frac{1}{n}\right)$, then $f \in \ell_q$ but $f \notin \ell_1$. Moreover $fP$ is undefined, since it is a vector whose $k$th entry is $\sum_{n=1}^\infty \frac{1}{n}\cdot\frac{1}{2^k} = \frac{1}{2^k}\sum_{n=1}^\infty \frac{1}{n} = \infty$. Also, for any $g = (y_n) \in \ell_q$ we have that $Pg$ is the constant vector $\left(\sum_{n=1}^\infty \frac{y_n}{2^n}\right)$, and so $Pg \notin \ell_q$ $\forall\, 1 \le q < \infty$ whenever this constant is nonzero. This says that $T$ and $T^*$ do not define linear operators on $\ell_q$ for any $q \in (1, \infty)$.

Theorem 2.9: If $S: \ell_\infty \to \ell_\infty$ is a positive bounded linear operator and if $\sum Sg = \sum g$ $\forall g \in \ell_1$, then $S: \ell_1 \to \ell_1$ and is stochastic.
Proof: Assume $S: \ell_\infty \to \ell_\infty$ and $\sum Sg = \sum g$ $\forall g \in \ell_1 \subset \ell_\infty$. If $g \in \ell_1$, then $\|Sg\|_1 = \sum |Sg| \le \sum S|g| = \sum |g| = \|g\|_1$. Therefore $S: \ell_1 \to \ell_1$ and is stochastic by Theorem 2.2.
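The divergence used in the proof of Theorem 2.8 can be made concrete with a short Python sketch. It assumes our reading of that proof, namely that every row of $P$ is the probability vector $(1/2^n)$ and $f = (1/n)$, so that the $k$th entry of $fP$ truncated to the first $N$ rows is $H_N / 2^k$, where $H_N$ is the $N$th harmonic number. The function names are hypothetical. The classical estimate $H_{2N} - H_N \ge \frac{1}{2}$ shows that these truncations grow without bound.

```python
from fractions import Fraction

def harmonic(N):
    """Partial sum H_N = 1 + 1/2 + ... + 1/N, computed exactly."""
    return sum(Fraction(1, n) for n in range(1, N + 1))

def truncated_entry(k, N):
    """k-th entry of fP using only the first N rows of P:
    sum over n <= N of (1/n) * (1/2**k)."""
    return harmonic(N) / 2 ** k

# Each doubling of N adds at least 1/2 to H_N, so H_N is unbounded.
doubling_gains = [harmonic(2 * N) - harmonic(N) for N in (1, 10, 100)]
```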
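Infinite Doubly Stochastic Matrices and Operators

Definition 2.10: $T \in S$ is called doubly stochastic iff $T^*|_{\ell_1} \in S$, i.e. when $T^*$ is restricted to $\ell_1$ it is stochastic. This says $T$ is doubly stochastic iff it is induced by a doubly stochastic matrix, where an infinite stochastic matrix $P$ is doubly stochastic iff $P^*$ is stochastic. The set of doubly stochastic operators will be denoted by $D$.

Theorem 2.11: $T \in D$ iff $T \in S$ and $\sum T^*g = \sum g$ $\forall g \in \ell_1$.

Proof: Assume $\sum T^*g = \sum g$ $\forall g \in \ell_1 \subset \ell_\infty$. By Theorem 2.9, $T^*|_{\ell_1} \in S$. Now let $T \in S$ and $T^*|_{\ell_1} \in S$; then by Definition 2.3 and Theorem 2.2, $\sum T^*g = \sum g$ $\forall g \in \ell_1$ and $T \in D$.

Let $T \in D$ and $S = T^*|_{\ell_1}: \ell_1 \to \ell_1$. Then $S$ is a stochastic operator and $\sum T^*g = \sum g$ $\forall g \in \ell_1$. So $S$ has a transpose operator, $S^*: \ell_\infty \to \ell_\infty$. Let $g \in \ell_1 \subset \ell_\infty$ and $f \in \ell_1$; then, since $S = T^*|_{\ell_1}$, $\sum (S^*g)f = \sum g(Sf) = \sum g(T^*f) = \sum (Tg)f$, so $S^* = T$ on $\ell_1$. That is to say, every doubly stochastic operator may be extended to all of $\ell_\infty$ and, since $S^*$ is bounded and therefore uniformly continuous on $\ell_\infty$ and on $\ell_1$, this extension is unique since $\ell_1$ is dense in $\ell_\infty$.

Theorem 2.12: Let $T: \ell_1 \to \ell_1$ be a positive bounded linear operator. The following are equivalent:
1. $T \in D$.
2. $T\mathbf{1} = T^*\mathbf{1} = \mathbf{1}$.
3. $\sum Tf = \sum T^*f = \sum f$ $\forall f \in \ell_1$.
4. $T\mathbf{1} = \mathbf{1}$ and $\sum Tf = \sum f$ $\forall f \in \ell_1$.

Proof: The equivalence of 1 and 3 holds by Theorem 2.11. If $T$ is doubly stochastic, then 4 follows by Theorems 2.6 and 2.11. If $T\mathbf{1} = \mathbf{1}$, then $T^*$ is stochastic by Theorem 2.6, and if $\sum Tf = \sum f$ $\forall f \in \ell_1$, then $T$ is stochastic by Theorem 2.2.

Corollary 2.13: If $T \in D$, then $T: \ell_1 \to \ell_1$, $T: \ell_\infty \to \ell_\infty$, $T^*: \ell_1 \to \ell_1$, and $T^*: \ell_\infty \to \ell_\infty$, and each of these operators has norm 1.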
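Unlike the stochastic case, doubly stochastic operators map $\ell_p$ to $\ell_p$ $\forall p \in [1, \infty]$ and have norm 1.

Theorem 2.14: If $T \in D$, then $T: \ell_p \to \ell_p$ and $\|T\|_p = 1$ $\forall p \in [1, \infty)$.

Proof: The cases where $p = 1$ and $p = \infty$ are done. Let $p \in (1, \infty)$ and let $q$ be chosen so that $\frac{1}{p} + \frac{1}{q} = 1$. If $f = (x_n) \in \ell_p$, then $f = \sum x_i e_i$. If $f_n = \sum_{i=1}^n x_i e_i \in \ell_p$ $\forall n$, then $\left(\|f - f_n\|_p\right)^p = \sum_{i > n} |x_i|^p \to 0$ $\Rightarrow f_n \to f$ pointwise. For $f \in \ell_p$ and $g = (y_i) \in \ell_q$ with $\frac{1}{p} + \frac{1}{q} = 1$, we have $\langle f, g \rangle \le \sum |x_i y_i| \le \|f\|_p \|g\|_q$ by Hölder's Inequality. Applying this coordinatewise to $Tf_n = \sum_{i=1}^n x_i Te_i$, and using $\sum_i Te_i \le T\mathbf{1} = \mathbf{1}$, gives
$$|Tf_n| \le \sum_{i=1}^n |x_i|\, Te_i \le \Big[\sum_{i=1}^n |x_i|^p\, Te_i\Big]^{1/p} \Big[\sum_{i=1}^n Te_i\Big]^{1/q} \le \Big[\sum_{i=1}^n |x_i|^p\, Te_i\Big]^{1/p},$$
so that $\|Tf_n\|_p^p \le \sum_{i=1}^n |x_i|^p \sum Te_i = \|f_n\|_p^p \le \|f\|_p^p$. The same estimate gives $\|Tf_n - Tf_m\|_p = \left\|\sum_{i=n+1}^m x_i Te_i\right\|_p \le \left(\sum_{i=n+1}^m |x_i|^p\right)^{1/p} < \varepsilon$ for large $n$ and $m$, since $f \in \ell_p$. Therefore $(Tf_n)$ is Cauchy in $\ell_p$ and so $Tf_n \to g$ for some $g \in \ell_p$. Define $Tf = g$. Then $T$ is a bounded linear operator on $\ell_p$, and $\|Tf_n\|_p \le \|f\|_p \Rightarrow \|Tf\|_p \le \|f\|_p \Rightarrow \|T\|_p \le 1$. Since $T\mathbf{1} = \mathbf{1}$ we have $\|T\|_p = 1$.

Lemma 2.15: If $T_1 \in D$ and $T_2 \in D$, then $T_1T_2 \in D$ and $T_1^n \in D$ $\forall n \in \{1, 2, 3, \dots\}$.

Proof: If $T \in D$, then $T \ge 0$, $T\mathbf{1} = \mathbf{1}$, and $\sum Tf = \sum f$. If $f \ge 0$, $T_1T_2f = T_1(T_2f) \ge 0$ since $T_2f \ge 0$. Similarly $T_1T_2\mathbf{1} = T_1(T_2\mathbf{1}) = T_1\mathbf{1} = \mathbf{1}$ and $\sum T_1T_2f = \sum T_1(T_2f) = \sum T_2f = \sum f$. So $T_1T_2 \in D$. By finite induction, it follows that $T_1^n \in D$ if $T_1 \in D$.

Theorem 2.16: In the countably infinite setting, the set $D$ is not closed in the $\ell_\infty$ norm.

Proof: Let $P$ be the following infinite doubly stochastic matrix:
$$P = \begin{pmatrix}
\tfrac12 & \tfrac12 & 0 & 0 & 0 & \cdots \\
\tfrac12 & 0 & \tfrac12 & 0 & 0 & \cdots \\
0 & \tfrac12 & 0 & \tfrac12 & 0 & \cdots \\
0 & 0 & \tfrac12 & 0 & \tfrac12 & \cdots \\
0 & 0 & 0 & \tfrac12 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$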
Note $P$ is doubly stochastic, $P = P^*$, and, by Lemma 2.15, $P^{2n}$ is doubly stochastic for every $n$. Since the diagonal entries of $P^{2n}$ converge to zero as $n \to \infty$, the sequence $(P^{2n})$ converges to a matrix with all entries equal to zero. So $T^{2n}$ converges to the zero operator in the $\ell_\infty$ norm. Since the zero operator is not in $D$, $D$ is not closed.

Convex Sets and Extreme Points

The sets $S$ and $D$ are convex. Kendall and Kiefer [19] characterized the extreme points of $D$ as the set of operators induced by the permutation matrices (i.e. those with exactly one 1 in each row and column). Mauldon [24] later gave a purely algebraic proof of this fact. In this section, we will establish this fact and extend Birkhoff's Theorem to the set $D$.

Recall that paths and loops were defined in Definition 1.23, and that, in the finite case, a doubly stochastic matrix has a loop iff it is not a permutation matrix and, by adding and subtracting a sufficiently small quantity to the entries of the loop, we generate two different doubly stochastic matrices whose average is the given matrix with the loop. This says an $n \times n$ doubly stochastic matrix, $P$, is extreme (a permutation matrix) iff $P$ has no loops.

As in the finite case, the extreme points of $S$ are easily identified.

Theorem 2.17: The extreme points of $S$ are the operators induced by
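zero-one infinite stochastic matrices, i.e. those infinite matrices with only zeros and ones for entries.

Proof: The proof is identical to the finite case of Theorem 1.22.

In the infinite setting, there are doubly stochastic matrices that have no loops and are not permutation matrices, e.g.
$$\begin{pmatrix}
\tfrac23 & 0 & \tfrac13 & 0 & 0 & 0 & \cdots \\
0 & \tfrac23 & 0 & \tfrac13 & 0 & 0 & \cdots \\
\tfrac13 & 0 & 0 & 0 & \tfrac23 & 0 & \cdots \\
0 & \tfrac13 & 0 & 0 & 0 & \tfrac23 & \cdots \\
0 & 0 & \tfrac23 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}$$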
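Furthermore, changing one entry of an infinite doubly stochastic matrix $P$ may require an infinite number of entries to be changed to preserve the doubly stochastic property of the new matrix. If we wish to constructively prove that the extreme points of $D$ are exactly the permutation operators, we must begin with a doubly stochastic matrix $P$ with an entry $p_{mn} \in (0, \frac{1}{2}]$. From this we need to construct doubly stochastic matrices $A = (a_{ij})$ and $B = (b_{ij})$ such that $\frac{1}{2}(a_{ij} + b_{ij}) = p_{ij}$ $\forall i, j$. The proof of the following Lemma was suggested by a technique used by Mauldon, and the Lemma is essential in this construction.

Lemma 2.18: Let $(p_n) \in \ell_1$ be a probability vector for which $0 < p_k \le \frac{1}{2}$ for some $k$. Let
$$\Pi_m = \frac{p_m}{1 - p_m} \cdot \frac{p_{m-1}}{1 - p_{m-1}} \cdots \frac{p_1}{1 - p_1} \quad \forall m$$
and, for an index $k_0$ with $p_{k_0} > 0$, let
$$\Pi_{m \setminus k_0} = \prod_{k \le m,\ k \ne k_0} \frac{p_k}{1 - p_k}.$$

Case 1: Let $0 < p_n \le \frac{1}{2}$ $\forall n$. Let $m > 1$ be fixed. Let
$$a_m = p_m\left(1 - \Pi_{m-1}\right), \quad b_m = p_m\left(1 + \Pi_{m-1}\right), \quad a_n = p_n\left(1 + \Pi_m\right) \ \text{and} \ b_n = p_n\left(1 - \Pi_m\right) \quad \forall n \ne m.$$
Then $(a_n)$ and $(b_n)$ are probability vectors and $p_n = \frac{1}{2}(a_n + b_n)$ $\forall n$.

Case 2: Let $0 < p_n \le \frac{1}{2}$ $\forall n \ne n_0$, let $\frac{1}{2} < p_{n_0} < 1$, let $m > 1$, $m \ne n_0$ be fixed, write $\Pi = \Pi_{m \setminus n_0}$, and let
$$a_{n_0} = p_{n_0} - (1 - p_{n_0})\,\Pi, \quad b_{n_0} = p_{n_0} + (1 - p_{n_0})\,\Pi, \quad a_n = p_n\left(1 + \Pi\right) \ \text{and} \ b_n = p_n\left(1 - \Pi\right) \quad \forall n \ne n_0.$$
Then $(a_n)$ and $(b_n)$ are probability vectors and $p_n = \frac{1}{2}(a_n + b_n)$ $\forall n$.

Proof: Case 1: $0 < p_n \le \frac{1}{2} \Rightarrow 0 < \frac{p_n}{1 - p_n} \le 1$. So $0 < \Pi_m \le 1$ $\forall m$, and $0 \le a_n \le 1$, $0 \le b_n \le 1$ $\forall n$. Moreover
$$\sum_{n=1}^\infty a_n = p_m\left(1 - \Pi_{m-1}\right) + \sum_{n \ne m} p_n\left(1 + \Pi_m\right) = p_m\left(1 - \Pi_{m-1}\right) + (1 - p_m)\left(1 + \Pi_m\right)$$
$$= p_m - p_m \Pi_{m-1} + 1 - p_m + (1 - p_m)\,\Pi_m = 1,$$
since $\Pi_m (1 - p_m) = p_m \Pi_{m-1}$. Similarly $\sum b_n = 1$.

Case 2: $0 < p_n \le \frac{1}{2}$ $\forall n \ne n_0$ and $\frac{1}{2} < p_{n_0} < 1$. Since $\Pi < 1$, we have
$$0 < p_{n_0} - (1 - p_{n_0})\,\Pi \quad \text{and} \quad p_{n_0} + (1 - p_{n_0})\,\Pi < 1.$$
Therefore, $0 \le a_n \le 1$ and $0 \le b_n \le 1$ for all $n$. So
$$\sum_n a_n = p_{n_0} - (1 - p_{n_0})\,\Pi + \left(1 + \Pi\right)(1 - p_{n_0}) = p_{n_0} - (1 - p_{n_0})\,\Pi + 1 - p_{n_0} + (1 - p_{n_0})\,\Pi = 1.$$
Similarly, $\sum_n b_n = 1$.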
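A small Python sketch can check the Case 1 construction of Lemma 2.18 on a finite probability vector. It rests on our reading of the formulas, which is an assumption: $\Pi_m = \prod_{k \le m} \frac{p_k}{1 - p_k}$, $a_m = p_m(1 - \Pi_{m-1})$, $b_m = p_m(1 + \Pi_{m-1})$, and $a_n = p_n(1 + \Pi_m)$, $b_n = p_n(1 - \Pi_m)$ for $n \ne m$. The function names `odds_product` and `split_case1` are hypothetical.

```python
from fractions import Fraction

def odds_product(p, m):
    """Pi_m = product over k = 1..m of p_k / (1 - p_k) (1-based k)."""
    result = Fraction(1)
    for k in range(m):
        result *= p[k] / (1 - p[k])
    return result

def split_case1(p, m):
    """Case 1 of Lemma 2.18 as read above; m is a 1-based index, m > 1.
    Assumes 0 < p_n <= 1/2 for every n."""
    pi_prev = odds_product(p, m - 1)
    pi_m = odds_product(p, m)
    a, b = [], []
    for n, pn in enumerate(p, start=1):
        if n == m:
            a.append(pn * (1 - pi_prev))
            b.append(pn * (1 + pi_prev))
        else:
            a.append(pn * (1 + pi_m))
            b.append(pn * (1 - pi_m))
    return a, b

p = [Fraction(1, 4)] * 4          # sample probability vector
a, b = split_case1(p, m=2)
```

The check only uses $\sum_{n \ne m} p_n = 1 - p_m$ and the identity $\Pi_m(1 - p_m) = p_m \Pi_{m-1}$, so it behaves the same way for any admissible vector.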
In the proof that the extreme points of $D$ are precisely the permutation operators, we employ the following notation. For the doubly stochastic matrix $P$ with an entry $p_{ij} \in (0, 1)$, let $P_{ij}$ be the set of ordered pairs of positive integers $(u, v)$ such that $\exists$ a path in $P$ from $p_{ij}$ to $p_{uv}$. It is important to observe that if $p_{mn} \in (0, 1)$ and $(m, n) \notin P_{ij}$, then $(m, c) \notin P_{ij}$ $\forall c$ and $(r, n) \notin P_{ij}$ $\forall r$. Also, for $(u, v) \in P_{ij}$, if $p_{rs}$ is in the path from $p_{ij}$ to $p_{uv}$, we use $p_{rs}^{+}$ to mean the unique immediate successor of $p_{rs}$ in the path and $p_{rs}^{-}$ to mean the unique immediate predecessor.

Theorem 2.19: $T$ is an extreme point of the set of doubly stochastic operators iff $T$ is a permutation operator.

Proof: That permutation operators are extreme points is proven exactly as in Theorem 1.25. Assume $P$ is not a permutation matrix. If $P$ has a finite loop then $P$ is not extreme, so assume $P$ has no finite loops. Select an arbitrary positive entry of $P$ that is at most $\frac{1}{2}$. For simplicity, we will denote this element as $p_{11}$. Let $P_{11}$ be defined as above. Let $(u, v) \in P_{11}$. Consider the row or column containing $p_{11}$ and $p_{11}^{+}$. Without loss of generality, let this be a row, i.e. $p_{11}^{+}$ is $p_{12}$.

For Case 1, assume $0 < p_{1c} \le \frac{1}{2}$ $\forall c$. Then let
$$a_{12} = p_{12}\left(1 - \frac{p_{11}}{1 - p_{11}}\right) \quad \text{and} \quad b_{12} = p_{12}\left(1 + \frac{p_{11}}{1 - p_{11}}\right),$$
and $a_{1c} = p_{1c}\left(1 + \Pi_2\right)$ and $b_{1c} = p_{1c}\left(1 - \Pi_2\right)$ for $c \ne 2$. From Lemma 2.18, $(a_{1c})$ and $(b_{1c})$ are probability vectors with $(p_{1c}) = \frac{1}{2}\left((a_{1c}) + (b_{1c})\right)$.

For Case 2, let there be exactly one $p_{1c_0} > \frac{1}{2}$ and define
$$a_{1c_0} = p_{1c_0} + (1 - p_{1c_0})\frac{p_{11}}{1 - p_{11}} \quad \text{and} \quad b_{1c_0} = p_{1c_0} - (1 - p_{1c_0})\frac{p_{11}}{1 - p_{11}},$$
and
$$a_{1c} = p_{1c}\left(1 - \frac{p_{11}}{1 - p_{11}}\right) \quad \text{and} \quad b_{1c} = p_{1c}\left(1 + \frac{p_{11}}{1 - p_{11}}\right) \quad \forall c \ne c_0.$$
By Lemma 2.18, $\sum_k a_{1k} = \sum_k b_{1k} = 1$ and $\frac{1}{2}(a_{1k} + b_{1k}) = p_{1k}$.

Now proceed to the row or column containing $p_{11}^{+}$ and its successor. Replace the entries using Lemma 2.18 as follows. Continue to assume that the path begins with a row; $p_{11}^{+}$ is $p_{12}$. Then the next step is in a column: $p_{12}^{+}$ is $p_{22}$. The entry $a_{12}$ has already been fixed by the row step: if $p_{12} \le \frac{1}{2}$ then, using the formula above, $a_{12} = p_{12}\left(1 - \frac{p_{11}}{1 - p_{11}}\right) \le \frac{1}{2}$, and if $p_{12} > \frac{1}{2}$ then $a_{12} = p_{12} + (1 - p_{12})\frac{p_{11}}{1 - p_{11}}$ from above. The remaining entries $a_{r2}$ and $b_{r2}$, $r \ne 1$, of the column are then defined by Lemma 2.18 applied to the column $(p_{r2})$, with the product extended by the factor $\frac{p_{12}}{1 - p_{12}}$: Case 1 applies if every $p_{r2} \le \frac{1}{2}$, and Case 2 applies if some $p_{r_0 2} > \frac{1}{2}$. In either case the column sums of $A$ and $B$ remain equal to 1.

The construction is continued inductively until $a_{ij}$ and $b_{ij}$ are defined $\forall\, ij \in P_{11}$. By defining $a_{ij} = b_{ij} = p_{ij}$ $\forall\, ij \notin P_{11}$, we have $A = (a_{ij})$ and $B = (b_{ij})$ both in the set of doubly stochastic matrices, with $A \ne B$ and $P = \frac{1}{2}(A + B)$.

Let $\mathcal{P}$ denote the set of operators induced by infinite permutation matrices. Now that we have identified the extreme points of the set of doubly stochastic operators, we turn to Birkhoff's Theorem. In the finite case, this theorem says that the set of doubly stochastic operators is the convex hull of the set of permutation operators, i.e. the extreme points. Since a matrix with an infinite number of different entries cannot be written as a finite convex combination of permutation matrices, we must consider infinite convex combinations, i.e. limits of sequences of operators. The best we can hope for is that the set of doubly stochastic operators will be the closure of the convex hull of the permutation operators, written $\mathrm{cch}\,\mathcal{P}$. We saw in Theorem 2.16 that $D$ is not closed in the $\ell_\infty$ norm. In Birkhoff's Problem 111 [15], Isbell introduces another norm on the linear space of matrices that have finite row and column sums. He proves, in that norm, that $\mathrm{cch}\,\mathcal{P} \ne D$. In 1955, Rattray and Peck [27] and then, independently in 1960, Kendall [19] successfully extended Birkhoff's Theorem by proving that $D = \mathrm{cch}\,\mathcal{P}$ if one imposes the correct topology. Kendall, working without knowledge of the Rattray and Peck result, used a different topology than that of Rattray and Peck.
As is done in Isbell and Kendall's work, we work in the linear space $V$ of all boundedly line-summable infinite matrices, i.e. $M = (m_{ij}) \in V$ iff
$$\sup_i \sum_j |m_{ij}| < \infty \quad \text{and} \quad \sup_j \sum_i |m_{ij}| < \infty.$$
Isbell observes that $\|M\| = \max\left\{\sup_i \sum_j |m_{ij}|,\ \sup_j \sum_i |m_{ij}|\right\}$ is a norm on $V$ that makes $V$ into a Banach space. Kendall defines a topology $\mathcal{T}$ on $V$ that is the weakest topology making the linear functions
$$r_k(M) = \sum_j m_{kj}, \qquad c_k(M) = \sum_i m_{ik}, \qquad s_{ij}(M) = m_{ij}$$
continuous. $(V, \mathcal{T})$ is a locally convex Hausdorff topological vector space, since given $M_1 \ne M_2$ $\exists$ some $i_0 j_0$ such that $s_{i_0 j_0}(M_1) \ne s_{i_0 j_0}(M_2)$ (that is, the set of linear functionals generating $\mathcal{T}$ is total).

Theorem 2.20 (Birkhoff's Theorem for the Countably Infinite Case): There exists a topology $\mathcal{T}$ on the space $V$ such that $D = \mathrm{cch}\,\mathcal{P}$.

Proof: Assume $T \in D$ but that $T \notin \mathrm{cch}\,\mathcal{P}$. Let $T$ be induced by the infinite doubly stochastic matrix $M$. Then $\exists$ a continuous linear functional $f$ for which $f(M) < \inf_{E \in \mathrm{cch}\,\mathcal{P}} f(E)$ [29, page 341]. However
$$f(M) = \sum_{m=1}^{n_1} a_m\, r_{k_m}(M) + \sum_{m=1}^{n_2} b_m\, c_{k_m}(M) + \sum_{m=1}^{n_3} d_m\, s_{i_m j_m}(M) \quad \forall M \in V,$$
and for all doubly stochastic matrices $M$ we have $r_{k_m}(M) = c_{k_m}(M) = 1$ and $0 \le s_{i_m j_m}(M) \le 1$. Let $n = \max(n_1, n_2, n_3)$. Then no $m_{ij}$ for $i > n$ and $j > n$ contributes to the value of $f$. Let $S_n = (m_{ij})$ for $1 \le i \le n$ and $1 \le j \le n$. Let $U_n$ be the $n \times n$ diagonal matrix with the $(i, i)$ entry being $1 - \sum_{j=1}^n m_{ij}$ and zero elsewhere. Let $V_n$ be a similar $n \times n$ diagonal matrix with the $(j, j)$ entry being $1 - \sum_{i=1}^n m_{ij}$. If $S_n^*$ is the transpose of $S_n$, then the $2n \times 2n$ matrix
$$W_{2n} = \begin{pmatrix} S_n & U_n \\ V_n & S_n^* \end{pmatrix}$$
is doubly stochastic. If we expand $W_{2n}$ with zeros in the tails of the rows and columns and an infinite identity matrix in the lower right, to get the infinite doubly stochastic matrix
$$W = \begin{pmatrix} W_{2n} & 0 \\ 0 & I \end{pmatrix},$$
we see that $f(W) = f(M)$. Moreover $W_{2n} = \sum_k \lambda_k P_k$, where the $P_k$ are $2n \times 2n$ permutation matrices and $0 \le \lambda_k \le 1$ with $\sum_k \lambda_k = 1$, by the finite case of Birkhoff's Theorem. So $f(W) = f\left(\sum_k \lambda_k P_k\right)$, where $P_k$ is expanded to a doubly stochastic matrix the way $W_{2n}$ was expanded to $W$. Therefore $f(M) = f(W) \ge \inf_{E \in \mathrm{cch}\,\mathcal{P}} f(E)$, which is a contradiction.
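The block construction of $W_{2n}$ used in the proof can be checked numerically. The Python sketch below builds $W_{2n}$ from an $n \times n$ corner $S_n$ of a doubly stochastic matrix, reading the garbled display as the block matrix with $S_n$ and $U_n$ on top and $V_n$ and $S_n^*$ below; that reading, the function names, and the sample corner are assumptions for illustration. The only requirement on the corner is that its entries are nonnegative and its row and column sums do not exceed 1.

```python
from fractions import Fraction

def dilate(S):
    """Return the 2n x 2n matrix [[S, U], [V, S^T]], where U and V are
    diagonal matrices holding the row and column deficits of S."""
    n = len(S)
    row_deficit = [1 - sum(S[i]) for i in range(n)]
    col_deficit = [1 - sum(S[i][j] for i in range(n)) for j in range(n)]
    W = [[Fraction(0)] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            W[i][j] = S[i][j]               # top-left block: S
            W[n + i][n + j] = S[j][i]       # bottom-right block: transpose of S
        W[i][n + i] = row_deficit[i]        # top-right block: U (diagonal)
        W[n + i][i] = col_deficit[i]        # bottom-left block: V (diagonal)
    return W

def is_doubly_stochastic(M):
    """Exact check: nonnegative entries, all row and column sums equal 1."""
    n = len(M)
    return (all(x >= 0 for row in M for x in row)
            and all(sum(M[i]) == 1 for i in range(n))
            and all(sum(M[i][j] for i in range(n)) == 1 for j in range(n)))

# Hypothetical 2 x 2 corner: row sums 3/4 and 1/4, column sums 1/2 and 1/2.
S = [[Fraction(1, 4), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(0)]]
W = dilate(S)
```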
3. The Uncountably Infinite Case
Stochastic Operators
To continue the historical development of Birkhoff's Problem 111, we move from the countably infinite to the uncountable case. Until now, the study has been able to work with operators induced by matrices, and stochastic properties were defined in terms of finite or infinite row and column sums. In the uncountable setting, we have no matrix and no summation. However, in both the finite and countably infinite settings, we were careful to point out that with every stochastic and doubly stochastic matrix there were associated bounded linear operators defined on $\ell_1$ and $\ell_\infty$. It is exactly these operators that survive the move to the uncountable case, with sums becoming integrals and $\ell_p$ spaces becoming $L_p$ spaces.
We will work with spaces of real-valued bounded measurable functions, whose domains are the unit interval, $[0,1]$, with Lebesgue measure, $m$. We use $\int f$ to denote the Lebesgue integral $\int_0^1 f(x)\,m(dx)$. For $1 \le p < \infty$, $L_p = \{f : [0,1] \to \mathbb{R}$, the reals; $f$ measurable, $\int |f|^p < \infty\}$. $\|f\|_p = \left(\int |f|^p\right)^{1/p}$ is the norm on $L_p$ under which it is a Banach space. The space $L_\infty = \{f : [0,1] \to \mathbb{R}$; $f$ measurable, $\sup |f| < \infty\}$. $\|f\|_\infty = \inf\{M : m\{x : |f(x)| > M\} = 0\}$ is the norm on $L_\infty$ under which it is a Banach space. So $\|f\|_\infty$ is the greatest lower bound of the $M$ that are upper bounds of $|f|$ except possibly on a set of measure zero, i.e. they are $m$-a.e. upper bounds. So $|f(x)| \le \|f\|_\infty$ $m$-a.e. Note that $L_\infty \subset L_p \subset L_q \subset L_1$, where $1 \le q \le p \le \infty$, since the measure space is finite.
If $T_p : L_p \to L_p$ is linear and if $\|T_p f\|_p \le M\|f\|_p$ $\forall f \in L_p$ and some $M \in \mathbb{R}$, then $T_p$ is bounded. $\|T_p\| = \sup\{\|T_p f\|_p / \|f\|_p : f \in L_p,\ f \ne 0\}$ is a norm on this set of bounded operators. The set of bounded linear operators on $L_p$ is a Banach space under this operator norm. If $T$ is bounded then it is uniformly continuous.
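The nesting $L_\infty \subset L_p \subset L_q \subset L_1$ on a finite measure space can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `lp_norm`, the midpoint grid, and the test function `f` are all illustrative assumptions, and the maximum over the grid only stands in for the essential supremum.

```python
import numpy as np

def lp_norm(values, p):
    """Approximate ||f||_p on [0, 1] from samples of f on a uniform grid.

    Every sample carries weight 1/N, so the integral of |f|^p is
    approximated by the mean of |f|^p (hypothetical helper, for illustration).
    """
    a = np.abs(np.asarray(values, dtype=float))
    if np.isinf(p):
        # The maximum over the grid stands in for the essential supremum.
        return float(a.max())
    return float(np.mean(a ** p) ** (1.0 / p))

# Midpoint grid on [0, 1] and an arbitrary bounded test function.
N = 10_000
x = (np.arange(N) + 0.5) / N
f = x ** 2 - 0.3 * np.sin(7 * x)

# Norms for increasing p; on a probability space they should be non-decreasing.
norms = {p: lp_norm(f, p) for p in (1, 2, 4, np.inf)}
for p, value in norms.items():
    print(f"||f||_{p} ~ {value:.6f}")
```

Because $m([0,1]) = 1$, the printed values should increase with $p$, which is the inequality $\|f\|_q \le \|f\|_p$ for $q \le p$ behind the inclusions above.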
S. King & R. Shiflett
A bounded linear operator, $T_p : L_p \to L_p$, is positive iff given $f \in L_p$, $f \ge 0$, we have $T_p f \ge 0$. In the finite and countable cases, we used row sums being equal to one as the basis for our definition of stochastic. This isn't possible in the continuous case. But, in both the finite and countably infinite cases, we found that an operator, $T$, was stochastic iff $\sum Tf = \sum f$ (see Theorems 1.7 and 2.2). This motivates us to choose an analogous property, that $\int Tf = \int f$, as the basis for our definition of a stochastic operator for the uncountable case.
Definition 3.1: The bounded linear operator $T : L_1 \to L_1$ is stochastic iff $T$ is positive and $\int Tf = \int f$ $\forall f \in L_1$. Let $S$ denote the set of stochastic operators.
Example 3.2: The function $\varphi : [0,1] \to [0,1]$ such that $m(\varphi^{-1}(A)) = m(A)$ is called a measure preserving map. Consider $T_\varphi(f) = f \circ \varphi$, for all $f \in L_1$. If $f \ge 0$, then $f \circ \varphi \ge 0$, so $T_\varphi \ge 0$. Let $\chi_A$ denote the characteristic function of $A$ and observe that $\int T_\varphi \chi_A(x)\,dm = \int \chi_A \circ \varphi(x)\,dm = \int \chi_{\varphi^{-1}A}(x)\,dm = m(\varphi^{-1}A) = m(A)$ for all measurable sets $A$. It follows that $\int T_\varphi f = \int f$ for all simple functions $f \in L_1$, and therefore $\int T_\varphi f = \int f$ $\forall f \in L_1$ by a standard convergence theorem argument.
By the Riesz Representation Theorem, the dual of $L_p$, $1 < p < \infty$, i.e. the space of all bounded linear functionals defined on $L_p$, is isomorphic to $L_q$ where $\frac{1}{p} + \frac{1}{q} = 1$. The dual of $L_1$ is represented by $L_\infty$. $F$ is a bounded linear functional on $L_p$ iff there exists $g \in L_q$ such that $F(f) = \langle g, f \rangle = \int gf$, $\forall f \in L_p$. The bounded linear functionals on $L_\infty$ yield no such representation. If $T : L_p \to L_p$, $1 \le p < \infty$, is a bounded linear operator, then we may define a bounded linear operator $T^* : L_q \to L_q$, where $\frac{1}{p} + \frac{1}{q} = 1$, by $\langle g, Tf \rangle = \langle T^*g, f \rangle$. $T^*$ is called the transpose operator of $T$ and is analogous to the transpose operator induced by the transpose matrix in the finite and countable cases. A review of the proof of Theorem 1.7 shows that if $T : L_p \to L_p$ is a positive operator, then $|Tf|$
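$T^*1 = 1$. Then $\int Tf = \langle 1, Tf \rangle = \langle T^*1, f \rangle = \langle 1, f \rangle = \int f$, $\forall f \in L_1$. So $T \in S$.
Lemma 3.4: Let $T_p$ be a bounded linear operator on $L_p$, $1 < p < \infty$. If $T_p \ge 0$, then $T_p^* \ge 0$ and $T_p^* g \le \|g\|_\infty$ for all $g$ in $L_\infty$, therefore $T_p^* : L_\infty \to L_\infty$.
Proof: If $T_p \ge 0$ then $T_p \chi_A \ge 0$ $\forall$ measurable $A$. Let $g \in L_\infty \subset L_q$, $\frac{1}{p} + \frac{1}{q} = 1$, and let $g \ge 0$. Then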
Doubly Stochastic Operators
Definition 3.6: $T \in S$ is doubly stochastic iff there exists $E \in S$ for which $E|_{L_\infty} = T^*$. We will denote the set of doubly stochastic operators by $D$.
Theorem 3.7: If $T \in D$, then $T : L_\infty \to L_\infty$.
Proof: If $T \in D$, and $E \in S$ for which $E = T^*$ on $L_\infty$, then $E^* : L_\infty \to L_\infty$, and takes 1 to 1. Let $g \in L_\infty$ be given. We have $\langle E^*g, f \rangle = \langle g, Ef \rangle = \langle g, T^*f \rangle = \langle Tg, f \rangle$ for all $f \in L_1$. Therefore, $T = E^* : L_\infty \to L_\infty$.
Since $E^*$ is continuous on a dense subset of $L_1$, it extends to a bounded linear operator on $L_1$. This extension equals $T$ on $L_\infty$, so the extension has to be $T$. For these reasons, we will use $T^*$ to represent the operator on $L_\infty$ as well as its extension to $L_1$. Furthermore, $(T^*)^*$ will be denoted by $T$.
Theorem 3.8: $T \in S$ and $T1 = 1$ iff $T \in D$.
Proof: For every positive bounded linear operator, $T$, defined on $L_1$, $T^*$ extends to a positive bounded linear operator $E$ defined on all of $L_1$. Now assume $T1 = 1$; then $\int Ef = \int T^*f = \int (T1)f = \int f$ for all $f \in L_\infty$, and a convergence theorem argument shows the equality holds for all $f \in L_1$. Therefore $E$ is stochastic. So $T$ is doubly stochastic. If $T$ is doubly stochastic then $T$ is
stochastic and $\int T^*f = \int f$ for all $f \in L_1$. So $\int (T1)f = \int f$ for all $f \in L_1$, which implies $T1 = 1$.
Theorem 3.9: A positive bounded linear operator, $T$, defined on $L_1$ is doubly stochastic iff either $T1 = T^*1 = 1$ or $T1 = 1$ and $\int Tf = \int f$ $\forall f \in L_1$.
Proof: If $T$ is doubly stochastic then it is stochastic, so $\int Tf = \int f$ and $T^*1 = 1$, and by Theorem 3.8, $T1 = 1$. Conversely, if $T^*1 = 1$ then $T$ is stochastic by Theorem 3.3, and then if $T1 = 1$, $T$ is doubly stochastic by Theorem 3.8. The other statement is Theorem 3.8.
Theorem 3.10: If $T \in D$ then, when restricted to $L_p$, $T : L_p \to L_p$ and $\|T\|_p = 1$ $\forall p \in [1, \infty]$.
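Proof: If $T \in D$ then $\int Tf = \int f$ $\forall f \in L_1$ and $T1 = 1$. Let $1 < p < \infty$ and $\frac{1}{p} + \frac{1}{q} = 1$. Let $k = \sum_i k_i \chi_{A_i}$ be a simple function. Then
$$|Tk|^p \le (T|k|)^p \le T(|k|^p).$$
So
$$\|Tk\|_p^p = \int |Tk|^p \le \int T(|k|^p) = \int |k|^p = \|k\|_p^p$$
for all simple functions $k$. (Observe that $|k| = \sum_i |k_i| \chi_{A_i}$ and $|k|^p = \sum_i |k_i|^p \chi_{A_i}$, since $A_i \cap A_j = \emptyset$ $\forall i \ne j$ and $\chi_{A_i}^2 = \chi_{A_i}$.) Now let $f \in L_p$ and let $f_n \nearrow f$ in $L_p$, where the $f_n$ are simple functions. Then $\|Tf_n\|_p \le \|f_n\|_p \le \|f\|_p$ and $Tf_n$ is increasing. Therefore $Tf_n$ converges pointwise to some $g \in L_p \subset L_1$, which implies $Tf_n \to g$ in the $L_1$ norm by the Monotone Convergence Theorem. Thus $Tf = g$ because $T$ is bounded on $L_1$ and therefore continuous. Thus $\|Tf\|_p \le \|f\|_p$. Therefore $T : L_p \to L_p$ and since $T1 = 1$,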
Convex Sets and Extreme Points
$S$ and $D$ are convex subsets of the Banach space of bounded linear operators on $L_1$. Indeed, if $T_1$ and $T_2$ are in $S$, then their convex combination $T = \lambda_1 T_1 + \lambda_2 T_2$ is a positive bounded linear operator on $L_1$ such that $\int Tf = \lambda_1 \int T_1 f + \lambda_2 \int T_2 f = \lambda_1 \int f + \lambda_2 \int f = \int f$, so $T \in S$. If $T_1$ and $T_2$ are in $D$, then $T_1 1 = T_2 1 = T1 = 1$, so $T \in D$. Our goal is once again to characterize the extreme points of these sets and to show that Birkhoff's Theorem still holds. It is interesting to note, in what follows, that the only known characterization of extreme points is given, not in terms of operator characteristics, but in terms of properties of associated measures.
The measures we will associate with doubly stochastic operators were studied first by J. E. L. Peck in 1959 [26]. They may be considered the continuous version of $n \times n$ doubly stochastic matrices by thinking of these matrices as measures imposed on the square $[0,n] \times [0,n]$ where each entry $P_{ij}$ is the mass assigned to $[i-1, i] \times [j-1, j]$. Peck referred to the measures as doubly stochastic measures. In what follows, we let $I = [0,1]$.
Definition 3.11: The probability measure $\mu$ defined on $I \times I$, $I = [0,1]$, is called doubly stochastic iff $\mu(A \times I) = \mu(I \times A) = m(A)$ $\forall$ Borel subsets $A \subset I$. Let $M$ denote the set of doubly stochastic measures.
Notice that Lebesgue product measure on $I^2$ is doubly stochastic. In 1966, James R. Brown proved that there is a one-to-one correspondence between the set of doubly stochastic measures and doubly stochastic operators [6].
Theorem 3.12: Let $T \in D$ and $\mu \in M$. The relation $\mu(A \times B) = \int \chi_A T\chi_B$ $\forall$ Borel sets $A$, $B$ determines a one-to-one, onto correspondence between $D$ and $M$.
Proof: Let $\mu \in M$. Let $g \in L_\infty(m)$ and let $f$ be a simple function in $L_1(m)$. Since
$$\int_{I \times I} \chi_A(x)\,\mu(dx,dy) = \int_{I \times I} \chi_{A \times I}(x,y)\,\mu(dx,dy) = \mu(A \times I) = m(A) = \int_I \chi_A(x)\,m(dx)$$
we see that
$$\int_{I \times I} f(x)\,\mu(dx,dy) = \int_I f(x)\,m(dx)$$
for all simple functions. By standard approximation arguments, it follows
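that
$$\int_{I \times I} f(x)\,\mu(dx,dy) = \int_{I \times I} f(y)\,\mu(dx,dy) = \int_I f(x)\,m(dx) \quad \forall f \in L_1(m).$$
Choose $g \in L_\infty(m)$, $g \ge 0$, and define
$$G(f) = \int_{I \times I} f(x)g(y)\,\mu(dx,dy).$$
Then
$$|G(f)| \le \int_{I \times I} |f(x)||g(y)|\,\mu(dx,dy) \le \|g\|_\infty \int_{I \times I} |f(x)|\,\mu(dx,dy) = \|g\|_\infty \|f\|_1.$$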
Thus $G$ is a bounded linear functional on $L_1$, so there exists $h \in L_\infty$ such that $G(f) = \langle f, h \rangle$ $\forall f \in L_1$. Define $T^* : L_\infty(m) \to L_\infty(m)$ as $T^*(g) = h$. So $G(f) = \int_{I \times I} f(x)g(y)\,\mu(dx,dy) = \int_I f(x)h(x)\,m(dx) = \int_I f(x)T^*g(x)\,m(dx)$. Since $G(f) \ge 0$ whenever $f \ge 0$, and $g$ was chosen to be positive, we have $T^*g \ge 0$. Hence $T^*$ is positive. Also $G(1) = \int_{I \times I} g(y)\,\mu(dx,dy) = \int_I g(y)\,m(dy) = \langle 1, T^*g \rangle = \int_I T^*g(y)\,m(dy)$ $\forall g \in L_\infty$, which implies $T1 = 1$. If $g = 1$, then $\int_{I \times I} f(x) \cdot 1\,\mu(dx,dy) = \int_I f(x)\,m(dx) = \langle f, T^*1 \rangle = \int_I f\,T^*1\,m(dx)$ $\forall f \in L_1$, which implies $T^*1 = 1$. So $T^* : L_\infty \to L_\infty$, $T^* \ge 0$, and $T1 = T^*1 = 1$. By Theorem 3.9, $T$ is doubly stochastic.
Now assume $T \in D$ and define $\lambda(A \times B) = \langle \chi_A, T\chi_B \rangle$. $\lambda$ is finitely additive in both $A$ and $B$, and $\lambda(A \times I) = \lambda(I \times A) = m(A)$ since $T \in D$. For every Borel set, $A$ or $B$, there exist compact sets $A_1 \subset A$ and $B_1 \subset B$ such that $m(A - A_1) < \epsilon$ and $m(B - B_1) < \epsilon$, where $\epsilon > 0$ is an arbitrary number. So for compact $A_1 \times B_1 \subset A \times B$, $\lambda(A \times B - A_1 \times B_1) \le \lambda[A \times (B - B_1)] + \lambda[(A - A_1) \times B] \le \lambda[I \times (B - B_1)] + \lambda[(A - A_1) \times I] = m(B - B_1) + m(A - A_1) < 2\epsilon$. Therefore $\lambda$ is regular on the Borel measurable rectangles of $I \times I$. By Alexandroff's Theorem [10, p. 138], $\lambda$ is countably additive and therefore has a unique extension to the Borel sets of $I \times I$ by the Hahn Extension Theorem.
Independently, R. G. Douglas [9] and Joram Lindenstrauss [21] discovered the only known characterization of the extreme points of the convex set of doubly stochastic measures and, consequently, of the convex set of doubly stochastic operators.
Theorem 3.13 (The Douglas-Lindenstrauss Theorem): $T \in D$ is an extreme point of $D$ iff $L = \{f(x) + g(y) : f$ and $g$ in $L_1(m)\}$ is norm dense in $L_1(\mu)$, where $\mu \in M$ is the doubly stochastic measure associated with $T$ by $\mu(A \times B) = \langle \chi_A, T\chi_B \rangle$.
Proof: To show $T$ is extreme in $D$, we show that $\mu$ is extreme in $M$. $\mu \in M$ is not extreme iff $\mu = \frac{1}{2}\mu_1 + \frac{1}{2}\mu_2$ for some $\mu_1 \in M$ and $\mu_2 \in M$ not equal to $\mu$. Then $\mu_1 = 2\mu - \mu_2$. Let $\nu = \mu - \mu_1 = \mu - (2\mu - \mu_2) = \mu_2 - \mu$. So $\nu \ne 0$ and $|\nu(S)| \le \mu(S)$ $\forall$ Borel sets $S$. Then $d\nu = F(x,y)\,d\mu$, $|F(x,y)| \le 1$, $F(x,y) \in L_\infty(\mu)$, $F(x,y) \ne 0$ by the Radon-Nikodym Theorem. Moreover, $\nu(A \times I) = \nu(I \times A) = 0$ implies $\int_{I \times I} f(x)\,\nu(dx,dy) = \int_{I \times I} g(y)\,\nu(dx,dy) = 0$ $\forall f$ and $g$ in $L_1(m)$. So $\mu$ is not extreme iff $\int_{I \times I} (f(x) + g(y))F(x,y)\,d\mu = 0$ $\forall f$ and $g$ in $L_1(m)$. That is, $F(x,y)$ is a non-zero bounded linear functional on $L_1(\mu)$ that is identically zero on $L$. So $\mu$ is not extreme iff $L$ is not norm dense in $L_1(\mu)$.
We conclude this work with the operator version of Birkhoff's Theorem, proven by James R. Brown [6].
Theorem 3.14: Let $\varphi : I \to I$ be a measure preserving map. Then $T_\varphi f = f \circ \varphi$ is doubly stochastic and is an extreme point of $D$.
Proof: In Example 3.2, we showed that $T_\varphi$ is stochastic. Since $T_\varphi 1 = 1$, we have that $T_\varphi$ is doubly stochastic by Theorem 3.8. Now assume $T_\varphi = \frac{1}{2}(T_1 + T_2)$, where $T_1$ and $T_2$ are both doubly stochastic. Then $T_\varphi \chi_A = \chi_{\varphi^{-1}A} = \frac{1}{2}T_1\chi_A + \frac{1}{2}T_2\chi_A$. So for $x \in \varphi^{-1}A$, $\chi_{\varphi^{-1}A}(x) = 1 \Rightarrow T_1\chi_A(x) = T_2\chi_A(x) = 1$. Similarly, for $z \notin \varphi^{-1}A$, $T_1\chi_A(z) = T_2\chi_A(z) = 0 \Rightarrow T_1\chi_A = T_2\chi_A = \chi_{\varphi^{-1}A} = T_\varphi\chi_A$. So $T_\varphi$ is extreme.
Note that this theorem is proven for all measure preserving maps and is therefore valid for one-to-one $\varphi$, that is, for those $\varphi$ for which $\varphi^{-1}$ is a function. Note that, since $m(\varphi^{-1}I) = m(I) = 1$, we have that every
Definition 3.15: A doubly stochastic operator $T_\varphi \in D$ is called a permutation operator iff there exists an invertible measure preserving map
$m(\varphi^{-1}A) = m(A)$ $\forall$ Borel sets $A$) for which $T_\varphi f = f \circ \varphi$. We denote the set of the permutation operators by $P$. The set of operators $T_\varphi$ where $\varphi$ is not invertible will be denoted by $M$.
The permutation operators are not all of the extreme points of $D$. In fact, $T \in D$ is an extreme point iff $T^*$ is an extreme point, since $T = \frac{1}{2}(T_1 + T_2)$ iff $\langle \chi_A, T\chi_B \rangle = \frac{1}{2}\langle \chi_A, T_1\chi_B \rangle + \frac{1}{2}\langle \chi_A, T_2\chi_B \rangle = \frac{1}{2}\langle T_1^*\chi_A, \chi_B \rangle + \frac{1}{2}\langle T_2^*\chi_A, \chi_B \rangle = \langle (\frac{1}{2}T_1^* + \frac{1}{2}T_2^*)\chi_A, \chi_B \rangle \Rightarrow T^* = \frac{1}{2}(T_1^* + T_2^*)$. Consequently, those operators induced by non-invertible measure preserving maps have transpose operators that are extreme but are not permutation operators.
As in the countably infinite case, we need to impose the correct topology on the space of bounded linear operators if we hope to extend Birkhoff's Theorem to the uncountable setting. The norm, strong, and weak operator topologies serve the purpose. Recall that, on the Banach space of all bounded linear operators that map $L_p$ to $L_p$, $1 < p < \infty$, there exists a complete metric or norm topology induced by the operator norm. This is called the uniform operator topology, and the sequence $T_n$ converges to $T$ iff $\|T_n - T\| \to 0$. The strong operator topology is the topology for which $T_n$ converges to $T$ iff $\|T_n f - Tf\| \to 0$ $\forall f \in L_p$ (pointwise convergence), and the weak operator topology is the one in which $T_n$ converges to $T$ iff $|\langle g, T_n f \rangle - \langle g, Tf \rangle| \to 0$ $\forall f \in L_p$ and $\forall g \in L_q$, $\frac{1}{p} + \frac{1}{q} = 1$. Therefore the weak operator topology is the smallest topology that makes all the linear functionals on $L_p$, represented by $L_q$, continuous. The fact that a convex set of bounded linear operators has the same closure in both the weak and strong operator topologies [10, page 477] suggests that these topologies might be useful in this study. The other result we will make use of states that since $L_p$ is reflexive, the closed unit ball of the bounded linear operators on $L_p$ is compact in the weak operator topology [10, page 512].
Lemma 3.16: $P$ is dense in $D$ in the weak operator topology.
Proof: We will actually show that Pi is dense in $D$. Since doubly stochastic operators map $L_p$ to $L_p$ for all $p$, we may use any $p$, and in this argument it is convenient to use $L_2$. A basis for the weak operator topology on $D$ is given by $\{T : |\langle f_k, Tg_k \rangle - \langle f_k, Sg_k \rangle| < \epsilon,\ k \in \{1, 2, \ldots, n\}\}$ where $f_k$, $g_k$ come from a dense subset of $L_2$, $\epsilon > 0$, and $S \in D$. We will take $f_k$ and $g_k$ to be continuous and bounded by 1. We must show that there exists $T_\varphi$ in this basis set. Let $\lambda$ and $\lambda_\varphi$ be the doubly stochastic measures associated with $S$ and $T_\varphi$. Let $h_k(x,y) = f_k(x)g_k(y)$. $h_k$ is uniformly continuous on $I \times I$. Also $\langle f_k, Sg_k \rangle = \int_{I \times I} h_k(x,y)\,d\lambda$ and $\langle f_k, T_\varphi g_k \rangle = \int_{I \times I} h_k(x,y)\,d\lambda_\varphi$.
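By uniform continuity, we can choose disjoint intervals $I_r \subset I$ such that $I = \bigcup_{r=1}^{u} I_r$ and such that $h_k$ varies by no more than $\epsilon$ on $I_r \times I_s$ for every pair $(r,s)$ in $\{1, 2, \ldots, u\} \times \{1, 2, \ldots, u\}$. For each $r$, we can find $u$ sets $X_{rs}$ such that $m(X_{rs}) = \lambda(I_r \times I_s)$. Similarly, for each $s$ we can find $u$ sets $Y_{rs}$ such that $m(Y_{rs}) = \lambda(I_r \times I_s)$. So $m(X_{rs}) = m(Y_{rs})$. It follows, [23], that there is an invertible measure preserving map $\varphi_{rs} : X_{rs} \to Y_{rs}$. Define $\varphi(x) = \varphi_{rs}(x)$ $\forall x \in X_{rs}$. Then $\varphi : I \to I$ is invertible and measure preserving and $\lambda_\varphi(I_r \times I_s) = m(I_r \cap \varphi^{-1}I_s) = m(X_{rs}) = \lambda(I_r \times I_s)$. Therefore
$$\left| \int_{I \times I} h_k\,d\lambda_\varphi - \int_{I \times I} h_k\,d\lambda \right| \le \sum_{r,s} \left| \int_{I_r \times I_s} h_k\,d\lambda_\varphi - \int_{I_r \times I_s} h_k\,d\lambda \right| \le 2\epsilon.$$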
Theorem 3.16 (Birkhoff's Theorem for the Uncountable Case): $D$ is the closed convex hull of $P$, the permutation operators, in the strong operator topology.
Proof: We will prove that $D$ is compact in the weak operator topology on the Banach space of bounded linear operators on $L_p$, $1 < p < \infty$. Theorem 3.10 proves that every doubly stochastic operator is in this Banach space, and in fact in its closed unit ball, for $1 < p < \infty$. Therefore, we need only show that $D$ is closed in the closed unit ball. Since convex sets have the same closures in the weak operator topology and the strong operator topology, our result follows from the Krein-Milman theorem. Assume $T_n \in D$ and $\langle g, T_n f \rangle \to \langle g, Tf \rangle$ $\forall f \in L_p$ and $\forall g \in L_q$, $\frac{1}{p} + \frac{1}{q} = 1$. If $f \ge 0$ and $g \ge 0$, then $\langle g, T_n f \rangle \ge 0$ since $T_n$ is positive. Therefore, $\langle g, Tf \rangle = \int gTf \ge 0$ for all such $f \in L_p$ and $g \in L_q$. By allowing $g = \chi_A$ for all measurable $A$, we have that $Tf \ge 0$ whenever $f \ge 0$. So $T$ is positive. If $g = 1$, then $\int T_n f \to \int Tf = \int f$, since $\int T_n f = \int f$ for all $n$. Obviously $T1 = 1$ since $T_n 1 = 1$. So $T \in D$ and $D$ is closed in the weak operator topology. Therefore $D$ is compact, and the Krein-Milman theorem says that it is equal to the closed convex hull of its extreme points, where the closure is with respect to the weak operator topology. Therefore $D$ is the closed convex hull of its extreme points in the strong operator topology. By Lemma 3.16, $P$ is dense in $D$, so $D = \operatorname{cch} P$.
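The composition operators $T_\varphi f = f \circ \varphi$ of Example 3.2 and Theorem 3.14 can be explored numerically. The sketch below is an illustration only, under stated assumptions: the two maps (a rotation, which is invertible, and the doubling map, which is not), the midpoint grid, and the names `integral` and `composition_operator` are choices made here, not taken from the text. It checks, approximately, that $\int T_\varphi f = \int f$, that $T_\varphi 1 = 1$, and that $m(\varphi^{-1}A) = m(A)$ for one interval $A$.

```python
import numpy as np

N = 200_000
x = (np.arange(N) + 0.5) / N               # midpoint grid on [0, 1]

def integral(values):
    """Midpoint-rule approximation of the Lebesgue integral over [0, 1]."""
    return float(np.mean(values))

def rotation(t, a=0.3):
    """Invertible measure preserving map (assumed example): t -> t + a mod 1."""
    return (t + a) % 1.0

def doubling(t):
    """Non-invertible measure preserving map (assumed example): t -> 2t mod 1."""
    return (2.0 * t) % 1.0

def composition_operator(phi):
    """Return T_phi, acting on functions by T_phi f = f o phi."""
    return lambda f: (lambda t: f(phi(t)))

f = lambda t: np.exp(t) * np.cos(3.0 * t)               # arbitrary integrable f
one = lambda t: np.ones_like(t)                         # the constant function 1
chi_A = lambda t: ((t >= 0.2) & (t < 0.5)).astype(float)  # indicator of A = [0.2, 0.5)

results = {}
for name, phi in (("rotation", rotation), ("doubling", doubling)):
    T = composition_operator(phi)
    results[name] = {
        "integral_gap": integral(T(f)(x)) - integral(f(x)),   # should be near 0
        "T_one_is_one": bool(np.all(T(one)(x) == 1.0)),       # T 1 = 1
        "measure_of_preimage": integral(T(chi_A)(x)),         # approximates m(phi^{-1} A)
    }
    print(name, results[name])
```

Both maps should report an integral gap near zero and a preimage measure near $m(A) = 0.3$, consistent with $T_\varphi$ being doubly stochastic whether or not $\varphi$ is invertible.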
4. Conclusions and Observations
One should notice that Birkhoff's Problem 111 has not been fully solved in either the countable or the uncountable case, since the solutions require a weak topology instead of a norm topology. A characterization of the countable operators that map all $\ell_p$ spaces into themselves is needed. The work on doubly stochastic operators inspired by Birkhoff's Problem 111 is a significant part of the development of the knowledge about these operators but is only one avenue of the ongoing investigation. The geometric and algebraic structure of this convex set is not fully known. For instance, Brown [6] proved that $M$ is the closure of $P$ in the strong operator topology and that $P$ is closed in the uniform operator topology. It is still an open question what the closed convex hull of $P$ and the closed convex hull of the extreme points of $D$ are in the uniform or norm operator topology. In fact, characterizations of the extreme points of $D$ based upon operator properties, rather than on properties of their associated measures as in the Douglas-Lindenstrauss characterization, are needed. In 1942, Halmos [14] found a necessary and sufficient condition for the existence of square roots of certain invertible measure preserving transformations. The question of the existence of square roots of stochastic and doubly stochastic operators was explored in [16] without resolution and remains an open question. Obviously, the connection between doubly stochastic measures and operators is extremely important and useful. A large body of knowledge has accumulated about doubly stochastic measures whose applications go well beyond their connection to doubly stochastic operators. It was hoped that these measures would shed a geometric light onto the nature of their associated operators and yield further characterizations of the extreme doubly stochastic operators.
In an effort to better understand these relationships, the study of these ideas in the stochastic as well as the doubly stochastic setting has been productive [25, 34, 35]. Lindenstrauss [21] proved that extreme doubly stochastic measures are singular to Lebesgue measure on the unit square. In [7], a concept of a loop in a doubly stochastic measure was introduced to study extremality. Losert [22] used these ideas to prove that there are extreme doubly stochastic measures $\mu$ and $\nu$ for which $\nu(A) = 0$ if and only if $\mu(A) = 0$ but $\nu \ne \mu$. However, if $\mu$ is associated with an operator in $M$ and $\nu \ll \mu$ then $\nu = \mu$. A characterization of all doubly stochastic measures $\mu$ for which, given that $\nu$ is doubly stochastic and $\nu \ll \mu$, one has $\nu = \mu$ has not been found. The first study of doubly stochastic measures supported on graphs of two functions appeared in [31]. This idea has led to substantial knowledge of the nature of extreme doubly stochastic measures, led by the work of Sherwood and Taylor [32, 17, 18]. An alternative approach began with [5, 8, 13]. Here the extension makes
use of the observation that the entries of an $n \times n$ extreme doubly stochastic matrix are exactly the extreme points of the convex set of reals from which all entries are drawn, i.e. 0 and 1 are the extreme points of $[0,1]$. This new direction considers $n \times n$ matrices with entries from a convex subset of some topological vector space with given row and column sums. The extreme points have been identified and some Birkhoff type theorems proven for some given spaces. Work continues in the area [20].
Bibliography
1. Garrett Birkhoff, Lattice Theory, Amer. Math. Soc. Colloquium Pub., vol. 25, 1940.
2. Garrett Birkhoff, Tres observaciones sobre algebra lineal, Rev. Univ. Tucumán A 5 (1946), 147-151.
3. Garrett Birkhoff, Lattice Theory, revised ed., Amer. Math. Soc. Colloquium Pub., vol. 25, 1948.
4. Garrett Birkhoff, Lattice Theory, third ed., Amer. Math. Soc. Colloquium Pub., vol. 25, 1967.
5. R. V. Benson and R. C. Shiflett, Doubly Stochastic Matrices over Arbitrary Vector Spaces and the Birkhoff Theorem, Linear Algebra Appl. 42 (1982), 145-158.
6. James R. Brown, Approximation theorems for Markov operators, Pacific J. Math. 16 (1966), 13-23.
7. James R. Brown and Ray C. Shiflett, On Extreme Doubly Stochastic measures, Mich. Math. J. 17 (1970), 249-254.
8. M. H. Clapp and R. C. Shiflett, A Birkhoff theorem for doubly stochastic matrices with vector entries, Studies in Applied Math. 62 (1980), 273-279.
9. R. G. Douglas, On extremal measures and subspace density, Michigan Math. J. 11 (1964), 243-246.
10. Nelson Dunford and Jacob T. Schwartz, Linear Operators, Part 1, New York, Wiley-Interscience, 1958.
11. D. V. Feldman, Extreme doubly stochastic measures with full support, Proc. Amer. Math. Soc. 114 (1992), 919-922.
12. W. Feller, An Introduction to Probability Theory and its Applications, vol. 1, New York, 1950.
13. P. M. Gibson, Generalized doubly stochastic and permutation matrices over a ring, Linear Algebra Appl. 30 (1980), 101-107.
14. P. R. Halmos, Square roots of measure preserving transformations, Amer. J. Math. 64 (1942), 153-166.
15. J. R. Isbell, Birkhoff's Problem 111, Proc. Amer. Math. Soc. 6 (1955), 217-218.
16. A. Iwanik and R. Shiflett, The root problem for stochastic and doubly stochastic operators, J. Math. Anal. Appl. 113 (1986), 93-112.
17. A. Kaminski, H. Sherwood, and M. D. Taylor, Doubly stochastic measures with mass on the graphs of two functions, Real Anal. Exchange 13 (1987-88), 253-257.
18. A. Kaminski, H. Sherwood, and M. D. Taylor, Doubly stochastic measures, topologies, and latticework hairpins, J. Math. Anal. Appl. 152 (1990), 252-268.
19. David G. Kendall, On infinite doubly-stochastic matrices and Birkhoff's problem 111, J. London Math. Soc. 35 (1960), 81-84.
20. Xin Li, P. Mikusinski, H. Sherwood, and M. D. Taylor, In quest of Birkhoff's theorem in higher dimensions, IMS Lecture Notes 28, Inst. Math. Stat., Hayward, CA, 1996.
21. Joram Lindenstrauss, A remark on extreme doubly stochastic measures, Amer. Math. Monthly 72 (1965), 379-382.
22. V. Losert, Counter-examples to some conjectures about doubly stochastic measures, Pac. J. Math. 99 (1982), 387-397.
23. D. Maharam, On homogeneous measure algebras, Proc. Nat. Acad. Sci. USA 28 (1942), 108-111.
24. J. G. Mauldon, Extreme points of convex sets of doubly stochastic matrices, Z. Wahrsch. Verw. Gebiete 13 (1969), 333-337.
25. K. R. Parthasarathy, Extreme points of the convex set of stochastic maps on a C*-algebra, Infin. Dimens. Anal. Quantum Prob. Relat. Top. 1 (1998), 599-609.
26. J. E. L. Peck, Doubly stochastic measures, Mich. Math. J. 6 (1959), 217-220.
27. B. A. Rattray and J. E. L. Peck, Infinite stochastic matrices, Trans. Roy. Soc. Canada 49 (1955), 55-57.
28. R. T. Rockafellar, Convex Analysis, Princeton Univ. Press, 1970.
29. H. L. Royden, Real Analysis, third ed., Prentice Hall, 1988.
30. I. Schur, Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie, Sitzungsber. Berliner Math. Ges. 22 (1923), 9-20.
31. T. L. Seethoff and R. C. Shiflett, Doubly stochastic measures with prescribed support, Z. Wahrsch. Verw. Gebiete 41 (1978), 283-288.
32. H. Sherwood and M. D. Taylor, Doubly Stochastic measures with hairpin support, Prob. Theory and Related Fields 78 (1988), 617-626.
33. R. C. Shiflett, On extreme Doubly Stochastic measures and Feldman's conjecture, Tech. Report 39, Oregon St. Un. Press, 1968.
34. R. C. Shiflett, Extreme stochastic measures and Feldman's conjecture, J. Math. Anal. Appl. 68 (1979), 111-117.
35. R. C. Shiflett, Continuous stochastic measures and Markov operators, J. Math. Anal. Appl. 70 (1979), 258-266.
Classes of Harmonizable Isotropic Random Fields
Randall J. Swift
Department of Mathematics
California State Polytechnic University
Pomona, CA 91768
Dedicated to Professor M.M. Rao, advisor, colleague and dear friend
Abstract
The class of harmonizable fields provides a natural extension of the class of stationary fields. The concept of isotropy plays a central role in the study of turbulence. In recent years the study of harmonizable processes has played a central role in the development of the theory of nonstationary processes. In this paper, the classes of harmonizable isotropic random fields are detailed.
I. Introduction
In recent years the study of harmonizable processes has played a central role in the development of the theory of nonstationary processes. Crucial to this development is the pioneering work of Chang and Rao [1] on bimeasures and Morse-Transue integration. Their paper set the stage for the recent advances in the theory. A recent account of the development of harmonizable processes and some of their applications may be found in Swift [12], [15]. The article [12] also contains a detailed bibliography of the existing work on harmonizable processes. The corresponding theory for harmonizable fields and their applications is being developed by R. J. Swift in a series of papers [7] - [15]. In this article, the classes of harmonizable isotropic random fields are detailed. The classical papers of Jones [4] and Roy [7] are central to the development of harmonizable isotropy.
II. Preliminaries
To begin the discussion, several ideas are briefly recalled here, which will be made use of throughout the article. First, as always, there is an underlying probability space, $(\Omega, \Sigma, P)$. We consider second order random fields, that is, mappings $X : I \to L_0^2(P)$, where $I$ ($\subset \mathbb{R}^n$) is an index set, and $L_0^2(P)$ is the space of all complex valued $f \in L^2(P)$ such that $\int_\Omega f(\omega)\,dP(\omega) = 0$. A random field $X(\cdot)$ is termed stationary if its covariance function $r(\cdot,\cdot)$ is continuous and $r(s,t) = \tilde{r}(s - t)$. It can be represented as
$$\tilde{r}(\tau) = \int_{\mathbb{R}^n} e^{i\lambda \cdot \tau}\,dF(\lambda) \tag{1}$$
for a unique non-negative bounded Borel measure $F(\cdot)$ on $\mathbb{R}^n$. One motivation for the concept of harmonizability is to enlarge the applications of stationary processes and fields while retaining the Fourier analytic methods. This notion is seen by defining a random field $X(\cdot)$ as weakly harmonizable if its covariance $r(\cdot,\cdot)$ is expressible as
$$r(s,t) = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{i\lambda \cdot s - i\lambda' \cdot t}\,dF(\lambda, \lambda') \tag{2}$$
where $F : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{C}$ is a positive semi-definite bimeasure of bounded Fréchet variation. Here $F(\cdot,\cdot)$ is a positive definite function on $\mathbb{R} \times \mathbb{R}$ of bounded Fréchet variation in the following sense:
$$\|F\|(\mathbb{R}, \mathbb{R}) = \sup\left\{ \left| \sum_{i=1}^{m} \sum_{j=1}^{m} a_i \bar{a}_j F(t_i, t_j) \right| : |a_i| \le 1,\ t_i \in \mathbb{R},\ i = 1, \ldots, m \right\} < \infty, \tag{3}$$
and it is of bounded Vitali variation if
$$|F|(\mathbb{R}, \mathbb{R}) = \sup\left\{ \sum_{i=1}^{m} \sum_{j=1}^{m} |F(t_i, t_j)| : t_i \in \mathbb{R},\ i = 1, \ldots, m \right\} < \infty. \tag{4}$$
A random field, $X(\cdot)$, is strongly harmonizable if the bimeasure $F(\cdot,\cdot)$ in (2) is of bounded Vitali variation. Now it is clear that $\|F\|(\mathbb{R}, \mathbb{R}) \le |F|(\mathbb{R}, \mathbb{R})$ and $|F|(\mathbb{R}, \mathbb{R}) = +\infty$ is possible. It can be shown that $\|F\|(\mathbb{R}, \mathbb{R})$ is always finite (cf. Rao [5]). Hence when $|F|(\mathbb{R}, \mathbb{R}) < \infty$, both variations are finite, and if the Vitali variation is finite, then the integrals in (2) are in the Lebesgue sense and all the standard results from Real Analysis apply. However, if $|F|(\mathbb{R}, \mathbb{R}) = +\infty$, then (2) has to be defined in a weaker form called the Morse-Transue (or MT-) integral, for which the dominated convergence theorem is false. In this case some restriction has to be imposed. A restricted integral, still weaker than the Lebesgue integral but having a dominated convergence theorem, is called a strict MT-integral, details of which can be found in Chang and Rao [1]. These strict MT-integrals will be used in what follows.
A general class of nonstationary fields which extends the ideas of the harmonizable class was first considered by Cramér in 1952. We say a second-order random field $X : \mathbb{R}^n \to L^2(P)$ is of Cramér class (or class (C)) if its covariance function $r(\cdot,\cdot)$ is representable as
$$r(t_1, t_2) = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} g(t_1, \lambda)\,\overline{g(t_2, \lambda')}\,dF(\lambda, \lambda') \tag{5}$$
relative to a family $\{g(t, \cdot),\ t \in \mathbb{R}^n\}$ of Borel functions and a positive definite function $F(\cdot,\cdot)$ of locally bounded variation on $\mathbb{R}^n \times \mathbb{R}^n$, with each $g$ satisfying the (Lebesgue) integrability condition:
$$0 \le \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} g(t, \lambda)\,\overline{g(t, \lambda')}\,dF(\lambda, \lambda') < \infty, \quad t \in \mathbb{R}^n.$$
If $F(\cdot,\cdot)$ has a locally finite Fréchet variation, then the integrals in equation (5) are in the sense of (strict) Morse-Transue and the corresponding concept is termed weak class (C).
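The relationship between representations (1) and (2) can be illustrated with a purely discrete spectral bimeasure. The sketch below is an illustration under stated assumptions, not a construction from the text: it takes $n = 1$, places the bimeasure on a finite set of frequencies so that $F$ becomes a positive semi-definite matrix (and hence trivially of bounded Vitali variation), and uses the hypothetical names `harmonizable_covariance` and `is_toeplitz`. With those assumptions (2) reads $r(s,t) = \sum_{j,k} F_{jk}\, e^{i\lambda_j s - i\lambda_k t}$, and a diagonal $F$ reduces it to the stationary form (1).

```python
import numpy as np

def harmonizable_covariance(times, freqs, F):
    """Evaluate r(s, t) = sum_{j,k} F[j, k] exp(i*l_j*s - i*l_k*t) on a time grid.

    `F` is a positive semi-definite matrix standing in for a discrete
    spectral bimeasure (an assumption made for this illustration).
    """
    E = np.exp(1j * np.outer(times, freqs))    # E[a, j] = exp(i * l_j * t_a)
    return E @ F @ E.conj().T                  # R[a, b] = r(t_a, t_b)

def is_toeplitz(R, tol=1e-10):
    """True when every diagonal of R is constant, i.e. r(s, t) depends on s - t only."""
    n = R.shape[0]
    return all(
        np.allclose(np.diag(R, k), np.diag(R, k)[0], atol=tol)
        for k in range(-n + 1, n)
    )

rng = np.random.default_rng(0)
freqs = np.array([-2.0, 0.5, 3.0])             # support of the discrete bimeasure
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
F_full = A @ A.conj().T                        # positive semi-definite, off-diagonal mass
F_diag = np.diag(np.diag(F_full).real)         # mass concentrated on the diagonal

times = np.linspace(0.0, 2.0, 9)               # equally spaced time points
R_harm = harmonizable_covariance(times, freqs, F_full)
R_stat = harmonizable_covariance(times, freqs, F_diag)

min_eig = np.linalg.eigvalsh(R_harm).min()
print("smallest eigenvalue of harmonizable covariance:", min_eig)
print("diagonal F gives stationary covariance:", is_toeplitz(R_stat))
print("full F gives stationary covariance:", is_toeplitz(R_harm))
```

On an equally spaced grid, stationarity shows up as a Toeplitz covariance matrix, so the diagonal bimeasure should pass that check while the bimeasure with off-diagonal mass should not; both matrices remain positive semi-definite.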
III. Harmonizable Isotropic Random Fields
Random fields often admit an additional property. A random field $X(\cdot)$ with mean $m(\cdot)$ and covariance $r(\cdot,\cdot)$ is isotropic if for each orthogonal matrix $g$ acting on $\mathbb{R}^n$, one has $m(gt) = m(t)$ and $r(gs, gt) = r(s,t)$. The representation of the covariance of a weakly harmonizable isotropic random field was obtained by R.J. Swift [8] as
where $J_\nu(\cdot)$ is the Bessel function (of the first kind) of order $\nu = (n-2)/2$ and $F(\cdot,\cdot)$ is of complex bounded Fréchet variation. Note that when the spectral measure $F(\cdot,\cdot)$ concentrates on the diagonal $\lambda = \lambda'$ the representation of the covariance becomes
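which is the representation of a stationary isotropic covariance obtained by Bochner [16]. A very useful characterization in spherical-polar form for the covariances of weakly harmonizable isotropic random fields was also given by Swift in [8]. The characterization is given by
$$r(s,t) = \alpha_n^2 \sum_{m=0}^{\infty} \sum_{l=1}^{h(m,n)} S_m^l(u)\,S_m^l(v) \int_0^\infty \int_0^\infty \frac{J_{m+\nu}(\lambda\tau_1)\,J_{m+\nu}(\lambda'\tau_2)}{(\lambda\tau_1)^\nu (\lambda'\tau_2)^\nu}\,dF(\lambda, \lambda') \tag{7}$$
where $\nu = \frac{n-2}{2}$ and
i) $s = (\tau_1, u)$, $t = (\tau_2, v)$ are the spherical polar coordinates of $s$, $t$ in $\mathbb{R}^n$; here $\tau_1 = \|s\|$, $\tau_2 = \|t\|$ and $u = s/\tau_1$, $v = t/\tau_2$ are unit vectors.
ii) $S_m^l(\cdot)$, $1 \le l \le h(m,n) = ($
with $F(\cdot,\cdot)$ as a complex function of bounded Fréchet variation. Using (7) and a form of Karhunen's Theorem, a spectral representation of a harmonizable isotropic random field is given as
$$X(t) = \alpha_n \sum_{m=0}^{\infty} \sum_{l=1}^{h(m,n)} S_m^l(u) \int_0^\infty \frac{J_{m+\nu}(\lambda\tau)}{(\lambda\tau)^\nu}\,dZ_m^l(\lambda)$$
where $Z_m^l(\cdot)$ satisfies
$$E\left(Z_m^l(B_1)\,\overline{Z_{m'}^{l'}(B_2)}\right) = \delta_{mm'}\,\delta_{ll'}\,F(B_1, B_2). \tag{8}$$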
Let $Q_n$ be the class of all $n$-dimensional strongly harmonizable isotropic covariances on $\mathbb{R}^n \times \mathbb{R}^n$ and $Q_\infty$ the class of all covariances which belong to $Q_n$ for all $n \ge 1$. Identify, in a natural way, the random field $X : \mathbb{R}^n \to L_0^2(P)$ with a field into $L_0^2(P)$ by taking
with $t_{n+1}$ fixed; then one has
$$Q_\infty \subset \cdots \subset Q_{n+1} \subset Q_n \subset Q_{n-1}$$
and that
$$Q_\infty = \bigcap_{n \ge 1} Q_n.$$
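Similarly, let $\mathcal{D}_n$ be the class of all $n$-dimensional stationary isotropic covariances on $\mathbb{R}^n \times \mathbb{R}^n$ and $\mathcal{D}_\infty$ the class of all covariances which belong to $\mathcal{D}_n$ for all $n \ge 1$. By the same natural identification, one has
$$\mathcal{D}_\infty \subset \cdots \subset \mathcal{D}_{n+1} \subset \mathcal{D}_n \subset \mathcal{D}_{n-1}$$
and then
$$\mathcal{D}_\infty = \bigcap_{n \ge 1} \mathcal{D}_n.$$
Since the representation (3) reduces to the form of the stationary case if and only if $F(\cdot,\cdot)$ concentrates on the diagonal $\lambda = \lambda'$, it follows that $\mathcal{D}_n \subset Q_n$ and $\mathcal{D}_\infty \subset Q_\infty$.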
It is clear that the classes $Q_n$ are not empty. Swift [8] showed that as $n$ increases, the covariance of a strongly harmonizable isotropic random field becomes smoother. More specifically, the covariance has at least $m = 1, 2, \ldots, \left[\frac{n-1}{2}\right]$ partial derivatives with respect to $\tau_1$, $\tau_2$ and $\theta$. Here $[\,\cdot\,]$ is the greatest integer function, $\tau_1 = \|s\|$, $\tau_2 = \|t\|$, and $\theta = \arccos(s \cdot t)$. This implies that members of the class $Q_\infty$ are infinitely differentiable. In fact, the covariance can be given (cf. Swift [8]) as
$$r(s,t) = \int_0^\infty \int_0^\infty \ldots$$
IV. Local Classes of Fields
In the modern statistical theory of turbulence, random fields with certain local properties are often considered. A useful addition to this theory is given by considering a random field $X(t)$ which is not necessarily of class (C), but whose increment field
is of class (C). Rao [6] obtained the spectral representations for these locally class (C) random fields. Rao showed that the representations are obtained by considering generalized (in the sense of Gel'fand and Vilenkin [3]) random fields, since they provide the required differentiability structure. The notion of a generalized field will now be given for completeness. Consider the space $\mathcal{K}$ of infinitely differentiable functions $h(t)$ having compact supports, which with compact convergence becomes a locally convex linear topological space. A generalized random field $X$ is a linear functional $X : \mathcal{K} \to \mathbb{C}$ such that if $\{\phi_n\}_{n=1}^{\infty} \subset \mathcal{K}$,
The mean of a generalized field is the linear functional
$$m(h) = E(X(h)), \quad h \in \mathcal{K},$$
and similarly its covariance is the bilinear (conjugate linear in the complex case) functional
$$r(h_1, h_2) = E\left(X(h_1)\,\overline{X(h_2)}\right), \quad h_i \in \mathcal{K},\ i = 1, 2.$$
Ordinary fields generate the corresponding generalized fields by the relation
$$X(h) = \int_{\mathbb{R}^n} X(t)h(t)\,dt \quad \text{for } h \in \mathcal{K}.$$
The converse is not true unless an additional condition is assumed. That is, if a generalized field $X(\cdot)$ has point values (also called "of function space type") then the reverse implication holds. Using this, and results from the theory of generalized functions, one defines the derivative $X^{(m_1, \ldots, m_n)}(h)$ of a generalized field $X(h)$ as
Using these ideas, Swift [12] defined the class of generalized class (C) fields as those random fields X : 1C —> C with zero mean and covariance functional r(-, •) is of weak class (C) that can be expressed as
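$$r(h_1, h_2) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \tilde h_1(\lambda)\,\overline{\tilde h_2(\lambda')}\,dF(\lambda,\lambda') \tag{9}$$
where $F(\cdot,\cdot)$ is a function of locally bounded Fréchet variation satisfying
$$\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \frac{dF(\lambda,\lambda')}{(1+\|\lambda\|^2)^{p/2}\,(1+\|\lambda'\|^2)^{p/2}} < \infty \tag{10}$$
where $p > 0$ and $\|\cdot\|$ is the Euclidean length. Further, the integrals relative to $F$ are in the strict Morse-Transue sense and $\tilde h_i$ are the $g$-transforms of $h_i$, $i = 1, 2$:
$$\tilde h_i(\lambda) = \int_{\mathbb{R}^n} h_i(t)\,g(t,\lambda)\,dt. \tag{11}$$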
Spectral bimeasures $F(\cdot,\cdot)$ which satisfy equation (10) are known as tempered. It may be shown that such an $X(\cdot)$ admits a representation
$$X(h) = \int_{\mathbb{R}^n} \tilde h(\lambda)\,dZ(\lambda),$$
where $Z : \mathcal{B} \to L^2(P)$ is a vector measure such that
$$E\big(Z(A)\overline{Z(B)}\big) = \int_A\int_B dF(\lambda,\lambda').$$
If in the representation (9) $g(t,\lambda) = e^{i\lambda\cdot t}$, then the generalized random field $X(\cdot)$ will be a weakly harmonizable random field. That is, the covariance has representation
$$r(h_1, h_2) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \tilde h_1(\lambda)\,\overline{\tilde h_2(\lambda')}\,dF(\lambda,\lambda') \tag{12}$$
where $F(\cdot,\cdot)$ is a positive definite function which satisfies equation (10). Further, one notes that the integrals relative to $F$ are in the strict Morse-Transue sense. The theory of these generalized fields will be used throughout the remaining sections of this article. A useful extension of the classes was considered by Swift [12] and is given by the class of fields $X(\cdot)$ for which the increments of order $M$ are of class (C). More specifically, if $X(h)$ is an arbitrary generalized random field, then it is a random field with class (C) increments of order $M$ if its generalized partial derivatives $X^{(m_1,m_2,\ldots,m_n)}(h)$, where $m_1 + m_2 + \cdots + m_n = M$, are of class (C). Swift showed that for a field to have class (C) increments, $g$ must satisfy further conditions. Specifically, for the case of class (C) increments of order $M$, it is required that $g$ satisfies $g(t, \lambda - \lambda') = g(t,\lambda)\,\beta(t,\lambda')$ for all $t, \lambda, \lambda' \in \mathbb{R}^n$, where
a.-.^.V ^fc""'0
<13>
with r>Mft(+
\\
(14)
and 0^(0, A) = a/j ^ 0. One notes that these restrictions are satisfied if g(t, A) = ei\-t
Using this, Swift obtained the representation of generalized random fields with class (C) increments of order M as
X(h] = f
JlRn-{0} -{0}
h(\)dZY(X) + (a,
where ZY(-} is the spectral measure associated with its class(C) Afth order partial derivative field ¥(•) = x(mi'm2>-"'mn\h)(-) and h is the ^-transform (11) of h with g satisfying g(t, A - A') = g(t, A)/9(t,A') for all *, A, A' € JRn and (13) and (14). More specifically,
ZY : B(]Rn - {0}) is a measure such that ) = E(ZY(A)ZY(B'}}, which is of finite Vitali variation, where B(Mn — {0}) is the Borel cr-algebra of jR™ — {0}. Further, (•, •) is the inner product and the Mth order gradient
is denned as:
The covariance functional of X(') is given by r(/n,/ l 2 )= /
/
JjR"-{0} JJRn-{0}
MA)MA')o!F(A, A') + (15)
with A a positive definite matrix. As noted before, the conditions upon g are satisfied when g(t, A) = e in which case the previous representation specializes to: X(h) = f
,
h(X)dZY(\) + (-1)M (a,
n
JM -{0}
where Zy(-) is the spectral measure associated with its strongly harmonizable Mth order partial derivative field Y(-) = X^1'™2'-'771-1) (/*)(•) and h is the Fourier transform of h and Zy, a a second-order random vector and as defined above.
V.
Locally Harmonizable Isotropic Random Fields
Using the ideas presented above, Swift [14], obtained the representation of a generalized random field with strongly harmonizable isotropic increments of order M as cc h(m,n)
X(h) = an Jsfl M<) E
E
m=0 J=i
5
f\||+|h
m H /7
• +°
TtnJ, A||t ^
// J5
dZ
™(^dt
(16)
^
(17)
\j\=M
where E(Zlm(Bl)Zlm,(B2)} = 6mm,8ll'F(B1,B2}, with F(-,-) as a tempered function of bounded Vitali variation, and u = -£r-, a unit vector. Further, Slm(-), 1
and
-{:
W.lYi,,. Y-.A = J
b for fc = j,
where the u,j denotes the moments of h.
Using this result, the spectral representation of the ordinary field X(-) is obtained as oo h(m,n)
X(t) = anY^ £ S™(* m=0
1=1
and the increment field IT-X(-) has the structure function B(s,t, s + n,t + r 2 ) =
E[ITlX(s)IT2X(t))]
Now let Qn be the class of all n-dimensional covariance functions of a field with strongly harmonizable isotropic increments of order M. Let Goo be the class of all covariance functions belonging to Qn for all n. Using the natural identification mentioned above, one has _CA/n
C
Observe further that any field with strongly harmonizable isotropic increments of order M is also a field with strongly harmonizable isotropic increments of order M — 1, so
and
The class G&> , rnay be reformulated for the covariance functions B ( - , - , - , •) as B(s,t,s + ri,t + TZ) € &L if and only if /-oo / / -0 J+0 oo
where F(-,-) is a tempered function of bounded Vital! variation, related to the ^(^-measure, and lim E\XM • s M X M • tM]
n—>oo
is a random vector.
VI.
Harmonizable Spatially Isotropic Fields
Often in applications, random fields $X(t,x)$ which are functions of both space and time occur. It is convenient to write the parameter set as $(t,x)$, where $t \in \mathbb{R}$ and $x \in \mathbb{R}^n$. These processes are often stationary in $(t,x)$ and stationary and isotropic in the spatial variable $x$. More specifically,
where r is a function from H? to C . Yadrenko, [16] obtained the covariance of such a field as
(the notation as in (6).) In view of the motivation behind harmonizable fields, Swift [10] relaxed the requirement of stationarity for these stationary spatially isotropic fields and gave the following definition Definition 1. A random field X : Mk x ]Rn —> LQ(P) is weakly harmonizable spatially isotropic if its covariance is expressible as
e^-
I) Ax - X'y ||"
(20) where F ( - , •, -, •) is a function of bounded Frechet variation. This definition allows the time parameter t to be a vector. Observe that when k = 1, the parameter t is scalar time, and if F ( - , - , - , • ) concentrates on the diagonals A = A' and w = u/, the representation (20) reduces to the stationary case (19). Further it should be noted that this definition, with further restrictions placed upon F(-, - , - , • ) , includes the case of stationary in time, harmonizable and isotropic in space, as well as the case harmonizable in time, stationary and isotropic in space. Using this definition, the following characterization for harmonizable spatially isotropic fields was obtained in Swift [10]. Theorem 1. A random field X : lRk x IRn —> LQ(P) is weakly harmonizable spatially isotropic iff the covariance function r(-, •, •, •) is expressible as h(m,n)
T Jo
(Ari)"(A'r2) w/iere ^ = ^^ and
,
i) x = (Ti,u),y = (T2,v) are the spherical polar coordinates of x,y in IRn, here T\ =11 x I I , T2 =11 y II and u—^-,v = f-'2 are unit vectors. *1 ii) Slm(-), 1<1< h(m,n) = (2m+2$™^~lY >m ^ l>Sl0(u) = l are the sPherical harmonics on the unit n- sphere of order m. Hi) an > 0, o?n = 22"+1F (^) TT? with F(-, •, •, •) as a complex function of bounded Frechet variation This spherical-polar representation (21) together with the classical Karhunen's theorem give the spectral representation for weakly harmonizable spatially isotropic random fields as oo
h(m,n)
.->-*. £ E
K
where Z l m ( - , - ) satisfies
with F(-, -, •, •) a function of bounded Frechet variation, u = TT^F, and 6mm/ the Kronecker delta. Now for s,t € Mk, a weakly harmonizable spatially isotropic covariance r(s,t, •, •) belongs to Qn defined earlier. By using the definition of Qn as the class of all n-dimensional harmonizable isotropic covariance functions, some basic stability properties of a weakly harmonizable spatially isotropic covariance from Qn may be deduced. Proposition 1. i) If r(s,t, •, •) belongs to Qn and a, b are two arbitrary positive constants, then a(r(bs,bt,-,-)) belongs to Qn. ii) Ifr\(s,t, •, •) and r2(s,t, •, •) belong to Qn then the product ri(s,t, •, -)r2(s,t, •, •) and all linear combinations a\r\(s,t, •, •) + a2r2(s,t, •, •), where a-[ and a2 are nonnegative constants, also belong to Qn. Hi) I f r k ( s , t , -, •) belongs to Qn,k = 1,2,... and\\Ta.k^oo f k ( s , t , •, •) = r(s,t, exists for all s, t then r(s, t, •, •) belongs to Qn, in the sense that r ( - , •, •, •) coincides with a covariance function a.e. (Lebesgue measure). Proof: The proofs of these statements are consequences of the fact that the class of covariance functions coincides with the class of continuous positive definite functions from which the above statement follows easily using a probabilistic argument. Q With this proposition in view and in light of the smoothness property for the covariance of a strongly harmonizable isotropic random field obtained above appiles to the class of harmonizable spatially isotropic covariance. Specifically the harmonizable spatially isotropic covariances of Q^ have representation given as /•
r(s,t,x,y}=
f
/
roc
/
JJRk JMk JO
fo
/ JO
VII.
Locally Time-Varying Fields
In this section, the theory of locally time-varying random fields $X : \mathbb{R} \times \mathbb{R}^n \to L_0^2(P)$ is considered. These fields are mappings on $\mathbb{R} \times \mathbb{R}^n$ and can thus be regarded as mappings on $\mathbb{R}^{n+1}$, so that the theory of generalized fields outlined above may be applied. Following the ideas of the previous section, we can make the following definition.
Definition 2. A mapping $X : \mathbb{R} \times \mathbb{R}^n \to L_0^2(P)$ is a strongly harmonizable spatially isotropic random field with strongly harmonizable time increments of order $k$ if its generalized $k$th partial derivative $\partial^k X(h(t,x))/\partial t^k$ is a generalized strongly harmonizable spatially isotropic random field. A spectral representation for such a field was given by Swift [10] as
h(m,n)
^
-(u) /
/
Jn-{o}Jo -{o}
hlm(^
where Zlm(-, •) is the spectral measure associated with the generalized strongly harmonizable spatially isotropic partial derivative random field _ (
'
dtk
}
~
and (22)
is the Fourier-Bessel transform of h(t,x). More specifically,
is a measure defined by _\Zlm(A) if A 1 {(0,A)|A 6 (0,oo)} ' \0 i f A = {(0,A)|Ae(0,oo)}
7~i (^ m{
(where Zlm(-) is given by equation (8)) such that E(Zlm(Al,Bl)Zlm,(A2,B2))
= dm
where Fy(-, • > • ) • ) is a tempered measure of finite Vitali variation, and where B(M - {0} x Mn) is the Borel a-algebra of M - {0} x Mn. Further, A : Mn —> LQ(P] is a strongly harmonizable isotropic random field and = f !
JmJMn
tkh(t, x)dxdt.
Letting $X(t,x)$ be an (ordinary) strongly harmonizable spatially isotropic random field, we can consider its $k$th time differences $I_\tau^k X(t,x)$, which are defined by
$$I_\tau^k X(t,x) = \sum_{\eta=0}^{k} (-1)^{\eta}\binom{k}{\eta}\, X(t - \eta\tau,\, x)$$
for $\tau \in \mathbb{R}$ and $x \in \mathbb{R}^n$. Then a spectral representation for $I_\tau^k X(t,x)$ can be obtained as
=
r
j
r I e
-
J*-{o}Jo {o}
an
"
+ k\A(x)rk where A(-) and ^(-, •) are denned above. Using this representation, it is now possible to obtain the representation of a strongly harmonizable spatially isotropic random field with strongly harmonizable time increments of order k. Specifically, Swift [13] Jv+m(X \\x\\)
-
" ™(w> }
o
(23) where Afc(-, •) are the jumps at the origin, given by fc-i (iwty =0 J=0
0
| i^ or w < 1 , for | a> |> 1
f
77 = 0. . . . , k — 1 are random fields, and Zlm are as given above. Now letting
r; (-),
^m(t,\\ *\\}=anJR
{oJQ
^
( A l l x l D - ^(U;'A)
one has E(Vlm(t,\\ x |D) = 0
and (s, || x \\)Vlm,(t, || y |D) = 6mm,6u,F(s,t, \\ x ||, || y | using a form of Fubini's theorem. More specifically, first apply x* to both sides, then taking x* inside the integral, which is permissible, (cf. [2], IV.9), since x*Zlm(-, •) is a scalar measure, the classical Fubini theorem applies, Dunford and Schwartz, [2]. Hence, the above representation can be extended for all fcth order locally time-varying spatially isotropic random
fields which need not be harmonizable. Thus a kth order locally time- varying spatially isotropic has a spectral representation oo h(m,n) m=0
k-1
/=0
J?=0
This extends the representation of a time-varying field on a sphere, given by R. H. Jones [4].
VIII.
Harmonizable Locally Spatially Isotropic Fields
A natural extension, in light of the previous sections, is to obtain the spectral representation of a field $X(t,x)$ that has harmonizable spatially isotropic increments in the spatial variable $x$.
Definition 3. A mapping $X : \mathbb{R} \times \mathbb{R}^n \to L_0^2(P)$ is a strongly harmonizable spatially isotropic random field with strongly harmonizable spatially isotropic increments of order $M$ if its generalized partial derivatives
where m\ + m^ + • . • + mn = M are strongly harmonizable spatially isotropic. A spectral representation for such a field is given in the following theorem. Theorem 2. A generalized strongly harmonizable spatially isotropic random field X(h) with strongly harmonizable spatially isotropic increments of order M has spectral representation: . X(h(t,x))
.
oo h(m,n)
/ «n=E JRJIO" m=0
E 1=0
S
™M
r e^^^INDh(tt r<x>
X
1 A II X IU
imJo+ JO+ +a
oo f-
h(m,n)
where Zlm(-,-} is the spectral measure associated with its strongly harmonizable spatial isotropic partial derivative random field Y(t,x) = —
and
hlm(u,\)=f
h(t,x)e^J^X\*}}dxdt
I
1A IIx I I J
JlRJlR"
(25)
is the Fourier- Bessel transform ofh(t,x). More specifically,
is a measure such that F(-, - , - , • ) is of finite Vitali variation, where B(]R x (0, oo)) is the Borel a -algebra of(lR x (0, oo)). Further, for each m = 0,1, ... ,00 and 1 = 0,..., h(m, ri) the stochastic measure W^(-) is defined by ••mv~/
™g
£!
Proof: Using the relationship between X and the partial derivative
(cf. Yaglom [17]), it follows that since the measure F is tempered, /g(Tn 1 ,m 2 ,...,m n ) f a ( t > a .)\ ^
^
...
f00
^mi'm2'-'m^ X(h(t, a?))
'
f /
J-ocJlR"
Since the partial derivative
is a strongly harmonizable spatially isotropic field with spectral representation mi,m,2,...,mn
Y(t,x) = — -.-r 9 (mi,m ,...,m ) x
oo
2
rl
h(m,n)
E then ro
/ 7-o
Integrating by parts repeatedly and noting that the various partial derivatives of $h(\cdot,\cdot)$ have compact supports one has
m=0
f000
f
x
l=0
/ / JRJO RO+
l lx||jc - r l h r)( m l> TO 2,.-.,«in)/ 7 f'/ W
lim
where VM is the Mth order gradient. The partial derivative ^("H."^."."1*)/^ x )/5 a ,(mi,m 2 ,...,m n ) can be repiaceci by h(t,x) since the set of partial derivatives of functions 7 in /C coincides with the subspace of /C, consisting of functions satisfying = . . . = 7 m _i(/i) = 0. Thus .,
,.
X(h(t,x)}= / /
oo h(m,n)
a«E E
•&(«)
X
IMJO+ f liirif
J]R£^0J~s
i7Mhlm(u,X)dZlm(u,X) h(m,n)
, A)
m=0
,=0
where for m = 0, 1, . . . , oo and I = 0, . . . , h(m, n), the stochastic measure Wlm('} is defined by
This gives the desired spectral representation,
Using the relationship X(h) = f
I
h(t,x)X(t,x)dtdx
JlRJlR"
with the spectral representation of the previous theorem, (since X(-} is point valued), the spectral representation of the ordinary field X ( - , - } can be obtained as oc
h(m,n) V^
0
Til — U
to
m
~
I —U
^
m
«*E E S1™^ I \\x\\M
' (26)
JlR
m=0 1=0
This result is summarized in the following proposition. Proposition 2. A strongly harmonizable spatially isotropic random field with strongly harmonizable spatially isotropic increments of order M has a spectral representation given by (26) where Zlm(-,-) is the spectral measure associated with its strongly harmonizable spatial isotropic partial derivative random field = ^ '
d^m*>->m^x(h(t,x))
'
Qx(mi,m2,...,mn)
'
and for each m = 0, 1, . . . , oo and 1 = 0,.,., h(m, n) the stochastic measure W^j(-) is defined by
Now letting
one has
E(Vlm(t,\\x\\))=Q and s, x
m,t,
y
=
mmlu,FS,t,
x , y
using a form of Fubini's theorem. More specifically, first apply $x^* \in (L_0^2(P))^*$ to both sides; taking $x^*$ inside the integral is permissible (cf. [2], IV.9), and since $x^* Z_m^l(\cdot,\cdot)$ is a scalar measure, the classical Fubini theorem applies. Hence, the above representation can be extended to all time-varying random fields with $M$th order spatially isotropic increments, which need not be harmonizable. These facts are summarized in
Theorem 3. A random field X : JR x 1RH —> Lg(P) is time-varying with Mth order spatially isotropic increments iff it admits the spectral representation (xi
h(m,n)
E sL(«)*m(*, II * II) h(m,n)
E m=0 (=0
wftere
are a sequence of random fields such that II VE lh\&'' (fc II x Ihl — 0<*) ;Aii;/) (/ / II T II II T Ih -ml(t1 ' II • IU* m 'l i II T \\)> — mm'°U'Om\J'i<>-> || «*• Ih || •*- HJ /'
and oo
2~] h(m,n)bm(t,t, || x ||, || x ||) < oo. m=0
This result gives the representation of a time-varying field with Mth order spatially isotropic increments and for M = I reduces to the representation of a time varying field on a sphere, given by R. H. Jones [4].
References
1. D. K. Chang and M. M. Rao, Bimeasures and Nonstationary Processes, in: M. M. Rao, ed., Real and Stochastic Analysis, John Wiley and Sons, New York, pp. 7-118, 1986.
2. N. Dunford and J. T. Schwartz, Linear Operators, Part I, Interscience, New York, 1957.
3. I. M. Gel'fand and N. Ya. Vilenkin, Generalized Functions, Volume 4, Applications of Harmonic Analysis, Academic Press, New York, 1964.
4. R. H. Jones, Stochastic Processes on a Sphere, The Annals of Mathematical Statistics, 34, 213-218, 1963.
5. M. M. Rao, Harmonizable Processes: Structure Theory, L'Enseign. Math., 28, 295-351, 1981.
6. M. M. Rao, Representation Theory of Multidimensional Generalized Random Fields, Proc. Symp. on Multivariate Analysis, Volume 2, Academic Press, New York, pp. 411-435, 1969.
7. R. Roy, Spectral Analysis of a Random Process on the Circle, Journal of Applied Probability, 9, 745-757, 1972.
8. R. J. Swift, The Structure of Harmonizable Isotropic Random Fields, Stochastic Analysis and Applications, 12, 583-616, 1994.
9. R. J. Swift, Representation and Prediction for Locally Harmonizable Isotropic Random Fields, Journal of Applied Mathematics and Stochastic Analysis, Vol. 8, II, 101-114, 1995.
10. R. J. Swift, A Class of Harmonizable Isotropic Random Fields, Journal of Combinatorics, Information & System Sciences, Vol. 20, No. 1-4, 111-127, 1995.
11. R. J. Swift, Stochastic Processes with Harmonizable Increments, Journal of Combinatorics, Information & System Sciences, Vol. 21, No. 1, 47-60, 1996.
12. R. J. Swift, Some Aspects of Harmonizable Processes and Fields, in: M. M. Rao, ed., Real and Stochastic Analysis: Recent Advances, CRC Press, Boca Raton, pp. 303-365, 1997.
13. R. J. Swift, Locally Time-Varying Harmonizable Spatially Isotropic Random Fields, Indian Journal of Pure and Applied Mathematics, Vol. 28, No. 3, 295-310, 1997.
14. R. J. Swift, Harmonizable Locally Spatially Isotropic Random Fields, Revista Colombiana de Matematica, Vol. 33, No. 2, 91-103, 1999.
15. R. J. Swift, Applications of Harmonizable Isotropic Random Fields, Teor. Imovir. Mat. Stat. (translated in Theory Probab. Math. Statist.), No. 66, 137-146, 2002.
16. M. I. Yadrenko, Spectral Theory of Random Fields, Optimization Software Inc., New York (English translation), 1983.
17. A. M. Yaglom, Correlation Theory of Stationary and Related Random Functions, Volume 1, Springer-Verlag, New York, 1987.
On Geographically-Uniform Coevolution: Local Adaptation in Non-Fluctuating Spatial Patterns
Jennifer M. Switkes
Department of Mathematics, California State Polytechnic University, Pomona, CA 91768
Michael E. Moody
Franklin W. Olin College of Engineering, 1735 Great Plain Ave., Needham, MA 02952-1245
Abstract We present and analyze a general diffusion model for the coevolution of two species in a geographically-uniform selection environment. If both species are diploid, unequal rates of gene flow for the two species can cause the growth and fixation of non-fluctuating spatially-heterogeneous frequency distributions; such patterns are not seen with less complex genetic structure. When these patterns occur, our results suggest that the more mobile species is likely to be locally adapted, while the less mobile species is likely to be locally maladapted.
I.
Introduction
According to the geographic mosaic theory of coevolution [1], reciprocal genetic change in interacting species can result in the formation of a spatially heterogeneous genetic structure across each species' metapopulation. Recent empirical studies provide strong evidence for the formation of such geographic mosaics in certain coevolving populations; interactions between flax and flax rust [2], crossbills and lodgepole pines [3], Taricha salamanders and garter snakes [4], snails and trematodes [5], Depressaria moths and umbelliferous plants [6], legumes and rhizobia [7], and Greya moths and their saxifragaceous host plants [8] provide well-known examples.
Recent coevolutionary models have begun to analyze the formation of spatially-heterogeneous genetic structure. Gavrilets and Hastings [9] and Switkes and Moody [10] examine single-deme models under linear frequency-dependent selection; the complexity and variety of outcomes suggest the possibility of interesting dynamics when local populations are linked by migration. Nuismer et al. [11] model two populations linked by migration of each of two interacting haploid species, with the interaction mutualistic in one subpopulation and antagonistic in the other. Migration between subpopulations is seen to have a dramatic effect on local outcomes. In a separate paper, Nuismer et al. [12] examine allele-frequency clines in metapopulations of two species due to gene flow and geographic variation in the direction and magnitude of reciprocal selection. In their model, the nature of the selective interaction is varied over a one-dimensional habitat using a diffusion approximation for migration. Gomulkiewicz et al. [13] consider migration of two interacting species in a landscape consisting of "hot spots" in which coevolutionary change occurs along with "cold spots" in which coevolution does not occur. Their model suggests that geographic mosaics can form if coevolutionary hot spots exist in the landscape. Lively [5] investigates a two-locus model of geographically-structured host-parasite coevolution. Gandon et al. [14], Gandon and Michalakis [15], and Gandon [16] discuss local adaptation and maladaptation of coevolving parasite-host systems across a landscape in the context of single-locus models under haploid dynamics. In Gandon [16], time-varying spiral waves are exhibited in a haploid model that assumes geographically-uniform selection, nearest-neighbor migration, and linear frequency-dependent fitness functions. In Gandon et al. [14] and Gandon and Michalakis [15], it is suggested that parasite local adaptation is likely when the parasite migration rate is higher than the host migration rate, and that parasite local maladaptation is likely when the host migration rate is higher.
In this paper, we present and analyze a general diffusion model for the coevolution of two species in a geographically-uniform selection environment. If both species are diploid, unequal rates of gene flow for the two species can cause the growth and fixation of non-fluctuating spatially-heterogeneous frequency distributions; such patterns are not seen with less complex genetic structure. With our modeling assumptions, the production of such patterns requires a difference in the mobility of the coevolving species paired with slight variations in initial genetic conditions across the metapopulation. We emphasize that the patterns displayed in this paper are not snapshots of a time-varying pattern; rather, the patterns develop from slight spatial variations in initial genetic conditions and become fixed over time. When these patterns occur, our results further suggest that the more mobile species is likely to be locally adapted, while the less mobile species is likely to be maladapted.
II.
General Coevolutionary Model
Consider a habitat with a discrete array of subpopulations of two species equally spaced on a lattice. Local selection occurs in each subpopulation. We assume that migration is between nearest neighbors and is unbiased as to direction. The migration rates of the two species are independent. In each species, consider two alleles at a single locus. We assume discrete, non-overlapping generations and ignore random genetic drift and mutation. Suppose that selection is soft, so that the relative sizes of subpopulations remain constant for each species, and that the population size of each species does not vary by subpopulation. Let species 1 have alleles A and a at the locus under consideration and species 2 have alleles B and b at its locus under consideration. Let $p(t)$ be the frequency of allele A in species 1 in generation $t$, $t = 0, 1, 2, \ldots$, so that $1 - p(t)$ is the frequency of allele a in generation $t$. Let $q(t)$ be the frequency of allele B in species 2 in generation $t$, with $1 - q(t)$ the frequency of allele b in generation $t$. In each subpopulation, we describe the evolution of the two species using the standard selection model [17]
$$p^* = p\left[\frac{w_A}{\bar w^{(A)}}\right], \qquad q^* = q\left[\frac{w_B}{\bar w^{(B)}}\right], \tag{1}$$
where starred variables indicate frequencies in the next generation, $w_A$ and $w_B$ are the allelic fitnesses of alleles A and B, respectively, and $\bar w^{(A)}$ and $\bar w^{(B)}$ denote the mean fitnesses of species 1 and species 2, respectively. For simplicity of notation, we suppress subscripts identifying the subpopulation. Suppose that in each generation a fraction $m_1$ of species 1 individuals and a fraction $m_2$ of species 2 individuals in each subpopulation migrated there, with one-fourth of the migrants coming from each neighboring subpopulation, as shown in Figure 1. Boundary subpopulations receive migrants only from their two or three neighbors. In each subpopulation in each generation we account for selection first, followed by migration and then population regulation, as shown in Figure 2, where $N_1$ and $N_2$ represent population sizes of species 1 and species 2, respectively.
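The nearest-neighbor migration step just described can be sketched on a rectangular lattice as follows. This is only an illustration under one stated assumption: at a boundary deme, each missing neighbor's share of $m/4$ is retained by the deme itself (implemented with edge padding), so that a boundary deme receives migrants only from its two or three actual neighbors. The text does not spell out its boundary modification, and the function name `migrate` is hypothetical.

```python
import numpy as np

def migrate(freq_star, m):
    """One migration step: a fraction m of each deme is replaced by immigrants,
    m/4 from each of the four nearest neighbours.

    Assumption (not from the text): at the boundary, a missing neighbour is
    replaced by the deme itself, so its m/4 share stays at home.
    """
    padded = np.pad(freq_star, 1, mode="edge")
    neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                  + padded[1:-1, :-2] + padded[1:-1, 2:])
    return (1.0 - m) * freq_star + 0.25 * m * neighbours

# Hypothetical post-selection allele frequencies on a 5 x 7 lattice.
rng = np.random.default_rng(1)
p_star = rng.uniform(0.4, 0.6, size=(5, 7))
p_prime = migrate(p_star, m=0.2)
print(p_prime.round(3))
```

With this boundary convention a spatially uniform frequency is left unchanged and the lattice-wide mean frequency is conserved, which is the discrete counterpart of a zero-flux boundary.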
Figure 1. Stepping Stone Model
For the $ij$th subpopulation,
$$p'_{ij} = (1 - m_1)\,p^*_{ij} + \frac{m_1}{4}\left[p^*_{i-1,j} + p^*_{i+1,j} + p^*_{i,j-1} + p^*_{i,j+1}\right], \tag{2}$$
where $i = 1, 2, \ldots, m - 1$ and $j = 1, 2, \ldots, n - 1$. The primed variables represent frequencies after both selection and migration. The starred variables represent frequencies after selection but before migration, as given by (1). Natural modifications are made for the boundary subpopulations.
III.
Discrete Model: Coevolution of Two Diploid Species
We will assume that each species is diploid; Appendix 2 suggests that in a geographically-uniform selection environment the growth and fixation of non-fluctuating spatially-heterogeneous frequency distributions is not seen with less complex genetic structure. Thus, in species 1 the genotypes AA, Aa, and aa occur at the zygote stage in generation $t$ with frequency $p^2(t)$, $2p(t)[1 - p(t)]$, and $[1 - p(t)]^2$, respectively, while in species 2 the genotypes BB, Bb, and bb occur with frequency $q^2(t)$, $2q(t)[1 - q(t)]$, and $[1 - q(t)]^2$, respectively. We will generally suppress the explicit time dependence.

Figure 2. Order of Coevolutionary Mechanisms: $(p, q)$, selection, $(p^*, q^*, N_1, N_2)$, migration, $(p', q', N_1', N_2')$, regulation, $(p', q', N_1, N_2)$.

Let $w_{AA}$, $w_{Aa}$, $w_{aa}$, $w_{BB}$, $w_{Bb}$, and $w_{bb}$ denote the fitnesses of the corresponding genotypes. In order to model coevolutionary effects, we will suppose that in each subpopulation the fitnesses of the alleles in each species depend linearly on the frequencies of the genotypes in the other species. That is,
$$\begin{aligned}
w_{AA} &= 1 + q^2 w_{AA}^{BB} + 2q(1-q)\,w_{AA}^{Bb} + (1-q)^2 w_{AA}^{bb},\\
w_{Aa} &= 1 + s_1 + q^2 w_{Aa}^{BB} + 2q(1-q)\,w_{Aa}^{Bb} + (1-q)^2 w_{Aa}^{bb},\\
w_{aa} &= 1 + s_2 + q^2 w_{aa}^{BB} + 2q(1-q)\,w_{aa}^{Bb} + (1-q)^2 w_{aa}^{bb},\\
w_{BB} &= 1 + p^2 w_{BB}^{AA} + 2p(1-p)\,w_{BB}^{Aa} + (1-p)^2 w_{BB}^{aa},\\
w_{Bb} &= 1 + r_1 + p^2 w_{Bb}^{AA} + 2p(1-p)\,w_{Bb}^{Aa} + (1-p)^2 w_{Bb}^{aa},\\
w_{bb} &= 1 + r_2 + p^2 w_{bb}^{AA} + 2p(1-p)\,w_{bb}^{Aa} + (1-p)^2 w_{bb}^{aa},
\end{aligned} \tag{3}$$
where we again have suppressed subscripts identifying the specific subpopulation. Here, $s_1$, $s_2$, $r_1$, and $r_2$ are intra-specific selection coefficients for species 1 and species 2, respectively. The other coefficients are between-species selection coefficients. Superscripts indicate the genotype responsible for the selective effect, while subscripts indicate the genotype upon which the selective effect is acting. Thus, the coefficient $w_{AA}^{BB}$ represents a selective effect of genotype BB from species 2 on genotype AA from species 1. A positive value of $w_{AA}^{BB}$ indicates that genotype AA in species 1 benefits from the presence of genotype BB in species 2. A negative value of $w_{AA}^{BB}$ indicates a deleterious effect of genotype BB on AA. We weight $w_{AA}^{BB}$ by the frequency $q^2$ of genotype BB individuals in species 2. The other terms in (3) are interpreted in a similar way. Notice that the coefficients on the right-hand side of (3) are constant across the geographical landscape, modeling a geographically-uniform selection environment.
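The mean fitnesses are given by
$$\bar w^{(A)} = w_{AA}\,p^2 + w_{Aa}\cdot 2p(1-p) + w_{aa}(1-p)^2, \qquad \bar w^{(B)} = w_{BB}\,q^2 + w_{Bb}\cdot 2q(1-q) + w_{bb}(1-q)^2. \tag{4}$$
Finally, the allelic fitnesses $w_A$, $w_a$, $w_B$, and $w_b$ are given by
$$w_A = w_{AA}\,p + w_{Aa}(1-p), \quad w_a = w_{Aa}\,p + w_{aa}(1-p), \quad w_B = w_{BB}\,q + w_{Bb}(1-q), \quad w_b = w_{Bb}\,q + w_{bb}(1-q). \tag{5}$$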
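In order to reduce the mathematical complexity of the model, we will introduce a set of symmetry assumptions. Assume first that each "effector" genotype (indicated by superscripts) is co-dominant in the effects it causes:
$$\begin{aligned}
w_{AA}^{Bb} &= \tfrac12\big(w_{AA}^{BB} + w_{AA}^{bb}\big), & w_{Aa}^{Bb} &= \tfrac12\big(w_{Aa}^{BB} + w_{Aa}^{bb}\big), & w_{aa}^{Bb} &= \tfrac12\big(w_{aa}^{BB} + w_{aa}^{bb}\big),\\
w_{BB}^{Aa} &= \tfrac12\big(w_{BB}^{AA} + w_{BB}^{aa}\big), & w_{Bb}^{Aa} &= \tfrac12\big(w_{Bb}^{AA} + w_{Bb}^{aa}\big), & w_{bb}^{Aa} &= \tfrac12\big(w_{bb}^{AA} + w_{bb}^{aa}\big).
\end{aligned}$$
Assume that each "effected" genotype (indicated by subscripts) is co-dominant in the effects it feels:
$$\begin{aligned}
w_{Aa}^{BB} &= \tfrac12\big(w_{AA}^{BB} + w_{aa}^{BB}\big), & w_{Aa}^{Bb} &= \tfrac12\big(w_{AA}^{Bb} + w_{aa}^{Bb}\big), & w_{Aa}^{bb} &= \tfrac12\big(w_{AA}^{bb} + w_{aa}^{bb}\big),\\
w_{Bb}^{AA} &= \tfrac12\big(w_{BB}^{AA} + w_{bb}^{AA}\big), & w_{Bb}^{Aa} &= \tfrac12\big(w_{BB}^{Aa} + w_{bb}^{Aa}\big), & w_{Bb}^{aa} &= \tfrac12\big(w_{BB}^{aa} + w_{bb}^{aa}\big).
\end{aligned}$$
Assume further that "effector" homozygotes have no effect on non-matching homozygotes, that heterozygotes are affected equally by all "effectors" in the other species, and that within each species the magnitude of intra-specific selection is the same for both homozygotes:
$$w_{AA}^{bb} = w_{aa}^{BB} = 0, \qquad w_{BB}^{aa} = w_{bb}^{AA} = 0, \qquad w_{Aa}^{BB} = w_{Aa}^{Bb} = w_{Aa}^{bb}, \qquad w_{Bb}^{AA} = w_{Bb}^{Aa} = w_{Bb}^{aa}, \qquad s_2 = r_2 = 0.$$
These assumptions reduce the number of parameters in the model to four. For simplicity of notation, introduce the four quantities
$$e_1 = w_{AA}^{BB} = w_{aa}^{bb}, \qquad e_2 = s_1 + w_{Aa}^{BB}, \qquad e_3 = w_{BB}^{AA} = w_{bb}^{aa}, \qquad e_4 = r_1 + w_{Bb}^{AA}, \tag{6}$$
from which it now follows that the fitnesses depend linearly on the allele frequencies:
$$\begin{aligned}
w_{AA} &= 1 + e_1 q, & w_{Aa} &= 1 + e_2, & w_{aa} &= 1 + e_1(1 - q),\\
w_{BB} &= 1 + e_3 p, & w_{Bb} &= 1 + e_4, & w_{bb} &= 1 + e_3(1 - p).
\end{aligned} \tag{7}$$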
The equations in (1) now become
$$p^* = p\,\frac{(1 + e_1 q)\,p + (1 + e_2)(1 - p)}{p^2(1 + e_1 q) + 2p(1-p)(1 + e_2) + (1-p)^2(1 + e_1 - e_1 q)},$$
$$q^* = q\,\frac{(1 + e_3 p)\,q + (1 + e_4)(1 - q)}{q^2(1 + e_3 p) + 2q(1-q)(1 + e_4) + (1-q)^2(1 + e_3 - e_3 p)}, \tag{8}$$
where again we have suppressed subscripts identifying the subpopulation. These equations, together with (2), describe the discrete diploid-diploid coevolutionary model which we will use in what follows. In Appendix 1, corresponding models are described using haploid-haploid and haploid-diploid dynamics.
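The selection step (8) can be sketched directly from the linear fitnesses, the allelic fitnesses, and the mean fitnesses defined above. The function name `select` and the parameter values in the example are illustrative assumptions; the code builds (8) from its ingredients rather than typing the quotient out, which makes it easy to compare against the closed form.

```python
def select(p, q, e1, e2, e3, e4):
    """One generation of selection in a single deme for the diploid-diploid model.

    Returns (p*, q*), the allele frequencies after selection.
    """
    # Genotype fitnesses, linear in the other species' allele frequency.
    w_AA, w_Aa, w_aa = 1 + e1 * q, 1 + e2, 1 + e1 * (1 - q)
    w_BB, w_Bb, w_bb = 1 + e3 * p, 1 + e4, 1 + e3 * (1 - p)
    # Allelic fitnesses of A and B.
    w_A = w_AA * p + w_Aa * (1 - p)
    w_B = w_BB * q + w_Bb * (1 - q)
    # Mean fitnesses of the two species.
    mean_1 = w_AA * p**2 + 2 * w_Aa * p * (1 - p) + w_aa * (1 - p) ** 2
    mean_2 = w_BB * q**2 + 2 * w_Bb * q * (1 - q) + w_bb * (1 - q) ** 2
    return p * w_A / mean_1, q * w_B / mean_2

# Hypothetical weak-selection parameters, chosen only for illustration.
p_next, q_next = select(0.3, 0.6, e1=0.05, e2=0.02, e3=-0.04, e4=0.01)
print(p_next, q_next)
```

Expanding $p^* - p$ gives $p(1-p)\,[e_2(1-2p) - e_1(1-p-q)]/\bar w^{(A)}$, so $p = q = 1/2$ is an equilibrium of the selection step for any choice of the four parameters.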
IV.
Continuous Model: Coevolution of Two Diploid Species
To proceed, we approximate the discrete system given by (2) and (8) by a continuous-time, continuous-space limit. We suppose that the separation $\varepsilon$ between nearest-neighbor demes is small both horizontally and vertically ($\varepsilon \to 0$) and that the number of demes is large. Let one generation correspond to $\delta$ units of time, where $\varepsilon^2/\delta$ tends to a positive constant as $\varepsilon$ and $\delta$ tend to 0. Suppose that the habitat length $L = (n + 1)\varepsilon$ and width $M = (m + 1)\varepsilon$ are fixed. Let $t$ represent the generation number. We will define new spatial and time variables $x = i\varepsilon$, $y = j\varepsilon$, and $\tau = t\delta$. Changing to these new variables, define $u(x, y, \tau) = p_{ij}(t)$ and $v(x, y, \tau) = q_{ij}(t)$, for $i = 0, \ldots, n$ and $j = 0, \ldots, m$. We suppose selection to be weak and rescale the parameters $e_i = \mu k_i$, where $\mu/\delta$ tends to a positive constant as $\mu$ and $\delta$ tend to 0. Thus, we can re-write the fitnesses from (7) as
$$\begin{aligned}
w_{AA}(x,y,\tau) &= 1 + \mu k_1\, q(x,y,\tau), & w_{BB}(x,y,\tau) &= 1 + \mu k_3\, p(x,y,\tau),\\
w_{Aa}(x,y,\tau) &= 1 + \mu k_2, & w_{Bb}(x,y,\tau) &= 1 + \mu k_4,\\
w_{aa}(x,y,\tau) &= 1 + \mu k_1[1 - q(x,y,\tau)], & w_{bb}(x,y,\tau) &= 1 + \mu k_3[1 - p(x,y,\tau)].
\end{aligned}$$
In the limit as $\mu$ tends to 0, there are no biologically-imposed restrictions on the values of $k_i$, $i = 1, \ldots, 4$. We make the standard diffusion assumptions [18] that as $\varepsilon$, $\delta$, and $\mu$ tend to 0,
$$\frac{\varepsilon^2}{\delta}\,[m_i] \to 2V_i, \qquad \frac{\mu}{\delta} \to 1, \tag{9}$$
where $i = 1, 2$. The terms $2V_i$ are measures of the infinitesimal variance in
displacement of individuals of each species per unit time. Since $\mu$ multiplies each $k_i$, there is no loss of generality in assuming the ratio of $\mu$ and $\delta$ to be 1 as $\mu$ and $\delta$ tend to 0. Using equations (2) and (8) and the change of variables to express the differences
$$u(x, y, \tau + \delta) - u(x, y, \tau) \qquad \text{and} \qquad v(x, y, \tau + \delta) - v(x, y, \tau)$$
for demes in the interior of the geographic landscape, dividing both sides of the equations by $\delta$, using (9), and taking the limit as $\delta \to 0$, we obtain
$$\begin{aligned}
u_\tau &= V_1 \nabla^2 u + u(1 - u)\,[k_2(1 - 2u) - k_1(1 - u - v)],\\
v_\tau &= V_2 \nabla^2 v + v(1 - v)\,[k_4(1 - 2v) - k_3(1 - u - v)],
\end{aligned} \tag{10}$$
where $\nabla^2$ is the Laplacian operator, $V_1$ and $V_2$ are proportional to the migration rates $m_1$ and $m_2$, and the $k_i$'s are proportional to the $e_i$'s. Appendix 1 contains the corresponding continuous models under haploid/haploid and haploid/diploid dynamics. At the boundaries of the geographical landscape, again using equations (2) and (8) and proceeding as before, we find that, with the scaling used to derive the reaction-diffusion equations (10) in the interior of the geographic landscape, there is now an unbounded term as $\varepsilon \to 0$ and $\delta \to 0$ proportional to $\mathbf{n} \cdot \nabla u$ in the equations for $u_\tau$ and to $\mathbf{n} \cdot \nabla v$ in the equations for $v_\tau$, where $\mathbf{n}$ is the unit outward normal to the boundary. We thus stipulate zero-flux boundary conditions
$$\mathbf{n} \cdot \nabla u(x, y, \tau) = 0, \qquad \mathbf{n} \cdot \nabla v(x, y, \tau) = 0, \qquad (11)$$
on the boundary of the geographical landscape for $\tau > 0$. These, together with (10) and specified initial conditions, constitute the reaction-diffusion system associated with the discrete stepping-stone models. We usually think of diffusion as driving a system towards homogeneity. Indeed, if the parameters $k_1, \ldots, k_4$, $V_1$, and $V_2$ are not chosen with care, we are likely to see initial differences in genetic conditions smoothed out across the geographic landscape as time goes on. Another possibility for this type of system is the presence of spiral waves, as seen in Gandon [16] under haploid/haploid dynamics; we have observed such time-varying patterns here as well with diploid/diploid dynamics. Sometimes, however, a diffusion-driven instability can lead to the formation and fixation of spatially-heterogeneous patterns. Given specified initial conditions, in order for non-fluctuating spatial patterns to form and be fixed over a suitably-defined landscape due to linear effects, it must be the case that in the absence of diffusion populations tend towards a stable spatially-uniform steady state. If, with specified unequal
ON GEOGRAPHICALLY-UNIFORM COEVOLUTION
diffusion coefficients, the spatially-uniform steady state becomes unstable, then non-fluctuating spatially-heterogeneous patterns can be fixed. Suppose that the diffusion approximation of a discrete model is
$$\frac{\partial u}{\partial \tau} = V_1 \nabla^2 u + f(u, v), \qquad \frac{\partial v}{\partial \tau} = V_2 \nabla^2 v + g(u, v), \qquad (12)$$
where $\nabla^2$ is the Laplacian operator in either one or two dimensions. According to linear analysis of the system, in order for non-fluctuating, heterogeneous spatial patterns to grow and become fixed, it is necessary and sufficient that
$$f_u(u,v) + g_v(u,v) < 0,$$
$$f_u(u,v)\,g_v(u,v) - f_v(u,v)\,g_u(u,v) > 0,$$
$$V_2 f_u(u,v) + V_1 g_v(u,v) > 0, \qquad (13)$$
$$[V_2 f_u(u,v) + V_1 g_v(u,v)]^2 - 4V_1V_2[f_u(u,v)\,g_v(u,v) - f_v(u,v)\,g_u(u,v)] > 0,$$
where $(u, v)$ is an equilibrium for the system in the absence of diffusion [19]. Briefly, the first two inequalities ensure that, in the absence of diffusion ($V_1 = V_2 = 0$), each eigenvalue of the system linearized about the equilibrium has negative real part. Given that the first two inequalities hold, the final two inequalities ensure that with migration at relative rates described by $V_1$ and $V_2$, the real part of at least one eigenvalue is pushed positive. The symmetric diploid-diploid system has an equilibrium at $u = 0.5$, $v = 0.5$ for which it is possible to satisfy the inequalities in (13). At this equilibrium, the inequalities reduce to
$$(k_1 - 2k_2) + (k_3 - 2k_4) < 0,$$
$$(k_1 - 2k_2)(k_3 - 2k_4) - k_1k_3 > 0,$$
$$V_2(k_1 - 2k_2) + V_1(k_3 - 2k_4) > 0, \qquad (14)$$
$$[V_2(k_1 - 2k_2) + V_1(k_3 - 2k_4)]^2 - 4V_1V_2[(k_1 - 2k_2)(k_3 - 2k_4) - k_1k_3] > 0.$$
The inequalities in (14) are useful in guiding our choice of parameters in the discrete coevolutionary model such that non-fluctuating patterns form. We emphasize that we are looking at the development of non-fluctuating heterogeneous spatial patterns due only to linear effects. It may be shown that one implication of (14) is that $k_1$ and $k_3$ must have opposite sign. Thus, referring to (6), we find that matching homozygotes must be helpful to homozygotes in one species and harmful in the other species if spatially-heterogeneous patterns are to form and be fixed. In Figure 3, we display representative non-fluctuating heterogeneous spatial patterns generated from the discrete diploid model (2) and (8). The discrete-model parameters were chosen such that the corresponding continuous-model parameters $V_1$, $V_2$, $k_1, \ldots, k_4$ satisfy the inequalities given in (14).
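The four inequalities in (14) are easy to check numerically for a candidate parameter set. The sketch below is only an illustration: the function name `pattern_conditions` is hypothetical, and it assumes that the discrete-model values quoted in the Figure 3 caption ($e_i$, $m_i$) can stand in directly for $k_i$ and $V_i$. That substitution is harmless only if all the $k_i$ share one proportionality constant with the $e_i$ and both $V_i$ share one constant with the $m_i$, because every inequality in (14) is homogeneous in the $k_i$ and in the $V_i$.

```python
def pattern_conditions(k1, k2, k3, k4, V1, V2):
    """Evaluate the four inequalities of (14) at the equilibrium u = v = 1/2."""
    a = k1 - 2.0 * k2          # proportional to f_u at (1/2, 1/2)
    b = k3 - 2.0 * k4          # proportional to g_v at (1/2, 1/2)
    det = a * b - k1 * k3      # proportional to f_u g_v - f_v g_u
    mixed = V2 * a + V1 * b    # proportional to V2 f_u + V1 g_v
    return {
        "trace_negative": a + b < 0,
        "det_positive": det > 0,
        "mixed_positive": mixed > 0,
        "discriminant_positive": mixed ** 2 - 4.0 * V1 * V2 * det > 0,
    }

# Figure 3 caption values used as stand-ins for k_i and V_i (an assumption).
fig3 = pattern_conditions(0.275, 0.275, -0.225, -0.175, V1=0.9, V2=0.045)

# Same selection parameters with equal migration rates, for comparison.
equal_migration = pattern_conditions(0.275, 0.275, -0.225, -0.175, V1=0.9, V2=0.9)

print(fig3)
print(equal_migration)
```

With equal migration rates the third quantity is a positive multiple of the first, so the first and third inequalities cannot hold together; this is one way to see why unequal diffusion coefficients are needed.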
Figure 3. Representative non-fluctuating heterogeneous spatial patterns. Initial conditions are randomly generated between 0.495 and 0.505 for each subpopulation. The shade of each square indicates the frequency of p in species 1 in the left-hand plot and the frequency of q in species 2 in the right-hand plot; the darker the shade, the higher the frequency ($e_1 = 0.275$, $e_2 = 0.275$, $e_3 = -0.225$, $e_4 = -0.175$, $m_1 = 0.9$, $m_2 = 0.045$).
These plots do not represent snapshots of a changing genetical landscape; rather, the patterns shown have become fixed. The shade of each square indicates the frequency of p in species 1 in the left-hand plot and the frequency of q in species 2 in the right-hand plot for the corresponding sub-population; the darker the shade, the higher the frequency.
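To get a feel for how such patterns emerge, the reaction-diffusion system (10) with zero-flux boundaries (11) can be integrated numerically. The following is a minimal sketch, not the authors' simulation: it works on a one-dimensional habitat rather than the two-dimensional grid of Figure 3, uses a plain explicit finite-difference scheme, and again treats the Figure 3 caption values as stand-ins for $k_i$ and $V_i$. The grid size, time step, number of steps and random seed are all arbitrary choices made for illustration.

```python
import random

def simulate_1d(k, V, n_cells=40, dx=1.0, dt=0.25, steps=6000, seed=1):
    """Explicit finite-difference integration of system (10) on a 1-D habitat
    with zero-flux ends. Returns the final (u, v) profiles as lists."""
    k1, k2, k3, k4 = k
    V1, V2 = V
    rng = random.Random(seed)
    # Start close to the uniform equilibrium u = v = 1/2, as in Figure 3.
    u = [0.5 + rng.uniform(-0.005, 0.005) for _ in range(n_cells)]
    v = [0.5 + rng.uniform(-0.005, 0.005) for _ in range(n_cells)]

    def laplacian(w, i):
        # Mirror the end cells so that no flux crosses the boundary.
        left = w[i - 1] if i > 0 else w[i]
        right = w[i + 1] if i < n_cells - 1 else w[i]
        return (left - 2.0 * w[i] + right) / dx ** 2

    for _ in range(steps):
        new_u, new_v = [], []
        for i in range(n_cells):
            ui, vi = u[i], v[i]
            f = ui * (1 - ui) * (k2 * (1 - 2 * ui) - k1 * (1 - ui - vi))
            g = vi * (1 - vi) * (k4 * (1 - 2 * vi) - k3 * (1 - ui - vi))
            new_u.append(ui + dt * (V1 * laplacian(u, i) + f))
            new_v.append(vi + dt * (V2 * laplacian(v, i) + g))
        u, v = new_u, new_v
    return u, v

def spread(w):
    """Standard deviation of a profile across the habitat."""
    mean = sum(w) / len(w)
    return (sum((x - mean) ** 2 for x in w) / len(w)) ** 0.5

u, v = simulate_1d(k=(0.275, 0.275, -0.225, -0.175), V=(0.9, 0.045))
print("final spread of u:", spread(u))
print("final spread of v:", spread(v))
```

The time step is kept below $dx^2/(2V_1)$ so that the explicit scheme remains stable. A spread well above the initial noise level indicates that the uniform state has given way to a spatial pattern.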
V.
Local Adaptation
It is natural to ask whether non-fluctuating heterogeneous spatial patterns that form are beneficial or harmful to each species according to some measure. There are various natural measures of local adaptation (e.g., [5], [15], [16]). Each measure provides an indication of the performance of a subpopulation of one species in the presence of the other species' local subpopulation, compared to the performance against the other species' metapopulation. For our model, the natural measure of performance is mean fitness. We define subpopulation local adaptation for species 1 or species 2, respectively, as
Species 1: Subpopulation local adaptation $= \bar{w}_{(A)}(p, q) - \bar{w}_{(A)}(p, q_{ave})$,
Species 2: Subpopulation local adaptation $= \bar{w}_{(B)}(p, q) - \bar{w}_{(B)}(p_{ave}, q)$. (15)
The frequencies
$$p_{ave} = \frac{1}{n}\sum_{i,j} p_{ij}, \qquad q_{ave} = \frac{1}{n}\sum_{i,j} q_{ij}$$
are averages of the frequencies p and q across the entire metapopulation, where n denotes the number of subpopulations in the metapopulation of each species. As before, we have suppressed subscripts identifying the subpopulation. To interpret this measure of local adaptation, consider a particular subpopulation of Species 1, with some frequency p of allele A. This subpopulation interacts with a particular subpopulation of Species 2, with some frequency q of allele B. The Species 1 subpopulation has some mean fitness in its current interaction with the particular Species 2 subpopulation. We can also compute a hypothetical mean fitness for the Species 1 subpopulation imagining it to interact with Species 2 across the entire metapopulation and using $q_{ave}$ for the mean frequency of allele B in Species 2 across the metapopulation. We have defined the local adaptation of a particular subpopulation of Species 1 to be the difference between these two mean fitness values. This local adaptation is thus a measure of how much better the Species 1 subpopulation does in its interaction with the corresponding Species 2 subpopulation than it would be expected to do in interacting with a randomly-selected subpopulation of Species 2. The mean fitnesses $\bar{w}_{(A)}$ and $\bar{w}_{(B)}$, computed from (4), may be rewritten through the use of (7) as
$$\bar{w}_{(A)} = 1 + [2p(1 - p)e_2 + (1 - p)^2 e_1] + [2(p - 0.5)e_1]q,$$
$$\bar{w}_{(B)} = 1 + [2q(1 - q)e_4 + (1 - q)^2 e_3] + [2(q - 0.5)e_3]p.$$
Only the terms in $\bar{w}_{(A)}$ involving q differ between $\bar{w}_{(A)}(p, q)$ and $\bar{w}_{(A)}(p, q_{ave})$; similarly, only the terms in $\bar{w}_{(B)}$ involving p differ between $\bar{w}_{(B)}(p, q)$ and $\bar{w}_{(B)}(p_{ave}, q)$. The expressions for subpopulation local adaptation given in (15) thus simplify to
Species 1: Subpopulation local adaptation $= 2e_1(p - 0.5)[q - q_{ave}]$,
Species 2: Subpopulation local adaptation $= 2e_3(q - 0.5)[p - p_{ave}]$.
We define the average local adaptation $A_{(A)}$ and $A_{(B)}$ of species 1 and species 2, respectively, as
Species 1 metapopulation: $A_{(A)} = \frac{1}{n}\sum_{i,j} 2e_1(p_{ij} - 0.5)[q_{ij} - q_{ave}]$,
Species 2 metapopulation: $A_{(B)} = \frac{1}{n}\sum_{i,j} 2e_3(q_{ij} - 0.5)[p_{ij} - p_{ave}]$. (16)
That is, the average local adaptation of a species is the average across the metapopulation of the subpopulation local adaptation values. Algebraic manipulation simplifies (16) to
Species 1 metapopulation: $A_{(A)} = 2e_1[(pq)_{ave} - p_{ave}q_{ave}] = 2e_1\,\mathrm{Cov}(p, q)$,
Species 2 metapopulation: $A_{(B)} = 2e_3[(pq)_{ave} - p_{ave}q_{ave}] = 2e_3\,\mathrm{Cov}(p, q)$, (17)
where $(pq)_{ave} = \frac{1}{n}\sum_{i,j} p_{ij}q_{ij}$ is the average of the product $p_{ij}q_{ij}$ across the metapopulation and Cov(p, q) denotes the covariance of p and q. If the value of $A$ is positive for the metapopulation of a species, we will say that the species exhibits local adaptation in the interaction; a negative value of $A$ corresponds to local maladaptation of the species. From (17), it follows that if species 1 and species 2 evolve independently of one another, so that p and q are independent, then there will be no average local adaptation or maladaptation. However, in this coevolutionary model the long-term values of p and q are not independent of one another. While the patterns for species 1 and species 2 need not be identical, usually there is a strong resemblance between the pattern that is fixed for species 1 and the pattern that is fixed for species 2. In general, either high p values are paired most often with high q values and low p values are paired most often with low q values (we will call this a direct resemblance), or high p values are paired most often with low q values and low p values are paired most often with high q values (we will call this an inverted resemblance). As mentioned previously, the inequalities in (14) may be shown algebraically to require that $k_1$ and $k_3$ have opposite sign, and so $e_1$ and $e_3$ have opposite sign as well. Thus, from (17), in this model one species will be locally adapted while the other species will be locally maladapted. Suppose that species 1 is more mobile than species 2, so that $V_1 > V_2$. If $e_1 > 0$ and $e_3 < 0$, computer simulations suggest that the pattern resemblance will be direct, regardless of the precise initial conditions, assuming that the initial conditions include slight variation and that the geographic landscape is large enough to allow patterns to form; Appendix 3 provides an informal analytical argument in support of this observation. With a direct resemblance, the covariance of p and q will be positive.
By (17), $A_{(A)}$ will be positive while $A_{(B)}$ will be negative, and so species 1 will be locally adapted while species 2 will be locally maladapted. If $e_1 < 0$ and $e_3 > 0$, computer simulations suggest that the pattern resemblance will be inverted, again regardless of the precise initial conditions. With an inverted resemblance, the covariance of p and q will be negative. Again, $A_{(A)}$ will be positive while $A_{(B)}$ will be negative, and so again species 1 will be locally adapted while species 2 will be locally maladapted. In each case the more mobile species exhibits local adaptation while the less mobile species exhibits local maladaptation.
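The reduction of average local adaptation to a multiple of Cov(p, q) can be checked numerically. The sketch below computes the average of the mean-fitness differences in (15) directly and compares it with $2e\,\mathrm{Cov}(p,q)$. The five frequency pairs are invented for illustration and do not come from the model; the function names are hypothetical, and the fitnesses are taken in the form $w_{AA} = 1 + e_1 q$, $w_{Aa} = 1 + e_2$, $w_{aa} = 1 + e_1(1 - q)$ (and likewise for species 2).

```python
def mean_fitness(own, other, e_hom, e_het):
    """Mean fitness of a diploid subpopulation with allele frequency `own`
    facing a partner subpopulation with allele frequency `other`."""
    w_hom1 = 1 + e_hom * other          # w_AA (or w_BB)
    w_het = 1 + e_het                   # w_Aa (or w_Bb)
    w_hom2 = 1 + e_hom * (1 - other)    # w_aa (or w_bb)
    return own ** 2 * w_hom1 + 2 * own * (1 - own) * w_het + (1 - own) ** 2 * w_hom2

def average_local_adaptation(own_freqs, other_freqs, e_hom, e_het):
    """Average over subpopulations of w(own, other) - w(own, other_ave)."""
    other_ave = sum(other_freqs) / len(other_freqs)
    diffs = [mean_fitness(x, y, e_hom, e_het) - mean_fitness(x, other_ave, e_hom, e_het)
             for x, y in zip(own_freqs, other_freqs)]
    return sum(diffs) / len(diffs)

def covariance(xs, ys):
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - (sum(xs) / n) * (sum(ys) / n)

# Hypothetical frequencies for five subpopulations with a direct resemblance.
ps = [0.20, 0.35, 0.50, 0.70, 0.85]
qs = [0.30, 0.40, 0.45, 0.65, 0.80]
e1, e2, e3, e4 = 0.275, 0.275, -0.225, -0.175

A_species1 = average_local_adaptation(ps, qs, e1, e2)
A_species2 = average_local_adaptation(qs, ps, e3, e4)
cov = covariance(ps, qs)
print(A_species1, 2 * e1 * cov)
print(A_species2, 2 * e3 * cov)
```

Because the covariance here is positive and $e_1$, $e_3$ have opposite signs, the two averages come out with opposite signs, as the text describes for a direct resemblance.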
Figure 4. Local adaptation. Upper plots are frequency plots. Lower plots are local adaptation plots, with patches of local adaptation in black and patches of local maladaptation in white. Species 1 (left) is more mobile than species 2 (right) and the weak pattern resemblance is inverted ($e_1 = -0.7$, $e_2 = 0.1$, $e_3 = 0.9$, $e_4 = 0.1$, $m_1 = 0.9$, $m_2 = 0.045$).
Local adaptation (or maladaptation) of the metapopulation does not imply local adaptation (or maladaptation) of every subpopulation. If the patterns for the two species are identical in form, there will be local adaptation (or maladaptation) of every subpopulation. If the patterns simply share a general resemblance, there may be pockets of local maladaptation (or adaptation) that survive amidst the metapopulation local adaptation (or maladaptation). In Figure 4, we display representative plots illustrating local adaptation and maladaptation. The upper plots are frequency plots, with the shade of each square indicating the frequency of p in species 1 in the left-hand plot and the frequency of q in species 2 in the right-hand plot; the darker the shade, the higher the frequency for the corresponding sub-population. The lower plots are local adaptation plots, where we have illustrated patches of local adaptation in black (A > 0) and patches of local maladaptation in
white ($A < 0$). Species 1 (left) is more mobile than species 2 (right). The average local adaptation values are $A_{(A)} \approx 0.024$, $A_{(B)} \approx -0.031$.
As expected, the more mobile species 1 is locally adapted while the less mobile species 2 is locally maladapted. The inequalities in (14) can be satisfied even if one of the migration rates is zero, corresponding to an interaction between a mobile species and a nonmobile species. The resulting patterns look similar to those occurring with two-species migration. The migration by the mobile species drives the formation of the resulting patterns for both species. It is thus perhaps not surprising that, in agreement with the results for two-species migration, the migrating species exhibits local adaptation while the non-migrating species exhibits local maladaptation.
VI.
Discussion
The discrete and continuous models described here exhibit the formation and fixation of non-fluctuating spatially-heterogeneous patterns. We emphasize once more two important qualities of these patterns. First, the patterns shown are not snapshots of time-varying patterns but are in fact fixed by the coevolutionary interaction. While the model appears similar to that described by Gandon [16], Gandon investigated time-varying spiral waves resulting from haploid-haploid dynamics. The increased genetic complexity introduced by the diploid-diploid dynamics of our model allows for the fixation of spatial patterns. Second, the patterns shown here form under geographically-uniform coevolution, and thus represent very different dynamics than those shown in Nuismer et al. [12] or Gomulkiewicz et al. [13]. As we have seen, the formation and fixation of these non-fluctuating spatially-heterogeneous patterns requires unequal migration rates for the two species (including the case of a non-mobile species interacting with a mobile species). Our model provides support for the claim that the more mobile species in a coevolutionary interaction is likely to achieve local adaptation, with pockets of local adaptation dominating pockets of local maladaptation. The less mobile species is likely to suffer local maladaptation. This continues to be true in the case of a non-mobile species interacting with a mobile species. These results are in agreement with Lively [5], Gandon et al. [14], Gandon and Michalakis [15], and Gandon [16], all of whom worked in the context of parasite-host interactions. We find it interesting that this influence of migration rate on local adaptation appears to hold for a wide class of coevolutionary interactions.
References
1. J. N. Thompson. The Coevolutionary Process. Chicago: University of Chicago Press, 1994.
2. J. J. Burdon, P. H. Thrall. Coevolution at multiple spatial scales: Linum marginale-Melampsora lini - from the individual to the species. Evolutionary Ecology 14: 261-281, 2000.
3. C. W. Benkman, W. C. Holimon, J. W. Smith. The influence of a competitor on the geographic mosaic of coevolution between crossbills and lodgepole pine. Evolution 55: 282-294, 2001.
4. E. D. I. Brodie, E. D. J. Brodie. Predator-prey arms races. Bioscience 49: 557-568, 1999.
5. C. M. Lively. Migration, virulence, and the geographic mosaic of adaptation by parasites. Am. Nat. 153: S34-S47, 1999.
6. M. R. Berenbaum, A. R. Zangerl. Chemical phenotype matching between a plant and its insect herbivore. Proc. Natl. Acad. Sci. USA 95: 13743-13748, 1998.
7. M. A. Parker. Mutualism in metapopulations of legumes and rhizobia. Am. Nat. 153: S48-S60, 1999.
8. J. N. Thompson. Evaluating the dynamics of coevolution among geographically structured populations. Ecology 78: 1619-1623, 1997.
9. S. Gavrilets, A. Hastings. Coevolutionary chase in two-species systems with applications to mimicry. J. Theor. Biol. 191: 415-427, 1998.
10. J. M. Switkes, M. E. Moody. Coevolutionary interactions between a haploid species and a diploid species. J. Math. Bio. 42: 175-194, 2001.
11. S. L. Nuismer, J. N. Thompson, R. Gomulkiewicz. Gene flow and geographically structured coevolution. Proc. R. Soc. Lond. 266: 605-609, 1999.
12. S. L. Nuismer, J. N. Thompson, R. Gomulkiewicz. Coevolutionary clines across selection mosaics. Evolution 54 (4): 1102-1115, 2000.
13. R. Gomulkiewicz, J. N. Thompson, R. D. Holt, S. L. Nuismer, M. E. Hochberg. Hot spots, cold spots, and the geographic mosaic theory of coevolution. Am. Nat. 156: 156-174, 2000.
14. S. Gandon, Y. Capowiez, Y. Dubois, Y. Michalakis, I. Olivieri. Local adaptation and gene-for-gene coevolution in a metapopulation model. Proc. R. Soc. B 263: 1003-1009, 1996.
15. S. Gandon, Y. Michalakis. Local adaptation, evolutionary potential and host-parasite coevolution: interactions between migration, mutation, population size and generation time. J. Evol. Bio. 15: 451-463, 2002.
16. S. Gandon. Local adaptation and the geometry of host-parasite coevolution. Ecol. Letters 5: 246-257, 2002.
17. T. Nagylaki. Introduction to theoretical population dynamics. Springer, Berlin, 1992.
18. J. F. Crow, M. Kimura. An introduction to theoretical population genetics. Harper and Row, New York, 1970.
19. J. D. Murray. Mathematical Biology. Berlin: Springer, 1993, pp 372-397.
Appendix 1: Haploid-Haploid and Haploid-Diploid Models
Haploid-Haploid
If both species are assumed to be haploid, a natural model is obtained by retaining (1) and (2), replacing (3) and (5) by
$$w_A = 1 + c_1 q, \quad w_a = 1 + c_2 + c_3 q, \quad w_B = 1 + c_4 p, \quad w_b = 1 + c_5 + c_6 p, \qquad (18)$$
and replacing (4) by
$$\bar{w}_{(A)} = w_A p + w_a(1 - p), \qquad \bar{w}_{(B)} = w_B q + w_b(1 - q). \qquad (19)$$
The continuous approximation has the form
Haploid-Diploid
If species 1 is haploid and species 2 is diploid, a natural model is obtained by retaining (1) and (2), replacing (3) and (5) by
dwp,
(20)
and replacing (4) by
$$\bar{w}_{(A)} = w_A p + w_a(1 - p), \qquad \bar{w}_{(B)} = w_{BB}\,q^2 + w_{Bb} \cdot 2q(1 - q) + w_{bb}(1 - q)^2. \qquad (21)$$
The continuous approximation is
$$\frac{\partial u}{\partial \tau} = V_1 \nabla^2 u + u(1 - u)[(j_2 - j_5)v^2 + (j_1 - j_4)v - j_3],$$
$$\frac{\partial v}{\partial \tau} = V_2 \nabla^2 v + v(1 - v)[\{v(j_6 - 2j_8 + j_{10}) + (j_6 - j_{10})\}u - \{v(2j_7 - j_9) + (j_9 - \ldots$$
Appendix 2: Absence of Patterns in HH and HD Models
Let $(u, v)$ be an equilibrium of a general reaction-diffusion system (12) in the absence of diffusion ($V_1 = V_2 = 0$). In order to satisfy simultaneously the first and third inequalities in (13), we need $f_u(u,v)$ and $g_v(u,v)$ to be non-zero and of opposite sign. Then by the second inequality in (13) we have that $f_v(u, v)$ and $g_u(u, v)$ must each be nonzero and of opposite sign as well. Thus, in the absence of diffusion, all four entries in the Jacobian matrix corresponding to system (12) evaluated at $(u,v)$ must be nonzero. Whenever we can demonstrate that a particular model incorporating certain genetic structure has at least one entry of zero in the Jacobian matrix evaluated at an equilibrium in the absence of diffusion, then we are assured that non-fluctuating heterogeneous patterns due to linear effects about that equilibrium do not occur in the model when diffusion is incorporated. In the haploid-haploid model, the Jacobian matrix in the absence of diffusion is
The equilibria are (0, 0), (0, 1), (1, 0), (1, 1), and potentially (is/(u - is), «2/(«i is)). At an edge or corner equilibrium, one or both of the off-diagonal entries will be zero. At an internal equilibrium, both of the diagonal entries will be zero. Thus, the haploid-haploid model will not exhibit non-fluctuating heterogeneous spatial patterns due to linear effects. In the haploid-diploid model, the Jacobian matrix in the absence of diffusion is ,
$$\begin{pmatrix} f_u(u,v) & f_v(u,v) \\ g_u(u,v) & g_v(u,v) \end{pmatrix},$$
where
$$f_u(u,v) = (1 - 2u)[(j_2 - j_5)v^2 + (j_1 - j_4)v - j_3],$$
$$f_v(u,v) = u(1 - u)[2(j_2 - j_5)v + (j_1 - j_4)],$$
$$g_u(u,v) = v(1 - v)\{v(j_6 - 2j_8 + j_{10}) + (j_6 - j_{10})\},$$
$$g_v(u,v) = (1 - 2v)[\{v(j_6 - 2j_8 + j_{10}) + (j_6 - j_{10})\}u - \{v(2j_7 - j_9 \ldots$$
The equilibria are (0,0), (0,1), (1,0), (1,1), potentially several other edge equilibria, and up to two internal equilibria. At an edge or corner equilibrium, one or both of the off-diagonal entries will be zero. At an internal equilibrium, the upper left diagonal entry will be zero. Thus, the haploid-diploid model also will not exhibit non-fluctuating heterogeneous spatial patterns due to linear effects.
Appendix 3: Pattern Resemblance Analysis
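We have seen that the system of partial differential equations
$$u_\tau = V_1 \nabla^2 u + u(1 - u)[k_2(1 - 2u) - k_1(1 - u - v)],$$
$$v_\tau = V_2 \nabla^2 v + v(1 - v)[k_4(1 - 2v) - k_3(1 - u - v)] \qquad (22)$$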
can exhibit the formation and fixation of stable, heterogeneous spatial patterns. As mentioned previously, while the patterns for species 1 and species 2 need not be identical, usually there is a strong resemblance between the pattern that is fixed for species 1 and the pattern that is fixed for species 2. Suppose that species 1 is the more mobile species, that is, that $V_1 > V_2$. Recall that inside the pattern region $k_1k_3 < 0$. Here, we use linear analysis to suggest that if $k_1 > 0$ and $k_3 < 0$, then the pattern resemblance is most likely direct; if $k_1 < 0$ and $k_3 > 0$, the pattern resemblance is most likely inverted. Following the more general analysis in Murray [19], we let
$$\mathbf{w} = \begin{pmatrix} u - 1/2 \\ v - 1/2 \end{pmatrix}$$
and linearize system (22) about the non-diffusion equilibrium at (1/2, 1/2), obtaining
$$\mathbf{w}_\tau = \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix} \nabla^2 \mathbf{w} + \begin{pmatrix} (k_1 - 2k_2)/4 & k_1/4 \\ k_3/4 & (k_3 - 2k_4)/4 \end{pmatrix} \mathbf{w}. \qquad (23)$$
Still following Murray [19], we look for product solutions of (23) of the form
$$\mathbf{w}(x, y, \tau) = \sum_{m,n} c_{m,n}\, e^{\lambda(k^2)\tau}\, \mathbf{W}_{m,n}(x, y). \qquad (24)$$
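Here, $\mathbf{W}_{m,n}(x,y)$ is a vector solution of the uncoupled eigenvalue problem
$$\nabla^2 \mathbf{W}_{m,n}(x, y) + k^2 \mathbf{W}_{m,n}(x, y) = 0 \qquad (25)$$
with zero-flux boundary conditions, where $k^2 = \pi^2(m^2/L^2 + n^2/W^2)$ for a two-dimensional spatial habitat on $0 < x < L$, $0 < y < W$. The vector function $\mathbf{W}_{m,n}(x, y)$ has the form of a constant vector multiplied by $\cos(m\pi x/L)\cos(n\pi y/W)$. The scalars $c_{m,n}$ in (24) are determined by the initial conditions. Substituting $\mathbf{w}(x,y,\tau)$ into (23), we obtain
$$\sum_{m,n} \lambda\, c_{m,n} e^{\lambda\tau} \mathbf{W}_{m,n}(x,y) = \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix} \sum_{m,n} c_{m,n} e^{\lambda\tau} \nabla^2 \mathbf{W}_{m,n}(x,y) + \begin{pmatrix} (k_1 - 2k_2)/4 & k_1/4 \\ k_3/4 & (k_3 - 2k_4)/4 \end{pmatrix} \sum_{m,n} c_{m,n} e^{\lambda\tau} \mathbf{W}_{m,n}(x,y).$$
In order for this equality to hold for all $x$, $y$, $\tau$, it must be the case that for each $m, n$ pair,
$$\lambda\, c_{m,n} e^{\lambda\tau} \mathbf{W}_{m,n}(x,y) = \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix} c_{m,n} e^{\lambda\tau} \nabla^2 \mathbf{W}_{m,n}(x,y) + \begin{pmatrix} (k_1 - 2k_2)/4 & k_1/4 \\ k_3/4 & (k_3 - 2k_4)/4 \end{pmatrix} c_{m,n} e^{\lambda\tau} \mathbf{W}_{m,n}(x,y).$$
Dividing through by $c_{m,n}$ and $e^{\lambda\tau}$ we have that
$$\lambda \mathbf{W}_{m,n}(x,y) = \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix} \nabla^2 \mathbf{W}_{m,n}(x,y) + \begin{pmatrix} (k_1 - 2k_2)/4 & k_1/4 \\ k_3/4 & (k_3 - 2k_4)/4 \end{pmatrix} \mathbf{W}_{m,n}(x,y). \qquad (26)$$
Using (25) in (26) and rearranging the terms, we obtain
$$\left[ \begin{pmatrix} (k_1 - 2k_2)/4 & k_1/4 \\ k_3/4 & (k_3 - 2k_4)/4 \end{pmatrix} - k^2 \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix} - \lambda I \right] \mathbf{W}_{m,n}(x,y) = 0, \qquad (27)$$
where $I$ is the 2 x 2 identity matrix. Continuing to follow Murray [19], the eigenvalues $\lambda(k^2)$ are given by the roots of the characteristic polynomial
$$\det \begin{pmatrix} (k_1 - 2k_2)/4 - V_1k^2 - \lambda & k_1/4 \\ k_3/4 & (k_3 - 2k_4)/4 - V_2k^2 - \lambda \end{pmatrix} = 0.$$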
After algebraic simplification,
$$\lambda_{1,2} = \frac{1}{8}\left\{ k_1 - 2k_2 + k_3 - 2k_4 - 4k^2(V_1 + V_2) \pm \sqrt{[(2k_2 - k_1 + 4k^2V_1) - (2k_4 - k_3 + 4k^2V_2)]^2 + 4k_1k_3} \right\}. \qquad (28)$$
Suppose that parameters $k_1, \ldots, k_4$, $V_1$, $V_2$ have been chosen to satisfy the inequality requirements given in this paper for pattern formation and fixation. Temporarily setting $V_1$ and $V_2$ to zero to look at the corresponding eigenvalues in the absence of diffusion, both eigenvalues will have negative real part, since parameters were chosen inside the pattern region. Thus, $k_1 - 2k_2 + k_3 - 2k_4 - 4k^2(V_1 + V_2) < 0$. Introducing diffusion at relative levels $V_1$ and $V_2$, in order for patterns to form and become fixed due to linear effects it is necessary and sufficient that the real part of at least one eigenvalue become positive. Since increasing diffusion levels from 0 to $V_1$ and $V_2$, respectively, decreases the quantity outside the radical on the right-hand side of (28), it must be the case that the radical is real-valued in the presence of diffusion at levels $V_1$ and $V_2$, with the radicand increased enough by the presence of this diffusion to push the real part of one eigenvalue positive. Thus, with the diffusion at levels $V_1$ and $V_2$, the eigenvalues are real-valued, with $\lambda_1 > 0$ and $\lambda_2 < 0$. It is the positive $\lambda_1$ which is responsible for the growth of patterns, as solutions corresponding to $\lambda_1$ will grow, rather than decay, with time. Consider an arbitrary position $(X, Y)$ in the spatial habitat and arbitrary values $m = M$, $n = N$. Let
$$\begin{pmatrix} W_1 \\ W_2 \end{pmatrix} = \mathbf{W}_{M,N}(X, Y).$$
By (27),
$$\left( \frac{k_1 - 2k_2}{4} - V_1k^2 - \lambda_1 \right) W_1 + \frac{k_1}{4} W_2 = 0.$$
That is,
$$\frac{k_1}{4} W_2 = -\left( \frac{k_1 - 2k_2}{4} - V_1k^2 - \lambda_1 \right) W_1. \qquad (29)$$
Since $V_1 > V_2$, by the first and third inequalities in (14) for pattern formation and fixation, it must be the case that $k_1 - 2k_2 < 0$. Thus, the sum in parentheses in (29) is negative. We therefore conclude that $W_1$ and $W_2$ have the same sign if and only if $k_1 > 0$. That is, with $V_1 > V_2$, each eigenfunction $\mathbf{W}_{m,n}(x, y)$ corresponding to $\lambda_1$ causes contributions of like sign to both
components of the vector solution
$$\mathbf{w}(x,y,\tau) = \sum_{m,n} c_{m,n}\, e^{\lambda(k^2)\tau}\, \mathbf{W}_{m,n}(x,y)$$
if and only if $k_1 > 0$. Linear combinations of like-sign contributions most often will lead to like-sign components in $\mathbf{w}(x,y,\tau)$. To get an intuitive idea of why this is the case, consider for the moment linear combinations of two positive-valued vectors in the xy-plane. If the vectors are not multiples of each other, the vectors form a basis for the xy-plane. However, if a linear combination of the vectors is chosen at random, the resultant vector will fall in the first and third quadrant regions between the lines formed by the basis vectors fifty percent of the time. The resultant vector will fall in the first and third quadrants more than fifty percent of the time. Similar conclusions may be drawn for linear combinations of two or more vectors, each with like-sign components. Thus, with $V_1 > V_2$, the pattern resemblance is most likely direct if $k_1 > 0$ and most likely inverted if $k_1 < 0$.
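The linear argument of this appendix can be illustrated numerically. The sketch below builds the 2 x 2 matrix obtained by linearizing system (22) about (1/2, 1/2) for a single mode with squared wavenumber $k^2$, computes its two eigenvalues, and then reads off the sign of $W_2/W_1$ for the growing mode from the first row of the eigenvector equation. The parameter values are the Figure 3 caption values used as stand-ins for $k_i$ and $V_i$, and the choice $k^2 = 0.3$ is arbitrary; both are assumptions made only for illustration, and `growth_rates` is a hypothetical helper name.

```python
import cmath

def growth_rates(k, V, ksq):
    """Eigenvalues of (J - ksq * D) for system (22) linearized at (1/2, 1/2).

    Returns (lambda_1, lambda_2, m11, m12), where m11 and m12 are the first-row
    entries of J - ksq * D.
    """
    k1, k2, k3, k4 = k
    V1, V2 = V
    m11 = (k1 - 2 * k2) / 4 - V1 * ksq
    m12 = k1 / 4
    m21 = k3 / 4
    m22 = (k3 - 2 * k4) / 4 - V2 * ksq
    trace = m11 + m22
    det = m11 * m22 - m12 * m21
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2, m11, m12

k = (0.275, 0.275, -0.225, -0.175)   # stand-ins for k_1..k_4 (assumption)
V = (0.9, 0.045)                      # stand-ins for V_1, V_2 (assumption)

# Without spatial structure (ksq = 0) the equilibrium should be stable.
lam_uniform, _, _, _ = growth_rates(k, V, 0.0)

# For an intermediate wavenumber one eigenvalue becomes positive.
lam1, lam2, m11, m12 = growth_rates(k, V, 0.3)

# First row of (J - ksq*D - lam1*I) W = 0 gives W2/W1 = (lam1 - m11) / m12.
w2_over_w1 = (lam1.real - m11) / m12

print("ksq = 0.0:", lam_uniform)
print("ksq = 0.3:", lam1, lam2)
print("W2 / W1 for the growing mode:", w2_over_w1)
```

Here $k_1 > 0$, so a positive ratio $W_2/W_1$ corresponds to the direct resemblance described in the text.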
Approximating the Time Delay in Coupled van der Pol Oscillators with Delay Coupling
Stephen A. Wirkus
Department of Mathematics
California State Polytechnic University
Pomona, CA 91768
Abstract A system of two van der Pol oscillators with delayed velocity coupling is examined. The method of averaging is used to rewrite the system and an analysis of the stability and bifurcation of the equilibria is performed. These bifurcation curves are compared with those obtained from a Taylor series truncation of the averaged equations. The results show excellent agreement.
I.
INTRODUCTION
Previous work has investigated the dynamics of two weakly coupled van der Pol oscillators in which the coupling terms have time delay $\tau$ [7], [8], [9], [10]. The coupling was chosen to be via the first derivative terms because this form of coupling occurs in radiatively coupled microwave oscillator arrays [2], [3], [12], [13]. The method of averaging was used to obtain an approximate simplified system of three slow flow equations and then the terms with the time delay were approximated via a Taylor series expansion. It was also predicted that the stability curves for the in-phase and out-of-phase modes are periodic in the delay. Numerical integration of the original system showed that the approximated system agrees well when certain parameters are small and begins to break down for larger parameter values. Work on other models with time delay found non-periodic dependence of the bifurcation curves on the time delay [4], [5], [6], [11]. This current work examines the averaged equations before they have been Taylor expanded and then investigates the stability and bifurcation of their equilibria. These bifurcation curves are compared with those obtained with the Taylor series truncation and the validity of the original results is examined.
II.
THE SLOW FLOW EQUATIONS
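The original equations under investigation were two van der Pol oscillators with delay coupling [7], [8], [9], [10]:
$$\ddot{x}_1 + x_1 - \varepsilon(1 - x_1^2)\dot{x}_1 = \varepsilon\alpha\,\dot{x}_2(t - \tau), \qquad (1)$$
$$\ddot{x}_2 + x_2 - \varepsilon(1 - x_2^2)\dot{x}_2 = \varepsilon\alpha\,\dot{x}_1(t - \tau), \qquad (2)$$
where $\alpha$ is a coupling parameter, $\tau$ is the delay time, and where $\varepsilon \ll 1$. Assuming small $\varepsilon$, it was shown that this system yielded the averaged equations
$$\dot{R}_i = \frac{\varepsilon}{2} R_i\left(1 - \frac{R_i^2}{4}\right) + \frac{\varepsilon\alpha}{2}\tilde{R}_j \cos(\theta_i - \tilde{\theta}_j + \tau), \qquad (3)$$
$$\dot{\theta}_i = -\frac{\varepsilon\alpha}{2}\frac{\tilde{R}_j}{R_i} \sin(\theta_i - \tilde{\theta}_j + \tau), \qquad (4)$$
where the ansatz
$$x_i = R_i \cos(t + \theta_i), \qquad \dot{x}_i = -R_i \sin(t + \theta_i) \qquad (5)$$
was used and $i, j = 1, 2$. Here $\tilde{R}_j = R_j(t - \tau)$, $\tilde{\theta}_j = \theta_j(t - \tau)$. The previous work then stated that equations (3)-(4) show that $\dot{R}_i$, $\dot{\theta}_i$ are $O(\varepsilon)$ and that, for $O(1)$ values of $\tau$, $\tilde{R}_j$, $\tilde{\theta}_j$ can be replaced by $R_j$, $\theta_j$ in equations (3) and (4) [7], [8], [9], [10]. The only assumption was that the product $\varepsilon\tau$ is small. Numerical integration of the original differential delay equations (1)-(2) showed that this approximation gives good agreement when $\varepsilon\tau$ is small. We thus begin by considering equations (3)-(4) written as
$$\dot{\theta}_1 = \frac{\varepsilon\alpha}{2}\frac{\tilde{R}_2}{R_1} \sin(\tilde{\theta}_2 - \theta_1 - \tau), \qquad (6)$$
$$\dot{R}_1 = \frac{\varepsilon}{2}\left[ R_1\left(1 - \frac{R_1^2}{4}\right) + \alpha \tilde{R}_2 \cos(\theta_1 - \tilde{\theta}_2 + \tau) \right], \qquad (7)$$
$$\dot{\theta}_2 = \frac{\varepsilon\alpha}{2}\frac{\tilde{R}_1}{R_2} \sin(\tilde{\theta}_1 - \theta_2 - \tau), \qquad (8)$$
$$\dot{R}_2 = \frac{\varepsilon}{2}\left[ R_2\left(1 - \frac{R_2^2}{4}\right) + \alpha \tilde{R}_1 \cos(\tilde{\theta}_1 - \theta_2 - \tau) \right]. \qquad (9)$$
In the previous work, the substitution $\phi = \theta_1 - \theta_2$ was used. Here, we let
$$\psi_1 = \theta_1 - \tilde{\theta}_2, \qquad \psi_2 = \tilde{\theta}_1 - \theta_2. \qquad (10)$$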
APPROXIMATING TIME DELAY
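Incorporating equation (10) into equations (6)-(9) gives the following equations:
$$\dot{R}_1 = \frac{\varepsilon}{2}\left[ R_1\left(1 - \frac{R_1^2}{4}\right) + \alpha \tilde{R}_2 \cos(\psi_1 + \tau) \right], \qquad (11)$$
$$\dot{R}_2 = \frac{\varepsilon}{2}\left[ R_2\left(1 - \frac{R_2^2}{4}\right) + \alpha \tilde{R}_1 \cos(\psi_2 - \tau) \right], \qquad (12)$$
$$\dot{\psi}_1 = -\frac{\varepsilon\alpha}{2}\left[ \frac{\tilde{R}_2}{R_1} \sin(\psi_1 + \tau) + \frac{\hat{R}_1}{\tilde{R}_2} \sin(\tilde{\psi}_2 - \tau) \right], \qquad (13)$$
$$\dot{\psi}_2 = -\frac{\varepsilon\alpha}{2}\left[ \frac{\hat{R}_2}{\tilde{R}_1} \sin(\tilde{\psi}_1 + \tau) + \frac{\tilde{R}_1}{R_2} \sin(\psi_2 - \tau) \right], \qquad (14)$$
where $\tilde{R}_i = R_i(t - \tau)$, $\hat{R}_i = R_i(t - 2\tau)$ and $\tilde{\psi}_i = \psi_i(t - \tau)$. In comparing with the previous work, when $\tilde{\theta}_i \approx \theta_i$ we obtain $\psi_1 = \psi_2$ in equations (11)-(14). Thus, $\psi_1 = \psi_2 = \phi$ and if we also assume that $\hat{R}_i \approx \tilde{R}_i \approx R_i$, the two systems are the same. Equations (11)-(14) are those that will be considered for the remainder of this paper. Note that $R_1$ and $R_2$ are nonnegative and the vector field associated with equations (11)-(14) is periodic in $\psi_i$. Equations (11)-(14) are invariant under the two transformations:
$$(R_1, R_2, \psi_1, \psi_2) \mapsto (R_2, R_1, -\psi_2, -\psi_1) \qquad (15)$$
and
$$\psi_1 \mapsto \psi_1 + \pi, \qquad \psi_2 \mapsto \psi_2 + \pi, \qquad \alpha \mapsto -\alpha. \qquad (16)$$
Unlike the previous work, there are no invariances involving the parameter $\tau$; the functional dependence of $\tilde{R}_i$ and $\tilde{\psi}_i$ on $\tau$ prevents such an invariance [7], [8], [9], [10].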
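The invariance under a shift of both phase differences by $\pi$ together with $\alpha \mapsto -\alpha$ can be checked numerically. The sketch below assumes the slow flow takes the form obtained by substituting (10) into (6)-(9): cosine coupling in the amplitude equations and sine coupling in the phase equations, with singly and doubly delayed amplitudes. The function name, the argument layout, the value of $\varepsilon$ and the test state are all hypothetical choices made for illustration.

```python
import math

def slow_flow(R1, R2, psi1, psi2, delayed, alpha, tau, eps=0.1):
    """Right-hand sides of the slow flow for (R1, R2, psi1, psi2).

    `delayed` holds R1(t-tau), R2(t-tau), R1(t-2tau), R2(t-2tau),
    psi1(t-tau), psi2(t-tau), in that order (assumed form of the system).
    """
    R1d, R2d, R1dd, R2dd, psi1d, psi2d = delayed
    dR1 = 0.5 * eps * (R1 * (1 - R1 ** 2 / 4) + alpha * R2d * math.cos(psi1 + tau))
    dR2 = 0.5 * eps * (R2 * (1 - R2 ** 2 / 4) + alpha * R1d * math.cos(psi2 - tau))
    dpsi1 = -0.5 * eps * alpha * (R2d / R1 * math.sin(psi1 + tau)
                                  + R1dd / R2d * math.sin(psi2d - tau))
    dpsi2 = -0.5 * eps * alpha * (R2dd / R1d * math.sin(psi1d + tau)
                                  + R1d / R2 * math.sin(psi2 - tau))
    return dR1, dR2, dpsi1, dpsi2

alpha, tau = 0.3, 0.7

# An arbitrary state and delayed history, used only to test the symmetry.
state = (1.8, 2.1, 0.4, -0.2)
history = (1.7, 2.0, 1.9, 2.2, 0.3, -0.1)
original = slow_flow(*state, history, alpha, tau)

# Shift every phase difference (current and delayed) by pi and flip alpha.
shifted_state = (1.8, 2.1, 0.4 + math.pi, -0.2 + math.pi)
shifted_history = (1.7, 2.0, 1.9, 2.2, 0.3 + math.pi, -0.1 + math.pi)
transformed = slow_flow(*shifted_state, shifted_history, -alpha, tau)

# In-phase candidate: equal constant amplitudes, zero phase differences.
R0 = 2 * math.sqrt(1 + alpha * math.cos(tau))
in_phase = slow_flow(R0, R0, 0.0, 0.0, (R0, R0, R0, R0, 0.0, 0.0), alpha, tau)

print(original)
print(transformed)
print(in_phase)
```

The last call evaluates the right-hand sides at constant amplitude $2\sqrt{1 + \alpha\cos\tau}$ with zero phase differences, where every component should vanish up to rounding.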
III.
EQUILIBRIA AND STABILITY
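In keeping the time delays, we have the additional terms $\tilde{R}_i$, $\hat{R}_i$ and $\tilde{\psi}_i$ in equations (11)-(14). At equilibria, we have that
$$R_i = \tilde{R}_i = \hat{R}_i, \qquad \psi_i = \tilde{\psi}_i.$$
Thus, the equilibria are the same as in the previous work. In particular, the in-phase mode is given by
$$R_1 = R_2 = 2\sqrt{1 + \alpha\cos\tau}, \qquad \psi_1 = \psi_2 = 0,$$
the out-of-phase mode is given by
$$R_1 = R_2 = 2\sqrt{1 - \alpha\cos\tau}, \qquad \psi_1 = \psi_2 = \pi,$$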
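and the unsymmetrical equilibria, although not given in closed form, also exist. The stability of these equilibria may be different, however, in different regions of parameter space. The bifurcation curves can be calculated. For the in-phase mode, we linearize by letting
$$R_i(t) = u_i(t) + 2\sqrt{1 + \alpha\cos\tau}, \qquad (17)$$
$$R_i(t - \tau) = u_i(t - \tau) + 2\sqrt{1 + \alpha\cos\tau}, \qquad (18)$$
$$R_i(t - 2\tau) = u_i(t - 2\tau) + 2\sqrt{1 + \alpha\cos\tau}, \qquad (19)$$
$$\psi_i(t) = v_i(t), \qquad (20)$$
$$\psi_i(t - \tau) = v_i(t - \tau) \qquad (21)$$
be small perturbations about the in-phase mode. We substitute equations (17)-(21) into equations (11)-(14). Writing $\tilde{u}_i = u_i(t - \tau)$, $\hat{u}_i = u_i(t - 2\tau)$, $\tilde{v}_i = v_i(t - \tau)$ and ignoring higher order terms, we obtain
$$\dot{u}_1 = -\frac{\varepsilon}{2}\left( -\tilde{u}_2\,\alpha\cos\tau + 3u_1\alpha\cos\tau + 2u_1 + 2v_1(\sin\tau)\,\alpha\sqrt{1 + \alpha\cos\tau} \right), \qquad (22)$$
$$\dot{u}_2 = \frac{\varepsilon}{2}\left( \tilde{u}_1\,\alpha\cos\tau - 3u_2\alpha\cos\tau - 2u_2 + 2v_2(\sin\tau)\,\alpha\sqrt{1 + \alpha\cos\tau} \right), \qquad (23)$$
$$\dot{v}_1 = -\frac{\varepsilon\alpha\left( (-u_1 - \hat{u}_1 + 2\tilde{u}_2)\sin\tau + 2(v_1 + \tilde{v}_2)(\cos\tau)\sqrt{1 + \alpha\cos\tau} \right)}{4\sqrt{1 + \alpha\cos\tau}}, \qquad (24)$$
$$\dot{v}_2 = -\frac{\varepsilon\alpha\left( (-2\tilde{u}_1 + u_2 + \hat{u}_2)\sin\tau + 2(\tilde{v}_1 + v_2)(\cos\tau)\sqrt{1 + \alpha\cos\tau} \right)}{4\sqrt{1 + \alpha\cos\tau}}. \qquad (25)$$
Since equations (22)-(25) give a system of linear homogeneous equations with constant coefficients, we seek solutions of the form
$$u_i = (u_0)_i\, e^{\lambda t}, \qquad \tilde{u}_i = (u_0)_i\, e^{\lambda(t - \tau)}, \qquad \hat{u}_i = (u_0)_i\, e^{\lambda(t - 2\tau)}, \qquad (26)$$
$$v_i = (v_0)_i\, e^{\lambda t}, \qquad \tilde{v}_i = (v_0)_i\, e^{\lambda(t - \tau)}, \qquad \hat{v}_i = (v_0)_i\, e^{\lambda(t - 2\tau)}. \qquad (27)$$
Substituting into equations (22)-(25) and simplifying gives linear equations on $(u_0)_i$, $(v_0)_i$. The matrix of this system is
$$\begin{pmatrix} M_1 & M_2 \\ M_3 & M_4 \end{pmatrix}, \qquad (28)$$
where
$$M_1 = \begin{pmatrix} \lambda + \tfrac{3}{2}\alpha\cos\tau + 1 & -\tfrac{1}{2}\alpha\cos\tau\, e^{-\lambda\tau} \\ -\tfrac{1}{2}\alpha\cos\tau\, e^{-\lambda\tau} & \lambda + \tfrac{3}{2}\alpha\cos\tau + 1 \end{pmatrix}, \qquad (29)$$
$$M_2 = \begin{pmatrix} \alpha\sqrt{1 + \alpha\cos\tau}\,\sin\tau & 0 \\ 0 & -\alpha\sqrt{1 + \alpha\cos\tau}\,\sin\tau \end{pmatrix}, \qquad (30)$$
$$M_3 = \begin{pmatrix} -\dfrac{\alpha\sin\tau\,(1 + e^{-2\lambda\tau})}{4\sqrt{1 + \alpha\cos\tau}} & \dfrac{\alpha\sin\tau\, e^{-\lambda\tau}}{2\sqrt{1 + \alpha\cos\tau}} \\ -\dfrac{\alpha\sin\tau\, e^{-\lambda\tau}}{2\sqrt{1 + \alpha\cos\tau}} & \dfrac{\alpha\sin\tau\,(1 + e^{-2\lambda\tau})}{4\sqrt{1 + \alpha\cos\tau}} \end{pmatrix}, \qquad M_4 = \begin{pmatrix} \lambda + \tfrac{1}{2}\alpha\cos\tau & \tfrac{1}{2}\alpha\cos\tau\, e^{-\lambda\tau} \\ \tfrac{1}{2}\alpha\cos\tau\, e^{-\lambda\tau} & \lambda + \tfrac{1}{2}\alpha\cos\tau \end{pmatrix}. \qquad (31)$$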
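The characteristic equation of this system is then given by the determinant of this matrix:
$$\frac{AB}{16\exp(4\lambda\tau)} = 0, \qquad (32)$$
where
$$A = 4\lambda e^{\lambda\tau}\alpha\cos\tau + 8\lambda\alpha\cos\tau\,e^{2\lambda\tau} + 2e^{\lambda\tau}\alpha\cos\tau + 2\alpha\cos\tau\,e^{2\lambda\tau} + \alpha^2\cos^2\tau + \alpha^2\sin^2\tau + \alpha^2\sin^2\tau\,e^{2\lambda\tau} + 2e^{\lambda\tau}\alpha^2\sin^2\tau + 4e^{\lambda\tau}\alpha^2\cos^2\tau + 3\alpha^2\cos^2\tau\,e^{2\lambda\tau} + 4\lambda^2 e^{2\lambda\tau} + 4\lambda e^{2\lambda\tau} \qquad (33)$$
and
$$B = -4\lambda e^{\lambda\tau}\alpha\cos\tau + 8\lambda\alpha\cos\tau\,e^{2\lambda\tau} - 2e^{\lambda\tau}\alpha\cos\tau + 2\alpha\cos\tau\,e^{2\lambda\tau} + \alpha^2\cos^2\tau + \alpha^2\sin^2\tau + \alpha^2\sin^2\tau\,e^{2\lambda\tau} - 2e^{\lambda\tau}\alpha^2\sin^2\tau - 4e^{\lambda\tau}\alpha^2\cos^2\tau + 3\alpha^2\cos^2\tau\,e^{2\lambda\tau} + 4\lambda^2 e^{2\lambda\tau} + 4\lambda e^{2\lambda\tau}. \qquad (34)$$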
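The determinant claim can be spot-checked numerically. The sketch below is not the author's code: `system_matrix` encodes one reading of the blocks of matrix (28), assumed to act on $((u_0)_1, (u_0)_2, (v_0)_1, (v_0)_2)$ and to require $1+\alpha\cos\tau > 0$, while `factors` encodes $A$ and $B$ from (33)-(34). All function names are hypothetical.

```python
import cmath
import math


def factors(lam, alpha, tau):
    """A and B of equations (33)-(34), with x = exp(lam * tau)."""
    c = alpha * math.cos(tau)            # alpha * cos(tau)
    s2 = (alpha * math.sin(tau)) ** 2    # alpha^2 * sin^2(tau)
    x = cmath.exp(lam * tau)
    # Terms in x^0 and x^2 are common to A and B; terms linear in x change sign.
    shared = (8 * lam * c * x**2 + 2 * c * x**2 + c**2 + s2 + s2 * x**2
              + 3 * c**2 * x**2 + 4 * lam**2 * x**2 + 4 * lam * x**2)
    linear = x * (4 * lam * c + 2 * c + 2 * s2 + 4 * c**2)
    return shared + linear, shared - linear


def system_matrix(lam, alpha, tau):
    """Assumed reading of the 4x4 matrix (28); valid for 1 + alpha*cos(tau) > 0."""
    c = alpha * math.cos(tau)
    root = math.sqrt(1 + c)
    X = cmath.exp(-lam * tau)            # factor produced by one delay tau
    a = lam + 1.5 * c + 1
    b = -0.5 * c * X
    m = alpha * root * math.sin(tau)
    p = -alpha * math.sin(tau) * (1 + X**2) / (4 * root)
    q = alpha * math.sin(tau) * X / (2 * root)
    d = lam + 0.5 * c
    e = 0.5 * c * X
    return [[a, b, m, 0],
            [b, a, 0, -m],
            [p, q, d, e],
            [-q, -p, e, d]]


def det(mat):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(mat) == 1:
        return mat[0][0]
    total = 0
    for j, entry in enumerate(mat[0]):
        minor = [row[:j] + row[j + 1:] for row in mat[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def determinant_mismatch(lam, alpha, tau):
    """|det(M) - A*B / (16 exp(4*lam*tau))| at one parameter point."""
    A, B = factors(lam, alpha, tau)
    return abs(det(system_matrix(lam, alpha, tau)) - A * B / (16 * cmath.exp(4 * lam * tau)))


# Example: a complex trial eigenvalue at alpha = 0.4, tau = 1.3.
print(determinant_mismatch(0.3 + 0.7j, 0.4, 1.3))
```

With these readings, `factors(0, alpha, tau)` returns $A = 4\alpha[\cos\tau + \alpha(1+\cos^2\tau)]$ and $B = 0$ for any $\alpha$ and $\tau$. If that transcription is right, the curves obtained from $B$ would correspond to the vanishing of its term linear in $\lambda$; this is an inference from the formulas, not a statement made in the source.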
Setting $A = 0$, $\lambda = 0$, and solving gives the bifurcation curves
$$\alpha = 0, \qquad (35)$$
$$\alpha = -\frac{\cos\tau}{1+\cos^2\tau}, \qquad (36)$$
both of which were obtained previously with the Taylor approximations of the delay terms, the second corresponding to the in-phase mode switching its stability [7], [8], [9], [10]. Setting $B = 0$, $\lambda = 0$, and solving gives
$$\alpha = -\frac{1}{\cos\tau}, \qquad (37)$$
$$\alpha = -\frac{2}{\tau\cos\tau}. \qquad (38)$$
The first of these was also obtained previously and corresponds to the birth of the in-phase mode. The second curve, however, is a new curve; see Figures 1 and 2.
Using the delay approximations of [7], [8], [9], [10], it was shown that curves (36) and (41) correspond to subcritical Hopf bifurcations in which the in-phase mode gives birth to an unstable limit cycle as it switches from unstable to stable. In the full system, the in-phase mode is a periodic motion and goes from being unstable to stable as it gives birth to an unstable quasiperiodic motion (corresponding to the unstable limit cycle in the delay-approximated system). These results were inferred because the approximate system was 3-dimensional and not infinite-dimensional. It would be difficult to observe numerically the birth or death of an unstable quasiperiodic motion in the full system. It is reasonable to believe that the same type of bifurcation occurs along the "new" curves (38) and (43), but showing or observing this has been elusive thus far.
The curve of Hopf bifurcations obtained in previous work has proved elusive when the delay terms are kept. Expressions that the Hopf bifurcation must satisfy
[Figure 1: stability curves in the plane of $\tau$ (delay) and $\alpha$ (coupling).]
Figure 1. Solid curves are stability curves predicted by equations (35)-(37) and (40)-(42), cf. [7], [8], [9], [10]. The dotted-dashed curve is the stability curve predicted by equations (38), (43).
[Figure 2: four panels; horizontal axes $\tau$ (delay) spanning 8-8.8, 10-10.8, 14.4-15.2 and 16.4-17.2; vertical axes spanning 0.3-0.65.]
Figure 2. Close-up of stability curves from Figure 1. The dotted-dashed curves are again the new predictions, equations (38) and (43).
are found in closed form as
$$F_1(\omega,\alpha,\tau) = 0, \qquad F_2(\omega,\alpha,\tau) = 0, \qquad (39)$$
but an expression independent of the frequency has been difficult to find. In particular, writing equation (39) as $\alpha = \alpha(\omega)$, $\tau = \tau(\omega)$ for the purpose of plotting the Hopf bifurcation curve has also been elusive.
The out-of-phase mode is also an equilibrium of equations (11)-(14). There is no symmetry that allows us to obtain the bifurcation curves of the out-of-phase mode given the in-phase mode. The same curves found in previous work, however, are also found here by a process similar to that shown above. In particular, we find that
$$\alpha = 0, \qquad (40)$$
$$\alpha = \frac{\cos\tau}{1+\cos^2\tau}, \qquad (41)$$
$$\alpha = \frac{1}{\cos\tau}, \qquad (42)$$
$$\alpha = \frac{2}{\tau\cos\tau}, \qquad (43)$$
are the bifurcation curves for the out-of-phase mode. Only equation (43) is a "new" curve, but the stability of the out-of-phase mode is also much better predicted when this curve is used.
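The difference between the previously known curves and the "new" ones can be seen by evaluating them one delay period apart. The sketch below assumes the readings $\alpha = \mp\cos\tau/(1+\cos^2\tau)$, $\alpha = \mp 1/\cos\tau$ and $\alpha = \mp 2/(\tau\cos\tau)$ for the in-phase (upper sign) and out-of-phase (lower sign) curves. The factor 2 in the last pair is inferred from the expansion of $B$ in (34) about $\lambda = 0$, and the function names are hypothetical.

```python
import math


def in_phase_curves(tau):
    """alpha(tau) on curves (35)-(38); (37) and (38) diverge where cos(tau) = 0."""
    c = math.cos(tau)
    return {35: 0.0, 36: -c / (1 + c * c), 37: -1 / c, 38: -2 / (tau * c)}


def out_of_phase_curves(tau):
    """alpha(tau) on curves (40)-(43), under the same assumed readings."""
    c = math.cos(tau)
    return {40: 0.0, 41: c / (1 + c * c), 42: 1 / c, 43: 2 / (tau * c)}


def change_over_one_period(curves, label, tau):
    """|alpha(tau + 2*pi) - alpha(tau)| for one labelled curve."""
    return abs(curves(tau + 2 * math.pi)[label] - curves(tau)[label])


# Curves that depend on tau only through cos(tau) repeat every 2*pi;
# the explicit 1/tau factor in (38) and (43) breaks that periodicity.
for label in (41, 42, 43):
    print(label, change_over_one_period(out_of_phase_curves, label, 8.4))
```

Curves (36), (37), (41) and (42) are unchanged under $\tau \to \tau + 2\pi$, whereas (38) and (43) decay like $1/\tau$, which is consistent with the non-periodic stability boundaries described for large delays.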
IV. DISCUSSION
Approximating the original van der Pol oscillators by averaging yields a simpler system to manipulate, equations (6)-(9). The further step of approximating the time delay of the averaged equations yields accurate results for small values of $\epsilon\tau$; however, the bifurcation curves are then predicted to be periodic in the delay variable, $\tau$. In examining the stability of the in-phase mode in the previous study, it was found that for larger $\tau$ values, the analytical prediction of stability, via the Taylor-expanded delay equations, did not agree with the numerical integration. Plotting equation (38) as the stability transition curve gives much better results for large $\tau$ values. Non-periodic bifurcation curves have also come up in other systems [4], [5], [6], [11]. This seems to indicate that the averaging does not significantly affect the results.
By performing a stability analysis of the averaged equations now written as (11)-(14), we obtain more accurate bifurcation curves for the stability regions of the in-phase and out-of-phase modes. The stability curves predicted by the averaged equations agree well with the stability given by numerical integration of (1)-(2), and the averaged system thus gives a good approximation of the original system, without missing essential results. See the numerical runs and the analytical predictions in Figure 3. (Note: In regions where there are both solid and dotted lines, the dotted lines are the analytical predictions from previous work and the solid lines are the new predictions for stability of the in-phase mode.)
In the original work, Taylor expanding the delay terms allows one to obtain all but one of the bifurcation curves. Using this technique might be a good first approximation of the bifurcation curves for systems that have delay terms and have been averaged. An examination of the averaged equations (before Taylor expansion) appears far more reliable.
ACKNOWLEDGMENT
The author wishes to thank Richard Rand for helpful suggestions in this research and Randall Swift for his efforts in putting together these proceedings in honor of Professor Rao.
[Figure 3: plot with vertical axis $\alpha$ (coupling).]
Figure 3. The curves have the same meaning as in Figure 1. The dots represent stability of the in-phase mode via numerical integration of the original equations with $\epsilon = 0.1$.
REFERENCES
1. E. Hairer, S. P. Nørsett, and G. Wanner, Solving Ordinary Differential Equations I: Nonstiff Problems, Springer-Verlag, Berlin, 1987.
2. J. J. Lynch and R. A. York, 'Stability of mode locked states of coupled oscillator arrays,' IEEE Transactions on Circuits and Systems 42, 1995, 413-417.
3. J. J. Lynch, Analysis and design of systems of coupled microwave oscillators, Ph.D. thesis, Department of Electrical and Computer Engineering, University of California at Santa Barbara, 1995.
4. F. C. Moon and M. A. Johnson, 'Nonlinear dynamics and chaos in manufacturing processes,' in Dynamics and Chaos in Manufacturing Processes, F. C. Moon (ed.), Wiley, 1998, pp. 3-32.
5. D. V. R. Reddy, A. Sen, and G. L. Johnston, 'Time delay induced death in coupled limit cycle oscillators,' Physical Review Letters 80, 1998, 5109-5112.
6. S. H. Strogatz, 'Death by delay,' Nature 394, 1998, 317-318.
7. S. Wirkus and R. Rand, 'Dynamics of two coupled van der Pol oscillators with delay coupling,' in Proceedings of DETC'97, 1997 ASME Design Engineering Technical Conferences, Sacramento, CA, Sept. 14-17, 1997, paper no. DETC97/VIB-4019.
8. S. Wirkus and R. Rand, 'Bifurcations in the dynamics of two coupled van der Pol oscillators with delay coupling,' in Proceedings of DETC'99, 1999 ASME Design Engineering Technical Conferences, Las Vegas, NV, Sept. 12-15, 1999, paper no. DETC99/VIB-8318.
9. S. Wirkus, The dynamics of two coupled van der Pol oscillators with delay coupling, Ph.D. thesis, Center for Applied Mathematics, Cornell University, 1999.
10. S. Wirkus and R. Rand, 'The dynamics of two coupled van der Pol oscillators with delay coupling,' Nonlinear Dynamics 30 (3), 205-221, November 2002.
11. M. K. S. Yeung and S. Strogatz, 'Time delay in the Kuramoto model of coupled oscillators,' Physical Review Letters 82, 1999, 648-651.
12. R. A. York and R. C. Compton, 'Experimental observation and simulation of mode-locking phenomena in coupled-oscillator arrays,' Journal of Applied Physics 71, 1992, 2959-2965.
13. R. A. York, 'Nonlinear analysis of phase relationships in quasi-optical oscillator arrays,' IEEE Transactions on Microwave Theory and Techniques 41, 1993, 1799-1809.