NEW JERSEY • LONDON • SINGAPORE • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: Suite 202, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
FROM MARKOV CHAINS TO NON-EQUILIBRIUM PARTICLE SYSTEMS (2nd Edition)
Copyright © 2004 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-238-811-7
Printed in Singapore.
Contents

Preface to the First Edition
Preface to the Second Edition

Chapter 0. An Overview of the Book: Starting from Markov Chains
0.1. Three Classical Problems for Markov Chains
0.2. Probability Metrics and Coupling Methods
0.3. Reversible Markov Chains
0.4. Large Deviations and Spectral Gap
0.5. Equilibrium Particle Systems
0.6. Non-equilibrium Particle Systems

Part I. General Jump Processes

Chapter 1. Transition Function and its Laplace Transform
1.1. Basic Properties of Transition Function
1.2. The q-Pair
1.3. Differentiability
1.4. Laplace Transforms
1.5. Appendix
1.6. Notes

Chapter 2. Existence and Simple Constructions of Jump Processes
2.1. Minimal Nonnegative Solutions
2.2. Kolmogorov Equations and Minimal Jump Process
2.3. Some Sufficient Conditions for Uniqueness
2.4. Kolmogorov Equations and q-Condition
2.5. Entrance Space and Exit Space
2.6. Construction of q-Processes with Single-Exit q-Pair
2.7. Notes

Chapter 3. Uniqueness Criteria
3.1. Uniqueness Criteria Based on Kolmogorov Equations
3.2. Uniqueness Criterion and Applications
3.3. Some Lemmas
3.4. Proof of Uniqueness Criterion
3.5. Notes

Chapter 4. Recurrence, Ergodicity and Invariant Measures
4.1. Weak Convergence
4.2. General Results
4.3. Markov Chains: Time-discrete Case
4.4. Markov Chains: Time-continuous Case
4.5. Single Birth Processes
4.6. Invariant Measures
4.7. Notes

Chapter 5. Probability Metrics and Coupling Methods
5.1. Minimum Lp-Metric
5.2. Marginality and Regularity
5.3. Successful Coupling and Ergodicity
5.4. Optimal Markovian Couplings
5.5. Monotonicity
5.6. Examples
5.7. Notes

Part II. Symmetrizable Jump Processes

Chapter 6. Symmetrizable Jump Processes and Dirichlet Forms
6.1. Reversible Markov Processes
6.2. Existence
6.3. Equivalence of Backward and Forward Kolmogorov Equations
6.4. General Representation of Jump Processes
6.5. Existence of Honest Reversible Jump Processes
6.6. Uniqueness Criteria
6.7. Basic Dirichlet Form
6.8. Regularity, Extension and Uniqueness
6.9. Notes

Chapter 7. Field Theory
7.1. Field Theory
7.2. Lattice Field
7.3. Electric Field
7.4. Transience of Symmetrizable Markov Chains
7.5. Random Walk on Lattice Fractals
7.6. A Comparison Theorem
7.7. Notes

Chapter 8. Large Deviations
8.1. Introduction to Large Deviations
8.2. Rate Function
8.3. Upper Estimates
8.4. Notes

Chapter 9. Spectral Gap
9.1. General Case: an Equivalence
9.2. Coupling and Distance Method
9.3. Birth-Death Processes
9.4. Splitting Procedure and Existence Criterion
9.5. Cheeger's Approach and Isoperimetric Constants
9.6. Notes

Part III. Equilibrium Particle Systems

Chapter 10. Random Fields
10.1. Introduction
10.2. Existence
10.3. Uniqueness
10.4. Phase Transition: Peierls Method
10.5. Ising Model on Lattice Fractals
10.6. Reflection Positivity and Phase Transitions
10.7. Proof of the Chess-Board Estimates
10.8. Notes

Chapter 11. Reversible Spin Processes and Exclusion Processes
11.1. Potentiality for Some Speed Functions
11.2. Constructions of Gibbs States
11.3. Criteria for Reversibility
11.4. Notes

Chapter 12. Yang-Mills Lattice Field
12.1. Background
12.2. Spin Processes from Yang-Mills Lattice Fields
12.3. Diffusion Processes from Yang-Mills Lattice Fields
12.4. Notes

Part IV. Non-equilibrium Particle Systems

Chapter 13. Constructions of the Processes
13.1. Existence Theorems for the Processes
13.2. Existence Theorem for Reaction-Diffusion Processes
13.3. Uniqueness Theorems for the Processes
13.4. Examples
13.5. Appendix
13.6. Notes

Chapter 14. Existence of Stationary Distributions and Ergodicity
14.1. General Results
14.2. Ergodicity for Polynomial Model
14.3. Reversible Reaction-Diffusion Processes
14.4. Notes

Chapter 15. Phase Transitions
15.1. Duality
15.2. Linear Growth Model
15.3. Reaction-Diffusion Processes with Absorbing State
15.4. Mean Field Method
15.5. Notes

Chapter 16. Hydrodynamic Limits
16.1. Introduction: Main Results
16.2. Preliminaries
16.3. Proof of Theorem 16.1
16.4. Proof of Theorem 16.3
16.5. Notes

Bibliography
Author Index
Subject Index
Preface to the First Edition

The main purpose of the book is to introduce some progress on probability theory and its applications to physics, made by Chinese probabilists, especially by a group at Beijing Normal University in the past 15 years. Up to now, most of the work is only available for the Chinese-speaking people. In order to make the book as self-contained as possible and suitable for a wider range of readers, a fundamental part of the subject, contributed by many mathematicians from different countries, is also included.

The book starts with some new contributions to the classical subject of Markov chains, then goes to general jump processes and symmetrizable jump processes, equilibrium particle systems and non-equilibrium particle systems. Accordingly, the book is divided into four parts. An elementary overview of the book is presented in Chapter 0. Some notes on the bibliographies and open problems are collected in the last section of each chapter. It is hoped that the book could be useful for both experts and newcomers, not only for mathematicians but also for researchers in related areas such as mathematical physics, chemistry and biology.

The present book is based on the book “Jump Processes and Particle Systems” by the author, published five years ago by the Press of Beijing Normal University. About 1/3 of the material is newly added. Even the materials in the Chinese edition are either reorganized or simplified, and some of them are removed. A part of the Chinese book was used several times for graduate students; the material in Chapter 0 was even used twice for undergraduate students in a course on Stochastic Processes. Moreover, the galley proof of the present book has been used for graduate students in their second and third semesters.

The author would like to express his warmest gratitude to Professor Z. T. Hou, Professor D. W. Stroock and Professor S. J. Yan for their teachings and advice. Their influences are contained almost everywhere in the book. In the past 15 years, the author has benefited from a large number of colleagues, friends and students, too many to list individually here. However, most of their names appear in the “Notes” sections, as well as in the Bibliography and in the Index of the book. Their contributions and cooperation are greatly appreciated. The author is indebted to Professors X. F. Liu, Y. J. Li, B. M. Wang, X. L. Wang, J. Wu, S. Y. Zhang and Y. H. Zhang for reading the galley proof, correcting errors and improving the quality of the presentation.

It is a nice chance to acknowledge the financial support during the past years by the Fok Ying-Tung Educational Foundation, the Foundation of Institution of Higher Education for Doctoral Program, the Foundation of State Education Commission for Outstanding Young Teachers and the National Natural Science Foundation of China. Thanks are also expressed to World Scientific for their efforts on publishing the book.

M. F. Chen
Beijing
November 18, 1991
Preface to the Second Edition

The main change of this second edition is Chapter 5 on “Probability Metrics and Coupling Methods” and Chapter 9 on “Spectral Gap” (or equivalently, “the first non-trivial eigenvalue”). Actually, these two chapters have been rewritten, within the original text. In the former chapter, the topic of “optimal Markovian couplings” is added and the “stochastic comparability” for jump processes is completed. In the latter chapter, two general results on estimating the spectral gap by couplings and two dual variational formulas for the spectral gap of birth-death processes are added. Moreover, a generalized Cheeger's approach is renewed for unbounded jump processes. Next, Section 4.5 on “Single Birth Processes” and Section 14.2 on “Ergodicity of Reaction-diffusion Processes” are updated. But the original technical Section 14.3 is removed. Besides, a large number of recent publications are included. Numerous modifications, improvements or corrections are made in almost every page. It is hoped that the serious effort could improve the quality of the book and bring the reader to enjoy some of the recent developments.

Roughly speaking, this book deals with two subjects: Markov Jump Processes (Parts I and II) and Interacting Particle Systems (Parts III and IV). If one is interested only in the second subject, it is not necessary to read all of the first nine chapters; instead, one may have a look at Chapters 4, 5, 7, 9 plus §2.3 or so. A quick way to read the book is glancing at the elementary Chapter 0, to get some impression about what is studied in the book, to have some taste of the results, and to choose what to read further. Sometimes, I feel crazy to write such a thick book; this is due to the wider range of topics. Even though it can be shortened easily by removing some details, the resulting book would be much less readable. Anyhow, I believe that the reader can make the book thin and thin.

A concrete model throughout the whole book is Schlögl's (second) model, which is introduced at the beginning (Example 0.3) to show the power of our first main result and discussed right after the last theorem (Theorem 16.3) of the book about its unsolved problems. This model, completely different from the Ising model, is typical of non-equilibrium statistical physics. Its generalization is the polynomial model or, more generally, the class of reaction-diffusion processes. Locally, these models are Markov chains. But even in this case, the uniqueness problem of the process was open for several years, though everyone working in this field believes so. From the physical point of view, the Markov chains should be ergodic and this is finally proved in Chapter 4. Thus, to study the phase transitions, we have to go to the infinite dimensional setting. The first hard stone is the construction of the corresponding Markov processes, for which the mathematical tool is prepared in Chapter 5 and the construction is done in Chapter 13. The model is essentially irreversible; it can be reversible (equilibrium) only in a special case. The proof of a criterion for the reversibility is prepared in Chapter 7 and completed in Chapter 14. The topics studied in almost every chapter are either led by or related to Schlögl's model, even though sometimes it is not explicitly mentioned. Actually, the last four chapters are all devoted to the reaction-diffusion processes.

The Schlögl model possesses the main characters of the current mathematics: infinite dimensional, non-linear, complex systems and so on. It provides us a chance to re-examine the well developed finite dimensional mathematics, to create new mathematical tools or new research topics. It is not surprising that many ideas and results from different branches of mathematics, as well as physics, are used in the book. However, it is surprising that the methods developed in this book turn out to have a deep application to Riemannian geometry and spectral theory. This is clearly a different story. Since so much progress has been made in the past ten years or more, and a large part of the new material is out of the scope of this book, the author has decided to write a separate book under the title “Eigenvalues, Inequalities and Ergodic Theory”.

It is a pleasure to recall the fruitful cooperation with my previous students and colleagues: Y. H. Mao, F. Y. Wang, Y. Z. Wang, S. Y. Zhang, Y. H. Zhang et al. Their contributions heighten remarkably the quality of the book. The author acknowledges the financial support during the past years by the Research Fund for Doctoral Program of Higher Education, the National Natural Science Foundation of China, the Qiu Shi Science and Technology Foundation and the 973 Project. Thanks are also expressed to World Scientific for their efforts on publishing this new edition of the book.

M. F. Chen
Beijing
August 29, 2003
Chapter 0
An Overview of the Book: Starting from Markov Chains

In this chapter, we introduce some background of the topics, as well as some results and ideas, studied in this book. We emphasize Markov chains, and discuss our problems by using language as elementary and concrete as possible. Besides, in order to save space, we omit most of the references, which will be pointed out in the related “Notes” sections.

0.1 Three Classical Problems for Markov Chains

For a given transition rate (i.e., a $Q$-matrix $Q=(q_{ij})$ on a countable state space), the uniqueness of the $Q$-semigroup $P(t)=(P_{ij}(t))$, the recurrence and the positive recurrence of the corresponding Markov chain are three fundamental and classical problems, treated in many textbooks. As an addition, this section introduces some practical results motivated by the study of a type of interacting particle systems, reaction-diffusion processes.
Definition 0.1. Let $E$ be a countable set. Suppose that $(P_{ij}(t))$ is a sub-Markov transition probability matrix having the following properties.

(1) Normal condition.
$$P_{ij}(t)\ge 0,\qquad \sum_{j}P_{ij}(t)\le 1,\qquad i,j\in E,\ t\ge 0.$$

(2) Chapman-Kolmogorov equation.
$$P_{ij}(t+s)=\sum_{k}P_{ik}(t)P_{kj}(s),\qquad i,j\in E,\ t,s\ge 0.$$

(3) Jump condition. $\lim_{t\to 0}P_{ij}(t)=\delta_{ij}$ for all $i,j\in E$.

It is well known that for such a $(P_{ij}(t))$, we have a $Q$-matrix $Q=(q_{ij})$ deduced by

(4) $Q$-condition.
$$\lim_{t\to 0}\frac{P_{ij}(t)-\delta_{ij}}{t}=q_{ij}\qquad\text{for all } i,j\in E.$$
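As a minimal sketch of Definition 0.1, the four conditions can be checked numerically on the two-state chain, for which $P(t)$ has a well-known closed form. The rates `a`, `b`, the time points and the helper names `two_state_P` and `matmul` are assumptions chosen only for this illustration.

```python
import math

def two_state_P(t, a, b):
    """Closed-form P(t) for the two-state chain with rates q_01 = a, q_10 = b."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 2.0, 3.0                      # illustrative rates (assumption)
Q = [[-a, a], [b, -b]]               # the corresponding Q-matrix
t, s = 0.4, 1.1

# (1) Normal condition: nonnegative entries, row sums at most 1.
P = two_state_P(t, a, b)
rows_ok = all(min(row) >= 0 and sum(row) <= 1 + 1e-12 for row in P)

# (2) Chapman-Kolmogorov: P(t + s) = P(t) P(s).
lhs = two_state_P(t + s, a, b)
rhs = matmul(two_state_P(t, a, b), two_state_P(s, a, b))
ck_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# (3) and (4): for small h, P(h) is close to the identity and
# (P(h) - I) / h is close to Q.
h = 1e-6
Ph = two_state_P(h, a, b)
q_error = max(abs((Ph[i][j] - (1.0 if i == j else 0.0)) / h - Q[i][j])
              for i in range(2) for j in range(2))

print(rows_ok, ck_error, q_error)
```

Here the chain is honest (row sums equal 1), so the sub-Markov inequality in (1) holds with equality.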
Because of the $Q$-condition, we often call $P(t)=(P_{ij}(t))$ a $Q$-process. Unless otherwise stated, throughout this chapter, we suppose that the $Q$-matrix $Q=(q_{ij})$ is totally stable and conservative. That is,
$$\sum_{j\ne i}q_{ij}=q_i:=-q_{ii}<\infty,\qquad i\in E. \tag{0.1}$$
The first problem of our study is when there is only one $Q$-process $P(t)=(P_{ij}(t))$ for a given $Q$-matrix $Q=(q_{ij})$ (then, the matrix $Q$ is often called regular). This problem was solved by Feller (1957) and Reuter (1957).
Theorem 0.2 (Uniqueness criterion). For a given $Q$-matrix $Q=(q_{ij})$, the $Q$-process $(P_{ij}(t))$ is unique if and only if (abbrev. iff) the equation
$$(\lambda+q_i)u_i=\sum_{j\ne i}q_{ij}u_j,\qquad 0\le u_i\le 1,\quad i\in E \tag{0.2}$$
has only the trivial solution $u_i\equiv 0$ for some (equivalently, for all) $\lambda>0$.
Certainly, this criterion has many applications. For instance, it gives us a complete answer for the birth-death processes (cf. Corollary 0.8 below). However, it seems hard to apply the above criterion directly to the following examples.

Example 0.3 (Schlögl's model). Let $S$ be a finite set and $E=\mathbb{Z}_+^S$, where $\mathbb{Z}_+=\{0,1,\dots\}$. The model is defined by the following $Q$-matrix $Q=(q(x,y): x,y\in E)$:
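$$q(x,y)=\begin{cases}
\lambda_1\binom{x(u)}{2}+\lambda_4 & \text{if } y=x+e_u,\\[2pt]
\lambda_2\binom{x(u)}{3}+\lambda_3\,x(u) & \text{if } y=x-e_u,\\[2pt]
x(u)\,p(u,v) & \text{if } y=x-e_u+e_v,\\[2pt]
0 & \text{for other } y\ne x,
\end{cases}
\qquad q(x)=-q(x,x)=\sum_{y\ne x}q(x,y),$$
where $x=(x(u): u\in S)$, $\binom{n}{k}$ is the usual combination, $\lambda_1,\dots,\lambda_4$ are positive constants, $(p(u,v): u,v\in S)$ is a transition probability matrix on $S$ and $e_u$ is the element in $E$ having value 1 at $u$ and 0 elsewhere.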
The Schlögl model is a model of chemical reaction with diffusion in a container. Suppose that the container consists of small vessels. In each vessel $u\in S$, there is a reaction described by a birth-death process. The birth and death rates are given, respectively, by the first two lines in the definition of $(q(x,y))$ above. Moreover, suppose that between any two vessels $u$ and $v$, there is a diffusion, with rate given by the third line of the definition. This model was introduced by F. Schlögl (1972) as a typical model of non-equilibrium systems. See Haken (1983) for related references.
Example 0.4 (Dual chain of spin system). Let $S$ be a countable set, and $X$ be the set of all finite subsets of $S$. For $A\in X$, let $|A|$ denote the number of elements in $A$. For various concrete models, their $Q$-matrices $(q(A,B): A,B\in X)$ usually satisfy the following condition:
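$$\sum_{B\in X}q(A,B)\,\bigl(|B|-|A|\bigr)\le C+c\,|A|,\qquad A\in X, \tag{0.3}$$
for some constants $C, c\in\mathbb{R}:=(-\infty,\infty)$. A particular case is that
$$q(A,B)=\sum_{u\in A}c(u)\sum_{F:\,F\Delta(A\setminus u)=B}p(u,F),\qquad A\ne B,$$
where
$$c(u)\ge 0,\qquad \sup_u c(u)<\infty\qquad\text{and}\qquad \sup_u c(u)\sum_{A}p(u,A)\,|A|<\infty.$$
Then (0.3) holds with $C=0$ and
$$c=\sup_u c(u)\sum_{F}p(u,F)\,\bigl[|F|-1\bigr].$$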
Intuitively, we can interpret the last Markov chain as follows. Let $A$ be the set of sites occupied by particles (finite!). At each site there is at most one particle. Then the process evolves in the following way: each $u\in A$ is removed from $A$ at rate $c(u)$ and is replaced by a set $F$ with probability $p(u,F)$; when an attempt is made to put a point at a site $u$ which is already occupied, the two points annihilate one another. The dual chain of a spin system is often used as a dual process of an infinite particle system. This dual approach is one of the main powerful tools in the study of infinite particle systems (cf. Liggett (1985), Chapter 3, Section 4).

Now, we state our first main result.
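Theorem 0.5. Let $Q=(q_{ij})$ be a $Q$-matrix on $E$. Suppose that there exist a sequence $\{E_n\}_1^\infty$ and a non-negative function $\varphi$ such that
$$E_n\uparrow E,\qquad \sup_{i\in E_n}q_i<\infty,\qquad \lim_{n\to\infty}\inf_{i\notin E_n}\varphi_i=\infty.$$
If in addition
$$\sum_{j}q_{ij}(\varphi_j-\varphi_i)\le c\,\varphi_i,\qquad i\in E \tag{0.4}$$
holds for some $c\in\mathbb{R}$, then the $Q$-process is unique.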
To compare this theorem with Criterion 0.2, we reformulate Criterion 0.2 as follows.
Theorem 0.6 (Alternative uniqueness criterion). Given a $Q$-matrix $Q=(q_{ij})$, for the uniqueness of the $Q$-process, it is sufficient that the inequality
$$(\lambda+q_i)\varphi_i\le\sum_{j\ne i}q_{ij}\varphi_j,\qquad i\in E$$
has no bounded solution $(\varphi_i: i\in E)$ with $\sup_i\varphi_i>0$ for some (equivalently, for all) $\lambda>0$. Conversely, these conditions plus $\varphi\ge 0$ are also necessary.
Take $E_n=\{i\in E: q_i\le n\}$. By Theorem 0.5, we have the following result.

Corollary 0.7. If there exist a function $\varphi$ with $\varphi_i\ge q_i$, $i\in E$, and a constant $c\in\mathbb{R}$ such that (0.4) holds, then the $Q$-process is unique.
To see that these results are practical, for Schlögl's model (Example 0.3), we can either take $\varphi(x)=c\bigl[1+\bigl(\sum_{u\in S}x(u)\bigr)^3\bigr]$ and apply Corollary 0.7, or take $\varphi(x)=c\bigl[1+\sum_{u}x(u)\bigr]$ and apply Theorem 0.5 with $E_n=\{x:\sum_u x(u)\le n\}$, where $c$ is a constant chosen by a simple computation. For Example 0.4, simply take $\varphi(A)=c[1+|A|]$ for a suitable $c$ and apply Theorem 0.5 with $E_n=\{A: |A|\le n\}$. For instance, for Schlögl's model, when $\sum_u x(u)$ is large, (0.4) should hold because the order of the death rate is higher than that of the birth rate. On the other hand, for bounded $\sum_u x(u)$, we can choose $c$ large enough so that (0.4) also holds.

Next, we consider a typical case. Let $E=\{0,1,2,\dots\}=\mathbb{Z}_+$. Suppose that the solution $(u_i)$ to the equation
$$(\lambda+q_i)u_i=\sum_{j\ne i}q_{ij}u_j,\qquad u_0=1,\quad i\in E \tag{0.5}$$
is non-decreasing: $u_i\uparrow$ as $i\uparrow$. Then, from Criterion 0.2, it is easy to see that the process is unique iff $\lim_{i\to\infty}u_i=\infty$. On the other hand, if we take $E_n=\{i\in\mathbb{Z}_+: i\le n\}$, $c=\lambda$ and $\varphi_i=u_i$, $i\in E$, then the hypotheses of Theorem 0.5 are reduced to the condition $\lim_{i\to\infty}\varphi_i=\lim_{i\to\infty}u_i=\infty$, which is the same as above. Thus, the conditions of Theorem 0.5 are not only sufficient but also necessary for this particular case. This remark plus the next result gives us another way of justifying the power of Theorem 0.5.

Corollary 0.8. For the single birth $Q$-matrix on $E=\mathbb{Z}_+$:
$$q_{i,i+1}>0,\qquad q_{i,i+j}=0,\qquad i\in\mathbb{Z}_+,\ j\ge 2$$
(but there is no restriction on the death rates), the $Q$-process is unique iff $\sum_{k=0}^{\infty}m_k=\infty$, where
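A minimal sketch of the typical case discussed above, restricted to a pure birth chain, which is the simplest single birth $Q$-matrix: only $q_{i,i+1}=q_i>0$ is non-zero, so the equation $(\lambda+q_i)u_i=\sum_{j\ne i}q_{ij}u_j$ with $u_0=1$ reduces to the recursion $u_{i+1}=(1+\lambda/q_i)\,u_i$. The solution is non-decreasing, and it stays bounded exactly when $\sum_i 1/q_i<\infty$. The two rate functions and the name `birth_solution` are assumptions made for this illustration.

```python
def birth_solution(rate, lam, n):
    """Return u_0, ..., u_n with u_0 = 1 and u_{i+1} = (1 + lam / q_i) u_i,
    i.e. the solution of (lam + q_i) u_i = q_i u_{i+1} for a pure birth chain."""
    u = [1.0]
    for i in range(n):
        u.append(u[-1] * (1.0 + lam / rate(i)))
    return u

lam = 1.0
n = 100_000

# Quadratic rates q_i = (i + 1)^2: sum of 1/q_i converges, so u stays bounded
# and the Q-process is not unique.
u_quadratic = birth_solution(lambda i: (i + 1.0) ** 2, lam, n)

# Linear rates q_i = i + 1: sum of 1/q_i diverges, so u_i grows without bound
# (here u_i = i + 1) and the Q-process is unique.
u_linear = birth_solution(lambda i: i + 1.0, lam, n)

print(u_quadratic[-1])   # remains bounded as n grows
print(u_linear[-1])      # grows linearly in n
```

A finite truncation can only suggest boundedness or divergence; the decision itself rests on the convergence of $\sum_i 1/q_i$.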
The key to proving this corollary is the non-decreasing property, mentioned above, of the solution to (0.5) (cf. Theorem 3.16).

Now, we go to the next topic: recurrence. It is well known that for a regular $Q$, the corresponding Markov chain is recurrent iff so is its embedding chain. See Chung (1967). Here, we would like to mention a more precise formula. Note that for a given $Q$-matrix $Q=(q_{ij})$, we always have the minimal $Q$-process $(P_{ij}^{\min}(t))$, which can be obtained by the following procedure. Let $P_{ij}^{(0)}(t)=0$ and

then for fixed $i,j\in E$ and $t\ge 0$, $P_{ij}^{(n)}(t)\uparrow P_{ij}^{\min}(t)$ as $n\to\infty$ (Theorem 2.21).
Theorem 0.9. We have

where $\Pi_{ij}^{(0)}=\delta_{ij}$ and $(\Pi_{ij}^{(n)})$ is the $n$-th power of the matrix

and we use the usual convention: $c/\infty=0$ for $c\ne\infty$; $c/0=\infty$ for $c>0$; $c+\infty=\infty$; $c\times\infty=\infty$ for $c>0$; $0\times\infty=0$ and $0/0=0$.
To state a more practical criterion for the recurrence, we need an important concept. A function $h: E\to\mathbb{R}_+=[0,\infty)$ is called compact if for each $d\in\mathbb{R}_+$, the set $\{i\in E: h_i\le d\}$ is finite.

Theorem 0.10. An irreducible $Q$-matrix $Q=(q_{ij})$ is regular with recurrent $P(t)$ iff the equation
$$\sum_{j\ne i}q_{ij}y_j\le q_iy_i,\qquad i\notin H$$
has a compact solution $(y_i)$ for some finite $H\ne\emptyset$.
The last topic is about positive recurrence.

Theorem 0.11. Given an irreducible $Q$-matrix $Q=(q_{ij})$, suppose that there exist a compact function $h$ and constants $K\ge 0$, $\eta>0$ or $K=\eta=0$ such that
$$\sum_{j}q_{ij}(h_j-h_i)\le K-\eta h_i,\qquad i\in E. \tag{0.6}$$
Then the Markov chain is positive recurrent (exponentially ergodic) and hence has a unique stationary distribution.
To apply this theorem to Schlögl's model (Example 0.3), take $h(x)=\sum_{u\in S}x(u)$ and an arbitrary $\eta>0$. Then one can find a $K<\infty$ such that the above inequality holds. Hence, Schlögl's model is always ergodic in finite dimensions. As for Example 0.4, since the empty set $\emptyset$ is an absorbing state, the answer is obvious. Finally, consider the linear growth model:
$$q_{i,i+1}=\lambda i+\delta,\qquad q_{i,i-1}=\mu i,\qquad \lambda,\mu,\delta>0,\qquad q_{ij}=0\ \text{ for other } j\ne i,\quad i,j\in\mathbb{Z}_+.$$
It is well known that this model is positive recurrent if and only if $\lambda<\mu$. Recall that this conclusion is usually obtained by studying three series, respectively, to show the regularity of $Q$, the recurrence and finally the positive recurrence of the chain (cf. Example 4.56 for details). However, it is obvious that Theorem 0.11 is applicable if and only if $\lambda<\mu$, for the natural choice $h_i=i$ ($i\in\mathbb{Z}_+$). Thus, Theorem 0.11 is sharp for this model and its advantage should be clear now.

Roughly speaking, the three problems discussed above constitute the subjects of the subsequent four chapters. Actually, we deal with the general case where the $Q$-matrix may not be conservative and furthermore the state space is allowed to be general too. Certainly, some results for the general state space are natural generalizations of those for the discrete state space. However, it should be pointed out that the generalization is not trivial in many situations, for instance, the differentiability of the transition functions (see Section 1.3). Another case is the following. As we will see in Chapter 4, the ergodic theory for Markov chains is now quite complete but at the moment, our knowledge about the theory for general jump processes is still incomplete. For a general totally stable $Q$-matrix (i.e., $q_i<\infty$ for all $i$), the uniqueness problem had been open for a long period and was eventually solved by Hou (1974) for Markov chains and Chen and Zheng (1982) for the general setup. The general uniqueness criterion is given in Chapter 3.
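A minimal sketch of the drift computation behind the last claim: for the linear growth model with $h_i=i$, the left-hand side of (0.6) equals $(\lambda i+\delta)-\mu i=\delta-(\mu-\lambda)i$, so (0.6) holds with $K=\delta$ and $\eta=\mu-\lambda$ when $\lambda<\mu$, and fails for every $K$ and $\eta>0$ when $\lambda>\mu$. The function names, the numerical rates and the finite range `n_max` are assumptions for this illustration; a finite scan is a sanity check, not a proof.

```python
def drift(i, lam, mu, delta):
    """sum_j q_ij (h_j - h_i) for h_i = i in the linear growth model."""
    up = lam * i + delta      # q_{i,i+1}, contributes h_{i+1} - h_i = +1
    down = mu * i             # q_{i,i-1}, contributes h_{i-1} - h_i = -1
    return up - down

def drift_condition_holds(lam, mu, delta, K, eta, n_max=10_000):
    """Check (0.6), drift(i) <= K - eta * h_i, for all states 0..n_max."""
    return all(drift(i, lam, mu, delta) <= K - eta * i + 1e-9
               for i in range(n_max + 1))

# lam < mu: the natural constants K = delta, eta = mu - lam work.
subcritical = drift_condition_holds(lam=1.0, mu=2.0, delta=0.5,
                                    K=0.5, eta=1.0)

# lam > mu: the drift grows linearly, so no K and eta > 0 can work;
# one arbitrary attempt is shown here.
supercritical = drift_condition_holds(lam=2.0, mu=1.0, delta=0.5,
                                      K=100.0, eta=0.1)

print(subcritical, supercritical)
```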
0.2 Probability Metrics and Coupling Methods

The coupling technique has a long history and now has many applications. It is one of the basic tools used in the book. In this section, we discuss the relation between couplings and probability metrics, and introduce some coupling methods for Markov chains. Some preliminary applications are also introduced.
Definition 0.12. Let $P_k$ be a probability measure on a measurable space $(E_k,\mathscr{E}_k)$, $k=1,2$. A probability measure $\tilde P$ on $(E_1\times E_2,\mathscr{E}_1\times\mathscr{E}_2)$ is called a coupling of $P_1$ and $P_2$ if it has the following marginality:
$$\tilde P(B_1\times E_2)=P_1(B_1),\qquad \tilde P(E_1\times B_2)=P_2(B_2),\qquad B_k\in\mathscr{E}_k,\ k=1,2.$$
Similarly, for two given processes $(X_t^k)_{t\ge 0}$ valued in $(E_k,\mathscr{E}_k)$ with distribution $P_k$, $k=1,2$, a process $(\tilde X_t)_{t\ge 0}$ valued in $(E_1\times E_2,\mathscr{E}_1\times\mathscr{E}_2)$ with distribution $\tilde P$ is called a coupling of $(X_t^1)$ and $(X_t^2)$ if the analogous marginality holds.
From our point of view, the coupling technique is a natural way to obtain upper estimates for probability metrics, and for different metrics, the effective couplings can be different. For this reason, we begin our study by recalling some results on probability metrics, and then come back to the coupling methods.

Let $(E,\rho,\mathscr{E})$ be a separable complete metric space with metric $\rho$ and Borel $\sigma$-algebra $\mathscr{E}$. Given a sequence of probability measures $P_n$ on $(E,\mathscr{E})$, we say that $P_n$ converges weakly to $P$ if
$$\int f\,dP_n\to\int f\,dP$$
for all bounded continuous functions $f$. For this convergence, it is well known that we have the Lévy-Prohorov metric:
$$w(P_1,P_2)=\inf\bigl\{\delta: P_1(A)\le P_2(A^\delta)+\delta \text{ and } P_2(A)\le P_1(A^\delta)+\delta \text{ for all closed sets } A\in\mathscr{E}\bigr\},$$
where $A^\delta=\{x: \rho(x,y)<\delta \text{ for some } y\in A\}$.

Now, we are going to introduce a probability metric $W_p$ ($p\ge 1$) which is still less popular. As we know, in probability theory, we usually consider the following types of convergence for real random variables on a probability space:

convergence in $L^p$ $\Longrightarrow$ convergence in $\mathbb{P}$ $\Longleftarrow$ a.s. convergence
convergence in $\mathbb{P}$ $\Longrightarrow$ vague (weak) convergence

The $L^p$-convergence, a.s. convergence and the convergence in $\mathbb{P}$ all depend on the reference frame, our probability space $(\Omega,\mathscr{F},\mathbb{P})$. But the vague (weak) convergence does not. By a result of Skorohod (cf. Ikeda and Watanabe (1981), p. 9, Theorem 2.7), if $P_n\Rightarrow P$ (also denoted by $P_n\xrightarrow{w}P$ since we already have the metric $w$), then we can choose a suitable reference frame $(\Omega,\mathscr{F},\mathbb{P})$ such that $\xi_n\sim P_n$, $\xi\sim P$ and $\xi_n\to\xi$ a.s., where $\xi\sim P$ means that $\xi$ has distribution $P$. Thus, all the types of convergence above are intrinsically the same, except the $L^p$-convergence. In other words, if we want to find another intrinsic metric on the space of all probability measures, we should consider an analogue of the $L^p$-convergence.

Let $\xi_1,\xi_2: (\Omega,\mathscr{F},\mathbb{P})\to(E,\rho,\mathscr{E})$. The usual $L^p$-metric is defined by
$$\|\xi_1-\xi_2\|_p=\bigl\{\mathbb{E}\bigl[\rho(\xi_1,\xi_2)^p\bigr]\bigr\}^{1/p}.$$
Suppose that $\xi_i\sim P_i$, $i=1,2$, and $(\xi_1,\xi_2)\sim\tilde P$. Then
$$\|\xi_1-\xi_2\|_p=\Bigl\{\int\rho(x_1,x_2)^p\,\tilde P(dx_1,dx_2)\Bigr\}^{1/p}.$$
Certainly, $\tilde P$ is a coupling of $P_1$ and $P_2$. However, if we ignore our reference frame $(\Omega,\mathscr{F},\mathbb{P})$, then there are a lot of choices of $\tilde P$ for given $P_1$ and $P_2$. Thus, the intrinsic metric should be defined as follows:
$$W_p(P_1,P_2)=\inf_{\tilde P}\Bigl\{\int\rho(x_1,x_2)^p\,\tilde P(dx_1,dx_2)\Bigr\}^{1/p},$$
where $\tilde P$ varies over all couplings of $P_1$ and $P_2$.
Definition 0.13. The metric defined above is called the minimum $L^p$-distance or the probability $L^p$-metric or $W_p$-metric. Briefly, we write $W=W_1$.

In the literature, this metric has several different names: Kantorovich metric, Kantorovich-Rubinstein-Wasserstein metric (KRW-metric), Wasserstein metric, Hutchinson metric and so on. Here, we choose the intrinsic name to avoid confusion about the history. In this book, we deal only with the metrics $w$, $W=W_1$, $W_2$ and the total variation:
$$\|P_1-P_2\|_{\mathrm{var}}=2\sup_{A\in\mathscr{E}}|P_1(A)-P_2(A)|.$$
It is interesting to note that if we use the discrete metric
$$d(x,y)=\begin{cases}0 & \text{if } x=y,\\ 1 & \text{if } x\ne y,\end{cases}$$
then the distance of total variation is again a minimum $L^1$-metric with respect to the metric $d$:
Theorem 0.14. V(PI,P2) := inf,-~d(zl,z2)P(dz,,dz2) = il]Pl-P211var.
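On a finite state space the identity of Theorem 0.14 can be checked by hand. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the laws are given as lists of exact fractions, and the total variation is taken as $\sum_i |p_i - q_i|$. It builds one coupling that keeps as much mass as possible on the diagonal and confirms that its cost under the discrete metric equals half the total variation. It does not prove that no coupling does better.

```python
from fractions import Fraction


def total_variation(p, q):
    """Total variation sum_i |p_i - q_i| of two laws on {0, ..., n-1}."""
    return sum(abs(a - b) for a, b in zip(p, q))


def maximal_coupling(p, q):
    """Coupling matrix c[i][j] with marginals p and q that places the
    common mass min(p_i, q_i) on the diagonal and spreads the rest."""
    n = len(p)
    overlap = [min(a, b) for a, b in zip(p, q)]
    c = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        c[i][i] = overlap[i]
    rest_p = [a - m for a, m in zip(p, overlap)]
    rest_q = [b - m for b, m in zip(q, overlap)]
    deficit = sum(rest_p)  # equals sum(rest_q)
    if deficit > 0:
        for i in range(n):
            for j in range(n):
                # rest_p[i] * rest_q[i] is always 0, so the diagonal is untouched
                c[i][j] += rest_p[i] * rest_q[j] / deficit
    return c


def discrete_cost(c):
    """Integral of d(x1, x2) = 1[x1 != x2] with respect to the coupling c."""
    n = len(c)
    return sum(c[i][j] for i in range(n) for j in range(n) if i != j)


p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
q = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
coupling = maximal_coupling(p, q)
print(discrete_cost(coupling), total_variation(p, q) / 2)
```

Exact fractions are used so that the equality can be compared without rounding tolerance.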
As we mentioned before, $W_p$ is usually stronger than $w$. More precisely, we have

Theorem 0.15. $P_n \xrightarrow{W_p} P$ iff the following two conditions

(1) $P_n \xrightarrow{w} P$,
(2) $\int \rho(x, x_0)^p\, P_n(dx) \to \int \rho(x, x_0)^p\, P(dx)$

hold for some (equivalently, for all) $x_0$ such that $\int \rho(x, x_0)^p\, P(dx) < \infty$. In particular, if $\rho$ is bounded, then $w$ and $W_p$ are equivalent.
From the probabilistic point of view, the $W_p$-metric has an intrinsic property which makes it more suitable for certain applications. For example, if $(E, \rho)$ is the Euclidean space, then for $P_2 \sim \xi + x$, obtained from $P_1 \sim \xi$ by a translation $x$, we have $W_p(P_1, P_2) = |x|$. As usual, the precise value of $W_p$ is very difficult to calculate; up to now, only very special cases are known.

Theorem 0.16 (Vallender (1973)). Let $P_k$ be a probability measure on the real line with distribution function $F_k(x)$, $k = 1, 2$. Then

$$W(P_1, P_2) = \int_{-\infty}^{\infty} |F_1(x) - F_2(x)|\, dx.$$

Theorem 0.17 (Dowson and Landau (1982), Givens and Shortt (1984), Olkin and Pukelsheim (1982)). Let $P_k$ be the normal distribution on $(\mathbb{R}^d, \mathscr{B}(\mathbb{R}^d))$ ($d \ge 1$) with mean value $m_k$ and covariance matrix $M_k$, $k = 1, 2$. Then

$$W_2(P_1, P_2) = \Big[\, |m_1 - m_2|^2 + \operatorname{tr} M_1 + \operatorname{tr} M_2 - 2 \operatorname{tr}\big(\sqrt{M_1}\, M_2 \sqrt{M_1}\big)^{1/2} \Big]^{1/2},$$

where $\operatorname{tr} M$ denotes the trace of $M$.
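For laws on the real line with finitely many atoms, the distance $W = W_1$ can be evaluated as the area between the two distribution functions, $\int |F_1(x) - F_2(x)|\,dx$. The following is a minimal sketch under that assumption. The function name and the representation of a law as a list of (location, mass) pairs are illustrative choices, not notation from the text. The usage example checks the translation property $W(P_1, P_2) = |x|$ for a shift by $x = 3$.

```python
def w1_on_line(atoms_p, atoms_q):
    """W_1 distance between two finitely supported laws on the real line,
    computed as the integral of |F1 - F2|.
    Each argument is a list of (location, mass) pairs with total mass 1."""
    points = sorted({x for x, _ in atoms_p} | {x for x, _ in atoms_q})

    def cdf(atoms, x):
        # Right-continuous distribution function evaluated at x.
        return sum(mass for y, mass in atoms if y <= x)

    total = 0.0
    # Both distribution functions are constant on [left, right).
    for left, right in zip(points, points[1:]):
        total += abs(cdf(atoms_p, left) - cdf(atoms_q, left)) * (right - left)
    return total


law = [(0.0, 0.5), (1.0, 0.5)]
shifted = [(x + 3.0, mass) for x, mass in law]
print(w1_on_line(law, shifted))
```

Outside the range of the atoms both distribution functions agree (both are 0 or both are 1), so only the intervals between consecutive atoms contribute.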
For general $P_k$, not necessarily the normal distribution, a characterization of $W_2(P_1, P_2)$ was obtained by Rüschendorf and Rachev (1990). Fortunately, in most cases, what we need is only certain upper estimates. For instance, to prove that $W_p(P_n, P) \to 0$ as $n \to \infty$, we need only find an upper estimate of $W_p(P_n, P)$ which goes to zero as $n \to \infty$. Noting that

$$W_p(P_1, P_2)^p \le \int \rho(x_1, x_2)^p\, \tilde P(dx_1, dx_2),$$

any coupling measure $\tilde P$ will give us an upper estimate. Thus, our main task is to choose a coupling to make the above right-hand side as small as possible.

We now study the coupling methods for Markov chains. Suppose that we are given two $Q$-processes $(P^{(k)}_{i_k j_k}(t))$ with regular $Q$-matrices $(q^{(k)}_{i_k j_k})$ on state
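space $E_k$, $k = 1, 2$, respectively. We want to find some coupling $Q$-process $\tilde P(t; i_1, i_2; j_1, j_2)$ with $Q$-matrix $(\tilde q(i_1, i_2; j_1, j_2))$ on the product state space $E_1 \times E_2$ having the marginality:

$$\sum_{j_2} \tilde P(t; i_1, i_2; j_1, j_2) = P^{(1)}_{i_1 j_1}(t), \qquad \sum_{j_1} \tilde P(t; i_1, i_2; j_1, j_2) = P^{(2)}_{i_2 j_2}(t). \tag{0.8}$$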
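Define

$$\Omega_1 f(i_1) = \sum_{j_1} q^{(1)}_{i_1 j_1} \big(f(j_1) - f(i_1)\big),$$

where $f$ is a bounded function on $E_1$. Similarly, we can define $\Omega_2$ and $\tilde\Omega$ corresponding to $(q^{(2)}_{i_2 j_2})$ and $(\tilde q(i_1, i_2; j_1, j_2))$, respectively. Regarding $f$ on $E_1$ (resp. $f$ on $E_2$) as a bivariable function on $E_1 \times E_2$, it is not difficult to prove that condition (0.8) implies the following marginality for operators:

$$\tilde\Omega f(i_1, i_2) = \Omega_1 f(i_1) \ \text{ for bounded } f \text{ on } E_1, \qquad \tilde\Omega f(i_1, i_2) = \Omega_2 f(i_2) \ \text{ for bounded } f \text{ on } E_2. \tag{0.9}$$

Any $\tilde\Omega$ satisfying (0.9) is called a coupling operator. Before going further, let us introduce some examples of coupling operators. In the following examples, $f$ is a bounded function on $E_1 \times E_2$, $i_1 \in E_1$, $i_2 \in E_2$.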
Example 0.18 (Independent coupling).
This trivial example already shows that a coupling operator always exists.
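For finite state spaces, one way to picture the independent coupling is to let each coordinate move according to its own $Q$-matrix while the other stays fixed. The sketch below assumes conservative marginal $Q$-matrices given as nested lists; the function names and the flattening of a pair $(i_1, i_2)$ to a single index are illustrative choices. It checks the operator marginality numerically for a function of the first coordinate and for a function of the second.

```python
def independent_coupling(q1, q2):
    """Q-matrix on E1 x E2 for two independently moving chains.
    The pair (i1, i2) is stored at index i1 * len(q2) + i2."""
    n1, n2 = len(q1), len(q2)
    qt = [[0] * (n1 * n2) for _ in range(n1 * n2)]
    for i1 in range(n1):
        for i2 in range(n2):
            row = qt[i1 * n2 + i2]
            for j1 in range(n1):  # first coordinate moves, second is fixed
                row[j1 * n2 + i2] += q1[i1][j1]
            for j2 in range(n2):  # second coordinate moves, first is fixed
                row[i1 * n2 + j2] += q2[i2][j2]
    return qt


def apply_generator(q, f):
    """(Omega f)(i) = sum_j q_ij f(j); for a conservative Q-matrix this
    equals sum_j q_ij (f(j) - f(i))."""
    return [sum(qij * fj for qij, fj in zip(row, f)) for row in q]


q1 = [[-2, 2], [1, -1]]
q2 = [[-3, 1, 2], [4, -4, 0], [0, 5, -5]]
qt = independent_coupling(q1, q2)

f = [7, -1]  # a function on E1, lifted to E1 x E2 below
lifted = [f[i1] for i1 in range(2) for _ in range(3)]
print(apply_generator(qt, lifted))
print(apply_generator(q1, f))
```

The lifted function ignores the second coordinate, so the contribution of the second chain cancels because each row of `q2` sums to zero.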
Example 0.19 (Classical coupling). Take $E_1 = E_2 = E$ and let the two marginal $Q$-matrices be the same $(q_{ij})$. Set

where $\Delta = \{(i_1, i_2) \in E^2 : i_1 = i_2\}$, $g(k) = f(k, k)$ and the remaining operators are as defined above.
Example 0.20 (Basic coupling).
where $a \wedge b = \min\{a, b\}$ and $a^+ = \max\{a, 0\}$.

Example 0.21 (Coupling of marching soldiers). Take $E = \{0, 1, 2, \dots\}$ and set

here we have used the convention $q_{ij} = 0$ for all $i \in E$ and $j \notin E$.
Let us now consider a birth-death process with regular Q-matrix:
Then for two copies of the process starting from $i_1$ and $i_2$, respectively, we have

Example 0.22 (Coupling by reflection). For $i_1 \le i_2$, we take
By exchanging i1 and i2, we get the expression of … for the case that i1 > i2 .
Hopefully, we have introduced enough examples to show that there are many choices of a coupling operator $\tilde\Omega$. Indeed, there are infinitely many choices in the case of $E$ being infinite. For instance, for every $\Gamma \subset E^2$,

$$\tilde\Omega f(i_1, i_2) = I_\Gamma(i_1, i_2)\, \tilde\Omega_c f(i_1, i_2) + I_{\Gamma^c}(i_1, i_2)\, \tilde\Omega_b f(i_1, i_2)$$

is a coupling operator. Now, to use the coupling technique, a basic problem we should study is the regularity of coupling operators. Note that the dimension of a coupling process is the sum of the dimensions of the marginals. Hence, a coupling process is usually more difficult to handle than the marginals. However, for the above problem, we do have a complete answer.
Theorem 0.23. If the marginals are regular jump processes, then so is every coupling Markov process. Conversely, if a Markovian coupling is a regular jump process, then so are its marginals. Furthermore, in the regular case, (0.8) and (0.9) are equivalent.

In what follows, we will meet applications of coupling methods several times. Let us now mention a typical application here. Let $X_t = (X_t^1, X_t^2)$ ($t \ge 0$) be the path of a coupling $Q$-process and set

$$T = \inf\{t \ge 0 : X_t^1 = X_t^2\}.$$

A coupling is called successful if $\mathbb{P}^{i_1, i_2}[T < \infty] = 1$ and
Suppose that a successful coupling exists. Then

$$\|P(t, i_1, \cdot) - P(t, i_2, \cdot)\|_{\mathrm{var}} \le 2\, \mathbb{P}^{i_1, i_2}[T > t] \to 0 \quad \text{as } t \to \infty.$$

Furthermore, if the process has a stationary distribution $\pi$, then
and so the process is ergodic.
As another application of the coupling technique, we discuss the monotonicity for Markov chains.
Definition 0.24. Take $E = \{0, 1, 2, \dots\}$ and let $(X^k(t))_{t \ge 0}$ ($k = 1, 2$) be two copies of a Markov chain $(X(t))_{t \ge 0}$ with different starting points. If

then we say that the chain is monotone. One way to prove the monotonicity is to use the coupling method. For example, applying the basic coupling to a Markov chain with regular $Q$-matrix $Q = (q_{ij})$ on $\mathbb{Z}_+$, we find that the condition:

is sufficient for the monotonicity of the Markov chain. Of course, if we use a different coupling, we will find a different sufficient condition for the monotonicity. From this point of view, it is believable that condition (0.10) is not necessary for the monotonicity. A complete solution for the monotonicity for general jump processes, as well as other topics discussed in this section, is treated in Chapter 5. The above two sections are based on Chen (1989a, 1991d), respectively.
0.3 Reversible Markov Chains
Definition 0.25. Let $(X_t)_{t \ge 0}$ be a Markov process defined on $(\Omega, \mathscr{F}, \mathbb{P})$ with countable state space $E$. The process is called reversible if for any $n \ge 1$, $0 \le t_1 < \cdots < t_n$ with

and any $i_1, \dots, i_n \in E$,

$$\mathbb{P}[X_{t_1} = i_1, \dots, X_{t_n} = i_n] = \mathbb{P}[X_{t_1} = i_n, \dots, X_{t_n} = i_1]. \tag{0.11}$$

Clearly, a reversible Markov chain $(X_t)$ should be stationary. That is, $\pi_i := \mathbb{P}[X_t = i]$ is independent of $t \ge 0$. Actually,
Due to the Markov property, (0.11) is equivalent to

$$\pi_i P_{ij}(t) = \pi_j P_{ji}(t), \qquad i, j \in E,\ t \ge 0. \tag{0.12}$$

This implies that

$$\pi_i q_{ij} = \pi_j q_{ji}, \qquad i, j \in E. \tag{0.13}$$
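Condition (0.13) is easy to test numerically on a finite state space. The sketch below is an illustration under stated assumptions: it builds a finite birth-death $Q$-matrix with hypothetical rates and uses the standard product formula $\mu_0 = 1$, $\mu_i = b_0 \cdots b_{i-1} / (a_1 \cdots a_i)$ as the candidate measure. All function names are illustrative. A three-state cyclic chain is included as a contrast, since it fails the condition for the uniform measure.

```python
from fractions import Fraction


def birth_death_q(b, a):
    """Q-matrix on {0, ..., n} with q[i][i+1] = b[i] for i < n and
    q[i][i-1] = a[i] for i >= 1 (a[0] is unused)."""
    n = len(b)
    q = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        if i < n:
            q[i][i + 1] = Fraction(b[i])
        if i >= 1:
            q[i][i - 1] = Fraction(a[i])
        q[i][i] = -sum(q[i])  # conservative diagonal
    return q


def product_measure(b, a):
    """mu_0 = 1 and mu_i = b_0 ... b_{i-1} / (a_1 ... a_i)."""
    mu = [Fraction(1)]
    for i in range(1, len(b) + 1):
        mu.append(mu[-1] * Fraction(b[i - 1]) / Fraction(a[i]))
    return mu


def satisfies_detailed_balance(q, mu):
    """Check mu_i q_ij == mu_j q_ji for all pairs (i, j)."""
    n = len(q)
    return all(mu[i] * q[i][j] == mu[j] * q[j][i]
               for i in range(n) for j in range(n))


b, a = [1, 2, 3], [0, 4, 5, 6]
q = birth_death_q(b, a)
mu = product_measure(b, a)
print(satisfies_detailed_balance(q, mu))

cycle = [[-1, 1, 0], [0, -1, 1], [1, 0, -1]]
print(satisfies_detailed_balance(cycle, [1, 1, 1]))
```

Normalizing `mu` by its total mass gives a probability measure, and detailed balance then implies stationarity for this finite chain.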
Since it is easy to get $Q = (q_{ij})$ in practice, but not $(P_{ij}(t))$, we should start our study from a given $Q$-matrix. Thus, we are now in the same position as at the beginning of Section 0.1. For a given $Q$-matrix $Q = (q_{ij})$ which is reversible with respect to a probability measure $(\pi_i)$ in the sense of (0.13), we would like to know when there is one, and when there is precisely one, such $Q$-process $(P_{ij}(t))$ so that (0.12) holds. To state our main results, let us replace the probability measure $(\pi_i)$ by an arbitrary but non-trivial measure $(\pi_i)$. Then, we call $Q = (q_{ij})$ (resp. $(P_{ij}(t))$) symmetrizable with respect to $(\pi_i)$ if (0.13) (resp. (0.12)) holds. Finally, in this section, the only assumption for the $Q$-matrix $(q_{ij})$ is the total stability: $q_i < \infty$ for all $i \in E$.

Theorem 0.26. The minimal $Q$-process $(P^{\min}_{ij}(t))$ is reversible (resp. symmetrizable) with respect to $(\pi_i)$ iff so is its $Q$-matrix.

Theorem 0.27. With respect to a probability measure $(\pi_i)$, the reversible $Q$-process is unique iff the following conditions hold.
(1) $(q_{ij})$ is reversible with respect to $(\pi_i)$.
(2) $\sum_i \pi_i \big(q_i - \sum_{j \ne i} q_{ij}\big) < \infty$.
(3) Equation (0.2) has only the trivial solution.

For general $(\pi_i)$, we have the following result.

Theorem 0.28. With respect to a measure $(\pi_i)$, there exists precisely one symmetrizable $Q$-process if the following conditions hold.
(1) $Q = (q_{ij})$ is symmetrizable with respect to $(\pi_i)$.
(2) $\sum_i \pi_i \big(q_i - \sum_{j \ne i} q_{ij}\big) < \infty$ or $\inf_i \sum_j P^{\min}_{ij}(\lambda) > 0$.
(3) The only summable solution to Equation (0.2) is zero.

We guess that condition (2) is still stronger than necessary. Thus, a complete criterion for the uniqueness of the symmetrizable $Q$-process remains open. Besides, even though we know a great deal about the general $Q$-process (cf. Section 3.4), we only have a partial solution to the following problem.

Open Problem 0.29. What is the uniqueness criterion for the honest reversible (resp. symmetrizable) $Q$-process? Here, "honest" means that $\sum_j P_{ij}(t) = 1$ for all $i \in E$ and $t \ge 0$.
In the study of symmetrizable $Q$-processes, a new question arises. How can we decide whether a given $Q$-matrix is symmetrizable with respect to some measure $(\pi_i)$? As a nice exercise, the reader may try to answer the question for the Schlögl model. In general, this question is answered by using an analogue of the classical field theory in analysis. It is interesting that the same idea can also be used to study the recurrence for symmetrizable Markov chains (see Chapter 7 for details).

0.4 Large Deviations and Spectral Gap

Markov chains constitute a nice class of stochastic processes, not only for their many applications but also for their concrete behavior and simplicity. In the regular case, the paths of a Markov chain are simply step functions almost surely. We can even see the jump law: starting from a state $i$, the chain stays in $i$ for a while according to the exponential distribution with parameter $q_i$. Then, the chain jumps to $j$ ($\ne i$) according to the distribution $q_{ij}/q_i$ (provided $q_i > 0$). For this reason, a large part of the theory of stochastic processes began from Markov chains. Conversely, Markov chains can be used to justify the power of a general theory for stochastic processes. Let us discuss the two topics expressed by the title of this section. In the Donsker-Varadhan large deviation theory (an introduction to the theory is presented in Section 8.1), we are interested in the entropy (= rate function):
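$$I(\mu) = -\inf_{u \in \mathscr{D}^+(L)} \int \frac{Lu}{u}\, d\mu \tag{0.14}$$

and the estimates

$$\text{upper estimate:} \quad \limsup_{t \to \infty} \frac{1}{t} \log Q_{t,i}(C) \le -\inf_{\mu \in C} I(\mu), \qquad C \text{ closed};$$

$$\text{lower estimate:} \quad \liminf_{t \to \infty} \frac{1}{t} \log Q_{t,i}(G) \ge -\inf_{\mu \in G} I(\mu), \qquad G \text{ open}.$$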
We should explain the notation used here. Let $(X_t)_{t \ge 0}$ be a Markov chain with transition probability $P(t) = (P_{ij}(t))$ and let $\mathbb{P}_i$ be the probability that the chain starts from $i \in E$. Next, let $\mathscr{P}(E)$ be the set of probabilities on $E$, endowed with the weak topology. Set

$$L_t = \frac{1}{t} \int_0^t \delta_{X_s}\, ds$$

and $Q_{t,i} = \mathbb{P}_i \circ L_t^{-1}$. Considering $P(t)$ as an operator on $b\mathscr{E}$, the set of bounded functions with uniform norm, it induces an infinitesimal generator
$L$ with domain $\mathscr{D}(L)$. Let $\mathscr{D}^+(L)$ be the set of strictly positive functions in $\mathscr{D}(L)$. In view of Markov chains, the entropy given by (0.14) is not satisfactory since $\mathscr{D}(L)$ is quite poor; even the indicator $I_{\{i\}}$ ($i \in E$) is usually not in $\mathscr{D}(L)$. However, we have

Theorem 0.30. Given a regular $Q$-matrix $Q = (q_{ij})$,
(1) if $\mu \in \mathscr{P}(E)$ satisfies $\sum_i \mu_i q_i < \infty$, then

where $\mathscr{C} = \mathscr{D}^+(L)$ or any one of the following sets:
> O for some E > O } , 0 < f < w},
&+ = (f : f 2 €O = b@+
(f
:
= &8n &+,
E
b@
=b
8
n go9
(2) If $(q_{ij})$ is reversible with respect to some $\pi \in \mathscr{P}(E)$, then for every $\mu \in \mathscr{P}(E)$, we have an explicit expression as follows.

This theorem is proved in Chapter 8. Some upper estimates are also studied there. Roughly speaking, the large deviations say that the exponential convergence rate of $Q_{t,i}(C)$ is described by the entropy $-\inf_{\mu \in C} I(\mu)$. For reversible Markov processes, we have a different way to look at the exponential convergence rate:

$$\|P(t) f - \pi(f)\| \le \|f - \pi(f)\| \exp[-\varepsilon t],$$

where $\|\cdot\|$ is the norm in $L^2(\pi)$ and $\pi(f)$ is the mean of $f$ with respect to $\pi$. Let $\sigma$ denote the largest value of $\varepsilon > 0$. The constant $\sigma$ is the rate we are looking for. As usual, the convergence rate is related to some spectral gap. Let $L$ denote the generator with domain $\mathscr{D}(L)$ induced by $P(t)$ on $L^2(\pi)$ and let $\mathrm{gap}(L)$ denote the infimum of the spectrum of $-L$ restricted to the orthogonal complement of the constant function 1. Then, we have the following result.
Theorem 0.31.
+
For finite Markov chains with $Q$-matrix $Q$, $\mathrm{gap}(Q)$ is nothing but the first non-trivial eigenvalue of $-Q$. Estimating $\mathrm{gap}(Q)$ is a traditionally hard topic in mathematics. To compute $\mathrm{gap}(Q)$ explicitly, one has to stop when the order of $Q$ is higher than five. Surprisingly, in a particular case we do have a complete solution to the problem, even for some infinite matrices. Consider the birth-death $Q$-matrix: $q_{i,i+1} = b_i > 0$ ($i \ge 0$), $q_{i,i-1} = a_i > 0$ ($i \ge 1$) and $q_{ij} = 0$ for all other $i \ne j$. Suppose that the process is ergodic and so we have a stationary distribution

$$\pi_i = \frac{\mu_i}{Z}, \qquad \mu_0 = 1, \quad \mu_i = \frac{b_0 \cdots b_{i-1}}{a_1 \cdots a_i}\ (i \ge 1), \qquad Z = \sum_i \mu_i.$$

Define

$$\mathscr{W} = \Big\{ \{w_i\}_{i \ge 0} : w_i \text{ is strictly increasing in } i \text{ and } \sum_i \pi_i w_i \ge 0 \Big\},$$

$$\widetilde{\mathscr{W}} = \Big\{ \{w_i\}_{i \ge 0} : \text{there exists } k,\ 1 \le k < \infty, \text{ so that } w_i = w_{i \wedge k},\ w \text{ is strictly increasing in } [0, k] \text{ and } \sum_i \pi_i w_i = 0 \Big\},$$

$$I_i(w) = \frac{1}{b_i \mu_i (w_{i+1} - w_i)} \sum_{j \ge i+1} \mu_j w_j, \quad i \ge 0; \qquad \delta = \sup_{i \ge 1} \sum_{j \le i-1} \frac{1}{\mu_j b_j} \sum_{j \ge i} \mu_j.$$

Note that $\widetilde{\mathscr{W}}$ is simply a modification of $\mathscr{W}$. Hence, only two notations $\mathscr{W}$ and $I(w)$ are essential here.

Theorem 0.32. For the ergodic birth-death process as above, the following conclusions hold.
(1) Variational formula for the lower bound: $\mathrm{gap}(D) = \sup_{w \in \mathscr{W}} \inf_{i \ge 0} I_i(w)^{-1}$.
(2) Variational formula for the upper bound: $\mathrm{gap}(D) = \inf_{w \in \widetilde{\mathscr{W}}} \sup_{i \ge 0} I_i(w)^{-1}$.
(3) Explicit bounds and explicit criterion: $Z \delta^{-1} \ge \mathrm{gap}(D) \ge (4\delta)^{-1}$. In particular, $\mathrm{gap}(D) > 0$ iff $\delta < \infty$.
The study on spectral gap is the aim of Chapter 9.

0.5 Equilibrium Particle Systems

Let us start from the simplest case. Consider a $Q$-process $(P(t))$ with $Q$-matrix

and state space $E = \{-1, +1\}$. Assume that $ab \ne 0$; otherwise, the model is trivial. Then
As the limit of $P_{ij}(t)$ ($t \to \infty$), we obtain the stationary distribution

$$\pi_{-1} = \frac{a}{a+b}, \qquad \pi_{+1} = \frac{b}{a+b}.$$
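The two-state chain is simple enough to write down in closed form, which makes the convergence to the stationary distribution easy to see numerically. The sketch below is an illustration only. It assumes that the jump rate from $-1$ to $+1$ is $b$ and the rate from $+1$ to $-1$ is $a$ (this labelling is an assumption, chosen so that the limit is $(a/(a+b),\ b/(a+b))$), and the function names are hypothetical. It also checks the semigroup property $P(t+s) = P(t)P(s)$.

```python
import math


def two_state_transition(a, b, t):
    """P(t) for the states (-1, +1), assuming rate b for -1 -> +1
    and rate a for +1 -> -1."""
    pi_minus, pi_plus = a / (a + b), b / (a + b)
    decay = math.exp(-(a + b) * t)
    return [
        [pi_minus + pi_plus * decay, pi_plus * (1.0 - decay)],
        [pi_minus * (1.0 - decay), pi_plus + pi_minus * decay],
    ]


def matmul(x, y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


a, b = 2.0, 3.0
print(two_state_transition(a, b, 0.0))   # identity matrix at time zero
print(two_state_transition(a, b, 50.0))  # both rows close to the stationary law
```

Both rows of $P(t)$ approach the same limit, which reflects the uniqueness of the stationary distribution in this model.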
In other words, there is only one stationary distribution, denoted by $|\mathscr{I}| = 1$. The above process is a model with a single particle having two states $\pm 1$. If we consider a finite number of particles, say $N \in \mathbb{N}$, $N \ge 2$, then the state space becomes $\{-1, +1\}^N$. The system can also be described by a $Q$-process (its operator is given by (0.15) below, replacing $\mathbb{Z}^d$ with $N$) and we still have $|\mathscr{I}| = 1$. What will happen if we replace $N$ with a countable set? For instance, consider a particle system on the regular lattice $\mathbb{Z}^d$. At each site $u \in \mathbb{Z}^d$, there is a particle with two states $\pm 1$. Then the set of all configurations constitutes our state space $\{-1, +1\}^{\mathbb{Z}^d}$, which is no longer countable. Hence, the system cannot be described by a $Q$-process. Now, we use $c(u, x)$, instead of $q_{ij}$, to describe the transition rate of a particle changing its state. Given $x \in E$, let

$$({}_u x)(v) = \begin{cases} x(v) & \text{if } v \ne u, \\ -x(v) & \text{if } v = u, \end{cases}$$

and define a formal generator as follows:

$$\Omega f(x) = \sum_{u \in \mathbb{Z}^d} c(u, x)\, [f({}_u x) - f(x)]. \tag{0.15}$$

A particular choice of $c(u, x)$ is $c(u, x) = \exp\big[-\beta \sum_{v : |v - u| = 1} x(u) x(v)\big]$. Then we obtain the famous Ising model in statistical physics. Now, a completely different phenomenon happens. For $d = 2$, we have

$$|\mathscr{I}| = 1 \ \text{ if } \beta \le \tfrac{1}{2} \log(1 + \sqrt{2}) =: \beta_c^{(2)} \approx 0.44, \qquad |\mathscr{I}| > 1 \ \text{ if } \beta > \beta_c^{(2)}.$$

For $d \ge 3$, the picture is similar for a critical point $\beta_c^{(d)} > 0$. But for $d = 1$, we have $|\mathscr{I}| = 1$. It should be clear now that the Ising model exhibits phase transitions which depend on the dimension $d$. Actually, this model has attracted a lot of attention in statistical physics, even in the 2-dimensional case (see for instance, McCoy and Wu (1973)). The Ising model, as well as a fundamental part of the theory of random fields, including the typical methods for studying the phase transitions (the Peierls method and the reflection positivity method), are presented in Part III. Based on the field theory, we introduce some simple criteria for the reversibility of spin processes and exclusion processes. Besides, two new developments
in the field are included. The first one is to use lattice fractals instead of the regular lattice. Then, we do obtain some interesting results. For example, the Ising model on the lattice Sierpinski gasket has no phase transition in any dimension, but the model on the lattice Sierpinski carpet does exhibit phase transitions in any dimension. The other one is to use some groups as the spin space instead of $\{-1, +1\}$; the latter seems mainly suitable for the metallic phase transitions at low temperature. However, new progress on superconductivity has been made recently by using ceramics instead of ferromagnetics. This explains why we have to consider more general spin spaces instead of $\{-1, +1\}$.

0.6 Non-equilibrium Particle Systems

The Ising model discussed in the last section belongs to equilibrium statistical physics. Having the knowledge about the equilibrium systems in mind, it is natural to ask what we can do for the non-equilibrium systems. A typical example is Schlögl's model (Example 0.3), replacing the finite set $S$ with the infinite one $S = \mathbb{Z}^d$. The formal generator can be written as follows:

where $\lambda_1, \dots, \lambda_4$ and $(p(u, v))$ are the same as before, and $e_u$ is the unit vector in $E = \mathbb{Z}_+^S$ having value 1 at $u \in \mathbb{Z}^d$ and 0 elsewhere. This model is a special reaction-diffusion process studied in the last part of the book. It may be helpful for our readers to compare the Schlögl model with the Ising model.
(1) Clearly, the state space $E = \{-1, +1\}^{\mathbb{Z}^d}$ for the Ising model is compact and so is $\mathscr{P}(E)$. Thus, the process has at least one stationary distribution. But for the Schlögl model, the state space $E = \mathbb{Z}_+^{\mathbb{Z}^d}$ is neither compact nor locally compact.
(2) The Ising model is reversible, and its local Gibbs distributions are explicit. But the Schlögl model has no such advantage, except in a special case.
(3) The generator of the Ising model is locally bounded but it is not so for the Schlögl model.

These facts show that the non-equilibrium particle systems are more difficult to handle than the equilibrium systems.
To construct an infinite dimensional Schlögl model, take a sequence $\{\Lambda_n\}$ of finite subsets of $\mathbb{Z}^d$ so that $\Lambda_n \uparrow \mathbb{Z}^d$. Then, we have a sequence of Markov chains $\{P_n(t) : n \ge 1\}$. The next step is to prove that $\{P_n(t, x, \cdot) : n \ge 1\}$ is a Cauchy sequence. Thus, we have to use a probability metric, say $W$, for instance:

$$W(P_n(t, x, \cdot), P_m(t, x, \cdot)) \to 0 \quad \text{as } m \ge n \to \infty.$$

From this line of the construction, we see a relation between Markov chains and interacting particle systems. Locally, particle systems are Markov chains. This explains why the title of the book is chosen. The constructions, the uniqueness of the processes as well as 15 concrete models are presented in Chapter 13. It will be proved in Chapter 14 that the reaction-diffusion processes often have at least one stationary distribution and sometimes they are ergodic. The reversible reaction-diffusion processes are always ergodic. For some special models, we will prove that there is more than one stationary distribution. That is, the processes exhibit phase transitions (Chapter 15).

Finally, we turn to discuss the relation between the processes and partial differential equations. It is known that the generator of $d$-dimensional Brownian motion $\{B_t\}_{t \ge 0}$ is the Laplacian $\frac{1}{2} \Delta = \frac{1}{2} \sum_{i=1}^{d} \partial^2 / \partial x_i^2$. Moreover, for suitable $g$, $f(t, x) := \mathbb{E}_x\, g(B_t)$ satisfies the linear equation:

$$\frac{\partial f}{\partial t} = \frac{1}{2} \sum_{i=1}^{d} \frac{\partial^2 f}{\partial x_i^2}, \qquad f(0, x) = g(x).$$
However, if we consider the reaction-diffusion equation (non-linear):
(0.16) where $V$ is a polynomial, there is no hope of finding a Markov process valued in $\mathbb{R}^d$ with such a generator since for a Markov process, its generator must be linear. Nevertheless, under some hypotheses on the initial distribution of the process and on the initial function $p$, it will be proved in the last chapter of the book that a limit of some mean of a scaled reaction-diffusion process provides a solution to Eq. (0.16). In other words, a reaction-diffusion process describes the microscopic behavior, and Eq. (0.16) describes the macroscopic behavior of a non-equilibrium system. In the last chapter, we will also prove that some solutions to Eq. (0.16) are asymptotically stable but some of them are not. This result represents the critical phenomena of the systems, which correspond more or less to the phase transitions for the microscopic processes.
PART I GENERAL JUMP PROCESSES
Chapter 1
Transition Function and its Laplace Transform

In this chapter, we first study some basic properties of the sub-Markovian transition function of a jump process: continuity and differentiability. From these, we deduce the transition intensity, the q-pair. Next, we study the one-to-one correspondence between the transition function of a jump process and its Laplace transform. This enables us to use the fundamental tool, the Laplace transform, instead of the transition function itself in the subsequent study. As is well known, the advantage of using the Laplace transform is that it reduces the integral equations to linear algebraic ones.

1.1 Basic Properties of Transition Function

Throughout the book, we use the following notations. Let $(E, \mathscr{E})$ and $(X, \mathscr{B})$ be two measurable spaces. Denote by $f \in \mathscr{E}/\mathscr{B}$ the measurable mapping from $(E, \mathscr{E})$ to $(X, \mathscr{B})$. However, if $X = \bar{\mathbb{R}}$ with Borel $\sigma$-algebra $\mathscr{B} = \mathscr{B}(\bar{\mathbb{R}})$, we simply use the same notation $\mathscr{E}$ to denote the set of all measurable functions from $(E, \mathscr{E})$ to $(\bar{\mathbb{R}}, \mathscr{B}(\bar{\mathbb{R}}))$. Similarly, let $r\mathscr{E}$ (resp. $r\mathscr{E}_+$, $b\mathscr{E}$, $b\mathscr{E}_+$, $\mathscr{E}_+$) denote the set of all measurable real-valued (resp. non-negative real-valued, bounded real-valued, bounded non-negative, and non-negative but possibly $+\infty$) functions. Finally, let $\mathscr{L}$ (resp. $\mathscr{L}_+$, $\overline{\mathscr{L}}_+$) denote the set of all $\sigma$-additive set functions (resp. finite measures, $\sigma$-finite measures). Unless otherwise stated, the state space $(E, \mathscr{E})$ considered in the book is a Polish space with Borel $\sigma$-algebra $\mathscr{E}$. Recall that a Polish space is a separable topological space that can be metrized by means of a complete metric.
Definition 1.1. We call $P(t, x, A)$ ($t \ge 0$, $x \in E$, $A \in \mathscr{E}$) a (sub-Markovian) transition function of a jump process if the following conditions hold.
(1) For each $t \ge 0$ and $A \in \mathscr{E}$, $P(t, \cdot, A) \in r\mathscr{E}_+$.
(2) For each $t \ge 0$ and $x \in E$, $P(t, x, \cdot)$ is a finite measure on $\mathscr{E}$ and $P(t, x, E) \le 1$.
(3) Chapman-Kolmogorov equation (abbrev. CK-equation). For each $t, s \ge 0$, $x \in E$ and $A \in \mathscr{E}$,

$$P(t + s, x, A) = \int P(t, x, dy)\, P(s, y, A).$$

(4) For each $x \in E$ and $A \in \mathscr{E}$, $\lim_{t \to 0} P(t, x, A) = P(0, x, A) = \delta(x, A)$, where $\delta(\cdot, A)$ is the indicator of $A$, also denoted by $I_A$.
In this definition, the crucial point distinguishing it from the transition function of a general Markov process is the last condition (4), which means continuity at the origin and hence is often quoted as the continuity condition. Because of this condition, the sample paths of the process are step functions, at least before the explosion time. For this reason, we also call (4) the jump condition. In many cases, we do not want to distinguish different jump processes with the same transition function. Hence we often call the transition function itself a jump process. In particular, we call it a Markov chain in the case of $E$ being a countable set, denoted by the matrix $(P_{ij}(t) : i, j \in E)$. A jump process $P(t, x, A)$ is called honest if for each $t \ge 0$ and $x \in E$, $P(t, x, E) = 1$. Otherwise, it is called non-honest.

Theorem 1.2. For each $x$ and $g \in b\mathscr{E}_+$, $\int P(t, x, dy) f(y)$ is uniformly continuous in $t$, uniformly in $f$ with $|f| \le g$. In particular, $P(t, x, A)$ is uniformly continuous in $t$, uniformly in $A$.

Proof: By conditions (3) and (2) of Definition 1.1, it follows that

In the last step we have used the fact that $|a - b| \le c$ for all $a, b \in [0, c]$. Thus, we have

Now, the first assertion follows from this and Definition 1.1 (4). ∎

The next result shows a nice property of jump processes. Even though we need the result only in a few cases, it is still included for completeness.

Theorem 1.3. Let $P(t, x, A)$ be a jump process on $(E, \mathscr{E})$. Then for each $x \in E$ and $A \in \mathscr{E}$, either $P(\cdot, x, A) \equiv 0$ or $P(\cdot, x, A) > 0$.

Proof: If $P(t, x, A)$ is not honest, we may introduce a fictitious state $\Delta \notin E$, such that $E_\Delta := E \cup \{\Delta\}$ is again a Polish space and $\Delta$ is an isolated state. Moreover, $\mathscr{E} \subset \mathscr{E}_\Delta := \sigma(\mathscr{E} \cup \{\Delta\})$. Let
Then we obtain an honest jump process $\tilde P(t, x, A)$ ($t \ge 0$, $x \in E_\Delta$, $A \in \mathscr{E}_\Delta$). Clearly, if $\tilde P(t, x, A)$ possesses the properties described in the theorem, then so does $P(t, x, A)$. Hence, we need only consider the honest jump processes. By the CK-equation, we have

$$P(t + s, x, A) \ge P(s, x, \{x\})\, P(t, x, A).$$

Hence, if $P(t, x, A) > 0$, then $P(s, x, A) > 0$ for all $s > t$. From this and the continuity of $P(\cdot, x, A)$, it follows that $P(t, x, A) > 0$ for all $t$ whenever $x \in A$. Furthermore, for $x \notin A$, there exists $u(x, A) \in [0, \infty]$ such that

$$P(t, x, A) = 0 \ \text{ if } 0 \le t \le u(x, A), \qquad P(t, x, A) > 0 \ \text{ if } t > u(x, A). \tag{1.2}$$

Thus, what we need to prove is that for each $x \in E$ and $A \in \mathscr{E}$, $x \notin A$, either $u(x, A) = 0$ or $u(x, A) = \infty$. Suppose that

$$0 < u(x, A) < \infty \tag{1.3}$$
for some $x$ and $A$. Fix $x$ and $A$. Set $u_0 = u(x, A)$, $u(y) = u(y, A)$ and $v(y) = u(y) \wedge u_0$. Obviously, $u$ and $v$ are measurable. Since $(E, \mathscr{E})$ is a Polish space, we can construct a Markov process $X(t)$ on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$ with transition function $P(t, x, \cdot)$ and initial state $X(0) = x$ (cf. Neveu (1965), p. 83, Corollary). Let

$$Y_0(t) = v(X(t)).$$

Then $Y_0(0) = u_0$, $0 \le Y_0(t) \le u_0$. Moreover

By the dominated convergence theorem, the right-hand side tends to 1 as $h \to 0$. So $Y_0$ is continuous in probability. On the other hand, $E$ can be embedded into a compact space $X$ so that the completion $(\bar E, \rho)$ of $E$ in $X$ is again compact with metric $\rho$. Hence there exists a measurable and separable version $Y$ of $Y_0$ such that $0 \le Y(t) \le u_0 = Y(0)$. Now, it suffices to show that there exists $\Lambda \in \mathscr{F}$ such that $\mathbb{P}(\Lambda) = 0$ and

Actually, (1.4) gives us $P(t, x, \{y : u(y) < u_0\}) = 0$ for almost all $t$ and then for all $t \ge 0$ because of the continuity of the transition function. Note that
$u(y) = 0$ whenever $y \in A$. It follows that $P(t, x, A) = 0$ for all $t \ge 0$. This is in contradiction with (1.2). Next, we use three steps to prove (1.4).

a) Prove that $Z(t) := Y(t) + t$ is non-decreasing in $t$. We first prove that

$$P(h, y, \{z : v(z) < v(y) - h\}) = 0, \qquad h > 0. \tag{1.5}$$

This is trivial when $h \ge v(y)$. Assume that $0 < h < v(y)$. Since $v(y) \le u(y)$, we have

$$0 = P(v(y), y, A) = \int P(h, y, dz)\, P(v(y) - h, z, A). \tag{1.6}$$

On the other hand, by (1.2), $v(z) < v(y) - h$ implies that $P(v(y) - h, z, A) > 0$. Hence, (1.5) follows from (1.6). Next, by (1.5), we have

$$\mathbb{P}[Y(t + h) < Y(t) - h] = \int P(t, x, dy)\, P(h, y, \{z : v(z) < v(y) - h\}) = 0.$$

This shows that

$$Y(t + h) \ge Y(t) - h, \qquad \mathbb{P}\text{-a.s.} \tag{1.7}$$

By the separability, we may choose an exceptional set so that (1.7) holds for all $t, h \ge 0$. This proves a). In what follows, we will ignore the exceptional set.

b) Prove that

$$\frac{d}{dt} Z(t) = 1, \qquad \text{a.e. } t.$$
Because $Z(t)$ is non-decreasing, the derivative $Z'(t)$ exists almost everywhere. To compute $Z'(t)$, let $\Lambda(t, h) = \mathbb{P}[Y(t + h) \ne Y(t)]$. We have seen that $\lim_{h \to 0} \Lambda(t, h) = 0$ for all $t \ge 0$, so by the dominated convergence theorem,

$$\lim_{h \to 0} \int_0^\infty \Lambda(t, h)\, e^{-t}\, dt = 0.$$

Thus, we can choose a sequence $\{h_n\}$, $h_n \downarrow 0$, such that

By the Fubini theorem, there is a set $N$ with Lebesgue measure zero so that $\sum_n \Lambda(t, h_n) < \infty$ for all $t \notin N$. Then, by the Borel-Cantelli lemma, we have

$$\mathbb{P}[Y(t + h_n) = Y(t) \text{ for sufficiently large } n] = 1, \qquad t \notin N.$$
This shows that

$$\mathbb{P}\Big[\lim_{n \to \infty} \big(Z(t + h_n) - Z(t)\big) / h_n = 1\Big] = 1, \qquad t \notin N.$$

Therefore b) follows from the Fubini theorem.

c) Prove (1.4). By a) and b), there exists a $\mathbb{P}$-zero set $\Lambda$ such that

$$Z(t) - Z(0) \ge \int_0^t Z'(s)\, ds = t \quad \text{on } \Lambda^c, \qquad t \ge 0,$$

which is just (1.4). ∎
1.2 The q-Pair

In this section, we study the right derivatives at the origin for a jump process. We first deal with the diagonals.
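To make the diagonal right derivative concrete, consider a two-state chain in which a state $x$ is left at rate $b$ and re-entered at rate $a$, so that $P(t, x, \{x\}) = \frac{a}{a+b} + \frac{b}{a+b} e^{-(a+b)t}$. The sketch below is only an illustration with hypothetical function names and rates. It evaluates the difference quotient $(1 - P(t, x, \{x\}))/t$ for small $t$, which approaches $b$, and compares the diagonal with $e^{-bt}$.

```python
import math


def diagonal(a, b, t):
    """P(t, x, {x}) for a state x left at rate b and re-entered at rate a."""
    return a / (a + b) + b / (a + b) * math.exp(-(a + b) * t)


def difference_quotient(a, b, t):
    """(1 - P(t, x, {x})) / t, which tends to the exit rate as t -> 0."""
    return (1.0 - diagonal(a, b, t)) / t


a, b = 2.0, 3.0
for t in (1e-1, 1e-3, 1e-6):
    print(t, difference_quotient(a, b, t))

# The diagonal stays above exp(-b t), and -log of it is sub-additive.
for t in (0.1, 0.5, 1.0, 3.0):
    print(t, diagonal(a, b, t) >= math.exp(-b * t))
```

In this example the limit of the difference quotient is finite, so the state is stable in the terminology introduced later in this section.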
Theorem 1.4. For each $x \in E$, the limit

$$q(x) := \lim_{t \to 0} \frac{1 - P(t, x, \{x\})}{t} \le \infty \tag{1.8}$$

exists and $q(\cdot)$ is $\mathscr{E}$-measurable. Moreover, for each $x \in E$ and $t \ge 0$, we have

$$P(t, x, \{x\}) \ge e^{-q(x) t}. \tag{1.9}$$

Proof: Fix $x \in E$.
By the CK-equation, we have

$$P(t, x, \{x\}) \ge P(t/n, x, \{x\})^n.$$

From condition (4) of Definition 1.1, we see that the right-hand side is positive for large enough $n$. Hence $P(t, x, \{x\}) > 0$ for all $t \ge 0$. Thus, for fixed $x$, we may define

$$f(t) := -\log P(t, x, \{x\}) \in [0, \infty). \tag{1.10}$$

By using the CK-equation again, we obtain

$$P(t + s, x, \{x\}) \ge P(t, x, \{x\})\, P(s, x, \{x\}).$$

This shows that $f$ is sub-additive:

$$f(t + s) \le f(t) + f(s), \qquad t, s \ge 0. \tag{1.11}$$
For $t, h > 0$, let $n$ be the integer such that $t = nh + \varepsilon$, $0 \le \varepsilon < h$. Then

Letting $h \to 0$, then $nh/t \to 1$, $f(\varepsilon) = -\log P(\varepsilon, x, \{x\}) \to 0$, and so

Therefore

This shows that

We have thus proved the first assertion. The last assertion follows from (1.10) and (1.12):

Finally, for $A \in \mathscr{E} \times \mathscr{E}$, let $A_x$ denote the section of $A$ at the point $x$. By the monotone class theorem (cf. Section 1.5), it is easy to check that for each $A \in \mathscr{E} \times \mathscr{E}$, $P(t, x, A_x)$ is $\mathscr{E}$-measurable in $x$. In particular, since $(E, \mathscr{E})$ is a Polish space, we have $\{(x, x) : x \in E\} \in \mathscr{E} \times \mathscr{E}$. Thus $P(t, x, \{x\})$ is $\mathscr{E}$-measurable in $x$ and so is $q(x)$. ∎

Now, we turn to study the existence of the limit $\lim_{t \to 0} P(t, x, A)/t$ ($x \notin A \in \mathscr{E}$). For this, set

$$\mathscr{R} = \Big\{ A \in \mathscr{E} : \lim_{t \to 0} \sup_{x \in A} [1 - P(t, x, \{x\})] = 0 \Big\}.$$
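We have the following result.

Theorem 1.5.
(1) $\mathscr{R}$ is a ring. That is, it is closed under the set operations of finite unions and differences. Moreover, if $\sup_{x \in A} q(x) < \infty$, then $A \in \mathscr{R}$.
(2) For each $A$, $x \notin A \in \mathscr{R}$, the limit

$$q(x, A) := \lim_{t \to 0} \frac{P(t, x, A)}{t}$$

exists and

$$q(x, A) \le q(x), \qquad x \notin A \in \mathscr{R}. \tag{1.13}$$

(3) Define $q(x, A) = q(x, A \setminus \{x\})$, $x \in E$, $A \in \mathscr{R}$. Then for each $x$, $q(x, \cdot)$ is a finite measure on $\mathscr{R}$ and for each $A \in \mathscr{R}$, $q(\cdot, A)$ is $\mathscr{E}$-measurable.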
Proof: Clearly, $\mathscr{R}$ is a ring. The last assertion in (1) follows directly from (1.9). Once the existence of the limit defined in (2) is known, for $x \notin A$,

This implies (1.13). We now prove the existence of the limit, which is the main part of the proof. The key idea is the inequality:

$$\frac{P(nh, x, B)}{nh} \ge \frac{P(h, x, B)}{h}, \qquad n = [t/h], \quad x \notin B, \tag{1.14}$$

where $[a]$ is the integer part of $a$. From this, we get

and so

This certainly gives us the required assertion. To prove (1.14), i.e., $P(nh, x, B) \ge n P(h, x, B)$, it is natural to study the decomposition of the time-discrete Markov chain with transition function $P(h, x, B)$, according to its first entrance into $B$. To do so, fix $x$ and $B$, $x \notin B$, and introduce the taboo transition function as follows.

By induction, we have

$$k \ge 1, \quad A \in \mathscr{E}. \tag{1.15}$$

Note that $B \cup \{x\} \in \mathscr{R}$ whenever $B \in \mathscr{R}$. Given $\varepsilon > 0$, choose $\delta > 0$ such that for all $t \le \delta$,
Now suppose that $h \le \delta$ and take $n = [t/h]$. We prove the following two estimates:

(1.18)

Indeed, substituting $A = B$, $k = n$ into (1.15), we obtain

Note that $(n - k) h \le nh \le t \le \delta$ and

We have
c n
&
> P(nh,2,B ) 2 (1 - E )
P&h,
2,B).
k=l
Now, it is the position to prove our main inequality (1.14). Take A = {z},
k
< n in (1.15) and apply (1.18),
Hence
Substituting this into (1.17), we obtain

$$P(nh, x, B) \ge n (1 - 3\varepsilon)\, P(h, x, B), \qquad \varepsilon < 1/3.$$

Even though this estimate is weaker than (1.14), the argument after (1.14) is still available by letting $h \to 0$, $t \to 0$ and then $\varepsilon \to 0$. Thus, we have proved the existence of the required limit. Finally, we prove assertion (3). Since

is $\mathscr{E}$-measurable in $x$, so is

To show that $q(x, \cdot)$ is $\sigma$-additive, take $\{B_n\} \subset \mathscr{R}$, $B_n \downarrow \emptyset$. Without loss of generality, assume that $x \notin \bigcup_n B_n$. By what we have seen before (1.20) and $\lim_{n \to \infty} P(t, x, B_n) = 0$, we certainly have $\lim_{n \to \infty} q(x, B_n) = 0$. ∎

Now a question arises. When can the measure $q(x, \cdot)$ on $\mathscr{R}$ be extended uniquely to the whole space $\mathscr{E}$? To get an answer, we need the following simple fact.
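Lemma 1.6. Let $\{\mu_n\}_1^\infty$ be a sequence of measures on an arbitrary measurable space $(E,\mathscr{E})$. Suppose that for each $A\in\mathscr{E}$, $\mu_n(A)$ increases to some $\mu(A)$. Then $\mu$ is a measure on $(E,\mathscr{E})$.

Proof: It suffices to show the $\sigma$-additivity. To do so, let $\{A_j\}_1^\infty\subset\mathscr{E}$ be mutually disjoint. Then, for each $m\ge 1$,
$$\mu_n\Big(\sum_j A_j\Big)\ge\sum_{j=1}^m\mu_n(A_j).$$
Letting $n\to\infty$, then $m\to\infty$, we obtain $\mu\big(\sum_j A_j\big)\ge\sum_j\mu(A_j)$. On the other hand,
$$\mu_n\Big(\sum_j A_j\Big)=\sum_j\mu_n(A_j)\le\sum_j\mu(A_j),$$
and letting $n\to\infty$ gives the reverse inequality, which is what is required. $\square$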
Lemma 1.7. Suppose that there exists $\{E_n\}_1^\infty\subset\mathscr{R}$ such that $\bigcup_n E_n=E$. Then for each $x\in E$, $q(x,\cdot)$ can be uniquely extended to $\mathscr{E}$. Moreover, the extended $q(x,\cdot)$ preserves the property: for each $A\in\mathscr{E}$, $q(\cdot,A)\in\mathscr{E}$.

Remark 1.8. If the set $\{x\in E: q(x)=\infty\}$ is at most countable, enumerated by $\{x_1,x_2,\ldots\}$, then the condition of Lemma 1.7 is satisfied. Actually, set $B_j=\{x: j-1\le q(x)<j\}$, $j\ge 1$, and $E_j=B_j\cup\{x_j\}$, $j\ge 1$. Then, from Theorem 1.5 (1), it follows that $\{B_j\}\subset\mathscr{R}$, and so $\{E_j\}\subset\mathscr{R}$.

Proof of Lemma 1.7: Since $\mathscr{R}$ is a ring and $\mathscr{E}=\sigma(\mathscr{R})$, which means that $\mathscr{E}$ is the minimal $\sigma$-algebra generated by $\mathscr{R}$, the extension must be unique whenever it exists. To prove the existence, put $V_n=\bigcup_{j=1}^n E_j$, $n\ge 1$. Then $V_n\uparrow E$. Let
$$q(x,A)=\lim_{n\to\infty}q(x,AV_n),\qquad A\in\mathscr{E}.$$
Now Lemma 1.6 implies that $q(x,\cdot)\in\mathscr{L}_+$. Because $AV_n\in\mathscr{R}$, we have $q(\cdot,A)\in\mathscr{E}$ for each $A\in\mathscr{E}$. $\square$

Based on Theorems 1.4, 1.5 and Lemma 1.7, we introduce the following

Definition 1.9. We call $(q(x),q(x,A))$ $(x\in E,\ A\in\mathscr{E})$ a q-pair, if for each $x\in E$, $q(x,\cdot)$ is a measure on $\mathscr{E}$, $q(x,\{x\})=0$, $q(x,E)\le q(x)$; and for each $A\in\mathscr{E}$, $q(\cdot)$ and $q(\cdot,A)$ are $\mathscr{E}$-measurable. We say that a state $x$ is absorbing, stable or instantaneous, respectively, if $q(x)=0$, $0<q(x)<\infty$ or $q(x)=\infty$. We say that a q-pair is totally stable or totally instantaneous if all of the states $x\in E$ are stable or instantaneous, respectively. Furthermore, we call $x$ conservative if $q(x)=q(x,E)$, and we say that a q-pair is conservative if so are all the states $x\in E$.
For Markov chains, $E$ is countable. Define $q_{ij}=q(i,\{j\})$ for $i\ne j$ and $q_{ii}=-q(i)=:-q_i$; we obtain a matrix $Q=(q_{ij}: i,j\in E)$. It is the derivative of $P(t)=(P_{ij}(t))$ at $t=0$ and is called a Q-matrix. Of course, every Q-matrix has the following properties:
$$0\le q_{ij}<+\infty,\quad i\ne j;\qquad \sum_{j\ne i}q_{ij}\le q_i\le+\infty,\quad i\in E. \tag{1.21}$$
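The properties (1.21) and the statement that $Q$ is the derivative of $P(t)$ at $t=0$ can be checked numerically in the simplest case. The following is only a sketch under our own assumptions: a two-state chain on $E=\{0,1\}$ with jump rates $a$ (from 0 to 1) and $b$ (from 1 to 0), whose transition function has a well-known closed form. The helper names `two_state_P` and `is_q_matrix` and the values $a=2$, $b=3$ are hypothetical and chosen for illustration.

```python
import math

def two_state_P(t, a, b):
    """Closed-form transition matrix P(t) of the two-state chain
    with rates a (0 -> 1) and b (1 -> 0)."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def is_q_matrix(Q, tol=1e-12):
    """Check (1.21): 0 <= q_ij < inf for i != j, and
    sum_{j != i} q_ij <= q_i, where q_i = -q_ii."""
    for i, row in enumerate(Q):
        off = [q for j, q in enumerate(row) if j != i]
        if any(q < 0 or math.isinf(q) for q in off):
            return False
        if sum(off) > -row[i] + tol:
            return False
    return True

a, b = 2.0, 3.0            # illustrative rates (assumption)
Q = [[-a, a], [b, -b]]     # conservative Q-matrix of the chain
h = 1e-6                   # small time step for the difference quotient
P = two_state_P(h, a, b)
# Difference quotient [P(h) - I] / h, which should be close to Q
D = [[(P[i][j] - (1.0 if i == j else 0.0)) / h for j in range(2)]
     for i in range(2)]
```

Here `D` agrees with `Q` up to an error of order $h$, in line with the definition of the Q-matrix as the derivative of $P(t)$ at the origin.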
Definition 1.10. A jump process $P(t,x,A)$ is called a q-process with q-pair $(q(x),q(x,A))$ $(x\in E,\ A\in\mathscr{E})$ if for each $x\in E$ and $A\in\mathscr{R}$,
$$\lim_{t\to 0}\,[P(t,x,A)-\delta(x,A)]/t=q(x,A)-q(x)\delta(x,A).$$
For countable $E$, it is traditionally called a Q-process.
As a consequence of Theorems 1.4, 1.5 and Lemma 1.7, we have

Theorem 1.11. Let $P(t,x,A)$ be a jump process for which the set
$$\{x\in E: q(x)=\infty\}$$
is at most countable. Then it must be a q-process with respect to some q-pair $(q(x),q(x,A))$. Moreover, for each $x\in E$, $q(x,\cdot)\in\mathscr{L}_+$.

We say that a q-process is conservative (resp. totally stable) if so is its q-pair. For simplicity, when talking about q-processes, we will not mention their q-pairs if not necessary. In general, $\mathscr{R}$ is much smaller than $\mathscr{E}$. Indeed we have

Corollary 1.12. $\mathscr{R}=\mathscr{E}$ iff $\sup_{x\in E}q(x)<\infty$.

Proof: The sufficiency follows from Theorem 1.5 (1). To prove the necessity, assume that $E\in\mathscr{R}$. Then for every $\varepsilon>0$, there exists an $a>0$ such that for all $s\le a$ and $x\in E$, we have $P(s,x,\{x\})>1-\varepsilon$. We now set
From CK-equation, it follows that
Or
I ) . < 1 - P ( a ,x,{.I)
(1 - &)[1- np(x,
On the other hand,
6 E.
(1.22)
Hence from (1.22), it follows that
$$(1-\varepsilon)\big(1-e^{-q(x)a}\big)\le 1-P(a,x,\{x\})\le\varepsilon.$$
Setting $\varepsilon=1/3$, we get $1/2\le e^{-q(x)a}$, and so $\sup_x q(x)<\infty$. $\square$

Even though $\mathscr{R}\ne\mathscr{E}$ in general, the limit $\lim_{t\to 0}P(t,x,A)/t$ may still exist for some set $A\notin\mathscr{R}$.
Theorem 1.13. If $(q(x),q(x,A))$ is a totally stable and conservative q-pair, then every q-process $P(t,x,A)$ satisfies
$$\lim_{t\to 0}P(t,x,A)/t=q(x,A),\qquad x\notin A\in\mathscr{E}.$$
Moreover, the convergence is uniform in $A$.
Proof: As we did in the proof of Lemma 1.7, take $\{V_n\}_1^\infty\subset\mathscr{R}$ such that $V_n\uparrow E$ and
$$q(x,A)=\lim_{n\to\infty}q(x,AV_n),\qquad x\notin A\in\mathscr{E}.$$
Let us first prove the following fact: for $x\notin V\in\mathscr{R}$,
$$\lim_{h\to 0}P(h,x,B)/h=q(x,B) \tag{1.23}$$
uniformly in $B\subset V$. Actually, for each $0<\varepsilon<1/3$, there exists $\delta(V+\{x\},\varepsilon)>0$ such that whenever $h<\delta(V+\{x\},\varepsilon)$,
Here, we have used (1.20). On the other hand,
Hence
Letting $h\to 0$, and then $\varepsilon\to 0$, we obtain (1.23).
Note that
Fix $n$. By (1.23), we see that the first term on the right-hand side tends to 0 and the remainder tends to
uniformly in $A$ as $h\to 0$. Then (1.24) goes to zero uniformly in $A$ as $n\to\infty$. $\square$

As we have mentioned before, in general, $\mathscr{R}\ne\mathscr{E}$. However, Theorem 1.13 shows that the sets $A$ for which $\lim_{t\to 0}P(t,x,A)/t$ exists may still range over the whole of $\mathscr{E}$. Does this result remain true for all totally stable q-processes? The answer is negative (cf. Remark 1.16 below). To discuss this problem, we need a general result, which is a complement to Lemma 1.6 and will be used a lot subsequently.
Theorem 1.14. Let $\{\mu_n\}_1^\infty$ be a sequence of finite measures on a measurable space $(E,\mathscr{E})$. Suppose that for each $A\in\mathscr{E}$, the limit $\lim_{n\to\infty}\mu_n(A)=:\mu(A)$ exists and is finite. Then

Proof: The first assertion is a consequence of the well-known Vitali-Hahn-Saks theorem (cf. Yosida (1978), pp. 70-72). We now prove (2). First, we have $\sup_{n\ge 1}\mu_n(E)=:C<\infty$. Next, given $f\in b\mathscr{E}$, there exists a sequence of simple functions $\{g_m\}$ such that

Next we prove (3). Since $f_n\to f$ and $\mu(E)<\infty$, by the Egorov theorem, for given $\varepsilon>0$, we can choose $A_\varepsilon\in\mathscr{E}$ such that $\mu(A_\varepsilon)<\varepsilon$ and $\lim_{n\to\infty}\sup_{x\notin A_\varepsilon}|f_n(x)-f(x)|=0$. Letting $n\to\infty$, then $\varepsilon\to 0$, by (2), it follows that
Finally we prove (4). For $\{f_n\}_1^\infty\subset\mathscr{E}_+$, we may use $f_n\wedge M$ $(0<M<\infty)$ instead of $f_n$ if necessary. Thus, we need only to consider the uniformly bounded case. Write
$$g_n:=\inf_{k\ge n}f_k\ \uparrow\ \varliminf_{n\to\infty}f_n=:g.$$
Thus, (3) implies that
Theorem 1.15. (1) Fix $x\in E$. If $q(x)<\infty$ and for every $A$, $x\notin A\in\mathscr{E}$, the limit $\lim_{t\to 0}P(t,x,A)/t$ exists (in particular, if $(q(x),q(x,A))$ is a totally stable and conservative q-pair), then for each $A$, $P(t,x,A)$ is continuously differentiable in $t$. Moreover, the backward Kolmogorov equation holds:
$$P'(t,x,A)=\int q(x,dy)\,P(t,y,A)-q(x)P(t,x,A). \tag{1.25}$$
Equivalently,
$$P(\lambda,x,A)=\int\frac{q(x,dy)}{\lambda+q(x)}\,P(\lambda,y,A)+\frac{\delta(x,A)}{\lambda+q(x)}, \tag{1.26}$$
where $P(\lambda,x,A)$ $(\lambda>0,\ x\in E,\ A\in\mathscr{E})$ is the Laplace transform of $P(t,x,A)$ $(t\ge 0,\ x\in E,\ A\in\mathscr{E})$:
$$P(\lambda,x,A)=\int_0^\infty e^{-\lambda t}P(t,x,A)\,dt.$$
(2) If an honest totally stable q-process satisfies Eq. (1.25), then its q-pair must be conservative.
(3) If Eq. (1.26) holds for all $\lambda$, $x$ and $A$, then we have
$$\lim_{\lambda\to\infty}\lambda P(\lambda,x,A)=I_A(x),$$
$$\lim_{\lambda\to\infty}\lambda[\lambda P(\lambda,x,A)-\delta(x,A)]=q(x,A)-q(x)I_A(x),\qquad x\in E,\ A\in\mathscr{E}.$$

Before proving this theorem, we make a remark about our notation. For a function $f(t)$, its Laplace transform is denoted by $f(\lambda)$. Here, we use the same $f$ but change the variable. For the former variable, we use Roman letters $t,s,\ldots$; for the latter, we use Greek letters $\lambda,\mu,\ldots$ This convention, although it may cause some confusion, is chosen to economize on notation.

Proof of Theorem 1.15: By the CK-equation, for $h>0$, we have
$$\frac{P(t+h,x,A)-P(t,x,A)}{h}=\int_{E\setminus\{x\}}\frac{P(h,x,dy)}{h}\,P(t,y,A)-\frac{1-P(h,x,\{x\})}{h}\,P(t,x,A).$$
Letting $h\to 0$, by Theorem 1.14, we obtain the right derivative:
$$P'_+(t,x,A)=\int q(x,dy)\,P(t,y,A)-q(x)P(t,x,A).$$
By the dominated convergence theorem, we see that the right-hand side is continuous in $t$, and so is the right derivative. This is enough to claim that $P(t,x,A)$ has a continuous derivative and that (1.25) holds (cf. Yosida (1978), p. 239). Taking Laplace transforms, (1.26) follows from (1.25). On the other hand, since both sides of (1.25) are continuous in $t$, by the uniqueness theorem for the Laplace transform (cf. Section 1.5, Theorem 1.38), (1.26) implies (1.25). Obviously, the second assertion follows from Eq. (1.25) by setting $A=E$. To prove assertion (3), note that for the Laplace transform of a jump process $P(t,x,A)$, we have $\lambda P(\lambda,x,E)\le 1$ for all $\lambda>0$ and $x\in E$. Thus
$$\lim_{\lambda\to\infty}P(\lambda,x,E)=0,\qquad x\in E.$$
Now, multiplying both sides of (1.26) by $\lambda$ and letting $\lambda\uparrow\infty$, we obtain the first equality in (3). The second one then follows by the same procedure, multiplying by $\lambda^2$ instead of $\lambda$. $\square$
R e m a r k 1.16. Let E be countable. Take q22. . - -2i+l
7
qij
= 0, i
# j,
i,j
E
E.
Then Q = ( q i j ) is a totally stable Q-matrix. Every state is non-conservative. Define
where $\delta_{ij}=0$ if $i\ne j$ and $\delta_{ii}=1$. It is easy to check that $(P_{ij}(\lambda): i,j\in E)$ is the Laplace transform of an honest Q-process (cf. Theorem 1.29 in Section 1.4). However, it does not satisfy Eq. (1.26). Combining this with Theorem 1.15, we have answered the question mentioned before Theorem 1.14.

1.3 Differentiability

From the last section, we know that jump processes are differentiable at the origin, and even at every point $t\ge 0$ under some conditions. This section is devoted to proving that the last property indeed holds for all totally stable jump processes.

Lemma 1.17. Let $p\in\mathscr{B}([0,\infty))$ and $q\in(0,\infty)$. Suppose that
(1) $0\le p(t)\le 1$, $t\ge 0$.
(2) $p(s+t)\ge e^{-qs}p(t)$, $s,t\ge 0$.
(3) $1-p(s+t)\ge e^{-qs}(1-p(t))$, $s,t\ge 0$.
Then there exists $r\in\mathscr{B}([0,\infty))$, $0\le r(t)\le 1$, such that
$$p(t)=q\int_0^t e^{-q(t-s)}r(s)\,ds+p(0)e^{-qt},\qquad t\ge 0. \tag{1.27}$$
Proof: By using (2) and (3), respectively, we have
$$p(s+t)-p(t)\ge(e^{-qs}-1)p(t)\ge e^{-qs}-1,$$
$$p(s+t)-p(t)\le 1-e^{-qs}.$$
Hence
$$|p(s+t)-p(t)|\le 1-e^{-qs},\qquad s,t\ge 0.$$
This shows that $p(t)$ is absolutely continuous. By using (2) and (3) again, we obtain
It follows that both $e^{qt}p(t)$ and $e^{qt}(1-p(t))$ are non-decreasing in $t$. Thus, if we set
$$r(t)=p(t)+p'(t)/q,$$
then $r(t)\ge 0$, a.e. But
hence we also have $r(t)\le 1$, a.e. Note that (1.27) follows from the definition of $r(t)$. It is now easy to modify this $r(t)$ so that it has the desired properties. $\square$
Lemma 1.18. If $q(x)=0$, then $P(t,x,A)=I_A(x)$. If $0<q(x)<\infty$, then there exists an $r(\cdot,x,A)\in\mathscr{B}([0,\infty))$ such that $r(t,\cdot,A)\in\mathscr{E}$ and
(1) $P(t,x,A)=q(x)\int_0^t e^{-q(x)(t-s)}r(s,x,A)\,ds+e^{-q(x)t}I_A(x)$.
(2) For $\{A_n\}_1^\infty\subset\mathscr{E}$, mutually disjoint, $r\big(t,x,\sum_n A_n\big)=\sum_n r(t,x,A_n)$, a.e. $t$.
(3) $0\le r(t,x,A)\le 1$. If $P(t,x,E)=1$ for all $t\ge 0$, then $r(t,x,E)=1$, a.e. $t$.
Proof: The first assertion is trivial. We now prove the second one. If $P(t,x,A)$ is non-honest, then we may enlarge the state space, as we did in the proof of Theorem 1.3, to obtain an honest jump process. Once we have proved the desired properties for the new process, the properties for the original one follow immediately. Assume that $P(t,x,A)$ is honest. By the CK-equation and (1.9), we have
$$P(s+t,x,A)\ge e^{-q(x)s}P(t,x,A).$$
Using $A^c$ instead of $A$, we have
$$1-P(s+t,x,A)\ge e^{-q(x)s}\,[1-P(t,x,A)].$$
Combining these with Lemma 1.17, we get a function $r(\cdot,x,A)\in\mathscr{B}([0,\infty))$ such that (1) holds and $0\le r(t,x,A)\le 1$. Substituting $A=E$ into (1), we get
Hence, (3) follows. Clearly, (2) follows from (1). $\square$
We may omit $x$ for a while since it is fixed. In the above discussion, we have constructed a function $r$ satisfying
$$\text{for each }A\in\mathscr{E}:\ r(\cdot,A)\in\mathscr{B}([0,\infty));\quad 0\le r(t,A)\le 1;\quad r(t,E)=1,\ \text{a.e. }t;\quad r\Big(t,\sum_j A_j\Big)=\sum_j r(t,A_j),\ \text{a.e. }t. \tag{1.28}$$
The question is: when do we have a probability kernel $R(t,A)$ such that
$$R(t,A)=r(t,A)\quad\text{for a.e. }t\text{ and all }A\in\mathscr{E}? \tag{1.29}$$
Here, the (probability) kernel or transition measure (probability) $R(t,A)$ defined on $(X\times E,\ \mathscr{B}\times\mathscr{E})$ means that $R(\cdot,A)\in\mathscr{B}$ for each $A\in\mathscr{E}$ and $R(t,\cdot)$ is a (probability) measure on $\mathscr{E}$ for each $t\in X$. In the totally stable and conservative case, comparing Lemma 1.18 (1) with the backward Kolmogorov equation
(cf. (1.25)), we see that $\int q(x,dy)P(t,y,A)/q(x)$ is, in the sense of (1.29), a version of $r(t,x,A)$. Since it is more meaningful to restrict the state space rather than the processes in order that differentiability holds for all jump processes, we are led to the general problem (1.29). To answer the above question, we need some preparation.
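To make the representation in Lemma 1.18 (1) concrete, one may test it on a chain where everything is explicit. The sketch below is an illustration under our own assumptions, not a construction from the text: it uses the conservative two-state chain on $\{0,1\}$ with rates $a$ (from 0 to 1) and $b$ (from 1 to 0), takes $x=0$, $A=\{0\}$, $q(0)=a$, and uses the version $r(t,0,\{0\})=\int q(0,dy)P(t,y,\{0\})/q(0)=P_{10}(t)$ suggested by the comparison with the backward equation. The function names and the Simpson quadrature are hypothetical choices.

```python
import math

def P00(t, a, b):
    """P(t, 0, {0}) for the two-state chain with rates a (0->1), b (1->0)."""
    return (b + a * math.exp(-(a + b) * t)) / (a + b)

def P10(t, a, b):
    """P(t, 1, {0}); for this chain it serves as r(t, 0, {0})."""
    return b * (1.0 - math.exp(-(a + b) * t)) / (a + b)

def first_jump_rhs(t, a, b, n=2000):
    """Right-hand side of Lemma 1.18 (1) for x = 0, A = {0}:
    q(0) * int_0^t e^{-q(0)(t-s)} r(s) ds + e^{-q(0) t},
    with q(0) = a, evaluated by the composite Simpson rule (n even)."""
    h = t / n
    f = lambda s: math.exp(-a * (t - s)) * P10(s, a, b)
    total = f(0.0) + f(t)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return a * total * h / 3.0 + math.exp(-a * t)
```

For instance, `first_jump_rhs(1.0, 2.0, 3.0)` and `P00(1.0, 2.0, 3.0)` agree up to quadrature error, and `P10` takes values in $[0,1]$, as required of $r$ in Lemma 1.18 (3).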
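Lemma 1.19. Let $(X,\mathscr{B})$ and $(E,\mathscr{E})$ be two arbitrary measurable spaces. Set
$$\mathscr{D}=\{B\times A:\ B\in\mathscr{B},\ A\in\mathscr{E}\}.$$
Given $r(B,A)$ $(B\in\mathscr{B},\ A\in\mathscr{E})$, which is $\sigma$-additive in each variable and satisfies $r(X,E)=1$, then $P(B\times A):=r(B,A)$ is finitely additive on $\mathscr{D}$ and $P(X\times E)=1$.

Proof: Clearly, we need only to prove the finite additivity. That is,
$$P\Big(\sum_{k=1}^n B_k\times A_k\Big)=\sum_{k=1}^n P(B_k\times A_k)\qquad\text{whenever }\sum_{k=1}^n B_k\times A_k\in\mathscr{D}.$$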
Here and in what follows, the sum $A+B$ means the disjoint union of $A$ and $B$. To prove the assertion, we use induction on $n$. It is trivial when $n=1$. Now, consider the case that $n=2$. That is,
$$B\times A=B_1\times A_1+B_2\times A_2. \tag{1.30}$$
Assume that $A_1,A_2,B_1,B_2\ne\emptyset$; otherwise it reduces to the case that $n=1$. Without loss of generality, assume that $B_1\cap B_2=\emptyset$. If $A_0:=A_1\cap A_2\ne\emptyset$, then we can rewrite
From this, it follows that $B=B_1+B_2$. Besides, $A_1\setminus A_0,\ A_2\setminus A_0\subset A$. Taking the intersection with $(B_1+B_2)\times((A_1\setminus A_0)+(A_2\setminus A_0))$ on both sides of (1.30), we see that
Or
$$B_1\times(A_2\setminus A_0)+B_2\times(A_1\setminus A_0)=\emptyset.$$
That is, $A_1\setminus A_0=A_2\setminus A_0=\emptyset$, or equivalently, $A_1=A_2=A_0$. Hence $B\times A=(B_1+B_2)\times A_0$ and so
Thus we have proved the additivity when $A_0\ne\emptyset$. As for the case that $A_0=\emptyset$, the same proof shows that $A_1=A_2=\emptyset$, which contradicts our assumption. This completes the proof for the case that $n=2$. Assume that the additivity holds for all $n\le m-1$ and consider the case that $n=m$. Let
$$B\times A=\sum_{k=1}^m B_k\times A_k.$$
Because $(B_1\times A_1)\cap(B_2\times A_2)=\emptyset$, we may assume that $B_1\cap B_2=\emptyset$. Then
each of the right-hand sides consists of at most $m-1$ non-empty terms. Hence, we can apply the inductive hypothesis and our conclusion for $n=2$ to get
Now, we return to our original problem. Clearly, we need only to consider the case that $X=[0,1]$ with $\mathscr{B}=\mathscr{B}([0,1])$. But we still allow $(E,\mathscr{E})$ to be arbitrary.

Lemma 1.20. Set $\mathscr{G}=\{B\times E:\ B\in\mathscr{B}\}$. Define
$$P(B\times A)=\int_B r(t,A)\,dt,\qquad B\in\mathscr{B},\ A\in\mathscr{E}.$$
Then the desired probability kernel $R$ exists iff the following conditions hold.
(1) $P$ is sub $\sigma$-additive on $\mathscr{D}$.
(2) $P(\cdot\,|\,\mathscr{G})$ is a regular conditional probability (of course, this means that a version of $P(\cdot\,|\,\mathscr{G})$ is a probability kernel on $(X\times E,\ \mathscr{B}\times\mathscr{E})$).
Proof: Note that if condition (1) holds, then $P$ should be $\sigma$-additive on $\mathscr{D}$ since $\mathscr{D}$ is a semi-algebra. Hence $P$ can be extended to $\mathscr{B}\times\mathscr{E}$ and so condition (2) is meaningful.

a) Necessity. Suppose that $R$ exists. First, for $D\in\mathscr{B}\times\mathscr{E}$, let $D(t)$ denote the section of $D$ at $t$. We prove that as a function of $t$, $R(t,D(t))$ is $\mathscr{B}$-measurable. Actually, if $D\in\mathscr{D}$, denoted by $D=B\times A$, then
$$R(t,D(t))=I_B(t)R(t,A)$$
is certainly $\mathscr{B}$-measurable. The general case follows by the monotone class theorem. Next, define
By the Fubini theorem, this is equivalent to
Then $\widetilde{P}$ is a probability measure on $\mathscr{B}\times\mathscr{E}$ and coincides with $P$ on $\mathscr{D}$. Thus, (1) follows and $\widetilde{P}=P$. Furthermore, by
and
it follows that $R(t,\cdot(t))$ is a regular version of $P(\cdot\,|\,\mathscr{G})(t)$. So condition (2) is satisfied.

b) Sufficiency. Suppose that (1) and (2) are satisfied. Choose a regular version $\widetilde{R}(t,B\times A)$ of $P(\cdot\,|\,\mathscr{G})$. Set
$$R(t,A)=\widetilde{R}(t,[0,1]\times A).$$
Then $R(t,A)$ is a probability kernel. Moreover
$$=P(B\times A)=\int_B r(t,A)\,dt,\qquad B\in\mathscr{B}.$$
Therefore $R(t,A)=r(t,A)$ for a.e. $t$ and all $A\in\mathscr{E}$. $\square$

We now return to our main setup: $(E,\mathscr{E})$ is a Polish space. The next result is the only one in this section for which we need the "Polish" restriction on the state space (see also Section 1.5).
Corollary 1.21. The desired probability kernel $R(t,A)$ exists.

Proof: Since $[0,1]\times E$ is a Polish space, condition (2) is satisfied (cf. Corollary 1.39 for further discussion). Next, let
$$\mathscr{C}=\Big\{\sum_{k=1}^n B_k\times A_k:\ B_k\in\mathscr{B},\ A_k\in\mathscr{E},\ n\ge 1\Big\}.$$
Then $\mathscr{C}$ is an algebra which generates $\mathscr{B}\times\mathscr{E}$. Obviously, $P$ is finitely additive on $\mathscr{C}$. We now prove that $P$ is sub $\sigma$-additive on $\mathscr{C}$. For this, let $D_n\in\mathscr{C}$, $D=\sum_{n=1}^\infty D_n\in\mathscr{C}$. For each $B\times A\in\mathscr{C}$, we have
Thus, we can first choose a compact subset $K_1\subset B$ and then choose a compact subset $K_2\subset A$; hence we can choose a compact $K_1\times K_2\subset B\times A$ so that $P(K_1\times K_2)$ is as close to $P(B\times A)$ as we want. Furthermore, we can do so for every $D\in\mathscr{C}$. Now, given $\varepsilon>0$, choose a compact $K\subset D$ such that $P(K)\le P(D)\le P(K)+\varepsilon/2$. Next, for each $n$, choose a compact $K_n\subset D_n^c$ so that $P(K_n)\le P(D_n^c)\le P(K_n)+\varepsilon/2^{n+1}$. Because $\mathscr{C}$ is an algebra and $P$ is finitely additive on $\mathscr{C}$, we see that
$$P(K_n^c)\ge P(D_n)\ge P(K_n^c)-\varepsilon/2^{n+1}.$$
On the other hand, $K_n^c\supset D_n$, so we have
$$\bigcup_{n=1}^\infty K_n^c\supset\bigcup_{n=1}^\infty D_n=D\supset K.$$
Hence there is a finite covering: $\bigcup_{n=1}^N K_n^c\supset K$. Finally
Since $\varepsilon$ is arbitrary, we have proved the required assertion. $\square$

The next step is to modify further the version obtained above so that it is also continuous in $t$.

Proposition 1.22. Let $x$ satisfy $0<q(x)<\infty$. Then there exists uniquely a kernel $R(t,x,A)$ $(t>0,\ A\in\mathscr{E})$ such that $R(t,\cdot,A)\in\mathscr{E}$, continuous in $t$ for fixed $x$ and $A$, and having the properties
(1) $P(t,x,A)=q(x)\int_0^t e^{-q(x)(t-s)}R(s,x,A)\,ds+e^{-q(x)t}I_A(x)$, $t\ge 0$, $A\in\mathscr{E}$.
(2) $R(t,x,E)\le 1$ and $R(\cdot,x,E)=1$ if $P(\cdot,x,E)=1$.
(3) $R(s+t,x,A)=\int R(s,x,dy)\,P(t,y,A)$, $s>0$, $t\ge 0$, $A\in\mathscr{E}$.

Proof: a) By Lemma 1.18 and Corollary 1.21, there exists a kernel $r_1(t,x,A)$ such that
for all $t\ge 0$ and $A\in\mathscr{E}$. Applying the monotone class theorem, we obtain
for all $f\in b\mathscr{E}$. In particular, taking $f=P(t',\cdot,A)$ and applying the CK-equation, we get
$$P(t+t',x,A)=\int_0^t q(x)e^{-q(x)(t-s)}\int r_1(s,x,dy)\,P(t',y,A)\,ds+e^{-q(x)t}P(t',x,A).$$
Starting from this and using (1.31) twice, we obtain
Hence
Thus, for each $t'$, there is a null set $N_{t'}=N_{t'}(A)$, such that
$$r_1(t+t',x,A)=\int r_1(t,x,dy)\,P(t',y,A),\qquad t\notin N_{t'}. \tag{1.32}$$
b) We now start from (1.32) to construct the required version $R(t,x,A)$ of $r_1(t,x,A)$. First, we prove that both sides of (1.32) are jointly measurable in $(t,t')$. Actually, for fixed $t'$, the right-hand side is measurable in $t$; and for fixed $t$, it is continuous in $t'$. Hence it is jointly measurable. The same conclusion holds for the left-hand side. Thus the set
$$S=\Big\{(t,t'):\ t,t'>0,\ r_1(t+t',x,A)\ne\int r_1(t,x,dy)\,P(t',y,A)\Big\}$$
is jointly measurable. By the Fubini theorem and (1.32), it follows that
$$\iint_S dt\,dt'=0.$$
Let $u=t$, $v=t+t'$; then $\iint_M du\,dv=0$, where
$$M=\Big\{(u,v):\ 0<u<v,\ r_1(v,x,A)\ne\int r_1(u,x,dy)\,P(v-u,y,A)\Big\}.$$
By using the Fubini theorem again, there is a null set $H$, and then there is a null set $H_u$ for every $u\notin H$, such that
$$r_1(v,x,A)=\int r_1(u,x,dy)\,P(v-u,y,A),\qquad u\notin H,\ u<v\notin H_u. \tag{1.33}$$
Next, we construct $R(v,x,A)$ as follows. Fix $v>0$. Choose an arbitrary $u'$ such that $0<u'<v$ and $u'\notin H$, and set
$$R(v,x,A)=\int r_1(u',x,dy)\,P(v-u',y,A). \tag{1.34}$$
To justify this definition, we need to show the independence of the choice of $u'$. To do so, let $v_0>0$, $0<u',u''<v_0$, $u',u''\notin H$ be given. Then, for $v\notin H_{u'}\cup H_{u''}$ and $v>u'\vee u''$, by (1.33), both of
$$\int r_1(u',x,dy)\,P(v-u',y,A)\quad\text{and}\quad\int r_1(u'',x,dy)\,P(v-u'',y,A)$$
are equal to $r_1(v,x,A)$. In particular, this conclusion holds for $v=v_0$ by the continuity of $P(\cdot,y,A)$. Therefore, choosing $u'$ or $u''$, we define the same $R(v_0,x,A)$.

c) Finally, we prove that the kernel $R(t,x,A)$ $(t>0,\ A\in\mathscr{E})$ constructed above satisfies the desired conditions. By (1.34), $R(t,x,\cdot)\in\mathscr{L}_+$ for each $t$. Note that $P(t,x,E)$ is non-increasing. Thus, (2) follows from Lemma 1.18, Corollary 1.21 and (1.34). Next, we prove the continuity. Given $t_0>0$, choose an arbitrary $u_0\notin H$ so that $0<u_0<t_0$. Then whenever $v>t_0$, we have
$$R(v,x,A)=\int r_1(u_0,x,dy)\,P(v-u_0,y,A),$$
which shows that $R(\cdot,x,A)$ is continuous in $(t_0,\infty)$. Since $t_0$ is arbitrary, $R(\cdot,x,A)$ is continuous in $(0,\infty)$. Now, we prove that for each $A\in\mathscr{E}$,
$$R(v,x,A)=r_1(v,x,A),\qquad\text{a.e. }v. \tag{1.35}$$
Actually, for the above $t_0$ and $u_0$, by (1.33) and (1.34), we see that (1.35) holds for all $v>t_0$, $v\notin H_{u_0}$. In other words, (1.35) holds for almost all $v>t_0$. Letting $t_0$ tend to zero along the rational numbers, we see that (1.35) holds for almost all $v>0$. Finally, (1) follows from (1.35) and (1.31). Furthermore, in parallel to the proof of (1.32) by using (1.31), from Proposition 1.22 (1), it follows that
$$R(t+t',x,A)=\int R(t,x,dy)\,P(t',y,A),\qquad t\notin N_{t'}. \tag{1.36}$$
Both sides are continuous in $t$, and so Eq. (1.36) indeed holds for all $t$. This proves (3). $\square$

So far, we have completed the proof of the existence of $\frac{d}{dt}P(t,x,A)$ $(t>0)$.
Theorem 1.23. Let $x$ satisfy $0<q(x)<\infty$. Then for each $A\in\mathscr{E}$, $P(t,x,A)$ $(t>0)$ has a continuous derivative $P'(t,x,A)$, which is $\sigma$-additive on $\mathscr{E}$ and has total variation bounded from above by $2q(x)$. Moreover,
$$P'(t,x,A)=q(x)\big(R(t,x,A)-P(t,x,A)\big),\qquad t>0,\ A\in\mathscr{E}, \tag{1.37}$$
$$P'(t+s,x,A)=\int P'(s,x,dy)\,P(t,y,A),\qquad s>0,\ t\ge 0,\ A\in\mathscr{E}. \tag{1.38}$$
Finally, if $q(x)=0$, then $P'(t,x,A)=0$ for all $t\ge 0$ and $A\in\mathscr{E}$.
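The identities (1.37) and (1.38) and the total variation bound can be verified directly in an explicit example. The following sketch is only an illustration under our own assumptions: the conservative two-state chain on $\{0,1\}$ with rates $a$ and $b$, for which $P(t)$ and $P'(t)$ are known in closed form, and for which the kernel $R(t,x,\cdot)$ reduces to the row of $P(t)$ belonging to the other state. All function names are hypothetical.

```python
import math

def P(t, a, b):
    """Closed-form transition matrix of the two-state chain."""
    s, e = a + b, math.exp(-(a + b) * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def dP(t, a, b):
    """Entrywise derivative P'(t), obtained from the closed form above."""
    e = math.exp(-(a + b) * t)
    return [[-a * e, a * e],
            [b * e, -b * e]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def R(t, a, b):
    """R(t, x, .) = sum_y q(x, {y}) P(t, y, .) / q(x); with two states
    and a conservative q-pair this is the row of the other state."""
    p = P(t, a, b)
    return [p[1], p[0]]

def residual_137(t, a, b):
    """Largest entry of |P'(t) - q(x) (R(t) - P(t))|, cf. (1.37)."""
    q = [a, b]
    d, r, p = dP(t, a, b), R(t, a, b), P(t, a, b)
    return max(abs(d[x][j] - q[x] * (r[x][j] - p[x][j]))
               for x in range(2) for j in range(2))

def residual_138(s, t, a, b):
    """Largest entry of |P'(t+s) - P'(s) P(t)|, cf. (1.38)."""
    lhs, rhs = dP(t + s, a, b), matmul(dP(s, a, b), P(t, a, b))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

In this example the total variation of $P'(t,0,\cdot)$ equals $2ae^{-(a+b)t}$, which stays below the bound $2q(0)=2a$.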
Proof: By Proposition 1.22 (1) and the continuity of $R(t,x,A)$, it follows that $P(t,x,A)$ is differentiable and (1.37) holds. Then the properties of $P'(t,x,A)$ can be read off from (1.37). Finally, (1.38) follows from Proposition 1.22 (3), (1.37) and the CK-equation. The case that $q(x)=0$ is trivial. $\square$

Corollary 1.24. We have
(1) $\lim_{t\to 0}R(t,x,\{x\})=0$.
(2) $\lim_{t\to 0}P'(t,x,\{x\})=-q(x)$.

Proof: By Proposition 1.22 (3), we have $R(s+t,x,\{x\})\ge R(s,x,\{x\})\,P(t,x,\{x\})$. So $R(t,x,\{x\})\ge P(t,x,\{x\})\,\varlimsup_{s\to 0}R(s,x,\{x\})$. Hence
$$\varliminf_{t\to 0}R(t,x,\{x\})\ge\varlimsup_{s\to 0}R(s,x,\{x\}).$$
This shows that $\lim_{t\to 0}R(t,x,\{x\})$ exists. Setting $A=\{x\}$ in Proposition 1.22 (1), we get
$$P(t,x,\{x\})=q(x)\int_0^t e^{-q(x)(t-s)}R(s,x,\{x\})\,ds+e^{-q(x)t}.$$
Then, part (1) follows by letting $t\to 0$. From this and (1.37), we obtain part (2). $\square$

The next result is a complement to the previous one.
Corollary 1.25. Let $x\notin A\in\mathscr{R}$. Then
(1) $\lim_{t\to 0}R(t,x,A)=q(x,A)/q(x)$.
(2) $\lim_{t\to 0}P'(t,x,A)=q(x,A)$.

Proof: Given $\varepsilon>0$, there is a $\delta>0$ such that
From Proposition 1.22 (3), it follows that
$$R(s+t,x,A)\ge\int_A R(s,x,dy)\,P(t,y,A)\ge(1-\varepsilon)R(s,x,A).$$
Now, the remainder of the proof is similar to the previous one. $\square$

Corollary 1.26. If $c:=\sup_{x\in A}q(x)<\infty$, then $R(t,x,A)$ and $P'(t,x,A)$ are functions of bounded variation in $t$.

Proof: By Proposition 1.22 (3), the CK-equation and (1.9), it follows that
Thus $e^{ct}R(t,x,A)$ and $e^{ct}P(t,x,A)$ are non-decreasing. Hence $R(t,x,A)$ and $P(t,x,A)$ are of bounded variation in $t$ and so is $P'(t,x,A)$ by (1.37). $\square$

Theorem 1.27. Let $(q(x),q(x,A))$ be a totally stable q-pair. Then for every $A$ with $\sup_{x\in A}q(x)<\infty$, we have
$$P'(s+t,x,A)=\int P(s,x,dy)\,P'(t,y,A),\qquad t>0,\ s\ge 0. \tag{1.39}$$

Proof: a) The proof of this part is similar to that of Proposition 1.22. Set $c:=\sup_{x\in A}q(x)$. We have seen from the last proof that $e^{ct}P(t,x,A)$ is non-decreasing in $t$. Hence, by Theorem 1.23 and Corollary 1.25 (2), we have
$$0\le P'(t,x,A)+cP(t,x,A)=:U(t,x,A),\qquad t\ge 0.$$
On the other hand, by Theorem 1.23, we have $|P'(t,x,A)|\le 2q(x)$, and so
$$U(t,x,A)\le 2q(x)+c. \tag{1.40}$$
Thus, by the definition of $U(t,x,A)$, it follows that
Applying (1.41) once again, we obtain
Therefore
$$\int_0^t e^{cs}\int P(t',x,dy)\,U(s,y,A)\,ds=\int_0^t e^{cs}\,U(s+t',x,A)\,ds.$$
This shows that for each $t'\ge 0$, there is a null set $N_{t'}$ such that
$$U(t+t',x,A)=\int P(t',x,dy)\,U(t,y,A),\qquad t\notin N_{t'}. \tag{1.42}$$
In particular
$$U(t+t',x,A)\ge\int_{E_n}P(t',x,dy)\,U(t,y,A),\qquad t\notin N_{t'}, \tag{1.43}$$
where $E_n=\{y: q(y)\le n\}$. Because $U(t,y,A)\le 2q(y)+c\le 2n+c$ on $E_n$ (by (1.40)) and $U(t,y,A)$ is continuous in $t$, the right-hand side of (1.43) is also continuous in $t$. Of course, the same conclusion holds for the left-hand side of (1.43). Thus, (1.43) actually holds for all $t$ and $t'$. Letting $n\to\infty$, it follows that
$$U(t+t',x,A)\ge\int P(t',x,dy)\,U(t,y,A)\qquad\text{for all }t,\ t'\text{ and }x. \tag{1.44}$$
Once we prove that (1.44) is indeed an equality, then by replacing $U(t,x,A)$ with its original form $P'(t,x,A)+cP(t,x,A)$, we would obtain (1.39).

b) We now prove that the converse inequality of (1.44) holds for all $t,t'>0$. Actually, the right-hand side of (1.43) is jointly continuous in $(t,t')$, hence is jointly measurable, and so is its limit, the right-hand side of (1.42). The same conclusion holds for the left-hand side of (1.42). Thus, by the Fubini theorem and (1.42), there is a null set $H$ of $t>0$, and then there is a null set $H_t$ for each $t\notin H$, such that
$$U(t+t',x,A)=\int P(t',x,dy)\,U(t,y,A),\qquad 0<t\notin H,\ t'\notin H_t. \tag{1.45}$$
Now, we prove that $H_t=\emptyset$ for each $0<t\notin H$. Suppose that there are a $t_0$: $0<t_0\notin H$ and $t_0'\in H_{t_0}$ so that (1.45) does not hold. Then, by (1.44), we should have
$$U(t_0+t_0',x,A)>\int P(t_0',x,dy)\,U(t_0,y,A).$$
Hence, whenever $t'>t_0'$, we have
Since $P(t'-t_0',x,\{x\})\ge e^{-q(x)(t'-t_0')}>0$, the above two facts and (1.44) give us
$$\int P(t',x,dy)\,U(t_0,y,A)=\int P(t'-t_0',x,dz)\int P(t_0',z,dy)\,U(t_0,y,A)<\int P(t'-t_0',x,dz)\,U(t_0+t_0',z,A)\le U(t_0+t',x,A).$$
This shows that for all $t'>t_0'$,
This clearly contradicts (1.45). So for all $t$: $0<t\notin H$, $H_t=\emptyset$. In other words, we have proved that
$$U(t+t',x,A)=\int P(t',x,dy)\,U(t,y,A),\qquad 0<t\notin H,\ t'>0. \tag{1.46}$$
Finally, for given $t,t'>0$, choosing $t_1\notin H$: $0<t_1<t$, by (1.46) and (1.44), we obtain
$$U(t+t',x,A)=U(t_1+(t+t'-t_1),x,A)=\int P(t+t'-t_1,x,dy)\,U(t_1,y,A)$$
$$=\int P(t',x,dz)\int P(t-t_1,z,dy)\,U(t_1,y,A)\le\int P(t',x,dz)\,U(t,z,A),$$
which is the required converse inequality of (1.44). $\square$

1.4 Laplace Transforms
In this section, we prove the one-to-one correspondence between a jump process and its Laplace transform. Recall that for a jump process $P(t,x,A)$ $(t\ge 0,\ x\in E,\ A\in\mathscr{E})$, its Laplace transform $P(\lambda,x,A)$ $(\lambda>0,\ x\in E,\ A\in\mathscr{E})$ is defined by
$$P(\lambda,x,A)=\int_0^\infty e^{-\lambda t}P(t,x,A)\,dt.$$
Lemma 1.28. Let $P(\lambda,x,A)$ be the Laplace transform of a jump process $P(t,x,A)$. Then the following properties hold.
(1) For each $\lambda>0$ and $A\in\mathscr{E}$, $P(\lambda,\cdot,A)\in\mathscr{E}_+$. For each $\lambda>0$ and $x\in E$, $P(\lambda,x,\cdot)\in\mathscr{L}_+$.
(2) Normal condition. For each $\lambda>0$, $x\in E$ and $A\in\mathscr{E}$,
$$0\le\lambda P(\lambda,x,A)\le 1.$$
(3) Resolvent equation. For each $\lambda,\mu>0$, $x\in E$ and $A\in\mathscr{E}$,
$$P(\lambda,x,A)-P(\mu,x,A)+(\lambda-\mu)\int P(\lambda,x,dy)\,P(\mu,y,A)=0.$$
(4) Continuous condition or jump condition. For each $x\in E$ and $A\in\mathscr{E}$, $\lim_{\lambda\to\infty}\lambda P(\lambda,x,A)=\delta(x,A)$.
Let $(q(x),q(x,A))$ $(x\in E,\ A\in\mathscr{R})$ be determined by Theorem 1.4 and Theorem 1.5. Then $P(\lambda,x,A)$ also satisfies
(5) q-condition. For each $x\in E$ and $A\in\mathscr{R}$,
$$\lim_{\lambda\to\infty}\lambda[\lambda P(\lambda,x,A)-\delta(x,A)]=q(x,A)-q(x)\delta(x,A).$$
Finally, if $P(t,x,A)$ is honest, then for each $\lambda>0$ and $x\in E$, $\lambda P(\lambda,x,E)=1$.
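The conditions of Lemma 1.28 can be inspected numerically for a finite chain, where the Laplace transform is the matrix $(\lambda I-Q)^{-1}$. The sketch below is an illustration under our own assumptions: the conservative two-state chain with rates $a$ and $b$, for which the inverse is written out by hand. The names `resolvent`, `resolvent_residual` and `q_condition` are hypothetical.

```python
def resolvent(lam, a, b):
    """P(lam) = (lam*I - Q)^{-1} for Q = [[-a, a], [b, -b]],
    i.e. the Laplace transform of the two-state transition matrix."""
    det = lam * (lam + a + b)
    return [[(lam + b) / det, a / det],
            [b / det, (lam + a) / det]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def resolvent_residual(lam, mu, a, b):
    """Largest entry of |P(lam) - P(mu) + (lam - mu) P(lam) P(mu)|,
    which vanishes when the resolvent equation (3) holds."""
    Pl, Pm = resolvent(lam, a, b), resolvent(mu, a, b)
    prod = matmul(Pl, Pm)
    return max(abs(Pl[i][j] - Pm[i][j] + (lam - mu) * prod[i][j])
               for i in range(2) for j in range(2))

def q_condition(lam, a, b):
    """lam * (lam * P(lam) - I); by the q-condition (5) this tends to Q."""
    Pl = resolvent(lam, a, b)
    return [[lam * (lam * Pl[i][j] - (1.0 if i == j else 0.0))
             for j in range(2)] for i in range(2)]
```

With $a=2$, $b=3$, the rows of $\lambda P(\lambda)$ sum to one (the chain is honest), the resolvent residual is at rounding level, and `q_condition(1e6, 2.0, 3.0)` is close to $Q$ with an error of order $1/\lambda$.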
Proof: We check only the q-condition; the others can be checked similarly and are even easier. By assumption,
$$\lim_{t\to 0}\,[P(t,x,A)-\delta(x,A)]/t=q(x,A)-q(x)\delta(x,A),\qquad x\in E,\ A\in\mathscr{R}.$$
If $x\notin A\in\mathscr{R}$, then for given $\varepsilon>0$, there is a $\delta>0$ such that
$$|P(t,x,A)-tq(x,A)|<\varepsilon t,\qquad 0<t\le\delta.$$
Note that
$$\lambda^2\int_0^\infty e^{-\lambda t}\,t\,dt=1,\qquad\lambda>0.$$
It follows that
Letting $\lambda\to\infty$, then $\varepsilon\to 0$, we see that
$$\lim_{\lambda\to\infty}\lambda^2P(\lambda,x,A)=q(x,A),\qquad x\notin A\in\mathscr{R}.$$
Thus, we need only to prove that
$$\lim_{\lambda\to\infty}\lambda[\lambda P(\lambda,x,\{x\})-1]=-q(x),\qquad x\in E.$$
If $q(x)<\infty$, then the above proof still applies. Now, assume that $q(x)=\infty$. Then for each $M>0$, there is a $\delta>0$ such that
$$1-P(t,x,\{x\})\ge Mt,\qquad 0<t\le\delta.$$
So
$$\lambda-\lambda^2P(\lambda,x,\{x\})=\lambda^2\int_0^\infty e^{-\lambda t}[1-P(t,x,\{x\})]\,dt\ge\lambda^2\int_0^\delta e^{-\lambda t}Mt\,dt=M\int_0^{\lambda\delta}e^{-s}s\,ds,$$
and hence $\varliminf_{\lambda\to\infty}\lambda[1-\lambda P(\lambda,x,\{x\})]\ge M$. But $M$ is arbitrary, therefore $\lim_{\lambda\to\infty}\lambda[1-\lambda P(\lambda,x,\{x\})]=\infty$. $\square$

The main body of this section is devoted to proving the converse of the above lemma.
Theorem 1.29. A function $P(\lambda,x,A)$ $(\lambda>0,\ x\in E,\ A\in\mathscr{E})$ is the Laplace transform of a jump process $P(t,x,A)$ $(t\ge 0,\ x\in E,\ A\in\mathscr{E})$ iff conditions (1)-(4) of Lemma 1.28 hold. A jump process $P(t,x,A)$ is honest iff for all $\lambda>0$ and $x\in E$, $\lambda P(\lambda,x,E)=1$.

Proof: Clearly, we need only to prove that under conditions (1)-(4), $P(\lambda,x,A)$ is the Laplace transform of a jump process $P(t,x,A)$. Once this is done, the last assertion follows immediately. The idea of the proof goes as follows. Choose an appropriate Banach space. Construct a family of resolvent operators $\{P_\lambda:\lambda>0\}$ on the Banach space corresponding to $P(\lambda,x,A)$. Then apply the Hille-Yosida theorem to determine a strongly continuous semigroup. Finally, construct the required jump process in terms of the semigroup.

a) Choose a Banach space. In contrast to the usual choice $b\mathscr{E}$ with the uniform norm $\|\cdot\|_u$, we use $\mathscr{L}$, the set of signed measures with finite total variation. Define the linear operation
$$(c_1\varphi_1+c_2\varphi_2)(A)=c_1\varphi_1(A)+c_2\varphi_2(A),\qquad c_1,c_2\in\mathbb{R},\ \varphi_1,\varphi_2\in\mathscr{L},\ A\in\mathscr{E},$$
and the norm
$$\|\varphi\|=\sup\Big\{\sum_{i=1}^n|\varphi(A_i)|:\ \{A_i\}_1^n\subset\mathscr{E}\text{ mutually disjoint and }\sum_{i=1}^nA_i=E,\ n\ge 1\Big\},$$
which is the total variation of $\varphi$: $\|\varphi\|=\varphi^+(E)+\varphi^-(E)$. Then $(\mathscr{L},\|\cdot\|)$ is a Banach space.

b) Construct a family of linear operators $\{P_\lambda:\lambda>0\}$ on $\mathscr{L}$. Let $P(\lambda,x,A)$ $(\lambda>0,\ x\in E,\ A\in\mathscr{E})$ satisfy conditions (1)-(4) of Lemma 1.28, and let
$$(\varphi P_\lambda)(A)=\int\varphi(dx)\,P(\lambda,x,A),\qquad A\in\mathscr{E}.$$
The domain $\mathscr{D}(P_\lambda)$ of $P_\lambda$ is chosen to be $\mathscr{L}$. Then $P_\lambda$ is a linear operator from $\mathscr{L}$ to itself and is bounded: $\|P_\lambda\|\le 1/\lambda$. Next, we use two steps to show that $P_\lambda$ is a resolvent operator.

c) $P_\lambda$ is a one-to-one mapping. Take $\varphi\in\mathscr{L}$ and let $A_\lambda^\pm$ be the Jordan-Hahn decomposition of $\varphi-\lambda\varphi P_\lambda$. Then
$$\|\varphi-\lambda\varphi P_\lambda\|=(\varphi-\lambda\varphi P_\lambda)(A_\lambda^+)-(\varphi-\lambda\varphi P_\lambda)(A_\lambda^-)$$
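where $|\varphi|=\varphi^++\varphi^-$. Thus, for $x\in A_\lambda^\pm$, we have
$$0\le 1-\lambda P(\lambda,x,A_\lambda^\pm)=\delta(x,A_\lambda^\pm)-\lambda P(\lambda,x,A_\lambda^\pm)\le 1-\lambda P(\lambda,x,\{x\}).$$
And for $x\notin A_\lambda^\pm$, we have
$$0\ge-\lambda P(\lambda,x,A_\lambda^\pm)=\delta(x,A_\lambda^\pm)-\lambda P(\lambda,x,A_\lambda^\pm)\ge-\lambda P(\lambda,x,E\setminus\{x\})\ge\lambda P(\lambda,x,\{x\})-1.$$
Hence
$$|\delta(x,A_\lambda^\pm)-\lambda P(\lambda,x,A_\lambda^\pm)|\le 1-\lambda P(\lambda,x,\{x\}).$$
By condition (4) and the dominated convergence theorem, it follows that
$$\lim_{\lambda\to\infty}\|\varphi-\lambda\varphi P_\lambda\|=0. \tag{1.47}$$
Now, to prove that $P_{\lambda_0}$ is one-to-one, it suffices to show that $\varphi P_{\lambda_0}=0$ implies that $\varphi=0$. But by (3), $\varphi P_{\lambda_0}=0$ implies that $\varphi P_\lambda=0$ for all $\lambda>0$. In particular, from (1.47), we see that $\varphi=0$. This finishes the proof of assertion c).

Let $\mathscr{R}(P_\lambda)\subset\mathscr{L}$ denote the range of $P_\lambda$. Define an operator $\Omega_\lambda$ on $\mathscr{R}(P_\lambda)$ as follows: $\lambda I-\Omega_\lambda=P_\lambda^{-1}$. We now prove

d) $\Omega_\lambda$ is independent of $\lambda>0$, denoted by $\Omega$; its domain $\mathscr{D}(\Omega)$ is dense in $\mathscr{L}$. We first prove that $\mathscr{D}(\Omega_\lambda)$ is independent of $\lambda>0$. Given $\varphi\in\mathscr{L}$, by (3), we have $\varphi P_\mu=(\varphi+(\lambda-\mu)\varphi P_\mu)P_\lambda$. This shows that $\mathscr{R}(P_\mu)\subset\mathscr{R}(P_\lambda)$. Exchanging $\lambda$ and $\mu$, it follows that
$$\mathscr{R}(P_\lambda)=\mathscr{R}(P_\mu).$$
Next, we prove that for each $\varphi\in\mathscr{R}(P_\lambda)=\mathscr{R}(P_\mu)$, $\varphi\Omega_\lambda=\varphi\Omega_\mu$. Actually, for $\varphi_1,\varphi_2\in\mathscr{L}$ such that $\varphi=\varphi_1P_\lambda=\varphi_2P_\mu$, from (3), it follows that
But $P_\mu$ is one-to-one, so $\varphi_2-\varphi_1+(\lambda-\mu)\varphi_1P_\lambda=0$. Hence $\varphi P_\mu^{-1}-\varphi P_\lambda^{-1}+\varphi(\lambda-\mu)=0$. This gives us $\varphi\Omega_\lambda=\varphi\Omega_\mu$. Finally, we prove that $\mathscr{D}(\Omega):=\mathscr{D}(\Omega_\lambda)$ is dense in $\mathscr{L}$. Note that for each $\varphi\in\mathscr{L}$, $\lambda\varphi P_\lambda\in\mathscr{D}(\Omega)$. On the other hand, from c), we have seen that $\lim_{\lambda\to\infty}\|\varphi-\lambda\varphi P_\lambda\|=0$. This proves the denseness.

So far, we have proved that $\{P_\lambda:\lambda>0\}$ satisfies the hypotheses of the Hille-Yosida theorem. Hence, there exists uniquely a strongly continuous contraction semigroup $\{T_t: t\ge 0\}$ with resolvent operators $\{P_\lambda:\lambda>0\}$ such that $P_\lambda=\int_0^\infty e^{-\lambda t}T_t\,dt$, where the integral is in the Bochner sense. Take $\delta_x=\delta(x,\cdot)\in\mathscr{L}$ and set $P(t,x,A)=\delta_xT_t(A)$. We prove that

e) $P(t,x,A)$ is a jump process. By the representation theorem (cf. Yosida (1978), p. 248), we have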
$$\delta_xT_t(A)=\lim_{n\to\infty}e^{-tn}\sum_{m=0}^\infty\frac{1}{m!}\big[\delta_x(tn^2P_n)^m\big](A).$$
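The representation formula above can be tried out numerically when the resolvent is known explicitly. The following sketch is an illustration under our own assumptions: the two-state chain with rates $a$ and $b$, whose resolvent is $P_n=(nI-Q)^{-1}$ and whose exact transition matrix is available for comparison. Since $nP_n$ is a stochastic matrix in this example, the terms $e^{-tn}(tn^2P_n)^m/m!$ are Poisson weights times powers of $nP_n$, and the series is truncated once these weights are negligible. All function names and the truncation rule are hypothetical choices.

```python
import math

def resolvent(lam, a, b):
    """P_lam = (lam*I - Q)^{-1} for Q = [[-a, a], [b, -b]]."""
    det = lam * (lam + a + b)
    return [[(lam + b) / det, a / det],
            [b / det, (lam + a) / det]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def yosida_semigroup(t, n, a, b):
    """Truncated series e^{-tn} * sum_m (t n^2 P_n)^m / m!."""
    Pn = resolvent(n, a, b)
    S = [[n * Pn[i][j] for j in range(2)] for i in range(2)]  # n * P_n
    mean = t * n
    terms = int(mean + 10.0 * math.sqrt(mean) + 50)  # truncation index
    term = [[math.exp(-mean), 0.0], [0.0, math.exp(-mean)]]  # m = 0 term
    total = [row[:] for row in term]
    for m in range(1, terms + 1):
        # term_m = term_{m-1} * (t n^2 P_n) / m
        term = [[x * mean / m for x in row] for row in matmul(term, S)]
        total = [[total[i][j] + term[i][j] for j in range(2)]
                 for i in range(2)]
    return total

def exact_P(t, a, b):
    """Closed-form transition matrix of the two-state chain."""
    s, e = a + b, math.exp(-(a + b) * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def max_error(t, n, a, b):
    """Largest entrywise gap between the approximation and exact_P."""
    Y, E = yosida_semigroup(t, n, a, b), exact_P(t, a, b)
    return max(abs(Y[i][j] - E[i][j]) for i in range(2) for j in range(2))
```

In this example the error decays only like $1/n$, so the formula is better regarded as a theoretical representation than as a practical way of computing $P(t,x,A)$.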
Since $\delta_\cdot(A)$ and $(\delta_\cdot P_\lambda)(A) = P(\lambda, \cdot, A) \in \mathscr{E}_+$, we have $[\delta_\cdot(tn^2P_n)](A) \in \mathscr{E}_+$. By induction, for each $m \ge 1$, $[\delta_\cdot(tn^2P_n)^m](A) \in \mathscr{E}_+$. Therefore $P(t, \cdot, A) = (\delta_\cdot T_t)(A) \in \mathscr{E}_+$. Obviously, $P(0, x, A) = \delta(x, A)$. Next, since $\exp(-tnI)$ and $\exp(tn^2P_n)$ are all bounded operators from $\mathscr{L}_+$ into itself and $\delta_x \in \mathscr{L}_+$, we have $\delta_x T_t \in \mathscr{L}_+$, and hence $P(t, x, \cdot) \in \mathscr{L}_+$. By contractivity,
$$P(t, x, E) = (\delta_x T_t)(E) \le 1.$$
By strong continuity, $P(t, x, A) = (\delta_x T_t)(A)$ is continuous in $t$. Now, the remainder is to check the CK-equation. Note that $\mu\delta_\cdot = \mu$ and so
By the semigroup property, we have
Because of the one-to-one correspondence between $P(t, x, A)$ and its Laplace transform $P(\lambda, x, A)$, from now on, we also call $P(\lambda, \cdot, A)$ a jump process ($q$-process). If $P(\lambda, x, A)$ satisfies conditions (1)-(4) of Lemma 1.28, then it determines uniquely a jump process $P(t, x, A)$, and so one can define the corresponding $q$-pair. Thus, Lemma 1.28 shows that
Corollary 1.30. $P(\lambda, x, A)$ is a jump process with $q$-pair $(q(x), q(x, A))$ $(x \in E, A \in \mathscr{E})$ deduced by Theorems 1.4 and 1.5 iff conditions (1)-(5) of Lemma 1.28 hold.
As we have seen, this corollary is essentially not new since the $q$-condition is imposed on $P(t, x, A)$ but not on $P(\lambda, x, A)$. The next result is much more meaningful.
Corollary 1.31. Use the hypotheses and notations in Remark 1.8. Then $P(\lambda, x, A)$ is a $q$-process iff conditions (1)-(4) of Lemma 1.28 and the following $q$-condition all hold.
$$\lim_{\lambda\to\infty}\lambda[\lambda P(\lambda, x, A\cap E_n) - \delta(x, A\cap E_n)] = q(x, A\cap E_n) - q(x)\delta(x, A\cap E_n),\qquad A \in \mathscr{E},\ n \ge 1.$$
Proof: By Lemma 1.28, the $q$-condition here is clearly necessary. Now, suppose that conditions (1)-(4) are all satisfied. By Theorem 1.29, there exists a jump process $P(t, x, A)$, and hence a $q$-pair $(Q(x), Q(x, A))$ $(x \in E, A \in \mathscr{E})$. Then from the $q$-condition here and the proof of Lemma 1.28, it follows that $q(x) = Q(x)$, $x \in E$. Moreover,
$$Q(x, A\cap E_n) = q(x, A\cap E_n),\qquad x \in E,\ A \in \mathscr{E},\ n \ge 1.$$
Put
$$\bar E_n = \{x \in E : n - 1 \le Q(x) < n\} \cup \{x_n\},\qquad n \ge 1,$$
where $\{x_n\}$ is defined in Remark 1.8. We have $\bar E_n = E_n$ for all $n \ge 1$. Because $(q(x), q(x, A))$ is a $q$-pair, by Lemma 1.7, $Q(x, \cdot)$ possesses uniquely an extension and moreover
$$Q(x, A) = q(x, A),\qquad x \in E,\ A \in \mathscr{E}.$$
In particular,
$$\lim_{t\to 0}(P(t, x, A) - \delta(x, A))/t = Q(x, A) - Q(x)\delta(x, A).$$
This shows that $P(t, x, A)$ is a $q$-process with $q$-pair $(q(x), q(x, A))$. ∎
1.5 Appendix
In this book, we often use different forms of the monotone class theorem. For the reader's convenience, we recall them here. But the proofs are omitted since they are included in many standard textbooks (Cohn (1980), Wang (1965), Yan, Wang and Liu (1982), for example).
Definition 1.32. Let $(E, \mathscr{E})$ be an arbitrary measurable space. A class $\mathscr{C}$ of sets in $\mathscr{E}$ is called a $\pi$-system if it is closed under (finite) intersection. A class $\mathscr{D}$ of sets in $\mathscr{E}$ is called a $d$-system or Dynkin class if it has the properties:
(1) $E \in \mathscr{D}$.
(2) $A, B \in \mathscr{D}$ and $A \subset B$ imply $B \setminus A \in \mathscr{D}$.
(3) $\{A_n\} \subset \mathscr{D}$ and $A_n \uparrow$ imply $\bigcup_n A_n \in \mathscr{D}$.
The next result is the monotone class theorem in the set form.
Theorem 1.33. Let $\mathscr{C}$ be a $\pi$-system and $\mathscr{D}$ be a $d$-system. If $\mathscr{C} \subset \mathscr{D}$, then $\sigma(\mathscr{C}) \subset \mathscr{D}$.
Definition 1.34. Let $\mathscr{L}$ be a family of functions from $E$ to $[-\infty, +\infty]$ having the property: $f \in \mathscr{L}$ implies that $f^+, f^- \in \mathscr{L}$. A family $L$ of functions is called an $\mathscr{L}$-system, if
(1) $L$ contains the constant function 1.
(2) For $c_1, c_2 \in \mathbb{R}$ and $f_1, f_2 \in L$, if either both $f_1$ and $f_2$ are bounded or $c_1f_1 + c_2f_2 \in \mathscr{L}$, then $c_1f_1 + c_2f_2 \in L$.
(3) For $f_n \in L$, $0 \le f_n \uparrow f$, if either $f$ is bounded or $f \in \mathscr{L}$, then $f \in L$.
The following result is the monotone class theorem in the functional form. When $(E, \rho)$ is a metric space with Borel $\sigma$-algebra $\mathscr{E}$, denote by $\mathscr{L}ip(E)$ (resp. $b\mathscr{L}ip(E)$) the set of all (resp. bounded) Lipschitz continuous functions.
Theorem 1.35. Let $L$ be an $\mathscr{L}$-system.
(1) If $L$ contains the indicator $I_A$ for all $A$ in a $\pi$-system $\mathscr{C}$, then $L \supset \mathscr{L} \cap \sigma(\mathscr{C})$.
(2) If $L \supset b\mathscr{L}ip(E)$, then $L \supset \mathscr{L} \cap \mathscr{E}$.
Proof: The first assertion is standard, the second one follows from the next lemma.
Lemma 1.36. Let $(E, \rho)$ be a metric space with Borel $\sigma$-algebra $\mathscr{E}$. Then for every open set $U$, there exists $\{f_n\}_1^\infty \subset b\mathscr{L}ip(E)$ such that
Proof: Take
Then
and so $f \in b\mathscr{L}ip(E)$. Next, let
Note that $|q_n(f(x)) - q_n(f(y))| = 0$ if $|f(x)| \wedge |f(y)| \ge 1/n$. On the other hand, if $|f(x)| \wedge |f(y)| < 1/n$, then
Finally, if
then
and
Hence $q_n(f) \in b\mathscr{L}ip(E)$ for each $n$. Clearly,
The next special form deals with the non-negative functions.
Theorem 1.37. Suppose that $L$ is a family of functions from $E$ to $[0, +\infty]$ having the following properties.
(1) $\{I_A : A \in \mathscr{E}\} \subset L$.
(2) $L$ is closed under non-negative linear combination. That is, for $c_1, c_2 \ge 0$ and $f_1, f_2 \in L$, we have $c_1f_1 + c_2f_2 \in L$.
(3) For $0 \le f_n \in L$, $f_n \uparrow f$, we have $f \in L$.
Then $L \supset \mathscr{E}_+$.
Without any confusion, we refer to any one of the above theorems as the monotone class theorem. Our next topic is the uniqueness theorem for the Laplace transform, which is also often used in the book.
Theorem 1.38. Let $\mu$ be a $\sigma$-finite signed measure on $([0, \infty), \mathscr{B}([0, \infty)))$. If
(1.48)
then $\mu = 0$. In particular, if $\mu(dt) = \varphi(t)\,dt$ with $\varphi \in \mathscr{B}([0, \infty))$, then $\varphi(t) = 0$, a.e. $t$. If in addition $\varphi$ is right continuous (resp. left continuous with $\varphi(0) = 0$), then $\varphi = 0$.
Proof: From (1.48), it follows that
(1.49)
Denote by $\nu^{\pm}$ the Jordan-Hahn decomposition of $\nu(dt) = e^{-ct}\mu(dt)$. Then, by (1.49), $|\nu| = \nu^+ + \nu^-$ is a finite measure. Under the transform $s := e^{-t}$, $\nu$ induces a measure $\lambda$ on $\mathscr{B}((0, 1])$ so that $|\lambda| = \lambda^+ + \lambda^-$ is a finite measure on $\mathscr{B}((0, 1])$. By (1.49), we have
$$\int_0^1 s^n\lambda(ds) = 0,\qquad n = 0, 1, 2, \ldots \qquad (1.50)$$
Set $\mathscr{L} = \mathscr{B}((0, 1])$ and
$$L = \Big\{f \in \mathscr{B}((0, 1]) : \int_0^1 f(s)\lambda(ds) = 0\Big\}.$$
Then, it is easy to check that $L$ is an $\mathscr{L}$-system. Moreover, by the Weierstrass theorem, $L$ contains all continuous functions. Hence the monotone class theorem implies that $L \supset \mathscr{L}$. In particular, taking $f = I_B$, $B \in \mathscr{B}((0, 1])$, it follows that $\lambda(B) = 0$ for all $B \in \mathscr{B}((0, 1])$. This shows that $\lambda = 0$, then $\nu = 0$, and hence $\mu = 0$. ∎
The last result is a generalization of Dynkin (1965).
Let us point out that most of the results in this chapter (also in the next two chapters and in Chapter 6) do not depend on the topology on the state space $(E, \mathscr{E})$. For example, we require that all singletons $\{x\}$ are in $\mathscr{E}$. But we can ignore this restriction by using the measurable atoms instead of the singletons $\{x\}$ and replacing $\mathscr{E}$ with the deduced $\sigma$-algebra. On the other hand, we require that $\{(x, x) : x \in E\} \in \mathscr{E} \times \mathscr{E}$, but this is satisfied whenever $\mathscr{E}$ is countably generated. Actually, the main cases for which we need some restriction are Theorem 1.3 and Corollary 1.21.
We now discuss again the differentiability of totally stable jump processes. As we have seen from Corollary 1.21, for a Polish space the two conditions in Lemma 1.20 hold and so every totally stable jump process is differentiable. Actually, the state space can be more general.
Definition 1.39. Let $X$ be a complete separable space with metric $d$ and Borel $\sigma$-algebra $\mathscr{B}$. We say that $X_0 \subset X$ is a universal measurable set if for every probability measure $\mu$ on $(X, \mathscr{B})$, there exist $B_1, B_2 \in \mathscr{B}$ such that $B_1 \subset X_0 \subset B_2$ and $\mu(B_1) = \mu(B_2) = 1$. We say that $(E, \mathscr{E})$ is a universal measurable space if it is $\sigma$-isomorphic to a universal measurable subset $X_0$ of a complete separable metric space $(X, d)$ (i.e., there exists a one-to-one mapping from $\mathscr{E}$ to $X_0 \cap \mathscr{B}$ which preserves countable set operations).
Corollary 1.40. If $(E, \mathscr{E})$ is a universal measurable space, then conditions (1) and (2) of Lemma 1.20 are satisfied.
Proof: a) We first prove that condition (1) of Lemma 1.20 holds. To do so, we prove the inner regularity of $(X_0, X_0 \cap \mathscr{B})$. That is, for any probability $\mu$ on $(X_0, X_0 \cap \mathscr{B})$ and any $A \in X_0 \cap \mathscr{B}$, we have $\mu(A) = \sup\{\mu(K) : \text{compact } K \subset A\}$. Now, given $\mu$, define a probability measure $\bar\mu$ as follows: $\bar\mu(B) = \mu(B \cap X_0)$, $B \in \mathscr{B}$. By definition, there exist $B_1, B_2 \in \mathscr{B}$ such that $B_1 \subset X_0 \subset B_2$ and $\bar\mu(B_1) = \bar\mu(B_2) = \mu(X_0) = 1$. Given $A \in X_0 \cap \mathscr{B}$, there exists $B \in \mathscr{B}$ so that $A = B \cap X_0$. Now, by using the inner regularity of $(X, \mathscr{B})$, we see that
$$\mu(A) = \bar\mu(B) = \bar\mu(B \cap B_1) = \sup\{\bar\mu(K) : \text{compact } K \subset B \cap B_1\}$$
$$\le \sup\{\bar\mu(K) : \text{compact } K \subset B \cap X_0\} = \sup\{\mu(K) : \text{compact } K \subset B \cap X_0\}$$
$$\le \sup\{\bar\mu(K) : \text{compact } K \subset B \cap B_2\} = \bar\mu(B \cap B_2) = \bar\mu(B) = \mu(A).$$
This proves the inner regularity of $(X_0, X_0 \cap \mathscr{B})$. Now, since $(E, \mathscr{E})$ is $\sigma$-isomorphic to $(X_0, X_0 \cap \mathscr{B})$, the remainder of the proof is a simple modification of that of Corollary 1.21.
b) For a proof of condition (2) of Lemma 1.20, refer to Sokal (1981). Indeed, for countably generated $\sigma$-algebras, a criterion for the existence of regular conditional probability was obtained by Ma (1985). ∎
From the probabilistic point of view, the universal measurable spaces are a nice setup, because the Kolmogorov extension theorem holds for such spaces and every Markov process on such a space has a transition function (cf. Kuznetsov (1980)).
1.6 Notes
Theorem 1.2 is taken from Kingman (1972), p.160. It is a generalization of the Lévy-Austin-Ornstein theorem for Markov chains. The state space can be more general; refer to Kingman (1968). For Markov chains, the differentiability of the transition function at the origin is due to Kolmogorov (1936b). The generalization to the general state space is due to Kendall (1955). Here, we refer to Loève (1963) and Wang (1965). A part of Theorem 1.14 is taken from Hu (1966). The main body of Section 1.3 is due to Hsu (1958), where the Euclidean space was studied. The Polish space was treated by Zheng and Liu (1984); their key idea is included in the last part of the proof of Corollary 1.21. Lemma 1.20 and Corollary 1.39 are taken from Chen (1986~). Section 1.4 is mainly taken from Hu (1966). Corollary 1.30 is due to Chen and Zheng (1983). For Markov chains, the use of the Laplace transform goes back to Feller (1957) and Reuter (1957).
For the remainder of the book, we study only totally stable jump processes. It is more difficult to study jump processes containing instantaneous states. A result obtained by using non-standard analysis was presented by Chen and Cheng (1981). For more recent progress, see Hou (1991).
Chapter 2
Existence and Simple Constructions of Jump Processes
This chapter begins with a theory of non-negative solutions to a type of equations, one of the basic tools that will be used subsequently. Then, we prove an existence theorem for jump processes, that is, we construct the minimal jump process. Moreover, we study the backward and forward Kolmogorov equations, and the entrance and exit boundaries. We present a criterion and some sufficient conditions for the uniqueness of conservative jump processes. Finally, for a special case, all jump processes are constructed. Starting from this chapter, the $q$-pairs are assumed to be totally stable.
2.1 Minimal Nonnegative Solutions
In what follows, we will meet the following different equations
$$x_i(t) = \sum_{k\in E}\int_0^t c_{ik}(s)x_k(s)\,ds + b_i(t),\qquad t \ge 0,\ i \in E, \qquad (2.2)$$
where for the first two equations, $E$ is countable, $c_{ij}$ and $b_i$, $c_{ij}(t)$ and $b_i(t)$ are non-negative; for the last two equations, $(E, \mathscr{E})$ is a measurable space, $U$ is a non-negative measurable kernel, $g \in \mathscr{E}_+$ and $\nu$ is a measure on $(E, \mathscr{E})$. These lead us to introduce a general type of equations.
Let $E$ be an arbitrary non-empty set. Denote by $\mathscr{H}$ a set of mappings from $E$ to $\bar{\mathbb{R}}_+ := [0, +\infty]$: $\mathscr{H}$ contains the constant 1 and is closed under non-negative linear combination and monotone increasing limit. Here the order relation "$\ge$" in $\mathscr{H}$ is defined to be pointwise. For instance, $f \ge 0$ means $f(x) \ge 0$ for all $x \in E$. Obviously, $\mathscr{H}$ is a convex cone. We say that $A: \mathscr{H} \to \mathscr{H}$ is a cone mapping if $A0 = 0$ and $A(c_1f_1 + c_2f_2) = c_1Af_1 + c_2Af_2$ for all $c_1, c_2 \ge 0$ and $f_1, f_2 \in \mathscr{H}$.
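Denote by $\mathscr{A}$ the set of all such mappings which satisfy in addition the hypothesis:
$$\mathscr{H} \ni f_n \uparrow f \ \text{ implies } \ Af_n \uparrow Af. \qquad (2.5)$$
Besides, we use the convention: $c/\infty = 0$ and $c \times \infty = \infty$ if $c > 0$; $c + \infty = \infty$ for $c \ge 0$ and $0 \times \infty = \infty \times 0 = 0$.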
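Definition 2.1. Given $A \in \mathscr{A}$ and $g \in \mathscr{H}$. We say that $f^*$ is the minimal non-negative solution (abbrev. minimal solution) to the equation
$$f(x) = (Af)(x) + g(x),\qquad x \in E, \qquad (2.6)$$
if $f^*$ satisfies (2.6) and for any solution $\tilde f \in \mathscr{H}$ of (2.6), we have
$$\tilde f(x) \ge f^*(x),\qquad x \in E. \qquad (2.7)$$
The last property is called the minimum property of $f^*$.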
For simplicity, in what follows, we often omit the variable $x$.
Theorem 2.2. The minimal solution to Eq. (2.6) always exists uniquely. Furthermore, it can be obtained by the following procedure. Let
$$f^{(0)} \equiv 0,\qquad f^{(n+1)} = Af^{(n)} + g,\qquad n \ge 0, \qquad (2.8)$$
then $f^{(n)}$ increases to $f^*$ as $n \to \infty$.
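Proof: Obviously, $\mathscr{H} \ni f^{(n)} \uparrow$, so the limit exists. By (2.5), it follows that
$$f^* = \lim_{n\to\infty} f^{(n+1)} = \lim_{n\to\infty}\big(Af^{(n)} + g\big) = Af^* + g.$$
Hence $f^*$ is a solution. Assume that $\tilde f \in \mathscr{H}$ is another solution. Then $\tilde f \ge f^{(0)} \equiv 0$. Furthermore, if $\tilde f \ge f^{(n)}$, then $\tilde f = A\tilde f + g \ge Af^{(n)} + g = f^{(n+1)}$. Thus, $\tilde f \ge f^{(n)}$ for all $n \ge 0$. And so $\tilde f \ge f^*$. Finally, if two solutions possess the minimum property, they must be the same. ∎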
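The iteration in Theorem 2.2 is easy to try numerically. The following is a minimal sketch, assuming that $E$ is a finite set and that the cone mapping $A$ is given by a matrix with non-negative entries; the names `first_scheme`, `A` and `g` are illustrative and do not come from the text. It records the iterates $f^{(n)}$ so that their monotone convergence can be inspected.

```python
def first_scheme(A, g, n_steps):
    """Return the iterates f^(0), ..., f^(n_steps) of f^(n+1) = A f^(n) + g.

    A is a square matrix (list of rows) with non-negative entries and g is a
    non-negative vector; both are illustrative stand-ins for the cone mapping
    and the function g of Eq. (2.6) on a finite set E.
    """
    size = len(g)
    f = [0.0] * size              # f^(0) = 0
    iterates = [f]
    for _ in range(n_steps):
        f = [sum(A[i][k] * f[k] for k in range(size)) + g[i]
             for i in range(size)]
        iterates.append(f)
    return iterates


# Hypothetical two-point example: f_1 = f_2 / 2 + 1, f_2 = f_1 / 4 + 1.
A = [[0.0, 0.5],
     [0.25, 0.0]]
g = [1.0, 1.0]
iterates = first_scheme(A, g, 60)
f_star = iterates[-1]             # numerically close to (12/7, 10/7)
```

In this example the equation has the unique finite solution $(12/7, 10/7)$, so the increasing iterates approach it; when several non-negative solutions exist, the same iteration singles out the smallest one.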
We call the recursive procedure given by (2.8) the first successive approximation scheme.
Definition 2.3. The equation (2.6) is called homogeneous if $g = 0$.
Corollary 2.4. The minimal solution to a homogeneous equation equals zero.
Next, we compare the order relation of the minimal solutions to different equations. Given $\tilde A, A \in \mathscr{A}$, we write $\tilde A \ge A$ if for each $f \in \mathscr{H}$, $\tilde Af \ge Af$.
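Definition 2.5. Let $\tilde A, A \in \mathscr{A}$ and $g, \tilde g \in \mathscr{H}$ satisfy
$$\tilde A \ge A,\qquad \tilde g \ge g.$$
Then, we call
$$\tilde f \ge \tilde A\tilde f + \tilde g,\qquad \tilde f \in \mathscr{H} \qquad (2.9)$$
a controlling equation of Eq. (2.6).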
Theorem 2.6 (Comparison Theorem). Let $f^*$ be the minimal solution to Eq. (2.6). Then for any solution $\tilde f$ to Eq. (2.9), we have $\tilde f \ge f^*$.
Proof: Simply use induction and the first successive approximation scheme to show that $\tilde f \ge f^{(n)}$ and then let $n \to \infty$. ∎
By Theorem 2.2, for each $A \in \mathscr{A}$, we may define a map $m_A$ from $\mathscr{H}$ into itself as follows: $m_A(g) = f^*$. Then we have the following interesting result.
Theorem 2.7. $m_A$ is a cone mapping. For $\{A_n\} \subset \mathscr{A}$, $A_n \uparrow A$ and $\{g_n\} \subset \mathscr{H}$, $g_n \uparrow g$, we have $A \in \mathscr{A}$, $g \in \mathscr{H}$ and $m_{A_n}g_n \uparrow m_Ag$.
Proof: By definition, $g \in \mathscr{H}$. Similarly, it is easy to check that $A \in \mathscr{A}$. On the other hand, by the comparison theorem, we have $f_n^* := m_{A_n}g_n \uparrow$ and
$$f_n^* = A_nf_n^* + g_n.$$
Set $\bar f^* = \lim_{n\to\infty} f_n^*$. Then $\bar f^*$ is a solution to Eq. (2.6) and so $\bar f^* \ge f^*$ by the minimum property. Note that by the order-preserving property, $f^* = m_Ag \ge m_{A_n}g_n = f_n^*$ for all $n$, so $f^* \ge \bar f^*$. Therefore $f^* = \bar f^* = \lim_{n\to\infty} m_{A_n}g_n$.
Corollary 2.8. Let $G$ be a countable set, $\{a_s : s \in G\} \subset \bar{\mathbb{R}}_+$, then
Proof: When $G$ is finite, the assertion follows from the first assertion of Theorem 2.7 and induction. Then the assertion for the general case follows by using the second assertion of Theorem 2.7. ∎
The next result is the second successive approximation scheme for the minimal solution; its proof is similar to the previous one and hence omitted.
Theorem 2.9. Let $\{g_n\}_1^\infty \subset \mathscr{H}$. Define
In particular, if we set
then $m_Ag = f^* = \sum_{n=1}^\infty f^{(n)}$.
Theorem 2.10. Let $\tilde f$ be a non-negative solution to equation (2.6) so that $\tilde f \le pf^*$ for some constant $p \ge 1$. Then $\tilde f = f^*$. Moreover, for any initial $\tilde f^{(0)}$ with $0 \le \tilde f^{(0)} \le pf^*$, setting $\tilde f^{(n+1)} = A\tilde f^{(n)} + g$ $(n \ge 0)$, we have $\tilde f^{(n)} \to f^*$ as $n \to \infty$.
Proof: By (2.8) and induction, it follows that
When $f^*(x) = \infty$, (2.10) implies that $\lim_{n\to\infty}\tilde f^{(n)}(x) = \infty = f^*(x)$. When $f^*(x) < \infty$, (2.10) and (2.11) imply that
We still have $\lim_{n\to\infty}\tilde f^{(n)}(x) = f^*(x)$. Thus, we have proved the second assertion of the theorem. We now prove the first assertion. Let $0 \le \tilde f \le pf^*$. Set $\tilde f^{(0)} = \tilde f$ and $\tilde f^{(n+1)} = A\tilde f^{(n)} + g$ $(n \ge 0)$. Then the assertion we have just proved gives us $\lim_{n\to\infty}\tilde f^{(n)} = f^*$. But $\tilde f$ satisfies Eq. (2.6), so $\tilde f = \tilde f^{(n)}$ for every $n \ge 0$. Hence $\tilde f = f^*$. ∎
Theorem 2.11. Suppose that the minimal solution $f^*$ to the equation (2.6) satisfies
$$0 < \inf_{x\in E} f^*(x) \le \sup_{x\in E} f^*(x) < \infty,$$
then the only non-negative bounded solution to the homogeneous equation
$$f = Af,\qquad f \in \mathscr{H} \qquad (2.12)$$
is zero.
Proof: Let $\tilde f$ be a non-negative bounded solution to Eq. (2.6). Then there exists a constant $c < \infty$ such that
$$c\,\inf_x f^*(x) \ge \sup_x \tilde f(x).$$
So we have $\tilde f \le cf^*$. Hence, by Theorem 2.10, $\tilde f = f^*$.
Next, for any non-negative bounded solution $\hat f$ to Eq. (2.12), since $\hat f + f^*$ is a non-negative solution to Eq. (2.6), by what we have proved in the last paragraph, $\hat f + f^* = f^*$. This proves that $\hat f = 0$. ∎
To simplify the writing and to save space, let us introduce some notations which will be used throughout the book. For a given kernel $K(x, dy)$ on $(E, \mathscr{E})$, not necessarily non-negative, we set
F=
Note that for a function $f$ and a measure $\varphi$, $f\varphi$ gives us a kernel but $\varphi f$ gives us a constant. Of course, some restrictions on $K$ or on the class of functions are needed to make the notations meaningful. In what follows, we will mention the related restriction case by case. It is important to remember that the operators we are dealing with are in the weak sense. In other words, all the equations, inequalities, limits of functions, measures as well as kernels are pointwise. These notations greatly simplify our expressions but sometimes may lose their intuition. The reader can write down the full expressions if necessary.
Now, we turn to study Eq. (2.3) and (2.4), for which Corollary 2.8 can be generalized as follows.
Theorem 2.12. Let $U$ and $T$ be two non-negative measurable kernels on $(E, \mathscr{E})$.
(1) For each $A \in \mathscr{E}$, denote by $P(\cdot, A)$ the minimal solution to the equation
$$f = Uf + T(\cdot, A),\qquad f \in \mathscr{E}_+.$$
Then, for every $g \in \mathscr{E}_+$, $\int P(\cdot, dy)g(y)$ is the minimal solution to the equation
$$f = Uf + Tg,\qquad f \in \mathscr{E}_+.$$
(2) For each $x \in E$, denote by $Q(x, \cdot)$ the minimal solution to the equation
$$\varphi = \varphi U + T(x, \cdot).$$
Then, for every measure $\nu$, $\nu Q$ is the minimal solution to the equation
Proof: Since the proofs of (1) and (2) are similar, here we prove (1) only. Let
$$P^{(0)}(\cdot, A) = T(\cdot, A),\qquad A \in \mathscr{E},$$
$$P^{(n+1)}(\cdot, A) = UP^{(n)}(\cdot, A) + T(\cdot, A),\qquad A \in \mathscr{E},\ n \ge 0. \qquad (2.13)$$
Then, by Theorem 2.2 we have
$$P^{(n)}(\cdot, A) \uparrow P(\cdot, A) \ \text{ as } n \to \infty,\qquad A \in \mathscr{E}.$$
By induction and (2.13), we see that for every $n \ge 0$ and $x \in E$, $P^{(n)}(x, \cdot)$ is a measure on $\mathscr{E}$. And so is $P(x, \cdot)$ by Lemma 1.6. Moreover, for every $n \ge 0$ and $g \in \mathscr{E}_+$,
$$P^{(0)}g = Tg;\qquad P^{(n+1)}g = UP^{(n)}g + Tg.$$
Now fix $g \in \mathscr{E}_+$ and set
$$f^{(0)} = Tg,\qquad f^{(n+1)} = Uf^{(n)} + Tg,\qquad n \ge 0.$$
Clearly, we have $f^{(0)} = P^{(0)}g$. Suppose that $f^{(n)} = P^{(n)}g$. Then
$$f^{(n+1)} = Uf^{(n)} + Tg = UP^{(n)}g + Tg = P^{(n+1)}g.$$
Hence $f^{(n)} = P^{(n)}g$ for all $n \ge 0$. Now, our assertion follows by letting $n \to \infty$ and using Theorems 2.2 and 1.37 ($P^{(n)}g \uparrow Pg$ as $n \to \infty$). ∎
The next result is called a localization theorem.
Theorem 2.13. Let $U$ be a non-negative measurable kernel and $f^*$ be the minimal solution to the equation
Next, let $G \in \mathscr{E}$ and $\tilde f^*$ be the minimal solution to the equation
Then we have
$$\tilde f^*(x) = f^*(x),\qquad x \in G \ (\text{resp. } x \in E).$$
Proof: Because $(f^*(x) : x \in G)$ is a solution to Eq. (2.14), it is also a solution to Eq. (2.15). By the minimum property, we have
$$\tilde f^*(x) \le f^*(x),\qquad x \in G.$$
On the other hand, by the first successive approximation scheme for $f^*$, we have $f^{(0)}(x) = 0 \le \tilde f^*(x)$ for all $x \in G$. Suppose that $f^{(n)}(x) \le \tilde f^*(x)$ for all $x \in G$. Then
Therefore, as $n \to \infty$, $f^*(x) = \lim_n f^{(n)}(x) \le \tilde f^*(x)$ for all $x \in G$. ∎
The theory of minimal solutions is especially useful for jump processes because the samples of the minimal jump process are step functions. Each step of the first successive approximation scheme corresponds to the probability of a jump from a point $x$ to a set $A$. To illustrate this, we introduce two preliminary applications.
Let $(X_n)_{n\ge 0}$ be a Markov chain defined on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$ with countable state space $E$ and with transition probability matrix $(P_{ij} : i, j \in E)$. Set
where $\mathbb{P}_i$ is the probability measure of the chain starting from $i$.
Proposition 2.14. For each fixed $j$, $\{f_{ij} : i \in E\}$ is the minimal solution to the equation
$$x_i = \sum_{k\ne j} P_{ik}x_k + P_{ij}.$$
In particular, $j$ is recurrent iff $x_j^* = 1$.
Proof: Use Theorem 2.9. Clearly, $x_i^{(1)} = P_{ij} = f_{ij}^{(1)}$. Suppose that $x_i^{(n)} = f_{ij}^{(n)}$, then
$$x_i^{(n+1)} = \sum_{k\ne j} P_{ik}x_k^{(n)} = \sum_{k\ne j} P_{ik}f_{kj}^{(n)} = f_{ij}^{(n+1)}.$$
Hence, from Theorem 2.9, it follows that $x_i^* = f_{ij}$. ∎
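The argument in this proof can be mirrored numerically. The sketch below assumes that the second successive approximation scheme takes the form $x_i^{(1)} = P_{ij}$, $x_i^{(n+1)} = \sum_{k\ne j} P_{ik}x_k^{(n)}$, as it is used in the proof, and sums the terms to approximate $f_{ij}$. The four-state chain and the function name `hitting_probabilities` are hypothetical choices made only for illustration.

```python
def hitting_probabilities(P, j, n_terms):
    """Approximate f_ij = sum_n f_ij^(n) for all i by summing n_terms terms.

    term[i] holds f_ij^(n): the probability of first reaching j at step n.
    """
    states = range(len(P))
    term = [P[i][j] for i in states]          # f_ij^(1) = P_ij
    total = list(term)
    for _ in range(n_terms - 1):
        # f_ij^(n+1) = sum over k != j of P_ik * f_kj^(n)
        term = [sum(P[i][k] * term[k] for k in states if k != j)
                for i in states]
        total = [total[i] + term[i] for i in states]
    return total


# Hypothetical chain on {0, 1, 2, 3}: states 0 and 3 are absorbing,
# states 1 and 2 move one step left or right with probability 1/2.
P = [[1.0, 0.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 0.0, 1.0]]
f_to_3 = hitting_probabilities(P, 3, 200)
```

For $j = 3$ the vector $(1, 1, 1, 1)$ also satisfies the equation of Proposition 2.14, because the equation for $i = 0$ reduces to $x_0 = x_0$. The scheme nevertheless produces the minimal solution, with $x_0^* = 0$, and $x_3^* = 1$ is consistent with the absorbing state 3 being recurrent.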
Proposition 2.15. Let $m_{ij} = \sum_{n=1}^\infty nf_{ij}^{(n)}$. Then, for each fixed $j$, $\{m_{ij} : i \in E\}$ is the minimal solution to
In particular, if $j$ is recurrent, then it is positive recurrent iff $x_j^* < \infty$.
Proof: Let
Suppose that $y_i^{(n)} = nf_{ij}^{(n)}$. Then
It follows that $y_i^{(n)} = nf_{ij}^{(n)}$ for all $i \in E$ and $n \ge 1$. Thus, by Theorem 2.9, we obtain
We conclude this section by comparing the minimal solution with an ordinary solution. Certainly, if a system of a finite number of equations has only one solution, then it must coincide with the minimal one. However, the equation
$$x = x$$
has infinitely many finite non-negative solutions, but its minimal solution is $x^* = 0$. On the other hand, the equation $x = x + 2$ has no finite non-negative solution, but has the minimal solution $x^* = \infty$.
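Both scalar examples can be checked directly with the first successive approximation scheme. The following is a small sketch (the helper `scalar_scheme` is an illustrative name); it uses Python's floating-point infinity to stand in for the value $\infty$ under the convention $c + \infty = \infty$.

```python
import math


def scalar_scheme(a, g, n_steps):
    """Return f^(n_steps) for f^(0) = 0, f^(n+1) = a * f^(n) + g."""
    f = 0.0
    for _ in range(n_steps):
        f = a * f + g
    return f


# x = x: every iterate is 0, so the minimal solution is 0,
# although any constant c >= 0 also satisfies c = c.
stays_zero = scalar_scheme(1.0, 0.0, 1000)

# x = x + 2: the n-th iterate equals 2n, which increases without bound.
grows = [scalar_scheme(1.0, 2.0, n) for n in (10, 100, 1000)]

# The value +infinity does satisfy x = x + 2.
inf_is_solution = (math.inf == math.inf + 2.0)
```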
2.2 Kolmogorov Equations and Minimal Jump Process
In this section, we study the backward and forward Kolmogorov equations and their equivalent forms. As the minimal solution to these equations, we obtain the minimal jump process. In this way, we prove the existence theorem for jump processes. The following two equations
have important probabilistic meaning. Given a jump process, if its first sample discontinuity is an isolated jump with probability one, then Eq. (B) holds. On the other hand, if before time $t$ (finite), with probability one there exists the last sample discontinuity which is again an isolated jump, then Eq. (F) holds (cf. Loève (1963) or Wang (1965)). In the physics literature, the forward Kolmogorov equation (F) is often called the Fokker-Planck equation. The Laplace transforms of the above equations become
respectively. Certainly, from the algebraic point of view, Eq. $(B_\lambda)$ is the same as
Nevertheless, from the probabilistic point of view, they have some different meanings. For instance, as explained in the last section, each step of the first successive approximation scheme of $(B_\lambda)$ corresponds to the successive jumps of a jump process, but the last equation has no such meaning.
From now on, we will often be concerned with the above Kolmogorov equations. To simplify our writing, we introduce some operators. Corresponding to a $q$-pair $(q(x), q(x, dy))$, we use three operators:
Sometimes, it is more convenient to regard these operators as the kernels
respectively. If a $q$-pair is conservative, then $\Omega f$ can be rewritten as
For the second operator, we use "$Q$" but not "$q$" to avoid confusion with $qf$, which is the product of the functions $q$ and $f$. As usual, $I$ denotes the identity operator: $If = f$. Certainly, it is the same to consider $I$ as the operator generated by the kernel $\delta(x, dy)$. Finally, for $f \in \mathscr{E}_+$, we use $fI$ to denote the kernel (or operator): $f(x)\delta(x, dy)$. By using these notations, the equation $(B_\lambda)$ can be rewritten as follows:
$$P(\lambda) = \Pi(\lambda)P(\lambda) + (\lambda + q)^{-1}I.$$
Theorem 2.16. (B) and $(B_\lambda)$, (F) and $(F_\lambda)$ are equivalent respectively.
Proof: Let $P(t, x, A)$ satisfy (B). Since for fixed $x \in E$ and $A \in \mathscr{E}$, both sides of (B) are continuous in $t$, taking their Laplace transforms, we obtain
which is exactly $(B_\lambda)$.
Conversely, let $P(\lambda, x, A)$ satisfy $(B_\lambda)$. By Theorem 1.29, it determines uniquely a jump process $P(t, x, A)$ such that
$$P(\lambda, x, A) = \int_0^\infty e^{-\lambda t}P(t, x, A)\,dt,\qquad \lambda > 0,\ x \in E,\ A \in \mathscr{E}.$$
Repeating the above proof in the opposite way, it follows that
It is easy to see that the integrands are all bounded and continuous in $t$. Hence (B) follows by the uniqueness theorem of Laplace transform.
The proof of (F) $\Longrightarrow$ $(F_\lambda)$ is similar to that of (B) $\Longrightarrow$ $(B_\lambda)$ and hence is omitted. To show that $(F_\lambda)$ $\Longrightarrow$ (F), we need more work except the trivial case that $q(x) = 0$. Now, assume that $q(x) > 0$. At the beginning of Section 1.3, we proved that
exists almost everywhere and $0 \le r(t, x, A) \le 1$, a.e. $t$. On the other hand, by using the CK-equation and letting $s \downarrow 0$, we have
Consider $A \in E_n \cap \mathscr{E}$, where $E_n = \{x \in E : q(x) \le n\}$. The limits of the first two terms on the right-hand side exist and are finite, and so we can apply Fatou's lemma to obtain
Combining this with (2.16), we see that
$$\int P(t, x, dy)q(y, A) \le q(x) + \sup_{y\in A} q(y),\qquad \text{a.e. } t \text{ for all } A \in E_n \cap \mathscr{E}. \qquad (2.17)$$
This shows that for each $A \in E_n \cap \mathscr{E}$,
is continuous in $t$. Hence, applying the uniqueness theorem of Laplace transform and then the monotone class theorem, we obtain $(F_\lambda)$ $\Longrightarrow$ (F). ∎
Definition 2.17. We call (B) or $(B_\lambda)$ (resp. (F) or $(F_\lambda)$) the backward (resp. forward) Kolmogorov equation.
Proposition 2.18. Every $q$-process $P(t, x, A)$ satisfies the backward Kolmogorov inequality:
(2.18)
Proof: The proof here is quite similar to the last part of the previous one. By Lemma 1.18, it suffices to show that for fixed $x$ and $A$,
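or
$$e^{-q(x)t}\frac{d}{dt}\big(e^{q(x)t}P(t, x, A)\big) \ge \int q(x, dy)P(t, y, A),\qquad \text{a.e. } t.$$
By the CK-equation, the difference quotient
$$\big[e^{q(x)(t+s)}P(t+s, x, A) - e^{q(x)t}P(t, x, A)\big]/s$$
is decomposed over the sets $E_n = \{x \in E : n - 1 \le q(x) < n\}$. Letting $s \downarrow 0$ and applying Fatou's lemma, we obtain
$$\frac{d}{dt}\big(e^{q(x)t}P(t, x, A)\big) \ge e^{q(x)t}\int q(x, dy)P(t, y, A),\qquad \text{a.e. } t.\ ∎$$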
For a Polish space, by Theorem 1.23, the last inequality actually holds for all $t$. The next result is the Laplace transform of the previous one.
Proposition 2.19. Every $q$-process $P(\lambda, x, A)$ satisfies the backward Kolmogorov inequality:
$$P(\lambda) \ge \Pi(\lambda)P(\lambda) + (\lambda + q)^{-1}I,\qquad \lambda > 0. \qquad (2.19)$$
Proposition 2.20. Every $q$-process $P(\lambda, x, A)$ satisfies the forward Kolmogorov inequality:
$$P(\lambda) \ge P(\lambda)Q[(\lambda + q)^{-1}I] + (\lambda + q)^{-1}I,\qquad \lambda > 0. \qquad (2.20)$$
Proof: Take $E_n = \{x \in E : q(x) \le n\}$. Then $E_n \in \mathscr{E}$ and $E_n \uparrow E$. By the resolvent equation, we have
But for $y \in A$, we have
$$0 \le \mu[1 - \mu P(\mu, y, A)] \le \mu[1 - \mu P(\mu, y, \{y\})].$$
Hence, for each $A \in \mathscr{E} \cap E_n$, $n \ge 1$,
That is
$$P(\lambda)[(\lambda + q)I_A] \ge P(\lambda)QI_A + I_A,\qquad A \in E_n \cap \mathscr{E},\ n \ge 1. \qquad (2.21)$$
Next, for $A \in \mathscr{E}$, substituting $A \cap E_n$ into (2.21) and letting $n \to \infty$, we see that (2.21) holds for all $A \in \mathscr{E}$. Finally, considering both sides of (2.21) as measures in $A$ and using the monotone class theorem, we obtain
$$P(\lambda)[(\lambda + q)f] \ge P(\lambda)Qf + f,\qquad f \in \mathscr{E}_+.$$
In particular, setting $f = \delta(\cdot, A)/(\lambda + q(\cdot))$, it gives us the desired inequality. ∎
Now, we are in a position to prove the first fundamental result, the existence theorem.
Theorem 2.21 (Existence Theorem). Given a $q$-pair, there always exists a $q$-process. In detail, the minimal solution $P^{\min}(\lambda, x, A)$ to $(B_\lambda)$ is a $q$-process. Indeed, it is the minimal one: for any $q$-process $P(\lambda, x, A)$, we have $P(\lambda, x, A) \ge P^{\min}(\lambda, x, A)$ for all $\lambda > 0$, $x \in E$ and $A \in \mathscr{E}$. Moreover, $(B_\lambda)$ and $(F_\lambda)$, as well as (B) and (F), have the same minimal solution. The former one is the Laplace transform of the latter one, which is the minimal $q$-process in a similar sense.
Proof: a) Prove that $P^{\min}(\lambda, x, A)$ is a $q$-process. Let
Then, from Theorem 2.2, it follows that
Using induction on $n$, we see that $P^{(n)}(\lambda)$ is a kernel on $(E, \mathscr{E})$, bounded from above by $1/\lambda$ for each $n$, and so is its limit $P^{\min}(\lambda, \cdot, A)$.
On the other hand, since $P^{\min}(\lambda)$ satisfies $(B_\lambda)$, by Theorem 1.15 (3), the jump condition and the $q$-condition are all satisfied. So it remains only to check the resolvent equation. Actually, we will prove a stronger result:
$$\tilde P^{(n)}(\lambda) - \tilde P^{(n)}(\mu) = (\mu - \lambda)\sum_{k=1}^n \tilde P^{(k)}(\lambda)\tilde P^{(n+1-k)}(\mu),\qquad \lambda, \mu > 0,\ n \ge 1, \qquad (2.22)$$
where
To see that the resolvent equation follows from (2.22), note that by the second successive approximation scheme,
Thus, by summing both sides of (2.22) from 1 to $\infty$, we obtain the required equation. We now prove (2.22) by using induction. When $n = 1$, at each point $(x, A)$, both sides of (2.22) have the same value as
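Suppose that (2.22) holds for $n - 1$. Then, from
$$(\mu - \lambda)\sum_{k=1}^n \tilde P^{(k)}(\lambda)\tilde P^{(n+1-k)}(\mu)$$
$$= (\mu - \lambda)\tilde P^{(1)}(\lambda)\tilde P^{(n)}(\mu) + (\mu - \lambda)\sum_{k=1}^{n-1}\Pi(\lambda)\tilde P^{(k)}(\lambda)\tilde P^{(n-k)}(\mu)$$
$$= (\mu - \lambda)[(\lambda + q)^{-1}I]\tilde P^{(n)}(\mu) + \Pi(\lambda)\big(\tilde P^{(n-1)}(\lambda) - \tilde P^{(n-1)}(\mu)\big)$$
we see that (2.22) holds also for $n$ and hence we are done.
b) Prove that $P^{\min}(\lambda, x, \cdot)$ is the minimal solution to $(F_\lambda)$. We use again the notations given by (2.23). Moreover, for fixed $x \in E$, set
Once we prove that
$$\tilde P^{(n)}(\lambda) = \hat P^{(n)}(\lambda),\qquad \lambda > 0,\ n \ge 1, \qquad (2.24)$$
then the assertion follows immediately since the minimal solution to $(F_\lambda)$ is $\sum_{n=1}^\infty \hat P^{(n)}(\lambda, x, \cdot)$. To prove (2.24), we adopt induction again. When $n = 1$, (2.24) is trivial. When $n = 2$, we have
So (2.24) holds. Suppose that it holds for $n - 1$ and $n$, then, by the monotone class theorem, we have
$$\hat P^{(n+1)}(\lambda) = \hat P^{(n)}(\lambda)Q[(\lambda + q)^{-1}I] = \tilde P^{(n)}(\lambda)Q[(\lambda + q)^{-1}I] = \Pi(\lambda)\tilde P^{(n-1)}(\lambda)Q[(\lambda + q)^{-1}I]$$
$$= \Pi(\lambda)\hat P^{(n-1)}(\lambda)Q[(\lambda + q)^{-1}I] = \Pi(\lambda)\hat P^{(n)}(\lambda) = \Pi(\lambda)\tilde P^{(n)}(\lambda) = \tilde P^{(n+1)}(\lambda).$$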
Therefore, (2.24) holds for all $n$.
c) Prove the minimum property. Every $q$-process $P(\lambda, x, A)$ satisfies the backward Kolmogorov inequality (2.19), which is a controlling equation of $(B_\lambda)$, so the property follows by the comparison theorem.
d) Prove that $P^{\min}(\lambda, x, A)$ is the Laplace transform of the minimal solution $P^{\min}(t, x, A)$ to (B). First, we have
$$\int_0^\infty e^{-\lambda t}P^{(0)}(t, x, A)\,dt = 0 = P^{(0)}(\lambda, x, A),\qquad \lambda > 0,\ x \in E,\ A \in \mathscr{E}.$$
Next, suppose that
and that $P^{(n)}(t)$ is continuous in $t$, $P^{(n)}(t)1 \le 1$ for all $t \ge 0$. Then it is clear that
has the same properties. Furthermore,
Hence
$$\int_0^\infty e^{-\lambda t}P^{(n)}(t)\,dt = P^{(n)}(\lambda),\qquad \lambda > 0,\ n \ge 0.$$
Letting $n \to \infty$, by the monotone convergence theorem, we have
$$\int_0^\infty e^{-\lambda t}P^{\min}(t)\,dt = P^{\min}(\lambda),\qquad \lambda > 0.$$
e) Prove that $P^{\min}(\lambda)$ is the Laplace transform of the minimal solution to (F). To prove this, it suffices to show that (B) and (F) have the same minimal solution. Define
and
As we did in the proof of b), we need only to show that
$$\tilde P^{(n)}(t) = \hat P^{(n)}(t),\qquad t \ge 0,\ n \ge 1. \qquad (2.25)$$
This is trivial when $n = 1$. Suppose that it holds for $n$. Since $\hat P^{(n)}(t) = \tilde P^{(n)}(t) \le P^{\min}(t)$, by (2.17), $\hat P^{(n+1)}(t, x, A)$ ($x \in E$, $A \in E_m \cap \mathscr{E}$) is continuous in $t$. Thus, by (2.24) and Theorem 1.38, (2.25) holds for $n + 1$, first for all $A \in E_m \cap \mathscr{E}$ and then for all $A \in \mathscr{E}$.
A direct proof of (2.25) goes as follows. For $n = 2$, we have
and
= the right-hand side of (2.26).
Here, in the last step, we have used the identity
Thus, (2.25) holds for $n = 2$. Now, suppose that (2.25) holds for $n - 1$ and $n$. Then, by the monotone class theorem, we have
(2.27)
On the other hand,
= the right-hand side of (2.27).
This completes the proof of (2.25).
f) Prove that $P^{\min}(t, x, A)$ is the minimal $q$-process. Due to the fact that $P^{\min}(t, x, A)$ is a solution to (B), it is continuous in $t$. Thus, by the uniqueness theorem of Laplace transform and Corollary 1.30, we see that $P^{\min}(t, x, A)$ is a $q$-process. Finally, the minimum property follows from Proposition 2.18. ∎
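For a finite state space the construction in Theorem 2.21 can be checked by machine. The sketch below is an illustration under stated assumptions, not the construction of the text verbatim: it takes a hypothetical conservative two-state $q$-pair with $q(1) = a$ and $q(2) = b$, assumes that $\Pi(\lambda)$ is the kernel $q(x, dy)/(\lambda + q(x))$, iterates $P^{(n+1)}(\lambda) = \Pi(\lambda)P^{(n)}(\lambda) + (\lambda + q)^{-1}I$ starting from $P^{(0)}(\lambda) = 0$, and compares the limit with the closed-form resolvent of the two-state chain. The function name `minimal_resolvent` is illustrative.

```python
def minimal_resolvent(a, b, lam, n_steps):
    """Iterate P^(n+1) = Pi P^(n) + diag(1/(lam+q)) for a two-state q-pair.

    Assumed q-pair (hypothetical): q(1) = a, q(1, {2}) = a, q(2) = b,
    q(2, {1}) = b. Pi is taken to be q(x, dy) / (lam + q(x)).
    """
    pi = [[0.0, a / (lam + a)],
          [b / (lam + b), 0.0]]
    d = [1.0 / (lam + a), 1.0 / (lam + b)]
    p = [[0.0, 0.0], [0.0, 0.0]]              # P^(0) = 0
    for _ in range(n_steps):
        p = [[sum(pi[i][k] * p[k][j] for k in range(2))
              + (d[i] if i == j else 0.0)
              for j in range(2)]
             for i in range(2)]
    return p


a, b, lam = 1.0, 2.0, 1.0
p_min = minimal_resolvent(a, b, lam, 200)

# Closed-form resolvent (lam - Omega)^{-1} of the two-state chain.
scale = lam * (lam + a + b)
exact = [[(lam + b) / scale, a / scale],
         [b / scale, (lam + a) / scale]]

# For this conservative finite chain, lam * P(lam) 1 = 1 (row sums).
row_sums = [lam * sum(row) for row in p_min]
```

In this finite example the minimal solution is the only $q$-process, which is why the iteration agrees with the ordinary matrix resolvent and the row sums of $\lambda P^{\min}(\lambda)$ equal one.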
2.3 Some Sufficient Conditions for Uniqueness
Having the existence theorem in mind, the next step is to study the uniqueness problem. As we will see later, it can happen that a number of jump processes have the same $q$-pair. Hence we would like to know when a given $q$-pair determines precisely one $q$-process. This problem is especially important in practice since what we can figure out in advance is the $q$-pair but not the processes. However, the problem is very difficult in general. A complete answer will be given in the next chapter. Here we introduce some sufficient conditions, which are quite effective in practice. A concrete example kept in mind is Schlögl's model (Example 0.3), for which the next two theorems are available with $\varphi(x) = c\big(1 + (\sum_u x_u)^3\big)$ and $\varphi(x) = c\big(1 + \sum_u x_u\big)$, respectively, for some constant $c$.
In this section, we assume that the given $q$-pair $(q(x), q(x, A))$ is conservative.
Theorem 2.22. Let $\varphi \in r\mathscr{E}$ satisfy $\varphi \ge q$. Then the $q$-process is unique if one of the following conditions holds.
(1) There exists a constant $c \in \mathbb{R}$ such that $\Omega\varphi \le c\varphi$.
(2) There exists a $\lambda_0$ such that $P^{\min}(\lambda_0)\varphi < \infty$.
(3) For each $t \ge 0$, $P^{\min}(t)\varphi < \infty$.
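Condition (1) is often the easiest one to check, since it involves only the $q$-pair. As a hedged illustration (the birth processes below are hypothetical examples, not taken from the text), assume $E = \{0, 1, 2, \ldots\}$ with jumps $i \to i + 1$ at rate $b_i$, so that for a conservative $q$-pair $\Omega\varphi(i) = b_i[\varphi(i+1) - \varphi(i)]$, and take the test function $\varphi(i) = 1 + i$. The sketch checks the inequality $\Omega\varphi \le c\varphi$ on a finite range of states only.

```python
def omega_phi(rate, phi, i):
    """Omega phi(i) = b_i * (phi(i + 1) - phi(i)) for a pure birth q-pair."""
    return rate(i) * (phi(i + 1) - phi(i))


def phi(i):
    return 1.0 + i


def linear_rate(i):      # hypothetical birth rates b_i = i, so q(i) = i
    return float(i)


def quadratic_rate(i):   # hypothetical birth rates b_i = i^2
    return float(i * i)


states = range(0, 2000)

# Linear rates: Omega phi(i) = i <= 1 * phi(i), so condition (1) holds with
# c = 1 on the states checked (and phi >= q there as well).
linear_ok = all(omega_phi(linear_rate, phi, i) <= 1.0 * phi(i) for i in states)

# Quadratic rates: the ratio Omega phi / phi = i^2 / (1 + i) keeps growing,
# so no constant c works for this particular phi.
worst_ratio = max(omega_phi(quadratic_rate, phi, i) / phi(i) for i in states)
```

The failure in the quadratic case only says that this particular $\varphi$ is not a suitable test function there; it also violates $\varphi \ge q$ for $i \ge 2$.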
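Proof: First, we prove that (1) $\Rightarrow$ (2) $\Rightarrow$ uniqueness.
a) (1) $\Rightarrow$ (2). By Theorem 2.21 and Theorem 2.12 (1), it follows that for each $\lambda > 0$, $P^{\min}(\lambda)\varphi$ is the minimal solution to the equation
$$f = \Pi(\lambda)f + \frac{\varphi}{\lambda + q}.$$
But condition (1) implies
$$\frac{\varphi}{\lambda - c} \ge \Pi(\lambda)\frac{\varphi}{\lambda - c} + \frac{\varphi}{\lambda + q}$$
whenever $\lambda > c$. Thus, by the comparison theorem, whenever $\lambda > c \vee 0$, we have
$$P^{\min}(\lambda)\varphi \le \frac{\varphi}{\lambda - c}. \tag{2.28}$$
b) (2) $\Rightarrow$ uniqueness.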
Using the forward equation
and applying the monotone class theorem, we obtain
In particular, setting $\lambda = \lambda_0$ and $f = \lambda_0 + q$, we get
$$P^{\min}(\lambda_0)(\lambda_0 + q) = P^{\min}(\lambda_0)q + 1.$$
Then, condition (2) gives us $\lambda_0 P^{\min}(\lambda_0)1 = 1$. We now prove that this equality indeed holds for all $\lambda > 0$, which certainly implies the uniqueness since $P^{\min}(\lambda)$ is the minimal one. For this, we use the resolvent equation:
$$P(\lambda)1 - P(\mu)1 + (\lambda - \mu)P(\lambda)P(\mu)1 = 0.$$
Thus, if $\mu P(\mu)1 = 1$ for some $\mu > 0$, then it follows that
$$\frac{1}{\mu} = P(\lambda)1 + \frac{\lambda - \mu}{\mu}P(\lambda)1 = \frac{\lambda}{\mu}P(\lambda)1.$$
And so $\lambda P(\lambda)1 = 1$ for all $\lambda > 0$.
Next, we prove that (1) $\Rightarrow$ (3) $\Rightarrow$ uniqueness.
c) (1) $\Rightarrow$ (3). This step is not needed for the proof but included for completeness. Let (1) hold. Then
This implies that $Q\varphi(x) = 0$ whenever $c + q(x) = 0$. We have
$$\varphi\,\big[e^{(c+q)t} - 1\big] \ge \int_0^t e^{(c+q)s}Q\varphi\,ds,$$
or
$$\varphi e^{ct} \ge \int_0^t e^{-q(t-s)}Q\varphi\,e^{cs}\,ds + \varphi e^{-qt}, \qquad t \ge 0.$$
On the other hand, by Theorem 2.21 and Theorem 2.12 (1), it follows that for each $t \ge 0$, $P^{\min}(t)\varphi$ is the minimal solution to the equation
$$f(t) = \int_0^t e^{-q(t-s)}Qf(s)\,ds + \varphi e^{-qt}, \qquad t \ge 0.$$
Combining these two facts and using the comparison theorem, it follows that
$$P^{\min}(t)\varphi \le \varphi e^{ct} < \infty, \qquad t \ge 0, \tag{2.29}$$
which is just (3).
d) (3) $\Rightarrow$ uniqueness. Use the forward equation (F):
Since the $q$-pair is conservative, by Theorem 1.15 and condition (3), it follows that
The null set depends on $x$. By using the forward equation again, in virtue of condition (3) and the dominated convergence theorem, we obtain
$$\frac{d}{dt}P^{\min}(t, x, E) = 0, \qquad \text{a.e. } t.$$
However, the left-hand side is indeed continuous in $t$ by Theorem 1.15, so this equation holds for all $t$. Therefore
$$P^{\min}(t, x, E) = \text{constant} = P^{\min}(0, x, E) = 1, \qquad x \in E.$$
Now, the uniqueness follows from the minimum property of $P^{\min}(t)$. ∎
Definition 2.23. The $q$-pair $(q(x), q(x, A))$ is called bounded if $\sup_{x \in E} q(x) < \infty$.

Corollary 2.24. If a $q$-pair is conservative and bounded, then the $q$-process is unique. That is, $P(t) = e^{t\Omega} = \sum_{n=0}^{\infty}(t\Omega)^n/n!$.

Proof: The conditions for the case (1) of Theorem 2.22 are satisfied with $c = 0$ and $\varphi \equiv M := \sup_{x \in E} q(x)$.
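As a small numerical illustration of Corollary 2.24 (a sketch only, not part of the original argument), the exponential series can be summed directly when the state space is finite. The 3-state matrix `OMEGA` and the helper names below are hypothetical choices made for this example; for a conservative bounded $q$-pair, each row of the resulting $P(t)$ should sum to one.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(omega, t, terms=60):
    """Approximate exp(t * omega) by the partial sum of (t * omega)^k / k!."""
    n = len(omega)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term (t * omega)^k / k!
    for k in range(1, terms):
        term = mat_mul(term, omega)
        term = [[t * x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


# Hypothetical conservative Q-matrix: off-diagonal rates >= 0, rows sum to zero.
OMEGA = [[-2.0, 1.0, 1.0],
         [0.5, -1.0, 0.5],
         [1.0, 3.0, -4.0]]

P = transition_matrix(OMEGA, t=0.7)
row_sums = [sum(row) for row in P]
```

The same helper can be used to check the semigroup property, for instance by comparing `transition_matrix(OMEGA, 0.7)` with the square of `transition_matrix(OMEGA, 0.35)`.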
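Theorem 2.25. Let $\{E_n\}_1^\infty \subset \mathscr{E}$ and $\varphi \in r\mathscr{E}_+$. Suppose that

(1) $E_n \uparrow E$, $\sup_{x \in E_n} q(x) < \infty$, $\lim_{n \to \infty}\inf_{x \notin E_n}\varphi(x) = \infty$, where $\inf\emptyset = \infty$ by the usual convention.
(2) There exists a constant $c \in \mathbb{R}$ such that
$$\Omega\varphi \le c\varphi. \tag{2.30}$$

Then the $q$-process is unique.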
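To see how condition (2.30) can be checked in practice, here is a minimal sketch for a single-site birth-death chain with rates of Schlögl type and $\varphi(x) = 1 + x$. The reduction to one site, the rate constants `LAM1`..`LAM4` and all function names are assumptions made for this illustration only; they are not taken from Example 0.3.

```python
# Hypothetical single-site rates of Schlogl type (all constants are assumptions).
LAM1, LAM2, LAM3, LAM4 = 3.0, 1.0, 1.0, 2.0


def birth(x):
    """Birth rate x -> x + 1: quadratic in x."""
    return LAM1 * x * (x - 1) / 2.0 + LAM4


def death(x):
    """Death rate x -> x - 1: cubic in x, vanishing at x = 0."""
    return LAM2 * x * (x - 1) * (x - 2) / 6.0 + LAM3 * x


def phi(x):
    """Test function phi(x) = 1 + x."""
    return 1.0 + x


def omega_phi(x):
    """(Omega phi)(x) for the birth-death q-pair defined above."""
    return (birth(x) * (phi(x + 1) - phi(x))
            + death(x) * (phi(x - 1) - phi(x)))


def best_constant(limit):
    """Largest ratio (Omega phi)(x) / phi(x) over 0 <= x < limit, with its maximiser."""
    return max((omega_phi(x) / phi(x), x) for x in range(limit))


c, argmax = best_constant(2000)
```

A finite scan does not by itself bound the ratio over all $x$; here the cubic death rate makes $\Omega\varphi$ negative for large $x$, so the maximum over the scanned range is the supremum. With $E_n = \{0, 1, \ldots, n\}$, the rates are bounded on each $E_n$ and $\varphi \to \infty$ off $E_n$, which is the situation of hypothesis (1).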
Proof: The idea of the proof is to use an approximation by a sequence of jump processes with bounded $q$-pairs. For this, let
Then for each $n$, the $q$-pair $(q_n(x), q_n(x, A))$ is bounded and conservative. It determines uniquely a $q$-process, denoted by $P_n(\lambda, x, A)$. Clearly, replacing $c$ and $(q(x), q(x, A))$ with $c_+ = c \vee 0$ and $(q_n(x), q_n(x, A))$, respectively, condition (2.30) remains true. Thus, according to the first step of the previous proof, we have
On the other hand, when $x \in E_n$, we have
But when $x \notin E_n$, we have
Thus, in any case, we have
$\lambda > 0$, $x \in E$, $A \subset E_n$, $n \ge 1$.
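Hence, by Corollary 2.24, Theorem 2.21 and the comparison theorem, we obtain
$$P^{\min}(\lambda, x, A) \ge P_n(\lambda, x, A), \qquad \lambda > 0,\ x \in E,\ A \subset E_n,\ n \ge 1.$$
Combining this with what was mentioned at the beginning of the proof, we arrive at
$$\lambda P^{\min}(\lambda, x, E_n) \ge \lambda P_n(\lambda, x, E_n) = 1 - \lambda P_n(\lambda, x, E_n^c).$$
So we have
$$\lambda P^{\min}(\lambda, x, E) = \lim_{n \to \infty}\lambda P^{\min}(\lambda, x, E_n) = 1, \qquad \lambda > c_+.$$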
It follows that the minimal $q$-process is honest and hence the $q$-process is unique.

Next, we compare the above conditions. The purpose of introducing condition (1) of Theorem 2.22 is mainly for conditions (2) and (3) there. It is clearly more practical and provides the useful estimates (2.28) and (2.29). However, if we pay attention only to the uniqueness, then the first case of Theorem 2.22 is indeed a simple consequence of Theorem 2.25. To see this, taking $E_n = \{x \in E : q(x) < n\}$ and using the same $\varphi$ and $c$ given by Theorem 2.22 (1), it is easy to check that the hypotheses of Theorem 2.25 are satisfied. The next example shows that the conditions for case (1) of Theorem 2.22 are really stronger than the hypotheses of Theorem 2.25.
Example 2.26. Take $E = \{1, 2, \cdots\}$. Let $\{q_1, q_2, \cdots\}$ be the prime numbers in the natural order. Define
Then the hypotheses of Theorem 2.25 are satisfied but not the conditions for case (1) of Theorem 2.22.
Proof: Take
Since
are convergence-equivalent, it follows that $\lim_{i \to \infty}\varphi_i = \infty$. Because $\varphi_i$ is increasing, it is now easy to see that the hypotheses of Theorem 2.25 are satisfied with $E_n = \{1, 2, \cdots, n\}$. Next, we prove that the conditions for case (1) of Theorem 2.22 do not hold. Actually, we can even prove that the conditions for case (2) of Theorem 2.22 do not hold. Note that in the present case, the solution to $(B_\lambda)$ is as follows:
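$$P_{ij}(\lambda) = \frac{q_i \cdots q_{j-1}}{(\lambda + q_i)\cdots(\lambda + q_j)}, \qquad j \ge i.$$
So we have
Since
$$\lim_{j \to \infty} j\Big(\frac{a_j}{a_{j+1}} - 1\Big) = \lambda\lim_{j \to \infty}\frac{j}{q_{j+1}} = 0, \qquad \lambda > 0,$$
it follows that the above series diverges for all $\lambda > 0$ and hence we are done. ∎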
We now introduce a criterion (its special case is Theorem 0.6) which shows that Theorem 2.25 is sharp. For further discussions about Theorem 2.25, see Section 3.2.
Theorem 2.27. The minimal $q$-process is not honest (equivalently, the $q$-processes are not unique) if the inequality $\Omega\varphi \ge c\varphi$ has a solution $\varphi \in b\mathscr{E}$ with $\sup_{x \in E}\varphi(x) > 0$ for some (equivalently, for any) $c > 0$. Conversely, these conditions plus $\varphi \ge 0$ are also necessary.
Proof: Sufficiency. Choose $\lambda_0 > 0$ and $x_0$ so that
Take $E_n = \{x \in E : q(x) \le n\}$ and use the notations given in the proof of Theorem 2.25. By the forward equation $(F_\lambda)$ and the assumption, we have
$$P_n(\lambda)((\lambda + q_n)\varphi) = P_n(\lambda)Q_n\varphi + \varphi = P_n(\lambda)[I_{E_n}Q]\varphi + \varphi \ge P_n(\lambda)(I_{E_n}(c + q_n)\varphi) + \varphi$$
on $E_n$, where $Q_n$ denotes the operator generated by $q_n(x, dy)$. Thus, if the $q$-process were unique, then we would have
$$\varphi \le (\lambda - c)P_n(\lambda)(I_{E_n}\varphi) + \lambda P_n(\lambda)(I_{E_n^c}\varphi) \to (\lambda - c)P^{\min}(\lambda)\varphi \qquad \text{as } n \to \infty.$$
Here, in the last step we have used an approximation lemma (Lemma 5.15) and Theorem 1.14 (2). In particular,
$$\varphi(x_0) \le (\lambda_0 - c)P^{\min}(\lambda_0)\varphi(x_0),$$
which is in contradiction with our assumption.
To prove the other assertions of the theorem, we need much more preparation, which is indeed the main task of the remaining sections in this chapter. The equivalence on non-uniqueness follows from Theorem 2.47. The necessity of the condition follows from Theorem 2.47 and Lemma 3.14. ∎

2.4 Kolmogorov Equations and q-Condition
Starting from this section, we are mainly preparing to establish a uniqueness criterion for conservative $q$-processes, which will be given at the end of Section 2.6. The materials here will also be used in the next chapter. Unless otherwise stated, we adopt the following partition of $E$:
$$E_n = \{x \in E : n - 1 \le q(x) < n\}.$$
Clearly, the $q$-condition given by Corollary 1.31 can be restated as follows.

Corollary 2.28. The $q$-condition holds iff
$$\lim_{\lambda \to \infty}\lambda[\lambda P(\lambda, x, A) - \delta(x, A)] = q(x, A) - q(x)\delta(x, A)$$
for all $n \ge 1$ and all $A \in \mathscr{E} \cap E_n$.
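The limit in the $q$-condition can be observed numerically on the simplest possible example. The sketch below assumes a two-state chain with jump rates `a` and `b` (a hypothetical example, not one from the text), whose resolvent $(\lambda I - Q)^{-1}$ is available in closed form, and measures how far $\lambda[\lambda P(\lambda, x, \{y\}) - \delta(x, \{y\})]$ is from the corresponding entry of the $Q$-matrix.

```python
def resolvent(lam, a, b):
    """Resolvent (lam*I - Q)^{-1} of the two-state chain Q = [[-a, a], [b, -b]]."""
    det = lam * (lam + a + b)
    return [[(lam + b) / det, a / det],
            [b / det, (lam + a) / det]]


def q_condition_gap(lam, a, b):
    """Max over (x, y) of |lam*(lam*P(lam,x,{y}) - delta(x,y)) - Q[x][y]|."""
    q_matrix = [[-a, a], [b, -b]]
    p = resolvent(lam, a, b)
    gap = 0.0
    for x in range(2):
        for y in range(2):
            delta = 1.0 if x == y else 0.0
            lhs = lam * (lam * p[x][y] - delta)
            gap = max(gap, abs(lhs - q_matrix[x][y]))
    return gap


# Hypothetical rates a = 2, b = 3; the gap should shrink as lam grows.
GAPS = [q_condition_gap(lam, 2.0, 3.0) for lam in (1e1, 1e3, 1e5)]
```

Here the off-diagonal entries of `q_matrix` play the role of $q(x, \{y\})$ and the diagonal entries equal $q(x, \{x\}) - q(x) = -q(x)$, matching the right-hand side of the corollary.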
The main purpose of this section is to show that if one of the Kolmogorov equations holds, then the $q$-condition is automatic. For this, we introduce the following alternative forms of the equations.

Proposition 2.29. For each $n \ge 1$, define
Then $(B_\lambda)$ (resp. $(F_\lambda)$) holds iff $(B_n)$ (resp. $(F_n)$) holds for every $n \ge 1$.

Proof: Note that $(F_n)$ holds for every $n$ iff
Both sides are measures in $A$. Integrating with respect to the function $(\lambda + q)^{-1}I_A$, we obtain $(F_\lambda)$. Conversely, considering both sides of $(F_\lambda)$ as measures and integrating with respect to the function $(\lambda + q)I_A$, we obtain (2.31). The proof for the other half is similar. ∎
Theorem 2.30. Let $P(\lambda, x, A)$ satisfy conditions (1), (2) and (3) of Lemma 1.28. If the backward or the forward Kolmogorov equation holds, then the $q$-condition is satisfied and hence $P(\lambda, x, A)$ is a $q$-process.

Proof: The "backward" case was actually proved in Theorem 1.15 (3). Now, assume that $(F_n)$ holds for all $n$. Set $G_n = \bigcup_{m=1}^n E_m$. Obviously, $(F_n)$ remains true if we replace $E_n$ with $G_n$. Since $G_n \uparrow E$, for given $x$, we may choose $G_n \ni x$. By $(F_n)$, we have
Hence $\lim_{\lambda \to \infty}\lambda P(\lambda, x, \{x\}) = 1$. Furthermore
$$\lim_{\lambda \to \infty}\lambda P(\lambda, x, A) \le \lim_{\lambda \to \infty}[1 - \lambda P(\lambda, x, \{x\})] = 0, \qquad x \notin A.$$
This proves the continuous condition. We now prove the $q$-condition. By $(F_n)$, for each $\lambda > 0$ and $A \in \mathscr{E} \cap G_n$, we have
$$|P(\lambda)QI_A| \le 1 + \sup_{y \in A}(\lambda + q(y))/\lambda. \tag{2.32}$$
Next, by the resolvent equation,
On the other hand, for fixed $x \in E$ and $A \in \mathscr{E}$, $P(\lambda, x, A)$ is decreasing in $\lambda$. Hence for $\lambda \ge \lambda_0 > 0$, we have
$$P(\lambda)QI_A \le P(\lambda)[I_{G_n}Q]I_A + P(\lambda_0)[I_{G_n^c}Q]I_A, \qquad A \in \mathscr{E} \cap G_m.$$
Letting $\lambda \to \infty$ and then $n \to \infty$, by (2.32) and the dominated convergence theorem, it follows that $\lim_{\lambda \to \infty}P(\lambda)QI_A = 0$ for all $A \in \mathscr{E} \cap G_m$. Furthermore, by (2.32), we can prove that $\lim_{\lambda \to \infty}\mu P(\mu)P(\lambda)QI_A = 0$. Thus, we obtain
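On the one hand, we have proved the continuity: $\lim_{\lambda \to \infty}\lambda P(\lambda) = I$. So by Theorem 1.14 (4) we have
$$\liminf_{\lambda \to \infty}\lambda P(\lambda)QI_A \ge QI_A \ge 0, \qquad A \in \mathscr{E} \cap G_n,\ n \ge 1. \tag{2.34}$$
On the other hand, by (2.33), (2.34) and using Theorem 1.14 (4) again, we have
Since $P(\mu, x, \{x\}) > 0$, this shows that
$$\limsup_{\lambda \to \infty}\lambda P(\lambda)QI_A \le QI_A, \qquad A \in \mathscr{E} \cap G_n,\ n \ge 1. \tag{2.35}$$
Combining (2.34) with (2.35), we obtain
$$\lim_{\lambda \to \infty}\lambda P(\lambda)QI_A = QI_A, \qquad A \in \mathscr{E} \cap G_n,\ n \ge 1. \tag{2.36}$$
Therefore, by $(F_n)$ and Theorem 1.14, we have
which is just the $q$-condition.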
2.5 Entrance Space and Exit Space
Suppose that $P(\lambda, x, A)$ is a $q$-process satisfying the equation $(B_\lambda)$. Then for fixed $\lambda > 0$ and $A$, $P(\lambda, \cdot, A) - P^{\min}(\lambda, \cdot, A)$ is a non-negative bounded solution to the homogeneous equation
$$f = \Pi(\lambda)f, \qquad f \in b\mathscr{E}.$$
Thus, if the $q$-processes are not unique, then the above equation should have a non-trivial solution. The solutions to the equation constitute the exit space. This explains intuitively why we introduce this exit space and how it is related to the uniqueness problem. Similarly, if we consider the jump processes satisfying the forward equation, we will need the entrance space. The main purpose of this section is to prove that the dimensions of these spaces are independent of $\lambda > 0$. For this, we need a lot of preparation. The related notations will become clear in the proof of Lemma 2.43. Introduce a kernel $R(\lambda, \mu)$ ($\lambda, \mu > 0$), not necessarily non-negative, on $(E, \mathscr{E})$ as follows:
$$R(\lambda, \mu; x, A) = \delta(x, A) + (\lambda - \mu)P^{\min}(\mu, x, A), \qquad x \in E,\ A \in \mathscr{E}. \tag{2.37}$$
Equivalently
$$R(\lambda, \mu) = I + (\lambda - \mu)P^{\min}(\mu). \tag{2.38}$$
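By the resolvent equation, it is easy to prove the following result.

Lemma 2.31. We have
$$R(\lambda, \mu)R(\mu, \nu) = R(\mu, \nu)R(\lambda, \mu) = R(\lambda, \nu), \tag{2.39}$$
$$P^{\min}(\mu)R(\mu, \lambda) = R(\mu, \lambda)P^{\min}(\mu) = P^{\min}(\lambda). \tag{2.40}$$
In particular,
$$R(\lambda, \mu)R(\mu, \lambda) = I. \tag{2.41}$$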
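For the reader's convenience, here is one way the identity $R(\lambda, \mu)R(\mu, \lambda) = I$ and the relation $P^{\min}(\mu)R(\mu, \lambda) = P^{\min}(\lambda)$ can be verified from the definition $R(\lambda, \mu) = I + (\lambda - \mu)P^{\min}(\mu)$. This is only a sketch under the assumption that $P(\lambda) := P^{\min}(\lambda)$ satisfies the resolvent equation, so that $P(\lambda)$ and $P(\mu)$ commute.

```latex
% Assumption: P(\lambda) := P^{\min}(\lambda) satisfies the resolvent equation
%   P(\lambda) - P(\mu) + (\lambda - \mu) P(\lambda) P(\mu) = 0,
% and hence P(\lambda) P(\mu) = P(\mu) P(\lambda).
\begin{align*}
R(\lambda,\mu)\,R(\mu,\lambda)
  &= \bigl[I + (\lambda-\mu)P(\mu)\bigr]\bigl[I + (\mu-\lambda)P(\lambda)\bigr] \\
  &= I + (\lambda-\mu)\bigl[P(\mu) - P(\lambda)\bigr]
       - (\lambda-\mu)^2 P(\mu)P(\lambda) \\
  &= I + (\lambda-\mu)\bigl[P(\mu) - P(\lambda)
       - (\lambda-\mu)P(\lambda)P(\mu)\bigr] \\
  &= I, \\[4pt]
P(\mu)\,R(\mu,\lambda)
  &= P(\mu) + (\mu-\lambda)P(\mu)P(\lambda) \\
  &= P(\mu) + \bigl[P(\lambda) - P(\mu)\bigr] \\
  &= P(\lambda).
\end{align*}
```

In both computations the bracket is simplified with the resolvent equation, which is the only input needed.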
The following notion is again due to the resolvent equation.

Definition 2.32.
(1) A family of functions $\{f_\lambda \in b\mathscr{E}_+ : \lambda > 0\}$ is called consistent, if for all $\lambda, \mu > 0$, we have $f_\lambda = R(\mu, \lambda)f_\mu$.
(2) A family of measures $\{\varphi_\lambda \in \mathscr{L}_+ : \lambda > 0\}$ is called consistent, if for all $\lambda, \mu > 0$, we have $\varphi_\lambda = \varphi_\mu R(\mu, \lambda)$.
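Next, define
$$\mathscr{U}_\lambda = \{f \in b\mathscr{E}_+ : (\lambda I - \Omega)f = 0\},$$
$$\mathscr{V}_\lambda = \{\varphi \in \mathscr{L}_+ : \varphi(A) = \varphi Q((\lambda + q)^{-1}I_A),\ A \in \mathscr{E}\},$$
$$\mathscr{V}'_\lambda = \{\varphi \in \mathscr{L}_+ : \varphi(\lambda I - \Omega)I_A = 0 \text{ for all } A \in \mathscr{E} \cap E_n \text{ and } n \ge 1\}, \qquad \lambda > 0.$$
By using the method of the proof of Proposition 2.29, we obtain

Lemma 2.33. For every $\lambda > 0$, we have $\mathscr{V}_\lambda = \mathscr{V}'_\lambda$.

Lemma 2.34. Fix $\lambda > 0$.
(1) If $f, g \in r\mathscr{E}_+$ and
$$(\lambda I - \Omega)f \ge g, \tag{2.42}$$
then
$$f \ge P^{\min}(\lambda)g. \tag{2.43}$$
Moreover, if the sign of the equality in (2.42) holds and $\mathscr{U}_\lambda = \{0\}$, then so does (2.43).
(2) If $\varphi, \nu \in \mathscr{L}_+$, $\varphi(E_n) < \infty$, $\nu(E_n) < \infty$ for all $n$ and
$$\varphi(\lambda I - \Omega)I_A \ge \nu(A) \qquad \text{for all } A \in \mathscr{E} \cap E_n \text{ and } n \ge 1, \tag{2.44}$$
then
$$\varphi \ge \nu P^{\min}(\lambda). \tag{2.45}$$
Furthermore, if $\varphi(E) < \infty$, the sign of the equality in (2.44) holds and $\mathscr{V}_\lambda = \{0\}$, then so does (2.45).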
Proof: Since the proofs for (1) and (2) are similar, here we prove (2) only. Note that
By (2.44), we have
$$\varphi[(\lambda + q)I] \ge \varphi Q + \nu.$$
Postmultiplying by $(\lambda + q)^{-1}I_A$, we see that $\varphi$ is a solution to the equation
$$\varphi \ge \varphi Q[(\lambda + q)^{-1}I] + \nu[(\lambda + q)^{-1}I]. \tag{2.46}$$
On the other hand, since $P^{\min}(\lambda, x, \cdot)$ is the minimal solution to $(F_\lambda)$, by Theorem 2.12 (2), it follows that $\nu P^{\min}(\lambda)$ is the minimal solution to (2.46), replacing the inequality by equality. Thus, the first assertion of (2) follows by applying the comparison theorem. Moreover, if the equality in (2.44) holds and $\varphi(E) < \infty$, then we have $\varphi - \nu P^{\min}(\lambda) \in \mathscr{V}_\lambda$, and so the last assertion is obvious. ∎
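Definition 2.35.
(1) A family of functions $\{f_k\}_{k=1}^n \subset b\mathscr{E}_+$ is called linearly independent, if for any $\{c_k\}_{k=1}^n \subset \mathbb{R}$, $\sum_{k=1}^n c_k f_k = 0 \Rightarrow c_k = 0$, $k = 1, 2, \cdots, n$.
(2) A family of measures $\{\varphi_k\}_{k=1}^n \subset \mathscr{L}_+$ is called linearly independent, if for any $\{c_k\}_{k=1}^n \subset \mathbb{R}$, $\sum_{k=1}^n c_k\varphi_k = 0 \Rightarrow c_k = 0$, $k = 1, 2, \cdots, n$.

As usual, the number
$$\sup\{n : \{f_k\}_{k=1}^n \subset \mathscr{U}_\lambda \text{ is linearly independent}\}$$
is called the dimension of $\mathscr{U}_\lambda$, denoted by $\dim\mathscr{U}_\lambda$. Similarly, we can define $\dim\mathscr{V}_\lambda$.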
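Lemma 2.36.
(1) Given $\lambda_0 > 0$ and $f_{\lambda_0} \in \mathscr{U}_{\lambda_0}$. Set $f_\mu = R(\lambda_0, \mu)f_{\lambda_0}$, $\mu > 0$. Then $f_\mu \in \mathscr{U}_\mu$ and $\{f_\mu : \mu > 0\}$ is consistent.
(2) Given $\lambda_0 > 0$ and $\varphi_{\lambda_0} \in \mathscr{V}_{\lambda_0}$. Set $\varphi_\mu = \varphi_{\lambda_0}R(\lambda_0, \mu)$, $\mu > 0$. Then $\varphi_\mu \in \mathscr{V}_\mu$ and $\{\varphi_\mu : \mu > 0\}$ is consistent.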
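Proof: a) Prove that $f_\mu \ge 0$ for all $\mu > 0$. Actually, for $\mu \le \lambda_0$, we have
$$f_\mu = f_{\lambda_0} + (\lambda_0 - \mu)P^{\min}(\mu)f_{\lambda_0} \ge 0.$$
But for $\mu \ge \lambda_0$, since $f_{\lambda_0} \in \mathscr{U}_{\lambda_0}$, we have
$$(\mu I - \Omega)f_{\lambda_0} = (\mu - \lambda_0)f_{\lambda_0}.$$
Thus, Lemma 2.34 (1) implies that
$$f_{\lambda_0} \ge (\mu - \lambda_0)P^{\min}(\mu)f_{\lambda_0}.$$
Hence we also have
$$f_\mu = f_{\lambda_0} + (\lambda_0 - \mu)P^{\min}(\mu)f_{\lambda_0} \ge 0.$$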
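b) Prove the measurability. Because $f_{\lambda_0}, R(\lambda_0, \mu)I_A \in b\mathscr{E}$, so does $f_\mu = R(\lambda_0, \mu)f_{\lambda_0}$ by the monotone class theorem.
c) Prove that $f_\mu \in \mathscr{U}_\mu$, $\mu > 0$. Since $P^{\min}(\lambda)I_A$ satisfies $(B_\lambda)$ and $f_{\lambda_0} \in \mathscr{U}_{\lambda_0}$, it follows that
$$\begin{aligned}
Qf_\mu &= QR(\lambda_0, \mu)f_{\lambda_0} = Qf_{\lambda_0} + (\lambda_0 - \mu)QP^{\min}(\mu)f_{\lambda_0} \\
&= (\lambda_0 + q)f_{\lambda_0} + (\lambda_0 - \mu)(\mu + q)P^{\min}(\mu)f_{\lambda_0} - (\lambda_0 - \mu)f_{\lambda_0} \\
&= (\mu + q)[f_{\lambda_0} + (\lambda_0 - \mu)P^{\min}(\mu)f_{\lambda_0}] \\
&= (\mu + q)f_\mu, \qquad \mu > 0.
\end{aligned}$$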
d) The consistency of $\{f_\mu : \mu > 0\}$ follows from (2.39). Combining this with a), b) and c), we obtain assertion (1).
The proof of (2) is quite similar. Clearly, $\varphi_\mu \in \mathscr{L}$ ($\mu > 0$). Next, for $\mu \le \lambda_0$, we certainly have $\varphi_\mu \ge 0$. Conversely, for $\mu \ge \lambda_0$, since $\varphi_{\lambda_0} \in \mathscr{V}'_{\lambda_0}$, it follows that
Thus, by Lemma 2.34 (2), we have
Hence
Therefore we always have $\varphi_\mu \ge 0$ and furthermore $\varphi_\mu \in \mathscr{L}_+$. The consistency of $\{\varphi_\mu\}$ is obvious. Now, by using $(F_\lambda)$ and the fact that $\varphi_{\lambda_0} \in \mathscr{V}'_{\lambda_0}$, we obtain
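Theorem 2.37. Both $\dim\mathscr{U}_\lambda$ and $\dim\mathscr{V}_\lambda$ are independent of $\lambda > 0$.

Proof: Let $\{f_\lambda^{(k)}\}_{k=1}^n \subset \mathscr{U}_\lambda$ be linearly independent. Set
By Lemmas 2.36 (1) and 2.31, we have
Thus, the independence of $\{f_\lambda^{(k)}\}$ implies the independence of $\{f_\mu^{(k)}\}$. In other words, $\dim\mathscr{U}_\mu \ge \dim\mathscr{U}_\lambda$. But $\lambda$ and $\mu$ are symmetric, hence $\dim\mathscr{U}_\lambda = \dim\mathscr{U}_\mu$. Similarly, we can prove that $\dim\mathscr{V}_\lambda = \dim\mathscr{V}_\mu$. ∎

In view of the above theorem, we can use $\dim\mathscr{U}$ and $\dim\mathscr{V}$ instead of $\dim\mathscr{U}_\lambda$ and $\dim\mathscr{V}_\lambda$ respectively.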
Definition 2.38. We call $\mathscr{U}_\lambda$ and $\mathscr{V}_\lambda$ the exit space and the entrance space respectively. The elements of $\mathscr{U}_\lambda$ are called the exit solutions. If $\dim\mathscr{U} < \infty$, it is said to be finite exit. In particular, corresponding to $\dim\mathscr{U} = 0$ and $\dim\mathscr{U} = 1$, we have zero-exit and single-exit. Similarly, we can define the entrance solutions, finite entrance, zero-entrance and single-entrance.

For each $x \in E$, set $d(x) = q(x) - q(x, E)$, which is called the non-conservative quantity at $x$.

Lemma 2.39. For each $\lambda > 0$, $\{z_\lambda(x) := 1 - \lambda P^{\min}(\lambda, x, E) : x \in E\}$ is the maximal solution to the equation
$$u = \Pi(\lambda)u + d/(\lambda + q), \qquad 0 \le u \le 1. \tag{2.47}$$
Moreover, it can be obtained by the following procedure. Let
$$z_\lambda^{(1)} = 1, \qquad z_\lambda^{(n+1)} = \Pi(\lambda)z_\lambda^{(n)} + d/(\lambda + q), \qquad n \ge 1. \tag{2.48}$$
Then $z_\lambda^{(n)} \downarrow z_\lambda$ as $n \to \infty$. In particular, if the given $q$-pair is conservative, then for each $\lambda > 0$, $z_\lambda$ is the maximal solution to the equation
$$u = \Pi(\lambda)u, \qquad 0 \le u \le 1. \tag{2.49}$$
Furthermore, $z_\lambda = 0$ iff $\dim\mathscr{U}_\lambda = 0$.
Proof: By Theorems 2.21 and 2.12 (1), it follows that $\lambda P^{\min}(\lambda, \cdot, E)$ is the minimal solution to the equation
$$u = \Pi(\lambda)u + \lambda/(\lambda + q).$$
Noting that $0 \le \lambda P^{\min}(\lambda, \cdot, E) \le 1$, the first assertion follows from Theorem 2.2. The proof for the other assertions is easy.
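The monotone iteration behind the maximal solution can be tried on a pure-birth chain, where everything is explicit. The sketch below is an illustration under assumptions of our own: the state space is $\{0, 1, 2, \ldots\}$, the only jumps are $k \to k + 1$ at rate $q_k$ (so the $q$-pair is conservative and $\Pi(\lambda)u(k) = q_k u(k+1)/(\lambda + q_k)$), and the iteration starts from the constant function 1. The function and variable names are hypothetical.

```python
import math


def exit_iterates(rate, lam, start, steps):
    """Iterates of u -> Pi(lam) u at state `start`, beginning with u = 1.

    For a pure-birth chain, Pi(lam) u(k) = q_k / (lam + q_k) * u(k + 1), so
    applying the map n times to the constant 1 gives, at `start`, the product
    of q_k / (lam + q_k) over k = start, ..., start + n - 1.
    """
    values = [1.0]
    for k in range(start, start + steps):
        q = rate(k)
        values.append(values[-1] * q / (lam + q))
    return values


def quadratic_rate(k):
    """Rates q_k = (k + 1)^2: the sum of 1/q_k converges."""
    return float((k + 1) ** 2)


def linear_rate(k):
    """Rates q_k = k + 1: the sum of 1/q_k diverges."""
    return float(k + 1)


z_quadratic = exit_iterates(quadratic_rate, lam=1.0, start=0, steps=5000)
z_linear = exit_iterates(linear_rate, lam=1.0, start=0, steps=5000)
```

For the quadratic rates the iterates decrease to a strictly positive limit (by the classical product formula for $\sinh$, the limit at state 0 with $\lambda = 1$ is $\pi/\sinh\pi$), so the exit space is non-trivial and the minimal process is not honest. For the linear rates the iterates tend to zero, in agreement with the zero-exit case.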
Theorem 2.40. Let the given $q$-pair be conservative. Then the minimal $q$-process is honest iff $\dim\mathscr{U}_\lambda = 0$. Equivalently, $z_\lambda = 0$ for some (equivalently, for all) $\lambda > 0$.

Corollary 2.41. The minimal $q$-process is honest iff its $q$-pair is conservative and zero-exit.

Proof: If the $q$-pair is not conservative, then by $(B_\lambda)$, we see that the minimal $q$-process is non-honest.
2.6 Construction of q-Processes with Single-Exit q-Pair
Given a $q$-pair, when the $q$-processes are not unique, it is interesting to construct all $q$-processes. In this section, we deal with a special case, i.e., the given $q$-pair is single-exit. As we mentioned at the beginning of the last section, if $P(\lambda, x, A)$ is a $q$-process satisfying $(B_\lambda)$, then $B(\lambda, \cdot, A) = P(\lambda, \cdot, A) - P^{\min}(\lambda, \cdot, A)$ is an element in $\mathscr{U}_\lambda$. Thus, if $\dim\mathscr{U} = 1$ and the $q$-pair is conservative, then $z_\lambda$ is the maximal element in $\mathscr{U}_\lambda$, and hence there should exist a non-negative function $\varphi_\lambda(A)$ depending on $\lambda$ and $A$ only such that $B(\lambda, \cdot, A) = z_\lambda\varphi_\lambda(A)$. Moreover, $\varphi_\lambda \in \mathscr{L}_+$. Hence, $P(\lambda, x, A)$ should have the form
$$P(\lambda, x, A) = P^{\min}(\lambda, x, A) + z_\lambda(x)\varphi_\lambda(A). \tag{2.50}$$
The main purpose of this section is to describe further $\varphi_\lambda$ so that $P(\lambda)$ defined by (2.50) is actually a $q$-process. For later use, we allow the $q$-pair to be general for a while. We begin our study with a simple observation as follows.
Lemma 2.42.
(1) The family $\{z_\lambda : \lambda > 0\}$ is consistent: $z_\mu = z_\lambda + (\lambda - \mu)P^{\min}(\lambda)z_\mu$.
(2) Let $\{u_\lambda : \lambda > 0\}$ be a consistent family of functions, then $u_\lambda \ne 0$ or $u_\lambda = 0$ is independent of $\lambda > 0$. Moreover, $u_\lambda \downarrow 0$ as $\lambda \uparrow \infty$.
(3) If $P(\lambda)$ defined by (2.50) is a jump process, then $\lambda\varphi_\lambda(E) \le 1$ for all $\lambda > 0$.

Proof: The consistency of $\{z_\lambda : \lambda > 0\}$ is due to the fact that $P^{\min}(\lambda)$ satisfies the resolvent equation. Assertion (2) is easy to check. Assertion (3) is obvious because of the normal condition of jump processes. ∎

Because of Lemma 2.42 (2), we may define $u := \lim_{\lambda \to 0}u_\lambda$. In particular, $z := \lim_{\lambda \to 0}z_\lambda \le 1$.

In order to guarantee that $P(\lambda)$ defined by (2.50) is a jump process, the key point is the resolvent equation, which is solved by the next lemma.

Lemma 2.43. Let $\{u_\lambda : \lambda > 0\}$ be a consistent family of functions, $0 \ne u_\lambda \le 1$ and $\varphi_\lambda \in \mathscr{L}_+$, $\lambda\varphi_\lambda(E) \le 1$ for all $\lambda > 0$. Then $P(\lambda) = P^{\min}(\lambda) + u_\lambda\varphi_\lambda$ satisfies the resolvent equation iff there exists a consistent family $\{\eta_\lambda : \lambda > 0\}$ of measures such that $\varphi_\lambda = m_\lambda\eta_\lambda$, where $m_\lambda > 0$ and $m_\lambda^{-1} - \lambda\eta_\lambda u =: c \ge 0$ is independent of $\lambda > 0$.
Proof: a) Because $P^{\min}(\lambda)$ satisfies the resolvent equation, it is easy to check that $P(\lambda)$ defined above satisfies the same equation iff
In terms of the operator $R$, this can be re-expressed as follows.
The consistency of $\{u_\lambda\}$ shows that this is equivalent to
or
$$\varphi_\lambda R(\lambda, \mu) - \varphi_\mu + (\lambda - \mu)\varphi_\lambda u_\mu\,\varphi_\mu = 0. \tag{2.51}$$
Here we have used the fact that $u_\lambda \ne 0$.
b) Next, if $\varphi_\lambda = 0$ for some $\lambda$, then $\varphi_\mu = 0$ for all $\mu > 0$ by (2.51), and so $P(\lambda) = P^{\min}(\lambda)$. Hence we have nothing to do in this trivial case. Assume that $\varphi_{\mu_0} \ne 0$ for some $\mu_0$. Then $\varphi_\mu \ne 0$ for all $\mu > 0$. Let
$$\eta_\lambda = \varphi_{\mu_0}R(\mu_0, \lambda), \qquad \lambda > 0.$$
Then $\{\eta_\lambda : \lambda > 0\}$ is consistent. In particular, taking $\lambda = \mu_0$ in (2.51), we see that
$$\eta_\mu(A) = [1 + (\mu - \mu_0)\varphi_{\mu_0}u_\mu]\,\varphi_\mu(A).$$
Note that $[\cdots]$ does not depend on $A$. When $\varphi_{\mu_0}u_\mu = 0$, then $1 + (\mu - \mu_0)\varphi_{\mu_0}u_\mu = 1 > 0$. Otherwise,
Here we have used the condition $\lambda\varphi_\lambda(E) \le 1$. Therefore, we always have $1 + (\mu - \mu_0)\varphi_{\mu_0}u_\mu > 0$, $\mu > 0$. Thus, we may define
$$\varphi_\lambda = m_\lambda\eta_\lambda, \qquad m_\lambda > 0,\ \lambda > 0. \tag{2.52}$$
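c) Substituting (2.52) into (2.51), it follows that $\{m_\lambda\}$ must satisfy
Set
That is
Then we have
$$m_\mu^{-1} - \mu v(\lambda, \mu) = m_\lambda^{-1} - \lambda v(\lambda, \mu). \tag{2.53}$$
On the other hand, by the consistency of $\{u_\lambda\}$, we have
$$u_\lambda = u_\mu + (\mu - \lambda)P^{\min}(\lambda)u_\mu,$$
and so
$$u_\lambda = u - \lambda P^{\min}(\lambda)u. \tag{2.54}$$
Next,
$$\begin{aligned}
\lambda\eta_\lambda u &= \lambda\eta_\mu R(\mu, \lambda)u = \eta_\mu(\lambda u + (\mu - \lambda)\lambda P^{\min}(\lambda)u) \\
&= \eta_\mu(\lambda u + (\mu - \lambda)(u - u_\lambda)) \\
&= \mu\eta_\mu u - (\mu - \lambda)\eta_\mu u_\lambda.
\end{aligned} \tag{2.55}$$
That is
$$(\mu - \lambda)v(\lambda, \mu) = (\mu - \lambda)\eta_\lambda u_\mu = \mu\eta_\mu u - \lambda\eta_\lambda u.$$
Substituting this into (2.53) we obtain $m_\mu^{-1} - \mu\eta_\mu u = m_\lambda^{-1} - \lambda\eta_\lambda u$. Hence, $m_\lambda^{-1} - \lambda\eta_\lambda u =: c$ is independent of $\lambda > 0$.
d) So far, we have proved that the hypotheses are necessary. The sufficiency is easier, simply substituting $\varphi_\lambda$ given by (2.52) into (2.51) and using the consistency of $\{\eta_\lambda\}$. ∎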
Lemma 2.44. Given a consistent family $\{\eta_\lambda\}$ of measures. Set $\varphi_\lambda = \eta_\lambda/[c + \lambda\eta_\lambda(E)]$ with $c \ge 0$. Then $P(\lambda)$ defined by (2.50) is a jump process and it is indeed a $q$-process satisfying $(B_\lambda)$ if the $q$-pair is conservative.

Proof: In general, since $P^{\min}(\lambda)$ is a jump process, the continuous condition for $P(\lambda)$ follows from Lemma 2.42 (2). In the conservative case, since $z_\lambda \in \mathscr{U}_\lambda$, $P(\lambda)$ satisfies $(B_\lambda)$. Hence the $q$-condition and so the continuous condition are automatic.

Combining the above lemmas, we obtain a complete construction theorem.
Theorem 2.45. Given a conservative and single-exit $q$-pair, every $q$-process can be obtained by the following procedure. Choose an arbitrary consistent family $\{\eta_\lambda\}$ of measures and a constant $c \ge 0$ and then set
$$P(\lambda, x, A) = P^{\min}(\lambda, x, A) + \frac{z_\lambda(x)\eta_\lambda(A)}{c + \lambda\eta_\lambda(E)},$$
where $0/0 = 0$ by convention. The $q$-process is honest iff $c = 0$.

Corollary 2.46. Let the $q$-pair be conservative. Given $\varphi \in \mathscr{L}_+$ with $\varphi P^{\min}(\lambda)(E) < \infty$, then for every $c \ge 0$,
$$P(\lambda) := P^{\min}(\lambda) + z_\lambda\varphi P^{\min}(\lambda)/[c + \lambda\varphi P^{\min}(\lambda)(E)], \qquad \lambda > 0, \tag{2.56}$$
is a $q$-process.

Proof: We need only to check that $\{\varphi P^{\min}(\lambda) : \lambda > 0\}$ is consistent. But this follows from (2.40). ∎

Combining Corollary 2.46 with Theorem 2.40, we obtain at last the following uniqueness criterion.
Theorem 2.47. Given a conservative $q$-pair, the $q$-process is unique iff $\dim\mathscr{U} = 0$. Equivalently, the minimal $q$-process is honest.
2.7 Notes

For Markov chains, the theory of non-negative solutions to a system of equations with non-negative coefficients is due to Hou and Guo (1979), which goes back to Kantorovich and Krylov (1962). The general presentation here is taken from Chen (1979). Theorem 2.12 is taken from Chen (1980), Chen and Zheng (1983). The backward and forward Kolmogorov equations were first introduced by Kolmogorov (1931). In a special case, Theorem 2.21 was obtained by Feller (1940), sometimes called Feller's construction in the literature; a generalization was obtained by Hu (1966). Here, we study the general case in detail. The equation $(F_\lambda)$ was introduced by Chen (1980); it is locally equivalent to the ordinary (F). The equivalent result Proposition 2.29 is taken from Chen and Zheng (1983). Theorem 2.22 and Theorem 2.25 first appeared in Chen (1986a,b). The latter one is an analogue of Stroock and Varadhan (1979). Basis (1976, 1980) used a stronger version of the conditions for the cases (1) and (3) of Theorem 2.22. For Theorem 2.25, the author benefited from a conversation with S. Z. Tang. Example 2.26 is due to J. L. Zheng (oral communication). It should be pointed out that a slight modification of Theorem 2.25 is available for time-inhomogeneous jump processes. See Zheng and Zheng (1986), Zheng (1993). For more related results, refer also to Chebotarev (1988), Hamza and Klebaner (1995), Kersting and Klebaner (1995). For the applications to quantum mechanics, refer to the survey article by Konstantinov, Maslov and Chebotarev (1990). The materials of Section 2.4 are mainly taken from Chen and Zheng (1983). For Markov chains, the part of Theorem 2.30 related to $(B_n)$ is due to Feller (1957) and Reuter (1957), and the general form is due to Hu (1966). For Markov chains, the other part of Theorem 2.30 related to $(F_n)$ is due to Hou (1982). Here is the natural generalization of the previous results.
The author learnt the name "consistent family" from Yang (1981). For Markov chains, corresponding to Corollary 2.46, there is a probabilistic construction, due to Doob (1945), and hence called Doob's construction. In the same situation, Theorem 2.47 is due to Feller (1957) and Reuter (1957), and the general form is due to Hu (1966). Essentially, Lemma 2.43 is due to Reuter (1959).
Chapter 3
Uniqueness Criteria

This chapter begins with the study of the uniqueness problem for the jump processes satisfying the backward or forward Kolmogorov equations. Then we study the uniqueness problem for general jump processes. Besides, we introduce some applications of the uniqueness criteria.

3.1 Uniqueness Criteria Based on Kolmogorov Equations

As we will see later, the Kolmogorov equations play a special role in the study of the uniqueness problem. The problem becomes easier if we restrict ourselves to those jump processes satisfying one of the equations. This is the reason why we first deal with these cases.

Definition 3.1. A $q$-process satisfying $(B_\lambda)$ (resp. $(F_\lambda)$) is called a $Bq$-process (resp. $Fq$-process).

Theorem 3.2. The $Bq$-process is unique iff $\dim\mathscr{U} = 0$.
Proof: Non-uniqueness $\Rightarrow$ $\dim\mathscr{U} \ne 0$. This is clear as we explained at the beginning of Section 2.5. We now prove that "$\dim\mathscr{U} \ne 0 \Rightarrow$ non-uniqueness" by reducing the non-conservative case to the conservative one. For a different proof, see Proposition 6.28.
a) Note that a given $q$-pair $(q(x), q(x, dy))$ induces naturally a conservative $q$-pair $(q_\Delta(x), q_\Delta(x, dy))$ on the enlarged state space $E_\Delta = E \cup \{\Delta\}$, $\mathscr{E}_\Delta = \sigma(\mathscr{E} \cup \{\Delta\})$ as follows:
$$q_\Delta(x, A) = q(x, A \setminus \{\Delta\}) + I_A(\Delta)d(x), \quad x \in E,\ A \in \mathscr{E}_\Delta; \qquad q_\Delta(x) = q(x), \quad x \in E; \qquad q_\Delta(\Delta) = 0.$$
By the last condition, for any $q$-process $P_\Delta(\lambda)$ with $q$-pair $(q_\Delta(x), q_\Delta(x, dy))$, we have $P_\Delta(\lambda, \Delta, \{\Delta\}) = 1/\lambda$ and so $P_\Delta(\lambda, \Delta, E) = 0$ for all $\lambda > 0$. This plus Theorem 1.15 (1) implies that the restriction $P(\lambda)$ of $P_\Delta(\lambda)$ to $(E, \mathscr{E})$ is not only a jump process but also satisfies the backward equation $(B_\lambda)$ corresponding to the $q$-pair $(q(x), q(x, dy))$, and hence is a $Bq$-process by Theorem 2.30.
b) It is easy to check that the $q$-pair $(q_\Delta(x), q_\Delta(x, dy))$ is zero-exit iff so is the $q$-pair $(q(x), q(x, dy))$. Now, by the assumption and Corollary 2.46, there are infinitely many $q$-processes with the same $q$-pair $(q_\Delta(x), q_\Delta(x, dy))$ but having different restrictions. ∎

To study the problem for the $Fq$-processes, we need some preparation. The next result is about the decomposition of a consistent family of measures.
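Lemma 3.3. Let $\{\eta_\lambda\}$ be a family of measures. Then it is consistent iff there exist a $\kappa \in \mathscr{L}_+$ and a consistent family of measures $\{\bar\eta_\lambda \in \mathscr{V}_\lambda : \lambda > 0\}$ such that
$$\eta_\lambda = \kappa P^{\min}(\lambda) + \bar\eta_\lambda, \qquad \lambda > 0. \tag{3.2}$$
Moreover, $\kappa$ and $\{\bar\eta_\lambda : \lambda > 0\}$ in the decomposition (3.2) are determined uniquely by $\{\eta_\lambda : \lambda > 0\}$.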
Proof: a) The sufficiency is obvious. We now prove the necessity. Let $\{\eta_\lambda : \lambda > 0\}$ be a consistent family of measures. Then
$$\eta_\nu + (\nu - \lambda)\eta_\nu P^{\min}(\lambda) = \eta_\lambda \ge 0.$$
Hence
Fix $\nu$ and let
Define
$$\kappa_\nu(A) = \nu\eta_\nu(A) - \eta_\nu(\Omega I_A), \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1. \tag{3.4}$$
Then $\kappa_\nu(A) \ge 0$ for $A \in \mathscr{E} \cap E_n$, $n \ge 1$. Clearly, $\kappa_\nu$ can be extended uniquely to $\mathscr{E}$, denoted by $\kappa_\nu$ again. Of course, $\kappa_\nu \in \mathscr{L}_+$. Thus, by Lemma 2.34 (2), we have
$$\kappa_\nu P^{\min}(\nu) \le \eta_\nu. \tag{3.5}$$
b) Next, fix $\eta_\nu$ again and set
$$\bar\eta_\nu = \eta_\nu - \kappa_\nu P^{\min}(\nu). \tag{3.6}$$
Then from (3.5), it follows that $\bar\eta_\nu \in \mathscr{L}_+$. On the other hand, by (3.6), (3.4) and $(F_\lambda)$, we obtain
$$\bar\eta_\nu(\nu I - \Omega)I_A = \kappa_\nu(A) - \kappa_\nu P^{\min}(\nu)(\nu I - \Omega)I_A = \kappa_\nu(A) - \kappa_\nu(A) = 0, \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1. \tag{3.7}$$
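This shows that $\bar\eta_\nu \in \mathscr{V}_\nu$.
c) We now turn to prove that $\kappa_\nu$ and $\bar\eta_\nu$ are determined uniquely by $\eta_\nu$. Actually, if $\kappa'_\nu$ and $\bar\eta'_\nu$ also satisfy (3.6), then
$$\kappa_\nu P^{\min}(\nu) + \bar\eta_\nu = \kappa'_\nu P^{\min}(\nu) + \bar\eta'_\nu.$$
Note that $P^{\min}(\lambda)$ satisfies $(F_\lambda)$ and $\bar\eta_\nu, \bar\eta'_\nu \in \mathscr{V}_\nu$. It is not only meaningful to integrate both sides with respect to
$$(\nu I - \Omega)I_A, \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1,$$
but also gives us
$$\kappa_\nu(A) = \kappa'_\nu(A), \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1.$$
This shows that $\kappa_\nu = \kappa'_\nu$ and hence $\bar\eta_\nu = \bar\eta'_\nu$. Thus, for each fixed $\nu$, the decomposition (3.6) is unique.
d) Finally, we prove that $\kappa_\nu$ is independent of $\nu > 0$. By the consistency and (2.40), it follows that
$$\eta_\lambda = \eta_\nu R(\nu, \lambda) = \kappa_\nu P^{\min}(\nu)R(\nu, \lambda) + \bar\eta_\nu R(\nu, \lambda) = \kappa_\nu P^{\min}(\lambda) + \bar\eta_\nu R(\nu, \lambda).$$
By Lemma 2.36 (2), we have $\bar\eta_\nu R(\nu, \lambda) \in \mathscr{V}_\lambda$. So $\kappa_\nu P^{\min}(\lambda) \in \mathscr{L}_+$. Hence $\kappa_\nu$ and $\bar\eta_\nu R(\nu, \lambda)$ constitute a decomposition of $\eta_\lambda$. Combining this with what we have proved in the last paragraph, we obtain
$$\kappa_\lambda = \kappa_\nu \qquad \text{and} \qquad \bar\eta_\lambda = \bar\eta_\nu R(\nu, \lambda). \tag{3.8}$$
The former one in (3.8) shows that $\kappa_\nu$ is independent of $\nu > 0$ and the latter one shows that $\{\bar\eta_\nu : \nu > 0\}$ is consistent. ∎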
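Lemma 3.4. Let $\{\eta_\lambda : \lambda > 0\}$ be a consistent family of measures, then
$$\eta_\lambda \downarrow 0 \qquad \text{as } \lambda \uparrow \infty. \tag{3.9}$$
Moreover,
$$\lambda\eta_\lambda - \nu\eta_\nu = (\lambda - \nu)\eta_\lambda(I - \nu P^{\min}(\nu)) = (\lambda - \nu)\eta_\nu(I - \lambda P^{\min}(\lambda)) \tag{3.10}$$
for all $\lambda, \nu > 0$, and hence $\lambda\eta_\lambda(E) \uparrow$ as $\lambda \uparrow$.

Proof: By the consistency, we have
$$\eta_\lambda = \eta_\nu + (\nu - \lambda)\eta_\nu P^{\min}(\lambda) = \eta_\nu - \eta_\nu\lambda P^{\min}(\lambda) + \nu\eta_\nu P^{\min}(\lambda). \tag{3.11}$$
The first equality shows that $\eta_\lambda$ is decreasing and the second one shows that $\lim_{\lambda \to \infty}\eta_\lambda = 0$. By using the consistency again, we have
$$\lambda\eta_\lambda - \nu\eta_\nu = \lambda\eta_\lambda - \nu\eta_\lambda[I - (\nu - \lambda)P^{\min}(\nu)] = (\lambda - \nu)\eta_\lambda[I - \nu P^{\min}(\nu)].$$
This gives us the first equality in (3.10). Exchanging $\lambda$ and $\nu$, we obtain the second one.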
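Proposition 3.5. Let $\{\bar\eta_\lambda \in \mathscr{V}_\lambda : \lambda > 0\}$ be a consistent family of measures. Choose $\kappa \in \mathscr{L}_+$ such that $\kappa P^{\min}(\lambda)(E) < \infty$ and set $\eta_\lambda = \kappa P^{\min}(\lambda) + \bar\eta_\lambda$, $\lambda > 0$. Then
$$P(\lambda) := P^{\min}(\lambda) + z_\lambda\eta_\lambda/(c + \lambda\eta_\lambda(E)), \qquad \lambda > 0, \tag{3.12}$$
is a jump process. It is further a $q$-process iff one of the following conditions holds.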
(1) $(q(x), q(x, A))$ is conservative.
(2) $\kappa = 0$.
(3) $\kappa(E) = \infty$.
(4) $\lim_{\lambda \to \infty}\lambda\bar\eta_\lambda(E) = \infty$.

Proof: a) Under condition (1), Proposition 3.5 follows from Lemma 2.44 since $\{\eta_\lambda : \lambda > 0\}$ is a consistent family of measures. Thus, we assume that $(q(x), q(x, A))$ is non-conservative. Then from Lemma 2.44, it follows that (3.12) still defines a jump process. This proves the first assertion.
b) Because $P^{\min}(\lambda)$ is a $q$-process, by Corollary 2.28, $P(\lambda)$ given by (3.12) is a $q$-process iff
$$\lim_{\lambda \to \infty}\lambda^2 z_\lambda\eta_\lambda(A)/[c + \lambda\eta_\lambda(E)] = 0, \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1.$$
c) Note that Lemma 2.42 (1) and (2) give us
$$\lambda P^{\min}(\lambda)1 \uparrow 1 \qquad \text{as } \lambda \uparrow \infty. \tag{3.13}$$
On the other hand, by Lemma 3.4, we have
$$\lambda\eta_\lambda(E) \uparrow \qquad \text{as } \lambda \uparrow \infty. \tag{3.15}$$
Hence
$$\lim_{\lambda \to \infty}\lambda\eta_\lambda(E) = \kappa(E) + \lim_{\lambda \to \infty}\lambda\bar\eta_\lambda(E). \tag{3.16}$$
Now, if $\lim_{\lambda \to \infty}\lambda\bar\eta_\lambda(E) + \kappa(E) = 0$, then from (3.15) and (3.16), it follows that $\kappa = \bar\eta_\lambda = 0$, so $\eta_\lambda = 0$ and hence $P(\lambda) = P^{\min}(\lambda)$ for all $\lambda > 0$. Thus, this situation is trivial. In what follows, we assume that
$$\lim_{\lambda \to \infty}\lambda\bar\eta_\lambda(E) + \kappa(E) \ne 0.$$
d) Observe
$$\lim_{\lambda \to \infty}z_\lambda = \lim_{\lambda \to \infty}[1 - \lambda P^{\min}(\lambda)1] = 0,$$
and note that $z_\lambda$ satisfies Eq. (2.47). We obtain
$$\lim_{\lambda \to \infty}\lambda z_\lambda(x) = d(x), \qquad x \in E.$$
e) Fix $\nu_0 > 0$. For each $\varepsilon > 0$ and $A \in \mathscr{E} \cap E_n$, there exists an $N = N(\nu_0, \varepsilon, A)$ so that
Thus, by (3.9), whenever $\nu > \nu_0$, we have
From this and (3.9), we see that
$$\lim_{\nu \to \infty}\eta_\nu QI_A = 0, \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1.$$
Therefore, by Lemma 3.3, (3.4) and (3.9), we obtain
Moreover,
f) Finally, combining d), e) with c), we get
From this and a), b) and c), it follows that $P(\lambda)$ is a $q$-process iff one of (1)-(4) holds. ∎

Now, we can state the main result of this section.

Theorem 3.6. The $Fq$-process is unique iff either the minimal $q$-process is honest or $\dim\mathscr{V} = 0$.
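Proof: a) Sufficiency. The assertion is trivial if the minimal $q$-process is honest. Otherwise, for any $Fq$-process $P(\lambda)$, we have
$$P(\lambda, x, \cdot) - P^{\min}(\lambda, x, \cdot) \in \mathscr{V}_\lambda.$$
Hence $\dim\mathscr{V} = 0$ also implies the uniqueness.
b) Necessity. Assume that the minimal $q$-process is non-honest and $\dim\mathscr{V} \ne 0$. Fix $\lambda_0 > 0$, take $\bar\eta_{\lambda_0} \in \mathscr{V}_{\lambda_0} \setminus \{0\}$ and set $\bar\eta_\lambda = \bar\eta_{\lambda_0}R(\lambda_0, \lambda)$. By Lemma 2.36 (2), $\{\bar\eta_\lambda \in \mathscr{V}_\lambda : \lambda > 0\}$ is consistent. Since $z_\lambda \ne 0$, setting $\kappa = 0$ and $c = 0$ in Proposition 3.5, we obtain an $Fq$-process which is different from $P^{\min}(\lambda)$. This completes the proof.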
3.2 Uniqueness Criterion and Applications
Lemma 3.7. For each A X > 0.
E
&, either irifXEnPmin(A: z, E ) > 0 or = 0 for all
Proof: Suppose that $\inf_{x \in A} P^{\min}(\lambda_0, x, E) > 0$ for some $\lambda_0$. Then by the resolvent equation,
$$P^{\min}(\lambda) = P^{\min}(\lambda_0) + (\lambda_0 - \lambda)P^{\min}(\lambda)P^{\min}(\lambda_0),$$
we have
On the other hand, as we have seen in (3.13) that $\lambda P^{\min}(\lambda, x, E)$ is increasing in $\lambda$, we also have
Denote by
$$E_0 = \{x \in E : d(x) := q(x) - q(x, E) > 0\}$$
the set of all non-conservative points. We can now state our main criterion.
Theorem 3.8 (Uniqueness Criterion). Given a q-pair $(q(x), q(x, A))$, the q-process is unique iff the following conditions hold simultaneously.
(1) For some $\lambda > 0$ (hence for all $\lambda > 0$), $\inf_{x \in E_0} P^{\min}(\lambda, x, E) > 0$.
(2) $\dim \mathscr{U} = 0$.
(3) Either the q-pair is conservative, or it is not but still $\dim \mathscr{V} = 0$.
The theorem will be proved in Section 3.4. In this section, we introduce some of its applications. Before doing so, let us make a remark about the probabilistic meaning of the conditions. Consider the minimal Markov chain $(X_t)_{t \ge 0}$ on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$. Its successive jump times are denoted by $\tau_0 = 0 < \tau_1 < \tau_2 < \cdots \le \infty$. Then $\tau := \lim_{n\to\infty} \tau_n$ is called the first infinity of the Markov chain. Next, let
$$\hat\Omega = \{\omega : \tau < \infty, \text{ for every } \varepsilon > 0, \text{ there are infinitely many isolated jumps during } (\tau(\omega) - \varepsilon, \tau(\omega))\}.$$
Remark 3.9. Using the above notations, we have
(1) Condition (1) of Theorem 3.8 holds iff there exists a $t_0 > 0$ such that $\inf_{i \in E_0} \mathbb{P}_i(\tau > t_0) > 0$.
(2) Condition (2) of Theorem 3.8 is equivalent to saying that $\mathbb{P}_i(\hat\Omega) = 0$ for all $i$.
The proof of Remark 3.9 was presented in Hou and Guo (1978), Section 12.10.
Corollary 3.10. If $q(x, E) = 0$ for all $x \in E$, then the q-process is unique iff $\sup_{x \in E} q(x) < \infty$.
Proof: Clearly, $\dim \mathscr{U} = \dim \mathscr{V} = 0$. Since $P^{\min}(\lambda) = (\lambda + q)^{-1} I$, it follows that
To study bounded q-pair, we need the following
Lemma 3.11. Let $\eta_\lambda \in \mathscr{V}_\lambda \setminus \{0\}$. Then $\int \eta_\lambda(dx)\, q(x, E) = \infty$. In particular, if $\sup_{x \in E} q(x, E) < \infty$, then $\dim \mathscr{V} = 0$.
Proof: We prove here only the first assertion. From $\eta_\lambda \in \mathscr{V}_\lambda$, it follows that $\eta_\lambda(\lambda + q)I = \eta_\lambda Q$. In particular, $\lambda\eta_\lambda(E) + \eta_\lambda(q) = \eta_\lambda Q(E)$. Thus, if the assertion is not true, then we could have
which is a contradiction. ∎
The next result improves Corollary 2.24.
Corollary 3.12. For bounded q-pair, the q-process is unique.
Proof: a) Set $c = \sup_{x \in E} q(x) < \infty$. For $f_\lambda \in \mathscr{U}_\lambda$, we have
Hence
and so $f_\lambda = 0$. This proves that $\dim \mathscr{U}_\lambda = 0$. b) If the q-pair is conservative, then $E_0 = \emptyset$ and so the conditions of Theorem 3.8 are all satisfied. In particular, we have proved again Corollary 2.24. c) By Theorem 2.21 and Theorem 2.12 (1), $P^{\min}(\lambda)1$ is the minimal solution to the equation $u = \Pi(\lambda)u + (\lambda + q)^{-1}$.
We then have
This implies condition (1) of the criterion. Finally, by a), c) and Lemma 3.11, the conditions of Theorem 3.8 are all satisfied. ∎
As a special case of the above theorem, we have
Corollary 3.13. Let $E_0$ be finite. Then the q-process is unique iff both the Bq-process and the Fq-process are unique.
Now, we want to show that Theorem 2.25 can be deduced from our main criterion. Indeed, since the q-pair discussed there is conservative, the uniqueness criterion becomes $\dim \mathscr{U} = 0$. On the other hand, $\dim \mathscr{U} = 0$ is equivalent to the minimal q-process being honest, which is the point we used to arrive at the uniqueness. Here, we want to show that a dual approach enables us to arrive at $\dim \mathscr{U} = 0$. To do so, we introduce a comparison lemma, which is a dual of the comparison theorem for the minimal solution.
Lemma 3.14 (Comparison Lemma). Let $\Pi_i$ be a non-negative kernel, $g_i \in \mathscr{E}_+$ satisfying $\Pi_i 1 + g_i \le 1$, $i = 1, 2$. Denote by $f_1$ the maximal solution to the equation
$$f_1 = \Pi_1 f_1 + g_1, \qquad 0 \le f_1 \le 1 \tag{3.17}$$
and let $f_2$ be an arbitrary solution to
$$f_2 \le \Pi_2 f_2 + g_2, \qquad 0 \le f_2 \le 1. \tag{3.18}$$
If $\Pi_1 \ge \Pi_2$ and $g_1 \ge g_2$, then $f_2 \le f_1$. Furthermore, if $\Pi_1 = \Pi_2$ and $g_1 = g_2$, then these two equations have or have no non-trivial solutions simultaneously.
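To make the comparison concrete, here is a minimal numerical sketch for a finite state space, where a kernel is just a non-negative matrix. It assumes the maximal solution of $f = \Pi f + g$, $0 \le f \le 1$, is reached by iterating downward from the constant function 1. The function name `maximal_solution` and the two example kernels are illustrative choices, not taken from the text.

```python
def maximal_solution(kernel, g, tol=1e-13, max_iter=100_000):
    """Iterate f(0) = 1, f(n+1) = kernel f(n) + g and return the (approximate) limit.

    kernel: square list of lists with non-negative entries (finite-state kernel).
    g:      list of non-negative numbers with kernel 1 + g <= 1 componentwise.
    """
    size = len(g)
    f = [1.0] * size
    for _ in range(max_iter):
        nxt = [sum(kernel[i][j] * f[j] for j in range(size)) + g[i]
               for i in range(size)]
        if max(abs(a - b) for a, b in zip(nxt, f)) < tol:
            return nxt
        f = nxt
    return f


# Hypothetical data with pi_1 >= pi_2 and g_1 >= g_2 entrywise.
pi_1 = [[0.0, 0.5], [0.4, 0.0]]
g_1 = [0.3, 0.2]
pi_2 = [[0.0, 0.3], [0.2, 0.0]]
g_2 = [0.1, 0.2]

f_1 = maximal_solution(pi_1, g_1)  # solves f = pi_1 f + g_1
f_2 = maximal_solution(pi_2, g_2)  # solves f = pi_2 f + g_2, hence also f <= pi_2 f + g_2
```

In this example `f_2` is dominated by `f_1` in every coordinate, as the lemma predicts. Solving the first $2 \times 2$ system by hand gives $f_1 = (0.5, 0.4)$.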
Proof: a) First, the maximal solution to equation (3.17) can be obtained by the following procedure. Let
$$f_1^{(0)} = 1, \qquad f_1^{(n+1)} = \Pi_1 f_1^{(n)} + g_1, \quad n \ge 0.$$
Then $f_1^{(n)} \downarrow f_1$ as $n \to \infty$. b) Next, for any solution $f_2$ to equation (3.18), we have $f_2 \le 1 = f_1^{(0)}$. Suppose that $f_2 \le f_1^{(n)}$. Then
$$f_2 \le \Pi_2 f_2 + g_2 \le \Pi_1 f_2 + g_1 \le \Pi_1 f_1^{(n)} + g_1 = f_1^{(n+1)}.$$
It follows that $f_2 \le f_1^{(n)}$ for all $n \ge 0$. Letting $n \to \infty$, we obtain $f_2 \le f_1$. c) If (3.18) has a non-trivial solution $f_2$, then so does (3.17) by b). Now, assume that $f_1 \ne 0$, $\Pi_1 = \Pi_2$ and $g_1 = g_2$. Then, from
$$f_1 = \Pi_1 f_1 + g_1 = \Pi_2 f_1 + g_2,$$
it follows that $f_1$ is a non-trivial solution to Eq. (3.18). ∎
Alternative Proof of Theorem 2.25: Let $z_n(\lambda) = 1 - \lambda P^{\min}(\lambda, \cdot, E_n)$. Then $z_n(\lambda)$ is the maximal solution to the equation
$$u(\lambda) = \Pi(\lambda)u(\lambda) + (\lambda + q)^{-1}\lambda I_{E_n^c}, \qquad 0 \le u(\lambda) \le 1.$$
At the same time, $\tilde z_n(\lambda) := 1 - \lambda P_n(\lambda)I_{E_n}$ (refer to the notations in the proof of Theorem 2.25) is the maximal solution to the equation
Thus, the proof of the comparison lemma gives us $z_n(\lambda) \le \tilde z_n(\lambda)$. On the other hand, as we did in the proof of Theorem 2.25
Finally
Next, we turn to discuss Markov chains. Take $E = \{0, 1, 2, \ldots\} =: \mathbb{Z}_+$.
Definition 3.15. A conservative Q-matrix $Q = (q_{ij} : i, j \in \mathbb{Z}_+)$ is called a single birth Q-matrix if
$$q_{ij} = 0 \quad \text{if } j \ge i + 2,\ i, j \in \mathbb{Z}_+;$$
$$N := \sup\{i + 1 : q_{i,i+1} = 0,\ i \in \mathbb{Z}_+\} = \inf\{i : q_{k,k+1} > 0 \text{ for all } k \ge i\} < \infty.$$
The corresponding Q-process is called a single birth Q-process.
In the case that $q_{i,i+1} > 0$ for all $i \ge 0$, we have $N = 0$. When $N \ge 1$, to be distinguished, we also call the matrix (resp. Q-process) a single birth Q-matrix (resp. Q-process) with absorbing boundary.
Theorem 3.16. Given a single birth Q-matrix $Q = (q_{ij})$,
(1) when $N = 0$, the Q-process is unique iff $R := \sum_{n=N}^{\infty} m_n = \infty$, where
and $q_k^{(i)} = \sum_{j=0}^{i} q_{kj}$ for $0 \le i < k$ and $k \ge 1$. Here we use the convention that $\sum_{\emptyset} = 0$.
(2) When N >, 1, choose arbitrarily a positive ijol, define in (3.20) and (3.19) with N = 0, replacing
$)
and En as
and qn,n+l, respectively,
with i#) and tjn,n+l:
Then the $(q_{ij})$-process is unique iff
Proof: First, we consider the general case that $N \ge 0$. a) By Theorems 2.47 and 2.40, it suffices to show that the maximal solution $(u_i^*)$ to the equation
$$(\lambda + q_i)u_i = \sum_{j \ne i} q_{ij}u_j, \qquad 0 \le u_i \le 1,\ i \ge 0$$
equals zero identically for some fixed $\lambda > 0$. When $N \ge 1$, the set $\{0, 1, \ldots, N - 1\}$ constitutes a closed subclass of the chain and so $u_i^* = 0$ for all $i \le N - 1$. b) Define
(3.21)
By induction and ( N O ) , it is easy to check that
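c) Let $(u_i)$ be a solution to the equation
$$(1 + q_i)u_i = \sum_{j \ne i} q_{ij}u_j,\quad i \ge 0, \qquad \text{with } u_k = 0 \text{ for all } k \le N - 1 \text{ and } u_N = 1. \tag{3.23}$$
We now prove that $(u_i)$ is unbounded iff $\sum_{n=N}^{\infty} \tilde m_n = \infty$. From (3.23), it follows that
$$u_N = 1, \qquad u_{n+1} - u_n = \frac{1}{q_{n,n+1}}\Big[\sum_{k=0}^{n-1} q_n^{(k)}(u_{k+1} - u_k) + u_n\Big], \quad n \ge N, \tag{3.24}$$
and hence $u_i \uparrow$ as $i \uparrow$. The key of the proof is to show that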
< uk+l - u k < ( u N + 1 - uN
+ukGk,
k 2 N.
(3.25)
To check (3.25), we use induction. Noting that
and by (3.24), we have
Suppose that (3.25) holds for all $k$: $N \le k \le n - 1$ and we now consider the case that $k = n$. Then
and
=
+ un5iin,
( u N + l - uN)pLN)
n2N+l.
d) Having (3.22) and (3.25) in mind, it is now easy to complete the proof of the conclusion mentioned in c). If $u_\infty := \lim_{n\to\infty} u_n < \infty$, then, by (3.25), we have
Conversely, let
it follows that
.k < 00. Because
nkuk+l/ukand
log (uk+l/uk), and then
converge or diverge simultaneously whenever and (3.22), we get
U ~ + ~ /--fU 1. ~
Next, by (3.25)
Combining these facts together, we obtain the main conclusion. e) We now prove (3.19). Actually, by (3.20) and (3.21) with $N = 0$ and induction, if $\tilde m_k = m_k$ for all $k \le n - 1$, then we have
n-1
j=O
= m,
We have thus proved the first assertion of the theorem. f) Finally, consider the case that $N \ge 1$. From the expression of $(\tilde m_n)$, it is clear that one can regard $\{0, 1, \ldots, N - 1\}$ as a single absorbing state 0. If necessary, relabelling $N, N + 1, \ldots$ as $1, 2, \ldots$, respectively, we get a new Q-matrix and then the quantities $\tilde q_n^{(i)}$ ($0 \le i < n$) and $\tilde q_{n,n+1}$, for which the state 0 is the only absorbing one. However, for simplicity, we assume $N = 1$ and use the original $(q_{ij})$ instead of the new Q-matrix. Then
Note that even though $q_{01} = 0$, $(F_k^{(i)} : k \ge i \ge 0)$ is still well defined by (3.20) with $N = 0$. Repeating the proof e), it follows that
,
n-1
n
\
On the other hand, define co = 1 and
Comparing this with n-1
k=O
and $F_0^{(0)} = 1$, it is clear that $c_n = F_n^{(0)}$ for all $n \ge 0$. Collecting these facts together, we obtain
k=l
Therefore, we have returned to the case (1) with $q_{01} = 1$. Now, part (2) of the theorem follows, because the sequence $\{F_n^{(k)} : n \ge k \ge 0\}$ does not depend on $q_{01}$ and so the condition "$R = \infty$" is independent of $q_{01} > 0$. ∎
Alternative Proof: We can reduce the case of $N \ge 1$ to $N = 0$ by using a probabilistic approach, which goes back to Pakes and Tavaré (1981). Let $(\tilde q_{ij})$ be a local modification of $(q_{ij})$, up to $N - 1$ for instance. Denote by $(X_t)$ and $(\tilde X_t)$ the minimal processes determined by $(q_{ij})$ and $(\tilde q_{ij})$ respectively. We want to show that $(X_t)$ has at most a finite number of jumps in every finite time-interval iff so does $(\tilde X_t)$. Let $(\hat X_t)$ denote the minimal process determined by the Q-matrix $(\hat q_{ij})$: $\hat q_{ij} = 0$ for all $i \le N - 1$ and $\hat q_{ij} = q_{ij}$ for all $i \ge N$. Note that for each $i \le N - 1$, $(X_t)$ stays at $i$ with the exponential law having parameter $q_i$ and then jumps to other states. This is the only way $(X_t)$ yields more jumps than $(\hat X_t)$. Due to the conditional independence and the fact $N < \infty$, such jumps can happen at most finitely many times in a finite time-interval. Thus, $(X_t)$ has at most a finite number of jumps more than $(\hat X_t)$ in every finite time-interval. The same comparison holds for $(\tilde X_t)$ and $(\hat X_t)$. We have thus proved the assertion. ∎
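The criterion of Theorem 3.16 (1) is computable, and a small sketch may help to see how. The recursion used below, $m_0 = 1/q_{01}$ and $m_n = q_{n,n+1}^{-1}\big(1 + \sum_{k=0}^{n-1} q_n^{(k)} m_k\big)$ with $q_n^{(k)} = \sum_{j=0}^{k} q_{nj}$, is an assumption about the form of (3.19) and (3.20) in the case $N = 0$. The function name `single_birth_m` and the two rate functions are hypothetical. Exact rational arithmetic is used so that the partial sums of $R = \sum_n m_n$ can be inspected without rounding.

```python
from fractions import Fraction


def single_birth_m(rate, n_max):
    """Return [m_0, ..., m_{n_max}] for a single birth Q-matrix with N = 0.

    rate(i, j) gives the off-diagonal rate q_ij; rate(n, n + 1) must be positive.
    Assumed recursion: m_n = (1 + sum_{k<n} q_n^{(k)} m_k) / q_{n,n+1},
    where q_n^{(k)} = q_{n0} + ... + q_{nk}.
    """
    m = []
    for n in range(n_max + 1):
        total = Fraction(1)
        for k in range(n):
            q_n_k = sum(Fraction(rate(n, j)) for j in range(k + 1))
            total += q_n_k * m[k]
        m.append(total / Fraction(rate(n, n + 1)))
    return m


def linear_birth_death(i, j):
    """Birth rate i + 1, death rate i (illustrative example)."""
    if j == i + 1:
        return i + 1
    if j == i - 1:
        return i
    return 0


def quadratic_pure_birth(i, j):
    """Birth rate (i + 1)^2 and no downward jumps (illustrative example)."""
    return (i + 1) ** 2 if j == i + 1 else 0


m_linear = single_birth_m(linear_birth_death, 5)
m_quadratic = single_birth_m(quadratic_pure_birth, 5)
```

For the linear rates every $m_n$ equals 1, so the partial sums of $R$ grow without bound, which is consistent with uniqueness. For the quadratic pure birth rates $m_n = 1/(n+1)^2$, whose sum converges, so under the assumed recursion the criterion fails.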
Definition 3.17. A conservative Q-matrix $Q = (q_{ij})$ is called a birth-death Q-matrix if
$$q_{i,i+1} =: b_i > 0,\ i \ge 0; \qquad q_{i,i-1} =: a_i > 0,\ i \ge 1 \qquad \text{and} \qquad q_{ij} = 0 \text{ for all } |i - j| \ge 2.$$
The corresponding Q-process is called a birth-death process.
Since the birth-death Q-matrix is a special type of single birth Q-matrix, as a consequence of the above result, we have
Corollary 3.18. Given a birth-death Q-matrix, the Q-process is unique iff
$$R = \sum_{n=0}^{\infty} \frac{1}{\mu_n b_n} \sum_{k=0}^{n} \mu_k = \infty,$$
where
$$\mu_0 = 1, \qquad \mu_n = \frac{b_0 b_1 \cdots b_{n-1}}{a_1 a_2 \cdots a_n}, \quad n \ge 1.$$
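As an illustration of how explicit this criterion is, the following sketch computes partial sums of the classical series $R = \sum_{n \ge 0} (\mu_n b_n)^{-1} \sum_{k \le n} \mu_k$ with $\mu_0 = 1$ and $\mu_n = b_0 \cdots b_{n-1}/(a_1 \cdots a_n)$. This form of the series is assumed here, and the helper name `birth_death_partial_r` and the sample rates are hypothetical.

```python
from fractions import Fraction


def birth_death_partial_r(birth, death, n_max):
    """Partial sum over n = 0..n_max of (1 / (mu_n b_n)) * (mu_0 + ... + mu_n).

    birth(n) = b_n > 0 for n >= 0, death(n) = a_n > 0 for n >= 1.
    """
    mu = Fraction(1)       # mu_0
    mu_sum = Fraction(0)   # running sum mu_0 + ... + mu_n
    total = Fraction(0)
    for n in range(n_max + 1):
        if n >= 1:
            mu = mu * Fraction(birth(n - 1)) / Fraction(death(n))
        mu_sum += mu
        total += mu_sum / (Fraction(birth(n)) * mu)
    return total


# Linear rates b_n = n + 1, a_n = n: every term of the series equals 1.
r_linear = birth_death_partial_r(lambda n: n + 1, lambda n: n, 9)

# Quadratic births b_n = (n + 1)^2 with a_n = n: the terms decay like 1 / n^2.
r_quadratic = birth_death_partial_r(lambda n: (n + 1) ** 2, lambda n: n, 200)
```

With the linear rates the partial sum up to $n = 9$ is exactly 10 and keeps growing, which is the uniqueness case. With the quadratic birth rates the partial sums stay bounded, so under the assumed form of $R$ the process is not unique.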
When $b_0$ is replaced by 0, we obtain a birth-death Q-matrix with absorbing boundary. This needs only a little modification since the change of a finite number of the transition rates $q_{ij}$ does not interfere with the uniqueness property as mentioned above. Clearly, Theorem 3.16 is a very nice result since it is explicit and completely computable. Unfortunately, this is uncommon. Even in the two-dimensional case, it is usually hard to justify whether a Q-matrix is zero-exit or not. The remainder of this section is devoted to studying some sufficient conditions for the uniqueness of multidimensional Q-processes by comparing them with a single birth Q-process. The typical example in mind is Schlögl's model.
Theorem 3.19. Let $E$ be a countable set, $Q = (q(x, y) : x, y \in E)$ be a conservative Q-matrix. Suppose that there exists a partition $\{E_k\}_0^\infty$ of $E$ such that $\sum_{k=0}^{\infty} E_k = E$ and the following conditions hold.
(1) If $q(x, y) > 0$ and $x \in E_k$, then $y \in \sum_{j=0}^{k+1} E_j$ for all $k \ge 0$.
(2) $N := \inf\{i : \sum_{y \in E_{k+1}} q(x, y) > 0 \text{ for all } x \in E_k \text{ and all } k \ge i\} < \infty$.
(3) $C_k := \sup\{q(x) : x \in E_k\} < \infty$ for all $k \ge 0$.
Define a conservative Q-matrix $Q = (q_{ij} : i, j \in \mathbb{Z}_+)$ as follows: (3.26) other cases of $j \ne i$.
If the $(q_{ij})$-process is unique (i.e., $R$ or $\tilde R$ defined by Theorem 3.16 equals $\infty$), then so is the $(q(x, y))$-process.
Proof: By the uniqueness criterion, it suffices to show that the equation
$$(\lambda + q(x))u(x) = \sum_{y \ne x} q(x, y)u(y), \qquad 0 \le u(x) \le 1,\ x \in E \tag{3.27}$$
has only the trivial solution. Suppose that Eq. (3.27) has a non-trivial solution $(u(x) : x \in E)$. Let
$$u_k = \sup\{u(x) : x \in E_k\}, \qquad k \ge 0. \tag{3.28}$$
Then $(u_k : k \ge 0)$ is non-zero. However, $u_k = 0$ for $0 \le k \le N - 1$ since, by condition (1), $\sum_{j=0}^{N-1} E_j$ is a closed set for the chain, and by condition (3), $q$ is bounded on the set $E_k$. For each $k$, choose $\varepsilon_k > 0$ and $x^{(k)} \in E_k$ so that
$$\varepsilon_k(\lambda + C_k) < \lambda/2, \qquad u(x^{(k)}) \ge (1 - \varepsilon_k)u_k. \tag{3.29}$$
Replacing $x$ with $x^{(k)}$ in (3.27), using conditions (1) and (3), (3.28) and (3.29), we obtain
k-1
That is,
$$\frac{\lambda u_k}{2} + \sum_{j=0}^{k-1} \Big(\sum_{y \in E_j} q(x^{(k)}, y)\Big)(u_k - u_j) \le \sum_{y \in E_{k+1}} q(x^{(k)}, y)\,(u_{k+1} - u_k). \tag{3.30}$$
Since $u_k \ge 0$ and $\sum_{y \in E_{k+1}} q(x^{(k)}, y) > 0$ for $k \ge N$, from (3.30) and induction, it follows that $u_k \uparrow$ as $k \uparrow$. Combining (3.30) with condition (3), we obtain
Equivalently,
This shows, by the comparison lemma and the uniqueness criterion, that the $(q_{ij})$-process is not unique, which contradicts our assumption. ∎
Next, we want to show that the above result can also be deduced from Theorem 2.25. Before doing this, let us compare the uniqueness criterion in the conservative case with the sufficient conditions given in Theorem 2.25.
Remark 3.20. In order for a single birth Q-process to be unique, the hypotheses of Theorem 2.25 are necessary.
Proof: Without loss of generality, assume that $N = 0$. Suppose that the Q-process is unique. Then Eq. (3.23) has no non-trivial bounded solution. On the other hand, the solution to (3.23) can be obtained by the recurrent procedure given in (3.24). Moreover, for each fixed $\lambda > 0$, the solution $(u_i)$ is increasing in $i$. Thus, if the process is unique, then we should have $u_i \uparrow \infty$ as $i \uparrow \infty$. Now, we fix $\lambda = \lambda_0$ and denote by $(u_i)$ again the corresponding solution to Eq. (3.23). Take $E_n = \{0, 1, \ldots, n\}$, $c = \lambda_0$ and $\varphi_i = u_i$ for all $i \in E$. Then it is easy to check that the hypotheses of Theorem 2.25 are satisfied with these choices. ∎
Actually, the above idea remains true in general. Let $\dim \mathscr{U} = 0$. Then any non-trivial solution $u(\lambda)$ to the equation
$$(\lambda I - \Omega)u(\lambda) = 0, \qquad u(\lambda) \ge 0,$$
if it exists, must be unbounded. Thus, if we set $\varphi = u(\lambda)$ (fix $\lambda$), then we certainly have
$$\Omega\varphi \le (\lambda I + d)\varphi.$$
Moreover, we can choose a sequence $\{x_n\} \subset E$ so that $\lim_{n\to\infty} \varphi(x_n) = \infty$. From these, we see that the hypotheses of Theorem 2.25 are almost satisfied. In this sense, Theorem 2.25 is an alternative description of $\dim \mathscr{U} = 0$. The criterion "$\dim \mathscr{U} = 0$" is better than Theorem 2.25 in the sense that for the former we have a successive approximation scheme for the maximal solution, whereas for the latter we do not have an explicit way to find the function $\varphi$. However, as we have seen several times before, the latter is still quite effective in practice. Based on the above idea, we have
Alternative Proof of Theorem 3.19: Fix $\lambda > 0$ and use $(u_i)$ again to denote the unique solution to equation (3.23). The uniqueness assumption of the $(q_{ij})$-process and the above proof a) tell us $u_i \uparrow \infty$ as $i \uparrow \infty$. Set $\varphi(x) = u_i$ for $x \in E_i$, $i \ge 0$. Then from the hypotheses of Theorem 3.19, we see that
Hence
Now, it is easy to see that the hypotheses of Theorem 2.25 are satisfied and hence the $(q(x, y))$-process is unique. ∎
3.3 Some Lemmas
This section makes preparations for proving the main criterion given in the next section.
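Lemma 3.21. Let
$$c(\lambda) = \inf_{x \in E} P^{\min}(\lambda, x, E). \tag{3.31}$$
Then $c(\lambda) = 0$ for some $\lambda > 0$ iff there exists a $\kappa \in \mathscr{L}_+$ such that
$$\kappa(E_n) < \infty,\ n \ge 1, \qquad \kappa(E) = \infty \tag{3.32}$$
and
$$0 < \kappa P^{\min}(\lambda)1 < \infty, \qquad \lambda > 0. \tag{3.33}$$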
Proof: We first prove that the property (3.33) is independent of $\lambda$. Actually, if $\kappa P^{\min}(\mu)1 = 0$ for some $\mu$, then by the resolvent equation, we have
$$\kappa P^{\min}(\mu)1 = \kappa P^{\min}(\lambda)1 + (\lambda - \mu)\kappa P^{\min}(\mu)P^{\min}(\lambda)1.$$
This implies that $\kappa P^{\min}(\lambda)1 = 0$, which is a contradiction. On the other hand, since $\kappa(\mu P^{\min}(\mu)1)$ is increasing in $\mu$ by (3.13), $\kappa P^{\min}(\lambda)1 < \infty$ implies that $\kappa P^{\min}(\mu)1 < \infty$ for all $\mu \le \lambda$. As for $\mu > \lambda$, the resolvent equation gives us
$$\kappa P^{\min}(\mu)1 \le \kappa P^{\min}(\lambda)1 < \infty.$$
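The sufficiency comes from
$$\infty > \kappa P^{\min}(\lambda)1 \ge c(\lambda)\kappa(E).$$
To prove necessity, assume that $c(\lambda) = 0$. Choose an infinite subset $\tilde{\mathbb{N}}$ of $\mathbb{N} := \{1, 2, \ldots\}$ so that $\sum_{n \in \tilde{\mathbb{N}}} \inf_{x \in E_n} P^{\min}(\lambda)1(x) < \infty$ and choose a sequence $\{x_n \in E_n : n \in \tilde{\mathbb{N}}\}$ such that $0 < \sum_{n \in \tilde{\mathbb{N}}} P^{\min}(\lambda)1(x_n) < \infty$. These can be done since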
For a natural number n f N , choose an arbitrary point
r;(A) =
C K ( An {xn}),
2, E
En and set
A E 8.
ne N
<
Then we obtain &(En)= ~ ( { x , } ) 0 < K P " ' " ( X ) ~= CnEN P"'"(X)l(z,)
1, K ( E ) =
C , , K ( { X ~ ~03} and ) =1.
< 00.
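Lemma 3.22. If $\inf_{x \in E} P^{\min}(\lambda, x, E) > 0$ for some $\lambda > 0$, then $\dim \mathscr{U} = 0$.
Proof: This is an easy consequence of Theorems 2.21 and 2.11. ∎
Lemma 3.23. (1) If $f \in b\mathscr{E}$ and $P^{\min}(\lambda)f = 0$, then $f = 0$. (2) If $\varphi \in \mathscr{L}$ and $\varphi P^{\min}(\lambda)(A) = 0$ for $A \in \mathscr{E} \cap E_n$, $n \ge 1$, then $\varphi = 0$.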
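Proof: a) From $(B_\lambda)$, it follows that
$$f = (\lambda I - \Omega)P^{\min}(\lambda)f = 0.$$
Hence assertion (1) follows. b) Let $\varphi^{\pm}$ be the Jordan-Hahn decomposition of $\varphi$. By the hypotheses, we have
$$\varphi^{+}P^{\min}(\lambda)(A) = \varphi^{-}P^{\min}(\lambda)(A), \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1.$$
Now, the monotone class theorem implies that
$$\varphi^{+}P^{\min}(\lambda)f = \varphi^{-}P^{\min}(\lambda)f$$
for all $f \in b\mathscr{E}$. Furthermore, for any $f \in \mathscr{E}$, whenever one of the two sides exists, then the other side should exist and moreover, both sides are equal. Applying $(F_\lambda)$, we obtain
$$\varphi^{+}(A) = \sum_n \varphi^{+}P^{\min}(\lambda)(\lambda I - \Omega)I_{AE_n} = \sum_n \varphi^{-}P^{\min}(\lambda)(\lambda I - \Omega)I_{AE_n} = \varphi^{-}(A)$$
for all $A \in \mathscr{E}$, and so $\varphi = 0$ since $\varphi \in \mathscr{L}$. ∎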
Lemma 3.24. Let $G$ be a kernel on $(E, \mathscr{E})$, $0 \le G \le 1$ and $g \in b\mathscr{E}_+$. Denote by $f^*$ the minimal solution to the equation
$$f = Gf + g.$$
Set $D = \{x \in E : g(x) > 0\}$. Then $\sup_{x \in E} f^*(x) = \sup_{x \in D} f^*(x)$.
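The statement can be checked numerically on a small finite example. The sketch below assumes the minimal solution is obtained by the upward iteration $f^{(1)} = g$, $f^{(n+1)} = Gf^{(n)} + g$. The matrix, the vector `g_vec` and the function name `minimal_solution` are illustrative only.

```python
def minimal_solution(kernel, g, tol=1e-14, max_iter=100_000):
    """Iterate f(1) = g, f(n+1) = kernel f(n) + g and return the (approximate) limit."""
    size = len(g)
    f = list(g)
    for _ in range(max_iter):
        nxt = [sum(kernel[i][j] * f[j] for j in range(size)) + g[i]
               for i in range(size)]
        if max(abs(a - b) for a, b in zip(nxt, f)) < tol:
            return nxt
        f = nxt
    return f


# Hypothetical sub-stochastic kernel on three states; g is positive only at state 0.
g_kernel = [[0.0, 0.5, 0.0],
            [0.5, 0.0, 0.5],
            [0.0, 0.5, 0.0]]
g_vec = [0.2, 0.0, 0.0]

f_star = minimal_solution(g_kernel, g_vec)
support = [i for i, value in enumerate(g_vec) if value > 0]  # the set D
sup_on_e = max(f_star)
sup_on_d = max(f_star[i] for i in support)
```

Solving the linear system by hand gives $f^* = (0.3, 0.2, 0.1)$, so the supremum over all states is attained on $D = \{0\}$.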
Proof: As usual, let
$$f^{(1)} = g, \qquad f^{(n+1)} = Gf^{(n)} + g, \quad n \ge 1.$$
Then
Suppose that $f^{(n)}(x) \le \sup_{y \in D} f^*(y)$ for all $x \in E$. Then for $x \in D^c$,
But for $x \in D$, it is trivial that $f^{(n)}(x) \le f^*(x) \le \sup_{y \in D} f^*(y)$. Thus, by induction, we obtain the desired assertion. ∎
Lemma 3.25. If $\dim \mathscr{U} = 0$, then
Proof: We have known that $z_\lambda$ is the maximal solution to the equation
$$u(\lambda) = \Pi(\lambda)u(\lambda) + (\lambda + q)^{-1}d, \qquad 0 \le u(\lambda) \le 1.$$
But, on the other hand, since $\dim \mathscr{U} = 0$, we see that $z_\lambda$ is also the minimal solution to the same equation. Hence, Lemma 3.24 provides what we required. ∎
3.4 Proof of Uniqueness Criterion
The purpose of this section is to prove Theorem 3.8. The next result is an alternative criterion. Once we have proved the latter, the former follows quite easily.
Theorem 3.26 (Uniqueness Criterion). Given a q-pair $(q(x), q(x, A))$, the q-process is unique iff the following two conditions hold simultaneously.
(1) There exists some $\lambda > 0$ such that $c(\lambda) := \inf_{x \in E} P^{\min}(\lambda, x, E) > 0$.
(2) The Fq-process is unique.
Proof: We first prove the necessity. Condition (2) is clearly necessary. If condition (1) does not hold, then the minimal q-process should be non-honest. So by Lemma 3.21, we can find a $\kappa \in \mathscr{L}_+$ so that the hypotheses of Proposition 3.5 are satisfied. Taking $c = 0$ in (3.12), we obtain an honest q-process (depending on the choice of $\eta_\lambda$). Hence, the q-process cannot be unique. Next, we prove the sufficiency. Assume that the minimal q-process is non-honest and $P(\lambda)$ is an arbitrary q-process. Set
$$\Psi(\lambda) = P(\lambda) - P^{\min}(\lambda). \tag{3.34}$$
Because $P^{\min}(\lambda)$ satisfies $(F_\lambda)$ and Proposition 2.18, we have
Thus
$$U(\lambda)(A) := \Psi(\lambda)(\lambda I - \Omega)I_A \ge 0, \qquad A \in \mathscr{E} \cap E_n,\ n \ge 1.$$
Clearly, for each $\lambda > 0$ and $x \in E$, $U(\lambda, x, \cdot)$ can be extended to $\mathscr{E}$ uniquely, denoted by $U(\lambda, x, \cdot)$ again. Using condition (2), Theorem 3.6 and Lemma 2.34 (2), we obtain $\Psi(\lambda) = U(\lambda)P^{\min}(\lambda)$. From this and (3.34), it follows that
$$P(\lambda) = P^{\min}(\lambda) + U(\lambda)P^{\min}(\lambda). \tag{3.35}$$
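Hence
$$\lambda U(\lambda)P^{\min}(\lambda)1 \le \lambda P(\lambda)1 \le 1.$$
Besides, by condition (1) and Lemma 3.7, we have $1 \le c(\lambda)^{-1}P^{\min}(\lambda)1$. Combining these two facts together, we obtain
$$\lambda U(\lambda)1 \le \lambda U(\lambda)c(\lambda)^{-1}P^{\min}(\lambda)1 \le c(\lambda)^{-1}. \tag{3.36}$$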
Thus, for each $\lambda > 0$, $U(\lambda)$ is a bounded kernel on $(E, \mathscr{E})$. On the other hand, since $P(\lambda)$ and $P^{\min}(\lambda)$ satisfy the resolvent equation, by (3.35), we have
$$[U(\lambda) - U(\mu) + (\lambda - \mu)P^{\min}(\lambda)U(\mu) + (\lambda - \mu)U(\lambda)P^{\min}(\lambda)U(\mu)]P^{\min}(\mu) = 0.$$
From this, (3.36), and Lemma 3.23 (2), it follows that
$$U(\lambda) - U(\mu) + (\lambda - \mu)P^{\min}(\lambda)U(\mu) + (\lambda - \mu)U(\lambda)P^{\min}(\lambda)U(\mu) = 0. \tag{3.37}$$
By (3.36), we may rewrite (3.37) as follows:
$$U(\mu) + (\mu - \lambda)P^{\min}(\lambda)U(\mu) = U(\lambda) + (\lambda - \mu)U(\lambda)P^{\min}(\lambda)U(\mu). \tag{3.38}$$
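The right-hand side is non-negative whenever $\lambda \ge \mu$. Hence
$$U(\mu) \ge (\lambda - \mu)P^{\min}(\lambda)U(\mu), \qquad \lambda \ge \mu,$$
and so
$$\mu U(\mu) \ge (1 - \mu\lambda^{-1})\lambda[\lambda P^{\min}(\lambda) - I]U(\mu), \qquad \lambda \ge \mu. \tag{3.39}$$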
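Note that $P^{\min}(\lambda)$ is a Bq-process, so $\lim_{\lambda\to\infty} \lambda(\lambda P^{\min}(\lambda) - I) = \Omega$. Letting $\lambda \to \infty$ on the right-hand side of (3.39) and using (3.36) and Theorem 1.14, we obtain $\mu U(\mu) \ge \Omega U(\mu)$. Thus
$$V(\mu) := (\mu I - \Omega)U(\mu) \ge 0. \tag{3.40}$$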
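On the other hand, by condition (1) and Lemma 3.22, we know that $\dim \mathscr{U} = 0$. So Lemma 2.34 (1) implies that
$$U(\lambda) = P^{\min}(\lambda)V(\lambda). \tag{3.41}$$
Substituting this into (3.35),
$$P(\lambda) = P^{\min}(\lambda) + P^{\min}(\lambda)V(\lambda)P^{\min}(\lambda). \tag{3.42}$$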
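Substituting (3.41) into (3.37),
$$P^{\min}(\lambda)[V(\lambda) - V(\mu) + (\lambda - \mu)V(\lambda)P^{\min}(\lambda)P^{\min}(\mu)V(\mu)] = 0.$$
By Lemma 3.23 (1), we have
$$V(\lambda) - V(\mu) + (\lambda - \mu)V(\lambda)P^{\min}(\lambda)P^{\min}(\mu)V(\mu) = 0. \tag{3.43}$$
Hence, the resolvent equation gives us
$$V(\lambda) - V(\mu) + V(\lambda)[P^{\min}(\mu) - P^{\min}(\lambda)]V(\mu) = 0. \tag{3.44}$$
Before moving further, we prove two facts:
$$V(\lambda)P^{\min}(\lambda)V(\mu)1 < \infty, \tag{3.45}$$
$$V(\lambda)P^{\min}(\mu)V(\mu)1 < \infty. \tag{3.46}$$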
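By (3.36) and (3.40), we have $V(\lambda)1 < \infty$ for $\lambda > 0$. But, by (3.41) and (3.36),
$$P^{\min}(\mu)V(\mu)1 = U(\mu)1 \le \mu^{-1}c(\mu)^{-1},$$
so
$$V(\lambda)P^{\min}(\mu)V(\mu)1 \le \mu^{-1}c(\mu)^{-1}V(\lambda)1 < \infty. \tag{3.47}$$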
This is just (3.46). Next, by the resolvent equation and (3.43), we have
$$P^{\min}(\lambda, x, A) \downarrow, \quad V(\lambda, x, A) \downarrow \quad \text{as } \lambda \uparrow. \tag{3.48}$$
Thus, for $\lambda \le \mu$, from (3.46), it follows that
$$V(\lambda)P^{\min}(\lambda)V(\mu)1 \le V(\lambda)P^{\min}(\lambda)V(\lambda)1 < \infty.$$
At the same time, for $\lambda > \mu$, we have
$$V(\lambda)P^{\min}(\lambda)V(\mu)1 \le V(\mu)P^{\min}(\mu)V(\mu)1 < \infty.$$
Combining the above two cases, we obtain (3.45). Now, we return to our main proof. By (3.42) and the q-condition, we have
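But
$$\lambda^2 P^{\min}(\lambda)V(\lambda)P^{\min}(\lambda)I_{E_n}(x) \ge \lambda P^{\min}(\lambda, x, \{x\})\,\big(V(\lambda)\lambda P^{\min}(\lambda)I_{E_n}\big)(x).$$
Hence
$$\lim_{\lambda\to\infty} V(\lambda)\lambda P^{\min}(\lambda)I_{E_n} = 0, \qquad n \ge 1. \tag{3.49}$$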
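On the other hand, by (3.48) and Theorem 1.14 (1), if we set
$$\kappa := \lim_{\lambda\to\infty} V(\lambda),$$
then $\kappa(x, \cdot) \in \mathscr{L}_+$ for every $x \in E$. So by (3.49) and Theorem 1.14 (2) and (3), we have
$$\kappa I_{E_n} = \lim_{\lambda\to\infty} \kappa\lambda P^{\min}(\lambda)I_{E_n} \le \lim_{\lambda\to\infty} V(\lambda)\lambda P^{\min}(\lambda)I_{E_n} = 0, \qquad n \ge 1.$$
Hence
$$V(\lambda)1 \downarrow \kappa 1 = \sum_n \kappa I_{E_n} = 0 \quad \text{as } \lambda \uparrow \infty.$$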
From this and (3.47), we get
Letting $\lambda \to \infty$ in (3.44), we have $V(\mu)1 = 0$ for all $\mu > 0$. Substituting this into (3.42), we get $P(\lambda) = P^{\min}(\lambda)$ for all $\lambda > 0$, which completes the proof. ∎
Proof of Theorem 3.8: Combining Theorems 3.26, 3.2 and 3.6 with Lemmas 3.22 and 3.25, we obtain the desired assertion. ∎
3.5 Notes
In the above three chapters, we have chosen a short way to present our main uniqueness criteria for general q-processes. Some more refined results are delayed to Section 6.4, which should be considered as an essential part of the theory. Next, we mention a further topic related to this chapter. In practice, the most important jump processes are honest ones. But in physics, one pays more attention to the Fq-processes, i.e., the processes satisfying the Fokker-Planck equation. Corresponding to the equilibrium physics, there are so-called reversible (or more generally, symmetrizable) jump processes, which are the main subject in Chapters 6 and 7. Thus, it is natural to study the existence and uniqueness problem for each type of jump processes. Moreover, one may even ask how many jump processes we have in each case. To show that the answer is quite interesting, we mention that even for a conservative q-pair, it can happen that the minimal jump process is non-honest, but the honest Fq-process is still unique. Note that for such a jump process, its samples are not necessarily step functions, and may even have no left or right limits. Hence, to get a complete theory, it is not enough to consider only the samples which are right continuous with left limits unless the state space is enlarged. However, a quite complete picture for the above problem was obtained by Hou and Guo (1976) for Markov chains, and by Zheng (1982) for general jump processes. For Markov chains, Theorem 3.6 is due to Reuter (1962), Theorems 3.8 and 3.26 are due to Hou (1974). The proof adopted here is close to Reuter (1976). For single birth matrix without absorbing boundary, Theorem 3.16 was first proved by Zhang (1984) by using a probabilistic approach. Then an analytic proof was presented in Yan and Chen (1986), in which an incorrect assertion was made for the Q-matrix with absorbing boundary. A correction was given by Li (1990b). A unified treatment was given in Chen (1999d). The present proof is a further simplification. The uniqueness criteria in the general context and most of the materials discussed here are taken from Chen and Zheng (1983). The discussion about Theorem 2.25 is taken from Chen (1986b). Some more applications of the uniqueness criteria, especially for non-conservative Markov chains, were presented in Hou (1982), Chapter 8. A more complete theory of the uniqueness problem for Markov chains is explored in Anderson (1991), Hou et al. (1994, 2000).
Chapter 4
Recurrence, Ergodicity and Invariant Measures
In this chapter, we first introduce some results about weak convergence which are very useful but not yet popular in the literature. Next, we study the recurrence and the existence of stationary distributions for general jump processes, and then for Markov chains. Moreover, three types of ergodicity are also studied. Finally, we discuss the invariant measures for jump processes.
4.1 Weak Convergence
To begin this section, we recall a well-known result; its proof can be found in Billingsley (1968) or Stroock and Varadhan (1979). Let $(E, \rho)$ be a metric space with Borel $\sigma$-algebra $\mathscr{E}$. Denote by $\mathscr{P}(E)$ the set of probability measures on $(E, \mathscr{E})$. Let $C_b(E)$ (resp. $U_b(E)$) denote the set of all bounded continuous (resp. bounded uniformly continuous) functions on $E$.
Theorem 4.1. Given $\mu, \mu_n \in \mathscr{P}(E)$ ($n \ge 1$), the following assertions are equivalent.
(1) For each $f \in C_b(E)$, $\lim_{n\to\infty} \mu_n(f) = \mu(f)$.
(2) For each $f \in U_b(E)$, $\lim_{n\to\infty} \mu_n(f) = \mu(f)$.
(3) For each closed subset $C$ of $E$, $\limsup_{n\to\infty} \mu_n(C) \le \mu(C)$.
(4) For each open subset $G$ of $E$, $\liminf_{n\to\infty} \mu_n(G) \ge \mu(G)$.
(5) For each $B \in \mathscr{E}$ with $\mu(\partial B) = 0$, $\lim_{n\to\infty} \mu_n(B) = \mu(B)$, where $\partial B$ denotes the boundary of $B$.
Now, we return to our main setup. Assume that $E$ is a Polish space and choose a metric $\rho$ so that $(E, \rho)$ becomes a complete separable space with Borel $\sigma$-algebra $\mathscr{E}$.
Theorem 4.2. A subset $\mathscr{M} \subset \mathscr{P}(E)$ is relatively compact in the weak topology iff for each $\varepsilon > 0$, there exists a compact $K_\varepsilon \subset E$ such that
$$\inf_{\mu \in \mathscr{M}} \mu(K_\varepsilon) \ge 1 - \varepsilon,$$
where the weak topology is generated by the open sets
$$\{P \in \mathscr{P}(E) : |P(f) - Q(f)| < \varepsilon\}, \qquad \varepsilon > 0,\ Q \in \mathscr{P}(E),\ f \in C_b(E).$$
This theorem is the set form of the compactness criterion. That is, it uses compact sets to describe the relative compactness. Just as the monotone class theorems have both a set form and a functional form, the above criterion also has a functional form.
Definition 4.3. A function $h \in \mathscr{E}_+$ is called compact if for every $d \in [0, \infty)$ the set $\{x \in E : h(x) \le d\}$ is compact. Note that a compact function is not necessarily continuous, for instance,
However, every compact function should be closed (= lower semi-continuous). That is, for every $d \in [0, \infty)$, the set $[f \le d]$ is closed. Equivalently, $\liminf_{n\to\infty} f(x_n) \ge f(x)$ for every sequence $\{x_n\}$ such that $\lim_{n\to\infty} x_n = x$. A function $\varphi$ on $\mathbb{Z}^d$ (resp. $\mathbb{R}^d$) is compact iff $\lim_{|x|\to\infty} \varphi(x) = \infty$ (resp. and $\varphi$ being closed). It may be remarked that for the sufficiency in Theorems 4.2 and 4.4 below we need only assume that $E$ is a separable metric space.
Theorem 4.4. A subset $\mathscr{M} \subset \mathscr{P}(E)$ is relatively compact iff there exist a compact function $h$ and a constant $C$ such that
$$\sup_{\mu \in \mathscr{M}} \mu(h) \le C < \infty.$$
Proof: a) Sufficiency. For $\varepsilon > 0$, take $K_\varepsilon = \{x \in E : h(x) \le C/\varepsilon\}$. Then
b) Necessity. For $n \ge 1$, choose compact $K_{1/n^3}$ so that
Without loss of generality, assume that $K_{1/n^3} \uparrow$ as $n \uparrow$. Take
$$h(x) := \inf\{n \ge 1 : x \in K_{1/n^3}\}.$$
Then
$$\{x : h(x) \le d\} = \{x : h(x) \le [d]\} = \bigcup_{n=1}^{[d]} K_{1/n^3}$$
is compact, where $[d]$ is the integer part of $d$. Moreover
sup p ( h ) = sup /&.A
h d p 6 sup
En t l),u[h >
121
P E M nrl
Corresponding to Theorem 4.1 (4), we also have a functional form.
Theorem 4.5. $\mu_n$ converges weakly to $\mu$ iff for every closed function $f \in \mathscr{E}_+$, $\liminf_{n\to\infty} \mu_n(f) \ge \mu(f)$.
Proof: Note that $I_B$ is lower semi-continuous for open set $B$. The condition is sufficient. To prove the necessity, we need only to consider bounded $f$. Actually, for a closed $f \in \mathscr{E}_+$, $f \wedge N$ is also closed and
This plus the monotone convergence theorem gives us $\liminf_{n\to\infty} \mu_n(f) \ge \mu(f)$. Thus, we have reduced the proof to the bounded case. Next, replacing $f$ with $f/N$ if necessary, we may assume that $0 \le f < 1$. Finally, set
m-1
Then, $f_m$ is closed. For each $x$, there exists the largest $k_0 = k_0(x)$ so that
Hence
So, we have
.
m-1
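Letting $m \to \infty$, it follows that $\liminf_{n\to\infty} \mu_n(f) \ge \mu(f)$. ∎
Now, we turn to study product state spaces. Let $S$ be a countable set and denote by $\mathscr{S}$ the collection of finite subsets of $S$. For each $u \in S$, let $(E_u, \rho_u, \mathscr{E}_u)$ be a complete separable metric space. Given $\Lambda \subset S$, we have the product space
$$E_\Lambda = \prod_{u \in \Lambda} E_u, \qquad \mathscr{E}_\Lambda = \prod_{u \in \Lambda} \mathscr{E}_u.$$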
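For simplicity, we write $(E, \mathscr{E}) = (E_S, \mathscr{E}_S)$. Next, let $P \in \mathscr{P}(E)$. Define
$$p_\Lambda(B_\Lambda) = P[x \in E : x_\Lambda \in B_\Lambda], \qquad B_\Lambda \in \mathscr{E}_\Lambda,$$
where $x_\Lambda = (x_u : u \in \Lambda)$ is the projection of $x$ to $E_\Lambda$. Then the family $\{p_\Lambda : \Lambda \in \mathscr{S}\}$ is consistent. Conversely, a consistent family $\{p_\Lambda : \Lambda \in \mathscr{S}\}$ determines uniquely a $P \in \mathscr{P}(E)$ by the Kolmogorov extension theorem. Thus, it is meaningful to write $P = \{p_\Lambda : \Lambda \in \mathscr{S}\}$. Now, given $P_n = \{p_\Lambda^n : \Lambda \in \mathscr{S}\}$ and $Q = \{q_\Lambda : \Lambda \in \mathscr{S}\}$, we may define the weak convergence in finite-dimensional distributions as follows: for each $\Lambda \in \mathscr{S}$,
$$p_\Lambda^n \Rightarrow q_\Lambda \quad \text{as } n \to \infty,$$
where "$\Rightarrow$" means the weak convergence defined by Theorem 4.1. This topology is generated by the following open sets
$$\{P = \{p_\Lambda : \Lambda \in \mathscr{S}\} : |p_\Lambda(\varphi) - q_\Lambda(\varphi)| < \varepsilon\}, \qquad \varepsilon > 0,\ \Lambda \in \mathscr{S},\ \varphi \in C_b(E_\Lambda),\ Q = \{q_\Lambda : \Lambda \in \mathscr{S}\}.$$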
Remark 4.6. Define
$$\rho(x, y) = \sum_u k_u\, \rho_u(x_u, y_u)/[1 + \rho_u(x_u, y_u)],$$
where $(k_u)$ is a positive summable sequence on $S$. Then $(E, \rho)$ is a complete separable metric space. Given $\mu, \mu_n \in \mathscr{P}(E)$, $n \ge 1$, then $\mu_n \Rightarrow \mu$ iff $\mu_n$ converges to $\mu$ weakly in finite-dimensional distributions.
Proof: Clearly, if $\mu_n \Rightarrow \mu$, then $\mu_n$ converges to $\mu$ weakly in finite-dimensional distributions. We now assume that $\mu_n$ converges to $\mu$ weakly in finite-dimensional distributions. Without loss of generality, assume that $\sum_u k_u \le 1$. For each $\varepsilon > 0$ and $u \in S$, choose a compact $K_{u,\varepsilon} \subset E_u$ such that
Set
then
Hence $\{\mu_n : n \ge 1\}$ is relatively compact in the weak topology. Suppose that $\mu_n$ does not converge to $\mu$ in the weak topology. Since the topology can be metrized by the Lévy-Prohorov metric $w$, there exist an $\varepsilon > 0$ and a subsequence $\{n_k\}$ such that $w(\mu_{n_k}, \mu) \ge \varepsilon$, $k = 1, 2, \ldots$. On the other hand, by the relative compactness proved above, we may choose a subsequence $\{\mu_{m_k}\} \subset \{\mu_{n_k}\}$ so that $\mu_{m_k} \Rightarrow \tilde\mu$. This certainly implies that $\mu_{m_k}$ converges to $\tilde\mu$ weakly in the finite-dimensional distributions. So by the Kolmogorov extension theorem, we have $\tilde\mu = \mu$. And hence,
$$\varepsilon \le w(\mu_{m_k}, \mu) = w(\mu_{m_k}, \tilde\mu) \to 0, \qquad k \to \infty,$$
which is a contradiction. ∎
Theorem 4.7. A subset $\mathscr{M} \subset \mathscr{P}(E)$ is relatively compact iff there exist a family $\{h_u : u \in S\}$ of compact functions and a family $\{C_u : u \in S\}$ of constants such that
$$p_{\{u\}}(h_u) \le C_u, \qquad u \in S,$$
where $P = \{p_\Lambda : \Lambda \in \mathscr{S}\} \in \mathscr{M}$.
Proof: a) We first prove that $\mathscr{M}$ is relatively compact iff for each $\Lambda \in \mathscr{S}$, the family $\{p_\Lambda\}$ deduced by $\mathscr{M}$ is relatively compact in $E_\Lambda$ with respect to the metric $\rho_\Lambda := \sum_{u \in \Lambda} \rho_u$. The necessity is obvious. To prove the sufficiency, take $P_n = \{p_\Lambda^n : \Lambda \in \mathscr{S}\}$ so that $p_\Lambda^n$ converges to some $p_\Lambda$ for each $\Lambda \in \mathscr{S}$. Clearly, the family $\{p_\Lambda : \Lambda \in \mathscr{S}\}$ is consistent, so it determines uniquely a $P$, and $P_n$ converges to $P$ weakly in finite-dimensional distributions. b) Applying a) to the case that $\Lambda = \{u\}$, we see that the condition is necessary. On the other hand, if $h_u$ ($u \in S$) is compact, then for each $\Lambda_0 \in \mathscr{S}$,
is also compact in $(E_{\Lambda_0}, \rho_{\Lambda_0})$. By assumption,
Applying Theorem 4.4 to the compact function $h_{\Lambda_0}$, it follows that $\{p_{\Lambda_0} : P \in \mathscr{M}\}$ is relatively compact and so the condition is also sufficient by a). ∎
4.2 General Results
For the remainder of this chapter, we often assume that the q-process is unique and study further the recurrence, the existence of a stationary distribution or an invariant measure, and the ergodicity of the q-process.
Definition 4.8. We call a q-pair regular if it is totally stable, conservative and it determines uniquely a q-process.
Our first result reduces the recurrence of a q-process to the one of embedding process (cf. Section 4.3). Define
Clearly, $\Pi(x, dy)$ is a probability kernel, which is called an embedding jump process. Set
$$\Pi^0 = I, \qquad \Pi^{n+1} = \Pi\,\Pi^n.$$
Theorem 4.9. Let (q(x),q(x,d y ) ) be conservative. Then we have
Here we use the conventions $0/0 = 0$, $0 \cdot \infty = \infty \cdot 0 = 0$ and $1/0 = \infty$.
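Proof: Since $(P^{\min}(\lambda, x, A) : x \in E)$ is the minimal solution to Eq. $(B_\lambda)$:
$$f = \Pi(\lambda) f + (\lambda + q)^{-1} I_A, \qquad f \in \mathscr{E}_+,$$
by the second successive approximation scheme, we have
$$P^{\min}(\lambda, \cdot, A) = \sum_{n=0}^\infty \Pi(\lambda)^n \big((\lambda + q)^{-1} I_A\big).$$
Then, by the monotone convergence theorem, we obtain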
Finally, the proof will be done once we show that

This can be deduced by using induction on $n$. Actually, when $n = 0$, both sides are the same: $I_A/q$. Assume that the assertion holds for $n - 1$ and we now check the case of $n$ ($\ge 1$). By the inductive assumption, we have
It suffices to show that
This is trivial if $q(x) > 0$. Let $q(x) = 0$. Then both sides are equal to $\infty$ if $x \in A$. Otherwise, if $x \notin A$, then both sides are $0$ because of the conventions $0/0 = 0$ and $0 \cdot \infty = 0$. ∎

Next, we study the existence of a stationary distribution. We first consider the time-discrete case.
Lemma 4.10. Let $(E, \mathscr{E})$ be an arbitrary measurable space and $P(x, dy)$ be a transition probability function on $(E, \mathscr{E})$. Suppose that there exist $h \in \mathscr{E}_+$ and constants $C \in [0, \infty)$, $c \in [0, 1)$ such that
$$Ph \le C + ch. \tag{4.1}$$
Then for each stationary distribution $\pi$ of $P(x, dy)$ (i.e., $\pi P = \pi$), we have
$$\pi(h) \le C/(1 - c).$$
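As a numerical sanity check of the bound in Lemma 4.10, the following sketch uses a hypothetical five-state birth-death chain with $h(i) = 2^i$. The chain, the constants $C = 0.35$, $c = 0.95$ and all function names are illustrative assumptions, not taken from the text; the stationary distribution is obtained from detailed balance, which is a shortcut available for this particular chain only.

```python
# Illustrative chain on {0,...,4}: step down w.p. 0.7, up w.p. 0.3;
# a move that would leave the state space becomes a holding step.
N = 5
DOWN, UP = 0.7, 0.3


def transition_matrix():
    p = [[0.0] * N for _ in range(N)]
    for i in range(N):
        if i > 0:
            p[i][i - 1] = DOWN
        else:
            p[i][i] += DOWN
        if i < N - 1:
            p[i][i + 1] = UP
        else:
            p[i][i] += UP
    return p


def apply_kernel(p, h):
    """Return the vector Ph, i.e. (Ph)(i) = sum_j p[i][j] * h[j]."""
    return [sum(p[i][j] * h[j] for j in range(N)) for i in range(N)]


def stationary_distribution():
    """Detailed balance pi[i] * UP = pi[i+1] * DOWN, then normalise."""
    weights = [(UP / DOWN) ** i for i in range(N)]
    total = sum(weights)
    return [w / total for w in weights]


P = transition_matrix()
h = [2.0 ** i for i in range(N)]
C, c = 0.35, 0.95  # assumed constants for the drift condition Ph <= C + c h

Ph = apply_kernel(P, h)
drift_holds = all(Ph[i] <= C + c * h[i] + 1e-12 for i in range(N))

pi = stationary_distribution()
pi_h = sum(pi[i] * h[i] for i in range(N))
bound = C / (1 - c)
print("drift condition holds:", drift_holds)
print("pi(h) =", pi_h, " bound C/(1-c) =", bound)
```

The bound is far from sharp in this example; the point of the lemma is that it holds for every stationary distribution without computing it.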
Proof: As usual, let $P^{(0)} = I$, $P^{(n+1)} = P\,P^{(n)}$. By induction, we have
$$P^{(n)} h \le C(1 + c + \cdots + c^{n-1}) + c^n h \le \frac{C}{1 - c} + c^n h, \qquad n \ge 1. \tag{4.2}$$
Set $h_N = h \wedge N$, $N \ge 1$. Since $\pi(h_N) = \pi P^{(n)} h_N$, it follows that
Now, the assertion follows by letting $N \to \infty$.

Denote by $\mathrm{Lip}(E)$ the set of all Lipschitz continuous functions on the metric space $(E, \rho)$.
Theorem 4.11. Let $P(x, dy)$ be a transition probability function having the property: $Pf \in C_b(E)$ for each $f \in \mathrm{Lip}(E)$. Suppose that for some $x_0 \in E$, there exist a compact $h \in \mathscr{E}_+$ on $E$ and a constant $C \in [0, \infty)$ such that
$$\frac{1}{n} \sum_{m=1}^n P^{(m)} h(x_0) \le C, \qquad n \ge 1. \tag{4.3}$$
Then, $P$ has a stationary distribution $\pi$ having the property $\pi(h) \le C$. In particular, (4.1) implies (4.3).
Proof: Define
By (4.3) and Theorem 4.4, there is a subsequence $\{\mu_{n_k}\}_{k=1}^\infty$ such that $\mu_{n_k} \Rightarrow$ some $\mu$ as $k \to \infty$. Let $f \in \mathrm{Lip}(E)$. Then $\mu_{n_k}(f) \to \mu(f)$, $\mu_{n_k} P f \to \mu P f$ and

This proves the main assertion of the theorem. The other assertions are obvious.

To study the time-continuous case, we need two lemmas. For later use, we allow an operator to act on a column of functions. For instance, $\Omega$ acting on $(f_k : 1 \le k \le n)$ equals $(\Omega f_k : 1 \le k \le n)$.
Lemma 4.12. Let $\Omega$ be the operator corresponding to an arbitrary totally stable q-pair and let $g, f(t, \cdot) \in \mathscr{E}/\mathscr{B}(\mathbb{R}^d_+)$ for all $t \ge 0$. Suppose that $f(\cdot, x)$ is absolutely continuous for all $x \in E$. If
$$\frac{d}{dt} f(t) \ge \Omega f(t), \qquad \text{a.e. } t \tag{4.4}$$
and $f(0) \ge g$, then $P^{\min}(t) g \le f(t)$ for all $t \ge 0$.
Proof: Since $P^{\min}(t) g$ is the minimal solution to the equation
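by the comparison theorem, it suffices to show that
$$f(t) \ge \int_0^t e^{-q(t-s)} Q f(s)\,ds + e^{-qt} g, \qquad t \ge 0.$$
That is,
$$e^{qt} f(t) \ge \int_0^t e^{qs} Q f(s)\,ds + g, \qquad t \ge 0.$$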
Now, since this inequality is trivial at t = 0, we need only to show that
which is exactly condition (4.4).

Lemma 4.13. Let $(q(x), q(x, dy))$ be a conservative q-pair. Suppose that there exist a function $h \in \mathscr{E}/\mathscr{B}(\mathbb{R}^d_+)$, a vector $C \in \mathbb{R}^d_+$ and a $d \times d$ matrix $c$ with non-negative non-diagonals such that $\Omega h \le C + ch$. Then
Proof: Since
the assertion follows from the previous lemma.

The next result is an analogue of Theorem 4.11.
Theorem 4.14. Let $P(t)$ be a transition probability function of a Markov process. Suppose that
(1) $P(t) f \in C_b(E)$ for all $f \in \mathrm{Lip}(E)$ and $t \ge 0$.
(2) For some $x_0 \in E$, there exist a compact $h \in \mathscr{E}_+$ and a constant $C \in [0, \infty)$ such that
$$\frac{1}{T} \int_0^T P(t) h(x_0)\,dt \le C, \qquad T > 0.$$
Then $P(t)$ has a stationary distribution $\pi$ satisfying $\pi(h) \le C$. Moreover, condition (2) is satisfied if the condition in the previous lemma holds with $c < 0$ or $C = c = 0$ for the compact $h$, and in the former case, we indeed have $\pi(h) \le -C/c$ for every stationary distribution $\pi$ of $P(t)$.

Proof: By condition (2), we may assume that $\mu_{T_n}$ converges weakly to some $\mu$ as $T_n \to \infty$. Then for $f \in \mathrm{Lip}(E)$, we have
$$\mu_{T_n}(f) \to \mu(f), \qquad \mu_{T_n} P(s) f \to \mu P(s) f$$
and
$$|\mu_{T_n} P(s) f - \mu_{T_n}(f)| \le \frac{2s}{T_n} \sup_x |f(x)| \to 0, \qquad n \to \infty.$$
This proves the first assertion. For the last assertion, simply use Lemma 4.13:
and then apply Lemma 4.10. ∎

To conclude this section, we discuss some properties of an invariant measure of a q-process related to its q-pair.

Lemma 4.15. Let $P(t)$ be a q-process satisfying the forward Kolmogorov equation. Suppose that $\mu$ is a $\sigma$-finite invariant measure: $\mu P(t) = \mu$ for all $t \ge 0$. Then
(1) for each $A \in \mathscr{E}$ with $\mu(A) < \infty$ and $\sup_{x \in A} q(x) < \infty$, we have $\mu(\Omega I_A) = 0$. Equivalently,
(2) $\mu(fq) = \mu Q f$ for all $f \in \mathscr{E}_+$.
Proof: Since $\mu = \mu P(t)$, $t \ge 0$, it follows that
On the other hand, by the forward Kolmogorov equation, we have
Combining the above two equalities, we obtain
When $\mu(A) < \infty$, this gives us
By the monotone class theorem, we get
From this, assertion (1) follows. By using the monotone class theorem again, we obtain assertion (2).
Lemma 4.16. Let $\mu$ be a $\sigma$-finite measure satisfying condition (1) (or (2)) of Lemma 4.15. Then we have $\mu \ge \lambda \mu P^{\min}(\lambda)$.

Proof: In view of the above proof, we have

Note that then

and so $\mu \ge \lambda \mu P^{\min}(\lambda)$.
Theorem 4.17. Let $(q(x), q(x, dy))$ be regular and $\mu \in \mathscr{P}(E)$. Then $\mu$ is invariant iff condition (1) (or (2)) of Lemma 4.15 holds.
Proof: We have seen from Lemma 4.16 that $\mu \ge \lambda \mu P(\lambda)$ for all $\lambda > 0$. If this is a strict inequality for some $\lambda > 0$ and $A$, then
$$1 = \mu(A) + \mu(A^c) > \lambda \mu P(\lambda)(A) + \lambda \mu P(\lambda)(A^c) = 1,$$
which is a contradiction. Hence
$$\mu(A) = \lambda \mu P(\lambda)(A), \qquad \lambda > 0, \ A \in \mathscr{E}.$$
On the other hand, because $P(t, x, A)$ is continuous in $t$, so is $\mu P(t)(A)$ by the dominated convergence theorem. Thus, the uniqueness theorem of Laplace transforms gives us $\mu(A) = \mu P(t)(A)$ for all $t \ge 0$ and $A \in \mathscr{E}$.

4.3 Markov Chains: Time-discrete Case

The recurrence and positive recurrence of Markov chains are discussed in most textbooks on stochastic processes. In this and the next sections, we introduce some results, some of them still new, which are often not included in textbooks but are quite useful in practice. Let $E$ be a countable set. In many cases, we can assume that $E = \mathbb{Z}_+ = \{0, 1, 2, \ldots\}$ without loss of generality. Let $(X_n)_{n \ge 1}$ be a Markov chain on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$ with state space $E$ and transition probability function $P = (P_{ij} : i, j \in E)$. One of the fundamental facts in the study of Markov chains is that the state space of a Markov chain can be decomposed into a transient sub-class and some irreducible sub-classes, and in each irreducible sub-class, either the states are all recurrent (resp. positive recurrent) or not.
Definition 4.18. We call a matrix $c = (c_{ij})$ irreducible if for every $i$ and $j \ne i$, there exist pairwise distinct $i_0 = i, i_1, \ldots, i_n = j$ such that
$$c_{i_0 i_1} c_{i_1 i_2} \cdots c_{i_{n-1} i_n} \ne 0.$$
Similarly, we can define an irreducible Markov chain $P = (P_{ij})$ and Q-matrix. Finally, an irreducible Markov chain is called aperiodic if for each $i$, the greatest common divisor of the set of positive $n$ such that $P_{ii}^{(n)} > 0$ equals one.
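For a finite matrix, Definition 4.18 can be checked mechanically, since a chain of non-zero entries from $i$ to $j$ exists iff $j$ is reachable from $i$ in the directed graph of non-zero entries (any such walk can be shortened to one with distinct indices). The sketch below is only an illustration for finite state spaces; the function names are hypothetical, and the period is computed from return times up to a cut-off `max_steps`, which is an assumption that the cut-off is large enough.

```python
from collections import deque
from math import gcd


def is_irreducible(c):
    """Definition 4.18 for a finite square matrix given as a list of rows."""
    n = len(c)
    for start in range(n):
        seen = {start}
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                if c[i][j] != 0 and j not in seen:
                    seen.add(j)
                    queue.append(j)
        if len(seen) < n:  # some j is not reachable from start
            return False
    return True


def period(p, i, max_steps=None):
    """gcd of the return times n <= max_steps with (p^n)[i][i] > 0."""
    n = len(p)
    if max_steps is None:
        max_steps = 2 * n * n  # heuristic cut-off, assumed sufficient
    current = {i}  # states reachable from i in exactly `step` steps
    g = 0
    for step in range(1, max_steps + 1):
        current = {j for k in current for j in range(n) if p[k][j] > 0}
        if i in current:
            g = gcd(g, step)
    return g


CYCLE = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [1.0, 0.0, 0.0]]
LAZY = [[0.5, 0.5],
        [1.0, 0.0]]
print(is_irreducible(CYCLE), period(CYCLE, 0))
print(is_irreducible(LAZY), period(LAZY, 0))
```

A chain is aperiodic in the sense above when `period` returns 1 for every state; for an irreducible chain it suffices to check a single state.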
Lemma 4.19. Let $c_{ij}, b_i \in [0, \infty)$ for all $i, j \in E$.

(1) Let $(x_i^* : i \in E)$ be the minimal solution to the equation
$$x_i = \sum_j c_{ij} x_j + b_i, \qquad i \in E.$$
Given $j \in E$, suppose that there exist $i = i_0, i_1, \ldots, i_n = j$ such that $c_{i_0 i_1} c_{i_1 i_2} \cdots c_{i_{n-1} i_n} > 0$, then $x_i^* < \infty \Longrightarrow x_j^* < \infty$. If moreover,
$$\sum_j c_{ij} + b_i \le 1, \qquad i \in E, \tag{4.5}$$
then we have $x_i^* = 1 \Longrightarrow x_j^* = 1$. In particular, if $(c_{ij})$ is irreducible, then for these two cases we have, respectively, either $x_i^* = \infty$ or $x_i^* < \infty$ and either $x_i^* = 1$ or $x_i^* < 1$ for all $i \in E$, simultaneously.

(2) Let $(x_i^* : i \in E)$ be the minimal solution to the equation
$$x_i = \sum_{j \ne j_0} c_{ij} x_j + b_i, \qquad i \in E,$$
for a fixed $j_0 \in E$. Then $x_i^* < \infty$ for all $i \in E$ iff $x_i^* < \infty$ for all $i \ne j_0$ and $\sum_{k \ne j_0} c_{j_0 k} x_k^* < \infty$. If
$$\sum_{j \ne j_0} c_{ij} + b_i = 1, \qquad i \in E,$$
then $x_i^* = 1$ for all $i \in E$ iff $x_i^* = 1$ for all $i \ne j_0$.
Proof: a) Assume (4.5). Let $x_i^* = 1$ and $c_{i i_1} c_{i_1 i_2} \cdots c_{i_{n-1} i_n} > 0$. If $x_{i_1}^* < 1$, then

This is impossible. So we have $x_{i_1}^* = 1$. Successively, we have
By the irreducibility, we indeed have $x_i^* = 1$ for all $i \in E$.
b) Let $x_i^* < \infty$ and $c_{i i_1} c_{i_1 i_2} \cdots c_{i_{n-1} i_n} > 0$. If $x_{i_1}^* = \infty$, then
$$\infty > x_i^* \ge c_{i i_1} x_{i_1}^* = \infty.$$
This is impossible. In the same way, we can prove that
$$x_{i_2}^* < \infty, \ \ldots, \ x_{i_n}^* < \infty.$$
Now, the irreducibility implies that $x_j^* < \infty$ for all $j \in E$.
c) By the localization theorem, if $(x_i^* : i \in E)$ is the solution given in (2), then $(x_i^* : i \ne j_0)$ is the minimal solution to the equation
$$x_i = \sum_{j \ne j_0} c_{ij} x_j + b_i, \qquad i \ne j_0.$$
From this, the last assertion follows immediately. ∎
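The minimal non-negative solution appearing in Lemma 4.19 can be approximated by iterating $x \mapsto cx + b$ from $x = 0$; the iterates increase to the minimal solution. The following sketch does this for a finite system in the setting of part (2) with $j_0 = 0$. The three-state matrix `P`, the damping factor `r` and all function names are illustrative assumptions. With $r = 1$ the rows satisfy $\sum_{j \ne j_0} c_{ij} + b_i = 1$ and the minimal solution is identically one; with $r < 1$ it is strictly less than one.

```python
def minimal_solution(c, b, tol=1e-13, max_iter=100_000):
    """Iterate x <- c x + b from x = 0.

    The iterates are non-decreasing; for a finite minimal solution they
    converge to it, otherwise the loop simply stops after max_iter sweeps.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        nxt = [sum(c[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
        if max(abs(nxt[i] - x[i]) for i in range(n)) < tol:
            return nxt
        x = nxt
    return x


def hitting_system(p, j0, r=1.0):
    """Coefficients of x_i = sum_{j != j0} r p_ij x_j + r p_{i j0}."""
    n = len(p)
    c = [[r * p[i][j] if j != j0 else 0.0 for j in range(n)] for i in range(n)]
    b = [r * p[i][j0] for i in range(n)]
    return c, b


# Hypothetical reflecting random walk on {0, 1, 2}.
P = [[0.5, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 0.5]]

x_full = minimal_solution(*hitting_system(P, j0=0))
x_damped = minimal_solution(*hitting_system(P, j0=0, r=0.9))
print(x_full)
print(x_damped)
```

The damped system mirrors the device of replacing $(P_{ij})$ by $(rP_{ij})$, $0 < r < 1$, that is used later in the proof of Proposition 4.21.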
Lemma 4.20. Let $(x_i^{(k)*} : i \in E)$ be the minimal solution to the equation
$$x_i = \sum_j c_{ij} x_j + b_i^{(k)}, \qquad i \in E, \ k = 1, 2.$$
Suppose that
$$C_1 b_i^{(1)} \le b_i^{(2)} \le C_2 b_i^{(1)}, \qquad i \in E$$
for some $C_1, C_2 \in [0, \infty]$. Then
$$C_1 x_i^{(1)*} \le x_i^{(2)*} \le C_2 x_i^{(1)*}, \qquad i \in E.$$

Proof: Consider the case that $C_2 b_i^{(1)} \ge b_i^{(2)}$, $i \in E$. By Corollary 2.8, $(C_2 x_i^{(1)*} : i \in E)$ is the minimal solution to the equation
Then, as an application of the comparison theorem, we have
$$C_2 x_i^{(1)*} \ge x_i^{(2)*}, \qquad i \in E.$$
Similarly, we have
$$x_i^{(2)*} \ge C_1 x_i^{(1)*}, \qquad i \in E. \qquad ∎$$
Denote by $\mathbb{P}_i$ the probability of the Markov chain $(X_n)$ starting from $i$ and define

In particular,
Proposition 4.21. Let $P = (P_{ij})$ be an irreducible Markov chain and $H \ne \emptyset$ be a finite subset of $E$.

(1) The chain is recurrent iff $f_{iH}^* = 1$ for all $i \notin H$. Equivalently, $f_{iH}^* = 1$ for all $i \in E$.
(2) Let $(y_i^* : i \in E)$ be the minimal solution to the equation
$$y_i = \sum_k P_{ik} y_k + P_{iH}, \qquad i \in E,$$
where $P_{iH} = \sum_{k \in H} P_{ik}$. Then the chain is recurrent iff $y_i^* = \infty$ for all $i \in E$.
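Proof: a) Note that
$$f_{iH}^{(1)} = P_{iH}, \qquad f_{iH}^{(n+1)} = \sum_{k \notin H} P_{ik} f_{kH}^{(n)}, \qquad n \ge 1.$$
Since $f_{iH}^* = \sum_{n=1}^\infty f_{iH}^{(n)}$, it is easy to check, as we did in the proof of Proposition 2.14, that $(f_{iH}^* : i \in E)$ is actually the minimal solution to the equation
$$x_i = \sum_{j \notin H} P_{ij} x_j + P_{iH}, \qquad i \in E. \tag{4.6}$$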
On the other hand, by Corollary 2.8, it is easy to check that
j € H n=1
Hence, by the localization theorem, (y: : i E E ) is the minimal solution to the equation
Take
$$C_1 = \min_{j \in H} y_j^*, \qquad C_2 = \max_{j \in H} y_j^*.$$
By Eq. (4.6), Eq. (4.7) and Lemma 4.20, we obtain

Pick $j_0 \in H$ so that $C_1 = y_{j_0}^*$. Then $(1 + y_{j_0}^*) f_{j_0 H}^* \le y_{j_0}^*$. Thus, if $f_{j_0 H}^* = 1$, we must have $y_{j_0}^* = \infty$ and hence $y_i^* = \infty$ for all $i \in E$ by Lemma 4.19 (1).
b) Consider the special case that $H = \{j_0\}$. Then, the first assertion follows from Proposition 2.14 and Lemma 4.19 (2). To prove the second assertion, note that by a) we have $(1 + y_{j_0}^*) f_{j_0 j_0}^* = y_{j_0}^*$ and so $f_{j_0 j_0}^* = 1 \Longrightarrow y_{j_0}^* = \infty$. Conversely, let $f_{j_0 j_0}^* < 1$. Denote by $f_{i j_0}^*(r)$ and $y_i^*(r)$, respectively, the minimal solutions to (4.6) and the equations given in the proposition when $(P_{ij})$ is replaced by $(r P_{ij})$, $0 < r < 1$. It is clear that
$$f_{j_0 j_0}^*(r) \uparrow f_{j_0 j_0}^*, \qquad y_{j_0}^*(r) \uparrow y_{j_0}^* \qquad \text{as } r \uparrow 1.$$
The above argument gives us $(1 + y_{j_0}^*(r)) f_{j_0 j_0}^*(r) = y_{j_0}^*(r)$. Hence
This proves that $f_{j_0 j_0}^* < 1 \Longrightarrow y_{j_0}^* < \infty$.
c) We now return to general $H$. By the localization theorem, if $f_{iH}^* = 1$ for all $i \in E$, then the minimal solution to the equation
equals 1 for all $i \notin H$. This is equivalent to saying that $f_{iH}^* = 1$ for all $i \in E$.
d) The second assertion for general $H$ now follows from the expression of $y_i^*$, b) and Lemma 4.19 (1). ∎

For a given transition probability matrix $P = (P_{ij})$, define a new transition probability matrix $\bar P = (\bar P_{ij} : i, j \in E)$ as follows:
$$\bar P_{ij} = \begin{cases} \delta_{0j} & \text{if } i = 0, \\ P_{ij} & \text{if } i \ne 0. \end{cases}$$
Then 0 is an absorbing state. The following result is well known.

Lemma 4.22. For each $i \in E$, the limit $\bar\pi_{i0} = \lim_{n \to \infty} \bar P_{i0}^{(n)}$ exists and $\bar\pi_{00} = 1$.
k-I
it follows that
Lcrrima 4.23. For each i Theorem 4.24. Let P the equation
=:
1, j z l = iiio.
( P i j )be irreducible. Then, the chain is recurrent iff
has a (finite) solution ( y i ) so that limi4m yi = 00 (i.e., (yi) is compact) for some finite H
# 0.
Proof: In view of Eq. (4.6), $(f_{iH}^* : i \notin H)$ is indeed the minimal solution to the equation
$$x_i = \sum_{j \notin H} P_{ij} x_j + P_{iH}, \qquad i \notin H.$$
Hence, we may regard $H$ as a singleton $\{0\}$.
a) Sufficiency. Let $(y_i)$ be a solution having the desired property. Then
so and hence
Letting $n \to \infty$, and then $N \to \infty$, it follows that $\bar\pi_{i0} = 1$ for all $i \ne 0$. From this and Lemma 4.23, we see that $f_{i0}^* = 1$, $i \ne 0$. Finally, the assertion follows from
and the irreducibility of $P$.
b) Necessity. Denote by $(\bar X_n)$ the Markov chain with transition probability $(\bar P_{ij})$ and set

Then $f_0(n) = 0$ for all $n > 0$ and $f_i(n) = 1$ for all $i \ge n$. Because the original chain is recurrent, we have $f_{i0}^* = 1$ for all $i \ge 0$. Hence

Choose $n_k \uparrow \infty$ such that

and define $y_i = \sum_{k=1}^\infty f_i(n_k) < \infty$. Then it is easy to check that $(y_i)$ possesses the desired properties.
Theorem 4.25. Let P = (Pij) be irreducible. Then the chain is transient iff the equation JEE
has a non-constant bounded solution.
Proof: Again, regard $H$ as a singleton $\{0\}$. Suppose that the chain is transient. Then there exists an $i \ne 0$ so that $f_{i0}^* = \bar\pi_{i0} < 1$. Note that $\bar\pi_{i0}$ is the probability of the Markov chain finally returning to 0 starting from $i$. We have
F
jEE
From this and $\bar\pi_{00} = 1$, we see that $(y_i = \bar\pi_{i0},\ i \in E)$ is a desired solution. Conversely, suppose that the equation has a non-trivial bounded solution $(y_i)$. Without loss of generality, assume that
$$y_0 = 1, \qquad 0 < y_i < 2, \quad i \in E.$$
Then
Furthermore
Letting $n \to \infty$, by Lemma 4.22, we get $\bar\pi_{i0} \le y_i$ for all $i \in E$. Because $(y_i)$ is not a constant, we have either $y_{i_0} < 1$ or $y_{i_0} > 1$ for some $i_0$. In the former case, we have $\bar\pi_{i_0 0} < 1$. In the latter case, replacing $(y_i)$ with $(2 - y_i)$, we obtain the same conclusion. Now, the same argument given at the end of proof a) of Theorem 4.24 implies the required assertion. ∎

Now, we turn to study the positive recurrence of the chain $P = (P_{ij})$. We begin the study with a simple result.

Proposition 4.26. Let $P = (P_{ij})$ be irreducible. If the equation
has a non-trivial solution $(x_i)$ so that $\sum_i |x_i| < \infty$, then the chain is positive recurrent. Conversely, only if the chain is positive recurrent, the equation
$$\sum_{i \in E} x_i P_{ij} \le x_j, \quad x_j \ge 0, \quad j \in E, \qquad \sum_{i \in E} x_i < \infty$$
has a non-trivial solution.
Proof: For simplicity, assume that the chain is aperiodic. Then, the assertions are easy to check by using the fact: the limit
exists and is independent of $i$, and either $\pi_j \equiv 0$ or $\pi_j > 0$ for all $j \in E$. ∎
Let $P = (P_{ij})$ be an irreducible aperiodic Markov chain and $(\pi_i)$ be a probability measure.

Definition 4.27.
(1) The chain is called ergodic (equivalently, positive recurrent) if $P_{ij}^{(n)} \to \pi_j$ as $n \to \infty$ for all $i, j \in E$.
(2) The chain is called geometrically ergodic if there is some $\beta < 1$ such that for each $i$ and $j$, $|P_{ij}^{(n)} - \pi_j| = O(\beta^n)$ as $n \to \infty$.
(3) The chain is called strongly ergodic or uniformly ergodic if $\sup_i |P_{ij}^{(n)} - \pi_j| \to 0$ as $n \to \infty$.

Actually, these types of ergodicity have stronger properties. For instance, the geometrical ergodicity has the property: there exist $\beta < 1$ and $C_i$, depending on $i$ only, such that $|P_{ij}^{(n)} - \pi_j| \le C_i \beta^n$ for all $i$, $j$ and $n$. Furthermore, we have
Theorem 4.28.
(1) The chain is ergodic iff $\|P_{i\cdot}^{(n)} - \pi\|_{\mathrm{Var}} = \sum_j |P_{ij}^{(n)} - \pi_j| \to 0$ as $n \to \infty$ for all $i \in E$.
(2) The chain is geometrically ergodic iff $\|P_{i\cdot}^{(n)} - \pi\|_{\mathrm{Var}} = O(\beta^n)$ as $n \to \infty$ for some $\beta < 1$ and for all $i \in E$. Equivalently, $\sum_i \pi_i \|P_{i\cdot}^{(n)} - \pi\|_{\mathrm{Var}} = O(\beta^n)$ as $n \to \infty$ for some $\beta < 1$.
(3) The chain is strongly ergodic iff $\sup_i \|P_{i\cdot}^{(n)} - \pi\|_{\mathrm{Var}} = O(\beta^n)$ as $n \to \infty$ for some $\beta < 1$.
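To make the total variation quantity in Theorem 4.28 concrete, the sketch below computes $\sum_j |P_{ij}^{(n)} - \pi_j|$ for a hypothetical two-state chain (the entries `A`, `B` and the function names are assumptions chosen for illustration). For this chain the distance decays exactly geometrically with ratio $|1 - A - B|$, so the successive ratios printed at the end should all be close to $0.5$.

```python
def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[l] * b[l][j] for l in range(inner)) for j in range(cols)]
            for row in a]


def mat_power(p, n):
    """n-step transition matrix P^(n), with P^(0) the identity."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


def variation_norm(p, pi, i, n):
    """sum_j |P^(n)_{ij} - pi_j| for the starting state i."""
    row = mat_power(p, n)[i]
    return sum(abs(row[j] - pi[j]) for j in range(len(pi)))


A, B = 0.3, 0.2  # hypothetical jump probabilities 0 -> 1 and 1 -> 0
P = [[1 - A, A],
     [B, 1 - B]]
PI = [B / (A + B), A / (A + B)]  # stationary distribution of this chain

norms = [variation_norm(P, PI, 0, n) for n in range(1, 8)]
ratios = [norms[k + 1] / norms[k] for k in range(len(norms) - 1)]
print(norms)
print(ratios)
```

On a finite state space ergodicity, geometric ergodicity and strong ergodicity coincide; the distinctions in Definition 4.27 only matter for infinite $E$.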
To state the probabilistic meaning of the above different types of ergodicity, we need a lemma as follows.
Lemma 4.29. Let $m_{iH} = \sum_{n=1}^\infty n f_{iH}^{(n)}$, $i \in E$. Then $(m_{iH} : i \in E)$ is the minimal solution to the equation
Proof: The argument is the same as that used in the proof of Proposition 2.15.

Theorem 4.30.
(1) The chain is ergodic iff $\mathbb{E}_i \sigma_H < \infty$ for all $i \in H$.
(2) The chain is geometrically ergodic iff $\mathbb{E}_i e^{\lambda \sigma_H} < \infty$ for some $\lambda > 0$ and for all $i \in H$.
(3) The chain is strongly ergodic iff $\sup_i \mathbb{E}_i \sigma_H < \infty$.

We now state the criteria for the three types of ergodicity.
Theorem 4.31. Let $H \ne \emptyset$ be a finite subset of $E$.
(1) The chain is ergodic iff the equation
has a finite non-negative solution.
(2) The chain is geometrically ergodic iff for some $\varepsilon > 0$ the equation
has a finite non-negative solution. (3) The chain is strongly ergodic iff Eq. (4.9) has a bounded non-negative solution.
Proof: For the above results, here we study the ordinary ergodicity only. For the others, the related references can be found in the last section of this chapter.
a) The assertion of Theorem 4.28 (1) is a well-known property of $\ell^1$-space. A direct proof is also easy. Given a probability $(\pi_j)$ and non-negative $(\pi_j^{(n)})$, assume that

and
Since $|x| = 2x^+ - x$, $x^+ := \max\{x, 0\}$, we have
Note that $(\pi_j - \pi_j^{(n)})^+ \le \pi_j$. The first term on the right-hand side goes to zero by the dominated convergence theorem. Now, the required assertion follows by setting $\pi_j^{(n)} = P_{ij}^{(n)}$.
b) Without loss of generality, from now on, assume that $0 \in H$. Let the chain be ergodic. Then, by Proposition 2.15 and Lemma 4.19 (2), $\mathbb{E}_i \sigma_0 < \infty$ for all $i$, where $(x_i^* := \mathbb{E}_i \sigma_0 : i \in E)$ is the minimal solution to the equation

But for $i \notin H$, $\mathbb{E}_i \sigma_H \le \mathbb{E}_i \sigma_0 < \infty$. Furthermore, by Lemma 4.29, we also have $\mathbb{E}_i \sigma_H < \infty$ for all $i \in H$. This proves Theorem 4.30 (1). Now, take $y_i = 0$ if $i \in H$ and $y_i = \mathbb{E}_i \sigma_H$ if $i \notin H$. By Lemma 4.29, we see that $(y_i)$ is a solution to (4.9).
c) Conversely, let $(y_i)$ be a solution to Eq. (4.9) with general $H$. Set
Then, from the finiteness of (yi) and
it follows that $(y_i^{(n)})$ is also finite for every $n \ge 1$. Moreover,
Hence
(n+2'/n< (1+ c ) -1
Y,
cCP"!3"+ n
yi(2) / n - 1.
n j E I - l r=l
Letting $n \to \infty$, because of the irreducibility, we obtain $0 \le (1 + c) \sum_{j \in H} \pi_j - 1$ and so $\sum_{j \in H} \pi_j \ge (1 + c)^{-1} > 0$. This is enough to guarantee the positive recurrence of the chain by the irreducibility. ∎
4.4 Markov Chains: Time-continuous Case

This section deals with the recurrence and the ergodicity for Q-processes. Given a conservative Q-matrix $Q = (q_{ij})$ on $E = \mathbb{Z}_+$, define
$$\Pi_{ij} = 1_{[q_i \ne 0]} (1 - \delta_{ij})\, q_{ij}/q_i + 1_{[q_i = 0]}\, \delta_{ij}, \qquad i, j \in E.$$
Then $(\Pi_{ij})$ is a transition probability matrix and is often called the embedding chain of the Q-process.
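The definition of the embedding chain translates directly into code for a finite conservative Q-matrix. The sketch below is illustrative only: the function name and the example matrix are hypothetical, and $q_i$ is read off as $-q_{ii}$, which assumes the matrix is conservative and totally stable. Note the second term of the formula: a state with $q_i = 0$ is absorbing for the Q-process and is given a self-loop in $\Pi$.

```python
def embedding_chain(q):
    """Embedding chain of a finite conservative Q-matrix (list of rows).

    Pi[i][j] = q[i][j] / q_i for j != i when q_i != 0,
    Pi[i][j] = delta_ij          when q_i == 0,
    with q_i = -q[i][i].
    """
    n = len(q)
    pi = [[0.0] * n for _ in range(n)]
    for i in range(n):
        qi = -q[i][i]
        for j in range(n):
            if qi != 0:
                pi[i][j] = 0.0 if i == j else q[i][j] / qi
            else:
                pi[i][j] = 1.0 if i == j else 0.0
    return pi


# Hypothetical example: state 2 has q_2 = 0 and is therefore absorbing.
Q_EXAMPLE = [[-2.0, 2.0, 0.0],
             [1.0, -3.0, 2.0],
             [0.0, 0.0, 0.0]]
PI_EXAMPLE = embedding_chain(Q_EXAMPLE)
for row in PI_EXAMPLE:
    print(row)
```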
Definition 4.32. We call the Markov chain $P(t) = (P_{ij}(t))$ recurrent if for each $h > 0$, $P(h)$ is recurrent. Equivalently, $\int_0^\infty P_{ii}(t)\,dt = \infty$ for all $i \in E$. Similarly, we call the Markov chain $P(t) = (P_{ij}(t))$ positive recurrent or ergodic if so is $P(h)$ for every $h > 0$ (cf. Lemma 4.42 below). Equivalently, $\lim_{t \to \infty} P_{ii}(t) = \pi_i > 0$ for all $i \in E$.
By using the first successive approximation scheme and Theorem 1.3, it is not difficult to prove the following simple fact.

Lemma 4.33. For a given Q-matrix, the minimal q-process is irreducible iff so is its Q-matrix.

As we mentioned in Section 4.2, the recurrence of the q-process can be reduced to that of the embedding chain. Let us copy Theorem 4.9 as follows.
Theorem 4.34. Let $Q = (q_{ij})$ be a conservative Q-matrix. Then

In particular, if $Q$ is irreducible and regular, then $(P_{ij}(t))$ is recurrent iff so is its embedding chain.
Combining Theorem 4.34 with Theorem 4.24, we obtain
Theorem 4.35. An irreducible conservative Q-matrix is regular with recurrent $P(t)$ iff the equation
$$\sum_{j \in E} \Pi_{ij} y_j \le y_i, \qquad i \notin H$$
has a (finite) compact solution $(y_i)$ for some finite $H \ne \emptyset$.

Now, we turn to study the positive recurrence.
Lemma 4.36. Let $Q = (q_{ij})$ be a regular irreducible Q-matrix. Then the limit
$$\lim_{t \to \infty} P_{ij}(t) =: \pi_j$$
exists for all $i$ and $j$ and it is indeed independent of $i$. Moreover, we have either $\sum_j \pi_j = 1$ or $\sum_j \pi_j = 0$.

Proof: By Theorem 1.3, for each $h > 0$, $(P_{ij}(h))$ is an irreducible and aperiodic Markov chain. Hence the desired assertion follows from
Theorem 4.37. Let $Q = (q_{ij})$ be a regular irreducible Q-matrix. Then $P(t)$ is positive recurrent iff the equation

(4.10)
has a summable, non-negative and non-trivial solution uniquely up to a constant.
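For birth-death Q-matrices a solution of the stationarity equation is available in closed form, which gives a quick way to experiment with Theorem 4.37. The sketch below assumes that Eq. (4.10) is the balance equation in the form $x_j q_j = \sum_{i \ne j} x_i q_{ij}$ used in the proof of Example 4.38, and it truncates the state space to finitely many states to keep the example finite; the rates and function names are hypothetical. It uses the weights $\mu_0 = 1$, $\mu_i = b_0 \cdots b_{i-1}/(a_1 \cdots a_i)$ from the proof of Example 4.39.

```python
def birth_death_q(a, b):
    """Finite birth-death Q-matrix on {0,...,n-1}.

    a[i] is the rate i -> i-1 (a[0] unused), b[i] the rate i -> i+1
    (b[n-1] unused). The diagonal makes every row sum to zero.
    """
    n = len(a)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            q[i][i - 1] = a[i]
        if i < n - 1:
            q[i][i + 1] = b[i]
        q[i][i] = -sum(q[i])
    return q


def symmetrizing_measure(a, b):
    """mu_0 = 1, mu_i = b_0 ... b_{i-1} / (a_1 ... a_i)."""
    mu = [1.0]
    for i in range(1, len(a)):
        mu.append(mu[-1] * b[i - 1] / a[i])
    return mu


def balance_residual(mu, q):
    """Vector (sum_i mu_i q_ij)_j, which vanishes iff mu solves mu Q = 0."""
    n = len(mu)
    return [sum(mu[i] * q[i][j] for i in range(n)) for j in range(n)]


A_RATES = [0.0, 2.0, 3.0, 4.0]   # hypothetical death rates
B_RATES = [1.0, 2.0, 3.0, 0.0]   # hypothetical birth rates
Q = birth_death_q(A_RATES, B_RATES)
MU = symmetrizing_measure(A_RATES, B_RATES)
print(MU)
print(balance_residual(MU, Q))
```

Normalising `MU` by its sum gives the stationary distribution of the truncated process; on the infinite state space the same formula applies and positive recurrence amounts to summability of $(\mu_i)$ once the process is regular.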
Proof: a) Let $P(t)$ be positive recurrent. By Lemma 4.36, we have

and

Corresponding to the time-discrete case, we get $\pi_j = \sum_i \pi_i P_{ij}(t)$. Next, from Theorem 4.17, it follows that
So $(\pi_i)$ is a solution to Eq. (4.10).
b) Let $(x_i)$ be a summable, non-negative and non-trivial solution to Eq. (4.10). Without loss of generality, assume that
$$x_i \ge 0, \quad i \in E; \qquad \sum_i x_i = 1.$$
Applying Theorem 4.17, we have
$$x_j = \sum_i x_i P_{ij}(t), \qquad t \ge 0, \ j \in E.$$
Letting $t \to \infty$, we obtain $0 \le x_j = \pi_j$. This proves not only the positive recurrence of $P(t)$ but also the uniqueness of the solution to Eq. (4.10).

The next two examples show that the positive recurrence, unlike the recurrence, of a Q-process cannot be reduced to that of its embedding chain.

Example 4.38. Take
$$p_i > 0, \quad \sum_{i=1}^\infty p_i = 1; \qquad \Pi_{0,i} = p_i, \quad i \ge 1; \qquad \Pi_{i0} = 1, \quad i \ge 1; \qquad \Pi_{ij} = 0 \quad \text{otherwise}$$
and $q_i = p_i/2$, $q_{ij} = q_i \Pi_{ij}$, $i \ne j$. Then $\Pi = (\Pi_{ij})$ is positive recurrent but not the Q-process.
Proof: Clearly, $\Pi = (\Pi_{ij})$ has uniquely a stationary distribution as follows: $\pi_0 = 1/2$, $\pi_i = p_i/2$, $i \ge 1$. On the other hand, since $Q = (q_{ij})$ is bounded, the Q-process is unique. By Theorem 4.37, if $P(t)$ is positive recurrent, then $x_j q_j = \sum_{i \ne j} x_i q_{ij}$. Hence for every $j \ge 1$,
$$x_j \pi_j = \sum_{i \ne j} x_i \pi_i \Pi_{ij} = x_0 \pi_0 p_j.$$
That is $x_j = x_0$. This implies that $\sum_i x_i = \infty$, which is in contradiction with Theorem 4.37. Therefore $P(t)$ is not positive recurrent.

We have seen an example for which the embedding chain is positive recurrent but not the Q-process. The next example goes in the opposite way.

Example 4.39. Consider a conservative birth-death Q-matrix with death rates $a_i > 0$, $i \ge 1$, and birth rates $b_i > 0$, $i \ge 0$. Choose $a_i = b_i$, $i \ge 1$, with $b_i \to \infty$ such that $\sum_{i=1}^\infty 1/b_i < \infty$. Then the Q-process is positive recurrent but not $\Pi = (\Pi_{ij})$.
Proof: By Corollary 3.18, the Q-process is unique. The birth-death process is symmetrizable with respect to $(\mu_i)$ (cf. Sections 6.1 and 6.2):
$$\mu_i P_{ij}(t) = \mu_j P_{ji}(t), \qquad i, j \in E, \ t \ge 0,$$
where
$$\mu_0 = 1, \qquad \mu_i = \frac{b_0 b_1 \cdots b_{i-1}}{a_1 a_2 \cdots a_i}, \quad i \ge 1.$$
Because
$$\sum_i \mu_i = 1 + b_0 \sum_{i=1}^\infty \frac{1}{b_i} < \infty,$$
$P(t)$ is positive recurrent. But $\Pi_{i,i-1} = \Pi_{i,i+1} = 1/2$, $i \ge 1$, $\Pi_{01} = 1$, so $\Pi$ is not positive recurrent.

In a special case, the positive recurrence for a Q-process is equivalent to that of its embedding chain. Moreover, the time-discrete case can often be reduced to the time-continuous one. Note that the time-continuous analogue of the geometric ergodicity is called exponential ergodicity, see Definition 4.41.
Theorem 4.40.
(1) Let $Q = (q_{ij})$ be a conservative irreducible Q-matrix satisfying $0 < c \le q_i \le C < \infty$ for some constants $c$ and $C$. Then $P(t)$ is ergodic (resp. strongly ergodic) iff so is $\Pi = (\Pi_{ij})$ and $P(t)$ is exponentially ergodic iff $\Pi = (\Pi_{ij})$ is geometrically ergodic.
(2) Let $P = (P_{ij})$ be irreducible, then $P$ is recurrent iff so is the process generated by the Q-matrix $Q = P - I$. Assume additionally that $P$ is aperiodic, then the same conclusion as in (1) holds for the three types of ergodicity.
Proof: a) The Q-matrix is regular since it is bounded. Note that there is a one-to-one correspondence $\mu_j = \pi_j q_j$, $j \in E$, between the non-negative solutions to the equations
So they are simultaneously equal to or greater than zero identically. Now the ergodic assertion in (1) follows from $c \sum_j \pi_j \le \sum_j \mu_j \le C \sum_j \pi_j$, Proposition 4.26 and Theorem 4.37.
b) All the ergodic assertions in the theorem can be checked by using the criteria Theorems 4.31 and 4.45 below.
c) It remains to prove the recurrence assertion in (2). Fix an arbitrary point, say $0 \in E$. Then we have

The chain $(P_{ij})$ is recurrent
$\iff$ The minimal solution to the equation $x_i = \sum_{j \ne 0} P_{ij} x_j + P_{i0}$, $i \in E$, equals one identically (by Proposition 4.21 and (4.6))
$\iff$ The equation $x_i = \sum_{j \ne 0} P_{ij} x_j$, $0 \le x_i \le 1$, $i \in E$, has only the trivial solution zero (by Theorem 2.21)
$\iff$ The equation

has only the trivial solution zero
$\iff$ The $(Q = P - I)$-process is recurrent (by Lemma 4.51 below). ∎
We have seen that the positive recurrence of $P(t)$ cannot be determined completely by its embedding chain $\Pi$. So we need to study this problem more carefully. As we did for the time-discrete case, we now study the three types of ergodicity.

Definition 4.41. Let $Q = (q_{ij})$ be a regular irreducible Q-matrix and $(\pi_i)$ be a probability measure.
(1) The chain $P(t)$ is called exponentially ergodic if there is some $\beta > 0$ such that for each $i$ and $j$, $|P_{ij}(t) - \pi_j| = O(e^{-\beta t})$ as $t \to \infty$.
(2) The chain is called strongly ergodic or uniformly ergodic if $\sup_i |P_{ij}(t) - \pi_j| \to 0$ as $t \to \infty$.
For general Markov processes, Definition 4.41 is meaningful once the pointwise convergence is replaced by the convergence in total variation (cf. Theorem 4.28 and Theorem 4.43 below). The next result enables us to transfer the time-discrete case into the time-continuous one.

Lemma 4.42. Let $(X_t)_{t \ge 0}$ be a Markov process. If for any $h > 0$, the skeleton $(X(nh))_{n \ge 0}$ is (resp. geometrically, strongly) ergodic, then $(X(t))$ is (resp. exponentially, strongly) ergodic.
Proof: Recall that for any finite (signed) measure $\mu$ on $(E, \mathscr{E})$, its total variation norm can be represented as follows (by Hahn-Jordan decomposition):
For simplicity, in this proof, we omit the subscript "Var".
a) The key fact in the proof is that the function $t \mapsto \|\nu P(t)\|$ is non-increasing on $[0, \infty)$ for every bounded signed measure $\nu$ on $(E, \mathscr{E})$. Actually,
Next, we show that for these types of ergodicity, the initial distribution $\mu$ can be replaced by an initial point mass. Actually, if

for all $x \in E$, then for any $\mu \in \mathscr{P}(E)$,

as $t \to \infty$, since $\|\mu_1 - \mu_2\| \le 2$ for any probability measures $\mu_1$ and $\mu_2$.
b) Suppose that $(X(nh))$ is ergodic, i.e., $\pi$ is an invariant probability measure for $P(h)$ and $\|\mu P(nh) - \pi\| \to 0$ as $n \to \infty$ for every probability measure $\mu$. Let $t > 0$ and $(k - 1)h \le t < kh$ ($k \in \mathbb{N}$). By a), we have

and so $\pi P(t) = \pi$. Furthermore
c) Suppose that $(X(nh))$ is geometrically ergodic, i.e.,
$$\int \pi(dx)\, \|P(nh, x, \cdot) - \pi\| = O\big(e^{-n\beta(h)}\big) \qquad \text{as } n \to \infty$$
for some $\beta(h) > 0$. Again, let $t > 0$ and $(k - 1)h \le t < kh$ ($k \in \mathbb{N}$). Then, we have
$$\int \pi(dx)\, \|P(t, x, \cdot) - \pi\| = O\big(e^{-t\beta(h)/h}\big).$$
d) Similarly, we can prove the assertion about the strong ergodicity. ∎

Theorem 4.43.
(1) The chain is ergodic iff $\|P_{i\cdot}(t) - \pi\|_{\mathrm{Var}} = \sum_j |P_{ij}(t) - \pi_j| \to 0$ as $t \to \infty$ for all $i \in E$.
(2) The chain is exponentially ergodic iff $\|P_{i\cdot}(t) - \pi\|_{\mathrm{Var}} = O(e^{-\beta t})$ as $t \to \infty$ for some $\beta > 0$ and for all $i \in E$. Equivalently, $\sum_i \pi_i \|P_{i\cdot}(t) - \pi\|_{\mathrm{Var}} = O(e^{-\beta t})$ as $t \to \infty$ for some $\beta > 0$.
(3) The chain is strongly ergodic iff $\sup_i \|P_{i\cdot}(t) - \pi\|_{\mathrm{Var}} = O(e^{-\beta t})$ as $t \to \infty$ for some $\beta > 0$.
Next, for a given regular irreducible Q-matrix $Q = (q_{ij})$, let $(X(t))_{t \ge 0}$ be the corresponding Markov chain defined on a probability space $(\Omega, \mathscr{F}, \mathbb{P})$. Its successive jumps are given by
$$\tau_0 = 0, \qquad \tau_n = \inf\{t : t > \tau_{n-1},\ X(t) \ne X(\tau_{n-1})\}, \quad n \ge 1.$$
Due to the regularity, we have $\tau := \lim_{n \to \infty} \tau_n = \infty$. Let $H$ be a non-empty finite subset of $E$ and define $\sigma_H = \inf\{t \ge \tau_1 : X_t \in H\}$. Then, we have the following probabilistic criteria.

Theorem 4.44.
(1) The chain is ergodic iff $\mathbb{E}_i \sigma_H < \infty$ for all $i \in H$.
(2) The chain is exponentially ergodic iff $\mathbb{E}_i e^{\lambda \sigma_H} < \infty$ for all $i \in H$, where $0 < \lambda < q_i$ for all $i \in E$.
(3) The chain is strongly ergodic iff $\sup_i \mathbb{E}_i \sigma_H < \infty$.
The analytic criteria for the three types of ergodicity are the following.

Theorem 4.45. Let $H \ne \emptyset$ be a finite subset of $E$.
(1) The chain is ergodic iff the equation

(4.11)
has a finite non-negative solution.
(2) The chain is exponentially ergodic iff for some $\lambda > 0$ with $\lambda < q_i$ for all $i \in E$, the equation

(4.12)
has a finite non-negative solution.
(3) The chain is strongly ergodic iff Eq. (4.11) has a bounded non-negative solution.

Replacing $(y_i)$ in (4.12) with $(\tilde y_i = y_i + 1)$, one can rewrite (4.12) as
To explain how to deduce these results, we need some preparations. Define
n= 1
Lemma 4.46. We have
$$f_{iH}^{(1)} = \Pi_{iH}, \qquad f_{iH}^{(n+1)} = \sum_{k \notin H} \Pi_{ik} f_{kH}^{(n)}, \qquad n \ge 1.$$
Furthermore, $(f_{iH} : i \in E)$ is the minimal solution to the equation

and the Markov chain is recurrent iff $f_{iH} = 1$ for all $i \in E$.
Proof: Note that
$$f_{iH}^{(1)} = \mathbb{P}_i[\sigma_H = \tau_1] = \Pi_{iH}$$
and

The remaining assertions are then obvious. ∎
Lemma 4.47. We have
Proof: Obviously,
Next,
But by the strong Markov property, we have
and
By induction, this proves the required assertion. ∎

Lemma 4.48. For $\lambda \in \mathbb{R}$ with $\lambda < q_i$ for all $i \in E$, define

Then we have
In particular, $(e_{iH}(\lambda) : i \in E)$ is the minimal solution to the equation
Proof: The first assertion follows from Lemma 4.47 immediately. To prove the last assertion, apply Theorem 2.9 to
k$H
TZ-
1
and note that $g_i = f_{iH}$ ($i \in E$) by Lemma 4.46. ∎

Equivalence of Theorem 4.45 and Theorem 4.44: a) Assume that the chain is recurrent. Then $f_{iH} = 1$ for all $i$ by Lemma 4.46. Hence by Lemma 4.48, $(e_{iH}(\lambda) : i \in E)$ is the minimal solution to the equation

On the other hand, since $f_{iH}(t) = \mathbb{P}_i[\sigma_H > t]$, by the Fubini theorem, we have
By the assumptions of Theorem 4.44, $e_{iH}(\lambda) < \infty$ for all $i \in H$. From this, we show that $e_{iH}(\lambda) < \infty$ for all $i \notin H$. Given $i \notin H$, choose $i_0, i_1, \ldots, i_n$ with $i_n = i$ so that $i_0 \in H$ and

Without loss of generality, assume that $i_k \notin H$: $1 \le k \le n$. Since $e_{i_0 H}(\lambda) < \infty$, the argument given in Lemma 4.19 shows that $e_{i_1 H}(\lambda) < \infty$, $e_{i_2 H}(\lambda) < \infty$, $\ldots$, $e_{i_n H}(\lambda) < \infty$, successively. Next, $(e_{iH}(\lambda) : i \notin H)$ is the minimal solution to the equation

(4.14)
Thus, if we set $y_i = 0$ for $i \in H$ and $y_i = e_{iH}(\lambda)$ for $i \notin H$, then $(y_i)$ satisfies the corresponding inequalities listed in Theorem 4.45 (in the first and the third cases, $\lambda$ is set to be 0).
b) Conversely, if the inequalities in Theorem 4.45 hold, then $e_{iH}(\lambda) < \infty$ for all $i \notin H$ since $(e_{iH}(\lambda) : i \notin H)$ is the minimal solution to Eq. (4.14) by a). Then, by Lemma 4.48 and the assumptions, we indeed have $e_{iH}(\lambda) < \infty$ for all $i \in E$. This means that the corresponding conditions of Theorem 4.44 hold.
Proof of the ergodicity: We now show that, similarly to the time-discrete case, we can prove the ergodicity more directly. Let $(x_i)$ be a solution to the equation
$$\sum_j q_{ij} x_j + 1 \le 0, \quad i \notin H; \qquad \sum_{i \in H} \sum_{j \ne i} q_{ij} x_j < \infty.$$
Define
Then
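where $c := (\inf\{c_j/\lambda : j \in E\}) \wedge 0 > -\infty$. Thus, by the comparison theorem, we obtain
$$x_i - c \ge \sum_j P_{ij}(\lambda)(c_j - \lambda c), \qquad i \in E.$$
Now, by the regularity, $\lambda \sum_j P_{ij}(\lambda) = 1$, hence
$$x_i \ge \sum_j P_{ij}(\lambda) c_j, \qquad i \in E, \ \lambda > 0.$$
Furthermore,

That is
$$\lambda x_i \ge 1 + \sum_{j \in H} \lambda P_{ij}(\lambda)(c_j - 1), \qquad i \in E, \ \lambda > 0.$$
Since
$$\lim_{\lambda \to 0} \lambda P_{ij}(\lambda) = \lim_{t \to \infty} P_{ij}(t) = \pi_{ij} = \pi_j,$$
it follows that $0 \ge 1 + \sum_{j \in H} \pi_j (c_j - 1)$. This shows that $\pi_j > 0$ for all $j$ by the irreducibility. ∎

We now return to a condition given in Theorem 4.14.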
Corollary 4.49. Let $Q = (q_{ij})$ be a conservative irreducible Q-matrix. Suppose that there exist a compact $h \in \mathscr{E}_+$ and constants $C \ge 0$ and $c > 0$ such that
$$\Omega h \le C - ch. \tag{4.15}$$
Then the Q-matrix is regular and the Q-process is exponentially ergodic.

Proof: The uniqueness of the process is a straightforward consequence of Theorem 2.25. Since $h$ is compact, we can choose a finite $H$ so that $C - (c/2)h \le -1$ on $H^c$. Now, the assertion follows from Theorem 4.45.

Applying Corollary 4.49 with $h(x) = \sum_u x_u$ to Schlögl's model, it is easy to check that the model is exponentially ergodic. Actually, it is strongly ergodic, as can be shown by Theorem 4.59.
Example 4.50 (finite dimensional Brussel's model). Let $S$ be a finite set, $E = (\mathbb{Z}_+^2)^S$ and let $p_k(u, v)$ be a transition probability on $S$, $k = 1, 2$. Denote by $e_{u1} \in E$ the unit vector whose first component at site $u \in S$ is equal to 1 and whose second component at $u$, as well as all components at $v \ne u$, equal 0. Similarly, one can define $e_{u2}$. The model is described by the conservative Q-matrix:
$q(x, y) =$
and $q(x) = \sum_{y \ne x} q(x, y)$, where $x = ((x_1(u), x_2(u)) : u \in S) \in E$, $\lambda_i > 0$, $i = 1, 2, 3, 4$. This model is exponentially but not uniformly ergodic.
Sketch of the proof: This is a typical model of reaction-diffusion process with several species. Since the full proof is tedious, here we sketch the proof only. Let
$$\varphi(m, n) = \begin{cases} m + (1 + 2\varepsilon) n - \varepsilon^2 m n, & \text{if } m \le 1, \\ m + (1 + \varepsilon) n - \varepsilon^2 m \log(n+1)/(m+1), & \text{if } m \ge 2, \end{cases} \qquad (m, n) \in \mathbb{Z}_+^2.$$
Next, define
$$h(x) = \sum_{u \in S} \varphi(x_1(u), x_2(u)), \qquad x \in E.$$
Then, by some careful estimations, one can check that for sufficiently small $\varepsilon$, (4.15) holds for this $h$ and some constants $C \ge 0$ and $c > 0$.
The proof for non-uniform ergodicity is based on a comparison with birth-death processes, as a dual of what is used in Theorem 3.19 or Theorem 4.59 below. Alternatively, one may adopt the coupling argument to make the comparison. Refer to Wu and Zhang (2003) for the details.

4.5 Single Birth Processes
In this section, we apply the above results to the single birth processes and multidimensional Q-processes. For this, we need a simple result.
Lemma 4.51. Let $Q = (q_{ij})$ be a regular and irreducible Q-matrix. Then $P(t)$ is recurrent iff the equation
(4.16)
has only the zero solution for some (equivalently, for any) fixed $j_0$.
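The display (4.16) is not legible in this copy. Judging from how the lemma is applied in part a) of the proof of Theorem 4.59, where the equation $u(x) = \sum_{y \ne \theta} \Pi(x, y) u(y)$, $0 \le u(x) \le 1$, is used, a plausible reading is the following. This is an assumption made for orientation, not a quotation; the inhomogeneous companion is inferred only from the remarks in the proof that (4.16) is the homogeneous equation of (4.17) and that $x_i \equiv 1$ solves (4.17).

```latex
% Assumed reading of (4.16), with \Pi_{ik} := (1 - \delta_{ik}) q_{ik} / q_i :
x_i = \sum_{k \ne j_0} \Pi_{ik}\, x_k, \qquad 0 \le x_i \le 1, \quad i \in E.

% Assumed reading of (4.17); x_i \equiv 1 solves it because \sum_k \Pi_{ik} = 1
% for a conservative Q-matrix:
x_i = \sum_{k \ne j_0} \Pi_{ik}\, x_k + \Pi_{i j_0}, \qquad i \in E.
```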
Proof: By Lemma 4.46, it suffices to show that Eq. (4.16) has only the trivial solution iff the minimal solution $(x_i^*)$ to the equation
(4.17)
equals one identically. Note that Eq. (4.16) is the homogeneous equation of Eq. (4.17). On the other hand, since $(x_i = 1 : i \in E)$ is a non-negative solution to Eq. (4.17), we have $x_i^* \le 1$ for all $i$. These facts are enough to imply the required assertion. ∎

Now, we consider the single birth process defined by Definition 3.15. Recall the notations $(F_n^{(k)} : n \ge k \ge 0)$, $(m_n : n \ge 0)$ and $(d_n : n \ge 0)$ defined in (3.20), (3.19) and the proof I) of Theorem 3.16, when $N = 0$,
(4.18)
where $q_k^{(i)} := \sum_{j=0}^{i} q_{kj}$ for $0 \le i < k$ and $k \ge 1$. To save our notations however, when $N \ge 1$, we use the same notations to denote the sequences defined by
(4.18), replacing the original $(q_{ij})$ by the following $(\tilde q_{ij})$:

This will cause no confusion since the case of $N \ge 1$ can be reduced to the one of $N = 0$, as we did in Theorem 3.16. It is interesting that, among the three sequences, the second and third ones are both expressed in terms of the first one. Next, recall $\sigma_N = \inf\{t \ge \tau_1 : 0 \le X_t \le N - 1\}$.
Note that, when $N = 0$, the next result describes the recurrence and three types of ergodicity.
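Theorem 4.52. Let $Q = (q_{ij})$ be a regular single birth Q-matrix. Assume that for each $i_0 \ge N$, there exist some $i_1, \ldots, i_m$ such that $i_m \le N - 1$ and $q_{i_0 i_1} q_{i_1 i_2} \cdots q_{i_{m-1} i_m} > 0$. Then
(1) $\mathbb{P}_i[\sigma_N < \infty] = 1$ for every $i \ge N$ iff $\sum_{k=0}^{\infty} F_k^{(0)} = \infty$.
(2) $\mathbb{E}_i \sigma_N < \infty$ for every $i \ge N$ iff
$$d := \sup_{k \ge 0} \frac{\sum_{n=0}^{k} d_n}{\sum_{n=0}^{k} F_n^{(0)}} < \infty. \tag{4.19}$$
(3) $\mathbb{E}_i e^{\lambda \sigma_N} < \infty$ for some $\lambda > 0$ and all $i \ge N$ if $\inf_i q_i > 0$ and
$$M := \sup_{k > 0} \sum_{j=0}^{k-1} F_j^{(0)} \sum_{j=k}^{\infty} \frac{1}{q_{j,j+1} F_j^{(0)}} < \infty. \tag{4.20}$$
(4) $\sup_{i \ge N} \mathbb{E}_i \sigma_N < \infty$ iff
$$\sup_{k \ge 0} \sum_{n=0}^{k} \big(F_n^{(0)} d - d_n\big) < \infty. \tag{4.21}$$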
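The quantities in Theorem 4.52 are all built from the sequences $(F_n^{(0)})$, $(m_n)$ and $(d_n)$. The sketch below computes them exactly for $N = 0$. Since the display (4.18) is not legible here, the recursions in the docstring are an assumption (the form commonly used for single birth processes), and the names `single_birth_sequences` and `birth_death` are hypothetical helpers introduced only for illustration.

```python
from fractions import Fraction


def single_birth_sequences(q, n_max):
    """Return (F0, m, d) up to index n_max for a single birth Q-matrix.

    q(i, j) is a callable giving the off-diagonal rate q_ij.
    Assumed recursions (case N = 0):
      q_n^{(k)} = q_n0 + ... + q_nk,                    0 <= k < n,
      F_0^{(0)} = 1,  F_n^{(0)} = (1/q_{n,n+1}) * sum_{k<n} q_n^{(k)} F_k^{(0)},
      m_0 = 1/q_01,   m_n = (1/q_{n,n+1}) * (1 + sum_{k<n} q_n^{(k)} m_k),
      d_0 = 0,        d_n = (1/q_{n,n+1}) * (1 + sum_{k<n} q_n^{(k)} d_k).
    """
    F0 = [Fraction(1)]
    m = [Fraction(1) / q(0, 1)]
    d = [Fraction(0)]
    for n in range(1, n_max + 1):
        up = Fraction(q(n, n + 1))
        # Partial sums q_n^{(k)} for k = 0, ..., n-1.
        qk, acc = [], Fraction(0)
        for k in range(n):
            acc += q(n, k)
            qk.append(acc)
        F0.append(sum(qk[k] * F0[k] for k in range(n)) / up)
        m.append((1 + sum(qk[k] * m[k] for k in range(n))) / up)
        d.append((1 + sum(qk[k] * d[k] for k in range(n))) / up)
    return F0, m, d


def birth_death(i, j):
    """Simplest test case: birth-death rates a_n = b_n = 1."""
    return Fraction(1) if abs(i - j) == 1 else Fraction(0)


F0, m, d = single_birth_sequences(birth_death, 5)
```

For this symmetric birth-death case the recursions collapse to $F_n^{(0)} = 1$, $m_n = n + 1$ and $d_n = n$, so $\sum_k F_k^{(0)} = \infty$ while the ratio in (4.19) grows without bound, which matches the null recurrence of this chain.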
Proof: a) First, we remark that when $N \ge 1$, Theorem 4.45 is still available with a slight modification. Note that if $\mathbb{E}_i \sigma_H < \infty$ for all $i \notin H$, then $\mathbb{P}_i[\sigma_H < \infty] = 1$ for all $i \notin H$. The same conclusion holds if $\mathbb{E}_i e^{\lambda \sigma_H} < \infty$ ($\lambda > 0$) for all $i \notin H$ or $\sup_{i \notin H} \mathbb{E}_i \sigma_H < \infty$. Conversely, assume that $\mathbb{P}_i[\sigma_H < \infty] = 1$ for all $i \notin H$. Then, as shown in the proof of the equivalence of Theorems 4.45 and 4.44, $\{e_{iH}(\lambda) := (\mathbb{E}_i e^{\lambda \sigma_H} - 1)/\lambda : i \notin H\}$ is the minimal solution to the equation (4.14). Thus, the criterion for the finiteness of $\{e_{iH}(\lambda) : i \notin H\}$ is the same as given in Theorem 4.45 with $y_i = 0$ for all $i \in H$. The second condition in (4.11) or (4.12) can be ignored for the single birth processes.
b) Without loss of generality, one may regard the set $\{0, \ldots, N - 1\}$ as a single point 0. The resulting Markov chain is a single birth process with absorbing state 0. We have thus reduced the general case of $N \ge 2$ to the one of $N = 1$. Replacing $q_0 = q_{01}$ with a positive number (which is indeed not used below), we obtain an irreducible Markov chain $(\tilde X_t)$. Define $\tilde\tau_1$ = the first jumping time of $(\tilde X_t)$ and set $\tilde\sigma_0 = \inf\{t \ge \tilde\tau_1 : \tilde X_t = 0\}$. By a) (cf. Proposition 4.21, Theorem 4.30 and Lemma 4.19), $\mathbb{P}_i[\sigma_0 < \infty] = 1 \iff \mathbb{P}_i[\tilde\sigma_0 < \infty] = 1$. Similarly, $\mathbb{E}_i \sigma_0 < \infty \iff \mathbb{E}_i \tilde\sigma_0 < \infty$ and so on. Thus, even though it is not necessary, from now on, we can further assume that the chain is irreducible, and so reduce the case of $N = 1$ to the one of $N = 0$.
c) To prove the first assertion, we need only to show that Eq. (4.16) has a non-trivial solution iff $\sum_{k=0}^{\infty} F_k^{(0)} < \infty$. Set $j_0 = 0$. Then Eq. (4.16) has a non-trivial solution iff the equation

has a non-negative bounded solution. But the solution to the last equation is unique, which is $x_0 = 1 = x_1$ and

Clearly, $x_i$ is increasing in $i$. Now, the problem is reduced to showing that $(x_i)$ is bounded iff $\sum_{k=0}^{\infty} F_k^{(0)} < \infty$. By using the summation by parts formula:
where
with
We obtain
and
Hence
This shows that $(y_i := x_{i+1} - x_i : i \ge 1)$ is a non-negative solution to the equation

But the solution to this equation is also unique:

By induction, it is easy to check that $y_i = F_i^{(0)}$ for all $i \ge 1$. Combining the above facts, we obtain
$$x_0 = 1 = F_0^{(0)}, \qquad x_{i+1} - x_i = F_i^{(0)}, \quad i \ge 1.$$
This certainly implies the required assertion.
d) To prove the second assertion, we use Theorem 4.45 (1). Let $(u_i)$ ($u_0 = 0$) be a non-negative solution to Eq. (4.11) with $H = \{0\}$. Then

From this and induction, it follows that
(4.22)
where $v_k = u_{k+1} - u_k$, $k \ge 0$. Hence

This gives us $d \le u_1 < \infty$. Conversely, assume $d < \infty$. Set
(4.23)
Obviously, we have $\sum_{j \ne 0} q_{0j} u_j = q_{01} u_1 < \infty$. Moreover, for each $k > 0$, we have

That is, $\sum_j q_{kj} u_j + 1 = 0$, $k > 0$.
e) In view of Theorem 4.45 (2), the condition $\inf_i q_i > 0$ is indeed necessary. We need to construct a solution $(g_i)$ to Eq. (4.13) with $H = \{0\}$ for a fixed $\lambda$: $0 < \lambda < \inf_i q_i$. First, define an operator
This operator essentially comes from the study of the spectral gap, and will be discussed in more detail in Chapter 9. Next, define
Then p is increasing in i and cp1 = q&'. Let f = c q l 0 m for some c > 1. Then f is increasing and f l = cql0. Finally, define g = f I I ( f ) . Then g is increasing and
We now need a technical result, taken from Chen (2000b), whose proof is given right after its statement.
Lemma 4.53. Let $(m_i)$ and $(n_i)$ be non-negative sequences, $n_i \not\equiv 0$, satisfying

Define $\varphi_k = \sum_{j \le k} n_j$. Then for every $\gamma \in (0, 1)$, we have
Proof: Let $M_n = \sum_{j \ge n} m_j$. Fix $N > i$. Then by the summation by parts formula and the assumption $M_n \le c \varphi_n^{-1}$, we get

By using the elementary inequality
$$\gamma (1 - \gamma)^{-1} (x^{\gamma - 1} - 1) + x^{\gamma} \ge 1 \qquad (x > 0),$$
it is easy to check that

Collecting these facts together gives us the required assertion. ∎

We now return to our main proof. By Lemma 4.53, it follows that

Let $g_0 = 1$. Then $1 \le g_i < \infty$ for all $i \ge 0$. We now determine $\lambda$ in terms of Eq. (4.13). When $i = 1$, we get $\lambda \le (c - 1) c^{-1} II_1(f)^{-1}$. When $i \ge 2$, we should have
For this, it suffices that

In other words, for (4.13), we need only $\lambda \le f_i/g_i = II_i(f)^{-1}$ for all $i \ge 2$ and $\lambda \le (c - 1) c^{-1} II_1(f)^{-1}$. Then we can take any $\lambda$:

provided the right-hand side is positive, or equivalently $\sup_{i \ge 2} II_i(f) < \infty$. To prove the last property, define another operator

which again comes from the study of the spectral gap. By the proportion property, we get

By Lemma 4.53 and the condition $M < \infty$, it follows that

for all $i \ge 1$. Therefore, $\sup_{i \ge 1} II_i(f) \le 4M < \infty$ as required. We have thus constructed a finite solution $(g_i)$ to Eq. (4.13) with $1 \le g_i < \infty$ for all $i$. This implies the exponential ergodicity of the process.
f) Finally, we prove the fourth assertion. To begin with, we prove that the equation
$$\sum_j q_{ij} y_j = -1, \quad i \ge 1; \qquad y_0 = 0 \tag{4.24}$$
has a bounded non-negative solution iff (4.21) holds. If so,
$$d := \sup_{k \ge 0} \frac{\sum_{n=0}^{k} d_n}{\sum_{n=0}^{k} F_n^{(0)}} = \lim_{k \to \infty} \frac{\sum_{n=0}^{k} d_n}{\sum_{n=0}^{k} F_n^{(0)}}$$
and the unique solution to (4.24) is as follows:
$$y_0 = 0, \qquad y_k = \sum_{n=0}^{k-1} \big(F_n^{(0)} d - d_n\big), \quad k \ge 1. \tag{4.25}$$
First, assume that (4.21) holds and define $(y_i)$ by (4.25). Then, it should be easy to verify that $(y_i)$ is a bounded non-negative solution of (4.24). Next, let $(y_i)$ be a bounded non-negative solution of (4.24) and define $v_n = y_{n+1} - y_n$ for $n \ge 0$. From (4.24), it is not difficult to derive

By induction, we can easily prove that $v_n = F_n^{(0)} v_0 - d_n$ for all $n \ge 0$. Note that $v_0 = y_1$. From these facts, it follows that
$$y_{k+1} = \sum_{n=0}^{k} v_n = v_0 \sum_{n=0}^{k} F_n^{(0)} - \sum_{n=0}^{k} d_n, \qquad k \ge 0. \tag{4.26}$$
Now, on the one hand, by (4.26) and $y_{k+1} \ge 0$, it follows that $v_0 \ge \sum_{n=0}^{k} d_n \big/ \sum_{n=0}^{k} F_n^{(0)}$ for all $k$. Hence $v_0 \ge d = \sup_{k \ge 0} \sum_{n=0}^{k} d_n \big/ \sum_{n=0}^{k} F_n^{(0)}$. On the other hand, by (4.26) again,
(4.27)
Note that $(y_i)$ is bounded and $\sum_{n=0}^{k} F_n^{(0)} \to +\infty$ as $k \to \infty$ (by recurrence). Letting $k \to \infty$ in (4.27), we see that the right-hand side of (4.27) tends to the limit $v_0 - d'$, where $d' = \lim_{k \to \infty} \sum_{n=0}^{k} d_n \big/ \sum_{n=0}^{k} F_n^{(0)}$, and furthermore $v_0 \le d' \le d$. Hence, we have
$$y_1 = v_0 = d = d'.$$
Combining this with (4.26), it follows that the solution $(y_i)$ to (4.24) must have the representation (4.25) and hence is unique. Finally, by the boundedness of $(y_i)$ and (4.26), condition (4.21) follows. We have thus completed the proof of the required equivalence.
We now complete the proof of assertion (4). By Theorem 4.45 (3) with $H = \{0\}$, we know that a single birth Q-process is strongly ergodic iff the following equation
$$\sum_j q_{ij} y_j \le -1, \qquad i \ne 0 \tag{4.28}$$
has a bounded non-negative solution, since $\sum_{j \ne 0} q_{0j} y_j = q_{01} y_1 < \infty$. Assume that the single birth process is strongly ergodic. Then there exists a bounded non-negative solution $(u_i)$ of (4.28), i.e.,

Denote by $(u_i^*)$ the minimal non-negative solution of (4.24). By the comparison theorem, we have $u_i \ge u_i^*$ for all $i \ge 0$. Thus, $(u_i^*)$ is bounded and (4.24) has a bounded non-negative solution. By the equivalence just proved above, (4.21) holds. Conversely, let (4.21) hold. Define $(y_i)$ by (4.25). By the equivalence again, $(y_i)$ is a bounded non-negative solution of (4.24). Clearly $(y_i)$ is also a bounded non-negative solution of (4.28). This implies strong ergodicity by the criterion quoted above. ∎

To move further, we need the following result.
Lemma 4.54. Let $Q = (q_{ij})$ be a single birth Q-matrix with $N = 0$. Then
(1) Eq. (4.11) with $H = \{0\}$ has an increasing solution iff
$$\bar d := \sup_{k \ge 1} d_k / F_k^{(0)} < \infty. \tag{4.29}$$
In this case, $(u_i)$ defined by (4.23), replacing $d$ with $\bar d$, is such a solution.
(2) If the process is recurrent (i.e., $\sum_{k=0}^{\infty} F_k^{(0)} = \infty$) and $\lim_{n \to \infty} d_n / F_n^{(0)} = \bar d$, then $d = \bar d$.

Proof: The last assertion of (1), and hence the sufficiency in (1), follows from the last part of the proof d) of Theorem 4.52. To prove necessity in (1), note that (4.22) remains true. Then, by the increasing property of $(u_i)$, we see that $v_k \ge 0$, $k \ge 0$. Hence

Obviously, $d \le \bar d$. Hence, assertion (2) is a simple application of the Stolz's theorem:
Part (3) of Theorem 4.52 is only a sufficient condition but not a criterion for the exponential ergodicity; an explicit criterion is still unknown at the moment. Condition (4.20) comes from Theorem 4.55 (3) below with $b_i = q_{i,i+1}$ and $\tilde a_i = q_{i,i+1} F_i^{(0)} / F_{i-1}^{(0)}$. This choice of birth-death process is quite good since, on the one hand, the original single birth process and the birth-death process with rates $(\tilde a_i, b_i)$ are recurrent or not simultaneously; and on the other hand, both the original process and the $(\tilde a_i, b_i)$-process are stochastically dominated by the birth-death process with rates $b_i = q_{i,i+1}$ and $a_i = q_{i,i-1}$, but the $(\tilde a_i, b_i)$-process does not stochastically dominate the original process. The stochastic comparability will be studied in the next chapter in detail. Roughly speaking, here we need only the comparison of the moments of the hitting time, which is weaker than the stochastic comparability. Along the same line and based on the next result, we can write down the following sufficient conditions

respectively, for the ergodicity and strong ergodicity of single birth processes. The comparison of moments of hitting time can be done by using the comparison theorem, (4.31) and Proposition 4.56 below.

Theorem 4.55. A regular birth-death process is
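(1) recurrent iff
$$b_0 \sum_{n=1}^{\infty} \frac{1}{\mu_n b_n} = \sum_{n=1}^{\infty} \frac{a_1 \cdots a_n}{b_1 \cdots b_n} = \infty.$$
(2) It is positive recurrent (ergodic) iff
$$\sum_{i=1}^{\infty} \mu_i = \sum_{n=1}^{\infty} \frac{b_0 \cdots b_{n-1}}{a_1 \cdots a_n} < \infty.$$
(3) It is exponentially ergodic iff
$$\delta := \sup_{i > 0} \sum_{j=0}^{i-1} \frac{1}{\mu_j b_j} \sum_{j=i}^{\infty} \mu_j < \infty.$$
(4) It is strongly ergodic iff
$$S := \sum_{n=0}^{\infty} \frac{1}{\mu_n b_n} \sum_{j=n+1}^{\infty} \mu_j < \infty.$$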
161
Proof: In the present case, we have
a) The first assertion is easy to check. The second assertion comes from the observation:
Em, =
03
and d
< 03
==+
n
= 03
(by (4.18)),
n
bo . . . bn (by the Stolz's theorem). a1 * * an+l n=O Thus, by Lemma 4.54 (2), we indeed have d = 2. b) We now prove the third assertion. Suppose that the process is exponentially ergodic. Then, the condition infi qi > 0 is necessary by Theorems 4.44 (2) or 4.45 (2) [or Chapter 9, Lemma 9.7 with K = { k } ] . By Theorem 4.44(2), there exists a X with 0 < X < qi for all i such that EoeXoo < 00. Define eio(X) = e X t P i [ o o> t l d t , i 2 0. lim
C L o d k = lim dn F(0) n-w F p k=O k
n-mzn
I "
g1
I" +
Then IEieX"o = Xeio(X) 1. From the proof a) of Theorems 4.45 and 4.44, it follows that eio(X) < 03 for all i 2 1. Furthermore, EieXoo < 03 for all i 2 1. Note that if the starting point is not 0, then oo is equal to the first hitting time: T~ = inf{t > 0 : X ( t ) = O}. Hence lEieATo < 00 for all i
2 1. Define min)= E~T,". The Taylor's expansion (4.30)
leads us to estimate the moments mZ(").By a result due to Z. K. Wang [cf. Wang (1980, Chapter 3), or Hou and Guo (1981, Chapter 9), or Wang and Yang (1992)], we have
x c i-1
mi( l ) =
j=o
nx i-I
mjn)=
1
"
P A k=j+l
Pk,
1 " pkmf-'), j=o P A k = j + 1
C
n
2 2.
(4.31)
From these quoted books, one can also find the probabilistic interpretation while of S and R: S is the mean of the hitting time of 0 starting from "00)', R is the mean of the hitting time of "03') starting from 0. Actually, we have a general result as follows.
Proposition 4.56. For a regular and irreducible Q-process, $(m_i^{(n)} := \mathbb{E}_i \sigma_H^n : i \notin H)$ is the minimal solution to the equation

Proof: Similar to the proof of Lemma 4.48. ∎

In the present case, obviously, $m_k^{(n)} \ge m_i^{(n)}$ if $k \ge i$. By (4.31), it follows that

and

Hence, by induction, one gets

Combining this with (4.30), we obtain

which implies that

Taking the supremum over $i$, we obtain $\delta \le \lambda^{-1} < \infty$. Hence, the necessity is proven.
c) Finally, by a), we have $d = \bar d$. The fourth assertion is an application of Lemma 4.54 (2). ∎

Before moving further, let us consider a particular example which is often used to justify the power of a result.
Example 4.57 (Linear growth model). Consider the birth-death process with rates $b_n = \alpha n + \beta$ and $a_n = \delta n$, where $\alpha, \beta, \delta > 0$. Then the Q-process is always unique.
(1) The process is recurrent iff $\alpha < \delta$ or $\alpha = \delta \ge \beta$.
(2) The process is ergodic (exponentially ergodic) iff $\alpha < \delta$.
(3) The process is never strongly ergodic.
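Proof: a) The uniqueness assertion is easy since

Assertion (3) is also easy since

b) To prove the other assertions, we use

Kummer Test: Let $(u_n)$ and $(v_n)$ be two sequences of positive numbers. Suppose that $\sum_n 1/v_n = \infty$ and the limit $\kappa := \lim_{n \to \infty} \kappa_n$ exists, where
$$\kappa_n = v_n \frac{u_n}{u_{n+1}} - v_{n+1}.$$
Then, the series $\sum u_n$ converges or diverges according to $\kappa > 0$ or $\kappa < 0$ respectively.

Now, to prove recurrence, we consider the series $\sum_n u_n$: $u_n = \dfrac{a_1 \cdots a_n}{b_1 \cdots b_n}$. Take $v_n = n$. Then
$$\kappa_n = n\, \frac{\alpha (n+1) + \beta}{\delta (n+1)} - (n+1).$$
So $\kappa = +\infty$ if $\alpha > \delta$, $\kappa = -\infty$ if $\alpha < \delta$. Next, assume that $\alpha = \delta$. Then
$$\kappa_n = \frac{\beta n - \delta (n+1)}{\delta (n+1)} \to \frac{\beta - \delta}{\delta}.$$
We have $\kappa > 0$ if $\beta > \delta$, $\kappa < 0$ if $\beta < \delta$. Finally, let $\beta = \delta$. Then
$$\sum_n u_n = \sum_n \frac{1}{n+1} = +\infty.$$
c) In view of b), we need only to consider $\alpha < \delta$ or $\alpha = \delta \ge \beta$. In the present situation, $u_n = \mu_n = \dfrac{b_0 \cdots b_{n-1}}{a_1 \cdots a_n}$. Take $v_n = n$ again. Then
$$\kappa_n = n\, \frac{\delta (n+1)}{\alpha n + \beta} - (n+1) = (n+1) \big[ n(\delta - \alpha) - \beta \big] (\alpha n + \beta)^{-1}.$$
Thus, when $\alpha < \delta$, we have $\kappa = +\infty$. As for $\alpha = \delta$, $\kappa = -\beta/\alpha < 0$. This proves the ergodic assertion (2). Then, by Theorem 4.55 (3), the exponential ergodicity follows whenever $\alpha < \delta$. ∎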
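The Kummer quantities used in this proof are easy to evaluate exactly. The following is a minimal sketch, assuming $v_n = n$ and the ratios $u_n/u_{n+1} = b_{n+1}/a_{n+1}$ (recurrence series) and $u_n/u_{n+1} = a_{n+1}/b_n$ (series $\sum \mu_n$) for the linear growth rates; the function names are hypothetical.

```python
from fractions import Fraction


def kummer_terms(ratio, n_values):
    """kappa_n = v_n * u_n/u_{n+1} - v_{n+1} for the choice v_n = n."""
    return [n * ratio(n) - (n + 1) for n in n_values]


def recurrence_ratio(alpha, beta, delta):
    # u_n = a_1...a_n / (b_1...b_n)  =>  u_n/u_{n+1} = b_{n+1}/a_{n+1}
    return lambda n: Fraction(alpha * (n + 1) + beta, delta * (n + 1))


def ergodicity_ratio(alpha, beta, delta):
    # u_n = mu_n = b_0...b_{n-1} / (a_1...a_n)  =>  u_n/u_{n+1} = a_{n+1}/b_n
    return lambda n: Fraction(delta * (n + 1), alpha * n + beta)


# Critical case alpha = delta = 1 with beta = 3 > delta.
rec = kummer_terms(recurrence_ratio(1, 3, 1), [10, 1000])
erg = kummer_terms(ergodicity_ratio(1, 3, 1), [10, 1000])
```

In this case `rec` approaches $(\beta - \delta)/\delta = 2 > 0$, so the recurrence series converges and the chain is transient, while `erg` approaches $-\beta/\alpha = -3 < 0$, so $\sum \mu_n$ diverges; both agree with assertions (1) and (2) of the example.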
The simplest example to distinguish the three types of ergodicity is as follows.

Example 4.58. Consider a birth-death Q-matrix having the property $a_i = b_i = i^{\gamma}$ for all large $i$. Then
(1) The process is recurrent for all $\gamma$ and so the Q-matrix is always regular.
(2) The process is ergodic iff $\gamma > 1$.
(3) The process is exponentially ergodic iff $\gamma \ge 2$.
(4) The process is strongly ergodic iff $\gamma > 2$.
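The thresholds in Example 4.58 can be observed numerically. The sketch below is only an illustration under simplifying assumptions: it takes $a_i = b_i = i^{\gamma}$ for every $i \ge 1$ (not just large $i$) and $b_0 = 1$, so that $\mu_0 = 1$, $\mu_n = n^{-\gamma}$ and $\mu_n b_n = 1$, and it truncates the infinite sums at `n_max`. It evaluates the quantities $\delta = \sup_{i>0} \sum_{j<i} (\mu_j b_j)^{-1} \sum_{j \ge i} \mu_j$ and $S = \sum_{n \ge 0} (\mu_n b_n)^{-1} \sum_{j > n} \mu_j$; the function name is hypothetical.

```python
def truncated_delta_and_S(gamma, n_max):
    """Truncated delta and S for a_i = b_i = i**gamma (i >= 1), b_0 = 1.

    With these rates mu_n * b_n = 1 for all n, hence
      delta ~ max_i  i * sum_{j=i}^{n_max} mu_j,
      S     ~ sum_{n=0}^{n_max-1} sum_{j=n+1}^{n_max} mu_j.
    """
    mu = [1.0] + [n ** (-gamma) for n in range(1, n_max + 1)]
    tail = [0.0] * (n_max + 2)  # tail[i] = mu_i + ... + mu_{n_max}
    for i in range(n_max, -1, -1):
        tail[i] = tail[i + 1] + mu[i]
    delta = max(i * tail[i] for i in range(1, n_max + 1))
    S = sum(tail[n + 1] for n in range(n_max))
    return delta, S


d2, S2 = truncated_delta_and_S(2.0, 5000)
d15, _ = truncated_delta_and_S(1.5, 5000)
_, S3 = truncated_delta_and_S(3.0, 5000)
```

At $\gamma = 2$ the truncated $\delta$ stays bounded while the truncated $S$ keeps growing like a harmonic sum; at $\gamma = 1.5$ the truncated $\delta$ already exceeds 10 and grows with `n_max`; at $\gamma = 3$ the truncated $S$ is bounded. A truncation cannot prove divergence, so this only illustrates the stated criteria.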
To conclude this section, we study the multidimensional Q-processes. The idea is the same as the one introduced in Section 3.2, i.e., reducing the multidimensional case to dimension one. Again, the typical example in mind is Schlögl's model.

Theorem 4.59. Let $E$ be a countable set, $Q = (q(x, y) : x, y \in E)$ be a conservative Q-matrix. Suppose that there exists a partition $\{E_k\}$ of $E$ such that $\sum_{k=0}^{\infty} E_k = E$ with $E_0 = \{\theta\}$, where $\theta \in E$ is a reference point. Next, suppose that
(1) If $q(x, y) > 0$ and $x \in E_k$, then $y \in \sum_{j=0}^{k+1} E_j$ for all $k \ge 0$.
(2) $\sum_{y \in E_{k+1}} q(x, y) > 0$ for all $x \in E_k$ and all $k \ge 0$.
(3) $\sup\{q(x) : x \in E_k\} < \infty$ for all $k \ge 0$.
Define a conservative Q-matrix $(q_{ij} : i, j \in \mathbb{Z}_+)$ as follows:
(4.32)
with $q_{ij} = 0$ for other $j \ne i$.
Moreover, suppose that both $(q(x, y))$ and $(q_{ij})$ are irreducible and that $(q_{ij})$ is regular. If $E_k$ ($k \ge 1$) are finite and $\sum_{k=0}^{\infty} F_k^{(0)} = \infty$, then the $(q(x, y))$-process is recurrent. Furthermore, the $(q(x, y))$-process is ergodic (resp. exponentially ergodic, strongly ergodic) if
$$\bar d < \infty \tag{4.33}$$
(resp. (4.20) holds, (4.21), replacing $d$ with $\bar d$, holds).
Proof: a) To prove recurrence, by Lemma 4.51, it suffices to show that the equation
$$u(x) = \sum_{y \ne \theta} \Pi(x, y) u(y), \qquad 0 \le u(x) \le 1, \quad x \in E$$
has only the trivial solution, where $\Pi(x, y) = [1 - \delta(x, y)]\, q(x, y)/q(x)$. Now, the proof is similar to that of Theorem 3.19. Let $(u(x) : x \in E)$ be a non-trivial solution to the above equation. Set $u_k = \max\{u(x) : x \in E_k\}$, $k \ge 0$. Since $E_k$ is finite, there exists an $x^{(k)} \in E_k$ such that
$$u_0 = u(\theta), \qquad u_k = u(x^{(k)}), \quad k \ge 1.$$
So
$$q_0 u_0 = q(\theta) u(\theta) = \sum_{y \ne \theta} q(\theta, y) u(y) \le \sum_{y \in E_1} q(\theta, y)\, u_1 \le q_{01} u_1.$$
When $k \ge 1$, by using the argument which deduced (3.30), we obtain

This shows that $u_k \uparrow$ and moreover,
$$q_{k0} u_k + \sum_{j=1}^{k-1} q_{kj} (u_k - u_j) \le q_{k,k+1} (u_{k+1} - u_k), \qquad k \ge 1.$$
Hence, we always have
$$q_k u_k \le \sum_{j \ne 0, k} q_{kj} u_j, \qquad k \ge 0.$$
Thus, by Lemma 4.51, Lemma 3.14 and Theorem 4.52, it follows that $u_i \equiv 0$. This contradicts the assumption that $u(x) \not\equiv 0$.
b) To prove positive recurrence, by Theorem 4.45, we need only to show that the equation

has a non-negative solution. For this, let $(u_k)$ be a solution constructed in Lemma 4.54 and take
$$u(x) = u_k \quad \text{if } x \in E_k, \qquad k \ge 0.$$
Now, for $x \ne \theta$, there exists some $k$ so that $x \in E_k$. Hence

On the other hand,

We have thus constructed a desired solution.
c) The proofs for the remaining assertions are similar and hence are omitted. ∎

4.6 Invariant Measures
An extension of the stationary measure of a process $P(t) = (P_{ij}(t))$ or $P = (P_{ij})$ is the invariant measure, i.e., a non-trivial $\sigma$-finite measure $\pi$ so that
$$\pi = \pi P(t) \qquad \text{or} \qquad \pi = \pi P$$
respectively. In this section, we study the existence and uniqueness of invariant measures. Of course, the uniqueness is in the sense of up to a constant factor. As usual, we restrict ourselves to the irreducible case. We begin our study with the time-discrete case. The answers depend on recurrence or transience of the process. Let $(E, X_n, P_{ij})$ be an irreducible Markov chain with discrete times. Because of the irreducibility, if $(\pi_i)$ is an invariant measure, then $\pi_i > 0$ for all $i$. For the positive recurrent case, the answer is quite simple and well-known.
Theorem 4.60. Let $(E, X_n, P_{ij})$ be an irreducible and positive recurrent Markov chain. Then there is precisely one invariant measure. More precisely, if the chain has period $d$ ($\ge 1$), then the state space $E$ can be decomposed as a union of disjoint subclasses $C_0, \ldots, C_{d-1}$ so that $P^d$ is an aperiodic chain on each $C_r$. Furthermore, the invariant measure is given by

Corollary 4.61. Let $(E, X_n, P_{ij})$ be an irreducible and non-positive recurrent Markov chain. Then its invariant measure, whenever it exists, should be non-summable.

Proof: See Proposition 4.26. ∎

For the recurrent case, the answer can be found in many books (cf. Chung (1967) Part 11, Theorem 9.7, for example). Here we state only the result.

Theorem 4.62. Let $(E, X_n, P_{ij})$ be an irreducible and null recurrent Markov chain. Then there is only one invariant measure, which is given by

where $\theta$ is an arbitrary element in $E$ and ${}_B P_{ij}^{(n)}$ denotes the taboo probability: for a given $B \subset E$,
$${}_B P_{ij}^{(0)} = \delta_{ij}, \qquad {}_B P_{ij}^{(n)} = \mathbb{P}_i[X_n = j, \ X_1, \ldots, X_{n-1} \notin B].$$
The transient situation is much more complicated. Let us look at some examples.
Example 4.63. Take $P_{i,i+1} = p_i$ and $P_{i,0} = 1 - p_i$ for $i \ge 0$, where $p_i \in (0, 1)$ are such that $\lim_{n \to \infty} \prod_{i=0}^{n} p_i > 0$. Then the chain is irreducible and transient. It has no invariant measure.

Proof: Clearly, if $(\pi_i)$ with $\pi_0 = 1$ is an invariant measure, then

Thus, there exists an invariant measure iff $\lim_{n \to \infty} \prod_{k=0}^{n} p_k = 0$. On the other hand, $\prod_{k=0}^{n} p_k$ describes the probability that the chain does not return to 0 in the first $n + 1$ steps starting from 0. Hence, the condition $\lim_{n \to \infty} \prod_{k=0}^{n} p_k = 0$ means that the probability that the chain returns to 0 in finitely many steps starting from 0 equals 1. That is equivalent to the recurrence of the chain. ∎
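The mechanism behind Example 4.63 can be checked with exact arithmetic. The sketch below assumes the particular choice $p_i = 1 - 1/(i+2)^2$ (a hypothetical example with $\prod_i p_i = 1/2 > 0$). Invariance at the states $i + 1 \ge 1$ forces $\pi_{i+1} = \pi_i p_i$, and invariance at 0 would require $\sum_i \pi_i (1 - p_i) = \pi_0$; the code shows that this sum falls short of $\pi_0 = 1$.

```python
from fractions import Fraction


def p(i):
    """Hypothetical choice with prod_i p_i = 1/2 > 0."""
    return 1 - Fraction(1, (i + 2) ** 2)


def candidate_measure(n_max):
    """pi_0 = 1 and pi_{i+1} = pi_i * p_i, as forced by invariance at i + 1."""
    pi = [Fraction(1)]
    for i in range(n_max):
        pi.append(pi[-1] * p(i))
    return pi


def mass_returned_to_zero(pi):
    """Partial sum of sum_i pi_i * (1 - p_i); invariance at 0 needs it to reach pi_0."""
    return sum(pi[i] * (1 - p(i)) for i in range(len(pi)))


pi = candidate_measure(50)
returned = mass_returned_to_zero(pi)
```

Because $\pi_i (1 - p_i) = \pi_i - \pi_{i+1}$, the partial sums telescope to $1 - \pi_{n+1}$, which increases to $1 - \lim_n \prod_{k \le n} p_k = 1/2 < 1$. So the equation at state 0 cannot hold, in line with the claim of the example.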
Example 4.64. Take
$$P_{i,i+1} = p \ne 1/2, \qquad P_{i,i-1} = 1 - p =: q, \qquad i = 0, \pm 1, \pm 2, \ldots.$$
Then the chain is irreducible and transient. It has infinitely many invariant measures.

Proof: Suppose that $(\pi_k)$ is an invariant measure of the chain. Then
$$\pi_{k+1} q - \pi_k + \pi_{k-1} p = 0, \qquad k = 0, \pm 1, \pm 2, \ldots.$$
To this equation, the solutions are as follows:
$$\pi_k = c_1 (p/q)^k + c_2, \qquad k = 0, \pm 1, \pm 2, \ldots.$$
Hence, for any $c_1, c_2 \ge 0$ with $c_1 + c_2 = 1$, $(\pi_i)$ is an invariant measure. This proves our assertion. ∎
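The two-parameter family in Example 4.64 can be verified directly. The following is a small sketch with exact rational arithmetic, assuming the illustrative value $p = 2/3$; the helper names are hypothetical.

```python
from fractions import Fraction


def family(c1, c2, p):
    """pi_k = c1 * (p/q)^k + c2 with q = 1 - p."""
    r = p / (1 - p)
    return lambda k: c1 * r ** k + c2


def is_invariant_at(pi, k, p):
    """Check pi_k = pi_{k-1} * p + pi_{k+1} * q for the walk P_{i,i+1} = p, P_{i,i-1} = q."""
    q = 1 - p
    return pi(k) == pi(k - 1) * p + pi(k + 1) * q


p = Fraction(2, 3)
checks = [
    all(is_invariant_at(family(c1, 1 - c1, p), k, p) for k in range(-5, 6))
    for c1 in (Fraction(0), Fraction(1, 3), Fraction(1))
]
```

Every convex combination passes the check, and the measures for different $c_1$ are not proportional to each other, which is why the invariant measure is not unique here.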
Example 4.65. Take

Then the chain is irreducible and transient. It has precisely one invariant measure.

Proof: Notice that the probability that the chain goes to infinity starting from 0 is
$$\frac{1}{2} \prod_{i=1}^{\infty} \frac{2^{i+1} - 1}{2^{i+1}} > 0.$$
Hence the probability that the chain does not return to 0 in finitely many steps starting from 0 is positive. So the chain is transient. Next, let $(\pi_i)$ be an invariant measure. Then

Set $\pi_0 = 1$, then $\pi_1 = 4$. It is easy to see that $(\pi_i)$ is unique. To show that $(\pi_i)$ is positive, simply use induction. ∎
Proposition 4.66. Let $(E, X_n, P_{ij})$ be an irreducible, transient Markov chain. Without loss of generality, assume that $E = \mathbb{Z}_+$. Suppose that the chain has an invariant measure. Then there exists a simple path coming from infinity. That is, there exist mutually distinct states $i_1, i_2, \ldots$ such that $P_{i_2 i_1} > 0$, $P_{i_3 i_2} > 0$, $\ldots$.

Proof: Define a dual chain as follows: $\tilde P_{ij} = \pi_j P_{ji} / \pi_i$, $i, j \in E$. Then
$$\tilde P_{ij}^{(n)} = \pi_j P_{ji}^{(n)} / \pi_i, \qquad i, j \in E.$$
Note that an irreducible chain is transient iff
$$\sum_{n=0}^{\infty} P_{ii}^{(n)} < \infty. \tag{4.34}$$
We have
$$\sum_{n=0}^{\infty} \tilde P_{ii}^{(n)} = \sum_{n=0}^{\infty} P_{ii}^{(n)} < \infty.$$
So the dual chain $(E, \tilde X_n, \tilde P_{ij})$ is irreducible and transient. Without loss of generality, we assume that the chain is aperiodic. Then, for almost all $\omega$,
$$P_{\tilde X_1(\omega), \tilde X_0(\omega)}, \quad P_{\tilde X_2(\omega), \tilde X_1(\omega)}, \quad \ldots$$
are all positive. On the other hand, since the chain is transient, for almost all $\omega$, in the sequence $(\tilde X_0(\omega), \tilde X_1(\omega), \ldots)$, no element can appear infinitely many times. In other words, for almost all $\omega$, there exist infinitely many distinct $\tilde X_0(\omega), \tilde X_1(\omega), \ldots$ so that
$$P_{\tilde X_1(\omega), \tilde X_0(\omega)} > 0, \quad P_{\tilde X_2(\omega), \tilde X_1(\omega)} > 0, \quad \ldots$$
This completes our proof. ∎

We are now ready to present an existence criterion for invariant measures. For simplicity, we take $E = \mathbb{Z}_+$ again. Moreover, we allow the chain to be sub-Markovian:
$$\sum_j P_{ij} \le 1, \qquad i \in E. \tag{4.35}$$
As we did before, for $H \subset E$, define the taboo probability as follows:
$${}_H P_{ij}^{(0)} = \delta_{ij}, \qquad {}_H P_{ij}^{(n)} = \sum_{k_1, \ldots, k_{n-1} \notin H} P_{i k_1} P_{k_1 k_2} \cdots P_{k_{n-1} j}, \qquad i, j \in E, \ n \ge 1.$$

Theorem 4.67. Let $(E, X_n, P_{ij})$ be an irreducible and transient Markov chain, maybe sub-Markovian (i.e., (4.35) holds). Then, the chain has an invariant measure iff there exists an infinite subset $K \subset E$ such that

where
(4.36)
This criterion is quite deep in the theory of Markov chains. Unfortunately, its proof is lengthy and so is omitted here. Refer to Harris (1957) and Veech (1963) for details. As a straightforward consequence of Theorem 4.67, we have
Corollary 4.68. Under the hypotheses of Theorem 4.67, if for each $i \in E$, there are only finitely many $k \in E$ so that $P_{ki} > 0$, then the chain has an invariant measure.

Now, we turn to study the time-continuous Markov chains. For the recurrent case, there exists uniquely an invariant measure, which can be seen from the time-discrete case at once. Thus, we consider only the transient case. Again, assume that the chain is irreducible. Hence, we have
$$0 < \int_0^{\infty} P_{ij}(t)\, dt < \infty. \tag{4.37}$$
As in the time-discrete case, in the present situation, we may have no invariant measure. But excessive measures do exist.

Lemma 4.69. Let $Q = (q_{ij})$ be an irreducible, regular Q-matrix and $P(t)$ be transient. Then for each probability measure $\alpha$ on $E$ with finite support,
$$\mu_j := \sum_i \alpha_i \int_0^{\infty} P_{ij}(t)\, dt$$
is a finite positive excessive measure of $P(t)$: $\mu_j \ge \sum_i \mu_i P_{ij}(t)$ for all $j \in E$ and $t \ge 0$.

Proof: By (4.37) and the assumption, the assertion is obvious. ∎

Now, it is natural to ask when an excessive measure becomes an invariant measure. More generally, for a given $(\mu_i)$, when do we have
$$\mu e^{\rho t} = \mu P(t), \qquad t \ge 0, \tag{4.38}$$
for some $\rho \le 0$? If so, by Lemma 4.15, we should have
$$\sum_{i \ne j} \mu_i q_{ij} = (\rho + q_j) \mu_j, \qquad j \in E. \tag{4.39}$$
Define a dual chain as follows: $\tilde P_{ij}(t) = e^{-\rho t} \mu_j P_{ji}(t) / \mu_i$. Clearly, $\tilde P_{ij}(t)$ is a Markov chain and (4.38) implies that
$$\tilde P(t) 1 = 1, \qquad t \ge 0. \tag{4.40}$$
Furthermore, its Q-matrix is as follows:
$$\tilde q_{ij} = \mu_j q_{ji} / \mu_i \quad (i \ne j), \qquad \tilde q_i = \rho + q_i.$$
Thus, in order to have a $\rho$-invariant measure, it is necessary that
$$\rho \in \big[ -\inf_i q_i, \ 0 \big]. \tag{4.41}$$
Moreover, (4.39) implies that the Q-matrix $\tilde Q = (\tilde q_{ij})$ is indeed conservative. Next, assume (4.39) and (4.41). Then the construction of the minimal process gives us $\tilde P_{ij}(t) = \tilde P_{ij}^{\min}(t)$. Combining this with (4.40), we see that $\tilde Q = (\tilde q_{ij})$ is regular. We have thus proved the following result.
Theorem 4.70. Let $Q = (q_{ij})$ be an irreducible and regular Q-matrix. Then $(\mu_i)$ is a $\rho$-invariant measure iff (4.39), (4.41) hold and $\tilde Q = (\tilde q_{ij})$ is regular.

The dual chain used above is introduced in terms of an excessive measure, but we can also introduce a dual chain by means of a finite, positive excessive function, in view of (4.37). The last technique even works for more general state space. Refer to Chen and Stroock (1983).

4.7 Notes

For a more complete theory of Markov chains, refer to Aldous and Fill (1994-), Anderson (1991), Chung (1967), Hou (1982), Hou et al. (1994, 2000), Hu (1983, 1985), Wang (1980), Wang and Yang (1992), Yang (1981). In particular, the complete proofs of the results in Sections 4.3 and 4.4 are included in Anderson's book. The ergodicity has been studied for much more general state space in the time-discrete case. Refer to Nummelin (1984), Meyn and Tweedie (1993b) and references within for more details. For closely related results in the time-continuous case for the general state space, refer to Down, Meyn and Tweedie (1995), Meyn and Tweedie (1993a).

Section 4.1 is mainly due to Dobrushin (1970). Remark 4.6 was pointed out to the author by L. P. Huang. Lemma 4.10 is due to Basis (1980). The particular case "c < 0" of Theorem 4.14 appeared in Basis (1980) and Chen (1986b, 1989b), based on a time-discrete analogue obtained by Dobrushin (1970). For Markov chains, the special case of Theorem 4.14 (i.e., Theorem 0.11) was obtained by Tweedie (1975, 1981). The present form of Theorem 4.14 seems to be new but quite natural. The proof given here greatly simplifies the original one. Theorem 4.9 was presented in Chen (1986b), for which the author benefited from a conversation with S. W. He. The proof of Theorem 4.9 adopted here was actually contained in Feller (1957).

The sufficiency of Theorem 4.24 is due to Foster (1953) and Kendall (1951). The necessity is due to Mertens, Samuel-Cahn and Zamir (1978). Kendall (1959) introduced the term geometrically ergodic for irreducible Markov chains for which $|P_{ij}^{(n)} - \pi_j| \le C_{ij} \beta_{ij}^n$, where $\beta_{ij} < 1$. Vere-Jones (1962) showed that $\beta$ can be taken to be independent of $i$ and $j$. Then, Nummelin and Tweedie (1978) showed that the coefficient $C_{ij}$ can be chosen independent of $j$. If we want to have a universal coefficient $C$, then the geometrical ergodicity turns out to be the uniform ergodicity, as proved by Isaacson and Luecke (1978).

Theorem 4.28 (2) is due to Nummelin and Tweedie (1978) and Nummelin and Tuominen (1982). Theorem 4.28 (3) is due to Isaacson and Luecke (1978). For the special case that $H = \{0\}$, Theorem 4.30 (2) goes back to Kingman (1964) and Theorem 4.30 (3) is due to Huang and Isaacson (1976). Theorem 4.31 (1), (2) and (3) are due to Foster (1953), Popov (1977) and Isaacson and Tweedie (1978) respectively. In Isaacson (1979), a different concept, the ergodic coefficients introduced by Dobrushin (1956), was used to characterize the strong ergodicity.

The ordinary ergodic assertion in Theorem 4.40 (1) is due to Wu (1965); part (2) of the theorem is based on Chen (1999d). Lemma 4.42 is taken from Tuominen and Tweedie (1979). Theorem 4.44 (3) is due to Isaacson and Arnold (1978). Theorem 4.45 is taken from Tweedie (1981). For $S$ being a singleton, Example 4.50 is proved by Han (1991). For general finite $S$, it is proved by Chen (1995). The proofs for the three types of ergodicity presented in Sections 4.3 and 4.4 are quite different from the originals, but they are not essentially new. The ideas are based on Hou and Guo (1978). As a sufficient condition, (4.11) appeared in Reuter (1961).

Part (4) of Theorem 4.52 is due to Zhang (2001). For birth-death processes, this result is due to Zhang, Lin and Hou (2000), and was used previously by Tweedie (1981) as a sufficient condition for the exponential ergodicity. Part (3) of Theorem 4.55 as well as the remark above Theorem 4.55 are due to Mao and Zhang (2003). Part (3) of Theorem 4.55 is due to Chen (2000b); the present proof is taken from Mao and Zhang (2003). The remainder of the results in Section 4.5 is taken from Yan and Chen (1986), Chen (1999d). Some of the results are motivated by Reuter (1961). See also Tweedie (1975). The strongly ergodic part of Theorem 4.59 was added by Zhang (2001). It is interesting, as observed by Wu and Zhang (2003), that the dual comparison of Theorem 4.59 can be used to study the necessary condition for (uniform) ergodicity.

The first three examples in Section 4.6 are taken from Derman (1955). Theorem 4.62 is taken from Chung (1967). Proposition 4.66 and the sufficiency of Theorem 4.67 are due to Harris (1957), who also conjectured the necessity of the conditions; the proof of the necessity was later obtained by Veech (1963). The final step of the proof of Theorem 4.70 is due to Kelly (1983).
Chapter 5
Probability Metrics and Coupling Methods

This chapter begins with the study of some basic properties of the minimum $L^p$-metric. Then we prove in Section 5.2 two fundamental theorems about the regularity and the marginality, respectively, for coupling operators. Section 5.4 is devoted to the topic of optimal Markovian couplings with respect to non-negative closed functions. In Sections 5.3 and 5.5, we study the successful couplings and stochastic comparability respectively. Finally, some examples are presented in Section 5.6 to illustrate the applications of the coupling technique.

5.1 Minimum $L^p$-Metric

In this section, we study some topological properties of the minimum $L^p$-metric and its relation to the weak topology. Let $(E,\rho,\mathscr{E})$ be a complete separable metric space with metric $\rho$ and Borel $\sigma$-algebra $\mathscr{E}$. For two given probability measures $P_1$ and $P_2$, define
$$W_p(P_1,P_2)=\inf_{\tilde P}\Big[\int\rho(x_1,x_2)^p\,\tilde P(dx_1,dx_2)\Big]^{1/p},\qquad p\ge1, \tag{5.1}$$
where $\tilde P$ varies over all coupling probability measures with marginals $P_1$ and $P_2$. It will be proved in Lemma 5.3 below that $W_p$ is a metric.

Definition 5.1. The metric defined above is called the minimum $L^p$-metric or Wasserstein metric or Kantorovich-Rubinstein-Wasserstein metric. Briefly, we write $W=W_1$. A dual expression of $W_1$ will be discussed at the end of this section.

Lemma 5.2. Given $P_1$ and $P_2$, the infimum in (5.1) is attained for some coupling measure $\tilde P$.
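Proof: Assume that $W_p(P_1,P_2)<\infty$. Let $\{\tilde P_n\}_{n\ge1}$ be a sequence of coupling measures such that
$$\int\rho(x_1,x_2)^p\,\tilde P_n(dx_1,dx_2)\longrightarrow W_p(P_1,P_2)^p\qquad\text{as }n\to\infty.$$
Since the $k$-th marginal of $\tilde P_n$ equals $P_k$, applying Theorem 4.4 to the singleton $\{P_k\}$, we see that there exist compact functions $h^1$, $h^2$ and a constant $C<\infty$ such that
$$\int h^k(x_k)\,\tilde P_n(dx_1,dx_2)<C,\qquad n\ge1,\ k=1,2.$$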
Thus, if we choose $h^1(x_1)+h^2(x_2)$ as a compact function and choose $\rho(x_1,y_1)+\rho(x_2,y_2)$ as a metric on $E\times E$, then it is clear that $\{\tilde P_n\}_{n\ge1}$ is relatively compact on $E\times E$. Hence, for any limit $\tilde P$, as an application of Theorem 4.5, we have
$$\int\rho(x_1,x_2)^p\,\tilde P(dx_1,dx_2)\le W_p(P_1,P_2)^p.$$
The infimum is thus attained at $\tilde P$. The marginality of $\tilde P$ now follows from $\tilde P_{n_k}\Longrightarrow\tilde P$ and the monotone convergence theorem.

Set
$$\mathscr{P}(E,\rho)=\Big\{P\in\mathscr{P}(E):\ \exists\,c\in E\text{ such that }\int\rho(c,x)^p\,P(dx)<\infty\Big\}.$$
Lemma 5.3.
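(1) $W_p(P_1,P_2)\ge w(P_1,P_2)^{1+1/p}$, where $w$ is the Lévy-Prohorov metric.
(2) $(\mathscr{P}(E,\rho),W_p)$ is a metric space.

Proof: a) Let $P_1(F)-P_2(F^\varepsilon)\ge\varepsilon$ for some closed set $F$ and $\varepsilon>0$, where $F^\varepsilon$ is the $\varepsilon$-neighborhood of $F$. Then, for any coupling $\tilde P$ of $P_1$ and $P_2$,
$$\Big[\int\rho(x_1,x_2)^p\,\tilde P(dx_1,dx_2)\Big]^{1/p}\ge\varepsilon\,\tilde P\big(F\times(E\setminus F^\varepsilon)\big)^{1/p}\ge\varepsilon\big(\tilde P(F\times E)-\tilde P(E\times F^\varepsilon)\big)^{1/p}=\varepsilon\big(P_1(F)-P_2(F^\varepsilon)\big)^{1/p}\ge\varepsilon^{1+1/p}.$$
This proves (1).
b) By (1), if $W_p(P_1,P_2)=0$, then $w(P_1,P_2)=0$ and hence $P_1=P_2$. Thus, we need only to verify the triangle inequality. Let $P_k\in\mathscr{P}(E,\rho)$, $k=1,2,3$. By Lemma 5.2, we can choose a coupling $\tilde P_{12}$ of $P_1$ and $P_2$ and a coupling $\tilde P_{23}$ of $P_2$ and $P_3$ such that
$$W_p(P_1,P_2)^p=\int\rho(x_1,x_2)^p\,\tilde P_{12}(dx_1,dx_2),\qquad W_p(P_2,P_3)^p=\int\rho(x_2,x_3)^p\,\tilde P_{23}(dx_2,dx_3).$$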
Next, construct a probability space (Q9, P) on which there are random variables ((1, (2) and (&, &) with distributions p12 and respectively. Since the state space ( E , p , & ) is a complete separable metric space, the regular conditional probabilities do exist, we can construct a Markov chain
&,
so that and
and Therefore
have the same distribution, as well as
Theorem 5.4. $(\mathscr{P}(E,\rho),W_p)$ is a complete space.

Proof: Let $\{P_n\}_{n\ge1}$ be a Cauchy sequence in the $W_p$-metric. By Lemma 5.3 (1), it is also a Cauchy sequence in the $w$-metric and so has some limit $P$ in the $w$-metric, since $(\mathscr{P}(E),w)$ is a complete separable space (cf. Billingsley (1968), p. 240 or Parthasarathy (1967), Section 2.6). Choose $\tilde P_{n,m}$ such that
$$W_p(P_n,P_m)^p=\int\rho(x_1,x_2)^p\,\tilde P_{n,m}(dx_1,dx_2).$$
On the other hand, by using the proof of Lemma 5.2, we see that $\{\tilde P_{n,m}\}_{m\ge1}$ is relatively compact on $E^2$. Assume that $\tilde P_{n,m_k}\Longrightarrow\tilde P_n$ as $k\to\infty$. Then $\tilde P_n$ is a coupling probability measure of $P_n$ and $P$. Moreover, by Theorem 4.5, we have
$$\int\rho(x_1,x_2)^p\,\tilde P_n(dx_1,dx_2)\le\varliminf_{k\to\infty}W_p(P_n,P_{m_k})^p.$$
From this, it follows that
$$W_p(P_n,P)\le\varliminf_{k\to\infty}W_p(P_n,P_{m_k})\longrightarrow0\qquad\text{as }n\to\infty,$$
since $\{P_n\}$ is a Cauchy sequence in the $W_p$-metric.

We have seen that the convergence in the $W_p$-metric is stronger than the weak convergence. A more precise characterization of the relation between these two types of convergence is given as follows.

Theorem 5.5. A subset $\mathscr{M}\subset\mathscr{P}(E,\rho)$ is compact in the $W_p$-metric iff
(1) $\mathscr{M}$ is weakly compact and
(2) for some (or any) $a\in E$,
$$\lim_{N\to\infty}\sup_{P\in\mathscr{M}}\int_{[x:\,\rho(x,a)>N]}\rho(x,a)^p\,P(dx)=0.$$
Proof: a) Suppose that $\mathscr{M}$ is compact in the $W_p$-metric. Then, it is certainly weakly compact by Lemma 5.3 (1). Given $\varepsilon>0$, choose a finite set $\mathscr{M}'\subset\mathscr{M}$ such that for every $P\in\mathscr{M}$, there is a $P'\in\mathscr{M}'$ with
$$W_p(P,P')<\varepsilon. \tag{5.2}$$
Next, choose $N=N(\varepsilon)$ so that

for a fixed $c\in E$. Now, for each $P\in\mathscr{M}$, by (5.2), we can choose a coupling $\tilde P_{0k}$ of $P$ and some $P_k\in\mathscr{M}'$ such that $\int\rho(x,x_k)^p\,\tilde P_{0k}(dx,dx_k)<\varepsilon^p$. Then
But
and
Hence
x
<
p ( x , c)”P(dz) 2 q 2 p
x ,p( 2 .C)>
+ 1)P.
N]
This proves that condition (2) holds for a specific $a=c$.
b) We now prove that (2) holds for all $a\in E$. Given $\varepsilon\in(0,1)$, choose a finite set $\mathscr{M}'\subset\mathscr{M}$ such that for every $P\in\mathscr{M}$, there is a $P'\in\mathscr{M}'$ with $w(P,P')<\varepsilon/2$. Choose $N$ large enough so that
$$\sup_{P'\in\mathscr{M}'}P'[x:\rho(x,a)>N-1]<\varepsilon/2.$$
Thus, we actually have $\sup_{P\in\mathscr{M}}P[x:\rho(x,a)>N]<\varepsilon$. Next, since
and
>N ]
P(.,c)pp(dx) G
I :p( z ,a)
J:.I
P .(
7
c)
> N --P
(a3 41
42, c ) p w 4 ,
condition (2) holds also for $a$. We have thus proved the necessity.
c) To prove the sufficiency, let $P_n\Longrightarrow P$ as $n\to\infty$. Since for every $\varepsilon>0$, we can choose a compact set $K_\varepsilon$ such that $P(K_\varepsilon)\ge1-\varepsilon$, by the finite covering theorem, there exists a finite set $\{x_1,x_2,\ldots,x_N\}$ such that
For any $\lambda_k\in\mathbb{R}_+$, $k=1,2,\ldots,N$, let
Because F ( A i ' ) . " l X N is )monotone in each xj, so except a set ZI, with Ndimensional Lebesgue measure zero, P(A;""' "") is continuous in Xk on ZZ,k = 1,2,...,N. Due to the continuity, the probability, under P , of the boundaries of the sets A;'>'.' , k = 1,2,. . . , N , is equal to zero. Now, by choosing X i E (1 - E , l), k = 1 , 2 , . - . , N , on the continuous set, and denoting Bk = A, it follows that we have constructed disjointed Pcontinuity sets B1,Bz, . * , B N whose diameter is not more than 2 4 1 - E)-' and for which C,"=,P ( B k ) 1 - E . Therefore, by Theorem 4.1 (5)) we have Pn(BI,) F(Bk) n * 03. d) Define I",
Next, define a coupling Pno of Pn and P as follows:
Then
Note that $\int\rho(x,a)^p\,P(dx)<\infty$ by Theorem 4.5 and condition (2). Thus, if we set $B_0=E\setminus\bigcup_{k=1}^NB_k$, then $P(B_0)<\varepsilon$ and hence

can be arbitrarily small for sufficiently small $\varepsilon$. On the other hand, for large enough $n$, we have
$$P_n(B_k)\ge P(B_k)(1-\varepsilon),\qquad k=1,\ldots,N,$$
so

can be also arbitrarily small for sufficiently small $\varepsilon$ and sufficiently large $n$. Combining the above two facts, we see that there is a $\delta(\varepsilon)\to0$ (as $\varepsilon\to0$) such that
So the last term on the right-hand side of (5.3) becomes arbitrarily small for large enough $n$. The same conclusion holds for the middle term on the right-hand side of (5.3). Therefore, the left-hand side of (5.3) can be made arbitrarily small for large enough $n$. This completes the proof of the theorem.

Alternative Proof: A different way to prove the sufficiency of the above theorem proceeds as follows. Let $P_n\Longrightarrow P$. By Skorohod Theorem [see Ikeda and Watanabe (1981), p. 9, Theorem 2.7], we can construct a nice reference frame $(\Omega,\mathscr{F},\mathbb{P})$ and random variables $\xi_n\sim P_n$, $\xi\sim P$ such that $\xi_n\to\xi$, $\mathbb{P}$-a.s. This plus condition (2) gives us $W_p(P_n,P)^p\le\mathbb{E}\rho(\xi_n,\xi)^p\to0$ as $n\to\infty$.
As a consequence of the above result, we have

Theorem 5.6. $P_n\xrightarrow{W_p}P$ iff the following two conditions
(1) $P_n\Longrightarrow P$,
(2) $\int\rho(x,x_0)^p\,P_n(dx)\to\int\rho(x,x_0)^p\,P(dx)$ for some (or any) $x_0\in E$
hold. In particular, if $\rho$ is bounded, then $w$ and $W_p$ are equivalent.
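The role of the moment condition (2) can be seen in a small worked example. The setting below ($E=\mathbb{R}$ with $\rho(x,y)=|x-y|$, and the particular measures $P_n$) is chosen here purely for illustration and is not taken from the text.

```latex
% Illustration only: weak convergence without W_p-convergence on E = R.
\[
P_n=\Big(1-\frac1n\Big)\delta_0+\frac1n\,\delta_n,\qquad P=\delta_0 .
\]
% Weak convergence holds: for every bounded continuous f,
\[
\int f\,dP_n=\Big(1-\frac1n\Big)f(0)+\frac1n f(n)\longrightarrow f(0)=\int f\,dP .
\]
% The only coupling of P_n with the point mass \delta_0 is the product measure, hence
\[
W_p(P_n,P)^p=\int |x|^p\,P_n(dx)=\frac1n\,n^p=n^{p-1}\ge 1,\qquad p\ge 1 .
\]
```

Here the $p$-th moments $\int|x|^p\,P_n(dx)=n^{p-1}$ do not converge to $\int|x|^p\,P(dx)=0$, so the weak convergence of $P_n$ to $P$ is not accompanied by convergence in the $W_p$-metric.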
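Finally, we prove that if we use the discrete metric
$$d(x,y)=\begin{cases}0&\text{if }x=y\\1&\text{if }x\ne y,\end{cases}$$
then the total variation is again the minimum $L^p$-metric with respect to the metric $d$.

Theorem 5.7. For any space $E$ with discrete metric $d$, we have
$$V(P_1,P_2):=\inf_{\tilde P}\int d(x_1,x_2)\,\tilde P(dx_1,dx_2)=\frac12\|P_1-P_2\|_{\mathrm{Var}}.$$

Proof: Clearly, $V$ is a probability metric since so is $\|\cdot\|_{\mathrm{Var}}$. Given $P_1,P_2\in\mathscr{P}(E)$, we have the Jordan-Hahn decomposition:
$$P_1-P_2=(P_1-P_2)^+-(P_1-P_2)^-.$$
Define $P_1\wedge P_2=P_1-(P_1-P_2)^+$. Then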
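For any coupling $\tilde P$ of $P_1$ and $P_2$, denote by $U$ the support of $(P_1-P_2)^+$, then $U^c$ is the support of $(P_1-P_2)^-$ and we have $(P_1-P_2)^+(U)=(P_1-P_2)^-(U^c)$. Since
$$\int d(x_1,x_2)\,\tilde P(dx_1,dx_2)\ge\tilde P(U\times U^c)\ge\tilde P(U\times E)-\tilde P(E\times U)=P_1(U)-P_2(U)=(P_1-P_2)^+(U),$$
we have $V(P_1,P_2)\ge\frac12\|P_1-P_2\|_{\mathrm{Var}}$. Conversely, for given $P_1$ and $P_2$, we define a coupling $\tilde P$ as follows:
$$\tilde P(dx_1,dx_2)=(P_1\wedge P_2)(dx_1)\,I_\Delta+\frac{(P_1-P_2)^+(dx_1)\,(P_1-P_2)^-(dx_2)}{(P_1-P_2)^+(E)}\,I_{\Delta^c},$$
where $\Delta$ is the diagonal in $E$: $\Delta=\{(x,x):x\in E\}$. Note that one may ignore $I_{\Delta^c}$ in the above formula since $(P_1-P_2)^+$ and $(P_1-P_2)^-$ have different supports. Then
$$\int d(x_1,x_2)\,\tilde P(dx_1,dx_2)=\tilde P(\Delta^c)=(P_1-P_2)^-(U^c)=1-(P_1\wedge P_2)(E).$$
This gives us $V(P_1,P_2)\le\frac12\|P_1-P_2\|_{\mathrm{Var}}$ as required.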
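The coupling used in the second half of the proof can be tried out on a finite state space. The sketch below assumes the standard construction (mass $P_1\wedge P_2$ on the diagonal, the remainder spread as the normalized product of $(P_1-P_2)^+$ and $(P_1-P_2)^-$); the function names and the two test vectors are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as F

def maximal_coupling(p1, p2):
    """Couple two probability vectors on {0, ..., n-1}.

    Mass (P1 ^ P2) is placed on the diagonal; the remaining mass is spread
    as the normalized product (P1 - P2)^+ x (P1 - P2)^-.
    Returns a dict {(i, j): mass}.
    """
    n = len(p1)
    common = [min(a, b) for a, b in zip(p1, p2)]   # (P1 ^ P2)({i})
    plus = [a - c for a, c in zip(p1, common)]     # (P1 - P2)^+({i})
    minus = [b - c for b, c in zip(p2, common)]    # (P1 - P2)^-({i})
    gap = sum(plus)                                # 1 - (P1 ^ P2)(E)
    coupling = {(i, i): common[i] for i in range(n) if common[i] > 0}
    if gap > 0:
        for i in range(n):
            for j in range(n):
                mass = plus[i] * minus[j] / gap
                if mass > 0:                       # forces i != j
                    coupling[(i, j)] = mass
    return coupling

def discrete_cost(coupling):
    """Integral of the discrete metric d: total mass off the diagonal."""
    return sum(m for (i, j), m in coupling.items() if i != j)

def half_variation(p1, p2):
    """(1/2) * ||P1 - P2||_Var on a finite space."""
    return sum(abs(a - b) for a, b in zip(p1, p2)) / 2

# Hypothetical example measures on a three-point space.
p1 = [F(1, 2), F(1, 3), F(1, 6)]
p2 = [F(1, 4), F(1, 4), F(1, 2)]
coupling = maximal_coupling(p1, p2)
marginal1 = [sum(m for (i, j), m in coupling.items() if i == k) for k in range(3)]
marginal2 = [sum(m for (i, j), m in coupling.items() if j == k) for k in range(3)]
```

Exact rational arithmetic is used so that the two marginals and the identity between the off-diagonal mass and half the total variation can be compared without rounding.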
To conclude this section, we study a dual expression of the $W$-metric. Note that the definition of $W(P_1,P_2)$ is meaningful for any finite measures $P_1$ and $P_2$, not necessarily probability measures. In what follows, unless otherwise stated, we will work in this general setup. For simplicity, rewrite $P_1'=(P_1-P_2)^+$ and $P_2'=(P_1-P_2)^-$. Let $A_1$ be the support of $P_1'$ and $A_2=A_1^c$, which is the support of $P_2'$.

Lemma 5.8. Using the above notations, we have
$$W(P_1,P_2)\le W(P_1',P_2').$$
Proof: Let $\tilde P'$ be a coupling of $P_1'$ and $P_2'$. Denote by $\pi_1:E\times E\to E$ the projection: $\pi_1(x,y)=x$. Define
$$\tilde P(C)=\tilde P'(C)+(P_1\wedge P_2)(\pi_1(C\cap D)),$$
where $D=\{(x,x):x\in E\}$. We now prove that $\tilde P$ is a coupling of $P_1$ and $P_2$. To do so, let $B_1\in\mathscr{E}$, then
$$\tilde P(B_1\times E)=P_1'(B_1)+(P_1\wedge P_2)(B_1)=P_1(B_1).$$
Similarly, we have $\tilde P(E\times B_2)=P_2(B_2)$. From this, it follows that
$$W(P_1,P_2)\le\int\rho\,d\tilde P=\int\rho\,d\tilde P'$$
since the support of $\tilde P-\tilde P'$ is $D$.
We now consider the simple case of $E$ being finite. Write $E=\{x_1,\ldots,x_n\}$, $\rho_{ij}=\rho(x_i,x_j)$ and $p_i^{(1)}=P_1'(\{x_i\})$, $p_i^{(2)}=P_2'(\{x_i\})$. Then $p_i^{(1)}p_i^{(2)}=0$, $i=1,2,\ldots,n$. Moreover,
$$W(P_1',P_2')=\inf\Big\{\sum_{i,j}\rho_{ij}c_{ij}:\ c_{ij}\ge0,\ \sum_jc_{ij}=p_i^{(1)},\ \sum_ic_{ij}=p_j^{(2)},\ 1\le i,j\le n\Big\}.$$
Because of the primal-dual relation for the linear programming problem, we have
$$W(P_1',P_2')=\sup\Big\{\int f\,d(P_1'-P_2'):\ f(y_1)-f(y_2)\le\rho(y_1,y_2),\ y_k\in A_k,\ k=1,2\Big\}.$$
We have thus obtained the following result.

Lemma 5.9. For finite $E$, we have
$$W(P_1',P_2')=\sup\Big\{\int f\,d(P_1'-P_2'):\ f(y_1)-f(y_2)\le\rho(y_1,y_2),\ y_k\in A_k,\ k=1,2\Big\}.$$
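The primal-dual relation behind Lemma 5.9 can be checked by hand on a very small instance. The sketch below is an illustration under stated assumptions, not the method of the text: $E$ is a six-point subset of the real line with $\rho(x,y)=|x-y|$, and $P_1$, $P_2$ are uniform on three atoms each with disjoint supports, so that $P_1'=P_1$ and $P_2'=P_2$. Because all atoms carry the same mass, the transport problem attains its minimum at a permutation (Birkhoff-von Neumann theorem), so a brute-force search suffices. The names `primal_cost`, `dual_value` and the candidate potential `f` are hypothetical.

```python
from fractions import Fraction as F
from itertools import permutations

def primal_cost(xs, ys):
    """Minimal transport cost between uniform measures on xs and on ys.

    Both measures put mass 1/n on each atom, so it suffices to
    minimize over permutations of the atoms.
    """
    n = len(xs)
    best = min(sum(abs(x - ys[s]) for x, s in zip(xs, sigma))
               for sigma in permutations(range(n)))
    return F(best, n)

def dual_value(f, xs, ys):
    """Integral of f against P1 - P2 for the two uniform measures."""
    n = len(xs)
    return F(sum(f(x) for x in xs) - sum(f(y) for y in ys), n)

def is_one_lipschitz(f, points):
    """Check f(a) - f(b) <= |a - b| for all pairs of the given points."""
    return all(f(a) - f(b) <= abs(a - b) for a in points for b in points)

xs = [0, 1, 3]          # atoms of P1 (hypothetical data)
ys = [2, 4, 5]          # atoms of P2 (hypothetical data)
f = lambda z: -z        # a candidate dual potential

primal = primal_cost(xs, ys)
dual = dual_value(f, xs, ys)
feasible = is_one_lipschitz(f, xs + ys)
```

Any feasible $f$ gives a lower bound for the transport cost; in this instance the bound produced by $f(z)=-z$ coincides with the brute-force minimum, which is the equality asserted by the duality.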
Theorem 5.10. Let $x_0\in E$, $\mathscr{P}_0(E)=\{P\in\mathscr{P}(E):\int\rho(x,x_0)\,P(dx)<\infty\}$. Then for $P_1,P_2\in\mathscr{P}_0(E)$,
$$W(P_1,P_2)=\sup\Big\{\int f\,d(P_1-P_2):\ f\in\mathscr{L}ip(E),\ L(f)\le1\Big\} \tag{5.5}$$
$$=\sup\Big\{\int f\,d(P_1-P_2):\ f\in b\mathscr{L}ip(E),\ L(f)\le1\Big\}, \tag{5.6}$$
where $\mathscr{L}ip(E)$ denotes the set of Lipschitz continuous functions on $E$ and $L(f)$ denotes the Lipschitz constant of $f\in\mathscr{L}ip(E)$.
Proof: Denote by $L(P_1,P_2)$ the right-hand side of (5.5). Obviously, $L$ satisfies the triangle inequality.
a) By Lemma 5.2, we can choose a coupling $\tilde P$ of $P_1$ and $P_2$ so that $W(P_1,P_2)$ is achieved at $\tilde P$. Then, for $f\in\mathscr{L}ip(E)$ with $L(f)\le1$,
$$\int f\,d(P_1-P_2)=\int[f(x_1)-f(x_2)]\,\tilde P(dx_1,dx_2)\le\int\rho(x_1,x_2)\,\tilde P(dx_1,dx_2)=W(P_1,P_2).$$
Hence $L(P_1,P_2)\le W(P_1,P_2)$.
b) Let $f$ satisfy
$$f(y_1)-f(y_2)\le\rho(y_1,y_2),\qquad y_k\in A_k,\ k=1,2.$$
For $y\in A_2$, define $g(y)=\sup\{f(x)-\rho(y,x):x\in A_1\}$ and for $x\in A_1$, define $g(x)=\inf\{f(z)+\rho(x,z):z\in A_2\}$. Then it is easy to check that $g\in\mathscr{L}ip(E)$ with $L(g)\le1$. Moreover, $g\ge f$ on $A_1$ and $g\le f$ on $A_2$. Thus
$$\int f\,d(P_1-P_2)=\int_{A_1}f\,dP_1'-\int_{A_2}f\,dP_2'\le\int_{A_1}g\,dP_1'-\int_{A_2}g\,dP_2'=\int g\,d(P_1-P_2)\le L(P_1,P_2).$$
c) Consider the special case that $P_k$ ($k=1,2$) has finite support. Then, by Lemma 5.8, on the one hand, we have $W(P_1,P_2)\le W(P_1',P_2')$. On the other hand, by Lemma 5.9 and b), we have
$$W(P_1',P_2')\le L(P_1,P_2).$$
Hence $W(P_1,P_2)\le L(P_1,P_2)$. Combining this with a), we have completed the proof for the simple case.
d) We now remove the extra assumption of finite support used in c). Fix $P_0\in\mathscr{P}_0(E)$ for a moment. Assume that $\rho$ is unbounded. Otherwise, simply ignore $B_{n0}$ and $x_{n0}$ considered below. Since $\int\rho(x_0,x)\,P_0(dx)<\infty$, we can choose $C(n)\in[n,\infty)$ and $x_{n0}\in\{x:\rho(x_0,x)=C(n)\}$ such that
1
p(z,,z)Po(dx)
and
P(~o,~nO)PO(BnO) <> ;
where BnO:= {x : p ( z o , x ) 2 C ( n ) 2 } .Next, for each n set { Bnj}yzl c 8 such that
j=O
1
2 1, choose a finite
Finally, for each $n$ and $j\ge1$, choose arbitrarily a point $x_{nj}\in B_{nj}$. Let $P_n$ be the measure with mass $P_0(B_{nj})$ at $x_{nj}\in B_{nj}$. Then, for every bounded uniformly continuous function $\varphi$, we have
This shows that $P_n\Longrightarrow P_0$ by Theorem 4.1 (2). At the same time,

Thus, by Theorem 5.6, we have $W(P_n,P_0)\to0$ as $n\to\infty$. So by a), we also have $L(P_n,P_0)\to0$ as $n\to\infty$. Finally, for given $P_k$, $k=1,2$, we can choose $P_k^{(n)}$ as above. Furthermore, by the triangle inequality, we get
and
Combining these facts with c), we obtain the required conclusion.
e) To prove (5.6), let $f\in\mathscr{L}ip(E)$ with $L(f)\le1$ and set $f_n=(-n)\vee(f\wedge n)$, $n\ge1$. Then $f_n\in b\mathscr{L}ip(E)$, $L(f_n)\le1$. For large enough $n$, we have $-n<f(x_0)<n$, and so
$$\Big|\int(f-f_n)\,dP_k\Big|=\Big|\int_{[|f|>n]}(f-f_n)\,dP_k\Big|\le\int_{[|f|>n]}|f-f(x_0)|\,dP_k\le\int\rho(x,x_0)\,P_k(dx)<\infty,\qquad k=1,2.$$
Therefore
as n + 00 by the dominated convergence theorem.
W An immediate consequence of (5.6) is that for given transition probabilities Pl(z1, -) and P2(z2,.), W(P1(xl1 .), P2(2,,-)) is measurable in (x1,x2):since b y i p ( E ) c U,(E) and ([J,(E),I1 llu) (Ilfll, := supz If(x)l) is separable. To see the separability, noting that E is separable, by Urysohn theorem, ( E ,p> is homeomorphic to a subspace Q of [O, 13' with product topology. Because the closure Q of Q is compact, one can regard (U,(E), 11 Ilu> as a subset of the separable space C ( Q ) with uniform topology and hence is separable. However, the proof does not mean the separability of ( b 2 z p ( E ) ,j j 1Iu). This is not surprising since the Lipschitx continuity is not a topological concept. We remark that the space bLFip(E) may riot be separuble with respect to the Tipschitz norm. To see this, consider E = R. Given E E R)define fc(z) =: 0 if J: 6 <, &(x) = 1 if J: 3 E 1 and linear between 5 and E 1. Then " L ( j t )= 1. Next, let (I < &. Then for sufficiently srriall E r 0: we have fcl - ft2 = fcl on [ ( l , ( ~ E ] . Therefore L ( f t , ft2) = L ( f c , ) = 1. Since {fc : E E JR} is not countable, the space &+ip(R) is not separable with respect to the Lipschitz norm.
-
+
+
+
5.2 Marginality and Regularity

We now study the coupling methods for jump processes. Unless otherwise stated, all q-pairs considered in this chapter are conservative. Suppose that we are given two jump processes $P_k(t,x_k,A_k)$ with regular q-pair $(q_k(x_k),q_k(x_k,A_k))$ on state space $(E_k,\mathscr{E}_k)$, $k=1,2$ respectively. We want to find some coupling jump process $\tilde P(t;x_1,x_2;dy_1,dy_2)$ with q-pair $(\tilde q(x_1,x_2),\tilde q(x_1,x_2;dy_1,dy_2))$ on the product state space $(E_1\times E_2,\mathscr{E}_1\times\mathscr{E}_2)$ having the marginality:
$$\tilde P(t;x_1,x_2;A_1\times E_2)=P_1(t,x_1,A_1),\quad\tilde P(t;x_1,x_2;E_1\times A_2)=P_2(t,x_2,A_2),\qquad t\ge0,\ x_k\in E_k,\ A_k\in\mathscr{E}_k,\ k=1,2. \tag{5.7}$$
Define
$$\Omega_1f(x_1)=\int q_1(x_1,dy_1)\,(f(y_1)-f(x_1)),\qquad f\in b\mathscr{E}_1.$$
Similarly, we can define $\Omega_2$ and $\tilde\Omega$. Since the marginal q-pairs and the coupling q-pair are all conservative, we have

By the monotone class theorem, it follows that
$$\Omega_1f(x_1)=\tilde\Omega f(x_1,x_2),\qquad x_k\in E_k,\ f\in b\mathscr{E}_1.$$
Here, we regard $f\in b\mathscr{E}_1$ as a function in $b(\mathscr{E}_1\times\mathscr{E}_2)$. Thus, we have proved the following result.
Lemma 5.11. The marginality (5.7) implies that
$$\tilde\Omega f(x_1,x_2)=\Omega_1f(x_1),\quad f\in b\mathscr{E}_1;\qquad\tilde\Omega f(x_1,x_2)=\Omega_2f(x_2),\quad f\in b\mathscr{E}_2,\qquad x_k\in E_k,\ k=1,2. \tag{5.8}$$
As we will see later (Theorem 5.20), the converse assertion that (5.8) $\Longrightarrow$ (5.7) is also correct if $\tilde\Omega$ is regular.

Definition 5.12. Any operator $\tilde\Omega$ satisfying (5.8) is called a coupling operator.
To see that (5.8) is practical, let us restate the examples of coupling operators given in Section 0.2 as follows.

Example 5.13 (Independent coupling).
$$\tilde\Omega_0f(x_1,x_2)=[\Omega_1f(\cdot,x_2)](x_1)+[\Omega_2f(x_1,\cdot)](x_2),\qquad f\in b(\mathscr{E}_1\times\mathscr{E}_2).$$
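For finite state spaces the marginality (5.8) of the independent coupling can be verified directly. The sketch below is only an illustration: the two Q-matrices and the test functions are hypothetical, and the operators are written for conservative Q-matrices, where $\Omega g(x)=\sum_{y\ne x}q(x,y)(g(y)-g(x))$.

```python
def omega(Q, g):
    """(Omega g)(x) = sum_{y != x} q(x, y) * (g(y) - g(x)) for a finite Q-matrix."""
    n = len(Q)
    return [sum(Q[x][y] * (g[y] - g[x]) for y in range(n) if y != x)
            for x in range(n)]

def omega_independent(Q1, Q2, f):
    """Independent coupling: each coordinate jumps with its own rates
    while the other coordinate stays frozen."""
    n1, n2 = len(Q1), len(Q2)
    out = {}
    for x1 in range(n1):
        for x2 in range(n2):
            move1 = sum(Q1[x1][y1] * (f(y1, x2) - f(x1, x2))
                        for y1 in range(n1) if y1 != x1)
            move2 = sum(Q2[x2][y2] * (f(x1, y2) - f(x1, x2))
                        for y2 in range(n2) if y2 != x2)
            out[(x1, x2)] = move1 + move2
    return out

# Hypothetical conservative Q-matrices (rows sum to zero).
Q1 = [[-3, 2, 1], [1, -1, 0], [2, 2, -4]]
Q2 = [[-1, 1], [4, -4]]
g1 = [5, -2, 7]     # test function of the first coordinate
g2 = [3, 10]        # test function of the second coordinate

lhs1 = omega_independent(Q1, Q2, lambda a, b: g1[a])
lhs2 = omega_independent(Q1, Q2, lambda a, b: g2[b])
rhs1, rhs2 = omega(Q1, g1), omega(Q2, g2)
marginality_holds = all(
    lhs1[(a, b)] == rhs1[a] and lhs2[(a, b)] == rhs2[b]
    for a in range(3) for b in range(2)
)
```

When $f$ depends on the first coordinate only, the term coming from jumps of the second coordinate vanishes identically, which is exactly why the coupled operator reduces to $\Omega_1$.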
In the following examples, we assume that $E_1=E_2=E$, $\mathscr{E}_1=\mathscr{E}_2=\mathscr{E}$ and write $(E^2,\mathscr{E}^2)=(E\times E,\mathscr{E}\times\mathscr{E})$. In the following three examples, let $x_1,x_2\in E$ and $f\in b\mathscr{E}^2$.

Example 5.14 (Classical coupling). Let the two marginal q-pairs be the same $(q(x),q(x,A))$. Set
$$\tilde\Omega_cf(x_1,x_2)=I_{\Delta^c}(x_1,x_2)\,\tilde\Omega_0f(x_1,x_2)+I_\Delta(x_1,x_2)\,\Omega g(x_1),$$
where $\Delta=\{(x_1,x_2)\in E^2:x_1=x_2\}$, $g(x)=f(x,x)$ and
$$\Omega g(x)=\int q(x,dy)\,(g(y)-g(x))$$
as defined above.
Example 5.15 (Basic coupling).
where, for two measures $\nu_1$ and $\nu_2$, $(\nu_1-\nu_2)^{\pm}$ are the Jordan-Hahn decomposition of $\nu_1-\nu_2$ and $\nu_1\wedge\nu_2=\nu_1-(\nu_1-\nu_2)^+$.
Example 5.16 (Coupling of marching soldiers). Let E be an additive group. Define
This coupling is also meaningful even if $E$ is only a subset of an additive group. It is an easy matter to see that all the above operators satisfy the marginality (5.8). On the other hand, as we have mentioned before, there are infinitely many choices for coupling operators, hence it is important to know whether a coupling operator is regular or not in order to use the coupling technique. To study this problem, we need some preparations. The next result shows that a q-process can often be approximated by a sequence of bounded q-processes.
Lemma 5.17. Let $(q(x),q(x,dy))$ be a q-pair and $(q_n(x),q_n(x,dy))$ a sequence of q-pairs, not necessarily conservative. Suppose that

Then
(1) $\varliminf_{n\to\infty}P_n^{\min}(\lambda)\ge P^{\min}(\lambda)$.
(2) If $(q(x),q(x,dy))$ is regular, then $\lim_{n\to\infty}P_n^{\min}(\lambda)=P(\lambda)$.
Here, $P_n^{\min}(\lambda)$ and $P^{\min}(\lambda)$ denote the minimal q-processes determined by $(q_n(x),q_n(x,dy))$ and $(q(x),q(x,dy))$ respectively. We omit "min" when the correspondent q-process is unique. In particular, for Markov chains, the condition can be weakened as follows:

Proof: a) As usual, let $\Pi_n(\lambda)(x,dy)=q_n(x,dy)/(\lambda+q_n(x))$. Then, by Eq. $(B_\lambda)$ and Theorem 1.14 (4), we have
Now the first assertion follows from the comparison theorem and Theorem 2.21.
b) To prove (2), suppose that $\varliminf_{n\to\infty}P_n^{\min}(\lambda_0,x_0,A_0)>P(\lambda_0,x_0,A_0)$ for some $\lambda_0$, $x_0$ and $A_0$. Then, from (1), it follows that

which is impossible. So we have
$$\varliminf_{n\to\infty}P_n^{\min}(\lambda,x,A)=P(\lambda,x,A),\qquad\lambda>0,\ x\in E,\ A\in\mathscr{E}.$$
Furthermore, if and Ao. Then
for some
which is also impossible.
Lemma 5.18. Given a q-pair $(q(x),q(x,dy))$. Let $E_n\uparrow E$ as $n\uparrow\infty$. Define a sequence of q-pairs $\{(q_n(x),q_n(x,dy))\}_{n\ge1}$ as follows.

where $\{\varphi_n(x,dy)\}_{n\ge1}$ are free kernels. Then

Moreover, when $\{(q_n(x),q_n(x,dy))\}_{n\ge1}$ are regular, then so is $(q(x),q(x,dy))$ iff
$$\lim_{n\to\infty}P_n^{\min}(\lambda,x,E_n^c)=0,\qquad\lambda>0,\ x\in E.$$
Proof: We adopt the first successive approximation scheme for the minimal q-process. Denote by $\{P_n^{(m)}(\lambda)\}_{m\ge0}$ and $\{P^{(m)}(\lambda)\}_{m\ge0}$ the approximating sequences determined by $(q_n(x),q_n(x,dy))$ and $(q(x),q(x,dy))$ respectively. For instance,
$$P^{(0)}(\lambda)\equiv0,\qquad P^{(m+1)}(\lambda)=\Pi(\lambda)P^{(m)}(\lambda)+[(\lambda+q)^{-1}I],\qquad m\ge0.$$
Note that when $y\notin E_n$, we have $q_n(y)=0$, and so $P_n^{(m)}(\lambda,y,A)=\delta(y,A)/\lambda$ for all $A\in\mathscr{E}$ and $m\ge1$. Thus

This explains the freedom of $\varphi_n$. One can simply regard $E_n^c$ as an absorbing singleton when starting from $x\in E_n$. Hence, for each $x\in E_n$, we have
We now use induction on m to show that
(A, x,A n E,), P?) (A, 5 , A n E J 6 P ~ ~ ) ( An , ~E,), A< P ( ' ~ ) ( A , ~ n ,EA~ ) , :r:
E E,,
n 2 1.
The initial case of $m=0$ is trivial. Assume that the assertion holds for all $k\le m$. Then
- P;"+~)(A,2,A n E,) I
We have thus proved the required assertion. By letting m -+ 00, we get
P,"'"(A, 2 , A fl En) G PF$(A, z,A n En) < PFi;(X, x,A n En+,), z E En, IT >, 1. P,"'"(X, x:! A n En)< Pmin(A, 2,A n ET1),
Noting that $P_n^{\min}(\lambda,x,A\cap E_n)=0$ when $x\notin E_n$, we have
P,"'"(X, z,A
n E,)
t ,"lim cc
P,"'"(X, z,A n En)
< Pmin(X, x,A).
On the other hand, by using the previous proof a), replacing A with AnE,, we obtain
lim P,"'"(X, z , A n En) 3 Pmin(X, 2,A),
X
> 0,z E E , A E 8.
,-+a3
We have thus proved the first assertion and then the second one follows.
Theorem 5.19. If both of the marginals are regular jump processes, then so is every coupling Markov process. Conversely, if a Markovian coupling is a regular jump process, then so are its marginals.
Proof: a) Jump condition. Let $P_k(t,x_k,dy_k)$ and $\tilde P(t;x_1,x_2;dy_1,dy_2)$ be the marginal and coupled Markov processes respectively. By the marginality for processes, we have
$$\tilde P(t;x_1,x_2;\{x_1\}\times\{x_2\})\ge\tilde P(t;x_1,x_2;\{x_1\}\times E_2)-\tilde P(t;x_1,x_2;E_1\times(E_2\setminus\{x_2\}))$$
$$=\tilde P(t;x_1,x_2;\{x_1\}\times E_2)-1+\tilde P(t;x_1,x_2;E_1\times\{x_2\})=P_1(t,x_1,\{x_1\})-1+P_2(t,x_2,\{x_2\}).$$
If both of the marginals are jump processes, then $\lim_{t\to0}\tilde P(t;x_1,x_2;\{x_1\}\times\{x_2\})\ge1$. This means the Markovian coupling $\tilde P(t)$ must be a jump process. Conversely, since
$$\tilde P(t;x_1,x_2;\{x_1\}\times\{x_2\})\le\tilde P(t;x_1,x_2;\{x_1\}\times E_2)=P_1(t,x_1,\{x_1\}),$$
if $\tilde P(t)$ is a jump process, then $\lim_{t\to0}P_1(t,x_1,\{x_1\})\ge1$ and so $P_1(t)$ is also a jump process. Symmetrically, so is $P_2(t)$.
b) Equivalence of total stability. Assume that all the processes concerned are jump processes. Denote by $(q_k(x_k),q_k(x_k,dy_k))$ the marginal q-pairs on $(E_k,\mathscr{E}_k)$, where

Next, denote by $(\tilde q(x_1,x_2),\tilde q(x_1,x_2;dy_1,dy_2))$ a coupling q-pair on $(E_1\times E_2,\tilde{\mathscr{E}})$, where

We need to show that $\tilde q(\tilde x)<\infty$ for all $\tilde x\in E_1\times E_2$ iff $q_1(x_1)\vee q_2(x_2)<\infty$ for all $x_1\in E_1$ and $x_2\in E_2$. Clearly, it suffices to show that
$$q_1(x_1)\vee q_2(x_2)\le\tilde q(x_1,x_2)\le q_1(x_1)+q_2(x_2). \tag{5.9}$$
Note that we cannot use the conservativity nor uniqueness of the processes at this step. But Eq. (5.9) follows from a) and Theorem 1.4 immediately.
c) Equivalence of conservativity. From now on, we assume that all the q-pairs considered below are totally stable. The problem is that in general, we know that $\lim_{t\to0}P(t,x,A)/t=q(x,A)$ only for $x\notin A\in\mathscr{R}$ rather than $A\in\mathscr{E}$. The last assertion holds once the q-pair is conservative (Theorems 1.5 and 1.13). Let the marginal q-pairs be conservative. We first prove that
Moreover, the convergence is uniform in A". To do so, by Lemma 1.7, choose {EP)}? c 9 k such t,hat E p ) T EI, as n -+ 00. We h a x seen from the proof a) that 1-F(t;z1,a2;{q}x
{.2})
< 1 - ~ l ( t , ~ l , { z l +} )1 - P 2 ( t , x z , { ~ , } ) .
Hence {Ein)x Ep))nbl c & and Ein) x E p ) t El x E2 as n + 00. Note that
=: I
+ II +m.
Besides, by the marginality for processes, we have
Fix n large enough so that zk E E f ) . Since n ( E p ) x E p ) ) E @, we have limt,oI = 0. Next, since the marginal q-pairs are conservative, we have limt,olI ql(zl, (E!"))') q2(z2,( E P ) ) ' ) . Collecting these facts together, we obtain
2
<
+
This proves the existence of the limit by letting n --+ 00. From the proof of Theorem 1.13, it also follows that the convergence is indeed uniform. Applying the marginality again and using the conservativity of the marginal processes, we obtain
Hence
This proves the conservativity of the coupling q-pair. Conversely, let a coupling q-pair be conservative. marginals satisfy
We prove that its
and the convergence - - is uniform in AI, (k = 1 , 2 ) . To do so, choose a sequence c 9, E(") El x E2 and a point x$ E E2 such that E(")(zg):=
{E("))
Similarly, we define Clearly,
and
By marginality,
and so
It follows that E ( " ) ( z $ E ) 9
1
(n2 1). Next,
=: I+II+rn.
By marginality again, lI=p(t,z1,x2;( E ( " ) ( z ; ) ) C x E 2 ) /Fix t . n large enough such that x1 E E(n)(z;),since the coupling q-pair is conservative and Al n E(")(z$)E 9 1 ) it follows that
< Q(z,,z2;(E(n)(&)Cx E2) + Q1(%
(E(n)(z;))c).
The existence of the required limit follows by letting n --t m. By the same reason mentioned above, the convergence is also uniform in AX. We now prove the conservativity of the marginal q-pairs. By using the marginality again, 41(z1) = Q ( z l J 2 )- i(z1,22;b 1 ) x (E2 \ {%I)) = 4"(%z2;El x E2 \ (4x b 2 H - (3%z2; bl}x
(E2
\b2H)
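d) Equivalence of uniqueness. From now on, we assume that all the q-pairs used below are totally stable and conservative. Let the coupling q-pair $(\tilde q(x_1,x_2),\tilde q(x_1,x_2;dy_1,dy_2))$ of q-pairs $(q_k(x_k),q_k(x_k,A_k))$ determine uniquely a coupling q-process. Denote by $\Pi_k(\lambda)$, $k=1,2$ and $\tilde\Pi(\lambda)$ the operators corresponding to the kernels $q_k(x_k,dy_k)/(\lambda+q_k(x_k))$ and $\tilde q(x_1,x_2;dy_1,dy_2)/(\lambda+\tilde q(x_1,x_2))$ respectively. Next, let $z_\lambda^k$, $k=1,2$ and $\tilde z_\lambda$ be the maximal solutions to the equations
$$z_\lambda^k=\Pi_k(\lambda)z_\lambda^k,\quad0\le z_\lambda^k\le1,\quad k=1,2$$
and
$$\tilde z_\lambda=\tilde\Pi(\lambda)\tilde z_\lambda,\quad0\le\tilde z_\lambda\le1,$$
respectively. We now prove that
$$\tilde z_\lambda(x_1,x_2)\ge z_\lambda^1(x_1)\vee z_\lambda^2(x_2). \tag{5.10}$$
By the comparison lemma, to prove that $\tilde z_\lambda(\cdot,x_2)\ge z_\lambda^1$ for all $x_2\in E_2$, we need only to show that $z_\lambda^1\le\tilde\Pi(\lambda)z_\lambda^1$. But by (5.8), we indeed have

This gives us $\tilde z_\lambda(\cdot,x_2)\ge z_\lambda^1$. Dually, we have $\tilde z_\lambda(x_1,\cdot)\ge z_\lambda^2$ for all $x_1\in E_1$. We have thus proved (5.10) and hence the regularity of the marginals follows as an application of the uniqueness criterion. Conversely, let the marginal q-pairs determine uniquely the marginal processes respectively. Take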
and define
By (5.9), we have
Thus, for each n, we have obtained three bounded q-pairs, denote by Pk(")(X xk, dyk), k = 1 , 2 and 8 " ) ( A ; x,,x,; dy,, dy,), respectively, the correspondent q-processes. Now, by Lemma 5.18 it suffices to show that
By the comparison theorem, we need only to check that
P!") (A, x,,El
\ Ei")) + P,'") (A, x,, E2 \ E p ) )
(5.11) For
LHS of (5.11) > 1/2 = RHS of (5.11). We now assume that havewe of
Then, by (5.8) and the regularity
Replacing P!'" (A, y l , E1\13in)) wilh PJn' (A: pz, E2\Ep'), we obtain mother equality. Summing up them together, we get (5.11). I It is worthy to point out that the q-pair ($n)(xl,x2),Q ( n ) ( ~ l ,x 2 ;cly,, dy,)) used above is not a coupling of ( 4 ~ ) ( ~ k ) , ~ ~ ' ( 5 k ? d ky = ~ ) 1, ) , 2 since for
fE
fi(")f(x1,x2)= IE2~ n , ( x 2 ) f l ~ ) f (isa not l ) independent of
Now, we are ready to prove the following result.
2,.
Theorem 5.20. Conditions (5.7) and (5.8) are equivalent.

Proof: a) By Lemma 5.11, we have (5.7) $\Longrightarrow$ (5.8).
b) Conversely, for the given conservative q-pairs satisfying (5.8), in general, we have
$$\tilde P^{\min}(\lambda;x_1,x_2;A_1\times E_2)\le P_1^{\min}(\lambda,x_1,A_1),\quad\tilde P^{\min}(\lambda;x_1,x_2;E_1\times A_2)\le P_2^{\min}(\lambda,x_2,A_2),\qquad\lambda>0,\ x_k\in E_k,\ A_k\in\mathscr{E}_k,\ k=1,2.$$
This is again deduced by using the comparison theorem. By assumption, the marginal q-pairs are regular and so is the coupling q-pair by Theorem 5.19. Hence
$$\tilde P(\lambda;x_1,x_2;A_1\times E_2)=P_1(\lambda,x_1,A_1)$$
independent of $x_2$. Now the inverse implication (5.8) $\Longrightarrow$ (5.7) follows from the uniqueness theorem of Laplace transform.
5.3 Successful Coupling and Ergodicity

Based on the results obtained in the last section, we assume for the remainder of this chapter that all the q-pairs considered are regular. In this section, we discuss a typical application of the $W$-metric and coupling methods. Let $(X_t^1,X_t^2)$ ($t\ge0$) be the path of a coupling jump process and set
$$T=\inf\{t\ge0:X_t^1=X_t^2\}.$$
Definition 5.21. A coupling is called successful if
$$\tilde{\mathbb{P}}^{x_1,x_2}[T<\infty]=1,\qquad x_1\ne x_2, \tag{5.12}$$
and
$$\tilde{\mathbb{P}}^{x_1,x_2}[X_t^1=X_t^2\text{ for all }t\ge T]=1,\qquad x_1\ne x_2. \tag{5.13}$$
Suppose that $\tilde{\mathbb{P}}^{x_1,x_2}$ is a successful coupling, then
$$\|P(t,x_1,\cdot)-P(t,x_2,\cdot)\|_{\mathrm{Var}}\le2\,\tilde{\mathbb{P}}^{x_1,x_2}[T>t]\longrightarrow0\qquad\text{as }t\to\infty.$$
Furthermore, if the process has a stationary distribution $\pi$, then for any initial distribution $\mu$, we have
where $P(t)$ is the transition probability function of the original process. In this way, we prove the ergodicity of the process. Now, it should be clear that the study of successful couplings is related to the total variation distance. In general, the success of couplings is weaker than the recurrence of the process and hence weaker than the ergodicity (see Example 5.50 below for instance). For the opposite direction, the next result gives us a reasonable solution.
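Before turning to that result, the coupling inequality $\|P(t,x_1,\cdot)-P(t,x_2,\cdot)\|_{\mathrm{Var}}\le2\,\tilde{\mathbb{P}}[T>t]$ can be checked numerically. The sketch below is a discrete-time analogue, not the continuous-time setting of the text: two copies of a hypothetical three-state chain move independently until they first meet and together afterwards, and the total variation distance $\frac12\|\cdot\|_{\mathrm{Var}}$ after $n$ steps is compared with the probability that the copies have not met by time $n$. The matrix `P` and all function names are assumptions made for illustration.

```python
def tv(p, q):
    """Total variation distance (1/2) * sum |p - q| on a finite space."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def step(mu, P):
    """One step of the chain: the row vector mu times the matrix P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

def not_met_probabilities(P, x1, x2, n_steps):
    """P[T > n], n = 0..n_steps, for two copies started at x1 != x2 that
    move independently until they first meet (classical coupling)."""
    n = len(P)
    mass = {(x1, x2): 1.0}              # law of the pair on {T > n}
    tail = [1.0]
    for _ in range(n_steps):
        new_mass = {}
        for (a, b), m in mass.items():
            for a2 in range(n):
                for b2 in range(n):
                    if a2 != b2:        # the two copies have still not met
                        key = (a2, b2)
                        new_mass[key] = new_mass.get(key, 0.0) + m * P[a][a2] * P[b][b2]
        mass = new_mass
        tail.append(sum(mass.values()))
    return tail

# Hypothetical transition matrix.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]
n_steps = 12
tail = not_met_probabilities(P, 0, 1, n_steps)

row0, row1 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
distances = [tv(row0, row1)]
for _ in range(n_steps):
    row0, row1 = step(row0, P), step(row1, P)
    distances.append(tv(row0, row1))

bound_holds = all(d <= t + 1e-12 for d, t in zip(distances, tail))
```

In this example the bound is far from sharp: the distance between the two rows decays faster than the tail of the meeting time, which reflects the remark above that the success of a coupling is only a sufficient tool for estimating the total variation distance.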
Theorem 5.22. Let $P$ be a probability kernel on a Polish space $(E,\rho,\mathscr{E})$ satisfying the following conditions.
(1) $P(x,\cdot)\in\mathscr{P}_0$ for all $x\in E$, where $\mathscr{P}_0$ was defined in Theorem 5.10.
(2) There is a constant $c\in[0,1)$ such that
$$W(P(x_1,\cdot),P(x_2,\cdot))\le c\,\rho(x_1,x_2),\qquad x_1,x_2\in E.$$
Then there exists uniquely a stationary distribution $\pi\in\mathscr{P}_0$ such that
$$W(\mu P^n,\pi)\le c^nW(\mu,\pi),\qquad n\ge1,\ \mu\in\mathscr{P}_0.$$
Proof: By Theorem 5.10, the symmetry of $W$ and the assumption, we get
$$|Pf(x_1)-Pf(x_2)|\le L(f)\,W(P(x_1,\cdot),P(x_2,\cdot))\le c\,L(f)\,\rho(x_1,x_2).$$
This means that $L(Pf)\le cL(f)$. Thus
$$W(\mu_1P,\mu_2P)=\sup_{L(f)\le1}\Big|\int Pf\,d(\mu_1-\mu_2)\Big|\le c\sup_{L(g)\le1}\Big|\int g\,d(\mu_1-\mu_2)\Big|=c\,W(\mu_1,\mu_2).$$
Inductively, $W(\mu_1P^n,\mu_2P^n)\le c^nW(\mu_1,\mu_2)$. By Theorem 5.4, $(\mathscr{P}_0,W)$ is complete, and so by the contraction mapping principle, there exists uniquely a fixed point $\pi\in\mathscr{P}_0$ of the mapping $\mu\to\mu P$ such that $\pi=\pi P$. We have
$$W(\mu P^n,\pi)=W(\mu P^n,\pi P^n)\le c^nW(\mu,\pi)$$
as required.
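The geometric bound can be observed on a finite state space equipped with the discrete metric, where (by Theorem 5.7) $W$ is half the total variation and condition (2) reduces to a bound on the distance between any two rows of the transition matrix. The following sketch rests on these assumptions; the matrix `P`, the initial law `mu` and the helper names are hypothetical.

```python
def tv(p, q):
    """W-distance for the discrete metric: (1/2) * sum |p - q|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def step(mu, P):
    """One step of the chain: the row vector mu times the matrix P."""
    n = len(P)
    return [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]

# Hypothetical transition matrix.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

# Contraction constant c in condition (2): the largest distance between rows.
c = max(tv(P[i], P[j]) for i in range(3) for j in range(3))

# Approximate the stationary distribution by iterating mu -> mu P.
pi = [1 / 3, 1 / 3, 1 / 3]
for _ in range(500):
    pi = step(pi, P)

mu = [1.0, 0.0, 0.0]
d0 = tv(mu, pi)
bounds_hold = []
nu = mu
for n in range(1, 11):
    nu = step(nu, P)
    # W(mu P^n, pi) <= c^n W(mu, pi), up to floating-point rounding
    bounds_hold.append(tv(nu, pi) <= c ** n * d0 + 1e-12)
```

The stationary law is obtained here by simply iterating the map $\mu\to\mu P$, which is the fixed-point iteration underlying the contraction mapping principle used in the proof.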
As illustrated by Lemma 4.42, it is easy to extend the above result to the time-continuous case.
Theorem 5.23. Let $P(t,x,dy)$ be a transition function on a Polish space $(E,\rho,\mathscr{E})$ satisfying the following conditions.
(1) $P(t,x,\cdot)\in\mathscr{P}_0$ for all $t>0$ and $x\in E$.
(2) For every $t>0$, there is $c(t)\in[0,1)$ such that
$$W(P(t,x_1,\cdot),P(t,x_2,\cdot))\le c(t)\,\rho(x_1,x_2),\qquad x_1,x_2\in E.$$
Then there exists uniquely a stationary distribution $\pi\in\mathscr{P}_0$ such that
$$W(\mu P_t,\pi)\le c(h)^{[t/h]}\,W(\mu,\pi)\longrightarrow0\qquad\text{as }t\to\infty\text{ for }\mu\in\mathscr{P}_0,$$
where $h>0$ is an arbitrarily fixed constant and $[z]$ is the integer part of $z$.
Proof: a) Since $c(t)<1$, it is easy to see that $W(\mu_1P_t,\mu_2P_t)$ is decreasing in $t$. Actually, we have first that $L(P_tf)\le c(t)L(f)$ and then

We now fix $h>0$ and write $t=[t/h]h+h_t$,

b) Next, for each $t>0$, by Theorem 5.22, there exists uniquely a $\pi_t\in\mathscr{P}_0$ such that $\pi_t=\pi_tP_t$. The proof will be done once we show that $\pi_t=\pi_s$, since then we would have $\pi=\pi P_t$ and furthermore

c) The proof of $\pi_t=\pi_s$ is based on the semigroup property plus the uniqueness of the fixed point of the mapping $\varphi_t(\mu)=\mu P_t$, $\mu\in\mathscr{P}_0$. Noting that $\varphi_t\circ\varphi_s(\pi_t)=\varphi_{t+s}(\pi_t)=\varphi_s\circ\varphi_t(\pi_t)=\varphi_s(\pi_t)$, by the uniqueness of $\pi_t$, we have $\varphi_s(\pi_t)=\pi_t$ and then $\pi_t=\pi_s$ by the uniqueness of $\pi_s$.

Before moving further, we mention that if we are interested only in whether the two marginals will meet or not, then we can ignore condition (5.13). In this case, we can even allow the two marginals to be different processes. On the other hand, if the two marginals are copies of a single process, it is often easy to modify the coupling process so that (5.13) holds. In this sense, condition (5.13) is not essential. Next, in the study of successful couplings for jump processes, we may and will fix a coupling q-pair $(\tilde q(x_1,x_2),\tilde q(x_1,x_2;dy_1,dy_2))$ and then justify whether the corresponding process is successful or not. Thus, our main task is to find some conditions, depending on the q-pair only, to ensure the success. In this section, we restrict ourselves to the case that
$$(E_1,\mathscr{E}_1)=(E_2,\mathscr{E}_2)=(E,\mathscr{E})\quad\text{and}\quad\Omega_1=\Omega_2.$$
Denote by $\tilde P(t;x_1,x_2;dy_1,dy_2)$ the jump process determined by the coupling q-pair. Then, condition (5.13) becomes
$$\tilde P(t;x,x;\Delta)=1,\qquad t\ge0,\ x\in E, \tag{5.14}$$
where $\Delta=\{(x,x):x\in E\}$. Equivalently, $\tilde q(x,x;\Delta^c)=0$ for all $x\in E$. Under (5.14), we have

and so condition (5.12) is now reduced to
$$\lim_{t\to\infty}\tilde P(t;x_1,x_2;\Delta)=1\qquad\text{for all }x_1\ne x_2. \tag{5.15}$$
Definition 5.24. A coupling q-pair is called successful if (5.14) and (5.15) hold.

To state our criterion for success, let $\tilde\Pi(\lambda)$ be the operator corresponding to the kernel $\tilde q(x_1,x_2;dy_1,dy_2)/(\lambda+\tilde q(x_1,x_2))$ and let ${}_\Delta\tilde\Pi(\lambda)$ denote the restriction of $\tilde\Pi(\lambda)$ to $E^2\setminus\Delta$. Set ${}_\Delta\tilde\Pi={}_\Delta\tilde\Pi(0)$.

Theorem 5.25. A coupling q-pair $(\tilde q(x_1,x_2),\tilde q(x_1,x_2;dy_1,dy_2))$ is successful iff the following conditions hold.
(1) $\tilde q(x,x;\Delta^c)=0$ for all $x\in E$.
(2) $\tilde q(x_1,x_2)>0$ for all $(x_1,x_2)\in E^2\setminus\Delta$.
(3) The equation
$$u={}_\Delta\tilde\Pi u,\qquad0\le u\le1 \tag{5.16}$$
has only the trivial solution $u\equiv0$.
Proof: As we mentioned above, the first two conditions of the theorem are necessary. Hence, we need only to show that condition (3) is equivalent to (5.15) under the assumptions (1) and (2). Now, assume (1) and (2). By using the Laplace transform, (5.14) becomes
$$\lambda\tilde P(\lambda;x,x;\Delta)=1,\qquad x\in E,\ \lambda>0 \tag{5.17}$$
and we can rewrite (5.15) as
$$\lim_{\lambda\downarrow0}\lambda\tilde P(\lambda;x_1,x_2;\Delta)=1,\qquad x_1\ne x_2. \tag{5.18}$$
Thus, we need only to show that (3) and (5.18) are equivalent. Fix $\lambda>0$. Since $(\lambda\tilde P(\lambda;x_1,x_2;\Delta):x_1,x_2\in E)$ is the minimal solution to the equation

by (5.17) and the localization theorem, we see that $(\lambda\tilde P(\lambda;x_1,x_2;\Delta):x_1\ne x_2)$ is the minimal solution to the equation
$$f(\lambda)(x_1,x_2)={}_\Delta\tilde\Pi(\lambda)f(\lambda)(x_1,x_2)+\tilde q(x_1,x_2;\Delta)/(\lambda+\tilde q(x_1,x_2)),\qquad x_1\ne x_2.$$
Noting that

and using condition (2), we obtain
$$\lambda\tilde P(\lambda;x_1,x_2;\Delta)\uparrow\text{some }f(x_1,x_2)\quad\text{as }\lambda\downarrow0,\qquad x_1\ne x_2$$
and $(f(x_1,x_2):(x_1,x_2)\in E^2\setminus\Delta)$ is the minimal solution to
$$f(x_1,x_2)={}_\Delta\tilde\Pi f(x_1,x_2)+\tilde q(x_1,x_2;\Delta)/\tilde q(x_1,x_2),\qquad x_1\ne x_2.$$
Thus, $(h(x_1,x_2):=1-f(x_1,x_2):(x_1,x_2)\in E^2\setminus\Delta)$ is the maximal solution to Eq. (5.16).

Even though we have a general procedure to approximate the maximal solution to Eq. (5.16) (cf. Lemma 2.39), such a procedure is sometimes not very practical, so we would like to propose some more effective sufficient conditions for success. Take $\theta\notin E^2$ and set $E_\theta=(E^2\setminus\Delta)\cup\{\theta\}$, $\mathscr{E}_\theta=\sigma\{\mathscr{E}^2\cap(E^2\setminus\Delta),\{\theta\}\}$. Define a transition probability on $(E_\theta,\mathscr{E}_\theta)$ as follows:
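$\tilde P_\theta(\theta,\{\theta\})=1,$
$\tilde P_\theta(x_1,x_2;A)=[\tilde q(x_1,x_2;A\setminus\{\theta\})+\tilde q(x_1,x_2;\Delta)I_A(\theta)]/\tilde q(x_1,x_2),\quad x_1\ne x_2,\ A\in\mathscr{E}_\theta.$ (5.19)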
Intuitively, this transition probability is nothing but considering the set A as a single state 0.
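The collapsing step in (5.19) can be illustrated on a small finite example. The sketch below is only an illustration under stated assumptions: the unit-rate birth-death chain, the choice of the independent coupling off the diagonal, and the names `birth_death_jumps`, `collapsed_chain` and `off_diagonal_mass` are all hypothetical and not taken from the text. It builds the jump chain on $(E^2\setminus\Delta)\cup\{\theta\}$ and tracks the probability of not yet having reached $\theta$.

```python
THETA = "theta"  # the single state that replaces the whole diagonal


def birth_death_jumps(i, n_states):
    """Jump rates of a hypothetical unit-rate birth-death chain on {0, ..., n_states - 1}."""
    jumps = {}
    if i + 1 < n_states:
        jumps[i + 1] = 1.0  # birth
    if i > 0:
        jumps[i - 1] = 1.0  # death
    return jumps


def collapsed_chain(n_states):
    """Jump-chain transition probabilities in the spirit of (5.19), independent coupling assumed."""
    P = {THETA: {THETA: 1.0}}  # theta is absorbing
    for i1 in range(n_states):
        for i2 in range(n_states):
            if i1 == i2:
                continue
            rates = {}
            for j, r in birth_death_jumps(i1, n_states).items():
                rates[(j, i2)] = rates.get((j, i2), 0.0) + r
            for j, r in birth_death_jumps(i2, n_states).items():
                rates[(i1, j)] = rates.get((i1, j), 0.0) + r
            total = sum(rates.values())  # plays the role of the total coupled rate, assumed > 0
            row = {}
            for (j1, j2), r in rates.items():
                target = THETA if j1 == j2 else (j1, j2)  # every diagonal point becomes theta
                row[target] = row.get(target, 0.0) + r / total
            P[(i1, i2)] = row
    return P


def off_diagonal_mass(P, start, steps):
    """Return [P(Z(n) != theta) for n = 0, ..., steps] for the chain started at `start`."""
    dist = {start: 1.0}
    masses = [1.0 - dist.get(THETA, 0.0)]
    for _ in range(steps):
        new = {}
        for state, mass in dist.items():
            for target, p in P[state].items():
                new[target] = new.get(target, 0.0) + mass * p
        dist = new
        masses.append(1.0 - dist.get(THETA, 0.0))
    return masses
```

For this toy chain the off-diagonal mass is non-increasing in the number of steps, which is the quantity a success criterion has to drive to zero.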
Corollary 5.26. Assume that (1) and (2) of Theorem 5.25 hold. Let $h\in\mathscr{E}_\theta$, $h\ge 0$ and $h(\theta)=0$. Suppose that there exist constants $C>0$ and $0\le c<1$ such that
$\tilde P_\theta h\le C+ch$ on $E_\theta$ (5.20)
and there exist constants $k\in[0,1)$ and $K>C/[(1-c)(1-k)]$ such that
$\tilde P_\theta(x_1,x_2;E^2\setminus\Delta)\le k$ for all $(x_1,x_2)\in E_\theta$ satisfying $h(x_1,x_2)\le K$. (5.21)
Then the coupling q-pair is successful. In particular, if (5.21) is satisfied for all $x_1\ne x_2$, then the same conclusion holds without using condition (5.20).
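The lower bound on $K$ can be read as a spectral condition. Conditions (5.20) and (5.21) lead to two coupled linear inequalities for a pair of non-negative sequences, governed by the matrix with rows $(k, 1/K)$ and $(C, c)$, and the requirement $K>C/[(1-c)(1-k)]$ is exactly what makes the largest eigenvalue of that matrix smaller than 1. The following sketch checks this numerically; the function names and the sample constants are illustrative assumptions, not part of the text.

```python
import math


def perron_root(k, K, C, c):
    """Largest eigenvalue of the non-negative matrix [[k, 1/K], [C, c]]."""
    trace = k + c
    det = k * c - C / K
    disc = trace * trace - 4.0 * det  # equals (k - c)^2 + 4C/K, hence non-negative
    return (trace + math.sqrt(disc)) / 2.0


def threshold(k, C, c):
    """The bound C / [(1 - c)(1 - k)] that K has to exceed."""
    return C / ((1.0 - c) * (1.0 - k))


def iterate_bounds(k, K, C, c, steps, start=(1.0, 1.0)):
    """Iterate x' = k x + y / K, y' = C x + c y, the worst case allowed by the two inequalities."""
    x, y = start
    for _ in range(steps):
        x, y = k * x + y / K, C * x + c * y
    return x, y
```

With the sample values $k=c=0.5$ and $C=1$ the threshold is 4: taking $K=5$ gives a largest eigenvalue below 1 and the iterates decay, while $K=3$ gives a largest eigenvalue above 1.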
Proof: Consider the process $(Z(n):=(X^1(n),X^2(n)))_{n\ge 0}$ defined on a probability space $(\Omega,\mathscr{F},\mathbb{P})$ valued in $(E_\theta,\mathscr{E}_\theta)$ with transition probability $\tilde P_\theta$. What we need is to show that $\mathbb{P}[Z(n)\ne\theta]\to 0$ as $n\to\infty$. Put
$I_n=I_{[Z(n)\ne\theta]},\quad J_n=I_nh(Z(n)),\quad n\ge 0.$
Since $\theta$ is an absorbing state and $I_{n-1}=0\Longrightarrow I_n=0$, by (5.20), we get
$\mathbb{E}J_n=\mathbb{E}[I_{n-1}\mathbb{E}(J_n\mid Z(n-1))]\le\mathbb{E}[I_{n-1}\tilde P_\theta h(Z(n-1))]\le\mathbb{E}[I_{n-1}(C+ch(Z(n-1)))]=C\,\mathbb{E}I_{n-1}+c\,\mathbb{E}J_{n-1}.$ (5.22)
On the other hand, by (5.21), we have
$\mathbb{E}I_n=\mathbb{E}[I_{n-1}\tilde P_\theta(Z(n-1);E^2\setminus\Delta)]$
$\le\mathbb{E}[I_{n-1}\tilde P_\theta(Z(n-1);E^2\setminus\Delta);h(Z(n-1))\le K]+\mathbb{E}[I_{n-1}\tilde P_\theta(Z(n-1);E^2\setminus\Delta);h(Z(n-1))>K]$
$\le k\,\mathbb{E}I_{n-1}+K^{-1}\mathbb{E}J_{n-1}.$ (5.23)
Combining (5.22) with (5.23), we obtain
$\begin{pmatrix}\mathbb{E}I_n\\ \mathbb{E}J_n\end{pmatrix}\le\begin{pmatrix}k&K^{-1}\\ C&c\end{pmatrix}\begin{pmatrix}\mathbb{E}I_{n-1}\\ \mathbb{E}J_{n-1}\end{pmatrix}.$
Since the eigenvalues of the matrix on the right-hand side are smaller than 1, we see that the left-hand side goes to zero as $n\to\infty$. The same proof, even simpler, will give us the last assertion. ∎
Corollary 5.27. Let $E$ be endowed with metric $\rho$. Assume that conditions (1) and (2) of Theorem 5.25 hold. Suppose that $\tilde q(x_1,x_2)$ is locally bounded.
(1) If for every $r>0$ there exists a bounded function $\varphi_r:[0,r]\to[0,\infty)$ such that
$\tilde\Omega(\varphi_r\circ\rho)(x_1,x_2)+\tilde q(x_1,x_2)\le 0,\quad 0<\rho(x_1,x_2)\le r,$ (5.24)
where $\tilde\Omega$ is the operator corresponding to the given coupling q-pair. Moreover, there exists a function $f:[0,\infty)\to\mathbb{R}$ such that
$f(r)\uparrow f(\infty)=\infty$ as $r\to\infty$, $f(0)=0$ (5.25)
and
$\tilde\Omega(f\circ\rho)\le 0$ on $E^2\setminus\Delta$. (5.26)
Then the coupling is successful.
(2) If there exists a bounded function $f:(0,\infty)\to(0,\infty)$ such that $\tilde\Omega(f\circ\rho)\ge 0$ on $E^2\setminus\Delta$, then the coupling is not successful.
Proof: a) Use the same notations as above and set
$T=\min\{n\ge 1:\rho(Z(n))=0\},\quad S_r=\min\{n\ge 1:\rho(Z(n))>r\},\quad T_{0,r}=T\wedge S_r.$
Without any confusion, we use the same notation $\mathbb{P}^{x_1,x_2}$ (resp. $\mathbb{E}^{x_1,x_2}$) to denote the distribution (resp. expectation) of the process with transition probability $\tilde P_\theta$ starting from $(x_1,x_2)$. First, we prove that
$\mathbb{P}^{x_1,x_2}[T_{0,r}<\infty]=1$ for all $0<\rho(x_1,x_2)\le r$. (5.27)
Choose and fix a reference point $o\in E$, set
$\tau_N=\min\{n\ge 1:\rho(X^1(n),o)+\rho(X^2(n),o)\ge N\},\quad N\ge 1.$
Since
$\tilde P_\theta^ng-g=\sum_{k=0}^{n-1}(\tilde P_\theta^{k+1}g-\tilde P_\theta^kg),\quad g\in\mathscr{E}_\theta,$
by conditions (1) and (2) of Theorem 5.25 and condition (5.24), if we take $g=g_r:=\varphi_r\circ\rho$, then

That is

Let $n\to\infty$ and then $N\to\infty$ to get

This proves (5.27). Next, fix $r_2>0$ and set $F=f\circ\rho$. From (1) and (2) of Theorem 5.25 and condition (5.26), it follows that $\tilde P_\theta^nF\le F$ for all $n\ge 1$. Hence, for $\rho(x_1,x_2)=:r\in(0,r_2)$,
$\mathbb{E}^{x_1,x_2}F(Z(n\wedge T_{0,r_2}))\le F(x_1,x_2).$
Letting $n\to\infty$, by (5.27), we see that
$f(r)=F(x_1,x_2)\ge\mathbb{E}^{x_1,x_2}[F(Z(T)):T\le S_{r_2}]+\mathbb{E}^{x_1,x_2}[F(Z(S_{r_2})):T>S_{r_2}]\ge f(r_2)\,\mathbb{P}^{x_1,x_2}(T>S_{r_2}).$
Thus, $\mathbb{P}^{x_1,x_2}(T>S_{r_2})\le f(r)/f(r_2)$. Now, letting $r_2\to\infty$ and using (5.25), we obtain $\mathbb{P}^{x_1,x_2}(T=\infty)=0$. This shows that our coupling is successful.
b) We now consider the converse case. Let $x_1\ne x_2$. If $\mathbb{P}^{x_1,x_2}[T_{0,r}=\infty]>0$, then $\mathbb{P}^{x_1,x_2}[T=\infty]>0$ and so there is nothing to do. Thus, we may and will assume that $\mathbb{P}^{x_1,x_2}[T_{0,r}<\infty]=1$
where
For Markov chains, we have a precise criterion.
Corollary 5.28. Let $E$ be countable. A coupling Q-matrix is successful iff condition (1) of Theorem 5.25 and the following conditions hold.
(2)' For every $(i_1,i_2)\in E^2\setminus\Delta$, there exists a sequence $(i_1^{(m)},i_2^{(m)})$, $m=0,1,\ldots,M+1$, such that

and
$\tilde q(i_1^{(k)},i_2^{(k)};i_1^{(k+1)},i_2^{(k+1)})>0,\quad k=0,1,\ldots,M.$
(3)' There exists a sequence $\{A_n\}_1^\infty$ of finite subsets of $E^2\setminus\Delta$ so that $A_n\uparrow E^2\setminus\Delta$ and a positive function $\varphi$ such that
(5.29)
$\tilde\Omega\varphi\le 0$ on $E^2\setminus\Delta$. (5.30)
Proof: Condition (2)' is clearly necessary. Now, assume that (1) and (2)' hold. Again, consider the subset $\Delta$ as a single absorbing state $\theta$. Setting $\varphi(\theta)=0$ and using the notation defined by (5.19), we have $\tilde P_\theta\varphi\le\varphi$ on $E_\theta$ and

Now, the remainder of the proof can be done by a slight modification of the proof of Theorem 4.24.
5.4 Optimal Markovian Couplings
Let $(E_k,\mathscr{E}_k)$ $(k=1,2)$ be a measurable space and $\varphi$ be a non-negative $\mathscr{E}_1\times\mathscr{E}_2$-measurable function. As an analogue of the W-metric, we define

where $\tilde P$ varies over all couplings of the probability measures $P_1$ and $P_2$ on $(E_1,\mathscr{E}_1)$ and $(E_2,\mathscr{E}_2)$ respectively. In this section, we discuss the optimal coupling defined below.
Definition 5.29. A coupling of $P_1$ and $P_2$ is called $\varphi$-optimal if it attains the infimum on the right-hand side of (5.31).
For a complete separable metric space $(E,\rho,\mathscr{E})$, a $\rho$-optimal coupling does exist by Lemma 5.2, but may not be unique. In the special case of $\rho$ being the discrete metric and $E_k=E$, the $\rho$-optimal coupling is given by (5.4). Certainly, one may replace the above $P_k$ by $P_k(t,x_k,dy_k)$, $k=1,2$, and define a $\varphi$-optimal coupling $\tilde P(t;x_1,x_2;dy_1,dy_2)$. But this definition is usually not useful since it is not practical. We will emphasize the coupling operators. Consider jump processes again. As usual, for a jump process $P(t,x,dy)$, denote by $P(t)$ the corresponding semigroup on $b\mathscr{E}$. We want to find a coupling process $\bar P(t)$ such that for any coupling process $\tilde P(t)$,
$\bar P(t)\varphi(x_1,x_2)\le\tilde P(t)\varphi(x_1,x_2)$ for all $(x_1,x_2)$.
The next result reduces the comparison of two semigroups to that of their operators. From the proof below, it should be clear that under some mild conditions, the conclusion also holds for other types of Markov processes.
Lemma 5.30. Let $P_k(t)$ be a regular jump process with state space $(E_k,\mathscr{E}_k)$ and q-pair $(q_k(x_k),q_k(x_k,dy_k))$, $k=1,2$. Suppose that there exist non-negative functions $\varphi_k\in\mathscr{E}_k$ $(k=1,2)$, $\varphi\in\mathscr{E}_1\times\mathscr{E}_2$ and constants $C$ and $c$ such that

Given two Markovian couplings $\bar P(t)$ and $\tilde P(t)$ of $P_1(t)$ and $P_2(t)$, if
then we have
Proof: a) Without loss of generality, assume that $C,c>0$. By (5.32) and Lemma 4.13, we have
$P_k(t)\varphi_k\le C[e^{ct}-1]/c+e^{ct}\varphi_k,\quad k=1,2.$
Hence, for any coupling semigroup $\tilde P(t)$, we have
$\tilde P(t)\varphi(x_1,x_2)\le P_1(t)\varphi_1(x_1)+P_2(t)\varphi_2(x_2)\le 2C[e^{ct}-1]/c+e^{ct}[\varphi_1(x_1)+\varphi_2(x_2)],\quad x_1\in E_1,\ x_2\in E_2.$ (5.33)
b) By Theorem 1.15, for every $f\in b\mathscr{E}$, we have
$\tilde P(t)f(x_1,x_2)-f(x_1,x_2)=\int_0^t\tilde P(s)\tilde\Omega f(x_1,x_2)\,ds.$ (5.34)
In particular,
$\tilde P(t)\varphi_n(x_1,x_2)-\varphi_n(x_1,x_2)=\int_0^t\tilde P(s)\tilde\Omega\varphi_n(x_1,x_2)\,ds,$ (5.35)
where $\varphi_n=\varphi\wedge n$. Moreover, by (5.34) and the marginality, we have

Combining this with (5.32), we see that for fixed $x_1$ and $x_2$,

is bounded on finite $t$-interval uniformly in $n$. By (5.35), it follows that

Furthermore, since $\lim_{t\to 0}\tilde P(t)\varphi(x_1,x_2)=\varphi(x_1,x_2)$, using (5.34) and the dominated convergence theorem, we get

Therefore

From this, the required assertion follows immediately. ∎
The above result leads to the following definition.
Definition 5.31. A coupling operator $\bar\Omega$ is called $\varphi$-optimal if
$\bar\Omega\varphi(x_1,x_2)=\inf_{\tilde\Omega}\tilde\Omega\varphi(x_1,x_2)$ for all $x_1$ and $x_2$,
where $\tilde\Omega$ varies over all coupling operators.
To study the existence of a $\varphi$-optimal Markovian coupling, we recall some notions. A finite measure $\mu$ on $(E,\mathscr{E})$ is said to be inner regular or tight if for every $\varepsilon>0$ there is a compact $K_\varepsilon$ such that $\mu(K_\varepsilon^c)<\varepsilon$. A sequence of finite measures $\{\mu_n\}$ is said to be uniformly tight if for every $\varepsilon>0$ there is a compact $K_\varepsilon$ such that $\sup_n\mu_n(K_\varepsilon^c)\le\varepsilon$. Next, given two measurable spaces $(\Omega,\mathscr{F})$ and $(E,\mathscr{E})$, we say that $G$ is a transition measure (resp. transition probability) from $(\Omega,\mathscr{F})$ to $(E,\mathscr{E})$ if for each $\omega\in\Omega$, $G(\omega,\cdot)$ is a finite (resp. probability) measure on $(E,\mathscr{E})$ and for each $A\in\mathscr{E}$, $G(\cdot,A)$ is an $\mathscr{F}$-measurable function. Denote by $\mathscr{K}(\mu_1,\mu_2)$ the set of all couplings of $\mu_1$ and $\mu_2$. Similarly, we have $\mathscr{K}(G_1,G_2)$, which is the set of all coupling measures of $G_1(\omega_1,\cdot)$ and $G_2(\omega_2,\cdot)$ for each fixed $(\omega_1,\omega_2)$.
Now, the main existence result of $\varphi$-optimal Markovian coupling can be stated as follows. It is clearly a generalization of Lemma 5.2.
Theorem 5.32. Let $G_k$ be a transition probability from a measurable space $(\Omega_k,\mathscr{F}_k)$ to a separable metric space $(E_k,\rho_k,\mathscr{E}_k)$. Suppose that for each $\omega_k$, $G_k(\omega_k,\cdot)$ is tight, $k=1,2$. Then for every non-negative closed function $\varphi$, there exists a transition probability $\bar G\in\mathscr{K}(G_1,G_2)$ from $(\Omega_1\times\Omega_2,\mathscr{F}_1\times\mathscr{F}_2)$ to $(E_1\times E_2,\rho,\mathscr{E}_1\times\mathscr{E}_2)$ $[\rho((x_1,x_2),(y_1,y_2)):=\rho_1(x_1,y_1)+\rho_2(x_2,y_2)]$ such that
The coupling $\bar G$ given by Theorem 5.32 is called a $\varphi$-optimal Markovian coupling or $\varphi$-optimal measurable coupling. The next result enables us to reduce the closed function $\varphi$ to the bounded continuous one.
Lemma 5.33. Let $G_k$ $(k=1,2)$ and $\rho$ be the same as in Theorem 5.32. Given a sequence $\{\varphi_n\}_n$ of non-negative closed functions, suppose that for each $n$, there is a $\varphi_n$-optimal Markovian coupling $\bar G^{(n)}\in\mathscr{K}(G_1,G_2)$. Then a $\varphi$-optimal Markovian coupling $\bar G\in\mathscr{K}(G_1,G_2)$ exists provided one of the following conditions holds.
(1) J J (2) cpn
<
( P ~ \ (m ~
and cp,
t cp as T 0.
-+ cp uniformly
on El x
E2.
Proof of Theorem 5.32: a) Reduction. We show a reduction of the proof in terms of Lemma 5.33. Given $\varphi$, set $\varphi_n=\varphi\wedge n$. Then, $\varphi_n\uparrow\varphi$. As an application of Lemma 5.33 (2), we need only consider bounded closed function $\varphi$. Without loss of generality, assume that $\varphi\le 1$. Next, let
Then, it is easy to check that $|\varphi_m-\varphi|\le 1/m$ and $\varphi_m\to\varphi$ uniformly. On the other hand, because $A_{m,k}$ is open, by Lemma 1.36, there exists a sequence $\{\gamma_{m,k}^{(n)}\}_{n\ge 1}$ of non-negative Lipschitz continuous functions such that $\gamma_{m,k}^{(n)}\uparrow I_{A_{m,k}}$ as $n\uparrow\infty$. Set $\varphi_m^{(n)}=\frac{1}{m}\sum_{k=0}^{m-1}\gamma_{m,k}^{(n)}$. Then $\varphi_m^{(n)}\uparrow\varphi_m$ as $n\uparrow\infty$. Thus, one can apply Lemma 5.33 (2) to the sequence $\{\varphi_m^{(n)}\}_{n\ge 1}$ and then apply Lemma 5.33 (1) to the sequence $\{\varphi_m\}_{m\ge 1}$, reducing the function $\varphi$ to bounded Lipschitz continuous functions.
b) Approximation. The main difficulty is the measurability of the coupling transition measure. Otherwise, the result would be reduced to Lemma 5.2. To avoid this, it is natural to use an approximation by the marginal measures with finite supports, as we did in the proof of Theorem 5.10. When $G_k$ $(k=1,2)$ has finite support, from the stochastic linear programming, it is known that there is a transition probability $\bar G\in\mathscr{K}(G_1,G_2)$ such that for each $(\omega_1,\omega_2)$ and $\tilde G\in\mathscr{K}(G_1(\omega_1,\cdot),G_2(\omega_2,\cdot))$,
$\bar G\varphi(\omega_1,\omega_2)\le\tilde G\varphi.$
It is important to note that the inequality is pointwise since we do not assume $\tilde G$ being measurable in $(\omega_1,\omega_2)$, even though the left-hand side is. To go further, we need to construct the approximation more carefully. Since $E$ is separable, for each $n\ge 1$, we can construct a sequence of disjoint sets $\{B_j^{(n)}\}_{j\ge 1}\subset\mathscr{E}$ such that the diameter of each $B_j^{(n)}$ is less than $1/n$. Given $\mu\in\mathscr{P}(E)$, let $m_n$ be large enough so that $\mu(B_0^{(n)})<1/n$, where $B_0^{(n)}=\sum_{j>m_n}B_j^{(n)}$. For each $j$: $0\le j\le m_n$, choose a point $x_j^{(n)}\in B_j^{(n)}$. Define $\mu^{(n)}\in\mathscr{P}(E)$ by $\mu^{(n)}(\{x_j^{(n)}\})=\mu(B_j^{(n)})$. Then $\mu^{(n)}$ has finite support and moreover $\mu^{(n)}\Rightarrow\mu$ (weak convergence) as $n\to\infty$. Actually, for every bounded Lipschitz function $f$, we have
where $L(f)$ is the Lipschitz constant of $f$. Next, let $\mu_1,\mu_2\in\mathscr{P}(E)$ and define $\{\mu_k^{(n)}\}_{n\ge 1}$, $k=1,2$, as above. For $\tilde\mu\in\mathscr{K}(\mu_1,\mu_2)$, define $\tilde\mu^{(n)}\in\mathscr{P}(E^2)$ as follows.
$\tilde\mu^{(n)}(\{x_i^{(n)}\}\times\{x_j^{(n)}\})=\tilde\mu(B_i^{(n)}\times B_j^{(n)}),\quad 0\le i,j\le m_n.$
It is easy to check that for each $n$, $\tilde\mu^{(n)}\in\mathscr{K}(\mu_1^{(n)},\mu_2^{(n)})$ and moreover $\tilde\mu^{(n)}\Rightarrow\tilde\mu$ as $n\to\infty$. Applying these constructions to $\mu_k=G_k(\omega_k,\cdot)$ $(k=1,2)$ and $\tilde\mu=\tilde G(\omega_1,\omega_2;\cdot)$ with a slight change of the notations (since $E_1$ and $E_2$ may be different) for fixed $(\omega_1,\omega_2)$, from what was proved in the last paragraph, we obtain
$\bar G^{(n)}\varphi(\omega_1,\omega_2)\le\tilde G_n\varphi.$ (5.37)
Because $\varphi$ is bounded continuous, there is no trouble in passing to the limit on the right-hand side of (5.37). But there is still a problem of choosing a measurable limit on the left-hand side of (5.37) since $\bar G^{(n)}(\omega_1,\omega_2;\cdot)$ is determined by the stochastic linear programming without explicit expression. To overcome this difficulty, one needs Le Cam's Theorem (cf. Dudley (1987), Theorem 11.5.3). It says that for probability measures $\{\mu_n\}$ and $\mu$ on a measurable metric space: if $\mu_n$ $(n\ge 1)$ and $\mu$ are all tight and $\mu_n\Rightarrow\mu$ as $n\to\infty$, then $\{\mu_n\}_{n\ge 1}$ is uniformly tight. Besides, note that if $\{\mu_k^{(n)}\}_{n\ge 1}$ $(k=1,2)$ is uniformly tight, then so is $\{\tilde\mu_n\in\mathscr{K}(\mu_1^{(n)},\mu_2^{(n)})\}_{n\ge 1}$ due to the marginality. As an application of these remarks, it follows that for fixed $(\omega_1,\omega_2)$, $\{G_k^{(n)}(\omega_k,\cdot)\}_{n\ge 1}$ is uniformly tight and then so is $\{\bar G^{(n)}(\omega_1,\omega_2;\cdot)\}_{n\ge 1}$. Now a measurable selection of the limit on the left of (5.37) is guaranteed by the following result, which is a generalization of Prohorov's theorem (Theorem 4.2).
Theorem 5.34 (Selection Theorem). Let $\{G_n\}_{n\ge 1}$ be a sequence of transition measures from a measurable space $(\Omega,\mathscr{F})$ to a separable metric space $(E,\rho,\mathscr{E})$. Suppose that for each $\omega\in\Omega$,
(1) $\{G_n(\omega,\cdot)\}_{n\ge 1}$ is uniformly tight and
(2) $\sup_{n\ge 1}G_n(\omega,E)<\infty$.
Then, there exists a transition measure $G$ from $(\Omega,\mathscr{F})$ to $(E,\rho,\mathscr{E})$ so that for each $\omega$, there is a subsequence $\{n(\omega)\}\subset\{n\}$ such that $G_{n(\omega)}(\omega,\cdot)\Rightarrow G(\omega,\cdot)$ as $n(\omega)\to\infty$.
Proof: On the one hand, since the space $(U_\rho(E),\|\cdot\|_u)$ is separable, we have a dense set $\{f_k\}_{k\ge 1}\subset U_\rho(E)$. Set $f_k^{(n)}=G_nf_k$. For each $\omega$, the sequence $\{f^{(n)}\}_{n\ge 1}$ constitutes a mapping from $(\Omega,\mathscr{F})$ to a totally bounded set in the Polish space $(\mathbb{R}^\infty,\mathscr{B}(\mathbb{R}^\infty))$. By Chen (1983, Lemma 1, based on the Kuratowski-Ryll-Nardzewski Selection Theorem), there exists an $\mathscr{F}$-measurable sequence $(f_k)$ such that for each $\omega$, we can choose a subsequence $\{n(\omega)\}\subset\{n\}$ so that $f_k^{(n(\omega))}\to f_k$ as $n(\omega)\to\infty$. On the other hand, replacing $G_n$ with $G_n/G_n(\cdot,E)$ if necessary, applying Prohorov's theorem, for each $\omega$, there exists a subsequence $\{n'(\omega)\}$ such that $G_{n'(\omega)}(\omega,\cdot)\Rightarrow$ a finite measure $G(\omega,\cdot)$. Hence

This means that $Gf_k\in\mathscr{F}$ for each $k$. By the denseness of $\{f_k\}$, we have $Gf\in\mathscr{F}$ first for all $f\in U_\rho(E)$ and then for all $f\in b\mathscr{E}$ by Lemma 1.36, since $\mathrm{Lip}(E)\subset U_\rho(E)$, plus the monotone class theorem. ∎
Proof of Lemma 5.33: Since for each $\omega_k$, $G_k(\omega_k,\cdot)$ is tight, by marginality, $\{\bar G^{(n)}(\omega_1,\omega_2;\cdot)\}_{n\ge 1}$ is uniformly tight. By the Selection Theorem, there exists a transition measure $\bar G$ such that for each $(\omega_1,\omega_2)$, there is a subsequence $\{n(\omega_1,\omega_2)\}\subset\{n\}$ so that $\bar G^{(n(\omega_1,\omega_2))}(\omega_1,\omega_2;\cdot)\Rightarrow\bar G(\omega_1,\omega_2;\cdot)$ as $n(\omega_1,\omega_2)\to\infty$. Clearly, $\bar G\in\mathscr{K}(G_1,G_2)$. Next, since $\bar G^{(n)}$ is $\varphi_n$-optimal, we have
$\bar G^{(n)}\varphi_n\le\tilde G\varphi_n$ for all $\tilde G\in\mathscr{K}(G_1(\omega_1,\cdot),G_2(\omega_2,\cdot))$.
Under the assumptions, it is rather easy to remove the subscript/superscript $n$ in the above inequality and then the $\varphi$-optimality of $\bar G$ follows. ∎
For a Polish space $(E,\rho,\mathscr{E})$ endowed with a metric $\rho$, the discrete metric $d$ (i.e., $d(x,y)=1$ iff $x\ne y$, otherwise $d(x,y)=0$) is lower semi-continuous (but not necessarily continuous) with respect to $\rho$. The $d$-optimal coupling is often called the maximal coupling. The next result is an extension of Theorem 5.7; it shows the existence of a Markovian maximal coupling.
Corollary 5.35. For Polish state space, the $d$-optimal Markovian coupling $\bar P$ of transition probabilities $P_1$ and $P_2$ satisfies

for all $x_1\in E_1$ and $x_2\in E_2$.
We can now return to the time-continuous case. When dealing with operators, the phrase "Markovian" may be omitted.
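For two distributions on a finite set, a maximal coupling can be written down explicitly, which may help to fix ideas. The sketch below is an illustration only: the function name `maximal_coupling` and the sample distributions are hypothetical, and the total variation norm is assumed to be the total mass of $|p-q|$, so that the probability of disagreement equals half of it. The construction puts mass $\min(p_i,q_i)$ on each diagonal point and spreads the remaining mass as a product measure.

```python
from fractions import Fraction


def maximal_coupling(p, q):
    """Return a matrix c with row sums p, column sums q and diagonal mass sum(min(p_i, q_i))."""
    n = len(p)
    common = [min(p[i], q[i]) for i in range(n)]
    rest = 1 - sum(common)  # mass that cannot be placed on the diagonal
    coupling = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        coupling[i][i] = common[i]
    if rest > 0:
        for i in range(n):
            for j in range(n):
                # (p_i - common_i) and (q_i - common_i) are never both positive,
                # so this term adds nothing on the diagonal
                coupling[i][j] += (p[i] - common[i]) * (q[j] - common[j]) / rest
    return coupling


def off_diagonal_mass(coupling):
    """Probability that the two coordinates differ under the coupling."""
    n = len(coupling)
    return sum(coupling[i][j] for i in range(n) for j in range(n) if i != j)
```

Exact rational arithmetic is used so that the marginal identities and the identity between the off-diagonal mass and half the total variation can be checked without rounding.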
Theorem 5.36. Given regular q-pairs $(q_k(x_k),q_k(x_k,dy_k))$, $k=1,2$, on Polish space $(E,\mathscr{E})$, a $\varphi$-optimal coupling exists for every non-negative closed function $\varphi$.
Proof: Applying Theorem 5.32 to $P_k(t,x_k,dy_k)$, $k=1,2$, for each $t\ge 0$, we obtain a $\varphi$-optimal measurable coupling $\bar P^t(x_1,x_2;dy_1,dy_2)$:
$\bar P^t\varphi(x_1,x_2)\le\tilde P^t\varphi(x_1,x_2)$ (5.38)
for all measurable couplings $\tilde P^t$ of $P_1(t)$ and $P_2(t)$. Define
$\bar G^{(n)}(\tilde x,A)=\frac{1}{t_n}\bar P^{t_n}(\tilde x,A\setminus\{\tilde x\}),\quad \tilde x\in E_1\times E_2,\ A\in\mathscr{E}_1\times\mathscr{E}_2,\ n\ge 1.$
By the marginality, we have
$\bar G^{(n)}(\tilde x,E_1\times E_2)\le q_1(x_1)+q_2(x_2),\quad \tilde x:=(x_1,x_2).$
Hence $\sup_{n\ge 1}\bar G^{(n)}(\tilde x,E_1\times E_2)<\infty$ for all $\tilde x\in E_1\times E_2$. Next, since $G_k^{(n)}(x_k,A_k)\to q_k(x_k,A_k)$, we have $G_k^{(n)}(x_k,\cdot)\Rightarrow q_k(x_k,\cdot)$. Thus, for fixed $\tilde x$, by Le Cam's Theorem, $\{G_k^{(n)}(x_k,\cdot)\}_n$ is uniformly tight and so is $\{\bar G^{(n)}(\tilde x,\cdot)\}_{n\ge 1}$ for fixed $\tilde x$. Therefore, by the Selection Theorem, there exists a transition measure $\bar q(\tilde x,\cdot)$ such that for each $\tilde x$, there is $\{n(\tilde x)\}\subset\{n\}$ so that $\bar G^{(n(\tilde x))}(\tilde x,\cdot)\Rightarrow\bar q(\tilde x,\cdot)$ as $n(\tilde x)\to\infty$. Hence, we obtain a conservative q-pair $(\bar q(\tilde x),\bar q(\tilde x,d\tilde y))$ which corresponds to an operator $\bar\Omega$ as follows.
$\bar\Omega f(\tilde x)=\int\bar q(\tilde x,d\tilde y)f(\tilde y)-\bar q(\tilde x)f(\tilde x),\quad \tilde x\in E_1\times E_2,\ f\in b(\mathscr{E}_1\times\mathscr{E}_2).$
From the construction, it is readily seen that $\bar\Omega$ is a coupling of $\Omega_1$ and $\Omega_2$. Thus, it remains to check the $\varphi$-optimality of $\bar\Omega$. To do so, we consider first bounded $\varphi$. In this case, the proof is easier than that of Lemma 5.30. Because $\varphi$ is bounded lower semi-continuous, by Theorem 4.5 and (5.38), replacing $\tilde P^t$ by a Markovian coupling $\tilde P(t)$ with operator $\tilde\Omega$, we obtain
$\bar\Omega\varphi(x_1,x_2)\le\cdots=\tilde\Omega\varphi(x_1,x_2),\quad (x_1,x_2)\in E_1\times E_2.$
Finally, we consider general $\varphi$. Let $\varphi_n=\varphi\wedge n$. Then we have proved that there exists a $\varphi_n$-optimal coupling operator $\bar\Omega^{(n)}$ of $\Omega_1$ and $\Omega_2$ such that for all coupling operators $\tilde\Omega$ of $\Omega_1$ and $\Omega_2$:
$\bar\Omega^{(n)}\varphi_n\le\tilde\Omega\varphi_n.$
Following the argument above, we know that $\{\bar q^{(n)}(\tilde x,d\tilde y)\}_{n\ge 1}$ is uniformly tight, and so we can apply the Selection Theorem to find a transition measure $\bar q(\tilde x,d\tilde y)$, as a weak limit of $\bar q^{(n(\tilde x))}(\tilde x,d\tilde y)$ as $n(\tilde x)\to\infty$. Then define an operator $\bar\Omega$. The $\varphi$-optimality now follows from Theorem 4.5 and the monotone convergence theorem. ∎
The construction of an explicit optimal coupling is usually not easy. Here we mention two results, taken from Chen (1994a); their proofs are omitted here.
Theorem 5.37. Let $\rho$ be a translation-invariant metric on $\mathbb{Z}_+$ and set $u_k:=\rho(k+1)-\rho(k)$, $k\ge 0$, where $\rho(k)=\rho(0,k)$. Then, for birth-death process,
(1) $\tilde\Omega_r$ is $\rho$-optimal whenever $u_k$ is decreasing in $k$. Moreover, we have for $i_2-i_1=:k\ge 1$,
$\tilde\Omega_r\rho(i_1,i_2)=(a_{i_1}\wedge b_{i_2})u_{k+1}+(a_{i_1}\vee b_{i_2})u_k-(b_{i_1}\vee a_{i_2})u_{k-1}-(b_{i_1}\wedge a_{i_2})u_{k-2},$
where $u_{-1}=u_0$.
(2) If $u_k$ is increasing in $k$, then $\tilde\Omega_m$ is $\rho$-optimal. Moreover, for $i_2-i_1=:k\ge 1$,
$\tilde\Omega_m\rho(i_1,i_2)=[(a_{i_1}-a_{i_2})^++(b_{i_2}-b_{i_1})^+]u_k-[(a_{i_2}-a_{i_1})^++(b_{i_1}-b_{i_2})^+]u_{k-1}.$
Theorem 5.38. Let $(u_k)$ be a positive sequence on $\mathbb{Z}_+$. Define a distance $\rho(m,n)=\big|\sum_{j<m}u_j-\sum_{j<n}u_j\big|$. Then, every coupling mentioned in §0.2, except the independent one, is $\rho$-optimal. Moreover,
$\tilde\Omega\rho(i,j)=b_ju_j-a_ju_{j-1}-b_iu_i+a_iu_{i-1},\quad u_{-1}:=1.$
5.5 Monotonicity
Suppose that our state space $E$ is endowed with a measurable partial ordering "$\preceq$" and that $F:=\{(x_1,x_2):x_1\preceq x_2\}\in\mathscr{E}^2$.
Definition 5.39. A function $f\in r\mathscr{E}$ is called monotone if
$x_1\preceq x_2\Longrightarrow f(x_1)\le f(x_2).$
A set $A\in\mathscr{E}$ is called monotone if so is the indicator $I_A$. For two probability measures $\mu_1$ and $\mu_2$, we define $\mu_1\preceq\mu_2$ provided for every non-negative monotone function $f$, $\mu_1(f)\le\mu_2(f)$. Similarly, for two semigroups $P_k(t)$ on $b\mathscr{E}$, $k=1,2$, we say that $P_1(t)\preceq P_2(t)$ if for every monotone non-negative function $f$,
$x_1\preceq x_2\Longrightarrow P_1(t)f(x_1)\le P_2(t)f(x_2),\quad t\ge 0.$ (5.39)
If $P_1(t)=P_2(t)$, we call $P_1(t)$ itself monotone.
Clearly, the relation "$\preceq$" is a partial ordering for probability measures in the above sense. However, "$\preceq$" defined above for semigroups is only a formal notation; it is not a partial ordering. Actually, we have neither the reflexivity nor the transitivity in general.
If $\mu_1\preceq\mu_2$, the well-known Strassen's Theorem says that, for Polish state space $E$, there exists a coupling measure $\tilde\mu$ on $(E^2,\mathscr{E}^2)$ such that $\tilde\mu(F)=1$ (cf. Stoyan and Daley (1983), for example). Due to this property, the monotonicity is also called stochastic monotonicity. This remark leads us to study order-preserving couplings. In this section, we assume that $(E_1,\mathscr{E}_1)=(E_2,\mathscr{E}_2)=(E,\mathscr{E})$ but we allow the two marginals to be different processes.
Definition 5.40. A coupling q-pair is called order-preserving if
$x_1\preceq x_2\Longrightarrow\tilde P(t;x_1,x_2;F)=1,\quad t\ge 0,$ (5.40)
where $\tilde P(t;x_1,x_2;dy_1,dy_2)$ is the q-process determined by a coupling q-pair.
We have
Theorem 5.41. A coupling q-pair is order-preserving iff
$\tilde q(x_1,x_2;F^c)=0,\quad (x_1,x_2)\in F.$ (5.41)
Proof: Because the q-pair is conservative, (5.41) follows from (5.40). Conversely, note that

Suppose that $\tilde P^{(n)}(\lambda;x_1,x_2;F^c)=0$ for all $(x_1,x_2)\in F$. Then
$\tilde P^{(n+1)}(\lambda;x_1,x_2;F^c)=\tilde\Pi(\lambda)\big(I_F\,\tilde P^{(n)}(\lambda)I_{F^c}\big)(x_1,x_2)$ (by (5.41))
$=0,\quad (x_1,x_2)\in F$ (by inductive assumption).
Thus, by induction, we have $\tilde P^{(n)}(\lambda;x_1,x_2;F^c)=0$ for all $(x_1,x_2)\in F$ and $n\ge 0$. So $\tilde P(\lambda;x_1,x_2;F^c)=0$ for all $(x_1,x_2)\in F$ and $\lambda>0$, and hence the assertion follows from the uniqueness theorem of Laplace transform. ∎
Corollary 5.42. The basic coupling is order-preserving iff
$(q_1(x_1,\cdot)-q_2(x_2,\cdot))^+(F_1(x_2)^c)=0,$
$(q_2(x_2,\cdot)-q_1(x_1,\cdot))^+(F_2(x_1)^c)=0$
for all $x_1\preceq x_2$, where $F_1(x)=\{y:y\preceq x\}$ and $F_2(x)=\{y:x\preceq y\}$.
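On a finite totally ordered state space the criterion of Theorem 5.41 can be checked by direct enumeration. The sketch below is a hedged illustration: it assumes the state space $\{0,\ldots,n-1\}$ with its usual order, takes the basic coupling in the form "common part of the two jump rates moves both coordinates to the same target, the two positive parts move one coordinate each", and uses hypothetical helper names (`basic_coupling_rates`, `marginals_ok`, `order_preserving`).

```python
def basic_coupling_rates(Q1, Q2, i1, i2):
    """Jump rates {(j1, j2): rate} of the basic coupling out of the state (i1, i2)."""
    n = len(Q1)
    rates = {}
    for k in range(n):
        r1 = Q1[i1][k] if k != i1 else 0.0  # diagonal entries are ignored
        r2 = Q2[i2][k] if k != i2 else 0.0
        common = min(r1, r2)
        for target, rate in (((k, k), common), ((k, i2), r1 - common), ((i1, k), r2 - common)):
            if rate > 0:
                rates[target] = rates.get(target, 0.0) + rate
    return rates


def marginals_ok(Q1, Q2, i1, i2, tol=1e-12):
    """Check that each coordinate of the coupled jumps has the prescribed marginal rates."""
    n = len(Q1)
    rates = basic_coupling_rates(Q1, Q2, i1, i2)
    for k in range(n):
        first = sum(r for (j1, _), r in rates.items() if j1 == k)
        second = sum(r for (_, j2), r in rates.items() if j2 == k)
        if k != i1 and abs(first - Q1[i1][k]) > tol:
            return False
        if k != i2 and abs(second - Q2[i2][k]) > tol:
            return False
    return True


def order_preserving(Q1, Q2):
    """Criterion of Theorem 5.41: from an ordered pair, no jump leaves the set {j1 <= j2}."""
    n = len(Q1)
    for i1 in range(n):
        for i2 in range(i1, n):
            for (j1, j2) in basic_coupling_rates(Q1, Q2, i1, i2):
                if j1 > j2:
                    return False
    return True
```

A nearest-neighbour (birth-death) rate matrix passes the test, while a matrix in which a smaller state can jump over a larger one fails it.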
An important application of the $\varphi$-optimal coupling is the stochastic comparability. Since $F^c$ is an open set, $\varphi=I_{F^c}$ is lower semi-continuous. Combining Theorems 5.32, 5.36 with 5.41, Strassen's Theorem can be strengthened as follows.
Theorem 5.43. Again, let $\Delta$ denote the diagonal.
(1) Given probabilities $\mu_1$ and $\mu_2$, $\mu_1\preceq\mu_2$ iff the $I_{F^c}$-optimal coupling $\bar\mu$ satisfies $\bar\mu(F)=1$. If so, then $\bar\mu(\Delta^c)=\|\mu_1-\mu_2\|_{\mathrm{Var}}/2$.
(2) Given transition probabilities $P_1$ and $P_2$, $P_1\preceq P_2$ iff the $I_{F^c}$-optimal Markovian coupling $\bar P$ satisfies $\bar P(x_1,x_2;F)=1$ for all $x_1\preceq x_2$. If so, then $\bar P(x_1,x_2;\Delta^c)=\|P_1(x_1,\cdot)-P_2(x_2,\cdot)\|_{\mathrm{Var}}/2$, $x_1\preceq x_2$.
(3) Given two regular q-pairs, the corresponding processes $P_1(t)$ and $P_2(t)$ obey the property $P_1(t)\preceq P_2(t)$ iff the $I_{F^c}$-optimal coupling q-pair $(\bar q(\tilde x),\bar q(\tilde x,d\tilde y))$ satisfies $\bar q(x_1,x_2;F^c)=0$ for all $x_1\preceq x_2$.
Proof: Since $\bar\mu\in\mathscr{K}(\mu_1,\mu_2)$ attains the infimum, the first assertion in part (1) is a restatement of Strassen's Theorem. Similarly, we get the first assertion in part (2). On the other hand, by Lindvall (1999), the infimum of $\tilde P(x_1,x_2;\Delta^c)$ among all order-preserving Markovian couplings is given by $\|P_1(x_1,\cdot)-P_2(x_2,\cdot)\|_{\mathrm{Var}}/2$ whenever $x_1\preceq x_2$. Since $\bar P$ is $I_{F^c}$-optimal, order-preserving and Markovian, we obtain the formula for $\bar P(x_1,x_2;\Delta^c)$. Similarly, we have the formula for $\bar\mu(\Delta^c)$.
To prove part (3), it suffices to show that $\bar q(x_1,x_2;F^c)=0$ for all $x_1\preceq x_2$. By part (2), for each $t\ge 0$, we have an order-preserving measurable coupling $\tilde P^t(x_1,x_2;\cdot)$ of $P_1(t,x_1,\cdot)$ and $P_2(t,x_2,\cdot)$. Following the proof of Theorem 5.36, we can construct a coupling q-pair $(\tilde q(\tilde x),\tilde q(\tilde x,d\tilde y))$ $(\tilde x:=(x_1,x_2))$ with operator $\tilde\Omega$, produced by a weak limit of $t_{n(\tilde x)}^{-1}\tilde P^{t_{n(\tilde x)}}(\tilde x,d\tilde y)$ $(n(\tilde x)\to\infty)$. Because $\tilde P^t(x_1,x_2;F)=1$ for all $t\ge 0$, we have $\tilde\Omega I_{F^c}(x_1,x_2)=0$ for all $x_1\preceq x_2$. Then so is $\bar\Omega I_{F^c}(x_1,x_2)=0$ since $\bar\Omega$ is $I_{F^c}$-optimal. Equivalently, by Theorem 5.41, for the process $\bar P(t)$ generated by $\bar\Omega$, we have $\bar P(t;x_1,x_2;F)=1$ for all $x_1\preceq x_2$. ∎
We have seen the power of the coupling technique in the study of the monotonicity (or the order-preserving property). However, in some cases, the conditions obtained in this way may not be necessary, since they clearly depend on the choice of a specific coupling. For the remainder of this section, we introduce a criterion for the order-preserving property. To do so, we need some preparation. We use the same notation $\mathscr{M}$ to denote either the set of (real) monotone functions or the collection of all monotone sets. However, $\mathscr{M}_+$ and $b\mathscr{M}$ denote, respectively, the set of non-negative and bounded monotone functions.
Lemma 5.44. Let $\mu_1$ and $\mu_2$ be two finite measures. Then the following statements are equivalent.
(1) $\mu_1(A)\le\mu_2(A)$ for all $A\in\mathscr{M}$.
(2) $\mu_1(f)\le\mu_2(f)$ for all $f\in b\mathscr{M}_+$.
(3) $\mu_1(f)\le\mu_2(f)$ for all $f\in\mathscr{M}_+$.
Furthermore, if $\mu_1(E)=\mu_2(E)$, then each of the above statements is also equivalent to
(4) $\mu_1(f)\le\mu_2(f)$ for all $f\in\mathscr{M}$ for which the integrals exist.
Proof: Clearly, (4) $\Longrightarrow$ (3) $\Longrightarrow$ (2) $\Longrightarrow$ (1). To prove (1) $\Longrightarrow$ (3), simply use

(3) $\Longrightarrow$ (4). Let $f\in\mathscr{M}$ so that $\mu_k(f)$ exists, $k=1,2$. Without loss of generality, assume that $\mu_1(f)>-\infty$ and $\mu_2(f)<\infty$. Since $f^+\in\mathscr{M}_+$ and $\mu_1(f^+)\le\mu_2(f^+)$ (by (3)), it suffices to consider $f\in\mathscr{M}$ with $f\le 0$ and $\mu_1(f)>-\infty$. Take
$f_n=-nI_{[f<-n]}+fI_{[f\ge -n]}.$
Then $f_n+n\in\mathscr{M}_+$. Applying (3) to $f_n+n$, we get
$\mu_1(f_n)\le\mu_2(f_n)\le\mu_2(fI_{[f\ge -n]}).$
Thus

This completes our proof. ∎
We now study an easier case.
Lemma 5.45. Let $(q_k(x),q_k(x,dy))$ $(k=1,2)$ be two bounded q-pairs, possibly non-conservative. Then $P_1(t)\preceq P_2(t)$ iff for every monotone set $A$,
$x_1\preceq x_2,\ x_2\notin A\Longrightarrow q_1(x_1,A)\le q_2(x_2,A)$ and
$x_1\preceq x_2,\ x_1\in A\Longrightarrow q_1(x_1,A)-q_1(x_1)\le q_2(x_2,A)-q_2(x_2).$ (5.42)
Equivalently,
$x_1\preceq x_2$, either $x_1\in A$ or $x_2\notin A\Longrightarrow\Omega_1I_A(x_1)\le\Omega_2I_A(x_2).$
Proof: For every bounded q-pair $(q(x),q(x,dy))$, we have uniquely a process $P(t,x,A)$ and
$\frac{d}{dt}P(t,x,A)\Big|_{t=0}=q(x,A)-q(x)I_A(x)$
for all $x\in E$ and all $A\in\mathscr{E}$. So, as an application of Lemma 5.44, it follows that (5.42) is necessary.
We now prove the sufficiency. Note that even though (5.42) does not imply that $\Omega_1\preceq\Omega_2$, for $\lambda\ge\sup_x(q_1(x)+q_2(x))$, we still have
$P_1^{(\lambda)}\preceq P_2^{(\lambda)},$ (5.43)
where $P_k^{(\lambda)}=I+\Omega_k/\lambda$, $k=1,2$. Since $\Omega_k$ may be non-conservative, we have $P_k^{(\lambda)}1\le 1$ only. Assume for a moment that $P_k^{(\lambda)}1=1$, $k=1,2$. Then by Theorem 5.32, there is a Markovian coupling $\tilde P$ such that $\tilde P(x_1,x_2;F)=1$ for all $x_1\preceq x_2$. Furthermore, by (5.43) and induction,
$(P_1^{(\lambda)})^nf(x_1)\le(P_2^{(\lambda)})^nf(x_2),\quad x_1\preceq x_2,\ f\in\mathscr{M}_+,\ n\ge 1.$
Hence the assertion follows from the expansion:
$P_k(t)=e^{-\lambda t}\sum_{n=0}^\infty\frac{(\lambda t)^n}{n!}(P_k^{(\lambda)})^n.$
In the non-conservative case, we can introduce a fictitious state $\Delta\notin E$ with extended partial ordering: $\Delta\preceq x$ for all $x\in E$, so that the extended q-pairs become conservative. Then, based on the modified (5.43), the above proof works. In view of the proof of Theorem 3.2, the restriction of the process to the original state space coincides with the original process and the proof is done. ∎
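The sufficiency argument rests on writing the semigroup of a bounded q-pair as a Poisson mixture of powers of the discrete kernel $I+\Omega/\lambda$. The sketch below illustrates this on a finite birth-death chain; everything in it (the sample rates, the truncation at 200 terms, the names `birth_death_q_matrix`, `uniformized_kernel`, `transition_matrix`, `is_monotone`) is an illustrative assumption rather than part of the text. The monotonicity test compares upper-tail probabilities of consecutive rows, which on a totally ordered finite set is equivalent to testing all monotone sets.

```python
import math


def birth_death_q_matrix(a, b):
    """Conservative Q-matrix on {0, ..., n-1} with death rates a[i] and birth rates b[i]."""
    n = len(a)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            Q[i][i - 1] = a[i]
        if i + 1 < n:
            Q[i][i + 1] = b[i]
        Q[i][i] = -sum(Q[i][j] for j in range(n) if j != i)
    return Q


def uniformized_kernel(Q, lam):
    """The stochastic matrix I + Q / lam, valid when lam is at least the largest total rate."""
    n = len(Q)
    return [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)] for i in range(n)]


def transition_matrix(Q, t, lam, terms=200):
    """Truncated series exp(-lam t) * sum_m (lam t)^m / m! * (I + Q / lam)^m."""
    n = len(Q)
    kernel = uniformized_kernel(Q, lam)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    weight = math.exp(-lam * t)
    result = [[0.0] * n for _ in range(n)]
    for m in range(terms):
        for i in range(n):
            for j in range(n):
                result[i][j] += weight * power[i][j]
        power = [[sum(power[i][k] * kernel[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
        weight *= lam * t / (m + 1)
    return result


def is_monotone(P, tol=1e-12):
    """Upper-tail probabilities must be non-decreasing in the starting state."""
    n = len(P)
    for i in range(n - 1):
        for k in range(n):
            if sum(P[i][k:]) > sum(P[i + 1][k:]) + tol:
                return False
    return True
```

Taking `lam` equal to twice the largest total rate mirrors the choice $\lambda\ge\sup_x(q_1(x)+q_2(x))$ with $q_1=q_2$; with that choice both the one-step kernel and the resulting transition matrix pass the monotonicity test for the sample birth-death rates.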
To study the comparability for q-processes with unbounded q-pairs, we need
Hypothesis 5.46. There exist $G_\delta$ sets $\{E_n\}$ such that $E_n\uparrow E$ $(n\ge 1)$ and
(1) $\sup_{x\in E_n}q_k(x)<\infty$, $n\ge 1$, $k=1,2$.
(2) The set $H_n:=\{y\in E\setminus E_n:$ there exists $x\in E_n$ such that $x\preceq y\}$ is monotone. Moreover, if $H_n\ne\emptyset$, then there is a point $b_n\in H_n$ so that $x\preceq b_n$ for all $x\in E_n$.
In the cases that $E=\mathbb{R}^d$, $\mathbb{Z}^d$, $\mathbb{R}^d_+$ or $\mathbb{Z}^d_+$ with the ordinary partial ordering and that $(q_k(x),q_k(x,A))$ $(k=1,2)$ are locally bounded, the hypotheses are trivial. We simply take
$E_n=\{x\in E:-n\le x\le n\}$ if $E=\mathbb{R}^d$ or $\mathbb{Z}^d$,
$E_n=\{x\in E:0\le x\le n\}$ if $E=\mathbb{R}^d_+$ or $\mathbb{Z}^d_+$,
and $b_n=(n+1,\ldots,n+1)\in E$, $n\ge 1$.
Theorem 5.47. Let $(q_k(x),q_k(x,dy))$ $(k=1,2)$ be two regular q-pairs. Under Hypothesis 5.46, $P_1(t)\preceq P_2(t)$ iff (5.42) holds. Equivalently, for every monotone set $A$,

and

Proof: Because of the regularity of the q-pair $(q(x),q(x,dy))$, the necessity can be proved in the same way as in the proof of Lemma 5.45. To prove the sufficiency, define

Then we have
$\sup_{x\in E_n+\{b_n\}}q_k^{(n)}(x)<\infty,\quad k=1,2,\ n\ge 1,$
$q_k^{(n)}(x)\to q_k(x)$ as $n\to\infty$, $x\in E$, $k=1,2$,
$q_k^{(n)}(x,A\cap(E_n+\{b_n\}))\to q_k(x,A)$ as $n\to\infty$, $x\in E$, $A\in\mathscr{E}$, $k=1,2$.
Thus, by Lemma 5.17, we have
$\lim_{n\to\infty}P_k^{(n)}(t,x,A\cap(E_n+\{b_n\}))=P_k(t,x,A),\quad t\ge 0,\ x\in E,\ A\in\mathscr{E},\ k=1,2.$ (5.45)
Clearly, the partial ordering on $E$ induces a partial ordering on $E_n+\{b_n\}$. Let $\mathscr{M}_n$ denote the set of all monotone sets (functions) on $E_n+\{b_n\}$. Of course, $A\in\mathscr{M}\Longrightarrow A\cap(E_n+\{b_n\})\in\mathscr{M}_n$. Now, since $E_n$ is Polish, by (5.45), for $P_1(t)\preceq P_2(t)$ it suffices to show that
$P_1^{(n)}(t)f\preceq P_2^{(n)}(t)f,\quad f\in\mathscr{M}_n,\ n\ge 1.$ (5.46)
To prove (5.46), we should check that $(q_k^{(n)}(x),q_k^{(n)}(x,dy))$ $(k=1,2)$ satisfy condition (5.42). Let $x_1,x_2\in E_n+\{b_n\}$, $x_1\preceq x_2$ and $B\in\mathscr{M}_n$; then $b_n\in B$. We need to show that
either $x_1,x_2\in B$ or $x_1,x_2\in E_n\setminus B\Longrightarrow\Omega_1^{(n)}I_B(x_1)\le\Omega_2^{(n)}I_B(x_2).$ (5.47)
Since $B\cup H_n\in\mathscr{M}$, by assumption, we have
either $x_1,x_2\in B\cup H_n$ or $x_1,x_2\notin B\cup H_n\Longrightarrow\Omega_1I_{B\cup H_n}(x_1)\le\Omega_2I_{B\cup H_n}(x_2).$ (5.48)
Because

and

it follows that

Thus, if $x_2\ne b_n$, then $x_1,x_2\in E_n$ and so (5.47) follows from (5.48) and (5.49). But if $x_2=b_n$, then

Therefore, (5.42) is satisfied for the sequences of q-pairs, $k=1,2$. ∎
5.6 Examples
In this section, we introduce some examples to illustrate the applications of the results obtained in the last two sections.
Example 5.48. Take $E=\{0,1,2,\ldots\}$ and let $Q=(q_{ij})$ be a regular Q-matrix on $E$. The basic coupling is
$\tilde\Omega_bf(i_1,i_2)=\sum_{k=0}^\infty\big\{(q_{i_1k}-q_{i_2k})^+[f(k,i_2)-f(i_1,i_2)]+(q_{i_2k}-q_{i_1k})^+[f(i_1,k)-f(i_1,i_2)]+(q_{i_1k}\wedge q_{i_2k})[f(k,k)-f(i_1,i_2)]\big\}.$ (5.50)
Here we have used the convention $q_{kk}=0$. If $q_i=0$ for some $i\in E$, then we must assume that for every $j\in E\setminus\{i\}$, there exist $j_1,\ldots,j_n\in E$ such that
$q_{jj_1}q_{j_1j_2}\cdots q_{j_ni}>0$ (5.51)
which implies condition (2)' of Corollary 5.28. Now, we choose
$A_n=\{(i_1,i_2):i_1\ne i_2,\ i_1+i_2\le n\},\quad \varphi(i_1,i_2)=i_1+i_2+C,\quad i_1\ne i_2,$
where $C$ is a constant, and set $\varphi(i,i)=0$ for all $i\ge 0$. Then condition (5.29) is automatic and condition (5.30) becomes
00
00
CI
qilk - qizk
I k G qi,il+
qiZi2
for all
il
+ CXq i l k A q i z k IC=O
k=l
for all $i_1\ne i_2$. For birth-death process, if
$a_j-b_j\ge b_i-a_i,\quad j>i\ge 0\quad (a_0:=0),$
then the above conditions are all satisfied and hence the coupling is successful. Now, we consider again the basic coupling but use Corollary 5.27. Let
il
Then, we have for (jih
c
)#(il,iz)
=i
p(i1,iz)
w,
> 0,
iz =i
+k,
= lil - 221 that
22;j l , j 2 ) V w l , j 2 ) )
k 3 1.
where $q_{ij}=0$ if $i<0$ or $j<0$. In particular, for birth-death process, we have
$\sum_{(j_1,j_2)\ne(i_1,i_2)}\tilde q(i_1,i_2;j_1,j_2)\psi(\rho(j_1,j_2))$
$=(b_i+a_{i+k})\psi(k-1)+(a_i+b_{i+k})\psi(k+1)$ if $k=1$ or $k\ge 3$,
$=|b_i-a_{i+2}|\psi(1)+(b_i\wedge a_{i+2})\psi(0)+(a_i+b_{i+2})\psi(3)$ if $k=2$.
From this, it is easy to check (similar computation will be given in the next example) that conditions (5.24) and (5.26) hold with $\varphi(r)=\alpha\log(1+r)$ or $\alpha r/(1+r)$ $(\alpha>0)$ and $f(r)=r$ provided
$a_{i+k}-b_{i+k}\ge a_i-b_i$ for all $i\ge 0$ and $k\ge 1$.
Equivalently,
$a_{i+1}-b_{i+1}\ge a_i-b_i$ $(a_0:=0)$ for all $i\ge 0$.
If so, then the basic coupling is successful. Note that the condition holds in the simplest case of $b_i=b>0$ and $a_i=a>0$, but the process is transient if $a<b$. In the recurrent case, the classical coupling is successful: since a birth-death process is monotone, the "bigger" component cannot reach the origin before crossing the "smaller" one. Now, we study the birth-death processes more carefully.
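The birth-death computation in Example 5.48 can be checked by enumerating the jumps of the basic coupling directly. The sketch below does so for a hypothetical set of rates satisfying $a_{i+1}-b_{i+1}\ge a_i-b_i$; the function names and the rates are assumptions made for the illustration. The enumerated sum is compared with the two-case closed form (the case $k=2$ being the only one where the two coordinates can jump to a common state), and the drift of the distance $|i_1-i_2|$ is compared with $(a_i+b_{i+k})-(b_i+a_{i+k})$.

```python
def coupled_sum(a, b, i, k, psi):
    """Sum of (coupled jump rate) * psi(new distance) from (i, i + k) under the basic coupling."""
    jumps1 = {i + 1: b[i], i - 1: a[i]}                  # first coordinate, a[0] = 0
    jumps2 = {i + k + 1: b[i + k], i + k - 1: a[i + k]}  # second coordinate
    total = 0.0
    for target in set(jumps1) | set(jumps2):
        r1 = jumps1.get(target, 0.0)
        r2 = jumps2.get(target, 0.0)
        common = min(r1, r2)  # both coordinates jump to the same state
        total += common * psi(0)
        total += (r1 - common) * psi(abs(target - (i + k)))
        total += (r2 - common) * psi(abs(i - target))
    return total


def closed_form(a, b, i, k, psi):
    """The two-case formula for birth-death processes stated in Example 5.48."""
    if k == 2:
        return (abs(b[i] - a[i + 2]) * psi(1) + min(b[i], a[i + 2]) * psi(0)
                + (a[i] + b[i + 2]) * psi(3))
    return (b[i] + a[i + k]) * psi(k - 1) + (a[i] + b[i + k]) * psi(k + 1)


def distance_drift(a, b, i, k):
    """Generator of the basic coupling applied to the distance |i1 - i2| at (i, i + k)."""
    total_rate = coupled_sum(a, b, i, k, lambda r: 1.0)
    return coupled_sum(a, b, i, k, lambda r: float(r)) - k * total_rate
```

For the sample rates `a = [0, 1, 1.5, 2, ...]` and `b = [1, 1, 1, ...]` the drift is non-positive at every pair, in line with the choice $f(r)=r$ in the example.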
Example 5.49. Consider a birth-death process with rates
$a_0=0,\quad a_i>0,\ i\ge 1,\quad b_i>0,\ i\ge 0.$
Suppose that for each $k\ge 1$,
$\gamma_k:=\inf_{i\ge 0}\frac{b_i+a_{i+k}}{a_i+b_{i+k}}>0.$ (5.52)
Define
(5.53)
If $u_j>0$ for all $j$ and
$\sum_ku_k=\infty,$ (5.54)
then the coupling by reflection is successful. In particular, the result holds with $u_k\equiv 1$ if $a_{i+1}-b_{i+1}\ge a_i-b_i$ for all $i\ge 0$.
Proof: Take $\rho(0,k)=\sum_{0\le j\le k-1}u_j$. By assumptions, $f(r)=r$ satisfies (5.25). Let $i_1=i$, $i_2=i+k$, $k\ge 1$. Then for the coupling by reflection, we have
$\tilde\Omega_r\rho(i_1,i_2)=(a_{i_1}\wedge b_{i_2})u_{k+1}+(a_{i_1}\vee b_{i_2})u_k-(b_{i_1}\vee a_{i_2})u_{k-1}-(b_{i_1}\wedge a_{i_2})u_{k-2}\le 0.$ (5.55)
Hence condition (5.26) is satisfied. Next, for fixed $r$, take
$\varphi_k=\alpha\sum_{j=0}^{k-1}u_j/m^j,\quad k=0,1,\ldots,r;\ \alpha>0,$
where $m=m(r)\ge 1$ is a constant which will be determined later. Then, for $\tilde\Omega_r(\varphi\circ\rho)(i_1,i_2)+\tilde q(i_1,i_2)\le 0$, by (5.55) replacing $u_k$ by $\alpha u_k/m^k$, it suffices that

That is
Since u k
1 and m 2
1, it suffices that
Equivalently, ai
+
bi+k
Since the last term is negative for large enough a , by (5.52), this is equivalent
to or
Thus, one may first choose $m=m(r)$ large enough so that the right-hand side becomes positive for all $k\le r$, and then the inequality holds for all $k\le r$ whenever $\alpha$ is large enough. ∎
Example 5.50. Let $P$ be the simple random walk on $E=\mathbb{Z}^d$. Consider the $Q$-process with $Q$-matrix $Q=P-I$. When $d=1$, since the process is recurrent, it is not surprising that there exists a successful coupling; even the classical one is already good enough. However, it can be proved that the classical coupling is successful for this model iff $d=1$. Now, we want to show that the existence of a successful coupling is usually weaker than recurrence and is indeed independent of the (finite!) dimension whenever the components are independent. To see this, we couple each component of the process independently. Denote by $T_k$ and $T$ the coupling times of the $k$-th component and the whole process respectively. Then the conclusion follows from the observation $T=\max_k T_k$.

Example 5.51. Take $E$ and $Q=(q_{ij})$ as in Example 5.48. By Corollary 5.42, a sufficient condition for the monotonicity of $P(t)$ is the following:
$$q_{i_1k}\le q_{i_2k}\ \text{ for } i_1<i_2<k\quad\text{and}\quad q_{i_1k}\ge q_{i_2k}\ \text{ for } k<i_1<i_2. \tag{5.56}$$
In other words, for each fixed $k$,
$$q_{ik}\downarrow\ \text{as } i\downarrow\ \text{for } i<k\quad\text{and}\quad q_{ik}\downarrow\ \text{as } i\uparrow\ \text{for } i>k.$$
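For a finite $Q$-matrix, condition (5.56) can be checked column by column. The sketch below is an illustration only: the names `satisfies_5_56` and `birth_death_q` are hypothetical, the state space is truncated to $\{0,\ldots,n-1\}$, and comparing neighbouring rows suffices because the inequalities are transitive.

```python
def satisfies_5_56(q):
    """Return True if the square matrix q satisfies condition (5.56).

    For every column k: q[i][k] is nondecreasing in i for i < k and
    nonincreasing in i for i > k. Diagonal entries are never compared.
    """
    n = len(q)
    for k in range(n):
        # Rows i < i + 1 < k: need q[i][k] <= q[i + 1][k].
        for i in range(k - 1):
            if q[i][k] > q[i + 1][k]:
                return False
        # Rows k < i < i + 1: need q[i][k] >= q[i + 1][k].
        for i in range(k + 1, n - 1):
            if q[i][k] < q[i + 1][k]:
                return False
    return True


def birth_death_q(birth, death):
    """Build a truncated birth-death Q-matrix (illustrative helper)."""
    n = len(birth)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            q[i][i + 1] = birth[i]
        if i >= 1:
            q[i][i - 1] = death[i]
        q[i][i] = -sum(q[i])  # conservative row: diagonal balances the row
    return q


bd = birth_death_q([1.0, 2.0, 3.0, 0.0], [0.0, 4.0, 5.0, 6.0])
bd_ok = satisfies_5_56(bd)

# A chain in which state 0 jumps over state 1 to state 2 breaks the condition.
skip = [[-1.0, 0.0, 1.0],
        [1.0, -1.0, 0.0],
        [0.0, 1.0, -1.0]]
skip_ok = satisfies_5_56(skip)
```

In a birth-death matrix the only nonzero off-diagonal entries of column $k$ sit in rows $k-1$ and $k+1$, which is why the check always succeeds there.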
From this, we see immediately that the birth-death processes are monotone. However, the above conditions are not necessary. The complete answer for Pl(t) 4 PZ(t) is as follows: q?! z1.l
<
for all i l
< i~ < k
and (5.57)
In this general case, an order-preserving coupling was constructed by Zhang (1996, 1998) as follows. Given two regular $Q$-matrices $Q_k=(q^{(k)}_{ij})$, define
Next, define successively the sequences $\{a_{k\ell}\}$, $\{b_{k\ell}\}$ ($k<\ell$, $\ell\ge 0$):
Finally, the required conservative coupling Q-matrix is given as follows:
other (k,!)
#
(i,j).
For the order-preserving couplings on product spaces, refer to Forbes and Francois (1997) and López and Sanz (1998). It is easy to check that Schlögl's model is monotone (cf. Section 14.2).

Example 5.52 (Lotka-Volterra model). Take $E=\mathbb{Z}_+^2$ and
$$q(i_1,i_2;j_1,j_2)=\begin{cases}
\lambda_1 i_1 & \text{if } j_1=i_1+1,\ j_2=i_2\\
\lambda_2 i_2 & \text{if } j_1=i_1,\ j_2=i_2-1\\
\lambda_3 i_1 i_2 & \text{if } j_1=i_1-1,\ j_2=i_2+1.
\end{cases}$$
Consider the monotone set $A=\{(i_1,i_2): i_1\ge 1\}$. Then $(1,1),(1,2)\in A$ and
$$q(1,1;A^c)=\lambda_3,\qquad q(1,2;A^c)=2\lambda_3,$$
and so $q(1,1;A^c)<q(1,2;A^c)$. By Theorem 5.47, this model is not monotone.
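The two exit rates used in this example can be recomputed directly from the three transitions of the model. The following sketch assumes the rates as reconstructed above; the helper names `lv_rates` and `rate_out_of_A` and the numerical values of $\lambda_1,\lambda_2,\lambda_3$ are illustrative choices, not taken from the text.

```python
def lv_rates(i1, i2, lam1, lam2, lam3):
    """Jump rates of the Lotka-Volterra model out of the state (i1, i2)."""
    return {
        (i1 + 1, i2): lam1 * i1,           # first species: birth
        (i1, i2 - 1): lam2 * i2,           # second species: death
        (i1 - 1, i2 + 1): lam3 * i1 * i2,  # interaction between the species
    }


def rate_out_of_A(i1, i2, lam1, lam2, lam3):
    """Total rate q(i1, i2; A^c) for the monotone set A = {i1 >= 1}."""
    rates = lv_rates(i1, i2, lam1, lam2, lam3)
    return sum(r for (j1, _), r in rates.items() if r > 0 and j1 < 1)


lam1, lam2, lam3 = 1.0, 1.0, 0.5
q11 = rate_out_of_A(1, 1, lam1, lam2, lam3)  # equals lam3
q12 = rate_out_of_A(1, 2, lam1, lam2, lam3)  # equals 2 * lam3
```

Since $(1,1)\le(1,2)$ in the coordinatewise order and both points lie in $A$, the strict inequality `q11 < q12` is exactly the violation exhibited in the example.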
Example 5.53 (finite-dimensional generalized Potlatch process). Take $E=\mathbb{R}_+^d$ and
where $F$ is a distribution on $[0,\infty)$ with mean 1, $e_i$ is the $i$-th unit vector in $\mathbb{R}^d$ and $(P_{ij})$ is a Markov transition matrix on $\{1,2,\ldots,d\}$. We now prove that the process generated by $\Omega$ is monotone. Clearly,
Let A be a monotone set in E. Fix each i and
Since for
we have
Thus, if x (2) e A, then
= q ( ~ ( ~A). ) ,
If x(l) E A , then d
w
I ~ C ( y ( ~ ) ( i , [ ) ) F ( >, d tq) ( d 2 ) , A ) .
q(x(’),AC) = i=l
0
Our conclusion follows from Theorem 5.47. An alternative way, even simpler, to prove the monotonicity of the process is using the following coupling
(5.58) which gives us
Example 5.54. Let us consider a special case of the above example: $d=1$, $P_{11}=1$ and $F(d\xi)=e^{-\xi}\,d\xi$. Now, the coupling in (5.58) reduces to the following:
This coupling is not successful. Indeed, it is easy to check that the maximal solution to Eq. (5.16) equals one identically and so the assertion follows from Theorem 5.25.
5.7 Notes
The probability metrics are now an active research field. Refer to Dudley (1976, 1989), Rachev (1991) and Zolotarev (1984, 1986) for more references and applications. The coupling technique goes back to Doeblin (1938). In the context of random fields, it was used by Wasserstein (1969) and Dobrushin (1970). In the time-discrete case, refer to Griffeath (1978). For more references, see Liggett (1985), Chen and Li (1989), Lindvall (1992) and Thorisson (2000). The coupling method is now a powerful tool in statistics, called “copulas” (cf. Nelsen (1999)). It is also an active research topic in PDE and related fields, named “optimal transportation” (cf. Villani (2003)). Most of the materials in Section 5.1 are taken from Dobrushin (1970). Lemma 5.2 was proved in Dobrushin and Percherski (1981). The proof of the necessity of Theorem 5.5 is due to Rachev (1982). Besides, the alternative proof of Theorem 5.5 seems to be new. The formula (5.5) in Theorem 5.10 was proved partially by Dudley (1976). The complete proof is due to Szulga (1978, 1982). The formula (5.6) is newly added. Theorems 5.22 and 5.23 are due to Zhang (1999); the simplified proofs here are taken from Mao (2002b). For couplings of Markov chains, the study of the problems in Section 5.2 began in Chen (1984) and continued in Chen (1986a,b, 1989a, 1991f). Most materials in this chapter are taken from these papers. Lemma 5.18 is an extension of the original result; the free kernels used in the lemma were introduced in Anderson (1991, Proposition 2.2.14), and its conclusion is corrected here. For the proof d) of Theorem 5.19, the author benefited from a discussion with J. L. Zheng. The proofs a) and b) of the theorem were updated by Chen (1994b) and the proof c) is due to Zhang (1994). The idea of the proof of Corollary 5.26 is taken from Dobrushin and Percherski (1981).
The coupling by reflection (the diffusion analogue of this coupling is due to Lindvall and Rogers (1986)) appeared in Chen (1994a), in which the optimal couplings with respect to distances were introduced and then extended by Zhang (2000) to closed functions. Refer also to Chen and Li (1989), Chen (1990) for the earlier efforts on the optimization of couplings. Based on Chen and Li (1989), some estimates for the moments of coupling time in the context of jump processes were obtained by Huang (1988). Theorems 5.32, 5.34, 5.36 and 5.43 are due to Zhang (1999, 2000), except the formulas in Theorem 5.43, which are due to Lindvall (1999). A correction is made in the proof of Theorem 5.34. Theorem 5.47 was updated by Zhang (2000a,b), removing some extra condition in the original version. It was proved in Chen (1996b), Zhang (2000b) and Zhang and Chen (1996) that two minimal $q$-processes are stochastically comparable only if the “bigger” one is regular and then so is the “smaller” one. Thus, for the stochastic comparability, only the regular processes are meaningful.
Part II. Symmetrizable Jump Processes
Chapter 6
Symmetrizable Jump Processes and Dirichlet Forms

An important subclass of Markov processes is the set of reversible ones, which correspond to the equilibrium systems (i.e., detailed balance) in physics. The symmetrizable Markov processes are a natural extension of the reversible ones. In this chapter, we study some basic properties of symmetrizable Markov processes, especially for jump processes. As we did for general jump processes, we should first study the existence and uniqueness problems for the processes, which are the main topics of this chapter. The mathematical tool used in the first six sections is mainly the one developed in the first three chapters with some refinements. In Sections 6.7 and 6.8, we employ a different tool, the $L^2$-theory, to obtain a uniqueness criterion for symmetric Bq-processes. Some applications of the results obtained here will be given in the subsequent chapters.

6.1 Reversible Markov Processes

In this section, unless otherwise mentioned, assume that the process $(X_t)_{t\ge 0}$ is non-explosive. That is, $\mathbb{P}[X_t\in E]=1$ for all $t\ge 0$. The following definition explains the meaning of time-reversibility.

Definition 6.1. A Markov process $(X_t)_{t\ge 0}$ defined on $(\Omega,\mathscr{F},\mathbb{P})$ with state space $(E,\mathscr{E})$ is called reversible, if for any finite $n\ge 1$, $0\le t_1<t_2<\cdots<t_n$
with
and
By the monotone class theorem, it follows that

Proposition 6.2. A Markov process $(X_t)_{t\ge 0}$ is reversible iff for any finite $n\ge 1$, any $0\le t_1<t_2<\cdots<t_n$ satisfying (6.1) and $f\in b\mathscr{E}_n$,
E[f(Xt,, * * .
9
Xtn)
where $\mathscr{E}_n$ is the $n$-fold product of $\mathscr{E}$.
We now prove the main result of this section.

Theorem 6.3. Let $(X_t)_{t\ge 0}$ be a Markov process with transition probability function $P(t,x,dy)$ and initial distribution $\pi$. Then it is reversible iff
$$\int_A\pi(dx)P(t,x,B)=\int_B\pi(dx)P(t,x,A) \tag{6.3}$$
for any $A,B\in\mathscr{E}$ and $t>0$. Equivalently,
$$\pi(fP(t)g)=\pi(gP(t)f) \tag{6.4}$$
for all $f,g\in b\mathscr{E}$ or $\mathscr{E}_+$ and all $t>0$.

Proof: The last assertion follows from the monotone class theorem. By the assumption, we have for all
and
Thus, if the process is reversible, then
gives us
So the condition is necessary. Conversely, assume (6.4). This means that $P(t)$ is self-adjoint on $L^2(\pi)$, the space of all real square-integrable functions with inner product:
Let
and
Then, we have
satisfy
and
Define
Now, the assertion follows by setting $f_i=I_{A_i}$, $1\le i\le n$. ∎

Actually, we have also proved the following result.
Corollary 6.4. Every reversible measure of an honest $P(t)$ is a stationary distribution.
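At the level of a finite $Q$-matrix, the same fact reads: detailed balance $\pi_iq_{ij}=\pi_jq_{ji}$ implies $\pi Q=0$. The sketch below checks this with exact rational arithmetic for a small birth-death chain. It is an illustration under assumptions: the function names, the four-state truncation and the particular rates are not from the text, and the check is made on the generator rather than on $P(t)$ itself.

```python
from fractions import Fraction


def birth_death_generator(birth, death):
    """Conservative Q-matrix of a birth-death chain on {0, ..., n-1}."""
    n = len(birth)
    Q = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            Q[i][i + 1] = Fraction(birth[i])
        if i >= 1:
            Q[i][i - 1] = Fraction(death[i])
        Q[i][i] = -sum(Q[i])  # rows of a conservative Q-matrix sum to zero
    return Q


def reversible_distribution(birth, death):
    """Normalized weights with pi_{i+1} = pi_i * b_i / a_{i+1}."""
    weights = [Fraction(1)]
    for i in range(len(birth) - 1):
        weights.append(weights[-1] * Fraction(birth[i]) / Fraction(death[i + 1]))
    total = sum(weights)
    return [w / total for w in weights]


birth = [1, 2, 3, 0]  # b_0, b_1, b_2 (last entry unused after truncation)
death = [0, 4, 5, 6]  # a_0 := 0, a_1, a_2, a_3
Q = birth_death_generator(birth, death)
pi = reversible_distribution(birth, death)
n = len(pi)

# Detailed balance: pi_i q_ij = pi_j q_ji for all pairs of states.
detailed_balance = all(pi[i] * Q[i][j] == pi[j] * Q[j][i]
                       for i in range(n) for j in range(n))

# Stationarity at the generator level: (pi Q)_j = 0 for every j.
stationary = all(sum(pi[i] * Q[i][j] for i in range(n)) == 0
                 for j in range(n))
```

The second check follows from the first by summing $\pi_iq_{ij}=\pi_jq_{ji}$ over $i$ and using that the rows of $Q$ sum to zero, which is the finite-state counterpart of honesty.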
The above theorem leads to the following

Definition 6.5. Let $P(t,x,dy)$ be a sub-Markovian transition function. It is said to be symmetrizable (resp. reversible) if there exists some $\sigma$-finite (resp. probability) measure $\pi$ such that (6.3) holds for all $A,B\in\mathscr{E}$. Equivalently, (6.4) holds for all $f,g\in\mathscr{E}_+$. Similarly, by replacing $P(t,x,dy)$ with $q(x,dy)$, we can define a symmetrizable (reversible) $q$-pair.

It is worth mentioning that for $f,g\in\mathscr{E}_+$, $\pi(fP(t)g)$ is meaningful but may be infinite. In the $L^2$-theory, which is the topic of the last two sections of this chapter, in order to avoid infinity, we usually restrict $f,g$ to the domain of $P(t)$. But in the first six sections of the chapter, we do not want to be involved in the $L^2$-theory, so we allow $(f,g)=\infty$.

Lemma 6.6. If a jump process $P(t)$ is symmetrizable with respect to a measure $\pi$, then so is its $q$-pair.
Proof: Since $P(t,x,\{x\})\ge e^{-q(x)t}$, we have
Now, we split the proof into three steps.
a)
b) By a), we obtain
c) In general, by a) and b), we have

6.2 Existence

The first important result about symmetrizable $q$-processes is as follows:
Theorem 6.7. The minimal $q$-process is symmetrizable with respect to $\pi$ iff so is its $q$-pair. In particular, for a given symmetrizable $q$-pair, there always exists at least one symmetrizable $q$-process.

Proof: Recall that for $f\in\mathscr{E}_+$, $fI$ denotes the kernel $f(x)\delta(x,dy)$. The necessity was proved in Lemma 6.6. To prove the sufficiency, let
Then $P^{\min}(\lambda)=\sum_{n=1}^{\infty}P^{(n)}(\lambda)$. On the other hand, because
is symmetric in $A$ and $B$, so $P^{(1)}(\lambda)$ is symmetrizable with respect to $\pi$. Suppose that $P^{(n)}(\lambda)$ is symmetrizable with respect to $\pi$. That is
By the monotone class theorem, this is equivalent to
Next, since $\Pi(\lambda)$ is symmetrizable with respect to $\pi$, we have
Therefore, we obtain
here in the last step, we have used (2.24). Thus, $P^{(n)}(\lambda)$ is symmetrizable with respect to $\pi$ for all $n\ge 1$, and so is its sum $P^{\min}(\lambda)$. ∎

Having the existence result in mind, we now discuss the uniqueness problem for symmetrizable $q$-processes. Clearly, we have

Corollary 6.8. If the $q$-pair $(q(x),q(x,dy))$ is symmetrizable with respect to $\pi$ and the $q$-process is unique, then there is only one $q$-process which is symmetrizable with respect to $\pi$.
In general, the uniqueness problem is quite hard; the remainder of this section is devoted to introducing a non-trivial sufficient condition. Note that
$$\begin{aligned}
&\{x\in E:\ \exists\lambda>0,\ \exists A\in\mathscr{E}\ \text{such that}\ P(\lambda,x,A)\ne P^{\min}(\lambda,x,A)\}\\
&\quad=\{x\in E:\ \exists\lambda>0\ \text{such that}\ P(\lambda,x,E)\ne P^{\min}(\lambda,x,E)\}\\
&\quad=\{x\in E:\ \forall\lambda>0,\ P(\lambda,x,E)\ne P^{\min}(\lambda,x,E)\}
\end{aligned}$$
is clearly $\mathscr{E}$-measurable, hence we may introduce
Definition 6.9. We call the $q$-processes $P(\lambda)$ and $P^{\min}(\lambda)$ $\pi$-equivalent, if
$$\pi\{x\in E:\ \exists\lambda>0,\ \exists A\in\mathscr{E}\ \text{such that}\ P(\lambda,x,A)\ne P^{\min}(\lambda,x,A)\}=0.$$
If $P^{(k)}(\lambda)$, $k=1,2$, are all $\pi$-equivalent to $P^{\min}(\lambda)$, we call them $\pi$-equivalent.

Obviously, we have

Lemma 6.10. Two $\pi$-equivalent $q$-processes are, or are not, symmetrizable with respect to $\pi$ simultaneously.

Lemma 6.11. Let $(q(x),q(x,dy))$ be symmetrizable with respect to $\pi$ and $\{f_\lambda:\lambda>0\}$ be a consistent family of functions. Then, the equality $\pi f_\lambda=0$ holds or not simultaneously for all $\lambda>0$.
Proof: Let $\pi f_\mu=0$ for some $\mu$. Then, by consistency, we have
here in the second to the last step, we have used Theorem 6.7. ∎

Lemma 6.12. Let the $q$-pair be symmetrizable with respect to $\pi$.
(1) If for some $\lambda>0$, $\dim\mathscr{U}_\lambda=0$ $\pi$-a.e., that is $\pi u=0$ for all $u\in\mathscr{U}_\lambda$, then $\dim\mathscr{U}_\lambda=0$ $\pi$-a.e. for all $\lambda>0$. In this case, we write $\dim\mathscr{U}\overset{\pi}{=}0$.
(2) If for some $\lambda>0$, there is $u_\lambda\in\mathscr{U}_\lambda$ so that $0<\|u_\lambda\|_1<\infty$, then there is a consistent family of functions $\{u_\lambda\in\mathscr{U}_\lambda:\lambda>0\}$ so that $0<\|u_\lambda\|_1<\infty$ for all $\lambda>0$.
Proof: a) Let $u_\lambda\in\mathscr{U}_\lambda$ for some $\lambda$. Define $u_\mu=R(\lambda,\mu)u_\lambda$ for all $\mu$. Then, by Lemma 2.36, $u_\mu\in\mathscr{U}_\mu$ for all $\mu>0$ and $\{u_\mu:\mu>0\}$ is consistent. Hence, the first assertion follows from Lemma 6.11.
b) Let $u_\lambda\in\mathscr{U}_\lambda$ satisfy $0<\|u_\lambda\|_1<\infty$ for a fixed $\lambda>0$. Define the consistent family $\{u_\mu:\mu>0\}$ as above. Then, as we have just seen, $\pi u_\mu>0$ for all $\mu>0$. Moreover, by symmetry, we obtain
$$\|P^{\min}(\mu)u_\lambda\|_1=(u_\lambda,P^{\min}(\mu)1)\le\mu^{-1}(u_\lambda,1)=\|u_\lambda\|_1/\mu<\infty.$$
Hence,
$$\|u_\mu\|_1\le\|u_\lambda\|_1+|\lambda-\mu|\,\|P^{\min}(\mu)u_\lambda\|_1<\infty.\ ∎$$
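Proposition 6.13. Let $(q(x),q(x,dy))$ be $\pi$-almost conservative (i.e., $\pi d=0$, where $d=q(\cdot)-q(\cdot,E)$) and $\pi$-symmetrizable. If $\dim\mathscr{U}\overset{\pi}{=}0$ or $\pi q<\infty$, then the $\pi$-symmetrizable $q$-process is unique.

Proof: Recall that $z_\lambda=1-\lambda P^{\min}(\lambda)1$. If the last condition holds, then
$$\pi((\lambda+q)z_\lambda)=\pi Qz_\lambda=(1,Qz_\lambda)=(z_\lambda,Q1)\le\pi q<\infty,$$
so $\pi z_\lambda=0$, and hence $\dim\mathscr{U}\overset{\pi}{=}0$. Furthermore, for any $q$-process $P(\lambda)$, we have
$$\begin{aligned}
&\{x\in E:\ \exists\lambda>0,\ \exists A\in\mathscr{E}\ \text{such that}\ P(\lambda,x,A)\ne P^{\min}(\lambda,x,A)\}\\
&\quad=\{x\in E:\ \exists\lambda>0\ \text{such that}\ P(\lambda,x,E)\ne P^{\min}(\lambda,x,E)\}\\
&\quad\subset\{x\in E:\ \exists\lambda>0\ \text{such that}\ z_\lambda\ne 0\},
\end{aligned}$$
which is a $\pi$-null set.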
Proposition 6.14. Let $\pi z_\lambda<\infty$. Then there exists a $\pi$-almost honest symmetrizable Bq-process iff the following conditions hold.
(1) The $q$-pair is symmetrizable with respect to $\pi$.
(2) The $q$-pair is $\pi$-almost conservative.

Proof: It will be proved later (Proposition 6.25) that for every Bq-process $P(\lambda)$,
$$d(x)=\lim_{\lambda\to\infty}\lambda[1-\lambda P(\lambda,x,E)],\qquad x\in E.$$
This plus the $\pi$-almost honesty implies that $\pi d=0$. Hence the necessity of the conditions is clear. Next, when conditions (1) and (2) hold and $\dim\mathscr{U}=0$, the existence is trivial. If $\dim\mathscr{U}\ne 0$ but $\dim\mathscr{U}\overset{\pi}{=}0$, then each Bq-process is $\pi$-equivalent to $P^{\min}(\lambda)$ and $\pi$-almost honest. Finally, if $\dim\mathscr{U}\overset{\pi}{\ne}0$, then
$$P(\lambda):=P^{\min}(\lambda)+z_\lambda\mu_\lambda/[\lambda\mu_\lambda(E)]$$
provides one of the desired processes, where $\mu_\lambda=\pi[z_\lambda I]$.

What we are going to do for the remainder of this chapter is more or less related to the last two results. As we will see later, the uniqueness criterion for this special type of jump processes cannot be deduced directly from the one for general jump processes. Actually, it is even not known completely.

6.3 Equivalence of Backward and Forward Kolmogorov Equations

In this section, we prove that the two Kolmogorov equations are almost equivalent.
Definition 6.15. We say that the backward (resp. forward) equation $(B_\lambda)$ (resp. $(F_\lambda)$) holds $\pi$-a.e. if there is a $\pi$-null set $N$ so that for all $x\notin N$, all $A\in\mathscr{E}$ and $\lambda>0$, $(B_\lambda)$ (resp. $(F_\lambda)$) holds.

Theorem 6.16. Let $P(\lambda)$ be a $\pi$-symmetrizable $q$-process. Then $P(\lambda)$ satisfies $(B_\lambda)$ $\pi$-a.e. iff $P(\lambda)$ satisfies $(F_\lambda)$ $\pi$-a.e.

Proof: Let $P(\lambda)$ satisfy $(B_\lambda)$ $\pi$-a.e. Choose $\{E_n\}$ so that $E_n\uparrow E$ with $\pi(E_n)<\infty$ and $\sup_{x\in E_n}q(x)\le n$. Then, by Lemma 6.6, we have
Taking the Radon-Nikodym derivative, we obtain
The exceptional set depends on $\lambda>0$, $A$ and $n$. However, for fixed $x$ and $A$, by the resolvent equation, $P(\lambda,x,A)$ is continuous in $\lambda$, so we can choose an exceptional set depending on $A$ and $n$ only. Furthermore, since $(E,\mathscr{E})$ is a Polish space and $\mathscr{E}$ is countably generated, we can choose an exceptional set independent of $A$ and $n$. Similarly, we can prove the other assertion.

6.4 General Representation of Jump Processes

To study further the uniqueness problem of symmetrizable $q$-processes, we need to know more details about the exit and entrance boundaries of the processes. In this section, we first introduce some basic results on Feller's boundary theory. Then, we present a general representation of jump processes. This section can be regarded as an addition to Chapter 2. For convenience, set
$$\mathscr{U}_\lambda(c)=\{u\in\mathscr{U}_\lambda:\ u\le c\}.$$
Lemma 6.17. Let $\tilde u_\lambda=P^{\min}(\lambda)d$, $\lambda>0$, and let $\bar u_\lambda$ be the maximal element in $\mathscr{U}_\lambda(1)$, which can be obtained by the following procedure: fix $\lambda>0$ and let
$$u^{(0)}\equiv 1,\qquad u^{(n+1)}=\Pi(\lambda)u^{(n)},\quad n\ge 0,$$
then $u^{(n)}\downarrow\bar u_\lambda$. Next, $\{\bar u_\lambda\}$ and $\{\tilde u_\lambda\}$ are consistent families. Moreover,

Proof: The approximation scheme for $\{\bar u_\lambda\}$ is obvious. By Lemma 2.39, we know that for each fixed $\lambda>0$, $z_\lambda$ is the maximal solution to the equation
$$x=\Pi(\lambda)x+d(\lambda+q)^{-1},\qquad 0\le x\le 1.$$
On the other hand, $P^{\min}(\lambda)d$ is the minimal solution to the same equation, so as their difference, $\{\bar u_\lambda\}$ must be the maximal element in $\mathscr{U}_\lambda(1)$. The consistency of $\{\tilde u_\lambda\}$ follows from (2.40). Now, as the difference of two consistent families, $\{\bar u_\lambda\}$ must also be consistent. ∎

Recall that for the consistent families $\{f_\lambda\}$ and $\{\eta_\lambda\}$, we have

Definition 6.18. We call $f:=\lim_{\lambda\to 0}f_\lambda$ and $\eta:=\lim_{\lambda\to 0}\eta_\lambda$ the canonical images of $\{f_\lambda\}$ and $\{\eta_\lambda\}$ respectively.

In what follows, for the consistent families $\{f_\lambda\}$, $\{u_\lambda\}$, $\{\eta_\lambda\}$, $\{\bar\eta_\lambda\}$ and so on, we use $f$, $u$, $\eta$ and $\bar\eta$ to denote their canonical images, respectively.

Lemma 6.19. For the canonical images, we have the following decompositions:
where LW
F(x, A )
. I
lim Pmin(X, 2,A ) =
A--0
Pmin(t, Z,A)& =
rIn(IA/q)(x). n=O
Proof: By consistency, we have
$$f_\lambda=f_\mu+(\mu-\lambda)P^{\min}(\lambda)f_\mu,\qquad f_\mu=f_\lambda+(\lambda-\mu)P^{\min}(\lambda)f_\mu.$$
Letting $\lambda\downarrow 0$ in the left equality, we obtain the first desired decomposition of $f$. Letting $\mu\downarrow 0$ in the right equality, we obtain the other decomposition of $f$. The proof for the decompositions of $\eta$ is similar. ∎
Next, recall that Eo = {x : d(z) > 0) and set
Lemma 6.20. For the consistent families {G,}, {GA) and their canonical images ii,U we have
(6.7) (6.8)
(6.9) (6.10) (6.11) where uo is the maximal element in
having the property:
Proof: All assertions are clear except the ones about $u^0$. By Lemma 6.17, we have $\bar u_\lambda+\tilde u_\lambda=z_\lambda\le 1$. Define
$$u^0:=\lim_{\lambda\to 0}(1-\bar u_\lambda-\tilde u_\lambda)=1-\bar u-\tilde u\ge 0.$$
Then, we have (6.11). Using Lemma 6.17, (6.8) and (6.9), we obtain (6.12). To show that $u^0\in\mathscr{U}_0$, note that by the definition of $u^0$, we have
$$\lambda P^{\min}(\lambda)1\downarrow u^0\quad\text{as }\lambda\downarrow 0.$$
If $q(x)=0$, then we certainly have $u^0(x)=\Pi u^0(x)$. Otherwise, the equation $(B_\lambda)$ also implies that $u^0(x)=\Pi u^0(x)$. Finally, for any solution $f$ to the equation $f=\lambda P^{\min}(\lambda)f$, $0\le f\le 1$, we have
$$f=\lambda P^{\min}(\lambda)f\le\lambda P^{\min}(\lambda)1,$$
and so $f\le\lim_{\lambda\to 0}\lambda P^{\min}(\lambda)1=u^0$.
Lemma 6.21.
(1) A family
{f,: X 3 0)
of functions is consistent iff it has the representa-
tion : f,
= prriin
(
W
I
+ A,
where w satisfies Pmin(X)w E b€+ for some (and hence for a l l ) A > 0 and (f, E %'A} is a consistent family. Moreover, w and hence fx are determined uniquely by = (XI -
UT
(2) A family {q, : X
n)f,, fx J, 0, > 0 ) of
-+
as X
w
T 00.
measures is consistent iff it has the representa-
tion:
77,
Xfx
2
#emin (A) + q x ,
where K E ?+ satisfies ~ E P ~ ~ E ~ 9 ~+ for ( Xsome ) (and hence for a l l ) X > 0 and { f j x E W,} is a consistent family. Moreover, K. and hence f j x are determined uniquely by
44 = % ( X I
- %4)7
rlx
-1 0, h ( 4 44 +
as
AEmE,,
T 00, n21.
Proof: Assertion (2) follows by combining Lemma 3.3 with Lemma 3.4. In a similar way, we can prove assertion (1). ∎

Corollary 6.22. As $\lambda\uparrow\infty$, we have
-
P1x
-1 0:
6, 10,
X U , -+ 0; AfiA --t d.
> 0 } be a consistent family of measures. := AqAu0 < 03 independent of X > 0.
Lemma 6.23. Let {q, : X ' 7
Next, let
{fx
f E bd?. Then
> 0) X7,f T
:X
Then
be a consistent family of functions with canonical image as X 00. In particular, for a E Eo, if we set
."X(.)
= Pmin(X, 2,{ a } ) d ( a ) ,
then {u:) has canonical image ua = v fAl
.- Xq,ua ._
'LI"
= v;
vn,
a ) d ( a ) . Moreover,
T
+ q,({u})d(u)
0 0 7
independent of p
> 0.
Proof: Actually, it was proved in (2.55) that
h f = PVpf +
- ChpfX7
so the second assertion follows. Similarly, by Lemma 6.20, we can prove the first one. Now? it is clear that, u: = 21;
+ ( A - p)q&,
a E Eo.
But Xu: 6 d(a), so the last assertion follows by the dominated convergence theorem. W
Definition 6.24. Let $P(\lambda)$ be a $q$-process. We say that the backward equation at point $x$ holds if for all $\lambda>0$ and all $A$,
$$P(\lambda)I_A(x)=\Pi(\lambda)P(\lambda)I_A(x)+(\lambda+q(x))^{-1}I_A(x).$$
The next result improves Theorem 1.15.

Proposition 6.25. The backward equation at $x$ holds iff
$$d(x)=\lim_{\lambda\to\infty}\lambda[1-\lambda P(\lambda,x,E)]=:D(x). \tag{6.13}$$
In general, we have the inequality replacing “$=$” with “$\ge$”.
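Proof: a) As we did in Section 1.1, enlarging $(E,\mathscr{E})$ by a fictitious state $\Delta$, we obtain $(E_\Delta,\mathscr{E}_\Delta)$. Define

Clearly, $P_\Delta(\lambda)$ is a $q$-process on $(E_\Delta,\mathscr{E}_\Delta)$ with $q_\Delta=q\cdot I_E$. So the limit
$$\lim_{\lambda\to\infty}\lambda(1-\lambda P(\lambda,x,E))=\lim_{\lambda\to\infty}\lambda^2P_\Delta(\lambda,x,\{\Delta\})$$
exists for all $x\in E$. On the other hand, by Lemma 6.17 and Corollary 6.22 we have
$$D(x):=\lim_{\lambda\to\infty}\lambda(1-\lambda P(\lambda,x,E))\le\lim_{\lambda\to\infty}\lambda z_\lambda(x)=\lim_{\lambda\to\infty}(\lambda\bar u_\lambda+\lambda\tilde u_\lambda)=d(x).$$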
This proves the last assertion. b) By a), we have q,(A) = 0 and
for all $x\in E$ and for $A$ in a ring and hence for all $A\in\mathscr{E}$. Hence, applying the backward Kolmogorov inequality to both $P(\lambda)$ and $P_\Delta(\lambda)$, we obtain
where
r
Thus, whenever $D(x)=d(x)$, every inequality above must become an equality. In other words, $(B_\lambda)$ at $x$ holds. c) Conversely, if $(B_\lambda)$ at $x$ holds, then
Lemma 6.26. Let $P(\lambda)$ be a $q$-process and set $u_\lambda=1-\lambda P(\lambda)1$. Then, we have $u_\lambda\ge\Pi(\lambda)u_\lambda$.

Proof: By using the resolvent equation with some computation, we obtain
Now, the assertion follows by letting

We can now state the general representation theorem of $q$-processes.

Theorem 6.27. Every $q$-process has the following form
$$P(\lambda)=P^{\min}(\lambda)+B(\lambda)+X(\lambda)F(\lambda),\qquad\lambda>0,$$
where
$$X(\lambda):=P^{\min}(\lambda)[dI],\qquad\lambda>0 \tag{6.14}$$
and $B(\lambda)$, $F(\lambda)$ are kernels having the following properties.
(1) For fixed $x$ and $A$, $B(\cdot,x,A)$ and $F(\cdot,x,A)$ are continuous.
(2) For fixed $\lambda$ and $x$,
B ( X , z ,.), F(X,J:,*) E .A?+, XB(X)l(z)< 1, XF(X)l(J:) F ( X ) I ( X ) = o for J: 4 E'.
< 1,
(3) For fixed X and A , B(X,+ , A )E e ~ ( l / X )F(X, ? A ) E T&+. (4) If B(X)= 0, then F(X) satisfies the resolvent condition e ,
F ( p ) = F ( A)
+ (A -p )F ( A) pmin (T+ d F ( p ) )= F ( A) { I+ (A ,p( p)} . (,I)
-
(5) The $q$-condition holds:
$$\lim_{\lambda\to\infty}\lambda^2X(\lambda)F(\lambda)I_A=0,\qquad A\in\mathscr{E}\cap E_n,\ n\ge 1.$$
Furthermore, if $P(\lambda)$ satisfies the backward equation at $x$, then $F(\lambda,x,\cdot)\equiv 0$. If $P(\lambda)$ satisfies the forward equation, then for every $x$, $B(\lambda,x,\cdot),F(\lambda,x,\cdot)\in\mathscr{V}_\lambda$. Finally, $P(\lambda)$ is honest iff
$$\lambda B(\lambda)1=\bar u_\lambda\quad\text{and}\quad\lambda F(\lambda)1=1\ \text{ on } E_0 \tag{6.15}$$
for all $\lambda>0$.
Proof: a) Let $P(\lambda)$ be a $q$-process. Since every $q$-process satisfies the Kolmogorov inequality and $P^{\min}(\lambda)$ satisfies $(B_\lambda)$, we have

Moreover, by Proposition 6.25, for the conservative point $x$: $d(x)=0$, the inequality becomes equality for all $\lambda>0$ and all $A$. Hence, we may define
$$F(\lambda,x,A)=\begin{cases}\text{the left-hand side of (6.16)}/d(x) & \text{if } x\in E_0\\ 0 & \text{if } x\notin E_0.\end{cases} \tag{6.17}$$
Clearly, we have

and $F(\lambda,x,\cdot)=0$ whenever $P(\lambda)$ satisfies the backward equation at $x$. Moreover, for fixed $x$ and $A$, $F(\lambda,x,A)$ is continuous in $\lambda$ and so is $X(\lambda)F(\lambda)(x,A)$. From this, assertion (1) follows.
b) Next, note that Lemma 2.34 (1) implies
$$P(\lambda)-P^{\min}(\lambda)\ge P^{\min}(\lambda)(dF(\lambda)).$$
So we may define
$$B(\lambda)=P(\lambda)-P^{\min}(\lambda)-P^{\min}(\lambda)(dF(\lambda))\ge 0.$$
Of course, we have

Now, $P(\lambda)$ can be written as follows:
$$P(\lambda)=P^{\min}(\lambda)+X(\lambda)F(\lambda)+B(\lambda). \tag{6.18}$$
Premultiplying $(\lambda I-\Omega)$, we obtain
$$(\lambda I-\Omega)P(\lambda)=(\lambda I-\Omega)P^{\min}(\lambda)+(\lambda I-\Omega)X(\lambda)F(\lambda)=I+(\lambda I-\Omega)X(\lambda)F(\lambda).$$
Noting that, by Eq. $(B_\lambda)$, we have
$$(\lambda I-\Omega)X(\lambda)f=(\lambda I-\Omega)P^{\min}(\lambda)(df)=df. \tag{6.19}$$
Combining the above two facts, we arrive at
$$(\lambda I-\Omega)P(\lambda)1(x)=1+d(x)F(\lambda)1(x),\qquad\lambda>0,\ x\in E_0. \tag{6.20}$$
On the other hand, by Lemma 6.26, we have
$$(\lambda I-\Omega)P(\lambda)1\le 1+d/\lambda,\qquad\lambda>0.$$
It follows that $\lambda F(\lambda)1\le 1$ for all $\lambda>0$.
c) It is easy to check that if $B(\lambda)=0$, then $P(\lambda)$ satisfies the resolvent equation iff
But
Hence, the above equality becomes
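Now, applying (6.19) to this equality, we obtain (4).
d) Suppose that $P(\lambda)$ satisfies $(F_\lambda)$. Then
$$B(\lambda)(\lambda I-\Omega)I_A+X(\lambda)F(\lambda)(\lambda I-\Omega)I_A=0,\qquad A\in\mathscr{E}\cap E_n,\ n\ge 1.$$
Premultiplying $(\lambda I-\Omega)$, we obtain
$$dF(\lambda)(\lambda I-\Omega)I_A=0,\qquad A\in\mathscr{E}\cap E_n,\ n\ge 1.$$
Moreover, we also have
$$B(\lambda)(\lambda I-\Omega)I_A=0,\qquad A\in\mathscr{E}\cap E_n,\ n\ge 1.$$
This shows that for fixed $\lambda$ and $x$, $B(\lambda),F(\lambda)\in\mathscr{V}_\lambda$.
e) We now prove the last assertion. The sufficiency is obvious. To prove the necessity, let $P(\lambda)$ be honest. By (6.20), we obtain
$$\lambda+\lambda d(x)F(\lambda)1(x)=(\lambda I-\Omega)\lambda P(\lambda)1(x)=(\lambda I-\Omega)1(x)=\lambda+d(x),\qquad x\in E_0.$$
Hence, $\lambda F(\lambda)1=1$ on $E_0$. Substituting this into the expression of $P(\lambda)$, we obtain
$$\begin{aligned}
1=\lambda P(\lambda)1&=\lambda P^{\min}(\lambda)1+\lambda B(\lambda)1+X(\lambda)\lambda F(\lambda)1\\
&=1-z_\lambda+\lambda B(\lambda)1+P^{\min}(\lambda)d\\
&=1+\lambda B(\lambda)1-z_\lambda+\tilde u_\lambda.
\end{aligned}$$
Thus, by Lemma 6.17, we have
$$\lambda B(\lambda)1=z_\lambda-\tilde u_\lambda=\bar u_\lambda.$$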
f) Finally, we consider the $q$-condition. By Corollary 2.28, $P(\lambda)$ is a $q$-process iff

But we have proved that $B(\lambda)1\in\mathscr{U}_\lambda(1/\lambda)$, so by Lemma 6.17 we have

From this and Corollary 6.22, we obtain
$$\lim_{\lambda\to\infty}\lambda^2B(\lambda)1\le\lim_{\lambda\to\infty}\lambda\bar u_\lambda=0.$$
Therefore, the $q$-condition becomes (5).

The next result describes further the above $B(\lambda)$ in a special situation, which also improves Theorem 2.45.
Proposition 6.28. Let dim%
> 0.
P ( X ) = P"'"(X)
Then
+ u,cpx,
X
>0
(6.21)
is a q-process iff either c p = ~ 0 or c p can ~ be obtained by the following procedure.
(1) Take
K.
E p+ so that K.P"'"(X) E 2f+.
(6.22)
(2) Take a consistent family { f j x E Wx} of measures so that
+
(6.23) (6.24)
q x := nPrni"(A) ?jA, 21A
r
:= X f j A i i
< 03, A
21
00.
Set
' 5 = X?jxuoC=
independent of A.
0;)
(6.25)
(3) Next, take constant c so that
ao+ n(u0 + ii)
+ 21 < c.
(6.26)
(4) Finally, take
cpx = q J [ c
+
K(U -
+
(6.27)
Ux) Xfj,U].
Moreover, the q-process P(X) constructed above must satisfy ( B x ) . It satisfies ( F A )iff n = 0. It is honest iff the q-pair is conservative and c = ' 5
+ KUO.
(6.28)
Furthermore, if d i m 9 = 1, then we have constructed all Bq-processes. If d i m 9 = 1 and the q-pair is conservative, then we have indeed constructed all q- processes.
Proof: Let (6.21) hold and c p $0. ~ By Lemma 2.43, we have rn;'
- xqxii =: c
independent of A.
Now, the normal condition gives us
XE,cp,(E)
< ii,
(6.29)
3- i i x .
E 9 ~ (and l) But iiA 00
iix = P"'"(X)d
= CIqX)"d/(X
+ 4) G zx G 1,
0
premultipIying II(A)" in (6.29) and then letting m + 03, we see that (6.29) is equivalent to Xcpx(E) 1. That is X q A ( l - U) 6 c. On the other hand,
<
+ X f j J l - ii) = X K P r n i " ( A ) ( U O + 6 ) + Xfjx(z1O + ii) = nuo + - 6,) + ao + ux
X q x ( l - u) = XnPrni"(X)(l-
ii)
K(ii
n(u0
+ii)
+ao+ v
as x
t 00.
So we obtain condition (6.26). Next, since
+
-
+
+
r n x = c Xq,G = c + XK.P"'"(X)U + xqxu = c K{ii - ux) xq,u, we obtain (6.27). Finally, the q-process is honest, iff (6.29) becomes equality. Therefore, in order the process to be honest, it is necessary that the q-pair being conservative and Xypx(E)= 1. Under t,he former condition, the later one is equivalent to c = so K U O . The remaining assertions of the proposition are obvious. H
+
6.5 Existence of Honest Reversible Jump Processes
For the remainder of this chapter, we fix a $q$-pair, symmetrizable with respect to a symmetrizing measure $\pi$, and study the existence and uniqueness problems for the symmetrizable $q$-processes. As we did several times before, we choose and fix a partition $\{E_n\}_1^\infty$ of $E$ so that
$$E_n\uparrow E,\qquad\sup_{x\in E_n}q(x)\le n\quad\text{and}\quad\pi(E_n)<\infty,\qquad n\ge 1.$$
Lemma 6.29. Let $(q(x),q(x,dy))$ be symmetric with respect to $\pi$ and $\{f_\lambda\}$ be a consistent family. Then
(1) $\lambda\pi f_\lambda$ is increasing in $\lambda$.
(2) If moreover $f_\lambda\in\mathscr{U}_\lambda$ for all $\lambda>0$ and $\pi f_\lambda<\infty$ for some (equivalently, for all) $\lambda>0$, then $\eta_\lambda:=\pi[f_\lambda I]\in\mathscr{V}_\lambda$ for all $\lambda>0$ and $\{\eta_\lambda:\lambda>0\}$ is consistent.
Proof: a) By Lemma 6.11 or is independent of > 0. Next, by consistency and symmetry,
.This proves the first assertion and b) Let
and hence c) Finally, by symmetry, we have
which proves the last assertion.
Then
Lemma 6.30. Let $(q(x),q(x,dy))$ be symmetrizable with respect to $\pi$. Then
(6.30)
Proof: By Lemma 6.29, we need only to show that $\lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda=\pi d$. But
$$\lambda\pi\tilde u_\lambda=\lambda(1,P^{\min}(\lambda)d)=(d,\lambda P^{\min}(\lambda)1)=(d,1-z_\lambda).$$
So the assertion follows from the consistency of $z_\lambda$, Lemma 6.21, plus the monotone convergence theorem. ∎

Lemma 6.31. Suppose that the $q$-condition given in Theorem 6.27 holds. Then, we have
$$\lim_{\lambda\to\infty}\lambda F(\lambda,x,A)=0,\qquad x\in E_0,\ A\in\mathscr{E}\cap E_n,\ n\ge 1.$$
Proof:
Proposition 6.32. Let the given $q$-pair $(q(x),q(x,dy))$ be symmetrizable with respect to $\pi$. Then, for the existence of an honest $\pi$-symmetrizable $q$-process, it is necessary that one of the following conditions holds.
(1) $\pi d=\infty$.
(2) $\lim_{\lambda\to\infty}\lambda\pi\bar u_\lambda=\infty$.
(3) $\pi d\le\lim_{\lambda\to\infty}\lambda\pi\bar u_\lambda<\infty$.

Proof: Clearly, it suffices to show that $\lim_{\lambda\to\infty}\lambda\pi\bar u_\lambda\ge\pi d$ whenever $\pi d<\infty$. Actually, by Theorem 6.27, whenever $P(\lambda)$ is a symmetrizable $q$-process, we must have
In particular,
Now, by symmetry and honesty, we obtain
Since
and 7rd < 00, we have
here, in the last step, we have used Lemma 6.31. But by Corollary 6.22,
This shows that
and so $\pi d\le\lim_{\lambda\to\infty}\lambda\pi\bar u_\lambda$.

Now, we are going to prove that the converse of the above result is also true when $\pi$ is a finite measure. To do so, we still need some preparations.

Lemma 6.33. Let $(q(x),q(x,dy))$ be symmetrizable with respect to $\pi$. If $\pi u^0<\infty$, then $(u^0,1-u^0)=0$.
Proof: By (6.12), we have
That is ( I p , uo(l - zA)) >, (JE,, uozx). Letting X + 0 and then n noting that zx 7 (1 - uo) as X L 0, we obtain (uol 1 - uo) = 0.
-+
03,
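Lemma 6.34. Let $(q(x),q(x,dy))$ be reversible with respect to $\pi$. Define

If $\bar u_\lambda\ne 0$ and $\tilde u_\lambda\ne 0$, then $\bar u_\lambda$ and $\tilde u_\lambda$, and hence $\bar\eta_\lambda$ and $\tilde\eta_\lambda$, are linearly independent. The images of $\bar\eta_\lambda$ and $\tilde\eta_\lambda$ are $\bar\eta=\pi[\bar uI]$ and $\tilde\eta=\pi[\tilde uI]$, respectively. Moreover,
$$\lim_{\lambda\to\infty}\lambda\tilde\eta_\lambda(A)=\pi(dI_A),\qquad A\in\mathscr{E}\cap E_n,\ n\ge 1.$$

Proof: Since $\{\bar u_\lambda\}$ and $\{\tilde u_\lambda\}$ are consistent, the consistency of $\bar\eta_\lambda$ and $\tilde\eta_\lambda$ follows from Lemma 6.29. If $c_1\bar u_\lambda+c_2\tilde u_\lambda=0$ holds for some $\lambda>0$, then $c_2(\lambda I-\Omega)\tilde u_\lambda=0$ since $\bar u_\lambda\in\mathscr{U}_\lambda$. But from $\tilde u_\lambda\ne 0$ we know that $d\ne 0$. Hence
$$(\lambda I-\Omega)\tilde u_\lambda=d\ne 0.$$
This implies that $c_2=0$. Next, from $\bar u_\lambda\ne 0$, it follows that $c_1=0$. Finally, since the property whether $\bar u_\lambda$ and $\tilde u_\lambda$ are zero or not is independent of $\lambda$, we have thus proved the linear independence. As for the last three equalities, simply apply Lemma 6.20 and Lemma 6.21. ∎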
Lemma 6.35. Let fjx and i j x be the same as above. Set
Then, we have
(1) w:~
some wab as X
(2) wi2 = wil for all X
co,a, b = 1 , Z . Moreover,
> 0 and so w12 = w21.
Proof: The first assertion is a straightforward consequence of Lemma 6.34 and Lemma 6.23. By using these lemmas again, we get
and so wi2=
wil
for all X
> 0. W
Write
W(X) = (w?
4 1 )
WZl
For a given matrix
R(X)=
:;(
rY)
>
r i2 we write
P(X) = P"'"(X)
p).
+ (fix, ux)R(X)
(6.31)
Lemma 6.36. Let $(q(x),q(x,dy))$ be reversible with respect to $\pi$. Then $P(\lambda)$ is a reversible $q$-process if the following conditions hold.
(1) R(X) is non-negative and symmetric. (2) R(X)W(X)l < 1, x > 0. (3) (4) Moreover, the process is honest iff the equality in (2) holds.
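Proof: Clearly, condition (1) implies that $P(\lambda) \ge 0$. Write
Then
$\lambda\varphi_\lambda^a(E) = \lambda r_\lambda^{a1}\tilde\eta_\lambda(E) + \lambda r_\lambda^{a2}\bar\eta_\lambda(E)$, $a = 1, 2$.
On the other hand, by (6.11), we have
here we have used Lemma 6.33:
$\tilde\eta_\lambda u_0 = (u_0, \tilde u_\lambda) \le (u_0, z_\lambda) \le (u_0, 1 - u_0) = 0$, $\lambda > 0$, (6.32)
and $\bar\eta_\lambda u_0 = 0$, $\lambda > 0$. Thus, condition (2) implies the norm condition $\lambda P(\lambda)1 \le 1$ and furthermore the process is honest iff (2) becomes equality by (6.15).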
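To show that $P(\lambda)$ satisfies the resolvent equation, by consistency of $\tilde u_\lambda$ and $\bar u_\lambda$ plus some computations, it suffices that
$\varphi_\lambda^k - \varphi_\mu^k + (\lambda - \mu)\varphi_\lambda^k P(\mu) = 0$, $k = 1, 2$.
This can be deduced by using consistency of $\tilde\eta_\lambda$, $\bar\eta_\lambda$ and condition (3). Clearly, (4) follows from (3). Finally, since $\lim_{\lambda\to\infty}\lambda\tilde u_\lambda = 0$ and $\lim_{\lambda\to\infty}\lambda\bar u_\lambda = d$, we have
$\lim_{\lambda\to\infty}\lambda(I_A - \lambda P(\lambda)I_A) = \lim_{\lambda\to\infty}\lambda(I_A - \lambda P^{\min}(\lambda)I_A) - (0, d)$
So the q-condition is also satisfied. □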
Lemma 6.37. The hypotheses and the notations are the same as in the previous lemma. Suppose in addition that
$0 < \pi d \le \lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda < \infty$.
Then, we can choose a matrix $R(\lambda)$ so that $P(\lambda)$ defined by (6.31) is an honest reversible q-process.
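Proof: Because $\pi d > 0$, we have $d \ne 0$ and $\bar u_\lambda \ne 0$. Moreover, since $\lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda \ge \pi d > 0$, we also have $\tilde u_\lambda \ne 0$. Now, take
$h_1 = \lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda$, $h_2 = \pi d$, $r_1 = (h_1 - h_2)/h_1^2$, $r_2 = 1/h_1$.
Clearly, $r_1 \ge 0$, $r_2 > 0$. Next, take
$r_\lambda^{11} = [r_1 + r_2^2(w^{22} - w_\lambda^{22})]/\Delta$,
$r_\lambda^{12} = r_\lambda^{21} = r_2[1 - r_2(w^{12} - w_\lambda^{12})]/\Delta$,
$r_\lambda^{22} = r_2^2(w^{11} - w_\lambda^{11})/\Delta$,
where
$\Delta = [1 - r_2(w^{12} - w_\lambda^{12})]^2 - r_2^2(w^{11} - w_\lambda^{11})(w^{22} - w_\lambda^{22}) - r_1(w^{11} - w_\lambda^{11})$
and $w_\lambda^{ab}$ and $w^{ab}$ ($a, b = 1, 2$) are defined by Lemma 6.35. It is easy to check that $w^{ab}$ ($a, b = 1, 2$) are finite and so are $w_\lambda^{ab}$. The computation of $R(\lambda)$ comes from Lemma 6.36 (3). Letting $\mu \to \infty$, we obtain
$R(\lambda) = R(I - (W - W(\lambda))R)^{-1}$.
From this, it is easy to show that $R(\lambda)$ constructed here satisfies all conditions of Lemma 6.36 and moreover, the process is honest. □
We can now state our main criterion.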
Theorem 6.38. Let $(q(x), q(x, dy))$ be reversible with respect to $\pi$. Then there exists a $\pi$-almost honest and $\pi$-reversible q-process iff one of conditions (1)-(3) of Proposition 6.32 holds.
Proof: The necessity was proved in Proposition 6.32. To prove the sufficiency, we discuss the problem according to different cases.
a) Let $(q(x), q(x, dy))$ be conservative. By Proposition 6.28,
is a q-process, it is clearly reversible. When $\pi\tilde u_\lambda > 0$, i.e., $\tilde\eta_\lambda(E) > 0$, then it is actually honest. Otherwise,
$\pi[x \in E : \lambda P^{\min}(\lambda)1(x) < 1] \le \pi[\tilde u_\lambda \ne 0] = 0$,
and so $P^{\min}(\lambda)$ is a $\pi$-almost honest q-process.
b) Let $\dim\mathscr{U} \overset{\pi}{=} 0$ and $d \overset{\pi}{=} 0$. That is, the q-pair is $\pi$-almost zero-exit and $\pi$-almost conservative. In this case, we still have
$\pi[x \in E : \lambda P^{\min}(\lambda, x, E) < 1] = \pi[z_\lambda \ne 0] \le \pi[\tilde u_\lambda \ne 0] + \pi[\bar u_\lambda \ne 0] = 0$.
c) Let $d \overset{\pi}{=} 0$ and $0 < \lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda < \infty$. In this case, the q-process constructed in a) satisfies
$\pi[x \in E : \lambda P(\lambda, x, E) < 1] \le \pi[d \ne 0] + \pi[d = 0, \lambda P(\lambda)1 < 1] = \pi[z_\lambda - \tilde u_\lambda \ne 0, d = 0] = \pi[\bar u_\lambda \ne 0, d = 0] = 0$.
d) Let $0 < \pi d \le \lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda < \infty$. In this case, an honest reversible q-process was constructed by Lemma 6.37.
e) Let $\pi d = \infty$ or $\lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda = \infty$. By Lemma 6.30, we have $\lim_{\lambda\to\infty}\lambda\pi z_\lambda = \infty$. Then, $\eta_\lambda := \pi[z_\lambda I]$ satisfies Proposition 3.5 (4), so by Proposition 3.5,
$P(\lambda) = P^{\min}(\lambda) + z_\lambda\eta_\lambda/(\lambda\eta_\lambda(E))$
is an honest q-process, which is clearly reversible. □
6.6 Uniqueness Criteria
In this section, we fix a symmetrizable q-pair $(q(x), q(x, dy))$ with a symmetrizing measure $\pi$, which is also fixed. The uniqueness studied here is naturally in the sense of $\pi$-equivalence.
Proposition 6.39. If $\mathscr{U}_\lambda \cap L^1(\pi) = \{0\}$, then there is only one $\pi$-symmetrizable Bq-process.
Proof: Suppose that the Bq-processes are not unique. Then, by Theorem 6.27, we should have a q-process with the form $P(\lambda) = P^{\min}(\lambda) + B(\lambda)$ with $B(\lambda, \cdot, A) \in \mathscr{U}_\lambda$ for all $\lambda > 0$ and $A \in \mathscr{E}$ and $\pi B(\lambda_0)1 > 0$ for some $\lambda_0 > 0$. Choose large enough $n$ so that $\pi(B(\lambda_0)I_{E_n}) > 0$. But then for this $n$, we also have
since $\pi(E_n) < \infty$. Hence, by Lemma 6.12, it follows that $0 < \pi(B(\lambda)I_{E_n}) < \infty$ for all $\lambda > 0$. This contradicts the assumption $\mathscr{U}_\lambda \cap L^1(\pi) = \{0\}$, since $B(\lambda)I_{E_n} \in \mathscr{U}_\lambda$. □
Proposition 6.40. If
(1) $\mathscr{U}_\lambda \cap L^1(\pi) = \{0\}$ and
(2) $\pi d < \infty$
hold, then there is only one $\pi$-symmetrizable q-process.
Proof: Let $P(\lambda)$ be an arbitrary $\pi$-symmetrizable q-process. Then by Theorem 6.27, it has the representation:
$P(\lambda) = P^{\min}(\lambda) + B(\lambda) + X(\lambda)F(\lambda)$.
But by condition (1) and the proof of the last proposition, we see that $B(\lambda, \cdot, \Gamma) = 0$, $\pi$-a.e. On the other hand, by Lemma 6.31, we have
$\lim_{\lambda\to\infty}\lambda F(\lambda)I_A = 0$, $A \in \mathscr{E} \cap E_n$, $n \ge 1$.
Hence
$\lim_{\lambda\to\infty}(\lambda P^{\min}(\lambda)I_{E_n}, d\lambda F(\lambda)1)$
$= \lim_{\lambda\to\infty}(I_{E_n}, \lambda P^{\min}(\lambda)d\lambda F(\lambda)1)$ (by symmetry of $P^{\min}(\lambda)$)
$= \lim_{\lambda\to\infty}(1, \lambda P^{\min}(\lambda)d\lambda F(\lambda)I_{E_n})$ (by symmetry of $P(\lambda)$ and $P^{\min}(\lambda)$)
$= \lim_{\lambda\to\infty}(\lambda P^{\min}(\lambda)1, d\lambda F(\lambda)I_{E_n})$ (by symmetry of $P^{\min}(\lambda)$)
$= 0$, $n \ge 1$.
Here in the last step, we have used condition (2) and the dominated convergence theorem. Moreover, by Theorem 6.27, $F(\mu) = F(\lambda)(I + (\lambda - \mu)P(\mu))$. Hence
Combining these facts together, we obtain
and so
Thus, there exists a $\pi$-null set $N$ such that
Note that the jump condition gives us
By Theorem 1.14, it follows that $\lim_{\lambda\to\infty}\lambda P^{\min}(\lambda)(I_{N^c}\,d\lambda F(\lambda)1) = 0$. On the other hand, by symmetry,
$(1, P^{\min}(\lambda)I_N) = (I_N, P^{\min}(\lambda)1) = 0$.
So we can choose a $\pi$-null set $\tilde N$ so that $P^{\min}(\lambda, x, N) = 0$ for all $x \notin \tilde N$. Therefore, we have actually obtained
From this, it follows that
Now, by Proposition 6.25, we see that $P(\lambda, x, \cdot)$ satisfies the backward equation. Therefore, $P(\lambda)$ is equivalent to $P^{\min}(\lambda)$. To see this, we can either use condition (1) and Proposition 6.39 to claim the uniqueness of Bq-processes or use Theorem 6.27 to claim that $F(\lambda)1 = 0$, $\pi$-a.e. □
Theorem 6.41. There exists only one $\pi$-symmetrizable q-process if the following three conditions all hold.
(1) $(q(x), q(x, dy))$ is symmetrizable with respect to $\pi$.
(2) $\mathscr{U}_\lambda \cap L^1(\pi) = \{0\}$.
(3) Either $\inf\{P^{\min}(\lambda)1(x) : x \in E_0\} > 0$ or $\pi d < \infty$.
Proof: By Proposition 6.40, we need only consider the q-process having the form
$P(\lambda) = P^{\min}(\lambda) + X(\lambda)F(\lambda)$
and consider only the case that $\inf\{P^{\min}(\lambda)1(x) : x \in E_0\} > 0$. By Theorem 6.27 (4), we have
Hence
Letting $\mu \to \infty$, we get
$\lambda F(\lambda)I_A \ge F(\lambda)\Omega I_A$, $\lambda > 0$, $A \in \mathscr{E} \cap E_n$, $n \ge 1$.
Thus
$U(\lambda) := F(\lambda)(\lambda I - \Omega) \ge 0$ on $\mathscr{E} \cap E_n$, $n \ge 1$. (6.33)
Of course, for every X > 0 and z, U(X,x,.) can be extended uniquely as an element in p+.We use the same notation U(X) to denote the extension. By Lemma 2.34 (a), it follows that
Set
$V(\lambda) = F(\lambda) - U(\lambda)P^{\min}(\lambda) \ge 0$, $\lambda > 0$. (6.34)
Then by (6.33) and (6.34) we obtain, for each $x_0 \in E$, that
$V(\lambda)(\lambda I - \Omega)I_A(x_0) = 0$, $\lambda > 0$, $A \in \mathscr{E} \cap E_n$, $n \ge 1$.
That is
$V(\lambda)I_A(x_0) = V(\lambda)Q((\lambda + q)^{-1}I_A)(x_0)$, $\lambda > 0$, $A \in \mathscr{E}$ (6.35)
by the monotone class theorem. On the other hand, by symmetry, $\pi(A) = 0$ implies that
$0 \ge \int\pi(dx)\int P^{\min}(\lambda, x, dz)\,d(z)F(\lambda, z, A) \ge \int\pi(dx)\int P^{\min}(\lambda, x, dz)\,d(z)V(\lambda, z, A)$, $\lambda > 0$.
This shows that for all $x_0$ out of a $\pi$-null subset of $E_0$, say $E_0'$, we have
$V(\lambda, x_0, \cdot) \ll \pi$, $\lambda > 0$.
Define
Then
and so by (6.35), we have
That is
Combining the above facts with condition
(a), we see that
Note that
$(I_A P^{\min}(\lambda)(dI_{E_0'}), V(\lambda)1) \le (P^{\min}(\lambda)(dI_{E_0'}), V(\lambda)1) \le (P^{\min}(\lambda)(dI_{E_0'}), F(\lambda)1) \le (P^{\min}(\lambda)(dI_{E_0'}), 1)/\lambda = (I_{E_0'}, dP^{\min}(\lambda)1)/\lambda = 0$, $\lambda > 0$, $A \in \mathscr{E}$.
We arrive at
$P(\lambda) = P^{\min}(\lambda) + P^{\min}(\lambda)[dI]U(\lambda)P^{\min}(\lambda)$. (6.36)
Now, we start to use the condition
Since
$1 \ge \lambda F(\lambda)1 = \lambda U(\lambda)P^{\min}(\lambda)1 \ge c(\lambda)\lambda U(\lambda)1$,
we have $\lambda U(\lambda)1 \le c(\lambda)^{-1}$ for $\lambda > 0$ and so
$\mu^2 P^{\min}(\mu)U(\mu)1 \le c(\mu)^{-1}$, $\mu > 0$.
Besides, by Lemma 6.17, we have
$P^{\min}(\lambda)d \le 1$, $\lambda > 0$, $x \in E$.
We have got everything we need; the next step is to copy the proof of Theorem 3.26 starting from (3.42). □
The last result of this section is a uniqueness criterion for reversible q-processes.
Theorem 6.42. The reversible q-process is unique iff the following three conditions all hold.
(1) $(q(x), q(x, dy))$ is reversible with respect to $\pi$.
(2) $\pi d < \infty$.
(3) $\mathscr{U}_\lambda = \{0\}$.
Moreover, if (1) holds but (2) or (3) does not hold, then there are infinitely many reversible q-processes.
Proof: The sufficiency follows from Proposition 6.40. Clearly, condition (1) is necessary. We now assume (1).
a) If $\pi d = \infty$ or $\lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda = \infty$, then in the last step of the proof of Theorem 6.38, replacing $P(\lambda)$ with
we obtain infinitely many reversible q-processes with parameter $c \ge 0$.
b) Suppose that (3) does not hold and $\lim_{\lambda\to\infty}\lambda\pi\tilde u_\lambda < \infty$. Then by Lemma 6.11, Lemma 6.29 and Lemma 3.4, we have $\lambda\pi\tilde u_\lambda > 0$ for all $\lambda > 0$ and
6.7 BASICDIRICHLET FORM Setting I E =0,
qx = ~ [ i i x I ] , X > 0, A E 8
in Propositiori 6.28, we obtain w = lim X(fiA,'lL) x4w
< x+w lim XnG, <
00.
Moreover, by Lemma 6.23, we have
$\theta = \lambda\tilde\eta_\lambda u_0 < \infty$ independent of $\lambda > 0$.
Hence, we can take infinitely many constants $c$ satisfying (6.26) so that $P(\lambda)$ defined by (6.21) are Bq-processes. These processes are clearly reversible. Finally, since $\tilde\eta_\lambda(E) = \pi\tilde u_\lambda > 0$, we have actually obtained infinitely many reversible q-processes. □
6.7 Basic Dirichlet Form
In this and the next sections, we use the $L^2$-theory, more precisely, the Dirichlet forms, to study the symmetrizable jump processes. Thus, all processes considered in these sections are symmetrizable. Besides, we fix a $\sigma$-finite measure $\pi$ on $(E, \mathscr{E})$ and denote by $L^2(\pi)$ the space of all square integrable functions on $(E, \mathscr{E}, \pi)$ with inner product and norm
respectively. Occasionally, we also consider the $L^p(\pi)$ ($p \in [1, \infty]$) space; in that case, the $L^p$-norm is denoted by $\|\cdot\|_p$. The following result is our starting point, which is even more than what we need at the moment.
Lemma 6.43. Let $P(t, x, dy)$ be a jump process with an excessive measure $\pi$ (i.e., $\pi \ge \pi P(t)$ for all $t \ge 0$). Then $\{P(t)\}_{t \ge 0}$ can be extended uniquely to $L^p(\pi)$ ($p \in [1, \infty)$) as a strongly continuous contraction semigroup.
Proof: Note that
By the Hölder inequality and the excessive property, we have
$= \int\pi(dx)\,|f(x)|^p\,(1 - P(t, x, \{x\}))$.
Combining these facts with the jump condition and using the dominated convergence theorem, we obtain
This proves the strong continuity. As for the contraction, we have
From now on, we discuss only the case that $p = 2$. The above extension is again denoted by $\{P(t)\}_{t \ge 0}$. According to the ordinary semigroup theory, $\{P(t)\}_{t \ge 0}$ induces an infinitesimal generator $L$ as follows. If for some $g \in L^2(\pi)$,
$(P(t)f - f)/t - g \to 0$ in $L^2(\pi)$ as $t \downarrow 0$,
then we define $Lf = g$. Such functions $f$ constitute the domain of $L$, denoted by $\mathscr{D}(L)$. It turns out that the generator $L$ is self-adjoint (densely defined) on $L^2(\pi)$. However, the domain $\mathscr{D}(L)$ is usually quite poor. For instance, even for countable $E$, the simple indicator $I_{\{i\}}$ may not belong to the domain of $L$. Thus, it is worthwhile to find a weaker version of the generator $L$. Actually, one such version is at hand, that is the weakly infinitesimal generator $\bar L$. Instead of the strong convergence, we consider
for all $h \in L^2(\pi)$. Such $f \in L^2(\pi)$ constitute the weak domain $\mathscr{D}(\bar L)$. But this version is not very useful since in the $L^2$-context the weak and the strong topologies are quite close to each other.
Since the space $L^2(\pi)$ is fixed, we also call the semigroup $\{P(t)\}_{t \ge 0}$ symmetric on $L^2(\pi)$. Recall that in the finite dimensional case, every quadratic form can be represented as a sum of squares under an appropriate orthonormal basis. The representation simplifies greatly the classification of the quadratic surfaces. As a generalization to the infinite dimensional case, a spectral family $\{E_\alpha : \alpha \in \mathbb{R}\}$ on $L^2(\pi)$ is used instead of the orthonormal basis. More precisely, every self-adjoint non-negative definite operator $-L$ has uniquely a spectral representation as follows:
$-L = \int_0^\infty \alpha\,dE_\alpha$.
Furthermore, for every continuous $\varphi : [0, \infty) \to [0, \infty)$, the operator
$\varphi(-L) := \int_0^\infty \varphi(\alpha)\,dE_\alpha$
is again non-negative definite. In particular, we have the strongly continuous contraction semigroup ($P(t) = \exp(tL)$) and the resolvent ($P(\lambda) = (\lambda I - L)^{-1} : \lambda > 0$), respectively, as follows.
Note that in this general setup, the semigroup $P(t)$ (resp. the resolvent $P(\lambda)$) is not necessarily sub-Markovian, i.e., $0 \le P(t)f \le 1$ for $f \in L^2(\pi)$ with $0 \le f \le 1$. But in our particular situation, $P(t)$ is nothing but the semigroup induced by the transition function $P(t, x, dy)$ and $P(\lambda)$ is its Laplace transform. Set
For $f \in \mathscr{D}(\bar L)$, the limit $\lim_{t \downarrow 0} D_t(f, f)$ always exists and is finite. Furthermore, since
$\frac{1}{t}(1 - e^{-t\alpha}) = \frac{\alpha}{t}\int_0^t e^{-\alpha s}\,ds \uparrow \alpha$ as $t \downarrow 0$,
we have for all $f \in L^2(\pi)$,
Hence, we can define $D(f, f)$ as its limit with domain
$\mathscr{D}(D) := \{f \in L^2(\pi) : D(f, f) < \infty\} \supset \mathscr{D}(\bar L) \supset \mathscr{D}(L)$.
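The monotone limit defining $D(f, f)$ can be watched numerically on a finite state space. The following is a minimal sketch, assuming that the form set above is $D_t(f, f) = (f - P(t)f, f)/t$ and that $P(t) = \exp(tQ)$ for a conservative Q-matrix that is symmetric with respect to $\pi$; the matrix `Q`, the measure `PI`, the function `F` and the helper names are hypothetical choices made only for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, squarings=10, terms=20):
    """Approximate exp(m) by a Taylor series combined with scaling and squaring."""
    n = len(m)
    scale = 2.0 ** squarings
    small = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def d_t(q, pi, f, t):
    """Assumed form D_t(f, f) = (f - P(t) f, f)_pi / t with P(t) = exp(tQ)."""
    n = len(f)
    p_t = mat_exp([[t * x for x in row] for row in q])
    p_f = [sum(p_t[i][j] * f[j] for j in range(n)) for i in range(n)]
    return sum(pi[i] * f[i] * (f[i] - p_f[i]) for i in range(n)) / t


# Hypothetical conservative Q-matrix, symmetric with respect to PI:
# PI[i] * Q[i][j] == PI[j] * Q[j][i].
Q = [[-2.0, 2.0, 0.0],
     [1.0, -4.0, 3.0],
     [0.0, 6.0, -6.0]]
PI = [1.0, 2.0, 1.0]
F = [1.0, 0.0, 2.0]

# Limit value D(f, f) = -(Qf, f)_pi.
q_f = [sum(Q[i][j] * F[j] for j in range(3)) for i in range(3)]
d_limit = -sum(PI[i] * F[i] * q_f[i] for i in range(3))

# D_t(f, f) for decreasing t; the values should increase towards d_limit.
values = [d_t(Q, PI, F, t) for t in (1.0, 0.1, 0.01, 0.001)]
```

In this finite setting every $f$ belongs to $\mathscr{D}(L)$, so the sketch only illustrates the monotonicity in $t$, not the distinction between the three domains.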
Using the formula
we obtain a symmetric bilinear form on $\mathscr{D}(D)$. Note that
$D_t(f, f) = \frac{1}{2t}\int\pi(dx)\,[P(t)(f - f(x))^2](x) + \frac{1}{t}((1 - P(t)1), f^2)$. (6.37)
We have proved the following result.
Lemma 6.44. $(D, \mathscr{D}(D))$ defined above possesses the following properties.
(1) $\mathscr{D}(D)$ is a dense linear subspace of $L^2(\pi)$.
(2) $D$ is a symmetric and non-negative definite bilinear form on $\mathscr{D}(D) \times \mathscr{D}(D)$.
(3) $D$ is closed. That is, $\mathscr{D}(D)$ is complete with respect to the norm
(4) Let $f \in \mathscr{D}(D)$ and let $g \in L^2(\pi)$ be a normal contraction of $f$ in the sense that
then $g \in \mathscr{D}(D)$ and moreover $D(g, g) \le D(f, f)$.
Definition 6.45. We call $(D, \mathscr{D}(D))$ a Dirichlet form if it possesses the properties listed in Lemma 6.44.
As usual, there is one-to-one correspondence between the semigroup (resp. resolvent) and its generator. We are now going to prove that this one-toone correspondence also holds for the resolvent (resp. semigroup) and its Dirichlet form.
Lemma 6.46. A closed symmetric form $(D, \mathscr{D}(D))$ and its resolvent $P(\lambda)$ have the following relation:
Proof: a) Let $P(\lambda)$ be a strongly continuous resolvent with the form $(D, \mathscr{D}(D))$. Since $P(\lambda)(L^2(\pi)) \subset \mathscr{D}(L) \subset \mathscr{D}(D)$, $D(f, g) = -(Lf, g)$ for all $f \in \mathscr{D}(L)$ and $g \in L^2(\pi)$ and $\lambda P(\lambda)f - LP(\lambda)f = f$, we obtain the required property.
b) Conversely, let $(D, \mathscr{D}(D))$ be a closed symmetric form and define
Then $(D_\lambda, \mathscr{D}(D_\lambda))$ is a closed symmetric form for each $\lambda > 0$. Next, for fixed $\lambda > 0$ and $f \in L^2(\pi)$, since $g \to (f, g)$ is continuous in $\mathscr{D}(D_\lambda)$ with respect to the metric $D_\lambda$, by the Riesz representation theorem, there is uniquely a $P(\lambda)f \in \mathscr{D}(D)$ such that
Now, it is not difficult to show that $\{P(\lambda) : \lambda > 0\}$ obtained in this way is a strongly continuous contraction resolvent. Actually, since
the contraction property follows. To see the strong continuity, we need only show that $\mu P(\mu)f \to f$ as $\mu \to \infty$ for $f \in \mathscr{D}(D)$ by the denseness. Now,
which gives us the required assertion. □
Lemma 6.47. Let $(D, \mathscr{D}(D))$ be a Dirichlet form.
(1) If $f \in \mathscr{D}(D)$, then $|f|, f \wedge 1 \in \mathscr{D}(D)$.
(2) If $f, g \in \mathscr{D}(D)$, then $f \wedge g, f \vee g \in \mathscr{D}(D)$.
(3) $f \in \mathscr{D}(D)$ iff $f^\pm \in \mathscr{D}(D)$, where $f^+ = f \vee 0$ and $f^- = -(f \wedge 0)$.
(4) Let $f \in \mathscr{D}(D)$ and define $f_n = ((-n) \vee f) \wedge n$. Then $f_n \to f$ as $n \to \infty$ in the $D_1$-norm.
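Proof: By the normal contraction, $f \in \mathscr{D}(D)$ implies that $|f|, f \wedge 1 \in \mathscr{D}(D)$. Then
$f \vee g = \frac{1}{2}(f + g + |f - g|)$, $f \wedge g = \frac{1}{2}(f + g - |f - g|) \in \mathscr{D}(D)$,
and so $f^\pm \in \mathscr{D}(D)$ since $0 \in \mathscr{D}(D)$. Finally, since $f_n$ is a normal contraction of $f$, $D_{\lambda_0}(f_n, f_n)$ is uniformly bounded by $D_{\lambda_0}(f, f)$ for fixed $\lambda_0 > 0$. Moreover, by the previous lemma,
$D_{\lambda_0}(f_n, P(\lambda_0)g) = (f_n, g) \to (f, g) = D_{\lambda_0}(f, P(\lambda_0)g)$, $n \to \infty$, $g \in L^2(\pi)$.
Since $P(\lambda_0)(L^2(\pi))$ is dense in $\mathscr{D}(D)$ with metric $D_{\lambda_0}$, $f_n$ converges weakly to $f$ in $D_{\lambda_0}$:
we have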
Given a self-adjoint operator, the corresponding semigroup may not be sub-Markovian as we mentioned before. Because of this, we need to study the operator more carefully.
Lemma 6.48. The generator $L$ of a sub-Markovian symmetric semigroup $\{P(t)\}$ on $L^2(\pi)$ satisfies
$(Lf, (f - 1) \vee 0) \le 0$, $f \in \mathscr{D}(L)$. (6.38)
Conversely, every linear operator $(L, \mathscr{D}(L))$ satisfying (6.38) should have the property: $(Lf, f) \le 0$ for all $f \in \mathscr{D}(L)$.
Proof: Note that for each $f \in L^2(\pi)$, we have $f \wedge 1 \in L^2(\pi)$. So $(f - 1)^+ = f - f \wedge 1 \in L^2(\pi)$. Rewrite
$(P(t)f, (f - 1)^+) = (P(t)[(f - 1)^+ + f \wedge 1], (f - 1)^+)$.
Then, (6.38) follows from
$(P(t)[(f - 1)^+], (f - 1)^+) \le \|(f - 1)^+\|^2 = (f - 1, (f - 1)^+)$
and $(P(t)(f \wedge 1), (f - 1)^+) \le (1, (f - 1)^+)$. To see the last assertion, note that
$(Lf, (f - \frac{1}{n})^+) = (L(nf), (nf - 1)^+)/n^2 \le 0$, $n \ge 1$.
Hence $(Lf, f^+) \le 0$. Replacing $f$ with $-f$, we obtain also $-(Lf, f^-) \le 0$ and so $(Lf, f) \le 0$. □
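Inequality (6.38) and its consequence $(Lf, f) \le 0$ can be checked by brute force on a small state space. The sketch below is only an illustration under stated assumptions: the jump rates `RATES`, the killing rates `KILL` and the symmetrizing measure `PI` are hypothetical, and the generator is taken to be the matrix with diagonal entries $-q(x)$, where $q(x)$ is the total jump rate plus the killing rate.

```python
from itertools import product

# Hypothetical symmetric q-pair on three states: off-diagonal jump rates,
# killing rates, and a measure with PI[i] * RATES[i][j] == PI[j] * RATES[j][i].
RATES = [[0, 2, 0],
         [1, 0, 3],
         [0, 6, 0]]
KILL = [1, 0, 2]
PI = [1, 2, 1]


def generator(rates, kill):
    """Build Q with Q[i][i] = -(sum of jump rates out of i) - kill[i]."""
    q = [row[:] for row in rates]
    for i in range(len(rates)):
        q[i][i] = -sum(rates[i]) - kill[i]
    return q


def inner(pi, g, h):
    """Inner product (g, h) in L^2(pi)."""
    return sum(p * x * y for p, x, y in zip(pi, g, h))


def apply_q(q, f):
    """Matrix-vector product Qf."""
    return [sum(qij * fj for qij, fj in zip(row, f)) for row in q]


Q = generator(RATES, KILL)
grid = [-2, 0, 1, 2, 5]

# Largest value of (Lf, (f - 1)^+) over all f taking values in the grid.
worst_638 = max(inner(PI, apply_q(Q, f), [max(x - 1, 0) for x in f])
                for f in product(grid, repeat=3))
# Largest value of (Lf, f) over the same functions.
worst_form = max(inner(PI, apply_q(Q, f), f) for f in product(grid, repeat=3))
```

Both maxima are attained at functions for which the pairing vanishes (for instance $f \le 1$ in the first case and $f = 0$ in the second), so neither quantity becomes positive on this grid.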
Definition 6.49. A self-adjoint operator $L$ on $L^2(\pi)$ is called a Dirichlet operator if (6.38) holds.
Having the above facts in mind, we now prove the following fundamental result about the sub-Markovian symmetric semigroups, their Dirichlet forms and operators.
Theorem 6.50.
(1) An operator $L$ on $L^2(\pi)$ is a Dirichlet operator iff it is the generator of a sub-Markovian symmetric semigroup $\{P(t)\}_{t \ge 0}$.
(2) Let $(D, \mathscr{D}(D))$ be a Dirichlet form. Then the generator $L$ defined by
$\mathscr{D}(L) = \{f \in \mathscr{D}(D)$ : there exists $g \in L^2(\pi)$ such that $D(f, h) = (g, h)$ for all $h \in \mathscr{D}(D)\}$
and
$(Lf, h) = -D(f, h)$ for all $h \in \mathscr{D}(D)$ and $f \in \mathscr{D}(L)$
is a Dirichlet operator.
(3) Let $L$ be a Dirichlet operator. Then the form defined by
$\mathscr{D}(D) = \mathscr{D}((-L)^{1/2})$ and $D(f, f) = \|(-L)^{1/2}f\|^2$, $f \in \mathscr{D}(D)$
is a Dirichlet form of generator $L$ (in the sense of (2)).
Proof: a) Let $L$ be a Dirichlet operator. Then it determines uniquely a strongly continuous, symmetric contraction resolvent $P(\lambda)$ (cf. Fukushima (1980), §1.3). We prove that $P(\lambda)$ is sub-Markovian. Given $f \in L^2(\pi)$, set $g = \lambda P(\lambda)f$. Then $g \in \mathscr{D}(L)$ and $\lambda g - Lg = \lambda f$. If $f \le 1$, then
so
This proves that $g \le 1$. Similarly, if $f \ge 0$, then $-nf \le 1$ for all $n \ge 1$. Hence we have $-ng \le 1$ for all $n$ and so $g \ge 0$. Combining this with Lemma 6.48, we obtain (1).
b) Let $(D, \mathscr{D}(D))$ be a Dirichlet form and $L$ be its generator. Given $f \in \mathscr{D}(L)$, since $f^{(\varepsilon)} := f + \varepsilon(f - 1)^+ \in \mathscr{D}(D)$ and $f$ is a normal contraction of $f^{(\varepsilon)}$, we have $D(f, f) \le D(f^{(\varepsilon)}, f^{(\varepsilon)})$. That is
$-2D(f, (f - 1)^+) \le \varepsilon D((f - 1)^+, (f - 1)^+)$.
Thus
$(Lf, (f - 1)^+) = -D(f, (f - 1)^+) \le \frac{\varepsilon}{2}D((f - 1)^+, (f - 1)^+)$,
and so $(Lf, (f - 1)^+) \le 0$ follows by letting $\varepsilon \to 0$. We have thus proved (2).
c) Let $L$ be a Dirichlet operator and let $P(t)$ be the semigroup obtained by (1). Because $\mathscr{D}(D) = \mathscr{D}((-L)^{1/2})$, we have
$\mathscr{D}(D) = \{f \in L^2(\pi) : \lim_{t \to 0} D_t(f, f) < \infty\}$ and $\|(-L)^{1/2}f\|^2 = \lim_{t \to 0} D_t(f, f)$.
It suffices to show that for each $f \in L^2(\pi)$ and $g$, a normal contraction of $f$,
The assertion now follows from (6.37). □
For a given symmetrizable q-pair, the corresponding Dirichlet forms are not necessarily unique since the symmetrizable q-processes may not be unique in general. In this sense, our previous work on the construction of symmetrizable q-processes is just a construction of Dirichlet forms. The main purpose of this section is to prove that among the Dirichlet forms there is only one which is completely explicit. To do so, let us return to (6.37). From now on, we will often use the following notation:
Since $\lim_{t \to 0} P(t, x, A \cap E_n)/t = q(x, A \cap E_n)$ for all $A \in \mathscr{E}$, by Theorem 1.14 (4), it follows that
Letting $n \to \infty$, we obtain
(6.39)
where $\pi_q$ is a bilinear form
On the other hand, by Proposition 6.25, we have
$\lim_{t \to 0}\frac{1}{t}((1 - P(t)1), f^2) \ge (d, f^2) =: \pi_d(f^2)$. (6.40)
Hence, it is natural to ask when (6.39) and (6.40) become equalities. This leads us to study the bilinear form
with domain
$\mathscr{D}(D^*) = \{f \in L^2(\pi) : D^*(f, f) < \infty\}$.
Clearly, we have
Lemma 6.51. The measures $\pi_q$ and $\pi_d$ defined above are $\sigma$-finite. Moreover, $\pi_q$ is symmetric: $\pi_q(dx, dy) = \pi_q(dy, dx)$.
Lemma 6.52. $(D^*, \mathscr{D}(D^*))$ defined above is a Dirichlet form.
Proof: The proof of this lemma is quite standard. Conditions (2) and (4) of Lemma 6.44 are obvious. To show the denseness, let $f \in L^2(\pi)$. Without loss of generality, we may and will assume that $|f| < \infty$ everywhere. Choose $\{B_m\}_1^\infty \subset \mathscr{E}$ such that $\pi(B_m) < \infty$ and
$K_m := \sup\{q(x) \vee |f(x)| : x \in B_m\} < \infty$.
Then $f_m := fI_{B_m} \to f$ in $L^2(\pi)$ and by Lemma 6.51, we have
Therefore, $\mathscr{D}(D^*) \ni f_m \to f$ in $L^2(\pi)$. $\mathscr{D}(D^*)$ is clearly a linear space and so condition (1) of Lemma 6.44 holds. We now check condition (3) of Lemma 6.44. Let $(f_n)$ be a $D_1$-Cauchy sequence. Then $(f_n)$ is an $L^2(\pi)$-Cauchy sequence and so there exists $(f_{n_k}) \subset (f_n)$ such that $f_{n_k} \to f \in L^2(\pi)$. Moreover,
The right-hand side can be arbitrarily small for large enough $k$. □
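On a finite state space the basic form can be written out and compared with the generator directly. The following sketch assumes that the basic form is $D^*(f, f) = \frac{1}{2}\int\pi_q(dx, dy)(f(y) - f(x))^2 + \pi_d(f^2)$ with $\pi_q(dx, dy) = \pi(dx)q(x, dy)$ and $\pi_d(dx) = \pi(dx)d(x)$, and that $\Omega f(x) = \int q(x, dy)f(y) - q(x)f(x)$; the rates, killing rates and measure are hypothetical.

```python
def basic_form(pi, rates, kill, f):
    """Assumed D*(f, f): half the pi_q-weighted squared increments plus pi_d(f^2)."""
    n = len(f)
    jump_part = sum(pi[i] * rates[i][j] * (f[j] - f[i]) ** 2
                    for i in range(n) for j in range(n) if i != j) / 2
    killing_part = sum(pi[i] * kill[i] * f[i] ** 2 for i in range(n))
    return jump_part + killing_part


def generator_form(pi, rates, kill, f):
    """-(Omega f, f)_pi with Omega f(x) = sum_y q(x, y) f(y) - q(x) f(x)."""
    n = len(f)
    total = 0
    for i in range(n):
        q_i = sum(rates[i][j] for j in range(n) if j != i) + kill[i]
        omega_f = sum(rates[i][j] * f[j] for j in range(n) if j != i) - q_i * f[i]
        total += pi[i] * f[i] * omega_f
    return -total


# Hypothetical symmetric q-pair: PI[i] * RATES[i][j] == PI[j] * RATES[j][i].
RATES = [[0, 2, 0],
         [1, 0, 3],
         [0, 6, 0]]
KILL = [1, 0, 2]   # non-conservative part d(x) = q(x) - q(x, E)
PI = [1, 2, 1]
f = [1, 0, 2]

lhs = basic_form(PI, RATES, KILL, f)
rhs = generator_form(PI, RATES, KILL, f)
```

The two numbers agree because the symmetry of $\pi_q$ lets the cross terms be rearranged into squared increments; with infinitely many states the same computation is only formal unless the sums converge.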
Definition 6.53. We call $(D^*, \mathscr{D}(D^*))$ the basic Dirichlet form.
Because of the one-to-one correspondence between the Dirichlet forms and the semigroups, the Dirichlet form $D^*$ determines uniquely a strongly continuous, self-adjoint and sub-Markovian contraction semigroup $\{P(t)\}_{t \ge 0}$. A remaining question is whether it is a q-process or not. The answer is affirmative. To show this, we need more work. Recall that the Eq. $(B_\lambda)$ plays a crucial role in the study of jump processes. In the present situation, an analogue is as follows:
Lemma 6.54. Let $(q(x), q(x, dy))$ be a symmetric q-pair and $\{P(\lambda) : \lambda > 0\}$ be a symmetric resolvent on $L^2(\pi)$. Suppose that
$q(x, \cdot) \ll \pi$ (6.42)
and Eq. (6.41) holds. Then $(B_\lambda)$ holds, $\pi$-a.e.
Proof: Fix $f = I_A \in L^2(\pi)$ and let $g \in L^2(q(x)\pi(dx)) \cap L^2(\pi)$. Then $g \in \mathscr{D}(D^*)$ and
By (6.41), we have
Clearly, this equality can be extended to all $g \in L_+^2(\pi)$. Replacing $g$ with $(\lambda + q)^{-1}g$, we get
$(P(\lambda)f, g) = (QP(\lambda)f + f, (\lambda + q)^{-1}g) = (\Pi(\lambda)P(\lambda)f + (\lambda + q)^{-1}f, g)$, $g \in L_+^2(\pi)$.
Thus, for each $A$,
$P(\lambda)I_A \ge 0$, $\lambda P(\lambda)1 \le 1$ and $P(\lambda)I_A = \Pi(\lambda)P(\lambda)I_A + (\lambda + q)^{-1}I_A$ (6.43)
hold almost everywhere. The exceptional set depends on $\lambda$ and $A$. Since $(E, \mathscr{E})$ is separable, we can choose an exceptional set depending on $\lambda$ only. On the other hand, the strong continuity means that $(P(\lambda))$ is determined by a dense set of $\lambda > 0$. By using this and (6.42), the exceptional set can be chosen so that it is also independent of $\lambda > 0$. Now, denote the $\pi$-null set by $N$. Then (6.43) holds for all $x \notin N$, $\lambda > 0$ and $A \in \mathscr{E}$. □
Theorem 6.55. Suppose that $(E, \mathscr{E})$ is locally compact and (6.42) holds. Then $(D^*, \mathscr{D}(D^*))$ determines uniquely a symmetric q-process on $L^2(\pi)$.
Proof: By the construction of resolvent $(P^*(\lambda))$ from the Dirichlet form (cf. Fukushima (1980), Theorem 1.3.1), we have
$P^*(\lambda)(L^2(\pi)) \subset \mathscr{D}(D^*)$,
$D^*(P^*(\lambda)f, g) + \lambda(P^*(\lambda)f, g) = (f, g)$, $f \in L^2(\pi)$, $g \in \mathscr{D}(D^*)$.
So the proof of Lemma 6.54 tells us
$P^*(\lambda)I_A(x) \ge 0$, $\lambda P^*(\lambda)1(x) \le 1$ and
$P^*(\lambda)I_A(x) = \Pi(\lambda)P^*(\lambda)I_A(x) + (\lambda + q(x))^{-1}I_A(x)$, $x \notin N$, $\lambda > 0$, $A \in \mathscr{E}$, (6.44)
where $N$ is a $\pi$-null set independent of $\lambda$ and $A$. Letting $\lambda \to \infty$ in the equation in (6.44), we see that the jump condition holds for all $x \notin N$. Using the equation again, it is easy to check that the q-condition also holds for all $x \notin N$. Finally, the local compactness enables us to find a kernel $P(\lambda, x, dy)$ so that $P(\lambda, \cdot, A) = P(\lambda)I_A$. □

6.8 Regularity, Extension and Uniqueness
We have seen from the last section that the basic Dirichlet form corresponds to a symmetric q-process on $L^2(\pi)$. On the other hand, we have proved that there is the minimal symmetric q-process, which then corresponds to a Dirichlet form, denoted by $(D^{\min}, \mathscr{D}(D^{\min}))$. It is interesting that the two Dirichlet forms constructed by different approaches are not the same in general.
Proposition 6.56. Let $(q(x), q(x, dy))$ be conservative and symmetric on $L^2(\pi)$. Suppose that $\dim\mathscr{U} \le 1$.
(1) If $\dim\mathscr{U} \overset{\pi}{=} 0$ or $\dim\mathscr{U} \overset{\pi}{\ne} 0$ but $\pi z_\lambda = \infty$, then there is only one Dirichlet form: $D^{\min} = D^*$.
(2) If $\dim\mathscr{U} \overset{\pi}{\ne} 0$ and $\pi z_\lambda < \infty$, then there are infinitely many Dirichlet forms having the same representation:
where $c \ge 0$. Moreover, if $c_1 \ge c_2 \ge 0$, then
$D^{\min}(f, f) \ge D^{c_1}(f, f) \ge D^{c_2}(f, f) \ge D^*(f, f)$
for all $f \in \mathscr{D}(D^{\min}) \subset \mathscr{D}(D^{c_1}) \subset \mathscr{D}(D^{c_2}) \subset \mathscr{D}(D^*)$.
(3) In any case, there is at most one honest symmetric q-process.
Proof: The case of $\dim\mathscr{U} \overset{\pi}{=} 0$ was treated in Proposition 6.13. We now assume that $\dim\mathscr{U} \overset{\pi}{\ne} 0$, equivalently, $\pi z_\lambda > 0$. By Theorem 2.45, all q-processes
where $\{\eta_\lambda : \lambda > 0\}$ is a consistent family of measures and $c \ge 0$ is a constant. Now, by symmetry, we have
$\pi(z_\lambda I_A)\eta_\lambda(B) = \pi(z_\lambda I_B)\eta_\lambda(A)$ for all $A, B \in \mathscr{E}$.
In particular, we have $\eta_\lambda \equiv 0$ if $\pi z_\lambda = \infty$ since $\eta_\lambda$ is a finite measure and $z_\lambda \le 1$. Otherwise, we should have $\eta_\lambda = c(\lambda)\pi[z_\lambda I]$, where $c(\lambda)$ is a constant depending on $\lambda$ only. Then, by using consistency of $\{\eta_\lambda : \lambda > 0\}$ and $\{\pi[z_\lambda I] : \lambda > 0\}$, it follows that $c(\lambda)$ is actually independent of $\lambda > 0$. Now, the Dirichlet form can be computed by using the formula
$D^c(f, f) = \lim_{\lambda\to\infty}\lambda(f - \lambda P^c(\lambda)f, f)$.
Finally, it is obvious that the symmetric q-process corresponding to $D^c$ is honest iff $c = 0$. □
For birth-death processes, we have $\dim\mathscr{U} \le 1$. The Q-matrix is symmetrizable with respect to
$\pi_0 = 1$, $\pi_i = \dfrac{b_0 b_1 \cdots b_{i-1}}{a_1 a_2 \cdots a_i}$, $i \ge 1$,
and hence Proposition 6.56 is applicable to this case.
In view of the last result, we guess that $D^{\min}$ is the minimal Dirichlet form in some sense. We now discuss this problem.
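The symmetrizing measure of a birth-death Q-matrix quoted above is easy to compute and to verify on a finite truncation. The sketch below assumes birth rates $b_i$ (from $i$ to $i+1$) and death rates $a_i$ (from $i$ to $i-1$); the numerical rates and the function names are hypothetical, and exact rational arithmetic is used so that detailed balance can be tested with equality.

```python
from fractions import Fraction


def birth_death_measure(b, a):
    """Return [pi_0, ..., pi_n] with pi_0 = 1 and pi_i = b_0...b_{i-1} / (a_1...a_i).

    b[i] is the birth rate at state i (i = 0..n-1);
    a[i] is the death rate at state i + 1, so a[0] plays the role of a_1.
    """
    pi = [Fraction(1)]
    for b_prev, a_next in zip(b, a):
        pi.append(pi[-1] * Fraction(b_prev) / Fraction(a_next))
    return pi


def is_symmetrizing(pi, b, a):
    """Check detailed balance pi_i * b_i == pi_{i+1} * a_{i+1} on every edge."""
    return all(pi[i] * b[i] == pi[i + 1] * a[i] for i in range(len(b)))


# Hypothetical rates on the states {0, 1, 2, 3, 4}.
births = [2, 3, 1, 4]   # b_0, b_1, b_2, b_3
deaths = [1, 5, 2, 3]   # a_1, a_2, a_3, a_4
pi = birth_death_measure(births, deaths)
balanced = is_symmetrizing(pi, births, deaths)
```

Since a birth-death Q-matrix only has nearest-neighbour jumps, detailed balance on the edges $(i, i+1)$ is all that symmetrizability requires.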
Lemma 6.57. Let $P(\lambda, x, dy)$ be a symmetric q-process on $L^2(\pi)$ with Dirichlet form $(D, \mathscr{D}(D))$. Then we have
$\mathscr{D}(D^{\min}) \subset \mathscr{D}(D)$ (6.45)
and
$D^{\min}(f, f) \ge D(f, f)$, $f \in \mathscr{D}_+(D^{\min})$. (6.46)
Proof: By the monotone class theorem and the minimal property of $P^{\min}(t)$, we have
$P(t)f \ge P^{\min}(t)f$, $t \ge 0$, $f \in L_+^2(\pi)$
and
$(P(t)f, g) \ge (P^{\min}(t)f, g)$, $t \ge 0$, $f, g \in L_+^2(\pi)$.
Hence
$D(f, f) = \lim_{t \downarrow 0}\frac{1}{t}(f - P(t)f, f) \le \lim_{t \downarrow 0}\frac{1}{t}(f - P^{\min}(t)f, f) = D^{\min}(f, f)$, $f \in \mathscr{D}_+(D^{\min})$.
Thus, we have proved (6.46) and $\mathscr{D}_+(D^{\min}) \subset \mathscr{D}_+(D)$. Now (6.45) follows immediately because of the basic property for Dirichlet form:
$f \in \mathscr{D}(D) \iff f^\pm \in \mathscr{D}(D)$. □
Let $\Omega_0$ be the restriction of $\Omega$ to $\mathscr{K}$:
$\mathscr{K} := \{f \in L^\infty(\pi)$ : there exists $n$ such that $[x : f(x) \ne 0] \subset E_n\}$.
Since $\Omega_0$ is a well-defined, linear, symmetric and non-positive definite operator with $\mathscr{K}$ being dense in $L^2(\pi)$, a usual procedure gives us the smallest extension $D^0$ of
This extension is often called the Friedrichs extension of $\Omega_0$. Denote by the generator corresponding to $D^0$. But we still need to show that $D^0$ is a Dirichlet form. For this, it suffices to note that
Definition 6.58. We call $\mathscr{C} \subset \mathscr{D}(D)$ a core of $D$ if $\mathscr{C}$ is dense in $\mathscr{D}(D)$ with respect to the $D_1$-norm. A Dirichlet form $D$ is called regular if $\mathscr{K}$ is a core of $D$.
Proposition 6.59. $D^0 = D^{\min}$ and is regular.
Proof: We have just proved that $D^0$ is a Dirichlet form and hence the same proof given in Theorem 6.55 gives us a symmetric q-process on $L^2(\pi)$. So by Lemma 6.57, we have $\mathscr{D}(D^{\min}) \subset \mathscr{D}(D^0)$. On the other hand, since $D^0 = D^{\min}$ on $\mathscr{K}$, the minimal property of $D^0$ gives us $\mathscr{D}(D^0) \subset \mathscr{D}(D^{\min})$. Therefore, we have $\mathscr{D}(D^0) = \mathscr{D}(D^{\min})$ and $D^0 = D^{\min}$. □
Definition 6.60. A Dirichlet form $D$ is called an extension of $D^{\min}$ if $\mathscr{D}(D^{\min}) \subset \mathscr{D}(D)$ and for every $f \in \mathscr{D}(D^{\min})$, $D(f, f) = D^{\min}(f, f)$.
Now, we can state our main result which shows that $D^*$ is an extension of $D^{\min}$ and is indeed the maximal one in some sense.
Theorem 6.61. Suppose that $(E, \mathscr{E})$ is locally compact and (6.42) holds.
(1) Any symmetric q-process on $L^2(\pi)$ is an extension of $P^{\min}(\lambda)$.
(2) For any Bq-process on $L^2(\pi)$ with Dirichlet form $(D, \mathscr{D}(D))$, we have
$\mathscr{K} \subset \mathscr{D}(D^{\min}) \subset \mathscr{D}(D) \subset \mathscr{D}(D^*)$ (6.47)
and
In other words, the Dirichlet form $D^*$ is the maximal one among the Dirichlet forms corresponding to the Bq-processes.
(3) The Bq-process is unique iff $\mathscr{D}(D^{\min}) = \mathscr{D}(D^*)$. Equivalently,
and
$\lim_{\lambda\to\infty}\lambda((1 - \lambda P^{\min}(\lambda)1), f^2) = \pi_d(f^2)$ (6.50)
hold for all $f \in \mathscr{D}(D^*) \cap L^\infty(\pi)$.
Proof: a) Let $P(t)$ be symmetric on $L^2(\pi)$ with Dirichlet form $(D, \mathscr{D}(D))$. For $f \in \mathscr{K}$, assume that $f = 0$ on $E_n^c$ for some $n$. Then, by Theorem 1.14, we have
This proves that $D$ coincides with $D^0$ on $\mathscr{K}$ and so $D$ coincides with $D^0$ on $\mathscr{D}(D^0)$. By Proposition 6.59, we have thus proved assertion (1).
b) Let $P(\lambda)$ be a symmetric Bq-process on $L^2(\pi)$. Then, by Proposition 6.25, we have
$\lim_{t \to 0}[1 - P(t, x, E)]/t = d(x)$, $x \in E$.
So
$\lim_{t \to 0}\frac{1}{t}((1 - P(t)1), f^2) \ge \pi(df^2)$.
Combining this with (6.39), we obtain $D(f, f) \ge D^*(f, f)$ and so $\mathscr{D}(D) \subset \mathscr{D}(D^*)$. Now since for $D^*$, the correspondent $P^*(\lambda)$ is a Bq-process, assertion (2) follows from the above fact plus a).
c) The first assertion in (3) is a combination of a) and b). Clearly, if the Bq-process is unique, then (6.49) and (6.50) hold for all $f \in \mathscr{D}(D^*)$. Conversely, if (6.49) and (6.50) hold for all $f \in \mathscr{D}(D^*) \cap L^\infty(\pi)$, then $\mathscr{D}(D^*) \cap L^\infty(\pi) \subset \mathscr{D}(D^{\min})$, and hence $\mathscr{D}(D^*) \subset \mathscr{D}(D^{\min})$. □
Corollary 6.62. Let $(q(x), q(x, dy))$ be a conservative symmetric q-pair on $L^2(\pi)$. Then the symmetric q-process is unique iff the basic Dirichlet form is regular, i.e., $\mathscr{D}(D^{\min}) = \mathscr{D}(D^*)$. Equivalently, (6.49) and (6.50) hold. In this case, the unique Dirichlet form is just the basic one.
One may ask whether the basic Dirichlet form is the maximal one in all cases. The answer is negative.
Example 6.63. Take $E = \{0, 1, 2, \ldots\}$ and $q_{ij} = 0$ for all $i \ne j$, $\sum_i 1/q_i < \infty$. Then $P_{ij}^{\min}(\lambda) = \delta_{ij}/(\lambda + q_i)$. On the other hand,
$P_{ij}(\lambda) := P_{ij}^{\min}(\lambda) + \dfrac{q_i}{(\lambda + q_i)(\lambda + q_j)}\Big[\lambda\sum_k\dfrac{1}{\lambda + q_k}\Big]^{-1}$ (6.51)
is also a Q-process. The latter one is honest and symmetrizable with respect to the measure $\pi_i = 1/q_i$. For this Q-matrix, we have $\mathscr{D}(D^{\min}) = \mathscr{D}(D^*) \subset \mathscr{D}(D)$ but $\mathscr{D}(D^*) \ne \mathscr{D}(D)$.
Proof: Obviously, we have
Since
we have $D^{\min} = D^*$ and $\mathscr{D}(D^{\min}) = \mathscr{D}(D^*)$. On the other hand,
Let $f \in \mathscr{D}(D^*)$, then
Thus $\mathscr{D}(D^*) \subset \mathscr{D}(D)$. But $1 \in \mathscr{D}(D)$, $1 \notin \mathscr{D}(D^*)$ and so $\mathscr{D}(D^*) \ne \mathscr{D}(D)$. □
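The two resolvents of Example 6.63 can be compared numerically on a finite truncation. The sketch below is an illustration under an explicit assumption: the second resolvent is taken to be $P_{ij}(\lambda) = \delta_{ij}/(\lambda + q_i) + q_i\big/\big[(\lambda + q_i)(\lambda + q_j)\,\lambda\sum_k(\lambda + q_k)^{-1}\big]$, which is the rank-one perturbation of $P^{\min}(\lambda)$ that is both honest and symmetric with respect to $\pi_i = 1/q_i$. The rates $q_k = (k+1)^2$, the truncation to six states and the function name are hypothetical.

```python
def example_resolvents(q, lam):
    """Return (P_min, P) as nested lists for killing rates q and parameter lam > 0."""
    n = len(q)
    # Normalising constant lam * sum_k 1 / (lam + q_k); finite when sum 1/q_k < infinity.
    c = lam * sum(1.0 / (lam + qk) for qk in q)
    p_min = [[(1.0 / (lam + q[i]) if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    p = [[p_min[i][j] + q[i] / ((lam + q[i]) * (lam + q[j]) * c) for j in range(n)]
         for i in range(n)]
    return p_min, p


q = [float((k + 1) ** 2) for k in range(6)]
lam = 0.7
p_min, p = example_resolvents(q, lam)

# lam * P(lam) 1: strictly below 1 for the minimal process, equal to 1 for the other.
row_mass_min = [lam * sum(row) for row in p_min]
row_mass = [lam * sum(row) for row in p]

# Symmetry defect max |pi_i P_ij - pi_j P_ji| with pi_i = 1 / q_i.
pi = [1.0 / qk for qk in q]
asymmetry = max(abs(pi[i] * p[i][j] - pi[j] * p[j][i])
                for i in range(len(q)) for j in range(len(q)))
```

The truncation only checks the algebra of honesty and symmetry; the domain statement of the example, namely that the constant function 1 lies in $\mathscr{D}(D)$ but not in $\mathscr{D}(D^*)$, genuinely needs infinitely many states with $\sum_i 1/q_i < \infty$.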
In contrast to the smallest extension; there is a Krein extension. It is known that the Krein extension is so large that it is sometimes even not sub-Markovian (cf, Fukushima (1980), Theorem 2.3.2). But we would like to point out here that the Krcin extension is so~netirriesnot large enough since t.here are some Dirichlet forms with larger domain than the Krein's. To show this, let us return to the above exitlnpk. In this c u e : it, is easy to check tha.t .N, = ( 0 ) (see Fukushima (1980), (2.3.5) for notat.ion), hence DrDin = D' = D K ( t h e Krein extension): and so 9 ( D K ) c 9 ( D ) but g ' ( D K )# 9(D) The main reason is that the symmetric q-process (Pij(X))given by (6.51) does not sat'isfy the equation ( B A ) . More precisely, the non-conservative part of the Martin boundary is not considered by the Krein extension. 6,9 Notes The study of symmetrizable Markov chains was begun by Kolmogorov (1936a). He considered the Markov chains with finite state space and discrete time-parameters. Now, there are several books on this subject, both for Markov chains and for general Markov processes. See E'ukushima (19801, Kelly (1979), Silverstein (1976) and Qian and Hou (1979), for instance. A lot of progress has been made in t.he past years, refer to Fukushima, Oshima and Takeda (1994)! Ma and Rijckner (1992).
6.9 NOTES
271
This chapter presents a general theory for symmetrizable jump processes. However, the results on this class of jump processes are still not as complete as those on the general jump processes. The uniqueness criterion for honest reversible q-processes remains open. This problem is certainly important in practice. The first three sections are taken from Chen (1980), where the reversible cases were studied, some generalization was due to Zheng (1981, 1983a). But the second sufficient condition of Proposition 6.13 was appeared in Qian (1978) . For Section 6.3, we refer to Yang (1981). Theorem 6.21 is due to Reuter (1957). The proof adopted here is taken from Zheng (1981). For Markov chains, the existence criterion for honest reversible Q-processes was obtained by Hou, Guo and Chen (1979) in the conservative case, and then in the non-conservative case by Chen and Zhang (1984). The generalization, including Lemma 6.33, was appeared in Chen (1986b). The uniqueness for reversible Q-processes in the conservative case was also appeared in Hou, Guo and Chen (1979). Some results for non-conservative case were obtained by Hou and Chen (1980). Theorem 6.42 was appeared in Chen (1986b), for which, the author benefited from Y. L. Dai, R. H. Ouyan and H. J. Zhang. Theorem 6.41 are taken from Chen (1991~). Theorem 6.50 is well known in the theory of Dirichlet forms (cf. Fukushima (1980)) under the local compactness hypothesis, which is removed by Bouleau and Hirsch (1986). For Markov chains, Proposition 6.59 was proved by Silverstein (1974), Proposition 6.56 was appeared in Hou and Chen (1980). The last result shows that even the general q-processes are not unique but the honest and symmetric one can still be unique. This indicates the difficulty to obtaining a criterion for honest and symmetric q-processes. The remainder of Section 6.7 and 6.8 are taken from Chen (1991~). 
One topic in the study of jump processes, not included in the book, is to construct all q-processes for a given q-pair. For the symmetrizable case, refer to Chen (1982), Chen (1986b) and Zheng (1981, 1983a).
Chapter 7
Field Theory

In the study of symmetrizable Q-processes, a new question arises: for a given Q-matrix, how can we decide whether it is symmetrizable or not? If so, how can we find a symmetrizing measure? Next, for a given regular symmetrizable Q-matrix, it is easy to know whether the corresponding Q-process is positive recurrent or not, but it is not so easy to understand the null recurrence or transience of the process. These questions are discussed in this chapter. Our main tool is the field theory, which will also be used in Chapters 11 and 14.

7.1 Field Theory

Let $E$ be an arbitrary set, $T$ be an index set, and let $a: T \times E \times E \to \mathbb{R}_+ := [0, +\infty)$ satisfy

Hypotheses 7.1.
(1) For each $t \in T$ and $x, \tilde x \in E$, $x \ne \tilde x$, $0 \le a(t, x, \tilde x) < \infty$.
(2) Co-zero property. $a(t, x, \tilde x) = 0$ iff $a(t, \tilde x, x) = 0$.
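Definition 7.2. For given $x, \tilde x \in E$, $x \ne \tilde x$, $\tilde x$ is called reachable directly from $x$ at time $t$, denoted by $x \xrightarrow{t} \tilde x$, if $a(t, x, \tilde x) > 0$; $\tilde x$ is called reachable from $x$ at time $t$, denoted by $x \overset{t}{\leadsto} \tilde x$, if there are $x^{(1)}, x^{(2)}, \dots, x^{(n)}$ in $E$ such that
$$x \xrightarrow{t} x^{(1)} \xrightarrow{t} x^{(2)} \xrightarrow{t} \cdots \xrightarrow{t} x^{(n)} \xrightarrow{t} \tilde x.$$
Then $L(t) := (x, x^{(1)}, \dots, x^{(n)}, \tilde x)$ is called a path from $x$ to $\tilde x$ at time $t$.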
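Let $\mathscr{A}(t) := \{a(t, x, \tilde x) : x, \tilde x \in E\}$ $(t \in T)$. Denote by $\mathscr{L}(t)$ the collection of paths of $\mathscr{A}(t)$. By Hypotheses 7.1(2), $\tilde x$ is reachable directly from $x$ at time $t$ iff $x$ is reachable directly from $\tilde x$ at time $t$. For this reason, we also call $(x, \tilde x)$ an edge or section at time $t$ whenever $x \xrightarrow{t} \tilde x$. Next, let $\{w(t; x, \tilde x)\}$ be a family of functions defined on the edges which are anti-symmetric: $w(t; x, \tilde x) = -w(t; \tilde x, x)$. Then, for each path $L(t) = (x^{(0)}, x^{(1)}, \dots, x^{(n+1)})$ we set
$$w(L(t)) = \sum_{k=0}^{n} w\big(t; x^{(k)}, x^{(k+1)}\big).$$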
Put $W(t) = \{w(t; x, \tilde x) : x, \tilde x \in E\}$, where $w(t; x, \tilde x)$ is undefined if $\tilde x$ is not reachable directly from $x$ at time $t$. Since the paths $L(t)$ and $w(t; x, \tilde x)$ are all independent of the diagonals $\{a(t, x, \tilde x) : x = \tilde x\}$, we may and sometimes will leave the diagonals free.
Definition 7.3. We call $(E, \mathscr{A}(t), \mathscr{L}(t), W(t))$ $(t \in T)$ (abbrev. $\mathscr{A}(t)$ $(t \in T)$) a field and call $w(L(t))$ the work done by the field $\mathscr{A}(t)$ along $L(t)$. If moreover, $w(t; x, \tilde x) = v(t, \tilde x) - v(t, x)$ whenever $x \xrightarrow{t} \tilde x$, for some real-valued function $v: T \times E \to \mathbb{R}$, then $\mathscr{A}(t)$ is called a potential field with potential $V(t) = \{v(t, x) : x \in E\}$ $(t \in T)$. Finally, we say that $\mathscr{A}(t)$ is path-independent if $w(L(t)) = 0$ for any closed path $L(t)$.
In the classical field theory, one fundamental result says that every potential field is path-independent. To show an analogue of this result, we still need some notation. For each $t \in T$, we define a relation $\sim_t$ as follows: for each $x, \tilde x \in E$,
$$x \sim_t \tilde x \quad\text{iff}\quad \tilde x \text{ is reachable from } x \text{ at time } t, \text{ or } x = \tilde x.$$
This is clearly an equivalence relation. Thus, we may divide $E$ into equivalence classes $\{E_\ell(t) : \ell \in D(t)\}$. For each $\ell \in D(t)$, we choose an arbitrary reference point $\Delta_\ell := \Delta_\ell(t) \in E_\ell(t)$. Then, for each $x \in E_\ell(t)$, $x \ne \Delta_\ell$, we choose an arbitrary reference path $L(t; \Delta_\ell, x) := (\Delta_\ell, x^{(1)}, \dots, x^{(k)}, x)$. In what follows, $\Delta_\ell$ and $L(t; \Delta_\ell, x)$ will be fixed.
Lemma 7.4. $\mathscr{A}(t)$ $(t \in T)$ is a potential field iff it is path-independent.

Proof: a) Let $\mathscr{A}(t)$ be path-independent and set
$$v(t, \Delta_\ell) = 0, \qquad v(t, x) = w(L(t; \Delta_\ell, x)) = \sum_{j=0}^{k} w\big(t; x^{(j)}, x^{(j+1)}\big), \qquad x \in E_\ell(t),\ \ell \in D(t),$$
where $x^{(0)} = \Delta_\ell$ and $x^{(k+1)} = x$. Then, for each $\tilde x$ reachable directly from $x$ at time $t$, we have
$$w(L(t; \Delta_\ell, x)) + w(t; x, \tilde x) = w(L(t; \Delta_\ell, \tilde x))$$
by the path-independence. This gives us $v(t, x) + w(t; x, \tilde x) = v(t, \tilde x)$, and so $V(t) = \{v(t, x) : t \in T, x \in E\}$ is a potential.
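b) Conversely, if $V(t) = \{v(t, x) : t \in T, x \in E\}$ is a potential, then for each closed path $L(t) := (x =: x^{(0)}, x^{(1)}, \dots, x^{(n)}, x^{(n+1)} = x)$, we have
$$w(L(t)) = \sum_{k=0}^{n} w\big(t; x^{(k)}, x^{(k+1)}\big) = \sum_{k=0}^{n} \big[v\big(t, x^{(k+1)}\big) - v\big(t, x^{(k)}\big)\big] = v\big(t, x^{(n+1)}\big) - v\big(t, x^{(0)}\big) = 0. \qquad\blacksquare$$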
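The telescoping argument of Lemma 7.4 can be checked numerically on a small example. The following is a minimal sketch only: the three-point state space, the potential `v` and the "circulating" field `w_rot` are hypothetical illustrations, not taken from the text, and the time index $t$ is suppressed.

```python
def work(w, path):
    """Work w(L): sum of the edge function w over consecutive points of the path."""
    return sum(w[(x, y)] for x, y in zip(path, path[1:]))

def field_from_potential(v, edges):
    """Anti-symmetric edge function w(x, y) = v(y) - v(x) induced by a potential v."""
    w = {}
    for x, y in edges:
        w[(x, y)] = v[y] - v[x]
        w[(y, x)] = v[x] - v[y]
    return w

# Hypothetical triangle A - B - C - A.
edges = [("A", "B"), ("B", "C"), ("C", "A")]
closed_path = ["A", "B", "C", "A"]

# A potential field: the work along the closed path telescopes to zero.
v = {"A": 0.0, "B": 1.5, "C": -0.5}
w_pot = field_from_potential(v, edges)
work_pot = work(w_pot, closed_path)

# An anti-symmetric edge function that is NOT induced by any potential:
# one unit of work along each edge in the direction A -> B -> C -> A.
w_rot = {}
for x, y in edges:
    w_rot[(x, y)] = 1.0
    w_rot[(y, x)] = -1.0
work_rot = work(w_rot, closed_path)
```

Here `work_pot` vanishes while `work_rot` does not, so by the lemma the second edge function cannot come from a potential.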
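Recall that our main purpose in introducing the field theory is to study symmetrizability. For this, take
$$w(t; x, \tilde x) = \log a(t, \tilde x, x) - \log a(t, x, \tilde x), \qquad \text{if } \tilde x \text{ is reachable directly from } x \text{ at time } t, \tag{7.1}$$
and define
$$\hat a(t; \Delta_\ell, x) = a\big(t, \Delta_\ell, x^{(1)}\big)\, a\big(t, x^{(1)}, x^{(2)}\big) \cdots a\big(t, x^{(k)}, x\big),$$
$$\hat a(t; x, \Delta_\ell) = a\big(t, x, x^{(k)}\big)\, a\big(t, x^{(k)}, x^{(k-1)}\big) \cdots a\big(t, x^{(1)}, \Delta_\ell\big), \tag{7.2}$$
where $\Delta_\ell$ and $L(t; \Delta_\ell, x) := (\Delta_\ell, x^{(1)}, \dots, x^{(k)}, x)$ are the reference point and reference path, respectively, introduced above.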
Definition 7.5. A field $\mathscr{A}(t)$ is called symmetrizable if there exists a family $U(t) = \{u(t, x) : x \in E\}$ $(t \in T)$ of real-valued functions such that
(1) $u(t, x) > 0$ for every $t \in T$ and $x \in E$;
(2) $u(t, x)\, a(t, x, \tilde x) = u(t, \tilde x)\, a(t, \tilde x, x)$ for all $t$ and $x, \tilde x$.
Then $U(t)$ is called a symmetrizing function of $\mathscr{A}(t)$.
The next theorem is the main result of this section.
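Theorem 7.6. Let $\mathscr{A}(t)$ be a field with $W(t)$ given by (7.1). Then the following statements are equivalent:
(1) $\mathscr{A}(t)$ is a potential field.
(2) $\mathscr{A}(t)$ is path-independent.
(3) $\mathscr{A}(t)$ is symmetrizable.
(4) For each $t \in T$, $\ell \in D(t)$ and $x, \tilde x \in E_\ell(t)$, one has
$$\hat a(t; \Delta_\ell, x)\, a(t, x, \tilde x)\, \hat a(t; \tilde x, \Delta_\ell) = \hat a(t; \Delta_\ell, \tilde x)\, a(t, \tilde x, x)\, \hat a(t; x, \Delta_\ell). \tag{7.3}$$
If one of the above statements holds, then
$$\lambda(t, x) = \begin{cases} 1, & \text{if } x = \Delta_\ell, \\ \hat a(t; \Delta_\ell, x) / \hat a(t; x, \Delta_\ell), & \text{if } x \ne \Delta_\ell, \end{cases} \qquad t \in T,\ x \in E_\ell(t) \tag{7.4}$$
is a symmetrizing function and $-\log \lambda$ is a potential of $\mathscr{A}(t)$. Furthermore, for any symmetrizing function $\lambda'$, there is a $c_\ell(t) > 0$ for each $\ell \in D(t)$ such that
$$\lambda'(t, x) = c_\ell(t)\, \lambda(t, x), \qquad x \in E_\ell(t).$$
Finally, $\mathscr{A}(t)$ is a potential field iff so is its restriction to $E_\ell(t)$ for each $\ell \in D(t)$.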
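Proof: By Lemma 7.4, we have (1) $\Longleftrightarrow$ (2). We now prove (2) $\Longrightarrow$ (4). If $\tilde x$ is not directly reachable from $x$ at time $t$, then (7.3) is trivial. Assume that $\tilde x$ is directly reachable from $x$ at time $t$. From (2), it follows that
$$w(L(t; \Delta_\ell, x)) + w(t; x, \tilde x) = w(L(t; \Delta_\ell, \tilde x)),$$
and so by (7.1) and (7.2), we have
$$\log \frac{\hat a(t; x, \Delta_\ell)}{\hat a(t; \Delta_\ell, x)} + \log \frac{a(t, \tilde x, x)}{a(t, x, \tilde x)} = \log \frac{\hat a(t; \tilde x, \Delta_\ell)}{\hat a(t; \Delta_\ell, \tilde x)}.$$
This is just (7.3).

(4) $\Longrightarrow$ (3). By (7.3), (7.4) and the fact that $a(t, x, \tilde x) = a(t, \tilde x, x) = 0$ whenever $x$ and $\tilde x$ do not belong to the same $E_\ell(t)$, $\lambda$ is a symmetrizing function of $\mathscr{A}(t)$.

(3) $\Longrightarrow$ (1). Let $u$ be a symmetrizing function of $\mathscr{A}(t)$. Taking
$$v(t, x) = -\log u(t, x), \qquad t \in T,\ x \in E,$$
it is easy to check that $V(t) := \{v(t, x)\}$ is a potential of the field. We have now proved the main part of the theorem; the other assertions are easy to check. $\blacksquare$

Definition 7.7. A potential field $\mathscr{A}(t)$ is called conservative if there is a potential of the field which is independent of $t$.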
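Formula (7.4) suggests a direct way to test symmetrizability at a fixed time: build $\lambda$ along reference paths and then check condition (2) of Definition 7.5 on every edge. The sketch below assumes a finite state space, exact rational weights and the co-zero property; the function name `symmetrizing_function`, the use of breadth-first search to choose the reference paths, and both example fields are illustrative choices, not the author's construction.

```python
from collections import deque
from fractions import Fraction

def symmetrizing_function(a, states):
    """Return lambda from (7.4) if the field a is symmetrizable, otherwise None.

    `a` maps ordered pairs (x, y), x != y, to non-negative weights (missing = 0)
    and is assumed to satisfy the co-zero property. Reference paths are the
    breadth-first-search tree paths from the first state found in each class.
    """
    lam = {}
    for ref in states:                      # one reference point per class
        if ref in lam:
            continue
        lam[ref] = Fraction(1)
        queue = deque([ref])
        while queue:
            x = queue.popleft()
            for y in states:
                if y == x or y in lam or a.get((x, y), 0) == 0:
                    continue
                # Extending the reference path by the edge (x, y) multiplies
                # the ratio in (7.4) by a(x, y) / a(y, x).
                lam[y] = lam[x] * Fraction(a[(x, y)]) / Fraction(a[(y, x)])
                queue.append(y)
    for x in states:                        # condition (2) of Definition 7.5
        for y in states:
            if x != y and lam[x] * a.get((x, y), 0) != lam[y] * a.get((y, x), 0):
                return None
    return lam

# Hypothetical birth-death type field on {0, 1, 2}: always symmetrizable.
birth_death = {(0, 1): 2, (1, 0): 1, (1, 2): 3, (2, 1): 1}
lam_bd = symmetrizing_function(birth_death, [0, 1, 2])

# Hypothetical field on a triangle with a preferred direction of rotation.
cycle = {(0, 1): 2, (1, 2): 2, (2, 0): 2, (1, 0): 1, (2, 1): 1, (0, 2): 1}
lam_cycle = symmetrizing_function(cycle, [0, 1, 2])
```

For the birth-death field the function returns $\lambda = (1, 2, 6)$; for the rotating triangle it returns `None`, in agreement with the fact that the work around the triangle is non-zero.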
It is easy to prove the following result.

Corollary 7.8. If $\mathscr{L}(t)$ is independent of $t$ and $\mathscr{A}(t)$ is conservative, then every conservative potential should be of the form
$$v(x) = -\log \lambda(t_0, x) + c_\ell, \qquad x \in E_\ell,\ \ell \in D := D(t),$$
where $c_\ell$ is a constant depending on $\ell$ only and $t_0$ is an arbitrary but fixed element of $T$.
We now consider a particular case. Take $T = [0, \infty)$, $E = \{i, j, k, \dots\}$ and $a(t; i, j) = P_{ij}(t)$, where $P(t) = (P_{ij}(t))$ is a Markov chain. Notice that $P_{ij}(t) = 0$ for some $t > 0$ iff $P_{ij}(t) = 0$ for all $t > 0$. Hence $\mathscr{L}(t)$, $D(t)$ and $E_\ell(t)$ are all independent of $t > 0$.

Definition 7.9. Let
$$P_{ij}(t) = 0 \Longleftrightarrow P_{ji}(t) = 0 \qquad \text{for all } t \ge 0,\ i, j \in E.$$
Then the field corresponding to $P(t)$ is called a chain field, also denoted by $P(t)$.
Proposition 7.10. A chain field $P(t)$ is a potential field (equivalently, $P(t)$ is symmetrizable) iff it is a conservative field. Then, all potential functions are given by
$$v_i(t) = -\log \lambda_i(1) + c_\ell(t), \qquad i \in E_\ell,\ \ell \in D,$$
where $(\lambda_i(1))$ is defined by (7.4) and $c_\ell$ $(\ell \in D)$ is an arbitrary function on $T$.
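Proof: Without loss of generality, assume that $P(t)$ is irreducible. Choose a reference point $\Delta$ and reference paths, and define $\hat a_{\Delta i}(t)$, $\hat a_{i\Delta}(t)$, $\lambda_i(t)$ by (7.2) and (7.4) respectively. Since $P(t)$ is a potential field, $(\lambda_i(t))$ defined by (7.4) is a symmetrizing function:
$$\lambda_i(t) P_{ij}(t) = \lambda_j(t) P_{ji}(t) \qquad \text{for all } t \ge 0,\ i \text{ and } j.$$
Then, by the CK-equation and induction, we obtain
$$\lambda_i(t) P_{ij}(mt) = \lambda_j(t) P_{ji}(mt) \qquad \text{for all } m \ge 1.$$
Noting that $\lambda_\Delta(t) \equiv 1$ and that $P(t)$ is irreducible, we have
$$\lambda_i(t) = P_{\Delta i}(mt) / P_{i\Delta}(mt), \qquad m \ge 1,$$
and so
$$\lambda_i(m/n) = \lambda_i(1) = P_{\Delta i}(1) / P_{i\Delta}(1) \qquad \text{for all } m, n.$$
Combining these two facts, it follows that
$$\lambda_i(t) = \lambda_i(1), \qquad t \in T.$$
Next, for any potential $V(t) = \{v_i(t)\}$ of $P(t)$, Theorem 7.6 gives $v_i(t) = -\log \lambda_i(t) + c(t) = -\log \lambda_i(1) + c(t)$ for some function $c$ on $T$. $\blacksquare$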
7.2 Lattice Field
In the last section, the study of symmetrizability was reduced to the potentiality of a field. Then, due to the path-independence, to decide whether a given field $\mathscr{A}(t)$ is a potential field or not, it suffices to check the work done by the field along the minimal closed paths. This idea should be clear intuitively and it is actually the key reason why such an elementary tool is helpful. The main purpose of this section is to illustrate this idea; further applications will be presented in the next part (Chapters 11 and 14).
Lemma 7.11. Let $S$ be a finite set and denote by $E = \{0, 1\}^S$ the usual product space of $\{0, 1\}$. Define
Take
Then $\mathscr{A} = (a(x, \tilde x) : x, \tilde x \in E)$ is a potential field iff the following quadrilateral condition holds:
where
Proof: For each $u \in S$, define a transform $\xi(u): E \to E$ as follows:
$$x\,\xi(u) = {}_u x, \qquad x \in E,\ u \in S.$$
Then for every path $L = (x = x^{(0)}, x^{(1)}, \dots, x^{(n)})$, we have
In this proof, we also use $x\,\xi(u_1) \cdots \xi(u_n)$ to denote the path $L$, where $x$ denotes the starting point and $\xi(u_i)$ denotes the section from $x^{(i-1)}$ to $x^{(i)}$. By the path-independence, we have

Using these notations, we may rewrite (7.5) as

Clearly
$$w(x\,\xi(u)\,\xi(u)) = 0, \qquad x \in E,\ u \in S. \tag{7.8}$$
We now prove that
$$w(x\,\xi(u_1) \cdots \xi(u_n)) = 0 \tag{7.9}$$
for any closed path $L = (x = x^{(0)}, x^{(1)}, \dots, x^{(n)} = x)$.
Since every closed path consists of an even number of edges, $n = 2m$ for some positive integer $m$. We use induction on $m$. When $m = 1$, then $u_1 = u_2$ and hence (7.9) follows from (7.8). Suppose that (7.9) holds for $n = 2(m - 1)$. Then for $n = 2m$, there is a $k$ such that $2 \le k \le n$, $u_k = u_1$ and $u_\ell \ne u_1$ for $2 \le \ell \le k - 1$. Applying (7.6) and (7.7), we obtain
$$\begin{aligned} w(x\,\xi(u_1) \cdots \xi(u_n)) &= w(x\,\xi(u_1)\,\xi(u_2)) + w\big(({}_{u_2 u_1} x)\,\xi(u_3) \cdots \xi(u_n)\big) \\ &= w(x\,\xi(u_2)\,\xi(u_1)) + w\big(({}_{u_1 u_2} x)\,\xi(u_3) \cdots \xi(u_n)\big) \\ &= w(x\,\xi(u_2)) + w\big(({}_{u_2} x)\,\xi(u_1)\,\xi(u_3) \cdots \xi(u_n)\big). \end{aligned}$$
Similarly, applying (7.6) and (7.7) repeatedly and using (7.8), we obtain

This proves the sufficiency. The condition is obviously necessary. $\blacksquare$
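The point of Lemma 7.11 is that only the elementary quadrilaterals need to be checked. A minimal sketch of such a check on $\{0,1\}^n$ follows. It rests on stated assumptions: the edges are taken to join configurations differing at exactly one site (the explicit definition of $a$ is not recoverable from the text above), sites are labelled $0, \dots, n-1$, and both example rates `a_metropolis` and `a_biased` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def flip(x, u):
    """Configuration obtained from x by changing the value at site u."""
    return x[:u] + (1 - x[u],) + x[u + 1:]

def quadrilateral_holds(a, n):
    """Check a(x,xu)a(xu,xuv)a(xuv,xv)a(xv,x) == a(x,xv)a(xv,xuv)a(xuv,xu)a(xu,x)."""
    for x in product((0, 1), repeat=n):
        for u in range(n):
            for v in range(u + 1, n):
                xu, xv, xuv = flip(x, u), flip(x, v), flip(flip(x, u), v)
                lhs = a(x, xu) * a(xu, xuv) * a(xuv, xv) * a(xv, x)
                rhs = a(x, xv) * a(xv, xuv) * a(xuv, xu) * a(xu, x)
                if lhs != rhs:
                    return False
    return True

def weight(x):
    """A hypothetical positive weight on configurations (plays the role of lambda)."""
    return Fraction(2) ** sum(x) * (3 if x[0] == x[1] else 1)

def a_metropolis(x, y):
    """Rates of Metropolis type: weight(x) * a(x, y) is symmetric in (x, y)."""
    return min(Fraction(1), weight(y) / weight(x))

def a_biased(x, y):
    """Rates whose 0 -> 1 flips are sped up by an occupied right neighbour."""
    u = next(k for k in range(len(x)) if x[k] != y[k])
    if x[u] == 0 and x[(u + 1) % len(x)] == 1:
        return Fraction(2)
    return Fraction(1)

ok_metropolis = quadrilateral_holds(a_metropolis, 3)
ok_biased = quadrilateral_holds(a_biased, 3)
```

The first family passes the test because it is symmetrizable by construction; the second fails on the quadrilateral through $(0,0,0)$ spanned by sites 0 and 1.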
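Definition 7.12. Take $E_1 = \{0, \pm 1, \pm 2, \dots\}$ and set $E = E_1^N$. For each $x = (x_1, x_2, \dots, x_N) \in E$, let $F_x = \{x' \in E : \sum_{k} |x_k - x'_k| = 1\}$ be the set of nearest neighbours of $x$. A field $\mathscr{A} = (a(x, x') : x, x' \in E)$ is called a lattice field if
$$a(x, x') > 0, \quad x' \in F_x,\ x \in E; \qquad a(x, x') = 0, \quad x' \notin F_x \cup \{x\},\ x \in E.$$

Proposition 7.13. The lattice field $\mathscr{A}$ is a potential field iff the following quadrilateral condition holds:
$$a(x, x^u)\, a(x^u, x^{uv})\, a(x^{uv}, x^v)\, a(x^v, x) = a(x, x^v)\, a(x^v, x^{vu})\, a(x^{vu}, x^u)\, a(x^u, x), \qquad u, v \in \{1, 2, \dots, N\},\ x \in E,$$
where
$$x^u(w) = \begin{cases} x_w + 1, & \text{if } w = u, \\ x_w, & \text{if } w \ne u, \end{cases}$$
and $x^{uv} = (x^u)^v$. Then, the potential is given by $-\log \pi_x$:

where $\operatorname{sgn} i = 1, 0, -1$ according to $i$ being even, 0 or odd respectively, and $\pi_{x^0}$ is an arbitrary constant.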
Proof: a) We first prove that $w(L) = 0$ for each closed path $L = (x, x_1, \dots, x_n, x)$. We may assume that $x, x_1, \dots, x_n$ are all distinct since the general case can be reduced to this one. Denote by $\ell$ (resp. $\ell_i$) the length of $L$ (resp. of the path $L_i$). Of course, $\ell \ge 4$. If $\ell = 4$, the quadrilateral condition implies that $w(L) = 0$. Thus, we may assume that $\ell > 4$. If $L$ is included in an $N$-dimensional unit cube, then by the previous lemma, we still have $w(L) = 0$. Hence, we need only consider a closed path $L$ which is not included in an $N$-dimensional unit cube and has $\ell > 4$. For such $L$ there exists a hyperplane $Z$ which divides $L$ into two parts. Starting from $x$, denote by $x_k$ the point where we first arrive at $Z$ along $L$. Then we go along $L$ continuously; at some point we will leave $Z$. Denote by $x_j$ $(j > k)$ the point where we first return to $Z$ after having left $Z$. Then $x_k$ and $x_j$ divide $L$ into two parts. One of them is

and the other one is

Denote by $L_4 := (x_k, x'_1, x'_2, \dots, x'_i, x_j)$ the shortest path in $Z$ from $x_k$ to $x_j$. We now claim that $\ell_3 > \ell_4$. In fact, if we denote by $L'_3$ the projection of $L_3$ to $Z$, then it is clear that $\ell_4 \le \ell'_3$, and so $\ell_3 > \ell'_3 \ge \ell_4$. Similarly, we have $\ell_1 + \ell_2 > \ell_4$. Set

then $\ell_5 < \ell$ and $\ell_6 < \ell$. Also, $w(L) = w(L_5) + w(L_6)$. From this and induction we get $w(L) = 0$.

b) For each $x \in E$, choose a path from $x^0$ to $x$ as follows:

and set

The second assertion now follows from Theorem 7.6 and Corollary 7.8. $\blacksquare$
7.3 Electric Field

Suppose that we are given a symmetrizable Markov chain with symmetrizing measure $(\pi_i)$, $\sum_i \pi_i = \infty$. We now want to know whether the chain is recurrent or not. Since the time-continuous case can be reduced to the time-discrete case, we restrict ourselves to the time-discrete one. Let $P = (P_{ij})$ be a transition probability on $E$ with symmetrizing measure $(\pi_i)$. Define $a_{ij} = \pi_i P_{ij}$. Assume that the chain $P$ is irreducible, i.e., the graph induced by $(a_{ij})$ in the sense of Section 7.1 is connected. Then $\pi_i > 0$ for all $i \in E$. To show the main idea, this section deals with finite $E$ only. Fix two points $a$ and $b$. Without loss of generality, unless otherwise stated, assume that $\pi_a = 1$. Let the chain begin at $a$ and move until reaching $b$. Next, let $h_i$ denote the probability that the chain, starting from $i$, will reach $a$ before reaching $b$. Then $(h_i : i \in E)$ is harmonic with boundary condition $h_a = 1$ and $h_b = 0$. That is,
$$h_i = \sum_j P_{ij} h_j, \qquad i \ne a, b.$$
Equivalently,
$$\sum_j a_{ij} (h_i - h_j) = 0, \qquad i \ne a, b. \tag{7.10}$$
Regarding $(h_i)$ as an electric potential in an electric field and regarding
$$c_{ij} = a_{ij} (h_i - h_j) \tag{7.11}$$
as the current that flows from $i$ to $j$, (7.10) simply means that the total amount of current flowing through the node $i$ equals zero, which is just Kirchhoff's first law. Thus, as a potential field in the sense of Section 7.1, the "work" done by the electric field should be given by $w_{ij} = h_j - h_i$ whenever $a_{ij} > 0$.
However, in the language of electricity, the difference $w_{ij} = h_j - h_i$ between the potentials at nodes $i$ and $j$ corresponds to the voltage (maybe negative) from $i$ to $j$, and
$$w_{ij} = -c_{ij} / a_{ij}$$
is nothing but Ohm's law, regarding $r_{ij} := 1/a_{ij}$ (resp. $a_{ij}$) as the resistance (resp. conductance) between $i$ and $j$. Thus, the path-independence of the potential field is an alternative description of Kirchhoff's second law: in a closed circuit, the algebraic sum of the electric potentials equals the algebraic sum of the voltages spent on the branches. Finally, the probabilities $(h_i)$ can be interpreted as follows. When we impose a unit voltage between $a$ and $b$, so that $h_a = 1$ is established at $a$ and $h_b = 0$ at $b$, then $h_i$ represents the voltage at node $i$ in the circuit. From the proof of Lemma 7.4, it should be clear that the correspondence between the voltages and the current flow on the circuit is one-to-one. To see more precisely the relation between the electric network and the recurrence, we need more notation. Let
$$c_a = \sum_j c_{aj} \qquad \text{and} \qquad r_{\mathrm{eff}} = 1 / c_a.$$
Here $c_a$ is the amount of current flowing into the circuit from the outside source and $r_{\mathrm{eff}}$ is the effective resistance between $a$ and $b$, since the voltage at $a$ equals one. Correspondingly, $c_{\mathrm{eff}} := 1 / r_{\mathrm{eff}}$ is called the effective conductance between $a$ and $b$. Obviously, $r_{\mathrm{eff}}$ (resp. $c_{\mathrm{eff}}$) is intrinsic in the sense that it remains the same, $r_{\mathrm{eff}} = v_a / c_a$, even if the voltage at $a$ is replaced by an arbitrary number $v_a$. Actually, $r_{\mathrm{eff}}$ is determined by $(a_{ij})$ completely. Because
$$c_a = \sum_j a_{aj} (h_a - h_j) = \pi_a \Big(1 - \sum_j P_{aj} h_j\Big) = p_{\mathrm{escape}},$$
where $p_{\mathrm{escape}}$ is the probability, starting at $a$, that the chain reaches $b$ before returning to $a$. Hence
$$p_{\mathrm{escape}} = 1 / r_{\mathrm{eff}}. \tag{7.13}$$
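For a small finite network the quantities $h_i$, $c_a$, $r_{\mathrm{eff}}$ and $p_{\mathrm{escape}}$ can be computed directly from (7.10) and (7.11). The following is a minimal sketch in exact rational arithmetic. The helper names (`solve`, `network`, `potential`, ...) and the two example circuits are hypothetical, and `escape_probability` uses the slightly more general identity $c_a = \pi_a\, p_{\mathrm{escape}}$ so that the normalization $\pi_a = 1$ is not required.

```python
from fractions import Fraction

def solve(A, rhs):
    """Gauss-Jordan elimination over exact fractions; A is square and invertible."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [e / p for e in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [er - f * ec for er, ec in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def network(edges):
    """Symmetric conductances a_ij from a list of (i, j, conductance) triples."""
    cond = {}
    for i, j, g in edges:
        cond[(i, j)] = cond[(j, i)] = Fraction(g)
    return cond

def potential(cond, a, b):
    """Solve (7.10): h is harmonic off {a, b} with h_a = 1 and h_b = 0."""
    nodes = sorted({i for i, _ in cond})
    interior = [i for i in nodes if i not in (a, b)]
    idx = {i: k for k, i in enumerate(interior)}
    A = [[Fraction(0)] * len(interior) for _ in interior]
    rhs = [Fraction(0)] * len(interior)
    for (i, j), g in cond.items():
        if i not in idx:
            continue
        A[idx[i]][idx[i]] += g          # coefficient of h_i
        if j in idx:
            A[idx[i]][idx[j]] -= g      # coefficient of h_j
        elif j == a:
            rhs[idx[i]] += g            # boundary value h_a = 1
    h = dict(zip(interior, solve(A, rhs)))
    h[a], h[b] = Fraction(1), Fraction(0)
    return h

def effective_resistance(cond, a, b):
    h = potential(cond, a, b)
    c_a = sum(g * (h[a] - h[j]) for (i, j), g in cond.items() if i == a)
    return 1 / c_a

def escape_probability(cond, a, b):
    pi_a = sum(g for (i, j), g in cond.items() if i == a)
    return 1 / (pi_a * effective_resistance(cond, a, b))

# Two unit resistors in series: a - m - b.
series = network([("a", "m", 1), ("m", "b", 1)])
# Two such branches in parallel: a - 1 - b and a - 2 - b.
square = network([("a", "1", 1), ("1", "b", 1), ("a", "2", 1), ("2", "b", 1)])
```

For the series circuit one gets $r_{\mathrm{eff}} = 2$ and $p_{\mathrm{escape}} = 1/2$, which matches (7.13) since $\pi_a = 1$ there; for the square, $r_{\mathrm{eff}} = 1$, $\pi_a = 2$ and again $p_{\mathrm{escape}} = 1/2$.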
For a Markov chain with an infinite state space, regarding $b$ as infinity, the above formula shows that the chain is recurrent iff $r_{\mathrm{eff}} = \infty$. This gives us a link between the electric network and the recurrence of the chain. Besides the above notions, another basic quantity in electricity is the energy dissipation. We now use this to describe the effective resistance. Recall that for a current $c_{ij}$ flowing through a resistor with resistance $r_{ij}$, the energy dissipation is defined by $c_{ij}^2 r_{ij} = c_{ij}(v_i - v_j)$. Then, the total energy dissipation is $E(c) = \frac{1}{2} \sum_{i,j} c_{ij}^2 r_{ij}$. On the other hand, since
$$\frac{1}{2} \sum_{i,j} c_{ij}^2 r_{ij} = \frac{1}{2} \sum_{i,j} a_{ij} (v_i - v_j)^2,$$
it follows that the total energy dissipation is the Dirichlet form corresponding to the Q-matrix $P - I$. So by using our old notation, $E(c) = D(v, v)$.

Definition 7.14. We say that $(u_{ij})$ is a flow from $a$ to $b$ if it has the following properties:
(1) anti-symmetry: $u_{ij} = -u_{ji}$.
(2) $u_i := \sum_j u_{ij} = 0$ if $i \ne a, b$.
(3) $u_{ij} = 0$ if $i$ and $j$ are not adjacent.
A flow $(u_{ij})$ from $a$ to $b$ is called a unit flow if $u_a = \sum_j u_{aj} = 1$. Furthermore, a unit flow is called a unit current flow if it is determined by an electric field.
Proposition 7.15 (Principle of Conservation of Energy). Let $(x_i : i \in E)$ be an arbitrary function and $(u_{ij})$ be a flow from $a$ to $b$. Then
$$(x_a - x_b)\, u_a = \frac{1}{2} \sum_{i,j} (x_i - x_j)\, u_{ij}.$$
In particular, $r_{\mathrm{eff}} = \frac{1}{2} \sum_{i,j} c_{ij}^2 r_{ij}$, where $(c_{ij})$ is the unit current flow.

Proof: Note that $u_a + u_b = \sum_i \sum_j u_{ij} = 0$ by the anti-symmetry. We have
$$\frac{1}{2} \sum_{i,j} (x_i - x_j)\, u_{ij} = \sum_i x_i \sum_j u_{ij} = x_a u_a + x_b u_b = (x_a - x_b)\, u_a.$$
This proves the first assertion. Applying this to $(x_i) = (v_i)$ and the unit current flow $(u_{ij}) = (c_{ij})$, we obtain $v_a c_a = \frac{1}{2} \sum_{i,j} c_{ij}^2 r_{ij}$. Since $v_a = c_a r_{\mathrm{eff}}$ and $c_a = 1$, the second assertion follows. $\blacksquare$

Proposition 7.16 (Thomson's Principle). The energy dissipation of the unit current flow from $a$ to $b$ minimizes the energy dissipation among all unit flows from $a$ to $b$.

Proof: Let $(u_{ij})$ be any unit flow from $a$ to $b$ and $(c_{ij})$ be the unit current flow. Then $(d_{ij} := u_{ij} - c_{ij})$ is a flow with $d_a = 0$. Denote by $(v_i)$ the voltages on the circuit determined by the unit current flow. We have
$$\sum_{i,j} u_{ij}^2 r_{ij} = \sum_{i,j} c_{ij}^2 r_{ij} + 2 \sum_{i,j} (v_i - v_j)\, d_{ij} + \sum_{i,j} d_{ij}^2 r_{ij}.$$
Applying the principle of conservation of energy to $(x_i) = (v_i)$ and $(d_{ij})$, the middle term vanishes, and so we obtain the assertion. $\blacksquare$

Combining the above two results, we obtain a useful corollary.
Corollary 7.17 (Rayleigh's Law). (1) Monotonicity law. The effective resistance is monotonic in the branch resistances. (2) Shorting law. Shorting certain sets of nodes together decreases the effective resistance. (3) Cutting law. Cutting certain branches increases the effective resistance.
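Thomson's principle can be seen at work on the smallest non-trivial example. The sketch below is an illustration under stated assumptions: the network is a hypothetical square with two parallel branches $a$-$1$-$b$ and $a$-$2$-$b$ of unit resistors, the one-parameter family `two_branch_flow(t)` sends the fraction $t$ of a unit flow through the first branch, and the voltages `v` of the unit current flow are written down by symmetry rather than solved for.

```python
from fractions import Fraction

# Resistances r_ij on the undirected edges of the hypothetical square network.
R = {("a", "1"): 1, ("1", "b"): 1, ("a", "2"): 1, ("2", "b"): 1}

def resistance(i, j):
    return R[(i, j)] if (i, j) in R else R[(j, i)]

def two_branch_flow(t):
    """Unit flow from a to b: t through a-1-b and 1 - t through a-2-b."""
    t = Fraction(t)
    u = {("a", "1"): t, ("1", "b"): t, ("a", "2"): 1 - t, ("2", "b"): 1 - t}
    u.update({(j, i): -val for (i, j), val in list(u.items())})  # anti-symmetry
    return u

def energy(u):
    """E(u) = (1/2) * sum over ordered pairs (i, j) of u_ij^2 r_ij."""
    return sum(val * val * resistance(i, j) for (i, j), val in u.items()) / 2

# Voltages of the unit current flow (v_a = r_eff = 1, by symmetry v_1 = v_2).
v = {"a": Fraction(1), "1": Fraction(1, 2), "2": Fraction(1, 2), "b": Fraction(0)}
# Ohm's law: c_ij = (v_i - v_j) / r_ij on every oriented edge.
current = {(i, j): (v[i] - v[j]) / resistance(i, j) for (i, j) in two_branch_flow(0)}

energies = [energy(two_branch_flow(Fraction(k, 10))) for k in range(11)]
```

The unit current flow coincides with `two_branch_flow(1/2)`, its energy equals the effective resistance 1, and every other member of the family dissipates strictly more, up to 2 when the whole flow is forced through one branch.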
In the electric circuit analysis, there are some useful rules to simplify the analysis. The next result is an example of such rules.
Corollary 7.18 (Equivalence Principle). The effective resistance of the circuit stays the same if we connect nodes which have the same potential by a conductor with arbitrary resistance (maybe zero or infinity).

7.4 Transience of Symmetrizable Markov Chains

This section returns to the recurrence problem for an irreducible and symmetrizable Markov chain $(E, X_n, P_{ij}, \pi_i)$ with symmetrizing measure $(\pi_i)$ and infinite state space $E$. The notions of resistance, current and voltage between two nodes and the total energy dissipation are the same as defined in the previous section. One main difference is that we now regard the point $b$ with potential zero as being at infinity. Thus, in the definition of flow, the second condition
$$\sum_j u_{ij} = 0 \qquad \text{if } i \ne a, b$$
should be replaced by
$$\sum_j u_{ij} = 0 \qquad \text{if } i \ne a.$$
Again, the effective resistance of the electric network is defined by $r_{\mathrm{eff}} = v_a / c_a$, where $v_a$ is the voltage at $a$ and $c_a = \sum_j c_{aj}$ is the total current flowing into the network. It is important to note that $c_a$ can be zero in the present situation, which means that $r_{\mathrm{eff}} = \infty$. For instance, if we take
$$h_i = \mathbb{P}_i [X_n = a \text{ for some } n \ge 0]$$
as the electric potential on the network, then $h_a = 1$. In the recurrent case, $h_i = 1$ for all $i$, so the current flow $c_{ij} = 0$ and hence $c_a = 0$. Certainly, for this flow, the total energy dissipation $E(c) = 0$. However, in the transient case,
$$c_a = \pi_a \Big(1 - \sum_j P_{aj} h_j\Big) > 0,$$
and so $r_{\mathrm{eff}} < \infty$. Our main task in this section is to extend Thomson's principle to the infinite network. That is the following result.
Theorem 7.19. (1) The energy dissipation of the unit current flow from $a$ to infinity minimizes the energy dissipation among all unit flows from $a$ to infinity. Equivalently,
$$r_{\mathrm{eff}} = \min \{E(u) : (u_{ij}) \text{ is a unit flow from } a \text{ to infinity}\},$$
where $\min \emptyset = \infty$ by convention. Here we use "min" rather than "inf" since the minimum is achievable.
(2) The chain $(E, X_n, P_{ij}, \pi_i)$ is recurrent iff $r_{\mathrm{eff}} = \infty$.
Proof: a) Let $(E, X_n, P_{ij}, \pi_i)$ be transient. We prove that there is a unit flow with finite energy. As we have seen above, for the flow $(c_{ij})$ corresponding to $(h_i)$, we have
$$\sum_j c_{aj} = \pi_a \Big(h_a - \sum_j P_{aj} h_j\Big) > 0$$
and
$$\sum_j c_{ij} = \pi_i \Big(h_i - \sum_j P_{ij} h_j\Big) = 0, \qquad i \ne a.$$
Hence, $(c_{ij})$ is a flow from $a$ to infinity. To show that this flow has finite energy, take finite subsets $F_n$ containing $a$ such that $F_n \uparrow E$ and set $F_n^* = F_n \setminus \{a\}$. Regard $E \setminus F_n$ as a single point $b_n$ and define
$$a^n_{ij} = a_{ij} \quad \text{if } i, j \in F_n; \qquad a^n_{i b_n} = a^n_{b_n i} = \sum_{j \notin F_n} a_{ij} \quad \text{if } i \in F_n.$$
Applying the principle of conservation of energy discussed in the previous section to the circuit between $a$ and $b_n$ with potential
$$h^n_i := \mathbb{P}_i [X_k = a \text{ for some } k \ge 0 \text{ but } X_m \in F_n \text{ for all } m < k],$$
we obtain

Now, let $F_n \uparrow E$. Then $\lim_{n \to \infty} h^n_i = h_i$, $i \in E$. By Fatou's lemma, it follows that
b) Next, suppose that there is a unit flow with finite energy. We prove that a unit flow $c = (c_{ij})$ minimizing the finite energy dissipation does exist. Consider the Hilbert space
$$H = \Big\{u = (u_{ij}) : \sum_{i,j} u_{ij}^2 / a_{ij} < \infty\Big\}$$
with inner product $(u, v) = \sum_{i,j} u_{ij} v_{ij} / a_{ij}$; here and in what follows, the summation is over all $i, j$ with $a_{ij} > 0$. Set $\|u\| := (u, u)^{1/2}$. Define
$$\xi^i_{jk} = \delta_{ij}\, a_{ik}, \qquad i, j, k \in E.$$
Then
$$(\xi^i, \xi^j) = \pi_i \delta_{ij}, \qquad i, j \in E.$$
Thus, $\xi^i \in H$ $(i \in E)$ and $(u_{ij})$ is a unit flow iff it is anti-symmetric and
$$(\xi^a, u) = 1, \qquad (\xi^i, u) = 0, \quad i \ne a.$$
Next, set
$$H_0 = \{u \in H : u \text{ is a unit flow}\}.$$
By assumption, $H_0$ is non-empty. Clearly, $H_0$ is convex and complete. Hence, there exists uniquely an element $c = (c_{ij}) \in H_0$ such that
$$\|c\| = \inf \{\|u\| : u \in H_0\},$$
which is described by $(c, c - e) = 0$ for all $e \in H_0$. Now, we construct a (unique) function $(v_i)$ such that
$$a_{ij} (v_j - v_i) = c_{ij}, \quad i, j \in E, \qquad v_a = 0. \tag{7.14}$$
As we explained in the last section, we should take
$$v_i = \sum_{k=0}^{n-1} c_{i_k i_{k+1}} / a_{i_k i_{k+1}},$$
where $a = i_0 \to i_1 \to \cdots \to i_n = i$ is a path from $a$ to $i$. To show that $(v_i)$ is actually a potential on the network, we need to check the path-independence. For this, let $j_0 \to j_1 \to \cdots \to j_n = j_0$ be a closed path. Without loss of generality, assume that $j_0, j_1, \dots, j_{n-1}$ are distinct. We need to show that
$$\sum_{k=0}^{n-1} c_{j_k j_{k+1}} / a_{j_k j_{k+1}} = 0.$$
Define $(f_{ij})$ as follows:

Then

Thus, it suffices to prove that $(f, c) = 0$. But $(\xi^i, f) = 0$ for all $i \in E$ and $\|f\| < \infty$, so it follows that $(f, c) = -(c, c - (f + c)) = 0$ as required. Next,
$$\sum_j a_{aj} v_j = \sum_j c_{aj} = (c, \xi^a) = 1. \tag{7.15}$$
Applying (7.14) again, it follows that

(7.16)

(7.17)

c) Now, we use the flow constructed in b) to show that the chain is transient. Suppose that the chain $(X_n)$ is recurrent. Let
$$T = \inf \{n \ge 0 : X_n = a\}.$$
For any $A \subset E$, define
$${}_A P^{(0)}_{ij} = \delta_{ij}, \qquad {}_A P^{(n)}_{ij} = \mathbb{P}_i [X_n = j;\ X_1, \dots, X_{n-1} \notin A].$$
By (7.14) and (7.16), we have
$$v_i = \sum_{j \ne a} {}_a P^{(n)}_{ij} v_j = \mathbb{E}_i [v(X_n);\ n < T] = \mathbb{E}_i [v(X_{n \wedge T})], \qquad n \ge 0,\ i \ne a.$$
This equality is trivial when $i = a$. Hence, we always have
$$v_i = \mathbb{E}_i [v(X_{n \wedge T})], \qquad n \ge 0. \tag{7.18}$$
Furthermore, if we set $\mathscr{F}_n = \sigma(X_m : m \le n)$, then by the strong Markov property, we have
$$\mathbb{E}_i \big[v(X_{(n+1) \wedge T}) \,\big|\, \mathscr{F}_{n \wedge T}\big] = \mathbb{E}_{X_{n \wedge T}} [v(X_{1 \wedge T})] = \sum_j I_{[X_{n \wedge T} = j]}\, \mathbb{E}_j\, v(X_{1 \wedge T}) = \sum_j I_{[X_{n \wedge T} = j]}\, v_j = v(X_{n \wedge T}), \qquad n \ge 0,\ i \in E.$$
This shows that $v(X_{n \wedge T})$ is a $\mathbb{P}_i$-martingale. Noting (7.18) and applying the martingale convergence theorem, we obtain $\lim_{k \to \infty} v(X_{k \wedge T}) = v(X_T) = 0$, a.s. Let
Then gan
0:
gii
< OO? i f
U,
since the assumption of the recurrence. Clearly, rigij = 7rjgji for all i and j . We now prove that g j i gi.i for al1 i E E . This is trivial if i = a. But if i # a , then
<
Collecting the above facts t,ogether, we obtain
Ei
[(V(XkAT)-V(x0)) ]
7.4 TRANSIENCE OF SYMMETRIZABLE MARKOV CHAINS rOO
e
289 7
m
Hence sup& k
[ ( T / ( X k A T ) -'u(xO))2] < 00,
i E Es
T n particular, supX:IEi [ U ( X ~ A ~
T I ( X ~ A T )
as.
sup&
v(XT)= 0
[v(XkAT)2]
lc
k
~ t s 4 00,
< Ei [v(xI.)2] = 0.
So $v(X_{k \wedge T}) = 0$, $k \ge 0$, $\mathbb{P}_i$-a.s., and hence $v_i = 0$ for all $i$. This contradicts (7.15). Thus, we have proved the transience of the chain.

d) To complete the proof of (1), consider first the transient case. Denote by $c = (c_{ij})$ the unit current flow determined by $\big(h_i / (1 - \sum_j a_{aj} h_j) : i \in E\big)$ in terms of (7.11). That is
Similarly, the unit current flow (c;)
on Fn U bn corresponding to (hp : i E
F, U b,) (see a)) is
Since hy
-+
hi and a:
---f
aij
as n, -+ 00, we have
and sos n …. On the other hand, for any unit flow(), if we set
then $(u^n_{ij})$ is a unit flow on $F_n \cup \{b_n\}$. Thus, by Thomson's principle

Letting $n \to \infty$, by Fatou's lemma, we obtain $E(c) \le E(u)$. This not only proves assertion (1) for the transient case but also shows that the flow constructed in b) is actually the unit current flow, since the element in $H_0$ which minimizes the energy is unique. As for the recurrent case, since there is no unit flow with finite energy dissipation, we still have $r_{\mathrm{eff}} = \infty = \min \emptyset$. $\blacksquare$
Corollary 7.20. (1) Corollary 7.17 and Corollary 7.18 remain true for the infinite network.
(2) Given two Markov chains $(E, P_{ij}, \pi_i)$ and $(E, P'_{ij}, \pi'_i)$ which are equivalent in the sense that $C^{-1} \le a_{ij} / a'_{ij} \le C$, they are transient or recurrent simultaneously.

Proof: The second assertion is clear. The first one follows from Corollary 7.17 and the proof e) of the above theorem. $\blacksquare$

Example 7.21. The two-dimensional simple random walk is recurrent.

Proof: Clearly, the chain is symmetrizable with respect to $\pi_i = 4$. Then
$$a_{ij} = \begin{cases} 1, & \text{if } |i - j| = 1, \\ 0, & \text{otherwise.} \end{cases}$$
Take $a = A_0 := \{(0, 0)\}$ and let $A_n$ be the set of the integer points on the square centered at the origin with edges of length $2n$. Denote by $[A_n, A_{n+1}]$ the branches connecting $A_n$ and $A_{n+1}$. Then $|[A_n, A_{n+1}]| = 4(2n + 1)$, where $|A| = \#\{x : x \in A\}$. Even though $A_n$ is not an equipotential surface, we still regard each $A_n$ as a single point. By the shorting law, we have
$$r_{\mathrm{eff}} \ge \sum_{n=0}^{\infty} \frac{1}{4(2n + 1)} = \infty. \qquad\blacksquare$$
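The divergence in the last display is only logarithmic, which can be made visible numerically. The sketch below evaluates the partial sums of the shorted network and compares them with the elementary integral bound $\sum_{n=0}^{N-1} \frac{1}{2n+1} \ge \frac12 \log(2N+1)$; the function name `shorted_resistance` is a hypothetical label.

```python
from math import log

def shorted_resistance(N):
    """Resistance between A_0 and A_N after shorting each square A_n to a point.

    The 4(2n + 1) unit resistors between A_n and A_{n+1} are then in parallel,
    and the successive layers are in series.
    """
    return sum(1.0 / (4 * (2 * n + 1)) for n in range(N))

partial_sums = {N: shorted_resistance(N) for N in (1, 10, 100, 1000, 10000)}
lower_bounds = {N: log(2 * N + 1) / 8 for N in partial_sums}
```

Each partial sum dominates the corresponding lower bound $\frac18 \log(2N+1)$, so the resistance of the shorted network grows without bound, although very slowly.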
Alternative proof: Instead of studying the effective resistance, we consider the unit current flow. What we need to prove is that any unit flow has infinite energy dissipation. Use the notation given in the last proof. Given a flow $(u_{ij})$, for each $n \ge 1$, we have
so ( i d E [ A n, A n + l I
By the Cauchy-Schwarz inequality
it follows that
Therefore
Indeed, the last proof implies a more general result.
Theorem 7.22 (Nash-Williams). Let $(E, X_n, P_{ij}, \pi_i)$ be a symmetrizable Markov chain. Suppose that there exists a partition of $E$: $\sum_n A_n = E$ such that

where $A_{-1} = \emptyset$. Set $[A_{n-1}, A_n] = \{(i, j) : i \in A_{n-1},\ j \in A_n\}$. Then the chain is recurrent whenever
$$\sum_{n=1}^{\infty} \Big(\sum_{(i,j) \in [A_{n-1}, A_n]} a_{ij}\Big)^{-1} = \infty.$$
Proof: If necessary, dividing $A_0$ into two parts, we may assume that $A_0 = \{a\}$. Given a unit flow $(u_{ij})$, we need to show that it has infinite energy. If $\sum_{i \in A_n, j \in E} |u_{ij}| = \infty$ for some $n$, then the Cauchy-Schwarz inequality shows that the flow has infinite energy. Hence, we can assume that $\sum_{i \in A_n, j \in E} |u_{ij}| < \infty$ for all $n \ge 0$. Now, the proof is similar to the previous one. $\blacksquare$

Up to now, we have usually fixed a source $a$. It is sometimes more convenient in practice to leave the point $a$ free. For this, we introduce another criterion.
Theorem 7.23. The Markov chain ( E ,X,,P,,, (u,,) such that (1) (2) 3()
T,)is transient
iff there exists
and
Proof: The conditions are clearly necessary. We now use six steps to prove the sufficiency. Let ( u i j )be given and satisfy the above conditions. a) Prove that there is an a such that
For simplicity, set $d_i = \sum_j u_{ij}$. If necessary, replacing $(u_{ij})$ with $(-u_{ij})$, we may assume that
$$\sum_i d_i > 0. \tag{7.19}$$
Since $\sum_i |d_i| < \infty$, we can choose a finite subset $I \subset E$ so that

(7.20)

If $I$ is a singleton, then we are done. Otherwise, take $a \in I$ and $\ell \in I \setminus \{a\}$. Choose a path from $a$ to $\ell$: $a = i_0 \to i_1 \to \cdots \to i_n \to i_{n+1} = \ell$ and let
Then
Hence
iEI
i
Thus, the new $(\tilde u_{ij})$ still satisfies condition (1), (7.19) and (7.20). But the number of elements of $I$ decreases by one. On the other hand, only finitely many of the $(u_{ij})$ have been changed, so condition (3) remains true. Repeating this procedure, we obtain the required assertion in finitely many steps.

b) Use the notation given in the proof of Theorem 7.19. We prove that the problem is reduced to finding a $(c_{ij}) \in H$ such that (1) holds and
I(P,c)l <
( t i , c ) = (ti+), i # a;
c
l(ti,u)l.
i#u
Actually, from the first condition above, we have (ti,u - c ) = 0 for all i and from the second one, we obtain
i#u
'
j
I
# a,
It follows that $u - c$ satisfies all conditions of Theorem 7.19.

c) We now start to construct the required $c$. The main steps are as follows. Let $F$ and $F^*$ be the same as before. Instead of $H_0$, set
$$H(F) = \{v \in H : v \text{ is anti-symmetric and } (\xi^i, v) = (\xi^i, u) \text{ for all } i \in F^*\}.$$
Denote by $c^F$ the element which minimizes the norm in $H(F)$. Next, let $F_n \uparrow E$; then as the weak limit of $\{c^{F_n} : n \ge 1\}$, we will get the desired $c$. As we did before, the element $c^F$ is described by
$$(c^F, c^F - e) = 0, \qquad e \in H(F).$$
Since $u \in H(F)$, we have $\|c^F\| \le \|u\|$. This holds for all finite $F \ni a$. On the other hand, $\{\|c^{F_n}\| : n \ge 1\}$ is bounded, so we do have a weak limit, denoted by $c$. Clearly, $c \in H$, $c$ satisfies (1) and
$$(\xi^i, c) = (\xi^i, u), \qquad i \ne a.$$
Hence, we need only prove that

(7.21)

for every finite $F \ni a$.

d) Let $c^F = (c_{ij} : i, j \in E)$. We prove that there exists uniquely a function $(v_i)$ such that

(7.22)

(7.23)
As we know, there is a unique $(v_i)$ such that (7.22) holds and $v_a = 0$. We now prove that this $v$ actually vanishes on $E \setminus F^*$. Fix an $i \in E \setminus F$. The discussion is divided into two parts according to $a_{ai} > 0$ or $a_{ai} = 0$.

i) $a_{ai} > 0$. In this case, we claim that $c_{ai} = 0$ and hence $v_i = 0$. Otherwise, using
$$\tilde c_{kj} = \begin{cases} 0, & k = a,\ j = i \ \text{ or } \ k = i,\ j = a, \\ c_{kj}, & \text{otherwise} \end{cases}$$
instead of $(c_{ij})$, we would obtain a different $\tilde c \in H(F)$, anti-symmetric and having a smaller norm $\|\tilde c\| < \|c^F\|$. This is impossible.
ii) $a_{ai} = 0$. Define
$$\tilde a_{ai} = \tilde a_{ia} = c > 0, \qquad \tilde a_{kj} = a_{kj} \quad \text{in other cases},$$
and the corresponding $\tilde H$, $\tilde \xi$, $\tilde H(F)$ and $\tilde c^F$. It is easy to check that $\|\tilde c^F\| \le \|c^F\|$ and so $\tilde c^F = c^F$ by the uniqueness of $c^F$. Therefore, by i), we still have $c_{ai} = \tilde c_{ai} = 0$ and furthermore $v_i = 0$.

e) Let $T = \inf \{n \ge 0 : X_n \notin F^*\}$ and set

In this paragraph, we prove that

(7.24)

(7.25)

First, we have
Next, by (7.22) and (7.23), we have
3
Hence
Furthermore,
jEF'
This shows that $(v - V)(X_{n \wedge T})$ is a $\mathbb{P}_i$-martingale. But $v = V = 0$ on $E \setminus F^*$, so we have $v = V$. To prove (7.25), fix $k \in F^*$ and set $d_{ij} = a_{ij} (g_{ik} - g_{jk})$. Since
it follows that
On the other hand, noting that
dij
+ d j i = 0, we obtain
which is just what we required.

f) Finally, we use (7.24) and (7.25) to prove (7.21). First, by (7.25), we have

Next, by (7.22) and (7.23), we have
From this and (7.24), it follows that
which is what we required.
I
Example 7.24. The simple random walk in three dimensions is transient.
Proof: Take
and set
Ni = { j E z3: li - j l = 1). Clearly, ( u ; ~satisfies ) condition (1) of Theorem 7.23. Since for j E Ni,we have
Condition (3) holds. We now check condition (2). Let M be the unit cube centered at the origin and with length 2m of edges. Then
i€ M j € Ni
i € d M jENi\M
On the other hand, let (el, e2, e3) be the coordinate bases in R3 and set
z [ f+ ( ~+ 3
g(E) =
j(.
Eei)
- Eei) - :!f(x)],
i=l
where f is harmonic in a neighbor of z. Then, by the Taylor expansion formula, 1 g(E) = -(Af)E2 O(E4)= 0 ( E 4 ) . 2
+
Using these two estimates and the fact that $\Delta(1/|x|) = 0$ $(x \ne 0)$, it is easy to justify condition (2) of Theorem 7.23. $\blacksquare$

Alternative proof: The proof consists of two steps. Firstly, construct a tree $T_3$ with finite resistance. We begin the construction by drawing 3 branches from the root, denoted by $t_{11}$, $t_{12}$ and $t_{13}$ respectively. Each branch has 1 ohm. At the $n$-th step, each branch $t_{n-1,k}$ $(1 \le k \le 3^{n-1})$ constructed at the $(n-1)$-th step splits into 3 branches and each of the new branches has $2^{n-1}$ ohm. Suppose that there is no overlap. By the equivalence principle, it is easy to see that
$$r_{\mathrm{eff}} = \sum_{n=0}^{\infty} \frac{2^n}{3^{n+1}} = 1.$$
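The value of the tree resistance can be checked by a short recursion based on the equivalence principle: all nodes of one generation have the same potential, so the three identical subtrees below a node act as resistors in parallel. The sketch assumes that the branches of the $n$-th generation carry $2^{n-1}$ ohm (this reading of the construction is an assumption about the garbled source), and the function name `tree_resistance` is hypothetical.

```python
from fractions import Fraction

def tree_resistance(depth):
    """Resistance between the root and generation `depth` of the ternary tree.

    Generation n consists of 3**n branches of 2**(n - 1) ohm each (assumed).
    Working upwards from the leaves, a node sees three identical subtrees in
    parallel, each made of one branch in series with what lies below it.
    """
    below = Fraction(0)
    for n in range(depth, 0, -1):
        below = (Fraction(2) ** (n - 1) + below) / 3
    return below

truncated = [tree_resistance(N) for N in range(1, 21)]
```

Under this assumption the truncated resistances are exactly $1 - (2/3)^N$, increasing to the limit 1, so the infinite tree has finite resistance.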
Secondly, embed the tree $T_3$ into $\mathbb{Z}^3$. Put the root of the tree at the origin. At each step, let the rays go ahead along the three coordinate directions. Whenever a ray intersects the plane $x + y + z = 2^n - 1$ for some $n$, it splits into three rays. If two rays pass through each other, we simply let them "bounce" (disconnect). Since this tree is a subgraph of $\mathbb{Z}^3$ with finite resistance, we have proved the transience of the simple random walk in $\mathbb{Z}^3$. $\blacksquare$

7.5 Random Walk on Lattice Fractals
To illustrate some applications of the results obtained in the last two sections, we discuss in this section the simple random walk on two lattice fractals, the Sierpinski gasket and the Sierpinski carpet. As an example of the construction of general lattice fractals, we explain how to construct the lattice Sierpinski gasket. To do so, let us recall the usual construction of the Sierpinski gasket in $\mathbb{R}^d$ ($d \ge 2$). Start from $d + 1$ points $\{0 = x_0, x_1, \ldots, x_d\}$ with $|x_i - x_j| = 2$. The $d + 1$ vertices with $d(d + 1)/2$ (adjacent) edges make a $d$-dimensional polyhedron, denoted by $H_0^{(d)}$. Next, let $x_{ij}$ be the midpoint of the line joining $x_i$ and $x_j$, write $x_{ii} = x_i$ and let $H_1^{(d)}$ be the graph with vertices $\{x_{ij} : 0 \le i, j \le d\}$ and edges between $x_{ij}$ and $x_{ik}$, $x_{\ell j}$, $0 \le k, \ell \le d$, $k \ne j$, $\ell \ne i$. Clearly, $H_1^{(d)}$ consists of $d + 1$ $d$-dimensional polyhedrons, with each pair sharing exactly one vertex. Repeating this procedure, we obtain a decreasing sequence $\{H_n^{(d)}\}_{n \ge 0}$. The limiting set $\bigcap_n H_n^{(d)}$ is called the Sierpinski gasket.
The construction of the lattice Sierpinski gasket $G^{(d)}$ proceeds in the opposite way. Denote by $\bar G_0^{(d)}$ the polyhedron in $H_1^{(d)}$ which contains the origin and has edges of length one. Consider $\bar G_0^{(d)}$ as our starting graph and consider $\bar G_1^{(d)} := H_0^{(d)}$ as our construction at the first step. In other words, by using some appropriate translations, we make $d$ new copies of the original $\bar G_0^{(d)}$. Repeating the same procedure, at the $n$-th step, we make $d + 1$ copies of $\bar G_{n-1}^{(d)}$ including $\bar G_{n-1}^{(d)}$ itself. To make the graph more symmetric, let $G_n^{(d)}$ be the union of $\bar G_n^{(d)}$ and its reflection in a fixed coordinate hyperplane. Finally, let $G^{(d)} = \bigcup_n G_n^{(d)}$. Since we are studying the random walk on the lattice fractals, only the vertices and edges are required in the construction. In the 2-dimensional case, this procedure is easily expressed as follows. Let $x_0 = 0$, $x_1 = (1, 0)$, $x_2 = (1/2, \sqrt{3}/2)$ and
where $y + A := \{y + z : z \in A\}$. Next, let $V_n''$ be the reflection of $V_n'$ in the $y$-axis and set $V_n = V_n' \cup V_n''$. Then the vertices $\bigcup_n V_n$ and their adjacent edges constitute the graph $G^{(2)}$.

[Figure: Lattice Sierpinski gasket]

Next, define the simple random walk $(X_n)_{n \ge 0}$ on $G^{(d)}$ in the usual way:
$$P(x, y) = \begin{cases} 1/d(x) & \text{if } x \text{ and } y \text{ are adjacent}, \\ 0 & \text{otherwise}, \end{cases}$$
where $d(x) = \#\{y : y \text{ and } x \text{ are adjacent}\}$. In the present case, $d(x) = 2d$.
Proposition 7.25. The simple random walk on Sierpinski gasket is recurrent in any dimension.
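Proof: Use the above notations and let
$$A_0 = G_0^{(d)}, \qquad A_n = G_n^{(d)} \setminus G_{n-1}^{(d)}, \quad n \ge 1,$$
and define $[A_{n-1}, A_n] = \{(x, y) : x \in A_{n-1},\ y \in A_n\}$. Then $|[A_n, A_{n+1}]| \le 2d^2$, $n \ge 0$. Since $d(x) = 2d$, if we set $\pi_x = 2d$, then $a(x, y) = 1$ for all $x$ and $y$ which are adjacent. Consequently,
$$\sum_{n=0}^{\infty} \Bigl(\sum_{(x, y) \in [A_n, A_{n+1}]} a(x, y)\Bigr)^{-1} = \infty.$$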
The assertion now follows from the Nash-Williams Theorem.
Next, we turn to study the simple random walk on the lattice Sierpinski carpet $F^{(d)}$, which is defined as follows: consider $\mathbb{Z}^d$ as a graph in the usual sense and set
$$\bar F_0^{(d)} = \mathbb{Z}^d \cap [0, 3]^d, \qquad \bar F_{n+1}^{(d)} = \bigcup_{\substack{i_1, \ldots, i_d \in \{0, 1, 2\} \\ (i_1, \ldots, i_d) \ne (1, \ldots, 1)}} \bigl\{(i_1 3^{n+1}, \ldots, i_d 3^{n+1}) + \bar F_n^{(d)}\bigr\},$$
where $y + A = \{y + z : z \in A\}$. Next, let $F_n^{(d)}$ be the union of $\bar F_n^{(d)}$ and its reflections in every coordinate hyperplane. Then, $F^{(d)} = \bigcup_n F_n^{(d)}$.
To see that the carpet is much more difficult than the gasket, note that the key in the above proof is that $|[A_n, A_{n+1}]|$ is bounded. Because of this property, the gasket is said to be a finitely ramified fractal. But for any partition of the carpet, $|[A_n, A_{n+1}]|$ is no longer bounded. Even so, the two-dimensional case is still easy. Recall that the simple random walk on the regular lattice $\mathbb{Z}^2$ is recurrent, so as an application of the cutting law, we see that the simple random walk on the lattice Sierpinski carpet is also recurrent. For the higher dimensional cases, the idea is again to construct a tree $T$ with finite effective resistance and then to embed the tree into the lattice carpet. Since a complete proof is lengthy, here we only list the result as follows:
Proposition 7.26. The simple random walk on the lattice Sierpinski carpet $F^{(d)}$ is recurrent if $d = 2$ and is transient if $d \ge 3$.
Refer to Zhou (1991) or Zhou (2002) for a proof.
7.6 A Comparison Theorem
In Chapter 4, we studied the recurrence for single birth processes, usually not symmetrizable, by comparing them with a birth-death process which is symmetrizable. In this section, we go in the opposite direction. That is, we study the transience (but not recurrence!) by using the same type of comparison. More precisely, we prove the following result.
Theorem 7.27. Let $P$ and $Q$ be two irreducible transition probability matrices which have excessive measure $\mu$ and invariant measure $\nu$ respectively. Suppose that
(1) $\mu$ and $\nu$ are equivalent. That is, there is a constant $C \in (0, \infty)$ such that $C^{-1} \le d\mu/d\nu \le C$.
(2) $Q$ is symmetrizable with respect to $\nu$.
(3) $P \ge \varepsilon Q$ for some $\varepsilon > 0$.
Then for all $\lambda < 1$ and $f \in \mathscr{E}_+$, we have
where $K = 2C^2(2 \vee \varepsilon^{-1})$ and the subscript indicates that the inner product is taken with respect to $\nu$. In particular, if $Q$ is transient, then so is $P$.
The proof of Theorem 7.27 is based on the following simple observation.
Lemma 7.28. Let $A$ and $B$ be two invertible operators on a real Hilbert space with
$$0 \le (Ax, x) \le (Bx, x) \quad \text{for all } x$$
and let $A$ be symmetric. Then $(B^{-1}x, x) \le (A^{-1}x, x)$ for all $x$.
Proof: Since $A$ is non-negative definite and symmetric, we have the Cauchy-Schwarz inequality:
$$(y, Az)^2 \le (y, Ay)\,(z, Az).$$
Applying this to $y = B^{-1}x$ and $z = A^{-1}x$ gives
$$(B^{-1}x, x)^2 \le (B^{-1}x, AB^{-1}x)\,(A^{-1}x, x),$$
which by hypothesis is controlled by
$$(B^{-1}x, BB^{-1}x)\,(A^{-1}x, x) = (B^{-1}x, x)\,(A^{-1}x, x).$$
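The statement of Lemma 7.28 can be sanity-checked on a small example. The following is a minimal sketch on $\mathbb{R}^2$, with a symmetric matrix `A` and a non-symmetric matrix `B` chosen here purely for illustration (they are not taken from the text); it samples vectors and tests both the hypothesis $0 \le (Ax, x) \le (Bx, x)$ and the conclusion $(B^{-1}x, x) \le (A^{-1}x, x)$.

```python
import random


def mat_vec(m, x):
    """Product of a 2x2 matrix (tuple of rows) with a vector."""
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])


def inner(x, y):
    """Euclidean inner product on R^2."""
    return x[0] * y[0] + x[1] * y[1]


def inverse_2x2(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] / det, -m[0][1] / det), (-m[1][0] / det, m[0][0] / det))


def check_lemma(a, b, trials=1000, seed=0, tol=1e-12):
    """True if hypothesis and conclusion of the lemma hold on all sampled x."""
    rng = random.Random(seed)
    a_inv, b_inv = inverse_2x2(a), inverse_2x2(b)
    for _ in range(trials):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        ax, bx = inner(mat_vec(a, x), x), inner(mat_vec(b, x), x)
        if ax < -tol or ax > bx + tol:
            return False  # hypothesis 0 <= (Ax, x) <= (Bx, x) violated
        if inner(mat_vec(b_inv, x), x) > inner(mat_vec(a_inv, x), x) + tol:
            return False  # conclusion (B^-1 x, x) <= (A^-1 x, x) violated
    return True


# Illustrative choice: (Bx, x) - (Ax, x) = x1^2 + x2^2 >= 0, and B is not symmetric.
A = ((2.0, 1.0), (1.0, 2.0))
B = ((3.0, 2.0), (0.0, 3.0))

if __name__ == "__main__":
    print(check_lemma(A, B))
```

Note that only $A$ is required to be symmetric; the example deliberately uses a non-symmetric $B$ to reflect this.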
Proof of Theorem 7.27: Let $\varphi = d\nu/d\mu$ and define
$$P' = (I + P)/2, \qquad Q' = (1 - \varphi/2C)\,I + (\varphi/2C)\,Q,$$
where $I$ denotes the identity operator. Then it is easy to check that $P'$ and $Q'$ are transition probabilities having $\mu$ as their excessive and invariant measure respectively. Since $C^{-1} \le \varphi \le C$, we have $P' \ge [2^{-1} \wedge \varepsilon]\,Q'$. Set $\delta = 2^{-1} \wedge \varepsilon$. Hence $(P' - \delta Q')/(1 - \delta)$ is a transition probability and contractive on $L^2(\mu)$, and so
Thus, for every $\lambda < 1$, we obtain
Similarly, from the equality
it follows that
Therefore
From this and Lemma 7.28, the conclusion follows immediately.
7.7 Notes
The field theory was first proposed by Hou and Chen (1979) to study symmetrizable Markov chains. It was then used to study the reversibility for particle systems by Yan, Chen and Ding (1982a,b). See Section 11.4 for more references. The first two sections of this chapter are based on the above quoted papers. Here, we consider the electric network as a particular case of the field theory. Section 7.3 is taken from Doyle and Snell (1984). Section 7.4 is taken from Lyons (1983), but we emphasize the connection with the electric network. As for the simple random walk on the lattice fractals, it has been studied recently by Zhou (1991). Theorem 7.27 goes back to Varopoulos (1983) and Durrett (1985). It was applied by Durrett (1986) to the multidimensional random walks in random environments. The present version of Theorem 7.27 is taken from Chen (1991d), in which some further applications and references are included.
Chapter 8
Large Deviations
This chapter begins with a short introduction to the theory of large deviations. Then we study the rate function of large deviations for jump processes. In particular, we present an explicit formula for the rate function in the symmetrizable case. Finally, we study the large deviation principle for Markov chains.
8.1 Introduction to Large Deviations
Recall that $\mathscr{P}(E)$ denotes the collection of probability measures on a Polish space $(E, \mathscr{E})$. Given $\{\mu_\varepsilon\}_{\varepsilon > 0} \subset \mathscr{P}(E)$ with $\mu_\varepsilon \Longrightarrow \delta_{x_0}$ as $\varepsilon \to 0$ for some $x_0 \in E$, we have $\mu_\varepsilon(\Gamma) \to 0$ as $\varepsilon \to 0$ for each $\Gamma$ with $\bar\Gamma$ (closure of $\Gamma$) $\not\ni x_0$. The purpose of the study of large deviations is to find out the convergence rate. Certainly, the most common rate is the exponential one: $\mu_\varepsilon(\Gamma) \sim \exp[-r/\varepsilon]$, $r \ge 0$. Hence, we are seeking the expression:
$$r = -\lim_{\varepsilon \to 0} \varepsilon \log \mu_\varepsilon(\Gamma)$$
provided the limit exists. In order to show what we can expect, consider a simple example as follows.
Example 8.1. Let $\{X_n\}_1^\infty$ be i.i.d. with normal distribution, denoted by $X_1 \sim N(0, 1)$, and set $\bar X_n = \sum_{k=1}^{n} X_k / n$. By the strong law of large numbers, we have
$$\bar X_n \to 0 \quad \text{a.s. as } n \to \infty.$$
Denote by $\mu_n$ the distribution of $\bar X_n$, i.e., $\mu_n = \mathbb{P} \circ \bar X_n^{-1}$, then $\mu_n \Longrightarrow \delta_0$ as $n \to \infty$. For seeking the rate, it is natural to consider first the basic sets: closed sets and open sets.
a) Let $C$ be closed with $0 \notin C$. Define $\ell = \text{distance}(0, C) > 0$. Since $\mu_n \sim N(0, 1/n)$ and $C \subset (-\infty, -\ell] \cup [\ell, \infty)$, we have
and so
$$\limsup_{n \to \infty} \frac{1}{n} \log \mu_n(C) \le -\ell^2/2 \quad \text{for all closed } C \not\ni 0. \tag{U}$$
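b) Let $G$ be an open set, $G \ne \emptyset$. Again, set $\ell = \text{distance}(0, G)$. Then, for $\varepsilon > 0$ small enough, either $\ell + \varepsilon \in G$ or $-\ell - \varepsilon \in G$. Suppose that $\ell + \varepsilon \in G$. Then for $n$ large enough, we have $[\ell + \varepsilon, \ell + \varepsilon + 1/n] \subset G$. Thus
$$\mu_n(G) \ge \Bigl(\frac{n}{2\pi}\Bigr)^{1/2} \int_{\ell + \varepsilon}^{\ell + \varepsilon + 1/n} e^{-n y^2/2}\,dy.$$
Furthermore,
$$\liminf_{n \to \infty} \frac{1}{n} \log \mu_n(G) \ge -(\ell + \varepsilon)^2/2.$$
Since $\varepsilon$ is arbitrary, we get
$$\liminf_{n \to \infty} \frac{1}{n} \log \mu_n(G) \ge -\ell^2/2 \quad \text{for all open } G \ne \emptyset. \tag{L}$$
Before moving further, note that we can cancel the extra assumptions “$0 \notin C$” and “$G \ne \emptyset$”. Next, for general $\Gamma \in \mathscr{E}$, if $\inf_{x \in \Gamma^\circ} |x| = \inf_{x \in \bar\Gamma} |x|$, we have the equality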
This shows that we can expect the equality only when $\Gamma$ has nice topological properties. In general, we can only expect the above two inequalities (U) and (L). Finally, since the left-hand sides of the inequalities are expressed in terms of the sets $C$ and $G$, the right-hand sides should also be. For this, we introduce a function $I(x) = x^2/2$. Then the right-hand sides of (U) and (L) can be rewritten as
$$-\inf_{x \in C} I(x) \quad \text{and} \quad -\inf_{x \in G} I(x)$$
respectively. Now, the expressions are symmetric. Clearly, the function $I$ describes the convergence rate.
Definition 8.2. A function $I$ on $E$ is called a rate function or $I$-function or entropy, if
(1) $0 \le I \le \infty$;
(2) $I$ is compact.
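For Example 8.1 the exponential rate can be observed numerically. The following is a minimal sketch, assuming the closed set $C = [t, \infty)$ with $t > 0$, so that $\mu_n(C)$ is a Gaussian tail expressible through `math.erfc`; the function name `tail_rate` is introduced here for illustration. It evaluates $-\frac{1}{n}\log\mu_n(C)$, which should approach $I(t) = t^2/2$ as $n$ grows.

```python
import math


def tail_rate(n: int, t: float) -> float:
    """Return -(1/n) * log P(sample mean of n i.i.d. N(0,1) variables >= t)."""
    # The sample mean is N(0, 1/n), so the tail equals P(Z >= t * sqrt(n)).
    p = 0.5 * math.erfc(t * math.sqrt(n / 2.0))
    return -math.log(p) / n


if __name__ == "__main__":
    t = 1.0
    for n in (10, 100, 1000):
        # Compare with the rate function value I(t) = t^2 / 2.
        print(n, tail_rate(n, t), t * t / 2)
```

The sketch stops at moderate $n$ because the tail probability underflows in double precision for very large $n$; the slow polynomial correction of order $\log(n)/n$ explains why the convergence to $t^2/2$ is visible but not fast.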
Definition 8.3. We say that $\{\mu_\varepsilon\}_{\varepsilon > 0}$ satisfies the large deviation principle (abbrev. L.D.P.) with rate function $I$, if
(U) $\limsup_{\varepsilon \to 0} \varepsilon \log \mu_\varepsilon(C) \le -\inf_{x \in C} I(x)$ for all closed $C$,
(L) $\liminf_{\varepsilon \to 0} \varepsilon \log \mu_\varepsilon(G) \ge -\inf_{x \in G} I(x)$ for all open $G$.
The following result is often used in various applications.
Theorem 8.4 (Varadhan). Let $\{\mu_\varepsilon\}_{\varepsilon > 0}$ satisfy the L.D.P. with rate function $I$, then for every closed function $\Phi$, bounded below, we have
If moreover, $\Phi \in C_b(E)$, then
(cf. Stroock (1984), Theorem 2.6).
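Historically, the theory of large deviations originated with H. Cramér.
Theorem 8.5 (Cramér (1938)). Let $\{X_n\}_1^\infty$ be i.i.d. real-valued random variables and $X_1 \sim \mu$. Suppose that
$$\int e^{\xi x}\,\mu(dx) < \infty \quad \text{for all } \xi \in \mathbb{R}.$$
Then the L.D.P. holds with
$$I_\mu(x) = \sup_{\xi \in \mathbb{R}} \Bigl(\xi x - \log \int e^{\xi y}\,\mu(dy)\Bigr)$$
(cf. Stroock (1984), Theorem 3.8).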
Applying Cramér's theorem to the above example, we have
$$I_\mu(x) = \sup_{\xi} \bigl(\xi x - \xi^2/2\bigr) = \sup_{\xi} \bigl(x^2 - (\xi - x)^2\bigr)/2 = x^2/2.$$
Now, we generalize Cramér's theorem to more general state spaces. The first step goes from $\mathbb{R}$ to $\mathbb{R}^d$. The assumption becomes
where $(x, y)$ is the usual inner product in $\mathbb{R}^d$. The rate function becomes
Everything is quite natural. Now, it is not difficult to guess that for a Banach space $E$, we should replace $(\xi, x)$ by $x^*(x)$, where $x^*$ is an element in the dual space $E^*$.
Theorem 8.6 (Donsker and Varadhan). Let $E$ be a separable Banach space, $\{X_n\}_1^\infty$ be i.i.d. random variables valued in $E$, $X_1 \sim \mu$. Then the L.D.P. holds with
provided
$$\int e^{\xi \|x\|}\,\mu(dx) < \infty \quad \text{for all } \xi \in \mathbb{R}.$$
The only change is replacing $x^*(x)$ with $\xi\|x\|$ in the last line.
Next, we explain why we need a more general state space, namely a Polish space. Consider again the real-valued case. In statistics, one often considers the empirical distributions:
This average induces a distribution $Q_n$ on $\mathscr{B}(\mathscr{P}(\mathbb{R}))$, the Borel $\sigma$-algebra generated by the topology of weak convergence in $\mathscr{P}(\mathbb{R})$:
Now, the problem is to study the large deviations for the sequence Qn of probability measures. Note that the state space is a Polish space but no longer a Banach space. The answer to the problem is contained in the next theorem, which is very important in the development of the theory of large deviations.
Theorem 8.7 (Sanov (1957)). Using the above notations, the L.D.P. holds for $Q_n$ with
$$I_\alpha(\nu) = \sup_{f \in C_b(\mathbb{R})} \Bigl(\int f\,d\nu - \log \int e^f\,d\alpha\Bigr),$$
where $\alpha$ is the distribution of $X_1$ (cf. Stroock (1984), Theorem 3.40).
We now explain how to get the rate function $I_\alpha$. Let $\mathscr{M}(E)$ be the set of totally finite signed measures endowed with the topology of weak convergence. Then, it is a locally convex, Hausdorff topological vector space. Note that
$$\Lambda(\nu) := \nu f,\ f \in C_b(E) \implies \Lambda \in \mathscr{M}(E)^*.$$
On the other hand,
Thus $C_b(E) \cong \mathscr{M}(E)^*$. Next, define $\lambda : E \to \mathscr{P}(E)$, $x \mapsto \delta_x$. Then we have
where $\alpha := \mathbb{P} \circ X_1^{-1}$ and $\mu := \alpha \circ \lambda^{-1}$. Now, applying an extension of the Donsker-Varadhan theorem, we obtain
This is exactly the rate function given in Sanov's theorem.
Before moving further, we would like to introduce an alternative expression of $I_\alpha(\nu)$. Suppose that $\nu \in \mathscr{P}(E)$, $\nu \ll \alpha$ and set $\varphi = d\nu/d\alpha$. Then, by Jensen's inequality, under some natural assumptions, we have
hence
$$\int (f - \log \varphi)\,d\nu \le \log \int e^f\,d\alpha.$$
That is,
$$\nu f - \log \int e^f\,d\alpha \le \int \log \varphi\,d\nu = \int \varphi \log \varphi\,d\alpha.$$
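Indeed, by taking the supremum over all $f \in C_b(E)$ on the left-hand side, one can obtain an equality. Combining this with the expression of $I_\alpha(\nu)$ obtained in the last paragraph, we get formally
$$I_\alpha(\nu) = \begin{cases} \displaystyle\int \frac{d\nu}{d\alpha} \log \frac{d\nu}{d\alpha}\,d\alpha & \text{if } \nu \ll \alpha, \\ \infty & \text{if } \nu \not\ll \alpha. \end{cases}$$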
This is very interesting because the right-hand side is just the Boltzmann (or Shannon) entropy.
In order to develop the theory of large deviations beyond the independent case, the next step is naturally to consider Markov processes. First, we consider the time-discrete case. Let $\Omega = E^{\mathbb{N}}$. Define
$$X_n(\omega) = \omega(n), \quad \omega \in \Omega,\ n \in \mathbb{N}$$
and the shift operator
$$\theta_n : \Omega \to \Omega, \quad X_m(\theta_n \omega) = X_{m+n}(\omega), \quad m, n \in \mathbb{N}; \qquad \theta = \theta_1.$$
Suppose that $\{X_n\}_1^\infty$ is a Markov process with transition probability $P(x, dy)$. At the moment, we suppose that the measure $\mathbb{P}$ on $(\Omega, \mathscr{F})$ determined by the process is stationary and ergodic:
$$\mathbb{P}(\theta^{-1}A) = \mathbb{P}(A),\ A \in \mathscr{F}; \qquad \theta^{-1}A = A \implies \mathbb{P}(A) = 0 \text{ or } 1.$$
Then, from Birkhoff's ergodic theorem, it follows that
. n-1 'p o e,(w) = 'p(0,w). In particular, take f E td? and set ~ = P ~ o xweget , ~ ,
where
Now, define 1 ~ia(QJ)
=-
c
n-l
n m=O ~ X , ( W ) ( W
Then the above fact can be rewritten as
'p =
f o Xo,
Since $E$ is a Polish space, $C_b(E)$ is separable, we indeed have
$$L_n(\cdot, \omega) \Longrightarrow \mu, \quad \text{a.s.}$$
Set $Q_n = \mathbb{P} \circ L_n^{-1}$, as we have seen several times before, we actually obtain $Q_n \Longrightarrow \delta_\mu$. Hence we have returned to the beginning of this section. The question is what the rate function is. For this, let us return to Sanov's theorem. Its rate function is
However,
This leads to the expression:
where Cb(E)+ denotes the set of strictly positive functions in Cb(E). In the present case, we do not have a , which means that N and NU have to be changed, respectively, in terms of PXand P ( x ,d y ) :
Finally
$$I(\mu) = -\inf_{u \in C_b(E)_+} \int \log \frac{Pu}{u}\,d\mu.$$
For the time-continuous case, everything is similar to the time-discrete one. Define
$$L_t(B, \omega) = \frac{1}{t} \int_0^t \delta_{X(s, \omega)}(B)\,ds, \qquad Q_{t,x} = \mathbb{P}_x \circ L_t^{-1}.$$
Then, the L.D.P. becomes
$$\limsup_{t \to \infty} \frac{1}{t} \log Q_{t,x}(C) \le -\inf_{\mu \in C} I(\mu) \quad \text{for all closed } C, \tag{U}$$
$$\liminf_{t \to \infty} \frac{1}{t} \log Q_{t,x}(G) \ge -\inf_{\mu \in G} I(\mu) \quad \text{for all open } G. \tag{L}$$
Again, what is $I(\mu)$? For each $h > 0$, define
Then
the rate function is
By the preceding, we see that for
However, for g E Cb ( E ),
We guess that
1 1 -logQt,Z(A) = -IogP,[Lt t t
1 t
E A ] 2 - - ~ o ~ P , [ L [ $E) ~A, ],
Therefore, we guess that
I ( p ) = -f E Cinf b ( E ) + lim hlO h
d p = -f € Cinf b ( E ) + [ dt d s l o g w f d p ] t=O .
Finally
and so
$$I(\mu) = -\inf_{u \in \mathscr{D}(L)_+} \int \frac{Lu}{u}\,d\mu, \tag{8.1}$$
where $\mathscr{D}(L)_+$ is the set of strictly positive elements in the domain $\mathscr{D}(L)$ of the strong infinitesimal generator $L$. This is precisely the correct answer. The above discussions suggest that we call the rate function $I(\mu)$ the Donsker-Varadhan entropy.
8.2 Rate Function
In this and the next sections, we study the theory of large deviations for jump processes. Of course, we consider only the regular $q$-pairs. Since $b\mathscr{E}$ is a Banach space with the uniform norm $\|\cdot\| = \|\cdot\|_u$, a jump process $P(t)$ induces in a natural way a semigroup on $b\mathscr{E}$, also denoted by $P(t)$. Its generator is denoted by $L$ with domain $\mathscr{D}(L)$. Since, for a given $q$-pair, the domain $\mathscr{D}(L)$ is generally not known explicitly, it is helpful to find an explicit expression for the rate function. For simplicity, we use the following notations.
has a strictly positive lower bound},
The next result says that if we restrict ourselves to $\mu \in \mathscr{P}(q)$, then for computing $I(\mu)$, the set $\mathscr{D}(L)_+$ can be replaced by a much larger class of functions.
Theorem 8.8. For every $\mu \in \mathscr{P}(q)$, we have
Proof: Equality (8.2) is a copy of (8.1). Equality (8.4) is obvious. Equality (8.5) follows from the fact that f E b b + iff l/f E b b f . We now prove (8.3).
It is known that the strong generator L and the weak generator of { P ( t ) } both coincide with the operator C2 on b 8 . Let be the domain of the weak generator. We have
G
Take an f E
b&'+
and put l/n
P(t)fdt,
fn=nl
n31.
It follows that
and f n ( x ) -i f(x) for each x E E as n --+ co. Since
Applying the dominated convergence theorem, we get
n b 8 + , we can choose a sequence { f a } ? c 9 ( L ) Similarly, for each f E 90 such that S < fn < llfll and \Ifn - f l l --+ 0 as n + 00. Again, by using the fact that p E 9 ( q ) and the dominated convergence theorem, we have the same conclusion as given in (8.8). Hence we have proved that
and inf f€&b&+
Combining this with
we get (8.3).
1Qff
-dp
= inf fEbb+
/
ydp.
To prove (8.6), we fix an $f \in b\mathscr{E}^{+}$. Without loss of generality, assume that
Equivalently,
Let
fn
=f
+ l / n , n 2 1. Then fn E bd?
and
Hence condition (8.9) and the dominated convergence theorem give us
which is just the equality (8.6). The last equality (8.7) can be proved in a similar way.
To show that Theorem 8.8 is practical, we prove the following result.
Proposition 8.9. Let $\mu \in \mathscr{P}(q)$. Then $I(\mu) = 0$ iff $\mu$ is a stationary distribution of $\{P(t)\}$.
Proof: First of all, we assume that $I(\mu) = 0$. Since $\mu \in \mathscr{P}(q)$, from Theorem 8.8, it follows that
w e now fix an f E b 8 + , then there exists a 6 > 0 such that 1 for every E E [-6, oo),and so
As a function of
E,
F f has a minimum 0 at d -FFf(O) = dE
s
E
+ ~f E b&+
= 0: therefore
62fdp = 0.
This implies that p ( q f ) = p ( Q f ) for each f E b€‘+ and hence for each f E rd?+, which is then equivalent to the claim that p is a stationary distribution of ( P ( t ) }(Theorem 4.17).
Conversely, assume that $\mu$ is a stationary distribution of $\{P(t)\}$. From the last part of the proof of Donsker and Varadhan (1975) [Lemma 2.5], it follows that
On the other hand, the proof of Donsker and Varadhan (1975) [Lemma 3.1] also works in our case, hence we have
Combining these two facts together, we obtain the required assertion. Now, we would like to give a clearer expression for I ( p ) . For this we need a hypothesis.
Hypothesis 8.10 ( H I ) . There exist a 0-finite measure X and an 6' x 8measurable function q(z,y) such that q ( z , d y ) = q(z,y)X(dy)
for all
z,y E E .
Theorem 8.11. Under ( H I ) ,for each p E 9 ( q ) satisfying p << A, we have
(8.10) where %?denotes one of
g ( L ) + ,b 8 + ,
b8',
8+ or 8'.
Proof: For convenience, if a $\sigma$-finite measure $\alpha$ on $(E, \mathscr{E})$ is absolutely continuous with respect to $\lambda$, we write $\alpha(x) = \dfrac{d\alpha}{d\lambda}(x)$ for $x \in E$. Now, let $\mu \ll \lambda$ and define
Denote by $H(\mu)$ and $H(\mu, f)$ the integrals of $h(\mu; x, y)$ and $h(\mu; f; x, y)$ with respect to $\lambda \times \lambda$, respectively. Since $\mu \in \mathscr{P}(q)$, we have
and so by Theorem 8.8, we obtain
$$= \frac{1}{2} H(\mu) - \frac{1}{2} \inf_{f \in \mathscr{C}} H(\mu, f).$$
After taking a look at the expression of (8.10), one may guess that the second term in (8.10) should vanish in the symmetrizable case. This is correct and is presented in the next theorem. Before studying this problem, we notice a simple fact as follows.
Lemma 8.12. Let $(q(x), q(x, dy))$ be symmetrizable with respect to $\pi$. Under $(H_1)$, we have $\pi \ll \lambda$ on $E_+ := \{x \in E : q(x) > 0\}$.
Proof: Let $\lambda(A) = 0$. Since $\pi$ is a symmetrizing measure of $q(x, dy)$, it follows that
This proves our assertion.
Hypothesis 8.13 $(H_2)$. $\dfrac{d\pi}{d\lambda} > 0$, $\lambda$-a.e. on $E_+$.
Again, for $\nu \ll \lambda$, we write $\nu(x) = \dfrac{d\nu}{d\lambda}(x)$ for simplicity.
Lemma 8.14. Under $(H_2)$, we have $q(x, E \setminus E_+) = 0$, $\lambda$-a.e. on $E_+$.
Proof: By Lemma 8.12 and the symmetry, we have
So $(H_2)$ gives us the required assertion.
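Lemma 8.15. Under $(H_1)$ and $(H_2)$, we have
$$\pi(x)\,q(x, y) = \pi(y)\,q(y, x) \quad \text{on } E_+ \times E_+,\ \lambda \times \lambda\text{-a.e.}$$
Proof: By Lemma 8.12 and the symmetry, we have
Hence for $A, B \in \mathscr{E} \cap E_+$, we have
$$\int_A \lambda(dx) \int_B \lambda(dy)\,\pi(x)\,q(x, y) = \int_A \lambda(dx) \int_B \lambda(dy)\,\pi(y)\,q(y, x).$$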
Starting from this and using the monotone class theorem, the required assertion follows immediately. 1
Theorem 8.16. Under $(H_1)$ and $(H_2)$, for every $\mu \in \mathscr{P}(q)$ satisfying $\mu \ll \lambda$ on $E_+$, we have
Proof: a) By Lemma 8.14, we have
Hence for every $f \in b\mathscr{E} \cup \mathscr{E}_+$,
Starting from this and using the approach used in the proof of Theorem 8.11, we obtain
Thus, the proof is reduced to the particular case that E = Et. From now on, we assume that E = Et. b) Because of 0 < T < 00, X-a.e., without loss of generality, assume that 7r E €'. At the moment, we also assume that 0 < p ( z ) < 00 for all z. Then p/7r E €', and so inf W P ; f) q p ;
<
fE g o
m>.
But by Lemma 8.15, we have
m)
= 0, and hence I ( p ) = H ( p ) / 2 , which is This implies that H ( p ; exactly what we want. c) To remove the extra assumption " p > 0" made in the last step, we prove that there is a probability measure a such that a << A, a(.) > 0 on E and aq < 00. For this, choose a sequence of disjoint sets {&}? C € such that 0 < A(&) < 00 and B, = E . Set
Cf"
Then f E d oand
+ f(~)-']-',
Now, if we take &(z)= [q(z)
a ( A )=
then the measure a defined by
A
E
8
< aq <
00.
& ( r ) X ( d z ) / / cu(~)X(dz),
will have the required properties. d) By b) and c), we have inf H ( a ;f) = 0 f€%
Define
1 n
and
1 I ( a )= - H ( a ) 2
n-1 n
p, = - a + -P ,
n 3 1.
Then p n has the properties required in b) and so I ( p n ) = H ( p n ) / 2 for all n 2 1. We now show that the assertion
(8.11) implies quickly the conclusion of the theorem. Indeed, noting that
(8.12) (8.23)
and using the dominated convergence theorem, we get 1 . 1 lim H(,un) = - H ( p ) , n-+m 2 n+m 2 which plus (8.11) gives us the conclusion of the theorem. e) Let us return t o the proof of (8.11). By the convexity of I ! we have
lim I(p,) =
_ .
and so limn+m I ( p n ) it follows that
< I(,u).On the other hand, noting (8.12) and (8.131,
n--.oo f E 9 ( L ) +
72+M
-
This finishes the proof.
The next result is a different version of the previous one. It will be proved by using the Dirichlet form.
Theorem 8.17. Suppose that
(1) $q(x) > 0$ on $E$ (i.e., $E = E_+$).
(2) $(E, \mathscr{E})$ is locally compact, $q(x)$ is locally bounded and
(3) $(H_1)$ and $(H_2)$ hold.
Then, for every $\mu \ll \lambda$ we have
Proof: a) First, we prove that for every $\mu \ll \lambda$,
$$\mu P(t) \ll \pi, \quad t \ge 0.$$
Let $\pi(B) = 0$. By condition (1) and $(H_2)$, we have $\lambda \ll \pi$ and so $\mu \ll \pi$. Hence $\mu(e^{-qt} I_B) = 0$. Next, since for fixed $A$, $P(t, \cdot, A)$ is the minimal solution to the backward equation, $\mu P(t) I_B$ satisfies the equation
But
$= 0$.
Combining the above three facts together, we obtain $\mu P(t) I_B = 0$.
b) Under the hypotheses of the theorem, the Dirichlet form is unique. Now, we can apply [Stroock (1984), Theorem 7.44], which says that if $\mu P(t) \ll \pi$ for all $t > 0$, then the rate function $I(\mu)$ can be computed by using the Dirichlet form:
here in the last step we have used Lemma 8.15.
Corollary 8.18. Let $Q = (q_{ij})$ be an irreducible regular $Q$-matrix, symmetrizable with respect to $\{\pi_i > 0 : i \in E\}$, then for each probability measure $\mu$, we have
$$I(\mu) = \frac{1}{2} \sum_{i, j} \bigl(\sqrt{\mu_i q_{ij}} - \sqrt{\mu_j q_{ji}}\bigr)^2.$$
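The formula of Corollary 8.18 is easy to evaluate for a finite chain. The following is a minimal sketch, assuming the formula reads $I(\mu) = \frac12\sum_{i \ne j}(\sqrt{\mu_i q_{ij}} - \sqrt{\mu_j q_{ji}})^2$ (the diagonal terms contribute nothing and are skipped in the code); the three-state birth-death $Q$-matrix and all function names are hypothetical illustrations. The sketch also compares the value with the Dirichlet form of $\sqrt{\mu/\pi}$, which agrees with it under detailed balance, and checks that $I(\pi) = 0$ in line with Proposition 8.9.

```python
import math


def rate_function(mu, q):
    """0.5 * sum over i != j of (sqrt(mu_i q_ij) - sqrt(mu_j q_ji))^2."""
    n = len(mu)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:  # diagonal terms vanish and are skipped
                diff = math.sqrt(mu[i] * q[i][j]) - math.sqrt(mu[j] * q[j][i])
                total += diff * diff
    return 0.5 * total


def dirichlet_form(pi, q, f):
    """0.5 * sum over i != j of pi_i q_ij (f_j - f_i)^2."""
    n = len(pi)
    return 0.5 * sum(
        pi[i] * q[i][j] * (f[j] - f[i]) ** 2
        for i in range(n) for j in range(n) if i != j
    )


# Hypothetical birth-death Q-matrix on {0, 1, 2}: birth rate 1, death rate 2.
Q = [[-1.0, 1.0, 0.0],
     [2.0, -3.0, 1.0],
     [0.0, 2.0, -2.0]]
weights = [1.0, 0.5, 0.25]                # detailed balance: pi_i q_ij = pi_j q_ji
PI = [w / sum(weights) for w in weights]  # symmetrizing probability measure

if __name__ == "__main__":
    mu = [1.0 / 3, 1.0 / 3, 1.0 / 3]
    f = [math.sqrt(m / p) for m, p in zip(mu, PI)]
    print(rate_function(PI, Q))           # vanishes at the stationary distribution
    print(rate_function(mu, Q), dirichlet_form(PI, Q, f))
```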
8.3 Upper Estimates
In this section, we study the large deviation principle for Markov chains, i.e., the upper estimates and the lower estimates. For the lower estimates, the answer is known. Actually, if the chain has no absorbing state and is irreducible, then the two hypotheses given in [Donsker-Varadhan (1983), Section 5] are satisfied by using the counting measure as a reference measure. Hence, as proved in the above quoted section, the lower estimates hold. Thus, whenever we have the upper estimates, the large deviation principle holds with the rate function $I$ given in the last section (cf. Sections 1-6 of the above quoted paper for details). As for the case that there are several absorbing states, the lower estimates usually cannot hold. Therefore, we need only study the upper estimates. Our discussions are divided into two parts.
Case (I). There are at most a finite number of absorbing states.
Case (II). There are no absorbing states.
For the first case, we use the theory of the minimal non-negative solutions and the coupling technique. For the second, we use the martingale approach.
Definition 8.19. We call
(8.14)
a constrained system of homogeneous non-negative linear inequalities (abbrev. a constrained system) if
$$c_{ik},\ d_i \in [0, +\infty], \quad i, k \in E \tag{8.15}$$
and
$$\sum_k c_{ik} d_k \ge d_i, \quad i \in E. \tag{8.16}$$
We call $\{x_i^* \in [0, +\infty] : i \in E\}$ the minimal (non-negative) solution to the constrained system (8.14), if it is a solution to (8.14) and for any solution $\{x_i : i \in E\}$ to (8.14), we have $x_i \ge x_i^*$ for all $i \in E$.
By induction, it is elementary to prove the following result.
Proposition 8.20. (1) The minimal solution to (8.14) exists uniquely. Indeed, it can be obtained in the following way: set
$$x_i^{(0)} = d_i, \qquad x_i^{(n+1)} = \sum_k c_{ik} x_k^{(n)}, \quad i \in E,\ n \ge 0,$$
then $x_i^{(n)} \uparrow x_i^*$ as $n \to \infty$ for all $i \in E$.
(2) Let $(\tilde c, \tilde d)$ satisfy (8.15), (8.16) and
Then every solution $\{\tilde x_i : i \in E\}$ to the inequality
(8.17)
is a solution to (8.14). Hence $\tilde x_i \ge x_i^*$ for all $i \in E$. In particular, if (8.17) has a finite non-negative (resp. and non-decreasing) solution, then so does (8.14).
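The iteration in part (1) of Proposition 8.20 is directly computable when $E$ is finite. The following is a minimal sketch, assuming a hypothetical two-state pair $(c, d)$ chosen here so that (8.15) and (8.16) hold; the function name `iterate_minimal_solution` is ours. It records the successive iterates $x^{(n)}$ so that the monotone convergence can be inspected.

```python
def iterate_minimal_solution(c, d, steps=100):
    """Return the iterates x^(0) = d, x^(n+1)_i = sum_k c[i][k] * x^(n)_k."""
    x = list(d)
    history = [x]
    for _ in range(steps):
        x = [sum(c[i][k] * x[k] for k in range(len(x))) for i in range(len(x))]
        history.append(x)
    return history


# Hypothetical example: c and d are non-negative and sum_k c[i][k] * d[k] >= d[i].
C_MATRIX = [[1.0, 0.0],
            [0.5, 0.5]]
D_VECTOR = [2.0, 1.0]

if __name__ == "__main__":
    history = iterate_minimal_solution(C_MATRIX, D_VECTOR)
    for n in (0, 1, 2, 5, 100):
        # The second coordinate increases along 1, 1.5, 1.75, ... towards 2.
        print(n, history[n])
```

In this example the iterates increase coordinatewise to $(2, 2)$, which dominates $d$ and is reproduced by one more application of $c$. For an infinite state space the sums would have to be truncated, which this sketch does not address.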
Now, we endow $E$ with the discrete topology, then $\{P(t)\}$ is a Feller semigroup (i.e., $P(t)$ maps the class of bounded continuous functions into itself). Let $\{X(t) : t \ge 0\}$ be the sample process associated with the semigroup $\{P(t)\}$. Denote by $\tau_0 \equiv 0, \tau_1, \tau_2, \ldots$ the successive jump times of $\{X(t)\}$.
Lemma 8.21. Let $\Phi \in \mathscr{E}_+$ satisfy
(8.18)
and set
(8.19)
Then $\{\varphi_i : i \in E\}$ is the minimal solution to (8.14) with
Proof: Clearly, $(c, d)$ satisfies (8.15) and (8.16). Put
$$\varphi_i^{(n)} = \mathbb{E}_i\Bigl(\exp\Bigl[\int_0^{\tau_{n+1}} \Phi(X(s))\,ds\Bigr]\Bigr), \quad i \in E,\ n \ge 0.$$
By Proposition 8.20 (1), it suffices to prove that
This is trivial if $q_i = 0$ (since $\Phi(i) = 0$ and then $\varphi_i^{(n)} = 1$ ($n \ge 0$)). We now assume that $q_i > 0$. Then
from this, the inductive assumption and the strong Markov property, we have
qik kfi
(n)
qi
Lemma 8.22. Let $Q = (q_{ij})$ be a birth-death matrix which may have some absorbing states. Suppose that there is a constant $\alpha > 1$ such that
$$a_i = q_{i,i-1} = \alpha q_{i,i+1} = \alpha b_i, \quad i \in E, \qquad \lim_{i \to \infty} b_i = \infty.$$
Then there exists a function $\Phi$ on $E$ such that (8.18) holds, $\Phi(i) \to \infty$ as $i \to \infty$ and $\varphi_i$ ($i \in E$), defined by (8.19) with this $\Phi$, is finite.
Proof: The regularity of the $Q$-matrix follows from $\alpha > 1$. For each $\varepsilon \in (0, 1)$, define
Then $\Phi_\varepsilon$ satisfies (8.18) for each $\varepsilon \in (0, 1)$. By Lemma 8.21 and Proposition 8.20 (2), it is enough to show that there exists an $\varepsilon \in (0, 1)$ such that the following constrained system
has a finite non-negative solution. To this end, set
Then, $\{x_i = \lambda(\varepsilon)^{i+1} : i \in E\}$ is a solution to the difference equation:
$$(1 - \varepsilon)(1 + \alpha)\,x_i = \alpha x_{i-1} + x_{i+1}.$$
Noting that $\lambda(0) = \alpha > 1$, we can choose an $\varepsilon_0 > 0$ such that $\{x_i = \lambda(\varepsilon_0)^{i+1} : i \in E\}$ gives us the required solution.
In what follows, the study on the upper estimate of large deviations is based on the following result.
Theorem 8.23. Suppose that there exists a compact $\Phi$ (i.e., $\lim_{i \to \infty} \Phi(i) = \infty$) such that
Then the upper estimate of large deviations holds:
$$\limsup_{t \to \infty} \frac{1}{t} \log Q_{t,i}(C) \le -\inf_{\mu \in C} I(\mu) \quad \text{for all closed } C,$$
where $Q_{t,i}$ was defined in Section 8.1 (cf. Stroock (1984), Theorem (8.12)).
Theorem 8.24. Let $Q = (q_{ij})$ be a single birth $Q$-matrix which may have some absorbing states. Suppose that there exists an $\alpha > 1$ so that
$$\sum_{j < i} q_{ij} \ge \alpha q_{i,i+1}, \quad i \in E; \qquad \sum_{j < i} q_{ij} \to \infty \quad \text{as } i \to \infty.$$
Then the upper estimate of large deviations holds.
Proof: a) Let $\bar Q = (\bar q_{ij})$ be the birth-death matrix defined by
By Lemma 8.22, there exists a function $\bar\Phi$ on $E$ such that
(8.20)
If necessary, using $\inf_{k \ge i} \bar\Phi(k)$ ($i \in E$) instead of $\bar\Phi$, we can also assume that
$$\bar\Phi(i) \uparrow \infty \quad \text{as } i \uparrow \infty. \tag{8.21}$$
b) Choose a coupling operator as follows
b) Choose a coupling operator as follows flf(il,i2)
- qiz,i2-k)+[f(il - k,i2) - f ( i l , i 2 ) ]
=C{(qZ1,i1-k k>O
+ ( ~ i 2 , i z - k- 4 i 1 , i 1 - k ) + [ f ( i l , i 2
+ (qz1,Zl-k
-
1) - f ( i l , i Z ) ]
A Y i z , i z - k ) [ f ( i l - k , i 2 - 1) - f ( i l , i 2 ) ] }
+ (qil,il+l q i 2 , 2 2 + d + [ f ( i 1 + 1 3 2 ) - f ( i l , i 2 > ] + (4izri2+1 - 4 i 1 , i 1 + d + [ f ( i 1 , i 2 + 1) - f ( i l , i Z ) ] + (qil,il+l A 4iz,iz+l)[S(il + l , i 2 + 1) - f ( i l , i 2 ) ] , -
here we have used the convention: qij = 0 if j Theorem 5.41, we have
Pi’~i”x;t)< X ( t ) ]= 1,
{x(t)}
< 0. As an application of 21
6 22,
where { X ( t ) } and are the processes corresponding to Q = ( q i j ) and Q = ( @ j ) respectively. Thus, if we take @ = 6,then
-
= pi
< 00,
i E E.
An alternative way to prove this result is to use Theorem 5.46. But the coupling used here is of independent interest.
c) Finally, the assertion follows from Theorem 8.23, (8.20) and (8.21), replacing $\bar\varphi$ and $\bar\Phi$ with $\varphi$ and $\Phi$, respectively.
Actually, we have also proved the following comparison result.
Lemma 8.25. Let $P(t)$ and $\bar P(t)$ be two regular $Q$-processes with state space $E = \mathbb{Z}_+$. Suppose that $P(t) \preceq \bar P(t)$. If there exists a function $\bar\Phi$ satisfying (8.20) and (8.21), then the upper estimate of large deviations holds for both of $P(t)$ and $\bar P(t)$.
Example 8.26. Consider the birth-death matrix:
The Q-process is ergodic iff X I
< Xa.
Theorem 8.24 works in the same situation.
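From now on, we consider mainly a regular $Q$-matrix with no absorbing states. Then
$$\tau_0 < \tau_1 < \cdots < \tau_\infty := \lim_{n \to \infty} \tau_n$$
and
$$\mathbb{P}[\tau_n < \infty] = 1,\ n \ge 0; \qquad \mathbb{P}[\tau_\infty = \infty] = 1.$$
Define $\mathscr{F}_t = \sigma\{X(s) : s \le t\}$.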
Lemma 8.27. Let $Q = (q_{ij})$ be a regular $Q$-matrix with no absorbing states. For $f \in \mathscr{E}_+$ with $\sum_{j \ne i} q_{ij} f_j < \infty$ for all $i \in E$, define
Then
$$\mathbb{E}_i Z_n(t) = f_i, \quad t \ge 0,\ i \in E,\ n \ge 0.$$
Moreover, for each $i$, $(Z(\tau_n), \mathscr{F}_{\tau_n}, \mathbb{P}_i)$ is a martingale and $(Z(t), \mathscr{F}_t, \mathbb{P}_i)$ is a supermartingale.
Proof: Recall that
Fix
for all
with Then
=
fi,
t20, ~ E E .
For simplicity, we write haveWe
Suppose that
$$\mathbb{E}_i Z_n(t) = f_i, \quad t \ge 0,\ i \in E. \tag{8.22}$$
Then by the strong Markov property we obtain
Hence, by induction, (8.22) holds for all $n$. Next, since
we have
Therefore
= Z(s),
t > s 2 0.
This proves that $(Z(t), \mathscr{F}_t, \mathbb{P}_i)$ is a supermartingale. The proof for $Z(\tau_n)$ is similar, even simpler.
For the remainder of this section, we take $E = \mathbb{Z}_+$ for simplicity.
Theorem 8.28. Let $Q = (q_{ij})$ be a regular $Q$-matrix with no absorbing states. Suppose that there exists a function $f$ such that
Then the upper estimate of large deviations holds.
Proof: By Lemma 8.27, we have
and so
On the other hand, since $\lim_{i \to \infty} \Phi(i) = \infty$, the assertion now follows from Theorem 8.23.
Remark 8.29. In view of the above proof, we can also assume that the function $\Phi$ is non-decreasing. Otherwise, use $\inf_{k \ge i} \Phi(k)$ instead of $\Phi(i)$; the new function $\Phi$ still has the required properties.
The next corollary provides some more explicit conditions.
Corollary 8.30. Let $Q = (q_{ij})$ be a regular $Q$-matrix with no absorbing states. Then the upper estimate of large deviations holds if one of the following conditions is satisfied.
(1) There exist a positive integer $m$, an $a > m$ and an $N$ such that
Moreover, $\sum_{j<i} q_{ij} \to \infty$ as $i \to \infty$.
(2) There exist a positive sequence $(c_i)$ and a non-negative sequence $(d_i)$ such that $q_{ij} \le c_i d_j$ for all large enough $i$ and all $j > i$;
where $C_i = \max\{c_k : k \le i\}$.
(3) There exist an $a \in (0, \infty)$ and an $\varepsilon \in (0, a)$ such that
and
for all large enough $i$.
Proof: a) Take an $\alpha \in (1, (a/m)^{1/m})$ and set $f_i = \alpha^i$, $i \in E$. Then $f \ge 1$,
Therefore, Theorem 8.28 is now available. This proves (1). b) Since $C_i \ge c_0 > 0$ and the assumption in (2), we have
Set
Because f is increasing, the assertion for case ( 2 ) follows from Theorem 8.28 and
c) Choose $0 < d < D < \infty$ and define
Then $f_0 = d$, $f_i \uparrow D$ as $i \to \infty$. Finally
8.4 Notes
For Section 8.1, refer to Stroock (1984). Theorem 8.17 is taken from Chen (1990a). The remainder of Sections 8.2 and 8.3 is taken from Chen and Lu (1990a,b) with some improvements. See Jain (1990) for recent progress on the lower estimates. Two recent books on large deviations are Dembo and Zeitouni (1993) and Deuschel and Stroock (1989).
Chapter 9
Spectral Gap

In this chapter, we study the exponential $L^2$-convergence. We prove in Section 9.1 that the exponential convergence rate for a Markov process can be described by the $L^2$-spectral gap of its generator or Dirichlet form. In the reversible case, we prove an equivalence, or even coincidence, of this convergence and the exponentially ergodic convergence. Two main results for estimating the spectral gap by couplings and two approximating procedures for jump processes are introduced in Section 9.2. Section 9.3 is devoted to birth-death processes, for which we have a complete solution to the topic studied in this chapter. In the last two sections, another powerful tool, a generalized Cheeger's method, is studied. It works in a very general setup but the resulting estimates are usually less explicit.

9.1 General Case: an Equivalence

Let $\{P(t)\}_{t \ge 0}$ be a positive, strongly continuous, Markovian contraction semigroup (i.e., $P(t)1 = 1$ for all $t \ge 0$) on $L^2(\pi)$ with stationary distribution $\pi$, not necessarily symmetric. Again, denote by $L$ and $\mathscr{D}(L)$, respectively, the infinitesimal generator and its domain induced by $\{P(t)\}_{t \ge 0}$. We say that $P(t)$ converges exponentially in the $L^2(\pi)$-norm $\|\cdot\|$ if there is a positive $\varepsilon$ so that for all $f \in L^2(\pi)$,
$\|P(t)f - \pi(f)\| \le \|f - \pi(f)\|\, e^{-\varepsilon t}$, $t \ge 0$, (9.1)
where $\pi(f) = \int f \, d\pi$.
On the other hand, since $1 \in \mathscr{D}(L)$ and $L1 = 0$, the vector $1$ is an eigenvector of $L$ with eigenvalue 0. One may seek the next-to-largest eigenvalue of the self-adjoint (resp. the symmetrized part of a non-self-adjoint) generator $L$. That is, one seeks the infimum of the spectrum of $-L$ restricted to the orthogonal complement of $1$: $\{f \in L^2(\pi) : \pi(f) = 0\} \cap \mathscr{D}(L)$. This leads us to define the spectral gap of $L$:
$\operatorname{gap}(L) = \inf\{ -(Lf, f) : f \in \mathscr{D}(L),\ \pi(f) = 0 \text{ and } \|f\| = 1 \}$. (9.2)
Our first step is to show that (9.1) and (9.2) are closely linked. To do so, let $D(f)$ denote the limit
$D(f) = \lim_{t \downarrow 0} \frac{1}{t}(f - P(t)f, f) = \lim_{t \downarrow 0} \frac{1}{2t} \int \pi(dx)\, P(t)[f - f(x)]^2(x)$,
provided the limit exists; here the equality is due to the fact that $\pi$ is a stationary distribution of $P(t)$. Such functions $f \in L^2(\pi)$ with $D(f) < \infty$ constitute the domain $\mathscr{D}(D)$ of $D$. Clearly, $\mathscr{D}(L) \subset \mathscr{D}(D)$. In the case of $P(t)$ being symmetric on $L^2(\pi)$, $D(f)$ coincides with $D(f, f)$ introduced in Section 6.7. This explains why we choose the notations $D(f)$ and $\mathscr{D}(D)$. Next, define
$\operatorname{gap}(D) = \inf\{ D(f) : f \in \mathscr{D}(D),\ \pi(f) = 0 \text{ and } \|f\| = 1 \}$.
By definition, it is clear that the condition $f \in \mathscr{D}(D)$ in the last line becomes unnecessary in the symmetric case. Because we are working in the regular case, which is natural since the process is assumed to be ergodic. Thus, it would be much better if we could use $\operatorname{gap}(D)$ instead of $\operatorname{gap}(L)$. Finally, define
$\sigma(t) = -\sup\{ \log \|P(t)f\| : \pi(f) = 0 \text{ and } \|f\| = 1 \}$.
Since
$\|P(t+s)f\| \le e^{-\sigma(t)} \|P(s)f\| \le e^{-\sigma(t) - \sigma(s)} \|f\|$,
by the contractivity and semigroup properties, it follows that $\sigma(\cdot)$ is superadditive and $\sigma(0) = 0$. Hence, the limit
$\sigma := \lim_{t \downarrow 0} \frac{\sigma(t)}{t} = \inf_{t > 0} \frac{\sigma(t)}{t}$ (9.3)
is well defined.
Theorem 9.1. We have $\sigma = \operatorname{gap}(D) = \operatorname{gap}(L)$.
Proof: Clearly, $\operatorname{gap}(D) \le \operatorname{gap}(L)$ since $D(f) = (-Lf, f)$ on $\mathscr{D}(L)$. To prove $\sigma \ge \operatorname{gap}(L)$, simply use the fact
$\frac{d}{dt} \|P(t)f\|^2 = 2(P(t)f, LP(t)f) \le -2 \operatorname{gap}(L) \|P(t)f\|^2$, $f \in \mathscr{D}(L)$, $\pi(f) = 0$ and $\|f\| = 1$,
and the denseness of $\mathscr{D}(L)$ in $L^2(\pi)$. Finally, let $f \in \mathscr{D}(D)$ with $\pi(f) = 0$ and $\|f\| = 1$; then
$D(f) = \lim_{t \downarrow 0} \frac{1}{t}(f - P(t)f, f) \ge \lim_{t \downarrow 0} \frac{1}{t}(1 - e^{-\sigma t}) = \sigma$.
Hence $\operatorname{gap}(D) \ge \sigma$. □
At the moment, except for the fact $\mathscr{D}(L) \subset \mathscr{D}(D)$, the knowledge of $\mathscr{D}(D)$ is quite limited. However, it will become clear later that whenever we have a little more information about the generator, the domain $\mathscr{D}(D)$ is actually manageable. The next obvious facts will be helpful for our further study.
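Lemma 9.2.
(1) $D(f) \ge 0$ for all $f \in \mathscr{D}(D)$. $\mathscr{D}(D)$ is dense in $L^2(\pi)$.
(2) $f \in \mathscr{D}(D) \Longrightarrow g := cf + d \in \mathscr{D}(D)$ and $D(g) = c^2 D(f)$ for all $c, d \in \mathbb{R}$.
(3) $f, g \in \mathscr{D}(D) \Longrightarrow f + g \in \mathscr{D}(D)$ and $D(f + g) \le 2(D(f) + D(g))$.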
Before moving further, we would like to mention that the non-symmetric case can often be reduced to the symmetric one. Let $\{P(t)\}_{t \ge 0}$ be given as above. Define a bilinear form $D(f, g)$ on $\mathscr{D}(L)$ by $D(f, g) := (-Lf, g)$. Next, define a dual (or adjoint) $\hat D$ of $D$ as follows: $\hat D(f, g) = D(g, f)$, $f, g \in \mathscr{D}(L)$, and set $\bar D = (D + \hat D)/2$. Then $\bar D$ is a symmetric form and so we have a norm $\|\cdot\|_{\bar D}$: $\|f\|_{\bar D}^2 = \bar D(f) + \|f\|^2$. Naturally, one can extend the domain of $\bar D$ to the completion of $\mathscr{D}(L)$ with respect to $\|\cdot\|_{\bar D}$, denoted by $\bar{\mathscr{D}}$. Therefore, we can define
$\operatorname{gap}(\bar D) = \inf\{ \bar D(f) : f \in \bar{\mathscr{D}},\ \pi(f) = 0 \text{ and } \|f\| = 1 \}$.
Noting that the last inequality $D(f) \ge \sigma$ in the proof of Theorem 9.1 holds first for $f \in \mathscr{D}(L)$ and then for $f \in \bar{\mathscr{D}}$, and moreover, since
$\bar D(f) = \frac{1}{2}(D(f) + \hat D(f)) = D(f)$, $f \in \bar{\mathscr{D}}$,
the proof of Theorem 9.1 shows that $\sigma = \operatorname{gap}(\bar D) = \operatorname{gap}(L)$. Hence we have proved the following result.
Corollary 9.3. $\sigma = \operatorname{gap}(D) = \operatorname{gap}(\bar D) = \operatorname{gap}(L)$.
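For a finite state space, the identity $\sigma = \operatorname{gap}(\bar D) = \operatorname{gap}(L)$ can be checked numerically. The following is only a sketch under stated assumptions: the three-state generator `Q`, the helper names `stationary`, `gap_symmetrized`, `expm_taylor` and `sigma`, and the use of NumPy are illustrative choices, not taken from the text. The gap is computed as the second smallest eigenvalue of the symmetrized operator written in an orthonormal basis of $L^2(\pi)$, and $\sigma(t)$ as minus the logarithm of the operator norm of $P(t)$ on mean-zero functions.

```python
import numpy as np

def stationary(Q):
    """Stationary distribution pi (pi Q = 0, sum pi = 1) of a finite irreducible generator."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return pi

def gap_symmetrized(Q):
    """Second smallest eigenvalue of -(L + L^)/2 on L^2(pi), L^ being the adjoint of L."""
    s = np.sqrt(stationary(Q))
    A = (s[:, None] * Q) / s[None, :]      # matrix of L in an orthonormal basis of L^2(pi)
    return np.sort(np.linalg.eigvalsh(-(A + A.T) / 2))[1]

def expm_taylor(M, terms=20, squarings=10):
    """Matrix exponential by scaling and squaring (adequate for small matrices)."""
    A = M / 2.0 ** squarings
    E = np.eye(M.shape[0])
    T = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        T = T @ A / k
        E = E + T
    for _ in range(squarings):
        E = E @ E
    return E

def sigma(Q, t):
    """sigma(t) = -log sup{ ||P(t) f|| : pi(f) = 0, ||f|| = 1 } in L^2(pi)."""
    s = np.sqrt(stationary(Q))
    M = (s[:, None] * expm_taylor(t * Q)) / s[None, :]
    return -np.log(np.linalg.norm(M - np.outer(s, s), 2))

# A non-reversible three-state example (hypothetical, chosen for illustration).
Q = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0],
              [2.0, 0.0, -2.0]])
g = gap_symmetrized(Q)
ratios = [sigma(Q, t) / t for t in (1e-3, 1e-2, 1e-1, 1.0)]
```

In this example the symmetrized gap equals 1.5, the ratios $\sigma(t)/t$ stay above it and approach it as $t \downarrow 0$, while the non-zero eigenvalues of `Q` itself have real part $-2$; so the gap is governed by the symmetrized generator rather than by the real parts of the spectrum of the generator.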
One often uses $\bar{\mathscr{D}}$, rather than $\mathscr{D}(D)$ defined above, as the domain of $D$. This is because in general the bilinear form may not be regular. To use the semigroup corresponding to $(D, \bar{\mathscr{D}})$, the theory of non-symmetric Dirichlet forms should be helpful. Refer to Ma and Röckner (1992) (especially Theorems 2.15 and 2.18) for details. We will return to this problem more carefully (see Theorem 9.12 below).
The next comparison result is useful for comparing the spectral gap of a complex process with that of a simpler one. It shows, for Markov chains for instance, that a local perturbation does not interfere with the $L^2$-exponential convergence. The proof of the theorem is straightforward and hence is omitted.
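Theorem 9.4 (Comparison Theorem). Let $(D, \mathscr{D}(D))$ and $(\tilde D, \mathscr{D}(\tilde D))$ be two forms with $\mathscr{D}(D) \subset \mathscr{D}(\tilde D)$, defined on $L^2(\pi)$ and $L^2(\tilde\pi)$ respectively. Suppose that there exist constants $A$, $B$ and $C$ such that
$\tilde D(f) \le A D(f)$, $\operatorname{Var}_\pi(f) \le B \operatorname{Var}_{\tilde\pi}(f) + C D(f)$, $f \in \mathscr{D}(D)$.
Then $\operatorname{gap}(D) \ge \operatorname{gap}(\tilde D) / [AB + C \operatorname{gap}(\tilde D)]$.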
For the tensor product $P(t)$ of $\{P_k(t)\}$ with generators $L_k$ (it corresponds to a Markov process with independent components), the spectral gap takes a simple form. Denote by $L$ the generator of $P(t)$ on the product space with product measure $\pi = \prod_k \pi_k$.
Theorem 9.5 (Additive Theorem). $\operatorname{gap}(L) = \inf_k \operatorname{gap}(L_k)$.
Proof: Choosing the functions to depend only on the $k$th coordinate in the definition, it follows that $\operatorname{gap}(L) \le \operatorname{gap}(L_k)$. To prove the inverse assertion, it suffices to consider two components. Then, by induction the assertion holds for finitely many components and a limiting procedure gives it for infinitely many components, since the functions depending on finitely many components are dense in $L^2(\pi)$. Let $f$ satisfy $\pi(f) = 0$ and $\|f\| = 1$. Express $f$ as $f(x, y) = h(x, y) + h_1(x) + h_2(y)$, where $\int h(\cdot, y)\, d\pi_1 = 0$ for a.e. $y$, $\int h(x, \cdot)\, d\pi_2 = 0$ for a.e. $x$, $\int h_1 \, d\pi_1 = 0$ and $\int h_2 \, d\pi_2 = 0$. Then $h$, $h_1$ and $h_2$ are orthogonal in $L^2(\pi)$ and so are $P(t)h$, $P_1(t)h_1$ and $P_2(t)h_2$. The conclusion now follows from
$\|P(t)f\|^2 = \|P(t)h\|^2 + \|P(t)h_1\|^2 + \|P(t)h_2\|^2 = \|P(t)h\|^2 + \|P_1(t)h_1\|^2 + \|P_2(t)h_2\|^2$
and $\|P(t)h\| = \|P_1(t)P_2(t)h\| \le e^{-t(\operatorname{gap}(L_1) + \operatorname{gap}(L_2))} \|h\|$. □
Now, we study the spectral gap for jump processes. Unless otherwise stated, we consider only the regular jump processes.
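The statement of Theorem 9.5 can be illustrated on a product of two finite reversible chains, whose generator is the Kronecker sum of the component generators. This is a minimal sketch, not part of the original argument: the two-state chains, the helper names `two_state` and `gap_reversible`, and the use of NumPy are assumptions made for the illustration.

```python
import numpy as np

def two_state(a, b):
    """Two-state generator with rates 0 -> 1 = a and 1 -> 0 = b, and its reversible law."""
    Q = np.array([[-a, a], [b, -b]], dtype=float)
    pi = np.array([b, a], dtype=float) / (a + b)
    return Q, pi

def gap_reversible(Q, pi):
    """Spectral gap of a finite generator Q in L^2(pi) (second smallest eigenvalue of -Q)."""
    s = np.sqrt(pi)
    S = -(s[:, None] * Q) / s[None, :]
    S = (S + S.T) / 2                      # symmetric already when Q is reversible w.r.t. pi
    return np.sort(np.linalg.eigvalsh(S))[1]

Q1, pi1 = two_state(1.0, 2.0)
Q2, pi2 = two_state(0.5, 0.25)

# Generator of the process with independent components: L = L1 (x) I + I (x) L2.
Q = np.kron(Q1, np.eye(2)) + np.kron(np.eye(2), Q2)
pi = np.kron(pi1, pi2)                     # product measure, same index ordering as Q

g1, g2, g = gap_reversible(Q1, pi1), gap_reversible(Q2, pi2), gap_reversible(Q, pi)
```

Here `g1` is 3, `g2` is 0.75, and the gap `g` of the product chain coincides with the smaller of the two, in line with the theorem.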
As we have seen from Lemma 6.43, for a given $q$-process $P(t, x, dy)$ with stationary distribution $\pi$, $P(t)f := \int P(t, \cdot, dy) f(y)$, $f \in b\mathscr{E}$, can be extended to $L^2(\pi)$ uniquely as a non-negative, strongly continuous contraction semigroup. Thus, the above discussions are applicable to the present situation. Next, for a given $q$-pair $(q(x), q(x, dy))$, as we did before, define
$\pi_q(dx, dy) = \pi(dx) q(x, dy)$, $D^*(f) = \frac{1}{2} \int \pi_q(dx, dy)[f(y) - f(x)]^2$, $\mathscr{D}(D^*) = \{f \in L^2(\pi) : D^*(f) < \infty\}$.
For $f \in \mathscr{D}(D)$, by Fatou's lemma and Theorem 1.14 (4), it follows that
$\infty > D(f) = \lim_{t \downarrow 0} \frac{1}{2t} \int \pi(dx)\, P(t)[f - f(x)]^2(x) \ge D^*(f)$.
Therefore, we have $\mathscr{D}(D) \subset \mathscr{D}(D^*)$. Note that here $\pi_q$ may not be symmetric. Choose and fix a sequence $\{E_n\} \subset \mathscr{E}$ such that $E_n \uparrow E$ and $\sup_{x \in E_n} q(x) = m_n < \infty$, $n \ge 1$. Assume that $m_n = n$ for simplicity. Define
$\mathscr{K} = \{f \in L^\infty(\pi) : \{f \ne 0\} \subset \text{some } E_n\}$ and $\mathscr{K}_L = \{g := cf + d : f \in \mathscr{K},\ c, d \in \mathbb{R}\}$.
Lemma 9.6. $\mathscr{K}_L \subset \mathscr{D}(D)$.
Proof: By the regularity of q-pair, we have
On the other hand, since
where
it follows that
Note that 7r is an invariant probability of { P ( t ) } t 2 0 ,7 r ( q f 2 ) = .rr(Qf2). Combining the above facts together, we arrive at
for $f \in \mathscr{K}$. Now, the conclusion follows from Lemma 9.2. □
This simple result already enables us to get an upper bound for $\operatorname{gap}(D)$.
Lemma 9.7. We have
$\operatorname{gap}(D) \le \frac{1}{2} \inf\{ k(K) : 0 < \pi(K) < 1,\ I_K \in \mathscr{K} \}$, (9.5)
where
In particular, for Markov chains, $\operatorname{gap}(D) \le \inf_k q_k / (1 - \pi_k)$.
Proof: For $I_K \in \mathscr{K}$ with $0 < \pi(K) < 1$, set $f = c I_K + d$. Choose $c$ and $d \in \mathbb{R}$ such that $\pi(f) = 0$ and $\|f\| = 1$. The first assertion follows from Lemma 9.6 by computing $D^*(f)$. Then the second assertion follows by taking $K = \{k\}$ plus the invariance of $\pi$. □
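The Markov chain bound $\operatorname{gap}(D) \le \inf_k q_k/(1 - \pi_k)$ is easy to test on a finite birth-death chain. The sketch below is illustrative only: the chain on $\{0, \dots, 5\}$ with constant rates, the helper names, and the use of NumPy are assumptions, not part of the text.

```python
import numpy as np

def birth_death_generator(b, a):
    """Generator on {0,...,N}: b[i] is the rate i -> i+1 and a[i] the rate i+1 -> i."""
    n = len(b) + 1
    Q = np.zeros((n, n))
    for i in range(n - 1):
        Q[i, i + 1] = b[i]
        Q[i + 1, i] = a[i]
    Q -= np.diag(Q.sum(axis=1))            # diagonal = minus the total jump rate q_k
    return Q

def reversible_measure(b, a):
    """Normalized reversible measure: mu_0 = 1, mu_{i+1} = mu_i * b[i] / a[i]."""
    mu = np.ones(len(b) + 1)
    for i in range(len(b)):
        mu[i + 1] = mu[i] * b[i] / a[i]
    return mu / mu.sum()

def gap_reversible(Q, pi):
    """Spectral gap of a reversible finite generator in L^2(pi)."""
    s = np.sqrt(pi)
    S = -(s[:, None] * Q) / s[None, :]
    S = (S + S.T) / 2
    return np.sort(np.linalg.eigvalsh(S))[1]

b, a = [1.0] * 5, [2.0] * 5
Q = birth_death_generator(b, a)
pi = reversible_measure(b, a)
q = -np.diag(Q)                            # total jump rates q_k
upper = np.min(q / (1.0 - pi))             # inf_k q_k / (1 - pi_k)
gap = gap_reversible(Q, pi)

# For a two-state chain the bound is attained.
Q2 = birth_death_generator([1.0], [2.0])
pi2 = reversible_measure([1.0], [2.0])
upper2 = np.min(-np.diag(Q2) / (1.0 - pi2))
gap2 = gap_reversible(Q2, pi2)
```

The bound is generally far from sharp (it only uses indicator functions of single points), but for the two-state chain `gap2` and `upper2` agree.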
Lemma 9.8. Let
Choose and define
such that
Proof: It suffices to prove the first assertion. Clearly, $g_n \in \mathscr{D}(D)$ and $g_n \to g$ in $L^1(\pi)$. Hence $\sup_n \pi(g_n) < \infty$. Next, since
$\pi(g_n^2) \ge \pi(g^2 I_{E_n}) \to \pi(g^2) = \infty$ as $n \to \infty$,
we have $\lim_{n\to\infty} \pi(g_n)^2 / \pi(g_n^2) = 0$. Therefore, by definition,
Definition 9.9. We call $\mathscr{C} \subset \mathscr{D}(D^*)$ a core of $D^*$ if $\mathscr{C}$ is dense in $\mathscr{D}(D^*)$ with respect to the norm $\|\cdot\|_{D^*}$: $\|f\|_{D^*}^2 = D^*(f) + \|f\|^2$.
Lemma 9.10. If $\pi(q) < \infty$, then $\mathscr{K}_L$ is a core of $D^*$.
Proof: We need only to show that $\mathscr{K}$ is a core of $D^*$. Let $f \in \mathscr{D}(D^*)$. Replacing $f$ with $f_m = (-m) \vee (f \wedge m)$ if necessary, we may assume that $f$ is bounded. Next, set $f_n = f I_{E_n}$. Then
Theorem 9.11. If $\mathscr{K}_L$ is a core of $D^*$, then
$\operatorname{gap}(D) = \inf\{ D^*(f) : \pi(f) = 0 \text{ and } \|f\| = 1 \} = \inf\{ D^*(f) : f \in \mathscr{K}_L,\ \pi(f) = 0 \text{ and } \|f\| = 1 \}$.
Proof: Since $\mathscr{D}(D) \subset \mathscr{D}(D^*)$ and $D$ and $D^*$ coincide on $\mathscr{D}(D)$, by Theorem 9.1, it suffices to show that for each $f \in \mathscr{D}(D^*)$ with $\pi(f) = 0$ and $\|f\| = 1$, there exists $\{f_n\} \subset \mathscr{K}_L$ with $\pi(f_n) = 0$ and $\|f_n\| = 1$ such that $D^*(f_n) \to D^*(f)$. Because $\mathscr{K}_L$ is a core of $D^*$, by Lemma 9.2, we can find a sequence $\{f_n\} \subset \mathscr{K}_L$ with $\pi(f_n) = 0$ and $\|f_n\| = 1$ such that $f_n \to f$ in the $\|\cdot\|_{D^*}$-norm. Note that $D^*(f_n - f) \to 0$ implies that $D^*(f_n)$ is bounded in $n$. On the other hand, by the Schwarz inequality,
and so $D^*(f_n) \to D^*(f)$ as $n \to \infty$. The assertion now follows. □
We now specialize the above result to Markov chains. Let $Q = (q_{ij} : i, j \in E)$ be an irreducible regular $Q$-matrix. Suppose that the $Q$-process $(P_{ij}(t))$ has a stationary distribution $\pi$. Define
It is easy to check that $(\hat q_{ij})$ is a conservative $Q$-matrix with stationary measure $(\pi_i)$; so is $(\bar q_{ij})$. By Theorem 1.70, the $Q$-matrix $(\hat q_{ij})$ is regular. Moreover, $(\bar q_{ij})$ is reversible with respect to the same probability measure $(\pi_i)$, and so is the corresponding minimal $Q$-process.
Theorem 9.12. Let $Q = (q_{ij} : i, j \in E)$ be an irreducible regular $Q$-matrix. Suppose that the $Q$-process $(P_{ij}(t))$ has a stationary distribution $\pi$ and the $Q$-matrix $(\bar q_{ij})$ defined above is regular. Then
$\operatorname{gap}(D) = \inf\{ \frac{1}{2} \sum_{i,j} \pi_i q_{ij} (f_j - f_i)^2 : f \in \mathscr{K}_L,\ \pi(f) = 0 \text{ and } \|f\| = 1 \}$.
Proof: By Theorem 9.11, we need only to prove that $\mathscr{K}$ is a core of $D^*$. From Corollary 6.62, we know that $(\bar q_{ij})$ is regular iff $\mathscr{K}$ is a core of $\bar D$. But
We claim that the cores of $\bar D$ and $D^*$ are the same. □
The above result is a special case of Corollary 9.3, which says that $\operatorname{gap}(L) = \operatorname{gap}((L + \hat L)/2)$, where $\hat L$ is the adjoint operator of $L$. Before moving further, we now consider two irreversible Markov chains.
Example 9.13. Take -1/2
1/2
0
Q=(! f f l i / 4 . Thus, in
Then gap(Q) = 1 but the eigenvalues of Q are 0, -5/4 general, Spec( (Q
+ 0)/2)
Example 9.14. Let
# Re. Spec(Q).
qk = qk,k-I
and qij = 0 for all other j
# i.
=
I, qok
Then gap(Q)
<
=
ek(k
{
z
1) for some O
+
>, (1 - f i ) - 2 &p}
<
1
-1
.
However, when $\theta \le 1/2$, the operator $\Omega$ has no non-zero real eigenvalues $\lambda$ in the weak sense: $\Omega f(i) = -\lambda f_i$ for non-zero real $f$ and all $i \in E$.
Proof: First, it is easy to check that the stationary distribution follows.
and
TOO"
xn = 1-0
( ~ i )is
as
for all n 3 1.
<
a) We show that the operator R has no non-zero eigenvalues when 8 1/2. That is, Of = -Xf has no non-trivial solution (A # 0 and f # 0). Solving the equation Of(;) = -Xfi ( i 2 l),one gets (1- X)fi = fi-1 (i >, 1). From this, it follows that fi f 0 once X = 1. Otherwise, fi = (1 - X)-ifo for all i >, 1 and fo # 0. Note that slf(0) is meaningful only if O < 1 1 - XI. From flf(0) = -Xfo, it follows that Cp=,O k ( l - O ( 1 - O)-' = -A. But when 0 < 1/2 and O < 11 - XI, the last equation holds iff X = 0. b) To estimate gap(&), here we adopt a standard and powerful method: xi the path method. Let f satisfy ~ ( f=) 0 and llfll = 1. Denote by ~ [ ab]. , Then, we have
xiE[a
k=l
e= 1
whcrep,q (withp > 1, q > 1, (p-l)(q-1) to be detcrmined later. Next,
Minimizing y / ( l fi)-2. Thus
- y)(y - 8)
= 1) a n d y E ( 0 , l ) arecoristants
with respect to y, we get the minimum (1 -
Minimizing the right-hand side with respect to p and q, we get
and then the required lower bound follows. We will come back to this example at the end of §9.4 and prove that $\operatorname{gap}(Q) = 1 - \sqrt{\theta}$.
We have seen that the non-symmetric case can often be reduced to the symmetric one. This is especially practical once the stationary distribution is known. Hence the symmetric case is more important and often easier to handle. We will show in the next section that the unbounded case can be further reduced to the bounded case (i.e., $\sup_x q(x) < \infty$). The remainder of this section is devoted to proving an equivalence of the $L^2$-exponential convergence and exponential ergodicity. Thus, on the one hand, by the known criterion for the exponential ergodicity (cf. Chapter 4), we obtain some criterion for $L^2$-exponential convergence. On the other hand, from the study of the estimates in this chapter, we obtain a lot of new estimates of the exponentially ergodic convergence rates. Again, denote by $\|\cdot\|_p$ the $L^p(\pi)$-norm. By Theorem 4.43, the exponential ergodicity means that one of the following equivalent statements holds:
$\|P_t(x, \cdot) - \pi\|_{\mathrm{Var}} \le C(x) e^{-\varepsilon t}$, $t \ge 0$. (9.6)
$\big\| \|P_t(\cdot, \cdot) - \pi\|_{\mathrm{Var}} \big\|_1 \le C e^{-\varepsilon t}$, $t \ge 0$. (9.7)
The largest $\varepsilon$ in the above inequalities are denoted by $\varepsilon_1$ and $\varepsilon_2$ respectively.
Theorem 9.15. Let $(P_t)_{t \ge 0}$ be a reversible Markov process with density $p_t(x, y)$ with respect to a probability measure $\pi$. Then we have
(1) $\varepsilon_1 \ge \operatorname{gap}(D)$.
(2) Conversely, we have $\operatorname{gap}(D) \ge \varepsilon_2$. Hence these two types of convergence are equivalent.
(3) If in addition the set $\mathscr{K}$ of bounded functions with compact supports is dense in $L^2(\pi)$ and $p_s(\cdot, \cdot) \in L^{1/2}_{\mathrm{loc}}(\pi)$, then $\operatorname{gap}(D) \ge \varepsilon_1$ and so $\varepsilon_1 = \operatorname{gap}(D)$.
Proof: a) Assume that the process has $L^2$-exponential convergence. Let $\mu \ll \pi$. Then
Since $P_t(x, \cdot) \ll \pi$ and the process is reversible, we have $p_t(x, y) = p_t(y, x)$, $\pi \times \pi$-a.s. $(x, y)$. Hence
This means that $p_t(x, \cdot) \in L^2(\pi)$ for all $t > 0$ and $\pi$-a.s. $x \in E$. Therefore
This proves the first assertion.
b) Assume that
with largest
We prove that
Then
Next, we prove that $\|P_{2t} - \pi\|_{\infty \to 1} = \|P_t - \pi\|_{\infty \to 2}^2$. We have
$\|(P_t - \pi)f\|_2^2 = ((P_t - \pi)f, (P_t - \pi)f) = (f, (P_t - \pi)^2 f) = (f, (P_{2t} - \pi)f) \le \|f\|_\infty \|(P_{2t} - \pi)f\|_1 \le \|f\|_\infty^2 \|P_{2t} - \pi\|_{\infty \to 1}$. (9.9)
Hence $\|P_{2t} - \pi\|_{\infty \to 1} \ge \|P_t - \pi\|_{\infty \to 2}^2$. The inverse inequality is obvious by using the semigroup property and symmetry:
$\|P_{2t} - \pi\|_{\infty \to 1} \le \|P_t - \pi\|_{\infty \to 2} \|P_t - \pi\|_{2 \to 1} = \|P_t - \pi\|_{\infty \to 2}^2$.
Finally, we prove that $\lambda_1 := \operatorname{gap}(D) \ge \varepsilon_2$. We have just proved that for every $f$ with $\pi(f) = 0$ and $\|f\|_2 = 1$, $\|P_t f\|_2^2 \le C \|f\|_\infty^2 e^{-2\varepsilon_2 t}$. By the spectral representation theorem and Jensen's inequality, we have
$\|P_t f\|_2^2 = \int_0^\infty e^{-2\lambda t}\, d(E_\lambda f, f) \ge \Big[ \int_0^\infty e^{-2\lambda s}\, d(E_\lambda f, f) \Big]^{t/s} = \|P_s f\|_2^{2t/s}$
for all $t \ge s$. Thus, $\|P_s f\|_2^2 \le [C \|f\|_\infty^2]^{s/t} e^{-2\varepsilon_2 s}$. Letting $t \to \infty$, we get
$\|P_s f\|_2^2 \le e^{-2\varepsilon_2 s}$, $\pi(f) = 0$, $\|f\|_2 = 1$, $f \in L^\infty(\pi)$.
Since $L^\infty(\pi)$ is dense in $L^2(\pi)$, we have
$\|P_s f\|_2^2 \le e^{-2\varepsilon_2 s}$, $s \ge 0$, $\pi(f) = 0$, $\|f\|_2 = 1$.
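Therefore, $\lambda_1 \ge \varepsilon_2$.
c) From proof a), we have seen that if the process has $L^2$-exponential convergence, then (9.6) holds with a function $C$ that is locally integrable with respect to $\pi$ by assumption. Under this condition, as in (9.9), we have
$\|(P_t - \pi)f\|_2^2 = (f, (P_{2t} - \pi)f) \le \|f\|_\infty^2 \int_{\mathrm{supp}(f)} \pi(dx) \|P_{2t}(x, \cdot) - \pi\|_{\mathrm{Var}} \le \|f\|_\infty^2 \int_{\mathrm{supp}(f)} \pi(dx)\, C(x) e^{-2\varepsilon_1 t} =: C_f e^{-2\varepsilon_1 t}$, $f \in \mathscr{K}$.
The constant $C_f$ can be removed as we did in the last paragraph of proof b) by using the denseness of $\mathscr{K}$. □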
9.2 Coupling and Distance Method

The coupling method is a powerful tool in the study of convergence rates for Markov processes. This section begins with two general results on this method. Then, we study two approximating procedures which are often needed in applying coupling methods to the estimation of the spectral gap.
Definition 9.16. Let $L$ be an operator of a Markov process $(X_t)_{t \ge 0}$. We say that a function $f$ is in the weak domain of $L$, denoted by $\mathscr{D}_w(L)$, if $f$ satisfies the forward Kolmogorov equation
or equivalently, $f(X_t) - \int_0^t Lf(X_s)\, ds$ is a $P^x$-martingale with respect to the natural flow of $\sigma$-algebras $\{\mathscr{F}_t := \sigma(X_s : s \le t)\}_{t \ge 0}$.
Definition 9.17. We say that $g$ is an eigenfunction of $L$ corresponding to $\lambda$ in the weak sense if $g$ satisfies the eigen-equation $Lg = -\lambda g$ pointwise. Note that the eigenfunction defined above may not belong to $L^2(\pi)$.
Theorem 9.18. Let $(E, \rho)$ be a metric space and let $\{X_t\}_{t \ge 0}$ be a reversible Markov process with operator $L$. Denote by $g$ the eigenfunction corresponding to $\lambda_1 := \operatorname{gap}(L)$ in the weak sense. Next, let $(X_t, Y_t)$ be the coupled process, starting from $(x, y)$, with coupling operator $\tilde L$ and let $\gamma : E \times E \to [0, \infty)$ satisfy $\gamma(x, y) = 0$ iff $x = y$. Suppose that
(1) $g \in \mathscr{D}_w(L)$,
(2) $\gamma \in \mathscr{D}_w(\tilde L)$,
(3) $\tilde L \gamma(x, y) \le -\alpha \gamma(x, y)$ for all $x \ne y$,
(4) $g$ is Lipschitz with respect to $\gamma$ in the sense that
Then, we have $\operatorname{gap}(L) = \lambda_1 \ge \alpha$.
Next, by condition (1) and the definition of g , rt
rt
is a P"-martingale with respect to the natural flow of a-algebras { 9 t } t > 0 . In particular, giz) = E"[ g ( X t ) XI g ( X s ) d s ] .Because of the coupling property,
+
9 SPECTRAL GAP
342
Thus, we obtain
Define the coupling time T = inf {t>0: Xt = Yt}. Thenn
Noting that g is not a constant, we have cg,y = 0. Divifding both sides by Noting that g is not a constant, we have cg,y = 0. Divifding both sides by cg,y, we obtain
for all t. This implies that A 1 2 a as required. H Condition (3) in Theorem 9.18 is essential, for which one needs l o cliovse not. only a good coupling but; also a good distance. This leads to the study on optimal couplings discussed ia Chapter 5. The other conditions in Theorem 9.18 can often be relaxed or avoided by using a localizing procedure (cf. Theorems 9.20 or 9.22 below). The next weaker result is useful! it is actually relat,ed to the strong ergodicity of the process. Theorem 9.19. Let { X t } t 2 0 , L , X I and 9 be the same as in the last theorem. Suppose that (1) 9 E
%&)I
( 2 ) s'lP,+y
19(4 - 9(?)1/)1 < m.
-
Then for every coupling I W Y , we have gap(L) = XI 2 ( supzpy@,")-I,
Proof: Set $(x,y) = g(z) - g(y). By the martingale formulation as we did in the last proof, we have
Hence
Assume $\sup_{x \ne y} \tilde E^{x,y} T < \infty$ and so $\tilde P^{x,y}[T < \infty] = 1$. Letting $t \to \infty$, we obtain
$|g(x) - g(y)| \le \lambda_1 \tilde E^{x,y} \int_0^T |g(X_s) - g(Y_s)|\, ds$.
Choose $x_n$ and $y_n$ such that $\lim_{n\to\infty} |g(x_n) - g(y_n)| = \sup_{x,y} |g(x) - g(y)|$. Without loss of generality, assume that $\sup_{x,y} |g(x) - g(y)| = 1$. Then $1 \le \lambda_1 \varlimsup_{n\to\infty} \tilde E^{x_n, y_n} T$. Therefore, $1 \le \lambda_1 \sup_{x \ne y} \tilde E^{x,y} T$. □
We are going to study two approximating procedures. Two related renormalizing methods will be introduced in §9.5. To begin with, note that by the regularity assumption and Corollary 6.60, the Dirichlet form which takes $\mathscr{K}$ as a core is unique. Note that if $E$ is locally compact and $q(\cdot)$ is locally bounded, one may take an increasing sequence of compact sets as $\{E_n\}$. Assume that
-
.n(Ez) > 0, Regard
An = Eg as a single point
n
G2+1 dn+l
= En
(x,A )
u {an}, &+I
=
q ( x ,A
n 2 1.
(9.10)
and set
= 48n (El IJ {An))),
n B ~+)] A ( ~ , ) q ( x E:), ,
E
E,, A E
&+I,
It is easy to see that $(\hat q_{n+1}(x), \hat q_{n+1}(x, dy))$ is a bounded conservative $q$-pair and hence is regular. Finally, let
Then for all $A, B \in \hat{\mathscr{E}}_{n+1}$, we have
This is symmetric with respect to $A$ and $B$. Therefore $(\hat q_{n+1}(x), \hat q_{n+1}(x, dy))$ is reversible with respect to $\hat\pi_{n+1}$. Next, let $f \in \mathscr{K}_L$. Without loss of generality, assume that $f =$ constant off $E_n$ for some $n$. Then $\pi(f) = 0$ and $\|f\| = 1$ iff $\hat\pi_{n+1}(f) = 0$ and $\hat\pi_{n+1}(f^2) = 1$. Moreover,
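By Theorem 9.11, we obtain
$\operatorname{gap}(D) = \inf\{ D^*(f) : \pi(f) = 0,\ \|f\| = 1,\ f = \text{constant off } E_n \text{ for some } n \ge 0 \}$
$= \lim_{n\to\infty} \inf\{ D^*(f) : \pi(f) = 0,\ \|f\| = 1,\ f = \text{constant off } E_n \}$
$= \lim_{n\to\infty} \inf\{ D^*(f) : \hat\pi_{n+1}(f) = 0,\ \hat\pi_{n+1}(f^2) = 1 \}$
$= \lim_{n\to\infty} \operatorname{gap}(\hat D_{n+1})$.
Finally, choose $f =$ constant off $E_{n-1}$ such that $\hat\pi_n(f) = 0$, $\hat\pi_n(f^2) = 1$ and $\operatorname{gap}(\hat D_n) + \varepsilon \ge \hat D_n(f)$. Since $\hat\pi_{n+1}(f) = 0$, $\hat\pi_{n+1}(f^2) = 1$, and $\hat D_n(f) = D^*(f) = \hat D_{n+1}(f)$, we have $\hat D_n(f) \ge \operatorname{gap}(\hat D_{n+1})$ and furthermore $\operatorname{gap}(\hat D_n) + \varepsilon \ge \operatorname{gap}(\hat D_{n+1})$. Because $\varepsilon$ can be arbitrarily small, we have thus proved the following result.
Theorem 9.20. Let $(q(x), q(x, dy))$ be a regular $q$-pair, reversible with respect to $\pi$. Assume (9.10) holds for all $n \ge 0$. Define $(\hat q_{n+1}(x), \hat q_{n+1}(x, dy))$ on $\hat E_{n+1}$ as above. If $\mathscr{K}_L$ is a core of $D^*$, then
$\operatorname{gap}(\hat D_{n+1}) \downarrow \operatorname{gap}(D)$ as $n \to \infty$.
For Markov chains, as a consequence of Theorem 9.20, we have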
Corollary 9.21. Let $E = \mathbb{Z}_+$ and $Q = (q_{ij})$ be an irreducible regular $Q$-matrix, reversible with respect to $(\pi_i)$. Take
$\hat Q_{n+1} =$
where
Then $\operatorname{gap}(\hat D_{n+1}) \downarrow \operatorname{gap}(D)$ as $n \to \infty$.
We now mention another approximating method which is also meaningful and sometimes even simpler. That is the restriction of $(D, \mathscr{D}(D))$ to $E_n$:
$q_n(x, dy) = I_{E_n \times E_n}(x, y)\, q(x, dy)$, $q_n(x) = q_n(x, E_n)$, $x \in E_n$, $n \ge 1$. (9.11)
Correspondingly, we have
$D_n(f) = \frac{1}{2} \int_{E_n} \int_{E_n} \pi_n(dx)\, q_n(x, dy) [f(y) - f(x)]^2$,
where $\pi_n = \pi / \pi(E_n)$ is defined on $E_n$. The main advantage of this approximation is that if $\tilde\Omega \rho \le -\alpha \rho$ holds for the original operator $\tilde\Omega$, then we have $\tilde\Omega_n \rho \le -\alpha \rho$ automatically for the local operators $\tilde\Omega_n$ for all $n$.
Theorem 9.22. Under the same assumptions as in Theorem 9.20, we have $\varlimsup_{n\to\infty} \operatorname{gap}(D_n) \le \operatorname{gap}(D)$.
Proof: Since the $q$-pair is regular, we can choose a function $f$ so that $f =$ constant $c$ out of $E_m$ with mean zero and variance 1 such that $D(f) < \operatorname{gap}(D) + \varepsilon$. Then, when $n \ge m$, we have $\pi_n(f) = -c\pi(E_n^c)/\pi(E_n)$ and
$\pi_n(f^2) = (1 - c^2 \pi(E_n^c))/\pi(E_n)$.
Thus, $\pi_n(f^2) - \pi_n(f)^2 = [\pi(E_n) - c^2 \pi(E_n^c)]/\pi(E_n)^2$ and so
We get $\varlimsup_{n\to\infty} \operatorname{gap}(D_n) < \operatorname{gap}(D) + \varepsilon$. But $\varepsilon$ can be arbitrarily small; we finally obtain $\varlimsup_{n\to\infty} \operatorname{gap}(D_n) \le \operatorname{gap}(D)$. □
For the second approximation, we have proved the weaker conclusion that $\varlimsup_{n\to\infty} \operatorname{gap}(D_n) \le \operatorname{gap}(D)$ rather than $\operatorname{gap}(\hat D_n) \downarrow \operatorname{gap}(D)$ as for the first approximation. However, within the context of birth-death processes, the last conclusion also holds for the second approximation.
Proposition 9.23. Consider the restriction of a birth-death process with rates $(b_i, a_i)$ to $\{n, n+1, \ldots, m\}$ ($0 \le n < m < \infty$) with reflection boundaries and denote by $\operatorname{gap}_{n,m}$ its spectral gap. Then, we have $\operatorname{gap}(D) \le \operatorname{gap}_{n,m}$. Moreover, $\operatorname{gap}_{n,m}$ is decreasing as $m \uparrow$ or $n \downarrow$.
TL~'~)
Proof: a) Define = ni/ .rr("im)(f2) = 1 such that
7rk.
Take
f with
Then
Define
i
i>m
n
i
i
i>m
i>m
n
Due to the restriction to the birth-death processes,
i<j
and
d n j m ) ( f )
= 0 and
we have
$< \operatorname{gap}_{n,m} + \varepsilon$.
Therefore, $\operatorname{gap}(D) \le \operatorname{gap}_{n,m} + \varepsilon$ and then $\operatorname{gap}(D) \le \operatorname{gap}_{n,m}$ by letting $\varepsilon \downarrow 0$.
b) To prove the monotonicity of $\operatorname{gap}_{n,m}$, it suffices to show that $\operatorname{gap}_{n,m} \ge \operatorname{gap}_{n,m+1}$. This simply follows from proof a) and is even simpler. For instance, the modified function $f$ becomes $f I_{[n \le i \le m]} + f_m I_{[i = m+1]}$. □
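The monotonicity in Proposition 9.23 can be observed numerically for constant rates. The sketch below is an illustration under assumptions not taken from the text: the rates $b_i \equiv 1$, $a_i \equiv 2$, the restriction to $\{0, \dots, m\}$ (that is, $n = 0$), the helper names and the use of NumPy. For constant rates the restricted gap has the closed form $a + b - 2\sqrt{ab}\cos(\pi/(m+1))$, which decreases to $(\sqrt a - \sqrt b)^2$, the well-known gap of the chain on $\mathbb{Z}_+$.

```python
import numpy as np

def restricted_gap(b, a, m):
    """Gap of the constant-rate birth-death chain restricted to {0,...,m} with reflection."""
    n = m + 1
    Q = np.zeros((n, n))
    for i in range(m):
        Q[i, i + 1] = b                    # birth rate i -> i+1
        Q[i + 1, i] = a                    # death rate i+1 -> i
    Q -= np.diag(Q.sum(axis=1))
    mu = (b / a) ** np.arange(n)           # reversible measure, unnormalized
    s = np.sqrt(mu / mu.sum())
    S = -(s[:, None] * Q) / s[None, :]
    S = (S + S.T) / 2
    return np.sort(np.linalg.eigvalsh(S))[1]

b, a = 1.0, 2.0
limit = (np.sqrt(a) - np.sqrt(b)) ** 2     # gap of the unrestricted chain on Z_+
gaps = [restricted_gap(b, a, m) for m in range(1, 41)]
closed_form = [a + b - 2 * np.sqrt(a * b) * np.cos(np.pi / (m + 1)) for m in range(1, 41)]
```

The list `gaps` is decreasing in `m` and bounded below by `limit`, consistent with $\operatorname{gap}(D) \le \operatorname{gap}_{0,m}$.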
Example 9.24. Let $q_{0k} = b_k > 0$, $q_k = q_{k0} = 2^{-1}$ ($k \ge 1$) and $q_{ij} = 0$ for the other cases of $i \ne j$. Assume that $q_0 := \sum_{k \ge 1} b_k < \infty$. The operator $-\Omega$ has eigenvalues 0, $1/2$ and $2^{-1} + \sum_{k \ge 1} b_k$, with $1/2$ having infinite multiplicity. The eigenfunctions of $\lambda_1 = 1/2$ are neither unique nor monotone. By (5.57), it is easy to see that the process is not monotone. The decay of $\sum_{j \ge i} \pi_j$ as $i \to \infty$ can be arbitrarily slow, not necessarily exponential. The last condition is necessary for $\lambda_1 > 0$ for the birth-death processes with rates bounded below (by a positive constant) and above. Hence, this example is very different from the birth-death processes discussed below.
Since $b_k > 0$ and $q_0 < \infty$, we can choose a strictly increasing sequence $g_k$ ($k \ge 2$) so that $\sum_{k \ge 2} b_k g_k < \infty$. Next, define $g_1 < 0$ by $\sum_{k \ge 1} b_k g_k = 0$ and set $g_0 = 0$. Finally, define a distance $\rho$ on $\mathbb{Z}_+$ by $\rho(0, 1) = -g_1$, $\rho(i, i+1) = g_{i+1} - g_i$ ($i \ge 1$) and $\rho(i, j) = \rho(i, i+1) + \cdots + \rho(j-1, j)$ for all $i < j$. To show that $\operatorname{gap}(D) \ge 1/2 = \lambda_1$, by Theorems 9.18 and 9.22, it suffices to construct a coupling $\tilde\Omega$ such that $\tilde\Omega \rho(i, j) \le -\rho(i, j)/2$. If $i, j \ge 1$, we adopt the coupling of marching soldiers $(i, j) \to (0, 0)$ at rate $1/2$. Then
i l m p ( i , j ) = [p(O,O) - p ( i , j ) ] / 2 = -p(,i,j)/2. Otherwise, use the classical coupling. Let i = 0 arid :j
%P(O, j > =
c
2 I, we have
1
A - d o , A1 + 2[ d o ,0 ) - P ( 0 : j)l.
bk M k ,
k> 1
< 0.
Thus, it suffices to shcw that J := Ck21 b k [ p ( k , j )- p ( 0 , j ) l j = 1, J blgl f z k 2 2 b k [ p ( k ,1) - p(0, I)] = h g l -k z k 2 2 bkgk j 2 2, J =
bklbk k>l
$1
- (.Qj -
291)l
When 0. When
since $j \ge 2$ and $q_0 > b_1$.
This example is also called the star model because of its graphic structure. There is a center at 0, from which there exists only one bond to each $k$. Thus, the geometric distance can be defined as follows. First, we have $\rho(0, k)$. Then, for $i, j \ne 0$, $i \ne j$, $\rho(i, j) = \rho(i, 0) + \rho(0, j)$. Based on this structure, a "path method" goes as follows. Replace $q_{k0} = 1/2$ by a more general $q_{k0} = q_k$ ($k \ge 1$). By Theorems 2.40 or 2.47, it is easy to check that the process is always unique. Clearly, the stationary distribution is $\pi_i = \mu_i / Z$, $i \ge 0$, where $\mu_0 = 1$, $\mu_i = b_i / q_i$ for $i \ge 1$ and $Z = \sum_{k \ge 0} \mu_k$. For every $f \in L^2(\pi)$ with $\pi(f) = 0$ and $\|f\| = 1$, we have
Thus, $\operatorname{gap}(D) \ge 2^{-1} \inf_{i \ge 1} q_i$. When $q_i = 1/2$, we return to the original model but the estimate of the last method is not sharp. Clearly, the problem does not come from the second inequality but the first one, which is based on the graphic distance.

9.3 Birth-Death Processes
In this section, we study mainly the spectral gap for birth-death processes. This is crucial since the birth-death processes are often used as a tool to compare with some general (even infinite dimensional) process. For instance, the lower bounds obtained in this section are available for general reversible Markov chains on $\mathbb{Z}_+$ with $q_{i,i+1} > 0$ and $q_{i,i-1} > 0$, provided the deduced birth-death $Q$-matrix is regular. Recall that for a positive recurrent birth-death process with birth rate $b_i > 0$ ($i \ge 0$) and death rate $a_i > 0$ ($i \ge 1$), the reversible measure $(\pi_i)$ is the following:
Let $\mathscr{V}$ be the set of all positive sequences $(v_i : i \ge 0)$ and define
$R_i(v) = a_{i+1} + b_i - a_i / v_{i-1} - b_{i+1} v_i = \Delta a(i) - \Delta b(i) + a_i (1 - v_{i-1}^{-1}) + b_{i+1}(1 - v_i)$, $a_0 := 0$, $v_{-1} := 1$, $i \ge 0$, (9.12)
where $\Delta a(i) = a_{i+1} - a_i$, $\Delta b(i) = b_{i+1} - b_i$. Next, let
$\mathscr{W} = \{ \{w_i\}_{i \ge 0} : w_i \text{ is strictly increasing in } i \text{ and } \pi(w) \ge 0 \}$,
$\widetilde{\mathscr{W}} = \{ \{w_i\}_{i \ge 0} : \text{there exists } k,\ 1 \le k < \infty, \text{ so that } w_i = w_{i \wedge k},\ w \text{ is strictly increasing in } [0, k] \text{ and } \pi(w) = 0 \}$, (9.13)
Note that $\widetilde{\mathscr{W}}$ is simply a modification of $\mathscr{W}$. Hence, only two notations $\mathscr{W}$ and $I(w)$ are essential here. The main results can be collected as follows.
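Theorem 9.25. Consider the ergodic birth-death process as above. We have the following conclusions.
(1) Difference form of the variational formula for the lower bound:
$\operatorname{gap}(D) = \sup_{v \in \mathscr{V}} \inf_{i \ge 0} R_i(v)$. (9.14)
(2) Summation form of the variational formula for the lower bound:
$\operatorname{gap}(D) = \sup_{w \in \mathscr{W}} \inf_{i \ge 0} I_i(w)^{-1}$. (9.15)
(3) Summation form of the variational formula for the upper bound:
$\operatorname{gap}(D) = \inf_{w \in \widetilde{\mathscr{W}}} \sup_{i \ge 0} I_i(w)^{-1}$. (9.16)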
Moreover, the supremum in (9.14) and (9.15), and the infimum in (9.16) can all be attained. 1 (4) Explicit bounds and explicit criterion: Define ‘po =0, ‘pi=
xjGip1
2 1, Q ~ ( Y ) = C p ~ c ~ ~ for? ~p-2 L j1,Q: = [ , i + k ] ~ ~ > ~ S(y) = supn2l Qn(r),6 = 6(1) and 6’= SUP,^^ Cyl: Q>vjn),where ~ ( is~ a probability 1 measure on {0,1,.. , k - l} with density vjk)= ( p $ ~ j > - ’ / Z ((and ~ ) Z(’”)is the normalizing constant). Then we have 6 < 6’< 26 and for2
In particular, $\operatorname{gap}(D) > 0$ iff $\delta < \infty$.
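The difference form in part (1) turns any positive test sequence $(v_i)$ into a lower bound $\inf_i R_i(v)$ with $R_i(v)$ as in (9.12). The following sketch evaluates that infimum over finitely many indices only, so it illustrates the mechanism rather than proving a bound. The two rate families, the function names and the truncation level `N` are hypothetical choices made for the illustration.

```python
import math

def R(i, v, a, b):
    """R_i(v) = a_{i+1} + b_i - a_i / v_{i-1} - b_{i+1} v_i, with a_0 := 0 and v_{-1} := 1."""
    a_i = 0.0 if i == 0 else a(i)
    v_prev = 1.0 if i == 0 else v(i - 1)
    return a(i + 1) + b(i) - a_i / v_prev - b(i + 1) * v(i)

def lower_bound(v, a, b, N=1000):
    """min_{0 <= i < N} R_i(v): a finite-range stand-in for inf_{i >= 0} R_i(v)."""
    return min(R(i, v, a, b) for i in range(N))

# Constant rates a_i = 2, b_i = 1 with the constant test sequence v_i = sqrt(a/b).
A, B = 2.0, 1.0
const = lower_bound(lambda i: math.sqrt(A / B), lambda i: A, lambda i: B)

# Linear death rates a_i = i with constant birth rate b_i = 1 and v_i = 1.
linear = lower_bound(lambda i: 1.0, lambda i: float(i), lambda i: 1.0)
```

For the constant rates, `const` equals $(\sqrt a - \sqrt b)^2$; for the linear death rates every $R_i(v)$ equals 1, so `linear` is 1.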
In view of (9.12) and (9.13), one sees that the difference form (9.14) and the summation form (9.15) are quite different, but there is indeed a correspondence between $(v_i)$ and $(w_i)$ (Lemma 9.30). As we will see soon, each of them has its own advantage. By exchanging "sup" and "inf", we obtain (9.16) from (9.15), ignoring the difference of $\mathscr{W}$ and $\widetilde{\mathscr{W}}$. Thus, (9.15) and (9.16) are dual to each other. All these formulas are completely different from the classical variational formula:
$\operatorname{gap}(D) = \inf\{ D(f) : \pi(f) = 0 \text{ and } \|f\| = 1 \}$. (9.17)
Because of the uniqueness assumption of the process and Corollary 6.62, we do not need the condition "$f \in \mathscr{D}(D)$" in the last formula. Clearly, for each test sequence $(v_i)$, from (9.14) we obtain a lower bound of $\operatorname{gap}(D)$. In particular, according to a classification of the test sequence, we obtain the following result.
Corollary 9.26. Define Au(i) = ai+l
+
(1) Let vi = ~ [ ll/(i
- ai,a0 := 0,
Ab(i) = 1
3 - bi. ~
+ c)], r 2 1 , c E 10,m]. Then
Aa.(i)- Ab(i)
+1 [ai - bi+i]
if r = 1.
Z+C
Then
(2)
A u ( i ) - Ab(i) - cI
ai
i - 1+ c2 -.” c1
2
-1- ca
In particular, w e have Examples 9.27. The exact gaps for nine examples are listed in Table 9.1 in the next page. Here, for the sixth example, we need a restriction: l / k < a/b k/(k - 1 ) 2 ( k 2 2). The test sequence used in the last example is ui = - 1)/4 for even i and ‘ui = (*+ 1)/4 for odd i .
<
(a
Proof: Replace by $\alpha$ the $\operatorname{gap}(D)$ in the table. Then by Corollary 9.26, we have $\operatorname{gap}(D) \ge \alpha$ for all these examples. Moreover, except for the sixth and eighth examples, the infima on the right-hand sides of the formulas in the table are all constants. Thus, by using $(v_i)$, we can reconstruct the eigenfunctions $g$: $v_i = (g_{i+2} - g_{i+1})/(g_{i+1} - g_i)$ and $\pi(g) = 0$. Because it is also easy to see that $g \in L^2(\pi)$ [except the first example], this means that $\alpha$ is an eigenvalue and so the estimates are sharp.
Table 9.1 Exact gaps for nine examples
It remains to study the first, the sixth and the eighth examples. For these exampIcs, there is a problem since the solution g to f2g = -Xlg satisfies .q E L1(?r)\ L2(7r).However, for the eighth example, applying Theorem 9.25 ( 3 ) to the test scquerice zlli = &, it follows that gap(L)) ,< 1/4 and hence the lower estimate 1/4 given in the table is sharp. Clearly, the sixth is a l o c d perturbation of the first one. We need only consider the sixth example. Let yi = ( c ~ k / b ) " and ~ set gjn' = g i A n . Then g E L'(.ir) \ L2(7r) and
Hence by Lemma 9.8 (1),
as required. □ From the second, third and fourth examples, one sees that both the eigenvalues and their eigenfunctions are very sensitive. The next result is a consequence of (9.15), except part (1) below, which is deduced from (9.14) directly by setting $v_i =$
d x .
Corollary 9.28.
(1) (2)
(3) If & > i
gaP(D) 2 (.A - d q ) 2 / ( c ~ c 22) 1/(4C?c2). p ~ j6 clpiai and C3>i p j a j 6 c2piui for all i 2 1, then
(A
gap(D) 2 - dG)2/c1 3 1/(4cIc2). (4) If ai = bi and i Y C3.>% . l / a j < c(y) for some y 2 1 and all i 3 1, then gap(D) 3 max {(4c(l))-', b:-'c(y)-l(l - y-')}.

Parts (1)-(3) of the corollary are all sharp, but not part (4), for the constant rates $b_i = b$ and $a_i = a$. Conversely, part (4) is sharp for $a_i = b_i = i^2$ $(i \ge 1)$ but parts (1)-(3) fail. The proof of the corollary is delayed for a while. The simplest example to show the power of Theorem 9.25 (4) is the following one.

Example 9.29. Let $a_i = b_i = i^\gamma$ for all $i \ge 1$ and $b_0 > 0$. Then the process is ergodic iff $\gamma > 1$ and gap(D) $> 0$ iff $\gamma \ge 2$.
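Example 9.29 can be explored numerically. The following is a minimal sketch, not the argument of the book: it assumes that the explicit criterion of Theorem 9.25 (4) is governed by the quantity $\delta = \sup_{i\ge1}\varphi_i\sum_{j\ge i}\mu_j$ with $\varphi_i = \sum_{j<i}(\mu_jb_j)^{-1}$ (this definition is not restated in the present section), and it truncates the infinite tail sums, so the numbers are only indicative. The names `delta_profile`, `n_terms` and `i_max` are hypothetical.

```python
def delta_profile(gamma, b0=1.0, n_terms=200_000, i_max=1_000):
    """Return [phi_i * sum_{j>=i} mu_j for i = 1..i_max] when a_i = b_i = i**gamma.

    Assumptions (not from the text): delta = sup_i phi_i * sum_{j>=i} mu_j with
    phi_i = sum_{j<i} 1/(mu_j b_j); the infinite tail is truncated at n_terms.
    """
    # mu_0 = 1 and mu_j = b0 / j**gamma for j >= 1, hence mu_j * b_j = b0 for all j.
    tail = [0.0] * (n_terms + 2)
    for j in range(n_terms, 0, -1):
        tail[j] = tail[j + 1] + b0 / j**gamma
    # phi_i = i / b0 because every term 1/(mu_j b_j) equals 1/b0.
    return [(i / b0) * tail[i] for i in range(1, i_max + 1)]


profile_2 = delta_profile(2.0)    # gamma = 2: the profile stays bounded
profile_15 = delta_profile(1.5)   # gamma = 1.5: the profile keeps growing with i
print(max(profile_2))
print(profile_15[9], profile_15[999])
```

Under these assumptions the profile for $\gamma = 2$ is bounded, while for $\gamma = 1.5$ it grows roughly like $\sqrt i$, which is consistent with the dichotomy stated in the example.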
Partial proof of Theorem 9.25: Step 1. First, we prove in the next lemma the equivalence of (9.14) and (9.15).

Lemma 9.30. When $v_i = u_{i+1}/u_i$ for positive $(u_i)$, rewrite $R_i(v)$ as $R_i(u)$.
(1) Given $w \in \mathscr W$ with $\pi(w) = 0$, set

Then we have $R_i(u) = I_i(w)^{-1}$ for all $i \ge 0$.
(2) Let $(u_i : i \ge 0)$ be positive with $\inf_{i\ge0} R_i(u) > 0$. Then $c := \lim_{n\to\infty} b_n\mu_nu_n < \infty$. Set $w_i = a_iu_{i-1} - b_iu_i + c/(Z-1)$, $i \ge 0$, where $u_{-1} = 1$. Then we have $w_{i+1} > w_i$ for all $i \ge 0$, $w \in L^1(\pi)$, $\pi(w) = c/[Z(Z-1)] \ge 0$, $\sum_{i\ge1}\mu_iw_i > 0$ and $I_i(w)^{-1} \ge R_i(u)$ for all $i \ge 0$.
Proof: a) It follows from the definition of $(u_i)$ that we obtain

Since $\pi(w) = 0$, we have

Thus

On the other hand, by (9.18), we have

We have thus proved part (1) of the lemma.

b) For part (2), we first prove the existence of the limit $\lim_{n\to\infty} b_n\mu_nu_n$. To do so, take $w_i = a_iu_{i-1} - b_iu_i + b_0u_0$ $(i \ge 0)$ for a moment. Note that
$$(w_{i+1} - w_i)/u_i = (a_{i+1}u_i - b_{i+1}u_{i+1} - a_iu_{i-1} + b_iu_i)/u_i = R_i(u) > 0, \qquad i \ge 0. \tag{9.19}$$
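We have $w_i < w_{i+1}$ for all $i \ge 0$. On the other hand, since $w_0 = 0$, we see that $w_1 > 0$ and so $w_i > 0$ for all $i \ge 1$. Thus
$$\sum_{j=1}^n \mu_jw_j = \sum_{j=1}^n \big[b_{j-1}\mu_{j-1}u_{j-1} - b_j\mu_ju_j\big] + b_0u_0\sum_{j=1}^n \mu_j = b_0u_0 - b_n\mu_nu_n + b_0u_0\sum_{j=1}^n \mu_j.$$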
Since the left-hand side is increasing in $n$, it follows that $b_n\mu_nu_n$ must have a finite limit $c \ge 0$ as $n \to \infty$. Next, redefine $w_i = a_iu_{i-1} - b_iu_i + c/(Z-1)$, $i \ge 0$. Then (9.19) remains the same. Moreover,

This gives us $w \in L^1(\pi)$ and $\sum_{i\ge1}\mu_iw_i > 0$. Now, by (9.19) and (9.20), we get
$$I_i(w)^{-1} \ge \frac{b_i\mu_iR_i(u)u_i}{\sum_{j\ge i+1}\mu_jw_j} \ge R_i(u), \qquad i \ge 0.$$
Therefore, $I_i(w)^{-1} \ge R_i(u)$ for all $i \ge 0$. □
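The quantity $R_i(u)$ in (9.19) is easy to evaluate for a concrete test sequence. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes hypothetical constant rates $a_i = a > b_i = b$ (with $a_0 = 0$), the test sequence $u_i = (a/b)^{i/2}$ and the convention $u_{-1} = 1$, and compares $\inf_i R_i(u)$ over finitely many indices with the known gap $(\sqrt a - \sqrt b)^2$ quoted in Example 9.49.

```python
import math


def R(i, u, a, b):
    """R_i(u) = (a_{i+1} u_i - b_{i+1} u_{i+1} - a_i u_{i-1} + b_i u_i) / u_i, cf. (9.19).

    u, a, b are functions of the index; u(-1) = 1 by convention and a(0) = 0.
    """
    return (a(i + 1) * u(i) - b(i + 1) * u(i + 1) - a(i) * u(i - 1) + b(i) * u(i)) / u(i)


death, birth = 2.0, 1.0  # hypothetical constant rates with death > birth > 0


def a(i):
    return 0.0 if i == 0 else death


def b(i):
    return birth


def u(i):
    return 1.0 if i < 0 else (death / birth) ** (i / 2)


values = [R(i, u, a, b) for i in range(50)]
lower_bound = min(values)
exact = (math.sqrt(death) - math.sqrt(birth)) ** 2
print(lower_bound, exact)
```

For this choice $R_i(u)$ is constant for $i \ge 1$ and strictly larger at $i = 0$, so the lower bound obtained from the test sequence coincides with the exact gap, up to rounding.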
Step 2. Next, we prove that gap(D) $\ge \sup_{w\in\mathscr W}\inf_{i\ge0} I_i(w)^{-1}$. To do so, we show that $I_i(w) > 0$ for each $w \in \mathscr W$ and all $i \ge 1$. Equivalently, $\sum_{j=i+1}^\infty \mu_jw_j > 0$ for all $i \ge 0$. Otherwise, let $i_0$ satisfy $\sum_{j=i_0+1}^\infty \mu_jw_j \le 0$. Then, since $w_j$ is strictly increasing, it follows that $w_{i_0} < 0$, and furthermore

This is a contradiction. Fix $w \in \mathscr W$ and define

We have seen that $u_i > 0$ for all $i \ge 0$. Hence $g$ is strictly increasing and so $\rho$ is a distance. Based on Theorem 5.38, we adopt the classical coupling. Because of the symmetry $\mu_ib_i = \mu_{i+1}a_{i+1}$ $(i \ge 0)$, we have
On the other hand, since $\sum_j \mu_jw_j \ge 0$, we have
- -w1-
1
-
c
pjwj
< -(w1
- WO)
j21
Collecting these two inequalities together, we obtain
This proves the key condition (3) of Theorem 9.18, from which the required conclusion follows by the localizing procedure given in Proposition 9.23.

Clearly, the key point in the last proof is the choice of the distance $\rho$, which is not obvious at all. The key idea goes as follows. For each $w \in \mathscr W$, regard $g$ defined by (9.21) as a mimic of the eigenfunction of the first non-trivial eigenvalue $\lambda_1 = \text{gap}(D)$. In the case that $g$ coincides with the eigenfunction, the equalities in parts (1)-(3) of Theorem 9.25 hold and so we have complete variational formulas for the spectral gap. The proof of the last assertion requires the strict monotonicity of the eigenfunction and so is much more technical and is omitted here. Refer to Chen (1996a, 1999a, 2001a) or Chen (2003d) for details.

Step 3. Prove the upper estimate: gap(D) $\le \inf_{w\in\widetilde{\mathscr W}}\sup_{i\ge1} I_i(\bar w)^{-1}$. Let
-
w E W , then w i = ~ i A and k T ( W ) = 0. Write c = S C = supi20 I i ( W ) - ' Since W i = W i A k . Thus k-1
k-1
00
=C X T j W j j=1
j=O
<
c
~
~I i ( ~ W ) -~' .
Then ~ ~
r
(j-l)A(k-l)
i=O
U
00
(Wi+l - W i )
=c
C
TjWj
Ak - Wa)
j=1
-
Hence gap(D) $\le c$ by the classical variational formula. Since $w \in \widetilde{\mathscr W}$ is arbitrary, we have proved the required assertion.
Step 4. To get the explicit upper bonds, fix k 2 1 and take (i-l)A(k-l) ( ; . j b j ) - l . Then Cj=O
i
Note that for every
4f2) -
w;
= wik) =
i>k
f with fo
=0
and x(f2) = 1, we have
= 1 - “ ( f I [ f f 0 J 23 1- 7r(f”);.([f
# 01) 3 KO.
Hence
Thus, by using the summation by parts formula, we get
(9.22) Making supremum with respect to k 3 1 gives A0 < & I w 1 . From (9.22), it Q:U,(”) and so 6 6 6’. By the definition of follows that wk: C r k7ri < 2
CFzt
s/
I
Finally, we study the explicit lower bound. Applying Lemma4.53 to 7 = 1 / 2 , nLi = pi, TLi = (pibi)-’ and c = 6, and noting that (&),/EZ 3 ((pi+l - pi)/2, we obtain
Therefore gap(D) $\ge (4\delta)^{-1}$. In general, if $\delta(\gamma) < \infty$, then we must have $\delta < \infty$ since $\varphi_i$ is increasing, and so the estimate works for all $\gamma \ge 1$.
Assume that y > 1. Following the proof of Lemma 4.53, we get
for alll Hence
Because
havewe for all
and
so
Hence gap(D) bA-7b(y)-1(l - y-I). This estimate is trivial when y = 1 and so holds for all y 2 1. Combining these two estimates together, we obtain the required lower bound. Actually, these explicit bounds are all deduced from the first Dirichlet eigenvalue to be studied in the next section. To prove Corollary 9.28, we need the following result.
Lemma 9.31. Let $(m_i : i \ge 1)$ and $(n_i : i \ge 1)$ be non-negative. If $\sum_{j\ge i} m_jn_j \le c_1m_i$ and $\sum_{j\ge i} m_j \le c_2m_i$ for all $i \ge 1$, then
Proof: Assume that $(m_i)$ has finite support and set $M_i = \sum_{j\ge i} m_jn_j$. Then

In particular, when $n_j = 1$ and $c_1 = c_2$, we get

Inserting this into (9.23), we get the required assertion.

Proof of Corollary 9.28: a) Part (4) of the corollary follows from Theorem 9.25.
b) The application of Lemma 9.31 goes as follows. Lemma 9.31 gives us

Minimizing the right-hand side with respect to $\gamma$, we get $\gamma_0 = \sqrt{c_2/(c_2-1)}$ and hence

c) Applying (9.24) to $m_j = a_j\mu_j$, $n_j = 1/a_j$ and $w_j = \gamma_0^j$, we get

From this, we have $\inf_{i\ge0} I_i(w)^{-1} \ge \big(\sqrt{c_2} - \sqrt{c_2-1}\,\big)^2/c_1$. This completes the proof of part (3) of the corollary.

d) The proof of part (2) is similar but by setting $m_j = \mu_j$ and $n_j \equiv 1$.
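The bound obtained in part c) can be checked against the constant-rate case, for which the text states that part (3) of the corollary is sharp. The sketch below is only an illustration under assumptions: it reads the hypotheses of part (3) as $\sum_{j\ge i}\mu_j \le c_1\mu_ia_i$ and $\sum_{j\ge i}\mu_ja_j \le c_2\mu_ia_i$, uses $\mu_j = (b/a)^j$ for hypothetical rates $a > b > 0$, and truncates the tails; the function names are made up for the example.

```python
import math


def liggett_bound(c1, c2):
    """Lower bound (sqrt(c2) - sqrt(c2 - 1))**2 / c1 from the proof of part (3)."""
    return (math.sqrt(c2) - math.sqrt(c2 - 1.0)) ** 2 / c1


def constants_for_constant_rates(a, b, n_terms=2000, i_max=20):
    """Smallest admissible c1, c2 over i = 1..i_max for mu_j = (b/a)**j (truncated tails)."""
    mu = [(b / a) ** j for j in range(n_terms)]
    c1 = c2 = 0.0
    for i in range(1, i_max + 1):
        tail_mu = sum(mu[i:])
        c1 = max(c1, tail_mu / (mu[i] * a))        # sum_{j>=i} mu_j     <= c1 * mu_i * a_i
        c2 = max(c2, (a * tail_mu) / (mu[i] * a))  # sum_{j>=i} mu_j a_j <= c2 * mu_i * a_i
    return c1, c2


a, b = 2.0, 1.0  # hypothetical death and birth rates
c1, c2 = constants_for_constant_rates(a, b)
bound = liggett_bound(c1, c2)
exact = (math.sqrt(a) - math.sqrt(b)) ** 2
print(c1, c2, bound, exact)
```

Here $c_1 = 1/(a-b)$ and $c_2 = a/(a-b)$, and the bound reduces to $(\sqrt a - \sqrt b)^2$, which is also at least the weaker estimate $1/(4c_1c_2)$.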
To conclude this section, we mention a closely related topic, the logarithmic Sobolev inequality, which says that Shannon entropy
L D-V ~ ~entropy
(9.25)
for some $A_{\rm Log} < \infty$. Recall that for two given probability measures $\mu$ and $\pi$,

On the other hand, as we have seen in the last chapter, the Donsker-Varadhan entropy is given by
$$\text{D-V entropy} = D\big(\sqrt{d\mu/d\pi}\,\big) \qquad \text{if } \mu \ll \pi.$$
Let $\mu \ll \pi$. Then $d\mu = f^2\,d\pi$ for some $f \in L^2(\pi)$ with $\|f\| = 1$. So (9.25) becomes
which is the usual form of the inequality appearing in the literature. Since the spectral gap can be redefined through the optimal constant $A = 1/\text{gap}(D)$ in the following Poincaré inequality

their close relation should be more or less clear. We now state a result about the logarithmic Sobolev inequality (refer to Chen (2003c) for a proof). Define $\varphi$ as before and
hf(a)= x [ l + d2 i T G +log
( + "y 1
Theorem 9.32. For birth-death processes, we have
ziv - 1 (2-1)
5
B@
where x1 = 2 - 1 and Q-' is the inverse function of @: Q(s> = x2 log(l+ x2). In particular, A L <~cx:~iff supi2l p [ i , co)log ( p , [ i ,= ) - I ) < m.
vi
9.4 Splitting Procedure and Existence Criterion
Let $(E, \mathscr E, \pi)$ be a probability space satisfying $\{(x,x) : x \in E\} \in \mathscr E\times\mathscr E$. Consider a quadratic form $(D, \mathscr D(D))$ (not necessarily symmetric) on $L^2(\pi)$:
$$D(f) = \frac12\int J(dx,dy)[f(y) - f(x)]^2, \qquad \mathscr D(D) = \{f \in L^2(\pi) : D(f) < \infty\}, \tag{9.26}$$
where $J \ge 0$ is a measure on $\mathscr E\times\mathscr E$, having no charge on the diagonal set $\{(x,x) : x \in E\}$. A typical example is as follows. For a regular $q$-pair $(q(x), q(x,dy))$ with stationary distribution $\pi$ (i.e., $\int\pi(dx)q(x,dy) = q(y)\pi(dy)$), we simply take $J(dx,dy) = \pi(dx)q(x,dy)$. Naturally, define
$$\lambda_1 = \inf\{D(f) : \pi(f) = 0,\ \|f\| = 1\}.$$
We call $\lambda_1 =: \text{gap}(D)$ the spectral gap of $(D, \mathscr D(D))$. The next result represents a traditional technique, reducing the Neumann case ($\lambda_1$) to the Dirichlet one ($\lambda_0$):
$$\lambda_0(A) = \inf\{D(f) : f \in \mathscr D(D),\ f|_{A^c} = 0,\ \pi(f^2) = 1\},$$
which is called the first Dirichlet eigenvalue on $A$.
Theorem 9.33. For the above quadratic form $(D, \mathscr D(D))$, we have
$$\inf_{\pi(A)\in(0,1/2]}\lambda_0(A) \le \lambda_1 \le 2\inf_{\pi(A)\in(0,1/2]}\lambda_0(A). \tag{9.27}$$
Proof: a) Let $f \in \mathscr D(D)$ be such that $f|_{A^c} = 0$ and $\pi(f^2) = 1$. Then
$$\pi(f^2) - \pi(f)^2 = 1 - \pi(fI_A)^2 \ge 1 - \pi(f^2)\pi(A) = 1 - \pi(A) = \pi(A^c).$$
Hence

which implies that $\lambda_1 \le \lambda_0(A)/\pi(A^c)$. Furthermore

This part of the proof works for general $D(f)$, not necessarily of the form (9.26).

Because $\varepsilon$ is arbitrary, the proof is done.

To move further, define a local form
$$D_B(f) = \frac12\int_{B\times B} J(dx,dy)[f(y) - f(x)]^2$$
and
$$\lambda_1(B) = \inf\{D_B(f) : \pi(fI_B) = 0,\ \pi(f^2I_B)/\pi(B) = 1\}.$$
We adopt Cheeger's splitting technique. That is, splitting $E$ into two parts $A$ and $A^c$, and estimating $\lambda_1$ in terms of $\lambda_1(A)$ and $\lambda_0(A^c)$. Our motivation is that the splitting sets are allowed to overlap.
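The comparison between $\lambda_1$ and the Dirichlet eigenvalues $\lambda_0(A)$ in Theorem 9.33 can be seen by hand on the simplest chain. The sketch below is a hypothetical two-state example, with made-up rates `p` and `q`, under the assumption that the theorem reads $\inf_{\pi(A)\in(0,1/2]}\lambda_0(A) \le \lambda_1 \le 2\inf_{\pi(A)\in(0,1/2]}\lambda_0(A)$; the closed forms used are elementary computations with $J(x,y) = \pi_xq_{xy}$ and are stated in the comments.

```python
def two_state_check(p, q):
    """Two-state chain on {0, 1} with rates q01 = p and q10 = q (hypothetical example)."""
    pi = {0: q / (p + q), 1: p / (p + q)}  # stationary (reversible) distribution
    lam1 = p + q                            # spectral gap of the two-state chain
    # For A = {x}: f vanishes off A and pi(f^2) = 1, so f(x)^2 = 1/pi_x and
    # D(f) = pi_x * q_{x,y} * f(x)^2 = q_{x,y}.
    lam0 = {0: p, 1: q}
    small_sets = [x for x in (0, 1) if pi[x] <= 0.5]
    inf_lam0 = min(lam0[x] for x in small_sets)
    # Intermediate bound from part a) of the proof: lambda_1 <= lambda_0(A) / pi(A^c).
    ratios = {x: lam0[x] / (1.0 - pi[x]) for x in (0, 1)}
    return lam1, inf_lam0, ratios


lam1, inf_lam0, ratios = two_state_check(3.0, 1.0)
print(inf_lam0, lam1, 2 * inf_lam0, ratios)
```

In this example the intermediate inequality $\lambda_1 \le \lambda_0(A)/\pi(A^c)$ from part a) holds with equality for both one-point sets.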
Theorem 9.34 (Existence criterion). Suppose that $J(\cdot, C) \ll \pi$ for every $C \in \mathscr E$. In the non-symmetric case, assume additionally that

Define
MA
{
2esss11pA,,J(dx, A")/x(dx) esssupA ,[J(dz, A")
+ J ( d z , E)]/7r(dx)
if J is symmetric if J is non-symmetric,
where the essential supremum is taken over the set $A$ with respect to the measure $\pi$. Let $A$ satisfy $\pi(A) \in (0,1)$ and $M_A < \infty$. Then for every $B \supset A$, we have

Usually, $\lambda_1(B) > 0$ for all compact $B$. Hence the result means that $\lambda_1 > 0$ iff $\lambda_0(A^c) > 0$ for some compact $A$, because we can first fix such an $A$ and then make $B$ large enough so that the right-hand side of (9.28) becomes positive. The reason why we have to use two sets is that the operator is not local: there may exist an interaction with a very long range. The first assumption of the theorem in the non-symmetric case is quite natural. For instance, when $J(dx,dy) = \pi(dx)q(x,dy)$ and $\pi$ is a stationary distribution, then the assumption is automatic since we indeed have $\int\pi(dx)q(x,dy)f(y) = \int\pi(dx)q(x)f(x)$ for all $f \in \mathscr E_+$ by Theorem 4.17.

Proof of Theorem 9.34: Let $f$ satisfy $\pi(f) = 0$ and $\pi(f^2) = 1$. Our aim is to bound $D(f)$ in terms of $\lambda_0(A^c)$ and $\lambda_1(B)$.

a) First, we use $\lambda_1(B)$.
D ( f ) 2 D , ( f I B ) 2 Xdl3)x(I?)-' = Al.(B)7@-1 [7r(f"r,)
'. '
[7r(S21B)- n ( w 7 T ( f Z D ) 2 ] 7r(B)y7i(f1gcjZj I
(9.29)
Here in the last step, we have used $\pi(f) = 0$.

b) Next, we use $\lambda_0(A^c)$. We need the following elementary inequality:
$$|(fI_{A^c})(x) - (fI_{A^c})(y)| \le |f(x) - f(y)| + I_{A\times A^c\,\cup\,A^c\times A}(x,y)\,|(fI_A)(x) - (fI_A)(y)|.$$
Then
When $J$ is symmetric, by assumption, we have

When $J$ is not symmetric, we have

Therefore, in both cases, we have $\lambda_0(A^c)\pi(f^2I_{A^c}) \le 2D(f) + M_A\pi(f^2I_A)$. Hence
$$\lambda_0(A^c)(1 - \gamma) \le 2D(f) + M_A\gamma. \tag{9.30}$$
c) Estimating the right-hand sides of (9.29) and (9.30) in terms of $\gamma := \pi(f^2I_B)$, we obtain two inequalities $D(f) \ge c_1\gamma + c_2$ and $D(f) \ge -c_3\gamma + c_4$ for some constants $c_1, c_3 > 0$. Hence
$$D(f) \ge \inf_{\gamma}\max\{c_1\gamma + c_2,\ -c_3\gamma + c_4\}.$$
Clearly, the infimum is achieved at $\gamma_0$, which is the intersection of the two lines $c_1\gamma + c_2$ and $-c_3\gamma + c_4$. Then, the required lower bound of $\lambda_1$ is given by $c_1\gamma_0 + c_2$. □

The remainder of this section is devoted to studying the principal eigenvalue of jump processes. It is the first Dirichlet eigenvalue if one regards infinity as a Dirichlet boundary. Given a totally stable $q$-pair $(q(x), q(x,dy))$, denote by $d(x) = q(x) - q(x,E)$ the non-conservative quantity of the $q$-pair at $x \in E$. Suppose that the $q$-pair is symmetrizable with respect to a measure $\pi$ (which may be infinite). Denote by $\|\cdot\|$ and $(\cdot,\cdot)$, respectively, the norm and inner product in $L^2(\pi)$. Let
where $\pi_d(dx) = d(x)\pi(dx)$. Next, set $E_n = \{x \in E : q(x) \le n\}$, $n \ge 1$,
$$\|f\|_D^2 = \|f\|^2 + D(f), \qquad \mathscr D_0 = \{f \in L^2(\pi) : f \text{ vanishes out of some } E_n\}.$$
It is easy to check that $\|f\|_D < \infty$ for all $f \in \mathscr D_0$ (Lemma 9.38). Let $\mathscr D(D)$ be the completion of $\mathscr D_0$ with respect to $\|\cdot\|_D$.
The form $(D, \mathscr D(D))$ corresponds to the minimal process. Note that for a bounded $q$-pair (i.e., $M := \sup_x q(x) < \infty$), $\mathscr D(D) = L^2(\pi) = \mathscr D_0$ since $E_n = E$ for all $n \ge M$. The principal eigenvalue studied here is defined by
$$\lambda_0 = \inf\{D(f) : f \in \mathscr D(D),\ \|f\| = 1\}.$$
Recall that there is a one-to-one correspondence between the $q$-pair and the following operator:
$$\Omega f(x) = \int q(x,dy)f(y) - q(x)f(x), \qquad \mathscr D(\Omega) = \Big\{f \in \mathscr E : \int q(x,dy)|f(y)| + q(x)|f(x)| < \infty \text{ for all } x \in E\Big\}.$$
The next main result is a variational formula for the lower bound of Xo: Theorem 9.35. For a symmetrizable q-pair with respect t o r,we have XO 2 supo 0 such that Rg 6 -Xg, r-as.,then XO 2 A.
The proof of the theorem is based on the following result.

Theorem 9.36 (Variational formula for Dirichlet form). Let $(q(x), q(x,dy))$ be a symmetrizable bounded $q$-pair: $M := \sup_x q(x) < \infty$. Then for every non-negative $f \in L^2(\pi)$, we have
$$D(f) = \sup_g\,(f^2/g,\ -\Omega g), \tag{9.31}$$
where $g$ varies over all strictly positive (i.e., $g \ge c_g > 0$ for some constant $c_g$), bounded $\mathscr E$-measurable functions.

Proof: First, we prove that the right-hand side of (9.31) is controlled by the left-hand side. Because
where $\tilde g = gI_{[f\ne0]}$; thus, we may replace $g$ by $\tilde g$ in the proof. Define $h = (\tilde g/f)I_{[f\ne0]}$ and denote by $P(t,x,dy)$ the jump process determined uniquely by the bounded $q$-pair (by Corollary 3.12). The corresponding semigroup is denoted by $\{P_t\}_{t\ge0}$. Then, by symmetry of $\pi(dx)P(t,x,dy)$ (Theorem 6.7), we have
Here, we have used the fact that $a + 1/a \ge 2$ for all $a > 0$. Hence
$$\frac1t\big(f^2/\tilde g,\ \tilde g - P_t\tilde g\big) \le \frac1t\big(f,\ f - P_tf\big).$$
From the spectral theory, it is standard that the right-hand side increases to $D(f)$ as $t \downarrow 0$ (cf. Section 6.7). Thus, it remains to show that the left-hand side converges to $(f^2/\tilde g, -\Omega\tilde g)$ as $t \downarrow 0$. This can be done by using the dominated convergence theorem and the following facts:

b) When $0 < c \le f \le C < \infty$ for some constants $c$ and $C$, the inverse inequality holds since one can simply set $g = f$. The general situation can be proved by approximation. Let $f_n = n^{-1} + f\wedge n$. Then, by symmetry and boundedness of the $q$-pair, we have

Since

by Fatou's lemma, it follows that $\lim_{n\to\infty} -(f^2/f_n, \Omega f_n) \ge D(f)$. This completes the proof. □

Because, in general, we have $D(f) \ge D(|f|)$ and the strict inequality can happen for some $f$, it follows that the condition "$f \ge 0$" in Theorem 9.36 cannot be removed. The next result is a special case of Theorem 9.35.
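The variational formula of Theorem 9.36 can be tested on a small chain. The sketch below is a hedged illustration: it assumes that (9.31) reads $D(f) = \sup_g (f^2/g, -\Omega g)$, uses a hypothetical conservative reversible three-state chain built from symmetric weights `c` (so that $\pi_iq_{ij} = c_{ij}$), and simply samples random positive $g$; none of the names come from the text.

```python
import random


def dirichlet_form(f, pi, q):
    """D(f) = (1/2) * sum_{i != j} pi_i q_ij (f_j - f_i)^2 for a conservative chain."""
    n = len(f)
    return 0.5 * sum(
        pi[i] * q[i][j] * (f[j] - f[i]) ** 2
        for i in range(n) for j in range(n) if i != j
    )


def functional(f, g, pi, q):
    """(f^2/g, -Omega g) in L^2(pi), with Omega g(i) = sum_j q_ij (g_j - g_i)."""
    n = len(f)
    total = 0.0
    for i in range(n):
        omega_g = sum(q[i][j] * (g[j] - g[i]) for j in range(n) if j != i)
        total += pi[i] * f[i] ** 2 / g[i] * (-omega_g)
    return total


# Hypothetical reversible chain: q_ij = c_ij / pi_i with symmetric weights c.
pi = [0.5, 0.3, 0.2]
c = [[0.0, 0.2, 0.1], [0.2, 0.0, 0.3], [0.1, 0.3, 0.0]]
q = [[c[i][j] / pi[i] for j in range(3)] for i in range(3)]
f = [1.0, 2.0, 0.5]  # a strictly positive test function

D = dirichlet_form(f, pi, q)
rng = random.Random(0)
trials = [
    functional(f, [rng.uniform(0.1, 5.0) for _ in range(3)], pi, q)
    for _ in range(1000)
]
at_f = functional(f, f, pi, q)
print(D, max(trials), at_f)
```

Every sampled $g$ gives a value not exceeding $D(f)$, and the choice $g = f$ attains it, mirroring the use of $a + 1/a \ge 2$ in part a) and the choice $g = f$ in part b) of the proof.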
Proposition 9.37. Theorem 9.35 holds for bounded $q$-pairs.

Proof: From the assumption, $-\Omega g/g \ge \lambda$, $\pi$-a.e., it follows that $(-\Omega g/g, f^2) \ge \lambda\|f\|^2$. Since $D(f) \ge D(|f|)$, once $g$ is bounded, the conclusion follows from Theorem 9.36 immediately. We now consider the general $g$. Let $g_n = g\wedge n$. Then, it is easy to check that

Therefore

From the assumption, $-\Omega g/g \ge \lambda > 0$, the required assertion now follows by using the monotone convergence theorem. □

The next simple result was used in the definition of $\mathscr D(D)$.
Lemma 9.38. For each $f \in \mathscr D_0$, we have $\|f\|_D < \infty$.

Proof: Take $n$ such that $fI_{E_n^c} = 0$. Then, by the definition of $\mathscr D_0$, we have $\|f\| < \infty$. Next, from the symmetry of $\pi_q(dx,dy)$, we obtain

This gives us $\|f\|_D < \infty$. □
We are now ready to prove Theorem 9.35. The idea is a localizing procedure reducing the general case to the bounded one treated in Proposition 9.37. To do so, we need some preparations. From now on, we fix a function $g$ and a constant $\lambda > 0$, as given in Theorem 9.35.

Lemma 9.39. Let $F_m = \{x \in E : g(x) \ge 1/m\}$, $m \ge 1$, and $\mathscr D_1 = \{fI_{F_m} : f \in \mathscr D_0,\ m \ge 1\}$. Then $\mathscr D_1$ is dense in $\mathscr D(D)$ in the norm $\|\cdot\|_D$.
:
r
For each fixed $n$, since $\|f_n\|_D < \infty$, $F_m \uparrow E$ and $q(x)$ is bounded on the support of $f_n$, the right-hand side goes to zero as $m \to \infty$. From the triangle inequality, $\|f_{nm} - f\|_D \le \|f_{nm} - f_n\|_D + \|f_n - f\|_D$, we can first choose a large enough $n$ and then a large enough $m$ so that $\|f_{nm} - f\|_D$ becomes arbitrarily small. □

For each $B \in \mathscr E$, define a local $q$-pair $(q_B(x), q_B(x,dy))$ and the corresponding operator $\Omega_B$ on $(B, B\cap\mathscr E)$ as follows:
$$q_B(x) = q(x), \qquad q_B(x,dy) = q(x,dy)I_B(y), \qquad \Omega_Bf(x) = \int q_B(x,dy)f(y) - q_B(x)f(x), \qquad x \in B.$$
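Lemma 9.40. Let $g$ and $\lambda$ be given by Theorem 9.35. Then for every $B \in \mathscr E$, $\Omega_Bg \le -\lambda g$, $\pi$-a.s. on $B$.

Proof: By assumption,
$$-\lambda g(x) \ge \Omega g(x) = \int q(x,dy)g(y) - q(x)g(x) \ge \int_B q(x,dy)g(y) - q(x)g(x) = \Omega_Bg(x), \qquad x \in B. \quad\square$$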
For $n, m \ge 1$, let $G_{n,m} = E_n\cap F_m$ and define the $q$-pair $(q_{n,m}(x), q_{n,m}(x,dy))$ and the operator $\Omega_{n,m}$ as above (by setting $B = G_{n,m}$). Next, define
1 Dn,,(f> = 5
J
Gn,mXGn,m
rqn,m
rTn,m
, ~ ( dy)) X ,
(d4f(42,
where

Corresponding to the bounded form $D_{n,m}$, we have

since $\mathscr D(D_{n,m}) = L^2(G_{n,m}, \pi)$ (the set of square-integrable functions on $G_{n,m}$ with respect to the measure $\pi|_{G_{n,m}}$). A simple computation shows that we also have (9.32)

since $D_{n,m}(f) = D(f)$ for every $f \in L^2(\pi)$ with $fI_{G_{n,m}^c} = 0$. In other words, $\lambda_0^{(n,m)}$ is the Dirichlet eigenvalue of the $q$-pair on the domain $G_{n,m}$.
Lemma 9.41. $\lambda_0^{(n,m)}$ is decreasing in both $n$ and $m$. Moreover,
$$\lim_{n,m\to\infty}\lambda_0^{(n,m)} = \lambda_0.$$
Proof: The first assertion follows from (9.32) and the fact that $E_n \uparrow E$, $F_m \uparrow E$ as $n, m \to \infty$. Moreover, it is obvious that $\lambda_0^{(n,m)} \ge \lambda_0$. Next, from the definition of $\lambda_0$, for every $\varepsilon > 0$, there is $f_\varepsilon \in \mathscr D(D)$ such that $\|f_\varepsilon\| = 1$ and $\lambda_0 \ge D(f_\varepsilon) - \varepsilon$. By Lemma 9.39, there exists a sequence $\{f_{nm}\} \subset \mathscr D_1$ so that $\|f_{nm} - f_\varepsilon\|_D \to 0$. Without loss of generality, assume that $f_{nm}I_{G_{n,m}^c} = 0$ and $\|f_{nm}\| = 1$. Thus, for large enough $n, m$, we have $D(f_\varepsilon) \ge D(f_{nm}) - \varepsilon$. Hence $\lambda_0 \ge D(f_{nm}) - 2\varepsilon \ge \lambda_0^{(n,m)} - 2\varepsilon$ by (9.32). Since $\varepsilon$ is arbitrary, we have thus proved the required assertion. □

It is now easy to complete the proof of Theorem 9.35.

Proof of Theorem 9.35: Applying Proposition 9.37 to the $q$-pair $(q_{n,m}(x), q_{n,m}(x,dy))$ on $(G_{n,m}, G_{n,m}\cap\mathscr E)$ and using Lemma 9.40, it follows that $\lambda_0^{(n,m)} \ge \lambda$. Then, the required assertion follows from Lemma 9.41.
Example 9.42 (Continuation of Example 9.14). The dual Q-matrix I$ is as follows: Gk,h+l = 8 for k 2 0, &O = 1- 19for k 2 1 a:id ijzj= 0 for other i-# j . Denote by 1the operator corresponding to the symmetrized $-matrix Q = (Q+6)/2 ru’ext,define po = 0, (pk = O( I)/‘, k 2 1. Then, it is easy to check that - at;;(k)/pprc 3 1 - l/e for all k 2 1. Thus, by Theorems 9.35 and 9.33 with R = Z+\ {0}, we obtain A1 2 A0 3 1 - fib This improves the lower bound given in Example 9.14. Actually, the equation n g = -Xog has a solution 9%= z @ - ’ + ~ ) / ~ ,g E L1(r) \ L2(7r). Applying Lemma 9.8 (1) to g2 = O ( - ’ + + ’ ) / ~ (i 2 O ) with 9,’”’ = gzAn plus some computations, it fo~lows that A1 < 1 - fi and so our estimate is sharp.
9.5 Cheeger's Approach and Isoperimetric Constants
In this section, we study $\lambda_0$ and $\lambda_1$ by using a generalized Cheeger's approach. We consider a general quadratic form

where $J$ is the same as in the last section and $K$ is a non-negative measure on $(E, \mathscr E)$. The study of $\lambda_0$ is meaningful since $D(1) \ne 0$ whenever $K \ne 0$. Thus, in what follows, when dealing with $\lambda_0$ (resp. $\lambda_1$), we consider only the quadratic form given by (9.33) (resp. (9.26)).

In general, $J$ and $K$ can be unbounded. We adopt a renormalizing procedure as follows. Choose $r \in (\mathscr E\times\mathscr E)_+$ and $s \in \mathscr E_+$ such that

where

for $\alpha \in [0,1]$. By convention, $J^{(0)} = J$ and $K^{(0)} = K$. Assume that $r$ is symmetric if so is $J$. When $J(dx,dy) = \pi(dx)q(x,dy)$ is symmetric, one may simply choose $r(x,y) = q(x)\vee q(y)$. In the non-symmetric case, assume also that (9.35)

Corresponding to $(J^{(\alpha)}, K^{(\alpha)})$, we have a quadratic form $D^{(\alpha)}$ defined by (9.33), for which $\lambda_0^{(\alpha)}$ and $\lambda_1^{(\alpha)}$ are well defined. Next, define the isoperimetric (or Cheeger's) constants as follows

Theorem 9.43. For the quadratic form given by (9.33), under (9.34) and (9.35), we have
Proof: a) First, we express $h^{(\alpha)}$ by the following functional form

By setting $f = I_A/\pi(A)$, one returns to the original set form of $h^{(\alpha)}$. For the reverse assertion, simply consider the set $A_\gamma = \{f > \gamma\}$ for $\gamma \ge 0$. The proof is also not difficult:
's
tf(Z)>f(Y))
J(%%
dY)[f(Z)
-
f(d1 + K ( " ) ( f )
(Co-area formula). Hence
The appearance of $K$ makes the notation heavier. To avoid this, one can enlarge the state space to $E^* = E\cup\{\infty\}$. Regarding $K$ as a killing measure on $E^*$, the form $D(f,g)$ can be extended to the product space $E^*\times E^*$ but expressed by using a measure $J^*$ only. At the same time, one can extend $f$ to a function $f^*$ on $E^*$: $f^* = fI_E$. More precisely, define $J^{*(\alpha)}$ on $E^*\times E^*$ by
$$J^{*(\alpha)}(C) = \begin{cases} J^{(\alpha)}(C), & C \in \mathscr E\times\mathscr E,\\ K^{(\alpha)}(A), & C = A\times\{\infty\} \text{ or } \{\infty\}\times A,\ A \in \mathscr E,\\ 0, & C = \{\infty\}\times\{\infty\}. \end{cases}$$
Then, we have
$$\int J^{*(\alpha)}(dx, E^*)|f^*(x)| = \int J^{(\alpha)}(dx, E)|f(x)| + K^{(\alpha)}(|f|).$$
b) Take $f$ with $\pi(f^2) = 1$. By a), the Cauchy-Schwarz inequality and conditions (9.34) and (9.35), we have
The right-hand side is bounded above by 1. Solving this quadratic inequality in D(')(f), one obtains D(')(f) 2 1c ) Repeating the above proof but by a more careful use of Cauchy-Schwarz inequality, we obtain
d-,
$\le D(f)\,[2 - D^{(1)}(f)]$. From this and b), the required assertion follows. □

We now turn to study $\lambda_1$. To do so, we need the following isoperimetric (or Cheeger's) constants.
1
/&a) = -
inf
2 dA)€(OJ)
J ( a ) ( Ax A") + $")(Ac x A ) I
(4 J(")( A x Ac)+ J(CY)(AC x A) 4 4 n
1 k(a)' == inf 2 n(A)~(0,1/2] 44 J ( O ) ( Ax A C ) J(.I(AC x A ) 1 -inf 2 n(A)~(0,1) r ( A )A x(AC)
+
The functional form of these constants can be expressed as follows.
Lemma 9.44. For every $\alpha \ge 0$, we have
Proof: Because $\alpha \ge 0$ is fixed, we can omit the superscript "$(\alpha)$" everywhere in the proof. Denote by $\bar k$ and $\bar k'$ the right-hand sides of the above quantities. By taking $f = I_A$, we obtain $k \ge \bar k$ and $k' \ge \bar k'$. We now prove the reverse inequalities.

a) For any $f \in L^1_+(\pi)$ with $\int\pi(dx)\pi(dy)|f(x) - f(y)| = 1$, by proof a) of Theorem 9.43, we have
This proves the first equality of $k^{(\alpha)}$. Next, we show that

where $\|\cdot\|_p$ denotes the $L^p$-norm. First, let $\pi(g) = 0$ with $\inf_{c\in\mathbb R}\|g - c\|_\infty \le 1$. Then, because $\pi(g) = 0$ and $\pi(f - \pi(f)) = 0$, we have

for all $c \in \mathbb R$. Hence, by the Hölder inequality, we have

for all $c$. This gives us
On the other hand, for a given $f \in L^1(\pi)$, set $A_f^+ = \{f \ge \pi(f)\}$ and $A_f^- = \{f < \pi(f)\}$. Take $g_0 = I_{A_f^+} - I_{A_f^-} - \pi(A_f^+) + \pi(A_f^-)$. Then $g_0 \in L^\infty(\pi)$ and $\pi(g_0) = 0$. Finally, take $c_0 = 1 - 2\pi(A_f^+)$. Then, it is easy to check that $\inf_c\|g_0 - c\|_\infty = \|g_0 - c_0\|_\infty = 1$. Therefore, we have $\int fg_0\,d\pi = \int|f - \pi(f)|\,d\pi$ as required.

We now prove the second equality of $k^{(\alpha)}$. Let $f \ge 0$ and set $A_\gamma = \{f \ge \gamma\}$. By a) and (9.36), we have

Therefore, we obtain $\bar k \ge k$.

b) Choose $c_0 \in \mathbb R$ such that $\pi(f < c_0),\ \pi(f > c_0) \le 1/2$. Let $f_\pm = (f - c_0)^\pm$. Then we have $f_+ + f_- = |f - c_0|$ and $\pi(|f - c_0|) = \min_c\pi(|f - c|)$. For any $\gamma \ge 0$, define $A_\gamma^\pm = \{f_\pm > \gamma\}$. We have
+
=
flTJ(A:
[.(A:)
= k'minn(\f C
fI(.
x (A,'>")+J((A,f)"x A:)+J(A; x (A;)")+J((A;)'x A;)] dy
00
2k'1
<
+ n(A;)]dy
= k'n(f+
+ f-) = k'n(lf - col)
- .I).
This implies that $\bar k' \ge k'$. □

We can now state the main result of this section.
Theorem 9.45. Let K = 0 and (9.34), (9.35) hold. Then, we have
9.5 CHEEGER'S APPROACH AND where K,
= lim
373
ISOPERIMETRIC CONSTANTS
4
>,
K ; ~
Pf A!"
(9.38) the infimum is taken over all i.i.d. random variables X and Y with EX EX2 = 1.
r
0 and
Proof: a) First, we prove the first lower bound of (9.37). Let $f \in \mathscr D(D)$ with $\pi(f) = 0$ and $\pi(f^2) = 1$. Set $g = f + c$, $c \in \mathbb R$. Similar to the proof c) of Theorem 9.43, we have

for all $\beta$: $0 < \beta < \lambda_1^{(1)} \le 2$. By Lemma 9.44, we have

Thus, it remains to estimate $\kappa_\alpha$. Set $\gamma = \mathbb E|X| \in (0,1]$. Clearly,

Here in the second to the last step, we have used Jensen's inequality. We claim that when $c = 0$,
$$\mathbb E|X^2 - Y^2| \ge 2(1 - \mathbb E|X|) = 2(1 - \gamma). \tag{9.41}$$
To see this, note that
$$\mathbb E|X^2 - Y^2| \ge 2\,\mathbb E(X^2 - Y^2;\ Y^2 < 1 \le X^2) = 2\,\mathbb E[X^2; X^2 \ge 1]\,\mathbb P(X^2 < 1) - 2\,\mathbb E[X^2; X^2 < 1]\,\mathbb P(X^2 \ge 1) = 2\big(\mathbb P[X^2 < 1] - \mathbb E[X^2; X^2 < 1]\big).$$
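The claim (9.41) can be checked exactly on small discrete laws. The following sketch uses rational arithmetic and a few hypothetical distributions normalized so that $\mathbb EX^2 = 1$ (the distributions and the helper name `check_claim` are illustrative assumptions, not taken from the text); it returns both sides of the inequality for i.i.d. $X$ and $Y$.

```python
from fractions import Fraction as F


def check_claim(dist):
    """dist: list of (value, probability) pairs with total mass 1 and E X^2 = 1.

    Returns (E|X^2 - Y^2|, 2 * (1 - E|X|)) for i.i.d. X, Y with this law.
    """
    assert sum(p for _, p in dist) == 1
    assert sum(p * x * x for x, p in dist) == 1
    lhs = sum(p * r * abs(x * x - y * y) for x, p in dist for y, r in dist)
    rhs = 2 * (1 - sum(p * abs(x) for x, p in dist))
    return lhs, rhs


examples = [
    [(F(-1), F(1, 2)), (F(1), F(1, 2))],                    # symmetric signs: both sides vanish
    [(F(-2), F(1, 8)), (F(0), F(3, 4)), (F(2), F(1, 8))],   # mass at zero
    [(F(-1, 2), F(4, 5)), (F(2), F(1, 5))],                 # asymmetric, mean zero
]
results = [check_claim(d) for d in examples]
print(results)
```

In all three cases the left-hand side dominates the right-hand side, with equality for the symmetric two-point law.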
On the other hand,

Hence

Thus, we obtain

Letting $\beta \uparrow \lambda_1^{(1)}$, we obtain the first lower bound.

b) For any $B \subset E$ with $\pi(B) > 0$, define a local form as follows.
Obviously, D - ( ,a ) ( f ) = D -(a) B (91,). Moreover, xO(B>
:= inf{D(j> : f / p = 0,
llfll
= 1 ) = inf
( 5 B ( j ) : i 7 ( f 2 1 B ) = I}.
Let nB = T ( . n B)/.rr(B)and set
~ ~ s j a=)
inf
[ J ( a ) ( A(xB \ A )j + J ( " ) ( ( B\ A) x A ) ] / 2+ J(")( A x BC)
(4
ACB, rr(A)>O
- _1 inf 2 ACB,r(A)>O
$-)(A x A") + $")(A" x A )
44
Applying Theorem 9.43 to the local form on L 2 ( B ,6 n B , nB) generated by J B = T ( B ) - ~ J I and B ~K~B = d ( . ,Bc)IB,we obtain
c) Note that in
,by Theorem 9.33
We obtain the second lower bound of (9.37). □

We now study a different way to renormalize the general quadratic forms. In contrast to the previous approach, we now keep $(J, K)$ the same but change the $L^2$-space. To do so, let $p$ be a measurable function satisfying
$$\alpha_p := \pi\text{-}\operatorname{ess\,inf} p > 0, \qquad \beta_p := \pi(p) < \infty \qquad\text{and}\qquad [J(dx,E) + K(dx)]/\pi_p(dx) \le \beta_p, \quad \pi_p\text{-a.e.},$$
where $\pi_p = p\pi/\beta_p$. In the non-symmetric case, assume also that (9.35) holds. For jump processes, one may take $p(x) = q(x)\vee r$ for some $r \ge 0$. From this, one sees the main restriction of the present approach: $\int\pi(dx)q(x) < \infty$, since we require that $\pi(p) < \infty$. Except for this point, the approach is not comparable with the previous one. Next, define
Theorem 9.46. Let $p$, $\alpha_p$, $\beta_p$, $\pi_p$ and $\lambda_{p,i}$ $(i = 0, 1)$ be given above. Then, we have
$$\lambda_i \ge \alpha_p\lambda_{p,i}/\beta_p, \qquad i = 0, 1. \tag{9.43}$$
In particular,

(9.44)

and when $K = 0$,

(9.45)

where $\gamma_p \ge 1/2$ is the unique solution to the equation
$$2\gamma^2k_p'^{\,2} = 1 - \sqrt{1 - 4(1-\gamma)^2k_p'^{\,2}}, \qquad \gamma \in (0,1). \tag{9.46}$$
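The root $\gamma_p$ of (9.46) has no closed form in general, but it is easy to approximate. The sketch below is a minimal illustration under assumptions: it reads (9.46) as $2\gamma^2k^2 = 1 - \sqrt{1 - 4(1-\gamma)^2k^2}$ for a generic isoperimetric constant $k \in (0,1]$ (the symbol `k` and the function name `gamma_p` are placeholders), and it uses bisection on $[1/2, 1]$, where the left-hand side minus the right-hand side changes sign because $\sqrt{1-x} \le 1 - x/2$.

```python
import math


def gamma_p(k, tol=1e-12):
    """Solve 2 g^2 k^2 = 1 - sqrt(1 - 4 (1 - g)^2 k^2) for g in [1/2, 1], 0 < k <= 1."""
    def h(g):
        # Increasing in g: the first term grows while the subtracted term shrinks.
        return 2 * g * g * k * k - (1 - math.sqrt(1 - 4 * (1 - g) ** 2 * k * k))

    lo, hi = 0.5, 1.0  # h(lo) <= 0 <= h(hi) for 0 < k <= 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


roots = {k: gamma_p(k) for k in (0.1, 0.5, 1.0)}
print(roots)
```

For small $k$ the root is close to $1/2$, and it moves upward as $k$ increases, which agrees with the statement $\gamma_p \ge 1/2$.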
Proof: Here, we prove the assertions for $i = 1$ only since the proof for $i = 0$ is similar and even simpler.
To prove (9.43) for $i = 1$, by the assumption on $p$ and Lemma 9.10, it is clear that $L^\infty(\pi)$ is dense in $L^2(\pi_p)$ in the $\|\cdot\|_D$-norm. Noting that $L^2(\pi_p)$ is just the domain of the form $D(f)$ on $L^2(\pi_p)$, by the definition of $\lambda_1$ and $\lambda_{p,1}$, it suffices to show that $\pi_p(f^2) - \pi_p(f)^2 \ge \alpha_p[\pi(f^2) - \pi(f)^2]/\beta_p$ for every $f \in L^\infty(\pi)$. The proof goes as follows.
-
>
To prove (9.44) and t,he second estimate of (9.45), simply apply Theorems 9.43 and 9.45 with 7~ replaced by 7rp and renormalizing constants r ( z , y ) 5 p,, s ( x ) = P,. Noting that for the modified k(”), we have k(“) = k/,’3,”, k, = k / P p and so on, where k = k(’). To prove the first estimate of (9.45),lct np(f)= 0 and 7 r p ( f 2 ) = 1. Similar to (9.39), we have
Solving t.his inequality in D ( f ) , we obtain D ( f >3 & ( I + c 2 ) -
Jm
and so &,1
+ c2)2 - P,”(l+ c2)M(x,Y,c)k;,
> &(1+ c2) - @(1
where M ( X ,Y ,c) =
+
-~.--
+
+ c)21)2
I
Thus
M ( X ,Y,c)k;
&J >
&
(y 1 c2 .-..
/-
1
+ J1 - M ( X ,Y,C ) k Z / ( l + c2)
*
Combining this with (9.40) and (9.41), we get
A1,
{
2 pp inf max 2y2k;, I YE [O?21
Jm}
= 24,(ypkp)2,
where $\gamma_p$ is the unique solution to (9.46). To see that $\gamma_p \ge 1/2$, note that $\sqrt{1-x} \le 1 - x/2$ and solve the equation $2\gamma^2k_p'^{\,2} = 2(1-\gamma)^2k_p'^{\,2}$. □

The main advantage of Cheeger's approach is that it works in a very general setting. Here is an example.
Corollary 9.47. Let $(E, \mathscr E, \pi)$ be a probability space and let $j(x,y) \ge 0$ be a symmetric function satisfying $j(x,x) = 0$ and $j(x) := \int j(x,y)\pi(dy) < \infty$ for all $x \in E$. Then, for the symmetric form generated by $J(dx,dy) = j(x,y)\pi(dx)\pi(dy)$, we have

Proof: Note that

The conclusion now follows from (9.37) immediately. □
For birth-death processes, Cheeger's constants can be computed explicitly.

Theorem 9.48. Consider the birth-death process on $\mathbb Z_+$ with birth rates $(b_i)$ and death rates $(a_i)$.

(1) Take $r_{ij} = (a_i + b_i)\vee(a_j + b_j)$ $(i \ne j)$. Then $k^{(\alpha)\prime} > 0$ (equivalently, $k^{(\alpha)} > 0$) iff there exists a constant $c > 0$ such that

Then, we indeed have $k^{(\alpha)\prime} \ge c$. Furthermore,

(2) Let
Take
0 (equivalently, IC, > 0) iff inf
i>l
Then we have nisi
-> O C j &njpj
and moreover,
Proof: Here we prove part (1) only since the proof of part (2) is similar.

a) Let $k^{(\alpha)} > 0$. Take $A = I_i = \{i, i+1, \ldots\}$ for a fixed $i > 0$ and
\
*
-
Z,Z+l
Then
This proves the necessity of the condition.

b) Next, assume that the condition holds. Then for each $A$ with $\pi(A) \in (0,1)$, by the symmetry of $A$ and $A^c$, we may assume that $0 \notin A$. Set $i_0 = \min A \ge 1$. Then, $A \subset I_{i_0}$, $A^c \subset E\setminus\{i_0\}$ and so

Because $A$ is arbitrary, we obtain the required assertions. □
Example 9.49. Let $E = \mathbb Z_+$ and take $a_i \equiv a$ and $b_i \equiv b$ with $a > b > 0$. Then, both Theorem 9.45 and Theorem 9.46 are sharp.

Proof: This is a standard example which is often used to justify the power of a method. It is well known that $\lambda_1 = (\sqrt a - \sqrt b)^2$ (cf. Examples 9.27).

a) By part (1) of Theorem 9.48, we have

Then, by Theorem 9.45, we get $\lambda_1 \ge (\sqrt a - \sqrt b)^2$.

b) Take $p_i \equiv a + b$. Then by part (2) of Theorem 9.48,

The same estimate as in a) now follows from Theorem 9.46. □
Example 9.50 (Continuation of Example 9.24). Let $q_{0k} = b_k > 0$, $q_k = q_{k0} = 2^{-1}$ $(k \ge 1)$, $q_{ij} = 0$ for the other cases of $i \ne j$ and $q_0 := \sum_{k\ge1} b_k < \infty$. Then, Theorem 9.46 is sharp for all $q_0$ but Theorem 9.45 is sharp only for $q_0 \le 1/2$.
Proof: From $\pi_0q_{0k} = \pi_kq_{k0}$, it follows that $\pi_k = 2\pi_0b_k$, $k \ge 1$, and $\pi_0 = (1 + 2q_0)^{-1}$.

a) Take $p_i = q_i\vee(1/2)$; then $\alpha_p = 1/2$. Without loss of generality, assume that $0 \notin A$. Then
This gives us kb 2 1 and so lcb = 1. Hence by Theorem 9.46,
Actually, every equality in the last line must hold (cf. Example 9.24). b) Again, assume that 0 e A. Then
1
1
1
( +
Because 1/2 CIGA, izob i ) / CiEA bi decreases when A increases, by setting A = {i} for a large enough i # 0, it follows that
By Theorem 9.45, we get XI 2
+{1 V (2qO)-I- d-}-'.
Thus,
the lower bound is equal to $1/2 = \lambda_1$ iff $q_0 \le 1/2$. □

The last two examples show that Theorems 9.45 and 9.46, and hence the related isoperimetric constants, can be sharp. This is more essential than the unboundedness since the unbounded case can be reduced to the bounded one by Theorem 9.20.
9.6 Notes

Theorem 9.1 is a modification of Liggett (1989) in terms of the Dirichlet forms. Theorem 9.5 is taken from the same quoted paper. The development of Theorem 9.15 has taken quite a long time. It was proved first for birth-death processes in Chen (1991b), based on Karlin-McGregor's representation theorem (cf. van Doorn (1981)), and then for the general setup in Chen (2000a,b, 2003b), based on a time-discrete analogue of Roberts and Rosenthal (1997). The present form is an extension, with a simplified proof, of the original one. The last step of proof b) is due to Wang (2000), or Röckner and Wang (2001). The author learnt the proof b) mainly from Y. H. Mao (oral communication). The local integrability used in the proof c) of the theorem is due to Hwang, Hwang-Ma and Sheu (2002).

The first two results in §9.2 are due to Chen and Wang ( l W b ), Chen (1994a,b). The last two results are taken from Chen (1991, 1996a).

Theorem 9.25 is taken from Chen (1996a, 1999a, 2000b, 2001a). In the exponentially ergodic sense, part (1) of the theorem was implied in a series of papers by van Doorn (1985, 1987, 1991, 2002). See also Zeifman (1985), Granovsky and Zeifman (1997) for a different approach. The diffusion analogue of parts (1) and (2) appeared in Chen and Wang (1997). The explicit criterion was obtained independently by Miclo (1999), based on a different approach, the weighted Hardy's inequality, not touched here. Parts (1)-(3) of Corollary 9.28 are due to van Doorn (1985), Sullivan (1985) and Liggett (1989), respectively. Lemma 9.31 is also due to the last quoted paper.

Theorem 9.32 was proved by Mao (2002a) and Miclo (1999). The present form is taken from Chen (2003c), in which a much more general Poincaré-type inequality in Banach spaces of one-dimensional functions is treated. The idea goes back to Bobkov and Götze (1999a,b).

The last two sections are motivated by Cheeger's idea (Cheeger, 1970).
The generalization to bounded jump processes is due to Lawler and Sokal (1988) [see also Sokal and Thomas (1988), Thomas (1989) for more results]. The resulting lower bounds degenerate when passing to the unbounded case. This problem was overcome by Chen and Wang (1998), by using new isoperimetric constants. Most of the material in these sections is based on these papers and Chen (1999c, 2000b). Theorem 9.35 is taken from Chen (2000c). For finite Markov chains, Theorem 9.36 is due to Kipnis and Landim (1999, Appendix I). In the past ten years or more, a lot of progress on the spectral gap, the logarithmic Sobolev inequality (Gross, 1975) and related topics has been made, and there is a huge number of publications. A large part of the new material is collected in a separate book (Chen, 2003d).
PART III EQUILIBRIUM PARTICLE SYSTEMS
Chapter 10
Random Fields
Random fields can be considered as a sister subject of interacting particle systems. The main purpose of these two subjects is the same: studying phase transitions in statistical physics. The difference between them is that the former adopts a discrete parameter and the latter a continuous parameter. This is similar to ordinary Markov processes, for which we have time-discrete and time-continuous cases. Since random fields are now a wide research field, it is not possible for us to cover the whole subject in the book. Instead, we introduce some fundamental parts of the theory: the existence and uniqueness problems for random fields. Two methods for studying phase transitions, the Peierls method and the reflection positivity method, are also introduced.
10.1 Introduction
Throughout this chapter, unless otherwise stated, we take $S = \mathbb{Z}^d$. Let $|\cdot|$ denote the ordinary metric in $\mathbb{Z}^d$:
$$|s - t|^2 = \sum_{i=1}^d \big(s^{(i)} - t^{(i)}\big)^2.$$
For $V \subset \mathbb{Z}^d$, let $|V|$ denote the cardinality of $V$. In particular, we write $V \Subset \mathbb{Z}^d$ or $V \in \mathscr{S}$ whenever $|V| < \infty$. Besides, the $r$-boundary of $V$ is defined by
$$\partial_r V = \{t \in V^c : d(t, V) \le r\}, \qquad r \ge 0,$$
where $d(t, V) = \inf_{s \in V} |t - s|$. Next, fix a metric space $(X, \rho)$ with Borel $\sigma$-algebra $\mathscr{B}$. In what follows, $(X, \mathscr{B})$ is called a spin space. For $V \subset \mathbb{Z}^d$, let $(E(V), \mathscr{E}_0(V))$ denote the product measurable space
$$E(V) = X^V, \qquad \mathscr{E}_0(V) = \mathscr{B}^V.$$
Set $\mathscr{E}(V) = \mathscr{E}_0(V) \times E(S \setminus V)$ and $(E, \mathscr{E}) = (E(S), \mathscr{E}(S))$. An element $x = (x_u : u \in \mathbb{Z}^d)$ in $E$ is called a configuration. The restriction of $x$ to $V$ is denoted by $x_V = (x_u : u \in V)$. For $V \subset W \subset \mathbb{Z}^d$, $x_V$ is also used to denote the restriction of $x_W$ to $V$. Finally, for a kernel $p_V$ [i.e., $p_V(\cdot, A) \in \mathscr{E}_0(V^c)$ for every $A \in \mathscr{E}_0(V)$ and $p_V(\tilde x, \cdot) \in \mathscr{P}(E(V))$ for every $\tilde x \in E(V^c)$], it is more convenient, as will be used often subsequently, to regard $p_V$ as a probability kernel on the whole space $(E, \mathscr{E})$. To do so, simply put
$$p_V(\tilde x, A) = p_V\big(\tilde x_{V^c}, A(\tilde x_{V^c})\big), \qquad \tilde x \in E,\ A \in \mathscr{E},$$
where $A(z) = \{x \in E(V) : x \times z \in A\}$ for $z \in E(V^c)$.
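The $r$-boundary defined above is easy to compute for a finite $V$. The following is a minimal sketch, not part of the original text; the function name `r_boundary` and the representation of sites as integer tuples are illustrative assumptions.

```python
import itertools
import math

def r_boundary(V, r, d=2):
    """Return {t in V^c : d(t, V) <= r} for a finite set V of sites in Z^d.

    Hypothetical helper: sites are tuples of d integers, and d(t, V) is the
    Euclidean distance from t to the nearest site of V.
    """
    V = set(V)
    reach = math.floor(r)  # a site within distance r differs by at most floor(r) per coordinate
    boundary = set()
    for s in V:
        for offset in itertools.product(range(-reach, reach + 1), repeat=d):
            t = tuple(si + oi for si, oi in zip(s, offset))
            if t in V:
                continue
            dist = min(math.dist(t, u) for u in V)  # d(t, V)
            if dist <= r + 1e-12:                   # small tolerance for rounding
                boundary.add(t)
    return boundary
```

For instance, under these assumptions `r_boundary({(0, 0)}, 1)` consists of the four nearest neighbors of the origin in $\mathbb{Z}^2$.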
10 RANDOM FIELDS
Definition 10.1. (1) Each $P \in \mathscr{P}(E)$ is called a random field.
(2) A family of kernels $(p_V : V \Subset \mathbb{Z}^d)$ is called a specification if it satisfies the following consistency condition: for every $W \subset V \Subset \mathbb{Z}^d$, $\tilde x \in E(V^c)$, $A \in \mathscr{E}(V \setminus W)$ and $B \in \mathscr{E}(W)$, we have
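where for given $V_1$ and $V_2$ with $V_1 \cap V_2 = \emptyset$, given $x_{V_1}$ and $\tilde x_{V_2}$, $x_{V_1} \times \tilde x_{V_2}$ is an element in $E(V_1 \cup V_2)$, which coincides with $x_{V_1}$ on $V_1$ and with $\tilde x_{V_2}$ on $V_2$. The specification $(p_V : V \Subset \mathbb{Z}^d)$ is said to have finite range, or to be an $r$-specification, if for every $A \in \mathscr{E}_0(V)$, $V \Subset \mathbb{Z}^d$, $p_V(\cdot, A) \in \mathscr{E}(\partial_r V)$ (also written as $\in \mathscr{E}_0(\partial_r V)$). It is said to have zero range if $p_V(\cdot, A)$ is independent of $x_{V^c} \in E(V^c)$, i.e., $r = 0$.
(3) Let $P \in \mathscr{P}(E)$ and let $(p_V : V \Subset \mathbb{Z}^d)$ be a specification. If $P(\cdot \,|\, \mathscr{E}(V^c)) = p_V$ for all $V \Subset \mathbb{Z}^d$, then $P$ is called consistent with $(p_V : V \Subset \mathbb{Z}^d)$.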
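One can rewrite the consistency condition into the functional form:
$$\int p_V(\tilde x, dx_V)\, f(x_V) = \int p_V(\tilde x, dx_V) \Big[\int p_W\big(x_{V \setminus W} \times \tilde x_{V^c}, dz_W\big)\, f\big(z_W \times x_{V \setminus W}\big)\Big], \qquad W \subset V \Subset \mathbb{Z}^d,\ f \in b\mathscr{E}_0(V). \tag{10.2}$$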
Then the condition can be interpreted as follows. Regarding $V^c$, $V \setminus W$ and $W$ ($V \Subset \mathbb{Z}^d$) as past, present and future, respectively, (10.2) means that given the present, the future is independent of the past. This is nothing but a typical type of Markov property. In this sense, the specifications are analogues of the transition probability functions of a Markov process with discrete time parameter. However, the parameters are now subsets of $\mathbb{Z}^d$ rather than $\{0, 1, 2, \ldots\}$, and hence the condition is essential. Actually, random fields are one of the original sources of random processes with multi-parameters. Clearly, (10.2) plays the same role as the CK-equation does for ordinary Markov processes. In the literature, (10.2) is often called the Dobrushin-Lanford-Ruelle equation (abbrev. DLR-equation). In statistical physics, a specification is often given in terms of the Hamiltonian.
Example 10.2 (Ising model). The spin space is $X = \{-1, +1\}$. The Hamiltonian is
$$H(x) = -\sum_{\langle st \rangle} x_s x_t, \qquad x \in E.$$
Note that the above sum is only formal since the series can be divergent. Essentially, the formula only means that the interaction for the system is nearest neighbor, i.e., $r = 1$.
Example 10.3 (Anisotropic Heisenberg ferromagnet). The spin space is $X = S^2 \subset \mathbb{R}^3$ with the ordinary metric $|\cdot|$. Take
where $x_s = \big(x_s^{(1)}, x_s^{(2)}, x_s^{(3)}\big) \in S^2$, $s \in \mathbb{Z}^d$, and $\alpha > 1$ is called the coefficient of anisotropy.
Example 10.4. Take $X = \mathbb{R}^n$ ($n \ge 1$) with the standard metric $|\cdot|$ and
where $\Phi$ is an appropriate $\mathscr{B}$-measurable function.
By using some appropriate interaction function
$$\Phi_A(x) := \Phi_A(x_A), \qquad x \in E,\ A \Subset \mathbb{Z}^d,\ \Phi_A \in \mathscr{E}_0(A)$$
(for the above examples, we have $\Phi_A = 0$ whenever $|A| > 2$), these Hamiltonians can be expressed by a uniform formula:
$$H(x) = \sum_A \Phi_A(x), \qquad x \in E.$$
Furthermore, we can define the conditional Hamiltonian $H_{V, \tilde x}$ with boundary $\tilde x$ as follows:
Choose a reference measure $\lambda$ (counting measure, Lebesgue measure, and so on) and define
where $\beta > 0$ is called the inverse temperature (i.e., $\beta = 1/(\kappa T)$, where $T$ is the temperature and $\kappa$ is the Boltzmann constant) and
$$Z(\tilde x) := \int \exp\big[-\beta H_{V, \tilde x}(x)\big] \prod_{t \in V} \lambda(dx_t)$$
is called the partition function or statistical sum.
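To make the consistency condition (10.2) concrete, here is a small numerical sketch for the one-dimensional nearest-neighbor Ising model. It is an illustration only: the helper names are hypothetical, and the conditional Hamiltonian is assumed to be the sum of $-x_s x_t$ over all bonds $\langle st \rangle$ that meet $V$, with the configuration outside $V$ frozen to the boundary.

```python
import itertools
import math

def gibbs_kernel(V, boundary, beta):
    """Finite-volume Gibbs distribution p_V(boundary, .) on the 1D Ising chain.

    V        : list of integer sites
    boundary : dict site -> spin (+1/-1) covering every neighbor of V outside V
    Returns a dict mapping spin tuples (ordered like sorted(V)) to probabilities.
    """
    V = sorted(V)
    # bonds <s, s+1> with at least one endpoint in V (assumed form of H_{V, x~})
    bonds = {(s, s + 1) for s in V} | {(s - 1, s) for s in V}
    weights = {}
    for spins in itertools.product((-1, 1), repeat=len(V)):
        x = dict(boundary)
        x.update(zip(V, spins))
        H = -sum(x[s] * x[t] for s, t in bonds)
        weights[spins] = math.exp(-beta * H)
    Z = sum(weights.values())  # partition function
    return {spins: w / Z for spins, w in weights.items()}

def dlr_sides(beta=0.7):
    """Both sides of (10.2) for V = {0, 1, 2}, W = {1} and one test function f."""
    V, W = [0, 1, 2], [1]
    boundary = {-1: +1, 3: -1}
    f = lambda s: s[0] + 2.0 * s[1] * s[2] + 0.5 * s[1]
    p_V = gibbs_kernel(V, boundary, beta)
    lhs = sum(p * f(s) for s, p in p_V.items())
    rhs = 0.0
    for s, p in p_V.items():
        inner_boundary = dict(boundary)
        inner_boundary.update({0: s[0], 2: s[2]})  # x_{V \ W} joins the boundary of W
        p_W = gibbs_kernel(W, inner_boundary, beta)
        rhs += p * sum(q * f((s[0], z[0], s[2])) for z, q in p_W.items())
    return lhs, rhs
```

Under these assumptions the two returned numbers agree up to rounding, which is the content of the DLR-equation for this choice of $V$, $W$ and $f$.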
Proposition 10.5. The family $(p_V : V \Subset \mathbb{Z}^d)$ defined above satisfies the consistency condition.
Proof: Clearly, with respect to $\lambda_V(dx_V) = \prod_{t \in V} \lambda(dx_t)$, $p_V(\tilde x_{V^c}, \cdot)$ has a density
For simplicity, we may write
Note that in the present situation, (10.2) becomes
Thus, it suffices to show that
That is
We now check (10.3). By using the fact that $\Phi_A(x) = \Phi_A(x_A)$ and
it follows that the right-hand side of (10.3)
= Pv (4,
which is what we required.
In statistical physics, $p_V(\tilde x, \cdot)$ is called the Gibbs distribution in $V$ with boundary condition $\tilde x$, and the corresponding random field is called a Gibbs random field or a Gibbs state. In other words, in the present situation, a Gibbs state, a random field, and a probability measure all coincide. The collection of Gibbs states is denoted by $\mathscr{G}$ or $\mathscr{G}_\beta(\Phi)$.
10.2 Existence
In this section, we study the first fundamental problem: the existence of a random field consistent with a given specification. Again, assume that $(X, \rho, \mathscr{B})$ is a complete separable metric space. For a given specification $(p_V)$, we write
for simplicity.
Theorem 10.6. Given a specification $(p_V)$, suppose that the following conditions hold.
(1) There exist a compact function $h \in \mathscr{B}$ and constants $C, c \ge 0$, $(c_{st} : s, t \in \mathbb{Z}^d,\ s \ne t)$ such that $\sum_{s:\, s \ne t} c_{st} \le c < 1$ for all $t \in \mathbb{Z}^d$. Moreover,
(2) For every $V \Subset \mathbb{Z}^d$, $\tilde x \to p_V(\tilde x, \cdot)$ is continuous in the weak topology.
Then, for any $x \in X$, there exists a random field $P$, which is consistent with $(p_V)$ and satisfies
For compact $X$, we may choose $h = 0$ or $1$ and $c_{st} = 0$. Then condition (1) holds. Hence, we have
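Corollary 10.7. Let $X$ be compact and $(p_V)$ be continuous in the weak topology. Then there exists a random field consistent with $(p_V)$.
Proof of Theorem 10.6: Choose a sequence $\{V_n\}_1^\infty$, $V_n \Subset \mathbb{Z}^d$, $V_n \uparrow \mathbb{Z}^d$ and fix a boundary $\tilde x \in E$ so that $h(\tilde x_s) \le D < \infty$ for all $s \in V_n^c$, $n \ge 1$. Let $P_n$ be the probability measure on $(E, \mathscr{E})$ induced by $p_{V_n}$:
$$P_n(A) = p_{V_n}\big(\tilde x_{V_n^c}, A(\tilde x_{V_n^c})\big), \qquad A \in \mathscr{E},$$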
where $A(z)$ is the section of $A$ at $z$. The purpose is to construct a $P \in \mathscr{P}(E)$ consistent with $\{p_V\}$ by using $\{P_n : n \ge 1\}$. The proof naturally consists of two main steps: prove first the relative compactness of $\{P_n : n \ge 1\}$ and then prove that any limiting point of $\{P_n\}$ is a required random field. The two conditions of the theorem are used in these two steps respectively.
a) By Theorem 4.4, the relative compactness of $\{P_n\}$ follows from the estimate:
$$\sup_n \int h(x_t)\, dP_n \le (C + cD) \vee \big(C(1 - c)^{-1}\big), \qquad t \in \mathbb{Z}^d. \tag{10.4}$$
We now prove (10.4). Given $\varepsilon > 0$, set
$$G_\varepsilon = \inf_{e \in \mathscr{E}_\varepsilon} \max_{t \in V_n} \int h(x_t)\, e(x_{V_n})\, dP_n,$$
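where $\mathscr{E}_\varepsilon$ and $G_\varepsilon$ depend on $n$. Fix $n$ for a moment. Since $h \in \mathscr{B}$, we have
$$A_N := \Big\{x_{V_n} : \max_{t \in V_n} h(x_t) \le N\Big\} \uparrow E(V_n) \qquad \text{as } N \to \infty.$$
Thus, for large enough $N$, we have $e(x_{V_n}) := I_{A_N}(x_{V_n}) \in \mathscr{E}_\varepsilon$ and hence $G_\varepsilon < \infty$. Actually, we have
$$G_\varepsilon \le \max\{C + cD,\ C(1 - c)^{-1}\} =: C_0.$$
To prove this, let $\delta > 0$ and choose $e_0 \in \mathscr{E}_\varepsilon$ such that
$$\max_{t \in V_n} \int h(x_t)\, e_0(x_{V_n})\, dP_n \le G_\varepsilon(1 + \delta)$$
and moreover $\int h(x_{t_0})\, e_0(x_{V_n})\, dP_n \ge G_\varepsilon(1 - \delta)$ for a point $t_0 \in V_n$. Noting that $V_n$ is finite, we may assume that the function $e_0$ is chosen so that the last inequality holds for the least number of $t_0 \in V_n$. Put
$$\bar e_0(x_{V_n}) := \bar e_0(x_{V_n \setminus t_0}) := \int p_{t_0}\big(x_{V_n \setminus t_0} \times \tilde x_{V_n^c},\, dz_{t_0}\big)\, e_0\big(x_{V_n \setminus t_0} \times z_{t_0}\big).$$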
From the consistency condition (10.1) and the monotone class theorem, it follows that
for every $t \in V_n \setminus t_0$. Hence $\bar e_0 \in \mathscr{E}_\varepsilon$. Similarly,
$$\int h(x_t)\, \bar e_0\, dP_n = \int h(x_t)\, e_0\, dP_n \le G_\varepsilon(1 + \delta).$$
Now if $\int h(x_{t_0})\, \bar e_0\, dP_n < G_\varepsilon(1 - \delta)$, then the number of $t \in V_n$ so that $\int h(x_t) f\, dP_n \ge G_\varepsilon(1 - \delta)$ holds for $f = \bar e_0$ is one less than that for $f = e_0$. This contradicts the assumption on $e_0$. Therefore, we have
$$G_\varepsilon(1 - \delta) \le \cdots \le C + c \max\{G_\varepsilon(1 + \delta),\ D\}.$$
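If $G_\varepsilon(1 + \delta) \ge D$, then $G_\varepsilon(1 - \delta) \le c\, G_\varepsilon(1 + \delta) + C$. Equivalently, $G_\varepsilon \le C/(1 - c - \delta - c\delta)$. Otherwise, $G_\varepsilon(1 + \delta) < D$ and so $G_\varepsilon(1 - \delta) \le C + cD$. That is, $G_\varepsilon \le (C + cD)/(1 - \delta)$. Since $\delta$ is arbitrary, we always have
$$G_\varepsilon \le \max\{C + cD,\ C(1 - c)^{-1}\} = C_0,$$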
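as claimed. Now, we return to the main proof of this step. Choose $e_{\varepsilon_m}$ such that
$$0 \le P_n(1 - e_{\varepsilon_m}) < 1/m \qquad \text{and} \qquad \max_{t \in V_n} \int h(x_t)\, e_{\varepsilon_m}\, dP_n \le G_{\varepsilon_m} + \frac{1}{m} \le C_0 + \frac{1}{m}.$$
From the first inequality, we see that $0 \le 1 - e_{\varepsilon_m} \le 1$ and $P_n(1 - e_{\varepsilon_m}) \to 0$ as $m \to \infty$. In particular, $e_{\varepsilon_m} \to 1$ in $P_n$ as $m \to \infty$. Thus, replacing $\{m\}$ with a subsequence $\{m_k\}$ if necessary, we may assume that $\lim_{m \to \infty} e_{\varepsilon_m} = 1$. Then, applying Fatou's lemma to the above second inequality, we obtain
$$\int h(x_t)\, dP_n \le \varliminf_{m \to \infty} \int h(x_t)\, e_{\varepsilon_m}\, dP_n \le C_0, \qquad t \in V_n.$$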
This proves that $\max_{t \in V_n} \int h(x_t)\, dP_n \le C_0$, and then (10.4) follows since $n$ is arbitrary.
b) Let $x \in X$ and pick $\tilde x$: $\tilde x_t = x$ for $t \in \mathbb{Z}^d$, as our boundary. By a), we have
$$\sup_n \int h(x_t)\, dP_n \le \max\{C + c\, h(x),\ C(1 - c)^{-1}\}.$$
Thus, by Theorem 4.5, every limiting point $\bar P$ of $\{P_n\}$ satisfies
Next, fix such a limiting point $\bar P$. Without loss of generality, assume that $P_n \Rightarrow \bar P$. To show that $\bar P$ is a random field corresponding to the specification $(p_V)$, it suffices to prove that
for all $V, W \Subset \mathbb{Z}^d$, $V \cap W = \emptyset$ and all bounded continuous functions $f(x_V)$ and $g(x_W)$. But for large enough $n$, the consistency condition gives us
and hence the assertion follows by letting $n \to \infty$, since the specification is continuous in the weak topology.
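The construction in the proof (finite-volume measures $P_n$ with a frozen boundary, followed by a limit) can be watched numerically in the simplest case. The sketch below is an illustration under stated assumptions, not the author's construction: it takes the one-dimensional Ising chain, $V_n = \{-n, \ldots, n\}$, the all-plus boundary, and a hypothetical helper name, and computes $P_n[x_0 = +1]$ by brute-force enumeration.

```python
import itertools
import math

def origin_plus_probability(n, beta):
    """P_n[x_0 = +1] for the 1D Ising chain on V_n = {-n, ..., n}, all-plus boundary."""
    sites = list(range(-n, n + 1))
    plus_weight = total = 0.0
    for spins in itertools.product((-1, 1), repeat=len(sites)):
        x = dict(zip(sites, spins))
        x[-n - 1] = x[n + 1] = +1  # frozen boundary spins just outside V_n
        # energy of all nearest-neighbor bonds meeting V_n
        H = -sum(x[s] * x[s + 1] for s in range(-n - 1, n + 1))
        w = math.exp(-beta * H)
        total += w
        if x[0] == +1:
            plus_weight += w
    return plus_weight / total
```

In this one-dimensional example the values decrease towards $1/2$ as $n$ grows, so the influence of the boundary on the marginal at the origin dies out in the limit.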
10.3 Uniqueness
We restrict ourselves to the $r$-specification:
$$p_V(x_{V^c}, A) = p_V(x_{\partial_r V}, A), \qquad A \in \mathscr{E}_0(V),\ r < \infty.$$
Definition 10.8. We call $P \in \mathscr{P}(E)$ exponentially growing if there exist $z \in X$ and constants $g, G \in [0, \infty)$ such that
$$\int_E \rho(x_t, z)\, P(dx) \le G \exp[g |t|], \qquad t \in \mathbb{Z}^d. \tag{10.5}$$
Denote by $\mathscr{P}(\rho, g)$ the set of $P \in \mathscr{P}(E)$ so that (10.5) holds for some $z \in X$ and $G \in (0, \infty)$. In the case that $X = \mathbb{R}^n$ or $\mathbb{Z}^n$, the standard metric $\rho$ provides a compact function $h = \rho(\cdot, z)$. With this $h$, the random field $P$ constructed in Theorem 10.6 satisfies (10.5). In particular, for bounded $\rho$, (10.5) is trivial. For simplicity, write $\partial V = \partial_r V$. Denote by $\mathscr{K}(\mu_1, \mu_2)$ the set of all coupling measures of $\mu_1$ and $\mu_2$. Take
$$\rho_V(x^1, x^2) = \sum_{t \in V} \rho(x_t^1, x_t^2), \qquad x^1, x^2 \in E(V),\ V \Subset \mathbb{Z}^d.$$
Then we have the minimum $L^1$-metric $W_V$ with respect to the metric $\rho_V$, defined in Section 5.1. The next result is our main theorem, which says that if the given specification $(p_V)$ possesses some properties in a fixed finite set $\Lambda \subset \mathbb{Z}^d$, then there is at most one random field consistent with $(p_V)$.
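Theorem 10.9. Given an $r$-specification $(p_V : V \Subset \mathbb{Z}^d)$ which is translation invariant:
$$p_{V+t}\big(x_{(V+t)^c}, dx_{V+t}\big) = p_V(x_{V^c}, dx_V), \qquad t \in \mathbb{Z}^d.$$
Suppose that for some $\Lambda \Subset \mathbb{Z}^d$, the following conditions hold:
(1) For all $t \in \partial\Lambda$ and all $\tilde x^1, \tilde x^2 \in E$ with $\tilde x^1_s = \tilde x^2_s$, $s \ne t$,
where the constants $\{\kappa_t \ge 0 : t \in \partial\Lambda\}$ satisfy
(2) $\sum_{t \in \partial\Lambda} \kappa_t / |\Lambda| =: \gamma < 1$.
Then there exists a constant $g_0 = g_0(\Lambda, r, \gamma) > 0$ such that there is at most one element in $\mathscr{P}(\rho, g_0)$ consistent with $p_V$.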
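Proof: Let $P^1$ and $P^2$ be two random fields consistent with $(p_V : V \Subset \mathbb{Z}^d)$. For $P \in \mathscr{P}(E)$ and $V \Subset \mathbb{Z}^d$, let $P_V$ denote the projection of $P$ on $E(V)$.
a) First, we prove that for every $\delta > 0$, there is a $\mu \in \mathscr{K}(P_V^1, P_V^2)$ such that
(10.6)
for all $s \in T(V) := \{t \in \mathbb{Z}^d : (\Lambda \cup \partial\Lambda) + t \subset V\}$, where
To do so, let $\mu \in \mathscr{K}(P_V^1, P_V^2)$ be such that
$$\cdots = \sum_{t \in V} f_t. \tag{10.7}$$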
Here, we have used Lemma 5.2. For simplicity, assume that $s = 0 \in \mathbb{Z}^d$ (the general case then follows by the same proof below, replacing $\Lambda$ with $\Lambda + s$ with a minor modification and using the translation invariance of the specification). Next, by Theorem 5.32, there exists a measurable coupling $\tilde p(\tilde x^1, \tilde x^2; \cdot) \in \mathscr{K}\big(p_\Lambda(\tilde x^1, \cdot),\ p_\Lambda(\tilde x^2, \cdot)\big)$ such that
Thus, in view of condition (1) and the fact that we are dealing with the $r$-specification, we get
for all $\tilde x^1, \tilde x^2 \in E$. Let
$$\tilde{\mathscr{E}}_0(V \setminus \Lambda) = \sigma\{B^1 \times B^2 : B^1, B^2 \in \mathscr{E}_0(V \setminus \Lambda)\}.$$
We now reconstruct a measure $\tilde\mu \in \mathscr{P}(E(V) \times E(V))$ as follows:
Then $\tilde\mu$ coincides with $\mu$ on $\tilde{\mathscr{E}}_0(V \setminus \Lambda)$ and the conditional distribution of $\tilde\mu$ given $\tilde{\mathscr{E}}_0(V \setminus \Lambda)$ coincides with $\tilde p$ almost surely with respect to $\mu$. On the other hand, we have
(by the definition of
=
/
fi)
(by the definition of p ) pv (Zkc, dzL)f(xb) (by the consistency condition).
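It follows that $\tilde\mu \in \mathscr{K}(P_V^1, P_V^2)$. Replacing $\mu$ with $\tilde\mu$, we can define the corresponding $\tilde f_t$. Then we have
$$W_V(P_V^1, P_V^2) \le \sum_{t \in V} \tilde f_t. \tag{10.9}$$
Note that
$$\tilde f_t = f_t, \qquad t \in V \setminus \Lambda. \tag{10.10}$$
From (10.8), it follows that
(10.11)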
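Combining (10.7), (10.9) with (10.10), we obtain
$$\sum_{t \in \Lambda} f_t - \sum_{t \in \Lambda} \tilde f_t = \sum_{t \in V} (f_t - \tilde f_t) \le \delta.$$
This plus (10.11) gives us a bound on $\sum_{t \in \Lambda} f_t$ in terms of the $\kappa_t f_t$, and hence (10.6) follows by (10.10).
b) Next, let $W \subset V$. We prove that there exist $\bar g = \bar g(\gamma, r, \Lambda) > 0$ and $C = C(\gamma, r, \Lambda)$ such that
(10.12)
where
$$c_t = c_t(W, \bar g) = \exp[-\bar g\, d(t, W)],\quad t \in \mathbb{Z}^d, \qquad \partial_\Lambda V = \{t \in V : d(t, V^c) < \operatorname{diam}(\Lambda \cup \partial\Lambda)\}.$$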
By (10.6), we have
Exchanging the order of the sum, we obtain
Hence
Put
Then s: t--sEA
s: t--sGb)A
provided $0 < \kappa < 1 - \gamma$ and $\bar g = \bar g(\kappa)$ is small enough. On the other hand, noting that the terms in $I^{(2)}$ vanish unless $t \in \partial_\Lambda V$ and moreover
provided $\bar g$ is small enough. Combining these facts together we obtain (10.12).
c) Finally, let $P^1, P^2 \in \mathscr{P}(\rho, g)$ be consistent with $p_V$. We need only to prove that $P_W^1 = P_W^2$ for every $W \Subset \mathbb{Z}^d$. Choose $V \supset W$, $V \Subset \mathbb{Z}^d$. By b), we have
But by the definition of $c_t$ and a),
As for the right-hand side of (10.13), note that
Thus, we obtain
Take $V = V_n = \{t \in \mathbb{Z}^d : -n \le t^{(i)} \le n,\ i = 1, \ldots, d\}$. Then there exists a $g_0 < \bar g$ such that the right-hand side of (10.14) goes to zero as $n \to \infty$, for all $g \le g_0$. We conclude that $P_W^1 = P_W^2$ since $W_W$ is a metric.
The above theorem has some convenient corollaries. For instance, we have
Corollary 10.10. Suppose that $\operatorname{diam} X := \sup_{x, x' \in X} \rho(x, x') \le 1$ (e.g., using the discrete topology). If for some $\Lambda \Subset \mathbb{Z}^d$ and $\varepsilon > 0$,
holds for all $\tilde x^1$ and $\tilde x^2$ different at exactly one point $t \in \partial\Lambda$, then there exists a $g_0 = g_0(\Lambda, r, \varepsilon)$ such that there is at most one element in $\mathscr{P}(\rho, g_0)$ consistent with $p_V$.
Proof: Take $\kappa_t = (1 - \varepsilon)|\Lambda| / |\partial\Lambda|$, $t \in \partial\Lambda$. Then
and hence the hypotheses of Theorem 10.9 are satisfied.
Example 10.11 (Two-dimensional Ising model). Take $X = \{-1, +1\}$ with discrete topology, and
Next, take $\Lambda = \{0\}$. Then
Here, we have used the fact that if $P_k$ has density $f_k$ with respect to $\lambda$, then
Denote by $\{1, 2, 3, 4\}$ the nearest neighbors of the origin. Then
so
$$\cdots < 0.249995 < \frac{1}{4} \qquad \text{for } \beta \le 0.044305.$$
Therefore, the Gibbs state is unique whenever $\beta < 0.044305$.
Remark 10.12. Applying Theorem 10.9 to the above example, we obtain an upper bound $\beta < \frac{1}{4}\log 3 \approx 0.274653$.
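The bound in Remark 10.12 can be reproduced numerically. The sketch below is an illustration resting on assumptions that are not spelled out in the surviving text: the metric on $\{-1, +1\}$ is taken to be the discrete one (so the minimum $L^1$-metric between single-site distributions is the total variation distance), condition (1) of Theorem 10.9 is read in the Dobrushin form $W_\Lambda\big(p_\Lambda(\tilde x^1, \cdot), p_\Lambda(\tilde x^2, \cdot)\big) \le \kappa_t\, \rho(\tilde x^1_t, \tilde x^2_t)$, and $\Lambda = \{0\}$. The function names are hypothetical.

```python
import itertools
import math

def plus_probability(neighbor_sum, beta):
    """P[x_0 = +1 | neighbors] for the single-site Ising specification."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * neighbor_sum))

def influence_sum(beta, n_neighbors=4):
    """Sum over boundary sites t of kappa_t, where kappa_t is the largest
    total-variation change of p_{0} when only the spin at t is flipped."""
    kappa = 0.0
    for others in itertools.product((-1, 1), repeat=n_neighbors - 1):
        s = sum(others)  # contribution of the neighbors that are not flipped
        tv = abs(plus_probability(s + 1, beta) - plus_probability(s - 1, beta))
        kappa = max(kappa, tv)
    return n_neighbors * kappa  # by symmetry every neighbor has the same kappa
```

Under these assumptions `influence_sum(beta)` equals $2\tanh(2\beta)$, which is below $1$ exactly when $\beta < \frac{1}{4}\log 3$.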
10.4 Phase Transition: Peierls Method
From the last two sections, we have seen that there is only one Gibbs state for the 2-dimensional Ising model for sufficiently small $\beta$ (i.e., at high temperature). In this section, we prove the following result.
Theorem 10.13. For the 2-dimensional Ising model, there is a $\beta_0$ so that $|\mathscr{G}| > 1$ for all $\beta \ge \beta_0$.
Actually, it is known that for the $d$-dimensional ($d \ge 2$) Ising model, there exists a $\beta_c \in (0, \infty)$ so that $|\mathscr{G}| = 1$ for $\beta < \beta_c$ and $|\mathscr{G}| > 1$ for $\beta > \beta_c$. In this sense, we say that the model exhibits a phase transition. It is even known for this model that when $d = 2$, $\beta_c = \frac{1}{2}\log(1 + \sqrt{2}) \approx 0.4407$. The tool in the proof is the contour method. Let $E_t$ denote the unit square parallel to the coordinate axes and centered at $t \in \mathbb{Z}^2$. Denote by $\partial E_t$ the boundary of $E_t$. The union $\bigcup_t \partial E_t$ is called the dual graph of $\mathbb{Z}^2$. A closed path $\Gamma$ (without loop) on the dual graph with finite length is called a contour; its length is denoted by $|\Gamma|$. Next, let $V \subset \mathbb{Z}^2$ and set
$$\partial(x_V) = \partial\Big(\bigcup_{t \in V,\ x_t = -1} E_t\Big),$$
where $\partial B$ denotes the boundary of $B \subset \mathbb{R}^2$. Note that the set $\partial(x_V)$ is not necessarily connected. Let $V = V_L$ be the square parallel to the coordinate axes, centered at the origin with length $2L$ of edges. Let $P^{\pm}_{L,\beta} \in \mathscr{P}(E)$ be the conditional Gibbs distribution $p_V(\tilde x, \cdot)$ with boundary $(\tilde x_t = \pm 1,\ t \in V^c)$. The key step for the proof is the following
Lemma 10.14 (Peierls inequality). Let $\Gamma$ be a fixed contour. Then
$$P^{+}_{L,\beta}\big[x : \Gamma \subset \partial(x_V)\big] \le \exp[-2\beta |\Gamma|], \qquad V = V_L.$$
Proof: A pair $\{s, t\}$ is called an edge, denoted by $\langle st \rangle$, if $|s - t| = 1$. Set $\nu = |\{\langle st \rangle : \{s, t\} \cap V \ne \emptyset\}|$. For each $x$, let $\nu^+$ and $\nu^-$ denote the number of the edges $\langle st \rangle$ in the above set with $x_s = x_t$ and $x_s = -x_t$, respectively. Then
Because every edge of $E_t$ which separates $+1$ and $-1$ belongs to $\partial(x_V)$, we have $\nu^- = |\partial(x_V)|$. So $\nu^+ = \nu - |\partial(x_V)|$. By using these notations, the Hamiltonian can be rewritten as follows:
where $x$ equals $1$ at the sites out of $V$. Thus
To estimate the quantity on the right-hand side, we make a simple transform. Let
Define a mapping $X_\Gamma \ni x_V \to x_V^* \in X_\Gamma^*$ as follows:
$$(x_V^*)_t = \begin{cases} -x_t & \text{if } t \text{ is in the inside of } \Gamma, \\ x_t & \text{if } t \text{ is in the outside of } \Gamma. \end{cases}$$
Clearly, this mapping is one-to-one; its role is removing $\Gamma$ from $\partial(x_V)$. We have
$$\partial(x_V) = \partial(x_V^*) \cup \Gamma, \qquad |\partial(x_V)| = |\partial(x_V^*)| + |\Gamma|.$$
Hence
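Corollary 10.15.
$$P^{+}_{L,\beta}[x_0 = -1] \le \frac{9\,(2e^{4\beta} - 9)}{e^{4\beta}\,(e^{4\beta} - 9)^2}, \qquad \beta > \frac{1}{2}\log 3.$$
Proof: Let $0 \in \operatorname{Int}\Gamma$ denote that the origin $0$ is in the inside of $\Gamma$. Let $x_0 = -1$. Then there exists a contour $\Gamma$ so that $0 \in \operatorname{Int}\Gamma$ and $\Gamma \subset \partial(x_V)$ (since the boundary is $1$). Thus, by the Peierls inequality, we have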
where $c_n = |\{\Gamma : |\Gamma| = n \text{ and } 0 \in \operatorname{Int}\Gamma\}|$. Now, we estimate $c_n$. Fix the length $n$ of contours. Starting from the origin, a contour with length $n$ can pass through at most $[n/2]$ squares along the positive direction of the first coordinate axis. In other words, there are at most $[n/2]$ intersection points with the axis along the positive direction. Thus, if we set $t_0(\Gamma) = \min\{t : (t, 0) \in \Gamma\}$, then
$$|\{t_0(\Gamma) : |\Gamma| = n,\ 0 \in \operatorname{Int}\Gamma\}| \le [n/2] \le n/2.$$
On the other hand, starting from the intersection point on the first coordinate axis, at the first step, we go along the positive direction of the second coordinate axis. Then, at each step, we have at most three directions to go. Moreover, since $\Gamma$ is closed, at the last step, we have to go back to the starting point and so there is only one way to go. Combining these facts together, we arrive at $c_n \le n \cdot 3^{n-2}/2$. Finally, the length of a closed path must be even, therefore
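Proof of Theorem 10.13: By Corollary 10.15, we have
$$P^{+}_{L,\beta}[x_0 = -1] < 1/2 \qquad \text{for } \beta \ge \beta_0 = 0.6587.$$
Let $L \to \infty$, then $V_L \uparrow \mathbb{Z}^2$. Denote by $P^{+}_{\beta}$ a weak limit of $P^{+}_{L,\beta}$ as $L \to \infty$. Then $P^{+}_{\beta} \in \mathscr{G}$ and
$$P^{+}_{\beta}[x_0 = -1] < 1/2, \qquad \beta \ge \beta_0.$$
By symmetry,
$$P^{-}_{\beta}[x_0 = +1] < 1/2, \qquad \beta \ge \beta_0.$$
This shows that $P^{+}_{\beta} \ne P^{-}_{\beta}$ and hence $|\mathscr{G}| > 1$.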
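The numerical value of $\beta_0$ can be recovered from the bound of Corollary 10.15. The sketch below is an illustration with hypothetical function names. It assumes that the bound arises from summing $c_n e^{-2\beta n}$ with $c_n \le n \cdot 3^{n-2}/2$ over even $n \ge 4$ (the starting index is an assumption chosen because it reproduces the closed form), and it locates by bisection the point where the closed form equals $1/2$.

```python
import math

def peierls_bound(beta):
    """Closed form of Corollary 10.15, valid for beta > (1/2) log 3."""
    u = math.exp(4.0 * beta)
    return 9.0 * (2.0 * u - 9.0) / (u * (u - 9.0) ** 2)

def peierls_series(beta, terms=400):
    """Partial sum of sum over even n >= 4 of (n/2) 3^(n-2) exp(-2 beta n).

    Each term is evaluated through a single exponential to avoid overflow.
    """
    log3 = math.log(3.0)
    return sum((n / 2) * math.exp((n - 2) * log3 - 2.0 * beta * n)
               for n in range(4, 4 + 2 * terms, 2))

def critical_beta(lo=0.56, hi=2.0, tol=1e-10):
    """Bisection for the beta at which the (decreasing) bound equals 1/2."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if peierls_bound(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return hi
```

With these assumptions the series and the closed form agree for $\beta > \frac{1}{2}\log 3$, and the crossing point is consistent with the value $\beta_0 = 0.6587$ quoted in the proof.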
10.5 Ising Model on Lattice Fractals
In this section, we study the Ising model on irregular lattices, especially on the lattice Sierpinski gasket $G^{(d)}$ and Sierpinski carpet $F^{(d)}$ introduced in Section 7.5. The spin space is again $\{-1, +1\}$. The configuration $x$ is now a mapping: $G^{(d)}$ (resp. $F^{(d)}$) $\to \{-1, +1\}$. The Hamiltonian of the Ising model on $G^{(d)}$ is given by
$$H(x) = -\sum_{\langle st \rangle:\ s, t \in G^{(d)}} x_s x_t,$$
where $\langle st \rangle$ denotes the bond of the nearest neighbors $s, t \in G^{(d)}$ in the usual sense. Replacing $G^{(d)}$ with $F^{(d)}$, we have a Hamiltonian $H$ on $F^{(d)}$ for the Ising model. Consider first the Ising model on $G^{(2)}$. Note that every point in $G^{(2)}$ has four neighbors. This structure is exactly the same as the model on the regular lattice $\mathbb{Z}^2$. From this point of view, one may guess that the model on $G^{(2)}$ has a phase transition. However, the answer is the opposite!
Theorem 10.16. For any $d \ge 2$, the Ising model on the lattice Sierpinski gasket $G^{(d)}$ has no phase transition.
To prove this result, we need some preparations. Recall that $\mathscr{G}(\Phi)$ denotes the collection of Gibbs states with potential $\Phi = (\Phi_A)$:
$$\Phi_A(x) = \begin{cases} x_s x_t & \text{if } A = \{s, t\} \text{ and } |s - t| = 1, \\ 0 & \text{otherwise.} \end{cases}$$
By Definition 10.1 (3), for each $\mu \in \mathscr{G}(\Phi)$ and $V \in \mathscr{S}$, a version of the conditional probability $\mu(x_V = y_V \,|\, x_u = y_u,\ u \notin V)$ is given by
Let $\mu_{V, \tilde x}$ denote the measure $p_V(\tilde x, \cdot)$ and let $\mathscr{G}(V)$ denote the closed convex hull of $\{\mu_{V, \tilde x} : \tilde x \in \{-1, 1\}^{\partial V}\}$.
(1) Vl c VZimplies that Y(Vl> 3 Y(V2). (2) p E Y(@)iff for every u E S, a version of t h e conditional probability p{x, = yy.141x.1, =: yv, v # u} is given by
(3)
q a ) = n,qv).
(4) g(@) is non-empty, convex, and compact.
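The one-point conditional probability in part (2) can be checked by direct computation on any finite graph. The following sketch is an illustration, not taken from the original: the function name is hypothetical, the graph is given as a list of nearest-neighbor pairs, and the Gibbs weight is assumed to be $\exp[\beta \sum_A \Phi_A(x)]$ with $\Phi_{\{s,t\}}(x) = x_s x_t$.

```python
import math

def single_site_check(edges, u, y, beta):
    """Compare mu{x_u = y_u | x_v = y_v, v != u}, computed as a ratio of Gibbs
    weights, with the closed form 1 / (1 + exp(-2 beta sum_{A: u in A} Phi_A(y))).

    edges : list of pairs (s, t) of neighboring sites
    y     : dict site -> spin (+1/-1)
    """
    def weight(x):
        # Gibbs weight exp[beta * sum_A Phi_A(x)] with Phi_{s,t}(x) = x_s x_t
        return math.exp(beta * sum(x[s] * x[t] for s, t in edges))

    flipped = dict(y)
    flipped[u] = -y[u]  # the only other configuration agreeing with y off u
    ratio = weight(y) / (weight(y) + weight(flipped))
    local_energy = sum(y[s] * y[t] for s, t in edges if u in (s, t))
    closed = 1.0 / (1.0 + math.exp(-2.0 * beta * local_energy))
    return ratio, closed
```

The two values coincide because every term of the weight that does not involve the site $u$ cancels in the ratio.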
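Proof: The first assertion follows from the consistency condition since every $\mu \in \mathscr{G}(V_2)$ can be expressed as a convex combination of the elements in $\mathscr{G}(V_1)$. Next, let $\mu \in \mathscr{G}(\Phi)$. Then
$$\mu\{x_u = y_u \,|\, x_v = y_v,\ v \ne u\} = \frac{\exp\big[\beta \sum_{A:\, u \in A} \Phi_A(y)\big]}{\exp\big[\beta \sum_{A:\, u \in A} \Phi_A(y)\big] + \exp\big[-\beta \sum_{A:\, u \in A} \Phi_A(y)\big]} = \frac{1}{1 + \exp\big[-2\beta \sum_{A:\, u \in A} \Phi_A(y)\big]}.$$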
This proves the "only if" part of (2). To prove the "if" part, it suffices to show that $\mu_{V,y}$ is the only probability measure on $\{-1, 1\}^V$ whose one-point conditional probabilities are given by the above right-hand side. It is equivalent to say that the potential of the corresponding field is unique up to a constant. This is obvious from the point of view of field theory, since $V$ is finite and $\mu_{V,y}$ is positive everywhere in $\{-1, 1\}^V$. By definition, we have $\mathscr{G}(\Phi) \subset \mathscr{G}(V)$ and so $\mathscr{G}(\Phi) \subset \bigcap_V \mathscr{G}(V)$. The inverse inclusion $\bigcap_V \mathscr{G}(V) \subset \mathscr{G}(\Phi)$ follows from the proof of (2). Finally, the last assertion follows from (1), (3) and the compactness of the state space.
Proposition 10.17 will be extended to more general situations in the next chapter.
Lemma 10.18.
(1) Let $\mu_1, \mu_2 \in \mathscr{G}(\Phi)$ satisfy $\mu_1 \ll \mu_2$. Set $h = d\mu_1/d\mu_2$. Then
$$h({}^{u}x) = h(x), \quad \mu_2\text{-a.s. for each } u \in S. \tag{10.15}$$
(2) Suppose that $\mu_2 \in \mathscr{G}(\Phi)$, $h \ge 0$, $\mu_2(h) = 1$ and $h$ satisfies (10.15). Then $\mu_1 := h\mu_2 \in \mathscr{G}(\Phi)$.
Proof: Let $f \in C(E)$ and set
P
c A.uEA
By assumption, we have
@,(I)].
In the first and the third steps, the change of variables $x \to {}^{u}x$ is used. Similarly, by using Proposition 10.17, one can prove assertion (2).
Corollary 10.19. Suppose that $\mu_1 \ll \mu_2$ for all $\mu_1, \mu_2 \in \mathscr{G}(\Phi)$. Then $|\mathscr{G}(\Phi)| = 1$.
Proof: Let $h = d\mu_1/d\mu_2$. Since $\mu_1 \ll \mu_2$ and $\mu_2 \ll \mu_1$, we have $h > 0$, $\mu_2$-a.s. Suppose that there is a set $A = \{x : h(x) > c\}$ for some $c > 0$ so that $0 < \mu_2(A) < 1$. Define $g(x) = I_A(x)/\mu_2(A)$. By Lemma 10.18 (1), $h$ and $g$ satisfy (10.15). Therefore $\mu_3 := g\mu_2 \in \mathscr{G}(\Phi)$ by Lemma 10.18 (2). But $\mu_2$ is not absolutely continuous with respect to $\mu_3$, which contradicts the hypothesis of the corollary.
Proof of Theorem 10.16: According to the construction of $G^{(d)}$, we have $G^{(d)} = \bigcup_{n=0}^{\infty} G_n^{(d)}$, where $G_n^{(d)}$ is a finite set for each $n \ge 0$. Take $V = G_n^{(d)}$ for some $n \ge 0$. Then
Here in the last step we have used the fact that the boundary of $V = G_n^{(d)}$ consists of at most $2d$ points. Furthermore,
Since the estimate is independent of $n$, by Proposition 10.17, this is enough to claim that any $\mu_1, \mu_2 \in \mathscr{G}(\Phi)$ are mutually absolutely continuous, and hence the assertion follows from Corollary 10.19.
Next, consider the Ising model on $F^{(d)}$.
Theorem 10.20. For any $d \ge 2$, the Ising model on the Sierpinski carpet $F^{(d)}$ exhibits a phase transition at sufficiently low temperature.
Proof: For simplicity, we prove the 2-dimensional case only. The proof for the higher-dimensional case was presented in Zheng (1993), in which algebraic topology was used to describe the contours.
a) The first step of the proof is to construct a dual graph of $F^{(2)}$. Note that in the original construction of the carpet, squares with length $1$ of edges are removed. However, these eliminations do not play any role in the present study and so can be ignored. Thus, we may assume that all removed squares have length $\ge 3$ of edges. Next, let $\bar F^{(2)}$ denote the closed set in $\mathbb{R}^2$ including the inside part of all squares in $F^{(2)}$. Denote by $\tilde{\mathbb{Z}}^2$ the dual lattice of the regular lattice $\mathbb{Z}^2$. That is, $\tilde{\mathbb{Z}}^2$ is the collection of the centers of the regular squares in $\mathbb{Z}^2$. Now, we construct the dual graph of $F^{(2)}$ in two steps.
First, connect each pair $\{s, t\} \subset \tilde{\mathbb{Z}}^2 \cap \bar F^{(2)}$ with $|s - t| = 1$. Second, let $R$ be a removed square with center $r$. Connect each point in
with the center $r$. Thus, each $t \in F^{(2)}$ is surrounded by a smallest sub-graph in the dual graph, denoted by $E_t$. There are three cases. If $t$ is inside of $\bar F^{(2)}$, $E_t$ is simply a unit square. If $t$ is at some point inside of an edge of some removed square, then $E_t$ is a triangle. Finally, if $t$ is at the corner of a removed square, then $E_t$ is a quadrilateral. A closed path $\Gamma$ (no loop) with a finite number of bonds on the dual graph is called a contour. The length of $\Gamma$, denoted by $|\Gamma|$, is defined to be the number of bonds making up $\Gamma$. Again, as in the regular lattice case, for each finite $V \subset F^{(2)}$,
$$\partial(x_V) = \partial\Big(\bigcup_{t \in V,\ x_t = -1} E_t\Big).$$
b) Recall that $F^{(2)} = \bigcup_{n=0}^{\infty} F_n^{(2)}$ as given in Section 7.5. Take $V = F_n^{(2)}$ for some $n \ge 0$. Then, the same argument as in the proof of Lemma 10.14 shows that $P^{+}_{n,\beta}[x : \Gamma \subset \partial(x_V)] \le \exp[-2\beta |\Gamma|]$ for a fixed contour $\Gamma$.
c) In the next lemma, we will prove that
$$c_n := |\{\Gamma : |\Gamma| = n \text{ and } 0 \in \operatorname{Int}\Gamma\}| \le C e^{cn}$$
for some $C, c > 0$. Then, the remainder of the proof is exactly the same as in the regular lattice case.
Lemma 10.21. c,
< nlog3/log2
1
12n-1.
Proof: In what follows, we consider only the contours $\Gamma$ having the properties: $|\Gamma| = n$ and $0 \in \operatorname{Int}\Gamma$.
a) Observe that starting from a point $(m, -1/2)$, $m > 0$, it needs at least $2^{m+2}$ steps to enter into a hole (or removed square) with length $3^m$ of edges. In particular, $\Gamma$ can pass through at most the number $[n/(2^{m+2} - 1)]$ of holes with length $3^m$ of edges, $m = 1, 2, \ldots, m_0$, where $m_0$ is the largest number so that $2^{m_0 + 2} \le n/2$. That is, $m_0 = [(\log n)/(\log 2)] - 3$.
b) Let $t_1 = \max\{t : (t, 0) \in \Gamma\}$. By a), we have
1,
I
c) Let $C_n^t = \{\Gamma : t_0(\Gamma) = t\}$ where $t_0(\Gamma) = \min\{t : (t, 0) \in \Gamma\}$. Starting from $s_0 := (t_0(\Gamma), -1/2)$, at the first step, we have four different places to go. Label these places by $\{i_1 = 1, 2, 3, 4 = k\} =: I$. At the second step, the places we can go ahead depend on the place $i_1$ at which we have arrived after the first step. Denote by $\{i_2(i_1) = 1, 2, \ldots, k(i_1)\} =: I(i_1)$ the set of the places which we can choose at the second step. Similarly, at the $m$-th step, we have a set $\{i_m(i_1, \ldots, i_{m-1}) = 1, 2, \ldots, k(i_1, \ldots, i_{m-1})\} =: I(i_1, \ldots, i_{m-1})$, $m \le n$. By using these notations and induction, we obtain
i1=1 iz(i1)=1
i,(il,... ,i%.-l)==l
The key point here is that the sums on the right-hand side are dependent since at the rn-th step, the choices depend on the previous steps. d) To estimate CA(0), we KIOW adopt a modification procedure. We call ( j l , . . . , j n ) a realization of ( i l , i 2 ( i l ) , . . . , z m ( i l , +,in-l)) -. if j, E I(i1, ,i, 1 ) for every m < n,. Given a realization ( j l , . . . ,j n ) , if
-
K(rno) := {!
:
k(.j,, . ' . ,je) = 4 x
3m")
# 0,
we set p = min {l: l e K (m0)}. Then 1 < < n. In this case, we define
If K(m0) = 8, we regard p = 0 and hence the last two lines are still meaningful. P u t
<
Clearly, CL (0 Ck(1). Moreover, for every realization ( j l : . . ,j n ) with p 2 1, by a), in the expression of CA(l),there are at most [ r ~ / ( 2 ~ 0 + ~11-1 -1 of t! (1 ,< L n - l),which is independent of 31, so that k ( s 1 ;j1, . * ,je) = 4 x 3m0. Set. k, = [n/(2m+2- l)]. Repeating this procedure up to k, - 1 times: we obtain +
<
Here on the right-hand side, the variables k(...) can be only varied over {0,4,. . . , 4 x 3 m O - 1 } . In other words, we have reduced the range of k ( - .. ) from {0,4: . . . ,4x 3,0} to {0,4,. ,4x 3mo-1). Repeating this procedure for rno - 1,mo - 2 , . ' 1, at last, the range of I F ( . ' . ) becomes {0,4}. Hence s .
+
4x3
4x3
4x9
4x9
e) Combining b) with d), we obtain the required estimate.
10.6 Reflection Positivity and Phase Transitions
Let $V \Subset \mathbb{Z}^d$ and $L$ be a hyperplane (degenerated as a single point when $d = 1$) so that $V$ is invariant under the reflection in $L$. That is, $r_L V = V$, where $r_L$ denotes the reflection in $L$. Choose a halfspace of $\mathbb{R}^d$, including $L$ and separated by $L$, and denoted by $L_+$. Set
$$V_+ = V \cap L_+, \qquad V_- = r_L V_+, \qquad V_0 = V_+ \cap V_-.$$
Definition 10.22. A measure $\mu_V \in \mathscr{P}(E(V))$ is called reflection positive with respect to the reflection $r_L$, or is said to have Osterwalder-Schrader positivity (abbrev. RP), if (1) $\mu_V$ is invariant under the reflection $r_L$ and (2) for every $f \in b\mathscr{E}(V_+)$, we have $\mu_V(f\, r_L f) \ge 0$, where $r_L f(x) = f(r_L x)$, $(r_L x)(t) = x_{r_L t}$.
The RP induces a non-negative definite, symmetric bilinear form
Furthermore, we have
Lemma 10.23 (Schwarz inequality).
From this, we will deduce some important chess-board estimates (see Theorem 10.28) which are very powerful in the study of phase transitions. The next result shows that we do have RP in many cases.
Lemma 10.24. Let $V$ and $r = r_L$ be as above. If
$$-H_V = F + rF + \sum_{j=1}^{k} G_j\, rG_j$$
for some $F, G_j \in \mathscr{E}(V_+)$ and $Z := Z(V) = \int_{E(V)} \exp[-H_V]\, d\lambda_V < \infty$, then
$$d\mu_V := Z^{-1} \exp[-H_V]\, d\lambda_V$$
is RP with respect to $r$.
Proof: Note that
Up to a positive constant, $\mu_V(f\, rf)$ can be expressed as a sum of terms having the form:
In $\mathbb{Z}^d$, there are two typical reflections. One is "half-integers". That is the reflection for which the corresponding hyperplane has the form $L = \{u \in \mathbb{R}^d : u^{(m)} = n/2\}$ for some $1 \le m \le d$ and some integer $n$. Denote by $\mathscr{R}_{1/2}$ the collection of such reflections. The other one is "integers", which is obtained by replacing $1/2$ with $1$ in the above definition. The corresponding collection is denoted by $\mathscr{R}_1$. Certainly, $\mathscr{R}_1 \subset \mathscr{R}_{1/2}$.
Example 10.25. Let $X$ be the unit circle $S^1$ and $\lambda$ be the standard measure on $S^1$. Next, take
$$H(x) = -\sum \cos(x_s - x_t) - \sum \cos\big(2(x_s - x_t)\big).$$
Then for any $V \Subset \mathbb{Z}^d$ and $r$ so that $V = rV$, the assumptions of Lemma 10.24 are satisfied.
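Reflection positivity can also be observed numerically in a very small case. The sketch below is an illustration under assumptions that go beyond the text at this point: it uses the Ising spins $\{-1, +1\}$ on a ring of $N = 4$ sites with periodic nearest-neighbor Hamiltonian $H(x) = -\sum_t x_t x_{t+1}$ at inverse temperature $\beta \ge 0$, the half-integer reflection $t \mapsto 1 - t \pmod 4$, and $V_+ = \{1, 2\}$. The function names are hypothetical.

```python
import itertools
import math

def reflect(t, N=4):
    """Reflection of the ring Z_N in the half-integer point 1/2: t -> 1 - t (mod N)."""
    return (1 - t) % N

def rp_value(f, beta, N=4):
    """mu(f * rf) for the Ising ring Z_4, where f depends on (x_1, x_2) only.

    f is a dict mapping pairs of spins (x_1, x_2) to real numbers.
    """
    total = Z = 0.0
    for x in itertools.product((-1, 1), repeat=N):
        H = -sum(x[t] * x[(t + 1) % N] for t in range(N))
        w = math.exp(-beta * H)
        rx = tuple(x[reflect(t, N)] for t in range(N))  # (r x)_t = x_{r t}
        total += w * f[(x[1], x[2])] * f[(rx[1], rx[2])]
        Z += w
    return total / Z
```

Here $-\beta H = \beta x_1 x_2 + \beta x_0 x_3 + \beta x_1 x_0 + \beta x_2 x_3$ has the form $F + rF + \sum_j G_j\, rG_j$ with $F = \beta x_1 x_2$, $G_1 = \sqrt{\beta}\, x_1$ and $G_2 = \sqrt{\beta}\, x_2$, so Lemma 10.24 suggests that `rp_value` is non-negative for every choice of `f`.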
In general, there are not many reflections $r$ satisfying $rV = V$. This may decrease the power of the RP. To avoid this, we connect the boundaries of $V$ in a suitable way, making $V$ a torus; then every $r \in \mathscr{R}_{1/2}$ is meaningful. In other words, it is more convenient to divide the whole $\mathbb{Z}^d$ into some equivalence classes. To explain this idea, we need more notation. Let $\mathbb{Z}_N = \mathbb{Z}^d_N$ be the factor group $\mathbb{Z}^d / N\mathbb{Z}^d$. The coset in $\mathbb{Z}_N$ for $t \in \mathbb{Z}^d$ is denoted by $\langle t \rangle$. Define
However, in what follows, we will identify the points with their cosets and omit the brackets. Next, put
Recall that in $\mathbb{Z}^d$, for every $s \in \mathbb{Z}^d$, we have a shift (or translation) operator $\theta_s$: $\theta_s t = t - s$, $t \in \mathbb{Z}^d$. It induces a transform $\theta_s : E := X^{\mathbb{Z}^d} \to E$ as follows: $(\theta_s x)_t = x_{\theta_s t} = x_{t - s}$, $t \in \mathbb{Z}^d$, $x \in E$.
Furthermore, for the last operator, as well as for any other measurable transform $\tau : E \to E$ with inverse denoted by $\tau^{-1}$, we have the following operators:
and
$$(\tau\mu)(f) = \mu(\tau^{-1} f), \qquad f \in b\mathscr{E},\ \mu \in \mathscr{P}(E).$$
If $\tau\mu = \mu$, we say that $\mu$ is $\tau$-invariant. In particular, if $\mu$ is $\theta_s$-invariant for all $s \in \mathbb{Z}^d$, we say that $\mu$ is translation invariant. Clearly, the shift $\theta_s$ in $\mathbb{Z}^d$ induces a shift in $\mathbb{Z}_N$ in a natural way, denoted by $\theta_s$ again. Besides, an element of $E_N$ is called a periodic configuration. Let $\Lambda_0 = \{t \in \mathbb{Z}^d : t^{(i)} = 0 \text{ or } 1,\ i = 1, \ldots, d\}$. We will restrict ourselves to the interactions (potentials) $\Phi$ satisfying the following hypothesis.
Hypothesis 10.26.
(1) Nearest neighbor. $\Phi(x) = \Phi(x_{\Lambda_0})$ for all $x \in E$.
(2) Reflection invariance. $r\Phi = \Phi$ for all $r \in \mathscr{R}_{1/2}$.
In (1) above, the metric used is $\|s - t\| = \max_{1 \le i \le d} |s^{(i)} - t^{(i)}|$ instead of $|s - t|$, so we still have the interacting length $r = 1$. Now,
is called the Hamiltonian with periodic boundary condition. Furthermore, we can define the conditional Gibbs state with periodic boundary condition as follows:
where $\lambda_N(dx) = \prod_{t \in \mathbb{Z}_N} \lambda(dx_t)$. According to the previous discussions, for a given interaction $\Phi$, the Hamiltonian $H = \sum_{t \in \mathbb{Z}^d} \theta_t \Phi$ induces a set $\mathscr{G}(\Phi)$ of the corresponding Gibbs states. Assuming that $P$ is a weak limit of $\{P_N^\beta : N \ge 1\}$, is it true that $P \in \mathscr{G}(\Phi)$? If so, then we can use RP to study the phase transitions. The answer is affirmative.
Proposition 10.27. Let $\Phi$ be continuous and $Z(\Phi, N, \beta) < \infty$. If $P_{N_k}^\beta \Rightarrow P$ as $k \to \infty$, then $P \in \mathscr{G}(\Phi)$.
Proof: Denote by $(p_V)$ again the specification corresponding to $H_V$. Take $N$ large enough so that $W \cup \partial W \subset V_N \Subset \mathbb{Z}^d$. Fix an arbitrary $\tilde z \in E$. Regard $P_N^\beta$ as a probability measure on $(E, \mathscr{E})$ with boundary condition $\tilde z$, denoted by $p_{V_N}(\tilde z_{V_N^c}, dx_{V_N})$. Without loss of generality, assume that $P_N^\beta \Rightarrow P$ as $N \to \infty$. Since
$$\int p_{V_N}(\tilde z_{V_N^c}, dx_{V_N})\, f(x_W) = \int p_{V_N}(\tilde z_{V_N^c}, dx_{V_N}) \int p_W(x_{\partial W}, dz_W)\, f(z_W),$$
letting $N \to \infty$, it follows that $\int P(dx)\, f(x_W) = \int P(dx) \int p_W(x_{\partial W}, dz_W)\, f(z_W)$. Hence $P$ is consistent with $(p_V)$ and so $P \in \mathscr{G}(\Phi)$. $\square$
As mentioned before, the RP provides us with some chess-board estimates. To show the main idea, let us consider a special example. Take $d = 1$ and $N = 4$. Then we have $r_1, r_2 \in \mathscr{R}_{1/2}$. They are described in the figure below. Given $f_j \in b\mathscr{E}$, $j = 0, 1, 2, 3$, we want to estimate the average
Applying the Schwarz inequality repeatedly, we obtain
So we have
This is a special form of the chess-board estimate. On the left-hand side, at each site $j = 0, 1, 2, 3$, there is a distinct function $f_j$; but on the right-hand side, the function is the same $f_j$ at every site. This is the key to the estimate, since the right-hand side is easier to handle. To move further, let $\Lambda = [1/2,\, 1/2 + 1]$. Then the above reflections $r_1$ and $r_2$ can be regarded as the reflections with respect to the end points of $\Lambda$. This leads us to consider more general “block” reflections. That is, divide $\mathbb{Z}_N$ into blocks of the same shape, and then consider the reflections with respect to the faces of the blocks. In detail, let
$$\Lambda = \Lambda(p, q) = \{u \in \mathbb{R}^d : p^{(i)} \le u^{(i)} \le p^{(i)} + q^{(i)},\ i = 1, 2, \ldots, d\},$$
where $p^{(i)} \in \mathbb{Z}^1 \cup \tfrac12 \mathbb{Z}^1$ and $q^{(i)} \in \mathbb{N} := \{1, 2, \ldots\}$, $i = 1, 2, \ldots, d$. For covering the whole $\mathbb{Z}_N$ by the shifts of $\Lambda$, we should assume that $q^{(i)} \mid N$ and $(2q^{(i)}) \mid N$ for all $i = 1, \ldots, d$. Next, set
$$\mathbb{Z}(q) = \mathbb{Z}(q^{(1)}, \ldots, q^{(d)}) = \{u \in \mathbb{Z}^d : q^{(i)} \mid u^{(i)},\ i = 1, \ldots, d\}, \qquad \mathbb{Z}_N(q) = \mathbb{Z}(q)/N\mathbb{Z}^d.$$
Clearly, $\mathbb{Z}_N(q)$ gives a partition of $\mathbb{Z}_N$, each part having the same shape as $\Lambda$. In other words, $\{\Lambda + e : e \in \mathbb{Z}_N(q)\}$ covers $\mathbb{Z}_N$. Finally, given a family
$$\{F_e \in b\mathscr{E}(\Lambda + e) : e \in \mathbb{Z}_N(q)\},$$
we introduce a new family as follows. Let $F_{e,0} = F_e$. For $e', e'' \in \mathbb{Z}_N(q)$ such that $\Lambda + e + e'$ and $\Lambda + e + e''$ have a common face $L$, we define $F_{e,e''} = r_L F_{e,e'}$. This definition is meaningful since what we did is simply reflect the function $F_e = F_{e,0}$ defined on $\Lambda + e$ to all $\Lambda + e'$ ($e' \in \mathbb{Z}_N(q)$) in terms of the reflections in the faces of $\Lambda + e'$ ($e' \in \mathbb{Z}_N(q)$).
Theorem 10.28 (Chessboard estimates). Let $\mu \in \mathscr{P}(E_N)$ ($N$ even) be RP with respect to the reflection in each face of $\Lambda + e$ ($e \in \mathbb{Z}_N(q)$). Then
where $|\Lambda|$ is the Euclidean volume and $|\mathbb{Z}_N| = N^d$.
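The special case $d = 1$, $N = 4$ discussed before the theorem can be checked numerically. The sketch below is an illustration under stated assumptions, not the model of Example 10.25: it takes spins $\pm 1$ on a ring of four sites with weight $\exp[\beta \sum_i x_i x_{i+1}]$, $\beta > 0$ (a nearest-neighbour ferromagnet, which is reflection positive for reflections through the bonds), and compares $|\mu(f_0 f_1 f_2 f_3)|$ with $\prod_j \mu(f_j f_j f_j f_j)^{1/4}$. All names in the code are hypothetical.

```python
import math
import random
from itertools import product

N = 4             # ring of four sites, as in the special case d = 1, N = 4
SPINS = (-1, 1)   # assumed single-site space for this toy check


def weight(x, beta):
    """Unnormalized weight of a configuration on the ring (assumed toy model)."""
    return math.exp(beta * sum(x[i] * x[(i + 1) % N] for i in range(N)))


def expectation(funcs, beta):
    """Average of prod_j funcs[j](x_j) under the normalized ring measure."""
    num = 0.0
    z = 0.0
    for x in product(SPINS, repeat=N):
        w = weight(x, beta)
        z += w
        num += w * math.prod(f(xj) for f, xj in zip(funcs, x))
    return num / z


def chessboard_sides(funcs, beta):
    """Return (left-hand side, right-hand side) of the chess-board estimate."""
    lhs = abs(expectation(funcs, beta))
    # Same function f_j placed at every site; clamp tiny negative rounding errors.
    rhs = math.prod(
        max(expectation([f] * N, beta), 0.0) ** (1.0 / N) for f in funcs
    )
    return lhs, rhs


def random_site_function(rng):
    """A random real function on the two-point spin space."""
    table = {s: rng.uniform(-1.0, 1.0) for s in SPINS}
    return lambda s, table=table: table[s]
```

For example, drawing four random site functions and calling `chessboard_sides(funcs, 0.7)` returns a pair whose first entry should not exceed the second, up to rounding.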
The proof of this theorem needs some work and is delayed to Section 10.7. To illustrate the application of the chess-board estimates, we need the following result. For a product space $E = X^S$, from now on, we will often use the following notation:
$$\mathscr{C}yl(E) = \{f \in \mathscr{E} : \exists V \Subset S \text{ such that } f(x) = f(x_V \times y) \text{ for all } x \in E \text{ and } y \in E(S \setminus V)\}.$$
The elements of $\mathscr{C}yl(E)$ are called cylindrical functions. Next, a set $B$ is called cylindrical if $I_B \in \mathscr{C}yl(E)$. However, we use the same notation $\mathscr{C}yl(E)$ to denote the collection of cylindrical sets.
Theorem 10.29. Let $A^1, A^2 \in \mathscr{C}yl(E)$, $A^1 \cap A^2 = \emptyset$ and $I_{A^1}(x) = I_{A^2}(\tau x)$, $x \in E$, for some transform $\tau$ on $E$. Set $A_s^1 = \theta_s A^1$ and $A_s^2 = \theta_s A^2$, where $I_{\theta_s A} = \theta_s I_A$. Next, let $P^\beta$ ($\beta \ge 0$) be a $\tau$- and $\tau^{-1}$-invariant and translation-invariant probability measure on $(E, \mathscr{E})$. Suppose that there exist constants
such that
(1) $P^\beta(A^1) = P^\beta(A^2) \ge A/2$,
(2) $P^\beta(A_s^1 A_t^2) \le B$ for all $s, t \in \mathbb{Z}^d$ and all $\beta \ge \beta_c$.
Then there exist $\bar P_1^\beta$ and $\bar P_2^\beta$ such that
$$\bar P_k^\beta(A^k) > 1/2, \qquad k = 1, 2,\ \beta \ge \beta_c.$$
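Proof: a) By the Birkhoff-Khinchin ergodic theorem, we have
$$\theta_s f^k = f^k, \quad P^\beta\text{-a.s.} \qquad \text{and} \qquad P^\beta f^k = P^\beta(A^k), \quad k = 1, 2.$$
Note that
$$P^\beta(f^1 + f^2 \ge a) \ge P^\beta(f^1) + P^\beta(f^2) - a\, P^\beta(f^1 + f^2 < a)$$
and $P^\beta(f^1 f^2) \ge b\, P^\beta(f^1 f^2 \ge b)$. By the assumptions, for every $a \in [0, A]$ and $b \in [B, 1]$, we have
$$P^\beta(f^1 + f^2 \ge a) \ge \frac{A - a}{1 - a}, \qquad P^\beta(f^1 f^2 < b) \ge \frac{b - B}{b}. \qquad (10.16)$$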
b) Next, we prove that
for some $\varepsilon, \delta > 0$. Actually, by the assumptions, $A^1$ and $A^2$, and hence $f^1$ and $f^2$, are symmetric under $\tau$. It suffices to show that this estimate holds for at least one $k$, which is then implied by the following two conditions:
$$\exists\, \varepsilon > 0 \text{ so that } P^\beta(f^1 + f^2 \ge a,\ f^1 f^2 < b) > \varepsilon. \qquad (10.17)$$
$$\exists\, \delta > 0 \text{ so that } \{(x, y) : 0 \le x \le y \le 1,\ x + y \ge a,\ xy \le b\} \subset \{(x, y) : y > 1/2 + \delta\} \text{ holds for some } a \le 1 \text{ and } b > 0. \qquad (10.18)$$
By (10.16),
$$P^\beta(f^1 + f^2 \ge a,\ f^1 f^2 < b) \ge P^\beta(f^1 + f^2 \ge a) + P^\beta(f^1 f^2 < b) - 1 \ge \frac{A - a}{1 - a} + \frac{b - B}{b} - 1.$$
So (10.17) follows from
$$\frac{A - a}{1 - a} + \frac{b - B}{b} > 1. \qquad (10.19)$$
On the other hand, if we set $x + y = a' \ge a$, $xy = b' \le b$, then $x$ and $y$ ($x \le y$) are the two roots of the equation $z^2 - a'z + b' = 0$. This gives us
Hence, (10.18) follows from
$$b < \frac{a^2}{4} \qquad \text{and} \qquad \frac{a}{2} + \sqrt{\frac{a^2}{4} - b} > \frac12. \qquad (10.20)$$
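The root bound behind this step can be sanity-checked numerically. If $0 \le x \le y$, $x + y \ge a$, $xy \le b$ and $b < a^2/4$, then the larger root satisfies $y \ge a/2 + \sqrt{a^2/4 - b}$, because $\big(a' + \sqrt{a'^2 - 4b'}\big)/2$ increases in $a'$ and decreases in $b'$. The sketch below compares this bound with a brute-force minimum over a grid in the unit square; the function names, the grid resolution and the sample values of $a$ and $b$ are assumptions of the example.

```python
import math


def larger_root_lower_bound(a, b):
    """Lower bound for y when x <= y, x + y >= a, x * y <= b and b < a^2 / 4."""
    if b >= a * a / 4:
        raise ValueError("this sketch needs b < a^2 / 4")
    return a / 2 + math.sqrt(a * a / 4 - b)


def min_larger_coordinate(a, b, steps=400):
    """Smallest y on a grid of {0 <= x <= y <= 1, x + y >= a, x * y <= b}."""
    best = None
    for i in range(steps + 1):
        x = i / steps
        for j in range(i, steps + 1):
            y = j / steps
            if x + y >= a and x * y <= b and (best is None or y < best):
                best = y
    return best
```

With the sample values $a = 0.9$ and $b = 0.1$, the bound is strictly larger than $1/2$, so the region in (10.18) stays away from $y \le 1/2$.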
But both (10.19) and (10.20) can be deduced by
$$B\,\frac{1 - a}{A - a} < b < \frac{a}{2} - \frac14. \qquad (10.21)$$
Finally, for the given constants $A$ and $B$, if we take $a = a_0 = B + A/2 + 1/4$, then (10.21) is equivalent to
$$B\,\frac{1 - a_0}{A - a_0} < \frac{a_0}{2} - \frac14.$$
That is
or $A/2 + 2B < a_0$, which is then implied by the assumptions on $A$ and $B$.
c) Let $M_1 = M_1^\beta = \{f^1 \ge 1/2 + \delta\}$. Then $P^\beta(M_1) \ge \varepsilon/2$ by b). Define
$$\bar P_1^\beta = P^\beta(\,\cdot\,;\, M_1)/P^\beta(M_1).$$
From Preston (1976), we know that $\bar P_1^\beta \in \mathscr{G}(\Phi)$ whenever so does $P^\beta$. Since $\bar P_1^\beta \ll P^\beta$, it follows that $\theta_s f^1 = f^1$, $\bar P_1^\beta$-a.s., and $\bar P_1^\beta(f^1) = \bar P_1^\beta(A^1)$. Therefore
$$\bar P_1^\beta(A^1) = \bar P_1^\beta(f^1) = P^\beta(f^1; M_1)/P^\beta(M_1) \ge (1/2 + \delta)\, P^\beta(M_1)/P^\beta(M_1) = 1/2 + \delta.$$
Similarly, $\bar P_2^\beta(A^2) \ge 1/2 + \delta$, $\beta \ge \beta_c$. $\square$
Now, we introduce an application of the above results.
Example 10.30. Let $d = 2$. Then Model 10.25 exhibits a phase transition at sufficiently low temperature.
Proof: Note that, for this model, the Hamiltonian has a symmetry group $S^1 \otimes \mathbb{Z}_2$. Each rotation $\theta \in S^1$ acts on $E$ in the following way:
while a generating element $g \in \mathbb{Z}_2$ acts on $E$ according to the formula:
$$(gx)_t = \begin{cases} \pi + x_t & \text{if } t^{(1)} + t^{(2)} \text{ is even} \\ x_t & \text{if } t^{(1)} + t^{(2)} \text{ is odd.} \end{cases}$$
Take such a $g$ as the transform $\tau$ required by Theorem 10.29. Let $|a_1 - a_2|$ denote the length of the shortest arc (i.e., the geodesic) joining the points $a_1$ and $a_2 \in S^1$, and set
$$A^1 = \{x : |x_s - x_t| < \delta,\ s, t \in \Lambda_0,\ |s - t| = 1\},$$
$$A^2 = \{x : |x_s - x_t| > \pi - \delta,\ s, t \in \Lambda_0,\ |s - t| = 1\},$$
$$A^0 = E \setminus (A^1 \cup A^2), \qquad 0 < \delta < 1/2,$$
where $\Lambda_0$ is the unit square introduced at the beginning of this section. Next, set
Then $H_N^\Phi(x) = \sum_{t \in \mathbb{Z}_N} \theta_t \Phi(x)$. The proof goes as follows. First, we check the two estimates for $P_N^\beta$ required by Theorem 10.29, and then pass to the limit $N \to \infty$.
a) Consider condition (1) of Theorem 10.29, that is, estimating $P_N^\beta(A^0)$. Applying the chess-board estimate to $\Lambda = \Lambda_0$, we have
To estimate the right-hand side, fix $z \in S^1$ and let $\kappa$ be a constant which will be determined later. Put $R_\kappa = \{z' \in S^1 : |z - z'| < \delta/\kappa\}$. Then
On the other hand, if $x \in \theta_t A^0 = A_t^0$, then there is at least one edge $\langle st \rangle$ in $\theta_t \Lambda_0 = \Lambda_t$ so that $|x_s - x_t| \ge \delta$. That is, there is at least one $\langle st \rangle$ so that $\cos(2(x_s - x_t)) \le \cos(2\delta)$. Hence
<
< 421 3 + c 0 4 2 4 ) + 2 = 321 + 'z1
~~S(ZS).
Thus
26
cos(26) - 2120s - - 2cos K
n
Hence, we may take $\kappa = \kappa(\delta)$ large enough so that the exponent is negative. Furthermore, there exist $C = C(\delta)$ and $c = c(\delta) > 0$ (independent of $N$) such that $P_N^\beta(A^0) \le C \exp[-c\beta]$.
b) To check condition (2) of Theorem 10.29, we use the Peierls method plus the RP. Let $\Lambda_t = \theta_t \Lambda_0$. We say that $\{\Lambda_{t_1}, \ldots, \Lambda_{t_n}\}$ constitutes a contour $\Gamma$ if the squares $\Lambda_{t_i}$ and $\Lambda_{t_j}$ intersect iff $i - j = \pm 1 \pmod n$. Then we write
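We say that a configuration $x \in E_N$ contains $\Gamma$ if for every $\Lambda_t \in \operatorname{supp} \Gamma$, $x \in A_t^0$. This event is denoted by $A_\Gamma$. That is, $A_\Gamma = \bigcap_{t:\, \Lambda_t \in \operatorname{supp} \Gamma} A_t^0$. Finally, we say that $\Gamma$ separates $s$ and $t$ if the open squares $\Lambda_s^\circ$ and $\Lambda_t^\circ$ lie in different connected components of $\mathbb{Z}^2 \setminus \Gamma$. Now, let $x \in A_s^1 A_t^2$. Since
the configuration $x$ contains a contour $\Gamma$ which separates $s$ and $t$. Thus
$$P_N^\beta(A_s^1 A_t^2) \le P_N^\beta\Big(\bigcup_{\Gamma:\, \Gamma \text{ separates } s \text{ and } t} A_\Gamma\Big).$$
On the other hand, applying the RP with respect to $\Lambda = \Lambda_0$, for a fixed $\Gamma$, we have
$$P_N^\beta(A_\Gamma) \le \Big[P_N^\beta\Big(\bigcap_{t \in \mathbb{Z}_N} A_t^0\Big)\Big]^{|\Gamma|/N^2} \le C^{|\Gamma|} \exp[-c\beta|\Gamma|] \quad \text{(by a))} \quad = \exp[-(c\beta - \log C)|\Gamma|].$$
Next, it is clear that the number of contours of $n$ squares containing a given square does not exceed $7^{n-1}$. Actually, a more careful examination of the definition of contour improves this estimate to $5^{n-1}$. Hence
We obtain
$$P_N^\beta(A_s^1 A_t^2) \le C' \exp[-c'\beta], \qquad \beta \ge \beta_0$$
for some $\beta_0$, where $C'$ and $c'$ are independent of $N$ and depend only on $C$ and $c$ involved in a).
c) By a) and b), there exists $\beta_0$ so that for $\beta \ge \beta_0$, we have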
uniformly in $N$. Choose an arbitrary weak limit $P^\beta$ (use a subsequence $P_{N_k}^\beta$ instead of $P_N^\beta$ if necessary). Then the above estimates still hold for this limit $P^\beta$. Therefore, the assumptions of Theorem 10.29 hold for all $\beta \ge \beta_c$ whenever $\beta_c$ is large enough. Here, we have also used the fact: if $P_n \in \mathscr{G}(\Phi)$ and $P_n \Rightarrow P$, then for all $A \in \mathscr{C}yl(E)$, $P_n(A) \to P(A)$. This is because, on the one hand, $P \in \mathscr{G}(\Phi)$ implies the consistency of $P$ with the specification $(p_V)$, and on the other hand, $\{p_V\}$ has no mass on the boundary $\partial A$ of $A \in \mathscr{E}_0(V)$. $\square$
10.7 Proof of the Chess-Board Estimates
We begin this section with another proof for the simple case ($d = 1$ and $N = 4$) discussed in the last section. The present proof contains the main idea for the general case. Assume that at each site $0, 1, 2, 3$, a function $f_j$ with variable $x_j$ is located. Define a multilinear functional $G$ as follows:
Note that if we keep the $f_j$'s at the same places as above but replace $\{0, 1, 2, 3\}$ and $\{x_0, x_1, x_2, x_3\}$ with a cyclic permutation $\{3, 0, 1, 2\}$ and $\{x_3, x_0, x_1, x_2\}$, respectively, then the right-hand side has the same value, since a cyclic permutation can be generated by a composition of $r_1 \in \mathscr{R}_{1/2}$ and $r_2 \in \mathscr{R}_1$, and $\mu$ is reflection invariant. On the other hand, the left-hand side should be replaced by $G(f_1, f_2, f_3, f_0)$, which is a different functional from the above $G(f_0, f_1, f_2, f_3)$. However, we may use the same notation $G(f_0, f_1, f_2, f_3)$ to denote these functionals, assuming instead that the functional $G$ obeys the cyclicity: $G(f_0, f_1, f_2, f_3) = G(f_3, f_0, f_1, f_2) = \cdots$. The main advantage of using this functional is that we can then ignore the difference between different reflections. To see this, let us return to the example discussed in the last section. The reflection $r$ used in
represents $r_1$. Here, the semicolon is used for separating the functions $f_j$ into two parts located in the two half-spaces, respectively, according to the reflection $r = r_1$. The general reflection rule is, as we will explain later,
After a cyclic permutation, the reflection $r$ used in
represents $r_2$. Thus, by making a cyclic permutation, we turn to a different reflection and hence we can ignore the difference between the reflections. The shortcoming of using the functional $G$ is that after several steps, except at the last step, it is hard to recognize the precise position occupied by $f_j$ from the expression $G(f_0', f_1', f_2', f_3')$. Fortunately, the precise position of $f_j$ is not important for our purpose, since the final expression should be independent of the positions. Let us rewrite the proof given in the last section by using the above simplified notation:
Look at each term on the right-hand side, $G(rf_0, f_0; rf_0, f_0)$ for instance. Before the semicolon, $rf_0$ and $f_0$ are located in the same half-space, and $rf_0$ is the reflection of $f_0$. Since there are only two sites in the half-space, this means that these two sites are occupied by the same function $f_0$. The same conclusion holds for the other half-space. Therefore, $G(rf_0, f_0; rf_0, f_0)$ indicates that every site is occupied by $f_0$. Thus, there is no confusion in recognizing that the right-hand side is just
Now, we generalize the above idea to the general situation.
Theorem 10.31 (Abstract Chess-Board Estimates). Let $\mathscr{A}$ be a (real) vector space and $r : \mathscr{A} \to \mathscr{A}$ be a linear mapping with $r^2 = 1$. Suppose that $G(a_1, \ldots, a_{2n})$ is a multilinear functional having the following properties.
(1) Cyclicity. $G(a_1, \ldots, a_{2n}) = G(a_2, \ldots, a_{2n}, a_1)$.
(2) Schwarz inequality.
Set $\|a\| = |G(a, ra, a, \ldots, ra)|^{1/2n}$. Then $\|\cdot\|$ is a semi-norm and
(10.22)
Proof: By (1), we have $\|ra\| = \|a\|$ for $a \in \mathscr{A}$. On the other hand, by (2), we have either $G(a_1, \ldots, a_n; ra_n, \ldots, ra_1) \ge 0$ or $\le 0$ for all $a_1, \ldots, a_n$. Hence, we can assume the former without loss of generality.
a) Given $a_1, \ldots, a_{2n} \in \mathscr{A}$, let $\|a_i\| \ne 0$, $i = 1, \ldots, 2n$. Take $b_i = a_j$ or $ra_j$ ($j = 1, \ldots, 2n$), $i = 1, \ldots, 2n$, and define
<
a.
Clearly, F is well defined and satisfies) regarding F as G, all assumptions of the theorem. Thus we need only show that for all For this, let
the setChoose an element from
so that the length l ( >1) of the chain
ai, mi, ai, m i , . *
(or mi,ai, m i , -
.
when !is odd)
is maximal. Without loss of generality, assume that i = 1 and the element (bi) chosen above has the form: (bl,
* * .
,b2,)
= (tl, ra1,:.
. ,T U > b f + l , . . . ,b2n).
e Certainly, the last element in the chain ( a l ,T U ~' ,. . ,r a l ) is a1 or rul, according to t being odd or even. If l = 2n, then we have already had
F* = j F ( a l , r a ~ , - .,al,ral)l = 1. This deduces (10.23). Otherwise) !. obtain
< 2n.
(since F0 is the maximum).
(10.24)
If! 2 n, by condition (2), we
Hence F ( b l , . . , b,; rb,, . . , rbl) = Po and so t = 2n which is impossible by our assumption. Conversely, if I < n, then
It follows that the chain ( q ral , I - U I ,r a l ) or ( r a l ,a1 ,rul, . . . ,) has length 2& > l , which contradicts our assumption. b) If for some a t , Iluill = 0, we claim that G(a1,a2, * 7u2n)= 0. Otherwise, we may choose a sequence ( b l , - ' * , bz,) such that +
1
-
and ( b , , * . . , bzn) contains a chain (a1 T U I , * 1 having the maximal length. According to the argument in a>, i t follows that ( b l , . . . ,bzn) must be (a), r u l , . . ,u1, r u l ) . Therefore
0 # G ( b l ; * *,bzn) = G ( a l , r ~ l , . ,* u- ~ , T u=~ l) l
~ l = ( (0,~ ~
which is a contradiction. c) Finally, we prove that (10.22) implies the triangle inequality of Ilu
+ bll = IG(u + b, r ( a + b ) ,
*.
. ,a
I\ . 11.
+ b, r ( a + b ) ) 11/2n
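A concrete instance may help to see what the abstract theorem says. The sketch below is an illustration chosen for this purpose, not the functional used in the text: it takes $\mathscr{A}$ to be the real $2 \times 2$ matrices, $r$ the transpose, and $G(a_1, \ldots, a_{2n}) = \operatorname{tr}(a_1 \cdots a_{2n})$. It assumes the Schwarz inequality has the usual form $G(a_1, \ldots, a_n, b_1, \ldots, b_n)^2 \le G(a_1, \ldots, a_n, ra_n, \ldots, ra_1)\, G(rb_n, \ldots, rb_1, b_1, \ldots, b_n)$, which for the trace is the Cauchy-Schwarz inequality for the Frobenius inner product. The conclusion $|G(a_1, \ldots, a_{2n})| \le \prod_i \|a_i\|$ is then a Hölder-type inequality for matrices. All function names are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """The reflection r of this sketch: matrix transposition (r applied twice is the identity)."""
    return [list(row) for row in zip(*a)]


def G(*mats):
    """Trace of the ordered product; cyclic in its arguments."""
    prod = mats[0]
    for m in mats[1:]:
        prod = matmul(prod, m)
    return sum(prod[i][i] for i in range(len(prod)))


def seminorm(a, n):
    """|G(a, ra, a, ..., ra)|^(1/2n) with 2n factors."""
    args = [a, transpose(a)] * n
    return abs(G(*args)) ** (1.0 / (2 * n))


def random_matrix(rng, size=2):
    """A random real matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]
```

For $n = 2$ and four random matrices `a, b, c, d`, one can compare `abs(G(a, b, c, d))` with the product of the four values `seminorm(m, 2)`.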
To prove Theorem 10.28, we consider first a particular case.
Corollary 10.32. Let $N$ be even. Suppose that $\mu \in \mathscr{P}(E_N)$ is reflection invariant for all $r \in \mathscr{R}_{1/2}$ and is RP. Then for every family $\{f_t \in b\mathscr{E} : t \in \mathbb{Z}_N\}$ we have
In particular, for every $f \in b\mathscr{E}$, we have
Proof: Consider first the reflections along the first coordinate direction. Set $N = 2n$ and let $\mathscr{A}$ be the vector space spanned by
Define $G$ to be the common value
Applying Theorem 10.31 to the functions $g_j = \prod_{t^{(2)}, \ldots, t^{(d)}} f_{(j, t^{(2)}, \ldots, t^{(d)})}$, $j = 0, \ldots, 2n - 1$, we obtain
Repeating the argument for the other $(d - 1)$ directions, we obtain (10.25). Finally, (10.26) follows by setting $f_0 = f$ and $f_t = 1$, $t \in \mathbb{Z}_N \setminus \{0\}$ in (10.25). $\square$
Proof of Theorem 10.28: a) The proof can be reduced to the special case that $d = 1$ by considering the coordinate directions successively, as we did in the proof of Corollary 10.32.
b) For $d = 1$, the assertion follows from the abstract chess-board estimate. Actually, we need only regard each block $\Lambda + e$ as a brick, as $[m + 1/2,\, m + 3/2]$ ($m \in \mathbb{Z}$) used in the proof of Corollary 10.32. $\square$
10.8 Notes
In the past thirty years or more, the random field has been one of the most active subjects in probability theory and in statistical physics. Several books are now available: Dobrushin, Kotecký and Shlosman (1992), Ellis (1985), Georgii (1988), Malyshev and Minlos (1991), Minlos (2000), Preston (1976) and Sinai (1982). Plenty of references can be found in these books. Theorem 10.6 is due to Dobrushin (1970). Actually, for the existence theorem, the weak continuity condition can be weakened and the set of parameters can be more general. See Dobrushin (1970) and Georgii (1988). Another generalization, based on RP, was presented in Shlosman (1986). Theorem 10.9 is an extension of the well-known Dobrushin uniqueness theorem, in which the minimum $L^1$-metric was evaluated at each single site. The present theorem, due to Dobrushin and Shlosman (1985), treats the metric in a finite volume globally. See Maes and Shlosman (1991) for recent progress on this topic. Note that the measurability question in the original statement of Theorem 10.9 was solved by Zhang (1999), which leads to the use of Theorem 5.32. The Ising model on lattice fractals was studied by Gefen, Aharony and Mandelbrot (1983, 1984) and by Gefen, Aharony, Shapir and Mandelbrot (1984). Theorem 10.16 was proved by J. L. Zheng in 1989 but it is published here for the first time. The proof given here is an analogue of those proving the absence of phase transition for models having finite range. Refer to Liggett (1985, Chapter 4) for details and original references. The dual graph used in the proof of Theorem 10.20 is due to J. L. Zheng. The combinatorial Lemma 10.21 is due to J. Wu. Again, Theorem 10.20 appears here for the first time. A complete proof was presented in Zheng (1993).
The proof uses some results from algebraic topology because of the complexity of the contours. It should be mentioned that the proofs of Theorem 10.16 and Theorem 10.20 are suitable for much more general situations. For instance, the same conclusion of Theorem 10.16 should hold for any lattice nested fractals. See also Yoshida and Higuchi (1996) for a related result. Refer to Mandelbrot (1982) and Falconer (1989) for more material on fractals. For diffusion processes on fractals, refer to Barlow and Perkins (1988), Kusuoka (1987, 1989) and Lindstrøm (1990) and the references within. For the Peierls method and its development, called the Pirogov-Sinai method, refer to Sinai (1982). Here, we mention Dobrushin and Zahradník (1985), Park (1988a,b). The RP was introduced by Osterwalder and Schrader (1973) and so is also called the Osterwalder-Schrader positivity. Section 10.6 is taken from Shlosman (1986). Section 10.7 is mainly taken from Fröhlich, Israel, Lieb and Simon (1978). Finally, for large deviations of random fields, refer to Ellis (1985) and Olla (1988).
Chapter 11
Reversible Spin Processes and Exclusion Processes
This chapter deals with the reversibility of two important classes of particle systems, spin processes and exclusion processes. There are two reasons why we study this problem. One is that, compared with an irreversible process, a reversible process has nicer properties and is easier to handle, so we should check the reversibility of a given process at the beginning of the study. This leads also to the study of potentiality. On the other hand, in equilibrium statistical physics, we are given (local) reversible measures (i.e., conditional Gibbs states) and the processes are constructed for describing the systems. Thus, the processes should be reversible. Then a question arises: what rates should we take so that the corresponding process actually describes an equilibrium system? This again leads to the study of reversibility.
11.1 Potentiality for Some Speed Functions
In this section, we introduce some simple criteria for the potentiality of the speed functions of spin processes and exclusion processes. As we will see in the subsequent sections, the potentiality is essential for the reversibility of the processes. Besides, the idea given here will also be used in Section 14.3 to study the reversibility for some reaction-diffusion processes.
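Definition 11.1. Let $S$ be a countable set and $E = \{0, 1\}^S$. Suppose that $c(\cdot, \cdot) : S \times E \to \mathbb{R}$ and $c(\cdot, \cdot, \cdot) : S \times S \times E \to \mathbb{R}$ satisfy the positivity condition:
(1) $c(u, x) > 0$, $u \in S$, $x \in E$;
(2) $c(u, v, x) > 0$, $u, v \in S$, $u \ne v$; $x \in E$, $x(u) \ne x(v)$.
Set
$$q_s(x, \tilde x) = \begin{cases} c(u, x) & \text{if } \tilde x = {}_u x \text{ for some } u \in S \\ 0 & \text{if } \tilde x \ne {}_u x \text{ for any } u, \end{cases}$$
where
$$({}_u x)(v) = \begin{cases} 1 - x(v) & \text{if } v = u \\ x(v) & \text{if } v \ne u, \end{cases} \qquad ((u, v)x)(w) = \begin{cases} x(w) & \text{if } w \ne u, v \\ x(v) & \text{if } w = u \\ x(u) & \text{if } w = v. \end{cases} \qquad (11.1)$$
Then we call $Q_s := (q_s(x, \tilde x) : x, \tilde x \in E)$ and $Q_e := (q_e(x, \tilde x) : x, \tilde x \in E)$ a field of spin speed functions and a field of exclusion speed functions, respectively.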
Theorem 11.2.
(1) The field $Q_s := (q_s(x, \tilde x) : x, \tilde x \in E)$ is a potential field iff the following quadrilateral condition
$$c(u, x)\, c(v, {}_u x)\, c(u, {}_u{}_v x)\, c(v, {}_v x) = c(v, x)\, c(u, {}_v x)\, c(v, {}_u{}_v x)\, c(u, {}_u x)$$
holds for any $u, v \in S$ and $x \in E$.
(2) The field $Q_e := (q_e(x, \tilde x) : x, \tilde x \in E)$ is a potential field iff the following triangle condition
$$c(u, v, x)\, c(v, w, (u, v)x)\, c(w, u, (w, u)x) = c(u, w, x)\, c(w, v, (u, w)x)\, c(v, u, (v, u)x)$$
holds for any $u, v, w \in S$ and $x \in E$.
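The quadrilateral condition says that the product of the flip rates around the four-cycle $x \to {}_u x \to {}_u{}_v x \to {}_v x \to x$ is the same in both directions. The sketch below checks this on a small finite site set. Everything in it is an assumption made for illustration: the ring of four sites, the energy function, the rates $\exp[-\tfrac12(H({}_u x) - H(x))]$ built from that energy (which satisfy the condition because the energy differences telescope around the cycle), and a second, hypothetical family of rates that does not.

```python
import math
from itertools import product


def flip(x, u):
    """The spin-flip transform: change the value at site u only."""
    return x[:u] + (1 - x[u],) + x[u + 1:]


def exchange(x, u, v):
    """The exchange transform (u, v)x: swap the values at sites u and v."""
    y = list(x)
    y[u], y[v] = y[v], y[u]
    return tuple(y)


def energy(x):
    """Hypothetical nearest-neighbour energy on a ring (assumed for this sketch)."""
    n = len(x)
    return -sum((2 * x[i] - 1) * (2 * x[(i + 1) % n] - 1) for i in range(n))


def potential_rate(u, x):
    """Flip rate derived from the energy; such rates form a potential field."""
    return math.exp(-0.5 * (energy(flip(x, u)) - energy(x)))


def asymmetric_rate(u, x):
    """A positive rate that is not derived from any energy (illustration only)."""
    return 1.0 + x[u] * x[(u + 1) % len(x)]


def quadrilateral_gap(c, u, v, x):
    """Absolute difference between the two sides of the quadrilateral condition."""
    lhs = c(u, x) * c(v, flip(x, u)) * c(u, flip(flip(x, u), v)) * c(v, flip(x, v))
    rhs = c(v, x) * c(u, flip(x, v)) * c(v, flip(flip(x, v), u)) * c(u, flip(x, u))
    return abs(lhs - rhs)
```

Calling `quadrilateral_gap(potential_rate, u, v, x)` should give a value that is zero up to rounding for every configuration, while `asymmetric_rate` gives a nonzero gap for some configurations.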
Proof: The first assertion is actually a restatement of Lemma 7.11. As in the proof of Lemma 7.11, we define
$$x\zeta(u, v) = (u, v)x$$
for $u \ne v$ and $x(u) \ne x(v)$. But here $\zeta(u, v)$ is no longer a transform on $E$ because it is only defined on $\{x \in E : x(u) \ne x(v)\}$. We also use $x\zeta(u_1, v_1) \cdots \zeta(u_n, v_n)$ to denote the path $L = (x = x^{(0)}, x^{(1)}, \ldots, x^{(n)})$ which consists of $x = x^{(0)}$ and $x^{(k)} = x^{(k-1)}\zeta(u_k, v_k)$ ($1 \le k \le n$), and regard $\zeta(u_k, v_k)$ as the $k$'th segment of the path. Using these notations, it is easy to see that the triangle condition is equivalent to
$$\varphi(x\zeta(u, v)\zeta(v, w)) = \varphi(x\zeta(u, w)) \qquad (11.3)$$
for any different $u, v, w \in S$ and $x \in E$ with $x(u) \ne x(v) = x(w)$, where $\varphi(L)$ denotes the work done by the field $Q_e$ along the path $L$.
Moreover, we have
and
We now prove that
$$\varphi(x\zeta(u_1, v_1)\zeta(u_2, v_2)) = \varphi(x\zeta(u_2, v_2)\zeta(u_1, v_1)) \qquad (11.6)$$
for distinct $u_1, v_1, u_2, v_2$ and $x$ with $x(u_i) \ne x(v_i)$, $i = 1, 2$. We may, and do, assume that $x(u_1) = x(u_2) \ne x(v_1) = x(v_2)$. So, $x\zeta(u_1, v_2)$ is meaningful. Noting that
and using (11.5), (11.4) and (11.3), we have
In fact, the last three steps reduce side(11.3). Going on, the above right-hand = Y(.C(Ul,
'uJC(U2. v2)c(vl,
= 'P(zC(u1,vl)c(vz, v1))
to in terms of
4 )+ ( P ( ( ( u , , v 2 ) 4 C ( u 1 ,212))
+ ( P ( ( ( u l , l ) z ) ~ ) C1(12u))l ,
= cP(~S(ul*V2)) -I-'p(((211'vL)z)C(u1,212))
= L3(4(?
1
.2)<(U1rr4)
= 0,
and hence (11.6) is proved. The triangle condition is obviously necessary. To prove the sufficiency, we need to show that for every closed path $L = (x^{(0)}, \ldots, x^{(n)}) = x\zeta(u_1, v_1) \cdots \zeta(u_n, v_n)$,
$$\varphi(x\zeta(u_1, v_1) \cdots \zeta(u_n, v_n)) = 0. \qquad (11.7)$$
We use induction on $n$. When $n = 2$, (11.7) is reduced to (11.4). When $n = 3$, by (11.3), (11.7) is also reduced to (11.4). Suppose that (11.7) holds for $n < m$; we want to prove that it holds for $n = m$. Since $L$ is a closed path, each $u_k$ and $v_k$ ($1 \le k \le m$) appears an even number of times; thus, there is a $k$ such that $2 \le k \le m$ and
Since , without loss of generality, we may assume that
and
At the moment, we write x = x . ………. Then from (11.5) and (11.6) we obtain
Applying (11.5) and (11.6) again, we obtain
11.2 Constructions of Gibbs States
Definition 11.3. The functions $\{c(u, x)\}$ and $\{c(u, v, x)\}$ are called speed functions of the spin-flip processes and the exclusion processes, respectively, if $c(u, \cdot),\ c(u, v, \cdot) \in C(E)$ for all $u, v \in S$.
Having a potential field at hand, one can use its potential (an analogue of the Hamiltonian) to construct the conditional Gibbs distributions as we did in Section 10.1.
Lemma 11.4. Let $\{c(u, x)\}$ and $\{c(u, v, x)\}$ be given as above.
(1) Suppose that $\{c(u, x)\}$ satisfies the quadrilateral condition and let $\Phi$ be a potential function of $Q_s$. Define
for each $V \in S$ and $x \in E$, where $E(V) = \{0, 1\}^V$. Then $\{p_V\}$ is independent of the choice of $\Phi$ and
(11.13)
$$p_V(x) = p_W(x) \sum_{w \in E(W)} p_V(w \times x_{S \setminus W}), \qquad W \subset V \in S,\ x \in E. \qquad (11.14)$$
(2) Suppose that $\{c(u, v, x)\}$ satisfies the triangle condition and let $\Phi$ be a potential function of $Q_e$. Define
for each $V \in S$, $k \in \mathbb{Z}_+$, $k \le |V|$, $y \in E(V)$ and $x \in E(S \setminus V)$, where $|V|$ is the cardinality of $V$ and $E_k(V) = \{y \in \{0, 1\}^V : |y| := \sum_{u \in V} y(u) = k\}$. Then $p_{V,k}$ is independent of the choice of $\Phi$ and
Proof: a). The reference points A, (l E D ) , the reference paths L ( x ) and the corresponding &, &, X are defined in Section 7.1. The first assertions of (1) and (2) follow from the fact that for any potential functions @ and a', by Corollary 7.8, we have
@(x) = W ( x ) + ce,
x E ,El, e E n.
b). We now prove (11.12). Choose a reference point y(O) in E ( V ) ,for each w E E ( V ) ,choose ( ~ ( ~ 1~ , ( ~ 1 , ,?I('))) c E ( V ) such that +
Define
where
Then for every x, (y(O) x ~ ~ \ ~ , yx ( xl ~) \ ~ , ., y-(k+l) - x xs,v) (where y('"+l) =w) is a path from y(O) x xcs\v to y('"+l) x xs\v, and by Theorem
7.6, we have
From this and a), we obtain
Since c(u,
and hence
a)
E C ( E ) ,we see that
for each $m > M$. Hence
and so by (11.20).
c) Clearly, (11.13) holds. Note that for $W \subset V \in S$,
Thus, by a) and the definition of $p_V$, we obtain
For
define
It is clear that
is a path from
for each
Also, define
We now obtain
for each $V \in S$, $k$ ($0 \le k \le |V|$), $y \in E(V)$ and $x \in E(S \setminus V)$. Now (11.16) follows from the same argument as in b).
e) Finally, by the definition of $p_{V,k}$, (11.18) can be proved by the same argument as in c).
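Definition 11.5. Let $\mathscr{E}_0(S \setminus V)$ be the Borel $\sigma$-algebra on $E(S \setminus V) = \{0, 1\}^{S \setminus V}$. Define $\mathscr{E}(S \setminus V) = \mathscr{E}_0(S \setminus V) \times E(V)$ and $\mathscr{A}(S \setminus V) = \sigma\{E_k(V) \times E(S \setminus V),\ 0 \le k \le |V|;\ \mathscr{E}(S \setminus V)\}$.
(1) A measure $\mu \in \mathscr{P}(E)$ is called a Gibbs state with respect to $\{c(u, x)\}$ if for each $V \in S$, $y \in E(V)$,
$$\mu(\{y\} \times E(S \setminus V) \mid \mathscr{E}(S \setminus V)) = p_V(y \times (\cdot)_{S \setminus V}), \qquad \mu\text{-a.e.}$$
Denote by $\mathscr{G}_s$ the set of all Gibbs states.
(2) A measure $\mu \in \mathscr{P}(E)$ is called a canonical Gibbs state with respect to $\{c(u, v, x)\}$ if for each $V \in S$, $y \in E(V)$,
$$\mu(\{y\} \times E(S \setminus V) \mid \mathscr{A}(S \setminus V)) = p_{V, |(\cdot)_V|}(y \times (\cdot)_{S \setminus V}), \qquad \mu\text{-a.e.}$$
Denote by $\mathscr{G}_e$ the set of all canonical Gibbs states.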
Notations 11.6.
(1) Define $\widetilde{\mathscr{G}}_s$ to be the closed convex hull (in the weak topology) of all probabilities $\mu_{V,x}$ ($V \in S$, $x \in E(S \setminus V)$) defined by
Theorem 11.7.
(1) If $c(u, x)$ satisfies the positive condition in Definition 11.1 (1), the quadrilateral condition in Theorem 11.2 (1) and the continuous condition in Definition 11.3, then $\mathscr{G}_s = \widetilde{\mathscr{G}}_s \ne \emptyset$.
(2) If $c(u, v, x)$ satisfies the positive condition in Definition 11.1 (2), the triangle condition in Theorem 11.2 (2) and the continuous condition in Definition 11.3, then $\mathscr{G}_e = \widetilde{\mathscr{G}}_e \ne \emptyset$.
Proof: a) Let p E gS, then P({Y} x E ( S \ V ) )=
(11.22)
where pus,v is the projection of p on E ( S \ V ) . Let
and
< pv(yk
APik = { z E E ( S \ V ) : APn = { Z E E ( S \ V ) : ~ , i=
nL, A?
n Z k 7
x
Q pv(yk x
2)
Z)
< %},
< 1)
ik
i E In.
For each n and i E In, take a Lemma 11.4(1), we obtain
Zni
E Ani. Then from (11.22), (11.23) and
For each n and i E In, take a Z
ni E Ani. Then from ( 1.22), (11.23) and Hence and a subsequence {} such that
by (11.24). Therfore
We now claim that for any
In fact, for sufficiently large
we have
and
and
proveIt is now easy to show that that that that that. Coversely, let …………. To that ………, it is enough to show that
where Co(S\W) = {{Y} x E(S\V) : V E s, v 3 W, 5 E E ( V \ W ) } . w e first prove that (11.27) holds for FO E Co(S\W) : Fo = {Y} x E(S\V) and p = pvm,zm,
v, 2 v, zm f E ( S \ Vm).
(11.28)
Actually, the right-hand side of (11.27)
Furthermore, ()11.27 holds for F0 and for any linear convex combination of Furthermore, ()11.27 holds for F0 and for any linear convex combination of Furthermore, ()11.27 holds for F0 and for any linear convex combination of Furthermore, ()11.27 holds for F0 and for any linear convex combination of Furthermore, ()11.27 holds for F0 and for any linear convex combination of and b) Let u e ge. We have
That is
Let I , = {1,2,. . . ,n}E(V) and define
For each $n \ge 1$, $0 \le k \le |V|$ and $i \in I_n$, take $z_{nki} \in E(S \setminus V)$ and $y_{nki} \in E_k(V)$ such that $y_{nki} \times z_{nki} \in A_{nki}$; then by (11.29), (11.30) and Lemma 11.4 (2), we obtain
Since IVI
k-OzEI,,
there exist ,uV E (11.31), we have
8 , and ~ {nk}
such that pV,nk
p ( A x E ( S \ V ) )= p . V ( A x E ( S \
-
* pv(as k
-3
V)), V E S,A c E ( V ) .
30
). By (11.32)
g.
Now the same argument as in (1) shows that 1-1 E Conversely, suppose that ,u E $fe, to prove that p E Ye, by the monotone class theorem, it is enough to show that
< k < IWI, y E E k ( W ) and Po = {$} x E ( S \ Y), v 3 w,v E s, g E E(V \ W ) .
holds for each IV f S,0
Starting from pVm,z,,kl(Vm3 V, z, E E ( S \ V,), argument in (1) shows that p E 9,. W
0
< k’
(11.34)
6 IVml),the
11.3 Criteria for Reversibility
Define
Under some hypotheses, $\Omega_s$ (resp. $\Omega_e$) is a generator of a spin process (resp. exclusion process) with domain $\mathscr{D}(\Omega_s)$ (resp. $\mathscr{D}(\Omega_e)$). The main fact we need here is the following: the sets
and
are cores of $\Omega_s$ and $\Omega_e$, respectively. That is, $\Omega_s$ (resp. $\Omega_e$) is the smallest closed extension of its restriction to the linear space $C(\Omega_s)$ (resp. $C(\Omega_e)$). (See Liggett (1985) for details and related references.) We say that $\Omega_s$ is reversible with respect to some $\mu \in \mathscr{P}(E)$ if
This is equivalent to the reversibility of the corresponding semigroups. So we need only discuss the construction of reversible measures for $\Omega_s$ (resp. $\Omega_e$). For this, we need the following two results.
Lemma 11.8. Let $\mathscr{R}(\Omega_s)$ (resp. $\mathscr{R}(\Omega_e)$) denote the set of all reversible measures for $\Omega_s$ (resp. $\Omega_e$). Then the following conclusions hold.
(1) A probability measure $\mu$ belongs to $\mathscr{R}(\Omega_s)$ iff
(2) A probability measure $\mu$ belongs to $\mathscr{R}(\Omega_e)$ iff
Proof: a) To prove the necessity of condition (l),by the monotone class
theorem, it suffices to consider the function f having the form n U E v z ( for some V E S. Fix u E V and take g(x) = f ( U ~ ) 2, E E. Then, a simple computation shows that ( f R , g ) ( z ) = c(u, z)f (z) and ( g o , f ) ( x ) = z E E . Hence c(u,x)f
This is certainly true for u $ V . To prove the sufficiency, let f , g E C(R,). Then condition (1) gives us
11 SPIN PROCESSES AND EXCLUSION PROCESSES
Since $C(\Omega_s)$ is a core of $\Omega_s$, we have $\mu \in \mathscr{R}(\Omega_s)$.
b) We now prove the second assertion. The sufficiency can be proved in the same way as above, except that $C(\Omega_s)$ is replaced by $C(\Omega_e)$. To prove the necessity, we need only consider functions of the form $I_{\{y\} \times E(S \setminus V)}$, $y \in E(V)$, $V \in \mathscr{S}$.
Lemma 11.9. 1() where
(2)
Proof: Since the proofs for (1) and (2) are similar, here we prove (1) only. Let p E 9(sZ,)and F E %yt(E(S \ V ) )n g0(S \ V),V E 6. Take
f = J{y}xF,
Y E E(V).
For u E V, by the previous lemma, we have
Then, this holds for all F E €o(S\V) by the monotone class theorem. Noting that c ( u , y x (.)s\v) E € ( S \ V), we have
for all F E &o(S\ V ) . Hence
Because ~ ( ( 9 )x E ( S \ V )1 €‘(S \ V ) )depends on zv only, we have proved the necessity of the condition. Repeating the argument in a converse way, we obtain the sufficiency.
Theorem 11.10.
(1) If $\{c(u,x)\}$ satisfies the positive, continuous and quadrilateral conditions, then $\mathscr{R}(\Omega_s) = \mathscr{G}_s$.
(2) If $\{c(u,v,x)\}$ satisfies the positive, continuous and triangle conditions, then $\mathscr{R}(\Omega_e) = \mathscr{G}_e$.
Proof: a) Let $\mu \in \mathscr{R}(\Omega_s)$. Then by Lemma 11.9 (1) we have
On the other hand, by the path-independence and the proof b) of Lemma 11.4 we obtain
where
From these we get
and so for each y,
9 E E ( V ) we have
By (11.13), $a_V = 1$, i.e., $\mu_V(\cdot \times z) = \mu_V(\cdot \,|\, z)$, $\mu_{S \setminus V}$-a.e. This shows that $\mu \in \mathscr{G}_s$. Conversely, if $\mu \in \mathscr{G}_s$, then
$$\mu_V(\{y\} \,|\, z) = \mu(\{y\} \times E(S \setminus V) \,|\, \mathscr{E}(S \setminus V))(w \times z) = \mu_V(y \times z).$$
From (11.36) and Lemma 11.9 (1), it follows that $\mu \in \mathscr{R}(\Omega_s)$.
b) Let $\mu \in \mathscr{R}(\Omega_e)$. Similarly, by the path-independence and (11.21) we have
From this and Lemma 11.9 (2), we obtain
Since have
for each
and
by (11.17), we
and so for each
That is, $\mu \in \mathscr{G}_e$. Conversely, if $\mu \in \mathscr{G}_e$, then from (11.32) and Definition 11.5 (2) we get
Now, from Lemma 11.9 (2), it follows that $\mu \in \mathscr{R}(\Omega_e)$. ∎
Theorem 11.11. Suppose that $\{c(u,x)\}$ satisfies the positive and continuous conditions. Then $\mathscr{R}(\Omega_s) \neq \emptyset$ iff $Q_s$ is a potential field or, equivalently, iff the quadrilateral condition holds.
Proof: By Theorem 11.10 (1) and Theorem 11.7 (1), it is enough to prove the necessity. Let $\mu \in \mathscr{R}(\Omega_s)$. Then for each $V \in \mathscr{S}$, there exists a $y_0 \in E(V)$ such that
hence $\mu_{S \setminus V}\{z : \mu(y_0 | z) > 0\} > 0$. By Lemma 11.9 (1),
for some $z_V \in E(S \setminus V)$ with $\mu(y_0 | z_V) > 0$. From this and the positive condition, we get $\mu(y | z_V) > 0$, $y \in E(V)$, and from these we obtain
Since $\{y \times z_V : y \in E(V),\ V \in \mathscr{S}\}$ is dense in $E$, for any $\{V_m\} \subset \mathscr{S}$ with $V_m \uparrow S$ and any $x \in E$, we have $x_{V_m} \times z_{V_m} \to x$. On the other hand, for each $u$ and $v$ in $S$, there is an $m_0$ such that $V_m \supset \{u, v\}$ whenever $m \ge m_0$. Now the quadrilateral condition follows from the continuous condition and (11.38) with $V = V_m$ and $y = x_{V_m}$. ∎
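Theorem 11.12. Suppose that $\{c(u,v,x)\}$ satisfies the continuous condition. Then $\mathscr{R}(\Omega_e) \neq \emptyset$. In fact, $\nu_0 = \delta(\mathbf{0}, \cdot)$ and $\nu_1 = \delta(\mathbf{1}, \cdot)$ are both in $\mathscr{R}(\Omega_e)$, where $\mathbf{0}(u) = 0$ and $\mathbf{1}(u) = 1$ for every $u \in S$.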
Proof: By Lemma 11.8 (2).
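The reason the two point masses are reversible can be illustrated on a toy system. The sketch below is only an illustration under stated assumptions: the site space is a ring of four sites instead of $S$, and the speed function `hypothetical_speed` is invented for the example. Because an exchange needs two neighbouring sites with different values, nothing leaves or enters the all-empty or the all-full configuration, so detailed balance holds trivially for them, whereas a generic measure such as the uniform one fails it for this speed function.

```python
from itertools import product

def exclusion_rates(n_sites, speed):
    """Jump rates q(x, y) of a nearest-neighbour exclusion process on a ring.

    `speed(u, v, x)` plays the role of c(u, v, x). A jump exchanges the values
    at the neighbouring sites u and v and requires x(u) != x(v).
    Assumes n_sites >= 3, so each pair (x, y) comes from exactly one bond.
    """
    rates = {}
    for x in product((0, 1), repeat=n_sites):
        for u in range(n_sites):
            v = (u + 1) % n_sites
            if x[u] != x[v]:
                y = list(x)
                y[u], y[v] = y[v], y[u]
                rates[(x, tuple(y))] = speed(u, v, x)
    return rates

def is_reversible(mu, rates, tol=1e-12):
    """Detailed balance: mu(x) q(x, y) == mu(y) q(y, x) for every pair."""
    for (x, y), q_xy in rates.items():
        q_yx = rates.get((y, x), 0.0)
        if abs(mu.get(x, 0.0) * q_xy - mu.get(y, 0.0) * q_yx) > tol:
            return False
    return True

def hypothetical_speed(u, v, x):
    """An arbitrary positive speed function, chosen only for illustration."""
    return 1.0 + x[u]

N_SITES = 4
RATES = exclusion_rates(N_SITES, hypothetical_speed)
DELTA_EMPTY = {(0,) * N_SITES: 1.0}   # point mass at the all-empty configuration
DELTA_FULL = {(1,) * N_SITES: 1.0}    # point mass at the all-full configuration
UNIFORM = {x: 1.0 / 2 ** N_SITES for x in product((0, 1), repeat=N_SITES)}

print(is_reversible(DELTA_EMPTY, RATES),
      is_reversible(DELTA_FULL, RATES),
      is_reversible(UNIFORM, RATES))
```

With these choices the script prints `True True False`: the point masses pass the check for any speed function, while the uniform measure does not for this particular one.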
Remark 11.13. From Theorem 11.12 we see that for exclusion processes, reversibility does not imply potentiality. This is quite different from the spin processes, for which $\mu \in \mathscr{R}(\Omega_s)$ implies that
But $\nu_0$ does not satisfy this property.
Definition 11.14. A measure $\mu \in \mathscr{P}(E)$ is called positive if (11.39) holds. Denote by $\mathscr{P}^+(E)$ the set of all positive probability measures. Similarly, we have the notation $\mathscr{R}^+(\Omega_e)$.

Theorem 11.15. Suppose that $\{c(u,v,x)\}$ satisfies the positive and continuous conditions. Then $\mathscr{R}^+(\Omega_e) \neq \emptyset$ implies that $Q_e$ is a potential field or, equivalently, that the triangle condition holds, and $\mathscr{R}^+(\Omega_e) = \mathscr{P}^+(E) \cap \mathscr{G}_e$.

Proof: Let $\mu \in \mathscr{R}^+(\Omega_e)$ and $y_0 \in E_k(V)$. By (11.39) we have
$$\mu\big[x \in E_k(V) \times E(S \setminus V) : \mu(\{y_0\} \times E(S \setminus V) \,|\, \mathscr{E}(S \setminus V))(x) > 0\big] > 0.$$
So there exists a xV.k E E such that
For each y E & ( V ) , U , I I E V with y(u)
# y(v),
by Lemma 11.9(2)! we have
where zv,k = ( z ~ , ~ )Hence, ~ , ~ by . the positive condition of ( c ( u ?T I , x)}, we get YEGmPKYl) x E ( S \ 1 4 s \ W Z V , k ) > 0,
v>
Therefore the restriction of $Q_e$ on $E_k(V) \times z_{V,k}$ is a potential field. From Theorem 7.6 and Theorem 11.2 (2) we have
Now the triangle condition follows from the continuous condition and the fact that $\bigcup_{V \in \mathscr{S}} \bigcup_{k \ge 0} E_k(V) \times \{z_{V,k}\}$ is dense in $E$. ∎

We now discuss the uniqueness problem for Gibbs states of a spin process with nearest neighbor speed function.
Definition 11.16. The speed function $\{c(u,x)\}$ is said to have nearest neighbor if, for each $u \in S$, there is a $\partial u \in \mathscr{S}$ such that
$$c(u, \cdot) \in \mathscr{E}(u \cup \partial u) := \mathscr{E}_0(u \cup \partial u) \times E(S \setminus \{u \cup \partial u\}).$$

From now on we make the "nearest neighbor" assumption. Take $\{V_n\}_1^\infty \subset \mathscr{S}$ such that $V_n = V_{n-1} \cup \partial V_{n-1}$, $V_n \uparrow S$ and
$$c(u, \cdot) \in \mathscr{E}(V_n) \quad \text{for each } u \in V_{n-1}. \tag{11.40}$$
For each n 2 1, we take Bvn as a reference point in E(Vn). For each w E E(Vn),we choose {y(l),. - . ,~(‘1)) c E(Vn)such that ,Q
:=
p
3
$(’) + . . . + y(‘“) + y(‘++l)
=: w
and define M V , x “s\v,, 11) x “qv,) and q.s(w x x:s\v,, ,Q x XS\V,) as usual (cf. (11.19)). Because of (11.40), we may rewrite these qS’sa5 &(dv, x xav,, w x zavn)and &(w x xav,, Ovn x zaVn),respectively.
Theorem 11.17. Suppose that $\{c(u,x)\}$ is positive, nearest neighbor and satisfies the quadrilateral condition. Then there is a bijective mapping between the set of all Gibbs states for $\{c(u,x)\}$ and the set of all the equivalence classes of linearly independent positive solutions to the following equation
(11.41)
$z \in E(\partial V_{n-1})$, $n \ge 1$, where $V_0 = \emptyset$, $\partial V_0 = V_1$ and $q_s(y, \tilde y)/q_s(\tilde y, y) = 1$ as a convention when $y = \tilde y$. In particular, there is only one Gibbs state iff (11.41) has only one positive solution up to a constant.
Proof: Note that, without any confusion, we can use $\exp[-\varphi(\theta_{V_{n-1}} \times z, y \times z)]$ to denote $q_s(\theta_{V_{n-1}} \times z, y \times z)/q_s(y \times z, \theta_{V_{n-1}} \times z)$ because of the path-independence, where $\varphi(y, z)$ is the work done by the field $Q_s$ from $y$ to $z$.
a) Suppose that $\{x_{n,z}\}$ is a positive solution to (11.41). Define
$$\mu_n(y \times z) = Z_n^{-1} \exp[-\varphi(\theta_{V_{n-1}} \times z, y \times z)]\, x_{n,z} \tag{11.42}$$
for each $y \in E(V_{n-1})$, $z \in E(\partial V_{n-1})$ and $n \ge 1$, where
$$Z_n = \sum_{y \in E(V_{n-1}),\ z \in E(\partial V_{n-1})} \exp[-\varphi(\theta_{V_{n-1}} \times z, y \times z)]\, x_{n,z}. \tag{11.43}$$
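Clearly, $\mu_n$ is a probability measure on $E(V_n)$. We now prove that
b) $\{\mu_n\}_{n \ge 1}$ is a consistent family. By the path-independence, we have
$$\begin{aligned}\varphi(\theta_{V_n} \times w, y \times z \times w) &= \varphi(\theta_{V_n} \times w, \theta_{V_{n-1}} \times z \times w) + \varphi(\theta_{V_{n-1}} \times z \times w, y \times z \times w)\\ &= \varphi(\theta_{V_n} \times w, \theta_{V_{n-1}} \times z \times w) + \varphi(\theta_{V_{n-1}} \times z, y \times z)\end{aligned} \tag{11.44}$$
for each $y \in E(V_{n-1})$, $z \in E(\partial V_{n-1})$ and $w \in E(\partial V_n)$. Hence for each $n \ge 1$,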
=
C
exp [ -
and so, by (11.44),we get
for each $y \in E(V_{n-1})$, $z \in E(\partial V_{n-1})$ and $n \ge 1$. Therefore, by the Kolmogorov consistency theorem, there is a unique probability measure $\mu$ on $\mathscr{E}$ such that $\mu_n$ is the projection of $\mu$ on $E(V_n)$. Next, we prove that
c) the measure $\mu$ obtained in b) is a Gibbs state. By Theorem 11.1 (1), it is enough to prove that $\mu$ is a reversible measure. But this follows immediately from (11.42), the path-independence and the following result.
Lemma 11.18. Let $\mu$ be a probability measure on $(E, \mathscr{E})$. Then $\mu \in \mathscr{R}(\Omega_s)$ iff
for each $y \in E(V_n)$ and $u \in V_{n-1}$, where $\mu_n$ is the projection of $\mu$ on $E(V_n)$.
Proof: If $\mu \in \mathscr{R}(\Omega_s)$, then applying Lemma 11.8 (1) to $f = I_{\{y\} \times E(S \setminus V_n)}$ ($y \in E(V_n)$) gives (11.45). Conversely, it is trivial that (11.45) implies (11.46) with the above $f$ in the cases that $u \notin V_n$ or $u \in V_{n-1}$.
If $u \in V_n \setminus V_{n-1} = \partial V_{n-1}$, from (11.45), it follows that
for each z E E(V,+1
~ P X = x}
\ V,).
E(S\Vn+i1
That is
4%z)p(d4 = UYX
=Ix E(S\V,+l)
4%+(W.
Summing up $z$ over $E(V_{n+1} \setminus V_n)$, we get (11.46), and then $\mu \in \mathscr{R}(\Omega_s)$ by Lemma 11.8 (1). The lemma is now proved. ∎

We now return to our main proof.
d) Let $\mu \in \mathscr{R}(\Omega_s)$ and $\tilde\mu_n$ be the projection of $\mu$ on $E(V_n)$. Then, by (11.45), we have
Summing up $w$ over $E(\partial V_n)$, we get
Hence $\{\tilde x_{n,z} := \tilde\mu_n(\theta_{V_{n-1}} \times z) : z \in E(\partial V_{n-1}),\ n \ge 1\}$ is a positive solution to (11.41). Next, from (11.45) and Theorem 7.6, we see that
$$\tilde\mu_n(y \times z) = \tilde\mu_n(\theta_{V_{n-1}} \times z) \exp[-\varphi(\theta_{V_{n-1}} \times z, y \times z)]$$
for each $y \in E(V_{n-1})$ and $z \in E(\partial V_{n-1})$. Taking $\tilde x_{n,z}$ instead of $x_{n,z}$ in (11.42) and (11.43), we obtain $Z_n = 1$ and $\mu_n = \tilde\mu_n$ for each $n \ge 1$. So $\mu$ is the same as the measure obtained in b). We have thus proved that the mapping defined in a) and b) is onto. We finally prove that it is also one-to-one.
e) Let $\{x^{(i)}_{n,z} : z \in E(\partial V_{n-1}),\ n \ge 1\}$, $i = 1, 2$, be two positive solutions to (11.41). If
$$x^{(1)}_{n,z} = \alpha\, x^{(2)}_{n,z}, \qquad z \in E(\partial V_{n-1}),\ n \ge 1, \tag{11.47}$$
for a constant $\alpha > 0$, then $Z_n^{(1)} = \alpha Z_n^{(2)}$ by (11.43), and hence $\mu_n^{(1)} = \mu_n^{(2)}$ by (11.42). We see that $\{x^{(1)}_{n,z}\}$ and $\{x^{(2)}_{n,z}\}$ determine the same Gibbs state $\mu$ by b). Conversely, if $\{x^{(1)}_{n,z}\}$ and $\{x^{(2)}_{n,z}\}$ determine the same $\mu \in \mathscr{R}(\Omega_s)$, then for each $n \ge 1$, $\mu_n^{(1)}$ and $\mu_n^{(2)}$, as the projections of $\mu$ on $E(V_n)$, must coincide. In particular, by (11.42), we have for each $z \in E(\partial V_{n-1})$,
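$$\big(Z_n^{(1)}\big)^{-1} x^{(1)}_{n,z} = \mu_n^{(1)}(\theta_{V_{n-1}} \times z) = \mu_n^{(2)}(\theta_{V_{n-1}} \times z) = \big(Z_n^{(2)}\big)^{-1} x^{(2)}_{n,z}.$$
But $Z_n^{(i)}$ ($i = 1, 2$) is independent of $n \ge 1$, so (11.47) holds. ∎

Example 11.19. Let $\Phi : \mathscr{S} \to \mathbb{R}$ satisfy $\sum_{B:\, u \in B \in \mathscr{S}} |\Phi(B)| \cdot |B| < \infty$ for every $u \in S$. Set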
In fact, for the first assertion, we need only to check the quadrilateral condition. Since
we have
On the other hand, since
is symmetric with respect to u and v,the quadrilateral condition is satisfied.
where
for all
be the same as above. Take
Then Qc is a potential field iff for all distinct
In this case,
Proof: We need only check the triangle condition.
Without loss of generality, assume that $x(u) = 1$, $x(v) = 0$ and $x(w) = 1$. Then
So we have
Similarly
Hence
Therefore, we need only show that
Note that
- (-l)lon{u,.H b x(u) # x ( v ) , We have
5B((u,u)4
for all u #
t~and
all
2
with
which is symmetric with respect to $u$ and $w$, and so the required assertion follows. ∎
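Example 11.21. In Example 11.19, take $S = \mathbb{Z}$ and
$$\Phi(B) = \begin{cases} \lambda > 0, & B = \{n, n+1\}\ (n \in \mathbb{Z}),\\ 0, & \text{otherwise.} \end{cases}$$
That is, $c(n, x) = \exp\big[\lambda \sum_{m:\, |m-n|=1} (2x(m) - 1)(2x(n) - 1)\big]$. Then $|\mathscr{G}_s| = |\mathscr{R}(\Omega_s)| = 1$.

Proof: Use Example 11.19 and Theorem 11.17. Take $V_0 = \{0\}$, $\partial V_n = \{-n-1, n+1\}$, $V_{n+1} = V_n \cup \partial V_n$, $n \ge 0$. Since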
for each x E E(dVn-1) and w E E(dVn),
Hence (11.41) is reduced to
where
(i:>.
and we have used 1 , 2 , 3 , 4t o denote the elements (O,O), (0, l ) , (1,O) and (1,l) in E(6’Vn-1) respectively. Let z = e-*’
and B =
Then
It follows that the characteristic roots of A-’ are:
+
a1 = (1 e-”)p2,
a2 =
(1 - e- 2X ) -2 , a3 = a 4
= (1 - e-”)-l.
Combining this with the difference equation (11.48)7we obtain Xn+1,i
= aiay
+ bia; + (Ci + din)a;,
i = 1 , 2 , 3 , 4 ; n 3 1,
where a i , bi, ci, di are constants depending on X , only. On the other hand, from (11.48), it is easy to see that
This implies that bi = ci = di = 0 for i = 1,2,3,4. Hence
x,+l,i = aia;L, Inserting these into (11.48)7we get and so
i = 1,2,3,4; n 2 0. u2
= e-2xul,
a3
= eW2’al, a4 = e - 4 x ~ 1 7
Therefore $|\mathscr{G}_s| = |\mathscr{R}(\Omega_s)| = 1$. Inserting this solution into (11.42), we see that the projection $\mu_n$ of the unique Gibbs state on $E(V_n)$ is actually given by
where $|z| = |\{m \in \partial V_{n-1} : z(m) = 1\}|$, and $Z_n$ is the normalizing constant for $\{\mu_n(y \times z) : y \in E(V_{n-1}),\ z \in E(\partial V_{n-1})\}$. ∎ A different approach to prove the absence of phase transition was given in Section 10.6.
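The first step of this example, namely that the nearest neighbor speed function satisfies the quadrilateral condition, can be checked numerically on a small system. The sketch below rests on several assumptions that are not taken from the text: $\mathbb{Z}$ is replaced by the ring $\mathbb{Z}/5\mathbb{Z}$, the value of $\lambda$ is arbitrary, the sign in the exponent is left as a parameter because the original display is hard to read, and the quadrilateral condition is written as the Kolmogorov cycle condition around the four configurations obtained from $x$ by flipping $u$, $v$, or both. It also checks detailed balance against a nearest neighbor Gibbs weight.

```python
import math
from itertools import product

LAM = 0.7   # illustrative interaction strength (assumption)
N = 5       # ring Z/5Z as a finite stand-in for Z (assumption)

def spin(x, n):
    """Map the occupation x(n) in {0, 1} to the spin 2x(n) - 1."""
    return 2 * x[n % N] - 1

def speed(n, x, sign=1):
    """c(n, x) = exp[sign * LAM * sum_{|m-n|=1} (2x(m)-1)(2x(n)-1)]."""
    return math.exp(sign * LAM * spin(x, n) * (spin(x, n - 1) + spin(x, n + 1)))

def flip(x, n):
    """Return the configuration x with the value at site n flipped."""
    y = list(x)
    y[n % N] = 1 - y[n % N]
    return tuple(y)

def quadrilateral_holds(x, u, v, sign=1):
    """Product of rates around x -> x^u -> x^{uv} -> x^v -> x, both directions."""
    xu, xv = flip(x, u), flip(x, v)
    xuv = flip(xu, v)
    forward = (speed(u, x, sign) * speed(v, xu, sign)
               * speed(u, xuv, sign) * speed(v, xv, sign))
    backward = (speed(v, x, sign) * speed(u, xv, sign)
                * speed(v, xuv, sign) * speed(u, xu, sign))
    return math.isclose(forward, backward, rel_tol=1e-9)

def gibbs_weight(x, sign=1):
    """Unnormalized weight exp[-sign * LAM * sum of neighbouring spin products]."""
    energy = sum(spin(x, n) * spin(x, n + 1) for n in range(N))
    return math.exp(-sign * LAM * energy)

def detailed_balance_holds(x, n, sign=1):
    y = flip(x, n)
    return math.isclose(gibbs_weight(x, sign) * speed(n, x, sign),
                        gibbs_weight(y, sign) * speed(n, y, sign),
                        rel_tol=1e-9)

CONFIGS = list(product((0, 1), repeat=N))
ALL_OK = all(
    quadrilateral_holds(x, u, v, s) and detailed_balance_holds(x, u, s)
    for s in (1, -1)
    for x in CONFIGS
    for u in range(N)
    for v in range(N)
    if u != v
)
print(ALL_OK)
```

Both sign conventions pass, which is why the sketch does not commit to one of them: changing the sign of the exponent only changes which Gibbs weight is reversible.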
11.4 Notes

The criterion (i.e., the quadrilateral condition) for the reversibility of the spin processes was obtained by Ding and Chen (1981) under an extra hypothesis: the speed function has nearest neighbor, which was then removed by Tang (1982). A unified treatment and much more material are presented in Yan, Chen and Ding (1982a, b). Based on the last papers, some related models were studied by Dai (1986), Dai and Liu (1986), Li (1983), Liu (1987), Ren (1983), Wu (1990) and Zeng (1983). In view of Theorem 11.10, one may ask for the inverse statement "$Q_e$ is a potential field $\Longrightarrow$ $\mathscr{R}^+(\Omega_e) \neq \emptyset$". The answer is negative. Refer to Li (1990) for a counterexample and related discussions.

Usually, interacting particle systems (or random fields) are ergodic in some sense in the supercritical region of a parameter but not in the subcritical region, because of phase transitions. Thus, any value of the parameter at which the process is ergodic provides an upper estimate for the critical value. This leads to the study of different types of ergodicity and their convergence rates. On the other hand, reversible systems are naturally connected with the $L^2$-theory, as we have seen in Chapters 6 and 9. Actually, the $L^2$-exponential convergence and related topics mentioned in Chapter 9 are now an active research field. Refer to Aizenman and Holley (1987), Holley (1984, 1985a, b), Holley and Stroock (1976, 1987, 1989), Liggett (1989, 1991), Zegarlinski (1990), Guionnet and Zegarlinski (2003), Martinelli (1999), Minlos (1996), Minlos and Trisch (1994), and Schonmann (1994).
Chapter 12
Yang-Mills Lattice Field

New developments in superconductivity lead us to study stochastic processes motivated by the Yang-Mills lattice field. In this chapter we first discuss the background. Then we study the subject by using two different tools, spin processes and diffusion processes. For spin processes, we begin with the construction of the processes. As usual, we also discuss the finite dimensional case, where there are no phase transitions. Furthermore, we show that reflection positivity holds and that some concrete models do exhibit phase transitions. For the diffusion processes, we use some differential geometry and the logarithmic Sobolev inequality to show the ergodicity for some models at high temperature. The subject treated in this chapter is only at its beginning and needs further study. Besides, since a complete exploration would be too heavy for the book, this chapter is less self-contained.

12.1 Background
In recent years, one of the most active research fields in physics has been superconductivity. The subject is growing rapidly. Now, ceramics are used instead of ferromagnetics, and the critical temperature is no longer very low but closer and closer to ordinary temperature. At the moment, we do not understand this phase transition phenomenon completely. However, there are still some common viewpoints.
(1) The ceramics used here should have crystal structure. Mathematically, this concerns some sort of groups.
(2) This kind of phase transition is mainly due to planar behavior. In other words, the interactions along each plane are more essential.
On the other hand, there are some mathematical tools to study phase transitions: random fields and interacting particle systems. For these, $\{0,1\}^{\mathbb{Z}^d}$ is often treated as the configuration space. This situation corresponds to the metallic phase transitions at low temperature. However, the above fact (1) suggests using some more complex groups instead of $\{0,1\}$ as our spin space. In view of (2), we may consider potentials that are determined by the configurations restricted to the unit squares in $\mathbb{Z}^d$. More precisely, let $X$ be a compact group and $\mathscr{L}$ be the set of all edges in $\mathbb{Z}^d$, and consider the Hamiltonian $H(x) = \sum_{\square} \mathrm{tr}\, x_\square$, where $x \in E = \prod_{\ell \in \mathscr{L}} X$ and $x_\square = x_{\ell_1} x_{\ell_2} x_{\ell_3} x_{\ell_4}$ if the unit square $\square$ has edges $\ell_1, \dots, \ell_4$ successively. See the next section for more careful definitions.
Now, take $X = SO(n)$ ($n \ge 2$). Assume that a copy $\mathbb{R}^n(u)$ of $n$-dimensional Euclidean space is attached to every point $u \in \mathbb{Z}^d$. Regard $x_\ell \in X$ ($\ell = (uv) \in \mathscr{L}$) as an isometry between $\mathbb{R}^n(v)$ and $\mathbb{R}^n(u)$, and regard the whole configuration $x = (x_\ell : \ell \in \mathscr{L})$ as a connection in the whole set of these spaces ($\mathbb{R}^n(u)$, $u \in \mathbb{Z}^d$). This interpretation shows that we are studying the Yang-Mills lattice field (cf. Sinai (1982), p. 6).
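A small numerical sketch may help to see why the plaquette term $\mathrm{tr}\, x_\square$ is a natural choice: for rotation matrices it does not depend on which edge of the square is taken first, nor on the direction in which the square is traversed (reversing the direction replaces each matrix by its inverse and reverses the order). The code below is an illustration only. The helper names are hypothetical, the matrices are products of planar rotations with random angles (not Haar-distributed), and the dimension 3 is chosen arbitrarily.

```python
import math
import random

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def planar_rotation(n, i, j, theta):
    """Rotation by theta in the (i, j) coordinate plane of R^n."""
    r = identity(n)
    r[i][i] = r[j][j] = math.cos(theta)
    r[i][j] = -math.sin(theta)
    r[j][i] = math.sin(theta)
    return r

def random_rotation(n, rng):
    """An element of SO(n) built from planar rotations (not Haar-distributed)."""
    r = identity(n)
    for i in range(n):
        for j in range(i + 1, n):
            r = matmul(r, planar_rotation(n, i, j, rng.uniform(0.0, 2.0 * math.pi)))
    return r

def plaquette_trace(z1, z2, z3, z4):
    """tr(z1 z2 z3 z4), the plaquette term for edges taken successively."""
    return trace(matmul(matmul(z1, z2), matmul(z3, z4)))

RNG = random.Random(0)
Z = [random_rotation(3, RNG) for _ in range(4)]

BASE = plaquette_trace(*Z)
# Start from the second edge instead of the first.
SHIFTED = plaquette_trace(Z[1], Z[2], Z[3], Z[0])
# Traverse the square in the opposite direction: inverses (= transposes), reversed order.
REVERSED = plaquette_trace(*[transpose(z) for z in reversed(Z)])

print(abs(BASE - SHIFTED) < 1e-9, abs(BASE - REVERSED) < 1e-9)
```

The first equality is the cyclic invariance of the trace; the second uses that the inverse of a rotation matrix is its transpose and that a matrix and its transpose have the same trace.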
12.2 Spin Processes from Yang-Mills Lattice Fields

Let $\mathbb{Z}^d$ ($d \ge 2$) be the $d$-dimensional lattice. For $u = (u^1, \dots, u^d) \in \mathbb{Z}^d$, we set $|u| = \big((u^1)^2 + \cdots + (u^d)^2\big)^{1/2}$. We turn the lattice $\mathbb{Z}^d$ into a graph, regarding the points $u, v \in \mathbb{Z}^d$ as neighbors if $|u - v| = 1$. We also call $(uv) = (vu)$ an edge of the graph if $u$ and $v$ are neighbors. Denote by $\mathscr{L}$ the set of all edges $\ell = (uv)$. Next, let $X$ be a compact group with Borel $\sigma$-algebra $\mathscr{B}$ and let $\lambda$ be a finite reference measure on $(X, \mathscr{B})$. Take $(E_\ell, \mathscr{E}_\ell) = (X, \mathscr{B})$, $\ell \in \mathscr{L}$, and set
$$E = \prod_{\ell \in \mathscr{L}} E_\ell, \qquad \mathscr{E} = \prod_{\ell \in \mathscr{L}} \mathscr{E}_\ell.$$
We consider the potential $\Phi : X^4 \to \mathbb{R}$ satisfying

Hypothesis 12.1.
(1) Continuity. $\Phi \in C(X^4)$.
(2) Cyclicity. $\Phi(z_1, \dots, z_4) = \Phi(z_2, z_3, z_4, z_1)$, $z_1, \dots, z_4 \in X$.
(3) Anti-invertibility. $\Phi(z_1, \dots, z_4) = \Phi(z_4^{-1}, \dots, z_1^{-1})$, $z_1, \dots, z_4 \in X$.

Given $\ell = (uv) \in \mathscr{L}$, for simplicity, assume that
for some j E (2,- . , d } . Denote by (C = 11, &2,!3, t 4 ) the counterclockwisely edges of 0 and set xg = ( x e l , .. . , 5 e 4 ) as usual. In this case, the edges in 9 are oriented by increasing directions. For example, (0, - . . ,0) + (1,0, . * . ,0) is the positive direction. Denote by -& the conversely oriented edge l?and set x-e = z;.’ Then the Hypotheses 12.1 (2) and (3) simply mean that @(xo) is independent of the starting edge and the orientation of the plate 0.In
12.2 SPIN
PROCESSES FROM
YANG-MILLS LATTICE
FIELDS
449
other words, our planar interacting potential Q, is expressed by means of the plates. Finally, let %?ye( E ) be the set of all continuous cylindrical functions on E and define a linear operator R : %ye ( E ) %ye ( E ) by ---f
where
Define r
r =(Y(W2) n,(q = SUP { If(.)
- f(Y)I
Z,Y
1
: el, e2 E
E E and
{
W )= f E W): lllflll =
Y),
.(el)
= Y(&) for all
cn,(4 <
e
el
# q,
m}.
e
By Liggett's theorem [Liggett (1985): Theorem 1.3.9], we have

Theorem 12.2. Under Hypothesis 12.1,
(1) the closure $\bar\Omega$ of $\Omega$ with domain $\mathscr{D}(E)$ is a Markov generator of a Markov semigroup $P(t)$.
(2) $\mathscr{D}(E)$ is a core for $\bar\Omega$.
(3) For $f \in \mathscr{D}(E)$, $\Delta_{P(t)f} \le e^{-\varepsilon t} \exp(t\Gamma)\, \Delta_f$.
(4) If $f \in \mathscr{D}(E)$, then $P(t)f \in \mathscr{D}(E)$ for all $t \ge 0$ and
$$|||P(t)f||| \le \exp[(M - \varepsilon)t]\, |||f|||.$$
450
Now, we introduce some typical models.
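Model 12.3 (Potts gauge model). Let $X = G$ be a finite group with order $n \ge 2$ and unit element $e$, and let $\lambda$ be the counting measure on $G$. Take
$$\Phi(x_\square) = \delta(\tilde x_\square, e) = \begin{cases} 1 & \text{if } \tilde x_\square = e,\\ 0 & \text{if } \tilde x_\square \neq e, \end{cases}$$
where $\tilde x_\square = x_{\ell_1} \cdots x_{\ell_4}$ if $\square = (\ell_1, \dots, \ell_4)$. In other words, $\Phi(x_\square)$ depends on $\square$ but is independent of the permutation of the edges in $\square$. Without any confusion, we will use the same notation $x_\square$ to denote $\tilde x_\square$. Thus, our operator $\Omega$ becomes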
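If $\ell_1 \neq \ell_2$ and $\ell_1$, $\ell_2$ belong to a common $\square$, then
$$\gamma(\ell_1, \ell_2) = (n - 1)(\exp[\beta] - 1) \exp[\beta(2d - 3)] =: c.$$
If $\ell_1 \neq \ell_2$ but $\ell_1$ and $\ell_2$ do not belong to a common $\square$, then $\gamma(\ell_1, \ell_2) = 0$. Thus
$$M = \#\{\ell_2 : \ell_2 \neq \ell_1,\ \text{there is a } \square \text{ such that } \ell_1, \ell_2 \in \square\} \cdot c = 6(d - 1)c.$$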
On the other hand,
E
= 2. Therefore, if
then the process is ergodic [cf. Theorem 12.2 (4) and Liggett (1985): Theorem 1.4.1]. As we will see later, this model exhibits a phase transition. Moreover, for a fixed dimension $d$, the critical temperature increases as the order $n$ of the group $G$ increases.
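The factor $6(d-1)$ in $M$ is a pure counting statement: a fixed edge lies in $2(d-1)$ unit squares, and each of these contributes three further edges. The following sketch verifies the count by enumeration and evaluates $M = 6(d-1)c$ with $c$ as given above. The representation of an edge as a pair (start point, direction) and all function names are choices made for this illustration only.

```python
import math

def shift(p, axis, step):
    """Move the lattice point p by `step` along the given axis."""
    q = list(p)
    q[axis] += step
    return tuple(q)

def plaquette_edges(base, i, j):
    """Edges (start point, direction) of the unit square spanned by e_i, e_j at base."""
    return {(base, i), (base, j), (shift(base, i, 1), j), (shift(base, j, 1), i)}

def edges_sharing_plaquette(d):
    """All edges other than (0, e_0) that lie in a common unit square with it."""
    origin = (0,) * d
    edge = (origin, 0)
    others = set()
    for j in range(1, d):
        # Two squares in the plane spanned by e_0 and e_j contain the edge.
        for base in (origin, shift(origin, j, -1)):
            square = plaquette_edges(base, 0, j)
            assert edge in square
            others |= square - {edge}
    return others

def potts_gauge_M(n, d, beta):
    """M = (number of edges sharing a square with a given edge) * c."""
    c = (n - 1) * (math.exp(beta) - 1.0) * math.exp(beta * (2 * d - 3))
    return len(edges_sharing_plaquette(d)) * c

for d in range(2, 6):
    print(d, len(edges_sharing_plaquette(d)), 6 * (d - 1))
```

For each dimension the enumerated count agrees with $6(d-1)$, so `potts_gauge_M` reproduces the expression for $M$ stated above.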
Model 12.4 (Yang-Mills lattice field). Take $X = SO(n)$ and take $\lambda(dx)$ to be the normalized Riemannian volume element of $SO(n)$:
where $K$ is the $n \times n$ matrix with $k_{ij} = -k_{ji}$, and
is the total volume of $SO(n)$. Again, take $\mathrm{tr}\, x_\square$ as our potential. Then
As above, for $\ell_1 \neq \ell_2$, we have
in the case of $\ell_1$ and $\ell_2$ being in a common $\square$. Otherwise, $\gamma(\ell_1, \ell_2) = 0$. Furthermore,
$$M = 6(d - 1)\, c(\beta, n).$$
However, for this model, we always have $\varepsilon = 0$. So Theorem 12.2 (4) gives us no information about the ergodicity of the process. The problem is that the topology (total variation norm) used in the theorem is too rough for this model. Certainly, we can replace $SO(n)$ by other groups, for example, $SU(n)$ or $USp(n) = SU(n) \cap Sp(n)$. From Hua (1963) and Gong (1983), it is known that
Moreover, we can replace $\mathrm{tr}\, x_\square$ by an arbitrary (but real) character of the groups. For the above gauge models, the interactions depend only on the four edges of a unit square. The next model has a few more interactions. To introduce the model, we need some notation. Recall that
where ei is the standard unit vector in E d having the i-th component 1. For = (v,v e j ) E 2, we write t ts k if
e = (u,u + ei), k
+
+
+
For given two edges C = (u,u e i ) , k = ( v ,'u c j ) E -2,we use C !Ik to dcnotc thc CELWS that i = j or i # j respectively.
11
k and
12 YANG-MILLS LATTICEFIELD
452
Model 12.5 (Shlosman model). Take d = 2 and X = SO(2). Now the 2 have orientation. For each .t E 9, define
edges in
and
Using Liggett's theorem, it is easy to prove that Theorem 12.2 holds for this operator 0. Again, in this case, E = 0. As we will see later, this model occurs phase transition. Next, we discuss the finite dimensional case. For simplicity, we study Models 12.4 and 12.5 only. Model 12.3 can be treated in the same way. Thus, we consider 2 as a finite set. More precisely, let Z$, be the factor group Z d / ( N Z d )introduced in Section 10.6. In an obvious way, we can Set. define 2 ~ EN = El, 8"= &. e€YN
e€ZN
Then, we can replace 2 by 2~and study the corresponding finite dimensional processes. Set
where
Define a conservative q-pair as follows: qN(Z, A)
lx
c(e,
=
IA\z( Z ( e ' z ' ) X ( d z ) ?
e€YN
qN(z) =~ N ( ~ , E N ) ,
x E EN, A
E &N.
Theorem 12.6. The jump process determined by the q-pair ( q N ( z )q,N ( zd, y ) ) (resp. R N ) is unique and reversible with respect t o the probability measure p N :
[
~ N ( d z= ) ~ X P-P where
AN(&)
=
net2NA(&,)
C
~(z,)
TN>n
and
2, is the
1
x N ( d z ) / ~ N ,
normalizing constant.
(12.5)
Proof: Since the $q$-pair is bounded, i.e., $\sup_{x \in E_N} q_N(x) < \infty$, we are now in a trivial case for uniqueness of the process. In order to prove the reversibility, it suffices to check
$$\int_A \mu_N(dx)\, q_N(x, B) = \int_B \mu_N(dx)\, q_N(x, A)$$
for all $A, B \in \mathscr{E}_N$. Because
is symmetric with respect to $A$ and $B$, the assertion holds. ∎
Theorem 12.7. The jump process p N ( t , x,dy) determined by ( q N ( x ) ,q N ( z , dy)) has uniquely an invariant probability measure p N . Proof: For Model 12.3, this result is trivial. Now, we consider Model 12.4. a) Define P N b , dY) = Q N ( Z 7 d Y ) l q N ( 4 * Since 0 < c q N ( z ) C < 00, there is an one-to-one correspondence between the invariant measures of p,(t,s,dy) and p N ( x ,dy) (cf. Theorem4.17):
<
<
9 ( P ~ ( t 3) )U N
t+
qN(z)vN(dz)/const.= ~ N ( d z E) ~ ( P N ) ,
where 9 ( P ~ ( t )is) the set of all PN(t)-invariant measures. Thus, we need only to show that I ~ ( P N )=I 1. b) Now we prove that pN(x,dy) satisfies the condition: there exists an e > 0 such that
X N ( A ) < E ==+ p f ( ~ , A ) > l - ~ , z E E . This implies the well-known Doeblin’s condition. Clearly,
Set Z(t,x) = C ( t , T ) /
c
c(t',x).
rE2N
Enumerate 3 , as 1 , 2 , - . . , M
=;
I.AY,vI arid rewrite EN 3 x = x1x x2 x . . x
: x M . Then
......
Thus
Take E = E I A d ( ~ >~ 0.) c) Next, we show that EN is the only minimal invariant set. Let A be an invariant set md suppose that XN(A)< 1. Then, as we have proved above,
which is a contradiction to the definition of invariant sets.
d) Suppose that $E_N$ has a cyclical decomposition $C_1, \dots, C_m$ such that
If m > 2, then
But (C) > 0 implies that for every
> 0,
a.e. II: on Cj.
This is a contradiction. Combining b)-d) with Doob (1953) [Chapter 5, Section 5] or Meyn and Tweedie (1993b) [Theorem 16.2.3], it follows that $p_N(x, dy)$ is indeed exponentially ergodic. ∎

Finally, we discuss briefly the reflection positivity and phase transitions. We mainly deal with Models 12.3 and 12.5. Some results still hold for more general cases including Model 12.4. Our goal is to reduce the time-continuous case to the time-discrete case. For this, we need some preparations.
Lemma 12.8. For every N 2 1, p N defined by (12.5) is a RP-state with respect t o all r E Proof: By Hypothesis 12.1,
for all T E 0 . Our assertion fallows from Lemma 10.24 immediately. In what follows, we will fix a translation invariant boundary Z E E and regard p 2 N as a state on ( E ,8 ) by fixing the mass off ~ ( V - N at )2 , where
Let 9 2 denote the set of weak limits of { p f N : N 2 11,where pgN = p 2 N is defined by (12.5). From now on, we rewrite P ( t ) obtained by Theorem 12.2 as Pp(t). Similarly, we: use P{(t) to denote the finite dimensional jump process P N ( ~studied ) above. Lemma 12,9. Each [LO E 9;is a reversible probability measure of Pp(t) and tra nsia tion invariant .
1 1 1
fn!NgddN
That is
=
fflPgdp[N =
1
gnfNf
J
+ po.
Then,
d&N.
SflpfdpgN.
By Hypothesis 12.1, one can pass the limit to get
joflgdp@=
J go@jdpP.
8,
Since 59g.t ( E ) is a core of we have proved the first assertion. The second one is easy. Now, we set 90= the set of invariant probability measures of P p ( t ) . L e m m a 12.10. Let pp E 3, be translation invariant and f be non-negative, 8,f = f , pO-a.s., where 8,(u E Z d ) is the usual translation operator on Zd. Denote by
f:
f=
1
f(y)Pp(., d d ,
Po-=.
where pp is the probability kernel induced by Pp(t,2,d y ) . Assume t h a t 0. Then the probability measure v, defined by
@(f) >
is in Y p and is again translation invariant. Proof: cf. Tang (1984).
L e m m a 12.11. Let pp E 90, ,B 0 and ,O + ,& < 00. Then there exists a and ,dm +-ppOas m 00. Moreover, subsequence ppm such that ,Om -+ --f
ppo E
,a,,.
Proof: Since E is compact and
by Liggett (1985) [Proposition 1.2.131, the assertion follows from the fact: flpf E 59yt ( E ) for all f E 59y.t ( E ) and /3 + Op+f is continuous in C ( E ) for each fixed f E 59g.t ( E ) . I
Lemma 12.12. Suppose t h a t
X
Model 12.4, respectively. If ptNk B G Vy!?(E).
= G or
=+
X
SO(n) given in Model 12.3 and p p as k -+ 00, then p t N , ( R ) ---t pB(R)
for all
Proof: For the first model, every $B \in \mathscr{E}_{\mathrm{cyl}}(E)$ is not only open but also closed, so the conclusion is trivial. For the second model, the proof is similar to those given in [Preston (1976), Lemmas (3.1) and (3.2)]. ∎

We are now in a position to discuss the phase transitions for the above models. First, for Model 12.5, we regard each edge as a point at the center of the edge. This gives us a dual lattice, and hence an interacting particle system on the dual lattice. Actually, this particle system is a modification of Example 10.30. Thus, by using the above lemmas and the proofs of Theorem 10.29 and Example 10.30 with a slight modification, it is not difficult to prove the following result.

Theorem 12.13. For Model 12.5, there exist two distinct reversible measures at sufficiently low temperature ($= 1/\beta$).
The reflections used for this model are in the planes $u^j = 0, \pm 1/2, \pm 1, \pm 3/2, \dots$, $j = 1, 2, \dots, d$.

Similarly, combining the above lemmas with the proof of Shlosman (1986) [Theorem 8.1], we obtain

Theorem 12.14. For the Potts gauge model, when $d \ge 3$, we can find an $n(d) < \infty$ such that if $n \ge n(d)$, then there exists a value $\beta_c(n)$ at which the model possesses two distinct translation invariant reversible measures.

From the paper quoted above, it is also known that $\beta_c(n)$ increases as $n$ increases.
12.3 Diffusion Processes from Yang-Mills Lattice Fields

In the study of statistical physics, there are several ways to describe the systems. For example, we can use spin processes or diffusions as dynamic systems to describe the same Gibbs states. In this section, we use some differential geometry, in particular the logarithmic Sobolev inequality, to study the ergodicity at high temperature for the diffusion processes motivated by the models discussed in the last section. First of all, we recall the stochastic Ising models studied by Holley and Stroock (1987). In the study of the Yang-Mills lattice field, we use edges instead of vertices. We will use the notations $\mathscr{L}$, $\ell \leftrightarrow k$ and so on introduced in the last section.
Next, let $(M, r)$ be a compact, oriented, $C^\infty$-Riemannian manifold of dimension $n$, $\lambda$ be the normalized volume element on $M$, $\mathscr{X}(M)$ be the set of all $C^\infty$-vector fields, and $E = M^{\mathscr{L}}$ be the infinite product manifold with Borel $\sigma$-algebra $\mathscr{E}$. For any $\emptyset \neq \Lambda \subset \mathscr{L}$, let $\pi_\Lambda$ denote the natural projection of $E$ onto $M^\Lambda$, and $\mathscr{E}_\Lambda$ denote the product Borel field of $M^\Lambda$. When $\Lambda = \{k\}$, we simply write $\pi_k$ instead of $\pi_\Lambda$. Set
$$C^\infty_\Lambda(E) = \{f \in C^\infty(E) : f \text{ depends on the coordinates in } \Lambda \text{ only}\}, \qquad \mathscr{D}(E) = C^\infty(E) \cap \mathscr{C}_{\mathrm{cyl}}(E).$$
{@A : 0
#A
G
2') satisfying Hypothesis 12.15. QA E C r ( E )for all A: 0 # A G 9. (2) For every k E 9, l{A : k E A, @A # O}] < 00. (3) There exists a constant b < 00 such that for any X E X ( E ) ,
(1)
where
xA3k(P~,
Let HI, = then ffk E a ( E ) , k: E differential operator L as follows.
2.Define a second-order
where diVk and gradk are the divergence and gradient operators on the k-th manifold respectively. For each 8 # A C 9 and xA E M A ) gAc E MA', define
Recall that g E P ( E ) is said to be a Gibbs state with potential @ if for each 8 # A c 3,
is a regular conditional probability distribution 011 M A of 9 given Y(@) the set of all Gibbs states with potential @. For f = ( u ,u ei), k = ( u ,u + e j ) E 2, set
Ue-
note by
&,A..
+
I -k
= (u- w,u - 21
+ e, - e 3 ) .
Theorem 12.16 (Holley and Stroock (1987)). Let Ric denote the Ricci curvature tensor for ( M ,r ) . Suppose that (1) there is a constant [Y > 0 such t h a t Ric 2 or; (2) there is a mapping p : Z d x Zd --+ R+ such that C k E 2 p ( k ) < a and lHess(@A)(grdcf,grad!f)l
mu) for all
P(k
-
e> I l p a d k f l I \lgad!fil *
k , l E 9and f E g ( E ) .
Then $|\mathscr{G}(\Phi)| = 1$ and the corresponding $L$-diffusion process is ergodic.

Now, we recall the two models studied in the last section with a slight extension. Let $M = SO(n)$ with $n \ge 3$, and let $\square = (\ell_1, \ell_2, \ell_3, \ell_4)$ denote a unit square on $\mathbb{Z}^d$ with four edges $\ell_1 < \ell_2 < \ell_3 < \ell_4$, where $<$ is the lexicographical order.
Modcl 12,17. Take
i
/3trxD if F = 0 = (&1:&,&,!4) Phi if F = {f} otherwise, z f E , 0,
QF(z) = where
p > 0 and he
is a constant for each l E
3.
Model 12.18. Take
otherwise,
where p
> 0 and hl
x
E
E,
are the same as above.
Finally, the main results of this section are the following.
Theorem 12-19. For Model 1217, if n-2
o<
'
4(2d2 - 1 ) J m '
then there exists precisely one Gibbs state and the corresponding diffusion process is ergodic.
Theorem 12.20. For Model 12.18, if n-2
+ 2(d2 - 1)n+ d(d - 1)d-I
< 8[(d2+ 2d - 2)J-
'
then the same conclusion of Theorem 12.19 holds.
To prove the above results, we need to compute the Ricci curvature tensor on $SO(n)$ and make some preparations. From now on, the indices $i$, $j$, $s$ and $t$ are assumed to be in $\{1, 2, \dots, n\}$. Let $e$ be the unit element in $SO(n)$. Then for each $X \in \mathscr{X}(SO(n))$, there exists precisely one skew symmetric matrix $A = (a_{ij})$ such that
$$X_e = \sum_{i,j=1}^n a_{ij} \frac{\partial}{\partial x_{ij}}.$$
Proposition 12.21. Every left-invariant vector field X can be represented as follows:
Proof: For z = ( z i j )E SO(n), define fij(x) = zij, i , j = 1,.. . , n. Suppose that
. .
Then x b f i j = bij, z , j = 1 , . . . , n. On the other hand, for g i j ( z ) := f i j ( b z ) , a we have Xb f i j = x e g i j = akjbik. Therefore x b= bikakj 60 For i < j , let Eij be the left-invariant Coo-vector field such that ( E i j ) e= a/axij - d / d x j i . Then by Proposition 12.21, we get
cr=,
z;j,k=l
Let $F_{ij} = E_{ij}/\sqrt{2}$. It is easy to prove that $\{F_{ij} : i < j\}$ is a left-invariant orthonormal basis of $\mathscr{X}(SO(n))$. It is well known that $SO(n)$ is an Einstein manifold, which means that there exists a constant $c$ such that $\mathrm{Ric} = c\tau$, where $\tau$ is the Riemannian metric on $SO(n)$. Now we compute the constant $c$.

Lemma 12.22. For $SO(n)$, we have $\mathrm{Ric} = (n - 2)\tau/4$.
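The constant in Lemma 12.22 can be sanity-checked numerically. The sketch below is not part of the proof. It assumes the inner product $\langle A, B\rangle = \mathrm{tr}(A^{T}B)$ on skew-symmetric matrices (under which the $F_{ij}$ are orthonormal) and the standard identity $\mathrm{Ric}(X, X) = \frac14 \sum_k |[X, F_k]|^2$ for a bi-invariant metric, which is a consequence of $\nabla_X Y = [X, Y]/2$. All function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def basis(n):
    """Matrices F_ij = (e_ij - e_ji)/sqrt(2), i < j (assumed orthonormal for tr(A^T B))."""
    out = []
    for i, j in combinations(range(n), 2):
        m = [[0.0] * n for _ in range(n)]
        m[i][j] = 1.0 / sqrt(2.0)
        m[j][i] = -1.0 / sqrt(2.0)
        out.append(m)
    return out


def matmul(a, b):
    """Plain product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(a, b):
    """Lie bracket [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def norm2(a):
    """Squared norm tr(a^T a), i.e. the sum of squared entries."""
    return sum(v * v for row in a for v in row)


def ricci_diag(n):
    """Ric(F, F) for every basis element F, computed as (1/4) * sum_k |[F, F_k]|^2."""
    fs = basis(n)
    return [0.25 * sum(norm2(bracket(x, f)) for f in fs) for x in fs]


for n in (3, 4, 5):
    # Each printed pair should agree up to rounding: Ric(F, F) versus (n - 2)/4.
    print(n, ricci_diag(n)[0], (n - 2) / 4)
```

Under these assumptions every diagonal value agrees with $(n-2)/4$ up to rounding, which matches the Einstein constant stated in the lemma.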
12.3 DIFFUSIONPROCESSES FROM YANG-MILLS LATTICEFIELDS 461
Proof: For any i
<j
and s
< t , we have
Here in the last step, we have used the fact that $\nabla_X Y = [X, Y]/2$ for a bi-invariant Riemannian metric and left-invariant vector fields $X$, $Y$. On the other hand, it is easy to check that for $i \ne j$ and $s \ne t$,
Etj E,i
if s = i if t = j
Ejs if i = t 0, otherwise. Thus, for i
< j and s < t we have [ [Eij Est] Eij] )
7
{
ESt if I{i,j} 0,
f l {s,t}l = 1
otherwise.
Hence
Lemma 12.23. For $A = (a_{ij})$ and $B = (b_{ij}) \in SO(n)$, we have

Proof: Here we prove (1) only since the proof of (2) is similar. Set
Then the left-hand side of part (1) is equal to I + II + III. But
n
n
s < t i,j=l
s < t i=l
n
a=
1
i,s=l
n
n
i<j s,t=l
i<j s=l
n
n
s
s
n
n
i=l
s=l i < j
s
n
n
= 2 C ( A B ) , ", 2
C i,s=l
i=l
Therefore
$$\mathrm{I} + \mathrm{II} + \mathrm{III} = n^2 - 2n + [\mathrm{tr}(BA)]^2 \le 2n(n - 1). \qquad ■$$

Proof of Theorem 12.19: We prove the theorem in three steps.

a) For $x \in E$, let $x_\ell = (x^\ell_{ij})$ denote the $\ell$-th component. Then for $i < j$, $s < t$ and $\square = (k, \ell, \ell_1, \ell_2)$, we have
$Fftaa(x) = /3F$F$tr
(xkzp~e,xe,)
'I(Fz",Z k ) ( F f tze) 4 =2 P [ ~ej s ( x e l s e 2 x k ) t+i z jet ( Z e 1 z e 2 ~ k ) s i
= P tr
Ze,
- Z i s ( Z e , x e 2 x l c ) t j - Z ei t ( Z p l z e 2 Z l i ) s j ] .
<
12.3 DIFFUSIONPROCESSES FROM YANG-MILLS LATTICEFIELDS 463 By Lemma 12.23, we obtain y p ; F , e , @ O ( x ) ] 26 zP2n ( n - l),
z E E.
(12.6)
i<j s
In the same way, it can be proved that (12.6) holds for other unit squares containing k and C. b) For I7 = (t,t , ,!, &), i < j , s < t , and z E E , we have
where b = xe,xe2xe3xe,and e is the unit matrix. Applying Lemma 12.23 to A = b and B = e , we get (12.6) for the unit square 0.Then we have
c) Finally, define a mapping p from Zd x Zd into R+ as follows:
where A = (&(ei,ej), (0,ei - e j ) ,(ei - ej, 0 ) ,f ( e i , ei) : 0 < i # j 6 d } . It is easy to check that ]A1 = 2d(2d - 1)) and k - !E A if k and !are two edges of a unit square. Therefore, for f E g ( E ) and k,! E 2, we have
On the other hand,
Now, from Theorem 12.16, the assertion follows immediately. Proof of Theorem 12.20: Let !, k E 2 'and !f-) k .
I
a) If
t Ik, then for i < j and s < t , we have
where
By Lemma 12.23, we have i < j s
Assume that
i < j s
Fidj=; -Fji.
Then
.
,..-..-
rT7
Therefore
b) For
by Lemma 12.23, we have
and
since
..
.
12
Therefore
c ) Let l E 2. For k
I[ t, the proof b) gives us
,
n
On bhe other harid: for k It, i < j and
where
for
s
< t,
By Lemma 12.23,
Therefore
d) For u , v E Ed, set
I0,
otherwise,
where
It is easy to check that, €or k
H
1,
Hence, for k , t E 9and f E 9 ( E ) , we have
and
Finally, by Theorem 12.16, we obtain Theorem 12.20. ■

12.4 Notes

Section 12.2 is taken from Chen (1991g). Section 12.3 is joint work with F. Y. Wang and is published here for the first time. Refer to Wang (1996a,b) for more results.
PART IV
NON-EQUILIBRIUM PARTICLE SYSTEMS
Chapter 13

Constructions of the Processes

Throughout this part, suppose that $S$ is a countable set. For each $u \in S$, let $(E_u, p_u, \mathscr{E}_u)$ be a complete separable metric space, where $\mathscr{E}_u$ is the $\sigma$-algebra generated by the metric $p_u$. Denote by $(E, \mathscr{E})$ the usual topological product space of $(E_u, p_u)$ $(u \in S)$. A particular type of particle system we will deal with is the reaction-diffusion process, which is motivated by the study of non-equilibrium statistical physics. For these processes, $S = \mathbb{Z}^d$ and $E_u = \mathbb{Z}_+$ $(u \in S)$. Then $E = \mathbb{Z}_+^S$ is neither locally compact nor $\sigma$-compact. As we will see later, the operators of the processes are not locally bounded and moreover, as typical models for non-equilibrium particle systems, the processes are usually not reversible. These facts tell us that the reaction-diffusion processes are quite different from the equilibrium particle systems studied in Part III and are much harder to handle.

This chapter is organized as follows. In Section 13.1, we prove two existence theorems for the processes and then apply them to the reaction-diffusion processes in Section 13.2. The uniqueness problem is studied in Section 13.3. As applications of these results, fifteen examples are introduced with some comments in Section 13.4. An integration by parts formula is proved in the Appendix.
13.1 Existence Theorems for the Processes
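For $\Lambda \subset S$ and $x := (x_u : u \in S)$, $y := (y_u : u \in S)$, define $E^\Lambda = \prod_{u \in \Lambda} E_u$, $\mathscr{E}^\Lambda = \prod_{u \in \Lambda} \mathscr{E}_u$ and $p_\Lambda(x, y) = \sum_{u \in \Lambda} p_u(x_u, y_u) k_u$, where $(k_u)$ is a positive sequence on $S$. For simplicity, we also use the notations:
$$p_u(x) := p_u(x_u, \theta_u), \qquad p_\Lambda(x) := p_\Lambda(x, \theta), \qquad p(x, y) := p_S(x, y), \qquad p(x) := p_S(x, \theta),$$
where $\theta := (\theta_u : u \in S)$ is an arbitrary but fixed reference point. Set
$$E_0 = \{x \in E : p(x) < \infty\}$$
and denote by $\mathscr{E}_0$ the $\sigma$-algebra on $E_0$ induced by $\mathscr{E}$. Next, denote by $x^\Lambda$ the restriction of $x$ on $\Lambda$. Certainly, $E^\Lambda$ can be naturally embedded into $E$ by identifying $y \in E^\Lambda$ with $y \times \theta^{S \setminus \Lambda}$. Given transition probabilities $P_k(t, x, \cdot)$, $k = 1, 2$ and $\Lambda \Subset S$, let $W_\Lambda(P_1(t, x, \cdot), P_2(t, x, \cdot))$ denote the minimum $L^1$-distance of the projections on $(E^\Lambda, \mathscr{E}^\Lambda)$ of $P_k(t, x, \cdot)$, $k = 1, 2$ with respect to $p_\Lambda$.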
Let $\{\Lambda_n\}_1^\infty$ be a fixed sequence of finite subsets of $S$ such that $\Lambda_n \uparrow S$ and let $(q_n(x), q_n(x, dy))$ be a sequence of regular $q$-pairs on $(E^{\Lambda_n}, \mathscr{E}^{\Lambda_n})$. Sometimes, it is convenient to regard $(q_n(x), q_n(x, dy))$ as a $q$-pair on the whole space $(E, \mathscr{E})$:
$$q_n(x) = q_n(x^{\Lambda_n}), \qquad q_n(x, A) = q_n(x^{\Lambda_n}, A(x^{S \setminus \Lambda_n})),$$
where $A(z) = \{y \in E^{\Lambda_n} : y \times z \in A\}$. Similarly, we can regard the $q$-process $P_n(t)$ determined by $(q_n(x), q_n(x, dy))$ as a $q$-process on $(E, \mathscr{E})$. Again, the operator corresponding to $(q_n(x), q_n(x, dy))$ is denoted by $\Omega_n$. The problem we are interested in is to find a limiting process of the $q$-processes determined by $(q_n(x), q_n(x, dy))$. Before moving to details, let us point out that the key to our constructions is the following pair of estimates:

(1) $P_n(t)p(x) \le (1 + p(x))e^{ct}$, $x \in E_0$, and
(2) $W_V(P_n(t, x, \cdot), P_m(t, x, \cdot)) \le c(V, x; n, m)$, $x \in E_0$,

where $c$ is a constant independent of $n$, $V \Subset S$ (or $V = \Lambda_n$) and $c(V, x; n, m) \in \mathbb{R}_+$ satisfies $\lim_{m \ge n \to \infty} c(V, x; n, m) = 0$.

Condition (2) shows that $\{P_n(t, x, \cdot) : n \ge 1\}$ is a Cauchy sequence in the $W_V$-metric. Note that in general, our operators are not locally bounded and particles from infinitely many sites may move to a single site, so the process may explode at some single site. This explains why we use $E_0$ instead of $E$. Then, the first moment condition (1) ensures that $E_0$ is a closed set of the process. Finally, in order to prove that the limiting process satisfies the CK-equation, some kind of uniform control in the second condition is also needed. We now state our first existence theorem for the processes.
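The first moment estimate can be illustrated on a toy chain. The following is only a sketch under stated assumptions: a single site with state space $\mathbb{Z}_+$, $p(x) = x$, and a pure-birth chain jumping from $x$ to $x + 1$ at rate $c(1 + x)$, so that $\Omega p = c(1 + p)$ holds with equality. The chain is truncated at a hypothetical level `n_max` and its forward equation is integrated by an explicit Euler scheme; neither the truncation nor the scheme comes from the text.

```python
import math


def mean_linear_birth(x0, c, t, n_max=120, dt=1e-3):
    """Euler-integrate the forward equation of a truncated pure-birth chain.

    Rate x -> x + 1 is c * (1 + x) for x < n_max; no jump leaves n_max.
    Returns the mean of the state at time t when started from x0.
    """
    pi = [0.0] * (n_max + 1)
    pi[x0] = 1.0
    steps = int(round(t / dt))
    for _ in range(steps):
        new = pi[:]
        for x in range(n_max):
            flow = dt * c * (1 + x) * pi[x]   # mass moving from x to x + 1
            new[x] -= flow
            new[x + 1] += flow
        pi = new
    return sum(x * w for x, w in enumerate(pi))


x0, c, t = 2, 1.0, 1.0
mean = mean_linear_birth(x0, c, t)
bound = (1 + x0) * math.exp(c * t) - 1     # (1 + p(x)) e^{ct} - 1
print(mean, bound)
```

In this toy case the computed mean stays below the bound $(1 + p(x))e^{ct} - 1$ and is close to it, which is what one expects when the moment condition holds with equality.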
Theorem 13.1. Suppose that the following conditions hold.

(1) The first moment condition. There exists a constant $c \ge 0$ such that
$$\Omega_n p \le c(1 + p), \qquad n \ge 1. \tag{13.1}$$

(2) Lipschitz condition. For every $n$ and $m$, $1 \le n \le m$, there exists a coupling $\widetilde{\Omega}_{n,m}$ of $\Omega_n$ and $\Omega_m$ such that
$$\widetilde{\Omega}_{n,m}\, p_w(x_1, x_2) \le \sum_{u \in \Lambda_n} c_{uw}\, p_u(x_1, x_2) + c_w(n, m)(1 + p(x_1) + p(x_2)), \qquad w \in \Lambda_n,\ x_1, x_2 \in E_0, \tag{13.2}$$
where the non-diagonal elements of $(c_{uw} : u, w \in S)$ and the elements of $c_w(n, m)$ $(w \in \Lambda_n,\ m \ge n \ge 1)$ are non-negative, and moreover
where $C_n(t) = \exp[tC_n^*]$, $C_n^*$ is the transpose of $C_n = (c_{uv} : u, v \in \Lambda_n)$. Then there exists a Markov process with state space $(E_0, \mathscr{E}_0)$ and transition probability function $P(t, x, \cdot)$ such that for each $\Lambda \Subset S$,
$$\lim_{n \to \infty} W_\Lambda(P_n(t, x, \cdot), P(t, x, \cdot)) = 0, \qquad t \ge 0,\ x \in E_0, \tag{13.4}$$
uniformly in $x \in E_0^N := \{x \in E_0 : p(x) \le N\}$. Furthermore, for fixed $t$, $P(t, x, \cdot)$ is continuous in the following sense: if $x_n, x \in E_0$, $n \ge 1$, $\sup_n p(x_n) < \infty$ and $\lim_{n \to \infty} p_u(x_n, x) = 0$ for every $u \in S$, then
$$\lim_{n \to \infty} W_\Lambda(P(t, x_n, \cdot), P(t, x, \cdot)) = 0 \tag{13.5}$$
for every $\Lambda \Subset S$.
By Theorem 5.36, one may replace $\widetilde{\Omega}_{n,m}$ on the left-hand side of (13.2) by the $p_w$-optimal coupling of $\Omega_n$ and $\Omega_m$.
Corollary 13.2. If
lim
man+w
cu(n,m)= 0,
SUP
uE S
and
C U ( ~ , m ) + s u P ~ l C u< S /cx;:, 2’
man,uEA,
(13.6)
then condition (13.3) holds. If moreover c u ( n ,n)G 0 and supu C, /cuv/ < m, then condition (13.1) can be replaced by
%P(@)
n
00,
where c 2 0 is a constant, independent of
U
3 1,
(13.7)
TI.
Proof: The first assertion is clear. To prove the second one, note t.hat
,.,
%,nPw(Q,z)
<
c
c,wpu(@,z),
w
E
An, z f EO,n 2 1
uEAn
by condition (2). Let (@n,m(Xl, z,), i j n , m ( ~ lz2; , dy,, dy,)) be the coupling Then q-pair corresponding to the coupling operator
-
Hence
and so
This certainly implies condition ( 1 ) . See Remark 13.13 for further discussion about condition (13.6).
Remark 13.3. The proof below shows that the theorem actually works for more general function p , not necessarily having the form p ( x ) = C , p u ( z ) . What we need is the following (1) p E 8 , , p < 00 on each E*n, n 2 1. (2) For each d E [0,oa) and n 2 1, ( 5 E E : p(znri> > d } is open in E . (3) For each II: E E , p ( x A T L7)p ( 2 ) as n f 00.
In order to prove Theorem 13.1, we need some preparations. Lemma 13.4 (Estimate of the first moment). For the g-process Pn(t) determined by (gn(z), qn(x,d y ) ) , condition (13.1) implies that
~ , ( t ) p< (1+p)ect - 1. Proof: Use Lemma 4.13. The next result is our main estimate. Lemma -13.5. Under the assumptions of Theorem 13.1, we have
where
t 2 0, ~ 1 , 1 c 2E Eo, 'W E A,, m 2 TZ 2 1 Pn,m(l: x l ,q,; dyl, dyZ) is the q-process determined by
-
-
(13.8)
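Before proving this lemma, let us discuss how to get the estimate. To do so, we still need two lemmas.

Lemma 13.6 (Gronwall's Lemma). Let $\Phi, W : [0, c]$ (resp. $[0, \infty)$) $\to \mathbb{R}_+^d$ and $\Psi : [0, c]$ (resp. $[0, \infty)$) $\to \mathbb{R}_+^d \times \mathbb{R}_+^d$ be Lebesgue measurable. Suppose that $\Psi(s)\Psi(t) = \Psi(t)\Psi(s)$ and $\int_0^t \Psi(s)W(s)\,ds < \infty$. If
$$W(t) \le \Phi(t) + \int_0^t \Psi(s)W(s)\,ds, \qquad t \in [0, c],$$
or
$$\frac{d}{dt}W(t) \le \frac{d}{dt}\Phi(t) + \Psi(t)W(t), \quad t \in [0, c), \qquad W(0) \le \Phi(0), \tag{13.9}$$
then
$$W(t) \le \Phi(t) + \int_0^t \exp\Big[\int_s^t \Psi(u)\,du\Big] \Psi(s)\Phi(s)\,ds, \qquad t \in [0, c]. \tag{13.10}$$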
Moreover, if one of the signs of equality in (13.9) holds, then so does (13.10). More precisely, the solution to each of the equations (13.9) (not inequality!) is unique, denoted by $W^*(t)$, and furthermore, for any solution $W(t)$ to (13.9) we have $W(t) \le W^*(t)$ for all $t \in [0, c]$.

Proof: We consider the integral form in (13.9) only since the differential form can be reduced to it. Define $Y(t) = \int_0^t \Psi(s)W(s)\,ds$. Then $Y(t)$ is absolutely continuous. Since $\Psi \ge 0$, by (13.9), we obtain
$$Y'(t) - \Psi(t)Y(t) \le \Psi(t)\Phi(t).$$
Set $Z(t) = \exp\big[-\int_0^t \Psi(s)\,ds\big] Y(t)$. Then the above inequality is equivalent to
$$Z'(t) \le \exp\Big[-\int_0^t \Psi(s)\,ds\Big] \Psi(t)\Phi(t).$$
Noticing $Z(0) = 0$ and the absolute continuity of $Z(t)$, we see that
$$Z(t) \le \int_0^t \exp\Big[-\int_0^s \Psi(u)\,du\Big] \Psi(s)\Phi(s)\,ds.$$
Next, by the definition of $Z(t)$ and the commutativity $\Psi(s)\Psi(t) = \Psi(t)\Psi(s)$, we have
$$Y(t) \le \int_0^t \exp\Big[\int_s^t \Psi(u)\,du\Big] \Psi(s)\Phi(s)\,ds.$$
Thus, by the definition of $Y(t)$ and condition (13.9), we get
$$W(t) \le \Phi(t) + Y(t) \le \Phi(t) + \int_0^t \exp\Big[\int_s^t \Psi(u)\,du\Big] \Psi(s)\Phi(s)\,ds.$$
This proves (13.10). Finally, if the integral form in (13.9) is an equality, then every inequality in the proof becomes an equality. ■
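A small numerical check may help to see the lemma at work. The sketch below treats only the scalar case $d = 1$ and uses the standard scalar Gronwall bound $\Phi(t) + \int_0^t \exp[\int_s^t \Psi]\,\Psi(s)\Phi(s)\,ds$ as its reading of (13.10); that reading, the quadrature schemes, and the choice $\Phi(t) = 1 + t$, $\Psi \equiv 1/2$ are all assumptions made for illustration. For this choice the equality case has the closed-form solution $W(t) = 3e^{t/2} - 2$.

```python
import math


def solve_volterra(phi, psi, t_end, n=2000):
    """Left-endpoint scheme for W(t) = phi(t) + int_0^t psi(s) W(s) ds; returns W(t_end)."""
    h = t_end / n
    integral = 0.0
    w = phi(0.0)
    for k in range(n + 1):
        w = phi(k * h) + integral        # W at the grid point t_k
        integral += h * psi(k * h) * w   # extend the integral up to t_{k+1}
    return w


def gronwall_bound(phi, psi, t_end, n=2000):
    """Midpoint rule for phi(t) + int_0^t exp(int_s^t psi) psi(s) phi(s) ds at t = t_end."""
    h = t_end / n
    mids = [(k + 0.5) * h for k in range(n)]
    cum = [0.0]                          # cum[k] approximates int_0^{t_k} psi
    for s in mids:
        cum.append(cum[-1] + h * psi(s))
    total = 0.0
    for k, s in enumerate(mids):
        cum_s = 0.5 * (cum[k] + cum[k + 1])   # approximates int_0^s psi at the midpoint
        total += h * math.exp(cum[-1] - cum_s) * psi(s) * phi(s)
    return phi(t_end) + total


def phi(t):
    return 1.0 + t


def psi(t):
    return 0.5


t_end = 2.0
w_num = solve_volterra(phi, psi, t_end)
bound = gronwall_bound(phi, psi, t_end)
exact = 3.0 * math.exp(0.5 * t_end) - 2.0
print(w_num, bound, exact)
```

In the equality case the bound coincides with the solution, so the three printed numbers should agree up to discretization error, with the left-endpoint value slightly below the other two.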
Lemma 13.7. Let ( q ( x ) ,q(s,d y ) ) be a q-pair on an arbitrary space ( E ,€'I and let T , k E &/B(R$). If s1 r' 6 h on E , then Pmin(t) T r+J: Pmin(s)hds.
<
Proof: Let EN = (x 6 E : y(x) v r ( 2 ) < N } and set
Then ( q N ( x )qN(J:, , dy)) is a bounded q-pair on ( E ,S), the corresponding q-process is denoted by P N ( t ) .Moreover, P N ( t )= I J',"P N ( s )R,ds. The key point we choose this approximation is the following: for N large enough so that J: E E N , we have
+
which follows from the first successive approximation scheme for Pmin(t) (cf. Lemma 5.18). Since R N r N R N r h on EN; we have P N ( t )T N TN J: ~ , ( s ) h d s . Hence, for n N ,
< <
<
PN(s)hds ,< T N
+
+
P N ( ~T , ) < T N
+
It
l
Ymi"(S)hdS
On E N .
Letting N --3 00 and then n t 00, we obtain the required assertion. We now return to the main estimate. Fix n m and define the following column vectors:
<
By Lemma 13.7 and (13.2),
Here in the last step, we have used Lemma 13.4. Now, if $C_n$ is non-negative, then we can apply Gronwall's Lemma to obtain the desired estimate:
$$W(t) \le C_n(t)P + \int_0^t C_n(t - s)\Phi(s)\,ds =: F(t).$$
Having this estimate in mind, it is now easy to prove our main estimate even for more general C, which is not necessarily non-negative.
Proof of Lemma 13.5: Note that for F ( t ) just defined above,
On the other hand, since Cn(t)is non-negative, by assumptions, we have
d dt
= --F(t;
2 1 , z2).
The assertion now follows from Lemma 4.12. 1 Proof of Theorem 13.1: a) Applying Lemma 13.5 to z = z1 = z2, it follows that
t 3) ,0,z E EO,WE A,,m W~,(t;z,z),<2(l+p(z))ectcw(t;n,m
2 n 2 1.
Hence
as
By Theorem 5.4, there exists
such that
and moreover, the convergence is uniformly in for we have
On the other hand,
This shows that for fixed $t$ and $x$, $\{P_\Lambda(t, x, \cdot) : \Lambda \Subset S\}$ is consistent. So by the Kolmogorov extension theorem, there exists a unique $P(t, x, \cdot) \in \mathscr{P}(E)$ such that the projection of $P(t, x, \cdot)$ on $(E^\Lambda, \mathscr{E}^\Lambda)$ coincides with $P_\Lambda(t, x, \cdot)$ $(t \ge 0,\ x \in E_0)$. According to Theorem 5.6, $P_n(t, x, \cdot)$ converges weakly to $P(t, x, \cdot)$. The topology on $E$ used here is the usual product topology.

b) Prove that $P(t, x, E_0) = 1$ for $t \ge 0$ and $x \in E_0$, and $P(t, \cdot, B) \in \mathscr{E}_0$ for each $B \in \mathscr{E}_0$. Noting that by Lemma 13.4, we have
a)
(1
+ p ( x ) ) e c t- 1 2
lim ~ , ( t ) p ( z 2) lim
00'21
= lim n+cc
=
1
1
~
12+m
t
I
~ , ( tz, , dy)p(y"m)
(t,mx,~ l y ) p ( y ~ m2)
1
~
*
(mt ,r ,dy)p(y*m)
P ( t ,z,d y ) p ( y A m ) .
Here in the last step but one, we have used Remark 13.3 (2) and Theorem 4.5. By Fatou's lemma, it follows that
$$P(t)p \le (1 + p)e^{ct} - 1. \tag{13.11}$$
This certainly implies that P ( t ,z,E;) = 0 or equivalently, P ( t ,z,Eo) = 1. To prove bhe measurability, we need only to study the property of P ( t , ., B ) for the open subsets B E Vyt(E). However, by Lemma 1.36, I B can be approximated arbitrarily by bounded cylindrical Lipschitz continuous functions f . Thus, it suffices to show that P ( t ) f f 80.But this is trivial since P ( t ) f = limn P,(t)farid P,(t)fE 80 for large enough n. c ) Prove the la.st assertion of the theorem, Since
by a), (13.4) and (13.8), the right-hand side can be made arbitrarily small by taking n and then l sufficient large.
d) Prove that $P(t)$ satisfies the CK-equation. If two probabilities on $(E, \mathscr{E})$ coincide on the finite dimensional open sets, then they must be the same. Thus, for the reason mentioned in b), it suffices to show that $P(t + s)f = P(t)P(s)f$ holds for every bounded cylindrical Lipschitz continuous function $f$. By (13.8), (13.4) and c), letting $n \to \infty$ and then $m \to \infty$, we have
Hence uniformly in z on pbounded sets. Combining this with (13.4), we have
uniformly in I on p-bounded sets. Next, by Lemma 13.4, for given there is an N large enough such that
where Ilfllu = supa: If(.)\. E" , then we get
E
> 0,
El Finally, if we use " A (= B" to denote ''[A- BI
<
e) We have seen that $P(t, x, \cdot)$ is a transition probability function on $(E_0, \mathscr{E}_0)$. Starting from this, it is a standard procedure to construct a Markov process $(X_t)$ on $(\Omega, \mathscr{F}, \mathbb{P})$ with transition probability function $P(t, x, \cdot)$ and state space $(E_0, \mathscr{E}_0)$. The key point in the construction is the Kolmogorov extension theorem, which holds for universal measurable spaces as mentioned at the end of Section 1.5. In the present case, $\{x : p(x^{\Lambda_n}) \le N\}$ is closed and so
$$\{x : p(x) < \infty\} = \bigcup_{N=1}^{\infty} \bigcap_{n=1}^{\infty} \{x : p(x^{\Lambda_n}) \le N\}$$
is a Borel set of the Polish space $(E, \mathscr{E})$. Hence, $(E_0, \mathscr{E}_0)$ is a universal measurable space (cf. Cohn (1980), Proposition 8.2.3 and Corollary 8.4.3). ■

As we mentioned before, the above theorem allows $p(x)$ to be quite general. But condition (13.3) is not satisfactory since it means that the interaction decreases rapidly when the distance between the components increases. The next result relaxes the restriction for the particular $p$: $p(x) = \sum_{u \in S} p_u(x) k_u$. From now on, we fix this $p$. Clearly, $p$ is a metric on $E_0$. However, it should be pointed out that the topology generated by $p$ is stronger than the one induced by the product topology. Hence, the Borel algebra generated by $p$ is contained in $\mathscr{E}_0$. We now prove that $E_0$ with metric $p$ is a complete separable metric space.

First, we prove the completeness. Let $\{x^{(n)}\}$ be a Cauchy sequence in $p$. Then $p(x^{(m)}, x^{(n)}) = \sum_u k_u p_u(x_u^{(m)}, x_u^{(n)}) \to 0$ as $m, n \to \infty$ and so for each $u \in S$, $p_u(x_u^{(m)}, x_u^{(n)}) \to 0$ as $m, n \to \infty$. By the completeness of $(E_u, p_u)$, there is $x_u \in E_u$ such that $p_u(x_u^{(n)}, x_u) \to 0$ as $n \to \infty$. Given $\varepsilon > 0$, by assumption, there exists $N$ such that $\sum_u k_u p_u(x_u^{(n)}, x_u^{(m)}) < \varepsilon$ for all $n, m \ge N$. Hence, by Fatou's lemma,
Therefore, by triangle inequality,
U
This proves that $x \in E_0$ and $x^{(n)} \to x$ in $p$ as $n \to \infty$.
Next, we prove the separability of $(E_0, p)$. Take $\Lambda_n \Subset S$, $\Lambda_n \uparrow S$. Since $E^{\Lambda_n}$ is separable, there is a countable dense set $R^{\Lambda_n}$. Set $R = \bigcup_n R^{\Lambda_n} \times \theta^{S \setminus \Lambda_n}$, which is again countable. Now, let $x \in E_0$. Because $p(x) = \sum_u k_u p_u(x_u, \theta_u) < \infty$, there is $\Lambda \Subset S$ such that $\sum_{u \notin \Lambda} k_u p_u(x_u, \theta_u) < \varepsilon/2$. Take $\Lambda_n \supset \Lambda$. Since $E^{\Lambda_n}$ is separable, there is $x^{\Lambda_n} \in R^{\Lambda_n}$ such that $\sum_{u \in \Lambda_n} k_u p_u(x_u, x_u^{\Lambda_n}) < \varepsilon/2$. Then $x^{(n)} := x^{\Lambda_n} \times \theta^{S \setminus \Lambda_n} \in R$ and
$$p(x^{(n)}, x) = \sum_{u \in \Lambda_n} k_u p_u(x_u^{\Lambda_n}, x_u) + \sum_{u \notin \Lambda_n} k_u p_u(\theta_u, x_u) < \varepsilon.$$
This means each neighborhood of $x \in E_0$ contains a point in $R$ and so the proof is done.

Let $\mathscr{L}ip(E_0)$ denote the set of Lipschitz continuous functions on $E_0$ with respect to the metric $p$. For $f \in \mathscr{L}ip(E_0)$, let $L(f)$ denote the Lipschitz constant of $f$. In what follows, we are mostly concerned with the next construction of the processes. As we will see soon, the processes constructed in the next theorem are Lipschitz and so they are measurable with respect to the $\sigma$-algebra generated by $p$ rather than $\mathscr{E}_0$.

Theorem 13.8. Suppose that the following two conditions hold.
(1) First moment condition. There exist c1 E R and a non-negative matrix ( b ( u ,v) : u,?I E S) such that QnPw(x)
21
E h n ,E ~~ o , 3n 1 (13.15)
,€A,
where
Pu. 2 0,
21 E
s;
[[Pi[:= ZP&,< O0
(13.16)
U
and C , b(u,v)k, < M k , for all u E S and some A4 > 0. Lipschitz condition. For every 1 n m, there exists a coupling (2) of R, and Om such that
< <
,€Am \An
UEAn
w E An,
x2,52
E
Eo,
where the non-diagonal elements of (c,,,), non-negative and furthermore
c,(n,n) = 0, zu E A,, n 2 1;
lim m>n,, € A n
+
c,(n, m> sup U
lim
m>n+m
C(cU,, + g,,,)
(13.17)
(g,,,),
c,(n,m)
< 00.
( c w ( n , m )are ) all
= 0,
u E S and
Then there exists a Markov process with transition function $P(t, x, \cdot)$ and state space $(E_0, \mathscr{E}_0)$ such that
$$W_{\Lambda_n}(P_n(t, x, \cdot), P(t, x, \cdot)) \to 0, \qquad n \to \infty,\ t \ge 0,\ x \in E_0. \tag{13.18}$$
Moreover, the convergence is uniform in finite $t$-intervals. Finally, there is a constant $c_2 \in \mathbb{R}$ such that
$$L(P(t)f) \le e^{c_2 t} L(f), \qquad t \ge 0,\ f \in \mathscr{L}ip(E_0). \tag{13.19}$$

The main steps of the proof of this theorem are quite similar to the previous one.
Lemma 13.9. Let (13.15) hold. Then
(4+1 t
Pn(t)P,(4 6 (exp
[t(ClL
+m I P . ( 4 )
(exp
[S(Cll,+%]P.)
t 3 0, z E 80,21 E A,, n 2 1, matrix on A, and B: is the transpose
where I , is the unit u , v E An}. In particular,
~,(t)pA,(x)
W s ,
(13.20) of B, := { b ( u , v ):
t
< p ( z ) e ( c I + ~ )+t llpll/
e(Cl+M)Sds
0
G (Pb)+ IlPll) exp [t(%+ M ) ] , ~ E E o n, 2 1 ,
where c3 = c1 V (1 - M ) .
(13.21)
Proof: The last assertion follows easily from the first one. As for the first assertion, simply use Lemma 4.13. I Lemma 13.10. Let f,(t,x, n ) (v hand side of (13.20) and set
cpw(t,z; n, m ) =
C
E
A,) be the function given by the right-
f,(t, 5 ,m ) ~ g , ~ + f 5~, mPWcw(n, (t, m>, w E 12.,
,6 Am \An Then, under the assumptions of Theorem 13.8, we have
W,"i,(t; 2 1 , .2> G (c,( t )P. (21,.,))
t 2 0,
(4+
/
t
0
(Cn(t - 4cp.(8,z2;n, 4)( ' W ) d S ,
q , x 2 E Eo, m 2 n 3 1, where Cn(t)was defined in Theorem 13.1. Furthermore, wAn(Pn(t,
<
etl/CII1
+
w E A,,
'), Pm(t,x 2 , ')) PA,(Z1722)
etllclll
c It
uEAn
+ llg\lle
t'lClll
c
f&, z 2 ,m)k,ds
u€Am\An
j,(s, z2,m)k,c,(n, m)ds,
where llalll denotes the norm of the matrix
(a,,): llalll = SUP,
(13.22)
c,lazLv[.
Proof: As we did in the proof of Lemma 13.5, define the column vectors: P ( ~ , , z z=) ( P W ( Z l , % ) w f An), W(t;~,,= 2 ~( W ) z m ( t ; x 1 , x 2:) w E An), @ ( t , x )= ( q w ( t , z ; n , m: )ul E An). Then, the first estimate of the lemma becomes
W ( t ;x1,1c2)< Cn(t)P(n;,,x2)+
bl
C,(t
- s)iD(s,z 2 ) d s .
+
In view of the proof of Lemma 13.5, since f i n , m P ( x l , x 2Q) C ~ P ( z , , z , ) (D(0, x 2 ) ,it suffices to show that d
) < -@(t,x2). 'dt Equivalently, Rmyw(t,2,;n, m) 6 -$yw(t,x 2 ;n, m ) . Then, in view of the expression of cpw, it is enough to show that R,f,(t, z2,m ) < $ f w ( t ,x2,m ) . Now, the first assertion follows from condition (1) of Theorem 13.8. Next, note that
x 2 ) = R,@(t,
2
and that cond assertion follows immediately. Proof of Theorem 13.8: By Lemma 13.10, we have
=:I+II.
The se-
Noting that
we have limm2nirxl I = 0. Furthermore, by the assumption on c,(n, rn) and the dominated convergence theorem, we obtain limm2n+ooIt = 0. Therefore
In particular,
From this, as we did in the proof of Theorem 13.1, we can construct a probability measure P ( t , x , - )on (E0,80)such that for fixed t an d B E 8 * > B ) E 80,
w,
P(t>p< C ( I + p ) e c t ,
t 3 0, z
E E~
(13.25)
and
However, the convergence in (13.24) is not necessarily uniform in IC E EON, which is just the point why we need a different approach to prove the semigroup property.
Lemma 13.11. Let $\mathscr{G} \subset \mathscr{L}ip(E_0)$ satisfy $\sup\{L(g) : g \in \mathscr{G}\} < \infty$. Then for each $t \ge 0$ and $x \in E_0$, we have $(P_n(t) - P(t))g(x) \to 0$ uniformly in $g \in \mathscr{G}$ as $n \to \infty$.

Proof: Set $g_n(x) = g(x^{\Lambda_n})$, $x \in E_0$, $n \ge 1$. Then $\sup_{n \ge 1} L(g_n) \le L(g)$. The assertion now follows from the fact

plus (13.26) and (13.25). ■

One main feature of $P(t)$ constructed by Theorem 13.8 is as follows.
Lemma 13.12. Let $f \in \mathscr{L}ip(E_0)$. Then $P(t)f \in \mathscr{L}ip(E_0)$ and $L(P(t)f) \le \exp(c_2 t)L(f)$ for some $c_2 \in \mathbb{R}$.

Proof: By Lemma 13.10, we have

for some $c_2 \in \mathbb{R}$. Then, letting $n \to \infty$, the desired conclusion follows from Lemma 13.11.

In view of the above proof of Theorem 13.8, we have the following result.

Remark 13.13. Under (13.6), suppose additionally that the convergence $\lim_{m \ge n \to \infty} c_u(n, m) = 0$ is uniform in $u \in S$ and that $c_u(n, n) \equiv 0$. Then the semigroup $P(t)$ constructed by Theorem 13.1 is also Lipschitzian.

We now return to our main proof of Theorem 13.8. What remains is to check the CK-equation. For this, as we explained in the proof of Theorem 13.1, it is enough to show that $P(t)P(s) = P(t + s)$ on $\mathscr{L}ip(E_0)$. Note that $P_n(t)P_n(s) = P_n(t + s)$. Given $f \in \mathscr{L}ip(E_0)$, we have
and
gn = 0. Hence, by the dominated convergence theorem,
Iim P ( t ) ( P ( s )- P,(sj)f(x) = lim P(t)g,(s) = 0. n-+m
n-03
<
On the other hand, since supna1 L(Pn(s)f) L ( f ) e C Sby , Lemma 13-11, it follows tha.t
+
Therefore, P ( t ) P ( s ) f = P(t s)f for all f E bLi?ip(Eg) and so for all f E bd? U &?+ by the monotone class theorem.
Remark 13.14. In view of the above proof, we see that the only place we need the coupling for different sizes of boxes ($\Lambda_n$ and $\Lambda_m$ with $n \ne m$) is to estimate $|P_n(t)f(x) - P_m(t)f(x)|$. Note that here the starting points of the processes are the same. Thus, the required estimate can be obtained by using a different approach, an integration by parts formula, for instance. In this way, we need only to consider the coupling of the same processes in the same box $\Lambda_n$ $(n \ge 1)$ with different starting points (cf. Example 13.36).

Having constructed a limiting process, we would like to know its generator in a weak sense. For this, let
Clearly, $\theta \in E_{00}$ and so $E_{00} \ne \emptyset$. Now, let $f \in \mathscr{L}ip(E_0)$, $x \in E_{00}$ and set $f_n(y) = f(y^{\Lambda_n} \times x^{S \setminus \Lambda_n})$, $y \in E_0$, $n \ge 1$. Then, by (13.17), we have
(13.28) This shows that {Cl,f’(~:) : n 2 1) is a Cauchy sequence and so we can define Clf(x) :=-12-04 lim 0 2 , f ( s ) , x E Eoo, f E Yip(E0). hrtllerKnore, by (13.281, we h v e
Replacing f with P ( s ) f or Pn(s)f: we see that
uniformly in finite s-intervals. But we still need a technical assurnption, which is often satisfied, to pass through the bridge from thc finite dimensional case to the infinite dimensional onc. That is
for every sequence
f,
4
{fn}r c 9ip(E0)with
fo, we have 12-00 lim f12fn(z)- Qfo(s) for all
L(f,) R: E
< 00
EOO.
and (13.30)
Finally, to avoid the dull case, it is natural to assume that
$$E_{00} \text{ is dense in } E_0 \text{ with respect to } p. \tag{13.31}$$
Corollary 13.15. Let $f \in \mathscr{L}ip(E_0)$. Suppose that the hypotheses of Theorem 13.8, (13.30) and (13.31) are all satisfied. Then we have

Moreover, there are a function $g(f, x) > 0$ and a constant $c < \infty$ such that

Furthermore, for each $x \in E_0$, $P(t)f(x)$ is continuous in $t$ and so is $\Omega P(t)f(x)$ for each $x \in E_{00}$. Finally

(13.34)
uniformly in finite s-intervals. Now, (13.32) follows from
and in terms of (13.28),
Combining (13.36) wit.h (13.32), we obtain (13.33). b) By (13.32), we see that P(t)f(x) is continuous in t for x E EOO.Next,
For given $x \in E_0$, by Lemma 13.12, we may choose $x_0 \in E_{00}$ so that the first two terms on the right-hand side are arbitrarily small uniformly in finite $s$-intervals; then the third term can be made arbitrarily small for this fixed $x_0$ whenever $s$ is small enough. From this and the denseness of $E_{00}$, we obtain the continuity of $P(t)f(x)$ in $t$ for $f \in \mathscr{L}ip(E_0)$ and $x \in E_0$. By the uniform
convergence in (13.35), it is clear that $\Omega P(t)f(x)$ is continuous in $t$ for each $x \in E_{00}$. Finally, (13.34) follows from (13.32). ■

13.2 Existence Theorem for Reaction-Diffusion Processes

In this section, we apply Theorem 13.8 to a class of particle systems, the reaction-diffusion processes, which are the main subject of this part. Besides, we make the conditions of Theorem 13.8 more explicit and explain again their meanings.

First of all, we should introduce the model. Let $S$ be a countable set. Imagine each $u \in S$ as a small vessel in which there is a reaction. In this section, we restrict ourselves to the case that there is only one single reactant. In each $u \in S$, the number of particles of the reactant takes values in $\mathbb{Z}_+$, which constitutes the spin space. Now, the rate function of the reaction in $u$ can be described by a $Q$-matrix $Q_u = (q_u(i, j) : i, j \in \mathbb{Z}_+)$. Thus, the reaction part of the formal generator of the process is as follows:
where $e_u$ is the element in $E$ having value one at $u$ and zero at other sites. Moreover, we use the following convention:
$$q_u(i, j) = 0, \qquad i \in \mathbb{Z}_+,\ j \notin \mathbb{Z}_+,\ u \in S.$$
s.
The second part of the generator of the process consists of diffusions between the vessels, which are described by a transition probability matrix $P = (p(u, v) : u, v \in S)$. For instance, if there are $k$ particles in $u$, then the rate function of the diffusion from $u$ to $v$ is $c_u(k)p(u, v)$, where
$$c_u \ge 0, \qquad c_u(0) = 0, \qquad u \in S. \tag{13.37}$$
Hence, the diffusion part of the formal generator becomes

Finally, the formal generator of the reaction-diffusion processes can be expressed as follows:
$$\Omega f(x) = \Omega_r f(x) + \Omega_d f(x).$$
Choose $\{\Lambda_n\}_1^\infty$ with $\Lambda_n \Subset S$, $\Lambda_n \uparrow S$. Replacing $S$ with $\Lambda_n$, we can define the corresponding $\Omega_{n,r}$, $\Omega_{n,d}$ and $\Omega_n$, respectively.
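The finite-box dynamics described above can be sketched as a jump-chain simulation. This is only an illustration under assumptions: the reaction at each site is taken to be of birth-death type (jumps $x \to x \pm e_u$ at hypothetical rates `birth(x_u)` and `death(x_u)`), a particle moves from $u$ to $v$ at rate $c_u(x_u)p(u, v)$ with the same function `c` at every site, and the standard Gillespie scheme is used for sampling. None of these names or choices come from the text.

```python
import random


def gillespie_step(x, birth, death, c, p, rng):
    """Sample one jump of the finite-box chain; returns (holding_time, new_state)."""
    events = []  # (rate, source site, target site or None, reaction increment)
    sites = range(len(x))
    for u in sites:
        if birth(x[u]) > 0:
            events.append((birth(x[u]), u, None, +1))
        if x[u] > 0 and death(x[u]) > 0:
            events.append((death(x[u]), u, None, -1))
        for v in sites:
            rate = c(x[u]) * p[u][v]
            if v != u and x[u] > 0 and rate > 0:
                events.append((rate, u, v, 0))
    total = sum(e[0] for e in events)
    if total == 0:
        return float("inf"), list(x)      # absorbing state: no jump possible
    tau = rng.expovariate(total)          # exponential holding time
    pick, acc = rng.random() * total, 0.0
    chosen = events[-1]                   # fallback guards against rounding
    for ev in events:
        acc += ev[0]
        if pick < acc:
            chosen = ev
            break
    _, u, v, inc = chosen
    y = list(x)
    if v is None:
        y[u] += inc                       # reaction: x -> x +/- e_u
    else:
        y[u] -= 1                         # diffusion: x -> x - e_u + e_v
        y[v] += 1
    return tau, y


def simulate(x0, birth, death, c, p, n_jumps, seed=0):
    """Run at most n_jumps jumps from x0; returns (elapsed time, final state)."""
    rng = random.Random(seed)
    t, x = 0.0, list(x0)
    for _ in range(n_jumps):
        tau, x = gillespie_step(x, birth, death, c, p, rng)
        if tau == float("inf"):
            break
        t += tau
    return t, x


# Four sites on a cycle with nearest-neighbour moves (illustrative choice of p).
p_cycle = [[0.5 if (u - v) % 4 in (1, 3) else 0.0 for v in range(4)] for u in range(4)]
print(simulate([3, 0, 2, 1], lambda k: 1.0, lambda k: 0.5 * k, lambda k: float(k), p_cycle, 200))
```

With the reaction rates set to zero only the diffusion part acts, and the total number of particles is conserved, which is a convenient consistency check for the sketch.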
As explained in the last section, we need to use a smaller space $E_0$ instead of $E$: $E_0 = \{x \in E : \|x\| := \sum_u x_u k_u < \infty\}$, where $(k_u)$ is a positive sequence such that
(13.38)
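A hedged sketch of how such weights can be produced on a finite site set follows. It assumes that condition (13.38) reads $\sum_v p(u, v)k_v \le M k_u$ for all $u$, and it uses the geometric-series choice $k_u = \sum_{n \ge 0} M^{-n} \sum_v p^{(n)}(u, v)\,d_v$ for a bounded positive sequence $(d_v)$ and $M > 1$. Both the reading of the condition and this particular series are assumptions made for illustration, and the function names are hypothetical.

```python
def controlled_weights(p, d, M, n_terms=60):
    """Truncated series k_u = sum_{n=0}^{n_terms} M^{-n} (P^n d)_u on a finite site set."""
    size = len(d)
    term = list(d)           # holds P^n d
    k = [0.0] * size
    scale = 1.0              # holds M^{-n}
    for _ in range(n_terms + 1):
        for u in range(size):
            k[u] += scale * term[u]
        term = [sum(p[u][v] * term[v] for v in range(size)) for u in range(size)]
        scale /= M
    return k


def is_controlled(p, k, M, tol=1e-12):
    """Check sum_v p(u, v) k_v <= M k_u at every site u."""
    size = len(k)
    return all(
        sum(p[u][v] * k[v] for v in range(size)) <= M * k[u] + tol
        for u in range(size)
    )


# Three sites on a line with reflecting ends (illustrative transition matrix).
p_line = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
k = controlled_weights(p_line, [1.0, 0.5, 0.25], M=2.0)
print(k, is_controlled(p_line, k, 2.0))
```

The reason the series works is that applying $P$ shifts the summation index by one, so $Pk = M(k - d) + (\text{a negligible truncation term}) \le Mk$.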
From now on, we call $(p(u, v))$ satisfying (13.38) an $M$-controlled matrix. Then, starting from $x \in E_0$, after a linear immigration, $c_u(k) = k$, the number of particles added to the site $v$ is equal to $x_u p(u, v)$. Hence

and so the process will still stay in $E_0$. Note that for a given $(p(u, v))$, the required $(k_u)$ always exists. For instance, take $M > 1$ and a bounded positive sequence $(d_u)$ and set

This explains the source of the sequence $(k_u)$ discussed in the last section. Besides, under the assumption

(13.39)

we may choose a summable $(d_u)$ so that $(k_u)$ is also summable.

Now, we are in a position to discuss the conditions made in the last section. In the present case, $\beta_u := \sum_{k=1}^{\infty} q_u(0, k)k$ and condition (13.16) becomes
(13.40) u
Next, the set EOOintroduced before Corollary 13.15 becomes
To guarantee the denseness of EOOin ED,we adopt the assumption q&, kfQ
i
+ k)lkl <
00,
uf
s,
(13.41)
which implies the regularity of the $Q$-matrix $Q_u = (q_u(i, j))$ by Theorem 2.25. Finally, for the most crucial condition (2) of Theorem 13.8, we will adopt a particular coupling which leads to the following two conditions:
$$c := \sup_{k, u} |c_u(k) - c_u(k + 1)| < \infty, \tag{13.42}$$
c; :=sup(gu(jl,j2)+h,(jl,jp)
where
. c 1
SU(jl?j2)=
J2 -31
(QU(j3?j2
kfO
: uES,j2>ji
>o} COO,
+ Ic) - qu(j1,jl + k ) ) k ,
(13.43)
j,
> j , 2 0,
Set
Obviously, Finally, set
Remark 13.16. If for every
is a birth-death Q-matrix,
then
Proof: Deniote by c2 the quantity given by the right-hand side. Clearly, in the present situation, and hence On the other hand,
< (32 - j&,
j,
> j , 3 0. I
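For a birth-death $Q$-matrix the quantity $g_u(j_1, j_2) = \frac{1}{j_2 - j_1}\sum_{k \ne 0}\big(q_u(j_2, j_2 + k) - q_u(j_1, j_1 + k)\big)k$ is a difference quotient of the drift (birth rate minus death rate). The sketch below evaluates it on a finite window. The polynomial rates are purely illustrative assumptions, the window size is arbitrary, and the companion term $h_u$ of condition (13.43) is not examined.

```python
def q_birth_death(birth, death):
    """Return q(i, j) for a birth-death Q-matrix on Z_+ (off-diagonal rates only)."""
    def q(i, j):
        if i < 0 or j < 0:
            return 0.0            # convention: no rate towards states outside Z_+
        if j == i + 1:
            return birth(i)
        if j == i - 1:
            return death(i)
        return 0.0
    return q


def g(q, j1, j2, k_max=3):
    """g(j1, j2) = (1/(j2 - j1)) * sum_{0 < |k| <= k_max} (q(j2, j2+k) - q(j1, j1+k)) * k."""
    total = 0.0
    for k in range(-k_max, k_max + 1):
        if k != 0:
            total += (q(j2, j2 + k) - q(j1, j1 + k)) * k
    return total / (j2 - j1)


def birth(i):
    return 1.0 + 0.5 * i * (i - 1)            # illustrative quadratic birth rate


def death(i):
    return i + i * (i - 1) * (i - 2) / 6.0     # illustrative cubic death rate


def drift(i):
    return birth(i) - death(i)


q = q_birth_death(birth, death)
window = 30
sup_g = max(g(q, j1, j2) for j1 in range(window) for j2 in range(j1 + 1, window + 1))
print(sup_g)
```

Because the cubic death rate eventually dominates the quadratic birth rate, the drift is non-increasing here and the difference quotients stay bounded above on the window, which is the kind of one-sided bound the condition asks for.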
We arc now ready to state the main result.
Theorem 13.17. Under conditions (13.37) and (13.40)-(13.43), there exists a Markov process with state space $(E_0, \mathscr{E}_0)$. Moreover, for each we have and
1() (2) (3) 4() For each 5() For each (6)
is continuous in t. is continuous in t.
Proof: a) Since
and
we have
where pu(x) = xu, b(u,w) = c p ( u , w) and
To show the regularity of R,, note that
by the above inequality. Hence, the required assertion follows from Theorem 2.25. We have thus checked the first condition of Theorem 13.8.

b) To check the second condition of Theorem 13.8, we use the coupling of marching soldiers. For the diffusion part, the coupling is

For the reaction part, at each $u \in \Lambda_n$, the coupling of marching soldiers is as follows:

When $u \in \Lambda_m \setminus \Lambda_n$, let the particles evolve independently. Then the whole coupling for the process is
In particular, for
we have
On the other hand, for the reaction part, we have
then
By symmetry, this estimate also holds for 2, 6 y.,, Combining the above two estimates, we arrive at
(13.44) Therefore, condition (13.17) holds with the choice:
We have proved that the Lipschitz condition of Theorem 13.8 is satisfied. To obtain the estimate in part (1) of the theorem, we need to be a little more careful. By (13.44), we have
on,nP W ( 2 , y) 6 c; P w ( G 9)+
Hence
1- P ( W , 4
-
c
v$An
P
h4
>
-C
I
P h4
and so

From this, assertion (1) follows easily.

c) It is easy to see that the hypotheses of Corollary 13.15 are also satisfied. Then, the conclusions of the theorem follow from Theorem 13.8 and Corollary 13.15 by some computations.

13.3 Uniqueness Theorems for the Processes

In this section, we prove some uniqueness theorems for the processes constructed in the previous sections. Two different approaches are used here. The first one is the usual semigroup approach. The second one is the weak maximum principle.

Theorem 13.18. Suppose that the hypotheses of Theorem 13.8 hold and additionally, condition (13.30) and the following conditions are all satisfied.
(1) Growing condition:
where $p_n^{(m)}(x) = \sum_{u \in \Lambda_n} p_u(x)^m k_u$, $m\ (\ge 1)$ is the minimal number so that the above control holds, and $K_1$ is a constant.
(2) Moment condition:
Then there exists uniquely a Markov process having the properties listed in Theorem 13.8. Moreover,
where $E_m = \{x \in E_0 : p^{(m)}(x) := \sum_u p_u(x)^m k_u < \infty\}$ and $K_2$ is a constant. Finally,
(13.46)
Proof: a) Since $E_m \subset E_0$, Theorem 13.8 and Corollary 13.15 are applicable to the present case. Next, by the moment condition and Lemma 4.13, we have
$$P_n(t)p_n^{(m)}(x) \le \bigl(1 + p_n^{(m)}(x)\bigr)\exp[K_2 t], \qquad t \ge 0,\ x \in E,\ n \ge 1.$$
By using an approximating argument, we obtain
$$P(t)p^{(m)}(x) \le \bigl(1 + p^{(m)}(x)\bigr)\exp[K_2 t], \qquad t \ge 0,\ x \in E_m. \tag{13.47}$$
This proves not only (13.45) but also that $E_m$ is a closed set of the process.
b) Let $f \in \mathscr{L}ip(E_0)$. By the growing condition,
$$|\Omega_n f(x)| \le K_1 L(f)\bigl(1 + p^{(m)}(x)\bigr), \qquad x \in E_m,\ n \ge 1.$$
Since $\Omega_n f(x) \to \Omega f(x)$, we obtain
$$|\Omega f(x)| \le K_1 L(f)\bigl(1 + p^{(m)}(x)\bigr), \qquad x \in E_m.$$
In particular,
and so
for $t \le 1$, $f \in \mathscr{L}ip(E_0)$ and $x \in E_m$.
Thus, by a) and the dominated convergence theorem, we get
$$= P(t)\Omega f(x), \qquad t \ge 0,\ x \in E_m,\ f \in \mathscr{L}ip(E_0).$$
Combining this with Corollary 13.15, we obtain (13.46).
c) Finally, let $P_k(t)$, $k = 1, 2$, be semigroups on $\mathscr{L}ip(E_0)$ having the properties:
i) $P_k(t)$ is Lipschitzian on $\mathscr{L}ip(E_0)$: $L(P_k(t)f) \le L(f)\exp[c_2 t]$, $f \in \mathscr{L}ip(E_0)$, $k = 1, 2$, for some $c_2 > 0$;
ii) $E_m$ is dense in $E_0$ with respect to $p$ and $E_m$ is closed for both $P_1(t)$ and $P_2(t)$;
iii) (13.46) holds for $P(t) = P_k(t)$, $k = 1, 2$.
From these, we claim that $P_1(t) = P_2(t)$, $t \ge 0$. By the denseness of $E_m$ and the Lipschitz property, it suffices to show that $P_1(t)f(x) = P_2(t)f(x)$ for all $f \in b\mathscr{L}ip(E_0)$, $x \in E_m$ and $t \ge 0$. Since $E_m$ is a closed set, we may replace $E_0$ with $E_m$ and consider $P_k(t)$ as a semigroup on $\mathscr{L}ip(E_m)$. Then the conclusion follows easily from the Hille-Yosida theorem. Actually, for $f \in b\mathscr{L}ip(E_m)$, let
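$$F(x) = F_\lambda(x) = \int_0^\infty e^{-\lambda t}P(t)f(x)\,dt, \qquad x \in E_m.$$
Then $F \in b\mathscr{L}ip(E_m)$ whenever $\lambda > c_2$. On the other hand,
Hence
$$\frac{P(h)F(x) - F(x)}{h} = \frac{e^{\lambda h} - 1}{h}F(x) - \frac{e^{\lambda h}}{h}\int_0^h e^{-\lambda t}P(t)f(x)\,dt.$$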
By iii), we have
$$\Omega F(x) = \lambda F(x) - f(x), \qquad x \in E_m.$$
Or
$$(\lambda I - \Omega)F(x) = f(x), \qquad x \in E_m,\ \lambda > c_2. \tag{13.48}$$
From this, we show that $F$ is determined uniquely by $f$. To do so, let $F_1$ and $F_2$ satisfy (13.48) and set $g = F_1 - F_2$. Then
$$\Omega g(x) = \lambda g(x), \qquad x \in E_m,\ \lambda > c_2.$$
By iii) again,
$$\frac{d}{dt}P(t)g(x) = \Omega P(t)g(x) = P(t)\Omega g(x) = \lambda P(t)g(x)$$
and so by the continuity,
$$e^{-\lambda t}P(t)g(x) - g(x) = \int_0^t \frac{d}{ds}\bigl(e^{-\lambda s}P(s)g(x)\bigr)\,ds = 0.$$
Therefore
$$g(x) = e^{-\lambda t}P(t)g(x), \qquad x \in E_m,\ \lambda > c_2.$$
This gives us
$$\sup_{x \in E_m}|g(x)| \le e^{-\lambda t}\sup_{y \in E_m}|g(y)|.$$
Hence $g = 0$. Finally, for the two given semigroups $P_k(t)$, we have two functions $F_k(x)$ defined as above for which (13.48) holds, and hence $F_1 = F_2$ as we have just proved. Therefore, $P_1(t) = P_2(t)$ on $b\mathscr{L}ip(E_m)$ by the uniqueness theorem for the Laplace transform.
Now, we apply the above theorem to the reaction-diffusion processes.
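The resolvent step in the proof of Theorem 13.18 can be illustrated in the simplest setting. The following is only a sketch for a finite state space, where the operator is a Q-matrix and $P(t) = e^{tQ}$; the matrix `Q`, the vector `f`, the value `lam` and the helper names are hypothetical and chosen for illustration. It compares the Laplace transform $F = \int_0^\infty e^{-\lambda t}P(t)f\,dt$, computed by a trapezoidal rule, with the unique solution of $(\lambda I - Q)F = f$.

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    norm = np.linalg.norm(A, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2.0 ** squarings
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ B / n
        out = out + term
    for _ in range(squarings):
        out = out @ out
    return out

def laplace_of_semigroup(Q, f, lam, h=1e-3, horizon=20.0):
    """Trapezoidal approximation of int_0^horizon e^{-lam t} P(t) f dt, P(t) = exp(tQ)."""
    step = expm(h * Q)                 # P(h); P(kh) f is obtained by repeated application
    n = int(round(horizon / h))
    v = f.astype(float)
    total = 0.5 * v                    # endpoint weight at t = 0
    for k in range(1, n + 1):
        v = step @ v
        weight = 0.5 if k == n else 1.0
        total = total + weight * np.exp(-lam * k * h) * v
    return h * total

# Hypothetical conservative Q-matrix on three states (rows sum to zero).
Q = np.array([[-1.0, 1.0, 0.0],
              [2.0, -3.0, 1.0],
              [0.0, 4.0, -4.0]])
f = np.array([1.0, -2.0, 0.5])
lam = 2.0

F_integral = laplace_of_semigroup(Q, f, lam)
F_resolvent = np.linalg.solve(lam * np.eye(3) - Q, f)   # unique solution of (lam I - Q) F = f
print(F_integral, F_resolvent)
```

In this finite setting the uniqueness of $F$ is just the invertibility of $\lambda I - Q$; the theorem replaces that by the growth and moment conditions on $E_m$.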
Theorem 13.19. Suppose that the hypotheses of Theorem 13.17 hold and additionally the following conditions are all satisfied.
(1) Growing condition: For a fixed $m \ge 1$,
(2) Moment condition: $\sup_u \sum_{k \ne 0} q_u(i, i+k)\bigl[(i+k)^m - i^m\bigr] \le K_2(1 + i^m)$, $i \in \mathbb{Z}_+$.
(3) Transition condition: $\sup_v \sum_u p(u,v) < \infty$.
Without loss of generality, assume that $\sum_u k_u < \infty$ (by (3)). Then the conclusions of Theorems 13.17 and 13.18 hold.
Proof: What we need is to check that the conditions given here imply the corresponding ones given in the previous theorem. a) For $m > 1$, by the growing condition, the transition condition, the $C_r$-inequality and the Hölder inequality, we have
<
where c, = 2m-1 if m >, 2, c, = 1 if m 1. Note that this estimate holds even for rn = 1. Combining this with conditions (2) and (3), we have checked the second condition of Theorem 13.18.
b) Similarly, one can check that the first condition of Theorem 13.18 follows from the first one here, since the diffusion part is $M$-controlled by the distance $\|\cdot\|$, which was defined before (13.38):
The idea of the above theorem is to keep the function $f$ in $\mathscr{L}ip(E_0)$ but to restrict $E_0$ to $E_m$. The next result goes in the opposite direction.
Theorem 13.20. The assumptions are the same as in the previous theorem, except that the last two conditions are replaced, respectively, by the following ones.
(2)' Moment condition:
(3)' Transition condition: There are a positive summable sequence (k,) and a constant M ( m ) such that
Then there exists uniquely a Markov process having the properties listed in Theorem 13.17. Moreover, for each f E 9ipm(Eo):
Zip,(Eo) = {f : f is Lipschitz continuous with respect t o the metric
lb - Y l l m = c,IX,
- YzLlJi~}
and t > 0,
(13.49)
Remark 13.21. Condition (3) plus MI imply (3)'.
:= sup { p ( u ,v)'-~ : ZL,
v E S and p(u, v)
> 0} <
DC)
Proof: Indeed, if we take (k,) as before, then
for all ~ E S . In general, (3)' is stronger than ( 3 ) . Proof of Theorem 13.20: a) The key idea to prove this theorem is as follows. Using (k:) and M ( m ) instead of (k,) and M respectively, according to Theorem 13.17, we can construct a Markov process P ( t ) having the Lipschitz property with respect to I[ [ j m . So (13.49) holds. b) Because of
combining this with
(a)', we get
A similar argument as before leads to
That is just (13.50). c) For
we have
where $L_m(f)$ is the Lipschitz constant of $f \in \mathscr{L}ip_m(E_0)$. Now, the remainder of the proof is similar to the last part of the proof of Theorem 13.18.
Corollary 13.22. If the reaction part is of birth-death type: $q_u(i, i+1) = b_i$, $i \ge 0$; $q_u(i, i-1) = a_i$, $i \ge 1$, then conditions (2) and (2)' in the last two theorems can be replaced by
for some $c \in (0, 1)$.
Proof: Here, we check (2)' only+ Choose N1 = N 1 ( m )so that (12- e,(( 2 cJJx+e,JI wherever 2 , >/ N'. Next, choose N2 so that [ b i - c m a i ] / i A for ) all i 2 N 2 . Put N = N 1 V N 2 . Then for each n 2 1, some A E ( 0 , ~ and we have
<
The growing condition used in the above three theorems is not natural for the uniqueness problem. To avoid this condition, we adopt a different approach: the weak maximum principle. To state the principle, we need some notation. Let $(X, p)$ be a metric space and let $C(X)$ be the collection of continuous functions on $X$. Suppose that
(M1) $g \in C_+(X)$ is locally compact: i.e., there is a sequence $H_n \uparrow X$ such that for every $n \ge 1$, as a function on $H_n$, $g$ is compact.
Next, let $F : [0,T] \times X \to \mathbb{R}$ be a bivariate continuous function bounded below and set
Finally, let $\Omega$ be a linear operator on $\Phi(F, g)$ with $\Omega 1 = 0$ and satisfy:
(M2) $\Omega g \le ag + b$ on $X$ for some constants $a, b \ge 0$.
(M3) Moreover, for each $T > 0$, if $\varphi_n \in \Phi(F, g)$ ($n \ge$ some $n_0$), as a function on $[0,T] \times H_n$, achieves its infimum at some point $(s_n, x_n) \in [0,T] \times H_{n,r}$, where $H_{n,r} := \{x \in H_n : g(x) \le r\}$, then $\Omega\varphi_n(s_n, \cdot)(x_n) \ge 0$.
Theorem 13.23 (Weak maximum principle). Let $F$ and $\Omega$ be as above and satisfy (M1)-(M3). Suppose that for each $x \in X$, $F(\cdot, x)$ is continuously differentiable and
$$\begin{cases} \dfrac{\partial}{\partial t}F(t,x) \ge \Omega F(t,\cdot)(x), & (t,x) \in [0,T] \times X,\\[1ex] F(0,x) \ge 0, & x \in X. \end{cases}$$
Then $F \ge 0$ on $[0,T] \times X$.
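Proof: Without loss of generality, assume that $a > 0$. Define
$$f_\varepsilon(t,x) = F(T-t, x) + (T-t)\varepsilon + \varepsilon e^{-at}\bigl[g(x) + b/a\bigr].$$
Then $f_\varepsilon(t,\cdot) \in \Phi(F,g)$ for sufficiently small $\varepsilon > 0$. Clearly we have
$$\begin{cases} \Bigl(\dfrac{\partial}{\partial t} + \Omega\Bigr)f_\varepsilon \le -\varepsilon & \text{on } [0,T] \times X,\\[1ex] f_\varepsilon(T,\cdot) \ge 0 & \text{on } X. \end{cases} \tag{13.52}$$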
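The inequality (13.52) is stated without computation. The following lines are a sketch of how it can be checked, using only the linearity of $\Omega$, the assumption $\Omega 1 = 0$, condition (M2), the hypothesis $\partial F/\partial t \ge \Omega F$ of Theorem 13.23, and $g \ge 0$; here $\partial F/\partial t$ denotes the derivative of $F$ in its first argument, evaluated at $(T-t, x)$.

```latex
\begin{align*}
\frac{\partial}{\partial t} f_\varepsilon(t,x)
  &= -\frac{\partial F}{\partial t}(T-t,x) - \varepsilon
     - a\varepsilon e^{-at}\bigl[g(x) + b/a\bigr],\\
\Omega f_\varepsilon(t,\cdot)(x)
  &= \Omega F(T-t,\cdot)(x) + \varepsilon e^{-at}\,\Omega g(x)
   \le \Omega F(T-t,\cdot)(x) + \varepsilon e^{-at}\bigl[a g(x) + b\bigr],\\
\Bigl(\frac{\partial}{\partial t} + \Omega\Bigr) f_\varepsilon(t,x)
  &\le -\Bigl[\frac{\partial F}{\partial t} - \Omega F\Bigr](T-t,x) - \varepsilon
   \le -\varepsilon,\\
f_\varepsilon(T,x)
  &= F(0,x) + \varepsilon e^{-aT}\bigl[g(x) + b/a\bigr] \ge 0 .
\end{align*}
```

The terms $a\varepsilon e^{-at}[g + b/a]$ and $\varepsilon e^{-at}[ag + b]$ cancel exactly, which is the reason for the particular form of the correction term in $f_\varepsilon$.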
Thus, we need only to show that $f_\varepsilon \ge 0$ on $[0,T] \times X$ for sufficiently small $\varepsilon > 0$. Suppose that $f_\varepsilon < 0$ at some point $(s,x)$. Since $H_n \uparrow X$, there must be an $N$ so that $(s,x) \in [0,T] \times H_n$ for all $n \ge N$. Take
$$r = \bigl|\inf\{F(t,x) : (t,x) \in [0,T] \times X\}\bigr| \cdot e^{aT}/\varepsilon.$$
If $r = 0$ then there is nothing to do. Otherwise, from (M1) and (M3), it follows that there is a compact $H_{n,r}$ so that $f_\varepsilon \ge 0$ on $[0,T] \times (H_n \setminus H_{n,r})$. Thus, $(s,x) \in [0,T] \times H_{n,r}$ for all $n \ge N$. But $[0,T] \times H_{n,r}$ is compact, so $f_\varepsilon$ achieves its minimum at some $(s_n, x_n)$, and
$$f_\varepsilon(s_n, x_n) \le f_\varepsilon(s,x) < 0, \qquad \frac{\partial}{\partial t}f_\varepsilon(s_n, x_n) \ge 0, \qquad n \ge N.$$
Combining this with (M3), we obtain
This contradicts (13.52) and therefore $f_\varepsilon < 0$ is impossible.
We now apply the maximum principle to the reaction-diffusion processes.
Theorem 13.24. The assumptions are the same as in Theorem 13.19, except that the growing condition is removed and the moment condition holds for some $m > 1$. Then there exists uniquely a Markov process having the properties listed in Theorem 13.17. Besides,
for some constant $K_2$.
Proof: a) Take $H_n = \{x \in E_0 : x_u = 0,\ u \notin \Lambda_n\}$ and set $X = \bigcup_n H_n$. Then $X$ is countable. Endow $X$ with the discrete metric $p$. Next, take
$$g(x) = p^{(m)}(x), \qquad x \in X.$$
It is easy to see that $g \in C(X, p)$ and is locally compact. Actually,
$$H_{n,r} := \{x \in H_n : g(x) \le r\}$$
is a finite set. However, if we use the metric induced by $p(x)$, then $g$ is clearly not continuous with respect to $p$ unless $m = 1$. For this, we have to be careful. Finally, let $P_k(t)$, $k = 1, 2$, be two semigroups constructed by Theorem 13.8, $F(t) = P_1(t)f - P_2(t)f$, and define $\tilde\Omega$ to be the restriction of $\Omega$ on $\Phi(F, g)$:
6cp = R F ( t , .) + q R g ,
cp E
@(F,g).
By assumptions, it is easy to check, as we did in the proof of Theorem 13.8, that the operator $\tilde\Omega$ defined on $\Phi(F, g)$ satisfies the hypotheses of the maximum principle except (M3), which is often the key point in the applications. b) Fix $T > 0$ and suppose that $\varphi_n(s,x)$
:= F ( s , z ) + " y g ( z ) +a?) E @(F,g),
(s,z) E
achieves its infimum at some point ( s ( ~ ) , z ( ~ )E) [O,T]x x(,) ke, E H,. whenever u E A,. We have
+
and
[O,T]x H , Note that
Thus
:= I
--+
+ II.
0,
n
--+ 00.
(XF))~~,
Here in the last step we have used p(")(a(")) = C ,
13.4 Examples
To illustrate the applications of the results obtained in the previous sections, we introduce 15 examples in this section. Some of them have attracted the attention of many authors, and even for a particular model there are many publications. Some of them will be studied further in the subsequent chapters. On the other hand, several models are still not well understood and need further study; for these, our discussion is only sketched. An early example of the reaction-diffusion processes is
Example 13.25 (Zero range process). The spin space is $\mathbb{Z}_+$. For this model, there is no reaction, i.e., $q_u(i,j) = 0$. So the formal generator becomes
where the coefficients $c_u$ and $p(u,v)$ satisfy the assumptions given in Section 13.2. The process is unique by Theorem 13.19 or 13.20 with $m = 1$, ignoring the growing condition. Theorems 13.19 and 13.20 are also suitable for the following five models.
Example 13.26 (Linear growth process). Again, the spin space is $E_u = \mathbb{Z}_+$. The operator is as follows:
where $\lambda_1, \lambda_2 > 0$ are constants and $p(u,v)$ satisfies the same assumptions as in the previous example. Since the reaction part is linear, the process is still unique even if $\lambda_1 x_u$ is replaced with $\lambda_1 x_u + \lambda_0$ for some constant $\lambda_0 > 0$.
Example 13.27 (Polynomial model). $E_u = \mathbb{Z}_+$. The diffusion rate is $c_u(k) = k$ and $p(u,v)$ is as above. The reaction rates are of birth-death type:
where $m_0 \ge 0$, $k^{(j)} = k(k-1)\cdots(k-j+1)$, the coefficients $\beta_j$ and $\delta_j$ are non-negative, and $\delta_1, \delta_{m_0+1} > 0$.
In particular, we have
Example 13.28 (Schlögl's first model).
$$b_k = \beta_0 + \beta_1 k, \qquad a_k = \delta_1 k + \delta_2 k(k-1), \qquad \beta_0, \beta_1, \delta_1, \delta_2 > 0.$$
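Example 13.29 (Schlögl's second model).
$$b_k = \beta_0 + \beta_2 k(k-1), \qquad a_k = \delta_1 k + \delta_3 k(k-1)(k-2), \qquad \beta_0, \beta_2, \delta_1, \delta_3 > 0.$$
Example 13.30 (Autocatalytic model).
$$q_{k,k+1} = \beta_1 k, \qquad q_{k,k-2} = \delta_2 k(k-1), \qquad \beta_1, \delta_2 > 0,$$
and $q_{ij} = 0$ for all other $j \ne i$.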
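The birth and death rates of Examples 13.27-13.29 can be written down concretely. The sketch below assumes that the polynomial model has the form $b(k) = \sum_j \beta_j k^{(j)}$, $a(k) = \sum_j \delta_j k^{(j)}$ with the falling factorial $k^{(j)}$ of Example 13.27 (the display itself is not reproduced in this text, so this form is an assumption); the function names and the numerical coefficients are hypothetical.

```python
def falling_factorial(k, j):
    """k^(j) = k (k-1) ... (k-j+1), with the convention k^(0) = 1."""
    out = 1
    for i in range(j):
        out *= (k - i)
    return out

def polynomial_rates(beta, delta):
    """Return (b, a) with b(k) = sum_j beta[j] k^(j) and a(k) = sum_j delta[j] k^(j).

    beta and delta map the index j to its coefficient (assumed form of the model)."""
    def b(k):
        return sum(c * falling_factorial(k, j) for j, c in beta.items())

    def a(k):
        return sum(c * falling_factorial(k, j) for j, c in delta.items())

    return b, a

# Schlogl's first model: b_k = beta0 + beta1 k, a_k = delta1 k + delta2 k(k-1).
b1, a1 = polynomial_rates({0: 2.0, 1: 1.0}, {1: 3.0, 2: 0.5})
# Schlogl's second model: b_k = beta0 + beta2 k(k-1), a_k = delta1 k + delta3 k(k-1)(k-2).
b2, a2 = polynomial_rates({0: 2.0, 2: 1.0}, {1: 3.0, 3: 0.5})

for k in range(5):
    print(k, b1(k), a1(k), b2(k), a2(k))
```

Both Schlögl models are then special cases obtained by switching off all but two coefficients in each sum.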
Example 13.31 (Coupled branching process). $E_u = \mathbb{Z}_+$.
where $\lambda_1, \ldots, \lambda_4 > 0$, $(p_1(u,v))$ and $(p_2(u,v))$ are transition probability matrices satisfying
for some positive sequence $(k_u)$ and constant $M > 0$. Although this model is not covered by Theorem 13.17, it may also be considered as a reaction-diffusion process. The process is unique by Theorems 13.8 and 13.18. The last conclusion also holds for the following four models.
Example 13.32 (Coalescing process). $E_u = \mathbb{R}_+$.
where $(p(u,v))$ is a transition probability matrix satisfying the $M$-controlled condition and
$$x^{(u,v)}_w = \begin{cases} x_w & \text{if } w \ne u, v,\\ p\,x_u & \text{if } w = u,\\ x_v + (1-p)x_u & \text{if } w = v, \end{cases}$$
for a constant $p \in (0,1)$.
Example 13.33 (Smoothing process). $E_u = \mathbb{R}_+$.
where
$$x^u_w = \begin{cases} x_w & \text{if } w \ne u,\\ \sum_v p(u,v)x_v & \text{if } w = u, \end{cases}$$
and $(p(u,v))$ is a transition probability matrix satisfying
$$\sum_u k_u p(u,v) \le M k_v, \qquad v \in S,$$
for some $M > 0$.
Example 13.34 (Potlatch process). $E_u = \mathbb{R}_+$.
where
$$x^u_w = \begin{cases} x_w + x_u p(u,w) & \text{if } w \ne u,\\ x_u p(u,u) & \text{if } w = u, \end{cases}$$
and $(p(u,v))$ is an $M$-controlled transition probability matrix.
Example 13.35 (Generalized Potlatch process). Take Eu = R+, S = Z d and ( p ( u ,v)) the simple random walk on Zd satisfying the M-controlled condition. Next, let be a non-negative random variable with mean 1 and distribution function F . The formal generator of the process is as follows:
To study the construction of the process, consider first the finite dimensional case. Take $\Lambda_n = \{u \in \mathbb{Z}^d : |u| \le n\}$, $n \ge 1$, where $|u| = |u_1| + \cdots + |u_d|$. Replacing $\mathbb{Z}^d$ with $\Lambda_n$ in the expression of $\Omega$, we obtain an operator $\Omega_n$ and hence the corresponding q-pair $q_n(x, A) = \Omega_n I_{A \setminus \{x\}}(x)$, $q_n(x) = -\Omega_n I_{\{x\}}(x)$. Since $q_n(x, \cdot)$ is clearly non-negative and $\sigma$-additive, and so is a measure, $q_n(x) = q_n(x, E)$. Hence, $(q_n(x), q_n(x, dy))$ is a bounded conservative q-pair and so the q-process is unique.
To consider the infinite dimensional case, take pU(z,, yU) = [x, - y U ] and let 8 be the zero vector. Then, we have pU(x) = z,,u E S. On the other hand, qn(8, dy)p,(y) = 0 for all u E S. So we have & = 0. Besides,
implies the first moment condition. As for the Lipschitz condition in Theorem 13.8, simply use the following coupling:
Thus, by Theorem 13.8 and Theorem 13.18 with $m = 1$, the process is unique.
Example 13.36 (Coupled random walk process). The system can be described as follows. Let $\{N_u : u \in S\}$ be a collection of independent Poisson processes with parameter one (called the exponential clocks). At the time $t$ when $N_u$ rings, all particles at $u$ move to sites chosen independently according to $P = (p(u,v))$, which is an $M$-controlled probability matrix. Thus, after one such movement, the configuration becomes $x^u$, where
$$x^u_w = \begin{cases} x_w + i_w & \text{if } w \ne u,\\ i_w & \text{if } w = u, \end{cases} \qquad \sum_{v \in S} i_v = x_u, \quad u \in S.$$
For finite $S$, the Q-matrix is
lo
otherwise.
For this model, each particle moves according to the Markov chain with transition probability $P = (p(u,v))$ and mean waiting time one; moreover, when $N_v$ rings, the $x_v$ particles at site $v$ evolve independently of each other. Therefore, the first moment at site $u$ is
where $p(t, u, v)$ is the Q-process determined by the Q-matrix $Q = P - I$.
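The first-moment statement can be made concrete on a small site space. The display is not reproduced in this text, so the sketch below assumes the standard identity for independently moving particles, $\mathbb{E}^x X_u(t) = \sum_v x_v\, p(t, v, u)$ with $p(t) = e^{t(P - I)}$; the matrix `P`, the configuration `x` and the helper names are hypothetical.

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    norm = np.linalg.norm(A, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2.0 ** squarings
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ B / n
        out = out + term
    for _ in range(squarings):
        out = out @ out
    return out

def first_moments(P, x, t):
    """Assumed first-moment formula: m(u) = sum_v x[v] * p(t, v, u), p(t) = exp(t (P - I))."""
    Q = P - np.eye(P.shape[0])
    return x @ expm(t * Q)

# Hypothetical transition probability matrix on three sites (rows sum to one).
P = np.array([[0.0, 0.5, 0.5],
              [0.25, 0.5, 0.25],
              [1.0, 0.0, 0.0]])
x = np.array([4.0, 0.0, 2.0])      # initial numbers of particles per site

m = first_moments(P, x, t=1.5)
print(m, m.sum())
```

Since particles are neither created nor destroyed, the total expected mass is conserved, which gives a quick sanity check on the computation.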
To study the Lipschitz property, one may choose a coupling of $\Omega_n$ and $\Omega_m$ and apply Theorem 13.8. But here, we prefer to use this example to explain the modified construction outlined in Remark 13.14, based on the integration by parts formula. We consider first the case that $\Lambda_n = \Lambda_m = S$ (finite!). Define a coupling as follows:
where ( y ( ' ) , ~ ( ~#) )(d1),d2)) is the following: if x?' 3 xu(2) , then
Otherwise, x?)
< xp),
Y V( l )
y?)
2 p + i,, 2' # 21, y = x y + it, + j,. 7,' # u, =
p = 2, y?) = 2,
+j , .
In any case, $\sum_v i_v = x_u^{(1)} \wedge x_u^{(2)}$, $\sum_v j_v = |x_u^{(1)} - x_u^{(2)}|$. Clearly, this coupling is order-preserving. Hence, we can avoid the direct computation to get the required estimate. The procedure goes as follows. For given $x^{(1)}, x^{(2)} \in E$, let $x^{(3)} = x^{(1)} \wedge x^{(2)}$, $x^{(4)} = x^{(1)} \vee x^{(2)}$. Then $x^{(3)} \le x^{(1)}, x^{(2)} \le x^{(4)}$ with respect to the ordinary partial ordering. First, we use the above coupling to get two coupled processes $(X^{(3)}(t), X^{(1)}(t))$ and $(X^{(2)}(t), X^{(4)}(t))$. Then we couple these two processes again to get a coupled process $(X^{(1)}(t), \ldots, X^{(4)}(t))$. Certainly, we have $X^{(3)}(t) \le X^{(1)}(t)$ and $X^{(2)}(t) \le X^{(4)}(t)$. Denote by $\mathbb{E}$ the expectation of the multivariate process. Then
<
IEJXp(t)- x:)(t)l< I E p p ( t )- x:)(t)I =
-y(.?) - 2$')p(t, 21
and so
This is the desired estimate.
U)U)
Next, let $\Omega_1$ and $\Omega_2$ be two operators corresponding to $P_1 = (p_1(u,v))$ and $P_2 = (p_2(u,v))$ respectively (also for finite $S$). Note that the coefficients of $\Omega_k$ ($k = 1, 2$) are just those of the multinomial distribution and the evolutions at different sites are independent of each other. Thus, if we denote by $\xi^{1,u}$ and $\xi^{2,u}$ the random configurations when the clock $N_u$ rings at $u$, then, for each $v \in S$, the distributions of $\xi^{1,u}_v$ and $\xi^{2,u}_v$ are the binomial ones with parameters $x_u$, $p_1(u,v)$ and $p_2(u,v)$, respectively. Here, we assume that the two processes start from the same configuration $x$. Now, we couple these two processes so that $|\xi^{1,u}_v - \xi^{2,u}_v|$ has binomial distribution with parameters $x_u$ and $|p_1(u,v) - p_2(u,v)|$, and denote by $\mathbb{E}$ the expectation of the coupling. Then
Finally, return to our original setup. Let $\Lambda_n \uparrow S$. Denote by $M_i$ the controlling constant for $P_i$, $i = 1, 2$, and set $M = M_1 \vee M_2$. Since the operators $\Omega_n$ ($n \ge 1$) are bounded, we have the integration by parts formula:
Let
Since
we have
Therefore
We have at last obtained the estimate
These facts are enough to deduce the existence and the uniqueness of the process, as mentioned in Remark 13.14. To conclude this section, we introduce two related models, for which the details are omitted.
Example 13.37 (Limiting Gaussian process). Take $S = \mathbb{Z}^d$, $E_u = \mathbb{R}$ and $p_u(x_u, y_u) = |x_u - y_u|$, $u \in S$. Let $\{r_{uv} : u, v \in S\}$ be a family of constants satisfying the following conditions. (1)
T,,
= Tlu-vl
(2)
Tuu
> 0,
2 0.
CvEZ",vfurvu
< Tuu.
(3) Cv~vueYlu-"l6 C < 00 for some constants C and y > 0. (4) The matrix (ruv : u,v E A) (A c Z d ) is positive definite. For each n 3 1 and u E A,, let px(z,dy) be a probability of which onto (Eu,gu) is the Gaussian measure with variance 1/ruuand mean )&,n\{ul ; C ~ T ~ ~ / T , Finally, ,.
the family of the local characteristics of the process are
as follows: 4.12(z)d y )
Cu~h, px(z, d y ) ,
q71.(z) = q n ( x >E A n ) ,
-
??,
2 1*
Then, Theorem 13.1 is suitable with the choice k , = e- YI u l , u E Zd, 0 < ;j < y and p ( z ) = Cuzuku.Actually, the limiting process is ergodic. See Basis (1980) for details.
To see the main difference between Theorems 13.1 and 13.8, let us return to Example 13.35 and use the same coupling there. We have seen that Theorem 13.8 is suitable for this model. As for Theorem 13.1, the natural choice of $(c_{uw})$ and $(c_w(n,m))$ would be $c_{uw} =$
{
u#fu,
P(U, 4 ~ w / k u ,
+ 1,
P(U>U)
U=W
4% d = Cn
w E An.
So we need more restrictions on the interaction $(p(u,v))$. In particular, if $(p(u,v))$ has a finite range, then Theorem 13.1 is applicable for the above reaction-diffusion processes. Finally, we discuss the reaction-diffusion processes with multispecies. One particular model was introduced before.
Example 13.38 (Brussel's model). See Example 4.50. Here is another example.
Example 13.39 (Volterra-Lotka model). $E_u = \mathbb{Z}_+^2$, $u \in S = \mathbb{Z}^d$. The rates from $x$ to $y$ are the following:
$$\begin{cases} \lambda_1 x_1(u) & \text{if } y = x + e_{u1},\\ \lambda_2 x_1(u)x_2(u) & \text{if } y = x - e_{u1} + e_{u2},\\ \lambda_3 x_2(u) & \text{if } y = x - e_{u2},\\ c_{ui}(x_i(u))\,p_i(u,v) & \text{if } y = x - e_{ui} + e_{vi},\ i = 1, 2,\\ 0 & \text{in the other cases of } y \ne x, \end{cases}$$
where $\lambda_1, \lambda_2, \lambda_3 > 0$, $(p_i(u,v))$ and $e_{ui}$ are the same as in Example 4.50. Of course, assume that $(p_i(u,v))$ are $M_i$-controlled matrices. In general, for each $i$, $c_{ui}(k)$ have the same properties as those of $c_u(k)$ treated before. Due to the behavior of the diffusions, it is natural to use $E_0 = \{x \in E : p(x) < \infty\}$ as our state space, where $p(x) = \sum_u (x_1(u) + x_2(u))k_u$ for some positive sequence $(k_u)$. However, the Lipschitz condition of Theorem 13.1 or Theorem 13.8 with respect to the metric $p(x)$ may not be satisfied since the interaction of the reaction part is very strong. Instead, we take
i.e., the discrete metric for the spin spaces. Then, one can prove that both Theorem 13.1 and Theorem 13.8 are applicable provided
$$c_{ui}(0) = 0,\ u \in S,\ i = 1, 2 \quad\text{and}\quad \sup_{u \in S,\ i = 1, 2,\ k \ge 1} c_{ui}(k) < \infty. \tag{13.53}$$
Furthermore, by using the weak maximum principle, one can even prove that the process is unique. The only point is that the second condition in (13.53) is certainly not satisfactory.
13.5 Appendix
The original constructions of the processes in Examples 13.32-13.34 and 13.36 are all based on the integration by parts formula for bounded jump processes (Liggett and Spitzer (1981)), as illustrated in Example 13.36. Since the formula is quite useful, we introduce a generalization to unbounded operators as follows (it appeared in Chen (1986b)). The formula has been studied by many authors; see for instance Dudley and Stroock (1987), Yan (1988).
Theorem 13.40 (Integration by Parts Formula). Given two q-pairs $(q_i(x), q_i(x, dy))$ ($i = 1, 2$) which are either bounded or conservative and satisfy the first condition of Theorem 2.22. Then, for every $f \in b\mathscr{E}$, we have
$$P_1(t)f - P_2(t)f = \int_0^t P_1(s)(\Omega_1 - \Omega_2)P_2(t-s)f\,ds.$$
Since the bounded case is easy, here we prove the unbounded one only. To do so, we need some lemmas.
Lemma 13.41. For every $f \in b\mathscr{E}$, we have $\frac{d}{dt}P(t)f = \Omega P(t)f$ for all $t \ge 0$. Moreover, $\frac{d}{dt}P(t)f$ is continuous in $t$.
Proof: By Theorem 1.2, $P(t)f$ is continuous in $t$. Then the proof of Theorem 1.15 is still suitable for the present case. ∎
Lemma 13.42. Under the hypotheses of Theorem 13.40, $P(t)\Omega f(x)$ and $P(t)(fq)(x)$ are continuous in $t$ for each $x \in E$ and $f \in b\mathscr{E}$.
Proof: By (2.29), we have
~ ( t ) c6p yect < 00 for some c 3 0. Let s
(13.54)
> 0. Then
On the other hand,
We have thus proved the continuity for $P(t)\Omega f(x)$. The proof for the other one is similar.
Lemma 13.43. Under the hypotheses of Theorem 13.40, we have for all
and
Proof: By (13.54) and the dominated convergence theorem, similar to the last proof, we obtain
By using (13.54) again, the right-hand side is bounded in finite t-intervals. So
Proof of Theorem 13.40: Clearly, we need only to prove that
$$\frac{d}{ds}\bigl[P_1(s)P_2(t-s)f\bigr] = P_1(s)(\Omega_1 - \Omega_2)P_2(t-s)f, \qquad t > s > 0,\ f \in b\mathscr{E}. \tag{13.55}$$
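Note that
$$\frac{1}{\Delta s}\bigl[P_1(s + \Delta s)P_2(t - s - \Delta s)f - P_1(s)P_2(t-s)f\bigr] = \frac{P_1(s + \Delta s) - P_1(s)}{\Delta s}P_2(t - s - \Delta s)f + P_1(s)\frac{P_2(t - s - \Delta s) - P_2(t-s)}{\Delta s}f =: I_1 + I_2.$$
For $\Delta s > 0$, by Theorem 1.2 and Theorem 1.14, we obtain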
11(4 = -
P@s) - I
As
4(s)S(t-s
- As)f(.)
4(As)
(LT\{zpl(s)P2(t - s - As,f)(.)
AS
-
- P1(*s?’”’{2})Pl(s)P2(t - s - As)f(x)
AS-4
AS QiPi(s)P2(t- s ) f ( x )- qi(z)Pi(s)P2(t -s)f(z)
= f W 1 ( s ) P 2 ( t - s)f(x),
where $Q_1$ is the kernel $q_1(x, dy)$. For $\Delta s < 0$, we have the same conclusion. Hence, $I_1 \to \Omega_1 P_1(s)P_2(t-s)f$ as $\Delta s \to 0$. Next, for $\Delta s > 0$, since
and
I
I - P2(As)
As
applying Theorem 1.14 and the dominated convergence theorem, we obtain I2 4-Pl(s)fl&(t - s ) f as As 1 0 , For As < 0, we have similarly I2
= PljS)
I - P+c!ls) -as
Pz(t - s ) f
+
-P1(s))R&(t
-S)f,
As 0.
Therefore $I_2 \to -P_1(s)\Omega_2 P_2(t-s)f$ as $\Delta s \to 0$. Combining the above facts together, we get
$$\frac{d}{ds}\bigl[P_1(s)P_2(t-s)f\bigr] = \Omega_1 P_1(s)P_2(t-s)f - P_1(s)\Omega_2 P_2(t-s)f. \tag{13.56}$$
Set $g = P_2(t-s)f$. Then
$$\Omega_1 P_1(s)g = \frac{d}{ds}P_1(s)g \ \ (\text{by Lemma 13.41}) = P_1(s)\Omega_1 g \ \ (\text{by Lemma 13.43}).$$
Combining this with (13.56), we obtain (13.55). ∎
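In the bounded case the formula obtained by integrating (13.55) over $[0, t]$ can be checked numerically. The sketch below is restricted to finite Q-matrices, where $P_i(t) = e^{tQ_i}$, and it assumes the integrated form $P_1(t)f - P_2(t)f = \int_0^t P_1(s)(Q_1 - Q_2)P_2(t-s)f\,ds$; the matrices, the vector `f` and the helper names are hypothetical, and the integral is approximated by the composite Simpson rule.

```python
import numpy as np

def expm(A, terms=40):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    norm = np.linalg.norm(A, ord=np.inf)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2.0 ** squarings
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ B / n
        out = out + term
    for _ in range(squarings):
        out = out @ out
    return out

def integration_by_parts_rhs(Q1, Q2, f, t, n=2000):
    """Composite Simpson approximation of int_0^t P1(s) (Q1 - Q2) P2(t - s) f ds (n even)."""
    h = t / n
    step1 = expm(h * Q1)                   # P1(h)
    step2 = expm(h * Q2)                   # P2(h)
    right = [f.astype(float)]              # right[k] = P2(k h) f
    for _ in range(n):
        right.append(step2 @ right[-1])
    D = Q1 - Q2
    total = np.zeros(len(f))
    left = np.eye(len(f))                  # P1(k h), built up incrementally
    for k in range(n + 1):
        weight = 1.0 if k in (0, n) else (4.0 if k % 2 == 1 else 2.0)
        total += weight * (left @ (D @ right[n - k]))
        left = left @ step1
    return h / 3.0 * total

# Two hypothetical conservative Q-matrices on the same three states.
Q1 = np.array([[-1.0, 1.0, 0.0], [2.0, -3.0, 1.0], [0.0, 4.0, -4.0]])
Q2 = np.array([[-2.0, 1.0, 1.0], [0.5, -1.0, 0.5], [3.0, 0.0, -3.0]])
f = np.array([1.0, -2.0, 0.5])
t = 1.0

lhs = expm(t * Q1) @ f - expm(t * Q2) @ f
rhs = integration_by_parts_rhs(Q1, Q2, f, t)
print(lhs, rhs)
```

For matrices the identity is elementary; the content of Theorem 13.40 is that it survives for unbounded q-pairs under the stated conditions.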
13.6 Notes
As a typical example of the interacting particle systems, the zero range process goes back to Spitzer (1970). For a special case, this process was constructed by Holley (1970). A general existence theorem for this process was obtained by Liggett (1973); the proof was then simplified by Andjel (1983) based on a method developed by Liggett and Spitzer (1981). This process has attracted a lot of attention; refer to Liggett (1985) for more references. On the other hand, the limiting Gaussian process was studied by Basis (1976, 1980) by using a different method. These two methods are our original starting points of Theorem 13.1 and Theorem 13.8. Our interest in the reaction-diffusion processes began with Yan and Li (1980), which goes back to Haken (1983) and Nicolis and Prigogine (1977). See also Arnold (1980) and Arnold and Theodosopulu (1980) for related studies. In this literature, the models are treated as an analogue of the "mean field" (see Section 15.4). Here, we emphasize the microscopic models. The finite dimensional case was treated by Yan and Chen (1986). The linear growth process was then constructed by Zheng and Ding (1987). Theorem 13.17 is originally due to Chen (1985). The present representation of Theorem 13.17 as well as the general existence theorems (Theorem 13.1 and Theorem 13.8) appeared in Chen (1986b, 1987). Refer to Dittrich (1988a, b), Shiga (1988) and the subsequent "Notes" sections for related studies and for more references. Theorem 13.18-Theorem 13.20 are taken from Chen (1991a). The special case that $m = 1$ appeared in Chen (1986b) and a special case of Theorem 13.20 was treated in Zheng (1987). The weak maximum principle goes back to Stroock and Varadhan (1979) and it was used by Tang (1985) to study the uniqueness for the model discussed at the end of the last section. The present forms of Theorem 13.23 and Theorem 13.24 are taken from Li (1991).
Example 13.31 and Example 13.35 are taken from Greven (1991) and Holley and Liggett (1991), respectively. Examples 13.32-13.34 and 13.36 are due to Liggett and Spitzer (1981); for the proof of the last one, the author was helped by X. F. Liu.
Chapter 14
Existence of Stationary Distributions and Ergodicity
In this chapter, we study the existence of stationary distributions and the ergodicity for the processes constructed in the previous chapter. In particular, we prove that the reaction-diffusion processes often have stationary distributions and sometimes are ergodic. Some general results are presented in Section 14.1. A refined criterion for the ergodicity is introduced in Section 14.2. In Section 14.3, we prove that the reversible reaction-diffusion processes are ergodic and so have no phase transition.
14.1 General Results
Theorem 14.1. Under the hypotheses of Theorem 13.1 with $p(x) = \sum_u p_u(x)k_u$, suppose in addition that for each $u \in S$ there is a compact $h_u < \infty$ such that
$$p_u(x_u) \le h_u(x_u), \qquad x_u \in E_u, \tag{14.1}$$
and there are constants $K \in [0, \infty)$ and $\eta \in (0, \infty)$ such that
$$\Omega_n h_n \le K - \eta h_n \quad (\text{resp. } \le 0), \tag{14.2}$$
where $h_n(x) = \sum_{u \in \Lambda_n} h_u(x_u)k_u$. Then
(1) For each $n \ge 1$, the process $P_n(t)$ has at least one stationary distribution $\pi_n$ satisfying
$$\pi_n(h_n) \le K/\eta \quad (\text{resp. } \le \text{const.}). \tag{14.3}$$
(2) The process $P(t)$ constructed in Theorem 13.1 has at least one stationary distribution $\pi$, which can be obtained as a weak limit of a subsequence of the $\pi_n$'s and satisfies
$$\pi(p) \le \pi(h) \le K/\eta \quad (\text{resp. } \le \text{const.}), \tag{14.4}$$
where $h(x) = \sum_u h_u(x_u)k_u$, $x \in E_0$.
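The mechanism behind (14.2) and (14.3) can be seen on a single finite birth-death chain: if $Qh \le K - \eta h$ and $\pi$ is stationary, then $0 = \pi(Qh) \le K - \eta\,\pi(h)$. The sketch below is only an illustration under assumptions that are not part of the theorem: one site, state space truncated to $\{0, \ldots, N\}$ with births switched off at $N$, $h(i) = i$, and hypothetical Schlögl-type rates.

```python
import numpy as np

def birth_death_generator(b, a, N):
    """Q-matrix of a birth-death chain on {0,...,N}; births are switched off at N."""
    Q = np.zeros((N + 1, N + 1))
    for i in range(N + 1):
        if i < N:
            Q[i, i + 1] = b(i)
        if i > 0:
            Q[i, i - 1] = a(i)
        Q[i, i] = -Q[i].sum()          # diagonal is still zero here, so this is minus the exit rate
    return Q

def stationary_distribution(b, a, N):
    """pi_i proportional to prod_{k<i} b(k)/a(k+1), from detailed balance."""
    w = np.ones(N + 1)
    for i in range(1, N + 1):
        w[i] = w[i - 1] * b(i - 1) / a(i)
    return w / w.sum()

# Hypothetical rates of Schlogl's first model type.
b = lambda i: 2.0 + 1.0 * i
a = lambda i: 3.0 * i + 0.5 * i * (i - 1)
N = 50

Q = birth_death_generator(b, a, N)
pi = stationary_distribution(b, a, N)
h = np.arange(N + 1, dtype=float)      # Lyapunov function h(i) = i

eta = 1.0
K = np.max(Q @ h + eta * h)            # smallest K with Q h <= K - eta h on the truncated space
print(pi @ h, K / eta)
```

With these rates the script reports a stationary mean well below the bound $K/\eta$, which is all that (14.3) claims.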
Theorem 14.2. Under the hypotheses of Theorem 13.8 with compact $h_u = p_u$ ($u \in S$), if (14.2) holds, in particular, if the constants $c_1$ and $M$ given in Theorem 13.8 satisfy
$$c_1 + M < 0, \tag{14.5}$$
then the conclusions of Theorem 14.1 hold for the processes $P_n(t)$ and the process $P(t)$ constructed in Theorem 13.8.
Proofs of Theorem 14.1 and Theorem 14.2: a) Note that (14.5) implies the first case in (14.2) and that $P_n(t)f(x)$ is continuous in $x$ for each $f \in b\mathscr{L}ip(E_0)(E_0)$. The assertion for the finite dimensional case follows from Theorem 4.14. b) Since the proofs for the two cases in (14.2) are the same, we consider only the first case. By (14.3), we have
Hence, by Theorem 4.7, $\{\pi_n : n \ge 1\}$ is relatively compact in finite-dimensional distributions. Choose a subsequence if necessary, such that $\pi_{n_k}$ converges to $\pi$ in finite-dimensional distributions as $k \to \infty$. Using (14.3) again, it follows that
d h n ) < %dh-J < K / %
m
<
and so, by Theorem 4.5, we have ~ ( h , ) K/q. This gives (14.4). c) To complete the proofs, WP IIOW need to check that T is a stationary f distribution of P ( t ) . For this, it suffices to show that x(f) = ~ P ( t ) for every f E b%yl(E) n.Bp(Eo)(Eo).In the following, we fix such an f . First, consider the process given in 'l'heorem 13.1. As we did before, let El " A (= B" denote IA - BI 6
F.
Then for every E
> 0 and large enough N ,we
have
The second and the last equalities come from the fact that $\pi_{n_k}$ converges to $\pi$ in finite-dimensional distributions. The seventh equality is due to the property that $\pi_{n_k}$ is a stationary distribution of $P_{n_k}(t)$. The first and the fourth equalities come from (13.12). The fifth follows from (13.13). Finally,
note that
for all
Then the $\varepsilon$-equalities follow by (14.1) and (14.3). The proof of Theorem 14.1 is now finished.
d) Next, consider the process given by Theorem 13.8. Without loss of generality, assume that $\pi_n$ converges to $\pi$ in finite-dimensional distributions. Clearly, we have
By Lemma 13.12, (14.4) and the dominated convergence theorem, we have
On the other hand, by (13.22), (14.4) and the dominated convergence theorem, we have
Next, by Lemma 13.9,
So
Letting $N \to \infty$, we get
the right-hand side is dominated by
Hence, we have
= 0.
Finally, for each m, since Q, := {P,(t)f(z*m): n 3 1) is a uniformly bounded family of functions which are equicontinuous at each point in EAm and rn converges to r in the finite-dimensional distributions, we have lim sup Inn(f) - r ( f ) l= 0
n-+w f € Q
(cf. Stroock and Varadhan (1979), p.11, Corollary 1.1.5). Especially, lim 1 ; n--+rn = 0 for all m. In brief words, for given E > 0, we can choose sufficient large rno so that IVmo < e/4 and supnIT0 < e/4. Then choose no such that, IUzo< ~ / and 4 IIro < e/4, n 2 no. We obtain InnPn(t)f - ;.rP(t)fl< E We now turn to study the ergodicity for the processes.
Theorem 14.3. Under the hypotheses of Theorem 13.1 (resp. Theorem 13.8) with $p(x) = \sum_u p_u(x)k_u$, suppose additionally that the coefficients $(c_{uv})$ given in (13.2) (resp. (13.17)) also satisfy
$$\sum_w c_{uw} \le -\eta < 0, \qquad u \in S, \tag{14.6}$$
and
$$\sum_w |c_{uw}| \le K < \infty, \qquad u \in S. \tag{14.7}$$
Then
(1) The process $P(t)$ constructed in Theorem 13.1 (resp. Theorem 13.8) has exactly one stationary distribution $\pi$ satisfying
for every f E h%jf!(E) f l z i p ( & ) ( & ) (resp. f f ~ i p ( & ~ ) ( ~ o ) ) ~ (2) For each n 2 1, if the coefficients c,,(n,n) given in (13.8) vanish, then Pn(t) has exact one stationary distribution rn satisfying
Proof: We prove the first assertion only since the second one can be proved in the same way. To do so, we first justify the conditions of Theorem 5.23. The first moment condition is covered by (13.11) (resp. (13.25)) with $\mathscr{P}_0 = \{\mu \in \mathscr{P}(E_0) : \mu(p) < \infty\}$. Next, by the main estimate (13.8) (resp. (13.22)), we have
where Cn(f) = exp [tC;] Thus, if we set I
<
then conditions (14.6) and (14.7) give us ~ , ~ ? ~ ~ e-8'( tfor>,211 w II: S. 14islherrnore W A n ( f ) n ( t , E > ' ) )Pn(t,p,'))6
d~~~k-''.
We have proved in the construction theorems that for fixed $t$ and $x$, $P_n(t, x, \cdot)$ converges to $P(t, x, \cdot)$ in $W_\Lambda$ for every $\Lambda$, and then in the finite-dimensional weak topology by Theorem 5.5, or equivalently, in the product topology by Remark 4.6. Because $\tilde P_{n,n}(t)$ determined by $\tilde\Omega_{n,n}$ is a (Markovian) coupling of $P_n(t)$, we claim that there is a subsequence $\{n'\} \subset \{n\}$ such that $\tilde P_{n',n'}(t; x, y, \cdot)$ converges weakly in the product topology to a limit $\tilde P(t; x, y, \cdot)$, which should be a coupling of $P(t, x, \cdot)$ and $P(t, y, \cdot)$ for each fixed $t$, $x$ and $y$. Besides, $p$ is lower semi-continuous in the product topology. Thus, by Theorem 4.5 and the last estimate, we obtain
This proves condition (2) of Theorem 5.23 and hence the process is exponentially ergodic with stationary distribution $\pi \in \mathscr{P}_0$. The proof of (14.9) is now rather easy, since
Remark 14.4. An alternative way to prove (14.9) goes as follows. By Theorem 5.10, $W(P(t, x_1, \cdot), P(t, x_2, \cdot))$ is measurable in $(x_1, x_2)$, and moreover
A direct proof of the joint measurability of $W(P(t,x_1,\cdot), P(t,x_2,\cdot))$ without using Theorem 5.10 goes as follows. Note that
By the triangle inequality, we have
$$\begin{aligned} &|W(P(t,x_1,\cdot),P(t,x_2,\cdot)) - W(P(t,x_3,\cdot),P(t,x_4,\cdot))| \\ &\quad\le |W(P(t,x_1,\cdot),P(t,x_2,\cdot)) - W(P(t,x_2,\cdot),P(t,x_3,\cdot))| + |W(P(t,x_2,\cdot),P(t,x_3,\cdot)) - W(P(t,x_3,\cdot),P(t,x_4,\cdot))| \\ &\quad\le K(t)\,\big(p(x_1,x_3) + p(x_2,x_4)\big). \end{aligned}$$
Hence, $W(P(t,x_1,\cdot), P(t,x_2,\cdot))$ is indeed continuous in $(x_1,x_2)$.
As a straightforward consequence of the above theorems, for the reaction-diffusion processes, we have
Corollary 14.5. Under the hypotheses of Theorem 13.17, if (14.2) holds, then there exists exactly one stationary distribution $\pi$. If c,+cM < 0, then the process is ergodic and $W(P(t,x,\cdot), \pi) \le (p(x) + \pi(p))\,e^{-\eta t}$ for all $x \in E_0$ and $t \ge 0$.
Examples 14.6. Suppose that $\sum_u k_u < \infty$ and $p(u,u) = 0$.
(1) The stationary distribution always exists for the polynomial model.
(2) For the first Schlögl model, a sufficient condition for ergodicity is that $\beta_1 - \delta_1 + M < 0$.
(3) For the second Schlögl model, a sufficient condition for ergodicity is that $-\delta_1 + \beta_2 + 3\delta_3/4 + \beta_2^2/(3\delta_3) + M < 0$.
(4) For the linear growth process, a sufficient condition for the existence of a stationary distribution and for ergodicity is that $\lambda_1 - \lambda_2 + M < 0$.
Proof: Consider first the polynomial model. Recall that
$$b(k) = \sum_{j=0}^{m_0} \beta_j k^{(j)}, \qquad a(k) = \sum_{j=1}^{m_0+1} \delta_j k^{(j)}.$$
Let $\eta > 0$, $C = \sup\{(M + \eta)k + b(k) - a(k) : k \ge 0\}$ and take $K = C\sum_u k_u$. Then $C \in [0,\infty)$ and $K \in [0,\infty)$. On the other hand, for $h_u(x_u) = p_u(x_u) = x_u$, we have
6 So
1[Mx,+ b(z,) - u(x,)]k,.
Assertion (1) now follows from Theorem 14.2. For the linear growth model, some restriction is needed for the existence of a stationary distribution, even in the one-dimensional case (cf. Example 4.57). The assertions about ergodicity follow from Theorem 14.3.
Remark 14.7. Sometimes, we can choose a transition probability $(p_n(u,v))$ on $\Lambda_n$ for each $n \ge 1$ to keep the M-controlled condition. (For instance, this is the case if $(p(u,v))$ is translation invariant in $S = \mathbb{Z}^d$ with $p(u,u) = 0$. See Remark 14.9 and proof c) of Theorem 14.10 below.) Then the constant $M$ used in the above examples can be replaced by $M - 1$. In general, it does not seem natural to involve the constant $M$ in the ergodic conditions for a real model. However, some examples show that the ergodicity does depend on the choice of $M$ and its corresponding sequence $(k_u)$. On the other hand, for a given model, the choice of $M$ is usually not unique. So it is natural to take
$$E_{\min} := \bigcap_M \Big\{x \in E : \sum_u x_u k_u(M) < \infty\Big\}$$
as our (minimal) state space instead of $E_0$, where $M$ varies over the following set:
$$\Big\{M > 0 : \text{there exists a positive sequence } (k_u(M)) \text{ so that } \sum_v p(u,v)k_v(M) \le M k_u(M) \text{ for all } u \in S\Big\}.$$
But in this book, we often ignore the minimizing procedure since it is automatic whenever the conclusions are independent of a specific choice of $M$.

14.2 Ergodicity for Polynomial Model
In this section, we study the ergodicity for an important class of reaction-diffusion processes, the polynomial model. Recall that for this model, we have the formal generator
with birth-death rates
$$b(k) = \sum_{j=0}^{m_0} \beta_j k^{(j)}, \qquad a(k) = \sum_{j=1}^{m_0+1} \delta_j k^{(j)}, \tag{14.12}$$
where $k^{(j)} = k(k-1)\cdots(k-j+1)$, $\beta_j \ge 0$, $\delta_j \ge 0$. From now on, we will often use the following natural hypothesis.
Hypothesis 14.8 (H). The transition probability $p(u,v)$ is translation invariant and irreducible in $S = \mathbb{Z}^d$, $p(u,u) = 0$ and $m_0 \ge 1$, $\beta_0, \delta_1, \delta_{m_0+1} > 0$.
Remark 14.9. Because of the translation invariance, for each $M > 1$, the sequence $(k_u)$ defined by
$$k_u := \sum_{n=0}^{\infty} M^{-n} p^{(n)}(u, 0), \qquad u \in S \tag{14.13}$$
possesses the properties: $\sum_v p(u,v)k_v \le M k_u$ for all $u \in S$ and $\sum_u k_u \le M/(M-1) < \infty$. Thus, we may make $M$ as close to 1 as desired. Combining this with Remark 14.7, we see that the constant $M$ in Examples 14.6 disappears. Thus, in the present situation, the sufficient conditions for the ergodicity of the first and the second Schlögl models are
$$\beta_1 < \delta_1 \tag{14.14}$$
and
$$\delta_1 > \beta_2 + \frac{3\delta_3}{4} + \frac{\beta_2^2}{3\delta_3} \tag{14.15}$$
respectively. Clearly, the pure birth rate $\beta_0$ plays no role in these conditions. The reason is that the distance used to deduce the conditions is the ordinary (in particular, translation invariant) distance on $\mathbb{Z}_+$ and the coupling used there is the one of marching soldiers. All these are certainly not necessary. In view of Theorems 5.37 and 5.38, there are two ways to improve the above results. The first is to introduce a refined translation invariant distance and adopt the coupling by reflection, based on Theorem 5.37. The second is to use the classical coupling and a non-translation invariant distance, as suggested by Theorem 5.38. Both ways are meaningful. Here we adopt the second one as an illustration. Thus, we consider the following distance
$$p(k,\ell) = \Big|\sum_{j<k} u_j - \sum_{j<\ell} u_j\Big|$$
for a given positive sequence $(u_k)$ on $\mathbb{Z}_+$. Clearly, $p$ is non-translation invariant unless $u_k \equiv$ constant. Here is the main result in this section.
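The two properties of the sequence $(k_u)$ stated in Remark 14.9 can be checked numerically on a finite stand-in for $\mathbb{Z}^d$. The following is only a sketch under stated assumptions: the site set is the cycle $\mathbb{Z}_{11}$ with a nearest-neighbour walk (an illustrative choice, not the setting of the text), the series is truncated after `n_max` terms, and the function names `k_sequence` and `cycle_walk` are hypothetical.

```python
def k_sequence(p, M, n_max):
    """Truncated k_u = sum_{n=0}^{n_max-1} M^{-n} p^{(n)}(u, 0) on a finite site set."""
    size = len(p)
    col = [1.0 if u == 0 else 0.0 for u in range(size)]  # p^{(0)}(., 0)
    k = [0.0] * size
    weight = 1.0
    for _ in range(n_max):
        k = [k[u] + weight * col[u] for u in range(size)]
        # p^{(n+1)}(u, 0) = sum_v p(u, v) p^{(n)}(v, 0)
        col = [sum(p[u][v] * col[v] for v in range(size)) for u in range(size)]
        weight /= M
    return k


def cycle_walk(size):
    """Nearest-neighbour walk on the cycle Z_size: translation invariant, p(u,u) = 0."""
    p = [[0.0] * size for _ in range(size)]
    for u in range(size):
        p[u][(u + 1) % size] = 0.5
        p[u][(u - 1) % size] = 0.5
    return p


SIZE, M = 11, 1.05
p = cycle_walk(SIZE)
k = k_sequence(p, M, n_max=2000)

# M-controlled property: sum_v p(u, v) k_v <= M k_u for every site u.
pk = [sum(p[u][v] * k[v] for v in range(SIZE)) for u in range(SIZE)]
controlled = all(pk[u] <= M * k[u] + 1e-12 for u in range(SIZE))

# Summability: on the cycle the total mass equals M / (M - 1) up to truncation.
total = sum(k)
print(controlled, total, M / (M - 1))
```

Taking `M` closer to 1 keeps the control property but makes the total mass $M/(M-1)$ large, which is the trade-off behind "we may make $M$ as close to 1 as desired".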
Theorem 14.10. Let $(u_k)$ be a positive sequence on $\mathbb{Z}_+$ with $u_0 = 1$ and $\bar u := \sup_{k \ge 0} u_k < \infty$. Set $u^* = 0 \vee \sup_{k \ge 0}(u_{k+1} - u_k)$. Suppose that there exists an $\varepsilon > 0$ such that
$$b_{k+1}u_{k+1} - (b_k + a_{k+1} + k + 1 - \varepsilon)u_k + (a_k + k)u_{k-1} + \bar u + k u^* \le 0, \qquad k \ge 0, \tag{14.16}$$
where $a_0 = 0$ and $u_{-1} = 1$. Then under (H), the reaction-diffusion processes are exponentially ergodic, uniformly in initial points.
We remark that (14.14) and (14.15) can be deduced from (14.16) by setting $u_k \equiv 1$. For instance, for the second Schlögl model, (14.16) with $u_k \equiv 1$ becomes $2\beta_2 k - \delta_1 - 3\delta_3 k(k-1) < 0$. This is trivial when $k = 0$. Hence, replacing $k$ by $k+1$, the condition becomes
$$3\delta_3 k^2 + (3\delta_3 - 2\beta_2)k + \delta_1 - 2\beta_2 > 0, \qquad k \ge 0.$$
This holds iff $\Delta := (2\beta_2 - 3\delta_3)^2 - 12\delta_3(\delta_1 - 2\beta_2) < 0$. That is (14.15). The next two corollaries are out of the range of (14.14) and (14.15).
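The passage from (14.16) with $u_k \equiv 1$ to the discriminant condition can be sanity-checked numerically. The sketch below reads the discriminant as $\Delta = (2\beta_2 - 3\delta_3)^2 - 12\delta_3(\delta_1 - 2\beta_2)$ and the threshold in (14.15) as $\beta_2 + 3\delta_3/4 + \beta_2^2/(3\delta_3)$; the function names and the sample parameter values are illustrative assumptions, and the lattice check only scans finitely many $k$.

```python
def lattice_condition(beta2, delta1, delta3, k_max=10_000):
    """Check 2*beta2*k - delta1 - 3*delta3*k*(k-1) < 0 for k = 0, ..., k_max."""
    return all(2 * beta2 * k - delta1 - 3 * delta3 * k * (k - 1) < 0
               for k in range(k_max + 1))


def discriminant(beta2, delta1, delta3):
    """Delta = (2*beta2 - 3*delta3)^2 - 12*delta3*(delta1 - 2*beta2)."""
    return (2 * beta2 - 3 * delta3) ** 2 - 12 * delta3 * (delta1 - 2 * beta2)


def threshold(beta2, delta3):
    """Right-hand side of delta1 > beta2 + 3*delta3/4 + beta2^2/(3*delta3)."""
    return beta2 + 0.75 * delta3 + beta2 ** 2 / (3 * delta3)


# Inside the region: Delta < 0, delta1 above the threshold, lattice condition holds.
inside = (1.0, 2.5, 1.0)
# Below the threshold: Delta > 0 and the inequality already fails at k = 1.
outside = (1.0, 1.9, 1.0)
# Coefficients of Corollary 14.12 with alpha = 1: beta2 = 6, delta1 = 9, delta3 = 1.
cor_14_12 = (6.0, 9.0, 1.0)

for b2, d1, d3 in (inside, outside, cor_14_12):
    print(discriminant(b2, d1, d3) < 0, d1 > threshold(b2, d3),
          lattice_condition(b2, d1, d3))
```

For real-valued $k$ the sign of $\Delta$ decides the quadratic inequality; on the integers $\Delta < 0$ is sufficient, so the lattice check may in principle succeed for some parameters with $\Delta \ge 0$.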
Corollary 14.11. Under (H), the processes are exponentially ergodic, uniformly in initial points, provided
(1) $\delta_{m_0+1}$ is large enough for fixed $\beta_k$ and $\delta_k$, $k \le m_0$, or
(2) $\beta_0$ is large enough for fixed $\beta_k$ and $\delta_k$, $k \ge 1$.
Proof: Take $u_k = (k+1)^{-1}$, $k \ge 0$. Then $\bar u = 1$, $u^* = 0$ and (14.16) becomes
Since the degree of $a_k$ is higher than that of $b_k$, this inequality holds for large enough $k$. Next, for fixed $k$, in case (1) (resp. (2)), the second (resp. first) term on the left can be arbitrarily negative for large enough $\delta_{m_0+1}$ (resp. $\beta_0$). Now, the assertion follows from Theorem 14.10.
Of course, in order to get a more precise ergodic region, some restriction is necessary. The one-parameter coefficients used in the next corollary have a deep reason, which will be explained at the end of §16.1.
Corollary 14.12. Consider the second Schlögl model with $\beta_0 = 2\alpha$, $\beta_2 = 6\alpha$, $\delta_1 = 9\alpha$ and $\delta_3 = \alpha > 0$. Then, the processes are exponentially ergodic, uniformly in initial points, for all $\alpha \ge 0.7303$.
Proof: Take
E
6 lo-', uo = 1, u1 = u2 = 3/2
+
E
(trick!),
Define $k_1 = \inf\{k \ge 2 : u_{k+1} \ge u_k\}$. When $\alpha = 0.7303$, a numerical computation gives us $k_1 = 15$ ($k_1$ can be smaller if $\alpha$ is bigger). We now show a technical point so that the computation can be stopped in finitely many steps.
First, replace ukl+l.by ukl =: IL, (14.16) still holds. Next, suppose that we have already had ux: = uc+1 = g for some k 2 2. Then, in order for uk4-2
2 g,by definition, it suffices that (%+2
+ bk+l + 1
-
E
-
Uk+l)%
- (k
bc+2 Equivalently, !L>
Au(k
+ 2)u, + k + 1 2 2.
( k + 2)u, - ( k + 1) + 1) - A b ( k + 1) +'l -
E
where $\Delta a(k) = a_{k+1} - a_k$. Thus, by induction, once this holds, we can indeed use $u_k = g$ instead of the original $u_k$ for all $k \ge k_1$. In other words, the computation of $(u_k)$ can be stopped at $k_1 + 1$. Next, since the resulting sequence $(u_k)$ satisfies (14.16) with $\bar u = u_1$ and $u^* = \bar u - 1$, the conclusion of the corollary follows from Theorem 14.10. The remainder of this section is devoted to the proof of Theorem 14.10. For this, the next estimate plays a key role.
Lemma 14.13 (Estimate of Moments). Under (H), for every $m \ge 1$, there exists a decreasing function $\varphi_m : (0,\infty) \to [0,\infty)$ such that
$$\mathbb{E}^x X_u(t)^m \le \varphi_m(t) \quad \text{for all } t > 0,\ u \in \mathbb{Z}^d \text{ and } x \in E_0^s, \tag{14.17}$$
where $(X(t))_{t \ge 0}$ is the reaction-diffusion process and
$$E_0^s = \{x \in E_0 : x_u = x_0 \text{ for all } u \in \mathbb{Z}^d\}.$$
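The notable feature of Lemma 14.13 is that the bound does not depend on the initial configuration. A toy illustration of that phenomenon, not the argument of the text: for the scalar equation $g' = -c\,g^{1+\alpha}$ (a simplified stand-in for a differential inequality with superlinear damping, with hypothetical constants $c, \alpha > 0$), the explicit solution $g(t) = (g_0^{-\alpha} + \alpha c t)^{-1/\alpha}$ is bounded by $(\alpha c t)^{-1/\alpha}$ whatever $g_0$ is.

```python
def solution(g0, t, c=1.0, alpha=0.5):
    """Explicit solution of g' = -c * g**(1 + alpha) with g(0) = g0 > 0."""
    return (g0 ** (-alpha) + alpha * c * t) ** (-1.0 / alpha)


def uniform_bound(t, c=1.0, alpha=0.5):
    """Bound valid for every initial value: the formal solution with g0 = infinity."""
    return (alpha * c * t) ** (-1.0 / alpha)


# However large the starting point, the value at time t stays below the same bound,
# and the bound itself is decreasing in t.
for g0 in (1e3, 1e6, 1e12):
    print(g0, solution(g0, 1.0), uniform_bound(1.0))
```

In the lemma the role of the explicit formula is played by a comparison argument, but the outcome has the same shape: a decreasing function of $t$ that dominates the moment for all admissible starting points.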
Proof: a) Let En denote the expected value of the process starting from xu = n. Note that Et c Em for all m 2 1, where Em was defined in the last chapter:
E,, =
{
II: E
EO : p ' " ) ( ~ ) := C ~ r k<, 00 U
Moreover, by Theorem 13.18, we have
< c (1 +
iE,p(m) ( X ( t ) )
(x))ect
far some C,c 2 0. In particular,
E,X,(t)" Set
f ~ ( 2 : )= 1 ; :
A".
Q (C/C,k,,
+ rtm)@,
rn 2 1.
Then flv E Yip(fY~)(&), 13y 'I'heorcrn 13.19, we have
Observe that using the Hölder inequality with $p = j + 1$ and $q = (j+1)/j$ shows
by translation invariance. This estimate is trivial when $j = 0$. On the other hand, $\Omega f_N(X(s))$ is dominated by
which is integrable with respect to $\mathbb{P}^n$ uniformly in finite $s$-intervals. Hence, we have
IE,X,(t)'" with
-I
nrrl-
&-JWds)mIds
IE, (Ct(X,(s))1 uniformly boundcd in finite s-intervals. This proves that
+
where L I ( ~ =) (k( /1)" C)k". Furthermore,
b) Rewrite (14.18) as follows:
where y(k) = km0/". Clearly, the fiiriction y possesses the following properties: (i) y : W+ -+ R+ is continuous, $0) = 0 and inftE[l,M)y(l) > 0. (ii) ty(t) is convcx in [I,00). (iii) Jam < 00 for evcry a > 1. (iv) For each Large K , there exists E = E ( K )E (0,l) such that
&
y(k/K)
< K(1 - E ) Y ( k ) ,
k
E
N.
We now use these properties to complete our proof. By (14.19), (ii) and Jensen's inequality, we get
Or
$$\frac{d}{dt} g_n(t) \le c_1 - c_2\, g_n(t)\,\gamma(g_n(t)), \qquad t \ge 0, \tag{14.20}$$
where $g_n(t) = \mathbb{E}^n(X_0(t)^m)$. Next, by (i), $r\gamma(r) \to \infty$ as $r \to \infty$. Hence, there is an $M > 1$ so that
On the other hand, by (14.19) and (i), there exist $C_2, c_2 > 0$ such that
By Gronwall's lemma,
and so
$$n^m \le M \Longrightarrow g_n(t) \le KM \tag{14.22}$$
for some $K \in \mathbb{N}$. Consider the case that $g_n(0) = n^m \in (M, KM]$. By (14.21), we have the following picture: $g_n(t)$ is strictly decreasing until some $t_2 > t_1 := \inf\{t > 0 : g_n(t) \le M\}$ and so $g_n(t) \le KM$ for all $t \in (0, t_2]$. Furthermore, $g_n(t) < M$ for all $t > t_2$ up to $t_3 := \inf\{t > 0 : g_n(t) \ge M\}$. However, by (14.21) again, we have $g_n(t_3) = M$ and $g_n'(t_3) < 0$. Hence, the function $g_n$ goes down again, provided $t_3 < \infty$. Thus, $g_n$ can never exceed $M$ for all $t \ge t_2$. Combining this with (14.18), we obtain
$$n^m \le KM \Longrightarrow g_n(t) \le KM. \tag{14.23}$$
Next, we consider the case that $n^m > KM$. Define $T_{KM} = \inf\{t \ge 0 : g_n(t) \le KM\}$.
Then by (14.20), (14.21) and the fact that $g_n(t) \ge KM$ for all $t < T_{KM}$, we have
Now, applying (iv), we obtain
$$\frac{d}{dt} g_n(t) \le -\varepsilon\varepsilon_1\, g_n(t)\,\gamma(g_n(t)), \qquad t < T_{KM}. \tag{14.24}$$
Write $\varepsilon_2 = \varepsilon\varepsilon_1$. Note that, on the one hand, we have
Here we have used the fact that $g_n(t) \in [KM, n^m]$ for all $t < T_{KM}$ and (iii). On the other hand, by (14.24), we have
Therefore, $T_{KM} < \infty$ and so $g_n \ge KM$ on $[0, T_{KM}]$ by the continuity of $g_n$. By using (14.24) again, we get
That is
Set O0
Then
du
0 .
Because $\Gamma$ is strictly decreasing in $(0,\infty)$, its inverse function $\Gamma^*$ not only exists but also decreases. From (14.25), it follows that
$$\log g_n(t) \le \Gamma^*(\varepsilon_2 t) < \infty, \qquad t \in (0, T_{KM}],$$
i.e., $g_n(t) \le \exp[\Gamma^*(\varepsilon_2 t)]$, $t \in (0, T_{KM}]$. Combining this with (14.23), it follows that
$$\varphi_m(t) := \exp[\Gamma^*(\varepsilon_2 t)] \vee (KM), \qquad t > 0,$$
provides a desired function.

Proof of Theorem 14.10: The proof is split into five steps.
a) First, consider the finite dimensional case. Let $S$ be a finite additive group. Suppose that $(p(u,v) : u, v \in S)$ is a translation-invariant transition probability:
$$p(u + w, v + w) = p(u,v) \qquad \text{for all } u, v, w \in S.$$
By using $S$ instead of $\mathbb{Z}^d$, one can define an operator as in (14.11). For the diffusion part, for each $u$, we adopt the coupling of marching soldiers:
$$\begin{aligned} (x,y) &\to (x - e_u + e_v,\; y - e_u + e_v) &&\text{at rate } (x_u \wedge y_u)\,p(u,v),\\ &\to (x - e_u + e_v,\; y) &&\text{at rate } (x_u - y_u)^+\,p(u,v),\\ &\to (x,\; y - e_u + e_v) &&\text{at rate } (y_u - x_u)^+\,p(u,v). \end{aligned}$$
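The marching soldiers rates for the diffusion part can be written out and checked mechanically. The following is a minimal sketch, assuming a finite site set indexed by `0..n-1` and a transition matrix `p` given as nested lists; the helper names `marching_soldiers_moves` and `marginal_rate` are hypothetical. It lists the coupled moves and confirms that each marginal moves a particle from $u$ to $v$ at its own rate $x_u p(u,v)$ (resp. $y_u p(u,v)$).

```python
def marching_soldiers_moves(x, y, p):
    """Return coupled diffusion moves as ((move_x, move_y), rate).

    move_x / move_y is a pair (u, v) meaning 'one particle goes from u to v',
    or None when that marginal does not move.
    """
    moves = []
    sites = range(len(x))
    for u in sites:
        for v in sites:
            if p[u][v] == 0.0:
                continue
            both = min(x[u], y[u]) * p[u][v]         # (x_u ^ y_u) p(u, v)
            only_x = max(x[u] - y[u], 0) * p[u][v]   # (x_u - y_u)^+ p(u, v)
            only_y = max(y[u] - x[u], 0) * p[u][v]   # (y_u - x_u)^+ p(u, v)
            if both > 0:
                moves.append((((u, v), (u, v)), both))
            if only_x > 0:
                moves.append((((u, v), None), only_x))
            if only_y > 0:
                moves.append(((None, (u, v)), only_y))
    return moves


def marginal_rate(moves, which, u, v):
    """Total rate at which marginal `which` (0 for x, 1 for y) moves u -> v."""
    return sum(rate for move, rate in moves if move[which] == (u, v))


# Nearest-neighbour walk on the cycle Z_3 and two ordered configurations x <= y.
p = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
x, y = [2, 0, 1], [3, 1, 1]
moves = marching_soldiers_moves(x, y, p)
for u in range(3):
    for v in range(3):
        if u != v:
            print(u, v, marginal_rate(moves, 0, u, v), marginal_rate(moves, 1, u, v))
```

When $x \le y$ the "only x moves" rate vanishes at every site, so no coupled move of this kind can break the order.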
For the reaction part, at each $u \in \mathbb{Z}^d$, due to Theorem 5.38, we adopt the classical coupling. If $x_u = y_u$, then the two marginal processes evolve at exactly the same rates. If $x_u \ne y_u$, then they jump independently:
$$\begin{aligned} (x,y) &\to (x + e_u,\; y) &&\text{at rate } b(x_u),\\ &\to (x - e_u,\; y) &&\text{at rate } a(x_u),\\ &\to (x,\; y + e_u) &&\text{at rate } b(y_u),\\ &\to (x,\; y - e_u) &&\text{at rate } a(y_u). \end{aligned}$$
Clearly, this coupling is order-preserving.
b) Next, we make some computations. Denote by $\widetilde{\Omega}$ the coupling operator defined above. Fix $x \le y$ and $u \in S$, write $x_u = i \le j = y_u$. We have
,.,
R,-p(i,j)=
<
{ - biui + ~ i u i - l +bjuj - ~
j ~ j - ~ } ~~( jj - i)uj-I i > ~ ~
(14.26)
where I k 2 1 is the indicator of the set { (z,y) : y, - 2, 2 l}. The last term on the right-hand side appears since p is not translation invariant. Now, by (14.16), for the first term on the right of (14.26), we have
e=i j-1
< -.>ut
- ( j - 2)'lL -
( j - 2)iu'
t=i
< -ep(i, j ) - ( j - i)u - iu*.
(14.27)
On the other hand, we have monotonicity, x translation invariance E"?Z,(t)
= +?zo(t),
C p ( v ,u)= C p ( 0 ,u - v) = 1, 21
V
where Z ( t ) = ( X ( t ) , Y ( t ) )for , any initial x
E"
= {X E
< y + X ( t ) < Y ( t ) ,a.s., and
Z$ : X,
< y, z,y E ES:
= x0,u E
S}.
Corresponding to the last two terms on the right of (14.26), we have
Here is the main place we have to pay for the method. Because of the interaction, one can not replace u ~ ~and( ~~l ~ ), ( -~ u ) ~ , ( ~ with ) u ~ ~and( U Y V ( t )- U X u ( t ) ' respectively. Otherwise, the system would become independent one for which the process is exponentially ergodic. Combining (14.26) and (14.27) with (14.28), we obtain
Hence, we arrive at
c) Let $\Lambda_N = [-N+1, N]^d \subset \mathbb{Z}^d$ and regard $\Lambda_N$ as the torus $S_N = \mathbb{Z}^d/(2N\mathbb{Z}^d)$, the factor group. On $S_N$, we can introduce a shifting operator in a natural way, so that translation invariance is meaningful. Next, for a given translation-invariant transition probability $p(u,v)$ on $\mathbb{Z}^d$, we can introduce $p_N(u,v)$ on $S_N$ with the same property:
Here we have identified u E SN as an element in Zd. Clearly, this ( p N ( u ,v) : u,u E S,) possesses the M-controlling property mentioned in Remark 14.9. Applying a) and b) to the present case with an obvious change of notations, we get IE>Yp(Zo(t)) 2~"(Zo(l))e-"'"-'', t21 (14.29)
<
<
for any initial II: y, 2 , y E B&. d) To go to the infinite dimensional case, regard the above process & ~ ( t as ) a process on Z p . Let fi denote the infinite dimensional coupling operator constructed in the same way as in a). It is easy to check that for every u E Zd, if we put h(z,y) = Cu(xcuyu)k,, then
+
for some constant c < co. Moreover, the interaction between two boxes [ i.e., I I : ~ ~ ( U , Vis) ]at most linear. Based on these facts, one may find a Markov process with generator fi as we did in the last chapter. An alternative way is to take a weak limit @ in the usual Skorohod topology, which is a solution to the martingale problem for the operator fi (the existence of the solution was proved by Han(1990) and Li(1990a)). Actually, what we need is to construct a limiting process to keep the order-preserving and (14.26). Thus, from Lemma 14.13 and (14.26), it follows that
Since the original process is unique, it does not depend on the ways of different finite dimensional approximations. Hence each marginal distribution of & coincides with the original process. In particular, by Lemma 14.13, we have
e) Finally, let Xn(t)(resp. Y n ( t ) denote ) the Markov process starting from
xu= n, u E Z d . By monotonicity of the original process, we have Y"(t) <
_-Y"+'(t),a s . Construct a common probability space (R, 9, P)on which the process Z ( t ) lives and
Y"(t) Y"(t),
Pas.
By Lemma 14.13,
EYorn(l) < 03.
(14.32)
By (14.30)-(14.32) and Fatou's lemma, we get -
x,"(t)) < lim IEp(yg(1) - x:(l))e+-')
~ p ( ~ ~- ~ ( t )
"-+"
< I E ~ ( Y ~ ~ -( IX:(l))e-E(t-l) )
,
t>l.
Using Fatou's lemma again, we finally obtain
< lim Ep(Yo"(t) - X,"(t))
~ p ( Y o ~ ( 0-3 x:(03)) )
t+ 00
Here, the tightness of {(X~(t),Y,OO(t)): t 2 1) is due to Lemma 14.13 and d = X,"(m). Theorem 4.4. Since ue > 0 for all t 0, this proves that Yom(0o) Set zA n = (xuA n : u E S ) . Then
>
x0(t),< ~ " " " ( t<) x n ( t ) < ~ " ( t ) ,
t
> I, F-a.s.
By monotonicity, XZAn(t) X " ( t ) . We have the comparison:
t
> I, F-a.s.
This proves the ergodicity of the process. Actually, the convergence is exponential, uniform in initial points $(x,y)$.
Remark 14.14. Because $p$ is a metric on $\mathbb{Z}_+$, $p(x,y) = \sum_u p(x_u, y_u)k_u$ defines a metric on $E_0$ ($(E_0, p)$ is a Polish space), and so we have a minimum $L^1$-distance $W(P, Q) = \inf_{\widetilde{P}} \int p(x,y)\,\widetilde{P}(dx, dy)$. In this notation, we have indeed proved that
21
+o
as
t -+ 00.
To conclude this section, we consider the reaction-diffusion processes with an absorbing state $\theta = (\theta_u = 0 : u \in S)$. Note that the above proof needs only a little modification. Since the process $X^0(t)$ stays at $\theta$ for all $t \ge 0$, in (14.26) for instance, we can simply set $i = 0$ and $x_u = 0$. After some suitable modifications, we can prove the following result.
Theorem 14.15. Let $(u_k)$ be a positive sequence on $\mathbb{Z}_+$ with $u_0 = 1$ and $\bar u := \sup_{k \ge 0} u_k < \infty$. Suppose that there exists an $\varepsilon > 0$ such that
where $a_0 = 0$ and $u_{-1} = 1$. Then under (H) but $\beta_0 = 0$, the reaction-diffusion processes with absorbing state $\theta$ are exponentially ergodic, uniformly in initial points.
Applying Theorem 14.15 to the sequence $u_k \equiv 1$ again, the sufficient condition becomes $\inf_{k \ge 1}(a_k - b_k)/k > 0$. Thus, for the first Schlögl model, we obtain the same ergodic condition as (14.14). For the second model, the condition becomes
which is weaker than (14.15). In principle, the original process is more easily ergodic than the absorbing one, due to monotonicity. Clearly, we should have analogues of Corollaries 14.11 (1) and 14.12.
14.3 Reversible Reaction-Diffusion Processes
It is natural to ask when a reaction-diffusion process is reversible. For simplicity, we discuss this problem only for the polynomial model.
other cases of 5 #
u E Z*,
x,Z
E
E.
2,
(14.33)
Assume that $\beta_0, \delta_1, \delta_{m_0+1} > 0$ as in the last section; then $b(k) > 0$ for all $k \ge 0$ and $a(k) > 0$ for all $k \ge 1$. In order for $(q(x,\tilde{x}) : x, \tilde{x} \in E)$ to be a field, we need to assume that
$$p(u,v) > 0 \Longleftrightarrow p(v,u) > 0, \qquad u, v \in \mathbb{Z}^d. \tag{14.34}$$
Under this assumption, we can introduce paths, works and so on to make a field $(q(x,\tilde{x}))$ as in Section 7.1. Now, we look for the conditions under which $(q(x,\tilde{x}))$ becomes a potential field. It is clear that for this field there is only one type of minimal closed paths. That is the triangle
$$x \to x + e_u \to x + e_v \to x.$$
Here, the path from $x + e_u$ to $x + e_v$ is due to the diffusion $(x + e_u) \to (x + e_u) - e_u + e_v = x + e_v$. For the works done by the field along these paths to be zero, it is necessary and sufficient that the triangle condition
holds for all $x \in E$ and $u, v \in \mathbb{Z}^d$, $u \ne v$. Fix $u \ne v$ and let $x$ satisfy $x_u = x_v$. From the last equality, it follows that
Furthermore, the same equality gives us
$$\frac{(k+1)\,b(k)}{a(k+1)} =: \lambda > 0, \quad \text{independent of } k.$$
Equivalently,
$$\frac{\beta_k}{\delta_{k+1}} = \lambda. \tag{14.36}$$
Based on these discussions, it is not difficult to prove (as we did several times in Chapters 7 and 11) the following result.
Lemma 14.16. Under (14.34), the above field $(q(x,\tilde{x}) : x, \tilde{x} \in E)$ is a potential field iff (14.35) and (14.36) hold.
Having the potentiality of the field at hand, the next step is to construct all Gibbs states and then prove that the Gibbs states coincide with the reversible measures of the process. However, since the potential function of the field is exactly the same as that of the field induced by the independent product of birth-death processes, we guess that a Gibbs state (i.e., a reversible measure) may be obtained by the product of the reversible measures for the birth-death processes. That is, the independent product of Poisson measures with parameter $\lambda$, since $(k+1)b(k) = \lambda a(k+1)$, $k \ge 0$. The main purpose of this section is to prove this conjecture and furthermore the uniqueness of the reversible measure.
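The guess above rests on a one-site computation: if $(k+1)b(k) = \lambda a(k+1)$ for all $k \ge 0$, then the Poisson distribution with mean $\lambda$ satisfies the detailed balance equations $\pi_k b(k) = \pi_{k+1} a(k+1)$ of the birth-death chain. A minimal numerical sketch, with hypothetical coefficients chosen so that $\beta_j = \lambda\delta_{j+1}$ (which makes the relation hold for polynomial rates):

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability e^{-lam} lam^k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def falling(k, j):
    """Falling factorial k^{(j)} = k (k-1) ... (k-j+1), with k^{(0)} = 1."""
    out = 1
    for i in range(j):
        out *= (k - i)
    return out


lam = 2.0
delta = {1: 1.0, 2: 0.5}                              # hypothetical death coefficients
beta = {j - 1: lam * d for j, d in delta.items()}     # beta_j = lam * delta_{j+1}


def b(k):
    """Birth rate b(k) = sum_j beta_j k^{(j)}."""
    return sum(bj * falling(k, j) for j, bj in beta.items())


def a(k):
    """Death rate a(k) = sum_j delta_j k^{(j)}."""
    return sum(dj * falling(k, j) for j, dj in delta.items())


# Largest violation of pi_k b(k) = pi_{k+1} a(k+1) over the first 30 states.
max_gap = max(abs(poisson_pmf(k, lam) * b(k) - poisson_pmf(k + 1, lam) * a(k + 1))
              for k in range(30))
print(max_gap)
```

This only checks the reaction part at a single site; that the product measure is also reversible for the diffusion part is what the symmetry condition on $p(u,v)$ provides.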
Theorem 14.17. Consider the polynomial model for which the birth-death rates satisfy $\beta_0, \delta_1, \delta_{m_0+1} > 0$. Suppose that $(p(u,v))$ is translation invariant and $p(u,u) = 0$. Then the process is reversible iff $(q(x,\tilde{x}))$ is a potential field, i.e., (14.34), (14.35) and (14.36) hold. Furthermore, the only reversible measure is the product of the Poisson measures with parameter $\lambda$.
To prove this theorem, we need some preparations.
Lemma 14.18. Under the hypotheses of Theorem 14.17, if the process is reversible with respect to $\pi$, then
(14.37) J
J
Proof: a) Let $p^{(m)}(x) = \sum_{u \in S} x_u^m k_u$, $m \in \mathbb{N}$. Denote by $x \wedge n$ the element in $E_0$ with value $x_u \wedge n$ at every $u \in S$. Observe that
as $n \to \infty$. By Lemma 14.13 and the monotonicity of the process, for every stationary distribution $\pi$ of $P(t)$, we have
.(p‘”’)
= s 7 r ( d s ) l E , p ( m ) ( X ( t )= )
cJ k,
7r(dx)E,Xu(t)m
U
In particular,
$$\pi(E_m) = 1, \qquad m \in \mathbb{N}, \tag{14.38}$$
where Em = {x : p(”)(x) < m}. b) Given f , g E b29g.t ( E ) , by Theorem 13.19, (14.38) and the dominated convergence theorem, we have
The proof is completed.
Lemma 14.19. Condition (14.37) hol;ds iff 1() (2) hold for all
and
and for all
Proof: a) Sufficiency. Note that
1
fR9dn - 1 9 R f d n
=:I+II+III. By (l),we have
1=
c/ c1
u(&J
[ g ( z ) f ( J-: eu) - f ( z ) s ( J-: eu>]7r(d4= -II*
U
On the other hand, by (2): we have ZuP(zL, 4
uL,'u
=
1s
9 ( z - eu
+ ev)f(47@.:)
~ v P ( Wu>9(.>f(. ,
+ eu - ev).rr(dz)
u,v
=TJ
w ( u , M z ) f ( s - eu + ev)7r(dz),
here in the last step, we have exchanged u and ZI. This completes the proof of sufficiency. b) To prove the necessity, we first consider condition (2). Take y E E*, A G Zd = S. f = I l y l x E ~ \ "and g = I { y + e , - e t ) x E S \ ~ ,
If { s , t } is not contained in A, s E A but t f A for instance. Then, using the expression
f
=
I{yXy'}XES\A'
Y'E%
fY',
=: Y'EE
where A' = A U t , we may replace A and f with A' and fY', respectively, so that {s, t } c A'. Now, assume that {s,t } C A. We have
0= =
J [fog
c u,u
-g
~ l d n
2uP(u, 4 [ 9 b - eu
+ ew)f(z) - f(.
- eu
+ eu)g(.)]
X(d4
This gives (2). As for condition (I), the proof is similar but using g = I { g + e 8 1 x E instead ~\~ of the above g .
Proof of Theorem 14.17: a) Let the process be reversible and denote by $\mathscr{R}$ the set of all reversible measures of the process. As usual, set
Let $\pi \in \mathscr{R}$. Applying Lemma 14.19 (1) to the function $f = I_{[x_u = n]}$, we have $b_n\,\pi(x : x_u = n) = a_{n+1}\,\pi(x : x_u = n+1)$. This gives us
$$\pi(x : x_u = 0) = Z^{-1}, \qquad \pi(x : x_u = n) = \mu_n/Z. \tag{14.39}$$
Next, applying Lemma 14.19 (1) to the function $f = I_{[x_u = n,\, x_v = m]}$, it follows that $b_n\,\pi(x : x_u = n, x_v = m) = a_{n+1}\,\pi(x : x_u = n+1, x_v = m)$. So
Summing over $n \ge 0$ gives
$$\pi(x : x_u = 0, x_v = m) = \pi(x : x_v = m)/Z = \pi(x : x_u = 0)\,\pi(x : x_v = m).$$
Substituting this into (14.40) and using (14.39) we get
$$\pi(x : x_u = n+1, x_v = m) = \mu_{n+1}\,\pi(x : x_v = m)/Z = \pi(x : x_u = n+1)\,\pi(x : x_v = m).$$
By induction, it is now easy to check that
This proves that $|\mathscr{R}| = 1$ and the unique stationary distribution is the independent product of $(\mu_n/Z : n \ge 0)$, which is just the stationary distribution of the marginal birth-death processes. Finally, applying Lemma 14.19 (2) to the function $f = I_{[x_u = n,\, x_v = m+1]}$, we obtain
$$(n+1)\,p(u,v)\,\pi(x : x_u = n+1, x_v = m) = (m+1)\,p(v,u)\,\pi(x : x_u = n, x_v = m+1). \tag{14.41}$$
In particular, setting $m = n$, we get $p(u,v) = p(v,u)$, and so $p(u,v) = 0 \Longleftrightarrow p(v,u) = 0$. This proves the necessity of (14.34) and (14.35). Now, assume that $p(u,v) > 0$. Then (14.41) implies that
Thus
$$\frac{(n+1)\,b_n}{a_{n+1}}\,\pi(x : x_v = m) = (m+1)\,\pi(x : x_v = m+1).$$
Summing over $m \ge 0$, we obtain
$$\frac{(n+1)\,b_n}{a_{n+1}} = \sum_{m=1}^{\infty} m\,\pi(x : x_v = m) =: \lambda,$$
independent of n 2 0, which gives us (14.36). By Lemma 14.16, we have proved that ( q ( 2 , 2)) defined above is a potential field. b) Let (14.34), (14.35) and (14.36) hold. Then (14.37) holds. In the finite dimensional case, (14.37) implies the reversibility of the process with respect to the independent product of Poisson measures with mean A. To pass through from the finite dimensional case to the infinite dimensional one, let f, g E b%?yl(E). We need to show that
which follows from proof d) in the proofs of Theorems 14.1 and 14.2 (regarding $g\,d\pi_n$ and $g\,d\pi$ as $d\pi_n$ and $d\pi$ respectively).
A further problem concerning the reversible reaction-diffusion processes is ergodicity. By using the monotonicity and the estimate of the first moment, we see that the process $X^n(t)$ starting from $x_u \equiv n$ will have a weak limit for some sequence $t_k \to \infty$. In particular, we have a stationary distribution $\underline{\mu}$ as a limit of $X^0(t)$ starting from $\theta = (\theta_u = 0)$. Because of Lemma 14.13, it is even true that $X^\infty(t) := \lim_{n \to \infty} X^n(t)$ ($t \ge 1$) has a weak limit $\bar{\mu}$. Again, by monotonicity, for any stationary distribution $\mu$, we should have $\underline{\mu} \le \mu \le \bar{\mu}$. Thus, whenever $\underline{\mu} = \bar{\mu}$, the process must be ergodic. In the present case, since $\underline{\mu}$ and $\bar{\mu}$ both are translation invariant, the last assertion is equivalent to saying that the only translation invariant stationary distribution is the reversible measure $\mu$ given in Theorem 14.17. Based on these observations plus some computations on free energy, Ding, Durrett and Liggett (1990) proved the following result.
Theorem 14.20. Under the hypotheses of Theorem 14.17, the reversible reaction-diffusion processes are ergodic.
Refer to the cited paper for a proof. Refer also to §15.3 for more results on this class of processes. It is a good chance to look at the spectral gap for this infinite dimensional process.
Theorem 14.21. For the reversible polynomial model, we have $\operatorname{gap}(\Omega) \ge \operatorname{gap}(Q) > 0$, where $Q$ is the birth-death Q-matrix appearing in the reaction part of the process.
Proof: The Dirichlet form of the process consists of the reaction part and the diffusion part. Ignoring the diffusion part, we get a smaller one, which is the sum of the forms of independent birth-death processes. By Theorem 9.5, we have $\operatorname{gap}(\Omega) \ge \operatorname{gap}(Q)$. To see that $\operatorname{gap}(Q) > 0$, one may use Theorem 9.25 (4). An easier way to see this qualitative result goes as follows. Applying Corollary 4.49 to $h_i = 1 + i$ shows that the birth-death process is exponentially ergodic, and so the assertion follows by Theorem 9.15 (2).
14.4 Notes
Section 14.1 is taken from Chen (1986b, 1989b), but the proofs are now simplified. Some analogues of Theorems 14.1 and 14.3 appeared in Basis (1980). An analogue of Theorem 14.2 was obtained by Huang (1987). Under the hypotheses of Theorem 14.3, the existence of a stationary distribution is due to Zhang (1999). Meanwhile, some progress on the ergodicity of reaction-diffusion processes was made by Ding, Durrett and Liggett (1990) and Neuhauser (1990). Lemma 14.13 appeared in the paper by Ding et al. The present proof is an extension of the original one and works also for the non-polynomial case; a complete exploration is contained in Chen, Ding and Zhu (1994). The sufficient conditions (14.14) and (14.15) were proved in Chen (1986b). Then, it was proved by Neuhauser (1990) that the processes are ergodic when the $\beta_i$'s and $\delta_i$'s are all large enough. Part (2) of Corollary 14.11 is taken from Chen (1990). Section 14.2 is taken from Chen (1990, 1995); some other criteria and their comparison also appear there. Finally, Theorem 14.17 is mainly based on Zhu (1990). The ergodicity for reaction-diffusion processes with absorbing state was studied by Li (1995) with a more direct approach (without using coupling). A related result will be stated at the end of Section 15.3. Based on Chapter 8, the large deviation principle for reversible reaction-diffusion processes was proved by Chen (1996a,b) completely in the finite dimensional case and partially in the infinite dimensional case. Comparing the irreversible case with the reversible one, it seems that the ergodic theorems may be further improved.
Chapter 15
Phase Transitions
This chapter is devoted to the study of phase transitions for the reaction-diffusion processes. Of course, we have to restrict ourselves to some more concrete models. Our first model is the linear growth model, for which we adopt the moments method. The second model is nonlinear but has $\theta = (\theta_u = 0 : u \in \mathbb{Z}^d)$ as an absorbing state, for which we use the graph representation method. Finally, as an approximation, we study a time-inhomogeneous Q-process, instead of the infinite dimensional reaction-diffusion processes, to exhibit the phase transition. The last approach is often called the mean field method in statistical physics. As a preparation, in the next section, we introduce an important "dual" formula.

15.1 Duality
Unless otherwise stated, throughout this chapter, assume that $(p(u,v))$ is a general random walk (i.e., a translation-invariant transition probability on $S = \mathbb{Z}^d$) with $p(u,u) = 0$. Moreover, the rates for the diffusion part of the reaction-diffusion processes are fixed: $x_u p(u,v)$. That is,
$$\Omega_d f(x) = \sum_{u,v} x_u\,p(u,v)\,[f(x - e_u + e_v) - f(x)]. \tag{15.1}$$
Define
and the dual operator
$$\Omega_d^* f(y) = \sum_{u,v} y(u)\,p(v,u)\,[f(y - e_u + e_v) - f(y)], \qquad y \in E^{(f)}.$$
Note that the rate $(p(v,u))$ for diffusions used in the last line is the dual of the original $(p(u,v))$. Next, define a dual function (Poisson polynomial) $D : E^{(f)} \times E \to \mathbb{Z}_+$ as follows
$$D(y,x) = \prod_{u \in S} x(u)^{(y(u))},$$
where $n^{(0)} = 1$, $n^{(k)} = n(n-1)\cdots(n-k+1)$, $k \ge 1$. The main reason we adopt the notation $n^{(k)}$ is that for a Poisson random variable $\xi$ with mean $\lambda$, we have $\mathbb{E}\xi^{(k)} = \lambda^k$, $k \ge 0$. The next result exhibits the dual relation between $\Omega_d$ and $\Omega_d^*$ in terms of the function $D$.
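The factorial-moment identity $\mathbb{E}\xi^{(k)} = \lambda^k$ that motivates the notation $n^{(k)}$ is easy to confirm numerically. A minimal sketch, truncating the Poisson series at `n_max` terms (the function names and the truncation level are illustrative choices):

```python
import math


def falling(n, k):
    """Falling factorial n^{(k)} = n (n-1) ... (n-k+1), with n^{(0)} = 1."""
    out = 1
    for i in range(k):
        out *= (n - i)
    return out


def poisson_factorial_moment(lam, k, n_max=200):
    """Truncated sum over n of n^{(k)} e^{-lam} lam^n / n!."""
    total = 0.0
    term = math.exp(-lam)          # P(xi = 0)
    for n in range(n_max + 1):
        total += falling(n, k) * term
        term *= lam / (n + 1)      # P(xi = n + 1) obtained from P(xi = n)
    return total


lam = 1.5
for k in range(5):
    print(k, poisson_factorial_moment(lam, k), lam ** k)
```

Ordinary moments of a Poisson variable are polynomials in $\lambda$ with several terms, which is why the falling factorial, rather than the plain power, is the convenient building block for the dual function.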
Lemma 15.1. Let $S$ be finite. Then
$$(\Omega_d D(y,\cdot))(x) = (\Omega_d^* D(\cdot,x))(y), \qquad x, y \in E.$$
Furthermore,
$$\mathbb{E}D(y, X(t)) = \mathbb{E}D(Y(t), x), \qquad x, y \in E,$$
where $\mathbb{E}$ is the expectation of the independent product of the $\Omega_d^*$-process starting from $y$ and the $\Omega_d$-process starting from $x$.
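The first identity of Lemma 15.1 can be tested by brute force on a small example. The sketch below makes several assumptions that should be read as such: the dual function is taken to be $D(y,x) = \prod_u x(u)^{(y(u))}$, the site set is the cycle $\mathbb{Z}_3$, the walk is an asymmetric but translation-invariant one chosen for illustration, and all function names are hypothetical.

```python
from itertools import product


def falling(n, k):
    """Falling factorial n^{(k)}, with n^{(0)} = 1."""
    out = 1
    for i in range(k):
        out *= (n - i)
    return out


def D(y, x):
    """Assumed duality function: product over sites of x_u^{(y_u)}."""
    out = 1
    for xu, yu in zip(x, y):
        out *= falling(xu, yu)
    return out


def move(z, u, v):
    """Configuration z - e_u + e_v."""
    w = list(z)
    w[u] -= 1
    w[v] += 1
    return tuple(w)


def omega_d(f, x, p):
    """(Omega_d f)(x) = sum_{u,v} x_u p(u,v) [f(x - e_u + e_v) - f(x)]."""
    n = len(x)
    return sum(x[u] * p[u][v] * (f(move(x, u, v)) - f(x))
               for u in range(n) for v in range(n) if x[u] > 0 and p[u][v] > 0)


def omega_d_star(f, y, p):
    """Dual operator: the same form with p(u,v) replaced by p(v,u)."""
    n = len(y)
    return sum(y[u] * p[v][u] * (f(move(y, u, v)) - f(y))
               for u in range(n) for v in range(n) if y[u] > 0 and p[v][u] > 0)


# Translation-invariant walk on Z_3: step +1 w.p. 0.75, step +2 w.p. 0.25.
p = [[0.0, 0.75, 0.25],
     [0.25, 0.0, 0.75],
     [0.75, 0.25, 0.0]]

max_gap = 0.0
for x in product(range(4), repeat=3):
    for y in product(range(3), repeat=3):
        lhs = omega_d(lambda z: D(y, z), x, p)
        rhs = omega_d_star(lambda w: D(w, x), y, p)
        max_gap = max(max_gap, abs(lhs - rhs))
print(max_gap)
```

The check relies on the column sums of `p` being 1, which translation invariance guarantees; replacing `p` by a stochastic matrix without that property would be a quick way to see the identity fail.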
Proof: a) By definition,
x
[ (u) (u)- 1) 2
(2
(.(V)
+1)(y(v))-
x (u) 2 (u) (Y(U))z(v) I:y(V))].
Note that
We have
Here, in the last step, we have used C, p ( 7 ~11) , z 1. b) Next, set d N ) ( y ,z) = D ( y )z A N),whcre 5 A By a), we have
N
= (xu A
N
:
E S).
Hence
$$\frac{d}{ds}\,\mathbb{E}D^{(N)}(Y(t-s), X(s)) = 0, \qquad 0 \le s \le t.$$
Integrating over $s$, we obtain
Now, the second assertion of the lemma follows by letting $N \to \infty$.
Actually, (15.2) is a consequence of the integration by parts formula. Regard $X(t)$ and $Y(t)$ as bivariate processes with state space $E \times E$ and denote by $P_d(t)$ and $P_d^*(t)$ their semigroups, respectively. Then, by Theorem 13.40, we have
Now, (15.2) follows from
For the infinite dimensional case, it should be pointed out that the dual relation between $\Omega_d$ and $\Omega_d^*$ given by Lemma 15.1 is not meaningful since $D(y,\cdot)$ is not a Lipschitz function. To avoid this, one may replace $D$ with $D^{(N)}$, and then handle the infinite dimensional case directly. But here, we would like to continue our study by starting from the finite dimensional cases. Note that corresponding to $\Omega_d^*$, the process on $E^{(f)}$ is a Markov chain. Each $E^{(n)}$ ($n \ge 0$) is a closed set for the chain. In particular, on $E^{(1)}$, the Markov chain has Q-matrix $Q = P^* - I$, where $P = (p(u,v))$. Denote by $p^*(t,u,v)$ the corresponding Q-process. Since the particles evolve independently, the transition probability of the chain on $E^{(f)}$ is given by
where $\theta = (\theta_u = 0)$. In particular, for finite $S$, by (15.2), we have
On the other hand, since D ( N ) ( y , f bVg!(E) and D ( N ) ( y ' , x )< 00, by using a limiting procedure (a.pplying Theorem 13.8, Lemma 5.14 and the proof a>of Theorem 4.38 to the left- and the right-hand sides of (15.3), a )
respectively), we see that (15.3) still holds for $S = \mathbb{Z}^d$. Again, letting $N \to \infty$, we get
We have thus proved that the zero range process ($\Omega_d$-process) has a dual ($\Omega_d^*$-process) which is also a zero range process (and so the process is called self-dual). For the reaction-diffusion process with generator $\Omega = \Omega_d + \Omega_r$, the above dual formula does not hold, but the same argument leads to the following result.
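Proposition 15.2. For the polynomial model or the linear growth process with diffusion part as above, we have
$$\mathbb{E}^x D(y, X(t)) = \sum_{y'} p^*(t,y,y') D(y',x) + \int_0^t ds \sum_{y'} p^*(t-s,y,y')\, \mathbb{E}^x\big[\Omega_r D(y', X(s))\big], \qquad y \in E^{(f)},\ x \in E_{m_0+|y|}, \tag{15.4}$$
where for the polynomial model, $m_0$ was given in Example 13.27 and for the linear growth process, $m_0$ is set to be 0.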
This formula is a starting point for the study in Chapter 16. In the next section, we will use this formula to compute the first two moments of the linear growth processes.
15.2 Linear Growth Model
In this section, we study the linear growth model:
$$b(k) = \lambda_1 k, \quad a(k) = \lambda_2 k, \qquad k \ge 0,\ \lambda_1, \lambda_2 > 0; \tag{15.5}$$
with diffusion part given in the last section. As was mentioned before, since $(p(u,v))$ is a random walk, we can choose a positive summable sequence $(k_u)$ and an $M > 1$ so that
Recall that the convergence in finite-dimensional distributions coincides with the weak convergence in $\mathscr{P}(E_0)$ with respect to the induced topology (cf. Remark 4.6). As usual, we call $P(t)$ ergodic if there is a $\mu \in \mathscr{P}(E_0)$ such that $\nu P(t) \to \mu$ as $t \to \infty$ for all $\nu \in \mathscr{P}(E_0)$. The phase transition of this model is described as follows.
Theorem 15.3. For the linear growth model, the following conclusions hold.
(1) If $\lambda_1 < \lambda_2$, then the process is ergodic. More precisely, $\nu P(t) \to \delta_\theta$ as $t \to \infty$ for any $\nu \in \mathscr{P}(E_0)$, where $\delta_\theta$ is the point mass at $\theta$.
(2) If $\lambda_1 > \lambda_2$, then the process is non-ergodic.
The key to the proof of this theorem is the fact that the first moment for the model is known explicitly. Actually, by (15.4), for $|y| = 1$, $y = e_u$ (say), we have
$$\mathbb{E}^x D(y, X(t)) = \mathbb{E}^x x_u(t) = \sum_{u'} p^*(t,u,u') x_{u'} + (\lambda_1 - \lambda_2) \int_0^t ds \sum_{u'} p^*(t-s,u,u')\, \mathbb{E}^x x_{u'}(s).$$
From this, we obtain the following result.
Lemma 15.4. Let $p_u(x) = x_u$, $x \in E_0$, $u \in S = \mathbb{Z}^d$. Then
$$P(t)p_u(x) = e^{(\lambda_1 - \lambda_2)t} \sum_v p^*(t,u,v)\, p_v(x). \tag{15.6}$$
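Formula (15.6) can be explored on a finite state space. The sketch below replaces $\mathbb{Z}^d$ by a small cycle (an assumption made only to keep the matrices finite), builds $Q = P' - I$, and evaluates $e^{(\lambda_1-\lambda_2)t}\sum_v p^*(t,u,v)x_v$ with a plain Taylor series for the matrix exponential. All names are illustrative.

```python
import math

def cycle_walk(n, right):
    """Random walk on a cycle: step +1 with probability `right`, else -1."""
    p = [[0.0] * n for _ in range(n)]
    for u in range(n):
        p[u][(u + 1) % n] += right
        p[u][(u - 1) % n] += 1.0 - right
    return p

def dual_q_matrix(p):
    """Q = P' - I, i.e. Q[u][v] = p(v, u) - delta_{uv}."""
    n = len(p)
    return [[p[v][u] - (1.0 if u == v else 0.0) for v in range(n)] for u in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(q, t, terms=40):
    """exp(t q) by its Taylor series; adequate for small matrices and moderate t."""
    n = len(q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, q)
        term = [[t * value / k for value in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def first_moment(x, lam1, lam2, p, t):
    """Right-hand side of (15.6) for every site u, on the finite cycle."""
    n = len(x)
    pt = mat_exp(dual_q_matrix(p), t)
    growth = math.exp((lam1 - lam2) * t)
    return [growth * sum(pt[u][v] * x[v] for v in range(n)) for u in range(n)]
```

On the cycle the total mass is multiplied by $e^{(\lambda_1-\lambda_2)t}$, so the mean decays when $\lambda_1 < \lambda_2$ and grows when $\lambda_1 > \lambda_2$, which is the dichotomy behind Theorem 15.3.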
A direct way to prove this lemma goes as follows. By Theorems 13.19 or 13.20, we have $P(t)f = P(t)flf for all f E Yip(Eo)(E). Since p, E B p ( E o ) ( E )and f l p u ( z ) = ( h- h)p.ll(.)
+&
m ( P ( W
- SU,),
21
we get
This differential equation is linear, so its solution is unique. A simple computation shows that the solution is given by (15.6). Next, we consider the second moment, which is the case that $|y| = 2$. Fix $y = e_u + e_v$. Since $\Omega_r D(y,x) = 2(\lambda_1 - \lambda_2)D(y,x) + 2\lambda_1 x_u \delta_{uv}$, by (15.4), we have
E A Y , X ( 4 ) = C P * ( t , Y ,Y’)D(Y’,4 Y’
From this, we claim that the solution of the second moment is given by (15.8) below.
Lemma 15.5.
Proof: Denote by $f(t,y)$ the right-hand side of (15.8). Then
Combining this with (15.8), we see that $f(t,y)$ is a solution to Eq. (15.7). But the solution is unique, so we have proved the required conclusion.
Before going further, let us point out that the upper bound of $P(t)D(y,x)$ ($|y| = 2$), which is all we need for the proof of the main theorem, can also be obtained in a different way without using the "dual" formula. Consider first the case that $|S| < \infty$. Then, the estimate follows from Lemma 4.12. Since the estimate is independent of $|S|$, the assertion then follows by a truncation argument.
Lemma 15.6. Let $(E, \rho, \mathscr{E})$ be a metric space. Given $\{\mu_n\} \subset \mathscr{P}(E)$ and $m, \delta > 0$, if $\mu_n \Rightarrow \mu$ and $C := \sup_n \mu_n(\rho^{m+\delta}) < \infty$, then $\mu_n(\rho^m) \to \mu(\rho^m)$ as $n \to \infty$.
Proof: Obviously, $\liminf_{n\to\infty} \mu_n(\rho^m) \ge \mu(\rho^m)$. On the other hand,
Letting $n \to \infty$ and then $N \to \infty$, we get $\limsup_{n\to\infty} \mu_n(\rho^m) \le \mu(\rho^m)$.
Proof of Theorem 15.3: a) From Remark 14.9, we have seen that the process is ergodic whenever $\lambda_1 < \lambda_2$. b) Now, let $\lambda_1 > \lambda_2$. Denote by $X^1(t)$ the process starting from $\mathbf{1}$, which is the configuration having value 1 at every site $u \in \mathbb{Z}^d$. Let $\mu_t$ be the distribution of $X_0^1(t)/\mathbb{E}X_0^1(t)$. Then $\mu_t \in \mathscr{P}([0,\infty))$. Moreover,
$$\int_0^\infty r\,\mu_t(dr) = \mathbb{E}X_0^1(t)/\mathbb{E}X_0^1(t) = 1.$$
By Theorem 4.4, we can choose a sequence $\{t_n\}$ so that $\mu_{t_n} \Rightarrow$ some $\mu \in \mathscr{P}([0,\infty))$. On the other hand,
where $y^0 = 2e_0$. But by Lemma 15.4 and Lemma 15.5, we have
P(t)D(yO,1) = e2 ( x 1 - x z ) 1
(Cp*(t, 0, u))a U
It follows that
$$\sup_{t>0} \int_0^\infty r^2\,\mu_t(dr) < \infty.$$
Thus, by Lemma 15.6, we can choose a constant $c > 0$ such that $\mu((c,\infty)) \ge \varepsilon > 0$. Then $\mu_{t_n}((c,\infty)) \ge \varepsilon/2 > 0$ for all $n \ge n_0 = n_0(\varepsilon)$. Set $A(t) = \mathbb{E}X_0^1(t)$. Noting that either $X_0^1(t) = 0$ or $X_0^1(t) \ge 1$, we obtain
$$P(t_n)(p_0 \wedge 1)(\mathbf{1}) = \mathbb{E}[X_0^1(t_n) \wedge 1] \ge \mathbb{E}[X_0^1(t_n) \wedge 1 : X_0^1(t_n) > cA(t_n)] = \mathbb{P}[X_0^1(t_n) > cA(t_n)] = \mu_{t_n}((c,\infty)) \ge \varepsilon/2.$$
But $p_0 \wedge 1$ is a bounded cylindrical function, so it is impossible that $\delta_{\mathbf{1}}P(t) \to \delta_\theta$ as $t \to \infty$. Therefore, the process $(X(t))$ is non-ergodic. ∎
In view of Theorem 15.3, the only remaining case for this model is that $\lambda_1 = \lambda_2$. To discuss this situation, we need some notation. Let
9 = the set of stationary distributions of P ( t ) , 9 = the set of translation invariant measures, Ye= the set of the extreme points in 9, LPm(E)= { p E Y ( E ) : JxFp(dx) < oo}:
and set $\bar p(u,v) = (p(u,v) + p(v,u))/2$.
Theorem 15.7. Under the assumptions of Theorem 15.3, suppose additionally that $(\bar p(u,v))$ is transient. Then
(1) For each p
/
> 0, there is uniquely an vP E 9 n 5@satisfying
/
xcovp(dx) = p,
W v V p ( d 2 ) = P(P
+ L) + 2x1
lm
F ( 4 Zt.7 v)&
where p ( t ) is the Q-process determined by the &-matrix 2(p- I ) (2) Let p E 9 n 9 n LP1(E),then there is a X E 9 ( ( [ 0m)) , such that
P =;J vPX(dP). (3) Let p E Yesatisfy J x o p ( d x ) = p Moreover, if p E 9n 9 2 ( E ) ,then
<
00.
Then limt-,m
u,v (4) Let ,u E Yesatisfy Jxop((dz)= co. Then
lirn lini p P ( t ) { x : xo B k } > 0.
k-*m t-+w
f
pP(t) = v p .
Zd.
One of the keys to proving this theorem is that the second moment for this model is also explicit. This is a common feature in the study of results of this type (cf. Liggett and Spitzer (1981)). Since a complete proof of Theorem 15.7 is quite lengthy and this finer result does not much affect the picture of the phase transition, we omit the details here. Refer to Ding and Zheng (1989).
15.3 Reaction-Diffusion Processes with Absorbing State
It was proved in Section 14.2 that the reaction-diffusion processes are ergodic whenever the pure birth rate is large enough. In this section, we consider the opposite extreme case in which the pure birth rate vanishes. For this case, we show that the processes can be non-ergodic and hence there exist phase transitions.
Theorem 15.8. Take $S = \mathbb{Z}^1$. Consider the reaction-diffusion process $(X(t))$ with birth-death rates $b(k) = \lambda k$ and $a(k)$ [$a(0) = 0$] being arbitrary and with diffusion coefficient $x_u p(u,v)$, where $(p(u,v))$ is the simple random walk in $\mathbb{Z}^1$: $p(u,v) = 1/2$ iff $|u - v| = 1$. Then
$$\lambda_c := \inf\{\lambda : \mathbb{P}[X^0(t) \ne \theta \text{ for all } t > 0] > 0\} < \infty,$$
where $X^0(t)$ is the process starting from $x^0$: $x^0_0 = 1$ and $x^0_u = 0$ ($u \ne 0$).
The proof of this theorem is based on a comparison of the process with an oriented percolation. Let $\mathscr{L} = \{(m,n) \in \mathbb{Z}^2 : m + n \text{ is even},\ n \ge 0\}$. For each $z \in \mathscr{L}$, draw an oriented bond from $z$ to $z + (-1,1)$ and to $z + (1,1)$. Suppose that each $z \in \mathscr{L}$ is independently open (denoted by $\eta(z) = 1$) with probability $p \in (0,1)$ and closed (denoted by $\eta(z) = 0$) with probability $1 - p$. Given $z_j \in \mathscr{L}$, $j = 0, \dots, k$, $(z_0, \dots, z_k)$ is called an open path from $z_0$ to $z_k$, denoted by $z_0 \to z_k$, if $(z_{j-1}, z_j)$ is a bond and $z_j$ is open for all $j$. Next, let
The first problem in the study of the percolation is looking for
$$p_c := \inf\{p : \mathbb{P}[|C_0| = \infty] > 0\}.$$
Lemma 15.9. $1/2 \le p_c \le 80/81$.
Proof: a) The lower bound is obtained by comparing the percolation with a branching process $(\zeta_t)$ ($\zeta_0 = 1$, $\mathbb{P}[\xi_1 = 0] = (1-p)^2$, $\mathbb{P}[\xi_1 = 1] = 2p(1-p)$, $\mathbb{P}[\xi_1 = 2] = p^2$):
$$\mathbb{P}[|C_0| = \infty] \le \mathbb{P}[\zeta_t \text{ never dies out}],$$
and whether the last probability is positive or not is determined by the mean
$$\mathbb{E}\xi_1 = 2p(1-p) + 2p^2 = 2p$$
according to $\mathbb{E}\xi_1 > 1$ or $\mathbb{E}\xi_1 \le 1$ respectively. From this, we see that $p_c \ge 1/2$. b) The upper bound is harder to estimate than the lower one. Here, we adopt the contour method. Let
For $|C_N| < \infty$, let $\Gamma$ be the boundary of the unbounded component of $(\mathbb{R} \times (-1,\infty)) \setminus W_N$. Such a $\Gamma$ is called a contour. Note that for a contour $\Gamma$ of length $m$ to exist there must be at least $m/4$ sites outside of $\Gamma$ which are closed (we will prove this below) and the shortest possible contour has length $2N + 4$. As an analogue of Peierls' inequality, we have
for $p > 1 - 3^{-4}$. The right-hand side is less than 1 for sufficiently large $N$. From this and $\mathbb{P}[|C_N| = \infty] \le (N+1)\mathbb{P}[|C_0| = \infty]$, we claim that $p_c \le 80/81$.
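Both halves of the proof lend themselves to a quick numerical illustration. In the sketch below, the extinction probability of the comparison branching process is computed as the smallest fixed point of its generating function $s \mapsto ((1-p) + ps)^2$. The contour bound is *assumed* to have the usual Peierls form $\sum_{m \ge 2N+4} 3^m (1-p)^{m/4}$ (the display itself is missing from the text above, so this form is a guess consistent with the condition $p > 1 - 3^{-4}$). Function names are illustrative.

```python
def extinction_probability(p, iterations=5000):
    """Smallest fixed point of g(s) = ((1-p) + p*s)^2, found by iterating g from 0."""
    s = 0.0
    for _ in range(iterations):
        s = ((1.0 - p) + p * s) ** 2
    return s

def peierls_tail(p, n_sites):
    """Assumed contour bound: sum over m >= 2N+4 of 3^m (1-p)^(m/4), as a geometric series."""
    ratio = 3.0 * (1.0 - p) ** 0.25
    if ratio >= 1.0:
        raise ValueError("the series diverges unless p > 1 - 3**-4")
    return ratio ** (2 * n_sites + 4) / (1.0 - ratio)
```

For $p < 1/2$ the iteration converges to 1 (certain extinction), while for $p > 1/2$ it converges to $((1-p)/p)^2 < 1$. The tail bound drops below 1 once $N$ is large enough, as the proof requires.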
[Figure: a contour $\Gamma$ with endpoints $(-2N, -1)$ and $(0, -1)$.]
Now, we return to the proof of the "$m/4$" assertion promised in the last paragraph. Let us orient $\Gamma$ in such a way that the segment $(0,-1) \to (1,0)$ is positive. Clearly, there are four types of oriented bonds: ↖, ↙, ↘, ↗. Label them by $i = 1, \dots, 4$, respectively. Set $m_i$ = the number of bonds of
$i$-th type. Then, $m = \sum_{i=1}^4 m_i$. Next, for a bond $\ell$ of type $i$ ($i = 1, 2$), if we stand at the midpoint of $\ell$ and face in the direction of the orientation of $\Gamma$, then the site closest to our right hand, denoted by $z_\ell$, must be closed. Note, however, that two bonds $\ell$ and $\ell'$ may share the same site $z_\ell$. Because bonds of types 1 and 2 decrease our $x$ coordinate by 1 and bonds of types 3 and 4 increase it by 1, and on the other hand, the contour $\Gamma$ starts at $(0,-1)$ and ends at $(-2N,-1)$, we have $(m_1 + m_2) - (m_3 + m_4) = 2N$. Thus
$$m_1 + m_2 = \frac{2N + m}{2} \ge \frac{m}{2}$$
and hence $(m_1 + m_2)/2 \ge m/4$ as desired. For our purpose, we need to generalize the above result to a slightly more general situation.
Definition 15.10. Let $\|(m,n)\| = (|m| + |n|)/2$ and $\eta : (\Omega, \mathscr{L}) \to \{0,1\}$. We call $\eta$ 1-dependent if $\eta(z_1), \dots, \eta(z_n)$ are independent for all $z_1, \dots, z_n \in \mathscr{L}$ with $\|z_i - z_j\| > 1$ for $i \ne j$.
Lemma 15.11. For 1-dependent percolation $\eta$, we have $p_c \le 1 - 3^{-36}$.
Proof: The previous proof works also for the 1-dependent case except that one estimate is changed as follows. Observe that for each $(m,n) \in \mathscr{L}$, there are 9 sites $z \in \mathscr{L}$ with $\|z - (m,n)\| \le 1$, so for each $\Gamma$ of length $m$ there is a set of $m/36$ sites which are separated by more than 1 and which must be closed for the contour to exist. Therefore
P “Cl, for 1 - p < 3-36. 1
< w]
< corlst.(9(1
-
p)”/’y
Proof of Theorem 15.8: a) The idea is to compare the reaction-diffusion process with a 1-dependent oriented percolation. Clearly, for fixed site $m \in \mathbb{Z}^1$, the reaction-diffusion process dominates a birth-death process $(B_t)_{t \ge 0}$ with rates
An: 4n,n-1 = 44 + ?a. Let To = inf{t : Bt = 0). Take N and X so large that e-N/2 Qn.nS-1=
< 3-38
< 3-38, PI [ inf B~ < N] < 3-38. l
and
(15.9)
lFl[TO 6 11
(15.10)
b) For $(m,n) \in \mathscr{L}$, we say that $(m,n)$ is open (i.e. $\eta(m,n) = 1$) if
(1) The original process has at least $N$ particles at site $m$ for all $t \in [2n+1, 2n+2]$.
(2) Particles jump from $m$ to $m + 1$ or $m - 1$ in $[2n+1, 2n+2]$.
(3) Particles sent to $m - 1$ and $m + 1$ by the original process do not die during $[2n+1, 2n+2]$.
Denote by $A_k$ ($k = 1, 2, 3$) these events respectively. If we restart the process with $X_m(t) \wedge 1$, $m \in \mathbb{Z}^1$ at time $t = 2n$, $n \in \mathbb{Z}_+$, then $\{\eta(m,n) : (m,n) \in \mathscr{L}\}$ is a 1-dependent oriented percolation and the reaction-diffusion process dominates it. From the definition of the above percolation, it follows that
$$\mathbb{P}[\text{a site } m \text{ is closed}] = \mathbb{P}\Big[\bigcup_{k=1}^3 A_k^c\Big] \le \sum_{k=1}^3 \mathbb{P}(A_k^c) < 3^{-38} + 2e^{-N/2} + 2 \times 3^{-38} < 5 \times 3^{-38}.$$
Therefore
$$\mathbb{P}[\text{a site } m \text{ is open}] \ge 1 - 5 \times 3^{-38} > 1 - 3^{-36},$$
which implies that the percolation occurs and then $X^0(t)$ survives. So our result is proved. ∎
The next result, due to Mountford (1992), is a strengthening of Theorem 15.8 and solves a conjecture made by Shiga (1988). Its proof is omitted here. The model below is different from that of Theorem 14.15 since here $\beta_0 = \delta_1 = 0$.
Theorem 15.12. For the reversible polynomial model with absorbing state ($\beta_0 = \delta_1 = 0$) and with diffusion rate $x_u p(u,v)$, if either $\sum_v |v| p(0,v) < \infty$ or the distribution $p(0,v)$ lies in the domain of attraction of a stable law of index less than one, then the process tends in distribution to the non-trivial stationary distribution, starting from any non-identically zero initial point.
15.4 Mean Field Method
In statistical physics, one method to study phase transitions is the mean field method, which is often simpler and is regarded as an approximation to the original models. Let us restrict ourselves to the polynomial model for simplicity. Instead of the infinite dimensional state space, we take $E = \mathbb{Z}_+$ as our state space and replace the original operator with
for $k \in \mathbb{Z}_+$. The first two terms represent the birth-death rates as given in Example 13.27. The new additional birth rate $\mathbb{E}X(t)$, the mean of the process, represents the exchange of energy with the outside: a source of energy is provided from the outside according to the mean of the process. Intuitively, this model can be interpreted as follows. Fix a vessel $u \in \mathbb{Z}^d$ in which the reaction is kept. Regard all $v \in \mathbb{Z}^d \setminus \{u\}$ as outside and the
diffusions between $u$ and $v \in \mathbb{Z}^d \setminus \{u\}$ are now described simply by the new birth term $\mathbb{E}X(t)$. In this sense, the process represents a non-equilibrium model. Clearly, the new term makes the process time-inhomogeneous. In other words, we are studying a time-inhomogeneous birth-death process with the $Q$-matrix:
$$q(t; i, j) = \begin{cases} b_i + \mathbb{E}X(t) & \text{if } j = i + 1 \\ a_i & \text{if } j = i - 1 \\ 0 & \text{for other } j \ne i; \end{cases} \qquad q(t; i) = \sum_{j \ne i} q(t; i, j), \quad t \ge 0.$$
The $Q$-matrix $Q(t) = (q(t; i, j))$ is quite awkward since it is expressed in terms of an unknown process. However, it can be proved that the process not only exists but also is unique; this is again due to the fact that the degree of the birth rate is one less than that of the death rate. More precisely: given a measure $\mu(t) = (\mu(t,k))$ ($t > 0$) on $E = \mathbb{Z}_+$ with finite first moment $m(t) := \sum_k \mu(t,k) k$, for each $i$, there is uniquely a $\mathbb{P}_i$ on the space of right continuous functions having left limits such that $f(X(t)) - \int_0^t \Omega_s f(X(s))\,ds$ is a $\mathbb{P}_i$-martingale with respect to the natural $\sigma$-algebras $\mathscr{F}_t := \sigma\{X(s) : s \le t\}$, $t \ge 0$, for every $f$ with compact support, where $\Omega_t f(k) = (b_k + m(t))[f(k+1) - f(k)] + a_k[f(k-1) - f(k)]$. Moreover, $\mathbb{E}_i X(t) = m(t)$ and the process is indeed a $Q$-process.
Next, how does this model relate to phase transitions? Recall that a measure $\pi \in \mathscr{P}(E)$ is stationary if $\pi_k = \mathbb{P}[X(t) = k]$ is independent of $t$ for all $k \in E$. Note that if $\pi = (\pi_k)$ is a stationary distribution with finite first moment $\lambda := \sum_k k\pi_k$, then the above $Q$-matrix and hence the corresponding $Q$-process become time-homogeneous. Furthermore, as we have seen several times before, for a birth-death process, the stationary distribution $\pi$ should be given by the formula $\pi_k = \mu_k/Z$, where
Thus, the finiteness of the first moment gives us a condition
Hence
x + C(A- k ) (A + bo) * . w
f(A)
:=
k=l
Ul
*
(A
+ bk-1)
. . al,
Therefore, we have deduced the following conclusion.
= 0.
(15.11)
Lemma 15.13. $|\mathscr{I}| := \#\{\pi : \pi \text{ is a stationary distribution}\} = \#\{\lambda \in [0,\infty) : \lambda \text{ is a root of } f(\lambda)\}$.
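Lemma 15.13 reduces counting stationary distributions to locating the non-negative roots of $f$ in (15.11). The sketch below evaluates a truncated version of that series. The rates used for the second Schlögl model, $b_k = \beta_0 + \beta_2 k(k-1)$ and $a_k = \delta_1 k + \delta_3 k(k-1)(k-2)$, are an assumption here, since the display defining them did not survive in the text. The truncation level `k_max` and the function names are likewise illustrative.

```python
def schlogl2_rates(beta0, beta2, delta1, delta3):
    """Assumed rates of the second Schlogl model (birth b_k, death a_k)."""
    def b(k):
        return beta0 + beta2 * k * (k - 1)
    def a(k):
        return delta1 * k + delta3 * k * (k - 1) * (k - 2)
    return b, a

def mean_field_f(lam, b, a, k_max=400):
    """Truncated f(lam) = lam + sum_{k>=1} (lam - k) * prod_{j<k} (lam + b_j) / a_{j+1}."""
    total = lam
    weight = 1.0  # mu_0 = 1
    for k in range(1, k_max + 1):
        weight *= (lam + b(k - 1)) / a(k)  # mu_k from mu_{k-1}
        total += (lam - k) * weight
    return total

def sign_changes(f, grid):
    """Number of sign changes of f along an increasing grid (a lower bound on the root count)."""
    values = [f(x) for x in grid]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)
```

A usage sketch: with `b, a = schlogl2_rates(2.0, 6.0, 9.0, 1.0)`, calling `sign_changes(lambda x: mean_field_f(x, b, a), [0.1 * i for i in range(600)])` gives a rough lower bound on the number of positive roots for those parameters.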
Having this lemma in mind, it is not difficult to prove the following result.
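Corollary 15.14. For the second Schlögl model:
We have
$$|\mathscr{I}| \begin{cases} \ge 1 & \text{in any case,} \\ = 1 & \text{if } 2\beta_2 < (6\delta_3) \wedge (\delta_1 - 1), \\ \ge 3 & \text{if } 1 < \delta_1 < \sqrt{1/2 + (2\beta_2 + 1)/(3\delta_1 + 6\delta_3)} \text{ and } \beta_0 \in [0, c), \end{cases}$$
where $c > 0$ is a constant depending on $\beta_2$, $\delta_1$ and $\delta_3$.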
Proof: a) Let $\beta_0 > 0$. Then $f(0) < 0$ and $f(\lambda) > 0$ for all sufficiently large $\lambda > 0$. Hence, there exists at least one positive root of $f(\lambda)$ and all positive roots belong to a finite interval $(0, \bar\lambda]$. This proves $|\mathscr{I}| \ge 1$. b) Put $p(i) = i$, $p_N = p \wedge N$ and $\tau_N = \inf\{t : X(t) > N\}$. Since
$$p_N(X(t \wedge \tau_N)) - \int_0^{t \wedge \tau_N} \Omega_s p_N(X(s))\,ds$$
is a $\mathbb{P}_i$-martingale, we have
(15.12)
Set $C = \sup_{k \ge 0}[b(k) - a(k) + 2k] < \infty$. From (15.12), it follows that
Letting $N \to \infty$, we obtain
By Gronwall's lemma, we have
$$\mathbb{E}_i p(X(t \wedge \tau_N)) \le i + Ct + \int_0^t e^{t-s}(i + Cs)\,ds = -C + (i + C)e^t,$$
and so $m(t) = \mathbb{E}_i p(X(t)) \le -C + (i + C)e^t$. This shows that $m(t)$ is locally bounded. Similarly, one can prove that $\mathbb{E}_i[X(t)^k]$ is locally bounded in finite $t$-intervals for every $k \ge 1$. Combining this with (15.12), we see that
$$\frac{d}{dt}m(t) = \mathbb{E}_i \Omega_t p(X(t)) \le C - m(t).$$
Hence
(15.13)
c) To prove the ergodicity, we adopt the coupling of marching soldiers. Take two stationary distributions $\pi^1$ and $\pi^2$ with first moments $m_1$ and $m_2$, respectively. Denote by $\widetilde\Omega_t$ and $\widetilde{\mathbb{E}}$ the coupling operator and the expectation of the coupling process, respectively. Note that
We have
fit(X1(k)- X2(t)1 < cLIX1(t) - X2(t)l + IX'(t)
X2(t)I,
(15.14)
where $c_1$ is given by Remark 13.16. By virtue of (15.13) and (15.14), we get
$$\frac{d}{dt}\widetilde{\mathbb{E}}|X^1(t) - X^2(t)| = \widetilde{\mathbb{E}}\,\widetilde\Omega_t|X^1(t) - X^2(t)| \le (c_1 + 1)\widetilde{\mathbb{E}}|X^1(t) - X^2(t)|.$$
This plus the assumption $c_1 + 1 < 0$ gives us
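d) To prove the last assertion of the corollary, consider first the case that $\beta_0 = 0$ and rewrite $f(\lambda)$ as $\lambda g(\lambda)$:
$$g(\lambda) = 1 + \frac{\lambda - 1}{a_1} + \frac{(\lambda - 2)(\lambda + b_1)}{a_1 a_2} + \frac{(\lambda - 3)(\lambda + b_1)(\lambda + b_2)}{a_1 a_2 a_3} + \sum_{k=4}^\infty (\lambda - k)\frac{(\lambda + b_1) \cdots (\lambda + b_{k-1})}{a_1 \cdots a_k}.$$
Clearly, $g(0) > 0$, $g(1) < 0$ whenever
$$\delta_1^2 < \frac{1}{2} + \frac{2\beta_2 + 1}{3\delta_1 + 6\delta_3}.$$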
Combining this with a), we obtain the last assertion in the case that $\beta_0 = 0$. e) Finally, let $\beta_0 > 0$ and let $b_i$, $a_i$, $i \ge 1$ and $f(\lambda)$ be the same as in d). Reset $\tilde a_i = a_i$, $\tilde b_i = b_i + \beta_0$ and
Denote by $0 < \lambda' < \lambda''$ the first three roots of $f(\lambda)$. By d), we have $0 < \lambda' < 1$ and $f'(0) = 1 - 1/\delta_1 > 0$ since $\delta_1 > 1$. So there exists $\lambda^* \in (0, \lambda')$ such that $f(\lambda^*) = \sup_{\lambda \in (0, \lambda')} f(\lambda)$. Take $T > \lambda''$ and set
c = min {A”, f ( x * ) / M ~>) 0.
Then, for every $\beta_0 \in (0, c)$, we have $\tilde f(0) < 0$, $\tilde f(\lambda^* - \beta_0) > 0$, $\tilde f(\lambda'' - \beta_0) < 0$ and $\tilde f(\lambda) > 0$ for large enough $\lambda > 0$. Therefore, $\tilde f$ has at least three roots. ∎
15.5 Notes
The "dual" formula given in Section 15.1 appeared in Boldrighini, DeMasi, Pellegrinotti and Presutti (1987), but the proof was only sketched there. Section 15.2 is taken from Ding and Zheng (1989); the computations for the moments are simplified here. Theorem 15.8 is originally due to Li and Zheng (1988); the simple proof using oriented percolation is due to R. Durrett, presented at the Institute of Mathematics, Nankai (1988). Section 15.4 is taken from Feng and Zheng (1992). For further developments on this topic, refer to Dawson and Zheng (1991), Feng (1994a,b, 1995). The mean field model, which goes back to Yan and Li (1980), is a nice example of time-inhomogeneous Markov chains. This model shows that the time-inhomogeneous case is quite different from the time-homogeneous one and that a theory for general time-inhomogeneous jump processes is valuable. Certainly, there are many publications on this subject, some of which are collected in the books by Hu (1983, 1985). In general, reaction-diffusion processes are rather hard to handle and so simplified models are useful. Along this line, there is a large number of publications with applications to biology. See for instance Durrett (1993), Durrett and Neuhauser (1997), Durrett and Levin (1994) and references within.
Chapter 16
Hydrodynamic Limits
As we mentioned before, the reaction-diffusion processes are one of the tools to study non-equilibrium systems. Actually, the processes describe the microscopic behavior of the systems. Another mathematical tool to study the systems is the reaction-diffusion equations having the form
where $V(f)$ is a polynomial. These equations describe the macroscopic behavior of the systems. A bridge connecting these two subjects is the hydrodynamic limit, which is the object of study in this chapter.
16.1 Introduction: Main Results
Throughout this chapter, we consider only the polynomial model:
$$\Omega_d f(x) = \sum_{u,v} x(u)p(u,v)[f(x - e_u + e_v) - f(x)]$$
and $\Omega = \Omega_d + \Omega_r$, as given in Example 13.27. In order to obtain the Laplacian part in the reaction-diffusion equation, it is natural to assume that $(p(u,v))$ is the simple random walk on $\mathbb{Z}^d$ and use a scaling approach for the diffusion part of the process. In other words, we are dealing with the processes with formal generator
$$\Omega^\varepsilon = \varepsilon^{-2}\Omega_d + \Omega_r.$$
Our main task is to find the limiting behavior of the scaled processes as $\varepsilon \to 0$. To do so, we need some hypotheses on the initial distribution $\mu^\varepsilon$ of the processes. In what follows, suppose that $\mu^\varepsilon$ ($\varepsilon > 0$) is the independent product of the Poisson measures for which
$$\mu^\varepsilon(x_u) = \rho(\varepsilon u), \qquad u \in \mathbb{Z}^d,$$
where $\rho$ is a non-negative, bounded $C^2(\mathbb{R}^d)$-function with bounded first derivative. Denote by $\mathbb{E}^\varepsilon_{\mu^\varepsilon}$ the expectation of the process with generator $\Omega^\varepsilon$ and initial distribution $\mu^\varepsilon$. The main result of this chapter is as follows.
Theorem 16.1. For all $r = (r_1, \dots, r_d) \in \mathbb{R}^d$ and $t \ge 0$, the limit
where $[r/\varepsilon] = ([r_1/\varepsilon], \dots, [r_d/\varepsilon]) \in \mathbb{Z}^d$, exists and satisfies the reaction-diffusion equation:
$f(0, r) = \rho(r)$.
Furthermore, for any $n \ge 1$, $r_1, \dots, r_n \in \mathbb{R}^d$ and $t > 0$, we have
where
A particular case is that $\rho$ is a constant. Then $\mu^\varepsilon$ is translation invariant and hence $f(t,r) = f(t,0)$ satisfies the equation
(16.4)
Of course, if $f$ is a constant, then it must be a non-negative root of the equation
$$\sum_{j=0}^{m} \beta_j \lambda^j - \sum_{j=1}^{m+1} \delta_j \lambda^j = 0. \tag{16.5}$$
Definition 16.2. A non-negative, spatially homogeneous solution $f_0(t)$ to Eq. (16.2) is called an equilibrium if it satisfies Eq. (16.4). An equilibrium solution $f_0(t)$ to Eq. (16.2) is called stable (resp. asymptotically stable) if for every $\varepsilon > 0$, there exists a $\delta > 0$ such that for any solution $f(t,r)$ to Eq. (16.2), whenever $|f(0,r) - f_0(0)| < \delta$, we have $|f(t,r) - f_0(t)| < \varepsilon$ for all $t > 0$ (resp. $\lim_{t\to\infty}|f(t,r) - f_0(t)| = 0$).
Theorem 16.3. Let $\mu_\lambda$ be the independent product of identical Poisson measures with parameter $\lambda > 0$ and $\mathbb{E}^\varepsilon_{\mu_\lambda}$ be the expectation of the process with the generator $\Omega^\varepsilon$ and initial distribution $\mu_\lambda$. Denote by $\lambda_1 > \lambda_2 > \cdots > \lambda_k$ the non-negative roots of (16.5), where $\lambda_j$ has multiplicity $m_j$. Then, the non-negative equilibrium solution $f(t,r) \equiv \lambda_i$ is asymptotically stable iff $m_i$ is odd and $\sum_{j \le i-1} m_j$ is even.
This result describes the critical phenomenon for a non-equilibrium system in terms of the reaction-diffusion equation. We now compare Theorem 16.3 with the results obtained in the previous chapters. First, consider the reversible case. Because $\beta_j = \alpha\delta_{j+1}$ for some $\alpha > 0$, $j = 0, \dots, m$, the equation (16.5) becomes
$$\sum_{j=0}^{m} \beta_j\lambda^j - \sum_{j=1}^{m+1} \delta_j\lambda^j = (\alpha - \lambda)\sum_{j=1}^{m+1} \delta_j\lambda^{j-1} = 0,$$
which has only one non-negative solution $\lambda = \alpha$. By Theorem 16.3, the solution is asymptotically stable. This conclusion is consistent with Theorem 14.20, which says that there is no phase transition in the reversible case. Next, consider the first Schlögl model with $\beta_0 = 0$. Then, Eq. (16.5) has two roots: $\lambda_1 = (\beta_1 - \delta_1)/\delta_2$ and $\lambda_2 = 0$. It is easy to see that $\lambda_1$ is asymptotically stable but $\lambda_2$ is not. This conclusion is certainly reasonable since there is a phase transition whenever $\beta_1$ is large enough (Theorem 15.8). However, if $\beta_0 > 0$, then there is only one non-negative root, which is hence asymptotically stable. From this, one may conjecture that there would be no phase transition for the first Schlögl model and that there would exist a phase transition for the second Schlögl model, since for the latter not every solution is asymptotically stable (we will come back to the second model soon). These conjectures remain unsolved.
Certainly, in these two different contexts the objects are actually quite different. There is a scaling factor $\varepsilon^{-2}$ ($\varepsilon \downarrow 0$) in front of the diffusion rate $x(u)p(u,v)$ in the study of hydrodynamics in order to obtain the Laplacian in the equation. Thus, to regard Eq. (16.2) as an approximation of the particle systems, as indicated by (16.1), the diffusion rate should be large. Alternatively, if we fix the diffusion rate to be 1, then the reaction rates $a_k$ and $b_k$ should be replaced by $\varepsilon^2 a_k$ and $\varepsilon^2 b_k$ respectively. Thus, the above comparison makes sense for those $a(k)$ and $b(k)$ up to a sufficiently small scaling factor.
The above two theorems will be proved in Sections 16.3 and 16.4 respectively. Some preparations are presented in the next section. To conclude this section, we explain the reason why we choose the coefficients in Corollary 14.12. Note that for the second Schlögl model, the role played by each of the parameters $\beta_j$ and $\delta_j$ is not clear at all. It seems too hard and may not be necessary to consider the whole range of parameters.
Based on the above observation and to keep the physical meaning (the details are given below), we fix $\beta_2 = 6a$ ($a > 0$), $\delta_1 = 9a$ and $\delta_3 = a$. Then, when $\beta_0 \in (0, 4a)$, there are three roots $\lambda_1 > \lambda_2 > \lambda_3 \ge 0$. By Theorem 16.3, $\lambda_1$ and $\lambda_3$ are asymptotically stable but $\lambda_2$ is not. When $\beta_0 = 4a$, we have $\lambda_2 = 1$ with $m_2 = 2$ and $\lambda_1 = 4$; $\lambda_1$ is asymptotically stable but $\lambda_2$ is not. As for $\beta_0 > 4a$, there is only one non-negative root, which is certainly asymptotically stable. Hence, we guess that the ergodic region of $\beta_0$ should be
located in $(4a, \infty)$ for sufficiently small $a$. Of course, the assertion is true in the reversible case, for which we have $\beta_0 = 36a$ (Theorem 14.17). On the other hand, as mentioned in Durrett and Neuhauser (1994), the reaction-diffusion equations are usually the end of the study of hydrodynamic limits of the reaction-diffusion processes. But we can also go in the opposite direction, i.e., use the reaction-diffusion equation to investigate the microscopic processes. The main point used in the above quoted paper to prove some kind of phase transition for the reaction-diffusion processes with absorbing state $x_u \equiv 0$ is to look for the critical value at which the speed of the traveling wave solution to (16.2) changes its sign. Let us mention, without details, that in our present situation, this critical value is $\beta_0 = 2a$. From this point of view, the phase transitions would appear when $\beta_0 \in (0, 2a)$. Based on these considerations, we propose a typical non-trivial case, for which we have a more precise picture as shown in Corollary 14.12.
We now go to the details. The main point is that, to keep the essential meaning of the model, we should choose the parameters so that the equation
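$$\beta_0 + \beta_2\lambda^2 - \delta_1\lambda - \delta_3\lambda^3 = 0$$
contains a non-asymptotically stable root. Of course, we can take $a = \delta_3 = 1$. Let $\lambda = z + \beta_2/3$. Then, the equation is reduced to $z^3 + pz + q = 0$, where
$$q = -\beta_0 + \frac{1}{3}\beta_2\delta_1 - \frac{2}{27}\beta_2^3, \qquad p = \delta_1 - \frac{1}{3}\beta_2^2.$$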
When $q^2/4 + p^3/27 > 0$, there is only one real root, which is necessarily positive and asymptotically stable. Hence, the only interesting case is that $q^2/4 + p^3/27 \le 0$. Solving the equation $q^2/4 + p^3/27 = 0$ in the variable $\beta_0$, we obtain
$$\beta_0^{(1)} = \frac{-\beta_2(2\beta_2^2 - 9\delta_1) - 2(\beta_2^2 - 3\delta_1)^{3/2}}{27}, \qquad \beta_0^{(2)} = \frac{-\beta_2(2\beta_2^2 - 9\delta_1) + 2(\beta_2^2 - 3\delta_1)^{3/2}}{27}.$$
It turns out that $q^2/4 + p^3/27 \le 0$ iff $\beta_2^2 \ge 3\delta_1$ and $\beta_0^{(1)} \le \beta_0 \le \beta_0^{(2)}$. This rules out the region provided by (14.15). Recall that for the model, $\beta_0$ varies from 0 to $\infty$. So, it is natural to take $\beta_0^{(1)} = 0$. That is
$$2(\beta_2^2 - 3\delta_1)^{3/2} = \beta_2(9\delta_1 - 2\beta_2^2).$$
The only solution to this equation is $\beta_2 = 2\sqrt{\delta_1}$. Then
Therefore, for all $0 < \beta_0 < 4\delta_1^{3/2}/27$,
we have three non-negative roots:
where $\cos\varphi = 27\beta_0\delta_1^{-3/2}/2 - 1$. Thus, the number of parameters is reduced from 4 to 2. Our specific choice $\delta_1 = 9$ is not essential; it is made for simplicity, so that $\beta_2$ is an integer and $\delta_1$ differs from $\delta_3$. Now, fix $\delta_3 = 1$, $\delta_1 = 9$ and $\beta_2 = 6$. Then, $q^2/4 + p^3/27 > 0$ iff $\beta_0 > 4$. If so, there is only one non-negative root. Next, $q^2/4 + p^3/27 < 0$ iff $\beta_0 < 4$. In that case, we have three non-negative roots given in the last formula. Finally, when $\beta_0 = 4$, we have $\lambda_2 = 1$ with multiplicity 2 and a single root $\lambda_1 = 4$. Thus, as mentioned in the second to last paragraph, for every $\beta_0 \in (0, 4]$ there is precisely one non-asymptotically stable root but there is no such root for any $\beta_0 \in (4, \infty)$. We have thus arrived at the desired position. In our particular situation ($\beta_0 = 2$), the three roots are $2 - \sqrt{3}$, $2$, $2 + \sqrt{3}$.
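The root computation above is easy to reproduce numerically. The sketch below uses the standard trigonometric solution of the depressed cubic $z^3 + pz + q = 0$ (which may differ in form from the author's displayed formula) with $\delta_3 = 1$. Stability is tested through the sign of $V(\lambda) = \beta_0 + \beta_2\lambda^2 - \delta_1\lambda - \lambda^3$ on both sides of a root, under the assumption that the spatially homogeneous equation (16.4) reads $f' = V(f)$. Function names and the step `h` are illustrative.

```python
import math

def drift(lam, beta0, beta2, delta1):
    """V(lam) = beta0 + beta2*lam^2 - delta1*lam - lam^3 (delta3 = 1)."""
    return beta0 + beta2 * lam ** 2 - delta1 * lam - lam ** 3

def three_roots(beta0, beta2, delta1):
    """Three real roots of V, in decreasing order, when q^2/4 + p^3/27 < 0."""
    p = delta1 - beta2 ** 2 / 3.0
    q = -beta0 + beta2 * delta1 / 3.0 - 2.0 * beta2 ** 3 / 27.0
    if q * q / 4.0 + p ** 3 / 27.0 >= 0:
        raise ValueError("three distinct real roots need q^2/4 + p^3/27 < 0")
    arg = (3.0 * q / (2.0 * p)) * math.sqrt(-3.0 / p)
    theta = math.acos(max(-1.0, min(1.0, arg)))
    radius = 2.0 * math.sqrt(-p / 3.0)
    zs = [radius * math.cos((theta - 2.0 * math.pi * k) / 3.0) for k in range(3)]
    return sorted((z + beta2 / 3.0 for z in zs), reverse=True)

def is_asymptotically_stable(root, beta0, beta2, delta1, h=1e-3):
    """Stable for f' = V(f) iff V > 0 just below the root and V < 0 just above it."""
    below = drift(root - h, beta0, beta2, delta1)
    above = drift(root + h, beta0, beta2, delta1)
    return below > 0 > above
```

For $\beta_0 = 2$, $\beta_2 = 6$, $\delta_1 = 9$ this returns roots close to $2 + \sqrt{3}$, $2$ and $2 - \sqrt{3}$, and only the middle one fails the stability test, in line with the multiplicity rule of Theorem 16.3.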
16.2 Preliminaries
Let $c_0 = \sup_r \rho(r)$. Denote by $\mu^0$ the independent product of identical Poisson measures with mean $c_0$.
Lemma 16.4. For every $\varepsilon > 0$, we have
(1) $\mu^\varepsilon \le \mu^0$
for all $u \in \mathbb{Z}^d$ and $t \ge 0$, where $\mathbb{E}_\mu$ is the expectation of the reaction-diffusion process starting from $\mu$.
Proof: a) Since $\mu^\varepsilon$ and $\mu^0$ are independent products of their marginals, it suffices to consider the marginal distributions $\mu_u^\varepsilon$ and $\mu_u^0$ at a fixed site $u$. Construct two birth-death processes $P^\varepsilon(t)$ and $P^0(t)$ with death rates 1 and birth rates
$$b_i^\varepsilon = \rho(\varepsilon u)/(i + 1), \qquad b_i^0 = c_0/(i + 1), \qquad i \in \mathbb{Z}_+ \tag{16.6}$$
respectively. Then, $P^\varepsilon(t) \le P^0(t)$ by Lemma 5.45 or (5.57). Since $\mu_u^\varepsilon$ and $\mu_u^0$ are the stationary distributions of $P^\varepsilon(t)$ and $P^0(t)$ respectively, we have
$$P^\varepsilon(t)f(i) \to \mu_u^\varepsilon f \quad\text{and}\quad P^0(t)f(i) \to \mu_u^0 f \quad\text{as } t \to \infty$$
for all bounded f. Hence assertion (1) follows from Lemma 5.44. b) Because po is translation invariant, we have popm < 03 and hence poEm = 1. Furthermore, by Corollary 13.22 and Theorem 13.19,
for some constants $C, c \ge 0$. Now assertion (2) follows from (1) and the monotonicity of the process. ∎
Next, as an application of Lemma 16.4 (2), by (15.4), we obtain
where p ( t , u,v) is the Q-process induced by the simple random walk. Finally, replace (p(u, u))with ( ~ - ~ p (u)) u ,and denote by IE$ and ELe, respectively, := &-2Rd the expectation of the process starting from p' with generator and RE = RE, RT.We can rewrite Proposition 15.2 as follows.
+
Proposition 16.5. For every y E E ( f ) we , have (1) E$% X ( t > )= CYl P"(t, 9 , d ) l - L E ( W9) , and (2) E&D(Y,X ( t > > = CYl P"(t, Y, Y/)P"(D(Y', + ds CYlP"(t - s, Y,Y ' ) q [ R A Y ' , where p'(t, y, y') = p ( ~ - ~y,t 3, ') .
9)
,
X(S))]
In view of part (2) of the proposition, to study the convergence of $\mathbb{E}^\varepsilon_{\mu^\varepsilon}$, we need to estimate $p^\varepsilon(t, y, y')$. For this, we have the following result.
Lemma 16.6. Let $P = (p(u,v))$ be a general random walk on $\mathbb{Z}^d$ such that $p(0,0) < 1$ and let $(p(t,u,v))$ be the $Q$-process with $Q$-matrix $Q = P - I$. Then there exists a constant $C > 0$ such that
$$p(t,u,v) \le C/\sqrt{t}, \qquad t > 0,\ u, v \in \mathbb{Z}^d.$$
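The $t^{-1/2}$ decay in the lemma can be seen in the simplest case. For the simple random walk on $\mathbb{Z}$ with $Q = P - I$, conditioning on the Poisson number of jumps gives $p(t,0,0) = e^{-t}\sum_{k \ge 0} (t/2)^{2k}/(k!)^2$. The sketch below evaluates this series (the function name and the number of terms are illustrative, and the series is only intended for moderate $t$, where $e^{-t}$ does not underflow).

```python
import math

def return_probability(t, terms=400):
    """p(t,0,0) = e^{-t} * sum_k (t/2)^{2k} / (k!)^2 for the rate-1 simple random walk on Z."""
    term = math.exp(-t)  # k = 0 term
    total = term
    for k in range(1, terms):
        term *= (t / 2.0) ** 2 / (k * k)  # ratio of consecutive terms
        total += term
    return total
```

Tabulating `math.sqrt(t) * return_probability(t)` for a few values of `t` shows the product staying bounded and settling near $1/\sqrt{2\pi}$, consistent with a bound of the form $C/\sqrt{t}$.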
Proof: Clearly, the characteristic function $\varphi_t(\alpha)$ ($\alpha \in \mathbb{R}^d$) of the distribution $(p(t,0,\cdot))$ is given by
Since $P^{(n)}(0,\cdot)$ is the $n$-fold convolution of the distribution $p(0,u) =: p_u$ with itself, its characteristic function is as follows:
where U
Hence $\varphi_t(\alpha) = \exp[t(\varphi(\alpha) - 1)]$. Next, by the inversion formula,
Note that
and that
{
exp t p , (cos
c,"=, uiai - I)} 6 1,
Besides, we may assume that
B
:= p(,',,.. +dl
uE
~
d
.
> 0 with nd # 0, Then
P ( t , o,u>
Here in the last step, we have changed the variables
pi = ai,
1
and
pd =
d
nisi.
Since the last integral on the right-hand side is controlled by
we have
exp
G
[ - trP2/4] dP+2(7r
c/&.
This completes the proof of the lemma.
-
I
1)exp [ta(cos 1- l)]
W
Lemma 16.7. Let ( p ( u , w ) ) be the simple random walk on Z d and (p(t,u,w)) be the same as above. Then
Proof: a) Clearly, it suffices to consider the special case that $|u_1 - u_2| = 1$ since the general situation then follows by the triangle inequality. Next, for a coupling of $p(t,u_1,\cdot)$ and $p(t,u_2,\cdot)$ having the property $\widetilde{\mathbb{P}}^{u_1,u_2}[X_t^1 = X_t^2,\ t \ge T] = 1$, where $T$ is the coupling time and $\widetilde{\mathbb{P}}^{u_1,u_2}$ is the probability of the coupling process starting from $(u_1, u_2)$, we have
Thus, we need only estimate $\widetilde{\mathbb{P}}^{u_1,u_2}[T > t]$. b) Since the components of the simple random walk are independent, by Example 5.50, it suffices to consider the one-dimensional case. Let $X^1(t)$ be the Markov chain starting from $u_1 \in \mathbb{Z}$ with transition probability $p(t,u,v)$. Next, let $X^2(t)$ be the reflection of $X^1(t)$ with respect to the line $\{u \in \mathbb{Z} : |u - u_1| = |u - u_2|\}$. Then $(X^1(t), X^2(t))$ is a coupling of the Markov chains starting from $u_1$ and $u_2$ respectively. Moreover,
16.2 PRELIMINARIES
563
whenever Iul - u21=even, where TOis the first time of the original process hitting 0 and P" is the probability of the original process starting from u. Even though the above estimate does not cover all cases, we claim that it is enough to show that P2[T0> t] C / & To see this, we modify the above coupling as follows. First, let the processes, starting from 0 and 1, respectively, evolve independently. At the first jump, there are two possibilities. Either they meet each other or they are located at distance 2. In the former case, we let them move together. Otherwise, couple them by reflection (Example 0.22). Denote by T1 the fist jump of this coupling process. Then, the assertion follows from
<
FOJ [T > t] = FOJ [T > t > F1] + PJ[T > t , F1 6 t ]
.I
t
e-csP2[To> t - s]ds
+ eWct
for some $c > 0$. We have thus reduced our study to estimating $\mathbb{P}^2[T_0 > t]$.

c) Denote by $\tau_0 = 0, \tau_1, \tau_2, \dots$ the successive jump times of the Markov chain $X(t)$ starting from $u \in \mathbb{Z}$. Set $\Delta\tau_n = \tau_n - \tau_{n-1}$, $n \ge 1$. Then $\{\Delta\tau_n : n \ge 1\}$ is i.i.d., $\Delta\tau_1$ has an exponential distribution, and $\{Y_n := X(\tau_n)\}_{n \ge 0}$ is a simple random walk on $\mathbb{Z}$. Define $S_0 = \inf\{n \ge 1 : Y_n = 0\}$. Then
$$\mathbb{P}^u[T_0 > t] \le \mathbb{P}^u[S_0 > \beta t] + \mathbb{P}^u\Big[S_0 \le \beta t,\ \sum_{i=1}^{[\beta t]} \Delta\tau_i > t\Big] \le \mathbb{P}^u\big[S_0 > [\beta t]\big] + \frac{[\beta t]\,\mathrm{Var}(\Delta\tau_1)}{\big[t - [\beta t]\,\mathbb{E}\Delta\tau_1\big]^2}$$
for some suitable $\beta > 0$. Therefore, the desired estimate follows from the classical result: $\mathbb{P}^u[S_0 > n] \le c|u|/\sqrt{n}$. See for instance Spitzer (1975) [p. 381, P3].

The last topic of this section is to estimate the moments of the process with initial distribution $\mu^{\varepsilon}$ given in the previous section. Recall that $c_0 = \sup_r \rho(r) < \infty$. Define $y^{(u)} = y(u)e_u \in E_0$. It is easy to check that
$$D(y, x) = D(y - y^{(u)}, x)\, D(y^{(u)}, x), \qquad y \in E_0. \tag{16.9}$$
Lemma 16.8. Let $c_n$ be an increasing sequence such that
$$\sup_{k \ge 0} \big\{ b(k)\big[(k+1)^{(n)} - k^{(n)}\big] + a(k)\big[(k-1)^{(n)} - k^{(n)}\big] \big\} \le c_n, \qquad n \ge 1,$$
with equality holding when $n = 1$. Set

Then
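Proof: By Lemma 16.4, for fixed $\varepsilon > 0$, the process has finite moments of any order. Hence combining Proposition 16.5 with (16.9), we obtain

Since $\varphi^{\varepsilon}(t,0) = 1$, the first assertion follows. Put
$$\bar\varphi^{\varepsilon}(t,n) = \sup_{s \le t,\ k \le n} \varphi^{\varepsilon}(s,k).$$
We get
$$\bar\varphi^{\varepsilon}(t,n) \le c_0^n + \int_0^t n c_n\, \bar\varphi^{\varepsilon}(s,n)\, ds.$$
Therefore, by Gronwall's lemma,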
16.3 Proof of Theorem 16.1
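For any $n \ge 1$ and $y = \sum_{j=1}^{n} e_{u_j}$, define
$$f^{\varepsilon}(t; r_1, \dots, r_n) = \mathbb{E}^{\varepsilon}_{\mu^{\varepsilon}} D(y, X(t)), \qquad t \ge 0,$$
where $r_j = \varepsilon u_j$, $j = 1, \dots, n$. Extend $f^{\varepsilon}(t, \cdot)$ to the whole space $\mathbb{R}^{nd}$ by linear interpolation. We now prove Theorem 16.1 in three steps.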
Step 1. First, we prove the tightness of the family $\{f^{\varepsilon} : \varepsilon > 0\}$ on the compacts for fixed $n \ge 1$. From Lemma 16.8, $\{f^{\varepsilon} : \varepsilon > 0\}$ is uniformly bounded on finite $t$-intervals once $n$ is fixed. We need only to show the equicontinuity. That is,
$$\lim_{\delta \to 0}\ \sup_{\varepsilon > 0}\ \sup_{\substack{t, t' \le T \\ |t - t'| \le \delta}}\ \sup_{\|r - r'\| \le \delta} |f^{\varepsilon}(t, r) - f^{\varepsilon}(t', r')| = 0, \qquad T > 0,\ n \ge 1, \tag{16.10}$$
where for $r = (r_1, \dots, r_n)$, $\|r\| = \max_{1 \le i \le n} |r_i|$. First of all, we have
x IE;,R,D(z,X(s)).
(16.11)
We now deal with the first term on the right-hand side. Note that

We need only to show that $\sum_v p^{\varepsilon}(t,u,v)\rho(\varepsilon v)$ is equicontinuous for $t \le T$. Because $\sum_u p(t,0,u)|u|^2 = dt$, we have
$$\sum_v p^{\varepsilon}(t,0,v)\,\varepsilon|v| = \sum_{v:\ \varepsilon|v| \le \sqrt{t}} + \sum_{v:\ \varepsilon|v| > \sqrt{t}} \le \sqrt{t} + \sum_{v:\ \varepsilon|v| > \sqrt{t}} p^{\varepsilon}(t,0,v)\, \frac{\varepsilon^2|v|^2}{\sqrt{t}} \le (d+1)\sqrt{t}. \tag{16.12}$$
a) Given $\delta > 0$, $|u - u'| \le \delta$ and $0 < t' - t \le \delta$, by (16.12), we obtain

(16.13)

where $c' = \sup_r |\rho'(r)|$.

b) Next, consider the second term on the right-hand side of (16.11). Given $\delta > 0$, $|u - u'| \le \delta$ and $|t' - t| \le \delta/2$, we have either $t, t' \ge \delta/2$ or $t, t' \le \delta$. For the latter one, the proof is easy:
where
$$\bar c := \sup_{\varepsilon > 0,\ |y| = n,\ t \le T} \big| \mathbb{E}^{\varepsilon}_{\mu^{\varepsilon}} D(y, X(t)) \big| < \infty \qquad (\text{by Lemma 16.8}).$$
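Now, we turn to the case $t, t' \ge \delta/2$. Let $t' \ge t$. For convenience, we label the support of $y \in E_{(n)}$ as $\underline{u} := (u_1, \dots, u_n)$, where $u_i$ is repeated $y(u_i)$ times. If at time $t$ the particle that came from site $u_j$ is located at $v_j$, then the probability of such a transition can be expressed as follows:
$$P^{\varepsilon}(t, \underline{u}, \underline{v}) := p^{\varepsilon}(t, u_1, v_1) \cdots p^{\varepsilon}(t, u_n, v_n),$$
where $\underline{v} = (v_1, \dots, v_n)$. By using this notation, from (16.12), it follows that
$$\sum_{\|\underline{u} - \underline{v}\| \ge \varepsilon^{-1}\ell} P^{\varepsilon}(t, \underline{u}, \underline{v}) \le n(d+1)\sqrt{t}/\ell, \tag{16.14}$$
where $\|\cdot\|$ was defined below (16.10). On the other hand, by Lemma 16.7, we have
$$|P^{\varepsilon}(t, \underline{u}, \underline{v}) - P^{\varepsilon}(t, \underline{u}', \underline{v})| \le nC\varepsilon\|\underline{u} - \underline{u}'\|/\sqrt{t}. \tag{16.15}$$
So,
$$\cdots \le 2n\bar{c}\,C\sqrt{\delta}. \tag{16.16}$$
Here we have used $D(\underline{v}, x)$ instead of $D(y, x)$. Next, by (16.14) and (16.15), we have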
(16.17)
From (16.16) and (16.17), it follows that
Minimizing the right-hand side with respect to 12
t, we obtain
< 2 n c ~ 6 f +i 26 + ~n~&iZJZ
We have thus completed the proof of the equicontinuity of the family $\{f^{\varepsilon} : \varepsilon > 0\}$.
Step 2. Now, take a sequence $\{\varepsilon_k\}_{k=1}^{\infty}$, $\varepsilon_k \downarrow 0$, so that $f^{\varepsilon_k}(t; r_1, \dots, r_n)$ converges to some $f(t; r_1, \dots, r_n)$ for all $n \ge 1$, all $r_1, \dots, r_n$ and $t$. The convergence is uniform on the compacts. By (16.9), we can rewrite (16.11) as follows:

(16.18)

Because

the last term on the right-hand side of (16.18) vanishes in the limit $\varepsilon \to 0$. Noticing that as $\varepsilon \to 0$, $p^{\varepsilon}(t, [r_i/\varepsilon], \cdot)$ converges weakly to the normal distribution with density
$$G_t(r_i - r) := (2\pi t)^{-d/2} e^{-|r - r_i|^2/(2t)},$$
we obtain

where

with $r_j$ repeated $k$ times.
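The weak convergence used in Step 2 can be checked numerically in the simplest case. The sketch below takes $d = 1$ and assumes the diffusive scaling $p^{\varepsilon}(t,u,v) = p(\varepsilon^{-2}t, u, v)$ for the simple random walk with $Q = P - I$ (the scaling is defined outside this section, so it is an assumption here). It compares $\varepsilon^{-1} p^{\varepsilon}(t, 0, [r/\varepsilon])$ with the Gaussian density $G_t(r)$. The names `walk_prob`, `rescaled_density` and `gaussian_density` are illustrative.

```python
import math

def walk_prob(t, u, n_grid=4000):
    """p(t, 0, u) for the 1-D simple random walk with Q = P - I (Fourier inversion)."""
    h = 2.0 * math.pi / n_grid
    total = 0.0
    for k in range(n_grid):
        a = -math.pi + (k + 0.5) * h  # midpoint rule on [-pi, pi]
        total += math.exp(t * (math.cos(a) - 1.0)) * math.cos(a * u)
    return total * h / (2.0 * math.pi)

def rescaled_density(eps, t, r):
    """eps^{-1} * p(t / eps^2, 0, [r / eps]): lattice mass converted to a density."""
    u = math.floor(r / eps)
    return walk_prob(t / eps**2, u) / eps

def gaussian_density(t, r):
    """G_t(r) = (2 pi t)^{-1/2} exp(-r^2 / (2 t)) in dimension one."""
    return math.exp(-r * r / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

if __name__ == "__main__":
    t, r = 1.0, 0.5
    # powers of two keep r / eps an exact integer in floating point
    for eps in (0.5, 0.25, 0.125, 0.0625):
        approx = rescaled_density(eps, t, r)
        print(f"eps={eps:7.4f}  rescaled={approx:.5f}  G_t(r)={gaussian_density(t, r):.5f}")
```

The factor $\varepsilon^{-1}$ converts the mass at a lattice site into a density on the macroscopic scale; in dimension $d$ it would be $\varepsilon^{-d}$.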
Step 3. Finally, we identify $f(t; r_1, \dots, r_n)$ with $\prod_{i=1}^{n} g(t, r_i)$, where $g(t, r)$ solves Eq. (16.2). To do so, set

Then, we have

From this, we see that $f(t; r_1, \dots, r_n)$ is a solution to the integral equation (16.19). Thus, to complete the proof of Theorem 16.1, we need only to prove the uniqueness of the solution to Eq. (16.19). First, set
$$\varphi_n = \sup_{r_1, \dots, r_n}\ \sup_{t \le T} f(t; r_1, \dots, r_n).$$
From (16.18) and Lemma 16.8, it follows that

Hence
$$\varphi_n \le \big(c_0 e^{c_1 T/c_0}\big)^n, \qquad n \ge 0. \tag{16.20}$$
To prove the uniqueness under the growth condition (16.20), we introduce some more compact notation. Let $\mathscr{H}$ be the space of the sequences $h = (h^{(n)}(r_1, \dots, r_n))$ with $\sup_{r_1, \dots, r_n} |h^{(n)}(r_1, \dots, r_n)| < \infty$. Define operators $\mathscr{T}_t$ and $\mathscr{A}$ as follows:

where … and $A_{j,k}h^{(n)}$ are defined in a similar way as above. Suppose that $f_t^{(1)}$, $f_t^{(2)}$, $t \le T$, are solutions to Eq. (16.19) with the same initial datum, both having the bound

Then
- ......
Define for all and constant K2. From (16.21), it follows that
(16.21)
Note that for some
Letting $k \to \infty$, we deduce that $f_t^{(1)} = f_t^{(2)}$ for all $t < 1/(K_2(m+1)K_1^{m+1})$. Actually, this is enough to claim that $f_t^{(1)} = f_t^{(2)}$ for all $t \ge 0$ because of the semigroup property:

$t > \tau > 0$.
16.4 Proof of Theorem 16.3
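Set
$$\theta(\lambda, t) = \lim_{\varepsilon \to 0} \mathbb{E}^{\varepsilon}_{\mu^{\varepsilon}_{\lambda}} X_{[r/\varepsilon]}(t), \qquad t \ge 0.$$
a) First, we consider the case $\lambda > \lambda_1$. Note that if $\theta(\lambda, t_0) = \lambda_1$ for some $t_0 > 0$, then $\theta(\lambda, t) \equiv \lambda_1$ for all $t \ge t_0$ by the uniqueness of the solution to Eq. (16.19). Thus
$$\theta(\lambda, t) > \lambda_1, \qquad t \ge 0, \tag{16.22}$$
whenever $\lambda > \lambda_1$. On the other hand, since $\lambda_1$ is the largest non-negative root and $\lim_{k \to \infty} \big[\sum_{j=0}^{m} \beta_j k^j - \sum_{j=1}^{m+1} \delta_j k^j\big] = -\infty$, by Eq. (16.4), we have
$$\frac{d}{dt}\theta(\lambda, t) = -\delta_{m+1}(\theta - \lambda_1)^{m_1} \cdots (\theta - \lambda_k)^{m_k} g(\theta) \le 0, \tag{16.23}$$
where $g \ge 0$ on $[0, \infty)$. This means that $\theta(\lambda, \cdot)$ $(\lambda > \lambda_1)$ is non-increasing and so the limit $\lim_{t \to \infty} \theta(\lambda, t) \ge \lambda_1$ exists. By Eq. (16.4), we have

Letting $t \uparrow \infty$, since both the integral and its integrand on the right-hand side converge, we have

This plus the existence of $\lim_{t \to \infty} \theta(\lambda, t)$ certainly gives us the desired assertion.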
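To illustrate the monotone convergence argument, the sketch below integrates an ODE of the form (16.23) by the explicit Euler method. The right-hand side $-(\theta - 2)(\theta - 1)\theta$, with simple roots $\lambda_1 = 2 > \lambda_2 = 1 > \lambda_3 = 0$ and $\delta_{m+1} g \equiv 1$, is a hypothetical example chosen only for illustration, as are the names `drift` and `solve_theta`.

```python
def drift(theta):
    """Hypothetical right-hand side: -(theta - 2)(theta - 1) * theta."""
    return -(theta - 2.0) * (theta - 1.0) * theta

def solve_theta(theta0, t_end=20.0, dt=1e-3):
    """Explicit Euler integration of d(theta)/dt = drift(theta); returns the whole path."""
    path = [theta0]
    theta = theta0
    for _ in range(int(round(t_end / dt))):
        theta += dt * drift(theta)
        path.append(theta)
    return path

if __name__ == "__main__":
    above = solve_theta(3.0)    # initial value above the largest root
    between = solve_theta(1.5)  # initial value between the two largest roots
    print("from 3.0:", above[-1])
    print("from 1.5:", between[-1])
```

Starting above the largest root, the path decreases to it. Starting between the two largest roots, the single factor $(\theta - \lambda_1)$ is negative, so the path increases, which is the odd case treated in part b) of the proof.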
b) Let $\lambda_i < \lambda < \lambda_{i-1}$. Then, as an analogue of (16.22), we have

Moreover, by (16.23), if $\sum_{j=1}^{i-1} m_j$ is even (resp. odd), then $\theta(\lambda, t)$ is non-increasing (resp. non-decreasing) in $t$. Therefore, the proof a) gives us

c) For general $\rho$: $0 \le \alpha \le \rho \le \beta$, where $\alpha$ and $\beta$ are constants, by the monotonicity of the process, we have

where $\mu^{\varepsilon}_{\rho}$ is the product of Poisson measures introduced in Section 16.1 corresponding to the function $\rho$. Hence, as the limit as $\varepsilon \to 0$, we have

This completes the proof of our last theorem.
16.5 Notes
Hydrodynamic limits are an active research topic in the study of interacting particle systems. Even for the zero range processes, there are a lot of publications, so it is easy for the reader to find recent references. Here, we mention Boldrighini (1996), DeMasi and Presutti (1992), Feng (1996), Fritz (1987), Funaki (1997, 1999), Guo, Papanicolaou and Varadhan (1988), Kipnis and Landim (1999), Kipnis, Olla and Varadhan (1989), Perrut (2000) and Spohn (1992). Actually, this chapter is mainly taken from the books quoted above with some more careful treatments. Theorem 16.1 is due to Boldrighini, DeMasi, Pellegrinotti and Presutti (1987) and Dittrich (1988). Theorem 16.3 is due to Xu (1991), from which the proof of Lemma 16.6 is taken. Finally, the results obtained in this chapter can be extended to more general situations, as has been done by Chen, Huang and Xu (1990) and Xu (1991).
Bibliography
Aizenman, M. and Holley, R. (1987), Rapid convergence to equilibrium of stochastic Ising models in the Dobrushin-Shlosman regime, in "IMA volume on Percolation Theory and Ergodic Theory of Infinite Particle Systems", edited by H. Kesten, Springer, New York, 1-11.
Anderson, W. J. (1991), Continuous-Time Markov Chains, Springer Series in Statistics.
Aldous, D. J. and Fill, J. A. (1994-), Reversible Markov Chains and Random Walks on Graphs, In preparation.
Andjel, E. D. (1982), Invariant measures for zero range process, Ann. of Prob. 10 no. 3, 527-547.
Arnold, L. (1980), On the consistency of the mathematical models of chemical reactions, in "Proc. Int. Symp. Synergetics", Bielefeld 1979, edited by H. Haken, Springer-Verlag.
Arnold, L. and Theodosopulu, M. (1980), Deterministic limit of the stochastic model of chemical reactions with diffusions, Adv. Appl. Prob. 12, 367-379.
Austin, D. G. (1956), Some differentiation properties of Markoff transition probability functions, Proc. Amer. Math. Soc. 7, 756-761.
Barlow, M. T. and Perkins, E. A. (1988), Brownian motion on the Sierpinski gasket, Prob. Th. Rel. Fields 79, 543-624.
Basis, V. Ya. (1976), Infinite dimensional Markov processes with almost local interaction of components, Theory Prob. Appl. 21 no. 4, 706-720.
Basis, V. Ya. (1980), Stationarity and ergodicity of Markov interacting processes, in "Multicomponent Random Systems", edited by R. L. Dobrushin and Ya. G. Sinai, 37-58.
Bebbington, M., Pollett, P. and Zheng, X. G. (1995), Dual Constructions for Pure-Jump Markov Processes, Markov Processes and Related Fields 1 no. 4, 513-558.
Billingsley, P. (1968), Convergence of Probability Measures, Wiley, New York.
Blumenthal, R. M. and Getoor, R. K. (1968), Markov Processes and Potential Theory, Academic Press, New York.
Bobkov, S. G. and Götze, F. (1999a), Discrete isoperimetric and Poincaré inequalities, Prob. Th. Rel. Fields 114, 245-277.
Bobkov, S. G. and Götze, F. (1999b), Exponential integrability and transportation cost related to logarithmic Sobolev inequalities, J. Funct. Anal. 163, 1-28.
Boldrighini, C. (1996), Macroscopic limits of microscopic systems, Rendiconti di Matematica Serie VII, 16, 1-107.
Boldrighini, C., DeMasi, A., Pellegrinotti, A. and Presutti, E. (1987), Collective phenomena in interacting particle systems, Stoch. Proc. Appl. 25, 137-152.
Bouleau, N. and Hirsch, F. (1986), Propriétés d'absolue continuité et densité des variables aléatoires réelles sur l'espace de Wiener, J. Funct. Anal. 69, 229-259.
Carlen, E. A., Kusuoka, S. and Stroock, D. W. (1987), Upper bounds for symmetric Markov transition functions, Ann. Inst. Henri Poincaré no. 2, 245-287.
Chebotarev, A. M. (1988), Sufficient conditions for the regularity of jump Markov processes, Theory Prob. Appl. 33 no. 1, 22-35.
Cheeger, J. (1970), A lower bound for the smallest eigenvalue of the Laplacian, Problems in Analysis, a Symposium in Honor of S. Bochner, Princeton Univ. Press, Princeton, 195-199.
Chen, A. Y. (1984), Constructions of symmetrizable Q-processes with finite exit boundaries and finite non-conservative states (in Chinese), Chin. Ann. Math. 5 no. 2, 153-164.
Chen, A. Y. and Zhang, N. J. (1984), Existence criterion for reversible Q-processes, Acta Math. Sin. 5, 153-164.
Chen, D. Y., Feng, J. F. and Qian, M. P. (1996), The metastability of exponentially perturbed Markov chains, Sci. China, Ser. A. 39, 7-28.
Chen, D. Y., Feng, J. F. and Qian, M. P. (1997a), The metastable behavior of the three dimensional Ising model, I, Sci. China, Ser. A. 40 no. 8, 832-842. Chen, D. Y . , Feng, J. F. and Qian, M. P. (1997b), The metastable behavior of the three dimensional Ising model, 11, Sci. China, Ser. A. 40 no. 11, 1129-1135. Chen, J. D.(1983), O n existence condition for Bayes criterion (in Chinese), Chin. J. Appl. Math 6 no. 3, 367-375. Chen, J. W.(1995), T h e positive recurrence of Brussel’s model, Acta Math. Sci. 15,121125. Chen, J. W.( 1996a), Large deviations f o r infinite dimensional and reversible reactiondiffusion processes, Acta Math. Appl. Sin., Engl. Ser. 12 no. 3, 300-307. Chen, J. W.(1996b), A note o n the large deviations f o r reversible reaction-diffusion processes, Chinese J. Contemp. Math. 17 no. 4, 397-404. Chen, M. F.(1979), Minimal solutions to a class of non-negative equations (in Chinese), J. Beijing Normal Univ. no. 3, 66-73. Chen, M. F.(1980), Reversible Markov processes in abstract state space (in Chinese), Chin. Ann. Math. 1,437-451. Chen, M. F.( 1982), Constructions of symmetrizable Q-processes with finite exit boundaries (in Chinese), Acta Math. Sin. 25, 136-166. Chen, M. F.(1984), Basic couplings of Markov chains (in Chinese), J. Beijing Normal Univ. no. 4, 3-10. Chen, M. F.( 1985), Infinite dimensional reaction-diffusion processes, Acta Math. Sin. New Ser. 1 no. 3, 261-273. Chen, M. F.(1986a), Couplings of j u m p processes, Acta Math. Sin. New Ser. 2 no. 2, 123-136. Chen, M. F.(1986b), Jump Processes and Particle Systems (in Chinese), Beijing Normal Univ. Press, Beijing. Chen, M. F.(1986c), Existence for a probability kernel and diflerentiability of transition functions (in Chinese), J. Beijing Normal Univ. no. 4, 6-9. Chen, M. F.(1986d), Some new developments in probability theory (in Chinese), Chin. Quard. J. Math. 1, 104-117. Chen, M. F.( 1987), Existence theorems for interacting particle systems with non-compact state spaces, Sci. Sin. 
30 no. 2, 148-156. Chen, M. F.(1989a), Probability metrics and coupling methods, in “Pitman Research Notes in Mathematics”, Vol. 200, edited by K. D. Elworthy and J-C. Zambrini, 55-72. Chen, M. F.(1989b), Stationary distributions of infinite particle systems with non-compact state spaces, Acta Math. Sci. 9 no. 1, 7-19. Chen, M. F.(1989c), A survey o n random fields (in Chinese), Adv. Math. 18 no. 3, 294322. Chen, M. F.(1990), Ergodic theorems for reaction-diffusion processes, J. Statis. Phys. 58 no. 5/6, 939-966. Chen, M. F.(1991a), Uniqueness of reaction-digusion processes, Chin. Sci. Bulleten 36 no. 12, 969-973. Chen, M. F.(1991b), Exponential L2-convergence and L2-spectral gap for Markov processes, Acta Math. Sin. New Ser. 7 no. 1, 19-37. Chen, M. F.(1991c), Dirichlet f o r m s and symmetrizable j u m p processes, Chin. Quart. J. Math. 6 no. 1, 83-103. Chen, M. F.(1991d), O n three classical problem f o r Markov chains with continuous time parameters, J. Appl. Prob. 2 8 , 305-320. Chen, M. F.(1991e), Comparison theorems for Green function of Markov chuins, Chin. Ann. Math. 12(B) no. 3, 237-242.
Chen, M. F.(1991f), O n coupling of j u m p processes, Chin. Ann. Math. 12(B) no. 4, 385-399. Chen, M. F.(1991g), Stochastic processes f r o m Yang-Mills lattice field, in “Prob. and Stat., Nankai Series of Pure and Applied Math.”, edited by S. S. Chern and C. N. Yang, World Scientific. Chen, M. F. (1994a), Optimal Markovian couplings and applications, Acta Math. Sin. New Ser. 10 no. 3, 260-275. Chen, M. F. (1994b), Optimal couplings and application t o Riemannian geometry, in Prob. Theory and Math. Stat., Vol.1, Edited by B. Grigelionis e t al. VPS/TEV. Chen, M. F. (1995), O n ergodic region of Schlogl’s model, in Proc. Intern. Conf. on Dirichlet Forms & Stoch. Proc. Edited by Z. M. Ma, M. Rockner and J. A. Yan, Walter de Gruyter, 87-102. Chen, M. F . (1996a), Estimation of spectral gap for Markov chains, Acta Math. Sin. New Ser. 12 no. 4, 337-360. Chen, M. F. (1996b), A comment o n the book “Continuous-Time Markov Chains” by W. J. Anderson, Chin. J. Appl. Prob. Stat. 12 no. 1, 55-59. Chen, M. F. (1997), Reaction-diffusion processes, Chin. Sci. Bull., 1997, 42 no. 23, 24662474 (Chinese Ed.); 1998, 43 no. 17, 1409-1421 (English Ed.). Chen, M. F. (1998a), Trilogy of couplings and general formulas f o r lower bound of spectral gap, in “Probability Towards 2000”, Edited by L. Accardi and C. Heyde, Lecture Notes in Statis. 128,123-136, Springer-Verlag. Chen, M. F. (1998b), Estimate of exponential convergence rate in total variation by spectral gap, Acta Math. Sin. Ser. (A) 41:l (Chinese Ed.), 1-6; Acta Math. Sin. New Ser. 14 no. 1, 9-16. Chen, M. F. (1999a), Analytic proof of dual variational formula f o r the first eigenvalue in dimension one, Sci. Sin. (A) 42 no. 8, 805-815. Chen, M. F. (1999b), Nash inequalities f o r general symmetric forms, Acta Math. Sin. Eng. Ser. 15 no. 3, 353-370. Chen, M. F. (1999c), Eigenvalues, inequalities and ergodic theory (It), Chin. Adv. in Math. 28 no. 6,481-505. Chen, M. F. (1999d), Single birth processes, Chin. Ann. of Math. 20B no. 
1, 77-82. Chen, M. F. (2000a), Equivalence of exponential ergodicity and L2-exponential convergence f o r Markov chains, Stoch. Proc. Appl. 87,281-297. Chen, M. F. (2000b), Explicit bounds of the first eigenvalue, Sci. in China, Ser. A, 43 no. 10, 1051-1059. Chen, M. F. (~OOOC),The principal eigenvalue f o r j u m p processes, Acta Math. Sin. Eng. Ser. 16 no. 3, 361-368. Chen, M. F. (2000d), Logarithmic Sobolev inequality for symmetric forms, Sci. in China (A) 30 no. 3, 203-209 (Chinese Ed.); 43 no. 6 (English Ed.), 601-608. Chen, M. F. (2001a), Variational formulas and approximation theorems for the first eigenvalue in dimension one, Sci. Chin. (A) 44 no. 4,409-418. Chen, M. F. (2001b), Ergodic Convergence Rates of Markov Processes - Eigenvalues, Inequalities and Ergodic Theory [Collection of papers, 1993--20031, http://www. bnu,edu.Cn/-chenmf. Chen, M. F. (2001c), Explicit criteria for several types of ergodicity, Chin. J. Appl. Prob. Stat. 17 no. 2, 1-8. Chen, M. F. (2002a), Variational formulas of Poincark-type inequalities in Banach spaces of functions o n the line, Acta Math. Sin. Eng. Ser. 18 no. 3, 417-436. Chen, M. F. (2002b), A new story of ergodic theory, in “Applied Probability”, eds. R. Chan et al., AMS/IP Studies in Adv. Math. 26,25-34.
Chen, hi. F. (2002c), Ergodic Convergence Rules of Markov Processes - Eigenvalues, Inequalities und iirgodic Theory, in Proceedings of “ICM 2002”, Higher Educatiori Press, Beijing, pp. 41-52. Chen, M. F. (2003a), Vuriation.al form,ulo,s of Poincark-typc inequalities for one-dimensional processes, “IMS Lecture Notes-Monograph Series, Probability, Statistics and their Applicahions: Papers in honor of H.abi Rhattacharya” 41,81-95. Chm, M. F. (2003b), Ten explicit criteria of one-dimensional processes, in “Proc. Conf. on Stoch. Anal. on Large Scale Interacting Systems”, Adv. Studies in Pure Math., Math. Soc. Japan, 89-114. Chen, M . F. (2003c), Variational formulas of Poincark-type inequalities for birth-death processes, Acta Math. Sin. Eng. Ser. 19 no. 4, 625-644. Chen, hl. F. (2003d), Eigenvalues, Inequalities and Ergodic Theory, In preparation. Chen, M. F. and Chcng, H. X.(1981), ,B-equation and its application to Q-processes (in Chinese), J. Beijing Normal Univ. no. 4,1-15. Chen, hl. F., Ding, W. D. and Zhu, D. J. (1994), Ergodicity of reversible reaction-diflusion processes with general reaction rates, Acta Math. Sin. New Ser. 10 no. 1, 99-112. Chen, M. F., Huang, L. P. and Xu, X. J.(1991), Hydrodynamic limit for reaction-diflmion processes with several sgecies, in “Prob. and Stat., Nankai Series of Pure and Appl. Math.”, edited by S. 5 . Chern and C. N . Yang, World Scientific. Chen, k1. F. and Li, S. F.(1989), Coupling methods for multidimensional diflusion pvocesses, Ann. Prob. 17 no, 1, 151-177. Chen, M. I?. and Lu, Y. G.(’l990a),Larye deviutions for Murkov chains, Acta Sci. Sin. XO no. 2 , 217-222. Chen, M . F. and Lu, Y. G.(199Ob), O n evaluating the rate function of large deviations f o r jump processes, Acta Math. Sin. New Ser. 6 no. 3, 206-219. Chen, M .F. and Stroock, D. W.(1983), A,-invariant measures, LNM. 986, 205-220. Chen, M. F. and Wang, F.Y. (1993a), O n order-preservation and positive correlations f o r multidimensional diffusion processes, Prob. 
Th. Rel. Fields 95, 421-428. Chen, M. F. and Wang, F. Y . (1993b), Application of coupling method to the first eigenvalve o n manzfold, Sci. Sin.(A), 23 no. 11(1993) (Chinese Ed.), 1130-1140; 37 1(1994)(English Ed.), 1-14. Chen, M. F. and Wang, F. Y. (1997), Estimation of spectral gap f o r elliptic operators, Trans. Arner. Math. SOC. 349 no. 3, 1239-1267. Chen, M. F. and Wang, F. Y.(1998), Cheeger’s inequalities f o r general symmetric f o m s and existence criteriafor spectral gap, Abstract: Chin. Sci. Bull. 43 no. 1 4 (Chinese Ed.), 1475--1477; 43 no. 18 (English Ed.), 1516 .1S19; Ann. Pmb. 2000, 28 no. 1, 235-257. Chen! M.F. and Wang, Y.Z.(2003), Algebraic Gon,vergen,ce of Markov Chains, Ann. Appl. Probab 13 no. 2, 604-627. Chen, M. F. and Yan, S. .J.(J991), Jump processes and particle systems, in “Prob. T h . Appl. in China”, edited by S. d. Yan, C. C. Yang and J. G. Wang, Providence, AMS. 118, 23-57. Chen, hl. F. and Zheng, X. G.(1983), Uniqueness criterion f o r q-processes in abstract state spaces, Scj, Sin. 26 no. 1, 11-24. Chung, K. L.(1956), Some new developments in Markov chains, Trans. Amer. Math. Soc. 81,195-210. Chung, K. L.( 1967): Markov Chains with Stationary Transition Probabilities, 2nd edition, Springer-Verlag, New York. Chung, K. L.(1970), Lectures o n Boundary Theory f o r Markov Chains, Princeton, New Jersey.
Cohn, D. L. (1980), Measure Theory, Birkhäuser, Boston.
Dai, Y . E.(1986), Gibbs states and random fields (in Chinese), Acta Math. Sin. 29 no. 1, 103-111. Dai, Y. L. and Liu, X. J.(1986), Quasi-nearest particle systems, Acta Math. Sin. New Ser. 2 no. 1, 92-104. Dawson, D. A. and Zheng, X. G. (1991), Law of large numbers and central limit theorem f o r unbounded j u m p mean field models, Adv. Appl. hla.th. 12, 293-326. DeMasi, A. and Presutti, E,. (1992), Lectures on the Collective Behavior of Particle Systems, L N M 1502 (1992), Springer. Dembo, A. and Zeitouni, 0 . (1993), Large Deviations Techniques and Applications, Jones and Bartlett Publ. Inc. Derman, C . (1955): Some contribution to the theory of denumerable Mnrkov chmins, 7tans. A.mer. Math. SOC. 79, 541-555. Deuschel, J.-D. and Stroock, D. W. (1989), Large Deuiati~ons,Academic Press. Diacoriis, P. and Stroock, D. W. (1991), Geometric bounds f o r eigenvalues of Markov chains, Ann. Appl. Prob. 1 no. 1, 36-61. Ding, W. D. and Chen, M. F. (1981), Quasi-reversible measures f o r nearest speed functions (in Chinese), Chin. Ann. Math. 2, 47-59. Ding, W. D., Durrett, R. and Liggett, T. M. (1990), Ergodicity of reversible theorems reaction-diflusion processes, Prob. Th. Rel. Fields 85 no. 1, 13-26. Ding, W. D. and Zheng, X . G . (1989), Ergodic t h e o r e m for linear growth processes with digusion, Chin. Ann. Math. 10(B) no. 3,386-402. Dittrich, P. (1988a), A slochastic model of a chemical reaction with diffusion, Prob. Theory Rel. Fields 79,115-128. Dittorich,P. (1988b), A stochastic particle s y s t e m : fluctuations around a non-linear reaction diflusion eguulion, Stuch. Proc. Appl. 30,149-1611. Dobrushin, R. L. (19561, Central limit theorem for nonstationary Markov chains (I), Theory Prob. Appl. 1 no. 1, 65-79. Dobrushin, R. L. (1956), Central limit theorem for nonstationary Markou chains (II), Theory Prob. Appl. 1 no. 4, 329-383. Dobrushin, R. L. (1970), Prescribing a system of random variables by conditional distributions, Theory Prob. Appl. 15 no. 3,458486. 
Dobrushin, R. L.: Koteckj., R. and Shlosman, S. B. (1992), Wulg Construction, A Global Shape from Local Interaction, Trans. Math. Monographs, AMS. Providence. Dubrushin, R.. L.and Kusuolta, S.(1993), Statistical Mechanics and Fraclals, 1,NM. 1667, Springer-Verlag . Dobrushin, R, L. and Percherski, E. A.(1981), A criterion for the u ~ ~ i q u m e of s s Gighian fields in the riun-compact case, in “Prob. Theory and Math. Statis. ”, edited by K. It0 and J. V. Prohorov. LNM 1021,97-110. Dobrushin, R. L. and Shlosman, S. B.(1985), Constructive criterion for the uniqueness of Gibbs field, in “Statistical Physics and Dynamical Systems, Rigorous Results”, edited by R-itz, Jaffe and Szasz, 347-370. Dobrushin, R. L. and Sinai, Ya. G.(1980), Mathematical problems in statistical mechanics, in “Math. Phys. Reviews, Section C”, edited by S. P. Novikov. Harwood Academic Publisher GmbH 1, 77-105. Dobrushin, R. L. and Zahradnik, M.(1985), Phase diagrams for the continuous spin models. Extension of Pirogov-Sinai theory, in “Mathematical Problems in Statistical Physics and Dynamical Systems”, cdited by R L. Dobrushin, Dordrecht etc.: Rei-
del, 1-123. Doeblin, W.(1938), Expos6 de la thCorie des c h a h simplev constanted de Markov h un nim61yfini d’e‘tats, Rev. Math. Union Interbalkaniquc 2, 77 -1.05.
Donsker, M. U. and Varadtian, S. R. S.(1975), Asymptotic evaluation of certain Markov 77 process expectations f o r large time. (I), Comm. Pure ilppl. Math. 28, 1-47. Donsker, M.D. and Varadhan, S. R.S.(1976), Asymptotic evaluation of certain Murkov process expectations f o r large time. (IU), Camrn. Pure Appl. Math. 29, 389-461. Donsker, M. D. and Varadhan, S. R. S.(1983), Asymptotic evaluation of certain Markow process expectations f o r large time. (IV), Cornm. Pure Appl. Math. 36, 183-212. Doob, J. L.(1945), Markou chains-denumerable case, Tram Amer. Math. SOC. 158,455473. Doob, J. L.(1953), Stochastic Processes, Wiley, New York. Down, D., Meyn, S. P. and Tweedie, R. L.(1995), Exponential and u.nifomn ergodzcity of Markov processes, Ann. Prob. 23, 1671-1691. Dowson, D. C. and Landau, B. V.(1982), The Fre'chet distance between multiz~uriatenormal distributions, J . Multivariate Anal. 12,450455. Doyle, P. G . and Snell, J . L.(1984), Random TVullcs and ELlectric Netuiorks, AMS. Dudley, R. >I.( 1976), Probability and Metrics, Lecture Kotes, Aarhus University. Dudley, R. M.( 1989), Red Aaalysis and I'ruvahilzty, Wadsworth &: BrooksCole, Belmont, Calii'oornia. Dudley, R. M.arid Stroock? W . (1987), Slepiaa's inequality and commuting scrnigroups, Seminairc de Probabilitgs, XXI, LNM 1247,571-578. Durrett, R.(1981), An irttroductaon to infinite particle systems, Stoch. Proc. .4ppl. 11, 109-150. Durret.t, R.(1985), Two comparison theorems for the recurrence of Markov chains, preprint. Durret.t, R.(1986), Multidimensional random walks in random environments with subclassical limiting behavior, Comm. Math. Phys. 104 no. 87. Durrett, R.( 19S8), Lecture Notes on. Particle Systems and Percolation, Wadsworth and BrookslCole, Pacific Grove, Calif. Durrett, R.(1993), Ten Lectures o n Particle Systems, In Ecole dEte de Probabilitks de Saint Flour XXIII, LNM 1608, Springer-Verlag. Durrett, R. and Neuhauser, C. (1997), Coexistence Results far Some Competition ModeZs, Ann. 
Applied Prob. 7,10-45. Durrett, R. and Levin, S. (1994), The importance of being discrete (and spatial), Thearet. Pop. Biol. 46, 363-394. Uurrett, R. and Neuhauser, C, (1994), Particle systems and reaction diflusion equations, Ann. Probab. 22, 289-333. Uynkin, E. R.(1965), Markov Processes, Springer-Verlag, New York. Dynkin, E. B.(1969), Boundary Iheoory OJ Murkov Pwcesses ( t h e discrete case), Russian h4ath. Surveys 24 no. 2, 1-42. Dynkin, E.B.(1982), Markov Processes and Related Problems of Analysis, LNM. Vol. 54. London Math. Soc.: Cambridge Univ. Press. Ellis, R. S.( 1988) Entropy, Large Deviations and Statistical Mechanics, Springer. Falconer, K . 5.(1985), The Geometry of Fractal Sets, Cambridge Univ. Press. Falconer, K . J. (1989), Fractal Geometry, Mathematical Foundations and Applications, WiIey, Chichester. Feller, W.( 1940), On the integro-difjerentiai equations of pure discontinuous Markov Processes, Trans. Arner. Math. SOC. 48, 488-515. Feller, W.(1945), On the integTo-dijJerential equations of pure discontinuow Markov Processes, Trans. Amer. Math. SOL 5 8 , p. 474. Feller, W.( 1957), O n boundaries and lateral conditions for Kolmogorof difleerential e q w tions, Ann. M a t h . 8 5 , 527-570.
Feller, W.(1958), Notes to m y pnper “O?Lboundaries and lateral c0nditiun.s f o r Kolmogorof differential equations”, Ann. Math. 68, 735-736. Feller, W.( 1966), A n Introduction t o Probability and its Applications, Wiley, New York. Feng, J. F. (1996), The hydrodynamic limit f o r Ihe reuction diflusim equation--an approach in terms of the GRP mclhod, J. Theor. Prob. 9(2), 285-299. Feng, S. (1994a), Large deviations f o r empirical process of interacting particle system with unbounded jumps, Ann. Prob. 22 no. 4 , 2122-2151. Feng, S . (1994b), Large deviations for Markov processes with interaction and wzbounded jumps, Prob. T h . Rel. Fields 100,227-252. Feng, S . (1995), Nonlinear master equation of multitype particle systems, Stoch. Proc. Appl. 5 7 , 247-271. Feng, S. and Zheng, X. G.(1992), Solutions of a class of non-linear Master equations, Stoch. Proc. Appl. 43,65-84. Ferrari, P. A.(1990), Ergodicity for span systems with stirrings, Ann. Prob. 18 no. 4, 1523-1538. Forbes, F. and Franqois, 0.(1997), Stochastic comparison f o r Markov processes on a product of partially ordered sets, Stat.. Prob. Letters 33,309-320. Foster, F. G.(1953), O n the stochastic matrices associated with certaia queueing problems, Ann. Math. Stastis. 24, 355-360. Freedman, U.( 1983), Markov Chains, Springer-Verlag, New York. Fritz, J . (1987), O n the hydrodynamic li.mit of a scalar Ginzburg-Landau lattice model, IMA Volume 9, Springer-Verlag. Frohlich, J., Israel, I t . , Lieb, E. H. and Simon, B.(1978), Phase transitions and reflection positivity.1. General theory and long range lattice models, Comm. Math. Phys. 62, 1-34. F’ukushima, M.(1980), Dirichlet F o m s and Marlcov Processes, North Holland Math., 23. hkushirna, M., Oshima, Y. and Takeda, M. (1994), D i ~ d c h l ~Ft o m and Symmetric Markov Proceuseu, Walter de Gruyter. E’ukushirna, M. and Stroock, D. W.(lSSS), Reversibility of solutions. lo 7nurlingak probl e m s , in “Prob., Stat. Mech., and Number Th., Adv. in Math. Suppl. 
Studies” 9, 107-1 23. Rinaki, T. (1997), Singular livnil f o r reaction-difluuian equation with self-similar Guussian noise, Proc. of Taniguchi symp., New ‘I’rendsin Stoch. Anal. Eds: a.Flworthy, 5. Kusuoka and 1. Shigekawa, World Sci., 132-152. Funaki, ‘1‘. (1 999), Singular limit for stochastic reaction-dijfusion equation a.nd generation of random interfaces, Acts Math. Sin, Eng. Ser. 15, 407-438. Cefen, Y., Aharony, A. and Mandelbrot, B. B.(1983),Phase trnnnitions o n fractals: (I). Quasi-hear lattices, J. Phys. A. Ma.t,h. 16,1267-1278. Gefen, Y . , hharony, A,, Shapir, Y . and Mandelbrot: €3. B.(L984), Phase lmnsilio7is 07L fractals: (n).Sierpinski gaskets, J. Phys. A . Makh. 17,435444. Gefen, Y . , Aharony, A. and Mandelbrot, R. B.(J984), Phasc transitions o n fractah: (m). Infinitely ramified lattices, J. Phys. A. Math. 17,1277-1289. Georgii, 1-1. 0.(1979), Canonical Gibbs States, LNM. Springer-Verlag. Georgii, H. 0.(1988), Gibbs Measures and Phase Transitions, de Gruyter Studies in Math., Water de Gruyter. Givens, C. R. and Shortt, R. h1.(1984), A class of Wasserstein metrics for probability distributions, Mishigan Math. J. 31,231-240. ~ coupLed branching process. Part I: The erGreven, A.(1991), A phase transition f o the godic theory in the range of finite second moments, Prob. T h . Rel. Fields 87 no. 4 , 416-458.
Griffeath, D.( 1978), Coupling methods f o r Markov processes, in “Studies in Probability and Ergodic Theory, Adv. Math., Supplementary Studies” 2 , 1-43. Gong, S.(1983), Harmonic Analysis in the Classical Groups (in Chinese), Press of Sci., Beijjrig .
Granovsky, B. and Zeifman, A. I. (1997), The decay function of nonhomogeneous birth-death processes with application to mean-field models, Stoch. Proc. Appl. 72, 105-120.
Gross, L. (1975), Logarithmic Sobolev inequalities, Amer. J. Math. 97 no. 4, 1061-1083.
Guionnet, A. and Zegarlinski, B. (2003), Lectures on logarithmic Sobolev inequalities, J. Azéma, M. Émery, M. Ledoux and M. Yor, LNM 1801, Springer-Verlag, 1-134.
Guo, M. Z., Papanicolaou, G. C. and Varadhan, S. R. S. (1988), Non-linear diffusion limit for a system with nearest neighbor interactions, Comm. Math. Phys. 118, 31-59.
Haken, H. (1983), Synergetics, 3rd edition, Springer-Verlag.
Hamza, K. and Klebaner, F. C. (1995), Conditions for integrability of Markov chains, J. Appl. Prob. 32, 541-547.
Han, D. (1990), Existence of solution to the martingale problem for multispecies infinite dimensional reaction-diffusion particle systems, Chin. J. Appl. Prob. Statis. 6 no. 3, 265-278.
Han, D. (1991), Ergodicity for one-dimensional Brusselator model (in Chinese), J. Xingjiang Univ. 8 no. 3, 37-38.
Han, D. (1992), Uniqueness of solution to the martingale problem for infinite-dimensional reaction-diffusion particle systems with multispecies (in Chinese), Chin. Ann. Math. 13A no. 2, 271-277.
Han, D. (1995), Uniqueness for reaction-diffusion particle systems with multispecies (in Chinese), Chin. Ann. Math. 16A no. 5, 572-578.
Harris, T. E. (1957), Transient Markov chains with stationary measures, Proc. Amer. Math. Soc. 8, 937-942.
Hamza, K. and Klebaner, F. C. (1995), Conditions for integrability of Markov chains, J. Appl. Prob. 32, 541-547.
Herbst, I. and Pitt, L. (1991), Diffusion equation techniques in stochastic monotonicity and positive correlations, Prob. Th. Rel. Fields 87 no. 3, 275-312.
Holley, R. (1970), A class of interaction in an infinite particle systems, Adv. Math. 5, 291-309.
Holley, R. (1984), Convergence in L2 of stochastic Ising models: jump processes and diffusions, in “Proc. Taniguchi Symposium on Stochastic Analysis”, 149-167.
Holley, R. (1985a), Possible rates of convergence in finite range, attractive spin systems, in “Particle Systems, Random Media and Large Deviations”, edited by R. Durrett, Contemporary Mathematics 41, 215-234.
Holley, R. (1985b), Rapid convergence to equilibrium in one dimensional stochastic Ising models, Ann. Prob. 13, 72-89.
Holley, R. and Liggett, T. M. (1981), Generalized Potlatch and smoothing processes, Z. Wahrs. 55, 165-195.
Holley, R. and Stroock, D. W. (1976), L2 theory for the stochastic Ising model, Z. Wahrs. 35, 87-101.
Holley, R. and Stroock, D. W. (1987), Logarithmic Sobolev inequalities and stochastic Ising models, J. Statis. Phys. 46 no. 5/6, 1159-1194.
Holley, R. and Stroock, D. W. (1989), Uniform and L2 convergence in one dimensional stochastic Ising models, Comm. Math. Phys. 123, 85-93.
Hou, Z. T. (1974), Uniqueness criterion for Q-processes, Sci. Sin. 15, 141-159.
Hou, Z. T. (1982), Uniqueness Criteria for Q-processes (in Chinese), Hunan Sci. Press.
Hou, Z. T. (1991), Q-matrix problem, in “Probability Theory and its Applications in China”, edited by S. J. Yan, C. C. Yang and J. G. Wang, Providence, AMS. 118, 127-148.
Hou, Z. T. and Chen, M. F. (1980), Markov Processes and field theory (Abstract), Kexue Tongbao 25, 913-916.
Hou, Z. T. and Chen, M. F. (1979), Markov processes and field theory, in “Reversible Markov Processes” (in Chinese), edited by M. Qian and Z. T. Hou, 913-916.
Hou, Z. T. and Chen, M. F. (1979), On strong Markov property (in Chinese), J. Changsha Railway Inst. 25, 913-916.
Hou, Z. T. and Chen, M. F. (1979), Some examples of symmetrizable Q-processes (in Chinese), J. Changsha Railway Inst., 913-916.
Hou, Z. T. and Chen, M. F. (1980), On the symmetrizability of a class of Q-processes (in Chinese), J. Beijing Normal Univ. no. 3-4, 1-12.
Hou, Z. T. et al (2000), Markov Skeleton Processes (in Chinese), Hunan Sci. Press, Hunan.
Hou, Z. T. and Guo, Q. F. (1976), On the qualitative theory of constructing the time-homogeneous Markov processes with countable state space (in Chinese), Sci. Sin. no. 4, 239-262.
Hou, Z. T. and Guo, Q. F. (1978), Time-Homogeneous Markov Processes with Countable State Space (in Chinese), Beijing Sci. Press, English translation (1988), Beijing Sci. Press and Springer-Verlag.
Hou, Z. T., Guo, Q. F. and Chen, M. F. (1979), Reversible Q-processes, in “Reversible Markov Processes” (in Chinese), edited by M. Qian and Z. T. Hou, Hunan Sci. Press.
Hou, Z. T., Liu, Z. M., Zhang, H. J., Li, J. P., Zhou, J. Z. and Yuan, C. G. (2000), Birth-death Processes (in Chinese), Hunan Sci. Press, Hunan.
Hou, Z. T., Wang, P. Z. and Chen, M. F. (1979), Reversible birth-death processes, in “Reversible Markov Processes” (in Chinese), edited by M. Qian and Z. T. Hou, 139-148.
Hou, Z. T., Zhou, J. Z., Zhang, H. J., Liu, Z. M., Xiao, G. N., Chen, A. Y. and Fei, Z. L. (1994), The Q-matrix Problem for Markov Chains (in Chinese), Hunan Sci. Press.
Hsu, P. L. (1958), The differentiability of the probability transition function of pure discontinuous stationary Markov processes on the Euclidean space (in Chinese), J. Beijing Univ. no. 3, 257-270.
Hu, D. H. (1966), Construction theory of q-processes in abstract state spaces (in Chinese), Acta Math. Sin. 16 no. 2, 150-165.
Hu, D. H. (1978), On pure discontinuous Markov processes (I) (in Chinese), J. Wuhan Univ. no. 4, b 1 8.
Hu, D. H. (1979), On pure discontinuous Markov processes (II) (in Chinese), J. Wuhan Univ. no. 1, 15-38.
Hu, D. H. (1983), Markov Processes with Countable State Spaces I (in Chinese), Wuhan Univ. Press.
Hu, D. H. (1985), Analytic Theory for Markov Processes with General State Spaces (II) (in Chinese), Hubei’s Educational Press.
Hua, L. K. (1963), Harmonic Analysis of Functions of Several Complex Variables in the Classical Domains, Amer. Math. Soc. Providence, R.I.
Huang, C. C. and Isaacson, D. (1976), Ergodicity using mean visit times, J. London Math. Soc. (2) 14, 570-576.
Huang, L. P. (1987), The existence theorem of the stationary distributions of infinite particle systems (in Chinese), Chin. J. Appl. Prob. Statis. 3 no. 2, 152-158.
Huang, L. P. (1988), Controlled Q-processes and Success of Couplings (in Chinese), Ph. D. Thesis, Beijing Normal U.
Hunt, G. A. (1970), Markov chains and Martin boundaries, Illinois J. Math. 4, 313-340.
Hwang, C. R., Hwang-Ma, S. Y. and Sheu, S. J. (2002), Accelerating diffusions, preprint.
Ikeda, N. and Watanabe, S. (1981), Stochastic Differential Equations and Diffusion Processes, North-Holland, Kodansha, Tokyo.
Isaacson, D. (1979), A characterization of geometric ergodicity, Z. Wahrs. 49, 267-273.
Isaacson, D. and Arnold, B. (1978), Strong ergodicity for continuous-time Markov chains, J. Appl. Prob. 15, 699-706.
Isaacson, D. and Luecke, G. R. (1978), Strongly ergodic Markov chains and rates of convergence using spectral conditions, Stoch. Proc. and Appl. 7, 113-121.
Isaacson, D. and Tweedie, R. L. (1978), Criteria for strong ergodicity of Markov chains, J. Appl. Prob. 15, 87-95.
Jain, N. C. (1990), Large deviation lower bounds for additive functionals of Markov processes, Ann. Prob. 18 no. 3, 1071-1098.
Kantorovich, L. V. and Krylov, V. I. (1962), Approximate Methods of Higher Analysis (Chinese translation), Sci. Press, Beijing.
Kelly, F. P. (1979), Reversibility and Stochastic Networks, Wiley, Chichester.
Kendall, D. G. (1951), Some problems in the theory of queues, J. Roy. Stat. Soc. (B) 13, 151-185.
Kendall, D. G. (1955), Some analytical properties of continuous stationary Markov transition functions, Trans. Amer. Math. Soc. 78, 529-540.
Kendall, D. G. (1959), Unitary dilations of Markov transition operators and the corresponding integral representations for transition probability matrices, in “Probability and Statistics” (U. Grenander, ed.), Wiley, New York.
Kersting, G. and Klebaner, F. C. (1995), Sharp conditions for nonexplosions and explosions in Markov jump processes, Ann. Prob. 23 no. 1, 268-272.
Kingman, J. F. C. (1964), The stochastic theory of regenerative events, Z. Wahrs. 2, 180-224.
Kingman, J. F. C. (1968), Markov transition probabilities (III); general state spaces, Z. Wahrs. 10, 87-101.
Kingman, J. F. C. (1972), Regenerative Phenomena, Wiley, New York.
Kipnis, C. and Landim, C. (1999), Scaling Limits of Interacting Particle Systems, Springer-Verlag, Berlin.
Kipnis, C., Olla, S. and Varadhan, S. R. S. (1989), Hydrodynamics and large deviations for simple exclusion processes, Comm. Pure Appl. Math. 42, 115-137.
Kolmogorov, A. N. (1931), Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung, Math. Ann. 104, 415-458.
Kolmogorov, A. N. (1936a), Zur Theorie der Markoffschen Ketten, Math. Ann. 112 no. 4, 155-160.
Kolmogorov, A. N. (1936b), On the differentiability of the transition probabilities in homogeneous Markov Processes with a denumerable number of states, Učenye Zapiski MGY 148, 53-59.
Konstantinov, A. A., Maslov, V. P. and Chebotarev, A. M. (1990), Probability representations of solutions of the Cauchy problem for quantum mechanical solutions, Russian Math. Surveys 45 no. 6, 1-26.
Kusuoka, S. (1987), A diffusion process on a fractal, in “Probabilistic Methods in Mathematical Physics”, Proc. of Taniguchi International Symp. (1985), edited by K. Itô and N. Ikeda, 251-274.
Kusuoka, S. and Zhou, X. Y. (1992), Dirichlet form of fractals: Poincaré constant and resistance, Prob. Th. Rel. Fields 93, 169-196.
Kuznetsov, S. E. (1980), Any Markov process in a Borel space has a transition function, Theory Prob. Appl. 25, 389-393.
Landim, C., Sethuraman, S. and Varadhan, S. R. S. (1996), Spectral gap for zero range dynamics, Ann. Prob. 24, 1871-1902.
Lawler, G. F. and Sokal, A. D. (1988), Bounds on the L2 spectrum for Markov chain and Markov processes: a generalization of Cheeger’s inequality, Trans. Amer. Math. Soc. 309, 557-580.
Li, J. P. (1990a), Master’s Thesis (in Chinese), Beijing Normal Univ.
Li, J. P. (1990b), A remark on the paper “Multidimensional Q-processes” (in Chinese), Chin. Ann. Math. 11(A) no. 4, 505-506.
Li, S. Q. (1983), Potentiality and reversibility of a class of mixed infinite particle systems (in Chinese), Chin. Ann. Math. 4(A) no. 6, 773-780.
Li, T. D. and Yang, C. N. (1952), Statistical theory of equations of state and phase transitions (I), Phys. Rev. 87, 404-409.
Li, T. D. and Yang, C. N. (1952), Statistical theory of equations of state and phase transitions (II), Phys. Rev. 87, 410-419.
Li, Y. (1990), The positive reversible probability measure on potential exclusion processes (in Chinese), Chin. J. Appl. Prob. Statis. 6 no. 2, 121-126.
Li, Y. (1991), Uniqueness for infinite dimensional reaction-diffusion processes, Chin. Sci. Bull. (Chinese Ed.) 22, 1681-1684.
Li, Y. (1996), Ergodicity of a class of reaction-diffusion processes with translation invariant coefficients (in Chinese), Chin. Ann. Math. 16A no. 2, 223-229.
Li, Y. and Zheng, X. G. (1988), Colored graphic representation for certain interacting particle systems, preprint.
Liggett, T. M. (1972), Existence theorem for infinite particle systems, Trans. Amer. Math. Soc. 165, 471-481.
Liggett, T. M. (1973), An infinite particle system with zero range interactions, Ann. of Prob. 1, 240-253.
Liggett, T. M. (1985), Interacting Particle Systems, Springer-Verlag.
Liggett, T. M. (1989), Exponential L2 convergence of attractive reversible nearest particle systems, Ann. of Prob. 17, 403-432.
Liggett, T. M. (1991), L2 rates of convergence for attractive reversible nearest particle systems: the critical case, Ann. of Prob. 19 no. 3, 935-959.
Liggett, T. M. (1997), The 1996 Wald Memorial Lectures; Stochastic models of interacting systems, Ann. Probab. 25, 1-29.
Liggett, T. M. (1999), Stochastic Interacting Systems: Contact, Voter and Exclusion Processes, Springer-Verlag.
Liggett, T. M. and Spitzer, F. (1981), Ergodic theorems for coupled random walks and other systems with locally interacting components, Z. Wahrs. 56, 443-468.
Lindvall, T. (1979), A note on coupling of birth-death processes, J. Appl. Prob. 16, 505-512.
Lindvall, T. (1992), Lectures on the Coupling Method, Wiley, New York.
Lindvall, T. (1999), On Strassen’s theorem on stochastic domination, Electr. Comm. Probab. 4, 51-59.
Lindvall, T. and Rogers, L. C. G. (1986), Coupling of multidimensional diffusion processes, Ann. Prob. 14 no. 3, 860-872.
Lindstrøm, T. (1990), Brownian Motion on Nested Fractals, Memoirs of AMS.
Liu, X. J. (1986), Finite quasi-nearest particle systems (in Chinese), Chin. J. Appl. Prob. Stat. 3 no. 1, 38-45.
Loève, M. (1963), Probability, 3rd edition, New York.
Löwe, M. and Meise, C. (2001), Note on the knapsack Markov chain, Stoch. Proc. Appl. 94, 155-170.
López, F. J., Martínez, S. and Sanz, G. (2000), Stochastic domination and Markovian couplings, Adv. Appl. Prob. 32, 1064-1076.
López, F. J. and Sanz, G. (1998), Stochastic comparisons and couplings for interacting particle systems, Stat. Prob. Letters 40, 93-102.
Lyons, T. J. (1983), A simple criterion for transience of a reversible Markov chain, Ann. of Prob. 11 no. 2, 393-402.
Ma, Z. M. (1985), Some results on regular conditional probabilities, Acta Math. Sin. New Ser. 1 no. 4, 128-133.
Ma, Z. M. and Röckner, M. (1992), Introduction to the Theory of (Non-symmetric) Dirichlet Forms, Springer-Verlag.
Maes, C., Redig, F. and Saada, E. (2002), The abelian sandpile model on an infinite tree, Ann. Prob. 30, 2081-2107.
Maes, C. and Shlosman, S. B. (1991), Ergodicity of probabilistic cellular automata: A constructive criterion, Comm. Math. Phys. 135 no. 2, 233-251.
Maes, C. and Shlosman, S. B. (1993), Constructive criteria for the ergodicity of interacting particle systems, in “Cellular Automata and Cooperative Systems”, Nato ASI Series, eds. N. Boccara, E. Goles, S. Martínez, and P. Picco, Kluwer, Dordrecht, 451-461.
Malyshev, V. A. and Minlos, R. A. (1991), Gibbs Random Fields, Cluster Expansions, Kluwer Acad. Publ.
Mandelbrot, B. B. (1982), The Fractal Geometry of Nature, Freeman, San Francisco.
Mao, Y. H. (2002a), The logarithmic Sobolev inequalities for birth-death process and diffusion process on the line, Chin. J. Appl. Prob. Statis. 18 no. 1, 94-100.
Mao, Y. H. (2002b), Strong ergodicity for Markov processes by coupling methods, J. Appl. Prob. 39, 839-852.
Mao, Y. H. and Zhang, Y. H. (2003), Exponential ergodicity for single birth processes, preprint.
Martinelli, F. (1999), Lectures on Glauber dynamics for discrete spin models, LNM 1717, Springer-Verlag, 93-191.
McCoy, B. M. and Wu, T. T. (1973), The Two-Dimensional Ising Model, Harvard Univ. Press, Cambridge, Massachusetts.
Meise, C. (1999), On spectral gap estimates of a Markov chain via hitting times and coupling, J. Appl. Prob. 36, 310-319.
Mertens, J. F., Samuel-Cahn, E. and Zamir, S. (1978), Necessary and sufficient conditions for recurrence and transience of Markov chains, J. Appl. Prob. 15, 848-851.
Meyn, S. P. and Tweedie, R. L. (1993a), Stability of Markovian processes (III): Foster-Lyapunov criteria for continuous-time processes, Adv. Appl. Prob. 25, 518-548.
Meyn, S. P. and Tweedie, R. L. (1993b), Markov Chains and Stochastic Stability, Springer-Verlag, London.
Miclo, L. (1999), An example of application of discrete Hardy’s inequalities, Markov Processes Relat. Fields 5, 319-330.
Minlos, R. A. (1996), Invariant subspaces of the stochastic Ising high temperature dynamics, Markov Processes Relat. Fields 2, 263-284.
Minlos, R. A. (2000), Introduction to Mathematical Statistical Physics, University Lecture Series 19, American Mathematical Society.
Minlos, R. A. and Trisch, A. (1994), Complete spectral decomposition of the generator for one-dimensional Glauber dynamics (in Russian), Uspekhi Matem. Nauk 49, 209-211.
Mountford, T. S. (1992), The ergodicity of a class of reversible reaction-diffusion processes, Prob. Th. Rel. Fields 92 no. 2, 259-274.
Nash-Williams, C. St. J. A. (1959), Random walk and electric current in networks, Math. Proc. Cambridge Philos. Soc. 55, 181-194.
Nelsen, R. B. (1999), An Introduction to Copulas, Lecture Notes in Stat., Springer-Verlag.
Neuhauser, C. (1990), An ergodic theorem for Schlögl models with small migration, Prob. Th. Rel. Fields 85 no. 1, 27-32.
Neveu, J. (1965), Mathematical Foundations of the Calculus of Probability, Holden-Day.
Nicolis, G. and Prigogine, I. (1977), Self-organization in Non-Equilibrium Systems, Wiley.
Nummelin, E. (1984), General Irreducible Markov Chains and Non-Negative Operators, Cambridge Univ. Press.
Nummelin, E. and Tuominen, P. (1982), Geometric ergodicity of Harris recurrent chains with applications to renewal theory, Stoch. Proc. Appl. 12, 187-202.
Nummelin, E. and Tweedie, R. L. (1978), Geometric ergodicity and R-positivity for general Markov chains, Ann. Prob. 6 no. 3, 404-420.
Olla, S. (1988), Large deviations for random fields, Prob. Th. Rel. Fields 77, 343-357.
Osterwalder, K. and Schrader, R. (1973), Axioms for Euclidean Green’s functions, Comm. Math. Phys. 31, 83.
Olkin, I. and Pukelsheim, R. (1982), The distance between two random vectors with given dispersion matrices, Linear Algebra Appl. 48, 257-263.
Pakes, A. G. and Tavaré, S. (1981), Comments on the age distribution of Markov processes, Adv. Appl. Prob. 13, 681-703.
Park, Y. M. (1988a), Extension of Pirogov-Sinai theory of phase transition to infinite range interactions (I), Comm. Math. Phys. 114, 187-218.
Park, Y. M. (1988b), Extension of Pirogov-Sinai theory of phase transition to infinite range interactions (II), Comm. Math. Phys. 114, 219-241.
Parthasarathy, K. R. (1967), Probability Measures on Metric Spaces, Academic Press.
Popov, N. S. (1977), Conditions for geometric ergodicity of countable Markov chains, Soviet Math. Dokl. 18, 676-679.
Perrut, A. (2000), Hydrodynamic limits for a two-species reaction-diffusion process, Ann. Appl. Prob. 10 no. 1, 163-191.
Preston, C. J. (1974), Gibbs States on Countable Sets, Cambridge Univ. Press.
Preston, C. J. (1976), Random Fields, LNM., Springer-Verlag.
Qian, M. and Hou, Z. T. (Editors) (1979), Reversible Markov Processes (in Chinese), Hunan Sci. Press, Hunan.
Qian, M. P. (1978), On the reversibility of stationary Markov chains (in Chinese), J. Beijing Univ. no. 4, 158-184.
Qian, M. P. and Qian, M. (1979), Irreversibility, detail balance and circulations, in “Reversible Markov Processes” (in Chinese), edited by M. Qian and Z. T. Hou, 151-193.
Rachev, S. T. (1982), Minimal metrics in the random variables space, Publ. Inst. Stat. Univ. Paris 27 no. 1, 27-47.
Rachev, S. T. (1991), Probability Metrics and the Stability of Stochastic Models, Wiley, New York.
Ren, K. L. (1983), Potentiality and reversibility for the N-spin-flip processes and N-generalized N-spin-flip processes, Acta Math. Sci. 3, 300-320.
Reuter, G. E. H. (1957), Denumerable Markov Processes, Acta Math. 97, 1-46.
Reuter, G. E. H. (1959), Denumerable Markov Processes (II), J. London Math. Soc. 34, 81-91.
Reuter, G. E. H. (1961), Competition processes, in Fourth Berkeley Symposium on Math. Stat. and Prob. 2, 421-430.
Reuter, G. E. H. (1962), Denumerable Markov Processes (III), J. London Math. Soc. 37, 63-73.
Reuter, G. E. H. (1976), Denumerable Markov Processes (IV), on C. T. Hou’s uniqueness theorem for Q-semigroups, Z. Wahrs. 33, 309-315.
Roberts, G. O. and Rosenthal, J. S. (1997), Geometric ergodicity and hybrid Markov chains, Electron. Comm. Probab. 2, 13-25.
Röckner, M. and Wang, F. Y. (2001), Weak Poincaré inequalities and L2-convergence rates of Markov semigroups, J. Funct. Anal. 185 no. 2, 564-603.
Rüschendorf, L. and Rachev, S. T. (1990), A characterization of random variables with minimum L2-distance, J. Multi. Anal. 32, 48-54.
Schlögl, F. (1972), Chemical reaction models for phase transitions, Z. Phys. 253, 147-161.
Schonmann, R. H. (1994), Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region, Commun. Math. Phys. 161, 1-49.
Shiga, T. (1988), Stepping stone models in population genetics and population dynamics, in “Stochastic Processes in Physics and Engineering”, edited by S. Albeverio et al., 345-355.
Shlosman, S. B. (1986), The method of reflection positivity in the mathematical theory of first-order phase transitions, Russian Math. Surveys 43, 83-134.
Silverstein, M. L. (1974), Classification of stable symmetric Markov chains, Indiana Univ. Math. J. 24, 29-77.
Silverstein, M. L. (1976), Symmetric Markov Processes, LNM., Springer-Verlag.
Sinai, Ya. G. (1982), Theory of Phase Transitions: Rigorous Results, Pergamon Press.
Sokal, A. D. (1981), Existence of compatible families of proper regular conditional probabilities, Z. Wahrs. 56, 537-548.
Sokal, A. D. and Thomas, L. E. (1988), Absence of mass gap for a class of stochastic contour models, J. Statis. Phys. 51 no. 5/6, 907-947.
Spitzer, F. (1970), Interacting Markov processes, Adv. Math. 5, 246-290.
Spitzer, F. (1975), Principles of Random Walks, Springer-Verlag.
Spitzer, F. (1981), Infinite systems with locally interacting components, Ann. of Prob. 9 no. 3, 349-364.
Spohn, H. (1992), Large Scale Dynamics of Interacting Particles, Texts and Monographs in Physics 342, Springer-Verlag, Heidelberg and Berlin.
Strassen, V. (1965), The existence of probability measures with given marginals, Ann. Math. Statist. 36, 423-439.
Stoyan, D. and Daley, D. J. (1983), Comparison Methods for Queues and Other Stochastic Models, Wiley.
Stroock, D. W. (1978), Lectures on Infinite Interacting Systems, LNM. Kyoto Univ., Kinokuniya Bookstore.
Stroock, D. W. (1981), On the spectrum of Markov semigroup and the existence of invariant measures, in “Functional Analysis in Markov processes”, Proceedings, edited by M. Fukushima. LNM. 923, 287-307.
Stroock, D. W. (1984), An Introduction to the Theory of Large Deviations, Springer.
Stroock, D. W. and Varadhan, S. R. S. (1979), Multidimensional Diffusion Processes, Springer-Verlag, New York.
Stroock, D. W. and Zegarlinski, B. (1992), The logarithmic Sobolev inequality for discrete spin systems on a lattice, Comm. Math. Phys. 149 no. 1, 175-193.
Sullivan, W. G. (1984), The L2 spectral gap of certain positive recurrent Markov chains and jump processes, Z. Wahrs. 67, 387-398.
Szulga, A. (1978), On the Wasserstein metric, Transactions of the 8th Prague Conference, Prague, 267-273.
Szulga, A. (1982), On minimal metrics in the space of random variables, Theory Prob. Appl. 27, 424-430.
Tang, S. Z. (1982), On the reversibility of spin-flip processes (in Chinese), Acta Math. Sin. 25, 306-314.
Tang, S. Z. (1984), Construction of invariant measures for stochastic continuous Feller's processes with compact metric state space (in Chinese), Chin. Ann. Math. 5 no. 1, 99-108.
Tang, S. Z. and Liu, X. F. (1979), Multivariate couplings of spin-flip processes (in Chinese), J. Beijing Normal Univ. 25, 6-4.
Thomas, L. E. (1989), Bounds on the mass gap for finite volume stochastic Ising models at low temperature, Comm. Math. Phys. 126, 1-11.
Thorisson, H. (2000), Coupling, Stationarity, and Regeneration, Springer-Verlag.
Tuominen, P. and Tweedie, R. L. (1979), Exponential ergodicity in Markovian queueing and dam models, J. Appl. Prob. 16, 867-880.
Tweedie, R. L. (1975), Sufficient conditions for regularity, recurrence and ergodicity of Markov processes, Math. Proc. Camb. Phil. Soc. 78, 125-136.
Tweedie, R. L. (1975), Criteria for classifying general Markov chains, Adv. Appl. Prob. 8, 737-771.
Tweedie, R. L. (1981), Criteria for ergodicity, exponential ergodicity and strong ergodicity of Markov Processes, J. Appl. Prob. 18, 122-130.
Vallender, S. S. (1973), Calculation of the Wasserstein distance between probability distributions on line, Theory Prob. Appl. 18, 784-786.
van Doorn, E. A. (1981), Stochastic Monotonicity and Queueing Applications of Birth-Death Processes, Lecture Notes in Statistics 4, Springer.
van Doorn, E. A. (1985), Conditions for exponential ergodicity and bounds for the decay parameter of a birth-death process, Adv. Appl. Prob. 17, 514-530.
van Doorn, E. A. (1987), Representations and bounds for zeros of orthogonal polynomials and eigenvalues of sign-symmetric tri-diagonal matrices, J. Approx. Th. 51 no. 3, 254-266.
van Doorn, E. A. (1991), Quasi-stationary distributions and convergence to quasi-stationarity of birth-death processes, Adv. Appl. Prob. 23, 683-700.
van Doorn, E. A. (2002), Representations for the rate of convergence of birth-death processes, Theory Probab. Math. Statist. 65, 36-42.
Varadhan, S. R. S. (1984), Large Deviations and Applications, SIAM, Philadelphia.
Varopoulos, N. (1983), Brownian motion and transient groups, Ann. Inst. Fourier 33, 241-261.
Veech, W. (1963), The necessity of Harris' condition for the existence of a stationary measure, Proc. Amer. Math. Soc. 14, 856-860.
Vere-Jones, D. (1962), Geometric ergodicity in denumerable Markov chains, Quart. J. Math. Oxford 2nd Ser. 13, 7-28.
Villani, C. (2003), Topics in Mass Transportation, Graduate Studies in Math., AMS, Providence, RI.
Wang, F. Y. (1994a), Ergodicity for infinite dimensional diffusion processes on manifolds, Sci. Sin. (A), 24 (Chinese Ed.).
Wang, F. Y. (1994b), Application of coupling method to the Neumann eigenvalue problem, Prob. Th. Rel. Fields 98, 299-306.
Wang, F. Y. (1996a), Estimates of the first eigenvalue and the applications to Yang-Mills lattice fields, Chinese Ann. Math. 17A:2(1996), 147-154 (Chinese Ed.); Chinese J. Contemporary Math. 17:2(1996), 119-126 (English Ed.).
Wang, F. Y. (1996b), Estimates of the logarithmic Sobolev constant for finite volume continuous spin systems, J. Statis. Phys. 84 no. 1/2, 277-293.
Wang, F. Y. (2000), Functional inequalities for empty essential spectrum, J. Funct. Anal. 170, 219-245.
Wang, Z. K. (1965), Theory of Stochastic Processes (in Chinese), Sci. Press, Beijing.
Wang, Z. K. (1980), Birth-Death Processes and Markov Chains (in Chinese), Sci. Press, Beijing.
Wang, Z. K. and Yang, X. Q. (1992), Birth and Death Processes and Markov Chains, Springer-Verlag, Berlin.
Wasserstein, L. N. (1969), Markov processes on a countable product space, describing large systems of automata (in Russian), Problemy Peredachi Informatsii 5, 64-73.
Williams, D. (1964), On the construction problem for Markov chains, Z. Wahrs. 3, 227-246.
Williams, D. (1976), The Q-matrix problem, LNM. 511, 216-234.
Williams, D. (1979), Diffusions, Markov Processes, and Martingales I: Foundations, Wiley.
Wu, L. D. (1965), Classification of Markov chains with denumerable states (in Chinese), Acta Math. Sin. 15 no. 1, 32-41.
Wu, B. and Zhang, Y. H. (2003), One-dimensional Brusselator model (in Chinese), preprint.
Wu, X. Y. (1990), Canonical Gibbs states and reversible random fields (in Chinese), Acta Math. Sin. 33 no. 2, 277-285.
Xu, X. J. (1991), Ph.D thesis (in Chinese), Beijing Normal Univ.
Yan, J. A. (1988), A perturbation theorem for semigroups of linear operators, Séminaire de Probabilités, XXII, LNM 1321, 89-91.
Yan, S. J. (1990), Introduction to Infinite Particle Systems (in Chinese), Beijing Normal Univ. Press, Beijing.
Yan, S. J. and Chen, M. F. (1984), Stability of circulation decompositions and self-organization phenomena, Acta Math. Sci. 4 no. 1, 13-26.
Yan, S. J. and Chen, M. F. (1986), Multidimensional Q-processes, Chin. Ann. Math. 7(B) no. 1, 90-110.
Yan, S. J., Chen, M. F. and Ding, W. D. (1982a), Potentiality and reversibility for general speed functions (I), Chin. Ann. Math. 3 no. 5, 572-586.
Yan, S. J., Chen, M. F. and Ding, W. D. (1982b), Potentiality and reversibility for general speed functions (II), Chin. Ann. Math. 3 no. 6, 705-720.
Yan, S. J. and Li, Z. B. (1980), Probability models of non-equilibrium systems and the Master equations (in Chinese), Acta Phys. Sin. 29, 139-152.
Yan, S. J., Wang, J. X. and Liu, X. F. (1982), Foundations of Probability Theory (in Chinese), Sci. Press, Beijing.
Yang, X. Q. (1981), Constructions of Time-Homogeneous Markov Processes with Denumerable States (in Chinese), Hunan Sci. Press, Hunan.
Yosida, K. (1978), Functional Analysis, Fifth edition, Springer-Verlag.
Yoshida, N. and Higuchi, Y. (1996), Ising model on the lattice Sierpinski Gasket, J. Stat. Phys. 84 no. 1/2, 295-307.
Zegarlinski, B. (1990), Log-Sobolev inequality for infinite one dimensional lattice systems, Comm. Math. Phys. 133, 147-162.
Zeifman, A. I. (1985), Stability for continuous-time nonhomogeneous Markov chains, LNM. 1155, 401-414.
Zeng, W. Q. (1983), Reversibility of two classes of infinite particle systems (in Chinese), Chin. Ann. Math. 4(A) no. 6, 763-772.
Zhang, J. K. (1984), Generalized birth-death processes (in Chinese), Acta Math. Sci. 4 no. 6, 241-259.
Zhang, H. J. and Chen, A. Y. (2000), Stochastic comparability and dual Q-functions, J. Math. Anal. Appl. 234, 482-499.
Zhang, H. J., Lin, X. and Hou, Z. T. (2000), Uniformly polynomial convergence for standard transition functions, in “Birth-Death Processes” by Hou, Z. T. et al (2000), Hunan Sci. Press, Hunan.
Zhang, Y. H. (1994), Conservativity of coupling jump processes (in Chinese), J. Beijing Normal Univ. 33 no. 3, 305-307.
Zhang, Y. H. (1996), Construction of one-dimensional order-preserving coupling (in Chinese), Chin. J. Appl. Prob. Stat. 12 no. 4, 376-382.
Zhang, Y. H. (1998), Construction of one-dimensional order-preserving coupling (continued, in Chinese), J. Beijing Normal Univ. 34 no. 3, 292-296.
Zhang, Y. H. (2000a), Sufficient and necessary conditions for stochastic comparability of jump processes, Acta Math. Sin. Eng. Ser. 16 no. 1, 99-102.
Zhang, Y. H. (2000b), Stochastic comparability of the minimal jump processes (in Chinese), J. Beijing Normal Univ. 36 no. 2, 156-158.
Zhang, Y. H. (2001), Strong ergodicity for continuous-time Markov chains, J. Appl. Prob. 38, 270-277.
Zhang, S. Y. (1999), Existence of the optimal measurable coupling and ergodicity for Markov processes, Science in China (A) 42 no. 1, 58-67.
Zhang, S. Y. (2000), Existence and application of optimal Markovian coupling with respect to non-negative lower semi-continuous functions, Acta Math. Sin. Eng. Ser. 16 no. 2, 261-270.
Zheng, J. L. (1993), Phase Transitions of Ising Model on Lattice Fractals, Martingale Approach for q-processes (in Chinese), Ph. D. Thesis, Beijing Normal Univ.
Zheng, J. L. and Zheng, X. G. (1986), Martingale approach for q-processes, Unpublished.
Zheng, X. G. (1981), Symmetrizable Markov processes in abstract state space (I) (in Chinese), J. Beijing Normal Univ. no. 4, 15-32.
Zheng, X. G. (1982), Qualitative theory for q-processes in abstract state space, Acta Math. Sci. 2 no. 1, 1-0.
Zheng, X. G. (1983a), Symmetrizable Markov processes in abstract state space (II) (in Chinese), J. Beijing Normal Univ. no. 1, 1-10.
Zheng, X. G. (1983b), Feller’s boundary in abstract state space, Chin. Ann. Math. 4(B) no. 3, 267-284.
Zheng, X. G. and Ding, W. D. (1987), Existence theorems for linear growth processes with diffusion, Acta Math. Sin. New Ser. 7 no. 1, 25-42.
Zheng, X. G. and Liu, X. F. (1984), Differentiability of transition functions on a topological space with countable bases (in Chinese), J. Beijing Normal Univ. no. 1, 25-35.
Zhou, X. Y. (1991), On the recurrence of simple random walk on some lattice fractals, J. Appl. Prob. 1992, 29, 454-459.
Zhou, X. Y. (2002), Collection Papers of Zhou Xianyin, Beijing Normal Univ. Press.
Zhu, D. J. (1990), Master’s Thesis (in Chinese), Anhei Normal Univ.
Zolotarev, V. M. (1984), Probability metrics, Theory Prob. Appl. 28 no. 2, 278-302.
Zolotarev, V. M. (1986), Modern Theory of i.i.d. Random Variables (in Russian), Nauka, Moscow.
Author Index

A
Aharony, A., 421, 578
Aizenman, M., 446, 572
Aldous, D. G., 171, 572
Anderson, W. J., 119, 171, 223, 572
Andjel, E. D., 513, 572
Arnold, B., 172, 581
Arnold, L., 513, 572
Austin, D. G., 572

B
Barlow, M. T., 421, 572
Basis, V. Ya., 96, 171, 509, 513, 538, 572
Bebbington, M., 572
Billingsley, P., 120, 175, 572
Blumenthal, R. M., 572
Bobkov, S. G., 380, 572
Boldrighini, C., 554, 571, 572
Bouleau, N., 271, 572

C
Carlen, E. A., 572
Chebotarev, A. M., 96, 572, 581
Cheeger, J., 380, 572
Chen, A. Y., 271, 572, 580, 587
Chen, D. Y., 572, 573
Chen, J. D., 208, 573
Chen, J. W., 172, 538, 573
Chen, M. F., 6, 61, 96, 119, 171, 172, 223, 271, 302, 329, 380, 446, 466, 513, 538, 571, 573, 574, 575, 576, 580, 587
Cheng, H. X., 61, 575
Chung, K. L., 5, 167, 171, 172, 575
Cohn, D. L., 57, 575

D
Dai, Y. L., 271, 446, 576
Daley, D. J., 211, 585
Dawson, D. A., 554, 576
DeMasi, A., 554, 571, 572, 576
Dembo, A., 329, 576
Derman, C., 172, 576
Deuschel, J.-D., 329, 576
Diaconis, P., 576
Ding, W. D., 302, 446, 513, 537, 538, 547, 554, 575, 576, 587, 588
Dittrich, P., 513, 571, 576
Dobrushin, R. L., 171, 223, 421, 576
Doeblin, W., 223, 576
Donsker, M. D., 314, 577
Doob, J. L., 96, 455, 577
Down, D., 171, 577
Dowson, D. C., 9, 577
Doyle, P. G., 302, 577
Dudley, R. M., 182, 223, 510, 577
Durrett, R., 302, 537, 538, 554, 558, 576, 577
Dynkin, E. B., 57, 59, 577

E
Ellis, R. S., 421, 577

F
Falconer, K. J., 421, 577
Fei, Z. L., 580
Feller, W., 2, 61, 96, 171, 577, 578
Feng, J. F., 571, 572, 573, 578
Feng, S., 554, 578
Ferrari, P. A., 578
Fill, J. A., 171, 572
Forbes, F., 578
Foster, F. G., 172, 578
Fröhlich, J., 421, 578
François, O., 578
Freedman, D., 578
Fritz, J., 571, 578
Fukushima, M., 270, 271, 578
Funaki, T., 571, 578
G Gefen, Y., 421, 578 Georgii, H. O., 421, 578 Getoor, R. K., 572 Givens, C. R., 9, 578 Gong, S., 451, 579 Götze, F., 380, 572 Granovsky, B., 380, 579 Greven, A., 513, 578 Griffeath, D., 223, 579 Gross, L., 380, 579 Guionnet, A., 446, 579 Guo, M. Z., 571, 579 Guo, Q. F., 96, 103, 119, 161, 172, 271, 580
H Haken, H., 2, 513, 579 Hamza, K., 96, 579 Han, D., 172, 530, 579 Harris, T. E., 169, 172, 579 He, S. W., 171 Herbst, I., 579 Higuchi, Y., 421, 587 Hirsch, F., 271, 572 Holley, R., 446, 459, 513, 572, 579 Hou, Z. T., 6, 61, 96, 103, 119, 161, 171, 172, 270, 302, 579, 580, 584, 588 Hsu, P. L., 61, 580 Hu, D. H., 61, 96, 171, 554, 580 Hua, L. K., 451, 580 Huang, C. C., 171, 580 Huang, L. P., 171, 223, 538, 571, 575, 580 Hunt, G. A., 580 Hwang, C. R., 380, 581 Hwang-Ma, S. Y., 380, 581
I Ikeda, N., 7, 179, 581 Isaacson, D., 171, 580, 581 Israel, R., 421, 578
J Jain, N. C., 329, 581
K Kantorovich, L. V., 96, 581 Kelly, F. P., 172, 270, 581 Kendall, D. G., 61, 171, 172, 581 Kersting, G., 96, 581 Kingman, J. F. C., 61, 171, 581 Kipnis, C., 380, 571, 581 Klebaner, F. C., 96, 579, 581 Kolmogorov, A. N., 61, 96, 270, 581 Konstantinov, A. A., 96, 581 Kotecký, R., 421, 576 Krylov, V. I., 96, 581 Kusuoka, S., 421, 572, 576, 581 Kuznetsov, S. E., 60, 581
L Löwe, M., 582 López, F. J., 583 Landau, B. V., 9, 577 Landim, C., 380, 571, 581, 582 Lawler, G. F., 380, 582 Levin, S., 554, 577 Li, J. P., 119, 530, 580, 582 Li, S. F., 223, 571, 575 Li, S. Q., 446, 582 Li, T. D., 582 Li, Y., 446, 513, 554, 582 Li, Z. B., 513, 587 Li, Z. P., 554 Lieb, E. H., 421, 578 Liggett, T. M., 3, 223, 380, 421, 432, 446, 449, 450, 456, 513, 537, 538, 547, 576, 579, 582 Lin, X., 172, 588 Lindvall, T., 223, 571, 582 Lindstrøm, T., 421, 582 Liu, X. F., 57, 61, 586, 587, 588 Liu, X. J., 446, 576, 582 Liu, Z. M., 580 Loève, M., 61, 70, 582 Lu, Y. G., 329, 575 Luecke, G. R., 171, 581 Lyons, T. J., 302, 583
M Ma, Z. M., 60, 270, 583 Maes, C., 421, 583 Malyshev, V. A., 421, 583 Mandelbrot, B. B., 421, 578, 583 Mao, Y. H., 172, 380, 583 Martinelli, F., 446, 583 Martinez, S., 583 Maslov, V. P., 96, 581 McCoy, B. M., 18, 583 Meise, C., 582, 583 Mertens, J. F., 171, 583 Meyn, S. P., 171, 455, 577, 583 Miclo, L., 380, 583 Minlos, R. A., 421, 446, 583 Mountford, T. S., 550, 583
N Nash-Williams, C. St. J. A., 292, 299, 583 Nelssen, R. B., 223, 584 Neuhauser, C., 538, 554, 558, 577, 584 Neveu, J., 25, 584 Nicolis, G., 513, 584 Nummelin, E., 171, 584
O Olkin, I., 9, 584 Olla, S., 421, 571, 581, 584 Oshima, Y., 270, 578 Osterwalder, K., 421, 584 Ouyan, R. H., 271
P Pakes, A. G., 109, 584 Papanicolaou, G. C., 571, 579 Park, Y. M., 421, 584 Parthasarathy, K. R., 175, 584 Pellegrinotti, A., 554, 571, 572 Percherski, E. A., 223, 576 Perkins, E. A., 421, 572 Perrut, A., 571, 584 Pirogov, S. A., 421 Pitt, L., 579 Pollett, P., 572 Popov, N. S., 172, 584 Preston, C. J., 421, 457, 584 Presutti, E., 554, 571, 572, 576 Prigogine, I., 513, 584 Pukelsheim, R., 9, 584
Q Qian, M., 270, 584 Qian, M. P., 271, 572, 573, 584
R Röckner, M., 270, 380, 583, 585 Rüschendorf, L., 9, 585 Rachev, S. T., 223, 584, 585 Redig, F., 583 Ren, K. L., 446, 584 Reuter, G. E. H., 2, 61, 96, 119, 172, 584 Roberts, G. O., 380, 585 Rogers, L. C. G., 223, 571, 582 Rosenthal, J. S., 380, 585
S Saada, E., 583 Samuel-Cahn, E., 171, 583 Sanz, G., 583 Schlögl, F., 2, 585 Schonmann, R. H., 446, 585 Schrader, R., 421, 584 Sethuraman, S., 582 Shapir, Y., 421, 578 Sheu, S. J., 380, 581 Shiga, T., 513, 550, 585 Shlosman, S. B., 421, 457, 576, 583, 585 Shortt, R. M., 9, 578 Silverstein, M. L., 270, 585 Simon, B., 421, 578 Sinai, Ya. G., 421, 576, 585 Snell, J. L., 302, 577 Sokal, A. D., 60, 380, 582, 585 Spitzer, F., 513, 547, 563, 582, 585 Spohn, H., 571, 585 Stoyan, D., 211, 585 Strassen, V., 585 Stroock, D. W., 96, 120, 171, 305, 307, 324, 329, 446, 459, 510, 513, 572, 575, 576, 577, 578, 579, 585 Sullivan, W. G., 380, 585 Szulga, A., 182, 585
T Takeda, M., 270, 578 Tang, S. Z., 96, 446, 456, 513, 585, 586 Tavaré, S., 109, 584 Theodosopulu, M., 513, 572 Thomas, L. E., 380, 585, 586 Thorisson, H., 223, 586 Trisch, A., 446, 583 Tuominen, P., 171, 584, 586 Tweedie, R. L., 171, 172, 455, 577, 581, 583, 584, 586
V Vallender, S. S., 9, 586 van Doorn, E. A., 380, 586 Varadhan, S. R. S., 96, 120, 314, 513, 571, 577, 579, 581, 582, 585, 586 Varopoulos, N., 302, 586 Veech, W., 169, 172, 586 Vere-Jones, D., 171, 586 Villani, C., 223, 586
W Wang, F. Y., 380, 466, 575, 585, 586 Wang, J. X., 57, 587 Wang, P. Z., 580 Wang, Z. K., 57, 61, 70, 161, 171, 586, 587 Wasserstein, L. N., 223, 587 Watanabe, S., 7, 179, 581 Williams, D., 587 Wu, R., 151, 172, 587 Wu, J., 421 Wu, L. D., 172, 587 Wu, T. T., 18, 583 Wu, X. Y., 446, 587
X Xiao, G. N., 580 Xu, X. J., 571, 575, 587
Y Yan, J. A., 510, 587 Yan, S. J., 57, 119, 172, 446, 513, 554, 575, 587 Yang, C. N., 582 Yang, X. Q., 96, 161, 171, 271, 587 Yoshida, N., 421, 587 Yosida, K., 35, 55, 587 Yuan, C. G., 580
Z Zahradnik, M., 421, 576 Zamir, S., 171, 583 Zegarlinski, B., 446, 579, 585, 587 Zeifman, A. I., 380, 579, 587 Zeitouni, O., 329, 576 Zeng, W. Q., 446, 587 Zhang, H. J., 172, 271, 572, 580, 587, 588 Zhang, J. K., 119, 587 Zhang, S. Y., 538, 588 Zhang, Y. H., 151, 172, 223, 583, 587, 588 Zheng, J. L., 96, 223, 421, 588 Zheng, X. G., 6, 61, 96, 119, 271, 513, 547, 554, 572, 575, 576, 578, 582, 588 Zhou, J. Z., 580 Zhou, X. Y., 300, 302, 581, 588 Zhu, D. J., 538, 575, 588 Zolotarev, V. M., 223, 588
Subject Index
Special Symbols
1-dependent, 549 11. l l z l l 184 C, 383 $(B)$, 70 $(B_n)$, 86 $(B_\lambda)$, 70 Bq-process, 97 $C_b(E)$, 120 %y.f(E), 411 $(D, \mathscr{D}(D))$, 258, 331 d-system, 57 dim% 2 0, 231 dim%, 91 dm i ?,' 91 $\mathscr{D}_w(L)$, 341 $(F)$, 70 $(F_n)$, 86 $(F_\lambda)$, 70 Fq-process, 97 $\varphi$-optimal, 203, 205 $\varphi$-optimal Markovian coupling, 205 $\varphi$-optimal measurable coupling, 205 gap(D), 331 gap(L), 331 % p ( E ) , 126 $\mathscr{L}$-system, 57 $\lambda_0$, 363, 368 $\lambda_0(A)$, 359 $\lambda_1$, 359 $\lambda_1(B)$, 360 $\pi$-almost honest, 232 $\pi$-equivalent, 231 $\pi$-system, 57 P(E), 120 $(P^{\min}(t))$, 5 $P^{\min}(\lambda, x, A)$, 74 $P^{\min}(t, x, A)$, 77 Q-condition, 1 q-condition, 56, 85 Q-matrix, 1 q-pair, 32 Q-process, 2, 32 r-boundary, 383 r-specification, 384 R(X,P), 88 8, 122, 383 $\sigma$-isomorphic, 60 U,(E), 120
A absorbing, 32 abstract chess-board estimates, 417 additive theorem, 333 aperiodic, 130 asymptotically stable, 556 autocatalytic model, 503
B backward equation at point x, 237 backward Kolmogorov equation, 73 backward Kolmogorov inequality, 73, 74 basic coupling, 11, 13, 186 basic Dirichlet form, 264 birth-death Q-matrix, 17, 110, 142, 164, 325, 488, 538 birth-death process, 11, 110, 160, 162, 210, 217, 218, 220, 266, 346, 348, 349, 359, 377, 549, 551, 559 Boltzmann constant, 385 Brussel's model, 150, 509
C canonical Gibbs state, 429 canonical image, 234 chain field, 275 Cheeger's constant, 368, 370 chess-board estimate, 410 CK-equation (Chapman-Kolmogorov equation), 1, 23 classical coupling, 10, 185, 347, 354, 528 closed function (lower semi-continuous function), 121, 122, 205, 209, 305 co-zero property, 272 coalescing process, 504 coefficient of anisotropy, 385 compact function, 5, 121, 124, 174, 391 comparison lemma, 104 comparison theorem, 64, 332 conditional Gibbs state with periodic boundary condition, 408 conditional Hamiltonian, 385 conductance, 281 cone mapping, 62 configuration, 383 conservative, 32, 33 consistency condition, 384 consistent family of functions, 88 consistent family of measures, 88 constrained system, 321 continuous condition, 24, 51 contour, 397, 403, 548 contraction, 256 controlling equation, 64 core, 267, 335, 433 coupled branching process, 504 coupled random walk process, 506 coupling, 6, 173, 324, 340, 341, 342, 347, 470, 471, 479, 490, 506, 507, 518, 519, 553, 562, 563 coupling by reflection, 11, 218 coupling of marching soldiers, 11, 186, 347, 490, 528, 553 coupling operator, 10, 185 current, 281 cyclicity, 417 cylindrical function, 411 cylindrical set, 411
D dimension of %A, 90 dimension of YA, 90 Dirichlet eigenvalue, 359 Dirichlet form, 258 Dirichlet operator, 261 DLR-equation (Dobrushin-Lanford-Ruelle equation), 384 Donsker-Varadhan entropy, 311 Donsker-Varadhan theorem, 306 Doob’s construction, 96 dual graph, 397 duality, 539 Dynkin-class, 57
E edge, 272 effective conductance, 281 effective resistance, 281 eigenfunction, 341 eigenfunction in weak sense, 341 electric potential, 281 embedding chain, 139 embedding jump process, 125 energy dissipation, 282 entrance solutions, 92 entrance space, 88, 92 entropy, 15, 304 equilibrium solution, 556 ergodic, 12, 17, 137, 140, 143, 145, 149, 150, 152, 163, 164, 165, 196, 308, 325, 338, 352, 411, 450, 455, 459, 518, 520, 522, 523, 532, 538, 542, 543, 550 estimate of moment, 524 estimate of the first moment, 472 excessive measure, 170 existence criterion, 361 exit solutions, 92 exit space, 88, 92 explicit bound, 17, 349 explicit criterion, 17, 349 exponential $L^2$-convergence, 331 exponential ergodicity, 143 exponentially growing, 391
F Feller’s construction, 96 field of exclusion speed functions, 422 field of spin speed functions, 422 finite dimensional generalized Potlatch process, 221 finite entrance, 92 finite exit, 92 finite range, 384 finitely ramified fractal, 300 first infinity, 103 first moment condition, 470, 479 first successive approximation scheme, 63 flow, 284 Fokker-Planck equation, 70 forward Kolmogorov equation, 73 forward Kolmogorov inequality, 74 Friedrichs extension, 267
G generalized Potlatch process, 505 geometrically ergodic, 137 Gibbs distribution in V with boundary condition, 387 Gibbs distribution in V with boundary condition, 387 Gibbs random field, 387 Gibbs state, 387, 429 graph representation method, 539 Gronwall’s Lemma, 472 growing condition, 493, 496
H Hamiltonian, 384 Hamiltonian with periodic boundary condition, 408 homogeneous equation, 63 honest, 24
I I-function, 304 independent coupling, 10, 185 inner regular, 60, 205 instantaneous, 32 integration by parts formula, 484, 507, 508, 510, 541 interaction function, 385 invariant measure, 124, 128, 166 inverse temperature, 385 irreducible, 130, 134 Ising model, 18, 384 isoperimetric constant, 368, 370
J jump condition, 2, 24, 51 jump process, 23
K Kantorovich-Rubinstein-Wasserstein metric, 173 kernel, 40 Krein extension, 270
L Lévy-Prohorov metric, 7 Laplace transform, 37, 51 large deviation principle, 305 lattice field, 278 lattice Sierpinski carpet, 299, 300, 402 lattice Sierpinski gasket, 298, 299, 400 limiting Gaussian process, 509 linear growth process, 503 Lipschitz condition, 470, 479 localization theorem, 67 locally compact, 499 logarithmic Sobolev inequality, 358 Lotka-Volterra model, 221 lower semi-continuous function (closed function), 121
M marginality, 6, 10, 174, 184 Markov chain, 24 maximal coupling, 208 maximal solution, 92 mean field method, 539 measurable coupling, 392 minimal non-negative solution, 63 minimal solution, 63 minimum $L^p$-distance, 8, 173, 179, 391, 469, 531 minimum property, 63 moment condition, 493, 496, 497 moments method, 539 monotone, 211 monotone class theorem, 57
N nearest neighbor, 438 non-conservative quantity at x, 92 non-honest, 24 normal condition, 1, 51
O Ohm’s law, 281 open path, 547 operator Q, 71 optimal Markovian coupling, 203 order-preserving coupling, 211, 220 Osterwalder-Schrader positivity, 406, 421
P partition function, 385 partition of E, 85 path, 272 Peierls inequality, 397 periodic configuration, 408 phase transition, 18, 20, 397, 400, 402, 413, 455, 457, 542, 547, 550, 558 Pirogov-Sinai method, 421 Polish space, 23 polynomial model, 503, 520, 521, 532, 533, 538, 550, 555 positive measure, 438 positive recurrent, 137, 140 potential, 273 potential field, 273 Potlatch process, 505 Potts gauge model, 450 principal eigenvalue, 362 probability kernel, 40
Q quadratic form, 359, 368 quadrilateral condition, 277, 278, 423, 426, 429, 435, 437, 439
R random field, 384 rate function, 15, 304 reachable, 272 reachable directly, 272 realization, 404 recurrence, 124, 140 reflection positive, 406 regular, 2 regular q-pair, 124 resistance, 281 resolvent equation, 51 reversible, 13, 227, 339, 341, 344, 348, 433, 452, 455, 533, 537, 550 reversible q-pair, 229 reversible measures, 433
S Schlögl’s first model, 503, 520, 522, 532, 557 Schlögl’s model, 2, 4, 6, 19, 79, 110, 150, 164, 221 Schlögl’s second model, 503, 520, 522, 532, 552, 557 Schwarz inequality, 417 second successive approximation scheme, 64 section, 272 self-dual, 542 shift, 407 Shlosman model, 452 single birth Q-matrix, 4, 105, 152, 324 single birth Q-process, 105, 112, 151, 160, 300 single-entrance, 92 single-exit, 92 smoothing process, 504 specification, 384 spectral gap, 16, 330, 340, 348, 359, 538 speed functions, 425 spin space, 383 stable, 32, 556 stationary distribution, 124, 126 statistical sum, 385 stochastic monotonicity, 211 strong continuity, 256 strongly ergodic, 137, 143 sub-Markovian, 257 successful, 12, 198, 202, 218, 220, 222 successful coupling, 195 symmetric, 257 symmetrizable, 14, 142 symmetrizable jump process, 229
T taboo probability, 167 tight, 205 total energy dissipation, 282 total variation, 8 totally instantaneous, 32 totally stable, 33 transition condition, 496, 497 transition function, 23 transition measure, 40, 205 translation, 407 translation invariant, 408 triangle condition, 423, 426, 429, 435, 438, 443, 533
U uniformly ergodic, 137, 143 uniformly tight, 205 uniqueness criterion, 2, 3, 103, 115 uniqueness theorem of Laplace transform, 58 universal measurable set, 60 universal measurable space, 60
V Varadhan theorem, 305 variational formula, 17, 349 variational formula for Dirichlet form, 363 Volterra-Lotka model, 509
W Wasserstein metric, 8, 173, 179, 391, 469, 531 weak convergence, 7 weak convergence in finite-dimensional distributions, 123 weak domain, 341 weak maximum principle, 500 work, 273
Y Yang-Mills lattice field, 451
Z zero range, 384 zero range process, 503 zero-entrance, 92 zero-exit, 92