CONTENTS

Exact steady state properties of the one dimensional asymmetric exclusion model
    B. Derrida and M. R. Evans
Droplet condensation in the Ising model: moderate deviations point of view
    R. L. Dobrushin and S. Shlosman
Shocks in one-dimensional processes with drift
    P. A. Ferrari
Self-organization of random cellular automata: four snapshots
    D. Griffeath
Percolative problems
    G. R. Grimmett
Mean-field behaviour and the lace expansion
    T. Hara and G. Slade
Long time tails in physics and mathematics
    F. den Hollander
Multiscale analysis in disordered systems: percolation and contact process in a random environment
    A. Klein
Geometric representation of lattice models and large volume asymptotics
    R. Kotecký
Diffusion in random and non-linear PDE’s
    A. Kupiainen
Random walks, harmonic measure, and Laplacian growth models
    G. F. Lawler
Survival and coexistence in interacting particle systems
    T. M. Liggett
Constructive methods in Markov chain theory
    M. V. Menshikov
A stochastic geometric approach to quantum spin systems
    B. Nachtergaele
Disordered Ising systems and random cluster representations
    C. M. Newman
Planar first-passage percolation times are not tight
    R. Pemantle and Y. Peres
Theorems and conjectures on the droplet-driven relaxation of stochastic Ising models
    R. H. Schonmann
Metastability for Markov chains: a general procedure based on renormalization group ideas
    E. Scoppola
PREFACE
Random spatial processes were the subject of a special six-month programme at the Isaac Newton Institute of the University of Cambridge, in 1993. A major event of that programme was a NATO Advanced Study Institute, of which this volume contains the proceedings. The meeting took place during 4–16 July, and brought together specialists and students working on spatial disorder and phase transition. The main language of the meeting was probability theory, but with important input from other areas of science, particularly physics.

The success of the Advanced Study Institute was ensured by a generous grant from the NATO Scientific Affairs Committee; this funding enabled the full participation of young people who might otherwise have been unable to attend. Travel support for young U.S. participants was provided by the National Science Foundation.

Cambridge proved to be an excellent and popular venue, of which an essential component was the superb environment offered by the Isaac Newton Institute. The Institute provided not only substantial funding, but also a wonderful building, and a staff of individuals who assisted the smooth organisation with efficiency and good humour.

The Organising Committee offers its thanks to Sarah Shea-Simonds, whose graceful administration of the preparation for the meeting was especially appreciated by the Director. Her virtuosity at all dialects of TeX is evident in her fine work in producing this volume. The editor was aided also by the Isaac Newton Institute, and by support from the SERC under grant GR G59981.
Geoffrey Grimmett Cambridge
LIST OF PARTICIPANTS
Douglas Abraham Department of Theoretical Physics University of Oxford 1 Keble Road Oxford OX1 3NP United Kingdom
Rashid Ahmad Department of Statistics University of Strathclyde 26 Richmond Street Glasgow G1 1XH United Kingdom
Michael Aizenman Jadwin Hall Princeton University Princeton NJ 08544–0708 United States
Kenneth Alexander Department of Mathematics University of Southern California Los Angeles CA 90089–1113 United States
Martin Barlow Department of Mathematics University of British Columbia Vancouver British Columbia V6T 1Z2 Canada
Martin Baxter Statistical Laboratory University of Cambridge 16 Mill Lane Cambridge CB2 1SB United Kingdom
Daniel Boivin Department of Mathematics Université de Bretagne Occidentale 6 avenue Le Gorgeu, B.P. 452 29275 Brest Cedex France
Christian Borgs Institut für Theoretische Physik Freie Universität Berlin Arnimallee 14 D–1000 Berlin 33 Germany
Anton Bovier Institut für Angewandte Analysis & Stochastik Mohrenstrasse 39 D–10117 Berlin Germany
Emmanuel Buffet School of Mathematical Sciences Dublin City University Dublin 9 Ireland
Massimo Campanino Department of Mathematics University of Bologna Piazza di Porta S. Donato 5 40126 Bologna Italy
Terence Chan Department of Actuarial Mathematics & Statistics Heriot-Watt University Riccarton Edinburgh EH14 4AS United Kingdom
Jennifer Chayes Department of Mathematics University of California Los Angeles CA 90024 United States
Francis Comets UFR de Mathématiques Université Paris VII 2 place Jussieu 75251 Paris Cedex 05 France
Michael Cooper Department of Mathematics Birkbeck College University of London London WC1E 7HX United Kingdom
Gustav Delius Fakultät für Physik Universität Bielefeld Postfach 10 01 31 4800 Bielefeld 1 Germany
Bernard Derrida Service de Physique Théorique CE Saclay F91191 Gif sur Yvette France
Ronald Doney Statistical Laboratory Department of Mathematics University of Manchester Manchester M13 9PL United Kingdom
Nicholas Duffield School of Mathematical Sciences Dublin City University Dublin 9 Ireland
Mireille Echerbault Laboratoire de Probabilité Université Paul Sabatier 118 route de Narbonne 31062 Toulouse Cedex France
Andreas van Elst Physikalisches Institut Universität Bonn Nussallee 12 53115 Bonn 1 Germany
Aernout van Enter Institute for Theoretical Physics Rijksuniversiteit Groningen P.O. Box 800 NL–9747 AG Groningen The Netherlands
Alison Etheridge Department of Mathematics University of Edinburgh Mayfield Road Edinburgh EH9 3JZ United Kingdom
Chyunjia Albert Fannjiang Department of Mathematics University of California Los Angeles CA 90024–1555 United States
Kambiz Farahmand Department of Mathematics University of Ulster Jordanstown County Antrim BT37 0QB United Kingdom
Ariel Fernández Department of Biochemistry P.O. Box 016129 Miami Florida 33101–9990 United States
Pablo Ferrari Department of Statistics IME–USP Cx Postal 20570 01452–001 São Paulo Brazil
Martin Florian Department of Physics University Polytechnic of Catalunya Pau Gargallo 5 08028 Barcelona Spain
Alberto Gandolfi Department of Mathematics University of Turin Sede di Alessandria 15100 Alessandria Italy
Véronique Gayrard Centre de Physique Théorique CNRS Luminy, Case 907 13288 Marseille Cedex France
Priscilla Greenwood Department of Mathematics University of British Columbia Vancouver British Columbia V6T 1Y4 Canada
David Griffeath Department of Mathematics University of Wisconsin Madison WI 53706 United States
Geoffrey Grimmett Statistical Laboratory University of Cambridge 16 Mill Lane Cambridge CB2 1SB United Kingdom
Benjamin Hambly Department of Mathematics University of Edinburgh Mayfield Road Edinburgh EH9 3JZ United Kingdom
Shirin Handjani Department of Mathematics University of California Los Angeles CA 90024–1555 United States
Martin Hansen Department of Mathematics Royal Veterinary & Agricultural University Thorvaldsenvej 40 DK-1871 Frederiksberg C Denmark
Takashi Hara Department of Applied Physics Tokyo Institute of Technology Oh-Okayama Meguro-ku Tokyo 152 Japan
Matthew Harris Faculty of Technical Mathematics Delft University of Technology Mekelweg 4 2628 CD Delft The Netherlands
Simon Harris School of Mathematical Sciences University of Bath Bath Avon BA2 7AY United Kingdom
Yasunari Higuchi Department of Mathematics Kobe University Rokko Kobe 657 Japan
Frank den Hollander Mathematical Institute University of Utrecht P.O. Box 80.010 3508 TA Utrecht The Netherlands
Marco Isopi CMAP Ecole Polytechnique 91128 Palaiseau Cedex France
Sudhir Jain Department of Mathematics Faculty of Science and Technology University of Derby Derby DE22 1GB United Kingdom
Alex Kaganovich Rockefeller University 1230 York Avenue New York NY 10021 United States
Michael Keane Faculty of Technical Mathematics Delft University of Technology Mekelweg 4 2628 CD Delft The Netherlands
Joanne Kennedy Department of Statistics University of Oxford 1 South Parks Road Oxford OX1 3TG United Kingdom
Harry Kesten Department of Mathematics Cornell University Ithaca NY 14853 United States
Abel Klein Department of Mathematics University of California Irvine CA 92717 United States
Roman Kotecký Centre for Theoretical Study Charles University Táboritská 23 130 00 Praha 3 Czech Republic
Flora Koukiou Groupe de Physique Statistique Université de Cergy-Pontoise 47–49 avenue des Genottes, B.P. 8428 95806 Cergy-Pontoise Cedex France
Ravishankar Krishnamurthi Department of Mathematics State University of New York New Paltz NY 12561 United States
Antti Kupiainen Department of Mathematics University of Helsinki Hallituskatu 15 SF–00014 Helsinki Finland
Gregory Lawler Department of Mathematics Duke University Durham NC 27708–0320 United States
John Lewis Dublin Institute for Advanced Studies 10 Burlington Road Dublin 4 Ireland
Thomas Liggett Department of Mathematics University of California Los Angeles CA 90024 United States
Terry Lyons Department of Mathematics Imperial College 180 Queen’s Gate London SW7 2BZ United Kingdom
Kirone Mallick Laboratoire de Physique de l’ENS 24 rue Lhomond 75005 Paris Cedex France
Anders Martin-Löf Department of Mathematical Statistics University of Stockholm S–10691 Stockholm Sweden
Ronald Meester Department of Mathematics University of Utrecht P.O. Box 80.010 3508 TA Utrecht The Netherlands
Mikhail Menshikov Mechanico-Mathematical Faculty Department of Probability Moscow State University 119899 Moscow Russia
Thomas Mountford Department of Mathematics University of California Los Angeles CA 90024 United States
Bruno Nachtergaele Department of Physics Princeton University Princeton NJ 08544–0708 United States
Charles Newman Courant Institute 251 Mercer Street New York NY 10012 United States
Bao Gia Nguyen Department of Mathematics Illinois Institute of Technology Chicago IL 60616 United States
Geoffrey Nicholls Department of Statistics University of Oxford 1 South Parks Road Oxford OX1 3TG United Kingdom
John Noble Department of Statistics University College Cork Ireland
Neil O’Connell Department of Mathematics University of Edinburgh Mayfield Road Edinburgh EH9 3JZ United Kingdom
Enzo Olivieri Department of Mathematics University of Rome II Via Fontanile di Carcaricola 00133 Rome Italy
George Papanicolaou Department of Mathematics Stanford University Stanford CA 94305 United States
Marc Peigné IRMAR Laboratoire de Probabilités Campus de Beaulieu 35042 Rennes Cedex France
Robin Pemantle Department of Mathematics University of Wisconsin Madison WI 53706 United States
Mathew Penrose Department of Mathematical Sciences University of Durham South Road Durham DH1 3LE United Kingdom
Yuval Peres Department of Mathematics Yale University New Haven CT 06520 United States
Dimitri Petritis IRMAR Campus de Beaulieu 35042 Rennes Cedex France
Pierre Picco Centre de Physique Théorique CNRS Luminy, Case 907 13288 Marseille Cedex France
Marcelo Piza Department of Physics New York University New York NY 10003 United States
Emily Puckette Department of Mathematics Duke University Durham NC 27708–0320 United States
Ellen Saada LAMS de l’Université de Rouen Faculté des Sciences B.P. 118 76134 Mont-St-Aignan Cedex France
Roberto Schonmann Department of Mathematics University of California Los Angeles CA 90024 United States
Elisabetta Scoppola Department of Physics Università La Sapienza Piazzale Aldo Moro 2 00185 Rome Italy
Sunder Sethuraman Courant Institute New York University 251 Mercer Street New York, NY 10012 United States
Gyoung Moo Shim Instituut voor Theoretische Fysica Katholieke Universiteit Leuven B–3001 Leuven Belgium
Senya Shlosman Institute for Information Transmission Problems 19 Yermoleva Street GSP-4 Moscow 101447 Russia
Gordon Slade Department of Mathematics & Statistics McMaster University Hamilton Ontario L8S 4K1 Canada
Wayne Sullivan Dublin Institute of Advanced Studies 10 Burlington Road Dublin 4 Ireland
Franck Vermet IRMAR Campus de Beaulieu 35042 Rennes Cedex France
Jonathan Warren School of Mathematical Sciences University of Bath Bath Avon BA2 7AY United Kingdom
Joseph Watson Department of Physics Harvard University Cambridge MA 02138 United States
Edward Waymire Department of Mathematics Oregon State University Corvallis Oregon 97331–4605 United States
Dominic Welsh Merton College Oxford OX1 4JD United Kingdom
Aubrey Wulfsohn Mathematics Institute University of Warwick Coventry CV4 7AL United Kingdom
Milos Zahradnik Faculty of Mathematics and Physics Charles University MFF-UK, Sokolovská 83 18600 Prague Czech Republic
Boguslaw Zegarlinski Mathematics Department Ruhr-Universität 4630 Bochum 1 Germany
V. Zhikov Department of Mathematics Pedagogical Institute Vladimir 600024 Russia
Yu Zhang Department of Mathematics University of Colorado Colorado Springs CO 80933–7150 United States
EXACT STEADY STATE PROPERTIES OF THE ONE DIMENSIONAL ASYMMETRIC EXCLUSION MODEL
B. DERRIDA and M. R. EVANS Service de Physique Théorique C. E. Saclay F–91191 Gif-sur-Yvette Cedex France
Abstract. The asymmetric exclusion model describes a system of particles hopping in a preferred direction with hard core repulsion. Here we review several exact results concerning the steady state of this system which have been obtained recently for periodic and open boundary conditions: density profiles, correlation functions and diffusion constants. We then discuss generalisations to the case of partial asymmetry and to a model with two species of particles.

Key words: Asymmetric exclusion, steady state, diffusion constants, exactly solvable model.
1. Introduction

Models of hopping particles in one dimension [1–5] provide simple but non-trivial realisations of systems out of equilibrium [6–17]. Here we review some recent exact results [18–20] for a family of such models: the asymmetric exclusion process in various geometries and with one or more species of particles. These results, which pertain to steady state properties, have been obtained within a matrix formulation, the description of which will constitute the main part of this presentation.

Let us define the system to be considered. Each site of a one dimensional lattice of $N$ sites is either occupied by one particle or empty. A configuration of the system is characterised by $N$ binary variables $\{\tau_1, \tau_2, \dots, \tau_N\}$ where $\tau_i = 1$ if site $i$ is occupied by a particle and $\tau_i = 0$ if site $i$ is empty. During an infinitesimal time interval $dt$, each bond of the lattice has probability $dt$ of being updated. If a bond is updated and there is a particle on the left hand site of the bond, and a hole on the right hand site, the particle will hop across the bond. In other words a particle hops forward with rate 1 whenever there is an empty site on its right.

Different variants of the model can be considered by imposing different boundary conditions for the lattice. For a finite system of $N$ sites two kinds of boundary conditions are often considered:

1. Periodic boundary conditions, where $\tau_{i+N} = \tau_i$ and the number of particles $M = \sum_i \tau_i$ is fixed [7, 8, 20].
2. Open boundary conditions, where in time $dt$ a particle may enter the lattice at site 1 with probability $\alpha\,dt$ (if this site is empty) and a particle at site $N$ may leave the lattice with probability $\beta\,dt$. In this case the number of particles in the system is not conserved [14, 18].
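The open-boundary dynamics just defined can be illustrated by a small simulation. The following is only a sketch under stated assumptions: it uses a random sequential update in which one of the $N + 1$ possible moves (entry, $N - 1$ bulk bonds, exit) is chosen uniformly at each step, which reproduces the rates $\alpha$, 1 and $\beta$ up to a rescaling of time and requires $\alpha, \beta \le 1$. The function name `simulate_open_tasep` and its parameters are hypothetical and do not come from the text.

```python
import random


def simulate_open_tasep(n_sites, alpha, beta, n_updates, seed=0):
    """Random sequential update sketch of the open-boundary model.

    Assumes alpha <= 1 and beta <= 1, so that they can be used as
    acceptance probabilities for the entry and exit moves.
    """
    rng = random.Random(seed)
    tau = [0] * n_sites  # tau[i] = 1 if site i + 1 is occupied
    for _ in range(n_updates):
        bond = rng.randrange(n_sites + 1)  # 0: entry, n_sites: exit
        if bond == 0:
            # A particle enters at site 1 if that site is empty.
            if tau[0] == 0 and rng.random() < alpha:
                tau[0] = 1
        elif bond == n_sites:
            # The particle at site N (if any) leaves the lattice.
            if tau[-1] == 1 and rng.random() < beta:
                tau[-1] = 0
        else:
            # Bulk bond between sites `bond` and `bond + 1` (1-based labels):
            # the particle hops to the right if the target site is empty.
            if tau[bond - 1] == 1 and tau[bond] == 0:
                tau[bond - 1], tau[bond] = 0, 1
    return tau
```

For example, `simulate_open_tasep(50, 0.3, 0.7, 10**6)` returns one configuration sampled after a long run; averaging the occupations over many such samples would give an estimate of the density profile.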
Remark. For finite systems the steady state is unique, that is, the probability $P_t(\tau_1, \dots, \tau_N)$ of finding the system in configuration $\{\tau_1, \dots, \tau_N\}$ has a long time limit independent of the initial condition. However as the limits $t \to \infty$ and $N \to \infty$ do not usually commute, the situation for an infinite system is somewhat different and the long time behaviour may depend on initial conditions. For example, suitable initial conditions may produce a shock in the system separating two regions of unequal densities whereas other initial conditions may lead to a homogeneous particle density.

Evolution of the Correlation Functions

Armed with the dynamical rules of the model, one can easily derive the equations which govern the time evolution of any correlation function. For example, if one considers the occupation of site $i$ (for the moment we consider a non-boundary site to avoid choosing any particular boundary conditions) one can write down

$$\tau_i(t + dt) = \begin{cases} \tau_i(t) & \text{with probability } 1 - 2\,dt \\ \tau_i(t) + [1 - \tau_i(t)]\,\tau_{i-1}(t) & \text{with probability } dt \\ \tau_i(t)\,\tau_{i+1}(t) & \text{with probability } dt. \end{cases} \tag{1}$$

The first equation comes from the fact that with probability $1 - 2\,dt$, neither of the bonds $(i-1, i)$ or $(i, i+1)$ is updated and therefore $\tau_i$ remains unchanged. The second equation corresponds to updating bond $(i-1, i)$: after the update of that bond $\tau_i = 1$ if site $i$ was either occupied before the update or empty but site $i-1$ was occupied. Likewise, the third equation corresponds to updating bond $(i, i+1)$ after which site $i$ would only be occupied if both site $i$ and site $i+1$ were occupied before the update. If one averages (1) over the events which may occur between $t$ and $t + dt$ and all histories up to time $t$ one obtains

$$\frac{d\langle \tau_i \rangle}{dt} = \langle \tau_{i-1}(1 - \tau_i) \rangle - \langle \tau_i (1 - \tau_{i+1}) \rangle. \tag{2}$$
The same kind of reasoning allows one to write down an equation for the evolution of $\langle \tau_i \tau_{i+1} \rangle$:

$$\frac{d\langle \tau_i \tau_{i+1} \rangle}{dt} = \langle \tau_{i-1}(1 - \tau_i)\tau_{i+1} \rangle - \langle \tau_i \tau_{i+1}(1 - \tau_{i+2}) \rangle. \tag{3}$$

For periodic boundary conditions, where the system has translational invariance, equations of the form (2)–(3) hold for all $i$. For open boundary conditions one has to consider boundary effects; the equation for the evolution of the one-point correlation function (2) becomes at the boundaries

$$\frac{d\langle \tau_1 \rangle}{dt} = \alpha \langle (1 - \tau_1) \rangle - \langle \tau_1 (1 - \tau_2) \rangle, \tag{4}$$

$$\frac{d\langle \tau_N \rangle}{dt} = \langle \tau_{N-1}(1 - \tau_N) \rangle - \beta \langle \tau_N \rangle. \tag{5}$$

Once relations of the type (2)–(5) are written, one can in principle calculate the time evolution of any quantity of interest. However, the equation (2) for $\langle \tau_i \rangle$ requires the knowledge of $\langle \tau_i \tau_{i+1} \rangle$ which itself (3) requires the knowledge of $\langle \tau_{i-1} \tau_{i+1} \rangle$ and
$\langle \tau_{i-1} \tau_i \tau_{i+1} \rangle$ so that the problem is intrinsically an $N$-body problem in the sense that the calculation of any correlation function requires the knowledge of all the others. In what follows, we shall see however that both for periodic and for open boundary conditions, all the correlation functions in the steady state can be calculated exactly.

In the steady state, the correlation functions satisfy equations of the form (2)–(5) where the left hand sides are set to zero. For the case of periodic boundary conditions these equations can, in fact, be solved immediately [6] by recognizing that each configuration (with the correct number $M$ of particles) has equal probability $P_{\rm eq}$:

$$P_{\rm eq} = \binom{N}{M}^{-1}. \tag{6}$$
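For a small ring the equal-probability statement (6) can be confirmed by brute force. The sketch below is an illustration only: it enumerates all configurations with $M$ particles on $N$ sites, gives each the weight (6), and accumulates the net probability flow produced by every allowed hop, which must vanish in a steady state. The function name `ring_stationarity_defect` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def ring_stationarity_defect(n_sites, n_particles):
    """Largest |dP/dt| over all configurations when P is the uniform measure (6)."""
    configs = [frozenset(c) for c in combinations(range(n_sites), n_particles)]
    p_eq = Fraction(1, len(configs))  # equal probability for every configuration
    net_flow = {c: Fraction(0) for c in configs}
    for c in configs:
        for i in c:
            j = (i + 1) % n_sites  # right neighbour on the ring
            if j not in c:
                # The particle at i hops to the empty site j with rate 1.
                target = frozenset((c - {i}) | {j})
                net_flow[c] -= p_eq
                net_flow[target] += p_eq
    return max(abs(v) for v in net_flow.values())
```

A return value of exactly zero, for instance for `ring_stationarity_defect(6, 3)`, means that the probability entering each configuration per unit time equals the probability leaving it.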
The equal weighting (6) can be easily checked by noticing that if all configurations have equal weight, the rate at which the system leaves a given configuration is equal to the number of clusters of particles in that configuration (the first particle of each cluster can hop forward) and the rate at which the system may enter that configuration is also equal to the number of clusters (by the move of the last particle of each cluster). Then, if one considers for example the two-point correlation functions, it follows that $\langle \tau_i \tau_j \rangle$ will take the same value regardless of the positions of sites $i$, $j$. Similarly any $n$-point correlation will be independent of the positions of the $n$ points (as long as they are all different). With correlation functions of this form it is easy to see that the right hand sides of (2)–(3) are automatically zero and similarly any steady state equations for higher order correlation functions would be satisfied.

In the case of open boundary conditions one might try to look for a solution of a similar form. However, since the number of particles is not conserved, a corresponding guess as to the form of the stationary probabilities would be that configurations with the same number of particles have the same probability. For $\alpha + \beta = 1$, such a solution does exist (see below) for which all correlation functions are factorised

$$\langle \tau_i \tau_j \rangle = \langle \tau \rangle^2 \quad \text{with} \quad \alpha = \langle \tau \rangle = 1 - \beta. \tag{7}$$
However, in the general case where $\alpha + \beta \neq 1$, the steady state is non-trivial. The difficulty in calculating the steady state can be seen in (2)–(5): the computation of the one point functions $\langle \tau_i \rangle$ requires the knowledge of the two point functions $\langle \tau_i \tau_{i+1} \rangle$ which in turn require the knowledge of higher correlation functions $\langle \tau_{i-1} \tau_i \tau_{i+1} \rangle$ and $\langle \tau_i \tau_{i+2} \rangle$ and so on. As mentioned earlier, this is a situation quite common in equilibrium statistical mechanics where, although one can write relationships between different correlation functions, there is an infinite hierarchy of equations which in general makes the problem intractable.

In the following we will discuss a way of representing the steady state that for the case of open boundary conditions allows all equal time correlation functions to be computed [18]. A similar approach can also be used in the case of periodic boundary conditions to obtain more complicated steady state properties [20].
2. Matrix Formulation of Steady State for Open Boundaries

Let us now describe a way of calculating the steady state properties in the case of open boundary conditions that we developed in collaboration with V. Hakim and V. Pasquier. This approach had previously been used to solve other problems of statistical mechanics (directed lattice animals and quantum antiferromagnetic spin chains [21, 22, 23]). The idea is to write the weights $f_N(\tau_1, \dots, \tau_N)$ of the configurations in the steady state as

$$f_N(\tau_1, \dots, \tau_N) = \langle W | \prod_{i=1}^{N} [\tau_i D + (1 - \tau_i) E] | V \rangle, \tag{8}$$

where $D$, $E$ are matrices, $\langle W|$, $|V\rangle$ are vectors (we use the standard bra-ket notation of quantum mechanics) and $\tau_i$ are the occupation variables. In other words in the product (8) we use matrix $D$ whenever $\tau_i = 1$ and $E$ whenever $\tau_i = 0$. In general, since the matrices $D$ and $E$ do not commute, the weights $f_N(\tau_1, \dots, \tau_N)$ are complicated functions of the configuration $\{\tau_1, \dots, \tau_N\}$. As the weights $f_N(\tau_1, \dots, \tau_N)$ given by (8) are usually not normalised, the probability $p_N(\tau_1, \dots, \tau_N)$ of a configuration $\{\tau_1, \dots, \tau_N\}$ in the steady state is

$$p_N(\tau_1, \dots, \tau_N) = f_N(\tau_1, \dots, \tau_N) \left[ \sum_{\tau_1 = 1,0} \cdots \sum_{\tau_N = 1,0} f_N(\tau_1, \dots, \tau_N) \right]^{-1}. \tag{9}$$
Of course, from looking at (8) it is not obvious that such matrices $D$, $E$ and vectors $\langle W|$, $|V\rangle$ exist. We shall see, however, that it is possible to choose these matrices and vectors so that $f_N(\tau_1, \dots, \tau_N)$ given by (8) are indeed the actual weights in the steady state.

Before presenting some explicit forms for the matrices and vectors involved in (8) let us show how the approach leads to a straightforward computation for the correlation functions. If one defines the matrix $C$ by

$$C = D + E, \tag{10}$$

it is clear that $\langle \tau_i \rangle_N$ defined by

$$\langle \tau_i \rangle_N = \sum_{\tau_1 = 1,0} \cdots \sum_{\tau_N = 1,0} \tau_i f_N(\tau_1, \dots, \tau_N) \left[ \sum_{\tau_1 = 1,0} \cdots \sum_{\tau_N = 1,0} f_N(\tau_1, \dots, \tau_N) \right]^{-1}, \tag{11}$$

can be calculated through the following formula

$$\langle \tau_i \rangle_N = \frac{\langle W | C^{i-1} D C^{N-i} | V \rangle}{\langle W | C^N | V \rangle}. \tag{12}$$
In the same way, any higher correlation will take a simple form in terms of these matrices. For example, when $i < j$, $\langle \tau_i \tau_j \rangle_N$ is equal to

$$\langle \tau_i \tau_j \rangle_N = \frac{\langle W | C^{i-1} D C^{j-i-1} D C^{N-j} | V \rangle}{\langle W | C^N | V \rangle}. \tag{13}$$
Therefore, all we require in order to be able to calculate arbitrary spatial correlation functions is that the matrix elements of any power of $C = D + E$ have manageable expressions.

One can show [18] that if the matrices $D$, $E$ and the vectors $\langle W|$, $|V\rangle$ satisfy (14)–(16):

$$D |V\rangle = \frac{1}{\beta} |V\rangle, \tag{14}$$

$$DE = D + E, \tag{15}$$

$$\langle W| E = \frac{1}{\alpha} \langle W|, \tag{16}$$

then (8) does give the steady state. We shall not repeat here the proof that (14)–(16) are sufficient conditions to give the weights in the steady state. It is however easy to check that the relations (2)–(5) will be satisfied in the steady state provided that the corresponding identities hold:

$$D E (D + E) = (D + E) D E, \tag{17}$$

$$D E D (D + E) = (D + E) D D E, \tag{18}$$

$$\alpha \langle W| E (D + E) = \langle W| D E, \tag{19}$$

$$D E |V\rangle = \beta (D + E) D |V\rangle, \tag{20}$$

and that these relations are immediate consequences of the algebraic rules (14)–(16).

Another easy check that (14)–(16) do give the right steady state is to look at some special configurations. If one takes the case of a configuration where the first $p$ sites are empty and the last $N - p$ are occupied, it is easy to show that in the steady state one must have

$$\langle W | E^{p-1} D E D^{N-p-1} | V \rangle = \alpha \langle W | E^p D^{N-p} | V \rangle + \beta \langle W | E^p D^{N-p} | V \rangle \tag{21}$$

since this expresses that during a time interval $dt$ the probability of entering and leaving the configuration are the same. Here again, this equality appears as a very simple consequence of the algebraic rules (14)–(16).

On the line $\alpha + \beta = 1$ we mentioned above that the steady state becomes trivial. This is reflected by the fact that one can choose commuting matrices $D$ and $E$ to solve (14)–(16). If $D$ and $E$ commute one can write

$$\left( \frac{1}{\alpha} + \frac{1}{\beta} \right) \langle W | V \rangle = \langle W | D + E | V \rangle = \langle W | D E | V \rangle = \langle W | E D | V \rangle = \frac{1}{\alpha \beta} \langle W | V \rangle. \tag{22}$$

As $\langle W | V \rangle \neq 0$, this clearly implies that $\alpha + \beta = 1$. This is a well known special case where the steady state is factorised ($f_N(\tau_1, \dots, \tau_N)$ depends only on $\sum_i \tau_i$ and all connected correlations vanish). Under this condition ($\alpha + \beta = 1$), one can choose the matrices $D$ and $E$ to be one-dimensional, with $D = \beta^{-1}$ and $E = \alpha^{-1}$.

The previous remark also shows that for $\alpha + \beta \neq 1$, the size of the matrices $D$, $E$ must be greater than one. The next question is whether one can find finite dimensional matrices that will satisfy (14)–(16). It turns out that one can prove [18] that this is impossible (if $D$ and $E$ were finite dimensional matrices, the relation
$DE = D + E$ would imply that $D = E(E - 1)^{-1}$ which itself would imply that the matrices $D$ and $E$ commute). So the only possibility left is to use infinite dimensional matrices.

In order to perform calculations within the matrix formulation there are basically two approaches one can take. Either one can work with the algebra (14)–(16) directly, or one can make a particular choice of matrices and use it to the full. In the latter case there are several possible choices for the matrices $D$, $E$ and vectors $\langle W|$, $|V\rangle$ that satisfy (14)–(16). One particularly simple choice, which has proved useful in the extensions of the approach to be discussed below is

$$D = \begin{pmatrix} 1 & 1 & 0 & 0 & \cdots \\ 0 & 1 & 1 & 0 & \\ 0 & 0 & 1 & 1 & \\ 0 & 0 & 0 & 1 & \\ \vdots & & & & \ddots \end{pmatrix}, \qquad E = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 1 & 1 & 0 & 0 & \\ 0 & 1 & 1 & 0 & \\ 0 & 0 & 1 & 1 & \\ \vdots & & & & \ddots \end{pmatrix}, \tag{23}$$

$$\langle W| = \left( 1, \frac{1-\alpha}{\alpha}, \left( \frac{1-\alpha}{\alpha} \right)^2, \dots \right), \qquad |V\rangle = \begin{pmatrix} 1 \\ \frac{1-\beta}{\beta} \\ \left( \frac{1-\beta}{\beta} \right)^2 \\ \vdots \end{pmatrix}. \tag{24}$$
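That the choice (23)–(24) obeys the rules (14)–(16) can be checked on a finite truncation. The sketch below is an illustration with hypothetical helper names: it keeps only the upper-left `size` × `size` blocks, so the rules can only be expected to hold away from the cut, and the last diagonal entry of $DE$ is excluded from the comparison because the truncation removes the term that would complete it.

```python
from fractions import Fraction


def truncated_representation(size, alpha, beta):
    """Upper-left blocks of D, E and leading entries of <W|, |V> from (23)-(24)."""
    D = [[1 if j in (i, i + 1) else 0 for j in range(size)] for i in range(size)]
    E = [[1 if j in (i, i - 1) else 0 for j in range(size)] for i in range(size)]
    W = [((1 - alpha) / alpha) ** k for k in range(size)]
    V = [((1 - beta) / beta) ** k for k in range(size)]
    return D, E, W, V


def matmul(A, B):
    """Plain product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def check_algebra(size, alpha, beta):
    """True if (14)-(16) hold on the truncation, away from the cut-off."""
    D, E, W, V = truncated_representation(size, alpha, beta)
    DE = matmul(D, E)
    last = size - 1
    rule_15 = all(DE[i][j] == D[i][j] + E[i][j]
                  for i in range(size) for j in range(size)
                  if (i, j) != (last, last))
    rule_16 = all(sum(W[i] * E[i][j] for i in range(size)) == W[j] / alpha
                  for j in range(last))
    rule_14 = all(sum(D[i][j] * V[j] for j in range(size)) == V[i] / beta
                  for i in range(last))
    return rule_14 and rule_15 and rule_16
```

Passing `Fraction` values, as in `check_algebra(8, Fraction(1, 3), Fraction(3, 5))`, keeps the arithmetic exact so that the comparisons are true equalities rather than floating-point approximations.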
This choice makes the particle-hole symmetry of the problem apparent since the matrices $D$ and $E$ have very similar forms and the boundary conditions $\alpha$ and $\beta$ only appear in the vectors $\langle W|$ and $|V\rangle$. For this choice (23) of $D$, $E$ the elements of $C^N$ (where $C = D + E$ and $N$ denotes the $N$th power of matrix $C$) are given by

$$\left( C^N \right)_{nm} = \binom{2N}{N + n - m} - \binom{2N}{N + n + m}. \tag{25}$$

Expression (25) can be obtained by noting that $(C^N)_{nm}$ is proportional to the probability that a random walker who starts at site $2m$ of a semi-infinite chain with absorbing boundary at the origin, is at site $2n$ after $2N$ steps of a random walk. This probability may be calculated by the method of images.

An apparent disadvantage of this choice (23)–(24) is that, due to the form of $\langle W|$ and $|V\rangle$, one has to sum geometric series to obtain the correlation functions and these series diverge in some range of $\alpha$, $\beta$ (in fact $\alpha + \beta \le 1$). However, at least for finite $N$, all expressions are rational functions of $\alpha$, $\beta$ so that in principle one can obtain results for $\alpha + \beta \le 1$ by analytic continuation from those for $\alpha + \beta > 1$.

Other choices of matrices and vectors are possible [18], which solve the equations (14)–(16). For example, a possible choice of $D$, $E$, $\langle W|$, $|V\rangle$, that avoids the divergences is

$$\tilde{D} = \begin{pmatrix} 1/\beta & a & 0 & 0 & \cdots \\ 0 & 1 & 1 & 0 & \\ 0 & 0 & 1 & 1 & \\ 0 & 0 & 0 & 1 & \\ \vdots & & & & \ddots \end{pmatrix}, \qquad \tilde{E} = \begin{pmatrix} 1/\alpha & 0 & 0 & 0 & \cdots \\ a & 1 & 0 & 0 & \\ 0 & 1 & 1 & 0 & \\ 0 & 0 & 1 & 1 & \\ \vdots & & & & \ddots \end{pmatrix}, \tag{26}$$

$$\langle \tilde{W}| = (1, 0, 0, \dots), \qquad |\tilde{V}\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{pmatrix}, \tag{27}$$

where

$$a^2 = \frac{\alpha + \beta - 1}{\alpha \beta}. \tag{28}$$
The fact that $a^2$ may be negative is of no importance, because in the calculation of any required matrix element $a$ only enters through $a^2$. One should note that for $\alpha = \beta = 1$, we have $a = 1$ and (26)–(27) coincides with our previous bidiagonal choice (23)–(24). Also, $a$ vanishes for $\alpha + \beta = 1$ so that the 1,1 elements of the matrices $\tilde{D}$, $\tilde{E}$ decouple from the other elements. This choice of matrices then becomes, for the purposes of our calculations, one-dimensional as is sufficient for this special case of $\alpha$ and $\beta$.

Instead of using explicit forms for the matrices, one can calculate directly matrix elements such as those which appear in (12)–(13) from the commutation rules (14)–(16). For example, one can easily show that

$$\frac{\langle W | C | V \rangle}{\langle W | V \rangle} = \frac{\langle W | D + E | V \rangle}{\langle W | V \rangle} = \frac{1}{\alpha} + \frac{1}{\beta}, \tag{29}$$

$$\frac{\langle W | C^2 | V \rangle}{\langle W | V \rangle} = \frac{\langle W | D^2 + ED + E^2 + D + E | V \rangle}{\langle W | V \rangle} = \frac{1}{\alpha^2} + \frac{1}{\alpha \beta} + \frac{1}{\beta^2} + \frac{1}{\alpha} + \frac{1}{\beta}. \tag{30}$$

The general expression of $\langle W | C^N | V \rangle$ (where $C = D + E$) for all values of $\alpha$ and $\beta$ has been shown to be [18]

$$\frac{\langle W | C^N | V \rangle}{\langle W | V \rangle} = \sum_{p=0}^{N} \frac{p\,(2N - 1 - p)!}{N!\,(N - p)!} \, \frac{\beta^{-p-1} - \alpha^{-p-1}}{\beta^{-1} - \alpha^{-1}}. \tag{31}$$
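The direct use of the algebra can be sketched in a few lines. In the illustration below (helper names are hypothetical) a product of matrices is stored as a string of the letters `D` and `E`; every factor `DE` is replaced by `D + E` using (15) until only normal-ordered words $E^a D^b$ remain, and these are evaluated with (14) and (16) as $\alpha^{-a} \beta^{-b}$, everything being expressed in units of $\langle W | V \rangle$. The same weights are then used to confirm, for a small system, that (8) is stationary under the open-boundary dynamics.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product


def make_weight(alpha, beta):
    """Return f(word) = <W|word|V> / <W|V>, using only rules (14)-(16)."""

    @lru_cache(maxsize=None)
    def weight(word):
        k = word.find("DE")
        if k < 0:
            # Normal-ordered word E...ED...D: each E gives 1/alpha by (16),
            # each D gives 1/beta by (14).
            return (1 / alpha) ** word.count("E") * (1 / beta) ** word.count("D")
        # Rule (15): DE = D + E, which shortens the word by one letter.
        head, tail = word[:k], word[k + 2:]
        return weight(head + "D" + tail) + weight(head + "E" + tail)

    return weight


def partition_sum(n_sites, alpha, beta):
    """<W|C^N|V> / <W|V> as the sum of the weights of all 2^N words."""
    weight = make_weight(alpha, beta)
    return sum(weight("".join(w)) for w in product("DE", repeat=n_sites))


def stationarity_defect(n_sites, alpha, beta):
    """Largest net probability flow when the weights (8) are used as the measure."""
    weight = make_weight(alpha, beta)
    net = {tau: Fraction(0) for tau in product((0, 1), repeat=n_sites)}
    for tau in list(net):
        f = weight("".join("D" if t else "E" for t in tau))
        moves = []
        if tau[0] == 0:
            moves.append((alpha, (1,) + tau[1:]))   # entry at site 1
        if tau[-1] == 1:
            moves.append((beta, tau[:-1] + (0,)))   # exit from site N
        for i in range(n_sites - 1):
            if tau[i] == 1 and tau[i + 1] == 0:     # bulk hop with rate 1
                moves.append((1, tau[:i] + (0, 1) + tau[i + 2:]))
        for rate, target in moves:
            net[tau] -= rate * f
            net[target] += rate * f
    return max(abs(v) for v in net.values())
```

With `Fraction` arguments the results are exact, so `partition_sum(2, a, b)` can be compared term by term with (30), and a vanishing `stationarity_defect` is an exact statement for the chosen system size. The cost grows like $2^N$, so this is only meant for small $N$.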
Some Results

Once the matrix elements of $C$ are known, expressions for several quantities can be derived. For example, in the steady state, the current through the bond $(i, i+1)$ is simply $J = \langle \tau_i (1 - \tau_{i+1}) \rangle$, because during a time $dt$, the probability that a particle jumps from $i$ to $i + 1$ is $\tau_i (1 - \tau_{i+1})\,dt$. Therefore, $J$ is given by

$$J = \frac{\langle W | C^{i-1} D E C^{N-i-1} | V \rangle}{\langle W | C^N | V \rangle} = \frac{\langle W | C^{N-1} | V \rangle}{\langle W | C^N | V \rangle}, \tag{32}$$

where we have used the fact (15) that $DE = C$. This expression is independent of $i$, as expected in the steady state. From the large $N$ behaviour of the matrix elements $\langle W | C^N | V \rangle$ given by (31) one can show [18] that there are three different
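phases where the current $J$ is given by

$$J = \begin{cases} \frac{1}{4} & \text{if } \alpha \ge \frac{1}{2} \text{ and } \beta \ge \frac{1}{2}, \\ \alpha (1 - \alpha) & \text{if } \alpha < \frac{1}{2} \text{ and } \beta > \alpha, \\ \beta (1 - \beta) & \text{if } \beta < \frac{1}{2} \text{ and } \alpha > \beta. \end{cases} \tag{33}$$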
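The approach to the limiting values (33) can be followed numerically from (31) and (32). The sketch below is an illustration with hypothetical function names. One implementation choice is an assumption of the sketch rather than something stated in the text: the ratio $(\beta^{-p-1} - \alpha^{-p-1}) / (\beta^{-1} - \alpha^{-1})$ is written as the finite geometric sum $\sum_{k=0}^{p} \beta^{-k} \alpha^{-(p-k)}$, which is algebraically identical and avoids a division by zero when $\alpha = \beta$.

```python
from fractions import Fraction
from math import factorial


def partition_sum(n_sites, alpha, beta):
    """<W|C^N|V> / <W|V> from formula (31), for N >= 1."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    x, y = 1 / beta, 1 / alpha
    total = Fraction(0)
    for p in range(1, n_sites + 1):  # the p = 0 term of (31) vanishes
        coeff = Fraction(p * factorial(2 * n_sites - 1 - p),
                         factorial(n_sites) * factorial(n_sites - p))
        # (x^(p+1) - y^(p+1)) / (x - y) as a finite geometric sum.
        ratio = sum(x ** k * y ** (p - k) for k in range(p + 1))
        total += coeff * ratio
    return total


def current(n_sites, alpha, beta):
    """J = <W|C^(N-1)|V> / <W|C^N|V>, eq. (32), for N >= 2."""
    return partition_sum(n_sites - 1, alpha, beta) / partition_sum(n_sites, alpha, beta)


def limiting_current(alpha, beta):
    """Right-hand side of (33)."""
    half = Fraction(1, 2)
    if alpha >= half and beta >= half:
        return Fraction(1, 4)
    if alpha < half and beta > alpha:
        return alpha * (1 - alpha)
    # Remaining case of (33); for alpha == beta < 1/2 both expressions agree.
    return beta * (1 - beta)
```

For instance, `float(current(100, Fraction(1, 5), Fraction(9, 10)))` can be compared with `limiting_current(Fraction(1, 5), Fraction(9, 10))`, and repeating the comparison for $\alpha = \beta = 1$ shows the much slower, power-law approach to $1/4$.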
Thus, the phase diagram consists of three phases: $\alpha > \frac{1}{2}$, $\beta > \frac{1}{2}$; $\alpha < \frac{1}{2}$, $\beta > \alpha$; $\beta < \frac{1}{2}$, $\alpha > \beta$. This is exactly the phase diagram predicted by the mean field theory [9, 14, 17].

From the knowledge of the matrix elements $\langle W | C^N | V \rangle$, one can also obtain [18] exact expressions for all equal time correlation functions. For example the profile $\langle \tau_i \rangle_N$ is given by

$$\langle \tau_i \rangle_N = \sum_{p=0}^{n-1} \frac{(2p)!}{p!\,(p+1)!} \frac{\langle W | C^{N-1-p} | V \rangle}{\langle W | C^N | V \rangle} + \frac{\langle W | C^{i-1} | V \rangle}{\langle W | C^N | V \rangle} \sum_{p=2}^{n+1} \frac{(p-1)(2n-p)!}{n!\,(n+1-p)!} \beta^{-p} \tag{34}$$
where $n = N - i$. Several limiting behaviours ($N$ large, $i$ large) are discussed in [18]. In the case $\alpha = \beta = 1$, one can even perform the sums in (34) to obtain [14]

$$\langle \tau_i \rangle_N = \frac{1}{2} + \frac{N - 2i + 1}{4} \, \frac{(2i)!}{(i!)^2} \, \frac{(N!)^2}{(2N+1)!} \, \frac{(2N - 2i + 2)!}{[(N - i + 1)!]^2}. \tag{35}$$
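The closed form (35) can be compared with a direct evaluation of (12). The sketch below is an illustration with hypothetical names. It relies on the observation that for $\alpha = \beta = 1$ the vectors (24) reduce to $\langle W| = (1, 0, 0, \dots)$ and $|V\rangle = (1, 0, 0, \dots)^T$, so a row vector of $N + 2$ entries is enough to carry the product of $N$ matrices from (23) without any truncation error in the first component.

```python
from fractions import Fraction
from math import factorial


def apply(vec, letter):
    """Multiply a row vector by D or E from (23), truncated to len(vec) entries."""
    size = len(vec)
    if letter == "D":
        # (v D)_j = v_j + v_(j-1)
        return [vec[j] + (vec[j - 1] if j > 0 else 0) for j in range(size)]
    # (v E)_j = v_j + v_(j+1)
    return [vec[j] + (vec[j + 1] if j + 1 < size else 0) for j in range(size)]


def times_c(vec):
    """Multiply a row vector by C = D + E."""
    return [a + b for a, b in zip(apply(vec, "D"), apply(vec, "E"))]


def density_matrix_product(n_sites, i):
    """<tau_i>_N from (12) at alpha = beta = 1."""
    start = [1] + [0] * (n_sites + 1)
    num = start
    for _ in range(i - 1):
        num = times_c(num)
    num = apply(num, "D")
    for _ in range(n_sites - i):
        num = times_c(num)
    den = start
    for _ in range(n_sites):
        den = times_c(den)
    # Contracting with |V> = (1, 0, 0, ...)^T picks out the first entry.
    return Fraction(num[0], den[0])


def density_closed_form(n_sites, i):
    """Right-hand side of (35)."""
    f = factorial
    return Fraction(1, 2) + Fraction(n_sites - 2 * i + 1, 4) * Fraction(
        f(2 * i) * f(n_sites) ** 2 * f(2 * n_sites - 2 * i + 2),
        f(i) ** 2 * f(2 * n_sites + 1) * f(n_sites - i + 1) ** 2)
```

Both functions return exact fractions, so they can be compared with `==` for every site of a small system; the particle-hole symmetry of the profile, $\langle \tau_i \rangle_N + \langle \tau_{N+1-i} \rangle_N = 1$, can be checked in the same way.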
3. Diffusion Constant and Non-Equal Time Correlation Functions for Periodic Boundary Conditions

One can also try to extend the matrix approach to calculate more general steady state properties than equal time correlation functions. The first result of this kind [20] is the exact expression of the diffusion constant $\Delta$ for a system of $M$ particles on a ring of $N$ sites in the fully asymmetric case (each particle jumps to its right neighbour with probability $dt$ when the right neighbour is empty). If we consider a tagged particle (which has exactly the same dynamics as the $M - 1$ other particles) and if we call $Y_t$ the number of hops performed by this tagged particle between time 0 and time $t$, one expects that in the long time limit:

$$\lim_{t \to \infty} \frac{\langle Y_t \rangle}{t} = v; \qquad \lim_{t \to \infty} \frac{\langle Y_t^2 \rangle - \langle Y_t \rangle^2}{t} = \Delta. \tag{36}$$

The velocity $v$ and the diffusion constant $\Delta$ are given by

$$v = \frac{N - M}{N - 1}; \qquad \Delta = \frac{(2N - 3)!}{(2M - 1)!\,(2N - 2M - 1)!} \left[ \frac{(M - 1)!\,(N - M)!}{(N - 1)!} \right]^2. \tag{37}$$
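Expression (37) is easy to evaluate exactly for small rings. The sketch below is an illustration with hypothetical function names; it assumes $1 \le M < N$ so that every factorial has a non-negative argument. Two cases can be understood without the formula and serve as checks: a single particle ($M = 1$) hops as a Poisson process of rate 1, so $v = \Delta = 1$, and for $N = 3$, $M = 2$ the single hole moves as a Poisson process of rate 1 while the tagged particle makes every other hop, which gives $v = 1/2$ and $\Delta = 1/4$.

```python
from fractions import Fraction
from math import factorial


def velocity(n_sites, n_particles):
    """v from (37)."""
    return Fraction(n_sites - n_particles, n_sites - 1)


def diffusion_constant(n_sites, n_particles):
    """Delta from (37); assumes 1 <= M < N."""
    n, m, f = n_sites, n_particles, factorial
    prefactor = Fraction(f(2 * n - 3), f(2 * m - 1) * f(2 * n - 2 * m - 1))
    bracket = Fraction(f(m - 1) * f(n - m), f(n - 1))
    return prefactor * bracket ** 2
```

For example, `diffusion_constant(3, 2)` evaluates to `Fraction(1, 4)`, in agreement with the hole argument above.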
A derivation of this result based on the matrix ideas described above is given in [20]. Let us discuss here its connection with non-equal time correlation functions of
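the $\tau_i$ variables. In order to see this, it is convenient to introduce another random variable $\tilde Y_t$ which represents the number of particles which have jumped from site 1 to site 2 between time 0 and time $t$. Since the particles cannot overtake each other it is clear that
\[
\lim_{t\to\infty} \frac{\langle Y_t\rangle}{t} = \frac{N}{M} \lim_{t\to\infty} \frac{\langle \tilde Y_t\rangle}{t}; \qquad
\lim_{t\to\infty} \frac{\langle Y_t^2\rangle - \langle Y_t\rangle^2}{t} = \frac{N^2}{M^2} \lim_{t\to\infty} \frac{\langle \tilde Y_t^2\rangle - \langle \tilde Y_t\rangle^2}{t}.
\tag{38}
\]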
It is then rather easy to see how the moments of $\tilde Y_t$ are related to unequal time correlation functions. If one decomposes the time $t$ into $T$ infinitesimal time intervals $dt$ with $T = t/dt$, one can write $\tilde Y_t$ as
\[
\tilde Y_t = \sum_{k=1}^{T} a_k
\tag{39}
\]
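where $a_k = 1$ if a particle jumps from site 1 to site 2 at time $k\,dt$ and $a_k = 0$ otherwise. All the $a_k$ are random variables with
\[
a_k = \begin{cases}
1 & \text{with probability } \tau_1(k\,dt)\,(1-\tau_2(k\,dt))\,dt \\
0 & \text{with probability } 1 - \tau_1(k\,dt)\,(1-\tau_2(k\,dt))\,dt;
\end{cases}
\tag{40}
\]
then
\[
\lim_{t\to\infty} \frac1t \left[\langle \tilde Y_t^2\rangle - \langle \tilde Y_t\rangle^2\right]
= \lim_{t\to\infty} \left( \frac1t \sum_{k=1}^{T} \left[\langle a_k^2\rangle - \langle a_k\rangle^2\right]
+ \frac2t \sum_{k=1}^{T} \sum_{k'=k+1}^{T} \left[\langle a_k a_{k'}\rangle - \langle a_k\rangle\langle a_{k'}\rangle\right] \right).
\tag{41}
\]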
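Taking the continuous time limit ($dt \to 0$) one obtains for $t \to \infty$
\[
\frac{M^2}{N^2}\,\Delta = \langle \tau_1(1-\tau_2)\rangle
+ 2 \int_0^\infty dt \left[ \langle \tau_1(t)(1-\tau_2(t))\,\tau_1(0)(1-\tau_2(0))\rangle - \langle \tau_1(1-\tau_2)\rangle^2 \right].
\tag{42}
\]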
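For readers who want the intermediate step, the passage from (41) to (42) can be sketched as follows. The only assumptions added here are that the process is stationary, so that the connected two-time correlation depends on the time difference $s = (k'-k)\,dt$ alone, and that this correlation is integrable in $s$.

```latex
\begin{align*}
% a_k only takes the values 0 and 1, so a_k^2 = a_k and, by (40),
\langle a_k^2\rangle &= \langle a_k\rangle = \langle \tau_1(1-\tau_2)\rangle\,dt,
\qquad \langle a_k\rangle^2 = O(dt^2), \\
% single sum of (41), using T\,dt = t
\frac1t \sum_{k=1}^{T}\left[\langle a_k^2\rangle - \langle a_k\rangle^2\right]
&\xrightarrow[dt\to0]{} \langle \tau_1(1-\tau_2)\rangle, \\
% double sum of (41): each term is dt^2 times a connected correlation at time separation s
\frac2t \sum_{k=1}^{T}\sum_{k'=k+1}^{T}\left[\langle a_k a_{k'}\rangle - \langle a_k\rangle\langle a_{k'}\rangle\right]
&\xrightarrow[dt\to0,\ t\to\infty]{}
2\int_0^\infty ds\,\left[\langle \tau_1(s)(1-\tau_2(s))\,\tau_1(0)(1-\tau_2(0))\rangle - \langle \tau_1(1-\tau_2)\rangle^2\right].
\end{align*}
% By (38), the left-hand side of (41) equals (M^2/N^2)\,\Delta, which gives (42).
```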
So we see that the exact expression of $\Delta$ gives some information about unequal time correlation functions. Of course it would be interesting to know whether the matrix approach could be sufficiently refined to give exact expressions for all unequal time correlation functions; however at present this seems to us a very remote goal.

Two limiting cases of (37) are worth mentioning. First if one takes the limit $N \to \infty$ keeping $M$ fixed, one finds
\[
\Delta = \frac{[(M-1)!]^2\, 4^{M-1}}{(2M-1)!}.
\tag{43}
\]
In that limit, it is clear that the particles almost never see each other and one might fancy that ∆ = 1, the value it takes when there is a single particle in the system.
However this is not the case and $\Delta$ depends on $M$ because the 'collisions' between two particles are highly correlated in time. Another limit one can consider is that of a given density $\rho$ of particles in an infinite system ($M = N\rho$ as $N \to \infty$ in (37))
\[
\Delta \simeq \frac{\sqrt{\pi}}{2}\, \frac{(1-\rho)^{3/2}}{\rho^{1/2}}\, \frac{1}{N^{1/2}}.
\tag{44}
\]
The fact that $\Delta$ vanishes as $N \to \infty$ indicates that in the infinite system the fluctuations of the tagged particle are subdiffusive. This can be seen by considering that, for finite $t$ and $N$, the quantity $\langle Y_t^2\rangle - \langle Y_t\rangle^2$ is a function of the two variables $t$ and $N$. When both $t$ and $N$ are large, one can expect the following sort of scaling form:
\[
\langle Y_t^2\rangle - \langle Y_t\rangle^2 \simeq t^{2\omega}\, g(t/N^{\gamma}).
\tag{45}
\]
When $t \to \infty$ first and $N$ is large one knows from the above results that
\[
\langle Y_t^2\rangle - \langle Y_t\rangle^2 \sim t\, N^{-1/2}.
\tag{46}
\]
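The two limiting forms (43) and (44) can be compared numerically with the exact expression (37). The sketch below is only a consistency check under our own naming; it evaluates (37) through `math.lgamma` so that large $N$ does not require huge integers, using $(n-1)! = \Gamma(n)$.

```python
from math import exp, factorial, lgamma, pi, sqrt


def ring_diffusion(N: int, M: int) -> float:
    """Delta from (37), evaluated in floating point via log-Gamma functions."""
    log_delta = (
        lgamma(2 * N - 2) - lgamma(2 * M) - lgamma(2 * N - 2 * M)
        + 2.0 * (lgamma(M) + lgamma(N - M + 1) - lgamma(N))
    )
    return exp(log_delta)


def fixed_M_limit(M: int) -> float:
    """Limit (43): N -> infinity at fixed number of particles M."""
    return factorial(M - 1) ** 2 * 4 ** (M - 1) / factorial(2 * M - 1)


def fixed_density_asymptote(N: int, rho: float) -> float:
    """Leading behaviour (44): N -> infinity at fixed density rho = M / N."""
    return sqrt(pi) / 2.0 * (1.0 - rho) ** 1.5 / sqrt(rho * N)


if __name__ == "__main__":
    print(ring_diffusion(10**5, 2), fixed_M_limit(2))
    print(ring_diffusion(10**4, 5000), fixed_density_asymptote(10**4, 0.5))
```

For $M = 2$ the limit (43) is $2/3$ rather than 1, which illustrates the remark above that $\Delta$ keeps a dependence on $M$ even when the particles almost never meet.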
This of course gives some constraints on the exponents $\omega$ and $\gamma$ and on the behaviour of the function $g$ for large values of its argument:
\[
g(z) \sim z^{1-2\omega} \quad \text{as} \quad z \to \infty
\tag{47}
\]
with $\gamma(1-2\omega) = \tfrac12$. To determine the values of the exponents $\omega$ and $\gamma$ one needs another relation which can be obtained via the following additional argument: for large $N$, one can ask at what time $t$ does the tagged particle notice that it moves on a finite lattice of size $N$ instead of an infinite lattice. To estimate this time one can use the result [7, 8] that the longest relaxation time in the system scales like $N^{3/2}$. Therefore, $\gamma = \tfrac32$ and one obtains $\omega = \tfrac13$.

In the hope of being able to calculate more general time correlations in the steady state, one can wonder whether result (37) can be generalised. The simplest extension one can consider is the case of open boundary conditions. In that case if one denotes by $Y_t$ the number of particles which have entered the lattice at site 1 between times 0 and $t$ one can evaluate the current and diffusion constant
\[
\lim_{t\to\infty} \frac{\langle Y_t\rangle}{t} = J, \qquad \lim_{t\to\infty} \frac{\langle Y_t^2\rangle - \langle Y_t\rangle^2}{t} = \Delta,
\tag{48}
\]
by solving the problem exactly on the computer for all system sizes $1 \le N \le 10$. It seems very likely that, in the case $\alpha = \beta = 1$, $J$ and $\Delta$ are given by the following expressions:
\[
J = \frac{N+2}{4N+2}, \qquad
\Delta = \frac{3\,(4N+1)!\; N!\,(N+1)!\,[(N+2)!]^{2}}{2\,[(2N+1)!]^{3}\,(2N+3)!}.
\tag{49}
\]
(The expression for $J$ corresponds of course to that calculated in Section 2.) We have started to develop a matrix approach to establish this result, but the work is yet to be completed. At present we do not know whether the matrix approach can be adapted to prove it, so (49) should be considered as a conjecture.
4. More than One Species of Particle

One possible generalisation of the model is to the case of more than one species of particles. For example one can consider a system containing two species of particles, which we represent by 1 and 2, and holes represented by 0, in which the hopping rates of the two species of particles are
\[
\begin{aligned}
1\ 0 &\to 0\ 1 \quad \text{with rate } 1, \\
2\ 0 &\to 0\ 2 \quad \text{with rate } \gamma, \\
1\ 2 &\to 2\ 1 \quad \text{with rate } \delta.
\end{aligned}
\]
Even for the case of periodic boundary conditions the steady state of this model is in general non-trivial. Nevertheless, the steady state weights may be obtained by writing them in the form [19]
\[
\mathrm{trace}(X_1 X_2 \cdots X_N)
\tag{50}
\]
where $X_i = D$ if site $i$ is occupied by a 1 particle, $X_i = A$ if it is occupied by a 2 particle and $X_i = E$ if it is empty. The translational invariance of a product of matrices under the trace operation used in (50) reflects the translational invariance of the periodic boundary conditions. One can prove that (50) gives the steady state of this system provided that the matrices $D$, $A$ and $E$ satisfy the following algebra:
\[
DE = D + E, \qquad \delta DA = A, \qquad \gamma AE = A.
\tag{51}
\]
The last two of these equations are satisfied when $A$ is given by
\[
A = |V\rangle\langle W|
\tag{52}
\]
and
\[
D|V\rangle = \frac{1}{\delta}\,|V\rangle, \qquad \langle W|E = \frac{1}{\gamma}\,\langle W|.
\tag{53}
\]
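A special case that is not discussed in the text, but that gives a cheap check of (50) and (51), is $\gamma + \delta = 1$. Then the algebra admits the one-dimensional representation $D = 1/\delta$, $E = 1/\gamma$, $A = 1$, so (50) assigns the same weight to all configurations with the same numbers of particles of each species. The sketch below (all names are ours) verifies by brute force on a small ring that the uniform measure is then stationary for the hopping rates given above.

```python
from fractions import Fraction
from itertools import product


def scalar_representation_ok(gamma: Fraction, delta: Fraction) -> bool:
    """Check (51) for the 1x1 'matrices' D = 1/delta, E = 1/gamma, A = 1."""
    D, E, A = 1 / delta, 1 / gamma, Fraction(1)
    return D * E == D + E and delta * D * A == A and gamma * A * E == A


def net_flow_uniform(config, gamma: Fraction, delta: Fraction) -> Fraction:
    """Inflow minus outflow of probability at `config` on a ring (length >= 3)
    when every configuration carries the same weight 1."""
    rates = {(1, 0): Fraction(1), (2, 0): gamma, (1, 2): delta}
    n = len(config)
    inflow = outflow = Fraction(0)
    for i in range(n):
        pair = (config[i], config[(i + 1) % n])
        if pair in rates:
            # this bond can fire, so probability leaves `config`
            outflow += rates[pair]
        if pair[::-1] in rates:
            # the reversed pair could have fired in a predecessor configuration
            inflow += rates[pair[::-1]]
    return inflow - outflow


if __name__ == "__main__":
    g, d = Fraction(1, 3), Fraction(2, 3)
    print(scalar_representation_ok(g, d))
    print(all(net_flow_uniform(c, g, d) == 0 for c in product((0, 1, 2), repeat=5)))
```

For $\gamma = \delta = 1$ (first and second class particles) the same check fails, consistent with the statement that the steady state is in general non-trivial.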
So one can use any of the matrices $D$, $E$ presented for the case of open boundary conditions ((23) and (26)) and construct the matrix $A$ from the vectors $\langle W|$, $|V\rangle$ ((24) and (27)) with $\alpha$ replaced by $\gamma$ and $\beta$ replaced by $\delta$.

A case of the two species problem of particular interest is that of first and second class particles. This corresponds to $\gamma = \delta = 1$ so that all hopping rates are 1 and both first and second class particles hop forward when they have a hole to their right, but when a first class particle has a second class particle to its right the two particles interchange positions. In regions of a low density of first class particles and a high density of holes, a second class particle will tend to move forward whereas in a high density of first class particles and a low density of holes a second class particle will tend to move backward. For this reason, second class particles were first introduced in the context of an infinite system in order to track the position of shocks (recall that a shock is a change in the density of particles over a microscopic distance) [24, 25, 26]. On a finite system with periodic boundary conditions the steady state is unique and corresponds to a uniform density. However it has been shown that even in the case of periodic boundary conditions one can use a finite density of second class particles to probe the structure of shocks [26]. The idea is that from the point
of view of any particular second class particle, those second class particles to its right are equivalent to first class particles whereas second class particles to its left are equivalent to holes. Thus, by calculating the density profile of first and second class particles in a finite system as seen from a second class particle located at the origin, one can construct shock profiles by taking the limit of an infinite system and using the density of first class particles to the left of the second class particle as the profile to the left of the origin and the density of first and second class particles to the right of the second class particle as the profile to the right of the origin [19].

An interesting result concerns the case of a finite number of second class particles in an infinite uniform system of first class particles at density $\rho$. It can be shown that they form an algebraic bound state, i.e., the probability of finding them a distance $r$ apart decays like a power law in $r$. For example, in the case of two second class particles in an infinite system of first class particles at density $\rho$, the probability $P(r)$ of finding them a distance $r$ apart is given by [19]
\[
P(r) = \rho(1-\rho) \sum_{p=0}^{r-1} \rho^{2p} (1-\rho)^{2r-2p-2}\, \frac{r!\,(r-1)!}{p!\,(p+1)!\,(r-p)!\,(r-p-1)!}
\tag{54}
\]
which decays for large $r$ as
\[
P(r) \simeq \frac{1}{2\sqrt{\pi\rho(1-\rho)}}\, \frac{1}{r^{3/2}}.
\tag{55}
\]
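The sum (54) and its large-$r$ form (55) can be compared directly. In the sketch below (names ours) the sum is evaluated with exact rationals. At $\rho = \tfrac12$ the combinatorial factors are Narayana numbers, so $P(r)$ reduces to $C_r/4^r$ with $C_r$ the Catalan numbers; this gives an independent check of the transcription.

```python
from fractions import Fraction
from math import factorial, pi, sqrt


def bound_state_probability(r: int, rho: Fraction) -> Fraction:
    """P(r) from (54) for two second class particles at distance r >= 1."""
    f = factorial
    total = Fraction(0)
    for p in range(r):
        weight = rho ** (2 * p) * (1 - rho) ** (2 * r - 2 * p - 2)
        count = Fraction(f(r) * f(r - 1),
                         f(p) * f(p + 1) * f(r - p) * f(r - p - 1))
        total += weight * count
    return rho * (1 - rho) * total


def asymptotic_form(r: int, rho) -> float:
    """Large-r behaviour (55)."""
    return 1.0 / (2.0 * sqrt(pi * float(rho) * (1.0 - float(rho))) * r ** 1.5)


if __name__ == "__main__":
    half = Fraction(1, 2)
    for r in (1, 2, 10, 100):
        print(r, float(bound_state_probability(r, half)), asymptotic_form(r, half))
```

The slow $r^{-3/2}$ decay is what makes the mean distance $\sum_r r\,P(r)$ diverge even though $P(r)$ is normalisable.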
Thus, the two second class particles form a bound state although their average distance is infinite. Using the matrix approach one can also calculate [27] a diffusion constant $\Delta$ for a single second class particle in the presence of $M$ first class particles by considering $Y_t$ as the distance forward (the number of hops forward minus the number of hops backwards) travelled by the second class particle between time 0 and time $t$, and define a diffusion constant $\Delta$ through (36). One finds
\[
\Delta = 2\, \frac{(2N-3)!}{(2M+1)!\,(2N-2M-1)!} \left[\frac{M!\,(N-M-1)!}{(N-1)!}\right]^2
\times \left[(N-5)M(N-M-1) + (N-1)(2N-1)\right].
\tag{56}
\]
(Here the velocity of the second class particle is $v = (N-2M-1)/(N-1)$.) The formula simplifies when the $N \to \infty$ limit is taken with $M = N\rho$ and the leading order of (56) is
\[
\Delta \simeq \tfrac14 \left(N\pi\rho(1-\rho)\right)^{1/2}.
\tag{57}
\]
This large $N$ dependence contrasts with that of the equivalent formula (44) for the diffusion constant of a first class particle which behaves as $N^{-1/2}$. It is consistent with the idea [28] that in an infinite system a single second class particle displays superdiffusive fluctuations in its position ($\langle Y_t^2\rangle - \langle Y_t\rangle^2 \sim t^{4/3}$).

5. Conclusion

The matrix representation of the steady state leads to several exact results for the asymmetric exclusion process. We have discussed here the steady state of the system
with open boundary conditions [18], the diffusion constant for a system with periodic boundary conditions [20], and the steady state for two species of particles [19]. There are several other possible generalisations. For example, throughout this work we have been concerned with totally asymmetric exclusion, although one could equally consider the partially asymmetric exclusion problem where particles can hop either to the right with probability $p\,dt$ or to the left with probability $q\,dt$ (with $q = 1-p$). In that case one can show [18] that replacing (15) by
\[
pDE - qED = D + E
\tag{58}
\]
still gives the steady state. When $p = \tfrac12$ (the case of symmetric exclusion) it is known that with periodic boundary conditions detailed balance is satisfied, so that qualitatively different behaviour from the asymmetric case might be expected. For $p = \tfrac12$ the diffusion constant has previously been calculated [29] and the dependence on the system size is $N^{-1}$ as opposed to the $N^{-1/2}$ dependence of (44). This is related to the fact that both for the asymmetric and the symmetric cases, the fluctuations of $Y_t$ in the infinite system are subdiffusive ($\langle Y_t^2\rangle - \langle Y_t\rangle^2 \sim t^{2/3}$ for the asymmetric case and $\sim t^{1/2}$ for the symmetric case). We have made a numerical calculation of the diffusion constant of a tagged particle on systems of sizes $2 \le N \le 10$ for $p = \tfrac12(1+\epsilon)$ (on a ring of $N$ sites with $M$ particles). For small $\epsilon$ the first terms of the expansion seem to be given by
\[
\Delta = \frac{N-M}{M(N-1)} + \frac{\epsilon^2}{3}\, \frac{(M-1)(N-M)(N-M-1)}{M(N-1)^2}
- \frac{2\epsilon^4}{45}\, \frac{(M-1)(M-2)(N-M)(N-M-1)(N-M-2)}{M(N-1)^2(N-2)} + O(\epsilon^6).
\tag{59}
\]
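One way to gain some confidence in (59) is to set $\epsilon = 1$, which is the totally asymmetric case, and compare with the exact result (37). The sketch below does this with exact rationals; the function names are ours, and the coefficient $\epsilon^2/3$ of the second term is our reading of the formula. For $M = 1$, $M = 2$, $M = N-2$ and $M = N-1$ the two explicit correction terms already reproduce (37) exactly. This suggests, although the text does not claim it, that the higher-order terms vanish in those cases. For other values of $M$ the truncated series and (37) differ at $\epsilon = 1$.

```python
from fractions import Fraction
from math import factorial


def ring_diffusion_exact(N: int, M: int) -> Fraction:
    """Totally asymmetric diffusion constant, transcribed from (37)."""
    f = factorial
    prefactor = Fraction(f(2 * N - 3), f(2 * M - 1) * f(2 * N - 2 * M - 1))
    bracket = Fraction(f(M - 1) * f(N - M), f(N - 1))
    return prefactor * bracket ** 2


def ring_diffusion_expansion(N: int, M: int, eps: Fraction) -> Fraction:
    """First three terms of the small-epsilon expansion (59); needs N >= 3."""
    zeroth = Fraction(N - M, M * (N - 1))
    second = (Fraction(1, 3) * eps ** 2
              * Fraction((M - 1) * (N - M) * (N - M - 1), M * (N - 1) ** 2))
    fourth = (Fraction(2, 45) * eps ** 4
              * Fraction((M - 1) * (M - 2) * (N - M) * (N - M - 1) * (N - M - 2),
                         M * (N - 1) ** 2 * (N - 2)))
    return zeroth + second - fourth


if __name__ == "__main__":
    one = Fraction(1)
    for N, M in [(4, 2), (5, 3), (6, 2), (6, 3)]:
        print(N, M, ring_diffusion_expansion(N, M, one), ring_diffusion_exact(N, M))
```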
This result is so far only a conjecture based on the analysis of our data. One can see that in the limit of a finite density of particles on an infinite ring ($N \to \infty$ with $M = N\rho$), the terms are of order $1/N$, $\epsilon^2$, $\epsilon^4 N, \ldots$. Thus it appears that for large $N$ and small $\epsilon$ the diffusion constant should be of the form
\[
\Delta \sim \frac{1}{N}\, g(\epsilon^2 N)
\tag{60}
\]
with $g(x) \sim O(1)$ for $x = 0$ and $g(x) \sim x^{1/2}$ for large $x$, where the function $g$ would describe the crossover between the asymmetric and symmetric processes.

The asymmetric exclusion process is connected to several other problems of interest. First it can be mapped exactly onto a model of a growing interface in $(1+1)$ dimensions [6] by associating to each configuration $\{\tau_i\}$ of the particles a configuration of an interface: a particle at a site corresponds to a downwards step of the interface height of one unit whereas a hole corresponds to an upward step of one unit. The heights of the interface are thus defined by
\[
h_{i+1} - h_i = 1 - 2\tau_i.
\tag{61}
\]
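The mapping (61) can be sketched in a few lines. The code below uses our own names and 0-based lists, so `tau[k]` stands for $\tau_{k+1}$ and `h[k]` for $h_{k+1}$. It builds the height profile from an occupation list and shows that a single hop of a particle to the empty site on its right changes exactly one height, raising it by two units.

```python
def heights(tau, h0=0):
    """Interface heights from occupations via h[k+1] - h[k] = 1 - 2*tau[k]."""
    h = [h0]
    for occupied in tau:
        h.append(h[-1] + 1 - 2 * occupied)
    return h


def hop_right(tau, k):
    """Return a new configuration in which the particle at index k has hopped
    to index k + 1 (which must be empty; no wrap-around in this sketch)."""
    if not (tau[k] == 1 and tau[k + 1] == 0):
        raise ValueError("need a particle at k and a hole at k + 1")
    new = list(tau)
    new[k], new[k + 1] = 0, 1
    return new


if __name__ == "__main__":
    tau = [1, 0, 1, 1, 0, 0]
    print(heights(tau))                 # profile before the hop
    print(heights(hop_right(tau, 0)))   # the local minimum has become a maximum
```

The total height change across the system is $N - 2M$, the number of holes minus the number of particles.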
The dynamics of the asymmetric exclusion process in which a particle may interchange position with a neighbouring hole to the right, corresponds to an interface dynamics in which a downwards step followed by an upwards step may become an upwards step followed by a downward step. In other words, a growth
event occurs at any minimum of the interface height with probability $dt$, i.e., if $h_i(t) = h_{i+2}(t) = h_{i+1}(t) + 1$ then
\[
h_{i+1}(t+dt) = \begin{cases}
h_{i+1}(t) & \text{with probability } 1 - dt \\
h_{i+1}(t) + 2 & \text{with probability } dt.
\end{cases}
\tag{62}
\]
Otherwise $h_{i+1}(t)$ remains unchanged. A growth event turns a minimum of the surface height into a maximum; thus the system of hopping particles maps onto what is known as a single step growth model, meaning that the difference in heights of two neighbouring positions on the interface is always of magnitude one unit. Periodic boundary conditions for the particle problem with $M$ particles and $N-M$ holes correspond to an interface satisfying $h_{i+N} = h_i + N - 2M$, i.e., to helical boundary conditions with average slope $1 - 2M/N$. The case of open boundary conditions corresponds to special growth rules at the boundaries. Because of this equivalence, several results obtained for the asymmetric exclusion process can be translated into exactly computable properties of the growing interface [15].

As is well known [30], the problem of growing interfaces is equivalent to the problem of directed polymers in a random medium (which is known as first-passage percolation in the mathematical literature). It would be of interest to see what kind of quantities could be calculated exactly in the directed polymer problem through the mapping from the asymmetric exclusion process.

As well as the mapping to growth described above, other possible mappings from systems of hopping particles to growth [11] and to other models of physical interest exist. For example, repton models of diffusion of polymer chains and gel electrophoresis may be formulated in terms of exclusion processes with various numbers of species of particles [31, 32, 33, 34]. It would certainly be interesting to see whether these models could be attacked using similar techniques to those outlined here.

Another possible direction in which this work might be extended would be to examine the effects of disorder. Disorder could be introduced in a variety of ways, for example, the hopping rate of each particle could be a quenched random variable.
If the hopping rates took only two values and the particles did not overtake each other the disorder would be in the sequence of the particles. For any order of the particles we can describe the steady state in this case as it corresponds to the limit δ → 0 of the two species model discussed in Section 4. Then the problem would be to analyse the effect of the quenched disorder of the sequence on various properties such as the current and the diffusion constant. Lastly, a question we feel would be worthwhile answering concerns the relation of the matrix approach to other techniques that are commonly used in statistical mechanics. It is known that in the case of periodic boundaries [7, 8], or of parallel updating [16], the asymmetric exclusion model can be solved by means of the Bethe ansatz. It would certainly be instructive to better understand the link between the traditional Bethe ansatz approach and the matrix formulation we have used here.
Acknowledgements

Some of the results discussed here have been obtained in collaboration with E. Domany, V. Hakim, S. A. Janowsky, J. L. Lebowitz, D. Mukamel, V. Pasquier, and E. R. Speer. We thank them as well as D. Foster, C. Godrèche, C. Kipnis, K. Mallick, G. Schütz, and H. Spohn for useful discussions.
References

1. Spitzer, F. (1970). Interaction of Markov processes. Advances in Mathematics 5, 246–290.
2. Liggett, T. M. (1985). Interacting Particle Systems. Springer-Verlag, New York.
3. De Masi, A. and Presutti, E. (1991). Mathematical Methods for Hydrodynamical Behavior. Lecture Notes in Mathematics, Springer-Verlag, New York.
4. Ferrari, P. A. (1986). The simple exclusion process as seen from a tagged particle. Annals of Probability 14, 1277–1290.
5. De Masi, A., Kipnis, C., Presutti, E., and Saada, E. (1989). Microscopic structure at the shock in the simple asymmetric exclusion process. Stochastics and Stochastic Reports 27, 151–165.
6. Meakin, P., Ramanlal, P., Sander, L. M., and Ball, R. C. (1986). Ballistic deposition on surfaces. The Physical Review A34, 5091–5103.
7. Dhar, D. (1987). An exactly solved model for interfacial growth. Phase Transitions 9, 51.
8. Gwa, L. H. and Spohn, H. (1992). Bethe solution for the dynamical scaling exponent of the noisy Burgers equation. The Physical Review A46, 844–854.
9. Krug, J. (1991). Boundary-induced phase transitions in driven diffusive systems. Physical Review Letters 67, 1882–1885.
10. Janowsky, S. A. and Lebowitz, J. L. (1992). Finite-size effects and shock fluctuations in the asymmetric simple-exclusion process. The Physical Review A45, 618–625.
11. Kandel, D. and Mukamel, D. (1992). Defects, interface profile and phase transitions in growth models. Europhysics Letters 20, 325–329.
12. Krug, J. and Spohn, H. (1991). In Solids far from Equilibrium (C. Godrèche, ed.), Cambridge University Press, Cambridge.
13. Spohn, H. (1991). Large Scale Dynamics of Interacting Particles. Springer-Verlag, New York.
14. Derrida, B., Domany, E., and Mukamel, D. (1992). An exact solution of a one dimensional asymmetric exclusion model with open boundaries. Journal of Statistical Physics 69, 667–687.
15. Derrida, B. and Evans, M. R. (1993). Exact correlation functions in an asymmetric exclusion model with open boundaries. Journal de Physique I 3, 311–322.
16. Schütz, G. (1993). Generalized Bethe ansatz solution of a one-dimensional asymmetric exclusion process on a ring with blockage. Journal of Statistical Physics 71, 471–505.
17. Schütz, G. and Domany, E. (1993). Phase transitions in an exactly soluble one-dimensional exclusion process. Journal of Statistical Physics 72, 277–296.
18. Derrida, B., Evans, M. R., Hakim, V., and Pasquier, V. (1993). Exact solution of a 1D asymmetric exclusion model using a matrix formulation. Journal of Physics A: Mathematical and General 26, 1493–1517.
19. Derrida, B., Janowsky, S. A., Lebowitz, J. L., and Speer, E. R. (1993). Microscopic-shock profiles: exact solution of a non-equilibrium system. Europhysics Letters 22, 651–656; Exact solution of the totally asymmetric simple exclusion process: shock profiles. Journal of Statistical Physics, to appear.
20. Derrida, B., Evans, M. R., and Mukamel, D. (1993). Exact diffusion constant for one dimensional asymmetric exclusion models. Journal of Physics A: Mathematical and General, in press.
21. Hakim, V. and Nadal, J. P. (1983). Exact results for 2D directed animals on a strip of finite width. Journal of Physics A: Mathematical and General 16, L213–L218.
22. Klümper, A., Schadschneider, A., and Zittartz, J. (1991). Equivalence and solution of anisotropic spin-1 models and generalised t-J fermion models in one dimension. Journal of Physics A: Mathematical and General 24, L955–L959.
23. Fannes, M., Nachtergaele, B., and Werner, R. F. (1992). Finitely correlated states on quantum spin chains. Communications in Mathematical Physics 144, 443–490.
24. Andjel, E. D., Bramson, M., and Liggett, T. M. (1988). Shocks in the asymmetric simple exclusion process. Probability Theory and Related Fields 78, 231–247.
25. Boldrighini, C., Cosimi, G., Frigio, S., and Nuñes, M. G. (1989). Computer simulation of shock waves in the completely asymmetric simple exclusion process. Journal of Statistical Physics 55, 611–623.
26. Ferrari, P. A., Kipnis, C., and Saada, E. (1991). Microscopic structure of travelling waves for asymmetric simple exclusion process. Annals of Probability 19, 226–244.
27. Evans, M. R. and Derrida, B. Unpublished.
28. Spohn, H. Private communication.
29. Ferrari, P. A., Goldstein, S., and Lebowitz, J. L. (1985). Diffusion, mobility and the Einstein relation. In Statistical Physics and Dynamical Systems (J. Fritz, A. Jaffe, and D. Szász, ed.), Birkhäuser, Boston, 405.
30. Kardar, M., Parisi, G., and Zhang, Y.-C. (1986). Dynamic scaling of growing interfaces. Physical Review Letters 56, 889–892.
31. Rubinstein, M. (1987). Discretized model of entangled-polymer dynamics. Physical Review Letters 59, 1946–1949.
32. Duke, T. A. J. (1989). Tube model of field-inversion electrophoresis. Physical Review Letters 62, 2877–2880.
33. Widom, B., Viovy, J. L., and Defontaines, A. D. (1991). Repton model of gel electrophoresis and diffusion. Journal de Physique I 1, 1759–1784.
34. Leeuwen, J. M. J. van and Kooiman, A. (1992). The drift velocity in the Rubinstein–Duke model for electrophoresis. Physica A 184, 79–97.
DROPLET CONDENSATION IN THE ISING MODEL: MODERATE DEVIATIONS POINT OF VIEW
ROLAND L. DOBRUSHIN¹ and SENYA B. SHLOSMAN¹,²
Institute of Information Transmission Problems, Russian Academy of Science, 19 Yermolova Street, GSP–4, Moscow 101447, Russia
Abstract. The threshold for the condensation of the vapour of microscopic droplets into a macroscopic one is studied for the case of the $\nu$-dimensional Ising model. The parameter which drives the condensation is the amount of the condensing phase, and the critical value of it, which turns the condensation on, is found to be of the order of $|V_N|^{(\nu-1)/\nu}$, where $|V_N|$ is the volume of the system. The corresponding behaviour of large and moderate deviations is studied. The principle of large deviations in the strong form is obtained in the region where condensation does not take place.

Key words: Large deviations, moderate deviations, Ising model, droplet.
1. Introduction

In this talk we would like to address the following question: to what extent can the usual picture of droplet condensation be observed in the best understood model of statistical physics, the Ising model? By the droplet condensation picture we mean the process which takes place when one changes the concentration of the solution of one species in another. At a certain concentration threshold the solution reaches the saturation point, after which the solution cannot absorb the extra amount of the solvent, and the formation of a crystal takes place.

It turns out that the qualitative picture described above can indeed be observed in the low temperature Ising model, where the role of the two substances is played by the two different phases. The condensation takes place when the amount of the solvent exceeds a threshold which is equal to the volume of the system raised to a certain power smaller than one. From the point of view of probability theory, that threshold belongs to the so-called region of moderate deviations (see below), which explains the title of the paper.

From the mathematical point of view, the phenomenon of condensation is a question about the behavior of the probabilities of large and moderate deviations in the phase transition regime. It turns out that the condensation regime is precisely the regime where this behavior is different from the one given by the classical schemes of the probability theory of independent or weakly dependent random variables. So below we will give an overview of the relevant results of probability theory, and will compare them with the corresponding results for the Ising model.

The complete results will appear under the title Large and Moderate Deviations in the Ising Model in the forthcoming book Probability Contributions to Statistical Mechanics, edited by R. Dobrushin and published by the American Mathematical Society.

¹ Partially supported by the Russian fund for fundamental research under grant 93-011-1470.
² Also in Department of Mathematics, University of California, Irvine, CA 92717, U.S.A. Partially supported by the NSF under grant DMS 92-08029.
2. Ising Model

In this section we will fix some notation and recall results about the Ising model. Let $\mathbb{Z}^\nu$ be a $\nu$-dimensional integer lattice with sites $t = (t_1, \ldots, t_\nu)$, $t_i \in \mathbb{Z}$, $i = 1, \ldots, \nu$, and the norm
\[
|t| = |t_1| + \cdots + |t_\nu|, \qquad t \in \mathbb{Z}^\nu,
\tag{2.1}
\]
which defines a metric on $\mathbb{Z}^\nu$. For any $V \subseteq \mathbb{Z}^\nu$ we denote by $\Omega_V$ the set of all configurations $\{\sigma_t, t \in V\}$, $\sigma_t = \pm 1$. We reserve the notation $\Omega$ for the case when $V = \mathbb{Z}^\nu$. If $V' \subset V$, we use $\sigma_{V'} = \{\sigma_t, t \in V'\}$ to denote the restriction of the configuration $\sigma \in \Omega_V$ to $V'$. Let $|V|$ be the number of points in $V$, and $|\partial V|$ be the number of points in $V$ lying at the boundary of this volume.

Let $V \subset \mathbb{Z}^\nu$ be a finite subset of $\mathbb{Z}^\nu$, $\bar\sigma = \{\sigma_t, t \in \mathbb{Z}^\nu\}$ be a fixed configuration, and $\beta > 0$ be a real constant. In the usual way we define the energy function
\[
H_V(\sigma \mid \bar\sigma) = -\tfrac12 \sum_{\substack{s \ne t:\ s,t \in V, \\ |s-t| = 1}} \sigma_s \sigma_t \;-\; \sum_{\substack{s \in V,\ t \in V^c: \\ |s-t| = 1}} \sigma_s \sigma_t, \qquad \sigma \in \Omega_V,
\tag{2.2}
\]
and the Gibbs probabilities
\[
P_V^{\beta}(\sigma \mid \bar\sigma) = \frac{1}{Z_V(\beta, \bar\sigma)} \exp\{-\beta H_V(\sigma \mid \bar\sigma)\}, \qquad \sigma \in \Omega_V,
\tag{2.3}
\]
where the partition function satisfies
\[
Z_V(\beta, \bar\sigma) = \sum_{\sigma \in \Omega_V} \exp\{-\beta H_V(\sigma \mid \bar\sigma)\}.
\tag{2.4}
\]
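For a very small volume the definitions (2.2)–(2.4) can be evaluated by brute force. The sketch below is only an illustration with our own names: it takes $\nu = 2$, a volume given as a list of lattice sites, and a boundary configuration in which all outside spins are equal. The factor $\tfrac12$ in (2.2) compensates for counting each pair of neighbours inside $V$ twice.

```python
from itertools import product
from math import exp

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def energy(sigma, boundary=+1):
    """H_V(sigma | boundary) of (2.2) for nu = 2.

    `sigma` maps each site (x, y) of V to +1 or -1; every spin outside V
    is set to `boundary` (an assumption of this sketch).
    """
    H = 0.0
    for (x, y), s in sigma.items():
        for dx, dy in NEIGHBOURS:
            t = (x + dx, y + dy)
            if t in sigma:
                H -= 0.5 * s * sigma[t]   # ordered pairs inside V, hence the 1/2
            else:
                H -= s * boundary         # bond between V and its complement
    return H


def gibbs_distribution(sites, beta, boundary=+1):
    """All configurations on `sites`, their probabilities (2.3), and Z from (2.4)."""
    configs = [dict(zip(sites, spins))
               for spins in product((+1, -1), repeat=len(sites))]
    weights = [exp(-beta * energy(c, boundary)) for c in configs]
    Z = sum(weights)
    return configs, [w / Z for w in weights], Z


if __name__ == "__main__":
    square = [(0, 0), (0, 1), (1, 0), (1, 1)]
    configs, probs, Z = gibbs_distribution(square, beta=0.5)
    print(Z, max(probs))
```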
The probability distribution (2.3) is called the Ising model distribution in the volume $V$ with the boundary condition $\bar\sigma$ at inverse temperature $\beta$. Two particular boundary conditions will be mainly used:
\[
\bar\sigma^{+} = (\sigma_t \equiv 1,\ t \in \mathbb{Z}^\nu) \quad \text{and} \quad \bar\sigma^{-} = (\sigma_t \equiv -1,\ t \in \mathbb{Z}^\nu).
\tag{2.5}
\]
The boundary conditions $\bar\sigma^{\pm}$ are called $\pm$-boundary conditions. The corresponding Gibbs probabilities and the partition functions will be denoted by $P_V^{\pm,\beta}(\sigma)$ and $Z_V^{\pm}(\beta)$.
For any function $f(\sigma)$, $\sigma \in \Omega_W$, where $W \subseteq V$, we denote by
\[
\langle f \rangle_V^{+,\beta} = \sum_{\sigma \in \Omega_V} f(\sigma_W)\, P_V^{+,\beta}(\sigma),
\tag{2.6}
\]
its Ising model mean value in the volume $V$ with $+$ boundary conditions. In what follows we will suppose that the dimension $\nu$ satisfies the bound $\nu \ge 2$ since otherwise there is no phase transition. It is well known (see, e.g., [R]) that for any function $f(\sigma)$, $\sigma \in \Omega_W$, where $|W| < \infty$, and any sequence of volumes $V_1 \subset V_2 \subset \ldots$ such that $\bigcup_N V_N = \mathbb{Z}^\nu$, the limits
\[
\langle f \rangle^{\pm} = \langle f \rangle^{\pm,\beta,h} = \lim_{N \to \infty} \langle f \rangle^{\pm}_{V_N}
\tag{2.7}
\]
exist and do not depend on the choice of the sequence $V_N$. These collections of mean values define probability measures $P^{\pm} = P^{\pm,\beta}$ on $\Omega_{\mathbb{Z}^\nu}$ which describe stationary ergodic random fields and are called the limit pure $\pm$-Gibbs states with parameter $\beta$. Another well known fact is the existence of the critical values $\beta_\nu^{\mathrm{crit}}$, such that $0 < \beta_\nu^{\mathrm{crit}} < \infty$ and the limit Gibbs states $P^{+,\beta,h}$ and $P^{-,\beta,h}$ coincide if $\beta \le \beta_\nu^{\mathrm{crit}}$. In this case for any function $f$
\[
\langle f \rangle^{+,\beta} = \langle f \rangle^{-,\beta}.
\tag{2.8}
\]
This case corresponds to the physical situation of the absence of a phase transition. In the case $\beta > \beta_\nu^{\mathrm{crit}}$, when the phase transition does take place, the limit states $P^{\pm,\beta}$ are different. The corresponding mean values are connected via the symmetry relation:
\[
\langle f(\ast) \rangle^{+,\beta} = \langle f(-\ast) \rangle^{-,\beta}.
\tag{2.9}
\]
Finally, we define the magnetization to be
\[
m(\beta) = \langle \sigma_t \rangle^{+,\beta}.
\tag{2.10}
\]
For $\beta > \beta_\nu^{\mathrm{crit}}$ it is positive.

Another example of boundary conditions are periodic boundary conditions. They correspond to the case when instead of a box $V$ we consider a discrete torus $T_N = \mathbb{Z}^\nu / N\mathbb{Z}^\nu$ with a side $N$ and an obvious norm $|t|$. We define
\[
H^{\mathrm{per}}_{T_N}(\sigma) = -\tfrac12 \sum_{\substack{s \ne t:\ s,t \in T_N, \\ |s-t| = 1}} \sigma_s \sigma_t, \qquad \sigma \in \Omega_{T_N}.
\tag{2.11}
\]
We define the probabilities $P^{\mathrm{per},\beta}_{T_N}(\sigma)$ and partition functions $Z^{\mathrm{per}}_{T_N}(\beta)$ by analogs of the relations (2.3), (2.4). Similarly to (2.6) we introduce the mean value $\langle f \rangle^{\mathrm{per},\beta,h}_{T_N}$.
3. Droplet Condensation

To explain the condensation phenomenon we have to use the contour technique which is the basis of the mathematical theory of phase transitions of the first order. In the case of small $\beta$ the typical configuration of the Ising model is a chaotic mixture of pluses and minuses. In the limit case $\beta = 0$ one gets just the realization of a Bernoulli field. With the growth of $\beta$ a clustering of pluses and minuses begins. For $\beta$ large, when we have distinct plus and minus phases, a typical configuration of a realization of plus phase forms a sea of pluses with islands of minuses floating inside it (some of these islands have lakes of pluses inside and so on). In the opposite phase a typical realization is a continent of minuses with lakes of pluses inside it (and again with islands of minuses inside some of these lakes and so on).

To formalize this visual picture we shall use the notion of contours of a configuration. To be specific, we consider configurations in a finite volume with $+$ boundary conditions. Let a finite volume $V \subset \mathbb{Z}^\nu$ be given. We define the boundary of a configuration $\sigma \in \Omega_V$ to be the set of all unit plaquettes of the dual lattice $\mathbb{Z}^\nu_\ast = \mathbb{Z}^\nu + (\tfrac12, \ldots, \tfrac12) \subset \mathbb{R}^\nu$ which cross any of the bonds $(t_1, t_2)$ of the initial lattice such that $\sigma_{t_1} \ne \sigma_{t_2}$. (If $t$ lies outside $V$ we suppose that $\sigma_t = +1$.) The boundary of a configuration $\sigma$ splits up into connected components which are non-intersecting polyhedra and which will be called the contours of the configuration $\sigma$. The family of all the contours of a configuration $\sigma$ will be denoted by $\Gamma(\sigma)$. For any contour $\Gamma \in \Gamma(\sigma)$ we denote by $|\Gamma|$ the number of its $(\nu-1)$-dimensional faces and by $\mathrm{Int}\,\Gamma$ the set of lattice sites inside the corresponding polyhedron, and finally by $|\mathrm{Int}\,\Gamma|$ the number of points in this set. The interiors of external contours (i.e., contours which do not lie inside other ones) can be interpreted as droplets of minus phase inside the plus phase.
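The boundary of a configuration can be illustrated in $\nu = 2$, where the plaquettes of the dual lattice are unit segments. The sketch below (function name ours) only counts the total number of such segments for a configuration described by its set of minus sites, with all other sites taken to be plus; splitting the boundary into connected contours is not attempted here. For a single square droplet of side $\ell$ it gives $|\Gamma| = 4\ell$ and $|\mathrm{Int}\,\Gamma| = \ell^2$, i.e. a surface of the order of the volume to the power $(\nu-1)/\nu$.

```python
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def boundary_size(minus_sites):
    """Number of nearest-neighbour bonds joining a minus site to a plus site.

    `minus_sites` is a set of (x, y) lattice points carrying spin -1; every
    other site, inside or outside the volume, is taken to carry spin +1.
    """
    return sum(
        1
        for (x, y) in minus_sites
        for dx, dy in NEIGHBOURS
        if (x + dx, y + dy) not in minus_sites
    )


def square_droplet(side):
    """A side x side block of minus spins."""
    return {(x, y) for x in range(side) for y in range(side)}


if __name__ == "__main__":
    for side in (1, 2, 3, 10):
        droplet = square_droplet(side)
        print(side, len(droplet), boundary_size(droplet))
```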
We shall describe a typical large deviation configuration in the volume $V_N$ by using the conditional probabilities
\[
P_N^{+,b_N}(\sigma) = \frac{P_N^{+}(\sigma)}{\sum_{\sigma' \in \Omega_{V_N}:\ S_N(\sigma') = b_N} P_N^{+}(\sigma')},
\tag{3.1}
\]
induced by the condition $S_N(\sigma) = \sum_{t \in V_N} \sigma_t = b_N$. We shall assume that $b_N - |V_N| \equiv 0 \pmod 2$. (In the opposite case the denominator in (3.1) vanishes.) The following theorem describes some typical properties of such configurations.

Theorem 3.1. Suppose that some sequences of volumes $V_N$, satisfying the Van Hove condition, and integers $b_N$ are given. Suppose that either, for some $D > 0$ and $\beta > \beta_1(D)$,
\[
|V_N| \tanh D\beta \ge b_N \ge E_N^{+},
\tag{3.2}
\]
or, for $\beta \ge \beta_2$ with $\beta_2$ large enough,
\[
\lim_{N \to \infty} \frac{|b_N - E_N^{+}|}{|V_N|^{\nu/(\nu+1)}} = 0,
\tag{3.3}
\]
where $E_N^{+} = \langle S_N \rangle^{+,\beta}_{V_N}$ is the mean value of $S_N$ with respect to the considered Gibbs distribution. Then there exists a constant $C < \infty$ such that for the events
\[
A_N = \Bigl\{\sigma \in \Omega_{V_N} : \max_{\Gamma \in \Gamma(\sigma)} |\Gamma| \le C \ln |V_N| \Bigr\}
\tag{3.4}
\]
the following limit of probabilities is valid:
\[
\lim_{N \to \infty} \sum_{\sigma \in A_N} P_N^{+,b_N}(\sigma) = 1.
\tag{3.5}
\]
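On the other hand suppose that $\beta > \beta_1$, the box $V_N$ is contractible, with
\[
|\partial V_N| \le L\, |V_N|^{(\nu-1)/\nu}
\tag{3.6}
\]
for some $L < \infty$,
\[
\limsup_{N \to \infty} \frac{b_N - E_N^{+}}{|V_N|^{\nu/(\nu+1)}} < 0,
\tag{3.7}
\]
and for some constant $Q > 0$
\[
b_N \ge -E_N^{+} - Q\, |V_N|^{(2\nu-1)/2\nu}.
\tag{3.8}
\]
Then there exist constants $C < \infty$, $c > 0$ such that for the event
\[
\begin{aligned}
B_N = \bigl\{ \sigma \in \Omega_{V_N} :\ & \text{for some contour } \Gamma_0 \in \Gamma(\sigma) \text{ its surface } |\Gamma_0| > c\,(E_N^{+} - b_N)^{(\nu-1)/\nu}, \\
& \text{its volume } |\mathrm{Int}(\Gamma_0)| > c\,(E_N^{+} - b_N), \\
& \text{while for all other contours } \Gamma \in \Gamma(\sigma),\ \Gamma \ne \Gamma_0, \text{ their surfaces } |\Gamma| \le C \ln |V_N| \bigr\}
\end{aligned}
\tag{3.9}
\]
the limit of the probabilities satisfies
\[
\lim_{N \to \infty} \sum_{\sigma \in B_N} P_N^{+,b_N}(\sigma) = 1.
\tag{3.10}
\]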
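Finally suppose that $\beta > \beta_1$, that condition (3.6) holds, that
\[
\limsup_{N \to \infty} \frac{b_N + E_N^{+}}{|V_N|^{(2\nu-1)/2\nu}} = -\infty,
\tag{3.11}
\]
and that
\[
\liminf_{N \to \infty} \frac{b_N}{|V_N|} > -1.
\tag{3.12}
\]
Then there exist constants $C < \infty$, $c > 0$, such that for the event
\[
\begin{aligned}
B'_N = \bigl\{ \sigma \in \Omega_{V_N} :\ & \text{for some contour } \Gamma_0 \in \Gamma(\sigma) \text{ its volume } |\mathrm{Int}\,\Gamma_0| > |V_N| - c\,|\partial V_N|, \\
& \text{while for all other contours } \Gamma \in \Gamma(\sigma),\ \Gamma \ne \Gamma_0, \text{ their surfaces } |\Gamma| \le C \ln |V_N| \bigr\}
\end{aligned}
\tag{3.13}
\]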
the limit of the probabilities satisfies
\[
\lim_{N \to \infty} \sum_{\sigma \in B'_N} P_N^{+,b_N}(\sigma) = 1.
\tag{3.14}
\]
This theorem can be interpreted in the following way. If a value of $S_N$ is close to its mean value $E_N^{+}$, then all the droplets of the opposite minus phase have microscopic sizes (they are smaller than $C \ln |V_N|$). If we reduce the number of minus-particles (by considering the values $b_N > E_N^{+}$), these droplets would further shrink in size and diminish in number, but there will be no qualitative change in the picture. The total deviation of $S_N$ is caused by deviations of the sizes of separate droplets, which are almost independent, and so its probability behaves in the same way as in the classical case of independent variables. If we increase the number of minus-particles (the case $b_N < E_N^{+}$), the same situation holds, provided the order of the number of additional minus-particles is smaller than $|V_N|^{(\nu-1)/\nu}$. But after we pass this threshold, the microscopic droplets can no longer absorb all the additional minus-particles, so a large minus-droplet of a macroscopic size is formed, which contains the main part of these particles. It is the moment of the condensation of the minus-phase.

At that level of speculation it is possible to explain why the scale of the order of $|V_N|^{(\nu-1)/\nu}$ is critical. If the number of additional minus-particles $E_N^{+} - b_N$ is approximately equal to $|V_N|^{\kappa}$, then the classical mechanism, connected with the collective behavior of many small droplets, predicts the deviation probability to be of the order of
\[
\exp\left\{ -\frac{(E_N^{+} - b_N)^2}{2 D_N^{+}} \right\} \approx \exp\{ -\bar c\, |V_N|^{2\kappa - 1} \},
\tag{3.15}
\]
where $D_N^{+}$ is the variance of the random variable $S_N$. On the other hand, a large droplet, containing approximately $|V_N|^{\kappa}$ particles, has the surface of the order $|V_N|^{\kappa(\nu-1)/\nu}$, and so its probability has the order
\[
\exp\{ -\tilde c\, |V_N|^{\kappa(\nu-1)/\nu} \}.
\tag{3.16}
\]
The exponents in (3.15) and (3.16) are equal if $\kappa = \nu/(\nu+1)$.

At the level $b_N \approx -E_N^+$ the picture changes again. The volume of the large contour containing the minus-phase reaches its maximal possible value $|V_N|$ and so cannot grow further. So the droplets of the plus-phase inside the minus-phase begin to shrink. Typical configurations inside the minus-phase become symmetric to typical configurations of the plus-phase via the transformation $b_N \to -b_N$.
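The balance between the exponents in (3.15) and (3.16) can be checked with exact rational arithmetic. The following is a minimal sketch; the function names are illustrative and not part of the paper.

```python
from fractions import Fraction


def critical_kappa(nu: int) -> Fraction:
    """Solve 2*kappa - 1 = kappa*(nu - 1)/nu for kappa."""
    # kappa * (2 - (nu - 1)/nu) = 1
    return Fraction(1) / (2 - Fraction(nu - 1, nu))


def exponents(kappa: Fraction, nu: int):
    """Return the exponents of |V_N| appearing in (3.15) and (3.16)."""
    collective = 2 * kappa - 1               # many small droplets, (3.15)
    droplet = kappa * Fraction(nu - 1, nu)   # one large droplet, (3.16)
    return collective, droplet


for nu in (2, 3, 4):
    kappa = critical_kappa(nu)
    print(nu, kappa, exponents(kappa, nu))
```

For every dimension tried, the solution coincides with $\nu/(\nu+1)$, and both exponents agree at that value.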
4. Classical Behavior of Large and Moderate Deviations

In this section we will describe the classical behavior of the large and moderate deviations probabilities for the sums of random variables. The description we give differs a little from the standard one, and contains more information, so we name it the ‘classical large deviation principle in the strong form’. Since we are Ising model oriented, we will consider random variables taking values $+1$ and $-1$ only.
Consider a random variable
\[ S_N = \xi_1^N + \cdots + \xi_N^N, \tag{4.1} \]
where the random variables $\xi_i^N$ take values $+1$ and $-1$. Let
\[ P_N(b) = P\{S_N = b\}, \tag{4.2} \]
where $b$ are integers such that
\[ b - N \equiv 0 \pmod 2. \tag{4.3} \]
We consider the logarithmic generating function
\[ H_N(z) = \ln \sum_b P_N(b)\, e^{zb} \tag{4.4} \]
and suppose that for each real $\lambda \in R^1$ there exists a complex neighborhood $O_\lambda$ of $\lambda$ in $C^1$ and a constant $C_\lambda$ such that for all large enough $N$
\[ |H_N(z)| \le N C_\lambda \quad \text{if } z \in O_\lambda. \tag{4.5} \]
We suppose also that the limit
\[ \lim_{N\to\infty} \frac{1}{N} H_N(z) = H(z) \tag{4.6} \]
exists in the sense of the uniform convergence in $O_\lambda$ for all $\lambda$. This limit is a holomorphic function of $z \in O_\lambda$. Let for any $\lambda \in R^1$
\[ G^n_{N,\lambda} = \left.\frac{d^n H_N(z)}{dz^n}\right|_{z=\lambda}. \tag{4.7} \]
For $\lambda = 0$ these quantities are called the semi-invariants (or cumulants) of order $n$ of the variables (4.1). It follows from (4.5) and well known properties of holomorphic functions that the limits exist
\[ G^n_\lambda = \lim_{N\to\infty} \frac{1}{N} G^n_{N,\lambda} = \left.\frac{d^n H(z)}{dz^n}\right|_{z=\lambda}. \tag{4.8} \]
We shall omit the index $\lambda$ in the notation $G^n_{N,\lambda}$ when $\lambda = 0$ and let
\[ E_N = G^1_N, \quad E = G^1, \quad D_N = G^2_N, \quad D = G^2. \tag{4.9} \]
The quantities $E_N$ and $D_N$ are, of course, the mean value and the variance of $S_N$. It is easy to see that $G^2_{N,\lambda} \ge 0$ for all $\lambda \in R^1$. So the functions $H_N(\lambda)$ and $H(\lambda)$ are concave functions of $\lambda \in R^1$. We suppose additionally that $H_N(\lambda)$ and $H(\lambda)$ are strictly concave. (For the functions $H_N(\lambda)$ this property holds under the condition that the probability distribution $P_N(b)$ is not concentrated in a single point; for the limiting function $H(\lambda)$ it is an essential new restriction.) This assumption implies in particular that the limiting variance $D$ is positive. Our hypothesis about the asymptotic behavior of the functions $N^{-1}H_N(\lambda)$ and $H(\lambda)$ at $\lambda = \pm\infty$ is that their derivatives $N^{-1}G^1_{N,\lambda}$ and $G^1_\lambda$ go to $\pm 1$ when $\lambda \to \pm\infty$. We define the action functions
\[ I_N(b) = \sup_{\lambda \in R^1} (\lambda b - H_N(\lambda)), \qquad I(\hat b) = \sup_{\lambda \in R^1} (\lambda \hat b - H(\lambda)). \tag{4.10} \]
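Under the strict concavity hypothesis introduced above, there exist unique values $\lambda_N(b)$, $b \in (-N, N)$, and $\lambda(\hat b)$, $\hat b \in (-1, 1)$, such that
\[ b = G^1_{N,\lambda_N(b)} = \left.\frac{dH_N(\lambda)}{d\lambda}\right|_{\lambda=\lambda_N(b)}, \qquad \hat b = \left.\frac{dH(\lambda)}{d\lambda}\right|_{\lambda=\lambda(\hat b)}, \tag{4.11} \]
and so
\[ I_N(b) = \lambda_N(b)\, b - H_N(\lambda_N(b)), \qquad I(\hat b) = \lambda(\hat b)\, \hat b - H(\lambda(\hat b)), \tag{4.12} \]
and
\[ I(\hat b) = \lim_{N\to\infty} \frac{1}{N} I_N(N \hat b). \tag{4.13} \]
So the implicit function theorem implies that $I_N(b)$ and $I(\hat b)$ can be extended to holomorphic functions in a neighborhood of $R^1$. It is easy to see that
\[ I_N(E_N) = I(E) = 0, \qquad \left.\frac{dI_N(b)}{db}\right|_{b=E_N} = \left.\frac{dI(\hat b)}{d\hat b}\right|_{\hat b=E} = 0. \tag{4.14} \]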
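The definitions (4.10)–(4.14) can be made concrete on the simplest example. The sketch below assumes independent variables with $P\{\xi_i = +1\} = p$ (an illustrative choice, not a case treated in this paper), for which $H_N(\lambda) = N \ln(p e^{\lambda} + (1-p) e^{-\lambda})$. It solves (4.11) by bisection and evaluates (4.12); all function names are hypothetical.

```python
import math


def log_gen(lam: float, n: int, p: float) -> float:
    """H_N(lambda) for n independent +-1 variables with P(+1) = p (assumed model)."""
    return n * math.log(p * math.exp(lam) + (1 - p) * math.exp(-lam))


def d_log_gen(lam: float, n: int, p: float) -> float:
    """First derivative G^1_{N,lambda} of H_N."""
    up, down = p * math.exp(lam), (1 - p) * math.exp(-lam)
    return n * (up - down) / (up + down)


def lam_of_b(b: float, n: int, p: float, lo: float = -30.0, hi: float = 30.0) -> float:
    """Solve (4.11), dH_N/dlambda = b, by bisection (the derivative is increasing)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if d_log_gen(mid, n, p) < b:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def action(b: float, n: int, p: float) -> float:
    """I_N(b) = lambda_N(b) * b - H_N(lambda_N(b)), as in (4.12)."""
    lam = lam_of_b(b, n, p)
    return lam * b - log_gen(lam, n, p)


n, p = 100, 0.6
mean = n * (2 * p - 1)          # E_N for this model
print(action(mean, n, p))       # close to 0, in line with (4.14)
print(action(mean + 20, n, p))  # positive away from the mean
```

For this model the action function has the closed form $N[q \ln(q/p) + (1-q)\ln((1-q)/(1-p))]$ with $q = (1 + b/N)/2$, which gives an independent check of the numerical Legendre transform.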
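So the Taylor expansions of $I_N(b)$ and $I(\hat b)$ at $E_N$, $E$ have the following form:
\[ I_N(b) = \frac{(b - E_N)^2}{2D_N} + \sum_{j=3}^{\infty} \frac{K_N^j}{j!} \left(\frac{b - E_N}{D_N}\right)^j, \tag{4.15} \]
\[ I(\hat b) = \frac{(\hat b - E)^2}{2D} + \sum_{j=3}^{\infty} \frac{K^j}{j!} \left(\frac{\hat b - E}{D}\right)^j, \tag{4.16} \]
where the limits satisfy
\[ \lim_{N\to\infty} \frac{1}{N} K_N^j = K^j. \tag{4.17} \]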
It is easy to see that
\[ K_N^j = D_N\, Q_j\!\left(\frac{G_N^3}{D_N}, \ldots, \frac{G_N^j}{D_N}\right), \qquad j = 3, 4, \ldots, \tag{4.18} \]
where $Q_j$ is a polynomial of $j - 2$ variables. In particular,
\[ K_N^3 = -G_N^3, \qquad K_N^4 = -G_N^4 + 3 (G_N^3)^2 (D_N)^{-1}, \tag{4.19} \]
and so on. Similar relations hold between the limiting coefficients $K^j$ and the limiting semi-invariants $G^j$. Observe that it follows from (4.11)–(4.12) that the second derivatives satisfy
\[ \frac{d^2 I_N(b)}{db^2} = \left(\frac{d^2 H_N(\lambda_N(b))}{d\lambda^2}\right)^{-1}, \qquad \frac{d^2 I(\hat b)}{d\hat b^2} = \left(\frac{d^2 H(\lambda(\hat b))}{d\lambda^2}\right)^{-1}. \tag{4.20} \]
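The coefficient formulas (4.19) and the relation between the second derivatives of the action and of the logarithmic generating function can be tested numerically. The sketch assumes independent $\pm 1$ variables with $P\{\xi_i = +1\} = p$ (an illustrative model, not one from the paper), whose cumulants and action function are known in closed form; the helper names are hypothetical.

```python
import math


def cumulants(n: int, p: float):
    """E_N, D_N, G^3_N, G^4_N for n independent +-1 variables with P(+1) = p."""
    v = p * (1 - p)
    return (n * (2 * p - 1), n * 4 * v, n * 8 * v * (1 - 2 * p), n * 16 * v * (1 - 6 * v))


def exact_action(b: float, n: int, p: float) -> float:
    """Closed form of I_N(b) for the assumed model (relative entropy form)."""
    q = 0.5 * (1 + b / n)
    return n * (q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p)))


def series_action(b: float, n: int, p: float, order: int = 4) -> float:
    """Expansion (4.15) truncated at j = order, with K^3, K^4 taken from (4.19)."""
    e, d, g3, g4 = cumulants(n, p)
    k = {3: -g3, 4: -g4 + 3 * g3 ** 2 / d}
    x = (b - e) / d
    total = (b - e) ** 2 / (2 * d)
    for j in range(3, order + 1):
        total += k[j] / math.factorial(j) * x ** j
    return total


def second_difference(b: float, n: int, p: float, h: float = 1.0) -> float:
    """Central second difference of the exact action, approximating I_N''(b)."""
    return (exact_action(b + h, n, p) - 2 * exact_action(b, n, p)
            + exact_action(b - h, n, p)) / h ** 2


n, p = 1000, 0.6
b = cumulants(n, p)[0] + 20
print(abs(series_action(b, n, p, order=2) - exact_action(b, n, p)))  # quadratic term only
print(abs(series_action(b, n, p, order=4) - exact_action(b, n, p)))  # with K^3 and K^4
# For this model the tilted variance at lambda_N(b) equals n * (1 - (b/n)**2).
print(second_difference(b, n, p) * n * (1 - (b / n) ** 2))           # close to 1
```

Including the $K_N^3$ and $K_N^4$ terms reduces the truncation error by several orders of magnitude at this value of $b$, and the product in the last line is close to one, as the inverse relation between the two second derivatives requires.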
Because we have supposed that the functions $H_N(\lambda)$ and $H(\lambda)$ are strictly concave, the relation (4.20) implies that the same holds for the functions $I_N(b)$ and $I(\hat b)$.

We say that the ‘classical large deviation principle in the strong form’ holds for the sequence $S_N$ if for any sequence $b_N$, $N = 1, 2, 3, \ldots$, satisfying the conditions
\[ b_N - N \equiv 0 \pmod 2 \tag{4.21} \]
and
\[ \limsup_{N\to\infty} \frac{|b_N|}{N} < 1 \tag{4.22} \]
the following relation holds:
\[ P_N(b_N) = 2 \left(2\pi G^2_{N,\lambda_N(b_N)}\right)^{-1/2} \exp\{-I_N(b_N)\}(1 + o_N(1)), \tag{4.23} \]
where $o_N(1) \to 0$ as $N \to \infty$. It follows from (4.23) and (4.12) that if the sequence $b_N$ satisfies the condition (4.21) and the condition
\[ \lim_{N\to\infty} \frac{b_N}{N} = \hat b, \tag{4.24} \]
where $|\hat b| < 1$, then
\[ \lim_{N\to\infty} \frac{1}{N} \ln P_N(b_N) = -I(\hat b). \tag{4.25} \]
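The strong form (4.23) and its consequence (4.25) can be observed directly for independent symmetric $\pm 1$ variables, where $P_N(b)$ is a binomial probability. This is only an illustrative sketch under that assumption: for this model $I_N(b) = N[\tfrac{1+x}{2}\ln(1+x) + \tfrac{1-x}{2}\ln(1-x)]$ and $G^2_{N,\lambda_N(b)} = N(1 - x^2)$ with $x = b/N$. The function names are hypothetical.

```python
import math


def log_exact(n: int, b: int) -> float:
    """ln P_N(b) for n fair +-1 variables: binomial(n, (n+b)/2) / 2**n."""
    k = (n + b) // 2
    return (math.lgamma(n + 1) - math.lgamma(k + 1)
            - math.lgamma(n - k + 1) - n * math.log(2))


def limit_action(x: float) -> float:
    """Limiting action I(x) of the fair-coin model, |x| < 1."""
    return 0.5 * (1 + x) * math.log(1 + x) + 0.5 * (1 - x) * math.log(1 - x)


def log_strong_form(n: int, b: int) -> float:
    """Logarithm of the main term on the right-hand side of (4.23)."""
    x = b / n
    tilted_variance = n * (1 - x * x)
    return (math.log(2) - 0.5 * math.log(2 * math.pi * tilted_variance)
            - n * limit_action(x))


for n in (100, 1000, 10000):
    b = n // 2                       # b - n is even for these n, as (4.21) requires
    ratio = math.exp(log_exact(n, b) - log_strong_form(n, b))
    print(n, ratio, log_exact(n, b) / n, -limit_action(0.5))
```

The ratio tends to one, which is the content of (4.23), while $N^{-1} \ln P_N(b_N)$ approaches $-I(\hat b)$ much more slowly because (4.25) discards the prefactor.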
This is a familiar form of the ‘large deviation principle’ as it appears usually in the literature. Of course, it is weaker than (4.23). On the other hand, we can consider the case when
\[ \lim_{N\to\infty} \frac{|b_N - E_N|}{N} = 0, \tag{4.26} \]
which is called the ‘moderate deviations’ case. Then $\lambda_N(b_N) \to 0$ as $N \to \infty$, and so the classical large deviation principle in the strong form implies that
\[ P_N(b_N) = 2 (2\pi D_N)^{-1/2} \exp\{-I_N(b_N)\}(1 + o_N(1)). \tag{4.27} \]
If an even stronger condition holds, viz. for some $k = 2, 3, \ldots$
\[ \lim_{N\to\infty} \frac{|b_N - E_N|}{N^{k/(k+1)}} = 0, \tag{4.28} \]
then
\[ P_N(b_N) = q_N(b_N) \exp\left\{-\sum_{j=3}^{k} \frac{K_N^j}{j!} \left(\frac{b_N - E_N}{D_N}\right)^j\right\}(1 + o_N(1)), \tag{4.29} \]
where
\[ q_N(b_N) = 2 (2\pi D_N)^{-1/2} \exp\left\{-\frac{1}{2} \frac{(b_N - E_N)^2}{D_N}\right\} \tag{4.30} \]
is the usual normal approximation for the probabilities $P_N(b_N)$. In the case $k = 2$ (‘small deviation case’) it implies that
\[ P_N(b_N) = q_N(b_N)(1 + o_N(1)) \tag{4.31} \]
if
\[ \lim_{N\to\infty} \frac{|b_N - E_N|}{N^{2/3}} = 0, \tag{4.32} \]
and this is a natural region of the validity of the local central limit theorem. The equations (4.25) and (4.15), and the positivity of the limiting variance $D$ imply a normal approximation in the domain of moderate deviations (4.26)
\[ \ln P_N(b_N) = \ln q_N(b_N)(1 + o_N(1)) = -\frac{1}{2} \frac{(b_N - E_N)^2}{D_N}(1 + o_N(1)). \tag{4.33} \]
In the domain of large deviations (4.24) with $b \ne 0$ we can reduce (4.27) further to
\[ \ln(-\ln P_N(b_N)) = \ln(-\ln q_N(b_N))(1 + o_N(1)) = \ln N\,(1 + o_N(1)). \tag{4.34} \]
So the normal approximation predicts correctly the order of the exponential asymptotics of the large deviation probabilities.
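The hierarchy (4.29)–(4.31) can be illustrated on independent symmetric $\pm 1$ variables, an assumed toy model for which $E_N = 0$, $D_N = N$, $G^3_N = 0$, $G^4_N = -2N$, and hence $K^3_N = 0$, $K^4_N = 2N$ by (4.19). The sketch compares the plain normal approximation with the $k = 4$ correction; the function names are illustrative.

```python
import math


def log_exact(n: int, b: int) -> float:
    """ln P_N(b) for n fair +-1 variables."""
    k = (n + b) // 2
    return (math.lgamma(n + 1) - math.lgamma(k + 1)
            - math.lgamma(n - k + 1) - n * math.log(2))


def log_normal(n: int, b: int) -> float:
    """ln q_N(b) from (4.30) with E_N = 0 and D_N = n."""
    return math.log(2) - 0.5 * math.log(2 * math.pi * n) - b * b / (2 * n)


def log_corrected(n: int, b: int, k: int = 4) -> float:
    """Logarithm of the main term of (4.29) with K^3_N = 0 and K^4_N = 2n."""
    coeff = {3: 0.0, 4: 2.0 * n}
    correction = sum(coeff[j] / math.factorial(j) * (b / n) ** j
                     for j in range(3, k + 1))
    return log_normal(n, b) - correction


n = 10000
for b in (100, 1000):   # b = 100 is below n**(2/3); b = 1000 is above it
    plain = math.exp(log_exact(n, b) - log_normal(n, b))
    fixed = math.exp(log_exact(n, b) - log_corrected(n, b))
    print(b, plain, fixed)
```

In the small deviation range (4.32) the plain normal approximation already has a ratio close to one, while for larger $b$ the fourth-order term is needed to bring the ratio back near one.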
5. Classical Behavior of Probabilities of Deviations for the Ising Model

We fix a sequence of finite volumes $V_1 \subset V_2 \subset \cdots \subset Z^\nu$, such that $\bigcup_N V_N = Z^\nu$ and the following Van Hove condition holds:
\[ \lim_{N\to\infty} \frac{|\partial V_N|}{|V_N|} = 0. \tag{5.1} \]
For $\sigma = (\sigma_t, t \in V_N) \in \Omega_{V_N}$ we consider
\[ S_N = S_N(\sigma) = \sum_{t \in V_N} \sigma_t \tag{5.2} \]
and we will study the large deviation behavior of this random variable. We will suppose that
\[ b_N - |V_N| \equiv 0 \pmod 2, \qquad \limsup_{N\to\infty} \frac{|b_N|}{|V_N|} < 1. \tag{5.3} \]
The following probability distributions for $\sigma \in \Omega_{V_N}$ in (5.2) will be considered: the Ising model distribution in $V_N$ with $+$-boundary conditions $\bar\sigma^+ = (\sigma_t \equiv +1)$. We let
\[ P_N^+(b_N) = P^{\beta,+}_{V_N}(\sigma : S_N(\sigma) = b_N). \tag{5.4} \]
Similar notation will be used for other boundary conditions also.
Theorem 5.1. Suppose that the Van Hove condition (5.1) and the conditions (5.3) are fulfilled. There exists a value $\beta_0 > 0$ such that for the Ising model with the inverse temperature $\beta \le \beta_0$ the classical large deviation principle in the strong form (see Section 4) holds together with its implications (4.25), (4.27), (4.29), (4.31), (4.33), (4.34). This limit action function is a strictly concave holomorphic function of $\hat b$.
6. The Behavior of Deviations in the Phase Transition Regime

As we have mentioned already, the behavior of the large deviations in the phase transition regime is not classical. In particular it is known ([I]) that the function $H(z)$ is not holomorphic at $z = 0$ if $\beta$ is large enough, contrary to the functions $H_N(z)$, which are holomorphic in some neighborhoods around zero, even though these neighborhoods shrink to zero as $N \to \infty$. The same holds for the functions $I(z)$ and $I_N(z)$. Still, we shall use the coefficients $K_N^j$ defined by the semi-invariants $G_N^j$ by the relations (4.18)–(4.19) even in the phase transition region.

The statements of the present section use the notion of a contractible volume. By such we mean any finite volume $V \subset Z^\nu$ such that the complement $V^c$ is connected. We will sometimes use the following additional restriction for a sequence of finite volumes $V_N$, which will be called a thickness condition: there exists a constant $L < \infty$ such that for any $N$ and any integer $K \le |V_N|$ there exists a contractible subvolume $V_N' \subset V_N$ such that $|V_N'| = K$ and
\[ |\partial V_N'| \le L |V_N'|^{(\nu-1)/\nu}. \tag{6.1} \]
This restriction is a weak one. For example if $V_N$ is a sequence of parallelepipeds $\{t = (t_1, \ldots, t_\nu) : 0 \le t_i \le l_i,\ i = 1, \ldots, \nu\}$ then the thickness condition (and also the Van Hove condition and the contractibility condition) are fulfilled if all the ratios $l_i / l_j$, $i, j = 1, \ldots, \nu$ are uniformly bounded.

Theorem 6.1. Suppose that a sequence of contractible volumes $V_N$ and a sequence of integers $b_N$ are given such that the Van Hove condition (5.1) and the conditions (5.3) are fulfilled. Consider the $+$-boundary conditions. Let $D > 0$ and
\[ |V_N| \tanh D\beta \ge b_N \ge E_N^+, \tag{6.2} \]
where
\[ E_N^+ = \langle S_N \rangle^{+,\beta}_{V_N} \tag{6.3} \]
is the mean value of $S_N$ with respect to the considered Gibbs distribution (cf. the notations (4.9) and (2.6)). Then for the random variable $S_N$ defined by (5.2) the classical large deviation principle in the strong form holds, i.e., the relation (4.23) holds, together with (4.25), (4.27), (4.29), (4.31), (4.33), (4.34) (which in the situation of Section 4 were its corollaries) provided $\beta \ge \beta_1(D)$. The limit action function $I(\hat b)$ (see (4.13)) defined for the corresponding Gibbs distribution is a holomorphic strictly concave function of $\hat b$ in the interval
\[ \hat b > m(\beta). \tag{6.4} \]
In the opposite case
\[ b_N - E_N^+ < 0 \tag{6.5} \]
for any integer $k$ such that
\[ k \le \nu \tag{6.6} \]
and for values $b_N$ satisfying
\[ \lim_{N\to\infty} \frac{|b_N - E_N^+|}{|V_N|^{k/(k+1)}} = 0, \tag{6.7} \]
the equality (4.29) and its implication (4.31) hold provided $\beta \ge \beta_2$ with $\beta_2$ large enough. Suppose now additionally that the thickness condition (6.1) is fulfilled. Then in the complementary region
\[ \limsup_{N\to\infty} \frac{b_N - E_N^+}{|V_N|^{\nu/(\nu+1)}} < 0, \tag{6.8} \]
if also for some constant $Q > 0$
\[ b_N \ge -E_N^+ - Q |V_N|^{(2\nu-1)/2\nu}, \tag{6.9} \]
there exist constants $\underline{c} > 0$, $\overline{c} < \infty$ depending on $Q$, the sequence of volumes $V_N$ and the value of inverse temperature $\beta_2$, such that for all $N$ and $\beta \ge \beta_2$
\[ \underline{c} \left(E_N^+ - b_N\right)^{(\nu-1)/\nu} \le -\ln P_N^+(b_N) \le \overline{c} \left(E_N^+ - b_N\right)^{(\nu-1)/\nu}. \tag{6.10} \]
Finally, if
\[ \limsup_{N\to\infty} \frac{b_N + E_N^+}{|V_N|^{(2\nu-1)/2\nu}} = -\infty \tag{6.11} \]
and
\[ \limsup_{N\to\infty} \frac{b_N}{|V_N|} > -\tanh D\beta \tag{6.12} \]
then for $\beta \ge \beta_1(D)$
\[ \lim_{N\to\infty} \frac{-\ln P_N^+(b_N)}{I_N(-b_N)} = 1, \tag{6.13} \]
where $I_N(b)$ is the action function defined by the relation (4.12) for the probability distribution in question. Similar statements hold for the case of the limit pure $+$-state. The limit action functions for the $+$-boundary conditions and for the limit pure $+$-state coincide and do not depend on the choice of the sequence of volumes $V_N$ (cf. (5.5)).

To compare the classical and non-classical behavior of the deviations described above we suppose that for some $1 \ge \kappa \ge \frac{1}{2}$ and some $a \ne 0$
\[ \lim_{N\to\infty} \frac{b_N - E_N^+}{|V_N|^{\kappa}} = a. \tag{6.14} \]
Then it follows from the previous theorem that
\[ -\ln P_N^+(b_N) \asymp \begin{cases} |V_N|^{2\kappa-1}, & \text{if } a > 0,\ \kappa < 1, \text{ or } 1 - m(\beta) > a > 0,\ \kappa = 1, \text{ or if } a < 0,\ \kappa < \nu/(\nu+1), \\ |V_N|^{\kappa(\nu-1)/\nu}, & \text{if } a < 0,\ \nu/(\nu+1) < \kappa < 1, \text{ or } -2m(\beta) \le a < 0,\ \kappa = 1, \\ |V_N|, & \text{if } -1 - m(\beta) < a < -2m(\beta),\ \kappa = 1, \end{cases} \tag{6.15} \]
where the relation $\asymp$ means that the ratio of the corresponding quantities is bounded from both sides uniformly in $N$. This relation illustrates the difference between the asymptotic behavior of large deviations in the region of phase transitions and in the region without them.

Another interesting picture arises if we suppose that for some $0 < \kappa \le 1$, $a > 0$
\[ \lim_{N\to\infty} \frac{b_N + E_N^+}{|V_N|^{\kappa}} = a. \tag{6.16} \]
Then using the statements (6.10) and (6.13) and a normal approximation for the action function $I_N(b')$ for $b' = -b_N \approx E_N^+$ (cf. (4.33)) we see that
\[ -\ln P_N^+(b_N) \asymp \begin{cases} |V_N|^{(\nu-1)/\nu}, & \text{if } \kappa \le (2\nu-1)/2\nu, \\ |V_N|^{2\kappa-1}, & \text{if } \kappa \ge (2\nu-1)/2\nu. \end{cases} \tag{6.17} \]
Finally, it follows from (6.13) that if
\[ -1 < \lim_{N\to\infty} \frac{b_N}{|V_N|} = \hat b < -m(\beta) \tag{6.18} \]
then we know a more exact asymptotic and can state that
\[ \lim_{N\to\infty} \frac{-\ln P_N^+(b_N)}{|V_N|} = I(-\hat b). \tag{6.19} \]
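The case distinctions in (6.15) for $a < 0$, $\kappa < 1$ and in (6.17) both amount to a competition between two exponents of $|V_N|$. The following minimal sketch encodes them with exact fractions. The function names are illustrative, and the value returned at the boundary $\kappa = \nu/(\nu+1)$, which (6.15) does not cover, is only the common value of the two branches.

```python
from fractions import Fraction


def exponent_minus_side(kappa: Fraction, nu: int) -> Fraction:
    """Exponent of |V_N| in (6.15) for a < 0 and 1/2 <= kappa < 1.

    The cheaper of the two mechanisms dominates: many small droplets
    (2*kappa - 1) or one large droplet (kappa*(nu - 1)/nu).
    """
    collective = 2 * kappa - 1
    droplet = kappa * Fraction(nu - 1, nu)
    return min(collective, droplet)


def exponent_near_reflection(kappa: Fraction, nu: int) -> Fraction:
    """Exponent of |V_N| in (6.17), where b_N + E_N^+ is of order |V_N|**kappa."""
    return max(Fraction(nu - 1, nu), 2 * kappa - 1)


nu = 2
for kappa in (Fraction(3, 5), Fraction(2, 3), Fraction(3, 4)):
    print(kappa, exponent_minus_side(kappa, nu), exponent_near_reflection(kappa, nu))
```

In (6.15) the minimum appears because the more probable mechanism determines the deviation probability, while in (6.17) the surface cost of the droplet filling the volume and the Gaussian cost add, so the larger exponent prevails.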
7. Large Deviations and the Shape of the Droplets

The relation (6.10) gives much rougher information compared with the precise knowledge of the large deviation behavior one gets in the classical regime. So it is natural to try to get at least the value of a constant $c$ which describes the true asymptotic in (6.10):
\[ \lim_{N\to\infty} \frac{-\ln P_N^+(b_N)}{\left(E_N^+ - b_N\right)^{(\nu-1)/\nu}} = c. \tag{7.1} \]
It turns out that this is a difficult question, and the answer to it is closely connected with an important physical theory which describes the typical shape of a droplet of one phase floating in the opposite one. In the recent book [DKS] the question of the typical shape was rigorously studied in the simplest non-trivial situation of the two-dimensional Ising model with periodic boundary conditions, and this is about the only case when we can prove the existence of the limit (7.1) and exhibit a construction which produces the value of this constant $c$. For this we have to repeat some notions discussed in detail in the book [DKS]. Let
\[ W_N = \{t = (t_1, t_2) \in Z^2 : -N \le t_i \le N,\ i = 1, 2\} \tag{7.2} \]
be a square. For any direction $n \in S^1$ (where $S^1 \subset R^2$ is the unit circle centered at the origin) we introduce the boundary conditions $\bar\sigma^n$ such that
\[ \bar\sigma^n_t = \begin{cases} +1 & \text{if } (t, n) > 0, \\ -1 & \text{if } (t, n) \le 0. \end{cases} \tag{7.3} \]
Suppose that the inverse temperature $\beta$ is large enough. We introduce the surface tension in the direction $n$ as the limit
\[ \tau_\beta(n) = -\lim_{N\to\infty} \frac{1}{\beta d(N, n)} \ln \frac{Z_{W_N}(\beta, \bar\sigma^n)}{Z_{W_N}(\beta, \bar\sigma^+)} \tag{7.4} \]
where $\bar\sigma^+$ are $+$-boundary conditions (see (2.5)) and $d(N, n)$ is the length of the segment
\[ \{t \in R^2 : (t, n) = 0,\ t_1 \in [-N, N]\}. \tag{7.5} \]
The existence of the limit (7.4) for the case of large enough inverse temperatures was proven in [DKS]. Let $\gamma$ be a closed non-self-intersecting smooth curve in $R^2$. Let
\[ W_\beta(\gamma) = \int_\gamma \tau_\beta(n_s)\, ds, \tag{7.6} \]
where $n_s \in S^1$ is the normal to the curve $\gamma$ at the point $s$ and $ds$ is the differential of its length. Let
\[ F_\beta = \inf W_\beta(\gamma) \tag{7.7} \]
where the infimum is taken over all curves $\gamma$ such that the area enclosed is equal to one.

Theorem 7.1. There exists a value $\beta_1 < \infty$ such that for the two-dimensional Ising model with inverse temperature $\beta > \beta_1$ and for any sequence of non-negative integers $b_N$ such that the conditions (5.4) hold and
\[ \liminf_{N\to\infty} \frac{b_N - E^{\mathrm{per},\beta,+}_{T_N}}{|T_N|^{2/3}} < 0, \tag{7.8} \]
where $E^{\mathrm{per},\beta,+}_{T_N}$ is the conditional mean value $\langle S_{T_N} \mid S_{T_N} \ge 0 \rangle^{\mathrm{per},\beta}_{T_N}$, the following equality holds:
\[ \lim_{N\to\infty} \frac{-\ln P^{\mathrm{per},\beta}_{T_N}(b_N)}{w_N(b_N)} = 1, \tag{7.9} \]
where
\[ w_N(b_N) = \begin{cases} \beta F_\beta \left(\dfrac{E^{\mathrm{per},\beta,+}_{T_N} - b_N}{2m(\beta)}\right)^{1/2} & \text{if } F_\beta \left(\dfrac{E^{\mathrm{per},\beta,+}_{T_N} - b_N}{2m(\beta)}\right)^{1/2} \le 2\beta N \tau_\beta(n_0), \\ 2\beta N \tau_\beta(n_0) & \text{otherwise,} \end{cases} \tag{7.10} \]
and $n_0 = (1, 0)$ is the unit lattice vector. The case of positive $b_N$ can be reduced to the case of negative ones with the help of the symmetry.
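The functional (7.6) and the variational problem (7.7) can be explored numerically by approximating a closed curve with a polygon. In the sketch below the two surface tensions are hypothetical stand-ins chosen for illustration, an isotropic one and the anisotropic $\tau(n) = |n_1| + |n_2|$; neither is the Ising surface tension $\tau_\beta$ of (7.4). Since the functional scales like a length and the area like a length squared, dividing by the square root of the area gives the value at unit area.

```python
import math


def wulff_functional(points, tau) -> float:
    """Approximate W(gamma) of (7.6) for a closed polygon, rescaled to unit area.

    points: vertices in counter-clockwise order; tau: function of the unit normal.
    """
    m = len(points)
    twice_area = 0.0
    total = 0.0
    for i in range(m):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % m]
        twice_area += x0 * y1 - x1 * y0           # shoelace formula
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue
        normal = (dy / length, -dx / length)      # outward normal of the edge
        total += tau(normal) * length
    return total / math.sqrt(abs(twice_area) / 2.0)


def isotropic_tau(normal) -> float:
    """Hypothetical direction-independent surface tension."""
    return 1.0


def lattice_tau(normal) -> float:
    """Hypothetical anisotropic surface tension |n_1| + |n_2|."""
    return abs(normal[0]) + abs(normal[1])


m = 4000
circle = [(math.cos(2 * math.pi * i / m), math.sin(2 * math.pi * i / m)) for i in range(m)]
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

for name, tau in (("isotropic", isotropic_tau), ("lattice", lattice_tau)):
    print(name, wulff_functional(circle, tau), wulff_functional(square, tau))
```

With the isotropic choice the circle gives the smaller value, while with the anisotropic choice the axis-parallel square does, which shows how the optimal droplet shape in (7.7) depends on the direction dependence of the surface tension.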
8. References and Generalizations

8.1. In the exposition of the classical theory of deviations in Section 4 we followed essentially the pioneering papers of Khinchin [Kh], Smirnov [Sm], Cramér [Cr], who laid the foundations of the theory (see also books by Ibragimov–Linnik [IL], Petrov [P], Saulis–Statulevichius [SS]). Instead of the local variant of the theory used in Section 4, its integral variant is discussed more often. The local approach was used in papers [Ri], [CS1], [CS2] and in the book [B]. In the integral theory the asymptotic of sums
\[ P_N = \sum_{b \in B_N} P_N(b), \tag{8.1} \]
where $B_N \subset R^1$ is a sequence of subsets, is studied. Since the main estimate (4.23) is true for any sequence $b_N$, we can estimate the sum (8.1) termwise uniformly in $N$, so an estimate for the sum can be obtained as an easy implication of the local theory. The possibility of obtaining the results about both large and moderate deviations as an implication of the single estimate (4.23), discussed in Section 4, might be a methodological innovation. The elegant relation (4.25) became a starting point of deep generalizations to the cases of dependent variables, of variables with values in general functional spaces, to a study of empirical distributions and so on (see the books [DeSt], [DZ], [V], [VF]). The main idea is to prove a general result which shows that the classical large deviations principle in the strong form follows easily from some analyticity properties of the logarithmic generating function. This result is influenced by similar results of Saulis–Statulevichius [SS].

8.2. Theorem 5.1 is a simple implication of the general result mentioned above. The analyticity condition which one has to check was intensively studied in the literature on rigorous results in statistical physics (see, for example, [DS1], [MM]) and so this theorem can be extended to a wide class of situations. On the other hand, it is natural to expect that Theorem 5.1 remains true for all $\beta$ in the domain $\beta_0 < \beta < \beta^{\nu}_{\mathrm{crit}}$, but at present there are no known methods to check the corresponding analyticity properties. Some results of a similar type were obtained by Čepulenas [C]. Recently several authors have developed a general theory of large deviations of empirical distribution of Gibbs fields (see [B], [BD], [Co], [FO1], [G1], [G2], [GZ], [O]). This theory implies, in particular, the existence of the limit (4.25) and can be applied to a wide class of Gibbs fields including the phase transition region. This
does not contradict the special behavior of large deviations in the regime of phase transition described in Section 6: the general theory does not exclude the possibility of the vanishing of the action function $I(b)$ on some interval, and this is the typical situation in the case of phase transitions of the first order. In such a situation the equality (4.25) does not reveal the true asymptotic behavior of the probabilities.

8.3. The main estimate (6.10) for the case of the deviations proportional to the volume was proved also in the recent papers of Föllmer and Ort [FO2] and Schonmann [Sch]. They obtained for this case explicit estimates of the two constants in (6.10). Their papers are based on methods different from the one used in the present paper. Without an explicit formulation such results were essentially contained already in the old paper of Minlos and Sinai [MS]. The results of our paper about moderate deviations seem to be new. The results of Sections 3, 6 and 7 can be extended to a wide class of Gibbs fields undergoing phase transitions, for which it is possible to have a full control of the probabilistic properties of contour systems (such as in the Pirogov–Sinai situation, see [S]). Again, it is natural to expect that the theorems of Section 6 are valid for all values of $\beta > \beta^{\nu}_{\mathrm{crit}}$, but there are no methods to prove it.

8.4. Results of Section 3 of the book [DKS] about the shape of the droplet. Its proof is an extension of the proof in [DKS]. Since we wanted to use directly the constructions of [DKS] we restrict ourselves in Section 3 to the case of two-dimensional Ising models with periodic boundary conditions. Generalizations to a wider class of two-dimensional Gibbs models and other boundary conditions seem to be possible. A study of the three-dimensional case meets very serious mathematical difficulties (see discussions in Section 1 of [DKS]).

8.5.
The book [DKS] contains some of the results of the present paper for the two-dimensional case, which were necessary to obtain the main result of [DKS]. Some other results were formulated in the papers [DS2], [Sh].

References

[BD] Bolthausen, E. and Deuschel, J.-D. Critical large deviations for Gaussian fields in the phase transition regime; I. Annals of Probability (to appear); II (to appear).
[BI] Borgs, C. and Imbrie, J. (1989). A unified approach to phase diagrams in field theory and statistical mechanics. Communications in Mathematical Physics 123, 305–328.
[B] Borovkov, A. A. (1986). Probability Theory. Nauka, Moscow.
[Br] Bryc, W. (1992). On the large deviation principle for stationary weakly dependent random fields. Annals of Probability 20, 1004–1030.
[C] Čepulenas, S. (1985). Probabilities of large deviations for random fields. Litov. Mat. Sbornik 5, 164–176; Lith. Math. J. (1985) 25, 381–390.
[CS1] Chaganty, N. R. and Sethuraman, J. (1985). Large deviations local limit theorems for arbitrary sequences of random variables. Annals of Probability 13, 97–114.
[CS2] Chaganty, N. R. and Sethuraman, J. (1987). Limit theorems in the area of large deviations for some dependent random variables. Annals of Probability 15, 628–645.
[Co] Comets, F. (1989). Large deviation estimates for a conditional probability distribution. Applications to random interacting Gibbs measures. Probability Theory and Related Fields 80, 407–432.
[Cr] Cramér, H. (1938). Sur un nouveau théorème-limite de la théorie des probabilités. Actualités Scientifiques et Industrielles 736, 5–23.
[DZ] Dembo, A. and Zeitouni, O. (1992). Large Deviations Techniques and Applications. Jones and Bartlett Publishers, Boston.
[DeSt] Deuschel, J.-D. and Stroock, D. (1989). Large Deviations, vol. 137. Academic Press.
[DKS] Dobrushin, R. L., Kotecký, R., and Shlosman, S. (1992). The Wulff Construction: A Global Shape from Local Interactions. Translations of Mathematical Monographs, vol. 104. AMS, Providence, Rhode Island.
[DN] Dobrushin, R. L. and Nakhapetyan, B. S. (1974). Strong convexity of the pressure for lattice systems of classical statistical physics. Teoret. Mat. Fiz. 20, 223–234.
[DS1] Dobrushin, R. L. and Shlosman, S. B. (1987). Completely analytic interactions. Constructive description. Journal of Statistical Physics 46, 983–1014.
[DS2] Dobrushin, R. L. and Shlosman, S. B. (1992). Large deviation behavior of statistical mechanics models in the multiphase regime. Proceedings of the Xth Congress on Mathematical Physics, Leipzig 1991 (K. Schmüdgen, ed.), Springer-Verlag, Berlin.
[DT] Dobrushin, R. L. and Tirozzi, B. (1977). The central limit theorem and the problem of the equivalence of ensembles. Communications in Mathematical Physics 54, 173–192.
[FO1] Föllmer, H. and Orey, S. (1988). Large deviations for the empirical field of a Gibbs measure. Annals of Probability 16, 961–977.
[FO2] Föllmer, H. and Ort, M. (1988). Large deviations and surface entropy for Markov fields. Astérisque 157–158, 173–190.
[GKK] Gawędzki, K., Kotecký, R., and Kupiainen, A. (1987). Coarse-graining approach to the first-order phase transitions. Journal of Statistical Physics 47, 701–724.
[G1] Georgii, H.-O. (1993). Large deviations and maximum entropy principle for interacting random fields on $Z^d$. Annals of Probability (to appear).
[G2] Georgii, H.-O. (1994). Large deviations and the equivalence of ensembles for Gibbsian particle systems with superstable interactions. Annals of Probability (to appear).
[GZ] Georgii, H.-O. and Zessin, H. (1993). Large deviations and the maximum entropy principle for marked point random fields. Probability Theory and Related Fields 96, 177–204.
[IL] Ibragimov, I. A. and Linnik, Yu. I. (1977). Independent and Stationary Sequences of Random Variables. Wolters–Noordhoff, Groningen.
[I] Isakov, S. N. (1984). Nonanalytic features of the first order phase transition in the Ising model. Communications in Mathematical Physics 95, 427–443.
[Kh] Khinchin, A. I. (1929). Über einen neuen Grenzwertsatz der Wahrscheinlichkeitsrechnung. Mathematische Annalen 101, 745–752.
[KP] Kotecký, R. and Preiss, D. (1986). Cluster expansion for abstract polymer models. Communications in Mathematical Physics 103, 491–498.
[MM] Malyshev, V. A. and Minlos, R. A. (1991). Gibbs States; Cluster Expansions. Kluwer, Dordrecht.
[MS] Minlos, R. A. and Sinai, Ya. G. The phenomenon of ‘phase separation’ at low temperatures in some lattice models of a gas; I (1967). Matem. Sbornik 73, 375–448; English translation Math. USSR Sbornik (1967) 2, 335–395; II (1968). Tr. Moskov. Mat. Obshch. 19, 113–178; English translation Trans. Moscow Math. Soc. (1968) 19, 121–196.
[O] Olla, S. (1988). Large deviations for Gibbs random fields. Probability Theory and Related Fields 77, 343–357.
[P] Petrov, V. V. (1975). Sums of Independent Random Variables. Springer-Verlag, Berlin.
[Ri] Richter, W. (1957). Local limit theorem for large deviations. Teor. Probab. Appl. 2, 206–219.
[R] Ruelle, D. (1978). Thermodynamic Formalism. Addison-Wesley, Reading.
[SS] Saulis, L. and Statulevičius, V. A. (1991). Limit Theorems for Large Deviations. Kluwer, Dordrecht.
[Sch] Schonmann, R. (1987). Second order large deviation estimates for ferromagnetic systems in the phase coexistence region. Communications in Mathematical Physics 112, 409–422.
[Sh] Shlosman, S. B. (1989). The droplet in the tube: A case of phase transition in the canonical ensemble. Communications in Mathematical Physics 125, 81–90.
[S] Sinai, Ya. G. (1982). Theory of Phase Transitions: Rigorous Results. Pergamon Press.
[Sm] Smirnoff, N. (1933). Über Wahrscheinlichkeiten grosser Abweichungen. Rec. Soc. Math. Moscou 40, 441–455.
[V] Varadhan, S. R. S. (1984). Large Deviations and Applications. Society for Industrial and Applied Mathematics, Philadelphia.
[VF] Ventzel, A. D. and Freidlin, M. I. (1984). Random Perturbations of Dynamical Systems. Springer, Berlin.
[Za] Zahradnik, M. (1984). An alternative version of Pirogov–Sinai theory. Communications in Mathematical Physics 93, 559–581.
SHOCKS IN ONE-DIMENSIONAL PROCESSES WITH DRIFT

P. A. FERRARI
Instituto de Matemática e Estatística, Universidade de São Paulo, Cx. Postal 20570, 01452–001 São Paulo SP, Brazil
Abstract. The local structure of shocks in one-dimensional, nearest neighbor attractive systems with drift and conserved density is reviewed. The systems include the asymmetric simple exclusion, the zero range and the ‘misanthropes’ processes. The microscopic shock is identified by a ‘second class particle’ initially located at the origin. Second class particles also describe the behavior of the characteristics of the macroscopic equation related to the corresponding model when the hydrodynamic limit is performed. Laws of large numbers and central limit theorems as well as the convergence of the system at the average position of the shock are reviewed.

Key words: Asymmetric simple exclusion, zero range process, second class particle, shock fluctuations, central limit theorem, dynamical phase transition, density fluctuation fields.
1. A Review

Second class particles appeared first as a tool to prove the ergodic properties of the simple exclusion and the zero range process (sep and zrp respectively). Then they were useful to show hydrodynamic limits and central limit theorems for tagged particles. It was also established that the process as seen from a single second class particle may present different asymptotic densities to the right and left of the second class particle. When this happens we say that the second class particle identifies a microscopic shock. Finally, when conveniently rescaled, the second class particle follows the shocks or the characteristics of the related hyperbolic equation resulting from the hydrodynamic limits.

Misanthropes are individuals that tend to avoid other persons. The sep and the zrp are particular cases of a general system called the misanthropes process, for which only a few results have been proven. However, we prefer to present the known results in the framework of the misanthropes process, to help in understanding the difficulties of generalizing the results.

In the misanthropes process we consider here, a finite number of particles is allowed at each site $x \in Z$. For simplicity we consider the nearest neighbor totally asymmetric case: a particle may jump only to the nearest neighbor site to its right, with a rate that is a non-decreasing function of the number of particles at the departure site and a non-increasing function of the number of particles at the arrival site. The state space of the process is $X = N^Z$. We use $\eta, \zeta, \xi$ to denote the configurations of $X$. Let $b(n, m) \ge 0$ be the rate at which a particle jumps from $x$ to $x + 1$ when there are $n$ particles at $x$ and $m$ particles at $x + 1$. The generator of the process is
\[ Lf(\eta) = \sum_{x \in Z} b(\eta(x), \eta(x+1)) [f(\eta^{x,x+1}) - f(\eta)] \tag{1} \]
where $f$ is a cylinder function on $X$. The configuration $\eta^{x,y}$ is defined by
\[ \eta^{x,y}(z) = \begin{cases} \eta(z) & \text{if } z \ne x, y, \\ \eta(x) - 1 & \text{if } z = x, \\ \eta(y) + 1 & \text{if } z = y. \end{cases} \]
Let $S(t)$ denote the corresponding semigroup. The rates of jump $b$ are assumed to satisfy: (i) $b(0, \cdot) \equiv 0$, (ii) $b(n, m)$ is non-decreasing in $n$ and non-increasing in $m$, (iii) there exists a bounded non-decreasing function $g$ such that
\[ b(n, m-1)\, g(m) = b(m, n-1)\, g(n), \qquad b(n, m) - b(m, n) = b(n, 0) - b(m, 0). \tag{2} \]
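The jump map $\eta^{x,x+1}$ and the conditions on the rates can be checked mechanically for a concrete choice of rates. The sketch below uses zero range rates $b(n, m) = g(n)$ with the hypothetical bounded function $g(n) = \min(n, 3)$ and tests condition (i), the monotonicity described above (non-decreasing in the departure count, non-increasing in the arrival count) and the two identities in (2) on a finite range of occupation numbers. The names and the truncation `max_n` are illustrative assumptions.

```python
def g(n: int) -> int:
    """Hypothetical bounded non-decreasing function with g(0) = 0."""
    return min(n, 3)


def zero_range_rate(n: int, m: int) -> int:
    """Zero range rates: the jump rate depends only on the departure site."""
    return g(n)


def jump(eta: dict, x: int) -> dict:
    """Return eta^{x,x+1}: one particle moved from site x to site x + 1."""
    new = dict(eta)
    new[x] = eta[x] - 1
    new[x + 1] = eta.get(x + 1, 0) + 1
    return new


def satisfies_conditions(b, g, max_n: int = 8) -> bool:
    """Check (i), the monotonicity of the rates, and both identities in (2) up to max_n."""
    cond_i = all(b(0, m) == 0 for m in range(max_n + 1))
    monotone = all(b(n + 1, m) >= b(n, m) and b(n, m + 1) <= b(n, m)
                   for n in range(max_n) for m in range(max_n))
    balance = all(b(n, m - 1) * g(m) == b(m, n - 1) * g(n)
                  and b(n, m) - b(m, n) == b(n, 0) - b(m, 0)
                  for n in range(1, max_n + 1) for m in range(1, max_n + 1))
    return cond_i and monotone and balance


print(jump({0: 2, 1: 0}, 0))
print(satisfies_conditions(zero_range_rate, g))
```

A rate that increases with the number of particles at the arrival site, for instance $b(n, m) = g(n)(1 + m)$, fails the check, which is consistent with the misanthropic character of the process.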
Condition (ii) guarantees that the process is attractive. Let $\rho \in [0, \infty)$ and $\nu_\rho$ be the product measure with marginals
\[ \nu_\rho(\eta(x) = k) = \frac{1}{Z(\rho)} \frac{\rho^k}{g(1) \cdots g(k)} \tag{3} \]
where $Z(\rho)$ is a normalizing constant. Condition (iii) implies that $\nu_\rho$ is invariant for $\rho \ge 0$ and that the reverse process with respect to this measure is a misanthropes process with the same rate $b$ but with reversed jumps.

We obtain the simple exclusion process when $b(n, m) = 1\{n = 1, m = 0\}$, the indicator function that $n = 1$ and $m = 0$, and when the configuration space is restricted to $\{0, 1\}^Z$. The zero range process is obtained when there exists some function $g$ such that $b(n, m) = g(n)$, that is, when the rate of jump does not depend on the destination site. If the number of particles allowed per site is bounded, then blocking measures also arise. In particular, for the simple exclusion process the measure concentrated on the configuration $\ldots 000111 \ldots$ is invariant, as well as its translates. Indeed it is known that all invariant measures are convex combinations of the product measures $\{\nu_\rho\}$ and the blocking measures. For the zrp with $\sup_k g(k) < \infty$, all the invariant measures are convex combinations of $\{\nu_\rho\}$. When the number of particles per site is not limited and the system is totally asymmetric, as the one we study here, one expects that all invariant measures are translation invariant. It is known that for the partially asymmetric case (when jumps to the left are allowed) there are invariant measures that are not translation invariant. There are no complete results about the set of all invariant measures for the general misanthropes process.

Let $h(u) = \int d\nu_u(\eta)\, b(\eta(0), \eta(1))$. From now on we assume throughout the paper that $h(u)$ is strictly concave. This assumption is convenient to have a nice construction
of the entropic solutions of the related equation. It is easy to construct non-trivial examples of $b$ that give rise to concave $h$. Let $u(r, t)$ denote the unique entropic solution of
\[
\frac{\partial u}{\partial t} + \frac{\partial h(u)}{\partial r} = 0, \qquad u(r, 0) = u_0(r), \tag{4}
\]
where $u_0$ is a piecewise continuous function. Under the concavity condition on $h$, initial discontinuities where $u_0^- < u_0^+$ persist at later times, but the position may be translated in space (shocks). Initial discontinuities where $u_0^- > u_0^+$ disappear at time $0^+$ (rarefaction). In order to establish the hydrodynamic limit that relates the microscopic model with the pde, we consider a family of product measures $\nu^\varepsilon_{u_0}$ with marginals
\[
\int d\nu^\varepsilon_{u_0}(\eta)\, \eta(\varepsilon^{-1} r) = u_0(r), \tag{5}
\]
where by abuse of notation we do not write the integer parts. The next theorem gives the convergence of the distribution of the process as seen from a passenger travelling at constant velocity. It essentially says that at the continuity points of the solution of the equation the system looks asymptotically as in equilibrium with a parameter predicted by the equation.

Theorem 1. (Hydrodynamic limit) Let $u_0(r)$ be an integrable uniformly bounded piecewise continuous function, and let $\nu^\varepsilon_{u_0}$ be a family of product measures with marginals $\nu^\varepsilon_{u_0}(\eta(\varepsilon^{-1} r)) = u_0(r)$. Then, for every cylinder function $f$,
\[
\lim_{\varepsilon \to 0} \nu^\varepsilon_{u_0} S(\varepsilon^{-1} t)\, \tau_{\varepsilon^{-1} r} f = \nu_{u(r,t)} f \tag{6}
\]
in the continuity points of $u(r, t)$, the solution of (4) with initial condition $u(r, 0) = u_0(r)$.

Remark (7) The integrability of the initial condition may be dropped in special cases, as in the sep with asymptotically constant densities.

The equation (4) admits travelling wave solutions. In particular, if the initial condition is non-decreasing piecewise constant with only one discontinuity, then the solution is a translation of this initial condition. This raises the question of what happens at a microscopic level. Can one see the jump of the density in the particle system? The answer is yes, and one way to see this is to look at the system as seen from a second class particle. The next theorem says that if the initial distribution of the system is a product measure with density to the left of the origin smaller than the density to the right of it, then the system as seen from a second class particle added at the origin at time zero will look very much the same at later times.

The motion of a second class particle arises when a joint realization of the process with two different initial configurations is constructed. The joint realization is called the basic coupling and the principle is that the jumps of particles sitting at the same site occur together for the two marginals, as much as possible. If the first initial
configuration is identical to the second but has one extra particle located at the origin, then under the coupling at any later time $t$ the first process will have an extra particle at position $X_t$. This is the position of the second class particle. The name comes from the fact that it gives priority to the other particles. The joint process $(\eta_t, X_t) \in \mathcal{X} \times \mathbb{Z}$ is Markovian and has generator
\[
\begin{aligned}
\bar{L} f(\eta, z) ={}& \sum_{x \ne z-1,\, z} b(\eta(x), \eta(x+1)) \, [f(\eta^{x,x+1}, z) - f(\eta, z)] \\
&+ \bigl(b(\eta(z-1), \eta(z)) - b(\eta(z-1), \eta(z)+1)\bigr) \, [f(\eta^{z,z-1}, z) - f(\eta, z)] \\
&+ \bigl(b(\eta(z)+1, \eta(z+1)) - b(\eta(z), \eta(z+1))\bigr) \, [f(\eta^{z,z+1}, z+1) - f(\eta, z)].
\end{aligned}
\]
The second class particle identifies the shock in the following sense. Let $\tau_x$ be the translation operator defined by $\tau_x \eta(z) = \eta(z + x)$. Let $\nu_{\rho,\lambda}$ be a product measure with marginals $\nu_\rho$ for $x \le 0$ and $\nu_\lambda$ for $x > 0$. We say that $\mu \sim \nu_{\rho,\lambda}$ if $\lim_{x \to +\infty} \tau_x \mu = \nu_\lambda$ and $\lim_{x \to -\infty} \tau_x \mu = \nu_\rho$, where the limits are understood as weak limits. Let $\eta'_t = \tau_{X_t} \eta_t$ and let $S'(t)$ be the corresponding semigroup.

Theorem 2. (Microscopic interface) Let the process be the sep and assume $0 \le \rho \le \lambda \le 1$. The process as seen from the second class particle $\eta'_t$ has an invariant measure $\mu \sim \nu_{\rho,\lambda}$. Furthermore, if $\lambda > \rho$, $\lim_{t \to \infty} \nu_{\rho,\lambda} S'(t) = \mu$.

Remarks (8) Theorem 2 is proven only for the simple exclusion process. I think however that Theorem 2 can be proven for the misanthropes process using the same techniques, but I have not worked out the details.

(9) In the sep the invariant measure $\mu$ for the process as seen from the second class particle has been described explicitly for all $\lambda \ge \rho$. When $\lambda = \rho$ there is a remnant of a shock because the density to the right (respectively left) of the second class particle is bigger (respectively less) than $\rho$ and the approach to $\rho$ is slow (as an inverse power of the distance). When $\lambda > \rho$, the invariant measure $\mu$ is extremely close to $\nu_{\rho,\lambda}$: it is possible to couple $\mu$ and $\nu_{\rho,\lambda}$ in such a way that the number of sites where the corresponding configurations differ is a random variable with a positive exponential moment.

(10) Open problem. Does the process starting from $\nu_\rho$ as seen from a second class particle converge to the corresponding invariant measure? Conjecture: yes, but the technique used in the case of initial $\nu_{\rho,\lambda}$ for $\rho < \lambda$ does not work.

The second class particle moves along a characteristic of the macroscopic equation. This is the essential result of the next theorem.
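To make the second class particle concrete, here is a minimal simulation sketch for the totally asymmetric sep only. It is not taken from the papers cited in this review: the finite window with closed boundaries, the function name `simulate_second_class`, and the site codes 0, 1, 2 (empty, first class, second class) are assumptions made for illustration. The rules follow from the coupling when $b(n, m) = \mathbf{1}\{n = 1, m = 0\}$: a first class particle treats the second class particle as a hole, while the second class particle jumps only onto empty sites.

```python
import random

def simulate_second_class(rho, lam, half_width, t_max, seed=0):
    """Totally asymmetric sep on the window {-half_width, ..., half_width}
    with one second class particle started at the origin.

    Site codes (illustrative convention): 0 = empty, 1 = first class
    particle, 2 = second class particle. Boundaries are closed, so the
    window must be large compared with t_max for the output to resemble
    the infinite system.
    """
    rng = random.Random(seed)
    n = 2 * half_width + 1
    # Product initial condition: density rho to the left of the origin,
    # density lam to the right of it.
    eta = []
    for i in range(n):
        density = rho if i < half_width else lam
        eta.append(1 if rng.random() < density else 0)
    x = half_width  # array index of the origin
    eta[x] = 2

    t = 0.0
    while True:
        # Each of the n - 1 bonds carries a rate-1 Poisson clock.
        t += rng.expovariate(n - 1)
        if t > t_max:
            break
        i = rng.randrange(n - 1)
        left, right = eta[i], eta[i + 1]
        if left == 1 and right == 0:
            eta[i], eta[i + 1] = 0, 1
        elif left == 1 and right == 2:
            # The first class particle has priority: it exchanges places
            # with the second class particle, which moves one site left.
            eta[i], eta[i + 1] = 2, 1
            x = i
        elif left == 2 and right == 0:
            eta[i], eta[i + 1] = 0, 2
            x = i + 1
    return eta, x - half_width

eta, position = simulate_second_class(rho=0.2, lam=0.6, half_width=200, t_max=50.0)
print("second class particle at", position)
```

Averaging `position` over many seeds and comparing it with $(1 - \lambda - \rho)t$ is one way to explore the results that follow, provided `half_width` is large enough that boundary effects have not reached the particle.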
The characteristics $w(a, t)$ emanating from $a$ corresponding to the equation (4) are the solutions of
\[
\frac{dw}{dt} = h'(u(w, t)), \qquad w(0) = a. \tag{11}
\]
There is a unique characteristic emanating from $a$ if the initial condition presents no decreasing discontinuity at $a$.
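For the sep the flux is $h(u) = u(1 - u)$, since $b(\eta(0), \eta(1))$ is the indicator of a particle at 0 and a hole at 1 under the product measure. The sketch below, written under that assumption, evaluates the characteristic speed $h'(u) = 1 - 2u$, the speed of a shock between two densities, and the entropic solution for a single-step initial profile. The function names are illustrative, and the rarefaction formula is the standard one for this flux rather than something stated in the text.

```python
def flux(u):
    """Macroscopic flux of the sep, h(u) = u (1 - u)."""
    return u * (1.0 - u)

def characteristic_speed(u):
    """h'(u) = 1 - 2u: speed of a characteristic in a region of density u."""
    return 1.0 - 2.0 * u

def shock_speed(u_left, u_right):
    """Speed of the discontinuity, (h(u_right) - h(u_left)) / (u_right - u_left).

    For the sep this simplifies to 1 - u_left - u_right.
    """
    return (flux(u_right) - flux(u_left)) / (u_right - u_left)

def riemann_density(r, t, u_left, u_right):
    """Entropic solution at (r, t), t > 0, for the initial profile
    u0 = u_left on r <= 0 and u0 = u_right on r > 0."""
    if u_left == u_right:
        return u_left
    if u_left < u_right:
        # Increasing step: the discontinuity persists and travels as a shock.
        return u_left if r < shock_speed(u_left, u_right) * t else u_right
    # Decreasing step: rarefaction fan between the two extreme characteristics.
    if r <= characteristic_speed(u_left) * t:
        return u_left
    if r >= characteristic_speed(u_right) * t:
        return u_right
    return 0.5 * (1.0 - r / t)  # solves 1 - 2u = r / t

print(shock_speed(0.2, 0.6))
print(riemann_density(0.0, 1.0, 1.0, 0.0))
```

In the decreasing case every speed between $1 - 2u^-$ and $1 - 2u^+$ corresponds to a characteristic leaving the origin, which is the situation in which uniqueness of the characteristic fails.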
Theorem 3. Let $X^x_t$ be the position of a second class particle initially at $x$. Let $X^\varepsilon(a, t) = \varepsilon X^{\varepsilon^{-1} a}_{\varepsilon^{-1} t}$. If the characteristic emanating from $a$ is unique, then for the sep, and the zrp with concave $g$,
\[
\lim_{\varepsilon \to 0} \varepsilon E \, |X^\varepsilon(a, t) - w(a, t)| = 0. \tag{12}
\]
Furthermore, if $u_0$ is piecewise constant, non-decreasing and has a finite number of discontinuities, then for the sep
\[
\lim_{\varepsilon \to 0} X^\varepsilon(a, t) = w(a, t) \quad P\text{-a.s.} \tag{13}
\]
Remarks (14) The weak law of large numbers (12) is proven for the sep and zrp with strictly concave $h$. The strong law (13) was proven for the sep with piecewise constant profiles presenting at most one increasing shock, but the proof easily extends to cases when a finite number of increasing shocks are present. I believe that the same techniques can be applied to the zero range process with concave jump rate $g$. The concavity of $g$ guarantees that if at time zero the system is perturbed by adding a particle, then at later times there will be only one extra particle. A more complicated condition in this direction has been stated for the misanthropes process.

(15) Open problem. What happens when the concavity of $g$ is violated and a perturbation added to the system will produce more perturbations? Is there still a shock? How can one describe it at a microscopic level?

(16) Open problem. Show Theorem 3 for the misanthropes process.

(17) An interesting question that arises here is: what happens when the second class particle is at a decreasing shock? In this case there are infinitely many characteristics emanating from that point (rarefaction front). It was recently proved for the sep that in this case the second class particle chooses uniformly among the characteristics emanating from that point.

The next series of theorems has been proven only for the simple exclusion process. Presumably using ad hoc couplings one can show analogous results for the zero range case. At this point it is not clear how to show them for the general misanthropes process. The next theorem gives the asymptotic variance of the second class particle in the case of an increasing shock. Moreover it establishes that in the scale $\sqrt{t}$ the fluctuations of the second class particle around its expected value at time $t$ are determined by a function $N_t$ of the initial configuration and not by the randomness due to the evolution.

The function $N_t(\eta)$ is the number of empty sites of $\eta$ between $0$ and $(\lambda - \rho)t$ minus the number of particles of $\eta$ between $-(\lambda - \rho)t$ and $0$.

Theorem 4. Let the process be the sep with initial distribution $\nu_{\rho,\lambda}$, with $0 \le \rho < \lambda \le 1$. Then the average position of the second class particle is given by
\[
E X_t = (1 - \lambda - \rho)\, t := vt \tag{18}
\]
for all $t \ge 0$. The limiting variance (diffusion coefficient) exists and is given by
\[
D := \lim_{t \to \infty} \frac{E (X_t)^2 - (E X_t)^2}{t} = \frac{\rho(1 - \rho) + \lambda(1 - \lambda)}{\lambda - \rho}. \tag{19}
\]
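The position of the particle in the scale $t^{1/2}$ is determined by the initial configuration. Let
\[
N_t(\eta) = \sum_{x=0}^{(\lambda - \rho)t} (1 - \eta(x)) - \sum_{x=-(\lambda - \rho)t}^{0} \eta(x).
\]
Then
\[
\lim_{t \to \infty} \frac{1}{t}\, E \left[ X_t - (\lambda - \rho)^{-1} N_t(\eta_0) \right]^2 = 0. \tag{20}
\]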
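Formulas (19) and (20) can be checked against each other numerically: under the product measure $\nu_{\rho,\lambda}$ the variable $N_t(\eta_0)$ is a sum of independent terms, so the variance of $(\lambda - \rho)^{-1} N_t(\eta_0)$ divided by $t$ should be close to $D$. The sketch below samples only the initial configuration; it does not simulate the dynamics. The function names, the sample sizes, and the rounding used for the integer part of $(\lambda - \rho)t$ are assumptions made for illustration.

```python
import random

def shock_velocity(rho, lam):
    """v = 1 - lam - rho, as in (18)."""
    return 1.0 - lam - rho

def diffusion_coefficient(rho, lam):
    """D from (19); requires rho < lam."""
    return (rho * (1.0 - rho) + lam * (1.0 - lam)) / (lam - rho)

def sample_N(rho, lam, t, rng):
    """Draw eta from nu_{rho,lam} on the relevant window and return N_t(eta)."""
    m = int(round((lam - rho) * t))  # integer part, up to rounding
    # Density rho at sites x <= 0 and lam at sites x > 0.
    eta = {x: int(rng.random() < (rho if x <= 0 else lam))
           for x in range(-m, m + 1)}
    holes_right = sum(1 - eta[x] for x in range(0, m + 1))
    particles_left = sum(eta[x] for x in range(-m, 1))
    return holes_right - particles_left

def empirical_rates(rho, lam, t, samples, seed=0):
    """Sample mean and sample variance of N_t / (lam - rho), each divided by t."""
    rng = random.Random(seed)
    values = [sample_N(rho, lam, t, rng) / (lam - rho) for _ in range(samples)]
    mean = sum(values) / samples
    var = sum((v - mean) ** 2 for v in values) / (samples - 1)
    return mean / t, var / t

rho, lam = 0.2, 0.6
print("v =", shock_velocity(rho, lam), " D =", diffusion_coefficient(rho, lam))
print("empirical (mean/t, var/t):", empirical_rates(rho, lam, t=500.0, samples=400))
```

The two empirical rates should approach $v$ and $D$ as `t` and `samples` grow; the single site at the origin contributes a correction of order $1/t$.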
Remarks (21) The interval determining the position of $X_t$ can be obtained using the macroscopic equation. The left extreme of the interval is exactly the point where the leftmost characteristic arriving at $vt$ at time $t$ emanates, while the right extreme is the point where the rightmost characteristic arriving at $vt$ at time $t$ emanates.

(22) Noticing that $E X_t = E N_t = (1 - \lambda - \rho)t$, it follows from (20) that $X_t - E X_t$, the fluctuations around the mean of the second class particle at time $t$, are well approximated by $N_t(\eta_0) - E N_t(\eta_0)$, the fluctuations of the density of the initial configuration on the interval $[-(\lambda - \rho)t, (\lambda - \rho)t]$. Since $N_{t+s} - N_t$ and $N_t$ depend on disjoint sets and the initial distribution is product, we can conclude that $N_t(\eta_0)$ has independent increments. This is useful to show convergence of the rescaled position of the second class particle to the finite dimensional distributions of Brownian motion.

(23) Open problems. One of the most important open questions in the field is the computation of the asymptotic behavior of the variance when $\lambda = \rho$. In other words, for which $\alpha$ does the limit as $t$ goes to infinity of $t^\alpha (E (X_t)^2 - (E X_t)^2)$ exist and is it non-trivial? Physical arguments and simulations suggest $\alpha = -\frac{4}{3}$. Which is the right normalization in $d = 2$?

(24) In dimension $d \ge 3$ there is strong evidence that $\alpha = 1$. Let $\delta$ be the average vector jump for the underlying random walk. Consider a family of initial product distributions of first and second class particles such that (a) disregarding classes one has the equilibrium measure $\nu_\rho$ and (b) the distribution of second class particles is a product measure whose density at site $\varepsilon^{-1} r$ is $\varepsilon u_0(r)$ (so that the density of second class particles is of order $\varepsilon$). Then the suitably rescaled and translated density of second class particles converges to the solution of the viscous Burgers equation.

More precisely, if $\nu^\varepsilon_{u_0}$ is a product measure such that $\int d\nu^\varepsilon_{u_0}(\eta)\, \eta(\varepsilon^{-1} r) = \rho - \varepsilon u_0(r)$, then using the notation of Theorem 1,
\[
\lim_{\varepsilon \to 0} \varepsilon^{-1} \left[ \rho - \int d\nu^\varepsilon_{u_0} S(\varepsilon^{-2} t)(\eta)\, \eta\bigl(\varepsilon^{-1} r + \varepsilon^{-2} (1 - 2\rho) t \delta\bigr) \right] = u(r, t) \tag{25}
\]
where $u(r, t)$ is the solution of the $d$-dimensional viscous Burgers equation
\[
\frac{\partial u}{\partial t} + \delta \cdot \nabla u^2 = \sum_{i,j=1}^{d} D_{i,j} \frac{\partial^2 u}{\partial z_i \partial z_j}. \tag{26}
\]
Since the density of second class particles goes to zero, this suggests that the diffusion coefficient of a single second class particle should be the same as the diffusion coefficient of the equation.

(27) Open problem. Show that the diffusion coefficient of a single second class particle in $d \ge 3$ is the diffusion coefficient of the density of second class particles in the equation (26).

(28) The dependence of the position of the second class particle on the initial configuration (20) for the case when there is only one shock suggests the following generalization. Define
\[
N^\varepsilon_t(\eta) = \sum_{x = \varepsilon^{-1} a}^{y^+ \varepsilon^{-1}} (1 - \eta(x)) - \sum_{x = y^- \varepsilon^{-1}}^{\varepsilon^{-1} a} \eta(x), \tag{29}
\]
where $y^+$ and $y^-$ are the places where the rightmost and leftmost characteristics arriving at $w(a, t)$ emanate. (See also Remark (21).) Then one expects that the dependence on the initial configuration (20) holds also for this case.

(30) One of the key tools to show the dependence on the initial configuration (20) was the study of the variance of the current of particles through a passenger travelling at deterministic velocity $r$. In particular when the system is in equilibrium with initial distribution $\nu_\rho$, and the passenger is travelling at the velocity of the characteristic $r = 1 - 2\rho$, the asymptotic variance of the current at time $t$ divided by $t$ converges to zero as $t$ goes to infinity.

(31) Open problem. Can one say that the lines through which the current of particles has asymptotic variance zero in the correct scale are characteristics? This would give an intrinsic definition of the characteristics by means of a property that cannot be obtained from the macroscopic equation. To start one can try to solve a simpler question. Assume that the initial distribution is $\nu_{\rho,\lambda}$. Does the asymptotic limiting variance of the current through a passenger travelling at deterministic velocity $(1 - \lambda - \rho)$ vanish?

(32) Open problem. What is the behavior of the variance of the current of particles along the characteristics in the rarefaction front?

(33) An important step in proving Theorem 4 was to show that the exact computation of the diffusion coefficient (19) and the dependence on the initial configuration (20) are equivalent. Presumably the same techniques may be applied to show that with the generalized definition of $N_t$ given by (29), the dependence on the initial configuration (20) is equivalent to the following identity for the limiting variance
\[
\lim_{t \to \infty} \varepsilon \left( E (X^\varepsilon_t)^2 - (E X^\varepsilon_t)^2 \right) = \frac{\int_{y^-}^{y^+} u_0(r)(1 - u_0(r))\, dr}{u^+ - u^-} \tag{34}
\]
where u+ and u− are the densities to the right and left of w(a, t) respectively. The following theorem establishes that the finite dimensional distributions of the position of the second class particle in the sep under shock initial conditions behave as those of Brownian motion. Its proof is a corollary to the dependence on the initial configuration (20) and the fact that Nt has independent increments as mentioned in Remark (22). It is not clear how to prove tightness.
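Theorem 5. Let the process be the sep. Let $W(t)$ be Brownian motion with diffusion coefficient $D$. Then if the process starts with either $\nu_{\rho,\lambda}$ or the invariant measure $\mu$,
\[
\lim_{\varepsilon \to 0} \varepsilon^{1/2} \bigl( X_{\varepsilon^{-1} \cdot} - E X_{\varepsilon^{-1} \cdot} \bigr) = W(\cdot), \tag{35}
\]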
in the sense of the finite dimensional distributions.

In the case of an initial increasing shock, the hydrodynamic limit (6) means that under initial distribution $\nu_{\rho,\lambda}$, a traveller moving at deterministic velocity $r$ observes asymptotically that the particles are distributed as $\nu_\rho$ for $r < v$ and $\nu_\lambda$ for $r > v$, where $v = 1 - \lambda - \rho$. Indeed $u(r, t) = \rho \mathbf{1}\{r < vt\} + \lambda \mathbf{1}\{r > vt\}$ is the entropic solution of the Burgers equation when $u_0(r) = \lambda$ for $r > 0$ and $\rho$ for $r \le 0$. When $r = v$ the system converges to a fair mixture of $\nu_\rho$ and $\nu_\lambda$. This is the principal consequence of our next result. Its proof is based on the central limit theorem for $X_t$ established in Theorem 5. Let
\[
g(r, t) = P(W(t) \le r) = \frac{1}{\sqrt{2\pi D t}} \int_{-\infty}^{r} \exp\bigl(-s^2 / (2Dt)\bigr)\, ds,
\]
the normal distribution with variance $Dt$.

Theorem 6. (Dynamical phase transition) Let $v = 1 - \lambda - \rho$. Then
\[
\lim_{\varepsilon \to 0} \nu_{\rho,\lambda} S(t \varepsilon^{-1})\, \tau_{vt\varepsilon^{-1} + a\varepsilon^{-1/2}} = (1 - g(a, t))\, \nu_\rho + g(a, t)\, \nu_\lambda. \tag{36}
\]
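The mixture weights in (36) are ordinary Gaussian probabilities, so they are easy to evaluate. The sketch below computes $g(a, t)$ with the error function and returns the weights of $\nu_\rho$ and $\nu_\lambda$; the function names are illustrative and $D$ is taken from (19).

```python
import math

def diffusion_coefficient(rho, lam):
    """D from (19); requires rho < lam."""
    return (rho * (1.0 - rho) + lam * (1.0 - lam)) / (lam - rho)

def g(r, t, D):
    """P(W(t) <= r) for a centred Gaussian with variance D t."""
    return 0.5 * (1.0 + math.erf(r / math.sqrt(2.0 * D * t)))

def mixture_weights(a, t, rho, lam):
    """Weights of (nu_rho, nu_lam) seen at vt/eps + a/sqrt(eps), as in (36)."""
    p = g(a, t, diffusion_coefficient(rho, lam))
    return 1.0 - p, p

for a in (-2.0, 0.0, 2.0):
    print(a, mixture_weights(a, t=1.0, rho=0.2, lam=0.6))
```

At $a = 0$ both weights equal $1/2$, which is the fair mixture mentioned before the theorem; as $a$ grows the observer sits to the right of the microscopic shock with high probability and sees mostly $\nu_\lambda$.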
Remark (37) The case of two meeting shocks presents surprising features. Assume that the process at time zero has three densities: $u_0(r) = \rho_{-1} \mathbf{1}\{r < a\} + \rho_0 \mathbf{1}\{a \le r < b\} + \rho_1 \mathbf{1}\{r \ge b\}$, with $\rho_{-1} < \rho_0 < \rho_1$. According to the Burgers equation, these shocks meet at time $\bar{t} = (b - a)/(\rho_1 - \rho_{-1})$ at the point $\bar{r} = [a\rho_1 - b\rho_{-1} + (b - a)(1 - \rho_0)]/(\rho_1 - \rho_{-1})$. In this case the distribution at macroscopic time $\bar{t}$ at site $\bar{r}$ converges to a mixture of the three product distributions. More precisely, calling $\nu^\varepsilon_{u_0}$ the family of measures constructed as in Theorem 1, preliminary results indicate that
\[
\lim_{\varepsilon \to 0} \nu^\varepsilon_{u_0} S(\varepsilon^{-1} \bar{t})\, \tau_{\varepsilon^{-1} \bar{r}} = \tfrac{5}{16}\, \nu_{\rho_{-1}} + \tfrac{3}{8}\, \nu_{\rho_0} + \tfrac{5}{16}\, \nu_{\rho_1}. \tag{38}
\]
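The meeting time and place of the two shocks follow from their speeds alone: a shock between densities $u^-$ and $u^+$ in the sep moves at velocity $1 - u^- - u^+$, as in Theorem 4. The following sketch, with an illustrative function name and sample densities chosen only as an example, computes $\bar{t}$ and $\bar{r}$ by intersecting the two straight shock trajectories.

```python
def shock_meeting(a, b, rho_m1, rho_0, rho_1):
    """Meeting time and position of the shocks started at a and b (a < b).

    Assumes rho_m1 < rho_0 < rho_1, so that both discontinuities are
    increasing and the left shock is the faster one.
    """
    v_left = 1.0 - rho_m1 - rho_0    # shock that starts at a
    v_right = 1.0 - rho_0 - rho_1    # shock that starts at b
    t_bar = (b - a) / (v_left - v_right)   # equals (b - a) / (rho_1 - rho_m1)
    r_bar = a + v_left * t_bar
    return t_bar, r_bar

t_bar, r_bar = shock_meeting(a=0.0, b=1.0, rho_m1=0.1, rho_0=0.4, rho_1=0.7)
print(t_bar, r_bar)
```

Evaluating the right-hand trajectory $b + (1 - \rho_0 - \rho_1)\bar{t}$ at the same time must give the same point, which is a convenient consistency check on any closed-form expression for $\bar{r}$.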
Finally we turn to the fluctuation fields. The problem is to determine how the fluctuations around the initial density evolve with time. If the initial distribution is product with constant density $\rho$, then the fluctuation fields move deterministically along the corresponding characteristic, with velocity $1 - 2\rho$. When the system starts with the increasing shock $\nu_{\rho,\lambda}$, a conflict arises because the characteristics to the right of the shock are slower than those to the left of it. The conflict is resolved by the second class particle, which moves to compensate the fluctuations, changing in that way the position of the microscopic shock. In the space scale $t$, the one we use to study the fluctuations, the density fluctuations between $y^-$ and $y^+$ concentrate at the point $vt$, where $y^-$ and $y^+$ are the emanating points of the leftmost and rightmost characteristics arriving at $vt$ at time $t$. This is essentially the content of Theorem
7 and (43) of Theorem 8. In (44) of Theorem 8 we look at the point $vt$ at time $t$ in the scale $t^{1/2}$ to see how those fluctuations reflect on the position of the microscopic shock. The proofs of the theorems are based again on the fact that the limiting variance of the current along the characteristics vanishes. Let $\Upsilon^\varepsilon_t$ be the fluctuation fields defined by
\[
\Upsilon^\varepsilon_t(\Phi) = \varepsilon^{1/2} \sum_{x \in \mathbb{Z}} \Phi(\varepsilon x) \, [\eta_{\varepsilon^{-1} t}(x) - E \eta_{\varepsilon^{-1} t}(x)], \tag{39}
\]
for smooth integrable test functions $\Phi$. For $t = 0$, if $\eta_0$ is distributed according to $\nu_{\rho,\lambda}$, then
\[
\lim_{\varepsilon \to 0} \Upsilon^\varepsilon_0(\Phi) = \Upsilon(\Phi), \tag{40}
\]
where $\Upsilon(\Phi)$ is Gaussian white noise with mean zero and covariance
\[
E (\Upsilon(\Psi) \Upsilon(\Phi)) = \int u_0(r)(1 - u_0(r)) \Psi(r) \Phi(r)\, dr, \tag{41}
\]
where $u_0(r) = \lambda \mathbf{1}\{r \ge 0\} + \rho \mathbf{1}\{r < 0\}$.

Theorem 7. (Convergence of the fluctuation fields) Assume that the initial distribution of the process is $\nu_{\rho,\lambda}$. Let $v = 1 - \rho - \lambda$. Let $u(r, t) = \lambda \mathbf{1}\{r > vt\} + \rho \mathbf{1}\{r \le vt\}$. As $\varepsilon \to 0$, the fluctuation fields $\Upsilon^\varepsilon_t$ defined in (39) converge in a weak sense to the conservative solution $\Upsilon_t$ of the nonhomogeneous linear equation
\[
\frac{\partial}{\partial t} \Upsilon_t(r) = \frac{\partial}{\partial r}\, (1 - 2u(r, t))\, \Upsilon_t(r), \tag{42}
\]
with initial condition $\Upsilon$, the Gaussian field with zero mean and covariance given by (41).

Theorem 7 is a consequence of the $L^2$ convergence of the fluctuation fields established in the next theorem. The weak solutions of (42) present a singularity at the point $(vt, t)$ due to the discontinuity of $u(r, t)$ at $r = vt$. For this reason there is no unique solution. However there is only one conservative solution. To better describe it let us introduce some notation. Assume that $\Phi$ is the indicator of the interval $(a_1, a_2)$. For $i = 1, 2$ let $b_i(t)$ be the points where the characteristics arriving at $a_i$ at time $t$ emanate:
\[
b_i(t) =
\begin{cases}
a_i - (1 - 2\rho)t & \text{if } a_i < vt, \\
a_i - (1 - 2\lambda)t & \text{if } a_i > vt.
\end{cases}
\]
Then $\Upsilon_t$, the solution of (42), is given by the following:
\[
\int \Upsilon_t(r) \Phi(r)\, dr = \int_{a_1}^{a_2} \Upsilon_t(r)\, dr = \int_{b_1(t)}^{b_2(t)} \Upsilon_0(r)\, dr.
\]
We can interpret this by saying that if $vt \in (a_1, a_2)$ then the fluctuations present in the interval $(-(\lambda - \rho)t, (\lambda - \rho)t)$ at time zero concentrate at the point $vt$ at time $t$. Formula (44) below says that these fluctuations are present in the scale $\sqrt{t}$. Indeed they reflect the shock fluctuations that occur in this scale.
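The back-tracing of characteristics behind $b_1(t)$ and $b_2(t)$ can be written out directly. The sketch below uses illustrative function names and sample densities; it returns the interval at time zero whose fluctuations are carried into $(a_1, a_2)$ at time $t$, and shows that a short interval straddling $vt$ pulls back to nearly the whole of $(-(\lambda - \rho)t, (\lambda - \rho)t)$.

```python
def emanation_point(a, t, rho, lam):
    """Starting point of the characteristic that reaches a at time t."""
    v = 1.0 - rho - lam
    if a < v * t:
        return a - (1.0 - 2.0 * rho) * t   # region of density rho
    if a > v * t:
        return a - (1.0 - 2.0 * lam) * t   # region of density lam
    raise ValueError("a lies on the shock, where the characteristic is not unique")

def source_interval(a1, a2, t, rho, lam):
    """(b1(t), b2(t)) for the test interval (a1, a2)."""
    return emanation_point(a1, t, rho, lam), emanation_point(a2, t, rho, lam)

rho, lam, t = 0.2, 0.6, 1.0          # shock at vt = 0.2
print(source_interval(0.19, 0.21, t, rho, lam))   # straddles the shock
print(source_interval(0.30, 0.32, t, rho, lam))   # entirely to the right of it
```

An interval that does not contain $vt$ is simply translated, so its length is preserved; only intervals containing the shock collect fluctuations from a much longer stretch of the initial configuration.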
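Theorem 8. Let $A_\varepsilon = \mathbb{Z} \cap (\varepsilon^{-1} a_1, \varepsilon^{-1} a_2)$, $B_\varepsilon(t) = \mathbb{Z} \cap (\varepsilon^{-1} b_1(t), \varepsilon^{-1} b_2(t))$. Then
\[
\lim_{\varepsilon \to 0} \varepsilon E \Biggl[ \sum_{x \in A_\varepsilon} [\eta_{\varepsilon^{-1} t}(x) - E \eta_{\varepsilon^{-1} t}(x)] - \sum_{x \in B_\varepsilon(t)} (\eta_0(x) - E \eta_0(x)) \Biggr]^2 = 0. \tag{43}
\]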
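Let $c > 0$, $C_\varepsilon(t) = \mathbb{Z} \cap (\varepsilon^{-1} vt - \varepsilon^{-1/2} c, \varepsilon^{-1} vt + \varepsilon^{-1/2} c)$ and let $K_\varepsilon(t) = \mathbb{Z} \cap (-\varepsilon^{-1} t(\lambda - \rho), \varepsilon^{-1} t(\lambda - \rho))$. Then
\[
\lim_{\varepsilon \to 0} \varepsilon E \Biggl[ \sum_{x \in C_\varepsilon(t)} [\eta_{\varepsilon^{-1} t}(x) - E \eta_{\varepsilon^{-1} t}(x)] - T_{\varepsilon^{-1/2} c} \sum_{x \in K_\varepsilon(t)} (\eta_0(x) - E \eta_0(x)) \Biggr]^2 = 0, \tag{44}
\]
where $T_c$ is truncation by $c$:
\[
T_c F(\cdot) =
\begin{cases}
F(\cdot) & \text{if } |F(\cdot)| \le c, \\
c & \text{if } F(\cdot) > c, \\
-c & \text{if } F(\cdot) < -c.
\end{cases}
\]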
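The truncation $T_c$ is a clamp to the interval $[-c, c]$. A one-line sketch (the function name is illustrative) makes the three cases explicit:

```python
def truncate(value, c):
    """T_c applied to a number: clamp value to the interval [-c, c], c >= 0."""
    return max(-c, min(c, value))

print([truncate(x, 1.0) for x in (-5.0, -0.3, 0.0, 0.3, 5.0)])
```

In (44) the truncation level is $\varepsilon^{-1/2} c$, which matches the size of the window $C_\varepsilon(t)$: a window of that many sites cannot hold a larger excess or deficit of particles than its own length.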
Note that $C_\varepsilon(t)$ is an interval of length proportional to $\varepsilon^{-1/2}$ around the macroscopic point $vt$. When $c \to \infty$, (44) says that the fluctuations at time $t$ in a region of length proportional to $\sqrt{t}$ around $vt$ are given by the fluctuations at time 0 in a region of length proportional to $t$.

Remark (45) Open problem. How do the fluctuation fields behave in the rarefaction front?
2. Final Remark (46) We have described results characterizing the shocks at a microscopic level for one dimensional nearest neighbor processes. The results also hold for the partially asymmetric case, when jumps to the left are allowed at rate q (< p, the rate of jumps to the right). With the exception of the hydrodynamic limit of Theorem 1 that holds for asymmetric processes other than nearest neighbor and in greater dimensions for some initial conditions, the techniques used to prove the other results do not seem to work for a more general jump transition probability in one dimension or in more than one dimension.
3. Notes and References The simple exclusion and the zero range processes were introduced by Spitzer (1970). The set of invariant measures for the sep was characterized by Liggett (1976, 1985) and the sets of invariant measures for the zrp were characterized by Andjel (1982). Cocozza (1985) gives conditions (i)–(iii) under which the product measures (3) are invariant for the misanthropes process.
The limit of Theorem 1 was first proven by Liggett (1975, 1977) for the case r = 0, before the connection between the process and the Burgers equation appeared. Rost (1982) first established the hydrodynamical properties of the equation for initial condition . . . 111000 . . . (decreasing profile). Then Benassi and Fouque (1987) extended the result to decreasing one step profiles (the proof in the increasing case is incomplete) and Andjel and Vares (1987) did it for increasing profiles. Benassi, Fouque, Saada, and Vares (1991) worked out monotone initial profiles. The hydrodynamic limit for the zrp was studied by Andjel and Kipnis (1984). For the misanthropes process with general initial integrable profiles the hydrodynamic limit (6) follows now as a consequence of the law of large numbers of Rezakhanlou (1990) and the proof of local equilibrium of Landim (1992). Landim (1991) has some partial results in greater dimensions. Lax (1972) shows how shock waves appear in the related equations. Rezakhanlou (1993a, b) studies further questions about the equation. In particular it is shown there that if the initial condition presents no decreasing discontinuity at a, then there is only one characteristic emanating from a, as mentioned after (11).

Blocking measures were first noticed by Liggett (1976). This was the first example showing that this model presents microscopic shocks. Then the case ρ = 0, λ < 1 was studied by Ferrari (1986). The existence of the microscopic shock was supported by simulations of Boldrighini, Cosimi, Frigio, and Nunes (1989). Ferrari, Kipnis, and Saada (1991) show the existence of the microscopic shock in a non-Markovian way for the sep. However their construction is at the base of the Markovian characterization as well as most of the other results for the sep. Ferrari (1992) shows that an isolated second class particle describes the microscopic shock.
Remark (9) is a consequence of a result of Derrida, Janowsky, Lebowitz, and Speer (1993), who computed the invariant measure $\mu$ for finite systems and the asymptotic density of particles around the shock. Building on this computation Ferrari, Fontes, and Kohayakawa (1993) described $\mu$ completely. This description is important to show the equivalence between $\mu$ and $\nu_{\rho,\lambda}$ and the properties of $\mu$.

The weak law of large numbers for the second class particle given by identity (12) of Theorem 3 was proven by Rezakhanlou (1993), who also established the conditions on the rates to obtain (12) for the misanthropes process. Ferrari (1992) proved the strong law (13) for non-decreasing initial profiles with at most one increasing shock. The proof works also when a finite number of increasing shocks is present in the (non-decreasing) initial profile. Rezakhanlou (1993) shows that in the decreasing case, the second class particle is concentrated in the set of characteristics emanating from the discontinuity point. Ferrari and Kipnis (1993) prove that in this case the second class particle chooses uniformly among those characteristics (Remark (17)).

Spohn (1991) proved that the expected position of the second class particle is given by the velocity predicted by the macroscopic equation. This is the content of (18). He also conjectured (19), the exact value of the asymptotic variance. Boldrighini et al. (1989) performed computer simulations that supported the conjecture. Gärtner and Presutti (1989) show that the position of the leftmost particle when the initial densities are $\rho = 0$ and $\lambda < 1$ depends on the initial configuration. Ferrari (1992) shows the equivalence between the exact value of the limiting variance (19) and the dependence on the initial distribution (20) and that the right-hand side of
(19) is a lower bound for D. Finally Ferrari and Fontes (1993b) settle this problem by computing explicitly the asymptotic variance (19). The heuristics of the conjecture (23) about the fluctuations of a second class particle when λ = ρ are in Spohn (1991). The diffusive limit in dimensions d ≥ 3 as described by (25) was performed by Esposito, Marra, and Yau (1993), using the relative entropy method. The variance of the current was computed by Ferrari and Fontes (1993a). Remark (33) was inspired on a conjecture of Rezakhanlou (1993a) about the fluctuations of the second class particle about its mean, under general initial profiles. In that paper can be found heuristics leading to (34). The central limit theorem for the leftmost particle when ρ = 0 is a consequence of Burke’s Theorem (see Spitzer 1970, Liggett 1985, Kipnis 1986, Wick 1985 and De Masi, Kipnis, Presutti, and Saada 1988). This is also a special case of Theorem 5 which was proven by Ferrari and Fontes (1993b). The dynamical phase transition of Theorem 6 was proven first by Wick (1985) and De Masi, Kipnis, Presutti, and Saada (1988) for ρ = 0 and by Andjel, Bramson, and Liggett (1988) for λ + ρ = 1. The distribution of the process at the meeting place of two shocks given by (38) was found by Ferrari, Fontes, and Vares. The fact that the fluctuation fields move deterministically if the initial profile is constant was proven by Benassi and Fouque (1992) and also by Ferrari and Fontes (1993a). Theorems 7 and 8 about the convergence of the fluctuation fields when there is an increasing shock are proven by Ferrari and Fontes (1993b). The behavior of a tagged particle for the nearest neighbor sep with jumps to the left and right with probability q < p respectively has also been studied. The system starts with the invariant measure νρ0 , conditioned to have a particle at the origin. Kipnis (1986) proved a central limit theorem and law of large numbers for the position of the tagged particle. 
De Masi and Ferrari (1985) computed the variance of the limiting Gaussian distribution. Ferrari and Fontes (1993c) show that the position of the tagged particle is given by a Poisson process of rate (1 − ρ)(p − q) plus a perturbation of order 1. Saada (1987) proved a law of large numbers for the process in dimensions greater than one. Rezakhanlou (1993b) shows that a tagged particle in a non-equilibrium system satisfies a law of large numbers. The macroscopic position of the tagged particle can be described as the solution of an equation related to the hydrodynamic limit. The result also holds for the zrp. Bramson (1988), Lebowitz, Presutti, and Spohn (1988) and Spohn (1991) reviewed some of the previous results. A survey on the beginning of the hydrodynamic limits for particle systems is given by De Masi, Ianiro, Pellegrinotti, and Presutti (1984). The physical literature can be found in van Beijeren (1991). Shocks in a cellular automaton introduced by Boghosian and Levermore (1987) can be found in Cheng, Lebowitz, and Speer (1990) and in Ferrari and Ravishankar (1993). In the last paper relations with the Automata 184 of Wolfram (1983) are established. Finally we mention that Walker (1989) describes actual shocks in real highways which look very much as the shocks one find in the mathematical models studied here.
Acknowledgement

I thank Claude Kipnis for a very careful and critical reading of the manuscript. I also thank Enrique Andjel, Luiz Renato Fontes, and Fraydoun Rezakhanlou for valuable discussions. The final version of this paper was written while the author was a participant of the program Random Spatial Processes at the Isaac Newton Institute for Mathematical Sciences, University of Cambridge, whose very nice hospitality is acknowledged. This review is partially supported by FAPESP ‘Projeto Temático’ Grant number 90/3918-5, CNPq, and SERC Grant GR G59981.

References

Andjel, E. D. (1982). Invariant measures for the zero range process. Annals of Probability 10, 525–547.
Andjel, E. D., Bramson, M., and Liggett, T. M. (1988). Shocks in the asymmetric simple exclusion process. Probability Theory and Related Fields 78, 231–247.
Andjel, E. D. and Kipnis, C. (1984). Derivation of the hydrodynamical equation for the zero-range interaction process. Annals of Probability 12, 325–334.
Andjel, E. D. and Vares, M. E. (1987). Hydrodynamic equations for attractive particle systems on Z. Journal of Statistical Physics 47, 265–288.
Beijeren, H. van (1991). Fluctuations in the motions of mass and of patterns in one-dimensional driven diffusive systems. Journal of Statistical Physics.
Benassi, A. and Fouque, J-P. (1987). Hydrodynamical limit for the asymmetric simple exclusion process. Annals of Probability 15, 546–560.
Benassi, A. and Fouque, J-P. (1992). Fluctuation field for the asymmetric simple exclusion process. Proceedings of Oberwolfach Conference in SPDE, 1989, Birkhäuser, Boston.
Benassi, A., Fouque, J-P., Saada, E., and Vares, M. E. (1991). Asymmetric attractive particle systems on Z: hydrodynamical limit for monotone initial profiles. Journal of Statistical Physics.
Boghosian, B. M. and Levermore, C. D. (1987). A cellular automaton for Burgers’ equation. Complex Systems 1, 17–30.
Boldrighini, C., Cosimi, C., Frigio, A., and Grasso-Nunes, M. (1989). Computer simulations of shock waves in completely asymmetric simple exclusion process. Journal of Statistical Physics 55, 611–623.
Bramson, M. (1988). Front propagation in certain one dimensional exclusion models. Journal of Statistical Physics 51, 863–869.
Cheng, Z., Lebowitz, J. L., and Speer, E. R. (1990). Microscopic shock structure in model particle systems: the Boghosian Levermore revisited. Communications in Pure and Applied Mathematics 44.
Cocozza, C. T. (1985). Processus des misanthropes. Zeitschrift für Wahrscheinlichkeitstheorie verw. Geb. 70, 509–523.
De Masi, A. and Ferrari, P. A. (1985). Self diffusion in one-dimensional lattice gases in the presence of an external field. Journal of Statistical Physics 38, 603–613.
De Masi, A., Ianiro, N., Pellegrinotti, A., and Presutti, E. (1984). A survey of the hydrodynamical behavior of many particle systems. Nonequilibrium Phenomena II: From Stochastic to Hydrodynamics (J. L. Lebowitz and E. W. Montroll, ed.), Studies in Statistical Mechanics, vol. 11, North Holland, Amsterdam.
De Masi, A., Kipnis, C., Presutti, E., and Saada, E. (1988). Microscopic structure at the shock in the asymmetric simple exclusion. Stochastics 27, 151–165.
Derrida, B., Janowsky, S., Lebowitz, J. L., and Speer, E. (1993). Exact solution of the totally asymmetric simple exclusion process: shock profiles. Journal of Statistical Physics (to appear).
Esposito, E., Marra, R., and Yau, H. T. (1993). Diffusive limit of asymmetric simple exclusion (to appear).
Ferrari, P. A. (1986). The simple exclusion process as seen from a tagged particle. Annals of Probability 14, 1277–1290.
Ferrari, P. A. (1992). Shock fluctuations in asymmetric simple exclusion. Probability Theory and Related Fields 91, 81–101.
Ferrari, P. A. and Fontes, L. R. G. (1993a). Current fluctuations for the asymmetric simple exclusion process. Annals of Probability.
Ferrari, P. A. and Fontes, L. R. G. (1993b). Shock fluctuations in the asymmetric simple exclusion process (to appear).
Ferrari, P. A. and Fontes, L. R. G. (1993c). The net output process of a system with infinitely many queues (to appear).
Ferrari, P. A., Fontes, L. R. G., and Kohayakawa, Y. (1993). Invariant measure for a two species asymmetric process (to appear).
Ferrari, P. A., Fontes, L. R. G., and Vares, M. E. (1993) (to appear).
Ferrari, P. A. and Kipnis, C. (1993). Second class particles in the rarefaction fan (to appear).
Ferrari, P. A., Kipnis, C., and Saada, E. (1991). Microscopic structure of travelling waves for asymmetric simple exclusion process. Annals of Probability 19, 226–244.
Ferrari, P. A. and Ravishankar, K. (1992). Shocks in asymmetric exclusion automata. Annals of Applied Probability 24, 928–941.
Gärtner, J. and Presutti, E. (1989). Shock fluctuations in a particle system. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) B 53, 1–14.
Kipnis, C. (1986). Central limit theorems for infinite series of queues and applications to simple exclusion. Annals of Probability 14, 397–408.
Landim, C. (1991). Hydrodynamical limit for asymmetric attractive particle systems on $\mathbb{Z}^d$. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 27, 559–581.
Landim, C. (1992). Conservation of local equilibrium for attractive particle systems on $\mathbb{Z}^d$. Annals of Probability (to appear).
Lax, P. D. (1972). The formation and decay of shock waves. American Mathematical Monthly (March).
Lebowitz, J. L., Presutti, E., and Spohn, H. (1988). Microscopic models of hydrodynamical behavior. Journal of Statistical Physics 51, 841–862.
Liggett, T. M. (1975). Ergodic theorems for the asymmetric simple exclusion process. Transactions of the American Mathematical Society 213, 237–261.
Liggett, T. M. (1976). Coupling the simple exclusion process. Annals of Probability 4, 339–356.
Liggett, T. M. (1977). Ergodic theorems for the asymmetric simple exclusion process, II. Annals of Probability 4, 339–356.
Liggett, T. M. (1985). Interacting Particle Systems. Springer, Berlin.
Rezakhanlou, H. (1990). Hydrodynamic limit for attractive particle systems on $\mathbb{Z}^d$. Communications in Mathematical Physics 140, 417–448.
Rezakhanlou, H. (1993a). Microscopic structure of shocks in one conservation laws (to appear).
Rezakhanlou, H. (1993b). Evolution of tagged particles in non-reversible particle systems (to appear).
Rost, H. (1982). Nonequilibrium behavior of a many particle process: density profile and local equilibrium. Zeitschrift für Wahrscheinlichkeitstheorie verw. Geb. 58, 41–53.
Saada, E. (1987). A limit theorem for the position of a tagged particle in a simple exclusion process. Annals of Probability 15, 375–381.
Spitzer, F. (1970). Interaction of Markov processes. Advances in Mathematics 5, 246–290.
Spohn, H. (1991). Large Scale Dynamics of Interacting Particles. Springer, Berlin.
Walker, J. (1989). How to analyze the shock waves that sweep through expressway traffic. Scientific American (August), 84–87.
Wick, D. (1985). A dynamical phase transition in an infinite particle system. Journal of Statistical Physics 38, 1015–1025.
SELF-ORGANIZATION OF RANDOM CELLULAR AUTOMATA: FOUR SNAPSHOTS
DAVID GRIFFEATH Department of Mathematics University of Wisconsin Madison, WI 53706 U.S.A.
Abstract. We discuss four very simple random cellular automaton (CA) systems that self-organize over time. The first is a directed interface process which stabilizes in a coherent statistical equilibrium. The second is a model for excitable media: nucleating spiral cores lead to a locally periodic final state. The third model is a prototype for curvature-driven clustering. And the fourth illustrates the evolution of complex viable structures near phase boundaries in a parameterized family of non-linear population dynamics. For each CA we present a mix of rigorous results, conjectures, and empirical findings based on computer experimentation.

Key words: Cellular automaton, interacting particle system, interface, excitable medium, self-organization, nucleation, metastability, artificial life.
1. Introduction

By a cellular automaton (CA) we mean a spatially-distributed dynamical system that evolves via local, homogeneous, parallel updating. Somewhat informally, we will call a CA random if its evolution has random ingredients, either in the starting state or in the dynamics. Deterministic CA rules may be viewed as digital counterparts of partial differential equations. Like their more traditional relatives, they can emulate a broad range of fundamental spatio-temporal phenomena across the spectrum of applied science. Random CA models include discrete-time variants of the interacting particle systems that are a mainstay of mathematical physics; the synchronous CA versions are ideally suited for real-time simulation on parallel computing devices. See [TM] for a nice, practical introduction to CA algorithms. Rick Durrett’s St. Flour lectures [Dur2] survey many recent rigorous results for interacting random systems, with emphasis on co-existence of phases and connections with partial differential equations.

Our goal here is to describe four kinds of random CA that serve as prototypes for various non-linear complex systems. Each example is simple enough that a substantive, rigorous mathematical analysis seems within reach. The unifying theme is self-organization: a tendency toward large-scale, coherent structure starting from disordered initial states. We will focus on irreversible dynamics somewhat beyond the purview of traditional statistical mechanics. Rather, our models are motivated by problems from computer science, chemistry, and biology. Theoretical researchers in those fields are beginning to use random CA models to gain insights into organizational, adaptive, and evolutionary principles of spatially-distributed dynamics.
Their ideas constitute a rich new source of important problems in stochastic processes, while probability theory, in turn, has much to contribute to their efforts.

2. Asynchronous Deterministic Computation: A Directed Interface

Our first snapshot comes from computer science [Tof]. Imagine a two-dimensional integer array of cpu’s, with nearest neighbor connections, each assigned to carry out a sequence of calculations. In order for the machine at $x$ to perform its $(n + 1)$th job reliably, it must wait until all neighboring machines have completed their $n$th jobs (in order to access needed information from the network). Typically, the times required to complete the various jobs are rather unpredictable, so an organizational algorithm is required to keep the cpu’s in synch. One can accomplish this by attaching to each node in the network a phase variable that keeps track of how many jobs have been completed, and prohibits a cpu from proceeding with its computation whenever its phase is ahead of any of the phases of its neighbors. In practice, the phases need only cycle through four states: 0, 1, 2, 3, say. If all sites are initially in phase, then the variables at neighboring nodes will never differ by more than one in modulus, and so four states are sufficient to determine whether a neighbor of $x$ is ahead of, equal to, or behind the machine at $x$. Of course the waiting protocol slows down each individual cpu, but only by a constant factor independent of the size of the network. In many contexts this is a small price to pay for the benefits of parallel computation.

Toffoli and Margolus [TM] have proposed the following random CA prototype for such a synchronization scheme. Assume that the durations of jobs are i.i.d. geometric with parameter $p$. Imagine a directed interface representation of the dynamics on the space–time lattice $\mathbb{Z}^2 \times \mathbb{N}$, where the last coordinate codes the number of jobs completed.
Write $\xi_t(x) = n$ to signify that the machine at $x$ has completed $n$ jobs at time $t$. Then starting from the flat state $\xi_0(x) \equiv 0$, at all times $t$ the system consists of a connected interface, by which we mean that $|\xi_t(x + 1) - \xi_t(x)| \in \{0, 1\}$ for all $x$, $t$. The discrete-time dynamics are captured by the following maxim.

Relative minima advance one unit with probability $p$.    (1)
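As an illustration of maxim (1), here is a minimal sketch of one parallel update on a one-dimensional ring of sites. The ring geometry, the function names `step`, `is_connected` and `run`, and the convention that a site counts as a relative minimum when it is not ahead of either neighbor are assumptions made for this example; they are not taken from [TM].

```python
import random

def step(heights, p, rng):
    """One parallel update of the directed interface on a ring of sites."""
    n = len(heights)
    new = list(heights)
    for x in range(n):
        left = heights[(x - 1) % n]
        right = heights[(x + 1) % n]
        # A site may advance only if it is not ahead of either neighbor,
        # and then it does so with probability p.
        if heights[x] <= left and heights[x] <= right and rng.random() < p:
            new[x] = heights[x] + 1
    return new

def is_connected(heights):
    """Check that neighboring heights on the ring differ by at most one."""
    n = len(heights)
    return all(abs(heights[(x + 1) % n] - heights[x]) <= 1 for x in range(n))

def run(n_sites, p, n_steps, seed=0):
    """Evolve the flat initial state for n_steps updates."""
    rng = random.Random(seed)
    heights = [0] * n_sites
    for _ in range(n_steps):
        heights = step(heights, p, rng)
    return heights

if __name__ == "__main__":
    steps = 1000
    final = run(n_sites=1000, p=0.5, n_steps=steps)
    print("empirical speed:", sum(final) / len(final) / steps)
    print("still connected:", is_connected(final))
```

The average height divided by the number of updates gives a rough empirical estimate of the speed on this finite ring.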
See page 95 of [TM] for pictures. Observe that the shape of the interface in any finite network of this sort (i.e., the state modulo translations in $n$) comprises an irreducible aperiodic finite Markov chain. Thus the shape chain converges to a unique equilibrium, and so the interface advances at an asymptotic speed given by the invariant density of relative minima. But this stationary measure appears intractable, so the finite state space theory is of little use in the analysis of large networks. Rather, one wants to know that the directed interface $\xi_t(x)$, defined on all of $\mathbb{Z}^2 \times \mathbb{N}$, advances with a positive asymptotic speed $\alpha = \alpha(p)$, and that the infinite system is well-approximated by large finite ones.

We recently learned an elegant approach to the speed question for the directed interface process from Harry Kesten. The idea is to establish an exact connection with last passage percolation (cf. [GK]). Attach to each $(x, n)$ a random variable $X_{(x,n)}$ that records how long the $n$th job at $x$ takes. (This notation works out best if the initial job is considered the 0th.) Let $T_{(x,n)}$ denote the time until the $n$th job at
$x$ begins. We claim that the distribution of $T_{(x,n)}$ is given by the maximum, over all paths connecting $\mathbb{Z}^2 \times \{0\}$ to $(x, n)$ in the directed graph $(\mathbb{Z}^2 \times \mathbb{N}, E)$, of the sum of the $n$ variables $X$ along the path, where $E$ consists of edges from each site $(x, m)$ to all $(y, m - 1)$ such that $y$ is a neighbor of $x$. Note that in the two-dimensional nearest-neighbor case the max is over $5^n$ paths, and that the $n$ variables on each path range from level $0$ to level $n - 1$. The proof of the claim is a straightforward induction on $n$. In essence, $T_{(x,n)}$ equals the maximum, over the neighbors $y$ of $x$ (including $x$ itself), of $T_{(y,n-1)}$ plus the time it takes to carry out the $(n - 1)$th job at $y$. Using this last passage percolation representation, and denoting the origin of $\mathbb{Z}^2$ by $0$, one can show that $T_{(0,\cdot)}$ is superadditive. Hence, by the Kingman–Liggett Ergodic Theorem (see [Dur1] or [Lig]),

$$\frac{1}{n}\, T_{(0,n)} \to \gamma \quad \text{as } n \to \infty \text{ a.s. and in } L^1.$$

Straightforward inversion then yields a limiting speed for the interface:

$$\frac{1}{t}\, \xi_t(0) \to \alpha = \gamma^{-1} \quad \text{as } t \to \infty \text{ a.s.} \tag{2}$$
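The last passage recursion behind this representation can be sketched numerically. The code below is only an illustration under stated assumptions: it uses a one-dimensional ring (neighbors $x - 1$, $x$, $x + 1$) rather than $\mathbb{Z}^2$, takes geometric durations on $\{1, 2, \ldots\}$, and the names `geometric` and `start_times` are hypothetical.

```python
import random

def geometric(p, rng):
    """Number of Bernoulli(p) trials up to and including the first success."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k

def start_times(n_sites, n_jobs, p, rng):
    """Return T with T[n][x] = time at which job n (counted from 0) begins at x."""
    T = [[0] * n_sites]
    for _ in range(n_jobs):
        prev = T[-1]
        # Completion time of the previous job at each site:
        # its start time plus its own (fresh) geometric duration.
        done = [prev[y] + geometric(p, rng) for y in range(n_sites)]
        # The next job at x starts once x and both neighbors are done.
        T.append([max(done[(x - 1) % n_sites], done[x], done[(x + 1) % n_sites])
                  for x in range(n_sites)])
    return T

if __name__ == "__main__":
    rng = random.Random(0)
    n_jobs = 2000
    T = start_times(n_sites=200, n_jobs=n_jobs, p=0.5, rng=rng)
    gamma_hat = T[n_jobs][0] / n_jobs
    print("estimate of gamma:", gamma_hat, " estimate of alpha:", 1 / gamma_hat)
```

On a finite ring the ratio `T[n][0] / n` is only a rough stand-in for $\gamma$, since the strong law concerns the infinite system.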
The gist of the superadditivity property is to show that

$$T_{(0,m+n)} \ge T'_m + T'_n, \tag{3}$$

where $T'_m$ and $T'_n$ are independent, with the same distributions as $T_{(0,m)}$ and $T_{(0,n)}$, respectively. Let us sketch the derivation of (3). Starting at $(0, m + n)$, trace back the last passage path $\wp_1$ to $\mathbb{Z}^2 \times \{m\}$, let $T'_n$ denote the time along this path, and let $y$ be its final position. Then, starting at $(y, m)$, trace back the last passage path $\wp_2$ to $\mathbb{Z}^2 \times \{0\}$, and let $T'_m$ denote the time along this path. For any value of $y$, $T'_m$ has the same distribution as $T_{(0,m)}$ by translation invariance, and likewise $T'_n$ has the same distribution as $T_{(0,n)}$. Moreover, it is clear that $T_{(0,m+n)}$ must be at least as large as the sum of variables $X$ along the concatenated path $\wp_1 \oplus \wp_2$, as desired.

Of course, to know that our network doesn’t grind to a halt, we need to ensure that $\alpha > 0$, or equivalently that $\gamma < \infty$. The last passage percolation representation facilitates a Peierls style argument to this effect. Namely, since $T_{(x,n)}$ is a maximum over $5^n$ sums of $n$ independent geometric random variables with parameter $p$,

$$P(T_{(x,n)} > an) \le 5^n\, P(S_n > an),$$

where $S_n$ denotes one such sum. If $a > 6/p$ the right side tends to 0 at an exponential rate in $n$, by a standard large deviations computation. Thus Borel–Cantelli implies that

$$\alpha(p) > \frac{p}{6}.$$
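The large deviations computation can be checked numerically. The sketch below assumes geometric variables on $\{1, 2, \ldots\}$, so that $P(S_n = k) = \binom{k-1}{n-1} p^n (1-p)^{k-n}$, and uses Stirling's formula for the leading exponential order of the term at $k = an$; for $a$ above the mean $1/p$ the tail decays at that same rate. The function name `peierls_exponent` is hypothetical.

```python
import math

def peierls_exponent(p, a):
    """Leading exponential rate (per n) of 5**n * P(S_n = a*n), for a > 1/p.

    By Stirling's formula,
        (1/n) log P(S_n = a*n) -> a*H(1/a) + log p + (a - 1)*log(1 - p),
    where H is the binary entropy in nats. A negative return value means
    that the union bound over 5**n paths still decays exponentially.
    """
    q = 1.0 / a
    entropy = -q * math.log(q) - (1 - q) * math.log(1 - q)
    return math.log(5) + a * entropy + math.log(p) + (a - 1) * math.log(1 - p)

if __name__ == "__main__":
    for p in (0.01, 0.1, 0.5, 0.9):
        print(p, peierls_exponent(p, 6 / p))
```

For the values of $p$ tried here the exponent at $a = 6/p$ is negative, consistent with the bound stated above; this is a numerical sanity check, not a proof.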
Superadditivity gives little insight into the stability of the directed interface however. Indeed, it seems challenging to extend the above analysis to initial configurations with non-zero asymptotic slope. To dig deeper, a more illuminating technique is coupling. Let us discuss this approach in the one-dimensional setting, where it is most effective. Here the interface may be depicted as a polygonal function f : Z → N
or, by extension, f : Z → Z, with nearest neighbor edges of three types (moving left to right): ↗, →, and ↘. Adopting the usual orientation, relative minima of this polygonal function move up in parallel, each with probability $p$. We can imagine starting the directed interface $\xi_t$ from the flat state $\xi_0(x) \equiv 0$, or from any other polygonal $f$. The basic coupling gives a way to represent two directed interfaces on the same probability space, and on the same space–time diagram Z × Z: whenever both interfaces have a relative minimum at a site $x$, they use the same probability $p$ coin toss to decide whether to move up.

By restricting the state space to a finite one-dimensional lattice with wrap-around edges, it is a simple matter to simulate the basic coupling of $\xi_t$ on a computer. We carried out many such simulations several years ago. For instance, Figure 1 shows the simultaneous evolution, with $p = \frac{1}{2}$, of the flat initial state (−) and an initial state of maximal oscillation (∨) on a periodic array of 1,000 sites, at time $t$ = 1,000 and then $t$ = 10,000. The flat state rapidly settles into a stable equilibrium. The ∨ state undergoes a much slower (hydrodynamic) transition to the same equilibrium since its peak cannot equilibrate until stochastic effects propagate from the valley. What is clear from the simulation is that the two interfaces are very successfully coupled in a neighborhood of the original minimal site of the ∨ state. By this we mean that the great majority of pairs of edges have the same slopes (−1, 0, or 1) at the same locations in that neighborhood, so the difference between the interfaces is almost constant there. Evidently, as time goes on, the region of successful coupling grows and the density of discrepancies tends to 0. This is strong empirical evidence for loss of memory in the directed interface process, and hence for the existence of a unique equilibrium starting from any profile with asymptotic slope 0.
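A simulation of the basic coupling along these lines might look as follows. This is a sketch, not the program used for Figure 1: the helper names `coupled_step`, `slope_discrepancies` and `vee` are hypothetical, the ∨ state is taken to be a single valley and a single peak on the ring, and one coin is drawn per site per step and shared by both interfaces.

```python
import random

def coupled_step(a, b, p, rng):
    """Advance two interfaces on the same ring using one coin toss per site."""
    n = len(a)
    coins = [rng.random() < p for _ in range(n)]

    def advance(h):
        # h[x - 1] wraps around to the last site when x == 0.
        return [h[x] + 1
                if coins[x] and h[x] <= h[x - 1] and h[x] <= h[(x + 1) % n]
                else h[x]
                for x in range(n)]

    return advance(a), advance(b)

def slope_discrepancies(a, b):
    """Count edges on which the two interfaces have different slopes."""
    n = len(a)
    return sum((a[(x + 1) % n] - a[x]) != (b[(x + 1) % n] - b[x])
               for x in range(n))

def vee(n):
    """Profile with slopes +1 then -1 on a ring of even size n."""
    return [min(x, n - x) for x in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    n = 1000
    flat, v = [0] * n, vee(n)
    for t in range(1, 10001):
        flat, v = coupled_step(flat, v, 0.5, rng)
        if t in (1000, 10000):
            print(t, slope_discrepancies(flat, v) / n)
```

The printed quantity is the density of edges where the two slopes disagree, which is the discrepancy density discussed in the text.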
We should note that if f has non-zero asymptotic slope m, then this slope will be preserved by the dynamics, and hence such an f cannot possibly be successfully coupled to the flat state. Rather there should be a unique equilibrium for each slope in m ∈ [0, 1), each with its own characteristic speed αm (p). It also seems clear that slope 0 interfaces should propagate most rapidly, and that αm (p) → 0 as m → 1, since a slope 1 interface cannot move at all. Of course simulations provide little understanding as to why a coupling works. Quite recently Larry Gray [Gra] has established the stability of the one-dimensional directed interface process by proving that the basic coupling is successful. The strong law (2) follows as a corollary of his equilibrium analysis, which applies to any initial configuration with an asymptotic slope m. In essence, Gray’s method exploits a Lyapunov function for the coupled increment processes. Since each increment of the interface has slope −1, 0, or 1, the difference between coupled slopes at any location is an integer in [−2, 2]. Thinking of these discrepancies as signed particles, at most two per site, one can check a key monotonicity property (which, unfortunately, does not extend to dimensions d ≥ 2). Namely, particles of opposite sign can annihilate, and particles can move to neighboring locations, but new particles are never created. Using this observation, and the fact that annihilations must inevitably occur sooner or later, Gray proves that the density of coupling discrepancies tends to 0 starting from any two configurations with the same m. He is then able to extend techniques of Liggett [Lig], developed originally for exclusion processes, to conclude that there is a unique extremal invariant measure πm = πm (p) for each slope m, and that any
Fig. 1. Basic coupling of the directed interface.
initial interface with slope $m$ settles down to $\pi_m$. The same techniques apply to a fairly broad class of one-dimensional models that can include, for instance, advances of more than one unit as well as retreats.

Random interface dynamics are notoriously difficult to analyze rigorously since their conserved quantities tend to give rise to self-organized distributions with long-range correlations. A few such processes are isomorphic to simple exclusion models. For example, discrete-time one-sided exclusion on Z is equivalent to an interface of ↗ and ↘ increments with update rule (1) except that the advance is two units. In large part because product measures are invariant for simple exclusion, there is an amazingly rich theory available in this case, and even more detailed results continue to appear. However, most interface equilibria are computationally intractable, in which case robust methods are needed. Last passage percolation and basic coupling now seem to provide the beginnings of a more general theory.
3. Excitable Cellular Automata: Nucleating Waves

Mathematical models for excitable media attempt to capture and explain the key features of periodic wave transmission through environments such as a network or tissue. Since the pioneering work of Wiener and Rosenblueth [WR], a great many researchers from the applied sciences have adopted various modeling frameworks,
most notably partial differential equations, cellular automata, and coupled lattice maps. A common feature of many of these models is the requirement that some threshold level of excitation occur in a neighborhood of a location in order for that location to become excited and conduct a pulse. Such activity is typically followed by a refractory period in which further excitation is inhibited. Physical systems that exhibit this basic phenomenology include neural networks, cardiac muscle, and the Belousov–Zhabotinsky (BZ) oscillating chemical reaction. In two dimensions, excitable systems are typically characterized by the emergence of spatially-distributed stable target patterns or spirals. With the advent of effective computer visualization technology there has been a recent flurry of excitable medium modeling that tries to approximate precise quantitative features of observed phenomena (e.g., curvature and wave velocity in the BZ reaction). To accomplish this, most experimentalists introduce several rather ad hoc parameters designed to generate an assortment of non-linear effects.

Arguably the simplest dynamical system that emulates an excitable medium is a 3 state, range 1, threshold 1 cellular automaton known as the Greenberg–Hastings model (GHM) (cf. [GH]). Over the past few years, in joint work with Robert Fisch and Janko Gravner, we have carried out a detailed experimental study of a three-parameter family of simple GHM-type rules. The parameters are the range $\rho$ of interaction (assuming a box neighborhood $N = \{x \in \mathbb{Z}^2 : \|x\|_\infty \le 1\}$), the threshold number $\theta$ of excited neighbors required for a cell to become excited, and the number $\kappa$ of possible states (colors) per cell. Here state 0 is rested, 1 is excited, and $2, \ldots, \kappa - 1$ are refractory. A rested cell becomes excited by contact whenever it finds at least $\theta$ excited cells within its range $\rho$ (box) neighborhood.
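A rule of this type can be sketched in a few lines. The code below is a minimal illustration, assuming a finite torus (wrap-around edges) and assuming that excited and refractory states advance by one each step, cycling from $\kappa - 1$ back to 0; the function name `ghm_step` is hypothetical.

```python
def ghm_step(grid, rho, theta, kappa):
    """One synchronous update on a torus; grid[i][j] is a state in 0..kappa-1."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            state = grid[i][j]
            if state >= 1:
                # Excited and refractory states advance automatically.
                new[i][j] = (state + 1) % kappa
            else:
                # A rested cell counts excited cells in its range-rho box.
                excited = sum(
                    grid[(i + di) % rows][(j + dj) % cols] == 1
                    for di in range(-rho, rho + 1)
                    for dj in range(-rho, rho + 1)
                )
                if excited >= theta:
                    new[i][j] = 1
    return new

if __name__ == "__main__":
    # A single excited cell emits one expanding square ring (3 colors, range 1).
    grid = [[0] * 9 for _ in range(9)]
    grid[4][4] = 1
    for _ in range(3):
        grid = ghm_step(grid, rho=1, theta=1, kappa=3)
    for row in grid:
        print("".join(str(s) for s in row))
```

The grid should be at least $2\rho + 1$ sites wide in each direction, so that the wrapped box does not count a cell twice.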
The refractory states advance automatically each time, finally cycling from $\kappa - 1$ back to 0, so $\kappa$ governs the recovery time. In symbols,

$\xi_t(x) = 0$ means the medium is rested (excitable) at $x$ at time $t$;
$\xi_t(x) = 1$ means the medium is excited at $x$ at time $t$;
$\xi_t(x) = 2, \ldots, \kappa - 1$ means the medium is recovering (refractory) at $x$ at time $t$;

and the deterministic update rule is:

$0 \to 1$ at $x$ iff $\exists \ge \theta$ 1’s within $x + \rho N$ (the range $\rho$ neighborhood of $x$);
for $k \ge 1$, $k \to k + 1 \pmod{\kappa}$ deterministically.

From appropriate simple initial conditions these rules generate periodic traveling waves, in much the same way that The Wave propagates across the crowd at a rock concert or sporting event. From random or disordered configurations the same rules often exhibit complex self-organization characterized by the emergence of large-scale structure. For suitable $\theta$, nucleating spiral cores lead to a locally periodic final state in which every site eventually cycles with period $\kappa$, but sites slaved to distinct cores are typically out of phase. Figure 2 shows a representative case: $\rho = 8$, $\theta = 28$, $\kappa = 8$, started from uniform product measure $\pi$ over the available colors. We refer to $\pi$ affectionately as primordial soup. The array here, and for all graphics in this article, is 1024 × 768, with some cropping at the left and right edges. In the realization that produced Figure 2 most of the system quickly relaxed to the 0-state, but five spiral
Fig. 2. Nucleation of spirals in a Greenberg–Hastings model.
cores, sometimes called ram’s horns, nucleated from the soup. At the time t = 40 shown, the spirals are in the process of spreading over the entire lattice. Note how wave fronts from distinct centers annihilate when they collide. Evidently color computer graphics provide an effective way to visualize complex multitype interacting systems. Progress in understanding the phenomenology of excitable cellular automata would have been almost impossible without extensive use of efficient parallel devices such as the CAM-6 Cellular Automaton Machine [TM]. Our ability to interact with CAM-6 evolutions on the fly has been particularly illuminating. Clearly, computer simulations are most helpful for answering the question ‘how does system X behave?’ before one tries to prove a theorem about X. As technology improves, though, it is increasingly apparent that visualization of complex system dynamics actually augments the traditional deductive process as well. Let us summarize the highlights of our recent and ongoing theoretical research on excitable cellular automata, especially as it relates to the prototypical dynamics of two-dimensional spiral formation shown in Figure 2. We discovered in [FGG1] that as one varies the parameters, GHM displays a remarkably complex phase portrait containing several cutoffs that divide the ergodic behavior of the infinite system into qualitatively distinct regimes. For instance, one regime is characterized by statistical noise, another by the nucleation of stable spiral pairs shown in Figure 2, a third by clustering of aligned parades of wave fragments
(macaroni), and a fourth by global relaxation. Closely related Cyclic Cellular Automaton (CCA) models (cf. [FGG0]), in which every color updates by threshold contact with its successor, exhibit even more exotic behavior. CCA dynamics are also described in [FGG1], a largely empirical paper filled with color graphics, experimental data, and a host of conjectures.

However, the complexity of excitable dynamics should not give the impression that rigorous mathematics is hopeless! Many GHM and CCA rules admit finite configurations $\xi(\Lambda)$ known as stable periodic objects (spo’s): arrangements in which the color at each $x$ sees at least $\theta$ sites of its successor color within $(x + \rho N) \cap \Lambda$. A moment’s thought reveals that such a structure $\xi(\Lambda)$ cycles deterministically no matter what the configuration off $\Lambda$. For example, the cores of each spiral pair in Figure 2 are spo’s. Since such structures must exist somewhere in the infinite primordial soup by the monkey-at-the-typewriter principle, and since such period $\kappa$ spiral cores serve as pacemakers for their disordered environment, we should expect a locally periodic limiting state whenever spo’s exist. At least in the $\theta = 1$ case, this is a theorem: spo’s are simply loops of sites along which the colors appear cyclically, and the simple proof in [FGG0] works equally well for GHM and CCA.

A promising discovery of [FGG1] is the emergence of curvature and limiting dynamics in excitable CA systems as the threshold $\theta$ and range $\rho$ increase, with $\theta/\rho^2 \to \lambda$, say. This threshold–range scaling is particularly appealing from a mathematical point of view since the limiting Euclidean evolutions are surprisingly amenable to rigorous analysis. For instance, Durrett and Griffeath [DG] investigate the geometry of spiral cores in the threshold–range limit. Contact updating in $\mathbb{R}^2$ is formulated as follows.
At any discrete time $t$, each site $x$ in the plane inspects a Borel neighborhood $x + N$, where $N$ is the unit ball with respect to some Euclidean norm, and asks whether the area painted with its successor color exceeds $\theta$. Then the entire continuum updates in one truly massive parallel computation. If the state of the system at time $t$ is a random tessellation of space into connected color-components with smooth boundaries, then the configuration at time $t + 1$ will also be such a tessellation. Moreover, the action of interfaces is described by integral transformations that can be studied analytically. Using this scaling, in the case of the unit $\ell^\infty$-box $N$, one can construct spiral cores that are spo’s for $\lambda < 0.6123$, and also argue heuristically but persuasively that spiral cores cannot exist for $\lambda > \frac{2}{3}$. Thus there is a critical point $\lambda_c$, known as bend and observed empirically to be about 0.653, below which GHM produces locally periodic patterns in the spirit of Figure 2, but above which the ergodic behavior is altogether different. See [DG] for further details, including a very concrete algorithm for the construction of huge spirals.

Fisch, Gravner, and Griffeath [FGG2] study the asymptotic frequency of nucleation in GHM dynamics as the number of colors $\kappa$ becomes large. Starting from primordial soup, and assuming that the excitation threshold $\theta$ is not too large, the box size needed for formation of a spiral core is shown to grow exponentially in $\kappa$. By exploiting connections with percolation theory, the exponential scaling rate is rigorously determined as 0.23 ± 0.06 in the nearest neighbor, threshold 1 case. By way of contrast, GHM rules obey power-law nucleation scaling when started from a suitable non-uniform product measure over the colors; this effect is driven by critical percolation. Along with the proofs, [FGG2] contains a nice picture of percolating
spiral formation. Gravner and Griffeath [GG1] calculate the asymptotic shape of excitable CA nucleation droplets, on integer lattices and in Euclidean space. The limit shape L is identified as the polar transform of an explicitly computable width function. Even though the edges of droplets in Figure 2 appear smooth, L is an explicitly computable polygon with a very large number of sides in this case. In fact, by formulating an abstract version of the main limit theorem in [GG1], one can show that L is always a polygon for threshold growth CA rules on Zd . However subtle problems remain to be addressed for GHM dynamics that generate spreading rings. Such a ring is unstable if too thin, or if its curvature is too great at some location. Starting from a sufficiently large ring with uniformly small curvature, the wave should nevertheless be able to spread indefinitely. Delicate issues of boundary behavior make this a challenging problem. It is hoped that connections with curvature-driven partial differential equations will prove useful; a related connection is described in our third snapshot. Having studied spiral cores, nucleation density and droplet growth, it is natural to ask next What is the geometry of the final locally periodic state for typical GHM dynamics in the limit of rare nucleation (e.g., κ → ∞)? In a forthcoming project, Griffeath and Gravner [GG2] will conjecture for GHM, and prove for simpler excitable dynamics, that under suitable rescaling this limiting field is a Poisson–Voronoi Tessellation (PVT) with respect to the norm that describes droplet growth. Recall that a PVT is a tessellation of Euclidean space such that the centers of the individual tiles constitute a Poisson field, and such that each tile comprises those locations which are closest to a given center with respect to a prescribed norm. Our methods for proving convergence to PVT naturally combine nucleation analysis, Poisson approximation, and shape theory. 
An example of a CA that we can handle rigorously is the Competing Growth Model on $\{0, \ldots, \kappa - 1\}$, where

$\xi_t(x) = k$ means the medium has opinion (color) $k$ at $x$ at time $t$

(0 designating undecided voters), and the update rule is:

$0 \to k$ at $x$ iff $\exists \ge \theta$ $k$’s within $x + \rho N$, and $k$ is the only such color.

Otherwise there is no change at $x$; in particular, no $k \ge 1$ ever changes. Figure 3, an even simpler Multitype Threshold Voter Model with box neighborhood, $\rho = 2$, $\theta = 7$, and $\kappa = 32$, should convey the spirit of our results. (This variant has no background state 0.) If nucleation is rare, then the locations of nucleating centers will be approximately Poisson. Moreover, individual droplets are sufficiently separated that they nearly attain their limiting shape $L$ before they interact. Finally, interaction is the simplest possible: a standoff between competing droplets wherever they meet. On the scale of mean distance between nucleating centers, with a suitable formulation of tile boundaries, and adopting the proper weak convergence framework, convergence to PVT should follow.
Fig. 3. Nucleation of random tiles in a Multitype Threshold Voter Model.
We cannot discuss further details here. However, one special case of our result is so simple that it is nearly transparent. With nearest neighbors and threshold 1, any non-zero color grows a diamond-shaped droplet that captures every site to which it is closest in the ∞-norm. We invite the reader to argue for the following

Theorem 1. Assume $\theta = 1$. Start the nearest neighbor Competing Growth Model from a soup with density $p/(\kappa - 1)$ of each color $k \ge 1$, and density $1 - p$ of 0’s. Let $C_{p,\kappa}$ denote the set of sites in $\mathbb{Z}^2$ that eventually have more than one color in their neighborhood. Then as $p \to 0$ and $\kappa \to \infty$,

$$\sqrt{p}\, C_{p,\kappa} \to V(\mathcal{P}),$$

where $V(\mathcal{P})$ is the Poisson–Voronoi Tessellation for the ∞-norm.
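The nearest neighbor, threshold 1 Competing Growth Model is easy to experiment with. The following sketch assumes a finite grid without wrap-around and synchronous updating; the function name `competing_growth_step` is hypothetical. A lone seed grows a diamond, namely the set of sites reachable in a given number of nearest neighbor steps, while an undecided site that sees two different colors stays undecided.

```python
def competing_growth_step(grid):
    """One nearest neighbor, threshold 1 update on a finite grid (no wrap-around)."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(rows):
        for j in range(cols):
            if grid[i][j] != 0:
                continue  # decided sites never change
            colors = set()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                if 0 <= a < rows and 0 <= b < cols and grid[a][b] != 0:
                    colors.add(grid[a][b])
            # Adopt a color only if exactly one color is present nearby.
            if len(colors) == 1:
                new[i][j] = colors.pop()
    return new

if __name__ == "__main__":
    grid = [[0] * 15 for _ in range(9)]
    grid[4][3], grid[4][11] = 1, 2  # two competing seeds
    for _ in range(12):
        grid = competing_growth_step(grid)
    for row in grid:
        print("".join(str(c) for c in row))
```

In the printed picture the column of 0’s midway between the two seeds is the standoff boundary, a finite-lattice analogue of the set $C_{p,\kappa}$.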
4. Euclidean Majority Vote: Curvature-Driven Clustering

It is hard to imagine a simpler self-organizing scheme than majority vote. Citizens of two political persuasions, say Conservative and Labour, populate the lattice. From time to time individuals poll their neighborhood, succumb to peer-group pressure, and affiliate themselves with the local majority. Assuming a symmetric neighbor set
Fig. 4. Self-organization of a Majority Vote rule after 3 updates.
$N$, and counting one’s own previous opinion in the tally, there is no chance of a tie so the algorithm is well-defined. In symbols,

$\xi_t(x) = k \in \{0, 1\}$ means the system has opinion (color) $k$ at $x$ at time $t$,

and the range $\rho$ update rule is:

Switch opinion at $x$ iff $> \frac{1}{2}$ of the voters in neighborhood $x + \rho N$ have the opposite opinion.    (4)
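Rule (4) can be sketched as follows for the box neighborhood. The torus boundary conditions and the function name `majority_vote_step` are assumptions of this example. Because the box contains an odd number $(2\rho + 1)^2$ of voters, including the site itself, adopting the majority opinion is the same as switching exactly when more than half of the voters disagree.

```python
import random

def majority_vote_step(grid, rho):
    """One synchronous range-rho box Majority Vote update on a torus of 0/1 opinions."""
    rows, cols = len(grid), len(grid[0])
    size = (2 * rho + 1) ** 2  # odd number of voters, so ties cannot occur
    new = []
    for i in range(rows):
        row = []
        for j in range(cols):
            ones = sum(grid[(i + di) % rows][(j + dj) % cols]
                       for di in range(-rho, rho + 1)
                       for dj in range(-rho, rho + 1))
            row.append(1 if 2 * ones > size else 0)
        new.append(row)
    return new

if __name__ == "__main__":
    rng = random.Random(0)
    grid = [[rng.randint(0, 1) for _ in range(60)] for _ in range(30)]
    for _ in range(3):
        grid = majority_vote_step(grid, rho=2)
    for row in grid:
        print("".join("#" if s else "." for s in row))
```

The grid should be at least $2\rho + 1$ sites wide in each direction so that no voter is counted twice; with range 4 the box has 81 voters, matching the count of 41 needed for a flip.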
Figures 4 and 5 show a typical realization of the range 4 box Majority Vote CA, at times 3 and 10 respectively, started from symmetric product measure. Experiments such as this indicate clustering: from disordered noise the system appears to find a random tessellation within 2 or 3 updates, after which it self-organizes on length scales that grow over time. Real-time visualization reveals a surface tension effect. Minority components are eroded most rapidly along sections of the boundary where curvature is greatest. Small convex pockets of opposition are eliminated rapidly, but larger ones take much longer to disappear. In truth, the system in Figures 4–5 will fixate once the curvature of the tile boundaries is sufficiently small over the entire lattice. Since 41 of the 81 neighbors must disagree for a flip to occur, any tessellation with edges sufficiently close to flat is stable. For the range ρ version of majority vote
Fig. 5. The same Majority Vote CA after 10 updates.
with stochastic dynamics, fixation can actually be proved by an energy argument ([DS, Dur2]). In order to sustain surface tension clustering indefinitely one must increase ρ without bound. Let us therefore consider the threshold–range scaling limit known as Euclidean Majority Vote (EMV). Our update rule is the same as (4), but with x ∈ R2 , ρ = 1, and symmetric convex N ⊂ R2 . For instance, isotropic dynamics are obtained by choosing N to be the Euclidean unit ball. Of course now the majority condition is phrased in terms of Euclidean area, an interpretation that certainly makes sense if we start from a random two-colored tessellation with smooth boundary. Moreover, one can check that the EMV update rule maps a suitable space of such tessellations into itself. It is a bit more challenging to formulate the continuum counterpart of primordial soup. Roughly speaking, Bernoulli product measure should become White Noise. Alternatively note that, on the lattice, {ξ1 = 1} may be viewed as the positive part of a (correlated) random field. Passing to the continuum, confused sites of ξ1 become the zero set of a (correlated) Gaussian field. It is plausible, then, and suggested by large-range lattice experiments, that ξ2 should consist of countably many connected color components with continuous boundary. Only a flat edge is stable for EMV, so once the system nucleates a random tessellation, surface-tension clustering should continue indefinitely. On the basis of these heuristics, we offer the following bold
Conjecture 1. Starting from White Noise, Euclidean Majority Vote nucleates components with continuous boundaries by time $t = 2$, and then clusters to arbitrarily large length scales as $t \to \infty$. That is to say, for any bounded $\Lambda \subset \mathbb{R}^2$, as $t \to \infty$,

$$P(\xi_t \text{ has one opinion on } \Lambda) \to 1. \tag{5}$$
In current joint work with J. Gravner, we are attempting to make some headway on this conjecture. One key ingredient is a connection with Motion by Mean Curvature (MMC), a p.d.e. for which a rich and detailed theory has been developed over the past few years (cf. [ES]). Starting from tessellations with large length scale, one can show that EMV dynamics are well-approximated by Motion by Mean Curvature. More precisely, if $\gamma$ is smooth and simple in $\mathbb{R}^2$, $[\gamma]$ its bounded component, then as $n \to \infty$, for a suitable uniform sense of convergence $\to$,
$$\frac{1}{n}\,\xi_{n^2 t}(n[\gamma]) \to [\gamma_t], \tag{6}$$
where
$$\frac{\partial \gamma_t}{\partial t} = \frac{1}{6}\,\phi^2(\mathbf{n})\,\kappa. \tag{7}$$
Here $\mathbf{n}$ is the unit normal, $\kappa$ is the curvature, and $\phi(\mathbf{n})$ is the radius of $N$ in the direction perpendicular to $\mathbf{n}$. This last equation describes anisotropic MMC. The idea behind its derivation is rather straightforward. Half-spaces are invariant for EMV. The boundary of $n\gamma$ has small curvature and so is well-approximated locally by a parabolic arc. A simple exercise in calculus shows that the amount $n\gamma$ moves in direction $\mathbf{n}$ turns out to be proportional to $\kappa n^{-2}$. Another way to think about this approximation, by rescaling, is to fix $\gamma$ and let the neighbor set $n^{-1}N$ shrink to a point. In this way, the integral averaging of MVT reduces to the local operator for MMC. We note in passing that numerical analysts sometimes use generalizations of MVT as parallel schemes for simulation of equation (7). Using (6), we are able to prove a result that captures some of the ingredients in Conjecture 1. Here we merely outline the proof; details will appear elsewhere.

Theorem 2. Starting from the symmetric Bernoulli 2-coloring of a sufficiently large honeycomb lattice, Euclidean Majority Vote clusters.

Sketch of proof. Standard techniques in percolation theory (see [Gri]) imply that the connectivity of the 2-colored honeycomb is critical (site percolation on the hexagonal lattice has $p_c = \frac{1}{2}$). As a consequence, its connected color components form an infinite cascade: any component is surrounded by a circuit of the opposite color. A simple lemma shows that EMV (or MMC) dynamics preserve separation, i.e., there is an $L < \infty$ such that any component initially isolated from other components of the same color by distance $L$ will remain so at all times (for MMC any $L > 0$ has this property). Isolation in the initial large honeycomb effectively precludes interaction between components (or contours) and, together with monotonicity, implies that every component eventually shrinks to $\emptyset$ at least as quickly as the smallest ball that covers it.
Isotropic MMC is curve shortening; by approximation (6) the lengths of large EMV contour boundaries are controlled. Hence boundary length per unit area must tend to 0 as t → ∞, a property that implies (5).
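The parabolic-arc calculation behind the constant in (7) can be sketched as follows. This is a heuristic leading-order computation rather than the argument of the authors; the local coordinates $(x, y)$, the displacement $d$, and the sign convention are assumptions introduced here purely for illustration.

```latex
% Assumed local picture: the unit normal n points along the y-axis and,
% near the boundary point, the occupied region is { y <= -\kappa x^2 / 2 }.
% A site at height -d on the normal sees a neighbor set whose chord
% perpendicular to n has half-length \phi = \phi(n).
% By symmetry the half-plane { y <= -d } fills exactly half of that set,
% so, to leading order in \kappa and d, the site lies on the updated
% boundary when the signed area between chord and arc vanishes:
\[
  \int_{-\phi}^{\phi} \Bigl( d - \tfrac{1}{2}\kappa x^{2} \Bigr)\, dx
  \;=\; 2 d \phi - \tfrac{1}{3}\kappa \phi^{3} \;=\; 0
  \quad\Longrightarrow\quad
  d \;=\; \tfrac{1}{6}\,\phi^{2}\,\kappa .
\]
% For the dilated curve n\gamma the curvature is \kappa / n, so one update
% moves the boundary by \phi^2 \kappa / (6n); after rescaling space by 1/n
% this is \phi^2 \kappa / (6 n^2) per update, and n^2 t updates produce
% the normal speed \phi^2 \kappa / 6 appearing in (7).
```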
Fig. 6. Self-organization of a Plurality Vote CA.
We view the honeycomb lattice as being situated at a critical point toward which EMV nucleates. Very roughly, locations in the random tiling which are not well-separated should disappear rapidly under iteration of the update rule. In this sense, EMV started from White Noise would appear to be self-organized critical. There are difficult obstacles, both conceptual and technical, to our understanding of the nucleation mechanism. But the phenomenology of surface tension clustering from disordered initial states is of broad interest (e.g., in the so-called spinodal decomposition of Stochastic Ising Model phase transitions), so even partial results seem worthwhile. Going way out on a limb, our heuristics even suggest the possibility of a limiting Euclidean random field statistically self-similar under MMC.

A last remark in connection with this snapshot: suppose additional political parties enter the fray so that our conformist voter is confronted with a multitude of candidates. The natural Plurality Vote rule chooses the clear favorite over the neighborhood, but stays with the current opinion in case of a tie. How does this CA self-organize from κ color primordial soup? Figure 6 shows a simulation reminiscent of soap bubble patterns, with ρ = 6 (box neighborhood), κ = 15, at time t = 100. Again the interfaces emulate MMC, and again the system presumably clusters, although the nature of the clustering is quite different from the two-color case.
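The Plurality Vote rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `box_counts` and `plurality_step`, the use of NumPy, and the periodic (wrap-around) boundary are choices made here, not details taken from the simulations behind Figure 6.

```python
import numpy as np

def box_counts(indicator, rho):
    """Number of marked sites in the (2*rho+1) x (2*rho+1) box around each
    site, the site itself included. Periodic boundary is an assumption."""
    total = np.zeros(indicator.shape, dtype=np.int64)
    for dx in range(-rho, rho + 1):
        for dy in range(-rho, rho + 1):
            total += np.roll(indicator, (dx, dy), axis=(0, 1))
    return total

def plurality_step(state, n_colors, rho):
    """One synchronous update: adopt the unique most common color in the
    box neighborhood; keep the current color when the top count is tied."""
    counts = np.stack([
        box_counts((state == c).astype(np.int64), rho)
        for c in range(n_colors)
    ])
    best = counts.max(axis=0)
    n_best = (counts == best).sum(axis=0)   # how many colors share the top count
    winner = counts.argmax(axis=0)
    return np.where(n_best == 1, winner, state)
```

With `n_colors=2` the same function gives the box Majority Vote rule, since an odd-sized box cannot produce a tie between two colors.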
5. Larger than Life: Evolution of Complex Local Structure

Our fourth and final snapshot is even more speculative. Together with graduate student Kellie Evans, we are studying a four-parameter family of cellular automata that generalize John Conway’s celebrated Game of Life to the higher range, general threshold context. Recall that Conway’s Life, probably the most famous of all CA rules, is a range 1 box, single species population model with remarkably complex dynamics. A birth can only occur at a cell with exactly 3 occupied neighbors, while survival requires either 2 or 3 occupied neighbors. See [BCG] for an entertaining account of both the recreational and theoretical study of Life. One should bear in mind that the early investigations of Conway and his cohort, as popularized by Martin Gardner in Scientific American, predated the advent of desktop computer visualization. Cambridge veterans tell me that the first experiments were carried out on Go boards equipped with remarkably reliable cerebral processing units. Now that simulations of Life can be found on any respectable electronic bulletin board, and appear as the default screensaver on many SparcStations, it is easy to check that Conway’s choice of the parameter values (3, 2 or 3) for birth and survival generates the most intriguing dynamics of any of the range 1 population rules of the same general form. We wondered whether his rule might be a clue to a critical phase point in the threshold–range scaling limit. Thus, the Larger than Life (LTL) family of cellular automata have $\xi_t(x) \in \{0, 1\}$; 1 means a creature lives at $x$ at time $t$, 0 not.

The update rule is: a birth at $x$ iff the population on $(x + \rho N)$ lies in $[\beta_-, \beta_+]$; a death at $x$ unless the population on $(x + \rho N)$ lies in $[\delta_-, \delta_+]$. Special cases of LTL have been considered in [Ruc] ($\rho = 1$) and [BBC] ($\rho = 1$, $\beta_\pm = \delta_\pm$). Extensive simulation of representative rules from the four-parameter family reveals a surprisingly rich phase space filled with many qualitatively distinct instances of nucleation and self-organization. The terrain is much more difficult to map out than that of GH/CCA because initial configurations far removed from primordial soup are often needed to sustain life. For now, let us simply offer a few illustrations.

First, we looked at large-range CA rules near Conway’s Life under threshold–range scaling. Of course $\rho = 1$ is a very small parameter value, but it is not unreasonable to interpret Conway’s rule as the LTL case $[\beta_-, \beta_+] = [2.5, 3.5]$, $[\delta_-, \delta_+] = [2.5, 4.5]$. A natural scaling scheme is to identify rules with the same values of parameters$/|\rho N|$. Thus Conway’s rule has approximately the values [.28, .39], [.28, .5] in the phase space. Figure 7 shows a still frame of the range 12 box LTL rule with (integer) parameters $[\beta_-, \beta_+] = [170, 240]$, $[\delta_-, \delta_+] = [170, 296]$. We leave it to the reader to check that the position in our numerological phase space is quite close. Evidently our large-range rule generates complexity that is reminiscent of the original Life. In particular there are finite periodic structures, akin to Conway’s blinkers, that move through a sequence of basic geometric shapes, and there are mobile bugs
Fig. 7. Larger than life: emergence of bugs (aka gliders).
with an invariant shape, akin to Conway’s celebrated gliders. We have discovered essentially the same phenomenology for rules up to range 15 in a small neighborhood of an apparent critical point. The resulting bugs seem to settle down to a limiting shape with a fat head, slender posterior, and a stomach, as shown in the Figure 7 inset. An entirely different self-organized evolution occurs for the range 15 LTL rule with parameters [β− , β+ ] = [170, 240], [δ− , δ+ ] = [170, 296], as illustrated in Figure 8. In this case most of the original soup dies out, but various small configurations are viable. Among these are rings of a characteristic diameter and band width that cannot spread on their own. However interactions between two or more such rings create web-like structures that nucleate to cover the entire space with a complex statistical equilibrium. We call this highly non-linear scenario nucleating pretzels. LTL dynamics can display many other exotic forms of self-organization. The phenomenology is so diverse and bewildering that we have decided to focus on some special cases. The exactly θ rule is particularly easy to state: there is a 1 at x next time iff there are exactly θ 1’s in the neighborhood of x this time, excluding x itself. Since updates require an exact population count over the neighbor set, this CA is rather different in spirit from our previous examples, and any viable patterns of occupied sites are necessarily more one-dimensional in spirit. By simply asking whether finite configurations of 1’s can survive and propagate we discover a series
Fig. 8. Larger than life: complex nucleation of pretzels.
of apparent critical phenomena:
• If $1 \le \theta \le \rho$, a suitable finite segment of occupied cells (vertically oriented, say) self-replicates and gives rise to a spreading fractal-like structure. This claim is actually a little theorem since, despite the non-linear rule, the dynamics mimic the genesis of a Sierpinski lattice.
• For $\theta \ge \rho + 1$, from the same initial seed, orderly propagation breaks down. Instead, for values of $\theta$ just above $\rho$, complex growth is reminiscent of a snowflake or Rorschach test. For $\theta$ just below $2\rho$ growth is apparently no longer possible, but viable periodic bugs emerge. We call these bugs skeeters since one of their characteristic shapes consists of a small solid head with two long trailing one-dimensional legs. We have observed skeeters in rules up to range 40. Some of these bugs have very long periods; a few are even capable of giving birth to new skeeters that travel in the opposite direction. This last effect is reminiscent of procreation by glider guns in Conway’s Life.
• No propagation appears possible for $\theta > 2\rho$, although one can exhibit finite fixed structures for $3 \le \theta \le (\rho + 1)^2 - 1$.
• Global death occurs from any initial configuration once $\theta$ is large enough, e.g., for $\theta > 2\rho^2 + 3\rho$. This little theorem is proved by comparison with monotone threshold growth.
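For readers who wish to experiment, the LTL update rule and the exactly θ rule can be sketched as below. This is an illustrative sketch, not the software used for the figures: the function names, the NumPy representation, and the periodic boundary are assumptions made here. The population count includes the site itself, which is why the exactly θ rule (whose count excludes the site) becomes the LTL case with birth interval [θ, θ] and survival interval [θ + 1, θ + 1].

```python
import numpy as np

def box_population(state, rho):
    """Count the 1s in the (2*rho+1) x (2*rho+1) box centred at each site,
    the site itself included. Periodic boundary is an assumption."""
    total = np.zeros(state.shape, dtype=np.int64)
    for dx in range(-rho, rho + 1):
        for dy in range(-rho, rho + 1):
            total += np.roll(state, (dx, dy), axis=(0, 1))
    return total

def ltl_step(state, rho, birth, survive):
    """One synchronous LTL update.
    birth = (b_lo, b_hi): an empty site becomes occupied iff the population
    lies in this interval; survive = (d_lo, d_hi): an occupied site dies
    unless the population lies in this interval."""
    pop = box_population(state, rho)
    born = (state == 0) & (birth[0] <= pop) & (pop <= birth[1])
    stays = (state == 1) & (survive[0] <= pop) & (pop <= survive[1])
    return (born | stays).astype(np.int64)

def exactly_theta_step(state, rho, theta):
    """Exactly-theta rule: occupied next time iff exactly theta of the
    neighbors, excluding the site itself, are occupied now."""
    return ltl_step(state, rho, (theta, theta), (theta + 1, theta + 1))
```

In this encoding Conway's Life is `ltl_step(state, 1, (3, 3), (3, 4))`, the integer form of the intervals [2.5, 3.5] and [2.5, 4.5]. For the range 12 box the neighborhood has 25 × 25 = 625 cells, so the parameters 170, 240, 296 rescale to roughly 0.27, 0.38, 0.47.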
In conclusion, let us mention some subtle open problems motivated by recent controversy surrounding Conway’s Game of Life. In [BCC], a cover article of Scientific American, and elsewhere, it has been suggested that Life may be self-organized critical, a claim that includes power-law decay of the density of sites that are neither in the 0 state nor part of some periodic local configuration. Cellular Automaton Machine experiments of [BB], to the contrary, indicate relaxation at a small exponential rate. To a first approximation, Life may be viewed as an interaction between gliders/bugs (mobile finite structures of fixed size and shape) and blinkers (periodic immobile configurations). Interactions between any pair of such objects typically destabilize both, leading to mutual annihilation. As a prototype, one may consider a system of 2-d billiards that move in one of the four directions N, S, E, W at each update, and fixed obstacles occupying one cell each, with annihilation upon any collision. If the initial density of billiards is p (p/4 for each type), and the initial density of obstacles is q, we may ask how the density of billiards tends to 0 over time, as a function of the parameters p and q. In the infinite system it is conjectured that the asymptotic rate is always exponential, but only after an initial transient period of apparent power-law decay that can be quite long if p exceeds q. In joint work with Maury Bramson, we will attempt to obtain rigorous results along these lines, and also to investigate the impact of this phenomenon on the behavior of corresponding finite-lattice systems. Perhaps such an analysis will help shed light on the above-mentioned controversy and indicate some important issues of scale in the approximation of infinite complex systems by the finite ones that are used for computer experimentation. However, as noted in [BB], one should not rule out the possibility that Life is actually supercritical!
Namely, it is conceivable that Conway’s game admits indestructible local configurations similar to the spo’s of excitable cellular automata: exceedingly rare, perfectly synchronized constellations that send out impenetrable streams of colonists. Some day the offspring of such a monster might just show up on our doorstep and take over the world.
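The billiards prototype mentioned above is easy to simulate on a finite torus. The sketch below is one possible reading of the model, with several assumptions that the text does not settle: a collision is taken to mean two or more billiards arriving at the same cell, or a billiard arriving at an obstacle (which is then removed too); billiards that swap adjacent cells head-on pass through each other; and `p + q` is assumed to be at most 1. All names are hypothetical.

```python
import random
from collections import defaultdict

MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

def billiards_step(billiards, obstacles, size):
    """billiards: list of (x, y, direction); obstacles: set of (x, y).
    Returns (surviving billiards, surviving obstacles) after one
    synchronous update on a size x size torus."""
    arrivals = defaultdict(list)
    for x, y, d in billiards:
        dx, dy = MOVES[d]
        arrivals[((x + dx) % size, (y + dy) % size)].append(d)
    new_billiards, new_obstacles = [], set(obstacles)
    for cell, dirs in arrivals.items():
        if cell in new_obstacles:
            new_obstacles.discard(cell)   # obstacle and arriving billiards vanish
        elif len(dirs) == 1:
            new_billiards.append((cell[0], cell[1], dirs[0]))
        # otherwise two or more billiards met on this cell and annihilate
    return new_billiards, new_obstacles

def density_history(size, p, q, steps, seed=0):
    """Billiard density over time from a random start with billiard
    density p (directions equally likely) and obstacle density q."""
    rng = random.Random(seed)
    billiards, obstacles = [], set()
    for x in range(size):
        for y in range(size):
            u = rng.random()
            if u < p:
                billiards.append((x, y, rng.choice("NSEW")))
            elif u < p + q:
                obstacles.add((x, y))
    history = [len(billiards) / size**2]
    for _ in range(steps):
        billiards, obstacles = billiards_step(billiards, obstacles, size)
        history.append(len(billiards) / size**2)
    return history
```

Plotting `density_history` on log-log and semi-log axes for various p and q is one way to look for the transient power-law regime, bearing in mind that finite-lattice effects are exactly the issue of scale raised in the text.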
References
[BCC] Bak, P., Chen, K., and Creutz, M. (1989). Self-organized criticality in the Game of Life. Nature 342, 780–782.
[BB] Bennett, C. and Bourzutschky, M. (1991). Life not critical? Nature 350, 468.
[BCG] Berlekamp, E., Conway, J., and Guy, R. (1982). Winning Ways for Your Mathematical Plays, Vol. 2. Academic Press, New York.
[BBC] Bidaux, R., Boccara, N., and Chaté, H. (1989). Order of the transition versus space dimension in a family of cellular automata. The Physical Review A 39, 3094–3105.
[Dur1] Durrett, R. (1991). Probability: Theory and Examples. Wadsworth & Brooks/Cole, Pacific Grove, CA.
[Dur2] Durrett, R. (1993). Ten lectures on particle systems. To appear as 1993 Saint-Flour Probability Summer School Lecture Notes, Springer-Verlag, New York.
[DG] Durrett, R. and Griffeath, D. Asymptotic behavior of excitable cellular automata. Journal of Experimental Mathematics 3, to appear.
[DS] Durrett, R. and Steif, J. (1993). Fixation results for threshold voter systems. Annals of Probability 21, 232–247.
[ES] Evans, L. C. and Spruck, J. (1993). Motion of level sets by mean curvature I. Journal of Differential Geometry, to appear.
[FGG0] Fisch, R., Gravner, J., and Griffeath, D. (1992). Cyclic cellular automata in two dimensions. In Spatial Stochastic Processes. A festschrift in honor of the seventieth birthday of T. E. Harris (K. Alexander and J. Watkins, eds.), Birkhäuser, Boston, 171–185.
[FGG1] Fisch, R., Gravner, J., and Griffeath, D. (1992). Threshold–range scaling of excitable cellular automata. Statistics and Computing 1, 23–39.
[FGG2] Fisch, R., Gravner, J., and Griffeath, D. (1993). Metastability in the Greenberg–Hastings Model. Annals of Applied Probability, to appear.
[GK] Gandolfi, A. and Kesten, H. (1993). Greedy lattice animals II. Annals of Applied Probability, to appear.
[GG1] Gravner, J. and Griffeath, D. Threshold growth dynamics. Transactions of the American Mathematical Society 341, to appear.
[GG2] Gravner, J. and Griffeath, D. The Poisson–Voronoi limit for excitable cellular automata with rare nucleation. In preparation.
[Gra] Gray, L. A strong law for the motion of interfaces in particle systems. In preparation.
[GH] Greenberg, J. and Hastings, S. (1978). Spatial patterns for discrete models of diffusion in excitable media. SIAM Journal of Applied Mathematics 4, 515–523.
[Gri] Grimmett, G. (1989). Percolation. Springer-Verlag, New York.
[Lig] Liggett, T. M. (1985). Interacting Particle Systems. Springer-Verlag, New York.
[Ruc] Rucker, R. (1990). CA-Lab (software). Autodesk, Sausalito, CA.
[Tof] Toffoli, T. (1978). Integration of the phase-difference relations in asynchronous sequential networks. In Automata, Languages, and Programming (G. Ausiello and C. Böhm, eds.), Springer-Verlag, New York, 457–463.
[TM] Toffoli, T. and Margolus, N. (1987). Cellular Automata Machines. MIT Press, Cambridge, Massachusetts.
[WR] Wiener, N. and Rosenblueth, A. (1946). The mathematical foundation of the problem of conduction of impulses in a network of connected excitable elements, specifically in cardiac muscle. Archive of the Institute of Cardiology, Mexico 16, 205–265.
PERCOLATIVE PROBLEMS
GEOFFREY GRIMMETT*
Statistical Laboratory
University of Cambridge
16 Mill Lane
Cambridge CB2 1SB
United Kingdom
Abstract. We sketch elementary results and open problems in the theory of percolation and random-cluster models. The presentation is rather selective, and is intended to stimulate interest rather than to survey the established theory. In the case of the random-cluster model, we include sketch proofs of basic material such as the FKG inequality and the comparison inequalities.

Key words: Percolation, random-cluster model, Potts model, phase transition, FKG inequality.
1. Introduction

This paper falls naturally into two (related) halves. The first of these is concerned with the percolation model, and the second with the random-cluster model. The emphasis throughout is upon unsolved problems which are easy to state; some of these are chestnuts of varying ages, and some are recent and may be relatively tractable. The percolation model is the subject of Sections 2–4, the last of which contains a selection of open questions. In Section 5 we turn to the random-cluster model of Fortuin and Kasteleyn, and for this process we present and prove several of the basic properties in advance of describing in Section 9 some stimulating problems worthy of resolution.
2. Bond Percolation

Our lattice is the hypercubic lattice $\mathbb{L}^d$, having vertex set $\mathbb{Z}^d$ and edge set $\mathbb{E}^d$ containing all pairs $\langle x, y \rangle$ whose $L^1$ distance
$$\|y - x\| = \sum_{i=1}^{d} |y_i - x_i|$$
satisfies $\|y - x\| = 1$; for $z \in \mathbb{Z}^d$, we write $z = (z_1, z_2, \dots, z_d)$. Throughout we shall assume that $d \ge 2$.

*The author acknowledges support from the Isaac Newton Institute, University of Cambridge, and from the SERC under grant GR G59981.
Let $0 \le p \le 1$, and call an edge $e$ ($\in \mathbb{E}^d$) open with probability $p$, and closed otherwise; different edges are designated open or closed independently of one another. Consider the random subgraph of $\mathbb{L}^d$ containing the vertex set $\mathbb{Z}^d$ and the open edges only. The connected components of this graph are called open clusters, and percolation theory is concerned with their sizes and geometry. We write $C(x)$ for the open cluster containing the vertex $x$, and $C = C(0)$ for the cluster containing the origin 0. The number of vertices in $C(x)$ is denoted by $|C(x)|$. There is a ‘phase transition’, in the following sense. Define the percolation probability
$$\theta(p) = P_p(|C| = \infty), \tag{2.1}$$
where $P_p$ is the associated probability measure, and define the critical probability by
$$p_c = \sup\{p : \theta(p) = 0\}. \tag{2.2}$$
It is a fundamental fact that $0 < p_c < 1$. The value $p_c$ marks a transition from a subcritical phase (when $p < p_c$, and all open clusters are a.s. finite) to a supercritical phase (when $p > p_c$, and there exists a.s. an infinite open cluster). The most basic problem is to understand the nature of the singularity of the process at the point of phase transition. Rather than attempt an accurate bibliography, the reader is referred to [22] for history and references prior to 1989.
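The definitions (2.1) and (2.2) can be explored numerically. The sketch below, restricted to $d = 2$, estimates the probability that the origin is joined by open edges to the boundary of the box $[-n, n]^2$; this finite-box quantity is only an upper bound for $\theta(p)$, and it is used here as a stand-in. The function names, the lazy sampling of edge states, and the seeding convention are assumptions made for illustration.

```python
import random
from collections import deque

def origin_reaches_boundary(n, p, rng):
    """Bond percolation on the box [-n, n]^2 of Z^2: is the origin joined
    to the boundary of the box by a path of open edges?"""
    edge_state = {}

    def is_open(u, v):
        # Each undirected edge is sampled once, the first time it is examined.
        key = (u, v) if u <= v else (v, u)
        if key not in edge_state:
            edge_state[key] = rng.random() < p
        return edge_state[key]

    seen = {(0, 0)}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        if max(abs(x), abs(y)) == n:
            return True
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb not in seen and is_open((x, y), nb):
                seen.add(nb)
                queue.append(nb)
    return False

def estimate_theta(n, p, trials, seed=0):
    """Monte Carlo estimate of P_p(origin is connected to the box boundary)."""
    rng = random.Random(seed)
    hits = sum(origin_reaches_boundary(n, p, rng) for _ in range(trials))
    return hits / trials
```

Evaluating `estimate_theta` over a grid of `p` values for increasing `n` shows the curve steepening near the critical point, although no finite simulation can resolve the nature of the singularity.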
3. Some Open Problems for Percolation

3.1. BK Inequality

‘Correlation inequalities’ play an important role in studying percolation, and the FKG and BK inequalities are fundamental techniques. Whereas the FKG inequality is rather well understood, there are interesting unresolved questions concerning the BK inequality. Consider the probability space $(\Omega, \mathcal{F}, \mu)$ where $\Omega = \{0, 1\}^E$, $E$ being a finite set, $\mathcal{F}$ is the $\sigma$-field of all subsets of $\Omega$, and $\mu$ is product measure with density $p$, i.e.,
$$\mu(\omega) = \prod_{e \in E} \bigl\{ p^{\omega(e)} (1 - p)^{1 - \omega(e)} \bigr\}, \qquad \omega = (\omega(e) : e \in E) \in \Omega. \tag{3.1}$$
There is a natural partial order on $\Omega$ given by $\omega \le \omega'$ if and only if $\omega(e) \le \omega'(e)$ for all $e \in E$. An event $A$ in $\mathcal{F}$ is called increasing if its indicator function $I_A$ is an increasing function on the partially ordered space $(\Omega, \le)$. The FKG inequality (see [21, 28] and Section 7) states that
$$\mu(A \cap B) \ge \mu(A)\mu(B) \qquad \text{for increasing events } A, B. \tag{3.2}$$
The BK inequality provides a converse relation, but with $A \cap B$ replaced by another event $A \circ B$ defined as follows. Let $A$ and $B$ be increasing events. Each $\omega$ ($\in \Omega$) is specified uniquely by the set $K(\omega) = \{e : \omega(e) = 1\}$ of edges with state 1. We define $A \circ B$ to be the set of all $\omega$ for which there exists a subset $H$ of
$K(\omega)$ such that $\omega'$, determined by $K(\omega') = H$, belongs to $A$, and $\omega''$, determined by $K(\omega'') = K(\omega) \setminus H$, belongs to $B$. We speak of $A \circ B$ as the event that $A$ and $B$ occur disjointly. The BK inequality ([9]) states that
$$\mu(A \circ B) \le \mu(A)\mu(B) \qquad \text{for increasing events } A, B. \tag{3.3}$$
It is conjectured that such an inequality is valid for all events $A$ and $B$, so long as $A \circ B$ is interpreted correctly. For general events $A$ and $B$, we define the event $A \,\square\, B$ as follows. For $\omega = (\omega(e) : e \in E)$ and $K \subseteq E$, we define the cylinder event $C(\omega, K)$ by $C(\omega, K) = \{\omega' \in \Omega : \omega'(e) = \omega(e) \text{ for } e \in K\}$. We now define $A \,\square\, B$ to be the set of all $\omega$ ($\in \Omega$) for which there exists $K \subseteq E$ such that $C(\omega, K) \subseteq A$ and $C(\omega, E \setminus K) \subseteq B$.

Conjecture 3.1. For all events $A$ and $B$,
$$\mu(A \,\square\, B) \le \mu(A)\mu(B). \tag{3.4}$$
This conjecture has as special cases both the FKG and BK inequalities, since $A \,\square\, B = A \circ B$ and $A \,\square\, B^c = A \cap B^c$ for increasing events $A$ and $B$. In the case $p = \frac{1}{2}$, the conjecture reduces to a counting problem: prove that
$$2^{|E|}\, |A \,\square\, B| \le |A| \cdot |B| \qquad \text{for all events } A, B. \tag{3.5}$$
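For very small $E$ the disjoint-occurrence operation for general events can be computed by exhaustive search, which makes the definitions concrete and lets one test (3.4) and (3.5) case by case. The sketch below is such a brute-force check and nothing more; the function names are hypothetical, edges are labelled $0, \dots, n-1$, and an event is represented as a Python set of 0/1 tuples.

```python
from itertools import combinations, product

def configurations(n):
    """All 0/1 configurations on n labelled edges."""
    return list(product((0, 1), repeat=n))

def cylinder_inside(omega, K, event, n):
    """True if every configuration agreeing with omega on K lies in event,
    i.e. the cylinder C(omega, K) is contained in the event."""
    free = [i for i in range(n) if i not in K]
    for values in product((0, 1), repeat=len(free)):
        other = list(omega)
        for i, v in zip(free, values):
            other[i] = v
        if tuple(other) not in event:
            return False
    return True

def box_product(A, B, n):
    """Configurations omega for which some K has C(omega, K) inside A and
    C(omega, complement of K) inside B, for arbitrary events A and B."""
    indices = set(range(n))
    result = set()
    for omega in configurations(n):
        for r in range(n + 1):
            if any(cylinder_inside(omega, set(K), A, n)
                   and cylinder_inside(omega, indices - set(K), B, n)
                   for K in combinations(range(n), r)):
                result.add(omega)
                break
    return result

def mu(event, p):
    """Product measure with density p of an event, as in (3.1)."""
    return sum(p ** sum(w) * (1 - p) ** (len(w) - sum(w)) for w in event)
```

For instance, with `A = {w : w[0] == 1}` the event `box_product(A, A, n)` is empty, since a single edge cannot witness both occurrences, whereas for events determined by different edges the operation reduces to intersection. The search is exponential in $|E|$ and is meant only for experiments with a handful of edges.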
In a discussion [8] of partial results, it is proved that (3.5) would imply the full conjecture. Finally we ask for what probability measures $\mu$ is the BK inequality (3.3) valid? For example, is it valid for the measure which assigns probability $\binom{|E|}{M}^{-1}$ to each of the sequences $\omega$ containing exactly $M$ ones and $|E| - M$ zeros, where $M$ is fixed?

3.2. Smoothness of Percolation Probability

It is known that $\theta(p) = 0$ for $p < p_c$ (by definition) and that $\theta$ is infinitely differentiable when $p > p_c$. It is a major open problem to prove that $\theta(p_c) = 0$, which is equivalent (via the right continuity of $\theta$) to the statement
$$\theta \text{ is continuous at } p_c. \tag{3.6}$$
This problem has been settled affirmatively when $d = 2$, and for the following discussion we assume that $d \ge 3$. It is known that the critical probability $p_c$ of $\mathbb{Z}^d$ is the same as the critical probability $p_c(H)$ of the half-space $H = \mathbb{Z}^+ \times \mathbb{Z}^{d-1}$ ([26]). Furthermore, we know (see [6]) that $H$ contains no infinite cluster when $p = p_c$. It is therefore required to rule out the following outlandish possibility: when $p = p_c$ there exists a.s. an infinite open cluster, but this cluster decomposes a.s. into finite clusters whenever $\mathbb{Z}^d$ is sliced into two disjoint half-spaces.
3.3. Uniqueness of Infinite Cluster

Let $N$ be the number of infinite open clusters. Then, for all $p$, either $P_p(N = 0) = 1$ or $P_p(N = 1) = 1$, and the easiest proof of this may be found in [13]. It has been asked by Mathew Penrose whether $N$ has such a property simultaneously for all values of $p$. In order to make sense of this question, we introduce a family $(X(e) : e \in \mathbb{E}^d)$ of independent random variables each having the uniform distribution on $[0, 1]$. For $0 \le p \le 1$ we define the vector $\eta_p$ by
$$\eta_p(e) = \begin{cases} 1 & \text{if } X(e) < p, \\ 0 & \text{if } X(e) \ge p, \end{cases} \tag{3.7}$$
and note that $P(\eta_p(e) = 1) = p$. We call an edge $e$ $p$-open if $\eta_p(e) = 1$, and $p$-closed otherwise. Let $N_p$ be the number of infinite $p$-open clusters. Is it the case that¹
$$P\bigl(N_p \in \{0, 1\} \text{ for all } p\bigr) = 1? \tag{3.8}$$
This is certainly valid when $d = 2$.

¹ This question was answered affirmatively by Ken Alexander during the meeting.

3.4. Critical Exponents

There is a wealth of problems concerning critical exponents and scaling theory, and these have received ample attention elsewhere (see [22, Chaps. 7, 8] for example). We confine ourselves here to a few very basic examples of such problems. Those mentioned here are intended primarily to stimulate interest in the major challenge to mathematicians to make sense of scaling theory. It is thought to be the case that $\theta(p)$ behaves in the manner of $|p - p_c|^{\beta}$ in the limit as $p \downarrow p_c$, where $\beta$ is a ‘critical exponent’ whose value depends on the number of dimensions. No proof is known that
$$\beta = \lim_{p \downarrow p_c} \frac{\log \theta(p)}{\log |p - p_c|} \tag{3.9}$$
exists. Possibly
$$a(p)\,|p - p_c|^{\beta} \le \theta(p) \le b(p)\,|p - p_c|^{\beta} \qquad \text{for } p > p_c \tag{3.10}$$
for some functions $a$ and $b$ which are slowly varying as $p \downarrow p_c$. There are corresponding conjectures for other macroscopic functions. The value of $\beta = \beta(d)$ should depend on $d$, and it is conjectured that
$$\beta(2) = \tfrac{5}{36}, \qquad \beta(d) = 1 \text{ for } d \ge 6. \tag{3.11}$$
This is part of a large family of conjectures dealing with the cases $d = 2$ and $d \ge 6$. When $d = 2$, it is thought that all critical exponents are rational. When $d \ge 6$ it is
thought that any given exponent takes its ‘mean-field value’, i.e., the value obtained when the lattice is replaced by an infinite regular tree. See [27, 34, 38].

The first ‘proof’ that $p_c = \frac{1}{2}$ when $d = 2$ utilized the self-duality of the square lattice. Sykes and Essam [42] established the relation
$$\kappa(p) = \kappa(1 - p) + 1 - 2p \tag{3.12}$$
where $\kappa(p) = E_p(|C|^{-1})$ and $E_p$ is the expectation operator corresponding to $P_p$. Assuming that $\kappa$ has a unique singularity, and that this is at the point $p_c$, then it follows from (3.12) that $p_c = 1 - p_c$ and hence $p_c = \frac{1}{2}$. Alternative proofs that $p_c = \frac{1}{2}$ are now available (see [22, 33]). However it is not ruled out that $\kappa$ is infinitely differentiable on $[0, 1]$. It may be conjectured that $\kappa$ is twice but not thrice differentiable at $p_c$.
4. Related Problems

4.1. Wind-Tree Problem

Versions of the wind-tree problem have been discussed by Lorenz, Ehrenfest [16], and Hauge and Cohen [29]. The following version is close to percolation theory. We start with the square lattice $\mathbb{L}^2$, a bucket of small double-sided mirrors, and a parameter $p$ taking values in $[0, 1]$. For each vertex $x$ we perform the following experiment. With probability $p$, we pick a mirror from the bucket and place it at the vertex $x$, in such a way that a ray of light arriving at $x$, parallel to any coordinate axis, is reflected through either $\frac{1}{2}\pi$ or $\frac{3}{2}\pi$ (measured clockwise), each possibility having probability $\frac{1}{2}$. The remaining probability $1 - p$ is the chance that we do nothing at $x$. We think of the mirrors as being random scatterers of light. How many vertices can see the origin? More precisely, we suppose that four rays of light are emitted along the coordinate axes from a light source placed at the origin. Let $C$ be the set of vertices which are illuminated by one or more of these light rays, and let $\theta(p)$ be the probability that $C$ is infinite. Clearly $\theta(0) = 1$, and it is straightforward to see that $\theta(1) = 0$, using a standard result for bond percolation on $\mathbb{L}^2$. Let
$$p_c(\mathrm{WT}) = \sup\{p : \theta(p) > 0\}. \tag{4.1}$$
Is it the case that $0 < p_c(\mathrm{WT}) < 1$?

4.2. Random Orientations

Here is a small problem in two dimensions. Each edge of $\mathbb{L}^2$ is oriented in a random direction, horizontal edges being oriented eastwards with probability $p$ and westwards otherwise, and vertical edges being oriented northwards with probability $p$ and southwards otherwise. Let $\eta(p)$ be the probability that there exists an infinite oriented path starting at the origin. It is not hard to see that $\eta(\frac{1}{2}) = 0$, and also that $\eta(p) = \eta(1 - p)$. Is it the case that $\eta(p) > 0$ if $p \ne \frac{1}{2}$?

4.3. Uniqueness for Minimal Spanning Trees

The following question concerning ‘continuous percolation’ has been posed in [5].
Let $\mathbf{X} = (X_i)$ be the set of points of a Poisson process in $\mathbb{R}^d$ with intensity 1, where
$d \ge 2$. We construct a spanning forest on $\mathbf{X}$ in the following way. For each $X \in \mathbf{X}$ we define trees $t_m(X, \mathbf{X})$, $m \ge 0$. Let $\xi_1 = X$ and let $t_1$ be the single vertex $\xi_1$. Let $t_2$ be the tree consisting of the vertex $\xi_1$ together with the vertex $\xi_2$ ($\in \mathbf{X} \setminus \{\xi_1\}$) which is closest to $\xi_1$, these two vertices being joined by an edge. Having constructed $t_{m-1}$, we define $t_m = t_m(X, \mathbf{X})$ by adding to $t_{m-1}$ a new edge $\langle \xi_{i_m}, \xi_m \rangle$ where $1 \le i_m < m$ and $\xi_m$ ($\in \mathbf{X} \setminus \{\xi_1, \xi_2, \dots, \xi_{m-1}\}$) is chosen so that the Euclidean distance $|\xi_{i_m} - \xi_m|$ is minimal over all possible edges joining $t_{m-1}$ to $\mathbf{X} \setminus \{\xi_1, \xi_2, \dots, \xi_{m-1}\}$. Finally we set $t(X, \mathbf{X}) = \bigcup_{m=1}^{\infty} t_m(X, \mathbf{X})$. Each point $X$ gives rise to an infinite tree $t(X, \mathbf{X})$. We now use these trees to make a forest. Let $F$ be the graph with vertex set $\mathbf{X}$, and which has each $\langle X_i, X_j \rangle$ as an edge if and only if it is in either $t(X_i, \mathbf{X})$ or $t(X_j, \mathbf{X})$. It may be seen that $F$ is a forest, every component of which is an infinite tree. Aldous and Steele conjecture that $F$ is a.s. a tree, which is to say that $F$ is a.s. connected. This tempting conjecture might be related to the uniqueness of the infinite open cluster of percolation.

4.4. Collisions of Random Walks

The following problem arises in the study of collisions of random walks (see [14, 43]). Let $k$ be a positive integer. Let $(X_i, Y_i : i \ge 0)$ be independent random variables, each being equally likely to take any of the values $1, 2, \dots, k$. We declare the point $(i, j)$ of $\mathbb{Z}^2$ open if $X_i \ne Y_j$. Let $\theta(k)$ be the probability that there is an infinite open path of $\mathbb{L}^2$ beginning at the origin, each edge of which leads the path either northwards or eastwards away from the origin. It may be shown that $\theta(3) = 0$. Is it true that $\theta(k) > 0$ for large $k$, perhaps for $k = 4$?
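The collision problem of Section 4.4 lends itself to a simple finite-size experiment. The sketch below checks, by dynamic programming, whether an open north/east path joins the origin to the anti-diagonal $i + j = n - 1$; the probability of this event is only a finite-$n$ proxy for $\theta(k)$. It assumes that every vertex of the path, including the origin, must be open, and the function names are hypothetical.

```python
import random

def has_oriented_path(X, Y):
    """X, Y: equal-length lists. The point (i, j) is open iff X[i] != Y[j].
    Returns True if an open path of north/east steps joins the origin to
    the anti-diagonal i + j = len(X) - 1."""
    n = len(X)
    reach = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(n - i):
            if X[i] == Y[j]:
                continue  # closed point: it can never be reached
            if i == 0 and j == 0:
                reach[i][j] = True
            else:
                reach[i][j] = (i > 0 and reach[i - 1][j]) or (j > 0 and reach[i][j - 1])
    return any(reach[i][n - 1 - i] for i in range(n))

def estimate_reach_probability(k, n, trials, seed=0):
    """Monte Carlo estimate of the probability of reaching distance n - 1
    when the X_i and Y_j are uniform on {1, ..., k}."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        X = [rng.randint(1, k) for _ in range(n)]
        Y = [rng.randint(1, k) for _ in range(n)]
        hits += has_oriented_path(X, Y)
    return hits / trials
```

Comparing `estimate_reach_probability(3, n, trials)` with `estimate_reach_probability(4, n, trials)` as `n` grows gives a feel for the question, though the strong dependence between points sharing a row or column means that such experiments should be read with caution.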
5. The Random-Cluster Model

The random-cluster model is a family of processes which includes percolation, the Ising and Potts models, and related systems. Its discovery was reported by Fortuin and Kasteleyn in a series of papers [17, 18, 19, 20, 32] published around 1970, and it has excited considerable interest recently. Here are brief descriptions of the Potts and random-cluster models. We start with a finite graph $G = (V, E)$. The Potts model has sample space $\Sigma_V = \{1, 2, \dots, q\}^V$, where $q$ is an integer satisfying $q \ge 2$. A spin vector $\sigma$ in this sample space has probability
$$\pi(\sigma) = \frac{1}{Z} \exp\Bigl\{ -J \sum_{e \in E} \bigl(1 - \delta_\sigma(e)\bigr) \Bigr\}, \qquad \text{for } \sigma \in \Sigma_V, \tag{5.1}$$
where $\delta_\sigma(e)$ is the indicator function of the event that the endpoints of $e$ have the same spin (see (6.3)), and $Z$ is the normalizing factor. The parameter $J$ describes the strength of pair-interactions. The random-cluster model is a (random) subgraph $(V, F)$ of $G$, the edge set $F$ being chosen at random according to the probability mass function
$$\phi(F) = \frac{1}{Z}\, p^{|F|} (1 - p)^{|E \setminus F|}\, q^{k(F)}, \qquad \text{for } F \subseteq E, \tag{5.2}$$
where $k(F)$ is the number of components of $(V, F)$. Here $p$ and $q$ satisfy $0 \le p \le 1$
and $q > 0$. The main observation is that the structures of $\pi$ and $\phi$ are closely related, when the parameters $J$ and $p$ satisfy $e^{-J} = 1 - p$. Since $0 \le p \le 1$, this requires $J \ge 0$, which is to say that the Potts model must be ferromagnetic. [If $J < 0$ then $p < 0$, and (5.2) defines a signed measure but not a probability measure.] The random-cluster measures (5.2) form a richer family than the (ferromagnetic) Potts measures, since they are well defined for all real positive $q$. There are many techniques which bear on the study of the random-cluster model. Some of these are valid for all $q$, others for $q \ge 1$, others for sufficiently large $q$, and others for integral values of $q$. To develop a coherent and cohesive theory of this model is a target of substantial appeal.

We pursue two targets in the rest of this paper. In Sections 6–8, we summarize some basic properties of random-cluster models; this material is well known and has appeared elsewhere (see [4, 15] and the original papers of Fortuin and Kasteleyn). Finally, in Section 9 we highlight open problems.
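One concrete way to see the relationship between $\pi$ and $\phi$ on a small graph is to compare their normalizing factors by brute force: when $e^{-J} = 1 - p$ and $q$ is an integer, the sums defining $Z$ in (5.1) and (5.2) coincide. The sketch below checks this numerically. It is an illustration only; the function names are hypothetical, vertices are labelled $0, \dots, n-1$, spins are labelled $0, \dots, q-1$, and the enumeration is exponential in the size of the graph.

```python
from itertools import product
from math import exp, log

def count_components(n_vertices, edges):
    """Number of connected components of the graph (V, edges), via union-find."""
    parent = list(range(n_vertices))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(u) for u in range(n_vertices)})

def coupling_J(p):
    """Interaction strength J determined by exp(-J) = 1 - p, for 0 <= p < 1."""
    return -log(1 - p)

def potts_partition(n_vertices, edges, q, J):
    """Normalizing factor of (5.1): sum over all spin vectors."""
    total = 0.0
    for sigma in product(range(q), repeat=n_vertices):
        disagreements = sum(sigma[u] != sigma[v] for u, v in edges)
        total += exp(-J * disagreements)
    return total

def random_cluster_partition(n_vertices, edges, q, p):
    """Normalizing factor of (5.2): sum over all edge subsets F."""
    total = 0.0
    for omega in product((0, 1), repeat=len(edges)):
        F = [e for e, state in zip(edges, omega) if state]
        k = count_components(n_vertices, F)
        total += p ** len(F) * (1 - p) ** (len(edges) - len(F)) * q ** k
    return total
```

For example, on the 4-cycle with `edges = [(0, 1), (1, 2), (2, 3), (3, 0)]`, the values `potts_partition(4, edges, q, coupling_J(p))` and `random_cluster_partition(4, edges, q, p)` agree up to rounding. Only the second function makes sense for non-integral `q`.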
6. Potts and Random-Cluster Processes

Potts and random-cluster processes may be viewed as the two marginal models obtained in the construction of a certain bivariate model; this was discovered by Edwards and Sokal [15]. Let G = (V, E) be a finite connected graph with no loops or multiple edges. We write u ∼ v whenever the two vertices u and v of G are adjacent; in this case the edge joining u to v is denoted by $\langle u, v \rangle$. Let q be an integer satisfying q ≥ 2. Potts models have realizations in the set $\Sigma_V = \{1, 2, \ldots, q\}^V$ of ‘spin vectors’; a typical realization is an assignment σ = (σ(u) : u ∈ V) of an integer from {1, 2, ..., q} to each vertex. A Potts model with q = 2 is called an Ising model [31]. Random-cluster processes have realizations in the set $\Omega_E = \{0, 1\}^E$ of ‘edge-configurations’. A typical realization is a vector ω = (ω(e) : e ∈ E) of 0’s and 1’s. Instead of working with the vector ω, it is often convenient to work with the set
\[ \eta(\omega) = \{ e \in E : \omega(e) = 1 \} \tag{6.1} \]
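of ‘open’ edges. The two processes referred to above may be constructed on the same sample space $\Sigma_V \times \Omega_E$ as follows. Let p satisfy 0 ≤ p ≤ 1, and define the probability mass function µ on $\Sigma_V \times \Omega_E$ by
\[ \mu(\sigma, \omega) = \frac{1}{Z} \prod_{e \in E} \Bigl\{ (1 - p)\,\delta_{\omega(e),0} + p\,\delta_{\omega(e),1}\,\delta_\sigma(e) \Bigr\}, \tag{6.2} \]
where Z is the appropriate normalizing constant, $\delta_{i,j}$ is the Kronecker delta, and $\delta_\sigma(e)$ is given by
\[ \delta_\sigma(e) = \begin{cases} 1 & \text{if } \sigma(u) = \sigma(v), \text{ where } e = \langle u, v \rangle, \\ 0 & \text{otherwise.} \end{cases} \tag{6.3} \]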
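Let us calculate the marginal measures of µ. Summing over all $\omega \in \Omega_E$, we obtain the marginal mass function π(σ) on $\Sigma_V$:
\begin{align*}
\pi(\sigma) &= \frac{1}{Z} \prod_{e \in E} \sum_{\omega(e) = 0,1} \Bigl\{ (1 - p)\,\delta_{\omega(e),0} + p\,\delta_{\omega(e),1}\,\delta_\sigma(e) \Bigr\} \tag{6.4} \\
&= \frac{1}{Z} \prod_{e \in E} \bigl\{ (1 - p) + p\,\delta_\sigma(e) \bigr\} = \frac{1}{Z} \prod_{e \in E} \exp\bigl\{ -J(1 - \delta_\sigma(e)) \bigr\},
\end{align*}
where J is given by
\[ e^{-J} = 1 - p \tag{6.5} \]
and satisfies 0 ≤ J ≤ ∞. The mass function π on $\Sigma_V$ is therefore the Potts measure [41, 46]. The letter π stands for ‘Potts’. In order to calculate the marginal mass function on $\Omega_E$, we rewrite µ(σ, ω) as
\begin{align*}
\mu(\sigma, \omega) &= \frac{1}{Z} \prod_{e : \omega(e) = 1} p\,\delta_\sigma(e) \prod_{e : \omega(e) = 0} (1 - p) \tag{6.6} \\
&= \frac{1}{Z}\, p^{|\eta(\omega)|} (1 - p)^{|E \setminus \eta(\omega)|}\, I(\sigma, \omega),
\end{align*}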
where
\[ I(\sigma, \omega) = \prod_{e : \omega(e) = 1} \delta_\sigma(e) = \prod_{e \in \eta(\omega)} \delta_\sigma(e) \]
is the indicator function of the event that σ assigns a constant spin to all vertices in any given component of the graph (V, η(ω)). Summing (6.6) over all $\sigma \in \Sigma_V$, we obtain the marginal mass function φ on $\Omega_E$ given by
\[ \phi(\omega) = \frac{1}{Z}\, p^{|\eta(\omega)|} (1 - p)^{|E \setminus \eta(\omega)|}\, q^{k(\omega)}, \tag{6.7} \]
where k(ω) is the number of components of (V, η(ω)); this holds since there are q admissible spin values for each such component. The letter φ in (6.7) stands for ‘Fortuin–Kasteleyn’. The form of φ is particularly attractive for at least two reasons. First, the formula (6.7) may be used to define a probability measure on $\Omega_E$ for any positive value of q; thus, random-cluster processes provide an interpolation of Potts models to non-integral values of q. Secondly, setting q = 1 we obtain the usual bond percolation model, in which edges are ‘open’ or ‘closed’ independently of one another.

Suppose we are studying the Potts model, and are interested in some ‘observable’ $f : \Sigma_V \to \mathbb{R}$; a particular example of interest is the ‘two-point function’ $\delta_{\sigma(u),\sigma(v)}$ for given u, v ∈ V. The mean value of f(σ) satisfies
\begin{align*}
E_\pi(f) &= \sum_{\sigma} f(\sigma)\pi(\sigma) = \sum_{\sigma, \omega} f(\sigma)\mu(\sigma, \omega) \tag{6.8} \\
&= \sum_{\omega} F(\omega)\phi(\omega) = E_\phi(F),
\end{align*}
where $F : \Omega_E \to \mathbb{R}$ is given by
\[ F(\omega) = \sum_{\sigma} f(\sigma)\mu(\sigma \mid \omega) \tag{6.9} \]
and $E_\pi$ and $E_\phi$ denote expectation with respect to the appropriate measure. This piece of formalism, $E_\pi(f) = E_\phi(F)$, has substantial value in practice. To see this, first let us calculate the conditional mass function µ(σ | ω). By (6.6) and (6.7),
\[ \mu(\sigma \mid \omega) = q^{-k(\omega)} I(\sigma, \omega), \tag{6.10} \]
which may be expressed as follows. Conditional on ω, we assign a constant spin to all vertices in any given component of (V, η(ω)); such spins are equally likely to take any value 1, 2, ..., q, and the spins assigned to different components are independent.

As a major application of (6.9), define $f : \Sigma_V \to \mathbb{R}$ by
\[ f(\sigma) = \delta_{\sigma(u),\sigma(v)} - \frac{1}{q}, \]
where u and v are two fixed vertices; the term $q^{-1}$ is the probability that two independent and equidistributed spins are equal. It follows from (6.9) and (6.10) that
\[ E_\pi(f) = E_\phi\bigl( (1 - q^{-1}) I_{\{u \leftrightarrow v\}} \bigr) = (1 - q^{-1})\,\phi(u \leftrightarrow v), \tag{6.11} \]
where $I_F$ denotes the indicator function of an event F (⊆ $\Omega_E$), and we write {A ↔ B} for the event that there exist a ∈ A (⊆ V) and b ∈ B (⊆ V) such that a and b are in the same component of (V, η(ω)). Equation (6.11) tells us that the two-point correlation function of the Potts model equals (apart from a constant factor) the probability of a certain connection in the random-cluster process. Thus, questions of correlation structure of Potts models become questions of stochastic geometry of the random-cluster process.
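The coupling (6.2) and the identities (6.4), (6.7) and (6.11) can be checked by brute-force enumeration on a small graph. The following Python sketch is illustrative only: the graph (a 4-cycle with one chord), the values p = 0.4 and q = 3, and all function names are assumptions chosen for the example, and spins are labelled 0, ..., q − 1 rather than 1, ..., q.

```python
from itertools import product
from math import exp, isclose, log

def component_labels(n, open_edges):
    """Return, for each vertex of (V, open_edges), a label of its component."""
    root = list(range(n))
    def find(x):
        while root[x] != x:
            x = root[x]
        return x
    for u, v in open_edges:
        root[find(u)] = find(v)
    return [find(x) for x in range(n)]

def edwards_sokal(n, edges, p, q):
    """Joint mass function mu(sigma, omega) of (6.2), normalized by enumeration."""
    weights = {}
    for sigma in product(range(q), repeat=n):
        for omega in product((0, 1), repeat=len(edges)):
            w = 1.0
            for (u, v), state in zip(edges, omega):
                w *= (1 - p) if state == 0 else p * (sigma[u] == sigma[v])
            weights[sigma, omega] = w
    z = sum(weights.values())
    return {key: w / z for key, w in weights.items()}

# Illustrative graph and parameters (assumptions, not taken from the text).
N, EDGES, P, Q = 4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 0.4, 3
mu = edwards_sokal(N, EDGES, P, Q)

# Marginals of mu on spin vectors (pi) and on edge-configurations (phi).
pi, phi = {}, {}
for (sigma, omega), mass in mu.items():
    pi[sigma] = pi.get(sigma, 0.0) + mass
    phi[omega] = phi.get(omega, 0.0) + mass

# Potts weights exp{-J * (number of disagreeing edges)}, with e^{-J} = 1 - p.
J = -log(1 - P)
potts = {s: exp(-J * sum(s[u] != s[v] for u, v in EDGES)) for s in pi}
z_potts = sum(potts.values())

def open_set(omega):
    return [e for e, state in zip(EDGES, omega) if state]

# Random-cluster weights p^|eta| (1 - p)^|E \ eta| q^k, as in (6.7).
rc = {w: P ** sum(w) * (1 - P) ** (len(EDGES) - sum(w))
         * Q ** len(set(component_labels(N, open_set(w)))) for w in phi}
z_rc = sum(rc.values())

potts_ok = all(isclose(pi[s], potts[s] / z_potts) for s in pi)
rc_ok = all(isclose(phi[w], rc[w] / z_rc) for w in phi)

# Two-point identity (6.11) for the (arbitrarily chosen) vertices u = 1, v = 3.
U, V = 1, 3
lhs = sum(mass * ((s[U] == s[V]) - 1 / Q) for s, mass in pi.items())
def joined(omega):
    labels = component_labels(N, open_set(omega))
    return labels[U] == labels[V]
rhs = (1 - 1 / Q) * sum(mass for w, mass in phi.items() if joined(w))
print(potts_ok, rc_ok, isclose(lhs, rhs))
```

The enumeration visits all $q^{|V|} 2^{|E|}$ pairs (σ, ω), so this approach is only practical for very small graphs; it is meant as a check of the algebra, not as a simulation method.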
7. Useful Properties

This section contains an account of some of the useful properties of the random-cluster measure. Most useful is the material of Sections 7.2 and 7.3, which appeared in the original work of Fortuin and Kasteleyn as well as in [4]. As before, G = (V, E) is a finite simple graph, $\Omega_E = \{0, 1\}^E$, 0 ≤ p ≤ 1, and q > 0. The mass function in question is
\[ \phi_{p,q}(\omega) = \frac{1}{Z_{p,q}} \Bigl\{ \prod_{e \in E} p^{\omega(e)} (1 - p)^{1 - \omega(e)} \Bigr\} q^{k(\omega)}, \qquad \omega \in \Omega_E, \tag{7.1} \]
where
\[ Z_{p,q} = \sum_{\omega \in \Omega_E} \Bigl\{ \prod_{e \in E} p^{\omega(e)} (1 - p)^{1 - \omega(e)} \Bigr\} q^{k(\omega)} \tag{7.2} \]
is the normalizing factor, or ‘partition function’. We write $\phi_{p,q}$ here to emphasize the role of the parameters.
7.1. The Value of q

Whereas the Potts model may be defined for integer values of q only, the random-cluster measure (7.1) is well defined for all positive real values of q. Therefore, the random-cluster measures enable an interpolation of Potts models to general values of q (∈ (0, ∞)). Indeed, in the context of signed measures, $\phi_{p,q}$ may be defined even for negative values of q. Henceforth we assume that q ∈ (0, ∞).

Professor Kasteleyn has pointed out that the random-cluster model is more general than the Potts model in the following additional regard. We saw at (6.8) that, for every function f of the Potts model, there exists a corresponding function F of the associated random-cluster process, and furthermore F does not depend on the value of p (but only on q). The converse is false: in general there may exist functions F with no corresponding f independent of J.

7.2. FKG Inequality

The measure $\phi_{p,q}$ satisfies the FKG inequality if and only if q ≥ 1. This fact is not difficult to prove, and has many applications. Possibly as a result of this fact, there appears to have been no serious study of the case 0 < q < 1.

Before stating this inequality, we recall some notation. A function $f : \Omega_E \to \mathbb{R}$ is called increasing if f(ω) ≤ f(ω′) whenever ω ≤ ω′; f is decreasing if −f is increasing. An event F (⊆ $\Omega_E$) is called increasing (respectively decreasing) if its indicator function $I_F$ is increasing (respectively decreasing). Finally, we write $E_{p,q}$ for expectation with respect to $\phi_{p,q}$.

Theorem 7.1 (FKG inequality). Suppose that q ≥ 1. If f and g are increasing functions on $\Omega_E$, then
\[ E_{p,q}(fg) \ge E_{p,q}(f)\,E_{p,q}(g). \tag{7.3} \]
Replacing f and g by −f and −g, we deduce that (7.3) holds for decreasing f and g also. Specializing to indicator functions, we obtain that
\[ \phi_{p,q}(A \cap B) \ge \phi_{p,q}(A)\,\phi_{p,q}(B) \quad \text{for increasing events } A, B, \tag{7.4} \]
whenever q ≥ 1. It is not difficult to see that the FKG inequality does not generally hold when 0 < q < 1.

Proof. A mass function µ on $\Omega_E$ satisfies the FKG inequality if ([21])
\[ \mu(\omega \vee \omega')\,\mu(\omega \wedge \omega') \ge \mu(\omega)\,\mu(\omega') \quad \text{for all } \omega, \omega' \in \Omega_E, \tag{7.5} \]
where ω ∨ ω′ and ω ∧ ω′ are the pointwise maximum and pointwise minimum configurations,
\[ (\omega \vee \omega')(e) = \max\{\omega(e), \omega'(e)\}, \qquad (\omega \wedge \omega')(e) = \min\{\omega(e), \omega'(e)\}; \]
note that
\[ \eta(\omega \vee \omega') = \eta(\omega) \cup \eta(\omega'), \qquad \eta(\omega \wedge \omega') = \eta(\omega) \cap \eta(\omega'). \]
Substituting µ = $\phi_{p,q}$, we see that (7.5) is equivalent to
\[ k(\omega \vee \omega') + k(\omega \wedge \omega') \ge k(\omega) + k(\omega') \quad \text{for all } \omega, \omega', \tag{7.6} \]
so long as q ≥ 1. Assume henceforth that q ≥ 1. Inequality (7.6) is easily proved by induction on $|\eta(\omega) \cup \eta(\omega')|$, and the rest of the proof may be skipped. Inequality (7.6) is trivially true if $\eta(\omega) \cup \eta(\omega') = \varnothing$. Suppose it is valid for $|\eta(\omega) \cup \eta(\omega')| \le m$. Let ω, ω′ satisfy $|\eta(\omega) \cup \eta(\omega')| = m + 1$; we may assume ω ≠ ω′, since (7.6) is trivial otherwise. Without loss of generality we may assume that there exists $e \in \eta(\omega) \setminus \eta(\omega')$, and we write $\omega_e$ for the configuration ω with e ‘switched off’, i.e.,
\[ \omega_e(f) = \begin{cases} \omega(f) & \text{if } f \ne e, \\ 0 & \text{if } f = e. \end{cases} \tag{7.7} \]
From the induction hypothesis,
\[ k(\omega_e \vee \omega') + k(\omega_e \wedge \omega') \ge k(\omega_e) + k(\omega'). \tag{7.8} \]
Write $C_e$ for the indicator function of the event that the endpoints of e are in the same component. Trivially,
\[ C_e(\omega_e \vee \omega') \ge C_e(\omega_e), \tag{7.9} \]
since $\omega_e \le \omega_e \vee \omega'$. Adding (7.8) and (7.9), we obtain (7.6), on noting that
\[ k(\nu_e) + C_e(\nu_e) = k(\nu) + 1 \quad \text{for } \nu \in \Omega_E \text{ satisfying } \nu(e) = 1 \]
(applied with ν = ω and with ν = ω ∨ ω′, for which $\nu_e = \omega_e \vee \omega'$ because e ∉ η(ω′)), and that $\omega_e \wedge \omega' = \omega \wedge \omega'$.
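The role of the condition q ≥ 1 can be seen numerically on the smallest graph containing a cycle. The sketch below is an illustration under stated assumptions: the triangle graph, the value p = 0.5, the two values of q, and the helper names are all chosen for the example. It tests the lattice condition (7.5) over all pairs of configurations and computes the covariance of the increasing events {ω(e₁) = 1} and {ω(e₂) = 1}.

```python
from itertools import product

def n_components(n, open_edges):
    """Number of connected components of the graph (V, open_edges)."""
    root = list(range(n))
    def find(x):
        while root[x] != x:
            x = root[x]
        return x
    for u, v in open_edges:
        root[find(u)] = find(v)
    return sum(1 for x in range(n) if root[x] == x)

def rc_measure(n, edges, p, q):
    """Random-cluster mass function (7.1) on (V, E), computed by enumeration."""
    weights = {}
    for omega in product((0, 1), repeat=len(edges)):
        eta = [e for e, state in zip(edges, omega) if state]
        weights[omega] = (p ** len(eta) * (1 - p) ** (len(edges) - len(eta))
                          * q ** n_components(n, eta))
    z = sum(weights.values())
    return {omega: w / z for omega, w in weights.items()}

def lattice_condition(phi, tol=1e-12):
    """True if (7.5) holds for every pair of configurations (up to rounding)."""
    for a in phi:
        for b in phi:
            join = tuple(map(max, a, b))   # pointwise maximum
            meet = tuple(map(min, a, b))   # pointwise minimum
            if phi[join] * phi[meet] < phi[a] * phi[b] - tol:
                return False
    return True

def covariance(phi, f, g):
    """Covariance of f and g under the mass function phi."""
    mean = lambda h: sum(h(w) * mass for w, mass in phi.items())
    return mean(lambda w: f(w) * g(w)) - mean(f) * mean(g)

TRIANGLE = [(0, 1), (1, 2), (2, 0)]       # illustrative graph (an assumption)
first_open = lambda w: w[0]               # indicator of {omega(e1) = 1}
second_open = lambda w: w[1]              # indicator of {omega(e2) = 1}

results = {}
for q in (2.0, 0.5):
    phi = rc_measure(3, TRIANGLE, 0.5, q)
    results[q] = (lattice_condition(phi), covariance(phi, first_open, second_open))
    print(q, results[q])
```

With q = 2 the lattice condition holds and the two events are positively correlated; with q = 0.5 the condition fails and the covariance is negative, in line with the remark after (7.4). On a tree the edges are independent for every q, so a cycle is needed to see the effect.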
7.3. Comparison Inequalities

Given two mass functions $\mu_1$ and $\mu_2$ on $\Omega_E$, we say that $\mu_2$ dominates $\mu_1$, and write $\mu_1 \le \mu_2$, if
\[ \sum_{\omega \in \Omega_E} f(\omega)\mu_1(\omega) \le \sum_{\omega \in \Omega_E} f(\omega)\mu_2(\omega) \]
for all increasing functions $f : \Omega_E \to \mathbb{R}$. One may establish certain domination inequalities involving the measures $\phi_{p,q}$ for different values of the parameters p and q. A principal application of such inequalities is to prove the existence of phase transition for different values of p and q, for the infinite-volume random-cluster process (see [4]).

Theorem 7.2 (Comparison inequalities). It is the case that
\begin{align*}
\phi_{p',q'} &\le \phi_{p,q} && \text{if } q' \ge q,\ q' \ge 1,\ p' \le p, \tag{7.10} \\
\phi_{p',q'} &\ge \phi_{p,q} && \text{if } q' \ge q,\ q' \ge 1,\ \frac{p'}{q'(1 - p')} \ge \frac{p}{q(1 - p)}. \tag{7.11}
\end{align*}

Proof. Since q′ ≥ 1, the measure $\phi_{p',q'}$ satisfies the FKG inequality. The theorem will follow by applying this inequality with suitable choices of increasing functions. Note that
\[ \phi_{p,q}(\omega) = \frac{\phi_{p',q'}(\omega)\,g(\omega)}{\sum_{\omega} \phi_{p',q'}(\omega)\,g(\omega)} \]
where g satisfies
\begin{align*}
g(\omega) &= \Bigl( \frac{q}{q'} \Bigr)^{k(\omega)} \prod_{e \in E} \Bigl\{ \frac{p/(1 - p)}{p'/(1 - p')} \Bigr\}^{\omega(e)} \\
&= \Bigl( \frac{q}{q'} \Bigr)^{k(\omega) + |\eta(\omega)|} \prod_{e \in E} \Bigl\{ \frac{p/[q(1 - p)]}{p'/[q'(1 - p')]} \Bigr\}^{\omega(e)}.
\end{align*}
Now k(ω) is a decreasing function of ω, and k(ω) + |η(ω)| is an increasing function of ω. Therefore
(a) if q ≤ q′ and p ≥ p′, then g is increasing,
(b) if q ≤ q′ and p/[q(1 − p)] ≤ p′/[q′(1 − p′)], then g is decreasing.
Under part (a), if f is increasing, then
\[ E_{p,q}(f) = \frac{E_{p',q'}(fg)}{E_{p',q'}(g)} \ge E_{p',q'}(f) \]
by the FKG inequality. Under part (b) the inequality is reversed, since f is increasing and g is decreasing.

7.4. Rank-Generating Function

The rank-generating function of the simple graph G = (V, E) is the function
\[ W_G(x, y) = \sum_{E' \subseteq E} x^{r(G')} y^{c(G')}, \qquad x, y \in \mathbb{R}, \]
where r(G′) = |V| − k(G′) is the rank of the graph G′ = (V, E′), and c(G′) = |E′| − |V| + k(G′) is the co-rank; as usual, k(G′) denotes the number of components of the graph G′. The rank-generating function has various useful properties, and occurs in several contexts in graph theory; see [11, 44]. The rank-generating function sometimes crops up in other forms. For example, the function
\[ T_G(x, y) = (x - 1)^{|V| - 1}\, W_G\bigl( (x - 1)^{-1},\, y - 1 \bigr) \]
is known as the dichromatic (or Tutte) polynomial. The partition function $Z_{p,q} = Z_{p,q}(G)$, given in (7.2), is easily seen to satisfy
\[ Z_{p,q}(G) = q^{|V|} (1 - p)^{|E|}\, W_G\Bigl( \frac{p}{q(1 - p)},\, \frac{p}{1 - p} \Bigr), \tag{7.12} \]
a relationship which provides a link with other classical quantities associated with a graph. See [18] also.

7.5. Hypergraphs

Whereas the random-cluster model above is defined on a graph, and corresponding Potts models have pair interactions, the theory may be extended easily to hypergraphs and many-body interactions. We shall not pursue this natural extension here, but refer the reader to [23] and the references therein.
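The identity (7.12) of Section 7.4 can be confirmed by summing over all edge subsets of a small graph. The following sketch assumes an illustrative graph (a 4-cycle with a chord) and illustrative values p = 0.3, q = 2.5; the function names are hypothetical helpers, not notation from the text.

```python
from itertools import combinations
from math import isclose

def n_components(n, open_edges):
    """Number of connected components of the graph (V, open_edges)."""
    root = list(range(n))
    def find(x):
        while root[x] != x:
            x = root[x]
        return x
    for u, v in open_edges:
        root[find(u)] = find(v)
    return sum(1 for x in range(n) if root[x] == x)

def rank_generating_function(n, edges, x, y):
    """W_G(x, y): sum over edge subsets E' of x^rank(G') * y^corank(G')."""
    total = 0.0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            k = n_components(n, subset)
            total += x ** (n - k) * y ** (size - n + k)
    return total

def partition_function(n, edges, p, q):
    """Z_{p,q}(G) of (7.2); each configuration is identified with its open set."""
    total = 0.0
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            total += (p ** size * (1 - p) ** (len(edges) - size)
                      * q ** n_components(n, subset))
    return total

# Illustrative graph and parameters (assumptions, not taken from the text).
N, EDGES, P, Q = 4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 0.3, 2.5
lhs = partition_function(N, EDGES, P, Q)
rhs = (Q ** N * (1 - P) ** len(EDGES)
       * rank_generating_function(N, EDGES, P / (Q * (1 - P)), P / (1 - P)))
print(lhs, rhs)
```

Setting q = 1 in `partition_function` returns 1 up to rounding, which reflects the fact that (7.1) then reduces to a product measure.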
8. Infinite-Volume Limits and Phase Transition

In studying random-cluster measures on lattices, we restrict ourselves to the case of the hypercubic lattice in d dimensions, where d ≥ 2; similar observations are valid in greater generality. For any subset S of $\mathbb{Z}^d$, we write ∂S for its boundary, i.e.,
\[ \partial S = \{ s \in S : \langle s, t \rangle \in \mathbb{E}^d \text{ for some } t \notin S \}. \]
Let Λ be a finite box of $\mathbb{L}^d$, which is to say that
\[ \Lambda = \prod_{i=1}^{d} [x_i, y_i] \]
for some x, y ∈ $\mathbb{Z}^d$; we interpret $[x_i, y_i]$ as the set $\{x_i, x_i + 1, \ldots, y_i\}$. The set Λ generates a subgraph of $\mathbb{L}^d$ having vertex set Λ and edge set $E_\Lambda$ containing all $\langle x, y \rangle$ with x, y ∈ Λ.

We are interested in the thermodynamic limit (as Λ ↑ $\mathbb{Z}^d$) of the random-cluster measure on the finite box Λ. Let $\Omega = \{0, 1\}^{\mathbb{E}^d}$ be the set of ‘edge-configurations’ of $\mathbb{L}^d$. Let $\Omega^1_\Lambda$ be the subset of Ω containing all ω ∈ Ω for which ω(e) = 1 for e ∉ $E_\Lambda$; similarly define $\Omega^0_\Lambda$ as the subset of Ω containing all ω with ω(e) = 0 for e ∉ $E_\Lambda$. One speaks of configurations in $\Omega^1_\Lambda$ as having ‘wired’ boundary conditions, and configurations in $\Omega^0_\Lambda$ as having ‘free’ boundary conditions.

We now define two random-cluster measures on $\mathbb{L}^d$. Let 0 < p < 1 and q > 0. For b = 0, 1, define
\[ \phi^b_{\Lambda,p,q}(\omega) = \frac{1}{Z^b_\Lambda} \Bigl\{ \prod_{e \in E_\Lambda} p^{\omega(e)} (1 - p)^{1 - \omega(e)} \Bigr\} q^{k(\omega,\Lambda)}, \qquad \omega \in \Omega^b_\Lambda, \tag{8.1} \]
where
\[ Z^b_\Lambda = \sum_{\omega \in \Omega^b_\Lambda} \Bigl\{ \prod_{e \in E_\Lambda} p^{\omega(e)} (1 - p)^{1 - \omega(e)} \Bigr\} q^{k(\omega,\Lambda)} \tag{8.2} \]
is the appropriate normalizing constant, and k(ω, Λ) is the number of clusters of ($\mathbb{Z}^d$, η(ω)) which intersect Λ.

Theorem 8.1 (Thermodynamic limit). Suppose q ≥ 1. The weak limits
\[ \phi^b_{p,q} = \lim_{\Lambda \uparrow \mathbb{Z}^d} \phi^b_{\Lambda,p,q}, \qquad \text{for } b = 0, 1, \tag{8.3} \]
exist and satisfy $\phi^0_{p,q} \le \phi^1_{p,q}$.

The limits in (8.3) are to be interpreted along any increasing sequence of finite boxes, and the weak convergence is in the sense that $\phi^b_{\Lambda,p,q}(A) \to \phi^b_{p,q}(A)$ for all finite-dimensional cylinders A. The assumption that q ≥ 1 is necessary for the proof, which relies on the validity of the FKG inequality.

One may discuss other boundary conditions, ‘mixed’ conditions which are more complicated than either wired or free; such conditions are relevant to random-cluster models arising from Potts models with mixed boundary conditions. We omit a
detailed discussion here, but note that $\phi^0_{p,q}$ and $\phi^1_{p,q}$ are the most ‘extreme’ measures obtainable in the infinite-volume limit, amongst an important subclass of boundary conditions. Define a random-cluster measure φ on $\mathbb{E}^d$ to be a measure with the property that, conditional on the states of edges lying outside any given finite set E (⊆ $\mathbb{E}^d$), the states of edges within E satisfy (7.1) on the graph induced by E with the appropriate boundary condition specifying which endpoints of edges in E are joined by edges outside E. We write $\mathcal{R}_{p,q}$ for the class of such measures, and note that every φ ∈ $\mathcal{R}_{p,q}$ satisfies $\phi^0_{p,q} \le \phi \le \phi^1_{p,q}$. See [24].

Sketch proof of Theorem 8.1. Let Λ and Λ′ be two finite boxes satisfying Λ ⊆ Λ′, and let A be the event that all edges e ∈ $E_{\Lambda'} \setminus E_\Lambda$ have state 0. Now $\phi^0_{\Lambda,p,q}$ may be thought of as the measure $\phi^0_{\Lambda',p,q}$ conditioned on the event A. Since A is a decreasing event, we have by the FKG inequality that
\[ \phi^0_{\Lambda,p,q}(\cdot) = \phi^0_{\Lambda',p,q}(\,\cdot \mid A) \le \phi^0_{\Lambda',p,q}(\cdot); \tag{8.4} \]
a similar argument yields
\[ \phi^1_{\Lambda,p,q} \ge \phi^1_{\Lambda',p,q}. \tag{8.5} \]
By monotonicity, the limits exist in (8.3). [By (8.4) and (8.5), $\lim_{\Lambda \uparrow \mathbb{Z}^d} \phi^b_{\Lambda,p,q}(B)$ exists for any increasing finite-dimensional cylinder B, and for b = 0, 1; such cylinders generate the appropriate σ-field.] To show that $\phi^0_{p,q} \le \phi^1_{p,q}$, it suffices that
\[ \phi^0_{\Lambda',p,q} \le \phi^1_{\Lambda',p,q}. \tag{8.6} \]
This too follows by the FKG inequality, since $\phi^b_{\Lambda',p,q}$ may be thought of as the random-cluster measure on a larger region Λ″ conditioned on the extra edges having state b; the conditional measure with b = 0 must lie underneath the conditional measure with b = 1.

An indicator of phase transition in the Potts model is the ‘magnetization’, defined as follows. Consider a Potts measure $\pi^1_\Lambda$ on Λ having ‘1’ boundary conditions, which is to say that all vertices in the boundary ∂Λ are constrained to have spin value 1. Let $\tau_\Lambda = \pi^1_\Lambda(\sigma(0) = 1) - q^{-1}$, the ‘effect’ of these boundary conditions on the spin at the origin. Passing to the corresponding random-cluster measure $\phi^1_\Lambda$, as in Section 6, we see as in (6.11) that
\[ \tau_\Lambda = (1 - q^{-1})\,\phi^1_\Lambda(0 \leftrightarrow \partial\Lambda). \tag{8.7} \]
In reaching this conclusion, we have suppressed the reference to parameter values, and applied (6.11) to the graph obtained from Λ by identifying all vertices in ∂Λ. We say that phase transition takes place in the Potts model if $\tau = \lim_{\Lambda \uparrow \mathbb{Z}^d} \tau_\Lambda$ satisfies τ > 0. In studying the random-cluster process, we shall work with the analogous quantity
\[ \theta_\Lambda(p, q) = \phi^1_\Lambda(0 \leftrightarrow \partial\Lambda) \tag{8.8} \]
and with the infinite-volume limit
\[ \theta(p, q) = \lim_{\Lambda \uparrow \mathbb{Z}^d} \theta_\Lambda(p, q); \tag{8.9} \]
this limit exists if q ≥ 1 (see [4, p. 22]). We have that $\theta(p, q) = \phi^1(0 \leftrightarrow \infty)$, the $\phi^1$-probability that the origin is in an infinite cluster; in the case q = 1, this coincides with the ‘percolation probability’ of the percolation model. By the comparison inequality (7.10), θ(p, q) is a non-decreasing function of p, and we may therefore define the critical value
\[ p_c(q) = \sup\{ p : \theta(p, q) = 0 \}, \qquad \text{for } q \ge 1. \tag{8.10} \]
How does $p_c(q)$ depend on the choice of q? The comparison inequalities imply that
\[ \frac{1}{p_c(q')} \le \frac{1}{p_c(q)} \le \frac{q'/q}{p_c(q')} - \frac{q'}{q} + 1 \qquad \text{if } 1 \le q \le q'. \tag{8.11} \]
In particular, since $0 < p_c(1) < 1$ ([22, p. 14]), we have that $0 < p_c(q) < 1$ for all q ≥ 1, implying the existence of a phase transition for all values of q (≥ 1). It follows that $p_c(q)$ is a Lipschitz-continuous and non-decreasing function of q; strict monotonicity may be shown using the method of [10].
9. Open Problems for Random-cluster Processes

9.1. Value of Critical Point

It is unreasonable to expect an exact calculation of the critical point $p_c(q)$ in general. For certain two-dimensional lattices however, the method of planar duality is applicable and leads to conjectured values.

Conjecture 9.1. The critical value for the random-cluster process on the square lattice is
\[ p_c(q) = \frac{\sqrt{q}}{1 + \sqrt{q}}, \qquad q \ge 1. \]

This has been proved for q = 1 (percolation), for q = 2 (Ising model), and for all sufficiently large values of q ([36, 37]). The argument of [30] may possibly be adapted to prove the conjecture when q ≥ 4. See [7, 45] also.

Corresponding conjectures may be made for certain other two-dimensional lattices, such as the triangular and hexagonal lattices, and also for certain processes in which the value of p may depend on the inclination of the edge in question. In making such conjectures, one uses the method of duality together with the star-triangle transformation.

9.2. Continuity of Percolation Probability

It is thought that θ(p, q) is continuous at the critical point $p = p_c(q)$ if and only if q is sufficiently small. Since θ is right-continuous, this amounts to deciding whether $\theta(p_c(q), q) = 0$ (‘second order transition’) or $\theta(p_c(q), q) > 0$ (‘first order transition’) for a given value of q.
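The conjectured value in Conjecture 9.1 can be compared with what (8.11) already implies. The sketch below rests on two inputs that are labelled as assumptions: the value $p_c(1) = 1/2$ for bond percolation on the square lattice (the result named in the title of [33]), and the standard planar duality relation p*/(1 − p*) = q(1 − p)/p, which is not stated in the text above. Applying (8.11) to the pair (1, q) gives $p_c(1) \le p_c(q) \le q\,p_c(1) / (1 + (q - 1)\,p_c(1))$.

```python
from math import isclose, sqrt

def conjectured_pc(q):
    """Conjecture 9.1: sqrt(q) / (1 + sqrt(q)) on the square lattice."""
    return sqrt(q) / (1 + sqrt(q))

def dual_parameter(p, q):
    """Dual edge-parameter p*, assuming p* / (1 - p*) = q (1 - p) / p."""
    ratio = q * (1 - p) / p
    return ratio / (1 + ratio)

def bounds_from_8_11(q, pc1=0.5):
    """Bounds on p_c(q), q >= 1, from (8.11) applied to the pair (1, q)."""
    lower = pc1
    upper = q * pc1 / (1 + (q - 1) * pc1)
    return lower, upper

for q in (1.0, 2.0, 4.0, 10.0):
    low, high = bounds_from_8_11(q)
    pc = conjectured_pc(q)
    # The conjectured value should be self-dual and lie between the bounds.
    print(q, low, pc, high, isclose(dual_parameter(pc, q), pc))
```

With $p_c(1) = 1/2$ the upper bound simplifies to q/(1 + q), and the conjectured value lies below it because √q ≤ q when q ≥ 1. This is a consistency check only; it does not bear on the truth of the conjecture.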
Conjecture 9.2. There exists a real Q = Q(d) such that
\[ \theta(p_c(q), q) \begin{cases} = 0 & \text{if } 1 \le q < Q(d), \\ > 0 & \text{if } q > Q(d). \end{cases} \]
Furthermore Q(2) = 4, and Q(d) = 2 for d ≥ 6.

That $\theta(p_c(q), q) > 0$ when q is large has been proved in [37]. As remarked in Section 3.2, it is not even known that $\theta(p_c(1), 1) = 0$.

9.3. Exponential Decay

Suppose q ≥ 1. Let $\tau_{p,q}(x, y)$ be the $\phi^1_{p,q}$-probability of a path joining the vertices x and y, and denote by $e_n$ the vertex (n, 0, 0, ..., 0). We have by the FKG inequality that
\[ \tau_{p,q}(0, e_{m+n}) \ge \tau_{p,q}(0, e_m)\,\tau_{p,q}(e_m, e_{m+n}), \]
whence the correlation length ξ(p, q), defined by
\[ \xi(p, q)^{-1} = \lim_{n \to \infty} \Bigl\{ -\frac{1}{n} \log \tau_{p,q}(0, e_n) \Bigr\}, \tag{9.1} \]
exists. Presumably
\[ 0 < \xi(p, q) < \infty \qquad \text{if } 0 < p < p_c(q), \tag{9.2} \]
but the finiteness of ξ(p, q) near $p_c(q)$ is unproven in general. It is known to hold for q = 1, 2, and for large values of q ([1, 2, 3, 22, 35, 37, 40]). By monotonicity, the quantity
\[ \mu(q) = \lim_{p \uparrow p_c(q)} \xi(p, q)^{-1} \]
exists, and it is thought that
\[ \mu(q) \begin{cases} = 0 & \text{if } q < Q(d), \\ > 0 & \text{if } q > Q(d); \end{cases} \tag{9.3} \]
once again, the existence of the mass gap (i.e., the fact that µ(q) > 0) has been proved in [37] for sufficiently large q.
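The limit in (9.1) can be made concrete in a one-dimensional toy case, which lies outside the setting d ≥ 2 of this section and is offered only as an illustration. On a finite path with free boundary conditions there are no cycles, so k(ω) = |V| − |η(ω)| and the random-cluster measure is a product measure with edge-density p/(p + q(1 − p)); the connection probability between the endpoints then decays exactly geometrically. The parameter values and names below are assumptions.

```python
from itertools import product
from math import isclose, log

def tau_on_path(n, p, q):
    """phi_{p,q}(0 <-> n) on the path 0 - 1 - ... - n, by enumeration."""
    z = joined = 0.0
    for omega in product((0, 1), repeat=n):
        opened = sum(omega)
        k = n + 1 - opened          # no cycles: each open edge merges two components
        weight = p ** opened * (1 - p) ** (n - opened) * q ** k
        z += weight
        if opened == n:             # the endpoints are joined only if every edge is open
            joined += weight
    return joined / z

P, Q = 0.6, 3.0
effective = P / (P + Q * (1 - P))   # edge-density of the product measure on a path
estimates = [-log(tau_on_path(n, P, Q)) / n for n in range(1, 9)]
print(estimates, -log(effective))
```

Here every finite-n estimate already equals the limiting value −log(p/(p + q(1 − p))), so the inverse correlation length is positive for all p < 1. In dimensions d ≥ 2 no such closed form is available, and the existence of the limit rests on the supermultiplicative inequality stated before (9.1).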
9.4. Uniqueness of Random-Cluster Measures

For given values of p and q, how large is the class $\mathcal{R}_{p,q}$ of random-cluster measures? It is presumably the case that $|\mathcal{R}_{p,q}| = 1$ whenever q ≥ 1 and $p \ne p_c(q)$, but this is not proved in general. Furthermore, when $p = p_c(q)$, it is presumably the case that $|\mathcal{R}_{p,q}| = 1$ if $\theta(p_c(q), q) = 0$, and otherwise $\mathcal{R}_{p,q}$ has exactly two extremal measures $\{\phi^0_{p,q}, \phi^1_{p,q}\}$.

Partial results are known in the direction of the uniqueness of random-cluster measures. First, $|\mathcal{R}_{p,q}| = 1$ if and only if $\phi^0_{p,q} = \phi^1_{p,q}$; furthermore ([4])
\[ \phi^0_{p,q} = \phi^1_{p,q} \qquad \text{if } \theta(p, q) = 0, \tag{9.4} \]
so that there is a unique such measure throughout the subcritical phase. If d = 2 and $p \ne p_c(q)$, then the uniqueness follows by exploiting self-duality (see [24] for a discussion). If d ≥ 3 and q is sufficiently large, then the uniqueness is a consequence of Pirogov–Sinai theory ([37, 39]). One may use a general argument based on the convexity of free energy to prove that $|\mathcal{R}_{p,q}| = 1$ for all values of p except (at most) countably many ([24, 25]).
9.5. The Case q < 1

If 0 < q < 1, then the FKG inequality is not valid. In the absence of the consequent monotonicity, it is no longer clear whether or not there is a phase transition, and what should be the form of such a transition. Using an argument based on convexity of free energy (see [24]), one may show that the edge-density $\phi_p(\omega(e) = 1)$ is non-decreasing in p, where $\phi_p$ is any translation-invariant random-cluster measure with parameters p and q. Increasing events other than {ω(e) = 1} may not generally have probabilities which are monotonic in p.

The mean-field random-cluster model, when the underlying graph is the complete graph on n vertices, and p = λ/n, may be solved exactly for all positive values of q, even q ∈ (0, 1); see [12].

References

1. Aizenman, M. (1982). Geometric analysis of $\phi^4$ fields and Ising models. Communications in Mathematical Physics 86, 1–48.
2. Aizenman, M. and Barsky, D. J. (1987). Sharpness of the phase transition in percolation models. Communications in Mathematical Physics 108, 489–526.
3. Aizenman, M., Barsky, D. J., and Fernández, R. (1987). The phase transition in a general class of Ising-type models is sharp. Journal of Statistical Physics 47, 343–374.
4. Aizenman, M., Chayes, J. T., Chayes, L., and Newman, C. M. (1988). Discontinuity of the magnetization in one-dimensional $1/|x - y|^2$ Ising and Potts models. Journal of Statistical Physics 50, 1–40.
5. Aldous, D. and Steele, J. M. (1992). Asymptotics for Euclidean minimal spanning trees on random points. Probability Theory and Related Fields 92, 247–258.
6. Barsky, D. J., Newman, C. M., and Grimmett, G. R. (1991). Percolation in half spaces: equality of critical probabilities and continuity of the percolation probability. Probability Theory and Related Fields 90, 111–148.
7. Baxter, R. J. (1982). Exactly Solved Models in Statistical Mechanics. Academic Press, London.
8. Berg, J. van den and Fiebig, U. (1987). On a combinatorial conjecture concerning disjoint occurrences of events. Annals of Probability 15, 354–374.
9. Berg, J. van den and Kesten, H. (1985). Inequalities with applications to percolation and reliability. Journal of Applied Probability 22, 556–569.
10. Bezuidenhout, C. E., Grimmett, G. R., and Kesten, H. (1992). Strict inequality for critical values of Potts models and random-cluster processes. Communications in Mathematical Physics (to appear).
11. Biggs, N. L. (1984). Algebraic Graph Theory. Cambridge University Press, Cambridge.
12. Bollobás, B., Grimmett, G. R., and Janson, S. (1993). The random-cluster process on the complete graph (to appear).
13. Burton, R. M. and Keane, M. (1989). Density and uniqueness in percolation. Communications in Mathematical Physics 121, 501–505.
14. Coppersmith, D., Tetali, P., and Winkler, P. (1993). Collisions among random walks on graphs. SIAM Journal on Discrete Mathematics (to appear).
15. Edwards, R. G. and Sokal, A. D. (1988). Generalization of the Fortuin–Kasteleyn–Swendsen–Wang representation and Monte Carlo algorithm. Physical Review D 38, 2009–2012.
16. Ehrenfest, P. (1959). Collected Scientific Papers (M. J. Klein, ed.). North-Holland, Amsterdam.
17. Fortuin, C. M. (1971). On the random-cluster model. Doctoral thesis, University of Leiden.
18. Fortuin, C. M. (1972). On the random cluster model. II. The percolation model. Physica 58, 393–418.
19. Fortuin, C. M. (1972). On the random cluster model. III. The simple random-cluster process. Physica 59, 545–570.
20. Fortuin, C. M. and Kasteleyn, P. W. (1972). On the random cluster model. I. Introduction and relation to other models. Physica 57, 536–564.
21. Fortuin, C. M., Kasteleyn, P. W., and Ginibre, J. (1971). Correlation inequalities on some partially ordered sets. Communications in Mathematical Physics 22, 89–103.
22. Grimmett, G. R. (1989). Percolation. Springer-Verlag, Berlin.
23. Grimmett, G. R. (1992). Potts models and random-cluster processes with many-body interactions (to appear).
24. Grimmett, G. R. (1993). Random-cluster measures (to appear).
25. Grimmett, G. R. (1993). The random-cluster model (to appear).
26. Grimmett, G. R. and Marstrand, J. M. (1990). The supercritical phase of percolation is well behaved. Proceedings of the Royal Society (London), Series A 430, 439–457.
27. Hara, T. and Slade, G. (1990). Mean-field critical behaviour for percolation in high dimensions. Communications in Mathematical Physics 128, 333–391.
28. Harris, T. E. (1960). A lower bound for the critical probability in a certain percolation process. Proceedings of the Cambridge Philosophical Society 56, 13–20.
29. Hauge, E. H. and Cohen, E. G. D. (1967). Normal and abnormal effects in Ehrenfest wind-tree model. Physics Letters 25A, 78–79.
30. Hintermann, D., Kunz, H., and Wu, F. Y. (1978). Exact results for the Potts model in two dimensions. Journal of Statistical Physics 19, 623–632.
31. Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik 31, 253–258.
32. Kasteleyn, P. W. and Fortuin, C. M. (1969). Phase transitions in lattice systems with random local properties. Journal of the Physical Society of Japan 26 (Supplement), 11–14.
33. Kesten, H. (1980). The critical probability of bond percolation on the square lattice equals 1/2. Communications in Mathematical Physics 74, 41–59.
34. Kesten, H. (1987). Scaling relations for 2D-percolation. Communications in Mathematical Physics 109, 109–156.
35. Kotecký, R. and Shlosman, S. (1982). First-order phase transitions in large entropy lattice systems. Communications in Mathematical Physics 83, 493–515.
36. Laanait, L., Messager, A., and Ruiz, J. (1986). Phase coexistence and surface tensions for the Potts model. Communications in Mathematical Physics 105, 527–545.
37. Laanait, L., Messager, A., Miracle-Sole, S., Ruiz, J., and Shlosman, S. (1991). Interfaces in the Potts model I: Pirogov–Sinai theory of the Fortuin–Kasteleyn representation. Communications in Mathematical Physics 140, 81–91.
38. Madras, N. and Slade, G. (1993). The Self-Avoiding Walk. Birkhäuser, Boston.
39. Martirosian, D. H. (1986). Translation invariant Gibbs states in the q-state Potts model. Communications in Mathematical Physics 105, 281–290.
40. Menshikov, M. V. (1986). Coincidence of critical points in percolation problems. Soviet Mathematics Doklady 33, 856–859.
41. Potts, R. B. (1952). Some generalized order-disorder transformations. Proceedings of the Cambridge Philosophical Society 48, 106–109.
42. Sykes, M. F. and Essam, J. W. (1964). Exact critical percolation probabilities for site and bond problems in two dimensions. Journal of Mathematical Physics 5, 1117–1127.
43. Tetali, P. and Winkler, P. (1993). Simultaneous reversible Markov chains (to appear).
44. Tutte, W. T. (1984). Graph Theory. Addison-Wesley, Menlo Park, California.
45. Welsh, D. J. A. (1992). Percolation in the random-cluster process. Journal of Physics A: Mathematical and General (to appear).
46. Wu, F. Y. (1982). The Potts model. Reviews of Modern Physics 54, 235–268.
MEAN-FIELD BEHAVIOUR AND THE LACE EXPANSION
TAKASHI HARA Department of Applied Physics Tokyo Institute of Technology Oh-Okayama, Meguro-ku, Tokyo 152 Japan e-mail:
[email protected]
and GORDON SLADE Department of Mathematics and Statistics McMaster University Hamilton, Ontario Canada L8S 4K1 e-mail:
[email protected]
Abstract. These lectures describe the lace expansion and its role in proving mean-field critical behaviour for self-avoiding walks, lattice trees and animals, and percolation, above their upper critical dimensions. Diagrammatic conditions for mean-field behaviour are also outlined. Key words: Lace expansion, self-avoiding walk, percolation, trees, lattice animals, mean-field behaviour, critical exponent, bubble diagram, square diagram, triangle condition, upper critical dimension.
1. Introduction

An important problem in statistical mechanics and probability theory is to prove existence of the critical exponents which characterize power-law behaviour in the vicinity of a phase transition. Typically the critical behaviour is dimension-dependent. For example, the susceptibility χ of the Ising model, one of the best-known models of statistical mechanics, is expected to behave as
\[ \chi(T) \approx |T - T_c|^{-\gamma}, \tag{1.1} \]
where $T_c$ is the critical temperature and γ is a critical exponent. It is a common feature of many models of statistical mechanics that there exists an upper critical dimension, above which the critical behaviour becomes simpler and dimension-independent. For example, above four dimensions the critical behaviour of the Ising model is the same as that of the mean-field model known as the Curie–Weiss model, in which spins interact not just with their neighbours but rather with the average of all other spins; in particular γ = 1. The term ‘mean-field’ has come also to apply to the behaviour of other statistical mechanical models above their upper critical dimensions, even when there is no explicit field or average. At the upper critical dimension d = 4 there are logarithmic corrections to mean-field behaviour
and $\chi(T) \approx |T - T_c|^{-1} (\log |T - T_c|)^{1/3}$, while for d < 4 there are different power laws and γ > 1 in (1.1). Similar behaviour is expected to hold for percolation and for combinatorial systems like the self-avoiding walk and lattice trees and animals. These lectures are concerned with the lace expansion (Brydges and Spencer 1985) and its application to prove existence of critical exponents for these models above their upper critical dimensions of respectively six, four and eight (five for oriented percolation). It remains an open problem to prove existence of critical exponents for these models at or below their upper critical dimensions, although there has been some recent progress in using renormalization group methods to study models related to the 4-dimensional self-avoiding walk in Brydges et al. (1992), Arnaudon et al. (1991), Iagolnitzer and Magnen (1992).

Being an expansion, the lace expansion requires a small parameter to ensure convergence. Here the small parameter will arise in two ways: as the reciprocal of the spatial dimension for nearest-neighbour models, or by considering sufficiently ‘spread-out’ models having suitable long-range connections. For the self-avoiding walk and for lattice trees and animals, the small parameter could alternatively be introduced through a weak interaction, as in the Domb–Joyce model. The spread-out models are believed to be in the same universality class as the corresponding nearest-neighbour models, and so are believed to have the same critical exponents. Studying sufficiently spread-out models allows for results for all dimensions $d > d_c$, where $d_c$ is the upper critical dimension, whereas current methods give results for the nearest-neighbour models only for d somewhat greater than $d_c$. An exception is the nearest-neighbour self-avoiding walk, for which a computer-assisted proof has allowed for the treatment of dimensions d ≥ 5.
For lattice trees and animals there are results for the nearest-neighbour model in high dimensions (how high has not been computed), or for the spread-out model with d > 8. For percolation there are results for the nearest-neighbour model in sufficiently high dimensions (currently d ≥ 19), and for the spread-out model in more than six dimensions. Oriented percolation has been treated by Nguyen and Yang (1992, 1993) for the nearest-neighbour model in sufficiently high dimensions and for sufficiently spread-out models above 4+1 dimensions. The lace expansion treats each of these models as a perturbation of a simple random walk model. Although the lace expansion has proved to be sufficiently flexible to treat a variety of models, its use has been limited by its reliance on correlation inequalities to prove convergence. For the self-avoiding walk the correlation inequality is based on the repulsive nature of the self-avoidance interaction, while for percolation it is the BK inequality. The absence of similar correlation inequalities has hindered the application of the lace expansion to other models, such as the ‘true’ self-avoiding walk (whose interaction combines repulsive and attractive aspects). However there are encouraging indications that a certain attractive walk model can be treated using the lace expansion. In addition difficulties have been encountered in applying lace expansion methods to analyze the upper critical dimension itself, involving problems in reconciliation of the expansion with coupling-constant renormalization. Lower dimensions remain a major challenge for any method. Historically, mean-field critical behaviour was first proven for Ising and other spin
systems using two ingredients. The first is the infra-red bound, which is a bound on the behaviour near the origin of the Fourier transform of the two-point function at criticality (Fröhlich et al. 1976). The second ingredient is correlation inequalities, which when combined with the infra-red bound have important consequences for critical behaviour. The first manifestation of this combination is Sokal (1979), where it was shown that finiteness of the specific heat in more than four dimensions follows transparently from the infra-red bound. Later more sophisticated correlation inequalities were proven, which together with the infra-red bound led to proof of mean-field behaviour for the susceptibility, magnetization, and so on, in more than four dimensions. (See Aizenman 1982, Fröhlich 1982, Aizenman and Fernández 1986, and Fernández et al. 1992 for details.)

For stochastic geometric models such as the self-avoiding walk, lattice trees and animals, and percolation, there exists no general proof of the infra-red bound. In fact there are claims that the infra-red bound is violated for $d < 8$ for lattice trees and animals, and for some dimensions below six for percolation. Following the successes with spin systems, the correlation inequality methods were soon applied to stochastic geometric systems, resulting in diagrammatic sufficient conditions such as the triangle condition for mean-field critical behaviour. At the time, the diagrammatic conditions could not be verified, due to the absence of an infra-red bound. For self-avoiding walks and lattice trees and animals it is now possible to use the lace expansion to prove mean-field behaviour without appeal to these diagrammatic sufficient conditions, although this is not true for percolation.
In this article we shall provide an outline of the diagrammatic conditions, in part because of their essential role in percolation, in part because of their intuitive role in identifying the upper critical dimension and reducing the problem of critical behaviour to two-point functions, and in part because the diagrams play an essential role as small parameters in bounding the lace expansion. The remainder of this article is organized as follows. In Section 2 we give precise definitions of the models and precise statements of results. Section 3 discusses diagrammatic conditions, and shows in particular the relevance of the triangle condition for mean-field behaviour of the expected cluster size in percolation. Finally in Section 4 we describe the lace expansion in some detail. Convergence issues are discussed only for self-avoiding walks, since the analysis is similar for the other models. A general reference for many of the topics covered below is Madras and Slade (1993); our exposition is based in places on this reference.
2. The Models and Results

In this section we give precise definitions of models and statements of results obtained with the lace expansion. All of the models are set on the hypercubic lattice $\mathbb{Z}^d$. In addition to the usual nearest-neighbour model we will consider also the spread-out model. For the spread-out model, we let $\Lambda$ denote the set $\{x \in \mathbb{Z}^d : 0 < \|x\|_\infty \le L\}$ for some fixed $L$ which will be taken to be large. The bonds of the spread-out model are then the pairs of sites $\{x, y\}$ with $y - x \in \Lambda$. We at times treat the spread-out and nearest-neighbour models simultaneously, by letting $\Lambda$ also denote the set of nearest neighbours of the origin.
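The two choices of $\Lambda$ can be made concrete with a short sketch. The function names below are illustrative and not from the text; the code simply lists the sites of $\Lambda$ for small $d$ and $L$.

```python
from itertools import product


def spread_out_neighbourhood(d, L):
    """Sites x in Z^d with 0 < ||x||_inf <= L (the spread-out choice of Lambda)."""
    return [x for x in product(range(-L, L + 1), repeat=d) if any(x)]


def nearest_neighbours(d):
    """The 2d nearest neighbours of the origin in Z^d."""
    steps = []
    for i in range(d):
        for sign in (1, -1):
            e = [0] * d
            e[i] = sign
            steps.append(tuple(e))
    return steps


# The spread-out neighbourhood has (2L + 1)^d - 1 sites.
print(len(spread_out_neighbourhood(2, 1)))  # 8
print(len(nearest_neighbours(5)))           # 10
```

The spread-out neighbourhood grows like $(2L+1)^d$, which is what makes its reciprocal available as a small parameter when $L$ is large.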
2.1. The Self-Avoiding Walk

For self-avoiding walks we restrict attention to the usual nearest-neighbour model, and will be interested in the case where the small parameter is the inverse dimension. Results similar to those stated below can be obtained more easily for $d > 4$ for the weakly self-avoiding walk or the spread-out self-avoiding walk, for in these contexts there is a small parameter which can be taken to be arbitrarily small. Results for the nearest-neighbour model for $d \ge 5$ were obtained via a computer-assisted proof, because of difficulties associated with the fact that the small parameter $d^{-1}$ is fixed and cannot be taken to be arbitrarily small.

The fact that the upper critical dimension for the self-avoiding walk is four can be partially understood from the fact that intersection properties of the simple random walk change dramatically at $d = 4$. For example, the probability that two independent $n$-step simple random walks do not intersect remains bounded away from zero for $d > 4$, but not for $d \le 4$. (See, e.g., Lawler 1991.) Mean-field behaviour for the self-avoiding walk means behaviour like that of the simple random walk.

An $n$-step self-avoiding walk is an ordered set $\omega = (\omega(0), \omega(1), \ldots, \omega(n))$, with each $\omega(i) \in \mathbb{Z}^d$, $|\omega(i+1) - \omega(i)| = 1$ for all $i$ (Euclidean distance), and most importantly $\omega(i) \ne \omega(j)$ when $i \ne j$. If $\omega$ is an $n$-step walk we write $|\omega| = n$ (not to be confused with the Euclidean norm $|\omega(i)|$ of $\omega(i) \in \mathbb{Z}^d$). Let $\Omega_n(x, y)$ denote the set of all $n$-step self-avoiding walks with $\omega(0) = x$ and $\omega(n) = y$, and let $c_n(x, y)$ be the cardinality of this set. In particular, $c_0(x, y) = \delta_{x,y}$. We also define $\Omega(x, y) = \bigcup_{n=0}^{\infty} \Omega_n(x, y)$ to be the set of all self-avoiding walks, of any length, from $x$ to $y$. Let $c_n$ be the number of $n$-step self-avoiding walks which begin at the origin and end anywhere, or in other words $c_n = \sum_y c_n(0, y)$.
Hammersley and Morton (1954) observed that the elementary submultiplicativity inequality $c_{n+m} \le c_n c_m$ implies the existence of the connective constant $\mu = \lim_{n\to\infty} c_n^{1/n}$, with $c_n \ge \mu^n$ for all $n$. The mean-square displacement $\langle |\omega(n)|^2 \rangle_n$ is defined to be the average value of $|\omega(n)|^2$ with respect to the uniform measure on the set of all $n$-step self-avoiding walks beginning at the origin:
\[
\langle |\omega(n)|^2 \rangle_n = \frac{1}{c_n} \sum_{x} \sum_{\omega \in \Omega_n(0,x)} |\omega(n)|^2 . \tag{2.1}
\]
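For small $n$ the quantities $c_n$ and $\langle |\omega(n)|^2 \rangle_n$ can be computed by brute-force enumeration. The following is a minimal sketch (the function name is illustrative, not from the text); it is exponential in $n$ and is meant only to make the definitions concrete.

```python
def saw_statistics(d, n_max):
    """Return (counts, msd) for nearest-neighbour self-avoiding walks in Z^d
    starting at the origin: counts[n] = c_n and msd[n] = <|omega(n)|^2>_n."""
    steps = []
    for i in range(d):
        for sign in (1, -1):
            e = [0] * d
            e[i] = sign
            steps.append(tuple(e))

    counts = [0] * (n_max + 1)   # counts[n] accumulates c_n
    sq_sums = [0] * (n_max + 1)  # sum of |omega(n)|^2 over all n-step walks

    def extend(site, visited, n):
        # Every prefix of a self-avoiding walk is itself a self-avoiding walk.
        counts[n] += 1
        sq_sums[n] += sum(c * c for c in site)
        if n == n_max:
            return
        for e in steps:
            nxt = tuple(a + b for a, b in zip(site, e))
            if nxt not in visited:
                visited.add(nxt)
                extend(nxt, visited, n + 1)
                visited.remove(nxt)

    origin = (0,) * d
    extend(origin, {origin}, 0)
    msd = [s / c for s, c in zip(sq_sums, counts)]
    return counts, msd


counts, msd = saw_statistics(2, 6)
print(counts)  # [1, 4, 12, 36, 100, 284, 780]
```

Since $c_n \ge \mu^n$, each value `counts[n] ** (1 / n)` is an upper bound on the connective constant $\mu$ in that dimension.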
The number of $n$-step self-avoiding walks and the mean-square displacement are believed to behave asymptotically like
\[
c_n \sim A \mu^n n^{\gamma - 1}, \tag{2.2}
\]
\[
\langle |\omega(n)|^2 \rangle_n \sim D n^{2\nu}, \tag{2.3}
\]
where the amplitudes $A$ and $D$ and the critical exponents $\gamma$ and $\nu$ are dimension-dependent positive constants. Here we are taking the optimistic viewpoint that the above relations really are asymptotic, in the usual sense of the term that the ratio of the left and right sides has limiting value of unity. The critical exponent $\gamma$ is believed to be equal to $43/32$ for $d = 2$, about 1.162 for $d = 3$, and 1 for $d \ge 4$, with a logarithmic factor $(\log n)^{1/4}$ multiplying $A\mu^n$ in four dimensions. The exponent $\nu$ is believed to be equal to $3/4$ for $d = 2$, about 0.588 for $d = 3$, and $1/2$ for $d \ge 4$, again with a logarithmic factor $(\log n)^{1/4}$ multiplying $Dn$ in four dimensions. In fact for $d \ge 5$ this is a theorem.
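Theorem 2.1. For any $d \ge 5$ there are (dimension-dependent) positive constants $A$ and $D$ such that, as $n \to \infty$,
\[
c_n = A \mu^n [1 + O(n^{-\epsilon})] \quad \text{for any } \epsilon < \tfrac{1}{2},
\]
\[
\langle |\omega(n)|^2 \rangle_n = D n [1 + O(n^{-\epsilon})] \quad \text{for any } \epsilon < \tfrac{1}{4}.
\]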
Rigorous numerical bounds on $A$ and $D$ are available. For example when $d = 5$, $1 \le A \le 1.493$ and $1.098 \le D \le 1.803$.

A weaker theorem, which is easier to prove, involves corresponding statements about generating functions. To define these generating functions, we let $z$ denote a complex parameter (usually taken here to be non-negative), and first define the two-point function as
\[
G_z(x, y) = \sum_{n=0}^{\infty} c_n(x, y) z^n = \sum_{\omega \in \Omega(x,y)} z^{|\omega|} . \tag{2.4}
\]
Then we define the susceptibility $\chi(z)$ as
\[
\chi(z) = \sum_{x} G_z(0, x) = \sum_{n=0}^{\infty} c_n z^n \tag{2.5}
\]
and the correlation length of order two $\xi_2(z)$ by
\[
\xi_2(z)^2 = \frac{\sum_x |x|^2 G_z(0, x)}{\sum_x G_z(0, x)} . \tag{2.6}
\]
The manner of divergence of the susceptibility and correlation length of order two at the critical point $z_c \equiv \mu^{-1}$ reflects the large-$n$ asymptotics of $c_n$ and the mean-square displacement.

Theorem 2.2. For any $d \ge 5$, as $z \nearrow z_c$ along the positive real axis,
\[
\chi(z) \sim \frac{A z_c}{z_c - z}, \qquad \xi_2(z) \sim \left( \frac{D z_c}{z_c - z} \right)^{1/2},
\]
with the same constants $A$ and $D$ as in Theorem 2.1, and with $f(z) \sim g(z)$ meaning that $\lim_{z \nearrow z_c} f(z)/g(z) = 1$.

Theorem 2.2 is weaker than Theorem 2.1, and needs something like a Tauberian condition to conclude Theorem 2.1. This has been done by combining good error estimates in Theorem 2.2 with contour integration methods using ‘fractional derivatives’.

For $z < z_c$ the two-point function is known to decay exponentially: the correlation length
\[
\xi(z) = - \left[ \lim_{n \to \infty} \frac{1}{n} \log G_z(0, (n, 0, \ldots, 0)) \right]^{-1} \tag{2.7}
\]
exists for $0 < z < z_c$, is strictly positive and finite, and $\xi(z) \nearrow \infty$ as $z \nearrow z_c$ (Chayes and Chayes 1986). It is believed that the divergence of $\xi(z)$ at the critical point is via a power law with power $\nu$; for $d \ge 5$ this is a theorem.
Theorem 2.3. For any $d \ge 5$, as $z \nearrow z_c$ along the positive real axis,
\[
\xi(z) \sim \sqrt{\frac{D}{2d}} \left( \frac{z_c}{z_c - z} \right)^{1/2} .
\]

The following theorem shows that the scaling limit of the self-avoiding walk is Brownian motion for $d \ge 5$, and provides a strong statement to the effect that the self-avoiding walk behaves like simple random walk for $d \ge 5$. To state the theorem, we denote by $C_d[0, 1]$ the set of continuous $\mathbb{R}^d$-valued functions on $[0, 1]$, equipped with the supremum norm. Given an $n$-step self-avoiding walk $\omega$, we define $X_n \in C_d[0, 1]$ by setting $X_n(j/n) = (Dn)^{-1/2} \omega(j)$ for $j = 0, 1, \ldots, n$, and taking $X_n(t)$ to be the linear interpolation of this. Let $dW$ denote the Wiener measure on $C_d[0, 1]$, and $\langle \cdot \rangle_n$ denote expectation with respect to the uniform measure on the set of $n$-step self-avoiding walks beginning at the origin.

Theorem 2.4. For $d \ge 5$ the scaling limit of self-avoiding walk is Brownian motion. In other words, for any bounded continuous function $f$ on $C_d[0, 1]$,
\[
\lim_{n \to \infty} \langle f(X_n) \rangle_n = \int f \, dW .
\]
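The rescaled path $X_n$ used in Theorem 2.4 can be written out explicitly. The sketch below is only an illustration of the definition; the function name is hypothetical, and the diffusion constant $D$ must be supplied by the caller, since the text gives bounds on it rather than a value.

```python
import math


def rescaled_path(omega, D):
    """Return t -> X_n(t) on [0, 1] for an n-step walk omega (a list of sites),
    where X_n(j/n) = (D n)^(-1/2) omega(j), linearly interpolated in between."""
    n = len(omega) - 1
    if n < 1:
        raise ValueError("the walk must have at least one step")
    scale = 1.0 / math.sqrt(D * n)

    def X(t):
        if not 0.0 <= t <= 1.0:
            raise ValueError("t must lie in [0, 1]")
        s = t * n
        j = min(int(math.floor(s)), n - 1)  # segment index; t = 1 uses the last one
        frac = s - j
        a, b = omega[j], omega[j + 1]
        return tuple(scale * ((1.0 - frac) * ai + frac * bi) for ai, bi in zip(a, b))

    return X


# Example with a 2-step walk in Z^2 and the placeholder value D = 1.
X = rescaled_path([(0, 0), (1, 0), (1, 1)], D=1.0)
print(X(0.75))
```

A bounded continuous functional $f$ in Theorem 2.4 would then be evaluated on such paths and averaged over all $n$-step self-avoiding walks.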
An important ingredient in the proof of the above theorems is an infra-red bound. This bound reflects the long distance behaviour of the critical two-point function indirectly through the behaviour of the Fourier transform of the two-point function near the origin. In general the Fourier transform of a summable function $f$ on $\mathbb{Z}^d$ is defined by
\[
\hat{f}(k) = \sum_{x} f(x) e^{i k \cdot x}, \qquad k = (k_1, \ldots, k_d) \in [-\pi, \pi]^d, \tag{2.8}
\]
where $k \cdot x = \sum_{j=1}^{d} k_j x_j$. The conjectured behaviour of the critical two-point function is
\[
G_{z_c}(x, y) \sim \text{const.} \, \frac{1}{|x - y|^{d - 2 + \eta}} \quad \text{as } |x - y| \to \infty, \tag{2.9}
\]
or for the Fourier transform
\[
\hat{G}_{z_c}(k) \sim \text{const.} \, \frac{1}{k^{2 - \eta}} \quad \text{as } k \to 0. \tag{2.10}
\]
Scaling theory predicts that the critical exponent $\eta$ is given in terms of $\gamma$ and $\nu$ by Fisher’s scaling relation (Fisher 1969)
\[
\gamma = (2 - \eta)\nu. \tag{2.11}
\]
According to the conjectured values of $\gamma$ and $\nu$, $\eta$ is non-negative in all dimensions. This is a statement of the infra-red bound, which can also be stated in the form $\hat{G}_{z_c}(k) \le C k^{-2}$. This is believed to be true in all dimensions, but remains unproven for dimensions 2, 3 and 4. This $k^{-2}$ behaviour is the same as that for simple random walk, for which the analogue of $G_{z_c}(0, x)$ is the Green function $\sum_{n=0}^{\infty} p_n(0, x)$ [where $p_n(0, x)$ is the probability that a simple random walk beginning at 0 is at $x$ after $n$ steps], whose Fourier transform is $(1 - d^{-1} \sum_{j=1}^{d} \cos k_j)^{-1} \sim (2d) k^{-2}$. The following theorem gives an infra-red bound for self-avoiding walks when $d \ge 5$, with correction term.
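The simple random walk asymptotics quoted above are easy to check numerically. The following sketch (function names are illustrative) evaluates the Fourier transform of the Green function and compares it with $2d/k^2$ as $k \to 0$; since $1 - \cos t \le t^2/2$, the ratio is at least 1 and should approach 1.

```python
import math


def srw_green_hat(k):
    """Fourier transform (1 - d^{-1} sum_j cos k_j)^{-1} of the simple random
    walk Green function, for a nonzero k in [-pi, pi]^d given as a sequence."""
    d = len(k)
    return 1.0 / (1.0 - sum(math.cos(kj) for kj in k) / d)


def infrared_ratio(k):
    """Ratio of srw_green_hat(k) to the small-k form 2d / |k|^2."""
    d = len(k)
    k_squared = sum(kj * kj for kj in k)
    return srw_green_hat(k) * k_squared / (2 * d)


direction = (0.3, -0.5, 0.2, 0.7, 0.1)  # an arbitrary direction in d = 5
for scale in (1.0, 0.1, 0.01):
    k = [scale * c for c in direction]
    print(scale, infrared_ratio(k))
```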
Theorem 2.5. For $d \ge 5$, $\hat{G}_{z_c}(k)^{-1} = (2d)^{-1} k^2 [D A^{-1} + O(k^{\epsilon})]$ for any $\epsilon < 1/2$.

Proofs of these theorems, with some further results along these lines, can be found in Hara and Slade (1992a, b). Some of the principal ideas are outlined in Section 4.1 below.

As a final application of the lace expansion, we note that it can be used to prove existence of an asymptotic expansion in powers of $1/d$, to all orders, for the connective constant $\mu$, and moreover to compute the coefficients of this expansion. The computation has been done (Hara and Slade 1993) as far as
\[
\mu = 2d - 1 - \frac{1}{2d} - \frac{3}{(2d)^2} - \frac{16}{(2d)^3} + O\!\left( \frac{1}{(2d)^4} \right). \tag{2.12}
\]
The coefficient of the term of order $(2d)^{-4}$ was computed by Fisher and Gaunt (1964) and Nemirovsky et al. (1992) to be $-102(2d)^{-4}$, but with no error estimate.

2.2. Lattice Trees and Animals

A lattice tree is defined to be a connected set of bonds which contains no closed loops. We consider trees on either the nearest-neighbour or spread-out lattice. Although a tree $T$ is defined as a set of bonds, we will write $x \in T$ if $x$ is an element of a bond in $T$. The number of bonds in $T$ will be denoted $|T|$. A lattice animal is a connected set of bonds which may contain closed loops. We denote a typical lattice animal by $A$ and the number of bonds in $A$ by $|A|$. Let $t_n$ denote the number of $n$-bond trees modulo translation, and let $a_n$ denote the number of $n$-bond animals modulo translation. By a subadditivity argument, $t_n^{1/n}$ and $a_n^{1/n}$ both converge to finite positive limits $\lambda$ and $\lambda_a$ as $n \to \infty$. The asymptotic behaviour of both $t_n$ and $a_n$ as $n \to \infty$ is believed to be governed by the same critical exponent $\theta$:
\[
t_n \sim \text{const.} \, \lambda^n n^{-\theta}, \qquad a_n \sim \text{const.} \, \lambda_a^n n^{-\theta}. \tag{2.13}
\]
The typical size of a lattice tree or animal is characterized by the average radius of gyration, which is defined for trees as the average over $n$-bond trees of their individual radii of gyration:
\[
R(n) = \frac{1}{t_n} \sum_{T : |T| = n} \left[ \frac{1}{n+1} \sum_{x \in T} |x - \bar{x}_T|^2 \right]^{1/2} . \tag{2.14}
\]
Here the sum over $T$ is the sum over one tree from each equivalence class modulo translation, and $\bar{x}_T = (n + 1)^{-1} \sum_{x \in T} x$ is the centre of mass of $T$. A similar definition applies for lattice animals. The average radius of gyration is believed to behave asymptotically as
\[
R(n) \sim \text{const.} \, n^{\nu} \tag{2.15}
\]
for a critical exponent $\nu$ which is the same for both trees and animals. The following theorem, proved in Hara and Slade (1992c), gives results for these critical exponents for trees in high dimensions. Related results have been obtained for lattice animals, but so far at the level of generating functions and not at the level of counts (Hara and Slade 1990b).
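As an illustration of the definitions above, the radius of gyration of a single tree, given as a collection of bonds, can be computed as follows. This is a sketch with an illustrative function name; it handles one tree only and does not perform the average over all $n$-bond trees.

```python
import math


def radius_of_gyration(bonds):
    """Radius of gyration of one lattice tree (or animal) given as a non-empty
    collection of bonds, each bond being a pair of sites (tuples of ints)."""
    sites = {site for bond in bonds for site in bond}
    m = len(sites)  # equals n + 1 for a tree with n bonds
    dim = len(next(iter(sites)))
    centre = [sum(s[i] for s in sites) / m for i in range(dim)]
    mean_sq = sum(
        sum((s[i] - centre[i]) ** 2 for i in range(dim)) for s in sites
    ) / m
    return math.sqrt(mean_sq)


# A straight 2-bond tree in Z^2: sites (0,0), (1,0), (2,0), centre of mass (1,0).
tree = [((0, 0), (1, 0)), ((1, 0), (2, 0))]
print(radius_of_gyration(tree))  # sqrt(2/3)
```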
Theorem 2.6. For nearest-neighbour trees with $d$ sufficiently large, or for spread-out trees with $d > 8$ and $L$ sufficiently large, there are positive constants such that for every $\epsilon < \min\{\tfrac{1}{2}, \tfrac{1}{4}(d - 8)\}$,
\[
t_n = \text{const.} \, \lambda^n n^{-5/2} [1 + O(n^{-\epsilon})], \qquad R(n) = \text{const.} \, n^{1/4} [1 + O(n^{-\epsilon})].
\]

Some hint can be gleaned from this theorem as to why the upper critical dimension should be eight. The fact that the size of trees typically grows like $n^{1/4}$ is a sign that in some sense trees are 4-dimensional objects, and hence will typically not intersect above eight dimensions. This suggests that for $d > 8$ lattice trees will have similar behaviour to the ‘mean-field’ model of abstract trees embedded in the lattice with no self-avoidance constraints.

As is the case for the self-avoiding walk, the proof proceeds first by studying generating functions near their closest singularity to the origin, and then uses complex variable methods to extract the large-$n$ asymptotics of $t_n$ and the radius of gyration. Let $z_c = \lambda^{-1}$, and for $|z| \le z_c$ define the two-point function
\[
G_z(x, y) = \sum_{T : T \ni x, y} z^{|T|} \tag{2.16}
\]
and the susceptibility $\chi(z) = \sum_x G_z(0, x)$. Then in particular, it is shown that under the hypotheses of the theorem the susceptibility $\chi(z)$ obeys
\[
\chi(z) = \text{const.} \, (z_c - z)^{-1/2} + O((z_c - z)^{-1/2 + \epsilon}) \tag{2.17}
\]
for all complex $|z| \le z_c$.

The Fourier transform of the two-point function plays an important role in the proof. In particular, an infra-red bound for $\hat{G}_z(k)$ is obtained. In contrast to the self-avoiding walk, it has been conjectured (Bovier et al. 1986) that the infra-red bound fails for lattice trees and animals below eight dimensions, or in other words that $\hat{G}_{z_c}(k)$ diverges more strongly than $k^{-2}$ for $d < 8$.

2.3. Percolation

In this section we discuss the nearest-neighbour model and the spread-out model simultaneously. We consider independent bond percolation where the bonds are the pairs $\{x, y\}$ of sites with $y - x \in \Lambda$. This means that to each bond $\{x, y\}$ we associate an independent Bernoulli random variable $n_{\{x,y\}}$ which takes the value 1 with probability $p$ and the value 0 with probability $1 - p$, where $p$ is a parameter in the closed interval $[0, 1]$. If $n_{\{x,y\}} = 1$ then we say that the bond $\{x, y\}$ is occupied, and otherwise we say that it is vacant. A configuration is a realization of the random variables for all bonds. Given a configuration and any two sites $x$ and $y$, we say that $x$ and $y$ are connected if there is a self-avoiding walk from $x$ to $y$ consisting of occupied bonds, or if $x = y$. We denote by $C(x)$ the random set of sites connected to $x$, and denote its cardinality by $|C(x)|$. For $d \ge 2$ we denote by $p_c \in (0, 1)$ the critical value of $p$ such that the probability $\theta(p)$ that the origin is connected to infinitely many sites is zero for $p < p_c$ and strictly positive for $p > p_c$. General references are Grimmett (1989) and Kesten (1982).
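The definition of the cluster $C(0)$ can be illustrated by simulation. The sketch below samples nearest-neighbour bond percolation restricted to a finite box (a truncation that is an assumption of the example, not part of the model on $\mathbb{Z}^d$) and grows the cluster of the origin by breadth-first search; the function name and parameters are illustrative.

```python
import random
from collections import deque


def cluster_of_origin(d, radius, p, rng):
    """Sample the cluster of the origin for nearest-neighbour bond percolation
    with density p, restricted to the box {x in Z^d : ||x||_inf <= radius}."""
    steps = []
    for i in range(d):
        for sign in (1, -1):
            e = [0] * d
            e[i] = sign
            steps.append(tuple(e))

    origin = (0,) * d
    cluster = {origin}
    queue = deque([origin])
    while queue:
        x = queue.popleft()
        for e in steps:
            y = tuple(a + b for a, b in zip(x, e))
            if y in cluster or max(abs(c) for c in y) > radius:
                continue
            # Each bond {x, y} is examined at most once: it is looked at only
            # while one endpoint is being processed and the other is not yet
            # in the cluster, so sampling it here is consistent.
            if rng.random() < p:
                cluster.add(y)
                queue.append(y)
    return cluster


rng = random.Random(0)
sizes = [len(cluster_of_origin(2, 20, 0.3, rng)) for _ in range(200)]
print(sum(sizes) / len(sizes))  # crude finite-box estimate of the expected cluster size
```

Averaging `len(cluster)` over many samples gives a finite-box estimate of the expected size of $C(0)$; the estimate is only meaningful when the box is large compared with typical clusters.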
We denote the indicator function of an event $E$ by $I[E]$ and expectation with respect to the joint distribution of the Bernoulli random variables $n_{\{x,y\}}$ by $\langle \cdot \rangle_p$. The two-point function $\tau_p(x, y)$ is defined to be the probability that $x$ and $y$ are connected:
\[
\tau_p(x, y) = \langle I[x \text{ and } y \text{ are connected}] \rangle_p . \tag{2.18}
\]
This is analogous to the functions $G_z(x, y)$ defined previously for self-avoiding walks or for lattice trees and animals. For $p < p_c$ the two-point function is known to decay exponentially as $|x - y| \to \infty$, so that the correlation length
\[
\xi(p) = - \left[ \lim_{n \to \infty} \frac{1}{n} \log \tau_p(0, (n, 0, \ldots, 0)) \right]^{-1} \tag{2.19}
\]
is finite and strictly positive. The susceptibility, or expected cluster size, is defined by
\[
\chi(p) = \sum_{x \in \mathbb{Z}^d} \tau_p(0, x) = \langle |C(0)| \rangle_p . \tag{2.20}
\]
The susceptibility is known to be finite for $p < p_c$ and to diverge as $p \nearrow p_c$. The magnetization is defined by
\[
M(p, h) = 1 - \sum_{1 \le n < \infty} e^{-hn} \langle I[|C(0)| = n] \rangle_p . \tag{2.21}
\]
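The following power laws are believed to hold:
\[
\begin{aligned}
\chi(p) &\sim A_1 (p_c - p)^{-\gamma} && \text{as } p \nearrow p_c, \\
\theta(p) &\sim A_2 (p - p_c)^{\beta} && \text{as } p \searrow p_c, \\
M(p_c, h) &\sim A_3 h^{1/\delta} && \text{as } h \searrow 0, \\
\langle |C(0)|^{m+1} \rangle_p / \langle |C(0)|^{m} \rangle_p &\sim A_4 (p_c - p)^{-\Delta} && \text{as } p \nearrow p_c \quad (m = 1, 2, \ldots), \\
\xi(p) &\sim A_5 (p_c - p)^{-\nu} && \text{as } p \nearrow p_c,
\end{aligned}
\]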
for some dimension-dependent amplitudes $A_i$ and critical exponents $\gamma, \beta, \delta, \Delta, \nu$. The next theorem gives existence of these critical exponents under certain conditions. Asymptotic behaviour has not been proved, but rather relations of the form $f(x) \simeq g(x)$, meaning that there are positive constants $c_1, c_2$ such that $c_1 g(x) \le f(x) \le c_2 g(x)$ for $x$ sufficiently close to its limiting value. The methods used to obtain precise asymptotic behaviour for self-avoiding walks and trees made use of the fact that, for example, the susceptibility of these models is a power series, and do not extend readily to percolation.

Theorem 2.7. For the nearest-neighbour model with $d$ sufficiently large ($d \ge 19$ is large enough), or for the spread-out model with $d > 6$ and $L$ sufficiently large,
\[
\begin{aligned}
\chi(p) &\simeq (p_c - p)^{-1} && \text{as } p \nearrow p_c, \\
\theta(p) &\simeq (p - p_c)^{1} && \text{as } p \searrow p_c, \\
M(p_c, h) &\simeq h^{1/2} && \text{as } h \searrow 0, \\
\langle |C(0)|^{m+1} \rangle_p / \langle |C(0)|^{m} \rangle_p &\simeq (p_c - p)^{-2} && \text{as } p \nearrow p_c \quad (m = 1, 2, \ldots), \\
\xi(p) &\simeq (p_c - p)^{-1/2} && \text{as } p \nearrow p_c.
\end{aligned}
\]
The ‘mean-field’ exponents appearing in this theorem are those for percolation on a tree, where exact calculations can be performed.

Theorem 2.7 is a combination of several results which centre on the triangle condition. The triangle condition is the statement that the triangle diagram is finite at the critical point, with the triangle diagram given by
\[
T(p) = \sum_{x, y} \tau_p(0, x) \tau_p(x, y) \tau_p(y, 0) = \int_{[-\pi, \pi]^d} \hat{\tau}_p(k)^3 \, \frac{d^d k}{(2\pi)^d} . \tag{2.22}
\]
The triangle condition is discussed in more detail in Section 3.3. Aizenman and Newman (1984) introduced the triangle condition and showed that it implies the mean-field behaviour $\chi(p) \simeq (p_c - p)^{-1}$. Barsky and Aizenman (1991) used differential inequalities to prove that the conclusions of the theorem concerning $\theta(p)$ and $M(p_c, h)$ follow from the triangle condition. Nguyen (1987) showed that the triangle condition implies $\Delta = 2$. Then Hara and Slade (1990a) used the lace expansion to show that the triangle condition holds under the hypotheses of the theorem. No direct implication for the exponent $\nu$ is known to follow from the triangle condition, but Hara (1990) has used the lace expansion to obtain the result of the theorem concerning the correlation length $\xi(p)$.

It follows from the behaviour of $\theta(p)$ given in Theorem 2.7 that the percolation probability is zero at the critical point: $\theta(p_c) = 0$. Although this is strongly believed to be true in all dimensions, it has otherwise been proven only for the nearest-neighbour model in two dimensions.

The proof that the triangle condition holds above six dimensions involves proving the infra-red bound
\[
0 \le \hat{\tau}_p(k) \le \text{const.} \, k^{-2}, \tag{2.23}
\]
with a constant which is uniform in $p < p_c$. The conjectured behaviour here in general dimensions is again
\[
\hat{\tau}_{p_c}(k) \sim \text{const.} \, \frac{1}{k^{2 - \eta}}, \tag{2.24}
\]
with $\eta$ determined by Fisher’s relation (2.11). However for percolation it has been conjectured that the infra-red bound is violated ($\eta < 0$) for some dimensions below six. The triangle condition is expected not to hold for any $d \le 6$.

The lace expansion can also be used to study the high-$d$ behaviour of the critical point of the nearest-neighbour model, with the result (Hara and Slade 1993)
\[
p_c = \frac{1}{2d} + \frac{1}{(2d)^2} + \frac{7}{2} \frac{1}{(2d)^3} + O\!\left( \frac{1}{(2d)^4} \right). \tag{2.25}
\]
This expansion was derived to two further terms in Gaunt and Ruskin (1978), but with no control on the error term.

2.4. Oriented Percolation

Consider now the case of bond percolation with bonds oriented in one direction. More precisely, we consider the sites in $\mathbb{Z}^{d+1}$ to be of the form $(x, n)$ with $x \in \mathbb{Z}^d$
and $n \in \mathbb{Z}$, and define an oriented bond to be an ordered pair $((x, n), (y, n + 1))$, where $|x - y| = 1$ (Euclidean distance in $\mathbb{Z}^d$). This defines the bonds of the nearest-neighbour model. For the simplest version of the spread-out model the bonds consist of all ordered pairs $((x, n), (y, n + 1))$ with $\|x - y\|_\infty \le L$. In oriented percolation we declare that each oriented bond is occupied with probability $p$ and is vacant with probability $1 - p$, independently for each bond. Let $\tau_p((x, m), (y, n))$ denote the probability that there is an oriented path consisting of occupied bonds from $(x, m)$ to $(y, n)$. This probability is zero unless $m < n$.

The situation here is in many respects similar to the usual unoriented bond percolation discussed in the previous section. In fact it is somewhat easier, because now there is a Markov property: the event that there is an oriented path consisting of occupied bonds joining $(x, m)$ to $(y, n)$ is independent of the occupation status of bonds lying below the hyperplane $\{(w, m) : w \in \mathbb{Z}^d\}$.

In Nguyen and Yang (1992) it was shown how the lace expansion could be adapted to apply to oriented percolation, either for the spread-out model above $4 + 1$ dimensions or for the nearest-neighbour model in sufficiently high dimensions. They proved that the triangle condition holds in these two situations, and thus, in combination with the results of Aizenman and Newman (1984) and Barsky and Aizenman (1991), obtained (among other results) the following theorem. In the statement of the theorem $p_c$ is of course the critical point for oriented percolation, and $\chi(p)$ and $\theta(p)$ are respectively the expected cluster size and the percolation probability.

Theorem 2.8. For the nearest-neighbour oriented percolation model in sufficiently high dimensions, or for the spread-out oriented percolation model with $d + 1 > 5$ and $L$ sufficiently large,
\[
\chi(p) \simeq (p_c - p)^{-1} \quad \text{as } p \nearrow p_c, \qquad \theta(p) \simeq (p - p_c)^{1} \quad \text{as } p \searrow p_c .
\]
The proof involves using the lace expansion to show that under the hypotheses of the above theorem oriented percolation can be regarded as a small perturbation of the corresponding model of simple random walk with steps oriented in one direction. Accordingly the infra-red bound must be modified to take the orientation into account. Let
\[
\hat{\tau}_p(k, t) = \sum_{(x, n) \in \mathbb{Z}^{d+1}} \tau_p((0, 0), (x, n)) e^{i k \cdot x} e^{i t n}, \qquad k \in [-\pi, \pi]^d, \; t \in [-\pi, \pi]. \tag{2.26}
\]
Then the infra-red bound used in proving Theorem 2.8, and which holds uniformly in $p < p_c$ under the hypotheses of the theorem, is
\[
|\hat{\tau}_p(k, t)| \le \text{const.} \, \frac{1}{k^2 + |t|} . \tag{2.27}
\]
The right side reflects the corresponding behaviour of random walk in the oriented context. Theorem 2.8 is at the level of generating functions. It is possible to go beyond this by incorporating the ‘fractional derivative’ methods of Hara and Slade (1990a). In particular, one can study the scaling limit of the hitting distribution on a distant
hyperplane $\{(x, n) : x \in \mathbb{Z}^d\}$, with $n$ large. In view of the outlook that in high dimensions at criticality oriented percolation is a small perturbation of a random walk model, a Gaussian scaling limit is to be expected. To state a theorem to this effect at the level of the Fourier transform, recently proved by Nguyen and Yang (1993), we first define
\[
Z_p(k; n) = \sum_{x} \tau_p((0, 0), (x, n)) e^{i k \cdot x} . \tag{2.28}
\]
The theorem is stated for all $p \le p_c$, but it is at the critical point itself that the result is most interesting.

Theorem 2.9. For the nearest-neighbour model in sufficiently high dimensions or for the spread-out model in dimensions $d + 1 > 5$ with $L$ sufficiently large, for any $p \le p_c$ there is a finite positive constant $\sigma_p^2$ such that
\[
\lim_{n \to \infty} \frac{Z_p(k/\sqrt{n}; n)}{Z_p(0; n)} = e^{-\sigma_p^2 k^2 / 2} .
\]

3. Diagrammatic Conditions and the Upper Critical Dimension

This section describes diagrammatic conditions for the self-avoiding walk, lattice trees and animals, and percolation, namely the finiteness of the bubble diagram, the square diagram, and the triangle diagram respectively. These diagrammatic conditions are sufficient conditions for mean-field behaviour for the susceptibility, and in the case of percolation also for other quantities. For percolation the triangle condition remains a necessary ingredient for proving the results of Sections 2.3 and 2.4. However for self-avoiding walks and lattice trees and animals the bubble and square conditions have been superseded by more powerful lace expansion methods. Nevertheless for each model it is instructive to see how the diagrams appear in the analysis.

The diagrams arise in differential inequalities for the susceptibility. In the case of the self-avoiding walk, the lower bound
\[
\chi(z) \ge \frac{z_c}{z_c - z} \tag{3.1}
\]
is an immediate consequence of (2.5) and the subadditivity bound $c_n \ge \mu^n = z_c^{-n}$. A complementary upper bound
\[
\chi(z) \le \text{const.} \, \frac{1}{z_c - z} \tag{3.2}
\]
is a consequence of the bubble condition, as will be shown below. Together these bounds give the mean-field behaviour
\[
\chi(z) \simeq (z_c - z)^{-1}, \tag{3.3}
\]
which is only expected to be true above the upper critical dimension. Similar considerations apply for lattice trees and animals and for percolation.
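A heuristic power-counting sketch indicates why the three diagrams single out dimensions four, six and eight. It assumes an infra-red bound of the form $\hat{G}(k) \le C k^{-2}$ (with $\hat{G}$ standing for $\hat{G}_{z_c}$ or $\hat{\tau}_{p_c}$ as appropriate) and uses the Fourier representations (3.5), (2.22) and (3.16) of the bubble, triangle and square diagrams, which involve the power $m = 2, 3, 4$ of the transform respectively. Passing to polar coordinates, with the box $[-\pi, \pi]^d$ contained in the ball of radius $\pi\sqrt{d}$:

```latex
\[
\int_{[-\pi,\pi]^d} \hat{G}(k)^m \, \frac{d^d k}{(2\pi)^d}
\;\le\; C^m \int_{[-\pi,\pi]^d} |k|^{-2m} \, \frac{d^d k}{(2\pi)^d}
\;\le\; \text{const.} \int_0^{\pi\sqrt{d}} r^{\,d - 1 - 2m} \, dr ,
\]
\[
\int_0^{\pi\sqrt{d}} r^{\,d - 1 - 2m} \, dr < \infty
\quad \Longleftrightarrow \quad d > 2m .
\]
```

Under this assumption the bubble ($m = 2$), triangle ($m = 3$) and square ($m = 4$) diagrams are finite for $d > 4$, $d > 6$ and $d > 8$ respectively, matching the upper critical dimensions quoted for the three models.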
3.1. The Bubble Condition

We restrict attention in this section to the self-avoiding walk. To state the bubble condition we first introduce the bubble diagram
\[
B(z) = \sum_{x \in \mathbb{Z}^d} G_z(0, x)^2 . \tag{3.4}
\]
The name ‘bubble diagram’ comes from a Feynman diagram notation in which the two-point function or propagator evaluated at sites $x$ and $y$ is denoted by a line terminating at $x$ and $y$. In this notation $B(z)$ is drawn as two lines joining a pair of vertices [diagram not reproduced], where it is implicit that one vertex is fixed at the origin and the other is summed over the lattice. The bubble diagram can be rewritten in terms of the Fourier transform of the two-point function, using (3.4) and the Parseval relation, as
\[
B(z) = \| G_z(0, \cdot) \|_2^2 = \| \hat{G}_z \|_2^2 = \int_{[-\pi, \pi]^d} \hat{G}_z(k)^2 \, \frac{d^d k}{(2\pi)^d} . \tag{3.5}
\]
The bubble condition is the statement that $B(z_c) < \infty$. In view of the definition of $\eta$ in (2.9) or (2.10), it follows from (3.5) that the bubble condition is satisfied provided $\eta > \tfrac{1}{2}(4 - d)$. Hence the bubble condition for $d > 4$ is implied by the infra-red bound $\eta \ge 0$. If the values for $\eta$ arising from Fisher’s relation and the conjectured values of $\gamma$ and $\nu$ are correct, then the bubble condition will not hold in dimensions 2, 3 or 4, with the divergence of the bubble diagram being only logarithmic in four dimensions.

The next theorem (Bovier et al. 1984) shows that the bubble condition implies (3.2) and hence implies (3.3).

Theorem 3.1. In all dimensions $\chi(z) \ge z_c (z_c - z)^{-1}$ for $z \in [0, z_c)$. If the bubble condition is satisfied then there is a corresponding upper bound, and for all $z \in [0, z_c)$
\[
\frac{z_c}{z_c - z} \le \chi(z) \le B(z_c) \left[ \frac{z_c}{z_c - z} + 1 \right] .
\]

Proof. The lower bound in the statement of the theorem is just (3.1), which holds in all dimensions. So it suffices to prove the upper bound. For this, we begin by obtaining a lower bound on the derivative of the susceptibility. Once this is achieved, integration of this differential inequality will give the upper bound in the statement of the theorem. The desired lower bound is that for any $z \in [0, z_c)$,
\[
\frac{\chi(z)^2}{B(z)} - \chi(z) \le z \chi'(z) . \tag{3.6}
\]
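To prove this, we begin by noting that below the critical point the derivative of $\chi$ can be obtained by term by term differentiation:
\[
z \chi'(z) = \sum_{y} \sum_{\omega \in \Omega(0, y)} |\omega| z^{|\omega|} = \sum_{y} \sum_{\omega \in \Omega(0, y)} (|\omega| + 1) z^{|\omega|} - \chi(z) . \tag{3.7}
\]
The summation on the right side can be written
\[
\sum_{y} \sum_{\omega \in \Omega(0, y)} \sum_{x} I[\omega(j) = x \text{ for some } j] \, z^{|\omega|}
= \sum_{x, y} \sum_{\substack{\omega^{(1)} \in \Omega(0, x) \\ \omega^{(2)} \in \Omega(x, y)}} z^{|\omega^{(1)}| + |\omega^{(2)}|} I[\omega^{(1)} \cap \omega^{(2)} = \{x\}]
\equiv Q(z), \tag{3.8}
\]
where $I$ denotes the indicator function. Therefore
\[
z \chi'(z) = Q(z) - \chi(z) . \tag{3.9}
\]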
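The identity behind (3.7) and (3.8) can be checked by brute force for short walks: the coefficient of $z^N$ in $Q(z)$ counts pairs $(\omega^{(1)}, \omega^{(2)})$ of total length $N$ that meet only at $x$, and splitting an $N$-step self-avoiding walk at any of its $N + 1$ sites shows that this count is $(N + 1) c_N$. The sketch below (illustrative function names, exponential running time) counts the pairs directly from the indicator condition.

```python
def unit_steps(d):
    """The 2d nearest-neighbour steps in Z^d."""
    steps = []
    for i in range(d):
        for sign in (1, -1):
            e = [0] * d
            e[i] = sign
            steps.append(tuple(e))
    return steps


def saws_from_origin(d, n):
    """All n-step self-avoiding walks from the origin, as tuples of sites."""
    steps = unit_steps(d)
    walks = [((0,) * d,)]
    for _ in range(n):
        longer = []
        for w in walks:
            for e in steps:
                nxt = tuple(a + b for a, b in zip(w[-1], e))
                if nxt not in w:
                    longer.append(w + (nxt,))
        walks = longer
    return walks


def count_pairs(d, N):
    """Number of pairs (w1, w2): w1 a self-avoiding walk from 0 to some x,
    w2 a self-avoiding walk starting at x, |w1| + |w2| = N, meeting only at x."""
    total = 0
    for a in range(N + 1):
        seconds = saws_from_origin(d, N - a)  # translated to start at x below
        for w1 in saws_from_origin(d, a):
            x = w1[-1]
            occupied = set(w1)
            for w2 in seconds:
                # Sites of w2 after its starting point, shifted to start at x.
                shifted = (tuple(p + q for p, q in zip(x, s)) for s in w2[1:])
                if all(site not in occupied for site in shifted):
                    total += 1
    return total


N = 4
c_N = len(saws_from_origin(2, N))
print(count_pairs(2, N), (N + 1) * c_N)  # the two numbers agree
```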
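The next step is to rewrite $Q(z)$ by using the inclusion-exclusion relation in the form
\[
I[\omega^{(1)} \cap \omega^{(2)} = \{x\}] = 1 - I[\omega^{(1)} \cap \omega^{(2)} \ne \{x\}] .
\]
This gives
\[
Q(z) = \chi(z)^2 - \sum_{x, y} \sum_{\substack{\omega^{(1)} \in \Omega(0, x) \\ \omega^{(2)} \in \Omega(x, y)}} z^{|\omega^{(1)}| + |\omega^{(2)}|} I[\omega^{(1)} \cap \omega^{(2)} \ne \{x\}] . \tag{3.10}
\]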
In the last term on the right side of (3.10), let $w = \omega^{(2)}(l)$ be the site of the last intersection of $\omega^{(2)}$ with $\omega^{(1)}$, where time is measured along $\omega^{(2)}$ beginning at its starting point $x$. Then the portion of $\omega^{(2)}$ corresponding to times greater than $l$ must avoid all of $\omega^{(1)}$. Relaxing the restrictions that this portion of $\omega^{(2)}$ avoid both the remainder of $\omega^{(2)}$ and the part of $\omega^{(1)}$ linking $w$ to $x$ gives the upper bound
\[
\sum_{x, y} \sum_{\substack{\omega^{(1)} \in \Omega(0, x) \\ \omega^{(2)} \in \Omega(x, y)}} z^{|\omega^{(1)}| + |\omega^{(2)}|} I[\omega^{(1)} \cap \omega^{(2)} \ne \{x\}] \le Q(z) [B(z) - 1], \tag{3.11}
\]
as illustrated in Figure 1. Here the factor $B(z) - 1$ arises from the two paths joining $w$ and $x$. The upper bound involves $B(z) - 1$ rather than $B(z)$ since there will be no contribution here from the $x = 0$ term in (3.4). Combining (3.10) and (3.11) gives
\[
Q(z) \ge \chi(z)^2 - Q(z) [B(z) - 1] .
\]
Solving for $Q(z)$ gives
\[
Q(z) \ge \frac{\chi(z)^2}{B(z)} . \tag{3.12}
\]
[Figure 1: diagram not reproduced.]

Fig. 1. A diagrammatic representation of the inequality $\chi(z)^2 - Q(z)[B(z) - 1] \le Q(z)$ occurring in the proof of Theorem 3.1. The lists of pairs of lines indicate interactions between the propagators, in the sense that the corresponding walks must avoid each other.
Combining this inequality with (3.9) gives (3.6). We now integrate (3.6) to obtain the upper bound in the statement of the theorem. Let $z_1 \in [0, z_c)$. Dividing (3.6) by $\chi(z)^2$ and using the fact that $B$ and $\chi$ are non-decreasing on $[0, z_c)$, for $z \in [z_1, z_c)$ we have
\[
- z \frac{d \chi^{-1}}{dz} \ge \frac{1}{B(z)} - \frac{1}{\chi(z)} \ge \frac{1}{B(z_c)} - \frac{1}{\chi(z_1)} . \tag{3.13}
\]
We bound the factor of $z$ on the left side by $z_c$ and then integrate from $z_1$ to $z_c$. Using the fact that $\chi(z_c)^{-1} = 0$ by (3.1), this gives
\[
z_c \chi(z_1)^{-1} \ge [B(z_c)^{-1} - \chi(z_1)^{-1}] (z_c - z_1) . \tag{3.14}
\]
Rewriting gives
\[
\chi(z_1) \le B(z_c) \, \frac{2 z_c - z_1}{z_c - z_1}, \tag{3.15}
\]
which is the desired upper bound on the susceptibility.
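For completeness, the rearrangement leading from (3.14) to (3.15), and the agreement of (3.15) with the form of the bound stated in Theorem 3.1, can be spelled out as follows. All quantities involved are positive, so the inequalities may be inverted.

```latex
\begin{align*}
z_c \, \chi(z_1)^{-1}
  &\ge \bigl[ B(z_c)^{-1} - \chi(z_1)^{-1} \bigr] (z_c - z_1) \\
\Longrightarrow \quad
\chi(z_1)^{-1} \bigl[ z_c + (z_c - z_1) \bigr]
  &\ge B(z_c)^{-1} (z_c - z_1) \\
\Longrightarrow \quad
\chi(z_1)
  &\le B(z_c) \, \frac{2 z_c - z_1}{z_c - z_1}
   = B(z_c) \left[ \frac{z_c}{z_c - z_1} + 1 \right] .
\end{align*}
```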
3.2. The Square Condition

For lattice trees and animals one can argue similarly, and we just summarize the result for trees. Again for concreteness we consider only the nearest-neighbour case, although there are no difficulties in dealing with greater generality. We first define the square diagram by
\[
S(z) = \sum_{x, y, u} G_z(0, x) G_z(x, y) G_z(y, u) G_z(u, 0) = \int_{[-\pi, \pi]^d} \hat{G}_z(k)^4 \, \frac{d^d k}{(2\pi)^d}, \tag{3.16}
\]
where $G_z(x, y)$ is the two-point function defined in (2.16). The square condition states that $S(z_c) < \infty$, and will be satisfied for $d > 8$ if $\hat{G}_{z_c}(k) \le \text{const.} \, k^{-2}$. The following theorem (Bovier et al. 1986, Tasaki 1986, Tasaki and Hara 1987) shows the relevance of the square condition.

Theorem 3.2. For all $d$, $\chi(z) \ge \text{const.} \, (z_c - z)^{-1/2}$ for $0 \le z \le z_c$. If the square condition is satisfied then the reverse inequality also holds, and hence
\[
\chi(z) \simeq (z_c - z)^{-1/2} . \tag{3.17}
\]
[Figure 2: diagram not reproduced.]

Fig. 2. The skeleton inequality for lattice trees. Each line represents a two-point function and unlabelled vertices are summed over.
Proof. Again, we start with an expression for the derivative of $\chi$. By definition,
\[ z\frac{d\chi(z)}{dz} = z\sum_x \frac{dG_z(0,x)}{dz} = \sum_x \sum_{T \ni 0,x} |T|\,z^{|T|} = \sum_{x,y}\sum_{T \ni 0,x,y} z^{|T|} - \chi(z) \equiv \sum_{x,y} G_z(0,x,y) - \chi(z), \tag{3.18} \]
where the last identity defines the three-point function. Using inclusion-exclusion, the three-point function can be bounded according to the ‘skeleton inequality’ illustrated in Figure 2. This leads to the estimate
\[ \chi(z)^3\left[\frac{1}{G_z(0,0)^2} - 3[S(z) - 1]\right] - \chi(z) \le z\frac{d\chi(z)}{dz} \le \chi(z)^3. \tag{3.19} \]
With some care$^1$ this differential inequality can be integrated to obtain the lower bound on $\chi$ stated in the theorem, and also a corresponding upper bound if $S(z_c) < 1 + [3G_{z_c}(0,0)]^{-2}$. This argument can be strengthened to prove (3.17) under the weaker hypothesis $S(z_c) < \infty$ by using an argument of Aizenman (1982); Lemma 3.6 below treats the analogous problem for percolation.

1 Care is required to deal with the possibility that $\chi(z)^{-1}$ has a discontinuity at $z_c^-$.

3.3. The Triangle Condition

It is known from the results of Chayes and Chayes (1987) and Tasaki (1987) that the upper critical dimension of percolation is at least six. In brief, they proved critical exponent inequalities, such as $d\nu \ge 2\Delta - \gamma$, assuming the exponents exist. Inserting the mean-field values $\gamma = 1$, $\Delta = 2$, $\nu = \frac{1}{2}$ (corresponding to percolation on a tree), we see that for $d < 6$ the inequality is not satisfied and hence at least one of these exponents cannot take on its mean-field value. On the other hand, the triangle condition is a diagrammatic sufficient condition for mean-field behaviour, which is known under some circumstances to hold for $d > 6$ (see Theorem 2.7). The triangle diagram is defined by
\[ T(p) = \sum_{x,y} \tau_p(0,x)\tau_p(x,y)\tau_p(y,0) = \int_{[-\pi,\pi]^d} \hat{\tau}_p(k)^3\,\frac{d^dk}{(2\pi)^d}, \tag{3.20} \]
and the triangle condition is the statement that $T(p_c) < \infty$. For $d > 6$ the infra-red bound is a sufficient condition for the triangle condition, and for $d \le 6$ the triangle condition is believed to be violated. Our present goal is to prove the following theorem due to Aizenman and Newman (1984). Further consequences of the triangle condition are obtained in Barsky and Aizenman (1991) and Nguyen (1987).

Theorem 3.3. For all $d \ge 2$, $\chi(p) \ge \text{const.}\,(p_c - p)^{-1}$ for $0 \le p < p_c$. If the triangle condition is satisfied then the corresponding upper bound also holds, and hence
\[ \chi(p) \simeq (p_c - p)^{-1} \quad \text{as } p \nearrow p_c. \tag{3.21} \]
Before beginning, we collect several definitions needed here as well as in Section 4.3. We will make use below of Russo’s formula and the BK and FKG inequalities; proofs of these can be found in Grimmett (1989).

Definition 3.4.
(a) A bond is an unordered pair of distinct sites $\{x, y\}$ with $y - x \in \Lambda$. A directed bond is an ordered pair $(x, y)$ of distinct sites with $y - x \in \Lambda$. A path from $x$ to $y$ is a self-avoiding walk from $x$ to $y$, considered to be a set of bonds. Two paths are disjoint if they have no bonds in common (they may have common sites). Given a bond configuration, an occupied path is a path consisting of occupied bonds.
(b) Given a bond configuration, two sites $x$ and $y$ are connected if there is an occupied path from $x$ to $y$ or if $x = y$. We denote by $C(x)$ the random set of sites which are connected to $x$. Two sites $x$ and $y$ are doubly-connected if there are at least two disjoint occupied paths from $x$ to $y$ or if $x = y$. We denote by $D_c(x)$ the random set of sites which are doubly-connected to $x$. Given a bond $\{u, v\}$ and a bond configuration, we define $C_{\{u,v\}}(x)$ to be the set of sites which remain connected to $x$ in the new configuration obtained by setting the occupation status of $\{u, v\}$ to be vacant.
(c) Given a set of sites $A \subset \mathbb{Z}^d$ and a bond configuration, two sites $x$ and $y$ are connected in $A$ if there is an occupied path from $x$ to $y$ having all of its sites in $A$ (so in particular it is required that $x, y \in A$), or if $x = y \in A$. Two sites $x$ and $y$ are connected through $A$ if they are connected in such a way that every occupied path from $x$ to $y$ has at least one bond with an endpoint in $A$, or if $x = y \in A$.
(d) We denote by $\hat{C}^A(x)$ the random set of sites connected to $x$ in $\mathbb{Z}^d \setminus A$. The restricted two-point function is defined by
\[ \tau_p^A(x, y) = \langle I[x \text{ and } y \text{ are connected in } \mathbb{Z}^d \setminus A]\rangle_p = \langle I[y \in \hat{C}^A(x)]\rangle_p. \]
(e) Given a bond configuration, a bond $\{u, v\}$ (occupied or not) is called pivotal for the connection from $x$ to $y$ if (i) either $x \in C(u)$ and $y \in C(v)$, or $x \in C(v)$ and $y \in C(u)$, and (ii) $y \notin C_{\{u,v\}}(x)$. Similarly a directed bond $(u, v)$ is pivotal for the connection from $x$ to $y$ if $x \in C_{\{u,v\}}(u)$, $y \in C_{\{u,v\}}(v)$ and $y \notin C_{\{u,v\}}(x)$; this event will be denoted $E_1(x, (u, v), y)$. If $x$ and $y$ are connected then there is a natural order to the set of occupied pivotal bonds for the connection from $x$ to $y$ (assuming there is at least one occupied pivotal bond), and each of these pivotal bonds is directed in a natural way, as follows. The first pivotal bond from $x$ to $y$ is the directed occupied pivotal bond $(u, v)$ such that $u$ is doubly-connected to $x$. If $(u, v)$ is the first pivotal bond for the connection from $x$ to $y$, then the second pivotal bond is the first pivotal bond for the connection from $v$ to $y$, and so on.
Fig. 3. The event $E_2(x, y; A)$. The line segments represent the pivotal bonds for the connection from $x$ to $y$, and the circles represent clusters with no such pivotal bonds. The dotted lines represent the sites in $A$, which need not be connected.

Fig. 4. The event of Lemma 3.5, that $E_2(0, u; A)$ occurs and $(u, v)$ is occupied and pivotal for the connection from 0 to $x$. There is no restriction on intersections between $A$ and $C_{\{u,v\}}(x)$.
The proof of Theorem 3.3 is based on uniform upper and lower bounds on $\chi'(p)/\chi(p)^2$, much as in the proofs of Theorems 3.1 and 3.2. However for percolation the situation is more complex. For simplicity we consider only the nearest-neighbour model, although there is no difficulty in generalizing the argument. The proof makes use of two lemmas, whose proofs are deferred to the end of this section.

The first lemma will be stated in greater generality than what is needed here, for later use in deriving the lace expansion for percolation. For this greater generality, given sites $x$, $y$ and a set of sites $A$ we define the event $E_2(x, y; A)$ to be the event that (i) $x$ is connected to $y$ through $A$ and (ii) there is no pivotal bond for the connection from $x$ to $y$ whose first endpoint is connected to $x$ through $A$; see Figure 3. In particular, $E_2(x, y; A)$ includes the event that $x$ and $y$ are doubly-connected and connected through $A$. Observe that taking $A = \{y\}$, the event $E_2(x, y; \{y\})$ is simply the event that $x$ is connected to $y$; this special case serves the needs of this section. In addition, taking $A = \mathbb{Z}^d$, the event $E_2(x, y; \mathbb{Z}^d)$ is precisely the event that $y \in D_c(x)$.

Lemma 3.5.$^2$ Given a nonempty set of sites $A$ and a site $u$, let $E_2 = E_2(0, u; A)$. Let $p < p_c$. Then
\[ \langle I[E_2]\, I[(u, v) \text{ is occupied and pivotal for the connection from } 0 \text{ to } x]\rangle_p = p\,\langle I[E_2]\, \tau_p^{C_{\{u,v\}}(0)}(v, x)\rangle_p. \tag{3.22} \]
2 This lemma corresponds to Lemma 2.1 of Hara and Slade (1990a) and corrects an error in that lemma: the class of events in the statement of Lemma 2.1 was too large. However the conclusion of the lemma was correct for the class of events to which it was applied.
In particular, for $E_2 = E_2(0, u; \{u\})$ it follows from the lemma that
\[ \langle I[E_1(0, (u, v), x)]\rangle_p = \langle I[u \in C(0)]\, \tau_p^{C_{\{u,v\}}(0)}(v, x)\rangle_p. \tag{3.23} \]
The second lemma will enable us to strengthen a preliminary attempt to prove Theorem 3.3.

Lemma 3.6. For $|u| = 1$ and $A \equiv \{x \in \mathbb{Z}^d : \|x\|_\infty \le R\} \supset \{0, u\}$ ($R \ge 1$), there exists $\epsilon_A > 0$ such that
\[ \langle I[0 \in C(x)]\, \tau_p^{C_{\{0,u\}}(x)}(u, y)\rangle_p \ge \epsilon_A\, \langle I[0 \in C(x)]\, \tau_p^{\hat{C}^A(x)}(u, y)\rangle_p. \tag{3.24} \]
Proof of Theorem 3.3.$^3$ By Russo’s formula (a finite volume argument is required here),
\[ \frac{d\chi}{dp} = \sum_x \frac{d}{dp}\tau_p(0, x) = \sum_x \sum_{(u,v)} \langle I[E_1(0, (u, v), x)]\rangle_p = \sum_{x,y} \sum_{|u|=1} \langle I[E_1(x, (0, u), y)]\rangle_p, \tag{3.25} \]
where in the first line the sum over $(u, v)$ is the sum over directed nearest-neighbour bonds, and in the second line translation invariance was used to shift 0 to the pivotal bond. By (3.25) and (3.23) we have
\[ \begin{aligned} \frac{d\chi}{dp} &= \sum_{x,y} \sum_{|u|=1} \langle I[0 \in C(x)]\, \tau_p^{C_{\{0,u\}}(x)}(u, y)\rangle_p \\ &= \sum_{x,y} \sum_{|u|=1} \Bigl( \tau_p(x, 0)\tau_p(u, y) - \bigl\langle I[0 \in C(x)]\, [\tau_p(u, y) - \tau_p^{C_{\{0,u\}}(x)}(u, y)] \bigr\rangle_p \Bigr) \\ &= 2d\,\chi(p)^2 - \sum_{x,y} \sum_{|u|=1} \bigl\langle I[0 \in C(x)]\, [\tau_p(u, y) - \tau_p^{C_{\{0,u\}}(x)}(u, y)] \bigr\rangle_p. \end{aligned} \tag{3.26} \]
We seek bounds on the summation on the right side. For this we first note that the difference of two-point functions is exactly the probability that $u$ is connected to $y$ through $C_{\{0,u\}}(x)$, and hence in particular is non-negative. For an upper bound, we note that when $u$ is connected to $y$ through $C_{\{0,u\}}(x)$ there must be a $v \in C_{\{0,u\}}(x)$ which is connected to $u$ and $y$ by disjoint paths. By the BK inequality, the probability of such a configuration is bounded above by $\tau_p(u, v)\tau_p(v, y)$. (In deriving the lace expansion we will require an identity rather than a bound for this probability; see Lemma 4.1 below.) Summing over all possible $v \in C_{\{0,u\}}(x)$ and overcounting gives the bound
\[ \bigl\langle I[0 \in C(x)]\, [\tau_p(u, y) - \tau_p^{C_{\{0,u\}}(x)}(u, y)] \bigr\rangle_p \le \Bigl\langle I[0 \in C(x)] \sum_{v \in C(x)} \tau_p(u, v)\tau_p(v, y) \Bigr\rangle_p = \sum_v \langle I[0, v \in C(x)]\rangle_p\, \tau_p(u, v)\tau_p(v, y). \tag{3.27} \]

3 A correct proof involves working first in finite volume and then taking a limit, but we shall sketch only the main ideas and overlook this. Our discussion is deficient in this respect; see Aizenman and Newman (1984) for a more careful treatment.
Fig. 5. A schematic representation of the upper bounds (3.27)–(3.28). The dotted lines denote sites in $C_{\{0,u\}}(x)$, while the solid line denotes a connection from $u$ to $y$ through $C_{\{0,u\}}(x)$.
For any configuration in which 0 and $v$ are connected to $x$, there must be a site $w$ such that there are disjoint connections between 0 and $w$, $v$ and $w$, and $x$ and $w$. By the BK inequality, this implies that the right side of (3.27) is bounded above by
\[ \sum_{v,w} \tau_p(0, w)\tau_p(v, w)\tau_p(x, w)\tau_p(u, v)\tau_p(v, y). \tag{3.28} \]
The geometry of the above inequality is depicted in Figure 5. By symmetry, this gives
\[ 2d\,\chi(p)^2 \Bigl[ 1 - \sum_{w,v} \tau_p(0, w)\tau_p(w, v)\tau_p(v, e_1) \Bigr] \le \frac{d\chi}{dp} \le 2d\,\chi(p)^2, \tag{3.29} \]
where $e_1$ denotes the unit vector along the first coordinate direction. With some care integration of the above bound would yield (3.21) if $T(p_c)$ were less than 1 (and the lower bound $\chi(p) \ge \text{const.}\,(p_c - p)^{-1}$ in any case), since
\[ \sum_{w,v} \tau_p(0, w)\tau_p(w, v)\tau_p(v, e_1) = \int_{[-\pi,\pi]^d} \frac{d^dk}{(2\pi)^d}\, e^{ik_1}\, \hat{\tau}_p(k)^3 \le T(p) \tag{3.30} \]
(it is known that $\hat{\tau}_p(k) \ge 0$). But in fact $T(p)$ is always greater than 1, due to the presence of $\tau_p(0, 0)^3 = 1$ in the sum. This difficulty is resolved using Lemma 3.6, which instead of (3.26) gives
\[ \frac{d\chi}{dp} \ge \epsilon_A \sum_{x,y} \sum_{|u|=1} \langle I[0 \in C(x)]\, \tau_p^{\hat{C}^A(x)}(u, y)\rangle_p, \tag{3.31} \]
with $A$ any finite set of sites containing 0 and $u$. Now the expectation on the right is dealt with much as before, but with $\hat{C}^A(x)$ playing the role of $C_{\{0,u\}}(x)$. This gives
\[ \frac{d\chi}{dp} \ge 2d\,\epsilon_A\, \chi(p)^2 \Bigl[ 1 - \frac{1}{2d} \sum_{|u|=1} \sum_{w,v \in \mathbb{Z}^d \setminus A} \tau_p(0, w)\tau_p(w, v)\tau_p(v, u) \Bigr]. \tag{3.32} \]
When $T(p_c) < \infty$, the sum on the right can be made arbitrarily small, so in particular less than one, by taking the radius $R$ of $A$ sufficiently large.
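The first equality in (3.25) rests on Russo’s formula, which for an increasing event on finitely many bonds says that the derivative of its probability in $p$ equals the expected number of pivotal bonds (each undirected pivotal bond corresponds to exactly one directed bond satisfying $E_1$). The sketch below checks this by exhaustive enumeration on a small graph. The graph, the names `BONDS`, `SOURCE`, `TARGET` and all function names are illustrative assumptions, not objects from the text, and the finite graph sidesteps the finite-volume limit mentioned in the proof.

```python
from itertools import product

# Hypothetical toy graph (an assumption for illustration, not from the text):
# four sites, five bonds, with a diagonal "bridge" bond (1, 2).
BONDS = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
SOURCE, TARGET = 0, 3


def connected(occupied, a, b):
    """True if sites a and b are joined by a path of occupied bonds."""
    seen, stack = {a}, [a]
    while stack:
        site = stack.pop()
        for (u, v), occ in zip(BONDS, occupied):
            if not occ:
                continue
            for here, there in ((u, v), (v, u)):
                if here == site and there not in seen:
                    seen.add(there)
                    stack.append(there)
    return b in seen


def weight(occupied, p):
    """Bernoulli(p) product weight of one bond configuration."""
    w = 1.0
    for occ in occupied:
        w *= p if occ else 1.0 - p
    return w


def two_point(p):
    """Probability that SOURCE and TARGET are connected, by enumeration."""
    return sum(weight(c, p)
               for c in product((0, 1), repeat=len(BONDS))
               if connected(c, SOURCE, TARGET))


def expected_pivotal(p):
    """Expected number of bonds (occupied or not) pivotal for the connection."""
    total = 0.0
    for c in product((0, 1), repeat=len(BONDS)):
        n_pivotal = 0
        for i in range(len(BONDS)):
            with_bond = c[:i] + (1,) + c[i + 1:]
            without_bond = c[:i] + (0,) + c[i + 1:]
            if (connected(with_bond, SOURCE, TARGET)
                    and not connected(without_bond, SOURCE, TARGET)):
                n_pivotal += 1
        total += weight(c, p) * n_pivotal
    return total


def numerical_derivative(p, h=1e-6):
    """Central finite difference of the two-point function in p."""
    return (two_point(p + h) - two_point(p - h)) / (2.0 * h)
```

Comparing `numerical_derivative(p)` with `expected_pivotal(p)` at a few values of `p` should show agreement up to finite-difference error.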
Proof of Lemma 3.5. The event appearing in the left side of (3.22) is depicted in Figure 4. The proof is by conditioning on $C_{\{u,v\}}(0)$, which is the connected cluster of the origin which remains after declaring the bond $\{u, v\}$ to be vacant. This cluster is finite with probability one, since $p < p_c$. We first observe that the event that $E_2$ occurs and $(u, v)$ is pivotal (for the connection from 0 to $x$) is independent of the occupation status of the bond $(u, v)$. Therefore the left side of the identity in the statement of the lemma is equal to
\[ p\,\langle I[E_2]\, I[(u, v) \text{ is pivotal for the connection from } 0 \text{ to } x]\rangle_p. \tag{3.33} \]
By conditioning on $C_{\{u,v\}}(0)$, (3.33) is equal to
\[ p \sum_{S : S \ni 0} \langle I[E_2 \text{ occurs, } (u, v) \text{ is pivotal, } C_{\{u,v\}}(0) = S]\rangle_p, \tag{3.34} \]
where the sum is over all finite sets of sites $S$ containing 0. In (3.34), the statement that $(u, v)$ is pivotal can be replaced by the statement that $v$ is connected to $x$ in $\mathbb{Z}^d \setminus S$. This event depends only on the occupation status of the bonds which do not have an endpoint in $S$. On the other hand, the event $E_2$ is determined by the occupation status of bonds which have an endpoint in $C_{\{u,v\}}(0)$. Similarly, the event that $C_{\{u,v\}}(0) = S$ depends on the values of $n_b$ only for bonds $b$ which have one or both endpoints in $S$. Hence the event that both $E_2$ occurs and $C_{\{u,v\}}(0) = S$ is independent of the event that $v$ is connected to $x$ in $\mathbb{Z}^d \setminus S$, and therefore (3.34) is equal to
\[ p \sum_{S : S \ni 0} \langle I[E_2 \text{ occurs and } C_{\{u,v\}}(0) = S]\rangle_p\, \tau_p^S(v, x). \tag{3.35} \]
Bringing the restricted two-point function inside the expectation, replacing the superscript $S$ by $C_{\{u,v\}}(0)$, and performing the sum over $S$, (3.35) is equal to
\[ p\,\langle I[E_2]\, \tau_p^{C_{\{u,v\}}(0)}(v, x)\rangle_p. \tag{3.36} \]
This completes the proof.

Proof of Lemma 3.6. The natural inequality here is the reverse of (3.24), since given a bond configuration the fact that $\{0, u\} \subset A$ implies $C_{\{0,u\}}(x) \supset \hat{C}^A(x)$ and hence $\tau^{C_{\{0,u\}}(x)}(u, y) \le \tau^{\hat{C}^A(x)}(u, y)$. Following Lemma 6.3 of Aizenman and Newman (1984), we show that a reversed inequality (3.24) can be obtained at the cost of a small constant $\epsilon_A$. We define three events:
\[ \begin{aligned} E_1 &= E_1(x, (0, u), y) = \{0 \in C(x) \text{ and } u \text{ is connected to } y \text{ in } \mathbb{Z}^d \setminus C_{\{0,u\}}(x)\}, \\ F &= \{0 \in C(x) \text{ and } u \text{ is connected to } y \text{ in } \mathbb{Z}^d \setminus \hat{C}^A(x)\}, \\ G &= \{C(x) \cap A \ne \emptyset,\ C(y) \cap A \ne \emptyset, \text{ and } \hat{C}^A(x) \cap \hat{C}^A(y) = \emptyset\}. \end{aligned} \]
By definition $G \supset E_1, F$, so
\[ \mathrm{Prob}(E_1) = \mathrm{Prob}(G)\, \mathrm{Prob}(E_1 \mid G) \ge \mathrm{Prob}(F)\, \mathrm{Prob}(E_1 \mid G), \tag{3.37} \]
where $\mathrm{Prob}(\cdot) = \langle I[\cdot]\rangle_p$ and $\mathrm{Prob}(E_1 \mid G)$ is a conditional probability. The event $G$ depends only on bonds having at least one endpoint not in $A$. And given that $G$ occurs, there is always at least one configuration of bonds having both endpoints within $A$ such that $E_1$ occurs. Therefore
\[ \mathrm{Prob}(E_1 \mid G) \ge (\min\{p, 1 - p\})^{\#\{\text{bonds within } A\}} \equiv \epsilon_A, \tag{3.38} \]
and hence by (3.37)
\[ \mathrm{Prob}(E_1) \ge \epsilon_A\, \mathrm{Prob}(F). \tag{3.39} \]
By (3.23), $\mathrm{Prob}(E_1) = \langle I[0 \in C(x)]\, \tau_p^{C_{\{0,u\}}(x)}(u, y)\rangle_p$. Hence it suffices to show that$^4$
\[ \mathrm{Prob}(F) \ge \langle I[0 \in C(x)]\, \tau_p^{\hat{C}^A(x)}(u, y)\rangle_p. \tag{3.40} \]
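Conditioning on $\hat{C}^A(x)$, as in the proof of Lemma 3.5 we have
\[ \mathrm{Prob}(F) = \sum_{S : S \ni x} \mathrm{Prob}(\hat{C}^A(x) = S) \times \mathrm{Prob}(0 \in C(x),\ u \text{ is connected to } y \text{ in } \mathbb{Z}^d \setminus S \mid \hat{C}^A(x) = S). \tag{3.41} \]
The events $\{0 \in C(x)\}$ and $\{u \text{ is connected to } y \text{ in } \mathbb{Z}^d \setminus S\}$ are not independent, but they do depend only on those bonds which do not touch $S$ and on those which connect $S$ to $A$. Restricted to this set of bonds, the two events are increasing. Hence by the FKG inequality
\[ \begin{aligned} \mathrm{Prob}(F) &\ge \sum_{S : S \ni x} \mathrm{Prob}(\hat{C}^A(x) = S)\, \mathrm{Prob}(0 \in C(x) \mid \hat{C}^A(x) = S) \times \mathrm{Prob}(u \text{ is connected to } y \text{ in } \mathbb{Z}^d \setminus S \mid \hat{C}^A(x) = S) \\ &= \sum_{S : S \ni x} \mathrm{Prob}(\hat{C}^A(x) = S)\, \mathrm{Prob}(0 \in C(x) \mid \hat{C}^A(x) = S)\, \tau^S(u, y), \end{aligned} \tag{3.42} \]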
where the independence of the events $\{u \text{ is connected to } y \text{ in } \mathbb{Z}^d \setminus S\}$ and $\{\hat{C}^A(x) = S\}$ was used in the last step. This gives (3.40) as in the proof of Lemma 3.5.

4 This corrects a claim in Aizenman and Newman (1984) that (3.40) is an identity rather than an inequality.

4. The Lace Expansion

In this section the derivation of the lace expansion is described for the various models. Bounds on the expansion and the proof of convergence of the expansion are similar for each model, and these two issues are discussed only for the self-avoiding walk.

4.1. The Self-Avoiding Walk

The lace expansion shows that above four dimensions the self-avoiding walk is a small perturbation of simple random walk. This is exhibited at the level of the Fourier transform of the two-point function, by showing that $\hat{G}_{z_c}(k)$ has the same $k^{-2}$ behaviour near the origin as simple random walk. The lace expansion provides an explicit formula for $\hat{G}_z(k)$ which demonstrates this comparison. We first derive the expansion without concern for convergence issues, and then discuss bounds and convergence afterwards. For simplicity we restrict attention here to the nearest-neighbour model, although no real complications arise with more general steps.

4.1.1. Derivation of the Expansion

The lace expansion can be derived in two ways: via a kind of cluster expansion of a type that is well known in statistical mechanics and constructive quantum field theory, or via a repeated application of the inclusion-exclusion relation. These two approaches give the same result. Here we shall follow the inclusion-exclusion approach, which may appeal more to intuition but which has the drawback that it is less explicit in providing precise formulas. The lace expansion produces a linear convolution equation for $G_z(x, y)$, which can then be solved by taking the Fourier transform. This convolution equation is reminiscent of a multi-dimensional renewal equation.

Let $\Omega^{(0)}(x, y)$ denote the set of all simple random walks (with no self-avoidance constraint) of any length, which begin at $x$ and end at $y$. Then if we define
\[ C_z(x, y) = \sum_{\omega \in \Omega^{(0)}(x, y)} z^{|\omega|}, \tag{4.1} \]
for complex $z$ with $|z| \le (2d)^{-1}$, and set $\hat{D}(k) = d^{-1} \sum_{j=1}^{d} \cos k_j$, we have
\[ \hat{C}_z(k) = \frac{1}{1 - 2dz\hat{D}(k)}. \tag{4.2} \]
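Formula (4.2) can be checked numerically by counting walks. The following sketch, written for general $d$ and tested here only for small cases, counts $n$-step nearest-neighbour walks by dynamic programming, sums the truncated series $\sum_n z^n \sum_x (\text{number of } n\text{-step walks from } 0 \text{ to } x)\, e^{ik\cdot x}$, and compares it with the closed form. The function names and the truncation parameter `n_max` are assumptions made for illustration, as is the Fourier convention $\hat{f}(k) = \sum_x f(x) e^{ik\cdot x}$; the truncation is only accurate for $|z|$ strictly below $1/(2d)$.

```python
import cmath
import math


def walk_counts(d, n_max):
    """counts[n][x] = number of n-step nearest-neighbour walks from 0 to x in Z^d."""
    steps = []
    for j in range(d):
        for s in (1, -1):
            e = [0] * d
            e[j] = s
            steps.append(tuple(e))
    counts = [{(0,) * d: 1}]
    for _ in range(n_max):
        nxt = {}
        for x, c in counts[-1].items():
            for e in steps:
                y = tuple(a + b for a, b in zip(x, e))
                nxt[y] = nxt.get(y, 0) + c
        counts.append(nxt)
    return counts


def c_hat_truncated(z, k, n_max):
    """Truncated Fourier transform of C_z(0, x): sum over walks of length <= n_max."""
    d = len(k)
    total = 0j
    for n, layer in enumerate(walk_counts(d, n_max)):
        for x, c in layer.items():
            phase = cmath.exp(1j * sum(ki * xi for ki, xi in zip(k, x)))
            total += c * (z ** n) * phase
    return total


def c_hat_closed(z, k):
    """Right side of (4.2): 1 / (1 - 2 d z D_hat(k))."""
    d = len(k)
    d_hat = sum(math.cos(ki) for ki in k) / d
    return 1.0 / (1.0 - 2 * d * z * d_hat)
```

For example, with $d = 2$, `z = 0.1` and `n_max = 40` the neglected tail is of order $(0.4)^{41}$, so the two functions should agree to many digits.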
The corresponding formula for the self-avoiding walk will be
\[ \hat{G}_z(k) = \frac{1}{1 - 2dz\hat{D}(k) - \hat{\Pi}_z(k)}, \tag{4.3} \]
where $\hat{\Pi}_z(k)$ will be defined via the lace expansion and in high dimensions will be controlled for all $k$ and for $|z| \le z_c$. The first step in deriving the expansion is to extract the term in (2.4) corresponding to a walk which takes no steps:
\[ G_z(0, x) = \delta_{0,x} + \sum_{n=1}^{\infty} c_n(0, x) z^n. \tag{4.4} \]
Next, we argue that for $n \ge 1$,
\[ c_n(0, x) = \sum_{y : |y| = 1} \Bigl[ c_1(0, y)\, c_{n-1}(y, x) - \sum_{\omega \in \Omega_{n-1}(y, x)} I[0 \in \omega] \Bigr]. \tag{4.5} \]
In fact this just follows by inclusion-exclusion: the first term on the right side counts all walks from 0 to $x$ which are self-avoiding after the first step, and the second subtracts the contribution overcounted in the first term, due to walks which are self-avoiding apart from a single return to the origin. Since $c_1(0, y) = 1$ for $|y| = 1$, substitution of (4.5) into (4.4) gives
\[ G_z(0, x) = \delta_{0,x} + z \sum_{y : |y| = 1} G_z(y, x) - \sum_{y : |y| = 1} \sum_{n=0}^{\infty} z^{n+1} \sum_{\omega \in \Omega_n(y, x)} I[0 \in \omega]. \tag{4.6} \]
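The inclusion-exclusion identity (4.5) can be verified by brute force for small $n$. The sketch below enumerates self-avoiding walks on $\mathbb{Z}^2$ (the choice $d = 2$, the helper names and the range of $n$ are assumptions made only to keep the enumeration small) and compares both sides of (4.5) for every endpoint $x$.

```python
ORIGIN = (0, 0)


def neighbours(site):
    """The four nearest neighbours of a site of Z^2."""
    x, y = site
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def saws(start, n):
    """Yield every n-step self-avoiding walk from `start` as a tuple of sites."""
    def extend(walk):
        if len(walk) == n + 1:
            yield tuple(walk)
            return
        for nxt in neighbours(walk[-1]):
            if nxt not in walk:
                walk.append(nxt)
                yield from extend(walk)
                walk.pop()
    yield from extend([start])


def check_identity(n):
    """Compare both sides of (4.5) for every endpoint x; True if they agree."""
    lhs = {}
    for w in saws(ORIGIN, n):
        lhs[w[-1]] = lhs.get(w[-1], 0) + 1      # c_n(0, x)
    rhs = {}
    for y in neighbours(ORIGIN):                # c_1(0, y) = 1 for each neighbour
        for w in saws(y, n - 1):
            x = w[-1]
            rhs[x] = rhs.get(x, 0) + 1          # contributes to c_{n-1}(y, x)
            if ORIGIN in w:
                rhs[x] -= 1                     # subtract walks that visit 0
    rhs = {x: c for x, c in rhs.items() if c != 0}
    return lhs == rhs
```

Calling `check_identity(n)` for small `n` should return `True`; the enumeration cost grows exponentially in `n`, so this is only a sanity check, not a computational method.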
The second term on the right side is a convolution with $G_z$, and we wish to write the last term on the right side also as a convolution. For this purpose, we now apply the inclusion-exclusion relation to the last term on the right side of (4.6), as follows. Let $m$ be the first (and only) time that $\omega(m) = 0$. Then
\[ \begin{aligned} \sum_{\omega \in \Omega_n(y, x)} I[0 \in \omega] &= \sum_{m=1}^{n} \sum_{\substack{\omega^{(1)} \in \Omega_m(y, 0) \\ \omega^{(2)} \in \Omega_{n-m}(0, x)}} I[\omega^{(1)} \cap \omega^{(2)} = \{0\}] \\ &= \sum_{m=1}^{n} \Bigl[ c_m(y, 0)\, c_{n-m}(0, x) - \sum_{\substack{\omega^{(1)} \in \Omega_m(y, 0) \\ \omega^{(2)} \in \Omega_{n-m}(0, x)}} I[\omega^{(1)} \cap \omega^{(2)} \ne \{0\}] \Bigr]. \end{aligned} \]
The number $c_m(y, 0)$ can be thought of as the number of $(m + 1)$-step walks which step from the origin directly to $y$, then return to the origin in $m$ steps, and which have distinct vertices apart from the fact that they return to their starting point. Let $U_m$ denote the set of all $m$-step self-avoiding loops at the origin ($m$-step walks which begin and end at the origin but which otherwise have distinct vertices), and let $u_m$ be the cardinality of $U_m$. Then in view of the above equation,
\[ \sum_{y : |y| = 1} \sum_{n=0}^{\infty} z^{n+1} \sum_{\omega \in \Omega_n(y, x)} I[0 \in \omega] = \Bigl( \sum_{m=2}^{\infty} u_m z^m \Bigr) G_z(0, x) - \sum_{\substack{m \ge 2 \\ n \ge 0}} \sum_{\substack{\omega^{(1)} \in U_m \\ \omega^{(2)} \in \Omega_n(0, x)}} z^{m+n}\, I[\omega^{(1)} \cap \omega^{(2)} \ne \{0\}]. \]
Thus we have partially achieved our goal of writing the last term on the right side of (4.6) as a convolution equation: the first term on the right side above is a particularly simple convolution of the two-point function with the constant (as a function of $x$) $\sum_{m=2}^{\infty} u_m z^m$. Continuing in this fashion, in the last term on the right side of the above equation let $m_1 \ge 1$ be the first time along $\omega^{(2)}$ that $\omega^{(2)}(m_1) \in \omega^{(1)}$, and let $v = \omega^{(2)}(m_1)$. Then the inclusion-exclusion relation can be applied again to remove the avoidance between the portions of $\omega^{(2)}$ before and after $m_1$, and correct for this removal by the subtraction of a term involving a further intersection. Repetition of this procedure leads to the convolution equation
\[ G_z(0, x) = \delta_{0,x} + z \sum_{y \in \Omega} G_z(y, x) + \sum_{v} \Pi_z(0, v)\, G_z(v, x), \tag{4.7} \]
where the ‘irreducible’ two-point function $\Pi_z(0, x)$ is given by
\[ \Pi_z(0, v) = \sum_{N=1}^{\infty} (-1)^N\, \Pi_z^{(N)}(0, v), \tag{4.8} \]
with $\Pi_z^{(N)}(0, v)$ non-negative and defined as follows. The $N = 1$ term is given by
\[ \Pi_z^{(1)}(0, v) = \delta_{0,v} \sum_{m=2}^{\infty} u_m z^m \equiv \delta_{0,v} \times [\text{one-loop diagram at } 0]. \]
The $N = 2$ term is
\[ \Pi_z^{(2)}(0, v) = \prod_{i=1}^{3} \Bigl[ \sum_{m_i=1}^{\infty} z^{m_i} \sum_{\omega^{(i)} \in \Omega_{m_i}(0, v)} \Bigr] I(\omega^{(1)}, \omega^{(2)}, \omega^{(3)}), \]
where $I(\omega^{(1)}, \omega^{(2)}, \omega^{(3)})$ is equal to 1 if the $\omega^{(i)}$ are pairwise mutually avoiding apart from their common endpoints, and otherwise is equal to 0. Diagrammatically this can be represented by

$\Pi_z^{(2)}(0, v)$ = [diagram: three lines joining 0 and $v$]

where each line represents a sum over self-avoiding walks between the endpoints of the line, weighted by $z^m$, with mutual avoidance between the three pairs of lines in the diagram. Similarly

$\Pi_z^{(3)}(0, v)$ = [diagram with endpoints 0 and $v$ and one unlabeled vertex]
where now there is mutual avoidance between some but not all pairs of lines in the diagram. The unlabeled vertex is summed over $\mathbb{Z}^d$. A slashed propagator is used to indicate a walk which may have zero length, i.e., be a single site, whereas propagators without a slash correspond to walks of at least one step. All the higher order terms can be expressed as diagrams in this way. With some care it is possible to discern the pattern of which pairs of lines are mutually avoiding in these diagrammatic expressions, but this is perhaps better understood in the cluster expansion approach, where analytic expressions are used to define the diagrams. For our present purposes it is not important which lines avoid each other, because all such mutual avoidance among lines will be neglected in the upper bounds we will use to estimate the diagrams. Note that neglecting mutual avoidance allows additional configurations, and hence gives an upper bound in $x$-space for non-negative $z$.

Using translation invariance, and the fact that the Fourier transform of a convolution is the product of Fourier transforms, taking the Fourier transform of (4.7) and solving for $\hat{G}_z(k)$ gives
\[ \hat{G}_z(k) = \frac{1}{1 - 2dz\hat{D}(k) - \hat{\Pi}_z(k)} \tag{4.9} \]
where
\[ \hat{\Pi}_z(k) = \sum_{N=1}^{\infty} (-1)^N\, \hat{\Pi}_z^{(N)}(k). \tag{4.10} \]
Of course, up to now there is no guarantee that this expansion is convergent. Convergence will be discussed in Section 4.1.3, but first we shall indicate how it is possible to estimate the diagrams involved in $\Pi_z$ in terms of the two-point function itself. Because the goal is to show that (4.9) behaves like $k^{-2}$ for $z = z_c$, we will work in terms of the Fourier transform.

4.1.2. Bounds on $\hat{\Pi}_z(k)$

In this section we show how $\hat{\Pi}_z(k)$ can be bounded in terms of norms of the two-point function, such as the bubble diagram. Let $z \ge 0$. The easiest contribution to estimate is the one-loop diagram $\hat{\Pi}_z^{(1)}(k)$, which is given by
\[ \hat{\Pi}_z^{(1)}(k) = \sum_{m=2}^{\infty} \sum_{\omega \in U_m} z^{|\omega|} = z \sum_{y : |y| = 1} G_z(y, 0) \tag{4.11} \]
and hence
\[ |\hat{\Pi}_z^{(1)}(k)| \le 2dz \sup_{x \ne 0} G_z(0, x). \tag{4.12} \]
Writing $\|\cdot\|_\infty$ for the $x$-space supremum norm
\[ \|f\|_\infty = \sup_{x \in \mathbb{Z}^d} |f(x)|, \tag{4.13} \]
and introducing
\[ H_z(x, y) = G_z(x, y) - \delta_{x,y} = \begin{cases} G_z(x, y) & x \ne y \\ 0 & x = y, \end{cases} \tag{4.14} \]
(4.12) can be rewritten as
\[ |\hat{\Pi}_z^{(1)}(k)| \le 2dz\, \|H_z\|_\infty. \tag{4.15} \]
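The second equality in (4.11) says, coefficient by coefficient in $z$, that $u_m = \sum_{y:|y|=1} c_{m-1}(y, 0)$. The sketch below checks this on $\mathbb{Z}^2$ for small $m$ by counting the two sides independently; the lattice dimension, helper names and range of $m$ are assumptions chosen only for illustration.

```python
ORIGIN = (0, 0)


def neighbours(site):
    """The four nearest neighbours of a site of Z^2."""
    x, y = site
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def saws(start, n):
    """Yield every n-step self-avoiding walk from `start` as a tuple of sites."""
    def extend(walk):
        if len(walk) == n + 1:
            yield tuple(walk)
            return
        for nxt in neighbours(walk[-1]):
            if nxt not in walk:
                walk.append(nxt)
                yield from extend(walk)
                walk.pop()
    yield from extend([start])


def loops_at_origin(m):
    """u_m for m >= 2: (m-1)-step self-avoiding walks from the origin that end
    next to the origin, each closed up by one final step back to the origin."""
    return sum(1 for w in saws(ORIGIN, m - 1) if w[-1] in neighbours(ORIGIN))


def neighbour_sum(m):
    """Coefficient of z^m in z * sum_{|y|=1} G_z(y, 0): the number of
    (m-1)-step self-avoiding walks from a neighbour y of the origin to the origin."""
    return sum(1 for y in neighbours(ORIGIN)
               for w in saws(y, m - 1) if w[-1] == ORIGIN)
```

On $\mathbb{Z}^2$ the first values are $u_2 = 4$ (out and back along a bond) and $u_4 = 8$ (the four unit squares at the origin, each traversed in two directions), while odd $m$ gives zero by parity.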
(In view of the translation invariance of $H_z$, in writing norms of $H_z$ we mean norms of the function $H_z(0, \cdot)$ of a single variable.) It is likely that little has been lost in this estimate, as the supremum is probably attained at a neighbour of the origin.

For $\hat{\Pi}_z^{(2)}(k)$, we first use the fact that all lines in the diagram representing $\Pi_z^{(2)}(0, x)$ must take at least one step, and hence there is no contribution from $x = 0$, to obtain
\[ |\hat{\Pi}_z^{(2)}(k)| \le \sum_{x \ne 0} \Pi_z^{(2)}(0, x). \tag{4.16} \]
Let $\|\cdot\|_p$ denote the $x$-space $L^p$-norm
\[ \|f\|_p = \Bigl[ \sum_{x \in \mathbb{Z}^d} |f(x)|^p \Bigr]^{1/p}. \tag{4.17} \]
Neglecting the fact that the three lines in the diagram representing $\Pi_z^{(2)}(0, x)$ mutually avoid, but not that each line is itself self-avoiding, gives the bound
\[ |\Pi_z^{(2)}(0, x)| \le H_z(0, x)^3. \tag{4.18} \]
Therefore by definition of the bubble diagram in (3.4),
\[ |\hat{\Pi}_z^{(2)}(k)| \le \|H_z\|_3^3 \le \|H_z\|_\infty \|H_z\|_2^2 = \|H_z\|_\infty [B(z) - 1]. \tag{4.19} \]
The role of the bubble diagram in upper bounds now becomes apparent. Higher order terms can be bounded similarly, using a bit more care than for the $N = 2$ case, with the result that
\[ |\hat{\Pi}_z^{(N)}(k)| \le \|H_z\|_\infty [B(z) - 1]^{N/2} [B(z)]^{N/2 - 1}. \tag{4.20} \]
It will also be necessary to obtain bounds on
\[ \hat{\Pi}_z(0) - \hat{\Pi}_z(k) = \sum_{N=2}^{\infty} (-1)^N [\hat{\Pi}_z^{(N)}(0) - \hat{\Pi}_z^{(N)}(k)] \tag{4.21} \]
(there is no $k$ dependence for $N = 1$). This can be bounded in terms of the quantity $\sum_x x_\mu^2\, \Pi_z^{(N)}(0, x)$, which is closely related to the second derivative of $\hat{\Pi}_z^{(N)}$ with respect to $k_\mu$. For the two-loop diagram, we have
\[ \sum_x x_\mu^2\, \Pi_z^{(2)}(0, x) \le \sum_x x_\mu^2\, H_z(0, x)^3 \le \|x_\mu^2 H_z(0, x)\|_\infty \|H_z\|_2^2 = \|x_\mu^2 H_z(0, x)\|_\infty [B(z) - 1]. \tag{4.22} \]
This bound provides an indication of the critical nature of $d = 4$, in the following way. Assuming that the infra-red bound $\eta \ge 0$ is indeed valid, then $B(z_c)$ is finite for $d > 4$. The infra-red bound is morally the statement that the critical two-point function decays at least as fast as $|x|^{2-d}$, so that for $d > 4$ (4.22) will be finite at the critical point, and hence so will $\sum_x x_\mu^2\, \Pi_{z_c}^{(2)}(0, x)$. For models with a suitable weak interaction, such as the nearest-neighbour model in sufficiently high dimensions, or a sufficiently spread-out model above four dimensions, the quantity $B(z_c) - 1$ will be not only finite but small, and will be the small parameter responsible for convergence of the expansion. Thus the distinction between $H_z$ and $G_z$ is crucial, as $\|G_{z_c}\|_2^2 = B(z_c)$ is not a small parameter.

Higher order terms can be handled in a similar fashion, and with a careful use of symmetry we obtain the bound
\[ \begin{aligned} \hat{\Pi}_z(0) - \hat{\Pi}_z(k) &= \sum_{N=2}^{\infty} (-1)^N \sum_x \Pi_z^{(N)}(0, x)[1 - \cos k \cdot x] \\ &\ge -\sum_{j=1}^{\infty} [\hat{\Pi}_z^{(2j+1)}(0) - \hat{\Pi}_z^{(2j+1)}(k)] \\ &\ge -d[1 - \hat{D}(k)]\, \|x_1^2 H_z\|_\infty \sum_{j=1}^{\infty} (j + 1)^2\, \|H_z\|_2^{2j+1}\, \|G_z\|_2^{2j-1}. \end{aligned} \tag{4.23} \]
4.1.3. Convergence of the Expansion

In this section we sketch a proof, based on the method of Slade (1987), that there is a constant $K$ such that for the nearest-neighbour model in sufficiently high dimensions,
\[ \|H_{z_c}\|_2^2 \le 2Kd^{-1} \quad \text{and} \quad \|x_\mu^2 H_{z_c}\|_\infty \le 2Kd^{-1}. \tag{4.24} \]
In particular, $B(z_c) = 1 + \|H_{z_c}\|_2^2 < \infty$ and the bubble condition is satisfied. The infra-red bound will be obtained in the course of the proof. For simple random walk the analogous behaviour can be proved without difficulty, and in fact the constant $K$ above can be taken to be the smallest constant $K$ such that the bounds
\[ \|C_{1/(2d)}\| \le Kd^{-1}, \tag{4.25} \]
\[ \|x_\mu^2 C_{1/(2d)}\|_\infty \le \left\| \frac{\partial_\mu^2 \hat{D}}{[1 - \hat{D}]^2} \right\|_1 + \left\| \frac{(\partial_\mu \hat{D})^2}{[1 - \hat{D}]^3} \right\|_1 \le Kd^{-1} \tag{4.26} \]
(here $\partial_\mu \equiv \partial/\partial k_\mu$) hold for all $d \ge 5$. Since $H_z(0, x) \le C_{1/(2d)}(0, x)$ for all $z \in [0, 1/(2d)]$, the bounds (4.24) hold when $z_c$ is replaced by $z \in [0, 1/(2d)]$. It is near the critical point that some work is required.

To prove (4.24) it is sufficient to obtain uniform bounds on the norms on the left sides, for all $z < z_c$. So let us fix an activity $z \in [1/(2d), z_c)$. Suppose for the moment that we had bounds on $\|x_\mu^2 H_z\|_\infty$ and $\|H_z\|_2$ that were a bit worse than (4.24), say with the constant factor 2 weakened to a factor 3. Then it would be possible to improve these weak bounds to the stronger bounds with factor 2, by arguing as follows. Defining $\hat{F}_z(k) \equiv \hat{G}_z(k)^{-1}$, the first step is to use (4.9) to write
\[ \hat{G}_z(k) = \frac{1}{[\hat{F}_z(k) - \hat{F}_z(0)] + \hat{F}_z(0)} = \frac{1}{\hat{F}_z(0) + 2dz[1 - \hat{D}(k)] + \hat{\Pi}_z(0) - \hat{\Pi}_z(k)}. \tag{4.27} \]
The term $\hat{F}_z(0)$ in the denominator is handled by noting that $\hat{F}_z(0) = \chi(z)^{-1} > 0$. For $\hat{\Pi}_z(0) - \hat{\Pi}_z(k)$, if we had the weak bounds, then for large $d$ we could argue that the right side of (4.23) is dominated by the first term, which would be $O(d^{-3/2})[1 - \hat{D}(k)]$ and hence a small correction to the simple random walk term $1 - \hat{D}(k)$. The factor $2dz$ in (4.27) is bounded below by 1 for $z \in [1/(2d), z_c)$. This means that given the weak bounds, we have the infra-red bound
\[ \hat{G}_z(k) \le [1 + O(d^{-3/2})][1 - \hat{D}(k)]^{-1} = [1 + O(d^{-3/2})]\, \hat{C}_{1/(2d)}(k), \tag{4.28} \]
and this can then be used to obtain bounds on the norms appearing in (4.24) which are just slightly worse than the corresponding critical simple random walk bounds (hence the factor 2).

In the above we have assumed weak bounds to obtain stronger bounds. This shows that there is a forbidden region in the graphs of $\|H_z\|_2$ and $\|x_\mu^2 H_z\|_\infty$ versus $z$, namely the region where the weak bounds hold but the strong ones fail. Since these norms are continuous functions of $z < z_c$, and since the weak bounds hold for $z \le 1/(2d)$, they therefore must also hold for all $z < z_c$, and we are done. In addition, the weak bounds imply the strong bounds, and hence imply the infra-red bound as above.
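The comparison function in (4.28), $\hat{C}_{1/(2d)}(k) = [1 - \hat{D}(k)]^{-1}$, carries the $k^{-2}$ singularity of simple random walk: since $\hat{D}(k) = 1 - |k|^2/(2d) + O(|k|^4)$, one expects $\hat{C}_{1/(2d)}(k) \approx 2d/|k|^2$ for small $k$. The short sketch below checks this numerically; the function names are illustrative assumptions and the check concerns only simple random walk, not the self-avoiding walk itself.

```python
import math


def d_hat(k):
    """D_hat(k) = (1/d) * sum_j cos(k_j) for a wave vector k in [-pi, pi]^d."""
    return sum(math.cos(kj) for kj in k) / len(k)


def c_hat_critical(k):
    """Critical simple random walk two-point function, 1 / (1 - D_hat(k))."""
    return 1.0 / (1.0 - d_hat(k))


def ratio_to_inverse_square(k):
    """C_hat(k) divided by its expected small-k form 2d / |k|^2."""
    k_squared = sum(kj * kj for kj in k)
    return c_hat_critical(k) * k_squared / (2 * len(k))
```

For a small wave vector such as `(0.01,) * 5` the ratio should be close to 1, with a relative correction of order $|k|^2$.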
Fig. 6. Decomposition of a tree $T$ containing sites $x$ and $y$ into its backbone and ribs $R_0, \ldots, R_8$. The vertices of the backbone are indicated by heavy dots.
The above argument can be carried out, with considerable elaboration, for $d \ge 5$. The infra-red bound implies the bubble condition, which gives the conclusion of Theorem 3.1. Sufficient control of $\hat{\Pi}_z(k)$ can be obtained to prove the stronger results of Section 2.1. For lattice trees and animals, and for percolation, the proof of convergence of the expansion has the same structure as that outlined in this section. Finally we mention that (2.12) can be obtained by using $\chi(z_c)^{-1} = 1 - 2dz_c - \hat{\Pi}_{z_c}(0) = 0$.

4.2. Lattice Trees and Animals

For lattice trees and animals it is not immediately obvious how to adapt the lace expansion, since trees and animals are not one-dimensional structures. There is however a sense in which they are one-dimensional, in high dimensions. Consider the two-point function for trees:
\[ G_z(x, y) = \sum_{T : T \ni x, y} z^{|T|}. \tag{4.29} \]
Given two distinct sites x, y and a tree T 3 x, y, the backbone of T is defined to be the unique path, consisting of bonds of T , which joins x to y. Sites in the backbone are labeled consecutively from x to y. For a tree with an n-bond backbone, removal of the bonds in the backbone disconnects the tree into n + 1 mutually non-intersecting trees R0 , ..., Rn , which we refer to as ribs. This decomposition is Pshown in Figure 6. ThePrib generating function, or one-point function Gz (0, 0) = T :T 30 z |T | is equal to n (n + 1)tn z n . For θ = 52 this will be finite at the critical point, and in this sense trees can be considered to be one-dimensional structures in high dimensions. The lace expansion allows them to be treated as a perturbation of simple random walk. The expansion can be performed by repeated use of inclusion-exclusion, as was done for self-avoiding walks. This begins by turning off the interaction between the rib at x and all subsequent ribs, to obtain a simple convolution term. Then there is a correction term in which at least one intersection is required between the rib at x and some subsequent rib. In this term there must be a first rib which intersects the rib at x, and to obtain a convolution the interaction is turned off between all
116
TAKASHI HARA AND GORDON SLADE
# A # A ## AA x 0# @@## #@ @ AA @@ A x 0
x 0
0
x
Fig. 7. The two generic types of rib intersection occurring at the two-loop level, and the Feynman (2) diagrams bounding the corresponding contributions to Πz (0, x).
D0
q
x
D1
D2
qy
Fig. 8. Decomposition of a lattice animal A containing x and y into backbone and ribs D0 , D1 , D2 . The backbone, consisting of two bonds, is drawn in bold lines.
ribs following and preceding this first rib. This involves a further correction term with further intersections, and so on. A quantity like Πz (0, x) for self-avoiding walks arises, but for trees this is estimated in terms of the square diagram rather than the bubble diagram. It is through controlling this analogue of Πz (0, x) that the critical behaviour is accessed. The generic intersections and resulting Feynman diagrams are illustrated in Figure 7 for the second order contribution to Πz (0, x). Avoidance constraints between distinct diagram lines can be neglected in upper bounds. For lattice animals there is not a unique backbone as there is for trees. To modify the notion of backbone to suit lattice animals, we first introduce some definitions. A lattice animal A containing x and y is said to have a double connection from x to y if there are two disjoint (i.e., sharing no common bond) self-avoiding walks in A between x and y or if x = y. A bond {u, v} in A is called pivotal for the connection from x to y if its removal would disconnect the animal into two connected components with x in one connected component and y in the other. Given two sites x, y and an animal A containing x and y, the backbone of A is now defined to be the set of pivotal bonds for the connection from x to y. In
general this backbone is not connected. The ribs of A are the connected components which remain after the removal of the backbone from A. An example is depicted in Figure 8. One can then produce a lace expansion based on the inclusion-exclusion relation, as was done for trees. Again the square diagram plays a basic role in the estimates. Further details can be found in Hara and Slade (1990b, 1992c).

4.3. Percolation

For percolation the basic idea behind the expansion is similar to that underlying the expansion for lattice animals. Suppose that $p < p_c$, so that the connected cluster of the origin is finite with probability one. Given a configuration in which 0 and x are connected, the connected bond cluster of the origin is a lattice animal containing the sites 0 and x. The occupied pivotal bonds divide the cluster into doubly-connected ribs, as in Figure 8. No two of these pieces can share a common site, so there is a kind of 'repulsive interaction' between these pieces. However, the situation is not as simple as it was for lattice animals, because the pieces interact also when they share a common boundary bond, due to the factors of 1 − p associated with the unoccupied boundary bonds. We shall consider the percolation cluster to be like a self-avoiding walk, whose steps correspond to the pivotal bonds and whose sites are the intervening doubly-connected clusters. The first task is to extract the contribution due to the zero-step walk, which in the percolation context corresponds to the event that 0 and x are doubly-connected. Thus we have

$$\tau_p(0,x) = \langle I[x \in D_c(0)] \rangle_p + \langle I[x \in C(0),\ x \notin D_c(0)] \rangle_p \tag{4.30}$$
(see Definition 3.4 for definitions used in this section; we treat the nearest-neighbour and spread-out models simultaneously). If 0 is connected to x, but not doubly, then there is a pivotal bond for the connection from 0 to x and hence a first pivotal bond, so that

$$\tau_p(0,x) = \langle I[x \in D_c(0)] \rangle_p + \sum_{(u,v)} \langle I[x \in C(0),\ (u,v) \text{ is the first pivotal bond}] \rangle_p . \tag{4.31}$$
To proceed further, we need a way of writing the last term on the right side as a convolution with $\tau_p$. This is achieved using Lemma 3.5. We apply Lemma 3.5 to the second term on the right side of (4.31), with $E_2 = E_2(0, u; \mathbb{Z}^d) = \{u \in D_c(0)\}$. The summand in this term is equal to the probability that 0 is doubly-connected to u and (u, v) is occupied and pivotal for the connection from 0 to x. Hence by the lemma it is equal to

$$p \, \langle I[u \in D_c(0)] \, \tau_p^{C_{\{u,v\}}(0)}(v,x) \rangle_p . \tag{4.32}$$
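The identity behind (4.32) can be checked by exhaustive enumeration on a very small graph. The sketch below is only an illustration under stated assumptions: Definition 3.4 and Lemma 3.5 are not reproduced in this section, so the code assumes that $C_{\{u,v\}}(0)$ is the cluster of 0 after the bond {u, v} is made vacant, that $\tau_p^{A}(v,x)$ is the probability that v is connected to x by occupied bonds avoiding all sites of A, and that "doubly-connected" means two bond-disjoint occupied paths (or equal endpoints). The 2 × 3 grid, the value of p, and all function names are hypothetical choices made for the example.

```python
from functools import lru_cache
from itertools import product

# Hypothetical test graph: a 2 x 3 piece of the square lattice (7 bonds).
SITES = [(i, j) for i in range(2) for j in range(3)]
BONDS = [(a, b) for a in SITES for b in SITES
         if a < b and abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1]


def cluster(start, open_bonds, removed=frozenset()):
    """Sites joined to `start` by open bonds that avoid the sites in `removed`."""
    if start in removed:
        return frozenset()
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for a, b in open_bonds:
            if s == a or s == b:
                t = b if s == a else a
                if t not in seen and t not in removed:
                    seen.add(t)
                    stack.append(t)
    return frozenset(seen)


def doubly_connected(a, b, open_bonds):
    """Two bond-disjoint open paths from a to b, or a == b.

    By Menger's theorem this holds exactly when a and b remain
    connected after deleting any single open bond.
    """
    if a == b:
        return True
    if b not in cluster(a, open_bonds):
        return False
    return all(b in cluster(a, open_bonds - {e}) for e in open_bonds)


def configurations(p):
    """All bond configurations together with their Bernoulli(p) weights."""
    for bits in product((0, 1), repeat=len(BONDS)):
        open_bonds = frozenset(b for b, on in zip(BONDS, bits) if on)
        k = len(open_bonds)
        yield open_bonds, p ** k * (1 - p) ** (len(BONDS) - k)


@lru_cache(maxsize=None)
def tau_restricted(v, x, removed, p):
    """Assumed tau_p^A(v, x): v connected to x without touching `removed`."""
    return sum(w for ob, w in configurations(p) if x in cluster(v, ob, removed))


def bond(u, v):
    """Undirected bond in the canonical order used in BONDS."""
    return (u, v) if u < v else (v, u)


def lhs(u, v, x, p, origin=(0, 0)):
    """P(0 doubly connected to u, (u, v) occupied and pivotal for 0 <-> x)."""
    e, total = bond(u, v), 0.0
    for ob, w in configurations(p):
        if e not in ob or x not in cluster(origin, ob):
            continue
        c0 = cluster(origin, ob - {e})  # cluster of 0 with {u, v} removed
        if u in c0 and x not in c0 and doubly_connected(origin, u, ob):
            total += w
    return total


def rhs(u, v, x, p, origin=(0, 0)):
    """p * < I[u in D_c(0)] * tau^{C_{u,v}(0)}(v, x) >, the form in (4.32)."""
    e, total = bond(u, v), 0.0
    for ob, w in configurations(p):
        if doubly_connected(origin, u, ob):
            c_tilde = cluster(origin, ob - {e})
            total += w * tau_restricted(v, x, c_tilde, p)
    return p * total


if __name__ == "__main__":
    p, x = 0.3, (1, 2)
    for a, b in BONDS:
        for u, v in ((a, b), (b, a)):
            print(u, v, round(lhs(u, v, x, p), 10), round(rhs(u, v, x, p), 10))
```

Running the script prints the two sides for every directed bond of the toy graph. Summing `lhs` over all directed bonds and adding the doubly-connected probability should also reproduce the two-point function, which is the content of (4.31).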
To extract a term involving a convolution with $\tau_p$ from this quantity, we write

$$\tau_p^{C_{\{u,v\}}(0)}(v,x) = \tau_p(v,x) - \bigl[\tau_p(v,x) - \tau_p^{C_{\{u,v\}}(0)}(v,x)\bigr]. \tag{4.33}$$
Using (4.32) and (4.33) in (4.31), we obtain

$$\tau_p(0,x) = \langle I[x \in D_c(0)] \rangle_p + p \sum_{(u,v)} \langle I[u \in D_c(0)] \rangle_p \, \tau_p(v,x) - p \sum_{(u,v)} \langle I[u \in D_c(0)] \{\tau_p(v,x) - \tau_p^{C_{\{u,v\}}(0)}(v,x)\} \rangle_p . \tag{4.34}$$
The above equation gives the lowest order expansion with remainder. We abbreviate the notation by writing

$$g_p(0,x) = \langle I[x \in D_c(0)] \rangle_p \tag{4.35}$$

and

$$R_p^{(0)}(0,x) = p \sum_{(u,v)} \langle I[u \in D_c(0)] \{\tau_p(v,x) - \tau_p^{C_{\{u,v\}}(0)}(v,x)\} \rangle_p . \tag{4.36}$$
We denote by D the function on $\mathbb{Z}^d$ which takes the value $|\Lambda|^{-1}$ at sites in Λ and otherwise is zero; here |Λ| denotes the cardinality of the set Λ. Then (4.34) can be rewritten as

$$\tau_p(0,x) = g_p(0,x) + g_p * p|\Lambda| D * \tau_p(x) - R_p^{(0)}(0,x), \tag{4.37}$$

where ∗ denotes convolution: $f * g(x) = \sum_y f(x-y) g(y)$. To proceed further, we will use the following lemma to expand the remainder term $R_p^{(0)}(0,x)$. For the statement of the lemma, we write $I_2(v,x;A)$ for the indicator function of the event $E_2(v,x;A)$.

Lemma 4.1. Given a set of sites A and two sites v and x,

$$\tau_p(v,x) - \tau_p^{A}(v,x) = \langle I_2(v,x;A) \rangle_p + p \sum_{(y,y')} \langle I_2(v,y;A) \, \tau_p^{C_{\{y,y'\}}(v)}(y',x) \rangle_p . \tag{4.38}$$
Proof. The left side is the probability of the event that v and x are connected but are not connected in $\mathbb{Z}^d \setminus A$. By definition, this is the probability that v is connected to x through A. If v is connected to x through A then either (i) there is no pivotal bond for the connection from v to x whose first endpoint is connected to v through A, or (ii) there is such a pivotal bond. Case (i) is exactly the event $E_2(v,x;A)$, and gives the first term on the right side of (4.38). In case (ii), let (y, y') denote the first pivotal bond for the connection from v to x such that y is connected to v through A. The contribution to the left side of (4.38) due to this case is

$$\sum_{(y,y')} \langle I[E_2(v,y;A)] \, I[(y,y') \text{ is occupied and pivotal for the connection from } v \text{ to } x] \rangle_p . \tag{4.39}$$

Then by Lemma 3.5, with (v, y) playing the role of (0, u), the contribution due to this case gives the second term on the right side of (4.38).
Using Lemma 4.1 and (4.36), and replacing the summation index (u, v) by $(y_1, y_1')$, we have

$$R_p^{(0)}(0,x) = p \sum_{(y_1,y_1')} \langle I[y_1 \in D_c(0)] \, \langle I_2(y_1', x; C^{0}_{\{y_1,y_1'\}}(0)) \rangle^{(1)} \rangle^{(0)} + p^2 \sum_{(y_1,y_1')} \sum_{(y_2,y_2')} \langle I[y_1 \in D_c(0)] \, \langle I_2(y_1', y_2; C^{0}_{\{y_1,y_1'\}}(0)) \times \tau_p^{C^{1}_{\{y_2,y_2'\}}(y_1')}(y_2', x) \rangle^{(1)} \rangle^{(0)} .$$
Here and in the following we simplify the notation by dropping the subscript p from the angular brackets denoting expectation. In addition we have introduced a superscript to coordinate random sets with the appropriate expectation, in nested expectations. Thus for example in the second term in the right side of the above equation, the set $C^{0}_{\{y_1,y_1'\}}(0)$ is random with respect to the outer expectation, but may be treated as deterministic in the evaluation of the inner expectation. Using the analogue of (4.33) to replace the restricted two-point function on the right side by an unrestricted two-point function plus a correction, and defining

$$\Pi_p^{(1)}(0,x) = p \sum_{(y_1,y_1')} \langle I[y_1 \in D_c(0)] \, \langle I_2(y_1', x; C^{0}_{\{y_1,y_1'\}}(0)) \rangle^{(1)} \rangle^{(0)}$$

and

$$R_p^{(1)}(0,x) = p^2 \sum_{(y_1,y_1')} \sum_{(y_2,y_2')} \langle I[y_1 \in D_c(0)] \, \langle I_2(y_1', y_2; C^{0}_{\{y_1,y_1'\}}(0)) \times \{\tau_p(y_2', x) - \tau_p^{C^{1}_{\{y_2,y_2'\}}(y_1')}(y_2', x)\} \rangle^{(1)} \rangle^{(0)} ,$$

we now have from (4.37) that

$$\tau_p(0,x) = g_p(0,x) - \Pi_p^{(1)}(0,x) + [g_p - \Pi_p^{(1)}] * p|\Lambda| D * \tau_p(x) + R_p^{(1)}(0,x). \tag{4.40}$$
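The bookkeeping that leads from (4.37) to (4.40) can be written out in two lines. This is only a sketch of the intermediate step; it assumes, as in the passage from (4.34) to (4.37), that a sum over directed bonds $(y_2, y_2')$ of a product $f(y_2)\,\tau_p(y_2', x)$ equals $f * p|\Lambda| D * \tau_p(x)$ once the factor p is included.

```latex
\begin{align*}
R_p^{(0)}(0,x)
  &= \Pi_p^{(1)}(0,x)
     + p \sum_{(y_2,y_2')} \Pi_p^{(1)}(0,y_2)\, \tau_p(y_2',x)
     - R_p^{(1)}(0,x) \\
  &= \Pi_p^{(1)}(0,x)
     + \Pi_p^{(1)} * p|\Lambda| D * \tau_p(x)
     - R_p^{(1)}(0,x).
\end{align*}
```

Substituting this expression for $R_p^{(0)}(0,x)$ into (4.37) and collecting the two convolution terms gives (4.40).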
The above procedure can be iterated as many times as desired. The result is the lace expansion for percolation, which is stated in the next theorem. For the statement of the theorem we write $y_0' = 0$ and introduce for n ≥ 1,

$$C^{n-1} = C^{n-1}_{\{y_n,y_n'\}}(y_{n-1}'), \qquad I^{n} = I_2(y_n', y_{n+1}; C^{n-1}),$$

$$\Pi_p^{(n)}(0,x) = p^n \sum_{(y_1,y_1')} \cdots \sum_{(y_n,y_n')} \langle I[y_1 \in D_c(0)] \langle I^{1} \langle I^{2} \langle I^{3} \cdots \langle I^{n-1} \times \langle I_2(y_n', x; C^{n-1}) \rangle^{(n)} \rangle^{(n-1)} \cdots \rangle^{(3)} \rangle^{(2)} \rangle^{(1)} \rangle^{(0)} ,$$

$$h_p^{(n)}(0,x) = g_p(0,x) + \sum_{j=1}^{n} (-1)^j \Pi_p^{(j)}(0,x)$$

and

$$R_p^{(n)}(0,x) = p^{n+1} \sum_{(y_1,y_1')} \cdots \sum_{(y_{n+1},y_{n+1}')} \langle I[y_1 \in D_c(0)] \langle I^{1} \langle I^{2} \cdots \langle I^{n} \times \{\tau_p(y_{n+1}', x) - \tau_p^{C^{n}}(y_{n+1}', x)\} \rangle^{(n)} \cdots \rangle^{(2)} \rangle^{(1)} \rangle^{(0)} .$$

Finally defining $h_p^{(0)}(0,x) = g_p(0,x)$, we have the following theorem.

Theorem 4.2. For $p < p_c$ and N ≥ 0,

$$\tau_p(0,x) = h_p^{(N)}(0,x) + h_p^{(N)} * p\Omega D * \tau_p(x) + (-1)^{N+1} R_p^{(N)}(0,x). \tag{4.41}$$
Now we take the Fourier transform as was done for self-avoiding walks. Bounds on $\hat\Pi_p^{(j)}(k)$ and $\hat R_p^{(N)}(k)$ involve bounding the nested expectations from the inside out, using the BK inequality. The triangle diagram is important in the bounds. Further details can be found in Hara and Slade (1990a).

4.4. Oriented Percolation

For oriented percolation the lace expansion is closer in some respects to the expansion for lattice animals than for unoriented percolation. The Markov property makes the analysis somewhat simpler. Details can be found in Nguyen and Yang (1992).

Acknowledgements

We are grateful to Michael Aizenman and David Brydges for many valuable discussions. These lectures were written while G.S. was visiting the Department of Mathematics of the University of Virginia, and he thanks the Department for its hospitality. The work of G.S. was supported by Natural Sciences and Engineering Research Council of Canada grant A9351.

References

Aizenman, M. (1982). Geometric analysis of $\varphi^4$ fields and Ising models, Parts I and II. Communications in Mathematical Physics 86, 1–48.
Aizenman, M. and Fernández, R. (1986). On the critical behaviour of the magnetization in high dimensional Ising models. Journal of Statistical Physics 44, 393–454.
Aizenman, M. and Newman, C. M. (1984). Tree graph inequalities and critical behavior in percolation models. Journal of Statistical Physics 36, 107–143.
Arnaudon, D., Iagolnitzer, D., and Magnen, J. (1991). Weakly self-avoiding polymers in four dimensions. Rigorous results. Physics Letters B 273, 268–272.
Barsky, D. J. and Aizenman, M. (1991). Percolation critical exponents under the triangle condition. Annals of Probability 19, 1520–1536.
Bovier, A., Felder, G., and Fröhlich, J. (1984). On the critical properties of the Edwards and the self-avoiding walk model of polymer chains. Nuclear Physics B 230, 119–147.
Bovier, A., Fröhlich, J., and Glaus, U. (1986). Branched polymers and dimensional reduction. In Critical Phenomena, Random Systems, Gauge Theories (K. Osterwalder and R. Stora, ed.), North-Holland, Amsterdam.
Brydges, D., Evans, S. N., and Imbrie, J. Z. (1992). Self-avoiding walk on a hierarchical lattice in four dimensions. Annals of Probability 20, 82–124.
Brydges, D. C. and Spencer, T. (1985). Self-avoiding walk in 5 or more dimensions. Communications in Mathematical Physics 97, 125–148.
Chayes, J. T. and Chayes, L. (1986). Percolation and random media. In Critical Phenomena, Random Systems, Gauge Theories (K. Osterwalder and R. Stora, ed.), North-Holland, Amsterdam.
Chayes, J. T. and Chayes, L. (1987). On the upper critical dimension of Bernoulli percolation. Communications in Mathematical Physics 113, 27–48.
Fernández, R., Fröhlich, J., and Sokal, A. D. (1992). Random Walks, Critical Phenomena, and Triviality in Quantum Field Theory. Springer, Berlin.
Fisher, M. E. (1969). Rigorous inequalities for critical-point exponents. The Physical Review 180, 594–600.
Fisher, M. E. and Gaunt, D. S. (1964). Ising model and self-avoiding walks on hypercubical lattices and "high-density" expansions. The Physical Review 133, A224–A239.
Fröhlich, J. (1982). On the triviality of $\varphi^4_d$ theories and the approach to the critical point in d ≥ 4 dimensions. Nuclear Physics B 200, 281–296.
Fröhlich, J., Simon, B., and Spencer, T. (1976). Infrared bounds, phase transitions, and continuous symmetry breaking. Communications in Mathematical Physics 50, 79–95.
Gaunt, D. S. and Ruskin, H. (1978). Bond percolation processes in d dimensions. Journal of Physics A: Mathematical and General 11, 1369–1380.
Grimmett, G. R. (1989). Percolation. Springer, Berlin.
Hammersley, J. M. and Morton, K. W. (1954). Poor man's Monte Carlo. Journal of the Royal Statistical Society B 16, 23–38.
Hara, T. and Slade, G. (1990a). Mean-field critical behaviour for percolation in high dimensions. Communications in Mathematical Physics 128, 333–391.
Hara, T. and Slade, G. (1990b). On the upper critical dimension of lattice trees and lattice animals. Journal of Statistical Physics 59, 1469–1510.
Hara, T. and Slade, G. (1992a). Self-avoiding walk in five or more dimensions. I. The critical behaviour. Communications in Mathematical Physics 147, 101–136.
Hara, T. and Slade, G. (1992b). The lace expansion for self-avoiding walk in five or more dimensions. Reviews in Mathematical Physics 4, 235–327.
Hara, T. and Slade, G. (1992c). The number and size of branched polymers in high dimensions. Journal of Statistical Physics 67, 1009–1038.
Hara, T. and Slade, G. (1993). The self-avoiding-walk and percolation critical points in high dimensions. In preparation.
Iagolnitzer, D. and Magnen, J. (1992). Polymers in a weak random potential in dimension four: rigorous renormalization group analysis. Preprint.
Kesten, H. (1982). Percolation Theory for Mathematicians. Birkhäuser, Boston.
Lawler, G. F. (1991). Intersections of Random Walks. Birkhäuser, Boston.
Madras, N. and Slade, G. (1993). The Self-Avoiding Walk. Birkhäuser, Boston.
Nemirovsky, A. M., Freed, K. F., Ishinabe, T., and Douglas, J. F. (1992). Marriage of exact enumeration and 1/d expansion methods: Lattice model of dilute polymers. Journal of Statistical Physics 67, 1083–1108.
Nguyen, B. G. (1987). Gap exponents for percolation processes with triangle condition. Journal of Statistical Physics 49, 235–243.
Nguyen, B. G. and Yang, W.-S. (1992). Triangle condition for oriented percolation in high dimensions. To appear in Annals of Probability.
Nguyen, B. G. and Yang, W.-S. (1993). Gaussian limit of the connectivity function for critical oriented percolation in high dimensions. Preprint.
Slade, G. (1987). The diffusion of self-avoiding random walk in high dimensions. Communications in Mathematical Physics 110, 661–683.
Sokal, A. D. (1979). A rigorous inequality for the specific heat of an Ising or $\varphi^4$ ferromagnet. Physics Letters A 71, 451–453.
Tasaki, H. (1986). Stochastic Geometric Methods in Statistical Physics and Field Theories. Ph.D. thesis, University of Tokyo.
Tasaki, H. (1987). Hyperscaling inequalities for percolation. Communications in Mathematical Physics 113, 49–65.
Tasaki, H. and Hara, T. (1987). Critical behaviour in a system of branched polymers. Progress in Theoretical Physics (Supplement) 92, 14–25.
LONG TIME TAILS IN PHYSICS AND MATHEMATICS
FRANK DEN HOLLANDER Mathematical Institute University of Utrecht P.O. Box 80.010, 3508TA Utrecht The Netherlands e-mail:
[email protected]
Abstract. In physics the name ‘long time tail’ is used to denote slow decay of an equilibrium autocorrelation function of a fluctuating local quantity (like density, momentum, energy) in a system of interacting particles. This paper discusses long time tails for the velocity autocorrelation function of a tagged particle. After an expository introduction we describe various theories from physics and mathematics explaining this phenomenon. Key words: interacting particle systems, tagged particle, velocity autocorrelation function, long time tail, random motion in random environment
1. Introduction

For the early 20th century physicist matter had three phases: gas, solid or liquid. Whereas the earliest successes of statistical physics were achieved in explaining the fundamental laws of gases and solids on the basis of simple microscopic laws of interaction, it took much longer to come to a satisfactory understanding of the basic properties of liquids. The reason for this is, quite simply, that the particles in a liquid are neither (more or less) freely moving about, as in a gas, nor are they (more or less) localized, as in a solid. Consequently, at the microscopic level the interactions in a liquid are more complex^1.

In this paper we shall address one particular question about the fluid, of interest to the physicist: 'What is the motion of a tagged particle like?' Imagine that we pick a particle at random, give it a tag, and follow its erratic motion as time evolves. Suppose that this tagged particle is 'in equilibrium' with its environment. Then what is the law of its motion as a result of its repeated collisions with the other particles^2? (See the figure.) Let

$$\begin{aligned} X(t) &= \text{position of tagged particle at time } t \\ V(t) &= \text{velocity of tagged particle at time } t \end{aligned} \tag{1}$$

(put X(0) = 0). Two important quantities characterizing these random processes are

$$\begin{aligned} \langle X^2(t) \rangle &= \text{mean-square displacement at time } t \\ \langle V(0)V(t) \rangle &= \text{velocity autocorrelation at time } t \end{aligned} \tag{2}$$

Fig. 1. Tagged particle in $\mathbb{R}^d$.

^1 For simplicity we consider only classical simple fluids, i.e., particles are monatomic with no internal degrees of freedom and are subject to the Newtonian laws of mechanics. Fluid is the collective name for dense gas or liquid.
^2 A collision should be thought of as a fast but smooth deflection of trajectories. At short distances particles repel each other according to the Lennard–Jones potential.
where $\langle \cdot \rangle$ denotes expectation with respect to the 'equilibrium ensemble' describing the system, i.e., the law of the motion of all the particles. Of course, this law will not be known in general and it is here that the difficulty lies in analyzing the quantities in (2)^3.

Lemma 1. If {V(t) : t ≥ 0} is a stationary process, then

$$\frac{1}{2} \frac{d^2}{dt^2} \langle X^2(t) \rangle = \langle V(0)V(t) \rangle \qquad (t \ge 0). \tag{3}$$
Proof. Since $X(t) = \int_0^t V(s)\,ds$ it follows that $\frac{1}{2} \frac{d}{dt} \langle X^2(t) \rangle = \langle X(t)V(t) \rangle = \int_0^t \langle V(s)V(t) \rangle\,ds$. By stationarity, $\langle V(s)V(t) \rangle = \langle V(0)V(t-s) \rangle$ and therefore the last integral can be rewritten as $\int_0^t \langle V(0)V(s) \rangle\,ds$. Differentiate once more with respect to t to get (3).

Lemma 1 shows that the two quantities in (2) have a simple relation when {X(t) : t ≥ 0} has stationary increments, which is the mathematical way of saying that the tagged particle is 'in equilibrium' with its environment. If X(t) were a pure diffusion (like a Wiener process), then we would have $\langle X^2(t) \rangle = 2Dt$ for all t ≥ 0, where D is the so-called diffusion constant. However, the motion of the tagged particle is not purely diffusive because it is carrying momentum and because it is not colliding with the other particles all the time. Equation (3) shows that the correction to the linear growth of $\langle X^2(t) \rangle$ will manifest itself as $\langle V(0)V(t) \rangle \neq 0$, typical for this situation.

Lemma 2. If $D = \frac{1}{2} \lim_{t \to \infty} \frac{d}{dt} \langle X^2(t) \rangle$ exists, then

$$D = \int_0^\infty \langle V(0)V(s) \rangle\,ds. \tag{4}$$

^3 $X^2(t)$ denotes the inner product of the vector X(t) with itself, and similarly for V(0)V(t).
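Relations (3) and (4) can be checked numerically on a process whose correlation function is known in closed form. The sketch below assumes a one-dimensional stationary Ornstein–Uhlenbeck (Langevin) velocity, for which $\langle V(0)V(t) \rangle = \sigma^2 e^{-\gamma t}$ and hence $\langle X^2(t) \rangle = 2\sigma^2\gamma^{-2}(\gamma t - 1 + e^{-\gamma t})$. This choice, the parameter values and the function names are assumptions made for illustration only; it is not a model of the fluid discussed in the text.

```python
import math

SIGMA2 = 1.0  # assumed stationary variance of the velocity
GAMMA = 2.0   # assumed relaxation rate


def vacf(t):
    """Velocity autocorrelation <V(0)V(t)> of the assumed OU velocity."""
    return SIGMA2 * math.exp(-GAMMA * t)


def msd(t):
    """Mean-square displacement, i.e. 2 * int_0^t (t - s) vacf(s) ds."""
    return 2.0 * SIGMA2 / GAMMA ** 2 * (GAMMA * t - 1.0 + math.exp(-GAMMA * t))


def half_second_derivative(f, t, h=1e-3):
    """Central-difference estimate of (1/2) f''(t), the left side of (3)."""
    return 0.5 * (f(t + h) - 2.0 * f(t) + f(t - h)) / h ** 2


def half_first_derivative(f, t, h=1e-3):
    """Central-difference estimate of (1/2) f'(t)."""
    return 0.5 * (f(t + h) - f(t - h)) / (2.0 * h)


def green_kubo(f, t_max=40.0, n=200_000):
    """Midpoint-rule estimate of int_0^t_max f(s) ds, the right side of (4)."""
    dt = t_max / n
    return sum(f((i + 0.5) * dt) for i in range(n)) * dt


if __name__ == "__main__":
    print("(3):", half_second_derivative(msd, 1.0), "vs", vacf(1.0))
    print("(4):", half_first_derivative(msd, 20.0), "vs", green_kubo(vacf))
```

For this assumed example both sides of (4) are close to $\sigma^2/\gamma$, which is the diffusion constant of the Ornstein–Uhlenbeck model.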
Proof. See the proof of Lemma 1.

Lemma 2 relates the diffusion constant D to the velocity autocorrelation function $t \mapsto \langle V(0)V(t) \rangle$. Equation (4) is known as the Green–Kubo formula^4. For over half a century it was believed by physicists that

$$\langle V(0)V(t) \rangle = e^{-At(1+o(1))} \qquad (t \to \infty) \tag{5}$$
with A > 0 some rate that depends on the parameters of the fluid (such as temperature and density). Intuitively (5) was quite a reasonable guess. The tagged particle gets repeatedly 'kicked around' by the other particles and therefore it should lose memory of its initial velocity at some positive rate. Moreover, the exponential decay was predicted by the semi-phenomenological theories based on the Boltzmann equation and the Langevin equation, which were the main descriptive tools at the time. Furthermore, the asymptotic linear growth of the mean-square displacement (implied by (5) via Lemma 2) had been observed in numerous physical experiments (e.g., the famous experiments by Perrin 1909 on Brownian particles). So there was no reason to doubt (5).

Clouds appeared in the sky in the late 1960's. Namely, computer simulations by Alder and Wainwright (1967, 1969, 1970), carried out with models of elastically colliding hard disks and spheres, indicated that

$$\langle V(0)V(t) \rangle = A t^{-d/2} (1 + o(1)) \qquad (t \to \infty) \tag{6}$$
with d = 2 or 3 the dimension and A > 0 some system-dependent amplitude. The slow decay in (6), as opposed to the fast decay in (5), came as a complete surprise. How is such behavior possible? How can it be reconciled with the idea of the tagged particle being repeatedly kicked around? How does the motion manage to carry the correlation over a long time interval? The behavior in (6) has been confirmed by further computer simulations with various types of models with short range interaction (Wood 1973, 1975, Levesque and Ashurst 1974, van der Hoef et al. 1991). It has also been observed in actual experiments (Kim and Matta 1973, Carneiro 1976, Boon and Bouiller 1976, Paul and Pusey 1981, Morkel et al. 1987), leaving no doubt as to the reality of the claim^5. Ever since these developments the slow decay in (6) has become known as a 'long time tail', a terminology introduced in the physics literature. (The fast decay in (5) is called a 'short time tail'.)

A remarkable consequence of (6) is that D = ∞ in d = 1 and 2 (by Lemma 2). Thus the motion in low dimension appears to be superdiffusive, which means that the hydrodynamic diffusion equation is not valid. But the impact of (6) goes further. Namely, the foundation of hydrodynamics is based on the assumption that nonequilibrium behavior has two time regimes: (i) short (a few mean collision times), where microscopic relaxation takes place; (ii) long (many mean collision times), where only macroscopic relaxation is important (and the system is in so-called 'local equilibrium'). The fact that the velocity autocorrelation persists from the short into the long time regime indicates that the above separation of time regimes is somewhat arbitrary and needs clarification.

^4 Note that Lemmas 1 and 2 do not apply to pure diffusion, because for such motion V(t) does not exist. One is tempted to write $\langle V(0)V(t) \rangle \equiv 0$ for t > 0. This is OK with (3) but not with (4) (see also footnote 2).
^5 The interpretation of these experiments (based on hydrodynamic measurement, incoherent neutron scattering and light scattering) is a subtle matter, as it is not always clear to what extent the results reflect the long time tail of a single particle (see Boon and Yip 1980).

To close this introduction, let us note that there is of course no complete theoretical foundation for (6). A real fluid, or even a caricature model like n elastically colliding hard spheres, is much too hard for a rigorous mathematical analysis. Nevertheless, over the years a good heuristic explanation has crystallized out. The most convincing argument is due to van Beijeren (1982): "If one assumes that the motion of the tagged particle is roughly diffusion-like, then after time t it will have travelled a distance of order $t^{1/2}$ in each of the d directions. This means that the tagged particle has been in direct or indirect contact with all the particles in a volume of order $t^{d/2}$. With each of these it has shared some of its momentum, and so the momentum autocorrelation must be of order $1/t^{d/2}$." In the next section we shall see how this argument can be made more precise and what it predicts for the amplitude of the long time tail.

2. Semi-Phenomenological Theories

Several semi-phenomenological theories have been proposed to explain the power $\frac{1}{2}d$ and identify the amplitude A. We mention two of these that have been particularly successful. For further references see van Beijeren (1982) and Cohen (1993a, b).

2.1. Hydrodynamics

The following argument is taken from van Beijeren (1982). Consider a system of particles in equilibrium, conditioned on having a particle at the origin with velocity v at time t = 0. Give this particle a tag. Define

$$\begin{aligned} \rho(x,t) &= \text{probability density for tagged particle to be at } x \text{ at time } t \\ u(x,t) &= \text{average velocity density of particles at } x \text{ at time } t. \end{aligned} \tag{7}$$
Due to the condition V(0) = v the system starts from a non-equilibrium situation with initial conditions

$$\begin{aligned} \rho(x,0) &= \delta(x) \\ u(x,0) &= v\,\delta(x) \end{aligned} \tag{8}$$

where δ(x) is the Dirac function. Since this is only a tiny deviation from equilibrium it is reasonable to assume that after a long time (i.e., much longer than the mean collision time of a typical particle in the system) the evolution of ρ(x, t) and u(x, t) is described to first approximation by linearized hydrodynamic equations, namely

$$\begin{aligned} \frac{\partial}{\partial t} \rho(x,t) &= D \nabla^2 \rho(x,t) \\ \frac{\partial}{\partial t} u_\perp(x,t) &= -\nu \nabla \times (\nabla \times u_\perp(x,t)). \end{aligned} \tag{9}$$

Here $u_\perp$ is the transversal part of the velocity field ($\nabla \cdot u_\perp = 0$), D is the diffusion constant, and ν is the kinematic viscosity. The equation for the longitudinal part of
the velocity field is slightly more involved but is omitted, because its contribution to the velocity autocorrelation function turns out to decay exponentially. The solution of (8) and (9) is given most easily in terms of the Fourier transforms $\hat z(k,t) = \int dx\, \exp[-i(k \cdot x)]\, z(x,t)$ ($z = \rho, u_\perp$) and reads

$$\hat\rho(k,t) = e^{-Dk^2 t}, \qquad \hat u_\perp(k,t) = \Bigl[ v - \frac{(v \cdot k)k}{k^2} \Bigr] e^{-\nu k^2 t}. \tag{10}$$
Now let us assume that after a long time the tagged particle has the same average velocity as the other particles in its neighborhood. Then we have

$$\langle V(t) \mid V(0) = v \rangle \sim \frac{1}{n} \int dx\, \rho(x,t)\, u(x,t) \sim \frac{1}{n} \frac{1}{(2\pi)^d} \int dk\, \hat\rho(k,t)\, \hat u_\perp(-k,t), \tag{11}$$

where n is the particle density. Substitution of (10) into (11) yields

$$\langle V(t) \mid V(0) = v \rangle \sim \frac{1}{n} \frac{1}{(2\pi)^d} \int dk\, \Bigl[ v - \frac{(v \cdot k)k}{k^2} \Bigr] e^{-(D+\nu)k^2 t} = \frac{1}{n} \frac{d-1}{d} \frac{1}{[4\pi(D+\nu)t]^{d/2}}\, v. \tag{12}$$

The contribution of the longitudinal part of the velocity field is of higher order. Only the transversal part contributes to the leading order, hence the factor (d − 1)/d. Equation (12) says that $\langle V(t) \mid V(0) = v \rangle$ points in the same direction as the initial velocity V(0) = v but is reduced in length by an appropriate time-dependent factor due to the diffusive spreading out of the density and the momentum. Finally, we must average over v with respect to the equilibrium ensemble of the system. Since v has the Maxwell–Boltzmann distribution, which is an isotropic Gaussian with mean zero and variance $(\beta m)^{-1}$, where β is the inverse temperature and m is the mass of a particle, it follows that

$$\langle V(0)V(t) \rangle \sim A t^{-d/2}, \qquad A = \frac{1}{\beta m n} \frac{d-1}{d} \frac{1}{[4\pi(D+\nu)]^{d/2}}. \tag{13}$$
Thus we have arrived at (6) with an explicit identification of the amplitude A in terms of system parameters. The result in (13) turns out to compare very favorably with numerical simulations and with actual experiments. Even though the above derivation clearly is not on a microscopic basis, (13) is nevertheless believed to be exact, at least for d ≥ 3. There is an obvious inconsistency in d = 1 and 2 because (13) implies D = ∞, as was already noted before. Apparently, hydrodynamic equations like (9) do not hold in low dimensions. There are some speculations that in d = 2 (where D diverges only marginally) the result in (13) should be corrected by a logarithmic factor (van der Hoef and Frenkel 1991).
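The wave-vector integral in (12) is easy to check numerically in d = 2, where polar coordinates separate it into a radial and an angular factor. The sketch below does this with a plain midpoint rule and compares the result with the closed form $\frac{d-1}{d}[4\pi(D+\nu)t]^{-d/2}$ (the common factor 1/n is dropped). The function names, the parameter values and the choice of v as a unit vector along the first axis are assumptions made for illustration.

```python
import math


def tail_closed_form(d, diff, visc, t):
    """Closed form of (12) without 1/n: ((d-1)/d) * [4 pi (D+nu) t]^(-d/2)."""
    return (d - 1) / d / (4.0 * math.pi * (diff + visc) * t) ** (d / 2)


def tail_numeric_2d(diff, visc, t, n_r=4000, n_theta=400):
    """Midpoint-rule value of the k-integral in (12) for d = 2, v = e_1.

    The component along v of [v - (v.k)k/k^2] is 1 - cos^2(theta), so the
    integral factorises into a radial and an angular part.
    """
    c = (diff + visc) * t
    k_max = 10.0 / math.sqrt(c)  # exp(-100) beyond this cutoff is negligible
    dk = k_max / n_r
    dth = 2.0 * math.pi / n_theta
    radial = sum((i + 0.5) * dk * math.exp(-c * ((i + 0.5) * dk) ** 2)
                 for i in range(n_r)) * dk
    angular = sum(1.0 - math.cos((j + 0.5) * dth) ** 2
                  for j in range(n_theta)) * dth
    return radial * angular / (2.0 * math.pi) ** 2


def amplitude(beta, mass, density, d, diff, visc):
    """Amplitude A of (13) for given (hypothetical) system parameters."""
    return ((d - 1) / d) / (beta * mass * density) \
        / (4.0 * math.pi * (diff + visc)) ** (d / 2)


if __name__ == "__main__":
    print(tail_numeric_2d(0.3, 0.7, 5.0), tail_closed_form(2, 0.3, 0.7, 5.0))
```

The two printed numbers should agree to several digits, and multiplying the closed form by $1/(\beta m n)$ recovers $A t^{-d/2}$ with A as in (13).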
2.2. Kinetic Theory

At low densities the main source of interaction between the particles comes from uncorrelated binary collisions, i.e., no two particles collide more than once. The behavior of the system in such a situation can be described by the classical Boltzmann equation, which lies halfway between a microscopic and a macroscopic description. At higher densities, however, more complex collision patterns come into play, namely, correlated binary collisions as well as collisions involving three or more particles at the same time. Kinetic theory aims at systematically including the effects of some of these collision patterns, via so-called cluster expansion techniques. For a recent historical account of this fascinating area we refer the reader to Cohen (1993a, b).

Dorfman and Cohen (1970, 1972, 1975) derive (13) on the basis of kinetic theory. They show that the main collisions responsible for the long time tail are the so-called 'ring collisions'. These are sequences of correlated binary collisions where the initial momentum of the tagged particle is transferred to the surrounding particles in a ring-like motion. After about 25 mean collision times this ring-like motion 'kicks the tagged particle in the back'. In other words, the tagged particle causes a vortex type perturbation of the velocity field around itself and it picks up part of this perturbation some time later. The vortex has also been seen in the computer simulations. Moreover, the vortex structure is compatible with the fact that in (12) only the transversal (= divergence free) part of the velocity field contributes to the long time tail. The 'kick in the back' coincides with $\langle V(t) \mid V(0) = v \rangle$ and v pointing in the same direction.

The difference between (13) and the result obtained from kinetic theory is that not D, ν appear in the amplitude but the so-called 'bare' transport coefficients $D_0$, $\nu_0$. These are the ones predicted by kinetic theory (to lowest order in the density).
It seems that the simulations predict a long time tail with a 'bare' amplitude for times between 10 and 50 mean collision times, and that only for longer times the full amplitude is seen. For a class of short range interactions with hard core repulsion kinetic theory identifies $D_0$ and $\nu_0$ explicitly in terms of system parameters. For instance, for hard spheres the result is

$$\begin{aligned} D_0 = \nu_0 &= \frac{1}{2an} \frac{1}{(\pi\beta m)^{1/2}} && (d = 2), \\ D_0 = \tfrac{6}{5} \nu_0 &= \frac{3}{8a^2 n} \frac{1}{(\pi\beta m)^{1/2}} && (d = 3), \end{aligned} \tag{14}$$
where a is the radius of a particle.

3. Lorentz Models

In order to make possible a more mathematical approach to the study of the tagged particle motion, the following two simplifications are introduced into the modelling:

(I) The environment is frozen, i.e., only the tagged particle moves. The other particles are kept fixed, their positions being chosen either periodically or randomly.
(II) The collision rules are random, i.e., the way in which the tagged particle is scattered when colliding with one of the other particles has a stochastic component.

Models with property (I) are called deterministic Lorentz models. Examples are:

(1) A hard disk or sphere collides elastically with scatterers. The scatterers are hard disks or spheres too, whose centers are placed at the sites of $\mathbb{Z}^2$ resp. $\mathbb{Z}^3$ (Bunimovich and Sinai 1981).
(2) A light beam moves along the bonds of $\mathbb{Z}^2$ and gets deflected by mirrors placed at every site. The axes of the mirrors make an angle of 45 degrees with the lattice axes and are chosen randomly from the two possible orientations (Cohen 1992).

Models with properties (I) and (II) are called stochastic Lorentz models. Examples are:

(3) A particle moves in a straight line along the bonds of $\mathbb{Z}^d$ except when hitting a scatterer. At a scatterer it changes the direction of its motion randomly to one of the 2d possible directions. The scatterers are placed at the lattice sites randomly with density lower than one (van Beijeren and Spohn 1983, den Hollander et al. 1992b).
(4) A particle performs a random walk on an infinite percolation cluster in $\mathbb{Z}^d$. At each step it chooses with equal probability from the nearest-neighbor sites that are in the cluster (Kesten 1986).

Three types of questions have been studied for Lorentz models:

(a) Is the motion asymptotically diffusive? (Is the limit law of $\{t^{-1/2} X(\lfloor ut \rfloor) : u \in [0,1]\}$ as t → ∞ that of a Wiener process?)
(b) Is the motion recurrent or transient? (Does X(t) return to every finite neighborhood of 0 with probability = 1 or < 1?)
(c) Is there a long time tail? (Does (6) hold?)

Although in this paper we are mainly concerned with question (c), we want to say a few words about questions (a) and (b) as well. For deterministic Lorentz models (a) and (b) have been settled partially in a few cases.
For instance, in example (1) with d = 2 we know that (under some conditions) the motion is asymptotically diffusive, but we do not know whether it is recurrent or not^6. In example (2) we know that the motion is recurrent and not asymptotically diffusive, but we do not know its scaling behavior. Unfortunately there are, to date, no examples proving or disproving (c)^7.

For stochastic Lorentz models the situation is better. If the motion is Markov and reversible (in an appropriate sense), then a general theorem due to DeMasi et al. (1989) gives the answer to (a): The path scales to a Wiener process with a diffusion constant D ∈ [0, ∞), and in many examples it can be proved that actually D > 0 so that the limit law is non-degenerate (e.g., in example (4) strictly above criticality). For non-reversible Markov motion there is no comparable result of such generality, although in some examples (a) can still be settled in the affirmative (e.g., in example (3); den Hollander et al. 1992b). Question (b) seems to be hard, both for reversible and for non-reversible Markov motion. One can prove recurrence in d = 1 and 2 for reversible systems (Durrett 1986), but typically transience in d ≥ 3 remains open. There are, however, examples where recurrence holds in any d ≥ 1 (Durrett 1986, Bramson and Durrett 1988). Question (c) has so far been answered only for a few very simple models. The first results in this direction are due to van Beijeren and Spohn (1983), who proved a long time tail for a class of models in d = 1. Recently, den Hollander et al. (1992a, c) have obtained the first rigorous proofs of long time tails in d ≥ 2. These will be discussed in the next section.

^6 Weaker versions of (b) can sometimes be answered, like X(t) returns to $B \log^\beta t$ infinitely often with probability = 1, where B is some ball in $\mathbb{R}^2$ and β > 0 some power.
^7 In example (1) the velocity autocorrelation function has been shown to decay faster than any power, but this is a consequence of the periodic configuration of the scatterers. For random configurations one expects a long time tail.

4. Two Models

In stochastic Lorentz models one expects a long time tail result of the type as in (6) on the basis of the following heuristic argument. Assume that the motion is asymptotically diffusive (so the answer to (a) is affirmative). The slow decay of the velocity autocorrelation function arises because the particle may return to the origin and recognize the environment. This induces a memory effect, which is governed by slowly decaying return probabilities typical for diffusive motion. However, such an explanation clearly is rather vague, and we are left with the task of obtaining mathematically precise results in concrete situations.

A problem that comes up in stochastic Lorentz models is that, because the motion is random, it is not always obvious what is meant by the velocity V(t). In some cases there is a natural choice for defining this quantity (as e.g., in examples (3) and (4) of Section 3), but in other cases one needs to introduce the appropriate notion. We shall see more of this in the examples below^8.

4.1. Random Waiting Times

In this model, which was first studied by Denteneer and Ernst (1984), the role of the random environment is played by an i.i.d. collection of random variables

$$W = \{w(x) : x \in \mathbb{Z}^d\}, \tag{15}$$

whose law $\mu = \gamma^{\mathbb{Z}^d}$ is assumed to satisfy γ(0, ∞) = 1 and

$$\int w^{-1}\, \gamma(dw) < \infty, \qquad \int w^2\, \gamma(dw) < \infty. \tag{16}$$
Given W, let $\{X(t) : t \ge 0\}$ be simple random walk with jump rate $w^{-1}(X(t))$ and X(0) = 0. That is, after X(t) jumps to a site x it waits there during a random time which is exponentially distributed with mean w(x). After this time it jumps with equal probability to one of the nearest neighbors of x, waits there again, etc. We shall write $P_W$ to denote the law of this walk in the fixed environment W, and $P_\mu = \int P_W\,\mu(dW)$ to denote the law after averaging over W with respect to µ. The respective expectations will be written $E_W$ and $E_\mu$.

⁸ In the physics literature $\langle V(0)V(t)\rangle$ is often defined by relation (3), in order to avoid the definition of V(t) itself.

It is not hard to prove that
$$E_\mu X^2(t) \sim M^{-1} t \qquad (t \to \infty), \tag{17}$$
with $M = \int w\,\gamma(dw)$ the mean of γ. Indeed, over a large time interval of length t the walk will make $\sim M^{-1}t$ steps, because it will visit many different sites and the average delay per visit is M. Since the mean-square displacement of simple random walk after n steps is exactly n, the latter implies (17). Theorem 1 below is a long time tail for the quantity
$$\Delta(t) = \frac{d}{dt} E_\mu X^2(t) - M^{-1}, \tag{18}$$
which measures the correction to (17).

Theorem 1. Let M and $V^2$ be the mean and variance of γ. Then
$$\lim_{t\to\infty} t^{d/2}\,\Delta(t) = V^2 M^{(d/2)-3} \left(\frac{d}{2\pi}\right)^{d/2}. \tag{19}$$
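The heuristic behind (17) can be probed with a small Monte Carlo experiment. The sketch below is not taken from the paper: it assumes d = 1 and a two-point law γ with w ∈ {0.5, 1.5} (so that M = 1), and the function name and its parameters are purely illustrative. A fresh environment is sampled lazily for every walk, so the average approximates $E_\mu X^2(t)$, which should be close to t/M for large t.

```python
import random


def mean_square_displacement(t_max, n_walks, waits=(0.5, 1.5), seed=0):
    """Estimate E_mu X^2(t_max) for the random waiting time walk in d = 1.

    Assumption (not from the text): gamma is uniform on the values in `waits`.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        env = {}              # environment W, sampled site by site on demand
        x, t = 0, 0.0
        while True:
            w = env.get(x)
            if w is None:
                w = env[x] = rng.choice(waits)
            # holding time at x: exponential with mean w(x), i.e. rate 1/w(x)
            t += rng.expovariate(1.0 / w)
            if t > t_max:
                break         # the walk is still sitting at x at time t_max
            x += rng.choice((-1, 1))
        total += x * x
    return total / n_walks


if __name__ == "__main__":
    t = 200.0
    msd = mean_square_displacement(t, 2000)
    print("E X^2(t) / t is approximately", msd / t, "(compare with 1/M = 1)")
```

The estimate carries Monte Carlo noise of a few percent at this sample size, so it illustrates (17) only; resolving the correction Δ(t) of Theorem 1 would need far more samples.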
Theorem 1 says that $\tfrac12 \frac{d^2}{dt^2} E_\mu X^2(t) \sim A\,t^{-(d/2)-1}$ as t → ∞, with the constant $A = -\tfrac14 d\,V^2 M^{(d/2)-3} (d/(2\pi))^{d/2}$. Thus (19) is of the same type as given by (6) (recall (3) in Lemma 1), except that the exponent is $\tfrac12 d + 1$ instead of $\tfrac12 d$. However, we have to be careful with concluding right away that (19) and (6) are equivalent. Namely, the problem is that $\{X(t) : t \ge 0\}$ does not have stationary increments under the law $P_\mu$ (which was needed in Lemma 1). To see why, let us introduce the environment process $\{W(t) : t \ge 0\}$ defined by
$$W(t) = \{w(x + X(t)) : x \in \mathbb{Z}^d\}, \tag{20}$$
i.e., the environment of waiting times as seen relative to the position of the walk. Now, W(0) has law $\mu = \gamma^{\mathbb{Z}^d}$ but W(t) for t > 0 does not, simply because X(t) stays longer on sites with large w(x) than on sites with small w(x). Consequently, the environment process is not stationary under the law $P_\mu$ and so neither is the process of increments of the random walk. The way out of this dilemma is the following.

Lemma 3. Let $\mu_0$ be the law defined by
$$\frac{d\mu_0}{d\mu}\big(\{w(x) : x \in \mathbb{Z}^d\}\big) = \frac{w(0)}{M}. \tag{21}$$
Then the environment process is stationary, ergodic and reversible under the law $P_{\mu_0}$. Moreover, $E_{\mu_0} w^{-1}(X(t)) = M^{-1}$ and
$$M^{-1}\Delta(t) = E_{\mu_0}\big\{[w^{-1}(X(0)) - M^{-1}]\,[w^{-1}(X(t)) - M^{-1}]\big\}. \tag{22}$$

The proof of (22) is based on the observation that $X^2(t) - \int_0^t w^{-1}(X(s))\,ds$ is a martingale under the law $P_W$ for µ-a.s. all W. The reversibility property can be shown to imply that $t \to \Delta(t)$ is completely monotone (i.e., Δ(t) has a spectral representation of the form $\int_0^\infty e^{-\gamma t}\,\alpha(d\gamma)$ with α a positive measure).

The relation expressed by (22) should be viewed as the analogue of (3) in Lemma 1. The right-hand side of (22) is the autocorrelation function of $w^{-1}(X(t)) = (W(t))^{-1}(0)$, which is stationary under the law $P_{\mu_0}$. Therefore Theorem 1 can now be viewed as a long time tail in the proper sense of the word. The jump intensity $w^{-1}(X(t))$ for the random walk in our model is the analogue of the velocity V(t) for the mechanical tagged particle in Section 1.

In the remaining part of this section we shall give a sketch of how Theorem 1 comes about, without going into the technical details of the proof. The first step in the argument is a Feynman–Kac formula expressing the Laplace transform of Δ(t) in terms of simpler quantities. To write down this representation, let us define $\{\tilde X(t) : t \ge 0\}$ to be our process when w(x) = 1 for all x, i.e., simple random walk with jump rate 1 everywhere. Let $\tilde E$ denote expectation with respect to the law of this random walk and let
$$\ell_t(x) = \int_0^t 1_{\{\tilde X(s) = x\}}\,ds \qquad (t \ge 0,\ x \in \mathbb{Z}^d), \tag{23}$$
be the local times. Then one can show that
$$\int_0^\infty e^{-\lambda t}\,dE_W X^2(t) = \int_0^\infty dt\;\tilde E \exp\Big[-\lambda \sum_x \ell_t(x)\,w(x)\Big]. \tag{24}$$
After integration over W (recall (18)) and insertion of the identity $\sum_x \ell_t(x) = t$, one arrives at the following expression.

Lemma 4. For λ > 0
$$\frac{1}{\lambda M} + \int_0^\infty e^{-\lambda t}\,\Delta(t)\,dt = \int_0^\infty e^{-\lambda M t}\;\tilde E \prod_x \Gamma(\lambda \ell_t(x))\,dt \tag{25}$$
with
$$\Gamma(\xi) = \int e^{-\xi[w - M]}\,\gamma(dw) \qquad (\xi \ge 0). \tag{26}$$
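The passage from (24) to (25) can be sketched as follows. This is only an outline, assuming that Fubini's theorem may be used to interchange the average over W with the t-integral and with $\tilde E$. It uses the independence of the w(x) under µ, the identity $\sum_x \ell_t(x) = t$, and the definitions (18) and (26).

```latex
\begin{align*}
\int \mu(dW)\,\exp\Big[-\lambda \sum_x \ell_t(x)\,w(x)\Big]
  &= \prod_x \int e^{-\lambda \ell_t(x)\,w}\,\gamma(dw)
   = \prod_x e^{-\lambda M \ell_t(x)}\,\Gamma(\lambda \ell_t(x)) \\
  &= e^{-\lambda M t} \prod_x \Gamma(\lambda \ell_t(x)), \\[4pt]
\int_0^\infty e^{-\lambda t}\,dE_\mu X^2(t)
  &= \int_0^\infty e^{-\lambda t}\,\big[\Delta(t) + M^{-1}\big]\,dt
   = \frac{1}{\lambda M} + \int_0^\infty e^{-\lambda t}\,\Delta(t)\,dt .
\end{align*}
```

Averaging both sides of (24) over W and inserting these two lines gives the identity stated in Lemma 4.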
The next step in the argument is to use Lemma 4 as the starting point for a Tauberian–Abelian analysis, i.e., to study the λ ↓ 0 behavior of the right-hand side of (25) and from this deduce the t → ∞ behavior of Δ(t) in the left-hand side of (25). To carry through this analysis we need a large deviation estimate for the local times of the following type: '$\sup_x \lambda \ell_t(x)$ is small for all t that make up the dominant contribution to the integral as λ ↓ 0'. Namely, it is precisely under this condition that we can expand the integrand in the right-hand side of (25) for small λ. Proceeding naively with such a computation we have, by (26),
$$\Gamma(\lambda \ell_t(x)) = 1 + \tfrac12 \lambda^2 V^2 \ell_t^2(x) + \cdots \tag{27}$$
and hence
$$\tilde E \prod_x \Gamma(\lambda \ell_t(x)) = \tilde E\Big[1 + \tfrac12 \lambda^2 V^2 \sum_x \ell_t^2(x) + \cdots\Big] = 1 + \tfrac12 \lambda^2 V^2 \sum_x \tilde E\,\ell_t^2(x) + \cdots. \tag{28}$$
Next we write, recalling (23),
$$\sum_x \tilde E\,\ell_t^2(x) = 2 \int_0^t ds\,(t - s)\,\tilde p_s(0,0) \tag{29}$$
where $t \to \tilde p_t(\cdot,\cdot)$ is the transition kernel of $\{\tilde X(t) : t \ge 0\}$. Substituting (28) and (29) into (25) we get
$$\int_0^\infty dt\;e^{-\lambda t}\,\Delta(t) = \lambda^2 V^2 \int_0^\infty dt\;e^{-\lambda M t} \int_0^t ds\,(t - s)\,\tilde p_s(0,0) + \cdots. \tag{30}$$
The last step in the argument is to rewrite the right-hand side of (30), by interchanging the two integrals and rescaling time by M, as
$$\int_0^\infty dt\;e^{-\lambda t}\,\zeta(t) + \cdots, \qquad \zeta(t) = V^2 M^{-3}\,\tilde p_{t/M}(0,0). \tag{31}$$
The latter form suggests that Δ(t) ∼ ζ(t) as t → ∞. This can indeed be shown to be correct by appealing to the complete monotonicity of t → Δ(t). Finally we substitute the standard local limit theorem
$$\tilde p_t(0,0) \sim \left(\frac{d}{2\pi t}\right)^{d/2} \qquad (t \to \infty), \tag{32}$$
to get the claim in (19).

4.2. Random Color Scenery

The model discussed in this section is of a somewhat different nature than the previous example, but we shall soon see why it blends in. We begin by associating with each site of $\mathbb{Z}^d$ a color, drawn black or white. Namely, the random environment is an i.i.d. collection of random colors
$$C = \{c(x) : x \in \mathbb{Z}^d\}, \tag{33}$$
whose law $\mu = \gamma^{\mathbb{Z}^d}$ is parametrized by $q = \gamma(B) = 1 - \gamma(W) \in (0,1)$, the density of black sites. Next, we define $\{X(n) : n \ge 0\}$ to be discrete time simple random walk on $\mathbb{Z}^d$ (X(0) = 0). We shall assume that walk and coloring are independent. However, they will be linked in an interesting way through the type of question that we shall ask. Let $T_k$ (k ≥ 0) be the successive random times at which the walk hits a black site, defined as
$$c(X(n)) = \begin{cases} B & n = T_0, T_1, \cdots \\ W & \text{otherwise.} \end{cases} \tag{34}$$
Let $n_k$ (k ≥ 0) be the interarrival times
$$n_0 = T_0, \qquad n_k = T_k - T_{k-1} \quad (k \ge 1). \tag{35}$$
We shall write $P_\mu$, $E_\mu$ to denote the joint law and expectation of walk and coloring. It is not hard to prove that
$$E_\mu T_k \sim q^{-1} k \qquad (k \to \infty). \tag{36}$$
Indeed, over a large time interval of length n the walk will hit a black site ∼ qn times, so an average of $q^{-1}$ steps is needed between black visits. With a little effort (36) can be turned into the stronger statement $E_\mu n_k \sim q^{-1}$ (k → ∞). Theorem 2 below is a long time tail for the quantity
$$\Delta_k = E_\mu n_k - q^{-1}. \tag{37}$$

Theorem 2. For any q ∈ (0, 1)
$$\lim_{k\to\infty} k^{d/2}\,\Delta_k = (1 - q)\,q^{(d/2)-2} \left(\frac{d}{2\pi}\right)^{d/2}. \tag{38}$$
To see the link between Theorems 1 and 2, note that $X^2(n) - n$ is a martingale, so that
$$E_\mu X^2(T_k) = E_\mu T_k. \tag{39}$$
Since, by (35) and (37), we have $\delta_k \Delta_k = \delta_k^2 E_\mu T_k$ with $\delta_k$ the forward difference operator, Theorem 2 says that $\tfrac12 \delta_k^2 E_\mu X^2(T_k) \sim A\,k^{-(d/2)-1}$ (k → ∞) with $A = -\tfrac14 d\,(1-q)\,q^{(d/2)-2} (d/(2\pi))^{d/2}$. Thus (38) is of the same type as (6) (again recall (3) in Lemma 1), except that the exponent is $\tfrac12 d + 1$ instead of $\tfrac12 d$. The difference with the example in Section 4.1 is that we are observing the mean-square displacement not along the full time scale but along the random time scale $\{T_k : k \ge 0\}$.

Again, we have to be careful calling (38) and (6) equivalent. The point this time is that $\{X(T_k) : k \ge 1\}$ does not have stationary increments under the law $P_\mu$. This is due to a renewal effect. If 0 happens to lie in a big white hole, then the $T_k$'s will be larger than on average, while just the opposite will be true when 0 is packed between black colors. The net result is that the increments $n_k = T_k - T_{k-1}$ (k ≥ 1) are not stationary. The way out is similar to Lemma 3. Define the environment process $\{C(n) : n \ge 0\}$ by putting
$$C(n) = \{c(x + X(n)) : x \in \mathbb{Z}^d\}, \tag{40}$$
which is the color scenery seen by the walk.

Lemma 5. Let $\mu_0$ be the law defined by
$$\mu_0(\cdot) = \mu(\cdot \mid c(0) = B). \tag{41}$$
Then the environment process is stationary, ergodic and reversible under the law $P_{\mu_0}$⁹. Moreover, $E_{\mu_0} n_k = q^{-1}$ and
$$q^{-1}\Delta_k = E_{\mu_0}\big\{[n_1 - q^{-1}]\,[n_{k+1} - q^{-1}]\big\}. \tag{42}$$

⁹ The idea of conditioning on a black origin can be traced back to Kac (1947), and $E_{\mu_0} n_k = q^{-1}$ is a version of the Kac recurrence theorem (see Kasteleyn 1987).
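The identity $E_{\mu_0} n_k = q^{-1}$ of Lemma 5 is easy to probe numerically for k = 1. The following is a minimal sketch, not taken from the paper: the function name is hypothetical, the coloring is sampled lazily along the path, and the origin is forced to be black so that the walk starts under $\mu_0$. It returns the number of steps until the walk next sits on a black site.

```python
import random


def first_black_return_time(q, d, rng):
    """Sample n_1 under mu_0: steps until the walk is next on a black site."""
    colors = {(0,) * d: True}        # condition on a black origin (law mu_0)
    pos = [0] * d
    n = 0
    while True:
        axis = rng.randrange(d)      # each of the 2d neighbors has prob. 1/(2d)
        pos[axis] += rng.choice((-1, 1))
        n += 1
        key = tuple(pos)
        if key not in colors:        # color a site the first time it is visited
            colors[key] = rng.random() < q
        if colors[key]:
            return n


if __name__ == "__main__":
    rng = random.Random(1)
    q, d, samples = 0.5, 1, 20000
    mean = sum(first_black_return_time(q, d, rng) for _ in range(samples)) / samples
    print("empirical mean of n_1:", mean, " Kac prediction 1/q:", 1 / q)
```

Sampling colors only when a site is first visited is legitimate here because walk and coloring are independent and the colors are i.i.d.; a returning walk sees the color it saw before.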
The relation expressed by (42) should again be viewed as the analogue of (3) in Lemma 1. The right-hand side of (42) is the autocorrelation function of $n_k$, which is stationary under the law $P_{\mu_0}$. Therefore Theorem 2 is an authentic long time tail. Apparently the analogue of the velocity is $n_k$. One can indeed make sense out of this by observing that the $n_k$'s are the lengths of the successive pieces of the walk measured along the random time scale $\{T_k : k \ge 1\}$. Admittedly, this analogue is somewhat artificial, but the connection is nevertheless legitimate.

We close this section by writing down the analogue of Lemma 4. Recalling (35), (37) and Lemma 5, we have
$$E_\mu n_0 + \sum_{k\ge1} z^k \Delta_k = \sum_{k\ge0} z^k\,[E_\mu T_k - E_{\mu_0} T_k] = (1 - q) \sum_{k\ge0} z^k\,[E_{\mu_1} T_k - E_{\mu_0} T_k] \tag{43}$$
where $\mu_1(\cdot) = \mu(\cdot \mid c(0) = W)$. Now let
$$\ell_n(x) = \sum_{m=0}^{n} 1_{\{X(m) = x\}} \qquad (n \ge 0,\ x \in \mathbb{Z}^d), \tag{44}$$
and let $\tilde E$ denote expectation with respect to the law of the random walk. Then, thinking of z as a counting factor for each time the walk visits a black site, one can derive the following expression.

Lemma 6. For z ∈ (−1, 1)
$$E_\mu n_0 + \sum_{k\ge1} z^k \Delta_k = (1 - q)\,\tilde E\,[1 - z^{\ell_n(0)}] \prod_{x \ne 0} [1 - q + q z^{\ell_n(x)}]. \tag{45}$$
Equation (45) serves as the starting point for a Tauberian–Abelian analysis, just as in Section 4.1. Again the argument relies on the $\Delta_k$ having a spectral integral representation. The reader is invited to try and check how Theorem 2 comes out of a naive expansion of the right-hand side of (45) for z ↑ 1.

5. Concluding Remarks

It is a somewhat frustrating state of affairs that long time tails are so common and yet are so hard to prove mathematically. Even for stochastic Lorentz models, which are caricatures of reality, there is no satisfactory theory. The reason is that long time tails are closely related to local limit theorems, which are known to be much more difficult to get at than global limit theorems. We conclude this paper with a few remarks.

(1) For the mechanical tagged particle in Sections 1 and 2 the long time tail comes from diffusion of density and momentum (see (11)). The exponent is $\tfrac12 d$ and the amplitude is positive. For the stochastic tagged particle in Sections 4.1 and 4.2, on the other hand, the long time tail comes from diffusion of density only (the stochastic motion does not conserve momentum). The exponent is $\tfrac12 d + 1$, showing that the tail is weaker, and the amplitude is negative. Unlike the 'kick in the back' felt by the mechanical tagged particle from the vortex flow it creates around itself, the stochastic tagged particle remembers its initial velocity because it is more likely to return to the origin from the same side it left the origin, so moving in the opposite direction. (This statement must be read with the appropriate interpretation of 'velocity'.)

(2) Bricmont and Kupiainen (1991a, b) have recently studied the asymptotics of a nearest-neighbor random walk on $\mathbb{Z}^d$ where the transition probabilities are a small random perturbation of those of simple random walk. They show, using a rigorous renormalization technique, that for d > 2 (and under suitable conditions on the perturbation) the motion is asymptotically diffusive with diffusion constant D ∈ (0, ∞). Their analysis indicates that the velocity autocorrelation function falls off with an exponent $\tfrac12 d$, but the proof has not been worked out. This speculation suggests that this model is closer to that of a real fluid. The reason is not understood.

(3) The derivation of the long time tails in Theorems 1 and 2 makes heavy use of reversibility, via a spectral integral representation used in the Tauberian–Abelian analysis. Without this representation it seems hard to get the precise asymptotics.

(4) What does the long time tail look like without taking the expectation over the law of the random medium? This question, which was first raised by Sinai (see van Beijeren 1982), is addressed in den Hollander et al. (1992c) for the model in Section 4.1. It is shown that $\frac{d}{dt} E_W X^2(t) - M^{-1}$ falls off like $Z\,t^{-d/4}$ with a random amplitude, namely, Z is a Gaussian random variable with mean zero and variance $V^2 M^{(d/2)-4} (d/(4\pi))^{d/2}$. Thus the medium causes fluctuations in the long time tail which dominate the mean behavior. Such fluctuations can cause trouble in simulations and experiments. Namely, although it is true that one often effectively measures some space-average of the tail, the fluctuations may lead to significant errors, particularly because the amplitudes are typically small.

Acknowledgements

The author thanks H.
van Beijeren and J. Naudts for discussions.

References

Alder, B. J. and Wainwright, T. E. (1967). Velocity autocorrelations for hard spheres. Physical Review Letters 18, 988–990.
Alder, B. J. and Wainwright, T. E. (1969). Enhancement of diffusion by vortex-like motion of classical hard spheres. Journal of the Physics Society of Japan (Supplement) 26, 267–269.
Alder, B. J. and Wainwright, T. E. (1970). Decay of the velocity autocorrelation function. Physical Review Letters A1, 18–21.
Beijeren, H. van (1982). Transport properties of stochastic Lorentz models. Review of Modern Physics 54, 195–234.
Beijeren, H. van and Spohn, H. (1983). Transport properties of the one-dimensional stochastic Lorentz model. I: Velocity autocorrelation. Journal of Statistical Physics 31, 231–254.
Boon, J. P. and Bouiller, A. (1976). Experimental observation of 'long time tails'? Physical Review Letters 55A, 391–392.
Boon, J. P. and Yip, S. (1980). Molecular Hydrodynamics. McGraw-Hill, New York.
Bramson, M. and Durrett, R. (1988). Random walk in random environment: a counterexample? Communications in Mathematical Physics 119, 199–211.
Bricmont, J. and Kupiainen, A. (1991a). Renormalization group for diffusion in a random medium. Physical Review Letters 66, 1689–1692.
Bricmont, J. and Kupiainen, A. (1991b). Random walks in asymmetric random environments. Communications in Mathematical Physics 142, 345–420.
Bunimovich, L. A. and Sinai, Y. G. (1981). Statistical properties of Lorentz gas with periodic configuration of scatterers. Communications in Mathematical Physics 78, 479–497.
Carneiro, K. (1976). Velocity-autocorrelation function in liquids, deduced from neutron incoherent scattering results. The Physical Review A14, 517–520.
Cohen, E. G. D. (1992). New types of diffusion in lattice gas cellular automata. In Microscopic Simulations of Complex Hydrodynamic Phenomena (M. Maréschal and B. L. Holian, eds.), Plenum Press, New York, 137–152.
Cohen, E. G. D. (1993a). Fifty years of kinetic theory. Physica A194, 229–257.
Cohen, E. G. D. (1993b). Kinetic theory: understanding nature through collisions. American Journal of Physics 61, 524–533.
De Masi, A., Ferrari, P. A., Goldstein, S., and Wick, D. W. (1989). An invariance principle for reversible Markov processes. Applications to random motions in random environments. Journal of Statistical Physics 55, 787–855.
Denteneer, P. and Ernst, M. H. (1984). Diffusion in a system with static disorder. The Physical Review B29, 1755–1768.
Dorfman, J. R. and Cohen, E. G. D. (1970). Velocity correlation functions in two and three dimensions. Physical Review Letters 25, 1257–1260.
Dorfman, J. R. and Cohen, E. G. D. (1972). Velocity correlation functions in two and three dimensions: low density. The Physical Review A6, 776–790.
Dorfman, J. R. and Cohen, E. G. D. (1975). Velocity correlation functions in two and three dimensions: higher density. The Physical Review A12, 292–316.
Durrett, R. (1986). Multidimensional random walks in random environments with subclassical limiting behavior. Communications in Mathematical Physics 104, 87–102.
Hoef, M. A. van der and Frenkel, D. (1991). Evidence for faster-than-$t^{-1}$ decay of the velocity autocorrelation function in a 2D fluid. Physical Review Letters 66, 1591–1594.
Hoef, M. A. van der, Frenkel, D., and Ladd, A. J. C. (1991). Self-diffusion of colloidal particles in a two-dimensional suspension: Are deviations from Fick's law experimentally observable? Physical Review Letters 67, 3459–3462.
Hollander, F. den, Naudts, J., and Scheunders, P. (1992a). A long-time tail for random walk in random scenery. Journal of Statistical Physics 66, 1527–1555.
Hollander, F. den, Naudts, J., and Scheunders, P. (1992b). Invariance principle for the stochastic Lorentz lattice gas. Journal of Statistical Physics 66, 1583–1598.
Hollander, F. den, Naudts, J., and Redig, F. (1992c). Long-time tails in a random diffusion model. Journal of Statistical Physics 69, 731–762.
Kac, M. (1947). On the notion of recurrence in discrete time stochastic processes. Bulletin of the American Mathematical Society 53, 1002–1010.
Kasteleyn, P. W. (1987). Variations on a theme by Mark Kac. Journal of Statistical Physics 46, 811–827.
Kesten, H. (1986). Subdiffusive behavior of random walk on a random cluster. Annales de l'Institut Henri Poincaré (Probabilités et Statistique) 22, 425–487.
Kim, Y. W. and Matta, J. E. (1973). Long-time behavior of the velocity autocorrelation: a measurement. Physical Review Letters 31, 208–211.
Levesque, D. and Ashurst, W. T. (1974). Long-time behavior of the velocity autocorrelation function for a fluid of soft repulsive particles. Physical Review Letters 33, 277–280.
Morkel, C., Gronemeyer, C., Gläser, W., and Bosse, J. (1987). Experimental evidence for the long-time decay of the velocity autocorrelation in liquid sodium. Physical Review Letters 58, 1873–1876.
Perrin, J. (1909). Mouvement brownien et réalité moléculaire. Annales de Chimie et de Physique 18, 1–114.
Paul, G. L. and Pusey, P. N. (1981). Observation of a long-time tail in Brownian motion. Journal of Physics A14, 3301–3327.
Wood, W. W. (1973). A review of computer studies in the kinetic theory of fluids. In The Boltzmann Equation: Theory and Applications (E. G. D. Cohen and W. Thirring, eds.), Springer, Wien, 451–490.
Wood, W. W. (1975). Computer studies on fluid systems of hard core particles. In Fundamental Problems in Statistical Mechanics III (E. G. D. Cohen, ed.), North-Holland, Amsterdam, 331–388.
MULTISCALE ANALYSIS IN DISORDERED SYSTEMS: PERCOLATION AND CONTACT PROCESS IN A RANDOM ENVIRONMENT
ABEL KLEIN∗ Department of Mathematics University of California Irvine, CA 92717 U.S.A.
Abstract. Multiscale analysis is a technique used in the study of disordered systems in the presence of phenomena similar to Griffiths singularities. In this article we illustrate the use of a multiscale analysis by applying it to a very simple model: percolation in a random environment. We also describe the application of this technique to continuous time percolation and contact processes in random environments.

Key words: Disordered systems, random environment, Griffiths singularities, percolation, contact process.
∗ Partially supported by the NSF under grant DMS 92-08029.

1. Introduction

Multiscale analysis is a technique used in the study of disordered systems in the presence of phenomena similar to Griffiths singularities. Typically, the corresponding homogeneous system exhibits two different types of behavior (phases) that can be obtained by varying one or more parameters: an ordered phase characterized by the existence of long range order in the system, and a localized phase characterized by the decay of some correlation function. In the presence of disorder, each phase may manifest itself in infinitely many arbitrarily large regions in which the system's parameters will be in the range characteristic of these phases. As a consequence, even if the system is not ordered as a whole, there may be finite but arbitrarily large regions inside which the system is strongly correlated, giving rise to phenomena similar to Griffiths singularities (Griffiths 1969).

Such phenomena appear in random Schrödinger operators, in models of classical and quantum statistical mechanics with random parameters, percolation and contact processes in random environments, etc. In all these models the standard tools to obtain the exponential decay of correlation functions are typically expansions, which fail to converge due to the existence of those arbitrarily large regions inside which the system is strongly correlated (so terms may be large, there may be small divisors, etc.).

One way of dealing with such singular regions is to modify the desired expansion, by means of a multiscale analysis which provides good estimates on the typical distances between singular regions. An expansion is then performed outside the
singular regions, the contribution of the singular regions must be estimated, and it must be shown that the decay obtained outside the singular regions dominates their contribution.

Such a scheme was first used by Fröhlich and Spencer (1983) and Fröhlich, Martinelli, Scoppola, and Spencer (1985) to prove localization for random Schrödinger operators. In this approach singular regions are defined for each realization of the disorder, probabilistic estimates are established for the geometrical layout of the singular regions and for the behavior of the system inside the singular regions, and a modified expansion is performed in each realization with typical geometry and behavior.

A simpler multiscale analysis was introduced by von Dreifus (1987) and Spencer (1988), and used by von Dreifus and Klein (1989, 1991), also to prove localization for random Schrödinger operators. In this multiscale analysis a simple geometrical layout for the singular regions is predetermined and the probabilistic estimates are simple and elementary. These ideas were used by Campanino and Klein (1991), Campanino, Klein, and Perez (1991), Klein and Perez (1992) and Klein (1992, 1993) to study quantum spin systems, continuous time percolation and contact processes in random environments.

In this article we will first illustrate the use of such a multiscale analysis by applying it to a very simple model: percolation in a random environment. After that we will describe the application of this technique to continuous time percolation and contact processes in random environments (Klein 1992).

2. Percolation in a Random Environment

2.1. The Model and Results

Let us consider a bond percolation model on $\mathbb{Z}^d$ (e.g., Grimmett 1989), the occupation probability of a bond (or edge) $\langle x, y\rangle$ being denoted by $p_{x,y}$. The collection of bonds with endpoints in a subset Λ of $\mathbb{Z}^d$ will be denoted by B(Λ); we will call $p = \{p_{x,y} : \langle x, y\rangle \in B(\mathbb{Z}^d)\}$ the environment. The corresponding percolation probability measure will be denoted by $P_p$.

If $x, y \in \Lambda \subset \mathbb{Z}^d$, we will write $x \overset{\Lambda}{\longleftrightarrow} y$ for the event that x is connected to y by a path of occupied bonds in Λ; here and elsewhere we omit Λ in case $\Lambda = \mathbb{Z}^d$. The connectivity function in the region Λ is defined by
$$G_{p,\Lambda}(x, y) = P_p(\{x \overset{\Lambda}{\longleftrightarrow} y\}).$$
By a cluster we will mean a maximal collection of sites in $\mathbb{Z}^d$ all connected to each other by occupied bonds.

The environment is homogeneous if all $p_{x,y} = p$, in which case we will write $P_p$ and $G_p(x, y)$ for the corresponding percolation probability measure and connectivity function. In this case, there exists a critical probability $p_c = p_c(d)$, with $p_c(1) = 1$ and $0 < p_c(d) < 1$ for d ≥ 2, such that for $p > p_c$ we have percolation (i.e., existence of an infinite cluster with non-zero probability), while for $p < p_c$ there is no percolation and the connectivity function decays exponentially with the distance: there exist $m_p > 0$ and $C_p < \infty$ such that
$$G_p(x, y) \le C_p\,e^{-m_p |x - y|} \tag{1}$$
for all $x, y \in \mathbb{Z}^d$ (e.g., Grimmett 1989). An application of the FKG inequality shows that (1) implies the same estimate with $C_p = 1$.

But things are not so simple in an inhomogeneous environment p. Let $p_+ = \sup_{\langle x,y\rangle} p_{x,y}$ and $p_- = \inf_{\langle x,y\rangle} p_{x,y}$. Clearly, if $p_- > p_c$ we have percolation, and if $p_+ < p_c$ we have exponential decay of the connectivity function and no percolation. But if $p_- < p_c < p_+$ the situation becomes more complicated; in this case we need more information about the environment.

We now turn to random environments, and take the $p_{x,y}$, $\langle x, y\rangle \in B(\mathbb{Z}^d)$, to be independent identically distributed random variables. We will denote the underlying probability measure and expectation by $\mathbb{P}$ and $\mathbb{E}$. We now define $p_+$ and $p_-$ to be the supremum and infimum of the essential range of the random variable $p_{x,y}$, respectively. If $p_- < p_c < p_+$, the system will develop phenomena similar to Griffiths singularities. Indeed, with probability one (with respect to $\mathbb{P}$), if for any length scale L we look for hypercubes in $\mathbb{Z}^d$ with sides of length L, we will always find infinitely many in which all $p_{x,y} < p_c$, and infinitely many in which all $p_{x,y} > p_c$. In the latter the system will try to form large clusters, while in the former the clusters will like to be small relative to L for L sufficiently large.

But if $p_1 < p_c$ and $\mathbb{P}(p_{x,y} \le p_1)$ is sufficiently close to one, the infinitely many regions in which the system would like to form large clusters would be typically located at very large distances from each other, so we should expect that we would not have percolation and that the connectivity function should decay exponentially with the distance. But the usual expansions will not converge, so we will perform a modified expansion using a multiscale analysis. We will prove:

THEOREM 2.1. Let $p_1 < p_c$, $q = q(p_1) = \mathbb{P}(p_{x,y} \le p_1)$. There exists $q_1 = q_1(p_1) < 1$, such that if $q > q_1$ we can pick $m(q) > 0$, with $\lim_{q \nearrow 1} m(q) = m_{p_1}$, for which we have for $\mathbb{P}$-almost every environment p and every $x \in \mathbb{Z}^d$ that
$$G_p(x, y) \le C_{x,p}\,e^{-m(q)|x - y|} \tag{2}$$
for all $y \in \mathbb{Z}^d$, with $C_{x,p} < \infty$.

Remark. For bond percolation we do not need a multiscale analysis, since
$$\mathbb{E} P_p = P_{\bar p}, \tag{3}$$
where $\bar p = \mathbb{E} p_{x,y}$. Thus, if $\bar p < p_c$, we obtain (1) with $p = \bar p$. It follows, by an application of Chebyshev's inequality and the Borel–Cantelli Lemma, that for any $m < m_{\bar p}$ and every $x \in \mathbb{Z}^d$ we have (2) for all $y \in \mathbb{Z}^d$, with $C_{x,p} = C_{x,p}(m) < \infty$ for $\mathbb{P}$-almost every p.

In general, (3) (or something like it) does not hold. But since bond percolation is probably the simplest model in which we can perform a modified expansion using a multiscale analysis, we will use it for pedagogical reasons.
2.2. The Expansion for Homogeneous Environments

Before we can perform a modified expansion, we need an expansion to modify. The one we will use is based on the so-called Hammersley–Simon–Lieb inequality: Let $\Lambda \subset \Lambda' \subset \mathbb{Z}^d$, $x \in \Lambda$ and $y \in \Lambda' \setminus \Lambda$. In any environment p we have (from now on the subscript p will be omitted, unless needed for emphasis):
$$G_{\Lambda'}(x, y) \le \sum_{z \in \partial\Lambda} G_\Lambda(x, z)\,G_{\Lambda'}(z, y), \tag{4}$$
where $\partial\Lambda = \{z \in \Lambda : \langle z, w\rangle \in B(\mathbb{Z}^d) \text{ for some } w \notin \Lambda\}$. We will actually use the following consequence of (4), which we will call the HSL inequality: If $x \in \Lambda \subset \Lambda' \subset \mathbb{Z}^d$, $y \in \Lambda' \setminus \Lambda$, we have
$$G_{\Lambda'}(x, y) \le G_\Lambda(x, \partial)\,G_{\Lambda'}(z_1, y) \tag{5}$$
for some $z_1 \in \partial\Lambda$, where $G_\Lambda(x, \partial) = \sum_{z \in \partial\Lambda} G_\Lambda(x, z)$.

Given L > 0, $x \in \mathbb{Z}^d$, we set $|x| = |x|_\infty$ and
$$\Lambda_L(x) = \{y \in \mathbb{Z}^d : |y - x| \le L\}.$$

LEMMA 2.2. Let us consider percolation in a homogeneous environment. Suppose that at some length scale L we have
$$G_{\Lambda_L(0)}(0, \partial) \le \rho < 1. \tag{6}$$
Then for all $x, y \in \mathbb{Z}^d$ we have
$$G(x, y) \le C e^{-m|x - y|} \tag{7}$$
with $m = -L^{-1} \log \rho$ and $C = e^{mL}$.

Proof. By the homogeneity of the model, (6) holds with any x substituted for 0. So let $x, y \in \mathbb{Z}^d$, with $|x - y| \ge L$. For any positive integer $n < L^{-1}|x - y|$, we can apply (5) and (6) n times to obtain
$$G(x, y) \le \rho^n\,G(z, y) \tag{8}$$
for some $z \in \mathbb{Z}^d$. Since $G(z, y) \le 1$, we get
$$G(x, y) \le \rho^{(|x - y|/L) - 1}, \tag{9}$$
which is just (7).
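For completeness, the algebra behind "which is just (7)" can be spelled out as follows. This is only a reading aid using the constants $m = -L^{-1}\log\rho$ and $C = e^{mL}$ from the statement of Lemma 2.2; the second line records the trivial case of short distances, which the proof does not treat explicitly.

```latex
\begin{align*}
\rho^{(|x-y|/L)-1}
  &= \exp\!\Big[-mL\Big(\frac{|x-y|}{L} - 1\Big)\Big]
   = e^{mL}\,e^{-m|x-y|}
  && \text{since } \rho = e^{-mL}, \\
G(x,y) &\le 1 \le e^{mL}\,e^{-m|x-y|}
  && \text{if } |x-y| \le L .
\end{align*}
```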
PROPOSITION 2.3. Let us consider percolation in a homogeneous environment. Then there exists $\hat p > 0$ such that (1) holds for all $p < \hat p$.

Proof. For any scale L, we have
$$\lim_{p \to 0} G_{\Lambda_L(0)}(0, \partial) = 0. \tag{10}$$
If we let $\hat p(L) = \sup\{p : G_{\Lambda_L(0)}(0, \partial) < 1\}$ and take $\hat p = \sup_L \hat p(L)$, the proposition follows from the previous lemma. (It is easy to see that $\hat p \ge (2d - 1)^{-1}$.)
2.3. Geometrical Layout and Probabilistic Estimates

We start the multiscale analysis by establishing a priori a geometrical layout for the singular regions with appropriate probabilistic estimates. Let $L_k$, k = 0, 1, 2, ..., be an increasing sequence of length scales and let $p_1$ be as in Theorem 2.1. In a given environment p we will say that a site $x \in \mathbb{Z}^d$ is 0-regular if $p_{y,z} \le p_1$ for all $\langle y, z\rangle \in B(\Lambda_{L_0}(x))$. We now proceed inductively. For k = 1, 2, ... a site $x \in \mathbb{Z}^d$ is k-regular if there are no two (k − 1)-singular (i.e., not (k − 1)-regular) sites $y, z \in \Lambda_{L_k}(x)$ with $|y - z| \ge L_{k-1}$. We call $\Lambda \subset \mathbb{Z}^d$ a k-regular region if it contains no k-singular sites; otherwise we call Λ a k-singular region.

From now on we consider a random environment and introduce probabilities $P_k$ such that $\mathbb{P}(0 \text{ is } k\text{-singular}) \le P_k$ for all k = 0, 1, 2, .... In this case we will also have $\mathbb{P}(x \text{ is } k\text{-singular}) \le P_k$ for all x by the translation invariance of $\mathbb{P}$. A simple way to do it is the following scheme due to von Dreifus (1987).

LEMMA 2.4. Let $L_0 > 0$, r > 2d and 1 < α < 2r/(r + 2d). Set $L_{k+1} = L_k^{\alpha}$ and $P_k = L_k^{-r}$ for k = 0, 1, 2, .... Suppose we have
$$\mathbb{P}(0 \text{ is 0-singular}) \le P_0. \tag{11}$$
Then we also have
$$\mathbb{P}(0 \text{ is } k\text{-singular}) \le P_k \tag{12}$$
for all k = 1, 2, ... if $L_0$ is large enough, how large depending only on d, α and r.

Proof. We proceed by induction on k. Suppose (12) holds. Then
$$\mathbb{P}(0 \text{ is } (k+1)\text{-singular}) \le (2L_{k+1} + 1)^{2d}\,P_k^2 \tag{13}$$
since the events {x is k-singular} and {y is k-singular} are independent if $|x - y| \ge L_k$, and the number of such pairs is smaller than $(2L_{k+1} + 1)^{2d}$. But the right hand side is easily seen to be dominated by $P_{k+1}$ if $L_0$ is large enough.

From now on we will assume the setup given in Lemma 2.4. The following consequence is of particular interest, since it gives the typical geometric layout of singular sets.

PROPOSITION 2.5. Suppose (12) holds for all k = 0, 1, 2, .... Then for any b ≥ 1 and $x \in \mathbb{Z}^d$ let
$$k^b(x) = \sup\{k : \Lambda_{bL_{k+1}}(x) \text{ is a } k\text{-singular region}\}. \tag{14}$$
Then $k^b(x) < \infty$ with probability one.

Proof. It follows from Lemma 2.4 that
$$\mathbb{P}\big(\Lambda_{bL_{k+1}}(x) \text{ is a } k\text{-singular region}\big) \le (2bL_{k+1} + 1)^d\,P_k \le \frac{(3b)^d}{L_k^{\,r - \alpha d}}. \tag{15}$$
Since r > αd, the desired result follows from the Borel–Cantelli Lemma.
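The phrase "easily seen to be dominated by $P_{k+1}$ if $L_0$ is large enough" in the proof of Lemma 2.4 can be made concrete with a short numerical check. The sketch below is illustrative only: the function names and the sample parameters (d = 2, r = 5, α = 1.05) are assumptions, not values from the text. It works with log L to avoid overflow and tests the inequality $(2L_{k+1}+1)^{2d} P_k^2 \le P_{k+1}$ along the sequence $L_{k+1} = L_k^\alpha$.

```python
import math


def induction_step_holds(log_L, d, r, alpha):
    """Check (2 L^alpha + 1)^(2d) * L^(-2r) <= L^(-alpha*r), written in logs."""
    log_pairs = 2 * d * math.log(2.0 * math.exp(alpha * log_L) + 1.0)
    return log_pairs - 2 * r * log_L <= -alpha * r * log_L


def scales_ok(L0, d, r, alpha, n_scales=50):
    """Return True if the induction step of Lemma 2.4 holds on the first scales."""
    # hypotheses of Lemma 2.4
    assert r > 2 * d and 1 < alpha < 2 * r / (r + 2 * d)
    log_L = math.log(L0)
    for _ in range(n_scales):
        if not induction_step_holds(log_L, d, r, alpha):
            return False
        log_L *= alpha               # L_{k+1} = L_k ** alpha
    return True


if __name__ == "__main__":
    for L0 in (10, 100, 1000):
        print(L0, scales_ok(L0, d=2, r=5, alpha=1.05))
```

The condition α < 2r/(r + 2d) is exactly what makes the exponent 2dα − 2r + αr negative, so once the step holds at scale $L_0$ it keeps holding at all larger scales; the check shows how large $L_0$ has to be for given d, r and α.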
2.4. The Multiscale Expansion The main result of the deterministic part of the multiscale analysis is given by THEOREM 2.6. Let us fix an environment p and let m0 < mp1 . Then for any k-regular site x ∈ Zd , k = 0, 1, 2, . . ., we have GΛLk (x) (x, ∂) ≤ e−mk Lk ,
(16)
where mk is defined inductively for k = 1, 2, . . . by mk = mk−1 −
6m0 Lα−1 k−1
(17)
with mk ≥ d
log Lk , Lk
(18)
and $L_0$ sufficiently large, how large depending only on $d$, $p_1$, $m_0$ and $\alpha$.
Proof. The proof is by induction on k. If $k = 0$, since $G_\Lambda(x, y)$ is increasing in $\Lambda$, (16) follows immediately from (1) with $p = p_1$ in case $L_0$ is large enough, in which case we also have (18) for $k = 0$. Let us assume that (16), (17) and (18) hold for k; we will prove they hold for $k+1$. So let x be $(k+1)$-regular and $y \in \partial\Lambda_{L_{k+1}}(x)$; we will estimate $G_{\Lambda_{L_{k+1}}(x)}(x, y)$. We will proceed as in the proof of Lemma 2.2, with some modifications (if $\Lambda_{L_{k+1}}(x)$ were a k-regular region, the same proof would work). By the definition of $(k+1)$-regularity, there exists $u \in \Lambda_{L_{k+1}}(x)$ such that $\tilde\Lambda = \Lambda_{L_{k+1}}(x) \setminus \Lambda_{2L_k-1}(u)$ is a k-regular region. If $z \in \tilde\Lambda \cap \Lambda_{L_{k+1}-L_k-1}(x)$, we use (5) and the induction hypothesis (16) to get
\[ G_{\Lambda_{L_{k+1}}(x)}(z, y) \le e^{-m_k L_k}\, G_{\Lambda_{L_{k+1}}(x)}(z_1, y) \tag{19} \]
for some $z_1 \in \partial\Lambda_{L_k}(z)$. If $z \in \Lambda_{2L_k-1}(u)$ and $\Lambda_{2L_k}(u) \subset \Lambda_{L_{k+1}-L_k-1}(x)$, we use (5) with $\Lambda = \Lambda_{2L_k}(u)$ and $\Lambda' = \Lambda_{L_{k+1}}(x)$ to get
\[ G_{\Lambda_{L_{k+1}}(x)}(z, y) \le |\partial\Lambda_{2L_k}(u)|\, G_{\Lambda_{L_{k+1}}(x)}(z_1, y) \le L_k^d\, G_{\Lambda_{L_{k+1}}(x)}(z_1, y), \tag{20} \]
for some $z_1 \in \partial\Lambda_{2L_k}(u) \subset \tilde\Lambda \cap \Lambda_{L_{k+1}-L_k-1}(x)$, since we always have the obvious bound
\[ G_\Lambda(z, z') \le 1. \tag{21} \]
The second inequality in (20) is true for $L_0$ large. But now we can apply inequality (19) to $G_{\Lambda_{L_{k+1}}(x)}(z_1, y)$, obtaining
\[ G_{\Lambda_{L_{k+1}}(x)}(z, y) \le L_k^d\, e^{-m_k L_k}\, G_{\Lambda_{L_{k+1}}(x)}(z_2, y) \le G_{\Lambda_{L_{k+1}}(x)}(z_2, y), \tag{22} \]
for some $z_2 \in \partial\Lambda_{L_k}(z_1)$, the second inequality following from the induction hypothesis (18).
We now estimate $G_{\Lambda_{L_{k+1}}(x)}(x, y)$ as in the proof of Lemma 2.2, using either (19) or (22) as appropriate when possible, plus (21) at the end. We get
\[ G_{\Lambda_{L_{k+1}}(x)}(x, y) \le \exp\left\{-m_k L_k \left(\frac{L_{k+1} - 4L_k}{L_k} - 1\right)\right\}, \tag{23} \]
since we can use (19) at least N times, where N is the integer satisfying
\[ \frac{L_{k+1} - 4L_k}{L_k} - 1 \le N < \frac{L_{k+1} - 4L_k}{L_k}. \]
As (23) is valid for any $y \in \partial\Lambda_{L_{k+1}}(x)$, and $|\partial\Lambda_{L_{k+1}}(x)| \le L_{k+1}^d$ for $L_0$ large, we get
\[ G_{\Lambda_{L_{k+1}}(x)}(x, \partial) \le L_{k+1}^d\, e^{-m_k (L_{k+1} - 5L_k)} \le e^{-m_{k+1} L_{k+1}}, \tag{24} \]
with $m_{k+1}$ given by (17) satisfying (18), if $L_0$ is sufficiently large, how large depending only on $d$, $p_1$, $m_0$ and $\alpha$.
We can now prove Theorem 2.1. Given $m < m_{p_1}$ we set $m_0 = \tfrac12 (m + m_{p_1})$, and pick $L_0$ large enough so that we can apply Lemma 2.4 and Theorem 2.6, with
\[ m_\infty \equiv \lim_{k\to\infty} m_k = m_0 \left(1 - 6 \sum_{k=0}^{\infty} L_k^{-(\alpha-1)}\right) \ge \tfrac12 (m + m_0). \tag{25} \]
Since
\[ \mathbb{P}(0 \text{ is } 0\text{-singular}) \le (1 - q)(2L_0 + 1)^d, \tag{26} \]
we have (11) if $q > 1 - \{L_0^r (2L_0 + 1)^d\}^{-1}$. We can thus apply Proposition 2.5 with $b = (m_\infty - m)^{-1}$. So let $x \in \mathbb{Z}^d$; for any $y \in \mathbb{Z}^d$ we can find a unique k such that
\[ bL_k \le |y - x| < bL_{k+1}. \tag{27} \]
If $k > k_b(x)$, we have that $\Lambda_{bL_{k+1}}(x)$ is a k-regular region, so the proof of Lemma 2.2 applies inside $\Lambda_{bL_{k+1}}(x)$, with $L = L_k$ and $\rho = e^{-m_k L_k}$, yielding
\[ G(x, y) \le e^{-m_k |y-x| + m_k L_k} \le e^{-(m_k - b^{-1})|y-x|} \le e^{-m|y-x|}. \tag{28} \]
This finishes the proof of Theorem 2.1.
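The recursion (17) and the limit (25) are easy to tabulate numerically. The following is only a sketch: it assumes the scales grow as $L_{k+1} = L_k^\alpha$ (the setup of Lemma 2.4, which is not reproduced here), and the function names and sample values are hypothetical.

```python
import math

def multiscale_rates(L0, alpha, m0, steps):
    """Return [(L_k, m_k)] for k = 0..steps.

    Assumes L_{k+1} = L_k**alpha (an assumption about the setup of Lemma 2.4)
    and applies recursion (17): m_k = m_{k-1} - 6*m0 / L_{k-1}**(alpha-1).
    Keep `steps` small: L_k grows doubly exponentially and overflows floats.
    """
    scales = [(float(L0), float(m0))]
    for _ in range(steps):
        L_prev, m_prev = scales[-1]
        m_next = m_prev - 6.0 * m0 / L_prev ** (alpha - 1.0)
        scales.append((L_prev ** alpha, m_next))
    return scales

def satisfies_18(L, m, d):
    """Condition (18): m_k >= d * log(L_k) / L_k."""
    return m >= d * math.log(L) / L

def partial_sum_rate(scales, m0, alpha, k):
    """m0 * (1 - 6 * sum_{j<k} L_j**-(alpha-1)), the partial sum behind (25)."""
    return m0 * (1.0 - 6.0 * sum(L ** -(alpha - 1.0) for L, _ in scales[:k]))

if __name__ == "__main__":
    for k, (L, m) in enumerate(multiscale_rates(1e4, 1.5, 1.0, 4)):
        print(k, L, m, satisfies_18(L, m, d=2))
```

With a large initial scale the loss of rate between scales is summable, which is the content of (25): the rates $m_k$ settle quickly to a positive limit.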
3. Continuous Time Percolation and Contact Process in a Random Environment
Let us consider the inhomogeneous continuous time percolation process on $\mathbb{Z}^d \times \mathbb{R}$ defined as follows. Let $\delta = \{\delta(x) > 0;\ x \in \mathbb{Z}^d\}$ and $\lambda = \{\lambda(x, y) > 0 : \langle x, y\rangle \in B(\mathbb{Z}^d)\}$, where $B(\mathbb{Z}^d)$ denotes the collection of bonds (or edges) $\langle x, y\rangle$ in $\mathbb{Z}^d$, i.e., unoriented pairs of sites $x, y \in \mathbb{Z}^d$ with $\|x - y\|_2 = 1$. Along each vertical line $\{x\} \times \mathbb{R}$ we put cuts at times given by a Poisson point process with intensity $\delta(x)$, and between each pair of adjacent vertical lines $\{x\} \times \mathbb{R}$ and $\{y\} \times \mathbb{R}$, $\langle x, y\rangle \in B(\mathbb{Z}^d)$, we place bridges at times
given by a Poisson point process with intensity $\lambda(x, y)$. All these Poisson processes are independent of each other. Given a realization of all these Poisson processes (i.e., a locally finite collection of cuts and bridges), we consider the subset of $\mathbb{R}^{d+1}$ we obtain by taking $\mathbb{Z}^d \times \mathbb{R}$, removing all cuts, and adding all bridges, and decompose it into connected components. We call these connected components clusters. We say that $(x, t), (y, s) \in \mathbb{Z}^d \times \mathbb{R}$ are connected if they belong to the same cluster; in this case we write $(x, t) \leftrightarrow (y, s)$. Notice that this happens if and only if there is a path from $(x, t)$ to $(y, s)$ made out of uncut segments of vertical lines and bridges.
More generally, if $W \subset \mathbb{R}^{d+1}$ we can substitute $W$ for $\mathbb{R}^{d+1}$ in the above considerations. Thus $(x, t)$ is connected to $(y, s)$ in $W$, and we write $(x, t) \leftrightarrow_W (y, s)$, if the connection lies in $W$. Given subsets A and B of $\mathbb{Z}^d \times \mathbb{R}$, we say that $A \leftrightarrow_W B$ if there exist $(x, t) \in A$, $(y, s) \in B$, such that $(x, t) \leftrightarrow_W (y, s)$. If $W = \mathbb{R}^{d+1}$ we omit $W$.
Given such an inhomogeneous environment $\delta, \lambda$, we will denote by $Q = Q_{\delta,\lambda}$ the percolation probability measure, i.e., the probability measure on the space of configurations (locally finite collections of cuts and bridges). The connectivity function in the region $W \subset \mathbb{R}^{d+1}$ is then defined by
\[ G_W^{\delta,\lambda}((x, t), (y, s)) = Q_{\delta,\lambda}\{(x, t) \leftrightarrow_W (y, s)\}. \tag{29} \]
We may omit $\delta, \lambda$ from the notation. If $\Lambda \subset \mathbb{Z}^d$, $I \subset \mathbb{Z}$, we will write $G_{\Lambda\times I}$ for $G_{\tilde\Lambda\times I}$, where $\tilde\Lambda = \Lambda \cup \{[x, y];\ x, y \in \Lambda,\ \|x - y\|_2 = 1\}$. We will also write $|(x, t)| = \|(x, t)\|_\infty = \max\{\|x\|_\infty, |t|\}$.
Notice that the scaling $(\delta, \lambda) \to (a\delta, a\lambda)$, where $a > 0$, does not change the behavior of such systems; it corresponds to a simple re-scaling of the time variable. This inhomogeneous percolation process appears as the limit of the percolation processes on $\mathbb{Z}^d \times n^{-1}\mathbb{Z}$ studied by Campanino, Klein, and Perez (1991). The homogeneous version was independently studied by Bezuidenhout and Grimmett (1991).
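The construction of cuts, bridges and clusters described above can be sketched in code for a finite window. This is a minimal illustration only, assuming $d = 1$, sites $0, \ldots, n-1$, the time interval $[0, t_{\max})$ and nearest-neighbour bonds $\langle x, x+1\rangle$; all function names are hypothetical. Uncut segments are the nodes of a union-find structure and each bridge merges the two segments it touches.

```python
import bisect
import random

def poisson_times(rate, t_max, rng):
    """Sorted points of a Poisson process of intensity `rate` on [0, t_max)."""
    times = []
    t = rng.expovariate(rate)
    while t < t_max:
        times.append(t)
        t += rng.expovariate(rate)
    return times

def sample_environment(delta, lam, t_max, seed=0):
    """delta[x]: cut intensity on line x; lam[x]: bridge intensity on bond <x, x+1>."""
    rng = random.Random(seed)
    cuts = [poisson_times(rate, t_max, rng) for rate in delta]
    bridges = [(x, s) for x, rate in enumerate(lam)
               for s in poisson_times(rate, t_max, rng)]
    return cuts, bridges

def build_clusters(cuts, bridges):
    """Return connected(p, q) for space-time points p = (x, t), q = (y, s).

    cuts[x] is the sorted list of cut times on line x; a bridge (x, s) joins
    lines x and x + 1 at time s. Connections are those inside the window.
    """
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def segment(x, t):
        # Index of the uncut segment of line x that contains time t.
        return (x, bisect.bisect_right(cuts[x], t))

    for x, s in bridges:
        a, b = find(segment(x, s)), find(segment(x + 1, s))
        parent[a] = b

    def connected(p, q):
        return find(segment(*p)) == find(segment(*q))

    return connected
```

Averaging `connected(p, q)` over many independent samples gives a Monte Carlo estimate of the connectivity function (29) restricted to the window.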
In a homogeneous environment, i.e., $\delta(x) \equiv \delta > 0$, $\lambda(x, y) \equiv \lambda > 0$, we always have a non-trivial phase diagram. The relevant parameter is $\xi = \lambda/\delta$. In any dimension $d = 1, 2, \ldots$ there exists $0 < \xi_c(d) < \infty$, such that if $\xi < \xi_c(d)$ the connectivity function decays exponentially, i.e.,
\[ G((x, t), (y, s)) \le C e^{-m|(x-y,\,t-s)|}, \tag{30} \]
for some $m > 0$, $C < \infty$, and if $\xi > \xi_c(d)$ there is percolation (Bezuidenhout and Grimmett 1991, Campanino, Klein, and Perez 1991). If $d = 1$, we actually have $\xi_c(1) = 1$ (Bezuidenhout and Grimmett 1991).
Again life is not so simple in an inhomogeneous environment $(\delta, \lambda)$. Let
\[ \underline{\xi}(x) = \frac{1}{\delta(x)} \min\{\lambda(x, y) : y \text{ satisfying } \|x - y\|_2 = 1\}, \]
\[ \overline{\xi}(x) = \frac{1}{\delta(x)} \max\{\lambda(x, y) : y \text{ satisfying } \|x - y\|_2 = 1\}. \]
It follows from the monotonicity properties of $G((x, t), (y, s))$ with respect to each $\delta(x)$ and $\lambda(x, y)$, that if $\sup_x \overline{\xi}(x) < \xi_c$ we have exponential decay, whereas if
$\inf_x \underline{\xi}(x) > \xi_c$ we have percolation. The interesting non-trivial cases are thus when the above conditions are not satisfied, in particular when we can find sites where $\overline{\xi}(x) < \xi_c$ and sites where $\underline{\xi}(x) > \xi_c$, so the system exhibits the type of behavior associated with Griffiths singularities. This typically happens in disordered environments.
If we consider the oriented percolation process with the cuts as above, but replacing the bridges by one-way bridges, i.e., each Poisson process of bridges is replaced by two independent Poisson processes with intensities $\lambda_1(x, y) > 0$ and $\lambda_2(x, y) > 0$, the first giving one-way bridges from $\{x\} \times \mathbb{R}$ to $\{y\} \times \mathbb{R}$, and the second from $\{y\} \times \mathbb{R}$ to $\{x\} \times \mathbb{R}$, and uncut segments can only be traversed in the direction of increasing time, then we obtain the graphical representation of the inhomogeneous contact process (see Bezuidenhout and Grimmett 1991). The above considerations apply to the contact process (e.g., Liggett 1985). Given the inhomogeneous contact process with $\delta, \lambda_1, \lambda_2$, we can consider the percolation process with intensities $\delta, \lambda$, where $\lambda(x, y) = \lambda_1(x, y) + \lambda_2(x, y)$ (i.e., we make all bridges two-way). Then clearly no percolation (i.e., no infinite cluster) in the percolation process implies extinction (i.e., no infinite oriented cluster) in the contact process, and survival of the contact process implies percolation in the percolation process.
We will consider these percolation and contact processes in a random environment by taking the $\delta = \{\delta(x) : x \in \mathbb{Z}^d\}$ and $\lambda = \{\lambda(x, y) : \langle x, y\rangle \in B(\mathbb{Z}^d)\}$ ($\delta, \lambda_1, \lambda_2$ for the contact process) to be independent families of independent identically distributed strictly positive random variables. We will use $\mathbb{P}$ and $\mathbb{E}$ to denote the probability measure and expectation associated with these random variables. We will also use $\delta$ and $\lambda$ for representative random variables. In such environments we will have to deal with phenomena similar to Griffiths singularities.
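The two local ratios $\underline{\xi}(x)$ and $\overline{\xi}(x)$ and the monotonicity criterion can be illustrated on a small example. The sketch below assumes $d = 1$, so that $\xi_c(1) = 1$ as quoted above, and applies the sup/inf criterion to a finite chain of sites purely for illustration (the actual statements concern all of $\mathbb{Z}^d$); the function names are hypothetical.

```python
def xi_bounds(delta, lam):
    """Lower and upper local ratios on a chain of sites 0..n-1.

    delta[x] is the cut intensity at site x, lam[x] the bridge intensity on
    the bond <x, x+1>. End sites of the finite chain have a single neighbour.
    """
    n = len(delta)
    lower, upper = [], []
    for x in range(n):
        rates = []
        if x > 0:
            rates.append(lam[x - 1])
        if x < n - 1:
            rates.append(lam[x])
        lower.append(min(rates) / delta[x])
        upper.append(max(rates) / delta[x])
    return lower, upper

def classify(delta, lam, xi_c=1.0):
    """Apply the monotonicity criterion; xi_c = 1 is the d = 1 value from the text."""
    lower, upper = xi_bounds(delta, lam)
    if max(upper) < xi_c:
        return "exponential decay"
    if min(lower) > xi_c:
        return "percolation"
    return "undetermined"  # mixed environment: the Griffiths-type regime

if __name__ == "__main__":
    print(classify([2.0, 2.0, 2.0], [1.0, 1.0]))
    print(classify([1.0, 4.0, 1.0], [2.0, 2.0]))
```

The "undetermined" branch is exactly the case the multiscale analysis is designed for: some sites look subcritical and others supercritical.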
But the situation is very different from the simple one we encountered in the previous section, since now the disorder is frozen in the continuous time direction. If P(ξ(x) > ξc ) > 0, the connectivity function cannot have exponential decay in the time direction, since given any exponential rate of decay we can find, with probability one, singular regions inside which the connectivity function exhibits a slower rate of decay in the time direction than the given rate. The one-dimensional (d = 1) contact process in a random environment was studied by Liggett (1991, 1992), who gave conditions on the probability distributions for extinction and survival. Another one-dimensional survival result is due to Bramson, Durrett, and Schonmann (1991). Andjel (1993) exhibited examples of survival in two or more dimensions. Campanino, Klein, and Perez (1991) (see their Theorem 4.1) used a multiscale analysis to give the first proof of no percolation and decay of the connectivity function for the multidimensional continuous time percolation process in a random environment, thus also proving extinction for the multidimensional contact process in a random environment. Klein (1992) extended their proof to a larger class of probability distributions. Campanino, Klein, and Perez (1991) also proved percolation for the continuous time percolation process in a random environment for d ≥ 2. Aizenman, Klein, and Newman (1993) gave a proof of percolation in the one-dimensional case.
Continuous time percolation and contact processes were studied in a quasiperiodic environment by Jitomirskaya and Klein (1993), who also used a multiscale analysis to show no percolation and decay of the connectivity function.
Aizenman, Klein, and Newman (1993) found probability distributions (in any dimension) under which there is always percolation. Their result is:
PROPOSITION 3.1. Consider the continuous time percolation process on $\mathbb{Z}^d \times \mathbb{R}$ in a random environment $\delta, \lambda$. Suppose either $d = 1$, $\mathbb{E}(\delta) + \mathbb{E}(\lambda^{-1}) < \infty$, and
\[ \lim_{u\to\infty} \frac{u}{\log u}\, \mathbb{P}\left(\log\frac{1}{\delta} > u\right) = \infty; \tag{31} \]
or $d \ge 2$, $\lambda \ge \lambda_0 > 0$ and $\delta \le \delta_0 < \infty$ for some $\lambda_0, \delta_0$, and
\[ \lim_{u\to\infty} u^d\, \mathbb{P}\left(\log\frac{1}{\delta} > u\right) = \infty. \tag{32} \]
Then there is percolation with probability one.
Proposition 3.1 tells us that conditions on the probability distributions of $\delta$ and $\lambda$ are needed for absence of percolation. Unlike the situation in Theorem 2.1, it is not enough to require
\[ \mathbb{P}\left(\overline{\xi}(x) \le \xi_1\right) \text{ sufficiently close to 1 for some } \xi_1 < \xi_c. \tag{33} \]
In fact, Proposition 3.1 suggests the conjecture that a sufficient additional condition for no percolation should be
\[ \mathbb{E}\left([\log(1 + \lambda)]^\beta\right) + \mathbb{E}\left([\log(1 + \delta^{-1})]^\beta\right) < \infty \tag{34} \]
for some $\beta > d$.
The following result is proven in Klein (1992). Notice the manifestation of the Griffiths-type singularities in the less than exponential rate of decay in the time direction in (38).
THEOREM 3.2. Let $d = 1, 2, \ldots$, and consider the continuous time percolation process on $\mathbb{Z}^d \times \mathbb{R}$ in a random environment $\delta, \lambda$. Let
\[ \beta > 2d^2 \left(1 + \sqrt{1 + \frac{1}{d}} + \frac{1}{2d}\right) \tag{35} \]
and suppose
\[ \Gamma = \max\left\{\mathbb{E}\left([\log(1 + \lambda)]^\beta\right),\ \mathbb{E}\left([\log(1 + \delta^{-1})]^\beta\right)\right\} < \infty. \tag{36} \]
Then we can find $\epsilon = \epsilon(d, \beta, \Gamma) > 0$ such that, if
\[ \mathbb{E}\left([\log(1 + (\lambda/\delta))]^\beta\right) < \epsilon, \tag{37} \]
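we have no percolation with probability one. In fact, there exists $q(\beta, d) > 1$ such that, given $m > 0$ and $q$ with $1 < q < q(\beta, d)$, we can find $\epsilon = \epsilon(d, \beta, \Gamma, m, q) > 0$ for (37) for which we have, with probability one, that for every $x \in \mathbb{Z}^d$,
\[ G((x, t), (y, s)) \le C_{x,\delta,\lambda} \exp\left\{-m \left|\left(x - y,\ [\log(1 + |t - s|)]^q\right)\right|\right\} \tag{38} \]
for all $y \in \mathbb{Z}^d$, $t, s \in \mathbb{R}$, with $C_{x,\delta,\lambda} < \infty$.
Condition (37) is only used to get (33) and may be replaced by it. Notice also that
\[ \mathbb{E}\left([\log(1 + (\lambda/\delta))]^\beta\right) \le 2^{\beta+1}\, \tilde\Gamma, \tag{39} \]
where
\[ \tilde\Gamma \equiv \min_{\gamma > 0} \max\left\{\mathbb{E}\left([\log(1 + \gamma\lambda)]^\beta\right),\ \mathbb{E}\left([\log(1 + (\gamma\delta)^{-1})]^\beta\right)\right\} \le \Gamma. \tag{40} \]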
Theorem 3.2 immediately implies the following result about the contact process in a random environment:
COROLLARY 3.3. Let $d = 1, 2, \ldots$, and consider the d-dimensional contact process in a random environment $\delta, \lambda_1, \lambda_2$. Let $\beta$ satisfy (35) and $\delta, \lambda$ satisfy (36), where $\lambda = \lambda_1 + \lambda_2$. Then there exists $\epsilon = \epsilon(d, \beta, \Gamma) > 0$ such that if (37) holds, the contact process becomes extinct with probability one.
Theorem 3.2 is proven by a multiscale analysis, albeit not as simple as the one in Section 2. In this article we will only describe the results of the multiscale analysis and refer to Klein (1992) for the full proof.
For $X = (x, t) \in \mathbb{Z}^d \times \mathbb{R}$, $L > 0$, $T > 0$, let $B_{L,T}(X) = \Lambda_L(x) \times [t - T, t + T]$. We fix $\nu$ such that $0 < \nu < 1$, and set $B_L(X) = B_{L,\,e^{L^\nu}}(X)$. For $Y \in B_{L,T}(X)$ let
\[ G_{B_{L,T}(X)}(Y, \partial) = \sum_{Z \in \Lambda_L(x)\times\{t-T,\,t+T\}} G_{B_{L,T}(X)}(Y, Z) + \sum_{\langle z, z'\rangle \in \partial\Lambda_L(x)} \lambda(z, z') \int_{t-T}^{t+T} G_{B_{L,T}(X)}(Y, (z, s))\, ds, \tag{41} \]
where now
\[ \partial\Lambda = \left\{\langle z, z'\rangle \in B(\mathbb{Z}^d) : z \in \Lambda,\ z' \notin \Lambda\right\}. \]
If $Y \in B_{L,T}(X) \subset W$ and $Z \in W \setminus B_{L,T}(X)$, we have the following consequence of the HSL inequality:
\[ G_W(Y, Z) \le G_{B_{L,T}(X)}(Y, \partial)\, G_W(Z_1, Z) \tag{42} \]
for some $Z_1$ either in $\Lambda_L(x) \times \{t - T, t + T\}$ or of the form $(z_1, s)$ with $z_1 \in \Lambda_L(x)$, $\langle z_1, z_1'\rangle \in \partial\Lambda_L(x)$ for some $z_1'$, and $s \in [t - T, t + T]$.
In a homogeneous environment we may use (42) to obtain (30), as in Lemma 2.2. As before we need to develop a multiscale analysis to extend the argument to a random environment. Given $m > 0$, $L > 1$, a site $x \in \mathbb{Z}^d$ will be called $(m, L)$-regular if
\[ G_{B_L((x,0))}((x, 0), \partial) \le e^{-mL}. \tag{43} \]
The main result of the multiscale analysis is given by
THEOREM 3.4. Let $d = 1, 2, \ldots$, and consider the continuous time percolation process on $\mathbb{Z}^d \times \mathbb{R}$ in the random environment $\delta, \lambda$. Let
\[ \beta > 2d^2 \left(1 + \sqrt{1 + \frac{1}{d}} + \frac{1}{2d}\right), \tag{44} \]
and suppose (36) holds. Set
\[ \alpha = d + \sqrt{d^2 + d}, \]
and choose $\nu$ and $p$ such that
\[ \frac{\alpha d(\alpha + \beta + 1)}{\beta(\alpha - d + \alpha d)} < \nu < 1, \qquad \alpha d < p < \frac{\beta\left(\nu(\alpha - d + \alpha d) - \alpha d\right) - \alpha d}{\alpha}. \tag{45} \]
Let $m_0$ and $m_\infty$ be given, and satisfy $0 < m_\infty < m_0$. There exists a number $L = L(d, \beta, \Gamma, \nu, p, m_0, m_\infty) < \infty$ such that, if for some $L_0 > L$ we have
\[ \mathbb{P}\left(0 \text{ is } (m_0, L_0)\text{-regular}\right) \ge 1 - L_0^{-p}, \tag{46} \]
then, setting $L_{k+1} = L_k^\alpha$, $k = 0, 1, \ldots$, we also have
\[ \mathbb{P}\left(0 \text{ is } (m_\infty, L_k)\text{-regular}\right) \ge 1 - L_k^{-p} \tag{47} \]
for all $k = 0, 1, 2, \ldots$.
Inequality (45) can be satisfied because of (44). Indeed, let
\[ f(\theta) = \frac{\theta d(\theta + 1)}{\theta - d} \quad \text{for } \theta > d. \]
It is easy to see that $f(\theta)$ attains its minimum at $\theta = \alpha \equiv d + \sqrt{d^2 + d}$, and $f(\alpha) = 2d^2\left(1 + \sqrt{1 + d^{-1}} + (2d)^{-1}\right)$. Thus (44) just says that $\beta > f(\alpha)$, and (45) says that we picked $0 < \nu < 1$ and $p > \alpha d$ such that
\[ \beta > \frac{\alpha(p + d)}{\alpha\nu - d(\alpha(1 - \nu) + \nu)} = \frac{\alpha(p + d)}{\nu(\alpha - d + \alpha d) - \alpha d}. \]
The proof of Theorem 3.4 proceeds by induction on k. In addition to the HSL inequality, the proof uses the Harris–FKG and the van den Berg–Kesten inequalities.
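The algebra linking (44) and (45) can be checked numerically. The sketch below, with hypothetical function names, evaluates $\alpha$, the function $f$, the threshold in (44), and the endpoints of the intervals in (45); it is a sanity check of the formulas as reconstructed here, not part of the proof.

```python
import math

def alpha_of(d):
    """alpha = d + sqrt(d^2 + d), the minimiser of f."""
    return d + math.sqrt(d * d + d)

def f(theta, d):
    """f(theta) = theta * d * (theta + 1) / (theta - d), defined for theta > d."""
    return theta * d * (theta + 1.0) / (theta - d)

def beta_threshold(d):
    """Right-hand side of (44)."""
    return 2.0 * d * d * (1.0 + math.sqrt(1.0 + 1.0 / d) + 1.0 / (2.0 * d))

def nu_lower(d, beta):
    """Lower endpoint for nu in (45)."""
    a = alpha_of(d)
    return a * d * (a + beta + 1.0) / (beta * (a - d + a * d))

def p_upper(d, beta, nu):
    """Upper endpoint for p in (45)."""
    a = alpha_of(d)
    return (beta * (nu * (a - d + a * d) - a * d) - a * d) / a

if __name__ == "__main__":
    d, beta, nu = 1, 10.0, 0.95
    print("alpha          =", alpha_of(d))
    print("f(alpha)       =", f(alpha_of(d), d), "threshold (44) =", beta_threshold(d))
    print("nu must exceed  ", nu_lower(d, beta))
    print("p must lie in   ", (alpha_of(d) * d, p_upper(d, beta, nu)))
```

At $\nu$ equal to its lower endpoint the interval for $p$ collapses to the single point $\alpha d$, which is why the lower bound on $\nu$ in (45) is exactly the condition that the interval for $p$ is non-empty.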
The main difficulty is in estimating the decay of the connectivity function in the time direction, inside the singular regions. Such an estimate is needed since the singular regions are cylinders infinitely extended in the time direction, as the disorder is frozen in that direction. Using (36) it is shown that a suitable estimate holds in each scale with good probability.
Since we can obtain (46) by taking $\epsilon$ in (37) sufficiently small, Theorem 3.2 follows from Theorem 3.4 by
THEOREM 3.5. Let $d = 1, 2, \ldots$, and consider the continuous time percolation process on $\mathbb{Z}^d \times \mathbb{R}$ in the random environment $\delta, \lambda$. Let $\nu, \alpha, p, m_\infty, L_0$ be such that $0 < \nu < 1$, $\alpha > 1$, $p > \alpha d$, $m_\infty > 0$, $L_0 > 1$. Set $L_{k+1} = L_k^\alpha$, $k = 0, 1, 2, \ldots$. Suppose
\[ \mathbb{P}\left(0 \text{ is } (m_\infty, L_k)\text{-regular}\right) \ge 1 - L_k^{-p} \tag{48} \]
for all $k = 0, 1, 2, \ldots$. Then, for any $m$ such that $0 < m < m_\infty$, we have, with probability one, that for every $x \in \mathbb{Z}^d$,
\[ G((x, t), (y, s)) \le C_x \exp\left\{-m \left|\left(x - y,\ [\log(1 + |t - s|)]^{1/\nu}\right)\right|\right\} \]
for all $y \in \mathbb{Z}^d$, $t, s \in \mathbb{R}$, with $C_x = C_x(\delta, \lambda, m) < \infty$.

References
Aizenman, M., Klein, A., and Newman, C. M. (1993). Percolation methods for disordered quantum Ising models. To appear in Phase Transitions: Mathematics, Physics, Biology, . . . (R. Kotecký, ed.), World Scientific.
Andjel, E. (1993). Survival of multidimensional contact process in random environments. Boletim da Sociedade Brasileira de Matemática 23, 109–119.
Bezuidenhout, C. and Grimmett, G. (1991). Exponential decay for subcritical contact and percolation processes. Annals of Probability 19, 984–1009.
Bramson, M., Durrett, R., and Schonmann, R. H. (1991). The contact process in a random environment. Annals of Probability 19, 960–983.
Campanino, M. and Klein, A. (1991). Decay of two-point functions for (d + 1)-dimensional percolation, Ising and Potts model with d-dimensional disorder. Communications in Mathematical Physics 135, 483–497.
Campanino, M., Klein, A., and Perez, J. F. (1991). Localization in the ground state of the Ising model with a random transverse field. Communications in Mathematical Physics 135, 499–515.
Dreifus, H. von (1987). On the effects of randomness in ferromagnetic models and Schrödinger operators. Ph.D. Thesis, New York University.
Dreifus, H. von and Klein, A. (1989). A new proof of localization in the Anderson tight binding model. Communications in Mathematical Physics 124, 285–299.
Dreifus, H. von and Klein, A. (1991). Localization for random Schrödinger operators with correlated potentials. Communications in Mathematical Physics 140, 133–147.
Fröhlich, J. and Spencer, T. (1983). Absence of diffusion in the Anderson tight binding model for large disorder or low energy. Communications in Mathematical Physics 88, 151–184.
Fröhlich, J., Martinelli, F., Scoppola, E., and Spencer, T. (1985). Constructive proof of localization in the Anderson tight binding model. Communications in Mathematical Physics 101, 21–46.
Griffiths, R. (1969). Non-analytic behavior above the critical point in a random Ising ferromagnet. Physical Review Letters 23, 17–19.
Grimmett, G. (1989). Percolation. Springer-Verlag, New York.
Jitomirskaya, S. and Klein, A. (1993). Ising model in a quasi-periodic transverse field, percolation and contact processes in quasi-periodic environments. Journal of Statistical Physics, to appear.
Klein, A. (1992). Extinction of contact and percolation processes in a random environment. Annals of Probability, to appear.
Klein, A. (1993). Disordered quantum spin processes, percolation and contact processes. To appear in Phase Transitions: Mathematics, Physics, Biology, . . . (R. Kotecký, ed.), World Scientific.
Klein, A. and Perez, J. P. (1992). Localization in the ground state of a disordered array of quantum rotators. Communications in Mathematical Physics 147, 241–252.
Liggett, T. M. (1985). Interacting Particle Systems. Springer-Verlag, New York.
Liggett, T. M. (1991). Spatially inhomogeneous contact processes. In Spatial Stochastic Processes. A Festschrift in honor of the Seventieth Birthday of Ted Harris, Birkhäuser, Boston, 105–140.
Liggett, T. M. (1992). The survival of one dimensional contact processes in random environments. Annals of Probability 20, 696–723.
Spencer, T. (1988). Localization for random and quasi-periodic potentials. Journal of Statistical Physics 51, 1009–1019.
GEOMETRIC REPRESENTATION OF LATTICE MODELS AND LARGE VOLUME ASYMPTOTICS
ROMAN KOTECKÝ*
Centre for Theoretical Study, Charles University, Celetná 20, 116 36 Praha 1, Czech Republic
Abstract. The finite volume asymptotics of lattice models near first-order phase transitions is discussed. The tool for the description of finite size effects is (a version of) the Pirogov–Sinai theory. Its main ideas are reviewed and illustrated on simple models. Key words: Lattice models, phase transitions, finite size effects, Pirogov–Sinai theory.
1. Introduction
There is a vast inventory of lattice models providing examples of first-order phase transitions and coexistence of phases. It became clear already from the first proof of existence of such a transition for the Ising model by the Peierls argument [P, G, D1] that a convenient tool for a study of the coexistence of phases is a representation in terms of probabilities of configurations of geometric objects, namely contours. This approach has been systematically developed in Pirogov–Sinai theory [PS, S]. At present it is the main technique for the study of phase transitions for models with no symmetry between coexisting phases. Here, I will discuss its use for the derivation of the asymptotic behaviour, as the size of the system grows, in the region of the first-order phase transitions [BK1, BK2, BKM, BI1–3].
Even though the original papers by Pirogov and Sinai were published almost twenty years ago, the theory is not widely known outside a rather restricted group of mathematical physicists. Thus, my first aim in this lecture is to present a simple-minded introduction to the Pirogov–Sinai theory taking into account some latest developments. I will not attempt to develop the theory in its full generality. Instead, only the main principles will be explained and the theory will be illustrated on the simplest examples of models that still capture the general features.
*Also in the Department of Theoretical Physics, Charles University.
Fig. 1. [Figure not reproduced: a two-dimensional configuration of plus and minus spins under plus boundary conditions.]
As a starting point, let us recall a couple of banal facts about the standard ferromagnetic Ising model. The probability of a configuration $\sigma_\Lambda \equiv \{\sigma_i\}_{i\in\Lambda}$, $\sigma_i \in \{-1, +1\}$, on a finite lattice $\Lambda \subset \mathbb{Z}^d$, $d \ge 2$, and under the boundary conditions $\bar\sigma_{\Lambda^c} = \{\bar\sigma_i\}_{i\in\mathbb{Z}^d\setminus\Lambda}$, is given by
\[ \mu_\Lambda(\sigma_\Lambda \mid \bar\sigma_{\Lambda^c}) = \frac{e^{-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma_{\Lambda^c})}}{Z_\Lambda(\bar\sigma_{\Lambda^c})}, \tag{1} \]
where the energy is$^1$
\[ H_\Lambda(\sigma_\Lambda \mid \bar\sigma_{\Lambda^c}) = -\sum_{\substack{\langle i,j\rangle \\ i,j\in\Lambda}} (\sigma_i\sigma_j - 1) - \sum_{\substack{\langle i,j\rangle \\ i\in\Lambda,\ j\in\Lambda^c}} (\sigma_i\bar\sigma_j - 1) - h\sum_{i\in\Lambda} \sigma_i \tag{2} \]
with the sum over pairs of nearest neighbours, and the normalizing partition function is
\[ Z_\Lambda(\bar\sigma_{\Lambda^c}) = \sum_{\sigma_\Lambda} e^{-\beta H_\Lambda(\sigma_\Lambda \mid \bar\sigma_{\Lambda^c})}. \tag{3} \]
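For a very small box the measure (1) can be computed by brute force, which makes the role of the boundary condition concrete. The following sketch assumes $d = 2$ and a rectangular box; the function names are hypothetical and the enumeration is only feasible for a handful of sites.

```python
import itertools
import math

def neighbours(site):
    x, y = site
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def energy(spins, boundary, h):
    """Hamiltonian (2). `spins` maps sites of the box to +1/-1;
    `boundary(site)` returns the fixed spin at a site outside the box."""
    total = 0.0
    for i, s in spins.items():
        for j in neighbours(i):
            if j in spins:
                if i < j:  # count each pair inside the box once
                    total -= s * spins[j] - 1
            else:
                total -= s * boundary(j) - 1
        total -= h * s
    return total

def gibbs_measure(sites, boundary, beta, h):
    """Return (probabilities, Z): the measure (1) and the partition function (3)."""
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(sites)):
        spins = dict(zip(sites, values))
        weights[values] = math.exp(-beta * energy(spins, boundary, h))
    Z = sum(weights.values())
    return {v: w / Z for v, w in weights.items()}, Z

if __name__ == "__main__":
    box = [(x, y) for x in range(2) for y in range(2)]
    measure, Z = gibbs_measure(box, lambda site: +1, beta=0.7, h=0.0)
    print("Z =", Z, " P(all plus) =", measure[(1, 1, 1, 1)])
```

With the constant added in (2), the all-plus configuration under plus boundary conditions has energy zero at $h = 0$, so its probability is simply $1/Z_\Lambda(+1)$.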
At high temperatures, $\beta$ small, the random variables $\sigma_i$ are 'almost independent' and as a result, for $\Lambda \nearrow \mathbb{Z}^d$, there is a unique weak limit $\mu$ of (1) independent of boundary conditions (or sequence of boundary conditions $\{\bar\sigma_{\Lambda^c}\}_\Lambda$). On the other hand, at low temperatures, $\beta$ large, the variables $\sigma_i$ are strongly dependent: a first-order phase transition occurs that reveals itself in the fact that, for vanishing external field, $h = 0$, the particular boundary conditions corresponding to the ground configurations, $\bar\sigma_{\Lambda^c} = +1$ (that is, $\bar\sigma_i = +1$ for all $i \in \Lambda^c$) and $\bar\sigma_{\Lambda^c} = -1$, lead to different limiting measures $\mu_+$ and $\mu_-$.
1 A constant has been added to the Hamiltonian, so that the energy of ground configurations in the case without external field, $h = 0$, vanishes.
The proof of this fact by the famous Peierls argument is based on a reformulation of the model (with $h = 0$) in terms of probabilities of particular spatial patterns
in the configurations. Namely, one considers configurations $\partial = \{\Gamma\}$ of contours $\Gamma$ introduced for a spin configuration $\sigma$ as components of the boundaries between areas of pluses and minuses (see Fig. 1)$^2$. For a fixed boundary condition (say $+1$) the correspondence between spin configurations and collections $\partial$ of mutually disjoint contours is one to one and the probability of a contour configuration $\partial$ under the measure $\mu_\Lambda(\cdot \mid +1)$ (with $h = 0$) is
\[ \mu_\Lambda(\partial \mid +1) = \frac{1}{\mathcal{Z}(\Lambda)} \prod_{\Gamma\in\partial} e^{-2\beta|\Gamma|}. \tag{4} \]
Here
\[ \mathcal{Z}(\Lambda) = \sum_{\partial\in\Lambda} \prod_{\Gamma\in\partial} e^{-2\beta|\Gamma|} \tag{5} \]
with the sum over collections of mutually disjoint contours in $\Lambda$. It differs from $Z_\Lambda(+1)$ ($\equiv Z_\Lambda(-1)$) by the factor that equals the contribution of the configuration $+1$ to $Z_\Lambda(+1)$.
2 We are illustrating the two-dimensional case here, with contours characterized as connected sets consisting of edges of the dual lattice $(\mathbb{Z}^2)^* \equiv \mathbb{Z}^2 + (\tfrac12, \tfrac12)$ such that every vertex of $(\mathbb{Z}^2)^*$ is contained in an even number (0, 2, or 4) of its edges.
The typical configurations $\sigma$ of the measure $\mu_+$ obtained as the limit of $\mu_\Lambda(\cdot \mid +1)$ can be characterized by proving that, in the limiting probability obtained from (4), the typical contour configurations $\partial$ are such that for every $\Gamma \in \partial$ there exists the most external contour surrounding it. (No infinite 'cascades' of contours exist.) This fact is proven, with the help of the Borel–Cantelli lemma, by evaluating the probability of every contour surrounding a fixed site in such a way that the sum of these probabilities can be shown to converge. As a result, one characterizes the typical configurations $\sigma$ of the measure $\mu_+$ as consisting of a connected sea of pluses containing finite islands of minuses. Or, in other words, in a typical configuration of $\mu_+$ the pluses percolate (and minuses do not). This situation can be described as a stability of the plus phase. By the same reasoning we can show that the minus phase is also stable and characterize the measure $\mu_-$ as supported by configurations with percolating minuses. The measures $\mu_+$ and $\mu_-$ thus differ – we say that two different phases coexist for $h = 0$ and $\beta$ large, or that a phase transition of the first order occurs for $h = 0$.
The trick that allows one to describe the typical configurations, in spite of the fact that the variables $\sigma_i$ are actually strongly dependent, is based on replacing them by 'contour variables' and viewing their probability distribution (4) as a perturbation of a contour-free (empty) configuration that corresponds to the ground spin state $+1$ in the case of $\mu_+$. The crucial fact for the Ising model is its plus-minus symmetry. It follows not only that the phase transition should be expected to occur for vanishing external field, $h = 0$, but also that the contours distributed by (4) are essentially independent. We use this term to refer to the fact that the weight factor in (4) is multiplicative; once the contours in $\partial$ are pairwise compatible – every two contours $\Gamma$ and $\Gamma'$ from $\partial$ are disjoint – they contribute independently. A configuration with a particular contour skipped is again a possible configuration (under fixed boundary conditions $+1$ there exists a uniquely defined corresponding
spin configuration in $\Lambda$) and the weights of remaining contours do not change. The second main ingredient is the fact that the long contours are sufficiently damped: the weight factor of a given contour $\Gamma$ (in our case $e^{-2\beta|\Gamma|}$) decreases quickly with its length $|\Gamma|$; namely, it can be bounded by $e^{-\tau|\Gamma|}$ with a sufficiently large $\tau$ (to achieve this in our case one simply takes $\beta$ large enough). This is a direct consequence of the fact that the difference of the energy of a configuration and the ground state configuration (say $+1$) is proportional to the length of its contours (Peierls condition). It is the fulfilment of these two conditions, essential independence and damping, that allows us to use any form of standard cluster expansion for a study of properties of the contour probability distribution. In the next section we summarize the properties of such contour models in a form to be used later.
Unfortunately, even a small perturbation to the Ising Hamiltonian (2) may break the essential independence of contours. Instead, one is getting a model with 'labeled contours' with 'long-range matching conditions'. In Section 3 we explain this notion by representing a simple perturbation of the Ising model in terms of such a labeled contour model. The perturbed Ising model, in spite of its simplicity, actually contains all the ingredients of the general case and we will simplify the presentation of the main ideas by formulating and proving the results just in this case. Our first step is to recover the essential independence – to find contour models, one for each phase, that yield information about the original model with the corresponding boundary conditions. Before showing, in Section 5, how to achieve this, we discuss in Section 4 the Potts model – our aim there is to illustrate how a model of quite different type also naturally leads to a labeled contour model.
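The Peierls condition quoted above, that at $h = 0$ the energy above the ground state equals twice the total contour length, can be verified by exhaustive enumeration on a small two-dimensional box with plus boundary conditions. The sketch below uses hypothetical function names; the total contour length is counted as the number of nearest-neighbour pairs with opposite spins, each such pair being crossed by exactly one dual edge.

```python
import itertools
import math

STEPS = ((1, 0), (0, 1), (-1, 0), (0, -1))

def contour_length(spins, boundary=+1):
    """Number of nearest-neighbour pairs with opposite spins (inside the box
    and across its boundary), i.e. the total length of all contours."""
    length = 0
    for (x, y), s in spins.items():
        for dx, dy in STEPS:
            nb = (x + dx, y + dy)
            if nb in spins:
                if (x, y) < nb and spins[nb] != s:
                    length += 1
            elif boundary != s:
                length += 1
    return length

def energy_h0(spins, boundary=+1):
    """Hamiltonian (2) at h = 0, with the constant that makes the ground state energy 0."""
    total = 0
    for (x, y), s in spins.items():
        for dx, dy in STEPS:
            nb = (x + dx, y + dy)
            if nb in spins:
                if (x, y) < nb:
                    total -= s * spins[nb] - 1
            else:
                total -= s * boundary - 1
    return total

def contour_partition_function(width, height, beta):
    """Sum of exp(-2*beta*length) over all spin configurations of the box, which by
    the one-to-one correspondence is the sum over contour collections in (5)."""
    sites = [(x, y) for x in range(width) for y in range(height)]
    total = 0.0
    for values in itertools.product((-1, 1), repeat=len(sites)):
        spins = dict(zip(sites, values))
        assert energy_h0(spins) == 2 * contour_length(spins)  # Peierls condition
        total += math.exp(-2.0 * beta * contour_length(spins))
    return total
```

At large $\beta$ the sum is dominated by the empty contour configuration, whose weight is 1 by convention.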
The main step of Pirogov–Sinai theory in the present setting is to show that a transition point $h_t(\beta)$ exists such that (for large $\beta$) both contour models constructed in Section 5 are damped and thus both phases are stable for $h = h_t$. For some models (such as the unperturbed Ising ferromagnet) the value $h_t$ can be guessed from the symmetry. In Section 6 we discuss the case of the Ising antiferromagnet that can be considered to be 'half way' to the general case. Even though the transition point can be guessed from the symmetry, the real proof of stability of both concerned phases is a good illustration of the inductive procedure used also in less symmetric cases. The perturbed Ising model, as a representative of the general case, is discussed in Section 7.
In a finite volume, say a cube $\Lambda = L \times L \times \cdots \times L$, the transition reveals itself as a rapid change, as the function of $h$, of the magnetization, defined as the mean value of $\sum_{i\in\Lambda} \sigma_i$ under the periodic boundary conditions. The final Section 8 is devoted to an application of the results of Section 7 to the discussion of universal behaviour of the magnetization in the neighbourhood of the transition point and the asymptotic dependence of the finite volume transition point $h_t(L)$ defined, say, as the inflexion point of the finite volume magnetization curve.
2. Contour Models
Let us suppose that a weight factor $z$ assigning a real non-negative number $z(\Gamma)$ to every contour $\Gamma$ is given$^3$. A collection $\partial$ of contours in $\Lambda$ is called compatible if they are mutually disjoint. The contour model, satisfying the condition of essential independence, with the weight factor $z$ is defined by specifying the probability of any compatible collection $\partial$ of contours in $\Lambda$ by
\[ \mu_\Lambda(\partial; z) = \frac{1}{\mathcal{Z}(\Lambda; z)} \prod_{\Gamma\in\partial} z(\Gamma) \tag{6} \]
with the partition function (we reserve script $\mathcal{Z}$ for partition functions of contour models)
\[ \mathcal{Z}(\Lambda; z) = \sum_{\partial\in\Lambda} \prod_{\Gamma\in\partial} z(\Gamma). \tag{7} \]
The contribution of the empty configuration $\partial = \emptyset$ is taken to be 1 by definition.
We are not going to discuss the details of the cluster expansion here; let us only formulate its main assertion [GK, Se, KP2, DKS] that can be for our case translated into the following statement.
Proposition 1. For a contour model with a damped weight factor $z$, satisfying, for sufficiently large $\tau$ and for every contour $\Gamma$, the (damping) bound
\[ z(\Gamma) \le e^{-\tau|\Gamma|}, \tag{8} \]
there exists a mapping $\Phi$ assigning real numbers to finite connected (in the connection by paths whose edges are pairs of nearest neighbour sites) subsets of $\mathbb{Z}^d$, such that
\[ \log \mathcal{Z}(\Lambda; z) = \sum_{C\subset\Lambda} \Phi(C) \tag{9} \]
for every finite $\Lambda$. Moreover, the contributions $\Phi(C)$ are damped,
\[ |\Phi(C)| \le e^{-\tau d(C)/2}, \tag{10} \]
where $d(C)$ is the minimal summary length (area) of a set of contours such that the union of their interiors equals $C$. Actually, there is an explicit formula for $\Phi(C)$,
\[ \Phi(C) = \sum_{A:\,A\subset C} (-1)^{|C\setminus A|} \log \mathcal{Z}(A; z). \tag{11} \]
3 Here we have in mind the contours as introduced above, but sometimes (e.g., when studying interfaces [HKZ1, HKZ2]) it is useful to consider slightly more complicated structures, such as standard contours 'decorated' by some additional sets. The present formulation of the contour model can be easily reformulated in a more abstract way [KP2] covering these situations. In particular, the condition of compatibility may differ from simple disjointness. However, an important feature that has to be valid is that compatibility is defined pairwise: a collection $\partial$ is compatible if all pairs of contours from $\partial$ are compatible. Also, the weight $z(\Gamma)$ may be in general complex. To assume that it is real non-negative suffices in our case and it simplifies the formulation.
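Formula (11) is a Möbius inversion of (9) over subsets, and this can be checked on a toy model. The sketch below does not use the contours of the text: as a stand-in it takes a hypothetical one-dimensional model in the abstract spirit of footnote 3, whose "contours" are dimers $\{x, x+1\}$ of weight $z$, compatible when disjoint. All names are illustrative.

```python
import itertools
import math

def dimer_partition_function(sites, z):
    """Z(A; z) for the toy model: sum over collections of pairwise disjoint
    dimers {x, x+1} contained in A, each dimer carrying weight z."""
    site_set = set(sites)
    dimers = [(x, x + 1) for x in sorted(site_set) if x + 1 in site_set]
    total = 0.0
    for r in range(len(dimers) + 1):
        for chosen in itertools.combinations(dimers, r):
            covered = [s for dimer in chosen for s in dimer]
            if len(covered) == len(set(covered)):  # pairwise disjoint
                total += z ** r
    return total  # the empty collection contributes 1

def subsets(sites):
    sites = list(sites)
    for r in range(len(sites) + 1):
        for combo in itertools.combinations(sites, r):
            yield frozenset(combo)

def phi(C, z):
    """Formula (11): alternating sum of log Z(A; z) over subsets A of C."""
    return sum((-1) ** (len(C) - len(A)) * math.log(dimer_partition_function(A, z))
               for A in subsets(C))

if __name__ == "__main__":
    box, z = frozenset(range(4)), 0.1
    lhs = math.log(dimer_partition_function(box, z))
    rhs = sum(phi(C, z) for C in subsets(box))
    print(lhs, rhs, phi(frozenset({0, 2}), z))
```

In this toy model $\Phi(C)$ vanishes on disconnected sets $C$, because $\log \mathcal{Z}$ is additive over pieces that no dimer can join; this mirrors the statement that $\Phi$ lives on connected sets.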
If the contour model is translation invariant$^4$, there exists the 'free energy' $g(z) = -\beta^{-1} \lim |\Lambda|^{-1} \log \mathcal{Z}(\Lambda; z)$, given by
\[ g(z) = \sum_{C:\, i\in C} \frac{\Phi(C)}{|C|}. \tag{12} \]
Here the sum is over all finite sets containing a given fixed site and $|C|$ denotes the number of points in $C$. The free energy is bounded by
\[ g(z) \le e^{-\tau/2}. \tag{13} \]
3. Perturbed Ising Model

In the case of the Ising ferromagnet with vanishing external field we were fortunate to get immediately the representation (4) in terms of a contour model. This is not at all obvious. Actually, even a small perturbation to the Hamiltonian (2) may introduce a ‘long-distance dependence’ among contours. To see what I mean by that, consider a simple plus-minus symmetry breaking term, say,

\[ -\kappa \sum_{(i,j,k)} \sigma_i \sigma_j \sigma_k, \tag{14} \]

added to the Hamiltonian (2). Here the sum is over all triangles consisting of a site $j$ and two of its nearest neighbours $i$ and $k$ such that the edges $(ij)$ and $(jk)$ are orthogonal. We consider all triplets with at least one of the sites $i, j, k$ in $\Lambda$; $\sigma$ for those sites that are outside $\Lambda$ is to be interpreted as the corresponding boundary condition $\bar\sigma$ (say $+1$). Rewriting the model in terms of contours we obtain

\[ \mu_\Lambda(\partial \mid +1) = \frac{1}{Z_\Lambda(+1)} \Bigl[\prod_{\gamma\in\partial} \rho(\gamma)\Bigr] e^{-\beta e_+ |V_\Lambda^+(\partial)| - \beta e_- |V_\Lambda^-(\partial)|}. \tag{15} \]
Here $|V_\Lambda^+(\partial)|$ (resp. $|V_\Lambda^-(\partial)|$) is the number of sites in $\Lambda$ occupied, for the configuration corresponding to $\partial$, by pluses (resp. minuses), cf. Fig. 1, and $e_+ = -h - \kappa 2d(d-1)$ (resp. $e_- = h + \kappa 2d(d-1)$) is the average energy per site of the configuration $+1$ (resp. $-1$). Notice that the weights $\rho(\gamma)$ actually depend not only on the geometrical form of the contour, but also on whether $\gamma$ is surrounded from outside by pluses or minuses. For example, for the contour surrounding a single plus spin immersed in minuses we obtain $\rho(\gamma) = e^{-\beta(8+8\kappa d)}$, while for the contour surrounding a single minus spin immersed in pluses we obtain $\rho(\gamma) = e^{-\beta(8-8\kappa d)}$. As a result we have to label each contour by the sign of the spins surrounding it from outside (in (15) we anticipated this and introduced labeled contours $\gamma = (\Gamma, \varepsilon)$ consisting of a geometrical shape $\Gamma$ labeled by the sign $\varepsilon = \pm 1$ of the outer spins).

In (15) we again obtained a representation in terms of contours with the weights $\rho$ damped (for $\kappa$ and $h$ small); however, the condition of essential independence has been lost. The order of labeled contours matters: if a plus-contour is surrounded by another plus-contour, there must be a minus-contour immersed between them (in the configuration shown in Fig. 1 there are minus-contours that, in view of plus boundary conditions, have to be surrounded by plus-contours). Unlike in the unperturbed case, this matching condition introduces a certain ‘long-range hard core’: a minus-contour $\gamma$ surrounded by a disjoint plus-contour $\gamma'$ ‘knows’ about its presence. Erasing $\gamma'$ would turn $\gamma$ into a plus-contour and thereby change its weight $\rho(\gamma)$.

4 I.e., $z(\Gamma) = z(\Gamma + i)$ for any contour $\Gamma$ and any shift $i$. A correspondingly modified statement is true for contour models satisfying some condition of periodicity.
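The factor $2d(d-1)$ in $e_\pm$ is the number of triangles of (14) per lattice site. The following sketch counts them by enumeration and evaluates the ground state energies per site; it assumes, as the formula for $e_\pm$ suggests, that the field enters the Hamiltonian (2) as $-h\sigma_i$ and that the constant configurations have no other energy. The function names are illustrative only.

```python
from itertools import combinations

def unit_vectors(d):
    """The 2d nearest-neighbour displacement vectors of Z^d."""
    vecs = []
    for axis in range(d):
        for sign in (1, -1):
            v = [0] * d
            v[axis] = sign
            vecs.append(tuple(v))
    return vecs

def triangles_per_centre(d):
    """Number of triangles (i, j, k) in (14) with a fixed centre j:
    unordered pairs of neighbours i, k of j with (ij) orthogonal to (jk)."""
    count = 0
    for u, v in combinations(unit_vectors(d), 2):
        if sum(a * b for a, b in zip(u, v)) == 0:
            count += 1
    return count

def ground_energies(h, kappa, d):
    """Energies per site e_+ and e_- of the constant configurations,
    taking the field term as -h * sigma_i (an assumption about (2))."""
    t = triangles_per_centre(d)
    e_plus = -h - kappa * t     # every triangle contributes -kappa
    e_minus = h + kappa * t     # every triangle contributes +kappa
    return e_plus, e_minus

def zero_temperature_transition(kappa, d):
    """Field at which e_+ = e_- (the transition point at beta = infinity)."""
    return -kappa * triangles_per_centre(d)
```

Each triangle has exactly one centre $j$, so the count per centre is also the count per site. The value returned by `zero_temperature_transition` is the field at which the two ground state energies coincide.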
4. Potts Model

The representation of a lattice model in terms of a probability distribution of matching collections of labeled contours, see (15), is not restricted to our simple perturbed Ising model. There exists a large class of models that naturally yield such a representation, which is actually the starting point of Pirogov–Sinai theory. Before discussing how to recover essential independence and to transform this representation into a contour model, let us consider an example of a slightly different type, the Potts model, that leads to a representation similar to (15). The Potts model is discussed in detail in other lectures in this volume [Gr, N] and thus I will abstain from introducing it here and start directly from its random cluster formulation to get its contour representation.

Contours were already used in the original proof of existence of first-order phase transitions [KS] for this model. However, their probability was controlled there with the help of chessboard estimates. A treatment by the Pirogov–Sinai theory has been presented, among others, in [KLMR, BKL] and [M]. A simplification based on the Fortuin–Kasteleyn representation was suggested in [LMMRS] and here I will use the reformulation from [BKM].

Let us begin from the Fortuin–Kasteleyn random cluster representation [FK] with the weight of a set $\omega$ of bonds (a subset of the set $B_\Lambda$ of all bonds intersecting $\Lambda$) given by

\[ p^{|\omega|} (1-p)^{|\setminus\omega|} q^{c(\omega,b)}. \tag{16} \]

Here $|\omega|$ is the number of bonds in $\omega$, $\setminus\omega \equiv B_\Lambda \setminus \omega$ denotes the complementary set of bonds and $c(\omega,b)$ is the number of components$^5$ of the set $\omega$ under the boundary conditions $b$ (for example, $b = f$, the free boundary conditions when all sites outside $\Lambda$ are considered to be disjoint; or $b = w$, the wired boundary conditions with all sites outside $\Lambda$ connected). Up to a factor depending on $\Lambda$, the partition function is

\[ Z_\Lambda(b) = \sum_{\omega} (e^{\beta}-1)^{|\omega|} q^{c(\omega,b)}, \tag{17} \]
where the temperature factor $1 - e^{-\beta} = p$ has been reintroduced. For every set of bonds $\omega$ we can introduce contours in the following way: consider first the closed set $\bar\omega$ consisting of the union of all bonds from $\omega$ with all unit squares all four of whose sides belong to $\omega$, all unit cubes all twelve of whose edges belong to $\omega$, etc. Taking now the $\tfrac14$-neighbourhood $U_{1/4}(\bar\omega)$ of $\bar\omega$, we define the contours as the connected components of the boundary of $U_{1/4}(\bar\omega)$. This procedure is illustrated in Fig. 2.

Fig. 2. Contours for a configuration of occupied bonds $\omega$ in the random cluster representation of the Potts model under the wired boundary conditions. Thick lines correspond to the bonds from $\omega$. Plain thin lines denote the ordered contours ($\omega$ from outside), while dashed thin lines denote the disordered contours ($\omega$ from inside).

5 Each site not touched by $\omega$ is counted as one additional component.

The contours are boundaries between regions occupied by $\omega$ (ordered regions) and empty (disordered) regions, each site of which contributes the factor $q$ to the partition function (17) (it represents the component attributed to a site unattached to $\omega$). Denoting by $V_\Lambda^0$ the set of bonds in the former and by $V_\Lambda^d$ the set of bonds in the latter, we get

\[ Z_\Lambda(b) = \sum_{\partial} (e^\beta - 1)^{|V_\Lambda^0(\partial)|}\, q^{|V_\Lambda^d(\partial)|/d} \prod_{\gamma\in\partial} \rho(\gamma). \tag{18} \]
Here the weights of contours $\rho(\gamma)$ depend on the surrounding regions: if $\gamma$ is surrounded by the order (i.e., $\omega$) from outside (plain thin lines in Fig. 2), we have

\[ \rho(\gamma) = q^{-|\gamma|/(2d)}, \tag{19} \]

while for $\gamma$ surrounded by the disorder from outside (dashed lines in Fig. 2) we have

\[ \rho(\gamma) = q^{-|\gamma|/(2d)+1}. \tag{20} \]

Again, in (18) we have a representation similar to (15). Taking $q$ large enough allows one to get sufficiently small weights $\rho$ above. Notice also that the role of the ground state energies $e_\pm$ is played by the free energies (per bond) $\log(e^\beta - 1)$ and $d^{-1}\log q$ of the ordered and entirely disordered states, respectively.
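The random cluster sum (17) can be checked against the spin partition function on a very small graph. The sketch below assumes free boundary conditions and the normalization in which every bond with equal spins carries the Boltzmann factor $e^{\beta}$ (an assumption about the Potts Hamiltonian, which is not written out here); the four-site graph and all function names are hypothetical. The last function solves $e^\beta - 1 = q^{1/d}$, the value of $\beta$ at which the two bulk weights per bond, $\log(e^\beta-1)$ and $d^{-1}\log q$, coincide.

```python
from itertools import product
from math import exp, log

def components(n_sites, bonds):
    """Number of connected components of the graph (sites, bonds);
    isolated sites count as components (free boundary conditions)."""
    parent = list(range(n_sites))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in bonds:
        parent[find(a)] = find(b)
    return len({find(x) for x in range(n_sites)})

def z_random_cluster(n_sites, bonds, beta, q):
    """Right-hand side of (17): sum over all bond sets omega."""
    total = 0.0
    for keep in product((False, True), repeat=len(bonds)):
        omega = [b for b, k in zip(bonds, keep) if k]
        total += (exp(beta) - 1.0) ** len(omega) * q ** components(n_sites, omega)
    return total

def z_potts_spins(n_sites, bonds, beta, q):
    """Spin partition function: sum over sigma of exp(beta * number of agreeing bonds)."""
    total = 0.0
    for sigma in product(range(q), repeat=n_sites):
        agree = sum(1 for a, b in bonds if sigma[a] == sigma[b])
        total += exp(beta * agree)
    return total

def beta_equal_bulk_weights(q, d):
    """Solution of e^beta - 1 = q^(1/d)."""
    return log(1.0 + q ** (1.0 / d))

# 2 x 2 block of sites with free boundary conditions (hypothetical test case)
SITES = 4
BONDS = [(0, 1), (2, 3), (0, 2), (1, 3)]
```

With this normalization the two sums agree exactly; with the weights (16) they differ by the factor $(1-p)^{|B_\Lambda|}$, which is the ‘factor depending on $\Lambda$’ mentioned before (17).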
5. Recovering Essential Independence

In any case, as already mentioned, it is the representation of the form (15) (or (18)) that is a starting point for the Pirogov–Sinai representation. An important fact is that we cannot directly apply standard cluster expansions: we first have to get rid of the above described long-range dependence of labeled contours. However, this is rather easy to achieve [PS, Z, KP1]. Namely, one introduces two contour weight factors $z_+(\Gamma)$ and $z_-(\Gamma)$ (by using the notation $\Gamma$ for the contour here we want to stress that, unlike for $\rho(\gamma)$, the dependence is only on the shape of the contour, the label being delegated to the subscript of $z$):

\[ z_+(\Gamma) = \rho((\Gamma,+))\, e^{-\beta(e_- - e_+)|\partial_I\Gamma|}\, \frac{Z_{\mathrm{Int}\,\Gamma}(-1)}{Z_{\mathrm{Int}\,\Gamma}(+1)} \tag{21} \]

and

\[ z_-(\Gamma) = \rho((\Gamma,-))\, e^{-\beta(e_+ - e_-)|\partial_I\Gamma|}\, \frac{Z_{\mathrm{Int}\,\Gamma}(+1)}{Z_{\mathrm{Int}\,\Gamma}(-1)}. \tag{22} \]

Here $\partial_I\Gamma$ denotes the set of all sites attached from inside to the contour $\Gamma$ (those sites from $\mathbb{Z}^d$ inside $\Gamma$ whose distance from $\Gamma$ in the maximum metric equals $\tfrac12$) and $\mathrm{Int}\,\Gamma$ is the set of all sites from $\mathbb{Z}^d$ inside $\Gamma$ that are not contained in $\partial_I\Gamma$. With the help of these weights, we get the original partition functions in terms of a contour model.

Lemma 1. For every finite $\Lambda$ one has

\[ Z_\Lambda(+1) = e^{-\beta e_+|\Lambda|} \sum_{\partial\subset\Lambda} \prod_{\Gamma\in\partial} z_+(\Gamma) \tag{23} \]

and

\[ Z_\Lambda(-1) = e^{-\beta e_-|\Lambda|} \sum_{\partial\subset\Lambda} \prod_{\Gamma\in\partial} z_-(\Gamma). \tag{24} \]
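Proof. Indeed, resumming in the expression (cf. (15))

\[ Z_\Lambda(+1) = \sum_{\partial\subset\Lambda} \Bigl[\prod_{\gamma\in\partial} \rho(\gamma)\Bigr] e^{-\beta e_+|V_\Lambda^+(\partial)| - \beta e_-|V_\Lambda^-(\partial)|} \tag{25} \]

over all $\partial$ with a fixed collection $\vartheta$ of all most external (plus-)contours, we get

\[ Z_\Lambda(+1) = \sum_{\vartheta\subset\Lambda} e^{-\beta e_+|\mathrm{Ext}_\Lambda(\vartheta)|} \prod_{(\Gamma,+)\in\vartheta} \rho((\Gamma,+))\, e^{-\beta e_-|\partial_I\Gamma|}\, Z_{\mathrm{Int}\,\Gamma}(-1). \tag{26} \]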
Here, we use $\mathrm{Ext}_\Lambda(\vartheta)$ to denote the set of all lattice sites in $\Lambda$ that are outside every contour $\Gamma$ of $\vartheta$. Notice that the partition function $Z_{\mathrm{Int}\,\Gamma}(-1)$ has the fixed minus boundary condition on $\partial_I\Gamma$ and every contour contributing to it is disjoint from $\Gamma$. We prove (23) and (24) by induction on the number of sites in $\Lambda$. Multiplying each term on the right-hand side of (26) by $\exp(-\beta e_+|\partial_I\Gamma| + \beta e_+|\partial_I\Gamma|)\, Z_{\mathrm{Int}\,\Gamma}(+1)/Z_{\mathrm{Int}\,\Gamma}(+1)$ and using the definition (21), we apply (23) to $Z_{\mathrm{Int}\,\Gamma}(+1)$ on the right-hand side, which is valid by the induction hypothesis ($\mathrm{Int}\,\Gamma \subsetneq \Lambda$), and thus obtain (23) for the full volume $\Lambda$.

As a result, in (23) and (24) we succeeded in rewriting the partition functions $Z_\Lambda(+1)$ and $Z_\Lambda(-1)$ in terms of partition functions $Z(\Lambda; z_+)$ and $Z(\Lambda; z_-)$ of contour models $z_+$ and $z_-$. These are contour models according to our definition from Section 2 with the condition of essential independence fulfilled.

Two questions may now arise. First, in the formulas (23) and (24) we rewrote only the partition functions. Moreover, we did so in terms of a rather artificial contour model (formally speaking, we suppose that inside a plus contour there are again only plus contours). Thus, even if we have the corresponding contour models under control, will it suffice to say something, for example, about typical configurations of the measures $\mu_+$ and $\mu_-$? The answer is positive. Namely, it is clear that the contour models $z_+$ and $z_-$ introduced above not only lead to the same (up to a factor) partition functions as the original model, but also yield exactly the same probabilities that a given set $\vartheta$ of external contours is present,

\[ \mu_\Lambda(\vartheta \mid \pm 1) = \frac{1}{Z(\Lambda; z_\pm)} \prod_{\Gamma\in\vartheta} z_\pm(\Gamma)\, Z(\mathrm{Int}\,\Gamma; z_\pm). \tag{27} \]
Once we know that the corresponding contour model, say $z_+$, is damped (satisfies the bound (8)), we can control the limit $\Lambda \nearrow \mathbb{Z}^d$ and, with the help of the equality (27), show that there are no infinite cascades of contours in the limiting measure $\mu_+$ and the plus phase is stable.

However, and this is the second question, it is not clear that, even though the original weights $\rho$ were damped, the newly defined weights $z_+$ and $z_-$ are also damped. The answer depends on the values of the parameters $\beta$ and $h$. It turns out that for a fixed (sufficiently large) $\beta$ there exists a value $h_t \equiv h_t(\beta)$ such that for $h = h_t$ both $z_+$ and $z_-$ are damped and thus both plus and minus phases are stable, while for $h > h_t$ only $z_+$ is damped and for $h < h_t$ only $z_-$ is damped. The description of this transition point $h_t(\beta)$ actually yields the phase diagram in the case of the perturbed Ising model$^6$. Our next task thus will be to find the transition point with the above formulated properties.

Sometimes, in the presence of a symmetry, the value of the transition point can be guessed. For example, for the unperturbed Ising model we expected $h_t = 0$. Indeed, for $h = 0$ we got $e_+ = e_- = 0$, $Z_\Lambda^+ = Z_\Lambda^-$, and thus directly $z_+(\Gamma) = z_-(\Gamma) = e^{-2\beta|\Gamma|}$. Before turning to the general situation, when $h_t$ is a priori unknown, let us consider another case for which the value of the transition point can be guessed.

6 The ‘tuning parameter’ (driving field) here was the external field $h$. For the Potts model, one can closely follow our treatment of the perturbed Ising model. The role of ‘tuning parameter’ is played by the (inverse) temperature $\beta$ and, to get damped weights $\rho(\gamma)$, we have to suppose that $q$ is large enough.

Fig. 3. [Configuration of plus and minus spins on a square lattice.]
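6. Ising Antiferromagnet

The model I have in mind here is the Ising antiferromagnet with the Hamiltonian

\[ H_\Lambda(\sigma_\Lambda \mid \bar\sigma_{\Lambda^c}) = \sum_{\substack{\langle i,j\rangle \\ i,j\in\Lambda}} (\sigma_i\sigma_j + 1) + \sum_{\substack{\langle i,j\rangle \\ i\in\Lambda,\, j\in\Lambda^c}} (\sigma_i\bar\sigma_j + 1) - h\sum_{i\in\Lambda}\sigma_i. \tag{28} \]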
It is known [D2] that for sufficiently small temperatures and small external field $h$, there exist two antiferromagnetic phases corresponding to two ground configurations: the configuration with pluses on even lattice sites ($i = (i_1, i_2, \dots, i_d)$ such that $\|i\| = |i_1| + |i_2| + \dots + |i_d|$ is even) and minuses on odd sites, which we call the even ground configuration (using the subscript ‘e’ to refer to it), and the same configuration shifted by a unit vector, the odd ground configuration (subscript ‘o’). Let us prove that, indeed, both phases are stable once, for $\beta$ large enough, the external field $h$ is sufficiently small.

In spite of its simplicity, there are two reasons for including this model here. First, the proof that contour weights for both coexisting phases are really damped is not immediate and it actually involves an important ingredient of the general case. Second, a similar reasoning might be useful also in other, more complex, situations; recently we had an occasion to use it for a description of phase transitions in diluted spin systems [CKS] and in a discussion of renormalization group transformations for large external field [EFK].

Let us take, say, the odd ground configuration as the boundary condition. To introduce contours, we again consider the boundaries between regions with even and odd ground configurations. However, this time we take as belonging to the same contour all components whose distance, in the maximum metric, is one. Thus, for the configuration in Fig. 3 we have just one contour$^7$. Again, under fixed boundary conditions we have a one-to-one correspondence between spin configurations and collections of contours. Notice that even though we consider, in general, a nonvanishing $h$, the energies of the ground configurations are (contrary to the case of the Ising ferromagnet) equal, $e_o = e_e = 0$. Thus we have a particularly simple form of labeled contour model with

\[ Z_\Lambda(e) = \sum_{\partial\subset\Lambda} \prod_{\gamma\in\partial} \rho(\gamma). \tag{29} \]

Even though this formula is reminiscent of that for the ferromagnet with vanishing field (cf. (5)), here the weights of labeled contours $\gamma$ depend on the label (which ground configuration surrounds it from outside). To compute the weight $\rho(\gamma)$, one has to compute the energy of the configuration $\sigma$ for which $\gamma$ is the single contour. Consider, for the configuration $\sigma$, all pairs $i, j$ of nearest neighbour sites such that $j = i + (1, 0, \dots, 0)$ ($j_1 = i_1 + 1$, $j_k = i_k$, $k = 2, \dots, d$) and $\sigma_i = 1$, $\sigma_j = -1$. The remaining unpaired sites are necessarily attached to the contour. Denoting $S(\gamma) = \sum \sigma_i$, with the sum over all these unpaired sites, we clearly have

\[ \rho(\gamma) = e^{-2\beta|\gamma| - \beta h S(\gamma)}. \tag{30} \]

The reason for gluing together different components$^8$ of the boundary between ground configurations was that otherwise these unpaired sites might be shared by different contours. Had we chosen the standard definition of contour, the weight $\rho(\gamma)$ would depend on whether the contour $\gamma$ is isolated or whether there are other contours at distance 1 from $\gamma$ that share some unpaired sites with it. Notice, for future use, that the complement of a contour, say an even contour $\gamma = (\Gamma, e)$, may have several components (cf. Fig. 3). We take for the interior of $\Gamma$, $\mathrm{Int}\,\Gamma$, only those sites that are, in the configuration $\sigma$ (the configuration whose single contour is $\gamma$), occupied by the odd ground configuration and whose distance from $\Gamma$, again in the maximum metric, is larger than $\tfrac32$. Notice also that the weight of a labeled contour $\gamma$ equals the weight of the contour shifted by a unit vector but labeled by the other ground configuration,

\[ \rho((\Gamma, e)) = \rho((\Gamma + (1, 0, \dots, 0), o)). \tag{31} \]

7 The configuration in Fig. 3 is deliberately chosen so that the set of all boundary lines is identical to that in Fig. 1. While in Fig. 1 we have 11 contours, in Fig. 3 all of them are glued together to a single contour.

8 The idea of gluing together different connected components is, in the general Pirogov–Sinai approach, automatically carried out by considering ‘thick contours’ that would consist, for the present case, of components of the union of all those $2 \times 2 \times \dots \times 2$ cubes for which the configuration $\sigma$ restricted to it differs from both ground configurations.

We can again use the strategy of the preceding section and introduce the weights

\[ z_e(\Gamma) = \rho((\Gamma, e))\, \frac{Z_{\mathrm{Int}\,\Gamma}(o)}{Z_{\mathrm{Int}\,\Gamma}(e)} \tag{32} \]

and

\[ z_o(\Gamma) = \rho((\Gamma, o))\, \frac{Z_{\mathrm{Int}\,\Gamma}(e)}{Z_{\mathrm{Int}\,\Gamma}(o)}, \tag{33} \]

for which

\[ Z_\Lambda(e) = Z(\Lambda; z_e), \qquad Z_\Lambda(o) = Z(\Lambda; z_o). \tag{34} \]
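The statement $e_o = e_e = 0$ for every $h$, and the size of the simplest excitation, can be checked directly from the Hamiltonian (28). The sketch below is an illustration only: it works in $d = 2$ on an $L \times L$ torus with $L$ even, and the periodic boundary conditions replace the boundary term of (28) (an assumption made for simplicity).

```python
from itertools import product

def energy(sigma, L, h):
    """Hamiltonian (28) on an L x L torus: sum over nearest-neighbour bonds
    of (s_i * s_j + 1), minus h times the sum of all spins."""
    total = 0.0
    for x, y in product(range(L), repeat=2):
        s = sigma[x][y]
        total += s * sigma[(x + 1) % L][y] + 1   # bond in the first direction
        total += s * sigma[x][(y + 1) % L] + 1   # bond in the second direction
        total -= h * s
    return total

def ground_configuration(L, parity):
    """Pluses on sites with (x + y) % 2 == parity, minuses elsewhere:
    parity 0 is the even, parity 1 the odd ground configuration."""
    return [[1 if (x + y) % 2 == parity else -1 for y in range(L)]
            for x in range(L)]

def single_flip_energy(L, parity, h, site=(1, 1)):
    """Energy of a ground configuration with the spin at `site` reversed."""
    sigma = ground_configuration(L, parity)
    x, y = site
    sigma[x][y] = -sigma[x][y]
    return energy(sigma, L, h)
```

Reversing one spin creates four unsatisfied bonds, i.e. a contour of length $|\gamma| = 4$, and changes the total spin by $\pm 2$; the resulting energy $8 \pm 2h$ stays above $(2 - |h|)\,|\gamma|$, in line with the role the condition $h < 2$ plays in Proposition 2.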
Showing now that both $z_e$ and $z_o$ are damped, we will prove that both phases are stable$^9$.

Proposition 2. Let $h < 2$ and $\beta$ be sufficiently large (depending on $h$). Then both $z_e$ and $z_o$ are damped and both phases are stable.

Proof. We will prove the bound (8) for $z_e$ and $z_o$ simultaneously by induction on $\mathrm{diam}\,\Gamma$. Let us suppose that both $z_e$ and $z_o$ satisfy (8) for all $\Gamma$ such that $\mathrm{diam}\,\Gamma < n$. Considering now $\Gamma$ with $\mathrm{diam}\,\Gamma = n$, we apply (34) for $Z_{\mathrm{Int}\,\Gamma}(e)$ and $Z_{\mathrm{Int}\,\Gamma}(o)$. By the induction hypothesis we can use the cluster expansion (9) for $Z(\mathrm{Int}\,\Gamma; z_e)$ and $Z(\mathrm{Int}\,\Gamma; z_o)$ yielding

\[ \frac{Z(\mathrm{Int}\,\Gamma; z_e)}{Z(\mathrm{Int}\,\Gamma; z_o)} = \exp\Bigl\{ \sum_{C\subset\mathrm{Int}\,\Gamma} \bigl(\Phi_e(C) - \Phi_o(C)\bigr) \Bigr\}. \tag{35} \]

Observing first that $\Phi_e(C) = \Phi_o(C + (1, 0, \dots, 0))$, a direct consequence (by the explicit expression (11)) of the equality $Z_A(e) = Z_{A+(1,0,\dots,0)}(o)$ implied by (31), we see that the terms in the exponent on the right-hand side of (35) with $C$ not too near to the boundary of $\mathrm{Int}\,\Gamma$ cancel. To bound the remaining terms we notice that, since only contours $\Gamma$ with $\mathrm{diam}\,\Gamma < n$ contribute to $\Phi_e(C)$ and $\Phi_o(C)$ in (35), the bound (10) is satisfied$^{10}$ by the induction hypothesis. As a consequence we obtain

\[ \exp\{-\varepsilon|\Gamma|\} \le \frac{Z(\mathrm{Int}\,\Gamma; z_e)}{Z(\mathrm{Int}\,\Gamma; z_o)} \le \exp\{\varepsilon|\Gamma|\} \tag{36} \]

with $\varepsilon$ of the order $e^{-\tau/2}$. Taking into account that, clearly, $|S(\gamma)| \le |\gamma|$, we get the bound (8) once $h < 2$.
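The last step of the proof can be spelled out as follows. This is only a sketch under two assumptions that are not restated in this section: that the bound (8) has the form $z(\Gamma) \le e^{-\tau|\Gamma|}$, and that $|\gamma| = |\Gamma|$ for $\gamma = (\Gamma, e)$. The field enters through $|h|$, so the estimate does not depend on the sign convention in the exponent of (30).

```latex
\begin{align*}
z_e(\Gamma)
  &= \rho((\Gamma,e))\,
     \frac{Z(\mathrm{Int}\,\Gamma; z_o)}{Z(\mathrm{Int}\,\Gamma; z_e)}
     && \text{by (32) and (34)} \\
  &\le e^{-2\beta|\gamma| + \beta|h|\,|S(\gamma)|}\; e^{\varepsilon|\Gamma|}
     && \text{by (30) and (36)} \\
  &\le e^{-\left(\beta(2-|h|) - \varepsilon\right)|\Gamma|}
     && \text{since } |S(\gamma)| \le |\gamma| = |\Gamma| .
\end{align*}
```

For $|h| < 2$ the exponent is negative and grows linearly in $\beta$, while $\varepsilon$ is of the order $e^{-\tau/2}$, so a choice of $\tau$ slightly below $\beta(2 - |h|)$ appears to be admissible once $\beta$ is large; the same computation applies to $z_o$.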
9 This is true for a range of values of the field $h$; the field $h$ does not break the symmetry between the phases. For a ‘tuning parameter’ that is able to discriminate between these two phases one has to introduce an additional field, for example a staggered field in the form of the term $g\sum (-1)^{\|i\|}\sigma_i$ added to the Hamiltonian. Here we are actually taking the transition value $g_t = 0$.

10 Formally, we may consider $z^{(n)}_{e(o)}$ defined by

\[ z^{(n)}_{e(o)}(\Gamma) = \begin{cases} z_{e(o)}(\Gamma) & \text{if } \mathrm{diam}\,\Gamma < n, \\ 0 & \text{otherwise,} \end{cases} \]

notice that $Z(A; z^{(n)}_{e(o)}) = Z(A; z_{e(o)})$ for every $A$ such that $\mathrm{diam}\,A < n$ by the induction hypothesis, and that in view of (11) only those $A$ contribute to $\Phi_e(C)$ and $\Phi_o(C)$ in (35).

7. Transition Point

Finally, we consider the perturbed Ising model as a representative of the general case for which the value of the transition point $h_t$ is not known.
The task is to decide, for given values of the parameters $h$, $\kappa$, and $\beta$, which of the phases is stable or, in other words, which of the contour weights $z_+$ or $z_-$ is damped. Following the reformulation of Pirogov–Sinai theory by Zahradník [Z] (or, rather, the version by Borgs and Imbrie [BI1]) we introduce metastable states by suppressing all contours whose weights are not damped. Putting thus

\[ \bar z_\pm(\Gamma) = \begin{cases} z_\pm(\Gamma) & \text{if } |z_\pm(\Gamma)| \le e^{-\tau|\Gamma|}, \\ 0 & \text{otherwise,} \end{cases} \tag{37} \]

we define

\[ \bar Z_\Lambda(\pm 1) = e^{-\beta e_\pm|\Lambda|}\, Z(\Lambda; \bar z_\pm). \tag{38} \]
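Notice that both weights $\bar z_+$ and $\bar z_-$ are automatically damped, and it follows that the cluster expansion can be employed to control the limit $g(\bar z_\pm) = -\beta^{-1}\lim|\Lambda|^{-1}\log Z(\Lambda; \bar z_\pm)$ (see (12)). Comparing the explicit expressions (9) and (12), we get $\log Z(\Lambda; \bar z_\pm) = -\beta|\Lambda|\, g(\bar z_\pm) + \varepsilon|\partial\Lambda|$ with $\varepsilon$ (as well as $\beta g$) of the order $e^{-\tau/2}$ and thus

\[ \bar Z_\Lambda(\pm 1) = \exp\{-\beta f_\pm|\Lambda| + \varepsilon|\partial\Lambda|\} \tag{39} \]

with

\[ f_\pm = e_\pm + g(\bar z_\pm). \tag{40} \]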
The metastable free energies defined by the equality (40) play an important role in determining which phase is stable: it turns out that the stable phase is characterized by having the minimal metastable free energy. Namely, defining

\[ a_\pm = \beta\bigl( f_\pm - \min(f_+, f_-) \bigr), \tag{41} \]
we claim that $z_+$ is damped once $a_+ = 0$ (and similarly for the minus phase). To prove this assertion we prove by induction on $n$ the following.

Lemma 2 [Z, BI1]. Let $\kappa$ and $h$ be such that $2|\kappa|(d^2 - 1) + |h| < 1$ and let $a_+ = 0$. Then, for sufficiently large $\beta$, for every $n$:
(i) if $\mathrm{diam}\,\Lambda \le n$ and $a_-\,\mathrm{diam}\,\Lambda \le 1$, then $z_-(\Gamma) \le e^{-\tau|\Gamma|}$ for every $\Gamma$ in $\Lambda$,
(ii) $z_+(\Gamma) \le e^{-\tau|\Gamma|}$ for every $\Gamma$ with $\mathrm{diam}\,\Gamma \le n$.

Remarks. (1) Notice that, by definition, $\min(a_+, a_-) = 0$. Thus, by this lemma, at least one of the phases is always stable (the plus phase above). Moreover, by (ii) one actually has $\bar z_+ \equiv z_+$ and $\bar Z_\Lambda(+1) = Z_\Lambda(+1)$. Thus $f_+ = -\beta^{-1}\lim|\Lambda|^{-1}\log Z_\Lambda(+1) \equiv f$; the metastable free energy of the stable phase equals the standard free energy of the original model (which, actually, does not depend on the boundary conditions).

(2) The transition point $h_t$ is characterized by the equation $a_+ = a_- = 0$. The parameter $\max(a_+, a_-)$ can be viewed as a measure of distance from the transition point. For sufficiently large volumes the unstable phase is suppressed: the system with unstable (minus) boundary conditions prefers to flip to the plus phase over a long contour encircling a large part of the volume $\Lambda$. Even though the energy cost of such a large contour is of the order $|\partial\Lambda|$, there is a volume gain $a_-|\Lambda|$. The statement (i) of Lemma 2 then says that for ‘small volumes’ ($a_-\,\mathrm{diam}\,\Lambda \le 1$) the system prefers to stay in the minus phase. The closer to the transition point (i.e., the smaller the parameter $a_-$), the larger the volumes that are able to support the unstable phase. Very close to the transition point, both phases seem to be stable from the point of view of small volumes (we say that the unstable (minus) phase is metastable in small volumes) and only on passing to large volumes is the system able to distinguish which phase is really stable.

Proof. (i) By the induction hypothesis we can replace $Z_{\mathrm{Int}\,\Gamma}(\pm 1)$ by $\bar Z_{\mathrm{Int}\,\Gamma}(\pm 1)$. Applying then the equality (39) in the definition (22), we get

\[ \frac{Z_{\mathrm{Int}\,\Gamma}(+1)}{Z_{\mathrm{Int}\,\Gamma}(-1)} = \frac{\bar Z_{\mathrm{Int}\,\Gamma}(+1)}{\bar Z_{\mathrm{Int}\,\Gamma}(-1)} \le \exp\{-\beta(f_+ - f_-)|\mathrm{Int}\,\Gamma| + 2\varepsilon|\Gamma|\} = \exp\{a_-|\mathrm{Int}\,\Gamma| + 2\varepsilon|\Gamma|\} \le \exp\{(1 + 2\varepsilon)|\Gamma|\} \tag{42} \]
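with $\varepsilon$ of the order $e^{-\tau/2}$. In the last inequality we used the inequality $|\mathrm{Int}\,\Gamma| \le |\Gamma|\,\mathrm{diam}\,\Gamma$ and the assumption $a_-\,\mathrm{diam}\,\Lambda \le 1$. Taking into account that

\[ \rho(\gamma) \le \exp\{-2\beta(1 - 2|\kappa|(d-1))|\gamma|\}, \tag{43} \]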
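we get (8) for $z_-(\Gamma)$ once $2|\kappa|(d^2 - 1) + |h| < 1$ and $\beta$ is large enough.

(ii) Let us call small those contours that satisfy the condition $a_-\,\mathrm{diam}\,\Gamma \le 1$. The remaining contours will be called large. Resumming in (25) over all collections $\partial$ of contours with a fixed set $\vartheta$ of large external contours and using the induction hypothesis, we get

\[
\begin{aligned}
\frac{Z_{\mathrm{Int}\,\Gamma}(-1)}{Z_{\mathrm{Int}\,\Gamma}(+1)}
&= \sum_{\substack{\vartheta\ \text{large} \\ \vartheta\subset\mathrm{Int}\,\Gamma}}
\frac{Z^{\mathrm{small}}_{\mathrm{Ext}_{\mathrm{Int}\,\Gamma}(\vartheta)}(-1) \prod_{\gamma\in\vartheta} \rho(\gamma)\, e^{-\beta e_+|\partial_I\gamma|}\, \bar Z_{\mathrm{Int}\,\gamma}(+1)}{\bar Z_{\mathrm{Int}\,\Gamma}(+1)} \\
&\le e^{2\varepsilon|\Gamma|} \sum_{\substack{\vartheta\ \text{large} \\ \vartheta\subset\mathrm{Int}\,\Gamma}}
\exp\bigl\{-|\mathrm{Ext}_{\mathrm{Int}\,\Gamma}(\vartheta)|\,\beta(f_-^{\mathrm{small}} - f_+)\bigr\} \prod_{\gamma\in\vartheta} \rho(\gamma)\, e^{3\varepsilon|\gamma|}.
\end{aligned}
\tag{44}
\]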
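(One $\varepsilon|\gamma|$ in the last term comes from the bound on $\beta|e_+ - f_+||\partial_I\gamma|$.) In this equation, $Z^{\mathrm{small}}_{\mathrm{Ext}_\Lambda(\vartheta)}(-1)$ is the partition function with the sum taken only over small contours and $f_-^{\mathrm{small}}$ is the corresponding metastable free energy. Consider an auxiliary contour model with the weight

\[ \tilde z(\Gamma) = \begin{cases} \bigl(\rho((\Gamma,+)) + \rho((\Gamma,-))\bigr)\, e^{|\Gamma|} & \text{if } \Gamma \text{ is large,} \\ 0 & \text{otherwise.} \end{cases} \]

Taking into account the bound (43), we can show that

\[ Z(\Lambda; \tilde z) = \exp\{-\beta\tilde f|\Lambda| + \varepsilon|\partial\Lambda|\} \tag{45} \]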
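(cf. (39)) with $\varepsilon$ and $\beta\tilde f$ of the order $\exp\{-\tau/(2a_-)\}$ (only large contours, for which $|\Gamma| \ge \mathrm{diam}\,\Gamma \ge (a_-)^{-1}$, contribute). On the other side,

\[ \beta|f_- - f_-^{\mathrm{small}}| \le \sum_{\substack{C\ni i \\ \mathrm{diam}\,C \ge (a_-)^{-1}}} \frac{\Phi_-(C)}{|C|} \le \exp\Bigl\{-\frac{\tau}{2a_-}\Bigr\} \tag{46} \]

and thus

\[ \beta(f_-^{\mathrm{small}} - f_+) \ge a_- - \exp\{-\tau/(2a_-)\}. \]

Since $2\exp\{-\tau/(2a_-)\} \le 2(2a_-)/\tau \le a_-$, we have

\[ -\beta\tilde f = \beta|\tilde f| \le \exp\Bigl\{-\frac{\tau}{2a_-}\Bigr\} \le a_- - \exp\Bigl\{-\frac{\tau}{2a_-}\Bigr\} \le \beta(f_-^{\mathrm{small}} - f_+). \tag{47} \]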
Multiplying now the right-hand side of (44) by $e^{\beta\tilde f|\mathrm{Int}\,\Gamma| - \beta\tilde f|\mathrm{Int}\,\Gamma|}$ and using (47) we get the bound

\[ e^{2\varepsilon|\Gamma|}\, e^{\beta\tilde f|\mathrm{Int}\,\Gamma|} \sum_{\substack{\vartheta\ \text{large} \\ \vartheta\subset\Lambda}} \prod_{\gamma\in\vartheta} \rho(\gamma)\, e^{4\varepsilon|\gamma|}\, e^{-\beta\tilde f|\mathrm{Int}\,\gamma|}. \tag{48} \]

Applying twice the approximation (45) we get

\[ e^{2\varepsilon|\Gamma|}\, e^{\beta\tilde f|\mathrm{Int}\,\Gamma|} \sum_{\substack{\vartheta\ \text{large} \\ \vartheta\subset\mathrm{Int}\,\Gamma}} \prod_{\gamma\in\vartheta} \rho(\gamma)\, Z(\mathrm{Int}\,\gamma; \tilde z)\, e^{5\varepsilon|\gamma|} \le e^{2\varepsilon|\Gamma|}\, e^{\beta\tilde f|\mathrm{Int}\,\Gamma|}\, Z(\mathrm{Int}\,\Gamma; \tilde z) \le e^{\beta\tilde f|\mathrm{Int}\,\Gamma| - \beta\tilde f|\mathrm{Int}\,\Gamma| + 3\varepsilon|\Gamma|}. \tag{49} \]

Thus, referring again to the bound (43) and the definition (21), we conclude that $z_+(\Gamma)$ satisfies the bound (8).

The free energies $f_\pm$ are, in view of the equality (40), close to the ground configuration energies $e_\pm$; the difference $\beta g(\bar z_\pm)$ is of the order $e^{-\tau/2}$ (cf. (13)). Moreover, while the ground state energies $e_\pm$ are linear in $h$, the functions $\beta g(\bar z_\pm)$ can be shown to be Lipschitz with the Lipschitz constant of the order $e^{-\tau}$. Indeed, using the definition (7), the (one-sided) derivative $\frac{d}{dh} g(\bar z_\pm)$ can be expressed as the sum, over all contours $\Gamma$ passing through a given site, of the product of the probability of the appearance of $\Gamma$ (bounded by $e^{-\tau|\Gamma|}$) and the term $\frac{d}{dh}\log \bar z_\pm(\Gamma)$ (whenever $\bar z_\pm(\Gamma) \ne 0$). The latter can be bounded by $3|\Gamma| + 2|\mathrm{Int}\,\Gamma|$, as follows directly from the definitions (21) and (22). To get the bound

\[ \Bigl| \frac{d}{dh} \log \frac{Z_{\mathrm{Int}\,\Gamma}(\pm 1)}{Z_{\mathrm{Int}\,\Gamma}(\mp 1)} \Bigr| \le 2|\mathrm{Int}\,\Gamma|, \]

one takes into account the explicit expressions (3) and (2) (see [Z] for details).
Since the energies $e_\pm$ are linear in $h$, $e_\pm = \mp h \mp \kappa 2d(d-1)$, we infer that the free energies $f_\pm$ are ‘almost linear’ in $h$. As a result, there exists a unique solution $h_t$ of the equation $f_+ = f_-$ and the value $h_t$ differs from the value determined by equality of the ground configuration energies, $e_+ = e_-$, by at most $e^{-\tau}$ (remember that $e^{-\tau}$ can be taken to be of the order, say, $e^{-\beta}$)$^{11}$. This fact can be stated in a more general form: the phase diagram for large $\beta$ is a deformation, of the order $e^{-\beta}$, of the phase diagram at vanishing temperature ($\beta = \infty$). This statement remains true also when there are $r$ different ground configurations and one needs $(r-1)$ external fields to discriminate between them. The general statement of Pirogov–Sinai theory actually claims the above assertion for this case.

Having insufficient space here to discuss the various existing extensions of the original Pirogov–Sinai theory, we only mention two of them. One is the work of Bricmont and Slawny [BS, Sl], whose approach made it possible to study some systems with degenerate ground states. For example, it turned out to be useful for a discussion of the ANNNI model [DS] or lattice models of micro-emulsions [DM, KLMM]. An alternative approach to Pirogov–Sinai theory is based on an idea of renormalization group transformations applied to labeled contour models (cf. (15)–(18)) [GKK]. Combining these ideas with the Imry–Ma argument, Bricmont and Kupiainen were able to prove the existence of a phase transition for the three-dimensional random field Ising model [BKu].
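The characterization of $h_t$ as the solution of $f_+ = f_-$, with $f_\pm$ given by (40), can be illustrated numerically. The sketch below is a toy under loud assumptions: the corrections `g_plus` and `g_minus` are invented stand-ins for $g(\bar z_\pm)$, chosen only to be small and Lipschitz, and are not computed from any contour model; `KAPPA`, `D` and `DELTA` are hypothetical values. It locates the root of $f_+ - f_-$ by bisection so that it can be compared with the zero-temperature value determined by $e_+ = e_-$.

```python
from math import sin, cos

KAPPA, D = 0.05, 2     # hypothetical model parameters
DELTA = 1e-3           # hypothetical size of the corrections g

def e_plus(h):
    return -h - KAPPA * 2 * D * (D - 1)

def e_minus(h):
    return h + KAPPA * 2 * D * (D - 1)

# Invented stand-ins for g(z_bar_+) and g(z_bar_-): bounded by DELTA and
# Lipschitz with constant at most DELTA. Not derived from the model.
def g_plus(h):
    return DELTA * sin(h)

def g_minus(h):
    return 0.5 * DELTA * cos(h)

def f_plus(h):
    return e_plus(h) + g_plus(h)      # equation (40)

def f_minus(h):
    return e_minus(h) + g_minus(h)    # equation (40)

def a_plus_minus(beta, h):
    """The parameters a_+ and a_- of (41)."""
    f_min = min(f_plus(h), f_minus(h))
    return beta * (f_plus(h) - f_min), beta * (f_minus(h) - f_min)

def transition_point(lo=-1.0, hi=1.0, tol=1e-14):
    """Bisection for the root of f_+ - f_-, which is strictly decreasing
    in h because the corrections have Lipschitz constant well below 1."""
    def diff(h):
        return f_plus(h) - f_minus(h)
    assert diff(lo) > 0 > diff(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

H_ZERO_TEMPERATURE = -KAPPA * 2 * D * (D - 1)   # solution of e_+ = e_-
H_T = transition_point()
```

In this toy the shift `H_T - H_ZERO_TEMPERATURE` equals half of $g_+ - g_-$ at the root, so it is bounded by the size of the corrections, which is the mechanism behind the ‘deformation of the zero-temperature phase diagram’ statement.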
8. Finite Volume Asymptotics

Sticking to our illustrative perturbed Ising model, the issue is to find the asymptotic behaviour of the magnetization $m_L^{\mathrm{per}}(\beta, h) = \bigl\langle L^{-d}\sum_{i\in\Lambda}\sigma_i \bigr\rangle_L^{\mathrm{per}}$ in a finite cube, $|\Lambda| = L^d$, under periodic boundary conditions. The cubic geometry and periodic boundary conditions are considered here as the simplest case, and because this is the situation most often studied by computer simulations. We will comment on other cases later. In the limit $L \to \infty$, the magnetization $m_\infty^{\mathrm{per}}(\beta, h)$ displays, as a function of $h$, a discontinuity at $h = h_t(\beta)$. For finite $L$, the jump is smoothed into a steep increase in a neighbourhood of $h_t$. It is this rounding and its asymptotic behaviour that is our concern here.

The magnetization $m_L^{\mathrm{per}}(\beta, h)$ is, in terms of the corresponding partition function $Z_L^{\mathrm{per}}(\beta, h)$, given by

\[ m_L^{\mathrm{per}}(\beta, h) = -\frac{1}{\beta L^d}\, \frac{d}{dh} \log Z_L^{\mathrm{per}}(\beta, h). \tag{50} \]
We start from a representation of the partition function $Z_L^{\mathrm{per}}(\beta, h)$ in terms of a labeled contour model analogous to (15) and prove the validity of the following crucial approximation involving a smooth variant of the metastable free energies.

11 For the Potts model, the transition point $\beta_t$ can be claimed, for $q$ large, to be close (of the order $q^{-1/d}$) to the value yielded by the equation $(e^\beta - 1) = q^{1/d}$.
Lemma 3 [BK1]. For every $A \in (0, 1)$ there exist constants $b$ and $c$ and functions $f_+(\beta, h)$ and $f_-(\beta, h)$ that are four times differentiable in $h$ such that$^{12}$

\[ \min(f_+, f_-) = f = -\frac{1}{\beta} \lim \frac{1}{L^d} \log Z_L^{\mathrm{per}}(\beta, h), \tag{51} \]

\[ f_\pm - f \ge \pm c(h - h_t), \tag{52} \]

and

\[ \bigl| Z_L^{\mathrm{per}}(\beta, h) - \exp\{-\beta f_+ L^d\} - \exp\{-\beta f_- L^d\} \bigr| \le \exp\{-\beta f L^d - b\beta L\} \tag{53} \]

whenever $2|\kappa|(d^2 - 1) + |h| < A$ and $\beta$ is large enough.

Remarks. (1) There is an amusing immediate consequence of this Lemma [BI1, BK1]. Namely, the limit

\[ \lim_{L\to\infty} \frac{Z_L^{\mathrm{per}}(\beta, h)}{\exp\{-\beta f L^d\}} = N(\beta, h) \tag{54} \]

exists and yields an integer that equals the number of phases. This implies that $N(\beta, h) = 1$ for $h \ne h_t$ and $N(\beta, h) = 2$ for $h = h_t$. A similar claim is valid also in the general Pirogov–Sinai situation. In particular, for the Potts model the limit $N(\beta)$ equals

\[ N(\beta) = \begin{cases} q & \text{for } \beta > \beta_t, \\ q + 1 & \text{for } \beta = \beta_t, \\ 1 & \text{for } \beta < \beta_t. \end{cases} \]

(2) The fact that we are proving differentiability of $f_\pm$ up to the fourth order is a purely technical matter. We needed the error term of this order in the Taylor expansion of $f_\pm$ to evaluate the location of the maximum of the susceptibility (see Proposition 4 below). Even though we needed larger $\beta$ for higher orders, one can suppose that an optimization of the present methods would lead to bounds for all orders.

Main ideas of proof of Lemma 3. Suppressing all contours wrapped around the torus$^{13}$, at the cost of an error of the order $\exp\{-\beta f L^d - b\beta L\}$, we can approximate $Z_L^{\mathrm{per}}(\beta, h)$ by the sum of two terms, the contributions of all configurations with plus or minus external contours, respectively,

\[ Z_L^{\mathrm{per}}(\beta, h) \approx Z_L^{\mathrm{per},+}(\beta, h) + Z_L^{\mathrm{per},-}(\beta, h). \tag{55} \]
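The two-term structure of (53) and (55) already shows how the jump of the magnetization is rounded in a finite volume. The sketch below is a toy model under stated assumptions: the partition function is replaced by the sum of the two exponentials alone, the free energies are taken to be exactly linear, $f_\pm = -m_\pm (h - h_t)$ with hypothetical slopes `m_plus` and `m_minus`, and the magnetization is taken with the sign for which $m = -\partial f/\partial h$ (a convention chosen for this sketch). Under these assumptions the $h$-derivative of $\log Z$ can be compared with the closed form $m_0 + m\tanh(\beta L^d m (h - h_t))$, where $m_0 = \tfrac12(m_+ + m_-)$ and $m = \tfrac12(m_+ - m_-)$.

```python
from math import exp, log, tanh

def log_z_two_phase(h, beta, L, d, h_t, m_plus, m_minus):
    """log of exp(-beta f_+ L^d) + exp(-beta f_- L^d) for the hypothetical
    linear free energies f_+- = -m_+- * (h - h_t), evaluated stably."""
    vol = beta * L ** d
    a = vol * m_plus * (h - h_t)     # equals -beta f_+ L^d
    b = vol * m_minus * (h - h_t)    # equals -beta f_- L^d
    top = max(a, b)
    return top + log(exp(a - top) + exp(b - top))

def magnetization(h, beta, L, d, h_t, m_plus, m_minus, dh=1e-6):
    """(1 / (beta L^d)) d/dh log Z, by a central difference."""
    vol = beta * L ** d
    up = log_z_two_phase(h + dh, beta, L, d, h_t, m_plus, m_minus)
    down = log_z_two_phase(h - dh, beta, L, d, h_t, m_plus, m_minus)
    return (up - down) / (2 * dh * vol)

def magnetization_tanh(h, beta, L, d, h_t, m_plus, m_minus):
    """Closed form for the same toy model."""
    m0 = 0.5 * (m_plus + m_minus)
    m = 0.5 * (m_plus - m_minus)
    return m0 + m * tanh(beta * L ** d * m * (h - h_t))

def phase_count(h, beta, L, d, h_t, m_plus, m_minus):
    """Z / exp(-beta f L^d) with f = min(f_+, f_-), cf. (54)."""
    vol = beta * L ** d
    a = vol * m_plus * (h - h_t)
    b = vol * m_minus * (h - h_t)
    return 1.0 + exp(-abs(a - b))
```

In this toy the width of the rounding region is of the order $(\beta L^d m)^{-1}$, and `phase_count` equals 2 at $h = h_t$ and approaches 1 away from it as $L$ grows, mirroring Remark (1). The corrections coming from the curvature of $f_\pm$ and from the error term in (53) are deliberately left out.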
For $h$ close to $h_t$, so that $\max(a_-, a_+)L \le \tau/4$, both phases can be treated as stable in $\Lambda$ (in Lemma 2 we can clearly get the same statement with the damping weakened to $e^{-\tau|\Gamma|/2}$ and with the bound $a_-\,\mathrm{diam}\,\Lambda \le 1$ replaced by $a_-\,\mathrm{diam}\,\Lambda \le \tau/4$). The right-hand side of (55) then equals

\[ \bar Z_L^{\mathrm{per},+}(\beta, h) + \bar Z_L^{\mathrm{per},-}(\beta, h) = \exp\{-\beta e_+ L^d\}\, Z^{\mathrm{per}}(L^d; \bar z_+) + \exp\{-\beta e_- L^d\}\, Z^{\mathrm{per}}(L^d; \bar z_-). \tag{56} \]

Here $Z^{\mathrm{per}}(L^d; \bar z_\pm)$ are the contour model partition functions defined by (7) with collections of contours on the torus without any contour wrapped around it. Being defined on the torus, their approximation by $\exp\{-\beta g(\bar z_\pm)L^d\}$ is very accurate. Namely, the first clusters $C$ in which the cluster expansions for $-\beta g(\bar z_\pm)L^d$ and $\log Z^{\mathrm{per}}(L^d; \bar z_\pm)$ differ are the clusters wrapped around the torus. In particular, unlike in (39), there is no surface term proportional to $L^{d-1}$ and we have

\[ \bigl| \log \bar Z_L^{\mathrm{per},\pm}(\beta, h) + \beta f_\pm L^d \bigr| \le e^{-\tau L}. \tag{57} \]

12 The function $f$ is the standard free energy (cf. Remark 1 after Lemma 2).

13 These are simply configurations for which we might be in doubt whether to classify them as belonging to the plus or minus phase, and for which the notion of external contours might not be well defined.
On the other side, if, say, $a_- L \ge \tau/4$, then

\[ Z_L^{\mathrm{per},-}(\beta, h)\, e^{\beta f L^d} \le e^{-a_- L^d/2} + e^{-\tau b'' L^{d-1}}. \tag{58} \]
Namely, one is either losing the bulk term of the order $a_-$ or there is a long contour along which the configuration flips from minuses to pluses. Thus we obtain an expression of the form (53) with the metastable free energies $f_\pm$.

Even though the weights $z_\pm$ defined by (21) and (22) are smooth functions of $h$ (and of $\beta$), the definition (37) is rather discontinuous. However, it turns out that there is actually some freedom in the definition of $\bar z_\pm$ that allows us to modify the definition of the metastable free energies to make them smooth. Namely, the only property really needed is that the weights $\bar z_\pm$ are damped and that $\bar z_\pm = z_\pm$ in the metastable situation (i.e., when $\max(a_+, a_-)\,\mathrm{diam}\,\Gamma \le \tau/4$). To avoid a reference to $a_+$ and $a_-$ defined in the limit $\Lambda \to \mathbb{Z}^d$, we have chosen in [BK1] (see also [HKZ2]) the inductive definition

\[ \bar z_+(\Gamma) = \rho((\Gamma, +))\, e^{-\beta(e_- - e_+)|\partial_I\Gamma|}\, \frac{Z_{\mathrm{Int}\,\Gamma}(-1)}{\bar Z_{\mathrm{Int}\,\Gamma}(+1)}\, \Theta_{+,\Gamma} \tag{59} \]

(and a similar definition for $\bar z_-(\Gamma)$). Here $\Theta_{+,\Gamma}$ is an indicator function (defined also in an inductive way) that interpolates smoothly between 0 and 1 (in the metastable region):

\[ \Theta_{+,\Gamma} = \begin{cases} 0 & \text{whenever ($h$ is such that) } \bar Z_{\mathrm{Int}\,\Gamma}(+1) \le \exp\{-\tfrac14\tau|\Gamma| - 1\}\, \bar Z_{\mathrm{Int}\,\Gamma}(-1), \\ 1 & \text{whenever } \bar Z_{\mathrm{Int}\,\Gamma}(+1) \ge \exp\{-\tfrac14\tau|\Gamma| + 1\}\, \bar Z_{\mathrm{Int}\,\Gamma}(-1). \end{cases} \tag{60} \]

Following the method of the proof of Lemma 2, it is easy to verify that these weights meet the above formulated conditions. Indeed, proceeding by induction on $\mathrm{diam}\,\Gamma = n$, in the metastable region we have $\Theta_{\pm,\Gamma} = 1$ and $\bar Z_{\mathrm{Int}\,\Gamma}(\pm 1) = Z_{\mathrm{Int}\,\Gamma}(\pm 1)$. On the other side, introducing $\bar z_\pm^{(n)}$ by taking $\bar z_\pm$ as already defined for $\Gamma$ with $\mathrm{diam}\,\Gamma < n$ and setting $\bar z_\pm^{(n)}(\Gamma) = 0$ otherwise, and denoting by $f_\pm^{(n)}$ the corresponding free energy and $a_\pm^{(n)} = \beta\bigl(f_\pm^{(n)} - \min(f_+^{(n)}, f_-^{(n)})\bigr)$, we prove by induction that

\[ Z_{\mathrm{Int}\,\Gamma}(\pm 1) \le \exp\bigl\{-\beta\min(f_+^{(n)}, f_-^{(n)})|\mathrm{Int}\,\Gamma| + |\Gamma|\bigr\}. \]

Thus,

\[ \frac{Z_{\mathrm{Int}\,\Gamma}(\mp 1)}{\bar Z_{\mathrm{Int}\,\Gamma}(\pm 1)} \le \exp\bigl\{|\Gamma| - \beta\min(f_+^{(n)}, f_-^{(n)})|\mathrm{Int}\,\Gamma| + \beta f_\pm^{(n)}|\mathrm{Int}\,\Gamma|\bigr\} \le \frac{\bar Z_{\mathrm{Int}\,\Gamma}(\mp 1)}{\bar Z_{\mathrm{Int}\,\Gamma}(\pm 1)} \exp\{2|\Gamma|\}, \]

and by (60) the indicator $\Theta_{\pm,\Gamma} = 0$ whenever the ratio

\[ \frac{Z_{\mathrm{Int}\,\Gamma}(\mp 1)}{\bar Z_{\mathrm{Int}\,\Gamma}(\pm 1)} \le \frac{\bar Z_{\mathrm{Int}\,\Gamma}(\mp 1)}{\bar Z_{\mathrm{Int}\,\Gamma}(\pm 1)}\, e^{2|\Gamma|} \]

is too large. The new weights $\bar z_\pm(\Gamma)$ redefined in this way yield the metastable free energies $f_\pm = \lim_{n\to\infty} f_\pm^{(n)}$ and $a_\pm = \lim_{n\to\infty} a_\pm^{(n)}$. These parameters might slightly differ from $a_\pm$ in Lemma 2; they vanish, however, for the same set of external fields $h$ and yield the same $h_t$ (as they should). Moreover, the new metastable free energies $f_\pm$ are smooth. Namely, in essentially the same way as when proving Lemma 2, we can bound also the derivatives of $\bar z_\pm(\Gamma)$. The inductive step for that consists of bounds of the type (42) and (44) with (49) for the derivatives of the left-hand sides of (42) and (44). See [BK1] for details.

The magnetization $m_\infty^{\mathrm{per}}(\beta, h)$ as well as the susceptibility $\chi_\infty^{\mathrm{per}}(\beta, h)$ (recall that the perturbed Ising model does not have the plus-minus symmetry) may have a discontinuity at $h = h_t$. Let us introduce the spontaneous magnetizations and susceptibilities

\[ m_\pm = \lim_{h\to h_t\pm} m_\infty^{\mathrm{per}}(\beta, h), \qquad m_0 = \tfrac12(m_+ + m_-), \qquad m = \tfrac12(m_+ - m_-), \]

\[ \chi_\pm = \frac{\partial m_\infty^{\mathrm{per}}(\beta, h)}{\partial h^{\pm}}, \qquad \chi_0 = \tfrac12(\chi_+ + \chi_-), \qquad \chi = \tfrac12(\chi_+ - \chi_-). \]

It turns out that, in spite of the asymmetry of the model, the finite volume magnetization $m_L^{\mathrm{per}}(\beta, h)$ has a universal behaviour in the neighbourhood of the transition point $h_t$. Expanding the metastable free energies in (53) into a Taylor expansion around $h_t$, we get the following proposition in a rather straightforward manner (again, see [BK1] for the proof).
Proposition 3 [BK1]. For any $A \in (0, 1)$ there exist constants $K$ and $b$ such that the approximation
$$m_L^{\mathrm{per}}(\beta, h) = m_0 + \chi_0 (h - h_t) + \bigl(m + \chi (h - h_t)\bigr) \tanh\Bigl\{ L^d \beta \bigl[ m (h - h_t) + \tfrac12 \chi (h - h_t)^2 \bigr] \Bigr\} + R(h, L)$$
with the error bound $|R(h, L)| \le e^{-b\beta L} + K (h - h_t)^2$ is valid whenever $2|\kappa|(d^2 - 1) + |h| < A$ and $\beta$ is large enough.

Having now a good control over the behaviour of $m_L^{\mathrm{per}}(\beta, h)$ in the transition region, we can evaluate the asymptotic behaviour of different variants of the finite volume approximations of the transition point. This is important for the interpretation of computer simulations. In particular, comparison with theoretically predicted asymptotic behaviour is used to settle the question whether an unknown transition is continuous or first-order. When only finite size data are available, a natural choice for the transition point is the value $h_{\max}(L)$ for which the susceptibility $\partial m_L^{\mathrm{per}}(\beta, h)/\partial h$ attains its maximum (the inflection point of $m_L^{\mathrm{per}}(\beta, h)$). Other possible definitions: the point $h_0(L)$ for which $m_L^{\mathrm{per}}(\beta, h) = m_0$ or the point $h_t(L)$ for which an approximation to (54), say
$$N_L(\beta, h) = \left[ \frac{Z_L^{\mathrm{per}}(\beta, h)^{2^d}}{Z_{2L}^{\mathrm{per}}(\beta, h)} \right]^{\frac{1}{2^d - 1}},$$
attains its maximum. In fact, the latter is exactly the point for which $m_L^{\mathrm{per}}(\beta, h) = m_{2L}^{\mathrm{per}}(\beta, h)$. With the help of Proposition 3 we get:

Proposition 4 [BK1]. For a fixed constant $\delta$, $2|\kappa|(d^2 - 1) + |h| < 1$, and $\beta$ large enough, one has
(i) $h_{\max}(L) = h_t + \dfrac{3\chi}{2\beta^2 m^3} L^{-2d} + O(L^{-3d})$,
(ii) in the interval $[h_t - \delta, h_t + \delta]$, there exists a unique $h_0(L)$ for which $m_L^{\mathrm{per}}(\beta, h) = m_0$; for this $h_0(L)$ one has $h_0(L) = h_t + O(e^{-b'\beta L})$, and
(iii) $h_t(L) = h_t + O(e^{-b'\beta L})$.

A popular testing ground for discussion of finite size simulation data is the Potts model (see, e.g., [CLB, BJ, BLM, LK]). Similar results as above can be proved [BKM] for the Potts model with $d \ge 2$ and $q$ large enough. In this case, the mean energy can be approximated by
$$E_L^{\mathrm{per}}(\beta) \approx E_0 + E \tanh\Bigl\{ E (\beta - \beta_t) L^d + \tfrac12 \log q \Bigr\}. \tag{61}$$
As a consequence, the inverse temperature $\beta_{\max}(L)$ where the slope of $E_L^{\mathrm{per}}(\beta)$ is maximal is shifted by
$$\beta_{\max}(L) - \beta_t = -\frac{\log q}{2E} L^{-d} + O(L^{-2d}), \tag{62}$$
while the inverse temperature $\beta_t(L)$ for which $N_L(\beta)$ is maximal again differs from $\beta_t$ only by an exponentially small error $O(q^{-bL})$. It seems that the value $h_t(L)$ (resp. $\beta_t(L)$) with an exponentially small shift might be particularly useful in determining the transition point. For further discussion illustrated by computer simulations see [BKa, BJ]. Notice the difference between the asymptotic behaviour of the shift in Proposition 4(i) for the perturbed Ising model and (62) for the Potts model. Proposition
4(i) actually settled a controversy [BL, CLB] about the order of the shift. The proof that the shift is of the order $L^{-2d}$ follows by showing that $\left.\dfrac{\partial \chi_L^{\mathrm{per}}(\beta, h)}{\partial h}\right|_{h = h_t}$ is of the order $L^d$ and $\dfrac{\partial^2 \chi_L^{\mathrm{per}}(\beta, h)}{\partial h^2}$ does not exceed $L^{3d}$ in the interval $(h_t - \mathrm{const.}\,L^{-d},\ h_t + \mathrm{const.}\,L^{-d})$. The fact that the shift for the Potts model is of the order $L^{-d}$ can be traced down to the term $\log q$ in the argument of tanh in (61), i.e., to the fact that at $\beta_t$ we have coexistence of $q$ low temperature phases with one high temperature phase. The perturbed Ising model corresponds in this sense to $q = 1$ (coexistence of one phase for $h \le h_t$ with one phase for $h \ge h_t$) and the term of the order $L^{-d}$ multiplied by the factor $\log q$ vanishes.

Two final remarks. First, similarly as in the last section, the theory can be extended to cover more general situations with several coexisting phases. See [BK1] for a discussion of such cases. Secondly, as already mentioned, asymptotic behaviour for other geometries as well as other boundary conditions was also studied. In the case of cylinder geometry, $\Lambda = M \times \cdots \times M \times L$ with $L$ much larger than $M$, one obtains an effective one-dimensional model and the asymptotics can be studied with the help of the method of transfer matrix [BI2, B]. Another interesting case concerns surface induced shifts (in cubic geometry) driven by the free boundary conditions with possible addition of boundary fields. The shift of the transition point is of the order $L^{-1}$ and can be explicitly computed in terms of (cluster expansions of) surface free energies [BK3].

References
[BL] Binder, K. and Landau, D. P. (1984). Finite-size scaling at first-order phase transitions. The Physical Review B 30, 1477–1485.
[B] Borgs, C. (1992). Finite-size scaling for Potts models in long cylinders. Nuclear Physics B 384, 605–645.
[BI1] Borgs, C. and Imbrie, J. (1989). A unified approach to phase diagrams in field theory and statistical mechanics. Communications in Mathematical Physics 123, 305–328.
[BI2] Borgs, C. and Imbrie, J. (1992). Finite-size scaling and surface tension from effective one dimensional systems. Communications in Mathematical Physics 145, 235–280.
[BI3] Borgs, C. and Imbrie, J. (1992). Crossover-finite-size scaling at first-order transitions. Journal of Statistical Physics 69, 487–537.
[BJ] Borgs, C. and Janke, W. (1992). New method to determine first-order transition points from finite-size data. Physical Review Letters 68, 1738–1741.
[BKa] Borgs, C. and Kappler, S. (1992). Equal weight versus equal height: A numerical study of an asymmetric first-order transition. Physics Letters A 171, 36–42.
[BK1] Borgs, C. and Kotecký, R. (1990). A rigorous theory of finite-size scaling at first-order phase transitions. Journal of Statistical Physics 61, 79–119.
[BK2] Borgs, C. and Kotecký, R. (1992). Finite-size effects at asymmetric first-order phase transitions. Physical Review Letters 68, 1734–1737.
[BK3] Borgs, C. and Kotecký, R. (1993). Surface induced finite-size effects for first-order phase transitions, in preparation.
[BKM] Borgs, C., Kotecký, R., and Miracle-Solé, S. (1991). Finite-size scaling for Potts models. Journal of Statistical Physics 62, 529–552.
[BKu] Bricmont, J. and Kupiainen, A. (1987). Lower critical dimensions for the random field Ising model. Physical Review Letters 59, 1829–1832; (1988). Phase transition in the 3d random field Ising model. Communications in Mathematical Physics 116, 539–572.
[BKL] Bricmont, J., Kuroda, T., and Lebowitz, J. (1985). First order phase transitions in lattice and continuum systems: Extension of Pirogov–Sinai theory. Communications in Mathematical Physics 101, 501–538.
[BS] Bricmont, J. and Slawny, J. (1989). Phase transitions in systems with a finite number of dominant ground states. Journal of Statistical Physics 54, 89–161.
[CLB] Challa, M. S. S., Landau, D. P., and Binder, K. (1986). Finite-size effects at temperature-driven first-order transitions. The Physical Review B 34, 1841–1852.
[CKS] Chayes, L., Kotecký, R., and Shlosman, S. Aggregation and intermediate phases in dilute spin systems, in preparation.
[DM] Dinaburg, E. L. and Mazel, A. E. (1989). Analysis of low-temperature phase diagram of the microemulsion model. Communications in Mathematical Physics 125, 25–42.
[DS] Dinaburg, E. L. and Sinaï, Ya. G. (1985). An analysis of ANNNI model by Peierls contour method. Communications in Mathematical Physics 98, 119–144.
[D1] Dobrushin, R. L. (1965). Existence of a phase transition in the two-dimensional and three-dimensional Ising models. Soviet Physics Doklady 10, 111–113.
[D2] Dobrushin, R. L. (1968). The problem of uniqueness of a Gibbsian random field and the problem of phase transitions. Funkcional. Anal. i Priložen. 2, 44–57; English transl. in Functional Analysis Appl. 2, 302.
[DKS] Dobrushin, R. L., Kotecký, R., and Shlosman, S. (1992). The Wulff construction: a global shape from local interactions. Translations Of Mathematical Monographs 104. AMS, Providence, Rhode Island.
[EFK] Enter, A. van, Fernández, R., and Kotecký, R., in preparation.
[FK] Fortuin, C. M. and Kasteleyn, P. W. (1972). On the random cluster model I. Introduction and relation to other models. Physica 57, 536–564.
[GKK] Gawędzki, K., Kotecký, R., and Kupiainen, A. (1987). Coarse-graining approach to first-order phase transitions. Journal of Statistical Physics 47, 701–724.
[G] Griffiths, R. B. (1964). Peierls proof of spontaneous magnetization in a two-dimensional Ising ferromagnet. The Physical Review A 136, 437–439.
[Gr] Grimmett, G. (1994). Percolative problems. Probability and Phase Transition (G. R. Grimmett, ed.), Kluwer, Dordrecht, pp. 69–86, this volume.
[GK] Gruber, C. and Kunz, H. (1971). General properties of polymer systems. Communications in Mathematical Physics 22, 133–161.
[HKZ1] Holický, P., Kotecký, R., and Zahradník, M. (1988). Rigid interfaces for lattice models at low temperatures. Journal of Statistical Physics 50, 755–812.
[HKZ2] Holický, P., Kotecký, R., and Zahradník, M. (1993), in preparation.
[KLMR] Kotecký, R., Laanait, L., Messager, A., and Ruiz, J. (1990). The q-state Potts model in the standard Pirogov–Sinai theory: surface tensions and Wilson loops. Journal of Statistical Physics 58, 199–248.
[KLMM] Kotecký, R., Laanait, L., Messager, A., and Miracle-Solé, S. (1993). A spin-one lattice model of microemulsions at low temperatures. Journal of Physics A: Mathematical and General, in print.
[KP1] Kotecký, R. and Preiss, D. (1984). An inductive approach to the Pirogov–Sinai theory. Suppl. ai Rendiconti del Circolo Matem. di Palermo, Ser. II 3, 161–164.
[KP2] Kotecký, R. and Preiss, D. (1986). Cluster expansion for abstract polymer models. Communications in Mathematical Physics 103, 491–498.
[KS] Kotecký, R. and Shlosman, S. B. (1982). First-order transitions in large entropy lattice models. Communications in Mathematical Physics 83, 493–515.
[LK] Lee, J. and Kosterlitz, J. M. (1991). Finite size scaling and Monte Carlo simulations of first order phase transitions. The Physical Review B 43, 3265–3277.
[M] Martirosian, D. H. (1986). Translation invariant Gibbs states in q-state Potts model. Communications in Mathematical Physics 105, 281–290.
[N] Newman, C. M. (1994). Disordered Ising systems and random cluster representations. Probability and Phase Transition (G. R. Grimmett, ed.), Kluwer, Dordrecht, pp. 247–260, this volume.
[P] Peierls, R. (1936). On the Ising model of ferromagnetism. Proceedings of the Cambridge Philosophical Society 32, 477–481.
[PS] Pirogov, S. and Sinai, Ya. G. (1975). Phase diagrams of classical lattice systems. Theoretical and Mathematical Physics 25, 1185–1192; (1976) 26, 39–49.
[Se] Seiler, E. (1982). Gauge Theories as a Problem of Constructive Quantum Field Theory and Statistical Mechanics. Lecture Notes in Physics, 159, Springer, Berlin.
[S] Sinai, Y. G. (1982). Theory of Phase Transitions: Rigorous results. Pergamon Press, Oxford.
[Sl] Slawny, J. (1987). Low temperature properties of classical lattice systems: Phase transitions and phase diagrams. Phase Transitions and Critical Phenomena (C. Domb and J. L. Lebowitz, eds.), vol. 11, Academic Press, New York, pp. 127–205.
[Z] Zahradník, M. (1984). An alternate version of Pirogov–Sinai theory. Communications in Mathematical Physics 93, 559–581.
DIFFUSION IN RANDOM AND NON-LINEAR PDE’S
A. KUPIAINEN Mathematics Department Helsinki University P.O. Box 4 Helsinki 00014 Finland
Abstract. We review Renormalization Group methods developed for the study of large time asymptotics of non-linear parabolic PDE’s, random walks in random environments and certain non-Markovian random walks. Key words: Renormalization, diffusion, nonlinear pde, random walk in random environment.
1. Introduction
Many of the most difficult problems in non-linear analysis and probability theory are ones with infinitely many degrees of freedom that are strongly interacting or dependent. Such problems are sometimes also approximately scale invariant and the proper understanding and use of this property can be the key to the solution. Diffusive behaviour is a prime example of such phenomena. A quantity $u(x, t)$ (e.g., a solution of a PDE or the probability distribution of a random walk) is diffusive if
$$u(x, t) \sim t^{-\alpha/2} f^*\!\left(\frac{x}{\sqrt{t}}\right). \tag{1}$$
Often such a limit law is universal: the $\alpha$ and $f^*$ will not change under suitable perturbations of the problem. The connection between scaling and universality was first understood in the physics of quantum fields and critical phenomena using Renormalization Group theory. Here I would like to discuss some simple and less simple applications of these ideas to the mathematics of non-linear parabolic PDE's and disordered or non-Markovian random walks.

2. Non-Linear Parabolic PDE's
As a first and simplest example of the RG strategy (Barenblatt 1979, Goldenfeld et al. 1990, Bricmont et al. 1992), we will consider nonlinear heat equations of the type
$$\partial_t u = \Delta u + F(u, \nabla u, \nabla\nabla u), \tag{2}$$
where $u(x, t) \in \mathbb{R}$ (or $\mathbb{R}^m$ in general), $x \in \mathbb{R}^d$, $t \in \mathbb{R}_+$. We want to prove global existence of solutions to (2) and study the possibility of diffusive asymptotics for the solution, i.e., whether (1) holds as $t \to \infty$ and how it depends on the initial data and the concrete form of (2).

The idea of the RG is to turn the question about asymptotics into a question about stability of fixed points of a dynamical system. Briefly, the idea is as follows. The RG map $R_L$, for $L > 1$, is defined in a suitable Banach space $B$ of initial data $f(x) = u(x, 1)$ and a suitable space of non-linearities $F$. These depend on the particular problem and we will give concrete examples below. $R_L$ consists of two operations. First, one solves (2) up to the finite time $L^2$, i.e., proves a local existence and uniqueness theorem. The second step consists of a scale transformation, modelled after the expected asymptotics (1): we scale the solution and correspondingly the equation; let $R_L(f, F) = (f_L, F_L)$ where $f_L(x) = L^\alpha u(Lx, L^2)$ and $F_L(u, v, w) = L^{2+\alpha} F(L^{-\alpha} u, L^{-\alpha-1} v, L^{-\alpha-2} w)$. This scaling assures the semigroup property $R_{L^n} = R_L^n$ on a common domain.

Suppose now one is able to prove that $R_L$ maps our space of data and equations (or some ball in it) to itself, and further that we get convergence in the appropriate norms $R_L^n(f, F) \to (f^*, F^*)$ as $n \to \infty$, where $(f^*, F^*)$ is a fixed point for $R_L$. Then we have the asymptotics (1): $L^{n\alpha} u(L^n x, L^{2n}) \sim f^*(x)$, i.e., setting $t = L^{2n}$, (1) follows. Universality, i.e., independence of the asymptotics on $f$ and $F$, is then explained in terms of a dynamical systems picture: if $(f, F)$ lies on the stable manifold of the fixed point, all the corresponding equations and data have the same asymptotics.

As a trivial example, consider the linear equation $F = 0$, i.e., the heat equation. Anticipating the result, put $\alpha = d$, and get for the Fourier transform $\hat f$ of $f$
$$\widehat{(R_L f)}(k) = e^{-(1 - L^{-2}) k^2}\, \hat f(k/L), \tag{3}$$
which converges for an integrable $f$ as $L \to \infty$ to $\hat f(0) \hat f^*$, a multiple of the Gaussian fixed point $f^*(x) = C e^{-x^2/4}$. Note how in this case we have a one parameter family of fixed points, and the location where we end on this curve depends on the initial data. In this case, we also see explicitly the stability of the fixed point: since $R$ is linear, it equals its derivative and this linear operator is, from (3),
$$R_L = L^{\mathcal{L}_0}$$
where
$$\mathcal{L}_0 = \Delta + \frac{\xi}{2} \cdot \nabla + \frac{d}{2}; \tag{4}$$
$\mathcal{L}_0$ is conjugate to the Schrödinger operator of the harmonic oscillator
$$e^{\xi^2/8}\, \mathcal{L}_0\, e^{-\xi^2/8} = \Delta - \frac{\xi^2}{16} + \frac{d}{4}. \tag{5}$$
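The linear RG map (3) is simple enough to iterate numerically in Fourier space. The following is only a minimal sketch in $d = 1$, not part of the original argument: the function names and the sample datum $f(x) = e^{-|x|}$ (with $\hat f(k) = 2/(1+k^2)$) are choices made here for illustration. It shows the approach to $\hat f(0) e^{-k^2}$, the semigroup property, and the contraction of the modes $k^m e^{-k^2}$ by the factor $L^{-m}$.

```python
import math

def rg_step(f_hat, L):
    """Fourier transform of R_L f for the heat equation, following eq. (3)."""
    damping = 1.0 - L ** -2
    return lambda k: math.exp(-damping * k * k) * f_hat(k / L)

def iterate_rg(f_hat, L, n):
    """Apply the RG step n times, i.e. compute the transform of R_L^n f."""
    g = f_hat
    for _ in range(n):
        g = rg_step(g, L)
    return g

def exp_abs_hat(k):
    # Illustrative initial datum (an assumption): f(x) = exp(-|x|), so f_hat(0) = 2.
    return 2.0 / (1.0 + k * k)

def gaussian_hat(k):
    # Transform of the Gaussian fixed point, normalized so that its value at k = 0 is 1.
    return math.exp(-k * k)

if __name__ == "__main__":
    g = iterate_rg(exp_abs_hat, 2.0, 20)
    for k in (0.0, 0.5, 1.0, 2.0):
        # The two columns should agree closely after 20 iterations.
        print(k, g(k), exp_abs_hat(0.0) * gaussian_hat(k))
```

Only the value $\hat f(0)$ of the initial datum survives the iteration, which is the one neutral direction; every other detail of $\hat f$ is scaled away.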
Thus $\mathcal{L}_0$ is self-adjoint on its domain in $L^2(\mathbb{R}^d, d\mu)$, where $d\mu(\xi) = e^{\xi^2/4} d\xi$, and it has a pure point spectrum $\{-\tfrac12 m \mid m = 0, 1, \ldots\}$. We have one neutral direction, corresponding to the multiple of the fixed point, and the rest of the spectrum is contractive. This discreteness of the spectrum of the derivative of the RG at a fixed point will be true more generally: the scaling achieves this.

Linear diffusion equations can of course be understood in many ways, so let us now consider the non-linear equations, and start by studying the stability of the Gaussian fixed point $(Af^*, 0)$ in the space of the data and equations. For simplicity, consider just the simple non-linearity and $d = 1$ (for the general treatment, see Bricmont et al. 1992)
$$u_t = u_{xx} + \lambda u^p \tag{6}$$
(where $u^p \equiv |u|^{p-1} u$). With $\alpha = 1$ and $F = u^p$ we have $F_L = L^{3-p} F$, i.e., our Gaussian fixed point seems stable if $p > 3$. This indeed is easy to prove, for a much larger class of $F$'s, for small data that decays suitably at infinity (Bricmont et al. to appear): the equation has asymptotics given by $Af^*$, where the only dependence on the data and $F$ is in the constant $A$. For $p \le 3$ the situation is more interesting. We need to take $\lambda \le 0$, otherwise the solution blows up in finite time (see below). Then, for $p = 3$,
$$n^{1/2} R_L^n(f, F) \to (Af^*, F),$$
i.e., we get a logarithmic correction:
$$u(x, t) \sim A (t \log t)^{-1/2} f^*(x t^{-1/2}).$$
Finally, for $p < 3$, the Gaussian fixed point is unstable in the $F$ direction, and we need to change $\alpha$. Equation (6) is invariant under the scaling $u \to u_L$ with $u_L(x, t) = L^{2/(p-1)} u(Lx, L^2 t)$, which suggests setting $f_L(x) = L^{2/(p-1)} u(Lx, L^2)$ and a corresponding definition of $F_L$. Then $R_L(f_\gamma, F^*) = (f_\gamma, F^*)$ for $F^*(u) = -u^p$, where $f_\gamma$ is a one parameter family of non-Gaussian fixed points of the RG. These come as scale invariant solutions of the equation (6) (this is just the fixed point condition in the $f$ variable):
$$u(x, t) = t^{-1/(p-1)} f(x t^{-1/2}). \tag{7}$$
(6) implies that $f$ solves the ordinary differential equation
$$f'' + \frac12 x f' + \frac{1}{p-1} f - f^p = 0. \tag{8}$$
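For the reader's convenience, the step from (6) and (7) to (8) can be written out as follows. This is a sketch under two assumptions made here: $\lambda = -1$ (i.e., $F^*(u) = -u^p$), and the shorthand $\xi = x t^{-1/2}$ for the argument of $f$.

```latex
% Substitute u(x,t) = t^{-1/(p-1)} f(\xi), \xi = x t^{-1/2}, into u_t = u_{xx} - u^p.
\begin{align*}
u_t    &= -\,t^{-\frac{1}{p-1}-1}\Bigl(\tfrac{1}{p-1}\, f(\xi) + \tfrac12\, \xi f'(\xi)\Bigr),\\
u_{xx} &= t^{-\frac{1}{p-1}-1}\, f''(\xi),\\
u^p    &= t^{-\frac{p}{p-1}}\, f(\xi)^p = t^{-\frac{1}{p-1}-1}\, f(\xi)^p .
\end{align*}
% All three terms carry the same power of t; cancelling it leaves
% f'' + \tfrac12 \xi f' + \tfrac{1}{p-1} f - f^p = 0, which is (8) in the variable \xi.
```

The common power of $t$ is what makes (7) a consistent ansatz: the exponent $1/(p-1)$ is the only one for which $u^p$ scales like $u_t$ and $u_{xx}$.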
The theory of positive solutions of (8) has been developed in Brezis et al. (1986), Galaktionov et al. (1986), Kamin and Peletier (1985). The main result is that, for any $p > 1$, there exist smooth, everywhere positive solutions, $f_\gamma$, of (8) with $f_\gamma'(0) = 0$ and $f_\gamma(0) = \gamma$ for $\gamma$ larger than a certain critical value $\gamma_p$ (but not too large). Actually, for $p < 3$, $\gamma_p > 0$, while $\gamma_p = 0$ for $p \ge 3$. The existence of a critical $\gamma_p$ can be understood intuitively by viewing (8) as Newton's equation for a particle of mass one, whose 'position' as a function of 'time' is $f(x)$. The potential is then
$$U(f) = \frac{f^2}{2(p-1)} - \frac{f^{p+1}}{p+1}$$
and the 'friction term' $\tfrac12 x f'$ depends on the 'time' $x$. Hence, if $f'(0) = 0$ and $f(0) = \gamma$ is large enough, the time it takes to approach zero is long and, by then, the friction term has become sufficiently strong to prevent 'overshooting'. However, as $p$ increases, the potential becomes flatter and one therefore expects $\gamma_p$ to decrease with $p$. These solutions have the asymptotics $f_\gamma(x) \sim |x|^{-2/(p-1)}$ as $|x| \to \infty$ if $\gamma > \gamma_p$, while, for $\gamma = \gamma_p$, it decays at infinity as
$$f_{\gamma_p}(x) \sim |x|^{\{2/(p-1)\}-1}\, e^{-x^2/4}.$$
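The Newtonian picture can be explored with a simple shooting experiment. The sketch below is an illustration only and not the method of the cited works: it integrates (8) from $f(0) = \gamma$, $f'(0) = 0$ with a classical fourth-order Runge–Kutta step, and the function name, step count and cut-off `x_max` are arbitrary choices made here. For $p = 2$ the constant $f \equiv 1$ is an exact solution (the top of the potential $U$), while a very small $\gamma$ overshoots and becomes negative, in line with $\gamma_p > 0$ for $p < 3$.

```python
def shoot(gamma, p, x_max=3.0, steps=3000):
    """Integrate eq. (8) with f(0) = gamma, f'(0) = 0; return a list of (x, f(x))."""
    def rhs(x, f, g):
        # First-order system: f' = g, g' = -x g / 2 - f / (p - 1) + |f|^(p-1) f.
        return g, -0.5 * x * g - f / (p - 1.0) + abs(f) ** (p - 1.0) * f

    h = x_max / steps
    x, f, g = 0.0, gamma, 0.0
    path = [(x, f)]
    for _ in range(steps):
        # One classical RK4 step for the pair (f, g).
        k1f, k1g = rhs(x, f, g)
        k2f, k2g = rhs(x + h / 2, f + h / 2 * k1f, g + h / 2 * k1g)
        k3f, k3g = rhs(x + h / 2, f + h / 2 * k2f, g + h / 2 * k2g)
        k4f, k4g = rhs(x + h, f + h * k3f, g + h * k3g)
        f += h / 6 * (k1f + 2 * k2f + 2 * k3f + k4f)
        g += h / 6 * (k1g + 2 * k2g + 2 * k3g + k4g)
        x += h
        path.append((x, f))
    return path

if __name__ == "__main__":
    for gamma in (0.01, 0.5, 1.0):
        x_end, f_end = shoot(gamma, 2.0)[-1]
        print(gamma, x_end, f_end)
```

Locating $\gamma_p$ itself would require a bisection on the sign of the solution at large $x$ together with some care about the cut-off; that is beyond this sketch.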
To study whether these solutions govern the long time asymptotics of (6), we need to study the derivative of the RG at these fixed points. This is given by the linear operator $dR_L(f_\gamma) = L^{\mathcal{L}}$ where $\mathcal{L} = \mathcal{L}_0 + V$, with
$$V(\xi) = -p f_\gamma^{p-1}(\xi) + \frac{1}{p-1} - \frac12.$$
$\mathcal{L}$ is now the harmonic oscillator with an added bounded potential. It is a non-trivial fact that $e^{\tau(\mathcal{L}_0 + V)}$ is contractive in a suitable Banach space (Bricmont and Kupiainen 1992). It turns out that all the fixed points $f_\gamma$ are stable under perturbations that fall off at infinity faster than $|x|^{-2/(p-1)}$, i.e., we have the asymptotics (1), with $f^* = f_\gamma$, and all the details of the initial data and the $F$ are erased in the limit.

The RG approach is not restricted to the study of the diffusive approach to zero as above: it can be used to study approach to more general attractors: universal stationary patterns (Bricmont and Kupiainen 1993a), moving fronts (Bricmont and Kupiainen 1993b) and the blowup of solutions of (6) with large data (Bricmont and Kupiainen 1993c). We will however now turn to applications to linear, but random equations.

3. Random Walk in Random Environment
Let us consider a stochastic version of the equations of the previous section, namely diffusion in a random environment. One model of this is given by a Markov process on the state space $\mathbb{Z}^d$, with generator $\Delta - \vec b \cdot \vec\nabla$ where $\Delta$ is the Laplacian on $\mathbb{Z}^d$ and $\vec b$ is a random vector 'field'. For example, we could take $\vec b(x)$ for $x \in \mathbb{Z}^d$ to be i.i.d. with covariance $E(b_\alpha(x) b_\beta(y)) = \epsilon^2 \delta_{\alpha\beta} \delta_{xy}$. Thus, the transition probability $P(x, t)$ in time $t$ from the origin to $x$ satisfies the Fokker–Planck equation
$$\partial_t P = \Delta P + \vec\nabla \cdot (\vec b P)$$
with the initial condition $P(x, 0) = \delta_{x0}$. We would like to inquire again whether a diffusive limit is attained for large times, e.g.,
$$\lim_{t \to \infty} t^{d/2} P(t^{1/2} x, t) = A e^{-d x^2/(Dt)} \quad \text{a.s.} \tag{9}$$
for some $D > 0$, or whether the diffusion constant exists and is constant a.s.:
$$\lim_{t \to \infty} t^{-1} \sum_x P(x, t)\, x^2 = D. \tag{10}$$
Since the equation is linear, we may write
$$P(x, t) = e^{t(\Delta + \nabla \cdot b)}(0, x) = p^t(0, x) \tag{11}$$
where $p(x, y) = e^{\Delta + \nabla \cdot b}(x, y)$ can be interpreted as a transition matrix for a random walk on $\mathbb{Z}^d$ and we may thus write (11) as
$$P(x, t, p) = \sum_{\omega : 0 \to x}\ \prod_{i=0}^{t-1} p(\omega(i), \omega(i+1)) \tag{12}$$
where we emphasized the dependence on the (random) transition matrix $p$. (12) is then a model of random walk in random environment (RWRE). Versions of this have been studied quite extensively both heuristically and rigorously (for references see Bricmont and Kupiainen (1991)). In the asymmetric case, discussed below, the validity of (9) and (10) turns out to be subtle: Sinai showed (Sinai 1982) that for $d = 1$ the walk is subdiffusive, i.e., $D = 0$ under very general assumptions on the $p$. For $d > 2$ diffusion is expected on heuristic grounds (Luck 1983, Derrida and Luck 1983, Fisher 1984) and we sketch below an argument leading to the proof of this (Bricmont and Kupiainen 1991).

The RG of Section 2 is now the following recursion for the transition probabilities: we solve for time $L^2$ and scale
$$(R_L p)(x, y) = L^d\, p^{L^2}(Lx, Ly) = L^d \sum_\omega \prod_{i=0}^{L^2 - 1} p(\omega(i), \omega(i+1)) \tag{13}$$
with $\omega(0) = Lx$, $\omega(L^2) = Ly$, and then the following relation holds
$$P(x, t, p) = L^{-d} P(L^{-1} x, L^{-2} t, R_L p) \equiv L^{-d} \sum_{\omega :\, \omega(i) \in (L^{-1}\mathbb{Z})^d} L^{-d(t_1 - 1)} \prod_{i=0}^{t_1 - 1} (R_L p)(\omega(i-1), \omega(i)) \tag{14}$$
where $\omega(0) = 0$, $\omega(t_1) = L^{-1} x$, $t_1 = L^{-2} t$. The powers of $L$ in (13) are of course chosen because we expect the long time limit to be diffusive. The ones in (14) become very natural, provided we note that, since $\omega$ now are walks in $(L^{-1}\mathbb{Z})^d$, due to the scaling involved in $R_L p$, it is natural to replace $\sum_\omega$ by an 'integral'. The claims (9) and (10) can now be restated in terms of the map $R$: given a random matrix $p$ as above, show that almost surely
$$R_L^n p \to p^* \tag{15}$$
where $p^*$ is a Gaussian fixed point of $R_L$ (see below), which are given by
$$p^*(x, y) = A e^{-d (x - y)^2/(2D)} \tag{16}$$
where $x, y \in \mathbb{R}^d$ and $A$ normalizes $p^*$ to a probability density, $\int dy\, p^*(x, y) = 1$. The connection of (9) and (15) is the iteration of (14):
$$P(x, t, p) = L^{-nd} P(L^{-n} x, L^{-2n} t, R_L^n p) \tag{17}$$
where the right-hand side refers to walks on $L^{-n}\mathbb{Z}^d$ with transition probability density $R_L^n p \equiv p_n$. The RG maps an 'environment' $p_n$ to another $p_{n+1}$. Thus $p_n$, $n \ge 1$, are random variables, being functions of $p$. The meaning of (17) is that the rescaled long-time transition probability densities for our RWRE are given as the transition probability densities in the rescaled time of a RWRE with renormalized $p$'s (note that trivially $R$ maps transition probability densities to transition probability densities: for all $n$, $\int dy\, p_n(x, y) = 1$ holds). Universality would now be the claim that, apart from $D$, the renormalized diffusion constant, the limit (15) is independent of the $p$ we start with. In terms of RG, such $p$'s are on the stable manifold of the one parameter family of Gaussian fixed points (16). Note the important difference to Section 2: there the diffusion constant was not renormalized by the non-linearity, whereas the prefactor $A$ was. Here $A$ is always fixed by the probability normalization, but the diffusion constant is renormalized.

Let us now study the stability of the non-random Gaussian fixed point (16) under random perturbations. We could study our original model (11), with $\epsilon$ small, but let us rather look at the more familiar nearest neighbour walk. Thus consider a $p$ of the form
$$p(x, y) = \begin{cases} \dfrac{1}{2d} + b(x, y) & |x - y| = 1 \\ 0 & |x - y| \ne 1. \end{cases} \tag{18}$$
For $b = 0$, (18) defines the simple random walk. For $p$ to be probabilities, we need $\sum_y b(x, y) = 0$. The $\{b(x, y)\}_{x, y \in \mathbb{Z}^d}$ is a family of random variables of whose distribution we assume the following.
(i) We take $b(x, \cdot)$ and $b(x', \cdot)$ to be i.i.d. if $x \ne x'$, with mean zero, $E b(x, y) = 0$. Note, in particular, that $b(x, y)$ and $b(y, x)$ are independent: the environment is asymmetric.
(ii) We require the distribution of the $b$ to be invariant under rotations of the lattice.
(iii) We next require that $b$ in (18) is a 'small' perturbation in the following sense: the generating function of $b$ satisfies, for small $\epsilon$,
$$E e^{t b(x, y)} \le e^{\epsilon^2 t^2}.$$
Thus, the variance is small.
(iv) Finally, we impose a condition on the probability that the $p(x, y)$'s are near zero:
$$\mathrm{Prob}\Bigl( p(x, y) \le \frac{1}{2d} e^{-N} \Bigr) \le e^{-\Gamma N}, \qquad N \in \mathbb{N}. \tag{19}$$
This is designed to avoid the walk getting 'trapped' in some region of $\mathbb{Z}^d$; see the discussion below. $\Gamma$ in (19) will be taken large.

With these assumptions and $d > 2$ one can prove (Bricmont and Kupiainen 1991) (9) and (10). In fact one can moreover prove weak convergence to Brownian motion almost surely. Here we just want to see how the $d > 2$ emerges from the linear RG analysis and how the trap condition (19) emerges. For this, we compute (13) perturbatively in $b$:
$$p_1(x, y) = L^d\, T^{L^2}(Lx - Ly) + L^d \sum_{u, v} \sum_{t=0}^{L^2} T^t(Lx - u)\, T^{L^2 - t}(Ly - v)\, b(u, v) + O(b^2). \tag{20}$$
Here we denoted the $b$ independent part in (18) by $T$. The first term, call it $T_L$, in (20) is straightforward. It can be calculated by the Fourier transform as $\hat T_L(k) = \hat T(k/L)^{L^2}$. Since from (18) $\hat T(k) = d^{-1} \sum_{\alpha=1}^d \cos k_\alpha$ we have, as $L \to \infty$,
$$\hat T_L(k) \to \exp\Bigl( -\frac{k^2}{2d} \Bigr) = \hat p^*(k) \tag{21}$$
where $p^*$ is as in (16) with $D = 1$. This is the familiar approach to the Gaussian fixed point we saw already when discussing the diffusion equation. If $b = 0$, this is the full RG and the argument is just a variation of the central limit theorem.

When $b$ is not zero, we will define an effective $b$ at each step of the iteration by dividing $p_n$ into a 'deterministic' and a 'random' part: $p_n(x, y) = T_n(x - y) + b_n(x, y)$ where $T_n(x - y) = E p_n(x, y)$, $b_n(x, y) = p_n(x, y) - E p_n(x, y)$, and we have used the translation invariance of the distribution of $b$. Evidently, $\int dy\, T_n(y) = 1$ and $\int dy\, b_n(x, y) = 0 = E b_n(x, y)$. The aim is to show that $b_n \to 0$, so that eventually only the $T$ iteration survives. Obviously, at each scale, $b_n$ will modify the diffusion constant. If $b_n$ goes to zero sufficiently fast, we shall obtain a sequence of approximations $D_n$ converging to the true diffusion constant $D$.

Thus consider the second term in (20). This involves a sum of walks from $Lx$ to $u$ and from its nearest neighbour $v$ to $Ly$. Thus, since the total time is $L^2$, the main contribution comes when $|x - y|$ is $O(1)$ and $u$ within distance $L$ from $x$. Since the probability of hitting $u$ is $O(L^{-d})$ as is the one of hitting $y$, and there are $L^2$ times, the term linear in $b$ in (20) seems roughly to be $L^{2-d} \sum_u c(u)$ where the sum is over $L^d$ independent random variables of mean zero and covariance $\epsilon^2$. Thus this should have covariance $L^{4-d} \epsilon^2$, i.e., bigger than that of $b$ in $d = 3$. However, we have not yet used the crucial fact that $p$'s are transition probabilities, i.e., that $\sum_y b(x, y) = 0$. This implies that in (20) the second $T$ has effectively a derivative $\nabla_u$, that brings an extra $L^{-1}$. Hence altogether from the linear analysis we expect that
$$E b_1(x, y)^2 \sim e^{-|x - y|}\, L^{2-d} \epsilon^2.$$
A more quantitative analysis confirms this expectation. Thus we expect the effective disorder $b_n$ to stay local (with exponential tails) and its variance to go to zero exponentially in $n$ if $d > 2$.
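Both ingredients of this linear analysis are easy to check numerically. The sketch below is an illustration under assumptions made here, not part of the proof: `t_hat_L` evaluates $\hat T(k/L)^{L^2}$ for the simple random walk and compares it with the Gaussian limit in (21), and `linear_disorder_variance` merely iterates the heuristic estimate $\epsilon_{n+1}^2 = L^{2-d} \epsilon_n^2$, ignoring the $e^{-|x-y|}$ factor and all non-linear corrections. All function names are hypothetical.

```python
import math

def t_hat(k):
    """Fourier transform of the simple random walk step; k is a list of d components."""
    return sum(math.cos(ka) for ka in k) / len(k)

def t_hat_L(k, L):
    """Renormalized deterministic part, hat T_L(k) = hat T(k/L)^(L^2)."""
    return t_hat([ka / L for ka in k]) ** (L * L)

def gaussian_limit(k):
    """The limit exp(-k^2 / (2d)) appearing in eq. (21)."""
    d = len(k)
    return math.exp(-sum(ka * ka for ka in k) / (2.0 * d))

def linear_disorder_variance(eps2, L, d, n):
    """Heuristic linear estimate of the disorder variance after n RG steps (assumption)."""
    return eps2 * (L ** (2 - d)) ** n

if __name__ == "__main__":
    k = [0.7, -0.3, 1.1]
    for L in (2, 10, 200):
        print(L, t_hat_L(k, L), gaussian_limit(k))
    for d in (2, 3, 4):
        print(d, linear_disorder_variance(0.01, 2, d, 5))
```

In this crude estimate $d = 2$ is marginal (the variance does not move), while for $d > 2$ it decays geometrically in $n$.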
There are various hard problems, however, when one tries to extend this linear analysis to the full control of the RG. We shall discuss briefly only the main one, namely the problem of the traps. Because of the asymmetry of the $p$'s, the environment may produce traps, i.e., regions that are easy to enter, but hard to exit. Indeed, the simplest trap consists of nearest neighbours $x$ and $y$ for which $p(x, y), p(y, x) \sim 1 - e^{-N}$ for $N$ large and thus $p(u, v) \sim e^{-N}$ for $u = x, y$, $v \ne x, y$. Thus if the walk at some time enters $x$ or $y$, it wants to stay there, since exiting is strongly suppressed. But, by the asymmetry, it is possible (and, indeed, likely for $\epsilon$ small) that $p(u, v) \sim (2d)^{-1}$ for $v = x, y$, $u \ne x, y$. Thus the trap $\{x, y\}$ is easy to enter and hard to exit. The time the walk wants to stay in the trap is $\sim e^N$ and thus diffusive behaviour is unlikely if the density of the traps is higher than $e^{-\gamma N}$, $\gamma$ sufficiently small. The condition (19) assures that the density is small and diffusion likely.
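The estimate of the trapping time can be made concrete in a toy version of the two-site trap. The sketch below rests on a simplifying assumption made here: at every step the walk leaves the pair $\{x, y\}$ with the same probability $q = e^{-N}$, independently of the past, so the exit time is geometric. The function name and the truncation `t_max` are illustrative.

```python
import math

def mean_exit_time(N, t_max=200000):
    """Mean exit time from a trap with per-step exit probability q = exp(-N)."""
    q = math.exp(-N)
    stay = 1.0 - q
    total = 0.0
    weight = q  # probability of leaving exactly at step t, starting with t = 1
    for t in range(1, t_max + 1):
        total += t * weight
        weight *= stay
    return total

if __name__ == "__main__":
    for N in (1, 3, 5):
        # The summed series should reproduce 1/q = e^N up to truncation error.
        print(N, mean_exit_time(N), math.exp(N))
```

The mean of the geometric distribution is $1/q = e^N$, which is the time scale quoted in the text.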
However, things are more complicated, since such traps can exist in all scales. One can expect this already from the linear analysis above. Recall that $b_1 \sim L^{1-d} \sum b(x)$ and thus even if the covariance of $b_1$ contracts, there are large fluctuations: supposing even that $|b| \le \epsilon$ a.s., $b_1$ can be as large as $L\epsilon$ and upon iteration $b_n$ can become trapping. Thus, in the control of the RG we need to show that such effective traps get more and more unlikely as $n$ increases (i.e., the effective $\Gamma_n$ analogous to (19) on the $n$th scale increases). This indeed is so (Bricmont and Kupiainen 1991).

4. Non-Markovian Walks
Let us finally discuss the application of the above RG ideas to non-Markovian random walks. We give three examples of this, namely the walk obtained by averaging over $p$ in (12), the true self-avoiding walk, and certain deterministic models for diffusion.

4.1. Average of RWRE
Let us consider the expression (12) for $P$ and take the expectation of it over the randomness
$$E P(x, t, p) = E \sum_\omega \prod (T + b) = \sum_\omega p(\omega) \tag{22}$$
where the weight of the walk is given by
$$p(\omega) = \sum_{I \subset [0, t-1]} T_{I^c}\, E b_I \tag{23}$$
and we used the notation $T_{I^c} = \prod_{s \in I^c} T(\omega(s), \omega(s+1))$ and $b_I$ similarly. By independence,
$$E b_I = \prod_\alpha U_{I_\alpha}$$
where we wrote $\bigcup_{s \in I} \omega(s) = \bigcup_\alpha x_\alpha$ and $I_\alpha = \omega^{-1}(x_\alpha) \cap I$, i.e., we divide the times $I$ into sets $I_\alpha$ where $\omega(s) = x_\alpha$. Therefore we get the following representation:
$$p(\omega) = \sum_{\{I_\alpha\}} T_{I^c} \prod_\alpha V_{I_\alpha} \tag{24}$$
where $\{I_\alpha\}$ is a family of subsets of $[1, t]$, $|I_\alpha| \ge 2$ and $V_{I_\alpha} = 0$ unless all the $\omega(s)$ for $s \in I_\alpha$ are near each other (actually the same point). One should think of (24) as a general representation for the weight of a non-Markovian random walk. It consists of the Markovian part $T$ and interactions among the times when the walk crosses itself. (24) turns out to be a form invariant under the RG. Before we discuss this, we want to consider two other examples that lead to the same representation.

4.2. The True Self-avoiding Walk

Here
$$p(\omega) = \prod_{s=1}^{t} p_s(\omega) \tag{25}$$
where $\omega$ is a nearest neighbour walk and
$$p_s(\omega) = \frac{e^{-\lambda n_s(\omega(s+1);\omega)}}{\sum_{|y-\omega(s)|=1} e^{-\lambda n_s(y;\omega)}} \tag{26}$$
and $n_s(x, \omega) = |\{s' \le s \mid \omega(s') = x\}|$. Thus the walk does not want to enter a region which it has visited before. This true self-avoiding walk is a random walk, unlike the standard self-avoiding or self-suppressing walk: we have $\sum_{\omega(t+1)} p_t(\omega) = 1$. This model was perturbatively studied in Amit et al. (1983) and Peliti (1984), who concluded that it should, for small disorder, be diffusive for $d > 2$. For $d = 1$ it was recently proved (Toth 1993) that the typical distance scales as $t^{2/3}$, i.e., the walk is super-diffusive. Here we are interested in developing methods to prove diffusive behaviour in $d > 2$. Writing now $p_s(\omega) = (2d)^{-1} + q_s(\omega)$ (we will consider $\lambda$ small in (26)), it is not hard to expand
$$q_s(\omega) = \sum_{I \subset [0,s]} b(I) \tag{27}$$
where $b(I)$ depends only on $\omega(s')$, $s' \in I$, and is non-zero only if $|\omega(s) - \omega(s')| = 1$ for all $s' \in I$. Therefore, expanding now $p(\omega) = \prod (T + q)$ in powers of $q$ and inserting (27), we end up having again a representation (24), where $V_I$ is again localized near $\omega(I)$.

4.3. Deterministic Diffusion

We consider a lattice model for the Lorentz gas, where a particle is moving on $\mathbb{Z}^d$ in the presence of a random configuration of scatterers. At discrete times $t \in \mathbb{Z}_+$ the particle has a position $\omega(t)$ and a velocity $v(t)$, with $|v(t)| = 1$. At the following time, $\omega(t+1) = \omega(t) + v(t)$ and $v(t+1)$ is determined by the presence and type of scatterer at $\omega(t)$. We describe the scatterer by giving a function $p : V \times V \to \{0, 1\}$ with $\sum_{v'} p(v, v') = 1$ (i.e., given $v$, there is a unique $v'$). Here $V$ is the set of unit vectors in $\mathbb{Z}^d$. We require $p(v, -v) = 0$. This means there is no back scattering: if there is a non-zero probability for the back scattering, it is easy to see that every orbit is periodic (Figotin 1992): eventually the particle will encounter such a scatterer and then retrace its steps until it encounters another one, and for subsequent times the motion will repeat itself. We require furthermore that $p(v, v') = p(-v', -v)$, which means the motion is reversible. Call $S$ the set of such $p$'s. Now, let $\{p_x\}_{x \in \mathbb{Z}^d}$ be i.i.d. random variables, taking values in $S$. We require that the distribution of the $p_x$'s is isotropic (i.e., that there is no preferred direction) and moreover $\mathrm{Prob}(p_x = \mathrm{id}) = 1 - \epsilon$ for small $\epsilon$, where $\mathrm{id}(v, v') = \delta_{vv'}$ is the absence of scattering. This means that we have a small density of scatterers. Given now such a set of $p_x$'s together with the initial velocity $v(0)$ and starting point, say the origin, the motion is completely determined. The 'probability' for a path
(walk) $\omega : [0, t] \to \mathbb{Z}^d$ with $\omega(0) = 0$ is
$$P(\omega) = \prod_{s=1}^{t} p_{\omega(s)}(v(s-1), v(s)).$$
Of course this takes value 1 only for one $\omega$, namely the actual trajectory. One would now like to prove that, with high probability, the motion is diffusive. Note that there is always a non-zero density of closed orbits in environments that we have described. In fact, for $d = 2$ and $\epsilon = 1$ every orbit is periodic (Bunimovich and Troubetzkoy 1992). However, if $d > 2$, we expect diffusion to occur in the complement of these orbits (note also that one cannot enter a closed orbit from outside). An easier problem is to study the diffusion if we average over the scatterers: try to show that
$$\lim_{t \to \infty} t^{-1} E \sum_{\omega} P(\omega)\, \omega(t)^2 \neq 0. \tag{28}$$
Thus, let us consider the non-Markovian random walk with probabilities
$$p(\omega) = E P(\omega). \tag{29}$$
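The deterministic dynamics of Section 4.3 can be made concrete with a small sketch in $d = 2$. It rests on illustrative assumptions that are not in the text: the only non-trivial scatterers are the two diagonal mirrors (both satisfy the no-back-scattering and reversibility conditions), each chosen with probability $\epsilon/2$; scatterers are sampled lazily as sites are first visited; and, following the product formula for $P(\omega)$, the scatterer at the site just reached turns $v(s-1)$ into $v(s)$. All names are hypothetical.

```python
import random

IDENTITY, SLASH, BACKSLASH = "id", "/", "\\"


def scatter(kind, v):
    """Outgoing velocity for incoming unit velocity v at a scatterer of this kind."""
    vx, vy = v
    if kind == SLASH:        # mirror '/': (1,0) <-> (0,1), (-1,0) <-> (0,-1)
        return (vy, vx)
    if kind == BACKSLASH:    # mirror '\': (1,0) <-> (0,-1), (-1,0) <-> (0,1)
        return (-vy, -vx)
    return v                 # no scatterer: velocity unchanged


def trajectory(eps, steps, seed=0, v0=(1, 0)):
    """Positions omega(0..steps) and velocities v(0..steps) started at the origin."""
    rng = random.Random(seed)
    scatterers = {}

    def kind_at(site):
        # Sample the i.i.d. environment lazily, one site at a time.
        if site not in scatterers:
            if rng.random() < 1.0 - eps:
                scatterers[site] = IDENTITY
            else:
                scatterers[site] = rng.choice((SLASH, BACKSLASH))
        return scatterers[site]

    pos, v = (0, 0), v0
    path, vels = [pos], [v]
    for _ in range(steps):
        pos = (pos[0] + v[0], pos[1] + v[1])   # omega(s) = omega(s-1) + v(s-1)
        v = scatter(kind_at(pos), v)           # v(s) from the scatterer at omega(s)
        path.append(pos)
        vels.append(v)
    return path, vels
```

Averaging `path[t][0]**2 + path[t][1]**2` over many seeds gives a Monte Carlo estimate of the averaged mean square displacement appearing in (28).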
Writing
$$p_x(v, v') = T(v, v') + b_x(v, v') \tag{30}$$
with $T(v, v') = E p_x(v, v')$, inserting (30) into (29) and expanding in powers of $b$ (note that $E b^2 = O(\epsilon)$) we again end up with an expansion of the form (24).

4.4. The Renormalization Group

The RG that we apply to (24) consists now of 'blocking' in time as before and in space. Thus, given a walk $\omega'$ on $L^{-1}\mathbb{Z}^d$, we set
$$p'(\omega') = \sum_{\omega \to \omega'} p(\omega) \tag{31}$$
with the constraint on the sum that $\omega(L^2 s) = L\omega'(s)$. To get (31) back to the form (24), we write, given $I_\alpha$ in (24), $[L^{-2} \bigcup_\alpha I_\alpha] = \bigcup_\alpha J_\alpha$, where $[\cdot]$ means the integer part and the $J_\alpha$ are unions of $[L^{-2} I_\alpha]$'s, where $[L^{-2} I_\alpha]$ and $[L^{-2} I_\beta]$ are in the same $J_\gamma$ if they intersect. Thus we end up with
$$p'(\omega') = \sum_{J_\alpha, Z_\alpha} T_{L,I^c} \prod_{\alpha} V'_{J_\alpha} \tag{32}$$
where the $V'$ are given by the formula (24) with the above constraints and $T_L$ by (20)–(21). The only difference with (24) is now that $|J_\alpha|$ can be 1. Such $V'_J$ are no longer interactions coupling several times, since they only depend on $\omega'_s, \omega'_{s+1}$ where $J = \{s\}$. These terms thus renormalize the $T_L$. We set
$$T' = T_L + V'_{\{s\}}. \tag{33}$$
The picture we would now like to establish is the following. The $V_I$ are irrelevant under the RG iteration if $|I| > 1$. We expect $|V'_I| \sim L^{2-d} |V_I|$. The reason for this is similar to the one behind our computation of the variance of $b_1$ above (actually, $E b_1^2$ is just $V_I$ for $|I| = 2$). The reason for the exponent is the fact that we are dealing with probabilities that sum to one. For the regular self-avoiding walk that fits into the present scheme we would find the exponent $4 - d$. Thus $T$ above gets renormalized less and less as $n \to \infty$ and we end up with the representation (24) with no $V$ and $T = p^*$, i.e., the fixed point Markovian walk. There are, however, again hard problems to deal with. There are expanding directions in our RG flow: those correspond to interactions $V_I$ for $I$ a union of many small components. These correspond to the walk returning to the same region during many disjoint time intervals. Upon iteration, these collect walks returning repeatedly to an $L^n$ cube. These $V$ do not contract under the RG before a scale is reached where the blocked time intervals form a connected set. Thus we need again to keep track of large 'fields', like the traps in the RWRE. It also turns out that one has to localize the $V$'s not only in time but also in space. The picture presented here is work in progress.
References

Amit, D. J., Parisi, G., and Peliti, L. (1983). The Physical Review B 27, 1635.
Barenblatt, G. I. (1979). Similarity, Self-similarity and Intermediate Asymptotics, Consultants Bureau, New York.
Brezis, H., Peletier, L. A., and Terman, D. (1986). A very singular solution of the heat equation with absorption. Archive for Rational Mechanics and Analysis 95, 185–209.
Bricmont, J. and Kupiainen, A. (1991). Random walk in asymmetric random environment. Communications in Mathematical Physics 142, 345–420.
Bricmont, J. and Kupiainen, A. (1992). Renormalization group and the Ginzburg–Landau equation. Communications in Mathematical Physics 150, 193–208.
Bricmont, J. and Kupiainen, A. (1993a). Stability of moving fronts in the Ginzburg–Landau equation. Communications in Mathematical Physics, to appear.
Bricmont, J. and Kupiainen, A. (1993b). Universality in blow-up for nonlinear heat equations. Preprint, IHES.
Bricmont, J. and Kupiainen, A. (1993c). Stable non-Gaussian diffusive profiles. Preprint, IHES.
Bricmont, J., Kupiainen, A., and Lin, G. (1992). Renormalization group and asymptotics of solutions of nonlinear parabolic equations. Communications in Pure and Applied Mathematics, to appear.
Bunimovich, L. T. and Troubetzkoy, S. E. (1992). Recurrence properties of Lorentz lattice gas cellular automata. Preprint.
Derrida, B. and Luck, J. M. (1983). Diffusion on a random lattice: weak-disorder expansion in arbitrary dimension. The Physical Review B 28, 7183.
Figotin, A. (1992). The localization properties of a random stationary flow on a lattice. Journal of Statistical Physics 66, 1599.
Fisher, D. (1984). Random walks in random environments. The Physical Review A 30, 960.
Galaktionov, V. A., Kurdyumov, S. P., and Samarskii, A. A. (1986). On asymptotic 'eigenfunctions' of the Cauchy problem for a nonlinear parabolic equation. Mathematics of the USSR–Sbornik 54, 421–455.
Goldenfeld, N., Martin, O., Oono, Y., and Lin, F. (1990). Anomalous dimensions and the renormalisation group in a nonlinear diffusion process. Physical Review Letters 64, 1361–1364.
Kamin, S. and Peletier, L. A. (1985). Large time behaviour of solutions of the heat equation with absorption. Annali della Scuola Normale Superiore di Pisa 12, 393–408.
Kong, X. P. and Cohen, E. G. D. (1991). Physica D 47, 9–18.
Luck, J. M. (1983). Diffusion in a random medium: A renormalization group approach. Nuclear Physics B 225, 169.
Peliti, L. (1984). Self-avoiding walks. Physics Reports 103, 225–231.
Sinai, Y. G. (1982). Limiting behavior of a one-dimensional random walk in a random medium. Theory of Probability and its Applications 27, 256.
Toth, B. (1993). Limit theorem for the local time of the bond-true self-avoiding walk on Z. Preprint.
RANDOM WALKS, HARMONIC MEASURE, AND LAPLACIAN GROWTH MODELS
GREGORY F. LAWLER Department of Mathematics Duke University Durham, NC 27708–0320 U.S.A.
Abstract. A number of problems arise in mathematical physics which deal directly or indirectly with harmonic measure, i.e., with hitting probabilities of simple random walks. The more difficult problems involve understanding the nature of harmonic measure at points on fractal-like sets. We will describe a number of these problems in this paper — intersection probabilities of random walks; random walks grown using harmonic measure (loop-erased or Laplacian random walk); and clusters grown using harmonic measure (diffusion limited aggregation and related models). There is a large range of open problems in describing these random walks and clusters rigorously. Key words: Random walk, harmonic measure, diffusion limited aggregation, intersections.
1. Random Walk and Harmonic Measure

1.1. (Discrete) Harmonic Measure

We start by summarizing some standard facts about simple random walk and harmonic measure. For more details see Lawler (1991). Let $S(t) = S_t$ denote a simple, nearest neighbor random walk in the integer lattice $\mathbb{Z}^d$ with integer time $t$. If $A \subset \mathbb{Z}^d$ let
$$\tau = \tau_A = \inf\{t > 0 : S(t) \in A\}, \qquad \bar\tau = \bar\tau_A = \inf\{t \ge 0 : S(t) \in A\}.$$
The harmonic measure of $A$ starting at $x$ is the hitting measure of $A$ by a random walk starting at $x$, conditioned on hitting $A$,
$$H_A(x, y) = P^x\{S(\tau) = y \mid \tau < \infty\}.$$
(We use $P^x$ and $E^x$ to denote probabilities and expectations assuming $S(0) = x$. If the $x$ is omitted it will be assumed that $S(0) = 0$.) If $A$ is finite, we can define harmonic measure (from infinity) by
$$H_A(y) = \lim_{|x| \to \infty} H_A(x, y).$$
The existence of the limit can be shown in a number of ways. Let $C_n$ denote the discrete ball of radius $n$,
$$C_n = \{z \in \mathbb{Z}^d : |z| < n\}.$$
Then if $A \subset C_n$ it can be shown that for all $y \in A$, $|x| > 2n$,
$$H_A(x, y) \asymp H_A(y).$$
Here we use $\asymp$ to mean that there exist positive constants $c_1, c_2$, depending only on $d$, such that
$$c_1 H_A(y) \le H_A(x, y) \le c_2 H_A(y).$$
If $A$ is a finite set, the harmonic measure of $A$ is related to the probability that a random walk starting at $y \in A$ 'escapes' the set $A$. Let $\xi_n = \inf\{t : |S(t)| \ge n\}$. Then it can be shown that
$$H_A(y) = \lim_{n \to \infty} \frac{P^y\{\xi_n < \tau_A\}}{\sum_{z \in A} P^z\{\xi_n < \tau_A\}}.$$
If $d \ge 3$, the random walk is transient, and we can take the limit into the numerator and denominator. If we let $\mathrm{Es}$ denote the escape probabilities,
$$\mathrm{Es}(y, A) = P^y\{\tau_A = \infty\} = \lim_{n \to \infty} P^y\{\xi_n < \tau_A\},$$
we can write
$$H_A(y) = \frac{\mathrm{Es}(y, A)}{\mathrm{cap}(A)},$$
where $\mathrm{cap}(A)$ denotes the capacity of $A$ defined by
$$\mathrm{cap}(A) = \sum_{z \in A} \mathrm{Es}(z, A).$$
The capacity of a set measures how likely a random walk is to hit the set. If $A \subset C_n$, $2n \le |x| \le 4n$,
$$\mathrm{cap}(A) \asymp n^{d-2} P^x\{\tau_A < \infty\}. \tag{1}$$
We let $G(x, y)$ denote the Green's function for $d \ge 3$,
$$G(x, y) = E^x \sum_{t=0}^{\infty} I\{S(t) = y\} = \sum_{t=0}^{\infty} P^x\{S(t) = y\},$$
and for $d = 2$ we let $a(x, y)$ denote the potential kernel,
$$a(x, y) = \lim_{T \to \infty} \sum_{t=0}^{T} \left[P^x\{S(t) = x\} - P^x\{S(t) = y\}\right].$$
Note that $G$ and $a$ are symmetric functions and depend only on $|y - x|$. We write $G(x)$, $a(x)$ for $G(0, x)$, $a(0, x)$. As $|x| \to \infty$,
$$G(x) \sim \frac{2}{(d-2)\omega_d} |x|^{2-d}, \qquad a(x) \sim \frac{2}{\pi} \ln |x|,$$
where $\omega_d$ denotes the volume of the unit ball in $\mathbb{R}^d$ (Lawler 1991, Theorem 1.5.4, Theorem 1.6.2). We use $\Delta$ to denote the (discrete) Laplacian
$$\Delta f(x) = \frac{1}{2d} \sum_{|y-x|=1} [f(y) - f(x)],$$
and call $f$ (discrete) harmonic at $x$ if $\Delta f(x) = 0$. It is easy to check that $G(x)$ and $a(x)$ are harmonic at $x \neq 0$. For a finite set $A \subset \mathbb{Z}^d$, $d \ge 3$, the function
$$g(x) = g_A(x) = P^x\{\bar\tau_A = \infty\}$$
is the unique function satisfying
$$\Delta g(x) = 0, \quad x \notin A, \qquad g(x) = 0, \quad x \in A,$$
and $g(x) \to 1$ as $|x| \to \infty$. For $x \in A$ we can express the escape probability as a 'normal derivative' of the function $g$,
$$\mathrm{Es}(x, A) = \frac{1}{2d} \sum_{|x-y|=1} g(y) = \frac{1}{2d} \sum_{|x-y|=1} [g(y) - g(x)].$$
Hence we can think of harmonic measure as the measure on $A$ whose density is proportional to the normal derivative of $g$. A similar construction can be done for $d = 2$. In this case we define $g(x) = g_A(x)$ by
$$g(x) = a(x) - E^x[a(S(\bar\tau_A))].$$
Then $g$ satisfies (Lawler 1991, Section 2.3)
$$\Delta g(x) = 0, \quad x \notin A, \qquad g(x) = 0, \quad x \in A,$$
and
$$g(x) \sim \frac{2}{\pi} \ln |x|, \quad |x| \to \infty.$$
Again, $g$ can be characterized as the unique function satisfying these conditions, and it can be shown that
$$H_A(x) = \frac{1}{4} \sum_{|y-x|=1} g(y) = \frac{1}{4} \sum_{|y-x|=1} [g(y) - g(x)].$$
In this case we need no normalization, for we can prove that
$$\sum_{x \in A} \frac{1}{4} \sum_{|y-x|=1} g(y) = 1.$$
1.2. (Continuous) Harmonic Measure

There is an obvious continuous analogue to the harmonic measure described in the previous section, and, in fact, the term 'harmonic measure' is more often used for the continuous analogue. Let $B_t = B(t)$ denote a Brownian motion in $\mathbb{R}^d$ and let $A \subset \mathbb{R}^d$ be a compact subset with the property that Brownian motion starting away from $A$ has a positive probability of hitting $A$. Again, we let
$$\tau = \tau_A = \inf\{t > 0 : B_t \in A\},$$
and for $x \in \mathbb{R}^d$ we define $h_A(x, \cdot)$ to be the hitting measure of $A$ by Brownian motion starting at $x$, conditioned to hit $A$,
$$h_A(x, V) = P^x\{B(\tau) \in V \mid \tau < \infty\}.$$
We call $h_A(x, \cdot)$ harmonic measure starting at $x$. We will use the terms continuous or Brownian motion harmonic measure if we need to distinguish this from discrete or random walk harmonic measure defined in the previous section. Again, we can define harmonic measure (from infinity), $h_A(\cdot)$, by
$$h_A(V) = \lim_{|x| \to \infty} h_A(x, V).$$
For $d = 2$ there is a unique continuous function $g(x) = g_A(x)$ satisfying
$$\Delta g(x) = 0, \quad x \notin A, \qquad g(x) = 0, \quad x \in A,$$
and such that $g(x) \sim \ln |x|$ as $|x| \to \infty$ (here, of course, we are now using $\Delta$ for the usual Laplacian in $\mathbb{R}^d$). Harmonic measure can then be defined as the measure on $\partial A$ with density $(2\pi)^{-1}(dg/dn)$, i.e.,
$$h_A(V) = \frac{1}{2\pi} \int_V \frac{dg}{dn}, \quad V \subset \partial A.$$
This formula makes good sense when $\partial A$ is smooth, and rough boundaries can be approximated by smooth boundaries. There is a similar formula in higher dimensions using the function $g(x) = P^x\{\tau = \infty\}$. In two dimensions, one can also use conformal mapping to understand harmonic measure. Suppose we identify $\mathbb{R}^2$ with the complex plane $\mathbb{C}$. Take a compact set $A$ and suppose that $F$ is a conformal mapping of $A^c$ to the Riemann sphere which is continuous on $A^c \cup \partial A$. Then $F$ preserves harmonic measure, i.e.,
$$h_A(V) = h_{F(\partial A)}(F(\infty), F(V)), \quad V \subset \partial A.$$
If $A$ is connected then $A^c$ is simply connected on the Riemann sphere, and we can find an $F$ which takes $A^c$ to the unit disk with $F(\infty) = 0$. This $F$ is unique up to a rotation. In this case, if $V \subset \partial A$,
$$h_A(V) = h_A(\infty, V) = h_{F(\partial A)}(0, F(V)) = \frac{1}{2\pi} \ell(F(V)),$$
where $\ell$ denotes length. As a rule of thumb, 'probabilistic' or 'PDE' methods of analyzing continuous harmonic measure can be adapted to analyze discrete harmonic measure. However, it is very difficult to adapt methods which rely on conformal mapping.
1.3. Extremal Bounds

Here we review some bounds on (discrete) harmonic measure which hold for all subsets of $\mathbb{Z}^d$. We say that a set is connected if any two points in the set can be connected by a random walk path staying in the set. If $0 \in A$ we define the radius of $A$ by $\mathrm{rad}(A) = \sup\{|x| : x \in A\}$. There is a classical estimate due to Beurling which states roughly that the continuous harmonic measure of a two dimensional subset of a given radius is maximized by taking a line segment and looking at the endpoint of the segment. Kesten (1987a) first proved the discrete version of this theorem. The discrete Beurling projection theorem states that there is a constant $c$, depending only on dimension, such that if $0 \in A \subset \mathbb{Z}^d$ with $\mathrm{rad}(A) = n$, then for any $x \in A$,
$$H_A(x) \le \begin{cases} c\,n^{-1/2}, & d = 2, \\ c\,(\ln n)^{1/2} n^{-1}, & d = 3, \\ c\,n^{-1}, & d \ge 4. \end{cases}$$
The right hand side is sharp if $A$ is a line segment of length $n$ and $x$ is an endpoint. The proof follows a proof of Beurling's theorem which does not use any complex variables. There is another very deep result about two dimensional harmonic measure due to Makarov (1985). Let $A \subset \mathbb{R}^2$ be any connected subset (with more than one point). Then (continuous) harmonic measure is concentrated on a set of Hausdorff dimension one. More precisely, there is a set of Hausdorff dimension one which has harmonic measure one and every subset of $A$ of strictly smaller Hausdorff dimension has harmonic measure zero. The proof uses some complex variables and so far there has been no proof that does not use some ingredient of complex variables or conformal mapping. It seems very difficult to write a direct proof of a discrete analogue of this theorem. For this reason, one might look for a method which shows that continuous and discrete harmonic measure are close in some sense, and use the continuous result to say something about the discrete situation. If $x \in \mathbb{Z}^2$ we let $R(x)$ be the square of side one centered at $x$. For $A \subset \mathbb{Z}^2$ we let
$$\Gamma_A = \bigcup_{x \in A} R(x).$$
We can define Brownian motion harmonic measure on $A$, $\tilde H_A$, by
$$\tilde H_A(x) = h_{\Gamma_A}(R(x)).$$
In what sense are $H$ and $\tilde H$ close? We cannot expect too sharp a result, e.g., it is not true that $H_A(x) \asymp \tilde H_A(x)$. However, a sharp approximation theorem of Komlós, Major, and Tusnády (1976) can be used (Lawler 1992b) to show that one can derive $H_A$ from $\tilde H_A$ basically by moving mass no more than distance $O(\ln n)$. Using this approximation, one can take the results of Makarov and show that (discrete) harmonic measure concentrates itself on a set of 'dimension one'. More precisely, it
can be proved that there are constants $\alpha < 1$, $\beta > 0$, $k < \infty$ such that if $A$ is any connected subset of $\mathbb{Z}^2$ of radius $n$ containing the origin,
$$H_A\{x : n^{-1} e^{-(\ln n)^{\alpha}} \le H_A(x) \le n^{-1} e^{(\ln n)^{\alpha}}\} \ge 1 - k(\ln n)^{-\beta}.$$
Roughly, this states that almost all points on a connected set of radius $n$ have harmonic measure about $n^{-1}$, where 'almost all' is with respect to harmonic measure. The idea of using the strong approximation to relate continuous and discrete harmonic measure can be found in (Auer 1990) where the idea is used to find the asymptotics of discrete harmonic measure in a wedge. The disadvantage of the method is that one tends to be able to prove results only up to a logarithmic correction. For example, in the case of the wedge Auer was able to prove that the discrete harmonic measure had the same behavior as the continuous up to a logarithmic correction. In this case, Kesten (1991a) was able to show that the behaviors are the same without any logarithmic correction.

1.4. Random Sets and Multifractals

A number of interesting problems deal with random subsets in $\mathbb{Z}^d$. Suppose we have a sequence of probability measures $\mu_n$ on finite, connected subsets of $\mathbb{Z}^d$ containing the origin. Suppose also that the typical radius of a set according to $\mu_n$ is $n^{\beta}$ for some $\beta > 0$. We are often interested in determining the behavior of
$$E_n(H_A(0)^a)$$
where $E_n$ denotes expectation with respect to $\mu_n$ and $a \ge 0$. Since the typical set according to $\mu_n$ has radius $n^{\beta}$ we can write
$$H_A(0) \approx n^{\beta(2-d)} P\{\tau_A > \xi_{n^{\beta}}\}.$$
In many cases we expect that
$$E_n(H_A(0)^a) \approx n^{-b(a)},$$
where the above expression means
$$\lim_{n \to \infty} -\frac{\ln E_n(H_A(0)^a)}{\ln n} = b(a).$$
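The definition of $b(a)$ can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes a measure on $n$ points $j = 0, \dots, n-1$ with weights proportional to $(j+1)^{-1/2} + (n-j)^{-1/2}$ (the profile that arises for harmonic measure on a line segment), averages the $a$-th power of the weight over a uniformly chosen point, and estimates $b(a)$ from the slope between two values of $n$. The function names and the choice of $n_1, n_2$ are hypothetical.

```python
import math


def moment(n, a):
    """E_n(H^a): average over a uniformly chosen point of the a-th power of its weight."""
    weights = [(j + 1) ** -0.5 + (n - j) ** -0.5 for j in range(n)]
    total = sum(weights)  # normalise so that the weights form a probability measure
    return sum((w / total) ** a for w in weights) / n


def b_estimate(a, n1=1000, n2=100000):
    """Finite-size estimate of b(a) = -lim ln E_n(H^a) / ln n from two sizes n1 < n2."""
    log_ratio = math.log(moment(n2, a)) - math.log(moment(n1, a))
    return -log_ratio / (math.log(n2) - math.log(n1))
```

With these sizes, `b_estimate(1.0)` is 1 up to rounding, while `b_estimate(4.0)` comes out close to 3 rather than 4, so $b$ is visibly not linear in $a$ for this profile; convergence near the kink is slow because of logarithmic corrections.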
Multifractal analysis deals with the understanding of $b$ as a function of $a$. If it were true that
$$E_n(H_A(0)^a) \approx [E_n(H_A(0))]^a,$$
then $b$ would be a linear function of $a$. It is a characteristic of many interesting examples that this $b$ is not a linear function. Sets (or more precisely measures on sets) with this nonlinearity property are sometimes called multifractals. See (Stanley 1991) and (Aharony 1991) for a nonrigorous discussion of multifractals. Let us consider an example. Let $A_n$ be the line segment of length $n - 1$ in $\mathbb{Z}^2$,
$$A_n = \{(x_1, 0) : 0 \le x_1 < n\},$$
and let $\mu_n$ be the measure which assigns probability $1/n$ to each of the $n$ translates of $A_n$, $A_n - z$, $z \in A_n$. (Multifractal analysis of a set is the same as multifractal analysis of the uniform measure on the translates of the set.) The harmonic measure on $A_n$ can be estimated fairly precisely. If we write $j$ for the point $(j, 0)$, then it can be shown that (Lawler 1991, Proposition 2.4.10)
$$c_1 n^{-1/2} [(j+1)^{-1/2} + (n-j)^{-1/2}] \le H_{A_n}(j) \le c_2 n^{-1/2} [(j+1)^{-1/2} + (n-j)^{-1/2}].$$
One can then check that $b$ is piecewise linear, but not linear,
$$b(a) = \begin{cases} a, & a \le 2, \\ (a+2)/2, & a \ge 2. \end{cases}$$
In checking the above formula, one notices the reason for the different behavior for $a < 2$ and $a > 2$. If $a < 2$ the main contribution to the sum comes from the $n$ terms of harmonic measure about $1/n$. However if $a > 2$, the sum is dominated by the endpoint which has harmonic measure about $1/\sqrt{n}$. In this case we say that the $a$-dimension of harmonic measure is 1 for $a \le 2$ and 0 for $a > 2$. Roughly speaking, a connected set of radius $n$ (or similarly a measure on sets of radius $n$) has $a$-dimension $\gamma$ if the measure $H_A(x)^a$ is concentrated on a set of points of cardinality $n^{\gamma}$. Note that Makarov's theorem says that the 1-dimension (sometimes referred to as information dimension) of a connected subset of $\mathbb{Z}^2$ is always one. The above example demonstrates that $b$ can be piecewise linear. It is possible for $b$ to be very nonlinear. A lot of the analysis of fractal and multifractal type sets deals with trying to understand the behavior of $b$.

2. Intersections of Random Walks

Let $S^1, \dots, S^j, S^{j+1}, \dots, S^{j+k}$ be independent simple random walks starting at the origin in $\mathbb{Z}^d$. Let
$$q(j, k, t) = P\{(S^1[0,t] \cup \dots \cup S^j[0,t]) \cap (S^{j+1}(0,t] \cup \dots \cup S^{j+k}(0,t]) = \emptyset\}.$$
If $d \ge 5$, $q(j, k, t) \to c = c(j, k) > 0$ as $t \to \infty$. For $d \le 4$, $q(j, k, t) \to 0$. A lot of work has been done (see Lawler 1991) in trying to understand the behavior of $q(j, k, t)$ for large $t$. Essentially, this boils down to a question about harmonic measure. Let $A_n$ be the random set
$$A_n = S^1[0,n] \cup \dots \cup S^j[0,n].$$
Then $A_n$ is a connected set containing the origin which has a typical radius of about $\sqrt{n}$.
Understanding the rate of decay of $q(j, k, t)$ is the same as understanding the behavior of $E(H_{A_n}(0)^k)$, where we can even take $k$ to be non-integer. The easiest case is when $j = 2$ and $k = 1$. In this case we can consider the random set $A_n = S^1[0,n] \cup S^2[0,n]$ as being a 'two-sided' random walk. The origin then becomes a typical point in the middle of a random walk path. (This is different than the case of $A_n = S^1[0,n]$ where the origin is an endpoint of the path and not a point in the middle.) Since the total harmonic measure of $A_n$ is one, a typical point should have harmonic measure $1/n$ so we get
$$E(H_{A_n}(0)) \asymp n^{-1}.$$
This can be proved rigorously using this basic idea. If $d = 3$, a random walker starting distance $\sqrt{n}$ from $A_n$ has a probability of hitting $A_n$ of at least $c$ where $c$ is a positive constant independent of $n$. For $d = 4$, the probability is of order $(\ln n)^{-1}$ (see Lawler 1991, Theorem 3.3.2). If we combine this with (1) we see that
$$E[\mathrm{Es}(A_n)] \asymp \begin{cases} n^{-1/2}, & d = 3, \\ (\ln n)^{-1}, & d = 4, \end{cases}$$
where we write $\mathrm{Es}(A)$ for $\mathrm{Es}(0, A)$. For $d = 2$, we cannot discuss the escape probability, but we can show that
$$H_{A_n}(0) \asymp e(\sqrt{n}, A_n),$$
where $e(m, A) = P\{\tau_A > \xi_m\}$. Hence for $d = 2$,
$$E[e(\sqrt{n}, A_n)] \asymp n^{-1}.$$
Here, the expectation is over the random set $A_n$ and the $\xi_{\sqrt{n}}$, $\tau_{A_n}$, and $P$ are for another simple random walk independent of $A_n$. For the remainder of this section we will consider $E(\mathrm{Es}(A_n)^k)$ or $E(e(\sqrt{n}, A_n)^k)$, but we could equally well consider $E(H_{A_n}(0)^k)$. The most studied case is the case $j = 1$, $k = 1$. Let $A_n = S^1[0,n]$. The $j = 2$, $k = 1$ case can be considered symmetrically as the $j = 1$, $k = 2$ case. By doing this, we can see that
$$E(\mathrm{Es}(A_n)^2) \asymp \begin{cases} n^{-1/2}, & d = 3, \\ (\ln n)^{-1}, & d = 4, \end{cases} \tag{2}$$
$$E(e(\sqrt{n}, A_n)^2) \asymp n^{-1}, \quad d = 2. \tag{3}$$
If we knew that
$$E(\mathrm{Es}(A_n)^k) \asymp [E(\mathrm{Es}(A_n))]^k$$
we could pull the power outside the expectation. However, the multifractal nature of a random walk path makes this relation incorrect in general. One case where the argument does work is in $d = 4$, the critical dimension for random walk intersections. Here, the intersections are sufficiently infrequent that conditioning on a path not to intersect another path gives a relatively minor conditioning on the path. This means that 'long-range' and 'short-range' intersections are almost independent and intersections of one walk with $A_n$ are almost independent of those with another walk. In this case it can be shown (Lawler 1992) that
$$E[\mathrm{Es}(A_n)^k] \asymp [E(\mathrm{Es}(A_n))]^k,$$
and in particular that
$$E[\mathrm{Es}(A_n)] \asymp (\ln n)^{-1/2}.$$
In the terminology of mathematical physics, one says that 'mean-field' arguments hold in the critical dimension (but are not expected to hold below the critical dimension). The same basic idea can be used to show that for any $j, k$, if $A_n = S^1[0,n] \cup \dots \cup S^j[0,n]$,
$$E[\mathrm{Es}(A_n)^k] \asymp (\ln n)^{-jk/2}.$$
For $d < 4$, the intersection exponent, $\zeta = \zeta_d$, is defined by
$$E(e(\sqrt{n}, A_n)) \approx n^{-\zeta}, \quad d = 2,$$
$$E(\mathrm{Es}(A_n)) \approx n^{-\zeta}, \quad d = 3.$$
It can be proved (see Lawler 1991, Chapter 5, and references therein) that the exponent $\zeta$ is well-defined and is equal to the analogous exponent for Brownian motion intersections. However, it is still an open question to determine $\zeta$. A nonrigorous, conformal invariance argument has been given to suggest that $\zeta = \frac{5}{8}$ if $d = 2$. Monte Carlo estimates tend to confirm this result. For $d = 3$, only Monte Carlo estimates are available and they suggest that $\zeta$ is between .28 and .29. In both cases the conjectured value is strictly greater than the value $\frac{1}{4}(4 - d)$ that one would get from a 'mean-field' argument using (2) and (3). One can get some rigorous bounds for $\zeta$ using (2) and (3). First consider $d = 2$. We can write
$$n^{-1} \asymp E[e(\sqrt{n}, A_n)^2] = E[e(\sqrt{n}, A_n)]\, \bar E[e(\sqrt{n}, A_n)], \tag{4}$$
where $\bar E$ denotes expectation with respect to the probability measure $\bar P$ whose Radon–Nikodym derivative with respect to the random walk measure $P$ is
$$\frac{e(\sqrt{n}, \cdot)}{E[e(\sqrt{n}, A_n)]}.$$
Although this measure is not well understood, we can still get some estimates for this expectation. For example, $A_n$ is a connected set of radius $\sqrt{n}$. Therefore the Beurling projection theorem indicates that
$$e(\sqrt{n}, A_n) \le c(\sqrt{n})^{-1/2} = c n^{-1/4}.$$
From this we see from (4) that
$$E(e(\sqrt{n}, A_n)) \ge c n^{-3/4}.$$
This argument can be improved a little to give the best known rigorous upper bound $\zeta_2 < \frac{3}{4}$. Giving a bound in the other direction amounts to showing that $\bar E$ is significantly different than $E$. For $d = 2$ this is definitely the case; in fact, most sets $A_n$ under $P$ have $e(\sqrt{n}, A_n) = 0$. Define $\gamma$ by
$$P\{e(\sqrt{n}, A_n) \neq 0\} \approx n^{-\gamma}.$$
Then it can be shown that $\gamma > 0$ (for $d = 2$). It is not difficult to show that (4) then implies $2\zeta \ge 1 + \gamma$. Estimates on $\gamma$ then give estimates on $\zeta$. The best rigorous bound in this direction is $\zeta \ge \frac{1}{2} + (8\pi)^{-1}$.
The proofs of these estimates are actually done for the corresponding Brownian motion exponents and make use of the conformal invariance of Brownian motion as well as estimates from complex variables. The random walk estimate follows since the exponents are the same. For $d = 3$ there is currently no nontrivial lower bound on $\zeta$, and hence it has not been proved rigorously that there is multifractal behavior. One can prove that $\zeta < \frac{1}{2}$. The case $j = 2$, $k = 2$ is interesting. Let $A_n = S^1[0,n] \cup S^2[0,n]$ and consider $E(H_{A_n}(0)^2)$. We know from (3) that $E(H_{A_n}(0)) \asymp n^{-1}$. Makarov's theorem tells us that $H_{A_n}$ is concentrated on a set of approximately $n^{1/2}$ points. From this we can rigorously prove (Lawler 1992b) that $E(H_{A_n}(0)^2)$ decays no faster than $n^{-3/2}$. It is unknown whether this inequality is sharp, but the general multifractal nature of a random walk path argues against it.

3. Diffusion Limited Aggregation

3.1. Basic Model

Diffusion limited aggregation (DLA) is a cluster growth model in $\mathbb{Z}^d$ first introduced by Witten and Sander (1981). This simply described model has produced an enormous number of papers in the physics literature, yet there are only a handful of rigorous results. To define the model we start by setting $A_1 = \{0\}$. We get a new cluster $A_{n+1}$ from $A_n$ by sending a random walker from infinity until it hits a boundary point of $A_n$ and then adding that point to the cluster. More precisely,
$$P\{A_{n+1} = A_n \cup \{x\} \mid A_n\} = H_{\partial A_n}(x).$$
(Here we write $\partial A$ for the set of lattice points distance one from $A$.) Note that $A_n$ is always a connected subset of $\mathbb{Z}^d$ of $n$ points including the origin. The clusters which are formed from these dynamics have 'fractal-like' shape and this model has been used to model dendritic growth. The most natural quantity to try to estimate for this cluster is the 'fractal dimension'. It is not often clear what is meant by this dimension, so we will discuss the well-defined quantity $r_n = \mathrm{rad}(A_n)$. It is expected that
$$r_n \approx n^{\alpha},$$
for some dimension dependent exponent $\alpha$. It is standard to refer to $\bar d = 1/\alpha$ as the fractal dimension, since the ball of radius $n^{\alpha}$ contains $n = (n^{\alpha})^{\bar d}$ points. Numerical simulations tend to indicate that $\bar d$ is around 1.6 in two dimensions, although the simulations are not conclusive by any means. There are some mean-field theories in high dimensions that suggest for large $d$ that $\bar d$ is slightly larger than $d - 1$. Determining the exponent $\alpha$ is essentially equivalent to determining the harmonic measure of the tip of a DLA cluster. Suppose the cluster $A_n$ is given.
Typically, there will only be a couple of points on the boundary of $A_n$ such that adding one of these points will increase the radius of $A_n$. If such a point is added, then the radius increases by an amount of order 1. We then get a difference equation of the form
$$E(r_{n+1} - r_n) \approx c E(H_{\partial A_n}(x)),$$
where $x$ is a point on the 'tip' of $\partial A_n$. Suppose that
$$E[H_{\partial A_n}(x)] \approx n^{-\beta}.$$
Then $r_n \approx n^{\alpha}$ where $\alpha = 1 - \beta$.
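The growth rule of the basic model can be sketched as a small simulation in $d = 2$. This is only an approximation under stated assumptions: the walker 'from infinity' is replaced by a walker released at a uniformly random angle on a circle of radius $r_n + 5$, and a walker that strays beyond three times that radius is released again. These choices, and all function names, are illustrative and not taken from the text.

```python
import math
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def grow_dla(n_points, seed=0):
    """Grow a cluster A_n of n_points sites in Z^2, starting from A_1 = {0}."""
    rng = random.Random(seed)
    cluster = {(0, 0)}
    radius = 0.0

    def launch(distance):
        # Approximate release 'from infinity': random angle on a circle.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        return round(distance * math.cos(theta)), round(distance * math.sin(theta))

    while len(cluster) < n_points:
        start = radius + 5.0
        x, y = launch(start)
        while True:
            # The walker sits on the boundary of A_n once a neighbour is in A_n.
            if any((x + dx, y + dy) in cluster for dx, dy in STEPS):
                cluster.add((x, y))
                radius = max(radius, math.hypot(x, y))
                break
            dx, dy = rng.choice(STEPS)
            x, y = x + dx, y + dy
            if math.hypot(x, y) > 3.0 * start:
                x, y = launch(start)  # walker wandered off: release it again
    return cluster


def is_connected(cluster):
    """True if every site can be reached from the origin by nearest neighbour steps."""
    seen, stack = {(0, 0)}, [(0, 0)]
    while stack:
        x, y = stack.pop()
        for dx, dy in STEPS:
            nb = (x + dx, y + dy)
            if nb in cluster and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen == cluster
```

Recording `max(math.hypot(x, y) for x, y in cluster)` for increasing `n_points` gives a rough empirical look at the growth of $r_n$, although clusters of this size are far too small to say anything about $\alpha$.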
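The Beurling projection theorem says that
$$H_{\partial A_n}(x) \le \begin{cases} c\,r_n^{-1/2}, & d = 2, \\ c\,r_n^{-1} (\ln n)^{1/2}, & d = 3, \\ c\,r_n^{-1}, & d \ge 4. \end{cases}$$
This allows one to write a difference inequality,
$$r_{n+1} - r_n \le \begin{cases} c\,r_n^{-1/2}, & d = 2, \\ c\,r_n^{-1} (\ln n)^{1/2}, & d = 3, \\ c\,r_n^{-1}, & d \ge 4, \end{cases}$$
which gives an upper bound on the growth rate
$$r_n \le \begin{cases} c\,n^{2/3}, & d = 2, \\ c\,n^{1/2} (\ln n)^{1/4}, & d = 3, \\ c\,n^{1/2}, & d \ge 4. \end{cases} \tag{5}$$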
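The passage from the difference inequality to the growth bound (5) can be checked numerically in the case $d = 2$. The sketch below iterates the extremal recursion $r_{n+1} = r_n + c\,r_n^{-1/2}$ (equality in place of the inequality, with the hypothetical choices $c = 1$ and $r_1 = 1$) and reads off the growth exponent from two values of $n$.

```python
import math


def radius_envelope(n_steps, c=1.0, r0=1.0):
    """Iterate r_{n+1} = r_n + c * r_n^{-1/2}; returns [r_1, ..., r_{n_steps}]."""
    r = r0
    radii = [r]
    for _ in range(n_steps - 1):
        r += c * r ** -0.5
        radii.append(r)
    return radii


def growth_exponent(radii, n1, n2):
    """Slope of ln r_n against ln n between n1 and n2 (1-based indices)."""
    return math.log(radii[n2 - 1] / radii[n1 - 1]) / math.log(n2 / n1)
```

Between $n = 10^4$ and $n = 10^5$ the measured exponent is very close to $2/3$, matching the continuous approximation $\frac{d}{dn} r^{3/2} \approx \frac{3}{2} c$, i.e. $r_n \approx (\tfrac{3}{2} c\, n)^{2/3}$.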
This argument, first presented by Kesten (1987b), can be made rigorous to show that there exists a constant $c$ such that with probability one (5) holds for all $n$ sufficiently large. This gives the best rigorous upper bound for $d = 2, 3$. This bound is very bad for large $d$. Now assume $d \ge 3$. Consider the capacity of $\bar A_n = A_n \cup \partial A_n$, $\mathrm{cap}(\bar A_n)$. It can be shown that there is a constant $c > 0$ such that if $A$ is any finite subset and $x \notin A$,
$$c\,\mathrm{Es}(x, A)^2 \le \mathrm{cap}(A \cup \{x\}) - \mathrm{cap}(A) \le \mathrm{Es}(x, A)^2.$$
We therefore get
$$E[\mathrm{cap}(\bar A_{n+1}) - \mathrm{cap}(\bar A_n)] \asymp E\Big[\sum_{x \in \partial A_n} H_{\bar A_n}(x)\, \mathrm{Es}(x, \bar A_n)^2\Big] = E\Big[\mathrm{cap}(\bar A_n)^2 \sum_{x \in \partial A_n} H_{\bar A_n}(x)^3\Big].$$
We now do some heuristics. First, we believe that the dimension of $\bar A_n$ is around $d - 1$. As long as it is at least $d - 2$ we would expect to be able to estimate the capacity in terms of the radius,
$$\mathrm{cap}(\bar A_n) \asymp r_n^{d-2}.$$
How $r_n$ grows depends on the behavior of
$$E\Big[\sum_{x \in \partial A_n} H_{\bar A_n}(x)^3\Big].$$
Assume this quantity decays like $r_n^{-\beta}$. Then we get an expression
$$r_{n+1}^{d-2} - r_n^{d-2} \approx r_n^{-\beta}\, r_n^{2(d-2)}.$$
Solving this difference equation gives $r_n \approx n^{\alpha}$ where
$$\bar d = \alpha^{-1} = \beta + 2 - d. \tag{6}$$
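The step from the difference equation to (6) can be spelled out by power counting. The following is only a heuristic sketch: it substitutes the ansatz $r_n \approx n^{\alpha}$ and treats the increment in $n$ as a derivative, which is an assumption of this illustration rather than part of a rigorous argument.

```latex
% Left side: increment of r_n^{d-2} under the ansatz r_n \approx n^{\alpha}
r_{n+1}^{d-2} - r_n^{d-2} \approx \frac{d}{dn}\, n^{\alpha(d-2)} \approx n^{\alpha(d-2) - 1}.
% Right side: r_n^{-\beta} r_n^{2(d-2)} under the same ansatz
r_n^{-\beta}\, r_n^{2(d-2)} \approx n^{\alpha\,(2(d-2) - \beta)}.
% Equating the exponents of n:
\alpha(d-2) - 1 = \alpha\,\bigl(2(d-2) - \beta\bigr)
\;\Longrightarrow\;
1 = \alpha\,(\beta + 2 - d)
\;\Longrightarrow\;
\bar d = \alpha^{-1} = \beta + 2 - d.
```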
Unfortunately, we do not know how to find $\beta$. We can give a bound in one direction. Recall that at the ‘tip’ of a DLA cluster,
$$H_{\bar A_n}(x) \approx n^{\alpha - 1} = r_n^{1 - \bar d}.$$
Hence
$$\sum_{x \in \partial A_n} H_{\bar A_n}(x)^3 \ge r_n^{3(1 - \bar d)},$$
and $\beta \le 3(\bar d - 1)$. Plugging this into (6) gives $\bar d \ge \frac{1}{2}(d + 1)$. This is only heuristic, but Kesten (1990) has proved this estimate rigorously using a similar argument estimating the growth of the capacity. Kesten's upper bounds on the growth rate (equivalently, lower bounds on $\bar d$) are the only rigorous bounds. In particular, there are no lower bounds on the growth rate which say that the dimension $\bar d < d$. Also the upper bounds, especially in high dimensions, are far from the conjectured values.

3.2. Related Models

There are a number of variants of the DLA model. One slight variant, which sometimes goes under the name of the dielectric breakdown model, adds points to the cluster $A_n$ by sending a random walker from infinity until it hits $A_n$ and then adds the boundary point from which $A_n$ is entered. This is similar to the model which adds points according to the rule
$$P\{A_{n+1} = A_n \cup \{x\} \mid A_n\} = \frac{g(x)}{\sum_{y \in \partial A_n} g(y)}, \qquad x \in \partial A_n,$$
where
$$g(x) = g_{A_n}(x) = \mathrm{Es}(x, A_n), \qquad d \ge 3,$$
and for $d = 2$, $g$ is the harmonic function on $A_n^c$ with boundary value 0 on $A_n$ and with logarithmic growth at infinity as described in Section 1.1. While these models do not produce exactly the same cluster distribution as DLA, the clusters are believed to behave qualitatively the same.

Another thing that one can do is add a parameter $\eta > 0$ to the model. In this case we add points to the cluster $A_n$ by
$$P\{A_{n+1} = A_n \cup \{x\} \mid A_n\} = Z^{-1} H_{\bar A_n}(x)^{\eta}, \qquad x \in \partial A_n,$$
where $Z = Z(\eta, \bar A_n)$ is the appropriate normalization constant. (We could similarly adapt the dielectric breakdown model and add points according to $g^{\eta}$.) DLA corresponds to $\eta = 1$. In this case we expect the radius of the cluster to grow like $n^{\alpha}$ where $\alpha = \alpha_d(\eta)$. As $\eta \to \infty$ the growth becomes more concentrated at the tips
and hence $\alpha_d(\eta) \to 1$. It is a good open question whether there is an $\eta < \infty$ such that $\alpha(\eta) = 1$. To motivate why this may be true, consider the two-dimensional line segment which was discussed in Section 1.4. If $\eta > 2$ then the probability measure $Z^{-1} H_{\bar A_n}^{\eta}$ is heavily concentrated at the tips of the line, which means that growth will occur there.

The $\eta \to 0$ limit is related somewhat to the Eden model for cluster growth. In the Eden model, all points on the boundary of $A_n$ are equally likely to be added. The $\eta \to 0$ limit is similar, except that only points on the boundary which can ‘see infinity’ can be added.

One can derive one estimate on $\alpha(\eta)$ for $d = 2$ using an argument similar to Kesten’s argument for $\eta = 1$. Let $\eta > 0$. It follows from the discrete Makarov theorem that for every $\epsilon > 0$ there is a $c = c(\eta, \epsilon)$ such that if $A$ is any connected subset of $\mathbb{Z}^d$ containing 0 of radius $r$,
$$\sum_{y \in A} H_A(y)^{\eta} \ge c\,r^{1 - \eta - \epsilon}.$$
If we let $H_A(\cdot, \eta)$ denote the probability measure on $A$, $Z^{-1} H_A(\cdot)^{\eta}$, then we can conclude from this estimate and the discrete Beurling inequality that for every $y \in A$
$$H_A(y, \eta) \le c\,r^{\epsilon + (\eta/2) - 1}.$$
If $r_n$ denotes the radius of the cluster $A_n$ in the $\eta$-model for DLA, this inequality translates into the difference inequality
$$E(r_{n+1} - r_n) \le c\,r_n^{\epsilon + (\eta/2) - 1},$$
from which we can conclude $r_n \le c\,n^{2/(4 - \eta - 2\epsilon)}$. Hence
$$\alpha(\eta) \le \frac{2}{4 - \eta}.$$
In particular we see that
$$\lim_{\eta \to 0} \alpha(\eta) = \tfrac{1}{2}.$$
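The passage from the difference inequality to the bound on $r_n$ can be sketched as follows. This is a heuristic computation only: it treats $n$ as a continuous variable, and the shorthand $\gamma$ is introduced here for illustration and does not appear elsewhere in the text.

```latex
% Shorthand (ours): \gamma = 2 - \eta/2 - \epsilon, assumed positive (so \eta < 4)
\frac{d}{dn}\, r_n^{\gamma} = \gamma\, r_n^{\gamma - 1}\, \frac{d r_n}{dn}
   \le \gamma\, r_n^{1 - \eta/2 - \epsilon} \cdot c\, r_n^{\epsilon + (\eta/2) - 1} = c\,\gamma ,
% so r_n^{\gamma} grows at most linearly in n:
r_n^{\gamma} \le c' n
\;\Longrightarrow\;
r_n \le c\, n^{1/\gamma} = c\, n^{2/(4 - \eta - 2\epsilon)} .
% Letting \epsilon \to 0 gives \alpha(\eta) \le 2/(4 - \eta).
```

Since a cluster of $n$ points has radius at most $n$, one always has $\alpha(\eta) \le 1$, so the bound $2/(4 - \eta)$ carries information only for $\eta < 2$.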
There has been some work on the $\eta \to \infty$ limit. In particular, Kesten (1991b) and Lawler (1992c) considered the case where $\eta$ depends on $n$, $\eta(n) = C \ln n$. If $C$ is sufficiently large it is not too difficult to show that with positive probability $A_n$ can grow like a straight line. It is interesting to ask what other shapes $A_n$ can grow in. Suppose we ask about L-shapes: we say that $A_n$ grows like an $a \times b$ L-shape if
$$A_n = \{(x_1, 0, 0, \ldots, 0) : 0 \le x_1 \le a(n)\} \cup \{(0, x_2, 0, 0, \ldots, 0) : 0 \le x_2 \le b(n)\},$$
where $a(n)/b(n) \to a/b$. If $d = 2$, L-shapes cannot form; if $d = 3$, L-shapes can form at a ratio $a/b$ which depends on $C$; and for $d \ge 4$, L-shapes can form with $a = b$.
3.3. Internal DLA

There is an inverse process to DLA which is sometimes called diffusion limited erosion. Here random walkers are sent from infinity until they hit a cluster, at which time they remove that particle from the cluster. Of course, if one starts with a finite cluster and removes particles one by one, eventually the cluster will have no points. Internal diffusion limited aggregation is a model which tries to approximate this erosion phenomenon without having this problem of having the cluster completely disappear. One can think of diffusion limited erosion as a cluster growth model where the growing cluster is the complement of the finite cluster. If we use the origin rather than infinity as the source for particles one gets the model internal DLA.

To be precise, internal DLA is the growth model in $\mathbb{Z}^d$ with $A_1 = \{0\}$ and transitions
$$P\{A_{n+1} = A_n \cup \{x\} \mid A_n\} = H_{\partial A_n}(0, x).$$
In other words, random walkers are sent from the origin until they find a point which is not currently in the cluster, at which time they stop and add that point to the cluster. Clearly, internal DLA favors adding points which are near the origin, in sharp contrast to DLA where the tips are favored. One would, therefore, expect that internal DLA clusters would be much ‘fatter’ than DLA clusters. In fact, it can be shown (Lawler et al. 1992) that the cluster formed is spherical: for every $\epsilon > 0$, with probability one, for all $n$ sufficiently large,
$$C_{(1-\epsilon)n} \subset A_{[\omega_d n^d]} \subset C_{(1+\epsilon)n}.$$
Here, $\omega_d$ is the volume of the unit ball in $\mathbb{R}^d$ so that $[\omega_d n^d]$ represents the approximate number of lattice points in the ball of radius $n$.

We will give the idea here of part of the proof. Let $m = m(n) = [\omega_d n^d]$. Take a point $x$ with $|x| < n$. What we would like to show is that with very high probability the point $x$ is contained in $A_m$. Consider $m$ random walkers starting at the origin and stopping when they reach distance $n$. We let them run one at a time creating an internal DLA cluster as they go.
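The growth rule itself is easy to simulate. The following Python sketch is an illustration only and makes no claim about how the cited simulations were done; the function names `internal_dla` and `radii` are hypothetical, and the walk is the standard nearest-neighbor simple random walk. It grows a cluster by the rule above and reports the smallest distance of an unoccupied site and the largest distance of an occupied site, the two quantities compared in the spherical shape statement.

```python
import itertools
import math
import random


def internal_dla(num_particles, dim=2, seed=0):
    """Grow an internal DLA cluster with num_particles sites.

    Each walker starts at the origin, performs simple random walk, and
    is added at the first site it reaches that is not yet in the cluster.
    """
    rng = random.Random(seed)
    origin = (0,) * dim
    cluster = {origin}  # A_1 = {0}
    while len(cluster) < num_particles:
        pos = origin
        while pos in cluster:  # keep walking while still inside the cluster
            axis = rng.randrange(dim)
            step = rng.choice((-1, 1))
            pos = pos[:axis] + (pos[axis] + step,) + pos[axis + 1:]
        cluster.add(pos)  # first site outside the cluster is occupied
    return cluster


def radii(cluster):
    """Return (inner, outer): min |x| over sites not in the cluster,
    and max |x| over sites in the cluster (Euclidean norm)."""
    dim = len(next(iter(cluster)))
    outer = max(math.hypot(*x) for x in cluster)
    bound = int(outer) + 1  # the box [-bound, bound]^dim contains empty sites
    box = itertools.product(range(-bound, bound + 1), repeat=dim)
    inner = min(math.hypot(*x) for x in box if x not in cluster)
    return inner, outer


if __name__ == "__main__":
    cluster = internal_dla(2000)
    inner, outer = radii(cluster)
    # For a disc with 2000 lattice points the radius is about sqrt(2000 / pi).
    print(inner, outer, math.sqrt(2000 / math.pi))
```

For a few thousand particles in two dimensions one would expect both radii to be close to $\sqrt{m/\pi}$, in line with the shape statement, but a single run of this sketch is of course no substitute for the theorem.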
For any $x$ we can let $M$ be the total number of these walkers that visit $x$ (before or after the time they add to the internal DLA cluster) and let $L$ be the number of walkers which visit $x$ some time after adding a point to the cluster. If $M - L > 0$, then $x$ must be in the cluster. Let us estimate the expectations. $E(M) = m\,p_n(0, x)$, where $p_n(x, y)$ is the probability that a random walker starting at $x$ visits $y$ before leaving the ball of radius $n$. Note that if $x$ and $y$ are not too near the boundary, then $p_n(x, y) \sim c\,G_n(x, y)$. Here $G_n$ is the Green’s function for the ball of radius $n$. The $c$ is a constant for $d \ge 3$ while it is actually $c(\ln n)^{-1}$ for $d = 2$. To estimate $L$, we note that each point in the ball of radius $n$ is added at most once to the cluster $A_n$. We then get
$$E(L) \le \sum_{|y| < n} p_n(y, x) \sim c \sum_{|y| < n} G_n(y, x).$$
We now use a property of a sphere: a sphere and its center are the only domain and designated point such that for each point in the domain the average of the Green’s function of the point with all points in the sphere is bounded above by the Green’s function of the point and the designated point. In this case that boils down to
$$G_n(0, x) \ge m^{-1} \sum_{|y| < n} G_n(y, x).$$
For $x$ sufficiently far away from the boundary of the ball, we can then use this argument to show that $E(M)$ is much larger than $E(L)$ and with the aid of standard large deviations results show that $M - L > 0$ with high probability.

A still open question concerns the fluctuations in this model. Define the inner and outer error $\delta_I(n)$ and $\delta_O(n)$ by
$$n - \delta_I(n) = \inf\{|x| : x \notin A_{[\omega_d n^d]}\}, \qquad n + \delta_O(n) = \sup\{|x| : x \in A_{[\omega_d n^d]}\}.$$
In one dimension, it can be shown by using the ‘gambler’s ruin’ estimate for random walks that $\delta_I(n)$ and $\delta_O(n)$ are of order $\sqrt{n}$. Computer simulations suggest that in higher dimensions the fluctuations are much smaller. It has recently been shown (Lawler 1993a) that for $d \ge 2$ the fluctuations are of no larger order than $n^{1/3}$. It is quite possible that the fluctuations are even smaller (see Krug and Meakin 1991 and Krug and Spohn 1991 for some discussion of a related model).

4. Loop-Erased or Laplacian Random Walk

The loop-erased or Laplacian random walk is a process which produces an infinite self-avoiding random walk in $\mathbb{Z}^d$. The process can be defined in two different ways, which is why there are two different names for the process. The first definition uses the process of erasing loops from simple random walk. Let $S(t)$ denote a simple random walk in $\mathbb{Z}^d$ with integer time $t$. First, assume $d \ge 3$ so that the random walk is transient. Then we can erase loops in a chronological order and obtain an infinite walk which is self-avoiding. To be precise, we define
$$\sigma_0 = \sup\{t : S(t) = 0\},$$
and for $i > 0$,
$$\sigma_i = \sup\{t : S(t) = S(\sigma_{i-1} + 1)\}.$$
We let
$$\hat S(i) = S(\sigma_i) = S(\sigma_{i-1} + 1),$$
where for convenience we have set $\sigma_{-1} = -1$. Then it is easy to check that $\hat S$ is an infinite self-avoiding walk. If $d = 2$ we cannot use this definition directly; however, an easy modification works.
In order to get the measure on $n$-step walks, we can take simple random walks with $M$ steps; erase loops; consider the measure this gives on $n$-step self-avoiding walks; and take the limit of this measure as $M$ goes to infinity.
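For a finite path the last-visit times $\sigma_i$ can be computed directly, which gives a short implementation of loop-erasure. The Python sketch below is an illustration under the assumption that the input is a finite list of hashable sites (for instance integer tuples); the name `loop_erase` is hypothetical. It follows the definition above with the supremum taken over the finite time range of the path.

```python
def loop_erase(path):
    """Loop-erase a finite path using last-visit times.

    Mirrors sigma_i = sup{t : S(t) = S(sigma_{i-1} + 1)} with
    sigma_{-1} = -1, where the sup runs over the times of the finite path.
    """
    # last_visit[site] ends up holding the largest t with path[t] == site
    last_visit = {}
    for t, site in enumerate(path):
        last_visit[site] = t

    erased = []
    sigma = -1
    while sigma + 1 < len(path):
        site = path[sigma + 1]    # S(sigma_{i-1} + 1), the next retained site
        erased.append(site)
        sigma = last_visit[site]  # jump to the last visit of that site
    return erased


if __name__ == "__main__":
    # The excursion 1 -> 2 -> 1 is a loop and is removed.
    print(loop_erase([0, 1, 2, 1, 3]))  # [0, 1, 3]
```

Each retained site is visited for the last time at $\sigma_i$, so no site can appear twice in the output, which is the finite-path version of the self-avoidance claim.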
The term Laplacian random walk comes from the transition probabilities for the walk. It is not difficult to verify that for $|x_{t+1} - x_t| = 1$,
$$P\{\hat S(t+1) = x_{t+1} \mid \hat S[0, t] = [x_0, \ldots, x_t]\} = \frac{g(x_{t+1})}{\sum_{|y - x_t| = 1} g(y)}. \tag{7}$$
Here $g$ is the function which is 0 on $\{x_0, \ldots, x_t\}$, harmonic on $\mathbb{Z}^d \setminus \{x_0, \ldots, x_t\}$, and as $|x| \to \infty$,
$$g(x) \sim \begin{cases} \ln |x|, & d = 2, \\ 1, & d \ge 3. \end{cases}$$
The loop-erased walk was first studied in (Lawler 1980). It was independently introduced as the Laplacian random walk in (Lyklema et al. 1986), using the transition probabilities above as the definition of the walk. These authors were unaware of the loop-erasing definition. (It is fairly easy to derive the transition probabilities from the definition of the loop-erased walk, but it is not obvious from the Laplacian definition that the walk can be derived from loop-erasing.)

One quantity of interest is the speed at which the walk grows. A natural measure of this is the mean-squared distance, $E(|\hat S(t)|^2)$. For simple random walk, the mean-squared distance is exactly $t$. For $\hat S$, the behavior will depend on the dimension. As a matter of comparison, we should also consider the mean-squared distance of self-avoiding walks (SAW). A self-avoiding walk of length $n$ is a simple random walk which has no self-intersections. The mean-squared distance of SAW’s of length $n$ is defined to be the average squared length of the endpoint where the average is taken with respect to the uniform measure on SAW’s of length $n$. It is conjectured (see Madras and Slade 1993 for a detailed treatment of the self-avoiding walk) that the mean-squared distance grows like $n^{3/2}$ if $d = 2$; $n^{2\nu}$ where $\nu$ is slightly less than $\frac{3}{5}$ if $d = 3$; $n(\ln n)^{1/4}$ if $d = 4$; and linearly in $n$ for $d \ge 5$. The $d \ge 5$ statement has been proved rigorously. For the loop-erased walk we might conjecture that
$$E(|\hat S(n)|^2) \sim \begin{cases} c\,n^{\alpha}, & d = 2, 3, \\ c\,n(\ln n)^{\alpha}, & d = 4, \\ c\,n, & d \ge 5, \end{cases}$$
where the exponent $\alpha$ depends on the dimension. This certainly appears to be the case. The $d \ge 5$ statement is true and the $d = 4$ result has recently been proved (Lawler 1993b) with $\alpha = \frac{1}{3}$. For $d = 2, 3$, the best rigorous result (Lawler 1991, Chapter 7) is that
$$\alpha \ge \tfrac{3}{2}, \quad d = 2, \qquad \alpha \ge \tfrac{6}{5}, \quad d = 3.$$
There is strong heuristic and numerical evidence to suggest that $\alpha = \frac{8}{5}$ in two dimensions, and numerical evidence suggests that $\alpha$ is around 1.23 in three dimensions.

The analysis of $E(|\hat S(n)|^2)$ lies in determining the number of points which are erased in the loop-erasing procedure. Suppose that on the average $f(n)$ of the first $n$ points remain in the self-avoiding path after erasing loops from the simple random walk. Then one would expect that $E(|\hat S(f(n))|^2)$ would be of order $n$ or equivalently that $E(|\hat S(n)|^2)$ would be of order $f^{-1}(n)$. It suffices therefore to analyze $f(n)$. If we let $p(n)$ be the probability that the $n$th point is not erased, then $p(n) \approx n^{-1} f(n)$.
It is $p(n)$ that we will consider here. For ease, we will write expressions assuming $d \ge 3$; however, similar expressions can be written in the $d = 2$ case. It is easy to see that the $n$th point of $S$ is not erased if and only if the simple random walk after time $n$ does not intersect the self-avoiding path derived by erasing loops on $S[0, n]$. Let $A_n = S[0, n]$ and let $\hat A_n$ denote the set of points in the path derived from erasing loops on $S[0, n]$. Then we see that
$$p(n) = E[\mathrm{Es}(S(n), \hat A_n)].$$
It is not immediately obvious but can be shown that erasing loops in the ‘reverse’ direction gives the same distribution as erasing loops in the ‘forward’ direction. Hence, we can write
$$p(n) = E[\mathrm{Es}(\hat A_n)].$$
Since $\hat A_n \subset A_n$ we get an immediate lower bound on $p(n)$. For $d \ge 5$, this suffices to show that $p(n) \ge c > 0$ and hence that the mean-squared distance grows linearly. For $d \le 4$, this lower bound is not sharp. Recall that for $d \le 4$,
$$E[\mathrm{Es}(A_n)^2] \sim \begin{cases} n^{-1}, & d = 2, \\ n^{-1/2}, & d = 3, \\ (\ln n)^{-1}, & d = 4. \end{cases}$$
(This is not really correct for $d = 2$, but there is a corresponding equation which can be written.) It turns out that the third moment of $\mathrm{Es}(\hat A_n)$ is the one that can be computed fairly easily. In fact,
$$E[\mathrm{Es}(\hat A_n)^3] \sim \begin{cases} n^{-1}, & d = 2, \\ n^{-1/2}, & d = 3, \\ (\ln n)^{-1}, & d = 4, \end{cases}$$
with the same proviso that the $d = 2$ result is not quite right. The probable multifractal nature of $\hat A_n$ suggests that for $d = 2, 3$,
$$E[\mathrm{Es}(\hat A_n)^3] \not\approx [E(\mathrm{Es}(\hat A_n))]^3.$$
We can get a bound in one direction,
$$E[\mathrm{Es}(\hat A_n)^3] \ge (E[\mathrm{Es}(\hat A_n)])^3,$$
and this is the basis for the rigorous result mentioned above. For $d = 4$, the critical dimension, one expects that
$$E[\mathrm{Es}(\hat A_n)^3] \approx (E[\mathrm{Es}(\hat A_n)])^3,$$
and hence
$$p(n) = E[\mathrm{Es}(\hat A_n)] \approx (\ln n)^{-1/3}.$$
This has been proven recently (Lawler 1993b) using the idea of slowly recurrent sets.

In (Lyklema et al. 1986), Laplacian random walks with a parameter $\eta > 0$ were also considered. In this model, the transition probabilities are as in (7) except that $g$ is replaced with $g^{\eta}$. Hence, $\eta = 1$ corresponds to the loop-erased walk.
Unfortunately, there is no procedure like loop-erasing which corresponds to the Laplacian walk for $\eta \ne 1$. Since all the rigorous analysis of the Laplacian walk has used the loop-erasing characterization, there are no results for $\eta \ne 1$.
Acknowledgements

This research is partially supported by grants from the National Science Foundation.

References

Aharony, A. (1991). Fractal growth. In Fractals and Disordered Systems (A. Bunde and S. Havlin, ed.), Springer-Verlag, Berlin, 151–174.
Auer, P. (1990). Some hitting probabilities of random walks on $\mathbb{Z}^2$. In Limit Theorems in Probability and Statistics (L. Berkes, E. Csáki, and P. Révész, ed.), North-Holland, 9–25.
Kesten, H. (1987a). Hitting probabilities of random walks on $\mathbb{Z}^d$. Stochastic Processes and Their Applications 25, 165–184.
Kesten, H. (1987b). How long are the arms in DLA? Journal of Physics A: Mathematical and General 20, L29–L33.
Kesten, H. (1990). Upper bounds for the growth rate of DLA. Physica A 168, 529–535.
Kesten, H. (1991a). Relations between solutions of a discrete and a continuous Dirichlet problem. In Random Walks, Brownian Motion and Interacting Particle Systems (R. Durrett and H. Kesten, ed.), Birkhäuser, Boston, 309–321.
Kesten, H. (1991b). Some caricatures of multiple contact diffusion-limited aggregation and the η-model. In Stochastic Analysis (M. Barlow and N. Bingham, ed.), Cambridge University Press, Cambridge, 179–228.
Komlós, J., Major, P., and Tusnády, G. (1976). An approximation theorem of partial sums of independent R.V.’s and the sample DF. II. Zeitschrift für Wahrscheinlichkeitstheorie verw. Geb. 34, 33–58.
Krug, J. and Meakin, P. (1991). Kinetic roughening of Laplacian fronts. Physical Review Letters 66, 703–706.
Krug, J. and Spohn, H. (1991). Kinetic roughening of growing surfaces. In Solids Far from Equilibrium: Growth, Morphology, and Defects (C. Godreche, ed.), Cambridge University Press, Cambridge.
Lawler, G. (1980). A self-avoiding random walk. Duke Mathematical Journal 47, 655–694.
Lawler, G. (1991). Intersections of Random Walks. Birkhäuser, Boston.
Lawler, G. (1992a). Escape probabilities for slowly recurrent sets. Probability Theory and Related Fields 94, 91–117.
Lawler, G. (1992b). A discrete analogue of a theorem of Makarov. Combinatorics, Probability, and Computing, to appear.
Lawler, G. (1992c). L-shapes for the logarithmic η-model for DLA in three dimensions. In Seminar on Stochastic Processes 1991, Birkhäuser, Boston, 97–122.
Lawler, G. (1993a). Subdiffusive fluctuation for internal diffusion limited aggregation. Preprint.
Lawler, G. (1993b). The logarithmic correction for loop-erased walk in four dimensions. Preprint.
Lawler, G., Bramson, M., and Griffeath, D. (1992). Internal diffusion limited aggregation. Annals of Probability 20, 2117–2140.
Lyklema, J. W., Evertsz, C., and Pietronero, L. (1986). The Laplacian random walk. Europhysics Letters 2, 77–82.
Madras, N. and Slade, G. (1993). The Self-Avoiding Walk. Birkhäuser, Boston.
Makarov, N. G. (1985). Distortion of boundary sets under conformal mappings. Proceedings of the London Mathematical Society 51, 369–384.
Stanley, H. G. (1991). Fractals and multifractals: the interplay of physics and geometry. In Fractals and Disordered Systems (A. Bunde and S. Havlin, ed.), Springer-Verlag, Berlin, 1–50.
Witten, T. and Sander, L. (1981). Diffusion limited aggregation, a kinetic critical phenomenon. Physical Review Letters 47, 1400–1403.
SURVIVAL AND COEXISTENCE IN INTERACTING PARTICLE SYSTEMS
T. M. LIGGETT* Department of Mathematics University of California Los Angeles, CA 90024 U.S.A.
Abstract. A fifteen year old technique for proving survival of the basic one dimensional contact process is extended in order to obtain improved upper bounds for contact like processes. Comparison techniques are described which can be combined with these survival results to determine exactly which threshold voter models coexist. The paper ends with a bibliography of most of the papers written about interacting particle systems since the author’s book on this subject appeared in 1985. Key words: Interacting particle system, contact process, survival, voter model.
*Preparation of this paper was supported in part by NSF Grant 91-00725.

1. Survival of Contact-Like Processes

A proof of survival for the basic one dimensional contact process was given by Holley and Liggett in 1978 (see Section 1 of Chapter VI of Liggett (1985)). To begin the discussion of extensions of this technique, consider the following question: Does the (nontrivial) invariant measure of the basic one dimensional contact process have the strong positive correlations property?

We need to define the terms which appear in this question. The basic one dimensional contact process is the Markov process $\eta_t$ on $\{0, 1\}^{\mathbb{Z}}$ in which $1 \to 0$ at rate one and $0 \to 1$ at rate $\lambda \times$ (# neighbors which are 1). The distribution of the process at time $t$ when the initial distribution is $\mu$ will be denoted by $\mu S(t)$. An invariant measure (i.e., one for which $\mu S(t) = \mu$ $\forall t \ge 0$) is nontrivial if it is not the pointmass on the zero configuration. As is well known, there is a critical value $\lambda_c$ so that the process survives (i.e., has a nontrivial invariant measure) if $\lambda > \lambda_c$ and dies out if $\lambda \le \lambda_c$. The fact that the critical process dies out is rather recent; see Bezuidenhout and Grimmett (1990). Until recently, the best bounds on $\lambda_c$ were $1.539 < \lambda_c < 2$. More on this will be said later.

A probability measure $\mu$ on $\{0, 1\}^{\mathbb{Z}}$ is said to have the positive correlations property if
$$\int fg \, d\mu \ge \int f \, d\mu \int g \, d\mu$$
for all bounded increasing functions $f$ and $g$. This property plays an important role in both statistical mechanics and interacting particle systems. We will say that $\mu$
has the strong positive correlations property if the conditional measure
$$\mu(\,\cdot \mid \eta(x_1) = \epsilon_1, \ldots, \eta(x_n) = \epsilon_n)$$
has the positive correlations property for every $n$ and every choice of $\epsilon_1, \ldots, \epsilon_n$. A theorem due to Harris (Theorem 2.14 in Liggett (1985)) guarantees that for a large class of processes (including the contact process),
$$\mu \text{ has positive correlations} \implies \mu S(t) \text{ has positive correlations } \forall t \ge 0.$$
It follows (for $\lambda > \lambda_c$) that the nontrivial invariant measure of the contact process (which is the limit of $\delta_1 S(t)$) has positive correlations. This observation motivates the question raised above. One can strengthen the question by asking whether there is a version of Harris’ theorem for the strong positive correlations property.

Perhaps surprisingly, we will now answer our question in the negative. To do so, let $\mu$ be any translation invariant probability measure, and write
$$\begin{aligned} \frac{d}{dt} \mu S(t)\{\eta(x) = 0 \ \forall\, 1 \le x \le n\}\Big|_{t=0} = {} & -\lambda \mu\{\eta(0) = 1, \eta(x) = 0 \ \forall\, 1 \le x \le n\} \\ & - \lambda \mu\{\eta(n+1) = 1, \eta(x) = 0 \ \forall\, 1 \le x \le n\} \\ & + \sum_{k=1}^{n} \mu\{\eta(k) = 1, \eta(x) = 0 \ \forall \text{ other } 1 \le x \le n\}. \end{aligned} \tag{1.1}$$
Let $\nu$ be the nontrivial invariant measure, and let
$$F(n) = \frac{\nu\{\eta(0) = 1, \eta(x) = 0 \ \forall\, 1 \le x \le n - 1\}}{\nu\{\eta(0) = 1\}}, \qquad n \ge 1,$$
be the tail probabilities of the conditional spacing distribution. If $\nu$ satisfied the strong positive correlations property, it would follow that
$$\nu\{\eta(k) = 1, \eta(x) = 0 \ \forall \text{ other } 1 \le x \le n\} \ge F(k) F(n - k + 1)\, \nu\{\eta(0) = 1\} \tag{1.2}$$
for all $1 \le k \le n$. Using this in (1.1), we obtain (since $\nu$ is symmetric with respect to reflection in $\mathbb{Z}$, translation invariant and invariant)
$$\sum_{k=1}^{n} F(k) F(n - k + 1) \le 2\lambda F(n + 1) \tag{1.3}$$
for all $n \ge 1$. The translation invariance of $\nu$ implies that
$$M = \sum_{n=1}^{\infty} F(n) < \infty.$$
Summing (1.3) for $n \ge 1$, and using $F(1) = 1$, we obtain
$$M^2 \le 2\lambda (M - 1).$$
Therefore the quadratic inequality $M^2 - 2\lambda M + 2\lambda \le 0$ has a real solution $M$, so its discriminant $4\lambda^2 - 8\lambda$ is $\ge 0$, and hence $\lambda \ge 2$. We conclude that the strong positive correlations property fails for $\lambda < 2$. It probably fails for all larger $\lambda$ as well. In fact, one can enlarge the class of $\lambda$’s for which this conclusion holds by using a few other expressions of the type (1.1), but for sets of sites which are not intervals. However, since this is a negative result, there is not much point in pursuing this generalization.

Given what we have done so far, the Holley–Liggett proof of survival of the basic one dimensional contact process for $\lambda \ge 2$ can be summarized in the following way:

Step 1. Let $\mu$ be a stationary renewal measure (so that (1.2) holds with equality), chosen so that (1.3) holds with equality. It is easy to solve these equations explicitly for $\lambda \ge 2$:
$$F(n + 1) = \binom{2n}{n} \frac{1}{(n + 1)(2\lambda)^n}, \qquad n \ge 0.$$
Then the right side of (1.1) is zero, so that
$$\frac{d}{dt} \mu S(t)\{\eta(x) = 0 \ \forall\, x \in A\} \le 0 \tag{1.4}$$
(in fact, $= 0$) for $t = 0$, $A = \{1, \ldots, n\}$ and all $n \ge 1$.

Step 2. Show that (1.4) holds at $t = 0$ for all finite sets $A$.

Step 3. Use the duality relation
$$P^{\eta}\{\eta_t = 0 \text{ on } A\} = P^{A}\{\eta = 0 \text{ on } A_t\},$$
where $A_t$ is the finite contact process, to show that (1.4) for $t = 0$ and all $A$ implies (1.4) for all $t \ge 0$ and all $A$.

Step 4. Conclude that $\mu S(t)\{\eta(x) = 0\}$ is nonincreasing in $t$, and hence does not converge to one.

The hard part of the argument is step 2. In that part, one proves and uses the fact that the renewal sequence corresponding to the density $f(n) = F(n) - F(n + 1)$ is decreasing in $n$, and this in turn follows from the logarithmic convexity of $f$. The details can be found in Section 1 of Chapter VI of Liggett (1985).

This argument is quite old, but little was done to see if it could be used more generally until the last few years. The recent applications of this technique can be found in the papers in the bibliography by Katori and Konno, and by Liggett. One type of extension, which we will not describe here, involves proofs of survival for contact processes which are not spatially homogeneous. (See my papers in the Harris and Spitzer volumes and in the 1992 Annals of Probability.) We will describe two other types of extension.

First, there is the possibility of obtaining better upper bounds for $\lambda_c$ for the basic process. To get a better upper bound, one needs to find something more general than a renewal measure to use as the initial distribution in step 1 above. Various possibilities suggest themselves, but the following observation makes one particularly natural: Use a modification of the Gibbs formalism to write a probability measure $\mu$ as
$$\mu(\eta) = K \exp\Big[\sum_{A \,:\, \eta = 0 \text{ on } A} J_A\Big] \tag{1.5}$$
for some potential $J_A$. It turns out that $\mu$ is a renewal measure iff $J_A = 0$ for all $A$’s other than intervals. So, to obtain an upper bound $\lambda_n$ for $\lambda_c$, one can consider measures given formally by (1.5), where $J_A = 0$ for all $A$’s other than intervals or sets of diameter $\le n$. The nonzero $J_A$’s can be specified by requiring that (1.4) hold with equality at $t = 0$ for all $A$’s which are intervals or sets of diameter $\le n$. One finds that $\lambda_1 = 2$ (this is the Holley–Liggett bound), and $\lambda_2 = 1.941227\ldots$. A computation suggests that $\lambda_3 = 1.89349\ldots$. The full details of the proof are worked out for $n = 2$ in a forthcoming paper.

The second extension we will describe involves proving survival for modifications of the contact process: in this case, for one with a non nearest neighbor interaction. This extension will be important in the next section. The process is the same as the basic one dimensional contact process, except that the infection rate at $x \in \mathbb{Z}$ is given by $\lambda$ if at least one of $\eta(x - 2)$, $\eta(x - 1)$, $\eta(x + 1)$, $\eta(x + 2)$ takes the value 1, and the rate is 0 otherwise. The survival of this process for $\lambda = 1$ is proved in a paper to appear by again using a renewal measure as initial distribution. The analogue of (1.3) with equality which must be solved in step 1 is now
$$F(2) = \frac{1}{\lambda + 1}, \quad F(3) = \frac{1}{(\lambda + 1)^2}, \quad F(4) + F(5) = \frac{1}{\lambda(\lambda + 1)^2}, \quad \text{and}$$
$$\sum_{k=1}^{n} F(k) F(n - k + 1) = 4\lambda F(n + 1) + 2\lambda F(n + 2) \quad \text{for } n \ge 4. \tag{1.6}$$
This does not seem too different from (1.3) (with equality), but now it cannot be solved explicitly, and this leads to significant difficulties. In fact, the proof that there is a solution of (1.6) (for λ = 1) which is decreasing and for which the corresponding renewal sequence has the required monotonicity and convexity properties is computer assisted. We proved analytically that there is a solution which has the required properties for n ≥ 1000, and then computed the first 1000 values of F and the renewal sequence accurately enough to check the required properties for those n. As will be seen in the next section, it is essential for our application of this result that we know survival for λ = 1; λ = 1.01 would not do. The technique (using an initial renewal measure) appears to work for λ > .985, and we were simply lucky that .985 < 1. This is one motivation for obtaining better upper bounds — next time we might not be so lucky.
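The explicit solution quoted in Step 1 for the basic contact process, $F(n+1) = \binom{2n}{n} / ((n+1)(2\lambda)^n)$, can be checked numerically against (1.3) with equality. The Python sketch below is an illustration only; the function names `F` and `convolution` are introduced here, and exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation. It does not address the range-two system (1.6), which the text says cannot be solved explicitly.

```python
from fractions import Fraction
from math import comb


def F(n, lam):
    """Tail probability F(n), n >= 1, from the closed form
    F(k + 1) = C(2k, k) / ((k + 1) * (2 * lam) ** k) with k = n - 1."""
    k = n - 1
    catalan = Fraction(comb(2 * k, k), k + 1)  # k-th Catalan number
    return catalan / (2 * Fraction(lam)) ** k


def convolution(n, lam):
    """Left side of (1.3): sum over k = 1..n of F(k) * F(n - k + 1)."""
    return sum(F(k, lam) * F(n - k + 1, lam) for k in range(1, n + 1))


if __name__ == "__main__":
    lam = 2
    for n in range(1, 8):
        # (1.3) with equality: convolution equals 2 * lam * F(n + 1)
        print(n, convolution(n, lam), 2 * lam * F(n + 1, lam))
```

The identity behind the check is the Catalan recursion $C_n = \sum_{j=0}^{n-1} C_j C_{n-1-j}$, which holds for every $\lambda$; the restriction $\lambda \ge 2$ in Step 1 is what makes $\sum_n F(n)$ finite.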
2. Coexistence in Threshold Voter Models

Recently, Cox and Durrett (1991) introduced a new class of particle systems called threshold voter models. Their behavior turns out to be quite different from those of the (linear) voter models treated in Chapter V of my book, and several papers have been written about them by subsets of Andjel, Cox, Durrett, Mountford, Liggett and Steif. The $d$-dimensional threshold voter model with parameter $N$ is the spin system $\eta_t$ on $\{0, 1\}^{\mathbb{Z}^d}$ in which a flip at $x \in \mathbb{Z}^d$ occurs at rate 1 if $\eta(y) \ne \eta(x)$ for some $y$ with $\|y - x\| \le N$, and at rate 0 otherwise ($\|\cdot\|$ can be any reasonable norm, such as the $l^p$ norm). We will say that the process coexists if it has a nontrivial invariant
measure (i.e., one which puts no mass on the $\equiv 0$ or $\equiv 1$ configurations). Otherwise, we will say that the process clusters. Our objective is to determine for each $(N, d)$ whether the process coexists or clusters. The well known answer to this problem for the linear voter model is that clustering occurs for all $N$ if $d = 1, 2$ and coexistence occurs for all $N$ if $d \ge 3$.

Cox and Durrett (1991) proved clustering for $N = d = 1$. It is particularly easy to see in this case that there is no nontrivial invariant measure which is translation invariant. The argument is based on the following computation, which is valid for any translation invariant $\mu$:
$$\frac{d}{dt} \mu S(t)(10)\Big|_{t=0} = -\mu(101) - \mu(010) \le 0.$$
Cox and Durrett also proved that coexistence occurs for each $d$ if $N$ is sufficiently large, and conjectured that coexistence occurs in all cases except $N = d = 1$. This conjecture was recently proved by Liggett. The argument is based on a comparison with the threshold contact process, in which $1 \to 0$ at rate one and $0 \to 1$ at rate $\lambda$ if $\eta(y) \ne \eta(x)$ for some $y$ with $\|y - x\| \le N$, and at rate 0 otherwise. Here is an outline:

Step 1. For every $(N, d)$, if the threshold contact process survives for $\lambda = 1$, then the threshold voter model coexists.

Step 2. For any $\lambda$, if the threshold contact process survives for $(N, d)$, then it survives for $(N', d')$ with $N' \ge N$ and $d' \ge d$.

Step 3. For any $\lambda$, if the threshold contact process survives for $(N, d) = (2, 1)$, then it survives for $(N, d) = (1, 2)$.

Step 4. The threshold contact process with $\lambda = 1$, $N = 2$, $d = 1$ survives.

The only difficult step is the last one, but that is the one discussed in the previous section.

Consider now a more general class of threshold voter models: A flip occurs at $x$ at rate 1 if $\eta(y) \ne \eta(x)$ for at least $T$ $y$’s with $\|y - x\| \le N$. The previously discussed model is the case $T = 1$. Many open problems remain if $T > 1$. Here are some results which have been proved ($n$ denotes the cardinality of the neighborhood $\{x : \|x\| \le N\}$):

1. If $d = 1$ and $T = N$ $(= \frac{1}{2}(n - 1))$, then the process clusters. Furthermore, if the initial distribution is translation invariant, then the limiting distribution as $t \to \infty$ exists (and is a mixture of the pointmasses on 0 and 1). (Andjel, Liggett and Mountford (1992))

2. If $T = \theta n$ with $\theta < \frac{1}{4}$ and $N$ is sufficiently large, then there is coexistence. (Durrett (1992))

3. If $T > \frac{1}{2}(n - 1)$, then the process fixates, in the sense that each site flips only finitely often. (Durrett and Steif (1993))

In case $T = \theta n$ with $\frac{1}{4} < \theta < \frac{1}{2}$, Durrett and Steif conjecture that clustering occurs if $N$ is sufficiently large. The behavior of the system when $N$ is small is open if $T \le \frac{1}{2}(n - 1)$ ($<$ if $d = 1$).
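The flip rule of the general threshold voter model is simple to state in code. The Python sketch below is an illustration under assumptions that are not in the text: it works in $d = 1$ on a finite ring of sites (periodic boundary, with more than $2N$ sites) rather than on all of $\mathbb{Z}$, and the function name `flip_rate` is hypothetical.

```python
def flip_rate(eta, x, N, T=1):
    """Flip rate at site x for the threshold voter model on a ring.

    eta is a list of 0/1 values, read with periodic boundary conditions
    (an assumption of this sketch). The rate is 1 if at least T sites y
    with 0 < |y - x| <= N satisfy eta[y] != eta[x], and 0 otherwise.
    """
    size = len(eta)
    disagreeing = sum(
        eta[(x + k) % size] != eta[x]
        for k in range(-N, N + 1)
        if k != 0
    )
    return 1 if disagreeing >= T else 0


if __name__ == "__main__":
    eta = [0, 0, 1, 0, 0]
    # With N = 1 the lone 1 at site 2 and its two neighbors can flip for T = 1;
    # for T = 2 only site 2 has enough disagreeing neighbors.
    print([flip_rate(eta, x, N=1, T=1) for x in range(len(eta))])
    print([flip_rate(eta, x, N=1, T=2) for x in range(len(eta))])
```

In $d = 1$ the neighborhood has $n = 2N + 1$ sites, so the condition $T > \frac{1}{2}(n - 1)$ of result 3 reads $T > N$ in terms of the arguments of this function.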
Bibliography

When I wrote my 1985 book, I tried to include in the list of references essentially all of the papers on interacting particle systems which had been written up to that time (which covered about 15 years). There were approximately 350 of them. Since then, I have maintained a list of papers and books on this subject, and I am taking this opportunity to share this list with the world at large. There are over 300 entries, representing the accomplishments in this field over the past eight years.

Books

Chen, M. F. (1992). From Markov Chains to Non-Equilibrium Particle Systems. World Scientific.
DeMasi, A. and Presutti, E. (1991). Mathematical Methods for Hydrodynamic Limits. Springer Lecture Notes in Mathematics 1501.
Durrett, R. (1988). Lecture Notes on Particle Systems and Percolation. Wadsworth.
Liggett, T. M. (1985). Interacting Particle Systems. Springer.
Spohn, H. (1991). Large Scale Dynamics of Interacting Particles. Springer Texts and Monographs in Physics.
Articles

Aizenman, M. and Holley, R. (1987). Rapid convergence to equilibrium of stochastic Ising models in the Dobrushin–Shlosman regime. Percolation Theory and Ergodic Theory of Infinite Particle Systems, vol. 8, IMA Series in Mathematics and its Applications, pp. 1–11.
Andjel, E. D. (1986). Convergence to a nonextremal equilibrium measure in the exclusion process. Probability Theory and Related Fields 73, 127–134.
Andjel, E. D. (1988). A correlation inequality for the symmetric exclusion process. Annals of Probability 16, 717–721.
Andjel, E. D. (1988). The contact process in high dimensions. Annals of Probability 16, 1174–1183.
Andjel, E. D. (1990). Ergodic and mixing properties of equilibrium measures for Markov processes. Transactions of the American Mathematical Society 318, 601–614.
Andjel, E. D. (1992). Survival of multidimensional contact process in random environments. Boletim da Sociedade Brasileira de Matemática 23, 109–119.
Andjel, E. D., Bramson, M. D., and Liggett, T. M. (1988). Shocks in the asymmetric exclusion process. Probability Theory and Related Fields 78, 231–247.
Andjel, E. D., Cocozza, C., and Roussignol, M. (1985). Quelques compléments sur le processus des misanthropes et le processus “zero range”. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 21, 363–382.
Andjel, E. D. and Kipnis, C. P. (1987). Pointwise ergodic theorems for the symmetric exclusion process. Probability Theory and Related Fields 75, 545–550.
Andjel, E. D., Liggett, T. M., and Mountford, T. (1992). Clustering in one dimensional threshold voter models. Stochastic Processes and their Applications 42, 73–90.
Andjel, E. D., Schinazi, R., and Schonmann, R. H. (1990). Edge processes of stochastic growth models. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 26, 489–506.
Andjel, E. D. and Vares, M. E. (1987). Hydrodynamic equations for attractive particle systems on Z. Journal of Statistical Physics 47, 265–288.
Andjel, E. D. and Vares, M. E. (1992). Ergodicity of an infinite dimensional renewal process. Stochastic Processes and their Applications 42, 215–236.
Baillon, J. B., Clement, P., Greven, A., and Hollander, F. den (1993). A variational approach to branching random walk in random environment. Annals of Probability 21, 290–317.
Belitsky, V. Two particle annihilating exclusion.
Benassi, A. and Fouque, J. P. (1987). Hydrodynamical limit for the asymmetric exclusion process. Annals of Probability 15, 546–560.
Benassi, A. and Fouque, J. P. (1988). Hydrodynamical limit for the asymmetric zero-range process. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 24, 189–200.
Benassi, A. and Fouque, J. P. (1991). Fluctuation field for the asymmetric simple exclusion process. Proceedings of an Oberwolfach conference, Birkhäuser, pp. 33–43.
Benassi, A., Fouque, J. P., Saada, E., and Vares, M. E. (1991). Asymmetric attractive particle systems on Z: hydrodynamical limit for monotone initial profiles. Journal of Statistical Physics 63, 719–735.
Bezuidenhout, C. and Gray, L. (1993). Critical attractive spin systems. Annals of Probability.
Bezuidenhout, C. and Grimmett, G. (1990). The critical contact process dies out. Annals of Probability 18, 1462–1482.
Bezuidenhout, C. and Grimmett, G. (1991). Exponential decay for subcritical contact and percolation processes. Annals of Probability 19, 984–1009.
Boldrighini, C., Cosini, G., Frigio, S., and Grasso Nunes, M. (1989). Computer simulation of shock waves in the completely asymmetric simple exclusion process. Journal of Statistical Physics 55, 611–623.
Boldrighini, C., DeMasi, A., Pellegrinotti, A., and Presutti, E. (1987). Collective phenomena in interacting particle systems. Stochastic Processes and their Applications 25, 137–152.
Boldrighini, C., DeMasi, A., and Pellegrinotti, A. (1992). Non equilibrium fluctuations in particle systems modelling diffusion–reaction equations. Stochastic Processes and their Applications 42, 1–30.
Bramson, M. (1988). Front propagation in certain one dimensional exclusion models. Journal of Statistical Physics 51, 863–870.
Bramson, M. (1989). Survival of nearest particle systems with low birth rate. Annals of Probability 17, 433–443.
Bramson, M., Calderoni, P., DeMasi, A., Ferrari, P., Lebowitz, J., and Schonmann, R. H. (1986). Microscopic selection principle for a diffusion–reaction equation. Journal of Statistical Physics 45, 905–920.
Bramson, M., Cox, J. T., and Griffeath, D. (1986). Consolidation rates for two interacting systems in the plane. Probability Theory and Related Fields 73, 613–625.
Bramson, M., Cox, J. T., and Griffeath, D. (1988). Occupation time large deviations of the voter model. Probability Theory and Related Fields 77, 401–413.
Bramson, M., Ding, W. D., and Durrett, R. (1991). Annihilating branching processes. Stochastic Processes and their Applications 37, 1–17.
Bramson, M. and Durrett, R. (1988). A simple proof of the stability criterion of Gray and Griffeath. Probability Theory and Related Fields 80, 293–298.
Bramson, M., Durrett, R., and Swindle, G. (1989). Statistical mechanics of crabgrass. Annals of Probability 17, 444–481.
Bramson, M., Durrett, R., and Schonmann, R. H. (1991). The contact process in a random environment. Annals of Probability 19, 960–983.
Bramson, M. and Gray, L. (1991). A useful renormalization argument. Random Walks, Brownian Motion and Interacting Particle Systems, A Festschrift in honor of Frank Spitzer, Birkhäuser, pp. 113–152.
Bramson, M. and Griffeath, D. (1987). Survival of cyclical particle systems. Percolation Theory and Ergodic Theory of Infinite Particle Systems, vol. 8, IMA Series in Mathematics and its Applications, pp. 21–29.
Bramson, M. and Griffeath, D. (1989). Flux and fixation in cyclic particle systems. Annals of Probability 17, 26–45.
Bramson, M. and Lebowitz, J. L. (1990). Asymptotic behavior of densities in diffusion dominated two-particle reactions. Physica A 168, 88–94.
Bramson, M. and Lebowitz, J. L. (1991). Asymptotic behavior of densities for two–particle annihilating random walks. Journal of Statistical Physics 62, 297–372.
Bramson, M. and Lebowitz, J. L. (1991). Spatial structure in diffusion limited two particle reactions. Journal of Statistical Physics 65, 941–951.
Bramson, M. and Neuhauser, C. (1992). A catalytic surface reaction model. Jour. Comp. Appl. Math. 40, 157–161.
Buttel, L., Cox, J. T., and Durrett, R. (1993). Estimating the critical values of stochastic growth models. Journal of Applied Probability 30, 455–461.
Cai, H. and Luo, X. (1992). Coexistence in a competition model. Statistics and Probability Letters 15, 241–243.
Calderoni, P., Pellegrinotti, A., Presutti, E., and Vares, M. E. (1989). Transient bimodality in interacting particle systems. Journal of Statistical Physics 55, 523–577.
Cammarota, C. and Ferrari, P. A. (1991). Invariance principle for the edge of the branching exclusion process. Stochastic Processes and their Applications 38, 1–11.
Carlson, J. M., Grannan, E. R., and Swindle, G. H. (1993). A limit theorem for tagged particles in a class of self-organizing particle systems. Stochastic Processes and their Applications 47, 1–16.
Carlson, J. M., Grannan, E. R., Swindle, G. H., and Tour, J. (1993). Singular diffusion limits of reversible particle systems. Annals of Probability 21, 1372–1393.
Cassandro, M., Galves, A., Olivieri, E., and Vares, M. E. (1984). Metastable behavior of stochastic dynamics: a pathwise approach. Journal of Statistical Physics 35, 603–628.
Chen, D. (1988). On the survival probability of generalized nearest particle systems. Stochastic Processes and their Applications 30, 209–223.
Chen, D. Finite nearest particle systems on a tree. Acta Mathematica Sinica.
Chen, D., Feng, J., and Qian, M. The metastable behavior of the two dimensional Ising model.
Chen, D., Feng, J., and Qian, M. The metastable behavior of the three dimensional Ising model.
Chen, D. and Liggett, T. M. (1992). Finite reversible nearest–particle systems in inhomogeneous and random environments. Annals of Probability 20, 152–173.
Chen, H. N. (1992). On the stability of a population growth model with sexual reproduction on Z^2. Annals of Probability 20, 232–285.
Chen, J. W., Durrett, R., and Liu, X. F. (1990). Exponential convergence for one dimensional contact processes. Acta Mathematica Sinica 6, 349–353.
Chen, M. F. (1985). Infinite dimensional reaction diffusion processes. Acta Mathematica Sinica 1, 261–273.
Chen, M. F. (1987). Existence theorems for interacting particle systems with noncompact state spaces. Sci. Sinica Ser. A 30, 148–156.
Chen, M. F. (1989). Stationary distributions of infinite particle systems with noncompact state space. Acta Math. Sci. 9, 9–19.
Chen, M. F. (1990). Ergodic theorems for reaction diffusion processes. Journal of Statistical Physics 58, 939–966.
Chen, M. F. (1991). Uniqueness of reaction diffusion processes. Chinese Scientific Bulletin 36, 969–973.
Comets, F. and Eisele, T. (1988). Asymptotic dynamics, non-critical and critical fluctuations for a geometric long-range interacting model. Communications in Mathematical Physics 118, 531–567.
Cox, J. T. (1988). Some limit theorems for voter model occupation times. Annals of Probability 16, 1559–1569.
Cox, J. T. (1989). Coalescing random walks and voter model consensus times on the torus in Z^d. Annals of Probability 17, 1333–1366.
Cox, J. T. On the ergodic theory of critical branching Markov chains. Stochastic Processes and their Applications.
Cox, J. T. and Durrett, R. (1988). Limit theorems for the spread of epidemics and forest fires. Stochastic Processes and their Applications 30, 171–191.
Cox, J. T. and Durrett, R. (1990). Large deviations for independent random walks. Probability Theory and Related Fields 84, 67–82.
Cox, J. T. and Durrett, R. (1991). Nonlinear voter models. Random Walks, Brownian Motion and Interacting Particle Systems, A Festschrift in honor of Frank Spitzer, Birkhäuser, pp. 189–201.
Cox, J. T., Durrett, R., and Schinazi, R. (1991). The critical contact process seen from the right edge. Probability Theory and Related Fields 87, 325–332.
Cox, J. T. and Greven, A. (1990). On the long term behavior of some finite particle systems. Probability Theory and Related Fields 85, 195–237.
Cox, J. T. and Greven, A. (1991). On the long time behavior of finite particle systems: A critical dimensional example. Random Walks, Brownian Motion and Interacting Particle Systems, A Festschrift in honor of Frank Spitzer, Birkhäuser, pp. 203–213.
Cox, J. T. and Greven, A. Ergodic theorems for infinite systems of locally interacting diffusions.
Cox, J. T. and Griffeath, D. (1985). Large deviations for some infinite particle system occupation times. Particle Systems, Random Media, and Large Deviations, vol. 41, AMS Contemporary Mathematics, pp. 43–54.
Cox, J. T. and Griffeath, D. (1986). Critical clustering in the two dimensional voter model. Stochastic Spatial Processes, vol. 1212, Springer Lecture Notes in Mathematics, pp. 59–68.
Cox, J. T. and Griffeath, D. (1986). Diffusive clustering in the two dimensional voter model. Annals of Probability 14, 347–370.
Cox, J. T. and Griffeath, D. (1990). Mean field asymptotics for the planar stepping stone model. Proceedings of the London Mathematical Society 61, 189–208.
Dai, Y. L. and Liu, X. J. (1986). Quasi–nearest particle systems. Acta Mathematica Sinica 2, 92–104.
Darling, R. W. R. and Mukherjea, A. (1991). Discrete time voter models. A class of stochastic automata. Probability Measures on Groups X, Plenum, pp. 83–94.
Dawson, D. and Greven, A. (1993). Multiple time scale analysis of interacting diffusions. Probability Theory and Related Fields 95, 467–508.
DeMasi, A. and Ferrari, P. A. (1985). Self-diffusion in one-dimensional lattice gases in the presence of an external field. Journal of Statistical Physics 38, 603–613.
DeMasi, A., Ferrari, P. A., Goldstein, S., and Wick, W. D. (1989). An invariance principle for reversible Markov processes. Applications to random motions in random environments. Journal of Statistical Physics 55, 787–855.
DeMasi, A., Ferrari, P. A., and Lebowitz, J. L. (1986). Reaction–diffusion equations for interacting particle systems. Journal of Statistical Physics 44, 589–644.
DeMasi, A., Ferrari, P. A., and Vares, M. E. (1989). A microscopic model of interface related to the Burger equation. Journal of Statistical Physics 55, 601–609.
DeMasi, A., Kipnis, C., Presutti, E., and Saada, E. (1989). Microscopic structure at the shock in the asymmetric simple exclusion. Stochastics and Stochastics Reports 27, 151–165.
DeMasi, A., Pellegrinotti, A., Presutti, E., and Vares, M. E. Spatial patterns when phases separate in an interacting particle system.
DeMasi, A., Presutti, E., and Scacciatelli, E. (1989). The weakly asymmetric simple exclusion process. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 25, 1–38.
DeMasi, A., Presutti, E., Spohn, H., and Wick, D. (1986). Asymptotic equivalence of fluctuation fields for reversible exclusion processes with speed change. Annals of Probability 14, 409–423.
DeMasi, A., Presutti, E., and Vares, M. E. (1986). Escape from the unstable equilibrium in a random process with infinitely many interacting particles. Journal of Statistical Physics 44, 645–696.
De Oliveira, M. J. (1992). Isotropic majority vote model on a square lattice. Journal of Statistical Physics 66, 273–281.
Derrida, B., Domany, E., and Mukamel, D. (1992). An exact solution of a one-dimensional asymmetric exclusion model with open boundaries. Journal of Statistical Physics 69, 667–687.
Derrida, B., Evans, M. R., Hakim, V., and Pasquier, V. (1993). A matrix method of solving an asymmetric exclusion model with open boundaries. Cellular Automata and Cooperative Systems, Kluwer, Dordrecht, pp. 121–134.
Derrida, B., Evans, M. R., Hakim, V., and Pasquier, V. (1993). Exact solution of a 1D asymmetric exclusion model using a matrix formulation. Journal of Physics A: Mathematical and General 26, 1493–1517.
Derrida, B., Evans, M. R., and Mukamel, D. Exact diffusion constant for one-dimensional asymmetric exclusion models.
Derrida, B., Janowsky, S. A., Lebowitz, J. L., and Speer, E. R. (1992). Exact solution of the totally asymmetric simple exclusion process: shock profiles. Journal of Statistical Physics 69, 667–687.
Deuschel, J. D. Algebraic L^2 decay of attractive critical processes on the lattice. Annals of Probability.
Deuschel, J. D. and Stroock, D. W. (1990). Hypercontractivity and spectral gap of symmetric diffusions with applications to the stochastic Ising models. Journal of Functional Analysis 92, 30–48.
Dickman, R. (1989). Universality and diffusion in nonequilibrium critical phenomena. The Physical Review B 40, 7005–7010.
Dickman, R. (1989). Nonequilibrium lattice models: series analysis of steady states. Journal of Statistical Physics 55, 997–1026.
Dickman, R. (1990). Nonequilibrium critical behavior of the triplet annihilation model. The Physical Review A 42, 6985–6990.
Dickman, R. and Burschka, M. A. (1988). Nonequilibrium critical poisoning in a single species model. Physics Letters A 127, 132–137.
Dickman, R. and Jensen, I. (1991). Time dependent perturbation theory for nonequilibrium lattice models. Physical Review Letters 67, 2391–2394.
Dickman, R. and Jensen, I. (1993). Time dependent perturbation theory for nonequilibrium lattice models. Journal of Statistical Physics 71, 89–127.
Dickman, R. and Jensen, I. (1993). Time dependent perturbation theory for diffusive nonequilibrium lattice models. Journal of Physics A: Mathematical and General 26, L151–L157.
Dickman, R. and Tomé, T. (1991). First order phase transition in a one-dimensional nonequilibrium model. The Physical Review A 44, 4833–4838.
Ding, W., Durrett, R., and Liggett, T. M. (1990). Ergodicity of reversible reaction diffusion processes. Probability Theory and Related Fields 85, 13–26.
Ding, W. and Zheng, X. (1987). Existence theorems for linear growth processes with diffusion. Acta Mathematica Sinica 7, 25–42.
Ding, W. and Zheng, X. (1989). Ergodic theorems for linear growth processes with diffusion. Chinese Annals of Mathematics Series B 10, 386–402.
Dittrich, P. (1990). Travelling waves and long-time behavior of the weakly asymmetric process. Probability Theory and Related Fields 86, 443–455.
Dittrich, P. and Gärtner, J. (1991). A central limit theorem for the weakly asymmetric simple exclusion process. Mathematische Nachrichten 151, 75–93.
Dong, H. Existence of infinite dimensional reaction diffusion process with multispecies.
Durrett, R. (1985). Stochastic growth models: Ten problems for the 80’s (and 90’s). Particle Systems, Random Media, and Large Deviations, vol. 41, AMS Contemporary Mathematics, pp. 87–99.
Durrett, R. (1986). Some peculiar properties of a particle system with sexual reproduction. Stochastic Spatial Processes, vol. 1212, Springer Lecture Notes in Mathematics, pp. 106–111.
Durrett, R. (1988). Crabgrass, measles and gypsy moths: an introduction to modern probability. Bulletin of the American Mathematical Society 18, 117–143.
Durrett, R. (1988). Crabgrass, measles and gypsy moths: an introduction to interacting particle systems. Mathematical Intelligencer 10, 37–47.
Durrett, R. (1991). A new method for proving the existence of phase transitions. Spatial Stochastic Processes. A Festschrift in honor of the Seventieth Birthday of Ted Harris, Birkhäuser, pp. 141–169.
Durrett, R. (1991). The contact process, 1974–1989. Proceedings of the 1989 AMS Seminar on Random Media, vol. 27, AMS Lectures in Applied Mathematics, pp. 1–18.
Durrett, R. Stochastic models of growth and competition. Patch Dynamics, Springer.
Durrett, R. (1992). Multicolor particle systems with large threshold and range. Journal of Theoretical Probability 5, 127–152.
Durrett, R. (1992). Stochastic growth models: bounds on critical values. Journal of Applied Probability 29, 11–20.
Durrett, R. (1992). Some new games for your computer. Nonlinear Science Today 1, 1–6.
Durrett, R. Ten Lectures on Particle Systems. Proceedings of the 1993 St. Flour Summer School.
Durrett, R. Spatial epidemic models.
Durrett, R. and Gray, L. Some peculiar properties of a particle model with sexual reproduction.
Durrett, R. and Liu, X. (1988). The contact process on a finite set. Annals of Probability 16, 1158–1173.
Durrett, R. and Møller, A. M. (1991). Complete convergence theorem for a competition model. Probability Theory and Related Fields 88, 121–136.
Durrett, R. and Neuhauser, C. (1991). Epidemics with recovery in D = 2. Annals of Applied Probability 1, 189–206.
Durrett, R. and Neuhauser, C. Particle systems and reaction–diffusion equations. Annals of Probability.
Durrett, R. and Schonmann, R. (1987). Stochastic growth models. Percolation Theory and Ergodic Theory of Infinite Particle Systems, vol. 8, IMA Series in Mathematics and its Applications, pp. 85–119.
Durrett, R. and Schonmann, R. (1988). The contact process on a finite set II. Annals of Probability 16, 1570–1583.
Durrett, R. and Schonmann, R. (1988). Large deviations for the contact process and two dimensional percolation. Probability Theory and Related Fields 77, 583–603.
Durrett, R., Schonmann, R. and Tanaka, N. (1989). The contact process on a finite set III. The critical case. Annals of Probability 17, 1303–1321.
Durrett, R. and Steif, J. E. (1993). Fixation results for threshold voter systems. Annals of Probability 21, 232–247.
Durrett, R. and Swindle, G. (1991). Are there bushes in a forest? Stochastic Processes and their Applications 37, 19–31.
Durrett, R. and Swindle, G. Coexistence results for catalysts. Probability Theory and Related Fields.
Ferrari, P. A. (1986). The simple exclusion process as seen from a tagged particle. Annals of Probability 14, 1277–1290.
Ferrari, P. A. (1988). Invariance principle for a solid–on–solid interface model. Journal of Statistical Physics 51, 1077–1090.
Ferrari, P. A. (1990). Ergodicity for spin systems with stirrings. Annals of Probability 18, 1523–1538.
Ferrari, P. A. (1992). Shock fluctuations in asymmetric simple exclusion. Probability Theory and Related Fields 91, 81–101.
Ferrari, P. A. and Fontes, L. R. G. Shock fluctuations in asymmetric simple exclusion process.
Ferrari, P. A. and Fontes, L. R. G. (1993). Current fluctuations in asymmetric simple exclusion process. Annals of Probability.
Ferrari, P. A. and Galves, A. Density fluctuations for a finite system of independent random walks.
Ferrari, P. A., Galves, A., and Landim, C. Exponential waiting time for a big gap in a one dimensional zero range process.
Ferrari, P. A. and Goldstein, S. (1988). Microscopic stationary states for stochastic systems with particle flux. Probability Theory and Related Fields 78, 455–471.
Ferrari, P. A., Kipnis, C., and Saada, E. (1991). Microscopic structure of travelling waves in the asymmetric simple exclusion process. Annals of Probability 19, 226–244.
Ferrari, P. A., Lebowitz, J. L., and Maes, C. (1988). On the positivity of correlations in nonequilibrium spin systems. Journal of Statistical Physics 53, 295–305.
Ferrari, P. A., Presutti, E., Scacciatelli, E., and Vares, M. E. (1991). The symmetric simple exclusion process I: Probability estimates. Stochastic Processes and their Applications 39, 89–105.
Ferrari, P. A., Presutti, E., Scacciatelli, E., and Vares, M. E. (1991). The symmetric simple exclusion process II: Applications. Stochastic Processes and their Applications 39, 107–115.
Ferrari, P. A., Presutti, E., and Vares, M. E. (1987). Local equilibrium for a one dimensional zero range process. Stochastic Processes and their Applications 26, 31–45.
Ferrari, P. A., Presutti, E., and Vares, M. E. (1988). Nonequilibrium fluctuations for a zero range process. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 24, 237–268.
Ferrari, P. A. and Ravishankar, K. (1992). Shocks in asymmetric exclusion automata. Annals of Applied Probability 2, 928–941.
Ferreira, I. (1990). The probability of survival for the biased voter model in a random environment. Stochastic Processes and their Applications 34, 25–38.
Fleischman, K. and Greven, A. (1992). Localization and selection in a mean field branching random walk in a random environment. Annals of Probability 20, 2141–2163.
Fouque, J. P. (1991). Hydrodynamical behavior of asymmetric attractive particle systems. One example: One-dimensional nearest-neighbors asymmetric simple exclusion process. Proceedings of the 1989 AMS Seminar on Random Media, vol. 27, AMS Lectures in Applied Mathematics, pp. 97–107.
Fouque, J. P. A probabilistic approach to some nonlinear hyperbolic partial differential equations.
Fouque, J. P. and Saada, E. Totally asymmetric attractive particle systems on Z: hydrodynamical limit for general initial profiles.
Funaki, T., Handa, K., and Uchiyama, K. (1991). Hydrodynamic limit of one dimensional exclusion processes with speed change. Annals of Probability 19, 245–265.
Gacs, P. (1986). Reliable computation with cellular automata. J. Comp. Sys. Sci. 32, 15–78.
Galves, A., Martinelli, F., and Olivieri, E. (1989). Large density fluctuations for the one dimensional supercritical contact process. Journal of Statistical Physics 55, 639–648.
Galves, A. and Presutti, E. (1987). Edge fluctuations for the one dimensional supercritical contact process. Annals of Probability 15, 1131–1145.
Galves, A. and Presutti, E. (1987). Travelling wave structure of the one dimensional contact process. Stochastic Processes and their Applications 25, 153–163.
Galves, A. and Schinazi, R. (1989). Approximations finis de la mesure invariante du processus de contact sur-critique vu par la première particule. Probability Theory and Related Fields 83, 435–445.
Gärtner, J. (1988). Convergence towards Burger’s equation and propagation of chaos for weakly asymmetric exclusion processes. Stochastic Processes and their Applications 27, 233–260.
Gärtner, J. and Presutti, E. (1990). Shock fluctuations in a particle system. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) B 53, 1–14.
Grannan, E. and Swindle, G. (1990). A particle system with massive destruction. Journal of Physics A: Mathematical and General 23, L73–L78.
Grannan, E. and Swindle, G. (1990). Rigorous results on mathematical models of catalytic surfaces. Journal of Statistical Physics 61, 1085–1103.
Gray, L. (1985). The critical behavior of a class of simple interacting systems – a few answers and a lot of questions. Particle Systems, Random Media, and Large Deviations, vol. 41, AMS Contemporary Mathematics, pp. 149–160.
Gray, L. (1987). The behavior of processes with statistical mechanical properties. Percolation Theory and Ergodic Theory of Infinite Particle Systems, vol. 8, IMA Series in Mathematics and its Applications, pp. 131–167.
Gray, L. (1991). Is the contact process dead? Proceedings of the 1989 AMS Seminar on Random Media, vol. 27, AMS Lectures in Applied Mathematics, pp. 19–29.
Greven, A. (1985). The coupled branching process in random environment. Annals of Probability 13, 1133–1147.
Greven, A. (1985). Phase transition for a class of Markov processes on (N)^S. Particle Systems, Random Media, and Large Deviations, vol. 41, AMS Contemporary Mathematics, pp. 161–174.
Greven, A. (1986). On a class of infinite particle systems evolving in a random environment. Stochastic Spatial Processes, vol. 1212, Springer Lecture Notes in Mathematics, pp. 145–164.
Greven, A. (1990). Symmetric exclusion on random sets and a related problem for random walks in random environment. Probability Theory and Related Fields 85, 307–364.
Greven, A. (1991). A phase transition for the coupled branching process. Part I: The ergodic theory in the range of finite second moments. Probability Theory and Related Fields 87, 416–458.
Greven, A. and Hollander, F. den (1992). Branching random walk in random environment: phase transition for local and global growth rates. Probability Theory and Related Fields 91, 195–249.
Griffeath, D. (1993). Frank Spitzer’s pioneering work on interacting particle systems. Annals of Probability 21, 608–621.
Grillenberger, C. and Ziezold, H. (1988). On the critical infection rate of the one dimensional basic contact process: numerical results. Journal of Applied Probability 25, 1–8.
Holley, R. (1985). Possible rates of convergence in finite range, attractive spin systems. Particle Systems, Random Media, and Large Deviations, vol. 41, AMS Contemporary Mathematics, pp. 215–234.
Holley, R. (1987). One dimensional stochastic Ising models. Percolation Theory and Ergodic Theory of Infinite Particle Systems, vol. 8, IMA Series in Mathematics and its Applications, pp. 187–202.
Holley, R. (1991). On the asymptotics of the spin–spin autocorrelation function in stochastic Ising models near the critical temperature. Spatial Stochastic Processes. A Festschrift in honor of the Seventieth Birthday of Ted Harris, Birkhäuser, pp. 89–104.
Holley, R. and Stroock, D. W. (1987). Logarithmic Sobolev inequalities and stochastic Ising models. Journal of Statistical Physics 46, 1159–1194.
Holley, R. and Stroock, D. W. (1989). Uniform and L^2 convergence in one dimensional stochastic Ising models. Communications in Mathematical Physics 123, 85–93.
Huang, L. P. (1987). Existence theorem for stationary distributions of a class of infinite particle systems. Chinese J. Appl. Probab. and Stat. 3, 152–158.
Ignatyuk, I. A. and Malyshev, V. A. (1989). Processes with local interactions and communication networks. Problems of Information Transmission 25, 65–77.
Ignatyuk, I. A., Malyshev, V. A., and Molchanov, S. A. (1989). Moment closed processes with local interaction. Selecta Mathematica Sovietica 8, 351–384.
Janowsky, S. A. and Lebowitz, J. L. Finite size effects and shock fluctuations in the asymmetric simple exclusion process.
Jitomirskaya, S. and Klein, A. Ising model in a quasi-periodic transverse field, percolation and contact processes in quasi-periodic environments. Journal of Statistical Physics.
Katori, M. and Konno, N. Coherent anomalies of the systematic series of approximations in the contact process.
Katori, M. and Konno, N. (1990). Applications of the CAM based on a new decoupling procedure of correlation functions in the one dimensional contact process. Journal of the Physical Society of Japan 59, 1581–1592.
Katori, M. and Konno, N. (1990). Correlation inequalities and lower bounds for the critical value λ_c of contact processes. Journal of the Physical Society of Japan 59, 877–887.
Katori, M. and Konno, N. (1991). Applications of the Harris–FKG inequality to upper bounds for order parameters in the contact process. Journal of the Physical Society of Japan 60, 430–434.
Katori, M. and Konno, N. (1991). Three point Markov extension and an improved upper bound for survival probability of the one dimensional contact process. Journal of the Physical Society of Japan 60, 418–429.
Katori, M. and Konno, N. (1991). An upper bound for survival probability of infected region in the contact process. Journal of the Physical Society of Japan 60, 95–99.
Katori, M. and Konno, N. (1991). Upper bounds for the survival probability of the contact process. Journal of Statistical Physics 63, 115–130.
Katori, M. and Konno, N. (1991). Analysis of the order parameter for uniform nearest particle system. Journal of Statistical Physics 65, 247–254.
Katori, M. and Konno, N. (1992). Upper bounds for order parameters of a class of attractive nearest particle systems with finite range. Journal of the Physical Society of Japan 61, 806–811.
Katori, M. and Konno, N. Bounds on the critical values of the θ-contact processes with 1 ≤ θ ≤ 2.
Katz, S., Lebowitz, J. L., and Spohn, H. (1984). Nonequilibrium steady states of stochastic lattice gas models of fast ionic conductors. Journal of Statistical Physics 34, 497–537.
Kel’bert, M. Ya., Kontsevich, M. L., and Rybko, A. N. (1988). On Jackson networks on denumerable graphs. Theory of Probability and its Applications 33, 358–361.
Kipnis, C. (1985). Recent results on the movement of a tagged particle in simple exclusion. Particle Systems, Random Media, and Large Deviations, vol. 41, AMS Contemporary Mathematics, pp. 259–265.
Kipnis, C. (1987). Fluctuations des temps d’occupation d’un site dans l’exclusion simple symetrique. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 23, 21–35.
Kipnis, C., Olla, S., and Varadhan, S. R. S. (1989). Hydrodynamics and large deviations for simple exclusion processes. Communications in Pure and Applied Mathematics 42, 115–137.
Klein, A. Extinction of contact and percolation processes in a random environment. Annals of Probability.
Kotecky, R. and Olivieri, E. (1993). Droplet dynamics for asymmetric Ising model. Journal of Statistical Physics 70, 1121–1148.
Kuczek, T. (1989). The central limit theorem for the right edge of supercritical oriented percolation. Annals of Probability 17, 1322–1332.
Landim, C. (1991). Hydrodynamical equations for attractive particle systems on Z^d. Annals of Probability 19, 1537–1558.
Landim, C. (1991). Hydrodynamical limit for asymmetric attractive particle systems on Z^d. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 27, 559–581.
Landim, C. (1992). Occupation time large deviations for the symmetric simple exclusion process. Annals of Probability 20, 206–231.
Landim, C. Conservation of local equilibrium for attractive particle systems on Z^d. Annals of Probability.
Lebowitz, J. L., Maes, C., and Speer, E. R. (1990). Statistical mechanics of probabilistic cellular automata. Journal of Statistical Physics 59, 117–170.
Lebowitz, J. L., Orlandi, E., and Presutti, E. (1991). A particle model for spinodal decomposition. Journal of Statistical Physics 63, 933–974.
Lebowitz, J. L., Presutti, E., and Spohn, H. (1988). Microscopic models of hydrodynamic behavior. Journal of Statistical Physics 51, 841–862.
Lebowitz, J. L. and Schonmann, R. H. (1988). On the asymptotics of occurrence times of rare events for stochastic spin systems. Journal of Statistical Physics 48, 727–751.
Lebowitz, J. L. and Schonmann, R. H. (1988). Pseudo-free energies and large deviations for non Gibbsian FKG measures. Probability Theory and Related Fields 77, 49–64.
Lee, T. Y. (1988). Large deviations for noninteracting infinite particle systems. Probability Theory and Related Fields 77, 49–64.
Lee, T. Y. (1989). Large deviations for systems of noninteracting recurrent particles. Annals of Probability 17, 46–57.
Liggett, T. M. (1986). Nearest particle systems: Results and open problems. Stochastic Spatial Processes, vol. 1212, Springer Lecture Notes in Mathematics, pp. 200–215.
Liggett, T. M. (1987). Reversible growth models on Z^d: Some examples. Percolation Theory and Ergodic Theory of Infinite Particle Systems, vol. 8, IMA Series in Mathematics and its Applications, pp. 213–227.
Liggett, T. M. (1987). Applications of the Dirichlet principle to finite reversible nearest particle systems. Probability Theory and Related Fields 74, 505–528.
Liggett, T. M. (1987). Reversible growth models on symmetric sets. Proceedings of the 1985 Taniguchi Symposium, pp. 275–301.
Liggett, T. M. (1987). Spatial stochastic growth models. Survival and critical behavior. Proceedings of the 1986 ICM, pp. 1032–1041.
Liggett, T. M. (1989). Exponential L^2 convergence of attractive reversible nearest particle systems. Annals of Probability 17, 403–432.
Liggett, T. M. (1991). Spatially inhomogeneous contact processes. Spatial Stochastic Processes. A Festschrift in honor of the Seventieth Birthday of Ted Harris, Birkhäuser, pp. 105–140.
Liggett, T. M. (1991). L^2 rates of convergence of attractive reversible nearest particle systems: the critical case. Annals of Probability 19, 935–959.
Liggett, T. M. (1991). The periodic threshold contact process. Random Walks, Brownian Motion and Interacting Particle Systems, A Festschrift in honor of Frank Spitzer, Birkhäuser, pp. 339–358.
Liggett, T. M. (1991). Limiting behavior of a one-dimensional system with long range interactions. Proceedings of the 1989 AMS Seminar on Random Media, vol. 27, AMS Lectures in Applied Mathematics, pp. 31–40.
Liggett, T. M. (1992). Remarks on the sufficient condition for survival of spatially inhomogeneous contact processes. Probability and Statistics, Proceedings of the Special Program at Nankai Institute of Mathematics, World Scientific, pp. 163–173.
Liggett, T. M. (1992). The survival of one dimensional contact processes in random environments. Annals of Probability 20, 696–723.
SURVIVAL AND COEXISTENCE IN INTERACTING PARTICLE SYSTEMS
223
Liggett, T. M. (1993). The coupling technique in interacting particle systems. Proceedings of the Doeblin conference, AMS Contemporary Mathematics, pp. 73–83. Liggett, T. M. Coexistence in threshold voter models. Annals of Probability. Liggett, T. M. (1993). Clustering and coexistence in threshold voter models. Cellular Automata and Cooperative Systems, Kluwer, Dordrecht, pp. 403–410. Liggett, T. M. Improved upper bounds for the contact process critical value. Liggett, T. M. and Port, S. C. (1988). Systems of independent Markov chains. Stochastic Processes and their Applications 28, 1–22. Liu, X. (1986). A class of birth and death systems on Z. Acta Mathematica Sinica 6, 379–385. Liu, X. (1991). Infinite reversible nearest particle systems in inhomogeneous and random environments. Stochastic Processes and their Applications 38, 295–322. Liu, X. Inhomogeneous approximation of the critical nearest particle system. Liu, X. Symmetric two-particle exclusion-eating process. Lu, S. and Yau, H. T. (1993). Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics. Communications in Mathematical Physics 156, 399–433. Madras, N., Schinazi, R., and Schonmann, R. On the critical behavior of the contact process in deterministic inhomogeneous environment. Annals of Probability. Maes, C. (1990). Kinetic limit of a conservative lattice gas dynamics showing long range correlations. Journal of Statistical Physics 61, 667–681. Maes, C. (1991). Long range spatial correlations for anisotropic zero range processes. Journal of Physics A: Mathematical and General 24, 4359–4373. Maes, C. A note on using the basic coupling in interacting particle systems. Maes, C. and Redig, F. (1991). Anisotropic perturbations of the simple symmetric exclusion process: long range correlations. Journal of Physics I 1, 669–684. Maes, C. and Shlosman, S. (1991). Ergodicity of probabilistic cellular automata: a constructive criterion. Communications in Mathematical Physics 135, 233–251. 
Maes, C. and Shlosman, S. (1993). When is an interacting particle system ergodic?. Communications in Mathematical Physics 151, 447–466. Maes, C. and Shlosman, S. (1993). Constructive criteria for the ergodicity of interacting particle systems. Cellular Automata and Cooperative Systems, Kluwer, Dordrecht, pp. 451–461. Maes, C. and Velde, K. V. The interaction potential of the stationary measure of a high noise spin flip process. Malyshev, V. A., Petrova, E. N., and Scacciatelli, E. (1992). Marginally closed processes with local interaction. Stochastic Processes and their Applications 43, 47–63. Marchand, J. P. and Martin, P. A. (1986). Exclusion process and droplet shape. Journal of Statistical Physics 44, 491–504. Marchand, J. P. and Martin, P. A. (1988). Errata: Exclusion process and droplet shape. Journal of Statistical Physics 50, 469–471. Martinelli, F. and Olivieri, E. Approach to equilibrium of Glauber dynamics in the one phase region I: The attractive case. Communications in Mathematical Physics. Martinelli, F. and Olivieri, E. Approach to equilibrium of Glauber dynamics in the one phase region II: The general case. Martinelli, F., Olivieri, E., and Scoppola, E. (1990). Metastability and exponential approach to equilibrium for low temperature stochastic Isng models. Journal of Statistical Physics 61, 1105–1119. Martinelli, F., Olivieri, E., and Scoppola, E. (1991). On the Swendsen and Wang dynamics I: Exponential convergence to equilibrium. Journal of Statistical Physics 62, 117–133. Martinelli, F., Olivieri, E., and Scoppola, E. (1991). On the Swendsen and Wang dynamics II: Critical droplets and homogeneous nucleation at low temperatures for the two dimensional Ising model. Journal of Statistical Physics 62, 135–159. Mountford, T. S. (1992). The critical value for the uniform nearest particle process. Annals of Probability 20, 2031–2042. Mountford, T. S. (1992). The critical value for some long range nearest particle systems. 
Probability Theory and Related Fields 93, 67–76.
224
T. M. LIGGETT
Mountford, T. S. (1992). The ergodicity of a class of reaction diffusion processes. Probability Theory and Related Fields 92, 259–274. Mountford, T. S. (1992). Generalized voter models. Journal of Statistical Physics 67, 303–311. Mountford, T. S. A complete convergence theorem for attractive reversible nearest particle systems. Mountford, T. S. (1993). A coupling of finite particle systems. Journal of Applied Probability 30, 258–262. Mountford, T. S. Exponential convergence for attractive reversible subcritical nearest particle systems. Mountford, T. S. and Sudbury, A. (1992). An extension of a result of Grannan and Swindle on the poisoning of catalytic surfaces. Journal of Statistical Physics 67, 1219–1222. Neuhauser, C. (1990). An ergodic theorem for Schl¨ ogl models with small migration. Probability Theory and Related Fields 85, 27–32. Neuhauser, C. (1990). One dimensional stochastic Ising models with small migration. Annals of Probability 18, 1539–1546. Neuhauser, C. (1992). Ergodic theorems for the multitype contact process. Probability Theory and Related Fields 91, 467–506. Neuhauser, C. The long range sexual reproduction process. Stochastic Processes and their Applications. Neuhauser, C. and Sudbury, A. (1993). The biased annihilating branching process. Advances in Applied Probability 25, 24–38. Neves, E. J. and Schonmann, R. H. (1991). Critical droplets and metastability for a Glauber dynamics at very low temperature. Communications in Mathematical Physics 137, 209–230. Neves, E. J. and Schonmann, R. H. (1992). Behavior of droplets for a class of Glauber dynamics at very low temperature. Probability Theory and Related Fields 91, 331–354. Noble, C. (1992). Equilibrium behavior of the sexual reproduction process with rapid diffusion. Annals of Probability 20, 724–745. Pellegrinotti, A. Phase separation in an interacting particle system. Pemantle, R. (1992). The contact process on trees. Annals of Probability 20, 2089–2116. Platen, E. (1989). 
A law of large numbers for wide range exclusion processes in random media. Stochastic Processes and their Applications 31, 33–50. Quastel, J. (1992). Diffusion of color in the simple exclusion process. Communications in Pure and Applied Mathematics 45, 623–679. Ravishankar, K. (1992). Fluctuations from the hydrodynamical limit for the symmetric simple exclusion in Z d . Stochastic Processes and their Applications 42, 31–37. Ravishankar, K. (1992). Interface fluctuations in the two dimensional weakly asymmetric simple exclusion process. Stochastic Processes and their Applications 43, 223–247. Rezakhanlou, F. (1990). Hydrodynamic limit for a system with finite range interaction. Communications in Mathematical Physics 129, 445–480. Rezakhanlou, F. (1991). Hydrodynamic limit for attractive particle systems on Z d . Communications in Mathematical Physics 140, 417–448. Rezakhanlou, F. Evolution of tagged particles in nonreversible particle systems. Rezakhanlou, F. Propagation of chaos for symmetric simple exclusion. Communications in Pure and Applied Mathematics. Roussignol, M. (1986). Processus de saut avec interaction selon les plus proches particules. Annales de l’Institut Henri Poincar´ e (Probabilit´ es et Statistique) 22, 175–198. Saada, E. (1987). A limit theorem for the position of a tagged particle in a simple exclusion process. Annals of Probability 15, 375–381. Saada, E. (1988). Invariant measures for the linear infinite particle systems with values in [0, ∞)S . Annales de l’Institut Henri Poincar´ e (Probabilit´ es et Statistique) 24, 427–437. Saada, E. (1990). Processus de zero-range avec particule marqu´ee. Annales de l’Institut Henri Poincar´ e (Probabilit´ es et Statistique) 26, 5–18. Scheucher, M. and Spohn, H. (1988). A soluble kinetic model for spinoidal decomposition. Journal of Statistical Physics 53, 279–294. Schinazi, R. (1992). Brownian fluctuations of the edge for critical reversible nearest particle systems. Annals of Probability 20, 194–205.
SURVIVAL AND COEXISTENCE IN INTERACTING PARTICLE SYSTEMS
225
Schonmann, R. H. (1985). Metastability for the contact process. Journal of Statistical Physics 41, 445–464. Schonmann, R. H. (1986). Central limit theorem for the contact process. Annals of Probability 14, 1291–1295. Schonmann, R. H. (1986). The asymmetric contact process. Journal of Statistical Physics 44, 505–534. Schonmann, R. H. (1987). A new look at contact processes in several dimensons. Percolation Theory and Ergodic Theory of Infinite Particle Systems, vol. 8, IMA Series in Mathematics and its Applications, pp. 245–250. Schonmann, R. H. (1987). A new proof of the complete convergence theorem for contact processes in several dimensions with large infection parameter. Annals of Probability 15, 382–387. Schonmann, R. H. (1991). An approach to characterize metastability and critical droplets in stochastic Ising models. Annales de l’Institut Henri Poincar´ e (Probabilit´ es et Statistique) B 55, 591–600. Schonmann, R. H. (1992). The pattern of escape from metastability of a stochastic Ising model. Communications in Mathematical Physics 147, 231–240. Schonmann, R. H. (1993). Relaxation times for stochastic Ising models in the limit of vanishing external field at fixed low temperatures. Cellular Automata and Cooperative Systems, Kluwer, Dordrecht, pp. 543–546. Schonmann, R. H. Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region. Communications in Mathematical Physics. Schonmann, R. H. and Vares, M. E. (1986). The survival of the large dimensional basic contact process. Probability Theory and Related Fields 72, 387–393. Shiga, T. (1988). Tagged particle motion in a clustered random walk system. Stochastic Processes and their Applications 30, 225–252. Shiga, T. (1992). Ergodic theorems and exponential decay of sample paths for certain interacting diffusion systems. Osaka Journal of Mathematics 29, 789–807. Shiga, T. and Tanaka, H. (1985). 
Central limit theorems for a system of Markovian particles with mean field interactions. Zeitschrift f¨ ur Wahrscheinlichkeitstheorie verw. Geb. 69, 439–459. Spitzer, F. (1986). A multidimensional renewal theorem. Adv. Math. Supp. Studies 9, 147–155. Spohn, H. (1985). Equilibrium fluctuations for some stochastic particle systems. Statistical Physics and Dynamical Systems, Birkh¨ auser, pp. 67–81. Spohn, H. (1989). Stretched exponential decay in a kinetic Ising model with dynamic constraint. Communications in Mathematical Physics 125, 3–12. Spohn, H. (1990). Tracer diffusion in lattice gases. Journal of Statistical Physics 59, 1227–1239. Steif, J. (1991). Space–time Bernoullicity of the lower and upper stationary processes for attractive spin systems. Annals of Probability 19, 609–635. Stroock, D. W. and Zegarlinski, B. (1992). The equivalence of the logarithmic Sobolev inequality and the Dobrushin–Shlosman mixing condition. Communications in Mathematical Physics 144, 303–323. Stroock, D. W. and Zegarlinski, B. (1992). The logarithmic Sobolev inequality for continuous spin systems on a lattice. Journal of Functional Analysis 104, 299–326. Stroock, D. W. and Zegarlinski, B. (1992). The logarithmic Sobolev inequality for discrete spin systems on a lattice. Communications in Mathematical Physics 149, 175–193. Sudbury, A. (1990). The branching annihilating process: an interacting particle system. Annals of Probability 18, 581–601. Suzuki, Y. (1991). Invariant measures for the multitype voter model. Tokyo Journal of Mathematics 14, 61–72. Swindle, G. (1990). A mean field limit of the contact process with large range. Probability Theory and Related Fields 85, 261–282. Tanemura, H. (1989). Ergodicity for an infinite particle system in Rd of jump type with hard core interaction. Journal of the Mathematical Society of Japan 41, 681–697. Thomas, L. E. (1989). Bound on the mass gap for finite volume stochastic Ising models at low temperature. 
Communications in Mathematical Physics 126, 1–11.
226
T. M. LIGGETT
Thomas, L. E. and Yin, Z. (1986). Approach to equilibrium for random walks on graphs and for stochastic infinite particle systems. Journal of Mathematical Physics 27, 2475–2477. Toom, A. L., Vasilyev, N. B., Stavskaya, O. N., Mityushin, L. G., Kurdyumov, G. L., and Pirogov, S. A. (1990). Discrete local Markov systems. Stochastic Cellular Systems: Ergodicity, Memory, Morphogenesis (R. L. Dobrushin, V. I. Kryukov, and A. L. Toom, ed.), Manchester University Press, pp. 1–182. Wang, H. X. (1987). Invariant measures for generalized infinite particle systems with zero range interactions. Acta Math. Sci. 7, 55–69. Wang, J. S. and Lebowitz, J. L. (1988). Phase transitions and universality in nonequilibrium steady states of stochastic Ising models. Journal of Statistical Physics 51, 893–906. Wang, S. Z. (1986). The set of invariant measures of bounded spin flip processes with potential. Acta Mathematica Sinica 6, 213–222. Wick, W. D. (1985). A dynamical phase transition in an infinite particle system. Journal of Statistical Physics 38, 1015–1025. Wick, W. D. (1989). Hydrodynamic limit of nongradient interacting particle processes. Journal of Statistical Physics 54, 873–892. Yaguchi, H. (1990). Entropy analysis of a nearest neighbor attractive/repulsive exclusion on one dimensional lattices. Annals of Probability 18, 556–580. Yaguchi, H. (1991). A discrete time interactive exclusive random walk of infinitely many particles on one–dimensional lattices. Hiroshima Mathematics Journal 21, 267–283. Ycart, B. et al. (1989). An interacting model of adsorption. Applicationes Mathematicae 20. Ycart, B. (1993). The philosopher’s process: an ergodic reversible nearest particle system. Annals of Applied Probability 3. Zheng, X. G. and Zeng, W. D. (1986). Generalized simple exclusion processes with symmetrizable transition probability. Chinese J. Appl. Probab. and Stat. 2, 334–340. Zheng, X. G. and Zeng, W. D. (1987). 
An ergodic theorem for generalized simple exclusion processes with reversible positive transitions. Acta Math. Sci. 7, 169–175. Zheng, X. G. (1988). Ergodic theorem for generalized long range exclusion processes with postive recurrent transition probabilities. Chinese J. Appl. Probab. and Stat. 4, 193–209.
CONSTRUCTIVE METHODS IN MARKOV CHAIN THEORY
M. V. MENSHIKOV
Chair of Probability, Laboratory of Large Random Systems, Faculty of Mechanics and Mathematics, Moscow State University, 119899 Moscow, Russia
Abstract. General criteria are given for the ergodicity, recurrence, and transience of countable Markov chains. Conditions are given for the continuity, in the parameter, of the stationary probabilities of families of such chains. All criteria are closely connected with the well known criterion of Foster for the ergodicity of Markov chains and are given in terms of semimartingales. A complete classification is obtained for the random walks in $\mathbb{Z}^2_+$. The zero drift case inside $\mathbb{Z}^2_+$ and almost zero drift one-dimensional processes constitute new directions of development.

Key words: Ergodicity, recurrence, transience, random walks, semimartingales, stability.
1. Introduction

The goal of this paper is to illustrate some constructive martingale criteria for the classification of Markov chains and random walks in $\mathbb{Z}^2_+$. The basic method of obtaining these criteria consists of the construction of Lyapounov functions. These methods enable us to say when a Markov chain is ergodic, null recurrent or transient, i.e., to solve the problem of the complete classification. Some of the ideas given in this paper for random walks in $\mathbb{Z}^2_+$ can be used for solving analogous problems in $\mathbb{Z}^n_+$ ($n \ge 3$). We will present martingale criteria, their equivalents for Markov chains, and then some applications to random walks.

The notion of the Lyapounov function or test function is close to the well known Lyapounov function for ordinary differential equations and goes back, as far as we know, to Foster (1953). Although his examples are now trivial, his ideas and criteria for ergodicity and for transience became basic for later extensions. There now exist many technical generalisations of these criteria, some of which we will give in this paper. Generalized Foster's criteria for ergodicity were given in Malyshev (1991). In Filonov (1989) a new martingale proof was proposed with an important extension to random times. In Fayolle et al. (1993) these results were summarized and simplified.

The first results for random walks in $\mathbb{Z}^2_+$ appeared 20 years ago (Malyshev 1972). In Malyshev et al. (1979), Menshikov (1974), Ignatyuk (1991) these ideas were applied to $\mathbb{Z}^n_+$ ($n \ge 3$). But the classification was obtained only for $\mathbb{Z}^3_+$ and $\mathbb{Z}^4_+$ and only for random walks with non-zero drifts. The consideration of the zero drift case required new general martingale criteria
for Markov chains and a search for new Lyapounov functions. The zero drift case in $\mathbb{Z}^2_+$ and almost zero drift one-dimensional processes constitute new directions of development, initiated by Lamperti (1960) 30 years ago. They are directly related to several works of R. Williams, S. R. S. Varadhan and others.

2. Criteria involving Semi-Martingales

Let $(\Omega, \mathcal{F}, P)$ be a given probability space and $\{\mathcal{F}_n, n \ge 0\}$ an increasing family of $\sigma$-algebras
\[ \mathcal{F}_0 \subset \mathcal{F}_1 \subset \cdots \subset \mathcal{F}_n \subset \cdots \subset \mathcal{F}. \]
Let $\{S_i, i \ge 0\}$ be a sequence of real non-negative random variables, such that $S_i$ is $\mathcal{F}_i$-measurable, $\forall i \ge 0$. Moreover, $S_0$ will be taken constant, which does not restrict the generality. Denote by $\tau$ the $\mathcal{F}_n$-stopping time representing the epoch of the first entry into $[0, C]$, i.e.,
\[ \tau = \inf\{n \ge 1 : S_n \le C\}. \]
Introduce the stopped sequence $\tilde{S}_n = S_{n \wedge \tau}$, where
\[ n \wedge \tau = \begin{cases} n, & \text{if } n \le \tau, \\ \tau, & \text{if } n > \tau. \end{cases} \]

Theorem 2.1. Assume that $S_0 > C$ and, for some $\varepsilon > 0$ and all $n \ge 0$,
\[ E(\tilde{S}_{n+1} \mid \mathcal{F}_n) \le \tilde{S}_n - \varepsilon \mathbf{1}_{\{\tau > n\}} \quad \text{a.s.} \tag{1} \]
Then
\[ E(\tau) < \frac{S_0}{\varepsilon} < \infty. \tag{2} \]

Here we used the classical notation for the indicator function
\[ \mathbf{1}_A = \begin{cases} 1, & \text{if } A \text{ is true}, \\ 0, & \text{otherwise}. \end{cases} \]

The following theorem is a generalization of Theorem 2.1, which is useful in the investigation of the ergodicity of random walks on $\mathbb{Z}^n_+$ ($n \ge 3$). Let $\{N_i, i \ge 1\}$ be a random sequence of positive integers which is predictable, i.e., $N_{i+1}$ is $\mathcal{F}_i$-measurable, and such that $N_0 = 0$, $N_i - N_{i-1} \ge 1$ a.s., $\forall i \ge 1$. Introduce $Y_0 = S_0$, $Y_i = S_{N_i}$, $i \ge 1$, the stopping time
\[ \sigma = \inf\{i \ge 1 : Y_i \le C\}, \]
and the stopped sequences $\tilde{Y}_i = Y_{i \wedge \sigma}$, $\tilde{N}_i = N_{i \wedge \sigma}$, $i \ge 1$.

Theorem 2.2. Assume that for some $\varepsilon > 0$ and all $n \ge 0$,
\[ E(\tilde{Y}_{n+1} \mid \mathcal{F}_{N_n}) \le \tilde{Y}_n - \varepsilon (\tilde{N}_{n+1} - \tilde{N}_n) \quad \text{a.s.} \tag{3} \]
Then
\[ E(\tau) \le \frac{S_0}{\varepsilon}. \tag{4} \]
Theorem 2.3. Suppose that, for $n \ge 1$ and some positive real $M$,
\[ E(\tilde{S}_n \mid \mathcal{F}_{n-1}) \ge \tilde{S}_{n-1} \quad \text{a.s.}, \tag{5} \]
\[ E[\,|\tilde{S}_n - \tilde{S}_{n-1}| \mid \mathcal{F}_{n-1}] \le M \quad \text{a.s.} \tag{6} \]
Then $E(\tau) = \infty$.

The proofs of these theorems can be found in Malyshev et al. (1979) and Fayolle et al. (1993).

3. Criteria for Countable Markov Chains

Let us consider a time homogeneous Markov chain $L$ with a countable state space $\mathcal{A} = \{\alpha_i, i \ge 0\}$. $L$ is supposed to be irreducible and aperiodic. The position of the chain at time $n$ is $\xi_n$.

Theorem 3.1. The Markov chain $L$ is recurrent if and only if there exist a positive function $f(\alpha)$, $\alpha \in \mathcal{A}$, and a finite set $A$, such that
\[ E[f(\xi_{m+1}) - f(\xi_m) \mid \xi_m = \alpha_i] \le 0, \quad \forall \alpha_i \notin A, \]
and $f(\alpha_j) \to \infty$ as $j \to \infty$.

Theorem 3.2. The Markov chain $L$ is transient if and only if there exist a positive function $f(\alpha)$, $\alpha \in \mathcal{A}$, and a set $A \subset \mathcal{A}$ such that the following inequalities are fulfilled:
\[ E[f(\xi_{m+1}) - f(\xi_m) \mid \xi_m = \alpha_i] \le 0, \quad \forall \alpha_i \notin A, \]
\[ f(\alpha_k) < \inf_{\alpha_j \in A} f(\alpha_j), \quad \text{for at least one } \alpha_k \notin A. \]

Theorem 3.3. (Foster) The Markov chain $L$ is ergodic if and only if there exist a positive function $f(\alpha)$, $\alpha \in \mathcal{A}$, a number $\varepsilon > 0$ and a finite set $A \subset \mathcal{A}$ such that
\[ E[f(\xi_{m+1}) - f(\xi_m) \mid \xi_m = \alpha_j] \le -\varepsilon, \quad \alpha_j \notin A; \]
\[ E[f(\xi_{m+1}) \mid \xi_m = \alpha_i] < \infty, \quad \alpha_i \in A. \]

The following theorem is a generalization of Foster's theorem, in the same way that Theorem 2.2 was a generalization of Theorem 2.1.

Theorem 3.4. The Markov chain $L$ is ergodic if and only if there exist a positive function $f(\alpha)$, $\alpha \in \mathcal{A}$, a number $\varepsilon > 0$, a positive integer-valued function $k(\alpha)$, $\alpha \in \mathcal{A}$, and a finite set $A$, such that the following inequalities hold:
\[ E[f(\xi_{m+k(\xi_m)}) - f(\xi_m) \mid \xi_m = \alpha_i] \le -\varepsilon k(\alpha_i), \quad \alpha_i \notin A; \tag{7} \]
\[ E[f(\xi_{m+k(\xi_m)}) \mid \xi_m = \alpha_i] < \infty, \quad \alpha_i \in A. \tag{8} \]
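To make Foster's drift condition concrete, the following sketch evaluates $E[f(\xi_{m+1}) - f(\xi_m) \mid \xi_m = i]$ exactly for a reflecting nearest-neighbour walk on $\{0, 1, 2, \ldots\}$ and checks the first inequality of Theorem 3.3 on a finite window of states. The chain, the test function $f(i) = i + 1$, and the helper names are illustrative assumptions; checking a finite window is of course only a sanity check, not a proof.

```python
from fractions import Fraction

def drift(f, transitions, state):
    """E[f(xi_{m+1}) - f(xi_m) | xi_m = state] for jumps with finite support.

    `transitions(state)` returns a dict {next_state: probability}.
    """
    return sum(p * (f(nxt) - f(state)) for nxt, p in transitions(state).items())

def birth_death(p_up):
    """Hypothetical chain: up w.p. p_up, down w.p. 1 - p_up, held at 0."""
    def transitions(i):
        if i == 0:
            return {1: p_up, 0: 1 - p_up}
        return {i + 1: p_up, i - 1: 1 - p_up}
    return transitions

def foster_drift_holds(f, transitions, finite_set, eps, window):
    """Check drift <= -eps for every state of `window` outside `finite_set`.

    With finitely supported jumps the second inequality of Theorem 3.3
    (finite mean of f after one step from the finite set) is automatic.
    """
    return all(drift(f, transitions, s) <= -eps
               for s in window if s not in finite_set)

f = lambda i: i + 1                          # positive linear test function
stable = birth_death(Fraction(3, 10))        # mean jump -2/5 away from 0
balanced = birth_death(Fraction(1, 2))       # mean jump 0 away from 0

print(foster_drift_holds(f, stable, {0}, Fraction(2, 5), range(200)))
print(foster_drift_holds(f, balanced, {0}, Fraction(1, 100), range(200)))
```

Exact rational arithmetic is used so that the comparison with $-\varepsilon$ is not affected by floating-point rounding.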
Theorem 3.5. For an irreducible Markov chain $L$ to be non-ergodic, it is sufficient that there exist a function $f(\alpha)$, $\alpha \in \mathcal{A}$, and constants $C$ and $d$ such that
1. $E[f(\xi_{m+1}) - f(\xi_m) \mid \xi_m = \alpha] \ge 0$, for every $m$ and all $\alpha \in \{\alpha : f(\alpha) > C\}$, where the sets $\{\alpha : f(\alpha) > C\}$ and $\{\alpha : f(\alpha) \le C\}$ are non-empty;
2. $E[\,|f(\xi_{m+1}) - f(\xi_m)| \mid \xi_m = \alpha] \le d$, for every $m$, $\forall \alpha \in \mathcal{A}$.

It is easy to see that Theorem 2.1 is the martingale analogue of Foster's Theorem 3.3, Theorem 2.2 is the analogue of Theorem 3.4, and Theorem 2.3 is the analogue of Theorem 3.5. The proofs of Theorems 3.1 and 3.2 are based on the martingale technique (see for example Foster (1953) and Fayolle et al. (1993)).

Theorems 3.1–3.5 are very useful in the classification of random walks on $\mathbb{Z}^2_+$ with non-zero drifts. In this case we make use of -linear Lyapounov functions and so the additional conditions on moments in these theorems are fulfilled. In the case with zero mean jumps inside $\mathbb{Z}^2_+$ we have to use some quadratic forms and functionals of quadratic forms, so condition 2 of Theorem 3.5 is not fulfilled. The following Theorem 3.6 helps us to solve this problem (see Fayolle et al. (1992) and Fayolle et al. (1993)).

Theorem 3.6. For an irreducible Markov chain $L$ to be null recurrent, it is sufficient that there exist two functions $f(x)$ and $\phi(x)$, $x \in \mathcal{A}$, and a finite subset $A \subset \mathcal{A}$, such that the following conditions hold:
1. $f(x) \ge 0$, $\phi(x) \ge 0$, $\forall x \in \mathcal{A}$;
2. For some positive $\alpha, \gamma$, with $1 < \alpha \le 2$, $f(x) \le \gamma (\phi(x))^\alpha$, $x \in \mathcal{A}$;
3. $\phi(x_i) \to \infty$, for $i \to \infty$, and $\sup_{x \notin A} f(x) > \sup_{x \in A} f(x)$;
4. (a) $E[f(\xi_{n+1}) - f(\xi_n) \mid \xi_n = x] \ge 0$, $\forall x \notin A$;
   (b) $E[\phi(\xi_{n+1}) - \phi(\xi_n) \mid \xi_n = x] \le 0$, $\forall x \notin A$;
   (c) $\sup_{x \in \mathcal{A}} E[\,|\phi(\xi_{n+1}) - \phi(\xi_n)|^\alpha \mid \xi_n = x] = C < \infty$.

4. Classification of Random Walks on $\mathbb{Z}^2_+$

Consider a discrete time homogeneous irreducible and aperiodic Markov chain $L = \{\xi_n, n \ge 0\}$. Its state space is the lattice in the positive quarter plane $\mathbb{Z}^2_+ = \{(i, j) : i, j \ge 0\}$ and it satisfies the recursive equation
\[ \xi_{n+1} = [\xi_n + \theta_{n+1}]^+, \]
where the distribution of $\theta_{n+1}$ depends only on the position of $\xi_n$ in the following way (maximal space homogeneity):
\[ p[\theta_{n+1} = (i, j) \mid \xi_n = (k, l)] = \begin{cases} p_{ij}, & \text{for } k, l \ge 1, \\ p'_{ij}, & \text{for } k \ge 1,\ l = 0, \\ p''_{ij}, & \text{for } k = 0,\ l \ge 1, \\ p^0_{ij}, & \text{for } k = l = 0. \end{cases} \]

Moreover we shall make, for the one-step transition probabilities, the following assumptions:
Condition A (Lower boundedness)
\[ \begin{cases} p_{ij} = 0, & \text{if } i < -1 \text{ or } j < -1; \\ p'_{ij} = 0, & \text{if } i < -1 \text{ or } j < 0; \\ p''_{ij} = 0, & \text{if } i < 0 \text{ or } j < -1. \end{cases} \]

Condition B (First moment condition)
\[ E[\,\|\theta_{n+1}\| \mid \xi_n = (k, l)] \le C < \infty, \quad \forall (k, l) \in \mathbb{Z}^2_+, \]
where $\|z\|$, $z \in \mathbb{Z}^2_+$, denotes the euclidean norm and $C$ is an arbitrary but strictly positive number.

Notation: We shall use lower case greek letters $\alpha, \beta, \ldots$ to denote arbitrary points of $\mathbb{Z}^2_+$, and then $p_{\alpha\beta}$ will mean the one-step transition probabilities of the Markov chain $L$, and $\alpha > 0$ means $\alpha_x > 0$, $\alpha_y > 0$, for $\alpha = (\alpha_x, \alpha_y)$. Also, from the homogeneity conditions, one can write $\theta_{n+1} = (\theta_x, \theta_y)$, given that $\xi_n = (x, y)$.

Define the vector $M(\alpha) = (M_x(\alpha), M_y(\alpha))$ of the one-step mean jumps (drifts) from the point $\alpha$. Setting $\alpha = (\alpha_x, \alpha_y)$, $\beta = (\beta_x, \beta_y)$, we have
\[ M_x(\alpha) = \sum_\beta p_{\alpha\beta} (\beta_x - \alpha_x), \quad M_y(\alpha) = \sum_\beta p_{\alpha\beta} (\beta_y - \alpha_y). \]

Condition B ensures the existence of $M(\alpha)$, for all $\alpha \in \mathbb{Z}^2_+$. By the homogeneity conditions, there are only four distinct drift vectors:
\[ M(\alpha) = \begin{cases} M, & \text{for } \alpha_x, \alpha_y > 0; \\ M', & \text{for } \alpha = (\alpha_x, 0),\ \alpha_x > 0; \\ M'', & \text{for } \alpha = (0, \alpha_y),\ \alpha_y > 0; \\ M^0, & \text{for } \alpha = (0, 0). \end{cases} \]

The following theorem was proved by Malyshev (1972) under the additional condition that the jumps are bounded with probability 1. In Fayolle et al. (1993) it was proved for unbounded jumps.

Theorem 4.1. Assume conditions A and B are satisfied.
(a) If $M_x < 0$, $M_y < 0$, then the Markov chain $L$ is
    (i) ergodic if $M_x M'_y - M_y M'_x < 0$ and $M_y M''_x - M_x M''_y < 0$;
    (ii) non-ergodic if either $M_x M'_y - M_y M'_x \ge 0$ or $M_y M''_x - M_x M''_y \ge 0$.
(b) If $M_x \ge 0$, $M_y < 0$, then the Markov chain $L$ is
    (i) ergodic if $M_x M'_y - M_y M'_x < 0$;
    (ii) transient if $M_x M'_y - M_y M'_x > 0$.
(c) (case symmetric to case (b)) If $M_y \ge 0$, $M_x < 0$, then the Markov chain $L$ is
    (i) ergodic if $M_y M''_x - M_x M''_y < 0$;
    (ii) transient if $M_y M''_x - M_x M''_y > 0$.
(d) If $M_x \ge 0$, $M_y \ge 0$, $M_x + M_y > 0$, then the Markov chain $L$ is transient.

The proof of this theorem is based on the construction of -linear Lyapounov functions for which the conditions of Theorems 3.1–3.3, 3.5 are fulfilled correspondingly.

5. Zero Drift

We consider the Markov chain $L$ which was introduced in the previous section, but satisfying the stronger

Condition C (Second moment condition)
\[ E[\,\|\theta_{n+1}\|^2 \mid \xi_n = (k, l)] \le B < \infty, \quad \forall (k, l) \in \mathbb{Z}^2_+. \]
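Before turning to the case $M = 0$, it may help to see the non-zero drift case analysis of Theorem 4.1 written out as a decision procedure. The function below is only a transcription sketch: the name `classify_quarter_plane`, the tuple arguments, and the label "not covered" (for boundary cases on which the theorem is silent) are assumptions made here, and Conditions A and B are taken for granted.

```python
def classify_quarter_plane(m, m1, m2):
    """Transcription of the case analysis of Theorem 4.1.

    m  = (Mx, My)      drift in the interior,
    m1 = (M'x, M'y)    drift on the x-axis,
    m2 = (M''x, M''y)  drift on the y-axis.
    """
    mx, my = m
    d1 = mx * m1[1] - my * m1[0]     # Mx M'y - My M'x
    d2 = my * m2[0] - mx * m2[1]     # My M''x - Mx M''y
    if mx < 0 and my < 0:                        # case (a)
        return "ergodic" if d1 < 0 and d2 < 0 else "non-ergodic"
    if mx >= 0 and my < 0:                       # case (b)
        if d1 < 0:
            return "ergodic"
        return "transient" if d1 > 0 else "not covered"
    if my >= 0 and mx < 0:                       # case (c)
        if d2 < 0:
            return "ergodic"
        return "transient" if d2 > 0 else "not covered"
    if mx + my > 0:                              # case (d): Mx >= 0, My >= 0
        return "transient"
    return "not covered"                         # M = 0 is the subject of Section 5

print(classify_quarter_plane((-1, -1), (-1, 1), (1, -1)))
print(classify_quarter_plane((1, -1), (1, 0), (0, 0)))
print(classify_quarter_plane((0, 0), (-1, 1), (1, -1)))
```

Integer or `fractions.Fraction` inputs keep the sign tests exact; with floating-point drifts the equalities that separate the cases would need a tolerance.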
Until recently nothing was precisely known for the case $M = 0$. In fact this problem is, in many respects, of a very different nature. In particular, intuition does not provide us with any evidence that the random walk could be ergodic when $M = 0$. There is a crucial difference between the cases $M \ne 0$ and $M = 0$: the case $M \ne 0$ is in a sense locally linear and $M = 0$ is locally quadratic. The local second order effects are well captured by functionals of quadratic Lyapounov functions. For $M = 0$, we will obtain the ergodicity conditions in terms of the second moments and the covariance of the one-step jumps inside $\mathbb{Z}^2_+$,
\[ \lambda_x = \sum_{ij} i^2 p_{ij}, \quad \lambda_y = \sum_{ij} j^2 p_{ij}, \quad R = \sum_{ij} ij\, p_{ij}, \]
and of the angles $\phi_x$, $\phi_y$. Here $\phi_x$ is the angle between $M'$ and the negative $x$-axis, and $\phi_y$ is the angle between $M''$ and the negative $y$-axis. Thus, if $\phi_x \ne \frac{1}{2}\pi$ and $\phi_y \ne \frac{1}{2}\pi$, then
\[ \tan\phi_x = -\frac{M'_y}{M'_x}, \quad \tan\phi_y = -\frac{M''_x}{M''_y}. \]

Theorem 5.1.
(i) If $\phi_x \ge \frac{1}{2}\pi$ or $\phi_y \ge \frac{1}{2}\pi$, then the random walk $L$ is non-ergodic.
(ii) If $\phi_x < \frac{1}{2}\pi$ and $\phi_y < \frac{1}{2}\pi$, then the random walk $L$ is ergodic if
\[ \lambda_x \tan\phi_x + \lambda_y \tan\phi_y + 2R \equiv -\lambda_x \frac{M'_y}{M'_x} - \lambda_y \frac{M''_x}{M''_y} + 2R < 0, \tag{9} \]
and non-ergodic if
\[ \lambda_x \tan\phi_x + \lambda_y \tan\phi_y + 2R > 0. \tag{10} \]
(iii) If (10) holds together with $\phi_x + \phi_y \le \frac{1}{2}\pi$, then the random walk is null recurrent.
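The quantities in Theorem 5.1 are straightforward to compute from the interior jump law and the two boundary drifts. The sketch below does this with exact rational arithmetic. The function names, the example jump laws, and the label "not covered" for the limiting case of equality are illustrative assumptions. It also assumes $M'_y \ge 0$ and $M''_x \ge 0$ (jumps cannot leave the quarter plane), so that $\phi_x \ge \frac{1}{2}\pi$ exactly when $M'_x \ge 0$, and it uses the elementary fact that for angles in $[0, \frac{1}{2}\pi)$ the condition $\phi_x + \phi_y \le \frac{1}{2}\pi$ is equivalent to $\tan\phi_x \tan\phi_y \le 1$.

```python
from fractions import Fraction as F

def second_moments(p):
    """Return (lambda_x, lambda_y, R) for an interior jump law p = {(i, j): prob}."""
    lam_x = sum(q * i * i for (i, j), q in p.items())
    lam_y = sum(q * j * j for (i, j), q in p.items())
    r = sum(q * i * j for (i, j), q in p.items())
    return lam_x, lam_y, r

def classify_zero_drift(p, m1, m2):
    """Apply Theorem 5.1; m1 = M' (drift on x-axis), m2 = M'' (drift on y-axis)."""
    mean_x = sum(q * i for (i, j), q in p.items())
    mean_y = sum(q * j for (i, j), q in p.items())
    if mean_x != 0 or mean_y != 0:
        raise ValueError("interior drift M must be zero")
    if m1[0] >= 0 or m2[1] >= 0:                 # part (i): an angle is >= pi/2
        return "non-ergodic"
    tan_x = -m1[1] / m1[0]                       # tan(phi_x) = -M'_y / M'_x
    tan_y = -m2[0] / m2[1]                       # tan(phi_y) = -M''_x / M''_y
    lam_x, lam_y, r = second_moments(p)
    s = lam_x * tan_x + lam_y * tan_y + 2 * r    # left-hand side of (9) and (10)
    if s < 0:
        return "ergodic"
    if s > 0:
        return "null recurrent" if tan_x * tan_y <= 1 else "non-ergodic"
    return "not covered"                         # limiting case s = 0

# Hypothetical jump laws with zero interior drift
simple = {(1, 0): F(1, 4), (-1, 0): F(1, 4), (0, 1): F(1, 4), (0, -1): F(1, 4)}
skew = {(1, -1): F(3, 8), (-1, 1): F(3, 8), (1, 1): F(1, 8), (-1, -1): F(1, 8)}

print(classify_zero_drift(simple, (F(-1, 2), F(1, 2)), (F(1, 2), F(-1, 2))))
print(classify_zero_drift(skew, (F(-1, 2), F(1, 8)), (F(1, 8), F(-1, 2))))
```

In the second example the negative covariance $R = -\frac{1}{2}$ outweighs the two boundary terms, which is the mechanism by which a zero drift walk can be ergodic.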
Remark 1. It follows easily from the statement of the theorem that the mean first entrance time of $L$ to the boundary, when starting from some arbitrary point $\alpha > 0$ at finite distance, is finite (resp. infinite) if $R < 0$ (resp. $R > 0$), since in this case the vectors $M'$ and $M''$ can be properly chosen to satisfy (9) (resp. (10)).

Remark 2. It is clear from the formulation of the theorem that we do not consider the limiting situation
\[ \lambda_x \tan\phi_x + \lambda_y \tan\phi_y + 2R = 0, \]
which would impose assumptions of third order.

The necessary and sufficient conditions for transience were proved recently (Asymont et al. 1993). The structure of the Lyapounov functions was more difficult.

Theorem 5.2. Let the following conditions hold for these random walks: $M''_x > 0$ and $M'_y > 0$. Then the random walk is recurrent if
\[ \lambda_x \cot\phi_y + \lambda_y \cot\phi_x + 2R \ge 0 \]
(perhaps with the exception of the case $\lambda_x \cot\phi_y + R = \lambda_y \cot\phi_x + R = 0$). The walk is transient if
\[ \lambda_x \cot\phi_y + \lambda_y \cot\phi_x + 2R < 0. \]

Remark 1. If the assumption of Theorem 5.2 holds we have
\[ \cot\phi_x = -\frac{M'_x}{M'_y}, \quad \cot\phi_y = -\frac{M''_y}{M''_x}. \]

Remark 2. We do not consider the cases $M'_y = 0$ and $M''_x = 0$, which are more trivial.

Remark 3. The case $\lambda_x \cot\phi_y + R = \lambda_y \cot\phi_x + R = 0$ cannot be classified by the method used here. We think that the knowledge of the first moments on the axes is not sufficient to solve the classification problem and that it is necessary to consider also the second moments on the axes.

6. Random Walks with 'Almost Zero' Mean Jumps

Let us consider a discrete-time Markov chain $\{X_n\}$, with state space $\mathbb{R}_+$. Let us define for $x \in \mathbb{R}_+$
\[ \mu_i(x) = E[(X_{n+1} - X_n)^i \mid X_n = x]. \]
Let $\tau_{x_0} \ge 0$ be the time at which the process first enters the interval $[0, A)$, when $X_0 = x_0$.

Theorem 6.1. Suppose that for some $\varepsilon, p > 0$ and all sufficiently large $x$
\[ 2x\mu_1(x) + (2p - 1)\mu_2(x) \le -\varepsilon, \quad \mu_2(x) = O(1), \quad \mu_{2p}(x) = o(x^{2p-2}). \]
Then for any sufficiently large $A$ and all $x_0$ we have $E(\tau^p_{x_0}) < \infty$.
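The drift expression $2x\mu_1(x) + (2p - 1)\mu_2(x)$ in Theorem 6.1 can be evaluated directly for a concrete chain. As a sketch, take a hypothetical nearest-neighbour walk on $\{2, 3, \ldots\}$ that steps $+1$ with probability $\frac{1}{2} - \frac{1}{x}$ and $-1$ with probability $\frac{1}{2} + \frac{1}{x}$, so that $\mu_1(x) = -2/x$ and $\mu_2(x) = 1$. More generally, if $\mu_1(x) = -a/x$ and $\mu_2(x) = b$, the expression equals $-2a + (2p - 1)b$, which changes sign at $p = \frac{1}{2} + a/b$. The function names below are assumptions for illustration, and the code only evaluates the expression; it does not verify the remaining moment hypotheses.

```python
from fractions import Fraction

def lamperti_expression(x, p, mu1, mu2):
    """Evaluate 2 x mu_1(x) + (2p - 1) mu_2(x) at the point x."""
    return 2 * x * mu1(x) + (2 * p - 1) * mu2(x)

def critical_exponent(a, b):
    """Value of p at which -2a + (2p - 1) b changes sign, for mu_1 = -a/x, mu_2 = b."""
    return Fraction(1, 2) + Fraction(a) / Fraction(b)

# Hypothetical walk: step +1 w.p. 1/2 - 1/x, step -1 w.p. 1/2 + 1/x
mu1 = lambda x: Fraction(-2, x)    # mean jump
mu2 = lambda x: Fraction(1)        # second moment of the jump

for p in (2, 3):
    values = {lamperti_expression(x, p, mu1, mu2) for x in range(10, 1000)}
    print(f"p = {p}: expression takes values {sorted(values)}")
print("sign change at p =", critical_exponent(2, 1))
```

For this walk the expression is constant in $x$, negative for $p = 2$ and positive for $p = 3$, so the first hypothesis of Theorem 6.1 holds for $p = 2$ and fails for $p = 3$.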
Theorem 6.2. Suppose for some $\varepsilon, p > 0$ that
\[ 2x\mu_1(x) + (2p - 1)\mu_2(x) \ge \varepsilon > 0 \]
for all large $x$; suppose also that $\mu_{2p}(x)$ exists and
\[ \mu_1(x) = O(x^{-1}), \quad \mu_2(x) = O(1), \quad \mu_{2p}(x) = o(x^{2p-2}). \]
Then for all sufficiently large $A$, for every $x_0 > A$, we have $E(\tau^p_{x_0}) = \infty$.

These two theorems generalize the results of Lamperti (1960, 1963), which he proved only for positive integer $p$. After some generalizations we can apply these two theorems to random walks on $\mathbb{Z}^2_+$ with zero mean jumps inside, and can obtain almost necessary and sufficient conditions for $E(\tau^p) < \infty$. Here $p > 0$, and $\tau$ is the time at which the random walk first enters some finite set. In the previous section we considered only the case $p = 1$ (ergodicity). But the construction of Lyapounov functions in the case of zero drift is more difficult. A paper on this topic (Aspandiiarov et al. 1993) is in preparation.

7. Stability

The present section is devoted to the continuity of the stationary distribution for families of homogeneous irreducible and aperiodic Markov chains. First we give a necessary and sufficient condition for this continuity, and also constructive sufficient conditions for the continuity of the stationary probabilities in terms of test functions. These results were obtained by Malyshev and Menshikov (1979).

Let us consider a family of homogeneous irreducible aperiodic Markov chains $\{L_\nu\}$ with discrete time and countable set of states $A = \{0, 1, \ldots\}$, for $\nu \in D$, where $D$ is an open subset of the real line. By $p_{ij}(t, \nu)$ we denote the $t$-step transition probability from the point $i$ to the point $j$ in $L_\nu$. Everywhere in this section we assume that the $p_{ij}(1, \nu)$ are continuous in $\nu$ for all $\nu \in D$ and $i, j \in A$. For the sake of brevity, we will write
\[ p_{ij}(\nu) \stackrel{\text{def}}{\equiv} p_{ij}(1, \nu). \]
It is easy to prove that the $p_{ij}(t, \nu)$ are continuous functions of $\nu$ ($\nu \in D$) for every natural number $t$ and all $i, j \in A$.

On the set $A$ let $\{\pi_j(\nu)\}$, $j \in A$, $\nu \in D$, be a given family of distributions, where $D$ is some open subset of the real line. We have
\[ \sum_{j \in A} \pi_j(\nu) = 1, \quad (\nu \in D). \]

Definition. The family of distributions $\{\pi_j(\nu)\}$, ($j \in A$, $\nu \in D$), satisfies Condition ($\lambda$) at the point $\nu_0 \in D$ if, for any $\varepsilon > 0$, there exist $\delta > 0$ and a finite set $B \subset A$ such that
\[ \sum_{j \in A \setminus B} \pi_j(\nu) < \varepsilon, \]
for all $\nu$ with $|\nu - \nu_0| < \delta$.

Let the chain $L^\nu$ be ergodic for every $\nu$ belonging to some neighborhood $U_0 \subset D$ of zero.

Theorem 7.1. The stationary probabilities $\pi_j(\nu)$ depend on $\nu$ continuously at $\nu = 0$ for all $j \in A$ if and only if the family of distributions $\{\pi_j(\nu)\}$ satisfies Condition (λ) at the point $\nu = 0$.

Before proving this, we make the following remark. Following Prohorov (1956) we form the metric space $D(A)$. To that end, we define the distance $L(\mu_1, \mu_2)$ between any two measures $\mu_1$ and $\mu_2$ on $A = \{0, 1, \ldots, n\}$, so that convergence in the sense of this distance is equivalent to weak convergence of measures. The collection of all measures on $A$ together with the function $L(\mu_1, \mu_2)$ forms the metric space $D(A)$. Still in accordance with Prohorov (1956) we introduce the following definition.

Definition. A set $T$ of measures on $A$ satisfies Condition (χ) if:
(χ1) the values $\mu(A)$, $\mu \in T$, are bounded;
(χ2) for any given $\varepsilon > 0$, there exists a finite set $K_\varepsilon$ of points such that $\mu(A \setminus K_\varepsilon) < \varepsilon$ for all $\mu \in T$.

In Prohorov (1956) it is proved that for $T \subset D(A)$ to be compact, it is necessary and sufficient that Condition (χ) be satisfied. For $\{\pi_j(\nu)\}$, Condition (χ) obviously implies Condition (λ). Therefore, as a result of Theorem 7.1, we have the following theorem.

Theorem 7.2. In order that the stationary probabilities $\pi_j(\nu)$ depend continuously on $\nu$ for all $j \in A$ it is sufficient that the family $\{\pi_j(\nu)\}$ of distributions be compact in $D(A)$.

The following theorem will be formulated in terms of test functions. The continuity of stationary probabilities of random walks in $\mathbb{Z}^n_+$ can be studied by means of the results of this section. Assume that on the set $A = \{0, 1, \ldots\}$ there is a given family $f^\nu = \{f_i^\nu\}$, for $i \in A$, $\nu \in D$, of real functions, where
$$\inf_{i \in A,\, \nu \in D} f_i^\nu \ge 0.$$
Theorem 7.3. Assume that the following conditions are satisfied for some $\delta > 0$, some $\gamma > 1$, and some finite non-empty set $B \subset A$:
1. $\sum_{j=0}^{\infty} p_{ij}(\nu) f_j^\nu - f_i^\nu < -\delta$ for $i \notin B$, $\nu \in D$,
2. $\sup_{i \in B,\, \nu \in D} \sum_{j=0}^{\infty} p_{ij}(\nu) (f_j^\nu)^\gamma = \lambda_\gamma < \infty$,
3. $\sup_{i \in B,\, \nu \in D} \sum_{j=0}^{\infty} p_{ij}(\nu) |f_j^\nu - f_i^\nu|^\gamma = C_\gamma < \infty$,
4. $f_i^\nu \to \infty$ uniformly in $\nu \in D$ as $i \to \infty$.
Then the chains $L^\nu$ are ergodic for every $\nu \in D$, and the stationary probabilities $\pi_j(\nu)$ are continuous in $\nu$ for $\nu \in D$ and $j \in A$.

In Malyshev and Menshikov (1979) this theorem was applied to the analysis of continuity for families of random walks in $\mathbb{Z}^n_+$. Analyticity conditions for stationary probabilities, for general Markov chains and for random walks in $\mathbb{Z}^n_+$, are also stated in that paper in terms of Lyapounov functions.
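As an illustration of the drift condition in Theorem 7.3, the following minimal sketch treats a toy family that is not taken from the text: the reflected nearest-neighbour walk on {0, 1, ...} that steps up with probability ν and down with probability 1 − ν, with test function f_i = i and B = {0}. The state space is truncated at a hypothetical level `size` purely so that the stationary law can be approximated by iteration; the function names are illustrative assumptions.

```python
def step(pi, nu):
    """Push the distribution pi one step forward under the reflected walk."""
    n = len(pi)
    new = [0.0] * n
    for i, mass in enumerate(pi):
        new[min(i + 1, n - 1)] += nu * mass        # step up (held at the truncation level)
        new[max(i - 1, 0)] += (1.0 - nu) * mass    # step down (held at 0)
    return new


def stationary(nu, size=80, iters=5000):
    """Approximate the stationary law by iterating from the uniform distribution."""
    pi = [1.0 / size] * size
    for _ in range(iters):
        pi = step(pi, nu)
    return pi


def drift(nu, i):
    """Mean one-step change of f_i = i from a state i >= 1 (condition 1 of the theorem)."""
    return nu * (i + 1) + (1.0 - nu) * (i - 1) - i


if __name__ == "__main__":
    for nu in (0.30, 0.31):
        # The drift is 2*nu - 1 < 0 and the leading stationary weights move only slightly with nu.
        print(nu, drift(nu, 5), [round(x, 4) for x in stationary(nu)[:3]])
```

For ν restricted to a compact subset of [0, 1/2) the drift 2ν − 1 is bounded away from zero uniformly in ν, which is the kind of uniformity the theorem asks for; in this toy case the stationary weights are geometric in ν/(1 − ν) and visibly continuous in ν.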
References

Aspandiiarov, S., Iasnogorodski, R., and Menshikov, M. (1993). On the passage-time moments for 2-dimensional Markov chains in wedges with the boundary reflection. Rapport de Recherche, INRIA.
Asymont, I., Iasnogorodski, R., and Menshikov, M. (1993). Random walks with asymptotically zero drifts. Rapport de Recherche, INRIA.
Asymont, I., Fayolle, G., and Menshikov, M. (1993). Random walks in a quarter plane with zero drifts. II: transience and recurrence. Rapport de Recherche, INRIA.
Fayolle, G. (1989). On random walk arising in queueing system: ergodicity and transience via quadratic forms as Lyapounov functions - part 1. Queueing Systems 5, 167–184.
Fayolle, G., Ignatyuk, I., Malyshev, V. A., and Menshikov, M. V. (1991). Random walks in two-dimensional complexes. Queueing Systems 9, 269–300.
Fayolle, G., Malyshev, V. A., and Menshikov, M. V. (1992). Random walks in a quarter plane with zero drifts. 1. Ergodicity and null recurrence. Annales de l'Institut Henri Poincaré (Probabilités et Statistique) 28, 179–194.
Fayolle, G., Malyshev, V. A., and Menshikov, M. V. (1993). Topics in the Constructive Theory of Countable Markov Chains (Part I). Cambridge University Press, in preparation.
Fayolle, G., Malyshev, V. A., Menshikov, M. V., and Sidorenko, A. F. (1991). Lyapounov functions for Jackson networks. Rapport de Recherche, INRIA.
Filonov, Yu. P. (1989). Ergodicity criteria for homogeneous discrete Markov chains. Ukrainian Mathematical Journal 41, 1421–1422.
Foster, F. G. (1953). On stochastic matrices associated with certain queueing processes. Annals of Mathematical Statistics 24, 355–360.
Ignatyuk, I. and Malyshev, V. A. (1991). Classification of random walks in $\mathbb{Z}^4_+$. Rapport de Recherche 1516, INRIA.
Lamperti, J. (1960). Criteria for the recurrence or transience of stochastic processes. Journal of Mathematical Analysis and Applications 1, 314–330.
Lamperti, J. (1963). Criteria for stochastic processes II. Passage time moments. Journal of Mathematical Analysis and Applications 7, 127–145.
Malyshev, V. A. (1972). Classification of two-dimensional random walks and almost linear semimartingales. Dokl. Akad. Nauk., USSR 202, 526–528.
Malyshev, V. A. (1991). Networks and dynamical systems. Rapport de Recherche 1468, INRIA.
Malyshev, V. A. and Menshikov, M. V. (1979). Ergodicity, continuity and analyticity of countable Markov chains. Transactions of the Moscow Mathematical Society 39, 3–48.
Menshikov, M. V. (1974). Ergodicity and transience conditions for random walks in the positive octant of space. Dokl. Akad. Nauk., USSR 217, 755–758.
Prohorov, Ju. V. (1956). Convergence of random processes and limit theorems of probability theory. Teor. Veroyatnost. i. Primenen 1, 117–238.
Varadhan, S. R. S. and Williams, R. J. (1985). Brownian motion in a wedge with oblique reflection. Communications in Pure and Applied Mathematics 38, 405–443.
Williams, R. J. (1985). Recurrence classification and invariant measure for reflected Brownian motion in a wedge. Annals of Probability 13, 758–778.
DISORDERED ISING SYSTEMS AND RANDOM CLUSTER REPRESENTATIONS

CHARLES M. NEWMAN∗
Courant Institute of Mathematical Sciences
New York University
251 Mercer Street
New York, NY 10012
U.S.A.
Abstract. We discuss the Fortuin–Kasteleyn (FK) random cluster representation for Ising models with no external field and with pair interactions which need not be ferromagnetic. In the ferromagnetic case, the close connections between FK percolation and Ising spontaneous magnetization and the availability of comparison inequalities to independent percolation have been applied to certain disordered systems, such as dilute Ising ferromagnets and quantum Ising models in random environments; we review some of these applications. For non-ferromagnetic disordered systems, such as spin glasses, the state of the art is much more primitive. We discuss some of the many open problems for spin glasses and show how the FK representation leads to one small result, that there is uniqueness of the spin glass Gibbs distribution above the critical temperature of the associated ferromagnet. Key words: FK representations, spin glasses, disordered Ising models, percolation.
∗ Supported in part by the National Science Foundation under Grant DMS 92–09053; thanks are due the Isaac Newton Institute for Mathematical Sciences for support and hospitality; NATO for its travel support to attend this Advanced Study Institute; and C. Borgs and J. Bricmont for help with references.

1. The FK Random Cluster Representation

In this section, we will briefly review the relation between Ising models, Fortuin–Kasteleyn (FK) random cluster models and independent percolation. FK models were introduced in Kasteleyn and Fortuin (1969), Fortuin and Kasteleyn (1972); more recent presentations may be found in Aizenman et al. (1988), Grimmett (1994). Our emphasis here will be on the version relevant for Ising systems with some ferromagnetic and some antiferromagnetic pair interactions; for more discussion of this sort, see Newman (1991). For simplicity, we will restrict attention to models in $\mathbb{Z}^d$ with nearest neighbor interactions. Since we will eventually apply the FK representation to disordered systems, we must allow our couplings to vary from bond to bond, in magnitude and in sign.

Let $\tilde{\mathbb{Z}}^d$ denote the set of nearest neighbor bonds of $\mathbb{Z}^d$; i.e., $\tilde{\mathbb{Z}}^d$ is the set of unordered pairs $b = \langle x, y \rangle = \langle y, x \rangle$ of sites $x, y$ in $\mathbb{Z}^d$ with Euclidean distance $\|x - y\| = 1$. The interactions, $J_b$, are real numbers indexed by $b$ in $\tilde{\mathbb{Z}}^d$ and the inverse temperature is a non-negative constant $\beta$. (When we consider disordered systems, the $J_b$'s will be random variables on some probability space $(\Omega, \mathcal{F}, P)$ and the present considerations will be relevant for each fixed $\omega \in \Omega$.) Given the $J_b$'s and $\beta$, we define parameters $p_b \in [0, 1)$ by the formula
$$p_b = 1 - e^{-\beta |J_b|}. \tag{1.1}$$
For $\Lambda$ a finite subset of $\mathbb{Z}^d$, the (volume $\Lambda$) Gibbs distribution (with free boundary conditions) for the Ising model is a probability measure on $\{-1, +1\}^{\Lambda}$ and the corresponding FK model distribution is a probability measure on $\{0, 1\}^{\tilde\Lambda}$, where $\tilde\Lambda$ denotes the set of bonds $b = \langle x, y \rangle$ with $x$ and $y$ in $\Lambda$. We regard these respectively as the probability distributions $\mu_s$ of $+1$ or $-1$ valued spin random variables $(S_x : x \in \Lambda)$ and $\mu_n$ of $0$ or $1$ valued bond occupation variables $(N_b : b \in \tilde\Lambda)$. These two measures are the marginal distributions (for their respective sets of variables) of a joint distribution $\mu$ on $\hat\Omega = \{-1, +1\}^{\Lambda} \times \{0, 1\}^{\tilde\Lambda}$ defined, in two steps, as follows.

Step 1. Let $\mu'$ be the joint distribution on $\{-1, +1\}^{\Lambda} \times \{0, 1\}^{\tilde\Lambda}$ of random variables $(S_x, N_b : x \in \Lambda,\ b \in \tilde\Lambda)$ which are all mutually independent with $P(S_x = +1) = P(S_x = -1) = \frac12$ and $P(N_b = 1) = p_b$.

Step 2. Let $\hat U$ be the event (regarded as a subset of $\hat\Omega$),
$$\hat U = \{\text{for all } b = \langle x, y \rangle \in \tilde\Lambda,\ J_b N_b S_x S_y \ge 0\} \tag{1.2}$$
and define $\mu$ to be $\mu'$ conditioned on $\hat U$; i.e.,
$$\mu(\cdot) = \mu'(\hat U)^{-1}\, \mu'(\cdot)\, 1_{\hat U}(\cdot). \tag{1.3}$$
It is an elementary exercise to show that the two marginal distributions are given explicitly by
$$\mu_s((s_x)) = Z_s^{-1} \exp\Big( \frac{\beta}{2} \sum_{\langle x, y \rangle \in \tilde\Lambda} J_{\langle x, y \rangle}\, s_x s_y \Big), \tag{1.4}$$
$$\mu_n((n_b)) = Z_n^{-1}\, 2^{\#((n_b))}\, \mu_n^{\mathrm{ind}}((n_b))\, 1_U((n_b)), \tag{1.5}$$
where $Z_s$ and $Z_n$ are normalization constants, $\#((n_b))$ denotes the number of clusters determined by $(n_b)$ (i.e., the number of connected components in the graph with vertex set $\Lambda$ and edge set $\{b \in \tilde\Lambda : n_b = 1\}$), $\mu_n^{\mathrm{ind}}$ is the Bernoulli product measure corresponding to independent occupation variables with $\mu_n^{\mathrm{ind}}(\{n_b = 1\}) = p_b$ for each $b$, and $U$ is the event in $\{0, 1\}^{\tilde\Lambda}$,
$$U = \{(n_b) : \text{there exists some choice of } (s_x : x \in \Lambda) \text{ so that } ((s_x), (n_b)) \in \hat U\}. \tag{1.6}$$
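The "elementary exercise" behind (1.4) and (1.5) can be checked by brute force on a very small graph. The sketch below builds the joint measure of Steps 1 and 2 by enumeration and compares its two marginals with the closed forms; the graph, the couplings and all function names are illustrative assumptions, not part of the text.

```python
from itertools import product
from math import exp


def compatible(s, n, bonds, J):
    """The conditioning event: J_b N_b S_x S_y >= 0 on every bond."""
    return all(Jb * nb * s[x] * s[y] >= 0 for (x, y), Jb, nb in zip(bonds, J, n))


def joint(n_sites, bonds, J, beta):
    """Independent fair spins and Bernoulli(p_b) bonds, conditioned as in Step 2."""
    p = [1.0 - exp(-beta * abs(Jb)) for Jb in J]
    w = {}
    for s in product((-1, 1), repeat=n_sites):
        for n in product((0, 1), repeat=len(bonds)):
            if compatible(s, n, bonds, J):
                weight = 0.5 ** n_sites
                for pb, nb in zip(p, n):
                    weight *= pb if nb else 1.0 - pb
                w[(s, n)] = weight
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}


def n_clusters(n_sites, bonds, n):
    """Number of connected components of the occupied-bond graph (union-find)."""
    parent = list(range(n_sites))

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    for (x, y), nb in zip(bonds, n):
        if nb:
            parent[find(x)] = find(y)
    return sum(1 for a in range(n_sites) if find(a) == a)


def normalize(w):
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}


def gibbs(n_sites, bonds, J, beta):
    """Right-hand side of (1.4)."""
    return normalize({s: exp(0.5 * beta * sum(Jb * s[x] * s[y] for (x, y), Jb in zip(bonds, J)))
                      for s in product((-1, 1), repeat=n_sites)})


def fk(n_sites, bonds, J, beta):
    """Right-hand side of (1.5): 2^clusters times the Bernoulli weight times the indicator of U."""
    p = [1.0 - exp(-beta * abs(Jb)) for Jb in J]
    w = {}
    for n in product((0, 1), repeat=len(bonds)):
        in_U = any(compatible(s, n, bonds, J) for s in product((-1, 1), repeat=n_sites))
        weight = 2.0 ** n_clusters(n_sites, bonds, n) if in_U else 0.0
        for pb, nb in zip(p, n):
            weight *= pb if nb else 1.0 - pb
        w[n] = weight
    return normalize(w)


def marginals(mu):
    """Spin and bond marginals of a joint measure given as a dict keyed by (s, n)."""
    ms, mn = {}, {}
    for (s, n), v in mu.items():
        ms[s] = ms.get(s, 0.0) + v
        mn[n] = mn.get(n, 0.0) + v
    return ms, mn
```

On a frustrated triangle, for instance `bonds = [(0, 1), (1, 2), (0, 2)]` with `J = [1.0, 1.0, -1.0]`, the fully occupied configuration lies outside U and receives weight zero in `fk`, while both marginals of `joint` agree with `gibbs` and `fk` up to rounding.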
The formula (1.4) is standard for an Ising model Gibbs distribution. Likewise (1.5) is standard for the FK model in the ferromagnetic case ($J_b \ge 0$ for all $b$), since then $U = \{0, 1\}^{\tilde\Lambda}$ (by taking $s_x \equiv +1$ or $\equiv -1$ in (1.6)). FK models for non-ferromagnetic interactions are less well known; the first published reference we are aware of is Kasai and Okiji (1988) (see also Swendsen and Wang 1987, Edwards and Sokal 1988, Newman 1991). Here $U$, which is typically not all of $\{0, 1\}^{\tilde\Lambda}$, may be thought of as the set of 'unfrustrated' bond occupation configurations. This term, borrowed from the spin glass literature, simply means that for the Ising Hamiltonian restricted to occupied bonds,
$$H_{(n_b)}((s_x)) = \frac12 \sum_{b = \langle x, y \rangle \in \tilde\Lambda} (-J_b n_b s_x s_y), \tag{1.7}$$
there is some spin configuration $(s_x)$ which simultaneously minimizes each summand.

A key feature of the measure $\mu$, given by (1.3), is that the conditional distribution, $\mu((s_x) \mid (n_b))$, for the $S_x$'s given the $N_b$'s is particularly simple: consider the clusters determined by the given $(n_b)$. Any two sites $u, v$ in the same cluster (which we write as $u \leftrightarrow v$) are connected by a path of occupied bonds (with non-zero interactions on every edge) which, because of the conditioning on $\hat U$ in (1.3), requires that $S_u = \eta_{u,v} S_v$, where $\eta_{u,v}((n_b))$ is the product of the signs of the $J_b$'s along the occupied path between $u$ and $v$. Two different paths will give the same $\eta$ providing $(n_b) \in U$. For future use, we extend the definition of $\eta_{u,v}((n_b))$ to be $0$ if $u$ and $v$ are not in the same cluster for the given $(n_b)$. Thus the relative signs of all the spin variables in a single $(n_b)$-cluster are determined by $(n_b)$ but the spin of any single variable may be either $+1$ or $-1$. The conditional distribution $\mu((s_x) \mid (n_b))$ corresponds to making the $\pm 1$ choices for each $(n_b)$-cluster by independent flips of a fair coin. (The conditional distribution $\mu((n_b) \mid (s_x))$ is also very simple (Swendsen and Wang 1987), but we will not make use of that.)

Expressing $\mu$ as the product of the marginal $\mu_n$ and the above conditional allows one to express $\mu_s$ expectations (which we write $E_s$) in terms of $\mu_n$ expectations (which we write $E_n$). This is the sense in which the FK model gives a representation of the Ising model. For example,
$$E_s(S_u S_v) = E_n(\eta_{u,v}), \tag{1.8}$$
which, in the ferromagnetic case (where $\eta_{u,v}$ can only be $+1$ or $0$) becomes the well known formula
$$E_s(S_u S_v) = \mu_n(u \leftrightarrow v). \tag{1.9}$$

To continue our presentation, we now introduce boundary conditions. The simplest type of boundary condition is an assignment $\bar s = (\bar s_z)$ of $\pm 1$ spin values to the sites $z$ in $\partial\Lambda$, the set of sites outside of $\Lambda$ which are nearest neighbors of sites in $\Lambda$. Here it is convenient to replace $\Lambda$ by $\Lambda^* = \Lambda \cup \partial\Lambda$ and $\tilde\Lambda$ by $\tilde\Lambda^*$, the union of $\tilde\Lambda$ and bonds $\langle x, y \rangle$ with $x \in \Lambda$ and $y \in \partial\Lambda$; i.e., $\mu$ will be replaced by a measure $\mu^{\bar s}$ on $\hat\Omega^* = \{-1, +1\}^{\Lambda^*} \times \{0, 1\}^{\tilde\Lambda^*}$. The definition of $\mu^{\bar s}$ is just like that of $\mu$, except that in Step 1, $S_x$ is set to $\bar s_x$ for each $x \in \partial\Lambda$. In the formulas for the marginal distributions, (1.4) is replaced by the usual Ising model Gibbs distribution formula with boundary condition $\bar s$, while (1.5) remains essentially the same. We note however that in the definition of $U$ (and $\hat U$) the spins in $\partial\Lambda$ are always fixed by $\bar s$, and further that $\#((n_b))$ only counts clusters which do not touch $\partial\Lambda$ (or equivalently for the definition of $\mu_n$, counts all clusters touching the boundary as a single cluster). Note that even in the ferromagnetic case, $U$ is generally not all of $\{0, 1\}^{\tilde\Lambda^*}$ since occupied paths of $J_b > 0$ bonds are not allowed to connect the $\bar s_z = +1$ and $\bar s_z = -1$ parts of the boundary. Of course $U$ will be all of $\{0, 1\}^{\tilde\Lambda^*}$ in the ferromagnetic case if $\bar s_z \equiv +1$ or $\bar s_z \equiv -1$; the resulting marginal distributions are denoted $\mu_s^+$, $\mu_s^-$ and (for either $+1$ or $-1$) $\mu_n^w$ ($w$ for 'wired'). The conditional distribution $\mu((s_x) \mid (n_b))$ remains as it was in the free boundary condition case except that no coin is tossed for clusters touching the boundary since their spin values are already determined by $(n_b)$ and $\bar s$ (and the signs of the $J_b$'s). In the ferromagnetic case, the (finite volume) magnetization at site $u$ (in $\Lambda$) is then
$$E_s^+(S_u) = \mu_n^w(u \leftrightarrow \partial\Lambda) = -E_s^-(S_u), \tag{1.10}$$
where of course $u \leftrightarrow \partial\Lambda$ means that the $(n_b)$-cluster containing the site $u$ touches the boundary.

Here are some easily derived comparison inequalities. For a given $\Lambda$, write $\mu^{\mathrm{ind}}_{n,(p_b)}$ and $\mu^{w,F}_{n,(p_b)}$ to denote the two probability measures on $\{0, 1\}^{\tilde\Lambda^*}$ given respectively as the Bernoulli product measure with parameters $(p_b)$, and as the wired b.c. ferromagnetic ($J_b \ge 0$) FK-measure with the same parameters. Write $\mu_1 \preceq \mu_2$ to denote stochastic ordering between measures; i.e., to denote that $\int f \, d\mu_1 \le \int f \, d\mu_2$ for any coordinate-wise increasing real function $f$ on $\{0, 1\}^{\tilde\Lambda^*}$. Then
$$\mu^{\mathrm{ind}}_{n,(h(p_b))} \preceq \mu^{w,F}_{n,(p_b)} \preceq \mu^{\mathrm{ind}}_{n,(p_b)}, \tag{1.11}$$
where
$$h(p_b) = \frac{p_b}{2(1 - p_b) + p_b}. \tag{1.12}$$
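A quick numerical sanity check of the two-sided comparison (1.11) is possible on a tiny graph. The sketch below is an illustration under stated assumptions rather than the construction of the text: it uses free boundary conditions (for which the same comparison is a standard fact) instead of wired ones, a uniform parameter p on every bond, and the number of occupied bonds as the increasing function f. All names are hypothetical.

```python
from itertools import product


def h(p):
    """The lower comparison density of (1.12); algebraically equal to p / (2 - p)."""
    return p / (2.0 * (1.0 - p) + p)


def count_clusters(n_sites, bonds, n):
    """Connected components of the graph whose edges are the occupied bonds."""
    parent = list(range(n_sites))

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    for (x, y), nb in zip(bonds, n):
        if nb:
            parent[find(x)] = find(y)
    return sum(1 for a in range(n_sites) if find(a) == a)


def fk_mean_open(n_sites, bonds, p):
    """Expected number of occupied bonds under the free-b.c. ferromagnetic FK measure."""
    num = den = 0.0
    for n in product((0, 1), repeat=len(bonds)):
        k = sum(n)
        w = 2.0 ** count_clusters(n_sites, bonds, n) * p ** k * (1.0 - p) ** (len(bonds) - k)
        num += w * k
        den += w
    return num / den


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    for p in (0.2, 0.5, 0.8):
        # Bernoulli(h(p)) mean, FK mean, Bernoulli(p) mean: expected to be non-decreasing.
        print(p, 3 * h(p), fk_mean_open(3, triangle, p), 3 * p)
```

On a single bond the FK mean coincides with h(p), so the lower bound cannot be improved in general.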
These inequalities can be derived using only the fact (Harris 1960) that for the independent percolation measure $\mu_n^{\mathrm{ind}}$, increasing functions $f$ and $g$ are positively correlated. Using the facts that these FKG inequalities are also valid for $\mu_n^{w,F}$ and further that the density of (a non-ferromagnetic) $\mu^{\bar s}_{n,(p_b)}$ with respect to $\mu^{w,F}_{n,(p_b)}$ is, according to (1.5), proportional to the decreasing function $1_U$, it follows that
$$\mu^{\bar s}_{n,(\beta J_b)} \preceq \mu^{w,F}_{n,(p_b)}; \tag{1.13}$$
here we use (βJb ) as a subscript on the left-hand side because of the dependence on the signs of the βJb ’s (and not just on their magnitudes through the pb ’s). We note that the obvious analogue of (1.13) is valid when both sides have free boundary conditions; analogues involving more general boundary conditions will be discussed in Section 3 below. One consequence of inequalities such as (1.13) is that the spin correlations of a non-ferromagnetic Ising model are dominated by those of the associated ferromagnet. This domination was already noted (in a homework problem) by Griffiths (1971). In Section 3 we derive some other consequences.
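Both the representation formula (1.8) and the domination of spin correlations by those of the associated ferromagnet can be checked by enumeration on a small graph with free boundary conditions. The following is a minimal sketch under assumptions of my own choosing (a 4-cycle with one negative coupling; hypothetical function names); it computes the left side of (1.8) from (1.4) and the right side from (1.5).

```python
from itertools import product
from math import exp


def propagate(n_sites, bonds, J, n):
    """Relative signs inside the clusters of occupied bonds.

    Returns (label, sign), where label[x] is the cluster root of x and sign[x]
    is the spin of x relative to its root, or None if the configuration is frustrated.
    """
    label = [None] * n_sites
    sign = [0] * n_sites
    for root in range(n_sites):
        if label[root] is not None:
            continue
        label[root], sign[root] = root, 1
        stack = [root]
        while stack:
            x = stack.pop()
            for (a, b), Jb, nb in zip(bonds, J, n):
                if not nb or Jb == 0 or x not in (a, b):
                    continue
                y = b if x == a else a
                want = sign[x] * (1 if Jb > 0 else -1)
                if label[y] is None:
                    label[y], sign[y] = root, want
                    stack.append(y)
                elif sign[y] != want:
                    return None  # two occupied paths disagree: (n_b) is not in U
    return label, sign


def fk_expect_eta(n_sites, bonds, J, beta, u, v):
    """E_n(eta_{u,v}) under the FK measure (1.5) with free boundary conditions."""
    p = [1.0 - exp(-beta * abs(Jb)) for Jb in J]
    num = den = 0.0
    for n in product((0, 1), repeat=len(bonds)):
        res = propagate(n_sites, bonds, J, n)
        if res is None:
            continue
        label, sign = res
        w = 2.0 ** len(set(label))
        for pb, nb in zip(p, n):
            w *= pb if nb else 1.0 - pb
        num += w * (sign[u] * sign[v] if label[u] == label[v] else 0)
        den += w
    return num / den


def ising_corr(n_sites, bonds, J, beta, u, v):
    """E_s(S_u S_v) under the Gibbs distribution (1.4)."""
    num = den = 0.0
    for s in product((-1, 1), repeat=n_sites):
        w = exp(0.5 * beta * sum(Jb * s[x] * s[y] for (x, y), Jb in zip(bonds, J)))
        num += w * s[u] * s[v]
        den += w
    return num / den
```

With `bonds = [(0, 1), (1, 2), (2, 3), (3, 0)]` and `J = [1, 1, 1, -1]`, the two functions agree for every pair (u, v), and the absolute value of the result does not exceed the value obtained after replacing each coupling by its absolute value.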
2. The Phase Transition for Dilute Ferromagnets

Throughout this section we restrict attention to ferromagnetic Ising and FK models. For fixed $(J_b : b \in \tilde{\mathbb{Z}}^d)$, we denote in this section by $\mu_s^+$ and $\mu_n^w$ the infinite volume limits ($\Lambda \to \mathbb{Z}^d$) of the corresponding finite $\Lambda$ measures defined in the last section; these limits are known to exist by various monotonicity in $\Lambda$ arguments, based on the ferromagnetic nature of the interactions. The infinite volume limit of (1.10) is
$$E_s^+(S_u) = \mu_n^w(u \leftrightarrow \infty), \tag{2.1}$$
where $u \leftrightarrow \infty$ denotes the event that the $(n_b)$-cluster containing site $u$ is infinite. Other arguments (Lebowitz and Martin-Löf 1972), based on FKG inequalities, show that ferromagnetic Ising models have a unique infinite volume Gibbs distribution if and only if $E_s^+(S_u) = 0$ for all $u$ and thus by (2.1) if and only if
$$\mu_n^w(\text{some cluster is infinite}) = 0. \tag{2.2}$$
Thus, for ferromagnetic systems, a phase transition for the Ising model (in the sense of a transition from unique to multiple infinite volume Gibbs distributions) is precisely equivalent to a percolation phase transition for the corresponding wired b.c. FK measure. We denote by $\beta_c = \beta_c((J_b : b \in \tilde{\mathbb{Z}}^d))$ the critical inverse temperature for this phase transition.

For the remainder of this section, we follow the analysis of Aizenman et al. (1987) which shows how this fact may be combined with the comparison inequalities (1.11) to yield an elegant analysis of disordered, but still ferromagnetic, Ising models. In these models, the interactions $(J_b : b \in \tilde{\mathbb{Z}}^d)$ will be non-negative i.i.d. random variables on some probability space $(\Omega, \mathcal{F}, P)$. The critical inverse temperature $\beta_c$ does not depend on any finite number of the $J_b$'s and hence, by the Kolmogorov zero–one law, is a.s. a constant. Let us denote the density of active bonds by $p' = P(J_b \ne 0)$ (which we assume is strictly positive) and denote the critical value for standard nearest neighbor independent bond percolation on $\mathbb{Z}^d$ by $p_c$. The percolation probability (for the independent model) is
$$\theta(p) = \mu^{\mathrm{ind}}_{n,(p)}(u \leftrightarrow \infty), \tag{2.3}$$
where $\mu^{\mathrm{ind}}_{n,(p)}$ denotes the Bernoulli product measure on $\{0, 1\}^{\tilde{\mathbb{Z}}^d}$ corresponding to the independent percolation model with occupation density $p$. We recall that by the definition of $p_c$, $\theta(p) = 0$ for $p < p_c$ and $\theta(p) > 0$ for $p > p_c$, but there is still no proof that $\theta(p_c) = 0$ for all $d$ ($\ge 2$).

There are two facts about the dependence of $\beta_c$ on the distribution of $J_b$ (including its dependence on $p'$) which are easily derived without use of the FK representation: first, that if ($d \ge 2$ and) $P(J_b < \varepsilon) = 0$ for some $\varepsilon > 0$, then $\beta_c < \infty$, and second that if $\theta(p') = 0$ then $\beta_c = \infty$ (i.e., there is (a.s.) a unique infinite volume Ising model Gibbs distribution for any $\beta < \infty$). The next theorem, based on the FK representation, improves these results considerably. It was used in Aizenman et al. (1987) primarily to analyze the rate of divergence of $\beta_c(p')$ as $p' \downarrow p_c$ in the classic dilute ferromagnet, where $J_b$ takes on only the values 0 and 1.
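For the classic dilute ferromagnet the two constants of Theorem 1 below are explicit, and the theorem then confines the critical inverse temperature to an interval. The following minimal sketch assumes that the couplings take the values 0 and 1 with P(J_b = 1) equal to the active density; the function names and the idea of returning an interval are illustrative assumptions, and the value of p_c must be supplied by the caller (p_c = 1/2 for bond percolation on the square lattice).

```python
from math import exp, log


def p_upper(beta, p_active):
    """The constant E(1 - exp(-beta J_b)) of (2.4) when J_b is 0 or 1."""
    return p_active * (1.0 - exp(-beta))


def p_lower(beta, p_active):
    """The constant E(h(1 - exp(-beta J_b))) of (2.4) when J_b is 0 or 1."""
    q = 1.0 - exp(-beta)
    return p_active * q / (2.0 * exp(-beta) + q)


def beta_c_bounds(p_active, p_c):
    """Interval (lo, hi) containing beta_c, or None when the active density is at most p_c.

    lo solves p_upper = p_c (uniqueness holds below it) and hi solves
    p_lower = p_c (non-uniqueness holds above it).
    """
    if p_active <= p_c:
        return None
    lo = -log(1.0 - p_c / p_active)
    q = 2.0 * p_c / (p_active + p_c)   # since 2 exp(-beta) + q = 2 - q
    hi = -log(1.0 - q)
    return lo, hi


if __name__ == "__main__":
    for p_active in (1.0, 0.75, 0.55, 0.501):
        # Both ends of the interval grow without bound as the active density decreases to p_c.
        print(p_active, beta_c_bounds(p_active, 0.5))
```

With every bond active on the square lattice the interval is (log 2, log 3), which contains log(1 + √2), the Onsager value of the critical point under the convention of (1.4), where the pair interaction carries the factor β/2.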
Theorem 1. (Aizenman et al. 1987) For a given distribution of $J_b$ and value of $\beta$, define two constants:
$$\bar p = E(1 - e^{-\beta J_b}), \qquad \underline{p} = E\left( \frac{1 - e^{-\beta J_b}}{2 e^{-\beta J_b} + 1 - e^{-\beta J_b}} \right). \tag{2.4}$$
The infinite volume Ising model Gibbs distribution is (a.s.) unique if $\bar p < p_c$ and is (a.s.) non-unique if $\underline{p} > p_c$. Thus $\beta_c < \infty$ if and only if $p' \equiv P(J_b \ne 0) > p_c$.

Proof. According to the FK-percolation criterion for the Ising phase transition, we need to show that the non-percolation property (2.2) for $\mu_n^w = \mu^w_{n,(p_b)}$ is (a.s.) valid when $\bar p < p_c$ and (a.s.) invalid when $\underline{p} > p_c$. (We then leave the proof of the last statement of the theorem to the reader.) Let us consider the probability measure
$$\tilde\mu_n = E(\mu^w_{n,(p_b)}), \tag{2.5}$$
where as usual $E$ denotes expectation with respect to the probability measure $P$ for the $J_b$'s; $\tilde\mu_n$ represents the marginal distribution of the FK bond occupation variables when the $J_b$'s are not conditioned on. It suffices to show that
$$\tilde\mu_n(\text{some cluster is infinite}) = \begin{cases} 0, & \text{if } \bar p < p_c, \\ 1, & \text{if } \underline{p} > p_c, \end{cases} \tag{2.6}$$
since this implies the corresponding identity for $P$-a.e. $\mu^w_{n,(p_b)}$. Now we use (the infinite volume limit of) the comparison inequalities (1.11) and average them over the $(p_b)$'s to obtain
$$\mu^{\mathrm{ind}}_{n,(\underline{p})} = E\, \mu^{\mathrm{ind}}_{n,(h(p_b))} \preceq \tilde\mu_n \preceq E\, \mu^{\mathrm{ind}}_{n,(p_b)} = \mu^{\mathrm{ind}}_{n,(\bar p)}, \tag{2.7}$$
from which the proof is easily completed. The equalities of (2.7) are basically trivial; e.g., in the $J_b = 0$ or $1$ case, they may be restated as follows. If bonds are independently declared active with probability $p'$ and then active bonds are independently declared occupied with probability $p$, the resulting occupied bonds form an independent percolation model with occupation probability $p p'$.

To complete this section we briefly mention another type of disordered ferromagnet where an FK representation has been used (Aizenman et al. 1993). This is the quantum Ising model on $\mathbb{Z}^d$ with random couplings and a random transverse field. A Feynman–Kac type approach (see Aizenman and Nachtergaele 1993) represents this quantum model in terms of a classical Ising model, where the Ising spin variables are indexed by $\mathbb{Z}^d \times \mathbb{R}$. The disorder in this representation remains $d$-dimensional, so that nearest neighbor couplings, which depend (randomly) on the location in $\mathbb{Z}^d$, do not depend on the $\mathbb{R}$-coordinate (the 'time'). There is an FK representation for this classical Ising model which is related to independent percolation models on $\mathbb{Z}^d \times \mathbb{R}$ in essentially the same way as in the discrete index setting. These percolation models are related to the graphical representation of contact processes in the same way as ordinary percolation is related to oriented percolation; in particular the $d$-dimensional disorder corresponds exactly to the random environment natural for a contact process on $\mathbb{Z}^d$. We note that by using the natural string of identities and comparison inequalities provided by the FK representation, one new result for the original disordered quantum model is shown in Aizenman et al. (1993) to follow from a known result of Liggett (1992) about the contact process in a random environment. [Warning: In Section 4 of Aizenman et al. (1993), the definition of $B$ being an encounter region should be modified to include the requirement that the event $G_B$ occurs; without this change, the combinatorial part of the proof of Prop. 4.1 there (uniqueness of the infinite cluster) would be incorrect.]

3. Spin Glasses: Results on Uniqueness

The spin glass models we will consider (Edwards and Anderson 1975) are Ising models with nearest neighbor interactions $(J_b : b \in \tilde{\mathbb{Z}}^d)$ which are i.i.d. symmetric random variables on some $(\Omega, \mathcal{F}, P)$, as likely to be negative as positive. A good special case to keep in mind is where $J_b = +1$ or $-1$, each with probability $\frac12$. A standard review article on spin glasses is Binder and Young (1986). In the next section we will discuss the open problem of proving that for $d$ and $\beta$ sufficiently large, there is (a.s.) non-uniqueness of the Gibbs distribution for such models; here we consider the converse issue.

It seems generally accepted in the physics literature (see Binder and Young 1986) that, for $d = 2$, there should be (a.s.) a unique Gibbs distribution for all $\beta < \infty$ (at least for reasonable distributions of the $J_b$'s such as Gaussian or the $\pm 1$ valued case). It is an interesting open problem to prove this conjecture; it should be noted that an analogous result was proved for the $d = 2$ random field Ising model in Aizenman and Wehr (1990). Since we are unable to resolve the $d = 2$ large $\beta$ problem, we will instead show how FK methods lead to some small progress on the rather less interesting issue of general $d$ and moderate values of $\beta$.
First we note that in any dimension, uniqueness for sufficiently small $\beta$ can be proved (under some restrictions on the distribution of the $J_b$'s) by Dobrushin–Shlosman techniques, which are insensitive to the signs of the interactions (see Dobrushin 1968 and Dobrushin and Shlosman 1985). Let us denote by $\beta_c^F$ the critical inverse temperature for the associated disordered ferromagnet in which each spin glass interaction $J_b$ is replaced by $|J_b|$. (For the case of $\pm 1$ valued $J_b$'s, this ferromagnet will of course not be disordered.) It seems intuitively clear that uniqueness for the ferromagnet should imply uniqueness for the spin glass (in particular, for $\beta < \beta_c^F$), but this does not seem to follow from the above mentioned techniques. We will now show that such a result can be derived by FK techniques. The result in fact has nothing to do with disordered systems at all:

Theorem 2. For a given set of real valued interactions $\{J_b : b \in \tilde{\mathbb{Z}}^d\}$, uniqueness of the infinite volume Gibbs distribution at inverse temperature $\beta$ for the associated ferromagnetic interactions, $\{|J_b| : b \in \tilde{\mathbb{Z}}^d\}$, implies uniqueness at the same $\beta$ for the original interactions.

Proof. The proof uses the FK representation, the comparison inequality (1.13) and a coupling argument (based on a generalization of (1.13)). All but the coupling have been discussed previously (see Newman 1991) and that argument is similar to one used recently by van den Berg and Maes (1992). Let $S_A = \prod_{u \in A} S_u$ for any fixed finite $A \subset \mathbb{Z}^d$ and let $\Lambda$ be a (varying) finite subset of $\mathbb{Z}^d$ containing $A$; it suffices to show that for the original set of interactions, any two choices of boundary conditions $\bar s = \bar s(\Lambda)$ and $\bar s' = \bar s'(\Lambda)$ have
$$E^{\bar s}_{s,\Lambda}(S_A) - E^{\bar s'}_{s,\Lambda}(S_A) \to 0 \quad \text{as} \quad \Lambda \to \mathbb{Z}^d. \tag{3.1}$$
Here we have added a subscript to indicate dependence on $\Lambda$. The idea is to express each of the two expectations as (asymptotically) the same mixture (over regions $\hat\Lambda$) of free boundary condition expectations. We begin by noting that it easily follows from the FK representation that for a given $\hat\Lambda \subset \Lambda$, conditioned on the bond occupation variables other than those entirely in $\hat\Lambda$, if $n_b = 0$ for every $b$ between $\hat\Lambda$ and $\partial\hat\Lambda$, then the conditional distribution for the bonds and spins in $\hat\Lambda$ is just the volume $\hat\Lambda$ measure with free boundary conditions.

Now for the $\Lambda$ in question, we wish to couple the measures $\mu^{\bar s}_{n,\Lambda}$ for the original interactions with b.c. $\bar s$ and $\mu^{w,F}_{n,\Lambda}$ for the ferromagnetic interactions $|J_b|$ and wired boundary condition; i.e., realize the corresponding variables $N_b^{\bar s}$ and $N_b^{w,F}$ on the same probability space $(\Omega', \mathcal{F}', P')$. Let us denote by $C_\Lambda^w$ the $N_b^{w,F}$-cluster of $\partial\Lambda$, i.e., the set of sites in $\Lambda$ which are connected to $\partial\Lambda$ by a path of bonds with $N_b^{w,F} = 1$. We need our coupling to have two properties. First, that if $\Lambda \setminus C_\Lambda^w = \hat\Lambda \ne \emptyset$, then $N_b^{\bar s} = 0$ for every $b$ between $\hat\Lambda$ and $\partial\hat\Lambda$; this would follow from the pointwise domination $N_b^{\bar s}(\omega') \le N_b^{w,F}(\omega')$. Second, that conditional on $\Lambda \setminus C_\Lambda^w = \hat\Lambda \ne \emptyset$, the conditional distribution of $(N_b^{\bar s} : b = \langle x, y \rangle \text{ with } x, y \in \hat\Lambda)$ is the free boundary condition FK measure on $\hat\Lambda$. It is a standard fact that the first property follows from (1.13); we claim that both properties can be had simultaneously by a sequential construction (i.e., one bond $b$ at a time) using (1.13) and a family of analogous inequalities involving more general boundary conditions, which we now discuss. Further details needed to justify our claim are left to the reader; we note that some care should be taken in choosing the (random) order of bonds in the construction so that the second property needed for the coupling will be valid. We also note that the coupling construction is quite similar to the one used by van den Berg and Maes (1992).
First we note that two fixed spin boundary conditions related by an overall spin flip, $\bar s$ and $-\bar s$, give rise to exactly the same FK measure. Thus the boundary conditions appearing in (1.13) are really defined by an assignment of relative signs to the sites on the boundary. The generalized boundary conditions we consider are as follows. Let $\Lambda$ be a finite subset of $\mathbb{Z}^d$ and let $\tilde\Lambda_0$ be a non-empty subset of $\tilde\Lambda$, the set of nearest neighbor bonds between sites in $\Lambda$. A general boundary condition $\theta$ for the FK measure $\mu^{\theta}_{n,(\beta J_b)}$ on $\{0, 1\}^{\tilde\Lambda_0}$ is specified by a partition of $\Lambda$ into non-empty subsets $\Lambda_1, \Lambda_2, \ldots, \Lambda_m$ and an assignment $t_i$ of relative signs to the sites within each $\Lambda_i$ with at least two sites. The formula for this measure is given by (1.5), where $\#((n_b))$ treats all sites in $\Lambda_i$ as already being in the same cluster, for each $i$, and where the definition (1.6) for $U$ is modified to allow only $(s_x : x \in \Lambda)$ which respect all the $t_i$'s. The free boundary condition case corresponds to the partition of $\Lambda$ entirely into individual sites (so no $t_i$'s are assigned) while the $\bar s$ boundary condition on the boundary $\partial\Gamma$ of some region $\Gamma$ corresponds to taking $\Lambda = \Gamma^*$ ($= \Gamma \cup \partial\Gamma$), $\tilde\Lambda_0 = \tilde\Gamma^*$ (see the definition following (1.9) above), $\Lambda_1 = \partial\Gamma$, all other $\Lambda_i$'s as individual sites of $\Gamma$ and finally $t_1$ as the relative sign assignment given by $\bar s$.
The analogues of (1.13) are as follows, for a given $\Lambda$ and $\tilde\Lambda_0$. Let $\kappa$ be a boundary condition in which each $t_i$ is the assignment that all sites in $\Lambda_i$ have the same sign. The ferromagnetic FK measure with such a 'partially wired' boundary condition, $\mu^{\kappa,F}_{n,(p_b)}$, still obeys the FKG inequalities. Now let $\kappa'$ be another boundary condition whose partition of $\Lambda$ into $\Lambda'_1, \Lambda'_2, \ldots$ is a refinement of the $\kappa$ partition into $\Lambda_1, \Lambda_2, \ldots$ (i.e., each $\Lambda_j$ is a union of some of the $\Lambda'_i$'s). Then, with no restriction on the assignments $t'_1, t'_2, \ldots$ for $\kappa'$, we claim that
$$\mu^{\kappa'}_{n,(\beta J_b)} \preceq \mu^{\kappa,F}_{n,(p_b)}. \tag{3.2}$$
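To verify this domination, note that the density of the left-hand side with respect to the right-hand side is proportional to $1_U \cdot 2^{\#' - \#}$ (with $\#'$ and $\#$ the cluster counts under $\kappa'$ and $\kappa$ respectively), which is a decreasing function because both $1_U$ and $\#' - \#$ are decreasing. The domination then follows from the FKG inequalities for the right-hand side.

By using our coupled bond measure and then constructing Ising spins by the usual coin tossing procedure for the $N_b^{\bar s}$-clusters, we may express the b.c. $\bar s$ expectation as mostly a mixture of free b.c. expectations:
$$E^{\bar s}_{s,\Lambda}(S_A) = \sum_{\Lambda \supseteq \hat\Lambda \supset A} \mu^{w,F}_{n,\Lambda}(\Lambda \setminus C_\Lambda^w = \hat\Lambda) \cdot E_{s,\hat\Lambda}(S_A) + \mu^{w,F}_{n,\Lambda}(A \leftrightarrow \partial\Lambda)\, \theta, \tag{3.3}$$
where $\theta$ is the conditional expectation of $S_A$ given that the $N_b^{w,F} = 1$ bonds connect $A$ to $\partial\Lambda$. A similar formula will be valid for the other b.c. $\bar s'$ with the same right-hand side except that $\theta$ will be replaced by $\theta'$. Since $|\theta|, |\theta'| \le 1$ (because $S_A = \pm 1$), (3.1) will follow from
$$\mu^{w,F}_{n,\Lambda}(A \leftrightarrow \partial\Lambda) \to 0 \quad \text{as} \quad \Lambda \to \mathbb{Z}^d, \tag{3.4}$$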
for fixed $A$. But this is equivalent to the lack of percolation in the infinite volume limit of $\mu^{w,F}_{n,\Lambda}$, which is equivalent to uniqueness of the ferromagnetic infinite volume Gibbs distribution, as explained at the beginning of Section 2.

Remarks. The proof of Theorem 2 immediately yields the following specific bound on the influence of boundary conditions:
$$\text{for } A \subset \Lambda, \quad |E^{\bar s}_{s,\Lambda}(S_A) - E^{\bar s'}_{s,\Lambda}(S_A)| \le 2 \sum_{x \in A} E^{+,F}_{s,\Lambda}(S_x). \tag{3.5}$$
Essentially the same arguments show that at any $\beta$ where there is a unique infinite volume Gibbs distribution $\mu_s^F$ for $(\beta |J_b|)$, the correlation decay properties of the unique Gibbs distribution $\mu_s$ for $(\beta J_b)$ are controlled by:
$$|E_s(S_A S_B) - E_s(S_A) E_s(S_B)| \le \sum_{x \in A} \sum_{y \in B} E_s^F(S_x S_y). \tag{3.6}$$
Finally, we remark that for i.i.d. Jb ’s with an arbitrary (not necessarily symmetric) common distribution, an explicit high temperature regime with a.s. uniqueness and exponential decay of correlations can be easily obtained by combining (i) (3.5)–(3.6), (ii) the Aizenman et al. (1987) analysis of random ferromagnets (see Theorem 1 and especially (2.7) above), (iii) known results about standard (subcritical) independent percolation and (iv) some elementary arguments.
4. Spin Glasses: Non-Results on Non-Uniqueness

As mentioned previously, it is an open problem to prove that for spin glasses in sufficiently high dimensions, there are (a.s.) multiple infinite volume Gibbs distributions for large $\beta$. (See Reger et al. 1990 for numerical results with $d = 4$.) The proof of Theorem 2 suggests (but does not prove) that a necessary condition for non-uniqueness of Gibbs distributions is the occurrence of percolation in infinite volume FK measures. We note that for non-ferromagnetic models, there is neither a specially distinguished boundary condition (such as wired in the ferromagnetic case) nor a guarantee of a single infinite volume limit (as $\Lambda \to \mathbb{Z}^d$) for any particular choice of boundary conditions. Thus, to consider percolation or any other property of infinite volume FK measures, we must worry about what procedure we use to specify an infinite volume FK measure for each (or almost each) $\omega$ in the probability space for the $J_b$'s. This same issue will arise again when we discuss later the various types of non-uniqueness which could occur in spin glasses.

For a finite volume $\Lambda$, we will expand the types of boundary conditions we considered in Section 1 (for spins or bonds or for both together) beyond free and specific $\bar s$ to include also a mixture over choices of $\bar s$ (for the given $\Lambda$); also if $\Lambda$ is a cube (or rectangular parallelepiped) we will allow periodic (or antiperiodic in one or more coordinate directions) boundary conditions. We will often have cause to distinguish between boundary conditions which do or do not depend on $\omega$; of course the boundary condition will in general depend on $\Lambda$. When taking subsequence limits as $\Lambda \to \mathbb{Z}^d$, we will also distinguish between subsequences which do or do not depend on $\omega$.
One way to obtain an infinite volume spin and bond measure for almost every $\omega$ is to choose a (measurably) $\omega$-dependent boundary condition for each $\Lambda$, consider the resulting joint distribution (for a given $\beta$) on $\Omega \times \{\text{spin and bond configurations}\}$ and take some subsequence limit which will be a measure, denoted by $P^*$, on $\Omega \times \{+1, -1\}^{\mathbb{Z}^d} \times \{0, 1\}^{\tilde{\mathbb{Z}}^d}$ (with the natural product σ-field). (It is best at this stage to assume that $\Omega$ is the canonical product space $\mathbb{R}^{\tilde{\mathbb{Z}}^d}$.) By construction, the subsequence of $\Lambda$'s chosen does not depend on $\omega$ here. Clearly the marginal distribution of $P^*$ on $\Omega$ will just be $P$ and it is not difficult to see that for $P$-a.e. $\omega$, the conditional distribution $\mu_s^\omega$ of the spin variables (given $\omega$) will be an infinite volume Gibbs distribution for the interactions $(J_b(\omega))$. We will focus attention first on the conditional distribution of the bond occupation variables $\mu_n^\omega$ (given $\omega$), which is our infinite volume FK-measure. The next theorem is a slight extension of a result in Gandolfi et al. (1992).

Theorem 3. (Gandolfi et al. 1992) Let $\mu_n^\omega$ be an infinite volume FK-measure on $\{0, 1\}^{\tilde{\mathbb{Z}}^d}$, obtained as described above, but with $\omega$-independent boundary conditions. Let $\bar p = E(1 - e^{-\beta |J_b|})$ and let $p_c$ be the critical value for standard independent bond percolation on $\mathbb{Z}^d$, as in Theorem 1. If $\frac12 \bar p > p_c$, then a.s. (i.e., for $P$-almost every $\omega$)
$$\mu_n^\omega(\text{some cluster is infinite}) = 1; \tag{4.1}$$
in particular (4.1) will be the case for sufficiently large $\beta$ if $d > 2$ and $P(J_b = 0) = 0$.
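To make the hypothesis of Theorem 3 concrete, consider the ±1 spin glass, where the constant p̄ equals 1 − e^{−β}. The condition ½ p̄ > p_c can then be solved for β. The sketch below is only illustrative: the function names are hypothetical, and the value 0.2488 used for p_c on the cubic lattice is a numerical estimate from the percolation literature, not a rigorous constant and not a value quoted in the text.

```python
from math import exp, log, inf


def p_bar_pm1(beta):
    """E(1 - exp(-beta |J_b|)) when J_b takes the values +1 and -1."""
    return 1.0 - exp(-beta)


def theorem3_threshold(p_c):
    """Smallest beta0 such that p_bar / 2 > p_c for every beta > beta0.

    Returns inf when p_c >= 1/2, since p_bar / 2 never exceeds 1/2.
    """
    if p_c >= 0.5:
        return inf
    return -log(1.0 - 2.0 * p_c)


if __name__ == "__main__":
    P_C_CUBIC = 0.2488  # assumption: numerical estimate for bond percolation in d = 3
    print(theorem3_threshold(P_C_CUBIC))   # finite threshold in d = 3
    print(theorem3_threshold(0.5))         # inf: the criterion is never met when p_c = 1/2
```

The second call reflects why the argument gives nothing on the square lattice, where p_c = 1/2.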
Remark. As will be clear from the proof, (4.1) is also valid for large β when $P(J_b = 0) = 0$, for spin glass models on any lattice where the critical value for standard bond percolation is strictly below $\frac12$; this includes $\mathbb{Z}^2 \times \{0,1\}$. It is an open problem to extend this result to $\mathbb{Z}^2$; this would not contradict the conjecture that there is uniqueness of spin glass Gibbs distributions for d = 2, for reasons we will discuss below.

Proof. We will show that (4.1) is a consequence of $\frac12 \bar p > p_c$; the final statement of the theorem then follows from the facts that $\bar p \to P(J_b \neq 0)$ as β → ∞ and that $p_c < \frac12$ for d > 2. As in the proof of Theorem 1, we will prove the a.s. validity of (4.1) by averaging $\mu^\omega_n$ over ω, yielding the marginal distribution of the bond variables, $\tilde\mu_n = E(\mu^{\cdot}_n)$, and then showing that

$$\tilde\mu_n \gg \mu^{\mathrm{ind}}_{n,(\bar p/2)} \tag{4.2}$$

(here $\gg$ denotes stochastic domination).
To prove (4.2) for the infinite volume measure $\tilde\mu_n$, it suffices to prove the analogous inequality for the corresponding measure $\tilde\mu^\kappa_n$ on the finite region Λ with the specified (ω-independent) boundary condition κ. $\tilde\mu^\kappa_n$, which is an average over ω's of FK measures $\mu^\kappa_{n,(\beta J_b)}$, may be obtained by first taking the conditional expectation given all the $|J_b|$'s (which we denote $\bar\mu^\kappa_{n,(p_b)}$) and then averaging over the $|J_b|$'s (or $p_b$'s). We claim that (a.s.)

$$\bar\mu^\kappa_{n,(p_b)} \gg \mu^{\mathrm{ind}}_{n,(p_b/2)}; \tag{4.3}$$

since the average (over $p_b$'s) of the right-hand side of (4.3) is just $\mu^{\mathrm{ind}}_{n,(\bar p/2)}$ (recall the equalities of (2.7)), the desired (4.2) would follow. By a standard argument, (4.3) would be a consequence of the family of inequalities (for all bonds $b^*$, and all $n_b = 0$ or 1 for $b \neq b^*$)

$$\frac{\bar\mu^\kappa_{n,(p_b)}(N_{b^*} = 1,\ N_b = n_b \text{ for } b \neq b^*)}{\bar\mu^\kappa_{n,(p_b)}(N_{b^*} = 0,\ N_b = n_b \text{ for } b \neq b^*)} \ \ge\ \frac{\frac12 p_{b^*}}{1 - \frac12 p_{b^*}}. \tag{4.4}$$
Since (given the $|J_b|$'s) the signs of the $J_b$'s are i.i.d. symmetric ±1 valued random variables, the left-hand side of (4.4) may be expressed (using (1.5)) as

$$\frac{p_{b^*} \sum_\alpha \left(C^+_\alpha 2^{-H_\alpha} I^+_\alpha(1) + C^-_\alpha 2^{-H_\alpha} I^-_\alpha(1)\right)}{(1 - p_{b^*}) \sum_\alpha \left(C^+_\alpha I^+_\alpha(0) + C^-_\alpha I^-_\alpha(0)\right)}, \tag{4.5}$$

where the α's denote possible choices of signs for $(J_b : b \neq b^*)$, $C^\pm_\alpha$ equals the $Z_n^{-1}$ appearing in (1.5) for the specified $(J_b : b \neq b^*)$ and $J_{b^*} = \pm|J_{b^*}|$, $H_\alpha = 0$ or 1 according to whether the endpoints of $b^*$ are already connected by the $n_b$-bonds for $b \neq b^*$ or not (taking boundary conditions into account properly), and $I^\pm_\alpha(0)$ (resp. $I^\pm_\alpha(1)$) is 1 or 0 according to whether for the specified $J_b$'s, the bond configuration with given $n_b$ for $b \neq b^*$ and $n_{b^*} = 0$ (resp. $n_{b^*} = 1$) belongs to $U$ (again with boundary conditions taken into account). Note that due to the independence of the boundary condition on the sign of $J_{b^*}$, $I^+_\alpha(0) = I^-_\alpha(0)$; that $I^\pm_\alpha(0) = 0$ implies $I^\pm_\alpha(1) = 0$; that $I^\pm_\alpha(0) = 1$ and $H_\alpha = 1$ implies $I^\pm_\alpha(1) = 1$; and finally that $I^\pm_\alpha(0) = 1$ and $H_\alpha = 0$ implies that exactly one of $\{I^+_\alpha(1), I^-_\alpha(1)\}$ is 1. To obtain (4.4) it suffices to show that for each α with $I^\pm_\alpha(0) = 1$,

$$\frac{p_{b^*}}{1 - p_{b^*}} \cdot \frac{C^+_\alpha 2^{-H_\alpha} I^+_\alpha(1) + C^-_\alpha 2^{-H_\alpha} I^-_\alpha(1)}{C^+_\alpha + C^-_\alpha} \ \ge\ \frac{\frac12 p_{b^*}}{1 - \frac12 p_{b^*}}. \tag{4.6}$$
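For $H_\alpha = 1$, the left-hand side is $(p_{b^*}/2)/(1 - p_{b^*})$, which obeys the inequality; for $H_\alpha = 0$, the left-hand side is bounded below by

$$\frac{p_{b^*}}{1 - p_{b^*}} \min\left\{\frac{C^+_\alpha}{C^+_\alpha + C^-_\alpha},\ \frac{C^-_\alpha}{C^+_\alpha + C^-_\alpha}\right\}. \tag{4.7}$$

The desired inequality then follows from

$$\max\left\{\frac{C^+_\alpha}{C^-_\alpha},\ \frac{C^-_\alpha}{C^+_\alpha}\right\} \ \le\ e^{\beta|J_{b^*}|} = \frac{1}{1 - p_{b^*}}. \tag{4.8}$$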
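For the reader's convenience, here is a short sketch of why (4.8) yields (4.6) in the case $H_\alpha = 0$. The shorthand $p$ (for the occupation probability of the bond $b^*$) and $r$ are introduced here only for this computation and do not appear in the original.

```latex
% Shorthand (assumed for this sketch): p is the bond probability of b^*,
% and r = \max\{C^+_\alpha / C^-_\alpha,\; C^-_\alpha / C^+_\alpha\}.
\begin{align*}
\min\left\{\frac{C^+_\alpha}{C^+_\alpha + C^-_\alpha},
           \frac{C^-_\alpha}{C^+_\alpha + C^-_\alpha}\right\}
  &= \frac{1}{1 + r}
   \;\ge\; \frac{1}{1 + \frac{1}{1-p}}
   \;=\; \frac{1-p}{2-p}
   && \text{by (4.8)}, \\
\frac{p}{1-p}\cdot\frac{1-p}{2-p}
  &= \frac{p}{2-p}
   \;=\; \frac{\tfrac12 p}{1 - \tfrac12 p},
\end{align*}
% so the lower bound (4.7) is at least the right-hand side of (4.6).
```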
We leave inequality (4.8) as an exercise for the reader (who may consult Gandolfi et al. 1992).

We suggested early in this section that FK-percolation should be a necessary condition for non-uniqueness of Gibbs distributions in spin glasses, and Theorem 3 shows that FK-percolation does occur for high β, at least for d ≥ 3. Although there does not appear to be a proof available (it would be of interest to have such a proof), it seems that, unlike ferromagnetic models, FK-percolation should not be a sufficient condition for non-uniqueness in spin glasses. To see how this could be so, let us see how FK-percolation might be consistent with decay of the Ising spin-spin correlation, $E_s(S_u S_v)$, as $\|u - v\| \to \infty$. Let us denote by $\mu_n$ the infinite volume FK measure under consideration (say with free or periodic b.c.); the key idea is the distinction between the general FK formula (1.8) for $E_s(S_u S_v)$ and the FK connectivity function, $\mu_n(u \leftrightarrow v)$, which is only equal to $E_s(S_u S_v)$ in the ferromagnetic case. The connectivity function is

$$\mu_n(u \leftrightarrow v) = \mu_n(\eta_{u,v} = +1) + \mu_n(\eta_{u,v} = -1) \tag{4.9}$$

while

$$E_s(S_u S_v) = \mu_n(\eta_{u,v} = +1) - \mu_n(\eta_{u,v} = -1). \tag{4.10}$$

By uniqueness of the infinite FK-cluster (see Corollary 2 of Gandolfi et al. 1992), the connectivity function would not decay in the presence of percolation, but there seems to be no reason why this could not happen simultaneously with an asymptotic cancellation of the two terms of (4.10) as $\|u - v\| \to \infty$, leading to decay of $E_s(S_u S_v)$. This raises the question of whether, without proving non-uniqueness of Gibbs distributions, one might at least be able to show (for appropriate β and d) slower than exponential decay of $E_s(S_u S_v)$. This would demonstrate a phase transition in decay properties since one can show exponential decay for small β.

Despite the lack of results on non-uniqueness for spin glasses, let us discuss briefly some of the types of non-uniqueness which might possibly occur (for large d and β). We base our discussion on the analysis of Newman and Stein (1992). First, it might be that there is (a.s.) non-uniqueness which is physically irrelevant because of 'weak uniqueness' (see van Enter and Fröhlich 1985 and Campanino et al. 1987). This would mean that if $\mu^\omega_s$ and $\mu'^{\omega}_s$ are two infinite volume Gibbs measures, obtained (as in Theorem 3) by two different ω-independent boundary conditions (e.g., periodic and antiperiodic), then $\mu^\omega_s = \mu'^{\omega}_s$ for P-a.e. ω; further $\mu^\omega_s$ would be an extremal Gibbs distribution (for $(J_b(\omega))$) for P-a.e. ω. A similar situation occurs
at high temperatures in very long range spin glasses (Fröhlich and Zegarlinski 1987, Gandolfi et al. 1993). Another possibility (see Huse and Fisher 1987 and Fisher and Huse 1987) is that (a.s.) there are exactly two extremal Gibbs distributions, related to each other by a global spin flip. In that case, for boundary conditions like free or periodic (which are unchanged by a global spin flip), $\mu^\omega_s$ would be the symmetric mixture of these two extremal distributions. A third possibility (Binder and Young 1986), based on the Parisi analysis (Parisi 1979) of the Sherrington–Kirkpatrick spin glass (Sherrington and Kirkpatrick 1975), is that, even with free or periodic b.c., many extremal Gibbs distributions would make their appearance. Presumably this would mean that $\mu^\omega_s$, obtained as discussed before Theorem 3, would have a decomposition into many extremal Gibbs distributions. On the other hand, it was argued by Newman and Stein (1992) that under this scenario, if one first fixes ω and then tries to take an infinite-volume limit, the result would be many different subsequence limits with the subsequences necessarily ω-dependent. It is an open problem to effectively sort out all the various possibilities.
References

Aizenman, M., Chayes, J., Chayes, L., and Newman, C. M. (1987). The phase boundary in dilute and random Ising and Potts ferromagnets. Journal of Physics A: Mathematical and General 20, L313–L318.
Aizenman, M., Chayes, J., Chayes, L., and Newman, C. M. (1988). Discontinuity of the magnetization in one-dimensional $1/|x - y|^2$ Ising and Potts models. Journal of Statistical Physics 50, 1–40.
Aizenman, M., Klein, A., and Newman, C. M. (1993). Percolation methods for disordered quantum Ising models. In Phase Transitions: Mathematics, Physics, Biology, . . . (R. Kotecký, ed.), World Scientific, Singapore (to appear).
Aizenman, M. and Nachtergaele, B. (1993). Geometric aspects of quantum spin systems. Communications in Mathematical Physics (to appear).
Aizenman, M. and Wehr, J. (1990). Rounding effects of quenched randomness on first-order phase transitions. Communications in Mathematical Physics 130, 489–528.
Berg, J. van den and Maes, C. (1992). Disagreement percolation in the study of Markov fields. Annals of Probability, to appear.
Binder, K. and Young, A. P. (1986). Spin glasses: Experimental facts, theoretical concepts and open questions. Reviews of Modern Physics 58, 801–976.
Campanino, M., Olivieri, E., and van Enter, A. C. D. (1987). One dimensional spin glasses with potential decay $1/r^{1+\epsilon}$. Absence of phase transitions and cluster properties. Communications in Mathematical Physics 108, 241–255.
Dobrushin, R. L. (1968). The description of a random field by means of conditional probabilities and conditions of its regularity. Theory of Probability and its Applications 13, 197–224.
Dobrushin, R. L. and Shlosman, S. B. (1985). Constructive criteria for the uniqueness of a Gibbs field. In Statistical Mechanics and Dynamical Systems (J. Fritz, A. Jaffe, and D. Szász, eds.), Birkhäuser, Boston, pp. 371–403.
Edwards, S. and Anderson, P. W. (1975). Theory of spin glasses. Journal of Physics F 5, 965–974.
Enter, A. C. D. van and Fröhlich, J. (1985). Absence of symmetry breaking for N-vector spin glass models in two dimensions. Communications in Mathematical Physics 98, 425–432.
Edwards, R. G. and Sokal, A. D. (1988). Generalization of the Fortuin–Kasteleyn–Swendsen–Wang representation and Monte Carlo algorithm. Physical Review D 38, 2009–2012.
Fisher, D. S. and Huse, D. A. (1987). Absence of many states in realistic spin glasses. Journal of Physics A: Mathematical and General 20, L1005–L1010.
Fortuin, C. M. and Kasteleyn, P. W. (1972). On the random-cluster model. I. Introduction and relation to other models. Physica 57, 536–564.
Fröhlich, J. and Zegarlinski, B. (1987). The high-temperature phase of long-range spin glasses. Communications in Mathematical Physics 110, 121–155.
Gandolfi, A., Keane, M. S., and Newman, C. M. (1992). Uniqueness of the infinite component in a random graph with applications to percolation and spin glasses. Probability Theory and Related Fields 92, 511–527.
Gandolfi, A., Newman, C. M., and Stein, D. L. (1993). Exotic states in long range spin glasses. Communications in Mathematical Physics, to appear.
Griffiths, R. B. (1971). Phase transitions. In Statistical Mechanics and Quantum Field Theory (C. DeWitt and R. Stora, eds.), Gordon and Breach, New York, pp. 241–279.
Grimmett, G. R. (1994). Percolative problems. In Probability and Phase Transition (G. Grimmett, ed.), Kluwer, Dordrecht, pp. 69–86, this volume.
Harris, T. E. (1960). A lower bound for the critical probability in a certain percolation process. Proceedings of the Cambridge Philosophical Society 56, 13–20.
Huse, D. A. and Fisher, D. S. (1987). Pure states in spin glasses. Journal of Physics A: Mathematical and General 20, L997–L1003.
Kasai, Y. and Okiji, A. (1988). Percolation problem describing ±J Ising spin glass system. Progress in Theoretical Physics 79, 1080–1094.
Kasteleyn, P. W. and Fortuin, C. M. (1969). Phase transitions in lattice systems with random local properties. Journal of the Physical Society of Japan 26, 11–14.
Lebowitz, J. L. and Martin-Löf, A. (1972). On the uniqueness of the equilibrium state for Ising spin systems. Communications in Mathematical Physics 25, 276–282.
Liggett, T. M. (1992). The survival of one-dimensional contact processes in random environments. Annals of Probability 20, 696–723.
Newman, C. M. (1991). Ising models and dependent percolation. In Topics in Statistical Dependence (H. W. Block, A. R. Sampson, and T. H. Savits, eds.), IMS Lecture Notes – Monograph Series, 16, 395–401.
Newman, C. M. and Stein, D. L. (1992). Multiple states and thermodynamic limits in short-ranged Ising spin glass models. Physical Review B 46, 973–982.
Parisi, G. (1979). Infinite number of order parameters for spin-glasses. Physical Review Letters 43, 1754–1756.
Reger, J. D., Bhatt, R. N., and Young, A. P. (1990). Monte Carlo study of the order-parameter distribution in the four-dimensional Ising spin glass. Physical Review Letters 64, 1859–1862.
Sherrington, D. and Kirkpatrick, S. (1975). Solvable model of a spin glass. Physical Review Letters 35, 1792–1796.
Swendsen, R. H. and Wang, J. S. (1987). Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters 58, 86–88.
PLANAR FIRST-PASSAGE PERCOLATION TIMES ARE NOT TIGHT
R. PEMANTLE∗ Department of Mathematics University of Wisconsin Madison, WI 53706 U.S.A.
and Y. PERES Department of Statistics University of California Berkeley, CA 94720 U.S.A.
Abstract. Let the edges of $\mathbb{Z}^2$ be assigned independent, identically distributed passage times that are exponentials of mean one, and let $T(0, n)$ denote the resulting first-passage time from the origin to the point $(0, n)$. We show that $T(0, n)$ is not tight around its median. A fractional power lower bound for the dispersion of $T(0, n)$ may be obtained by combining this method with that of Newman and Piza (1993).

Key words: Percolation, first passage, variance, tight, Richardson's model.
1. Main Result

We consider first-passage percolation on the two-dimensional integer lattice $\mathbb{Z}^2$ with passage times that are IID exponentials of mean one; see Kesten (1986) for an overview. It has been conjectured, based on numerical evidence, that the variance of the time $T(0, n)$ to reach the vertex $(0, n)$ is of order $n^{2/3}$. Kesten (1992) showed that the variance of $T(0, n)$ is at most $O(n)$. He also noted that the variance is bounded away from zero. This note improves the lower bound on the variance of $T(0, n)$ to $C \log n$. Simultaneously and independently, Newman and Piza have achieved the same result for {0, 1}-valued passage times. Their methods (Newman and Piza 1993) extend to more general passage times, while ours work only for exponential times. On the other hand, our theorem shows that the variance comes from fluctuations of non-vanishing probability in the sense that, as n → ∞, the law of $T(0, n)$ is not tight about its median. Very recently, Newman and Piza showed that the log n may be improved to a power of n for directions in which the shape is not flat (it is not known whether the shape can be flat in any direction; see Durrett and Liggett (1981) for a relevant example). As pointed out to us by Harry Kesten, in the exponential case this may also be obtained via the method given here.

∗ Supported in part by NSF grant DMS 93-00191, by a Sloan Foundation Fellowship, and by a Presidential Faculty Fellowship.
Theorem 1. Let v be any unit vector in $\mathbb{R}^2$ and let $v(n)$ be the vector in $\mathbb{Z}^2$ whose coordinates are the integer parts of the coordinates of $nv$. Let $T(0, v(n))$ denote the passage time from the origin to the vertex $v(n)$, under IID mean 1 exponential passage times on the edges of $\mathbb{Z}^2$. Then $\mathrm{Var}(T(0, v(n))) \ge C \log n$ and in fact any intervals $[a_n, b_n]$ with $b_n - a_n = o((\log n)^{1/2})$ satisfy $P(T(0, v(n)) \in [a_n, b_n]) \to 0$.

We remark that the theorem extends to Richardson's model with other passage time distributions (restart all clocks after an edge is crossed).

2. Proof

We compute the conditional distribution of $T(0, v(n))$ given a σ-field $\mathcal{F}$ and show that with probability $1 - o(1)$, this conditional distribution is close to a normal with variance at least $C \log n$; clearly this implies the conclusion of the theorem. Let $\mathcal{F}$ be the σ-field determined by the order in which vertices are reached. Formally, if $T(v)$ is the passage time from the origin to the vertex v, then $\mathcal{F}$ is the σ-field generated by the events $T(v) < T(w)$ for $v, w \in \mathbb{Z}^2$. Let $V_0, V_1, V_2, \ldots$ be the vertices of $\mathbb{Z}^2$ listed in the order they are reached, so $(V_0, V_1, \ldots)$ is an $\mathcal{F}$-measurable random sequence. Let $C_n = \{V_0, V_1, \ldots, V_n\}$ be the cluster of the first n elements to be reached from the origin, and let $Y_n$ be the number of edges connecting elements of $C_{n-1}$ to elements of $\mathbb{Z}^2 \setminus C_{n-1}$. The key observation is that the conditional joint distribution of the variables $T(V_n) - T(V_{n-1})$ given $\mathcal{F}$ is identical to a sequence of independent exponentials with means $1/Y_n$. This is in fact an immediate consequence of the lack of memory of the exponential distribution and of the fact that the minimum of n independent exponentials of mean 1 is an exponential of mean $1/n$. This observation leads to

Lemma 2.

$$\liminf_{n\to\infty} \frac{1}{\log n} \sum_{j=1}^n \frac{1}{Y_j^2} \ \ge\ c_0 \quad \text{a.s.}$$

for some positive constant $c_0$.

Assuming this lemma for the moment, define $M(n)$ to satisfy $V_{M(n)} = v(n)$. Let

$$\mu_n = E(T(V_n) \mid \mathcal{F}) = \sum_{j=1}^n \frac{1}{Y_j}$$

and

$$\sigma_n^2 = \mathrm{Var}(T(V_n) \mid \mathcal{F}) = \sum_{j=1}^n \frac{1}{Y_j^2}.$$
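The key observation lends itself to a direct simulation: by lack of memory, the next vertex is reached through a uniformly chosen boundary edge, so the order of arrival can be generated without sampling any passage times, and $\mu_n$ and $\sigma_n^2$ are then read off from the boundary counts $Y_j$. The following is a minimal sketch under that reading; the function names and the lazy-deletion bookkeeping are choices made here for illustration, not taken from the paper.

```python
import random

NEIGHBOR_STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def boundary_edge_count(cluster):
    """Brute-force count of edges joining the cluster to its complement."""
    return sum(
        (x + dx, y + dy) not in cluster
        for (x, y) in cluster
        for dx, dy in NEIGHBOR_STEPS
    )


def grow_cluster(n_steps, seed=0):
    """Generate the arrival order of Richardson's model on Z^2.

    Returns (ys, mu, sigma2, cluster, y_final), where ys[j-1] = Y_j,
    mu = sum 1/Y_j and sigma2 = sum 1/Y_j^2 over the n_steps steps.
    """
    rng = random.Random(seed)
    cluster = {(0, 0)}
    # One list entry per boundary edge, stored as its endpoint outside the
    # cluster. Entries whose endpoint has since joined the cluster are stale
    # and are discarded when drawn (rejection sampling keeps the draw uniform
    # over the genuine boundary edges).
    outer_ends = list(NEIGHBOR_STEPS)
    y = 4  # Y_1: the origin has four boundary edges
    ys, mu, sigma2 = [], 0.0, 0.0
    for _ in range(n_steps):
        ys.append(y)
        mu += 1.0 / y
        sigma2 += 1.0 / (y * y)
        while True:
            i = rng.randrange(len(outer_ends))
            outer_ends[i], outer_ends[-1] = outer_ends[-1], outer_ends[i]
            v = outer_ends.pop()
            if v not in cluster:
                break
        cluster.add(v)
        outside = [
            (v[0] + dx, v[1] + dy)
            for dx, dy in NEIGHBOR_STEPS
            if (v[0] + dx, v[1] + dy) not in cluster
        ]
        # Edges from the old cluster to v become internal; edges from v to
        # the outside become boundary edges.
        y += len(outside) - (4 - len(outside))
        outer_ends.extend(outside)
    return ys, mu, sigma2, cluster, y
```

Given the order, a sample of $T(V_n)$ could be drawn as a sum of independent exponentials with rates `ys[j]` (for instance with `random.expovariate`), which is how the conditional normal approximation in the proof arises.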
The Lindeberg–Feller theorem implies that the conditional distribution of the variable $(T(v(n)) - \mu_{M(n)})/\sigma_{M(n)}$ converges weakly to a standard normal whenever $\sigma_{M(n)} \to \infty$. Subadditivity implies that $T(v(n))/n \to c_1 = c_1(v)$ almost surely, and the shape theorem (Cox and Durrett 1981) implies that $c_1 > 0$ and that the
number of vertices $N_t$ reached by time t is almost surely $(c_2 + o(1))t^2$. (Equivalently, $T(V_n) = (c_2^{-1/2} + o(1))n^{1/2}$.) From this it follows that

$$\frac{M(n)}{n^2} \to c_2 c_1^2 \quad \text{a.s.},$$

and hence from Lemma 2 that

$$\liminf_{n\to\infty} \frac{1}{\log n}\, \sigma^2_{M(n)} \ \ge\ 2c_0 \quad \text{a.s.}$$
Thus the conditional distribution of $T(v(n))$ is close to a normal with variance at least $(2c_0 + o(1)) \log n$, which establishes the theorem.

To prove Lemma 2, we first observe from the isoperimetric inequality that the distribution of $T(V_n)$ is stochastically bounded by the sum of independent exponentials of means $c j^{-1/2}$, $j = 1, \ldots, n$. Thus the variables $n^{-1/2} T(V_n)$ are dominated by a variable in $L^1$, and hence

$$\frac{\mu_n}{n^{1/2}} = E(n^{-1/2} T(V_n) \mid \mathcal{F}) \to c_2^{-1/2}$$

almost surely. Lemma 2 then follows, taking $x_j = 1/Y_j$ so that $S_n = \mu_n$, from a fact about sequences of real numbers:

Lemma 3. Let $x_1, x_2, \ldots$ be positive real numbers with $S_n = \sum_{j=1}^n x_j$ and suppose that $\liminf n^{-1/2} S_n = c$. Then

$$\liminf \frac{1}{\log n} \sum_{j=1}^n x_j^2 \ \ge\ \frac{c^2}{4}.$$

Proof. It suffices to show that the condition $S_n \ge a n^{1/2} - b$ for all n implies

$$\liminf \frac{1}{\log n} \sum_{j=1}^n x_j^2 \ \ge\ \frac{a^2}{4},$$

since one may then take $a = c - \epsilon$ for arbitrarily small $\epsilon$. Also, replacing $x_1$ with $x_1 + b$, we may assume without loss of generality that b = 0. Define

$$q_n = n^{1/2} - (n-1)^{1/2} \ \ge\ \tfrac12 n^{-1/2}.$$

Assuming $S_n \ge a n^{1/2}$ for all n, we show that $\sum_{j=1}^n x_j^2 \ge \frac14 a^2 \log n$. Rearranging the terms $\{x_j : 1 \le j \le n\}$ in decreasing order does not change $\sum_{j=1}^n x_j^2$ and only increases each $S_j$, so we may assume without loss of generality that these terms appear in decreasing order. Summing by parts three times we obtain

$$\begin{aligned}
\sum_{j=1}^n x_j^2 &= S_n x_n + \sum_{k=1}^{n-1} S_k (x_k - x_{k+1}) \ \ge\ a \left[ n^{1/2} x_n + \sum_{k=1}^{n-1} k^{1/2} (x_k - x_{k+1}) \right] \\
&= a \sum_{j=1}^n q_j x_j = a \left[ q_n S_n + \sum_{k=1}^{n-1} (q_k - q_{k+1}) S_k \right] \\
&\ge a^2 \left[ q_n n^{1/2} + \sum_{k=1}^{n-1} (q_k - q_{k+1}) k^{1/2} \right].
\end{aligned}$$
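The first summation by parts in this chain, and the logarithmic lower bound on $\sum q_k^2$ that the proof ends with, can be checked numerically. The sketch below is only an illustration with hypothetical function names; it uses the extremal sequence $x_j = a q_j$, for which $S_n = a n^{1/2}$ exactly.

```python
import math


def sum_of_squares_by_parts(x):
    """Evaluate S_n x_n + sum_{k<n} S_k (x_k - x_{k+1}) for a list x."""
    partial_sums = []
    running = 0.0
    for value in x:
        running += value
        partial_sums.append(running)
    n = len(x)
    tail = sum(partial_sums[k] * (x[k] - x[k + 1]) for k in range(n - 1))
    return partial_sums[-1] * x[-1] + tail


def extremal_sequence(n, a=1.0):
    """x_j = a * q_j with q_j = sqrt(j) - sqrt(j - 1), so S_n = a * sqrt(n)."""
    return [a * (math.sqrt(j) - math.sqrt(j - 1)) for j in range(1, n + 1)]


def lemma3_margin(n, a=1.0):
    """Return sum x_j^2 - (a^2 / 4) log n for the extremal sequence."""
    x = extremal_sequence(n, a)
    return sum(v * v for v in x) - 0.25 * a * a * math.log(n)
```

For this sequence `lemma3_margin(n)` stays non-negative, in line with $q_k^2 \ge 1/(4k)$, while `sum_of_squares_by_parts` reproduces $\sum x_j^2$ for any input, since the identity itself does not use monotonicity.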
Summing once more by parts and using the definition of $q_k$ we see that this is equal to $a^2 \sum_{k=1}^n q_k^2$, which is at least $\frac14 a^2 \log n$. This proves the lemma and hence the theorem.

When the asymptotic shape has a finite radius of curvature in the direction v, Newman and Piza have shown, using results of Kesten (1992) and Alexander (1992), that the minimizing path from 0 to $v(n)$ deviates from a straight line segment by at most $c n^\alpha$ for some α < 1, with probability $1 - o(1)$ as n → ∞. Thus the time $T'(0, v(n))$ to reach $v(n)$, in a new percolation where only bonds in a strip of width $n^\alpha$ are permitted, differs from $T(0, v(n))$ by $o(1)$ in total variation. The shape theorem for $\mathbb{Z}^2$ implies that the number of sites reached in the new percolation by this time is $O(n^{1+\alpha})$. Defining $M'(n)$, $\mu'_n$, $Y'_n$ and $\sigma'_n$ analogously to $M(n)$, $\mu_n$, $Y_n$ and $\sigma_n$ but for the new percolation, we have $\mu'_n = \sum_{k=1}^n (1/Y'_k)$, while now $M'(n) = O(n^{1+\alpha})$. By Cauchy–Schwarz (no summing by parts is needed) it follows that $(\sigma'_{M'(n)})^2 \ge c n^{1-\alpha}$, and applying Lindeberg–Feller as before proves the extension mentioned before Theorem 1.

References

Alexander, K. (1992). Fluctuations in the boundary of the wet region for first-passage percolation in two and three dimensions. Preprint.
Cox, J. T. and Durrett, R. (1981). Some limit theorems for percolation processes with necessary and sufficient conditions. Annals of Probability 9, 583–603.
Durrett, R. and Liggett, T. (1981). The shape of the limit set in Richardson's growth model. Annals of Probability 9, 186–193.
Kesten, H. (1986). Aspects of first passage percolation. Lecture Notes in Mathematics 1180, 125–264, Springer, Berlin.
Kesten, H. (1992). On the speed of convergence in first passage percolation. Annals of Applied Probability, to appear.
Newman, C. and Piza, M. (1993). Divergence of shape fluctuations in two dimensions. Preprint.
THEOREMS AND CONJECTURES ON THE DROPLET-DRIVEN RELAXATION OF STOCHASTIC ISING MODELS
ROBERTO H. SCHONMANN∗ Department of Mathematics University of California Los Angeles, CA 90024 U.S.A.
Abstract. Recent rigorous results on droplet-driven relaxation of stochastic Ising models in the vicinity of the phase transition region are reviewed. Further conjectures are raised based on the same sort of heuristic picture which suggested the proven results, and some new results on these lines are announced and some of their proofs sketched. Key words: Stochastic Ising model, Glauber dynamics, relaxation, droplets, metastability, spectral gap.
1. Introduction

In this note I will summarize results proven recently in [Sch3], on the relaxation mechanism of stochastic Ising models (also known as Glauber dynamics or kinetic Ising models) in the proximity of the phase transition region. I will also explain how these results were predicted by a heuristic type of reasoning and, more importantly, how the same sort of heuristics can be used to raise further conjectures on the behavior of the model. Surprisingly, it seems to me that a proper understanding of the picture which emerges from this heuristic view-point, including the rich diversity of behaviors (depending on the parameters volume, temperature and external magnetic field) is a novel contribution to the theory, even at the non-rigorous level.

The paper is being written having in mind that it may be read by people with different backgrounds in the areas of statistical mechanics and interacting particle systems. For those who are less familiar with these subjects, I recommend reading at this moment Section 2, and then returning to the next paragraph. For the experts, Section 2 will play mostly the role of a reference for notation, and can be consulted later.

We will consider the basic Ising model on $\mathbb{Z}^d$, with formal Hamiltonian

$$H_h(\sigma) = -\frac12 \sum_{x,y\ \mathrm{n.n.}} \sigma(x)\sigma(y) - \frac{h}{2} \sum_x \sigma(x), \tag{1}$$
where $\sigma(x) = \pm 1$ is the spin at the site $x \in \mathbb{Z}^d$, and the first sum runs over pairs of sites which are nearest neighbors in $\mathbb{Z}^d$, each pair counted only once. The time evolution is introduced as a spin flip Markov process which is reversible with respect to the corresponding Gibbs measures at temperature T. The flip rates will be supposed to satisfy certain regularity conditions, but these will be very mild, so that essentially all common choices for these rates will be considered.

∗ Partially supported by the NSF, under grant DMS 91-00725.

We are concerned with the relaxation mechanism of these systems in the vicinity of the phase transition region, i.e., at low temperature and under small but non-null external field h. These systems that we are considering are probably the favorite model systems for investigators addressing the issue of relaxation to equilibrium of systems close to a first order phase transition. The literature on the subject is vast, because the problem is of interest to researchers in such diverse areas as metallurgy, chemistry, physics and probability. One of the conspicuous features of relaxation phenomena close to discontinuous transitions is the presence of metastable behavior, in which the system seems, for a long time, to have reached equilibrium, but in a state which is actually far from the true equilibrium state, and is close to what the equilibrium would be for values of the parameters at the other side of the transition region.

A considerable number of review papers and monographs has been written on the subjects of metastability and relaxation close to transition regions. The reader may consult for instance [GD], [GSS] and [Koc] for accounts which emphasize non-rigorous results. A good review of rigorous investigations on the problem of metastability is [PL]. Many papers have been written on simulations and analytical study (rigorous and non-rigorous) of the stochastic Ising models in the regime which concerns us. The reader will find a large number of references in the reviews quoted above and a constant stream of papers on the subject in more recent issues of journals in statistical mechanics, mathematical physics and other related subjects.
Different proposals have appeared on what aspects of the problem one should study, what to measure in simulations, what to compute and what to prove. Even after several decades of investigations there seems to still be a fair amount of controversy on what the most relevant aspects of the problem are, and on questions of the type: what are the mathematical theorems that should be proved (or at least conjectured) and that properly capture the experimentally-manifestly-clear metastable behavior of the systems. Just to quote a few of the different issues:

1. Can one see the metastable behavior by looking at time averages or only if looking at typical individual paths of the process?
2. Is there any type of metastable behavior for these models on the infinite lattice, or only on finite lattices?
3. Is there a clear cut definition of what a 'metastable state' is?
4. Is there a sharp value of the external field (depending possibly on the temperature) which separates a region (small |h|) where the answer to the previous question is yes, from one where the answer is no?
5. Is the fact that the relaxation time becomes very large when h is small reflected by the gap in the spectrum of the generator of the process vanishing as the inverse of the relaxation time as $h \searrow 0$?
6. To properly characterize and study metastability, should one consider the process conditioned to not having reached the configurations which are typical in equilibrium, until a large time?

I am quoting all these questions here just to give to the reader an idea of the richness
of the area. Some of these questions are, of course, essentially a matter of the use that one wants to make of the word 'metastability', and we do not want to enter into such a discussion. On the other hand, I believe that the results reviewed and announced in this paper and the picture and the further conjectures proposed are helping to clarify some of the real problems in some of these questions.

The rest of this paper is organized as follows. In Section 2 the basic notions and notation are introduced in a fashion hopefully readable by a non-expert. In Section 3, the main result in [Sch3] is reviewed and related to what is observed in simulations. In Section 4 this result is explained on heuristic grounds. In Section 5 some rigorous counterparts of the heuristics are reviewed. In Section 6 different quantities which measure the speed of convergence to equilibrium are introduced, recent results on these quantities are described and further conjectures on their behavior are raised and heuristically motivated. In Section 7 results and conjectures on the behavior of versions of the stochastic Ising models on finite lattices, with sizes which are being scaled as $h \searrow 0$, are presented. Finally, in Section 8, we recall that interesting and usually sharper results can be proven for the same systems in regimes in which the temperature is scaled to 0, rather than the external field, or together with the external field; some new results are announced, which support the conjectures raised in the previous sections of this paper. Readers who are familiar with the content of [Sch3] can go directly to Section 6.

2. The Models

I have tried to make everything here as standard as possible, so that readers who are familiar with the models will browse quickly through this subsection, finding few things to which they are not used (like Proposition 1).
The presentation is self-contained, but most statements are made without proofs, and I refer readers to the books [Lig1] and [Rue], and other references therein for complete treatments. Some of the proofs that were omitted are relatively easy, and having newcomers in mind, I indicate them in the text as exercises (sometimes with hints).

The Lattice: We will consider models on the lattices $\mathbb{Z}^d$, where d is the space dimensionality. Because the dimension d will in general be arbitrary but fixed, we will omit it in most of the notation. The cardinality of a set $\Gamma \subset \mathbb{Z}^d$ will be denoted by $|\Gamma|$. The family of finite subsets of $\mathbb{Z}^d$ will be denoted by $\mathcal{F}$. For each $x \in \mathbb{Z}^d$, we define the usual norms $\|x\|_p = (|x_1|^p + \ldots + |x_d|^p)^{1/p}$, p > 0 finite, and $\|x\|_\infty = \max\{|x_1|, \ldots, |x_d|\}$. The interior and exterior boundaries of a set $\Gamma \subset \mathbb{Z}^d$ will be denoted, respectively, by

$$\partial_{\mathrm{int}} \Gamma := \{x \in \Gamma : \|x - y\|_1 = 1 \text{ for some } y \notin \Gamma\},$$

and

$$\partial_{\mathrm{ext}} \Gamma := \{x \notin \Gamma : \|x - y\|_1 = 1 \text{ for some } y \in \Gamma\}.$$

For integer i, we introduce the notation

$$V_i = \{x \in \mathbb{Z}^d : \|x\|_\infty \le i\},$$
for the box centered at the origin which has side-length $2i + 1$. But because usually the side-length of such a box is of particular importance for us, we will mostly be using the alternative notation

$$\Lambda(l) = \text{largest } V_i \text{ which has side-length not larger than } l.$$

The set of bonds, i.e., (unordered) pairs of nearest neighbors, is defined as

$$B = \{\{x, y\} : x, y \in \mathbb{Z}^d \text{ and } \|x - y\|_1 = 1\}.$$

Given a set $\Gamma \in \mathcal{F}$ we define also

$$B_\Gamma = \{\{x, y\} : x, y \in \Gamma \text{ and } \|x - y\|_1 = 1\},$$

$$\partial B_\Gamma = \{\{x, y\} : x \in \Gamma,\ y \notin \Gamma \text{ and } \|x - y\|_1 = 1\}.$$

A chain is a sequence of distinct sites $x_1, \ldots, x_n$, with the property that for $i = 1, \ldots, n - 1$, $\{x_i, x_{i+1}\} \in B$. The sites $x_1$ and $x_n$ are called the end-points of the chain $x_1, \ldots, x_n$. A set of sites with the property that each two of them can be connected by a chain contained in the set is said to be a connected set.

The Configurations and Observables: At each site in $\mathbb{Z}^d$ there is a spin which can take values −1 and +1. The configurations will therefore be elements of the set $\{-1, +1\}^{\mathbb{Z}^d} =: \Omega$. Given $\sigma \in \Omega$, we write $\sigma(x)$ for the spin at the site $x \in \mathbb{Z}^d$. Two configurations are specially relevant: −1 and +1, which are, respectively, the ones with all spins −1 and +1. When these configurations appear as a subscript or superscript, we will usually abbreviate them by, respectively, − and +. The single spin space, {−1, +1}, is endowed with the discrete topology and Ω is endowed with the corresponding product topology. The following definition will be important when we introduce finite systems with boundary conditions later on; given $\Gamma \in \mathcal{F}$ and a configuration $\eta \in \Omega$, we introduce

$$\Omega_{\Gamma,\eta} := \{\sigma \in \Omega : \sigma(x) = \eta(x) \text{ for all } x \notin \Gamma\}.$$

Real-valued functions with domain in Ω are called observables. For each observable f, we use the notation $\|f\|_\infty := \sup_{\eta \in \Omega} |f(\eta)|$.
Local observables are those which depend only on the values of finitely many spins; more precisely, $f : \Omega \to \mathbb{R}$ is a local observable if there exists a set $S \in \mathcal{F}$ such that $f(\sigma) = f(\eta)$ whenever $\sigma(x) = \eta(x)$ for all $x \in S$. The smallest S with this property is called the support of f. Clearly, if f is a local observable, then $\|f\|_\infty < \infty$. The topology introduced above on Ω has the nice feature that it makes the set of local observables dense in the set of all continuous observables. On Ω the following partial order is introduced:

$$\eta \le \zeta \quad \text{if } \eta(x) \le \zeta(x) \text{ for all } x \in \mathbb{Z}^d.$$

A particularly important role will be played in this paper by the non-decreasing local observables. Clearly every local observable is of bounded variation, and, as such, can be written as the difference between two non-decreasing ones.
A −chain in a configuration σ, or simply a σ-chain, is a chain of sites, $x_1, \ldots, x_n$, as defined above, with the property that for each $i = 1, \ldots, n$, $\sigma(x_i) = -1$. The −clusters in a configuration σ are the connected components of the set of sites where the spin is −1 in the configuration σ. A −cluster is called infinite if it contains infinitely many sites.

The Probability Measures: We endow Ω also with the Borel σ-algebra corresponding to the topology introduced above. In this fashion, each probability measure µ in this space can be identified by the corresponding expected values $\int f \, d\mu$ of all the local observables f. A sequence of probability measures, $(\mu_n)_{n=1,2,\ldots}$, is said to converge weakly to the probability measure ν in case

$$\lim_{n\to\infty} \int f \, d\mu_n = \int f \, d\nu \quad \text{for every continuous observable } f. \tag{2}$$
The family of probability measures on Ω will be partially ordered by the following relation: µ ≤ ν if
\[ \int f \, d\mu \le \int f \, d\nu \quad \text{for every continuous non-decreasing observable } f. \tag{3} \]
Because the local observables are dense in the set of continuous observables, we can restrict ourselves to the local ones in (2) and (3). Moreover, because every local observable is the difference between two non-decreasing ones, we can also restrict ourselves to those in (2).

The Gibbs Measures: We will always consider the formal Hamiltonian (1). In order to give precise definitions, we define, for each set Γ ∈ F and each boundary condition η ∈ Ω,
\[ H_{\Gamma,\eta,h}(\sigma) = -\frac{1}{2} \sum_{\{x,y\}\in B_\Gamma} \sigma(x)\sigma(y) - \frac{1}{2} \sum_{\substack{\{x,y\}\in\partial B_\Gamma \\ x\in\Gamma,\ y\notin\Gamma}} \sigma(x)\eta(y) - \frac{h}{2} \sum_{x\in\Gamma} \sigma(x), \tag{4} \]
where h ∈ R is the external field and σ ∈ Ω is a generic configuration. The Gibbs (probability) measure in Γ with boundary condition η under external field h and at temperature T = 1/β is now defined on Ω as
\[ \mu_{\Gamma,\eta,h}(\sigma) = \begin{cases} \dfrac{\exp(-\beta H_{\Gamma,\eta,h}(\sigma))}{\sum_{\zeta\in\Omega_{\Gamma,\eta}} \exp(-\beta H_{\Gamma,\eta,h}(\zeta))} & \text{if } \sigma\in\Omega_{\Gamma,\eta}, \\ 0 & \text{otherwise.} \end{cases} \]
Observe that we omit in the notation reference to the temperature T, because it will usually be fixed. The following property is a consequence of the fact that the Hamiltonian only involves interactions between nearest neighbors: given Γ ∈ F, if η(x) = ζ(x) for every x ∈ ∂_ext Γ, then
\[ \int f \, d\mu_{\Gamma,\eta,h} = \int f \, d\mu_{\Gamma,\zeta,h}, \tag{5} \]
for every local observable f whose support is contained in Γ. The next property is known as the DLR equation: given Γ ⊂ Γ′ ∈ F and a pair of configurations η and η′ which are identical off Γ′, we have
\[ \mu_{\Gamma',\eta',h}(\,\cdot \mid \Omega_{\Gamma,\eta}) = \mu_{\Gamma,\eta,h}(\cdot). \tag{6} \]
The Gibbs measures satisfy the following monotonicity relations, to which we will refer as the Holley–FKG inequalities. If η ≤ ζ and h_1 ≤ h_2, then, for each Γ ∈ F, µ_{Γ,η,h_1} ≤ µ_{Γ,ζ,h_2}.

A Gibbs measure for the infinite system on Z^d is defined now as any probability measure, µ, which satisfies the DLR equations in the sense that for every Γ ∈ F and µ-almost all η ∈ Ω
\[ \mu(\,\cdot \mid \Omega_{\Gamma,\eta}) = \mu_{\Gamma,\eta,h}(\cdot). \tag{7} \]
Alternatively and equivalently, Gibbs measures can be defined as elements of the closed convex hull of the set of weak limit points of sequences of the form (µ_{Γ_i,η_i,h})_{i=1,2,...}, where each Γ_i is finite and Γ_i → Z^d, as i → ∞, in the sense that ⋃_{i=1}^∞ ⋂_{j=i}^∞ Γ_j = Z^d. Together, (5) and the DLR equations, (6) and (7), imply the Markov property for the Gibbs measures; for instance, if µ is a Gibbs measure for the infinite system under external field h, then for arbitrary Γ ∈ F and µ-almost all η, ζ ∈ Ω such that η(x) = ζ(x) for every x ∈ ∂_ext Γ,
\[ \int f \, d\mu(\,\cdot \mid \Omega_{\Gamma,\eta}) = \int f \, d\mu(\,\cdot \mid \Omega_{\Gamma,\zeta}), \]
for every local observable f whose support is contained in Γ.

The Holley–FKG inequalities can be used to prove that for each value of T and h, µ_{Λ(l),−,h} (resp. µ_{Λ(l),+,h}) converges weakly, as l → ∞, to a probability measure that we will denote by µ_{−,h} (resp. µ_{+,h}). (Take the last statement as an exercise; hint: consider first non-decreasing local observables.) If h ≠ 0, or d = 1, it is also known that
\[ \mu_{-,h} = \mu_{+,h} =: \mu_h, \tag{8} \]
while if d ≥ 2 and h = 0 the same is true if the temperature is larger than a critical value T_c > 0, which depends on the dimension, and is false for T < T_c. Moreover, for the values of T and h for which (8) holds, any weak limit of any sequence of the form (µ_{Γ_i,η_i,h})_{i=1,2,...}, where F ∋ Γ_i → Z^d, is identical to µ_h. Therefore we conclude that whenever (8) holds there is a unique Gibbs measure for the infinite system. When (8) fails, there is more than one Gibbs measure for the infinite system, and we say that there is phase coexistence.
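The finite-volume definitions above can be made concrete by brute-force enumeration. The following is a minimal sketch in Python, assuming d = 2, a 3 × 3 box and a constant boundary condition; the function names and the parameter values (β = 1) are choices made for this illustration only. It evaluates the Hamiltonian (4) and the Gibbs expectation of the spin at the center of the box, so that the Holley–FKG monotonicity in the boundary condition can be observed numerically.

```python
import itertools
import math


def hamiltonian(sigma, eta_value, h):
    """H_{Gamma,eta,h}(sigma) as in (4), for a constant boundary condition."""
    energy = 0.0
    for (i, j), s in sigma.items():
        for y in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if y in sigma:
                # A bond inside the box is visited from both of its end-points,
                # so 1/4 per visit gives the factor 1/2 per bond.
                energy -= 0.25 * s * sigma[y]
            else:
                # A boundary bond is visited once, from its end-point in the box.
                energy -= 0.5 * s * eta_value
        energy -= 0.5 * h * s
    return energy


def gibbs_expectation(f, n, eta_value, h, beta):
    """Expected value of f under the Gibbs measure in an n x n box of Z^2."""
    sites = [(i, j) for i in range(n) for j in range(n)]
    numerator = 0.0
    partition_function = 0.0
    for spins in itertools.product((-1, 1), repeat=len(sites)):
        sigma = dict(zip(sites, spins))
        weight = math.exp(-beta * hamiltonian(sigma, eta_value, h))
        numerator += f(sigma) * weight
        partition_function += weight
    return numerator / partition_function


def center_spin(sigma):
    """The non-decreasing local observable sigma -> sigma((1, 1))."""
    return sigma[(1, 1)]


m_minus = gibbs_expectation(center_spin, 3, -1, 0.0, 1.0)
m_plus = gibbs_expectation(center_spin, 3, +1, 0.0, 1.0)
print(m_minus, m_plus)
```

At h = 0 the two printed values should be opposite to each other, by the spin-flip symmetry of (4), with the one for the (−1)-boundary condition being the smaller, in agreement with the Holley–FKG inequalities. The enumeration over all 2^9 configurations is only feasible for very small boxes.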
We use the following abbreviations and names: µ_{−,0} := µ_− = the minus phase, µ_{+,0} := µ_+ = the plus phase. Another known fact is that for fixed T
\[ \mu_h \to \mu_+ \ \text{weakly, as } h \searrow 0, \tag{9} \]
and
\[ \mu_h \to \mu_- \ \text{weakly, as } h \nearrow 0. \tag{10} \]
(Proving these facts is an excellent exercise. Hints: consider non-decreasing local observables, and use the Holley–FKG inequalities. Half of each statement follows then easily. For the other half you can compare the infinite system to a finite one using again Holley–FKG, then use ‘continuity’ of the Gibbs measure as a function of the external field in a finite box, and finally let the size of the box grow.)

For the expected value corresponding to a Gibbs measure µ_{...}, in finite or infinite volume, we will use the notation
\[ \langle f \rangle_{\ldots} := \int f \, d\mu_{\ldots}, \]
where ... stands for arbitrary subscripts. The spontaneous magnetization at temperature T is defined as m*(T) = ⟨σ(0)⟩_+. (Here we are using a common and convenient form of abuse of notation: σ(x) is being used to denote the observable which associates to each configuration the value of the spin at the site x in that configuration.) It is known that m*(T) > 0 if and only if µ_− ≠ µ_+, and also that lim_{T↘0} m*(T) = 1.

The Dynamics: We introduce now for the Ising model above, the time evolution known as stochastic Ising model or Glauber dynamics. First we recall that a spin flip system is defined as a Markov process on the state space Ω, whose generator, L, acts on a generic local observable f as
\[ (Lf)(\sigma) = \sum_{x\in\mathbb{Z}^d} c(x,\sigma)\,\bigl(f(\sigma^x) - f(\sigma)\bigr), \tag{11} \]
where σ^x is the configuration obtained from σ by flipping the spin at the site x, and c(x, σ) is called the rate of flip of the spin at the site x when the system is in the state σ. In order for this generator to be well defined and indeed generate a unique Markov process, one has to assume that the rates c(x, σ) satisfy certain regularity conditions. For our purposes here, we will actually restrict ourselves to the following conditions, which are more than enough to assure the existence and uniqueness of the process.
(H1) (Translation invariance) For every x, y ∈ Z^d, c(x, σ) = c(x + y, θ_y σ), where θ_y σ is the configuration obtained by shifting σ by y, i.e., (θ_y σ)(z) = σ(z − y).
(H2) (Finite range) There exists R such that c(0, η) = c(0, ζ) if η(x) = ζ(x) whenever ‖x‖_∞ ≤ R. The minimal such R is called the range of the interaction.
The connection between the rates of flip and the Hamiltonian (1) and the temperature T = 1/β is established by imposing conditions which assure that the Gibbs
measures are not only invariant, but also reversible with respect to the dynamics. These conditions, called detailed balance, state that for each x ∈ Z^d and σ ∈ Ω,
\[ c(x,\sigma) = c(x,\sigma^x)\exp(-\beta\Delta_x H_h(\sigma)), \tag{12} \]
where
\[ \Delta_x H_h(\sigma) := \sigma(x)\Bigl(\sum_{y:\{x,y\}\in B} \sigma(y) + h\Bigr), \]
which formally equals H_h(σ^x) − H_h(σ). We will usually make the dependence on h explicit, by writing c_h(x, σ) for the rates. There are many examples of rates which satisfy the conditions of detailed balance (12) and also the other hypotheses, (H1) and (H2). The most common examples found in the literature are:

Example 1: Metropolis Dynamics
\[ c_h(x,\sigma) = \exp\bigl(-\beta(\Delta_x H_h(\sigma))^+\bigr), \]
where (a)^+ = max{a, 0} is the positive part of a.

Example 2: Heat Bath Dynamics
\[ c_h(x,\sigma) = \frac{1}{1+\exp(\beta\Delta_x H_h(\sigma))}. \]

Example 3:
\[ c_h(x,\sigma) = \exp\bigl(-(\beta/2)\Delta_x H_h(\sigma)\bigr). \]

Each one of these rates satisfies also the further conditions below, which will be needed for the analysis in this paper to be possible.
(H3) (Attractiveness and monotonicity in h) If η(x) ≤ ζ(x) and h_1 ≤ h_2, then
\[ c_{h_1}(x,\eta) \le c_{h_2}(x,\zeta) \quad \text{if } \eta(x) = \zeta(x) = -1, \]
\[ c_{h_1}(x,\eta) \ge c_{h_2}(x,\zeta) \quad \text{if } \eta(x) = \zeta(x) = +1. \]
(H4) (Uniform boundedness of rates) For each temperature T there is h(T) > 0 and 0 < c_min(T) ≤ c_max(T) < ∞ such that for all h ∈ (−h(T), h(T)) and σ ∈ Ω, c_min(T) ≤ c_h(0, σ) ≤ c_max(T).

Each one of the examples presented also satisfies the continuity conditions
\[ \lim_{h\searrow 0} c_h(x,\sigma) = c_0(x,\sigma), \]
for all x and σ. Interestingly enough, this is true for all the rates that satisfy detailed balance with respect to H_h(·) and the hypotheses (H1)–(H4). Even more surprisingly, the stronger result below, which says that the ‘effect of h on the rates is essentially of order h’, holds. The lower bounds will be important for arguing,
at least heuristically, that the large droplets of the plus-phase grow relatively fast in the background of the minus-phase. The precise statements in the proposition below have a somewhat technical flavor, by necessity. It is not true that all the rates depend on h; for instance consider Metropolis dynamics, for which some of the rates stay constant, equal to 1, as h varies on a small interval. Thanks to the hypothesis of translation invariance, (H1), we need only consider x = 0. Define a(σ, h) := c_h(0, σ), b(σ, h) := c_h(0, σ^0), and
\[ g(h) := \sup_{\sigma} |c_h(0,\sigma) - c_0(0,\sigma)|. \]
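The three example rates, and the quantity g(h) just defined, are easy to examine numerically. The sketch below, in Python, assumes d = 2 and uses the fact that for these nearest-neighbor rates c_h(x, σ) depends on σ only through the spin at x and the sum of its 2d neighbors; the helper names (`local_states`, `detailed_balance_gap`) and the parameter values are introduced here for illustration only. It checks the detailed balance condition (12) for each example and prints the ratio g(h)/h for a small h.

```python
import math


def delta_H(s, nbr_sum, h):
    """Delta_x H_h(sigma) = sigma(x) * (sum of neighboring spins + h)."""
    return s * (nbr_sum + h)


def metropolis(s, nbr_sum, h, beta):
    return math.exp(-beta * max(delta_H(s, nbr_sum, h), 0.0))


def heat_bath(s, nbr_sum, h, beta):
    return 1.0 / (1.0 + math.exp(beta * delta_H(s, nbr_sum, h)))


def example_3(s, nbr_sum, h, beta):
    return math.exp(-0.5 * beta * delta_H(s, nbr_sum, h))


def local_states(d):
    """All pairs (spin at x, sum of the 2d neighboring spins)."""
    return [(s, k) for s in (-1, 1) for k in range(-2 * d, 2 * d + 1, 2)]


def detailed_balance_gap(rate, beta, h, d=2):
    """Largest relative violation of (12) over all local states."""
    gap = 0.0
    for s, k in local_states(d):
        lhs = rate(s, k, h, beta)
        # Flipping the spin at x changes s to -s and keeps the neighbors fixed.
        rhs = rate(-s, k, h, beta) * math.exp(-beta * delta_H(s, k, h))
        gap = max(gap, abs(lhs - rhs) / lhs)
    return gap


def g(rate, beta, h, d=2):
    """g(h) = sup over sigma of |c_h(0, sigma) - c_0(0, sigma)|."""
    return max(abs(rate(s, k, h, beta) - rate(s, k, 0.0, beta))
               for s, k in local_states(d))


for rate in (metropolis, heat_bath, example_3):
    print(rate.__name__, detailed_balance_gap(rate, 1.0, 0.3),
          g(rate, 1.0, 0.01) / 0.01)
```

For each of the three examples the printed gap should be zero up to rounding, and the ratio g(h)/h should stay bounded away from 0 and ∞, which is the kind of behavior that statement (iii) of the proposition below asserts in general.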
Then we have

Proposition 1. Suppose that the rates c_h(x, σ) satisfy detailed balance with respect to H_h(·) and also the hypotheses (H1), (H2), (H3) and (H4). Then for each T > 0 and σ ∈ Ω there are finite and positive constants C_1(T, σ) and C_2(T, σ), such that for all h ∈ (−h(T), h(T)), the following two statements hold:
(i) C_1(T, σ)|h| ≤ max{|b(σ, h) − b(σ, 0)|, |a(σ, h) − a(σ, 0)|} ≤ C_2(T, σ)|h|.
(ii) C_1(T, σ)|h| ≤ max{|b(σ, h) − b(σ, −h)|, |a(σ, h) − a(σ, −h)|} ≤ 2C_2(T, σ)|h|.
There are also two further positive and finite constants C_3(T) and C_4(T) such that for all h ∈ (−h(T), h(T)), we have
(iii) C_3(T)|h| ≤ g(h) ≤ C_4(T)|h|.

Proof. (ii) follows from (i) and the monotonicity hypothesis, (H3), while (iii) follows from (i) and the finite range hypothesis, (H2), so all we need is to prove (i). In proving (i) there is no loss in generality in supposing that σ(0) = +1, since a(σ, h) and b(σ, h) play symmetric roles. Using this remark, one can also see that there is no loss in generality in supposing that h ≥ 0, since otherwise we can use the symmetry of the model with respect to simultaneous mappings of h → −h and σ → −σ, to reduce the problem to this case. The detailed balance condition (12) states that
\[ \frac{b(\sigma,h)}{a(\sigma,h)} = \exp\Bigl(\beta \sum_{y:\|x-y\|_1=1} \sigma(y) + \beta h\Bigr) = \frac{b(\sigma,0)}{a(\sigma,0)}\exp(\beta h). \]
Hence log(b(σ, h)/b(σ, 0)) + log(a(σ, 0)/a(σ, h)) = βh. From the hypothesis of monotonicity in h, a(h) decreases with h, while b(h) increases with h; therefore the two logarithms above are positive, and hence must vanish as h ↘ 0, implying that b(σ, h) → b(σ, 0) and a(σ, h) → a(σ, 0). Moreover
\[ \tfrac{1}{2}\beta h \le \max\{\log(b(\sigma,h)/b(\sigma,0)),\ \log(a(\sigma,0)/a(\sigma,h))\} \le \beta h. \]
Hence for small h (depending on σ and β),
\[ \tfrac{1}{4}\beta h \le \max\{(b(\sigma,h) - b(\sigma,0))/b(\sigma,0),\ (a(\sigma,0) - a(\sigma,h))/a(\sigma,h)\} \le 2\beta h. \]
Using now the hypothesis of boundedness of the rates, (H4), we can conclude that there are positive finite constants C_1(T, σ) and C_2(T, σ) such that C_1(T, σ)h ≤ max{|b(σ, h) − b(σ, 0)|, |a(σ, 0) − a(σ, h)|} ≤ C_2(T, σ)h.

Throughout this paper we will suppose that we have chosen and kept fixed a set of rates c_h(x, σ) which satisfy the detailed balance conditions, (12), and all the hypotheses (H1)–(H4). This spin flip system will be denoted by (σ^η_{h;t})_{t≥0}, where η is the initial configuration. If this initial configuration is selected at random according to a probability measure ν, then the resulting process is denoted by (σ^ν_{h;t})_{t≥0}. The probability measure on the space of trajectories of the process will be denoted by P, and the corresponding expectation by E. (Later, when we couple various related processes, we will also use the symbols P and E to denote probabilities and expectations in some larger probability spaces, but no confusion should arise from this.)

The assumption of detailed balance, (12), assures that the Gibbs measures are invariant with respect to the stochastic Ising models. Moreover, from the assumption of attractiveness, (H3), one obtains the following convergence results
\[ \sigma^{-}_{h;t} \to \mu_{-,h} \quad \text{and} \quad \sigma^{+}_{h;t} \to \mu_{+,h}, \]
weakly, as t → ∞.

We will want to consider, sometimes as a tool, and sometimes for its own sake, the counterpart of the stochastic Ising model that we are considering, on an arbitrary finite set Γ ∈ F, with some boundary condition ξ ∈ Ω. This process, which will be denoted by (σ^η_{Γ,ξ,h;t})_{t≥0}, where η ∈ Ω_{Γ,ξ} is the initial configuration, is defined as the spin flip system with rates of flip given by
\[ c_{\Gamma,\xi,h}(x,\sigma) = \begin{cases} c_h(x,\sigma) & \text{if } \sigma, \sigma^x \in \Omega_{\Gamma,\xi}, \\ 0 & \text{otherwise.} \end{cases} \]
When σ, σ^x ∈ Ω_{Γ,ξ}, (12) yields, for all x ∈ Z^d,
\[ \mu_{\Gamma,\xi,h}(\sigma)\,c_{\Gamma,\xi,h}(x,\sigma) = \mu_{\Gamma,\xi,h}(\sigma^x)\,c_{\Gamma,\xi,h}(x,\sigma^x), \tag{13} \]
which is the usual reversibility condition for finite state-space Markov processes. (Conversely, if one requires (13) to be satisfied for arbitrary Γ ∈ F and ξ ∈ Ω, then one can deduce that (12) must hold.) It is clear from (H4) that (σ^η_{Γ,ξ,h;t}) is irreducible and hence from (13) it follows that, for any η,
\[ \sigma^{\eta}_{\Gamma,\xi,h;t} \to \mu_{\Gamma,\xi,h}, \]
weakly, as t → ∞.

Graphical Construction: The experts on interacting particle systems can safely skip this part. We present the graphical construction below only for the benefit of
readers who may wonder how the processes described above could be constructed. The elementary graphical construction presented below is one of the possible answers, and has the advantage for our purposes of letting one construct all the systems with the different initial configurations, on all the subsets of Z^d, with the different boundary conditions, all on the same probability space. (One refers to such a construction as a ‘coupling’ of these various processes.) Moreover, two types of intuitively clear features are then implemented in this coupling:
1. The finite range of the interaction causes ‘effects to travel with a bounded speed’ (see Proposition 2).
2. Attractiveness and monotonicity in h cause the coupling to preserve the order between the marginal processes (see the inequalities (14), (15) and (16)).
The construction below is a specific version of what is called basic coupling between spin flip processes: a coupling in which the spins flip together as much as possible, considering the constraint that they have to flip with certain rates.

The construction is carried out by first associating to each site x ∈ Z^d two independent Poisson processes, each one with rate c_max(T). We will denote the successive arrival times (after time 0) of these Poisson processes by (τ^+_{x,n})_{n=1,2,...} and (τ^−_{x,n})_{n=1,2,...}. Assume that the Poisson processes associated with different sites are also mutually independent. We say that at each point in space-time of the form (x, τ^+_{x,n}) there is an upward mark and that at each point of the form (x, τ^−_{x,n}) there is a downward mark. Next we associate to each arrival time τ^*_{x,n}, where * stands for + or −, a random variable U^*_{x,n} with uniform distribution between 0 and 1. All these random variables are supposed to be independent among themselves and independent from the previously introduced Poisson processes. This finishes the construction of the probability space.
The corresponding probability and expectation will be denoted, respectively, by P and E. We have to say now how the various processes are constructed on this probability space. For finite Γ and arbitrary ξ, the process (σ^η_{Γ,ξ,h;t}) is constructed as follows. We know that almost surely the random times τ^*_{x,n}, x ∈ Γ, n = 1, 2, ..., * = +, −, are all distinct, and we update the state of the process at each time when there is a mark at some x ∈ Γ according to the following rules. If the mark that we are considering is at the point (x, τ^*_{x,n}), and the configuration immediately before time τ^*_{x,n} was σ, then
(i) The spins not at x do not change.
(ii) If σ(x) = −1 (resp. σ(x) = +1), then the spin at x can only flip if the mark is of upward type (resp. downward type).
(iii) If the mark is upward and σ(x) = −1, or if the mark is downward and σ(x) = +1, then we flip the spin at x if and only if c_{Γ,ξ,h}(x, σ) > U^*_{x,n} c_max.
One can readily see that the process constructed in this fashion has the correct rates of flip.

In principle, one would like to construct the processes on the infinite lattice Z^d in a similar fashion, with c_h(x, σ) replacing c_{Γ,ξ,h}(x, σ) in (iii). Some extra care has to be taken, because during any non-degenerate interval of time infinitely many marks occur. This is not a real problem, because of the assumption that the range of the interaction, R, is finite. Starting from a configuration η at time 0, we have to say how the spin at a generic site x at a time t is obtained. We will argue that on a set of probability 1 in the space where the marks were defined, for any fixed x and
t, if we take any boundary condition ξ, then the sequence (σ^η_{Λ(l),ξ,h;t}(x))_{l=1,2,...} will converge as l → ∞ (i.e., will become constant for large l), to a limit which does not depend on ξ. This limit can then be taken to be the value of σ^η_{h;t}(x), and it is clear that the version of the process (σ^η_{h;t}) constructed in this fashion has the correct flip rates.

To prove the claim above about insensitivity to receding boundary conditions, we introduce the events E(x, t, l) that there exists a sequence of points in space-time (x_0, 0), (x_1, t_1), ..., (x_n, t_n) with the properties that 0 < t_1 < ... < t_n < t, x_0 ∉ Λ(l), x_n = x, ‖x_i − x_{i−1}‖ ≤ R for i = 1, ..., n, and that at each point (x_i, t_i), i = 1, ..., n, there is a mark. It is easy to see that outside of the event E(x, t, l), σ^η_{Λ(l),ξ,h;t}(x) does not depend on ξ. Because E(x, t, l) ⊂ E(x, u, l), when t ≤ u, our claim is reduced to the statement that for each x and integer t, E(x, t, l) happens for only finitely many values of l, P-almost surely. We have to show now that the probability of E(x, t, l) vanishes fast enough as l → ∞, so that we can apply the Borel–Cantelli Lemma. In order to do it we observe that for a given n, the sites x_0, ..., x_{n−1} in the definition of E(x, t, l) cannot be chosen in more than ((2R + 1)^d)^n ways. Also, for large l, n cannot be less than l/(3R). Therefore,
\[ \mathbb{P}(E(x,t,l)) \le \sum_{n\ge l/(3R)} (2R+1)^{dn}\, \mathbb{P}(Z \ge n), \]
where Z is a Poisson random variable, with mean t c_max =: r. We will use now the following elementary inequality, valid for n ≥ r,
\[ \mathbb{P}(Z \ge n) = e^{-r} r^n \sum_{k\ge n} \frac{r^{k-n}}{k!} \le e^{-r} r^n \sum_{k\ge 0} \frac{n^{k-n}}{k!} = (r/n)^n e^{n-r} \le \exp(-n(\log(n/r) - 1)). \]
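The elementary Poisson tail inequality above can be checked numerically. The following minimal Python sketch compares P(Z ≥ n) with the bound exp(−n(log(n/r) − 1)) for a few illustrative values of r and n ≥ r; the function names and the truncation of the series after `terms` summands are choices made here, the neglected remainder being negligible for the values used.

```python
import math


def poisson_tail(r, n, terms=400):
    """P(Z >= n) for Z Poisson with mean r, summing the probability mass function."""
    return sum(math.exp(-r + k * math.log(r) - math.lgamma(k + 1))
               for k in range(n, n + terms))


def tail_bound(r, n):
    """The right-hand side exp(-n (log(n/r) - 1)), used for n >= r."""
    return math.exp(-n * (math.log(n / r) - 1.0))


for r in (0.5, 3.0, 10.0):
    for n in range(math.ceil(r), math.ceil(r) + 40, 10):
        print(r, n, poisson_tail(r, n), tail_bound(r, n))
```

In each printed row the third entry should not exceed the fourth, and the gap widens quickly as n/r grows, which is what makes the sum over n ≥ l/(3R) in the argument small.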
Combining the last two estimates we obtain for large l (depending only on x)
\[ \mathbb{P}(E(x,t,l)) \le \sum_{n\ge l/(3R)} (2R+1)^{dn} \exp\bigl(-n(\log(l/(3Rtc_{\max})) - 1)\bigr), \]
which goes to 0 faster than any exponential of l. Actually, the estimate above shows also that even if we let t grow with l, but keeping l/t large enough, then the spin at a fixed site x is almost insensitive up to time t to what happens outside of the box Λ(l):

Proposition 2. For each dimension d and temperature T, there exists a finite positive constant C(d, T) such that if we let l → ∞ and t → ∞ together, keeping l ≥ C(d, T)t, then for every site x ∈ Z^d,
\[ \sup_{h\in(-h(T),h(T))}\ \sup_{\xi\in\Omega}\ \sup_{\eta\in\Omega}\ \mathbb{P}\bigl(\sigma^{\eta}_{h;t}(x) \ne \sigma^{\eta}_{\Lambda(l),\xi,h;t}(x)\bigr) \to 0, \]
exponentially fast in l. Because of the hypotheses (H3), of attractiveness and monotonicity in h, the coupling provided by the construction above preserves the order between the coupled
marginal processes, in various cases. In this paper we will need (particular cases of) the following facts. If η ≤ ζ, ξ ≤ ξ′, −h(T) < h_1 ≤ h_2 < h(T) and Γ ∈ F is arbitrary, then for all t ≥ 0,
\[ \sigma^{\eta}_{\Gamma,\xi,h_1;t} \le \sigma^{\zeta}_{\Gamma,\xi',h_2;t}, \tag{14} \]
\[ \sigma^{\eta}_{h_1;t} \le \sigma^{\zeta}_{h_2;t}, \tag{15} \]
and
\[ \sigma^{\eta}_{\Gamma,-,h_1;t} \le \sigma^{\zeta}_{h_2;t}. \tag{16} \]
We will refer to these inequalities as basic-coupling inequalities. (Observe that the Holley–FKG inequalities for the models we are considering can be derived from (14).)

Two More Remarks on Notation: We will use C, C(T), C(T, d), C_1, C_2, etc., to denote positive finite constants, whose precise values are not relevant and may even change from appearance to appearance. We will use the notation β′ = β − log b, where b is a technical constant, which depends on the dimension d. These constants b are associated to the notion of ‘counting contours’; in d = 2, b = log 3. Several times we will encounter the fraction β′/β, and we observe that it satisfies
\[ \beta'/\beta = 1 - T\log b \nearrow 1 \quad \text{as } T \searrow 0. \tag{17} \]
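Returning for a moment to the graphical construction, the update rules (i)–(iii) and the basic-coupling inequality (14) can be illustrated by a small simulation. The sketch below, in Python, is written under several assumptions made only for this illustration: d = 2, a 4 × 4 box, constant boundary conditions, the heat bath rates of Example 2 (which never exceed 1, so that c_max = 1 is an admissible bound), and arbitrary values of β, h_1, h_2 and of the time horizon. The same marks and uniform variables drive two processes, one started from all spins −1 and one from all spins +1.

```python
import math
import random


def heat_bath_rate(s, nbr_sum, h, beta):
    """Heat bath rate of Example 2; it never exceeds 1, so c_max = 1 is admissible."""
    return 1.0 / (1.0 + math.exp(beta * s * (nbr_sum + h)))


def sample_marks(sites, t_max, c_max, rng):
    """Upward (+1) and downward (-1) marks up to time t_max, with their uniforms."""
    marks = []
    for x in sites:
        for sign in (+1, -1):
            t = rng.expovariate(c_max)
            while t < t_max:
                marks.append((t, x, sign, rng.random()))
                t += rng.expovariate(c_max)
    marks.sort()  # process the marks in increasing order of time
    return marks


def run(marks, sites, initial_spin, boundary_spin, h, beta, c_max=1.0):
    """Apply rules (i)-(iii) at every mark; return the configuration after each mark."""
    sigma = {x: initial_spin for x in sites}
    history = []
    for _, (i, j), sign, u in marks:
        s = sigma[(i, j)]
        # Rule (ii): upward marks act only on spins -1, downward marks on spins +1.
        if s == -sign:
            neighbors = ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
            nbr_sum = sum(sigma.get(y, boundary_spin) for y in neighbors)
            # Rule (iii): flip if and only if the rate exceeds U * c_max.
            if heat_bath_rate(s, nbr_sum, h, beta) > u * c_max:
                sigma[(i, j)] = -s
        history.append(dict(sigma))
    return history


rng = random.Random(0)
sites = [(i, j) for i in range(4) for j in range(4)]
marks = sample_marks(sites, t_max=5.0, c_max=1.0, rng=rng)
lower = run(marks, sites, initial_spin=-1, boundary_spin=-1, h=-0.1, beta=0.8)
upper = run(marks, sites, initial_spin=+1, boundary_spin=+1, h=0.1, beta=0.8)
order_preserved = all(a[x] <= b[x] for a, b in zip(lower, upper) for x in sites)
print(len(marks), order_preserved)
```

Since the heat bath rates satisfy (H3), the order between the two configurations should be preserved at every mark, for every realization of the marks and not only on average; this is the content of (14) in this small example.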
3. The Main Result in [Sch3]

The main result in [Sch3] is the following theorem.

Theorem 1. For each dimension d ≥ 2 there is T_0 > 0 such that for every temperature T ∈ (0, T_0) the following happens. There are constants 0 < λ_1(T) ≤ λ_2(T) < ∞ such that if we let h ↘ 0 and t → ∞ together, then for every local observable f,
(i) E(f(σ^−_{h;t})) → ∫ f dµ_− if lim sup h^{d−1} log t < λ_1(T),
(ii) E(f(σ^−_{h;t})) → ∫ f dµ_+ if lim inf h^{d−1} log t > λ_2(T).
We can take λ_1(T) = (2^d (d − 1)^{d−1}/(d + 1))(β′/β)^d β, and λ_2(T) = (2^d d^{d−1})(1 + δ(T))β, where δ(T) is a positive-valued function which vanishes as T ↘ 0.

In other words, we are stating that the law of the random configuration σ^−_{h;t} converges weakly to µ_− in case (i) and to µ_+ in case (ii). Theorem 1, apart from the explicit estimates on λ_1(T) and λ_2(T), was conjectured by Aizenman and Lebowitz in [AL], where they proved a similar result for certain deterministic cellular automata evolving from initial random configurations selected according to translation invariant product measures. Actually they conjectured the stronger result, which states that also λ_1(T) = λ_2(T) =: λ_c(T). This is a natural further conjecture, but we believe that it will be extremely difficult to prove it, because it is not even clear what the common value of λ_1(T) and λ_2(T) should be, as we will explain when we present the heuristics behind Theorem 1.
In contrast to Theorem 1, the proposition below, which is much easier to prove, says that for temperatures for which there is no phase transition, the relaxation to equilibrium occurs in a time of order 1 (no scaling with h).

Proposition 3. For each dimension d and for every temperature T for which µ_− = µ_+ =: µ_0, if we let h ↘ 0 and t → ∞ together, then for every local observable f
\[ \lim E(f(\sigma^{-}_{h;t})) = \int f \, d\mu_0. \]
Proving this Proposition may again be a good exercise for newcomers to the field. The proof is actually provided in [Sch3], so that this is an exercise with solution. Hint: recall the hint given to prove (9) and (10).

Theorem 1 is a rigorous counterpart of a pattern found by researchers who analyzed the relaxation of the stochastic Ising models by simulating the dynamics with computers. One of the best known papers in this regard is [BM]. Translating the results in that paper to our setting, one is running the system under a small positive external field, starting from all spins down (and using periodic or free boundary conditions). One is interested in the time evolution of a local observable, say, the value of the spin at the origin. An average is taken over a large number of independent repetitions of the same evolution from time 0 up to a certain time. Under these conditions there is manifestation of metastable behavior in the form of a ‘plateau’ in the relaxation curve that is obtained: in a relatively short time the average value of the spin at the origin seems to converge to a value close to the opposite of the spontaneous magnetization; after this, one sees an apparent flatness in the relaxation curve over a stretch of time which may be quite long compared with the time needed to first approach this value. But eventually the relaxation curve starts to deviate from this constant value and move upwards, towards the true asymptotic limit, close to the spontaneous magnetization.
The experimentally-almost-flat portion of the relaxation curve is referred to as a plateau. Of course, for given values of the parameters T and h the relaxation curve is strictly monotone increasing, and there is no clear-cut definition of what the plateau is. On the other hand, repeating the numerical experiment with smaller and smaller values of h (at the same temperature T) one sees that the flatness becomes more and more evident, in the sense that the first portion of the relaxation curve, which is observed while the system is moving towards its ‘metastable state’, becomes essentially independent of h, while the length of the apparent plateau increases (see Figures 6, 7 and 8 in [BM]). Theorem 1, besides giving a precise mathematical meaning to the idea that a plateau seems to be approached for a very long time but that eventually the system moves away from that plateau, provides a good estimate of how long the plateau is when h is very small: its length is of the order of an exponential of (1/h)^{d−1}, as h ↘ 0. This quantitative feature was expected to hold, based on the heuristics which will be presented in the next section; it was also observed in simulations: see [Sta].

In contrast, the following result, which is much easier to prove, already implies the existence of a plateau. The proof works for every T < T_c, but it has the important disadvantage of not giving, by any means, a good estimate on the length of the plateau. Its proof is a simple combination of Propositions 1 and 2.
Proposition 4. For each dimension d ≥ 2 and every temperature T ∈ (0, T_c) the following happens. If we let h ↘ 0 and t → ∞ together in such a way that lim sup h t^{d+1} = 0, then for every local observable f
\[ E(f(\sigma^{-}_{h;t})) \to \int f \, d\mu_-. \]
A second aspect of the relaxation pattern which can be seen in the simulations and which is another reason for much of the interest in the problem is the particularly relevant role played by the behavior of individual droplets of spins +1 (possibly with holes where the spins are −1) in the sea of spins −1, during the evolution. In the ‘metastable state’ one sees such droplets appearing spontaneously throughout the system, but shrinking and disappearing before they become large, in a sort of equilibrium which resembles the minus-phase. Eventually one of these droplets grows to a larger size, apparently by chance, and then it keeps growing and eventually ‘covers’ the whole system, which is then in the true equilibrium phase. While this droplet is growing, it sometimes happens that other large droplets appear somewhere else and also grow, so that the system is, in this case, driven to equilibrium when such droplets coalesce and ‘cover’ the system. This phenomenon, which is also observed in real experiments (see the reviews quoted in the introduction), is known as ‘nucleation and growth’. Many theoretical and numerical studies have focused on these aspects of the evolution and on simplified, single-droplet, or independent-droplets, models. It is a common saying that one can ‘understand’ the behavior of the individual droplets on purely ‘energetic’, or rather ‘free-energetic’ terms, as a problem of escaping from a potential well. A very heuristic form of this reasoning will be reviewed in the next section, and indeed served to orientate our approach towards proving rigorous results.

4. Heuristics

We present now the heuristics behind Theorem 1.
This heuristic reasoning comes in two parts, the first one of which is very well known, while the second one seems to have escaped most of the attention.

First Part: We want to consider the behavior of an individual droplet of spins +1 in a background of spins −1. When the temperature is low, it is reasonable, on energetic grounds, to consider simply a cube full of spins +1, the other spins being all −1, as such a droplet. If the side-length of the cube is l, then the energy of such a configuration, with respect to the energy of the configuration with all spins −1, is given by e_h(l) := 2d l^{d−1} − l^d h. As a function of l, considered now as a continuous quantity, e_h(l) grows from 0 to its maximum E_max = 2^d (d − 1)^{d−1}/h^{d−1}, when l varies from 0 to l_c = 2(d − 1)/h. For l > l_c, e_h(l) decreases; it crosses the value 0 when l = 2d/h, and goes to −∞ when l → ∞.
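The elementary calculus behind these values is easily checked numerically. The following Python sketch, with function names and the value h = 0.1 chosen only for illustration, evaluates the cube energy on a grid and compares the location and height of its maximum with the closed forms for the critical side-length and the barrier height.

```python
def droplet_energy(l, d, h):
    """e_h(l) = 2 d l^(d-1) - l^d h for a cube of side-length l."""
    return 2 * d * l ** (d - 1) - h * l ** d


def critical_length(d, h):
    """l_c = 2 (d - 1) / h."""
    return 2 * (d - 1) / h


def energy_barrier(d, h):
    """E_max = 2^d (d - 1)^(d-1) / h^(d-1)."""
    return 2 ** d * (d - 1) ** (d - 1) / h ** (d - 1)


for d in (2, 3):
    h = 0.1
    l_c = critical_length(d, h)
    # Grid of side-lengths covering the interval (0, 4d/h].
    grid = [k * 0.001 * (2 * d / h) for k in range(1, 2001)]
    l_best = max(grid, key=lambda l: droplet_energy(l, d, h))
    print(d, l_c, l_best, energy_barrier(d, h), droplet_energy(l_c, d, h))
```

For each d the grid maximizer should be close to l_c, and the energy at l_c should agree with E_max; the energy vanishes again at l = 2d/h.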
If we assume that the droplet evolves in such a way as to lower the energy of the system, then we are led to the conclusion that droplets with side-length smaller than l_c tend to shrink and that droplets with side-length larger than l_c tend to grow and cover the whole system. Also by analogy with other phenomena related to passage over potential barriers, one would expect that the time needed for a droplet to pop up spontaneously, due to a thermal fluctuation, in a given place is of the order of exp(βE_max), which grows exponentially with 1/h^{d−1}.

Second Part: From the discussion above one could naively predict for the system a relaxation time of the order of exp(βE_max). Actually, this is only reasonable if the whole system is not much larger than the size of a critical droplet, so that the time for such a droplet to first appear should indeed be of that order and, moreover, when such a droplet appears, it will cover the whole system in a comparably negligible time. For instance, this seems to be a good prediction if the linear size of the system scales as B/h with a large fixed B. (In this regard, see the next section, Theorem 4 in Section 6, and Corollary 2 in Section 7.) But we are concerned with a larger (infinite) system, and we are observing it through a local function f, which depends, say, on the spins in a finite set S. For us the system will have relaxed to equilibrium when S is covered by a big droplet of the plus-phase, which appeared spontaneously somewhere and then grew, as discussed above. We want to estimate how long we have to wait for the probability of such an event to be large. If we suppose that the radius of supercritical droplets grows with a fixed speed v, then we can see that the region in space-time where a droplet which covers S at time t could have appeared is, roughly speaking, a cone with vertex in S and which has as base the set of points which have time-coordinate 0 and are at most at distance tv from S.
The volume of such a cone is of the order of (vt)^d t. Now, from the discussion in the first part of the heuristics, one can infer that ‘the rate with which supercritical droplets appear by thermal fluctuations’ at a given location should be of the order of exp(−βE_max). The order of magnitude of the relaxation time, t_rel, before which the region S is unlikely to have been covered by a large droplet and after which the region S is likely to have been covered by such an object can now be obtained by solving the equation
\[ (v t_{rel})^d \, t_{rel} \exp(-\beta E_{max}) = 1. \]
This gives us
\[ t_{rel} = v^{-d/(d+1)} \exp(\beta E_{max}/(d+1)). \]
In order to use this relation to predict the way in which the relaxation time scales with h, one needs to figure out the way in which v scales with h. If we suppose, for instance, that v does not scale with h, or that at least it goes to 0, as h ↘ 0, so slowly that
\[ \lim_{h\searrow 0} h^{d-1} \log v = 0, \tag{18} \]
then we can predict that
\[ t_{rel} \sim \exp(\beta E_{max}/(d+1)) = \exp\left( \frac{\beta\, 2^d (d-1)^{d-1}}{(d+1)\, h^{d-1}} \right). \tag{19} \]
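The passage from the equation for t_rel to the prediction (19) can be illustrated numerically. The sketch below, in Python, works with log t_rel to avoid overflow and takes v = h purely as an illustrative choice compatible with (18); the function names and the values d = 2, β = 1 are likewise assumptions of this example. It prints h^{d−1} log t_rel for decreasing h, next to the constant β 2^d (d − 1)^{d−1}/(d + 1) appearing in (19).

```python
import math


def energy_barrier(d, h):
    """E_max = 2^d (d - 1)^(d-1) / h^(d-1)."""
    return 2 ** d * (d - 1) ** (d - 1) / h ** (d - 1)


def log_relaxation_time(d, h, beta, v):
    """log t_rel, where (v t_rel)^d t_rel exp(-beta E_max) = 1."""
    return (beta * energy_barrier(d, h) - d * math.log(v)) / (d + 1)


def predicted_rate(d, beta):
    """The constant beta 2^d (d-1)^(d-1) / (d+1) appearing in (19)."""
    return beta * 2 ** d * (d - 1) ** (d - 1) / (d + 1)


d, beta = 2, 1.0
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    log_t = log_relaxation_time(d, h, beta, v=h)  # illustrative choice v = h
    print(h, h ** (d - 1) * log_t, predicted_rate(d, beta))
```

The second printed column should approach the third as h decreases; the difference is the contribution −(d/(d + 1)) h^{d−1} log v of the speed v, which vanishes exactly when (18) holds.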
We will explain now why it seems reasonable to suppose that (18) is true. v should be the asymptotic speed of growth of the droplet, when it becomes very large (much larger than the critical size), and in this regime we can neglect the curvature of the surface of the droplet and regard the growth of its radius as resulting from the movement of its boundary as that of a (mesoscopically) flat interface, caused by the fact that h is positive. Thinking of the surface as a roughly flat interface and keeping in mind that h is small, we can, in first approximation, assume that on one side of the interface we have the minus-phase and on the other side the plus-phase, which are symmetric, and that protuberances of each phase into the other at the interface are essentially similar. The movement of the interface is then caused simply by the larger rate of flip of spins in the upward direction, caused by the fact that h > 0, when we compare two situations which are related by spin reversal at all sites. From part (ii) of Proposition 1, one can see that this difference in the rates of flip caused by the external field h is of the order of h. Therefore one obtains v ∼ h as h ↘ 0, which implies (18).

From (19) one sees that the relaxation time, even for the infinite system, should grow exponentially with h^{1−d}, and what the rate of this exponential growth should approximately be when T is close to 0. The fact that in part (i) of Theorem 1 we have λ_1(T) which is asymptotic as T ↘ 0 to the value of λ_c(T) predicted in (19) is a pleasant feature of the method used to prove this side of the Theorem. On the other hand, in part (ii) of Theorem 1 we are missing the factor 1/(d + 1) in λ_2(T), because we are not able to control rigorously the growth of the supercritical droplets and make complete sense out of (18). The other factor by which λ_2(T) differs from λ_c(T) even as T → 0 is there for other technical reasons.
A major question, which seems to be controversial even from a heuristic standpoint, is the prediction of the correct value of $\lambda_c(T)$ for each $T$ (small enough, if necessary), and not just its asymptotic behavior as $T \to 0$. A certain type of ‘common wisdom’ says that one should repeat the computation above but with the cubes replaced by solids which have the Wulff shape corresponding to the surface tension at temperature $T$. This idea has, nevertheless, been challenged by the results obtained in the limit of very low temperature, in which $h$ is kept small but fixed and $T$ is scaled to 0, by Kotecký and Olivieri in [KO2] and [KO3] (results announced in [KO1]). (After discussions with these two colleagues, it seems to me that there is no compelling evidence that in the limit considered here, in which $T$ is small but fixed and $h \searrow 0$, Wulff shapes should be more likely to come into play in this problem than in the limit of very low temperatures.) In connection with this discussion, one may want to refer to the fact that investigations have been carried out on simulations and analytic (non-rigorous) studies of supercritical droplet growth (see for instance [DS] and references therein). Nevertheless such investigations refer to the growth of droplets which are very supercritical and should develop an asymptotic shape related to the different asymptotic speed of growth in different directions. The asymptotic shape is not given by the equilibrium Wulff construction, but by a similar construction based on the speed of growth as a function of the direction. In any case this asymptotic shape obtained when a droplet is moving downhill, ‘with
the drift’, does not clarify the controversy about the first droplets which appear and are likely to grow (a completely different, large-deviations type problem, related to moving uphill, ‘against the drift’).

5. Rigorous Counterparts to the Equilibrium Notion of Critical Droplets

In this section we consider the Gibbs measure $\mu_{\Lambda(B/h),-,h}$. We want to take $B > 2d$, so that, in heuristic terms, we are able to comfortably fit a supercritical droplet with negative energy inside this box. More precisely, the computation in the last section can be used to show that the only configuration that minimizes the energy is then the configuration with all spins $+1$ inside of $\Lambda(B/h)$, and that this is false if $B < 2d$. Below we will present two basic facts about these finite systems which were proven in [Sch3], and which are rigorous counterparts to what the heuristics tells us. The fact that we are considering $(-1)$-boundary conditions is crucial to allow one to use the systems we are considering here as building blocks in the analysis of larger systems.

We will denote by $\mathcal{B}$ the set of configurations in $\Omega$ in which the box $\Lambda(d/h)$ intersects an infinite cluster of spins $-1$. Observe that if $\sigma \in \Omega_{\Lambda(B/h),-}$ and $\sigma \notin \mathcal{B}$, then in the configuration $\sigma$, $\Lambda(d/h)$ is surrounded by a shell of $+1$ spins which separates it from $\partial_{\mathrm{ext}} \Lambda(B/h)$. For this reason, Theorem 2 below can be seen as a rigorous counterpart to the idea that when the process $(\sigma_{\Lambda(B/h),-,h;t})$ is in equilibrium, a droplet which covers the core of the system is present (once we know that the shell mentioned above is present, one can use the Markov property of the Gibbs measures in a standard fashion, by considering the outermost such shell, and conclude, from the Holley–FKG inequalities, that inside this shell the distribution is even higher than the plus-phase).

Theorem 2. For each $B > 2d$, there exists $T(B) > 0$ so that for all $T \in (0, T(B))$,
$$\lim_{h \searrow 0} \mu_{\Lambda(B/h),-,h}(\mathcal{B}) = 0.$$
This theorem is a strengthening of the main result in [Mar], where $B$ had to be taken large enough (significantly larger than $2d$) regardless of the temperature. Curiously enough, the proof of Theorem 2 in arbitrary dimension is technically the most difficult part of [Sch3]. In that paper a much simpler proof is also presented in the appendix for the special case $d = 2$. Using Theorem 2, one can easily prove part (ii) of the next theorem. Part (i) of that theorem was proven in [Sch3], using a type of ‘Peierls argument’ in the presence of an external field, borrowed from [CCO]. Observe that Theorem 3 is a sort of analogue for equilibrium of what Theorem 1 is for the dynamics; the result is more satisfactory here, since the constants $B_1(T)$ and $B_2(T)$ which appear in Theorem 3 both converge to the same limit, $2d$ (predicted by the heuristics), as $T \searrow 0$.

Theorem 3. For each dimension $d \ge 2$ there is $T_0 > 0$ such that for every temperature $T \in (0, T_0)$ the following happens. There are constants $0 < B_1(T) \le B_2(T) < \infty$ such that if we let $h \searrow 0$, then for every local observable $f$
(i) $\langle f \rangle_{\Lambda(B/h),-,h} \to \int f \, d\mu_-$ if $B < B_1(T)$.
(ii) $\langle f \rangle_{\Lambda(B/h),-,h} \to \int f \, d\mu_+$ if $B > B_2(T)$.
We can take $B_1(T) = 2d(\beta'/\beta)$ and $B_2(T) = 2d(1 + \delta(T))$, where $\delta(T)$ is a positive-valued function which vanishes as $T \searrow 0$.

6. Relaxation Time, Rate of Exponential Convergence and Gap in the Spectrum of the Generator

There are at least three different ways to look at the ‘speed of relaxation to equilibrium’ of the stochastic Ising models. The relaxation time, or rather the relaxation time as a function of a ‘precision’ parameter $\epsilon$, which is supposed to be positive and small, is defined by
$$t^\epsilon := \inf\{t \ge 0 : \mathbb{E}(\sigma^+_t(0)) - \mathbb{E}(\sigma^-_t(0)) < \epsilon\}.$$
The facts that in this definition only the observable identical to the value of the spin at the origin appears and only the extreme configurations $+1$ and $-1$ appear as initial configurations are natural, due to the inequalities (15) and to translation invariance. The rate of exponential convergence to equilibrium is defined as
$$\gamma := \sup\{a \ge 0 : \text{there exists } C < \infty \text{ such that } \mathbb{E}(\sigma^+_t(0)) - \mathbb{E}(\sigma^-_t(0)) \le C e^{-at} \text{ for all } t \ge 0\}.$$
Finally the third quantity is the gap in the spectrum of the generator $L$ of the stochastic Ising model, extended to act as an operator on $L^2(\Omega, \mu)$. Because the spectrum of $L$ lies in $(-\infty, 0]$, and 0 is in the spectrum (see, e.g., Section 4 of Chapter IV in [Lig]), the gap is defined as
$$\mathrm{gap} := \inf\{x > 0 : x \in \text{spectrum of } -L\}.$$
When we want to make the dependence on $h$ explicit, we write $t^\epsilon_h$, $\gamma(h)$ and $\mathrm{gap}(h)$, and we also use similar notation for the processes on finite subsets of $\mathbb{Z}^d$. A consequence of Theorem 1 is the following corollary.

Corollary 1. For each dimension $d \ge 2$, let $T_0$, $\lambda_1(T)$ and $\lambda_2(T)$ be as in Theorem 1. Suppose $T < T_0$. Then for all small enough but otherwise arbitrary positive $\epsilon$, for every $\lambda' < \lambda_1(T)$ and $\lambda'' > \lambda_2(T)$,
$$\exp(\lambda'/h^{d-1}) < t^\epsilon_h < \exp(\lambda''/h^{d-1})$$
for all small positive $h$.

To prove this result, given Theorem 1, all one needs is to control the approach to equilibrium starting from $+1$ (which actually occurs in a time which does not scale with $h$).
For this one can use the same arguments used to prove Proposition 3.
In recent years, a great deal of effort has been dedicated to establishing relations between $\gamma$ and gap, in the context of much more general lattice systems, and to the question of proving that each one of them is positive, for values of the parameters ($T$ and $h$ in our case) which are away from phase coexistence regions. This project is an important and very active field of research, and reviewing it here would be beyond the scope of this text. So we limit ourselves to referring the reader to the most recent papers [SZ], [LY], [MO1], [MO2], and [MOS], for much more on the general problem and references to the earlier literature. For our case, it was proven in [MO1] (Theorem 5.1, part (b)) that there exists $T_0 > 0$ so that for all $T \in (0, T_0)$ and every $h > 0$,
$$\mathrm{gap}(h) \ge \gamma(h) > 0.$$
In [MOS] it was also proven that in case $d = 2$, under the same conditions, $\gamma(h) = \mathrm{gap}(h)$. (The reader should be aware that the restrictions above are believed to be just technical. The same results are expected to hold in every dimension, for arbitrary $(T, h)$, except at the transition line and critical point $(0, T_c] \times \{0\}$.)

The positivity of $\gamma(h)$ and $\mathrm{gap}(h)$ is usually referred to as ‘rapid convergence to equilibrium’. The fact that it occurs in the regime where the system relaxes ‘slowly’ to equilibrium when started from $-1$, as indicated in Theorem 1 and Corollary 1, may seem at first sight to be a contradiction. On second thought, though, one realizes that there is no conflict between a slow loss of memory from the initial configuration, if this one is far from the typical ones in equilibrium, and an eventual exponential approach to equilibrium, for much later times. But even after this remark, it may be somewhat surprising that I will raise the conjecture that at low temperature, $\gamma(h)$ is relatively large for small $h$.

Conjecture 1. For each dimension $d \ge 2$, if $T < T_c$, then there exists a positive constant $C(T)$ such that for all small positive $h$,
$$\mathrm{gap}(h) = \gamma(h) \ge C(T) h^2.$$
In particular I believe that for all small $\epsilon > 0$,
$$t^\epsilon_h \gg (\mathrm{gap}(h))^{-1}, \qquad (20)$$
in the sense that the ratio between these quantities blows up when $h \searrow 0$. It is worth pointing out that as far as I am aware, all the methods developed for proving the positivity of $\gamma$ and gap in more general settings, when specialized to the situation that we are considering, do not provide a proof of (20). Before explaining why I am raising the above conjecture, I will present some arguments which actually go in the opposite direction, and may raise the reader’s interest. One can formally write down the following ‘spectral expansion’, by reasoning in terms of the space $L^2(\Omega, \mu_h)$,
$$\mathbb{E}(f(\sigma^-_{h;t})) = \sum_{i=0,\ldots} \left( \int f \psi_i \, d\mu_h \right) e^{-\kappa_i t} \psi_i(-1) = \int f \, d\mu_h + \sum_{i=1,\ldots} \left( \int f \psi_i \, d\mu_h \right) e^{-\kappa_i t} \psi_i(-1). \qquad (21)$$
Here the spectrum of $-L$ is being treated as if it were discrete, with $\psi_i$ being the eigenfunctions and $\kappa_i$ the eigenvalues, in increasing order. In particular $\kappa_0 = 0$ and $\kappa_1 = \mathrm{gap}(h)$. One is then tempted to conclude that a criterion to decide whether the system is close to equilibrium at time $t$ is the comparison between $t$ and $\mathrm{gap}(h)^{-1}$, i.e., that the relaxation time should be of the same order as $\mathrm{gap}(h)^{-1}$. The argument above is, of course, too crude to be taken as a serious objection to Conjecture 1, but I thought that it should be mentioned because there seems to still be a folklore, sometimes associated to this computation, according to which a long relaxation time should be identified with a small gap in the spectrum of the generator. It is important to stress that it is not only in the physics literature that the association between the inverse of the gap and the relaxation time in metastable situations is made. For example, in the early 80’s E. B. Davies published a series of papers with sophisticated mathematical results (in the form of one sided bounds of the type: a small spectral gap in a proper sense leads to some sort of metastability) which seem to have been motivated by this folklore. In his words: “It seems also plausible [Dav1] that the partition of the configuration space commonly introduced to study metastability is connected with the approximate degeneracy of the zero eigenvalue of the generator of the Glauber stochastic dynamics” ([Dav2], p. 541).

Next I want to point out that for finite (and not too large) counterparts of the model we are considering, this folklore is actually (in a sense) correct. Moreover the close relation between gap and the relaxation time for these smaller systems played a key role in the proof of Theorem 1.
We want to consider the process $(\sigma_{\Lambda(B/h),-,h;t})_{t \ge 0}$, i.e., the stochastic Ising model in the box $\Lambda(B/h)$, with $(-1)$-boundary condition, where $B$ is a constant that we will suppose to be larger than $2d$, exactly as in the previous section, so that a critical droplet and also supercritical droplets with negative energy fit inside this box. The following theorem did not appear explicitly in [Sch3], but, as we explain below, its proof was contained in that paper.

Theorem 4. For each dimension $d \ge 2$ there is $T_0 > 0$ such that for every temperature $T \in (0, T_0)$ and every $B > 2d$ there are constants $0 < C_1(T) \le C_2(T, B) < \infty$ such that the following holds:
(i) For all small $h > 0$,
$$\exp(-C_2(T, B)/h^{d-1}) \le \mathrm{gap}(\Lambda(B/h), -, h) \le \exp(-C_1(T)/h^{d-1}).$$
(ii) For all small $\epsilon > 0$,
$$\exp(C_1(T)/h^{d-1}) \le t^\epsilon_{\Lambda(B/h),-,h} \le \exp(C_2(T, B)/h^{d-1}),$$
provided $h > 0$ is small enough.
We can take for $C_1(T)$ any number smaller than $2^d (d-1)^{d-1} (\beta'/\beta)^d \beta$ and for $C_2(T, B)$ any number larger than $2 B^{d-1} \beta$. (In (ii) one can actually take for $C_2(T, B)$ any number larger than $2(2d)^{d-1} \beta$, but this is not relevant for the present discussion.) The lower bound on the gap in this theorem is a particular case of Theorem 5 in [Sch3]. The main technique used for proving it was borrowed from [DiS], [SJ], [JS] and [Sin]. The upper bound on the gap in this theorem appeared in [Sch3] in a discussion at the end of Section 4; its proof relies on the estimate (32) in that paper, which was derived using the techniques from [CCO] alluded to in the previous section. As for the relaxation times, the same techniques from [CCO], combined with appropriate couplings, give the lower bound, while, as we will explain below, the upper bound results from the lower bound for the gap and from the fact that, contrary to what we are proposing for the infinite system, the relaxation time for the finite (and small) systems that we are considering is essentially bounded above by the inverse of the gap in the spectrum of their generator. Next we will see why this is the case.

For finite systems, (21) above is correct, but even then one has to be careful before neglecting the terms in the last sum (the number of terms there grows as the volume grows). The following is a rigorous counterpart to (21) (see a derivation in Subsection 3-ii in [Sch3]):
$$\left| \mathbb{E}\big(f(\sigma^\eta_{\Lambda(B/h),-,h;t})\big) - \int f \, d\mu_{\Lambda(B/h),-,h} \right| \le \big(\mu_{\Lambda(B/h),-,h}(\eta)\big)^{-1/2} \, \|f\|_\infty \, e^{-\mathrm{gap}(\Lambda(B/h),-,h)\, t}. \qquad (22)$$
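As a rough illustration of how a bound of the form (22) turns a lower bound on the gap into an upper bound on the relaxation time, one can solve its right hand side for $t$. The sketch below uses only the shape of (22); the function names and the sample numbers are hypothetical and are not taken from [Sch3].

```python
import math


def rhs_of_22(gap, log_inv_mu_eta, f_sup_norm, t):
    """Right hand side of (22): mu(eta)^(-1/2) * ||f|| * exp(-gap * t).

    log_inv_mu_eta is log(1 / mu(eta)); it is passed as a logarithm
    because 1 / mu(eta) itself can overflow a float when h is small.
    """
    return math.exp(0.5 * log_inv_mu_eta + math.log(f_sup_norm) - gap * t)


def time_to_precision(gap, log_inv_mu_eta, f_sup_norm, eps):
    """Smallest t >= 0 at which the right hand side of (22) is <= eps."""
    if gap <= 0 or eps <= 0 or f_sup_norm <= 0:
        raise ValueError("gap, eps and f_sup_norm must be positive")
    log_prefactor = 0.5 * log_inv_mu_eta + math.log(f_sup_norm)
    return max(0.0, (log_prefactor - math.log(eps)) / gap)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    t = time_to_precision(gap=1e-3, log_inv_mu_eta=200.0, f_sup_norm=1.0, eps=0.01)
    print(t, rhs_of_22(1e-3, 200.0, 1.0, t))
```

The prefactor enters the resulting time only through its logarithm, while the gap enters through its inverse, which is the mechanism behind the comparison made in the text.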
The factor $(\mu_{\Lambda(B/h),-,h}(\eta))^{-1/2}$ in the upper bound provided by (22) may be quite large when $h$ is small, but the following very rough bound will control it for our purposes:
$$\big(\mu_{\Lambda(B/h),-,h}(\eta)\big)^{-1/2} \le \left( \frac{\exp(-\beta H_{\Lambda(B/h),-,h}(\eta))}{2^{|\Lambda(B/h)|} \max_{\zeta \in \Omega_{\Lambda(B/h),-}} \exp(-\beta H_{\Lambda(B/h),-,h}(\zeta))} \right)^{-1/2} \le e^{C(\beta)|\Lambda(B/h)|} \le e^{C(\beta) B^d / h^d},$$
uniformly in small $|h|$ and $\eta$. It may seem amazing that such a term is actually small compared with the other ones, but this is indeed the case. If in (22) we take $t = \exp(C/h^{d-1})$, with $C$ larger than $C_2(T, B)$, then this term is beaten by the factor $e^{-\mathrm{gap}\, t}$, whose exponent is itself exponentially large in $1/h^{d-1}$; this factor prevails and drives the right hand side of (22) to 0 as $h \searrow 0$. This concludes our sketch of the proof of Theorem 4.

The analysis of systems in boxes of side-lengths $B/h$ with $(-1)$-boundary conditions was crucial, in [Sch3], for proving part (ii) of Theorem 1. The idea was simply to use the basic coupling between $(\sigma^-_{h;t})$ and $(\sigma^-_{\Lambda(B/h),-,h;t})$, so that the first marginal lies always above the second one. Thanks to part (ii) of Theorem 3, all that was needed was to show that $(\sigma^-_{\Lambda(B/h),-,h;t})$ relaxes to equilibrium in a time of the order of an exponential of $(1/h)^{d-1}$, which is part of Theorem 4.

Having tried to convince the reader that the lower bound on $\gamma(h)$ which appears in Conjecture 1 would be an interesting result, in part because it says that the
behavior of the infinite system is different from that of the finite and not-too-large ones, we explain now why we believe that it is true. The argument that we present is, of course, not a rigorous proof, and for this reason we are somewhat sketchy even with the parts of the argument which can already be polished and made completely rigorous. It should become clear that the main reason why we do not have a proof is our lack of mathematical control on the way droplets grow. At the end of this section we present another conjecture, in which we tried to capture the essence of what is missing; a proof of that other conjecture would probably lead also to a proof of Conjecture 1.

In developing the argument below, I got insight from previous work on cellular automata, including the so-called bootstrap percolation model; I refer to the papers [And], [Mou] and [AMS]. This goes with the tradition: Theorem 1 was conjectured by Aizenman and Lebowitz in [AL], where they studied the same type of cellular automata.

The argument is somewhat more transparent (and ‘closer’ to a proof) in the case when the rates $c(x, \sigma)$ depend only on the restriction of $\sigma$ to the set of nearest neighbors of $x$ (as is the case for the three explicit examples presented in Section 2). For this reason we will assume below that this strengthened form of (H2) holds. We want to compare the processes starting from the configurations $-1$ and $+1$, and it is natural to couple them using the basic coupling. Then we have
$$\mathbb{E}(\sigma^+_{h;t}(0)) - \mathbb{E}(\sigma^-_{h;t}(0)) = \mathbb{P}(\sigma^+_{h;t}(0) \ne \sigma^-_{h;t}(0)).$$
In other words, we want to estimate the probability of seeing a discrepancy between the two marginals at $(0, t)$. But discrepancies cannot be created ‘spontaneously’, so that a discrepancy at $(0, t)$ must be caused by discrepancies at previous times. Because of the assumption that the interaction is only between nearest neighbors, one must then be able to find a set of points in space-time which contains $(0, t)$, is covered by discrepancies, is the union of straight line segments in the time direction, and has the property that we can move from $(0, t)$ to any other point in the set without leaving the set, going only backwards in time (keeping the space position fixed) or jumping from one space-time location to another one at the same time and at a neighboring site. We call the maximal such set the ‘discrepancy cluster of $(0, t)$’. It is crucial to observe that if there is a discrepancy at $(0, t)$, then its discrepancy cluster has to intersect the hyperplane $\mathbb{Z}^d \times \{0\}$.

It is clear that at the points where there are discrepancies the spin in the marginal process $(\sigma^-_{h;t})$ has to be $-1$. Hence, looking only at this marginal process, we see that the discrepancy cluster has to stay outside the droplets (whose boundaries are shells of spins $+1$). We are interested in the asymptotic behavior for large times, so we can afford waiting a time which is so long that typically the whole space is already covered by droplets which coalesced in such a way that a possible discrepancy cluster has little room left to move through. More precisely (remember that we want to make a quantitative prediction), we break the time $t$ into two parts, the first of which is an incubation time, $u = u(h)$, during which droplets appear all over the space. We want to be able to argue that the probability that the discrepancy cluster of the site $(0, t)$ reaches the hyperplane $\mathbb{Z}^d \times \{u\}$ decays fast enough with $t - u$.
The idea here is to look at the growth of the supercritical droplets, taking over the space still
not covered by them, but also sometimes (very infrequently) shrinking and losing ground, as a behavior that dominates the growth of a contact process! We explain now the meaning of the term ‘contact process’. The process that we have in mind is actually a discrete time contact process. The properties of this process that we will be mentioning and using later are either well known or, at least, can be proven by standard techniques. We refer the reader to Chapter 6 in [Lig] and to Chapters 4 and 11 of [Dur] for detailed studies of the (continuous-time) contact process. The version that we will use is defined on the lattice $\mathbb{Z}^d$ in discrete time, $0, 1, 2, \ldots$. Each site at each unit of time may be in state 0 (vacant) or state 1 (occupied), and the evolution is implemented by the following scheme. Given the configuration at time $s$, the configuration at time $s + 1$ is constructed in two steps. The first one is deterministic and defines a preliminary configuration at time $s + 1$: each occupied site at time $s$ remains occupied, and each vacant site becomes occupied if and only if it has at least one of its $2d$ nearest neighbors occupied at time $s$. The configuration at time $s + 1$ is now obtained from this preliminary configuration by replacing the 1’s with 0’s according to a probabilistic procedure; the simplest one would be to remove each 1 independently with a probability $\epsilon$. In our case we will be forced to consider a somewhat more general procedure, in which the decisions at different sites are not independent, but the dependency structure will be of finite range, with the probability of each 1 being removed still being a constant, $\epsilon$ say. The following is one of the basic facts about these contact processes: if $\epsilon$ is small the system admits a translation invariant probability measure $\nu$ with positive density of 1’s.
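The two-step update just described can be sketched in a few lines. This is only an illustration under simplifying assumptions that are not part of the argument: the lattice is replaced by a finite two-dimensional torus, and the removals are taken independent (the simplest procedure mentioned above). The names `contact_step` and `density` are hypothetical.

```python
import random

NEIGHBOR_OFFSETS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def contact_step(occupied, side, eps, rng):
    """One time step of the discrete time contact process on (Z/side)^2.

    occupied: set of (x, y) sites in state 1 at time s.
    Returns the set of sites in state 1 at time s + 1.
    """
    # Step 1 (deterministic): occupied sites stay occupied, and a vacant
    # site becomes occupied iff it has an occupied nearest neighbor.
    preliminary = set(occupied)
    for (x, y) in occupied:
        for dx, dy in NEIGHBOR_OFFSETS:
            preliminary.add(((x + dx) % side, (y + dy) % side))
    # Step 2 (random): each 1 is removed independently with probability eps.
    return {site for site in sorted(preliminary) if rng.random() >= eps}


def density(occupied, side):
    """Fraction of occupied sites on the torus."""
    return len(occupied) / (side * side)


if __name__ == "__main__":
    rng = random.Random(1993)
    state = {(0, 0)}
    for _ in range(60):
        state = contact_step(state, side=30, eps=0.05, rng=rng)
    print(density(state, 30))
```

On a finite torus the empty configuration is absorbing, so the invariant measure $\nu$ with positive density only makes sense for the infinite lattice; the finite simulation can at most suggest the long-lived high-density regime when $\epsilon$ is small.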
Before continuing with the argument, we stress again that this is not a rigorous proof, but rather a heuristic argument, and that the notion of ‘droplet’ used below is indeed somewhat vague. Below we are always referring to the process $(\sigma^-_{h;t})$ when considering the stochastic Ising model. The connection between the original stochastic-Ising-model space-time and the new one will be carried out by dividing the original one into blocks which are translates of $\Gamma := \Lambda(B/h) \times [0, Ah^{-2})$, with $B$ and $A$ large enough, and associating each one of these blocks to a point in the new space-time. More specifically, the new space-time point $(i, s)$ is in correspondence with $\Gamma(i, s) := \Gamma + (Ki, Ah^{-2}s)$, where $K$ is the side-length of $\Lambda(B/h)$, which is an integer close to $B/h$. We will say that the contact-process site $(i, s)$ is occupied if during the whole corresponding time interval, $[Ah^{-2}s, Ah^{-2}(s+1))$, the corresponding stochastic-Ising-model space region $\Lambda(B/h) + Ki$ is covered by a droplet.

One should now see the connection between the behavior that we expect for the droplets and the dynamics of the contact process. If $h$ is small, then we expect the heuristic predictions to be accurate with large probability. These predictions would be that occupied sites are likely to remain occupied at each time step, and that, if $A$ was chosen large enough, then vacant sites will become occupied by the contact mechanism, and $\epsilon$ will be small. The last statement is implied by the heuristics since there is a distance of order $1/h$ for the border of the droplet, which is supposedly moving at a speed of order $h$, to cross in a time $Ah^{-2}$. The mechanisms are not independent from site to site, but only finite range dependencies occur.

On the contact-process space-time, we will use the term ‘cluster of 0’s’ for sets of
space-time points which are all vacant and which form a connected subset of $\mathbb{Z}^{d+1}$. From what we said before, the cluster of discrepancies of $(0, t)$ has to be contained in blocks $\Gamma(i, s)$ which are all vacant. Let $s_0$ be the value of $s$ for which $(0, u)$ is in $\Gamma(0, s)$, and $s_1$ be the value of $s$ for which $(0, t)$ is in $\Gamma(0, s)$. Define $E$ as the event that the site $(0, s_1)$ belongs to a cluster of 0’s which intersects the hyperplane $\mathbb{Z}^d \times \{s_0\}$. We can now write
$$\mathbb{P}(\sigma^+_{h;t}(0) \ne \sigma^-_{h;t}(0)) \le \mathbb{P}(E).$$
The argument will be completed once we argue that we can pick $u = u(h)$ so that $\mathbb{P}(E)$ has to decay exponentially with $s_1 - s_0$ (which is equal to $\lfloor (t - u)/(Ah^{-2}) \rfloor$), with a rate which is $h$-independent, since then
$$\mathbb{P}(\sigma^+_{h;t}(0) \ne \sigma^-_{h;t}(0)) \le C_3 \exp(-C_4 (s_1 - s_0)) \le C(h) \exp(-(C_4/A) h^2 t),$$
from which the inequality $\gamma(h) \ge (C_4/A) h^2$ follows.

Now we will explain why the claim above on the decay of $\mathbb{P}(E)$ should be correct. Take $u := \exp(C_5/h^{d-1})$, where the constant $C_5$ will be taken conveniently large, so that the probability that a fixed site, the origin say, is occupied at time $s_0$ converges to 1 as $h \searrow 0$. Actually we want this probability to be very large, even if the box $\Lambda(3B/h)$ is separated from the rest of the system by $(-1)$-boundary conditions. This means that we are willing to make the incubation time so large that supercritical droplets will by then have popped up at most places, spontaneously, even if the region was artificially separated from the rest of the system. Let $\rho$ be the distribution of the contact-process configuration at time $s_0$. From the last remark, we see that the restriction of $\rho$ to the sublattice $3\mathbb{Z}^d$ lies above a product measure with density $1/2$ if $h$ is small. To estimate $\mathbb{P}(E)$ we will use this fact and recall also that $\epsilon$ can be made arbitrarily small by taking $h$ small enough (the contact process can be considered to be in the so-called ‘Peierls regime’).

A solution to our problem can now be given by first comparing the contact process started from $\rho$ at time $s_0$ to the same contact process started from its invariant measure $\nu$ at time $-\infty$. One can show that with a sufficiently large probability, the two processes, when coupled in a standard fashion, become identical in a space-time region, $\Delta$, around $(0, s_1)$ with diameter proportional to $s_1 - s_0$ (see Chapter 11 of [Dur] for related, but more delicate, results). The argument is then completed by bounding the probability that in the stationary process the site $(0, s_1)$ belongs to a cluster of 0’s which is not contained in this region $\Delta$ (below we call this event $F$). This can be done with ‘contour arguments’. Details are quite standard, but very cumbersome, and so we only sketch them.
In order to use the standard contours associated with contact processes, we first take the sublattice $\mathcal{L}$ of the contact-process space-time $\mathbb{Z}^{d+1}$ which contains the sites $(i, s)$ with the property that the sum of $s$ with the components of $i$ has the same parity as $s_1$ (so that $(0, s_1) \in \mathcal{L}$). Contours can be thought of as surfaces which separate vacant from occupied sites in $\mathcal{L}$. It is not hard to see that if the event $F$ occurs, then there must be a family of contours surrounding the restriction of the vacant cluster of the site $(0, s_1)$ to the sublattice $\mathcal{L}$ and that, moreover, this family of contours must be ‘connected’ in the sense that one can move continuously from any point on any of these contours to any other such point without leaving these surfaces. For this reason the number of families of
contours in the conditions above, with total surface $l$, grows at most as a certain exponential of $l$. Because of the definition of the event $F$, the minimum value for this total surface is a given multiple of $s_1 - s_0$. From this point on, the argument is completely standard, since the dependency structure in the procedure of erasing 1’s is of finite range and $\epsilon$ can be made as small as needed. This completes our argument in support of Conjecture 1.

From the discussion above it should be clear that one of the first questions to address is that of the growth of the supercritical droplets when $h$ is small. If we admit that this growth is of the type discussed in the heuristics, similar to the movement of essentially flat interfaces, then we should try to understand this mechanism in situations which are not so complex. A simpler situation in which such a movement of a flat interface should occur is that which is produced if we have a box $\Lambda(B/h)$, with $B$ large enough (so that in equilibrium the plus-phase dominates in the bulk), but with the boundary condition in which the spins are $-1$ on all the faces of $\Lambda(B/h)$ but one, where they are $+1$. In this situation, starting from all spins down, we should see the interface move away from this $+$ face, in a fashion similar to what we expect to happen at the border of a large droplet. From the heuristic prediction of a speed $v \sim h$ for the interface, we should expect:

Conjecture 2. The system just described should relax to equilibrium in a time of order $h^{-2}$. The gap in the spectrum of the generator in this case should be of order $h^2$, as $h \searrow 0$.

Proving this result (even a lower bound for the gap in the form of some large power of $h$) would probably be a great step towards controlling the growth of the droplets.

7. Finite Systems

From the perspective of physics, the motivation behind statements such as those in Theorem 1, which refer to infinite systems, is actually the idea that such systems are idealizations of very large, but finite, systems. By very large, here, it should be understood that the system is much larger than any relevant space scale in the problem. From the heuristics, it should be clear that, in the present case, a very large system should be one with linear size of the order of $\exp(D/h^{d-1})$, where $D$ is a large enough constant (how large depends on $T$ also). In such a case the relevant space-time cones introduced in the heuristics will be fully contained inside the system.

Suppose now that we have a smaller system, the process with $(-1)$-boundary conditions evolving in the box $\Lambda(l)$ of side $l$, say, where $l$ scales in some fashion with $h$. If $l$ stays constant as $h \searrow 0$, or even if it grows too slowly as $h \searrow 0$, then even in equilibrium the effect of the boundary conditions does not vanish as $h \searrow 0$. From the heuristics this should be clear, since then critical droplets may be larger than the whole system. On the other hand, if we suppose that $l$ grows fast enough to avoid this problem, then we obtain the theorem below, which generalizes Theorem 1.
Theorem 5. For each dimension $d \ge 2$ there is $T_0 > 0$ such that for every temperature $T \in (0, T_0)$ and every constant $D \ge 0$ the following happens. There are constants $0 < \lambda_1(T, D) \le \lambda_2(T, D) < \infty$ and $B(T) < \infty$ such that if we let $h \searrow 0$, $t \to \infty$ and $l \to \infty$ together in such a fashion that $\liminf hl > B(T)$ and $\lim h^{d-1} \log l = D$, then for every local observable $f$
(i) $\mathbb{E}(f(\sigma^-_{\Lambda(l),-,h;t})) \to \int f \, d\mu_-$ if $\limsup h^{d-1} \log t < \lambda_1(T, D)$.
(ii) $\mathbb{E}(f(\sigma^-_{\Lambda(l),-,h;t})) \to \int f \, d\mu_+$ if $\liminf h^{d-1} \log t > \lambda_2(T, D)$.
We can take $\lambda_1(T, D) = \max\{2^d (d-1)^{d-1} (\beta'/\beta)^d \beta - dD,\ (2^d (d-1)^{d-1}/(d+1)) (\beta'/\beta)^d \beta\}$, $\lambda_2(T, D) = 2^d d^{d-1} (1 + \delta_2(T)) \beta$, and $B(T) = 2d(1 + \delta_3(T))$, where for $i = 2, 3$, $\delta_i(T)$ are positive-valued functions which vanish as $T \searrow 0$.

Because the statements in Theorem 5 are somewhat involved, we single out next, as a corollary, the particular case in which $D = 0$ (for instance, $l$ may grow as $(1/h)^a$, where $a > 1$, or as $(2.001)d/h$ if the temperature is low enough). This case is conceptually simpler than the case of infinite systems, covered by Theorem 1, in that the notion of ‘growth of droplets’ is irrelevant here. In particular there is no factor $1/(d+1)$ in the exponential rate of growth of the relaxation time with $1/h^{d-1}$. The asymptotic behavior of $\lambda_2(T, 0)$ as $T \searrow 0$, unfortunately, is still not what one predicts from the heuristics; nevertheless it is interesting to see that, when the dimension becomes large, the ratio between $\lambda_1(T, 0)$ and $\lambda_2(T, 0)$ stays bounded.

Corollary 2. For each dimension $d \ge 2$ there is $T_0 > 0$ such that for every temperature $T \in (0, T_0)$ the following happens. There are constants $0 < \lambda_1(T, 0) \le \lambda_2(T, 0) < \infty$ and $B(T) < \infty$ such that if we let $h \searrow 0$, $t \to \infty$ and $l \to \infty$ together in such a fashion that $\liminf hl > B(T)$ and $\lim h^{d-1} \log l = 0$, then for every local observable $f$
(i) $\mathbb{E}(f(\sigma^-_{\Lambda(l),-,h;t})) \to \int f \, d\mu_-$ if $\limsup h^{d-1} \log t < \lambda_1(T, 0)$.
(ii) $\mathbb{E}(f(\sigma^-_{\Lambda(l),-,h;t})) \to \int f \, d\mu_+$ if $\liminf h^{d-1} \log t > \lambda_2(T, 0)$.
We can take $\lambda_1(T, 0) = 2^d (d-1)^{d-1} (\beta'/\beta)^d \beta$, $\lambda_2(T, 0) = 2^d d^{d-1} (1 + \delta_2(T)) \beta$, and $B(T) = 2d(1 + \delta_3(T))$, where for $i = 2, 3$, $\delta_i(T)$ are positive-valued functions which vanish as $T \searrow 0$.

The various constants $\lambda_1(T)$, $\lambda_2(T)$, $\lambda_1(T, D)$, and $\lambda_2(T, D)$ which appear in Theorems 1 and 5 are the best ones that we could obtain rigorously. What should one predict for these constants based on the heuristics? If we suppose that critical droplets are created at a rate $\exp(-\phi(T)\beta/h^{d-1})$, where $\phi(T)$ is a sort of free-energy barrier that replaces $E_{\max}$ to account for thermal effects, then we should predict for $\lambda_1(T)$ and $\lambda_2(T)$ the common value $\beta\phi(T)/(d+1)$. As we said when the heuristics was presented, it is not clear how $\phi(T)$ should be computed, but on the other hand it seems clear that we should expect that as $T \searrow 0$, $\phi(T) \to E_{\max}$, so that we expect:
Conjecture 3. Theorem 1 holds with $\lambda_1(T) = \lambda_2(T) := \lambda_c(T)$, and
$$\lim_{\beta \to \infty} \frac{\lambda_c(T)}{\beta} = \frac{2^d (d-1)^{d-1}}{d+1}.$$
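The constant in Conjecture 3 can be traced back to an energy computation for cubic droplets. The sketch below assumes that a cube of $+1$ spins of side $l$ in a sea of $-1$ spins has energy $2d\,l^{d-1} - h\,l^d$ relative to the all-minus configuration. This normalization is an assumption made here for illustration; it is chosen because it reproduces the threshold $B = 2d$ of Section 5 and the constant displayed above. All function names are hypothetical.

```python
def cube_energy(l, h, d):
    """Assumed energy of a cubic droplet of side l: surface cost minus field gain."""
    return 2 * d * l ** (d - 1) - h * l ** d


def critical_side(h, d):
    """Side length at which cube_energy is maximal (derivative in l vanishes)."""
    return 2 * (d - 1) / h


def energy_barrier(h, d):
    """Value of cube_energy at the critical side: 2^d (d-1)^(d-1) / h^(d-1)."""
    return 2 ** d * (d - 1) ** (d - 1) / h ** (d - 1)


def conjectured_limit(d):
    """Right hand side of the limit in Conjecture 3."""
    return 2 ** d * (d - 1) ** (d - 1) / (d + 1)


if __name__ == "__main__":
    h, d = 0.1, 2
    l_c = critical_side(h, d)
    print(l_c, cube_energy(l_c, h, d), energy_barrier(h, d))
    print(cube_energy(2 * d / h, h, d))  # energy changes sign at side 2d/h
    print([conjectured_limit(k) for k in (2, 3, 4)])
```

Under this assumption the barrier times $h^{d-1}$ equals $2^d (d-1)^{d-1}$, and the extra factor $1/(d+1)$ in the conjectured limit is the one attributed in the text to the growth of supercritical droplets.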
How about the finite systems with side-length growing as $h \searrow 0$? Suppose that the side-length grows as $\exp(D/h^{d-1})$. Then the heuristics actually predict the existence of a critical value for $D$, which separates two regimes. If $D$ is small, then when a supercritical droplet first appears, it grows and covers the whole system in a time that is relatively short. In the opposite case, the droplet takes so long to cover the system that, while it is still growing, other supercritical droplets pop up spontaneously in other parts of the system. The relaxation then occurs very much as in the infinite system, with many droplets coalescing. The critical value for $D$ can be easily predicted. Heuristically, the time $t_1$ for a first critical droplet to appear in the system can be computed via
$t_1 \, (e^{D/h^{d-1}})^d \, e^{-\phi\beta/h^{d-1}} = 1$,
from which we obtain
$t_1 = e^{(\phi\beta - dD)/h^{d-1}}$.
On the other hand, the time for a supercritical droplet to cover the system after it first appeared is of the order of
$t_2 = e^{D/h^{d-1}}/v(h) \sim h e^{D/h^{d-1}}$.
Equating $t_1$ and $t_2$ (on the exponential scale) we obtain the critical value for $D$:
$D_c(T) = \phi(T)\beta/(d+1)$.
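The balance between the two time scales can be checked numerically. The following Python sketch compares the exponents of $t_1$ and $t_2$ on the scale $1/h^{d-1}$; the function names are hypothetical, and the sub-exponential factor coming from $v(h)$ is deliberately ignored, which is an assumption of this illustration.

```python
import math


def first_droplet_exponent(phi_beta: float, d: int, D: float) -> float:
    """Exponent a in t1 = exp(a / h^(d-1)), i.e. a = phi*beta - d*D."""
    return phi_beta - d * D


def covering_exponent(D: float) -> float:
    """Exponent in t2 ~ exp(D / h^(d-1)); the factor involving v(h) is ignored."""
    return D


def critical_D(phi_beta: float, d: int) -> float:
    """Value of D at which the two exponents coincide: phi*beta / (d+1)."""
    return phi_beta / (d + 1)


if __name__ == "__main__":
    phi_beta, d = 4.0, 2  # illustrative numbers only
    Dc = critical_D(phi_beta, d)
    print("D_c =", Dc)
    for D in (0.5 * Dc, Dc, 1.5 * Dc):
        print(D, first_droplet_exponent(phi_beta, d, D), covering_exponent(D))
```

For $D < D_c$ the first exponent dominates (the wait for a first droplet is the bottleneck), while for $D > D_c$ the covering time is longer than the waiting time, which is the regime where many droplets are expected to appear and coalesce.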
We state now the corresponding conjecture in a compact form. Observe first that from the discussion before Conjecture 3, we should have φ(T )β = (d + 1)λc (T ). Conjecture 4. Theorem 5 holds with λ1 (T, D) = λ2 (T, D) = λc (T, D) = max{(d + 1)λc (T ) − dD, λc (T )}. Comparing the numerical values which appear in Theorems 1 and 5 with the corresponding conjectures above, one sees that the lower bounds are much better than the upper bounds. This is so in part because of our lack of rigorous control on the way the droplets grow, as discussed in the previous section. We turn our attention now to a very interesting idea on how to look at the metastable behavior of systems: the so called ‘pathwise approach to metastability’, introduced in [CGOV]. Roughly speaking, if one looks not at an average over many realizations of the evolution, as we have been doing here when we take E (·), but at a single evolution, then one should see a very sharp transition between the metastable and the stable situations. The time taken to make the transition is very short compared with the time spent in the metastable situation, so that essentially the jump
is ‘instantaneous’. The moment of the jump is nevertheless random and for this reason, when one considers an average over many realizations, one sees a much smoother evolution. It is very interesting to see that the distinction between pathwise behavior and average behavior was also realized, apparently independently, by investigators performing simulations. In the paper [TM] the authors emphasize this distinction and observe experimentally sharp pathwise jumps (I am grateful to R. Kotecký for telling me about this paper; see also other references quoted in [TM]). A detailed pathwise study of the behavior of stochastic Ising models in the limit in which $h \searrow 0$ remains a challenging open problem. From the heuristics one can actually predict two different behaviors for typical paths, depending on how the size of the system grows as $h \searrow 0$. If the side-length $l$ grows ‘slowly’, in the sense that $\limsup_{h \searrow 0} h^{d-1} \log l < D_c(T)$, then, when a supercritical droplet is first formed, it should cover the system in a relatively short time. In this case we should see a sharp discontinuity in the state of the system for typical paths, if we rescale time. This should be so even if we are observing the whole system, for instance through the (non-local) observable $M_t := \sum_{x \in \Lambda(l)} \sigma^-_{\Lambda(l),-,h;t}(x)$. But if $l$ grows ‘fast’, in the sense that $\liminf_{h \searrow 0} h^{d-1} \log l > D_c(T)$, then we are in the regime in which the whole system relaxes due to many droplets being formed and coalescing, and the evolution of observables like the $M_t$ defined above should be much smoother. In contrast, local observables should still display a jump in their pathwise behavior, reflecting the moment when they are first covered by a supercritical droplet. The only difference in this case, compared with the smaller systems, should be that the rescaled time of the jump has a distribution that is neither degenerate (concentrated on a constant) nor exponential.
This follows from the consideration of the regions in space-time (the cones considered before) where droplets have to be formed, to cover a certain site at a certain time. (But the analysis is actually complicated by the interaction between droplets, when they touch each other.)
One interesting object to look at is the moment when the processes $(\sigma^-_{\Lambda(l),-,h;t})$ and $(\sigma^+_{\Lambda(l),-,h;t})$ ‘couple’. By this we mean that we construct both processes on the same probability space, as explained before (basic coupling), and define
$S := \inf\{t \ge 0 : \sigma^-_{\Lambda(l),-,h;t} = \sigma^+_{\Lambda(l),-,h;t}\}$.
From the discussion above we conjecture the following.
Conjecture 5. Let $T$ and $B(T)$ be as in Theorem 5. Suppose that we let $h \searrow 0$ and $l \to \infty$ together in such a fashion that $\liminf hl > B(T)$ and $\lim h^{d-1} \log l = D$. Then $S/E(S)$ converges in distribution to a unit-mean exponential law if and only if $D < D_c(T)$.

8. Different Asymptotic Regimes
The results that were proved in [Sch3] and the conjectures discussed above are always in the form of asymptotics for positive $h$, when this external field vanishes.
This idea that metastability phenomena should be mathematically described by considering families of processes, indexed by a parameter, and scaling the parameter to zero is not at all new. For fixed values of the parameters $h$ and $T$, the stochastic Ising models do not seem to display any clear-cut, sharp, metastable behavior, but in certain limits, such as the one considered here, the behavior of the system becomes closer and closer to what one identifies experimentally as metastable behavior. To some extent this is akin to many other situations in mathematical physics, in which one proves results in the form of limits, with the motivation of understanding the behavior of the system when the scaled parameter is actually fixed (but small or large enough, depending on the case). The thermodynamic limit is certainly an example which comes to mind. We will refer below to the type of limit considered so far in this paper ($T$ small and fixed, $h \searrow 0$) as limit type (i). It is interesting to compare the picture in this limit with the one in the regime in which $h > 0$ is small but fixed and $T$ is scaled to 0. A few years ago, in collaboration with E. J. Neves, the author introduced, in [NS1], [NS2] and [Sch2], an approach which gave a precise mathematical meaning to the notion of critical droplets and metastability for the same stochastic Ising models considered in the present paper, but in this different regime. For a review of this project, as it stood in mid-1990, see [Sch1], where references are also given to papers which motivated the approach. Martinelli, Olivieri and Scoppola exploited the results on droplets in this regime to prove rapid convergence to equilibrium in [MOSco1], a topic to which we will return below. The same authors also analyzed similar questions for the Swendsen–Wang dynamics by means of a similar analysis of individual droplets in [MOSco2] and [MOSco3].
More recently, further results along these lines appeared in the work of Kotecký and Olivieri, [KO1], [KO2], [KO3], who considered the same type of time evolution that we are considering, but for different Hamiltonians, obtaining interesting differences between the correct patterns of relaxation and some ‘common wisdom’, at least in this regime. Still more recently, Scoppola, [Sco], presented a general approach to problems of this type, based on the separation of the relevant time scales. The approach referred to in the previous paragraph, which we call ‘the limit of very low temperatures’, or limit of type (ii), is very helpful in clarifying the way in which droplets behave, and how this affects the evolution of the systems. There are various ways in which one can look at this approach. One way to see it is as a laboratory for obtaining insight into what should happen in the more challenging and also, from the point of view of physics, more relevant limit of type (i). Another way to look at this program is as a very valuable project in itself; after all, the metastable behavior that is observed for the system in simulations in which $T$ and $h$ are both fixed and different from 0 may be a reflection of the asymptotic behavior in the limit of type (ii) as well as that of type (i). The relaxation patterns of stochastic Ising models are currently much better understood, at a mathematically rigorous level, in this limit of very low temperatures, as compared to the limit of type (i). In two dimensions, these results include an understanding of the mechanism by which supercritical droplets grow, and for this reason provide results on the metastable behavior of finite systems which are quite sharp. Because the size of critical droplets scales with $h$, but not with $T$, it makes sense in the case of limit (ii) to consider
the system in a box $\Lambda(l)$ with fixed $l$ (large compared with $2/h$), and in this case the metastable behavior and its decay were analyzed in great detail, including the pathwise description of the evolution, as discussed in the previous section. Results for the infinite system, of the type of those obtained for limit (i) in [Sch3], can also be obtained in the case of limit (ii), and in 2 dimensions, for Metropolis and Heat Bath dynamics, they are sharper than those. In this case one wants to look at the system at time $t = \exp(C\beta)$, for different values of $C$. One can indeed show that for small values of $C$ (depending on $h$) one sees locally all spins down and for large values of $C$, all spins up. Efforts to identify a single critical value $C(h)$ separating the two regimes (i.e., addressing the analogue of Conjecture 3) have failed so far, because it is hard to control the way the speed of growth of the supercritical droplets behaves asymptotically, when the droplets become very large. (The analogue of (18) seems to fail here and one has to find the value of $\lim_{\beta \to \infty} (1/\beta) \log v$.) The good news is that in spite of the problem just pointed out, we have enough control on the way the droplets grow to obtain, for instance, a result which is analogous to Theorem 1, but in which the denominator $d + 1 = 3$ appears in both the upper and the lower bound for $C(h)$. To state the precise result, we need to first recall some facts from [NS1]. Observe that the critical droplet will be an object of size of order 1, since $h$ is fixed. When $2/h$ is an integer some special things happen, and to avoid them we suppose for the moment that this is not the case (in any case, once we have results like Theorems 6 and 7 below for $2/h$ not an integer, the same results follow in general by interpolation). We will use the notation $l_c := \lceil 2/h \rceil$ for the smallest integer larger than $2/h$.
The critical droplet here is not exactly a square, but rather an object with the following shape: it is a rectangle of sides $l_c$ and $l_c - 1$ plus a single site adjacent to one of the larger sides of this rectangle (so that the smallest rectangle that contains this object is a square of side $l_c$). The quantity $E_{\max}$ from the heuristics is replaced by the energy associated to the critical droplet just described, and is therefore $\Gamma(h) = 4l_c - h(l_c(l_c - 1) + 1)$, which also has the property $\Gamma(h) \sim 4/h$ as $h \searrow 0$. The best analogue of Theorem 1 and Corollary 1 that we could prove so far (this was done in collaboration with Eduardo Jordão Neves, some time ago) is, with a self-explanatory notation,
Theorem 6. For Metropolis and Heat Bath dynamics, in $d = 2$, if $h$ is positive and small, there are constants $0 < C_1(h) \le C_2(h) < \infty$ such that if we let $\beta \to \infty$ and $t \to \infty$ together, then for every local observable $f$
(i) $E(f(\sigma^-_{\beta;t})) \to f(-1)$ if $\limsup \beta^{-1} \log t < C_1(h)$,
(ii) $E(f(\sigma^-_{\beta;t})) \to f(+1)$ if $\liminf \beta^{-1} \log t > C_2(h)$.
We can take $C_1(h) = \Gamma(h)/3$, and $C_2(h) = \Gamma(h)/3 + (2 - h)/3$. Moreover, for every small $\varepsilon > 0$, if $C_1' < C_1(h)$ and $C_2' > C_2(h)$, then $\exp(\beta C_1') < t_\beta < \exp(\beta C_2')$, for all large $\beta$.
Observe that we obtain
$\lim_{h \searrow 0} hC_1(h) = \lim_{h \searrow 0} hC_2(h) = 4/3$.
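The constants of Theorem 6 are explicit, so the limit above is easy to check numerically. Below is a minimal Python sketch; the function names are hypothetical, and the code assumes (as the text does) that $2/h$ is not an integer, so that $l_c = \lceil 2/h \rceil$.

```python
import math


def critical_side(h: float) -> int:
    """l_c = ceil(2/h), the smallest integer larger than 2/h (2/h not an integer)."""
    return math.ceil(2.0 / h)


def gamma(h: float) -> float:
    """Energy of the critical droplet: Gamma(h) = 4 l_c - h (l_c (l_c - 1) + 1)."""
    lc = critical_side(h)
    return 4 * lc - h * (lc * (lc - 1) + 1)


def theorem6_constants(h: float) -> tuple[float, float]:
    """Return (C1(h), C2(h)) = (Gamma/3, Gamma/3 + (2 - h)/3)."""
    g = gamma(h)
    return g / 3.0, g / 3.0 + (2.0 - h) / 3.0


if __name__ == "__main__":
    for h in (0.3, 0.07, 0.013, 0.0013):
        c1, c2 = theorem6_constants(h)
        print(h, critical_side(h), gamma(h), h * c1, h * c2)
```

As $h$ decreases, both $hC_1(h)$ and $hC_2(h)$ approach $4/3$, while the gap $C_2(h) - C_1(h) = (2 - h)/3$ stays bounded, which is why it becomes negligible relative to either constant.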
In particular, the difference between $C_1(h)$ and $C_2(h)$ becomes much smaller than each one of these constants, when $h$ is small. Next we address the analogue of Conjecture 1 in this regime. Because in this regime the notion of a droplet of the plus-phase is essentially that of a region fully covered by $+1$ spins (if the region is not too large, say, not being scaled with $\beta$), and because we have in this regime a much better control on the way the droplets grow, we can actually prove the following result.
Theorem 7. In dimension $d = 2$, for Metropolis and Heat Bath dynamics, if $h$ is small and positive, then for all $\varepsilon > 0$
$\mathrm{gap}(\beta) = \gamma(\beta) \ge \exp(-\beta(2 - h + \varepsilon))$,
for all large enough $\beta$.
The equality in the display was proven in [MO2], and the inequality can be proven using the type of arguments presented in Section 6, in support of Conjecture 1. For this purpose one should take the time length of the blocks used there (which there was $Ah^{-2}$) as $\exp(\beta(2 - h + \varepsilon))$ and the side-length $K$ of the same blocks in the space direction as an integer larger than $2d/h = 4/h$. Say that the contact process site $i$ is occupied at time $s$ if, for the stochastic Ising model, in the process $(\sigma^-_{\beta;t})$ at time $s \exp(\beta(2 - h + \varepsilon))$ the cube $\Lambda(3K) + Ki$ is fully covered by $+1$ spins and between this time and time $(s + 1) \exp(\beta(2 - h + \varepsilon))$ the cube $\Lambda(K) + Ki$ is always fully covered by $+1$ spins. The argument works now because, from the arguments used to prove Theorem 1 in [NS1], we know that during a time of order $\exp(\beta(2 - h + \varepsilon))$ a square droplet of linear size $3K$, with $K > l_c$, is likely to have grown to cover a concentric square of linear size $5K$, without ever having lost any spin $+1$ in the concentric square of linear size $K$. Combining the two theorems above, we see that the analogue of (20) is true here: for small $\varepsilon > 0$
$t_\beta \gg (\mathrm{gap}(\beta))^{-1}$, (22)
in the sense that the ratio between these quantities blows up when $\beta \to \infty$.
As far as I know, the result (22) could not be derived from the methods which were previously used to prove that $\gamma(\beta) > 0$ for large $\beta$ as above. Curiously enough, the first proof that $\gamma(\beta) > 0$ in this regime was given in [MOSco1], where results from [NS1] were also used, to obtain a lower bound for $\gamma(\beta)$ of the order of $\exp(-\beta\Gamma(h))$. While Theorem 7 is interesting because it implies (22), it is probably not the last word. In fact I believe that the gap here should not even vanish as $\beta \to \infty$! More precisely, I propose:
Conjecture 6. For every dimension $d$, for Metropolis and Heat Bath dynamics, if $h$ is small and positive, then there exists a constant $C(d, h) > 0$ such that $\mathrm{gap}(\beta) = \gamma(\beta) \ge C(d, h)$, for all large enough $\beta$.
The reason for this conjecture is the fact that when the space is eventually mostly covered by droplets of spins $+1$, these droplets grow not only by the mechanism responsible for Theorem 1 in [NS1] (the appearance of protuberances at the surfaces of flat droplets, at a rate of order $\exp(-\beta(2 - h))$), but also by the interaction between droplets which overlap. The point is simply that if a site where the spin is $-1$ is a neighbor of $d$ distinct droplets of spin $+1$, then this spin flips with rate of order 1 (one can call it a ‘bootstrap percolation mechanism’). In $d = 2$ for instance, this type of interaction causes overlapping finite rectangular droplets to grow in a time of order 1 until the smallest rectangle that contains both of them is covered with $+1$ spins. In the blocking argument used to prove Theorem 7 we want to take this mechanism into account. We explain next how we plan to do it, but observe that we are still short of a proof. Let the time-length of the blocks $\Gamma(i, s)$ be simply a large constant $A$, and declare the contact process site $i$ to be occupied at time $s$ if, for the stochastic Ising model, in the process $(\sigma^-_{\beta;t})$ the space-time block $\Gamma(i, s)$ is fully occupied by spins $+1$. Then, instead of seeing a contact process dynamics, we should see a Toom-model type dynamics (as a comparison process, and with some local dependency). By this I mean a model which is similar to the discrete time contact process that was described in Section 6, but in which the preliminary configuration at time $s + 1$ (before the random erasing of 1’s) is obtained from the configuration at time $s$ via a bootstrap-percolation-type rule: 1’s do not change and a 0 becomes a 1 if and only if it has in each one of the $d$ coordinate directions at least one neighboring 1. The reader is referred to [BG] for an interesting treatment of such systems among others and for references to the earlier literature.
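The update rule just described is simple to simulate. The following Python sketch performs one time step in $d = 2$; the function name is hypothetical, and both the periodic boundary conditions and the independent erasing of each 1 with a fixed probability are assumptions of this illustration (the comparison process in the text has some local dependency that is not modeled here).

```python
import random


def bootstrap_step(config, erase_prob, rng):
    """One step of a Toom-type dynamics on an n x n torus (d = 2).

    First the bootstrap rule is applied: 1's do not change, and a 0 becomes
    a 1 iff it has at least one neighboring 1 in each coordinate direction.
    Then each 1 is erased independently with probability erase_prob
    (independence is an assumption of this sketch).
    """
    n = len(config)
    new = [row[:] for row in config]
    for x in range(n):
        for y in range(n):
            if config[x][y] == 0:
                first = config[(x - 1) % n][y] or config[(x + 1) % n][y]
                second = config[x][(y - 1) % n] or config[x][(y + 1) % n]
                if first and second:
                    new[x][y] = 1
    for x in range(n):
        for y in range(n):
            if new[x][y] == 1 and rng.random() < erase_prob:
                new[x][y] = 0
    return new


if __name__ == "__main__":
    rng = random.Random(0)
    n = 20
    config = [[1 if rng.random() < 0.3 else 0 for _ in range(n)] for _ in range(n)]
    for step in range(10):
        config = bootstrap_step(config, erase_prob=0.01, rng=rng)
        print(step, sum(map(sum, config)) / n**2)
```

With `erase_prob=0` the rule fills in corners between occupied sites, so two diagonally adjacent 1's complete the $2 \times 2$ square containing them in one step.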
Such systems are known to survive when the probability of removing 1’s is small (because the so-called ‘eroder condition’ is satisfied). And by taking $A$ large and then $\beta$ large, we can make this probability as small as we want. To prove Conjecture 6 above one would have to verify that something like the exponential estimates for the contact process used before hold for these processes. There is actually one case in which we can prove that Conjecture 6 holds: the case $d = 1$! While the limit of type (i) is uninteresting in this case, since $T_c = 0$, the limit of type (ii) leads to metastable behavior, similar to the one that occurs in 2 dimensions. The situation, of course, is much simpler because the critical droplets have size 1, as one can easily check. But this simplicity may also be seen as an advantage, since it makes the one-dimensional system an excellent laboratory to test what we expect to happen in higher dimensions and to get insight. The energy associated to a critical droplet is $2 - h$, and for the relaxation times one obtains precisely the result predicted from the heuristics: for all small $\varepsilon > 0$, if $C_1 < (2 - h)/2$ and $C_2 > (2 - h)/2$, then $\exp(C_1\beta) < t_\beta < \exp(C_2\beta)$, for all large $\beta$. Concerning the argument just given to explain why we raised Conjecture 6, we observe that in $d = 1$ the corresponding Toom type model is actually the same discrete-time contact process that was described before, in Section 6. The same contour arguments mentioned there can be used to prove the conjecture in this case. In this case we definitely see that in spite of the metastable behavior of the infinite system, the gap is not vanishing! Something very interesting happens when one considers a third type of limit.
Once one accepts as natural to scale $h \searrow 0$ or $T \searrow 0$, it becomes also reasonable to ask what happens if we let both vanish together. It turned out that in this regime, we obtained some results which are sharper than in the two other cases, in part because we could use techniques from both cases. The way in which $T$ and $h$ vanish is relevant here, and the analysis is easier if we let $T \searrow 0$ much faster than $h \searrow 0$. On the other hand the case in which we keep a constant ratio between $h$ and $T$ is particularly relevant, because, via a simple transformation, this is equivalent to keeping the temperature and external fields constant, while scaling only the coupling between spins to $\infty$. We call this type of limit, in which $h \searrow 0$ and $T \searrow 0$, while $h/T$ stays constant, limit of type (iii). In this regime the analogues of Conjectures 3 and 4 are fully vindicated:
Theorem 8. For Metropolis and Heat Bath dynamics, in $d = 2$, if we let $\beta \to \infty$, $h \searrow 0$ and $t \to \infty$ together, in such a way that $\beta h$ stays constant (positive), then for every local observable $f$
(i) $E(f(\sigma^-_{h,\beta;t})) \to f(-1)$ if $\limsup (h/\beta) \log t < 4/3$.
(ii) $E(f(\sigma^-_{h,\beta;t})) \to f(+1)$ if $\liminf (h/\beta) \log t > 4/3$.
Moreover, for every small $\varepsilon > 0$, if $C_1 < 4/3$ and $C_2 > 4/3$, then for $h = C/\beta$, where $C$ is an arbitrary positive constant,
$\exp(\beta C_1) < t_{h,\beta} < \exp(\beta C_2')$,
for all large $\beta$.
Theorem 9. For Metropolis and Heat Bath dynamics, in $d = 2$, for every constant $\delta \ge 0$ the following happens. If we let $\beta \to \infty$, $h \searrow 0$, $t \to \infty$ and $l \to \infty$ together in such a fashion that $\beta h$ stays constant (positive), $\liminf hl > 4$ and $\lim (h/\beta) \log l = \delta$, then for every local observable $f$
(i) $E(f(\sigma^-_{\Lambda(l),-,h,\beta;t})) \to f(-1)$ if $\limsup (h/\beta) \log t < \max\{4 - 2\delta, 4/3\}$.
(ii) $E(f(\sigma^-_{\Lambda(l),-,h,\beta;t})) \to f(+1)$ if $\liminf (h/\beta) \log t > \max\{4 - 2\delta, 4/3\}$.
Part (i) of each one of these theorems follows from techniques used in [Sch3] to prove part (i) of Theorem 1, while parts (ii) rely on a careful study of the behavior of individual droplets, in the spirit of the analysis carried out for limit (ii). This analysis of the growth of droplets is more delicate in regime (iii) than in regime (ii), because the corrosion at the four corners of droplets of linear size of the order of $B/h$ is now of order 1, while before it was of order $\exp(-\beta h)$, which was very small. The main reason the results are sharper here is that in spite of our ignorance about the precise speed $v$ with which the radius of supercritical droplets grows, we have effective upper and lower bounds for $v$ that are of the form $\exp(-C\beta)$, with $C \ge 0$ independent of $h$. In comparison, we are dealing with times of the order of $\exp(C\beta/h)$ for critical droplets to form, so that in the proper time scale, the
supercritical droplets grow quite fast, and our lack of knowledge is of secondary order. From the results and conjectures presented in this paper, it should be clear that one should try to explore the relaxation patterns of stochastic Ising models parametrized by three quantities: h, T and the side-length l of the box in which the system is contained (including the case l = ∞). From the heuristics it is usually possible to predict the correct behavior in different regimes, but there is still a substantial distance between most of these heuristic results and their rigorous counterparts. The issues related to relaxation of stochastic Ising models close to the phase-coexistence region, will certainly be the object of mathematical study for many years to come.
References
[AL] Aizenman, M. and Lebowitz, J. (1988). Metastability effects in bootstrap percolation. Journal of Physics A: Mathematical and General 21, 3801–3813.
[And] Andjel, E. D. (1992). Characteristic exponents for two-dimensional bootstrap percolation. Annals of Probability, to appear.
[AMS] Andjel, E. D., Mountford, T. S., and Schonmann, R. H. (1992). Equivalence of exponential decay rates for bootstrap-percolation-like cellular automata. Annales de l’Institut Henri Poincaré (Probabilités et Statistique), to appear.
[BM] Binder, K. and Müller-Krumbhaar, H. (1974). Investigation of metastable states and nucleation in the kinetic Ising model. The Physical Review B 9, 2328–2353.
[BG] Bramson, M. and Gray, L. (1991). A useful renormalization argument. In Random Walks, Brownian Motion and Interacting Particle Systems (R. Durrett and H. Kesten, ed.), Birkhäuser, Boston, 113–152.
[CC] Capocaccia, D., Cassandro, M., and Olivieri, E. (1974). A study of metastability in the Ising model. Communications in Mathematical Physics 39, 185–205.
[CGOV] Cassandro, M., Galves, A., Olivieri, E., and Vares, M. E. (1984). Metastable behavior of stochastic dynamics: a pathwise approach. Journal of Statistical Physics 35, 603–634.
[Dav1] Davies, E. B. (1982). Metastability and the Ising model. Journal of Statistical Physics 27, 657–675.
[Dav2] Davies, E. B. (1982). Metastable states of symmetric Markov semigroups II. Journal of the London Mathematical Society 26, 541–556.
[DS] Devillard, P. and Spohn, H. (1992). Kinetic shape of Ising clusters. Europhysics Letters 17, 113–118.
[DiS] Diaconis, P. and Stroock, D. (1991). Geometric bounds for eigenvalues of Markov chains. Annals of Applied Probability 1, 36–61.
[Dur] Durrett, R. (1988). Lecture Notes on Particle Systems and Percolation. Wadsworth & Brooks/Cole, Monterey, California.
[GD] Gunton, J. D. and Droz, M. (1983). Introduction to the Theory of Metastable and Unstable States. Lecture Notes in Physics 183, Springer, Berlin.
[GSS] Gunton, J. D., San Miguel, M., and Sahni, P. S. (1983). The dynamics of first order phase transitions. In Phase Transitions and Critical Phenomena (C. Domb and J. L. Lebowitz, ed.), Academic Press, London, 269–482.
[JS] Jerrum, M. and Sinclair, A. (1989). Approximating the permanent. SIAM Journal of Computing 18, 1149–1178.
[Koc] Koch, S. W. (1984). Dynamics of first order phase transitions in equilibrium and nonequilibrium systems. Lecture Notes in Physics 207, Springer, Berlin.
[KO1] Kotecký, R. and Olivieri, E. (1992). Stochastic models for nucleation and crystal growth. Proceedings of the International Workshop, Probabilistic Methods in Mathematical Physics (Siena, 1991) (F. Guerra, M. I. Loffredo, and C. Marchioro, ed.), World Scientific, Singapore, 264–275.
[KO2] Kotecký, R. and Olivieri, E. (1993). Droplet dynamics for an asymmetric Ising model. Journal of Statistical Physics 70, 1121–1148.
[KO3] Kotecký, R. and Olivieri, E. Shapes of growing droplets – a model of escape from a metastable phase. Preprint.
[Lig1] Liggett, T. M. (1985). Interacting Particle Systems. Springer, Berlin.
[LY] Lu, S. and Yau, H. T. Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics. Preprint.
[MO1] Martinelli, F. and Olivieri, E. Approach to equilibrium of Glauber dynamics in the one phase region. I: the attractive case. Preprint.
[MO2] Martinelli, F. and Olivieri, E. Approach to equilibrium of Glauber dynamics in the one phase region. II: the general case. Preprint.
[MOS] Martinelli, F., Olivieri, E., and Schonmann, R. H. For 2-D lattice spin systems weak mixing implies strong mixing. Preprint.
[MOSco1] Martinelli, F., Olivieri, E., and Scoppola, E. (1990). Metastability and exponential approach to equilibrium for low temperature stochastic Ising models. Journal of Statistical Physics 61, 1105–1119.
[MOSco2] Martinelli, F., Olivieri, E., and Scoppola, E. (1991). On the Swendsen and Wang dynamics. I: Exponential convergence to equilibrium. Journal of Statistical Physics 62, 117–133.
[MOSco3] Martinelli, F., Olivieri, E., and Scoppola, E. (1991). On the Swendsen and Wang dynamics. II: Critical droplets and homogeneous nucleation at low temperature. Journal of Statistical Physics 62, 135–159.
[Mar] Martirosyan, D. G. (1987). Theorems on strips in the classical Ising ferromagnetic model. Soviet Journal of Contemporary Mathematical Analysis 22, 59–83.
[Mou] Mountford, T. S. (1992). Rates for the probability of large cubes being non-internally spanned in modified bootstrap percolation. Probability Theory and Related Fields 93, 159–167.
[NS1] Neves, E. J. and Schonmann, R. H. (1991). Critical droplets and metastability for a Glauber dynamics at very low temperatures. Communications in Mathematical Physics 137, 209–230.
[NS2] Neves, E. J. and Schonmann, R. H. (1992). Behavior of droplets for a class of Glauber dynamics at very low temperature. Probability Theory and Related Fields 91, 331–354.
[PL] Penrose, O. and Lebowitz, J. L. (1987). Towards a rigorous molecular theory of metastability. In Fluctuation Phenomena (second edition) (E. W. Montroll and J. L. Lebowitz, ed.), North-Holland Physics Publishing.
[Rue] Ruelle, D. (1969). Statistical Mechanics. Rigorous Results. Benjamin.
[Sch1] Schonmann, R. H. (1992). An approach to characterize metastability and critical droplets in stochastic Ising models. Annales de l’Institut Henri Poincaré (Probabilités et Statistique) 55, 591–600.
[Sch2] Schonmann, R. H. (1992). The pattern of escape from metastability of a stochastic Ising model. Communications in Mathematical Physics 147, 231–240.
[Sch3] Schonmann, R. H. Relaxation times for stochastic Ising models in the limit of vanishing external field at fixed low temperatures. Proceedings of the Workshop on Cellular Automata and Cooperative Systems (Les Houches, June-July 1992), to appear.
[Sch3] Schonmann, R. H. Slow droplet-driven relaxation of stochastic Ising models in the vicinity of the phase coexistence region. Communications in Mathematical Physics, to appear.
[Sco] Scoppola, E. Renormalization group for Markov chains and application to metastability. Journal of Statistical Physics, to appear.
[Sin] Sinclair, A. (1992). Improved bounds for mixing rates of Markov chains and multicommodity flow. Combinatorics, Probability, and Computing 1, 351–370.
[SJ] Sinclair, A. and Jerrum, M. (1989). Approximate counting, uniform generation and rapidly mixing Markov chains. Information and Computing 82, 93–133.
[Sta] Stauffer, D. (1992). Ising droplets, nucleation, and stretched exponential relaxation. International Journal of Modern Physics C 3, 1052–1070.
[SZ] Stroock, D. and Zegarlinski, B. (1992). The logarithmic Sobolev inequality for spin systems on a lattice. Communications in Mathematical Physics 149, 175–194.
[TM] Tomita, H. and Miyashita, S. (1992). Statistical properties of the relaxation processes of metastable states in the kinetic Ising model. Physical Review B, Condensed Matter 46, 8886–8893.
METASTABILITY FOR MARKOV CHAINS: A GENERAL PROCEDURE BASED ON RENORMALIZATION GROUP IDEAS
ELISABETTA SCOPPOLA* Dipartimento di Fisica, Università ‘La Sapienza’, Piazzale A. Moro 2, 00185 Roma, Italy e-mail:
[email protected]
Abstract. The paper is a report on results on the long time behavior of Markov chains with finite state spaces and with transition probabilities exponentially small in an external parameter $\beta$. A general approach based on renormalization group ideas is presented and discussed in the simple case of reversible Markov chains. Applications are also discussed.
Key words: Markov chains, renormalization group, metastability, reversibility, invariant measure, first hitting time, Metropolis algorithm.
1. Introduction
In this note I review some results on the long time behavior of Markov chains characterized by the following property: the state space $S$ is discrete and finite and the transition probabilities $P(x, y)$ can be estimated from above and from below exponentially in a large parameter $\beta$: if $P(x, y) > 0$ then
$\exp\{-\Delta(x, y)\beta - \gamma\beta\} \le P(x, y) \le \exp\{-\Delta(x, y)\beta + \gamma\beta\}$ (1.1)
where $\Delta(x, y)$ has non-negative values (0 is a possible value), $\beta$ is sufficiently large and $\gamma$ tends to zero as $\beta$ tends to infinity. Markov chains of this kind arise for instance in Monte Carlo simulations of statistical mechanics models at low temperature (see, e.g., [S2]), or in the Freidlin–Wentzell analysis of diffusion processes given by small random perturbations of dynamical systems [FW]. Let us consider, as an example, the Markov chain defined by the Metropolis algorithm for the two dimensional ferromagnetic Ising model in a finite box $\Lambda \subset \mathbb{Z}^2$ with external magnetic field $h > 0$. To each $i \in \Lambda$ we associate a spin variable $\sigma(i) = \pm 1$ and to each spin configuration $\sigma \in \{-1, +1\}^\Lambda = S$ we associate the Hamiltonian:
$H_\Lambda(\sigma) = -\frac{1}{2} \sum_{i,j \in \Lambda,\ |i-j|=1} \sigma(i)\sigma(j) - \frac{h}{2} \sum_{i \in \Lambda} \sigma(i)$ (1.2)
*Partially supported by grant SC1-CT91-0695 of the Commission of European Communities.
where $h$ is a uniform positive external magnetic field, and we can consider for instance periodic boundary conditions. Mean values of arbitrary observables with respect to the Gibbs measure
$\mu_\Lambda(\sigma) = e^{-\beta H_\Lambda(\sigma)}/Z_\Lambda$ (1.3)
(where $Z_\Lambda$ is the partition function) can be computed by using the Monte Carlo method, by defining a Markov chain $\{\sigma_t\}_{t \in \mathbb{N}}$ with state space $S = \{-1, +1\}^\Lambda$, and transition probabilities $P(\sigma, \sigma') = P(\sigma_{t+1} = \sigma' \mid \sigma_t = \sigma)$ satisfying the detailed balance condition:
$\mu_\Lambda(\sigma) P(\sigma, \sigma') = \mu_\Lambda(\sigma') P(\sigma', \sigma)$ (1.4)
and the ergodicity condition:
$\exists n \in \mathbb{N}$ such that, $\forall \sigma, \eta$, $P^n(\sigma, \eta) > 0$ (1.5)
where P n (., .) is the n-step transition probability. An explicit construction of this Markov chain can be given by the Metropolis algorithm which is defined as follows: for any σ ∈ S and any i in Λ let ∆i H(σ) = HΛ (σ i ) − HΛ (σ) with σ i (j) =
σ(j)
if i 6= j
−σ(j)
if i = j.
(1.6)
(1.7)
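We consider the following transition probabilities: if $\sigma \ne \sigma'$,

$$P(\sigma, \sigma') = \begin{cases} 0 & \text{if } \sigma' \ne \sigma^i \text{ for all } i \in \Lambda, \\ |\Lambda|^{-1} \exp\{-\beta(\Delta_i H(\sigma) \vee 0)\} & \text{if } \sigma' = \sigma^i \text{ for some } i. \end{cases} \tag{1.8}$$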
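As a hedged illustration of (1.6)–(1.8), the following Python sketch builds one row of the Metropolis transition matrix for a small periodic box. The names `energy`, `flip` and `metropolis_row`, the row-major tuple encoding of a configuration, and the convention that each nearest-neighbour bond enters (1.2) once are assumptions made for this example only; the diagonal entry is filled in by normalization.

```python
import math

def energy(sigma, side, h):
    """H of (1.2) on a side x side periodic box (side >= 3 so that bonds are distinct).

    sigma is a tuple of +1/-1 in row-major order. Assumed convention: each
    nearest-neighbour bond is counted once (right and down neighbour of every site).
    """
    total = 0.0
    for r in range(side):
        for c in range(side):
            s = sigma[r * side + c]
            right = sigma[r * side + (c + 1) % side]
            down = sigma[((r + 1) % side) * side + c]
            total -= 0.5 * s * (right + down)  # interaction term
            total -= 0.5 * h * s               # magnetic field term
    return total

def flip(sigma, i):
    """The configuration sigma^i of (1.7): spin i reversed, all others unchanged."""
    return sigma[:i] + (-sigma[i],) + sigma[i + 1:]

def metropolis_row(sigma, side, h, beta):
    """Return {sigma': P(sigma, sigma')} for the transition probabilities (1.8)."""
    n = side * side
    row = {}
    for i in range(n):
        target = flip(sigma, i)
        delta = energy(target, side, h) - energy(sigma, side, h)  # Delta_i H of (1.6)
        row[target] = math.exp(-beta * max(delta, 0.0)) / n
    row[sigma] = 1.0 - sum(row.values())  # P(sigma, sigma) by normalization
    return row
```

For the all-minus configuration on a 3 x 3 box, a single spin flip costs $4 - h$ under the assumed bond convention, so each off-diagonal entry of the row equals $e^{-\beta(4-h)}/9$ when $h < 4$.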
$P(\sigma, \sigma)$ is obtained by normalization. We will denote by $\sigma_t(\sigma)$ the process starting at $\sigma$. The detailed balance condition (1.4) ensures that the invariant measure of the chain $\sigma_t$ is the Gibbs measure, and thus, for large $\beta$, it is concentrated on the configuration $+1$ in which all the spins are plus. The configuration $-1$ is only a local minimum of the Hamiltonian $H$ (remember $h > 0$), and if we define $E(l) \equiv H(\sigma_{(l)}) - H(-1)$, where $\sigma_{(l)}$ is a configuration in which the plus spins form a square of side $l$, it is very simple to compute $E(l) = 4l - hl^2$, which is maximum for $l = 2/h$. This means that the magnetic field determines the phase even if it is very small, but its effects become relevant only on a sufficiently large scale ($l \ge l_c(h) \sim 2/h$), as only on large scales does the volume energy dominate the surface energy. Neves and Schonmann have studied in [NSch1, 2] this metastable behavior of the state $-1$ from a dynamical point of view, by showing that the Markov chain $\sigma_t$ defined above, starting from the configuration $-1$, locally undergoes only small fluctuations around the metastable state $-1$ for a certain amount of time, very large if $\beta/h$ is large, until it ‘tunnels’ to the true equilibrium $+1$. The main physical feature of this transition is the
METASTABILITY FOR MARKOV CHAINS
existence of a critical value $l_c(h)$ for the size of the droplets: droplets whose sides are smaller than $l_c(h)$ tend to shrink whereas the larger ones tend to grow, and there is an ‘activation energy’ which is necessary to create them. These results have been obtained in [NSch1, 2] with $h$ fixed, the side $L$ of the box $\Lambda$ sufficiently large (i.e., $L \gg 2/h$) and $\beta$ large enough. More precisely we can summarize these results in the following theorem. Let $\tau_\eta(\sigma) \equiv \inf\{t \ge 0;\ \sigma_t(\sigma) = \eta\}$ be the first hitting time to the configuration $\eta$ starting from $\sigma$, so that $\tau_{+1}(-1)$ is the nucleation time, that is, the time needed to reach the configuration $+1$; let $R$ be the set of configurations with all spins $-1$ except for those in a rectangle $l_1 \times l_2$, which are $+1$; and let $l(\eta) = \min\{l_1, l_2\}$ for every $\eta \in R$.

Theorem 1.1 [NSch1, 2]. For any $h$ arbitrarily small, $h < 2$, with $2/h$ not an integer and $\Lambda$ sufficiently large,

(a) for all $\eta \in R$,

$$\lim_{\beta \to \infty} P(\tau_{-1}(\eta) < \tau_{+1}(\eta)) = 1 \quad \text{if } l(\eta) < 2/h,$$

$$\lim_{\beta \to \infty} P(\tau_{+1}(\eta) < \tau_{-1}(\eta)) = 1 \quad \text{if } l(\eta) > 2/h;$$

(b)

$$\lim_{\beta \to \infty} P(\tau_G(-1) < \tau_{+1}(-1)) = 1,$$

where $G$ is the set of configurations in $R$ in which the spins $+1$ form a square droplet of side $l_c$;

(c)

$$\lim_{\beta \to \infty} \frac{1}{\beta} \log E(\tau_{+1}(-1)) = \Gamma(h)$$

where $\Gamma(h)$ is explicitly computed in terms of the parameter of the Hamiltonian:

$$\Gamma(h) = 4 l_c - (l_c^2 - l_c + 1) h \sim \frac{4}{h} \quad \text{with } l_c = [2/h] + 1;$$

(d)

$$\lim_{\beta \to \infty} \frac{1}{\beta} \log(\tau_{+1}(-1)) = \Gamma(h) \quad \text{in probability};$$

(e) $\tau_{+1}(-1)/E(\tau_{+1}(-1))$ converges in distribution as $\beta \to \infty$ to an exponential random variable of mean one.

Similar results have been obtained by Martinelli, Olivieri and Scoppola [MOS1] for a random cluster algorithm (Swendsen–Wang dynamics) in the thermodynamic limit at low temperature. Nucleation from a metastable state is also studied in [KO1] for an anisotropic Ising model and in [KO2] in the case of isotropic nearest neighbour and next nearest neighbour interactions. Let us note here that metastability results, like the previous theorem, can be used to prove the exponential convergence to equilibrium of the chain uniformly in the
volume [MOS2, 3]. More precisely, for every local observable $f$, if we denote by $\mu_\Lambda(f)$ its mean value with respect to the Gibbs measure (i.e., $\mu_\Lambda(f) = \sum_\sigma f(\sigma)\mu_\Lambda(\sigma)$), then

$$\sup_{\sigma \in S} |\mu_\Lambda(f) - E f(\sigma_t(\sigma))| \le C_f e^{-mt}$$

for any $t > t_0(\beta, h)$, where $E$ is the expectation over the process $\sigma_t$, $C_f$ is a constant depending only on $f$, and $m$ is independent of $\Lambda$.

It is not difficult to show (see, e.g., [S1]) that results like Theorem 1.1 can be easily obtained if one controls the following quantities characterizing the long time behavior of the chain $\sigma_t$:

$$\nu(D), \quad \forall D \subset S, \tag{1.9}$$

$$E\tau_D(\sigma), \quad \forall D \subset S,\ \sigma \in S, \tag{1.10}$$

$$P(\tau_D > t), \quad t \in \mathbb{N}, \tag{1.11}$$

$$P(\sigma_{\tau_D}(\sigma) = \eta), \quad \forall D \subset S,\ \sigma \in S,\ \eta \in D, \tag{1.12}$$
where $\nu(\cdot)$ denotes the invariant measure of the chain, $E$ and $P$ are the expectation and the probability, and $\tau_D$ is the first hitting time to the set $D$. I will present in this note results recently obtained in [S1] on the control of these quantities in a general case, that is, for Markov chains satisfying condition (1.1), by means of a general, model-independent procedure based on renormalization group ideas. Even if the results obtained in [S1] hold for the general class of Markov chains with exponential behavior of transition probabilities, I will discuss here a smaller class of Markov chains obtained by imposing two additional assumptions, reversibility and non-degeneracy, which simplify the construction and the proofs.

Let us conclude this introduction with a short discussion of the main idea behind this renormalization approach. I will denote by $X_t(x)$ an arbitrary Markov chain on the state space $S$ starting from the state $x \in S$, and with transition probabilities exponentially small in $\beta$ (the chain $\sigma_t$ defined above by the Metropolis algorithm is a particular example). I will prove that it is possible to control, with estimates from above and from below, the quantities (1.9)–(1.12) for the process $X_t$ by means of an iterative argument: I will introduce a classification of the states in terms of their stability, $S \supseteq S^{(1)} \supseteq S^{(2)} \supseteq \cdots \supseteq S^{(n)}$. This classification enables us to define a sequence of Markov chains $X_t^{(k)}$ defined over the sequence of state spaces $S^{(k)}$ and corresponding to the initial chain $X_t$ viewed on a sequence of times $T_1, T_2, T_3, \ldots$. This means that the chain $X_t^{(k)}$ is a coarse-grained version of the chain $X_t$, in the sense that passing from the chain $X_t$ to the chain $X_t^{(k)}$ we give a less detailed description of the process, but we lose information only about events which occur in a typical time less than or equal to $T_k$.
At each step of such an iteration the quantities (1.9)–(1.12) are estimated in terms of the same quantities for the chain of the next step. Since, in the construction, $S^{(k)} \subseteq S^{(k-1)}$ and $S^{(k+1)} \subset S^{(k-1)}$, the idea of the method is to iterate the argument up to an $n$ so large that the space $S^{(n)}$ is sufficiently small and the quantities (1.9)–(1.12) are easily evaluated at this level.

The paper is organized as follows: in Section 2 we will give the precise assumptions on the Markov chain considered, and we will state the main result. In Section 3 we will prove the result. In Section 4 we will consider some applications. In particular we briefly discuss there the problem of the exit of the process from a domain $D$ containing several stable states (i.e., states with an exponentially long mean exit time).
2. Hypotheses and the Main Theorem

We consider a Markov chain $\{X_t\}_{t=0,1,2,\ldots}$ on a finite state space $S$ with transition probabilities $P(x, y)$ satisfying the following conditions:

(1) Property P: There exist a positive parameter $\beta$, a function $\Delta(x, y)$, $x, y \in S$, assuming values $\Delta_0 = 0 < \Delta_1 < \Delta_2 < \cdots < \Delta_m$, for some positive integer $m$, with $\Delta_m < \infty$, and a positive function $\gamma = \gamma(\beta)$, with $\gamma \to 0$ as $\beta \to \infty$, such that if $x \ne y$ and $P(x, y) > 0$, then

$$\exp\{-\Delta(x, y)\beta - \gamma\beta\} \le P(x, y) \le \exp\{-\Delta(x, y)\beta + \gamma\beta\}. \tag{2.1}$$

(2) Reversibility: Let $F \subset S^2$ be the space of pairs of states $(x, y)$ such that $P(x, y) > 0$. Then there exists a function $H$ defined on the space $S \cup F$ with values in $\mathbb{R}$ such that

$$H(x, y) = H(y, x), \tag{2.2}$$

$$H(x, y) \ge H(x) \vee H(y), \tag{2.3}$$

$$\Delta(x, y) = H(x, y) - H(x). \tag{2.4}$$

This implies that the transition probabilities satisfy the detailed balance condition with respect to the measure $\mu(x) = Z_S^{-1} \exp\{-\beta H(x)\}$ in the limit $\beta \to \infty$.

(3) Non-Degeneracy: We suppose that

$$H(x) \ne H(y) \quad \forall x \ne y,\ x, y \in S. \tag{2.5}$$
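Hypotheses (1)–(3) can be made concrete on a toy example. The sketch below is only an illustration under stated assumptions: the seven-state path landscape `H`, the helper name `build_chain`, and the specific form $P(x, y) = e^{-\beta\Delta(x, y)}/q$ (with $q$ the largest number of neighbours) are not taken from the text. With this form, (2.1) holds with $\gamma = \ln(q)/\beta$, and detailed balance with respect to $e^{-\beta H(x)}$ holds exactly rather than only in the limit.

```python
import math

def build_chain(H, H_pair, beta):
    """Transition probabilities P[x][y] built from H(x) and the symmetric H(x, y).

    H: dict state -> H(x); H_pair: dict frozenset({x, y}) -> H(x, y).
    Assumed form: P(x, y) = exp(-beta * Delta(x, y)) / q, Delta as in (2.4).
    """
    neighbours = {x: [] for x in H}
    for pair in H_pair:
        x, y = tuple(pair)
        neighbours[x].append(y)
        neighbours[y].append(x)
    q = max(len(v) for v in neighbours.values())
    P = {}
    for x in H:
        row = {}
        for y in neighbours[x]:
            delta = H_pair[frozenset((x, y))] - H[x]
            if delta < 0:
                raise ValueError("H(x, y) >= H(x) v H(y) of (2.3) is violated")
            row[y] = math.exp(-beta * delta) / q
        row[x] = 1.0 - sum(row.values())  # holding probability
        P[x] = row
    return P

# Hypothetical landscape on the path 0-1-...-6, all values distinct as in (2.5),
# with the Metropolis choice H(x, y) = max(H(x), H(y)).
H = {0: 0.0, 1: 2.0, 2: 1.0, 3: 3.0, 4: 0.5, 5: 1.75, 6: 0.25}
H_pair = {frozenset((i, i + 1)): max(H[i], H[i + 1]) for i in range(6)}
P = build_chain(H, H_pair, beta=2.0)
```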
Remarks. Hypothesis (1) is the exact statement of (1.1). Hypothesis (2) is the reversibility property in a general form. The case of the Metropolis algorithm, discussed in the introduction, corresponds to the choice $H(x, y) = H(x) \vee H(y)$. Hypothesis (3) enables us to simplify the exposition; since $|S|$ is finite, it can obviously be satisfied with an arbitrarily small change in the function $H$. We note here that hypothesis (3) is not verified by the Metropolis algorithm (not in this strong form). However we want to stress that we use this hypothesis only to simplify the exposition, and the only crucial assumption is property P. We will denote by $X_t(x)$ the process starting at $x$ at time 0.

Main Theorem. Let $X_t$ be a Markov chain satisfying the previous conditions. Then it is possible to define a finite sequence of Markov chains $X_t^{(1)}, X_t^{(2)}, \ldots, X_t^{(n)}$ on state spaces $S \supseteq S^{(1)} \supseteq S^{(2)} \supseteq \cdots \supseteq S^{(n)}$, with $S^{(n)} = \{x_m\}$ (where $x_m$ is the state of absolute minimum for the function $H$: $H(x) > H(x_m)$ $\forall x \ne x_m$), such that each Markov chain $X_t^{(k)}$ satisfies hypotheses (1)–(3) with new functions $\Delta^{(k)}$ and $H^{(k)}$. The processes $X_t^{(k)}$ correspond to the chain $X_t$ on a sufficiently large time scale $T_k$, exponentially long in $\beta$, in the following sense. Let $W$ be a subset of $S$; we denote by $\nu$ the invariant measure of the chain, by $\tau_W$ the first hitting time to $W$ and by $E_x \tau_W$ its mathematical expectation calculated under the assumption that the initial state of the chain $X_t$ is $x$; analogous quantities can be defined for the process $X_t^{(k)}$, $k = 1, 2, \ldots$, and we will use the same notation with the superscript $(k)$. Then for any $\beta$ sufficiently large and for any $k = 1, \ldots, n$, let $W \subset S^{(k)}$, $x \in S^{(k)} \setminus W$, $y \in W$:

(i)

$$P(X_{\tau_W}(x) = y) = P\big(X^{(k)}_{\tau_W^{(k)}}(x) = y\big); \tag{2.6}$$

(ii) there exists a positive $\eta$ depending on $\gamma$ and $k$, with $\eta \to 0$ as $\beta \to \infty$, such that

$$e^{-\eta\beta}\, T_k\, E_x \tau_W^{(k)} \le E_x \tau_W \le e^{\eta\beta}\, T_k\, E_x \tau_W^{(k)}; \tag{2.7}$$

(iii) there exist constants $C$ and $\gamma'$, with $\gamma' \to 0$ as $\beta \to \infty$, such that for any $B \subset S^{(k)}$:

$$C\, T_k\, e^{-\gamma'\beta}\, \nu^{(k)}(B) \le \nu(B) \le C\, T_k\, e^{\gamma'\beta}\, \nu^{(k)}(B); \tag{2.8}$$

(iv) for any $t > T_k e^{\delta\beta} 2^k$, for any $W \subset S^{(k)}$ and for any $x \in S^{(k)}$:

$$P(\tau_W(x) > t) \le P\Big(\tau_W^{(k)}(x) > \frac{t}{T_k 2^k}\Big) + \exp\{-c_1 e^{\delta\beta}\} \tag{2.9}$$

for some constant $c_1$; moreover there exists a constant $\Delta$ such that

$$P(X_t(x) \notin S^{(k+1)}) \le e^{-\Delta\beta}. \tag{2.10}$$

The previous construction is explicit, and the quantities $\Delta^{(k)}(x, y)$, $T_k$ and the state spaces $S^{(k)}$ are explicitly defined in terms of the quantities $\{\Delta(u, v)\}_{u,v \in S}$.

Remarks. Point (i) is a consequence of the fact that, if hypothesis (3) holds, then the definition of the chains $X_t^{(k)}$ is a path by path construction. In the general case in which only hypothesis (1) is verified, one can obtain estimates from above and from below of the probabilities appearing in (2.6). Point (ii) is the exact statement of the time rescaling. With point (iii) we have a relation among the invariant measures of the chains at different steps of the iteration. We want to remark here that the classification of states induced by the construction of the sets $S^{(k)}$ does not correspond to the classification of states given by the invariant measure $\nu$, i.e., given by the Hamiltonian in our reversible case, even if $x_m \in S^{(k)}$ for any $k \le n$. This means that some state in $S^{(k)}$ could have an invariant measure exponentially smaller than the invariant measure of some state in $S \setminus S^{(k)}$. Estimate (2.9) is a quite crude bound based on the Chebyshev estimate. It is sufficient to prove that the classification of states in the sets $S^{(k)}$ is strictly related to the time scales $T_k$. More precisely, with probability exponentially near to one, the process is in a state in $S^{(k)}$ after a time of order $T_k$ (see (2.10)). In other words, the classification of states considered here is based on the stability of the states.
3. Proof of the Main Theorem

We will construct in detail the first chain $X_t^{(1)}$ (see Subsections 3.1 and 3.2) and we will prove that it satisfies hypotheses (1)–(3). For this chain we will verify points (i)–(iv) of the theorem (see 3.3). The proof of the theorem will follow by induction (Subsections 3.4 and 3.5).

3.1. The Stable States

We define the state $x$ in $S$ to be stable if and only if it is a local minimum of the function $H$, i.e., $H(x) < H(x, y)$ for any $y \ne x$. We will denote by $M$ the set of local minima (i.e., stable states). For each $x \in S$ we can define the first hitting time to the set $M$:

$$\tau_M(x) \equiv \min\{t \ge 0;\ X_t(x) \in M\} \tag{3.1}$$

corresponding to the time spent by the process outside the set $M$. In order to obtain estimates on the time $\tau_M$, following the ideas developed in [FW], to each function $\phi : \mathbb{N} \to S$, $\phi = \{\phi_t\}_{t \in \mathbb{N}}$, we associate a functional

$$I_{[0,t]}(\phi) \equiv \sum_{i=0}^{t-1} \Delta(\phi_i, \phi_{i+1}) \tag{3.2}$$

where we define $\Delta(x, x) = 0$ for each $x \in S$ and $\Delta(x, y) = \infty$ if $P(x, y) = 0$. The following large deviation estimates are very easily proved (see [S1]).

Proposition 3.1. Let $\phi$ be a fixed function starting at $x$ at time 0. Then

(i) $P(X_s(x) = \phi_s,\ \forall s \in [0, t]) \le \exp\{-I_{[0,t]}(\phi)\beta + \gamma t \beta\}$;

(ii) if $\phi$ is such that $\phi_s \ne \phi_{s+1}$ for any $s \in [0, t]$ then we have also a lower bound: $P(X_s(x) = \phi_s,\ \forall s \in [0, t]) \ge \exp\{-I_{[0,t]}(\phi)\beta - \gamma t \beta\}$;

(iii) for any constant $r > \Delta_1$ and for any $t < e^{\alpha\beta}$ with $\alpha < \Delta_1$,

$$\sup_x P(I_{[0,t]}(X_s(x)) \ge r) \le e^{-r\beta + \epsilon\beta} \quad \text{where} \quad \epsilon = \frac{3r}{\Delta_1}\Big[\gamma + \frac{\ln t}{\beta}\Big].$$

An immediate consequence of Proposition 3.1 is the following.

Proposition 3.2. Let $\delta = 2\gamma|S|$. There exist constants $T_0 \in [0, |S|]$ and $\beta_0$ such that for any $\beta > \beta_0$,

(i) for any $t > T_0$,

$$\sup_{x \in S} P(\tau_M(x) > t) \le a^{[t/T_0]}$$

with $a = 1 - C^{T_0}$ for some constant $0 < C < 1$, and where $[\cdot]$ denotes the integer part;

(ii) for any $t \ge e^{\delta\beta}$,

$$\sup_{x \in S} P(\tau_M(x) > t) \le \exp\{-e^{\delta\beta/2}\};$$

(iii) let $\alpha \in (\delta, \tfrac{1}{2}\Delta_1 - \gamma)$; for any $t > e^{\alpha\beta}$ we have

$$\sup_x P(X_t(x) \notin M) \le e^{-\beta\Delta_1/2}.$$

The idea of the proof of this proposition is very simple: if $x \notin M$, there exists a sequence of unstable states $x_0 = x, x_1, x_2, \ldots, x_n$ such that $H(x_i) > H(x_{i+1})$ and $H(x_i, x_{i+1}) = H(x_i)$ for any $i = 0, \ldots, n$. Such a sequence is finite and thus it ends in a stable state $x_{n+1}$. By using Proposition 3.1 it is easy to prove that with large probability the process follows such a sequence of states and thus it reaches $M$ in a time independent of $\beta$ (see [S1] for a detailed proof).
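The definition of $M$ and the downhill sequence used in the proof idea above can be sketched as follows. This is a minimal illustration, not the construction of [S1]: the landscape `H`, the Metropolis choice $H(x, y) = H(x) \vee H(y)$, and the helper names `stable_states` and `descend` are assumptions made for the example. Among the zero-cost moves available from an unstable state, the sketch arbitrarily picks the one leading to the lowest $H$.

```python
# Hypothetical landscape on the path 0-1-...-6 (values chosen to be exactly representable).
H = {0: 0.0, 1: 2.0, 2: 1.0, 3: 3.0, 4: 0.5, 5: 1.75, 6: 0.25}
H_pair = {frozenset((i, i + 1)): max(H[i], H[i + 1]) for i in range(6)}

def stable_states(H, H_pair):
    """Set M of stable states: H(x) < H(x, y) for every allowed y != x."""
    M = set(H)
    for pair, h_xy in H_pair.items():
        for x in pair:
            if not H[x] < h_xy:
                M.discard(x)
    return M

def descend(x, H, H_pair):
    """Follow moves with H(x_i, x_{i+1}) = H(x_i) until a stable state is reached."""
    M = stable_states(H, H_pair)
    path = [x]
    while path[-1] not in M:
        cur = path[-1]
        # neighbours reachable at zero cost, i.e. Delta(cur, y) = 0
        free = [y for pair, h_xy in H_pair.items() if cur in pair and h_xy == H[cur]
                for y in pair if y != cur]
        path.append(min(free, key=H.get))
    return path
```

Because $H$ strictly decreases along such a sequence under hypothesis (3), the loop in `descend` terminates after at most $|S|$ steps.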
3.2. The Rescaled Markov Chain $X_t^{(1)}$

These results suggest that if we look at the process $X_t$ on a sufficiently large time scale then it can be described in terms of transitions between states in $M$; in this way only the behavior of the process on small times is neglected. More precisely we will construct a new Markov chain $X_t^{(1)}$ with state space $S^{(1)} \equiv M$, corresponding to the original process looked at on sufficiently large times. Let us define:

$$V_1 \equiv \min_{x \in M,\ y \in S,\ x \ne y} [H(x, y) - H(x)], \tag{3.3}$$

$$t_1 \equiv e^{V_1\beta + \delta\beta}. \tag{3.4}$$

We define a sequence of stopping times

$$\sigma_1 \equiv \min\{t > 0;\ X_t \ne X_0\}, \tag{3.5}$$

$$\tau_1 \equiv \min\{t \ge \sigma_1;\ X_t \in M\}, \tag{3.6}$$

$$\zeta_1 = \begin{cases} t_1 & \text{if } \sigma_1 > t_1, \\ \tau_1 & \text{if } \sigma_1 \le t_1, \end{cases} \tag{3.7}$$

and for each $n > 1$,

$$\sigma_n \equiv \min\{t > \zeta_{n-1};\ X_t \ne X_{\zeta_{n-1}}\}, \tag{3.8}$$

$$\tau_n \equiv \min\{t \ge \sigma_n;\ X_t \in M\}, \tag{3.9}$$

$$\zeta_n = \begin{cases} \zeta_{n-1} + t_1 & \text{if } \sigma_n - \zeta_{n-1} > t_1, \\ \tau_n & \text{if } \sigma_n - \zeta_{n-1} \le t_1. \end{cases} \tag{3.10}$$

It is simple to prove that these times are stopping times with respect to the $\sigma$-algebra associated with the Markov chain [S1], and thus the sequence $X_n^{(1)} = X_{\zeta_n}$ is a homogeneous Markov chain. For any $x \in M$, we can then consider the new Markov chain $X_t^{(1)}$ with $X_0^{(1)}(x) = x$ on the state space $M \equiv S^{(1)}$.
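The stopping times (3.5)–(3.10) translate directly into a simulation loop. The sketch below is an illustration under assumptions: the path landscape `H`, the Metropolis `step` rule with rejected proposals at the ends of the path, the numerical values of `BETA` and `DELTA`, and the rounding of $t_1$ up to an integer are all choices made for the example, not part of the text.

```python
import math
import random

H = [0.0, 2.0, 1.0, 3.0, 0.5, 1.75, 0.25]  # hypothetical heights on the path 0-1-...-6
M = {0, 2, 4, 6}                             # local minima of H (stable states)

def step(x, beta, rng):
    """One Metropolis move on the path: each neighbour is proposed with probability 1/2."""
    y = x + rng.choice((-1, 1))
    if y < 0 or y >= len(H):
        return x  # proposals leaving the path are rejected
    if rng.random() < math.exp(-beta * max(H[y] - H[x], 0.0)):
        return y
    return x

def rescaled_chain(x0, beta, t1, n_steps, rng):
    """Return [(zeta_n, X_{zeta_n})] for n = 1..n_steps; x0 must belong to M."""
    t, x = 0, x0
    out = []
    for _ in range(n_steps):
        start_t, start_x = t, x
        moved = False
        # wait for the first change of state (sigma_n), for at most t1 steps
        while t - start_t < t1 and not moved:
            x = step(x, beta, rng)
            t += 1
            moved = x != start_x
        if moved:
            # sigma_n - zeta_{n-1} <= t1: run on until the chain is back in M (tau_n)
            while x not in M:
                x = step(x, beta, rng)
                t += 1
        out.append((t, x))
    return out

BETA, V1, DELTA = 2.0, 1.0, 0.1  # V1 = 1.0 is the value of (3.3) for this landscape
T1 = math.ceil(math.exp((V1 + DELTA) * BETA))
trajectory = rescaled_chain(0, BETA, T1, 50, random.Random(0))
```

By construction every recorded state lies in $M$: either the chain has not moved during the $t_1$ steps, or it is observed at its first return to $M$ after the move.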
Let us note that this new Markov chain is strictly related to the time scale $t_1$ (see (3.4)) in the sense that, as we will show later on, $P(\zeta_{n+1} - \zeta_n \in [t_1 e^{-\gamma\beta}, t_1 e^{\gamma\beta}]) \sim 1$. For any pair of states $x, y \in M$ we denote by $P^{(1)}(x, y)$ the transition probability of the chain $X_n^{(1)}$, that is

$$P^{(1)}(x, y) = P(X_{\zeta_n} = y \mid X_{\zeta_{n-1}} = x). \tag{3.11}$$

We will prove the following.

Proposition 3.3. There exists $\beta_0 > 0$ such that for any $\beta > \beta_0$ and for any $x, y \in M$ with $x \ne y$ we have the following.

(a) If for any time $t < |S|$ and any function $\phi$ such that $\phi_0 = x$, $\phi_t = y$ and $\phi_s \notin M_{x,y}$ for any $s \in (0, t)$, there exists $s_0 < t$ such that $P(\phi_{s_0}, \phi_{s_0+1}) = 0$, then $P^{(1)}(x, y) = 0$, where $M_{x,y}$ is the set of the minima with the exception of the states $x, y$.

(b) Otherwise, if the quantity

$$\bar\Delta(x, y) = \inf\{I_{[0,t]}(\phi) : t, \phi \text{ such that } \phi_0 = x,\ \phi_t = y,\ \phi_s \notin M_{x,y},\ \forall s \in (0, t)\} \tag{3.12}$$

is well defined, then

$$t_1 e^{-\bar\Delta(x,y)\beta - \bar\gamma\beta} \le P^{(1)}(x, y) \le t_1 e^{-\bar\Delta(x,y)\beta + \bar\gamma\beta}. \tag{3.13}$$
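The quantity $\bar\Delta(x, y)$ can assume the values $\bar\Delta_1 = V_1 < \bar\Delta_2 < \cdots < \bar\Delta_{\bar m}$ with $\bar\Delta_{\bar m} < |S|\Delta_m$. The quantity $\bar\gamma$ is given by

$$\bar\gamma = \Big[(|S| + 1)\gamma + \delta + \frac{1 + \ln 2|S|}{\beta}\Big] \vee \Big[\frac{|S|\Delta_m}{\Delta_1}(\gamma + \delta) + \delta\Big],$$

which implies that $\bar\gamma \to 0$ as $\beta \to \infty$.

Proof. (a) If there does not exist a function $\phi$ and a time $t \le |S|$ with $\phi_0 = x$, $\phi_t = y$ and $\phi_s \notin M_{x,y}$ for any $s \in (0, t)$ such that $P(\phi_s, \phi_{s+1}) > 0$, $\forall s \le t$, then obviously

$$P^{(1)}(x, y) = P(X_{\zeta_n} = y \mid X_{\zeta_{n-1}} = x) \le \sum_t \sum_\phi P(X_s = \phi) = 0,$$

where the second sum is taken over all functions $\phi$ such that $\phi_0 = x$, $\phi_t = y$ and $\phi_s \notin M_{x,y}$ for any $s \in (0, t)$.

(b) Estimate from below: Let $x \ne y$ and let $\underline{t}_1 = e^{V_1\beta - \gamma\beta}$; we denote by $\bar\phi_s^{x,y}$ the function going from $x$ to $y$ minimizing the quantity $\bar\Delta(x, y)$ and by $\bar t^{x,y}$ the corresponding time, for which we have the trivial and crude estimate $\bar t^{x,y} < T \equiv |S|$. We have that

$$P^{(1)}(x, y) \ge \sum_{n=1}^{[\underline{t}_1/T]} P_x\big(\{\sigma_1 > nT\} \cap \{X_s = \bar\phi_s^{x,y},\ \forall s \in [nT, nT + \bar t^{x,y}]\}\big) \ge \sum_{n=1}^{[\underline{t}_1/T]} P_x(\sigma_1 > nT)\, e^{-\bar\Delta(x,y)\beta - \gamma T\beta} \tag{3.14}$$

by Proposition 3.1(ii) and by using the fact that $X_{nT}(x) = x$ if $\sigma_1 > nT$. On the other hand we have that $P_x(\sigma_1 > 1) \ge 1 - e^{-V_1\beta + \gamma\beta}$, and by the Markov property $P_x(\sigma_1 > s) \ge (1 - e^{-V_1\beta + \gamma\beta})^s$, which implies that

$$P_x(\sigma_1 > nT) \ge P_x(\sigma_1 > \underline{t}_1) \ge (1 - e^{-V_1\beta + \gamma\beta})^{\underline{t}_1} \ge \tfrac{1}{2} e^{-1} \tag{3.15}$$

for $\beta$ sufficiently large. Estimate (3.15) in (3.14) gives

$$P^{(1)}(x, y) \ge \tfrac{1}{2} [\underline{t}_1/T]\, e^{-1} e^{-\bar\Delta(x,y)\beta - \gamma T\beta} \ge t_1 e^{-\bar\Delta(x,y)\beta - \bar\gamma_1\beta}$$

with

$$\bar\gamma_1 = (T + 1)\gamma + \delta + \frac{1 + \ln 2T}{\beta}.$$

Estimate from above:

$$P^{(1)}(x, y) \le P\big(\{X_{\tau_1}(x) = y\} \cap \{\tau_1 - \sigma_1 \le e^{\delta\beta}\}\big) + P(\tau_1 - \sigma_1 > e^{\delta\beta}). \tag{3.16}$$

The first term on the right-hand side of (3.16) is bounded above by

$$\sum_{s=1}^{t_1} P\big(\{X_u = x\ \forall u \le s\} \cap \{I_{[s, s+e^{\delta\beta}]}(X_t(x)) \ge \bar\Delta(x, y)\}\big) \le \sum_{s=1}^{t_1} P\big(I_{[0, e^{\delta\beta}]}(X_t(x)) \ge \bar\Delta(x, y)\big) \le t_1 e^{-\bar\Delta(x,y)\beta + \epsilon\beta}$$

from Proposition 3.1(iii) with

$$\epsilon = \frac{\bar\Delta(x, y)}{\Delta_1} [\gamma + \delta]. \tag{3.17}$$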
The second term on the right-hand side of (3.16) has, by Proposition 3.2(ii), a superexponential estimate, giving the following upper bound for (3.16): $t_1 e^{-\bar\Delta(x,y)\beta + \bar\gamma_2\beta}$ with

$$\bar\gamma_2 \le \frac{\bar\Delta(x, y)}{\Delta_1} [\gamma + \delta] + \delta.$$

The proposition is proved by choosing $\bar\gamma = \max\{\bar\gamma_1, \bar\gamma_2\}$.

Remark. The estimates on the time $T_0$ appearing in Proposition 3.2 and on the times corresponding to functions minimizing the functionals $\bar\Delta(x, y)$, in terms of the cardinality of the state space, are clearly very crude estimates which can be improved in concrete situations.
Proposition 3.4. For any $x, y \in M \equiv S^{(1)}$ with $x \ne y$, let

$$\Delta^{(1)}(x, y) \equiv \bar\Delta(x, y) - V_1 \tag{3.18}$$

and let $F^{(1)} \subset (S^{(1)})^2$ be the space of pairs of states $(x, y)$ such that $P^{(1)}(x, y) > 0$. Then there exists a function $H^{(1)}$ on the space $S^{(1)} \cup F^{(1)}$ with values in $\mathbb{R}$ such that

$$H^{(1)}(x, y) = H^{(1)}(y, x), \tag{3.19}$$

$$H^{(1)}(x, y) \ge H^{(1)}(x) \vee H^{(1)}(y), \tag{3.20}$$

$$\Delta^{(1)}(x, y) = H^{(1)}(x, y) - H^{(1)}(x), \tag{3.21}$$

$$H^{(1)}(x) \ne H^{(1)}(y), \quad \forall x \ne y,\ x, y \in S^{(1)}. \tag{3.22}$$

Proof. We define

$$H^{(1)}(x) \equiv H(x), \quad \forall x \in S^{(1)}, \tag{3.23}$$

and

$$H^{(1)}(x, y) \equiv \bar\Delta(x, y) - V_1 + H(x). \tag{3.24}$$

Then equation (3.22) immediately follows by hypothesis (3) on the initial function $H$, and equation (3.21) is a consequence of (3.18). To prove equation (3.19) let us introduce the following definition: for any function $\phi : \mathbb{N} \to S$ starting at $x$ and such that $\phi_t = y$, let $(\phi)^{tr}$ be the time reversed function going from $y$ to $x$, that is, $(\phi)^{tr}_s = \phi_{t-s}$. By the definition of $\Delta(\cdot, \cdot)$ and by the symmetry of $H(x, y)$ it is trivial to verify that

$$I_{[0,t]}(\phi) + H(x) = I_{[0,t]}((\phi)^{tr}) + H(y). \tag{3.25}$$

This identity implies that if we denote by $\bar\phi^{x,y}$ the function going from $x$ to $y$ and minimizing the functional $I$ in the definition of $\bar\Delta(x, y)$, then

$$(\bar\phi^{x,y})^{tr} = \bar\phi^{y,x}$$

and thus, using again (3.25) and the definition of $H^{(1)}(x, y)$, we prove immediately its symmetry property (3.19). Equation (3.20) easily follows from definition (3.24), equation (3.19) and the fact that the quantity $\bar\Delta(x, y) - V_1$ by definition is non-negative.
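The quantities (3.12), (3.3) and (3.24) can be computed on a small example to see the symmetry (3.19) at work. The sketch below is an assumption-laden illustration: the path landscape `H`, the Metropolis choice $H(x, y) = H(x) \vee H(y)$, and the helper names `delta`, `delta_bar` and `H1_pair` are hypothetical, and a plain Bellman-Ford relaxation stands in for the infimum over paths.

```python
import math

H = {0: 0.0, 1: 2.0, 2: 1.0, 3: 3.0, 4: 0.5, 5: 1.75, 6: 0.25}  # hypothetical landscape
EDGES = [(i, i + 1) for i in range(6)]                           # path 0-1-...-6
M = {0, 2, 4, 6}                                                 # local minima of H

def delta(x, y):
    """Delta(x, y) = H(x, y) - H(x) with H(x, y) = max(H(x), H(y))."""
    return max(H[x], H[y]) - H[x]

def delta_bar(x, y):
    """Minimal cost (3.12) from x to y over paths that avoid all other minima."""
    allowed = (set(H) - M) | {x, y}
    cost = {s: math.inf for s in allowed}
    cost[x] = 0.0
    directed = EDGES + [(b, a) for a, b in EDGES]
    for _ in range(len(allowed)):          # Bellman-Ford relaxation rounds
        for u, v in directed:
            if u in allowed and v in allowed and u != y:
                cost[v] = min(cost[v], cost[u] + delta(u, v))
    return cost[y]

# V_1 of (3.3): smallest barrier out of a stable state.
V1 = min(delta(x, y) for x in M for y in H if abs(x - y) == 1)

def H1_pair(x, y):
    """H^(1)(x, y) of (3.24)."""
    return delta_bar(x, y) - V1 + H[x]
```

In this example `delta_bar(0, 2)` is 2 and `delta_bar(2, 0)` is 1, while `H1_pair(0, 2)` and `H1_pair(2, 0)` both equal 1, as (3.25) predicts. The pair (0, 4) has infinite cost because every path between them crosses the minimum 2, which is case (a) of Proposition 3.3.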
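3.3. Proof of (i)–(iii) for $X_t^{(1)}$

We have now to prove points (i)–(iii) of our main theorem for the chain $X_t^{(1)}$. Let $W$ be a subset of $S^{(1)}$, and $\tau_W^{(1)}$ its first hitting time. Then for any $x \in S^{(1)} \setminus W$, $y \in W$,

$$P\big(X^{(1)}_{\tau_W^{(1)}}(x) = y\big) = P\big(X_{\zeta_{\tau_W^{(1)}}}(x) = y\big) = P(X_{\tau_W}(x) = y)$$

since by definition $\zeta_{\tau_W^{(1)}} = \tau_W$.

The proof of point (ii) is based on the application of the strong Markov property and on the fact that the Markov time $\zeta_1$ can be estimated from above and from below with times of order $t_1$. In fact

$$E_x \tau_W = E_x \sum_{n=1}^{\infty} \zeta_n\, \chi(\tau_W^{(1)} = n)$$

$$= E_x \sum_{n=1}^{\infty} [\zeta_1 + \zeta_2 - \zeta_1 + \cdots + \zeta_n - \zeta_{n-1}] \times \chi(X_{\zeta_1} \notin W)\chi(X_{\zeta_2} \notin W) \cdots \chi(X_{\zeta_{n-1}} \notin W)\chi(X_{\zeta_n} \in W)$$

$$= \sum_{n=1}^{\infty} \sum_{m=1}^{n-2} E_x \Big\{ \chi(X_{\zeta_1} \notin W)\chi(X_{\zeta_2} \notin W) \cdots \chi(X_{\zeta_m} \notin W) \times E_{X_{\zeta_m}} \big[\zeta_1 \chi(X_{\zeta_1} \notin W) E_{X_{\zeta_1}} \chi(X_{\zeta_2} \notin W) \cdots \chi(X_{\zeta_{n-m-1}} \in W)\big] \Big\} + \sum_{n=1}^{\infty} E_x \big[\chi(\tau_W^{(1)} > n - 1)\, E_{X_{\zeta_{n-1}}} \zeta_1 \chi(X_{\zeta_1} \in W)\big]. \tag{3.26}$$

Since $\zeta_1$ is given by

$$\zeta_1 = \sum_{s=1}^{t_1} (s + \tau_M(X_s))\, \chi(\sigma_1 = s) + t_1 \chi(\sigma_1 > t_1) \tag{3.27}$$

we have the following estimates for $\zeta_1$:

$$\zeta_1 \le t_1 + K + \sum_{j > K} j\, \chi(\exists x \in S \setminus M;\ \tau_M(x) = j),$$

$$\zeta_1 \ge \tilde t_1 - \tilde t_1\, \chi(\exists x \in M;\ \sigma_1(x) \le \tilde t_1),$$

with $\tilde t_1 = e^{V_1\beta - 2\gamma\beta}$. By applying these estimates in (3.26) it is simple to show (see [S1]) that (2.7) holds.

Point (iii) is an easy consequence of the fact that there exists a constant $C$ such that for any $B \subset S$ (see [H])

$$\nu(B) = C \sum_{y \in M} \nu^{(1)}(y)\, E_y \sum_{t=0}^{\zeta_1} \chi_B(X_t) \tag{3.28}$$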
where the constant $C$ is fixed by normalization. In fact, if $B_1 \subset M$, by the definition of $\zeta_1$

$$\nu(B_1) = C \sum_{y \in B_1} \nu^{(1)}(y)\, E_y \sum_{t=0}^{\zeta_1} \chi_{\{y\}}(X_t)$$

and for any $y \in M$ we have:

$$t_1 \ge E_y \sum_{t=0}^{\zeta_1} \chi_{\{y\}}(X_t) \ge E_y(\sigma_1 \wedge \underline{t}_1) \ge \underline{t}_1 P(\sigma_1 > \underline{t}_1) \ge \tfrac{1}{2} e^{-1} \underline{t}_1,$$

where $\underline{t}_1 = e^{V_1\beta - \gamma\beta}$, as before.

We remark here that since the chain $X_t^{(1)}$ satisfies the reversibility condition with respect to the function $H^{(1)}$, and $H^{(1)}(x) = H(x)$ for any $x \in M$, point (iii) of the main theorem can be proved also by a direct computation and by using Lemma 3.2 of [FW, Chap. 6] concerning the invariant measure of chains satisfying estimates like (2.1).

3.4. The Iteration Scheme

The iteration scheme now is the following. For any $k \ge 1$ we define the following quantities: for any $\phi : \mathbb{N} \to S^{(k)}$,

$$I^{(k)}_{[0,t]}(\phi) = \sum_{i=0}^{t-1} \Delta^{(k)}(\phi_i, \phi_{i+1}), \tag{3.29}$$

$$M^{(k)} = \{x \in S^{(k)};\ \forall y \in S^{(k)},\ y \ne x,\ H^{(k)}(x) < H^{(k)}(x, y)\}, \tag{3.30}$$

$$\bar\Delta^{(k)}(x, y) = \inf\{I^{(k)}_{[0,t]}(\phi) : t, \phi,\ \phi_0 = x,\ \phi_t = y,\ \phi_s \notin M^{(k)}_{x,y},\ \forall s \in (0, t)\}, \quad \forall x, y \in M^{(k)}, \tag{3.31}$$

$$S^{(k+1)} = M^{(k)}, \tag{3.32}$$

$$V_{k+1} = \inf\{H^{(k)}(x, y) - H^{(k)}(x) : x \in M^{(k)},\ y \in S^{(k)},\ x \ne y\}, \tag{3.33}$$

$$t_{k+1} = e^{V_{k+1}\beta + \delta\beta}, \tag{3.34}$$

$$T_1 = t_1, \quad T_{k+1} = t_1 t_2 \cdots t_k t_{k+1}, \tag{3.35}$$

$$\Delta^{(k+1)}(x, y) = \bar\Delta^{(k)}(x, y) - V_{k+1}, \quad \forall x, y \in S^{(k+1)}, \tag{3.36}$$

$$H^{(k+1)}(x) \equiv H(x), \quad \forall x \in S^{(k+1)}, \tag{3.37}$$

$$H^{(k+1)}(x, y) \equiv \Delta^{(k+1)}(x, y) + H(x). \tag{3.38}$$
In order to prove that the sequence of state spaces $S^{(1)} \supseteq S^{(2)} \supseteq \cdots \supseteq S^{(n)}$ is finite we note that $S \supset S^{(2)}$. In fact if $S$ contains unstable states then $S \supset M \equiv S^{(1)}$. If $S$ does not contain unstable states then in $S^{(1)}$ there is at least one unstable state since, by the definition of $V_1$, there exist $x, y \in S = M$ such that $\Delta^{(1)}(x, y) \equiv \bar\Delta(x, y) - V_1 = 0$. By equation (3.37) and by using the property (3.20) at step $k$, it is trivial to verify that the state $x_m$, corresponding to the absolute minimum of the function $H$, is contained in every set $S^{(k)}$.

Points (i)–(iii) of the main theorem are thus proved by iteration, since for each $k$ the chain $X_t^{(k)}$ satisfies hypotheses (1)–(3).

3.5. Proof of (iv)

We are finally going to prove the last point (iv) of the theorem. We have

$$P(\tau_W(x) > t) = P(\tau_W^{(k)}(x) > \mu^{(k)}(t)) \tag{3.39}$$

where $\mu^{(k)}(t)$ is the number of transitions of the chain $X^{(k)}$ within time $t$. This probability can be bounded above by

$$P\Big(\tau_W^{(k)}(x) > \frac{t}{T_k 2^k}\Big) + P\Big(\mu^{(k)}(t) < \frac{t}{T_k 2^k}\Big), \tag{3.40}$$

and this last term has a super-exponential estimate. In fact,

$$P\Big(\mu^{(1)}(t) < \frac{t}{2t_1}\Big) = P(\zeta_{m_1} > t) \tag{3.41}$$

with $m_1 = t/(2t_1)$, and by a Chebyshev estimate, for any $\lambda > 0$ and by using the Markov property, we obtain that

$$P(\zeta_{m_1} > t) \le e^{-\lambda t} \Big[\sup_{x \in M} E_x e^{\lambda\zeta_1}\Big]^{m_1} \le e^{-\lambda t} e^{\lambda t_1 m_1} \Big[\sup_{x \notin M} E_x e^{\lambda\tau_M}\Big]^{m_1} \le e^{-\lambda t} e^{\lambda t_1 m_1} e^{\lambda c m_1}, \tag{3.42}$$

for some constant $c$ independent of $\beta$, if $\lambda$ is sufficiently small. In conclusion, (3.41) can be bounded above by

$$e^{-\lambda t/2} e^{\lambda c t/(2t_1)} \le e^{-c' t}. \tag{3.43}$$

By iterating the same argument we easily obtain that

$$P\Big(\mu^{(k)}(t) < \frac{t}{T_k 2^k}\Big) \le P\Big(\mu^{(k-1)}(t) < \frac{t}{T_{k-1} 2^{k-1}}\Big) + P\Big(\Big\{\mu^{(k)}(t) < \frac{t}{T_k 2^k}\Big\} \cap \Big\{\mu^{(k-1)}(t) > \frac{t}{T_{k-1} 2^{k-1}}\Big\}\Big) \le \exp\{-c_1 e^{\delta\beta}\}, \tag{3.44}$$

since the second term on the right-hand side of (3.44) is estimated as in (3.43). With such an estimate we can evaluate by iteration also (2.10) in the following way:

$$P(X_t(x) \notin S^{(k+1)}) \le P(X_t(x) \notin S^{(k)}) + P\Big(\big\{X^{(k)}_{\mu^{(k)}(t)} \notin S^{(k+1)}\big\} \cap \Big\{\mu^{(k)}(t) > \frac{t}{T_k 2^k}\Big\}\Big) + P\Big(\mu^{(k)}(t) < \frac{t}{T_k 2^k}\Big) \le e^{-\Delta\beta}, \tag{3.45}$$

for some constant $\Delta$, since the second term on the right-hand side of (3.45) can be estimated by Proposition 3.2(iii) applied to the chain $X_t^{(k)}$. This concludes the proof of the main theorem.

3.6. Remarks

In the figure we give a simple example of the iterative construction. We define $S = \{1, 2, \ldots, N\}$, with $N = 21$, and $F = \{(i, i+1)\}_{i=1}^{N}$ with periodic boundary conditions: $N + 1 = 1$. The Markov chain $X_t$ is characterised by the function $H(x)$, $x \in \mathbb{R}$, where $H(i)$ is the value assumed by $H$ at the integer coordinate $i$, and $H(i, i+1)$ is defined as $\sup_{x \in [i, i+1]} H(x)$. We note that at each step of the iteration the function $H$ is ‘smoothed’.

We conclude this section with a final remark on the applicability of such a procedure. We note that the procedure described in this paper fixes a strategy for the study of the long time behavior of general Markov chains in a model-independent way. However its efficiency turns out to be strictly related to the model considered. This means that the computation of the quantities $\bar\Delta^{(k)}(x, y)$, which is necessary for the definition of the new step $k + 1$ of the iterative procedure, could be difficult in some cases. However a simplification comes from the following remark. In the construction of our first renormalized chain $X_t^{(1)}$, all transitions between states $(x, y)$ such that $\bar\Delta(x, y) \ge \bar\Delta(x, z) + \bar\Delta(z, y)$ for some state $z$ were not present in the function minimizing the functional $I_{[0,t]}(\phi)$ in the definition of $\bar\Delta(\cdot, \cdot)$. The same remark applied at each step of the iteration implies that it will not be necessary to compute exactly the values of $\Delta^{(k)}(\cdot, \cdot)$ for each pair of states in $S^{(k)}$, but for some transitions an estimate will be sufficient. This remark greatly simplifies, for instance, the analysis of the chain given by the Metropolis algorithm for the two dimensional Ising model discussed in [S1].
Fig. 1. An example of the iterative construction.

4. Applications

The renormalization procedure discussed in the previous section can be usefully applied to the case of the Metropolis algorithm defined in the introduction, thus providing a proof of the metastability theorem stated in the introduction completely different from that given in [NSch1, 2]. A detailed analysis of this case can be found in [S1].

We want here to mention another problem which can be easily discussed by using the sequence of Markov chains $X_t^{(k)}$ defined in the previous section. Let us consider a chain $X_t$ satisfying hypotheses (1)–(3) of Section 2, and denote by $M \subseteq S$ the set of stable states corresponding to local minima of the function $H$. Let $D \subset S$ be a set containing several stable states, and consider the problem of the first exit of the process $X_t(x)$, $x \in D$, from the domain $D$.

In [FW], Freidlin and Wentzell approached such a problem by means of a graphical technique providing estimates from above and from below of the probability that the process exits in a given state, and estimating the mean of the exit time $\sigma_D$. More precisely, for any set of states $W \subset S$ define a $W$-graph as a graph consisting of arrows $m \to n$ with $m \in S \setminus W$, $n \in S$, $m \ne n$, and satisfying the following properties:

(1) every state $m \in S \setminus W$ is the initial point of exactly one arrow,

(2) there are no closed cycles in the graph.

Condition (2) can be replaced by

(2′) for any state $m \in S \setminus W$ there exists a sequence of arrows leading from it to some point $n \in W$.

The set of $W$-graphs is denoted by $G(W)$; for any $i \in S \setminus W$ and $j \in W$ we denote by $G_{ij}(W)$ the set of $W$-graphs in which the sequence of arrows leading from $i$ into $W$ ends at the point $j$. Let $G(i \not\to W)$ be the set of graphs containing $|S \setminus W| - 1$ arrows $m \to n$, $m \in S \setminus W$, $n \in S$, $m \ne n$, and not containing chains of arrows leading from $i$ into $W$. With this notation, we summarize the Freidlin–Wentzell results in the following way.

Theorem 4.1 [FW]. For any $x \in D$ let $\sigma_D \equiv \min\{t > 0;\ X_t(x) \notin D\}$; then

$$\lim_{\beta \to \infty} \frac{1}{\beta} \ln E\sigma_D(x) = W_D - M_D(x). \tag{4.1}$$

For any $\beta$ sufficiently large there exists $\delta > 0$, with $\delta \to 0$ as $\beta \to \infty$, such that for any $x \in D$ and $y \in \partial D$,

$$\exp\{-\beta(W_D(x, y) - W_D + \delta)\} \le P(X_{\sigma_D}(x) = y) \le \exp\{-\beta(W_D(x, y) - W_D - \delta)\}, \tag{4.2}$$

where

$$W_D = \min_{g \in G(D^c)} \sum_{m \to n \in g} \Delta(m, n), \tag{4.3}$$

$$M_D(x) = \min_{g \in G(x \not\to D^c)} \sum_{m \to n \in g} \Delta(m, n), \tag{4.4}$$

$$W_D(x, y) = \min_{g \in G_{xy}(D^c)} \sum_{m \to n \in g} \Delta(m, n). \tag{4.5}$$
Let us now suppose that the domain $D$ contains a unique stable state $x_0$ and that there exists a unique function $\phi^{x_0,\partial D}$ starting at $x_0$, exiting for the first time from $D$ in a time $t$, minimizing the functional $I_{[0,t]}(\phi)$ and such that $\phi_i^{x_0,\partial D} \ne \phi_j^{x_0,\partial D}$ for any $i \ne j$. Then if we denote by $\theta$ the last time at which the process visits the stable state $x_0$ before it leaves the set $D$,

$$\theta \equiv \max\{t < \sigma_D;\ X_t(x_0) = x_0\}, \tag{4.6}$$

then

$$\lim_{\beta \to \infty} P\big(X_s(x_0) = \phi^{x_0,\partial D}_{s-\theta},\ \forall s \in (\theta, \sigma_D)\big) = 1. \tag{4.7}$$
The natural question at this point is the following. Is it possible to have a more detailed description of the large deviation producing the exit? In other words: is it possible to control not only the most probable point of the boundary touched at the exit but also the most probable exit trajectory (or tube of trajectories), as in the case of a domain attracted by a single stable state?

The difficult point is the following: even if there exists a unique function $\phi^{x,\partial D}$ starting at $x$, exiting from $D$ at time $t$ and minimizing the functional $I_{[0,t]}(\phi)$, if this function visits a stable state in $D$, the process, with large probability, will spend a random exponential time in that stable state. This means that if $D$ contains several stable states we have no hope of fixing an exit trajectory as in (4.7) which is followed with large probability by the process.

In [OS] we solve this problem by showing that the construction of the chains $X_t^{(k)}$ provides a useful tool to study the behavior of the process near each stable state. More precisely we can define a tube of probable exit trajectories without fixing the times spent near each stable state. Let us suppose for simplicity that the absolute minimum of the function $H$ is not contained in $D$. (If this is not the case we can consider a new chain with this property by changing the function $H$ outside the set $D$ without affecting the properties of the process up to the first exit time from $D$.) Let $N$ be the first integer such that $X_t^{(N)}$ has stable states only outside $D$:

$$N = \min\{k;\ S^{(k+1)} \subset D^c\}.$$

Then the process $X_t^{(N)}$ with large probability exits from $D$ by following a trajectory $\phi^{(N)}$ such that $I^{(N)}(\phi^{(N)}) = 0$ in a mean time of order 1. This result is completely trivial for the chain $X_t^{(N)}$, but, by using our main theorem, it gives immediately the results (4.1) and (4.2).

By this result we have thus obtained a first approximation of the probable tube of trajectories exiting from $D$: all the trajectories of the process $X_t$ corresponding at level $N$ to the trajectory $\phi^{(N)}$. By using the explicit construction of the renormalized chains at each step of the procedure, it is not difficult to prove that each transition of the chain $X^{(N)}$ corresponds with large probability to well defined sequences of transitions for the chain $X^{(N-1)}$. By iterating this idea, a complete characterization of the exit from a domain containing several stable states is obtained in [OS].

References

[FW] Freidlin, M. I. and Wentzell, A. D. (1984). Random Perturbations of Dynamical Systems. Springer-Verlag, Berlin.
[H] Has’minskii, R. Z. (1980). Stochastic Stability of Differential Equations. Sijthoff-Noordhoff.
[KO1] Kotecky, R. and Olivieri, E. (1992). Droplet dynamics for asymmetric Ising model (to appear).
[KO2] Kotecky, R. and Olivieri, E. Shapes of growing droplets - a model of escape from a metastable phase (to appear).
[MOS1] Martinelli, F., Olivieri, E., and Scoppola, E. (1991). On the Swendsen and Wang dynamics II: Critical droplets and homogeneous nucleation at low temperature for the 2 dimensional Ising model. Journal of Statistical Physics 62, 135.
[MOS2] Martinelli, F., Olivieri, E., and Scoppola, E. (1990). Metastability and exponential approach to equilibrium for low temperature stochastic Ising models. Journal of Statistical Physics 61, 1105.
[MOS3] Martinelli, F., Olivieri, E., and Scoppola, E. (1991). On the Swendsen and Wang dynamics I: Exponential convergence to equilibrium. Journal of Statistical Physics 62, 117.
[NSch1] Neves, E. J. and Schonmann, R. H. (1991). Behaviour of droplets for a class of Glauber dynamics at very low temperatures. Communications in Mathematical Physics 137, 209.
[NSch2] Neves, E. J. and Schonmann, R. H. (1992). Critical droplets and metastability for a class of Glauber dynamics at very low temperatures. Probability Theory and Related Fields 91, 331.
[OS] Olivieri, E. and Scoppola, E. (to appear).
[S1] Scoppola, E. Renormalization group for Markov chains and application to metastability. Journal of Statistical Physics (to appear).
[S2] Scoppola, E. Metastability and nucleation for 2-dimensional Ising systems. Physica A (to appear).